Prediction of potential genes in microbial genomes Time: Mon May 16 14:56:23 2011 Seq name: gi|296493500|gb|ADTK01000001.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont3.1, whole genome shotgun sequence Length of sequence - 1364 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 1341 764 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member Predicted protein(s) >gi|296493500|gb|ADTK01000001.1| GENE 1 3 - 1341 764 446 aa, chain - ## HITS:1 COG:mll0964 KEGG:ns NR:ns ## COG: mll0964 COG0507 # Protein_GI_number: 13471082 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Mesorhizobium loti # 1 259 1 268 1015 159 43.0 7e-39 MAIYHLSMKIISRKNGYSAVASAAYRSGSVIPDDRTGLIHDYTRKRGVDDAVILTPANAP SWCGDRSVLWNAVEKAEQRRNSQLAREIELAIPREISREAARETVLAFVRENFVSRGMIA DVAFHHMDRTNPHAHIMLTTRAVGETGFAGKVRDWNDRALAETWRASWADHANRALANAG YQEEIDHRSYERQGLEKAPGLHLGKAACAMEKRGMETERGEQNRLINSLNLEIQVSRTQL ALRTVQEAQRKRELSDAARRAAEALNLTIPAANASADTLREFIATLPQECGNAWEMTPEF LAMSGKVNDIEREGNALLKEQAILEKEMTGLKKARPVASLLSEIPLMTWAEPEYRKRQLR FWKLGKQIESLRRTYRAVKERDIPARRQAFETQWNTWIAPGMAELKEKLSAREAERRREE AEAEARRKEQEHEARLKRHDNHRLSR Prediction of potential genes in microbial genomes Time: Mon May 16 14:56:23 2011 Seq name: gi|296493499|gb|ADTK01000002.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont5.1, whole genome shotgun sequence Length of sequence - 667 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 538 476 ## ECO26_2279 putative exonuclease Predicted protein(s) >gi|296493499|gb|ADTK01000002.1| GENE 1 1 - 538 476 179 aa, chain - ## HITS:1 COG:no KEGG:ECO26_2279 NR:ns ## KEGG: ECO26_2279 # Name: not_defined # Def: putative exonuclease # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 179 641 819 820 373 100.0 1e-102 MIDLETMGKNPDAPIISIGAIFFDPQTGDMGPEFSKTIDLETAGGVIDRDTIKWWLKQSR EAQSAIMTDEIPLDDALLQLREFIDENSGEFFVQVWGNGANFDNTILRRSYERQGIPCPW RYYNDRDVRTIVELGKAIDFDARTAIPFEGERHNALDDARYQAKYVSVIWQKLIPSQAD Prediction of potential genes in microbial genomes Time: Mon May 16 14:56:27 2011 Seq name: gi|296493498|gb|ADTK01000003.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont6.1, whole genome shotgun sequence Length of sequence - 701 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 86 - 700 168 ## COG1484 DNA replication protein Predicted protein(s) >gi|296493498|gb|ADTK01000003.1| GENE 1 86 - 700 168 204 aa, chain - ## HITS:1 COG:SMa0776 KEGG:ns NR:ns ## COG: SMa0776 COG1484 # Protein_GI_number: 16262872 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Sinorhizobium meliloti # 2 194 47 239 245 181 47.0 7e-46 EELTCRENRKAERLIKHARFRLNAELSKLDYRNNRGLDRALIRSLSQGNWLTLKQNILLT GATGSGKTFLACALGHNACRQGYKVYYYRLKALMEQCYQGHADGRYSKLLTRLNNSDLLL LDDWGLEPLSSEQRSDLLEIVDLMYQRGSIIVVSQLPVENWYKMIGDSTHADAILDRLVH GSIKIELKGESMRKIQSPLTEGDQ Prediction of potential genes in microbial genomes Time: Mon May 16 14:56:27 2011 Seq name: gi|296493497|gb|ADTK01000004.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont7.1, whole genome shotgun sequence Length of sequence - 771 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
+ CDS 68 - 616 364 ## COG4584 Transposase and inactivated derivatives 2 1 Op 2 . + CDS 631 - 769 132 ## ECS88_2214 transposase ORF2, IS21 family Predicted protein(s) >gi|296493497|gb|ADTK01000004.1| GENE 1 68 - 616 364 182 aa, chain + ## HITS:1 COG:mll6047 KEGG:ns NR:ns ## COG: mll6047 COG4584 # Protein_GI_number: 13475047 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Mesorhizobium loti # 2 182 82 261 261 145 40.0 3e-35 MKKVRVHADYHVEIDKHYYSVPCSLLGQQLEAWISGELVRLFNQGQEVAVHPRKRTYGYS TRNEHMPEAHRQHATWTPERLLEWAGHIGSETHSYVLHILNSRPHPEQSYRFCLGLLNLH KKYSKARLNAACARALKTKVWRLSGIKSILEKGLDKQPVQDPKPDLLSTMEHENVRGSEY YH >gi|296493497|gb|ADTK01000004.1| GENE 2 631 - 769 132 46 aa, chain + ## HITS:1 COG:no KEGG:ECS88_2214 NR:ns ## KEGG: ECS88_2214 # Name: not_defined # Def: transposase ORF2, IS21 family # Organism: E.coli_S88 # Pathway: not_defined # 1 46 2 47 249 86 100.0 3e-16 MNHLYEQLTALKLTGFRDALKKQLAQPGTYQELGFEERLSLLTAEE Prediction of potential genes in microbial genomes Time: Mon May 16 14:56:54 2011 Seq name: gi|296493496|gb|ADTK01000005.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont13.1, whole genome shotgun sequence Length of sequence - 82502 bp Number of predicted genes - 76, with homology - 76 Number of transcription units - 47, operones - 14 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 439 - 470 2.4 1 1 Op 1 . - CDS 478 - 912 427 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins - Prom 934 - 993 4.2 2 1 Op 2 . - CDS 1053 - 2186 1287 ## COG3203 Outer membrane protein (porin) - Prom 2320 - 2379 9.2 - Term 2484 - 2515 3.2 3 2 Tu 1 . - CDS 2553 - 6077 3329 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit - Prom 6224 - 6283 4.7 + Prom 6192 - 6251 3.0 4 3 Tu 1 . + CDS 6351 - 6617 213 ## COG3042 Putative hemolysin + Term 6843 - 6900 6.2 5 4 Op 1 . 
- CDS 6614 - 7048 201 ## PROTEIN SUPPORTED gi|163801140|ref|ZP_02195040.1| 50S ribosomal protein L25 - Prom 7070 - 7129 2.7 - Term 7104 - 7139 6.1 6 4 Op 2 . - CDS 7147 - 8136 1183 ## COG1052 Lactate dehydrogenase and related dehydrogenases - Prom 8176 - 8235 8.4 + Prom 8142 - 8201 4.4 7 5 Op 1 . + CDS 8344 - 10983 2355 ## ECUMN_1648 hypothetical protein 8 5 Op 2 . + CDS 10980 - 11165 264 ## LF82_3556 uncharacterized protein YnbE 9 5 Op 3 . + CDS 11125 - 11499 255 ## COG3784 Uncharacterized protein conserved in bacteria + Term 11626 - 11691 14.2 - Term 11628 - 11666 6.2 10 6 Tu 1 . - CDS 11671 - 12576 599 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 12643 - 12702 4.1 + Prom 12646 - 12705 4.2 11 7 Tu 1 . + CDS 12812 - 14311 1164 ## COG1012 NAD-dependent aldehyde dehydrogenases + Term 14336 - 14362 1.0 - Term 14324 - 14350 1.0 12 8 Tu 1 . - CDS 14369 - 16642 2478 ## COG3733 Cu2+-containing amine oxidase - Prom 16805 - 16864 3.4 - Term 16831 - 16886 6.7 13 9 Tu 1 . - CDS 16890 - 18935 1495 ## COG1012 NAD-dependent aldehyde dehydrogenases - Prom 19128 - 19187 7.5 + Prom 18928 - 18987 7.1 14 10 Op 1 5/0.105 + CDS 19220 - 20149 749 ## COG3396 Uncharacterized conserved protein 15 10 Op 2 5/0.105 + CDS 20161 - 20448 365 ## COG3460 Uncharacterized enzyme of phenylacetate metabolism 16 10 Op 3 4/0.211 + CDS 20457 - 21203 735 ## COG3396 Uncharacterized conserved protein 17 10 Op 4 2/0.579 + CDS 21218 - 21715 537 ## COG2151 Predicted metal-sulfur cluster biosynthetic enzyme 18 10 Op 5 1/0.789 + CDS 21723 - 22793 1022 ## COG1018 Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 19 10 Op 6 12/0.000 + CDS 22790 - 23557 697 ## COG1024 Enoyl-CoA hydratase/carnithine racemase 20 10 Op 7 7/0.000 + CDS 23560 - 24345 753 ## COG1024 Enoyl-CoA hydratase/carnithine racemase 21 10 Op 8 1/0.789 + CDS 24350 - 25774 1062 ## COG1250 3-hydroxyacyl-CoA dehydrogenase 22 10 Op 9 1/0.789 + CDS 25764 - 26186 395 ## COG2050 Uncharacterized protein, possibly involved 
in aromatic compounds catabolism 23 10 Op 10 1/0.789 + CDS 26186 - 27391 1073 ## COG0183 Acetyl-CoA acetyltransferase 24 10 Op 11 2/0.579 + CDS 27418 - 28731 1274 ## COG1541 Coenzyme F390 synthetase + Prom 28744 - 28803 2.4 25 11 Op 1 1/0.789 + CDS 28832 - 29782 695 ## COG3327 Phenylacetic acid-responsive transcriptional repressor 26 11 Op 2 . + CDS 29764 - 30354 590 ## COG0663 Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily + Term 30400 - 30437 5.2 + Prom 30410 - 30469 5.2 27 12 Tu 1 . + CDS 30585 - 31445 759 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) + Term 31454 - 31493 6.2 + Prom 31603 - 31662 4.1 28 13 Tu 1 . + CDS 31882 - 32349 222 ## COG0558 Phosphatidylglycerophosphate synthase + Term 32374 - 32421 5.0 + Prom 32621 - 32680 4.5 29 14 Op 1 3/0.368 + CDS 32721 - 33245 226 ## COG4589 Predicted CDP-diglyceride synthetase/phosphatidate cytidylyltransferase 30 14 Op 2 5/0.105 + CDS 33261 - 35012 1191 ## COG0500 SAM-dependent methyltransferases 31 14 Op 3 . + CDS 35033 - 36325 415 ## COG0671 Membrane-associated phospholipid phosphatase + Term 36373 - 36417 4.1 32 15 Tu 1 . - CDS 36376 - 36981 736 ## COG1182 Acyl carrier protein phosphodiesterase - Prom 37154 - 37213 4.6 + Prom 37076 - 37135 2.3 33 16 Tu 1 . + CDS 37182 - 41084 4167 ## COG1643 HrpA-like helicases + Prom 41118 - 41177 6.5 34 17 Tu 1 . + CDS 41356 - 42156 478 ## COG1434 Uncharacterized conserved protein + Prom 42185 - 42244 5.0 35 18 Tu 1 . + CDS 42353 - 43792 1371 ## COG1012 NAD-dependent aldehyde dehydrogenases + Term 44021 - 44060 -0.0 - Term 43792 - 43825 4.4 36 19 Tu 1 . - CDS 43834 - 44835 1101 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase - Prom 44981 - 45040 5.2 + Prom 44940 - 44999 2.3 37 20 Tu 1 . + CDS 45024 - 45554 260 ## COG3038 Cytochrome B561 + Prom 45616 - 45675 6.0 38 21 Tu 1 . + CDS 45799 - 45972 144 ## SSON_1723 hypothetical protein + Term 46181 - 46209 -0.9 39 22 Tu 1 . 
- CDS 46084 - 46251 57 ## B21_01389 hypothetical protein - Prom 46466 - 46525 3.1 + Prom 46449 - 46508 5.5 40 23 Tu 1 . + CDS 46592 - 48232 1131 ## COG0840 Methyl-accepting chemotaxis protein + Term 48390 - 48431 -0.7 41 24 Tu 1 . - CDS 48270 - 49193 603 ## COG0583 Transcriptional regulator - Prom 49238 - 49297 4.5 + Prom 49213 - 49272 4.4 42 25 Tu 1 . + CDS 49410 - 50753 1019 ## COG5383 Uncharacterized protein conserved in bacteria + Prom 50840 - 50899 3.9 43 26 Tu 1 . + CDS 51014 - 52633 1568 ## COG3131 Periplasmic glucans biosynthesis protein 44 27 Tu 1 . - CDS 52554 - 52757 89 ## SSON_1717 hypothetical protein - Prom 52891 - 52950 3.8 45 28 Op 1 1/0.789 + CDS 52773 - 52997 255 ## COG2841 Uncharacterized protein conserved in bacteria + Term 53008 - 53036 1.3 46 28 Op 2 . + CDS 53060 - 53596 913 ## PROTEIN SUPPORTED gi|16129386|ref|NP_415944.1| ribosomal-protein-L7/L12-serine acetyltransferase 47 29 Tu 1 . - CDS 53591 - 54571 605 ## B21_01396 hypothetical protein - Prom 54652 - 54711 6.6 + Prom 54435 - 54494 4.5 48 30 Op 1 4/0.211 + CDS 54695 - 55687 898 ## COG1275 Tellurite resistance protein and related permeases 49 30 Op 2 . + CDS 55684 - 56277 613 ## COG0500 SAM-dependent methyltransferases + Term 56437 - 56496 9.7 + Prom 56435 - 56494 4.5 50 31 Tu 1 . + CDS 56579 - 57247 640 ## ECIAI1_1426 putative lipoprotein + Term 57256 - 57287 4.1 51 32 Tu 1 . - CDS 57552 - 57707 61 ## EcSMS35_2551 hypothetical protein - Prom 57788 - 57847 3.1 + Prom 57718 - 57777 4.5 52 33 Tu 1 . + CDS 57838 - 58986 361 ## COG0675 Transposase and inactivated derivatives - Term 58957 - 58987 1.6 53 34 Tu 1 . - CDS 59026 - 60201 888 ## COG3135 Uncharacterized protein involved in benzoate metabolism 54 35 Op 1 3/0.368 + CDS 60293 - 60829 392 ## COG1396 Predicted transcriptional regulators + Term 60860 - 60890 1.1 55 35 Op 2 . + CDS 60902 - 62863 1672 ## COG0826 Collagenase and related proteases 56 36 Tu 1 . 
- CDS 62955 - 63125 174 ## JW1432 hypothetical protein - Prom 63325 - 63384 4.1 + Prom 63193 - 63252 4.4 57 37 Op 1 . + CDS 63367 - 63561 102 ## gi|300901851|ref|ZP_07119886.1| hypothetical protein HMPREF9536_00069 58 37 Op 2 2/0.579 + CDS 63586 - 64023 406 ## COG1598 Uncharacterized conserved protein 59 37 Op 3 3/0.368 + CDS 64102 - 65508 1111 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs + Prom 65552 - 65611 3.4 60 38 Op 1 13/0.000 + CDS 65753 - 66898 1208 ## COG0687 Spermidine/putrescine-binding periplasmic protein 61 38 Op 2 30/0.000 + CDS 66916 - 67929 856 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 62 38 Op 3 36/0.000 + CDS 67930 - 68871 984 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I 63 38 Op 4 5/0.105 + CDS 68861 - 69655 733 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II 64 38 Op 5 . + CDS 69677 - 71101 1331 ## COG1012 NAD-dependent aldehyde dehydrogenases + Prom 71277 - 71336 3.1 65 39 Op 1 . + CDS 71440 - 71661 153 ## ECIAI1_1442 conserved hypothetical protein; putative inner membrane protein 66 39 Op 2 . + CDS 71747 - 71980 373 ## G2583_1807 hypothetical protein 67 40 Op 1 4/0.211 - CDS 71981 - 72430 371 ## COG3238 Uncharacterized protein conserved in bacteria 68 40 Op 2 . - CDS 72427 - 72945 496 ## COG1247 Sortase and related acyltransferases - Prom 72976 - 73035 2.3 + Prom 72912 - 72971 2.9 69 41 Tu 1 . + CDS 73126 - 74163 980 ## COG2130 Putative NADP-dependent oxidoreductases + Term 74268 - 74302 -0.4 + Prom 74165 - 74224 5.2 70 42 Tu 1 . + CDS 74361 - 75026 600 ## COG1802 Transcriptional regulators + Term 75109 - 75146 0.3 - Term 75016 - 75055 5.1 71 43 Tu 1 . - CDS 75062 - 77164 1622 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 77204 - 77263 5.8 + Prom 77191 - 77250 6.2 72 44 Tu 1 . 
+ CDS 77406 - 78467 1293 ## COG3391 Uncharacterized conserved protein - Term 78539 - 78573 -0.9 73 45 Tu 1 . - CDS 78582 - 80081 1511 ## COG1113 Gamma-aminobutyrate permease and related permeases - Prom 80194 - 80253 3.8 + Prom 80252 - 80311 4.4 74 46 Op 1 . + CDS 80348 - 80965 472 ## COG0625 Glutathione S-transferase 75 46 Op 2 . + CDS 81041 - 81253 99 ## ECO103_1586 hypothetical protein 76 47 Tu 1 . + CDS 82074 - 82500 400 ## COG3501 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|296493496|gb|ADTK01000005.1| GENE 1 478 - 912 427 144 aa, chain - ## HITS:1 COG:ECs1997 KEGG:ns NR:ns ## COG: ECs1997 COG0589 # Protein_GI_number: 15831251 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Escherichia coli O157:H7 # 1 144 25 168 168 259 100.0 1e-69 MNRTILVPIDISDSELTQRVISHVEAEAKIDDAEVHFLTVIPSLPYYASLGLAYSAELPA MDDLKAEAKSQLEEIIKKFKLPTDRVHVHVEEGSPKDRILELAKKIPAHMIIIASHRPDI TTYLLGSNAAAVVRHAECSVLVVR >gi|296493496|gb|ADTK01000005.1| GENE 2 1053 - 2186 1287 377 aa, chain - ## HITS:1 COG:ompN KEGG:ns NR:ns ## COG: ompN COG3203 # Protein_GI_number: 16129338 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein (porin) # Organism: Escherichia coli K12 # 1 377 1 377 377 619 99.0 1e-177 MKSKVLALLIPALLAAGAAHAAEVYNKDGNKLDLYGKVDGLHYFSDNSAKDGDQSYARLG FKGETQINDQLTGYGQWEYNIQANNTESSKNQSWTRLAFAGLKFADYGSFDYGRNYGVMY DIEGWTDMLPEFGGDSYTNADNFMTGRANGVATYRNTDFFGLVNGLNFAVQYQGNNEGAS NGQEGTNNGRDVRHENGDGWGLSTTYDLGMGFSAGAAYTSSDRTNDQVNHTAAGGDKADA WTAGLKYDANNIYLATMYSETRNMTPFGDSDYAVANKTQNFEVTAQYQFDFGLRPAVSFL MSKGRDLHAAGGADNPAGVDDKDLVKYADVGATYYFNKNMSTYVDYKINLLDEDDSFYAA NGISTDDIVALGLVYQF >gi|296493496|gb|ADTK01000005.1| GENE 3 2553 - 6077 3329 1174 aa, chain - ## HITS:1 COG:ECs2000_1 KEGG:ns NR:ns ## COG: ECs2000_1 COG0674 # Protein_GI_number: 15831254 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 
2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Escherichia coli O157:H7 # 1 411 1 411 411 832 99.0 0 MITIDGNGAVASVAFRTSEVIAIYPITPSSTMAEQADAWAGNGLKNVWGDTPRVVEMQSE AGAIATVHGALQTGALSTSFTSSQGLLLMIPTLYKLAGELTPFVLHVAARTVATHALSIF GDHSDVMAVRQTGCAMLCAANVQEAQDFALISHIATLKSRVPFIHFFDGFRTSHEINKIV PLADDTILDLMPQAEIDAHRARALNPEHPVIRGTSANPDTYFQSREATNPWYDAVYDHVE QAMNDFSAATGRQYQPFEYYGHPQAERVIILMGSAIGTCEEVVDELLIRGEKVGVLKVRL YRPFSAKHLLQALPGSVRSVAVLDRTKEPGAQAEPLYLDVMTALAEAFNNGERETLPRVI GGRYGLSSKEFGPDCVLAVFAELNAAKPKARFTVGIYDDVTNLSLPLPENTLPNSAKLEA LFYGLGSDGSVSATKNNIKIIGNSTPWYAQGYFVYDSKKAGGLTVSHLRVSEQPIRSAYL ISQADFVGCHQLQFIDKYQMAERLKPGGIFLLNTPYSADEVWSRLPQEVQAVLNQKKARF YVINAAKIARECGLAARINTVMQMAFFHLTQILPGDSALAELQGAIAKSYSSKGQDLVER NWQALALARESVEEVPLQPVNPHSANRPPVVSDAAPDFVKTVTAAMLAGLGDALPVSALP PDGSWPMGTTRWEKRNIAEEIPIWKEELCTQCNHCVAACPHSAIRAKVVPPEAMENAPAS LHSLDVKSRDMRGQKYVLQVAPEDCTGCNLCVEVCPAKDRQNPEIKAINMMSRLEHVEEE KINYDFFLNLPEIDRSKLERIDIRTSQLITPLFEYSGACSGCGETPYIKLLTQLYGDRML IANATGCSSIYGGNLPSTPYTTDANGRGPAWANSLFEDNAEFGLGFRLTVDQPRVRVLRL LDQFADKIPAELLTALKSDATPEVRREQVAALRQQLNDVAEAHELLRDADALVEKSIWLI GGDGWAYDIGFGGLDHVLSLTENVNILVLDTQCYSNTGGQASKATPLGAVTKFGEHGKRK ARKDLGVSMMMYGHVYVAQISLGAQLNQTVKAIQEAEAYPGPSLIIAYSPCEEHGYDLAL SHDQMRQLTATGFWPLYRFDPRRADEGKLPLALDSRPPSVALEETLLHEQRFRRLNSQQP EVAEQLWKDAAADLQKRYDFLAQMAGKAEKSNTD >gi|296493496|gb|ADTK01000005.1| GENE 4 6351 - 6617 213 88 aa, chain + ## HITS:1 COG:STM1649 KEGG:ns NR:ns ## COG: STM1649 COG3042 # Protein_GI_number: 16764993 # Func_class: R General function prediction only # Function: Putative hemolysin # Organism: Salmonella typhimurium LT2 # 38 88 1 51 51 63 92.0 1e-10 MRAAFWVGCAALLLSACSSEPVQQATAAHVAPGLKASMSSSGEANCAMIGGSLSVARQLD GTAIGMCALPNGKRCSEQSLAAGSCGSY >gi|296493496|gb|ADTK01000005.1| GENE 5 6614 - 7048 201 144 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163801140|ref|ZP_02195040.1| 50S ribosomal protein L25 [Vibrio campbellii AND4] # 1 144 1 147 147 82 29 9e-15 MRTTMKKVAAFVALSLLMAGCVSNDKIAVTPEQLQHHRFVLESVNGKPVTSDKNPPEISF 
GEKMMISGSMCNRFSGEGKLSNGELTAKGLAMTRMMCANPQLNELDNTISEMLKEGAQVD LTANQLTLATAKQTLTYKLADLMN >gi|296493496|gb|ADTK01000005.1| GENE 6 7147 - 8136 1183 329 aa, chain - ## HITS:1 COG:ECs2002 KEGG:ns NR:ns ## COG: ECs2002 COG1052 # Protein_GI_number: 15831256 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Escherichia coli O157:H7 # 1 329 1 329 329 669 100.0 0 MKLAVYSTKQYDKKYLQQVNESFGFELEFFDFLLTEKTAKTANGCEAVCIFVNDDGSRPV LEELKKHGVKYIALRCAGFNNVDLDAAKELGLKVVRVPAYDPEAVAEHAIGMMMTLNRRI HRAYQRTRDANFSLEGLTGFTMYGKTAGVIGTGKIGVAMLRILKGFGMRLLAFDPYPSAA ALELGVEYVDLPTLFSESDVISLHCPLTPENYHLLNEAAFDQMKNGVMIVNTSRGALIDS QAAIEALKNQKIGSLGMDVYENERDLFFEDKSNDVIQDDVFRRLSACHNVLFTGHQAFLT AEALTSISQTTLQNLSNLEKGETCPNELV >gi|296493496|gb|ADTK01000005.1| GENE 7 8344 - 10983 2355 879 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_1648 NR:ns ## KEGG: ECUMN_1648 # Name: ydbH # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 879 1 879 879 1691 99.0 0 MLGKYKAVLALLLLIILVPLTLLMTLGLWVPTLAGIWLPLGTRIALDESPRITRKGLIIP DLRYLVGDCQLAHITNASLSHPSRWLLNVGTVELDSACLAKLPQTEQSPAAPKTLAQWQS MLPNTWINIDKLIFSPWQEWQGKLSLALTSDIQQLRYQGEKVKFQGQLKGQQLTVSELDV VAFENQSPVKLVGEFTMPLVPDGLPVSGHATATLNLPQEPSLVDAELDWQENSGQLIVLA RDNGDPLLDLPWQITRQQLTVSDGRWSWPYAGFPLSGRLGVKVDNWQAGLENALVSGRLS VLTQGQAGKGNAVLNFGPGKLSMDNSQLPLQLTGEAKQADLILYARLPAQLSGSLTDPTL AFEPGALLRSKGRVIDSLDIDEIRWPLAGVKVTQRGVDGRLQAILQAHENELGDFVLHMD GLANDFLPDAGRWQWRYWGKGSFTPMNATWDVAGKGEWHDSTITLTDLSTGFDQLQYGTM TVEKPRLILDKPVVWVRDAQHPSFSGALSLDAGQTLFTGGSVLPPSTLKFSVDGRDPTYF LFKGDLHAGEIGPVRVNGRWDGIRLRGNAWWPKQSLTVFQPLVPPDWKMNLRDGELYAQV AFSAAPEQGFRAGGHGVLKGGSAWMPDNQVNGVDFVLPFRFADGAWHLGTRGPVTLRIAE VINLVTAKNITADLQGRYPWTEEEPLLLTDVSVDVLGGNVLMKQLRMPQHDPALLRLNNL SSSELVSAVNPKQFAMSGAFSGALPLWLNNEKWIVKDGWLANSGPMTLRLDKDTADAVVK DNMTAGSAINWLRYMEISRSSTNINLDNLGLLTMQANITGTSRVDGKSGTVNLNYHHEEN IFTLWRSLRFGDNLQAWLEQNARLPGNDCPQGKECEEKQ 
>gi|296493496|gb|ADTK01000005.1| GENE 8 10980 - 11165 264 61 aa, chain + ## HITS:1 COG:no KEGG:LF82_3556 NR:ns ## KEGG: LF82_3556 # Name: ynbE # Def: uncharacterized protein YnbE # Organism: E.coli_LF82 # Pathway: not_defined # 1 61 1 61 61 96 96.0 2e-19 MKILLAALTSSFMLAACTPRIEVAAPKEPITINMNVKIEHEIIIKADKDVEELLETRSDL F >gi|296493496|gb|ADTK01000005.1| GENE 9 11125 - 11499 255 124 aa, chain + ## HITS:1 COG:ydbL KEGG:ns NR:ns ## COG: ydbL COG3784 # Protein_GI_number: 16129344 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 16 124 2 110 110 196 98.0 7e-51 MSKSCLKLVAIFSEAMMKKTLLLCAFLVGLVSSNVMALTLDEARTQGRGGETFYGYLVAL KTDAETEKLVTDINAERKASYQQLAKQNNVSVDDIAKLAGQKLVARAKPGEYVQGINGKW VRKF >gi|296493496|gb|ADTK01000005.1| GENE 10 11671 - 12576 599 301 aa, chain - ## HITS:1 COG:feaR KEGG:ns NR:ns ## COG: feaR COG2207 # Protein_GI_number: 16129345 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 301 1 301 301 609 99.0 1e-174 MNPAMDNEFQQWLSQINQVCGNFTGRLLTERYTGVLDTHFAKGLKLSTVTTSGVNLSRTW QEVKGSDDAWFYTVFQLSGQAIMEQDERQVQIGAGDITLLDASRPCSLYWQESSKQISLL LPRTLLEQYFPHQKPVCAERLDADLPMVQLSHRLLQESMNNPALSETESEAALQAMVCLL RPVLHQRESVQPRRERQFQKVVTLIDDNIREEILRPEWIAGETGMSVRSLYRMFADKGLV VAQYIRNRRLDFCADAIRHAADDEKLAGIGFHWGFSDQSHFSTVFKQRFGMTPGEYRRKF R >gi|296493496|gb|ADTK01000005.1| GENE 11 12812 - 14311 1164 499 aa, chain + ## HITS:1 COG:feaB KEGG:ns NR:ns ## COG: feaB COG1012 # Protein_GI_number: 16129346 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Escherichia coli K12 # 1 499 2 500 500 1001 100.0 0 MTEPHVAVLSQVQQFLDRQHGLYIDGRPGPAQSEKRLAIFDPATGQEIASTADANEADVD NAVMSAWRAFVSRRWAGRLPAERERILLRFADLVEQHSEELAQLETLEQGKSIAISRAFE VGCTLNWMRYTAGLTTKIAGKTLDLSIPLPQGARYQAWTRKEPVGVVAGIVPWNFPLMIG MWKVMPALAAGCSIVIKPSETTPLTMLRVAELASEAGIPDGVFNVVTGSGAVCGAALTSH 
PHVAKISFTGSTATGKGIARTAADHLTRVTLELGGKNPAIVLKDADPQWVIEGLMTGSFL NQGQVCAASSRIYIEAPLFDTLVSGFEQAVKSLQVGPGMSPVAQINPLVSRAHCDKVCSF LDDAQAQQAELIRGSNGPAGEGYYVAPTLVVNPDAKLRLTREEVFGPVVNLVRVADGEEA LQLANDTEYGLTASVWTQNLSQALEYSDRLQAGTVWVNSHTLIDANLPFGGMKQSGTGRD FGPDWLDGWCETKSVCVRY >gi|296493496|gb|ADTK01000005.1| GENE 12 14369 - 16642 2478 757 aa, chain - ## HITS:1 COG:tynA KEGG:ns NR:ns ## COG: tynA COG3733 # Protein_GI_number: 16129347 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Cu2+-containing amine oxidase # Organism: Escherichia coli K12 # 1 757 1 757 757 1535 99.0 0 MGSPSLYSARKTTLALAVALSFAWQAPVFAHGGEAHMVPMDKTLKEFGADVQWDDYAQLF TLIKDGAYVKVKPGAQTAIVNGQPLALQVPVVMKDNKAWVSDTFINDVFQSGLDQTFQVE KRPHPLNALTADEIKQAVEIVKASADFKPNTRFTEISLLPPDKEAVWAFALENKPVDQPR KADVIMLDGKHIIEAVVDLQNNKLLSWQPIKDAHGMVLLDDFASVQNIINNSEEFAAAVK KRGITDAKKVITTPLTVGYFDGKDGLKQDARLLKVISYLDVGDGNYWAHPIENLVAVVDL EQKKIVKIEEGPVVPVPMTARPFDSRDRVAPAVKPMQIIEAEGKNYTITGDMIHWRNWDF HLSMNSRVGPMISTVTYNDNGTKRKVMYEGSLGGMIVPYGDPDIGWYFKAYLDSGDYGMG TLTSPIARGKDAPSNAVLLNETIADYTGVPMEIPRAIAVFERYAGPEYKHQEMGQPNVST ERRELVVRWISTVGNYDYIFDWIFHENGTIGIDAGATGIEAVKGVKAKTMHDETAKDDTR YGTLIDHNIVGTTHQHIYNFRLDLDVDGENNSLVAMDPVVKPNTAGGPRTSTMQVNQYNI GNEQDAAQKFDPGTIRLLSNPNKENRMGNPVSYQIIPYAGGTHPVAKGAQFAPDEWIYHR LSFMDKQLWVTRYHPGERFPEGKYPNRSTHDTGLGQYSKDNESLDNTDAVVWMTTGTTHV ARAEEWPIMPTEWVHTLLKPWNFFDETPTLGALKKDK >gi|296493496|gb|ADTK01000005.1| GENE 13 16890 - 18935 1495 681 aa, chain - ## HITS:1 COG:maoC_1 KEGG:ns NR:ns ## COG: maoC_1 COG1012 # Protein_GI_number: 16129348 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Escherichia coli K12 # 1 500 1 500 500 930 99.0 0 MQQLASFLSGTWQSGRGRSRLIHHAISGEALWEVTSEGLDMAAARLFAIEKGAPALRAMT FIERAAMLKAVAKHLLSEKERFYALSAQTGATRADSWVDIEGGIGTLFTYASLGSRELPD DTLWPEDELIPLSKEGGFAARHVLTSKSGVAVHINAFNFPCWGMLEKLAPTWLGGMPAII KPATATAQLTQAMVKSIVDSGLVPEGAISLICGSAGDLLDHLDSQDVVTFTGSAATGQML 
RVQPNIVAKSIPFTMEADSLNCCVLGEDVTPDQPEFALFIREVVREMTTKAGQKCTAIRR IIVPQALVNAVSDALVARLQKVVVGDPAQEGVKMGALVNAEQRADVQEKVNILLAAGCEI RLGGQADLSAAGAFFPPTLLYCPQPDEAPAVHATEAFGPVATLMPAQNQRHALQLACAGG GSLAGTLVTADPQIARQFIADAARTHGRIQILNEESAKESTGHGSPLPQLVHGGPGRAGG GEELGGLRAVKHYMQRTAVQGSPTMLAAISKQWVRGAKVEEDRIHPFRKYFEELQPGDSL LTPRRTMTEADIVNFACLSGDHFYAHMDKIAAAESIFGERVVHGYFVLSAAAGLFVDAGV GPVIANYGLENLRFIEPVKPGDTIQVRLTCKRKTLKKQRSAEEKPTGVVEWAVEVFNQHQ TPVALYSILTLVARQHGDFVD >gi|296493496|gb|ADTK01000005.1| GENE 14 19220 - 20149 749 309 aa, chain + ## HITS:1 COG:ydbO KEGG:ns NR:ns ## COG: ydbO COG3396 # Protein_GI_number: 16129349 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 309 1 309 309 626 99.0 1e-179 MTQEERFEQRIAQETAIEPQDWMPDAYRKTLIRQIGQHAHSEIVGMLPEGNWITRAPTLR RKAILLAKVQDEAGHGLYLYSAAETLGCAREDIYQKMLDGRMKYSSIFNYPTLSWADIGV IGWLVDGAAIVNQVALCRTSYGPYARAMVKICKEESFHQRQGFEACMALAQGSEAQKQML QDAINRFWWPALMMFGPNDDNSPNSARSLAWKIKRFTNDELRQRFVDNTVPQVEMLGMTV PDPDLHFDTESGHYRFGEIDWQEFNEVINGRGICNQERLDAKRKAWEEGTWVREAALAHA QKQLARKVA >gi|296493496|gb|ADTK01000005.1| GENE 15 20161 - 20448 365 95 aa, chain + ## HITS:1 COG:ynbF KEGG:ns NR:ns ## COG: ynbF COG3460 # Protein_GI_number: 16129350 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized enzyme of phenylacetate metabolism # Organism: Escherichia coli K12 # 1 95 1 95 95 189 100.0 8e-49 MSNVYWPLYEVFVRGKQGLSHRHVGSLHAADERMALENARDAYTRRSEGCSIWVVKASEI VASQPEERGEFFDPAESKVYRHPTFYTIPDGIEHM >gi|296493496|gb|ADTK01000005.1| GENE 16 20457 - 21203 735 248 aa, chain + ## HITS:1 COG:ydbP KEGG:ns NR:ns ## COG: ydbP COG3396 # Protein_GI_number: 16129351 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 248 1 248 248 487 100.0 1e-138 MNQLTAYTLRLGDNCLVLSQRLGEWCGHAPELEIDLALANIGLDLLGQARNFLSYAAELA GEGDEDTLAFTRDERQFSNLLLVEQPNGNFADTIARQYFIDAWHVALFTRLMESRDPQLA 
AISAKAIKEARYHLRFSRGWLERLGNGTDVSGQKMQQAINKLWRFTAELFDADEIDIALS EEGIAVDPRTLRAAWEAEVFAGINEATLNVPQEQAYRTGGKKGLHTEHLGPMLAEMQYLQ RVLPGQQW >gi|296493496|gb|ADTK01000005.1| GENE 17 21218 - 21715 537 165 aa, chain + ## HITS:1 COG:paaD KEGG:ns NR:ns ## COG: paaD COG2151 # Protein_GI_number: 16129352 # Func_class: R General function prediction only # Function: Predicted metal-sulfur cluster biosynthetic enzyme # Organism: Escherichia coli K12 # 1 165 3 167 167 320 97.0 7e-88 MQRLATIAPPQVHEIWALLSQIPDPEIPVLTITDLGMVRNVTQMGEGWVIGFTPTYSGCP ATEHLIGAIREAMSTHGFTPVQVVLQLDPAWTTDWMTPDARERLRQYGISPPAGHSCHAH LPPEVCCPRCASVHTTLISEFGSTACKALYRCDSCREPFDYFKCI >gi|296493496|gb|ADTK01000005.1| GENE 18 21723 - 22793 1022 356 aa, chain + ## HITS:1 COG:paaE KEGG:ns NR:ns ## COG: paaE COG1018 # Protein_GI_number: 16129353 # Func_class: C Energy production and conversion # Function: Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 # Organism: Escherichia coli K12 # 1 356 1 356 356 702 98.0 0 MTTFHSLTVAKVEPETRDAVTITFAVPQPLQEAYRFRPGQHLTLKASLDGEELRRCYSIC RSYLPGEISVAVKAIEGGRFSRYARDHIRQGMTLEVMVPQGHFGYQPQAERQGRYLAIAA GSGITPMLAIIAATLQTEPESQFTLIYGNRTSQSMMFRQALADLKDKYPQRLQLLCIFSQ ETLDSDLLHGRIDGEKLQSLGASLINFRLYDEAFICGPAAMMDETEAALKALGMPDKTIH LERFNTPGTRVKRSVNVQSDGQKVTVRQDGRDREIVLNADDESILDAALRQGADLPYACK GGVCATCKCKVLRGKVAMETNYSLEPDELAAGYVLSCQALPLTSDVVVDFDAKGMA >gi|296493496|gb|ADTK01000005.1| GENE 19 22790 - 23557 697 255 aa, chain + ## HITS:1 COG:ydbS KEGG:ns NR:ns ## COG: ydbS COG1024 # Protein_GI_number: 16129354 # Func_class: I Lipid transport and metabolism # Function: Enoyl-CoA hydratase/carnithine racemase # Organism: Escherichia coli K12 # 1 255 1 255 255 435 100.0 1e-122 MSELIVSRQQRVLLLTLNRPAARNALNNALLMQLVNELEAAATDTSISVCVITGNARFFA AGADLNEMAEKDLAATLNDTRPQLWARLQAFNKPLIAAVNGYALGAGCELALLCDVVVAG ENARFGLPEITLGIMPGAGGTQRLIRSVGKSLASKMVLSGESITAQQAQQAGLVSDVFPS DLTLEYALQLASKMARHSPLALQAAKQALRQSQEVALQAGLAQERQLFTLLAATEDRHEG ISAFLQKRTPDFKGR >gi|296493496|gb|ADTK01000005.1| GENE 20 23560 - 24345 753 261 aa, 
chain + ## HITS:1 COG:paaG KEGG:ns NR:ns ## COG: paaG COG1024 # Protein_GI_number: 16129355 # Func_class: I Lipid transport and metabolism # Function: Enoyl-CoA hydratase/carnithine racemase # Organism: Escherichia coli K12 # 1 261 2 262 262 489 99.0 1e-138 MEFILSHVEKGVMTLTLNRPERLNSFNDEMHAQLAECLKQVERDDTIRCLLLTGAGRGFC AGQDLNDRNVDPTGPAPDLGMSVERFYNPLVRRLAKLPKPVICAVNGVAAGAGATLALGC DIVIAARSAKFVMAFSKLGLIPDCGGTWLLPRVAGRARAMGLALLGNQLSAEQAHEWGMI WQVVDDETLADTAQQLARHLATQPTFGLGLIKQAINSAETNTLDTQLDLERDYQRLAGRS ADYREGVSAFLAKRSPQFTGK >gi|296493496|gb|ADTK01000005.1| GENE 21 24350 - 25774 1062 474 aa, chain + ## HITS:1 COG:ydbU KEGG:ns NR:ns ## COG: ydbU COG1250 # Protein_GI_number: 16129356 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyacyl-CoA dehydrogenase # Organism: Escherichia coli K12 # 1 474 2 475 475 917 99.0 0 MINVQTVAVIGSGTMGAGIAEVAASHGHQVLLYDISAEALTRAIDGIHARLNSRVTRGKL TAETCERTLKRLIPVTDIHALAAANLVIEAASERLEVKKALFAQLAEVCPPQTLLTTNTS SISITAIAAEVKNPERVAGLHFFNPAPVMKLVEVVSGLATAAEVVEQLCELTLSWGKQPV RCHSTPGFIVNRVARPYYSEAWRALEEQVAAPEVIDAALRDGAGFPMGPLELTDLIGQDV NFAVTCSVFNAFWQERRFLPSLVQQELVIGGRLGKKSGLGVYDWRAEREAVVGLEAVSDS FSPMKVEKKSDGVTEIDDVLLIETQGETAQALAIRLARPVVVVDKMAGKVVTIAAAAVNP DSATRKAIYYLQQQGKTVLQIADYPGMLIWRTVAMIINEALDALQKGVASEQDIDTAMRL GVNYPYGPLAWGAQLGWQRILRLLENLQHHYGEERYRPCSLLRQRALLESGYES >gi|296493496|gb|ADTK01000005.1| GENE 22 25764 - 26186 395 140 aa, chain + ## HITS:1 COG:paaI KEGG:ns NR:ns ## COG: paaI COG2050 # Protein_GI_number: 16129357 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein, possibly involved in aromatic compounds catabolism # Organism: Escherichia coli K12 # 1 140 1 140 140 255 100.0 2e-68 MSHKAWQNAHAMYENDACAKALGIDIISMDEGFAVVTMTVTAQMLNGHQSCHGGQLFSLA DTAFAYACNSQGLAAVASACTIDFLRPGFAGDTLTATAQVRHQGKQTGVYDIEIVNQQQK TVALFRGKSHRIGGTITGEA >gi|296493496|gb|ADTK01000005.1| GENE 23 26186 - 27391 1073 401 aa, chain + ## HITS:1 COG:paaJ KEGG:ns NR:ns ## COG: paaJ COG0183 # 
Protein_GI_number: 16129358 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA acetyltransferase # Organism: Escherichia coli K12 # 1 401 1 401 401 734 99.0 0 MREAFICDGIRTPIGRYGGALSGVRADDLAAIPLRELLVRNPRLDAECIDDVILGCANQA GEDNRNVARMATLLAGLPQSVSGTTINRLCGSGLDALGFAARAIKAGDGDLLIAGGVESM SRAPFVMGKATSAFSRQAEMFDTTIGWRFVNPLMAQQFGTDSMPETAENVAELLKISRED QDSFALRSQQRTAKAQSSGILAEEIVPVVLKNKKGVVTEIQHDEHLRPETTLEQLRGLKA PFRANGVITAGNASGVNDGAAALIIASEQMAAAQGLTPRARIVAMATAGVEPRLMGLGPV PATRRVLERAGLSIHDMDVIELNEAFAAQALGVLRELGLPDDAPHVNPNGGAIALGHPLG MSGARLALAASHELHRRNGRYALCTMCIGVGQGIAMILERV >gi|296493496|gb|ADTK01000005.1| GENE 24 27418 - 28731 1274 437 aa, chain + ## HITS:1 COG:paaK KEGG:ns NR:ns ## COG: paaK COG1541 # Protein_GI_number: 16129359 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Escherichia coli K12 # 1 437 1 437 437 911 99.0 0 MITNTKLDPIETASVDELQALQTQRLKWTLKHAYENVPMYRRKFDAAGVHPDDFRELSDL RKFPCTTKQDLRDNYPFDTFAVPMEQVVRIHASSGTTGKPTVVGYTQNDIDNWANIVARS LRAAGGTPKDKIHVAYGYGLFTGGLGAHYGAERLGATVIPMSGGQTEKQAQLIRDFQPDM IMVTPSYCLNLIEELERQLGGDASGCSLRVGVFGAEPWTQAMRKEIERRLGITALDIYGL SEVMGPGVAMECLETTDGPTIWEDHFYPEIVNPHDGTPLADGEHGELLFTTLTKEALPVI RYRTRDLTRLLPGTARTMRRMDRISGRSDDMLIIRGVNVFPSQLEEEIVKFEHLSPHYQL EVNRRGHLDSLSVKVELKESSLTLTHEQRCQVCHQLRHRIKSMVGISTDVMIVNCGSIPR SEGKACRVFDLRNIVGA >gi|296493496|gb|ADTK01000005.1| GENE 25 28832 - 29782 695 316 aa, chain + ## HITS:1 COG:paaX KEGG:ns NR:ns ## COG: paaX COG3327 # Protein_GI_number: 16129360 # Func_class: K Transcription # Function: Phenylacetic acid-responsive transcriptional repressor # Organism: Escherichia coli K12 # 1 316 1 316 316 620 98.0 1e-177 MSKLVTFIQHAVNAVPVSGTSLISSLYGDSLSHRGGEIWLGSLAALLEGLGFGERFVRTA LFRLNKEGWLDVSRIGRRSFYSLSDKGLRLTRRAESKIYRAEQPAWDGKWLLLLSEGLDK STLADVKKQLIWQGFGALAPSLMASPSQKLADVQTLLHEAGVVDNVICFEAQIPLALSRA ALRARVEECWHLTEQNAMYETFIQSFRPLVPLLKEAADELTPERAFHIQLLLIHFYRRVV LKDPLLPEELLPAHWAGHTARQLCINIYQRVAPAALAFVSEKGETSVGELPAPGSLYFQR FGGLNIEQEAICQFTR 
>gi|296493496|gb|ADTK01000005.1| GENE 26 29764 - 30354 590 196 aa, chain + ## HITS:1 COG:paaY KEGG:ns NR:ns ## COG: paaY COG0663 # Protein_GI_number: 16129361 # Func_class: R General function prediction only # Function: Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily # Organism: Escherichia coli K12 # 1 196 1 196 196 388 98.0 1e-108 MPIYQIDGLTPVVPEESFVHPTAVLIGDVILGKGVYVGPNASLRGDFGRIVVKDGANIQD NCVMHGFPEQDTVVEEDGHIGHSAILHGCIIRRNALVGMNAVVMDGAVIGENSIVGASAF VKAKAEMPANYLIVGSPAKAIRELSEQELAWKKQGTHEYQVLVTRCKQTLHQVEPLREVE PNRKRLVFDENLRPKQ >gi|296493496|gb|ADTK01000005.1| GENE 27 30585 - 31445 759 286 aa, chain + ## HITS:1 COG:ECs2008 KEGG:ns NR:ns ## COG: ECs2008 COG0667 # Protein_GI_number: 15831262 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Escherichia coli O157:H7 # 1 286 1 286 286 557 98.0 1e-158 MSSNTFTLGTKSVNRLGYGAMQLAGPGVFGPPRDRHVAITVLREALALGVNHIDTSDFYG PHVTNQIIREALYPYSDDLTIVTKIGARRGEDASWLPAFSPAELQKAVHDNLRNLGLDVL DVVNLRIMMGDGHGPAEGSIEASLTVLAEMQQQGLVKHIGLSNVTPTQVAEARKIAEIVC VQNEYNIAHRADDAMIDALARDGIAYVPFFPLGGFTPLQSSTLSDVAASLGATPMQVALA WLLQRSPNILLIPGTSSVAHLRENMAAEKLHLSEEVLSTLDGISRE >gi|296493496|gb|ADTK01000005.1| GENE 28 31882 - 32349 222 155 aa, chain + ## HITS:1 COG:ynbA KEGG:ns NR:ns ## COG: ynbA COG0558 # Protein_GI_number: 16129369 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphate synthase # Organism: Escherichia coli K12 # 5 154 53 202 203 268 99.0 2e-72 MLVAQPILFMLLPIVLFIRMALNALDGMLARECNQQTRLGAILNETGDVISDIALYLPFL FLPESNASLVILMLFCTILTEFCGLLAQTINGVRSYAGPFGKSDRALIFGLWGLAVAIYP QWMQWNNLLWSIASILLLWTAINRCRSVLLMSAER >gi|296493496|gb|ADTK01000005.1| GENE 29 32721 - 33245 226 174 aa, chain + ## HITS:1 COG:ynbB KEGG:ns NR:ns ## COG: ynbB COG4589 # Protein_GI_number: 16129370 # Func_class: R General function prediction only # Function: Predicted CDP-diglyceride synthetase/phosphatidate cytidylyltransferase # Organism: 
Escherichia coli K12 # 1 174 125 298 298 300 97.0 9e-82 MGDPSGFLHTVSAIFWGWIMTVFALSHAAWLLMLPTTNIQGGALLVLFLLALTESNDIAQ YLWGKSCGRRKVVPKVSPGKTLEGLVGGVITTMIASLIIGPLLTPLNTLQALLAGLLIGI SGFCGDVVMSAIKRDIGVKDSGKLLPGHGGLLDRIDSLIFTAPVFFYFIRYSCY >gi|296493496|gb|ADTK01000005.1| GENE 30 33261 - 35012 1191 583 aa, chain + ## HITS:1 COG:ynbC_2 KEGG:ns NR:ns ## COG: ynbC_2 COG0500 # Protein_GI_number: 16129371 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli K12 # 275 575 1 301 311 610 96.0 1e-174 MENSRIPGEHFFTTSDNTALFYRHWPALQPGAKKVIVLFHRGHEHSGRLQHLVDELAMPD TAFYAWDARGHGKTSGPRGYSPSLARSVQDVDEFVRFAASDSQVGLEEVVVIAQSVGAVL VATWVHDYAPAIRGLVLASPAFKVKLYVPLARPALALWHRLRGLFFINSYVKGRYLTHDR QRVASFNNDPLITRAIAVNILLDLYKTSERIVSDAAAITLPTQLLISGDDYVVHRQPQID FYQRLRSPLKELHLLPGFYHDTLGEENRALAFEKMQSFISHLYANKSQKFDYQHEDCTGP SADRWRLLSGGPVPLSPVDLAYRFMRMAMKLFGAHSAGLHLGMSTGFDSGSSLDYVYQNQ PQGSNAFGRLVDKIYLNSVGWRGIRQRKTHLQMLIKQAVADLHAKGLAIRVVDIAAGHGR YVLDALENEPAVCDILLRDYSELNVAQGQEMIAQRGMSGQVRFEQGDAFNPEELSALTPR PTLAIVSGLYELFPENEQVKNSLAGLANAIEPGGILIYTGQPWHPQLELIAGVLTSHKDG KPWVMRVRSQGEMDSLVRDAGFDKCTQRIDEWGIFYGFDGGAS >gi|296493496|gb|ADTK01000005.1| GENE 31 35033 - 36325 415 430 aa, chain + ## HITS:1 COG:ynbD_1 KEGG:ns NR:ns ## COG: ynbD_1 COG0671 # Protein_GI_number: 16129372 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Escherichia coli K12 # 1 342 1 342 342 643 98.0 0 MLQGAGWLLLLAPFFFFTYGSLNQFTAVQDLNSHDIPSQVFGWETAIPFLPWTIVPYWSL DLLYGFSLFVCSTTFEQRRLVHRLILATVMACCGFLLYPLKFSFIRPEVSGVTGWLFSQL ELFDLPYNQSPSLHIILCWLLWRHFRQHLAERWRKVCGGWFLLIAISTLTTWQHHFIDVI TGLAVGMLIDWMVPVDRRWNYQKPDQRRIKIALPYVVGAGSCIVLMELMVMIQLWWSVWL CWPVFSLLIIGRGYGGLGAITTGKDSQGKLPPAVYWLTLPWRIGMWLSMRWFCRRLEPVS KITAGVYLGAFPRHIPAQNAVLDVTFEFPRGRATKDRLYFCVPMLDLVVPEEGELRQAVA MLETLREEQGSVLVHCALGLSRSALVVAAWLLCYGHCKTVDEAISYIRARRSHIVLKEDH KAMLKLWENR 
>gi|296493496|gb|ADTK01000005.1| GENE 32 36376 - 36981 736 201 aa, chain - ## HITS:1 COG:acpD KEGG:ns NR:ns ## COG: acpD COG1182 # Protein_GI_number: 16129373 # Func_class: I Lipid transport and metabolism # Function: Acyl carrier protein phosphodiesterase # Organism: Escherichia coli K12 # 1 201 1 201 201 385 99.0 1e-107 MSKVLVLKSSILAGYSQSNQLSDYFVEQWREKHSADEITVRDLAANPIPVLDGELVGALR PSDAPLTPRQQEALALSDELIAELKAHDVIVIAAPMYNFNISTQLKNYFDLVARAGVTFR YTENGPEGLVTGKKAIVITSRGGIHKDGPTDLVTPYLSTFLGFIGITDVKFVFAEGIAYG PEMAAKAQSDAKAAIDSIVAA >gi|296493496|gb|ADTK01000005.1| GENE 33 37182 - 41084 4167 1300 aa, chain + ## HITS:1 COG:ECs2015 KEGG:ns NR:ns ## COG: ECs2015 COG1643 # Protein_GI_number: 15831269 # Func_class: L Replication, recombination and repair # Function: HrpA-like helicases # Organism: Escherichia coli O157:H7 # 20 1300 1 1281 1281 2541 99.0 0 MTEQQKLTFTALQQRLDSLMLRDRLRFSRRLHGVKKVKNPDAQQAIFQEMAKEIDQAAGK VLLREAARPEITYPDNLPVSQKKQDILEAIRDHQVVIVAGETGSGKTTQLPKICMELGRG IKGLIGHTQPRRLAARTVANRIAEELKTEPGGCIGYKVRFSDHVSDNTMVKLMTDGILLA EIQQDRLLMQYDTIIIDEAHERSLNIDFLLGYLKELLPRRPDLKIIITSATIDPERFSRH FNNAPIIEVSGRTYPVEVRYRPIVEEADDTERDQLQAIFDAVDELSQESPGDILIFMSGE REIRDTADALNKLNLRHTEILPLYARLSNSEQNRVFQSHSGRRIVLATNVAETSLTVPGI KYVIDPGTARISRYSYRTKVQRLPIEPISQASANQRKGRCGRVSEGICIRLYSEDDFLSR PEFTDPEILRTNLASVILQMTALGLGDIAAFPFVEAPDKRNIQDGVRLLEELGAITTDEQ ASAYKLTPLGRQLSQLPVDPRLARMVLEAQKHGCVREAMIITSALSIQDPRERPMDKQQA SDEKHRRFHDKESDFLAFVNLWNYLGEQQKALSSNAFRRLCRTDYLNYLRVREWQDIYTQ LRQVVKELGIPVNSEPAEYREIHIALLTGLLSHIGMKDADKQEYTGARNARFSIFPGSGL FKKPPKWVMVAELVETSRLWGRIAARIDPEWVEPVAQHLIKRTYSEPHWERAQGAVMATE KVTVYGLPIVAARKVNYSQIDPALCRELFIRHALVEGDWQTRHAFFRENLKLRAEVEELE HKSRRRDILVDDETLFEFYDQRISHDVISARHFDSWWKKVSRETPDLLNFEKSMLIKEGA EKISKLDYPNFWHQGNLKLRLSYQFEPGADADGVTVHIPLPLLNQVEESGFEWQIPGLRR ELVIALIKSLPKPVRRNFVPAPNYAEAFLGRVKPLELPLLDSLERELRRMTGVTVDREDW HWDQVPDHLKITFRVVDDKNKKLKEGRSLQDLKDALKGKVQETLSAVADDGIEQSGLHIW SFGQLPESYEQKRGNYKVKAWPALVDERDSVAIKLFDNPLEQKQAMWNGLRRLLLLNIPS 
PIKYLHEKLPNKAKLGLYFNPYGKVLELIDDCISCGVDQLIDANGGPVWTEEGFAALHEK VRAELNDTVVDIAKQVEQILTAVFNINKRLKGRVDMTMALGLSDIKAQMGGLVYRGFVTG NGFKRLGDTLRYLQAIEKRLEKLAVDPHRDRAQMLKVENVQQAWQQWINKLPPARREDED VKEIRWMIEELRVSYFAQQLGTPYPISDKRILQAMEQISG >gi|296493496|gb|ADTK01000005.1| GENE 34 41356 - 42156 478 266 aa, chain + ## HITS:1 COG:ydcF KEGG:ns NR:ns ## COG: ydcF COG1434 # Protein_GI_number: 16129375 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 266 1 266 266 550 100.0 1e-156 MNITPFPTLSPATIDAINVIGQWLAQDDFSGEVPYQADCVILAGNAVMPTIDAACKIARD QQIPLLISGGIGHSTTFLYSAIAQHPHYNTIRTTGRAEATILADIAHQFWHIPHEKIWIE DQSTNCGENARFSIALLNQAVERVHTAIVVQDPTMQRRTMATFRRMTGDNPDAPRWLSYP GFVPQLGNNADSVIFINQLQGLWPVERYLSLLTGELPRLRDDSDGYGPRGRDFIVHVDFP AEVIHAWQTLKHDAVLIEAMESRSLR >gi|296493496|gb|ADTK01000005.1| GENE 35 42353 - 43792 1371 479 aa, chain + ## HITS:1 COG:ECs2021 KEGG:ns NR:ns ## COG: ECs2021 COG1012 # Protein_GI_number: 15831275 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Escherichia coli O157:H7 # 1 479 1 479 479 946 99.0 0 MSVPVQHPMYIDGQFVTWRGDAWIDVVNPATEAVISRIPDGQAEDARKAIDAAERAQPEW EALPAIERASWLRKISAGIRERASEISALIVEEGGKIQQLAEVEVAFTADYIDYMAEWAR RYEGEIIQSDRPGENILLFKRALGVTTGILPWNFPFFLIARKMAPALLTGNTIVIKPSEF TPNNAIAFAKIVDEIGLPRGVFNLVLGRGETVGQELAGNPKVAMVSMTGSVSAGEKIMAT AAKNITKVCLELGGKAPAIVMDDADLELAVKAIVDSRVINSGQVCNCAERVYVQKGIYDQ FVNRLGEAMQAVQFGNPAERNDIAMGPLINAAALERVEQKVARAVEEGARVALGGKAVDG KGYYYPPTLLLDVRQEMSIMHEETFGPVLPVVAFDTLEDAISMANDSDYGLTSSIYTQNL NVAMKAIKGLKFGETYINRENFEAMQGFHAGWRKSGIGGADGKHGLHEYLQTQVVYLQS >gi|296493496|gb|ADTK01000005.1| GENE 36 43834 - 44835 1101 333 aa, chain - ## HITS:1 COG:ECs2022 KEGG:ns NR:ns ## COG: ECs2022 COG0057 # Protein_GI_number: 15831276 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Escherichia coli O157:H7 # 1 333 1 333 333 641 99.0 0 
MSKVGINGFGRIGRLVLRRLLEVKSNIDVVAINDLTSPKILAYLLKHDSNYGPFPWSVDF TEDSLIVDGKSIAVYAEKEAKNIPWKAKGAEIIVECTGFYTSAEKSQAHLDAGAKKVLIS APAGEMKTIVYNVNDDTLDGNDTIVSVASCTTNCLAPMAKALHDSFGIEVGTMTTIHAYT GTQSLVDGPRGKDLRASRAAAENIIPHTTGAAKAIGLVIPELSGKLKGHAQRVPVKTGSV TELVSILGKKVTAEEVNNALKQATTNNESFGYTDEEIVSSDIIGSHFGSVFDATQTEITA VGDLQLVKTVAWYDNEYGFVTQLIRTLEKFAKL >gi|296493496|gb|ADTK01000005.1| GENE 37 45024 - 45554 260 176 aa, chain + ## HITS:1 COG:STM1639 KEGG:ns NR:ns ## COG: STM1639 COG3038 # Protein_GI_number: 16764983 # Func_class: C Energy production and conversion # Function: Cytochrome B561 # Organism: Salmonella typhimurium LT2 # 1 175 1 175 176 281 84.0 3e-76 MENKYSRLQISIHWLVFLLVIAAYCAMEFRGFFPRSDRPLINMVHVSCGISILVLMVVRL LLRLKYPTPPIIPKPKPMMTGLAHLGHLVIYLLFIALPVIGLVMMYNRGNPWFAFGLTMP YASEANFERVDSLKSWHETLANLGYFVIGLHAAAALAHHYFWKDNTLLRMMPRKRS >gi|296493496|gb|ADTK01000005.1| GENE 38 45799 - 45972 144 57 aa, chain + ## HITS:1 COG:no KEGG:SSON_1723 NR:ns ## KEGG: SSON_1723 # Name: ydcA # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 57 1 57 57 92 100.0 4e-18 MKKLALILFMGTLVSFYADAGRKPCSGSKGGISHCTAGGKFVCNDGSISASKKTCTN >gi|296493496|gb|ADTK01000005.1| GENE 39 46084 - 46251 57 55 aa, chain - ## HITS:1 COG:no KEGG:B21_01389 NR:ns ## KEGG: B21_01389 # Name: mokB # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 55 1 55 55 95 100.0 4e-19 MNLAIQILASYPPSGKEKGYEAQPSGGVSAHYLHYDSDIHTPDPTNALRTAVPGR >gi|296493496|gb|ADTK01000005.1| GENE 40 46592 - 48232 1131 546 aa, chain + ## HITS:1 COG:trg KEGG:ns NR:ns ## COG: trg COG0840 # Protein_GI_number: 16129380 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Escherichia coli K12 # 1 546 1 546 546 895 99.0 0 MNTTPSHRLGFLHHIRLVPLFACILGGILVLFALSSALAGYFLWQADRDQRDVTAEIEIR TGLANSSDFLRSARINMIQAGAASRIAEMEAMKRNIAQAESEIKQSQQGYRAYQNRPVKT PADEALDTELNQRFQAYITGMQPMLKYAKNGMFEAIINHESEQIRPLDNAYTDILNKAVK 
IRSTRANQLAELAHQRTRLGGMFMIGAFVLALVMTLITFMVLRRIVIRPLQHAAQRIEKI ASGDLTMNDEPAGRNEIGRLSRHLQQMQHSLGMTVGTVRQGAEEIYRGTSEISAGNADLS SRTEEQAAAIEQTAASMEQLTATVKQNADNAHHASKLAQEASIKASDGGQTVSGVVKTMG AISTSSKKISEITAVINSIAFQTNILALNAAVEAARAGEQGRGFAVVASEVRTLASRSAQ AAKEIEGLISESVRLIDLGSDEVATAGKTMSTIVDAVASVTHIMQEIAAASDEQSRGITQ VSQAISEMDKVTQQNASLVEEASAAAVSLEEQAARLTEAVDVFRLNKHSVSAEPRGAGEP VSFATV >gi|296493496|gb|ADTK01000005.1| GENE 41 48270 - 49193 603 307 aa, chain - ## HITS:1 COG:ydcI KEGG:ns NR:ns ## COG: ydcI COG0583 # Protein_GI_number: 16129381 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 307 48 354 354 595 99.0 1e-170 MEKNSLFSQRIRLRHLHTFVAVAQQGTLGRAAETLNLSQPALSKTLNELEQLTGARLFER GRQGAQLTLPGEQFLTHAVRVLDAINTAGQSLHRKEDLNNDVVRVGALPTAALGILPSVI GQFHQQQKETTLQVATMSNPMILAGLKTGEIDIGIGRMSDPELMTGLNYELLFLESLKLV VRPNHPLLQENVTLSRVLEWPVVVSPEGTAPRQHSDALVQSQGCKIPSGCIETLSASLSR QLTVEYDYVWFVPSGAVKDDLRHATLVALPVPGHGAGEPIGILTRVDATFSSGCQLMINA IRKSMPF >gi|296493496|gb|ADTK01000005.1| GENE 42 49410 - 50753 1019 447 aa, chain + ## HITS:1 COG:ECs2028 KEGG:ns NR:ns ## COG: ECs2028 COG5383 # Protein_GI_number: 15831282 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 447 1 447 447 869 99.0 0 MANSITADEIREQFSQAMSAMYQQEVPQYGTLLELVADVNLAVLENNPQLHEKMVNADEL ARLNVERHGAIRVGTAQELATLRRMFAIMGMYPVSYYDLSQAGVPVHSTAFRPIDDASLA RNPFRVFTSLLRLELIENEILRQKAAEILRQRDIFTPRCRQLLEEYEQQGGFNETQAQEF VQEALETFRWHQSATVDEETYRALHNEHRLIADVVCFPGCHINHLTPRTLDIDRVQSMMP ECGIEPKILIEGPPRREVPILLRQTSFKALEETVLFAGQKQGTHTARFGEIEQRGVALTP KGRQLYDDLLRNAGTGQDNLTHQMHLQETFRTFPDSEFLMRQQGLAWFRYRLTPSGEAHR QAIHPGDDPQPLIERGWVAAQPITYEDFLPVSAAGIFQSNLGNETQARNHGNASREAFEQ ALGCPVLDEFQLYQEAEERSKRRCGLL >gi|296493496|gb|ADTK01000005.1| GENE 43 51014 - 52633 1568 539 aa, chain + ## HITS:1 COG:ECs2029 KEGG:ns NR:ns ## COG: ECs2029 COG3131 # Protein_GI_number: 15831283 # Func_class: P Inorganic ion transport and metabolism # Function: 
Periplasmic glucans biosynthesis protein # Organism: Escherichia coli O157:H7 # 1 539 13 551 551 1131 100.0 0 MAAVCGTSGIASLFSQAAFAADSDIADGQTQRFDFSILQSMAHDLAQTAWRGAPRPLPDT LATMTPQAYNSIQYDAEKSLWHNVENRQLDAQFFHMGMGFRRRVRMFSVDPATHLAREIH FRPELFKYNDAGVDTKQLEGQSDLGFAGFRVFKAPELARRDVVSFLGASYFRAVDDTYQY GLSARGLAIDTYTDSKEEFPDFTAFWFDTVKPGATTFTVYALLDSASITGAYKFTIHCEK SQVIMDVENHLYARKDIKQLGIAPMTSMFSCGTNERRMCDTIHPQIHDSDRLSMWRGNGE WICRPLNNPQKLQFNAYTDNNPKGFGLLQLDRDFSHYQDIMGWYNKRPSLWVEPRNKWGK GTIGLMEIPTTGETLDNIVCFWQPEKAVKAGDEFAFQYRLYWSAQPPVHCPLARVMATRT GMGGFPEGWAPGEHYPEKWARRFAVDFVGGDLKAAAPKGIEPVITLSSGEAKQIEILYIE PIDGYRIQFDWYPTSDSTDPVDMRMYLRCQGDAISETWLYQYFPPAPDKRQYVDDRVMS >gi|296493496|gb|ADTK01000005.1| GENE 44 52554 - 52757 89 67 aa, chain - ## HITS:1 COG:no KEGG:SSON_1717 NR:ns ## KEGG: SSON_1717 # Name: not_defined # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 67 1 67 67 124 100.0 1e-27 MADKVYLKYTPSDYSFNLGKNASGIVFNQTAPPEEGAEEKTINSSRGRQHTDVYPALAGN TDTAMFH >gi|296493496|gb|ADTK01000005.1| GENE 45 52773 - 52997 255 74 aa, chain + ## HITS:1 COG:ydcH KEGG:ns NR:ns ## COG: ydcH COG2841 # Protein_GI_number: 16129385 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 20 74 1 55 55 82 100.0 2e-16 MFPEYRDLISRLKNENPRFMSLFDKHNKLDHEIARKEGSDGRGYNAEVVRMKKQKLQLKD EMLKILQQESVKEV >gi|296493496|gb|ADTK01000005.1| GENE 46 53060 - 53596 913 178 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|16129386|ref|NP_415944.1| ribosomal-protein-L7/L12-serine acetyltransferase [Escherichia coli str. K-12 substr. 
MG1655] # 1 178 1 178 179 356 98 3e-97 MTETIKVSESLELHAVAESHVTPLYQLICKNKTWLQQSLNWPQFVQSEEDTRKTVQGNVM LHQRGYAKMFMIFKEDELIGVISFNRIEPLSKTAEIGYWLDESHQGQGIISQALQALIHH YAQSGELRRFVIKCRVDNPQSNQVALRNGFILEGCLKQAEFLNDAYDDVNLYARIIDS >gi|296493496|gb|ADTK01000005.1| GENE 47 53591 - 54571 605 326 aa, chain - ## HITS:1 COG:no KEGG:B21_01396 NR:ns ## KEGG: B21_01396 # Name: ydcK # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 326 1 326 326 638 98.0 0 MRKYRLSEEQRAFSYQEDGTKKNVLLRQIIAISDFNDVIAGTAGGWIDRETVLAQEGNCW IYDQNAIAFGGAVISGNTRITGTSVLWGEVYATDNVWIDNSEISQGAYISDSVTIHDSLV CGQCRIFGHALIDQHSMIVAAQGLTPDHQLLLQIYDRARVSASRIVHQAQIYGDAVVRYA FIEHRAEVFDFASVEGNEENNVWLCDCAKVYGHAQVKAGIEEDAIPTIHYSSQVAEYAIV EGNCVLKHHVLIGGNAVVRGGPILLDEHVVIQGESRISGAVIIENHVELTDHAVVEAFDG DTVHVRGPKVINGEERITRTPLAGLL >gi|296493496|gb|ADTK01000005.1| GENE 48 54695 - 55687 898 330 aa, chain + ## HITS:1 COG:tehA KEGG:ns NR:ns ## COG: tehA COG1275 # Protein_GI_number: 16129388 # Func_class: P Inorganic ion transport and metabolism # Function: Tellurite resistance protein and related permeases # Organism: Escherichia coli K12 # 1 330 1 330 330 574 98.0 1e-164 MQSDKVLNLPAGYFGIVLGTIGMGFAWRYASQVWQVSHWLGDGLVILAMIIWGLLTSAFI TRLIRFPHSVLAEVRHPVMSSFVSLFPATTMLVAIGFVPWFRPLAVCLFSFGVVVQLAYA AWQTAGLWRGSHPEEATTPGLYLPTVANNFISAMACGVLGYTDAGLVFLGAGVFSWLSLE PVILQRLRSSGELPTALRTSLGIQLAPALVACSAWLSVNGGEGDTLAKMLFGYGLLQLLF MLRLMPWYLSQPFNASFWSFSFGVSALATTGLHLGSGSENGFFHTLAVPLFIFTNFIIAI LLIRTFALLMQGKLLVRTERAVLMKAEDKE >gi|296493496|gb|ADTK01000005.1| GENE 49 55684 - 56277 613 197 aa, chain + ## HITS:1 COG:tehB KEGG:ns NR:ns ## COG: tehB COG0500 # Protein_GI_number: 16129389 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli K12 # 1 197 1 197 197 400 99.0 1e-112 MIIRDENYFTDKYELTRTHSEVLEAVKVVKPGKTLDLGCGNGRNSLYLAANGYDVDAWDK NAMSIANVERIKSIENLDNLHTRVVDLNNLTFDRQYDFILSTVVLMFLEAKTIPGLIANM 
QRCTKPGGYNLIVAAMDTADYPCTVGFPFAFKEGELRRYYEGWERVKCNEDVGELHRTDA NGNRIKLRFATMLARKK >gi|296493496|gb|ADTK01000005.1| GENE 50 56579 - 57247 640 222 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_1426 NR:ns ## KEGG: ECIAI1_1426 # Name: ydcL # Def: putative lipoprotein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 222 1 222 222 433 100.0 1e-120 MRTTSFAKVAALCGLLALSGCASKITQPDKYSGFLNNYSDLKETTSATGKPVLRWVDPSF DQSKYDSIVWNPITYYPVPKPSTQVGQKVLDKILNYTNTEMKEAIAQRKPVVTTAGPRSL IFRGAITGVDTSKEGLQFYEVVPVALVVAGTQMATGHRTMDTRLYFEGELIDAATNKPVI KVVRQGEGKDLNNESTPMAFENIKQVIDDMATDATMFDVNKK >gi|296493496|gb|ADTK01000005.1| GENE 51 57552 - 57707 61 51 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_2551 NR:ns ## KEGG: EcSMS35_2551 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 51 1 51 51 62 72.0 4e-09 MHVHLVFVTRYRRQIFDHDATENYALTFQMYVLILKLNWLKWMANQITSIC >gi|296493496|gb|ADTK01000005.1| GENE 52 57838 - 58986 361 382 aa, chain + ## HITS:1 COG:ydcM KEGG:ns NR:ns ## COG: ydcM COG0675 # Protein_GI_number: 16129391 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 1 382 21 402 402 727 98.0 0 MRRFAGACRFVFNRALARQNENHEVGNKYIPYGKMASWLVEWKNATETQWLKDSPSQPLQ QSLKDLERAYKNFFQNRAAFPRFKKRGQNDVFRYPQGVKLDQENSRIFLPKLGWMRYRNS RQVTGVVKNVTVSQSCGKWYISIQTESEVSTPVHPSASMVGLDAGVAKLATLSDGTVFEP VNSFQKNQKTLARLQRQLSRKVKFSNNWQKQKRKIQRLHSCIANIRRDYLHKVTTAVSKN HAMIVIEDLKVSNMSKSAAGTVSQPGRNVRAKSGLNRSILDQGWYEMRRQLEYKQLWRGG QVLAVPPAYTSQRCAYCGHTAKENRLSQSKFRCQVCGYTANADVNGARNILAAGHAVLAC GEMVQSGRPLKQEPTEMIQATA >gi|296493496|gb|ADTK01000005.1| GENE 53 59026 - 60201 888 391 aa, chain - ## HITS:1 COG:ydcO KEGG:ns NR:ns ## COG: ydcO COG3135 # Protein_GI_number: 16129392 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein involved in benzoate metabolism # Organism: Escherichia coli K12 # 1 391 88 478 478 638 99.0 0 
MRLFSIPPPTLLAGFLAVLIGYASSGAIIWQAAIVAGATTAQISGWMTALGLAMGVSTLT LTLWYRVPVLTAWSTPGAALLVTGLQGLTLNEAIGVFIVTNALIVLCGITGLFARLMRII PHSLAAAMLAGILLRFGLQAFASLDGQFTLCGSMLLVWLATKAVAPRYAVIAAMIIGIVI VIAQGDVVTTDVVFKPVLPTYITPDFSFAHSLSVALPLFLVTMASQNAPGIAAMKAAGYS APVSPLIVFTGLLALVFSPFGVYSVGIAAITAAICQSPEAHPDKDQRWLAAAVAGIFYLI AGLFGSAITGMMAALPVSWIQMLAGLALLSTISGSLYQALHNERERDAAVVAFLVTASGL TLVGIGSAFWGLIAGGVCYVVLNLIADRNRY >gi|296493496|gb|ADTK01000005.1| GENE 54 60293 - 60829 392 178 aa, chain + ## HITS:1 COG:ECs2037 KEGG:ns NR:ns ## COG: ECs2037 COG1396 # Protein_GI_number: 15831292 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 178 1 178 178 361 99.0 1e-100 MENLARFLSTTLKQLRQQRGWSLSRLAEATGVSKAMLGQIERNESSPTVATLWKIATGLN VPFSTFISPPQSATPSVYDPQQQAMVITSLFPYDPQLCFEHFSIQMAPGAISESTPHEKG VIEHVVVIDGQLDLCVDGEWQTLNCGEGVRFAADVTHIYRNGGEQTVHFHSLIHYPRS >gi|296493496|gb|ADTK01000005.1| GENE 55 60902 - 62863 1672 653 aa, chain + ## HITS:1 COG:ECs2039 KEGG:ns NR:ns ## COG: ECs2039 COG0826 # Protein_GI_number: 15831293 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Escherichia coli O157:H7 # 1 653 15 667 667 1350 99.0 0 MTVSSHRLELLSPARDAAIAREAILHGADAVYIGGPGFGARHNASNSLKDIAELVPFAHR YGAKIFVTLNTILHDDELEPAQRLITDLYQTGVDALIVQDMGILELDIPPIELHASTQCD IRTVEKAKFLSDVGFKQIVLARELNLDQIRAIHQATDATIEFFIHGALCVAYSGQCYISH AQTGRSANRGDCSQACRLPYTLKDDQGRVVSYEKHLLSMKDNDQTANLGALIDAGVRSFK IEGRYKDMSYVKNITAHYRQMLDAIIEERGDLARASSGRTEHFFVPSTEKTFHRGSTDYF VNARKGDIGAFDSPKFIGLPVGEVLKVAKDHLDVAVTEPLANGDGLNVLIKREVVGFRAN TVEKTGENQYRVWPNEMPADLHKIRPHHPLNRNLDHNWQQALTKTSSERRVAVDIELGGW QEQLILTLTSEEGVSITHTLDGQFDEANNAEKAMNNLKDGLAKLGQTLYYARDVQINLPG ALFVPNSLLNQFRREAADMLDAARLASYQRGSRKPVADPAPVYPQTHLSFLANVYNQKAR EFYHRYGVQLIDAAYEAHEEKGEVPVMITKHCLRFAFNLCPKQAKGNIKSWKATPMQLVN GDEVLTLKFDCRPCEMHVIGKIKNHILKMPLPGSVVASVSPDELLKTLPKRKG >gi|296493496|gb|ADTK01000005.1| GENE 56 62955 - 63125 174 56 aa, chain - ## HITS:1 
COG:no KEGG:JW1432 NR:ns ## KEGG: JW1432 # Name: yncJ # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 56 21 76 76 102 100.0 6e-21 MAGHKGHEFVWVKNVDHQLRHEADSDELRAVAEESAEGLREHFYWQKSRKPEAGQR >gi|296493496|gb|ADTK01000005.1| GENE 57 63367 - 63561 102 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|300901851|ref|ZP_07119886.1| ## NR: gi|300901851|ref|ZP_07119886.1| hypothetical protein HMPREF9536_00069 [Escherichia coli MS 84-1] # 1 64 1 64 64 112 100.0 1e-23 MFTLLIYSAAGRRCETKRVQTLARISGRRCSEWPQPFETQVSWEAQCHAASLLRKAILKQ LGLS >gi|296493496|gb|ADTK01000005.1| GENE 58 63586 - 64023 406 145 aa, chain + ## HITS:1 COG:ECs2042 KEGG:ns NR:ns ## COG: ECs2042 COG1598 # Protein_GI_number: 15831296 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 145 1 145 145 263 100.0 9e-71 MRETVEIMRYPVTLTPAPEGGYMVSFVDIPEALTQGETVAEAMEAAKDALLTAFDFYFED NELIPLPSPLNSHDHFIEVPLSVASKVLLLNAFLQSEITQQELARRIGKPKQEITRLFNL HHATKIDAVQLAAKALGKELSLVMV >gi|296493496|gb|ADTK01000005.1| GENE 59 64102 - 65508 1111 468 aa, chain + ## HITS:1 COG:ydcR KEGG:ns NR:ns ## COG: ydcR COG1167 # Protein_GI_number: 16129398 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Escherichia coli K12 # 1 468 1 468 468 917 99.0 0 MKKYQQLAEQLREQIASGIWQPGDRLPSLRDQVALSGMSFMTVSHAYQLLESQGYIIARP QSGYYVAPQAIKMPKAPVIPVTRDEAVDINTYIFDMLQASRDPSVVPFASAFPDPRLFPL QQLNRSLAQVSKTATAMSVIENLPPGNAELRQAIARRYALQGITISPDEIVITAGALEAL NLSLQAVTEPGDWVIVENPCFYGALQALERLRLKALSVATDVKEGIDLQALELALQDYPV KACWLMTNSQNPLGFTLTPQKKAQLVALLNQYNVTLIEDDVYSELYFGREKPLPAKAWDR HDGVLHCSSFSKCLVPGFRIGWVAAGKHARKIQRLQLMSTLSTSSPMQLALVDYLSTRRY DAHLRRLRRQLAERKQRAWQALLRYLPAEVKIHHSDSGYFLWLELPEPLDAGELSLAALT HHISIAPGKMFSTGENWSRFFRFNTAWQWGEREEQAVKQLGKLIQERL >gi|296493496|gb|ADTK01000005.1| GENE 60 65753 - 66898 1208 381 aa, 
chain + ## HITS:1 COG:ydcS KEGG:ns NR:ns ## COG: ydcS COG0687 # Protein_GI_number: 16129399 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Escherichia coli K12 # 1 381 1 381 381 767 99.0 0 MSKTFARSSLCALTMTIMNAHAAEPPTNLDKPEGRLDIIAWPGYIERGQTDKQYDWVTQF EKETGCAVNVKTAATSDEMVSLMTKGGYDLVTASGDASLRLIMGKRVQPINTALIPNWKT LDPRVVKGDWFNVGGKVYGTPYQWGPNLLMYNTKTFPTPPDSWQVVFVEQNLPDGKSNKG RVQAYDGPIYIADAALFVKATQPQLGISDPYQLTEEQYQAVLKVLRAQHSLIHRYWHDTT VQMSDFKNEGVVASSAWPYQANALKAEGQPVATVFPKEGVTGWADTTMLHSEAKHPVCAY KWMNWSLTPKVQGDVAAWFGSLPVVPEGCKASPLLGEKGCETNGFNYFDKIAFWKTPIAE GGKFVPYSRWTQDYIAIMGGR >gi|296493496|gb|ADTK01000005.1| GENE 61 66916 - 67929 856 337 aa, chain + ## HITS:1 COG:ydcT KEGG:ns NR:ns ## COG: ydcT COG3842 # Protein_GI_number: 16129400 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Escherichia coli K12 # 1 337 1 337 337 669 99.0 0 MTYAVEFDNVSRLYGDVRAVDGVSIAIKDGEFFSMLGPSGSGKTTCLRLIAGFEQLSGGT ISIFGKPASNLPPWERDVNTVFQDYALFPHMSILDNVAYGLMVKGVNKKQRHAMAQEALE KVALGFVHQRKPSQLSGGQRQRVAIARALVNEPRVLLLDEPLGALDLKLREQMQLELKKL QQSLGITFIFVTHDQGEALSMSDRVAVFNNGRIEQVDSPRDLYMRPRTPFVAGFVGTSNV FDGLMAEKLCGMTGSFALRPEHIRLNTPGELQANGTIQAVQYQGAATRFELKLNGGEKLL VSQANMTGEELPATLTPGQQVMVSWSRDVMVPLVEER >gi|296493496|gb|ADTK01000005.1| GENE 62 67930 - 68871 984 313 aa, chain + ## HITS:1 COG:ECs2046 KEGG:ns NR:ns ## COG: ECs2046 COG1176 # Protein_GI_number: 15831300 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Escherichia coli O157:H7 # 1 313 1 313 313 506 99.0 1e-143 MAMNVLQSPSRPGLGKVSGFFWRNPGLGLFLLLLGPLMWFGIVYFGSLLTLLWQGFYTFD DFTMSVTPELTLANIRALFNPANYDIILRTLTMAVAVTIASAILAFPMAWYMARYTSGKM KAFFYIAVMLPMWASYIVKAYAWTLLLAKDGVAQWFLQHLGLEPLLTAFLTLPAVGGNTL STSGLGRFLVFLYIWLPFMILPVQAALERLPPSLLQASADLGARPRQTFRYVVLPLAIPG 
IAAGSIFTFSLTLGDFIVPQLVGPPGYFIGNMVYSQQGAIGNMPMAAAFTLVPIVLIALY LAFVKRLGAFDAL >gi|296493496|gb|ADTK01000005.1| GENE 63 68861 - 69655 733 264 aa, chain + ## HITS:1 COG:ECs2047 KEGG:ns NR:ns ## COG: ECs2047 COG1177 # Protein_GI_number: 15831301 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # Organism: Escherichia coli O157:H7 # 1 264 1 264 264 421 99.0 1e-118 MHSERAPFFLKLAAWGGVVFLHFPILIIAAYAFNTEDAAFSFPPQGLTLRWFSVAAQRSD ILDAVTLSLKVAALATLMALVLGTLAAAALWRRDFFGKNAISLLLLLPIALPGIVTGLAL LTAFKTINLEPGFFTIVVGHATFCVVVVFNNVIARFRRTSWSLVEASMDLGANGWQTFRY VVLPNLSSALLAGGILAFALSFDEIIVTTFTSGHERTLPLWLLNQLGRPRDVPVTNVVAL LVMLVTTLPILGAWWLTREGDNGQ >gi|296493496|gb|ADTK01000005.1| GENE 64 69677 - 71101 1331 474 aa, chain + ## HITS:1 COG:ECs2048 KEGG:ns NR:ns ## COG: ECs2048 COG1012 # Protein_GI_number: 15831302 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Escherichia coli O157:H7 # 1 474 1 474 474 927 99.0 0 MQHKLLINGELVSGEGEKQPVYNPATGDVLLEIAEASAEQVDAAVRAADAAFAEWGQTTP KVRAECLLKLADVIEENGQVFAELESRNCGKPLHSAFNDEIPAIVDVFRFFAGAARCLNG LAAGEYLEGHTSMIRRDPLGVVASIAPWNYPLMMAAWKLAPALAAGNCVVLKPSEITPLT ALKLAELAKDIFPAGVINVLFGRGKTVGDPLTGHPKVRMVSLTGSIATGEHIISHTASSI KRTHMELGGKAPVIVFDDADIEAVVEGVRTFGYYNAGQDCTAACRIYAQKGIYDTLVEKL GAAVATLKSGAPDDESTELGPLSSLAHLERVSKAVEEAKATGHIKVITGGEKRKGNGYYY APTLLAGALQDDAIVQKEVFGPVVSVTLFDNEEQVVNWANDSQYGLASSVWTKDVGRAHR VSARLQYGCTWVNTHFMLVSEMPHGGQKLSGYGKDMSLYGLEDYTVVRHVMVKH >gi|296493496|gb|ADTK01000005.1| GENE 65 71440 - 71661 153 73 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_1442 NR:ns ## KEGG: ECIAI1_1442 # Name: ydcX # Def: conserved hypothetical protein; putative inner membrane protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 73 10 82 82 117 100.0 1e-25 MTHICARFIHLAGRPYMSLYQHMLVFYAVMAAIAFLITWFLSHDKKRIRFLSAFLVGATW PMSFPVALLFSLF >gi|296493496|gb|ADTK01000005.1| GENE 66 71747 - 71980 373 77 aa, chain + ## HITS:1 COG:no 
KEGG:G2583_1807 NR:ns ## KEGG: G2583_1807 # Name: ydcY # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 77 1 77 77 112 100.0 5e-24 MSHLDEVIARVDAAIEESVIAHMNELLIALSDDAELSREDRYTQQQRLRTAIAHHGRKHK EDMEARHEQLTKGGTIL >gi|296493496|gb|ADTK01000005.1| GENE 67 71981 - 72430 371 149 aa, chain - ## HITS:1 COG:ydcZ KEGG:ns NR:ns ## COG: ydcZ COG3238 # Protein_GI_number: 16129406 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 149 1 149 149 222 100.0 2e-58 MNQSLTLAFLIAAGIGLVVQNTLMVRITQTSSTILIAMLLNSLVGIVLFVSILWFKQGMA GFGELVSSVRWWTLIPGLLGSFFVFASISGYQNVGAATTIAVLVASQLIGGLMLDIFRSH GVPLRALFGPICGAILLVVGAWLVARRSF >gi|296493496|gb|ADTK01000005.1| GENE 68 72427 - 72945 496 172 aa, chain - ## HITS:1 COG:ECs2052 KEGG:ns NR:ns ## COG: ECs2052 COG1247 # Protein_GI_number: 15831306 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sortase and related acyltransferases # Organism: Escherichia coli O157:H7 # 1 172 1 172 172 352 99.0 2e-97 MSIRFARKADCAAIAEIYNHAVLYTAAIWNDQTVDADNRIAWFEARTIAGYPVLVSEEDG VVTGYASFGDWRSFDGFRHTVEHSVYVHPDHQGKGLGRKLLSRLIDEARDCGKHVMVAGI ESQNQASLHLHHSLGFVVTAQMPQVGTKFGRWLDLTFMQLQLDERTEPDAIG >gi|296493496|gb|ADTK01000005.1| GENE 69 73126 - 74163 980 345 aa, chain + ## HITS:1 COG:yncB KEGG:ns NR:ns ## COG: yncB COG2130 # Protein_GI_number: 16129408 # Func_class: R General function prediction only # Function: Putative NADP-dependent oxidoreductases # Organism: Escherichia coli K12 # 1 345 32 376 376 697 97.0 0 MGQQKQRNRRWVLASRPHGAPVPENFRLEEDDVATPGEGQVLLRTVYLSLDPYMRGRMSD EPSYSPPVDIGGVMVGGTVSRVVESNHPNYQPGDWVLGYSGWQDYDISSGDDLVKLGDHP QNPSWSLGVLGMPGFTAYMGLLDIGQPKEGETLVVAAATGPVGATVGQIGKLKGCRVVGV AGGAEKCRHAIEVLGFDVCLDHHADDFAEQLVKACPKGIDIYYENVGGKVFDAVLPLLNT SARIPVCGLVSSYNATELPPGPDRLPLLMATVLKKRIRLQGFIIAQDYGHRIHEFQKEMG QWVKEDKIHYHEDITDGLENAPQTFIGLLKGKNFGKVVIRVAGDD >gi|296493496|gb|ADTK01000005.1| GENE 70 74361 - 75026 600 221 aa, chain + ## HITS:1 COG:yncC 
KEGG:ns NR:ns ## COG: yncC COG1802 # Protein_GI_number: 16129409 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 221 20 240 240 405 100.0 1e-113 MPGTGKMKHVSLTLQVENDLKHQLSIGALKPGARLITKNLAEQLGMSITPVREALLRLVS VNALSVAPAQAFTVPEVGKRQLDEINRIRYELELMAVALAVENLTPQDLAELQELLEKLQ QAQEKGDMEQIINVNRLFRLAIYHRSNMPILCEMIEQLWVRMGPGLHYLYEAINPAELRE HIENYHLLLAALKAKDKEGCRHCLAEIMQQNIAILYQQYNR >gi|296493496|gb|ADTK01000005.1| GENE 71 75062 - 77164 1622 700 aa, chain - ## HITS:1 COG:yncD KEGG:ns NR:ns ## COG: yncD COG1629 # Protein_GI_number: 16129410 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Escherichia coli K12 # 1 700 1 700 700 1374 99.0 0 MKIFSVRQTVLPALLVLSPVVFAADEQTMIVSAAPQVVSELDTPAAVSVVDGEEMRLATP RINLSESLTSVPGLQVQNRQNYAQDLQLSIRGFGSRSTYGIRGIRLYVDGIPATMPDGQG QTSNIDLSSVQNVEVLRGPFSALYGNASGGVMNVTTQTGQQPPTIEASSYYGSFGSWRYG LKATGATGDGTQPGDVDYTVSTTRFTTHGYRDHSGAQKNLANAKLGVRIDEASKLSLIFN SVDIKADDPGGLTKAEWKANPQQAPRAEQYDTRKTIKQTQAGLRYERSLSAQDDMSVMMY AGERETTQYQSIPMAPQLNPSHAGGVITLQRHYQGIDSRWTHRGELGVPVTFTTGLNYEN MSENRKGYNNFRLNSGMPEYGQKGELRRDERNLMWNIDPYLQTQWQLSEKLSLDAGVRYS SVWFDSNDHYVTPGNGDDSGDASYHKWLPAGSLKYAMTDAWNIYLAAGRGFETPTINELS YRADGQSGMNLGLKPSTNDTIEIGSKTRIGDGLLSLALFQTDTDDEIVVDSSSGGRTTYK NAGKTRRQGAELAWDQRFAGDFRVNASWTWLDATYRSNVCNEQDCNGNRMPGIARNMGFA SIGYVPEDGWYAGTEARYMGDIMADDENTAKAPSYTLVGLFTGYKYNYHNLTVDLFGRVD NLFDKEYVGSVIVNESNGRYYEPSPGRNYGVGMNIAWRFE >gi|296493496|gb|ADTK01000005.1| GENE 72 77406 - 78467 1293 353 aa, chain + ## HITS:1 COG:ECs2056 KEGG:ns NR:ns ## COG: ECs2056 COG3391 # Protein_GI_number: 15831310 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 353 1 353 353 622 99.0 1e-178 MHLRHLFSSRLRGSLLLGSLLVASSFSTQAAEEMLRKAVGKGAYEMAYSQQENALWLATS QSRKLDKGGVVYRLDPVTLEVTQAIHNDLKPFGATINNTTQTLWFGNTVNSAVTAIDAKT GEVKGRLVLDDRKRTEEVRPLQPRELVADDATNTVYISGIGKESVIWVVDGENIKLKTAI 
QNTGKMSTGLALDSKGKRLYTTNADGELITIDTADNKILSRKKLLDDGKEHFFINISLDT ARQRAFITDSKAAEVLVVDTRNGNILAKVAAPESLAVLFNPARNEAYVTHRQAGKVSVID AKSYKVVKTFDTPTHPNSLALSADGKTLYVSVKQKSTKQQEATQPDDVIRIAL >gi|296493496|gb|ADTK01000005.1| GENE 73 78582 - 80081 1511 499 aa, chain - ## HITS:1 COG:ECs2057 KEGG:ns NR:ns ## COG: ECs2057 COG1113 # Protein_GI_number: 15831311 # Func_class: E Amino acid transport and metabolism # Function: Gamma-aminobutyrate permease and related permeases # Organism: Escherichia coli O157:H7 # 1 499 18 516 516 918 100.0 0 MSKHDTDTSDQHAAKRRWLNAHEEGYHKAMGNRQVQMIAIGGAIGTGLFLGAGARLQMAG PALALVYLICGLFSFFILRALGELVLHRPSSGSFVSYAREFLGEKAAYVAGWMYFINWAM TGIVDITAVALYMHYWGAFGGVPQWVFALAALTIVGTMNMIGVKWFAEMEFWFALIKVLA IVTFLVVGTVFLGSGQPLDGNTTGFHLITDNGGFFPHGLLPALVLIQGVVFAFASIEMVG TAAGECKDPQTMVPKAINSVIWRIGLFYVGSVVLLVMLLPWSAYQAGQSPFVTFFSKLGV PYIGSIMNIVVLTAALSSLNSGLYCTGRILRSMAMGGSAPSFMAKMSRQHVPYAGILATL VVYVVGVFLNYLVPSRVFEIVLNFASLGIIASWAFIIVCQMRLRKAIKEGKAADVSFKLP GAPFTSWLTLLFLLSVLVLMAFDYPNGTYTIAALPIIGILLVIGWFGVRKRVAEIHSTAP VVEEDEEKQEIVFKPETAS >gi|296493496|gb|ADTK01000005.1| GENE 74 80348 - 80965 472 205 aa, chain + ## HITS:1 COG:yncG KEGG:ns NR:ns ## COG: yncG COG0625 # Protein_GI_number: 16129413 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione S-transferase # Organism: Escherichia coli K12 # 1 205 1 205 205 414 98.0 1e-116 MIKVYGVPGWGSTISELMLTLADIPYQFVDVSGFDHEGTSRELLKTLNPLCQIPTLALEN DEIMTETAAIALMVLDRRPDLAPPVGRAERQQFQRLLVWLVANVYPTFTFADYPERWAPD APEQLKKNVIEYRKSLYIWLNSQLTAEPYAFGEQLTLVDCYLCTMRTWGPGHEWFQDNAT NISAIADAVCQLPKLQEVLKRNEII >gi|296493496|gb|ADTK01000005.1| GENE 75 81041 - 81253 99 70 aa, chain + ## HITS:1 COG:no KEGG:ECO103_1586 NR:ns ## KEGG: ECO103_1586 # Name: yncH # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 70 1 70 70 108 98.0 4e-23 MLCFLIYITLPFIQLVYFISSEKKLTIHIVQMFHLLSQVFYNLKKFLMMDMLGVGDAINI NTNKNIRQVC >gi|296493496|gb|ADTK01000005.1| GENE 76 82074 - 82500 400 142 aa, chain + ## HITS:1 
COG:ECs2060 KEGG:ns NR:ns ## COG: ECs2060 COG3501 # Protein_GI_number: 15831314 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 142 1 142 702 262 97.0 1e-70 MSTGLRFTLEVDGLPPDAFAVVSFHLTQSLSSLFSLDLSLVSQQFLSLEFAQVLDKMAYL TIWQGDDVQRRVKGMVTWFELGENDKNQMLYSMKVCPPLWRTGLRQNFRIFQNEDIESIL ATILKENGVTEWSPLFSEPHPS Prediction of potential genes in microbial genomes Time: Mon May 16 14:57:35 2011 Seq name: gi|296493495|gb|ADTK01000006.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont14.1, whole genome shotgun sequence Length of sequence - 1642 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 776 319 ## ECO111_p4-04 putative colicin activity protein 2 1 Op 2 . + CDS 779 - 1039 222 ## ECO111_p4-10 probable colicin immunity protein + Term 1048 - 1089 4.7 3 2 Tu 1 . + CDS 1101 - 1244 107 ## gi|510383|emb|CAA45169.1| lysis protein + Term 1272 - 1308 3.1 Predicted protein(s) >gi|296493495|gb|ADTK01000006.1| GENE 1 3 - 776 319 257 aa, chain + ## HITS:1 COG:no KEGG:ECO111_p4-04 NR:ns ## KEGG: ECO111_p4-04 # Name: not_defined # Def: putative colicin activity protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 257 325 581 581 394 91.0 1e-108 RARAELNQANEDVARNQERQAKAVQVYNSRKSELDAANKTLADAIAEIKQFDRFAHDPMA GGHRMWQMAGLKAQRAQTDVNNKQAAFDAAAKEKSDADAALSAAQERRKQKENKEKDAKD KLDKESKRNKPGKATGKGKPVGDKWLDDAGKDSGAPIPDRIADKLRDKEFKNFDDFRKKF WEEVSKDPDLSKQFKGSNKTNIQKGKAPFARKKDQVGGRERFELHHDKPISQDGGVYDMN NIRVTTPKRHIDIHRGK >gi|296493495|gb|ADTK01000006.1| GENE 2 779 - 1039 222 86 aa, chain + ## HITS:1 COG:no KEGG:ECO111_p4-10 NR:ns ## KEGG: ECO111_p4-10 # Name: not_defined # Def: probable colicin immunity protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 86 1 86 86 141 96.0 8e-33 MELKHSISDYTEAEFLEFVKKICRAEGATEEDDNKLVREFERLTEHPDGSDLIYYPRDDR EDSPEGIVKEIKEWRAANGKSGFKQG 
>gi|296493495|gb|ADTK01000006.1| GENE 3 1101 - 1244 107 47 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|510383|emb|CAA45169.1| ## NR: gi|510383|emb|CAA45169.1| lysis protein [Escherichia coli] # 1 47 1 47 47 66 87.0 5e-10 MKKITGIILLLLAVIILSACQANYIRDVQGGTVSPSSTAEVTGLATQ Prediction of potential genes in microbial genomes Time: Mon May 16 14:57:45 2011 Seq name: gi|296493494|gb|ADTK01000007.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont15.1, whole genome shotgun sequence Length of sequence - 568 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 11 - 394 364 ## ECO111_p4-04 putative colicin activity protein - Prom 472 - 531 3.4 Predicted protein(s) >gi|296493494|gb|ADTK01000007.1| GENE 1 11 - 394 364 127 aa, chain - ## HITS:1 COG:no KEGG:ECO111_p4-04 NR:ns ## KEGG: ECO111_p4-04 # Name: not_defined # Def: putative colicin activity protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 126 1 126 581 135 100.0 5e-31 MSGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGS GHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIADIM AALKGPV Prediction of potential genes in microbial genomes Time: Mon May 16 14:58:02 2011 Seq name: gi|296493493|gb|ADTK01000008.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont20.1, whole genome shotgun sequence Length of sequence - 43649 bp Number of predicted genes - 42, with homology - 41 Number of transcription units - 17, operones - 9 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 114 - 1619 659 ## ECO26_0314 hypothetical protein 2 1 Op 2 . + CDS 1639 - 2352 289 ## APECO1_1716 hypothetical protein + Term 2595 - 2635 5.1 - Term 2384 - 2423 2.1 3 2 Tu 1 . - CDS 2434 - 3315 140 ## COG0582 Integrase - Prom 3384 - 3443 1.9 + Prom 3731 - 3790 5.6 4 3 Tu 1 . 
+ CDS 4039 - 5253 429 ## PROTEIN SUPPORTED gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 + Term 5431 - 5459 -1.0 + Prom 5533 - 5592 4.0 5 4 Op 1 . + CDS 5681 - 5884 236 ## COG3311 Predicted transcriptional regulator 6 4 Op 2 . + CDS 5884 - 6315 270 ## SbBS512_E4075 hypothetical protein 7 4 Op 3 . + CDS 6328 - 7161 645 ## COG3561 Phage anti-repressor protein + Prom 7783 - 7842 5.6 8 5 Op 1 . + CDS 7992 - 8186 246 ## SbBS512_E4078 hypothetical protein 9 5 Op 2 . + CDS 8179 - 8397 260 ## SbBS512_E4081 hypothetical protein 10 5 Op 3 . + CDS 8390 - 8584 100 ## gi|237703423|ref|ZP_04533904.1| predicted protein 11 5 Op 4 . + CDS 8581 - 8844 154 ## gi|227884731|ref|ZP_04002536.1| second beta subunit of nicotinic acetylcholine receptor 12 5 Op 5 . + CDS 8841 - 9062 248 ## SbBS512_E4082 hypothetical protein 13 5 Op 6 . + CDS 9055 - 9657 446 ## ETA_24230 conserved hypothetical protein, related to bacteriophage P27 14 5 Op 7 . + CDS 9668 - 10009 290 ## SbBS512_E4084 hypothetical protein 15 5 Op 8 . + CDS 10002 - 10373 184 ## gi|227884727|ref|ZP_04002532.1| hypothetical protein HMPREF0358_1319 16 5 Op 9 . + CDS 10360 - 13116 939 ## COG5519 Superfamily II helicase and inactivated derivatives + Term 13170 - 13207 -0.9 + Prom 13615 - 13674 7.5 17 6 Op 1 . + CDS 13701 - 13862 111 ## gi|300901898|ref|ZP_07119930.1| hypothetical protein HMPREF9536_00113 18 6 Op 2 . + CDS 13879 - 14340 296 ## EFER_2037 hypothetical protein 19 6 Op 3 . + CDS 14334 - 15011 605 ## HDEF_1687 APSE-2 prophage; transfer protein gp7 20 6 Op 4 . + CDS 15011 - 16432 394 ## COG1835 Predicted acyltransferases 21 6 Op 5 . + CDS 16432 - 18537 506 ## ECS88_2506 putative DNA transfer protein from bacteriophage + Term 18541 - 18572 1.1 + Prom 18673 - 18732 5.7 22 7 Tu 1 . + CDS 18935 - 19306 101 ## ECO111_0327 hypothetical protein 23 8 Op 1 . - CDS 19951 - 20136 134 ## gi|300901906|ref|ZP_07119938.1| transcriptional regulator, AlpA family 24 8 Op 2 . 
- CDS 20063 - 20374 69 ## gi|301305460|ref|ZP_07211553.1| hypothetical protein HMPREF9347_04082 + Prom 20678 - 20737 3.6 25 9 Tu 1 . + CDS 20811 - 21098 107 ## ECs1588 putative transcriptional activator 26 10 Op 1 2/0.500 - CDS 21770 - 22180 303 ## COG0583 Transcriptional regulator 27 10 Op 2 6/0.000 - CDS 22159 - 23115 531 ## COG1975 Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family 28 10 Op 3 12/0.000 - CDS 23125 - 25323 1811 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 29 10 Op 4 15/0.000 - CDS 25320 - 26276 656 ## COG1319 Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs 30 10 Op 5 . - CDS 26273 - 26962 427 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs + Prom 27266 - 27325 5.5 31 11 Tu 1 . + CDS 27380 - 27994 290 ## COG3477 Predicted periplasmic/secreted protein + Term 28014 - 28043 -0.9 - Term 28596 - 28636 3.2 32 12 Op 1 . - CDS 28884 - 29594 504 ## ECSE_0307 hypothetical protein 33 12 Op 2 . - CDS 29563 - 31206 959 ## EC55989_0294 putative surface or exported protein 34 12 Op 3 . - CDS 31196 - 33721 1739 ## B21_00253 hypothetical protein 35 12 Op 4 . - CDS 33747 - 34415 442 ## APECO1_1705 hypothetical protein - Term 34424 - 34456 3.0 36 13 Op 1 . - CDS 34473 - 35060 621 ## ECIAI1_0294 common pilus ECP 37 13 Op 2 . - CDS 35135 - 35677 158 ## COG2771 DNA-binding HTH domain-containing proteins - Prom 35821 - 35880 6.1 - Term 36699 - 36736 2.8 38 14 Op 1 6/0.000 - CDS 36762 - 36905 241 ## PROTEIN SUPPORTED gi|26246304|ref|NP_752343.1| 50S ribosomal protein L36 39 14 Op 2 . - CDS 36902 - 37168 460 ## PROTEIN SUPPORTED gi|110640565|ref|YP_668293.1| 50S ribosomal protein L31 type B - Prom 37192 - 37251 5.2 40 15 Tu 1 . - CDS 37529 - 37630 61 ## - Prom 37731 - 37790 4.7 + Prom 38480 - 38539 7.2 41 16 Tu 1 . + CDS 38745 - 43001 2686 ## EcolC_3322 Ig domain-containing protein + Term 43040 - 43074 4.6 - Term 43065 - 43112 13.9 42 17 Tu 1 . 
- CDS 43141 - 43488 252 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 43570 - 43629 2.2 Predicted protein(s) >gi|296493493|gb|ADTK01000008.1| GENE 1 114 - 1619 659 501 aa, chain + ## HITS:1 COG:no KEGG:ECO26_0314 NR:ns ## KEGG: ECO26_0314 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 497 113 609 609 881 97.0 0 MHRLSVAEQYKAIAQVREIAKQYEQGAISLARMNSQMKRLQQQQRKINGNRKAQFAPVKG GSGNSATMGALLFGGATAAAGVMAVSKTSEFMANSFANAEAQGELIQRAKLGGVDVNQMY NITEWAYKNGVDSMMGDQGARKYLDQMKDVRERAAKSYSEAELVKDKKTGKSEWKGGDNA INELLNIGVINKSDLKKFADNPAGLISKAVNGMMKKGFSDSQIGQRLEDLGDDLMLTSKY WQRSVKDVQESINQQKASGKWLTETQQESIVKFRELNRQLSQLSDARQVAFVDGFMKSLD PKVTEEFLKNLSNLTPYFTKLGEAVGSLFEAIMKIVNWFNRNDDKTDAIQKNLGDAPPLS NEGMKQHLSNLVPDQYKGAGTATTAPDNSHSLFNTIKGLLFDDNSSVSDVKMSVNEAPIT NLKQGALNNLAMTTPAYNLSPVFNINPTFEVVTEVPLTINSDTSRLSDYVDFTARASRDS FLKSLTLTSLSGQSNGGEFPP >gi|296493493|gb|ADTK01000008.1| GENE 2 1639 - 2352 289 237 aa, chain + ## HITS:1 COG:no KEGG:APECO1_1716 NR:ns ## KEGG: APECO1_1716 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 237 1 237 237 404 99.0 1e-111 MATAGILTIRAANTPEQHVQAVYKAEQDINSTKNENSKANKTKGENGFAIVTSGLASSGN DVYENYMALAFDSVDNVNVRRLADVTSYPVENGATVSDHVQIKNNKFSLKGRITETPIKS DPGLLKSAGVNGNRRSLAIDYLNQIMDSRQPFLLVTENKTFENVVLTGIEYTEEASESLV FDLSFEQIRLVSYGTVNTVAIKTQPSKNIGANMKKRVNTEKSSSEGEDTITPAFKQE >gi|296493493|gb|ADTK01000008.1| GENE 3 2434 - 3315 140 293 aa, chain - ## HITS:1 COG:Z5878_2 KEGG:ns NR:ns ## COG: Z5878_2 COG0582 # Protein_GI_number: 15804857 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Escherichia coli O157:H7 EDL933 # 110 277 4 182 205 64 30.0 2e-10 MFLSYRKKPDIALQDVSRTTVTGWIEHMQKTLSQQSIANYISPMAQLWELASSRYHDAPE RALSPWRGHRLDVAQSRESYEAFSNKELLQVLQVFSGDSAENKEMMALCLIGAYTGMRIN EIASLTIDDVKEIEGVLCFEITQGKTKAAARVVPVHSLITPLVLSLREKPHNGFLFYHAS ITERADGKRSTWHTQRFTRAKRKALGEKGTERKVFHSLRHGVAQLLDRNQIPEDRIALLL 
GHTRGNTETFRTYSKNAASPVELKNYIELLRYPEIEKGLSINKKSNLRHKTTP >gi|296493493|gb|ADTK01000008.1| GENE 4 4039 - 5253 429 404 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 [Campylobacter concisus 13826] # 10 389 14 403 406 169 29 2e-41 MKLNARQIDTAKPKEKAYKLADGGGLYLLVKPNGGKYWRLKYRVAGKEKLLALGVYPEVT LADARAKREDAKRGIAGGIDPMEAKREEKIARETQLNNTFKDIALEWHSSKLKKWSAGYA SDILEAFNKDVLPYIGKKPIAEIKPLELLNVLRRIEGRGATEKAKKVRQRCGEVFRYAIV TGRAEYNPAPDLTSAMQGHESNHYPFLTAKELPDFFKALSSYSGSALVVMAARLLIITGL RTGELRGALWDEIDFNKAIWEIPASRMKMRRPHIVPLSEQALSLIGKIREITGNYPLMFP GRNDPRKTMSEASINQVFKRIGYAGRVTGHGFRHTMSTILHEQGYNTAWIETQLAHVDKN SIRGTYNHAQYLDGRREMLQWYADYMDSLEHGGNVVHMVFEKHA >gi|296493493|gb|ADTK01000008.1| GENE 5 5681 - 5884 236 67 aa, chain + ## HITS:1 COG:ECs3513 KEGG:ns NR:ns ## COG: ECs3513 COG3311 # Protein_GI_number: 15832767 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 67 1 68 68 83 61.0 9e-17 MANYNSLIRLSEVQRRTGYSKAWIYRLISQGRFPKQVKIGSRAIAFVESEIDEWIEKCIL ESRDEVA >gi|296493493|gb|ADTK01000008.1| GENE 6 5884 - 6315 270 143 aa, chain + ## HITS:1 COG:no KEGG:SbBS512_E4075 NR:ns ## KEGG: SbBS512_E4075 # Name: not_defined # Def: hypothetical protein # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 1 143 1 143 143 234 83.0 7e-61 MEKKNRPLQAANSDIRVSDITPLTKSLQAPKRTPKKHRARVYMLRTGIEGWTENDILRYC RLSSGRNYATELERQLGITLERIDEKNPDGIGTHLRYRFSCRGDVLKVITHINHLANIND HNGLSQQEIADILKLYPDAFNAA >gi|296493493|gb|ADTK01000008.1| GENE 7 6328 - 7161 645 277 aa, chain + ## HITS:1 COG:ECs1251 KEGG:ns NR:ns ## COG: ECs1251 COG3561 # Protein_GI_number: 15830505 # Func_class: K Transcription # Function: Phage anti-repressor protein # Organism: Escherichia coli O157:H7 # 29 147 3 107 209 94 42.0 3e-19 MNIEKSRLISEAAPHLNASLSTINGNEFAAIVPVIPGHIGGRETNIVSAKALHKALGVGK DFSTWITDRISEYDFTIGHDYSVHKTISPNLGKSPNGAAYSKIKQSGRPGKDYLLSVGMA KELAMIERNDQGRAIRRYFIQCEEELQRSVPEIAARYRRQLKARISAANNFKPMCDALNM 
ARAEQGKTTQQHHYTNESNMISRIVLGGLTAKQWARINGYSGEPRDHMNAEQLEHLSYLE STNITLIDMGMEYEQRKGELTRLSQRWLAKRLEAVHV >gi|296493493|gb|ADTK01000008.1| GENE 8 7992 - 8186 246 64 aa, chain + ## HITS:1 COG:no KEGG:SbBS512_E4078 NR:ns ## KEGG: SbBS512_E4078 # Name: not_defined # Def: hypothetical protein # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 1 61 172 232 262 128 100.0 5e-29 MLATTPTQKPQFIWIIAAVRRDCPTITAKIHHIAAESERDARRSLVRDHVCFFAGRIRME VAHD >gi|296493493|gb|ADTK01000008.1| GENE 9 8179 - 8397 260 72 aa, chain + ## HITS:1 COG:no KEGG:SbBS512_E4081 NR:ns ## KEGG: SbBS512_E4081 # Name: not_defined # Def: hypothetical protein # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 1 72 1 72 72 126 97.0 3e-28 MIKTYDVHMDPLERTSQIITLTEVINDILVSNSPSRDEKLKALLAILDLAVRDVHFLLEG GEMPGKTGATNE >gi|296493493|gb|ADTK01000008.1| GENE 10 8390 - 8584 100 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237703423|ref|ZP_04533904.1| ## NR: gi|237703423|ref|ZP_04533904.1| predicted protein [Escherichia sp. 3_2_53FAA] # 1 64 1 64 64 81 96.0 1e-14 MNNSINAPRLTSALQLIEQAAAVLVAVSLSAEEMDAADVVDAIKACSSLVNDARAELVIL GGEK >gi|296493493|gb|ADTK01000008.1| GENE 11 8581 - 8844 154 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|227884731|ref|ZP_04002536.1| ## NR: gi|227884731|ref|ZP_04002536.1| second beta subunit of nicotinic acetylcholine receptor [Escherichia coli 83972] # 1 87 1 87 87 160 100.0 3e-38 MNINLIYRHPCELEIESLLSREEPYPDTFTLADRTTERLTRARTGLVHVMNEILPSVGGE QATVITSWLQKVTSLIDISLIDAESAK >gi|296493493|gb|ADTK01000008.1| GENE 12 8841 - 9062 248 73 aa, chain + ## HITS:1 COG:no KEGG:SbBS512_E4082 NR:ns ## KEGG: SbBS512_E4082 # Name: not_defined # Def: hypothetical protein # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 1 73 1 73 73 134 100.0 9e-31 MTNIQLIEAQCRIEQVQTVLGFWLEGASPSNRDKLMIGAVMSLLNGVPEAIQEADELLGK YELQNHSGEAKHE >gi|296493493|gb|ADTK01000008.1| GENE 13 9055 - 9657 446 200 aa, chain + ## HITS:1 COG:no KEGG:ETA_24230 NR:ns ## KEGG: ETA_24230 # Name: not_defined # Def: conserved hypothetical 
protein, related to bacteriophage P27 # Organism: E.tasmaniensis # Pathway: not_defined # 7 200 13 203 203 147 41.0 2e-34 MNNFLTFHAEATPDGVNIMYRSNDGMTERVEAVSYIDAVNRLDAGDYDDKPDEGMSIHLA IADGGNQGYFDYTSQHNVIMWRWLIATVFMLEMREENGTVSIIDDTGNPSEVAVYSNGIV AMPLYPVAERLAMANNIEGAMIERFGIESGTERAIIFYRAMMDVEQGALTPFGRETLAEL HNSFIAELNENGMPAEPVTH >gi|296493493|gb|ADTK01000008.1| GENE 14 9668 - 10009 290 113 aa, chain + ## HITS:1 COG:no KEGG:SbBS512_E4084 NR:ns ## KEGG: SbBS512_E4084 # Name: not_defined # Def: hypothetical protein # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 1 113 1 113 113 206 91.0 2e-52 MITKNFRLNALANKYASALYKHITSTSGGDYFMVDADGEAVRVEIVNGVKGVRSLIDSYT LAAMKVFYPQWETVGIELLDRCVTKDGLTDVGREIWQSMVNDMGATVAGGSHA >gi|296493493|gb|ADTK01000008.1| GENE 15 10002 - 10373 184 123 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|227884727|ref|ZP_04002532.1| ## NR: gi|227884727|ref|ZP_04002532.1| hypothetical protein HMPREF0358_1319 [Escherichia coli 83972] # 1 123 1 123 123 243 100.0 4e-63 MRENAEMALSSAIGEQVAKIAGAVWIHNLHSTGEEKMAIQTPEGRTITTSLKPSDVCDLI CAFMYPAMRTAHGDKWKLATTAEFDMWLNNDGMLTDYGITKWQMLVSHIANAIDHVGYGD AKH >gi|296493493|gb|ADTK01000008.1| GENE 16 10360 - 13116 939 918 aa, chain + ## HITS:1 COG:Z1843 KEGG:ns NR:ns ## COG: Z1843 COG5519 # Protein_GI_number: 15801311 # Func_class: L Replication, recombination and repair # Function: Superfamily II helicase and inactivated derivatives # Organism: Escherichia coli O157:H7 EDL933 # 531 912 1 384 386 509 65.0 1e-144 MRNIDLIRQVISASENNWPHVLGCLNINVPDSPRRHAPCPACGGKDRFRFDDNGRGSFIC NQCGAGDGLDLIKRVNNCDTTEAALLAADVLGIDYRTTETPEATSQKREQLETERQRREQ ERLKRAEKDEQQRRDTFSRQFDDMRRKAVNGKSDYLVAKGVGDFTFPVLPDGSLLLALVD KSGAVTAAQTITSHGEKRLLTGSAKRGAYHAINAPETTQSILIAEGLATALSAHLIRPEA LTVAAIDAGNLLYVAQVLRDKFPSAQIIIAADNDHSEGRQNTGRIAAEKAALSVSGWVAL PPTDHKADWNDYHQKHGIKCATEAFNKSMYQPQGNGVKQEPQTIEGSDFKVMDTDPLKPR IESREDGIYWVSPRADSQSGEIINNESWLCSPLSVIGTGRDDKDQYLILRWLSFGSETPT TAAIPLADIGEREGWRTLKAGGVNVTTKSSLRAILADWLQRSGSRELWRVAHATGWQCGA 
YIMPDGEIIGTPENPVLFSGRSSAAAGYTVSGSAKSWRDNVARLAFGNYSMMTGIGAALA APLIGLVGADGFGIHFYEQSSAGKTTTANVASSLYGNPDLLRLTWYGTALGLANEAAAHN DGLMPLDEVGQGADPVSVSQSAYALFNGVGKLQGAKDGGNRDLKRWRTVAISTGEMDLET FIATSGRKTKAGQLVRLLNIPLSKAVRFHDYQNGKQHADALKDAYQHHHGAAGREWIKWL ADHQQQAIKTVRDCESRWRSLIPSDYGEQVHRVAARFAILEGALLLGEVVTGWDAQICRD AIQHSYNAWLREFGTGNKEHQQIIEQTEAFLNAYGLSRFAPFPYSPADLPIKDLAGYRQR GEHDESPMIFYTFPATFEKEIACGFNAKQFAEVLKKAGMLTPPNSGRGYQRKSPRIQGRQ INVYVLNYQPGDYNSSEE >gi|296493493|gb|ADTK01000008.1| GENE 17 13701 - 13862 111 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|300901898|ref|ZP_07119930.1| ## NR: gi|300901898|ref|ZP_07119930.1| hypothetical protein HMPREF9536_00113 [Escherichia coli MS 84-1] # 1 53 1 53 53 80 100.0 2e-14 MAVSVKPVLISEKQMEAIKKIQEEQRKKSGIGVAPTLHEIARGLIDKALAGCM >gi|296493493|gb|ADTK01000008.1| GENE 18 13879 - 14340 296 153 aa, chain + ## HITS:1 COG:no KEGG:EFER_2037 NR:ns ## KEGG: EFER_2037 # Name: not_defined # Def: hypothetical protein # Organism: E.fergusonii # Pathway: not_defined # 1 150 2 151 160 276 84.0 2e-73 MELKFIDNPVRLQAFLNEPGNTENIVEPGHTYYIKPDAVYLGIYEGLVLAGVHEVRNFWH SVVECHAVYDPGFRGEYALQGHRLFCKWLLENSPFLNSVTMVPDTTKYGRSIIRLLGATR VGHLADAYIRCGKPVGVTLYQLTRQQYKELSEC >gi|296493493|gb|ADTK01000008.1| GENE 19 14334 - 15011 605 225 aa, chain + ## HITS:1 COG:no KEGG:HDEF_1687 NR:ns ## KEGG: HDEF_1687 # Name: P32 # Def: APSE-2 prophage; transfer protein gp7 # Organism: H.defensa # Pathway: not_defined # 20 211 6 196 209 156 61.0 5e-37 MLITHIAHKHLNRAVYEKGGDGGASKAQAKAQQQAIDLQREQWNTVMNNLKPYAEVGLPA LQQLQGLMTLEGQNKAANDFFGSGLYKTQADQARYQNLVSAEATGGLGSTATSNQLSSIA PMLYNNWLSGQMQNYGNLLNVGMNAASGQATAGQNYANNTGQLLQGLGAIRAGQAQQPSS LARGIGGAASGALAGAQLGSVVPGIGNVAGAIGGGLIGLVGGLGF >gi|296493493|gb|ADTK01000008.1| GENE 20 15011 - 16432 394 473 aa, chain + ## HITS:1 COG:ECs3232 KEGG:ns NR:ns ## COG: ECs3232 COG1835 # Protein_GI_number: 15832486 # Func_class: I Lipid transport and metabolism # Function: Predicted acyltransferases # Organism: Escherichia coli O157:H7 # 1 203 1 217 378 171 44.0 
4e-42 MATWQLGGLPGMAPQNDNAPRSSVPQPVQYQQQPNVGLMALQGLRGVAEINQQARQQQRK AEFQKAYAGAFESGDRNKMRSLISEYPEEFESVQKGMGFIDDDQRNSIGHLATSAQIASS LGTGAFGKFIADNEDEMRRLGISPESVAEMHVNDPQEFQRLAGSMALFSLGHEKYFDIKD RMEGRDIERGKLAETIRSNQAGEALQARGQDISRANALTSAYAPTAAMQNYNQYAQMLKA DPEGAAAFAAAAGINTNAKKLMSVRENDDGTVTKYYTDGSEEQGKLNQPISGDGFRPIAL PTAQKIMEKSPEGAKKVAGFAYRVRDALDSMDTLKGQLSPQRVAIINNALGNGTLANLTL SPTEQQYVVNANDAIMAILRQETGAAIVPAEMSKYYQMYFPQPGDSTKTIDTKRRKMENQ FNSLKAASGRAYDALRVISAVDRGTSSSSQTLPQSEQVSQPAASSNFSSLWGD >gi|296493493|gb|ADTK01000008.1| GENE 21 16432 - 18537 506 701 aa, chain + ## HITS:1 COG:no KEGG:ECS88_2506 NR:ns ## KEGG: ECS88_2506 # Name: not_defined # Def: putative DNA transfer protein from bacteriophage # Organism: E.coli_S88 # Pathway: not_defined # 63 701 33 671 671 1043 92.0 0 MAKAWKDVIASPQYQALAPEQKVQAQEQYFNEVVAPQAGESVEQAKQAFYAAYPLPSTNE IDRSQSATQNIQHTSSDNSLASGYAKLATQQREGLERSAEQGASLGAAMRDAITGESRMT PEMERLQNVASAPELNSLSMDALKAGWSQLFGSDASQEKILQGMGATLRQDEKGNTIVSL PSGDYALNKPGLSPQDLTSFLANALAFTPAGRAGTVLGAIGKSAATDLALQGATSLAGGE DIDPLQTVISAGIGGIGKGLENTASAVSRAVRGDMSPEAKAAVDFASERNLPLMTSDMLK DKTFMQSQAQTLGERVPFFGTGKNRLNQQQARENLVRTFSDGLGSISDKQLYESATKGQQ KFIEAAGKRYNRIIDAMGDTPVDLSNTVKTIDNQIAVLSRPGKSQDRAAVKVLQQFKDDI TSGPNDLRLARENRTDLRKRFIASSDTVDKDTLQKASDIIYKAYTADMKKAVAKNLGADE AINMARVDRSWSKFNDMMGRTRVQKAIASGKSTPEDVTKLVFSQSPSERSQLYRLLDDNG RQNARAAIVQNAVDKATDPSGNISVEKFINALHRNRKQSATFFKGVHGKELDGVIKYLND TRHAAKANVQNLNGQQLYGLLVGGGIINAAVLAGMLKTAAFVVPAAGAVGGAAKAYESPV IRNALLRLANTPKGSTAYDRAISTVTQSLTRVAQASQKEAQ >gi|296493493|gb|ADTK01000008.1| GENE 22 18935 - 19306 101 123 aa, chain + ## HITS:1 COG:no KEGG:ECO111_0327 NR:ns ## KEGG: ECO111_0327 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 123 1 123 123 237 99.0 9e-62 MAIIIYPRHGIWYIAWVGISNPIPVDGAVTVRAYPLYVKQEDFSDITATAKHQFRDKYAR EGYQLQKILQLRPTSMHTMAGGYHKTVLYMRQSGGDVLIIGVYAHRTRYDDVLCGYTNPM YLV >gi|296493493|gb|ADTK01000008.1| GENE 23 19951 - 20136 134 61 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|300901906|ref|ZP_07119938.1| ## NR: gi|300901906|ref|ZP_07119938.1| transcriptional regulator, AlpA family [Escherichia coli MS 84-1] # 1 61 1 61 61 110 100.0 3e-23 MRLIKLSEVMLRTGFKKTAIYSWVKTGTFPQPVKIGRSARWSLEEVEAWIQNKLDSRAGN Q >gi|296493493|gb|ADTK01000008.1| GENE 24 20063 - 20374 69 103 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|301305460|ref|ZP_07211553.1| ## NR: gi|301305460|ref|ZP_07211553.1| hypothetical protein HMPREF9347_04082 [Escherichia coli MS 124-1] # 1 103 6 108 108 200 100.0 2e-50 MVIGIILPARVILHWPEPLCVKSMRELHMLCVKLQLKMSCSGGNFFGKQRFGGYQHGYLA FGYRQSQIVTMLNLKQRSLYAPYQIKRSYAQNGLQENGNLLMG >gi|296493493|gb|ADTK01000008.1| GENE 25 20811 - 21098 107 95 aa, chain + ## HITS:1 COG:no KEGG:ECs1588 NR:ns ## KEGG: ECs1588 # Name: not_defined # Def: putative transcriptional activator # Organism: E.coli_O157J # Pathway: not_defined # 1 89 1 89 90 83 52.0 3e-15 MIRDRKAEDLESKGLYRRAAARWAEIMMLADGDKEREHAANRRSECIRKAARQPATTDRF GDLRQAVKRTHSALGMDEEARGYFRRYRDKDCQRQ >gi|296493493|gb|ADTK01000008.1| GENE 26 21770 - 22180 303 136 aa, chain - ## HITS:1 COG:yagP KEGG:ns NR:ns ## COG: yagP COG0583 # Protein_GI_number: 16128267 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 136 1 136 136 248 96.0 3e-66 MPAPVVLILAAGRGERFLASGGNTHKCIGWRQSPEVAPYRWPFEENGRTFDLAIEPQITT NDLRLMLQLALVGEGITIATQETFRPYIESGKLVSLLDDFLPQFPGFYLYFPQRRNIAPK LRALIDHVKEWRQQLA >gi|296493493|gb|ADTK01000008.1| GENE 27 22159 - 23115 531 318 aa, chain - ## HITS:1 COG:yagQ KEGG:ns NR:ns ## COG: yagQ COG1975 # Protein_GI_number: 16128268 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family # Organism: Escherichia coli K12 # 1 318 1 318 318 609 99.0 1e-174 MSYPLFDKDEHWHKPEQAFLTDDHRTILRFAVEALMSGKGAVLVTLVEIRGGAARPLGAQ MVVREDGRYCGFVSGGCVEAAAAFEALEMMGSGRDREIRYGEGSPWFDIVLPCGGGITLT LHKLRSAQPLLAVLNRLEQRKPAGLRYDPQAQSLVCLPTQTRTGWNLNGFEVGFRPCVRL 
MIYGRSLEAQATASLAAATGYDSHIFDLFPASASAQIDTDTAVILLCHDLNRELPVLQAA REAKPFYLGALGSYRTHTLRLQKLHELGWSREETAQIRAPVGIFPKARDAHTLALSVLAE VASVRLHQEEDSCLPPSS >gi|296493493|gb|ADTK01000008.1| GENE 28 23125 - 25323 1811 732 aa, chain - ## HITS:1 COG:ECs0314 KEGG:ns NR:ns ## COG: ECs0314 COG1529 # Protein_GI_number: 15829568 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Escherichia coli O157:H7 # 1 732 1 732 732 1386 98.0 0 MKFDKPAGENPIDQLKVVGRPHDRIDGPLKTTGTARYAYEWHEEAPNAAYGYIVGSAIAK GRLTALDTDAAQKAPGVLAVITASNAGALGKGDKNTARLLGGPTIEHYHQAIALVVAETF EQARAAASLVQAHYRRNKGAYSLADEKQAVNQPPEGTPDKNVGDFDGAFSSAAVKIDATY TTPDQSHMAMEPHASMAVWDGNKLTLWTSNQMIDWCRTDLAKTLKVPVENVRIISPYIGG GFGGKLFLRSDALLAALAARAVKRPVKVMLPRPTIPNNTTHRPATLQHLRIGADQSGKIT AISHESWSGNLPGGTPETAVQQSELLYAGANRHTGLRLATLDLPEGNAMRAPGEAPGLMA LEIAIDELAEKAGIDPVEFRILNDTQVDPADPTRRFSRRQLIECLRTGADKFGWKQHNAT PGQVRDGEWLVGHGVAAGFRNNLLEKSGARVHLEPNGTVTVETDMTDIGTGSYTILAQTA AEMLGVPLEQVAVHLGDSSFPVSAGSGGQWGANTSTSGVYAACVKLREMIASAVGFDPEQ SQFADGKITNGTRSAMLHEATAGGRLIAEESIEFGTLSKEYQQSTFAGHFVEVGVHSATG EVRVRRMLAVCAAGRILNPKTARSQVIGAMTMGMGAALMEELAVDDRLGYFVNHDMAGYE VPVHADIPKQEVIFLDDTDPISSPMKAKGVGELGLCGVSAAIANAVYNATGIRVRDYPIT LDKLLDKLPDVV >gi|296493493|gb|ADTK01000008.1| GENE 29 25320 - 26276 656 318 aa, chain - ## HITS:1 COG:yagS KEGG:ns NR:ns ## COG: yagS COG1319 # Protein_GI_number: 16128270 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs # Organism: Escherichia coli K12 # 1 318 1 318 318 595 99.0 1e-170 MKAFTYERVNTPAEAALSAQRVPGAKFIAGGTNLLDLMKLEIETPTHLIDVNGLGLDKIE VTDAGGLRIGALVRNTDLAAHERVRRDYAVLSRALLAGASGQLRNQATTAGNLLQRTRCP YFYDTNQPCNKRLPGSGCAALEGFSRQHAVVGVSEACIATHPSDMAVAMRLLDAVVETIT PEGKTRSITLADFYHPPGKTPHIETALLPGELIVAVTLPPPLGGKHIYRKVRDRASYAFA QVSVAAIIHPDGSGRVALGGVAHKPWRIEAADAQLSQGAQAVYDTLFASAHPTAENTFKL LLAKRTLASVLAEARAQA >gi|296493493|gb|ADTK01000008.1| GENE 30 26273 - 
26962 427 229 aa, chain - ## HITS:1 COG:ECs0316 KEGG:ns NR:ns ## COG: ECs0316 COG2080 # Protein_GI_number: 15829570 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Escherichia coli O157:H7 # 1 229 1 229 229 432 98.0 1e-121 MSNQGEYPEDNRVGKHEPHDLSLTRRDLIKVSAATAATAVVYPHSTLAASVPAATPAPEI MPLTLKVNGKTEQLEVDTRTTLLDALRENLHLIGTKKGCDHGQCGACTVLVNGRRLNACL TLAVMHQGAEITTIEGLGSPDNLHPMQAAFIKHDGFQCGYCTSGQICSSVAVLKEIQDGI PSHVTVDLVSAPETTADEIRERMSGNICRCGAYANILAAIEDAAGEIKS >gi|296493493|gb|ADTK01000008.1| GENE 31 27380 - 27994 290 204 aa, chain + ## HITS:1 COG:ECs0317 KEGG:ns NR:ns ## COG: ECs0317 COG3477 # Protein_GI_number: 15829571 # Func_class: S Function unknown # Function: Predicted periplasmic/secreted protein # Organism: Escherichia coli O157:H7 # 1 204 1 204 204 405 99.0 1e-113 MNIFEQTPPNRRRYGLAAFIGLIAGVVSAFVKWGAEVPLPPRSPVDMFNAACGPESLIRA AGQIDCSRNFLNPPYIFLRDWLGLTDPNAAVYTFAGHVFNWVGVTHIIFSIVFAVGYCVV AEVFPKIKLWQGLLAGAFAQLFVHMISFPLMGLTPPLFDLPWYENVSEIFGHLVWFWSIE IIRRDLRNRITHEPDPEIPLGSNR >gi|296493493|gb|ADTK01000008.1| GENE 32 28884 - 29594 504 236 aa, chain - ## HITS:1 COG:no KEGG:ECSE_0307 NR:ns ## KEGG: ECSE_0307 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 236 16 251 251 476 100.0 1e-133 MFRRRGVTLTKALLTAVCMLAAPLTQAISVGNLTFSLPSETDFVSKRVVNNNKSARIYRI AISAIDSPGSSELRTRPVDGELLFAPRQLALQAGESEYFKFYYHGPRDNRERYYRVSFRE VPTRNQTRRSPTGGEVSTEPVVVMDTILVVRPRQVQFKWSFDKVTGTVSNTGNTWFKLLI KPGCDSTEEEGDAWYLRPGDVVHQPELRQPGNHYLVYNDKFIKISDSCPAKPPSAD >gi|296493493|gb|ADTK01000008.1| GENE 33 29563 - 31206 959 547 aa, chain - ## HITS:1 COG:no KEGG:EC55989_0294 NR:ns ## KEGG: EC55989_0294 # Name: yagW # Def: putative surface or exported protein # Organism: E.coli_55989 # Pathway: not_defined # 1 547 1 547 547 1056 100.0 0 MRVNLLITMIIFALIWPVTELRAAVSKTTWADAPAREFVFVENNSDDNFFVTPGGALDPR LTGANRWTGLKYTGSGTIYQQSLGYIDNGYNTGLYTNWKFDMWLENSPVSSPLTGLRCIN 
WYAGCNMTTSLILPQTTDASGFYGATVTSGGAKWMHGMLSDAFYQYLQQMPVGSSFTMTI NACQTSVNYDASSGARCKDQASGNWYVRNVTHTKAANLRLINTHSLAEVFINSDGVPTLG EGNADCRTQTIGSLSGLSCKMVNYTLQTNGLSNTSIHIFPAIANSSLASAVGAYDMQFSL NGSSWKPVSNTAYYYTFNEMKSADSIYVFFSSNFFKQMVNLGISDINTKDLFNFRFQNTT SPESGWYEFSTSNTLIIKPRDFSISIISDEYTQTPSREGYVGSGESALDFGYIVTTSGKT AADEVLIKVTGPAQVIGGRSYCVFSSDDGKAKVPFPATLSFITRNGATKTYDAGCDDSWR DMTDALWLTTPWTDISGEVGQMDKTTVKFSIPMDNAISLRTVDDNGWFGEVSASGEIHVQ ATWRNIN >gi|296493493|gb|ADTK01000008.1| GENE 34 31196 - 33721 1739 841 aa, chain - ## HITS:1 COG:no KEGG:B21_00253 NR:ns ## KEGG: B21_00253 # Name: yagX # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 841 1 841 841 1548 99.0 0 MPLRRFSPGLKAQFAFGMVFLFVQPDASAADISAQQIGGVIIPQAFSQALQDGMSVPLYI HLAGSQGRQDDQRIGSAFIWLDDGQLRIRKIQLEESEDNASVSEQTRQQLMALANAPFNE ALTIPLTDNAQLDLSLRQLLLQLVVKREALGTVLRSRSEDIGQSSVNTLSSNLSYNLGVY NNQLRNGGSNTSSYLSLNNVTALREHHVVLDGSLYGIGSGQQDSELYKAMYERDFAGHRF AGGMLDTWNLQSLGPMTAISAGKIYGLSWGNQASSTIFDSSQSATPVIAFLPAAGEVHLT RDGRLLSVQNFTMGNHEVDTRGLPYGIYDVEVEVIVNGRVISKRTQRVNKLFSRGRGVGA PLAWQVWGGSFHMDRWSENGKKTRPAKESWLAGASTSGSLSTLSWAATGYGYDNQAVGET RLTLPLGGAINVNLQNMLASDSSWSSIGSISATLPGGFSSLWVNQEKTRIGNQLRRSDAD NRAIGGTLNLNSLWSKLGTFSISYNDDRRYNSHYYTADYYQNVYSGTFGSLGLRAGIQRY NNGDSNANTGKYIALDLSLPLGNWFSAGMTHQNGYTMANLSARKQFDEGTIRTVGANLSR AISGDTGDDKTLSGGAYAQFDARYASGTLNVNSAADGYINTNLTANGSVGWQGKNIAASG RTDGNAGVIFNTGLEDDGQISAKINGRIFPLNGKRNYLPLSPYGRYEVELQNSKNSLDSY DIVSGRKSHLTLYPGNVAVIEPEVKQMVTVSGRIRAEDGTLLANARINNHIGRTRTDENG EFVMDVDKKYPTIDFRYSGNKTCEVALELNQARGAVWVGDVVCSGLSSWAAVTQTGEENE S >gi|296493493|gb|ADTK01000008.1| GENE 35 33747 - 34415 442 222 aa, chain - ## HITS:1 COG:no KEGG:APECO1_1705 NR:ns ## KEGG: APECO1_1705 # Name: yagY # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 222 17 238 238 418 99.0 1e-116 MKKHLLPLALLFSGISPAQALDVGDISSFMNSDSSTLSKTIKNSTDSGRLINIRLERLSS PLDDGQVISMDKPDELLLTPASLLLPAQASEVIRFFYKGPADEKERYYRIVWFDQALSDA QRDNANRSAVATASARIGTILVVAPRQANYHFQYANGSLTNTGNATLRILAYGPCLKAAN 
GKECKENYYLMPGKSRRFTRVDTADNKGRVALWQGDKFIPVK >gi|296493493|gb|ADTK01000008.1| GENE 36 34473 - 35060 621 195 aa, chain - ## HITS:1 COG:no KEGG:ECIAI1_0294 NR:ns ## KEGG: ECIAI1_0294 # Name: ecpA # Def: common pilus ECP # Organism: E.coli_IAI1 # Pathway: not_defined # 1 195 1 195 195 290 99.0 2e-77 MKKKVLAIALVTVFTGMGVAQAADVTAQAVATWSATAKKDTTSKLVVTPLGSLAFQYAEG IKGFNSQKGLFDVAIEGDSTATAFKLTSRLITNTLTQLDTSGSTLNVGVDYNGAAVEKTG DTVMIDTANGVLGGNLSPLPNGYNASNRTTAQDGFTFTIISGTTNGTTAVTDYSTLPEGI WSGDVSVQFDATWTS >gi|296493493|gb|ADTK01000008.1| GENE 37 35135 - 35677 158 180 aa, chain - ## HITS:1 COG:ykgK KEGG:ns NR:ns ## COG: ykgK COG2771 # Protein_GI_number: 16128279 # Func_class: K Transcription # Function: DNA-binding HTH domain-containing proteins # Organism: Escherichia coli K12 # 1 180 17 196 196 332 99.0 3e-91 MECQNRSDKYIWSPHDAYFYKGLSELIVDIDRLIYLSLEKIRKDFVFINLSTDSLSEFIN RDNEWLSAVKGKQVVLIAARKSEALANYWYYNSNIRGVVYAGLSRDIRKELAYVINGRFL RKDIKKDKITDREMEIIRMTAQGMQPKSIARIENCSVKTVYTHRRNAEAKLYSKIYKLVQ >gi|296493493|gb|ADTK01000008.1| GENE 38 36762 - 36905 241 47 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|26246304|ref|NP_752343.1| 50S ribosomal protein L36 [Escherichia coli CFT073] # 1 47 1 47 47 97 97 1e-19 MMKVLNSLRTAKERHPDCQIVKRKGRLYVICKSNPRFKAVQGRKKKR >gi|296493493|gb|ADTK01000008.1| GENE 39 36902 - 37168 460 88 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|110640565|ref|YP_668293.1| 50S ribosomal protein L31 type B [Escherichia coli 536] # 1 88 1 88 88 181 100 5e-45 MMKPNIHPEYRTVVFHDTSVDEYFKIGSTIKTDREIELDGVTYPYVTIDVSSKSHPFYTG KLRTVASEGNVARFTQRFGRFVSTKKGA >gi|296493493|gb|ADTK01000008.1| GENE 40 37529 - 37630 61 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKENKVQQISHKLINIVVFVAIVEYAYLFLHFY >gi|296493493|gb|ADTK01000008.1| GENE 41 38745 - 43001 2686 1418 aa, chain + ## HITS:1 COG:no KEGG:EcolC_3322 NR:ns ## KEGG: EcolC_3322 # Name: not_defined # Def: Ig domain-containing protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 1418 1 1418 1418 2503 99.0 0 
MSHYKTGHKQPRFRYSVLARCVAWANISVQVLFPLAVTFTPVMAARAQHAVQPRLSMGNT TVTADNNVEKNVASFAANAGTFLSSQPDSDATRNFITGMATAKANQEIQEWLGKYGTARV KLNVDKDFSLKDSSLEMLYPIYDTPTNMLFTQGAIHRTDDRTQSNIGFGWRHFSGNDWMA GVNTFIDHDLSRSHTRIGVGAEYWRDYLKLSANGYIRASGWKKSPDVEDYQERPANGWDI RAEGYLPAWPQLGASLMYEQYYGDEVGLFGKDKRQKDPHAISTEVTYTPVPLLTLSAGHK QGKSGENDTRFGLEVNYRIGEPLAKQLDTDSIRERRVLAGSRYDLVERNNNIVLEYRKSE VIRIALPERIEGKGGQTLSLGLVVSKATHGLKNVQWEAPSLLAEGGKITGQGSQWQVTLP AYRPGKDNYYAISAVAYDNKGNASKRVQTEVVITGAGMSADRTALTLDGQSRIQMLANGN EQRPLVLSLRDAEGQPVTGMKDQIKTELAFKPAGNIVTRSLKATKSQAKPTLGEFTETEA GVYQSVFTTGTQSGEATITVSVDGMSKTVTAELRATMMDVANSTLSANEPSGDVVADGQQ AYTLTLTAVDSEGNPVTGEASRLRFVPQDTNGVTVGAISEIKPGVYSATVSSTRAGNVVV HAFSEQYQLGTLQQTLKFVAGPLDAAHSSITLNPDKPVVGGTVTAIWTAKDAYDNPVTSL TPEAPSLAGAAAVGSTASGWTNNDDGTWTAQITLGSTAGELEVMPKLNGQDAAANAAKVT VVADALSSNQSKVSVAEDHVKAGESTTVTLIAKDAHGNTISGLSLSASLTGTASEGATVS SWTEKGDGSYVATLTTGGKTGELRVMPLFNGQPAATEAAQLTVIAGEMSSANSTLVADNK APTVKMTTELTFTVKDAYGNPVTGLKPDAPVFSGAASTGSERPSAGNWTEKGNGVYVATL TLGSAAGQLSVMPRVNGQNAVAQPLVLNVAGDASKAEIRDMTVKVNNQLANGQSANQITL TVVDSYGNPLQGQEVTLTLPQGVTSKTGNTVTTNAAGKVDIELMSTVAGEHSITASVNNA QKTVTVKFKADFSTGQATLEVDGSTPKVANDNDAFTLTATVKDQYGNLLPGAVVVFNLPR GVKPLADGNIMVNADKEGKAELKVVSVTAGTYEITASAGNDQPSNAQSVTFVADKTTATI SSIEVIGNRAVADGKTKQTYKVTVTDANNNLLKDSDVTLTASSENLVLDPKGTAKTNEQG QAVFTGSTTIAATYTLTAKVEQANGQVSTKTAESKFVADDKNAVLAASPERVDSLVADGK TTATMTVTLMAGVNPVGGSMWVDIEAPKGVTEKDYQFLPSKADHFSGGKITRTFSTSKPG VYTFTFNALTYGGYEMTPVKVTINAVAAETENGEEEMP >gi|296493493|gb|ADTK01000008.1| GENE 42 43141 - 43488 252 115 aa, chain - ## HITS:1 COG:ECs0337 KEGG:ns NR:ns ## COG: ECs0337 COG2207 # Protein_GI_number: 15829591 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 115 180 296 296 216 87.0 6e-57 MKNFKTDIYFVSTFEPSTKSVDLLTVETFAGTVCEYADMPKEWTTTRGLYASFRYEGNWE NYPDWVRNIYLIELPARGLARVNGSDIERFYYNEDFVEKDGNDVVCEIFIPVRPV Prediction of potential genes in microbial genomes Time: Mon May 16 14:59:59 2011 Seq name: 
gi|296493492|gb|ADTK01000009.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont22.1, whole genome shotgun sequence Length of sequence - 548 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 66 - 546 123 ## ASA_pAsa206 mobilization protein Predicted protein(s) >gi|296493492|gb|ADTK01000009.1| GENE 1 66 - 546 123 160 aa, chain + ## HITS:1 COG:no KEGG:ASA_pAsa206 NR:ns ## KEGG: ASA_pAsa206 # Name: mobB # Def: mobilization protein # Organism: A.salmonicida # Pathway: not_defined # 1 160 1 153 154 75 29.0 6e-13 MNSLLTLAKDLEQKSKAQQQSTGEMLKAAFSEHEKSVRAELSESEKRISAAILDHDRKLS AAMSQRTKGMLRMVSQTWLTIVLVSTLLTASGASILWWQGQQILDNYTTIREQKRTQAML SERNSGVQLSTCGEQRRRCVRVNPEAGRFGEDSSWMILAG Prediction of potential genes in microbial genomes Time: Mon May 16 15:00:02 2011 Seq name: gi|296493491|gb|ADTK01000010.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont23.1, whole genome shotgun sequence Length of sequence - 1157 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 106 - 165 3.9 1 1 Tu 1 . + CDS 200 - 751 278 ## Sputw3181_0844 hypothetical protein - Term 560 - 602 2.0 2 2 Tu 1 . 
- CDS 843 - 1070 130 ## EpC_pEp050040 mobilization protein A Predicted protein(s) >gi|296493491|gb|ADTK01000010.1| GENE 1 200 - 751 278 183 aa, chain + ## HITS:1 COG:no KEGG:Sputw3181_0844 NR:ns ## KEGG: Sputw3181_0844 # Name: not_defined # Def: hypothetical protein # Organism: Shewanella_W3-18-1 # Pathway: not_defined # 4 181 1 163 165 95 35.0 8e-19 MKDMKAVVVFTGKDLKIMRTEGGSGYWHARTDRLNDADYLIAVRNRRETWAVKDLEHGTA FLIAKITGCFKSPDYDDRNVITFDENAEIHTPKAWKMLTDGQRYPVAYLSAQEAFLRIGV TPEQLEWKKFHPSSPSVPNTVIPGLAEEKTEKLSLNEAIERAKKDISNATGIDSSAITIS IKI >gi|296493491|gb|ADTK01000010.1| GENE 2 843 - 1070 130 75 aa, chain - ## HITS:1 COG:no KEGG:EpC_pEp050040 NR:ns ## KEGG: EpC_pEp050040 # Name: mobA # Def: mobilization protein A # Organism: E.pyrifoliae # Pathway: not_defined # 1 75 430 504 504 108 88.0 7e-23 MADDVWSYATGERGAERARHGLEQAGAEFERAAAPVVTRLNVIEAHREQERAAQHQKALE LERSQRQRTYNGPSL Prediction of potential genes in microbial genomes Time: Mon May 16 15:00:07 2011 Seq name: gi|296493490|gb|ADTK01000011.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont27.1, whole genome shotgun sequence Length of sequence - 827 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 776 297 ## EcHS_A1637 L-shaped tail fiber protein Predicted protein(s) >gi|296493490|gb|ADTK01000011.1| GENE 1 2 - 776 297 258 aa, chain - ## HITS:1 COG:no KEGG:EcHS_A1637 NR:ns ## KEGG: EcHS_A1637 # Name: not_defined # Def: L-shaped tail fiber protein # Organism: E.coli_HS # Pathway: not_defined # 1 258 378 634 1258 449 91.0 1e-125 MVTNAGNYGSPYGNIDFIEISARGLPSLLSADNVSRHLSIRRLGSTGLTDNNQMRYGLVK GDGFIEVWAFQGAFINDAKVAVLAQTTLNTELYIPDGFVKQTAAPSGYIEGNVVRIYDQV NKPTKADLGLSNAMLTGAFGLGGSGISTNGKMSDVEILKALRDKGGHFWRGDKPTGSTAT IYSHGSGIFSRCGDTWSAINIDYSTAKIKIYAGNDARLNNGTFSVNELYGSANKPSKSDV GLGNVTNDAQVKKTGDTM Prediction of potential genes in microbial genomes Time: Mon May 16 15:00:25 2011 Seq name: gi|296493489|gb|ADTK01000012.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont28.1, whole genome shotgun sequence Length of sequence - 47982 bp Number of predicted genes - 27, with homology - 27 Number of transcription units - 16, operones - 5 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 163 - 238 87.1 # Asn GTT 0 0 - Term 645 - 681 4.2 1 1 Tu 1 . - CDS 694 - 1410 687 ## COG0217 Uncharacterized conserved protein - Prom 1470 - 1529 9.9 - Term 1653 - 1697 10.5 2 2 Tu 1 . - CDS 1753 - 3207 1129 ## COG0775 Nucleoside phosphorylase - Prom 3238 - 3297 5.4 - Term 3267 - 3302 6.7 3 3 Tu 1 . - CDS 3309 - 4625 1016 ## COG0477 Permeases of the major facilitator superfamily - Prom 4651 - 4710 4.6 4 4 Tu 1 . + CDS 4940 - 5992 303 ## COG0859 ADP-heptose:LPS heptosyltransferase 5 5 Op 1 . - CDS 6248 - 8077 619 ## APECO1_1065 putative autotransporter 6 5 Op 2 . - CDS 8113 - 8283 58 ## ECIAI39_1068 hypothetical protein 7 5 Op 3 . - CDS 8210 - 8704 289 ## c2437 hypothetical protein - Prom 8741 - 8800 4.7 - Term 9568 - 9600 3.2 8 6 Tu 1 . 
- CDS 9612 - 11633 1599 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 11660 - 11719 5.2 9 7 Op 1 2/0.600 - CDS 11764 - 13341 1137 ## COG1021 Peptide arylation enzymes 10 7 Op 2 2/0.600 - CDS 13345 - 14133 217 ## COG3208 Predicted thioesterase involved in non-ribosomal peptide biosynthesis 11 7 Op 3 2/0.600 - CDS 14145 - 15245 519 ## COG4693 Oxidoreductase (NAD-binding), involved in siderophore biosynthesis 12 7 Op 4 7/0.000 - CDS 15242 - 24733 5858 ## COG3321 Polyketide synthase modules and related proteins 13 7 Op 5 2/0.600 - CDS 24821 - 30928 3556 ## COG1020 Non-ribosomal peptide synthetase modules and related proteins - Prom 30970 - 31029 5.2 - Term 31044 - 31086 12.5 14 8 Tu 1 . - CDS 31119 - 32078 547 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 32153 - 32212 3.5 + Prom 32112 - 32171 3.0 15 9 Op 1 35/0.000 + CDS 32335 - 34047 194 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 16 9 Op 2 3/0.600 + CDS 34034 - 35836 217 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 17 9 Op 3 1/0.800 + CDS 35829 - 37109 967 ## COG0477 Permeases of the major facilitator superfamily 18 9 Op 4 . + CDS 37137 - 38441 1171 ## COG0147 Anthranilate/para-aminobenzoate synthases component I - Term 38569 - 38599 4.3 19 10 Op 1 . - CDS 38635 - 39897 366 ## PROTEIN SUPPORTED gi|239523764|gb|EEQ63630.1| 30S ribosomal protein S15 - Prom 40142 - 40201 5.1 - TRNA 40059 - 40134 87.1 # Asn GTT 0 0 - Term 40089 - 40127 -0.8 20 10 Op 2 . - CDS 40235 - 41032 371 ## COG3228 Uncharacterized protein conserved in bacteria - Prom 41156 - 41215 80.3 + TRNA 41126 - 41215 75.5 # Ser CGA 0 0 - Term 41758 - 41795 4.1 21 11 Tu 1 . 
- CDS 41885 - 42535 440 ## COG3443 Predicted periplasmic or secreted protein - Prom 42662 - 42721 7.1 22 12 Op 1 13/0.000 - CDS 42792 - 43427 459 ## COG2717 Predicted membrane protein 23 12 Op 2 3/0.600 - CDS 43428 - 44432 832 ## COG2041 Sulfite oxidase and related enzymes 24 13 Tu 1 . - CDS 44541 - 44954 289 ## COG2351 Transthyretin-like protein - Prom 44979 - 45038 5.3 + Prom 44902 - 44961 3.9 25 14 Tu 1 . + CDS 45039 - 45758 239 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Prom 46100 - 46159 3.0 26 15 Tu 1 . + CDS 46214 - 47116 279 ## COG0642 Signal transduction histidine kinase + Term 47156 - 47201 7.2 - Term 47142 - 47188 9.2 27 16 Tu 1 . - CDS 47224 - 47919 478 ## COG0693 Putative intracellular protease/amidase Predicted protein(s) >gi|296493489|gb|ADTK01000012.1| GENE 1 694 - 1410 687 238 aa, chain - ## HITS:1 COG:yeeN KEGG:ns NR:ns ## COG: yeeN COG0217 # Protein_GI_number: 16129927 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 238 1 238 238 441 100.0 1e-124 MGRKWANIVAKKTAKDGATSKIYAKFGVEIYAAAKQGEPDPELNTSLKFVIERAKQAQVP KHVIDKAIDKAKGGGDETFVQGRYEGFGPNGSMIIAETLTSNVNRTIANVRTIFNKKGGN IGAAGSVSYMFDNTGVIVFKGTDPDHIFEILLEAEVDVRDVTEEEGNIVIYTEPTDLHKG IAALKAAGITEFSTTELEMIAQSEVELSPEDLEIFEGLVDALEDDDDVQKVYHNVANL >gi|296493489|gb|ADTK01000012.1| GENE 2 1753 - 3207 1129 484 aa, chain - ## HITS:1 COG:ECs2779 KEGG:ns NR:ns ## COG: ECs2779 COG0775 # Protein_GI_number: 15832033 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside phosphorylase # Organism: Escherichia coli O157:H7 # 1 484 1 484 484 999 100.0 0 MNNKGSGLTPAQALDKLDALYEQSVVALRNAIGNYITSGELPDENARKQGLFVYPSLTVT WDGSTTNPPKTRAFGRFTHAGSYTTTITRPTLFRSYLNEQLTLLYQDYGAHISVQPSQHE IPYPYVIDGSELTLDRSMSAGLTRYFPTTELAQIGDETADGIYHPTEFSPLSHFDARRVD FSLARLRHYTGTPVEHFQPFVLFTNYTRYVDEFVRWGCSQILDPDSPYIALSCAGGNWIT AETEAPEEAISDLAWKKHQMPAWHLITADGQGITLVNIGVGPSNAKTICDHLAVLRPDVW 
LMIGHCGGLRESQAIGDYVLAHAYLRDDHVLDAVLPPDIPIPSIAEVQRALYDATKLVSG RPGEEVKQRLRTGTVVTTDDRNWELRYSASALRFNLSRAVAIDMESATIAAQGYRFRVPY GTLLCVSDKPLHGEIKLPGQANRFYEGAISEHLQIGIRAIDLLRAEGDRLHSRKLRTFNE PPFR >gi|296493489|gb|ADTK01000012.1| GENE 3 3309 - 4625 1016 438 aa, chain - ## HITS:1 COG:ECs2778 KEGG:ns NR:ns ## COG: ECs2778 COG0477 # Protein_GI_number: 15832032 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 438 1 438 438 806 99.0 0 MDSTLISTRPDEGTLSLSRARRAALGSFAGAVVDWYDFLLYGITAALVFNREFFPQVSPA MGTLAAFATFGVGFLFRPLGGVIFGHFGDRLGRKRMLMLTVWMMGIATALIGILPSFSTI GWWAPILLVTLRAIQGFAVGGEWGGAALLSVESAPKNKKAFYSSGVQVGYGVGLLLSTGL VSLISMMTTDEQFLSWGWRIPFLFSIVLVLGALWVRNGMEESAEFEQQQHNQAAAKKRIP VIEALLRHPGAFLKIIALRLCELLTMYIVTAFALNYSTQNMGLPRELFLNIGLLVGGLSC LTIPCFAWLADRFGRRRVYITGALIGTLSAFPFFMALEAQSIFWIVFFSIMLANIAHDMV VCVQQPMFTEMFGASYRYSGAGVGYQVASVVGGGFTPFIAAALITYFAGNWHSVAIYLLA GCLISAMTALLMKDNQRS >gi|296493489|gb|ADTK01000012.1| GENE 4 4940 - 5992 303 350 aa, chain + ## HITS:1 COG:ECs2777 KEGG:ns NR:ns ## COG: ECs2777 COG0859 # Protein_GI_number: 15832031 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Escherichia coli O157:H7 # 1 350 1 350 350 644 95.0 0 MFLVSLLRRIAFSYYDYKAYNFNIEKTDFVVIHIPDQIGDAMAIFPVIRALELHKIKHLL IVTSTINLEVFNALKLEQTKLTLVTMTMQDHATLKEIKYLAKNITLQYGTPDLCIEAMRK KNLKTMIFISQLKAKTNFQVVGLTMKCYSPLCKNASRMDQNLRAPVPMTWAFMMREAGFP AVRSIYELPLSDDVLDEVREEMRSLGSYIALNLEGSSQERIFSFSIAENIIAKIQSETDI PIVIVYGPKGEDKARALVDCYNNVYRLSLSPSIKHSAAIIKDAYIAITPDTSILHMASAY NTPVVAIYADYKTRWPAMADVLESVIVGQKIDNISLDEFAKALKSVLARI >gi|296493489|gb|ADTK01000012.1| GENE 5 6248 - 8077 619 609 aa, chain - ## HITS:1 COG:no KEGG:APECO1_1065 NR:ns ## KEGG: APECO1_1065 # Name: not_defined # Def: putative autotransporter # Organism: E.coli_APEC # Pathway: 
not_defined # 1 605 1 605 652 907 98.0 0 MKANFTLSDGDKAVTDADGKAKVTLKGTKAGAHTVTASMVGGKSEQLVVNFTADTLTAQV NLNVTEDNFIANNIGMTRLQATVTDGNGNPVEGIKVNFRGTSVTLSSTSVETDDQVFAEI LVTSTEVGLKTVSASLADKPTEVISRLLNAKVDVNSATITSQEIPEGQVMVAQDIAVKAH VNDQFGNPVTHQPATFSAAPSSQMIISQNTVSTNTQGVAEVTMTPERNGSYTVKASLANG ASLEKQLEAIDEKLTLTSSPLIGVNAPKGATLTAMLTSANGTPVEGQVINFSVTLEGATL SGGKVRTNSSGQAPVVLTSNKVGTYTVTASFHNGVTIQTQTTVKVTGNPSTAHVASFIAE PSTIAATNSDLSTLKATVEDGSGNLIEGLTVYFALKNGSTTLTSLTAVTDQNGIATTSVK GAITGSVTVSTVTSAGGMQTVDISLVAGPADASQSILKNNQSSLKGDFTDSAELHLVLHD ISGNPIKVSEGMEFVQSGTNVPYMKISAIDYTQNINGDYKATVTGGGEGIATLIPVLNGV HQAGLSTTIEFISAETRPMTGTVSVNGANLPTASFPSQGFTGAYYQLNNDNFAPGKTAAD YSFQTRPLG >gi|296493489|gb|ADTK01000012.1| GENE 6 8113 - 8283 58 56 aa, chain - ## HITS:1 COG:no KEGG:ECIAI39_1068 NR:ns ## KEGG: ECIAI39_1068 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_IAI39 # Pathway: not_defined # 1 56 1 56 56 76 100.0 3e-13 MTATLENGDSMQQTVNYVPNVTNAEITLAASKDPVIADNNDLTTLTAPSLIQRAMR >gi|296493489|gb|ADTK01000012.1| GENE 7 8210 - 8704 289 164 aa, chain - ## HITS:1 COG:no KEGG:c2437 NR:ns ## KEGG: c2437 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_CFT073 # Pathway: not_defined # 1 164 105 268 268 344 100.0 8e-94 MLDRDGFIYGQNLRLCTEARYICPVQQIIDTPLLRKLNQFRTFVRNVRPGDELDVPAQVS EKNLTPPPGNSSGNLEQQIASTSQLIGSLLAEDMNSEQAANIARGWASSQASGVMTDWLS RFGTARITLGVDEDFSLKNSRDGNPRKWRFDATNSELCAERHEC >gi|296493489|gb|ADTK01000012.1| GENE 8 9612 - 11633 1599 673 aa, chain - ## HITS:1 COG:YPO1906 KEGG:ns NR:ns ## COG: YPO1906 COG1629 # Protein_GI_number: 16122154 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Yersinia pestis # 1 673 1 673 673 1277 99.0 0 MKMTRLYPLALGGLLLPAIANAQTSQQDESTLVVTASKQSSRSASANNVSSTVVSAPELS DAGVTASDKLPRVLPGLNIENSGNMLFSTISLRGVSSAQDFYNPAVTLYVDGVPQLSTNT IQALTDVQSVELLRGPQGTLYGKSAQGGIINIVTQQPDSTPRGYIEGGVSSRDSYRSKFN LSGPIQDGLLYGSVTLLRQVDDGDMINPATGSDDLGGTRASIGNVKLRLAPDDQPWEMGF 
AASRECTRATQDAYVGWNDIKGRKLSISDGSPDPYMRRCTDSQTLSGKYTTDDWVFNLIS AWQQQHYSRTFPSGSLIVNMPQRWNQDVQELRAATLGDARTVDMVFGLYRQNTREKLNSA YDMPTMPYLSSTGYTTAETLAAYSDLTWHLTDRFDIGGGVRFSHDKSSTQYHGSMLGNPF GDQGKSNDDQVLGQLSAGYMLTDDWRVYTRVAQGYKPSGYNIVPTAGLDAKPFVAEKSIN YELGTRYETADVTLQAATFYTHTKDMQLYSGPVGMQTLSNAGKADATGVELEAKWRFAPG WSWDINGNVIRSEFTNDSELYHGNRVPFVPRYGAGSSVNGVIDTRYGALMPRLAVNLVGP HYFDGDNQLRQGTYATLDSSLGWQATERMNISVYVDNLFDRRYRTYGYMNGSSAVAQVNM GRTVGINTRIDFF >gi|296493489|gb|ADTK01000012.1| GENE 9 11764 - 13341 1137 525 aa, chain - ## HITS:1 COG:YPO1907 KEGG:ns NR:ns ## COG: YPO1907 COG1021 # Protein_GI_number: 16122155 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Peptide arylation enzymes # Organism: Yersinia pestis # 1 525 1 525 525 1036 100.0 0 MNSSFESLIEQYPLPIAEQLRHWAARYASRIAVVDAKGSLTYSALDAQVDELAAGLSSLG LRSGEHVIVQLPNDNAFVTLLFALLRLGVIPVLAMPSQRALDIDALIELAQPVAYVIHGE NHAELARQMAHKHACLRHVLVAGETVSDDFTPLFSLHGERQAWPQPDVSATALLLLSGGT TGTPKLIPRRHADYSYNFSASAELCGISQQSVYLAVLPVAHNFPLACPGILGTLACGGKV VLTDSASCDEVMPLIAQERVTHVALVPALAQLWVQAREWEDSDLSSLRVIQAGGARLDPT LAEQVIATFDCTLQQVFGMAEGLLCFTRLDDPHATILHSQGRPLSPLDEIRIVDQDENDV APGETGQLLTRGPYTISGYYRAPAHNAQAFTAQGFYRTGDNVRLDEVGNLHVEGRIKEQI NRAGEKIAAAEVESALLRLAEVQDCAVVAAPDTLLGERICAFIIAQQVPTDYQQLRQQLT RMGLSAWKIPDQIEFLDHWPLTAVGKIDKKRLTALAVDRYRHSAQ >gi|296493489|gb|ADTK01000012.1| GENE 10 13345 - 14133 217 262 aa, chain - ## HITS:1 COG:YPO1908 KEGG:ns NR:ns ## COG: YPO1908 COG3208 # Protein_GI_number: 16122156 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Predicted thioesterase involved in non-ribosomal peptide biosynthesis # Organism: Yersinia pestis # 1 262 6 267 267 520 99.0 1e-147 MCIPLWPARNGNTAHLVMCPFAGGSSSAFRHWQAEQLADCALSLVTWPGRDRLRHLEPLR SITQLAALLANELEASVSPDTPLLLAGHSMGAQVAFETCRLLEQRGLAPQGLIISGCHAP HLHSERQLSHRDDADFIAELIDIGGCSPELRENQELMSLFLPLLRADFYATESYHYDSPD VCPPLRTPALLLCGSHDREASWQQVDAWRQWLSHVTGPVVIDGDHFYPIQQARSFFTQIV RHFPHAFSAMTALQKQPSTSER >gi|296493489|gb|ADTK01000012.1| 
GENE 11 14145 - 15245 519 366 aa, chain - ## HITS:1 COG:YPO1909 KEGG:ns NR:ns ## COG: YPO1909 COG4693 # Protein_GI_number: 16122157 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Oxidoreductase (NAD-binding), involved in siderophore biosynthesis # Organism: Yersinia pestis # 1 366 1 366 366 702 99.0 0 MMPSASPKQRVLIVGAKFGEMYLNAFMQPPEGLELVGLLAQGSARSRELAHAFGIPLYTS PEQITRMPDIACIVVRSTVAGGTGTQLARHFLTRGVHVIQEHPLHPDDISSLQTLAQEQG CCYWVNTFYPHTRAGRTWLRDAQQLRRCLAKTPPVVHATTSRQLLYSTLDLLLLALGVDA AAVECDVVGSFSDFHCLRLFWPEGEACLLLQRYLDPDDPDMHSLIMHRLLLGWPEGHLSL EASYGPVIWSSNLFVADHQENAHSLYRRPEILRDLPGLTRSAAPLSWRDCCETVGPEGVS WLLHQLRSHLAGEHPPAACQSVHQIALSRLWQQILRKTGNAEIRRLTPPHHDRLAGFYND DDKEAL >gi|296493489|gb|ADTK01000012.1| GENE 12 15242 - 24733 5858 3163 aa, chain - ## HITS:1 COG:YPO1910_1 KEGG:ns NR:ns ## COG: YPO1910_1 COG3321 # Protein_GI_number: 16122158 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Polyketide synthase modules and related proteins # Organism: Yersinia pestis # 1 1911 1 1911 1911 3732 99.0 0 MDNLRFSSAPTADSIDASIAQHYPDCEPVAVIGYACHFPESPDGETFWQNLLEGRECSRR FTREELLAVGLDAAIIDDPHYVNIGTVLDNADCFDATLFGYSRQEAESMDPQQRLFLQAV WHALEHAGYAPGAVPHKTGVFASSRMSTYPGREALNVTEVAQVKGLQSLMGNDKDYIATR AAYKLNLHGPALSVQTACSSSLVAVHLACESLRAGESDMAVAGGVALSFPQQAGYRYQPG MIFSPDGHCRPFDASAEGTWAGNGLGCVVLRRLRDALLSGDPIISVILSSAVNNDGNRKV GYTAPSVAGQQAVIEEALMLAAIDDRQVGYIETHGTGTPLGDAIEIEALRNVYAPRPQDQ RCALGSVKSNMGHLDTAAGIAGLLKTVLAVSRGQIPPLLNFHTPNPALKLEESPFTIPVS AQAWQDEMRYAGVSSFGIGGTNCHMIVASLPDALNARLPNTDSGRKSTALLLSAASDSAL RRLATDYARALRENADASSLAFTALHARRLDLPFRLAAPLNRETAEALSAWASEKSGALV YSGHGASGKQVWLFTGQGSHWRTMGQTMYQHSTAFADTLDRCFSACSEMLTPSLREAMFN PDSAQLDNMAWAQPAIVAFEIAMAAHWRAEGLKPDFAIGHSVGEFAAAVVCGHYTIEQVM PLVCRRGALMQQCASGAMVAVFADEDTLMPLARQFELDLAANNGTQHTVFSGPEARLAVF CATLSQHDINYRRLSVTGAAHSALLEPILDRFQDACAGLHAEPGQIPIISTLTADVIDES TLNQADYWRRHMRQPVRFIQSIQVAHQLGTRVFLEMGPDAQLVACGQREYRDNAYWIASA 
RRNKEASDVLNQALLQLYAAGVALPWADLLAGDGQRIAAPCYPFDTERYWKERVSPACEP ADAALSAGLEVASRAATALDLPRLEALKQCATRLHAIYVDQLVQRCTGDAIENGVDAMTI MRRGRLLPRYQQLLQRLLNNCVVDGDYRCTDGRYVRARPIEHQQRESLLTELAGYCEGFQ AIPDTIARAGDRLYEMMSGAEEPVAIIFPQSASDGVEVLYQEFSFGRYFNQIAAGVLRGI VQTRQPRQPLRILEVGGGTGGTTAWLLPELNGVPALEYHFTDISALFTRRAQQKFADYDF VKYSELDLEKEAQSQGFQAQSYDLIVAANVIHATRHIGRTLDNLRPLLKPGGRLLMREIT QPMRLFDFVFGPLVLPLQDLDAREGELFLTTAQWQQQCRHAGFSKVAWLPQDGSPTAGMS EHIILATLPGQAVSAVTFTAPSEPVLGQALTDNGDYLADWSDCAGQPEQFNARWQEAWRL LSQRHGDALPVEPPPVAAPEWLGKVRLSWQNKAFSRGQMRVEARHPAGEWLPLSPAAPLP APQTHYQWRWTPLNVASIDHPLTFSFSAGTLARSDELAQYGIIHDPHASSRLMIVEESED TLALAEKVIAALTASAAGLIVVTRRAWRVEENEALSASHHALWALLRVAANEQPERLLAA IDLAENTSWETLHQGLSAVSLSQRWLAARGDTLWLPSLSPNTGCAAELPANVFTGDSRWH LVTGAFGGLGRLAVNWLREKGARRIALLAPRVDESWLRDVEGGQTRVCRCDVGDAGQLAT VLDDLAANGGIAGAIHAAGVLADAPLQELDDHQLAAVFAVKAQAASQLLQTLRNHDGRYL ILYSSAAATLGAPGQSAHALACGYLDGLAQQFSTLDAPKTLSVAWGAWGESGRAATPEML ATLASRGMGALSDAEGCWHLEQAVMRGAPWRLAMRVFTDKMPPLQQALFNISATEKAATP VIPPADDNAFNGSLSDETAVMAWLKKRIAVQLRLSDPASLHPNQDLLQLGMDSLLFLELS SDIQHYLGVRINAERAWQDLSPHGLTQLICSKPEATPAASQPEVLRHDADERYAPFPLTP IQHAYWLGRTHLIGYGGVACHVLFEWDKRHDEFDLAILEKAWNQLIARHDMLRMVVDADG QQRILATTPEYHIPRDDLRALSPEEQRIALEKRRHELSYRVLPADQWPLFELVVSEIDDC HYRLHMNLDLLQFDVQSFKVMMDDLAQVWRGETLAPLAITFRDYVMAEQARRQTSAWHDA WDYWQEKLPQLPLAPELPVVETPPETPHFTTFKSTIGKTEWQAVKQRWQQQGVTPSAALL TLFAATLERWSRTTTFTLNLTFFNRQPIHPQINQLIGDFTSVTLVDFNFSAPVTLQEQMQ QTQQRLWQNMAHSEMNGVEVIRELGRLRGSQRQPLMPVVFTSMLGMTLEGMTIDQAMSHL FGEPCYVFTQTPQVWLDHQVMESNGELMFSWYCMDNVLEPGAAEAMFNDYCAILQAVIAA PESLKTLASGIARHIPRRRWPLNAQADYDLRDIEQATLEYPGIRQARAEITEQGALTLDI VMADDPSPSAAMPDEHELTQLALPLPEQAQLDELEATWRWLEARALQGIAATLNRHGLFT TPEIAHRFSAIVQALSAQASHQRLLRQWLQCLTEREWLIREGESWRCRIPLSEIPEPQEA CPQSQWSQALAQYLETCIARHDALFSGQCSPLELLFNEQHRVTDALYRDNPASACLNRYT AQIAALCSAERILEVGAGTAATTAPVLKATRNTRQSYHFTDVSAQFLNDARARFHDESQV SYALFDINQPLDFTAHPEAGYDLIVAVNVLHDASHVVQTLRRLKLLLKAGGRLLIVEATE RNSVFQLASVGFIEGLSGYRDFRRRDEKPMLTRSAWQEVLVQAGFANEQAWPAQESSPLR 
QHLLVARSPGVNRPDKKAVSRYLQQRFGTGLPILQIRQREALFTPLHAPSDAPTEPAKPT PVAGGNPALEKQVAELWQSLLSRPVARHHDFFELGGDSLMATRMVAQLNRRGIARANLQD LFSHSTLSDFCAHLQAATSGEDNPIPLCQGDGEETLFVFHASDGDISAWLPLASALNRRV FGLQAKSPQRFATLDQMIDEYVGCIRRQQPHGPYVLAGWSYGAFLAAGAAQRLYAKGEQV RMVLIDPVCRQDFCCENRAALLRLLAEGQTPLALPEHFDQQTPDSQLADFISLAKTAGMV SQNLTLQAAETWLDNIAHLLRLLTEHTPGESVPVPCLMVYAAGRPSRWTPAETEWQGWIN NADDAVIEASHWQIMMEAPHVQACAQHITRWLCATSTQPENTL >gi|296493489|gb|ADTK01000012.1| GENE 13 24821 - 30928 3556 2035 aa, chain - ## HITS:1 COG:YPO1911_2 KEGG:ns NR:ns ## COG: YPO1911_2 COG1020 # Protein_GI_number: 16122159 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Non-ribosomal peptide synthetase modules and related proteins # Organism: Yersinia pestis # 111 1491 1 1381 1381 2726 99.0 0 MISGAPSQDSLLPDNRHAADYQQLRERLIQELNLTPQQLHEESNLIQAGLDSIRLMRWLH WFRKNGYRLTLRELYAAPTLAAWNQLMLSRSPENAEEETPPDESSWPNMTESTPFPLTPV QHAYLTGRMPGQTLGGVGCHLYQEFEGHCLTASQLEQAITTLLQRHPMLHIAFRPDGQQV WLPQPYWNGVTVHDLRHNDAESRQAYLDALRQRLSHRLLRVEIGETFDFQLTLLPDNRHR LHVNIDLLIMDASSFTLFFDELNALLAGESLPAIDTRYDFRSYLLHQQKINQPLRDVARA YWLAKASTLPPAPVLPLACEPATLREVRNTRRRMIVPATRWHAFSNRAGEYGVTPTMALA TCFSAVLARWGGLTRLLLNITLFDRQPLQPAVGAMLADFTNILLLDTACDGDTVSNLARK NQLTFTEDWEHRHWSGVELLRELKRQQRYPHGAPVVFTSNLGRSLYSSRAESPLGEPEWG ISQTPQVWIDHLAFEHHGEVWLQWDSNDALFPPALVETLFDAYCQLINQLCDDESAWQKP FADMMPASQRAIRERVNATGAPIPEGLLHEGIFRIALQQPQALAVTDMRYQWNYHELTDY ARRCAGRLIECGVQPGDNVAITMSKGAGQLVAVLAVLLAGAVYVPVSLDQPAARREKIYA DASVRLVLICQHDASAGSDDIPALAWQQAIEAEPIANPVVRAPTQPAYIIYTSGSTGTPK GVVISHRGALNTCCDINTRYQVGPHDRVLALSALHFDLSVYDIFGVLRAGGALVMVMENQ RRDPHAWCELIQRHQVTLWNSVPALFDMLLTWCEGFADATPENLRAVMLSGDWIGLDLPA RYRAFRPQGQFIAMGGATEASIWSNACEIHDVPAHWRSIPYGFPLTNQRYRVVDEQGRDC PDWVPGELWIGGIGVAEGYFNDPLRSEQQFLTLPDERWYRTGDLGCYWPDGTIEFLGRRD KQVKVGGYRIELGEIESALSQLVGVKQATVLAIGEKEKTLAAYVVPQGEAFCVTDHRNPA LPQAWHTLAGTLPCCAISPEISAEQVADFLQHRLLKLKPGHTAGADPLPLMNSLAIQPRW QAVVERWLAFLVTQRRLKPAAEGYQVCAGEEREDEHPHFSGHDLTLSQILRGARNELSLL 
NDAQWSPESLAFNHPASAPYIQELATICQQLAQRLQRPVRLLEVGTRTGRAAESLLAQLN AGQIEYVGLEQSQEMLLSARQRLAPWPGARLSLWNADTLAAHAHSADIIWLNNALHRLLP EDPGLLATLQQLAVPGALLYVMEFRQLTPSALLSTLLLTNGQPEALLHNSADWAALFSAA GFNCQHGDEVAGLQRFLVQCPDRQVRRDPRQLQAALAGRLPGWMVPQRIVFLDALPLTAN GKIDYQALKRRHTPEAENPAEADLPQGDIEKQVAALWQQLLSTGNVTRETDFFQQGGDSL LATRLTGQLHQAGYEAQLSDLFNHPRLADFAATLRKTDVPVEQPFVHSLEDRYQPFALTD VQQAYLVGRQPGFALGGVGSHFFVEFEIADLDLTRLETVWNRLIARHDMLRAIVRDGQQQ VLEQTPPWVIPAHTLHTPEEALRVREKLAHQVLNPEVWPVFDLQVGYVDGMPARLWLCLD NLLLDGLSMQILLAELEHGYRYPQQLLPPLPVTFRDYLQQPSLQSPNPDSLAWWQAQLDD IPPAPALPLRCLPQEVETPRFARLNGALDSTRWHRLKKRAADAHLTPSAVLLSVWSTVLS AWSAQPDFTLNLTLFDRRPLHPQINQILGDFTSLMLLSWHPGESWLHSAQSLQQRLSQNL NHRDVSAIRVMRQLAQRQNVPAVPMPVVFTSALGFEQDNFLARRNLLKPVWGISQTPQVW LDHQIYESEGELRFNWDFVAALFPAGQVERQFEQYCALLNRMAEDESGWQLPLAALVPPV KHAGQCAERSPRVCPEHSQPHIAADESTVSLICDAFREVVGESVTPAENFFEAGATSLNL VQLHVLLQRHEFSTLTLLDLFTHPSPAALADYLAGVATVEKTQRPRPVRRRQRRI >gi|296493489|gb|ADTK01000012.1| GENE 14 31119 - 32078 547 319 aa, chain - ## HITS:1 COG:YPO1912 KEGG:ns NR:ns ## COG: YPO1912 COG2207 # Protein_GI_number: 16122160 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Yersinia pestis # 1 319 1 319 319 647 99.0 0 MTESPQTQSEISIHQLVVGKPANDGNIPAQCELLRCSLQEGMDILLWRGHFARPETLQLH DDLGRINFSCILEGTSRFAIQGLRRHTDWELARNRHYITHTPDCRGSASYCGRFESITLS FSPETLALWVPDISAVIKNKIDSHCCCQQQRCNAETHLTAQALRHALMRMHGGFSHEQKP STLWLQGQSLVMLSLVLDEHREDASCLSCHFNPMERQKLLRAKDLLLADLTQAPGVAELA RESGLSVLKIKRGFRVLFNNSVYGLFQAERMQEARRRLANGNTSVMTVAADLGYANASHF SAAFQKQFGVTPSTFKRGM >gi|296493489|gb|ADTK01000012.1| GENE 15 32335 - 34047 194 570 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 330 535 295 507 563 79 31 4e-14 MALAGLAALTSLGALLFLAWSLRDIRATPDAIPAWPLGGVIGCVVLTFVLRLQAFNTSHY AAFHLENILRSRLARKALQLPPGVLQQMGSGSVAKVMLDDVKSLHIFVADSTPLYARAII MPLATIVILFWLDWRLAIATLGVLAFGSVILVLARQRSENMAQRYHKAREQVSAAVIEFV QAMPVVRTFDSGSTSFLRYQRALEEWVDVLKTWYRKAGFSARFSFSILNPLPTLFVLIWS GYGLLHYGSFDFIAWVAVLLLGSGMAEAVMPMMMLNNLVAQTRLSIQRIYQVLAMPELSL PQSDQQPQEASITFEQVSFYYPQARTGAALQEVSFHVPAGQIVALVGPSGAGKSTVARLL LRYADPDKGHIRIGGVDLRDMQTDTLMKQLSFVFQDNFLFADTIANNIRLGAPDTPLEAV IAAARVAQAHDFISALPEGYNTRVGERGVFLSGGQRQRITIARALLQDRPILVLDEATAF ADPENEAALIKALAAAMRGRTVIMVAHRLSMVTQADVILLFSDGQLREMGNHTQLLAQGG LYQRLWQHYQQAQHWVPGGTQEEVVENERQ >gi|296493489|gb|ADTK01000012.1| GENE 16 34034 - 35836 217 600 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 337 559 16 242 329 88 30 8e-17 MKDNNPADNLAWRVIWRQLISSVGSQARMLRRSMLALLLAAFMQGIAFACLYPIIDALLR GDAPQLLNWAMAFSVAAIVTLVLRWYGLGFEYRGHLAQATHELRLRLGEQLRRVPLEKLQ RGRAGEMNALLLGSVDENLNYVIAIANILLLTIVTPLTASLATLWIDWRLGLVMLLIFPL LVPFYYWRRPAMRRQMQTLGEAHQRLSGDIVEFAQGMMVLRTCGSDADKSRALLAHFNAL ENLQTRTHRQGAGATMLIASVVELGLQVVVLSGIVWVVTGTLNLAFLIAAVAMIMRFAEP MAMFISYTSVVELIASALQRIEQFMAIAPLPVAEQSEMPERYDIRFDNVSYRYEEGDGYA LNHVSLTFPAASMSALVGASGAGKTTVTKLLMRYADPQQGQISIGGVDIRRLTPEQLNSL ISVVFQDVWLFDDTLLANIRIARPQATRQEVEEAARAAQCLEFISRLPQGWLTPMGEMGG QLSGGERQRISIARALLKNAPVVILDEPTAALDIESELAVQKAIDNLVHNRTVIIIAHRL STIAGAGNILVMEEGQVVEQGTHAQLLSHHGRYQALWQAQMAARVWRDDGVSASGEWVHE >gi|296493489|gb|ADTK01000012.1| GENE 17 35829 - 37109 967 426 aa, chain + ## HITS:1 COG:YPO1915 KEGG:ns NR:ns ## COG: YPO1915 COG0477 # Protein_GI_number: 16122163 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 1 426 1 426 426 664 99.0 0 MSDVQSNVKPLTLTTGRVIFAIAGVYVTQSLVSALSMQSLPALVRAAGGSLALAGATTLF 
MLPWALKFIWAPWIERWRLPPGSQERRSRMLILRGQVALAAILTIVAAIGWFGREGGFPD TQIVALFVLFMVAGTVASTIDIASDGFCVDQLTRTGYGWGNSVQVGGSYLGMMCGGGVFL MLSAASGWPVAMLMMAVLIMALSLPLWRITEPTRTATIPHVPALGYALRRKQARLGLLLV LMLNSGMRFVLPLLAPLLLDHGLSMSALGALFSGGNIAAGIAGTLAGGLLMKYTSPGRAL LTAYGVQGIALLAVVMTLMMAPGHLLLQILQCLVLVQSISLACALVCLYATLMSLSSPLQ AGVDFTLFQCTDAAIAILAGVIGGVVAQHFGYAACFLFAGAFTLLAAWVAYIRLHSAREL MTSAID >gi|296493489|gb|ADTK01000012.1| GENE 18 37137 - 38441 1171 434 aa, chain + ## HITS:1 COG:YPO1916 KEGG:ns NR:ns ## COG: YPO1916 COG0147 # Protein_GI_number: 16122164 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Yersinia pestis # 1 434 1 434 434 872 99.0 0 MKISEFLHLALPEEQWLPTISGVLRQFAEEECYVYERQPCWYLGKGCQARLHINADGTQA TFIDDAGEQKWAVDSIADCARRFMAHPQVKGRRVYGQVGFNFAAHARGIAFNAGEWPLLT LTVPREELIFEKGNVTVYADSADGCRRLCEWVKEAGTTTQNAPLAVDTALNGEAYKQQVA RAVAEIRRGEYVKVIVSRAIPLPSRIDMPATLLYGRQANTPVRSFMFRQEGREALGFSPE LVMSVTGNKVVTEPLAGTRDRMGNPEHNKAKEAELLHDSKEVLEHILSVKEAIAELEAVC QPGSVVVEDLMSVRQRGSVQHLGSGVSGQLSENKDAWDAFTVLFPSITASGIPKNAALNA IMQIEKTPRELYSGAILLLDDTRFDAALVLRSVFQDSQRSWIQAGAGIIAQSTPERELTE TREKLASIAPYLMV >gi|296493489|gb|ADTK01000012.1| GENE 19 38635 - 39897 366 420 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239523764|gb|EEQ63630.1| 30S ribosomal protein S15 [Helicobacter pullorum MIT 98-5489] # 2 406 3 397 397 145 28 4e-34 MFLTDAKIRTLKPSDKPFKVSDSHGLYLLVKPGGSRHWYLKYRISGKESRIALGAYQAIS LSDARQQREGIRKMLALNINPVQQRAAERGSRTPEKVFKNLALAWHKSNRKWSQNTADRL LASLNNHIFPVIGNLPVSELKPRHFIDLLKGIEEKGLLEVASRTRQHLSNIMRHAVHQEL IDTNPAANLGGVTTPPVRRHYPALPLERLPELLERIGAYHQGRELTRHAVLLMLHVFIRS SELRFARWSEIDFTNRVWTIPATREPIIGVRYSGRGAKMRMPHIVPLSEQSIAILKQIKD ITGNNELIFPGDHNPYKPMCENTVNKALRVMGYDTKKDICGHGFRAMACSALMESGLWAK DAVERQMSHQERNTVRMAYIHKAEHLEARKAMMQWWSDYLEACRESYAPPYTIGKNKFIP >gi|296493489|gb|ADTK01000012.1| GENE 20 40235 - 41032 371 265 aa, chain - ## HITS:1 COG:ECs2774 KEGG:ns NR:ns ## COG: ECs2774 COG3228 # Protein_GI_number: 
15832028 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 265 14 278 278 523 98.0 1e-148 MIKWPWKVQESAHQTALPWQEALSIPLLTGLTEQEQSKLVTLAERFLQQKRLVPLQGFEL DSLRSCRIALLFCLPVLELGLEWLDGFHEVLIYPAPFVVDDEWEDDIGLVHNQRIVQSGQ SWQQGPIVLNWLDIQDSFDASGFNLIIHEVAHKLDTRNGDRASGVPFISLREVAGWEHDL HAAMSNIQEEIELVGENAASIDAYAASDPAECFAVLSEYFFSAPELFAPRFPSLWQRFCQ FYQQDPLQRLHHANDTDSFSATNVH >gi|296493489|gb|ADTK01000012.1| GENE 21 41885 - 42535 440 216 aa, chain - ## HITS:1 COG:yodA KEGG:ns NR:ns ## COG: yodA COG3443 # Protein_GI_number: 16129919 # Func_class: R General function prediction only # Function: Predicted periplasmic or secreted protein # Organism: Escherichia coli K12 # 1 216 1 216 216 430 98.0 1e-120 MAIRLHKLAVALGVFIVSAPAFSHGHHSHGKPLTEVEQKAANGVFDDANVQNRTLSDWDG VWQSVYPLLQSGKLDPVFQKKADADKTKTFAEIKDYYRKGYVTDIEMIGIEDGIVEFHRN NETTSCKYDYDGYKILTYKSGKKGVRYLFECKDPESKAPKYIQFSDHIIAPRKSSHFHIF MGNDSQQSLLNEMENWPTYYPYQLSSEEVVEEMMSH >gi|296493489|gb|ADTK01000012.1| GENE 22 42792 - 43427 459 211 aa, chain - ## HITS:1 COG:yedZ KEGG:ns NR:ns ## COG: yedZ COG2717 # Protein_GI_number: 16129918 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 211 1 211 211 337 99.0 7e-93 MRLTAKQVTWLKVCLHLAGLLPFLWLVWAINHGGLGADPVKDIQHFTGRTALKFLLATLL ITPLARYAKQPLLIRTRRLLGLWCFAWATLHLTSYALLELGVNNLALLGKELITRPYLTL GIISWVILLALAFTSTQAMQRKLGKHWQQLHNFVYLVAILAPIHYLWSVKIISPQPLIYA GLAVLLLALRYKKLLSLFNRLRKQVHNKLSV >gi|296493489|gb|ADTK01000012.1| GENE 23 43428 - 44432 832 334 aa, chain - ## HITS:1 COG:ECs2709 KEGG:ns NR:ns ## COG: ECs2709 COG2041 # Protein_GI_number: 15831963 # Func_class: R General function prediction only # Function: Sulfite oxidase and related enzymes # Organism: Escherichia coli O157:H7 # 1 334 1 334 334 665 98.0 0 MKKNQFLKESDVTAESVFFMTRRQVLKALGISAAALSLPHAAHADLLSWFKGNDRPPAPA GKPLEFSKPAAWQNDLPLTPADKVSGYNNFYEFGLDKADPAANAGSLKTDPWTLKISGEV 
AKPLTLDHDDLTRRFPLEERIYRMRCVEAWSMVVPWIGFPLHKLLALAEPTSNAKYVAFE TIYAPEQMPGQQDRFIGGGLKYPYVEGLRLDEAMHPLTLMTVGVYGKALPPQNGAPVRLI VPWKYGFKGIKSIVSIKLTRERPPTTWNLAAPDEYGFYANVNPHVDHPRWSQATERFIGS GGILDVQRQPTLLFNGYAEQVASLYRGLDLRENF >gi|296493489|gb|ADTK01000012.1| GENE 24 44541 - 44954 289 137 aa, chain - ## HITS:1 COG:ECs2708 KEGG:ns NR:ns ## COG: ECs2708 COG2351 # Protein_GI_number: 15831962 # Func_class: R General function prediction only # Function: Transthyretin-like protein # Organism: Escherichia coli O157:H7 # 1 137 1 137 137 271 100.0 3e-73 MLKRYLVLSVVTAAFSLPSLVYAAQQNILSVHILNQQTGKPAADVTVTLEKKADNGWLQL NTAKTDKDGRIKALWPEQTATTGDYRVVFKTGDYFKKQNLESFFPEIPVEFHINKVNEHY HVPLLLSQYGYSTYRGS >gi|296493489|gb|ADTK01000012.1| GENE 25 45039 - 45758 239 239 aa, chain + ## HITS:1 COG:yedW KEGG:ns NR:ns ## COG: yedW COG0745 # Protein_GI_number: 16129915 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli K12 # 1 239 1 239 239 458 99.0 1e-129 MNQAVSITYDLWHIIFMKILLIEDNQRTQEWVTQGLSEAGYVIDAVSDGRDGLYLALKDD YALIILDIMLPGMDGWQILQTLRTAKQTPVICLTARDSVDDRVRGLDSGANDYLVKPFSF SELLARVRAQLRQHHTLNSTLEISGLRMDSVSQSVSRDNISITLTRKEFQLLWLLASRAG EIIPRTVIASEIWGINFDSDTNTVDVAIRRLRAKVDDPFPEKLIATIRGMGYSFVAVKK >gi|296493489|gb|ADTK01000012.1| GENE 26 46214 - 47116 279 300 aa, chain + ## HITS:1 COG:yedV KEGG:ns NR:ns ## COG: yedV COG0642 # Protein_GI_number: 16129914 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 300 153 452 452 573 98.0 1e-163 MLEQYKINSIIICIVAIVLCSVLSPLLIRTGLREIKKLSGVTEALNYNDSREPVEVSALP RELKPLGQALNKMHHALVKDFERLSQFADDLAHELRTPINALLGQNQVTLSQTRSIAEYQ KTIAGNIEELENISRLTENILFLARADKNNVLVKLDSLFLNKEVENLLDYLEYLSDEKEI CFKVECNQQIFADKILLQRMLSNLIVNAIRYSPEKSRIHITSFLDTNGYLNIDIASPGTK INEPEKLFRRFWRGDNSRHSEGQGLGLSLVKAIAELHGGSATYHYFNKHNVFRITLPQRN >gi|296493489|gb|ADTK01000012.1| GENE 27 
47224 - 47919 478 231 aa, chain - ## HITS:1 COG:yedU KEGG:ns NR:ns ## COG: yedU COG0693 # Protein_GI_number: 16129913 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Escherichia coli K12 # 1 231 53 283 283 467 98.0 1e-132 MIAADERYLPPDNGKLFSTGNHPIETLLPLYHLHAAGFEFEVATISGLMTKFEYWAMPHK DEKVMPFFEQHKSLFRNPKKLADVVVSLNADSEYAAIFVPGGHGALIGLPESQDVAAALQ WAIKNDRFVISLCHGPAAFLALRHGDNPLNGYSICAFPDAADKQTPEIGYMPGHLTWYFG EELKKMGMNIINDDITGRVHKDRKVLTGDSPFAANALGKLAAQEMLAAYAG Prediction of potential genes in microbial genomes Time: Mon May 16 15:00:55 2011 Seq name: gi|296493488|gb|ADTK01000013.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont31.1, whole genome shotgun sequence Length of sequence - 48816 bp Number of predicted genes - 47, with homology - 47 Number of transcription units - 22, operones - 13 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1108 1176 ## COG0001 Glutamate-1-semialdehyde aminotransferase - Prom 1217 - 1276 5.4 + Prom 1099 - 1158 3.8 2 2 Op 1 3/1.000 + CDS 1333 - 2754 1479 ## COG0038 Chloride channel protein EriC 3 2 Op 2 . + CDS 2836 - 3180 441 ## COG0316 Uncharacterized conserved protein + Term 3192 - 3228 8.2 4 3 Tu 1 . - CDS 3227 - 3772 590 ## COG2860 Predicted membrane protein - Prom 3818 - 3877 2.5 5 4 Op 1 5/0.500 - CDS 3888 - 4688 796 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 6 4 Op 2 . - CDS 4681 - 5379 702 ## COG0775 Nucleoside phosphorylase - Prom 5480 - 5539 4.3 + Prom 5332 - 5391 2.0 7 5 Tu 1 . + CDS 5463 - 6980 1039 ## COG0232 dGTP triphosphohydrolase + Prom 7002 - 7061 2.7 8 6 Tu 1 . + CDS 7110 - 8534 1537 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain + Term 8545 - 8580 6.5 + Prom 8566 - 8625 3.8 9 7 Tu 1 . + CDS 8689 - 9846 1142 ## COG3835 Sugar diacid utilization regulator + Term 9958 - 9992 -0.5 10 8 Tu 1 . 
- CDS 9935 - 10321 456 ## ECIAI1_0162 hypothetical protein - Prom 10350 - 10409 3.6 - Term 10588 - 10628 10.6 11 9 Op 1 5/0.500 - CDS 10636 - 11460 1076 ## COG2171 Tetrahydrodipicolinate N-succinyltransferase 12 9 Op 2 9/0.167 - CDS 11491 - 14163 2390 ## COG2844 UTP:GlnB (protein PII) uridylyltransferase - Term 14188 - 14221 4.5 13 9 Op 3 . - CDS 14225 - 15019 905 ## COG0024 Methionine aminopeptidase - Prom 15135 - 15194 4.6 + Prom 15094 - 15153 3.5 14 10 Op 1 38/0.000 + CDS 15189 - 16112 1594 ## PROTEIN SUPPORTED gi|26106512|gb|AAN78698.1|AE016755_198 30S ribosomal protein S2 + Term 16287 - 16320 3.1 + Prom 16285 - 16344 3.8 15 10 Op 2 24/0.000 + CDS 16370 - 17221 1001 ## PROTEIN SUPPORTED gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts + Term 17232 - 17265 5.2 + Prom 17266 - 17325 4.9 16 10 Op 3 . + CDS 17368 - 18093 977 ## COG0528 Uridylate kinase + Term 18111 - 18168 8.2 - Term 17994 - 18037 -0.3 17 11 Tu 1 . - CDS 18083 - 18328 111 ## ECIAI39_0174 hypothetical protein - Prom 18479 - 18538 2.1 + Prom 18107 - 18166 3.5 18 12 Op 1 8/0.333 + CDS 18243 - 18800 797 ## COG0233 Ribosome recycling factor + Prom 18810 - 18869 2.1 19 12 Op 2 7/0.333 + CDS 18892 - 20088 878 ## COG0743 1-deoxy-D-xylulose 5-phosphate reductoisomerase + Term 20193 - 20222 0.4 + Prom 20095 - 20154 3.9 20 13 Op 1 32/0.000 + CDS 20277 - 21035 446 ## COG0020 Undecaprenyl pyrophosphate synthase 21 13 Op 2 12/0.000 + CDS 21156 - 21905 591 ## COG0575 CDP-diglyceride synthetase 22 13 Op 3 18/0.000 + CDS 21917 - 23269 918 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 23 13 Op 4 17/0.000 + CDS 23299 - 25731 2500 ## COG4775 Outer membrane protein/protective antigen OMA87 + Prom 25741 - 25800 2.1 24 13 Op 5 15/0.000 + CDS 25853 - 26338 588 ## COG2825 Outer membrane protein 25 13 Op 6 18/0.000 + CDS 26342 - 27367 940 ## COG1044 UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase + Term 27394 - 27445 7.1 + Prom 27386 - 27445 4.0 26 14 Op 1 25/0.000 + CDS 
27472 - 27927 452 ## COG0764 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases 27 14 Op 2 11/0.000 + CDS 27931 - 28719 790 ## COG1043 Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase 28 14 Op 3 11/0.000 + CDS 28719 - 29867 1153 ## COG0763 Lipid A disaccharide synthetase 29 14 Op 4 6/0.333 + CDS 29864 - 30460 679 ## COG0164 Ribonuclease HII 30 14 Op 5 7/0.333 + CDS 30495 - 33977 3421 ## COG0587 DNA polymerase III, alpha subunit 31 14 Op 6 3/1.000 + CDS 33990 - 34949 1322 ## COG0825 Acetyl-CoA carboxylase alpha subunit + Term 34970 - 34999 1.2 + Prom 34968 - 35027 3.5 32 15 Op 1 1/1.000 + CDS 35048 - 37189 1941 ## COG1982 Arginine/lysine/ornithine decarboxylases 33 15 Op 2 2/1.000 + CDS 37246 - 37635 416 ## COG0346 Lactoylglutathione lyase and related lyases + Term 37655 - 37687 4.7 34 16 Tu 1 . + CDS 37700 - 38998 1044 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control + Term 39210 - 39268 1.3 - Term 38993 - 39037 6.6 35 17 Op 1 . - CDS 39047 - 39301 313 ## COG4568 Transcriptional antiterminator 36 17 Op 2 . - CDS 39294 - 39494 166 ## ECIAI39_0458 hypothetical protein - Prom 39566 - 39625 3.3 + Prom 39579 - 39638 1.8 37 18 Op 1 5/0.500 + CDS 39660 - 40205 752 ## COG4681 Uncharacterized protein conserved in bacteria 38 18 Op 2 4/0.833 + CDS 40202 - 40624 302 ## COG1186 Protein chain release factor B 39 18 Op 3 . + CDS 40638 - 41348 686 ## COG3015 Uncharacterized lipoprotein NlpE involved in copper resistance + Term 41449 - 41490 3.3 - Term 41368 - 41404 4.0 40 19 Op 1 . - CDS 41503 - 42327 410 ## EcHS_A0197 hypothetical protein - Term 42335 - 42374 6.2 41 19 Op 2 6/0.333 - CDS 42380 - 44098 2054 ## COG0442 Prolyl-tRNA synthetase - Term 44116 - 44146 2.7 42 20 Op 1 . - CDS 44209 - 44916 611 ## COG1720 Uncharacterized conserved protein 43 20 Op 2 . 
- CDS 44913 - 45317 306 ## EC55989_0194 outer membrane lipoprotein - Prom 45342 - 45401 2.5 - Term 45364 - 45400 6.7 44 21 Op 1 22/0.000 - CDS 45435 - 46250 1148 ## COG1464 ABC-type metal ion transport system, periplasmic component/surface antigen 45 21 Op 2 32/0.000 - CDS 46290 - 46943 843 ## COG2011 ABC-type metal ion transport system, permease component 46 21 Op 3 . - CDS 46936 - 47967 1146 ## COG1135 ABC-type metal ion transport system, ATPase component + Prom 47929 - 47988 3.9 47 22 Tu 1 . + CDS 48155 - 48727 541 ## COG0241 Histidinol phosphatase and related phosphatases Predicted protein(s) >gi|296493488|gb|ADTK01000013.1| GENE 1 1 - 1108 1176 369 aa, chain - ## HITS:1 COG:hemL KEGG:ns NR:ns ## COG: hemL COG0001 # Protein_GI_number: 16128147 # Func_class: H Coenzyme transport and metabolism # Function: Glutamate-1-semialdehyde aminotransferase # Organism: Escherichia coli K12 # 1 369 1 369 426 726 99.0 0 MSKSENLYSAARELIPGGVNSPVRAFTGVGGTPLFIEKADGAYLYDVDGKAYIDYVGSWG PMVLGHNHPAIRNAVIEAAERGLSFGAPTEMEVKMAQLVTELVPTMDMVRMVNSGTEATM SAIRLARGFTGRDKIIKFEGCYHGHADCLLVKAGSGALTLGQPNSPGVPADFAKHTLTCT YNDLASVRAAFEQYPQEIACIIVEPVAGNMNCVPPLPEFLPGLRALCDEFGALLIIDEVM TGFRVALAGAQDYYGVVPDLTCLGKIIGGGMPVGAFGGRRDVMDALAPTGPVYQAGTLSG NPIAMAAGFACLNEVAQPGVHETLDELTSRLAEGLLEAAEEAGIPLVVNHVGGMFGIFFA DAESVTCYQ >gi|296493488|gb|ADTK01000013.1| GENE 2 1333 - 2754 1479 473 aa, chain + ## HITS:1 COG:yadQ KEGG:ns NR:ns ## COG: yadQ COG0038 # Protein_GI_number: 16128148 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Escherichia coli K12 # 1 473 1 473 473 789 100.0 0 MKTDTPSLETPQAARLRRRQLIRQLLERDKTPLAILFMAAVVGTLVGLAAVAFDKGVAWL QNQRMGALVHTADNYPLLLTVAFLCSAVLAMFGYFLVRKYAPEAGGSGIPEIEGALEDQR PVRWWRVLPVKFFGGLGTLGGGMVLGREGPTVQIGGNIGRMVLDIFRLKGDEARHTLLAT GAAAGLAAAFNAPLAGILFIIEEMRPQFRYTLISIKAVFIGVIMSTIMYRIFNHEVALID VGKLSDAPLNTLWLYLILGIIFGIFGPIFNKWVLGMQDLLHRVHGGNITKWVLMGGAIGG LCGLLGFVAPATSGGGFNLIPIATAGNFSMGMLVFIFVARVITTLLCFSSGAPGGIFAPM 
LALGTVLGTAFGMVAVELFPQYHLEAGTFAIAGMGALLAASIRAPLTGIILVLEMTDNYQ LILPMIITGLGATLLAQFTGGKPLYSAILARTLAKQEAEQLARSKAASASENT >gi|296493488|gb|ADTK01000013.1| GENE 3 2836 - 3180 441 114 aa, chain + ## HITS:1 COG:STM0204 KEGG:ns NR:ns ## COG: STM0204 COG0316 # Protein_GI_number: 16763594 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Salmonella typhimurium LT2 # 1 114 16 129 129 221 99.0 4e-58 MSDDVALPLEFTDAAANKVKSLIADEDNPNLKLRVYITGGGCSGFQYGFTFDDQVNEGDM TIEKQGVGLVVDPMSLQYLVGGSVDYTEGLEGSRFIVTNPNAKSTCGCGSSFSI >gi|296493488|gb|ADTK01000013.1| GENE 4 3227 - 3772 590 181 aa, chain - ## HITS:1 COG:yadS KEGG:ns NR:ns ## COG: yadS COG2860 # Protein_GI_number: 16128150 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 181 27 207 207 296 100.0 1e-80 MDPFGVLVLGVVTAVGGGTIRDMALDHGPVFWVKDPTDLVVAMVTSMLTIVLVRQPRRLP KWMLPVLDAVGLAVFVGIGVNKAFNAEAGPLIAVCMGVITGVGGGIIRDVLAREIPMILR TEIYATACIIGGIVHATAYYTFSVPLETASMMGMVVTLLIRLAAIRWHLKLPTFALDENG R >gi|296493488|gb|ADTK01000013.1| GENE 5 3888 - 4688 796 266 aa, chain - ## HITS:1 COG:yadT KEGG:ns NR:ns ## COG: yadT COG0614 # Protein_GI_number: 16128151 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Escherichia coli K12 # 1 266 1 266 266 482 100.0 1e-136 MAKSLFRALVALSFLAPLWLNAAPRVITLSPANTELAFAAGITPVGVSSYSDYPPQAQKI EQVSTWQGMNLERIVALKPDLVIAWRGGNAERQVDQLASLGIKVMWVDATSIEQIANALR QLAPWSPQPDKAEQAAQSLLDQYAQLKAQYADKPKKRVFLQFGINPPFTSGKESIQNQVL EVCGGENIFKDSRVPWPQVSREQVLARSPQAIVITGGPDQIPKIKQYWGEQLKIPVIPLT SDWFERASPRIILAAQQLCNALSQVD >gi|296493488|gb|ADTK01000013.1| GENE 6 4681 - 5379 702 232 aa, chain - ## HITS:1 COG:STM0207 KEGG:ns NR:ns ## COG: STM0207 COG0775 # Protein_GI_number: 16763597 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside phosphorylase # Organism: Salmonella typhimurium LT2 # 1 232 1 232 232 410 95.0 1e-114 
MKIGIIGAMEEEVTLLRDKIEKRQTISLGGCEIYTGQLNGTEVALLKSGIGKVAAALGAT LLLEHCKPDVIINTGSAGGLAPTLKVGDIVVSDEARYHDADVTAFGYEYGQLPGCPAGFK ADDKLIAAAEACIAELNLNAVRGLIVSGDAFINGSVGLAKIRHNFPQAIAVEMEATAIAH VCHNFNVPFVVVRAISDVADQQSHLSFDEFLAVAAKQSSLMVESLVQKLAHG >gi|296493488|gb|ADTK01000013.1| GENE 7 5463 - 6980 1039 505 aa, chain + ## HITS:1 COG:dgt KEGG:ns NR:ns ## COG: dgt COG0232 # Protein_GI_number: 16128153 # Func_class: F Nucleotide transport and metabolism # Function: dGTP triphosphohydrolase # Organism: Escherichia coli K12 # 1 505 1 505 505 1007 99.0 0 MAQIDFRKKINWHRRYRSPQGVKTEHEILRIFESDRGRIINSPAIRRLQQKTQVFPLERN AAVRTRLTHSMEVQQVGRYIAKEILSRLKELKLLEAYGLDELTGPFESIVEMSCLMHDIG NPPFGHFGEAAINDWFRQRLYPEDAESQPLTDDRCSVAALRLRDGEEPLNELRRKIRQDL CYFEGNAQGIRLVHTLMRMNLTWAQVGGILKYTRPAWWRGETPETHHYLMKKPGYYLSEE AYIARLRKELNLALYSRFPLTWIMEAADDISYCVADLEDAVEKRIFTVEQLYHHLHEAWG QHEKGSLFSLVVENAWEKSRSNSLSRSTEDQFFMYLRVNTLNKLVPYAAQRFIDNLPAIF AGTFNHALLEDASECSDLLKLYKNVAVKHVFSHPDVEQLELQGYRVISGLLEIYRPLLSL SLSDFTELVEKERVKRFPIESRLFHKLSTRHRLAYVEAVSKLPSDSPEFPLWEYYYRCRL LQDYISGMTDLYAWDEYRRLMAVEQ >gi|296493488|gb|ADTK01000013.1| GENE 8 7110 - 8534 1537 474 aa, chain + ## HITS:1 COG:ECs0165 KEGG:ns NR:ns ## COG: ECs0165 COG0265 # Protein_GI_number: 15829419 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Escherichia coli O157:H7 # 1 474 1 474 474 775 99.0 0 MKKTTLALSALALSLGLALSPLSATAAETSSATTAQQMPSLAPMLEKVMPSVVSINVEGS TTVNTPRMPRNFQQFFGDDSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVIID ADKGYVVTNNHVVDNATVIKVQLSDGRKFDAKMVGKDPRSDIALIQIQNPKNLTAIKMAD SDALRVGDYTVAIGNPFGLGETVTSGIVSALGRSGLNAENYENFIQTDAAINRGNSGGAL VNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKNLTSQMVEYGQVKRGELGIMGTELNS ELAKAMKVDAQRGAFVSQVQPNSSAAKAGIKAGDVITSLNGKPISSFAALRAQVGTMPVG SKLTLGLLRDGKQVNVNLELQQSSQNQVDSSSIFNGIEGAEMSNKGKDQGVVVNNVKTGT PAAQIGLKKGDVIIGANQQAVKNIAELRKVLDSKPSVLALNIQRGDSSIYLLMQ >gi|296493488|gb|ADTK01000013.1| GENE 9 8689 - 9846 1142 
385 aa, chain + ## HITS:1 COG:yaeG KEGG:ns NR:ns ## COG: yaeG COG3835 # Protein_GI_number: 16128155 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Sugar diacid utilization regulator # Organism: Escherichia coli K12 # 1 385 7 391 391 733 99.0 0 MAGWHLDTKMAQDIVARTMRIIDTNINVMDARGRIIGSGDRERIGELHEGALLVLSQGRV VDIDDAVARHLHGVRQGINLPLRLEGEIVGVIGLTGEPENLRKYGELVCMTAEMMLEQSR LMHLLAQDSRLREELVMNLIQAEENTPALTEWAQRLGIDLNQPRVVAIVEVDSGQLGVDS AMAELQQLQNALTTPERNNLVAIVSLTEMVVLKPALNSFGRWDAEDHRKRVEQLITRMKE YGQLRFRVSLGNYFTGPGSIARSYRTAKTTMVVGKQRMPESRCYFYQDLMLPVLLDSLRG DWQANELARPLARLKAMDNNGLLRRTLAAWFRHNVQPLATSKALFIHRNTLEYRLNRISE LTGLDLGNFDDRLLLYVALQLDEER >gi|296493488|gb|ADTK01000013.1| GENE 10 9935 - 10321 456 128 aa, chain - ## HITS:1 COG:no KEGG:ECIAI1_0162 NR:ns ## KEGG: ECIAI1_0162 # Name: yaeH # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 128 1 128 128 218 100.0 6e-56 MYDNLKSLGITNPEEIDRYSLRQEANNDILKIYFQKDKGEFFAKSVKFKYPRQRKTVVAD GVGQGYKEVQEISPNLRYIIDELDQICQRDRSEVDLKRKILDDLRHLESVVTNKISEIEA DLEKLTRK >gi|296493488|gb|ADTK01000013.1| GENE 11 10636 - 11460 1076 274 aa, chain - ## HITS:1 COG:dapD KEGG:ns NR:ns ## COG: dapD COG2171 # Protein_GI_number: 16128159 # Func_class: E Amino acid transport and metabolism # Function: Tetrahydrodipicolinate N-succinyltransferase # Organism: Escherichia coli K12 # 1 274 1 274 274 521 100.0 1e-148 MQQLQNIIETAFERRAEITPANADTVTREAVNQVIALLDSGALRVAEKIDGQWVTHQWLK KAVLLSFRINDNQVIEGAESRYFDKVPMKFADYDEARFQKEGFRVVPPAAVRQGAFIARN TVLMPSYVNIGAYVDEGTMVDTWATVGSCAQIGKNVHLSGGVGIGGVLEPLQANPTIIED NCFIGARSEVVEGVIVEEGSVISMGVYIGQSTRIYDRETGEIHYGRVPAGSVVVSGNLPS KDGKYSLYCAVIVKKVDAKTRGKVGINELLRTID >gi|296493488|gb|ADTK01000013.1| GENE 12 11491 - 14163 2390 890 aa, chain - ## HITS:1 COG:glnD KEGG:ns NR:ns ## COG: glnD COG2844 # Protein_GI_number: 16128160 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: UTP:GlnB (protein PII) uridylyltransferase # Organism: Escherichia coli K12 # 1 
890 1 890 890 1785 99.0 0 MNTLPEQYANTALPTLPGQPQNPCVWPRDELTVGGIKAHIDTFQRWLGDAFDNGISAEQL IEARTEFIDQLLQRLWIEAGFSQIADLALVAVGGYGRGELHPLSDIDLLILSRKKLPDDQ AQKVGELLTLLWDVKLEVGHSVRTLEECMLEGLSDLTVATNLIESRLLIGDVALFLELQK HIFSEGFWPSDKFYAAKVEEQNQRHQRYHGTSYNLEPDIKSSPGGLRDIHTLQWVARRHF GATSLDEMVGFGFLTSAERAELNECLHILWRIRFALHLVVSRYDNRLLFDRQLSVAQRLN YSGEGNEPVERMMKDYFRVTRRVSELNQMLLQLFDEAILALPADEKPRPIDDEFQLRGTL IDLRDETLFMRQPEAILRMFYTMVRNSAITGIYSTTLRQLRHARRHLQQPLCNIPEARKL FLSILRHPGAVRRGLLPMHRHSVLGAYMPQWSHIVGQMQFDLFHAYTVDEHTIRVMLKLE SFASEETRQRHPLCVDVWPRLPSTELIFIAALFHDIAKGRGGDHSILGAQDVVHFAELHG LNSRETQLVAWLVRQHLLMSVTAQRRDIQDPEVIKQFAEEVQTENRLRYLVCLTVADICA TNETLWNSWKQSLLRELYFATEKQLRRGMQNTPDMRERVRHHQLQALALLRMDNIDEEAL HQIWSRCRANYFVRHSPNQLAWHARHLLQHDLSKPLVLLSPQATRGGTEIFIWSPDRPYL FAAVCAELDRRNLSVHDAQIFTTRDGMAMDTFIVLEPDGSPLSADRHEVIRFGLEQVLTQ SSWQPPQPRRQPAKLRHFTVETEVTFLPTHTDRKSFLELIALDQPGLLARVGKIFADLGI SLHGARITTIGERVEDLFIIATADRRALNNELQQEVHQRLTEALNPNDKG >gi|296493488|gb|ADTK01000013.1| GENE 13 14225 - 15019 905 264 aa, chain - ## HITS:1 COG:ECs0170 KEGG:ns NR:ns ## COG: ECs0170 COG0024 # Protein_GI_number: 15829424 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Escherichia coli O157:H7 # 1 264 1 264 264 537 100.0 1e-153 MAISIKTPEDIEKMRVAGRLAAEVLEMIEPYVKPGVSTGELDRICNDYIVNEQHAVSACL GYHGYPKSVCISINEVVCHGIPDDAKLLKDGDIVNIDVTVIKDGFHGDTSKMFIVGKPTI MGERLCRITQESLYLALRMVKPGINLREIGAAIQKFVEAEGFSVVREYCGHGIGRGFHEE PQVLHYDSRETNVVLKPGMTFTIEPMVNAGKKEIRTMKDGWTVKTKDRSLSAQYEHTIVV TDNGCEILTLRKDDTIPAIISHDE >gi|296493488|gb|ADTK01000013.1| GENE 14 15189 - 16112 1594 307 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|26106512|gb|AAN78698.1|AE016755_198 30S ribosomal protein S2 [Escherichia coli CFT073] # 1 307 1 307 307 618 99 1e-176 MVSMTYLWYKARRTSDPFRIHRLDGSDNLTLCNNTHVSAHIPGCPLGSVIWDTWRHNPNF YIEVLIMATVSMRDMLKAGVHFGHQTRYWNPKMKPFIFGARNKVHIINLEKTVPMFNEAL AELNKIASRKGKILFVGTKRAASEAVKDAALSCDQFFVNHRWLGGMLTNWKTVRQSIKRL 
KDLETQSQDGTFEKLTKKEALMRTRELEKLENSLGGIKDMGGLPDALFVIDADHEHIAIK EANNLGIPVFAIVDTNSDPDGVDFVIPGNDDAIRAVTLYLGAVAATVREGRSQDLASQAE ESFVEAE >gi|296493488|gb|ADTK01000013.1| GENE 15 16370 - 17221 1001 283 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts [Haemophilus influenzae R2866] # 1 280 1 279 283 390 71 1e-107 MAEITASLVKELRERTGAGMMDCKKALTEANGDIELAIENMRKSGAIKAAKKAGNVAADG VIKTKIDGNYGIILEVNCQTDFVAKDAGFQAFADKVLDAAVAGKITDVEVLKAQFEEERV ALVAKIGENINIRRVAALEGDVLGSYQHGARIGVLVAAKGADEELVKHIAMHVAASKPEF IKPEDVSAEVVEKEYQVQLDIAMQSGKPKEIAEKMVEGRMKKFTGEVSLTGQPFVMEPSK TVGQLLKEHNAEVTGFIRFEVGEGIEKVETDFAAEVAAMSKQS >gi|296493488|gb|ADTK01000013.1| GENE 16 17368 - 18093 977 241 aa, chain + ## HITS:1 COG:ECs0173 KEGG:ns NR:ns ## COG: ECs0173 COG0528 # Protein_GI_number: 15829427 # Func_class: F Nucleotide transport and metabolism # Function: Uridylate kinase # Organism: Escherichia coli O157:H7 # 1 241 1 241 241 460 100.0 1e-130 MATNAKPVYKRILLKLSGEALQGTEGFGIDASILDRMAQEIKELVELGIQVGVVIGGGNL FRGAGLAKAGMNRVVGDHMGMLATVMNGLAMRDALHRAYVNARLMSAIPLNGVCDSYSWA EAISLLRNNRVVILSAGTGNPFFTTDSAACLRGIEIEADVVLKATKVDGVFTADPAKDPT ATMYEQLTYSEVLEKELKVMDLAAFTLARDHKLPIRVFNMNKPGALRRVVMGEKEGTLIT E >gi|296493488|gb|ADTK01000013.1| GENE 17 18083 - 18328 111 81 aa, chain - ## HITS:1 COG:no KEGG:ECIAI39_0174 NR:ns ## KEGG: ECIAI39_0174 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_IAI39 # Pathway: not_defined # 1 81 12 92 92 127 98.0 9e-29 MRILLIWVLNASTHLSMRTSASFLISLITLRILENLSQSDQTVTHSGSVLSIALIKHITG KIPRDTYAESYPYLSITGIIP >gi|296493488|gb|ADTK01000013.1| GENE 18 18243 - 18800 797 185 aa, chain + ## HITS:1 COG:ECs0174 KEGG:ns NR:ns ## COG: ECs0174 COG0233 # Protein_GI_number: 15829428 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome recycling factor # Organism: Escherichia coli O157:H7 # 1 185 1 185 185 296 100.0 1e-80 MISDIRKDAEVRMDKCVEAFKTQISKIRTGRASPSLLDGIVVEYYGTPTPLRQLASVTVE 
DSRTLKINVFDRSMSPAVEKAIMASDLGLNPNSAGSDIRVPLPPLTEERRKDLTKIVRGE AEQARVAVRNVRRDANDKVKALLKDKEISEDDDRRSQDDVQKLTDAAIKKIEAALADKEA ELMQF >gi|296493488|gb|ADTK01000013.1| GENE 19 18892 - 20088 878 398 aa, chain + ## HITS:1 COG:dxr KEGG:ns NR:ns ## COG: dxr COG0743 # Protein_GI_number: 16128166 # Func_class: I Lipid transport and metabolism # Function: 1-deoxy-D-xylulose 5-phosphate reductoisomerase # Organism: Escherichia coli K12 # 1 398 1 398 398 756 99.0 0 MKQLTILGSTGSIGCSTLDVVRHNPEHFRVVALVAGKNVTRMVEQCLEFSPRYAVMDDEA SAKLLKTMLQQQGSCTEVLSGQQAACDMAALEDVDQVMAAIVGAAGLLPTLAAIRAGKTI LLANKESLVTCGRLFMDAVKQSKAQLLPVDSEHNAIFQSLPQPIQHNLGYADLEQNGVVS ILLTGSGGPFRETPLRDLATMTPDQACRHPNWSMGRKISVDSATMMNKGLEYIEARWLFN ASASQMEVLIHPQSVIHSMVRYQDGSVLAQLGEPDMRTPIAHTMAWPNRVNSGVKPLDFC KLSALTFAAPDYDRYPCLKLAMEAFEQGQAATTALNAANEITVAAFLAQQIRFTDIAALN LSVLEKMDMREPQCVDDVLSVDANAREVARKEVMRLAS >gi|296493488|gb|ADTK01000013.1| GENE 20 20277 - 21035 446 252 aa, chain + ## HITS:1 COG:ECs0176 KEGG:ns NR:ns ## COG: ECs0176 COG0020 # Protein_GI_number: 15829430 # Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Escherichia coli O157:H7 # 1 252 2 253 253 509 100.0 1e-144 MLSATQPLSEKLPAHGCRHVAIIMDGNGRWAKKQGKIRAFGHKAGAKSVRRAVSFAANNG IEALTLYAFSSENWNRPAQEVSALMELFVWALDSEVKSLHRHNVRLRIIGDTSRFNSRLQ ERIRKSEALTAGNTGLTLNIAANYGGRWDIVQGVRQLAEKVQQGNLQPDQIDEEMLNQHV CMHELAPVDLVIRTGGEHRISNFLLWQIAYAELYFTDVLWPDFDEQDFEGALNAFANRER RFGGTEPGDETA >gi|296493488|gb|ADTK01000013.1| GENE 21 21156 - 21905 591 249 aa, chain + ## HITS:1 COG:ECs0177 KEGG:ns NR:ns ## COG: ECs0177 COG0575 # Protein_GI_number: 15829431 # Func_class: I Lipid transport and metabolism # Function: CDP-diglyceride synthetase # Organism: Escherichia coli O157:H7 # 1 249 1 249 249 418 100.0 1e-117 MLAAWEWGQLSGFTTRSQRVWLAVLCGLLLALMLFLLPEYHRNIHQPLVEISLWASLGWW IVALLLVLFYPGSAAIWRNSKTLRLIFGVLTIVPFFWGMLALRAWHYDENHYSGAIWLLY VMILVWGADSGAYMFGKLFGKHKLAPKVSPGKTWQGFIGGLATAAVISWGYGMWANLDVA 
PVTLLICSIVAALASVLGDLTESMFKREAGIKDSGHLIPGHGGILDRIDSLTAAVPVFAC LLLLVFRTL >gi|296493488|gb|ADTK01000013.1| GENE 22 21917 - 23269 918 450 aa, chain + ## HITS:1 COG:ECs0178 KEGG:ns NR:ns ## COG: ECs0178 COG0750 # Protein_GI_number: 15829432 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Escherichia coli O157:H7 # 1 450 1 450 450 872 100.0 0 MLSFLWDLASFIVALGVLITVHEFGHFWVARRCGVRVERFSIGFGKALWRRTDKLGTEYV IALIPLGGYVKMLDERAEPVVPELRHHAFNNKSVGQRAAIIAAGPVANFIFAIFAYWLVF IIGVPGVRPVVGEIAANSIAAEAQIAPGTELKAVDGIETPDWDAVRLQLVDKIGDESTTI TVAPFGSDQRRDVKLDLRHWAFEPDKEDPVSSLGIRPRGPQIEPVLENVQPNSAASKAGL QAGDRIVKVDGQPLTQWVTFVMLVRDNPGKSLALEIERQGSPLSLTLIPESKPGNGKAIG FVGIEPKVIPLPDEYKVVRQYGPFNAIVEATDKTWQLMKLTVSMLGKLITGDVKLNNLSG PISIAKGAGMTAELGVVYYLPFLALISVNLGIINLFPLPVLDGGHLLFLAIEKIKGGPVS ERVQDFCYRIGSILLVLLMGLALFNDFSRL >gi|296493488|gb|ADTK01000013.1| GENE 23 23299 - 25731 2500 810 aa, chain + ## HITS:1 COG:ECs0179 KEGG:ns NR:ns ## COG: ECs0179 COG4775 # Protein_GI_number: 15829433 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein/protective antigen OMA87 # Organism: Escherichia coli O157:H7 # 1 810 1 810 810 1609 100.0 0 MAMKKLLIASLLFSSATVYGAEGFVVKDIHFEGLQRVAVGAALLSMPVRTGDTVNDEDIS NTIRALFATGNFEDVRVLRDGDTLLVQVKERPTIASITFSGNKSVKDDMLKQNLEASGVR VGESLDRTTIADIEKGLEDFYYSVGKYSASVKAVVTPLPRNRVDLKLVFQEGVSAEIQQI NIVGNHAFTTDELISHFQLRDEVPWWNVVGDRKYQKQKLAGDLETLRSYYLDRGYARFNI DSTQVSLTPDKKGIYVTVNITEGDQYKLSGVEVSGNLAGHSAEIEQLTKIEPGELYNGTK VTKMEDDIKKLLGRYGYAYPRVQSMPEINDADKTVKLRVNVDAGNRFYVRKIRFEGNDTS KDAVLRREMRQMEGAWLGSDLVDQGKERLNRLGFFETVDTDTQRVPGSPDQVDVVYKVKE RNTGSFNFGIGYGTESGVSFQAGVQQDNWLGTGYAVGINGTKNDYQTYAELSVTNPYFTV DGVSLGGRLFYNDFQADDADLSDYTNKSYGTDVTLGFPINEYNSLRAGLGYVHNSLSNMQ PQVAMWRYLYSMGEHPSTSDQDNSFKTDDFTFNYGWTYNKLDRGYFPTDGSRVNLTGKVT IPGSDNEYYKVTLDTATYVPIDDDHKWVVLGRTRWGYGDGLGGKEMPFYENFYAGGSSTV RGFQSNTIGPKAVYFPHQASNYDPDYDYECATQDGAKDLCKSDDAVGGNAMAVASLEFIT 
PTPFISDKYANSVRTSFFWDMGTVWDTNWDSSQYSGYPDYSDPSNIRMSAGIALQWMSPL GPLVFSYAQPFKKYDGDKAEQFQFNIGKTW >gi|296493488|gb|ADTK01000013.1| GENE 24 25853 - 26338 588 161 aa, chain + ## HITS:1 COG:STM0225 KEGG:ns NR:ns ## COG: STM0225 COG2825 # Protein_GI_number: 16763615 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein # Organism: Salmonella typhimurium LT2 # 1 161 1 161 161 209 90.0 2e-54 MKKWLLAAGLGLALATSAQAADKIAIVNMGSLFQQVAQKTGVSNTLENEFKGRASELQRM ETDLQAKMKKLQSMKAGSDRTKLEKDVMAQRQTFAQKAQAFEQDRARRSNEERGKLVTRI QTAVKSVANSQDIDLVVDANAVAYNSSDVKDITADVLKQVK >gi|296493488|gb|ADTK01000013.1| GENE 25 26342 - 27367 940 341 aa, chain + ## HITS:1 COG:lpxD KEGG:ns NR:ns ## COG: lpxD COG1044 # Protein_GI_number: 16128172 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase # Organism: Escherichia coli K12 # 1 341 1 341 341 650 100.0 0 MPSIRLADLAQQLDAELHGDGDIVITGVASMQSAQTGHITFMVNPKYREHLGLCQASAVV MTQDDLPFAKSAALVVKNPYLTYARMAQILDTTPQPAQNIAPSAVIDATAKLGNNVSIGA NAVIESGVELGDNVIIGAGCFVGKNSKIGAGSRLWANVTIYHEIQIGQNCLIQSGTVVGA DGFGYANDRGNWVKIPQIGRVIIGDRVEIGACTTIDRGALDDTIIGNGVIIDNQCQIAHN VVIGDNTAVAGGVIMAGSLKIGRYCMIGGASVINGHMEICDKVTVTGMGMVMRPITEPGV YSSGIPLQPNKVWRKTAALVMNIDDMSKRLKSLERKVNQQD >gi|296493488|gb|ADTK01000013.1| GENE 26 27472 - 27927 452 151 aa, chain + ## HITS:1 COG:ZfabZ KEGG:ns NR:ns ## COG: ZfabZ COG0764 # Protein_GI_number: 15799862 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases # Organism: Escherichia coli O157:H7 EDL933 # 1 151 1 151 151 295 100.0 2e-80 MTTNTHTLQIEEILELLPHRFPFLLVDRVLDFEEGRFLRAVKNVSVNEPFFQGHFPGKPI FPGVLILEAMAQATGILAFKSVGKLEPGELYYFAGIDEARFKRPVVPGDQMIMEVTFEKT RRGLTRFKGVALVDGKVVCEATMMCARSREA >gi|296493488|gb|ADTK01000013.1| GENE 27 27931 - 28719 790 262 aa, chain + ## HITS:1 COG:lpxA KEGG:ns NR:ns ## COG: lpxA COG1043 # Protein_GI_number: 16128174 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase # Organism: Escherichia coli K12 # 1 262 1 262 262 511 100.0 1e-145 MIDKSAFVHPTAIVEEGASIGANAHIGPFCIVGPHVEIGEGTVLKSHVVVNGHTKIGRDN EIYQFASIGEVNQDLKYAGEPTRVEIGDRNRIRESVTIHRGTVQGGGLTKVGSDNLLMIN AHIAHDCTVGNRCILANNATLAGHVSVDDFAIIGGMTAVHQFCIIGAHVMVGGCSGVAQD VPPYVIAQGNHATPFGVNIEGLKRRGFSREAITAIRNAYKLIYRSGKTLDEVKPEIAELA ETYPEVKAFTDFFARSTRGLIR >gi|296493488|gb|ADTK01000013.1| GENE 28 28719 - 29867 1153 382 aa, chain + ## HITS:1 COG:lpxB KEGG:ns NR:ns ## COG: lpxB COG0763 # Protein_GI_number: 16128175 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipid A disaccharide synthetase # Organism: Escherichia coli K12 # 1 382 1 382 382 743 99.0 0 MTKQRPLTIALVAGETSGDILGAGLIRALKERVPNARFVGVAGPRMQAEGCEAWYEMEEL AVMGIVEVLGRLRRLLHIRADLTKRFGELKPDVFVGIDAPDFNITLEGNLKKQGIKTIHY VSPSVWAWRQKRVFKIGRATDLVLAFLPFEKAFYDKYNVPCRFIGHTMADAMPLDPDKNA ARDVLGIPHDAHCLALLPGSRGAEVEMLSADFLKTAQLLRQTYPDLEIVVPLVNAKRREQ FERIKAEVAPDLSVHLLDGMGREAMVASDAALLASGTAALECMLAKCPMVVGYRMKPFTF WLAKRLVKTDYVSLPNLLAGRELVKELLQEECEPQKLAAALLPLLANGKTSHAMHDTFRE LHQQIRCNADEQAAQAVLELAQ >gi|296493488|gb|ADTK01000013.1| GENE 29 29864 - 30460 679 198 aa, chain + ## HITS:1 COG:ECs0185 KEGG:ns NR:ns ## COG: ECs0185 COG0164 # Protein_GI_number: 15829439 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HII # Organism: Escherichia coli O157:H7 # 1 198 1 198 198 372 98.0 1e-103 MIEFVYPHTHLVAGVDEVGRGPLVGAVVTAAVILDPARPIAGLNDSKKLSEKRRLALCEE IKEKALSWSLGRAEPHEIDELNILHATMLAMQRAVAGLHIAPEYVLIDGNRCPKLPMPAM AVVKGDSRVPEISAASILAKVTRDAEMAALDIVFPQYGFAQHKGYPTAFHLEKLAEYGAT EHHRRSFGPVKRALGLAS >gi|296493488|gb|ADTK01000013.1| GENE 30 30495 - 33977 3421 1160 aa, chain + ## HITS:1 COG:dnaE KEGG:ns NR:ns ## COG: dnaE COG0587 # Protein_GI_number: 16128177 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Escherichia coli K12 # 1 1160 1 1160 1160 
2372 99.0 0 MSEPRFVHLRVHSDYSMIDGLAKTAPLVKKAAALGMPALAITDFTNLCGLVKFYGAGHGA GIKPIVGADFNVQCDLLGDELTHLTVLAANNTGYQNLTLLISKAYQRGYGAAGPIIDRDW LIELNEGLILLSGGRMGDVGRSLLRGNSALVDECVAFYEEHFPDRYFLELIRTGRPDEES YLHAAVELAEARGLPVVATNDVRFIDSSDFDAHEIRVAIHDGFTLDDPKRPRNYSPQQYM RSEEEMCELFVDIPEALANTVEIAKRCNVTVRLGEYFLPQFPTGDMSTEDYLVKRAKEGL EERLAFLFPDEEERLKRRPEYDERLETELQVINQMGFPGYFLIVMEFIQWSKDNGVPVGP GRGSGAGSLVAYALKITDLDPLEFDLLFERFLNPERVSMPDFDVDFCMEKRDQVIEHVAD MYGRDAVSQIITFGTMAAKAVIRDVGRVLGHPYGFVDRISKLIPPDPGMTLAKAFEAEPQ LPEIYEADEEVKALIDMARKLEGVTRNAGKHAGGVVIAPTKITDFAPLYCDEEGKHPVTQ FDKSDVEYAGLVKFDFLGLRTLTIINWALEMINKRRAKNGEPPLDIAAIPLDDKKSFDML QRSETTAVFQLESRGMKDLIKRLQPDCFEDMIALVALFRPGPLQSGMVDNFIDRKHGREE ISYPDVQWQHESLKPVLEPTYGIILYQEQVMQIAQVLSGYTLGGADMLRRAMGKKKPEEM AKQRSVFAEGAEKNGINAELAMKIFDLVEKFAGYGFNKSHSAAYALVSYQTLWLKAHYPA EFMAAVMTADMDNTEKVVGLVDECWRMGLKILPPDINSGLYHFHVNDDGEIVYGIGAIKG VGEGPIEAIIEARNKGGYFRELFDLCARTDTKKLNRRVLEKLIMSGAFDRLGPHRAALMN SLGDALKAADQHAKAEAIGQADMFGVLAEEPEQIEQSYASCQPWPEQVVLDGERETLGLY LTGHPINQYLKEIERYVGGVRLKDMHPTERGKVITAAGLVVAARVMVTKRGNRIGICTLD DRSGRLEVMLFTDALDKYQQLLEKDRILIVSGQVSFDDFSGGLKMTAREVMDIDEAREKY ARGLAISLTDRQIDDQLLNRLRQSLEPHRSGTIPVHLYYQRADARARLRFGATWRVSPSD RLLNDLRGLIGSEQVELEFD >gi|296493488|gb|ADTK01000013.1| GENE 31 33990 - 34949 1322 319 aa, chain + ## HITS:1 COG:ECs0187 KEGG:ns NR:ns ## COG: ECs0187 COG0825 # Protein_GI_number: 15829441 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase alpha subunit # Organism: Escherichia coli O157:H7 # 1 319 1 319 319 619 100.0 1e-177 MSLNFLDFEQPIAELEAKIDSLTAVSRQDEKLDINIDEEVHRLREKSVELTRKIFADLGA WQIAQLARHPQRPYTLDYVRLAFDEFDELAGDRAYADDKAIVGGIARLDGRPVMIIGHQK GRETKEKIRRNFGMPAPEGYRKALRLMQMAERFKMPIITFIDTPGAYPGVGAEERGQSEA IARNLREMSRLGVPVVCTVIGEGGSGGALAIGVGDKVNMLQYSTYSVISPEGCASILWKS ADKAPLAAEAMGIIAPRLKELKLIDSIIPEPLGGAHRNPEAMAASLKAQLLADLADLDVL STEDLKNRRYQRLMSYGYA >gi|296493488|gb|ADTK01000013.1| GENE 32 35048 - 37189 1941 713 aa, chain + ## HITS:1 COG:ldcC KEGG:ns NR:ns ## COG: ldcC COG1982 # 
Protein_GI_number: 16128179 # Func_class: E Amino acid transport and metabolism # Function: Arginine/lysine/ornithine decarboxylases # Organism: Escherichia coli K12 # 1 713 1 713 713 1511 100.0 0 MNIIAIMGPHGVFYKDEPIKELESALVAQGFQIIWPQNSVDLLKFIEHNPRICGVIFDWD EYSLDLCSDINQLNEYLPLYAFINTHSTMDVSVQDMRMALWFFEYALGQAEDIAIRMRQY TDEYLDNITPPFTKALFTYVKERKYTFCTPGHMGGTAYQKSPVGCLFYDFFGGNTLKADV SISVTELGSLLDHTGPHLEAEEYIARTFGAEQSYIVTNGTSTSNKIVGMYAAPSGSTLLI DRNCHKSLAHLLMMNDVVPVWLKPTRNALGILGGIPRREFTRDSIEEKVAATTQAQWPVH AVITNSTYDGLLYNTDWIKQTLDVPSIHFDSAWVPYTHFHPIYQGKSGMSGERVAGKVIF ETQSTHKMLAALSQASLIHIKGEYDEEAFNEAFMMHTTTSPSYPIVASVETAAAMLRGNP GKRLINRSVERALHFRKEVQRLREESDGWFFDIWQPPQVDEAECWPVAPGEQWHGFNDAD ADHMFLDPVKVTILTPGMDEQGNMSEEGIPAALVAKFLDERGIVVEKTGPYNLLFLFSIG IDKTKAMGLLRGLTEFKRSYDLNLRIKNMLPDLYAEDPDFYRNMRIQDLAQGIHKLIRKH DLPGLMLRAFDTLPEMIMTPHQAWQRQIKGEVETIALEQLVGRVSANMILPYPPGVPLLM PGEMLTKESRTVLDFLLMLCSVGQHYPGFETDIHGAKQDEDGVYRVRVLKMAG >gi|296493488|gb|ADTK01000013.1| GENE 33 37246 - 37635 416 129 aa, chain + ## HITS:1 COG:ECs0189 KEGG:ns NR:ns ## COG: ECs0189 COG0346 # Protein_GI_number: 15829443 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Escherichia coli O157:H7 # 1 129 10 138 138 271 100.0 3e-73 MLGLKQVHHIAIIATDYAVSKAFYCDILGFTLQSEVYREARDSWKGDLALNGQYVIELFS FPFPPERPSRPEACGLRHLAFSVDDIDAAVAHLESHNVKCEAIRVDPYTQKRFTFFNDPD GLPLELYEQ >gi|296493488|gb|ADTK01000013.1| GENE 34 37700 - 38998 1044 432 aa, chain + ## HITS:1 COG:mesJ KEGG:ns NR:ns ## COG: mesJ COG0037 # Protein_GI_number: 16128181 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Escherichia coli K12 # 1 432 1 432 432 776 96.0 0 MTLTLNRQLLTLRQILVAFSGGLDSTVLLHQLVQWRTENPGVTLRAIHVHHGLSANADAW VRHCENICQQWQVPLVVERVQLAQEGLGIEAQARQARYQAFARTLLPGEVLVTAQHLDDQ CETFLLALKRGSGPAGLSAMAEVSEFAGTRLIRPLLARTRGELAQWALAHGLRWIEDESN 
QDDSYDRNFLRLRVVPLLQQRWPHFAEATARSAALCAEQESLLDELLADDIAHCQKPQGT LQIAPMLAMSDARRAAIIRRWLAGKNAPMPSRDALVRIWQEVALAREDASPCLRLGAFEI RRYQSQLWWIKSVTGQSETIVPWQTWLQPLELPAGLGSVQLTAGGDIRPPRADEAVSVRF KAPGLLHIVGRNGGRKLKKIWQELGVPPWLRDTTPLLFYGETLIAAAGVFVTQEGVAEGE KGVSFVWQKTLS >gi|296493488|gb|ADTK01000013.1| GENE 35 39047 - 39301 313 84 aa, chain - ## HITS:1 COG:ECs0191 KEGG:ns NR:ns ## COG: ECs0191 COG4568 # Protein_GI_number: 15829445 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Escherichia coli O157:H7 # 1 84 3 86 86 155 100.0 1e-38 MNDTYQPINCDDYDNLELACQHHLMLTLELKDGEKLQAKASDLVSRKNVEYLVVEAAGET RELRLDKITSFSHPEIGTVVVSES >gi|296493488|gb|ADTK01000013.1| GENE 36 39294 - 39494 166 66 aa, chain - ## HITS:1 COG:no KEGG:ECIAI39_0458 NR:ns ## KEGG: ECIAI39_0458 # Name: yaeP # Def: hypothetical protein # Organism: E.coli_IAI39 # Pathway: not_defined # 1 66 1 66 66 107 100.0 1e-22 MEKYCELIRKRYAEIASGDLGYVPDALGCVLKVLNEMAADDALSEAVREKAAYAAANLLV SDYVNE >gi|296493488|gb|ADTK01000013.1| GENE 37 39660 - 40205 752 181 aa, chain + ## HITS:1 COG:yaeQ KEGG:ns NR:ns ## COG: yaeQ COG4681 # Protein_GI_number: 16128183 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 181 1 181 181 326 98.0 2e-89 MALKATIYKATVNVADLDRNQFLDASLTLARHPSETQERMMLRLLAWLKYADERLQFTRG LCADDEPEAWLRNDHLGIDLWIELGLPDERRIKKACTQAAKVALFTYNSRAAQIWWQQNQ SKFVQFANLSVWYLDDEQLAKVSAFADRTMTLQATIQDGVIWLSDDKNNLEVNLTAWQQP S >gi|296493488|gb|ADTK01000013.1| GENE 38 40202 - 40624 302 140 aa, chain + ## HITS:1 COG:yaeJ KEGG:ns NR:ns ## COG: yaeJ COG1186 # Protein_GI_number: 16128184 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Escherichia coli K12 # 1 140 1 140 140 208 97.0 2e-54 MIVISRHVAIPDGELEITAIRAQGAGGQHVNKTSTAIHLRFDIRASSLPEYYKERLLAAS HHLISSDGVIVIKAQEYRSQELNREAALARLVAVIKDLTTEQKARRPTRPTRASKERRLA SKAQKSSVKAMRGKVRSGRE >gi|296493488|gb|ADTK01000013.1| GENE 39 40638 - 
41348 686 236 aa, chain + ## HITS:1 COG:ECs0194 KEGG:ns NR:ns ## COG: ECs0194 COG3015 # Protein_GI_number: 15829448 # Func_class: M Cell wall/membrane/envelope biogenesis; P Inorganic ion transport and metabolism # Function: Uncharacterized lipoprotein NlpE involved in copper resistance # Organism: Escherichia coli O157:H7 # 1 236 1 236 236 466 99.0 1e-131 MVKKAIVTAMAVISLFTLMGCNNRAEVDTLSPAQAAELKPMPQSWRGVLPCADCEGIETS LFLEKDGTWVMNERYLGAREEPSSFASYGTWARTADKLVLTDSKGEKSYYRAKGDALEML DREGNPIESQFNYTLEPAQSSLPMTPMTLRGMYFYMADAATFTDCATGKRFMVANNAELE RSYLAARGHSEKPVLLSVEGHFTLEANPDTGAPTKVLAPDTAGKFYPNKDCSSLGQ >gi|296493488|gb|ADTK01000013.1| GENE 40 41503 - 42327 410 274 aa, chain - ## HITS:1 COG:no KEGG:EcHS_A0197 NR:ns ## KEGG: EcHS_A0197 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_HS # Pathway: not_defined # 1 274 1 274 274 516 97.0 1e-145 MDKPKAYCRLLLPSFLLLSACTVDISHPDPSATAVDAEAKTWAVKFQHQSSFTEQSIKEI TAPDLKPGDLLFSSSLGVTSFGIRVFSTSSVSHVAIYLGENNVAEATGAGVQIVSLKKAM KHSDKLFVLRVPDLTPQQATEITAFANKIKDSGYNYRGIVEFIPFMVTRQMCSLNPFSAD FRQQCVSGLAKAQLSSVGEGDKKSWFCSEFVTDAFAKAGHPLTLAQSGWISPADLMHMRT GDVSAFKPETQLQYVGHLKPGIYIKAGRFVGLTQ >gi|296493488|gb|ADTK01000013.1| GENE 41 42380 - 44098 2054 572 aa, chain - ## HITS:1 COG:ECs0196 KEGG:ns NR:ns ## COG: ECs0196 COG0442 # Protein_GI_number: 15829450 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Prolyl-tRNA synthetase # Organism: Escherichia coli O157:H7 # 1 572 1 572 572 1135 99.0 0 MRTSQYLLSTLKETPADAEVISHQLMLRAGMIRKLASGLYTWLPTGVRVLKKVENIVREE MNNAGAIEVSMPVVQPADLWQESGRWEQYGPELLRFVDRGERPFVLGPTHEEVITDLIRN ELSSYKQLPLNFYQIQTKFRDEVRPRFGVMRSREFLMKDAYSFHTSQESLQETYDAMYAA YSKIFSRMGLDFRAVQADTGSIGGSASHEFQVLAQSGEDDVVFSDTSDYAANIELAEAIA PKEPRAAATQEMTLVDTPNAKTIAELVEQFNLPIEKTVKTLLVKAVEGSSFPLVALLVRG DHELNEVKAEKLPQVASPLTFATEEEIRAVVKAGPGSLGPVNMPIPVVIDRTVAAMSDFA AGANIDGKHYFGINWDRDVATPEIADIRNVVAGDPSPDGQGTLLIKRGIEVGHIFQLGTK YSEALKASVQGEDGRNQILTMGCYGIGVTRVVAAAIEQNYDERGIVWPDAIAPFQVAILP 
MNMHKSFRVQELAEKLYSELRAQGIEVLLDDRKERPGVMFADMELIGIPHTIVLGDRNLD NDDIEYKYRRNGEKQLIKTGDIVEYLVKQIKG >gi|296493488|gb|ADTK01000013.1| GENE 42 44209 - 44916 611 235 aa, chain - ## HITS:1 COG:yaeB KEGG:ns NR:ns ## COG: yaeB COG1720 # Protein_GI_number: 16128188 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 235 1 235 235 468 99.0 1e-132 MSSFQFEQIGVIRSPYKEKFAVPRQPGLVKSANGELHLMAPYNQADAVRGLEAFSHLWIL FVFHQTMEGGWRPTVRPPRLGGNARMGVFATRSTFRPNPIGMSLVELKEVVCHKDSVILK LGSLDLVDGTPVVDIKPYLPFAESLPDASASYAQSAPAAEMAVSFTAEVEKQLLTLEKRY PQLTLFIREVLAQDPRPAYRKGEETGKTYAVWLHDFNVRWRVTDAGFEVFALEPR >gi|296493488|gb|ADTK01000013.1| GENE 43 44913 - 45317 306 134 aa, chain - ## HITS:1 COG:no KEGG:EC55989_0194 NR:ns ## KEGG: EC55989_0194 # Name: rcsF # Def: outer membrane lipoprotein # Organism: E.coli_55989 # Pathway: Two-component system [PATH:eck02020] # 1 134 1 134 134 220 100.0 1e-56 MRALPICLVALMLSGCSMLSRSPVEPVQSTAPQPKAEPAKPKAPRATPVRIYTNAEELVG KPFRDLGEVSGDSCQASNQDSPPSIPTARKRMQINASKMKANAVLLHSCEVTSGTPGCYR QAVCIGSALNITAK >gi|296493488|gb|ADTK01000013.1| GENE 44 45435 - 46250 1148 271 aa, chain - ## HITS:1 COG:yaeC KEGG:ns NR:ns ## COG: yaeC COG1464 # Protein_GI_number: 16128190 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface antigen # Organism: Escherichia coli K12 # 1 271 1 271 271 505 99.0 1e-143 MAFKFKTFAAVGALIGSLALVGCGQDEKDPNHIKVGVIVGAEQQVAEVAQKVAKDKYGLD VELVTFNDYVLPNEALSKGDIDANAFQHKPYLDQQLKDRGYKLVAVGNTFVYPIAGYSKK IKSLDELQDGSQVAVPNDPTNLGRSLLLLQKVGLIKLKDGVGLLPTVLDVVENPKNLKIV ELEAPQLPRSLDDAQIALAVINTTYASQIGLTPAKDGIFVEDKDSPYVNLIVTREDNKDA ENVKKFVQAYQSDEVYEAANKVFNGGAVKGW >gi|296493488|gb|ADTK01000013.1| GENE 45 46290 - 46943 843 217 aa, chain - ## HITS:1 COG:yaeE KEGG:ns NR:ns ## COG: yaeE COG2011 # Protein_GI_number: 16128191 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, permease component # 
Organism: Escherichia coli K12 # 1 217 1 217 217 310 100.0 9e-85 MSEPMMWLLVRGVWETLAMTFVSGFFGFVIGLPVGVLLYVTRPGQIIANAKLYRTVSAIV NIFRSIPFIILLVWMIPFTRVIVGTSIGLQAAIVPLTVGAAPFIARMVENALLEIPTGLI EASRAMGATPMQIVRKVLLPEALPGLVNAATITLITLVGYSAMGGAVGAGGLGQIGYQYG YIGYNATVMNTVLVLLVILVYLIQFAGDRIVRAVTRK >gi|296493488|gb|ADTK01000013.1| GENE 46 46936 - 47967 1146 343 aa, chain - ## HITS:1 COG:ECs0201 KEGG:ns NR:ns ## COG: ECs0201 COG1135 # Protein_GI_number: 15829455 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, ATPase component # Organism: Escherichia coli O157:H7 # 1 343 1 343 343 677 100.0 0 MIKLSNITKVFHQGTRTIQALNNVSLHVPAGQIYGVIGASGAGKSTLIRCVNLLERPTEG SVLVDGQELTTLSESELTKARRQIGMIFQHFNLLSSRTVFGNVALPLELDNTPKDEIKRR VTELLSLVGLGDKHDSYPSNLSGGQKQRVAIARALASNPKVLLCDEATSALDPATTRSIL ELLKDINRRLGLTILLITHEMDVVKRICDCVAVISNGELIEQDTVSEVFSHPKTPLAQKF IQSTLHLDIPEDYQERLQAEPFTDCVPMLRLEFTGQSVDAPLLSETARRFNVNNNIISAQ MDYAGGVKFGIMLTEMHGTQQDTQAAIAWLQEHHVKVEVLGYV >gi|296493488|gb|ADTK01000013.1| GENE 47 48155 - 48727 541 190 aa, chain + ## HITS:1 COG:ECs0202 KEGG:ns NR:ns ## COG: ECs0202 COG0241 # Protein_GI_number: 15829456 # Func_class: E Amino acid transport and metabolism # Function: Histidinol phosphatase and related phosphatases # Organism: Escherichia coli O157:H7 # 1 190 1 190 191 394 100.0 1e-110 MAKSVPAIFLDRDGTINVDHGYVHEIDNFEFIDGVIDAMRELKKMGFALVVVTNQSGIAR GKFTEAQFETLTEWMDWSLADRDVDLDGIYYCPHHPQGSVEEFRQVCDCRKPHPGMLLSA RDYLHIDMAASYMVGDKLEDMQAAVAANVGTKVLVRTGKPITPEAENAADWVLNSLADLP QAIKKQQKPA Prediction of potential genes in microbial genomes Time: Mon May 16 15:01:13 2011 Seq name: gi|296493487|gb|ADTK01000014.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont32.1, whole genome shotgun sequence Length of sequence - 1433 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - SSU_RRNA 21 - 1433 99.0 # AB035920 [D:964..2505] # 16S ribosomal RNA # Escherichia 
coli O157:H7 # Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Escherichia. Prediction of potential genes in microbial genomes Time: Mon May 16 15:01:14 2011 Seq name: gi|296493486|gb|ADTK01000015.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont39.1, whole genome shotgun sequence Length of sequence - 4114 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 209 107 ## ECS88_4652 hypothetical protein - Prom 347 - 406 4.5 2 2 Tu 1 . - CDS 1753 - 2034 140 ## EcE24377A_3399 hypothetical protein + Prom 2797 - 2856 5.7 3 3 Tu 1 . + CDS 3068 - 3952 542 ## COG3596 Predicted GTPase Predicted protein(s) >gi|296493486|gb|ADTK01000015.1| GENE 1 2 - 209 107 69 aa, chain - ## HITS:1 COG:no KEGG:ECS88_4652 NR:ns ## KEGG: ECS88_4652 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 69 1 69 243 144 100.0 1e-33 MNQPIHNAYWLSRFESILNSALAQHRAVSLIRVDLRFPEYMPATIMDTDLDSAVISRFFA SLKAKIQAY >gi|296493486|gb|ADTK01000015.1| GENE 2 1753 - 2034 140 93 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_3399 NR:ns ## KEGG: EcE24377A_3399 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 93 284 376 376 186 98.0 2e-46 MFFVEIVSCCFLVELFHDIDSFFINQRLSKLESFIDKWREGKEIWIKAYLQLIHNTLGIT ETEKLVNLYNSFFTDKVVIWDESVKKVKWIYEK >gi|296493486|gb|ADTK01000015.1| GENE 3 3068 - 3952 542 294 aa, chain + ## HITS:1 COG:ECs1395 KEGG:ns NR:ns ## COG: ECs1395 COG3596 # Protein_GI_number: 15830649 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Escherichia coli O157:H7 # 12 294 10 289 290 186 36.0 4e-47 MSFSHSSLSAQVKSYLTFLPEEIRQKILEHLHGVIHYEPVIGIMGKSGTGKSSLCNAIFQ SRICATHPLNGCPRQAHRLTLQLGERRMTLVDMPGIGETPQHDQEYRELYRQLLPELDLI IWILRSDERAYAADIAMHQFLLNEGADPSRFLFVLSHADRVFPAEEWNDTEKCPSRQQEL 
SLATVTARVATLFPSSFPVLSVAAPVGWNLPALVSLMIHALPPQATSAVYSHIRGENRSE QARKHAQQTFGDAIGKSFDDAVARFSFPAWMLQLLRKTRDRIIHLLVTLWDHLF Prediction of potential genes in microbial genomes Time: Mon May 16 15:01:20 2011 Seq name: gi|296493485|gb|ADTK01000016.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont41.1, whole genome shotgun sequence Length of sequence - 7469 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 2, operones - 2 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 365 - 1369 445 ## COG3547 Transposase and inactivated derivatives - Term 1376 - 1416 2.1 2 1 Op 2 . - CDS 1448 - 1882 242 ## COG0789 Predicted transcriptional regulators - Prom 1908 - 1967 2.6 + Prom 1866 - 1925 3.0 3 2 Op 1 . + CDS 1954 - 2304 317 ## ASA_P4G088 putative mercuric transport protein 4 2 Op 2 . + CDS 2318 - 2593 296 ## COG2608 Copper chaperone 5 2 Op 3 . + CDS 2629 - 3051 317 ## Hneap_1210 MerC mercury resistance protein 6 2 Op 4 . + CDS 3103 - 4797 371 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 7 2 Op 5 . + CDS 4815 - 5177 356 ## SeSA_B0044 transcriptional regulator MerD 8 2 Op 6 . + CDS 5174 - 5410 179 ## SeSA_B0043 putative mercury resistance protein 9 2 Op 7 1/0.000 + CDS 5407 - 6114 272 ## COG2200 FOG: EAL domain 10 2 Op 8 . 
+ CDS 6189 - 7457 631 ## COG2801 Transposase and inactivated derivatives Predicted protein(s) >gi|296493485|gb|ADTK01000016.1| GENE 1 365 - 1369 445 334 aa, chain - ## HITS:1 COG:PA0445 KEGG:ns NR:ns ## COG: PA0445 COG3547 # Protein_GI_number: 15595642 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Pseudomonas aeruginosa # 3 332 2 335 338 252 42.0 9e-67 MENIALIGIDLGKNSFHIHCQDHRGKAVYRKKFTRPKLIEFLATCPATTIAMEACGGSHF MARKLEELGHFPKLISPQFVRPFVKSNKNDFVDAEAICEAASRPSMRFVQPRTESQQAMR ALHRVRESLVQDKVKTTNQMHAFLLEFGISVPRGAAVISRLSTLLEDSSLPLYLSQLLLK LQQHYHYLVEQIKDLESQLKRKLDEDEVGQRLLSIPCVGTLTASTISTEIGDGKQYASSR DFAAATGLVPRQYSTGGRTTLLGISKRGNKKIRTLLVQCARVFIQKLEHQSGKLADWVRE LLCRKSNFVVTCALANKLARIAWALTARQQTYEA >gi|296493485|gb|ADTK01000016.1| GENE 2 1448 - 1882 242 144 aa, chain - ## HITS:1 COG:SMa0281 KEGG:ns NR:ns ## COG: SMa0281 COG0789 # Protein_GI_number: 16262604 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Sinorhizobium meliloti # 4 132 6 132 134 102 47.0 3e-22 MENNLENLTIGVFAKAAGVNVETIRFYQRKGLLREPDKPYGSIRRYGEADVVRVKFVKSA QRLGFSLDEIAELLRLDDGTHCEEASSLAEHKLKDVREKMADLARMETVLSELVCACHAR KGNVSCPLIASLQGEAGLARSAMP >gi|296493485|gb|ADTK01000016.1| GENE 3 1954 - 2304 317 116 aa, chain + ## HITS:1 COG:no KEGG:ASA_P4G088 NR:ns ## KEGG: ASA_P4G088 # Name: merT # Def: putative mercuric transport protein # Organism: A.salmonicida # Pathway: not_defined # 1 116 19 134 134 207 100.0 1e-52 MSEPQNGRGALFAGGLAAILASTCCLGPLVLVALGFSGAWIGNLTVLEPYRPLFIGAALV ALFFAWKRIYRPVQACKPGEVCAIPQVRATYKLIFWIVAVLVLVALGFPYVVPFFY >gi|296493485|gb|ADTK01000016.1| GENE 4 2318 - 2593 296 91 aa, chain + ## HITS:1 COG:MA1337 KEGG:ns NR:ns ## COG: MA1337 COG2608 # Protein_GI_number: 20090198 # Func_class: P Inorganic ion transport and metabolism # Function: Copper chaperone # Organism: Methanosarcina acetivorans str.C2A # 25 85 5 65 68 57 47.0 5e-09 MKKLFASLALAAAVAPVWAATQTVTLAVPGMTCAACPITVKKALSKVEGVSKVDVGFEKR 
EAVVTFDDTKASVQKLTKATADAGYPSSVKQ >gi|296493485|gb|ADTK01000016.1| GENE 5 2629 - 3051 317 140 aa, chain + ## HITS:1 COG:no KEGG:Hneap_1210 NR:ns ## KEGG: Hneap_1210 # Name: not_defined # Def: MerC mercury resistance protein # Organism: H.neapolitanus # Pathway: not_defined # 1 138 1 138 140 235 97.0 3e-61 MGLMTRIADKTGALGSVVSAMGCAACFPALASFGAAIGLGFLSQYEGLFISRLLPLFAAL AFLANALGWFSHRQWLRSLLGMIGPAIVFAATVWLLGNWWTANLMYVGLALMIGVSIWDF VSPAHRRCGPDGCELPAKRL >gi|296493485|gb|ADTK01000016.1| GENE 6 3103 - 4797 371 564 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 100 559 6 458 458 147 25 2e-35 MSTLKITGMTCDSCAVHVKDALEKVPGVQSADVSYAKGSAKLAIEVGTSPDALTAAVAGL GYRATLADAPSVSTPGGLLDKMRDLLGRNDKTGSSGALHIAVIGSGGAAMAAALKAVEQG ARVTLIERGTIGGTCVNVGCVPSKIMIRAAHIAHLRRESPFDGGIAATTPTIQRTALLAQ QQARVDELRHAKYEGILEGNPAITVLHGSARFKDNRNLIVQLNDGGERVVAFDRCLIATG ASPAVPPIPGLKDTPYWTSTEALVSETIPKRLAVIGSSVVALELAQAFARLGAKVTILAR STLFFREDPAIGEAVTAAFRMEGIEVREHTQASQVAYINGEGDGEFVLTTAHGELRADKL LVATGRAPNTRKLALDATGVTLTPQGAIVIDPGMRTSVEHIYAAGDCTDQPQFVYVAAAA GTRAAINMTGGDAALNLTAMPAVVFTDPQVATVGYSEAEAHHDGIKTDSRTLTLDNVPRA LANFDTRGFIKLVVEEGSGRLIGVQAVAPEAGELIQTAALAIRNRMTVQELADQLFPYLT MVEGLKLAAQTFNKDVKQLSCCAG >gi|296493485|gb|ADTK01000016.1| GENE 7 4815 - 5177 356 120 aa, chain + ## HITS:1 COG:no KEGG:SeSA_B0044 NR:ns ## KEGG: SeSA_B0044 # Name: merD # Def: transcriptional regulator MerD # Organism: S.enterica_Schwarzengrund # Pathway: not_defined # 1 120 1 120 120 172 100.0 4e-42 MSAYTVSQLAHNAGVSVHIVRDYLVRGLLRPVACTTGGYGVFDDAALQRLCFVRAAFEAG IGLDALARLCRALDAADGAQAAAQLAVLRQLVERRRAALAHLDAQLASMPAERAHEEALP >gi|296493485|gb|ADTK01000016.1| GENE 8 5174 - 5410 179 78 aa, chain + ## HITS:1 COG:no KEGG:SeSA_B0043 NR:ns ## KEGG: SeSA_B0043 # Name: not_defined # Def: putative mercury resistance protein # Organism: S.enterica_Schwarzengrund # Pathway: not_defined # 1 78 1 78 78 132 100.0 4e-30 MNAPDKLPPETRQPVSGYLWGALAVLTCPCHLPILAAVLAGTTAGAFLGEHWGVAALALT 
GLFVLAVTRLLRAFRGGS >gi|296493485|gb|ADTK01000016.1| GENE 9 5407 - 6114 272 235 aa, chain + ## HITS:1 COG:AGl374gl_3 KEGG:ns NR:ns ## COG: AGl374gl_3 COG2200 # Protein_GI_number: 15890301 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 13 216 8 212 266 175 43.0 7e-44 MTSSQPAGWTAAELAQAAARGQLDLHYQPLVDLRDHRIAGAEALMRWRHPRLGLLPPGQF LPLAESFGLMPEIGAWVLGEACRQMHKWQGPAWQPFRLAINVSASQVGPTFDDEVKRVLA DMALPAELLEIELTESVAFGNPALFASFDALRAIGVRFAADDFGTGYSCLQHLKCCPITT LKIDQSFVARLPDDARDQTIVRAVIQLAHGLGMDVIFRRRLHQLIGRNGCCAASS >gi|296493485|gb|ADTK01000016.1| GENE 10 6189 - 7457 631 422 aa, chain + ## HITS:1 COG:mlr6273 KEGG:ns NR:ns ## COG: mlr6273 COG2801 # Protein_GI_number: 13475245 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Mesorhizobium loti # 21 416 15 409 535 259 38.0 1e-68 MATDTPRIPEQGVATLPDEAWERARRRAEIISPLAQSETVGHEAADMAAQALGLSRRQVY VLIRRARQGSGLVTDLVPGQSGGGKGKGRLPEPVERVIHELLQKRFLTKQKRSLAAFHRE VTQVCKAQKLRVPARNTVALRIASLDPRKVIRRREGQDAARDLQGVGGEPPAVTAPLEQV QIDHTVIDLIVVDDRDRQPIGRPYLTLAIDVFTRCVLGMVVTLEAPSAVSVGLCLVHVAC DKRPWLEGLNVEMDWQMSGKPLLLYLDNAAEFKSEALRRGCEQHGIRLDYRPLGQPHYGG IVERIIGTAMQMIHDELPGTTFSNPDQRGDYDSENKAALTLRELERWLTLAVGTYHGSVH NGLLQPPAARWAEAVARVGVPAVVTRATSFLVDFLPILRRTLTRTGFVIDHIHYYADGHC CK Prediction of potential genes in microbial genomes Time: Mon May 16 15:01:30 2011 Seq name: gi|296493484|gb|ADTK01000017.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont42.1, whole genome shotgun sequence Length of sequence - 558 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 340 - 556 212 ## c3668 hypothetical protein Predicted protein(s) >gi|296493484|gb|ADTK01000017.1| GENE 1 340 - 556 212 72 aa, chain + ## HITS:1 COG:no KEGG:c3668 NR:ns ## KEGG: c3668 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_CFT073 # Pathway: not_defined # 1 72 1 72 272 149 100.0 4e-35 MRLASRFGYANQIRRDRPLTHEELMHYVPGIFGEDKHTSRSQNYTYIPTITVLESLQREG FQPFFACQTRVR Prediction of potential genes in microbial genomes Time: Mon May 16 15:01:34 2011 Seq name: gi|296493483|gb|ADTK01000018.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont43.1, whole genome shotgun sequence Length of sequence - 10795 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 250 - 299 12.2 1 1 Op 1 2/0.000 - CDS 320 - 1714 375 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) - Prom 1796 - 1855 5.1 - Term 1931 - 1971 4.5 2 1 Op 2 . - CDS 1991 - 2944 501 ## COG4571 Outer membrane protease - Prom 3098 - 3157 3.9 3 2 Tu 1 . + CDS 2838 - 3161 99 ## gi|300897442|ref|ZP_07115865.1| conserved hypothetical protein + Term 3194 - 3232 0.2 - Term 3247 - 3290 3.1 4 3 Tu 1 . - CDS 3457 - 4218 445 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 4337 - 4396 2.2 - Term 4350 - 4390 8.1 5 4 Op 1 . - CDS 4401 - 5291 644 ## ECO111_0588 hypothetical protein 6 4 Op 2 . - CDS 5292 - 8264 2419 ## COG0457 FOG: TPR repeat 7 4 Op 3 . 
- CDS 8251 - 10488 1699 ## LF82_1481 bacteriophage N4 adsorption protein B - Prom 10554 - 10613 5.9 Predicted protein(s) >gi|296493483|gb|ADTK01000018.1| GENE 1 320 - 1714 375 464 aa, chain - ## HITS:1 COG:ECs1662 KEGG:ns NR:ns ## COG: ECs1662 COG0596 # Protein_GI_number: 15830916 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Escherichia coli O157:H7 # 1 464 31 494 494 951 99.0 0 MASQFNHWFGEEKPSPDLLCGYLSVPLKYTDTGGDASYEKKSQVKLALTKLPAKSKHKGS ILIISGGPGLPGINPYINFDWPVTNLRESWDIIGFDPRGVGQSTPTINCRQSDTETQENI TEKQQVLNKINACIHNTGAEVIRHIGSNEAVYDIDRIRQALGDKQLTAVAYSYGTQIAAL YAERFPYNVRSIVLDGVVDIDDLEDNFTWQLKQAQSYQETFDRFASWCARTKSCPLSSDR DKAITQFHELLSKLHHKPLLDSKRENISSDELISLTTDLLLWRSSWPTLATAIRQFSQGI VSNEIETALSAPIASEESSDASGVILCVDQGDEQLTPEERKSRKDALANAFPAINFDNGR SDSPDFCELWPIHSDLNKTRLKNTVLPSGLLFVAHKYDPTTPWINARKMAEKFSSPLLTI NGDGHTLALTGVNLCVDKAVVHHLITPQKIENIYCPGNSEAEIQ >gi|296493483|gb|ADTK01000018.1| GENE 2 1991 - 2944 501 317 aa, chain - ## HITS:1 COG:ECs1663 KEGG:ns NR:ns ## COG: ECs1663 COG4571 # Protein_GI_number: 15830917 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protease # Organism: Escherichia coli O157:H7 # 1 317 1 317 317 597 98.0 1e-171 MRAKLLGIVLTTPIAISSFASTETISFTPDNINADISLGTLSGKTKERVYLAEEGGRKVS QLDWKFNNAAIIKGAINWDLMPQISIGAAGWTTLGSRGGNMVDQDWMDSSNPGTWTDESR HPDTQLNYANEFDLNIKGWLLNEPNYRLGLMAGYQESRYSFTARGGSYIYSSEEGFRDDI GSFPTGERAIGYKQRFKMPYIGLTGSYRYEDFELGGTFKYSGWVEASDNDEHYDPGKRIT YRSKVKDQNYYSVAVNAGYYVTPNAKVYVEGAWNRVTNKKGNTSLYDHNDNTSDYSKNGA GIENYNFITTAGLKYTF >gi|296493483|gb|ADTK01000018.1| GENE 3 2838 - 3161 99 107 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|300897442|ref|ZP_07115865.1| ## NR: gi|300897442|ref|ZP_07115865.1| conserved hypothetical protein [Escherichia coli MS 198-1] # 1 101 1 101 119 145 93.0 9e-34 MSAFMLSGVNDIVSVEAKELIAIGVVRTIPRSFARIKVLHSIVLMIEYVFFISNLMSQSH IAPLFIFCLALNDYHNALSVFGLIICYCFMLRFYCVVFYAFVVFLSI 
>gi|296493483|gb|ADTK01000018.1| GENE 4 3457 - 4218 445 253 aa, chain - ## HITS:1 COG:ECs0598 KEGG:ns NR:ns ## COG: ECs0598 COG2207 # Protein_GI_number: 15829852 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 253 1 253 253 452 99.0 1e-127 MQLSSSEPCVVILTEKEVEVSVNNHATFTLPKNYLAAFACNNNVIELSTLNHVLIPHINR NIINDYLLFLNKNLTCVKPWSRLATPVIACHSRTPEVFRLAANHSKQQPSKPCEAELTRA LLFTVLSNFLEQSRFIALLMYILRSSVRDSVCRIIQSDIQHYWNLRIVASSLCLSPSLLK KKLKNENTSYSQIVTECRMRYAVQMLLMDNKNITQVAQLCGYSSTSYFISVFKAFYGLTP LNYLAKQRQKVMW >gi|296493483|gb|ADTK01000018.1| GENE 5 4401 - 5291 644 296 aa, chain - ## HITS:1 COG:no KEGG:ECO111_0588 NR:ns ## KEGG: ECO111_0588 # Name: ybcH # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 296 1 296 296 575 100.0 1e-163 MRKFIFVLLTLLLVSPFSFAMKGIIWQPQNRDSQVTDTQWQGLMSQLRLQGFDTLVLQWT RYGDAFTQPEQRALLFKRAAAAQQAGLKLIVGLNADPEFFMHQKQSSAALESYLNRLLAA DLQQARLWSAAPGVTPDGWYISAEIDDLNWRSEAARQPLLTWLNNAQRLISDVSAKPVYI SSFFAGNMSPDGYRQLLEQVKATGVNVWVQDGSGVDKLTAEQRERYLQASADCQSPAPAS GIVYELFVAGKGKTFTAKPKPDAEIASLLAKRSSCGKDTLYFSLRYLPVAHGILEY >gi|296493483|gb|ADTK01000018.1| GENE 6 5292 - 8264 2419 990 aa, chain - ## HITS:1 COG:nfrA KEGG:ns NR:ns ## COG: nfrA COG0457 # Protein_GI_number: 16128551 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Escherichia coli K12 # 1 990 1 990 990 1844 99.0 0 MKENNLNRVIGWSGLLLTSLLSTSALADNIGTSAEELGLSDYRHFVIYPRLDKALKAQKN NDEATAIREFEYIHQQVPDNIPLTLYLAEAYRHFGHDDRARLLLEDQLKRHPGDARLERS LAAIPVEVKSVTTVEELLAQQKACDAAPTLRCRSEVGQNALRLAQLPVARAQLNDATFAA SPEGKTLRTDLLQRAIYLKQWSQADTLYNEARQQNTLSAAERRQWFDVLLAGQLDDRILA LQSQGIFTDPQSYITYATALAYRGEKARLQHYLNENKPLFTTDAQEKSWLYLLSKYSANP VQALANYTVQFADNRQYVVGATLPVLLKEGQYDAAQKLLATLPANEMLEERYAVSVATRN KAESLRLARLLYQQEPANLTRLDQLTWQLMQNEQSREAADLLLQRYPFQGDARVSQTLMA RLASLLESHPYLATPAKVAILSKPLPLAEQRQWQSQLPGIADNCPAIVRLLGDMSPSYDA AAWNRLAKCYRDTLPGVALYAWLQAEQRQPNAWQHRAVAYQAYQVEDYATALAAWQKISL 
HDMSNEDLLAAANTAQAAGNGAARDRWLQQAEQRGLGNNALYWWLHAQRYIPGQPELALS DLTRSINIAPSANAYVARATIYRQRHNVPAAVSDLRAALELEPNNSNIQAALGYALWDSG DIAQSREMLEQAHKGLPDDPALIRQLAYVNQRLDDMPATQHYARLVIDDIDNQALITPLT PEQNQQRFNFRRLHEEVGRRWTFSFDSSIGLRSGAMSTANNNVGGAAPGKSYRSYGQLEA EYRIGRNMLLEGDLLSVYSRVFADTGENGVMMPVKNPMSGTGLRWKPLRDQIFFLAVEQQ LPLNGQNGASDTMLRASASFFNGGKYSDEWHPNGSGWFAQNLYLDAAQYIRQDIQAWTAD YRVSWHQKVANGQTIEPYAHVQDNGYRDKGTQGAQLGGVGVRWNIWTGETHYDAWPHKVS LGVEYQHTFKAINQRNGERNNAFLTIGVHW >gi|296493483|gb|ADTK01000018.1| GENE 7 8251 - 10488 1699 745 aa, chain - ## HITS:1 COG:no KEGG:LF82_1481 NR:ns ## KEGG: LF82_1481 # Name: nfrB # Def: bacteriophage N4 adsorption protein B # Organism: E.coli_LF82 # Pathway: not_defined # 1 745 1 745 745 1507 99.0 0 MDWLLDVFATWLYGLKVIAITLAVIMFISGLDDFFIDVVYWVRRIKRKLSVYRRYPRMSY RELYKPDEKPLAIMVPAWNETGVIGNMAELAVTTLDYENYHIFVGTYPNDPDTQRDVDEV CARFPNVHKVVCARPGPTSKADCLNNVLDAITQFERSANFAFAGFILHDAEDVISPMELR LFNYLVERKDLIQIPVYPFEREWTHFTSMTYIDEFSELHGKDVPVREALAGQVPSAGVGT CFSRRAVTALLADGDGIAFDVQSLTEDYDIGFRLKEKGMTEIFVRFPVVDEAKEREQRKF LQHARTSNMICVREYFPDTFSTAVRQKSRWIIGIVFQGFKTHKWTSSLTLNYFLWRDRKG AISNFVSFLAMLVMLQLLLLLAYESLWPNAWHFLSIFSGSAWLMTLLWLNFGLMVNRIVQ RVIFVTGYYGLTQGLLSVLRLFWGNLINFMANWRALKQVLQHGDPRRVAWDKTTHDFPSV TGDTRSLRPLGQILLENQVITEEQLDTALRNRVEGLRLGGSMLMQGLISAEQLAQALAEQ NGVAWESIDAWQIPSSLIAEMPASVALHYAVLPLRLENDELIVGSEDGIDPVSLAALTRK VGRKVRYVIVLRGQIVTGLRHWYARRRGHDPRAMLYNAVQHQWLTEQQTGEIWRQYVPHQ FLFAEILTTLGHINRSAINVLLLRHERSSLPLGKFLVTEGVISQETLDRVLTIQRELQVS MQSLLLKAGLNTEQVAQLESENEGE Prediction of potential genes in microbial genomes Time: Mon May 16 15:01:52 2011 Seq name: gi|296493482|gb|ADTK01000019.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont46.1, whole genome shotgun sequence Length of sequence - 585 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 56 - 584 217 ## ECO111_p4-04 putative colicin activity protein Predicted protein(s) >gi|296493482|gb|ADTK01000019.1| GENE 1 56 - 584 217 176 aa, chain + ## HITS:1 COG:no KEGG:ECO111_p4-04 NR:ns ## KEGG: ECO111_p4-04 # Name: not_defined # Def: putative colicin activity protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 176 151 326 581 297 100.0 1e-79 MMSKIVTSLPADDITESPVSSLPLDKATVNVNVRVVDDVKDERQNISVVSGVPMSVPVVD AKPTERPGVFTASIPGAPVLNISVNNSTPAVQTLSPGVTNNTDKDVRPAGFTQGGNTRDA VIRFPKDSGHNAVYVSVSDVLSPDQVKQRQDEENRRQQEWDATHPVEAAERNYERA Prediction of potential genes in microbial genomes Time: Mon May 16 15:01:56 2011 Seq name: gi|296493481|gb|ADTK01000020.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont47.1, whole genome shotgun sequence Length of sequence - 1847 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 1846 1258 ## COG3209 Rhs family protein Predicted protein(s) >gi|296493481|gb|ADTK01000020.1| GENE 1 1 - 1846 1258 615 aa, chain + ## HITS:1 COG:rhsC KEGG:ns NR:ns ## COG: rhsC COG3209 # Protein_GI_number: 16128676 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli K12 # 1 615 7 621 1397 1202 98.0 0 ARQGDMTQYGGSIVQGSAGVRIGAPTGVACSVCPGGVTSGHPVNPLLGAKVLPGETDIAL PGPLPFILSRTYSSYRTKTPAPVGSLGPGWKMPADIRLQLRDNTLILSDNGGRSLYFEHL FPGEDGYSRSESLWLVRGGVAKLDEGHRLAALWQALPEELRLSPHRYLATNSPQGPWWLL GWCERVPEADEVLPAPLPPYRVLTGLVDRFGRTQTFHREAAGEFSGEITGVTDGAGRHFR LVLTTQAQRAEEARQQAISGGTEPSAFPDTLPGYTEYGRDNGIRLSAVWLTHDPEYPENL PAAPLVRYGWTPRGELAAVYDRSNTQVRSFTYDDKYRGRMVAHRHTGRPEIRYRYDSDGR VTEQLNPAGLSYTYQYEKDRITITDSLNRREVLHTQGEAGLKRVVKKEHADGSVTQSQFD AVGRLKAQTDAAGRTTEYSPDVVTGLITRITTPDGRASAFYYNHHSQLTSATGPDGLEMR RKYDEYGRLIQETAPDGDITRYRYDNPHSDLPCATDDATGSRKTMTWSRYGQLLTFTDCS GYVTRYDHDRFGQVTAVHREEGLSQYHAYDSRGQLTAVKDTQGHETRYEYNAAGDLTTVI APDGSRNGTQYDAWG Prediction of potential genes in 
microbial genomes Time: Mon May 16 15:01:57 2011 Seq name: gi|296493480|gb|ADTK01000021.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont48.1, whole genome shotgun sequence Length of sequence - 1554 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 79 - 1476 1392 ## ECS88_5030 putative ATP-dependent Clp protease proteolytic subunit (endopeptidase Clp) Predicted protein(s) >gi|296493480|gb|ADTK01000021.1| GENE 1 79 - 1476 1392 465 aa, chain + ## HITS:1 COG:no KEGG:ECS88_5030 NR:ns ## KEGG: ECS88_5030 # Name: not_defined # Def: putative ATP-dependent Clp protease proteolytic subunit (endopeptidase Clp) # Organism: E.coli_S88 # Pathway: not_defined # 1 465 275 739 739 913 99.0 0 MPESIRNMITPPRNSAPRVQDDGPAASRTPVQAAAPVVDENSIRAQVLAEQKARVNGIND LFAMFGGRYQSLQAQCLADPECSLEQAREKLLNEMGRESTPSNKNTPAHIYAGNGNFVGD GIRQALMARAGFEKTERGNVYNGMTLREYARMSLTERGIGVSSYNPMQMVGAAFTHSTSD FGNILLDVANKAILQGWEDAPETYEQWTRKGQLSDFKIAHRVGMGGFSALRQVREGAEYK YVTTGDKQATIALATYGELFSITRQAIINDDLNMLTDVPMKLGRAAKSTIADLVYAILTS NPKISTDNVSLFDKAKHANVLESAAMDVASLDKARQLMRVQKEGERHLNIRPAFVLVPTA MESVANQVIRSSSVKGADINAGIINPVKDFATVIAEPRLDDNSQTTFYMAASKGSDTIEV AYLNGVDTPYIDQMEGFSVDGVTTKVRIDAGVAPVDHRGLVKCTA Prediction of potential genes in microbial genomes Time: Mon May 16 15:02:04 2011 Seq name: gi|296493479|gb|ADTK01000022.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont51.1, whole genome shotgun sequence Length of sequence - 1584 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 1582 715 ## COG3209 Rhs family protein Predicted protein(s) >gi|296493479|gb|ADTK01000022.1| GENE 1 1 - 1582 715 527 aa, chain - ## HITS:1 COG:Z0268 KEGG:ns NR:ns ## COG: Z0268 COG3209 # Protein_GI_number: 15799917 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli O157:H7 EDL933 # 1 527 239 765 1404 922 92.0 0 DGAGRRFHLALTTQAQRAEAFRKQRASSLSSPASPRSVSSSQVFPDTLPAGTGYGTDNGI RLEAVWLTHDPAYPDEQPTAPLARYTYTAGGELRAVYDRSGTQVRGFAYDAEHAGRMVAH HYAGRPESRYRYDDTGRVTEQVNPEGLDYRFEYGQDRVTITDSLNRREVLYTEGEGGLKR VVKKEHADGSITRSEYDEAGRLKAQTDAAGRRTEYRLHMASGAVTAVTGPDGRTVRYGYN SQRQVTSVTYPDGLRSSREYDERGRLTAETSRSGETTRYSYDDPASELPTGIQDATGSTK QMAWSRYGQLLAFTDCSGYTTRYEYDRYGQQIAVHREEGISTYSSYNPRGQLVSQKDAQG REIRYEYSAAGDLTATVSPDGKRSTIEYDKRGRPVSVTEGGLTRSMGYDAAGRITVLTNE NGSQSTFRYDPVDRLTEQRGFDGRTQRYQYDLTGKLTQSEDEGLITLWHYDASDRITRRT VNGEPAEQWQYDDHGWLTEISHLSEGHRVAVHYGYDDKGRLTGERQT Prediction of potential genes in microbial genomes Time: Mon May 16 15:02:04 2011 Seq name: gi|296493478|gb|ADTK01000023.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont52.1, whole genome shotgun sequence Length of sequence - 536 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + LSU_RRNA 1 - 536 100.0 # CP000946 [R:4420328..4423307] # 23S ribosomal RNA # Escherichia coli ATCC 8739 # Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Escherichia. Prediction of potential genes in microbial genomes Time: Mon May 16 15:02:05 2011 Seq name: gi|296493477|gb|ADTK01000024.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont58.1, whole genome shotgun sequence Length of sequence - 951 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 37 - 949 113 ## EcHS_A1637 L-shaped tail fiber protein Predicted protein(s) >gi|296493477|gb|ADTK01000024.1| GENE 1 37 - 949 113 304 aa, chain + ## HITS:1 COG:no KEGG:EcHS_A1637 NR:ns ## KEGG: EcHS_A1637 # Name: not_defined # Def: L-shaped tail fiber protein # Organism: E.coli_HS # Pathway: not_defined # 1 304 899 1202 1258 448 78.0 1e-124 MQTLINSDTKRIYVRFVVNGNWTAWSQVVVSGWNQDVTVRSLTSTTPSKLGGGRIDVLGS TSDYGSMNCTVRGVDSTGTNSAWSVGTSESTGKMLFLKNHRSSAQVLLNGDDGAVQLLSG TVNGATAQALTINKDEVNSTADLVIRKQTGTGNRFALLNSGNSELPVSIRVWGSSTRQNV FEVGTSAAYLFYAQKTTDGQNLTVNGSVNCTTLNQSSDRRLKENIEIIDNATDAIRKING YTYTLKENGAHCAGVIAQEVEEAIPEAVGSFIHYGEELQGPTVDGNELREETRYLNVDYA AVTG Prediction of potential genes in microbial genomes Time: Mon May 16 15:02:10 2011 Seq name: gi|296493476|gb|ADTK01000025.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont77.1, whole genome shotgun sequence Length of sequence - 3268 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 139 - 2655 1430 ## ECO26_2902 hypothetical protein 2 1 Op 2 . 
+ CDS 2731 - 3186 329 ## ECED1_4883 hypothetical protein + Term 3212 - 3251 3.1 Predicted protein(s) >gi|296493476|gb|ADTK01000025.1| GENE 1 139 - 2655 1430 838 aa, chain + ## HITS:1 COG:no KEGG:ECO26_2902 NR:ns ## KEGG: ECO26_2902 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 838 1 838 838 1639 99.0 0 MLQIVGALILLIAGFAILRLLFRALISTASALAGLILLCLFGPALLAGYITERITRLFHI RWLAGVFLTIAGMIISFMWGLDGKHIALEAHTFDSVKFILTTALAGGLLAVPLQIKNIQQ NGITPEDISKEINGYYCCFYTAFFLMACSACAPLIALQYDISPSLMWWGGLLYWLAALVT LLWAASQIQALKKLTCAISQTLEEQPVLNSKSWLTSLQNDYSLPDSLTERIWLTLISQRI SRGELREFELADGNWLLNNAWYERNMAGFNEQLKENLSFTPDELKTLFRNRLNLSPEAND DFLDRCLDGGDWYPFSEGRRFVSFHHVDELRICASCGLTEVHHAPENHKPDPEWYCSSLC RETETLCQEIYERPYNSFISDATANGLILMKLPETWSTNEKMFASGGQGHGFAAERGNHI VDRVRLKNARILGDNNARNGADRLVSGTEIQTKYCSTAARSVGAAFDGQNGQYRYMGNNG PMQLEVPRDQYAGAVETMRNKIREGKVPGVTDPAEASRLIRRGHLTYTQARNITRFGTIE SVTYDIAEGSVVSLAAGGISFALTASVFWLSTGDRDAALQTAAVQAGKTFTRTLAVYVTT QQLHRLSVVQGMLKHIDFSTASPTVRLALQKGTGAGNISALNKVMKGTLVTSLALVAVTT GPDMIKMLRGRISGTQFIRNLAVASSGVAGGAVGSVAGGILFSPLGPFGALTGRVVGGVL GGMIASAVSGKIAGALVEEDRVKILAMIQEQVTWLAGSFLLTGHEIENLNENLARVIDQN ALEIIFAAGIQQRAATNMLIKPLVVSIIRQRPVMEYDASHLGNMVNRLEEALPPELPA >gi|296493476|gb|ADTK01000025.1| GENE 2 2731 - 3186 329 151 aa, chain + ## HITS:1 COG:no KEGG:ECED1_4883 NR:ns ## KEGG: ECED1_4883 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ED1a # Pathway: not_defined # 1 151 1 151 151 310 100.0 1e-83 MIHLFKTCMITAFILGLTWSAPLRAQDQRYISIRNTDTIWLPGNICAYQFRLDNGGNDEG FGPLTITLQLKDKYGQTLVTRKMETEAFGDSNATRTTDAFLETECVENVATTEIIKATEE SNGHRVSLPLSVFNPQDYHPLLITVSGKNVN Prediction of potential genes in microbial genomes Time: Mon May 16 15:02:23 2011 Seq name: gi|296493475|gb|ADTK01000026.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont81.1, whole genome shotgun sequence Length of sequence - 5461 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S 
Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 238 - 486 195 ## RB2501_10797 rrf2 family protein (putative transcriptional regulator) - Prom 736 - 795 5.7 2 2 Tu 1 . - CDS 1076 - 1342 154 ## gi|300902071|ref|ZP_07120085.1| dihydrofolate reductase 3 3 Op 1 9/0.000 - CDS 1672 - 1953 198 ## COG1109 Phosphomannomutase 4 3 Op 2 . - CDS 2040 - 2855 401 ## COG0294 Dihydropteroate synthase and related enzymes - Prom 2965 - 3024 4.3 5 4 Tu 1 . - CDS 3096 - 3662 250 ## EcSMS35_3187 putative immunoglobuling-binding protein - Prom 3784 - 3843 7.7 + Prom 3631 - 3690 3.9 6 5 Tu 1 . + CDS 3755 - 3937 63 ## gi|300824720|ref|ZP_07104826.1| hypothetical protein HMPREF9346_04588 + Term 4151 - 4186 2.2 Predicted protein(s) >gi|296493475|gb|ADTK01000026.1| GENE 1 238 - 486 195 82 aa, chain - ## HITS:1 COG:no KEGG:RB2501_10797 NR:ns ## KEGG: RB2501_10797 # Name: not_defined # Def: rrf2 family protein (putative transcriptional regulator) # Organism: R.biformata # Pathway: not_defined # 1 77 67 143 144 75 44.0 5e-13 MEEAKIRTISIEDIVLVIDGEVAFTMCVLGLKKCSEVNPCPVHHKYKNIKADLKEMIKTT TVFDMVDQVNDGHSFLKIFELE >gi|296493475|gb|ADTK01000026.1| GENE 2 1076 - 1342 154 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|300902071|ref|ZP_07120085.1| ## NR: gi|300902071|ref|ZP_07120085.1| dihydrofolate reductase [Escherichia coli MS 84-1] # 1 88 108 195 195 175 100.0 9e-43 MEGCLKHYENEDKRTVFIIGGGQIYEEALEKNRVDEMFITFVDHTFGADTFFPSIDFSLW NEEVLRVHEADSKNAYNFTVKKFTKKLS >gi|296493475|gb|ADTK01000026.1| GENE 3 1672 - 1953 198 93 aa, chain - ## HITS:1 COG:AGl2256 KEGG:ns NR:ns ## COG: AGl2256 COG1109 # Protein_GI_number: 15891239 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 86 9 94 458 126 67.0 9e-30 MVRKYFGTDGIRGKANEGAMTAETALRVGMAAGRVFRRGDHRHRVVIGKDTRLSGYMLEP ALTAGFTSMGMDVFLFGPLPTTCFSLNSYLFPI >gi|296493475|gb|ADTK01000026.1| GENE 4 2040 - 2855 401 271 aa, chain - ## HITS:1 COG:RSc1527 KEGG:ns NR:ns ## COG: RSc1527 COG0294 # 
Protein_GI_number: 17546246 # Func_class: H Coenzyme transport and metabolism # Function: Dihydropteroate synthase and related enzymes # Organism: Ralstonia solanacearum # 6 265 25 282 291 162 38.0 8e-40 MNKSLIIFGIVNITSDSFSDGGRYLAPDAAIAQARKLMAEGADVIDLGPASSNPDAAPVS SDTEIARIAPVLDALKADGIPVSLDSYQPATQAYALSRGVAYLNDIRGFPDAAFYPQLAK SSAKLVVMHSVQDGQADRREAPAGDIMDHIAAFFDARIAALTGAGIKRNRLVLDPGMGFF LGAAPETSLSVLARFDELRLRFDLPVLLSVSRKSFLRALTGRGPGDVGAATLAAELAAAA GGADFIRTHEPRPLRDGLAVLAALKETARIR >gi|296493475|gb|ADTK01000026.1| GENE 5 3096 - 3662 250 188 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_3187 NR:ns ## KEGG: EcSMS35_3187 # Name: not_defined # Def: putative immunoglobuling-binding protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 174 1 135 369 99 42.0 7e-20 MLKRRDAFLKKSALAVSVALLLSAQAQAVLTGPVDANSSSLLIGENSFITNSTGTANNTF LLGGGAFNMDSPGSLQFGSFSGVYNSPHSVTLGRDAGQAESKYGVAIGKSAEVLNSQQSV AIGGWAGIENSSGSVALGHGSQVSGENNVVSVGAGPEGYGESVKGAPETHSIVNKTFLSF IFNGLFSC >gi|296493475|gb|ADTK01000026.1| GENE 6 3755 - 3937 63 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|300824720|ref|ZP_07104826.1| ## NR: gi|300824720|ref|ZP_07104826.1| hypothetical protein HMPREF9346_04588 [Escherichia coli MS 119-7] # 6 60 1 55 55 99 98.0 6e-20 MPGILLCIICDLNLSSRKRLALKSLNYKHIQLCKGGYNKPDYFNVCIPALLISQNNKNPG Prediction of potential genes in microbial genomes Time: Mon May 16 15:02:39 2011 Seq name: gi|296493474|gb|ADTK01000027.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont81.2, whole genome shotgun sequence Length of sequence - 3486 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 630 - 689 2.5 1 1 Op 1 . + CDS 752 - 943 169 ## EC55989_4843 hypothetical protein 2 1 Op 2 . + CDS 1012 - 1260 116 ## EC55989_4844 putative conjugation inhibition protein, immunity protein 3 2 Tu 1 . - CDS 1278 - 1562 84 ## EcSMS35_3195 hypothetical protein - Term 1768 - 1797 -0.9 4 3 Tu 1 . 
- CDS 2027 - 2263 268 ## COG3311 Predicted transcriptional regulator 5 4 Tu 1 . - CDS 2332 - 2904 453 ## EcSMS35_3197 hypothetical protein Predicted protein(s) >gi|296493474|gb|ADTK01000027.1| GENE 1 752 - 943 169 63 aa, chain + ## HITS:1 COG:no KEGG:EC55989_4843 NR:ns ## KEGG: EC55989_4843 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 48 1 48 57 65 83.0 6e-10 MVTLTINRKPQGIFSKSPATPLQDKTDTAYKMKAGDQIPQLNDKQKPADKTPWRHDKTPE QKP >gi|296493474|gb|ADTK01000027.1| GENE 2 1012 - 1260 116 82 aa, chain + ## HITS:1 COG:no KEGG:EC55989_4844 NR:ns ## KEGG: EC55989_4844 # Name: not_defined # Def: putative conjugation inhibition protein, immunity protein # Organism: E.coli_55989 # Pathway: not_defined # 3 82 35 114 114 132 90.0 3e-30 MGIPDDLIQDIAIRELAFGAGTLHAAVASYVQSPRYYRALIAGGARYNLNGQPCGEVTPQ EQKEAETRLMMLNDRRKDRKPR >gi|296493474|gb|ADTK01000027.1| GENE 3 1278 - 1562 84 94 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_3195 NR:ns ## KEGG: EcSMS35_3195 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 94 157 250 250 179 100.0 4e-44 MRCLLLLSQTSICHPRVSATVDEECSQRVDLLQQTWRVISADGQCRVERCFQAARGDTSG QYIALKTVALSLGLPVVTGITHRPVQRCTLITAQ >gi|296493474|gb|ADTK01000027.1| GENE 4 2027 - 2263 268 78 aa, chain - ## HITS:1 COG:Z1188 KEGG:ns NR:ns ## COG: Z1188 COG3311 # Protein_GI_number: 15800709 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Escherichia coli O157:H7 EDL933 # 1 65 1 64 65 114 90.0 3e-26 MLTSMTGHDSVLLRADDPLIDMNYITSFTGMTDKWFYKLISEGHFPKPIKLGRSSRWYKS EVEQWMQQRIEESRGAAA >gi|296493474|gb|ADTK01000027.1| GENE 5 2332 - 2904 453 190 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_3197 NR:ns ## KEGG: EcSMS35_3197 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 190 1 190 190 394 98.0 1e-108 MNAIPYFDYSMAPFWPSYQNKIIGVLNRVLREQSCSRIRRILLRLPGEHGNIFSDRKTWF 
GMAFIETVSALMNATPGRDLCWLLTRHPEKPEYHVVLCVRQEYFDGPELDRLILDAWSNV LGFASPGEAAPYQKQITRDVVLDSCSPDCEERLKELIWAFSDFARDRRGVHDPEARYLAS NPWYPVAGQL Prediction of potential genes in microbial genomes Time: Mon May 16 15:02:47 2011 Seq name: gi|296493473|gb|ADTK01000028.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont83.1, whole genome shotgun sequence Length of sequence - 779 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 322 - 348 -1.0 1 1 Tu 1 . - CDS 356 - 778 152 ## gi|301307324|ref|ZP_07213334.1| hypothetical protein HMPREF9347_05893 Predicted protein(s) >gi|296493473|gb|ADTK01000028.1| GENE 1 356 - 778 152 140 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|301307324|ref|ZP_07213334.1| ## NR: gi|301307324|ref|ZP_07213334.1| hypothetical protein HMPREF9347_05893 [Escherichia coli MS 124-1] # 1 140 20 159 159 283 100.0 2e-75 KNARVAKCDGSEFFLELPSMNADFPAGKIILKLGDSGFYNKRTKSLEGAYGLRHIWDKHR VEIGATSAEDIVIFLESILLAGAEVLIDPKKGQNKAIVVESGTGMMILELKKPNGEDPYY SIITAYDRKSHPGTKLHTLI Prediction of potential genes in microbial genomes Time: Mon May 16 15:02:55 2011 Seq name: gi|296493472|gb|ADTK01000029.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont86.1, whole genome shotgun sequence Length of sequence - 292 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 28 - 243 254 ## pECS88_0050 single-stranded DNA-binding protein Predicted protein(s) >gi|296493472|gb|ADTK01000029.1| GENE 1 28 - 243 254 71 aa, chain + ## HITS:1 COG:no KEGG:pECS88_0050 NR:ns ## KEGG: pECS88_0050 # Name: ssb # Def: single-stranded DNA-binding protein # Organism: E.coli_S88 # Pathway: DNA replication [PATH:ecz03030]; Mismatch repair [PATH:ecz03430]; Homologous recombination [PATH:ecz03440] # 1 68 109 176 210 105 92.0 4e-22 MQMLGRAAGAQTQPEEGQQFSGQPQPEPQAEAGTKKGGAKTKGRGRKAAQPEPQPQPPEG DDYGFSDDIPF Prediction of potential genes in microbial genomes Time: Mon May 16 15:02:57 2011 Seq name: gi|296493471|gb|ADTK01000030.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont86.2, whole genome shotgun sequence Length of sequence - 179 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 179 80 ## pECS88_0051 hypothetical protein Predicted protein(s) >gi|296493471|gb|ADTK01000030.1| GENE 1 2 - 179 80 59 aa, chain + ## HITS:1 COG:no KEGG:pECS88_0051 NR:ns ## KEGG: pECS88_0051 # Name: yubL # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 59 10 68 86 120 100.0 1e-26 MSEYFRILQGLPDGPFTRKHAEAVAAQYRNVFIEDDHGEQFRLVVRNNGAMVWRTWNFE Prediction of potential genes in microbial genomes Time: Mon May 16 15:03:00 2011 Seq name: gi|296493470|gb|ADTK01000031.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont86.3, whole genome shotgun sequence Length of sequence - 2038 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 242 - 376 75 ## SeHA_A0053 ParB-like partition protein 2 1 Op 2 . + CDS 431 - 865 441 ## EcE24377A_E0012 plasmid SOS inhibition protein B 3 1 Op 3 . 
+ CDS 862 - 1581 467 ## ECO111_p3-49 plasmid SOS inhibition protein A + Term 1703 - 1736 0.7 - Term 1304 - 1348 -0.9 4 2 Tu 1 . - CDS 1593 - 1817 81 ## APECO1_O1CoBM19 hypothetical protein + Prom 1629 - 1688 3.1 5 3 Tu 1 . + CDS 1803 - 2015 93 ## APECO1_O1CoBM20 modulator of post-segregation killing protein Predicted protein(s) >gi|296493470|gb|ADTK01000031.1| GENE 1 242 - 376 75 44 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0053 NR:ns ## KEGG: SeHA_A0053 # Name: not_defined # Def: ParB-like partition protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 44 609 652 652 97 100.0 1e-19 MKKGDAAEHAEHHMKDNRWVPGWMCAPHPQTDTTERTDNLADAA >gi|296493470|gb|ADTK01000031.1| GENE 2 431 - 865 441 144 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_E0012 NR:ns ## KEGG: EcE24377A_E0012 # Name: psiB # Def: plasmid SOS inhibition protein B # Organism: E.coli_E24377A # Pathway: not_defined # 1 143 1 143 144 283 98.0 1e-75 MKTELTLNVLQTMNTQEYEDIRAAGSDERRELTHAVMRELDAPDNWTMNGEYGSEFGGFF PVQVRFTPAHERFHLALCSPGDVSQVWVLVLVNAGGEPFAVVQVQRRFAPEAVSHSLALA ASLDAQGYSVSDIIHILMAEGSQA >gi|296493470|gb|ADTK01000031.1| GENE 3 862 - 1581 467 239 aa, chain + ## HITS:1 COG:no KEGG:ECO111_p3-49 NR:ns ## KEGG: ECO111_p3-49 # Name: not_defined # Def: plasmid SOS inhibition protein A # Organism: E.coli_O111_H- # Pathway: not_defined # 1 239 1 239 239 414 97.0 1e-114 MSARSQALVPLSTEQQAAWRAVAETEKRRHQGNTLAEYPYAGAFFRCLNGSRRISLSDLR FFMPSLTAEELHGNRLQWLYAIDVLIETQGEVCLLPLPGDAAERLFPSVRFRVRERSRHK SALVMQKYSRQQAREAEQKARAYQALVAQAEIELAFHSPETVGSWYARWSDRVAEHDLET LFWQWGERFPSLAGMERWQWQDMPFWQVIAEAGMAARVVGHAVREMERWMVPNKLREAA >gi|296493470|gb|ADTK01000031.1| GENE 4 1593 - 1817 81 74 aa, chain - ## HITS:1 COG:no KEGG:APECO1_O1CoBM19 NR:ns ## KEGG: APECO1_O1CoBM19 # Name: sok # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 74 1 74 74 134 100.0 1e-30 MWTRHRDASWWLMKINLLRGYLLSATQHGNKPPSRHEAESLKRRAHHSPYTCTLTTLTFP ENNPLIQTVHGKSV >gi|296493470|gb|ADTK01000031.1| GENE 5 1803 - 2015 93 70 aa, chain + ## 
HITS:1 COG:no KEGG:APECO1_O1CoBM20 NR:ns ## KEGG: APECO1_O1CoBM20 # Name: mok # Def: modulator of post-segregation killing protein # Organism: E.coli_APEC # Pathway: not_defined # 1 70 1 70 70 133 100.0 2e-30 MSSPHQDSLLPRFAQGEEGHETTTKFPCLVCVDRVSHTVDIHLSDTKIAVRDSLQRRTQG GGGFHGLRIR Prediction of potential genes in microbial genomes Time: Mon May 16 15:03:12 2011 Seq name: gi|296493469|gb|ADTK01000032.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont86.4, whole genome shotgun sequence Length of sequence - 1855 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 35 - 1853 1422 ## COG1475 Predicted transcriptional regulators Predicted protein(s) >gi|296493469|gb|ADTK01000032.1| GENE 1 35 - 1853 1422 606 aa, chain + ## HITS:1 COG:PSLT068 KEGG:ns NR:ns ## COG: PSLT068 COG1475 # Protein_GI_number: 17233437 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Salmonella typhimurium LT2 # 1 606 1 610 665 779 66.0 0 MPVTKCEPETTRKASRKSAKTQETALSALLAQTEEVSVPLASLIKSPLNVRTVPYSVESV SELAGSIKGVGLLQNLVVHTLPGDRYGVAAGGRRLAALNMLAERNILPADWPVRVKVIPQ ELATAASMTENGHRRDMHPAEQIAGFRALAQEGKTPAQTGDLLGYSPRHVQRMLKLADLA PVILDALAEDRITTEHCQALALESDTARQVQVFEAACQSGWGGKPDVRVIRNLITESEVA VKDNTKFRFVGADAFSPDELRTDLFSDDGDGYVDRVALDAALLEKLQAVAEFLREAEGWE WCAGRMEPVGECREDAGTYRCLPEPEAVLTEAEDERLNELMTRYDALENQCEESDLLEAE MKLMRCMAKVRAWTPEMRAGSGVVVSWRYGNVCVQRGVQLRSEDDAADDADRTEQVQEKA SVEEISLPLLTKMSSERTLAVQAALMQQPDKSLALLAWTLCLNVFGSGAYSKPVQISLEC KHYSLTSDAPSGKEGAAFLVLMAEKARLAALLPEGWSRDMTTFLSLSQEVLLSLLSFCTA CSLNGVQTRECGHTSRSPLDSLESAIGFHMRDWWQPTKANFFGHLKKPQIIAALNDAGLS GAARDA Prediction of potential genes in microbial genomes Time: Mon May 16 15:03:13 2011 Seq name: gi|296493468|gb|ADTK01000033.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont86.5, whole genome shotgun sequence Length of sequence - 1959 bp Number of predicted genes - 3, with homology - 3 Number 
of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 6 - 440 292 ## SeHA_A0054 plasmid SOS inhibition protein B 2 1 Op 2 . + CDS 437 - 1156 487 ## ECSE_P1-0060 plasmid SOS inhibition protein A 3 1 Op 3 . + CDS 1153 - 1749 313 ## SeHA_A0056 hypothetical protein + Term 1875 - 1920 3.1 Predicted protein(s) >gi|296493468|gb|ADTK01000033.1| GENE 1 6 - 440 292 144 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0054 NR:ns ## KEGG: SeHA_A0054 # Name: psiB # Def: plasmid SOS inhibition protein B # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 144 1 144 144 294 100.0 7e-79 MKTELTLNALQSMNAQEYEDIRAAGSDMRRNLTHEVMREVDAPANWMMNGEYGSEFGGFF PVQVRFTPAHERFHLALCSPGDVSQLWMLVLVNCGGQPFAVVQVQHIFTPVAISHTLALA ATLDAQGYSVNDIIHILMAEGGQA >gi|296493468|gb|ADTK01000033.1| GENE 2 437 - 1156 487 239 aa, chain + ## HITS:1 COG:no KEGG:ECSE_P1-0060 NR:ns ## KEGG: ECSE_P1-0060 # Name: not_defined # Def: plasmid SOS inhibition protein A # Organism: E.coli_SE11 # Pathway: not_defined # 1 239 1 239 239 430 99.0 1e-119 MSARSRALIPLSAEQQAAMQAVAVTEQRRRQGRTLSAWPYASAFFRCLNGSRRISLTDLR FFAPALTKEEFHGNRLLWLAAVDKLIESFGEVCVLPLPSDAGHRLFPSVPFREGERRRQK TTLTEQKYSRQREREAERRELEYQTCFAQAQIDLAFHTPATVGSWLSRWSGVVEEHDLET IFWGWCGRFPSLSSFDRFFWQEEPLWRLIFEAGEAGRGAPVQVRALEQWMIPNKLENAI >gi|296493468|gb|ADTK01000033.1| GENE 3 1153 - 1749 313 198 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0056 NR:ns ## KEGG: SeHA_A0056 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 198 1 198 198 378 100.0 1e-104 MMKSDEQYQVPVWMRPLLPLLCNTGGNDPEELLNDTETTASANIVRYVLIVAVRSQIDLL QLLYRKGLLRTEIPGGFSPEEAQELLDNLVRSHISKALSGERMAARDRNADLTWIRQQLV DAAWFVRATLEAHGMGVGNESPSAPPETMPDIQTRELVMLIKRLASSLKAVKPDSSVVRE AQDWLCDRKLVDITDILR Prediction of potential genes in microbial genomes Time: Mon May 16 15:03:24 2011 Seq name: gi|296493467|gb|ADTK01000034.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont97.1, whole genome shotgun sequence 
Length of sequence - 10951 bp Number of predicted genes - 17, with homology - 16 Number of transcription units - 8, operones - 5 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 840 - 1121 162 ## SbBS512_E0896 hypothetical bacteriophage protein 2 1 Op 2 . - CDS 1133 - 1675 256 ## COG4220 Phage DNA packaging protein, Nu1 subunit of terminase - Term 1783 - 1818 -1.0 3 2 Op 1 . - CDS 1877 - 2260 328 ## ECIAI39_0515 hypothetical protein, putative DNA-binding domain 4 2 Op 2 . - CDS 2272 - 2613 331 ## ECIAI39_0514 putative head decoration protein from prophage 5 2 Op 3 . - CDS 2623 - 3663 882 ## ECIAI39_0513 putative major head protein from prophage - Prom 3697 - 3756 5.5 6 3 Op 1 . - CDS 3881 - 4303 291 ## COG0629 Single-stranded DNA-binding protein 7 3 Op 2 . - CDS 4300 - 4584 190 ## Z1844 hypothetical protein - Prom 4610 - 4669 2.4 - Term 4779 - 4818 1.3 8 4 Op 1 . - CDS 4900 - 6654 824 ## COG3378 Predicted ATPase 9 4 Op 2 . - CDS 6651 - 6950 252 ## ECIAI39_2021 hypothetical protein 10 4 Op 3 . - CDS 6968 - 7189 130 ## gi|300902122|ref|ZP_07120127.1| hypothetical protein HMPREF9536_00315 11 4 Op 4 . - CDS 7190 - 7381 175 ## ECED1_1789 hypothetical protein 12 4 Op 5 . - CDS 7381 - 7566 212 ## Z1840 hypothetical protein + Prom 7356 - 7415 4.5 13 5 Tu 1 . + CDS 7565 - 7768 189 ## + Term 7787 - 7830 9.6 + Prom 7864 - 7923 6.0 14 6 Tu 1 . + CDS 7947 - 8252 210 ## ECO26_2325 hypothetical protein + Term 8491 - 8530 -0.7 - Term 8440 - 8475 -0.6 15 7 Tu 1 . - CDS 8586 - 9164 468 ## COG3617 Prophage antirepressor - Term 9178 - 9208 1.0 16 8 Op 1 . - CDS 9221 - 9478 189 ## COG3311 Predicted transcriptional regulator 17 8 Op 2 . 
- CDS 9480 - 9665 134 ## SbBS512_E2187 site-specific recombinase, phage integrase family - Prom 9822 - 9881 3.8 Predicted protein(s) >gi|296493467|gb|ADTK01000034.1| GENE 1 840 - 1121 162 93 aa, chain - ## HITS:1 COG:no KEGG:SbBS512_E0896 NR:ns ## KEGG: SbBS512_E0896 # Name: not_defined # Def: hypothetical bacteriophage protein # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 1 93 1 93 93 132 100.0 3e-30 MTESEILRLIRRACGISKQHDEQATQPDSVTADNYVRVVAEVMRRDGIELNGVDMRNIRT RVLELLAYRRRSQQRRESAKNTYQWKKPERLRR >gi|296493467|gb|ADTK01000034.1| GENE 2 1133 - 1675 256 180 aa, chain - ## HITS:1 COG:nohB KEGG:ns NR:ns ## COG: nohB COG4220 # Protein_GI_number: 16128543 # Func_class: L Replication, recombination and repair # Function: Phage DNA packaging protein, Nu1 subunit of terminase # Organism: Escherichia coli K12 # 6 167 3 181 181 107 37.0 1e-23 MKSHLMNKKNMAQSCRVSATAFDKWGVTPVERKGREAFYDVASVIDNRVSNAINQITDDK GEIDDDELLRVRIRLLTAQAEAQELKNERERGDVIDTAFCIYVLSKLASQISSIMDSLPL AMTRKFPDMKPSMLDGLKKEVIRACNACAKLDENIPLMLSDYLMETAGNVPDKLQPNKDK >gi|296493467|gb|ADTK01000034.1| GENE 3 1877 - 2260 328 127 aa, chain - ## HITS:1 COG:no KEGG:ECIAI39_0515 NR:ns ## KEGG: ECIAI39_0515 # Name: not_defined # Def: hypothetical protein, putative DNA-binding domain # Organism: E.coli_IAI39 # Pathway: not_defined # 1 127 1 127 127 229 94.0 2e-59 MQNHYNDLKPIAEMMYPDPAVEELKAIADKMRLSERLVDMNQVMELTTLSRRTLLNLEAR GEFPERVQVTEGRKAWYLSEVIDWINNIPRSSEYCRVPVPKKPDAALCLKIERVRRNARD GRYKLIG >gi|296493467|gb|ADTK01000034.1| GENE 4 2272 - 2613 331 113 aa, chain - ## HITS:1 COG:no KEGG:ECIAI39_0514 NR:ns ## KEGG: ECIAI39_0514 # Name: not_defined # Def: putative head decoration protein from prophage # Organism: E.coli_IAI39 # Pathway: not_defined # 1 113 1 113 113 188 98.0 4e-47 MATHYTELMAGTEALVTTLGIFSANKGVIPAFTPLMQEDATGALVVWDGSSVGKAVYVSA VQIDTAKKTQAQVYKTGVLNVDALNWPESVKELSAKVAAFVGSGISVQPLARV >gi|296493467|gb|ADTK01000034.1| GENE 5 2623 - 3663 882 346 aa, chain - ## HITS:1 COG:no KEGG:ECIAI39_0513 NR:ns ## KEGG: ECIAI39_0513 
# Name: not_defined # Def: putative major head protein from prophage # Organism: E.coli_IAI39 # Pathway: not_defined # 1 346 59 404 404 689 98.0 0 MVDLYSPTQLVQVANAEDVQKKLNALFTSLFFTRSVMFESRDIILDTIDDPNIPIAAFCS PMVGSKVSRDEGYESKTIRLGYMKPKSSIDPNKLAVRPAGVSPEQYNAFGARNIKVKQAI VNQAKAIRARIEWLAVQAITTGKNIIEGDGIERYELDWNIKPQNIITQSGGTEWSGKDKE TFDPNDDIESYAEFSEGVTNIIIMGGNVWKKYRSFRAIKEALDTRRGSNSELETALKDLG DSVSFKGYMGDVAIVVYSGRYTDEDGTEKHFLDPDLMVLGNTALQGIVAYGGIQDPELIR MGLTKAELAPKNYIVPGDPAIEYVQTHSAPQPIPARINRFVTVRIG >gi|296493467|gb|ADTK01000034.1| GENE 6 3881 - 4303 291 140 aa, chain - ## HITS:1 COG:ECs1587 KEGG:ns NR:ns ## COG: ECs1587 COG0629 # Protein_GI_number: 15830841 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Escherichia coli O157:H7 # 1 140 1 136 136 164 68.0 6e-41 MTAQIAAYGRLVADPQLKTTSKGTQMAMASMAVPLPCSQADDGTATMWLYVLAFGRQADA LAKHHKGELVSVAGNMQVSQWTGQNGETRRGWQVIADSVISARTARPGGKKGQQGQATDA LNRAKQQSGNDDPYGDNIPF >gi|296493467|gb|ADTK01000034.1| GENE 7 4300 - 4584 190 94 aa, chain - ## HITS:1 COG:no KEGG:Z1844 NR:ns ## KEGG: Z1844 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157 # Pathway: not_defined # 12 91 1 80 83 108 85.0 5e-23 MPAENMNQGRQMTKIRRDRTQAKYKALDMTELSLKVAIKAIDHHTRAGYAKEHPDLISAF MTTAAANFATLTEREIAEAEQVTTINVKTGEQTA >gi|296493467|gb|ADTK01000034.1| GENE 8 4900 - 6654 824 584 aa, chain - ## HITS:1 COG:YPO0880_2 KEGG:ns NR:ns ## COG: YPO0880_2 COG3378 # Protein_GI_number: 16121187 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Yersinia pestis # 190 561 1 371 409 383 52.0 1e-106 MKLAPNVKQQSRGIKHKGTEVIIFAGSDAWAHAKQWQEHDARMAGDNEPPVWLGEQQLSE LDNLQIVPEGRKSARIYRAGYLAPVMIKAIGQKLAAAGVQDANFYPEGMHGQEVQNWREY LARERQNLSDGLVIELPVKQKAQLSQMADSERAQLLADRFDGVCVHPESEIVHVWRGGVW CPVSTMELSREMVAIYSEHRATFSKRVINNAVEALKVIAEPMGEPSGDLLPFANGALDLK TGEFSPHTPENWITTHNGIEYTPPAPGENIRDNAPNFHKWLEHAAGKDPRKMMRICAALY MIMANRYDWQMFIEATGDGGSGKSTFTHIASLLAGKQNTVSAEMTSLDDAGGRAQVVGSR 
LIVLADQPKYTGEGTGIKKITGGDPVEINPKYEKRFTAVIRAVVLATNNNPMIFTERAGG VARRRVIFRFDNIVSEAEKDRELPEKIAAEIPVIIRRLLANFTDPEKARALLLEQRDGDE ALAIKQQTDPVIEFCQFLNFLEEARGLMMGGGGDSVKYTTRNSLYRVYLAFMAYAGRSKP LNVAEFSKAMKPAAKVYGHEYITRKVKGVTQTNAITTDDCDAFL >gi|296493467|gb|ADTK01000034.1| GENE 9 6651 - 6950 252 99 aa, chain - ## HITS:1 COG:no KEGG:ECIAI39_2021 NR:ns ## KEGG: ECIAI39_2021 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_IAI39 # Pathway: not_defined # 1 99 1 99 99 147 77.0 2e-34 MEMKNSGFIASGPARPEFMNGDIYRDKYGGTVTIKGVAERRITYRREGYSYDCVMPVYQF RRDFSLVYAAPRSKPISREKAWGNIQKMKTMINGFRGKK >gi|296493467|gb|ADTK01000034.1| GENE 10 6968 - 7189 130 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|300902122|ref|ZP_07120127.1| ## NR: gi|300902122|ref|ZP_07120127.1| hypothetical protein HMPREF9536_00315 [Escherichia coli MS 84-1] # 1 73 1 73 73 102 100.0 9e-21 MNTEREVFFKLLACAESSLTLNNSAKAILNMWLDCINDNEDANIAYGLLSLIDESAEKLN DAINSALLSNKSS >gi|296493467|gb|ADTK01000034.1| GENE 11 7190 - 7381 175 63 aa, chain - ## HITS:1 COG:no KEGG:ECED1_1789 NR:ns ## KEGG: ECED1_1789 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ED1a # Pathway: not_defined # 1 59 1 59 63 85 79.0 5e-16 MQEITLHEAAERAHQTEIICRLLEVYPNKITDADISALASLLARLSGSVASFLIEEESKL VGD >gi|296493467|gb|ADTK01000034.1| GENE 12 7381 - 7566 212 61 aa, chain - ## HITS:1 COG:no KEGG:Z1840 NR:ns ## KEGG: Z1840 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157 # Pathway: not_defined # 1 61 1 61 61 94 96.0 2e-18 MLKTFRVFARAVNPIGHTIGIAQNVKAVNVQTAIAAVRSESSEYGLSQVIISAVYELKEV H >gi|296493467|gb|ADTK01000034.1| GENE 13 7565 - 7768 189 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTHSLADFCSEYSNTTGRATATRFAFRPGGYVDDLRFSGTQCQKTPHKGRLCVYKVWYMI YSNHNGS >gi|296493467|gb|ADTK01000034.1| GENE 14 7947 - 8252 210 101 aa, chain + ## HITS:1 COG:no KEGG:ECO26_2325 NR:ns ## KEGG: ECO26_2325 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 101 1 101 101 149 98.0 4e-35 
MATANKNAKSQLTTVRVPLDVMQGMESVKLDGESNAGFIVTAMRGEIARRQAEGSGENPL VSSLDALAKVEQIGIKAAEEIGQLVTVAREELQRRKVKEHE >gi|296493467|gb|ADTK01000034.1| GENE 15 8586 - 9164 468 192 aa, chain - ## HITS:1 COG:Z1818_1 KEGG:ns NR:ns ## COG: Z1818_1 COG3617 # Protein_GI_number: 15801289 # Func_class: K Transcription # Function: Prophage antirepressor # Organism: Escherichia coli O157:H7 EDL933 # 24 132 10 118 188 130 55.0 1e-30 MNKNIAVTGKGYARPVKKFCDIRDLVVLRFDSVNVRVVYLNGDPWFVAKDVCAALELTNS RTALQMLDDDEKGVNLTYTPGGNQNMRIISESGFYKLIARSRKATTPGTFAHRFSNWVFR NVIPGIRKTGTYGIPWGALQDFSRRKEQYQISASEKGRELQACKRKKRELEEEEKTLIRE YQPEFYFGNRIQ >gi|296493467|gb|ADTK01000034.1| GENE 16 9221 - 9478 189 85 aa, chain - ## HITS:1 COG:ECs1575 KEGG:ns NR:ns ## COG: ECs1575 COG3311 # Protein_GI_number: 15830829 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Escherichia coli O157:H7 # 15 68 1 54 62 60 55.0 6e-10 MAIYSLVDENDLRTMKDIDRFIREKECIALTTLANSTRWKMEQAGKFPRRIKIGERAAGY RLSEVQAWIRGEWHPGWKPGKTKQQ >gi|296493467|gb|ADTK01000034.1| GENE 17 9480 - 9665 134 61 aa, chain - ## HITS:1 COG:no KEGG:SbBS512_E2187 NR:ns ## KEGG: SbBS512_E2187 # Name: not_defined # Def: site-specific recombinase, phage integrase family # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 16 61 1 46 46 72 97.0 3e-12 MGADPYIVELLLGHKVKGVAGVYNKSRHIKKKLEVLNMWVNYLNTIAGFNNNVIELNKEV V Prediction of potential genes in microbial genomes Time: Mon May 16 15:03:55 2011 Seq name: gi|296493466|gb|ADTK01000035.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont98.1, whole genome shotgun sequence Length of sequence - 2685 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 175 - 234 7.1 1 1 Tu 1 . + CDS 258 - 1325 242 ## EC55989_1387 hypothetical protein 2 2 Tu 1 . - CDS 1267 - 1887 193 ## EC55989_1386 putative phage regulatory protein - Prom 1924 - 1983 5.6 + Prom 1697 - 1756 2.7 3 3 Tu 1 . 
+ CDS 1831 - 2040 91 ## - Term 2220 - 2268 4.3 4 4 Tu 1 . - CDS 2357 - 2542 146 ## ECO26_0865 putative lipoprotein Rz1 precursor Predicted protein(s) >gi|296493466|gb|ADTK01000035.1| GENE 1 258 - 1325 242 355 aa, chain + ## HITS:1 COG:no KEGG:EC55989_1387 NR:ns ## KEGG: EC55989_1387 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 355 1 355 355 703 99.0 0 MSQHQYYPQLKWKPAEYESLMLLDQTTLSGFTPIITIPDIDWDYENECYKKSLSSYLSDF GINLAASWKANRPVLLDVKYLDKHGSSRHHPLDMCIQDARVNGKEIIPVVSPAYSTNYIH AVQRNLINGLAISITPQTWHQFTSLVNHLNIHPSLIDVIIDFGDIQNATDSLKQQALSMV NTLSGQAPWRNLILSSTAYPASQAGIPQHQVHHIPRHEYDLWMYVVQNFSNGRTPSFSDY PTASSTITSVDPRFMSQYVSVRYSNDTSWIFVKGTAVKGNGWGQTKNLCTTLVSSPEYQV FGSKFSWGDDYIYQRSLGANKSGGSKEWRKVAHTHHITLVVRQLYWLAQTQPAKP >gi|296493466|gb|ADTK01000035.1| GENE 2 1267 - 1887 193 206 aa, chain - ## HITS:1 COG:no KEGG:EC55989_1386 NR:ns ## KEGG: EC55989_1386 # Name: not_defined # Def: putative phage regulatory protein # Organism: E.coli_55989 # Pathway: not_defined # 1 206 1 206 206 398 98.0 1e-110 MKDQDVRFAVHHKLLKESHLDPDCLVVDEFSISLGASRADIAVINGVIHGYELKSEYDSL ERLPLQIKHYSSVMDKVTLVVAEKHLEGALKLIPGWWGVKTVSVGPKGAILIKHMRGEKL NRNHNSLMLAQLLWKDECIDVLERWGYSKGIKSKPRFELWNIIAENIPIANLRLEVRTAL KKRVGWKVKAWQAESAPTNKAVSRLT >gi|296493466|gb|ADTK01000035.1| GENE 3 1831 - 2040 91 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRFFQKLMMHREPNILIFHDFSLLPPLLNYSALHFTAFERFDHFNHKMLICSNKIHSHTT MVITFCLEN >gi|296493466|gb|ADTK01000035.1| GENE 4 2357 - 2542 146 61 aa, chain - ## HITS:1 COG:no KEGG:ECO26_0865 NR:ns ## KEGG: ECO26_0865 # Name: not_defined # Def: putative lipoprotein Rz1 precursor # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 61 1 61 61 92 95.0 3e-18 MLTRLSKVYVLMFLLGVSACKSPPPVQSQRPEPAAWAMEKAQDLQQMLNSIITVSEVEST R Prediction of potential genes in microbial genomes Time: Mon May 16 15:04:09 2011 Seq name: gi|296493465|gb|ADTK01000036.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont101.1, whole genome shotgun sequence Length of sequence - 1793 bp 
Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 74 - 562 518 ## COG4574 Serine protease inhibitor ecotin 2 2 Tu 1 . + CDS 1454 - 1717 155 ## PROTEIN SUPPORTED gi|167855983|ref|ZP_02478730.1| 50S ribosomal protein L31 type B Predicted protein(s) >gi|296493465|gb|ADTK01000036.1| GENE 1 74 - 562 518 162 aa, chain - ## HITS:1 COG:eco KEGG:ns NR:ns ## COG: eco COG4574 # Protein_GI_number: 16130146 # Func_class: R General function prediction only # Function: Serine protease inhibitor ecotin # Organism: Escherichia coli K12 # 1 162 1 162 162 320 99.0 9e-88 MKTILPAVLFAAFATTSAWAAESVQPLEKIAPYPQAEKGMKRQVIQLTPQEDESTLKVEL LIGQTLEVDCNLHRLGGKLESKTLEGWGYDYYVFDKVSSPVSTMMACPDGKKEKKFVTAY LGDAGMLRYNSKLPIVVYTPDNVDVKYRVWKAEEKIDNAVVR >gi|296493465|gb|ADTK01000036.1| GENE 2 1454 - 1717 155 87 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855983|ref|ZP_02478730.1| 50S ribosomal protein L31 type B [Haemophilus parasuis 29755] # 4 81 15 92 94 64 37 7e-11 MHTNWQVCSLVVQAKSERISDISTQLNAFPGCEVAVSDAPSGQLIVVVEAEDSETLIQTI ESVRNVEGVLAVSLVYHQQEEQGEETP Prediction of potential genes in microbial genomes Time: Mon May 16 15:04:14 2011 Seq name: gi|296493464|gb|ADTK01000037.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont101.2, whole genome shotgun sequence Length of sequence - 11375 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 1, operones - 1 average op.length - 13.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 10/0.000 + CDS 2 - 2434 2948 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing 2 1 Op 2 7/0.000 + CDS 2441 - 3136 468 ## COG1145 Ferredoxin 3 1 Op 3 4/0.000 + CDS 3123 - 3986 686 ## COG0348 Polyferredoxin 4 1 Op 4 7/0.000 + CDS 3962 - 4432 437 ## COG3043 Nitrate reductase cytochrome c-type subunit 5 1 Op 5 3/0.000 + CDS 4442 - 5044 526 ## COG3005 Nitrate/TMAO reductases, membrane-bound 
tetraheme cytochrome c subunit 6 1 Op 6 14/0.000 + CDS 5057 - 5680 546 ## COG4133 ABC-type transport system involved in cytochrome c biogenesis, ATPase component 7 1 Op 7 14/0.000 + CDS 5680 - 6339 753 ## COG2386 ABC-type transport system involved in cytochrome c biogenesis, permease component 8 1 Op 8 9/0.000 + CDS 6381 - 7118 764 ## COG0755 ABC-type transport system involved in cytochrome c biogenesis, permease component 9 1 Op 9 9/0.000 + CDS 7115 - 7324 281 ## COG3114 Heme exporter protein D 10 1 Op 10 16/0.000 + CDS 7321 - 7800 667 ## COG2332 Cytochrome c-type biogenesis protein CcmE 11 1 Op 11 11/0.000 + CDS 7797 - 9740 2249 ## COG1138 Cytochrome c biogenesis factor 12 1 Op 12 5/0.000 + CDS 9737 - 10294 669 ## COG0526 Thiol-disulfide isomerase and thioredoxins 13 1 Op 13 . + CDS 10291 - 11343 1207 ## COG4235 Cytochrome c biogenesis factor Predicted protein(s) >gi|296493464|gb|ADTK01000037.1| GENE 1 2 - 2434 2948 810 aa, chain + ## HITS:1 COG:napA KEGG:ns NR:ns ## COG: napA COG0243 # Protein_GI_number: 16130143 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 4 810 22 828 828 1704 99.0 0 AAAGLSVPGVARAVVGQQEAIKWDKAPCRFCGTGCGVLVGTQQGRVVACQGDPDAPVNRG LNCIKGYFLPKIMYGKDRLTQPLLRMKNGKYDKEGEFTPITWDQAFDVMEEKFKTALKEK GPESIGMFGSGQWTIWEGYAASKLFKAGFRSNNIDPNARHCMASAVVGFMRTFGMDEPMG CYDDIEQADAFVLWGANMAEMHPILWSRITNRRLSNQNVTVAVLSTYQHRSFELADNGII FTPQSDLVILNYIANYIIQNNAINQDFFSKHVNLRKGATDIGYGLRPTHPLEKAAKNPGS DASEPMSFEDYKAFVAEYTLEKTAEMTGVPKDQLEQLAQLYADPNKKVISYWTMGFNQHT RGVWANNLVYNLHLLTGKISQPGCGPFSLTGQPSACGTAREVGTFAHRLPADMVVTNEKH RDICEKKWNIPSGTIPAKIGLHAVAQDRALKDGKLNVYWTMCTNNMQAGPNINEERMPGW RDPRNFIIVSDPYPTVSALAADLILPTAMWVEKEGAYGNAERRTQFWRQQVQAPGEAKSD LWQLVQFSRRFKTEEVWPEELLAKKPELRGKTLYEVLYATPEVSKFPVSELAEDQLNDES RELGFYLQKGLFEEYAWFGRGHGHDLAPFDDYHKARGLRWPVVNGKETQWRYSEGNDPYV KAGEGYKFYGKPDGKAVIFALPFEPAAEAPDEEYDLWLSTGRVLEHWHTGSMTRRVPELH 
RAFPEAVLFIHPLDAKARDLRRGDKVKVVSRRGEVISIVETRGRNRPPQGLVYMPFFDAA QLVNKLTLDATDPLSKETDFKKCAVKLEKV >gi|296493464|gb|ADTK01000037.1| GENE 2 2441 - 3136 468 231 aa, chain + ## HITS:1 COG:napG KEGG:ns NR:ns ## COG: napG COG1145 # Protein_GI_number: 16130142 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Escherichia coli K12 # 1 231 1 231 231 449 100.0 1e-126 MSRSAKPQNGRRRFLRDVVRTAGGLAAVGVALGLQQQTARASGVRLRPPGAINENAFASA CVRCGQCVQACPYDTLKLATLASGLSAGTPYFVARDIPCEMCEDIPCAKVCPSGALDREI ESIDDARMGLAVLVDQENCLNFQGLRCDVCYRECPKIDEAITLELERNTRTGKHARFLPT VHSDACTGCGKCEKVCVLEQPAIKVLPLSLAKGELGHHYRFGWLEGNNGKS >gi|296493464|gb|ADTK01000037.1| GENE 3 3123 - 3986 686 287 aa, chain + ## HITS:1 COG:napH KEGG:ns NR:ns ## COG: napH COG0348 # Protein_GI_number: 16130141 # Func_class: C Energy production and conversion # Function: Polyferredoxin # Organism: Escherichia coli K12 # 1 287 1 287 287 549 100.0 1e-156 MANRKRDAGREALEKKGWWRSHRWLVLRRLCQFFVLGMFLSGPWFGVWILHGNYSSSLLF DTVPLTDPLMTLQSLASGHLPATVALTGAVIITVLYALAGKRLFCSWVCPLNPITDLANW LRRRFDLNQSATIPRHIRYVLLVVILVGSALTGTLIWEWINPVSLMGRSLVMGFGSGALL ILALFLFDLLVVEHGWCGHICPVGALYGVLGSKGVITVAATDRQKCNRCMDCFHVCPEPH VLRAPVLDEQSPVQVTSRDCMTCGRCVDVCSEDVFTITTRWSSGAKS >gi|296493464|gb|ADTK01000037.1| GENE 4 3962 - 4432 437 156 aa, chain + ## HITS:1 COG:ECs3092 KEGG:ns NR:ns ## COG: ECs3092 COG3043 # Protein_GI_number: 15832346 # Func_class: C Energy production and conversion # Function: Nitrate reductase cytochrome c-type subunit # Organism: Escherichia coli O157:H7 # 1 156 1 156 156 318 99.0 3e-87 MEFGSEIMKSHDLKKALCQWTAMLALVVSGAVWAANGVDFSQSPEVSGTQEGAIRMPKEQ ERMPLNYVNQPPMIPHSVEGYQVTTNTNRCLQCHGVESYRTTGAPRISPTHFMDSDGKVG AEVAPRRYFCLQCHVPQADTAPIVGNTFTPSKGYGK >gi|296493464|gb|ADTK01000037.1| GENE 5 4442 - 5044 526 200 aa, chain + ## HITS:1 COG:ECs3091 KEGG:ns NR:ns ## COG: ECs3091 COG3005 # Protein_GI_number: 15832345 # Func_class: C Energy production and conversion # Function: Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c 
subunit # Organism: Escherichia coli O157:H7 # 1 200 1 200 200 429 100.0 1e-120 MGNSDRKPGLIKRLWKWWRTPSRLALGTLLLIGFVGGIVFWGGFNTGMEKANTEEFCISC HEMRNTVYQEYMDSVHYNNRSGVRATCPDCHVPHEFVPKMIRKLKASKELYGKIFGVIDT PQKFEAHRLTMAQNEWRRMKDNNSQECRNCHNFEYMDTTAQKSVAAKMHDQAVKDGQTCI DCHKGIAHKLPDMREVEPGF >gi|296493464|gb|ADTK01000037.1| GENE 6 5057 - 5680 546 207 aa, chain + ## HITS:1 COG:ccmA KEGG:ns NR:ns ## COG: ccmA COG4133 # Protein_GI_number: 16130138 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome c biogenesis, ATPase component # Organism: Escherichia coli K12 # 3 207 1 205 205 375 100.0 1e-104 MGMLEARELLCERDERTLFSGLSFTLNAGEWVQITGSNGAGKTTLLRLLTGLSRPDAGEV LWQGQPLHQVRDSYHQNLLWIGHQPGIKTRLTALENLHFYHRDGDTAQCLEALAQAGLAG FEDIPVNQLSAGQQRRVALARLWLTRATLWILDEPFTAIDVNGVDRLTQRMAQHTEQGGI VILTTHQPLNVAESKIRRISLTQTRAA >gi|296493464|gb|ADTK01000037.1| GENE 7 5680 - 6339 753 219 aa, chain + ## HITS:1 COG:ECs3089 KEGG:ns NR:ns ## COG: ECs3089 COG2386 # Protein_GI_number: 15832343 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome c biogenesis, permease component # Organism: Escherichia coli O157:H7 # 1 219 2 220 220 280 100.0 2e-75 MFWRIFRLELRVAFRHSAEIANPLWFFLIVITLFPLSIGPEPQLLARIAPGIIWVAALLS SLLALERLFRDDLQDGSLEQLMLLPLPLPAVVLAKVMAHWMVTGLPLLILSPLVAMLLGM DVYGWQVMALTLLLGTPTLGFLGAPGVALTVGLKRGGVLLSILVLPLTIPLLIFATAAMD AASMHLPVDGYLAILGALLAGTATLSPFATAAALRISIQ >gi|296493464|gb|ADTK01000037.1| GENE 8 6381 - 7118 764 245 aa, chain + ## HITS:1 COG:ECs3088 KEGG:ns NR:ns ## COG: ECs3088 COG0755 # Protein_GI_number: 15832342 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome c biogenesis, permease component # Organism: Escherichia coli O157:H7 # 1 245 1 245 245 417 100.0 1e-116 MWKTLHQLAIPPRLYQICGWFIPWLAIASVVVLTVGWIWGFGFAPADYQQGNSYRIIYLH 
VPAAIWSMGIYASMAVAAFIGLVWQMKMANLAVAAMAPIGAVFTFIALVTGSAWGKPMWG TWWVWDARLTSELVLLFLYVGVIALWHAFDDRRLAGRAAGILVLIGVVNLPIIHYSVEWW NTLHQGSTRMQQSIDPAMRSPLRWSIFGFLLLSATLTLMRMRNLILLMEKRRPWVSELIL KRGRK >gi|296493464|gb|ADTK01000037.1| GENE 9 7115 - 7324 281 69 aa, chain + ## HITS:1 COG:ECs3087 KEGG:ns NR:ns ## COG: ECs3087 COG3114 # Protein_GI_number: 15832341 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Heme exporter protein D # Organism: Escherichia coli O157:H7 # 1 69 1 69 69 90 100.0 7e-19 MTPAFASWNEFFAMGGYAFFVWLAVVMTVIPLVVLVVHSVMQHRAILRGVAQQRAREARL RAAQQQEAA >gi|296493464|gb|ADTK01000037.1| GENE 10 7321 - 7800 667 159 aa, chain + ## HITS:1 COG:ECs3086 KEGG:ns NR:ns ## COG: ECs3086 COG2332 # Protein_GI_number: 15832340 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c-type biogenesis protein CcmE # Organism: Escherichia coli O157:H7 # 1 159 1 159 159 321 100.0 3e-88 MNIRRKNRLWIACAVLAGLALTIGLVLYALRSNIDLFYTPGEILYGKRETQQMPEVGQRL RVGGMVMPGSVQRDPNSLKVTFTIYDAEGSVDVSYEGILPDLFREGQGVVVQGELEKGNH ILAKEVLAKHDENYTPPEVEKAMEANHRRPASVYKDPAS >gi|296493464|gb|ADTK01000037.1| GENE 11 7797 - 9740 2249 647 aa, chain + ## HITS:1 COG:ECs3085 KEGG:ns NR:ns ## COG: ECs3085 COG1138 # Protein_GI_number: 15832339 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c biogenesis factor # Organism: Escherichia coli O157:H7 # 1 647 1 647 647 1143 100.0 0 MMPEIGNGLLCLALGIALLLSVYPLWGVARGDARMMASSRLFAWLLFMSVAGAFLVLVNA FVVNDFTVTYVASNSNTQLPVWYRVAATWGAHEGSLLLWVLLMSGWTFAVAIFSQRIPLD IVARVLAIMGMVSVGFLLFILFTSNPFSRTLPNFPIEGRDLNPLLQDPGLIFHPPLLYMG YVGFSVAFAFAIASLLSGRLDSTYARFTRPWTLAAWIFLTLGIVLGSAWAYYELGWGGWW FWDPVENASFMPWLVGTALMHSLAVTEQRASFKAWTLLLAISAFSLCLLGTFLVRSGVLV SVHAFASDPARGMFILAFMVLVIGGSLLLFAARGHKVRSRVNNALWSRESLLLANNVLLV AAMLVVLLGTLLPLVHKQLGLGSISIGEPFFNTMFTWLMVPFALLLGVGPLVRWGRDRPR KIRNLLIIAFISTLVLSLLLPWLFESKVVAMTVLGLAMACWIAVLAIAEAALRISRGTKT 
TFSYWGMVAAHLGLAVTIVGIAFSQNYSVERDVRMKSGDSVDIHEYRFTFRDVKEVTGPN WRGGVATIGVTREGKPETVLYAEKRYYNTAGSMMTEAAIDGGITRDLYAALGEELENGAW AVRLYYKPFVRWIWAGGLMMALGGLLCLFDPRYRKRVSPQKTAPEAV >gi|296493464|gb|ADTK01000037.1| GENE 12 9737 - 10294 669 185 aa, chain + ## HITS:1 COG:ECs3084 KEGG:ns NR:ns ## COG: ECs3084 COG0526 # Protein_GI_number: 15832338 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Escherichia coli O157:H7 # 1 185 1 185 185 378 100.0 1e-105 MKRKVLLIPLIIFLAIAAALLWQLARNAEGDDPTNLESALIGKPVPKFRLESLDNPGQFY QADVLTQGKPVLLNVWATWCPTCRAEHQYLNQLSAQGIRVVGMNYKDDRQKAISWLKELG NPYALSLFDGDGMLGLDLGVYGAPETFLIDGNGIIRYRHAGDLNPRVWEEEIKPLWEKYS KEAAQ >gi|296493464|gb|ADTK01000037.1| GENE 13 10291 - 11343 1207 350 aa, chain + ## HITS:1 COG:ccmH_2 KEGG:ns NR:ns ## COG: ccmH_2 COG4235 # Protein_GI_number: 16130131 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c biogenesis factor # Organism: Escherichia coli K12 # 130 350 1 221 221 430 100.0 1e-120 MRFLLGVLMLMISGSALATIDVLQFKDEAQEQQFRQLTEELRCPKCQNNSIADSNSMIAT DLRQKVYELMQEGKSKKEIVDYMVARYGNFVTYDPPLTPLTVLLWVLPVVAIGIGGWVIY ARSRRRVRVVPEAFPEQSVPEGKRAGYVVYLPGIVVALIVAGVSYYQTGNYQQVKIWQQA TAQAPALLDRALDPKADPLNEEEMSRLALGMRTQLQKNPGDIEGWIMLGRVGMALGNASI ATDAYATAYRLDPKNSDAALGYAEALTRSSDPNDNRLGGELLRQLVRTDHSNIRVLSMYA FNAFEQQRFGEAVAAWEMMLKLLPANDTRRAVIERSIAQAMQHLSPQESK Prediction of potential genes in microbial genomes Time: Mon May 16 15:04:17 2011 Seq name: gi|296493463|gb|ADTK01000038.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont101.3, whole genome shotgun sequence Length of sequence - 6938 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 14 - 661 839 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain - Prom 797 - 856 3.8 + Prom 1156 - 1215 2.8 2 2 Tu 1 . + CDS 1239 - 3350 1478 ## COG3468 Type V secretory pathway, adhesin AidA + Term 3365 - 3400 4.2 - TRNA 3452 - 3528 78.0 # Pro GGG 0 0 - Term 3407 - 3435 -0.1 3 3 Op 1 8/0.000 - CDS 3603 - 5372 1615 ## COG3083 Predicted hydrolase of alkaline phosphatase superfamily 4 3 Op 2 . - CDS 5392 - 5619 303 ## COG3082 Uncharacterized protein conserved in bacteria - Prom 5691 - 5750 4.6 + Prom 5676 - 5735 6.9 5 4 Tu 1 . + CDS 5801 - 6808 1217 ## COG3081 Nucleoid-associated protein Predicted protein(s) >gi|296493463|gb|ADTK01000038.1| GENE 1 14 - 661 839 215 aa, chain - ## HITS:1 COG:ECs3082 KEGG:ns NR:ns ## COG: ECs3082 COG2197 # Protein_GI_number: 15832336 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 215 1 215 215 385 99.0 1e-107 MPEATPFQVMIVDDHPLMRRGVRQLLELDPGFEVVAEAGDGASAIDLANRLDIDVILLDL NMKGMSGLDTLNALRRDGVTAQIIILTVSDASSDVFALIDAGADGYLLKDSDPEVLLEAI RAGAKGSKVFSERVNQYLREREMFGAEEDPFSVLTERELDVLHELAQGLSNKQIASVLNI SEQTVKVHIRNLLRKLNVRSRVAATILFLQQRGAQ >gi|296493463|gb|ADTK01000038.1| GENE 2 1239 - 3350 1478 703 aa, chain + ## HITS:1 COG:yejO KEGG:ns NR:ns ## COG: yejO COG3468 # Protein_GI_number: 16130127 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 37 703 169 836 836 1117 94.0 0 MNGTAEYSVLNDGYQIVQMGGAANQTTLNNGVLQVYGAANDPTIKGGRLIVEKDGITVLA AIEKGGLLEVKEGGLAIAVDQKAGGKLIVSTNALEVSGTNSKGQFSIKDGVSKNYELDDG SGLIVMEDTQAIDTILDEHATMQSLGKDTGTRVQANAVYDLGRSDQNGSITYSSKAISEN MVINNGRANVWAGTMVNVSVRGNDGILEVMKPQINYAPAMLVGKVVVSEGASFRTHGAVD TSKADVSLENSVWTIIADITTTNQNTLLNLANLAMSDANVIMMDEPVTRSSVTASAENFI 
TLTTNTLSGNGNFYMRTDMANHQSDQLNVTGQATGDFKIFVTDTGASPAAGDSLTLVTTG GGDAAFTLGNAGGVVDIGTYEYTLLDNGNHSWSLAENRAQITPSTTDVLNMAAAQPLVFD AELDTVRERLGSVKGVSYDTAMWSSAINTRNNVTTDAGAGFEQTLTGLTLGIDSRFSREE SSTIRGLFFGYSHSDIGFDRGGKGNIDSYTLGAYAGWEHQNGAYVDGVVKVDRFANTIHG KMSNGATAFGDYNSNGAGAHVESGFRWVDGLWSVRPYLAFTGFTTDGQDYTLSNGMRADV GNTRILRAEAGTAVSYHMDLQNGTTLEPWLKAAVRQEYADSNQVKVNDDGKFNNDVAGAR GVYQAGIRSSFTPTLSGHLSVSYGNGAGVESPWNTQAGVVWTF >gi|296493463|gb|ADTK01000038.1| GENE 3 3603 - 5372 1615 589 aa, chain - ## HITS:1 COG:ECs3080 KEGG:ns NR:ns ## COG: ECs3080 COG3083 # Protein_GI_number: 15832334 # Func_class: R General function prediction only # Function: Predicted hydrolase of alkaline phosphatase superfamily # Organism: Escherichia coli O157:H7 # 1 589 1 586 586 1166 99.0 0 MVTHRQRYREKVSQMVSWGHWFALFNILLSLVIGSRYLFIADWPTTLAGRIYSYVSIIGH FSFLVFATYLLILFPLTFIVGSQRLMRFLSVILATAGMTLLLIDSEVFTRFHLHLNPIVW QLVINPDENEMARDWQLMFISVPVILLLELVFATWSWQKLRSLTRRRRFARPLAAFLFIA FIAFIASHVVYIWADANFYRPITMQRANLPLSYPMTARRFLEKHGLLDAQEYQRRLIEQG NPDAVSVQYPLSELRYRDMGTGQNVLLITVDGLNYSRFEKQMPALAGFAEQNISFTRHMS SGNTTDNGIFGLFYGISPSYMDGILSTRTPAALITALNQQGYQLGLFSSDGFTSPLYRQA LLSDFSMPSVRTQSDEQTATQWINWLGRYAQEDNRWFSWVSFNGTNIDDSNQQAFARKYS RAAGNVDDQINRVLNALRDSGKLDNTVVIITAGRGIPLSEEEETFDWSHGHLQVPLVIHW PGTPAQRINALTDHTDLMTTLMQRLLHVSTPASEYSQGQDLFNPQRRHYWVTAADNDTLA ITTPKKTLVLNNNGKYRTYNLRGERVKDEKPQLSLLLQVLTDEKRFIAN >gi|296493463|gb|ADTK01000038.1| GENE 4 5392 - 5619 303 75 aa, chain - ## HITS:1 COG:ECs3079 KEGG:ns NR:ns ## COG: ECs3079 COG3082 # Protein_GI_number: 15832333 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 75 1 75 75 120 100.0 8e-28 MPQISRYSDEQVEQLLAELLNVLEKHKAPTDLSLMVLGNMVTNLINTSIAPAQRQAIANS FARALQSSINEDKAH >gi|296493463|gb|ADTK01000038.1| GENE 5 5801 - 6808 1217 335 aa, chain + ## HITS:1 COG:ECs3078 KEGG:ns NR:ns ## COG: ECs3078 COG3081 # Protein_GI_number: 15832332 # Func_class: R General function prediction only # Function: Nucleoid-associated 
protein # Organism: Escherichia coli O157:H7 # 1 335 1 335 335 641 99.0 0 MSLDINQIALHQLIKRDEQNLELVLRDSLLEPTETVVEMVAELHRVYSAKNKAYGLFSEE SELAQTLRLQRQGEEDFLAFSRAATGRLRDELAKYPFADGGFVLFCHYRYLAVEYLLVAV LSNLSSMRVNENLDINPTHYLDINHADIVARIDLTEWETNPESTRYLTFLKGRVGRKVAD FFMDFLGASEGLNAKAQNRGLLQAVDDFTAQAQLDKAERQNVRQQVYSYCNEQLQAGEEI ELESLSKELAGVSEVSFTEFAAEKGYELEESFPADRSTLRQLTKFAGSGGGLTINFDAML LGERIFWDPATDTLTIKGTPPNLRDQLQRRTSGGN Prediction of potential genes in microbial genomes Time: Mon May 16 15:04:40 2011 Seq name: gi|296493462|gb|ADTK01000039.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont101.4, whole genome shotgun sequence Length of sequence - 71098 bp Number of predicted genes - 64, with homology - 63 Number of transcription units - 37, operones - 18 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 5 - 57 4.2 1 1 Op 1 . - CDS 72 - 452 633 ## PROTEIN SUPPORTED gi|26108973|gb|AAN81176.1|AE016763_135 50S ribosomal protein L25 2 1 Op 2 . - CDS 481 - 2241 1553 ## COG1061 DNA or RNA helicases of superfamily II - Prom 2300 - 2359 2.7 + Prom 2270 - 2329 3.5 3 2 Op 1 8/0.062 + CDS 2391 - 3086 728 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases 4 2 Op 2 . + CDS 3114 - 4304 1254 ## COG0477 Permeases of the major facilitator superfamily + Term 4320 - 4349 0.5 + Prom 4349 - 4408 5.2 5 3 Tu 1 . 
+ CDS 4637 - 4981 223 ## SSON_2237 hypothetical protein + Term 5017 - 5064 1.9 6 4 Op 1 11/0.062 - CDS 4985 - 6574 331 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 7 4 Op 2 11/0.062 - CDS 6576 - 7601 1043 ## COG4239 ABC-type uncharacterized transport system, permease component 8 4 Op 3 11/0.062 - CDS 7601 - 8695 1382 ## COG4174 ABC-type uncharacterized transport system, permease component 9 4 Op 4 2/0.625 - CDS 8696 - 10510 1472 ## COG4166 ABC-type oligopeptide transport system, periplasmic component 10 4 Op 5 5/0.125 - CDS 10592 - 12148 1049 ## COG2200 FOG: EAL domain - Prom 12179 - 12238 4.1 - Term 12246 - 12283 6.4 11 5 Tu 1 . - CDS 12329 - 12895 358 ## PROTEIN SUPPORTED gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 - Prom 12954 - 13013 5.5 12 6 Op 1 3/0.562 - CDS 13307 - 14020 488 ## COG0671 Membrane-associated phospholipid phosphatase 13 6 Op 2 3/0.562 - CDS 14059 - 15045 657 ## COG0523 Putative GTPases (G3E family) - Term 15101 - 15152 2.2 14 7 Tu 1 . - CDS 15163 - 16629 1187 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases - Prom 16660 - 16719 6.0 - Term 16792 - 16840 5.0 15 8 Tu 1 . - CDS 16852 - 17424 760 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) + Prom 17354 - 17413 3.4 16 9 Tu 1 . + CDS 17579 - 17833 181 ## COG0727 Predicted Fe-S-cluster oxidoreductase 17 10 Tu 1 . 
- CDS 17830 - 19011 1020 ## COG0477 Permeases of the major facilitator superfamily - Prom 19160 - 19219 3.4 + Prom 19219 - 19278 3.5 18 11 Op 1 11/0.062 + CDS 19379 - 20509 1355 ## COG4668 Mannitol/fructose-specific phosphotransferase system, IIA domain 19 11 Op 2 19/0.000 + CDS 20509 - 21447 1039 ## COG1105 Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) 20 11 Op 3 4/0.438 + CDS 21464 - 23155 1996 ## COG1299 Phosphotransferase system, fructose-specific IIC component + Term 23172 - 23202 0.2 + Prom 23334 - 23393 4.2 21 12 Op 1 8/0.062 + CDS 23579 - 24520 556 ## COG0524 Sugar kinases, ribokinase family 22 12 Op 2 4/0.438 + CDS 24508 - 25446 1087 ## COG2313 Uncharacterized enzyme involved in pigment biosynthesis + Term 25496 - 25525 1.9 23 13 Tu 1 . + CDS 25540 - 26790 1333 ## COG1972 Nucleoside permease 24 14 Tu 1 . - CDS 26987 - 27685 442 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Prom 27913 - 27972 6.1 + Prom 27690 - 27749 3.5 25 15 Op 1 3/0.562 + CDS 27815 - 28756 1003 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase + Term 28770 - 28813 7.5 26 15 Op 2 . + CDS 28856 - 30106 1352 ## COG1972 Nucleoside permease 27 16 Op 1 3/0.562 - CDS 30162 - 31250 713 ## COG0524 Sugar kinases, ribokinase family 28 16 Op 2 5/0.125 - CDS 31253 - 32110 874 ## COG0648 Endonuclease IV 29 16 Op 3 . - CDS 32184 - 33233 1158 ## COG2855 Predicted membrane protein - Prom 33284 - 33343 9.8 + Prom 33253 - 33312 9.1 30 17 Op 1 . + CDS 33332 - 34213 204 ## PROTEIN SUPPORTED gi|149913192|ref|ZP_01901726.1| 50S ribosomal protein L35 + Prom 34331 - 34390 5.3 31 17 Op 2 . + CDS 34418 - 35887 1778 ## COG0833 Amino acid transporters + Term 35909 - 35940 4.1 + Prom 35948 - 36007 2.8 32 18 Tu 1 . + CDS 36095 - 38086 2023 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins + Term 38095 - 38125 2.1 - Term 38084 - 38111 1.5 33 19 Tu 1 . 
- CDS 38118 - 38954 753 ## COG0627 Predicted esterase - Prom 39198 - 39257 3.0 + Prom 39036 - 39095 4.7 34 20 Op 1 4/0.438 + CDS 39212 - 39880 753 ## COG0302 GTP cyclohydrolase I 35 20 Op 2 . + CDS 39897 - 41054 1098 ## COG2311 Predicted membrane protein 36 21 Tu 1 . - CDS 41003 - 41206 73 ## EcE24377A_2448 hypothetical protein + Prom 41094 - 41153 1.9 37 22 Tu 1 . + CDS 41196 - 42236 1056 ## COG1609 Transcriptional regulators + Term 42248 - 42277 -0.2 + Prom 42239 - 42298 5.1 38 23 Op 1 16/0.000 + CDS 42516 - 43514 1340 ## COG1879 ABC-type sugar transport system, periplasmic component + Term 43523 - 43558 6.1 39 23 Op 2 10/0.062 + CDS 43575 - 45095 192 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 40 23 Op 3 . + CDS 45111 - 46121 1346 ## COG4211 ABC-type glucose/galactose transport system, permease component + Term 46139 - 46201 8.6 41 24 Op 1 4/0.438 - CDS 46192 - 47427 1183 ## COG0167 Dihydroorotate dehydrogenase 42 24 Op 2 . - CDS 47421 - 48659 971 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases - Prom 48817 - 48876 6.5 - Term 48883 - 48925 -0.8 43 25 Op 1 . - CDS 48978 - 49217 236 ## G2583_2688 hypothetical protein 44 25 Op 2 3/0.562 - CDS 49220 - 49939 780 ## COG2949 Uncharacterized membrane protein - Prom 49970 - 50029 3.7 - Term 49993 - 50021 1.6 45 26 Tu 1 . - CDS 50089 - 50973 901 ## COG0295 Cytidine deaminase - Prom 50999 - 51058 4.7 46 27 Op 1 23/0.000 - CDS 51103 - 51798 732 ## COG1346 Putative effector of murein hydrolase 47 27 Op 2 . - CDS 51795 - 52193 439 ## COG1380 Putative effector of murein hydrolase LrgA - Prom 52214 - 52273 3.9 + Prom 52207 - 52266 3.4 48 28 Tu 1 . + CDS 52402 - 53379 985 ## COG0042 tRNA-dihydrouridine synthase + Prom 53861 - 53920 6.5 49 29 Op 1 2/0.625 + CDS 54057 - 55493 1204 ## COG1538 Outer membrane protein 50 29 Op 2 . 
+ CDS 55546 - 56307 202 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 - Term 56359 - 56410 -0.7 51 30 Tu 1 . - CDS 56437 - 57015 671 ## COG0586 Uncharacterized membrane-associated protein - Prom 57116 - 57175 5.2 + Prom 57075 - 57134 2.9 52 31 Tu 1 . + CDS 57161 - 57772 595 ## SSON_2192 hypothetical protein + Term 57786 - 57831 11.2 + Prom 57816 - 57875 5.1 53 32 Tu 1 . + CDS 57985 - 58878 1067 ## COG1686 D-alanyl-D-alanine carboxypeptidase + Term 58886 - 58922 4.3 - Term 58822 - 58850 1.3 54 33 Tu 1 . - CDS 58916 - 60631 1648 ## COG0277 FAD/FMN-containing dehydrogenases - Prom 60663 - 60722 4.2 + Prom 60622 - 60681 4.1 55 34 Tu 1 . + CDS 60827 - 63124 2708 ## COG1472 Beta-glucosidase-related glycosidases + Term 63136 - 63166 3.4 56 35 Op 1 13/0.062 + CDS 63335 - 64252 1031 ## COG1732 Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) 57 35 Op 2 24/0.000 + CDS 64259 - 65416 1288 ## COG1174 ABC-type proline/glycine betaine transport systems, permease component 58 35 Op 3 24/0.000 + CDS 65409 - 66335 1127 ## COG1125 ABC-type proline/glycine betaine transport systems, ATPase components 59 35 Op 4 . + CDS 66340 - 67071 812 ## COG1174 ABC-type proline/glycine betaine transport systems, permease component + Term 67320 - 67359 1.2 - Term 66873 - 66912 -0.7 60 36 Op 1 . - CDS 67052 - 67159 147 ## 61 36 Op 2 . - CDS 67219 - 67950 663 ## COG0789 Predicted transcriptional regulators - Prom 68037 - 68096 4.4 + Prom 68001 - 68060 3.5 62 37 Op 1 9/0.062 + CDS 68172 - 69857 1220 ## COG3275 Putative regulator of cell autolysis 63 37 Op 2 2/0.625 + CDS 69854 - 70573 832 ## COG3279 Response regulator of the LytR/AlgR family 64 37 Op 3 . 
+ CDS 70620 - 71090 450 ## COG4807 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|296493462|gb|ADTK01000039.1| GENE 1 72 - 452 633 126 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|26108973|gb|AAN81176.1|AE016763_135 50S ribosomal protein L25 [Escherichia coli CFT073] # 1 126 1 126 126 248 96 6e-65 MSIESRPLLHTQSRSLTCCWVACSRINLREKEMFTINAEVRKEQGKGASRRLRAANKFPA IIYGGKEAPLAIELDHDKVMNMQAKAEFYSEVLTIVVDGKEIKVKAQDVQRHPYKPKLQH IDFVRA >gi|296493462|gb|ADTK01000039.1| GENE 2 481 - 2241 1553 586 aa, chain - ## HITS:1 COG:ZyejH_1 KEGG:ns NR:ns ## COG: ZyejH_1 COG1061 # Protein_GI_number: 15802740 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Escherichia coli O157:H7 EDL933 # 1 358 1 358 358 726 99.0 0 MIFTLRPYQQEAVDATLNHFRRHKTPAVIVLPTGAGKSLVIAELARLARGRVLVLAHVKE LVAQNHAKYQALGLEADIFAAGLKRKESHGKVVFGSVQSVARNLDAFQGEFSLLIVDECH RIGDDEESQYQQILTHLTKVNPHLRLLGLTATPFRLGKGWIYQFHYHGMVRGDEKALFRD CIYELPLRYMIKHGYLTPPERLDMPVVQYDFSRLQAQSNGLFSEADLNRELKKQQRITPH IISQIMEFAEKRKGVMIFAATVEHAKEIVGLLPAEDAALITGDTPGAERDVLIEDFKAQR FRYLVNVAVLTTGFDAPHVDLIAILRPTESVSLYQQIVGRGLRLAPGKTDCLILDYAGNP HDLYAPEVGTPKGKSDNVPVQVFCPACGFANTFGGKTTADGTLIEHFGRRCQGWFEDDDG HREQCDFRFRFKNCPQCNAENDIAARRCRECDTVLVDPDDMLKAALRLKDALVLRCSGMS LQHGHDEKGEWLKITYYDEDGADVSERFRLQTPAQRTAFEQLFIRPHTRTPGIPLRWITA ADILAQQALLRHPDFVVARMKGQYWQVREKVFDYEGRFRRAHELRG >gi|296493462|gb|ADTK01000039.1| GENE 3 2391 - 3086 728 231 aa, chain + ## HITS:1 COG:ECs3075 KEGG:ns NR:ns ## COG: ECs3075 COG1187 # Protein_GI_number: 15832329 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Escherichia coli O157:H7 # 1 231 1 231 231 469 100.0 1e-132 MRLDKFIAQQLGVSRAIAGREIRGNRVTVDGEIVRNAAFKLLPEHDVAYDGNPLAQQHGP RYFMLNKPQGYVCSTDDPDHPTVLYFLDEPVAWKLHAAGRLDIDTTGLVLMTDDGQWSHR ITSPRHHCEKTYLVTLESPVADDTAEQFAKGVQLHNEKDLTKPAVLEVITPTQVRLTISE 
GRYHQVKRMFAAVGNHVVELHRERIGGITLDADLAPGEYRPLTEEEIASVV >gi|296493462|gb|ADTK01000039.1| GENE 4 3114 - 4304 1254 396 aa, chain + ## HITS:1 COG:ECs3074 KEGG:ns NR:ns ## COG: ECs3074 COG0477 # Protein_GI_number: 15832328 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 396 1 396 396 655 100.0 0 MTTRQHSSFAIVFILGLLAMLMPLSIDMYLPALPVISAQFGVPAGSTQMTLSTYILGFAL GQLIYGPMADSFGRKPVVLGGTLVFAAAAVACALAQTIDQLIVMRFFHGLAAAAASVVIN ALMRDIYPKEEFSRMMSFVMLVTTIAPLMAPIVGGWVLVWLSWHYIFWILAVAAILASAM IFFLIKETLPPERRQPFHIRTTIGNFAALFRHKRVLSYMLASGFSFAGMFSFLSAGPFVY IEINHVAPENFGYYFALNIVFLFVMTIFNSRFVRRIGALNMFRSGLWIQFIMAAWMVISA LLGLGFWSLVVGVAAFVGCVSMVSSNAMAVILDEFPHMAGTASSLAGTFRFGIGAIVGAL LSLATFNSAWPMIWSIAFCATSSILFCLYASRPKKR >gi|296493462|gb|ADTK01000039.1| GENE 5 4637 - 4981 223 114 aa, chain + ## HITS:1 COG:no KEGG:SSON_2237 NR:ns ## KEGG: SSON_2237 # Name: yejG # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 114 1 114 114 231 100.0 4e-60 MTSLQLSIVHRLPQNFRWSAGFAGSKVEPIPQNGPCGDNSLVALKLLSPDGDNAWSVMYK LSQALSDIEVPCSVLECEGEPCLFVNRQDEFAATCRLKNFGVAIAEPFSNYNPF >gi|296493462|gb|ADTK01000039.1| GENE 6 4985 - 6574 331 529 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 268 528 1 261 563 132 33 7e-30 MTQTLLAIENLSVGFRHQQTVRTVVNDVSLQIEAGETLALVGESGSGKSVTALSILRLLP SPPVEYLSGDIRFHGESLLHASDQTLRGVRGNKIAMIFQEPMVSLNPLHTLEKQLYEVLS LHRGMRREAARGEILNCLDRVGIRQAAKRLTDYPHQLSGGERQRVMIAMALLTRPELLIA DEPTTALDVSVQAQILQLLRELQGELNMGMLFITHNLSIVRKLAHRVAVMQNGRCVEQNN AATLFASPTHPYTQKLLNSEPSGDPVPLPEPASTLLDVEQLQVAFPIRKGILKRIVDHNV VVKNISFTLRAGETLGLVGESGSGKSTTGLALLRLINSQGSIIFDGQPLQNLNRRQLLPI RHRIQVVFQDPNSSLNPRLNVLQIIEEGLRVHQPTLSAAQREQQVIAVMHEVGLDPETRH RYPAEFSGGQRQRIAIARALILKPSLIILDEPTSSLDKTVQAQILTLLKSLQQKHQLAYL FISHDLHVVRALCHQVIVLRQGEVVEQGPCARVFAAPQQEYTRQLLALS >gi|296493462|gb|ADTK01000039.1| GENE 7 6576 - 7601 1043 341 aa, chain - ## HITS:1 COG:ECs3071 KEGG:ns NR:ns ## COG: ECs3071 COG4239 # Protein_GI_number: 15832325 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Escherichia coli O157:H7 # 1 341 1 341 341 649 99.0 0 MSRLSPVNQARWARFRHNRRGYWSLWIFLVLFGLSLCSELIANDKPLLVRYDGSWYFPLL KNYSESDFGGPLASQADYQDPWLKQRLENNGWVLWAPIRFGATSINFATDKPFPSPPSRQ NWLGTDANGGDVLARILYGTRISVLFGLMLTLCSSVMGVLAGALQGYYGGKVDLWGQRFI EVWSGMPTLFLIILLSSVVQPNFWWLLAITVLFGWMSLVGVVRAEFLRTRNFDYIRAAQA LGVSDRSIILRHMLPNAMVATLTFLPFILCSSITTLTSLDFLGFGLPLGSPSLGELLLQG KNNLQAPWLGITAFLSVAILLSLLIFIGEAVRDAFDPNKAV >gi|296493462|gb|ADTK01000039.1| GENE 8 7601 - 8695 1382 364 aa, chain - ## HITS:1 COG:ECs3070 KEGG:ns NR:ns ## COG: ECs3070 COG4174 # Protein_GI_number: 15832324 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Escherichia coli O157:H7 # 1 364 1 364 364 682 99.0 0 MGAYLIRRLLLVIPTLWAIITINFFIVQIAPGGPVDQAIAAIEFGNAGVLPGAGGEGVRA SHAQTGVGNISDSNYRGGRGLDPEVIAEITHRYGFDKPIHERYFKMLWDYIRFDFGDSLF RSASVLTLIKDSLPVSITLGLWSTLIIYLVSIPLGIRKAVYNGSRFDVWSSAFIIIGYAI PAFLFAILLIVFFAGGSYFDLFPLRGLVSANFDLLPWYQKITDYLWHITLPVLATVIGGF AALTMLTKNSFLDEVRKQYVVTARAKGVSEKNILWKHVFRNAMLLVIAGFPATFISMFFT GSLLIEVMFSLNGLGLLGYEATVSRDYPVMFGTLYIFTLIGLLLNIVSDISYTLVDPRID 
FEGR >gi|296493462|gb|ADTK01000039.1| GENE 9 8696 - 10510 1472 604 aa, chain - ## HITS:1 COG:yejA KEGG:ns NR:ns ## COG: yejA COG4166 # Protein_GI_number: 16130115 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Escherichia coli K12 # 1 604 3 606 606 1193 99.0 0 MIVRILLLFIALFTFGVQAQAIKESYAFAVLGEPRYAFNFNHFDYVNPAAPKGGQITLSA LGTFDNFNRYALRGNPGARTEQLYDTLFTTSDDEPGSYYPLIAESARYADDYSWVEVAIN PRARFHDGSPITARDVEFTFQKFMTEGVPQFRLVYKGTTVKAIAPLTVRIELAKPGKEDM LSLFSLPVFPEKYWKDHKLSDPLATPPLASGPYRITSWKMGQNIVYSRVKDYWAANLPVN RGRWNFDTIRYDYYLDDNVAFEAFKAGAFDLRMENDAKNWATRYTGKNFDKKYIIKDEQK NESAQDTRWLAFNIQRPVFSDRRVREAITLAFDFEWMNKALFYNAWSRTNSYFQNTEYAA RNYPDAAELVLLAPMKKDLPPEVFTQIYQPPVSKGDGYDRDNLLKADKLLNEAGWVLKGQ QRVNATTGQPLSFELLLPASSNSQWVLPFQHSLQRLGINMDIRKVDNSQITNRMRSRDYD MMPRVWRAMPWPSSDLQISWSSEYINSTYNAPGVQSPVIDSLINQIIAAQGNKEKLLPLG RALDRVLTWNYYMLPMWYMAEDRLAWWDKFSQPAVRPIYSLGIDTWWYDVNKAAKLPSAS KQGE >gi|296493462|gb|ADTK01000039.1| GENE 10 10592 - 12148 1049 518 aa, chain - ## HITS:1 COG:rtn_2 KEGG:ns NR:ns ## COG: rtn_2 COG2200 # Protein_GI_number: 16130114 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 261 518 1 258 258 531 100.0 1e-150 MFIRAPNSGRKLLLTCIVAGVMIAILVSCLQFLVAWHRHEVKYDTLITDVQKYLDTYFAD LKSTTDRLQPLTLDTCQQANPELTARAAFSMNVRTFVLVKDKKTFCSSATGEMDIPLNEL IPALDINKNVDMAILPGTPMVPNKPAIVIWYRNPLLKNSGVFAALNLNLTPSLFYSSRQE DYDGVALIIGNTALSTFSSRLMNVNELTDMPVRETKIAGIPLTVRLYADDWTWNDVWYAF LLGGMSGTVVGLLCYYLMSVRMRPGREIMTAIKREQFYVAYQPVVDTQALRVTGLEVLLR WRHPVAGEIPPDAFINFAESQKMIVPLTQHLFELIARDAAELEKVLPVGVKFGINIAPDH LHSESFKADIQKLLTSLPAHHFQIVLEITERDMLKEQEATQLFAWLHSVGVEIAIDDFGT GHSALIYLERFTLDYLKIDRGFINAIGTETITSPVLDAVLTLAKRLNMLTVAEGVETPEQ ARWLSERGVNFMQGYWISRPLPLDDFVRWLKKPYTPQW >gi|296493462|gb|ADTK01000039.1| GENE 11 12329 - 12895 358 188 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 [Haemophilus parasuis 29755] # 65 184 
58 174 175 142 57 5e-33 MVKSQPILRYILRGIPAIAVAVLLSACSANNTAKNMHPETRAVGSETSSLQASQDEFENL VRNVDVKSRIMDQYADWKGVRYRLGGSTKKGIDCSGFVQRTFREQFGLELPRSTYEQQEM GKSVSRSNLRTGDLVLFRAGSTGRHVGIYIGNNQFVHASTSSGVIISSMNEPYWKKRYNE ARRVLSRS >gi|296493462|gb|ADTK01000039.1| GENE 12 13307 - 14020 488 237 aa, chain - ## HITS:1 COG:ECs3066 KEGG:ns NR:ns ## COG: ECs3066 COG0671 # Protein_GI_number: 15832320 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Escherichia coli O157:H7 # 1 237 13 249 249 436 99.0 1e-122 MIKNLPQIVLLNIVGLALFLSWYIPVNHGFWLPIDADIFYFFNQKLVESKAFLWLVALTN NRAFDGCSLLAMGMLMLSFWLKENAPGRRRIVIMGLVMLLTAVVLNQLGQALIPVKRASP TLTFTDINRVSELLSVPTKDASRDSFPGDHGMMLLIFSAFMWRYFGKVAGLIALIIFVVF AFPRVMIGAHWFTDIIVGSMTVILIGLPWVLLTPLSDRLITFFDKSLPGKNKHFQNK >gi|296493462|gb|ADTK01000039.1| GENE 13 14059 - 15045 657 328 aa, chain - ## HITS:1 COG:yeiR KEGG:ns NR:ns ## COG: yeiR COG0523 # Protein_GI_number: 16130111 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Escherichia coli K12 # 1 328 1 328 328 656 99.0 0 MTRTNLITGFLGSGKTTSILHLLAHKDPNEKWAVLVNEFGEVGIDGALLADSGALLKEIP GGCMCCVNGLPMQVGLNTLLRQGKPDRLLIEPTGLGHPKQILDLLTAPVYEPWIDLRATL CILDPRLLLDEKSASNENFRDQLAAADIIVANKSDRATPESEQALQRWWQQNGGDRQLIH SEHGKVDGHLLDLPRRNLAELPASAAHSHQHVVKKGLAALSLPEHQRWRRSLNSGQGYQA CGWIFDADTVFDTIGILEWARLAPVERVKGVLRIPEGLVRINRQGDDLHIETQNVAPPDS RIELISSSEADWNALQSALLKLRLATTA >gi|296493462|gb|ADTK01000039.1| GENE 14 15163 - 16629 1187 488 aa, chain - ## HITS:1 COG:yeiQ KEGG:ns NR:ns ## COG: yeiQ COG0246 # Protein_GI_number: 16130110 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Escherichia coli K12 # 1 488 1 488 488 1004 99.0 0 MKTIASVTLPHHVHAPRYDRQQLQSRIVHFGFGAFHRAHQALLTDRVLNAQGGDWGICEI SLFSGDQLMSQLRAQNHLYTVLEKGADGNQVIIVGAVHECLNAKLDSLAAIIEKFCEPQV AIVSLTITEKGYCIDPATGALDTSNPRIIHDLQTPEEPHSAPGILVEALKRRRERGLTPF 
TVLSCDNIPDNGHVVKNAVLGMAEKRSPELAGWIKEHVSFPGTMVDRIVPAATDESLAEI SQHLGVNDPCAISCEPFIQWVVEDNFVAGRPAWEVAGVQMVNDVLPWEEMKLRMLNGSHS FLAYLGYLSGFAHISDCMQDRAFRHAARTLMLNEQAPTLQIKDVDLTQYADKLIARFANP ALKHKTWQIAMDGSQKLPQRMLAGIRIHLGRETDWSLLALGVAGWMRYVSGVDDAGNAID VRDPLSDKIRELVAGSSSEQRVTALLSLREVFGDDLPDNPHFVQAIEQAWQQIVQFGAHQ ALLNTLKI >gi|296493462|gb|ADTK01000039.1| GENE 15 16852 - 17424 760 190 aa, chain - ## HITS:1 COG:ECs3063 KEGG:ns NR:ns ## COG: ECs3063 COG0231 # Protein_GI_number: 15832317 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Escherichia coli O157:H7 # 1 190 86 275 275 384 100.0 1e-107 MPRANEIKKGMVLNYNGKLLLVKDIDIQSPTARGAATLYKMRFSDVRTGLKVEERFKGDD IVDTVTLTRRYVDFSYVDGNEYVFMDKEDYTPYTFTKDQIEEELLFMPEGGMPDMQVLTW DGQLLALELPQTVDLEIVETAPGIKGASASARNKPATLSTGLVIQVPEYLSPGEKIRIHI EERRYMGRAD >gi|296493462|gb|ADTK01000039.1| GENE 16 17579 - 17833 181 84 aa, chain + ## HITS:1 COG:YPO1286 KEGG:ns NR:ns ## COG: YPO1286 COG0727 # Protein_GI_number: 16121569 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster oxidoreductase # Organism: Yersinia pestis # 1 84 1 84 84 101 66.0 3e-22 MECRPGCGACCTAPSISSPIPGMPDGKPANTPCIQLDEQQRCKIFTSPLRPKVCAGLQAS AEMCGNSRQQAMTWLIDLEMLTAP >gi|296493462|gb|ADTK01000039.1| GENE 17 17830 - 19011 1020 393 aa, chain - ## HITS:1 COG:ECs3062 KEGG:ns NR:ns ## COG: ECs3062 COG0477 # Protein_GI_number: 15832316 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 393 1 393 393 691 99.0 0 MHNSPAVSSAKSFDLTSTAFLIVAFLTGIAGALQTPTLSIFLTDEVHARPAMVGFFFTGS AVIGILVSQFLTGRSDKRGDHKSLIVFCCLLGVLACTLFAWNRNYFVLLFVGVFLSSFGS TANPQMFALAREHADKTGREAVMFSSFLRAQVSLAWVIGPPLAYALAMGFSFTVMYLSAA 
VAFIVCGVMVWLFLPSMQKELPLATGTVEAPRRNRRDTLLLFVICTLMWGSNSLYIINMP LFIINELHLPEKLAGVMMGTAAGLEIPTMLIAGYFAKRLGKRFLMRVAAVGGVCFYAGML MAHSPVILLGLQLLNAIFIGILGGIGMLYFQDLMPGQAGSATTLYTNTSRVGWIIAGSVA GIVAEIWNYHAVFWFAMVMIIATLFCLLRIKDV >gi|296493462|gb|ADTK01000039.1| GENE 18 19379 - 20509 1355 376 aa, chain + ## HITS:1 COG:fruB_1 KEGG:ns NR:ns ## COG: fruB_1 COG4668 # Protein_GI_number: 16130107 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol/fructose-specific phosphotransferase system, IIA domain # Organism: Escherichia coli K12 # 1 286 1 286 286 467 99.0 1e-131 MFQLSVQDIHPGEKAGDKEEAIRQVAAALVQAGNVAEGYVNGMLAREQQTSTFLGNGIAI PHGTTDTRDQVLKTVVQVFQFPEGLTWGDGQVAYVAIGIAASSDEHLGLLRQLTHVLSDD SVAEQLKSATTAEELRALLMGEKQSEQLKLDNEMLTLDIVASDLLTLQALNAARLKEAGA VDATFVTKAINEQPLNLGQGIWLSDSAEGNLRSAIAVSRAANAFDVDGETAAMLVSVAMN DDQPIAVLKRLADLLLDNKADRLLKADAATLLALLTSDDAPTDDVLSAEFVVRNEHGLHA RPGTMLVNTIKQFNSDITVTNLDGTGKPANGRSLMKVVALGVKKGHRLRFTAQGGDAEQA LKAIGDAIAAGLGEGA >gi|296493462|gb|ADTK01000039.1| GENE 19 20509 - 21447 1039 312 aa, chain + ## HITS:1 COG:ECs3060 KEGG:ns NR:ns ## COG: ECs3060 COG1105 # Protein_GI_number: 15832314 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) # Organism: Escherichia coli O157:H7 # 1 312 1 312 312 611 100.0 1e-175 MSRRVATITLNPAYDLVGFCPEIERGEVNLVKTTGLHAAGKGINVAKVLKDLGIDVTVGG FLGKDNQDGFQQLFSELGIANRFQVVQGRTRINVKLTEKDGEVTDFNFSGFEVTPADWER FVTDSLSWLGQFDMVCVSGSLPSGVSPEAFTDWMTRLRSQCPCIIFDSSREALVAGLKAA PWLVKPNRRELEIWAGRKLPEMKDVIEAAHALREQGIAHVVISLGAEGALWVNASGEWIA KPPSVDVVSTVGAGDSMVGGLIYGLLMRESSEHTLRLATAVAALAVSQSNVGITDRPQLA AMMARVDLQPFN >gi|296493462|gb|ADTK01000039.1| GENE 20 21464 - 23155 1996 563 aa, chain + ## HITS:1 COG:fruA_3 KEGG:ns NR:ns ## COG: fruA_3 COG1299 # Protein_GI_number: 16130105 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Escherichia coli K12 # 226 563 1 338 
338 545 100.0 1e-154 MKTLLIIDANLGQARAYMAKTLLGAAARKAKLEIIDNPNDAEMAIVLGDSIPNDSALNGK NVWLGDISRAVAHPELFLSEAKGHAKPYTAPVTATAPVAASGPKRVVAVTACPTGVAHTF MAAEAIETEAKKRGWWVKVETRGSVGAGNAITPEEVAAADLVIVAADIEVDLAKFAGKPM YRTSTGLALKKTAQELDKAVAEATPYEPAGKAQTATTEGKKESAGAYRHLLTGVSYMLPM VVAGGLCIALSFAFGIEAFKEPGTLAAALMQIGGGSAFALMVPVLAGYIAFSIADRPGLT PGLIGGMLAVSTGSGFIGGIIAGFLAGYIAKLISTQLKLPQSMEALKPILIIPLISSLVV GLAMIYLIGKPVAGILEGLTHWLQTMGTANAVLLGAILGGMMCTDMGGPVNKAAYAFGVG LLSTQTYGPMAAIMAAGMVPPLAMGLATMVARRKFDKAQQEGGKAALVLGLCFISEGAIP FAARDPMRVLPCCIVGGALTGAISMAIGAKLMAPHGGLFVLLIPGAITPVLGYLVAIIAG TLVAGLAYAFLKRPEVDAVAKAA >gi|296493462|gb|ADTK01000039.1| GENE 21 23579 - 24520 556 313 aa, chain + ## HITS:1 COG:ECs3058 KEGG:ns NR:ns ## COG: ECs3058 COG0524 # Protein_GI_number: 15832312 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli O157:H7 # 1 313 1 313 313 613 98.0 1e-176 MREKDYVVIIGSANIDVAGYSHESLNYADSNPGKIKFTPGGVGRNIAQNLALLGNKAWLL SAVGSDFYGQSLLTQTNQSGVYVDKCLIVPGENTSSYLSLLDNTGEMLVAINDMNISNAI TAEYLAQHREFIQRAKVIVADCNISEEALAWILDNAANVPVFVDPVSAWKCVKVRDRLNQ IHTLKPNRLEAETLSGIALSGRDDVAKVAAWFHQHGLNRLVLSMGGDGVYYSDISGENGW SAPIKTNVINVTGAGDAMMAGLASCWVDGMPFAESVRFAQGCSSMALSCEYTNNPDLSIA NVISLVENAECLN >gi|296493462|gb|ADTK01000039.1| GENE 22 24508 - 25446 1087 312 aa, chain + ## HITS:1 COG:ECs3057 KEGG:ns NR:ns ## COG: ECs3057 COG2313 # Protein_GI_number: 15832311 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized enzyme involved in pigment biosynthesis # Organism: Escherichia coli O157:H7 # 1 312 1 312 312 569 99.0 1e-162 MSELKISPELLQISPEVQDALKNKKPVVALESTIISHGMPFPQNAQTAIEVEETIRKQGA VPATIAIIGGVMKVGLSKEEIELLGREGHNVTKVSRRDLPFVVAAGKNGATTVASTMIIA ALAGIKVFATGGIGGVHRGAEHTFDISADLQELANTNVTVVCAGAKSILDLGLTTEYLET FGVPLIGYQTKALPAFFCRTSPFDVSIRLDSASEIARAMAVKWQSGLNGGLVVANPIPEQ FAMPEHTINAAIDQAVAEAEAQGVIGKESTPFLLARVAELTGGDSLKSNIQLVFNNAILA SEIAKEYQRLAG >gi|296493462|gb|ADTK01000039.1| GENE 23 25540 - 
26790 1333 416 aa, chain + ## HITS:1 COG:yeiM KEGG:ns NR:ns ## COG: yeiM COG1972 # Protein_GI_number: 16130102 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside permease # Organism: Escherichia coli K12 # 1 416 1 416 416 631 98.0 0 MDIMRSVVGMVVLLAIAFLLSVNKKSISLRTVGAALLLQIAIGGIMLYFPPGKWAVEQAA LGVHKVMSYSDAGSAFIFGSLVGPKMDVLFDGAGFIFAFRVLPAIIFVTALISLLYYIGV MGLLIRILGSIFQKALNISKIESFVAVTTIFLGQNEIPAIVKPFIDRMNRNELFTAICSG MASIAGSMMIGYAGMGVPIDYLLAASLMAIPGGILFARILSPATEPSQVTFENLSFSETP PKSIIEAAANGAMTGLKIAAGVATVVMAFVAIIALINGIIGGVGGWFGFANVSLESIFGY VLAPLAWIMGVDWSDANLAGSLIGQKLAINEFVAYLNFSPYLQTSGTLDVKTIAIISFAL CGFANFGSIGVVVGAFSAISPKRAPEIAQLGLRALAAATLSNLMSATIAGFFIGLA >gi|296493462|gb|ADTK01000039.1| GENE 24 26987 - 27685 442 232 aa, chain - ## HITS:1 COG:ECs3055 KEGG:ns NR:ns ## COG: ECs3055 COG0664 # Protein_GI_number: 15832309 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Escherichia coli O157:H7 # 14 232 1 219 219 446 100.0 1e-125 MKEIHNNDLKLQLMSESAFKDCFLTDVSADTRLFHFLARDYIVQEGQQPSWLFYLTRGRA RLYATLANGRVSLIDFFAAPCFIGEIELIDKDHEPRAVQAIEECWCLALPMKHYRPLLLN DTLFLRKLCVTLSHKNYRNIVSLTQNQSFPLVNRLAAFILLSQEGDLYHEKHTQAAEYLG VSYRHLLYVLAQFIHDGLLIKSKKGYLIKNRKQLSGLALEMDPENKFSGMMQ >gi|296493462|gb|ADTK01000039.1| GENE 25 27815 - 28756 1003 313 aa, chain + ## HITS:1 COG:yeiK KEGG:ns NR:ns ## COG: yeiK COG1957 # Protein_GI_number: 16130100 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Escherichia coli K12 # 1 313 1 313 313 631 99.0 0 MEKRKIILDCDPGHDDAIAMMMAAKHPAIDLLGITIVAGNQTLDKTLINGLNVCQKLEIN VPVYAGMPQPIMRQQIVADNIHGETGLDGPVFEPLTRQAESTHAVKYIIDTLMASDGDIT LVPVGPLSNIAVAMRMQPAILPKIREIVLMGGAYGTGNFTPSAEFNIFADPEAARVVFTS GVPLVMMGLDLTNQTVCTPDVIARMERAGGPAGELFSDIMNFTLKTQFENYGLAGGPVHD ATCIGYLINPDGIKTQEMYVEVDVNSGPCYGRTVCDELGVLGKPANTKVGITIDTDWFWG LVEECVRGYIKTH 
>gi|296493462|gb|ADTK01000039.1| GENE 26 28856 - 30106 1352 416 aa, chain + ## HITS:1 COG:yeiJ KEGG:ns NR:ns ## COG: yeiJ COG1972 # Protein_GI_number: 16130099 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside permease # Organism: Escherichia coli K12 # 1 416 1 416 416 653 99.0 0 MDVMRSVLGMVVLLAIAFLLSVNKKKISLRTVGAALVLQVVIGGIMLWLPPGRWVAEKVA FGVHKVMAYSDAGSAFIFGSLVGPKMDTLFDGAGFIFGFRVLPAIIFVTALVSILYYIGV MGILIRILGGIFQKALNISKIESFVAVTTIFLGQNEIPAIVKPFIDRLNRNELFTAICSG MASIAGSTMIGYAALGVPVEYLLAASLMAIPGGILFARLLSPATESSQVSFNNLSFTETP PKSIIEAAATGAMTGLKIAAGVATVVMAFVAIIALINGIIGGVGGWFGFEHASLESILGY LLAPLAWVMGVDWTDANLAGSLIGQKLAINEFVAYLNFSPYLQTAGTLDAKTVAIISFAL CGFANFGSIGVVVGAFSAVAPHRAPEIAQLGLRALAAATLSNLMSATIAGFFIGLA >gi|296493462|gb|ADTK01000039.1| GENE 27 30162 - 31250 713 362 aa, chain - ## HITS:1 COG:ZyeiI_2 KEGG:ns NR:ns ## COG: ZyeiI_2 COG0524 # Protein_GI_number: 15802716 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli O157:H7 EDL933 # 54 362 1 309 309 629 99.0 1e-180 MNNREKEILAILRRNPLIQQNEIADMLQISRSRVAAHIMDLMRKGRIKGKGYILTEQEYC VVVGTINMDIRGMADIHYPQAASHPGTIHCSAGGVGRNIAHNLALLGRDVHLLSVIGDDF YGEMLLEETRRAGVNVSGCVRLHGQSTSTYLAIANRDDETVLAINDTHLLEQLSPQLLNG SRDLLRHAGVVLADCNLTAEALEWVFTLAGEIPVFVDTVSEFKAGKIKHWLAHIHTLKPT LPELEILWGQAITSDADRNAAVNALHQQGVQQLFVYLPDESVYCSEKDGEQFLLTAPAHT TVDSFGADDGFMAGLVYSFLEGNNFRDSARFAMACAAISRASGSLNNPTLSADNALSLVP MV >gi|296493462|gb|ADTK01000039.1| GENE 28 31253 - 32110 874 285 aa, chain - ## HITS:1 COG:ECs3051 KEGG:ns NR:ns ## COG: ECs3051 COG0648 # Protein_GI_number: 15832305 # Func_class: L Replication, recombination and repair # Function: Endonuclease IV # Organism: Escherichia coli O157:H7 # 1 285 1 285 285 580 99.0 1e-165 MKYIGAHVSAAGGLANAAIRAAEIDATAFALFTKNQRQWRAAPLTTQTIDEFKAACEKYH YTSAQILPHDSYLINLGHPVTEALEKSRDAFIDEMQRCEQLGLSLLNFHPGSHLMQISEE DCLARIAESINIALDKTQGVTAVIENTAGQGSNLGFKFEHLAAIIDGVEDKSRVGVCIDT 
CHAFAAGYDLRTPAECEKTFADFARIVGFKYLRGMHLNDAKSTFGSRVDRHHSLGEGNIG HDAFRWIMQDDRFDGIPLILETINPDIWAEEIAWLKAQQTEKAVA >gi|296493462|gb|ADTK01000039.1| GENE 29 32184 - 33233 1158 349 aa, chain - ## HITS:1 COG:ECs3050 KEGG:ns NR:ns ## COG: ECs3050 COG2855 # Protein_GI_number: 15832304 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 349 1 349 349 573 100.0 1e-163 MTNITLQKQHRTLWHFIPGLALSAVITGVALWGGSIPAVAGAGFSALTLAILLGMVLGNT IYPHIWKSCDGGVLFAKQYLLRLGIILYGFRLTFSQIADVGISGIIIDVLTLSSTFLLAC FLGQKVFGLDKHTSWLIGAGSSICGAAAVLATEPVVKAEASKVTVAVATVVIFGTVAIFL YPAIYPLMSQWFSPETFGIYIGSTVHEVAQVVAAGHAISPDAENAAVISKMLRVMMLAPF LILLAARVKQLSGANSGEKSKITIPWFAILFIVVAIFNSFHLLPQSVVNMLVTLDTFLLA MAMAALGLTTHVSALKKAGAKPLLMALVLFAWLIVGGGAINYVIQSVIA >gi|296493462|gb|ADTK01000039.1| GENE 30 33332 - 34213 204 293 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149913192|ref|ZP_01901726.1| 50S ribosomal protein L35 [Roseobacter sp. AzwK-3b] # 3 241 1 243 305 83 26 4e-15 MHITLRQLEVFAEVLKSGSTTQASVMLALSQSAVSAALTDLEGQLGVQLFDRVGKRLVVN EHGRLLYPRALALLEQAVEIEQLFREDNGAIRIYASSTIGNYILPAVIARYRHDYPQLPI ELSVGNSQDVMQAVLDFRVDIGFIEGPCHSTEIISEPWLEDELVVFAAPTSPLARGPVTL EQLAAAPWILRERGSGTREIVDYLLLSHLPKFEMAMELGNSEAIKHAVRHGLGISCLSRR VIEDQLQAGTLSEVAVPLPRLMRTLWRIHHRQKHLSNALRRFLDYCDPANVPR >gi|296493462|gb|ADTK01000039.1| GENE 31 34418 - 35887 1778 489 aa, chain + ## HITS:1 COG:lysP KEGG:ns NR:ns ## COG: lysP COG0833 # Protein_GI_number: 16130094 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 489 1 489 489 911 100.0 0 MVSETKTTEAPGLRRELKARHLTMIAIGGSIGTGLFVASGATISQAGPGGALLSYMLIGL MVYFLMTSLGELAAYMPVSGSFATYGQNYVEEGFGFALGWNYWYNWAVTIAVDLVAAQLV MSWWFPDTPGWIWSALFLGVIFLLNYISVRGFGEAEYWFSLIKVTTVIVFIIVGVLMIIG IFKGAQPAGWSNWTIGEAPFAGGFAAMIGVAMIVGFSFQGTELIGIAAGESEDPAKNIPR AVRQVFWRILLFYVFAILIISLIIPYTDPSLLRNDVKDISVSPFTLVFQHAGLLSAAAVM NAVILTAVLSAGNSGMYASTRMLYTLACDGKAPRIFAKLSRGGVPRNALYATTVIAGLCF 
LTSMFGNQTVYLWLLNTSGMTGFIAWLGIAISHYRFRRGYVLQGHDINDLPYRSGFFPLG PIFAFILCLIITLGQNYEAFLKDTIDWGGVAATYIGIPLFLIIWFGYKLIKGTHFVRYSE MKFPQNDKK >gi|296493462|gb|ADTK01000039.1| GENE 32 36095 - 38086 2023 663 aa, chain + ## HITS:1 COG:cirA KEGG:ns NR:ns ## COG: cirA COG4771 # Protein_GI_number: 16130093 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Escherichia coli K12 # 1 663 1 663 663 1314 99.0 0 MFRLNPFVRVGLCLSAISCAWPVLAVDDDGETMVVTASSVEQNLKDAPASISVITQEDLQ RKPVQNLKDVLKEVPGVQLTNEGDNRKGVSIRGLDSSYTLILVDGKRVNSRNAVFRHNDF DLNWIPVDSIERIEVVRGPMSSLYGSDALGGVVNIITKKIGQKWSGTVTVDTTIQEHRDR GDTYNGQFFTSGPLIDGVLGMKAYGSLAKREKDDPQNSTTTDTGETPRIEGFSSRDGNVE FAWTPNQNHDFTAGYGFDRQDRDSDSLDKNRLERQNYSVSHNGRWDYGTSELKYYGEKVE NKNPGNSSPITSESNTVDGKYTLPLTAINQFLTVGGEWRHDKLSDAVNLTGGTSSKTSAS QYALFVEDEWRIFEPLALTTGVRMDDHETYGEHWSPRAYLVYNATDTVTVKGGWATAFKA PSLLQLSPDWTSNSCRGACKIVGSPDLKPETSESWELGLYYMGEEGWLEGVESSVTVFRN DVKDRISISRTSDVNAAPGYQNFVGFETGANGRRIPVFSYYNVNKARIQGVETELKIPFN DEWKLSFNYTYNDGRDVSNGENKPLSDLPFHTANGTLDWKPLALEDWSFYVSGHYTGQKR ADSATAKTPGGYTIWNTGAAWQVTKDVKLRAGVLNLGDKDLSRDDYSYNEDGRRYFMAVD YRF >gi|296493462|gb|ADTK01000039.1| GENE 33 38118 - 38954 753 278 aa, chain - ## HITS:1 COG:ECs3046 KEGG:ns NR:ns ## COG: ECs3046 COG0627 # Protein_GI_number: 15832300 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Escherichia coli O157:H7 # 1 278 1 278 278 556 99.0 1e-158 MEMLEEHRCFEGWQQRWRHDSSTLNCPMTFSIFLPPPRDHTPPPVLYWLSGLTCNDENFT TKAGAQRVAAELGIVLVMPDTSPRGEQVANDDGYDLGQGAGFYLNATQPPWATHYRMYDY LRDELPALVQSQFNVSDRCAISGHSMGGHGALIMALKNPGKYTSVSAFAPIVNPCSVPWG IKAFSRYLGEDKNAWLEWDSCALMYASNAQDAIPTLIDQGDNDQFLADQLQPAVLAEAAR QKAWPMTLRIQPGYDHSYYFIASFIEDHLRFHAQYLLK >gi|296493462|gb|ADTK01000039.1| GENE 34 39212 - 39880 753 222 aa, chain + ## HITS:1 COG:ECs3045 KEGG:ns NR:ns ## COG: ECs3045 COG0302 # Protein_GI_number: 15832299 # Func_class: H Coenzyme transport and metabolism # Function: GTP 
cyclohydrolase I # Organism: Escherichia coli O157:H7 # 1 222 1 222 222 412 100.0 1e-115 MPSLSKEAALVHEALVARGLETPLRPPVHEMDNETRKSLIAGHMTEIMQLLNLDLADDSL METPHRIAKMYVDEIFSGLDYANFPKITLIENKMKVDEMVTVRDITLTSTCEHHFVTIDG KATVAYIPKDSVIGLSKINRIVQFFAQRPQVQERLTQQILIALQTLLGTNNVAVSIDAVH YCVKARGIRDATSATTTTSLGGLFKSSQNTRHEFLRAVRHHN >gi|296493462|gb|ADTK01000039.1| GENE 35 39897 - 41054 1098 385 aa, chain + ## HITS:1 COG:Z3408 KEGG:ns NR:ns ## COG: Z3408 COG2311 # Protein_GI_number: 15802708 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 EDL933 # 1 385 44 428 428 665 98.0 0 MERNVTLDFVRGVAILGILLLNISAFGLPKAAYLNPAWYGAITPQDAWTWAFLDLVGQVK FLTLFALLFGAGLQMLLPRGRRWIQSRLTLLVLLGFIHGLLFWDGDILLAYGLVGLICWR LVRDAPSVKSLFNTGVMLYLVGLGVLLLLGLISDSQTSRAWTPDASAILYEKYWKLHGGV EAISNRADGVGNSLLALGAQYGWQLAGMMLIGAALMRSGWLKGQFSLRHYRRTGFVLVAI GVTINLPAIALQWQLDWAYRWCAFLLQMPRELSAPFQAIGYASLFYGFWPQLSRFKLVLA IACVGRMALTNYLLQTLICTMLFYHLGLFMHFDRLELLAFVIPVWLANILFSVIWLRYFR QGPVEWLWRQLTLRAAGTAISKTSR >gi|296493462|gb|ADTK01000039.1| GENE 36 41003 - 41206 73 67 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_2448 NR:ns ## KEGG: EcE24377A_2448 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 67 1 67 67 128 100.0 5e-29 MVIIFTTCYEIAVNDCLLAAILSALNTGLRESSHSNGNGYSRFVNECDPDRYLDVLDIAV PAARKVN >gi|296493462|gb|ADTK01000039.1| GENE 37 41196 - 42236 1056 346 aa, chain + ## HITS:1 COG:ECs3043 KEGG:ns NR:ns ## COG: ECs3043 COG1609 # Protein_GI_number: 15832297 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 346 1 346 346 662 99.0 0 MITIRDVARQAGVSVATVSRVLNNSTLVSADTREAVMKAVSELDYRPNANAQALATQVSD TIGVVVMDVSDAFFGALVKAVDLVAQQHQKYVLIGNSYHEAEKERHAIEVLIRQRCNALI VHSKALSDDELAQFMDNIPGMVLINRVVPGYAHRCVCLDNLSGARMATRMLLNNGHQRIG YLSSSHGIEDDAMRKAGWMSALKEQDIIPPESWIGTGTPDMPGGEAAMVELLGRNLQLTA VFAYNDNMAAGALTALKDNGIAIPLHLSIIGFDDIPIARYTDPQLTTVRYPIASMAKLAT 
ELALQGAAGNIDPRASHCFMPTLVRRHSVATRQNAAAITNSTNQAM >gi|296493462|gb|ADTK01000039.1| GENE 38 42516 - 43514 1340 332 aa, chain + ## HITS:1 COG:mglB KEGG:ns NR:ns ## COG: mglB COG1879 # Protein_GI_number: 16130088 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli K12 # 1 332 1 332 332 587 100.0 1e-167 MNKKVLTLSAVMASMLFGAAAHAADTRIGVTIYKYDDNFMSVVRKAIEQDAKAAPDVQLL MNDSQNDQSKQNDQIDVLLAKGVKALAINLVDPAAAGTVIEKARGQNVPVVFFNKEPSRK ALDSYDKAYYVGTDSKESGIIQGDLIAKHWAANQGWDLNKDGQIQFVLLKGEPGHPDAEA RTTYVIKELNDKGIKTEQLQLDTAMWDTAQAKDKMDAWLSGPNANKIEVVIANNDAMAMG AVEALKAHNKSSIPVFGVDALPEALALVKSGALAGTVLNDANNQAKATFDLAKNLADGKG AADGTNWKIDNKVVRVPYVGVDKDNLAEFSKK >gi|296493462|gb|ADTK01000039.1| GENE 39 43575 - 45095 192 506 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 279 482 17 215 245 78 26 9e-14 MVSSTTPSSGEYLLEMSGINKSFPGVKALDNVNLKVRPHSIHALMGENGAGKSTLLKCLF GIYQKDSGTILFQGKEIDFHSAKEALENGISMVHQELNLVLQRSVMDNMWLGRYPTKGMF VDQDKMYRETKAIFDELDIDIDPRARVGTLSVSQMQMIEIAKAFSYNAKIVIMDEPTSSL TEKEVNHLFTIIRKLKERGCGIVYISHKMEEIFQLCDEVTVLRDGQWIATEPLAGLTMDK IIAMMVGRSLNQRFPDKENKPGEVILEVRNLTSLRQPSIRDVSFDLHKGEILGIAGLVGA KRTDIVETLFGIREKSAGTITLHGKQINNHNANEAINHGFALVTEERRSTGIYAYLDIGF NSLISNIRNYKNKVGLLDNSRMKSDTQWVIDSMRVKTPGHRTQIGSLSGGNQQKVIIGRW LLTQPEILMLDEPTRGIDVGAKFEIYQLIAELAKKGKGIIIISSEMPELLGITDRILVMS NGLVSGIVDTKTTTQNEILRLASLHL >gi|296493462|gb|ADTK01000039.1| GENE 40 45111 - 46121 1346 336 aa, chain + ## HITS:1 COG:mglC KEGG:ns NR:ns ## COG: mglC COG4211 # Protein_GI_number: 16130086 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type glucose/galactose transport system, permease component # Organism: Escherichia coli K12 # 1 336 1 336 336 486 100.0 1e-137 MSALNKKSFLTYLKEGGIYVVLLVLLAIIIFQDPTFLSLLNLSNILTQSSVRIIIALGVA GLIVTQGTDLSAGRQVGLAAVVAATLLQSMDNANKVFPEMATMPIALVILIVCAIGAVIG 
LINGLIIAYLNVTPFITTLGTMIIVYGINSLYYDFVGASPISGFDSGFSTFAQGFVALGS FRLSYITFYALIAVAFVWVLWNKTRFGKNIFAIGGNPEAAKVSGVNVGLNLLMIYALSGV FYAFGGMLEAGRIGSATNNLGFMYELDAIAACVVGGVSFSGGVGTVIGVVTGVIIFTVIN YGLTYIGVNPYWQYIIKGAIIIFAVALDSLKYARKK >gi|296493462|gb|ADTK01000039.1| GENE 41 46192 - 47427 1183 411 aa, chain - ## HITS:1 COG:yeiA_1 KEGG:ns NR:ns ## COG: yeiA_1 COG0167 # Protein_GI_number: 16130085 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Escherichia coli K12 # 1 326 3 328 328 682 99.0 0 MLTKDLSITFCGVKFPNPFCLSSSPVGNCYEMCAKAYDTGWGGVVFKTIGFFIANEVSPR FDHLVKEDTGFIGFKNMEQIAEHPLEENLAALRQLKEDYPDKVLIASIMGENEQQWEELA RLVQEAGADMIECNFSCPQMTSHAMGSDVGQSPELVEKYCRAVKRGSTLPMLAKMTPNIG DMCEVALAAKRGGADGIAAINTVKSITNIDLNQKIGMPIVNGKSSISGYSGKAVKPIALR FIQQMRTHPELRDFPISGIGGIETWEDAAEFLLLGAATLQVTTGIMQYGYRIVEDMASGL SHYLADQGFDSLQEMVGLANNNIVPAEDLDRSYIVYPRINLDKCVGCGRCYISCYDGGHQ AMEWSEKTRTPHCNTEKCVGCLLCGHVCPVGCIELGEVKFKKGEKEHPVTL >gi|296493462|gb|ADTK01000039.1| GENE 42 47421 - 48659 971 412 aa, chain - ## HITS:1 COG:yeiT KEGG:ns NR:ns ## COG: yeiT COG0493 # Protein_GI_number: 16130084 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Escherichia coli K12 # 1 412 1 412 412 807 98.0 0 MPQQNYLDELTPAFTPLLAIKEASRCLLCHDAPCSQACPAQTDPGKFIRSIYFRNFKGAA ETIRENNALGAVCARVCPTEKLCQSGCTRAGVDTPIDIGRLQRFVTDFEQQTGMEIYQPG TKMLGKVAIIGAGPAGLQASVTLTNQGYDVTIYEKEAHPGGWLRNGIPQFRLPQSVLDAE IARIEKMGVTIKCNNEVGNTLTLEQLKAENRAVLVTVGLSSGSGLSLFEHSNVEIAVDFL QRARQAQGDISIPQSALIIGGGDVAMDVASTLKVLGCQAVTCVAREELDEFPASEKEFAS ARELGVSIIDGFTPVAVEGNKVTFKHVRLPGELTMAADKIILAVGQHARLDAFAELEPQR NTIKTQNYQTRDPQVFAAGDIVEGDKTVVYAVKTGKEAAEAIHHYLEGACSC >gi|296493462|gb|ADTK01000039.1| GENE 43 48978 - 49217 236 79 aa, chain - ## HITS:1 COG:no KEGG:G2583_2688 NR:ns ## KEGG: G2583_2688 # Name: yeiS # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 
79 1 79 79 135 100.0 6e-31 MDVQQFFVVAVFFLIPIFCFREAWKGWRAGAIDKRVKNAPEPVYVWRAKNPGLFFAYMVA YIGFGILSIGMIVYLIFYR >gi|296493462|gb|ADTK01000039.1| GENE 44 49220 - 49939 780 239 aa, chain - ## HITS:1 COG:ECs3036 KEGG:ns NR:ns ## COG: ECs3036 COG2949 # Protein_GI_number: 15832290 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Escherichia coli O157:H7 # 1 239 1 239 239 447 100.0 1e-126 MLKRVFLSLLVLIGLLLLTVLGLDRWMSWKTAPYIYDELQDLPYRQVGVVLGTAKYYRTG VINQYYRYRIQGAINAYNSGKVNYLLLSGDNALQSYNEPMTMRKDLIAAGVDPSDIVLDY AGFRTLDSIVRTRKVFDTNDFIIITQRFHCERALFIALHMGIQAQCYAVPSPKDMLSVRI REFAARFGALADLYIFKREPRFLGPLVPIPAMHQVPEDAQGYPAVTPEQLLELQKKQGK >gi|296493462|gb|ADTK01000039.1| GENE 45 50089 - 50973 901 294 aa, chain - ## HITS:1 COG:cdd KEGG:ns NR:ns ## COG: cdd COG0295 # Protein_GI_number: 16130081 # Func_class: F Nucleotide transport and metabolism # Function: Cytidine deaminase # Organism: Escherichia coli K12 # 1 294 1 294 294 540 99.0 1e-153 MHPRFQTAFAQLADNLQSALEPILADKYFPALLTGEQVSSLKSATGLDEDALAFALLPLA AACARTPLSNFNVGAIARGVSGTWYFGANMEFIGATMQQTVHAEQSAISHAWLSGEKALA AITVNYTPCGHCRQFMNELNSGLDLRIHLPGREAHALRDYLPDAFGPKDLEIKTLLMDEQ DHGYALTGDALSQAAIAAANRSHMPYSKSPSGVTLECKDGRIFSGSYAENAAFNPTLPPL QGALILLNLKGYDYPDIQRAVLAEKADAPLIQWDATSATLKALGCHSIDRVLLA >gi|296493462|gb|ADTK01000039.1| GENE 46 51103 - 51798 732 231 aa, chain - ## HITS:1 COG:yohK KEGG:ns NR:ns ## COG: yohK COG1346 # Protein_GI_number: 16130080 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative effector of murein hydrolase # Organism: Escherichia coli K12 # 1 231 1 231 231 383 100.0 1e-106 MMANIWWSLPLTLIVFFAARKLAARYKFPLLNPLLVAMVVIIPFLMLTGISYDSYFKGSE VLNDLLQPAVVALAYPLYEQLHQIRARWKSIITICFIGSVVAMVTGTSVALLMGASPEIA ASILPKSVTTPIAMAVGGSIGGIPAISAVCVIFVGILGAVFGHTLLNAMRIRTKAARGLA MGTASHALGTARCAELDYQEGAFSSLALVLCGIITSLIAPFLFPIILAVMG >gi|296493462|gb|ADTK01000039.1| GENE 47 51795 - 52193 439 132 aa, chain - ## HITS:1 COG:yohJ KEGG:ns NR:ns ## COG: yohJ COG1380 # Protein_GI_number: 16130079 # 
Func_class: R General function prediction only # Function: Putative effector of murein hydrolase LrgA # Organism: Escherichia coli K12 # 1 132 1 132 132 218 100.0 2e-57 MSKTLNIIWQYLRAFVLIYACLYAGIFIASLLPVTIPGSIIGMLILFVLLALQILPAKWV NPGCYVLIRYMALLFVPIGVGVMQYFDLLRAQFGPVVVSCAVSTLVVFLVVSWSSQLVHG ERKVVGQKGSEE >gi|296493462|gb|ADTK01000039.1| GENE 48 52402 - 53379 985 325 aa, chain + ## HITS:1 COG:yohI KEGG:ns NR:ns ## COG: yohI COG0042 # Protein_GI_number: 16130078 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-dihydrouridine synthase # Organism: Escherichia coli K12 # 11 325 1 315 315 643 99.0 0 MAPHFFAGEDMRVLLAPMEGVLDSLVRELLTEVNDYDLCITEFVRVVDQLLPVKVFHRIC PELQNASRTPSGTLVRVQLLGQFPQWLAENAARAVELGSWGVDLNCGCPSKTVNGSGGGA TLLKDPELIYQGAKAMREAVPAHLPVSVKVRLGWDSGEKKFEIADAVQQAGATELVVHGR TKEQGYRAEHIDWQAIGEIRQRLNIPVIANGEIWDWQSAQQCIAISGCDAVMIGRGALNI PNLSRVVKYNEPRMPWPEVVALLQKYTRLEKQGDTGLYHVARIKQWLSYLRKEYDEATEL FQHVRVLNNSPDIARAIQAIDIEKL >gi|296493462|gb|ADTK01000039.1| GENE 49 54057 - 55493 1204 478 aa, chain + ## HITS:1 COG:ECs3025 KEGG:ns NR:ns ## COG: ECs3025 COG1538 # Protein_GI_number: 15832279 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Escherichia coli O157:H7 # 1 478 28 505 505 876 99.0 0 MNRDSFYPAIACFPLLLMLAGCAPMHETRQALSQQTPAAQVDTALPTALKNGWPDSQWWL EYHDNQLTSLINNALQNAPDMQVAEQRIQLAEAQAKAIATQDGPQIDFSADMERQKMSAE GLMGPFALNDPAAGTTGPWYTNGTFGLTAGWHLDIWGKNRAEVTARLGTVKARAAEREQT RQLLAGSVARLYWEWQTQAALNTVLQQIEKEQNTIIATDRQLYQNGITSSVEGVETDINA SKTRQQLNDVAGKMKIIEARLSALTNNQTKSLKLKPVALPKVASQLPDELGYSLLARRAD LQAAHWYVESSLSTIDAAKAAFYPDINLMAFLQQDALHLSDLFRHSAQQMGVTAGLTLPI FDSGRLNANLDIAKAESNLSIASYNKAVVEAVNDVARAASQVQTLAEKNQHQAQIERDAL RVVGLAQARFNAGIIAGSRVSEARIPALRERANGLLLQGQWLDASIQLTGALGGGYKR >gi|296493462|gb|ADTK01000039.1| GENE 50 55546 - 56307 202 253 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 
[Phaeobacter gallaeciensis BS107] # 57 241 52 238 242 82 30 6e-15 MAQVAIITASDSGIGKECALLLAQQGFDIGITWHSDEEGAKDTAREVVSHGVRAEIVQLD LGKLPEGAQALEKLIQRLGRIDVLVNNAGAMTKAPFLDMAFDEWRKIFTVDVDGAFLCSQ IAARQMVKQGQGGRIINITSVHEHTPLPDASAYTAAKHALGGLTKTMALELVRHKILVNA VAPGAIATPMNGMDDSDVKPDAEPSIPLRRFGATHEIASLVAWLCSEGANYTTGQSLIVD GGFMLANPQFNPE >gi|296493462|gb|ADTK01000039.1| GENE 51 56437 - 57015 671 192 aa, chain - ## HITS:1 COG:yohD KEGG:ns NR:ns ## COG: yohD COG0586 # Protein_GI_number: 16130074 # Func_class: S Function unknown # Function: Uncharacterized membrane-associated protein # Organism: Escherichia coli K12 # 1 192 13 204 204 333 100.0 9e-92 MDLNTLISQYGYAALVIGSLAEGETVTLLGGVAAHQGLLKFPLVVLSVALGGMIGDQVLY LCGRRFGGKLLRRFSKHQDKIERAQKLIQRHPYLFVIGTRFMYGFRVIGPTLIGASQLPP KIFLPLNILGAFAWALIFTTIGYAGGQVIAPWLHNLDQHLKHWVWLILVVVLVVGVRWWL KRRGKKKPDHQA >gi|296493462|gb|ADTK01000039.1| GENE 52 57161 - 57772 595 203 aa, chain + ## HITS:1 COG:no KEGG:SSON_2192 NR:ns ## KEGG: SSON_2192 # Name: yohC # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 203 1 203 203 354 100.0 1e-96 MNSQQGGGMSHVWGLFSHPDREMQVINRENETISHHYTHHVLLMAAIPVICAFIGTTQIG WNFGDGTILKLSWFTGLALAVLFYGVMLAGVAVMGRVIWWMARNYPQRPSLAHCMVFAGY VATPLFLSGLVALYPLVWLCALVGTVALFYTGYLLYLGIPSFLNINKEEGLSFSSSTLAI GVLVLEVLLALTVILWGYGYRLF >gi|296493462|gb|ADTK01000039.1| GENE 53 57985 - 58878 1067 297 aa, chain + ## HITS:1 COG:pbpG KEGG:ns NR:ns ## COG: pbpG COG1686 # Protein_GI_number: 16130072 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Escherichia coli K12 # 1 297 17 313 313 505 99.0 1e-143 MLAVPFAPQAVAKTAAATTASQPEIASGSAMIVDLNTNKAIYSNHPDLVRPIASISKLMT AMVVLDARLPLDEKLKVDISQTPEMKGVYSRVRLNSEISRKDMLLLALMSSENRAAASLA HHYPGGYKAFIKAMNAKAKSLGMNNTRFVEPTGLSVHNVSTARDLTKLLIASKQYPLIGQ LSTTREDMATFSNPTYTLPFRNTNHLVYRDNWNIQLTKTGFTNAAGHCLVMRTVINNKPV ALVVMDAFGKYTHFADASRLRTWIETGKVMPVPAAALSYKKQKAAQMAAAGQTAQND >gi|296493462|gb|ADTK01000039.1| GENE 54 58916 - 60631 1648 571 aa, 
chain - ## HITS:1 COG:ECs3020 KEGG:ns NR:ns ## COG: ECs3020 COG0277 # Protein_GI_number: 15832274 # Func_class: C Energy production and conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Escherichia coli O157:H7 # 1 571 1 571 571 1170 99.0 0 MSSMTTTDNKAFLNELARLVGHSHLLTDPAKTARYRKGFRSGQGDALAVVFPGSLLELWR VLKACVTADKIILMQAANTGLTEGSTPNGNDYDRDIVIISTLRLDKLHVLGKGEQVLAYP GTTLYSLEKALKPLGREPHSVIGSSCIGASVIGGICNNSGGSLVQRGPAYTEMSLFARIN EDGKLTLVNHLGIDLGETPEQILSKLDDDRIKDDDVRHDGRHAHDYDYVHRVRDIEADTP ARYNADPDRLFESSGCAGKLAVFAVRLDTFEAEKNQQVFYIGTNQPEVLTEIRRHILANF ENLPVAGEYMHRDIYDIAEKYGKDTFLMIDKLGTDKMPFFFNLKGRTDAMLEKVKFFRPH FTDRAMQKFGHLFPSHLPPRMKNWRDKYEHHLLLKMAGDGVGEAKSWLVDYFKQAEGDFF VCTPEEGSKAFLHRFAAAGAAIRYQAVHSDEVEDILALDIALRRNDTEWYEHLPPEIDSQ LVHKLYYGHFMCYVFHQDYIVKKGVDVHALKEQMLELLQQRGAQYPAEHNVGHLYKAPET LQKFYRENDPTNSMNPGIGKTSKRKNWQEVE >gi|296493462|gb|ADTK01000039.1| GENE 55 60827 - 63124 2708 765 aa, chain + ## HITS:1 COG:bglX KEGG:ns NR:ns ## COG: bglX COG1472 # Protein_GI_number: 16130070 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Escherichia coli K12 # 1 765 1 765 765 1508 99.0 0 MKWLCSVGIAVSLALQPALADDLFGNHPLTPEARDAFVTELLKKMTVDEKIGQLRLISVG PDNPKEAIREMIKDGQVGAIFNTVTRQDIRAMQDQVMELSRLKIPLFFAYDVLHGQRTVF PISLGLASSFNLDAVKTVGRVSAYEAADDGLNMTWAPMVDVSRDPRWGRASEGFGEDTYL TSTMGKTMVEAMQGKSPADRYSVMTSVKHFAAYGAVEGGKEYNTVDMSPQRLFNDYMPPY KAGLDAGSGAVMVALNSLNGTPATSDSWLLKDVLRDQWGFKGITVSDHGAIKELIKHGTA ADPEDAVRVALKSGINMSMSDEYYSKYLPGLIKSGKVTMEELDDAARHVLNVKYDMGLFN DPYSHLGPKESDPVDTNAESRLHRKEAREVARESLVLLKNRLETLPLKKSATIAVVGPLA DSKRDVMGSWSAAGVADQSVTVLTGIKNSVGENGKVLYAKGANVTSDKGIIDFLNQYEEA VKVDPRSPQEMIDEAVQTAKQSDVVVAVVGEAQGMAHEASSRTDITIPQSQRDLIAALKA TGKPLVLVLMNGRPLALVKEDQQADAILETWFAGTEGGNAIADVLFGDYNPSGKLPMSFP RSVGQIPVYYSHLNTGRPYNADKPNKYTSRYFDEANGALYPFGYGLSYTTFTVSDVKLSA PTMKRDGKVTASVQVTNTGKREGATVVQMYLQDVTASMSRPVKQLKGFEKITLKPGETQT VSFPIDIEALKFWNQQMKYDAEPGKFNVFIGTDSARVKKGEFELL >gi|296493462|gb|ADTK01000039.1| GENE 56 63335 - 
64252 1031 305 aa, chain + ## HITS:1 COG:yehZ KEGG:ns NR:ns ## COG: yehZ COG1732 # Protein_GI_number: 16130069 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) # Organism: Escherichia coli K12 # 1 305 1 305 305 556 99.0 1e-158 MPLLKLWAGSLVMLAAVSLPLQAASPVKVGSKIDTEGALLGNIILQVLESHGVPTVNKVQ LGTTPVVRGAITSGELDIYPEYTGNGAFFFKDENDAAWKNAQQGYEKVKKLDSEHNKLIW LTPAPANNTWTIAVRQDVAEKNKLTSLADLSRYLQEGGTFKLAASAEFIERADALPAFEK AYGFKLGQDQLLSLAGGDTAVTIKAAAQQTSGVNAAMAYGTDGSVAALGLQTLSDPQGVQ PIYAPAPVVRESVLKEYPQMAQWLQPVFASLDAKTLQQLNASIAVEGLDAKKVAADYLKQ KGWTK >gi|296493462|gb|ADTK01000039.1| GENE 57 64259 - 65416 1288 385 aa, chain + ## HITS:1 COG:yehY KEGG:ns NR:ns ## COG: yehY COG1174 # Protein_GI_number: 16130068 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport systems, permease component # Organism: Escherichia coli K12 # 1 385 1 385 385 612 100.0 1e-175 MTYFRINPVLALLLLLTAIAAALPFISYAPNRLVSGEGRHLWQLWPQTIWMLVGVGCAWL TACFIPGKKGSICALILAQFVFVLLVWGAGKAATQLAQNGSALARTSLGSGFWLAAALAL LACSDAIRRISTHPLWRWLLHMQIAIIPLWLLYSGTLNDLSLMKEYANRQDVFDDALAQH LTLLFGAVLPALVIGVPLGIWCYFSTARQGAIFSLLNVIQTVPSVALFGLLIAPLAALVT AFPWLGTLGIAGTGMTPALIALVLYALLPLVRGVVVGLNQIPRDVLESARAMGMSGAQRF LHVQLPLALPVFLRSLRVVMVQTVGMAVIAALIGAGGFGALVFQGLLSSAIDLVLLGVIP VIVLAVLTDALFDLLIALLKVKRND >gi|296493462|gb|ADTK01000039.1| GENE 58 65409 - 66335 1127 308 aa, chain + ## HITS:1 COG:yehX KEGG:ns NR:ns ## COG: yehX COG1125 # Protein_GI_number: 16130067 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport systems, ATPase components # Organism: Escherichia coli K12 # 1 308 1 308 308 588 99.0 1e-168 MIEFSHVSKLFGAQKAVNDLNLNFQEGSFSVLIGTSGSGKSTTLKMINRLVEHDSGVIRF AGEEIRSLPVLELRRRMGYAIQSIGLFPHWSVAQNIATVPQLQKWSRARIDDRIDELMAL LGLESNLRERYPHQLSGGQQQRVGVARALAADPQVLLMDEPFGALDPVTRGALQQEMTRI 
HRLLGRTIVLVTHDIDEALRLAEHLVLMDHGEVVQQGNPLTMLTRPANDFVRQFFGRSEL GVRLLSLRSVADYVRREERADGEALAEEMTLRDALSLFVARGCEVLPVVNIQGQPCGTLH FQDLLVEA >gi|296493462|gb|ADTK01000039.1| GENE 59 66340 - 67071 812 243 aa, chain + ## HITS:1 COG:ECs3015 KEGG:ns NR:ns ## COG: ECs3015 COG1174 # Protein_GI_number: 15832269 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport systems, permease component # Organism: Escherichia coli O157:H7 # 1 243 1 243 243 380 100.0 1e-105 MKMLRDPLFWLIALFVALIFWLPYSQPLFAALFPQLPRPVYQQESFAALALAHFWLVGIS SLFAVIIGTGAGIAVTRPWGAEFRPLVETIAAVGQTFPPVAVLAIAVPVIGFGLQPAIIA LILYGVLPVLQATLAGLGAIDASVTEVAKGMGMSRGQRLRKVELPLAAPVILAGVRTSVI INIGTATIASTVGASTLGTPIIIGLSGFNTAYVIQGALLVALAAIIADRLFERLVQALSQ HAK >gi|296493462|gb|ADTK01000039.1| GENE 60 67052 - 67159 147 35 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRIAKIGVIALFLFMALGGIGGVMLAGYTFILRAG >gi|296493462|gb|ADTK01000039.1| GENE 61 67219 - 67950 663 243 aa, chain - ## HITS:1 COG:yehV KEGG:ns NR:ns ## COG: yehV COG0789 # Protein_GI_number: 16130065 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 243 1 243 243 466 100.0 1e-131 MALYTIGEVALLCDINPVTLRAWQRRYGLLKPQRTDGGHRLFNDADIDRIREIKRWIDNG VQVSKVKMLLSNENVDVQNGWRDQQETLLTYLQSGNLHSLRTWIKERGQDYPAQTLTTHL FIPLRRRLQCQQPTLQALLAILDGVLINYIAICLASARKKQGKDALVVGWNIQDTTRLWL EGWIASQQGWRIDVLAHSLNQLRPELFEGRTLLVWCGENRTSAQQQQLTSWQEQGHDIFP LGI >gi|296493462|gb|ADTK01000039.1| GENE 62 68172 - 69857 1220 561 aa, chain + ## HITS:1 COG:yehU KEGG:ns NR:ns ## COG: yehU COG3275 # Protein_GI_number: 16130064 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Escherichia coli K12 # 1 561 1 561 561 1078 100.0 0 MYDFNLVLLLLQQMCVFLVIAWLMSKTPLFIPLMQVTVRLPHKFLCYIVFSIFCIMGTWF GLHIDDSIANTRAIGAVMGGLLGGPVVGGLVGLTGGLHRYSMGGMTALSCMISTIVEGLL GGLVHSILIRRGRTDKVFNPITAGAVTFVAEMVQMLIILAIARPYEDAVRLVSNIAAPMM 
VTNTVGAALFMRILLDKRAMFEKYTSAFSATALKVAASTEGILRQGFNEVNSMKVAQVLY QELDIGAVAITDREKLLAFTGIGDDHHLPGKPISSTYTLKAIETGEVVYADGNEVPYRCS LHPQCKLGSTLVIPLRGENQRVMGTIKLYEAKNRLFSSINRTLGEGIAQLLSAQILAGQY ERQKAMLTQSEIKLLHAQVNPHFLFNALNTIKAVIRRDSEQASQLVQYLSTFFRKNLKRP SEFVTLADEIEHVNAYLQIEKARFQSRLQVNIAIPQELSQQQLPAFTLQPIVENAIKHGT SQLLDTGRVAISARREGQHLMLEIEDNAGLYQPVTNASGLGMNLVDKRLRERFGDDYGIS VACEPDSYTRITLRLPWRDEA >gi|296493462|gb|ADTK01000039.1| GENE 63 69854 - 70573 832 239 aa, chain + ## HITS:1 COG:ECs2936 KEGG:ns NR:ns ## COG: ECs2936 COG3279 # Protein_GI_number: 15832190 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Escherichia coli O157:H7 # 1 239 6 244 244 459 100.0 1e-129 MIKVLIVDDEPLARENLRVFLQEQSDIEIVGECSNAVEGIGAVHKLRPDVLFLDIQMPRI SGLEMVGMLDPEHRPYIVFLTAFDEYAIKAFEEHAFDYLLKPIDEARLEKTLARLRQERS KQDVSLLPENQQALKFIPCTGHSRIYLLQMKDVAFVSSRMSGVYVTSHEGKEGFTELTLR TLESRTPLLRCHRQYLVNLAHLQEIRLEDNGQAELILRNGLTVPVSRRYLKSLKEAIGL >gi|296493462|gb|ADTK01000039.1| GENE 64 70620 - 71090 450 156 aa, chain + ## HITS:1 COG:yehS KEGG:ns NR:ns ## COG: yehS COG4807 # Protein_GI_number: 16130062 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 156 1 156 156 302 100.0 1e-82 MLSNDILRSVRYILKANNNDLVRILALGNVEATAEQIAVWLRKEDEEGFQRCPDIVLSSF LNGLIYEKRGKDESAPALEPERRINNNIVLKKLRIAFSLKTDDILAILTEQQFRVSMPEI TAMMRAPDHKNFRECGDQFLRYFLRGLAARQHVKKG Prediction of potential genes in microbial genomes Time: Mon May 16 15:05:00 2011 Seq name: gi|296493461|gb|ADTK01000040.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont101.5, whole genome shotgun sequence Length of sequence - 15363 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 4, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 13 - 474 535 ## COG4808 Uncharacterized protein conserved in bacteria - Prom 532 - 591 1.7 - Term 490 - 520 2.0 2 2 Op 1 . - CDS 599 - 2599 1037 ## COG2801 Transposase and inactivated derivatives 3 2 Op 2 . - CDS 2596 - 3732 854 ## ECIAI1_2197 hypothetical protein 4 2 Op 3 . - CDS 3725 - 6004 1668 ## EcE24377A_2408 hypothetical protein 5 2 Op 4 . - CDS 6015 - 7103 943 ## COG0714 MoxR-like ATPases - Prom 7178 - 7237 7.6 - Term 7202 - 7246 7.7 6 3 Op 1 . - CDS 7410 - 7727 124 ## B21_02005 hypothetical protein 7 3 Op 2 . - CDS 7788 - 9653 1326 ## ECO103_2592 hypothetical protein - Prom 9719 - 9778 3.2 8 4 Op 1 . - CDS 9936 - 11420 1104 ## ECIAI1_2191 hypothetical protein 9 4 Op 2 . - CDS 11430 - 15224 2198 ## COG3831 Uncharacterized conserved protein - Prom 15261 - 15320 2.4 Predicted protein(s) >gi|296493461|gb|ADTK01000040.1| GENE 1 13 - 474 535 153 aa, chain - ## HITS:1 COG:ECs2934 KEGG:ns NR:ns ## COG: ECs2934 COG4808 # Protein_GI_number: 15832188 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 153 5 157 157 245 100.0 3e-65 MKAFNKLFSLVVASVLVFSLAGCGDKEESKKFSANLNGTEIAITYVYKGDKVLKQSSETK IQFASIGATTKEDAAKTLEPLSAKYKNIAGVEEKLTYTDTYAQENVTIDMEKVDFKALQG ISGINVSAEDAKKGITMAQMELVMKAAGFKEVK >gi|296493461|gb|ADTK01000040.1| GENE 2 599 - 2599 1037 666 aa, chain - ## HITS:1 COG:ECs2931 KEGG:ns NR:ns ## COG: ECs2931 COG2801 # Protein_GI_number: 15832185 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 666 1 667 745 1264 96.0 0 MNSLRPELLELTPQALTALSNAGFVKRSLKELENGNVPEISHENSALIATFSDGVRTQLA NGQALKEAQCSCGASGMCRHRVMLVLSYQRLCTTAQPTEKEEAWDPAIWLEELATLPDAT RKRAQALVAKGITIELFCTPGEIPSARLPMSDVRFYSRSSIRFARCDCIEGTLCEHVVLA VQAFVEAKTQQAEFTHLIWQMRSEHVTSSDDPFANDEGNACRQYVQQLSQALWLGGISQP LIHYEAAFSRAQQAAERCNWRWVSESLRQLRASVDAFHVRASHYHAGECLRQLAALNSRL NCAQEMARRDSVGEVPPVPWRTVVGSGIAGEAKLDHLRLVSLGMRCWQDIEHYGLRIWFT 
DPDTGSILHLSRSWPRSEQENSPAATRRLFSFQAGALAGGQIVSQAAKRSADGELLLATR NRLSSVVPLSPDAWQMLSAPLRQPGIVALREYLRQRPPACIRPLNQVDNLFILPVAECIS LGWDSSRQTLDAQVISGEGEDNLLTLSLPASACSPFAVERMAALLQQTDDPVSLVSGFVS FVDGQLTLEPRVMMTKTRAWALDAETAPVAPLPSASVLPVPSTAHQLLMRCQALLIQLLH NGWRYQEQSAISQAELLANDLSAVGFYRLAHVLGQFRNTESEARVEAMNNGVLLCEQLFP MLQQQG >gi|296493461|gb|ADTK01000040.1| GENE 3 2596 - 3732 854 378 aa, chain - ## HITS:1 COG:no KEGG:ECIAI1_2197 NR:ns ## KEGG: ECIAI1_2197 # Name: yehP # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 378 1 378 378 737 99.0 0 MSELNDLLTTRELQRWRLILGEAAETTLCGLDDNARQIDHALEWLYGRDPERLQRGERSG GLGGSNLTTPEWINSIHTLFPQQVIERLESDAVLRYGIEDVVTNLDVLERMQPSESLLRA VLHTKHLMNPEVLAAARQIVRQVVEEIMARLAKEVRQAFSGVRDRRRRSLIPLARNFDFK STLRANLQHWHPQHGKLYIESPRFNSRIKRQSEQWQLVLLVDQSGSMVDSVIHSAVMAAC LWQLPGIRTHLVAFDTSVVDLTADVADPVELLMKVQLGGGTNIASAVEYGRQLIEQPAKS VIILVSDFYEGGSSSLLTHQVKKCVQSGIKVLGLAALDSTATPCYDHDTAQALVNVGAQI AAMTPGELASWLAENLQS >gi|296493461|gb|ADTK01000040.1| GENE 4 3725 - 6004 1668 759 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_2408 NR:ns ## KEGG: EcE24377A_2408 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 759 1 759 759 1403 99.0 0 MSEPLIVGIRHHSPACARLVKSLIESQRPRYVLIEGPADFNDRVDELFFSHQLPVAIYSY CQYQDGAAPGRGAWTPFAEFSPEWQALQAARRIQAQTYFIDLPCWAQSEEEDDSPDTQDE SQALLLRATRMDNSDTLWDHLFEDESQQTALPSALAHYFAQLRGDFPGDALNRQREAFMA RWIAWAVQQNNGDVLVVCGGWHAPALAKMWRECPQDINTPELPSLADAITGCYLTPYSEK RLDVLAGYLSGMPAPVWQNWCWQWGLQQAGEQLLKTVLTRLRQHNLPASTADMAAAHLHA MALAQLRGHTLPLRTDWLDAIAGSLIKEALNAPLPWSYRGVIHPDTDPILLTLIDTLAGD GFGKLAPSTPQPPLPKDVTCELERTAISLPAELTLNRFTPDGLAQSQVLHRLAILEIPGI VRQQGSTLTLAGNGEEHWKLTRPLSQHAALIEAACFGATLQEAARHKLEADMLDAGGIGS ITTCLSQAALAGLASFSQQLLEQLTLLIAQENQFAEMGQALEVLYALWRLDEISGMQGAQ ILQTTLCAAIDRTLWLCESNGRPDEKEFHAHLHSWQALCHILRDLHSGVNLSGVSLSAAV ALLERRSQAIHAPALDRGAALGALMRLEHPNASAEAALTMLAQLSPAQSGEALHGLLALA RHQLACQPAFIAGFSSHLNQLSDANFINALPDLRAAMAWLPPRERGTLAHQVLEHYQLAQ LPVSALQMPLHCPPQAIAHHQQLEQQALASLQHWGVFHV 
>gi|296493461|gb|ADTK01000040.1| GENE 5 6015 - 7103 943 362 aa, chain - ## HITS:1 COG:ECs2927 KEGG:ns NR:ns ## COG: ECs2927 COG0714 # Protein_GI_number: 15832181 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Escherichia coli O157:H7 # 1 362 23 384 384 716 99.0 0 MSPQNNHLQRPPAAVLYADELAKLKQNDNAPCPPGWQLSLPAARAFILGDSAQNISRKVV ISPSAVERMLVTLATGRGLMLVGEPGTAKSLLSELLATAISGDAGLTIQGGASTTEDQIK YGWNYALLINHGPSTEALVPAPLYQGMRDGKIVRFEEITRTPLEVQDCLLGMLSDRVMTV PELTGEASQLYAREGFNIIATANTRDRGVNEMSAALKRRFDFETVFPIMDFAQELELVAS ASARLLAHSGIPHKVPDAVLELLVRTFRDLRANGEKKTSMDTLTAIMSTAEAVNVAHAVG VRAWFLANRAGEPADLVDCIAGTIVKDNEEDRARLRRYFEQRVATHKEAHWQAYYQARHR LP >gi|296493461|gb|ADTK01000040.1| GENE 6 7410 - 7727 124 105 aa, chain - ## HITS:1 COG:no KEGG:B21_02005 NR:ns ## KEGG: B21_02005 # Name: yehK # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 105 1 105 105 189 100.0 3e-47 MIVQKELVAIYDYEVPVPEDPFSFRLEIHKCSELFTGSVYRLERFRLRPTFHQRDREDAD PLINDALIYIRDECIDERKLRGESPETVIAIFNRELQNIFNQEIE >gi|296493461|gb|ADTK01000040.1| GENE 7 7788 - 9653 1326 621 aa, chain - ## HITS:1 COG:no KEGG:ECO103_2592 NR:ns ## KEGG: ECO103_2592 # Name: yehI # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 621 590 1210 1210 1221 100.0 0 MALTFWLRIIEKKHLFAGEDYFLSILGLDALPGLLLAFSHRPKETFPLILNFGATELALP VARVWRRFAAQRDLARQWILQWPEHTASALIPLVFTKPSDNSEAALLALRLLYEQGHGEL LQTVANRWQRTDVWSALEQLLKQGPMDIYPARIPKAPDFWHPAMWSRPRLITNNQPVTGD ALEIIGEMLRFTQGGRFYSGLEQLKTFCQPQTLAAFAWDLFTAWQQAGAPAKDNWAFLAL SLFGDESTARDLTTQILAWPQEGKSARAVSGLNILTLMNNDMALIQLHHISQRAKSRPLR DNAAEFLQVVAENRGLSQEELADRLVPTLGLDDPQALSFDFGPRQFTVRFDENLNPVIFD QQNVRQKSVPRLRADDDQLKAPEALARLKGLKKDATQVSKNLLPRLEAALRTTRRWSLAD FHTLFVNHPFTRLVTQRLIWGVYPANEPRRLLNAFRVAAEGEFCNAQDEPIDLPADALIG IAHPLEMTAEMRSEFAQLFADYEIIPPLRQLTRRTVLLTPDESASNSLNRWEGKSATVGQ LMGMRYKGWESGYEDAFVYDLGEYRLVLKFSPGFNHYNVDSKALMSFRSLRVYSDNKSVT FAELDVFDLSEALSAPDVIFH >gi|296493461|gb|ADTK01000040.1| GENE 8 9936 - 11420 1104 494 aa, chain 
- ## HITS:1 COG:no KEGG:ECIAI1_2191 NR:ns ## KEGG: ECIAI1_2191 # Name: yehI # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 494 1 494 1210 914 97.0 0 MDKELPWLADNAQLELKYKKGKTPLSHRKWPGEPVPVITESIIQTLGDKLLQKAEKKKNI VWRYENFSLEWQSAITQAINLIGEHKPSIPARTMAALVCIAQNDSQQLLDEIVQQEGLEY ATEVVIARQFITRCYESDPLLVTLQYQDEDYGYGYRSETYNEFDLRLRKHLSLAEESSWQ RCADKLIAALPGIPKIRRPFIALLLPEKPEIANELASLESSRSSLHSKEWLKVIATDNTA VKKLERYWGLDVFSDREASYMSQENRFGYAACASLLREQGLAAVPRLAMYAHKEDCGSLL VQINHPQVIRTLLLVADKNKPSLQRVAKYSKNFPHATLAALAELLALKEPPARPGYPIIE DKKLPAQQKARDEYWRTLLQTLMASQPQLAAEVMPWLSTQARAVLNSYLSAPPKPVIDST DNSSLPEMLVSPPWRSKKKMTAPRLDLAPLELTPQIYWQPGEQERLAATESARYFSTESL AERMEQKSGRVVLQ >gi|296493461|gb|ADTK01000040.1| GENE 9 11430 - 15224 2198 1264 aa, chain - ## HITS:1 COG:molR_g1_1 KEGG:ns NR:ns ## COG: molR_g1_1 COG3831 # Protein_GI_number: 16130053 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 112 1 112 112 201 96.0 6e-51 MRHFIYQDEKSHKFWAVEQQDNELHISWGKIGTHGQSQIKSFSDAAAAAKAELKLIAEKV KKGYVEQAKDNSLQPSQTVTGSLKVADLSTIIQEQPSFVAETRAPDKNTDAVLPWLAKDI AVVFPPEVVHTTLSHRRFPGVPVQQADKLTQLRRLACSVSQRDNKTATFDFSACSLEWQN TVAQAISQIDGLKTTQLPSPVMAVLTALEMKCTRYKVREDVMDQIVQEGGLEYATDVIIH LQQIDIEWDYANNVIIILPSGIAPSYLEQYSRFELRLRKHLSLTEESLWQKCAQKLIAAI PHIPEWRQPLIALLLPEKPEIAHEIAQRLLGQKKLPSLEWLKIVATDEHILASLEKYHEP YAIFDDYYCGAIWSATVLQEQGVAALPRFAPYAASDYCADVLRHINHPFALTLLIRVAGH TKRCHDRMTKAIAAFPHAAMAALTELLGQKEENSWRIMLMTMLISQPALAEQVIPWLSTP AVAVLKSCQQQLTQPSNHASADLLPAVVVSPPWLSKKKKSPIPVLDLAPLGIEPICYLTE EISNQLLAKYIWYSKHITVSHEESTTNLLARMGFQRRIAGTYIKAPEAVVEAWLNEDYST LLSEFKVFHSPTGHYWQLGILTTLPLEKAVKAWNALTLSPHTDTEYAMLHFGLKGLPGLV NSLARYPQEALPITNYFAASELAPAVARAFNKLKTLRENARSWLLKYPEHALTGLLPAAL GKAGEAQDNARAALRMLTENGHQPLLQEIARRYNQPEVTDAVNALLALDPLDNHPTKIPT LPAFYQPSLWTRPVLKANAQSLPDSALLHLGEMLRFPQEEALYPGLLQVKDVCSADSLAG FAWDLFTAWQTAGAPSKESWAFTALGVLGNDDTARKLTPLIRAWPGESQHKRATVGLDIL AAIGSDIALMQLNGIAQKLKFKALQERAKEKIADIAESRELTVAELEDRLAPDLGLDDNG 
SLLLDFGPRQFTVSFDETLKPFVRDVSGSRLKDLPKPNKSDDETRANDAVNRYKLLKKDA RTIAAQQVARLESAMCLRRRWSLENFQLFLVEHPLVRHLTRRLIWGVYSAENQLLACFRV AEDNSYSTADDDLFTLPEGDISIGTPHVLEISPTDAAAFGQLFADYELLPPFRQLDRNSY ALTEAKRNASELTRWAGRKCPSGRVMGLANKGWIKGEPQDGGWIGWMIKPLGRWSLIMEI DEGFAVGMSPAELSAEQLLSKLWLWEGKAESYGWGSNSTQEAQFSVLDAITASELINDIE ALFE Prediction of potential genes in microbial genomes Time: Mon May 16 15:05:35 2011 Seq name: gi|296493460|gb|ADTK01000041.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont101.6, whole genome shotgun sequence Length of sequence - 24553 bp Number of predicted genes - 25, with homology - 25 Number of transcription units - 11, operones - 5 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 12 - 2045 2342 ## COG0143 Methionyl-tRNA synthetase - Prom 2075 - 2134 4.4 + Prom 2090 - 2149 2.7 2 2 Op 1 . + CDS 2177 - 3286 1115 ## COG0489 ATPases involved in chromosome partitioning + Term 3325 - 3388 10.0 + Prom 3407 - 3466 2.8 3 2 Op 2 . + CDS 3549 - 3830 180 ## G2583_2648 hypothetical protein + Term 3859 - 3908 9.5 + Prom 3984 - 4043 7.1 4 3 Op 1 7/0.000 + CDS 4123 - 4665 442 ## COG3539 P pilus assembly protein, pilin FimA + Term 4702 - 4736 5.1 5 3 Op 2 10/0.000 + CDS 4745 - 5419 399 ## COG3121 P pilus assembly protein, chaperone PapD 6 3 Op 3 . + CDS 5435 - 7915 1672 ## COG3188 P pilus assembly protein, porin PapC + Prom 7950 - 8009 4.5 7 4 Tu 1 . + CDS 8126 - 8965 384 ## EC55989_2363 putative exported fimbrial-like adhesin protein + Term 9022 - 9052 4.3 8 5 Tu 1 1/1.000 - CDS 9047 - 9385 312 ## COG5455 Predicted integral membrane protein - Prom 9433 - 9492 3.6 - Term 9491 - 9521 1.1 9 6 Tu 1 . 
- CDS 9604 - 10428 578 ## COG2215 ABC-type uncharacterized transport system, permease component - Prom 10455 - 10514 3.9 + Prom 10466 - 10525 5.0 10 7 Tu 1 1/1.000 + CDS 10549 - 10821 357 ## COG1937 Uncharacterized protein conserved in bacteria + Prom 10835 - 10894 3.0 11 8 Op 1 12/0.000 + CDS 11125 - 11832 386 ## COG2145 Hydroxyethylthiazole kinase, sugar kinase family 12 8 Op 2 2/0.750 + CDS 11829 - 12629 691 ## COG0351 Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase 13 8 Op 3 . + CDS 12694 - 12996 285 ## COG3757 Lyzozyme M1 (1,4-beta-N-acetylmuramidase) 14 8 Op 4 2/0.750 + CDS 13019 - 13516 357 ## COG3757 Lyzozyme M1 (1,4-beta-N-acetylmuramidase) 15 8 Op 5 . + CDS 13568 - 14314 581 ## COG2188 Transcriptional regulators - Term 14195 - 14251 11.1 16 9 Op 1 5/0.500 - CDS 14288 - 15253 782 ## COG0524 Sugar kinases, ribokinase family 17 9 Op 2 4/0.750 - CDS 15250 - 16254 1088 ## COG1397 ADP-ribosylglycohydrolase 18 9 Op 3 . - CDS 16251 - 17528 1414 ## COG0477 Permeases of the major facilitator superfamily - Prom 17576 - 17635 3.0 + Prom 17579 - 17638 4.0 19 10 Tu 1 . + CDS 17785 - 18837 1134 ## COG1830 DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes + Term 19000 - 19033 5.2 + Prom 18909 - 18968 4.3 20 11 Op 1 . + CDS 19147 - 20001 693 ## COG0191 Fructose/tagatose bisphosphate aldolase 21 11 Op 2 4/0.750 + CDS 20030 - 21292 993 ## COG4573 Predicted tagatose 6-phosphate kinase 22 11 Op 3 13/0.000 + CDS 21302 - 21754 333 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 23 11 Op 4 10/0.000 + CDS 21785 - 22069 330 ## COG3414 Phosphotransferase system, galactitol-specific IIB component 24 11 Op 5 7/0.000 + CDS 22073 - 23428 1461 ## COG3775 Phosphotransferase system, galactitol-specific IIC component 25 11 Op 6 . 
+ CDS 23476 - 24516 691 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases Predicted protein(s) >gi|296493460|gb|ADTK01000041.1| GENE 1 12 - 2045 2342 677 aa, chain - ## HITS:1 COG:ZmetG_1 KEGG:ns NR:ns ## COG: ZmetG_1 COG0143 # Protein_GI_number: 15802593 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA synthetase # Organism: Escherichia coli O157:H7 EDL933 # 1 567 1 567 567 1205 99.0 0 MTQVAKKILVTCALPYANGSIHLGHMLEHIQADVWVRYQRMRGHEVNFICADDAHGTPIM LKAQQLGITPEQMIGEMSQEHQTDFAGFNISYDNYHSTHSEENRQLSELIYSRLKENGFI KNRTISQLYDPEKGMFLPDRFVKGTCPKCKSPDQYGDNCEVCGATYSPTELIEPKSVVSG ATPVMRDSEHFFFDLPSFSEMLQAWTRSGALQEQVANKMQEWFESGLQQWDISRDAPYFG FEIPNAPGKYFYVWLDAPIGYMGSFKNLCDKRGDSVSFDEYWKKDSTAELYHFIGKDIVY FHSLFWPAMLEGSNFRKPTNLFVHGYVTVNGAKMSKSRGTFIKASTWLNHFDADSLRYYY TAKLSSRIDDIDLNLEDFVQRVNADIVNKVVNLASRNAGFINKRFDGVLASELADPQLYK TFTDAAEVIGEAWESREFGKAVREIMALADLANRYVDEQAPWVVAKQEGRDADLQAICSM GINLFRVLMTYLKPVLPKLTERAEAFLNTELTWDGIQQPLLGHKVNPFKALYNRIDMKQV EALVEASKEEVKAAAAPVTGPLADDPIQETITFDDFAKVDLRVALIENAEFVEGSDKLLR LTLDLGGEKRNVFSGIRSAYPDPQALIGRHTIMVANLAPRKMRFGISEGMVMAAGPGGKD IFLLSPDAGAKPGHQVK >gi|296493460|gb|ADTK01000041.1| GENE 2 2177 - 3286 1115 369 aa, chain + ## HITS:1 COG:ECs2919 KEGG:ns NR:ns ## COG: ECs2919 COG0489 # Protein_GI_number: 15832173 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Escherichia coli O157:H7 # 1 369 11 379 379 726 100.0 0 MNEQSQAKSPEALRAMVAGTLANFQHPTLKHNLTTLKALHHVAWMDDTLHVELVMPFVWH SAFEELKEQCSAELLRITGAKAIDWKLSHNIATLKRVKNQPGINGVKNIIAVSSGKGGVG KSSTAVNLALALAAEGAKVGILDADIYGPSIPTMLGAENQRPTSPDGTHMAPIMSHGLAT NSIGYLVTDDNAMVWRGPMASKALMQMLQETLWPDLDYLVLDMPPGTGDIQLTLAQNIPV TGAVVVTTPQDIALIDAKKGIVMFEKVEVPVLGIVENMSVHICSNCGHHEPIFGTGGAEK LAEKYHTQLLGQMPLHISLREDLDKGTPTVISRPESEFTAIYRQLADRVAAQLYWQGEVI PGEISFRAV >gi|296493460|gb|ADTK01000041.1| GENE 3 3549 - 3830 180 93 aa, chain + ## HITS:1 COG:no KEGG:G2583_2648 NR:ns ## 
KEGG: G2583_2648 # Name: yehE # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 93 1 93 93 181 100.0 5e-45 MNKYWLSGIIFLAYGLASPAFSSETATLTINGRISPPTCSMAMVNSQPQQHCGQLTYNVD TRHQVSSPVKGVTTEVVVAGSDSKRRIVLNRYD >gi|296493460|gb|ADTK01000041.1| GENE 4 4123 - 4665 442 180 aa, chain + ## HITS:1 COG:yehD KEGG:ns NR:ns ## COG: yehD COG3539 # Protein_GI_number: 16130049 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 180 1 180 180 315 99.0 4e-86 MKRSIIAAAVFSSFFMSAGVFAADVDTGTLTIKGNIAESPCKFEAGGDSVSINMPTVPTT VFEGKAKYSTYDDAVGVTSSMLKISCPKEVAGVKLSLITNDKITGNDKAIASSNDTVGYY LYLGDNSDVLDVSAPFNIESYKTAEGQYAIPFKAKYLKLTDNSVQSGDVLSSLVMRVAQD >gi|296493460|gb|ADTK01000041.1| GENE 5 4745 - 5419 399 224 aa, chain + ## HITS:1 COG:yehC KEGG:ns NR:ns ## COG: yehC COG3121 # Protein_GI_number: 16130048 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli K12 # 1 224 16 239 239 409 99.0 1e-114 MKGLLSLLIFSMVLPAHAGIVIYGTRIIYPAENKEVMVQLMNQGNRSSLLQAWIDDGDTS LPPEKIQVPFMLTPPVAKIGANSGQQVKIKIMPNKLPTNKESIFYLNVLDIPPNSPEQEG KNALKSAMQNRIKLFYRPAGIAPVNKATFKKLLVNRSGNGLVIKNDSANWVTISDVKANN VKVNYETIMIAPLESQSVNVKSNNANNWYLTIIDDHGNYISDKI >gi|296493460|gb|ADTK01000041.1| GENE 6 5435 - 7915 1672 826 aa, chain + ## HITS:1 COG:yehB KEGG:ns NR:ns ## COG: yehB COG3188 # Protein_GI_number: 16130047 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli K12 # 1 826 1 826 826 1615 98.0 0 MLRMTPLASAIVALLLGIEAYAAEETFDTHFMIGGMKDQQVANIRLDDNQPLPGQYDIDI YVNKQWRGKYEIIVKDNPQETCLSREVIKRLGINSDNFASGKQCLTFEQLVQGGSYTWDI GVFRLDFSVPQAWGEELESGYVPPENWERGINAFYTSYYVSQYYSDYKASGNSKSTYVRF NSGLNLLGWQLHSDASFSKTNNNPGVWKSNTLYLERGFAQLLGTLRVGDMYTSSDIFDSV 
RFRGVRLFRDMQMLPNSKQNFTPRVQGIAQSNALVTIEQNGFVVYQKEVPPGPFAITDLQ LAGGGADLDVSVKEADGSVTTYLVPYAAVPNMLQPGVSKYDFAAGRSHIEGASKHSDFVQ AGYQYGFNNLLTLYGGSMVANNYYAFTLGTGWNTRIGAISVDATKSHSKQDNGDVFDGQS YQIAYNKFVSQTSTRFGLAAWRYSSRDYRTFNDHVWANNKDNYRRDENDVYDIADYYQND FGRKNSFSANMSQSLPEGWGSVSLSTLWRDYWGRSGSSKDYQLSYSNNLRRISYTLAASQ AYDENHHEEKRFNIFISIPFDWGDDVSTPRRQIYMSNSTTFDDQGFASNNTGLSGTVGSR DQFNYGVNLSHQHQGNETTAGANLTWNAPVATVNGSYSQSSTYRQAGASVSGGIVAWSGG VNLANRLSETFAVMNAPGIKDAYVNGQKYRTTNRNGVVVYDGMTPYRENHLMLDVSQSDS EAELRGNRKIAAPYRGAVVLVNFDTDQRKPWFIKALRADGQPLTFGYEVNDIHGHNIGVV GQGSQLFIRTNEIPPSVNVAIDKQQGLSCTITFGKEIDESRNYICQ >gi|296493460|gb|ADTK01000041.1| GENE 7 8126 - 8965 384 279 aa, chain + ## HITS:1 COG:no KEGG:EC55989_2363 NR:ns ## KEGG: EC55989_2363 # Name: yehA # Def: putative exported fimbrial-like adhesin protein # Organism: E.coli_55989 # Pathway: not_defined # 1 279 66 344 344 449 98.0 1e-125 MSSSDIIVGLYNDTIKLNLHFEWTNKNNITLSNNQTSFTSGYSVTVTPAASNAKVNVSAG GGGSVMINGVATLSSASSSTRGSAAVQFLLCLLGGKSWDACVNSYRNALAQNAGVYSFNL TLSYNPITTTCKPDDLLITLDSIPVSQLPATGNKATINSKKEDIILRCKNLLGQQNQTSR KMQVYLSSSDLLTNSNTILKGAEDNGVGFILESNGSPVTLLNITNSSKGYTNLKEIAAKS KLTDTTVSIPITASYYVYDTNKIKSGALEATALINVKYD >gi|296493460|gb|ADTK01000041.1| GENE 8 9047 - 9385 312 112 aa, chain - ## HITS:1 COG:yohN KEGG:ns NR:ns ## COG: yohN COG5455 # Protein_GI_number: 16130045 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Escherichia coli K12 # 1 112 61 172 172 207 99.0 3e-54 MTIKNKMLLGVLLLVTSAAWAAPATAGSTNTSGISKYELSSFIADFKHFKPGDTVPEMYR TDEYNIKQWQLRNLPAPDAGTHWTYMGGAYVLISDTDGKIIKAYDGEIFYHR >gi|296493460|gb|ADTK01000041.1| GENE 9 9604 - 10428 578 274 aa, chain - ## HITS:1 COG:ZyohM KEGG:ns NR:ns ## COG: ZyohM COG2215 # Protein_GI_number: 15802585 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Escherichia coli O157:H7 EDL933 # 1 274 1 283 283 448 90.0 1e-126 
MTEFTTLLQQGNAWFFIPSAILLGALHGLEPGHSKTMMAAFIIAIKGTIKQAVMLGLAAT ISHTAVVWLIAFGGMVISKRFTAQSAEPWLQLISAVIIIGTAFWMFWRTWRGERNWLENM HEYDHEHHHHDHEDHHDHGHHHHHEHGEYQDAHARAHANDIKRRFDGREVTNWQILLFGL TGGLIPCPAAITVLLICIQLKALTLGATLVVSFSIGLALTLVTVGVGAAISVQQVAKRWS GFNTLAKRAPYFSSLLIGLVGVYMGVHGFMGIMR >gi|296493460|gb|ADTK01000041.1| GENE 10 10549 - 10821 357 90 aa, chain + ## HITS:1 COG:ECs2911 KEGG:ns NR:ns ## COG: ECs2911 COG1937 # Protein_GI_number: 15832165 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 90 1 90 90 146 100.0 1e-35 MSHTIRDKQKLKARASKIQGQVVALKKMLDEPHECAAVLQQIAAIRGAVNGLMREVIKGH LTEHIVHQGDELKREEDLDVVLKVLDSYIK >gi|296493460|gb|ADTK01000041.1| GENE 11 11125 - 11832 386 235 aa, chain + ## HITS:1 COG:ECs2907 KEGG:ns NR:ns ## COG: ECs2907 COG2145 # Protein_GI_number: 15832161 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxyethylthiazole kinase, sugar kinase family # Organism: Escherichia coli O157:H7 # 1 235 28 262 262 417 98.0 1e-117 MTNDVVQTFTANTLLALGASPAMVIETEEASQFAAIASALLINVGTLTQPRAQAMRAAVE QAKSSQTPWTLDPVAVGALDYRRHFCHELLSFKPAAIRGNASEIMALAGIANGGRGVDTT DAAANAIPAAQTLARETGAIVVVTGEVDYVTDGHRAVGIHGGDPLMTKVVGTGCALSAVV AACCALPGDTLENVASACHWMKQAGERAVARSEGPGSFVPHFLDALWQLTQEVQA >gi|296493460|gb|ADTK01000041.1| GENE 12 11829 - 12629 691 266 aa, chain + ## HITS:1 COG:thiD KEGG:ns NR:ns ## COG: thiD COG0351 # Protein_GI_number: 16130041 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase # Organism: Escherichia coli K12 # 1 266 1 266 266 516 100.0 1e-146 MKRINALTIAGTDPSGGAGIQADLKTFSALGAYGCSVITALVAQNTRGVQSVYRIEPDFV AAQLDSVFSDVRIDTTKIGMLAETDIVEAVAERLQRYQIQNVVLDTVMLAKSGDPLLSPS AVATLRSRLLPQVSLITPNLPEAAALLDAPHARTEQEMLEQGRSLLAMGCGAVLMKGGHL DDEQSPDWLFTREGEQRFTAPRIMTKNTHGTGCTLSAALAALRPRHTNWADTVQEAKSWL SSALAQADTLEVGHGIGPVHHFHAWW >gi|296493460|gb|ADTK01000041.1| GENE 13 12694 - 12996 285 100 aa, chain + ## HITS:1 COG:yegX KEGG:ns 
NR:ns ## COG: yegX COG3757 # Protein_GI_number: 16130040 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lyzozyme M1 (1,4-beta-N-acetylmuramidase) # Organism: Escherichia coli K12 # 1 99 4 102 275 201 100.0 3e-52 MQLRITSRKKLTSLLCALGLISIVAIYPRQTVNFFYSTAVQITDYIHFYGYRPVKSFAIR IPASYTIHGIDVSRWQERIDWQRVAKMRDNGIRLQFAFIY >gi|296493460|gb|ADTK01000041.1| GENE 14 13019 - 13516 357 165 aa, chain + ## HITS:1 COG:yegX KEGG:ns NR:ns ## COG: yegX COG3757 # Protein_GI_number: 16130040 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lyzozyme M1 (1,4-beta-N-acetylmuramidase) # Organism: Escherichia coli K12 # 1 165 111 275 275 337 97.0 5e-93 MDPYFSRNWQLSRENGLLRGAYHYFSPSVAAPVQARLFLQTVDFSQGDFPAVLDVEERGK LSAKELRKRVSQWLKMVEKSTGKKPIIYSGAVFYHTNLAGYFNEYPWWVAHYYQRRPDND GMAWRFWQHSDRGQVDGINGPVDFNVFHGTVEELQAFVDGIKETP >gi|296493460|gb|ADTK01000041.1| GENE 15 13568 - 14314 581 248 aa, chain + ## HITS:1 COG:ECs2904 KEGG:ns NR:ns ## COG: ECs2904 COG2188 # Protein_GI_number: 15832158 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 248 1 248 248 477 100.0 1e-135 MEQAHTQLIAQLNERILAADNTPLYIKFAETVKNAVRSGVLEHGNILPGERDLSQLTGVS RITVRKAMQALEEEGVVTRSRGYGTQINNIFEYSLKEARGFSQQVVLRGKKPDTLWVNKR VVKCPEEVAQQLAVEAGSDVFLLKRIRYVDEEAVSIEESWVPAHLIHDVDAIGISLYDYF RSQHIYPQRTRSRVSARMPDAEFQSHIQLDSKIPVLVIKQVALDQQQRPIEYSISHCRSD LYVFVCEE >gi|296493460|gb|ADTK01000041.1| GENE 16 14288 - 15253 782 321 aa, chain - ## HITS:1 COG:ECs2903 KEGG:ns NR:ns ## COG: ECs2903 COG0524 # Protein_GI_number: 15832157 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli O157:H7 # 1 321 1 321 321 629 98.0 1e-180 MSGARLHTLLPELTSRQPVMVVGAAVIDVIADAYALPWRGCDIELKQQSVNVGGCALNIA VALKRLGIEAGNALPLGQGVWAEIIRNRMAKEGLISLIDNAEGDNGWCLALVEPDSERTF MSFSGVENQWNRQWLARLTVAPGSLLYFSGYQLASPCGELLVEWLEKLQDVTPFIDFGPR IGDIPDALLARIMACRPLVSLNRQEAEIAAERFALSAEITTLGKQWQEKFAAPLIVRLDK 
EGAWYFSNDASGCIPAFPTQVVDTIGAGDSHAGGVLAGLASGLPLADAVLLGNAVASWVV GHRGGDCAPTREELLLAHKNV >gi|296493460|gb|ADTK01000041.1| GENE 17 15250 - 16254 1088 334 aa, chain - ## HITS:1 COG:yegU KEGG:ns NR:ns ## COG: yegU COG1397 # Protein_GI_number: 16130037 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ADP-ribosylglycohydrolase # Organism: Escherichia coli K12 # 1 334 1 334 334 643 100.0 0 MKTERILGALYGQALGDAMGMPSELWPRSRVKAHFGWIDRFLPGPKENNAACYFNRAEFT DDTSMALCLADALLEREGKIDPDLIGRNILDWALRFDAFNKNVLGPTSKIALNAIRDGKP VAELENNGVTNGAAMRVSPLGCLLPARDVDSFIDDVALASSPTHKSDLAVAGAVVIAWAI SRAIDGESWSAIVDSLPSIARHAQQKRITTFSASLAARLEIALKIVRNADGTESASEQLY QVVGAGTSTIESVPCAIALVELAQTDPNRCAVLCANLGGDTDTIGAMATAICGALHGVNA IDPALKAELDAVNQLDFNRYATALAKYRQQREAV >gi|296493460|gb|ADTK01000041.1| GENE 18 16251 - 17528 1414 425 aa, chain - ## HITS:1 COG:yegT KEGG:ns NR:ns ## COG: yegT COG0477 # Protein_GI_number: 16130036 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 425 1 425 425 761 99.0 0 MKTTAKLSFMMFVEWFIWGAWFVPLWLWLSKSGFSAGEIGWSYACTAIAAILSPILVGSI TDRFFSAQKVLAVLMFAGALLMYFAAQQTTFAGFFPLLLAYSLTYMPTIALTNSIAFANV PDVERDFPRIRVMGTIGWIASGLACGFLPQMLGYADISPTNIPLLITAGSSALLGVFAFF LPDTPPKSTGKMDIKVMLGLDALILLRDKNFLVFFFCSFLFAMPLAFYYIFANGYLTEVG MKNATGWMTLGQFSEIFFMLALPFFTKRFGIKKVLLLGLVTAAIRYGFFIYGSADEYFTY ALLFLGILLHGVSYDFYYVTAYIYVDKKAPVHMRTAAQGLITLCCQGFGSLLGYRLGGVM MEKMFAYPEPVNGLTFNWSGMWTFGAVMIAIIAVLFMIFFRESDNEITAIKVDDRDIALT QGEVK >gi|296493460|gb|ADTK01000041.1| GENE 19 17785 - 18837 1134 350 aa, chain + ## HITS:1 COG:ECs2900 KEGG:ns NR:ns ## COG: ECs2900 COG1830 # Protein_GI_number: 15832154 # Func_class: G Carbohydrate transport and metabolism # Function: DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes # Organism: Escherichia coli O157:H7 # 1 350 25 374 374 
695 100.0 0 MTDIAQLLGKDADNLLQHRCMTIPSDQLYLPGHDYVDRVMIDNNRPPAVLRNMQTLYNTG RLAGTGYLSILPVDQGVEHSAGASFAANPLYFDPKNIVELAIEAGCNCVASTYGVLASVS RRYAHRIPFLVKLNHNETLSYPNTYDQTLYASVEQAFNMGAVAVGATIYFGSEESRRQIE EISAAFERAHELGMVTVLWAYLRNSAFKKDGVDYHVSADLTGQANHLAATIGADIVKQKM AENNGGYKAINYGYTDDRVYSKLTSENPIDLVRYQLANCYMGRAGLINSGGAAGGETDLS DAVRTAVINKRAGGMGLILGRKAFKKSMADGVKLINAVQDVYLDSKITIA >gi|296493460|gb|ADTK01000041.1| GENE 20 19147 - 20001 693 284 aa, chain + ## HITS:1 COG:ECs2899 KEGG:ns NR:ns ## COG: ECs2899 COG0191 # Protein_GI_number: 15832153 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Escherichia coli O157:H7 # 1 284 3 286 286 557 99.0 1e-159 MYVVSTKQMLNNAQRGGYAVPAFNIHNLETMQVVVETAANLHAPVIIAGTPGTFTHAGTE NLLALVSAMAKQYHHPLAIHLDHHTKFDDIAQKVRSGVRSVMIDASHLPFAQNISRVKEV VDFCHRFDVSVEAELGQLGGQEDDVQVNEADAFYTNPAQAREFAEATGIDSLAVAIGTAH GMYASAPALDFSRLENIRQWVNLPLVLHGASGLSTKDIQQTIKLGICKINVATELKNAFS QALKNYLTEHPEATDPRDYLQSAKSAMRDVVSKVIADCGCEGRA >gi|296493460|gb|ADTK01000041.1| GENE 21 20030 - 21292 993 420 aa, chain + ## HITS:1 COG:gatZ KEGG:ns NR:ns ## COG: gatZ COG4573 # Protein_GI_number: 16130033 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted tagatose 6-phosphate kinase # Organism: Escherichia coli K12 # 1 420 1 420 420 860 99.0 0 MKTLIARHKAGEHIGICSVCSAHPLVIEAALAFDRNSTRKVLIEATSNQVNQFGGYTGMT PADFREFVFTIADKVGFARERIILGGDHLGPNCWQQENADAAMEKSVELVKEYVRAGFSK IHLDASMSCAGDPIPLAPETVAERAAVLCFAAESVATDCQREQLSYVIGTEVPVPGGEAS AIQSVHITHVEDAANTLRTHQKAFIARGLTEALTRVIAIVVQPGVEFDHSNIIHYQPQEA QPLAQWIESTRMVYEAHSTDYQTRTAYWELVRDHFAILKVGPALTFALREAIFALAQIEQ ELIAPENRSGCLAVIEEVMLDEPQYWKKYYRTGFNDSLLDIRYSLSDRIRYYWPHSRIKN SVETMMVNLEGVDIPLGMISQYLPKQFERIQSGELSAIPHQLIMDKIYDVLRAYRYGCAE >gi|296493460|gb|ADTK01000041.1| GENE 22 21302 - 21754 333 150 aa, chain + ## HITS:1 COG:ECs2897 KEGG:ns NR:ns ## COG: ECs2897 COG1762 # Protein_GI_number: 15832151 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction 
mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Escherichia coli O157:H7 # 1 150 1 150 150 287 98.0 4e-78 MTNLFVRSGISFVDRSEVLTHIGNEMLAKGVVYDTWPQALIAREAEFPTGIMLEQHAIAI PHCEAIHAKSSAIYLLRTTNKVHFQQADDDNDVEVSLVIALIVENPQQQLKLLRCLFGKL QQPDIVETLITLPETQLKEYFTKYVLDSDE >gi|296493460|gb|ADTK01000041.1| GENE 23 21785 - 22069 330 94 aa, chain + ## HITS:1 COG:gatB KEGG:ns NR:ns ## COG: gatB COG3414 # Protein_GI_number: 16130031 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIB component # Organism: Escherichia coli K12 # 1 94 1 94 94 179 98.0 8e-46 MKRKIIVACGGAVATSTMAAEEIKELCQNHNIPVELIQCRVNEIETYMDGVHLICTTAKV DRSFGDIPLVHGMPFISGVGIEALQNKILTILQG >gi|296493460|gb|ADTK01000041.1| GENE 24 22073 - 23428 1461 451 aa, chain + ## HITS:1 COG:ECs2895 KEGG:ns NR:ns ## COG: ECs2895 COG3775 # Protein_GI_number: 15832149 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIC component # Organism: Escherichia coli O157:H7 # 1 451 1 451 451 806 100.0 0 MFSEVMRYILDLGPTVMLPIVIIIFSKILGMKAGDCFKAGLHIGIGFVGIGLVIGLMLDS IGPAAKAMAENFDLNLHVVDVGWPGSSPMTWASQIALVAIPIAILVNVAMLLTRMTRVVN VDIWNIWHMTFTGALLHLATGSWMIGMAGVVIHAAFVYKLGDWFARDTRNFFELEGIAIP HGTSAYMGPIAVLVDAIIEKIPGVNRIKFSADDIQRKFGPFGEPVTVGFVMGLIIGILAG YDVKGVLQLAVKTAAVMLLMPRVIKPIMDGLTPIAKQARSRLQAKFGGQEFLIGLDPALL LGHTAVVSASLIFIPLTILIAVCVPGNQVLPFGDLATIGFFVAMAVAVHRGNLFRTLISG VIIMSITLWIATQTIGLHTQLAANAGALKAGGMVASMDQGGSPITWLLIQVFSPQNIPGF IIIGAIYLTGIFMTWRRARGFIKQEKVVLAE >gi|296493460|gb|ADTK01000041.1| GENE 25 23476 - 24516 691 346 aa, chain + ## HITS:1 COG:ECs2894 KEGG:ns NR:ns ## COG: ECs2894 COG1063 # Protein_GI_number: 15832148 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli O157:H7 # 1 346 1 346 346 711 99.0 0 
MKSVVNDTDGIVRVAESVIPEIKHQDEVRVKIASSGLCGSDLPRIFKNGAHYYPITLGHE FSGYIDAVGSGVDDLHPGDAVACVPLLPCFTCPECLKGFYSQCAKYDFIGSRRDGGFAEY IVVKRKNVFALPTDMPIEDGAFIEPITVGLHAFHLAKGCENKNVIIIGAGTIGLLAIQCA VALGAKSVTAIDISSEKLALAKSFGAMQTFNSSEMSAPQMQSVLRELRFNQLILETAGVP QTVELAVEIAGPHAQLALVGTLHQDLHLTSATFGKILRKELTVIGSWMNYSSPWPGQEWE TASRLLTERKLSLEPLIAHRGSFESFTQAVRDIARNAMPGKVLLIP Prediction of potential genes in microbial genomes Time: Mon May 16 15:05:48 2011 Seq name: gi|296493459|gb|ADTK01000042.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont101.7, whole genome shotgun sequence Length of sequence - 18679 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 8, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 16 - 75 3.0 1 1 Tu 1 . + CDS 97 - 870 509 ## COG1349 Transcriptional regulators of sugar metabolism + Term 883 - 926 1.5 2 2 Tu 1 . - CDS 952 - 1851 640 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase - Prom 1878 - 1937 3.4 + Prom 2032 - 2091 3.9 3 3 Tu 1 . + CDS 2197 - 2574 210 ## ECSE_2357 hypothetical protein - Term 2729 - 2768 5.1 4 4 Tu 1 . - CDS 2905 - 4266 1453 ## COG0826 Collagenase and related proteases 5 5 Op 1 . - CDS 4369 - 4665 226 ## EC55989_2339 conserved hypothetical protein, putative plasmid stabilisation system protein 6 5 Op 2 . 
- CDS 4667 - 4918 304 ## COG3609 Predicted transcriptional regulators containing the CopG/Arc/MetJ DNA-binding domain - Prom 5102 - 5161 3.1 - Term 5126 - 5164 3.3 7 6 Tu 1 2/0.500 - CDS 5172 - 5504 403 ## COG3422 Uncharacterized conserved protein 8 7 Op 1 40/0.000 - CDS 5695 - 6417 860 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 9 7 Op 2 10/0.000 - CDS 6414 - 7817 1233 ## COG0642 Signal transduction histidine kinase 10 7 Op 3 5/0.000 - CDS 7814 - 9229 1299 ## COG0477 Permeases of the major facilitator superfamily 11 7 Op 4 10/0.000 - CDS 9230 - 12307 3208 ## COG0841 Cation/multidrug efflux pump 12 7 Op 5 27/0.000 - CDS 12308 - 15430 3384 ## COG0841 Cation/multidrug efflux pump 13 7 Op 6 . - CDS 15430 - 16677 1360 ## COG0845 Membrane-fusion protein - Prom 16749 - 16808 3.4 + Prom 17141 - 17200 3.3 14 8 Op 1 . + CDS 17230 - 17889 569 ## COG4245 Uncharacterized protein encoded in toxicity protection region of plasmid R478, contains von Willebrand factor (vWF) domain 15 8 Op 2 . 
+ CDS 17967 - 18647 558 ## ECIAI1_2148 hypothetical protein Predicted protein(s) >gi|296493459|gb|ADTK01000042.1| GENE 1 97 - 870 509 257 aa, chain + ## HITS:1 COG:gatR KEGG:ns NR:ns ## COG: gatR COG1349 # Protein_GI_number: 16132228 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli K12 # 1 257 3 259 259 454 100.0 1e-128 MNSFERRNKIIQLVNEQGTVLVQDLAGVFAASEATIRADLRFLEQKGVVTRFHGGAAKIM SGNSETETQEVGFKERFQLASAPKNRIAQAAVKMIHEGMTVILDSGSTTMLIAEGLMTAK NITVITNSLPAAFALSENKDITLVVCGGTVRHKTRSMHGSIAERSLQDINADLMFVGADG IDAVNGITTFNEGYSISGAMVTAANKVIAVLDSSKFNRRGFNQVLPIEKIDIIITDDAVS EVDKLALQKTRVKLITV >gi|296493459|gb|ADTK01000042.1| GENE 2 952 - 1851 640 299 aa, chain - ## HITS:1 COG:ECs2892 KEGG:ns NR:ns ## COG: ECs2892 COG1597 # Protein_GI_number: 15832146 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Escherichia coli O157:H7 # 1 299 1 299 299 596 100.0 1e-170 MAEFPASLLILNGKSTDNLPLREAIMLLREEGMTIHVRVTWEKGDAARYVEEARKLGVAT VIAGGGDGTINEVSTALIQCEGDDIPALGILPLGTANDFATSVGIPEALDKALKLAIAGN AIAIDMAQVNKQTCFINMATGGFGTRITTETPEKLKAALGGVSYIIHGLMRMDTLQPDRC EIRGENFHWQGDALVIGIGNGRQAGGGQQLCPNALINDGLLQLRIFTGDEILPALVSTLK SDEDNPNIIEGASSWFDIQAPHEITFNLDGEPLSGQNFHIEILPAALRCRLPPDCPLLR >gi|296493459|gb|ADTK01000042.1| GENE 3 2197 - 2574 210 125 aa, chain + ## HITS:1 COG:no KEGG:ECSE_2357 NR:ns ## KEGG: ECSE_2357 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 125 1 125 125 252 100.0 3e-66 MPLLYLNTRECRWYLMGEGEMKKIAAISLISIFIMSGCAVHNDETSIGKFGLAYKSNIQR KLDNQYYTEAEASLARGRISGAENIVKNDAAHFCVTQGKKMQIVDLKTEGAGLHGVARLT FKCGE >gi|296493459|gb|ADTK01000042.1| GENE 4 2905 - 4266 1453 453 aa, chain - ## HITS:1 COG:yegQ KEGG:ns NR:ns ## COG: yegQ COG0826 # Protein_GI_number: 16130021 # Func_class: O Posttranslational modification, protein 
turnover, chaperones # Function: Collagenase and related proteases # Organism: Escherichia coli K12 # 1 453 1 453 453 929 100.0 0 MFKPELLSPAGTLKNMRYAFAYGADAVYAGQPRYSLRVRNNEFNHENLQLGINEAHALGK KFYVVVNIAPHNAKLKTFIRDLKPVVEMGPDALIMSDPGLIMLVREHFPEMPIHLSVQAN AVNWATVKFWQQMGLTRVILSRELSLEEIEEIRNQVPDMEIEIFVHGALCMAYSGRCLLS GYINKRDPNQGTCTNACRWEYNVQEGKEDDVGNIVHKYEPIPVQNVEPTLGIGAPTDKVF MIEEAQRPGEYMTAFEDEHGTYIMNSKDLRAIAHVERLTKMGVHSLKIEGRTKSFYYCAR TAQVYRKAIDDAAAGKPFDTSLLETLEGLAHRGYTEGFLRRHTHDDYQNYEYGYSVSDRQ QFVGEFTGERKGDLAAVAVKNKFSVGDSLELMTPQGNINFTLEHMENAKGEAMPIAPGDG YTVWLPVPQDLELNYALLMRNFSGETTRNPHGK >gi|296493459|gb|ADTK01000042.1| GENE 5 4369 - 4665 226 98 aa, chain - ## HITS:1 COG:no KEGG:EC55989_2339 NR:ns ## KEGG: EC55989_2339 # Name: not_defined # Def: conserved hypothetical protein, putative plasmid stabilisation system protein # Organism: E.coli_55989 # Pathway: not_defined # 1 98 1 98 98 195 100.0 4e-49 MRIIKLMPKANEDLEGIWYYSYHHFGEPQADRYVEHLSDVLQILSNNNIGTPRPELGEGI FVLPFERHVIYFLQSPGEIIVIRILNQNQDATRHLHWS >gi|296493459|gb|ADTK01000042.1| GENE 6 4667 - 4918 304 83 aa, chain - ## HITS:1 COG:STM2955 KEGG:ns NR:ns ## COG: STM2955 COG3609 # Protein_GI_number: 16766261 # Func_class: K Transcription # Function: Predicted transcriptional regulators containing the CopG/Arc/MetJ DNA-binding domain # Organism: Salmonella typhimurium LT2 # 1 83 24 106 118 128 90.0 2e-30 MARTMTVDLGDELREFIESLIESGDYRTQSEVIRESLRLLREKQAESRLQALRDMLAEGL SSGEAQPWEKDAFLRKVKAGIRK >gi|296493459|gb|ADTK01000042.1| GENE 7 5172 - 5504 403 110 aa, chain - ## HITS:1 COG:yegP KEGG:ns NR:ns ## COG: yegP COG3422 # Protein_GI_number: 16130020 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 110 14 123 123 169 99.0 9e-43 MAGWFELSKSSDNQFRFVLKAGNGETILTSELYTSKASAEKGIASVRSNSPQEERYEKKT ASNGKFYFNLKAANHQIIGSSQMYATAQSRETGIASVKANGTSQTVKDNT >gi|296493459|gb|ADTK01000042.1| GENE 8 5695 - 6417 860 240 aa, chain - ## HITS:1 COG:baeR KEGG:ns NR:ns ## COG: baeR COG0745 # 
Protein_GI_number: 16130019 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli K12 # 1 240 1 240 240 462 99.0 1e-130 MTELPIDENTPRILIVEDEPKLGQLLIDYLRAASYAPTLISHGDQVLPYVRQTPPDLILL DLMLPGTDGLTLCREIRRFSDIPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVAR VKTILRRCKPQRELQQQDAESPLIIDEGRFQASWRGKMLDLTPAEFRLLKTLSHEPGKVF SREHLLNHLYDDYRVVTDRTIDSHIKNLRRKLESLDAEQSFIRAVYGVGYRWEADACRIV >gi|296493459|gb|ADTK01000042.1| GENE 9 6414 - 7817 1233 467 aa, chain - ## HITS:1 COG:ECs2886 KEGG:ns NR:ns ## COG: ECs2886 COG0642 # Protein_GI_number: 15832140 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli O157:H7 # 1 467 1 467 467 912 99.0 0 MKFWRPGITGKLFLAIFATCIVLLISMHWAVRISFERGFIDYIKHGNEQRLQLLSDALGE QYAQHGNWRFLRNNDRFVFQILRSFEHDNSEDKPGPGMPPHGWRTQFWVVDQNNKVLVGP RAPIPPDGTRRPILVNGAEVGAVIASPVERLTRNTDINFDKQQRQTSWLIVALATLLAAL ATFLLARGLLAPVKRLVDGTHKLAAGDFTTRVTPTSEDELGKLAQDFNQLASTLEKNQQM RRDFMADISHELRTPLAVLRGELEAIQDGVRKFTPETVASLQAEVGTLTKLVDDLHQLSM SDEGALAYQKAPVDLIPLLEVAGGAFRERFASRGLKLQFSLPDSITVFGDRDRLMQLFNN LLENSLRYTDSGGSLKISAEQHDKTVRLTFADSAPGVSDEQLQKLFERFYRTEGSRNRAS GGSGLGLAICLNIVEAHNGRIIAAHSPFGGVSITVELPLERDLQREV >gi|296493459|gb|ADTK01000042.1| GENE 10 7814 - 9229 1299 471 aa, chain - ## HITS:1 COG:ECs2885 KEGG:ns NR:ns ## COG: ECs2885 COG0477 # Protein_GI_number: 15832139 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 471 1 471 471 749 99.0 0 MTDLPDSTRWQLWIVAFGFFMQSLDTTIVNTALPSMAQSLGESPLHMHMVIVSYVLTVAV MLPASGWLADKVGVRNIFFTAIVLFTLGSLFCALSGTLNELLLARALQGVGGAMMVPVGR LTVMKIVPREQYMAAMTFVTLPGQVGPLLGPALGGLLVEYASWHWIFLINIPVGIIGAIA 
TLMLMPNYTMQTRRFDLSGFLLLAVGMAVLTLALDGSKGTGLSPLAITGLVAVGVVALVL YLLHARNNNRALFSLKLFRTRTFSLGLAGSFAGRIGSGMLPFMTPVFLQIGLGFSPFHAG LMMIPMVLGSMGMKRIVVQVVNRFGYRRVLVATTLGLSLVTLLFMTTALLGWYYVLPFVL FLQGMVNSTRFSSMNTLTLKDLPDNLASSGNSLLSMIMQLSMSIGVTIAGLLLGLFGSQH VSVDSGTTQTVFMYTWLSMALIIALPAFIFARVPNDTHQNVAISRRKRSAQ >gi|296493459|gb|ADTK01000042.1| GENE 11 9230 - 12307 3208 1025 aa, chain - ## HITS:1 COG:yegO KEGG:ns NR:ns ## COG: yegO COG0841 # Protein_GI_number: 16130016 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Escherichia coli K12 # 1 1025 1 1025 1025 1867 99.0 0 MKFFALFIYRPVATILLSVAITLCGILGFRMLPVAPLPQVDFPVIMVSASLPGASPETMA SSVATPLERSLGRIAGVSEMTSSSSLGSTRIILQFDFDRDINGAARDVQAAINAAQSLLP SGMPSRPTYRKANPSDAPIMILTLTSDTYSQGELYDFASTQLAPTISQIDGVGDVDVGGS SLPAVRVGLNPQALFNQGVSLDDVRTAISNANVRKPQGALEDGTHRWQIQTNDELKTAAE YQPLIIHYNNGGAVRLGDVATVTDSVQDVRNAGMTNAKPAILLMIRKLPEANIIQTVDSI RAKLPELQETIPAAIDLQIAQDRSPTIRSSLEEVEQTLIISVALVILVVFLFLRSGRATI IPAVSVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHLEAGMKP LQAALQGTREVGFTVLSMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIGISLLVSLT LTPMMCGWMLKASKPREQKRLRGFGRMLVALQQGYGKSLNWVLNHTRLVGVVLLGTIALN IWLYISIPKTFFPEQDTGVLMGGIQADQSISFQAMRGKLQDFMKIIRDDPAVDNVTGFTG GSRVNSGMMFITLKPRDERSETAQQIIDRLRVKLAKEPGANLFLMAVQDIRVGGRQSNAS YQYTLLSDDLAALREWEPKIRKKLATLPELADVNSDQQDNGAEMNLVYDRDTMARLGIDV QAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRYTQDISALEKMFVINNEGKAIPLSY FAKWQPANAPLSVNHQGLSAASTISFNLPTGKSLSDASAAIDRAMTQLGVPSTVRGSFAG TAQVFQETMNSQVILIIAAIATVYIVLGILYESYVHPLTILSTLPSAGVGALLALELFNA PFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRHGNLTPQEAIFQACLLRFRPIMMTTLA ALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQLLTLYTTPVVYLFFDRLRLRFSRKPK QTVTE >gi|296493459|gb|ADTK01000042.1| GENE 12 12308 - 15430 3384 1040 aa, chain - ## HITS:1 COG:yegN KEGG:ns NR:ns ## COG: yegN COG0841 # Protein_GI_number: 16130015 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Escherichia coli K12 # 1 1040 1 1040 1040 1808 99.0 0 
MQVLPPSSTGGPSRLFIMRPVATTLLMVAILLAGIIGYRALPVSALPEVDYPTIQVVTLY PGASPDVMTSAVTAPLERQFGQMSGLKQMSSQSSGGASVITLQFQLTLPLDVAEQEVQAA INAATNLLPSDLPNPPVYSKVNPADPPIMTLAVTSTAMPMTQVEDMVETRVARKISQISG VGLVTLSGGQRPAVRVKLNAQAIAALGLTSETVRTAITGANVNSAKGSLDGPSRAVTLSA NDQMQSAEEYRQLIIAYQNGAPIRLGDVATVEQGAENSWLGAWANKEQAIVMNVQRQPGA NIISTADSIRQMLPQLTESLPKSVKVTVLSDRTTNIRASVDDTQFELMMAIALVVMIIYL FLRNIPATIIPGVAVPLSLIGTFAVMVFLDFSINNLTLMALTIATGFVVDDAIVVIENIS RYIEKGEKPLAAALKGAGEIGFTIISLTFSLIAVLIPLLFMGDIVGRLFREFAITLAVAI LISAVVSLTLTPMMCARMLSQESLRKQNRFSRASEKMFDRIIAAYGRGLAKVLNHPWLTL SVALSTLLLSVLLWVFIPKGFFPVQDNGIIQGTLQAPQSSSFANMAQRQRQVADVILQDP AVQSLTSFVGVDGTNPSLNSARLQINLKPLDERDDRVQKVIARLQTAVDKVPGVDLFLQP TQDLTIDTQVSRTQYQFTLQATSLDALSTWVPQLMEKLQQLPQLSDVSSDWQDKGLVAYV NVDRDSASRLGISMADVDNALYNAFGQRLISTIYTQANQYRVVLEHNTENTPGLAALDTI RLTSSDGGVVPLSSIAKIEQRFAPLSINHLDQFPVTTISFNVPDNYSLGDAVQAIMDTEK TLNLPVDITTQFQGSTLAFQSALGSTVWLIVAAVVAMYIVLGILYESFIHPITILSTLPT AGVGALLALLIAGSELDVIAIIGIILLIGIVKKNAIMMIDFALAAEREQGMSPREAIYQA CLLRFRPILMTTLAALLGALPLMLSTGVGAELRRPLGIGMVGGLIVSQVLTLFTTPVIYL LFDRLALWTKSRFARHEEEA >gi|296493459|gb|ADTK01000042.1| GENE 13 15430 - 16677 1360 415 aa, chain - ## HITS:1 COG:ECs2882 KEGG:ns NR:ns ## COG: ECs2882 COG0845 # Protein_GI_number: 15832136 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli O157:H7 # 1 415 50 464 464 720 99.0 0 MKGSYKSRWVIVIVVVIAAIAAFWFWQGRNDSQSAAPGATKQAQQSPAGGRRGMRSGPLA PVQAATAVEQAVPRYLTGLGTITAANTVTVRSRVDGQLMALHFQEGQQVKAGDLLAEIDP SQFKVALAQAQGQLAKDKATLANARRDLARYQQLAKTNLVSRQELDAQQALVSETEGTIK ADEASVASAQLQLDWSRITAPVDGRVGLKQVDVGNQISSGDTTGIVVITQTHPIDLVFTL PESDIATVVQAQKAGKPLVVEAWDRTNSKKLSEGTLLSLDNQIDATTGTIKVKARFNNQD DALFPNQFVNARMLVDTEQNAVVIPTAALQMGNEGHFVWVLNSENKVSKHLVTPGIQDSQ KVVIRAGISAGDRVVTDGIDRLTEGAKVEVVEAQSATTPEEKATSREYAKKGARS >gi|296493459|gb|ADTK01000042.1| GENE 14 17230 - 17889 569 219 aa, chain + ## HITS:1 COG:ECs2881 KEGG:ns NR:ns ## COG: ECs2881 COG4245 # Protein_GI_number: 15832135 # Func_class: R General 
function prediction only # Function: Uncharacterized protein encoded in toxicity protection region of plasmid R478, contains von Willebrand factor (vWF) domain # Organism: Escherichia coli O157:H7 # 1 219 1 219 219 400 98.0 1e-111 MSEQITFATSDFASNPEPRCPCILLLDVSGSMSGRPINELNAGLVTFRDELLADSLALKR VELGIVTFGPVHVEQPFTSAANFFPPILFAQGDTPMGAAITKALDMVEERKREYRANGIS YYRPWIFLITDGAPTDEWQAAANKVFQGEEDKKFAFFSIAVQGADMKTLAQISVRQPLPL QGLQFRELFSWLSSSLRSVSRSTPGTEVVLEAPKGWTSV >gi|296493459|gb|ADTK01000042.1| GENE 15 17967 - 18647 558 226 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_2148 NR:ns ## KEGG: ECIAI1_2148 # Name: yegK # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 224 28 251 253 434 100.0 1e-120 MQVAWLNDQQPLLVMFLADGAGSVSQGGEGAMLAVNEAMAYMSQKVQGGELGLNDILATD IVLTIRQRLFAEAEAKELAVRDFACTFLGLISSANGTLIMQIGDGGVVVDFGHGLQLPLT PMVGEYANMTHFITDEDAVSRLETFTSTERVHKVAAFTDGIQRLALNMLDNSPHVPFFTP FFNGLASATQEQLDLLPELLKQFLSSPAVNERTDDDKTLALALWAE Prediction of potential genes in microbial genomes Time: Mon May 16 15:06:06 2011 Seq name: gi|296493458|gb|ADTK01000043.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont101.8, whole genome shotgun sequence Length of sequence - 37737 bp Number of predicted genes - 31, with homology - 30 Number of transcription units - 20, operones - 9 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 58 - 501 396 ## COG3779 Uncharacterized protein conserved in bacteria - Prom 689 - 748 8.3 + Prom 609 - 668 5.7 2 2 Tu 1 . + CDS 719 - 2665 1210 ## COG4248 Uncharacterized protein with protein kinase and helix-hairpin-helix DNA-binding domains + Term 2888 - 2925 -0.3 - Term 2619 - 2669 1.3 3 3 Tu 1 . - CDS 2677 - 4029 1541 ## COG0443 Molecular chaperone - Prom 4057 - 4116 4.1 4 4 Op 1 . + CDS 4042 - 4194 86 ## gi|227887125|ref|ZP_04004930.1| conserved hypothetical protein 5 4 Op 2 . 
+ CDS 4163 - 5011 721 ## COG0122 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase 6 5 Op 1 1/0.667 - CDS 5120 - 7987 2216 ## COG2202 FOG: PAS/PAC domain 7 5 Op 2 . - CDS 8017 - 8454 405 ## COG3447 Predicted integral membrane sensor domain - Prom 8516 - 8575 4.6 8 6 Op 1 4/0.000 + CDS 8772 - 9413 589 ## COG0572 Uridine kinase 9 6 Op 2 4/0.000 + CDS 9505 - 10086 718 ## COG0717 Deoxycytidine deaminase 10 6 Op 3 . + CDS 10108 - 11961 1200 ## COG2982 Uncharacterized protein involved in outer membrane biogenesis + Term 11971 - 12016 9.5 - Term 11851 - 11890 -0.9 11 7 Op 1 5/0.000 - CDS 12014 - 12304 269 ## COG5606 Uncharacterized conserved small protein 12 7 Op 2 1/0.667 - CDS 12294 - 12536 216 ## COG4679 Phage-related protein - Prom 12617 - 12676 5.1 - Term 12634 - 12676 9.3 13 8 Tu 1 . - CDS 12684 - 14267 1791 ## COG1253 Hemolysins and related proteins containing CBS domains - Prom 14422 - 14481 5.1 + Prom 14936 - 14995 4.7 14 9 Tu 1 . + CDS 15033 - 15923 1170 ## COG1210 UDP-glucose pyrophosphorylase + Prom 16155 - 16214 5.4 15 10 Tu 1 . + CDS 16316 - 16945 459 ## KP1_3725 putative acid phosphatase + Term 17043 - 17093 1.1 + Prom 17673 - 17732 3.4 16 11 Tu 1 . + CDS 17903 - 19336 1078 ## EcHS_A2194 hypothetical protein 17 12 Tu 1 . - CDS 19231 - 19434 81 ## + Prom 19497 - 19556 3.9 18 13 Op 1 6/0.000 + CDS 19587 - 20618 222 ## COG1596 Periplasmic protein involved in polysaccharide export + Prom 20648 - 20707 2.5 19 13 Op 2 3/0.167 + CDS 20783 - 21058 190 ## COG0394 Protein-tyrosine-phosphatase 20 13 Op 3 2/0.167 + CDS 21073 - 23235 692 ## COG3206 Uncharacterized protein involved in exopolysaccharide biosynthesis + Prom 23335 - 23394 4.6 21 14 Op 1 25/0.000 + CDS 23517 - 24422 196 ## COG0438 Glycosyltransferase 22 14 Op 2 . + CDS 24457 - 25635 420 ## COG0438 Glycosyltransferase + Prom 26419 - 26478 6.8 23 15 Op 1 . + CDS 26728 - 27030 137 ## Teth514_2284 O-antigen polymerase 24 15 Op 2 . 
+ CDS 27017 - 28195 299 ## COG0438 Glycosyltransferase + Term 28440 - 28487 1.3 + Prom 29415 - 29474 6.6 25 16 Tu 1 . + CDS 29496 - 29711 66 ## gi|300902311|ref|ZP_07120307.1| putative integral membrane protein MviN + Prom 29860 - 29919 6.3 26 17 Op 1 . + CDS 29942 - 30265 61 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 27 17 Op 2 . + CDS 30277 - 31278 226 ## Tter_2109 carbohydrate-binding family 9 + Prom 31304 - 31363 4.5 28 18 Op 1 12/0.000 + CDS 31559 - 32482 314 ## COG0438 Glycosyltransferase 29 18 Op 2 1/0.667 + CDS 33520 - 33927 121 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis + Prom 33980 - 34039 4.6 30 19 Tu 1 . + CDS 34091 - 35497 1581 ## COG0362 6-phosphogluconate dehydrogenase + Term 35514 - 35561 7.1 + Prom 36169 - 36228 5.1 31 20 Tu 1 . + CDS 36315 - 37595 139 ## gi|300902319|ref|ZP_07120315.1| hypothetical protein HMPREF9536_00506 Predicted protein(s) >gi|296493458|gb|ADTK01000043.1| GENE 1 58 - 501 396 147 aa, chain - ## HITS:1 COG:yegJ KEGG:ns NR:ns ## COG: yegJ COG3779 # Protein_GI_number: 16130011 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 147 7 153 153 288 100.0 2e-78 MLSLLFFTTAGFSEVSDTLVTGGYDKQAMSDAIKHARKETDKFIEVMNKKDADTFAVKAP ITDHGRTEHFWLTDVTYSNGMFIGVISNDPGIVTNVEYGQEWKIKKEDISDWMYTRGDKI YGGYTIDPLLVTYPKEEADELRAKLVR >gi|296493458|gb|ADTK01000043.1| GENE 2 719 - 2665 1210 648 aa, chain + ## HITS:1 COG:yegI KEGG:ns NR:ns ## COG: yegI COG4248 # Protein_GI_number: 16130010 # Func_class: R General function prediction only # Function: Uncharacterized protein with protein kinase and helix-hairpin-helix DNA-binding domains # Organism: Escherichia coli K12 # 1 648 1 648 648 1304 99.0 0 MKTNIKVFTSTGELTTLGRELGKGGEGAVYDIEEFVDSVAKIYHTPPPALKQDKLAFMAA TADAQLLNYVAWPQATLHGGRGGKVIGFMMPKVSGKEPIHMIYSPAHRRQSYPHCAWDFL LYVARNIASSFATVHEHGHVVGDVNQNSFMVGRDSKVVLIDSDSFQINANGTLHLCEVGV SHFTPPELQTLPSFVGFERTENHDNFGLALLIFHVLFGGRHPYSGVPLISDAGNALETDI 
THFRYAYASDNQRRGLKPPPRSIPLSMLPSDVEAMFQQAFTESGVATGRPTAKAWVAALD SLRQQLKKCTVSAMHVYPGHLADCPWCALDNQGVIYFIDLGEEVITTGGDFVLAKVWAMV MASVAPPALQLPLPDHFQPTGRPLPLGLLRREYIILLEIALSALSLLLCGLQAEPRYIIL VPVLAAIWIIGSLTSKAYKAEVQQRREAFNRAKMDYDHLVRQIQQVGGLEGFIAKRTMLE KMKDEILGLPEEEKRALAALHDTARERQKQKFLEGFFIDVASIPGVGPARKAALRSFGIE TAADVTRRGVKQVKGFGDHLTQAVIDWKASCERRFVFRPNEAITPADRQAVMAKMTAKRH RLESTLTVGATELQRFRLHAPARTMPLMEPLRQAAEKLAQAQADLSRC >gi|296493458|gb|ADTK01000043.1| GENE 3 2677 - 4029 1541 450 aa, chain - ## HITS:1 COG:ECs2878 KEGG:ns NR:ns ## COG: ECs2878 COG0443 # Protein_GI_number: 15832132 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Escherichia coli O157:H7 # 1 450 22 471 471 891 99.0 0 MFIGFDYGTANCSVAVMRDGKPHLLKMENDSTLLPSMLCAPTREAVSEWLYRHHDVPADD DETQALLRRAIRYNREEDIDVTAKSVQFGLSSLAQYIDDPEEVWFVKSPKSFLGASGLKP QQVALFEDLVCAMMLHIRQQAQAQLPEAITQAVIGRPINFQGLGGDEANAQAQGILERAA KRAGFRDVVFQYEPVAAGLDYEATLQEEKRVLVVDIGGGTTDCSLLLMGPQWRSRLDREA SLLGHSGCRIGGNDLDIALAFKNLMPLLGMGGETEKGIALPILPWWNAVAINDVPAQSDF YSSANGRLLNDLVRDAREPEKVALLQKVWRQRLSYRLVRSAEECKIALSSVAETRASLPF ISDELATLISQQGLESALSQPLARILEQVQLALDNAQEKPDVIYLTGGSARSPLIKKALA EQLPGIPIAGGDDFGSVTAGLARWAEVVFR >gi|296493458|gb|ADTK01000043.1| GENE 4 4042 - 4194 86 50 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|227887125|ref|ZP_04004930.1| ## NR: gi|227887125|ref|ZP_04004930.1| conserved hypothetical protein [Escherichia coli 83972] # 1 50 1 50 50 89 100.0 6e-17 MPVKKGRDFSEMLPSRQPEYESKAQRLNNVYAESGLTRRCDVYPELAAAV >gi|296493458|gb|ADTK01000043.1| GENE 5 4163 - 5011 721 282 aa, chain + ## HITS:1 COG:alkA KEGG:ns NR:ns ## COG: alkA COG0122 # Protein_GI_number: 16130008 # Func_class: L Replication, recombination and repair # Function: 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase # Organism: Escherichia coli K12 # 1 282 1 282 282 555 99.0 1e-158 MYTLNWQPPYDWSWMLGFLAARAVSGVETVADSYYARSLAVGEYRGVVTAIPDIARHTLH INLSAGLEPVAAECLAKMSRLFDLQCNPQIVNGALGRLGAARPGLRLPGCVDAFEQGVRA 
ILGQLVSVAMAAKLTARVAQLYGERLDDFPEYICFPTPQRLAAADPQALKALGMPLKRAE ALIHLANAALEGTLPMTIPGDVEQAMKTLQTFPGIGRWTANYFALRGWQAKDVFLPDDYL IKQRFPGMTPAQIRRYAERWKPWRSYALLHIWYTEGWQPDEA >gi|296493458|gb|ADTK01000043.1| GENE 6 5120 - 7987 2216 955 aa, chain - ## HITS:1 COG:Z3235m_2 KEGG:ns NR:ns ## COG: Z3235m_2 COG2202 # Protein_GI_number: 15804978 # Func_class: T Signal transduction mechanisms # Function: FOG: PAS/PAC domain # Organism: Escherichia coli O157:H7 EDL933 # 151 520 1 370 370 763 99.0 0 MIWVLSESIGALALVPLGLLFKPHYLLRHRNPRLLFESLLTLAITLTLSWLSMLYLPWPF TFIIVLLMWSAVRLPRMEAFLIFLTTVMMVSLMMAADPSLLATPRTYLMSHMPWLPFLLI LLPANIMTIANIMTMVMYAFRAERKHISESETRFRNAMEYSAIGMALVGTEGQWLQSNKA LCQFLGYSQEELRGLTFQQLTWPEDLNKDLQQVEKLISGEINTYSMEKRYYNRNGDVVWA LLAVSLVRHTDGTPLYFIAQIEDINELKRTEQVNQQLMERITLANEAGGIGIWEWELKPN IFSWDKRMFELYEIPPHIKPNWQVWYECVLPEDRQHAGKVIRDSLQSRSPFKLEFRITVK DCIRHIRALANRVLNKEGEVERLLGINMDMTEVKQLNEALFQEKERLHITLDSIGEAVVC IDMAMKITFMNPVAEKMSGWTQEEALGVPLLTVLHITFGDNGPLMENIYSADTSRSAIEQ DVVLHCRSGGSYDVHYSITPLSTLDGSNIGSVLVIQDVTESHKMLRQLSYSASHDALTHL ANRASFEKQLRILLQTVNSTHQRHALVFIDLDRFKAVNDSTGHAAGDALLREQASLMLSM LRSSDVLARLGGDEFGLLLPDCNVESARFIATRIISAVNDYHFIWEGRVHRVGASAGITL IDDNNHQAAEVMSQADIACYASKNGGRGRVTVYEPQQAAAHSERAVMSLDEQWRMIKENQ LMMIAHGVASPRIPQARNLWLISLKLWSCEGEIIDEQTFRRSFSDPALSHALDRRVFHDF FQQAAKAVASKGLSIALPLSVAGLSSATLVNELLEQRENSPLPARLLHLIIPAEAIFDHA ESVQKLRLAGCRIVLSQMGRDLQIFNSLKANMADYLLLDGELCANVQGNLIDEMLITIIQ GHAQRLGMKTIAGPVVLPLVMDTLSGIGVDLIYGDVIANAQPLDLLVNSSYFAIN >gi|296493458|gb|ADTK01000043.1| GENE 7 8017 - 8454 405 145 aa, chain - ## HITS:1 COG:ECs2874 KEGG:ns NR:ns ## COG: ECs2874 COG3447 # Protein_GI_number: 15832128 # Func_class: T Signal transduction mechanisms # Function: Predicted integral membrane sensor domain # Organism: Escherichia coli O157:H7 # 1 133 1 133 177 233 98.0 1e-61 MSKQSQHVLIALPHPLLHLVSLGLVSFIFTLFSLELSQFGTQLAPLWFPTSIIMVAFYRH AGRMWPGIALSCSLGNIAASILLFSTSSLNMTWTTINIVEAVVGAVLLRKLLPWYNPLQN LADWLRLALGSAIVPPLLGVFWLSC >gi|296493458|gb|ADTK01000043.1| GENE 8 8772 - 9413 589 213 aa, 
chain + ## HITS:1 COG:ECs2873 KEGG:ns NR:ns ## COG: ECs2873 COG0572 # Protein_GI_number: 15832127 # Func_class: F Nucleotide transport and metabolism # Function: Uridine kinase # Organism: Escherichia coli O157:H7 # 1 213 19 231 231 418 100.0 1e-117 MTDQSHQCVIIGIAGASASGKSLIASTLYRELREQVGDEHIGVIPEDCYYKDQSHLSMEE RVKTNYDHPSAMDHSLLLEHLQALKRGSAIDLPVYSYVEHTRMKETVTVEPKKVIILEGI LLLTDARLRDELNFSIFVDTPLDICLMRRIKRDVNERGRSMDSVMAQYQKTVRPMFLQFI EPSKQYADIIVPRGGKNRIAIDILKAKISQFFE >gi|296493458|gb|ADTK01000043.1| GENE 9 9505 - 10086 718 193 aa, chain + ## HITS:1 COG:ECs2872 KEGG:ns NR:ns ## COG: ECs2872 COG0717 # Protein_GI_number: 15832126 # Func_class: F Nucleotide transport and metabolism # Function: Deoxycytidine deaminase # Organism: Escherichia coli O157:H7 # 1 193 1 193 193 384 99.0 1e-107 MRLCDRDIEAWLDEGRLSINPRPPVERINGATVDVRLGNKFRTFRGHTAAFIDLSGPKDE VSAALDRVMSDEIVLDESEAFYLHPGELALAVTLESVTLPADLVGWLDGRSSLARLGLMV HVTAHRIDPGWSGCIVLEFYNSGKLPLVLRPGMLIGALSFEPLSGPAARPYNRREDAKYR NQQGAVASRIDKD >gi|296493458|gb|ADTK01000043.1| GENE 10 10108 - 11961 1200 617 aa, chain + ## HITS:1 COG:asmA KEGG:ns NR:ns ## COG: asmA COG2982 # Protein_GI_number: 16130004 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein involved in outer membrane biogenesis # Organism: Escherichia coli K12 # 1 617 1 617 617 1162 98.0 0 MRRFLTTLMILLVVLVAGLSALVLLVNPNDFRDYMVKQVAARSGYQLQLDGPLRWHVWPQ LSILSGRMSLTAQGASQPLVRADNMRLDVALLPLLSHQLSVKQVMLKGAVIQLTPQTEAV RSEDAPVAPRDNTLPDLSDDRGWSFDISSLKVADSVLVFQHEDDEQVTIRNIRLQMEQDP QHRGSFEFSGRVNRDQRDLTISLNGTVDASDYPHDLMAAIEQINWQLQGADLPKQGIQGQ GSFQVQWQESHKRLSFNQISLTANDSTLSGQAQVTLTEKPEWQLRLQFPQLNLDNLIPLN ETANGENGAAQQGLSQSTLPRPVISSRIDEPAYQGLQGFTADILLQASNVRWRGMNFTDV ATQMTNKSGLLEITQLQGKLNGGQVSLPGTLDATSINPRINFQPRLENVEIGTILKAFNY PISLTGKMSLAGDFSGADIDADAFRHNWQGQAHVEMTDTRMEGMNFQQMIQQAVERNGGD VKAAENFDNVTRLDRFTTDLTLKDGVVTLNDMQGQSPMLALSGAGTLNLAEQTCDTQFDI RVVGGWNGESKLIDFLKETPVPLRVYGNWQQLNYSLQVDQLLRKHLQDEAKRRLNDWAER NKDSRNGKDVKKLLEKM >gi|296493458|gb|ADTK01000043.1| 
GENE 11 12014 - 12304 269 96 aa, chain - ## HITS:1 COG:ECs2870 KEGG:ns NR:ns ## COG: ECs2870 COG5606 # Protein_GI_number: 15832124 # Func_class: S Function unknown # Function: Uncharacterized conserved small protein # Organism: Escherichia coli O157:H7 # 1 96 1 96 96 162 95.0 1e-40 MKIETFDSVWDAVSAPPEQAENMRIRAELVTIINNWIEQQGFSQAQAASALGVTQPRISE LARGKIQIFSIDKLITMMAHAELHIQRIEIQYPHAA >gi|296493458|gb|ADTK01000043.1| GENE 12 12294 - 12536 216 80 aa, chain - ## HITS:1 COG:ECs2869 KEGG:ns NR:ns ## COG: ECs2869 COG4679 # Protein_GI_number: 15832123 # Func_class: S Function unknown # Function: Phage-related protein # Organism: Escherichia coli O157:H7 # 1 80 33 112 112 154 98.0 4e-38 MQQGLNPYDWKPFSTIGPGVREIRTRDADGIYRVMYVAKFEEAVYVLHCFQKKTQTTSQS DIDLAKRRYKELVQERKNEN >gi|296493458|gb|ADTK01000043.1| GENE 13 12684 - 14267 1791 527 aa, chain - ## HITS:1 COG:yegH_2 KEGG:ns NR:ns ## COG: yegH_2 COG1253 # Protein_GI_number: 16130003 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Escherichia coli K12 # 232 527 1 296 296 574 99.0 1e-163 MEWIADPSIWAGLVTLVVIELVLGIDNLVFIAILAEKLPPAHRDRARITGLMLAMVMRLL LLTSISWLVTLTQPLFSFRSFTFSARDLIMLFGGFFLLFKATIELSERLEGKDSNNPTQR KGAKFWGVVTQIVVLDAIFSLDSVVTAIGMVDHLLVMMAAVVIAISLMLMASKPLTQFVN SHPTIVILCLSFLLMIGFSLVAEGFGFVIPKGYLYAAIGFSVMIEVLNQLAIFNRRRFLS ANQTLRQRTTEAVMRLLSGQKEDAELDAETASMLLDHGNQQIFNPQERRMIERVLNLNQR TVSSIMTSRHDIEHIDLNAPEEEIRQLLERNQHTRLVVTDGDDAEDLLGVVHVIDLLQQS LRGEPLNLRVLIRQPLVFPETLPLLPALEQFRNARTHFAFVVDEFGSVEGIVTLSDVTET IAGNLPNEVEEIDARHDIQKNADGSWTANGHMPLEDLVQYVPLPLDEKREYHTIAGLLME YLQRIPKPGEEVQVGDYLLKTLQVESHRVQKVQIIPLREDGEMEYEV >gi|296493458|gb|ADTK01000043.1| GENE 14 15033 - 15923 1170 296 aa, chain + ## HITS:1 COG:STM2098 KEGG:ns NR:ns ## COG: STM2098 COG1210 # Protein_GI_number: 16765428 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose pyrophosphorylase # Organism: Salmonella typhimurium LT2 # 1 295 1 295 297 543 92.0 1e-154 
MANLKAVIPVAGLGMHMLPATKAIPKEMLPIVDKPMIQYIVDEIVAAGIKEIVLVTHSSK NAVENHFDTSYELEALLEQRVKRQLLAEVQAICPPGVTIMNVRQAQPLGLGHSILCARPV VGDNPFVVVLPDIILDGGTADPLRYNLAAMIARFNETGRSQVLAKRMPGDLSEYSVIQTK EPMVAEGQVARIVEFIEKPDEPQTLDSDLMAVGRYVLSADIWAELERTEPGAWGRIQLTD AIAELAKKQSVDAMLMTGESYDCGKKMGYMQAFVTYGMRNLKEGAKFRESIKKLLA >gi|296493458|gb|ADTK01000043.1| GENE 15 16316 - 16945 459 209 aa, chain + ## HITS:1 COG:no KEGG:KP1_3725 NR:ns ## KEGG: KP1_3725 # Name: not_defined # Def: putative acid phosphatase # Organism: K.pneumoniae_NTUH-K2044 # Pathway: not_defined # 1 209 1 209 209 330 89.0 2e-89 MNWQLISFFCDSTVLLPSAAALFIVLMLRKTSRLLAWQWSLLFCITGAIVCASKLAFMGW GLGIRELDYTGFSGHSALSAAFWPIFLWLLSARFSAGLQKAAVATGYILAAVVGYSRLVI HAHSDPEVIAGLLLGVAGSALFLLLQKRTSDCDYKTVPWGGIACLVMFSLILLHSGSKAP TQTLLGQIATAIGPLDKPFTREDLHKQAW >gi|296493458|gb|ADTK01000043.1| GENE 16 17903 - 19336 1078 477 aa, chain + ## HITS:1 COG:no KEGG:EcHS_A2194 NR:ns ## KEGG: EcHS_A2194 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_HS # Pathway: not_defined # 1 477 28 504 504 917 97.0 0 MIKIARIAVTLGLLSSLGAQAYAAGLVVNDNDLRNDLAWLSDRGVIHLSLSTWPLSQEEI ARALKKAKPSYSSEQVVLARINQRLSALKADFRVTGYTSTDQPGTPQGFGQTQPADNSLG LAFNNSGEWWDIHLQGNVEGGERISNGSRFNANGAYGAVKFWNQWLSFGQVPQWWGPGYE GSLIRGDAMRPMTGFLMQRAEQAAPETWWLRWVGPWQYQISASQMNQYTAVPHAKIIGGR FTFSPFQSLELGASRIMQWGGEGRPQSLSSFWDGFTGKDNTGTDNEPGNQLAGFDFKFKL EPTLGWPVSFYGQMIGEDESGYLPSANMFLGGVEGHHGWGKDAVNWYLEAHDTRTNMSRT NYSYTHHIYKDGYYQQGYPLGDAMGGDGQLVAGRVELITEDNQRWSTRLVYAKVNPENQS INKAFPHADTLKGVQLGWSGDVYQSVRLNTSLWYTNANNSDSDDVGASAGIEIPFSL >gi|296493458|gb|ADTK01000043.1| GENE 17 19231 - 19434 81 67 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLAVTSLQAYGEKKTDRVLQRYHPLSPIFSIQPYKLNGISIPALAPTSSLSLLFALVYHS DVFKRTD >gi|296493458|gb|ADTK01000043.1| GENE 18 19587 - 20618 222 343 aa, chain + ## HITS:1 COG:ECs1139 KEGG:ns NR:ns ## COG: ECs1139 COG1596 # Protein_GI_number: 15830393 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # 
Organism: Escherichia coli O157:H7 # 1 343 36 378 379 553 76.0 1e-157 MVELPDSDYDLDKLVNVYPMTPGLIDQLRPETVLARPNPQLDNLLRSYEYRIGVGDVLMV TVWDHPELTTPAGQYRSASDTGNWVNSDGTIFYPYIGKVQVAGKTLSQVRQDIASRLTTY IESPQVDVSIAAFRSQKTYVTGEVMKSGQQPITNIPLTVMDAINAAGGLAPDADWRNVVL THNGKDTKISLYALMQKGDLSQNRLLYPGDILFVPRNDDLKVFVMGEVVKQSTLKMDRSG MTLAEAIGNAEGMSQAFSDATGVFVIRQLKGDKQGKIANIYQLNAQDASAMVLGTEFHLQ PYDIVYVTTAPIVRWNRVISQLVPTITGVHDMTETAKFIKEWP >gi|296493458|gb|ADTK01000043.1| GENE 19 20783 - 21058 190 91 aa, chain + ## HITS:1 COG:ECs1138 KEGG:ns NR:ns ## COG: ECs1138 COG0394 # Protein_GI_number: 15830392 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Escherichia coli O157:H7 # 2 88 64 150 152 116 57.0 1e-26 MNGLSLDGHIGKQFTASMGRDYELILVMEKKHIEQIGKIAPELRGKTMLFGHWLQQREIP DPYKKSEEAFSLVYQLIAQAGNLWAQKLGAK >gi|296493458|gb|ADTK01000043.1| GENE 20 21073 - 23235 692 720 aa, chain + ## HITS:1 COG:ZyccC_1 KEGG:ns NR:ns ## COG: ZyccC_1 COG3206 # Protein_GI_number: 15800902 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein involved in exopolysaccharide biosynthesis # Organism: Escherichia coli O157:H7 EDL933 # 9 478 11 479 480 581 62.0 1e-165 MVSVTKTSASKNEDDIDLSRLVGEFIDHRKLIISVTSLFTVIALVYALFATPIYQADALI QVEQKQANAILTNLSQMLPDSQPQSAPEIALLQSRMVLGKTVDDLNLQARITPKYFPILG RGWARLTGTKNGELRIAKLFIPGANEDYIPKLQLNVIDNKRFEIVGDDFSFESDFGKLTQ KNGVVIDIANISAKPGDKFYISYTTRLKAINDLKDSFSVNDQGKDTGILTLSISGSEPSL IKVILDRITENYLEQNIERQAAQDAKSLEFLNQQLPKVRGELDIAEDKLNAYRRKSDSVD LSLEAKSVLEQIVNVDNQLNELTFKESEISQLYTREHPTYKSLMEKRKTLLDEKSKLNKR VSAMPETQQEILRLSRDVESGRAVYMQLLNRQQELNIAKSSAIGNVRIIDDAITQPKPVK PKKVLVVIIGLVLGFMISVGLVLIRVFLRRGIESPEQLEEIGISVYASIPVSESFTKKIN NSNKVKNVKAAEYQGFLALDNPADIALEAIRGLRTSLHFAMMDAKNDILMISGAGPNAGK TFVSTNLAAVTAQTNKKVLFIDADMRKGYTHKLFNVNNDNGLSDYLSGRVDTEKCIKAIS AGFDFISRGAVPPNPAELLMNSRLEKLLDWAKDNYELVIVDTPPILAVTDAAIIGRHVGT TLLVARFELNTPKEIEVSLRRFDNSGVHINGCILNGVMKKASSYYGYGYNHYGYSYSDKD >gi|296493458|gb|ADTK01000043.1| GENE 21 23517 - 24422 
196 301 aa, chain + ## HITS:1 COG:alr5202 KEGG:ns NR:ns ## COG: alr5202 COG0438 # Protein_GI_number: 17232694 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Nostoc sp. PCC 7120 # 8 300 124 411 429 157 31.0 3e-38 MTEQTKLDLIHCHFSLDGIYALDLARKKDIPLITTLHGFDVTTKSHKWLLSKSPTYINYF FNKNKLKNGEHKFLCVSDFIYNAAINNGFNEKNLIKHYIGIDVDKYNTREKAEEQKIILH VARLVEKKGTSTLISAMRNISKNFPEYKLIIIGEGPLQEQLLEQAKELNLENNISFLGAK SHAEVMQWMRKASLLVLPSITAKNGDAEGLGMVLLEAAATGVPLIGTNHGGIPEVIKDSI NGYLVNENDADMLQDRISYLLNNDSVRHQMGRAARDVINRDFNIHIQSAKLEDIYQKLIH G >gi|296493458|gb|ADTK01000043.1| GENE 22 24457 - 25635 420 392 aa, chain + ## HITS:1 COG:DRA0368 KEGG:ns NR:ns ## COG: DRA0368 COG0438 # Protein_GI_number: 15808026 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Deinococcus radiodurans # 118 331 126 332 415 131 36.0 2e-30 MEKLTKNIIILSTADWDNPFWTNKQHVALDLAKKGYNIFYIDSLGLRKPGLNKKDFKRIF NRLVKGIRPPKKVHQNIKVWSPVLIPFNDNSIIRKINNILFNSLLKLWSNDKKSKTTLWT YNPLTTQFLELTDFSYVVYHCVDEIKAQPGMPIDILEQKEEELTRNADIVFVTSPTLLES RRLWSNNVHYFSNVADYDHFSKALEQSTIIPEDLREIDGPILGFIGAISSYKINFLLLEK IAKSHPEWNIVLIGDVGEGDPNTNVDILKKLKNMHFLGAKDYNSLPSYLKGFDVALLPNN INSYTDNMFPMKFFEYLAAGRNVVSVNLPSIKEFSEYFKLTNNDNDFIQAIESILLGDSI ELDKIQLLAKQFTYDTRTRKMINLINQPELIK >gi|296493458|gb|ADTK01000043.1| GENE 23 26728 - 27030 137 100 aa, chain + ## HITS:1 COG:no KEGG:Teth514_2284 NR:ns ## KEGG: Teth514_2284 # Name: not_defined # Def: O-antigen polymerase # Organism: Thermoanaerobacter_X514 # Pathway: not_defined # 1 77 360 436 452 68 38.0 6e-11 MWSKAGLMAAILSCWFIFLTLKINWRGMFSINQQLKYASISGMMVSLWLMLHGFGENFGL VGEQHQMPILAISIGIVMALKNMTSSLNYNISNTKRDENE >gi|296493458|gb|ADTK01000043.1| GENE 24 27017 - 28195 299 392 aa, chain + ## HITS:1 COG:all1345 KEGG:ns NR:ns ## COG: all1345 COG0438 # Protein_GI_number: 17228840 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Nostoc sp. 
PCC 7120 # 209 332 217 344 410 76 36.0 1e-13 MKMNNKVVIFARSIPAHNSGGLEVVTWDMCKGLAKIGLEVELITTKLPSTNFISKEENIN IVEVEGTIPGKYNRAWWEKSAEHFLSYPAERLIGVISVSAAGFGVLKYRNKYNNVNFIMQ AHGTSIDELKTKIKSRSAIKTLKGIKNVYWFFKDAYYYKKFDYIVGIGDAVVKSLTSFPN NLFVSENKVVKIENGIDESLFSFSIDKKEKLRVDFNIAKEKKIIISVCRLHEQKGVDNNI RVVKEIVEQNKHQELLYIICGSGPAESSLKSMVQSLSLNENIIFVGDQDRVTIAKYLNMS DVFMFLTKRIEGLPLNVLEAQASGLPMIISKHLFFEQTELVNKVDNKDIKKAAEILIQIL SENNEVSRKSYINEINTLDYSTKLYHELLNRK >gi|296493458|gb|ADTK01000043.1| GENE 25 29496 - 29711 66 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|300902311|ref|ZP_07120307.1| ## NR: gi|300902311|ref|ZP_07120307.1| putative integral membrane protein MviN [Escherichia coli MS 84-1] # 1 71 434 504 504 116 100.0 4e-25 MVFILTLYSIIYITLNVESYFHTNHVIYTLIINFISFYLFVIVSGFFFKEFRMVYSIIKV IQCKISGGKGK >gi|296493458|gb|ADTK01000043.1| GENE 26 29942 - 30265 61 107 aa, chain + ## HITS:1 COG:MT3119 KEGG:ns NR:ns ## COG: MT3119 COG0110 # Protein_GI_number: 15842598 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Mycobacterium tuberculosis CDC1551 # 9 104 137 232 262 68 37.0 3e-12 MFFNKNVNIVSHDNIYIGENCLFGHNICCFDSDHRYNDLSKDIRYQGYAKSPVRIERNVW ICAGVMITKGTIIGENSVIAGNSVARGSLRQNSIYAGIPCKFIKEIS >gi|296493458|gb|ADTK01000043.1| GENE 27 30277 - 31278 226 333 aa, chain + ## HITS:1 COG:no KEGG:Tter_2109 NR:ns ## KEGG: Tter_2109 # Name: not_defined # Def: carbohydrate-binding family 9 # Organism: T.terrenum # Pathway: not_defined # 25 330 305 600 1276 129 29.0 2e-28 MLKKMFFLLFIVSVFKCQASDFLIGVNSHLYQKKDEEIISTIEKVKSLGIKAIRVDAPWK LVEQVKGAYSIPPAWDVIVDYASKNNIDVLFILDYGNKYYDNGDKPISKDAVKGFVNYVD YLTKHFSNRVRFYQIWNEWNNKNGDTTPGKVEDYKVLVKNTYPIIKKNAQDSIVITSSFS PAAFNKAIGLERRGDYFRDFLTTDMVNFTDALSIHPYTTYRKPPFHQFKYYTKQIEYAMN LLKSGPFKNKPIYITEIGWTTASSPVAVSLNTQAQDLKNAICQAKKLGYAGVFIYELKDN HLYPKDPEDGFGILDSKMQNKDAANIIQNLACN >gi|296493458|gb|ADTK01000043.1| GENE 28 31559 - 32482 314 307 aa, chain + ## HITS:1 COG:AGl3095 KEGG:ns NR:ns ## COG: AGl3095 COG0438 
# Protein_GI_number: 15891662 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 304 107 406 407 341 54.0 8e-94 MPLAIEQLDVSSHDVIISSSHAVAKGILTGPDQLHISYVHSPIRYAWDLQHQYLRESHLD KGVKGILAKYLLHKIRQWDYRTANGVDHFIANSQFIARRIHKVYGRTADVIYPPVDVHRF LMNTSKQDYYLTASRLVPYKKIDLIVEAFSNMPDKRLVVIGNGSEMVKIKSKAKSNIEIL GYQPDSVMLEHMQNAKAFVFAAEEDFGITPVEAQACGTPVVAFGKGGSLETVRPYGVDKP TGVFFDEQSVPSLVKAINFFDTVSDKIEPQDCRENAMRFSVEIFKNNLSKYVEDKWTEFN LSKRIQY >gi|296493458|gb|ADTK01000043.1| GENE 29 33520 - 33927 121 135 aa, chain + ## HITS:1 COG:wcaJ KEGG:ns NR:ns ## COG: wcaJ COG2148 # Protein_GI_number: 16129987 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Escherichia coli K12 # 1 135 330 464 464 239 80.0 7e-64 MRVMENDNIVVQAKKNDVRVTKVGKFLRSTSLDELPQFFNVWCGQMSVVGPRPHAVAHNE QYRALIQGYMLRHKVKPGITGLAQINGWRGETDTLEKMEKRIEYDLLYIRSWSIWLDLKI IFLTVFKGFINKAAY >gi|296493458|gb|ADTK01000043.1| GENE 30 34091 - 35497 1581 468 aa, chain + ## HITS:1 COG:gnd KEGG:ns NR:ns ## COG: gnd COG0362 # Protein_GI_number: 16129970 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconate dehydrogenase # Organism: Escherichia coli K12 # 1 468 1 468 468 889 95.0 0 MSKQQIGVVGMAVMGRNLALNIERRGYTVSVFNRSREKTEEVIAENPGKKLVPYYTVQEF VESLETPRRILLMVKAGAGTDSAIDSLKPYLDKGDIIIDGGNTFFQDTIRRNRELSAEGF NFIGTGISGGEEGALKGPSIMPGGQKEAYELVAPILKQIAAVAEDGEPCVTYIGADGAGH YVKMVHNGIEYGDMQLIAEAYALLKGGLALSNEELAQTFTEWNEGELSSYLIDITKDIFT KKDEEGKYLVDVILDEAANKGTGKWTSQSSLDLGEPLSLITESVFARYISSLKDQRVAAS KVLSGPQAQPAGDKAEFIEKVRRALYLGKIVSYAQGFSQLRAASDEYNWDLNYGEIAKIF RAGCIIRAQFLQKITDAYAQNAGIANLLLAPYFKQIADDYQQALRDVVAYAVQNGIPVPT FSAAIAYYDSYRSAVLPANLIQAQRDYFGAHTYKRTDKEGVFHTEWLE >gi|296493458|gb|ADTK01000043.1| GENE 31 36315 - 37595 139 426 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|300902319|ref|ZP_07120315.1| ## NR: gi|300902319|ref|ZP_07120315.1| hypothetical 
protein HMPREF9536_00506 [Escherichia coli MS 84-1] # 1 426 213 638 638 775 99.0 0 MKPKTSLIKTKGVSDVVIKGLVLTTHPTLSKPYNGSCIEIEGAKGWYIGNNVFSNYGASA IFCRNSSEMFIYNNMFAGSKGAAGDVTLWGSTCQSKIKNNTMLSGSDSAIIIQTIADGDF CSDNLIQDNNISNCTRYGIVVYNNLSSKKSILRNTKVIGNKIYNILGQVKNPSVHNTKTY GAGIYVLSAENITIQYNEVSKCNLLTNSSTLAPGCIGLNATSNAIVSDNIISDGAYYGIF IGDALQQGKGSNANSPDFIPDGIVYVQRNRIINCKRDGVFIINKHSVIINDNMVEGNQGC GISTSVSNNEFYHSMKNITIANNSSCKNKLDGINLSDAYCARVSGGAYSNNDRCGINLTS FGTEVSLLQCQDNKIGISISDKGMKCRIEDCNLTSNDVGVLSVVPYDERGCAFSKNKLSR KINASS Prediction of potential genes in microbial genomes Time: Mon May 16 15:06:50 2011 Seq name: gi|296493457|gb|ADTK01000044.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont105.1, whole genome shotgun sequence Length of sequence - 2436 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 + CDS 77 - 466 131 ## COG2963 Transposase and inactivated derivatives 2 1 Op 2 5/0.000 + CDS 463 - 810 159 ## PROTEIN SUPPORTED gi|148984516|ref|ZP_01817804.1| 50S ribosomal protein L9 3 1 Op 3 . 
+ CDS 860 - 2398 912 ## COG3436 Transposase and inactivated derivatives Predicted protein(s) >gi|296493457|gb|ADTK01000044.1| GENE 1 77 - 466 131 129 aa, chain + ## HITS:1 COG:ECs0328 KEGG:ns NR:ns ## COG: ECs0328 COG2963 # Protein_GI_number: 15829582 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 129 5 133 133 242 99.0 1e-64 MSDMQKNVTPGRRKGCPNYPPEFKQQLVAASCEPGISISKLALENGINANLLFKWRQQWR EGKLLLPSSESPQLLPVTLDAAAEQPESLAEDPETLSISCEVTFRHGTLRFNGNVSEKLL TLLIQELKR >gi|296493457|gb|ADTK01000044.1| GENE 2 463 - 810 159 115 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148984516|ref|ZP_01817804.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP3-BS71] # 1 99 2 100 107 65 38 3e-11 MIPLPSGTKIWLVAGITDMRNGFNGLAAKVQTALKDDPMSGHVFIFRGRSGSQVKLLWST GDGLCLLTKRLERGRFAWPSARDGKVFLTQAQLAMLLEGIDWRQPKRLLTSLTML >gi|296493457|gb|ADTK01000044.1| GENE 3 860 - 2398 912 512 aa, chain + ## HITS:1 COG:ECs0330 KEGG:ns NR:ns ## COG: ECs0330 COG3436 # Protein_GI_number: 15829584 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 512 1 512 512 975 94.0 0 MNDISSDDIFLLKQRLAEQEALIHALQEKLSNREREIDHLQAQLDKLRRMNFGSRSEKVS RRIAQMEADLNRLQKESDTLTGRVYDPAVQRPLRQTRTRKPFPESLPRDEKRLLPAAPCC PNCGGSLSYLGEDTAEQLELMRSAFRVIRTVREKHACTQCDAIVQAPAPSRPIERGIAGP GLLARVLTSKYAEHTPLYRQSEIYGRQGVELSRSLLSGWVDACCRLLSPLEEALHGYVLT DGKLHADDTPVPVLLPGNKKTKTGRLWTYVRDDRNAGSTLAPAVWFAYSPDRKGIHPQTH LAGFSGVLQADAYAGFNELYRDGRITEAACWAHARRKIHDVHVRTPSALTEEALKRIGEL YAIEAEIRGMTAEQRLAERQLKTKPLLKSLESWLREKMKTLSRHSELAKAFAYALNQWPA LTYYADDGWAEADNNIAENALRMVSLGRKNYLFFGSDHGGERGALLYSLIGTCKLNGVEP ESYLRYVLDVIADWPINRVGELLPWRVALPTE Prediction of potential genes in microbial genomes Time: Mon May 16 15:06:51 2011 Seq name: gi|296493456|gb|ADTK01000045.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont107.1, whole genome shotgun sequence Length of sequence - 2158 bp Number of 
predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 3 - 62 4.0 1 1 Tu 1 . + CDS 103 - 2151 1277 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins Predicted protein(s) >gi|296493456|gb|ADTK01000045.1| GENE 1 103 - 2151 1277 682 aa, chain + ## HITS:1 COG:AGl1858 KEGG:ns NR:ns ## COG: AGl1858 COG4771 # Protein_GI_number: 15891046 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 33 682 54 707 707 417 39.0 1e-116 MKNKYIIAPGIAVMCSAVISSGYASSDKKEDTLVVTASGFTQQLRNAPASVSVITSEQLQ KKPVSDLVDAVKDVEGISITGGNEKPDISIRGLSGDYTLILVDGRRQSGRESRPNGSGGF EAGFIPPVEAIERIEVIRGPMSSLYGSDAIGGVINIITKPVNNQTWDGVLGLGGIIQEHG KFGNSTTNDFYLSGPLIKDKLGLQLYGGMNYRKEDSISQGTPAKDNKNITATLQFTPTES QKFVFEYGKNNQVHTLTPGESLDAWTMRGNLKQPNSKRETHNSRSHWVAAWNAQGEILHP EIAVYQEKVIREIKSGKKDKYNHWDLNYESRKPEITNTIIDAKVTAFLPENVLTIGGQFQ HAELRDDSATGKKTTETQSVSIKQKAVFIENEYAATDSLALTGGLRLDNHEIYGSYWNPR LYAVYNLTDNLTLKGGIAKAFRAPSIREVSPGFGTLTQGGASIMYGNRDLKPETSVTEEI GIIYSNDSGFSASATLFNTDFKNKLTSYDIGTKDPVTGLNTFIYDNVGEANIRGVELATQ IPVYDKWHVSANYTFTDSRRKSDDESLNGKSLKGEPLERTPRHAANAKLEWDYTQDITFY SSLNYTGKQIWAAQRNGAKVPRVRNGFTSMDIGLNYQILPDTLINFAVLNVTDRKSEDID TIDGNWQVDEGRRYWANVRVSF Prediction of potential genes in microbial genomes Time: Mon May 16 15:06:52 2011 Seq name: gi|296493455|gb|ADTK01000046.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont113.1, whole genome shotgun sequence Length of sequence - 1955 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 1954 1593 ## COG5281 Phage-related minor tail protein Predicted protein(s) >gi|296493455|gb|ADTK01000046.1| GENE 1 1 - 1954 1593 651 aa, chain + ## HITS:1 COG:Z0975 KEGG:ns NR:ns ## COG: Z0975 COG5281 # Protein_GI_number: 15800511 # Func_class: S Function unknown # Function: Phage-related minor tail protein # Organism: Escherichia coli O157:H7 EDL933 # 1 651 182 832 1021 1014 98.0 0 EATRQKVAFIRQLKEQATRLNLSSSELLRAKAAQLGVSSAAEVYIRKMEQAGKATHSLGL KSAAARQEIGVLIGELARGNLGALRGSGITLANRAGWIDTLMSPKGMMLGGVIGGIAASV YGLGKAWYDGQKEGEEFNRQLSLTGHYAGVTAGQLWTLSRAISGNGITQHAAAGALAQVV GSGAFRGNDIGMVARAAAQMERSVGQSVSDTINQFKRLKDDPVNAAKALDNELHFLTATQ LEQIRVLGDQGRSSDAARIAMSALAEETGRRTADIDNNLNALGSTLKYLSDLWSRFWDAA MNIGREDSLDEQIAALQEKVSRAKRLPWTASSSQVEYDQQRLNDLQEKKRQKDLQDAKEQ AERNYQEQQKRRNAENAALNRMNETEAARHQREIARINSMQYADQAVRDAAIQRENERYE KALASGKKKTRETRNDEATRLLLQYSQQQAQVEGQIAAARQSAGIATEKMTEAHKQLLAL QQRISDLDGKKLTADEKSVLARKDELIQALTLLDVKQQELQKQTALNDLKKKTIQLTSQL AEEERAQRQQHDLDIATVGMGDQQRQRYQVQLSLRQKYQQQLEQLRRDSEQKGTYNTDDY RKAEQALTESLNRQLNENRRYWQQLEVVQGNWKNGVLRAFQDFTVDADNTA Prediction of potential genes in microbial genomes Time: Mon May 16 15:06:56 2011 Seq name: gi|296493454|gb|ADTK01000047.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont116.1, whole genome shotgun sequence Length of sequence - 14229 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 9, operones - 2 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 312 214 ## COG2963 Transposase and inactivated derivatives - Prom 369 - 428 2.0 2 2 Tu 1 . - CDS 431 - 727 128 ## UTI89_C4887 P pilus adhesin PapG protein - Prom 809 - 868 3.7 3 3 Tu 1 . - CDS 1586 - 1804 83 ## ECP_4534 PapF protein 4 4 Op 1 . - CDS 2162 - 2689 276 ## c3585 PapE protein 5 4 Op 2 . - CDS 2716 - 3252 293 ## c3586 PapK protein 6 4 Op 3 . 
- CDS 3262 - 3843 -41 ## ECS88_3261 protein PapJ precursor, P pilus assembly 7 4 Op 4 10/0.000 - CDS 3880 - 4599 559 ## COG3121 P pilus assembly protein, chaperone PapD 8 4 Op 5 . - CDS 4685 - 7204 1445 ## COG3188 P pilus assembly protein, porin PapC 9 4 Op 6 . - CDS 7254 - 7841 161 ## ECS88_3264 minor pilin protein PapH 10 4 Op 7 . - CDS 7904 - 8455 320 ## COG3539 P pilus assembly protein, pilin FimA - Prom 8496 - 8555 3.4 - Term 8500 - 8540 5.1 11 5 Tu 1 . - CDS 8661 - 8975 167 ## ECUMN_3340 pap operon regulatory protein PapB - Prom 9104 - 9163 7.4 12 6 Tu 1 . - CDS 9169 - 9342 91 ## - Prom 9379 - 9438 1.8 + Prom 9302 - 9361 7.7 13 7 Tu 1 . + CDS 9391 - 9612 173 ## UTI89_C4897 pap operon regulatory protein PapI + Term 9804 - 9847 9.1 14 8 Op 1 2/0.000 - CDS 10098 - 10283 97 ## COG2801 Transposase and inactivated derivatives 15 8 Op 2 . - CDS 10402 - 11337 161 ## COG3547 Transposase and inactivated derivatives - Prom 11377 - 11436 1.7 16 9 Tu 1 . - CDS 12020 - 13528 458 ## COG3344 Retron-type reverse transcriptase Predicted protein(s) >gi|296493454|gb|ADTK01000047.1| GENE 1 3 - 312 214 103 aa, chain - ## HITS:1 COG:ECs1328 KEGG:ns NR:ns ## COG: ECs1328 COG2963 # Protein_GI_number: 15830582 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 94 1 97 113 163 95.0 8e-41 MSRKTQRYSKEFKAEAVRTVLENQLSISEGASRLSLPEGTLGQWVTAARKGLGTPGSRTV AELESEILQLRKALNEARLERDILTVDSICQCNTLDWPPESPD >gi|296493454|gb|ADTK01000047.1| GENE 2 431 - 727 128 98 aa, chain - ## HITS:1 COG:no KEGG:UTI89_C4887 NR:ns ## KEGG: UTI89_C4887 # Name: papG # Def: P pilus adhesin PapG protein # Organism: E.coli_UTI89 # Pathway: not_defined # 2 93 240 330 335 65 34.0 6e-10 MAETNVKLTCTAPTDVKIKLSTPKTPNYNNSHFSVNLGNGWDSVISLDGYSYDEANLQNK YVTQSGRTIKIGSKLYGDVSKVKPGGINGSLVMTFDFQ >gi|296493454|gb|ADTK01000047.1| GENE 3 1586 - 1804 83 72 aa, chain - ## HITS:1 COG:no KEGG:ECP_4534 NR:ns ## KEGG: ECP_4534 # Name: not_defined # 
Def: PapF protein # Organism: E.coli_536 # Pathway: not_defined # 2 72 95 167 167 80 71.0 2e-14 MTDMTNLRIALYQGNNTSTRLTLGEGSGSGYRVTAGLDTTTSTFTFMSVLFRNGDLNGGA FSATASMSMIYN >gi|296493454|gb|ADTK01000047.1| GENE 4 2162 - 2689 276 175 aa, chain - ## HITS:1 COG:no KEGG:c3585 NR:ns ## KEGG: c3585 # Name: papE # Def: PapE protein # Organism: E.coli_CFT073 # Pathway: not_defined # 1 175 7 179 179 252 79.0 4e-66 MKKIIGLCLPVMLGAVLMSQHAHAADNLTFKGKLIIPACTVTKAEVDWGNIEIQTLSQNG NHEKEFTVNMQCPYNLGTMKVTITATNTQNNNSILVPNTSTTTGNGLLIYLYNRNQNGSI GSPIVFNSPFPPGYITGQVPARNISLYAKLGYKGNIQNLRAGTFSATATLVASYS >gi|296493454|gb|ADTK01000047.1| GENE 5 2716 - 3252 293 178 aa, chain - ## HITS:1 COG:no KEGG:c3586 NR:ns ## KEGG: c3586 # Name: papK # Def: PapK protein # Organism: E.coli_CFT073 # Pathway: not_defined # 1 178 2 179 179 321 97.0 8e-87 MIKSTGALLLFAALSAGQAMASDVAFRGNLLDRPCHVSGDSLKKHVVFKTRASRDFWYPP GRSPTESFVIRLENCHATAVGKIVTLTFKGTEEAALPGHLKVTGVNSGRLGIALLDTDGS SLLKPGTFHNKGQGEKVTGNSLELPFGAYVVATPEALRTKSVVPGDYEATATFELTYR >gi|296493454|gb|ADTK01000047.1| GENE 6 3262 - 3843 -41 193 aa, chain - ## HITS:1 COG:no KEGG:ECS88_3261 NR:ns ## KEGG: ECS88_3261 # Name: papJ # Def: protein PapJ precursor, P pilus assembly # Organism: E.coli_S88 # Pathway: not_defined # 1 193 1 193 193 377 99.0 1e-103 MVVNKTTAVLYLIALSLSGFIHTFLRAEERGIYDDVFTADELHHYRINERGGRTGSLAVS GALLSSPCTLVSNEVPLSLRPENHSASAGAPLMLRLAGCGDGGALQPGKRGVAMTVSGSL VTGPGSGSALLPDRKLSGCDHLVIHDGDTFLLCRPDRRQEEMLAAWRKRATQEGEYSDAR SNPAMLRLSIKYE >gi|296493454|gb|ADTK01000047.1| GENE 7 3880 - 4599 559 239 aa, chain - ## HITS:1 COG:YPO0699 KEGG:ns NR:ns ## COG: YPO0699 COG3121 # Protein_GI_number: 16121020 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Yersinia pestis # 6 237 7 237 239 283 59.0 2e-76 MIRKKILMAAIPLFVISGADAAVSLDRTRAVFDGSEKSMTLDISNDNKQLPYLAQAWIEN ENQEKIITGPVIATPPVQRLEPGAKSMVRLSTTPDISKLPQDRESLFYFNLREIPPRSEK 
ANVLQIALQTKIKLFYRPAAIKTRPNEVWQDQLILNKVSGGYRIENPTPYYVTVIGLGGS EKQAEEGEFETVMLSPRSEQTVKSANYNTPYLSYINDYGGRPVLSFICNGSRCSVKKEK >gi|296493454|gb|ADTK01000047.1| GENE 8 4685 - 7204 1445 839 aa, chain - ## HITS:1 COG:YPO0698 KEGG:ns NR:ns ## COG: YPO0698 COG3188 # Protein_GI_number: 16121019 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Yersinia pestis # 1 835 1 822 826 768 49.0 0 MRGMKDRIPFAVNNITCVILLSLFCNAASAVEFNTDVLDAADKKNIDFTRFSEAGYVLPG QYLLDVIVNGQSISPASLQISFVEPQSSGDKAEKKLPQACLTSDMVRLMGLTAESLDKVV YWHDGQCADFHGLPGVDIRPDTGAGVLRINMPQAWLEYSDATWLPPSRWDDGIPGLMLDY NLNGTVSRNYQGGDSHQFSYNGTVGGNLGPWRLRADYQGSQEQSRYNGEKTTNRNFTWSR FYLFRAIPRWRANLTLGENNINSDIFRSWSYTGASLESDDRMLPPRLRGYAPQITGIAET NARVVVSQQGRVLYDSMVPAGPFSIQDLDSSVRGRLDVEVIEQNGRKKTFQVDTASVPYL TRPGQVRYKLVSGRSRGYGHETEGPVFATGEASWGLSNQWSLYGGAVLAGDYNALAAGAG WDLGVPGTLSADITQSVARIEGERTFQGKSWRLSYSKRFDNADADITFAGYRFSERNYMT MEQYLNARYRNDYSSREKEMYTVTLNKNVADWNTSFNLQYSRQTYWDIRKTDYYTVSVNR YFNVFGLQGVAVGLSASRSKYLGRDNDSAYLRISVPLGTGTASYSGSMSNDRYVNMAGYT DTFNDGLDSYSLNAGLNSGGGLTSQRQINAYYSHRSPLANLSANIASLQKGYTSFGVSAS GGATITGKGAALHAGGMSGGTRLLVDTDGVGGVPVDGGQVVTNRWGTGVVTDISSYYRNI TSVDLKRLPDDVEATRSVVESALTEGAIGYRKFSVLKGKRLFAILRLADGSQPPFGASVT SEKGRELGMVADEGLAWLSGVTPGETLSVNWDEKIQCQVNVPETAISDQQLLLPCTPQK >gi|296493454|gb|ADTK01000047.1| GENE 9 7254 - 7841 161 195 aa, chain - ## HITS:1 COG:no KEGG:ECS88_3264 NR:ns ## KEGG: ECS88_3264 # Name: papH # Def: minor pilin protein PapH # Organism: E.coli_S88 # Pathway: not_defined # 1 195 1 195 195 412 99.0 1e-114 MKNFCVLFVMYFWSTLLHAVSSEPFPPPGMSLPEYWGEEHVWWDGRAAFHGEVVRPACTL AMEDAWQIIDMGETPVRDLQNGFSGPERKFSLRLRNCEFNSQGGNLFSDSRIRVTFDGVR GETPDKFNLSGQAKGINLQIADARGNIARAGKVMPAIPLTGNEEALDYTLRIVRNGKKLE AGNYFAVLGFRVDYE >gi|296493454|gb|ADTK01000047.1| GENE 10 7904 - 8455 320 183 aa, chain - ## HITS:1 COG:YPO2759 KEGG:ns NR:ns ## COG: YPO2759 COG3539 # Protein_GI_number: 16122963 # Func_class: N Cell motility; U 
Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Yersinia pestis # 3 182 6 175 176 83 35.0 2e-16 MIKSVIAGAVAMAVVSFGVNAAPTTPQGQGRVTFNGTVVDAPCSISQKSADQSIDFGQLS KSFLANDGQSKPMNLDIELVNCDITAFKNGNAKTGSVKLAFTGPTVSGHPSELATNGGTG TAIMIQAAGKNVPFDGTEGDANLLKDGDNVLHYTAVVKKSSDGNAQITEGAFSAVATFNL SYQ >gi|296493454|gb|ADTK01000047.1| GENE 11 8661 - 8975 167 104 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_3340 NR:ns ## KEGG: ECUMN_3340 # Name: papB # Def: pap operon regulatory protein PapB # Organism: E.coli_UMN026 # Pathway: not_defined # 1 104 1 104 104 196 98.0 2e-49 MAHHEVISRSGNEFLLNIRENVLLPGSMSEMHFFLLIGISSIHSDRVILAMKDYLVGGHS RKEVCEKYQMNNGYFSTTLGRLIRLNALAARLAPYYTDESSAFD >gi|296493454|gb|ADTK01000047.1| GENE 12 9169 - 9342 91 57 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLSVINGKVFLFSVFALFVKFSEFAFCWLYLMCITFRALFFCEKKVRKNAFRRSFML >gi|296493454|gb|ADTK01000047.1| GENE 13 9391 - 9612 173 73 aa, chain + ## HITS:1 COG:no KEGG:UTI89_C4897 NR:ns ## KEGG: UTI89_C4897 # Name: papI # Def: pap operon regulatory protein PapI # Organism: E.coli_UTI89 # Pathway: not_defined # 1 73 5 77 77 137 98.0 1e-31 MKNEILEFLNRHNGGKTAEIAEALAVTDYQARYYLLLLEKEGVVQRSPLRRGMATYWFLK GEMQAGQNCSSTT >gi|296493454|gb|ADTK01000047.1| GENE 14 10098 - 10283 97 61 aa, chain - ## HITS:1 COG:VC0257 KEGG:ns NR:ns ## COG: VC0257 COG2801 # Protein_GI_number: 15640286 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Vibrio cholerae # 1 59 223 281 290 93 69.0 1e-19 MERFFRSLKNEWIPVVGYVSFSEAAHAITDYIVGYYSALRPHEYNGGLPPNESENRYWKK L >gi|296493454|gb|ADTK01000047.1| GENE 15 10402 - 11337 161 311 aa, chain - ## HITS:1 COG:MT3430_1 KEGG:ns NR:ns ## COG: MT3430_1 COG3547 # Protein_GI_number: 15842922 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Mycobacterium tuberculosis CDC1551 # 15 298 81 362 413 83 31.0 5e-16 
MLAWMTSFGTLKRIGVECTGTYGSGLLRYFQNAGLEVLEVTAPDRMERRKRGKSDTIDAE CAAHAAFSGIRTVTPKTRDGMIESLRVLKTCRKTAISARRVALQIIHSNIISAPDELREQ LRNMTRMQLIRPLGSWRPDASEYRNVTNVYRISLKSLARRYLELHDEIADLDVMIAAIVD ELAPELIKRNAIGYESASQLLITAGDNPQRLRSESGFAALCGVSPVPVSSGKTNRYRLNR GGDRAANSALHIIAIGRLRTDAKTKEYVARRVAEGHTKMEAIRCLKRYISREVYTLLRNQ NRRINSIPITA >gi|296493454|gb|ADTK01000047.1| GENE 16 12020 - 13528 458 502 aa, chain - ## HITS:1 COG:ykfC KEGG:ns NR:ns ## COG: ykfC COG3344 # Protein_GI_number: 16128243 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Escherichia coli K12 # 1 364 1 364 376 720 98.0 0 MQRKLATWAVTDPSLRIQRLLRLITQPEWLAEAARITLSSKGAHTPGVDGVNKTMLQARL AVELQILRDELLSGHYQPLPARRVYIPKSNGKLRPLGIPALRDRIVQRAMLMAMEPIWES DFHTLSYGFRPERSVHHAIRTVKLQLTDCGETRGRWVIEGDLSSYFDTVHHRLLMKAVRR RISDARFMTLLWKTIKAGHIDVGLFRAASEGVPQGGVISPLLSNIMLNEFDQYLHERYLS GKARKDRWYWNNSIQRGRSTAVRENWQWKPAVAYCRYADDFVLIVKGTKAQAEAIREECR GVLEGSLKLRLNMDKTKITHVNDGFIFLGHRIIRKRSRYGEMRVVSTIPQEKARNFAASL TALLSGNYSESKVDMAEQLNRKLKGWAMFYQFVDFKAKVFSYIDRVVFWKLAHWLARKYR TGIASLMRWWCKSPKPGQSKTWVLFGKTNHGKLSGEILYRLVGQGKKLFRWRLPEGNPYL RTETRNTYTSRFTEVAMAFASI Prediction of potential genes in microbial genomes Time: Mon May 16 15:07:20 2011 Seq name: gi|296493453|gb|ADTK01000048.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont118.1, whole genome shotgun sequence Length of sequence - 936 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 76 - 333 90 ## B21_00650 hypothetical protein 2 1 Op 2 . 
- CDS 402 - 935 11 ## COG3209 Rhs family protein Predicted protein(s) >gi|296493453|gb|ADTK01000048.1| GENE 1 76 - 333 90 85 aa, chain - ## HITS:1 COG:no KEGG:B21_00650 NR:ns ## KEGG: B21_00650 # Name: ybfB # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 85 24 108 108 124 100.0 7e-28 MHRLSLLDSTRDVSELISLMSYGMMVICFPTGIVFFIALIFIGTVSDIIGVRIDSKYIMA IIIWLYFLSGGYIQWFVLSKRIINK >gi|296493453|gb|ADTK01000048.1| GENE 2 402 - 935 11 177 aa, chain - ## HITS:1 COG:rhsC KEGG:ns NR:ns ## COG: rhsC COG3209 # Protein_GI_number: 16128676 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli K12 # 1 177 1221 1397 1397 394 99.0 1e-110 DPIGLKGGWNFYQYPLNPISNIDPLGLETLKCIKPLHSMGGTGERSGPDIWGNPFYHQYL CVPDGKGDYTCGGQDQRGESKGDGLWGPGKASNDTKEAAGRCDLVETDNSCVENCLKGKF KEVRPRYSVLPDIFTPINLGLFKNCQDWSNDSLETCKMKCSGNNIGRLIRFVFTGVM Prediction of potential genes in microbial genomes Time: Mon May 16 15:07:22 2011 Seq name: gi|296493452|gb|ADTK01000049.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont141.1, whole genome shotgun sequence Length of sequence - 947 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 55 - 309 258 ## ECO103_3591 hypothetical protein 2 1 Op 2 . + CDS 356 - 733 285 ## ECED1_4987 toxin of the YeeV-YeeU toxin-antitoxin system 3 1 Op 3 . 
+ CDS 730 - 946 134 ## EcSMS35_3213 hypothetical protein Predicted protein(s) >gi|296493452|gb|ADTK01000049.1| GENE 1 55 - 309 258 84 aa, chain + ## HITS:1 COG:no KEGG:ECO103_3591 NR:ns ## KEGG: ECO103_3591 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 84 41 124 124 167 98.0 1e-40 MHYLADRAGIRGQFSDADAYHLDQAFPLLMKQLELMLTSGELNPRHQHTVTLYARGLTCE ADTLGSCGYVYMAVYPTLAPATTS >gi|296493452|gb|ADTK01000049.1| GENE 2 356 - 733 285 125 aa, chain + ## HITS:1 COG:no KEGG:ECED1_4987 NR:ns ## KEGG: ECED1_4987 # Name: yeeV # Def: toxin of the YeeV-YeeU toxin-antitoxin system # Organism: E.coli_ED1a # Pathway: not_defined # 1 125 35 159 159 254 98.0 9e-67 MKTLPDTHVREASRCPSPVTIWQTLLTRLLDQHYGLTLNDTPFADERVIEQHIEAGISLC DAVNFLVEKYALVRTDQPGFSACTRSQLINSIDILRARRATGLMARDNYRTVNNITLGKH PGAKQ >gi|296493452|gb|ADTK01000049.1| GENE 3 730 - 946 134 72 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_3213 NR:ns ## KEGG: EcSMS35_3213 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 72 1 72 162 139 100.0 3e-32 MKLALTLEADSINVQALNMGRIVVDVDGVTLAELINVVCDNGYSLRVVDESDRASAERTP PSAALTGIRCST Prediction of potential genes in microbial genomes Time: Mon May 16 15:07:29 2011 Seq name: gi|296493451|gb|ADTK01000050.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont147.1, whole genome shotgun sequence Length of sequence - 1009 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - LSU_RRNA 57 - 1009 100.0 # ECOUW87 [D:1..2063] # 23S ribosomal RNA # Escherichia coli str. K-12 substr. MG1655 # Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Escherichia. 
Prediction of potential genes in microbial genomes Time: Mon May 16 15:07:29 2011 Seq name: gi|296493450|gb|ADTK01000051.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont152.1, whole genome shotgun sequence Length of sequence - 1152 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 - CDS 2 - 503 430 ## COG0740 Protease subunit of ATP-dependent Clp proteases 2 1 Op 2 . - CDS 517 - 1152 496 ## COG5511 Bacteriophage capsid protein Predicted protein(s) >gi|296493450|gb|ADTK01000051.1| GENE 1 2 - 503 430 167 aa, chain - ## HITS:1 COG:ECs0829_1 KEGG:ns NR:ns ## COG: ECs0829_1 COG0740 # Protein_GI_number: 15830083 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Escherichia coli O157:H7 # 1 167 56 222 226 328 98.0 2e-90 MQAGHQSDADIYIYDEIGFWGVTAKQFISDLNALGDITHINLHINSPGGDVFEGIAIFNA LKTHGASITVYVDGVAASMASVIAMVGNPVIMPENSFMMIHKPFGFTGGDAEDMRTYADL LDKVEAVLLPAYAQKTGKTTDEIAAMLADETWMSGAECLAHGFADQL >gi|296493450|gb|ADTK01000051.1| GENE 2 517 - 1152 496 211 aa, chain - ## HITS:1 COG:ECs2961 KEGG:ns NR:ns ## COG: ECs2961 COG5511 # Protein_GI_number: 15832215 # Func_class: R General function prediction only # Function: Bacteriophage capsid protein # Organism: Escherichia coli O157:H7 # 1 211 290 500 500 335 79.0 4e-92 ERELTIQPGIIYDDLKPGEEIGMVKSDRPNPNLETFRNGQLRAVAAGSRLSFSSTARNYN GTYSAQRQELVESTDGYLILQDWFIGAVTRPMYRAWLKQAVASGVIRLPRDLDRSSMYTA VYSGPVMPWIDPVKEAEAWKIQIRGGAATESDWVRAGGRNPDDVKRRRKAEIDENRKLDL VFDTDPASDKGGSSAATKRQEPQHTDDQSEE Prediction of potential genes in microbial genomes Time: Mon May 16 15:07:30 2011 Seq name: gi|296493449|gb|ADTK01000052.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont154.1, whole genome shotgun sequence Length of sequence - 1002 bp Number of predicted genes - 1, 
with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 31 - 1002 598 ## COG3436 Transposase and inactivated derivatives Predicted protein(s) >gi|296493449|gb|ADTK01000052.1| GENE 1 31 - 1002 598 323 aa, chain - ## HITS:1 COG:ECs0330 KEGG:ns NR:ns ## COG: ECs0330 COG3436 # Protein_GI_number: 15829584 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 323 190 512 512 441 65.0 1e-124 EYADHLPLYRQSEIYRRQGVELSRATLGRWTGAVAELLEPLYDVLRQYVLMPGKVHADDI PVPVQEPGSGKTRTARLWVYVRDDRNAGSQMPPAVWFAYSPDRKGIHPQNHLAGYSGVLQ ADAYGGYRALYESGRITEAACMAHARRKIHDVHARAPTYITTEALQRIGELYAIEAEVRG CSAEQRLAARKARAAPLMQSLYDWIQQQMKTLSRHSDTAKAFAYLLKQWDALNVYCSNGW VEIDNNIAENALRGVAVGRKNWMFAGSDSGGEHAAVLYSLIGTCRLNNVEPEKWLRYVIE HIQDWPANRVRDLLPWKVDLSSQ Prediction of potential genes in microbial genomes Time: Mon May 16 15:07:31 2011 Seq name: gi|296493448|gb|ADTK01000053.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont157.1, whole genome shotgun sequence Length of sequence - 794 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 47 - 628 489 ## EcSMS35_3205 hypothetical protein Predicted protein(s) >gi|296493448|gb|ADTK01000053.1| GENE 1 47 - 628 489 193 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_3205 NR:ns ## KEGG: EcSMS35_3205 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 193 68 260 260 379 100.0 1e-104 MTEDGQALVMLKNGTIGITGLMQGCPNGVQTLLGSRISINGNLIPTSQMCNQQTGFRAVE VEIGQAPEMVKKAVHSIAERDVSVLQAFGVRMEFTRGDMLKVCPKFVTSLAGFSPKQTTT INKDSVLQAARQAYAREYDEETTETADFGSYEVKGNKVEFEVFNPEDRAYDKVTVTVGAD GNATSASVEFIGK Prediction of potential genes in microbial genomes Time: Mon May 16 15:07:34 2011 Seq name: gi|296493447|gb|ADTK01000054.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont167.1, whole genome shotgun sequence Length of sequence - 944 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 48 - 914 232 ## COG2801 Transposase and inactivated derivatives Predicted protein(s) >gi|296493447|gb|ADTK01000054.1| GENE 1 48 - 914 232 288 aa, chain + ## HITS:1 COG:tra5_g1 KEGG:ns NR:ns ## COG: tra5_g1 COG2801 # Protein_GI_number: 16128357 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 1 288 1 288 288 576 99.0 1e-164 MKYVFIEKHQAEFSIKAMCRVLRVARSGWYTWCQRRTRISTRQQFRQHCDSVVLAAFTRS KQRYGAPRLTDELRAQGYPFNVKTVAASLRRQGLRAKASRKFSPVSYRAHGLPVSENLLE QDFYASGPNQKWAGDITYLRTDEGWLYLAVVIDLWSRAVIGWSMSPRMTAQLACDALQMA LWRRKRPRNVIVHTDRGGQYCSADYQAQLKRHNLRGSMSAKGCCYDNACVESFFHSLKVE CIHGEHFISREIMRATVFNYIECDYNRWRRHSWCGGLSPEQFENQNLA Prediction of potential genes in microbial genomes Time: Mon May 16 15:07:35 2011 Seq name: gi|296493446|gb|ADTK01000055.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont168.1, whole genome shotgun sequence Length of sequence - 1509 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 
average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1110 282 ## PROTEIN SUPPORTED gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 Predicted protein(s) >gi|296493446|gb|ADTK01000055.1| GENE 1 1 - 1110 282 370 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 [Campylobacter concisus 13826] # 49 355 60 353 406 113 27 1e-25 MALGDVKVRSAKPEAKAYKLTDGEGMVLLVHPNGSKYWRLSYRFGGKEKMLALGKYPEVS LVDARGRRDEARKLLANGVDPSENKKAVKVEHEQEVITFEVVAREWHASNQKWSASHSAR VLKSLEDNLFAAIGKRNIADLKTRDLLVPIKAVESSGRLEVAARLQQHTTAIIRFAVQSG LIDYNPAQEIAGAVATAKRQHCAALELNRIPELLQRIYHYSGRPLTRLAVELTLLVFIRS SELRFARWSEVDFDTAMWTIPGEREPLEGVKHSQRGSKMRTPHLVPLSRQALTILEKIKS MSGNRELIFVGDHDPRKPMSENTLNKALRVMGYDTKTEVCGHGFRTMACSSLIESGLWSR GKRLVRTVLI Prediction of potential genes in microbial genomes Time: Mon May 16 15:07:46 2011 Seq name: gi|296493445|gb|ADTK01000056.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont169.1, whole genome shotgun sequence Length of sequence - 35579 bp Number of predicted genes - 37, with homology - 37 Number of transcription units - 13, operones - 8 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 437 - 862 177 ## ECIAI1_2622 putative bacteriophage protein 2 1 Op 2 . + CDS 862 - 1275 329 ## EcHS_A0941 putative tail fiber assembly protein 3 2 Op 1 6/0.000 + CDS 1382 - 2554 1061 ## COG3497 Phage tail sheath protein FI 4 2 Op 2 . + CDS 2564 - 3079 497 ## COG3498 Phage tail tube protein FII 5 2 Op 3 . + CDS 3134 - 3436 384 ## SbBS512_E2491 phage tail protein E + Term 3449 - 3483 3.2 6 3 Op 1 2/0.500 + CDS 3563 - 6640 2574 ## COG5283 Phage-related tail protein 7 3 Op 2 2/0.500 + CDS 6637 - 7122 513 ## COG3499 Phage protein U 8 3 Op 3 . + CDS 7119 - 8219 757 ## COG3500 Phage protein D + Prom 8221 - 8280 4.0 9 3 Op 4 . + CDS 8310 - 8528 146 ## B21_00857 hypothetical protein - Term 8600 - 8654 0.4 10 4 Tu 1 . 
- CDS 8764 - 10449 1580 ## COG2985 Predicted permease - Prom 10567 - 10626 5.5 + Prom 10552 - 10611 4.2 11 5 Tu 1 . + CDS 10719 - 11096 100 ## ECIAI39_0827 conserved hypothetical protein; putative inner membrane protein - Term 11086 - 11114 1.4 12 6 Tu 1 . - CDS 11126 - 11383 468 ## COG0695 Glutaredoxin and related proteins - Prom 11407 - 11466 2.2 + Prom 11377 - 11436 4.8 13 7 Op 1 . + CDS 11543 - 11896 135 ## S0846 hypothetical protein 14 7 Op 2 4/0.000 + CDS 11839 - 12435 531 ## COG0778 Nitroreductase 15 7 Op 3 . + CDS 12496 - 13398 1507 ## PROTEIN SUPPORTED gi|15830186|ref|NP_308959.1| ribosomal protein S6 modification protein + Prom 13400 - 13459 1.6 16 7 Op 4 . + CDS 13486 - 13962 415 ## EcE24377A_0925 TPR repeat-containing protein + Term 13970 - 14024 6.1 + Prom 14086 - 14145 7.3 17 8 Op 1 13/0.000 + CDS 14313 - 15425 1207 ## COG0687 Spermidine/putrescine-binding periplasmic protein 18 8 Op 2 30/0.000 + CDS 15520 - 16653 1300 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 19 8 Op 3 36/0.000 + CDS 16663 - 17616 924 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I 20 8 Op 4 . + CDS 17613 - 18458 761 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II 21 8 Op 5 . + CDS 18518 - 19006 330 ## LF82_2664 inner membrane protein YbjO 22 8 Op 6 . 
+ CDS 19047 - 20174 1092 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase - Term 20164 - 20195 2.5 23 9 Op 1 4/0.000 - CDS 20349 - 21080 840 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain - Prom 21147 - 21206 9.4 - Term 21188 - 21232 0.7 24 9 Op 2 12/0.000 - CDS 21372 - 22040 914 ## COG4160 ABC-type arginine/histidine transport system, permease component 25 9 Op 3 12/0.000 - CDS 22040 - 22756 804 ## COG4215 ABC-type arginine transport system, permease component 26 9 Op 4 7/0.000 - CDS 22763 - 23494 1111 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain 27 9 Op 5 . - CDS 23512 - 24312 246 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 - Prom 24383 - 24442 2.0 - Term 24387 - 24427 5.6 28 10 Tu 1 . - CDS 24458 - 24973 368 ## ECO103_0909 putative lipoprotein - Prom 25014 - 25073 4.1 + Prom 24964 - 25023 6.2 29 11 Op 1 4/0.000 + CDS 25099 - 25422 515 ## COG0393 Uncharacterized conserved protein 30 11 Op 2 . 
+ CDS 25419 - 26249 837 ## COG3023 Negative regulator of beta-lactamase expression + Term 26382 - 26423 1.0 31 12 Op 1 4/0.000 - CDS 26246 - 27259 902 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Prom 27287 - 27346 3.0 32 12 Op 2 4/0.000 - CDS 27358 - 28788 1194 ## COG0702 Predicted nucleoside-diphosphate-sugar epimerases 33 12 Op 3 5/0.000 - CDS 28799 - 29800 1173 ## COG2008 Threonine aldolase 34 12 Op 4 5/0.000 - CDS 29837 - 31555 1756 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] - Prom 31583 - 31642 5.3 - Term 31642 - 31677 2.6 35 12 Op 5 2/0.500 - CDS 31688 - 32656 878 ## COG1018 Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 36 12 Op 6 2/0.500 - CDS 32668 - 34320 1967 ## COG1151 6Fe-6S prismane cluster-containing protein - Prom 34353 - 34412 4.8 37 13 Tu 1 . - CDS 34464 - 35363 791 ## COG2431 Predicted membrane protein - Prom 35507 - 35566 3.4 Predicted protein(s) >gi|296493445|gb|ADTK01000056.1| GENE 1 437 - 862 177 141 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_2622 NR:ns ## KEGG: ECIAI1_2622 # Name: not_defined # Def: putative bacteriophage protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 141 221 361 361 97 92.0 2e-19 MTPTQNSYPVTIGAGGAGGVSATNGTSGGNSVFASLIAPGGAGGGKVGVTNTNGGNGGVP STGDIRITGGNGGDGQSGNISVSGEGGTSYWGGGGRAGAGGGVSGKAYGSGGGGAYDAGY SGTSMTGGKGAAGICIIEEFA >gi|296493445|gb|ADTK01000056.1| GENE 2 862 - 1275 329 137 aa, chain + ## HITS:1 COG:no KEGG:EcHS_A0941 NR:ns ## KEGG: EcHS_A0941 # Name: not_defined # Def: putative tail fiber assembly protein # Organism: E.coli_HS # Pathway: not_defined # 1 137 1 137 137 231 91.0 6e-60 MNASYAVIENGMVVNVIVWDGEDEFTVPDNLQLINISDISEQPGIGWAYSDGVFTAPLPP ERSHDELVADAEQKKQSLIDAAMVNISVIQLKLQAGRKLTQEETTRLNVVLDYIEAVTAT NTSTAPDIIWPVFPASR >gi|296493445|gb|ADTK01000056.1| GENE 3 1382 - 2554 1061 390 aa, chain + ## HITS:1 COG:STM2701 KEGG:ns NR:ns ## COG: STM2701 COG3497 # 
Protein_GI_number: 16766014 # Func_class: R General function prediction only # Function: Phage tail sheath protein FI # Organism: Salmonella typhimurium LT2 # 1 390 1 390 390 707 90.0 0 MAQDYHHGVRVVEVNEGTRSITTVSTAIVGMVCTGDDADAKMFPLNKPVLITDVLTASGK AGESGTLARSLDAIADQAKPVTVVVRVPQGETEEETTTNIIGAVTAEGKKTGMKALLSAQ SQLGVKPRILGVPGHDNKAVATELLGVAQSLRGFAYLSAYGCKTVQEAITYRKNFSQREG MLIWPDFTGWDTVLNADATAYATARALGLRAKIDEQTGWHKSLSNVGVNGVTGISADVFW DLQDPATDAGLLNQNDVTTLIRKDGFRFWGSRCLSDDPLFAFENYTRTAQVLMDTMAEAP MWAVDKPLNPSLARDIIEGIRAKMRSLISQGYLIGGDCWLDESVNDKDTLKAGKLTIDYD YTPVPPLENLMLRQRITDQYLVNFASQVSA >gi|296493445|gb|ADTK01000056.1| GENE 4 2564 - 3079 497 171 aa, chain + ## HITS:1 COG:STM2700 KEGG:ns NR:ns ## COG: STM2700 COG3498 # Protein_GI_number: 16766013 # Func_class: R General function prediction only # Function: Phage tail tube protein FII # Organism: Salmonella typhimurium LT2 # 1 171 1 171 171 315 95.0 2e-86 MALPRKLKHLNLFNDGNNWQGIVESLTLPKFTRKYEKYRGGGMPGAVDVDLGLDDSALDT EFSIGGTELLLFKQMGKATVDGIQLRFTGSIQRDDTGEVQAVELVVRGRHKEVDSGEWKT GESNTTKVTSTNSYAKLTINGEVLYEVDLINMVEIVDGVDLMEAHRNALGL >gi|296493445|gb|ADTK01000056.1| GENE 5 3134 - 3436 384 100 aa, chain + ## HITS:1 COG:no KEGG:SbBS512_E2491 NR:ns ## KEGG: SbBS512_E2491 # Name: not_defined # Def: phage tail protein E # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 1 100 1 100 100 149 100.0 4e-35 MSDKQTEKTIQLDTPIMRGKTEITEIVLRKPQSGALRGTRLQAIMDMDVNAMMTVIPRIS SPALTAQEIAEMDPADLTAMSVEVVTFLLKKSVLAGLPTA >gi|296493445|gb|ADTK01000056.1| GENE 6 3563 - 6640 2574 1025 aa, chain + ## HITS:1 COG:STM2697 KEGG:ns NR:ns ## COG: STM2697 COG5283 # Protein_GI_number: 16766010 # Func_class: S Function unknown # Function: Phage-related tail protein # Organism: Salmonella typhimurium LT2 # 1 773 1 758 935 722 61.0 0 MSDNNLRLQVILNAVDKLTRPFRAAQASSKELAGAIRNSRDALKQLNQAGNNLEKFRKLQ ADNKKLGDRLNYARQKANLLSSELEAMEQPSQRHLVALGRQTLAVQRLEEQQKYLQKQTA LVRAELYRAGISAKDDAGATARLARETSRYNQELSKQEARLKRLGEAQRRMNAARASYAR 
SLEVRDRIAGAGATTTAAGLAMGTPVMAAVKSYTSMEDAMKGVAKQVNGLRDDNGIRTAR FYEMQDAIKAASEQLPMENGAVDFAALVEGGARMNVANPDDSWEDQKRDLLAFASTAAKA ATAFELPADELSESLGKIAQLYKIPTRNIEQLGDALNYLDDNAMSKGADIIDVMQRLGGV ADRLDYRKAAALGSTFLTLGAAPEVAASAANAMVRELSIATMQSKSFFEGMNLLKLNPEV IEKQMTKDAMGTIQRVLEKVNALPQDKRLSAMTMLFGKEFGDDAAKLANNLPELQRQLKL TAGNDALGSMQKESDINKDSLSAQWLLVKTGAQNTFSSLGETLRQPLMDILYTVKSVTGA LRRWVEANPELTGTLMKASAVVAAVTVGLGTLAVALAAVLGPLAVIRLGFSVLGIKTLPS VTAAVTRTSSALSWLAGAPLVLLRRGLASSGNAAGLLTAPLSSLRRTASLTGNVLKTVAG APVALLRSGLSGLRAVAVMFMNPLAVLRGGLVAAGTVLRVLASGPLAMLRVALYAVSGLL GALLSPIGLVVTALAGVALVVWKYWQPITAFLGGVVEGFKAAAGPVSAAFEPLKPVFQWI GDKVQALWGWFTDLLTPVKSTSAELQSAAAMGRRFGEALAEGLNMVMHPLDSLKSGVSWL LEKLGIVSKEAAKAKLPESVTRQQPATVNADGKVMMPSGGFPSWGYGFAGMYDSGGYIPR GQFGIVGENGPEIVNGPANVTSRRNTAALAAVVAGMMGVAAAPAELPPLHPLALPAKGGE AMVSRAATVPPVQRIEAPMQIIIQTQPGQSAQDIAREVARQLDERERRLKAKARSNYSDQ GGYDA >gi|296493445|gb|ADTK01000056.1| GENE 7 6637 - 7122 513 161 aa, chain + ## HITS:1 COG:STM2696 KEGG:ns NR:ns ## COG: STM2696 COG3499 # Protein_GI_number: 16766009 # Func_class: R General function prediction only # Function: Phage protein U # Organism: Salmonella typhimurium LT2 # 1 160 1 160 161 261 85.0 3e-70 MMMVLGLYVFMLRTVPYQELQYQRSWRHAANSRVNRRPSTQFLGPENDMLTLSGVLMPEI TGGRLSLLALEQMAEQGKAWPLIEGSGTIYGMYVIEGLNQTKTEFFRDGMPRRIEFTLSL KRVDESLSDMFGDLSAQLNNLQGTATSALSDISKTVGGLLS >gi|296493445|gb|ADTK01000056.1| GENE 8 7119 - 8219 757 366 aa, chain + ## HITS:1 COG:STM2695 KEGG:ns NR:ns ## COG: STM2695 COG3500 # Protein_GI_number: 16766008 # Func_class: R General function prediction only # Function: Phage protein D # Organism: Salmonella typhimurium LT2 # 1 366 1 366 366 622 85.0 1e-178 MNFSSELLNKGNKTPAFSISIEGRDITTVLDNRLIGLTLTDNRGFEADQLDLELDDADGK IVLPRRGAVITLALGWKGQPLFPKGAFTVDEIEHTGAPDRLTIRARSADFRETLNTRREK SWHNTTIGEVVKEIAARHKLKMALGKELSDKPVEHIDQTNESDGSFLMRLARQYGAIASV KNGNLLFIRQGQGKSATGKPLPVITITRKDGDSHRFTLADRGAYTGVIASWLHTREPAKK ESTTVKRKRRTKKQKKEPEAKQGDYLVGTDENVLVLNRTYANRGNAERAAKMQWERLQRG 
VASFSLQLAEGRADLYTEMPVKVSGFKQPIDDAEWTITTLTHTVSPDNGFTTSLELEVRI DDFEME >gi|296493445|gb|ADTK01000056.1| GENE 9 8310 - 8528 146 72 aa, chain + ## HITS:1 COG:no KEGG:B21_00857 NR:ns ## KEGG: B21_00857 # Name: ybl53 # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 72 12 83 83 127 100.0 1e-28 MMICPLCGSAAHTRSSFQVSSLTKERYNQCQNINCSHTFVTHETFVRSIATPKESNPVQP HPMKSGQVALSL >gi|296493445|gb|ADTK01000056.1| GENE 10 8764 - 10449 1580 561 aa, chain - ## HITS:1 COG:ECs0927 KEGG:ns NR:ns ## COG: ECs0927 COG2985 # Protein_GI_number: 15830181 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Escherichia coli O157:H7 # 1 561 1 561 561 1046 100.0 0 MNINVAELLNGNYILLLFVVLALGLCLGKLRLGSIQLGNSIGVLVVSLLLGQQHFSINTD ALNLGFMLFIFCVGVEAGPNFFSIFFRDGKNYLMLALVMVGSALVIALGLGKLFGWDIGL TAGMLAGSMTSTPVLVGAGDTLRHSGMESRQLSLALDNLSLGYALTYLIGLVSLIVGARY LPKLQHQDLQTSAQQIARERGLDTDANRKVYLPVIRAYRVGPELVAWTDGKNLRELGIYR QTGCYIERIRRNGILANPDGDAVLQMGDEIALVGYPDAHARLDPSFRNGKEVFDRDLLDM RIVTEEVVVKNHNAVGKRLAQLKLTDHGCFLNRVIRSQIEMPIDDNVVLNKGDVLQVSGD ARRVKTIADRIGFISIHSQVTDLLAFCAFFVIGLMIGMITFQFSTFSFGMGNAAGLLFAG IMLGFMRANHPTFGYIPQGALSMVKEFGLMVFMAGVGLSAGSGINNGLGAIGGQMLIAGL IVSLVPVVICFLFGAYVLRMNRALLFGAMMGARTCAPAMEIISDTARSNIPALGYAGTYA IANVLLTLAGTIIVMVWPGLG >gi|296493445|gb|ADTK01000056.1| GENE 11 10719 - 11096 100 125 aa, chain + ## HITS:1 COG:no KEGG:ECIAI39_0827 NR:ns ## KEGG: ECIAI39_0827 # Name: ybjM # Def: conserved hypothetical protein; putative inner membrane protein # Organism: E.coli_IAI39 # Pathway: not_defined # 1 125 1 125 125 217 100.0 1e-55 MKHKQRWAGAICCFVLFIVVCLFLATHMKGAFRAAGHPEIGLLFFILPGAVASFFSQRRE VLKPLFGAMLAAPCSMLIMRLFFSPTRSFWQELAWLLSAVFWCALGALCFLFISSLFKPQ HRKNQ >gi|296493445|gb|ADTK01000056.1| GENE 12 11126 - 11383 468 85 aa, chain - ## HITS:1 COG:grxA KEGG:ns NR:ns ## COG: grxA COG0695 # Protein_GI_number: 16128817 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutaredoxin and related proteins # Organism: 
Escherichia coli K12 # 1 85 1 85 85 168 100.0 2e-42 MQTVIFGRSGCPYCVRAKDLAEKLSNERDDFQYQYVDIRAEGITKEDLQQKAGKPVETVP QIFVDQQHIGGYTDFAAWVKENLDA >gi|296493445|gb|ADTK01000056.1| GENE 13 11543 - 11896 135 117 aa, chain + ## HITS:1 COG:no KEGG:S0846 NR:ns ## KEGG: S0846 # Name: ybjC # Def: hypothetical protein # Organism: S.flexneri_2457T # Pathway: not_defined # 1 83 1 83 95 105 95.0 6e-22 MRAIGKLPKGVLILEFIGMMLLAVALLSVSDSLSLPEPFSRPEVQILMIFLGVLLMLPAA VVVILQVAKRLAPQLMNPFPKRSVRRLLTAPVRRPVPVFCSAVALFALPTKRYVKNW >gi|296493445|gb|ADTK01000056.1| GENE 14 11839 - 12435 531 198 aa, chain + ## HITS:1 COG:mdaA KEGG:ns NR:ns ## COG: mdaA COG0778 # Protein_GI_number: 16128819 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Escherichia coli K12 # 6 198 48 240 240 404 100.0 1e-113 MQCSSIIRITDKALREELVTLTGGQKHVAQAAEFWVFCADFNRHLQICPDAQLGLAEQLL LGVVDTAMMAQNALIAAESLGLGGVYIGGLRNNIEAVTKLLKLPQHVLPLFGLCLGWPAD NPDLKPRLPASILVHENSYQPLDKGALAQYDEQLAEYYLTRGSNNRRDTWSDHIRRTIIK ESRPFILDYLHKQGWATR >gi|296493445|gb|ADTK01000056.1| GENE 15 12496 - 13398 1507 300 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15830186|ref|NP_308959.1| ribosomal protein S6 modification protein [Escherichia coli O157:H7 str. 
Sakai] # 1 300 1 300 300 585 99 1e-166 MKIAILSRDGTLYSCKRLREAAIQRGHLVEILDPLSCYMNINPAASSIHYKGRKLPHFDA VIPRIGTAITFYGTAALRQFEMLGSYPLNESVAIARARDKLRSMQLLARQGIDLPVTGIA HSPDDTSDLIDMVGGAPLVVKLVEGTQGIGVVLAETRQAAESVIDAFRGLNAHILVQEYI KEAQGCDIRCLVVGDEVVAAIERRAKEGDFRSNLHRGGAASVASITPQEREIAIKAARTM ALDVAGVDILRANRGPLVMEVNASPGLEGIEKTTGIDIAGKMIRWIERHATTEYCLKTGG >gi|296493445|gb|ADTK01000056.1| GENE 16 13486 - 13962 415 158 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_0925 NR:ns ## KEGG: EcE24377A_0925 # Name: not_defined # Def: TPR repeat-containing protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 158 17 174 174 310 100.0 1e-83 MTSLVVPGLDTLRQWLDDLGMSFFECDNCQALHLPHMQNFDGVFDAKIDLIDNTILFSAM AEVRPSAVLPLAADLSAINASSLTVKAFLDMQDDNLPKLVVCQSLSVMQGVTYEQFAWFV RQSEEQISMVILEANAHQLLLPTDDEGQNNVTENYFLH >gi|296493445|gb|ADTK01000056.1| GENE 17 14313 - 15425 1207 370 aa, chain + ## HITS:1 COG:potF KEGG:ns NR:ns ## COG: potF COG0687 # Protein_GI_number: 16128822 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Escherichia coli K12 # 1 370 1 370 370 732 99.0 0 MTALNKKWLSGLAAGALMAVSVGTLAAEQKTLHVYNWSDYIAPDTVANFEKETGIKVVYD VFDSNEVLEGKLMAGSTGFDLVVPSASFLERQLTAGVFQPLDKSKLPEWKNLDPELLKLV AKHDPDNKFAMPYMWATTGIGYNVDKVKAVLGENAPVDSWDLILKPENLEKLKSCGVSFL DAPEEVFATVLNYLGKDPNSTKADDYTGPATDLLLKLRPNIRYFHSSQYINDLANGDICV AIGWAGDVWQASNRAKEAKNGVNVSFSIPKEGAMAFFDVFAMPADAKNKDEAYQFLNYLL RPDVVAHISDHVFYANANKAATPLVSAEVRENPGIYPPADVRAKLFTLKVQDPKIDRVRT RAWTKVKSGK >gi|296493445|gb|ADTK01000056.1| GENE 18 15520 - 16653 1300 377 aa, chain + ## HITS:1 COG:potG KEGG:ns NR:ns ## COG: potG COG3842 # Protein_GI_number: 16128823 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Escherichia coli K12 # 1 377 28 404 404 757 99.0 0 MNDAISRPQAKTRKALTPLLEIRNLTKSYDGQHAVDDVSLTIYKGEIFALLGASGCGKST LLRMLAGFEQPSAGQIMLDGVDLSQVPPYLRPINMMFQSYALFPHMTVEQNIAFGLKQDK 
LPKAEIASRVHEMLGLVHMQEFAKRKPHQLSGGQRQRVALARSLAKRPKLLLLDEPMGAL DKKLRDRMQLEVVDILERVGVTCVMVTHDQEEAMTMAGRIAIMNRGKFVQIGEPEEIYEH PTTRYSAEFIGSVNVFEGVLKERQEDGLVLDSPGLVHPLKVDADASVVDNVPVHVALRPE KIMLCEEPPANGCNFAVGEVIHIAYLGDLSVYHVRLKSGQMISAQLQNAHRHRKGLPTWG DEVRLCWEVDSCVVLTV >gi|296493445|gb|ADTK01000056.1| GENE 19 16663 - 17616 924 317 aa, chain + ## HITS:1 COG:potH KEGG:ns NR:ns ## COG: potH COG1176 # Protein_GI_number: 16128824 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Escherichia coli K12 # 1 317 1 317 317 551 99.0 1e-157 MNTLEPAAQSKPPGGFKLWLSQLQMKHGRKLVIALPYIWLILLFLLPFLIVFKISLSEMA RAIPPYTELMEWADGQLSITLNLGNFLQLTDDPLYFDAYLQSLQVAAISTFCCLLIGYPL AWAVAHSKPSTRNILLLLVILPSWTSFLIRVYAWMGILKNNGVLNNFLLWLGVIDQPLTI LHTNLAVYIGIVYAYVPFMVLPIYTALIRIDYSLVEAALDLGARPLKTFFTVIVPLTKGG IIAGSMLVFIPAVGEFVIPELLGGPDSIMIGRVLWQEFFNNRDWPVASAVAIIMLLLLIV PIMWFHKHQQKSVGEHG >gi|296493445|gb|ADTK01000056.1| GENE 20 17613 - 18458 761 281 aa, chain + ## HITS:1 COG:STM0880 KEGG:ns NR:ns ## COG: STM0880 COG1177 # Protein_GI_number: 16764242 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # Organism: Salmonella typhimurium LT2 # 1 281 1 281 281 451 94.0 1e-127 MNNLPVVRSPWRIVILLLGFTFLYAPMLMLVIYSFNSSKLVTVWAGWSTRWYGELLRDDA MMSAVGLSLTIAACAATAAAILGTIAAVVLVRFGRFRGSNGFAFMITAPLVMPDVITGLS LLLLFVALAHAIGWPADRGMLTIWLAHVTFCTAYVAVVISSRLRELDRSIEEAAMDLGAT PLKVFFVITLPMIMPAIISGWLLAFTLSLDDLVIASFVSGPGATTLPMLVFSSVRMGVNP EINALATLILGAVGIVGFIAWYLMARAEKQRIRDIQRARHG >gi|296493445|gb|ADTK01000056.1| GENE 21 18518 - 19006 330 162 aa, chain + ## HITS:1 COG:no KEGG:LF82_2664 NR:ns ## KEGG: LF82_2664 # Name: ybjO # Def: inner membrane protein YbjO # Organism: E.coli_LF82 # Pathway: not_defined # 1 162 1 162 162 290 100.0 1e-77 MEDETLGFFKKTSSSHARLNVPALVQVAALAIIMIRGLDVLMIFNTLGVRGIGEFIHRSV QTWSLTLVFLSSLVLVFIEIWCAFSLVKGRRWARWLYLLTQITAASYLWAASLGYGYPEL 
FSIPGESKREIFHSLMLQKLPDMLILMLLFVPSTSRRFFQLQ >gi|296493445|gb|ADTK01000056.1| GENE 22 19047 - 20174 1092 375 aa, chain + ## HITS:1 COG:ybjF KEGG:ns NR:ns ## COG: ybjF COG2265 # Protein_GI_number: 16128827 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Escherichia coli K12 # 1 375 1 375 375 756 98.0 0 MQCALYDAGRCRSCQWITQPIPEQLSAKTADLKNLLADFPVEEWCAPVSGPEQGFRNKAK MVVSGSVEKPLLGMLHRDGTPEDLCDCPLYPASFAPVFAALKPFIARAGLTPYNVARKRG ELKYILLTESQSDGGMMLRFVLRSETKLAQLRKALPWLQEQLPQLKVITVNIQPVHMAIM EGETEIYLTEQQALAERFNDVPLWIRPQSFFQTNPAVASQLYATARDWVRQLPVKHMWDL FCGVGGFGLHCATPDMQLTGIEIASEAIACAKQSAAELGLTRLQFQALDSTQFATAQGEV PELVLVNPPRRGIGKPLCDYLSTMAPRFIIYSSCNAQTMAKDIRELPGYRIERVQLFDMF PHTAHYEVLTLLVKQ >gi|296493445|gb|ADTK01000056.1| GENE 23 20349 - 21080 840 243 aa, chain - ## HITS:1 COG:artJ KEGG:ns NR:ns ## COG: artJ COG0834 # Protein_GI_number: 16128828 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Escherichia coli K12 # 1 243 1 243 243 480 99.0 1e-136 MKKLVLAALLASFTFGASAAEKINFGISATYPPFESIGANNEIVGFDIDLAKALCKQMQA ECTFTNHAFDSLIPSLKFRKYDAVISGMDITPERSKQVSFTTPYYENSAVVIAKKDTYKT FADLKGKRIGMENGTTHQKYIQDQHPEVKTVSYDSYQNAFIDLKNGRIDGVFGDTAVVNE WLKTNPQLGVATEKVTDPQYFGTGLGIAVRPDNKALLEKLNNALAAIKADGTYQKINDQW FPQ >gi|296493445|gb|ADTK01000056.1| GENE 24 21372 - 22040 914 222 aa, chain - ## HITS:1 COG:ECs0944 KEGG:ns NR:ns ## COG: ECs0944 COG4160 # Protein_GI_number: 15830198 # Func_class: E Amino acid transport and metabolism # Function: ABC-type arginine/histidine transport system, permease component # Organism: Escherichia coli O157:H7 # 1 222 1 222 222 372 100.0 1e-103 MFEYLPELMKGLHTSLTLTVASLIVALILALIFTIILTLKTPVLVWLVRGYITLFTGTPL LVQIFLIYYGPGQFPTLQEYPALWHLLSEPWLCALIALSLNSAAYTTQLFYGAIRAIPEG 
QWQSCSALGMSKKDTLAILLPYAFKRSLSSYSNEVVLVFKSTSLAYTITLMEVMGYSQLL YGRTYDVMVFGAAGIIYLVVNGLLTLMMRLIERKALAFERRN >gi|296493445|gb|ADTK01000056.1| GENE 25 22040 - 22756 804 238 aa, chain - ## HITS:1 COG:artQ KEGG:ns NR:ns ## COG: artQ COG4215 # Protein_GI_number: 16128830 # Func_class: E Amino acid transport and metabolism # Function: ABC-type arginine transport system, permease component # Organism: Escherichia coli K12 # 1 238 1 238 238 426 100.0 1e-119 MNEFFPLASAAGMTVGLAVCALIVGLALAMFFAVWESAKWRPVAWAGSALVTILRGLPEI LVVLFIYFGSSQLLLTLSDGFTINLGFVQIPVQMDIENFDVSPFLCGVIALSLLYAAYAS QTLRGALKAVPVGQWESGQALGLSKSAIFFRLVMPQMWRHALPGLGNQWLVLLKDTALVS LISVNDLMLQTKSIATRTQEPFTWYIVAAAIYLVITLLSQYILKRIDLRATRFERRPS >gi|296493445|gb|ADTK01000056.1| GENE 26 22763 - 23494 1111 243 aa, chain - ## HITS:1 COG:artI KEGG:ns NR:ns ## COG: artI COG0834 # Protein_GI_number: 16128831 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Escherichia coli K12 # 1 243 1 243 243 475 100.0 1e-134 MKKVLIAALIAGFSLSATAAETIRFATEASYPPFESIDANNQIVGFDVDLAQALCKEIDA TCTFSNQAFDSLIPSLKFRRVEAVMAGMDITPEREKQVLFTTPYYDNSALFVGQQGKYTS VDQLKGKKVGVQNGTTHQKFIMDKHPEITTVPYDSYQNAKLDLQNGRIDGVFGDTAVVTE WLKDNPKLAAVGDKVTDKDYFGTGLGIAVRQGNTELQQKLNTALEKVKKDGTYETIYNKW FQK >gi|296493445|gb|ADTK01000056.1| GENE 27 23512 - 24312 246 266 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 23 244 1 215 305 99 29 3e-20 MHNYSVKGTIFDLNSYQTASIRVSMSIQLNGINCFYGAHQALFDITLDCPQGETLVLLGP SGAGKSSLLRVLNLLEMPRSGTLNIAGNHFDFTKTPSDKAIRDLRRNVGMVFQQYNLWPH LTVQQNLIEAPCRVLGLSKDQALARAEKLLERLRLKPYSDRYPLHLSGGQQQRVAIARAL MMEPQVLLFDEPTAALDPEITAQIVSIIRELAETNITQVIVTHEVEVARKTASRVVYMEN GHIVEQGDASCFTEPQTEAFKNYLSH >gi|296493445|gb|ADTK01000056.1| GENE 28 24458 - 24973 368 171 aa, chain - ## HITS:1 COG:no KEGG:ECO103_0909 NR:ns ## KEGG: ECO103_0909 # Name: 
ybjP # Def: putative lipoprotein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 171 1 171 171 337 99.0 1e-91 MRYSKLTMLIPCALLLSACTTVTPAYKDNGTRSGPCVEGGPDNVAQQFYDYRILHRSNDI TALRPYLSDKLATLLSDASRDNNHRELLTNDPFSSRTTLPESARVASASTIPNRDARNIP LRVDLKKGDQSWQDEVLMIQEGQCWVIDDVRYLGGSVHATAGTLRQSIENR >gi|296493445|gb|ADTK01000056.1| GENE 29 25099 - 25422 515 107 aa, chain + ## HITS:1 COG:ECs0952 KEGG:ns NR:ns ## COG: ECs0952 COG0393 # Protein_GI_number: 15830206 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 107 1 107 107 187 100.0 4e-48 MQFSTTPTLEGQTIVEYCGVVTGEAILGANIFRDFFAGIRDIVGGRSGAYEKELRKAREI AFEELGSQARALGADAVVGIDIDYETVGQNGSMLMVSVSGTAVKTRR >gi|296493445|gb|ADTK01000056.1| GENE 30 25419 - 26249 837 276 aa, chain + ## HITS:1 COG:ECs0953 KEGG:ns NR:ns ## COG: ECs0953 COG3023 # Protein_GI_number: 15830207 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Escherichia coli O157:H7 # 1 276 1 276 276 540 99.0 1e-153 MRRVFWLVAAALLLAGCAGEKGIVEKEGYQLDTRRQAQAAYPRIKVLVIHYTADDFDSSL ATLTDKQVSSHYLVPAVPPRYNGKPRIWQLVPEQELAWHAGISAWRGATRLNDTSIGIEL ENRGWQKSAGVKYFAPFEPAQIQALIPLAKDIIARYHIKPENVVAHADIAPQRKDDPGPL FPWQQLAQQGIGAWPDAQRVNFYLAGRAPHTPVDTASLLELLARYGYDVKPDMTPREQRR VIMAFQMHFRPTLYNGEADAETQAIAEALLEKYGQD >gi|296493445|gb|ADTK01000056.1| GENE 31 26246 - 27259 902 337 aa, chain - ## HITS:1 COG:ybjS KEGG:ns NR:ns ## COG: ybjS COG0451 # Protein_GI_number: 16128836 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Escherichia coli K12 # 1 337 13 349 349 707 100.0 0 MKVLVTGATSGLGRNAVEFLCQKGISVRATGRNEAMGKLLEKMGAEFVPADLTELVSSQA KVMLAGIDTLWHCSSFTSPWGTQQAFDLANVRATRRLGEWAVAWGVRNFIHISSPSLYFD YHHHRDIKEDFRPHRFANEFARSKAASEEVINMLSQANPQTRFTILRPQSLFGPHDKVFI PRLAHMMHHYGSILLPHGGSALVDMTYYENAVHAMWLASQEACDKLPSGRVYNITNGEHR 
TLRSIVQKLIDELNIDCRIRSVPYPMLDMIARSMERLGRKSAKEPPLTHYGVSKLNFDFT LDITRAQEELGYQPVITLDEGIEKTAAWLRDHGKLPR >gi|296493445|gb|ADTK01000056.1| GENE 32 27358 - 28788 1194 476 aa, chain - ## HITS:1 COG:ybjT KEGG:ns NR:ns ## COG: ybjT COG0702 # Protein_GI_number: 16128837 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate-sugar epimerases # Organism: Escherichia coli K12 # 1 476 11 486 486 958 99.0 0 MPQRILVLGASGYIGQHLVRTLSQQGHQILAAARHVDRLAKLQLANVSCHKVDLSWPDNL PALLQDIDTVYFLVHSMGEGGDFIAQERQVALNVRDALREVPVKQLIFLSSLQAPPHEQS DHLRARQATADILREANVPVTELRAGIIVGAGSAAFEVMRDMVYNLPVLTPPRWVRSRTT PIALENLLHYLVALLDHPASEHRIFEAAGPEVLSYQQQFEHFMAVSGKRRWLIPIPLPTR WISVWFLNVITSVPPTTARALIQGLKHDLLADDTALRALIPQRLIAFDDAVRSTLKEEEK LVNSSDWGYDAQAFARWRPEYGYFAKQAGFTVKTSASLAALWQVVNQIGGKERYFFGNIL WQTRALMDRAIGHKLAKGRPEREYLQTGDAVDSWKVIVVEPEKQLTLLFGMKAPGLGRLC FSLEDKGDYRTIDVRAFWHPHGMPGLFYWLLMIPAHLFIFRGMAKQIARLAEQSTD >gi|296493445|gb|ADTK01000056.1| GENE 33 28799 - 29800 1173 333 aa, chain - ## HITS:1 COG:ybjU KEGG:ns NR:ns ## COG: ybjU COG2008 # Protein_GI_number: 16128838 # Func_class: E Amino acid transport and metabolism # Function: Threonine aldolase # Organism: Escherichia coli K12 # 1 333 1 333 333 660 99.0 0 MIDLRSDTVTRPSRAMLEAMMAAPVGDDVYGDDPTVNALQDYAAELSGKEAAIFLPTGTQ ANLVALLSHCERGEEYIVGQAAHNYLFEAGGAAVLGSIQPQPIDAAADGTLPLDKVAMKI KPDDIHFARTKLLSLENTHNGKVLPREYLKEAWEFTRERNLALHVDGARIFNAVVAYGCE LKEITQYCDSFTICLSKGLGTPVGSLLVGNRDYIKRAIRWRKMAGGGMRQSGILAAAGMY ALKNNVARLQEDHDNAAWMAEQLREAGADVMRQDTNMLFVRVGEENAAALGEYMKARNVL INASPIVRLVTHLDVSREQLAEVAAHWRAFLAR >gi|296493445|gb|ADTK01000056.1| GENE 34 29837 - 31555 1756 572 aa, chain - ## HITS:1 COG:ECs0957 KEGG:ns NR:ns ## COG: ECs0957 COG0028 # Protein_GI_number: 15830211 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate 
carboligase, phosphonopyruvate decarboxylase] # Organism: Escherichia coli O157:H7 # 1 572 1 572 572 1149 99.0 0 MKQTVAAYIAKTLESAGVKRIWGVTGDSLNGLSDSLNRMGTIEWMSTRHEEVAAFAAGAE AQLSGELAVCAGSCGPGNLHLINGLFDCHRNHVPVLAIAAHIPSSEIGSGYFQETHPQEL FRECSHYCELVSSPEQIPQVLAIAMRKAVLNRGVSVVVLPGDVALKPAPEGATTHWYHAP QPVVTPEEEELRKLAQLLRYSSNIALMCGSGCAGAHKELVEFAGKIKAPIVHALRGKEHV EYDNPYDVGMTGLIGFSSGFHTMMNADTLVLLGTQFPYRAFYPTDAKIIQIDINPASIGA HSKVDMALVGDIKSTLRALLPLVEEKADRKFLDKALEDYRDARKGLDDLAKPSEKAIHPQ YLAQQISHFAADDAIFTCDVGTPTVWAARYLKMNGKRRLLGSFNHGSMANAMPQALGAQA TEPERQVVAMCGDGGFSMLMGDFLSVVQMKLPVKIVVFNNSVLGFVAMEMKAGGYLTDGT ELHDTNFARIAEACGITGIRVEKASEVDEALQRAFSIDGPVLVDVVVAKEELAIPPQIKL EQAKGFSLYMLRAIISGRGDEVIELAKTNWLR >gi|296493445|gb|ADTK01000056.1| GENE 35 31688 - 32656 878 322 aa, chain - ## HITS:1 COG:ybjV KEGG:ns NR:ns ## COG: ybjV COG1018 # Protein_GI_number: 16128840 # Func_class: C Energy production and conversion # Function: Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 # Organism: Escherichia coli K12 # 1 322 1 322 322 663 100.0 0 MTMPTNQCPWRMQVHHITQETPDVWTISLICHDYYPYRAGQYALVSVRNSAETLRAYTIS STPGVSEYITLTVRRIDDGVGSQWLTRDVKRGDYLWLSDAMGEFTCDDKAEDKFLLLAAG CGVTPIMSMRRWLAKNRPQADVRVIYNVRTPQDVIFADEWRNYPVTLVAENNVTEGFIAG RLTRELLAGVPDLASRTVMTCGPAPYMDWVEQEVKALGVTRFFKEKFFTPVAEAATSGLK FTKLQPAREFYAPVGTTLLEALESNNVPVVAACRAGVCGCCKTKVVSGEYTVSSTMTLTD AEIAEGYVLACSCHPQGDLVLA >gi|296493445|gb|ADTK01000056.1| GENE 36 32668 - 34320 1967 550 aa, chain - ## HITS:1 COG:ECs0959 KEGG:ns NR:ns ## COG: ECs0959 COG1151 # Protein_GI_number: 15830213 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Escherichia coli O157:H7 # 1 550 3 552 552 1160 100.0 0 MFCVQCEQTIRTPAGNGCSYAQGMCGKTAETSDLQDLLIAALQGLSAWAVKAREYGIINH DVDSFAPRAFFSTLTNVNFDSPRIVGYAREAIALREALKAQCLAVDANARVDNPMADLQL VSDDLGELQRQAAEFTPNKDKAAIGENILGLRLLCLYGLKGAAAYMEHAHVLGQYDNDIY AQYHKIMAWLGTWPADMNALLECSMEIGQMNFKVMSILDAGETGKYGHPTPTQVNVKATA 
GKCILISGHDLKDLYNLLEQTEGTGVNVYTHGEMLPAHGYPELRKFKHLVGNYGSGWQNQ QVEFARFPGPIVMTSNCIIDPTVGAYDDRIWTRSIVGWPGVRHLDGEDFSAVIAQAQQMA GFPYSEIPHLITVGFGRQTLLGAADTLIDLVSREKLRHIFLLGGCDGARGERHYFTDFAT SVPDDCLILTLACGKYRFNKLEFGDIEGLPRLVDAGQCNDAYSAIILAVTLAEKLGCGVN DLPLSLVLSWFEQKAIVILLTLLSLGVKNIVTGPTAPGFLTPDLLAVLNEKFGLRSITTV EEDMKQLLSA >gi|296493445|gb|ADTK01000056.1| GENE 37 34464 - 35363 791 299 aa, chain - ## HITS:1 COG:ybjE KEGG:ns NR:ns ## COG: ybjE COG2431 # Protein_GI_number: 16128842 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 299 17 315 315 479 100.0 1e-135 MFSGLLIILVPLIVGYLIPLRQQAALKVINQLLSWMVYLILFFMGISLAFLDNLASNLLA ILHYSAVSITVILLCNIAALMWLERGLPWRNHHQQEKLPSRIAMALESLKLCGVVVIGFA IGLSGLAFLQHATEASEYTLILLLFLVGIQLRNNGMTLKQIVLNRRGMIVAVVVVVSSLI GGLINAFILDLPINTALAMASGFGWYSLSGILLTESFGPVIGSAAFFNDLARELIAIMLI PGLIRRSRSTALGLCGATSMDFTLPVLQRTGGLDMVPAAIVHGFILSLLVPILIAFFSA Prediction of potential genes in microbial genomes Time: Mon May 16 15:08:31 2011 Seq name: gi|296493444|gb|ADTK01000057.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont169.2, whole genome shotgun sequence Length of sequence - 64987 bp Number of predicted genes - 70, with homology - 68 Number of transcription units - 35, operones - 17 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 102 - 155 9.6 1 1 Tu 1 . - CDS 156 - 851 771 ## COG0580 Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) - Prom 1094 - 1153 8.2 + Prom 1047 - 1106 7.5 2 2 Tu 1 . + CDS 1277 - 2935 1279 ## COG3593 Predicted ATP-dependent endonuclease of the OLD family + Term 2936 - 2964 -0.6 - Term 2774 - 2815 0.2 3 3 Tu 1 . - CDS 2932 - 3882 663 ## COG2990 Uncharacterized protein conserved in bacteria - Prom 3961 - 4020 3.3 4 4 Op 1 13/0.000 + CDS 4039 - 5154 1333 ## COG0845 Membrane-fusion protein 5 4 Op 2 . 
+ CDS 5151 - 7097 368 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 + Term 7107 - 7152 7.1 - Term 7100 - 7133 4.1 6 5 Tu 1 . - CDS 7170 - 7394 214 ## COG1278 Cold shock proteins - Prom 7482 - 7541 4.5 + Prom 7197 - 7256 2.2 7 6 Tu 1 . + CDS 7305 - 7547 119 ## + Prom 7592 - 7651 1.8 8 7 Op 1 19/0.000 + CDS 7717 - 8037 343 ## COG2127 Uncharacterized conserved protein 9 7 Op 2 . + CDS 8068 - 10344 1361 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 + Term 10370 - 10409 8.4 - TRNA 10688 - 10775 68.6 # Ser GGA 0 0 + Prom 10890 - 10949 5.0 10 8 Op 1 3/0.846 + CDS 11009 - 11947 627 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases 11 8 Op 2 5/0.462 + CDS 12002 - 12739 674 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family 12 8 Op 3 . + CDS 12763 - 13317 705 ## COG3381 Uncharacterized component of anaerobic dehydrogenases 13 8 Op 4 . + CDS 13389 - 13910 555 ## SSON_1045 hypothetical protein + Term 13944 - 13974 0.2 - Term 13932 - 13962 0.2 14 9 Op 1 . - CDS 13974 - 14807 799 ## COG1462 Uncharacterized protein involved in formation of curli polymers 15 9 Op 2 . - CDS 14834 - 15220 348 ## ECSP_1339 curli assembly protein CsgF 16 9 Op 3 . - CDS 15275 - 15664 125 ## APECO1_124 curli assembly protein CsgE 17 9 Op 4 . - CDS 15669 - 16319 221 ## COG2771 DNA-binding HTH domain-containing proteins + Prom 16162 - 16221 3.9 18 10 Tu 1 . + CDS 16287 - 16433 58 ## UTI89_C1162 hypothetical protein + Prom 16949 - 17008 3.7 19 11 Op 1 . + CDS 17046 - 17528 305 ## ECH74115_1421 curlin minor subunit 20 11 Op 2 . + CDS 17569 - 18024 394 ## B21_01046 hypothetical protein + Term 18047 - 18075 1.0 21 12 Tu 1 . + CDS 18083 - 18415 151 ## ECUMN_1217 putative autoagglutination protein 22 13 Tu 1 . 
+ CDS 18536 - 18847 237 ## ECIAI39_2119 hypothetical protein + Prom 18860 - 18919 1.5 23 14 Op 1 2/0.923 + CDS 18942 - 19475 355 ## COG2110 Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 24 14 Op 2 . + CDS 19417 - 20898 864 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes - Term 20815 - 20851 4.0 25 15 Tu 1 . - CDS 20906 - 22063 655 ## SSON_1060 glucans biosynthesis protein - Prom 22245 - 22304 3.2 + Prom 22190 - 22249 4.8 26 16 Op 1 7/0.231 + CDS 22457 - 23992 1556 ## COG3131 Periplasmic glucans biosynthesis protein 27 16 Op 2 4/0.769 + CDS 24015 - 26528 2144 ## COG2943 Membrane glycosyltransferase + Term 26659 - 26691 3.1 + Prom 26621 - 26680 3.2 28 17 Tu 1 . + CDS 26701 - 26928 305 ## COG5645 Predicted periplasmic lipoprotein - Term 26633 - 26687 4.2 29 18 Op 1 . - CDS 26929 - 27303 495 ## G2583_1310 acidic protein MsyB 30 18 Op 2 5/0.462 - CDS 27386 - 28612 792 ## COG0477 Permeases of the major facilitator superfamily - Prom 28666 - 28725 2.5 - Term 28662 - 28701 0.3 31 19 Tu 1 . - CDS 28784 - 29704 922 ## COG1560 Lauroyl/myristoyl acyltransferase - Prom 29830 - 29889 4.8 + Prom 29799 - 29858 5.9 32 20 Tu 1 . + CDS 29929 - 30981 880 ## COG1054 Predicted sulfurtransferase + Term 30988 - 31024 6.5 - Term 30974 - 31014 7.3 33 21 Op 1 12/0.000 - CDS 31023 - 31598 607 ## COG2353 Uncharacterized conserved protein 34 21 Op 2 . - CDS 31602 - 32168 317 ## COG3038 Cytochrome B561 - Prom 32227 - 32286 8.1 - Term 32348 - 32388 -0.5 35 22 Op 1 . - CDS 32429 - 32542 100 ## - Term 32549 - 32580 2.4 36 22 Op 2 . - CDS 32590 - 33708 817 ## COG0665 Glycine/D-amino acid oxidases (deaminating) - Prom 33731 - 33790 2.2 - Term 33770 - 33798 0.6 37 23 Tu 1 . - CDS 33823 - 34077 257 ## ECH74115_1439 biofilm formation regulatory protein BssS - Prom 34279 - 34338 7.8 - Term 34318 - 34345 1.5 38 24 Op 1 . - CDS 34367 - 34612 208 ## UTI89_C1186 DNA damage-inducible protein I 39 24 Op 2 .
- CDS 34686 - 35606 686 ## COG0418 Dihydroorotase - Prom 35769 - 35828 2.1 - Term 35802 - 35833 3.2 40 25 Tu 1 . - CDS 35838 - 36398 615 ## APECO1_145 hypothetical protein - Term 36480 - 36519 8.0 41 26 Op 1 1/1.000 - CDS 36532 - 37179 654 ## COG2999 Glutaredoxin 2 42 26 Op 2 . - CDS 37243 - 38451 1215 ## COG0477 Permeases of the major facilitator superfamily - Prom 38480 - 38539 2.1 + Prom 38532 - 38591 5.5 43 27 Op 1 5/0.462 + CDS 38687 - 39271 1067 ## PROTEIN SUPPORTED gi|15801183|ref|NP_287200.1| ribosomal-protein-S5-alanine N-acetyltransferase 44 27 Op 2 4/0.769 + CDS 39282 - 39929 764 ## COG3132 Uncharacterized protein conserved in bacteria 45 27 Op 3 5/0.462 + CDS 39931 - 40854 701 ## COG0673 Predicted dehydrogenases and related proteins + Term 40917 - 40945 3.0 46 28 Tu 1 . + CDS 40964 - 42499 1012 ## PROTEIN SUPPORTED gi|145628098|ref|ZP_01783899.1| 30S ribosomal protein S20 + Term 42507 - 42547 2.7 - Term 42491 - 42534 10.2 47 29 Op 1 7/0.231 - CDS 42539 - 42955 399 ## COG3418 Flagellar biosynthesis/type III secretory pathway chaperone 48 29 Op 2 8/0.077 - CDS 42960 - 43253 341 ## COG2747 Negative regulator of flagellin synthesis (anti-sigma28 factor) 49 29 Op 3 . 
- CDS 43329 - 43988 389 ## COG1261 Flagellar basal body P-ring biosynthesis protein - Prom 44011 - 44070 3.9 50 30 Op 1 24/0.000 + CDS 44143 - 44559 476 ## COG1815 Flagellar basal body protein 51 30 Op 2 9/0.000 + CDS 44563 - 44967 321 ## COG1558 Flagellar basal body rod protein 52 30 Op 3 16/0.000 + CDS 44979 - 45674 838 ## COG1843 Flagellar hook capping protein 53 30 Op 4 8/0.077 + CDS 45699 - 46904 1248 ## COG1749 Flagellar hook protein FlgE 54 30 Op 5 8/0.077 + CDS 46924 - 47679 664 ## COG4787 Flagellar basal body rod protein + Prom 47744 - 47803 1.7 55 31 Op 1 9/0.000 + CDS 47851 - 48633 941 ## COG4786 Flagellar basal body rod protein + Term 48638 - 48675 2.1 56 31 Op 2 9/0.000 + CDS 48686 - 49384 707 ## COG2063 Flagellar basal body L-ring protein 57 31 Op 3 7/0.231 + CDS 49396 - 50493 921 ## COG1706 Flagellar basal-body P-ring protein 58 31 Op 4 9/0.000 + CDS 50493 - 51434 976 ## COG3951 Rod binding protein 59 31 Op 5 21/0.000 + CDS 51500 - 53143 1524 ## COG1256 Flagellar hook-associated protein 60 31 Op 6 . + CDS 53155 - 54108 935 ## COG1344 Flagellin and related hook-associated proteins + Term 54150 - 54190 10.3 - Term 54252 - 54283 3.2 61 32 Tu 1 . - CDS 54305 - 57490 3204 ## COG1530 Ribonucleases G and E + Prom 57903 - 57962 3.6 62 33 Tu 1 . + CDS 58063 - 59022 190 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 63 34 Tu 1 . 
- CDS 59134 - 59730 516 ## COG0424 Nucleotide-binding protein implicated in inhibition of septum formation - Prom 59750 - 59809 3.2 + Prom 59713 - 59772 5.4 64 35 Op 1 20/0.000 + CDS 59917 - 60438 429 ## COG1399 Predicted metal-binding, possibly nucleic acid-binding protein 65 35 Op 2 14/0.000 + CDS 60490 - 60663 292 ## PROTEIN SUPPORTED gi|15801206|ref|NP_287223.1| 50S ribosomal protein L32 66 35 Op 3 16/0.000 + CDS 60774 - 61814 619 ## COG0416 Fatty acid/phospholipid biosynthesis enzyme 67 35 Op 4 14/0.000 + CDS 61882 - 62835 979 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III 68 35 Op 5 26/0.000 + CDS 62851 - 63780 976 ## COG0331 (acyl-carrier-protein) S-malonyltransferase 69 35 Op 6 22/0.000 + CDS 63793 - 64527 252 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 + Term 64556 - 64582 -0.6 + Prom 64640 - 64699 4.9 70 35 Op 7 . + CDS 64738 - 64974 397 ## COG0236 Acyl carrier protein Predicted protein(s) >gi|296493444|gb|ADTK01000057.1| GENE 1 156 - 851 771 231 aa, chain - ## HITS:1 COG:aqpZ KEGG:ns NR:ns ## COG: aqpZ COG0580 # Protein_GI_number: 16128843 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) # Organism: Escherichia coli K12 # 1 231 1 231 231 353 99.0 1e-97 MFRKLAAECFGTFWLVFGGCGSAVLAAGFPELGIGFAGVALAFGLTVLTMAFAVGHISGG HFNPAVTIGLWAGGRFPAKEVVGYVIAQVVGGIVAAALLYLIASGKTGFDAAASGFASNG YGEHSPGGYSMLSALVVELVLSAGFLLVIHGTTDKFAPAGFAPIAIGLALTLIHLISIPV TNTSVNPARSTAVAIFQGGWALEQLWFFWVVPIVGGIIGGLIYRTLLEKRD >gi|296493444|gb|ADTK01000057.1| GENE 2 1277 - 2935 1279 552 aa, chain + ## HITS:1 COG:ECs0962 KEGG:ns NR:ns ## COG: ECs0962 COG3593 # Protein_GI_number: 15830216 # Func_class: L Replication, recombination and repair # Function: Predicted ATP-dependent endonuclease of the OLD family # Organism: Escherichia coli O157:H7 # 1 552 1 552 552 1072 99.0 0 MILERVEIVGFRGINRLSLMLEQNNVLIGENAWGKSSLLDALTLLLSPESDLYHFERDDF 
WFPPGDINGREHHLHIILTFRESLPGRHRVRRYRPLEACWTPCTDGYHRIFYRLEGESAE DGSVMTLRSFLDKDGHPIDVEDINDQARHLVRLMPVLRLRDARFMRRIRNGTVPNVPNVE VTARQLDFLARELSSHPQNLSDGQIRQGLSAMVQLLEHYFSEQGAGQARYRLMRRRASNE QRSWRYLDIINRMIDRPGGRSYRVILLGLFATLLQAKGTLRLDKDARPLLLIEDPETRLH PIMLSVAWHLLNLLPLQRIATTNSGELLSLTPVEHVCRLVRESSRVAAWRLGPSGLSTED SRRISFHIRFNRPSSLFARCWLLVEGETETWVINELARQCGHHFDAEGIKVIEFAQSGLK PLVKFARRMGIEWHVLVDGDEAGKKYAATVRSLLNNDREAEREHLTALPALDMEHFMYRQ GFSDVFHRVAQIPENVPMNLRKIISKAIHRSSKPDLAIEVAMEAGRRGVDSVPTLLKKMF SRVLWLARGRAD >gi|296493444|gb|ADTK01000057.1| GENE 3 2932 - 3882 663 316 aa, chain - ## HITS:1 COG:ybjX KEGG:ns NR:ns ## COG: ybjX COG2990 # Protein_GI_number: 16128845 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 316 15 330 330 604 99.0 1e-173 MSQLTERTFTPSESLSSLSLFLSLARGQCRPGKFWHRRSFRQKFLLRSLIMPRLSVEWMN ELSHWPNLNVLLTRQPRLPVRLHRPYLAANLSRKQLLEALRYHYALLRGCMSAEEFSLYL NTPGLQLAKLEGKNGEQFTLELTMMISMDKEGDSTILFRNSEGIPLAEITFTLCEYQGKR TMFIGGLQGAKWEIPHQEIQNATKACHGLFPKRLVMEAACLFAQRLQVEQIIAVSNETHI YRSLRYRDKEGKIHADYNAFWESVGGVCDAERHYRLPAQIARKEIAEIASKKRAEYRRRY EMLDAIQPQMATMFRG >gi|296493444|gb|ADTK01000057.1| GENE 4 4039 - 5154 1333 371 aa, chain + ## HITS:1 COG:ECs0964 KEGG:ns NR:ns ## COG: ECs0964 COG0845 # Protein_GI_number: 15830218 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli O157:H7 # 1 371 10 380 380 641 99.0 0 MKKRKTVKKRYVIALVIVIAGLITLWRILNAPVPTYQTLIVRPGDLQQSVLATGKLDALR KVDVGAQVSGQLKTLSVAIGDKVKKDQLLGVIDPEQAENQIKEVEATLMELRAQRQQAEA ELKLARVTYSRQQRLAQTQAVSQQELDTAATEMAVKQAQIGTIDAQIKRNQASLDTAKTN LDYTRIVAPMAGEVTQITTLQGQTVIAAQQAPNILTLADMSTMLVKAQVSEADVIHLKPG QKAWFTVLGDPLTRYEGQIKDVLPTPEKVNDAIFYYARFEVPNPNGLLRLDMTAQVHIQL TDVKNVLTIPLSALGDPVGDNRYKVKLLRNGETREREVTIGARNDTDVEIVKGLEAGDEV VIGEAKPGAAQ >gi|296493444|gb|ADTK01000057.1| GENE 5 5151 - 7097 368 648 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 
[Flavobacteriales bacterium ALC-1] # 256 648 7 413 413 146 27 3e-34 MTPLLELKDIRRSYPAGDEQVEVLKGITLDIYAGEMVAIVGASGSGKSTLMNILGCLDKA TSGTYRVAGQDVATLDADALAQLRREHFGFIFQRYHLLSHLTAEQNVEVPAVYAGLERKQ RLLRAQELLQRLGLEDRTEYYPAQLSGGQQQRVSIARALMNGGQVILADEPTGALDSHSG EEVMAILHQLRDRGHTVIIVTHDPQVAAQAERVIEIRDGEIVRNPPAIEKVNVAGGTEPV VNTVSGWRQFVSGFNEALTMAWRALAANKMRTLLTMLGIIIGIASVVSIVVVGDAAKQMV LADIRSIGTNTIDVYPGKDFGDDDPQYQQALKYDDLIAIQKQPWVASATPAVSQNLRLRY NNVDVAASANGVSGDYFNVYGMTFSEGNTFNQEQLNGRAQVVVLDSNTRRQLFPHKADVV GEVILVGNMPARVIGVAEEKQSMFGSSKVLRVWLPYSTMSGRVMGQSWLNSITVRVKEGF DSAEAEQQLTRLLSLRHGKKDFFTWNMDGVLKTVEKTTRTLQLFLTLVAVISLVVGGIGV MNIMLVSVTERTREIGIRMAVGARASDVLQQFLIEAVLVCLVGGALGITLSLLIAFTLQL FLPGWEIGFSPLALLLAFLCSTVTGILFGWLPARNAARLDPVDALARE >gi|296493444|gb|ADTK01000057.1| GENE 6 7170 - 7394 214 74 aa, chain - ## HITS:1 COG:ECs0966 KEGG:ns NR:ns ## COG: ECs0966 COG1278 # Protein_GI_number: 15830220 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Escherichia coli O157:H7 # 1 74 1 74 74 145 100.0 2e-35 MEKGTVKWFNNAKGFGFICPEGGGEDIFAHYSTIQMDGYRTLKAGQSVQFDVHQGPKGNH ASVIVPVEVEAAVA >gi|296493444|gb|ADTK01000057.1| GENE 7 7305 - 7547 119 80 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSENIFAAAFRADETKPFGIVEPLNSTLFHASTSFANLIQVRWNKPGSERGLFKTSPTLE IQFRELGRAVKRLTGARGQV >gi|296493444|gb|ADTK01000057.1| GENE 8 7717 - 8037 343 106 aa, chain + ## HITS:1 COG:ECs0967 KEGG:ns NR:ns ## COG: ECs0967 COG2127 # Protein_GI_number: 15830221 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 106 1 106 106 209 100.0 8e-55 MGKTNDWLDFDQLAEEKVRDALKPPSMYKVILVNDDYTPMEFVIDVLQKFFSYDVERATQ LMLAVHYQGKAICGVFTAEVAETKVAMVNKYARENEHPLLCTLEKA >gi|296493444|gb|ADTK01000057.1| GENE 9 8068 - 10344 1361 758 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 10 741 14 806 815 528 38 1e-149 MLNQELELSLNMAFARAREHRHEFMTVEHLLLALLSNPSAREALEACSVDLVALRQELEA 
FIEQTTPVLPASEEERDTQPTLSFQRVLQRAVFHVQSSGRNEVTGANVLVAIFSEQESQA AYLLRKHEVSRLDVVNFISHGTRKDEPTQSSDPGSQPNSEEQAGGEERMENFTTNLNQLA RVGGIDPLIGREKELERAIQVLCRRRKNNPLLVGESGVGKTAIAEGLAWRIVQGDVPEVM ADCTIYSLDIGSLLAGTKYRGDFEKRFKALLKQLEQDTNSILFIDEIHTIIGAGAASGGQ VDAANLIKPLLSSGKIRVIGSTTYQEFSNIFEKDRALARRFQKIDITEPSIEETVQIING LKPKYEAHHDVRYTAKAVRAAVELAVKYINDRHLPDKAIDVIDEAGARARLMPVSKRKKT VNVADIESVVARIARIPEKSVSQSDRDTLKNLGDRLKMLVFGQDKAIEALTEAIKMARAG LGHEHKPVGSFLFAGPTGVGKTEVTVQLSKALGIELLRFDMSEYMERHTVSRLIGAPPGY VGFDQGGLLTDAVIKHPHAVLLLDEIEKAHPDVFNILLQVMDNGTLTDNNGRKADFRNVV LVMTTNAGVRETERKSIGLIHQDNSTDAMEEIKKIFTPEFRNRLDNIIWFDHLSTDVIHQ VVDKFIVELQVQLDQKGVSLEVSQEARNWLAEKGYDRAMGARPMARVIQDNLKKPLANEL LFGSLVDGGQVTVALDKEKNELTYGFQSAQKHKAEAAH >gi|296493444|gb|ADTK01000057.1| GENE 10 11009 - 11947 627 312 aa, chain + ## HITS:1 COG:ycdW KEGG:ns NR:ns ## COG: ycdW COG0111 # Protein_GI_number: 16128996 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Escherichia coli K12 # 1 312 14 325 325 637 99.0 0 MDIIFYHPTFDTQWWIEALRKAIPQARVRAWKSGDNDSADYALVWHPPVEMLAGRDLKAV FALGAGVDSILSKLQAHPEMLNPSVPLFRLEDTGMGEQMQEYAVSQVLHWFRRFDDYRIQ QNSSHWQPLPEYHREDFTIGILGAGVLGSKVAQSLQTWRFPLRCWSRTRKSWPGVQSFAG REELSAFLSQCRVLINLLPNTPETVGIINQQLLEKLPDGAYLLNLARGVHVVEDDLLAAL DSGKVKGACLDVFNREPLPPESPLWQHPRVTITPHVAAITRPAEAVEYISRTIAQLEKGE KVCGQVDRARGY >gi|296493444|gb|ADTK01000057.1| GENE 11 12002 - 12739 674 245 aa, chain + ## HITS:1 COG:ycdX KEGG:ns NR:ns ## COG: ycdX COG1387 # Protein_GI_number: 16128997 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Escherichia coli K12 # 1 245 1 245 245 477 99.0 1e-134 MYPVDLHMHTVASTHAYSTLSDYIAQAKQKGIKLFAITDHGPDMEDAPHHWHFINMRIWP RVVDGVGILRGIEANIKNVDGEIDCSGKMFDSLDLIIAGFHEPVFAPHDKATNTQAMIAT IASGNVHIISHPGNPKYEIDVKAVAEAAAKHQVALEINNSSFLHSRKGSEDNCRAVAAAV 
RDAGGWVALGSDSHTAFTMGEFEECLKILDAVDFPPERILNVSPRRLLNFLESRGMAPIA EFADL >gi|296493444|gb|ADTK01000057.1| GENE 12 12763 - 13317 705 184 aa, chain + ## HITS:1 COG:ECs1412 KEGG:ns NR:ns ## COG: ECs1412 COG3381 # Protein_GI_number: 15830666 # Func_class: R General function prediction only # Function: Uncharacterized component of anaerobic dehydrogenases # Organism: Escherichia coli O157:H7 # 1 184 1 184 184 345 99.0 3e-95 MNEFSILCRVLGSLYYRQPQDPLLVPLFTLIREGKLAANWPLEQDELLTRLQKSCDMAQV SADYNALFIGDECAVPPYRSAWVEDATEAEVRAFLSERGMPLADTPADHIGTLLLAASWL EDQSTEDESEALETLFSEYLLPWCGAFLGKVEAHATTPFWRTMAPLTRDAISAMWDELEE DSEE >gi|296493444|gb|ADTK01000057.1| GENE 13 13389 - 13910 555 173 aa, chain + ## HITS:1 COG:no KEGG:SSON_1045 NR:ns ## KEGG: SSON_1045 # Name: ycdZ # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 173 7 179 179 281 100.0 1e-74 MAAFSAIMRGMNILLSIAITTGILSGIWGWVAVSLGLLSWAGFLGCTAYFACPQGGLKGL AISAATLLSGVVWAMVIIYGSALAPHLEILGYVITGIVAFLMCIQAKQLLLSFVPGTFIG ACATFAGQGDWKLVLPSLALGLVFGYAMKNSGLWLAARSAKTAHREQEIKNKA >gi|296493444|gb|ADTK01000057.1| GENE 14 13974 - 14807 799 277 aa, chain - ## HITS:1 COG:ECs1414 KEGG:ns NR:ns ## COG: ECs1414 COG1462 # Protein_GI_number: 15830668 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein involved in formation of curli polymers # Organism: Escherichia coli O157:H7 # 1 277 1 277 277 550 100.0 1e-156 MQRLFLLVAVMLLSGCLTAPPKEAARPTLMPRAQSYKDLTHLPAPTGKIFVSVYNIQDET GQFKPYPASNFSTAVPQSATAMLVTALKDSRWFIPLERQGLQNLLNERKIIRAAQENGTV AINNRIPLQSLTAANIMVEGSIIGYESNVKSGGVGARYFGIGADTQYQLDQIAVNLRVVN VSTGEILSSVNTSKTILSYEVQAGVFRFIDYQRLLEGEVGYTSNEPVMLCLMSAIETGVI FLINDGIDRGLWDLQNKAERQNDILVKYRHMSVPPES >gi|296493444|gb|ADTK01000057.1| GENE 15 14834 - 15220 348 128 aa, chain - ## HITS:1 COG:no KEGG:ECSP_1339 NR:ns ## KEGG: ECSP_1339 # Name: csgF # Def: curli assembly protein CsgF # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 128 11 138 138 205 100.0 4e-52 
MLISPLSWAGTMTFQFRNPNFGGNPNNGAFLLNSAQAQNSYKDPSYNDDFGIETPSALDN FTQAIQSQILGGLLSNINTGKPGRMVTNDYIVDIANRDGQLQLNVTDRKTGQTSTIQVSG LQNNSTDF >gi|296493444|gb|ADTK01000057.1| GENE 16 15275 - 15664 125 129 aa, chain - ## HITS:1 COG:no KEGG:APECO1_124 NR:ns ## KEGG: APECO1_124 # Name: csgE # Def: curli assembly protein CsgE # Organism: E.coli_APEC # Pathway: not_defined # 1 129 1 129 129 256 100.0 2e-67 MKRYLRWIVAAEFLFAAGNLHAVEVEVPGLLTDHTVSSIGHDFYRAFSDKWESDYTGNLT INERPSARWGSWITITVNQDVIFQTFLFPLKRDFEKTVVFALIQTEEALNRRQINQALLS TGDLAHDEF >gi|296493444|gb|ADTK01000057.1| GENE 17 15669 - 16319 221 216 aa, chain - ## HITS:1 COG:ECs1417 KEGG:ns NR:ns ## COG: ECs1417 COG2771 # Protein_GI_number: 15830671 # Func_class: K Transcription # Function: DNA-binding HTH domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 216 1 216 216 408 100.0 1e-114 MFNEVHSIHGHTLLLITKPSLQATALLQHLKQSLAITGKLHNIQRSLDDISSGSIILLDM MEADKKLIHYWQDTLSRKNNNIKILLLNTPEDYPYRDIENWPHINGVFYAMEDQERVVNG LQGVLRGECYFTQKLASYLITHSGNYRYNSTESALLTHREKEILNKLRIGASNNEIARSL FISENTVKTHLYNLFKKIAVKNRTQAVSWANDNLRR >gi|296493444|gb|ADTK01000057.1| GENE 18 16287 - 16433 58 48 aa, chain + ## HITS:1 COG:no KEGG:UTI89_C1162 NR:ns ## KEGG: UTI89_C1162 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 3 48 1 46 46 75 100.0 7e-13 MTMNTMDFIKHDETPLFLLIAHLTAASKIEAPEVLTDVALLCVVINQP >gi|296493444|gb|ADTK01000057.1| GENE 19 17046 - 17528 305 160 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_1421 NR:ns ## KEGG: ECH74115_1421 # Name: csgB # Def: curlin minor subunit # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 160 1 160 160 223 100.0 1e-57 MYDQVQGDNMKNKLLFMMLTILGAPGIAAAAGYDLANSEYNFAVNELSKSSFNQAAIIGQ AGTNNSAQLRQGGSKLLAVVAQEGSSNRAKIDQTGDYNLAYIDQAGSANDASISQGAYGN TAMIIQKGSGNKANITQYGTQKTAIVVQRQSQMAIRVTQR >gi|296493444|gb|ADTK01000057.1| GENE 20 17569 - 18024 394 151 aa, chain + ## HITS:1 COG:no KEGG:B21_01046 NR:ns ## KEGG: B21_01046 # Name: csgA # Def: hypothetical protein # Organism: 
E.coli_BL21 # Pathway: not_defined # 1 151 1 151 151 187 100.0 1e-46 MKLLKVAAIAAIVFSGSALAGVVPQYGGGGNHGGGGNNSGPNSELNIYQYGGGNSALALQ TDARNSDLTITQHGGGNGADVGQGSDDSSIDLTQRGFGNSATLDQWNGKNSEMTVKQFGG GNGAAVDQTASNSSVNVTQVGFGNNATAHQY >gi|296493444|gb|ADTK01000057.1| GENE 21 18083 - 18415 151 110 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_1217 NR:ns ## KEGG: ECUMN_1217 # Name: csgC # Def: putative autoagglutination protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 110 54 163 163 160 99.0 1e-38 MNALLLLAALSSQITFNTTQQGDVYTIIPEVTLTQSCLCRVQILSLREGSSGQSQTKQEK TLSLPANQPIALTKLSLNISPDDRVKIVVTVSDGQSLHLSQQWPPSSEKS >gi|296493444|gb|ADTK01000057.1| GENE 22 18536 - 18847 237 103 aa, chain + ## HITS:1 COG:no KEGG:ECIAI39_2119 NR:ns ## KEGG: ECIAI39_2119 # Name: ymdA # Def: hypothetical protein # Organism: E.coli_IAI39 # Pathway: not_defined # 1 103 1 103 103 196 99.0 3e-49 MFRPFLNSLMLGSLFFPFIAIAGSTAQGGVIHFYGQIVEPACDVSTQSSPVEMNCPQNGS VPGKTYSSKALMSGNVKNAQIASVKVQYLDKQKKLAVMNIEYN >gi|296493444|gb|ADTK01000057.1| GENE 23 18942 - 19475 355 177 aa, chain + ## HITS:1 COG:ECs1423 KEGG:ns NR:ns ## COG: ECs1423 COG2110 # Protein_GI_number: 15830677 # Func_class: R General function prediction only # Function: Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 # Organism: Escherichia coli O157:H7 # 1 177 1 177 177 343 99.0 8e-95 MKTRIHVVQGDITKLAVDVIVNAANPSLMGGGGVDGAIHRAAGPALLDACLKVRQQQGDC PTGHAVITLAGALPAKAVVHTVGPVWRGGEQNEDQLLQDAYLNSLRLVAANSYTSVAFPA ISTGVYGYPRAAAAEIAVKTVSEFITRHALPEQVYFVCYDEENAHLYERLLTQQGDE >gi|296493444|gb|ADTK01000057.1| GENE 24 19417 - 20898 864 493 aa, chain + ## HITS:1 COG:ymdC KEGG:ns NR:ns ## COG: ymdC COG1502 # Protein_GI_number: 16129009 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes # Organism: Escherichia coli K12 # 1 493 1 493 493 988 100.0 0 MMKKTPTSTKDSLPNKEMNDLPRLASAVLPLCSQHPGQCGLFPLEKSLDAFAARYRLAEM
AEHTLDVQYYIWQDDMSGRLLFSALLAAAKRGVRVRLLLDDNNTPGLDDILRLLDSHPRI EVRLFNPFSFRLLRPLGYITDFSRLNRRMHNKSFTVDGVVTLVGGRNIGDAYFGAGEEPL FSDLDVMAIGPVVEDVADDFARYWYCKSVSPLQQVLDVPEGEMADRIELPASWHNDAMTH RYLRKMESSPFINHLVDGTLPLIWAKTRLLSDDPAKGEGKAKRHSLLPQRLFDIMGSPSE RIDIISSYFVPTRAGVAQLLRMVRKGVKIAILTNSLAANDVAVVHAGYARWRKKLLRYGV ELYELKPTREQSSTLHDRGITGNSGASLHAKTFSIDGKTVFIGSFNFDPRSTLLNTEMGF VIESETLAQLIDKRFIQSQYDAAWQLRLDRWGRINWVDRHAKKEIILKKEPATSFWKRVM VRLASILPVEWLL >gi|296493444|gb|ADTK01000057.1| GENE 25 20906 - 22063 655 385 aa, chain - ## HITS:1 COG:no KEGG:SSON_1060 NR:ns ## KEGG: SSON_1060 # Name: mdoC # Def: glucans biosynthesis protein # Organism: S.sonnei # Pathway: not_defined # 1 385 1 385 385 705 100.0 0 MNPVPAQREYFLDSIRAWLMLLGIPFHISLIYSSHTWHVNSAEPSLWLTLFNDFIHSFRM QVFFVISGYFSYMLFLRYPLKKWWKVRVERVGIPMLTAIPLLTLPQFIMLQYVKGKAESW PGLSLYDKYNTLAWELISHLWFLLVLVVMTTLCVWIFKRIRNNLENSDKTNKKFSMVKLS VIFLCLGIGYAVIRRTIFIVYPPILSNGMFNFIVMQTLFYLPFFILGALAFIFPHLKALF TTPSRGCTLAAALAFVAYLLNQRYGSGDAWMYETESVITMVLGLWMVNVVFSFGHRLLNF QSARVTYFVNASLFIYLVHHPLTLFFGAYITPHITSNWLGFLCGLIFVVGIAIILYEIHL RIPLLKFLFSGKPVVKRENDKAPAR >gi|296493444|gb|ADTK01000057.1| GENE 26 22457 - 23992 1556 511 aa, chain + ## HITS:1 COG:ECs1426 KEGG:ns NR:ns ## COG: ECs1426 COG3131 # Protein_GI_number: 15830680 # Func_class: P Inorganic ion transport and metabolism # Function: Periplasmic glucans biosynthesis protein # Organism: Escherichia coli O157:H7 # 1 511 1 511 511 1031 100.0 0 MMKMRWLSAAVMLTLYTSSSWAFSIDDVAKQAQSLAGKGYEAPKSNLPSVFRDMKYADYQ QIQFNHDKAYWNNLKTPFKLEFYHQGMYFDTPVKINEVTATAVKRIKYSPDYFTFGDVQH DKDTVKDLGFAGFKVLYPINSKDKNDEIVSMLGASYFRVIGAGQVYGLSARGLAIDTALP SGEEFPRFKEFWIERPKPTDKRLTIYALLDSPRATGAYKFVVMPGRDTVVDVQSKIYLRD KVGKLGVAPLTSMFLFGPNQPSPANNYRPELHDSNGLSIHAGNGEWIWRPLNNPKHLAVS SFSMENPQGFGLLQRGRDFSRFEDLDDRYDLRPSAWVTPKGEWGKGSVELVEIPTNDETN DNIVAYWTPDQLPEPGKEMNFKYTITFSRDEDKLHAPDNAWVQQTRRSTGDVKQSNLIRQ PDGTIAFVVDFTGAEMKKLPEDTPVTAQTSIGDNGEIVESTVRYNPVTKGWRLVMRVKVK DAKKTTEMRAALVNADQTLSETWSYQLPANE >gi|296493444|gb|ADTK01000057.1| GENE 27 24015 - 
26528 2144 837 aa, chain + ## HITS:1 COG:ECs1427 KEGG:ns NR:ns ## COG: ECs1427 COG2943 # Protein_GI_number: 15830681 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane glycosyltransferase # Organism: Escherichia coli O157:H7 # 1 837 21 857 857 1678 100.0 0 MPIAASEKAALPKTDIRAVHQALDAEHRTWAREDDSPQGSVKARLEQAWPDSLADGQLIK DDEGRDQLKAMPEAKRSSMFPDPWRTNPVGRFWDRLRGRDVTPRYLARLTKEEQESEQKW RTVGTIRRYILLILTLAQTVVATWYMKTILPYQGWALINPMDMVGQDLWVSFMQLLPYML QTGILILFAVLFCWVSAGFWTALMGFLQLLIGRDKYSISASTVGDEPLNPEHRTALIMPI CNEDVNRVFAGLRATWESVKATGNAKHFDVYILSDSYNPDICVAEQKAWMELIAEVGGEG QIFYRRRRRRVKRKSGNIDDFCRRWGSQYSYMVVLDADSVMTGDCLCGLVRLMEANPNAG IIQSSPKASGMDTLYARCQQFATRVYGPLFTAGLHFWQLGESHYWGHNAIIRVKPFIEHC ALAPLPGEGSFAGSILSHDFVEAALMRRAGWGVWIAYDLPGSYEELPPNLLDELKRDRRW CHGNLMNFRLFLVKGMHPVHRAVFLTGVMSYLSAPLWFMFLALSTALQVVHALTEPQYFL QPRQLFPVWPQWRPELAIALFASTMVLLFLPKLLSILLIWCKGTKEYGGFWRVTLSLLLE VLFSVLLAPVRMLFHTVFVVSAFLGWEVVWNSPQRDDDSTSWGEAFKRHGSQLLLGLVWA VGMAWLDLRFLFWLAPIVFSLILSPFVSVISSRATVGLRTKRWKLFLIPEEYSPPQVLVD TDRFLEMNRQRSLDDGFMHAVFNPSFNALATAMATARHRASKVLEIARDRHVEQALNETP EKLNRDRRLVLLSDPVTMARLHFRVWNSPERYSSWVSYYEGIKLNPLALRKPDAASQ >gi|296493444|gb|ADTK01000057.1| GENE 28 26701 - 26928 305 75 aa, chain + ## HITS:1 COG:ECs1428 KEGG:ns NR:ns ## COG: ECs1428 COG5645 # Protein_GI_number: 15830682 # Func_class: R General function prediction only # Function: Predicted periplasmic lipoprotein # Organism: Escherichia coli O157:H7 # 1 75 1 75 75 147 100.0 3e-36 MRLIVVSIMVTLLSGCGSIISRTIPGQGHGNQYYPGVQWDVRDSAWRYVTILDLPFSLVF DTLLLPIDIHHGPYE >gi|296493444|gb|ADTK01000057.1| GENE 29 26929 - 27303 495 124 aa, chain - ## HITS:1 COG:no KEGG:G2583_1310 NR:ns ## KEGG: G2583_1310 # Name: msyB # Def: acidic protein MsyB # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 124 2 125 125 183 100.0 1e-45 MTMYATLEEAIDAAREEFLADNPGIDAEDANVQQFNAQKYVLQDGDIMWQVEFFADEGEE GECLPMLSGEAAQSVFDGDYDEIEIRQEWQEENTLHEWDEGEFQLEPPLDTEEGRAAADE WDER >gi|296493444|gb|ADTK01000057.1| GENE 30 27386 - 28612 792 408 aa, chain - ## 
HITS:1 COG:yceE KEGG:ns NR:ns ## COG: yceE COG0477 # Protein_GI_number: 16129016 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 408 1 408 408 715 100.0 0 MSPCENDTPINWKRNLIVAWLGCFLTGAAFSLVMPFLPLYVEQLGVTGHSALNMWSGIVF SITFLFSAIASPFWGGLADRKGRKLMLLRSALGMGIVMVLMGLAQNIWQFLILRALLGLL GGFVPNANALIATQVPRNKSGWALGTLSTGGVSGALLGPMAGGLLADSYGLRPVFFITAS VLILCFFVTLFCIREKFQPVSKKEMLHMREVVTSLKNPKLVLSLFVTTLIIQVATGSIAP ILTLYVRELAGNVSNVAFISGMIASVPGVAALLSAPRLGKLGDRIGPEKILITALIFSVL LLIPMSYVQTPLQLGILRFLLGAADGALLPAVQTLLVYNSSNQIAGRIFSYNQSFRDIGN VTGPLMGAAISANYGFRAVFLVTAGVVLFNAVYSWNSLRRRRIPQVSN >gi|296493444|gb|ADTK01000057.1| GENE 31 28784 - 29704 922 306 aa, chain - ## HITS:1 COG:ECs1432 KEGG:ns NR:ns ## COG: ECs1432 COG1560 # Protein_GI_number: 15830686 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lauroyl/myristoyl acyltransferase # Organism: Escherichia coli O157:H7 # 1 306 1 306 306 618 100.0 1e-177 MTNLPKFSTALLHPRYWLTWLGIGVLWLVVQLPYPVIYRLGCGLGKLALRFMKRRAKIVH RNLELCFPEMSEQERRKMVVKNFESVGMGLMETGMAWFWPDRRIARWTEVIGMEHIRDVQ AQKRGILLVGIHFLTLELGARQFGMQEPGIGVYRPNDNPLIDWLQTWGRLRSNKSMLDRK DLKGMIKALKKGEVVWYAPDHDYGPRSSVFVPLFAVEQAATTTGTWMLARMSGACLVPFV PRRKPDGKGYQLIMLPPECSPPLDDAETTAAWMNKVVEKCIMMAPEQYMWLHRRFKTRPE GVPSRY >gi|296493444|gb|ADTK01000057.1| GENE 32 29929 - 30981 880 350 aa, chain + ## HITS:1 COG:ECs1433 KEGG:ns NR:ns ## COG: ECs1433 COG1054 # Protein_GI_number: 15830687 # Func_class: R General function prediction only # Function: Predicted sulfurtransferase # Organism: Escherichia coli O157:H7 # 1 350 1 350 350 738 99.0 0 MPVLHNRISNDALKAKMLAESEPRTTISFYKYFHIADPKATRDALYQLFTALNVFGRVYL AHEGINAQISVPASNVETFRAQLYAFDPALEGLRLNIALDDDGKSFWVLRMKVRDRIVAD GIDDPHFDASNVGEYLQAAEVNAMLDDPDALFIDMRNHYEYEVGHFENALEIPADTFREQ 
LPKAVEMMQAHKDKKIVMYCTGGIRCEKASAWMKHNGFNKVWHIEGGIIEYARKAREQGL PVRFIGKNFVFDERMGERISDEIIAHCHQCGAPCDSHTNCKNDGCHLLFIQCPVCAEKYK GCCSEICCEESALPPEEQRRRRAGRENGNKIFNKSRGRLNTTLGIPDPTE >gi|296493444|gb|ADTK01000057.1| GENE 33 31023 - 31598 607 191 aa, chain - ## HITS:1 COG:ECs1434 KEGG:ns NR:ns ## COG: ECs1434 COG2353 # Protein_GI_number: 15830688 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 191 1 191 191 360 100.0 1e-99 MKKSLLGLTFASLMFSAGSAVAADYKIDKEGQHAFVNFRIQHLGYSWLYGTFKDFDGTFT FDEKNPAADKVNVTINTTSVDTNHAERDKHLRSADFLNTAKYPQATFTSTSVKKDGDELD ITGDLTLNGVTKPVTLEAKLIGQGDDPWGGKRAGFEAEGKIKLKDFNIKTDLGPASQEVD LIISVEGVQQK >gi|296493444|gb|ADTK01000057.1| GENE 34 31602 - 32168 317 188 aa, chain - ## HITS:1 COG:ECs1435 KEGG:ns NR:ns ## COG: ECs1435 COG3038 # Protein_GI_number: 15830689 # Func_class: C Energy production and conversion # Function: Cytochrome B561 # Organism: Escherichia coli O157:H7 # 1 188 1 188 188 342 99.0 2e-94 MSFTNTPEHYGVISAAFHWLSAIIVYGMFALGLWMVTLSYYDGWYHKAPELHKSIGILLM MGLVIRVLWRVISPPPGPLPSYSPMTRLAAKAGHLALYLLLFAIGISGYLISTADGKPIS VFGWFDVPATLADAGAQADFAGALHFWLAWSVVVLSVMHGFMALKHHFIDKDDTLKRMLG KSSSDYGV >gi|296493444|gb|ADTK01000057.1| GENE 35 32429 - 32542 100 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRRLLHYLINNIREHLMLYLFLWGLLAIMDLIYVFYF >gi|296493444|gb|ADTK01000057.1| GENE 36 32590 - 33708 817 372 aa, chain - ## HITS:1 COG:solA KEGG:ns NR:ns ## COG: solA COG0665 # Protein_GI_number: 16129022 # Func_class: E Amino acid transport and metabolism # Function: Glycine/D-amino acid oxidases (deaminating) # Organism: Escherichia coli K12 # 1 372 1 372 372 777 98.0 0 MKYDLIIIGSGSVGAAAGYYATRAGLNVLMTDAHMPPHQHGSHHGDTRLIRHAYGEGEKY VPLVLRAQTLWDELSRHNEDDPIFVRSGVINLGPADSAFLANVAHSAEQWQLNVEKLDAQ GIMARWPEIRVPDNYIGLFETDSGFLRSELAIKTWIQLAKEAGCAQLFNCPVTAIHHDDD GVTIETVDGEYQAKKAIVCAGTWVKDLLPELPVQPVRKVFAWYQADGRYSVKNKFPAFTG ELPNGDQYYGFPAENDALKIGKHNGGQVIHSADERVPFAEVVSDGSEAFPFLRNVLPGIG 
CCLYGAACTYDNSPDEDFIIDTLPGHDNTLLITGLSGHGFKFASVLGEIAADFAQDKKSD FDLTPFRLSRFQ >gi|296493444|gb|ADTK01000057.1| GENE 37 33823 - 34077 257 84 aa, chain - ## HITS:1 COG:no KEGG:ECH74115_1439 NR:ns ## KEGG: ECH74115_1439 # Name: bssS # Def: biofilm formation regulatory protein BssS # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 84 2 85 85 167 100.0 8e-41 MEKNNEVIQTHPLVGWDISTVDSYDALMLRLHYQTPNKSEQEGTEVGQTLWLTTDVARQF ISILEAGIAKIESGDFPVNEYRRH >gi|296493444|gb|ADTK01000057.1| GENE 38 34367 - 34612 208 81 aa, chain - ## HITS:1 COG:no KEGG:UTI89_C1186 NR:ns ## KEGG: UTI89_C1186 # Name: not_defined # Def: DNA damage-inducible protein I # Organism: E.coli_UTI89 # Pathway: not_defined # 1 81 20 100 100 156 100.0 2e-37 MRIEVTIAKTSPLPAGAIDALAGELSRRIQYAFPDNEGHVSVRYAAANNLSVIGATKEDK QRISEILQETWESADDWFVSE >gi|296493444|gb|ADTK01000057.1| GENE 39 34686 - 35606 686 306 aa, chain - ## HITS:1 COG:pyrC KEGG:ns NR:ns ## COG: pyrC COG0418 # Protein_GI_number: 16129025 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase # Organism: Escherichia coli K12 # 1 306 43 348 348 628 99.0 1e-180 MPNLAPPVTTVEAAVAYRQRILDAVPAGHNFTPLMTCYLTDSLDPNELERGFNEGVFTAA KLYPANATTNSSHGVTSIDAIMPVLERMEKIGMPLLVHGEVTHADIDIFDREARFIESVM EPLRQRLTALKVVFEHITTKDAADYVRDGNERLAATITPQHLMFNRNHMLVGGVRPHLYC LPILKRNIHQQALRELVASGFNRVFLGTDSAPHARHRKESSCGCAGCFNAPTALGSYATV FEEMNALQHFEAFCSVNGPQFYGLPVNDTFIELVREEQQVAESIALTDDTLVPFLAGETV RWSVKQ >gi|296493444|gb|ADTK01000057.1| GENE 40 35838 - 36398 615 186 aa, chain - ## HITS:1 COG:no KEGG:APECO1_145 NR:ns ## KEGG: APECO1_145 # Name: yceB # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 186 20 205 205 353 100.0 2e-96 MNKFLFAAALIVSGLLVGCNQLTQYTITEQEINQSLAKHNNFSKDIGLPGVADAHIVLTN LTSQIGREEPNKVTLTGDANLDMNSLFGSQKATMKLKLKALPVFDKEKGAIFLKEMEVVD ATVQPEKMQTVMQTLLPYLNQALRNYFNQQPAYVLREDGSQGEAMAKKLAKGIEVKPGEI VIPFTD >gi|296493444|gb|ADTK01000057.1| GENE 41 36532 - 37179 654 215 aa, chain - ## HITS:1 COG:ECs1442 KEGG:ns NR:ns ## COG: ECs1442 
COG2999 # Protein_GI_number: 15830696 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutaredoxin 2 # Organism: Escherichia coli O157:H7 # 1 215 1 215 215 427 100.0 1e-120 MKLYIYDHCPYCLKARMIFGLKNIPVELHVLLNDDAETPTRMVGQKQVPILQKDDSRYMP ESMDIVHYVDKLDGKPLLTGKRSPAIEEWLRKVNGYANKLLLPRFAKSAFDEFSTPAARK YFVDKKEASAGNFADLLAHSDGLIKNISDDLRALDKLIVKPNAVNGELSEDDIQLFPLLR NLTLVAGINWPSRVADYRDNMAKQTQINLLSSMAI >gi|296493444|gb|ADTK01000057.1| GENE 42 37243 - 38451 1215 402 aa, chain - ## HITS:1 COG:yceL KEGG:ns NR:ns ## COG: yceL COG0477 # Protein_GI_number: 16129028 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 402 11 412 412 701 99.0 0 MSRVSQARNLGKYFLLIDNMLVVLGFFVVFPLISIRFVDQMGWAAVMVGIALGLRQFIQQ GLGIFGGAIADRFGAKPMIVTGMLMRAAGFATMGIAHEPWLLWFSCLLSGLGGTLFDPPR SALVVKLIRPQQRGRFFSLLMMQDSAGAVIGALLGSWLLQYDFRLVCATGAVFFVLCAAF NAWLLPAWKLSTVRTPVREGMTRVMRDKRFVTYVLTLAGYYMLAVQVMLMLPIMVNDVAG APSAVKWMYAIEACLSLTLLYPIARWSEKHFRLEHRLMAGLLIMSLSMMPVGMVSGLQQL FTLICLFYIGSIIAEPARETLSASLADARARGSYMGFSRLGLAIGGAIGYIGGGWLFDLG KSAHQPELPWMMLGIIGIFTFLALGWQFSQKRAARRLLERDA >gi|296493444|gb|ADTK01000057.1| GENE 43 38687 - 39271 1067 194 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15801183|ref|NP_287200.1| ribosomal-protein-S5-alanine N-acetyltransferase [Escherichia coli O157:H7 EDL933] # 1 194 1 194 194 415 100 1e-115 MFGYRSNVPKVRLTTDRLVVRLVHDRDAWRLADYYAENRHFLKPWEPVRDESHCYPSGWQ ARLGMINEFHKQGSAFYFGLFDPDEKEIIGVANFSNVVRGSFHACYLGYSIGQKWQGKGL MFEALTAAIRYMQRTQHIHRIMANYMPHNKRSGDLLARLGFEKEGYAKDYLLIDGQWRDH VLTALTTPDWTPGR >gi|296493444|gb|ADTK01000057.1| GENE 44 39282 - 39929 764 215 aa, chain + ## HITS:1 COG:yceH KEGG:ns NR:ns ## COG: yceH COG3132 # Protein_GI_number: 16129030 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # 
Organism: Escherichia coli K12 # 1 215 1 215 215 392 98.0 1e-109 MKYQLTALEARVIGCLLEKQVTTPEQYPLSVNGVVTACNQKTNREPVMNLSESEVQEQLD NLVKRHYLRTVSGFGNRVTKYEQRFCNSEFGDLKLSAAEVALITTLLLRGAQTPGELRSR AARMYEFSDMAEVESTLEQLANREDGPFVVRLAREPGKRESRYMHLFSGEVEDQPSVTDM SSAVDGDLQARVEALEIEVAELKQRLDSLLAHLGD >gi|296493444|gb|ADTK01000057.1| GENE 45 39931 - 40854 701 307 aa, chain + ## HITS:1 COG:mviM KEGG:ns NR:ns ## COG: mviM COG0673 # Protein_GI_number: 16129031 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Escherichia coli K12 # 1 307 1 307 307 622 100.0 1e-178 MKKLRIGVVGLGGIAQKAWLPVLAAASDWTLQGAWSPTRAKALPICESWRIPYADSLSSL AASCDAVFVHSSTASHFDVVSTLLNAGVHVCVDKPLAENLRDAERLVELAARKKLTLMVG FNRRFAPLYGELKTQLATAASLRMDKHRSNSVGPHDLYFTLLDDYLHVVDTALWLSGGKA SLDGGTLLTNDAGEMLFAEHHFSAGPLQITTCMHRRAGSQRETVQAVTDGALIDITDMRE WREERGQGVVHKPIPGWQSTLEQRGFVGCARHFIECVQNQTVPQTAGEQAVLAQRIVDKI WRDAMSE >gi|296493444|gb|ADTK01000057.1| GENE 46 40964 - 42499 1012 511 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145628098|ref|ZP_01783899.1| 30S ribosomal protein S20 [Haemophilus influenzae 22.1-21] # 3 502 5 516 524 394 40 1e-109 MNLLKSLAAVSSMTMFSRVLGFARDAIVARIFGAGMATDAFFVAFKLPNLLRRIFAEGAF SQAFVPILAEYKSKQGEDATRVFVSYVSGLLTLALAVVTVAGMLAAPWVIMVTAPGFADT ADKFALTSQLLKITFPYILLISLASLVGAILNTWNRFSIPAFAPTLLNISMIGFALFAAP YFNPPVLALAWAVTVGGVLQLVYQLPHLKKIGMLVLPRINFHDAGAMRVVKQMGPAILGV SVSQISLIINTIFASFLASGSVSWMYYADRLMEFPSGVLGVALGTILLPSLSKSFASGNH DEYNRLMDWGLRLCFLLALPSAVALGILSGPLTVSLFQYGKFTAFDALMTQRALIAYSVG LIGLIVVKVLAPGFYSRQDIKTPVKIAIVTLILTQLMNLAFIGPLKHAGLSLSIGLAACL NASLLYWQLRKQKIFTPQPGWMAFLLRLVVAVLVMSGVLLGMLHIMPEWSLGTMPWRLLR LMAVVLAGIAAYFAALAVLGFKVKEFARRTV >gi|296493444|gb|ADTK01000057.1| GENE 47 42539 - 42955 399 138 aa, chain - ## HITS:1 COG:flgN KEGG:ns NR:ns ## COG: flgN COG3418 # Protein_GI_number: 16129033 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport; O Posttranslational modification, protein turnover, chaperones # 
Function: Flagellar biosynthesis/type III secretory pathway chaperone # Organism: Escherichia coli K12 # 1 138 1 138 138 205 98.0 2e-53 MTRLAEILDQMSAVLNDLKTVMDQEQQHLSMGQINGSQLQWITEQKSSLLATLDYLEQLR RKEPNTANSADISQRWQEITVKTQQLRQMNQHNGWLLEGQIERNQQALEMLKPHQEPTLY GANGQTSTTHRGSKKISI >gi|296493444|gb|ADTK01000057.1| GENE 48 42960 - 43253 341 97 aa, chain - ## HITS:1 COG:flgM KEGG:ns NR:ns ## COG: flgM COG2747 # Protein_GI_number: 16129034 # Func_class: K Transcription; N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Negative regulator of flagellin synthesis (anti-sigma28 factor) # Organism: Escherichia coli K12 # 1 97 1 97 97 123 100.0 8e-29 MSIDRTSPLKPVSTVQPRETTDAPVTNSRAAKTTASTSTSVTLSDAQAKLMQPGSSDINL ERVEALKLAIRNGELKMDTGKIADALINEAQQDLQSN >gi|296493444|gb|ADTK01000057.1| GENE 49 43329 - 43988 389 219 aa, chain - ## HITS:1 COG:ECs1450 KEGG:ns NR:ns ## COG: ECs1450 COG1261 # Protein_GI_number: 15830704 # Func_class: N Cell motility; O Posttranslational modification, protein turnover, chaperones # Function: Flagellar basal body P-ring biosynthesis protein # Organism: Escherichia coli O157:H7 # 1 219 1 219 219 368 98.0 1e-102 MLAIKRSVAIIAILFSPLSAASNLTSQLQTFFSAQLAGISDEVRVSIRTAPSLLPPCEQP LLSMSNNSRLWGNVNVLARCGNDKRYLQVNVQATGNYVVAAMPIVRGGKLEAGNVKLKRG RLDTLPPRTVLDINQLVDAISLRDLSPDQPIQLTQFRQAWRVKAGQRVNVIASGDGFSAN AEGQALNNAAVAQNARVRMVSGQVVSGVVDADGNILINL >gi|296493444|gb|ADTK01000057.1| GENE 50 44143 - 44559 476 138 aa, chain + ## HITS:1 COG:flgB KEGG:ns NR:ns ## COG: flgB COG1815 # Protein_GI_number: 16129036 # Func_class: N Cell motility # Function: Flagellar basal body protein # Organism: Escherichia coli K12 # 1 138 1 138 138 228 100.0 2e-60 MLDKLDAALRFQQEALNLRAQRQEVLAANIANADTPGYQARDIDFASELKKVMQRGRDAT SVVALTMTSTQHIPAQALTPPTAELQYRIPDQPSLDGNTVDMDRERTQFADNSLQYQMSL SALSGQIKGMMNVLQSGN >gi|296493444|gb|ADTK01000057.1| GENE 51 44563 - 44967 321 134 aa, chain + ## HITS:1 COG:ECs1452 KEGG:ns NR:ns ## COG: ECs1452 COG1558 # Protein_GI_number: 
15830706 # Func_class: N Cell motility # Function: Flagellar basal body rod protein # Organism: Escherichia coli O157:H7 # 1 134 1 134 134 223 100.0 8e-59 MALLNIFDIAGSALTAQSQRLNVAASNLANADSVTGPDGQPYRAKQVVFQVNAAPGAATG GVKVADVIESQAPDKLVYEPGNPLADAKGYVKMPNVDVVGEMVNTMSASRSYQANVEVLN TVKSMMLKTLTLGQ >gi|296493444|gb|ADTK01000057.1| GENE 52 44979 - 45674 838 231 aa, chain + ## HITS:1 COG:ECs1453 KEGG:ns NR:ns ## COG: ECs1453 COG1843 # Protein_GI_number: 15830707 # Func_class: N Cell motility # Function: Flagellar hook capping protein # Organism: Escherichia coli O157:H7 # 1 231 1 231 231 314 100.0 1e-85 MSIAVTTTDPTNTGVSTTSSSSLTGSNAADLQSSFLTLLVAQLKNQDPTNPMENNELTSQ LAQISTVSGIEKLNTTLGSISGQIDNSQSLQASNLIGHGVMIPGTTVLAGTGSEEGAVTT TTPFGVELQQAADKVTATITDKNGAVVRTIDIGELTAGVHSFTWDGTLTDGSTAPNGSYN VAISASNGGTQLVAQPLQFALVQGVIRGNNGNTLDLGTYGTTTLDEVRQII >gi|296493444|gb|ADTK01000057.1| GENE 53 45699 - 46904 1248 401 aa, chain + ## HITS:1 COG:ECs1454 KEGG:ns NR:ns ## COG: ECs1454 COG1749 # Protein_GI_number: 15830708 # Func_class: N Cell motility # Function: Flagellar hook protein FlgE # Organism: Escherichia coli O157:H7 # 1 401 1 401 401 620 98.0 1e-177 MAFSQAVSGLNAAATNLDVIGNNIANSATYGFKSGTASFADMFAGSKVGLGVKVAGITQD FTDGTTTNTGRGLDVAISQNGFFRLVDSNGSVFYSRNGQFKLDENRNLVNMQGLQLTGYP ATGTPPTIQQGANPTNISIPNTLMAAKTTTTASMQINLNSSDPLPTVTPFSASNADSYNK KGSVTVFDSQGNAHDMSVYFVKTGDNNWQVYTQDSSDPTGTAEPAMKLVFNANGVLTSNP TENITTGAINGADPATFSLSFLNSMQQNTGANNIVATTQNGYKPGDLVSYQINDDGTVVG NYSNEQTQLLGQIVLANFANNEGLASEGDNVWSATQSSGVALLGTAGTGNFGTLTNGALE ASNVDLSKELVNMIVAQRNYQSNAQTIKTQDQILNTLVNLR >gi|296493444|gb|ADTK01000057.1| GENE 54 46924 - 47679 664 251 aa, chain + ## HITS:1 COG:flgF KEGG:ns NR:ns ## COG: flgF COG4787 # Protein_GI_number: 16129040 # Func_class: N Cell motility # Function: Flagellar basal body rod protein # Organism: Escherichia coli K12 # 1 251 1 251 251 418 100.0 1e-117 MDHAIYTAMGAASQTLNQQAVTASNLANASTPGFRAQLNALRAVPVEGLSLPTRTLVTAS TPGADMTPGKMDYTSRPLDVALQQDGWLAVQTADGSEGYTRNGSIQVDPTGQLTIQGHPV 
IGEAGPIAVPEGAEITIAADGTISALNPGDPANTVAPVGRLKLVKATGSEVQRGDDGIFR LSAETQATRGPVLQADPTLRVMSGVLEGSNVNAVAAMSDMIASARRFEMQMKVISSVDDN AGRANQLLSMS >gi|296493444|gb|ADTK01000057.1| GENE 55 47851 - 48633 941 260 aa, chain + ## HITS:1 COG:ECs1456 KEGG:ns NR:ns ## COG: ECs1456 COG4786 # Protein_GI_number: 15830710 # Func_class: N Cell motility # Function: Flagellar basal body rod protein # Organism: Escherichia coli O157:H7 # 1 260 1 260 260 434 100.0 1e-122 MISSLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTIRQPGAQSSEQT TLPSGLQIGTGVRPVATERLHSQGNLSQTNNSKDVAIKGQGFFQVMLPDGSSAYTRDGSF QVDQNGQLVTAGGFQVQPAITIPANALSITIGRDGVVSVTQQGQAAPVQVGQLNLTTFMN DTGLESIGENLYTETQSSGAPNESTPGLNGAGLLYQGYVETSNVNVAEELVNMIQVQRAY EINSKAVSTTDQMLQKLTQL >gi|296493444|gb|ADTK01000057.1| GENE 56 48686 - 49384 707 232 aa, chain + ## HITS:1 COG:ECs1457 KEGG:ns NR:ns ## COG: ECs1457 COG2063 # Protein_GI_number: 15830711 # Func_class: N Cell motility # Function: Flagellar basal body L-ring protein # Organism: Escherichia coli O157:H7 # 1 232 1 232 232 412 100.0 1e-115 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM >gi|296493444|gb|ADTK01000057.1| GENE 57 49396 - 50493 921 365 aa, chain + ## HITS:1 COG:flgI KEGG:ns NR:ns ## COG: flgI COG1706 # Protein_GI_number: 16129043 # Func_class: N Cell motility # Function: Flagellar basal-body P-ring protein # Organism: Escherichia coli K12 # 1 365 1 365 365 556 100.0 1e-158 MIKFLSALILLLVTTAAQAERIRDLTSVQGVRQNSLIGYGLVVGLDGTGDQTTQTPFTTQ TLNNMLSQLGITVPTGTNMQLKNVAAVMVTASLPPFGRQGQTIDVVVSSMGNAKSLRGGT LLMTPLKGVDSQVYALAQGNILVGGAGASAGGSSVQVNQLNGGRITNGAVIERELPSQFG VGNTLNLQLNDEDFSMAQQIADTINRVRGYGSATALDARTIQVRVPSGNSSQVRFLADIQ NMQVNVTPQDAKVVINSRTGSVVMNREVTLDSCAVAQGNLSVTVNRQANVSQPDTPFGGG QTVVTPQTQIDLRQSGGSLQSVRSSASLNNVVRALNALGATPMDLMSILQSMQSAGCLRA KLEII >gi|296493444|gb|ADTK01000057.1| GENE 58 50493 - 51434 976 
313 aa, chain + ## HITS:1 COG:flgJ_1 KEGG:ns NR:ns ## COG: flgJ_1 COG3951 # Protein_GI_number: 16129044 # Func_class: M Cell wall/membrane/envelope biogenesis; N Cell motility; O Posttranslational modification, protein turnover, chaperones # Function: Rod binding protein # Organism: Escherichia coli K12 # 1 167 1 167 167 289 99.0 4e-78 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEEPTPAAPMKFPLET VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK VSKTYSMNIDNLF >gi|296493444|gb|ADTK01000057.1| GENE 59 51500 - 53143 1524 547 aa, chain + ## HITS:1 COG:flgK KEGG:ns NR:ns ## COG: flgK COG1256 # Protein_GI_number: 16129045 # Func_class: N Cell motility # Function: Flagellar hook-associated protein # Organism: Escherichia coli K12 # 1 547 1 547 547 880 99.0 0 MSSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYV SGVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTL VSNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLN DQISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGST ARQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQ LVLAFAEAFNSQHKAGFDANGDAGEDFFAIGKPAVLQNTKNTGDVAIGATVTDASAVLAT DYKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAI VNMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIG NKTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIF DALINIR >gi|296493444|gb|ADTK01000057.1| GENE 60 53155 - 54108 935 317 aa, chain + ## HITS:1 COG:ECs1461 KEGG:ns NR:ns ## COG: ECs1461 COG1344 # Protein_GI_number: 15830715 # Func_class: N Cell motility # Function: Flagellin and related hook-associated proteins # Organism: Escherichia coli O157:H7 # 1 317 1 317 317 512 98.0 1e-145 MRFSTQMMYQQNMRGITNSQAEWMKYGEQMSTGKRVVNPSDDPIAASQAVVLSQAQAQNS QYTLARTFATQKVSLEESVLSQVTTAIQNAQEKIVYASNGTLSDDDRASLATDIQGFRDQ 
LLNLANTTDGNGRYIFAGYKTETAPFSEANGDYVGGTESIKQQVDASRSMVIGHTGDKIF DSITSNAVAEPDGSASETNLFAMLDSAIAALKTPVADSEADKEIAAAALDKTNRGLKNSL NNVLTVRAELGTQLNELESLDSLGSDRALGQTQQMSDLVDVDWNATISSYIMQQTALQAS YKAFTDMQGLSLFQLNK >gi|296493444|gb|ADTK01000057.1| GENE 61 54305 - 57490 3204 1061 aa, chain - ## HITS:1 COG:ECs1462 KEGG:ns NR:ns ## COG: ECs1462 COG1530 # Protein_GI_number: 15830716 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonucleases G and E # Organism: Escherichia coli O157:H7 # 1 1061 1 1061 1061 1482 99.0 0 MKRMLINATQQEELRVALVDGQRLYDLDIESPGHEQKKANIYKGKITRIEPSLEAAFVDY GAERHGFLPLKEIAREYFPANYSAHGRPNIKDVLREGQEVIVQIDKEERGNKGAALTTFI SLAGSYLVLMPNNPRAGGISRRIEGDDRTELKEALASLELPEGMGLIVRTAGVGKSAEAL QWDLSFRLKHWEAIKKAAESRPAPFLIHQESNVIVRAFRDYLRQDIGEILIDNPKVLELA RQHIAALGRPDFSSKIKLYTGEIPLFSHYQIESQIESAFQREVRLPSGGSIVIDSTEALT AIDINSARATRGGDIEETAFNTNLEAADEIARQLRLRDLGGLIVIDFIDMTPVRHQRAVE NRLREAVRQDRARIQISHISRFGLLEMSRQRLSPSLGESSHHVCPRCSGTGTVRDNESLS LSILRLIEEEALKENTQEVHAIVPVPIASYLLNEKRSAVNAIETRQDGVRCVIVPNDQME TPHYHVLRVRKGEETPTLSYMLPKLHEEAMALPSEEEFAERKRPEQPALATFAMPDVPPA PTPAEPAAPVVAPAPKAAPATPATPAQPGLLSRFFGALKALFSGSEETKPTEQPAPKAEA KPERQQDRRKPRQNNRRDRNERRDTRSERTEGSDNREENRRNRRQAQQQTAETRESRQQA EVTEKARTTDEQQAPRRERSRRRNDDKRQAQQEAKALNVEEQSVQETEQEERVRPVQPRR KQRQLNQKVRYEQSVAEEAVVAPVVEETAAAEPIVQEAPAPRTELVKVPLPVVAQTAPEQ QEENNADNRDNGGMPRRSRRSPRHLRVSGQRRRRYRDERYPTQSPMPLTVACASPELASG KVWIRYPIVRPQDVQVEEQREQEEVQVQPMVTEVPVAAAVEPVVSAPVVEEMAEVVEAPV PVAEPQPEVVETTHPEVIAAAVTEQPQVITESDVAVAQKVAEHAEPVVEPQEETADIEEV AETAEVVVAEPEVVAQPAAPVVAEVAAEVETVAAVEPEITVEHNHATAPMTRAPAPEYVP EAPRHSDWQRPTFAFEGKGAAGGHTATHHASAAPARPQPVE >gi|296493444|gb|ADTK01000057.1| GENE 62 58063 - 59022 190 319 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 95 309 83 284 285 77 29 1e-13 MKTETPSVKIVAITADEAGQRIDNFLRTQLKGVPKSMIYRILRKGEVRVNKKRIKPEYKL EAGDEVRIPPVRVAEREEEAVSPHLQKVAALADVILYEDDHILVLNKPSGTAVHGGSGLS 
FGVIEGLRALRPEARFLELVHRLDRDTSGVLLVAKKRSALRSLHEQLREKGMQKDYLALV RGQWQSHVKSVQAPLLKNILQSGERIVRVSQEGKPSETRFKVEERYAFATLVRCSPVTGR THQIRVHTQYAGHPIAFDDRYGDREFDRQLTEAGTGLNRLFLHAAALKFTHPGTGEVMRI EAPMDEGLKRCLQKLRNAR >gi|296493444|gb|ADTK01000057.1| GENE 63 59134 - 59730 516 198 aa, chain - ## HITS:1 COG:yceF KEGG:ns NR:ns ## COG: yceF COG0424 # Protein_GI_number: 16129050 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Nucleotide-binding protein implicated in inhibition of septum formation # Organism: Escherichia coli K12 # 1 198 10 207 207 397 99.0 1e-111 MEKNMPKLILASTSPWRRALLEKLQISFECAAPEVDETPRSDESPRQLVLRLAQEKAQSL ASRYPDHLIIGSDQVCVLDGEITGKPLTEENARLQLRKASGNIVTLYTGLALFNSANGHL QTEVEPFDVHFRHLSEAEIDNYVRKEHPLHCAGSFKSEGFGITLFERLEGRDPNTLVGLP LIALCQMLRREGKNPLMG >gi|296493444|gb|ADTK01000057.1| GENE 64 59917 - 60438 429 173 aa, chain + ## HITS:1 COG:ECs1466 KEGG:ns NR:ns ## COG: ECs1466 COG1399 # Protein_GI_number: 15830720 # Func_class: R General function prediction only # Function: Predicted metal-binding, possibly nucleic acid-binding protein # Organism: Escherichia coli O157:H7 # 1 173 1 173 173 320 100.0 7e-88 MQKVKLPLTLDPVRTAQKRLDYQGIYTPDQVERVAESVVSVDSDVECSMSFAIDNQRLAV LNGDAKVTVTLECQRCGKPFTHQVYTTYCFSPVRSDEQAEALPEAYEPIEVNEFGEIDLL AMVEDEIILALPVVPVHDSEHCEVSEADMVFGELPEEAQKPNPFAVLASLKRK >gi|296493444|gb|ADTK01000057.1| GENE 65 60490 - 60663 292 57 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15801206|ref|NP_287223.1| 50S ribosomal protein L32 [Escherichia coli O157:H7 EDL933] # 1 57 1 57 57 117 100 2e-25 MAVQQNKPTRSKRGMRRSHDALTAVTSLSVDKTSGEKHLRHHITADGYYRGRKVIAK >gi|296493444|gb|ADTK01000057.1| GENE 66 60774 - 61814 619 346 aa, chain + ## HITS:1 COG:ZplsX KEGG:ns NR:ns ## COG: ZplsX COG0416 # Protein_GI_number: 15801207 # Func_class: I Lipid transport and metabolism # Function: Fatty acid/phospholipid biosynthesis enzyme # Organism: Escherichia coli O157:H7 EDL933 # 1 346 1 346 346 633 99.0 0 
MGGDFGPSVTVPAALQALNSNSQLTLLLVGNPDAITPLLAKADFEQRSRLQIIPAQSVIA SDARPSQAIRASRGSSMRVALELVKEGRAQACVSAGNTGALMGLAKLLLKPLEGIERPAL VTVLPHQQKGKTVVLDLGANVDCDSTMLVQFAIMGSVLAEEVVEIPNPRVALLNIGEEEV KGLDSIRDASAVLKTIPSINYIGYLEANELLTGKTDVLVCDGFTGNVTLKTMEGVVRMFL SLLKSQGEGKKRSWWLLLLKRWLQKSLTRRFSHLNPDQYNGACLLGLRGTVIKSHGAANQ RAFAVAIEQAVQAVQRQVPQRIAARLESVYPAGFELLDGGKSGTLR >gi|296493444|gb|ADTK01000057.1| GENE 67 61882 - 62835 979 317 aa, chain + ## HITS:1 COG:ECs1469 KEGG:ns NR:ns ## COG: ECs1469 COG0332 # Protein_GI_number: 15830723 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Escherichia coli O157:H7 # 1 317 1 317 317 612 100.0 1e-175 MYTKIIGTGSYLPEQVRTNADLEKMVDTSDEWIVTRTGIRERHIAAPNETVSTMGFEAAT RAIEMAGIEKDQIGLIVVATTSATHAFPSAACQIQSMLGIKGCPAFDVAAACAGFTYALS VADQYVKSGAVKYALVVGSDVLARTCDPTDRGTIIIFGDGAGAAVLAASEEPGIISTHLH ADGSYGELLTLPNADRVNPENSIHLTMAGNEVFKVAVTELAHIVDETLAANNLDRSQLDW LVPHQANLRIISATAKKLGMSMDNVVVTLDRHGNTSAASVPCALDEAVRDGRIKPGQLVL LEAFGGGFTWGSALVRF >gi|296493444|gb|ADTK01000057.1| GENE 68 62851 - 63780 976 309 aa, chain + ## HITS:1 COG:fabD KEGG:ns NR:ns ## COG: fabD COG0331 # Protein_GI_number: 16129055 # Func_class: I Lipid transport and metabolism # Function: (acyl-carrier-protein) S-malonyltransferase # Organism: Escherichia coli K12 # 1 309 1 309 309 532 100.0 1e-151 MTQFAFVFPGQGSQTVGMLADMAASYPIVEETFAEASAALGYDLWALTQQGPAEELNKTW QTQPALLTASVALYRVWQQQGGKAPAMMAGHSLGEYSALVCAGVIDFADAVRLVEMRGKF MQEAVPEGTGAMAAIIGLDDASIAKACEEAAEGQVVSPVNFNSPGQVVIAGHKEAVERAG AACKAAGAKRALPLPVSVPSHCALMKPAADKLAVELAKITFNAPTVPVVNNVDVKCETNG DAIRDALVRQLYNPVQWTKSVEYMAAQGVEHLYEVGPGKVLTGLTKRIVDTLTASALNEP SAMAAALEL >gi|296493444|gb|ADTK01000057.1| GENE 69 63793 - 64527 252 244 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 6 243 4 242 242 101 29 9e-21 MNFEGKIALVTGASRGIGRAIAETLAARGAKVIGTATSENGAQAISDYLGANGKGLMLNV 
TDPASIESVLEKIRAEFGEVDILVNNAGITRDNLLMRMKDEEWNDIIETNLSSVFRLSKA VMRAMMKKRHGRIITIGSVVGTMGNGGQANYAAAKAGLIGFSKSLAREVASRGITVNVVA PGFIETDMTRALSDDQRAGILAQVPAGRLGGAQEIANAVAFLASDEAAYITGETLHVNGG MYMV >gi|296493444|gb|ADTK01000057.1| GENE 70 64738 - 64974 397 78 aa, chain + ## HITS:1 COG:ECs1472 KEGG:ns NR:ns ## COG: ECs1472 COG0236 # Protein_GI_number: 15830726 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Escherichia coli O157:H7 # 1 78 1 78 78 108 100.0 2e-24 MSTIEERVKKIIGEQLGVKQEEVTNNASFVEDLGADSLDTVELVMALEEEFDTEIPDEEA EKITTVQAAIDYINGHQA Prediction of potential genes in microbial genomes Time: Mon May 16 15:09:20 2011 Seq name: gi|296493443|gb|ADTK01000058.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont169.3, whole genome shotgun sequence Length of sequence - 17533 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 7, operones - 2 average op.length - 6.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 + CDS 62 - 1303 1279 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase + Term 1329 - 1356 -0.1 + Prom 1344 - 1403 2.9 2 1 Op 2 6/0.000 + CDS 1423 - 2232 499 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase 3 1 Op 3 10/0.000 + CDS 2235 - 3257 1147 ## COG1559 Predicted periplasmic solute-binding protein 4 1 Op 4 22/0.000 + CDS 3247 - 3888 812 ## COG0125 Thymidylate kinase 5 1 Op 5 10/0.000 + CDS 3885 - 4889 739 ## COG0470 ATPase involved in DNA replication 6 1 Op 6 6/0.000 + CDS 4900 - 5697 697 ## COG0084 Mg-dependent DNase + Prom 5838 - 5897 5.6 7 1 Op 7 . + CDS 5992 - 7425 1615 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific + Term 7450 - 7480 3.0 - Term 7438 - 7467 2.8 8 2 Tu 1 . 
- CDS 7485 - 9674 2087 ## COG4773 Outer membrane receptor for ferric coprogen and ferric-rhodotorulic acid - Prom 9771 - 9830 4.1 9 3 Op 1 4/0.333 + CDS 10008 - 10367 435 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases 10 3 Op 2 6/0.000 + CDS 10370 - 10747 176 ## COG5633 Predicted periplasmic lipoprotein 11 3 Op 3 6/0.000 + CDS 10761 - 11402 573 ## COG3417 Collagen-binding surface adhesin SpaP (antigen I/II family) 12 3 Op 4 5/0.000 + CDS 11383 - 12207 247 ## COG0510 Predicted choline kinase involved in LPS biosynthesis 13 3 Op 5 2/1.000 + CDS 12218 - 13243 1093 ## COG1472 Beta-glucosidase-related glycosidases 14 3 Op 6 2/1.000 + CDS 13266 - 13808 510 ## COG3150 Predicted esterase + Term 13814 - 13869 4.2 + Prom 14117 - 14176 5.2 15 4 Tu 1 4/0.333 + CDS 14216 - 15520 1306 ## COG1252 NADH dehydrogenase, FAD-containing subunit + Term 15587 - 15632 3.9 + Prom 15627 - 15686 2.1 16 5 Tu 1 . + CDS 15747 - 16286 584 ## COG3134 Predicted outer membrane lipoprotein + Term 16309 - 16348 3.1 - Term 16265 - 16301 -0.4 17 6 Tu 1 . - CDS 16348 - 17058 453 ## COG1309 Transcriptional regulator - Prom 17117 - 17176 10.0 + Prom 17053 - 17112 9.8 18 7 Tu 1 . 
+ CDS 17221 - 17478 293 ## ECO103_1157 hypothetical protein Predicted protein(s) >gi|296493443|gb|ADTK01000058.1| GENE 1 62 - 1303 1279 413 aa, chain + ## HITS:1 COG:ECs1473 KEGG:ns NR:ns ## COG: ECs1473 COG0304 # Protein_GI_number: 15830727 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Escherichia coli O157:H7 # 1 413 1 413 413 729 100.0 0 MSKRRVVVTGLGMLSPVGNTVESTWKALLAGQSGISLIDHFDTSAYATKFAGLVKDFNCE DIISRKEQRKMDAFIQYGIVAGVQAMQDSGLEITEENATRIGAAIGSGIGGLGLIEENHT SLMNGGPRKISPFFVPSTIVNMVAGHLTIMYGLRGPSISIATACTSGVHNIGHAARIIAY GDADVMVAGGAEKASTPLGVGGFGAARALSTRNDNPQAASRPWDKERDGFVLGDGAGMLV LEEYEHAKKRGAKIYAELVGFGMSSDAYHMTSPPENGAGAALAMANALRDAGIEASQIGY VNAHGTSTPAGDKAEAQAVKTIFGEAASRVLVSSTKSMTGHLLGAAGAVESIYSILALRD QAVPPTINLDNPDEGCDLDFVPHEARQVSGMEYTLCNSFGFGGTNGSLIFKKI >gi|296493443|gb|ADTK01000058.1| GENE 2 1423 - 2232 499 269 aa, chain + ## HITS:1 COG:ECs1474 KEGG:ns NR:ns ## COG: ECs1474 COG0115 # Protein_GI_number: 15830728 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Escherichia coli O157:H7 # 1 269 1 269 269 541 98.0 1e-154 MFLINGYKQESLAVSDRATQFGDGCFTTARVIDGKVSLLSAHIQRLHDACQRLMISCDFW PQLEQEMKTLAAEQQNGVLKVVISRGSGGRGYSTLNSGPATRILSVTAYPAHYDRLRNEG MTLALSPVRLGRNPHLAGIKHLNRLEQVLIRSHLEQTNADEALVLDSEGWVTECCAANLF WRKGNVVYTPRLDQAGVNGIMRQFCIRLLAQSSYQLVEVQASLEEALQADEMVICNALMP VMPVRACGDVSFSSATLYEYLAPLCERPN >gi|296493443|gb|ADTK01000058.1| GENE 3 2235 - 3257 1147 340 aa, chain + ## HITS:1 COG:yceG KEGG:ns NR:ns ## COG: yceG COG1559 # Protein_GI_number: 16129060 # Func_class: R General function prediction only # Function: Predicted periplasmic solute-binding protein # Organism: Escherichia coli K12 # 1 340 1 340 340 649 99.0 0 MKKVLLIILLLLVVLGIAAGVGVWKVRHLADSKLLIKEETIFTLKPGTGRLALGEQLYAD 
KIINRPRVFQWLLRIEPDLSHFKAGTYRFTPQMTVLEMLKLLESGKEAQFPLRLVEGMRL SDYLKQLREAPYIKHTLSDDKYATVAQALELENPEWIEGWFWPDTWMYTANTTDVALLKR AHKKMVKAVDSAWEGRADGLPYKDKNQLVTMASIIEKETAVASERDQVASVFINRLRIGM RLQTDPTVIYGMGERYNGKLSRADLETPTAYNTYTITGLPPGAIATPGADSLKAAAHPAK TPYLYFVADGKGGHTFNTNLASHNKSVQDYLKVLKEKNAQ >gi|296493443|gb|ADTK01000058.1| GENE 4 3247 - 3888 812 213 aa, chain + ## HITS:1 COG:tmk KEGG:ns NR:ns ## COG: tmk COG0125 # Protein_GI_number: 16129061 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate kinase # Organism: Escherichia coli K12 # 1 213 1 213 213 399 100.0 1e-111 MRSKYIVIEGLEGAGKTTARNVVVETLEQLGIRDMVFTREPGGTQLAEKLRSLVLDIKSV GDEVITDKAEVLMFYAARVQLVETVIKPALANGTWVIGDRHDLSTQAYQGGGRGIDQHML ATLRDAVLGDFRPDLTLYLDVTPEVGLKRARARGELDRIEQESFDFFNRTRARYLELAAQ DKSIHTIDATQPLEAVMDAIRTTVTHWVKELDA >gi|296493443|gb|ADTK01000058.1| GENE 5 3885 - 4889 739 334 aa, chain + ## HITS:1 COG:ECs1477 KEGG:ns NR:ns ## COG: ECs1477 COG0470 # Protein_GI_number: 15830731 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication # Organism: Escherichia coli O157:H7 # 1 334 1 334 334 605 99.0 1e-173 MRWYPWLRPDFEKLVASYQAGRGHHALLIQALPGMGDDALIYALSRYLLCQQPQGHKSCG HCRGCQLMQAGTHPDYYTLAPEKGKNALGIDAVREVTEKLNEHARLGGAKVVWVTDAALL TDAAANALLKTLEEPPAETWFFLATREPERLLATLRSRCRLHYLAPPPEQYAVTWLSREV TMSQDALLAALRLSAGSPGAALALFQGDNWQARETLCQALAYSVPSGDWYSLLAALNHEQ APARLHWLATLLMDALKRHHGAAQVTNVDVPGLVVELANHLSPSRLQAILGDVCHIREQL MSVTGINRELLITDLLLRIEHYLQPGVVLPVPHL >gi|296493443|gb|ADTK01000058.1| GENE 6 4900 - 5697 697 265 aa, chain + ## HITS:1 COG:ECs1478 KEGG:ns NR:ns ## COG: ECs1478 COG0084 # Protein_GI_number: 15830732 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Escherichia coli O157:H7 # 1 265 1 265 265 525 99.0 1e-149 MFLVDSHCHLDGLDYESLHKDVDDVLAKAAARDVNFCLAVATTLPGYLHMRDLVGERDNV VFSCGVHPLNQNDPYDVEDLRRLAAEEGVVALGETGLDYYYTPETKVRQQESFIHHIQIG RELNKPVIVHTRDARADTLAILREEKVTDCGGVLHCFTEDRETAGKLLDLGFYISFSGIV 
TFRNAEQLRDAARYVPLDRLLVETDSPYLAPVPHRGKENQPAMVRDVAEYMAVLKGVAVE ELAQVTTDNFARLFHIDSSRLQSIR >gi|296493443|gb|ADTK01000058.1| GENE 7 5992 - 7425 1615 477 aa, chain + ## HITS:1 COG:ptsG_1 KEGG:ns NR:ns ## COG: ptsG_1 COG1263 # Protein_GI_number: 16129064 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Escherichia coli K12 # 1 397 1 397 397 739 100.0 0 MFKNAFANLQKVGKSLMLPVSVLPIAGILLGVGSANFSWLPAVVSHVMAEAGGSVFANMP LIFAIGVALGFTNNDGVSALAAVVAYGIMVKTMAVVAPLVLHLPAEEIASKHLADTGVLG GIISGAIAAYMFNRFYRIKLPEYLGFFAGKRFVPIISGLAAIFTGVVLSFIWPPIGSAIQ TFSQWAAYQNPVVAFGIYGFIERCLVPFGLHHIWNVPFQMQIGEYTNAAGQVFHGDIPRY MAGDPTAGKLSGGFLFKMYGLPAAAIAIWHSAKPENRAKVGGIMISAALTSFLTGITEPI EFSFMFVAPILYIIHAILAGLAFPICILLGMRDGTSFSHGLIDFIVLSGNSSKLWLFPIV GIGYAIVYYTIFRVLIKALDLKTPGREDATEDAKATGTSEMAPALVAAFGGKENITNLDA CITRLRVSVADVSKVDQAGLKKLGAAGVVVAGSGVQAIFGTKSDNLKTEMDEYIRNH >gi|296493443|gb|ADTK01000058.1| GENE 8 7485 - 9674 2087 729 aa, chain - ## HITS:1 COG:ECs1480 KEGG:ns NR:ns ## COG: ECs1480 COG4773 # Protein_GI_number: 15830734 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferric coprogen and ferric-rhodotorulic acid # Organism: Escherichia coli O157:H7 # 1 729 1 729 729 1410 99.0 0 MLSTQFNRDNQYQAITKPSLLAGCIALALLPSAAFAAPATEETVIVEGSATAPDDGENDY SVTSTSAGTKMQMTQRDIPQSVTIVSQQRMEDQQLQTLGEVMENTLGISKSQADSDRALY YSRGFQIDNYMVDGIPTYFESRWNLGDALSDMALFERVEVVRGATGLMTGTGNPSAAINM VRKHATSREFKGDVSAEYGSWNKERYVADLQSPLTEDGKIRARIVGGYQNNDSWLDRYNS EKTFFSGIVDADLGDLTTLSAGYEYQRIDVNSPTWGGLPRWNTDGSSNSYDRARSTAPDW AYNDKEINKVFMTLKQRFADTWQATLNATHSEVEFDSKMMYVDAYVNKADGMLVGPYSNY GPGFDYVGGTGWNSGKRKVDALDLFADGIYELFGRQHNLMFGGSYSKQNNRYFSSWANIF PDEIGSFYNFNGNFPQTDWSPQSLAQDDTTHMKSLYAATRVTLADPLHLILGARYTNWRV DTLTYSMEKNHTTPYAGLVFDINDNWSTYASYTSIFQPQNDRDSSGKYLAPITGNNYELG LKSDWMNSRLTTTLAIFRIEQDNVAQSTGTPIPGSNGETAYKAVDGTVSKGVEFELNGAI TDNWQLTFGATRYIAEDNEGNAVNPNLPRTTVKMFTSYRLPVMPELTVGGGVNWQNRVYT 
DTVTPYGTFRAEQGSYALVDLFTRYQVTKNFSLQGNVNNLFDKTYDTNVEGSIVYGAPRN FSITGTYQF >gi|296493443|gb|ADTK01000058.1| GENE 9 10008 - 10367 435 119 aa, chain + ## HITS:1 COG:ECs1481 KEGG:ns NR:ns ## COG: ECs1481 COG0537 # Protein_GI_number: 15830735 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Escherichia coli O157:H7 # 1 119 1 119 119 205 100.0 2e-53 MAEETIFSKIIRREIPSDIVYQDDLVTAFRDISPQAPTHILIIPNILIPTVNDVSAEHEQ ALGRMITVAAKIAEQEGIAEDGYRLIMNTNRHGGQEVYHIHMHLLGGRPLGPMLAHKGL >gi|296493443|gb|ADTK01000058.1| GENE 10 10370 - 10747 176 125 aa, chain + ## HITS:1 COG:ECs1482 KEGG:ns NR:ns ## COG: ECs1482 COG5633 # Protein_GI_number: 15830736 # Func_class: R General function prediction only # Function: Predicted periplasmic lipoprotein # Organism: Escherichia coli O157:H7 # 1 125 1 125 125 225 99.0 1e-59 MRKGCFGLVSLALLLLVGCRSHPEIPVNDEQSLVMESSLLAAGISAEKPVLSTSDIQPSA SSTLYNERQEPITVHYRFYWYDARGLEMHPLERPRSVTIPAHSAVTLYGSANFLGAHKVR LYLYL >gi|296493443|gb|ADTK01000058.1| GENE 11 10761 - 11402 573 213 aa, chain + ## HITS:1 COG:ECs1483 KEGG:ns NR:ns ## COG: ECs1483 COG3417 # Protein_GI_number: 15830737 # Func_class: R General function prediction only # Function: Collagen-binding surface adhesin SpaP (antigen I/II family) # Organism: Escherichia coli O157:H7 # 1 213 1 213 213 367 100.0 1e-102 MTKMSRYALITALAMFLAGCVGQREPAPVEEVKPAPEQPAEPQQPVPTVPSVPTIPQQPG PIEHEDQTAPPAPHIRHYDWNGAMQPMVSKMLGADGVTAGSVLLVDSVNNRTNGSLNAAE ATETLRNALANNGKFTLVSAQQLSMAKQQLGLSPQDSLGTRSKAIGIARNVGAHYVLYSS ASGNVNAPTLQMQLMLVQTGEIIWSGKGAVSQQ >gi|296493443|gb|ADTK01000058.1| GENE 12 11383 - 12207 247 274 aa, chain + ## HITS:1 COG:ycfN KEGG:ns NR:ns ## COG: ycfN COG0510 # Protein_GI_number: 16129069 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted choline kinase involved in LPS biosynthesis # Organism: Escherichia coli K12 # 1 274 1 
274 274 525 99.0 1e-149 MPFRSNNPITRDELLSRFFPQFHPVTTFNSGLSGGSFLIEHQGQRFVVRQPHDPDAPQSA FLRQYRALSQLPACIAPKPHLYLRDWMVVDYLPGAVKTYLPDTNELAGLLYYLHQQPRFG WRITLLPLLELYWQQSDPARRTVGWLRMLKRLRKAREPRPLRLSPLHMDVHAGNLVHSAS GLKLIDWEYAGDGDIALELAAVWVENTEQHRQLVNDYATRAKIYPAQLWRQVRRWFPWLL MLKAGWFEYRWRQTGDQQFIRLVDDTWRQLLIKQ >gi|296493443|gb|ADTK01000058.1| GENE 13 12218 - 13243 1093 341 aa, chain + ## HITS:1 COG:ycfO KEGG:ns NR:ns ## COG: ycfO COG1472 # Protein_GI_number: 16129070 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Escherichia coli K12 # 1 341 1 341 341 672 99.0 0 MGPVMLDVKGYELDAEEREILAHPLVGGLILFTRNYHDPAQLRELVRQIRAASRNHLVVA VDQEGGRVQRFREGFTRLPAAQSFAALSGMEEGGKLAQEAGWLMASEMIAMDIDISFAPV LDVGHISAAIGERSYHADPQKALAIASRFIDGMHEAGMKTTGKHFPGHGAVTADSHKETP CDPRPQAEIRAKDMSVFSSLIRENKLDAIMPAHVIYSDVDSRPASGSPYWLKTVLRQELG FDGVIFSDDLSMEGAAIMGSYAERGQASLDAGCDMILVCNNRKGAVSVLDNLSPIKAERV TRLYHKGSFSRQELMDSARWKAISTRLNQLHERWQEEKAGH >gi|296493443|gb|ADTK01000058.1| GENE 14 13266 - 13808 510 180 aa, chain + ## HITS:1 COG:ECs1486 KEGG:ns NR:ns ## COG: ECs1486 COG3150 # Protein_GI_number: 15830740 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Escherichia coli O157:H7 # 1 180 20 199 199 375 100.0 1e-104 MIIYLHGFDSNSPGNHEKVLQLQFIDPDVRLISYSTRHPKHDMQHLLKEVDKMLQLNVDE RPLICGVGLGGYWAERIGFLCDIRQVIFNPNLFPYENMEGKIDRPEEYADIATKCVTNFR EKNRDRCLVILSRNDEALNSQRTSEELHHYYEIVWDEEQTHKFKNISPHLQRIKAFKTLG >gi|296493443|gb|ADTK01000058.1| GENE 15 14216 - 15520 1306 434 aa, chain + ## HITS:1 COG:ndh KEGG:ns NR:ns ## COG: ndh COG1252 # Protein_GI_number: 16129072 # Func_class: C Energy production and conversion # Function: NADH dehydrogenase, FAD-containing subunit # Organism: Escherichia coli K12 # 1 434 1 434 434 863 100.0 0 MTTPLKKIVIVGGGAGGLEMATQLGHKLGRKKKAKITLVDRNHSHLWKPLLHEVATGSLD EGVDALSYLAHARNHGFQFQLGSVIDIDREAKTITIAELRDEKGELLVPERKIAYDTLVM ALGSTSNDFNTPGVKENCIFLDNPHQARRFHQEMLNLFLKYSANLGANGKVNIAIVGGGA 
TGVELSAELHNAVKQLHSYGYKGLTNEALNVTLVEAGERILPALPPRISAAAHNELTKLG VRVLTQTMVTSADEGGLHTKDGEYIEADLMVWAAGIKAPDFLKDIGGLETNRINQLVVEP TLQTTRDPDIYAIGDCASCPRPEGGFVPPRAQAAHQMATCAMNNILAQMNGKPLKNYQYK DHGSLVSLSNFSTVGSLMGNLTRGSMMIEGRIARFVYISLYRMHQIALHGYFKTGLMMLV GSINRVIRPRLKLH >gi|296493443|gb|ADTK01000058.1| GENE 16 15747 - 16286 584 179 aa, chain + ## HITS:1 COG:ycfJ KEGG:ns NR:ns ## COG: ycfJ COG3134 # Protein_GI_number: 16129073 # Func_class: S Function unknown # Function: Predicted outer membrane lipoprotein # Organism: Escherichia coli K12 # 1 179 1 179 179 300 100.0 1e-81 MNKSMLAGIGIGVAAALGVAAVASLNVFERGPQYAQVVSATPIKETVKTPRQECRNVTVT HRRPVQDENRITGSVLGAVAGGVIGHQFGGGRGKDVATVVGALGGGYAGNQIQGSLQESD TYTTTQQRCKTVYDKSEKMLGYDVTYKIGDQQGKIRMDRDPGTQIPLDSNGQLILNNKV >gi|296493443|gb|ADTK01000058.1| GENE 17 16348 - 17058 453 236 aa, chain - ## HITS:1 COG:ECs1489 KEGG:ns NR:ns ## COG: ECs1489 COG1309 # Protein_GI_number: 15830743 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 236 1 236 236 483 99.0 1e-137 MGSGLVNGGDYFYNNLSFTVTRYNGIMATDSTQCVKKSRGRPKVFDRDAALDKAMKLFWQ HGYEATSLADLVEATGAKAPTLYAEFTNKEGLFRAVLDRYIDRFAAKHEAQLFCEEKSVE SALADYFAAIANCFTSKDTPAGCFMINNCTTLSPDSGDIANTLKSRHAMQERTLQQFLCQ RQARGEIPTHCDVTHLAEFLNCIIQGMSISAREGASLEKLMQIAGTTLRLWPELVK >gi|296493443|gb|ADTK01000058.1| GENE 18 17221 - 17478 293 85 aa, chain + ## HITS:1 COG:no KEGG:ECO103_1157 NR:ns ## KEGG: ECO103_1157 # Name: ycfR # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 85 1 85 85 127 100.0 9e-29 MKNVKTLIAAAILSSMSFASFAAVEVQSTPEGQQKVGTISANAGTNLGSLEEQLAQKADE MGAKSFRITSVTGPNTLHGTAVIYK Prediction of potential genes in microbial genomes Time: Mon May 16 15:09:28 2011 Seq name: gi|296493442|gb|ADTK01000059.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont169.4, whole genome shotgun sequence Length of sequence - 15880 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 7, operones - 4 average op.length - 2.8 N 
Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 - CDS 52 - 1011 934 ## COG1376 Uncharacterized protein conserved in bacteria - Prom 1034 - 1093 4.9 2 1 Op 2 3/1.000 - CDS 1158 - 4604 3692 ## COG1197 Transcription-repair coupling factor (superfamily II helicase) - Prom 4669 - 4728 2.7 3 2 Tu 1 . - CDS 4732 - 5805 995 ## COG4763 Predicted membrane protein - Prom 5934 - 5993 2.5 + Prom 5982 - 6041 4.4 4 3 Op 1 23/0.000 + CDS 6067 - 7266 1443 ## COG4591 ABC-type transport system, involved in lipoprotein release, permease component 5 3 Op 2 23/0.000 + CDS 7259 - 7960 227 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 6 3 Op 3 5/0.000 + CDS 7966 - 9204 1291 ## COG4591 ABC-type transport system, involved in lipoprotein release, permease component 7 3 Op 4 5/0.000 + CDS 9233 - 10144 860 ## COG1940 Transcriptional regulator/sugar kinase 8 3 Op 5 . + CDS 10160 - 10981 665 ## COG0846 NAD-dependent protein deacetylases, SIR2 family 9 4 Op 1 . - CDS 11120 - 11908 374 ## ECO103_1166 putative inner membrane protein 10 4 Op 2 . - CDS 11905 - 12366 167 ## ECO111_1399 putative inner membrane protein - Term 12377 - 12419 6.4 11 5 Op 1 25/0.000 - CDS 12424 - 13470 1470 ## COG0687 Spermidine/putrescine-binding periplasmic protein 12 5 Op 2 2/1.000 - CDS 13467 - 14261 912 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II - Prom 14367 - 14426 2.8 13 6 Tu 1 . - CDS 14428 - 15201 306 ## COG0582 Integrase 14 7 Tu 1 . 
- CDS 15515 - 15784 126 ## ECUMN_1304 putative excisionase for bacteriophage origin - Prom 15806 - 15865 1.6 Predicted protein(s) >gi|296493442|gb|ADTK01000059.1| GENE 1 52 - 1011 934 319 aa, chain - ## HITS:1 COG:ECs1491 KEGG:ns NR:ns ## COG: ECs1491 COG1376 # Protein_GI_number: 15830745 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 319 2 320 320 598 100.0 1e-171 MIKTRFSRWLTFFTFAAAVALALPAKANTWPLPPAGSRLVGENKFHVVENDGGSLEAIAK KYNVGFLALLQANPGVDPYVPRAGSVLTIPLQTLLPDAPREGIVINIAELRLYYYPPGKN SVTVYPIGIGQLGGDTLTPTMVTTVSDKRANPTWTPTANIRARYKAQGIELPAVVPAGPD NPMGHHAIRLAAYGGVYLLHGTNADFGIGMRVSSGCIRLRDDDIKTLFSQVTPGTKVNII NTPIKVSAEPNGARLVEVHQPLSEKIDDDPQLLPITLNSAMQSFKDAAQTDAEVMQHVMD VRSGMPVDVRRHQVSPQTL >gi|296493442|gb|ADTK01000059.1| GENE 2 1158 - 4604 3692 1148 aa, chain - ## HITS:1 COG:mfd KEGG:ns NR:ns ## COG: mfd COG1197 # Protein_GI_number: 16129077 # Func_class: L Replication, recombination and repair; K Transcription # Function: Transcription-repair coupling factor (superfamily II helicase) # Organism: Escherichia coli K12 # 1 1148 1 1148 1148 2276 99.0 0 MPEQYRYTLPVKAGEQRLLGELTGAACATLVAEIAERHAGPVVLIAPDMQNALRLHDEIS QFTDQMVMNLADWETLPYDSFSPHQDIISSRLSTLYQLPTMQRGVLIVPVNTLMQRVCPH SFLHGHALVMKKGQRLSRDALRTQLDSAGYRHVDQVMEHGEYATRGALLDLFPMGSELPY RLDFFDDEIDSLRVFDVDSQRTLEEVEAINLLPAHEFPTDKAAIELFRSQWRDTFEVKRD PEHIYQQVSKGTLPAGIEYWQPLFFSEPLPPLFSYFPANTLLVNTGDLETSAERFQADTL ARFENRGVDPMRPLLPPQSLWLRVDELFSELKNWPRVQLKTEHLPTKAANANLGFQKLPD LAVQAQQKAPLDALRKFLETFDGPVVFSVESEGRREALGELLARIKIAPQRIMRLDEASD RGRYLMIGAAEHGFVDKVRNLALICESDLLGERVARRRQDSRRTINPDTLIRNLAELHIG QPVVHLEHGVGRYAGMTTLEAGGITGEYLMLTYANDAKLYVPVSSLHLISRYAGGAEENA PLHKLGGDAWSRARQKAAEKVRDVAAELLDIYAQRAAKEGFAFKHDREQYQLFCDSFPFE TTPDQAQAINAVLSDMCQPLAMDRLVCGDVGFGKTEVAMRAAFLAVDNHKQVAVLVPTTL LAQQHYDNFRDRFANWPVRIEMISRFRSAKEQTQILAEVAEGKIDILIGTHKLLQSDVKF KDLGLLIVDEEHRFGVRHKERIKAMRANVDILTLTATPIPRTLNMAMSGMRDLSIIATPP ARRLAVKTFVREYDSLVVREAILREILRGGQVYYLYNDVENIQKAAERLAELVPEARIAI 
GHGQMRERELERVMNDFHHQRFNVLVCTTIIETGIDIPTANTIIIERADHFGLAQLHQLR GRVGRSHHQAYAWLLTPHPKAMTTDAQKRLEAIASLEDLGAGFALATHDLEIRGAGELLG EEQSGSMETIGFSLYMELLENAVDALKAGREPSLEDLTSQQTEVELRMPSLLPDDFIPDV NTRLSFYKRIASAKTENELEEIKVELIDRFGLLPDPARILLDVARLRQQAQKLGIRKLEG NEKGGVIEFAEKNHVNPAWLIGLLQKQPQHYRLDGPTRLKFIQDLSERKTRIEWVRQFMR ELEENAIA >gi|296493442|gb|ADTK01000059.1| GENE 3 4732 - 5805 995 357 aa, chain - ## HITS:1 COG:ycfT KEGG:ns NR:ns ## COG: ycfT COG4763 # Protein_GI_number: 16129078 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 357 1 357 357 624 99.0 1e-179 MKQKELWINQIKGLCICLVVIYHSVITFYPHLTTFQHPLSEVLSKCWIYFNLYLAPFRMP FFFFISGYLIRRYIDSVPWGNCLDKRIWNIFWVLALWGVVQWLALSALNQWLAPERDLSN ASNAAYADSTGEFLHGMITASTSLWYLYALIVYFVVCKIFSRLALPLFALFVLLSVAVNF VPTPWWGMNSVIRNLLYYSLGAWFGATIMTCVKEVPLRRHLLMASLLTVLAVGAWLFTIS LLLSLVSIVVIMKLFYQYEQRFGMRSTSLLNVIGSNTIAIYTTHRILVEIFSLTLLAQMN AARWSPQVELTLLLVYPFVSLFICTVAGLLVRKLSQRAFSDLLFSPPSLPAAVSYSR >gi|296493442|gb|ADTK01000059.1| GENE 4 6067 - 7266 1443 399 aa, chain + ## HITS:1 COG:ECs1494 KEGG:ns NR:ns ## COG: ECs1494 COG4591 # Protein_GI_number: 15830748 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ABC-type transport system, involved in lipoprotein release, permease component # Organism: Escherichia coli O157:H7 # 1 399 1 399 399 703 100.0 0 MYQPVALFIGLRYMRGRAADRFGRFVSWLSTIGITLGVMALVTVLSVMNGFERELQNNIL GLMPQAILSSEHGSLNPQQLPETAVKLDGVNRVAPITTGDVVLQSARSVAVGVMLGIDPA QKDPLTPYLVNVKQTDLEPGKYNVILGEQLASQLGVNRGDQIRVMVPSASQFTPMGRIPS QRLFNVIGTFAANSEVDGYEMLVNIEDASRLMRYPAGNITGWRLWLDEPLKVDSLSQQKL PEGSKWQDWRDRKGELFQAVRMEKNMMGLLLSLIVAVAAFNIITSLGLMVMEKQGEVAIL QTQGLTPRQIMMVFMVQGASAGIIGAILGAALGALLASQLNNLMPIIGVLLDGAALPVAI EPLQVIVIALVAMAIALLSTLYPSWRAAATQPAEALRYE >gi|296493442|gb|ADTK01000059.1| GENE 5 7259 - 7960 227 233 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 6 216 9 210 309 92 29 2e-18 
MNKILLQCDNLCKRYQEGSVQTDVLHNVSFSVGEGEMMAIVGSSGSGKSTLLHLLGGLDT PTSGDVIFNGQPMSKLSSAAKAELRNQKLGFIYQFHHLLPDFTALENVAMPLLIGKKKPA EINSRALEMLKAVGLEHRANHRPSELSGGERQRVAIARALVNNPRLVLADEPTGNLDARN ADSIFQLLGELNRLQGTAFLVVTHDLQLAKRMSRQLEMRDGRLTAELSLMGAE >gi|296493442|gb|ADTK01000059.1| GENE 6 7966 - 9204 1291 412 aa, chain + ## HITS:1 COG:lolE KEGG:ns NR:ns ## COG: lolE COG4591 # Protein_GI_number: 16129081 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ABC-type transport system, involved in lipoprotein release, permease component # Organism: Escherichia coli K12 # 1 412 3 414 414 754 99.0 0 MPLSLLIGLRFSRGRRRGGMVSLISVISTIGIALGVAVLIVGLSAMNGFERELNNRILAV VPHGEIEAVNQPWTNWQEALDNVQKVPGIAAAAPYINFTGLVESGANLRAIQVKGVNPQQ EQRLSALPSFVQGDAWRNFKAGEQQIIIGKGVADALKVKQGDWVSIMIPNSNPEHKLMQP KRVRLHIAGILQLSGQLDHSFAMIPLADAQQYLDMGSSVSGIALKMTDVFNANKLVRDAG EVTNSYVYIKSWIGTYGYMYRDIQMIRAIMYLAMVLVIGVACFNIVSTLVMAVKDKSGDI AVLRTLGAKDGLIRAIFVWYGLLAGLFGSLCGVIIGVVVSLQLTPIIEWIEKLIGHQFLS SDIYFIDFLPSELHWLDVFYVLVTALLLSLLASWYPARRASNIDPARVLSGQ >gi|296493442|gb|ADTK01000059.1| GENE 7 9233 - 10144 860 303 aa, chain + ## HITS:1 COG:ycfX KEGG:ns NR:ns ## COG: ycfX COG1940 # Protein_GI_number: 16129082 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Escherichia coli K12 # 1 303 1 303 303 625 100.0 1e-179 MYYGFDIGGTKIALGVFDSGRQLQWEKRVPTPRDSYDAFLDAVCELVAEADQRFGCKGSV GIGIPGMPETEDGTLYAANVPAASGKPLRADLSARLDRDVRLDNDANCFALSEAWDDEFT QYPLVMGLILGTGVGGGLIFNGKPITGKSYITGEFGHMRLPVDALTMMGLDFPLRRCGCG QHGCIENYLSGRGFAWLYQHYYHQPLQAPEIIALYDQGDEQARAHVERYLDLLAVCLGNI LTIVDPDLVVIGGGLSNFPAITTQLADRLPRHLLPVARVPRIERARHGDAGGMRGAAFLH LTD >gi|296493442|gb|ADTK01000059.1| GENE 8 10160 - 10981 665 273 aa, chain + ## HITS:1 COG:ycfY KEGG:ns NR:ns ## COG: ycfY COG0846 # Protein_GI_number: 16129083 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Escherichia coli K12 # 1 273 1 273 279 543 100.0 1e-154 
MLSRRGHRLSRFRKNKRRLRERLRQRIFFRDKVVPEAMEKPRVLVLTGAGISAESGIRTF RAADGLWEEHRVEDVATPEGFDRDPELVQAFYNARRRQLQQPEIQPNAAHLALAKLQDAL GDRFLLVTQNIDNLHERAGNTNVIHMHGELLKVRCSQSGQVLDWTGDVTPEDKCHCCQFP APLRPHVVWFGEMPLGMDEIYMALSMADIFIAIGTSGHVYPAAGFVHEAKLHGAHTVELN LEPSQVGNEFAEKYYGPASQVVPEFVEKLLKGL >gi|296493442|gb|ADTK01000059.1| GENE 9 11120 - 11908 374 262 aa, chain - ## HITS:1 COG:no KEGG:ECO103_1166 NR:ns ## KEGG: ECO103_1166 # Name: ycfZ # Def: putative inner membrane protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 262 1 262 262 487 100.0 1e-136 MKKFIILLSLLILLPLVATSKPLIPIMKTLFTDVTGTVPDAEEIARKAELFRQQTGIAPF IVILPDINNEASLRQNGKAMLAHASSSLSNVKGSVLLLFTTREPRLIMITNGQVESGLDD KHLGLLIENHTLAYLNADLWYQGINNALAVLQAQILKQPTPPLTYYPHPGQQHENAPPGS TNTLGFIAWAATFILFSWIFCYTTRFIYALKFAVAMTIANMGYQALCLYIDNSFAITRIS PLWAGLIGVCTFIAALLLTSKR >gi|296493442|gb|ADTK01000059.1| GENE 10 11905 - 12366 167 153 aa, chain - ## HITS:1 COG:no KEGG:ECO111_1399 NR:ns ## KEGG: ECO111_1399 # Name: yfmA # Def: putative inner membrane protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 153 1 153 153 303 99.0 2e-81 MSQDSKVLSRAFLGIGLILLVISAIIFYYHFTFTKSAVHTEGIIVDAVWYNNHSNDVDDN GSWYPVVAFRPTPDYTLIFNSNIGSDFYEDSEGDRVNVYYPPGHPEQAEINNPWVNFFKW GFVGIAGIIFGSVGLIITLPSTKKSRRKRKSRP >gi|296493442|gb|ADTK01000059.1| GENE 11 12424 - 13470 1470 348 aa, chain - ## HITS:1 COG:potD KEGG:ns NR:ns ## COG: potD COG0687 # Protein_GI_number: 16129086 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Escherichia coli K12 # 1 348 1 348 348 652 100.0 0 MKKWSRHLLAAGALALGMSAAHADDNNTLYFYNWTEYVPPGLLEQFTKETGIKVIYSTYE SNETMYAKLKTYKDGAYDLVVPSTYYVDKMRKEGMIQKIDKSKLTNFSNLDPDMLNKPFD PNNDYSIPYIWGATAIGVNGDAVDPKSVTSWADLWKPEYKGSLLLTDDAREVFQMALRKL GYSGNTTDPKEIEAAYNELKKLMPNVAAFNSDNPANPYMEGEVNLGMIWNGSAFVARQAG TPIDVVWPKEGGIFWMDSLAIPANAKNKEGALKLINFLLRPDVAKQVAETIGYPTPNLAA RKLLSPEVANDKTLYPDAETIKNGEWQNDVGAASSIYEEYYQKLKAGR >gi|296493442|gb|ADTK01000059.1| GENE 12 13467 - 14261 912 264 
aa, chain - ## HITS:1 COG:ECs1500 KEGG:ns NR:ns ## COG: ECs1500 COG1177 # Protein_GI_number: 15830754 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # Organism: Escherichia coli O157:H7 # 1 264 1 264 264 431 100.0 1e-121 MIGRLLRGGFMTAIYAYLYIPIIILIVNSFNSSRFGINWQGFTTKWYSLLMNNDSLLQAA QHSLTMAVFSATFATLIGSLTAVALYRYRFRGKPFVSGMLFVVMMSPDIVMAISLLVLFM LLGIQLGFWSLLFSHITFCLPFVVVTVYSRLKGFDVRMLEAAKDLGASEFTILRKIILPL AMPAVAAGWVLSFTLSMDDVVVSSFVTGPSYEILPLKIYSMVKVGVSPEVNALATILLVL SLVMVIASQLIARDKTKGNTGDVK >gi|296493442|gb|ADTK01000059.1| GENE 13 14428 - 15201 306 257 aa, chain - ## HITS:1 COG:ECs1501 KEGG:ns NR:ns ## COG: ECs1501 COG0582 # Protein_GI_number: 15830755 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Escherichia coli O157:H7 # 1 257 116 372 372 526 99.0 1e-149 MAAYLVSRLGNHPLKELEVRDFALILDEWLDKDMVSTARVNRGLWVDIYKEAQHAGEVPP GWNPPEATRKPIPKVTRARLTMEDWQKIYNATPEKHFIRNAMLLAIVTGQRRDDICHMRF SDVWNEHLHITQGKTGMRLALPLTLRCDAIGITLKEVIDGCRDRILSPYLIHSRHQKQPK PMSKDNLSDYFAKARDLAGIIPPAGKTPPTFHEQRSLSERLYRAQGIDTKTLLGHKVQAT TDRYNDTRGQEWVKLVI >gi|296493442|gb|ADTK01000059.1| GENE 14 15515 - 15784 126 89 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_1304 NR:ns ## KEGG: ECUMN_1304 # Name: not_defined # Def: putative excisionase for bacteriophage origin # Organism: E.coli_UMN026 # Pathway: not_defined # 1 89 4 92 92 177 100.0 2e-43 MSEQYLITLDEWKPKRFSLPITNTTLVKYGKLGYIVPRPQKIRGRWLIDRRAVFVGPGET GIAPEIHTGDDDALKEILTHVTEATKKQH Prediction of potential genes in microbial genomes Time: Mon May 16 15:09:38 2011 Seq name: gi|296493441|gb|ADTK01000060.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont170.1, whole genome shotgun sequence Length of sequence - 1954 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 9/0.000 - CDS 67 - 846 598 ## COG1484 DNA replication 
protein 2 1 Op 2 . - CDS 846 - 1868 613 ## COG4584 Transposase and inactivated derivatives Predicted protein(s) >gi|296493441|gb|ADTK01000060.1| GENE 1 67 - 846 598 259 aa, chain - ## HITS:1 COG:YPO2025 KEGG:ns NR:ns ## COG: YPO2025 COG1484 # Protein_GI_number: 16122266 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Yersinia pestis # 1 259 2 260 260 470 99.0 1e-132 MMELQHQRLMVLAGQLQLESLISAAPALSQQAVDQEWSYMDFLEHLLHEEKLARHQRKQA MYTRMAAFPAVKTFEEYDFTFATGAPQKQLQSLRSLSFIERNENIVLLGPSGVGKTHLAI AMGYEAVRAGIKVRFTTAADLLLQLSTAQRQGRYKTTLQRGVMAPRLLIIDEIGYLPFSQ EEAKLFFQVIAKRYEKSAMILTSNLPFGQWDQTFAGDAALTSAMLDRILHHSHVVQIKGE SYRLRQKRKAGVIAEANPE >gi|296493441|gb|ADTK01000060.1| GENE 2 846 - 1868 613 340 aa, chain - ## HITS:1 COG:YPO2026 KEGG:ns NR:ns ## COG: YPO2026 COG4584 # Protein_GI_number: 16122267 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Yersinia pestis # 1 340 1 340 340 685 99.0 0 MVTFETVMEIKILHKQGMSSRAIARELGISRNTVKRYLQAKSEPPKYTPRPAVASLLDEY RDYIRQRIADAHPYKIPATVIAREIRDQGYRGGMTILRGFIRSLSVPQEQEPAVRFETEP GRQMQVDWGTMRNGRSPLHVFVAVLGYSRMLYIEFTDNMRYDTLETCHRNAFRFFGGVPR EVLYDNMKTVVLQRDAYQTGQHRFHPSLWQFGKEMGFSPRLCRPFRAQTKGKVERMVQYT RNSFYIPLMTRLRPMGITVDVETANRHGLRWLHDVANQRKHETIQARPCDRWLEEQQSML ALPPEKKEYDVHLDENLVNFDKHPLHHPLSIYDSFCRGVA Prediction of potential genes in microbial genomes Time: Mon May 16 15:09:41 2011 Seq name: gi|296493440|gb|ADTK01000061.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont171.1, whole genome shotgun sequence Length of sequence - 8394 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 7, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 89 - 859 577 ## COG0388 Predicted amidohydrolase - Prom 920 - 979 6.7 + Prom 911 - 970 3.9 2 2 Tu 1 . + CDS 1013 - 1486 585 ## EC55989_0244 C-lysozyme inhibitor + Term 1503 - 1533 3.0 - Term 1490 - 1520 3.0 3 3 Tu 1 . 
- CDS 1529 - 3973 2883 ## COG1960 Acyl-CoA dehydrogenases - Prom 4060 - 4119 3.2 + Prom 4122 - 4181 4.2 4 4 Op 1 6/0.000 + CDS 4213 - 4791 722 ## COG0279 Phosphoheptose isomerase + Term 4960 - 4990 1.7 + Prom 4824 - 4883 2.7 5 4 Op 2 . + CDS 4997 - 5764 588 ## COG0121 Predicted glutamine amidotransferase + Term 5809 - 5865 7.6 - Term 5691 - 5724 2.1 6 5 Tu 1 . - CDS 5735 - 6475 794 ## COG3034 Uncharacterized protein conserved in bacteria - Prom 6665 - 6724 4.3 + Prom 6691 - 6750 5.1 7 6 Tu 1 3/0.667 + CDS 6899 - 7525 321 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) + Term 7553 - 7604 1.2 + Prom 7547 - 7606 3.1 8 7 Tu 1 . + CDS 7701 - 8198 43 ## COG1943 Transposase and inactivated derivatives + Term 8295 - 8346 1.2 Predicted protein(s) >gi|296493440|gb|ADTK01000061.1| GENE 1 89 - 859 577 256 aa, chain - ## HITS:1 COG:ECs0246 KEGG:ns NR:ns ## COG: ECs0246 COG0388 # Protein_GI_number: 15829500 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Escherichia coli O157:H7 # 1 256 1 256 256 529 99.0 1e-150 MPGLKITLLQQPLVWMDGPANLRHFDRQLEGITGRDVIVLPEMFTSGFAMEAAASSLAQN DVVNWMTAKAQQCNALIAGSVALQTESGSVNRFLLVEPGGTVHFYDKRHLFRMADEHLHY KAGNARVIVEWRGWRILPLVCYDLRFPVWSRNLNDYDLAIYVANWPAPRSLHWQALLTAR AIENQAYVAGCNRVGSDGNGCHYRGDSRVINPQGEIIATADAHQATRIDAELSMVALREY REKFPAWQDADEFRLR >gi|296493440|gb|ADTK01000061.1| GENE 2 1013 - 1486 585 157 aa, chain + ## HITS:1 COG:no KEGG:EC55989_0244 NR:ns ## KEGG: EC55989_0244 # Name: ivy # Def: C-lysozyme inhibitor # Organism: E.coli_55989 # Pathway: not_defined # 1 157 1 157 157 292 100.0 2e-78 MGRISSGGMMFKAITTVAALVIATSAMAQDDLTISSLAKGETTKAAFNQMVQGHKLPAWV MKGGTYTPAQTVTLGDETYQVMSACKPHDCGSQRIAVMWSEKSNQMTGLFSTIDEKTSQE KLTWLNVNDALSIDGKTVLFAALTGSLENHPDGFNFK >gi|296493440|gb|ADTK01000061.1| GENE 3 1529 - 3973 2883 814 aa, chain - ## HITS:1 COG:yafH KEGG:ns NR:ns ## COG: yafH COG1960 # Protein_GI_number: 16128207 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA 
dehydrogenases # Organism: Escherichia coli K12 # 1 814 13 826 826 1667 99.0 0 MMILSILATVVLLGALFYHRVSLFISSLILLAWTAALGVAGLWSAWVLVPLAIILVPFNF APMRKSMISAPVFRGFRKVMPPMSRTEKEAIDAGTTWWEGDLFQGKPDWKKLHNYPQPRL TAEEQAFLDGPVEEACRMANDFQITHELADLPPELWAYLKEHRFFAMIIKKEYGGLEFSA YAQSRVLQKLSGVSGILAITVGVPNSLGPGELLQHYGTDEQKNHYLPRLARGQEIPCFAL TSPEAGSDAGAIPDTGIVCMGEWQGQQVLGMRLTWNKRYITLAPIATVLGLAFKLSDPEK LLGGAEDLGITCALIPTTTPGVEIGRRHFPLNVPFQNGPTRGKDVFVPIDYIIGGPKMAG QGWRMLVECLSVGRGITLPSNSTGGVKSVALATGAYAHIRRQFKISIGKMEGIEEPLARI AGNAYVMDAAASLITYGIMLGEKPAVLSAIVKYHCTHRGQQSIIDAMDITGGKGIMLGQS NFLARAYQGAPIAITVEGANILTRSMMIFGQGAIRCHPYVLEEMEAAKNNDVNAFDKLLF KHIGHVGSNKVRSFWLGLTRGLTSSTPTGDATKRYYQHLNRLSANLALLSDVSMAVLGGS LKRRERISARLGDILSQLYLASAVLKRYDDEGRNEADLPLVHWGVQDALYQAEQAMDDLL QNFPNRVVAGLLNVVIFPTGRHYLAPSDKLDHKVAKILQVPNATRSRIGRGQYLTPSEHN PVGLLEEALVDVIAADPIHQRICKELGKNLPFTRLDELAHNALAKGLIDKDEAAILVKAE ESRLCSINVDDFDPEELATKPVKLPEKVRKVEAA >gi|296493440|gb|ADTK01000061.1| GENE 4 4213 - 4791 722 192 aa, chain + ## HITS:1 COG:ECs0249 KEGG:ns NR:ns ## COG: ECs0249 COG0279 # Protein_GI_number: 15829503 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoheptose isomerase # Organism: Escherichia coli O157:H7 # 1 192 1 192 192 373 100.0 1e-103 MYQDLIRNELNEAAETLANFLKDDANIHAIQRAAVLLADSFKAGGKVLSCGNGGSHCDAM HFAEELTGRYRENRPGYPAIAISDVSHISCVGNDFGFNDIFSRYVEAVGREGDVLLGIST SGNSANVIKAIAAAREKGMKVITLTGKDGGKMAGTADIEIRVPHFGYADRIQEIHIKVIH ILIQLIEKEMVK >gi|296493440|gb|ADTK01000061.1| GENE 5 4997 - 5764 588 255 aa, chain + ## HITS:1 COG:yafJ KEGG:ns NR:ns ## COG: yafJ COG0121 # Protein_GI_number: 16128209 # Func_class: R General function prediction only # Function: Predicted glutamine amidotransferase # Organism: Escherichia coli K12 # 1 255 1 255 255 542 99.0 1e-154 MCELLGMSANVPTDICFSFTGLVQRGGGTGPHKDGWGITFYEGKGCRTFKDPQPSFNSPI AKLVQDYPIKSCSVVAHIRQANRGEVALENTHPFTRELWGRNWTYAHNGQLTGYKSLETG NFRPVGETDSEKAFCWLLHKLTQRYPRTPGNMAAVFKYIASLAGEMRQKGVFNMLLSDGR YVMAYCSTNLHWITRRAPFGVATLLDQDVEIDFSSQTTPNDVVTVIATQPLTGNETWQKI 
MPGEWRLFCLGERVV >gi|296493440|gb|ADTK01000061.1| GENE 6 5735 - 6475 794 246 aa, chain - ## HITS:1 COG:ECs0251 KEGG:ns NR:ns ## COG: ECs0251 COG3034 # Protein_GI_number: 15829505 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 246 1 246 246 491 100.0 1e-139 MRKIALILAMLLIPCVSFAGLLGSSSSTTPVSKEYKQQLMGSPVYIQIFKEERTLDLYVK MGEQYQLLDSYKICKYSGGLGPKQRQGDFKSPEGFYSVQRNQLKPDSRYYKAINIGFPNA YDRAHGYEGKYLMIHGDCVSIGCYAMTNQGIDEIFQFVTGALVFGQPSVQVSIYPFRMTD ANMKRHKYSNFKDFWEQLKPGYDYFEQTRKPPTVSVVNGRYVVSKPLSHEVVQPQLASNY TLPEAK >gi|296493440|gb|ADTK01000061.1| GENE 7 6899 - 7525 321 208 aa, chain + ## HITS:1 COG:ECs0254 KEGG:ns NR:ns ## COG: ECs0254 COG0791 # Protein_GI_number: 15829508 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Escherichia coli O157:H7 # 1 208 42 249 249 416 98.0 1e-116 MQNRARLLKQYQTHLKKQASYIVEGNAESRRALRQHNREQIKQHPEWFPAPLKASDRRWQ ALAENNHFLSSDHLHNITEVAIHRLEQQLGKPYVWGGTRPDQGFDCSGLVFYAYNKILEA KLPRTANEMYHYHRATIVANNDLRRGDLLFFHIHSREIADHMGVYLGDGQFIESPRTGEN IRVSRLAEPFWQDHFLGARRILTEETIL >gi|296493440|gb|ADTK01000061.1| GENE 8 7701 - 8198 43 165 aa, chain + ## HITS:1 COG:yafM KEGG:ns NR:ns ## COG: yafM COG1943 # Protein_GI_number: 16128214 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 1 164 1 164 165 322 98.0 2e-88 MSEYRRYYIKGGTWFFTVNLRNRRSQLLTTQYQMLRHAIIKVKRDRPFEINAWVVLPEHM HCIWTLPEGDDDFSSRWREIKKQFTHACGLKNIWQPRFWEHAIRNTKDYRHHVDYIYINP VKHGWVKQVSDWPFSTFHRDVARGLYPIDWAGDVTDINAGERIIL Prediction of potential genes in microbial genomes Time: Mon May 16 15:09:53 2011 Seq name: gi|296493439|gb|ADTK01000062.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont171.2, whole genome shotgun sequence Length of sequence - 31436 bp Number of predicted genes - 40, with homology - 40 Number of transcription units - 19, operones - 9 
average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 124 - 1863 1719 ## COG1298 Flagellar biosynthesis pathway, component FlhA - Prom 1913 - 1972 1.8 2 2 Op 1 2/1.000 + CDS 1823 - 2593 632 ## COG1360 Flagellar motor protein 3 2 Op 2 1/1.000 + CDS 2664 - 3719 959 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair 4 2 Op 3 2/1.000 + CDS 3794 - 4168 192 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Prom 4288 - 4347 7.2 5 3 Op 1 5/0.167 + CDS 4475 - 4741 116 ## COG1690 Uncharacterized conserved protein 6 3 Op 2 . + CDS 4784 - 5041 123 ## COG1186 Protein chain release factor B + Term 5283 - 5326 2.4 - Term 5042 - 5085 8.3 7 4 Tu 1 . - CDS 5098 - 6555 1583 ## COG2195 Di- and tripeptidases - Prom 6647 - 6706 3.6 + Prom 6697 - 6756 3.2 8 5 Tu 1 6/0.167 + CDS 6816 - 7274 607 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins + Term 7281 - 7319 9.1 + Prom 7281 - 7340 3.4 9 6 Op 1 . + CDS 7366 - 8610 1131 ## COG1073 Hydrolases of the alpha/beta superfamily 10 6 Op 2 . + CDS 8668 - 9069 588 ## SSON_0282 DNA-binding transcriptional regulator Crl - Term 9066 - 9100 6.0 11 7 Tu 1 . - CDS 9108 - 10163 1120 ## COG3203 Outer membrane protein (porin) - Prom 10247 - 10306 5.8 + Prom 10240 - 10299 8.4 12 8 Op 1 22/0.000 + CDS 10451 - 11554 1085 ## COG0263 Glutamate 5-kinase 13 8 Op 2 . + CDS 11566 - 12819 1490 ## COG0014 Gamma-glutamyl phosphate reductase + TRNA 12934 - 13009 91.5 # Thr CGT 0 0 14 9 Tu 1 . - CDS 13024 - 13692 -36 ## COG0582 Integrase - Prom 13720 - 13779 2.0 - Term 14367 - 14403 -0.0 15 10 Op 1 . - CDS 14414 - 14719 313 ## SF0298 hypothetical protein 16 10 Op 2 . - CDS 14719 - 15081 277 ## EFER_0574 conserved hypothetical protein; CPS-53 (KpLE1) prophage 17 10 Op 3 . - CDS 15072 - 15608 343 ## COG1896 Predicted hydrolases of HD superfamily - Prom 15670 - 15729 1.9 - Term 15694 - 15719 -0.5 18 11 Tu 1 . 
- CDS 15736 - 16560 839 ## COG5532 Uncharacterized conserved protein 19 12 Tu 1 . - CDS 16626 - 16988 499 ## EFER_4420 hypothetical protein - Term 17641 - 17669 1.0 20 13 Tu 1 . - CDS 17691 - 18383 504 ## COG2932 Predicted transcriptional regulator + Prom 18345 - 18404 2.4 21 14 Op 1 . + CDS 18481 - 18741 140 ## COG1396 Predicted transcriptional regulators 22 14 Op 2 . + CDS 18734 - 19285 273 ## ECO26_3705 putative phage regulatory protein + Term 19323 - 19359 1.0 23 15 Op 1 . + CDS 19474 - 20433 404 ## COG3646 Uncharacterized phage-encoded protein 24 15 Op 2 . + CDS 20430 - 20654 143 ## UTI89_C5099 hypothetical protein 25 15 Op 3 . + CDS 20651 - 21469 490 ## UTI89_C5100 putative replication protein 26 15 Op 4 . + CDS 21472 - 21960 162 ## JW5385 hypothetical protein 27 15 Op 5 . + CDS 21960 - 22613 615 ## ECS88_0548 putative phage AdoMet-dependent methyltransferase 28 15 Op 6 . + CDS 22610 - 22936 273 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) 29 15 Op 7 . + CDS 22933 - 23322 331 ## COG4570 Holliday junction resolvase 30 15 Op 8 . + CDS 23342 - 24139 728 ## ECS88_5013 conserved hypothetical protein from phage origin 31 15 Op 9 . + CDS 24219 - 25136 331 ## EFER_0593 hypothetical protein 32 15 Op 10 . + CDS 25149 - 25523 204 ## COG4570 Holliday junction resolvase 33 15 Op 11 . + CDS 25520 - 26341 321 ## ECS88_1348 putative antitermination protein Q of prophage + Term 26378 - 26423 6.2 + Prom 26372 - 26431 2.1 34 16 Op 1 . + CDS 26568 - 26765 137 ## ECO26_1481 hypothetical protein 35 16 Op 2 . + CDS 26916 - 27965 656 ## COG0863 DNA modification methylase + Term 28122 - 28188 30.0 + TRNA 28015 - 28090 88.5 # Met CAT 0 0 + TRNA 28100 - 28176 51.0 # Arg TCG 0 0 - Term 28981 - 29023 -0.9 36 17 Tu 1 . - CDS 29124 - 29309 114 ## E2348C_1245 hypothetical protein - Prom 29493 - 29552 8.3 + Prom 29455 - 29514 4.7 37 18 Tu 1 . 
+ CDS 29720 - 29908 253 ## COG2801 Transposase and inactivated derivatives + Term 30150 - 30188 5.3 + Prom 30009 - 30068 4.2 38 19 Op 1 . + CDS 30216 - 30431 303 ## ECSP_1747 lysis protein 39 19 Op 2 . + CDS 30436 - 30786 381 ## EcSMS35_1182 hypothetical protein 40 19 Op 3 . + CDS 30850 - 31383 276 ## COG3772 Phage-related lysozyme (muraminidase) Predicted protein(s) >gi|296493439|gb|ADTK01000062.1| GENE 1 124 - 1863 1719 579 aa, chain - ## HITS:1 COG:ECs0257 KEGG:ns NR:ns ## COG: ECs0257 COG1298 # Protein_GI_number: 15829510 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FlhA # Organism: Escherichia coli O157:H7 # 1 578 1 578 579 1097 99.0 0 MLSRSDLLTLLTINFIVVTKGAERISEVSARFTLDAMPGKQMAIDADLNAGLINQAQAQT RRKDVASEADFYGAMDGASKFVRGDAIAGMMILAINLIGGVCIGIFKYNLSADAAFQQYV LMTIGDGLVAQIPSLLLSTAAAIIVTRISDNGDITHDVRHQLLASPSVLYTATGIMFVLA VVPGMPHLPFLLFSALLGFTGWRMSKRPQAAEAEEKSLETLTRTITETSEQQVSWETIPL IEPISLSLGYKLVALVDKAQGNPLTQRIRGVRQVISDGNGVLLPEIRIRENFRLKPSQYA IFINGIKADEADIPADKLMALPSSETYGEIDGVLGNDPAYGMPVTWIQPAQKAKALNMGY QVIDSASVIATHVNKIVRSYIPDLFNYDDITQLHNRLSSMAPRLAEDLSAALNYSQLLKV YRALLTEGVSLRDIVTIATVLVASSAVTKDHILLAADVRLALRRSITHPFVRKQELTVYT LNNELENLLTNVVNQAQQGGKVMLDSVPVDPNMLNQFQSTMPQVKEQMKAAGKDPVLLVP PQLRPLLARYARLFAPGLHVLSYNEVPDELELKIMGALM >gi|296493439|gb|ADTK01000062.1| GENE 2 1823 - 2593 632 256 aa, chain + ## HITS:1 COG:ECs0256 KEGG:ns NR:ns ## COG: ECs0256 COG1360 # Protein_GI_number: 15829511 # Func_class: N Cell motility # Function: Flagellar motor protein # Organism: Escherichia coli O157:H7 # 1 256 6 261 261 473 98.0 1e-133 MIVNSVSKSERESIIAALHGQSIFSGGGLSPLNKISPSHPPKPATVAVPEETEKKARDVN EKTALLKKKSATELGELATSINTIARDAHMEANLEMEIVPQGLRVLIKDDQNRNMFERGS AQIMPFFKTLLVELAPVFDSLDNKIIITGHTDAMAYKNNIYNNWNLSGDRALSARRVLEE AGMPEDKVMQVSAMADQMLLDAKNPQSAGNRRIEIMVLTKSASDTLYQYFGQHGDKVVQP LVQKLDKQQVLSQRMR >gi|296493439|gb|ADTK01000062.1| GENE 3 2664 - 3719 959 351 aa, chain + ## HITS:1 COG:dinP 
KEGG:ns NR:ns ## COG: dinP COG0389 # Protein_GI_number: 16128217 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Escherichia coli K12 # 1 351 1 351 351 711 99.0 0 MRKIIHVDMDCFFAAVEMRDNPALRDIPIAIGGSRERRGVISTANYPARKFGVRSAMPTG MALKLCPHLTLLPGRFDAYKEASNHIREIFSRYTSRIEPLSLDEAYLDVTDSVHCHGSAT LIAQEIRQTIFNELHLTASAGVAPVKFLAKIASDMNKPNGQFVITPAEVPAFLQTLPLAK IPGVGKVSAAKLEAMGLRTCGDVQKCDLVILLKRFGKFGRILWERSQGIDERDVNSERLR KSVGVERTMAEDIHHWSECEAIIERLYPELERRLAKVKPDLLIARQGVKLKFDDFQQTTQ EHVWPRLNKADLIATARKTWDERRGGRGVRLVGLHVTLLDPQMERQLVLGL >gi|296493439|gb|ADTK01000062.1| GENE 4 3794 - 4168 192 124 aa, chain + ## HITS:1 COG:yafP KEGG:ns NR:ns ## COG: yafP COG0454 # Protein_GI_number: 16128220 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Escherichia coli K12 # 1 124 27 150 150 229 94.0 1e-60 MTASQHYSPQQIAAWAQIDESRWKEKLAKSQVRVAVINAQPVGFISRIERHIDMLFVDPE YTRRGVASALLKPLIKSESELTVDASITAKPFFERYGFQIVKQQHVECRGAWFTNFYMRY KPQH >gi|296493439|gb|ADTK01000062.1| GENE 5 4475 - 4741 116 88 aa, chain + ## HITS:1 COG:ykfJ KEGG:ns NR:ns ## COG: ykfJ COG1690 # Protein_GI_number: 16128221 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 88 1 88 88 177 100.0 5e-45 MEWYMGKYIRPLSDAVFTIASDDLWIESLAIQQLHTTANLPNMQRVVGMPDLHPGRGYPI GAAFFSVGRFYPARRRGNGAGNRNGPLL >gi|296493439|gb|ADTK01000062.1| GENE 6 4784 - 5041 123 85 aa, chain + ## HITS:1 COG:prfH KEGG:ns NR:ns ## COG: prfH COG1186 # Protein_GI_number: 16128222 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Escherichia coli K12 # 15 85 98 166 166 105 81.0 2e-23 MGIKRKLVRHYSVDLSESVSVSSWAQKLVSGHANKRLARLLIAWKLEQQQQENSAALKSQ RRMFHHQIERGNPRRTFTGMAFIEG >gi|296493439|gb|ADTK01000062.1| GENE 7 5098 - 6555 1583 485 aa, chain - ## HITS:1 COG:ECs0264 
KEGG:ns NR:ns ## COG: ECs0264 COG2195 # Protein_GI_number: 15829518 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Escherichia coli O157:H7 # 1 485 1 485 485 985 99.0 0 MSELSQLSPQPLWDIFAKICSIPHPSYHEEQLAEYIVGWAKEKGFHVERDQVGNILIRKP ATAGMENRKPVVLQAHLDMVPQKNNDTVHDFTKDPIQPYIDGEWVKARGTTLGADNGIGM ASALAVLADENVVHGPLEVLLTMTEEAGMDGAFGLQSNWLQADILINTDSEEEGEIYMGC AGGIDFTSNLHLDREAVPAGFETFKLTLKGLKGGHSGGEIHVGLGNANKLLVRFLAGHAE ELDLRLIDFNGGTLRNAIPREAFATIAVAADKVDALKSLVNTYQDILKNELAEKEKNLAL LLDSVANDKAALIAKSRDTFIRLLNATPNGVIRNSDVAKGVVETSLNVGVVTMTDNNVEI HCLIRSLIDSGKDYVVSMLDSLGKLAGAKTEAKGAYPGWQPDANSPVMHLVRETYQRLFN KTPNIQIIHAGLECGLFKKPYPEMDMVSIGPTITGPHSPDEQVHIKSVGHYWTLLTELLK EIPAK >gi|296493439|gb|ADTK01000062.1| GENE 8 6816 - 7274 607 152 aa, chain + ## HITS:1 COG:ECs0265 KEGG:ns NR:ns ## COG: ECs0265 COG0503 # Protein_GI_number: 15829519 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Escherichia coli O157:H7 # 1 152 1 152 152 306 99.0 9e-84 MSEKYIVTWDMLQIHARKLASRLMPSEQWKGIIAVSRGGLVPGALLARELGIRHVDTVCI SSYDHDNQRELKVLKRAEGDGEGFIVIDDLVDTGGTAVAIREMYPKAHFVTIFAKPAGRP LVDNYVVDIPQDTWIEQPWDMGVVFVPPISGR >gi|296493439|gb|ADTK01000062.1| GENE 9 7366 - 8610 1131 414 aa, chain + ## HITS:1 COG:yafA KEGG:ns NR:ns ## COG: yafA COG1073 # Protein_GI_number: 16128225 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Escherichia coli K12 # 1 414 1 414 414 865 99.0 0 MTQANLSETLFKPRFKHPETSTLVRRFNHGAQPPVQSALDGKTIPHWYRMINRLMWIWRG IDPREILDVQARIVMSDAERTDDDLYDTVIGYRGGNWIYEWATQAMVWQQKACAEEDPQL SGRHWLHAATLYNIAAYPHLKGDDLAEQAQALSNRAYEEAAQRLPGTMRQMEFTVPGGAP ITGFLHMPKGDGPFPTVLMCGGLDAMQTDYYSLYERYFAPRGIAMLTIDMPSVGFSSKWK LTQDSSLLHQHVLKALPNVPWVDHTRVAAFGFRFGANVAVRLAYLESPRLKAVACLGPVV HTLLSDFKCQQQVPEMYLDVLASRLGMHDASDDALRVELNRYSLKVQGLLGRRCPTPMLS GYWKNDPFSPEEDSRLITSSSADGKLLEIPFNPVYRNFDKGLQEITGWIEKRLC 
>gi|296493439|gb|ADTK01000062.1| GENE 10 8668 - 9069 588 133 aa, chain + ## HITS:1 COG:no KEGG:SSON_0282 NR:ns ## KEGG: SSON_0282 # Name: crl # Def: DNA-binding transcriptional regulator Crl # Organism: S.sonnei # Pathway: not_defined # 1 133 1 133 133 266 100.0 2e-70 MTLPSGHPKSRLIKKFTALGPYIREGKCEDNRFFFDCLAVCVNVKPAPEVREFWGWWMEL EAQESRFTYSYQFGLFDKAGDWKSVPVKDTEVVERLEHTLREFHEKLRELLTTLNLKLEP ADDFRDEPVKLTA >gi|296493439|gb|ADTK01000062.1| GENE 11 9108 - 10163 1120 351 aa, chain - ## HITS:1 COG:ECs0268 KEGG:ns NR:ns ## COG: ECs0268 COG3203 # Protein_GI_number: 15829522 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein (porin) # Organism: Escherichia coli O157:H7 # 1 351 1 351 351 625 100.0 1e-179 MKKSTLALVVMGIVASASVQAAEIYNKDGNKLDVYGKVKAMHYMSDNDSKDGDQSYIRFG FKGETQINDQLTGYGRWEAEFAGNKAESDTAQQKTRLAFAGLKYKDLGSFDYGRNLGALY DVEAWTDMFPEFGGDSSAQTDNFMTKRASGLATYRNTDFFGVIDGLNLTLQYQGKNENRD VKKQNGDGFGTSLTYDFGGSDFAISGAYTNSDRTNEQNLQSRGTGKRAEAWATGLKYDAN NIYLATFYSETRKMTPITGGFANKTQNFEAVAQYQFDFGLRPSLGYVLSKGKDIEGIGDE DLVNYIDVGATYYFNKNMSAFVDYKINQLDSDNKLNINNDDIVAVGMTYQF >gi|296493439|gb|ADTK01000062.1| GENE 12 10451 - 11554 1085 367 aa, chain + ## HITS:1 COG:ECs0269 KEGG:ns NR:ns ## COG: ECs0269 COG0263 # Protein_GI_number: 15829523 # Func_class: E Amino acid transport and metabolism # Function: Glutamate 5-kinase # Organism: Escherichia coli O157:H7 # 1 367 1 367 367 692 100.0 0 MSDSQTLVVKLGTSVLTGGSRRLNRAHIVELVRQCAQLHAAGHRIVIVTSGAIAAGREHL GYPELPATIASKQLLAAVGQSRLIQLWEQLFSIYGIHVGQMLLTRADMEDRERFLNARDT LRALLDNNIVPVINENDAVATAEIKVGDNDNLSALAAILAGADKLLLLTDQKGLYTADPR SNPQAELIKDVYGIDDALRAIAGDSVSGLGTGGMSTKLQAADVACRAGIDTIIAAGSKPG VIGDVMEGISVGTLFHAQATPLENRKRWIFGAPPAGEITVDEGATAAILERGSSLLPKGI KSVTGNFSRGEVIRICNLEGRDIAHGVSRYNSDALRRIAGHHSQEIDAILGYEYGPVAVH RDDMITR >gi|296493439|gb|ADTK01000062.1| GENE 13 11566 - 12819 1490 417 aa, chain + ## HITS:1 COG:ECs0270 KEGG:ns NR:ns ## COG: ECs0270 COG0014 # Protein_GI_number: 15829524 # Func_class: E Amino acid transport and 
metabolism # Function: Gamma-glutamyl phosphate reductase # Organism: Escherichia coli O157:H7 # 1 417 1 417 417 770 97.0 0 MLEQMGIAAKQASYKLAQLSSREKNRVLEKIADELEAQSEIILNANAQDVADARANGLGE AMLDRLALTPARLKGIADDVRQVCNLADPVGQVIDGSVLDSGLRLERRRVPLGVIGVIYE ARPNVTVDVASLCLKTGNAVILRGGKETCRTNAATVAVIQDALKSCGLPVGAVQAIDNPD RALVSEMLRMDKYIDMLIPRGGAGLHKLCREQSTIPVITGGIGVCHIYVDESVEIAEALK VIVNAKTQRPSTCNTVETLLVNKNIADSFLPALSKQMEESGVALHADAAALAQLQTGPAK VVAVKAEEYDDEFLSLDLNVKIVSDLDDAIAHIREHGTQHSDAILTRDMRNAQRFVNEVD SSAVYVNASTRFTDGGQFGLGAEVAVSTQKLHARGPMGLEALTTYKWIGIGDYTIRA >gi|296493439|gb|ADTK01000062.1| GENE 14 13024 - 13692 -36 222 aa, chain - ## HITS:1 COG:ECs0271 KEGG:ns NR:ns ## COG: ECs0271 COG0582 # Protein_GI_number: 15829525 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Escherichia coli O157:H7 # 1 222 103 324 324 434 99.0 1e-122 MLDKAPIIKVPQPKNKRIRWLEPHEAQRLIDECPEPLKSVVEFALATGLRRSNIINLEWQ QIDMQRRVAWINPEESKSNRAIGVALNDTACRVLKKQIGNHHRWVFVYKESCTKPDGTKA PTVRKMRYDANTAWKAALRRAGIDDFRFHDLRHTWASWLVQAGVPLSVLQEMGGWESIEM VRRYAHLAPNHLTEHARQIDSILNPSVPNSSQSKNKEGTNDV >gi|296493439|gb|ADTK01000062.1| GENE 15 14414 - 14719 313 101 aa, chain - ## HITS:1 COG:no KEGG:SF0298 NR:ns ## KEGG: SF0298 # Name: yfdT # Def: hypothetical protein # Organism: S.flexneri # Pathway: not_defined # 1 101 1 101 101 172 98.0 3e-42 MTTFTDKELIKEIKERISSLDVRDDIERRAYEIALLSLEVEPDERESYELFMEKRFGNLV DRRRAKNGDNEYMAWDMTLGWIVWQQRAGIHFSTMSQQEVK >gi|296493439|gb|ADTK01000062.1| GENE 16 14719 - 15081 277 120 aa, chain - ## HITS:1 COG:no KEGG:EFER_0574 NR:ns ## KEGG: EFER_0574 # Name: yfdS # Def: conserved hypothetical protein; CPS-53 (KpLE1) prophage # Organism: E.fergusonii # Pathway: not_defined # 1 120 1 120 120 233 100.0 2e-60 MRMNVFEMEGFLRGRCVPRDLKVNETDAEYLVRKFDALEAKCAAQENKVIPVSTELPPAN ESVLLFDANGEGWLIGWRSLWYTWGQKETGEWQWTFQVGDLENVNITHWAVMPKAPEAGA >gi|296493439|gb|ADTK01000062.1| GENE 17 15072 - 15608 343 178 aa, chain - ## HITS:1 COG:yfdR KEGG:ns NR:ns ## COG: yfdR COG1896 # Protein_GI_number: 
16130293 # Func_class: R General function prediction only # Function: Predicted hydrolases of HD superfamily # Organism: Escherichia coli K12 # 1 178 10 187 187 363 97.0 1e-100 MSFIKTFSGKHFYYDRINKDDIDINDIAVSLSNICRFAGHLSHFYSVAQHAVLCSQLVPQ EFAFEALMHDATEAYCQDIPAPLKRLLPDYKQMEEKIDAVIREKYGLPPVMSTPVKYADL IMLATERRDLGLDDGSFWPVLEGIPATEMFNVIPLAPGHAYGMFMERFNELSELRKCA >gi|296493439|gb|ADTK01000062.1| GENE 18 15736 - 16560 839 274 aa, chain - ## HITS:1 COG:yfdQ KEGG:ns NR:ns ## COG: yfdQ COG5532 # Protein_GI_number: 16130292 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 274 1 274 274 500 99.0 1e-141 MSQNLDATAINQIHALISAQGVNEIISKIGADAVALPENFRIHDLEKFNLNRFRFRGALS TASIDDFTRYSKDLADEGTRCFIDADNMRAVSVLNLGTIDEPGHADNTATLKLKKTAPFS ALLSVNGERNSQKSLAEWIEDWADYLVGFDANGDAIQATKAAAAVRKITIEANQTADFED NDFSGKRSLMESVEAKTKDIMPVAFEFKCVPFEGLKERPFKLRLSIITGDRPVLVLRIIQ LEAVQEEMANEFRDLLVEKFKDSKVETFIGTFTA >gi|296493439|gb|ADTK01000062.1| GENE 19 16626 - 16988 499 120 aa, chain - ## HITS:1 COG:no KEGG:EFER_4420 NR:ns ## KEGG: EFER_4420 # Name: not_defined # Def: hypothetical protein # Organism: E.fergusonii # Pathway: not_defined # 1 120 107 226 226 239 99.0 3e-62 MASERSTDVQAFIGELDGGVFETKIGAVLSEVASGVMNTKTKGKVSLNLEIEPFDENRVK IKHKLSYVRPTNRGKISEEDTTETPMYVNRGGRLTILQEDQGQLLTLAGEPDGKLRAAGH >gi|296493439|gb|ADTK01000062.1| GENE 20 17691 - 18383 504 230 aa, chain - ## HITS:1 COG:STM0898 KEGG:ns NR:ns ## COG: STM0898 COG2932 # Protein_GI_number: 16764259 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Salmonella typhimurium LT2 # 1 229 3 230 231 221 51.0 6e-58 MNIGNRVRQLRQAKNMKIADLAEAIGVDAANISRLETGKQKQFTEQALSNIARSLGVDIA DLFTSDVKSNTVCKNSISEDVAQVKDVFRIEMLDVSASAGNGLIQGGDVIDVIHAIEYRT DNAVSMFGGRPANHIKVINVRGDSMCPTIEPGDLIFVDVSINQFDGDGIYVFGFDDKIYV KRLQMIPDKLLVISDNQIYREWGITSENEHRFMVFGKVLISQSQTLKRHN >gi|296493439|gb|ADTK01000062.1| GENE 21 18481 - 18741 140 86 aa, chain + ## HITS:1 COG:STM0898A KEGG:ns NR:ns ## COG: 
STM0898A COG1396 # Protein_GI_number: 16764260 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Salmonella typhimurium LT2 # 3 74 6 77 84 61 47.0 4e-10 MQSPLRNVRKAHGFTLQHVAAGVQVNPATLSRIERLEQIPSIDLAERLANFFKGEISEMQ ILYPARFQSSQNQNGFKPQEQEVSRG >gi|296493439|gb|ADTK01000062.1| GENE 22 18734 - 19285 273 183 aa, chain + ## HITS:1 COG:no KEGG:ECO26_3705 NR:ns ## KEGG: ECO26_3705 # Name: not_defined # Def: putative phage regulatory protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 183 5 187 187 360 98.0 1e-98 MGKHHWKVEKQPEWYVKAVRKTIAALPGGYAEAADWLDVTENALFNRLRADGDQIFPLGW AMVLQRAAGTHYIADAVAQSAGGVFVSLPEIEEVENADINQRLLEVIEQIGNYSKQIRSA IEDGVVEPHEQTAINDELYLSISKLQEHAALVYKIFCAPEKSDARECAAPGVVAFCVCGE TNA >gi|296493439|gb|ADTK01000062.1| GENE 23 19474 - 20433 404 319 aa, chain + ## HITS:1 COG:PM1774 KEGG:ns NR:ns ## COG: PM1774 COG3646 # Protein_GI_number: 15603639 # Func_class: S Function unknown # Function: Uncharacterized phage-encoded protein # Organism: Pasteurella multocida # 148 247 27 126 239 100 47.0 3e-21 MSMLFVGNQKHSALSIFAKTTRMSALVLCGNSGVILLSVKDHQHIDSAIPGRYTVQAPYK TGAGRGNPEFTKAHNRALAVFLCHEQHYAQIMVGRAGPTSVGPGSLVTGISTPVRLTTNK VVESLGGELLKITKEAAIMATIPTLTQPEIAIVDGQAVTSSLAVANFFSKRHDDVLKKIR TLECSASFTARNFSVSDYTDCTGRKLPCYQITRDGFAFLAMGFTGKRAAQFKEAYINAFN QMEKQLSNPSVLSDVAHNASVLYSYISSIHQVWLQQLYPMLAKAESPLAVSLYDYINDAS ALACLINLSLNPSEVRGRK >gi|296493439|gb|ADTK01000062.1| GENE 24 20430 - 20654 143 74 aa, chain + ## HITS:1 COG:no KEGG:UTI89_C5099 NR:ns ## KEGG: UTI89_C5099 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 74 1 74 74 152 100.0 4e-36 MIRNIFKRFTNQTFRCPRPGQWYTTPAGHVLRVSLVDRECQKVICEPLGRNYRVSMPLIA FRSGKNMKHLGGAA >gi|296493439|gb|ADTK01000062.1| GENE 25 20651 - 21469 490 272 aa, chain + ## HITS:1 COG:no KEGG:UTI89_C5100 NR:ns ## KEGG: UTI89_C5100 # Name: not_defined # Def: putative replication protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 272 1 272 272 517 99.0 1e-145 
MSMELMVKAMKIRVGNPLRKLVLIKLADNASDQGECWPSYQHIADQCEISKRSVMNHIAA LCESGLVKKVTRKGEKGNSSNIYLLHLDGAGDSLGGSANNSLSGAANSPGSAGVAPGGSA GDSPRTSHSFEPVKEPVNEPIAVGASVDESVRVRSNRPEYSPEFEQAWLAYPKRAGGNSK SAAFKAWKARLNEGVNPETMLEGVKRYAGWVSAMGNSGTQFVKQAVTFFGPDRHFEEFWE VPAVSAARREDPYFKASYDNVDYSQIPAGFRG >gi|296493439|gb|ADTK01000062.1| GENE 26 21472 - 21960 162 162 aa, chain + ## HITS:1 COG:no KEGG:JW5385 NR:ns ## KEGG: JW5385 # Name: yfdN # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 162 3 164 164 296 98.0 2e-79 MSLLNEVQKFIEAHPGCTSGDIADAFAGYSRQRVLQSASKLRQSGRLAHRCEGDTHRHFP RLTERAQDPEPQPVRETRPVRNFYVGTNDPRVILCLTRQAEELESRGLYRRAATVWMAAF RESHSQPERNNFLARRERCLRKSSKRAASGEEWYLSGNYVGA >gi|296493439|gb|ADTK01000062.1| GENE 27 21960 - 22613 615 217 aa, chain + ## HITS:1 COG:no KEGG:ECS88_0548 NR:ns ## KEGG: ECS88_0548 # Name: not_defined # Def: putative phage AdoMet-dependent methyltransferase # Organism: E.coli_S88 # Pathway: not_defined # 1 217 12 228 228 447 100.0 1e-124 MSNKYCQALVELRNKPAHELKEVGDQWRTPDNIFWGINTLFGPFVLDLFTDGDNAKCAAY YTAEDNALAHDWSERLAELKGAAFGNPPYSRASQHEGQYITGMRYIMKHASAMRDKGGRY VFLIKAATSEVWWPEDADHIAFIRGRIGFELPAWFIPKDEKQVPTGAFFAGAIAVFDKTW KGPAISYIGRDELEACGEAFLAQVRQQAEKLVREMAA >gi|296493439|gb|ADTK01000062.1| GENE 28 22610 - 22936 273 108 aa, chain + ## HITS:1 COG:HI0749 KEGG:ns NR:ns ## COG: HI0749 COG1974 # Protein_GI_number: 16272690 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Haemophilus influenzae # 1 90 3 92 209 70 41.0 6e-13 MTTLTQCQQQVLDMLISYQKERGFPPTNQEVATMLGYRSVNAAVEHLRALEKKGVITIKR GVARGITLHTAVKDDDSEAVGIIRALLAGEENARLRATHWLHERDLKV >gi|296493439|gb|ADTK01000062.1| GENE 29 22933 - 23322 331 129 aa, chain + ## HITS:1 COG:ECs2751 KEGG:ns NR:ns ## COG: ECs2751 COG4570 # Protein_GI_number: 15832005 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvase # Organism: Escherichia coli O157:H7 # 1 122 3 117 119 
87 40.0 8e-18 MKLILPFPPSVNTYWRHPNKGAFAGKSLISSAGRKFQSAACAAIVEQLRRLPKPTSAPAS VEIVLFPPDNRIRDLDNYNKALFDALTHAGVWEDDSQVKRMLVEWGPVIPEGKVEITISK YEKPAGAAA >gi|296493439|gb|ADTK01000062.1| GENE 30 23342 - 24139 728 265 aa, chain + ## HITS:1 COG:no KEGG:ECS88_5013 NR:ns ## KEGG: ECS88_5013 # Name: not_defined # Def: conserved hypothetical protein from phage origin # Organism: E.coli_S88 # Pathway: not_defined # 1 265 1 265 265 549 99.0 1e-155 MNNLMVIDGIEVRRDAYGRYSLNDLHRAAGSLDKHKPAFWLRNEQTERLISELQICNSVN IEPVNVIRGGNNQGTYVCKELVYAYAMWISPSFHLKVIRTFDMVTSAPEKLSGQAADKMQ AGVILLDFMRRELNLSNSSVLGACQKLQEAVGLPNLAPRYAIDAPADAHDGSSRPTLSLS ALLKQYGIRLTANQAYHQMVKLGIVEQRERYSRTAINNIKKFWSLTAKGCMFGKNITSPA NPRETQPHFFESRFPELLKLLDTVH >gi|296493439|gb|ADTK01000062.1| GENE 31 24219 - 25136 331 305 aa, chain + ## HITS:1 COG:no KEGG:EFER_0593 NR:ns ## KEGG: EFER_0593 # Name: not_defined # Def: hypothetical protein # Organism: E.fergusonii # Pathway: not_defined # 1 305 27 331 331 597 94.0 1e-169 MPLFMQGRVLLEPEPERYSSFASGAVPAASQPLADDPAVRAVFRNEAVIRRAGGVECLES WLLREKGCQWPHSDWHSENMTTMRHAPGAIRLCWHCDNQLRDQFTERLESMATDNCARWV LSVVRRDLGFDDSHVVTMPELCWWLIRNDLADALPESAARKALRLPKPVVPSVTRESDLV PSVPATSIIQDKAKKVLVLKVDPESPESFMLRPKRRRWVNEKFTRWVKTQPCACCGKPAD DPHHLIGHGQGGMGTKSHDIFTLPLCREHHNELHADPLEFEKKYGSQVELIFHFLDHAFA TGVLG >gi|296493439|gb|ADTK01000062.1| GENE 32 25149 - 25523 204 124 aa, chain + ## HITS:1 COG:ECs1523 KEGG:ns NR:ns ## COG: ECs1523 COG4570 # Protein_GI_number: 15830777 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvase # Organism: Escherichia coli O157:H7 # 1 124 1 124 124 207 87.0 3e-54 MLIDLVLPYPPTVNTYWRRRGSTYFISEEGKRYRRAVALIVRQQRLKLSLSGRLAIKVIA EPPDKRRRDLDNILKAPLDALTHAGVLMDDEQFDEINIVRGQSVSGGRMGVKIYPIMHEE QVKK >gi|296493439|gb|ADTK01000062.1| GENE 33 25520 - 26341 321 273 aa, chain + ## HITS:1 COG:no KEGG:ECS88_1348 NR:ns ## KEGG: ECS88_1348 # Name: not_defined # Def: putative antitermination protein Q of prophage # Organism: E.coli_S88 # Pathway: not_defined # 1 273 1 
273 273 523 98.0 1e-147 MKLEDLPKYYSPKSPGLTDASASTSKDALSITDVMAAQGMTQNRAEMGFSAFLGKMGISM NDRERATELLTEYALSRCDRVAALRKLPAEIKPAVMRIMASYAFEDYARSAASKKQCSCC HGKKFIESEVFTNKIQYPDGKPPVWAKCTKGVYPSYWEEWKKVREVVKVACPECGGKGEV STACKDCRGRGVAIHREESVKRGMPVIRDCQRCGGRGCERLPSTEAFNAICKVTSAITLD TWKKSVKRFYDTLVVRFDIEEAWAERQLKRVTR >gi|296493439|gb|ADTK01000062.1| GENE 34 26568 - 26765 137 65 aa, chain + ## HITS:1 COG:no KEGG:ECO26_1481 NR:ns ## KEGG: ECO26_1481 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 65 42 106 106 123 100.0 2e-27 MLKQQDMTETARVVFNELSVTEPATVGEIAQNTYLSRERCQLILTQLVMAGLADYQFGCY RRLPQ >gi|296493439|gb|ADTK01000062.1| GENE 35 26916 - 27965 656 349 aa, chain + ## HITS:1 COG:ECs1780 KEGG:ns NR:ns ## COG: ECs1780 COG0863 # Protein_GI_number: 15831034 # Func_class: L Replication, recombination and repair # Function: DNA modification methylase # Organism: Escherichia coli O157:H7 # 1 349 1 349 352 687 93.0 0 MLNTVKISSCELINADCLEFIRSLPENSVDLIVTDPPYFKVKPEGWDNQWKGDDDYLKWL DQCLAQFWRVLKPAGSLYLFCGHRRASDIEIMMRERFSVLNHIIWAKPSGRWNGCNKESL RAYFPATERILFAEHYQGPYRPKDAGYAAKGSALKQHVMAPLISYFRDARAALGITAKQI ADATGKKNMVSHWFSASQWQLPNESDYLKLQSLFARVAEEKHQRGELEKSHYQLVSTYSE LNRQYTELLSEYKNLRRYFGVTVQVPYTDVWTYKPVQYYPGKHPCEKPAEMLQQIISASS RPGDLVADFFMGSGSTVKAAMALGRRAIGVELETGRFEQTVREVQDLIV >gi|296493439|gb|ADTK01000062.1| GENE 36 29124 - 29309 114 61 aa, chain - ## HITS:1 COG:no KEGG:E2348C_1245 NR:ns ## KEGG: E2348C_1245 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_0127 # Pathway: not_defined # 1 61 51 111 111 115 100.0 6e-25 MLNNKVLPVTIYNVVPFNKTFWNLIKNSQECPTNTDNVLNECFNNRCTLQICPYGLKQQS P >gi|296493439|gb|ADTK01000062.1| GENE 37 29720 - 29908 253 62 aa, chain + ## HITS:1 COG:Z2065 KEGG:ns NR:ns ## COG: Z2065 COG2801 # Protein_GI_number: 15801507 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 EDL933 # 1 62 1 62 316 117 91.0 4e-27 
MAFKHYDVVRAAPPSDLAEKLTHKLKEGWQPFGSPVAITPYTLMQAIAAEGDVVVSGATE PE >gi|296493439|gb|ADTK01000062.1| GENE 38 30216 - 30431 303 71 aa, chain + ## HITS:1 COG:no KEGG:ECSP_1747 NR:ns ## KEGG: ECSP_1747 # Name: not_defined # Def: lysis protein # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 71 26 96 96 139 98.0 3e-32 MDQMEKITTGVSYTTSAVGTGYWFLQLLDRVSPSQWAAIGVLGSLLFGLLTYLTNLYFKI REDRRKAARGE >gi|296493439|gb|ADTK01000062.1| GENE 39 30436 - 30786 381 116 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_1182 NR:ns ## KEGG: EcSMS35_1182 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 116 1 116 116 227 99.0 1e-58 MTQNYELIVKGIRNFENKVTVTLALRDKKRFDGEIFDLDISLDRVEGAALEFYEAAARMS IRQVFLDVAAGLCEGDEQSPEKRPVILEAQNVWITYKGKLPGRITGSLKTPPESQP >gi|296493439|gb|ADTK01000062.1| GENE 40 30850 - 31383 276 177 aa, chain + ## HITS:1 COG:ydfQ KEGG:ns NR:ns ## COG: ydfQ COG3772 # Protein_GI_number: 16129513 # Func_class: R General function prediction only # Function: Phage-related lysozyme (muraminidase) # Organism: Escherichia coli K12 # 1 177 1 177 177 345 93.0 4e-95 MNAKIRYGLSAAVLALIAVGAPAPDILDQFLDEKEGNHTTAYRDGSGIWTICRGATMVDG KPVFPGMKLSKEKCDQVNAIERDKALAWVERNIKVPLTEPQKAGIASFCPYNIGPGKCFP STFYKRLNAGDRKGACEAIRWWIKDGGRDCRIRSNNCYGQVIRRDQESALACWGIDQ Prediction of potential genes in microbial genomes Time: Mon May 16 15:10:37 2011 Seq name: gi|296493438|gb|ADTK01000063.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont187.1, whole genome shotgun sequence Length of sequence - 5252 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/1.000 - CDS 44 - 901 658 ## COG0796 Glutamate racemase 2 1 Op 2 . - CDS 846 - 2690 1695 ## COG4206 Outer membrane cobalamin receptor protein - Prom 2926 - 2985 3.3 + Prom 2975 - 3034 6.0 3 2 Tu 1 . 
+ CDS 3059 - 4159 1179 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase - Term 4150 - 4191 9.5 4 3 Op 1 . - CDS 4199 - 4558 382 ## LF82_3412 inner membrane protein YijD 5 3 Op 2 . - CDS 4558 - 5160 500 ## COG1309 Transcriptional regulator Predicted protein(s) >gi|296493438|gb|ADTK01000063.1| GENE 1 44 - 901 658 285 aa, chain - ## HITS:1 COG:ECs4898 KEGG:ns NR:ns ## COG: ECs4898 COG0796 # Protein_GI_number: 15834152 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glutamate racemase # Organism: Escherichia coli O157:H7 # 1 285 5 289 289 529 99.0 1e-150 MATKLQDGNTPCLAATPSEPRPTVLVFDSGVGGLSVYDEIRHLLPDLHYIYAFDNVAFPY GEKSEEFIVERVVAIVTAVQERYPLALAVVACNTASTVSLPALREKFDFPVVGVVPAIKP AARLTANGIVGLLATRGTVKRSYTHELIARFANECQIEMLGSAEMVELAEAKLHGEDVSL DALKRILRPWLRMKEPPDTVVLGCTHFPLLQEELLQVLPEGTRLVDSGAAIARRTAWLLE HEAPDAKSADANIAFCMAMTPEAEQLLPVLQRYGFETLEKLAVLG >gi|296493438|gb|ADTK01000063.1| GENE 2 846 - 2690 1695 614 aa, chain - ## HITS:1 COG:btuB KEGG:ns NR:ns ## COG: btuB COG4206 # Protein_GI_number: 16131804 # Func_class: H Coenzyme transport and metabolism # Function: Outer membrane cobalamin receptor protein # Organism: Escherichia coli K12 # 1 614 1 614 614 1158 99.0 0 MIKKASLLTACSVTAFSAWAQDTSPDTLVVTANRFEQPRSTVLAPTTVVTRQDIDRWQST SVNDVLRRLPGVDITQNGGSGQLSSIFIRGTNASHVLVLIDGVRLNLAGVCGSADLSQFP IALVQRVEYIRGPRSAVYGSDAIGGVVNIITTRDEPGTEISAGWGSNSYQNYDVSTQQQL GDKTRVTLLGDYAHTHGYDVVAYGNTGTQAQTDNDGFLSKTLYGALEHNFTDAWSGFVRG YGYDNRTNYDAYYSPGSPLLDTRKLYSQSWDAGLRYNGELIKSQLITSYSHSKDYNYDPH FGRYDSSATLDEMKQYTVQWANNVIVGHGSIGAGVDWQKQTTTPGTGYVENGYDQRNTGI YLTGLQQVGDFTFEGAARSDDNSQFGRHGTWQTSAGWEFIEGYRFIASYGTSYKAPNLGQ LYGFYGNPNLDPEKSKQWEGAFEGLTAGVNWRISGYRNDVSDLIDYDDHTLKYYNEGKAR IKGVEATANFDTGPLTHTVSYDYVDARNAITDTPLLRRAKQQVKYQLDWQLYDFDWGITY QYLGTRYDKDYSSYPYQTVKMGGVSLWDLAVAYPVTSHLTVRGKIANLFDKDYETVYGYQ TAGREYTLSGSYTF >gi|296493438|gb|ADTK01000063.1| GENE 3 3059 - 4159 1179 366 aa, chain + ## HITS:1 COG:trmA KEGG:ns NR:ns ## COG: trmA COG2265 # 
Protein_GI_number: 16131803 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Escherichia coli K12 # 1 366 1 366 366 738 100.0 0 MTPEHLPTEQYEAQLAEKVVRLQSMMAPFSDLVPEVFRSPVSHYRMRAEFRIWHDGDDLY HIIFDQQTKSRIRVDSFPAASELINQLMTAMIAGVRNNPVLRHKLFQIDYLTTLSNQAVV SLLYHKKLDDEWRQEAEALRDALRAQNLNVHLIGRATKTKIELDQDYIDERLPVAGKEMI YRQVENSFTQPNAAMNIQMLEWALDVTKGSKGDLLELYCGNGNFSLALARNFDRVLATEI AKPSVAAAQYNIAANHIDNVQIIRMAAEEFTQAMNGVREFNRLQGIDLKSYQCETIFVDP PRSGLDSETEKMVQAYPRILYISCNPETLCKNLETLSQTHKVERLALFDQFPYTHHMECG VLLTAK >gi|296493438|gb|ADTK01000063.1| GENE 4 4199 - 4558 382 119 aa, chain - ## HITS:1 COG:no KEGG:LF82_3412 NR:ns ## KEGG: LF82_3412 # Name: yijD # Def: inner membrane protein YijD # Organism: E.coli_LF82 # Pathway: not_defined # 1 119 1 119 119 222 100.0 4e-57 MKQANQDRGTLLLALVAGLSINGTFAALFSSIVPFSVFPIISLVLTVYCLHQRYLNRTMP VGLPGLAAACFILGVLLYSTVVRAEYPDIGSNFFPAVLSVIMVFWIGAKMRNRKQEVAE >gi|296493438|gb|ADTK01000063.1| GENE 5 4558 - 5160 500 200 aa, chain - ## HITS:1 COG:ECs4894 KEGG:ns NR:ns ## COG: ECs4894 COG1309 # Protein_GI_number: 15834148 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 200 35 234 234 346 99.0 2e-95 MEAAFSQLSAERSFASLSLREVAREAGIAPTSFYRHFRDVDELGLTMVDESGLMLRQLMR QARQRIAKGGSVIRTSVSTFMEFIGNNPNAFRLLLRERSGTSAAFRAAVAREIQHFIAEL ADYLELENHMPRAFTEAQAEAMVTIVFSAGAEALDVGVEQRRQLEERLVLQLRMISKGAY YWYRREQEKTAIIPGNVKDE Prediction of potential genes in microbial genomes Time: Mon May 16 15:10:49 2011 Seq name: gi|296493437|gb|ADTK01000064.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont187.2, whole genome shotgun sequence Length of sequence - 27957 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 11, operones - 4 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 304 - 363 4.2 1 1 Tu 1 . 
+ CDS 395 - 1795 431 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 - Term 1505 - 1530 -0.8 2 2 Op 1 9/0.000 - CDS 1778 - 2695 842 ## COG0583 Transcriptional regulator - Prom 2729 - 2788 5.1 3 2 Op 2 7/0.200 - CDS 2947 - 4230 1123 ## COG0477 Permeases of the major facilitator superfamily 4 2 Op 3 . - CDS 4297 - 5511 1115 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily - Prom 5759 - 5818 10.5 5 3 Op 1 2/0.800 - CDS 6009 - 7382 1902 ## COG0165 Argininosuccinate lyase 6 3 Op 2 8/0.000 - CDS 7443 - 8216 1026 ## COG0548 Acetylglutamate kinase 7 3 Op 3 . - CDS 8227 - 9231 881 ## COG0002 Acetylglutamate semialdehyde dehydrogenase - Prom 9343 - 9402 6.1 + Prom 9277 - 9336 3.3 8 4 Tu 1 . + CDS 9385 - 10536 964 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases + Term 10544 - 10586 3.5 + Prom 10621 - 10680 5.3 9 5 Tu 1 . + CDS 10927 - 13539 3026 ## COG2352 Phosphoenolpyruvate carboxylase + Term 13557 - 13590 5.2 + Prom 13637 - 13696 4.3 10 6 Tu 1 . + CDS 13722 - 15455 2004 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase + Term 15460 - 15518 4.3 11 7 Tu 1 . + CDS 15604 - 16455 739 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 16534 - 16568 -1.0 12 8 Op 1 2/0.800 - CDS 16442 - 16783 325 ## COG1445 Phosphotransferase system fructose-specific component IIB 13 8 Op 2 11/0.000 - CDS 16785 - 17663 581 ## COG1180 Pyruvate-formate lyase-activating enzyme 14 8 Op 3 4/0.400 - CDS 17629 - 19926 2177 ## COG1882 Pyruvate-formate lyase 15 8 Op 4 7/0.200 - CDS 19977 - 20297 543 ## COG1445 Phosphotransferase system fructose-specific component IIB 16 8 Op 5 . 
- CDS 20312 - 21391 1307 ## COG1299 Phosphotransferase system, fructose-specific IIC component - Prom 21556 - 21615 3.3 + Prom 21456 - 21515 4.2 17 9 Op 1 2/0.800 + CDS 21700 - 24201 2632 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) 18 9 Op 2 3/0.400 + CDS 24213 - 24875 773 ## COG0176 Transaldolase 19 9 Op 3 1/1.000 + CDS 24886 - 25989 1242 ## COG0371 Glycerol dehydrogenase and related enzymes + Term 26164 - 26207 3.1 + Prom 26030 - 26089 3.5 20 10 Tu 1 . + CDS 26264 - 26881 526 ## COG3738 Uncharacterized protein conserved in bacteria 21 11 Tu 1 . - CDS 26908 - 27813 910 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily Predicted protein(s) >gi|296493437|gb|ADTK01000064.1| GENE 1 395 - 1795 431 466 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 7 453 4 443 458 170 26 8e-42 MPHSYDYDAIVIGSGPGGEGAAMGLVKQGARVAVIERYQNVGGGCTHWGTIPSKALRHAV SRIIEFNQNPLYSDHSRLLRSSFADILNHADNVINQQTRMRQGFYERNHCEILQGNARFV DEHTLALDCPDGSVETLTAEKFVIACGSRPYHPTDVDFTHPRIYDSDSILSMHHEPRHVL IYGAGVIGCEYASIFRGMDVKVDLINTRDRLLAFLDQEMSDSLSYHFWNSGVVIRHNEEY EKIEGCDDGVIMHLKSGKKLKADCLLYANGRTGNTDSLALQNIGLETDSRGQLKVNSMYQ TAQPHVYAVGDVIGYPSLASAAYDQGRIAAQALVKGEATAHLIEDIPTGIYTIPEISSVG KTEQQLTAMKVPYEVGRAQFKHLARAQIVGMNVGTLKILFHRETKEILGIHCFGERAAEI IHIGQAIMEQKGGGNTIEYFVNTTFNYPTMAEAYRVAALNGLNRLF >gi|296493437|gb|ADTK01000064.1| GENE 2 1778 - 2695 842 305 aa, chain - ## HITS:1 COG:ECs4890 KEGG:ns NR:ns ## COG: ECs4890 COG0583 # Protein_GI_number: 15834144 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 305 1 305 305 596 100.0 1e-170 MNIRDLEYLVALAEHRHFRRAADSCHVSQPTLSGQIRKLEDELGVMLLERTSRKVLFTQA GMLLVDQARTVLREVKVLKEMASQQGETMSGPLHIGLIPTVGPYLLPHIIPMLHQTFPKL EMYLHEAQTHQLLAQLDSGKLDCVILALVKESEAFIEVPLFDEPMLLAIYEDHPWANREC VPMADLAGEKLLMLEDGHCLRDQAMGFCFEAGADEDTHFRATSLETLRNMVAAGSGITLL 
PALAVPPERKRDGVVYLPCIKPEPRRTIGLVYRPGSPLRSRYEQLAEAIRARMDGHFDKV LKQAV >gi|296493437|gb|ADTK01000064.1| GENE 3 2947 - 4230 1123 427 aa, chain - ## HITS:1 COG:ECs5316 KEGG:ns NR:ns ## COG: ECs5316 COG0477 # Protein_GI_number: 15834570 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 2 426 21 447 453 458 56.0 1e-129 MDVDVSTSVAGNKPQRIRRIQTVTLVLLFMAGIVNFLDRSSLSVAGEAIRGELGLSATEF GVLLSAFSLSYGFSQLPSGILLDRFGPRIVLGAGLIFWSLMQALTGMVNSFSHFILMRIG LGIGEAPFMPAGVKSITDWYAQKERGTALGIFNSSTVIGQAIAPPALVLMQLAWGWRTMF VIIGVAGILVGICWYAWYRNRAQFVLTDEERTYLSAPVKPRPQLQFSEWLALFKHRTTWG MILGFSGVNYTGWLYIAWLPGYLQAEQGFSLAKTGWVAAIPFLAAAVGMWVNGIVVDRLA KKGYDLAKTRKTAIVCGLMMSALGTLLVVQSSSPAQAVAFISMALFCVHFAGTSAWGLVQ VMVSETKVASIAGIQNFGSFVFASFAPIVTGWVVDTTHSFNLALVIAACVTFTGALCYFF IVKDRIE >gi|296493437|gb|ADTK01000064.1| GENE 4 4297 - 5511 1115 404 aa, chain - ## HITS:1 COG:rspA KEGG:ns NR:ns ## COG: rspA COG4948 # Protein_GI_number: 16129539 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Escherichia coli K12 # 1 404 1 404 404 813 91.0 0 MKIIAADVFVTCPGRNFVTLKITTESGLCGLGDATLNGRELSVASYLKDHLCPQLIGRDA SRIEDIWQFFYKGAYWRRGPVTMSAISAIDMALWDIKAKAANMPLYQLLGGASREGVMVY CHTTGRTIDEVLEDYAKHQQMGFKAIRVQCGVPGMQTTYGLAKGKGLAYEPATKGLWPEE QLWSSEKYLDFTPKLFEAVRNKFGFNEHLLHDMHHRLTPIEAARFGKSIEDYRLFWMEDP TPAENQECFRLIRQHTVTPIAVGEVFNSIWDCKQLIEEQLIDYIRTTITHAGGITGMRRI ADFASLYQVRTGSHGPSDLSPVCHAAALHFDLWVPNFGVQEYMGYSEQMLEVFPHSWRFE EGYMHPGDEPGLGISFDEKLAAKYPYDPAYLPVARLEDGTLWNW >gi|296493437|gb|ADTK01000064.1| GENE 5 6009 - 7382 1902 457 aa, chain - ## HITS:1 COG:argH KEGG:ns NR:ns ## COG: argH COG0165 # Protein_GI_number: 16131798 # Func_class: E Amino acid transport and metabolism # Function: 
Argininosuccinate lyase # Organism: Escherichia coli K12 # 1 456 1 456 457 881 99.0 0 MALWGGRFTQAADQRFKQFNDSLRFDYRLAEQDIVGSVAWSKALVTVGVLTAEEQAQLEE ALNVLLEDVRARPQQILESDAEDIHSWVEGKLIDKVGQLGKKLHTGRSRNDQVATDLKLW CKDTVSELLTANRQLQSALVETAQNNQDAVMPGYTHLQRAQPVTFAHWCLAYVEMLARDE SRLQDALKRLDVSPLGCGALAGTAYEIDREQLAGWLGFASATRNSLDSVSDRDHVLELLS AAAIGMVHLSRFAEDLIFFNTGEAGFVELSDRVTSGSSLMPQKKNPDALELIRGKCGRVQ GALTGMMMTLKGLPLAYNKDMQEDKEGLFDALDTWLDCLHMAALVLDGIQVKRPRCQEAA QQGYANATELADYLVAKGVPFREAHHIVGEAVVEAIRQGKPLEDLPLDELQKFSPVIDED VYPILSLQSCLDKRAAKGGVSPQQVAQAIAFAQARLE >gi|296493437|gb|ADTK01000064.1| GENE 6 7443 - 8216 1026 257 aa, chain - ## HITS:1 COG:ECs4888 KEGG:ns NR:ns ## COG: ECs4888 COG0548 # Protein_GI_number: 15834142 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate kinase # Organism: Escherichia coli O157:H7 # 1 257 2 258 258 453 100.0 1e-127 MNPLIIKLGGVLLDSEEALERLFSALVNYRESHQRPLVIVHGGGCVVDELMKGLNLPVKK KNGLRVTPADQIDIITGALAGTANKTLLAWAKKHQIAAVGLFLGDGDSVKVTQLDEELGH VGLAQPGSPKLINSLLENGYLPVVSSIGVTDEGQLMNVNADQAATALAATLGADLILLSD VSGILDGKGQRIAEMTAAKAEQLIEQGIITDGMIVKVNAALDAARTLGRPVDIASWRHAE QLPALFNGMPMGTRILA >gi|296493437|gb|ADTK01000064.1| GENE 7 8227 - 9231 881 334 aa, chain - ## HITS:1 COG:argC KEGG:ns NR:ns ## COG: argC COG0002 # Protein_GI_number: 16131796 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate semialdehyde dehydrogenase # Organism: Escherichia coli K12 # 1 334 1 334 334 656 100.0 0 MLNTLIVGASGYAGAELVTYVNRHPHMNITALTVSAQSNDAGKLISDLHPQLKGIVDLPL QPMSDISEFSPGVDVVFLATAHEVSHDLAPQFLEAGCVVFDLSGAFRVNDATFYEKYYGF THQYPELLEQAAYGLAEWCGNKLKEANLIAVPGCYPTAAQLALKPLIDADLLDLNQWPVI NATSGVSGAGRKAAISNSFCEVSLQPYGVFTHRHQPEIATHLGADVIFTPHLGNFPRGIL ETITCRLKSGVTQAQVAQVLQQAYAHKPLVRLYDKGVPALKNVVGLPFCDIGFAVQGEHL IIVATEDNLLKGAAAQAVQCANIRFGYAETQSLI >gi|296493437|gb|ADTK01000064.1| GENE 8 9385 - 10536 964 383 aa, chain + ## HITS:1 COG:argE KEGG:ns NR:ns ## COG: argE COG0624 # Protein_GI_number: 16131795 # Func_class: E Amino acid transport and 
metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Escherichia coli K12 # 1 383 1 383 383 798 99.0 0 MKNKLPPFIEIYRALIATPSISATEEALDQSNADLITLLADWFKDLGFNVEVQPVPGTRN KFNMLASCGQGAGGLLLAGHTDTVPFDDGRWTRDPFTLTEHDGKLYGLGTADMKGFFAFI LDALRDVDVTKLKKPLYILATADEETSMAGARYFAETTALRPDCAIIGEPTSLQPVRAHK GHISNAIRIQGQSGHSSDPARGVNAIELMHDAIGHILQLRDNLKERYHYEAFTVPYPTLN LGHIHGGDASNRICACCELHMDIRPLPGMTLNELNGLLNDALAPVSERWPGRLTVDELHP PIPGYECPPNHQLVEVVEKLLGAKTEVVNYCTEAPFIQTLCPTLVLGPGSINQAHQPDEY LETRFIKPTRELITQVIHHFCWH >gi|296493437|gb|ADTK01000064.1| GENE 9 10927 - 13539 3026 870 aa, chain + ## HITS:1 COG:ppc KEGG:ns NR:ns ## COG: ppc COG2352 # Protein_GI_number: 16131794 # Func_class: C Energy production and conversion # Function: Phosphoenolpyruvate carboxylase # Organism: Escherichia coli K12 # 1 870 14 883 883 1699 99.0 0 MLGKVLGETIKDALGEHILERVETIRKLSKSSRAGNDANRQELLTTLQNLSNDELLPVAR AFSQFLNLANTAEQYHSISPKGEAASNPEVIARTLRKLKNQPELSEDTIKKAVESLSLEL VLTAHPTEITRRTLIHKMVEVNACLKQLDNKDIADYEHNQLMRRLRQLIAQSWHTDEIRK LRPSPVDEAKWGFAVVENSLWQGVPNYLRELNEQLEENLGYKLPVEFVPVRFTSWMGGDR DGNPNVTADITRHVLLLSRWKATDLFLKDIQVLVSELSMVEATPELLALVGEEGAAEPYR YLMKNLRSRLMATQAWLEARLKGEELPKPEGLLTQNEELWEPLYACYQSLQACGMGIIAN GDLLDTLRRVKCFGVPLVRIDIRQESTRHTEALGELTRYLGIGDYESWSEADKQAFLIRE LNSKRPLLPRNWQPSAETREVLDTCQVIAEAPQGSIAAYVISMAKTPSDVLAVHLLLKEA GIGFAMPVAPLFETLDDLNNANDVMTQLLNIDWYRGLIQGKQMVMIGYSDSAKDAGVMAA SWAQYQAQDALIKTCEKAGIELTLFHGRGGSIGRGGAPAHAALLSQPPGSLKGGLRVTEQ GEMIRFKYGLPEITVSSLSLYTGAILEANLLPPPEPKESWRRIMDELSVISCDLYRGYVR ENKDFVPYFRSATPEQELGKLPLGSRPAKRRPTGGVESLRAIPWIFAWTQNRLMLPAWLG AGTALQKVVEDGKQSELEAMCRDWPFFSTRLGMLEMVFAKADLWLAEYYDQRLVDKALWP LGKELRNLQEEDIKVVLAIANDSHLMADLPWIAESIQLRNIYTDPLNVLQAELLHRSRQA EKEGQEPDPRVEQALMVTIAGIAAGMRNTG >gi|296493437|gb|ADTK01000064.1| GENE 10 13722 - 15455 2004 577 aa, chain + ## HITS:1 COG:yijP KEGG:ns NR:ns ## COG: yijP COG2194 # Protein_GI_number: 16131793 # Func_class: R General function prediction only # Function: Predicted 
membrane-associated, metal-dependent hydrolase # Organism: Escherichia coli K12 # 1 577 1 577 577 1145 99.0 0 MHSTEVQAKPLFSWKALGWALLYFWFFSTLLQAIIYISGYSGTNGIRDSLLFSSLWLIPV FLFPKRIKIIAAVIGVVLWAASLAALCYYVIYGQEFSQSVLFVMFETNTNEASEYLSQYF SLKIVLIALAYTAVAVLLWTRLRPVYIPKPWRYVVSFALLYGLILHPIAMNTFIKNKPFE KTLDNLASRMEPAAPWQFLTGYYQYRQQLNSLTKLLNENNALPPLANFKDESGNEPRTLV LVIGESTQRGRMSLYGYPRETTPELDALHKTDPNLTVFNNVVTSRPYTIEILQQALTFAN EKNPDLYLTQPSLMNMMKQAGYKTFWITNQQTMTARNTMLTVFSKQTDKQFYMNQQRTQS AREYDTNVLKPFQEVLKDPAPKKLIIVHLLGTHIKYKYRYPENQGKFDGNTDHVPPGLNA EELESYNDYDNANLYNDHVVASLIKDFKAANPNGFLVYFSDHGEEVYDTPPHKTQGRNED NPTRHMYTIPFLLWTSEKWQATHPRDFSQDVDRKYSLAELIHTWSDLAGLSYDGYDPTRS VVNPQFKETTRWIGNPYKKNALIDYDTLPYGDQVGNQ >gi|296493437|gb|ADTK01000064.1| GENE 11 15604 - 16455 739 283 aa, chain + ## HITS:1 COG:yijO KEGG:ns NR:ns ## COG: yijO COG2207 # Protein_GI_number: 16131792 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 283 1 283 283 551 100.0 1e-157 MYHDVSYLLSRLINGPLSLRQIYFASSNGPVPDLAYQVDFPRLEIVLEGEFVDTGAGATL VPGDVLYVPAGGWNFPQWQAPATTFSVLFGKQQLGFSVVQWDGKQYQNLAKQHVARRGPR IGSFLLQTLNEMQMQPQEQQTARLIVASLLSHCRDLLGSQIQTASRSQALFEAIRDYIDE RYASALTRESVAQAFYISPNYLSHLFQKTGAIGFNEYLNHTRLEHAKTLLKGYDLKVKEV AHACGFVDSNYFCRLFRKNTERSPSEYRRQYHSQLTEKPTTPE >gi|296493437|gb|ADTK01000064.1| GENE 12 16442 - 16783 325 113 aa, chain - ## HITS:1 COG:frwD KEGG:ns NR:ns ## COG: frwD COG1445 # Protein_GI_number: 16131791 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system fructose-specific component IIB # Organism: Escherichia coli K12 # 1 113 1 113 113 206 99.0 1e-53 MAYLVAVTACVSGVAHTYMAAERLEKLCQLEKWGVSIETQGALGTENRLADEDIRRADVA LLITDIELAGAERFEHCRYVQCSIYAFLREPQRVMSAVRKVLSAPQQTHLILE >gi|296493437|gb|ADTK01000064.1| GENE 13 16785 - 17663 581 292 aa, chain - ## HITS:1 COG:ECs4881 KEGG:ns NR:ns ## COG: ECs4881 COG1180 # Protein_GI_number: 15834135 # Func_class: O Posttranslational modification, protein turnover, 
chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Escherichia coli O157:H7 # 1 292 1 292 292 602 100.0 1e-172 MTSSAGQRISCNVVETRRNDVARIFNIQRYSLNDGEGIRTVVFFKGCPHLCPWCANPESI SGKIQTVRREAKCLHCAKCLRDADECPSGAFERIGRDISLDALEREVMKDDIFFRTSGGG VTLSGGEVLMQAEFATRFLQRLRLWGVSCAIETAGDAPASKLLPLAKLCDEVLFDLKIMD ATQARDVVKMNLPRVLENLRLLVSEGVNVIPRLPLIPGFTLSRENMQQALDVLIPLKIRQ IHLLPFHQYGEPKYRLLGKTWSMKEVPAPSSADVTTMREMAERAGFQVTVGG >gi|296493437|gb|ADTK01000064.1| GENE 14 17629 - 19926 2177 765 aa, chain - ## HITS:1 COG:ECs4880 KEGG:ns NR:ns ## COG: ECs4880 COG1882 # Protein_GI_number: 15834134 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli O157:H7 # 1 765 1 765 765 1545 99.0 0 MTNRISRLKTALFANTREISLERALLYTASHRQTEGEPVILRRAKATAYILEHVEISIRD EELIAGNRTVKPRAGIMSPEMDPYWLLKELDQFPTRPQDRFAISEEDKRIYREELFPYWE KRSMKDFINGQMTDEVKAATSTQIFSINQTDKGQGHIIIDYPRLLNHGLGELVAQMQQHC QQQPENHFYQAALLLLEASQKHILRYAELAETMAANCTDAQRREELLTIAEISRHNAEHK PQTFWQACQLFWYMNIILQYESNASSLSLGRFDQYMLPFYQASLTQGEDPAFLKELLESL WVKCNDIVLLRSTSSARYFAGFPTGYTALLGGLTENGRSAVNVLSFLCLDAYQSVQLPQP NLGVRTNALIDTPFLMKTAETIRLGTGIPQIFNDEVVVPAFLNRGVSLEDARDYSVVGCV ELSIPGRTYGLHDIAMFNLLKVMEICLHENEGNAALTYEGLLEQIRAKISHYITLMVEGS NICDIGHRDWAPVPLLSSFISDCLEKGRDITDGGARYNFSGVQGIGIANLSDSLHALKGM VFEQQRLSFDELLSVLKANFATPEGEKVRARLINRFEKYGNDIDEVDNISAELLRHYCKE VEKYQNPRGGYFTPGSYTVSAHVPLGSVVGATPDGRFAGEQLADGGLSPMLGQDAQGPTA VLKSVSKLDNTLLSNGTLLNVKFTPATLEGEAGLRKLADFLRAFTQLKLQHIQFNVVNAD TLREAQQRPQDYAGLVVRVAGYSAFFVELSKEIQDDIIRRTAHQL >gi|296493437|gb|ADTK01000064.1| GENE 15 19977 - 20297 543 106 aa, chain - ## HITS:1 COG:STM4113 KEGG:ns NR:ns ## COG: STM4113 COG1445 # Protein_GI_number: 16767378 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system fructose-specific component IIB # Organism: Salmonella typhimurium LT2 # 1 106 1 106 106 162 96.0 2e-40 MTKIIAVTACPSGVAHTYMAAEALESAAKAKGWEVKVETQGSIGLENELTAEDVASADMV ILTKDIGIKFEERFAGKTIVRVNISDAVKRADAIMSKIEAHLAQTA 
>gi|296493437|gb|ADTK01000064.1| GENE 16 20312 - 21391 1307 359 aa, chain - ## HITS:1 COG:frwC KEGG:ns NR:ns ## COG: frwC COG1299 # Protein_GI_number: 16131787 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Escherichia coli K12 # 1 359 1 359 359 565 99.0 1e-161 MNELVQILKNTRQHLMTGVSHMIPFVVSGGILLAVSVMLYGKGAVPDAVADPNLKKLFDI GVAGLTLMVPFLAAYIGYSIAERSALAPCAIGAWVGNSFGAGFFGALIAGIIGGIVVHYL KKIPVHKVLRSVMPIFIIPIVGTLITAGIMMWGLGEPVGALTNSLTQWLQGMQQGSIVML AVIMGLMLAFDMGGPVNKVAYAFMLICVAQGVYTVVAIAAVGICIPPLGMGLATLISRKN FSAEERETGKAALVMGCVGVTEGAIPFAAADPLRVIPSIMVGSVCGAVTAALVGAQCYAG WGGLIVLPVVEGKLGYIAAVAVGAVVTAVCVNVLKSLARKNGSSTDEKEDDLDLDFEIN >gi|296493437|gb|ADTK01000064.1| GENE 17 21700 - 24201 2632 833 aa, chain + ## HITS:1 COG:ECs4877_1 KEGG:ns NR:ns ## COG: ECs4877_1 COG1080 # Protein_GI_number: 15834130 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Escherichia coli O157:H7 # 123 688 1 566 566 1097 99.0 0 MALIVEFICELPNGVHARPASHVETLCNTFSSQIEWHNLRTDRKGNAKSALALIGTDTLA GDNCQLLISGADEQEAHQRLSQWLRDEFPHCDAPLAEVKSDELEPLPVSLTNLNPQIIRA RTVCSGSAGGILTPISSLDLNALGNLPAAKGVDAEQSALENGLTLVLKNIEFRLLDSDGA TSAILEAHRSLAGDTSLREHLLAGVSAGLSCAEAIVTSANHFCEEFARSSSSYLQERALD VRDVCFQLLQQIYGEQRFPAPGKLTQPAICMADELTPSQFLELDKNHLKGLLLKSGGTTS HTVILARSFNIPTLVGVDIDALTPWQHQTIYIDGNAGAIVVEPGEAVARYYQQEARVQDA LREQQRVWLTQQARTADGIRIEIAANIAHSVEAQAAFGNGAEGVGLFRTEMLYMDRTSAP GESELYNIFCQALESANGRSIIVRTMDIGGDKPVDYLNIPAEANPFLGYRAVRIYEEYAS LFTTQLRSILRASAHGSLKIMIPMISSMEEILWVKEKLAEAKQQLRNEHIPFDEKIQLGI MLEVPSVMFIIDQCCEEIDFFSIGSNDLTQYLLAVDRDNAKVTRHYNSLNPAFLRALDYA VQAVHRQGKWIGLCGELGAKGSVLPLLVGLGLDELSMSAPSIPAAKARMAQLDSRECRKL LNQAMACRTSLEVEHLLAQFRMTQQDAPLVTAECITLESDWRSKEEVLKGMTDNLLLAGR CRYPRKLEADLWAREAVFSTGLGFSFAIPHSKSEHIEQSTISVARLQAPVRWGDDEAQFI IMLTLNKHAAGDQHMRIFSRLARRIMHEEFRNALVNAASADAIASLLQHELEL >gi|296493437|gb|ADTK01000064.1| GENE 18 24213 - 24875 
773 220 aa, chain + ## HITS:1 COG:talC KEGG:ns NR:ns ## COG: talC COG0176 # Protein_GI_number: 16131784 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Escherichia coli K12 # 1 220 1 220 220 386 99.0 1e-107 MELYLDTANVAEVERLARIFPIAGVTTNPSIIAASKESIWEVLPRLQKAIGDEGILFAQT MSRDAQGMVEEAKRLRDAIPGIVVKIPVTSEGLAAIKMLKKEGITTLGTAVYSAAQGLLA ALAGAKYVAPYVNRVDAQGGDGIRTVQELQALLEMHAPESMVLAASFKTPRQALDCLLAG CESITLPLDVAQQMLNTPAVESAIEKFEHDWNAAFGTTHL >gi|296493437|gb|ADTK01000064.1| GENE 19 24886 - 25989 1242 367 aa, chain + ## HITS:1 COG:gldA KEGG:ns NR:ns ## COG: gldA COG0371 # Protein_GI_number: 16131783 # Func_class: C Energy production and conversion # Function: Glycerol dehydrogenase and related enzymes # Organism: Escherichia coli K12 # 1 367 14 380 380 702 100.0 0 MDRIIQSPGKYIQGADVINRLGEYLKPLAERWLVVGDKFVLGFAQSTVEKSFKDAGLVVE IAPFGGECSQNEIDRLRGIAETAQCGAILGIGGGKTLDTAKALAHFMGVPVAIAPTIAST DAPCSALSVIYTDEGEFDRYLLLPNNPNMVIVDTKIVAGAPARLLAAGIGDALATWFEAR ACSRSGATTMAGGKCTQAALALAELCYNTLLEEGEKAMLAAEQHVVTPALERVIEANTYL SGVGFESGGLAAAHAVHNGLTAIPDAHHYYHGEKVAFGTLTQLVLENAPVEEIETVAALS HAVGLPITLAQLDIKEDVPAKMRIVAEAACAEGETIHNMPGGATPDQVYAALLVADQYGQ RFLQEWE >gi|296493437|gb|ADTK01000064.1| GENE 20 26264 - 26881 526 205 aa, chain + ## HITS:1 COG:ECs4873 KEGG:ns NR:ns ## COG: ECs4873 COG3738 # Protein_GI_number: 15834127 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 205 1 205 205 410 95.0 1e-115 MKASLALLSLLTAFTSHSLKSPAVPPTVVQIQANTNLAIADGARQQIGSTLFYDPAYMQL TYPGGDVPQERGVCSDVVIRALRSQKVDLQKLVHEDMAKNFAEYPQKWKLKRPDSNIDHR RVPNLETWFSRHNKTRPISKNAGDYQAGDIVSWRLDNGLAHIGVVSDGFARDGTPLVIHN IGAGAQEEDVLFSWRMVGHYRYFVK >gi|296493437|gb|ADTK01000064.1| GENE 21 26908 - 27813 910 301 aa, chain - ## HITS:1 COG:yijE KEGG:ns NR:ns ## COG: yijE COG0697 # Protein_GI_number: 16131781 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction 
only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli K12 # 1 301 12 312 312 479 98.0 1e-135 MSAAGKSNPLAISGLVVLTLIWSYSWIFMKQVTSYIGAFDFTALRCIFGALVLFIVLLLR GRAMRPTPFKYTLAIALLQTCGMIGLAQWALVSGGAGKVAILSYTMPFWVVIFAALFLGE RLRRGQYFAILIAAFGLFLVLQPWQLDFSSMKSAMLAILSGVSWGASAIVAKRLYARHPR VDLLSLTSWQMLYAALVMSVVALLVPQREIDWQPTVFWALVYSAILATALAWSLWLFVLK NLPASIASLSTLAVPVCGVLFSWWLLGENPGVVEGSGIVLIVLALALVSRKKKEAVSVKS I Prediction of potential genes in microbial genomes Time: Mon May 16 15:10:53 2011 Seq name: gi|296493436|gb|ADTK01000065.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont187.3, whole genome shotgun sequence Length of sequence - 9274 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 4/1.000 - CDS 36 - 2216 2793 ## COG0376 Catalase (peroxidase I) - Prom 2359 - 2418 4.6 - Term 2419 - 2473 10.1 2 2 Tu 1 4/1.000 - CDS 2545 - 3435 964 ## COG0685 5,10-methylenetetrahydrofolate reductase - Prom 3586 - 3645 6.0 - Term 3599 - 3633 2.2 3 3 Op 1 5/1.000 - CDS 3784 - 6216 2596 ## COG0527 Aspartokinases 4 3 Op 2 . - CDS 6219 - 7379 1164 ## COG0626 Cystathionine beta-lyases/cystathionine gamma-synthases - Prom 7416 - 7475 4.8 + Prom 7462 - 7521 7.3 5 4 Tu 1 . + CDS 7656 - 7973 426 ## COG3060 Transcriptional regulator of met regulon + Prom 8020 - 8079 2.4 6 5 Tu 1 . 
+ CDS 8157 - 8765 646 ## B21_03772 hypothetical protein Predicted protein(s) >gi|296493436|gb|ADTK01000065.1| GENE 1 36 - 2216 2793 726 aa, chain - ## HITS:1 COG:ECs4871 KEGG:ns NR:ns ## COG: ECs4871 COG0376 # Protein_GI_number: 15834125 # Func_class: P Inorganic ion transport and metabolism # Function: Catalase (peroxidase I) # Organism: Escherichia coli O157:H7 # 1 726 1 726 726 1428 100.0 0 MSTSDDIHNTTATGKCPFHQGGHDQSAGAGTTTRDWWPNQLRVDLLNQHSNRSNPLGEDF DYRKEFSKLDYYGLKKDLKALLTESQPWWPADWGSYAGLFIRMAWHGAGTYRSIDGRGGA GRGQQRFAPLNSWPDNVSLDKARRLLWPIKQKYGQKISWADLFILAGNVALENSGFRTFG FGAGREDVWEPDLDVNWGDEKAWLTHRHPEALAKAPLGATEMGLIYVNPEGPDHSGEPLS AAAAIRATFGNMGMNDEETVALIAGGHTLGKTHGAGPTSNVGPDPEAAPIEEQGLGWAST YGSGVGADAITSGLEVVWTQTPTQWSNYFFENLFKYEWVQTRSPAGAIQFEAVDAPEIIP DPFDPSKKRKPTMLVTDLTLRFDPEFEKISRRFLNDPQAFNEAFARAWFKLTHRDMGPKS RYIGPEVPKEDLIWQDPLPQPIYNPTEQDIIDLKFAIADSGLSVSELVSVAWASASTFRG GDKRGGANGARLALMPQRDWDVNAAAVRALPVLEKIQKESGKASLADIIVLAGVVGVEKA ASAAGLSIHVPFAPGRVDARQDQTDIEMFELLEPIADGFRNYRARLDVSTTESLLIDKAQ QLTLTAPEMTALVGGMRVLGANFDGSKNGVFTDRVGVLSNDFFVNLLDMRYEWKATDESK ELFEGRDRETGEVKYTASRADLVFGSNSVLRAVAEVYASSDAHEKFVKDFVAAWVKVMNL DRFDLL >gi|296493436|gb|ADTK01000065.1| GENE 2 2545 - 3435 964 296 aa, chain - ## HITS:1 COG:metF KEGG:ns NR:ns ## COG: metF COG0685 # Protein_GI_number: 16131779 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Escherichia coli K12 # 1 296 1 296 296 611 100.0 1e-175 MSFFHASQRDALNQSLAEVQGQINVSFEFFPPRTSEMEQTLWNSIDRLSSLKPKFVSVTY GANSGERDRTHSIIKGIKDRTGLEAAPHLTCIDATPDELRTIARDYWNNGIRHIVALRGD LPPGSGKPEMYASDLVTLLKEVADFDISVAAYPEVHPEAKSAQADLLNLKRKVDAGANRA ITQFFFDVESYLRFRDRCVSAGIDVEIIPGILPVSNFKQAKKFADMTNVRIPAWMAQMFD GLDDDAETRKLVGANIAMDMVKILSREGVKDFHFYTLNRAEMSYAICHTLGVRPGL >gi|296493436|gb|ADTK01000065.1| GENE 3 3784 - 6216 2596 810 aa, chain - ## HITS:1 COG:metL_1 KEGG:ns NR:ns ## COG: metL_1 COG0527 # Protein_GI_number: 16131778 # Func_class: E Amino acid transport and metabolism # Function: 
Aspartokinases # Organism: Escherichia coli K12 # 1 448 1 448 448 881 100.0 0 MSVIAQAGAKGRQLHKFGGSSLADVKCYLRVAGIMAEYSQPDDMMVVSAAGSTTNQLINW LKLSQTDRLSAHQVQQTLRRYQCDLISGLLPAEEADSLISAFVSDLERLAALLDSGINDA VYAEVVGHGEVWSARLMSAVLNQQGLPAAWLDAREFLRAERAAQPQVDEGLSYPLLQQLL VQHPGKRLVVTGFISRNNAGETVLLGRNGSDYSATQIGALAGVSRVTIWSDVAGVYSADP RKVKDACLLPLLRLDEASELARLAAPVLHARTLQPVSGSEIDLQLRCSYTPDQGSTRIER VLASGTGARIVTSHDDVCLIEFQVPASQDFKLAHKEIDQILKRAQVRPLAVGVHNDRQLL QFCYTSEVADSALKILDEAGLPGELRLRQGLALVAMVGAGVTRNPLHCHRFWQQLKGQPV EFTWQSDDGISLVAVLRTGPTESLIQGLHQSVFRAEKRIGLVLFGKGNIGSRWLELFARE QSTLSARTGFEFVLAGVVDSRRSLLSYDGLDASRALAFFNDEAVEQDEESLFLWMRAHPY DDLVVLDVTASQQLADQYLDFASHGFHVISANKLAGASDSNKYRQIHDAFEKTGRHWLYN ATVGAGLPINHTVRDLIDSGDTILSISGIFSGTLSWLFLQFDGSVPFTELVDQAWQQGLT EPDPRDDLSGKDVMRKLVILAREAGYNIEPDQVRVESLVPAHCEGGSIDHFFENGDELNE QMVQRLEAAREMGLVLRYVARFDANGKARVGVEAVREDHPLASLLPCDNVFAIESRWYRD NPLVIRGPGAGRDVTAGAIQSDINRLAQLL >gi|296493436|gb|ADTK01000065.1| GENE 4 6219 - 7379 1164 386 aa, chain - ## HITS:1 COG:metB KEGG:ns NR:ns ## COG: metB COG0626 # Protein_GI_number: 16131777 # Func_class: E Amino acid transport and metabolism # Function: Cystathionine beta-lyases/cystathionine gamma-synthases # Organism: Escherichia coli K12 # 1 386 1 386 386 771 99.0 0 MTRKQATIAVRSGLNDDEQYGCVVPPIHLSSTYNFTGFNEPRAHDYSRRGNPTRDVVQRA LAELEGGAGAVLTNTGMSAIHLVTTVFLKPGDLLVAPHDCYGGSYRLFDSLAKRGCYRVL FVDQGDEQALRAALAEKPKLVLVESPSNPLLRVVDIAKICHLAREVGAVSVVDNTFLSPA LQNPLALGADLVLHSCTKYLNGHSDVVAGVVIAKDPDVVTELAWWANNIGVTGGAFDSYL LLRGLRTLVPRMELAQRNAQAIVKYLQTQPLVKKLYHPSLPENQGHEIAARQQKGFGAML SFELDGDEQTLRRFLSGLSLFTLAESLGGVESLISHAATMTHAGMAPEARAAAGISETLL RISTGIEDGEDLIADLENGFRAANKG >gi|296493436|gb|ADTK01000065.1| GENE 5 7656 - 7973 426 105 aa, chain + ## HITS:1 COG:ECs4867 KEGG:ns NR:ns ## COG: ECs4867 COG3060 # Protein_GI_number: 15834121 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulator of met regulon # Organism: Escherichia coli O157:H7 # 1 105 1 105 105 198 100.0 2e-51 
MAEWSGEYISPYAEHGKKSEQVKKITVSIPLKVLKILTDERTRRQVNNLRHATNSELLCE AFLHAFTGQPLPDDADLRKERSDEIPEAAKEIMREMGINPETWEY >gi|296493436|gb|ADTK01000065.1| GENE 6 8157 - 8765 646 202 aa, chain + ## HITS:1 COG:no KEGG:B21_03772 NR:ns ## KEGG: B21_03772 # Name: yiiX # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 202 1 202 202 369 100.0 1e-101 MKNRLLILSLLVSVPAFAWQPQTGDIIFQISRSSQSKAIQLATHSDYSHTGMLVIRNKKP YVFEAVGPVKYTPLKQWIAHGEKGKYVVRRVEGGLSVEQQQKLAQTAKRYLGKPYDFSFS WSDDRQYCSEVVWKVYQNALGMRVGEQQKLKEFDLSNPLVQAKLKERYGKNIPLEETVVS PQAVFDAPQLTTVAKEWPLFSW Prediction of potential genes in microbial genomes Time: Mon May 16 15:10:57 2011 Seq name: gi|296493435|gb|ADTK01000066.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont187.4, whole genome shotgun sequence Length of sequence - 1269 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 135 - 449 119 ## COG1662 Transposase and inactivated derivatives, IS1 family + Term 518 - 584 2.3 - Term 670 - 704 5.3 2 2 Tu 1 . 
- CDS 744 - 1205 159 ## COG3209 Rhs family protein Predicted protein(s) >gi|296493435|gb|ADTK01000066.1| GENE 1 135 - 449 119 104 aa, chain + ## HITS:1 COG:ECs1301 KEGG:ns NR:ns ## COG: ECs1301 COG1662 # Protein_GI_number: 15830555 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives, IS1 family # Organism: Escherichia coli O157:H7 # 1 59 61 119 145 86 69.0 1e-17 MLTTDEWGSYARELPKEKHLTGKIFTQHIERNNLTLRTRIKRLARRTICFSRSVELHEKV IGAFIENICSTNWSHHPSDKLSLDDYQTEYTAKDFYLILLFFLK >gi|296493435|gb|ADTK01000066.1| GENE 2 744 - 1205 159 153 aa, chain - ## HITS:1 COG:ECs4864 KEGG:ns NR:ns ## COG: ECs4864 COG3209 # Protein_GI_number: 15834118 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli O157:H7 # 1 153 1242 1394 1394 322 99.0 1e-88 MDPLGLYEFKSKNIDDIGIFALAMCNGESINENKEYGGLICKKQGEYFPMNPISSNDNDS VDLRNIKCPEGSERVGDYHTHGFYSDDKGNKVTKENDVYDSLNFSSKDLTNSYMNGMGKK EYSSYLGTPNNTYLKYNPKAKGNGVTIIRQGSN Prediction of potential genes in microbial genomes Time: Mon May 16 15:11:01 2011 Seq name: gi|296493434|gb|ADTK01000067.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont195.1, whole genome shotgun sequence Length of sequence - 8938 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 2, operones - 2 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 26 - 232 88 ## COG2801 Transposase and inactivated derivatives 2 1 Op 2 . + CDS 315 - 572 181 ## EC55989_4958 conserved hypothetical protein; KpLE2 phage-like element + Term 766 - 795 2.5 3 2 Op 1 35/0.000 - CDS 1130 - 1897 228 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 4 2 Op 2 20/0.000 - CDS 1898 - 2854 888 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 5 2 Op 3 3/0.000 - CDS 2851 - 3828 603 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 6 2 Op 4 . 
- CDS 3846 - 4748 906 ## COG4594 ABC-type Fe3+-citrate transport system, periplasmic component - Term 4756 - 4783 1.5 7 2 Op 5 2/0.000 - CDS 4793 - 7117 2302 ## COG4772 Outer membrane receptor for Fe3+-dicitrate 8 2 Op 6 6/0.000 - CDS 7204 - 8157 674 ## COG3712 Fe2+-dicitrate sensor, membrane component 9 2 Op 7 . - CDS 8154 - 8675 375 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 8703 - 8762 6.8 Predicted protein(s) >gi|296493434|gb|ADTK01000067.1| GENE 1 26 - 232 88 68 aa, chain + ## HITS:1 COG:VC0257 KEGG:ns NR:ns ## COG: VC0257 COG2801 # Protein_GI_number: 15640286 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Vibrio cholerae # 1 68 223 290 290 98 66.0 2e-21 MERFFRSLKNEWVPATGYVSFSDAAHAITDYIVGYYSALRPHEYNGGLPPNESENRYWKN SNAEASFS >gi|296493434|gb|ADTK01000067.1| GENE 2 315 - 572 181 85 aa, chain + ## HITS:1 COG:no KEGG:EC55989_4958 NR:ns ## KEGG: EC55989_4958 # Name: yjhV # Def: conserved hypothetical protein; KpLE2 phage-like element # Organism: E.coli_55989 # Pathway: not_defined # 1 85 53 137 137 184 100.0 1e-45 MTLVNDTGFDPVFSGSIAESWRQQPCTPSYCCDWEAATMLRAFPLAKKGEGRARLPSLYA SFGKLGETPTHEDIIDNNRSINWPV >gi|296493434|gb|ADTK01000067.1| GENE 3 1130 - 1897 228 255 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 225 2 226 245 92 25 1e-18 MTLRTENLTVSYGTDKVLNDVSLSLPTGKITALIGPNGCGKSTLLNCFSRLLMPQSGTVF LGDNPINMLSSRQLARRLSLLPQHHLTPEGITVQELVSYGRNPWLSLWGRLSAEDNARVN VAMNQTRINHLAVRRLTELSGGQRQRAFLAMVLAQNTPVVLLDEPTTYLDINHQVDLMRL MGELRTQGKTVVAVLHDLNQASRYCDQLVVMANGHVMAQGTPEEVMTPGLLRTVFSVEAE IHPEPVSGRPMCLMR >gi|296493434|gb|ADTK01000067.1| GENE 4 1898 - 2854 888 318 aa, chain - ## HITS:1 COG:fecD KEGG:ns NR:ns ## COG: fecD COG0609 # Protein_GI_number: 16132109 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # 
Organism: Escherichia coli K12 # 1 318 1 318 318 493 99.0 1e-139 MKIALVIFITLALAGCALLSLHMGVIPVPWRALLTDWQAGREHYYVLMEYRLPRLLLALF VGAALAVAGVLIQGIVRNPLASPDILGVNHAASLASVGALLLMPSLPVMVLPLLAFAGGM AGLILLKMLAKTHQPMKLALTGVALSACWASLTDYLMLSRPQDVNNALLWLTGSLWGRDW SFVKIAIPLMILFLPLSLSFCRDLDLLALGDARATTLGVSVPHTRFWALLLAVAMTSTGV AACGPISFIGLVVPHMMRSITGGRHRRLLPVSALTGALLLVVADLLARIIHPPLELPVGV LTAIIGAPWFVWLLVRMR >gi|296493434|gb|ADTK01000067.1| GENE 5 2851 - 3828 603 325 aa, chain - ## HITS:1 COG:fecC KEGG:ns NR:ns ## COG: fecC COG0609 # Protein_GI_number: 16132110 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Escherichia coli K12 # 1 325 8 332 332 495 99.0 1e-140 MLLWGLPVAALIIIFWLSLFCYSAIPVSGADATRALLPGHTPTLPEALVQNLRLPRSLVA VLIGASLALAGTLLQTLTHNPMASPSLLGINSGAALAMALTSALSPTPIAGYSLSFIAAC GGGVGWLLVMTAGGGFRHTHDRNKLILAGIALSAFCMGLTRITLLLAEDHAYGIFYWLAG GVSHARWQDVWQLLPVVVTAVPVVLLLANQLNLLNLSDSTAHTLGVNLTRLRLVINMLVL LLVGACVSVAGPVAFIGLLVPHLARFWAGFDQRNVLPVSMLLGATLMLLADVLARALAFP GDLPAGAVLALIGSPCFVWLVRRRG >gi|296493434|gb|ADTK01000067.1| GENE 6 3846 - 4748 906 300 aa, chain - ## HITS:1 COG:fecB KEGG:ns NR:ns ## COG: fecB COG4594 # Protein_GI_number: 16132111 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-citrate transport system, periplasmic component # Organism: Escherichia coli K12 # 1 300 3 302 302 550 99.0 1e-156 MLAFIRFLFAGLLLVISHAFAATVQDEHGTFTLEKTPQRIVVLELSFADALAAVDVSPIG IADDNDAKRILPEVRAHLKPWQSVGTRAQPSLEAIAALKPDLIIADSSRHAGVYIALQQI APVLLLKSRNETYAENLQSAAIIGEMVGKKREMQARLEQHKERMAQWASQLPKGTRVAFG TSREQQFNLHTQETWTGSVLASLGLNVPAAMAGASMPSIGLEQLLAVNPAWLLVAHYREE SIVKRWQQDPLWQMLTAAQKQQVASVDSNTWARMRGIFAAERIAADTVKIFHHQPLTVVK >gi|296493434|gb|ADTK01000067.1| GENE 7 4793 - 7117 2302 774 aa, chain - ## HITS:1 COG:fecA KEGG:ns NR:ns ## COG: fecA COG4772 # Protein_GI_number: 16132112 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for Fe3+-dicitrate # 
Organism: Escherichia coli K12 # 1 774 1 774 774 1538 99.0 0 MTPLRVFRKTTPLVNAIRLSLLPLAGLSFSAFAAQVNIAPGSLDKALNQYAAHSGFTLSV DASLTRGKQSNGLHGDYDVESGLQQLLDGSGLQVKPLGNNSWTLEPAPAPKEDALTVVGD WLGDARENDVFEHAGARDVIRREDFAKTGATTMREVLNRIPGVSAPENNGTGSHDLAMNF GIRGLNPRLASRSTVLMDGIPVPFAPYGQPQLSLAPVSLGNMDAIDVVRGGGAVRYGPQS VGGVVNFVTRAIPQDFGIEAGVEGQLSPTSSQNNPKETHNLMVGGTADNGFGTALLYSGT RGSDWREHSATRIDDLMLKSKYAPDEVHTFNSLLQYYDGEADMPGGLSRADYDADRWQST RPYDRFWGRRKLASLGYQFQPDSQHKFNIQGFYTQTLRSGYLEQGKRITLSPRNYWVRGI EPRYSQIFMIGPSAHEVGVGYRYLNESTHEMRYYTATSSGQLPSGSSPYDRDTRSGTEAH AWYLDDKIDIGNWTITPGMRFEHIESYQNNAITGTHEEVSYNAPLPALNVLYHLTDSWNL YANTEGSFGTVQYSQIGKAVQSGNVEPEKARTWELGTRYDDGALTAEMGLFLINFNNQYD SNQTNDTVTARGKTRHTGLETQARYDLGTLTPTLDNVSIYASYAYVNAEIREKGDTYGNL VPFSPKHKGTLGVDYKPGNWTFNLNSDFQSSQFADNANTVKESADGSTGRIPGFMLWGAR VAYDFGPQMADLNLAFGVKNIFDQDYFIRSYNDNNKGIYAGQPRTLYMQGSLKF >gi|296493434|gb|ADTK01000067.1| GENE 8 7204 - 8157 674 317 aa, chain - ## HITS:1 COG:fecR KEGG:ns NR:ns ## COG: fecR COG3712 # Protein_GI_number: 16132113 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Escherichia coli K12 # 1 317 1 317 317 567 100.0 1e-161 MNPLLTDSRRQALRSASHWYAVLSGERVSPQQEARWQQWYEQDQDNQWAWQQVENLRNQL GGVPGDVASRALHDTRLTRRHVMKGLLLLLGAGGGWQLWQSETGEGLRADYRTAKGTVSR QQLEDGSLLTLNTQSAADVRFDAHQRTVRLWYGEIAITTAKDALQRPFRVLTRQGQLTAL GTEFTVRQQDNFTQLDVQQHAVEVLLASAPAQKRIVNAGESLQFSASEFGAVKPLDDEST SWTKDILSFSDKPLGEVIATLTRYRNGVLRCDPAVAGLRLSGTFPLKNTDAILNVIAQTL PVKIQSITRYWINISPL >gi|296493434|gb|ADTK01000067.1| GENE 9 8154 - 8675 375 173 aa, chain - ## HITS:1 COG:fecI KEGG:ns NR:ns ## COG: fecI COG1595 # Protein_GI_number: 16132114 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Escherichia coli K12 # 1 173 1 173 173 326 100.0 1e-89 MSDRATTTASLTFESLYGTHHGWLKSWLTRKLQSAFDADDIAQDTFLRVMVSETLSTIRD PRSFLCTIAKRVMVDLFRRNALEKAYLEMLALMPEGGAPSPEERESQLETLQLLDSMLDG 
LNGKTREAFLLSQLDGLTYSEIAHKLGVSISSVKKYVAKAVEHCLLFRLEYGL Prediction of potential genes in microbial genomes Time: Mon May 16 15:11:04 2011 Seq name: gi|296493433|gb|ADTK01000068.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont196.1, whole genome shotgun sequence Length of sequence - 5121 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 386 - 445 6.9 1 1 Tu 1 . + CDS 468 - 1538 426 ## JW2611 hypothetical protein + Term 1743 - 1782 5.2 2 2 Tu 1 . - CDS 2031 - 2300 189 ## gi|300902645|ref|ZP_07120616.1| conserved domain protein - Prom 2323 - 2382 2.2 3 3 Tu 1 . - CDS 2926 - 4065 40 ## ECSE_1878 hypothetical protein - Prom 4129 - 4188 3.5 Predicted protein(s) >gi|296493433|gb|ADTK01000068.1| GENE 1 468 - 1538 426 356 aa, chain + ## HITS:1 COG:no KEGG:JW2611 NR:ns ## KEGG: JW2611 # Name: yfjN # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 356 1 357 357 525 71.0 1e-147 MTVRNYTNLNLDRTTIEATSRAFIENKNYSVHSIGPMPGARAGLRVVFAKPGEALATVNI FYNNGGTSTVQYQTGANHNLGKELADDLYETINPAEFEQVNMVLQGFVEANVLSVLQLSA EQPHIQFYEYLRNTHTTVWKIHSPEFQDELTVSLHHRNGTLQIQGRPLSCYRVFIFNLSE LLDLQGLEKVLIRQDDGKAYIVQKEVARSHLECEMGDAYPLLHKNVEKLLVSGLCVKLAA PDLPDYCMLLYPELRSVEGVLKSTMYRYGMPITQDGFGRYFDKIGSEFILKQQFGCNLSS AKVKTINDAYTFFNRERHGLFHMEIVVDTSRMVSDMARLMTKARQAWGIIKDLYIV >gi|296493433|gb|ADTK01000068.1| GENE 2 2031 - 2300 189 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|300902645|ref|ZP_07120616.1| ## NR: gi|300902645|ref|ZP_07120616.1| conserved domain protein [Escherichia coli MS 84-1] # 1 89 1 89 89 143 100.0 4e-33 MENLLFVMCWIKDHHVMSLTFGVLSAILWIKSATAKVELGKKIVMVTYEDPELNVNLHDF FATARLQSKYNSFAALSAAATALFQIIGL >gi|296493433|gb|ADTK01000068.1| GENE 3 2926 - 4065 40 379 aa, chain - ## HITS:1 COG:no KEGG:ECSE_1878 NR:ns ## KEGG: ECSE_1878 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 379 6 384 386 555 72.0 1e-157 
MKMTKADFGVTIDDTLITDFRDVVNEGEFAYSYFINRNGKNQFSPICSCMDWISVSVRNL MNFPELSEDIDVKAMQVYSLISSIDIAVEALQQLHRIIESDDKLARWPFKKSTNVFKNKT PYLSHEDDDAYFKSIRAIFGAHPTNLKNSDGERLFASWPHFHAFNENDFTISLYNNTPGK DDVIFGIKFDELITYIEDRYKYLTSLKGSVIAIRIKHYSTLSKQIIPSTDNIHAELKLLL SEVAVRGDNDYYRMEVEELIKLFEGNVKEAHLKDEADVFLSKLHPIAVEIRYNLQEMNIE DLATKHGVLISTLPKNALHYVLQKMFTWLHSDRYDPMGGFYLDKLNEYSGGRYNFCSEDD DSTTLLKLRMMLHWYHQQK Prediction of potential genes in microbial genomes Time: Mon May 16 15:11:20 2011 Seq name: gi|296493432|gb|ADTK01000069.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont206.1, whole genome shotgun sequence Length of sequence - 6898 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 180 162 ## COG5433 Transposase - Prom 200 - 259 7.5 - Term 492 - 555 9.3 2 2 Tu 1 . - CDS 583 - 927 160 ## EcolC_2198 hypothetical protein - Prom 1044 - 1103 4.0 - Term 1372 - 1409 -0.6 3 3 Tu 1 . - CDS 1572 - 1796 92 ## gi|293414811|ref|ZP_06657454.1| hypothetical protein ECDG_01362 - Prom 1895 - 1954 7.2 4 4 Op 1 3/0.000 - CDS 1971 - 6191 2550 ## COG3209 Rhs family protein 5 4 Op 2 . 
- CDS 6259 - 6867 447 ## COG3501 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|296493432|gb|ADTK01000069.1| GENE 1 3 - 180 162 59 aa, chain - ## HITS:1 COG:ECs0241 KEGG:ns NR:ns ## COG: ECs0241 COG5433 # Protein_GI_number: 15829495 # Func_class: L Replication, recombination and repair # Function: Transposase # Organism: Escherichia coli O157:H7 # 1 59 1 59 378 120 96.0 5e-28 MELKKLMEHISITPDYRQAWKVEHKLSDILLLTICAVISGAEGWEDIEDFGETHLDFLK >gi|296493432|gb|ADTK01000069.1| GENE 2 583 - 927 160 114 aa, chain - ## HITS:1 COG:no KEGG:EcolC_2198 NR:ns ## KEGG: EcolC_2198 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 114 11 124 124 192 97.0 2e-48 MSDFDIVSSIDWHDGVLLEVKTTFKESHFNMTLFIDIYKSENIREKMIVEFSDVDDLVFA MDSAELIDNHKSGNISNGYIKKLKNKRIYKFFLYFSDGLLSMTFKNLQLIKPLE >gi|296493432|gb|ADTK01000069.1| GENE 3 1572 - 1796 92 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293414811|ref|ZP_06657454.1| ## NR: gi|293414811|ref|ZP_06657454.1| hypothetical protein ECDG_01362 [Escherichia coli B185] # 1 74 59 132 132 141 100.0 1e-32 MKKNETWMLLNGDATPNIRTMSNKKIECLEKSEVEWASRVCDSITDKERENLKALICSKP AKSVNTGKRPALGY >gi|296493432|gb|ADTK01000069.1| GENE 4 1971 - 6191 2550 1406 aa, chain - ## HITS:1 COG:rhsD KEGG:ns NR:ns ## COG: rhsD COG3209 # Protein_GI_number: 16128481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli K12 # 1 1258 1 1258 1426 2379 96.0 0 MSGKPAARQGDMTQYGGPIVQGSAGVRIGAPTGVACSVCPGGMTSGNPVNPLLGAKVLPG ETDLALPGPLPFILSRTYSSYRTKTPAPVGVFGPGWKAPSDIRLQLRDDGLILNDNGGRS IHFEPLLPGEAVYSRSESMWLVRGGKAAQPDGHTLARLWGALPPDIRLSPHLYLATNSAQ GPWWILGWSERVPGAEDVLPAPLPPYRELTGLADRFGRTLTYRREAAGDLTGEITGVTDG AGREFRLVLTTQAQRAEEARTSSLSSSDSSRPLSASAFPDTLPGTEYGPDRGIRLSAVWL MHDPAYPESLPAAPLVRYTYTEAGELLAVYDRSNTQVRAFTYDAQHPGRMVAHRYAGRPE MRYRYDDAGRVVEQLNPAGLSYRYQYEQDRITVTDSLNRREVLHTEGGAGLKRVVKKELA DGSVTHSGYDAAGRLTAQTDAAGRRTEYGLNVVSGDITDITTPDGRETKFYYNDGNQLTA 
VVSPDGLESRREYDEPGRLVSETSRSGETVRYRYDDAHSELPATTTDATGSTRQMTWSRY GQLLAFTDCSGYQTRYEYDRFGQMTAVHREEGISLYRHYDNRGRLTSVKDAQGRETQYEY NAAGDLTAVITPDGNRSETQYDAWGKAVSTTQGGLTRSMEYDAAGRVISLTNENGSHSDF SYDALDRLVQQGGFDGRTQRYHYDLTGKLTQSEDEGLVTLWYYDESDRITHRTVNGEPAE QWQYDDHGWLTDISHLSEGHRVAVHYGYDDKGRLTGERQTVENPETGELLWQHETTHAYN EQGLANRVTPDSLPPVEWLTYGSGYLAGMKLGDTPLLEYTRDRMHRETVRSFGSMAGSNA AYKLTSTYTPAGQLQSQHLNSLVYDRDYGWNDNGDLVRISGPRQTREYGYSATGRLESVR TLAPDLDIRIPYATDPAGNRLPDPELHPDSTLTAWPDNRIAEDAHYVYHYDEYGRLTEKT DLIPAGVIRTDDERTHHYHYDSQHRLVFYTRIQHGEPLVESRYLYDPLGRRMAKRVWRRE RDLTGWMSLSRKPEVTWYGWDGDRLTTVQTDTTRIQTVYQPGSFAPLIRIETDNGEREKA QRRSLAEKLQQEGSEDGHGVVFPAELVRLLDRLEEEIRADRVSSESRVWLAQCGLTVEQL ARQVEPEYTPARKVHFYHCDHRGLPLALISEDGNTVWSAEYDEWGNQLNEENPYYLYQPY RLPGQQYDEESGLDYNRHRYYDPLQGRYITQDPIGLAGGWSLYAYPLNPVQHVDPLGLST MIIGNGPVPDNPFGHAAAANRYGLMSSGTGDEMGASVSDYFKKMQPRRDTWIWIIDTTDE EEQCMMNKAIDLNKKLKTINLGPLPVANNCFSRTNMIFEACGFSNPSMNTNAPISLQVLG ELYGYQRYFMPKGPFTGFPFEFQGVK >gi|296493432|gb|ADTK01000069.1| GENE 5 6259 - 6867 447 202 aa, chain - ## HITS:1 COG:ZvgrE KEGG:ns NR:ns ## COG: ZvgrE COG3501 # Protein_GI_number: 15801686 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 EDL933 # 1 202 513 714 714 404 99.0 1e-113 MNTEVLNNRTTDVINNHAETIGNNQMIAVTNNQIQTVGVNQIETVGSNQIIKVGSVQVET IGLVRALTVGVAYQTTVGGIMNTSVALMQSSQIGLHKSLRVGLGYDVKVGNNVTFTVGKT KKDDTGQTAIYSAGEHLELCCGKARLVLTKDGQIFLNGTKIHLQGKEQVNGDSLLINWNC AASKSPPKTPDEKQDTPDMREY Prediction of potential genes in microbial genomes Time: Mon May 16 15:11:28 2011 Seq name: gi|296493431|gb|ADTK01000070.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont207.1, whole genome shotgun sequence Length of sequence - 1219 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 63 - 254 98 ## ECP_3777 hypothetical protein + Prom 311 - 370 2.5 2 2 Tu 1 . 
+ CDS 418 - 834 273 ## COG1846 Transcriptional regulators Predicted protein(s) >gi|296493431|gb|ADTK01000070.1| GENE 1 63 - 254 98 63 aa, chain + ## HITS:1 COG:no KEGG:ECP_3777 NR:ns ## KEGG: ECP_3777 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_536 # Pathway: not_defined # 1 63 7 69 69 124 100.0 8e-28 MPREVTFQTGGGLVNYPRLLSSMDNPFSQTCQQPAFMFFYLREFICLLCRNISVIQKSGR NKL >gi|296493431|gb|ADTK01000070.1| GENE 2 418 - 834 273 138 aa, chain + ## HITS:1 COG:STM2813 KEGG:ns NR:ns ## COG: STM2813 COG1846 # Protein_GI_number: 16766124 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Salmonella typhimurium LT2 # 2 136 31 165 176 95 38.0 3e-20 MQLCIRANKRMQDNISEFLGAYGINHSVYMVLTTLFTAESHCLSPSEISQKLQFTRTNIT RITDFLEKTGYVKRTDSREDRRAKKISLTSEGMFFIQRLTLAQSMYLKEIWGYLTHDEQE LFEVINKKLLAHLDDVSS Prediction of potential genes in microbial genomes Time: Mon May 16 15:11:30 2011 Seq name: gi|296493430|gb|ADTK01000071.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont209.1, whole genome shotgun sequence Length of sequence - 1437 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 335 - 511 97 ## ECB_00212 hypothetical protein 2 1 Op 2 . 
- CDS 508 - 726 142 ## ECB_00212 hypothetical protein Predicted protein(s) >gi|296493430|gb|ADTK01000071.1| GENE 1 335 - 511 97 58 aa, chain - ## HITS:1 COG:no KEGG:ECB_00212 NR:ns ## KEGG: ECB_00212 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 58 66 123 123 112 93.0 6e-24 MSFIFNEETLYFTVFGGDFIMDDLKIVKNTEAWVWYKNCDSKNKKVAVELLQIYELHQ >gi|296493430|gb|ADTK01000071.1| GENE 2 508 - 726 142 72 aa, chain - ## HITS:1 COG:no KEGG:ECB_00212 NR:ns ## KEGG: ECB_00212 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 13 71 6 64 123 116 98.0 2e-25 MNVLRHVNAINMGVCILILVIGCANAKVLKRCILPAPDGGVEKLSKPNLHGVIKKVDLNT NEAEIKIKEGGQ Prediction of potential genes in microbial genomes Time: Mon May 16 15:11:34 2011 Seq name: gi|296493429|gb|ADTK01000072.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont209.2, whole genome shotgun sequence Length of sequence - 465 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 465 246 ## COG3209 Rhs family protein Predicted protein(s) >gi|296493429|gb|ADTK01000072.1| GENE 1 3 - 465 246 154 aa, chain - ## HITS:1 COG:Z0268 KEGG:ns NR:ns ## COG: Z0268 COG3209 # Protein_GI_number: 15799917 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli O157:H7 EDL933 # 1 154 1014 1167 1404 286 92.0 1e-77 ERVWRRERDLTGWMSLSRKPEETWYGWDGDRLTTVQTQQTRIQTVYQPGSFTPLLRIETE NGEQAKARHRSLAEVLQEDTGVTLPAELAVMLGRLERELRAGAVSAESEAWLAQCGLTVE QMESQMEAEYIPERRLHLYHCDHRGLPQALISPE Prediction of potential genes in microbial genomes Time: Mon May 16 15:11:35 2011 Seq name: gi|296493428|gb|ADTK01000073.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont211.1, whole genome shotgun sequence Length of sequence - 2537 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 23 - 1957 903 ## COG3209 Rhs family protein 2 1 Op 2 . 
+ CDS 1972 - 2418 263 ## EC55989_1590 conserved hypothetical protein; putative exported protein Predicted protein(s) >gi|296493428|gb|ADTK01000073.1| GENE 1 23 - 1957 903 644 aa, chain + ## HITS:1 COG:Z0268 KEGG:ns NR:ns ## COG: Z0268 COG3209 # Protein_GI_number: 15799917 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli O157:H7 EDL933 # 1 494 774 1264 1404 919 92.0 0 MLWEHETGHVYSEQGLATRQEPDGLPPVEWLTYGSGYLAGMKLGGTPLVEYTRDRLHRET ARSFGGAGSTAGYEQATAYTLTGQLQSRHLNLPQLDRDYTWNDNGQLVRISGPQECREYR YSGTGRLTGVHTTAANLDIDIPYATDPAGNRLPDPELHPDSTLTAWPDNRIAEDAHYVYR YDEYGRLAEKTDRIPEGVIRMHDERTHHYHYDSQHRLVFYTRIQHGEPQVESRYLYDPLG RRTGKRVWRRGRDLTGWMSLSRKPEETWYGWDGDRLTTVQTQQTRIQTVYQPGSFTPLLR IETENGEQAKARHRSLAEVLQEDTGVTLPAELAVMLGRLERELRQGSVSEESQQWLAQCG LTAEQMAAQLEAEYIPERKLHLYHCDHRGLPLALISPEGETAWQGEYDEWGNLLGEESAQ HLQQSLRLPGQQYDEESGLYYNRNRYYDPLQGRYITQDPIGLRGEWNLYKYPLNPVRFID SLGLKFHVNGDPSDFNQAVEYLKQDSQMKETIDFLSSSEETINIEYIEGTNVRFNSNNMA IYWNSRASLFCSTELNSKSQSPALGLGHEFAHAQYYLLDKENFMALLSRTDKKYENKEEA RVITIIESRAAKTLGECTRGAHSGLPFYRVDGPLQTMKITGTPE >gi|296493428|gb|ADTK01000073.1| GENE 2 1972 - 2418 263 148 aa, chain + ## HITS:1 COG:no KEGG:EC55989_1590 NR:ns ## KEGG: EC55989_1590 # Name: not_defined # Def: conserved hypothetical protein; putative exported protein # Organism: E.coli_55989 # Pathway: not_defined # 1 148 6 153 153 280 100.0 9e-75 MKYLMVLLSLFSGSVLGMGRVNELCGIDSVKTIEIINLPSYVTTLVPLSKEGLNEIYRYK VVVNEISDLYAGKIIDLLQMKYFRKEKYNNIRWGVSIISKGNNKCEIYFDAFGECGSVNG INVCFEKNEMIGWIKKEIPLLSQKIGGL Prediction of potential genes in microbial genomes Time: Mon May 16 15:11:38 2011 Seq name: gi|296493427|gb|ADTK01000074.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont214.1, whole genome shotgun sequence Length of sequence - 1337 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 1337 992 ## COG3468 Type V secretory pathway, adhesin AidA Predicted protein(s) >gi|296493427|gb|ADTK01000074.1| GENE 1 2 - 1337 992 445 aa, chain - ## HITS:1 COG:flu KEGG:ns NR:ns ## COG: flu COG3468 # Protein_GI_number: 16129941 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 1 445 365 809 1091 691 99.0 0 ALVTSTAATVTGINRLGAFSVVEGKADNVVLENGGRLDVLTGHTATNTRVDDGGTLDVRN GGTATTVSMGNGGVLLADSGAAVSGTRSDGKAFSIGGGQADALMLEKGSSFTLNAGDTAT DTTVNGGLFTARGGTLAGTTTLNNGAILTLSGKTVNNDTLTIREGDALLQGGSLTGNGSV EKSGSGTLTVSNTTLTQKAVNLNEGTLTLNDSTVTTDVIAQRGTALKLTGSTVLNGAIDP TNVTLASGATWNIPDNATVQSVVDDLSHAGQIHFTSTRTGKFVPATLKVKNLNGQNGTIS LRVRPDMAQNNADRLVIDGGRATGKTILNLVNAGNSASGLATSGKGIQVVEAINGATTEE GAFVQGNRLQAGAFNYSLNRDSDESWYLRSENAYRAEVPLYASMLTQAMDYDRILAGSRS HQTGVSGENNSVRLSIQGGHLGHDN Prediction of potential genes in microbial genomes Time: Mon May 16 15:11:39 2011 Seq name: gi|296493426|gb|ADTK01000075.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont219.1, whole genome shotgun sequence Length of sequence - 760 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 578 242 ## ECO111_p4-01 putative Rep protein Predicted protein(s) >gi|296493426|gb|ADTK01000075.1| GENE 1 3 - 578 242 191 aa, chain + ## HITS:1 COG:no KEGG:ECO111_p4-01 NR:ns ## KEGG: ECO111_p4-01 # Name: not_defined # Def: putative Rep protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 181 107 296 304 317 84.0 2e-85 YAAAIERALCEKLGADVNYSGLICKNPCHPEWQEVEWREEPYTLDELADYLDLSASARRS VDKNYGLGRNYHLFEKVRKWAYRAIRQGWPVFSQWLDAVIQRVEMYNASLPVPLSPAECR AIGKSIAKYTHRKFSPEGFSAVQAARGRKGGTKSKRAAVPTSARSLKPWEALGISRATYY RKLKCDPDLAK Prediction of potential genes in microbial genomes Time: Mon May 16 15:11:42 2011 Seq name: gi|296493425|gb|ADTK01000076.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont224.1, whole genome shotgun sequence Length of sequence - 638 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 60 - 636 431 ## EC55989_1357 putative exonuclease from phage origin Predicted protein(s) >gi|296493425|gb|ADTK01000076.1| GENE 1 60 - 636 431 192 aa, chain + ## HITS:1 COG:no KEGG:EC55989_1357 NR:ns ## KEGG: EC55989_1357 # Name: not_defined # Def: putative exonuclease from phage origin # Organism: E.coli_55989 # Pathway: not_defined # 1 192 1 192 823 387 100.0 1e-107 MSKVFICAAIPDEQAIKEEGAVAVATAIEAGDERRARAKFHWQFLEHYPAAQDCAYKFLV CEDKPGIPRPALDSWDAEYMQENRWDEESASFVPVETESDPMNVTFDKLAPEVQNAVMVK FDTCENITVDMVISAQELLQEDMATFDGHIVEALMKMPEVNAMYPELKLHAIGWVKHKCK PGAKWPEIQAEM Prediction of potential genes in microbial genomes Time: Mon May 16 15:11:50 2011 Seq name: gi|296493424|gb|ADTK01000077.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont233.1, whole genome shotgun sequence Length of sequence - 13323 bp Number of predicted genes - 12, with homology - 10 Number of transcription units - 5, operones - 1 average op.length - 8.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 228 - 265 1.1 1 1 Tu 1 . - CDS 350 - 493 63 ## 2 2 Tu 1 . 
+ CDS 1107 - 1436 132 ## COG1846 Transcriptional regulators 3 3 Tu 1 . + CDS 2066 - 2533 9 ## gi|300902680|ref|ZP_07120642.1| hypothetical protein HMPREF9536_00840 4 4 Tu 1 . - CDS 2811 - 2993 61 ## - Prom 3122 - 3181 5.9 - Term 3651 - 3679 -1.0 5 5 Op 1 6/0.000 - CDS 3702 - 4523 663 ## COG3718 Uncharacterized enzyme involved in inositol metabolism 6 5 Op 2 5/0.000 - CDS 4533 - 5423 912 ## COG1082 Sugar phosphate isomerases/epimerases 7 5 Op 3 2/0.000 - CDS 5435 - 7339 1625 ## COG0524 Sugar kinases, ribokinase family 8 5 Op 4 3/0.000 - CDS 7352 - 8485 1100 ## COG0673 Predicted dehydrogenases and related proteins 9 5 Op 5 21/0.000 - CDS 8494 - 9522 1230 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 10 5 Op 6 16/0.000 - CDS 9534 - 11081 171 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 11 5 Op 7 1/0.000 - CDS 11133 - 12062 1159 ## COG1879 ABC-type sugar transport system, periplasmic component 12 5 Op 8 . - CDS 12094 - 13080 957 ## COG0673 Predicted dehydrogenases and related proteins - Prom 13155 - 13214 5.0 Predicted protein(s) >gi|296493424|gb|ADTK01000077.1| GENE 1 350 - 493 63 47 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLAGTSGLYNGFLGRHSGLPRVSHDIYSYSAFIVSLLSAESDIAHHT >gi|296493424|gb|ADTK01000077.1| GENE 2 1107 - 1436 132 109 aa, chain + ## HITS:1 COG:STM2813 KEGG:ns NR:ns ## COG: STM2813 COG1846 # Protein_GI_number: 16766124 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Salmonella typhimurium LT2 # 1 104 59 162 176 79 43.0 1e-15 MVLTILSTADNHCLSPSKISRELQFTKTNITRITDFLEKAGYIERTDSREDRRAKKISLT SDGLSFIQNVNIAQSLHLKEIWSPLTHAECELFEDINKKLLTHIADVCP >gi|296493424|gb|ADTK01000077.1| GENE 3 2066 - 2533 9 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|300902680|ref|ZP_07120642.1| ## NR: gi|300902680|ref|ZP_07120642.1| hypothetical protein HMPREF9536_00840 [Escherichia coli MS 84-1] # 1 155 1 155 155 299 100.0 3e-80 
MIFYLIDKEVKDREMSFNTTHEKSEIYRLILRESELITAWVKSGDTPSAVYGKLRDKNPD IIFSINGFLYNLRNFNYALYETATKNKSKTRLIILNHYDDIASAIRAGHTLKGVYKLVCP HITYNCFITQLRKTYPDLHSQGKANRSNKNRIIAN >gi|296493424|gb|ADTK01000077.1| GENE 4 2811 - 2993 61 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MMNIATWRYKERLAPDISLTLRQYLQNGVCLHPGILQFVQLDLRLIFVSCIINGLPFPLV >gi|296493424|gb|ADTK01000077.1| GENE 5 3702 - 4523 663 273 aa, chain - ## HITS:1 COG:YPO2587 KEGG:ns NR:ns ## COG: YPO2587 COG3718 # Protein_GI_number: 16122800 # Func_class: G Carbohydrate transport and metabolism # Function: Uncharacterized enzyme involved in inositol metabolism # Organism: Yersinia pestis # 1 268 1 268 271 457 82.0 1e-128 MSRLLSRWQQPNAEGRTQSVTPESAGWGYVGFEAYELEEGQQLTLPAVSEERCLVLVAGR AFISTPSAQFTDIGERMSPFERIKPWAVYVTPQEAVQVKALTKLELAVCTAPGKGTYPTR LIAPQDIDGEARGNGHNQRYVHNILPEDKPADSLLVVEVWTNEGCTSSYPSHKHDTNNPP QESYLEETYYHRLNPEQGFCMQRVYTDDRTLDECMAVYNRDVVMVPKGYHPVATMAGYDS YYLNVMAGPVRKWIFTWEKEHSWINTEYSAISL >gi|296493424|gb|ADTK01000077.1| GENE 6 4533 - 5423 912 296 aa, chain - ## HITS:1 COG:YPO2586 KEGG:ns NR:ns ## COG: YPO2586 COG1082 # Protein_GI_number: 16122799 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Yersinia pestis # 2 296 12 306 306 531 86.0 1e-151 MSVQLGINPLTWTNDDLPSLGAETSLETCLSEGKEAGFAGFELGNKFPREARLLGPILQR HDLQLVSGWYSGRLLERSVEDEIAAVQSHLTLLRELGAKVLVFAEVSGCIHGEQQTPVHL RPRFPQECWKEYGEKLTVFARYTQQQGVQIAYHHHMGTVIESAQDVDNLMLHTGEEVGLL LDTGHLTFAGADPLAVAQRWASRINHVHCKDVRADVLADVKNRKTSFLDAVLSGVFTVPG DGCVDYPPIMQLLKTQDYHGWLVVEAEQDPAIAHPLTYARLGYNNLSRLARDAGLI >gi|296493424|gb|ADTK01000077.1| GENE 7 5435 - 7339 1625 634 aa, chain - ## HITS:1 COG:YPO2585_1 KEGG:ns NR:ns ## COG: YPO2585_1 COG0524 # Protein_GI_number: 16122798 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Yersinia pestis # 2 335 24 357 357 592 86.0 1e-169 MEKQFDVICMGRVAVDLYSQQIGARLEDVSSFAKYLGGSSGNVAYGTARQGLRSSMLARV 
GDEHMGRFLREELNQVGCDTSHLITDKERLTALVLLGIKDRDTFPLIFYRDNCADMAITA SDVDESYIASARCIAITGTHLSHPQTREAVLTALNYARRHGVRTVLDIDYRPVLWGLTSL GDGETRFIVAEQVTRELQEVLHLFDVVVGTEEEFHIAGGSTDTLQALAQVRHVSKATLVC KRGALGCSVYTDAIPPRLDDGLTVTGVCVEVLNVLGAGDAFMSGLLRGYLNDEGWEQACR YANACGALVVSRHGCAPAMPSKIELDDYLSRAALVPRPDIDPRLNHLHRVTTRRREWPEL CVMAFDHRSQLEDMALQCGASLKRIPALKQLILQASREAASSAGLEGKAGLLCDGTFGQD ALNAITGEGWWIGRPIELPGSRPLEMEHGNIGTQLISWPQEHVVKCLVFFHPEDAHGLRL EQEQKIAEVYHACCQSGHELLLEVILPATMPRSDELYLRAISRFYNLGIYPDWWKLPPLS AEGWTALSEIIASRDPHCRGVVILGLDAPAEQLRADFKAAAGQALVKGFAVGRTLFGDAS RAWLKHDIDDAQLVARIRDNYLQLIAWWRERGHA >gi|296493424|gb|ADTK01000077.1| GENE 8 7352 - 8485 1100 377 aa, chain - ## HITS:1 COG:YPO2584 KEGG:ns NR:ns ## COG: YPO2584 COG0673 # Protein_GI_number: 16122797 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Yersinia pestis # 1 377 1 377 377 663 85.0 0 MKEVRIGLIGTGYIGKAHAIAYAQAPTVFNLRGKLVREMVAEVNPALAAERAQAFGFNRS TGDWRVLVADPAIDVVDICSPNHLHKEMALEAIRHGKHVYSEKPLALNAHDAREMVDAAK RAGVKTLVGFNYMKNPTAQLAKEIIARGEIGEVIHFYGTHNEDYMADPLSPIHWHCFKET AGLGALGDLAAHIVNMAQYLVGEIEQVCGDLKIVVPARPAKAGSSEMVTVENEDQAHAMV RFAGGAQGVIETSRVACGRKMGLSYVITGTKGAISFTQERMAEIKLYLHDDPVNRQGFRT LLVGPAHPDYGAFCMGAGHGIGFNDQKTVEVRDLVDGIATDAPMWPDFEEGWKVSRVLDA IALSHQQGLWLNVNDIV >gi|296493424|gb|ADTK01000077.1| GENE 9 8494 - 9522 1230 342 aa, chain - ## HITS:1 COG:YPO2583 KEGG:ns NR:ns ## COG: YPO2583 COG1172 # Protein_GI_number: 16122796 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Yersinia pestis # 1 342 1 342 342 505 92.0 1e-143 MTQMIPKTQVPVKRAGRIDPIAFFERFGVLIFMILLLIFFQSQNSNFFSERNIFNILTEV SIYGIMAVGMTFVILTAGIDLSVGSILAVCAMTAAYVIKGDNFTTVDPNAWGGMSWLIGL GICLTMGTAIGFLHGLGVTRLRLPPFIVTLGGMTIWRGLTLVINDGAPIAGFDQGYRWWG RGELLGISIPIWIFAIVAIGGYLALHKTRWGRFVYAIGGNPEAARLAGVNVKRVLVSVYV LIGCLAGLAGFILSARLGSAEAVAGISFELRVIASVVIGGTSLMGGYGRIGGTIIGSIIM 
GILINGLVLMNVSAYYQQIITGLIIVLAVAFDTYAKSRRGAL >gi|296493424|gb|ADTK01000077.1| GENE 10 9534 - 11081 171 515 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 282 487 17 217 245 70 25 6e-12 MSQPLLKVTDLAKSFSGVWALSSAQLTVGTGEIHALLGENGAGKSTLLKALAGAQPQTRG EIWFNGETLPVDDSPVERQNKGIITIYQEFNLLPNMTVAENMFLGREPRKRNLIVDEKAV NQEAQVILDYLQLNVAPTTPVARLSVAQQQMVEIARALTLNARLIIMDEPSAALSDSEVE SLHRVVRELKNRGVSIIYVTHRLHEVFQLCDRFTVFQDGRFTGTGAVTETNVEQLIRLMV GRDVAFNRRPASETHHEDKPVRLSVKGLSREKPPLDPHGIALHDISFHVHAGEVLGIAGL VGAGRTEVARCLFGADAFTTGSFELDGVPYKPRDPLYALEQGLALVPEDRKKEGAVLGLS IRDNLSLSCLSSLLQWRWFVNTRKEDDLIESYRQALQIKMVNSTQEVRKLSGGNQQKVIL ARCMALNPRVLIVDEPTRGIDVGTKSEVHQVLFDMAKKGVAVIVISSDLPEVMAVSDRII TLSEGRLTGEIHGDDANEERLMTMMAINHNALNAA >gi|296493424|gb|ADTK01000077.1| GENE 11 11133 - 12062 1159 309 aa, chain - ## HITS:1 COG:YPO2581 KEGG:ns NR:ns ## COG: YPO2581 COG1879 # Protein_GI_number: 16122794 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Yersinia pestis # 1 308 1 308 309 506 85.0 1e-143 MKKLILAALIAMTSGAAIAENEQIVFSTPNLAMPFEVHMQRTAVKAAKEMGVKLQVLDGQ GSSPKQVADLENAITRGAQGFIVSPNDVNAVSSAIDEIQDTKLPVVTLDRSVDSQKKVPH FGANNYKGGQAIGDFVKTKFPNGAEIILLTGQPGSSSNIERTKGIRDSLKAGGDMYKIVA DQTGNWMRSEGMRIVESVLPSLPKRPQVILSANDDMALGAIEALQSQGVQPGEILVTGFD AVPEALARVRDGWLAVTADQRPGFAVTTAMSQLVANVREKKAITGADYPPTLITKENLQQ AERIAEAGN >gi|296493424|gb|ADTK01000077.1| GENE 12 12094 - 13080 957 328 aa, chain - ## HITS:1 COG:AGl1682 KEGG:ns NR:ns ## COG: AGl1682 COG0673 # Protein_GI_number: 15890957 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 319 5 320 330 312 52.0 5e-85 MFNIALLGAGRIGQVHAANIASHSATTLWSVVDPNQEFATRLATQYQARQQSLDEAMTDP NVHAVLIASATDTHADLIEMAARYGKAIFCEKPVHLDLARVRDCLKVVKEYDVPLFIGFN RRFDPQFRRVKTDTQAGRIGKPESLLIISRDPSPPPAEYVRVSGGMFRDMTIHDFDMARF 
IMGEEPVSVYAQGSNLVDPAIGEAGDIDTALIVLKYASGAMATIVNSRRSSYGYDQRLEL HGSEGLLCAGNILENQVQHYGKQGCTSALPEHFFLQRYKSAYAAEWEHFVAVLRGEAVPD CSGDDGERALYLADKALESLRCQREIVL Prediction of potential genes in microbial genomes Time: Mon May 16 15:12:09 2011 Seq name: gi|296493423|gb|ADTK01000078.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont233.2, whole genome shotgun sequence Length of sequence - 11022 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 5, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 71 - 1996 1343 ## COG3962 Acetolactate synthase 2 1 Op 2 . - CDS 2008 - 3513 1369 ## COG1012 NAD-dependent aldehyde dehydrogenases - Prom 3587 - 3646 8.4 + Prom 3446 - 3505 2.7 3 2 Op 1 . + CDS 3616 - 3807 72 ## gi|300922095|ref|ZP_07138236.1| hypothetical protein HMPREF9548_00377 4 2 Op 2 . + CDS 3804 - 4664 669 ## COG1737 Transcriptional regulators - Term 4946 - 4974 -0.3 5 3 Tu 1 . - CDS 4999 - 5511 55 ## gi|300902695|ref|ZP_07120656.1| putative M-agglutinin - Prom 5551 - 5610 1.9 - Term 5765 - 5806 3.1 6 4 Op 1 . - CDS 6046 - 6465 174 ## SPAB_03303 hypothetical protein 7 4 Op 2 10/0.000 - CDS 6481 - 8991 833 ## COG3188 P pilus assembly protein, porin PapC - Prom 9107 - 9166 5.6 8 4 Op 3 . - CDS 9237 - 10004 388 ## COG3121 P pilus assembly protein, chaperone PapD 9 4 Op 4 . - CDS 10082 - 10276 112 ## gi|13236678|gb|AAK16196.1| AfaA + Prom 10587 - 10646 5.4 10 5 Tu 1 . 
+ CDS 10751 - 10924 85 ## gi|300902700|ref|ZP_07120661.1| hypothetical protein HMPREF9536_00859 Predicted protein(s) >gi|296493423|gb|ADTK01000078.1| GENE 1 71 - 1996 1343 641 aa, chain - ## HITS:1 COG:YPO2578 KEGG:ns NR:ns ## COG: YPO2578 COG3962 # Protein_GI_number: 16122793 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase # Organism: Yersinia pestis # 1 641 8 648 648 1072 83.0 0 MQTERMTMAQALVRFLNQQYVSVDGKETPFVEGVATIFGHGNVLGIGQALEQDPGHLQVM QGCNEQGIAHMATGFAKQHRRQRIFAVTSSVGPGAANMVTAAATATANRIPLLLLPGDIY ACRQPDPVLQQIEQYHDLSISTNDCFRPVSRYWDRINRPEQLMSAMLSAMRTLTDPANTG AVTICLPQDVQGEAWDYPLSFFARRVHHIERRPPDPQRLAQAHALISRKRRPLVVCGGGV RYSGAHEALREFVEVLQLPFAETQAGKGALVSDHPLNLGGVGVTGGMAANQLAPQADLVI GIGTRLTDFTTGSKALFSHPDVEFLLLNVAEFDSLKLDATALIADARTGLTALTHALKDY RSGWGEAIAHAKSAWHDECCRLWERQWHPDDVPEVAGHLDAQLMEYSETLNTHLTQTRVL GLINQHIEDNAILVGAAGSLPGDLQRLWQVKTPDSYHLEYGYSCMGYEVAAAIGAKLAKP QQPVYAMVGDGSYMMLHSELQTAVQEGIKVTILLFDNASFGCINNLQMGHGMGSFGTENR YRNPKTGQLDGPLVKVDFAQNAESYGCRAWRVHNEKSLLAALEASRAHPGPTLLDIKVLP KTMTHDYASWWRTGNAQVAESPAVRQAAEQTQAQVKKARQY >gi|296493423|gb|ADTK01000078.1| GENE 2 2008 - 3513 1369 501 aa, chain - ## HITS:1 COG:YPO2577 KEGG:ns NR:ns ## COG: YPO2577 COG1012 # Protein_GI_number: 16122792 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Yersinia pestis # 1 501 1 508 508 879 84.0 0 METVANFIHGECVTGSGQRIQAIFNPATGEQIRQVVMSTAQETEQAIAAAQLAFPAWARF APLKRARVMFRFKALLEENMERLARIISEEHGKVFSDAVGELTRGLEVVEFACGIPHLQK GEHSANVGTGVDSHSLMQPLGVCAGITPFNFPAMVPMWMFPIALATGNTFVLKPSEKDPS LALALAKLLKEAGLPDGVFNVVQGDKESVDVLLTDPRVQAVSFVGSTPIAEYMYLTASAH GKRCQALGGAKNHCILMPDADMDMATNAIMGAAYGAAGERCMALSVVVAVGDKTADELCA RLEKKIASLRVGPGLNQTPENEMGPLISAAHRSKVLGYIDSGEAQGANLRVDGRGLRVKG HEQGYFVGPTLFDNVTPEMTIYKEEIFGPVLSVVRVADYASAVDLINRHEYGNGSAIFTR DGGCARRFCEEVQAGMVGVNVPIPVPMAFHCFGGWKRSIFGALNVHGNDGVRFYTRMKTI TARWPEMQQETAAAFSMPTLG >gi|296493423|gb|ADTK01000078.1| GENE 3 3616 - 3807 72 63 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|300922095|ref|ZP_07138236.1| ## NR: gi|300922095|ref|ZP_07138236.1| hypothetical protein HMPREF9548_00377 [Escherichia coli MS 182-1] # 1 63 1 63 63 122 100.0 8e-27 MIIVIKDGFLCEQPASLTQPELLGVSGEEMTEASVSFLLPGGWPTFKRRGFAFRIMRLDR GIT >gi|296493423|gb|ADTK01000078.1| GENE 4 3804 - 4664 669 286 aa, chain + ## HITS:1 COG:YPO2576 KEGG:ns NR:ns ## COG: YPO2576 COG1737 # Protein_GI_number: 16122791 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Yersinia pestis # 1 271 1 271 284 426 88.0 1e-119 MTTATSLNELQKQIRDRYDSLSKRLQQVSRYVLDNTNSVAFDTVAVIAERADVPPSTLIR FANAFDFTGFNEMKQLFRMHLVEETASYSDRARLFREMESEAVPETPLDILQEFASSNSQ ALQQLAARAEPEMLEHAVQLLAYADTIYIAGLRRSFSIAAYLTYALSHLECRPILLDGMG GMLREQISRIKACDIVVSISFSPYAEETVMVSKTAASVGARQIVITDSQISPLATFSDLC FVVKEAQVDAFRSQSATLCLVQSLVVALAYRLGDKKHNNTQENSNQ >gi|296493423|gb|ADTK01000078.1| GENE 5 4999 - 5511 55 170 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|300902695|ref|ZP_07120656.1| ## NR: gi|300902695|ref|ZP_07120656.1| putative M-agglutinin [Escherichia coli MS 84-1] # 1 170 1 170 170 303 100.0 3e-81 MNLKKIAIASSVIAGITMALTCHAVTVTATHTVESDAEFTIDWVDAGPTTTNAKDGEVWG HLDMTQTRGTPTLGRLSNPQGETSPGPLKAPFRFTGPNGHTARAYLDSYGTAIHNYAGDN LANGLKVGSGTGDDPFVVGTTSRLTAKIFGDQTLVPGVYRTTFELTTWTD >gi|296493423|gb|ADTK01000078.1| GENE 6 6046 - 6465 174 139 aa, chain - ## HITS:1 COG:no KEGG:SPAB_03303 NR:ns ## KEGG: SPAB_03303 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Paratyphi_B # Pathway: not_defined # 4 132 15 144 156 77 27.0 2e-13 MKKIQIVCSGIVLVVISSLAQAVELSLNTSDGRSGELKDGTKVATGRIICRGTYTSFHIW MNSRQMGNIPGHYIILGRHDSHNEMRVRLDGAGWLPSVSDGQGMVSTGIPEQHTFDVVID GNQLLGPDEYILSVSGECS >gi|296493423|gb|ADTK01000078.1| GENE 7 6481 - 8991 833 836 aa, chain - ## HITS:1 COG:STM0301 KEGG:ns NR:ns ## COG: STM0301 COG3188 # Protein_GI_number: 16763684 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Salmonella typhimurium 
LT2 # 14 835 21 835 836 775 50.0 0 MAACTLLYTGNTEARTYSFDASMLKGGGKGVDLTLFEEGAQLPGIYPVDIILNGSRVDSR DMAFRTEKDAEGKTYLKTCLTREMLARYGVMTEEYPELFHGDDGKEADACAELSVIPQAT ETYQFASQQLLLSIPQVALRPPLRGIAPEALWDDGIPAFLLNWQANTSRSEYRGYGKSVT DNYWVTLEPGINLGPWRVRNLTTWNRSSGQPGQWESAYIRAERGINSLRSRLTLGEDYTP SDIFDSVPFRGVMLGSDESMVPYNQREFAPVVRGIARTQARIEVRQNGYLIDSRTVAPGA FALTDLPLTGSGGDLQVTVQESDGTAQVLTVPYTTPAIALREGYMKYSIAGGEYRSSDDA VEHSPLGQMSVMYGLPWGLTAFGGAQMSSHYQSAALGLGWSMGRLGAVSVDGIHSRGQQK GRDTETGETWRLRYNKSFELTNTDLTAARYQYTSSGFHTLSDVLSTFRNDNFRAYSYSGD RSRRTTLRLNQSLGSLGYMSLYGSRDEFRHNRQKQDSFGMSYGTSWKNISWYVNWSRNYS TNAYYQRGHIEDSINLWMNIPLGQWIGGRDNDINVTTQTQRTTGQNTWYETGLNGRAFDR RLYWDVSERIAPGSENNVDSSRLNLRWFGTYGELAGMYSYSSHIRQMSAGISGGMVLHNE GITLGQKGEDTVALVVAPGVNGASVGSLPGVRTDTRGYTLVNHVSPYQENLITLDPTTFP ENVEVSQTDTRVVPTKGAVVRAKFITRVGGRALITLSRSDGSRLPFGAVVTEEGKKGQMP GGAGVVGDNGEVYLSGLDETGQLKVQWGRNSSCRANYRLPEEKEATGIFLVRTVCM >gi|296493423|gb|ADTK01000078.1| GENE 8 9237 - 10004 388 255 aa, chain - ## HITS:1 COG:STM0300 KEGG:ns NR:ns ## COG: STM0300 COG3121 # Protein_GI_number: 16763683 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Salmonella typhimurium LT2 # 24 254 16 237 237 234 49.0 1e-61 MKIRHLAALVGILAGLPASGVMAAGNQESTKTTSKVFSLHLGTTRVIYNLDSSGASLTVM NDLDYPMLVQSEVLAENQKDPAPFVVTPPLFRLDALQSSRLRIVRTGGGFPEDRESLQWL CVKGIPPKHDDRWAEEKGADKKKADKATLQVNLSVSSCVKLFVRPPAVKGRPEDAGGKVE WTKSGNKLKGNNPTPFYINISELSVGGKEVKEHRYISPYSSYEYEIPSGASGKVSWTVIT DYGGKSKPFESELKS >gi|296493423|gb|ADTK01000078.1| GENE 9 10082 - 10276 112 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|13236678|gb|AAK16196.1| ## NR: gi|13236678|gb|AAK16196.1| AfaA [Escherichia coli] # 1 64 64 127 127 129 100.0 5e-29 MGYTRREACEQHGVSQGYLSGALVRIQHLSQTVNNLLPYYVSGLAMNHEEEGDRSVKRGG DWRV >gi|296493423|gb|ADTK01000078.1| GENE 10 10751 - 10924 85 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|300902700|ref|ZP_07120661.1| ## NR: gi|300902700|ref|ZP_07120661.1| 
hypothetical protein HMPREF9536_00859 [Escherichia coli MS 84-1] # 1 57 1 57 57 110 100.0 3e-23 MKLQEGKNITGGNKSGQKNDCGIFFFRQKIIVVLYTSAASPDGFTWLKLTTTRVRMV Prediction of potential genes in microbial genomes Time: Mon May 16 15:12:35 2011 Seq name: gi|296493422|gb|ADTK01000079.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont237.1, whole genome shotgun sequence Length of sequence - 9000 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 8, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 25 - 291 328 ## ECO26_0314 hypothetical protein 2 1 Op 2 . - CDS 306 - 509 255 ## ECO26_0313 hypothetical protein - Prom 529 - 588 8.4 - Term 537 - 578 5.5 3 2 Op 1 . - CDS 595 - 1623 991 ## APECO1_1718 hypothetical protein 4 2 Op 2 . - CDS 1642 - 2004 153 ## ECO26_0311 hypothetical protein 5 2 Op 3 . - CDS 2001 - 2213 201 ## ECO26_0310 hypothetical protein - Term 2475 - 2522 0.5 6 3 Op 1 . - CDS 2559 - 2897 295 ## ECO26_0309 hypothetical protein 7 3 Op 2 . - CDS 2899 - 3309 218 ## ECO26_0308 hypothetical protein - Prom 3363 - 3422 4.2 - Term 3357 - 3395 -0.2 8 4 Op 1 . - CDS 3439 - 3717 143 ## APECO1_1722 hypothetical protein 9 4 Op 2 . - CDS 3729 - 4421 310 ## APECO1_1723 hypothetical protein 10 4 Op 3 . - CDS 4473 - 4640 199 ## gi|300902712|ref|ZP_07120672.1| conserved domain protein 11 4 Op 4 . - CDS 4660 - 4848 166 ## gi|300902713|ref|ZP_07120673.1| hypothetical protein HMPREF9536_00871 - Prom 4952 - 5011 4.7 12 5 Tu 1 . - CDS 6930 - 7133 90 ## ECO26_0302 hypothetical protein - Prom 7339 - 7398 7.6 - Term 7539 - 7605 -0.8 13 6 Tu 1 . - CDS 7620 - 8171 62 ## gi|300902716|ref|ZP_07120676.1| conserved domain protein - Prom 8191 - 8250 7.7 + Prom 8486 - 8545 3.7 14 7 Tu 1 . + CDS 8566 - 8724 73 ## gi|300902717|ref|ZP_07120677.1| hypothetical protein HMPREF9536_00875 + Term 8906 - 8945 4.5 15 8 Tu 1 . 
- CDS 8791 - 8943 153 ## ECO26_0301 hypothetical protein Predicted protein(s) >gi|296493422|gb|ADTK01000079.1| GENE 1 25 - 291 328 88 aa, chain - ## HITS:1 COG:no KEGG:ECO26_0314 NR:ns ## KEGG: ECO26_0314 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 74 1 74 609 110 87.0 2e-23 MLEINTSKIKNTVTFSVDKDSLKKAKDSITGLKEFAENIKPAKLRFDNVTKGYKKAQSEV DKITKKQAQADKANAKAQLAAQRVVARG >gi|296493422|gb|ADTK01000079.1| GENE 2 306 - 509 255 67 aa, chain - ## HITS:1 COG:no KEGG:ECO26_0313 NR:ns ## KEGG: ECO26_0313 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 67 1 67 67 123 100.0 2e-27 MEVILKSKNGVHVHLDANDSRGLLNLKYLCSLLDVPYEGVKARMFRLNESIDQALHHFLS KEGDKND >gi|296493422|gb|ADTK01000079.1| GENE 3 595 - 1623 991 342 aa, chain - ## HITS:1 COG:no KEGG:APECO1_1718 NR:ns ## KEGG: APECO1_1718 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 342 1 342 342 625 94.0 1e-178 MILGNDYVDLAPLFSAHSTRNYLLSTLDFTDSVGVTSHKVAVSQLVETNESLFNKETSRF SSEHNVTKREQGKEYLIEIPYFLREDLIKPQDVQGKRKPGTDFQETLTDIYADYVAKHNV AFQRTKESVLAASLFSGKTYTPKTDDVLIEWGKLFNVSAMKATVNASSTDTTKIFKEFDQ IATDIIEKAQSQAAAVERIVVFCKPEAFSAIRFSAGMANAFQYVSPLEEGNVVYQRRDLL PGVTAFTIPGTNIDVIKLVDPLHLAHMTADAVAIPKFAKGSNVYQNIYGAASSTFELINA DPTEVYSYSYESSRGDAVNVVTENSQMVVNHGVGFSVQITVK >gi|296493422|gb|ADTK01000079.1| GENE 4 1642 - 2004 153 120 aa, chain - ## HITS:1 COG:no KEGG:ECO26_0311 NR:ns ## KEGG: ECO26_0311 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 120 1 120 120 214 97.0 8e-55 MIIDLDTLFPVRKTFTDIVSYCTDPFASVESKVFASLPADVECGDLITSTGAKYESGDDI YVVMSEFVTAGANKPVDVLRSNAGLVCIKADALNAVSEAAKAALIKKGFQLEGFHSVFTS >gi|296493422|gb|ADTK01000079.1| GENE 5 2001 - 2213 201 70 aa, chain - ## HITS:1 COG:no KEGG:ECO26_0310 NR:ns ## KEGG: ECO26_0310 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 70 1 70 70 98 98.0 7e-20 
MLFLNDKEQIIKYRDEMLKINPQITEMVAKYAGCSVEEVERCVEKYFSPSSPSTPSLNEL IKLKAKEITK >gi|296493422|gb|ADTK01000079.1| GENE 6 2559 - 2897 295 112 aa, chain - ## HITS:1 COG:no KEGG:ECO26_0309 NR:ns ## KEGG: ECO26_0309 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 112 1 111 111 193 95.0 1e-48 MNKNTYDTIYSLINYYEDDYLLPLNRAELEAYKENTPAALNEAFKHWDLAVNAFEHLSKR VEMLCKRENAYLTADQLFELSNWIEGIESDVRYVGDGLVELAQRLGATITEE >gi|296493422|gb|ADTK01000079.1| GENE 7 2899 - 3309 218 136 aa, chain - ## HITS:1 COG:no KEGG:ECO26_0308 NR:ns ## KEGG: ECO26_0308 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 136 1 136 136 272 98.0 3e-72 MAKRKNNVVKKIGDSASLLSKPKSILRRKDFKELVTLSKQNSAPGEWKTEIIEHSPSVPC GDEFNALQEILSSTPGVFWKPRKRKEYIVDSSDLRKYQILGFEDYNHYVGYLATNGLNNL VPEFQILDNADHYGDF >gi|296493422|gb|ADTK01000079.1| GENE 8 3439 - 3717 143 92 aa, chain - ## HITS:1 COG:no KEGG:APECO1_1722 NR:ns ## KEGG: APECO1_1722 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 92 1 92 92 180 96.0 1e-44 MIQYLVKNQVDRIQCNDTGKRIYETLAYLYKGKPTPLKYSDVLHRAGCSEDGLKFWLKQL SNFGVIEIKELSFSTFNLKRLNREIDFIYSTL >gi|296493422|gb|ADTK01000079.1| GENE 9 3729 - 4421 310 230 aa, chain - ## HITS:1 COG:no KEGG:APECO1_1723 NR:ns ## KEGG: APECO1_1723 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 230 1 229 229 417 95.0 1e-115 MINQLTFTKHYDTFDNVSKIYSDKFPQGKDLDLLHIVLYFRFLSYQENNLNCYESHETLA KIFKSSASTIKRKINDLKEMGLLETSPHPDPYISSLIYNALPLTDAHITPPGESSLSDLS EAEEAQEQPKGHKQPSFDVLDDDWDAPLPWETEETPVSSSEKVANDNEEDVLENFASLII REKHRRTGGSSFIDFANSLAYRHGLKRPNGIEAYFAKKHPKVYEDFDIPF >gi|296493422|gb|ADTK01000079.1| GENE 10 4473 - 4640 199 55 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|300902712|ref|ZP_07120672.1| ## NR: gi|300902712|ref|ZP_07120672.1| conserved domain protein [Escherichia coli MS 84-1] # 1 55 1 55 55 84 100.0 1e-15 
MSETKKPIPRTYLHVDPEIFKVLFAEAKKRQIMVSDLMLEIITEAAENIKQKKGK >gi|296493422|gb|ADTK01000079.1| GENE 11 4660 - 4848 166 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|300902713|ref|ZP_07120673.1| ## NR: gi|300902713|ref|ZP_07120673.1| hypothetical protein HMPREF9536_00871 [Escherichia coli MS 84-1] # 1 62 1 62 62 108 100.0 9e-23 MKWFTPEHVISAFKKGELTRHQVVMNRNMARSRGYPERAACFNEALKIIDELRKNEKESE TE >gi|296493422|gb|ADTK01000079.1| GENE 12 6930 - 7133 90 67 aa, chain - ## HITS:1 COG:no KEGG:ECO26_0302 NR:ns ## KEGG: ECO26_0302 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 67 61 127 127 137 98.0 1e-31 MIGFMSDQMLETAPRLTRAVSDETSVYAGAGQNMGQNPFNIIIVICTKEHIERLELMYQG KSDGFFT >gi|296493422|gb|ADTK01000079.1| GENE 13 7620 - 8171 62 183 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|300902716|ref|ZP_07120676.1| ## NR: gi|300902716|ref|ZP_07120676.1| conserved domain protein [Escherichia coli MS 84-1] # 1 183 1 183 183 316 100.0 3e-85 MRRSAFVRMMIFTTLFVGGTSTAVAKNCKKGIPCGNSCIAVGKTCRIGSYAPSSYKNHSY SYPSSSIRSSQSNNSRKTEGVKTTPAAKNYLCQYSIASIENGRLGALRVQGYAQVTLYAD SFKANRLNGTYLISPKLNANGKFMMADDKSKVYAYDSWLLNFAISDRITRTTEQWDRCKL NQK >gi|296493422|gb|ADTK01000079.1| GENE 14 8566 - 8724 73 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|300902717|ref|ZP_07120677.1| ## NR: gi|300902717|ref|ZP_07120677.1| hypothetical protein HMPREF9536_00875 [Escherichia coli MS 84-1] # 1 52 1 52 52 65 100.0 1e-09 MHLLSKHLHTIHFLVIDVFIVAISYRICIASSEEMDAVAIIAQAMAMLTKRL >gi|296493422|gb|ADTK01000079.1| GENE 15 8791 - 8943 153 50 aa, chain - ## HITS:1 COG:no KEGG:ECO26_0301 NR:ns ## KEGG: ECO26_0301 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 50 1 50 52 61 96.0 9e-09 MKNIAAIKRNNRKIHARKFLSTPEGKAWLERKQRENEERKLLSELKWLKD Prediction of potential genes in microbial genomes Time: Mon May 16 15:13:22 2011 Seq name: gi|296493421|gb|ADTK01000080.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont251.1, whole genome shotgun sequence Length 
of sequence - 10341 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 3, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 129 - 188 4.2 1 1 Tu 1 . + CDS 320 - 1591 1130 ## COG0814 Amino acid permeases - Term 1581 - 1611 1.7 2 2 Op 1 44/0.000 - CDS 1621 - 2625 802 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 3 2 Op 2 44/0.000 - CDS 2622 - 3605 575 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 4 2 Op 3 49/0.000 - CDS 3616 - 4518 1270 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 5 2 Op 4 38/0.000 - CDS 4528 - 5547 355 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 - Prom 5733 - 5792 2.8 - Term 5599 - 5634 -0.1 6 2 Op 5 . - CDS 5855 - 7462 1981 ## COG0747 ABC-type dipeptide transport system, periplasmic component - Prom 7628 - 7687 3.2 - TRNA 8208 - 8284 85.1 # Pro CGG 0 0 - Term 8163 - 8202 -0.7 7 3 Tu 1 . 
- CDS 8376 - 10067 1532 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase - Prom 10241 - 10300 8.2 Predicted protein(s) >gi|296493421|gb|ADTK01000080.1| GENE 1 320 - 1591 1130 423 aa, chain + ## HITS:1 COG:yhjV KEGG:ns NR:ns ## COG: yhjV COG0814 # Protein_GI_number: 16131411 # Func_class: E Amino acid transport and metabolism # Function: Amino acid permeases # Organism: Escherichia coli K12 # 1 423 1 423 423 702 99.0 0 MQHNTLSKHNQKLPFTRYDFGWVLLCIGMAIGAGTVLMPVQIGLKGIWVFITAAIIAYPA TWVVQDIYLKTLSESDSCNDYTDIISHYLGKNWGIFLGVIYFLMIIHGIFIYSLSVVFDS ASYLKTFGLTDADLSQSLLYKVAIFAVLVAIASGGERLLFKISGPMVVVKVGIIVVFGFA MIPHWNFANITAFPQASVFFRDVLLTIPFCFFSAVFIQVLNPMNIAYRKREADKVLATRL ALRTHRISYITLIAVILFFAFSFTFSISHEEAVSAFEQNISALALAAQVIPGHIIHITST VLNIFAVLTAFFGIYLGFHEAIKGIILNLLSRIIDTKKINSRMLTLAICAFIVITLTIWV SFRVSVLVFFQLGSPLYGIVSCLIPFFLIYKVAQLEKLRGFKAWLILLYGILLCLSPLLK LIE >gi|296493421|gb|ADTK01000080.1| GENE 2 1621 - 2625 802 334 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 3 328 2 329 329 313 47 3e-85 MSTQEATSQQPLLQAIDLKKHYPVKKGMFAPERLVKALDGVSFNLERGKTLAVVGESGCG KSTLGRLLTMIETPTGGELYYQGQDLLKHDPQAQKLRRQKIQIVFQNPYGSLNPRKKVGQ ILEEPLLINTSFSKEQRREKALSMMAKVGLKTEHYDRYPHMFSGGQRQRIAIARGLMLDP DVVIADEPVSALDVSVRAQVLNLMMDLQQELGLSYVFISHDLSVVEHIADEVMVMYLGRC VEKGTKDQIFNNPRHPYTQALLSATPRLNPDDRRERIKLTGELPSPLNPPPGCAFNARCR RRFGPCTQLQPQLKDYGGQLVACFAVDQDENPQR >gi|296493421|gb|ADTK01000080.1| GENE 3 2622 - 3605 575 327 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 3 315 11 324 329 226 40 7e-59 MALLNVDKLSVHFGDESAPFRAVDRISYSVKQGEVVGIVGESGSGKSVSSLAIMGLIDYP GRVMAEKLEFNGQDLQRISEKERRNLVGAEVAMIFQDPMTSLNPCYTVGFQIMEAIKVHQ GGNKSTRRQRAIDLLNQVGIPDPASRLDVYPHQLSGGMSQRVMIAMAIACRPKLLIADEP TTALDVTIQAQIIELLLELQQKENMALVLITHDLALVAEAAHKIIVMYAGQVVETGDAHA IFHAPRHPYTQALLRALPEFAQDKERLASLPGVVPGKYDRPNGCLLNPRCPYATDRCRAE EPALNMLADGRQSKCHYPLDDAGRPTL 
>gi|296493421|gb|ADTK01000080.1| GENE 4 3616 - 4518 1270 300 aa, chain - ## HITS:1 COG:ECs4422 KEGG:ns NR:ns ## COG: ECs4422 COG1173 # Protein_GI_number: 15833676 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Escherichia coli O157:H7 # 1 300 1 300 300 486 100.0 1e-137 MSQVTENKVISAPVPMTPLQEFWHYFKRNKGAVVGLVYVVIVLFIAIFANWIAPYNPAEQ FRDALLAPPAWQEGGSMAHLLGTDDVGRDVLSRLMYGARLSLLVGCLVVVLSLIMGVILG LIAGYFGGLVDNIIMRVVDIMLALPSLLLALVLVAIFGPSIGNAALALTFVALPHYVRLT RAAVLVEVNRDYVTASRVAGAGAMRQMFINIFPNCLAPLIVQASLGFSNAILDMAALGFL GMGAQPPTPEWGTMLSDVLQFAQSAWWVVTFPGLAILLTVLAFNLMGDGLRDALDPKLKQ >gi|296493421|gb|ADTK01000080.1| GENE 5 4528 - 5547 355 339 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 66 333 43 310 320 141 29 2e-33 MLQFILRRLGLVIPTFIGITLLTFAFVHMIPGDPVMIMAGERGISPERHAQLLAELGLDK PMWQQYLHYIWGVMHGDLGISMKSRIPVWEEFVPRFQATLELGVCAMIFATAVGIPVGVL AAVKRGSIFDHTAVGLALTGYSMPIFWWGMMLIMLVSVHWNLTPVSGRVSDMVFLDDSNP LTGFMLIDTAIWGEDGNFIDAVAHMILPAIVLGTIPLAVIVRMTRSSMLEVLGEDYIRTA RAKGLTRMRVIIVHALRNAMLPVVTVIGLQVGTLLAGAILTETIFSWPGLGRWLIDALQR RDYPVVQGGVLLVATMIILVNLLVDLLYGVVNPRIRHKK >gi|296493421|gb|ADTK01000080.1| GENE 6 5855 - 7462 1981 535 aa, chain - ## HITS:1 COG:dppA KEGG:ns NR:ns ## COG: dppA COG0747 # Protein_GI_number: 16131416 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Escherichia coli K12 # 1 535 1 535 535 1091 100.0 0 MRISLKKSGMLKLGLSLVAMTVAASVQAKTLVYCSEGSPEGFNPQLFTSGTTYDASSVPL YNRLVEFKIGTTEVIPGLAEKWEVSEDGKTYTFHLRKGVKWHDNKEFKPTRELNADDVVF SFDRQKNAQNPYHKVSGGSYEYFEGMGLPELISEVKKVDDNTVQFVLTRPEAPFLADLAM DFASILSKEYADAMMKAGTPEKLDLNPIGTGPFQLQQYQKDSRIRYKAFDGYWGTKPQID TLVFSITPDASVRYAKLQKNECQVMPYPNPADIARMKQDKSINLMEMPGLNVGYLSYNVQ KKPLDDVKVRQALTYAVNKDAIIKAVYQGAGVSAKNLIPPTMWGYNDDVQDYTYDPEKAK 
ALLKEAGLEKGFSIDLWAMPVQRPYNPNARRMAEMIQADWAKVGVQAKIVTYEWGEYLKR AKDGEHQTVMMGWTGDNGDPDNFFATLFSCAASEQGSNYSKWCYKPFEDLIQPARATDDH NKRVELYKQAQVVMHDQAPALIIAHSTVFEPVRKEVKGYVVDPLGKHHFENVSIE >gi|296493421|gb|ADTK01000080.1| GENE 7 8376 - 10067 1532 563 aa, chain - ## HITS:1 COG:yhjW KEGG:ns NR:ns ## COG: yhjW COG2194 # Protein_GI_number: 16131417 # Func_class: R General function prediction only # Function: Predicted membrane-associated, metal-dependent hydrolase # Organism: Escherichia coli K12 # 1 563 12 574 574 1150 99.0 0 MRYIKSITQQKLSFLLAIYIGLFMNGAVFYRRFGSYAHDFTVWKGVSAVVELAATVLVTF FLLRLLSLFGRRSWRILASLVVLFSAGASYYMTFLNVVIGYGIIASVMTTDIDLSKEVVG LNFILWLIAVSALPLILIWNNRCRYTLLRQLRTPGQRIRSLAVVVLAGIMVWAPIRLLDI QQKKVERATGVDLPSYGGVVANSYLPSNWLSALGLYAWARVDESSDNNSLLNPAKKFTYQ APQNVDDTYVVFIIGETTRWDHMGIFGYERNTTPKLAQEKNLAAFRGYSCDTATKLSLRC MFVRQGGAEDNPQRTLKEQNIFAVLKQLGFSSDLYAMQSEMWFYSNTMADNIAYREQIGA EPRNRGKPVDDMLLVDEMQQSLGRNPDGKHLIILHTKGSHFNYTQRYPRSFAQWKPECIG VDSGCTKAQMINSYDNSVTYVDHFISSVIDQVRDKKAIVFYAADHGESINEREHLHGTPR ELAPPEQFRVPMMVWMSDKYLENPANAQAFAQLKKEADMKVPRRHVELYDTIMGCLGYTS PDGGINENNNWCHIPQTKVAAAN Prediction of potential genes in microbial genomes Time: Mon May 16 15:13:25 2011 Seq name: gi|296493420|gb|ADTK01000081.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont251.2, whole genome shotgun sequence Length of sequence - 6459 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 77 - 1285 1557 ## COG0477 Permeases of the major facilitator superfamily - Prom 1323 - 1382 3.5 2 2 Tu 1 . - CDS 1514 - 2212 536 ## COG5571 Autotransporter protein or domain, integral membrane beta-barrel involved in protein secretion - Prom 2289 - 2348 3.1 + Prom 2157 - 2216 6.5 3 3 Op 1 5/0.000 + CDS 2370 - 2933 356 ## PROTEIN SUPPORTED gi|157164512|ref|YP_001467500.1| 50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12) 4 3 Op 2 . 
+ CDS 2930 - 3370 470 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Term 3433 - 3468 6.1 5 4 Tu 1 . - CDS 3339 - 5672 1904 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing - Prom 5790 - 5849 5.3 + Prom 5698 - 5757 2.4 6 5 Tu 1 . + CDS 5826 - 6458 202 ## PROTEIN SUPPORTED gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 Predicted protein(s) >gi|296493420|gb|ADTK01000081.1| GENE 1 77 - 1285 1557 402 aa, chain - ## HITS:1 COG:yhjX KEGG:ns NR:ns ## COG: yhjX COG0477 # Protein_GI_number: 16131418 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 402 1 402 402 678 99.0 0 MTPSNYQRTRWLTLIGTIITQFALGSVYTWSLFNGALSAKLDAPVSQVAFSFGLLSLGLA ISSSVAGKLQERFGVKRVTMASGILLGLGFFLTAHSNNLMMLWLSAGVLVGLADGAGYLL TLSNCVKWFPERKGLISAFAIGSYGLGSLGFKFIDTQLLETVGLEKTFVIWGAIALVMIV FGATLMKDAPKQEVKTSNGVVEKDYTLAESMRKPQYWMLAVMFLTACMSGLYVIGVAKDI AQSLAHLDVVSAANAVTVISIANLSGRLVLGILSDKIARIRVITIGQVISLVGMAALLFA PLNAVTFFAAIACVAFNFGGTITVFPSLVSEFFGLNNLAKNYGVIYLGFGIGSIFGSIIA SLFGGFYVTFYVIFALLILSLALSTTIRQPEQKMLREAHGSL >gi|296493420|gb|ADTK01000081.1| GENE 2 1514 - 2212 536 232 aa, chain - ## HITS:1 COG:ECs4433 KEGG:ns NR:ns ## COG: ECs4433 COG5571 # Protein_GI_number: 15833687 # Func_class: N Cell motility # Function: Autotransporter protein or domain, integral membrane beta-barrel involved in protein secretion # Organism: Escherichia coli O157:H7 # 1 232 3 234 234 426 99.0 1e-119 MIIKKSGGRWQLSLLASVVISAFFLNTAYAWQQEYIVDTQPGHSTERYTWDSDHQPDYND ILSQRIQSSQRALGLEVNLAEETPVDVTSSMSMGWNFPLYEQVTTGPVAALHYDGTTTSM YNEFGDSTTTLTDPLWHASVSSLGWRVDSRLGDLRPWAQISYNQQFGENIWKAQSGLSRM TATNQNGNWLDVTVGADMLLNQNIAAYAALTQAENTTNNSDYLYTMGVSARF >gi|296493420|gb|ADTK01000081.1| GENE 3 2370 - 2933 356 187 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164512|ref|YP_001467500.1| 
50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12) [Campylobacter concisus 13826] # 1 181 1 181 185 141 38 1e-33 MERCGWVSQDPLYIAYHDNEWGVPETDSKKLFEMICLEGQQAGLSWITVLKKRENYRACF HQFDPVKVAAMQEEDVERLVQDAGIIRHRGKIQAIIGNARAYLQMEQNGEPFADFVWSFV NHQPQVTQATTLSEIPTSTSASDALSKALKKRGFKFVGTTICYSFMQACGLVNDHVVGCC CYPGNKP >gi|296493420|gb|ADTK01000081.1| GENE 4 2930 - 3370 470 146 aa, chain + ## HITS:1 COG:yiaC KEGG:ns NR:ns ## COG: yiaC COG0454 # Protein_GI_number: 16131421 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Escherichia coli K12 # 1 146 1 146 146 280 97.0 7e-76 MIREAQRSELPAILELWLESTTWGHPFIKANYWRDCIPLVRDAYLANAQNWVWEEDGKLL GFVSIMEGRFLAAMFVAPKAVRRGIGKALMQYVQQRYPHLMLEVYQKNQPAIDFYRAQGF HIVDCAWQDETQLPTWIMSWPVVQTL >gi|296493420|gb|ADTK01000081.1| GENE 5 3339 - 5672 1904 777 aa, chain - ## HITS:1 COG:bisC KEGG:ns NR:ns ## COG: bisC COG0243 # Protein_GI_number: 16131422 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 39 777 1 739 739 1526 99.0 0 MANSSSRYSVLTAAHWGPMLVETDGETVFSSRGALATGMENSLQSAVRDQVHSNTRVRFP MVRKGFLASPENPQGIRGQDEFVRVSWDEALDLIHQQHKRIREAYGPASIFAGSYGWRSN GVLHKASTLLQRYMALAGGYTGHLGDYSTGAAQAIMPYVVGGSEVYQQQTSWPLVLEHSD VVVLWSANPLNTLKIAWNASDEQGLSYFSALRDSGKKLICIDPMRSETVDFFGDKMEWVA PHMGTDVALMLGIAHTLVENGWHDEAFLARCTTGYAVFASYLLGESDGIAKTAEWAAEIC GVGAAKIRELAAIFHQNTTMLMAGWGMQRQQFGEQKHWMIVTLAAMLGQIGTPGGGFGLS YHFANGGNPTRRSAVLSSMQGSLPGGCDAVDKIPVARIVEALENPGGAYQHNGMNRHFPD IRFIWWAGGANFTHHQDTNRLIRAWQKPELVVISECFWTAAAKHADIVLPATTSFERNDL TMTGDYSNQHLVPMKQVVPPRYEARNDFDVFAELSERWEKGGYARFTEGKSQLQWLETFY NVARQRGASQQVELPPFAEFWQANQLIEMPENPDSERFIRFADFCRDPLAHPLKTASGKI EIFSQRIADYGYPDCPGHPMWLEPDEWQGNAEPEQLQVLSAHPAHRLHSQLNYSSLRELY AVANREPVTIHPDDAQARGITEGDMVRVWNSRGQILAGAVISEGIKPGVICIHEGAWPDL DLTADGICKNGAVNVLTKDLPSSRLGNGCAGNTALAWLEKYNGPELTLTAFEPPASS >gi|296493420|gb|ADTK01000081.1| 
GENE 6 5826 - 6458 202 211 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 [Kordia algicida OT-1] # 113 211 243 341 347 82 36 8e-16 MKKRVYLIAAVVSGALAVSGCTTNPYTGEREAGKSAIGAGLGSLVGAGIGALSSSKKDRG KGALIGAAAGAALGGGVGYYMDVQEAKLRDKMRGTGVSVTRSGDNIILNMPNNVTFDSSS ATLKPAGANTLTGVAMVLKEYPKTAVNVIGYTDSTGGHDLNMRLSQQRADSVASALITQG VDASRIRTQGLGPANPIASNSTAEGKAQNRR Prediction of potential genes in microbial genomes Time: Mon May 16 15:13:26 2011 Seq name: gi|296493419|gb|ADTK01000082.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont251.3, whole genome shotgun sequence Length of sequence - 3188 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 68 - 1042 870 ## COG1052 Lactate dehydrogenase and related dehydrogenases + Term 1059 - 1100 3.0 - Term 1052 - 1082 0.2 2 2 Tu 1 . - CDS 1092 - 1802 735 ## SBO_3556 hypothetical protein - Prom 1961 - 2020 7.7 3 3 Op 1 4/0.000 + CDS 2237 - 2527 273 ## COG2944 Predicted transcriptional regulator + Term 2554 - 2585 4.1 + Prom 2589 - 2648 4.7 4 3 Op 2 . 
+ CDS 2809 - 3021 295 ## COG1278 Cold shock proteins + Term 3048 - 3083 4.9 Predicted protein(s) >gi|296493419|gb|ADTK01000082.1| GENE 1 68 - 1042 870 324 aa, chain + ## HITS:1 COG:ECs4438 KEGG:ns NR:ns ## COG: ECs4438 COG1052 # Protein_GI_number: 15833692 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Escherichia coli O157:H7 # 1 324 5 328 328 645 100.0 0 MKPSVILYKALPDDLLQRLQEHFTVHQVANLSPQTVEQNAAIFAEAEGLLGSNENVDAAL LEKMPKLRATSTISVGYDNFDVDALTARKILLMHTPTVLTETVADTLMALVLSTARRVVE VAERVKAGEWTASIGPDWYGTDVHHKTLGIVGMGRIGMALAQRAHFGFNMPILYNARRHH KEAEERFNARYCDLDTLLQESDFVCLILPLTDETHHLFGAEQFAKMKSSAIFINAGRGPV VDENALIAALQKGEIHAAGLDVFEQEPLSVDSPLLSMANVVAVPHIGSATHETRYGMAAC AVDNLIDALQGKVEKNCVNPHVAD >gi|296493419|gb|ADTK01000082.1| GENE 2 1092 - 1802 735 236 aa, chain - ## HITS:1 COG:no KEGG:SBO_3556 NR:ns ## KEGG: SBO_3556 # Name: yiaF # Def: hypothetical protein # Organism: S.boydii # Pathway: not_defined # 1 236 41 276 276 432 99.0 1e-120 MATGKSCSRWFAPLAALLMVVSLSGCFDKEGDQRKAFIDFLQNTVMRSGERLPTLTADQK KQFGPFVSDYAILYGYSQQVNQAMDSGLRPVVDSVNAIRVPQDYVTQSGPLREMNGSLGV LAQQLQNAKLQADAAHSALKQTDDLKPVFDQAFTKVVTTPADALQPLIPAAQTFTQQLVM VGDYIAQQGTQVSFVANGIQFPTSQQASEYNKLIAPLPAQHQAFNQAWTTAVTATQ >gi|296493419|gb|ADTK01000082.1| GENE 3 2237 - 2527 273 96 aa, chain + ## HITS:1 COG:ECs4440 KEGG:ns NR:ns ## COG: ECs4440 COG2944 # Protein_GI_number: 15833694 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 96 1 96 96 164 100.0 5e-41 MEYKDPMHELLSSLEQIVFKDETQKITLTHRTTSCTEIEQLRKGTGLKIDDFARVLGVSV AMVKEWESRRVKPSSAELKLMRLIQANPALSKQLME >gi|296493419|gb|ADTK01000082.1| GENE 4 2809 - 3021 295 70 aa, chain + ## HITS:1 COG:ECs4441 KEGG:ns NR:ns ## COG: ECs4441 COG1278 # Protein_GI_number: 15833695 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Escherichia coli O157:H7 # 1 70 1 
70 70 122 100.0 1e-28 MSGKMTGIVKWFNADKGFGFITPDDGSKDVFVHFSAIQNDGYKSLDEGQKVSFTIESGAK GPAAGNVTSL Prediction of potential genes in microbial genomes Time: Mon May 16 15:13:39 2011 Seq name: gi|296493418|gb|ADTK01000083.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont251.4, whole genome shotgun sequence Length of sequence - 28877 bp Number of predicted genes - 27, with homology - 27 Number of transcription units - 13, operones - 7 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 20 - 232 158 ## ECSE_3831 small toxic polypeptide - Prom 437 - 496 5.8 - Term 442 - 485 4.9 2 2 Op 1 19/0.000 - CDS 498 - 2567 2857 ## COG0751 Glycyl-tRNA synthetase, beta subunit 3 2 Op 2 . - CDS 2577 - 3488 1058 ## COG0752 Glycyl-tRNA synthetase, alpha subunit - Prom 3520 - 3579 3.4 4 2 Op 3 . - CDS 3583 - 3879 296 ## ECO103_4674 putative outer membrane lipoprotein - Prom 3933 - 3992 4.5 + Prom 3866 - 3925 3.8 5 3 Tu 1 . + CDS 4057 - 5052 764 ## COG3274 Uncharacterized protein conserved in bacteria 6 4 Op 1 1/1.000 - CDS 5094 - 5531 466 ## COG4682 Predicted membrane protein 7 4 Op 2 3/0.833 - CDS 5577 - 5918 299 ## COG4682 Predicted membrane protein - Prom 5995 - 6054 5.8 - Term 6036 - 6084 1.1 8 5 Op 1 11/0.000 - CDS 6087 - 7541 977 ## COG1070 Sugar (pentulose and hexulose) kinases - Term 7574 - 7611 -0.3 9 5 Op 2 . - CDS 7613 - 8935 1369 ## COG2115 Xylose isomerase - Prom 9032 - 9091 4.7 + Prom 9023 - 9082 6.9 10 6 Op 1 11/0.000 + CDS 9301 - 10293 1015 ## COG4213 ABC-type xylose transport system, periplasmic component + Term 10297 - 10337 5.1 11 6 Op 2 11/0.000 + CDS 10371 - 11912 231 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 12 6 Op 3 4/0.333 + CDS 11890 - 13071 996 ## COG4214 ABC-type xylose transport system, permease component + Term 13088 - 13133 8.0 13 7 Tu 1 . + CDS 13149 - 14327 1010 ## COG1609 Transcriptional regulators - Term 14379 - 14420 7.2 14 8 Tu 1 . 
- CDS 14435 - 15007 484 ## COG2992 Uncharacterized FlgJ-related protein + Prom 15279 - 15338 5.6 15 9 Tu 1 . + CDS 15579 - 17609 1654 ## COG0366 Glycosidases + Prom 17697 - 17756 2.4 16 10 Tu 1 . + CDS 17787 - 19040 1226 ## COG3977 Alanine-alpha-ketoisovalerate (or valine-pyruvate) aminotransferase - Term 19143 - 19188 8.2 17 11 Op 1 . - CDS 19191 - 19664 174 ## COG1142 Fe-S-cluster-containing hydrogenase components 2 18 11 Op 2 . - CDS 19766 - 20614 699 ## COG1414 Transcriptional regulator - Prom 20765 - 20824 6.0 + Prom 20695 - 20754 4.5 19 12 Op 1 3/0.833 + CDS 20815 - 21813 1000 ## COG2055 Malate/L-lactate dehydrogenases 20 12 Op 2 2/1.000 + CDS 21825 - 22292 468 ## COG2731 Beta-galactosidase, beta subunit + Term 22302 - 22340 -0.7 + Prom 22308 - 22367 1.5 21 13 Op 1 11/0.000 + CDS 22410 - 22883 352 ## COG3090 TRAP-type C4-dicarboxylate transport system, small permease component 22 13 Op 2 9/0.000 + CDS 22883 - 24163 754 ## PROTEIN SUPPORTED gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 23 13 Op 3 3/0.833 + CDS 24176 - 25162 385 ## PROTEIN SUPPORTED gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 24 13 Op 4 3/0.833 + CDS 25166 - 26662 1449 ## COG1070 Sugar (pentulose and hexulose) kinases 25 13 Op 5 9/0.000 + CDS 26659 - 27321 791 ## COG0269 3-hexulose-6-phosphate synthase and related proteins 26 13 Op 6 8/0.000 + CDS 27314 - 28174 955 ## COG3623 Putative L-xylulose-5-phosphate 3-epimerase 27 13 Op 7 . 
+ CDS 28168 - 28863 571 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases Predicted protein(s) >gi|296493418|gb|ADTK01000083.1| GENE 1 20 - 232 158 70 aa, chain - ## HITS:1 COG:no KEGG:ECSE_3831 NR:ns ## KEGG: ECSE_3831 # Name: not_defined # Def: small toxic polypeptide # Organism: E.coli_SE11 # Pathway: not_defined # 1 70 1 70 70 132 98.0 4e-30 MPEHHKVSLLLVGNQHRGLGMPQKYRLLSLIVICFTLLFFTWMIRDSLCELHIKQGSYEL AAFLACNLKE >gi|296493418|gb|ADTK01000083.1| GENE 2 498 - 2567 2857 689 aa, chain - ## HITS:1 COG:ECs4442 KEGG:ns NR:ns ## COG: ECs4442 COG0751 # Protein_GI_number: 15833696 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase, beta subunit # Organism: Escherichia coli O157:H7 # 1 689 1 689 689 1321 99.0 0 MSEKTFLVEIGTEELPPKALRSLAESFAANFTAELDNAGLAHGTVQWFAAPRRLALKVAN LAEAQPDREIEKRGPAIAQAFDAEGKPSKAAEGWARGCGITVDQAERLTTDKGEWLLYRA HVKGESTEALLPNMVATSLAKLPIPKLMRWGASDVHFVRPVHTVTLLLGDKVIPATILGI QSDRVIRGHRFMGEPEFTIDNADQYPEILRERGKVIADYEERKAKIKADAEEAARKIGGN ADLSESLLEEVASLVEWPVVLTAKFEEKFLAVPAEALVYTMKGDQKYFPVYANDGKLLPN FIFVANIESKDPQQIISGNEKVVRPRLADAEFFFNTDRKKRLEDNLPRLQTVLFQQQLGT LRDKTDRIQALAGWIAEQIGADVNHATRAGLLSKCDLMTNMVFEFTDTQGVMGMHYARHD GEAEDVAVALNEQYQPRFAGDDLPSNPVACALAIADKMDTLAGIFGIGQHPKGDKDPFAL RRAALGVLRIIVEKNLNLDLQTLTEEAVRLYGDKLTNANVVDDVIDFMLGRFRAWYQDEG YTVDTIQAVLARRPTRPADFDARMKAVSHFRTLEAAAALAAANKRVSNILAKSDEVLSDR VNASTLKEPEEIKLAMQVVVLRDKLEPYFAEGRYQDALVELAELREPVDAFFDKVMVMVD DKELRLNRLTMLEKLRELFLRVADISLLQ >gi|296493418|gb|ADTK01000083.1| GENE 3 2577 - 3488 1058 303 aa, chain - ## HITS:1 COG:ECs4443 KEGG:ns NR:ns ## COG: ECs4443 COG0752 # Protein_GI_number: 15833697 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase, alpha subunit # Organism: Escherichia coli O157:H7 # 1 303 1 303 303 632 99.0 0 MQKFDTRTFQGLILTLQDYWARQGCTIVQPLDMEVGAGTSHPMTCLRALGPEPMAAAYVQ PSRRPTDGRYGENPNRLQHYYQFQVVIKPSPDNIQELYLGSLKELGMDPTIHDIRFVEDN 
WENPTLGAWGLGWEVWLNGMEVTQFTYFQQVGGLECKPVTGEITYGLERLAMYIQGVDSV YDLVWSDGPLGKTTYGDVFHQNEVEQSTYNFEYADVDFLFTCFEQYEKEAQQLLALENPL PLPAYERILKAAHSFNLLDARKAISVTERQRYILRIRTLTTAVAEAYYASREALGFPMCN KDK >gi|296493418|gb|ADTK01000083.1| GENE 4 3583 - 3879 296 98 aa, chain - ## HITS:1 COG:no KEGG:ECO103_4674 NR:ns ## KEGG: ECO103_4674 # Name: ysaB # Def: putative outer membrane lipoprotein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 98 2 99 99 194 100.0 1e-48 MMNAFFPAMALMVLVGCSTPSPVQKAQRVKVDPLRSLNMEALCKDQAAKRYNTGEQKIDV TAFEQFQGSYEMRGYTFRKEQFVCSFDADGHFLHLSMR >gi|296493418|gb|ADTK01000083.1| GENE 5 4057 - 5052 764 331 aa, chain + ## HITS:1 COG:yiaH KEGG:ns NR:ns ## COG: yiaH COG3274 # Protein_GI_number: 16131432 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 331 1 331 331 605 99.0 1e-173 MQPKIYWIDNLRGIACLMVVMIHTTTWYVTNAHSVSPVTWDIANVLNSASRVSVPLFFMI SGYLFFGERSAQPRHFLRIGLCLIFYSAIALLYIALFTSINMELALKNLLQKPVFYHLWF FFAIAVIYLVSPLIQVKNVGGKMLLVLMVVIGIIANPNTVPQKIDGFEWLPINLYINGDT FYYILYGMLGRAIGMMDTQHKALSWVSAALFATGVFIISRGTLYELQWRGNFADTWYLYC GPMVFICAIALLTLVKNTLDTRTIRGLGLISRHSLGIYGFHALIIHALRTRGIELKNWPI LDIIWIFCATLAASLLLSMLVQRIDRNRLVS >gi|296493418|gb|ADTK01000083.1| GENE 6 5094 - 5531 466 145 aa, chain - ## HITS:1 COG:ECs4445 KEGG:ns NR:ns ## COG: ECs4445 COG4682 # Protein_GI_number: 15833699 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 145 2 146 146 268 100.0 2e-72 MDNKISTYSPAFSIVSWIALVGGIVTYLLGLWNAEMQLNEKGYYFAVLVLGLFSAASYQK TVRDKYEGIPTTSIYYMTCLTVFIISVALLMVGLWNATLLLSEKGFYGLAFFLSLFGAVA VQKNIRDAGINPPKETQVTQEEYSE >gi|296493418|gb|ADTK01000083.1| GENE 7 5577 - 5918 299 113 aa, chain - ## HITS:1 COG:ECs4446 KEGG:ns NR:ns ## COG: ECs4446 COG4682 # Protein_GI_number: 15833700 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 113 5 117 117 190 100.0 7e-49 
MKTSKTVAKLLFVVGALVYLVGLWISCPLLSGKGYFLGVLMTATFGNYAYLRAEKLGQLD NFFTHICQLVALITIGLLFIGVLNAPINAYEMVIYPIAFFVCLFGQMRLFRSV >gi|296493418|gb|ADTK01000083.1| GENE 8 6087 - 7541 977 484 aa, chain - ## HITS:1 COG:xylB KEGG:ns NR:ns ## COG: xylB COG1070 # Protein_GI_number: 16131435 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Escherichia coli K12 # 1 484 1 484 484 919 99.0 0 MYIGIDLGTSGVKVILLNEQGEVVASQTEKLTVSRPHPLWSEQDPEQWWQATDRAMKALG DQHSLQDVKALGIAGQMHGATLLDAQQRVLRPAILWNDGRCAQECTLLEARVPQSRVITG NLMMPGFTAPKLLWVQRHEPEIFRQIDKVLLPKDYLHLRMTGEFASDMSDAAGTMWLDVA KRDWSDVMLQACDLSRDQMPALYEGSEITGALLPEVAKAWGMATVPVVAGGGDNAAGAVG VGMVDANQAMLSLGTSGVYFAVSEGFLSKPESAVHSFCHALPQRWHLMSVMLSAASCLDW VAKLTGLCNVPALIAAAQQADESAEPVWFLPYLSGERTPHNNPQAKGVFFGLTHQHGPNE LARAVLEGVGYALADGMDVVHACGIKPQSVTLIGGGARSEYWRQMLADISGQQLDYRTGG DVGPALGAARLAQIAANPEKSLIELLPQLPLEQSHLPDAQRYAAYQPRRETFRRLYQQLL PLMA >gi|296493418|gb|ADTK01000083.1| GENE 9 7613 - 8935 1369 440 aa, chain - ## HITS:1 COG:ECs4448 KEGG:ns NR:ns ## COG: ECs4448 COG2115 # Protein_GI_number: 15833702 # Func_class: G Carbohydrate transport and metabolism # Function: Xylose isomerase # Organism: Escherichia coli O157:H7 # 1 440 1 440 440 903 99.0 0 MQAYFDQLDRVRYEGSKSSNPLAFRHYNPDELVLGKRMEEHLRFAACYWHTFCWNGADMF GVGAFNRPWQQPGEALALAKRKADVAFEFFHKLHVPFYCFHDVDVSPEGASLKEYINNFA QMVDVLAGKQEESGVKLLWGTANCFTNPRYGAGAATNPDPEVFSWAATQVVTAMEATHKL GGENYVLWGGREGYETLLNTDLRQEREQLGRFMQMVVEHKHKIGFQGTLLIEPKPQEPTK HQYDYDAATVYGFLKQFGLEKEIKLNIEANHATLAGHSFHHEIATAIALGLFGSVDANRG DAQLGWDTDQFPNSVEENALVMYEILKAGGFTTGGLNFDAKVRRQSTDKYDLFYGHIGAM DTMALALKIAARMIEDGELDKRIAQRYSGWNSELGQQILKGQMSLADLAKYAQEHNLSPV HQSGRQEQLENLVNHYLFDK >gi|296493418|gb|ADTK01000083.1| GENE 10 9301 - 10293 1015 330 aa, chain + ## HITS:1 COG:xylF KEGG:ns NR:ns ## COG: xylF COG4213 # Protein_GI_number: 16131437 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type xylose transport system, periplasmic component # Organism: Escherichia 
coli K12 # 1 330 1 330 330 578 99.0 1e-165 MKIKNILLTLCTSLLLTNVAAHAKEVKIGMAIDDLRLERWQKDRDIFVKKAESLGAKVFV QSANGNEETQMSQIENMINRGVDVLVIIPYNGQVLSNVVKEAKQEGIKVLAYDRMINDAD IDFYISFDNEKVGELQAKALVDIVPQGNYFLMGGSPVDNNAKLFRAGQMKVLKPYVDSGK IKVVGDQWVDGWLPENALKIMENALTANNNKIDAVVASNDATAGGAIQALSAQGLSGKVA ISGQDADLAGIKRIAAGTQTMTVYKPITLLANTAAEIAVELGNGQEPKADTSLNNGLKDV PSRLLTPIDVNKNNIKDTVIKDGFHKESEL >gi|296493418|gb|ADTK01000083.1| GENE 11 10371 - 11912 231 513 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 254 483 7 240 329 93 29 1e-18 MPYLLEMKNITKTFGSVKAIDNVSLRLNAGEIVSLCGENGSGKSTLMKVLCGIYPHGSYE GEIIFAGEEIQASHIRDTERKGIAIIHQELALVKELTVLENIFLGNEITHNGIMDYDLMT LRCQKLLAQVSLSISPDTRVGDLGLGQQQLVEIAKALNKQVRLLILDEPTASLTEQETSV FLDIIRDLQQHGIACIYISHKLNEVKAISDTICVIRDGQHIGTRDAAGMSEDDIITMMVG RELTALYPNEPHTTGDEILRIEHLTAWHPVNRHIKRVNDVSFSLKRGEILGIAGLVGAGR TETIQCLFGVWPGQWEGKIYIDGKQVDIRNCQQAIAQGIAMVPEDRKRDGIVPVMAVGKN ITLAALNKFTGGISQLDDAAEQKCILESIQQLKVKTSSPDLAIGRLSGGNQQKAILARCL LLNPRILILDEPTRGIDIGAKYEIYKLINQLVQQGIAVIVISSELPEVLGLSDRVLVMHE GKLKANLINHNLTQEQVMEAALRSEHHVEKQSV >gi|296493418|gb|ADTK01000083.1| GENE 12 11890 - 13071 996 393 aa, chain + ## HITS:1 COG:ECs4451 KEGG:ns NR:ns ## COG: ECs4451 COG4214 # Protein_GI_number: 15833705 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type xylose transport system, permease component # Organism: Escherichia coli O157:H7 # 1 393 1 393 393 596 100.0 1e-170 MSKSNPSEVKLAVPTSGGFSGLKSLNLQVFVMIAAIIAIMLFFTWTTDGAYLSARNVSNL LRQTAITGILAVGMVFVIISAEIDLSVGSMMGLLGGVAAICDVWLGWPLPLTIIVTLVLG LLLGAWNGWWVAYRKVPSFIVTLAGMLAFRGILIGITNGTTVSPTSAAMSQIGQSYLPAS TGFIIGALGLMAFVGWQWRGRMRRQALGLQSPASTAVVGRQALTAIIVLGAIWLLNDYRG VPTPVLLLTLLLLGGMFMATRTAFGRRIYAIGGNLEAARLSGINVERTKLAVFAINGLMV AIAGLILSSRLGAGSPSAGNIAELDAIAACVIGGTSLAGGVGSVAGAVMGAFIMASLDNG MSMMDVPTFWQYIVKGAILLLAVWMDSATKRRS >gi|296493418|gb|ADTK01000083.1| GENE 13 13149 - 14327 1010 392 aa, chain + ## HITS:1 COG:xylR_1 KEGG:ns NR:ns ## COG: 
xylR_1 COG1609 # Protein_GI_number: 16131440 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 265 1 265 265 542 100.0 1e-154 MFTKRHRITLLFNANKAYDRQVVEGVGEYLQASQSEWDIFIEEDFRARIDKIKDWLGDGV IADFDDKQIEQALADVDVPIVGVGGSYHLAESYPPVHYIATDNYALVESAFLHLKEKGVN RFAFYGLPESSGKRWATEREYAFRQLVAEEKYRGVVYQGLETAPENWQHAQNRLADWLQT LPPQTGIIAVTDARARHILQVCEHLHIPVPEKLCVIGIDNEELTRYLSRVALSSVAQGAR QMGYQAAKLLHRLLDKEEMPLQRILVPPVRVIERRSTDYRSLTDPAVIQAMHYIRNHACK GIKVDQVLDAVGISRSNLEKRFKEEVGETIHAMIHAEKLEKARSLLISTTLSINEISQMC GYPSLQYFYSVFKKAYDTTPKEYRDVNSEVML >gi|296493418|gb|ADTK01000083.1| GENE 14 14435 - 15007 484 190 aa, chain - ## HITS:1 COG:ECs4453 KEGG:ns NR:ns ## COG: ECs4453 COG2992 # Protein_GI_number: 15833707 # Func_class: R General function prediction only # Function: Uncharacterized FlgJ-related protein # Organism: Escherichia coli O157:H7 # 1 190 85 274 274 348 100.0 4e-96 MPYITSQNAAITAERNWLISKQYQGQWSPAERARLKDIAKRYKVKWSGNTRKIPWNTLLE RVDIIPTSMVATMAAAESGWGTSKLARNNNNLFGMKCMKGRCTNAPGKVKGYSQFSSVKE SVSAYVTNLNTHPAYSSFRKSRAQLRKADQEVTATAMIHKLKGYSTKGKSYNNYLFAMYQ DNQRLIAAHM >gi|296493418|gb|ADTK01000083.1| GENE 15 15579 - 17609 1654 676 aa, chain + ## HITS:1 COG:ECs4454 KEGG:ns NR:ns ## COG: ECs4454 COG0366 # Protein_GI_number: 15833708 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Escherichia coli O157:H7 # 1 676 1 676 676 1361 99.0 0 MKLAACFLTLLPGFAVAASWTSPGFPAFSEQGTGTFVSHAQLPKGTRPLTLNFDQQCWQP ADAIKLNQMLSLQPCSNTPPQWRLFRDGKYTLQIDTRSGTPTLMISIQNAAEPVANLVRE CPKWDGLPLTLDVSATFPEGAAVRDYYSQQIAIVKNGQITLQPAATSNGLLLLERAETDA PAPFDWHNATVYFVLTDRFENGDPSNDQSYGRHKDGMAEIGTFHGGDLRGLTNKLDYLQQ LGVNALWISAPFEQIHGWVGGGTKGDFPHYAYHGYYTQDWTNLDANMGNEADLRTLVDSA HQRGIRILFDVVMNHTGYATLADMQEYQFGALYLSGDEVKKTLGERWSDWKPAAGQTWHS FNDYINFSDKTGWDKWWGKNWIRTDIGDYDNPGFDDLTMSLAFLPDIKTESTTASGLPVF YKNKTDTHAKVIEGFTPRDYLTHWLSQWVRDYGIDGFRVDTAKHVELPAWQQLKTEASAA LREWKKANPDKALDDKPFWMTGEAWGHGVMQSDYYRHGFDAMINFDYQEQAAKAVDCLAQ 
MDTTWQQMAEKLQGFNVLSYLSSHDTRLFREGGDKAAELLLLAPGAVQIFYGDESSRPFG PTGSDPLQGTRSDMNWQDVSGKSAANVAHWQKISQFRARHPAIGAGKQTTLSLKQGYGFV REHGDDKVLVIWAGQQ >gi|296493418|gb|ADTK01000083.1| GENE 16 17787 - 19040 1226 417 aa, chain + ## HITS:1 COG:ECs4455 KEGG:ns NR:ns ## COG: ECs4455 COG3977 # Protein_GI_number: 15833709 # Func_class: E Amino acid transport and metabolism # Function: Alanine-alpha-ketoisovalerate (or valine-pyruvate) aminotransferase # Organism: Escherichia coli O157:H7 # 1 417 1 417 417 875 100.0 0 MTFSLFGDKFTRHSGITLLMEDLNDGLRTPGAIMLGGGNPAQIPEMQDYFQTLLTDMLES GKATDALCNYDGPQGKTELLTLLAGMLREKLGWDIEAQNIALTNGSQSAFFYLFNLFAGR RADGRVKKVLFPLAPEYIGYADAGLEEDLFVSARPNIELLPEGQFKYHVDFEHLHIGEET GMICVSRPTNPTGNVITDEELLKLDALANQHGIPLVIDNAYGVPFPGIIFSEARPLWNPN IVLCMSLSKLGLPGSRCGIIIANEKIITAITNMNGIISLAPGGIGPAMMCEMIKRNDLLR LSETVIKPFYYQRVQETIAIIRRYLPEDRCLIHKPEGAIFLWLWFKDLPITTEQLYQRLK ARGVLMVPGHNFFPGLDKPWPHTHQCMRMNYVPEPEKIEAGVKILAEEIERAWAESH >gi|296493418|gb|ADTK01000083.1| GENE 17 19191 - 19664 174 157 aa, chain - ## HITS:1 COG:ECs4456 KEGG:ns NR:ns ## COG: ECs4456 COG1142 # Protein_GI_number: 15833710 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 2 # Organism: Escherichia coli O157:H7 # 1 157 1 157 157 260 99.0 8e-70 MNRFIIADATKCIGCRTCEVACAVSHQENQDCAALSPDEFISRIRVIKDHSWTTAVACHQ CEDAPCANVCPVNAISREHGHIFVEQTRCIGCKSCMLACPFGAMEVVSSRKKARAIKCDL CWHRETGPACVEACPTKALQCMDVEKVQRHRLRQQPV >gi|296493418|gb|ADTK01000083.1| GENE 18 19766 - 20614 699 282 aa, chain - ## HITS:1 COG:yiaJ KEGG:ns NR:ns ## COG: yiaJ COG1414 # Protein_GI_number: 16131445 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 282 1 282 282 560 98.0 1e-160 MGKEVMGKKENEMAQEKERPAGSQSLFRGLMLIEILSNYPNGCPLAHLSELAGLNKSTVH RLLQGLQSCGYVTTAPAAGSYRLTTKFIAVGQKALSSLNIIHIAAPHLEALNIATGETIN FSSREDDHAILIYKLEPTTGMLRTRAYIGQHMPLYCSAMGKIYMAFGHPDYVKSYWENHQ HEIQPLTRNTITELPAMFDELAHIRESGAAMDREENELGVSCIAVPVFNIHGRVPYAVSI 
SLSTSRLKQVGEKNLLKPLRETAQAISNELGFTVRDDQGAIT >gi|296493418|gb|ADTK01000083.1| GENE 19 20815 - 21813 1000 332 aa, chain + ## HITS:1 COG:yiaK KEGG:ns NR:ns ## COG: yiaK COG2055 # Protein_GI_number: 16131446 # Func_class: C Energy production and conversion # Function: Malate/L-lactate dehydrogenases # Organism: Escherichia coli K12 # 1 332 1 332 332 672 99.0 0 MKVTFEQLKAAFNRVLISRGVDSETADACAEMFARTTESGVYSHGVNRFPRFIQQLENGD IIPDAQPKRITSLGAIEQWDAQRSIGNLTAKKMMDRAIELAADHGIGLVALRNANHWMRG GSYGWQAAEKGYIGICWTNSIAVMPPWGAKECRIGTNPLIVAIPSTPITMVDMSMSMFSY GMLEVNRLAGRQLPVDGGFDDEGNLTKEPGVIEKNRRILPMGYWKGSGMSIVLDMIATLL SDGASVAEVTQDNSDEYGISQIFIAIEVDKLIDGPTRDAKLQRIMDYVTTAERADENQAI RLPGHEFTTLLAENRRNGITVDDSVWAKIQAL >gi|296493418|gb|ADTK01000083.1| GENE 20 21825 - 22292 468 155 aa, chain + ## HITS:1 COG:yiaL KEGG:ns NR:ns ## COG: yiaL COG2731 # Protein_GI_number: 16131447 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Escherichia coli K12 # 1 155 1 155 155 307 97.0 4e-84 MIFGHIAQPNPCRLPAAIEKALDFLRATDFNALEPGVVEIDGKNIYAQIIDLTTREAVEN RPEVHRRYIDIQFLAWGEEKIGIAIDTGNNKVSESLLEQRDIIFYHDSEHESFIEMIPGS YAIFFPQDVHRPGCILQTASEIRKIVVKVALTALN >gi|296493418|gb|ADTK01000083.1| GENE 21 22410 - 22883 352 157 aa, chain + ## HITS:1 COG:yiaM KEGG:ns NR:ns ## COG: yiaM COG3090 # Protein_GI_number: 16131448 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, small permease component # Organism: Escherichia coli K12 # 1 157 1 157 157 283 99.0 1e-76 MKKILEAILVINLAVLSCIVFINIILRYGFQTSILSVDELSRYLFVWLTFIGAIVAFMDN AHVQVTFLVEKLSPAWQRRVALVTHSLILFICGALAWGATLKTIQDWSDYSPILGLPIGL MYAACLPTSLVIAFFELRHLYQLITRSNSLTSPPQGA >gi|296493418|gb|ADTK01000083.1| GENE 22 22883 - 24163 754 426 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 [Algoriphagus sp. 
PR1] # 1 419 3 423 431 295 36 3e-79 IMAVLIFLGCLLGGIAIGLPIAWALLLCGAALMFWLDMFDVQIMAQTLVNGADSFSLLAI PFFVLAGEIMNAGGLSKRIVDLPMKLVGHKPGGLGYVGVLAAMIMASLSGSAVADTAAVA ALLVPMMRSANYPVNRAAGLIASGGIIAPIIPPSIPFIIFGVSSGLSISKLFMAGIAPGM MMGATLMLTWWWQASRLNLPRQQKATMQEIWHSFVSGIWALFLPVIIIGGFRSGLFTPTE AGAVAAFYALFVATVIYREMTFATLWHVLIGAAKTTSVVMFLVASAQVSAWLITIAELPM MVSDLLQPLVDSPRLLFIVIMVAILIVGMVMDLTPTVLILTPVLMPLVKEAGIDPIYFGV MFIINCSIGLITPPIGNVLNVISGVAKLKFDDAVRGVFPYVLVLYSLLVVFVFIPDLIIL PLKWIN >gi|296493418|gb|ADTK01000083.1| GENE 23 24176 - 25162 385 328 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 [Algoriphagus sp. PR1] # 4 328 3 325 328 152 27 2e-36 MKLRSVTYALFIAGLAAFSTSSLAAQSLRFGYETSQTDSQHIAAKKFNDLLQERTKGELK LKLFPDSTLGNAQAMISGVRGGTIDMEMSGSNNFAGLSPVMNLLDVPFLFRDTAHAHKTL DGKVGDDLKASLEGKGLKVLAYWENGWRDVTNSRAPVKTPADLKGLKIRTNNSPMNIAAF KVFGANPIPMPFAEVYTGLETRTIDAQEHPINVVWSAKFFEVQKYLSLTHHAYSPLLVVI NKAKFDGLTPEFQQALISSAQEAGNYQRKLVAEDQQKIIDGMKEAGVEVITDLDRKAFSD ALGTQVRDMFVKDVPQGADLLKAVDEVQ >gi|296493418|gb|ADTK01000083.1| GENE 24 25166 - 26662 1449 498 aa, chain + ## HITS:1 COG:lyxK KEGG:ns NR:ns ## COG: lyxK COG1070 # Protein_GI_number: 16131451 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Escherichia coli K12 # 1 498 1 498 498 1026 98.0 0 MTQYWLGLDCGGSWLKAGLYDREGREAGVQRLPLCALSPQPGWAERDMAELWQCCMAVIR ELLTHSGVSGEQIVGIGISAQGKGLFLLDKNNKPLGNAILSSDRRAMEIVRRWQEDGIPE KLYPLTRQTLWTGHPVSLLRWLKEHEPERYAQIGCVMMTHDYLRWCLTGVKGCEESNISE SNLYNMSLGEYDPCLTDWLGIAEINHALPPVVGSAEICGEITAQTAVLTGLKAGTPVVGG LFDVVSTALCAGIEDEFTLNAVMGTWAVTSGITRGLRDGEAHPYVYGRYVNDGEFIVHEA SPTSSGNLEWFTAQWGEISFDEINQAVASLPKAGGDLFFLPFLYGSNAGLEMTSGFYGMQ AIHTRAHLLQAIYEGVVFSHMTHLNRMRERFTDVHTLRVTGGPAHSDVWMQMLADVSGLR IELPQVEETGCFGAALAARVGTGVYRDFSEAQRDLQHPVRTLLPDMTVHQLYQQKYQRYQ HLIAALQGFHARIKEHIL >gi|296493418|gb|ADTK01000083.1| GENE 25 26659 - 27321 791 220 aa, chain + ## HITS:1 COG:sgbH KEGG:ns NR:ns ## COG: sgbH COG0269 # Protein_GI_number: 
16131452 # Func_class: G Carbohydrate transport and metabolism # Function: 3-hexulose-6-phosphate synthase and related proteins # Organism: Escherichia coli K12 # 1 220 1 220 220 403 99.0 1e-112 MSRPLLQLALDHSSLEAAQRDVTQLKDSVDIVEAGTILCLNEGLGAVKALREQCPDKIIV ADWKVADAGETLAQQAFGAGANWMTIICAAPLATVEKGHAMAQRCGGEIQIELFGNWTLD DARDWHRIGVRQAIYHRGRDAQASGQQWGEADLARMKALSDIGLELSITGGITPADLPLF KDIRVKAFIAGRALAGAANPAQVAGDFHAQIDAIWGGARA >gi|296493418|gb|ADTK01000083.1| GENE 26 27314 - 28174 955 286 aa, chain + ## HITS:1 COG:sgbU KEGG:ns NR:ns ## COG: sgbU COG3623 # Protein_GI_number: 16131453 # Func_class: G Carbohydrate transport and metabolism # Function: Putative L-xylulose-5-phosphate 3-epimerase # Organism: Escherichia coli K12 # 1 286 12 297 297 574 98.0 1e-164 MRNHPLGIYEKALAKDLSWPERLVLAKSCGFDFVEMSVDETDERLSRLDWSAAQRTSLVA AMIETGVGIPSMCLSAHRRFPFGSRDEAVRERAREIMSKAIRLARDLGIRTIQLAGYDVY YEDHDEGTRQRFAEGLAWAVEQAAASQVMLAVEIMDTAFMNSISKWKKWDEMLASPWFTV YPDVGNLSAWGNDVPAELKLGIDRIAAIHLKDTQPVTGQSPGQFRDVPFGEGCVDFVGIF KTLHELNYRGSFLIEMWTEKAKEPVLEIIQARRWIEARMQEAGFIC >gi|296493418|gb|ADTK01000083.1| GENE 27 28168 - 28863 571 231 aa, chain + ## HITS:1 COG:sgbE KEGG:ns NR:ns ## COG: sgbE COG0235 # Protein_GI_number: 16131454 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli K12 # 1 231 1 231 231 463 99.0 1e-130 MLEQLKADVLAANLALPAHHLVTFTWGNVSAVDDTRQWMVIKPSGVEYDVMTADDMVVVE IASGKVVEGSKKPSSDTPTHLALYRRYAEIGGIVHTHSRHATIWSQAGLDLPAWGTTHAD YFYGAIPCTRQMTAEEINGEYEYQTGEVIIETFEERGRSPAQIPAVLVHSHGPFAWGKNA ADAVHNAVVLEECAYMGLFSRQLAPQLPAMQNELLDKHYLRKHGANAYYGQ Prediction of potential genes in microbial genomes Time: Mon May 16 15:13:46 2011 Seq name: gi|296493417|gb|ADTK01000084.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont251.5, whole genome shotgun sequence Length of sequence - 5784 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 
N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 350 - 1090 688 ## COG3713 Outer membrane protein V - Prom 1183 - 1242 8.9 + Prom 1132 - 1191 10.4 2 2 Tu 1 . + CDS 1214 - 2188 673 ## COG0583 Transcriptional regulator - Term 2101 - 2135 -0.7 3 3 Op 1 . - CDS 2185 - 3321 991 ## COG1566 Multidrug resistance efflux pump 4 3 Op 2 . - CDS 3327 - 3650 330 ## EcSMS35_3913 hypothetical protein - Prom 3683 - 3742 8.7 + Prom 3690 - 3749 6.5 5 4 Tu 1 . + CDS 3773 - 3976 98 ## EC55989_4045 conserved hypothetical protein, putative membrane protein + Term 4033 - 4084 -0.6 - Term 4132 - 4186 -0.3 6 5 Tu 1 . - CDS 4195 - 5733 1804 ## COG1012 NAD-dependent aldehyde dehydrogenases Predicted protein(s) >gi|296493417|gb|ADTK01000084.1| GENE 1 350 - 1090 688 246 aa, chain - ## HITS:1 COG:ECs4460 KEGG:ns NR:ns ## COG: ECs4460 COG3713 # Protein_GI_number: 15833714 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein V # Organism: Escherichia coli O157:H7 # 1 246 1 246 246 471 99.0 1e-133 MLINRNIVALFALPFMASATASELSIGAGAAYNESPYRGYNKNTKAIPLISYEGDSFYVR QTTLGFILSQSEKNELSLTASWMPLEFDPADNDDYAMQQLDKRDSTAMAGVAWYHHERWG TVKASAAADVLDNSNGWVGELSVFHKMQIGRLSLTPALGVLYYDENFSDYYYGISESESR RSGLASYSAQDAWVPYVSLTAKYPIGEHVVLMASAGYSELPEEITDSPMIDRNESFTFVT GVSWRF >gi|296493417|gb|ADTK01000084.1| GENE 2 1214 - 2188 673 324 aa, chain + ## HITS:1 COG:yiaU KEGG:ns NR:ns ## COG: yiaU COG0583 # Protein_GI_number: 16131456 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 324 1 324 324 660 99.0 0 MTKLQLKYRELKIISVIAASENISHAATVLGIAQANVSKYLADFESKVGLKVFDRTTRQL MLTPFGTALLPYINDMLDRNEQLNNFIADYKHEKRGRVTIYAPTGIITYLSKHVIDKIKD IGDITLSLKTCNLERNAFYEGVEFPDDCDVLISYAPPKDESLVASFITQYAVTAYASQRY LEKHPISRPDELEHHSCILIDSMMIDDANIWRFNVAGSKEVRDYRVKGNYVCDNTQSALE LARNHLGIVFAPDKSVQSDLQDGTLVPCFQHPYEWWLDLVAIFRKREYQPWRVQYVLDEM LREIRHQLAQSQQLRPEQAAESED >gi|296493417|gb|ADTK01000084.1| GENE 3 2185 - 3321 991 378 aa, chain - ## HITS:1 COG:yiaV KEGG:ns 
NR:ns ## COG: yiaV COG1566 # Protein_GI_number: 16131457 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Escherichia coli K12 # 1 378 1 378 378 726 100.0 0 MDLLIILTYVAFAWAMFKIFKIPVNKWTIPTAALGGIFIVSGLILLMNYNHPYTFKAQKA VISIPVVPQVTGVVIEVTDKKNTLIKKGEVLFRLDPTRYQARVDRLMADIVTAEHKQRAL GAELDEMAANTQQAKATRDKFAKEYQRYARGSQAKVNPFSERDIDVARQNYLAQEASVKS SAAEQKQIQSQLDSLVLGEHSQIASLKAQLAEAKYNLEQTIVRAPSDGYVTQVLIRPGTY AASLPLRPVMVFIPDQKRQIVAQFRQNSLLRLAPGDDAEVVFNALPGKVFSGKLAAISPA VPGGAYQSTGTLQTLNTAPGSDGVIATIELDEHTDLSALPDGIYAQVAVYSDHFSHVSVM RKVLLRMTSWVHYLYLDH >gi|296493417|gb|ADTK01000084.1| GENE 4 3327 - 3650 330 107 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_3913 NR:ns ## KEGG: EcSMS35_3913 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 107 1 107 107 199 99.0 2e-50 MFLDYFALGVLIFVFLVIFYGIIILHDIPYLIAKKRNHPHADAIHVAGWVSLFTLHVIWP FLWIWATLYRPERGWGMQNHDSSVVQIQQRIAGLEKQLADIKSSSAE >gi|296493417|gb|ADTK01000084.1| GENE 5 3773 - 3976 98 67 aa, chain + ## HITS:1 COG:no KEGG:EC55989_4045 NR:ns ## KEGG: EC55989_4045 # Name: not_defined # Def: conserved hypothetical protein, putative membrane protein # Organism: E.coli_55989 # Pathway: not_defined # 1 67 4 70 70 109 98.0 3e-23 MAFQQVEVYHIEIMELLTKLMNNSSKTSTVQIKRIKPSIIYRLLLIGLGTPMVIYGLVCP LTIETRE >gi|296493417|gb|ADTK01000084.1| GENE 6 4195 - 5733 1804 512 aa, chain - ## HITS:1 COG:aldB KEGG:ns NR:ns ## COG: aldB COG1012 # Protein_GI_number: 16131459 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Escherichia coli K12 # 1 512 31 542 542 1071 99.0 0 MTNNPPSAQIKPGEYGFPLKLKARYDNFIGGEWVAPADGEYYQNLTPVTGQLLCEVASSG KRDIDLALDAAHKVKDKWAHTSVQDRAAILFKIADRMEQNLELLATAETWDNGKPIRETS AADVPLAIDHFRYFASCIRAQEGGISEVDSETVAYHFHEPLGVVGQIIPWNFPLLMASWK MAPALAAGNCVVLKPARLTPLSVLLLMEIVGDLLPPGVVNVVNGAGGEIGEYLATSKRIA KVAFTGSTEVGQQIMQYATQNIIPVTLELGGKSPNIFFADVMDEEDAFFDKALEGFALFA 
FNQGEVCTCPSRALVQESIYERFMERAIRRVESIRSGNPLDSVTQMGAQVSHGQLETILN YIDIGKKEGADVLTGGRRKLLEGELKDGYYLEPTILFGQNNMRVFQEEIFGPVLAVTTFK TMEEALELANDTQYGLGAGVWSRNGNLAYKMGRGIQAGRVWTNCYHAYPAHAAFGGYKQS GIGRETHKMMLEHYQQTKCLLVSYSDKPLGLF Prediction of potential genes in microbial genomes Time: Mon May 16 15:13:52 2011 Seq name: gi|296493416|gb|ADTK01000085.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont251.6, whole genome shotgun sequence Length of sequence - 5516 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 13 - 47 5.1 1 1 Tu 1 2/0.000 - CDS 63 - 1214 1339 ## COG1454 Alcohol dehydrogenase, class IV - Prom 1285 - 1344 9.0 - Term 1294 - 1353 13.1 2 2 Op 1 7/0.000 - CDS 1404 - 3248 1970 ## COG3276 Selenocysteine-specific translation elongation factor 3 2 Op 2 4/0.000 - CDS 3245 - 4636 1328 ## COG1921 Selenocysteine synthase [seryl-tRNASer selenium transferase] 4 2 Op 3 . - CDS 4734 - 5342 620 ## COG0625 Glutathione S-transferase - Prom 5365 - 5424 2.9 Predicted protein(s) >gi|296493416|gb|ADTK01000085.1| GENE 1 63 - 1214 1339 383 aa, chain - ## HITS:1 COG:ECs4466 KEGG:ns NR:ns ## COG: ECs4466 COG1454 # Protein_GI_number: 15833720 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Escherichia coli O157:H7 # 1 383 1 383 383 707 99.0 0 MASSTFFIPSVNVIGADSLTDAMNMMADYGFTRTLIVTDSMLTKLGMAGDVQKALEERNI FSVIYDGTQPNPTTENVAAGLKLLKENNCDSVISLGGGSPHDCAKGIALVAANGGDIRDY EGVDRSAKPQLPMIAINTTAGTASEMTRFCIITDEARHIKMAIVDKHVTPLLSVNDSSLM IGMPKSLTAATGMDALTHAIEAYVSIAATPITDACALKAVTMIAENLPLAVEDGSNAKAR EAMAYAQFLAGMAFNNASLGYVHAMAHQLGGFYNLPHGVCNAVLLPHVQVFNSKVAAARL RDCAAAMGVNVTGKNDAEGAEACINAIRELAKKVDIPAGLRDLNVKEEDFAVLATNALKD ACGFTNPIQATHEEIVAIYRAAM >gi|296493416|gb|ADTK01000085.1| GENE 2 1404 - 3248 1970 614 aa, chain - ## HITS:1 COG:selB KEGG:ns NR:ns ## COG: selB COG3276 # Protein_GI_number: 16131461 # Func_class: J Translation, 
ribosomal structure and biogenesis # Function: Selenocysteine-specific translation elongation factor # Organism: Escherichia coli K12 # 1 614 1 614 614 1206 99.0 0 MIIATAGHVDHGKTTLLQAITGVNADRLPEEKKRGMTIDLGYAYWPQPDGRVPGFIDVPG HEKFLSNMLAGVGGIDHALLVVACDDGVMAQTREHLAILQLTGNPMLTVALTKADRVDEA RVDEVERQVKEVLREYGFAEAKLFITAATEGRGIDALREHLLQLPGREHASQHSFRLAID RAFTVKGAGLVVTGTALSGEVKVGDSLWLTGVNKPMRVRALHAQNQPTETAHAGQRIALN IAGDAEKEQINRGDWLLADAPPEPFTRVIVELQTHTPLTQWQPLHIHHAASHVTGRVSLL EDNLAELVFDTPLWLADNDRLVLRDISARNTLAGARVVMLNPPRRGKRKPEYLQWLASLA RAQSDADALSVHLERGAVNLADFAWARQLNGEGMRELLQQPGYIQAGYSLLNAPVAARWQ RKILDTLATYHEQHRDEPGPGRERLRRMALPMEDEALVLLLIEKMRESGDILSHHGWLHL PDHKAGFSEEQQAIWQKAEPLFGDEPWWVRDLAKETGTDEQAMRLTLRQAAQQGIITAIV KDRYYRNDRIVEFANMIRDLDQECGSTCAADFRDRLGVGRKLAIQILEYFDRIGFTRRRG NDHLLRDALLFPEK >gi|296493416|gb|ADTK01000085.1| GENE 3 3245 - 4636 1328 463 aa, chain - ## HITS:1 COG:ECs4468 KEGG:ns NR:ns ## COG: ECs4468 COG1921 # Protein_GI_number: 15833722 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine synthase [seryl-tRNASer selenium transferase] # Organism: Escherichia coli O157:H7 # 1 463 1 463 463 843 100.0 0 MTTETRSLYSQLPAIDRLLRDSSFLSLRDTYGHTRVVELLRQMLDEAREVIRGSQTLPAW CENWAQEVDARLTKEAQSALRPVINLTGTVLHTNLGRALQAEAAVEAVAQAMRSPVTLEY DLDDAGRGHRDRALAQLLCRITGAEDACIVNNNAAAVLLMLAATASGKEVVVSRGELVEI GGAFRIPDVMRQAGCTLHEVGTTNRTHANDYRQAVNENTALLMKVHTSNYSIQGFTKAID EAELVALGKELDVPVVTDLGSGSLVDLSQYGLPKEPMPQELIAAGVSLVSFSGDKLLGGP QAGIIVGKKEMIARLQSHPLKRALRADKMTLAALEATLRLYLHPEALSEKLPTLRLLTRS AEVIQIQAQRLQAPLAAHYGAEFAVQVMPCLSQIGSGSLPVDRLPSAALTFTPHDGRGSH LESLAARWRELPVPVIGRIYDGRLWLDLRCLEDEQRFLEMLLK >gi|296493416|gb|ADTK01000085.1| GENE 4 4734 - 5342 620 202 aa, chain - ## HITS:1 COG:yibF KEGG:ns NR:ns ## COG: yibF COG0625 # Protein_GI_number: 16131463 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione S-transferase # Organism: Escherichia coli K12 # 1 202 1 202 202 401 100.0 1e-112 
MKLVGSYTSPFVRKLSILLLEKGITFEFINELPYNADNGVAQFNPLGKVPVLVTEEGECW FDSPIIAEYIELMNVAPAMLPRDPLESLRVRKIEALADGIMDAGLVSVREQARPAAQQSE DELLRQREKINRSLDVLEGYLVDGTLKTDTVNLATIAIACAVGYLNFRRVAPGWCVDRPH LVKLVENLFSRESFARTEPPKA Prediction of potential genes in microbial genomes Time: Mon May 16 15:13:53 2011 Seq name: gi|296493415|gb|ADTK01000086.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont252.1, whole genome shotgun sequence Length of sequence - 3514 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 64 - 123 3.9 1 1 Tu 1 . + CDS 315 - 1799 414 ## ECUMN_4878 hypothetical protein 2 2 Op 1 . - CDS 2121 - 2723 384 ## ECO26_2894 hypothetical protein - Term 2738 - 2766 0.5 3 2 Op 2 . - CDS 2810 - 3016 303 ## COG3311 Predicted transcriptional regulator Predicted protein(s) >gi|296493415|gb|ADTK01000086.1| GENE 1 315 - 1799 414 494 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_4878 NR:ns ## KEGG: ECUMN_4878 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 494 24 517 517 998 100.0 0 MSFGQPCDEFPLSSLPPLIRDAVIEAQQITQAPLGLVAASALGAVSLVCQNLIDVCRLNT LRGPVSLFFLTLAESGERKTAVDKLLMKPLYQQEMQLYSRYKSELAVWKNKEELLKAQKK ALLSKLNKELRKGADESETLRQLEVLQKNSAEEPVRYKFIFNDATTAAIKNQLCGKWRSV GIMSDEAGIIFDGYTLSELPFINKMWDGSVLSVDRKNEPEQMIENARMTLSLMVQPGLFD RYMERKGSVARDSGFLARCLISKPATTQGKRFINGAVIPGGSLTAFHERLMELARGSIEK SSEDERYCLHFSPEAQKIFIEHYNVLEQDLSPSGPLSPFRGHVSKKTENIARIAALFQYF SYGEGKISADIMTSAVVISSWYTDEYKKLFALPDESELQQKDAEELFDWLIEECRGECPP RVRKNYILQCGPGRFRNRKKLNALLNILESQFRLSVVPEGKTMYVLLPQIASLKLSDVSG IFTSGYHYNKLRAK >gi|296493415|gb|ADTK01000086.1| GENE 2 2121 - 2723 384 200 aa, chain - ## HITS:1 COG:no KEGG:ECO26_2894 NR:ns ## KEGG: ECO26_2894 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 200 50 249 249 420 100.0 1e-116 MKPQYQTRYELLHESYQKWLTGFTRHAVSWGVCHPNIYYFHNLTPGWVSFNGEKPEIAIV 
PQSLHRLIYGPDKRTTPPLDDDLIVNLCTSEHLLVHHPMLEGILLSECERLRQRSLANKL ISLFRQFGGTELRLKLVWLCWLDLMTGNCLDDWTENLKRKSEKELEEWIINRQKQSAALT DLMDQYVLLAYRTTVDDKRT >gi|296493415|gb|ADTK01000086.1| GENE 3 2810 - 3016 303 68 aa, chain - ## HITS:1 COG:Z1188 KEGG:ns NR:ns ## COG: Z1188 COG3311 # Protein_GI_number: 15800709 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Escherichia coli O157:H7 EDL933 # 9 58 15 64 65 78 68.0 3e-15 MITPVSLMDDQMVDMAFITQLTGLTDKWFYKLIKDGDFPAPIKMGRSSRWLKSEVEAWLQ ARIAQSRP Prediction of potential genes in microbial genomes Time: Mon May 16 15:14:05 2011 Seq name: gi|296493414|gb|ADTK01000087.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont253.1, whole genome shotgun sequence Length of sequence - 13364 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 4, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 41 - 1399 331 ## COG2199 FOG: GGDEF domain + Prom 1890 - 1949 2.9 2 2 Op 1 . + CDS 1986 - 4409 1641 ## EcE24377A_1144 outer membrane protein PgaA 3 2 Op 2 6/0.000 + CDS 4418 - 6436 1025 ## COG0726 Predicted xylanase/chitin deacetylase + Term 6470 - 6507 3.2 + Prom 6448 - 6507 3.3 4 2 Op 3 . + CDS 6549 - 7754 861 ## COG1215 Glycosyltransferases, probably involved in cell wall biogenesis 5 2 Op 4 . + CDS 7756 - 8169 146 ## B21_01030 hypothetical protein + Term 8234 - 8281 1.5 - Term 8168 - 8204 6.3 6 3 Op 1 . - CDS 8219 - 9007 612 ## COG1702 Phosphate starvation-inducible protein PhoH, predicted ATPase 7 3 Op 2 . - CDS 8986 - 9282 82 ## gi|300902804|ref|ZP_07120755.1| hypothetical protein HMPREF9536_00954 - Prom 9445 - 9504 3.9 - Term 9555 - 9587 -1.0 8 4 Op 1 9/0.000 - CDS 9626 - 10897 1181 ## COG2837 Predicted iron-dependent peroxidase 9 4 Op 2 7/0.000 - CDS 10903 - 12030 1543 ## COG2822 Predicted periplasmic lipoprotein involved in iron transport 10 4 Op 3 . 
- CDS 12088 - 12918 852 ## COG0672 High-affinity Fe2+/Pb2+ permease - Prom 13014 - 13073 4.5 Predicted protein(s) >gi|296493414|gb|ADTK01000087.1| GENE 1 41 - 1399 331 452 aa, chain - ## HITS:1 COG:ycdT_2 KEGG:ns NR:ns ## COG: ycdT_2 COG2199 # Protein_GI_number: 16128989 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 251 452 1 202 202 412 100.0 1e-115 MEKDYLRISSTVLVSLLFGLALVLVNSWFNQPGVEEVVPRSTYLMVMIALFFIDTVAFIF MQLYFIYDRRQFSNCVLSLAFLSCLIYFVKTVIIIQQIIEERLTSSVVQNDIAIYYLFRQ MSLCILIFLALVNKVSENTKQRNLFSKKMTLCISLFFVFGGPIVAHILSSHYESYNLHIA ELTNENGQVVWKASYVTIMIFMWLTLLSVNLYFNGLRYDIWNGVTVIAFCAVLYNISLLF MSRYSVSTWYISRTIEVVSKLTVMVIFMCHIFSALRVTKNIAHRDPLTNIFNRNYFFNEL TVQSASAQKTPYCVMIMDIDHFKKVNDTWGHPVGDQVIKTVVNIIGKSIRPDDLLARVGG EEFGVLLTDIDTERAKALAERIRENVERLTGDNPEYAIPQKVTISIGAVVTQENALNPNE IYRLADNALYEAKETGRNKVVVRDVVNFCESP >gi|296493414|gb|ADTK01000087.1| GENE 2 1986 - 4409 1641 807 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_1144 NR:ns ## KEGG: EcE24377A_1144 # Name: pgaA # Def: outer membrane protein PgaA # Organism: E.coli_E24377A # Pathway: not_defined # 1 807 1 807 807 1560 99.0 0 MYSSSRKRCPKTKWALKILTAAFLAASPAAKSAVNNAYDALIIEARKGNTQPALLWFAQK SALSNNQIADWLQIALWAGQDKQVITVYNRYRYQQLPARGYAAVAVAYRNLQQWQNSLTL WQKALSLEPQNKDYQRGQILTLADAGHYDTALVKLKQLNSGAPDKANLLAEAYIYKLAGR HQDELRAMTESLPENASTQQYPTEYVQALRNNQLAAAIDDANLTPDIRADIHAELVRLSF MPTRSESERYAIADRALAQYAALEILWHDNPDRTAQYQRIQVDHLGALLTRDRYKDVISH YQRLKKTGQIIPPWGQYWVASAYLKDHQPKKAQSIMTELFYHKETIAPDLSDEELADLFY SHLESENYPGALTVTQHTINTSPPFLRLMGTPTSIPNDTWLQGHSFLSTVAKYSNDLPQA EMTARELAYNAPGNQGLRIDYASMLQARGWPRAAENELKKAEVIEPRNINLEVEQAWTAL TLQEWQQAAVLTHDVVEREPQDPGVVRLKRAVDVHNLAELRIAGSTGIDAEGPDSGKHDV DLTTIVYSPPLKDNWRGFAGFGYADGQFSEGKGIVRDWLAGVEWRSRNIWLEAEYAERVF NHEHKPGARLSGWYDFNDNWRIGSQLERLSHRVPLRAMKNGVTGNSAQAYVRWYQNERRK YGVSWAFTDFSDSNQRHEVSLEGQERIWSSPYLIVDFLPSLYYEQNTEHDTPYYNPIKTF DIVPAFEASHLLWRSYENSWEQIFSAGVGASWQKHYGTDVVTQLGYGQRISWNDVIDAGA TLRWEKRPYDGDREHNLYVEFDMTFRF 
>gi|296493414|gb|ADTK01000087.1| GENE 3 4418 - 6436 1025 672 aa, chain + ## HITS:1 COG:ycdR KEGG:ns NR:ns ## COG: ycdR COG0726 # Protein_GI_number: 16128987 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Escherichia coli K12 # 1 672 1 672 672 1344 100.0 0 MLRNGNKYLLMLVSIIMLTACISQSRTSFIPPQDRESLLAEQPWPHNGFVAISWHNVEDE AADQRFMSVRTSALREQFAWLRENGYQPVSIAQIREAHRGGKPLPEKAVVLTFDDGYQSF YTRVFPILQAFQWPAVWAPVGSWVDTPADKQVKFGDELVDREYFATWQQVREVARSRLVE LASHTWNSHYGIQANATGSLLPVYVNRAYFTDHARYETAAEYRERIRLDAVKMTEYLRTK VEVNPHVFVWPYGEANGIAIEELKKLGYDMFFTLESGLANASQLDSIPRVLIANNPSLKE FAQQIITVQEKSPQRIMHIDLDYVYDENLQQMDRNIDVLIQRVKDMQISTVYLQAFADPD GDGLVKEVWFPNRLLPMKADIFSRVAWQLRTRSGVNIYAWMPVLSWDLDPTLTRVKYLPT GEKKAQIHPEQYHRLSPFDDRVRAQVGMLYEDLAGHAAFDGILFHDDALLSDYEDASAPA ITAYQQAGFSGSLSEIRQNPEQFKQWARFKSRALTDFTLELSARVKAIRGPHIKTARNIF ALPVIQPESEAWFAQNYADFLKSYDWTAIMAMPYLEGVAEKSADQWLIQLTNQIKNIPQA KDKSILELQAQNWQKNGQHQAISSQQLAHWMSLLQLNGVKNYGYYPDNFLHNQPEIDLIR PEFSTAWYPKND >gi|296493414|gb|ADTK01000087.1| GENE 4 6549 - 7754 861 401 aa, chain + ## HITS:1 COG:ycdQ KEGG:ns NR:ns ## COG: ycdQ COG1215 # Protein_GI_number: 16128986 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases, probably involved in cell wall biogenesis # Organism: Escherichia coli K12 # 1 401 41 441 441 806 100.0 0 MSIMWIVGGVYFWVYRERHWPWGENAPAPQLKDNPSISIIIPCFNEEKNVEETIHAALAQ RYENIEVIAVNDGSTDKTRAILDRMAAQIPHLRVIHLAQNQGKAIALKTGAAAAKSEYLV CIDGDALLDRDAAAYIVEPMLYNPRVGAVTGNPRIRTRSTLVGKIQVGEYSSIIGLIKRT QRIYGNVFTVSGVIAAFRRSALAEVGYWSDDMITEDIDISWKLQLNQWTIFYEPRALCWI LMPETLKGLWKQRLRWAQGGAEVFLKNMTRLWRKENFRMWPLFFEYCLTTIWAFTCLVGF IIYAVQLAGVPLNIELTHIAATHTAGILLCTLCLLQFIVSLMIENRYEHNLTSSLFWIIW FPVIFWMLSLATTLVSFTRVMLMPKKQRARWVSPDRGILRG >gi|296493414|gb|ADTK01000087.1| GENE 5 7756 - 8169 146 137 aa, chain + ## HITS:1 COG:no KEGG:B21_01030 NR:ns ## KEGG: B21_01030 # Name: ycdP # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 137 1 137 
137 233 100.0 2e-60 MNNLIITTRQSPVRLLVDYVATTILWTLFALFIFLFAMDLLTGYYWQSEARSRLQFYFLL AVANAVVLIVWALYNKLRFQKQQHHAAYQYTPQEYAESLAIPDELYQQLQKSHRMSVHFT SQGQIKMVVSEKALVRA >gi|296493414|gb|ADTK01000087.1| GENE 6 8219 - 9007 612 262 aa, chain - ## HITS:1 COG:ECs1266 KEGG:ns NR:ns ## COG: ECs1266 COG1702 # Protein_GI_number: 15830520 # Func_class: T Signal transduction mechanisms # Function: Phosphate starvation-inducible protein PhoH, predicted ATPase # Organism: Escherichia coli O157:H7 # 1 262 93 354 354 523 100.0 1e-148 MGRQKAVIKARREAKRVLRRDSRSHKQREEESVTSLVQMGGVEAIGMARDSRDTSPILAR NEAQLHYLKAIESKQLIFATGEAGCGKTWISAAKAAEALIHKDVDRIIVTRPVLQADEDL GFLPGDIAEKFAPYFRPVYDVLVRRLGASFMQYCLRPEIGKVEIAPFAYMRGRTFENAVV ILDEAQNVTAAQMKMFLTRLGENVTVIVNGDITQCDLPRGVCSGLSDALERFEEDEMVGI VRFGKEDCVRSALCQRTLHAYS >gi|296493414|gb|ADTK01000087.1| GENE 7 8986 - 9282 82 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|300902804|ref|ZP_07120755.1| ## NR: gi|300902804|ref|ZP_07120755.1| hypothetical protein HMPREF9536_00954 [Escherichia coli MS 84-1] # 1 98 1 98 98 188 100.0 8e-47 MVTSCSGQCLVKLRSTTCGDFSSGSNFVTLYFSHFLSLLVLPTPQTVCVAVFRHFIQKQT MAYKEANPLMFVRIIALPTARVMNTNNSRCSPWEDKKQ >gi|296493414|gb|ADTK01000087.1| GENE 8 9626 - 10897 1181 423 aa, chain - ## HITS:1 COG:ECs1265 KEGG:ns NR:ns ## COG: ECs1265 COG2837 # Protein_GI_number: 15830519 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted iron-dependent peroxidase # Organism: Escherichia coli O157:H7 # 1 423 1 423 423 855 99.0 0 MQYEDKNGVNEPSRRRLLKGIGALALAGSCPVAHAQKTQSAPGTLSPDARNEKQPFYGEH QAGILTPQQAAMMLVAFDVLASDKPDLERLFRLLTQRFAFLTQGGAAPETPNPRLPPLDS GILGGYIAPDNLTITLSVGHSLFDERFGLAPQMPKKLQKMTRFPNDSLDAALCHGDVLLQ ICANTQDTVIHALRDIIKHTPDLLSVRWKREGFISDHAARSKGKETPINLLGFKDGTANP DSQNDKLMQKVVWVTADQQEPAWTIGGSYQAVRLIQFRVEFWDRTPLKEQQTIFGRDKQT GAPLGMQHEHDVPDYASDPEGKVIALDSHIRLANPRTAESESSLMLRRGYSYSLGVTNSG QLDMGLLFVCYQHDLEKGFLTVQKRLNGEALEEYVKPIGGGYFFALPGVKDANDYLGSAL LRV >gi|296493414|gb|ADTK01000087.1| GENE 9 10903 - 12030 1543 375 aa, chain - ## HITS:1 
COG:ECs1264 KEGG:ns NR:ns ## COG: ECs1264 COG2822 # Protein_GI_number: 15830518 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted periplasmic lipoprotein involved in iron transport # Organism: Escherichia coli O157:H7 # 1 375 1 375 375 704 99.0 0 MTINFRRNALQLSVAALFSSAFMANAANVPQVKVTVTDKQCEPMTITVNAGKTQFIIQNH SQKALEWEILKGVMVVEERENIAPGFSQKMTANLQPGEYDMTCGLLTNPKGKLIVKGEAT ADAAQSDALLSLGGAITAYKAYVMAETTQLVTDTKAFTDAIKAGDIEKAKALYAPTRQHY ERIEPIAELFSDLDGSIDAREDDYEQKAADPKFTGFHRLEKALFGDNTTKGMDQYADQLY TDVVDLQKRISELAFPPSKVVGGAAGLIEEVAASKISGEEDRYSHTDLWDFQANVEGSQK IVDLLRPQLQKANPELLAKVDANFKKVDTILAKYRTKDGFETYDKLTDADRNALKGPITA LAEDLAQLRGVLGLD >gi|296493414|gb|ADTK01000087.1| GENE 10 12088 - 12918 852 276 aa, chain - ## HITS:1 COG:ECs1263 KEGG:ns NR:ns ## COG: ECs1263 COG0672 # Protein_GI_number: 15830517 # Func_class: P Inorganic ion transport and metabolism # Function: High-affinity Fe2+/Pb2+ permease # Organism: Escherichia coli O157:H7 # 1 276 1 276 276 455 100.0 1e-128 MFVPFLIMLREGLEAALIVSLIASYLKRTQRGRWIGVMWIGVLLAAALCLGLGIFINETT GEFPQKEQELFEGIVAVIAVVILTWMVFWMRKVSRNVKVQLEQAVDSALQRGNHHGWALV MMVFFAVAREGLESVFFLLAAFQQDVGIWPPLGAMLGLATAVVLGFLLYWGGIRLNLGAF FKWTSLFILFVAAGLAAGAIRAFHEAGLWNHFQEIAFDMSAVLSTHSLFGTLMEGIFGYQ EAPSVSEVAVWFIYLIPALVAFALPPRAGATASRSA Prediction of potential genes in microbial genomes Time: Mon May 16 15:14:26 2011 Seq name: gi|296493413|gb|ADTK01000088.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont253.2, whole genome shotgun sequence Length of sequence - 12357 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 5, operones - 1 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 140 - 1648 1630 ## COG0591 Na+/proline symporter - Prom 1797 - 1856 5.5 + Prom 1620 - 1679 1.8 2 2 Tu 1 . + CDS 1728 - 1946 72 ## ECIAI1_1060 conserved hypothetical protein in PutA-PutB intergenic region + Prom 1960 - 2019 3.4 3 3 Tu 1 . 
+ CDS 2070 - 6032 4581 ## COG4230 Delta 1-pyrroline-5-carboxylate dehydrogenase + Term 6040 - 6080 6.2 - Term 6025 - 6068 2.0 4 4 Tu 1 . - CDS 6072 - 6710 645 ## COG1309 Transcriptional regulator - Prom 6839 - 6898 4.3 + Prom 6798 - 6857 4.5 5 5 Op 1 4/0.000 + CDS 6998 - 8089 966 ## COG2141 Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases 6 5 Op 2 2/0.500 + CDS 8089 - 8781 690 ## COG1335 Amidases related to nicotinamidase 7 5 Op 3 5/0.000 + CDS 8793 - 9179 409 ## COG0251 Putative translation initiation inhibitor, yjgF family 8 5 Op 4 4/0.000 + CDS 9187 - 9987 720 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 9 5 Op 5 5/0.000 + CDS 9997 - 10587 469 ## COG0778 Nitroreductase 10 5 Op 6 1/1.000 + CDS 10598 - 11092 314 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family 11 5 Op 7 . + CDS 11113 - 12357 704 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 Predicted protein(s) >gi|296493413|gb|ADTK01000088.1| GENE 1 140 - 1648 1630 502 aa, chain - ## HITS:1 COG:ECs1261 KEGG:ns NR:ns ## COG: ECs1261 COG0591 # Protein_GI_number: 15830515 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Escherichia coli O157:H7 # 1 501 1 501 502 888 99.0 0 MAISTPMLVTFCVYIFGMILIGFIAWRSTKNFDDYILGGRSLGPFVTALSAGASDMSGWL LMGLPGAVFLSGISESWIAIGLTLGAWINWKLVAGRLRVHTEYNNNALTLPDYFTGRFED KSRILRIISALVILLFFTIYCASGIVAGARLFESTFGMSYETALWAGAAATILYTFIGGF LAVSWTDTVQASLMIFALILTPVIVIISVGGFGDSLEVIKQKSIENVDMLKGLNFVAIIS LMGWGLGYFGQPHILARFMAADSHHSIVHARRISMTWMILCLAGAVAVGFFGIAYFNEHP AVAGAVNQNAERVFIELAQILFNPWIAGILLSAILAAVMSTLSCQLLVCSSAITEDLYKA FLRKHASQKELVWVGRVMVLVVALVAIALAANPENRVLGLVSYAWAGFGAAFGPVVLFSV MWSRMTRNGALAGMIIGALTVIVWKQFGWLGLYEIIPGFIFGSIGIVVFSLLGKAPSAAM QKRFAEADAHYHSAPPSRLQES >gi|296493413|gb|ADTK01000088.1| GENE 2 1728 - 1946 72 72 aa, 
chain + ## HITS:1 COG:no KEGG:ECIAI1_1060 NR:ns ## KEGG: ECIAI1_1060 # Name: not_defined # Def: conserved hypothetical protein in PutA-PutB intergenic region # Organism: E.coli_IAI1 # Pathway: not_defined # 1 63 1 63 63 110 100.0 1e-23 MFYQGDRILPEALVIHNRFNTPFTLNFSAQRHYFSSGCTLSHFLRLHLSKMLTAAEKKSE LFFNPCHIDFFY >gi|296493413|gb|ADTK01000088.1| GENE 3 2070 - 6032 4581 1320 aa, chain + ## HITS:1 COG:ZputA_2 KEGG:ns NR:ns ## COG: ZputA_2 COG4230 # Protein_GI_number: 15801003 # Func_class: C Energy production and conversion # Function: Delta 1-pyrroline-5-carboxylate dehydrogenase # Organism: Escherichia coli O157:H7 EDL933 # 526 1320 1 795 795 1531 99.0 0 MGTTTMGVKLDDATRERIKSAATRIDRTPHWLIKQAIFSYLEQLENSDTLPELPALLSGA ANESDEAPTPAEEPHQPFLDFAEQILPQSVSRAAITAAYRRPETEAVSMLLEQARLPQPV AEQAHKLAYQLADKLRNQKNASGRAGMVQGLLQEFSLSSQEGVALMCLAEALLRIPDKAT RDALIRDKISNGNWQSHIGRSPSLFVNAATWGLLFTGKLVSTHNEASLSRSLNRIIGKSG EPLIRKGVDMAMRLMGEQFVTGETIAEALANARKLEEKGFRYSYDMLGEAALTAEDAQAY MVSYQQAIHAIGKASNGRGIYEGPGISIKLSALHPRYSRAQYDRVMEELYPRLKSLTLLA RQYDIGINIDAEEADRLEISLDLLEKLCFEPELAGWNGIGFVIQAYQKRCPLVIDYLIDL ATRSRRRLMIRLVKGAYWDSEIKRAQMDGLEGYPVYTRKVYTDVSYLACAKKLLAVPNLI YPQFATHNAHTLAAIYQLAGQNYYPGQYEFQCLHGMGEPLYEQVTGKVADGKLNRPCRIY APVGTHETLLAYLVRRLLENGANTSFVNRIADTSLPLDELVADPVTAVEKLAQQEGQTGL PHPKIPLPHDLYGHGRDNSAGLDLANEHRLASLSSALLNSALQKWQALPMLEQPVAAGEM SPVINPAEPKDIVGFVREATPREVEQALESAVNNAPIWFATPPVERAAILHRAAVLMESQ MQQLIGILVREAGKTFSNAIAEVREAVDFLHYYAGQVRDDFANETHRPLGPVVCISPWNF PLAIFTGQIAAALAAGNSVLAKPAEQTPLIAAQGIAILLEAGVPPGVVQLLPGQGETVGA QLTGDDRVRGVMFTGSTEVATLLQRNIASRLDAQGRPIPLIAETGGMNAMIVDSSALTEQ VVVDVLASAFDSAGQRCSALRVLCLQDEIADHTLKMLRGAMAECRMGNPGRLTTDIGPVI DSEAKANIERHIQTMRSKGRPVFQAVRENSEDAREWQSGTFVAPTLIELDDFAELQKEVF GPVLHVVRYNRNQLPELIEQINASGYGLTLGVHTRIDETIAQVTGSAHVGNLYVNRNMVG AVVGVQPFGGEGLSGTGPKAGGPLYLYRLLANRPESALAVTLARQDAEYPVDAQLKAALT QPLNALREWAANRPELQALCTQYGELAQAGTQRLLPGPTGERNTWTLLPRERVLCIADDE QDALTQLAAVLAVGSQVLWPDDALHRQLVKALPSAVSERIQLAKAENITAQPFDAVIFHG DSDQLRALCEAVAARDGAIVSVQGFARGESNILLERLYIERSLSVNTAAAGGNASLMTIG 
>gi|296493413|gb|ADTK01000088.1| GENE 4 6072 - 6710 645 212 aa, chain - ## HITS:1 COG:ycdC KEGG:ns NR:ns ## COG: ycdC COG1309 # Protein_GI_number: 16128979 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 212 1 212 212 411 99.0 1e-115 MTQGAVKTTGKRSRTVSAKKKAILSAALDTFSQFGFHGTRLEQIAELAGVSKTNLLYYFP SKEALYIAVLRQILDIWLAPLKAFREDFAPLAAIKEYIRLKLEVSRDYPQASRLFCMEML AGAPLLMDELTGDLKALIDEKSALIAGWVKSGKLAPIDPQHLIFMIWASTQHYADFAPQV EAVTGATLRDEVFFNQTVENVQRIIIEGIRPR >gi|296493413|gb|ADTK01000088.1| GENE 5 6998 - 8089 966 363 aa, chain + ## HITS:1 COG:ECs1258 KEGG:ns NR:ns ## COG: ECs1258 COG2141 # Protein_GI_number: 15830512 # Func_class: C Energy production and conversion # Function: Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases # Organism: Escherichia coli O157:H7 # 1 363 20 382 382 743 99.0 0 MKIGVFVPIGNNGWLISTHAPQYMPTFELNKAIVQKAEHYHFDFALSMIKLRGFGGKTEF WDHNLESFTLMAGLAAVTSRIQIYATAATLTLPPAIVARMAATIDSISGGRFGVNLVTGW QKPEYEQMGIWPGDDYFSRRYDYLTEYVQVLRDLWGSGKSDFKGDFFTMNDCRVSPQPSV PMKVICAGQSDAGMAFSAQYADFNFCFGKGVNTPTAFAPTAARMKQAAEQTGRDVGSYVL FMVIADETDDAARAKWEHYKAGADEEALSWLTEQSQKDTRSGTDTNVRQMADPTSAVNIN MGTLVGSYASVARMLDEVASVPGAEGVLLTFDDFLSGIETFGERIQPLMQCRAHLPALTQ EVA >gi|296493413|gb|ADTK01000088.1| GENE 6 8089 - 8781 690 230 aa, chain + ## HITS:1 COG:ECs1257 KEGG:ns NR:ns ## COG: ECs1257 COG1335 # Protein_GI_number: 15830511 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Escherichia coli O157:H7 # 1 230 15 244 244 466 99.0 1e-131 MTTLTARPEAITFDPQQSALIVVDMQNAYATPGGYLDLAGFDVSTTRPVIANIQTAVTAA RAAGMLIIWFQNGWDEQYVEAGGPGSPNFHKSNALKTMRKQPQLQGKLLAKGSWDYQLVD ELVPQPGDIVLPKPRYSGFFNTPLDSILRSRGIRHLVFTGIATNVCVESTLRDGFFLEYF GVVLEDATHQAGPEFVQKAALFNIETFFGWVSDVETFCDALSPTSFARIA >gi|296493413|gb|ADTK01000088.1| GENE 7 8793 - 9179 409 128 aa, chain + ## HITS:1 COG:ycdK KEGG:ns NR:ns ## COG: ycdK 
COG0251 # Protein_GI_number: 16128976 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Escherichia coli K12 # 1 128 1 128 128 250 100.0 4e-67 MPKSVIIPAGSSAPLAPFVPGTLADGVVYVSGTLAFDQHNNVLFADDPKAQTRHVLETIR KVIETAGGTMADVTFNSIFITDWKNYAAINEIYAEFFPGDKPARFCIQCGLVKPDALVEI ATIAHIAK >gi|296493413|gb|ADTK01000088.1| GENE 8 9187 - 9987 720 266 aa, chain + ## HITS:1 COG:ECs1255 KEGG:ns NR:ns ## COG: ECs1255 COG0596 # Protein_GI_number: 15830509 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Escherichia coli O157:H7 # 1 266 1 266 266 498 98.0 1e-141 MKLSLSPPPYADAPVVVLISGLGGSGSYWLPQLAVLEQEYQVVCYDQRGTGNNPDTLAED YSIAQMAAELHQALVAAGIEHYAVVGHALGALVGMQLALDYPASVTVLVCVNGWLRINAH TRRCFQVRERLLYSGGAQAWVEAQPLFLYPADWMAARAPRLEAEDALALAHFQGKNNLLR RLNALKRADFSHHAVRIRCPVQIICASDDLLVPSACSSELHAALPDSQKMVMRYGGHACN VTDPETFNALLLNGLASLLHHREAAL >gi|296493413|gb|ADTK01000088.1| GENE 9 9997 - 10587 469 196 aa, chain + ## HITS:1 COG:ECs1254 KEGG:ns NR:ns ## COG: ECs1254 COG0778 # Protein_GI_number: 15830508 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Escherichia coli O157:H7 # 1 196 1 196 196 379 96.0 1e-105 MNEAVSPGALSTLFTDARTHNGWRETPVSDETLRELYALMKWGPTSANCSPARIVFIRTA EGKERLRPALSSGNLQKTLTAPVTSIVAWDSEFYERLPLLFPHGDARSWFTSSPQLAEET AFRNSSMQAAYLIVACRALGLDTGPMSGFDRQYVDDAFFAGSTLKSNLLINIGYGDNSKL YARLPRLSFEEACGLL >gi|296493413|gb|ADTK01000088.1| GENE 10 10598 - 11092 314 164 aa, chain + ## HITS:1 COG:ECs1253 KEGG:ns NR:ns ## COG: ECs1253 COG1853 # Protein_GI_number: 15830507 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Escherichia coli O157:H7 # 13 164 1 152 152 308 99.0 3e-84 MNIVDQQTFRDAMSCMGAAVNIITTDGPAGRAGFTASAVCSVTDTPPTLLVCLNRGASVW 
PVFNENRTLCVNTLSAGQEPLSNLFGGKTPMEHRFAAARWQTGVTGCPQLEEALVSFDCR ISQVVSVGTHDILFCAIEAIHRHATPYGLVWFDRSYHALMRPAC >gi|296493413|gb|ADTK01000088.1| GENE 11 11113 - 12357 704 415 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 24 415 8 408 447 275 38 9e-74 MAMFGFPHWQLKSTSTESGVVAPDERLPFAQTAIMGVQHAVAMFGATVLMPILMGLDPNL SILMSGVGTLLFFFITGGRVPSYLGSSAAFVGVVIAATGFNGQGINPNISIALGGIIACG LVYTVIGLVVMKIGTRWIERLMPPVVTGAVVMAIGLNLAPIAVKSVSASAFDSWMAVMTV LCIGLVAVFTRGMIQRLLILVGLIVACLLYGVMTNLLGLGKAVDFTLVSHAAWFGLPHFS TPAFNSQAMMLIAPVAVILVAENLGHLKAVAGMTGRNMDPYMGRAFVGDGLATMLSGSVG GSGVTTYAENIGVMAVTKVYSTLVFVAAAVIAMLLGFSPKFGALIHTIPAAVIGGASIVV FGLIAVAGARIWVQNRVDLSQNGNLIMVAVTLVLGAGDFALTLGGFTLGGIGTAT Prediction of potential genes in microbial genomes Time: Mon May 16 15:14:30 2011 Seq name: gi|296493412|gb|ADTK01000089.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont253.3, whole genome shotgun sequence Length of sequence - 6337 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 6, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 64 - 237 78 ## COG3729 General stress protein - Prom 402 - 461 5.6 + Prom 381 - 440 4.9 2 2 Op 1 . + CDS 610 - 1206 690 ## COG0655 Multimeric flavodoxin WrbA 3 2 Op 2 . + CDS 1227 - 1454 372 ## LF82_2690 uncharacterized protein YccJ + Term 1460 - 1498 3.5 4 3 Tu 1 . - CDS 1492 - 2733 1486 ## ECO111_1113 glucose-1-phosphatase/inositol phosphatase - Prom 2927 - 2986 3.3 - Term 2840 - 2882 1.8 5 4 Tu 1 . - CDS 3025 - 4284 317 ## ECO26_1556 hypothetical protein - Prom 4321 - 4380 4.0 6 5 Op 1 . + CDS 4544 - 5464 1054 ## COG2214 DnaJ-class molecular chaperone 7 5 Op 2 . + CDS 5464 - 5769 336 ## ECB_01002 modulator of CbpA co-chaperone 8 6 Tu 1 . 
- CDS 5862 - 6212 282 ## COG3381 Uncharacterized component of anaerobic dehydrogenases Predicted protein(s) >gi|296493412|gb|ADTK01000089.1| GENE 1 64 - 237 78 57 aa, chain - ## HITS:1 COG:STM1121 KEGG:ns NR:ns ## COG: STM1121 COG3729 # Protein_GI_number: 16764478 # Func_class: R General function prediction only # Function: General stress protein # Organism: Salmonella typhimurium LT2 # 1 55 1 55 55 59 92.0 1e-09 MANHRGGSGNFAEDRERASEAGKKGGQHSGGNFKNDPQRASEAGKKGGKSSHGKSDN >gi|296493412|gb|ADTK01000089.1| GENE 2 610 - 1206 690 198 aa, chain + ## HITS:1 COG:wrbA KEGG:ns NR:ns ## COG: wrbA COG0655 # Protein_GI_number: 16128970 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Escherichia coli K12 # 1 198 1 198 198 349 100.0 2e-96 MAKVLVLYYSMYGHIETMARAVAEGASKVDGAEVVVKRVPETMPPQLFEKAGGKTQTAPV ATPQELADYDAIIFGTPTRFGNMSGQMRTFLDQTGGLWASGALYGKLASVFSSTGTGGGQ EQTITSTWTTLAHHGMVIVPIGYAAQELFDVSQVRGGTPYGATTIAGGDGSRQPSQEELS IARYQGEYVAGLAVKLNG >gi|296493412|gb|ADTK01000089.1| GENE 3 1227 - 1454 372 75 aa, chain + ## HITS:1 COG:no KEGG:LF82_2690 NR:ns ## KEGG: LF82_2690 # Name: yccJ # Def: uncharacterized protein YccJ # Organism: E.coli_LF82 # Pathway: not_defined # 1 75 1 75 75 125 100.0 6e-28 MPTQEAKAHHVGEWASLRNTSPEIAEAIFEVAGYDEKMAEKIWEEGSDEVLVKAFAKTDK DSLFWGEQTIERKNV >gi|296493412|gb|ADTK01000089.1| GENE 4 1492 - 2733 1486 413 aa, chain - ## HITS:1 COG:no KEGG:ECO111_1113 NR:ns ## KEGG: ECO111_1113 # Name: agp # Def: glucose-1-phosphatase/inositol phosphatase # Organism: E.coli_O111_H- # Pathway: Glycolysis / Gluconeogenesis [PATH:eoi00010] # 1 413 1 413 413 832 100.0 0 MNKTLIAATVAGIVLLASNAQAQTVPEGYQLQQVLMMSRHNLRAPLANNGSVLEQSTPNK WPEWDVPGGQLTTKGGVLEVYMGHYMREWLAQQGMVKSGECPPPDTVYAYANSLQRTVAT AQFFITGAFPGCDIPVHHQEKMGTMDPTFNPVITDDSAAFSEQAVAAMEKELSKLQLTDS YQLLEKIVNYKDSPACKEKQQCSLVDGKNTFSAKYQQEPGVSGPLKVGNSLVDAFTLQYY EGFPMDQVAWGEIKSDQQWKVLSKLKNGYQDSLFTSPEVARNVAKPLVSYIDKALVTDRT SAPKITVLVGHDSNIASLLTALDFKPYQLHDQNERTPIGGKIVFQRWHDSKANRDLMKIE 
YVYQSAEQLRNADALTLQAPAQRVTLELSGCPIDANGFCPMDKFDSVLNEAVK >gi|296493412|gb|ADTK01000089.1| GENE 5 3025 - 4284 317 419 aa, chain - ## HITS:1 COG:no KEGG:ECO26_1556 NR:ns ## KEGG: ECO26_1556 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 419 1 419 419 828 99.0 0 MSSNIHGISCTANNYLKQAWNNIKNEHEKNQKYSITLFENTLVCFMRLYKEIRRQKAEDY IPCLECDSLEKEFEEMQNDNDLSLFLRTLRTNDTETYSGVSEGITYTIQYVRDIDIVRVS LPGRGSESITDFKGYYWYGFMEYIENINACDDVFSEYCLDDENMSIQPEWINTPGISDLD TGIDLSGISFIQSEINKTYGLKYAPVDGDGYCLLRAILVLKEHEYSWALGSHKTQKQVYE EFIKIVDKQTIEALVDTAFNDLREDVKTLFGVNLQSDNKIQGQGGFLSWSFLSFKKEFID SCLNDKKCILHLPEFIFNDNKARLVLDTDPEQKVNEVKNFLTALSDSICSLFIVNSNVAS ISLGNESFSTDDDLEYGYLINTGNHYDVYLPPELFAQAYELNNKERNAQIDFLTRYAIY >gi|296493412|gb|ADTK01000089.1| GENE 6 4544 - 5464 1054 306 aa, chain + ## HITS:1 COG:cbpA KEGG:ns NR:ns ## COG: cbpA COG2214 # Protein_GI_number: 16128966 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone # Organism: Escherichia coli K12 # 1 306 1 306 306 609 99.0 1e-174 MELKDYYAIMGVKPTDDLKTIKTAYRRLARKYHPDVSKEPDAEARFKEVAEAWEVLSDEQ RRAEYDQMWQHRNDPQFNRQFHHSDGQSFNAEDFDDIFSSIFGQHARQSRQRPATRGHDI EIEVAVFLEETLTEHKRTISYNLPVYNAFGMIEQEIPKTLNVKIPAGVGNGQRIRLKGQG TPGENGGPNGDLWLVIHIAPHPLFDIVGHDLEIVVPVSPWEAALGAKVTVPTLKESILLT IPPGSQAGQRLRVKGKGLVSKKQTGDLYAVLKIVMPPKPDENTAALWQQLADAQSSFDPR KDWGKA >gi|296493412|gb|ADTK01000089.1| GENE 7 5464 - 5769 336 101 aa, chain + ## HITS:1 COG:no KEGG:ECB_01002 NR:ns ## KEGG: ECB_01002 # Name: yccD # Def: modulator of CbpA co-chaperone # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 101 1 101 101 189 100.0 2e-47 MANVTVTFTITEFCLHTGISEEELNEIVGLGVVEPREIQETTWVFDDHAAIVVQRAVRLR HELALDWPGIAVALTLMDDIAHLKQENRLLRQRLSRFVAHP >gi|296493412|gb|ADTK01000089.1| GENE 8 5862 - 6212 282 116 aa, chain - ## HITS:1 COG:torD KEGG:ns NR:ns ## COG: torD COG3381 # Protein_GI_number: 16128964 # Func_class: R General function prediction only # Function: Uncharacterized 
component of anaerobic dehydrogenases # Organism: Escherichia coli K12 # 1 116 84 199 199 224 98.0 2e-59 MTDKQAALPYASAYKQDEQEIKRLLVEAGMETSGNFNEPADHLAIYLELLSHLHFSLGEG TVPARRIDSLRQKTLTALWQWLPEFVVRCRQYDSFGFYAALSQLLLVLVESDHQNR Prediction of potential genes in microbial genomes Time: Mon May 16 15:14:51 2011 Seq name: gi|296493411|gb|ADTK01000090.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont253.4, whole genome shotgun sequence Length of sequence - 26031 bp Number of predicted genes - 23, with homology - 22 Number of transcription units - 12, operones - 5 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 138 - 174 -0.1 1 1 Op 1 7/0.000 - CDS 200 - 2746 2724 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing 2 1 Op 2 . - CDS 2746 - 3918 999 ## COG3005 Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit - Prom 3946 - 4005 4.8 + Prom 3883 - 3942 4.9 3 2 Tu 1 . + CDS 4048 - 4740 717 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 4 3 Tu 1 . - CDS 4713 - 5705 712 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 5733 - 5792 2.9 + Prom 5720 - 5779 3.3 5 4 Op 1 1/1.000 + CDS 5854 - 8568 2577 ## COG0642 Signal transduction histidine kinase 6 4 Op 2 . + CDS 8640 - 9713 591 ## COG0348 Polyferredoxin + Term 9743 - 9786 -0.5 - Term 9714 - 9755 5.6 7 5 Tu 1 . - CDS 9762 - 9935 164 ## PROTEIN SUPPORTED gi|16760699|ref|NP_456316.1| hypothetical protein STY1938 - Prom 10057 - 10116 4.2 - Term 10094 - 10124 1.0 8 6 Tu 1 . - CDS 10329 - 10541 323 ## COG1278 Cold shock proteins - Prom 10730 - 10789 6.0 + Prom 10711 - 10770 6.8 9 7 Tu 1 . + CDS 10827 - 11039 185 ## COG1278 Cold shock proteins + Prom 11227 - 11286 8.3 10 8 Tu 1 . + CDS 11481 - 11786 204 ## + Term 11824 - 11860 4.1 + Prom 11814 - 11873 5.5 11 9 Op 1 . + CDS 11893 - 12537 353 ## SSON_0993 putative regulator 12 9 Op 2 . 
+ CDS 12534 - 13280 474 ## JW0968 conserved hypothetical protein 13 9 Op 3 . + CDS 13280 - 15376 1920 ## ECO111_1095 hypothetical protein 14 9 Op 4 6/0.000 + CDS 15422 - 16561 871 ## COG1596 Periplasmic protein involved in polysaccharide export 15 9 Op 5 3/1.000 + CDS 16549 - 16995 468 ## COG0394 Protein-tyrosine-phosphatase 16 9 Op 6 . + CDS 17015 - 19195 2222 ## COG3206 Uncharacterized protein involved in exopolysaccharide biosynthesis + Term 19280 - 19315 6.0 - Term 19199 - 19242 5.3 17 10 Tu 1 . - CDS 19315 - 20613 710 ## EcolC_2616 phosphoanhydride phosphorylase - Term 20735 - 20765 -0.5 18 11 Op 1 31/0.000 - CDS 20798 - 21934 1183 ## COG1294 Cytochrome bd-type quinol oxidase, subunit 2 19 11 Op 2 . - CDS 21946 - 23490 1559 ## COG1271 Cytochrome bd-type quinol oxidase, subunit 1 - Prom 23523 - 23582 2.2 20 12 Op 1 . - CDS 23624 - 24481 672 ## B21_00987 hypothetical protein 21 12 Op 2 . - CDS 24478 - 24876 479 ## SbBS512_E2339 hydrogenase-1 operon protein HyaE 22 12 Op 3 7/0.000 - CDS 24873 - 25460 751 ## COG0680 Ni,Fe-hydrogenase maturation factor 23 12 Op 4 . 
- CDS 25457 - 25939 505 ## COG1969 Ni,Fe-hydrogenase I cytochrome b subunit Predicted protein(s) >gi|296493411|gb|ADTK01000090.1| GENE 1 200 - 2746 2724 848 aa, chain - ## HITS:1 COG:torA KEGG:ns NR:ns ## COG: torA COG0243 # Protein_GI_number: 16128963 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 848 1 848 848 1743 99.0 0 MNNNDLFQASRRRFLAQLGGLTVAGMLGPSLLTPRRATAAQAATDAVISKEGILTGSHWG AIRATVKDGRFVAAKPFELDKYPSKMIAGLPDHVHNAARIRYPMVRVDWLRKRHLSDTSQ RGDNRFVRVSWDEALDMFYEELERVQKTHGPSALLTASGWQSTGMFHNASGMLAKAIALH GNSVGTGGDYSTGAAQVILPRVVGSMEVYEQQTSWPLVLQNSKTIVLWGSDLLKNQQANW WCPDHDVYEYYAQLKAKVAAGEIEVISIDPVVTSTHEYLGREHVKHIAVNPQTDVPLQLA LAYTLYSENLYDKNFLANYCVGFEQFLPYLLGEKDGQPKDAAWAEKLTGIDAETIRGLAR QMAANRTQIIAGWCVQRMQHGEQWAWMIVVLAAMLGQIGLPGGGFGFGWHYNGAGTPGRK GVILSGFSGSTSIPPVHDNSDYKGYSSTIPIARFIDAILEPGKVINWNGKSVKLPPLKMC IFAGTNPFHRHQQINRIIEGWRKLETVIAIDNQWTSTCRFADIVLPATTQFERNDLDQYG NHSNRGIIAMKQVVPPQFEARNDFDIFRELCRRFNREEAFTEGLDEMGWLKRIWQEGVQQ GKGRGVHLPAFDDFWNNKEYVEFDHPQMFVRHQAFREDPDLEPLGTPSGLIEIYSKTIAD MNYDDCQGHPMWFEKIERSHGGPGSQKYPLHLQSVHPDFRLHSQLCESETLRQQYTVAGK EPVFINPQDASARGIRNGDVVRVFNARGQVLAGAVVSDRYAPGVARIHEGAWYDPDKGGE PGALCKYGNPNVLTIDIGTSQLAQATSAHTTLVEIEKYNGAVEQVTAFNGPVEMVAQCEY VPASQVKS >gi|296493411|gb|ADTK01000090.1| GENE 2 2746 - 3918 999 390 aa, chain - ## HITS:1 COG:torC KEGG:ns NR:ns ## COG: torC COG3005 # Protein_GI_number: 16128962 # Func_class: C Energy production and conversion # Function: Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit # Organism: Escherichia coli K12 # 1 390 1 390 390 800 99.0 0 MRKLWNALRRPSARWSVLALVAIGIVIGIALIVLPHVGIKVTSTTEFCVSCHSMQPVYEE YKQSVHFQNASGVRAECHDCHIPPDMPGMVKRKLEASNDIYQTFIAHSIDTPEKFEAKRA ELAEREWARMKENNSATCRSCHNYDAMDHAKQHPEAARQMKVAAKDNQSCIDCHKGIAHQ LPDMSSGFRKQFDELRASANDSGDTLYSIDIKPIYAAKGDKEASGSLLPASEVKVLKRDG DWLQIEITGWTESAGRQRVLTQFPGKRIFVASIRGDVQQQVKTLEKTTVADTNTEWSKLQ 
ATAWMKKGDMVNDIKPIWAYADSLYNGTCNQCHGAPEIAHFDANGWIGTLNGMIGFTSLD KREERTLLKYLQMNASDTAGKAHGDKKEEK >gi|296493411|gb|ADTK01000090.1| GENE 3 4048 - 4740 717 230 aa, chain + ## HITS:1 COG:ECs1150 KEGG:ns NR:ns ## COG: ECs1150 COG0745 # Protein_GI_number: 15830404 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 230 1 230 230 439 99.0 1e-123 MPHHIVIVEDEPVTQARLQSYFTQEGYTVSVTASGAGLREIMQNQPVDLILLDINLPDEN GLMLTRALRERSTVGIILVTGRSDRIDRIVGLEMGADDYVTKPLELRELVVRVKNLLWRI DLARQAQPHTQDNCYRFAGYCLNVSRHTLERDGEPIKLTRAEYEMLVAFVTNPGEILSRE RLLRMLSARRVENPDLRTVDVLIRRLRHKLSADLLVTQHGEGYFLAADVC >gi|296493411|gb|ADTK01000090.1| GENE 4 4713 - 5705 712 330 aa, chain - ## HITS:1 COG:torT KEGG:ns NR:ns ## COG: torT COG1879 # Protein_GI_number: 16128960 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli K12 # 3 330 15 342 342 634 100.0 0 MLPAFSADNLLRWHDAQHFTVQASTPLKAKRAWKLCALYPSLKDSYWLSLNYGMQEAARR YGVDLKVLEAGGYSQLATQQAQIDQCKQWGAEAILLGSSTTSFPDLQKQVASLPVIELVN AIDAPQVKSRVGVPWFQMGYQPGRYLVQWAHGKPLNVLLMPGPDNAGGSKEMVEGFRAAI AGSPVRIVDIALGDNDIEIQRNLLQEMLERHPEIDVVAGTAIAAEAAMGEGRNLKTPLTV VSFYLSHQVYRGLKRGRVIMAASDQMVWQGELAVEQAIRQLQGQSVSDNVSPPILVLTPK NADREHIRRSLSPGGFRPVYFYQHTSAAKK >gi|296493411|gb|ADTK01000090.1| GENE 5 5854 - 8568 2577 904 aa, chain + ## HITS:1 COG:torS_1 KEGG:ns NR:ns ## COG: torS_1 COG0642 # Protein_GI_number: 16128959 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 650 1 650 650 1125 99.0 0 MGFALMALLTLTSTLVGWYNLRFISQVEKDNTQALIPTMNMARQLSEASAWELFAAQNLT SADNEKMWQAQGRMLTAQSLKINALLQALREQGFDTTAIEQQEQEISRSLRQQGELVGQR LQLRQQQQQLSQQIVAAADEIARLAQGQANNAATSAGATQAGIYDLIEQDQRQAAESALD RLIDIDLEYVNQMNELRLSALRVQQMVMNLGLEQIQKNAPTLEKQLNNAVKILQRRQIRI 
EDPGVRAQVATTLTTVSQYSDLLALFQQDSEISNHLQTLAQNNIAQFAQFSSEVSQLVDT IELRNQHGLAHLEKASARGQYSLLLLGMVSLCALILILWRVVYRSVTRPLAEQTQALQRL LDGDIDSPFPETAGVRELDTIGRLMDAFRSNVHALNRHREQLAAQVKARTAELQELVIEH RQARAEAEKASQAKSAFLAAMSHEIRTPLYGILGTAQLLADNPALNAQRDDLRAITDSGE SLLTILNDILDYSAIEAGGKNVSVSDEPFEPRPLLESTLQLMSGRVKGRPIRLATAIADD MPCALMGDPRRIRQVITNLLSNALRFTDEGYIILRSRTDGEQWLVEVEDSGCGIDPAKLA EIFQPFVQVSGKRGGTGLGLTISSRLAQAMGGELSATSTPEVGSCFCLRLPLRVATAPVP KTVNQAVRLDGLRLLLIEDNPLTQRITIEMLKTSGAQIVAVGNAAQALETLQNSEPFAAA LVDFDLPDIDGITLARQLAQQYPSLVLIGFSAHVIDETLRQRTSSLFRGIIPKPVPREVL GQLLAHYLQLQVNNDQSLDVSQLNEDAQLMGTEKIHEWLVLFTQHALPLLDEIDIARASQ DSEKIKRAAHQLKSSCSSLGMRIASQLCAQLEQQPLSAPLPHEEITRSVAALEAWLHKKD LNAI >gi|296493411|gb|ADTK01000090.1| GENE 6 8640 - 9713 591 357 aa, chain + ## HITS:1 COG:yccM KEGG:ns NR:ns ## COG: yccM COG0348 # Protein_GI_number: 16128958 # Func_class: C Energy production and conversion # Function: Polyferredoxin # Organism: Escherichia coli K12 # 1 357 1 357 357 697 99.0 0 MAENKRTRWQRRPGTTGGKLPWNDWRNATTWRKATQLLLLAMNIYIAITFWYWVRYYETA SSTTFVARPGGIEGWLPIAGLMNLKYSLTTGQLPSVHVAAMLLLVAFIVISLLLKKAFCS WLCPVGTLSELIGDLGNKLFGRQCVLPRWLDIPLRGVKYLLLSFFLYIALLMPAQAIHYF MLSPYSVVMDVKMLDFFRHMGTATLISVTVLLIASLFIRHAWCRYLCPYGALMGVVSLLS PFKIRRNAESCIDCGKCAKNCPSRIPVDKLIQVRTVECTGCMTCVESCPVASTLTFSLQK PAANKKAFALSGWLMTLLVLGIMFAVIGYAMYAGVWQSPVPEELYRRLIPQAPMIGH >gi|296493411|gb|ADTK01000090.1| GENE 7 9762 - 9935 164 57 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|16760699|ref|NP_456316.1| hypothetical protein STY1938 [Salmonella enterica subsp. enterica serovar Typhi str. 
CT18] # 1 57 1 57 57 67 59 7e-11 MNIEELKKQAETEIADFIAQKIAELNKNTGKEVSEIRFTAREKMTGLESYDVKIKIM >gi|296493411|gb|ADTK01000090.1| GENE 8 10329 - 10541 323 70 aa, chain - ## HITS:1 COG:ECs1145 KEGG:ns NR:ns ## COG: ECs1145 COG1278 # Protein_GI_number: 15830399 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Escherichia coli O157:H7 # 1 70 1 70 70 133 100.0 9e-32 MSNKMTGLVKWFNADKGFGFITPDDGSKDVFVHFTAIQSNEFRTLNENQKVEFSIEQGQR GPAAANVVTL >gi|296493411|gb|ADTK01000090.1| GENE 9 10827 - 11039 185 70 aa, chain + ## HITS:1 COG:ECs1144 KEGG:ns NR:ns ## COG: ECs1144 COG1278 # Protein_GI_number: 15830398 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Escherichia coli O157:H7 # 1 70 1 70 70 137 100.0 5e-33 MSRKMTGIVKTFDRKSGKGFIIPSDGRKEVQVHISAFTPRDAEVLIPGLRVEFCRVNGLR GPTAANVYLS >gi|296493411|gb|ADTK01000090.1| GENE 10 11481 - 11786 204 101 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKHKLSAILMAFMLTTPAAFAAPEAANGTEATTGTTGTTTTTTGATTTAATTGGVAAGAV GTATVVGVATAVGVATLAVVAANDSGDGGSHNTSTTTSTTR >gi|296493411|gb|ADTK01000090.1| GENE 11 11893 - 12537 353 214 aa, chain + ## HITS:1 COG:no KEGG:SSON_0993 NR:ns ## KEGG: SSON_0993 # Name: ymcC # Def: putative regulator # Organism: S.sonnei # Pathway: not_defined # 1 214 1 214 214 419 100.0 1e-116 MRPLILSIFALFLAGCTHSQQSMVDTFRASLFDNQDITVADQQIQALPYSTMYLRLNEGQ RIFVVLGYIEQEQSKWLSQDNAMLVTHNGRLLKTVKLNNNLLEVTNSGQDPLRNALAIKD GSRWTRDILWSEDNHFRSATLSSTFSFAGLETLNIAGRNVLCNVWQEEVTSTRPEKQWQN TFWVDSATGQVRQSRQMLGAGVIPVEMTFLKPAP >gi|296493411|gb|ADTK01000090.1| GENE 12 12534 - 13280 474 248 aa, chain + ## HITS:1 COG:no KEGG:JW0968 NR:ns ## KEGG: JW0968 # Name: ymcB # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 248 1 248 248 472 100.0 1e-132 MNKLQSYFIASVLYVMTPHAFAQGTVTIYLPGEQQTLSVGPVENVVQLVTQPQLRDRLWW PGALLTDSAAKAKALKDYQHVMAQLASWEAEADDDVAATIKSVRQQLLNLNITGRLPVKL DPDFVRVDENSNPPLVGDYTLYTVQRPVTITLLGAVSGAGQLPWQAGRSVTDYLQDHPRL AGADKNNVMVITPEGETVVAPVALWNKRHVEPPPGSQLWLGFSAHVLPEKYADLNDQIVS 
VLTQRVPD >gi|296493411|gb|ADTK01000090.1| GENE 13 13280 - 15376 1920 698 aa, chain + ## HITS:1 COG:no KEGG:ECO111_1095 NR:ns ## KEGG: ECO111_1095 # Name: ymcA # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 698 1 698 698 1415 99.0 0 MKKNSYLLSCLAIAVSSACHAEVLTYPDPLGSSQSDFGGTGLLQMPNARIAPEGEFSVNY RDNDQYRFYSTSVALFPWLEGTIRYTDVRTRKYSQWEDFSGDQSYKDKSFDFKLRLWEEG HWLPQVAFGKRDIAGTGLFDGEYLVASKQAGPFDFTLGMAWGYAGNAGNITNPFCRVSDK YCHRAESHDAGDISFSDIFRGPASIFGGIEYQTPWNPLRLKLEYDGNNYQNDFAGKLPQA SHFNVGAVYRAASWADLNLSYERGNTLMFGFTLRTNFNDLRPALRDTPKPAYQPAPESEG LQYTTVANQLTALKYNAGFDAPEIQLRDKTLYMSGQQYKYRDSREAVDRANRILVNNLPQ GVEKISVTQKREHMAMVTTETDVASLRKQLAGTAPGQSEQLQQQRVEAEDLSAFGRGYRI REDRFSYSFNPTLSQSLGGPEDFYMFQLGLMSSARYWFTDHLLLDGGIFTNIYNNYDKFK SSLLPADSTLPRVRTHIRDYVRNDVYLNNLQANYFADLGNGFYGQVYGGYLETMYAGVGS ELLYRPLDASWALGVDVNYVKQRDWDNMMRFTDYSTPTGFVTAYWNPPTLNGVLMKLSVG QYLAKDKGATIDVAKRFDSGVAVGVWAAISNVSKDDYGEGGFSKGFYISIPFDLMTIGPN RNRAVVSWTPLTRDGGQMLSRKYQLYPMTAEREVPVGQ >gi|296493411|gb|ADTK01000090.1| GENE 14 15422 - 16561 871 379 aa, chain + ## HITS:1 COG:ECs1139 KEGG:ns NR:ns ## COG: ECs1139 COG1596 # Protein_GI_number: 15830393 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Escherichia coli O157:H7 # 1 379 1 379 379 744 100.0 0 MKKNIFKFSVLTLAVLSLTACTLVPGQNLSTSNKDVIELPDNQYDLDKMVNIYPVTPGLI DQLRAKPIMSQANPELEQQIANYEYRIGIGDVLMVTVWDHPELTTPAGQYRSASDTGNWV NADGAIFYPYIGRLKVAGKTLTQVRNEITARLDSVIESPQVDVSVAAFRSQKAYVTGEVS KSGQQPITNIPLTIMDAINAAGGLTADADWRNVVLTQNGVKTKVNLYALMQRGDLRQNKL LHPGDILFIPRNDDLKVFVMGEVGKQSTLKMDRSGMTLAEALGNAEGMNQDVADATGIFV IRATQNKQNGKIANIYQLNAKDASAMILGTEFQLEPYDIVYVTTAPLARWNRVISLLVPT ISGVHDLTETSRWIQTWPN >gi|296493411|gb|ADTK01000090.1| GENE 15 16549 - 16995 468 148 aa, chain + ## HITS:1 COG:ECs1138 KEGG:ns NR:ns ## COG: ECs1138 COG0394 # Protein_GI_number: 15830392 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Escherichia coli 
O157:H7 # 1 148 5 152 152 276 100.0 1e-74 MAQLKFNSILVVCTGNICRSPIGERLLRKRLPGVKVKSAGVHGLVKHPADATAADVAANH GVSLEGHAGRKLTAEMARNYDLILAMESEHIAQVTAIAPEVRGKTMLFGQWLEQKEIPDP YRKSQDAFEHVYGMLERASQEWAKRLSR >gi|296493411|gb|ADTK01000090.1| GENE 16 17015 - 19195 2222 726 aa, chain + ## HITS:1 COG:ZyccC_1 KEGG:ns NR:ns ## COG: ZyccC_1 COG3206 # Protein_GI_number: 15800902 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein involved in exopolysaccharide biosynthesis # Organism: Escherichia coli O157:H7 EDL933 # 1 480 1 480 480 890 99.0 0 MTTKNMNTPPGSTQENEIDLLRLVGELWDHRKFIISVTALFTLIAVAYSLLSTPIYQADT LVQVEQKQGNAILSGLSDMIPNSSPESAPEIQLLQSRMILGKTIAELNLRDMVEQKYFPI VGRGWARLTKEKPGELAISWMHIPQLNGQDQQLTLTVGENGHYTLEGEEFTVNGMVGQRL EKDGVALTIADIKAKPGTQFVLSQRTELEAINALQETFTVSERSKESGMLELTMTGDDPQ LITRILNSIANNYLQQNIARQAAQDSQSLEFLQRQLPEVRSELDQAEEKLNVYRQQRDSV DLNLEAKAVLEQIVNVDNQLNELTFREAEISQLYKKDHPTYRALLEKRQTLEQERKRLNK RVSAMPSTQQEVLRLSRDVEAGRAVYLQLLNRQQELSISKSSAIGNVRIIDPAVTQPQPV KPKKALNVVLGFILGLFISVGAVLARAMLRRGVEAPEQLEEHGINVYATIPMSEWLDKRT RLRKKNLFSNQQRHRTKNIPFLAVDNPADSAVEAVRALRTSLHFAMMETENNILMITGAT PDSGKTFVSSTLAAVIAQSDQKVLFIDADLRRGYSHNLFTVSNEHGLSEYLAGKDELNKV IQHFGKGGFDVITRGQVPPNPSELLMRDRMRQLLEWANDHYDLVIVDTPPMLAVSDAAVV GRSVGTSLLVARFGLNTAKEVSLSMQRLEQAGVNIKGAILNGVIKRASTAYSYGYNYYGY SYSEKE >gi|296493411|gb|ADTK01000090.1| GENE 17 19315 - 20613 710 432 aa, chain - ## HITS:1 COG:no KEGG:EcolC_2616 NR:ns ## KEGG: EcolC_2616 # Name: not_defined # Def: phosphoanhydride phosphorylase # Organism: E.coli_ATCC8739 # Pathway: gamma-Hexachlorocyclohexane degradation [PATH:ecl00361]; Inositol phosphate metabolism [PATH:ecl00562]; Riboflavin metabolism [PATH:ecl00740] # 1 432 11 442 442 843 99.0 0 MKAILIPFLSLLIPLTPQSAFAQSEPELKLESVVIVSRHGVRAPTKATQLMQDVTPDAWP TWPVKLGWLTPRGGELIAYLGHYQRQRLVADGLLAKKGCPQSGQVAIIADVDERTRKTGE AFAAGLAPDCAITVHTQADTSSPDPLFNPLKTGVCQLDNANVTDAILSRAGGAIADFTGH RQTAFRELERVLNFPQSNLCLKREKQDESCSLTQALPSELKVSADNVSLTGAVSLASMLT 
EIFLLQQAQGMPEPGWGRITDSHQWNTLLSLHNAQFYLLQRTPEVARSRATPLLDLIKTA LTPHPPQKQAYGVTLPTSVLFIAGHDTNLANLGGALELNWTLPGQPDNTPPGGELVFERW RRLSDNSQWIQVSLVFQTLQQMRDKTPLSLNTPPGEVKLTLAGCEERNAQGMCSLAGFTQ IVNEARIPACSL >gi|296493411|gb|ADTK01000090.1| GENE 18 20798 - 21934 1183 378 aa, chain - ## HITS:1 COG:ECs1135 KEGG:ns NR:ns ## COG: ECs1135 COG1294 # Protein_GI_number: 15830389 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 2 # Organism: Escherichia coli O157:H7 # 1 378 1 378 378 666 99.0 0 MFDYETLRFIWWLLIGVILVVFMISDGFDMGIGCLLPLVARDDDERRIVINSVGAHWEGN QVWLILAGGALFAAWPRVYAAAFSGFYVAMILVLCSLFFRPLAFDYRGKIADARWRKMWD AGLVIGSLVPPVVFGIAFGNLLLGVPFAFTPQLRVEYLGSFWQLLTPFPLLCGLLSLGMV ILQGGVWLQLKTVGVIHLRSQLATKRAALLVMLCFLLAGYWLWVGIDGFVLLAQDANGPS NPLMKLVAVLPGAWMNNFVESPVLWIFPLLGFFCPLLTVMAIYRGRPGWGFLMASLMQFG VIFTAGITLFPFVMPSSVSPISSLTLWDSTSSQLTLSIMLVIVLIFLPIVLLYTLWSYYK MWGRMTTETLRRNENELY >gi|296493411|gb|ADTK01000090.1| GENE 19 21946 - 23490 1559 514 aa, chain - ## HITS:1 COG:ECs1134 KEGG:ns NR:ns ## COG: ECs1134 COG1271 # Protein_GI_number: 15830388 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 1 # Organism: Escherichia coli O157:H7 # 1 514 1 514 514 964 99.0 0 MWDVIDLSRWQFALTALYHFLFVPLTLGLIFLLAIMETIYVVTGKTIYRDMTRFWGKLFG INFALGVATGLTMEFQFGTNWSFYSNYVGDIFGAPLAMEALMAFFLESTFVGLFFFGWQR LNKYQHLLVTWLVAFGSNLSALWILNANGWMQYPTGAHFDIDTLRMEMTSFSELVFNPVS QVKFVHTVMAGYVTGAMFIMAISAWYLLRGRERDVALRSFAIGSVFGTLAIIGTLQLGDS SAYEVAQVQPVKLAAMEGEWQTEPAPAPFHVVAWPEQDQERNAFALKIPALLGILATHSL DKPVPGLKNLMAETYPRLQRGRMAWLLMQEISQGNREPHVLQAFRELEGDLGYGMLLSRY APDMNHVTAAQYQAAMRGAIPQVAPVFWSFRIMVGCGSLLLLVMLIALVQTLRGKIDQHR WVLKMALWSLPLPWIAIEAGWFMTEFGRQPWAIQDILPTYSAHSALTTGQLAFSLIMIVG LYTLFLIAEVYLMQKYARLGPSAMQSEQPTQQQG >gi|296493411|gb|ADTK01000090.1| GENE 20 23624 - 24481 672 285 aa, chain - ## HITS:1 COG:no KEGG:B21_00987 NR:ns ## KEGG: B21_00987 # Name: hyaF # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined 
# 1 285 1 285 285 556 100.0 1e-157 MSETFFHLLGPGTQPNDDSFSMNPLPITCQVNDEPSMAALEQCAHSPQVIALLNELQHQL SERQPPLGEVLAVDLLNLNADDRHFINTLLGEGEVSVRIQQADDSESEIQEAIFCGLWRV RRRRGEKLLEDKLEAGCAPLALWQAATQNLLPTDSLLPPPIDGLMNGLPLAHELLAHVRN PDAQPHSINLTQLPISEADRLFLSRLCGPGNIQIRTIGYGESYINATGLRHVWHLRCTDT LKGPLLESYEICPIPEVVLAAPEDLVDSAQRLSEVCQWLAEAAPT >gi|296493411|gb|ADTK01000090.1| GENE 21 24478 - 24876 479 132 aa, chain - ## HITS:1 COG:no KEGG:SbBS512_E2339 NR:ns ## KEGG: SbBS512_E2339 # Name: hyaE # Def: hydrogenase-1 operon protein HyaE # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 1 132 1 132 132 261 99.0 6e-69 MSNDTPFDALWQRMLARGWTPVSESRLDDWLTQAPDGVVLLSSDPKRTPEVSDNPVMIGE LLHEFPDYTWQVAIADLEQSEAIGDRFGVFRFPATLVFTGGNYRGVLNGIHPWAELINLM RGLVEPQQERAS >gi|296493411|gb|ADTK01000090.1| GENE 22 24873 - 25460 751 195 aa, chain - ## HITS:1 COG:hyaD KEGG:ns NR:ns ## COG: hyaD COG0680 # Protein_GI_number: 16128941 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase maturation factor # Organism: Escherichia coli K12 # 1 195 1 195 195 360 100.0 1e-99 MSEQRVVVMGLGNLLWADEGFGVRVAERLYAHYHWPEYVEIVDGGTQGLNLLGYVESASH LLILDAIDYGLEPGTLRTYAGERIPAYLSAKKMSLHQNSFSEVLALADIRGHLPAHIALV GLQPAMLDDYGGSLSELAREQLPAAEQAALAQLAAWGIVPQPANESRCLNYDCLSMENYE GVRLRQYRMTQEEQG >gi|296493411|gb|ADTK01000090.1| GENE 23 25457 - 25939 505 160 aa, chain - ## HITS:1 COG:ECs1130 KEGG:ns NR:ns ## COG: ECs1130 COG1969 # Protein_GI_number: 15830384 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase I cytochrome b subunit # Organism: Escherichia coli O157:H7 # 1 160 76 235 235 298 100.0 3e-81 MRIYWAFVGNRYSRELFIVPVWRKSWWQGVWYEIRWYLFLAKRPSADIGHNPIAQAAMFG YFLMSVFMIITGFALYSEHSQYAIFAPFRYVVEFFYWTGGNSMDIHSWHRLGMWLIGAFV IGHVYMALREDIMSDDTVISTMVNGYRSHKFGKISNKERS Prediction of potential genes in microbial genomes Time: Mon May 16 15:15:31 2011 Seq name: gi|296493410|gb|ADTK01000091.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont253.5, whole genome shotgun sequence Length of sequence - 6814 bp 
Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 19 - 70 3.7 1 1 Op 1 11/0.000 - CDS 142 - 1935 1939 ## COG0374 Ni,Fe-hydrogenase I large subunit 2 1 Op 2 . - CDS 1932 - 3050 761 ## COG1740 Ni,Fe-hydrogenase I small subunit - Prom 3206 - 3265 2.7 + TRNA 3477 - 3564 76.9 # Ser TGA 0 0 + Prom 3489 - 3548 80.4 3 2 Op 1 2/0.667 + CDS 3771 - 4430 841 ## COG0670 Integral membrane protein, interacts with FtsH + Term 4450 - 4489 3.1 + Prom 4438 - 4497 4.8 4 2 Op 2 . + CDS 4521 - 4850 544 ## COG2920 Dissimilatory sulfite reductase (desulfoviridin), gamma subunit - Term 4743 - 4786 1.6 5 3 Tu 1 . - CDS 4847 - 5125 362 ## COG1254 Acylphosphatases - Prom 5209 - 5268 2.4 + Prom 5091 - 5150 2.0 6 4 Op 1 3/0.667 + CDS 5220 - 6410 582 ## PROTEIN SUPPORTED gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative 7 4 Op 2 . + CDS 6468 - 6785 414 ## COG3785 Uncharacterized conserved protein Predicted protein(s) >gi|296493410|gb|ADTK01000091.1| GENE 1 142 - 1935 1939 597 aa, chain - ## HITS:1 COG:hyaB KEGG:ns NR:ns ## COG: hyaB COG0374 # Protein_GI_number: 16128939 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase I large subunit # Organism: Escherichia coli K12 # 1 597 1 597 597 1274 99.0 0 MSTQYETQGYTINNAGRRLVVDPITRIEGHMRCEVNINDQNVITNAVSCGTMFRGLEIIL QGRDPRDAWAFVERICGVCTGVHALASVYAIEDAIGIKVPDNANIIRNIMLATLWCHDHL VHFYQLAGMDWIDVLDALKADPRKTSELAQSLSSWPKSSPGYFFDVQNRLKKFVEGGQLG IFRNGYWGHPQYKLPPEANLMGFAHYLEALDFQREIVKIHAVFGGKNPHPNWIVGGMPCA INIDESGAVGAVNMERLNLVQSIITRTADFINNVMIPDALAIGQFNKPWSEIGTGLSDKC VLSYGAFPDIANDFGEKSLLMPGGAVINGDFNNVLPVDLVDPQQVQEFVDHAWYRYPNDQ VGRHPFDGITDPWYNPGDVKGSDTNIQQLNEQERYSWIKAPRWRGNAMEVGPLARTLIAY HKGDAATVESVDRMMSALNLPLSGIQSTLGRILCRAHEAQWAAGKLQYFFDKLMTNLKNG NLATASTEKWEPVTWPTECRGVGFTEAPRGALGHWAAIRDGKIDLYQCVVPTTWNASPRD PKGQIGAYEAALMNTKMAIPEQPLEILRTLHSFDPCLACSTHVLGDDGSELISVQVR 
>gi|296493410|gb|ADTK01000091.1| GENE 2 1932 - 3050 761 372 aa, chain - ## HITS:1 COG:hyaA KEGG:ns NR:ns ## COG: hyaA COG1740 # Protein_GI_number: 16128938 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase I small subunit # Organism: Escherichia coli K12 # 1 372 1 372 372 736 99.0 0 MNNEETFYQAMRRQGVTRRSFLKYCSLAATSLGLGAGMAPKIAWALENKPRIPVVWIHGL ECTCCTESFIRSAHPLAKDVILSLISLDYDDTLMAAAGNQAEEVFEDIITQYNGKYILAV EGNPPLGEQGMFCISSGRPFIEKLKRAAAGASAIIAWGTCASWGCVQAARPNPTQATPID KVITDKPIIKVPGCPPIPDVMSAIITYMVTFDRLPDVDRMGRPLMFYGQRIHDKCYRRAH FDAGEFVQSWDDDAARKGYCLYKMGCKGPTTYNACSSTRWNDGVSFPIQSGHGCLGCAEN GFWDRGSFYSRVVDIPQMGTHSTADTVGLTALGVVAAAVGVHAVASAVDQRRRHNQQPTE TEHQPGNEDKQA >gi|296493410|gb|ADTK01000091.1| GENE 3 3771 - 4430 841 219 aa, chain + ## HITS:1 COG:yccA KEGG:ns NR:ns ## COG: yccA COG0670 # Protein_GI_number: 16128937 # Func_class: R General function prediction only # Function: Integral membrane protein, interacts with FtsH # Organism: Escherichia coli K12 # 1 219 1 219 219 328 99.0 3e-90 MDRIVSSSHDRTSLLSTHKVLRNTYFLLSLTLAFSAITATASTVLMLPSPGLILTLVGMY GLMFLTYKTANKPTGIISAFAFTGFLGYILGPILNTYLSAGMGDVIAMALGGTALVFLCC SAYVLTTRKDMSFLGGMLMAGIVVVLIGMVANIFLQLPALHLAISAVFILISSGAILFET SNIIHGGETNYIRATVSLYVSLYNIFVSLLSILGFASRD >gi|296493410|gb|ADTK01000091.1| GENE 4 4521 - 4850 544 109 aa, chain + ## HITS:1 COG:ECs1053 KEGG:ns NR:ns ## COG: ECs1053 COG2920 # Protein_GI_number: 15830307 # Func_class: P Inorganic ion transport and metabolism # Function: Dissimilatory sulfite reductase (desulfoviridin), gamma subunit # Organism: Escherichia coli O157:H7 # 1 109 20 128 128 219 100.0 1e-57 MLIFEGKEIETDTEGYLKESSQWSEPLAVVIAENEGISLSPEHWEVVRFVRDFYLEFNTS PAIRMLVKAMANKFGEEKGNSRYLYRLFPKGPAKQATKIAGLPKPVKCI >gi|296493410|gb|ADTK01000091.1| GENE 5 4847 - 5125 362 92 aa, chain - ## HITS:1 COG:ECs1052 KEGG:ns NR:ns ## COG: ECs1052 COG1254 # Protein_GI_number: 15830306 # Func_class: C Energy production and conversion # Function: Acylphosphatases # Organism: Escherichia coli O157:H7 # 
1 92 1 92 92 182 98.0 1e-46 MSKVCIIAWVYGRVQGVGFRYTTQYEAKRLGLTGYAKNLDDGSVEVVACGEEGQVEKLMQ WLKSGGPRSARVERVLSEPHHPSGELADFRIR >gi|296493410|gb|ADTK01000091.1| GENE 6 5220 - 6410 582 396 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative [Thermococcus barophilus MP] # 22 389 21 386 396 228 32 7e-60 MSVRLVLAKGREKSLLRRHPWVFSGAVARMEGKASLGETIDIVDHQGKWLARGAYSPASQ IRARVWTFDPSESIDIAFFSRRLQQAQKWRDWLAQKDGLDSYRLIAGESDGLPGITIDRF GNFLVLQLLSAGAEYQRAALISALQTLYPECAIYDRSDVAVRKKEGMELTQGLVTGELPP ALLPIEEHGMKLLVDIQHGHKTGYYLDQRDSRLATRRYVENKRVLNCFSYTGGFAVSALM GGCSQVVSVDTSQEALDIARQNVELNKLDLSKAEFVRDDVFKLLRTYRDRGEKFDVIVMD PPKFVENKSQLMGACRGYKDINMLAIQLLNEGGILLTFSCSGLMTSDLFQKIIADAAIDA GRDVQFIEQFRQAADHPVIATYPEGLYLKGFACRVM >gi|296493410|gb|ADTK01000091.1| GENE 7 6468 - 6785 414 105 aa, chain + ## HITS:1 COG:ECs1050 KEGG:ns NR:ns ## COG: ECs1050 COG3785 # Protein_GI_number: 15830304 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 105 18 122 122 200 100.0 6e-52 MIASKFGIGQQVRHSLLGYLGVVVDIDPVYSLSEPSPDELAVNDELRAAPWYHVVMEDDN GLPVHTYLAEAQLSSELQDEHPEQPSMDELAQTIRKQLQAPRLRN Prediction of potential genes in microbial genomes Time: Mon May 16 15:15:35 2011 Seq name: gi|296493409|gb|ADTK01000092.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont253.6, whole genome shotgun sequence Length of sequence - 9405 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 7, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 17 - 430 618 ## COG1832 Predicted CoA-binding protein - Prom 550 - 609 3.7 + Prom 522 - 581 2.4 2 2 Op 1 3/0.800 + CDS 603 - 1265 624 ## COG3110 Uncharacterized protein conserved in bacteria + Term 1275 - 1301 -1.0 3 2 Op 2 . + CDS 1352 - 1819 391 ## COG1803 Methylglyoxal synthase + Term 1827 - 1858 4.1 - Term 1728 - 1766 1.2 4 3 Tu 1 . 
- CDS 1851 - 3923 1648 ## COG0210 Superfamily I DNA and RNA helicases - Prom 3972 - 4031 4.7 + Prom 3936 - 3995 3.6 5 4 Op 1 5/0.400 + CDS 4028 - 4474 507 ## COG3304 Predicted membrane protein 6 4 Op 2 . + CDS 4484 - 6646 1935 ## COG1289 Predicted membrane protein + Term 6662 - 6710 3.9 7 5 Tu 1 . - CDS 6609 - 7238 549 ## COG3070 Regulator of competence-specific genes - Prom 7293 - 7352 4.5 + Prom 7211 - 7270 5.8 8 6 Tu 1 5/0.400 + CDS 7457 - 7966 270 ## COG5404 SOS-response cell division inhibitor, blocks FtsZ ring formation + Term 8037 - 8080 3.1 + Prom 8111 - 8170 12.2 9 7 Tu 1 . + CDS 8266 - 9375 1205 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins Predicted protein(s) >gi|296493409|gb|ADTK01000092.1| GENE 1 17 - 430 618 137 aa, chain - ## HITS:1 COG:yccU KEGG:ns NR:ns ## COG: yccU COG1832 # Protein_GI_number: 16128932 # Func_class: R General function prediction only # Function: Predicted CoA-binding protein # Organism: Escherichia coli K12 # 1 137 28 164 164 268 99.0 3e-72 MKETDIAGILTSTHTIALVGASDKPDRPSYRVMKYLLDQGYHVIPVSPKVAGKTLLGQQG YGTLADVPEKVDMVDVFRNSEAAWGVAQEAIAIGAKTLWMQLGVINEQAAVLARDAGLNV VMDRCPAIEIPRLGLAK >gi|296493409|gb|ADTK01000092.1| GENE 2 603 - 1265 624 220 aa, chain + ## HITS:1 COG:ECs1048 KEGG:ns NR:ns ## COG: ECs1048 COG3110 # Protein_GI_number: 15830302 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 220 1 220 220 409 100.0 1e-114 MKTGIVTTLIALCLPVSVFATTLRLSTDVDLLVLDGKKVSSSLLRGADSIELDNGPHQLV FRVEKTIHLSNSEERLYISPPLVVSFNTQLINQVNFRLPRLENEREANHFDAAPRLELLD GDATPIPVKLDILAITSTAKTIDYEVEVERYNKSAKRASLPQFATMMADDSTLLSGVSEL DAIPPQSQVLTEQRLKYWFKLADPQTRNTFLQWAEKQPSS >gi|296493409|gb|ADTK01000092.1| GENE 3 1352 - 1819 391 155 aa, chain + ## HITS:1 COG:ECs1047 KEGG:ns NR:ns ## COG: ECs1047 COG1803 # Protein_GI_number: 15830301 # Func_class: G Carbohydrate transport and metabolism # Function: Methylglyoxal synthase # Organism: Escherichia coli 
O157:H7 # 1 155 1 155 155 313 100.0 9e-86 MYIMELTTRTLPARKHIALVAHDHCKQMLMSWVERHQPLLEQHVLYATGTTGNLISRATG MNVNAMLSGPMGGDQQVGALISEGKIDVLIFFWDPLNAVPHDPDVKALLRLATVWNIPVA TNVATADFIIQSPHFNDAVDILIPDYQRYLADRLK >gi|296493409|gb|ADTK01000092.1| GENE 4 1851 - 3923 1648 690 aa, chain - ## HITS:1 COG:helD KEGG:ns NR:ns ## COG: helD COG0210 # Protein_GI_number: 16128929 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Escherichia coli K12 # 7 690 1 684 684 1342 99.0 0 MWIYRVMELKATTLGKRLAQHPYDRAVILNAGIKVSGDRHEYLIPFNQLLAIHCKRGLVW GELEFVLPDEKVVRLHGTEWGETQRFYHHLDAHWRRWSGEMSEIASGVLRQQLDLIATRT GENKWLTREQTSGVQQQIRQALSALPLPVNRLEEFDNCREAWRKCQAWLKDIESARLQHN QAYTEAMLTEYADFFRQVESSPLNPAQARAVVNGEHSLLVLAGAGSGKTSVLVARAGWLL ARGEASPEQILLLAFGRKAAEEMDERIRERLHTEDITARTFHALALHIIQQGSKKVPIVS KLENDTAARHELFIAEWRKQCSEKKAQAKGWRQWLTEEMQWSVPEGNFWDDEKLQRRLAS RLDRWVSLMRMHGGAQAEMIASAPEEIRDLFSKRIKLMAPLLKAWKGALKAENAVDFSGL IHQAIVILEKGRFISPWKHILVDEFQDISPQRAALLAALRKQNSQTTLFAVGDDWQAIYR FSGAQMSLTTAFHENFGEGERCDLDTTYRFNSRIGKVANRFIQQNPGQLKKPLNSLTNGD KKAVTLLDESQLDALLDKLSGYAKPEERILILARYHHMRPASLEKAATRWPKLQIDFMTI HASKGQQADYVIIVGLQEGSDGFPAAARESIMEEALLPPVEDFPDAEERRLMYVALTRAR HRVWALFNKENPSPFVEILKNLDVPVARKP >gi|296493409|gb|ADTK01000092.1| GENE 5 4028 - 4474 507 148 aa, chain + ## HITS:1 COG:yccF KEGG:ns NR:ns ## COG: yccF COG3304 # Protein_GI_number: 16128928 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 134 1 134 148 239 100.0 2e-63 MRTVLNILNFVLGGFATTLGWLLATLVSIVLIFTLPLTRSCWEITKLSLVPYGNEAIHVD ELNPAGKNVLLNTGGTVLNIFWLIFFGWWLCLMHIATGIAQCISIIGIPVGIANFKIAAI ALWPVGRRVVSVETAQAAREANARRRFE >gi|296493409|gb|ADTK01000092.1| GENE 6 4484 - 6646 1935 720 aa, chain + ## HITS:1 COG:ECs1044 KEGG:ns NR:ns ## COG: ECs1044 COG1289 # Protein_GI_number: 15830298 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 720 1 720 720 1427 99.0 0 
MAFMLSPLLKRYTWNSAWLYYARIFIALCGTTAFPWWLGDVKLTIPLTLGMVAAALTDLD DRLAGRLRNLIITLFCFFIASASVELLFPWPWLFAIGLTLSTSGFILLGGLGQRYATIAF GALLIAIYTMLGTSLYEHWYQQPMYLLAGAVWYNVLTLIGHLLFPVRPLQDNLARCYEQL ARYLELKSRMFDPDIEDESQAPLYDLALANGQLMATLNQTKLSLLTRLRGDRGQRGTRRT LHYYFVAQDIHERASSSHIQYQTLREHFRHSDVLFRFQRLMSMQGQACQQLSRCILLRQP YQHDPHFERAFTHIDAALERMRDNGAPADLLKTLGFLLNNLRAIDAQLATIESEQAQALP HNNDENELADDSPHGLSDIWLRLSRHFTPESALFRHAVRMSLVLCFGYAIIQITGMHHGY WILLTSLFVCQPNYNATRHRLKLRIIGTLVGIAIGIPVLWFVPSLEGQLVLLVITGVLFF AFRNVQYAHATMFITLLVLLCFNLLGEGFEVALPRVIDTLIGCAIAWAAVSYIWPDWQFR NLPRMLERATEANCRYLDAILEQYHQGRDNRLAYRIARRDAHNRDAELASVVSNMSSEPN VTPQIREAAFRLLCLNHTFTSYISALGAHREQLTNPEILAFLDDAVCYVDDALHHQPADE ERVNEALASLKQRMQQLEPRADSKEPLVVQQVGLLIALLPEIGRLQRQITQVPQETPVSA >gi|296493409|gb|ADTK01000092.1| GENE 7 6609 - 7238 549 209 aa, chain - ## HITS:1 COG:yccR KEGG:ns NR:ns ## COG: yccR COG3070 # Protein_GI_number: 16128926 # Func_class: K Transcription # Function: Regulator of competence-specific genes # Organism: Escherichia coli K12 # 1 209 1 209 209 407 100.0 1e-114 MKSLSYKRIYKSQEYLATLGTIEYRSLFGSYSLTVDDTVFAMVSDGELYLRACEQSAQYC VKHPPVWLTYKKCGRSVTLNYYRVDESLWRNQLKLVRLSKYSLDAALKEKSTRNTRERLK DLPNMSFHLEAILGEVGIKDVRALRILGAKMCWLRLRQQNSLVTEKILFMLEGAIIGIHE AALPVARRQELAEWADSLTPKQEFPAELE >gi|296493409|gb|ADTK01000092.1| GENE 8 7457 - 7966 270 169 aa, chain + ## HITS:1 COG:ECs1042 KEGG:ns NR:ns ## COG: ECs1042 COG5404 # Protein_GI_number: 15830296 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: SOS-response cell division inhibitor, blocks FtsZ ring formation # Organism: Escherichia coli O157:H7 # 1 169 1 169 169 269 100.0 2e-72 MYTSGYAHRSSSFSSAASKIARVSTENTTAGLISEVVYREDQPMMTQLLLLPLLQQLGQQ SRWQLWLTPQQKLSREWVQASGLPLTKVMQISQLSPCHTVESMVRALRTGNYSVVIGWLA DDLTEEEHAELVDAANEGNAMGFIMRPVSASSHATRQLSGLKIHSNLYH >gi|296493409|gb|ADTK01000092.1| GENE 9 8266 - 9375 1205 369 aa, chain + ## HITS:1 COG:ECs1041 KEGG:ns NR:ns ## COG: ECs1041 COG2885 # Protein_GI_number: 15830295 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Escherichia coli O157:H7 # 20 369 1 346 346 637 96.0 0 MLSRWRYSWRILDDNEAQKMKKTAIAIAVALAGFATVAQAAPKDNTWYTGAKLGWSQYHD TGFIPNNGPTHENQLGAGAFGGYQVNPYVGFEMGYDWLGRMPYKGDNINGAYKAQGVQLT AKLGYPITDDLDVYTRLGGMVWRADTKANVPGGASFKDHDTGVSPVFAGGVEYAITPEIA TRLEYQWTNNIGDAHTIGTRPDNGMLSLGVSYRFGQGEAAPVVAPAPAPAPEVQTKHFTL KSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERR AQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEVKGI KDVVTQPQA Prediction of potential genes in microbial genomes Time: Mon May 16 15:15:58 2011 Seq name: gi|296493408|gb|ADTK01000093.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont253.7, whole genome shotgun sequence Length of sequence - 68853 bp Number of predicted genes - 59, with homology - 58 Number of transcription units - 31, operones - 12 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 89 - 541 517 ## COG3120 Uncharacterized protein conserved in bacteria - Prom 567 - 626 4.0 + Prom 640 - 699 3.7 2 2 Op 1 8/0.111 + CDS 727 - 2487 1481 ## COG1067 Predicted ATP-dependent protease 3 2 Op 2 . + CDS 2556 - 3074 743 ## COG0764 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases + Term 3102 - 3137 4.5 - Term 3089 - 3126 9.1 4 3 Tu 1 . - CDS 3144 - 3311 98 ## COG3130 Ribosome modulation factor - Prom 3388 - 3447 3.3 - Term 3504 - 3544 4.1 5 4 Op 1 8/0.111 - CDS 3567 - 4130 667 ## COG3009 Uncharacterized protein conserved in bacteria 6 4 Op 2 11/0.000 - CDS 4127 - 5767 1442 ## COG3008 Paraquat-inducible protein B 7 4 Op 3 5/0.556 - CDS 5772 - 7025 703 ## COG2995 Uncharacterized paraquat-inducible protein A - Prom 7089 - 7148 1.6 8 5 Op 1 6/0.222 - CDS 7155 - 9062 2407 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 9 5 Op 2 . 
- CDS 9074 - 11182 178 ## PROTEIN SUPPORTED gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative - Prom 11355 - 11414 5.8 + Prom 11301 - 11360 3.4 10 6 Tu 1 . + CDS 11426 - 12535 680 ## COG3217 Uncharacterized Fe-S protein - Term 12227 - 12288 0.5 11 7 Tu 1 . - CDS 12532 - 13074 434 ## SSON_0950 hypothetical protein - Prom 13174 - 13233 6.7 - Term 13171 - 13226 12.9 12 8 Tu 1 . - CDS 13248 - 14258 941 ## COG0167 Dihydroorotate dehydrogenase - Prom 14290 - 14349 4.0 13 9 Tu 1 . + CDS 14202 - 14411 139 ## gi|213615837|ref|ZP_03371663.1| hypothetical protein SentesTyp_15729 14 10 Tu 1 . - CDS 14369 - 14761 174 ## COG3121 P pilus assembly protein, chaperone PapD - Prom 14907 - 14966 1.6 - Term 14899 - 14925 -0.6 15 11 Op 1 4/0.667 - CDS 15072 - 15587 263 ## COG3539 P pilus assembly protein, pilin FimA 16 11 Op 2 4/0.667 - CDS 15595 - 16137 227 ## COG3539 P pilus assembly protein, pilin FimA 17 11 Op 3 6/0.222 - CDS 16149 - 17192 425 ## COG3539 P pilus assembly protein, pilin FimA 18 11 Op 4 10/0.000 - CDS 17210 - 19810 1706 ## COG3188 P pilus assembly protein, porin PapC 19 11 Op 5 7/0.111 - CDS 19835 - 20536 328 ## COG3121 P pilus assembly protein, chaperone PapD - Prom 20556 - 20615 4.0 - Term 20554 - 20587 4.5 20 12 Tu 1 . - CDS 20619 - 21170 296 ## COG3539 P pilus assembly protein, pilin FimA + Prom 21259 - 21318 5.1 21 13 Op 1 4/0.667 + CDS 21517 - 22092 567 ## COG0431 Predicted flavoprotein 22 13 Op 2 5/0.556 + CDS 22085 - 23044 914 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components 23 13 Op 3 8/0.111 + CDS 23041 - 24186 1145 ## COG2141 Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases 24 13 Op 4 24/0.000 + CDS 24197 - 24988 831 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 25 13 Op 5 . 
+ CDS 24985 - 25752 216 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 + Term 25909 - 25953 3.7 26 14 Tu 1 . - CDS 25959 - 28571 3048 ## COG0308 Aminopeptidase N - Prom 28678 - 28737 4.3 + Prom 28535 - 28594 3.5 27 15 Tu 1 . + CDS 28789 - 30039 1153 ## COG1488 Nicotinic acid phosphoribosyltransferase + Term 30048 - 30077 1.2 28 16 Tu 1 . - CDS 30045 - 30314 81 ## + Prom 30090 - 30149 6.4 29 17 Tu 1 . + CDS 30208 - 31608 1682 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases + Term 31623 - 31654 4.1 + Prom 31996 - 32055 7.8 30 18 Op 1 3/0.778 + CDS 32210 - 32890 813 ## COG3203 Outer membrane protein (porin) 31 18 Op 2 5/0.556 + CDS 32847 - 33287 438 ## COG3203 Outer membrane protein (porin) + Term 33312 - 33352 4.2 + Prom 33378 - 33437 5.0 32 19 Tu 1 . + CDS 33472 - 34662 1512 ## COG1448 Aspartate/tyrosine/aromatic aminotransferase + Term 34852 - 34883 1.1 - Term 34710 - 34747 2.2 33 20 Op 1 7/0.111 - CDS 34884 - 35531 497 ## COG0491 Zn-dependent hydrolases, including glyoxylases 34 20 Op 2 9/0.000 - CDS 35558 - 36106 376 ## COG3108 Uncharacterized protein conserved in bacteria - Prom 36159 - 36218 2.4 - Term 36152 - 36198 5.5 35 20 Op 3 4/0.667 - CDS 36287 - 38134 1394 ## COG2989 Uncharacterized protein conserved in bacteria - Prom 38250 - 38309 6.5 - Term 38358 - 38388 3.0 36 21 Op 1 8/0.111 - CDS 38395 - 42855 5594 ## COG3096 Uncharacterized protein involved in chromosome partitioning 37 21 Op 2 7/0.111 - CDS 42855 - 43559 775 ## COG3095 Uncharacterized protein involved in chromosome partitioning 38 21 Op 3 6/0.222 - CDS 43540 - 44862 1594 ## COG3006 Uncharacterized protein involved in chromosome partitioning 39 21 Op 4 . - CDS 44859 - 45644 694 ## COG0500 SAM-dependent methyltransferases - Prom 45777 - 45836 5.2 40 22 Tu 1 . + CDS 45780 - 46559 621 ## COG1434 Uncharacterized conserved protein + Term 46781 - 46818 -0.9 41 23 Tu 1 . 
- CDS 46536 - 47429 693 ## JW0902 conserved hypothetical protein - Prom 47466 - 47525 4.8 - Term 47510 - 47551 3.4 42 24 Op 1 11/0.000 - CDS 47583 - 48329 771 ## COG1212 CMP-2-keto-3-deoxyoctulosonic acid synthetase 43 24 Op 2 4/0.667 - CDS 48326 - 48508 145 ## COG2835 Uncharacterized conserved protein 44 24 Op 3 4/0.667 - CDS 48560 - 49792 830 ## COG3214 Uncharacterized protein conserved in bacteria 45 24 Op 4 9/0.000 - CDS 49829 - 50815 662 ## COG1663 Tetraacyldisaccharide-1-P 4'-kinase 46 24 Op 5 5/0.556 - CDS 50812 - 52560 242 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 47 24 Op 6 2/1.000 - CDS 52597 - 54861 754 ## COG0658 Predicted membrane metal-binding protein - Prom 54942 - 55001 2.8 - Term 55005 - 55045 8.3 48 25 Tu 1 . - CDS 55068 - 55352 167 ## PROTEIN SUPPORTED gi|148826039|ref|YP_001290792.1| 50S ribosomal protein L35 - Term 55397 - 55426 3.5 49 26 Op 1 . - CDS 55472 - 55570 150 ## PROTEIN SUPPORTED gi|238903721|ref|ZP_04649193.1| ribosomal protein S1 50 26 Op 2 21/0.000 - CDS 55512 - 57185 2807 ## PROTEIN SUPPORTED gi|15800772|ref|NP_286786.1| 30S ribosomal protein S1 - Term 57239 - 57275 1.3 51 26 Op 3 3/0.778 - CDS 57296 - 57979 267 ## PROTEIN SUPPORTED gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 - Prom 58017 - 58076 5.0 - Term 58004 - 58037 2.3 52 27 Tu 1 . - CDS 58152 - 58916 814 ## COG0501 Zn-dependent protease with chaperone function - Prom 59023 - 59082 3.2 - Term 59010 - 59041 2.0 53 28 Op 1 6/0.222 - CDS 59085 - 60368 1309 ## COG0128 5-enolpyruvylshikimate-3-phosphate synthase - Term 60386 - 60419 4.1 54 28 Op 2 4/0.667 - CDS 60439 - 61527 1075 ## COG1932 Phosphoserine aminotransferase - Prom 61584 - 61643 3.3 55 29 Tu 1 . - CDS 61726 - 62418 673 ## COG2323 Predicted membrane protein - Prom 62504 - 62563 1.9 + Prom 62375 - 62434 2.9 56 30 Tu 1 . 
+ CDS 62548 - 64308 1920 ## COG1944 Uncharacterized conserved protein + Term 64318 - 64346 0.7 57 31 Op 1 7/0.111 + CDS 64714 - 65571 658 ## COG2116 Formate/nitrite family of transporters 58 31 Op 2 11/0.000 + CDS 65626 - 67908 2425 ## COG1882 Pyruvate-formate lyase + Term 67951 - 67982 3.2 + Prom 67958 - 68017 2.9 59 31 Op 3 . + CDS 68100 - 68840 684 ## COG1180 Pyruvate-formate lyase-activating enzyme Predicted protein(s) >gi|296493408|gb|ADTK01000093.1| GENE 1 89 - 541 517 150 aa, chain - ## HITS:1 COG:ECs1040 KEGG:ns NR:ns ## COG: ECs1040 COG3120 # Protein_GI_number: 15830294 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 150 1 150 150 253 100.0 9e-68 MKYQQLENLESGWKWKYLVKKHREGELITRYIEASAAQEAVDVLLSLENEPVLVNGWIDK HMNPELVNRMKQTIRARRKRHFNAEHQHTRKKSIDLEFIVWQRLAGLAQRRGKTLSETIV QLIEDAENKEKYANKMSSLKQDLQALLGKE >gi|296493408|gb|ADTK01000093.1| GENE 2 727 - 2487 1481 586 aa, chain + ## HITS:1 COG:ECs1039 KEGG:ns NR:ns ## COG: ECs1039 COG1067 # Protein_GI_number: 15830293 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent protease # Organism: Escherichia coli O157:H7 # 1 586 1 586 586 1167 100.0 0 MTITKLAWRDLVPDTDSYQEIFAQPHLIDENDPLFSDTQPRLQFALEQLLHTRASSSFML AKAPEESEYLNLIANAARTLQSDAGQLVGGHYEVSGHSIRLRHAVSADDNFATLTQVVAA DWVEAEQLFGCLRQFNGDITLQPGLVHQANGGILIISLRTLLAQPLLWMRLKNIVNRERF DWVAFDESRPLPVSVPSMPLKLKVILVGERESLADFQEMEPELSEQAIYSEFEDTLQIVD AESVTQWCRWVTFTARHNHLPAPGADAWPVLIREAARYTGEQETLPLSPQWILRQCKEVA SLCDGDTFSGEQLNLMLQQREWREGFLAERMQDEILQEQILIETEGERIGQINALSVIEF PGHPRAFGEPSRISCVVHIGDGEFTDIERKAELGGNIHAKGMMIMQAFLMSELQLEQQIP FSASLTFEQSYSEVDGDSASMAELCALISALADVPVNQSIAITGSVDQFGRAQPVGGLNE KIEGFFAICQQRELTGKQGVIIPTANVRHLSLHSELVKAVEEGKFTIWAVDDVTDALPLL LNLVWDGEGQTTLMQTIQERIAQASQQEGRHRFPWPLRWLNWFIPN >gi|296493408|gb|ADTK01000093.1| GENE 3 2556 - 3074 743 172 aa, chain + ## HITS:1 COG:ECs1038 KEGG:ns NR:ns ## COG: ECs1038 COG0764 # 
Protein_GI_number: 15830292 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases # Organism: Escherichia coli O157:H7 # 1 172 1 172 172 344 100.0 4e-95 MVDKRESYTKEDLLASGRGELFGAKGPQLPAPNMLMMDRVVKMTETGGNFDKGYVEAELD INPDLWFFGCHFIGDPVMPGCLGLDAMWQLVGFYLGWLGGEGKGRALGVGEVKFTGQVLP TAKKVTYRIHFKRIVNRRLIMGLADGEVLVDGRLIYTASDLKVGLFQDTSAF >gi|296493408|gb|ADTK01000093.1| GENE 4 3144 - 3311 98 55 aa, chain - ## HITS:1 COG:ECs1037 KEGG:ns NR:ns ## COG: ECs1037 COG3130 # Protein_GI_number: 15830291 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome modulation factor # Organism: Escherichia coli O157:H7 # 1 55 1 55 55 80 100.0 5e-16 MKRQKRDRLERAHQRGYQAGIAGRSKEMCPYQTLNQRSQWLGGWREAMADRVVMA >gi|296493408|gb|ADTK01000093.1| GENE 5 3567 - 4130 667 187 aa, chain - ## HITS:1 COG:ymbA KEGG:ns NR:ns ## COG: ymbA COG3009 # Protein_GI_number: 16128919 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 6 187 1 182 182 337 98.0 9e-93 MKKWLVTIATLWLAGCSSGEINKNYYQLPVVQSGTQSTASQGNRLLWVEQVTVPDYLAGN GVVYQTSDVKYVIANNNLWASPLDQQLRNTLVANLSTQLPGWVVASQPLGSAQDTLNVTV TEFNGRYDGKVIVSGEWLLNHQGQLIKRPFRLEGVQTQDGYDEMVKVLAGVWSQEAASIA QEIKRLP >gi|296493408|gb|ADTK01000093.1| GENE 6 4127 - 5767 1442 546 aa, chain - ## HITS:1 COG:ECs1035 KEGG:ns NR:ns ## COG: ECs1035 COG3008 # Protein_GI_number: 15830289 # Func_class: R General function prediction only # Function: Paraquat-inducible protein B # Organism: Escherichia coli O157:H7 # 1 546 1 546 546 1081 99.0 0 MESNNGEAKIQKVKNWSPVWIFPIVTALIGAWVLFYHYSHQGPEVTLITANAEGIEGGKT TIKSRSVDVGVVESATLADDLTHVEIKARLNSGMEKLLHKDTVFWVVKPQIGREGISGLG TLLSGVYIELQPGAKGSKMDKYDLLDSPPLAPPDAKGIRVILDSKKAGQLSPGDPVLFRG YRVGSVETSTFDTQKRNISYQLFINAPYDRLVTSNVRFWKDSGIAVDLTSAGMRVEMGSL TTLLSGGVSFDVPEGLDLGQPVAPKTAFVLYDDQKSIQDSLYTDHIDYLMFFKDSVRGLQ PGAPVEFRGIRLGTVSKVPFFAPNMRQTFNDDYRIPVLIRIEPERLKMQLGENADVVEHL 
GELLKRGLRGSLKTGNLVTGALYVDLDFYPNTPAITGIREFNGYQIIPTVSGGLAQIQQR LMEALDKINKLPLNPMIEQATSTLSESQRTMKNLQTTLDSMNKILASQSMQQLPTDMQST LRELNRSMQGFQPGSAAYNKMVADMQRLDQVLRELQPVLKTLNEKSNALVFEAKDKKDPE PKRAKQ >gi|296493408|gb|ADTK01000093.1| GENE 7 5772 - 7025 703 417 aa, chain - ## HITS:1 COG:pqiA KEGG:ns NR:ns ## COG: pqiA COG2995 # Protein_GI_number: 16128917 # Func_class: S Function unknown # Function: Uncharacterized paraquat-inducible protein A # Organism: Escherichia coli K12 # 1 417 1 417 417 790 100.0 0 MCEHHHAAKHILCSQCDMLVALPRLEHGQKAACPRCGTTLTVAWDAPRQRPTAYALAALF MLLLSNLFPFVNMNVAGVTSEITLLEIPGVLFSEDYASLGTFFLLFVQLVPAFCLITILL LVNRAELPVRLKEQLARVLFQLKTWGMAEIFLAGVLVSFVKLMAYGSIGVGSSFLPWCLF CVLQLRAFQCVDRRWLWDDIAPMPELRQPLKPGVTGIRQGLRSCSCCTAILPADEPVCPR CSTKGYVRRRNSLQWTLALLVTSIMLYLPANILPIMVTDLLGSKMPSTILAGVILLWSEG SYPVAAVIFLASIMVPTLKMIAIAWLCWDAKGHGKRDSERMHLIYEVVEFVGRWSMIDVF VIAVLSALVRMGGLMSIYPAMGALMFALVVIMTMFSAMTFDPRLSWDRQPESEHEES >gi|296493408|gb|ADTK01000093.1| GENE 8 7155 - 9062 2407 635 aa, chain - ## HITS:1 COG:ECs1033 KEGG:ns NR:ns ## COG: ECs1033 COG0488 # Protein_GI_number: 15830287 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Escherichia coli O157:H7 # 1 635 1 635 635 1189 99.0 0 MSLISMHGAWLSFSDAPLLDNAELHIEDNERVCLVGRNGAGKSTLMKILNREQGLDDGRI IYEQDLIVARLQQDPPRNVEGSVYDFVAEGIEEQAEYLKRYHDISRLVMNDPSEKNLNEL AKVQEQLDHHNLWQLENRINEVLAQLGLDPNVALSSLSGGWLRKAALGRALVSNPRVLLL DEPTNHLDIETIDWLEGFLKTFNGTIIFISHDRSFIRNMATRIVDLDRGKLVTYPGNYDQ YLLEKEEALRVEELQNAEFDRKLAQEEVWIRQGIKARRTRNEGRVRALKAMRRERGERRE VMGTAKMQVEEASRSGKIVFEMEDVCYQVDGKQLVKDFSAQVLRGDKIALIGPNGCGKTT LLKLMLGQLQADSGRIHVGTKLEVAYFDQHRAELDPDKTVMDNLAEGKQEVMVNGKPRHV LGYLQDFLFHPKRAMTPVRALSGGERNRLLLARLFLKPSNLLILDEPTNDLDVETLELLE ELIDSYQGTVLLVSHDRQFVDNTVTECWIFEGGGKIGRYVGGYHDARGQQEQYVALKQPA VKKTEEAAVPKAETVKRSSSKLSYKLQRELEQLPQLLEDLEAKLEALQTQVADASFFSQP HEQTQKVLADMAAAEQELEQAFERWEYLEALKNGG >gi|296493408|gb|ADTK01000093.1| GENE 9 9074 - 11182 178 702 aa, chain 
- ## PROTEIN SUPPORTED ## NR: gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative [Thermococcus barophilus MP] # 487 667 174 328 396 73 29 4e-12 MNSLFASTARGLEELLKTELENLGAVECQVVQGGVHFKGDTRLVYQSLMWSRLASRIMLP LGECKVYSDLDLYLGVQAINWTEMFNPGATFAVHFSGLNDTIRNSQYGAMKVKDAIVDAF TRKNLPRPNVDRDAPDIRVNVWLHKETASIALDLSGDGLHLRGYRDRAGIAPIKETLAAA IVMRSGWQPGTPLLDPMCGSGTLLIEAAMLATDRAPGLHRGRWGFSGWAQHDEAIWQEVK AEAQTRARKGLAEYSSHFYGSDSDARVIQRARTNARLAGIGELITFEVKDVAQLTNPLPK GPYGTVLSNPPYGERLDSEPALIALHSLLGRIMKNQFGGWNLSLFSASPDLLSCLQLRAD KQYKAKNGPLDCVQKNYHVAESTPDSKPAMVAEDYANRLRKNLKKFEKWARQEGIECYRL YDADLPEYNVAVDRYADWVVVQEYAPPKTIDAHKARQRLFDIIAATISVLGIAPNKLVLK TRERQKGKNQYQKLGEKGEFLEVTEYNAHLWVNLTDYLDTGLFLDHRIARRMLGQMSKGK DFLNLFSYTGSATVHAGLGGARSTTTVDMSRTYLEWAERNLRLNGLTGRAHRLIQADCLA WLREANEQFDLIFIDPPTFSNSKRMEDAFDVQRDHLALMKDLKRLLRAGGTIMFSNNKRG FRMDLDGLAKLGLKAQEITQKTLSQDFARNRQIHNCWLITAA >gi|296493408|gb|ADTK01000093.1| GENE 10 11426 - 12535 680 369 aa, chain + ## HITS:1 COG:ycbX_1 KEGG:ns NR:ns ## COG: ycbX_1 COG3217 # Protein_GI_number: 16128914 # Func_class: R General function prediction only # Function: Uncharacterized Fe-S protein # Organism: Escherichia coli K12 # 1 273 1 273 273 567 99.0 1e-161 MVTLTRLFIHPVKSMRGIGLTHALADVSGLAFDRIFMITEPDGTFITARQFPQMVRFTPS PVHDGLHLTAPDGSSAYVRFADFATQDAPTEVWGTHFTARIAPDAINKWLSGFFSREVQL RWVGPQMTRRVKRHNTVPLSFADGYPYLLANEASLRDLQQRCPASVKMEQFRPNLVVSGA SAWEEDRWKVIRIGDVVFDVVKPCSRCIFTTVSPEKGQKHPAGEPLKTLQSFRTAQDNGD VDFGQNLIARNSGVIRVGDEVEILATAPAKIYGAAAADDTANITQQPDANVDIDWQGQAF RGNNQQVLLEQLENQGIRIPYSCRAGICGSCRVQLLEGEVTPLKKSAMGDDGTILCCSCV PKTALKLAR >gi|296493408|gb|ADTK01000093.1| GENE 11 12532 - 13074 434 180 aa, chain - ## HITS:1 COG:no KEGG:SSON_0950 NR:ns ## KEGG: SSON_0950 # Name: ycbW # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 180 13 192 192 374 100.0 1e-102 MRIKPDDNWRWYYDEEHDRMMLDLANGMLFRSRFARKMLTPDAFSPAGFCVDDAALYFSF EEKCRDFNLSKEQKAELVLNALVAIRYLKPQMPKSWHFVSHGEMWVPMPGDAACVWLSDT 
HEQVNLLVVESGENAALCLLAQPCVVIAGRAMQLGDAIKIMNDRLKPQVNVDSFSLEQAV >gi|296493408|gb|ADTK01000093.1| GENE 12 13248 - 14258 941 336 aa, chain - ## HITS:1 COG:ECs1029 KEGG:ns NR:ns ## COG: ECs1029 COG0167 # Protein_GI_number: 15830283 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Escherichia coli O157:H7 # 1 336 1 336 336 679 100.0 0 MYYPFVRKALFQLDPERAHEFTFQQLRRITGTPFEALVRQKVPAKPVNCMGLTFKNPLGL AAGLDKDGECIDALGAMGFGSIEIGTVTPRPQPGNDKPRLFRLVDAEGLINRMGFNNLGV DNLVENVKKAHYDGVLGINIGKNKDTPVEQGKDDYLICMEKIYAYAGYIAINISSPNTPG LRTLQYGEALDDLLTAIKNKQNDLQAMHHKYVPIAVKIAPDLSEEELIQVADSLVRHNID GVIATNTTLDRSLVQGMKNCDQTGGLSGRPLQLKSTEIIRRLSLELNGRLPIIGVGGIDS VIAAREKIAAGASLVQIYSGFIFKGPPLIKEIVTHI >gi|296493408|gb|ADTK01000093.1| GENE 13 14202 - 14411 139 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|213615837|ref|ZP_03371663.1| ## NR: gi|213615837|ref|ZP_03371663.1| hypothetical protein SentesTyp_15729 [Salmonella enterica subsp. enterica serovar Typhi str. E98-2068] # 1 44 1 44 45 81 88.0 2e-14 MSALWIELEKGFTNEGVVHELSWIPGVQTGGVLCASSDQKGIDLRQKKRKRFPNLFQGRF CSFFALNPP >gi|296493408|gb|ADTK01000093.1| GENE 14 14369 - 14761 174 130 aa, chain - ## HITS:1 COG:ycbF KEGG:ns NR:ns ## COG: ycbF COG3121 # Protein_GI_number: 16128911 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli K12 # 1 130 116 245 245 252 96.0 1e-67 MPVDRETLFELSIASVPSGKVENQSVKVAIRSVFKLFWRPEGLPGDPLEAYQQLRWTWNS QGVQLTNPTPYYINLIQVSVNGKALSNAGVVPPKSQRQTSWCHAIAPCHVAWRAINDYGG LSAKKEQNLP >gi|296493408|gb|ADTK01000093.1| GENE 15 15072 - 15587 263 171 aa, chain - ## HITS:1 COG:ECs1027 KEGG:ns NR:ns ## COG: ECs1027 COG3539 # Protein_GI_number: 15830281 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli O157:H7 # 1 171 17 187 187 342 100.0 3e-94 
MLKRIIWILFLLGLTGGCELFAHDGTVNISGSFRRNTCVLAQDSKQINVQLGDVSLTRFS HGNYGPEKSFIINLQDCGTDVSTVDVTFSGTPDGVQSEMLSIESGTDAASGLAIAILDDA KILIPLNQASKDYSLHSGKVPLTFYAQLRPVNSDVQSGKVNASATFVLHYD >gi|296493408|gb|ADTK01000093.1| GENE 16 15595 - 16137 227 180 aa, chain - ## HITS:1 COG:ycbU KEGG:ns NR:ns ## COG: ycbU COG3539 # Protein_GI_number: 16128909 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 180 1 180 180 334 100.0 6e-92 MKKKTIFQCVILFFSILNIHVGMAGPEQVSMHIYGNVVDQGCDVATKSALQNIHIGDFNI SDFQAANTVSTAADLNIDITGCAAGITGADVLFSGEADTLAPTLLKLTDTGGSGGMATGI AVQILDAQSQQEIPLNQVQPLTPLKAGDNTLKYQLRYKSTKAGATGGNATAVLYFDLVYQ >gi|296493408|gb|ADTK01000093.1| GENE 17 16149 - 17192 425 347 aa, chain - ## HITS:1 COG:ycbT KEGG:ns NR:ns ## COG: ycbT COG3539 # Protein_GI_number: 16128908 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 3 347 12 356 356 688 99.0 0 MLLLRLFFAAVLMLWCAQTAAYSGQCHTTQGNPYIGVNFGVKTLEEEENTAGVVKDKFYQ WNESNDYYVSCDCDKDNVRSGRWAFAADSPLVYLGDNWYKINDYLAAKVLLQVKGSSPTA VPFENVGTGADTRWHICDPGGQRLGGQGASGNSGSFSLKILQPFVGSVVIPPMALARLFE CYNIPAGDSCTTTGTPVLVYYLSGTINSLGSCSVNAGETIEVDLGDVFAANFRVVGHKPL GARTAELAIPVRCNTGNAGLVNVNLSLTATTDPSYPQAIKTSRPGVGVVVTDSQNNIISP AGGTLPLSIPDDADSIARMNVYPVSTTGVPPETGRFEATATVRINFD >gi|296493408|gb|ADTK01000093.1| GENE 18 17210 - 19810 1706 866 aa, chain - ## HITS:1 COG:ycbS KEGG:ns NR:ns ## COG: ycbS COG3188 # Protein_GI_number: 16128907 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli K12 # 1 866 1 866 866 1666 98.0 0 MYRTHRQHSLLSSGGVPSFIGGLVVFVSAAFNAQAETWFDPAFFKDDPSMVADLSRFEKG QKITPGVYRVDIVLNQTIVDTRNVNFVEITPEKGIAACLTTESLDAMGVNTDAFPAFKQL DKQACVPLAEIIPDASVTFNVNKLRLEISVPQIAIKSNARGYVPPERWDEGINALLLGYS 
FSGANSIHSSADSDSGDSYFLNLNSGVNLGPWRLRNNSTWSRSSGQTAEWKNLSSYLQRA VIPLKGELTVCDDYTAGDFFDSVSFRGVQLASDDNMLPDSLKGFAPVVRGIAKSNAQITI KQNGYTIYQTYVSPGAFEISDLYSTSSSGDLLVEIKEADGSVNSYSVPFSSVPLLQRQGR IKYAVTLAKYRTNSNEQQESKFAQATLQWGGPWGTTWYGGGQYAEYYRAAMFGLGFNLGD FGAISFDATQAKSTLADQSEHKGQSYRFLYAKTLNQLGTNFQLMGYRYSTSGFYTLSDTM YKHMDGYEFNDGDDEDTPMWSRYYNLFYTKRGKLQVNISQQLGEYGSFYLSGSQQTYWHT DQQDRLLQFGYNTQIKDLSLGVSWNYSKSRGQPDADQVFALNFSLPLNLLLPRSNDSYTR KKNYAWMTSNTSIDNEGHITQNLGLTETLLDDGNLSYSVQQGYNSEGKTANGSASMDYKG AFADARVGYNYSDNGSQQQLNYALSGSLVAHSQGITLGQSLGETNVLIAAPGAENTRVAN STGLKTDWRGYTVVPYATSYRENRIALDAASLKRNVDLENAVVNVVPTKGALVLAEFNAH AGARVLMKTSKQGIPLRFGAIATLDGIQTNSGIIDDDGSLYMSGLPAQGAITVRWGEAPD QICHISYQLTEQQINSAITRMDAICR >gi|296493408|gb|ADTK01000093.1| GENE 19 19835 - 20536 328 233 aa, chain - ## HITS:1 COG:ycbR KEGG:ns NR:ns ## COG: ycbR COG3121 # Protein_GI_number: 16128906 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli K12 # 1 233 1 233 233 447 99.0 1e-126 MKTCITKGIVTVSLTAILLSCSSAWAAGKGGIGLAATRLVYSEGEEQISLGVRNTSPDVP YLIQSWVMTPDNKKSADFIITPPLFVLNPANENLLRIMYIGAPLAKDRETLFFTSVRAVP STTKREEGNTLKIATQSVIKLFWRPKGLAYPLGEAPAKLRCTSSADMVTVSNPTPYFITL TDLKIGGKVVKNQMISPFDKYQFSLPKGAKNSSVTYRTINDYGAETPQLNCKS >gi|296493408|gb|ADTK01000093.1| GENE 20 20619 - 21170 296 183 aa, chain - ## HITS:1 COG:ECs1021 KEGG:ns NR:ns ## COG: ECs1021 COG3539 # Protein_GI_number: 15830275 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli O157:H7 # 1 183 1 182 182 214 84.0 8e-56 MMTMKKSVLTAFITVVCATSSVMAADDNAITDGKVTFNGKVIAPACTLVAATKDSVVTLP NVSATKLQTNGAVSGVKTDVPIALEGCDVTVTKNATFTFSGTADGVQPTAFANQATTDAA TNVALQMYLPDGSTSVTPGTETSNIQLADSAEQTVTFKVDYIATGKATSGNVNAVTNFHI NYY >gi|296493408|gb|ADTK01000093.1| GENE 21 21517 - 22092 567 191 aa, chain + ## HITS:1 COG:ZycbP KEGG:ns NR:ns ## COG: ZycbP 
COG0431 # Protein_GI_number: 15800798 # Func_class: R General function prediction only # Function: Predicted flavoprotein # Organism: Escherichia coli O157:H7 EDL933 # 1 191 1 191 191 352 96.0 2e-97 MRVITLAGSPRFPSRSSSLLEYAREKLNCLDVEVYHWNLQNFAPEDLLYARFDSPALKTF TEQLQQADGLIVATPVYKAAYSGALKTLLDLLPERALQGKVVLPLATGGTVAHLLAVDYA LKPVLSAMKAQEILHGVFADDSQVIDYHHKPQFTPNLQTRLDTALETFWQALHRRDVQVP DLLSLRGNVHA >gi|296493408|gb|ADTK01000093.1| GENE 22 22085 - 23044 914 319 aa, chain + ## HITS:1 COG:ssuA KEGG:ns NR:ns ## COG: ssuA COG0715 # Protein_GI_number: 16128903 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Escherichia coli K12 # 1 319 15 333 333 604 99.0 1e-173 MRNIIKLALAGLLSVSTFAVAAESSPEALRIGYQKGSIGMVLAKSHQLLEKRYPQSKISW VEFPAGPQMLEALNVGSIDLGSTGDIPPIFAQAAGADLVYVGVEPPKPKAEVILVAENSP IKTVADLKGHKVAFQKGSSSHNLLLRALRQAGLKFTDIQPTYLTPADARAAFQQGNVDAW AIWDPYYSAALLQGGVRVLKDGTDLNQTGSFYLAARPYAEKNGAFIQGVLATFSEADALT RSQREQSIALLAKTMGLPAPVIASYLDHRPPTTIKPVTAEVAALQQQTADLFYENRLVPK KVDIRQRIWQPTQLEGKQL >gi|296493408|gb|ADTK01000093.1| GENE 23 23041 - 24186 1145 381 aa, chain + ## HITS:1 COG:ECs1018 KEGG:ns NR:ns ## COG: ECs1018 COG2141 # Protein_GI_number: 15830272 # Func_class: C Energy production and conversion # Function: Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases # Organism: Escherichia coli O157:H7 # 1 381 1 381 381 755 99.0 0 MSLNMFWFLPTHGDGHYLGTEEGSRPVDHGYLQQIAQAADRLGYTGVLIPTGRSCEDAWL VAASMIPVTQRLKFLVALRPSVTSPTVAARQAATLDRLSNGRALFNLVTGSDPQELAGDG VFLDHSERYEASAEFTQVWRRLLLGETVNFNGKHIHVRGAKLLFPPIQQPYPPLYFGGSS DVAQELAAEQVDLYLTWGEPPELVKEKIEQVRAKAAAHGRKIRFGIRLHVIVRETNDEAW QAAERLISHLDDETIAKAQAAFARTDSVGQQRMAALHNGKRDNLEISPNLWAGVGLVRGG AGTALVGDGPTVAARINEYAALGIDSFVLSGYPHLEEAYRVGELLFPHLDVAIPEIPQPQ PLNPQGEAVANDFIPRKVAQS >gi|296493408|gb|ADTK01000093.1| GENE 24 24197 - 24988 831 263 aa, chain + ## HITS:1 COG:ssuC KEGG:ns 
NR:ns ## COG: ssuC COG0600 # Protein_GI_number: 16128901 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Escherichia coli K12 # 1 263 16 278 278 422 100.0 1e-118 MATPVKKWLLRVAPWFLPVGIVAVWQLASSVGWLSTRILPSPEGVVTAFWTLSASGELWQ HLAISSWRALIGFSIGGSLGLILGLISGLSRWGERLLDTSIQMLRNVPHLALIPLVILWF GIDESAKIFLVALGTLFPIYINTWHGIRNIDRGLVEMARSYGLSGIPLFIHVILPGALPS IMVGVRFALGLMWLTLIVAETISANSGIGYLAMNAREFLQTDVVVVAIILYALLGKLADV SAQLLERLWLRWNPAYHLKEATV >gi|296493408|gb|ADTK01000093.1| GENE 25 24985 - 25752 216 255 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 12 211 5 214 305 87 33 1e-16 MNTARLNQGTPLLLNAVSKHYAENIVLNQLDLHIPAGQFVAVVGRSGGGKSTLLRLLAGL ETPTAGDVLAGTTPLAEIQEDTRMMFQDARLLPWKSVIDNVGLGLKGQWRDAARRALAAV GLENRAGEWPAALSGGQKQRVALARALIHRPGLLLLDEPLGALDALTRLEMQDLIVSLWQ QHGFTVLLVTHDVSEAVAMADRVLLIEEGKIGLDLTVDIPRPRRLGSVRLAELEAEVLQR VMRRGHSEQLIRRHG >gi|296493408|gb|ADTK01000093.1| GENE 26 25959 - 28571 3048 870 aa, chain - ## HITS:1 COG:ECs1015 KEGG:ns NR:ns ## COG: ECs1015 COG0308 # Protein_GI_number: 15830269 # Func_class: E Amino acid transport and metabolism # Function: Aminopeptidase N # Organism: Escherichia coli O157:H7 # 1 870 1 870 870 1749 99.0 0 MTQQPQAKYRHDYRAPDYQITDIDLTFDLDAQKTVVTAVSQAVRHGASDAPLRLNGEDLK LVSVHINDEPWTAWKEEEGALVISNLPERFTLKIINEISPAANTALEGLYQSGDALCTQC EAEGFRHITYYLDRPDVLARFTTKIIADKTKYPFLLSNGNRVAQGELENGRHWVQWQDPF PKPCYLFALVAGDFDVLRDTFTTRSGREVALELYVDRGNLDRAPWAMTSLKNSMKWDEER FGLEYDLDIYMIVAVDFFNMGAMENKGLNIFNSKYVLARTDTATDKDYLDIERVIGHEYF HNWTGNRVTCRDWFQLSLKEGLTVFRDQEFSSDLGSRAVNRINNVRTMRGLQFAEDASPM AHPIRPDMVIEMNNFYTLTVYEKGAEVIRMIHTLLGEENFQKGMQLYFERHDGSAATCDD FVQAMEDASNVDLSHFRRWYSQSGTPIVTVKDDYNPETEQCTLTISQRTPATPDQAEKQP LHIPFAIELYDNEGKVIPLQKGGHPVNSVLNVTQAEQTFVFDNVYFQPVPALLCEFSAPV KLEYKWSDQQLTFLMRHARNDFSRWDAAQSLLATYIKLNVARHQQGQPLSLPVHVADAFR 
AVLLDEKIDPALAAEILTLPSVNEMAELFDIIDPIAIAEVREALTRTLATELADELLAIY NANYQSEYRVEHEDIAKRTLRNACLRFLAFGETHLADVLVSKQYHEANNMTDALAALSAA VAAQLPCRDALMQEYDDKWHQDGLVMDKWFILQATSPAANVLETVRGLLQHRSFTMSNPN RIRSLIGAFAGSNPAAFHAEDGSGYQFLVEMLTDLNSRNPQVASRLIEPLIRLKRYDAKR QEKMRAALEQLKGLENLSGDLYEKITKALA >gi|296493408|gb|ADTK01000093.1| GENE 27 28789 - 30039 1153 416 aa, chain + ## HITS:1 COG:pncB KEGG:ns NR:ns ## COG: pncB COG1488 # Protein_GI_number: 16128898 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid phosphoribosyltransferase # Organism: Escherichia coli K12 # 17 416 1 400 400 821 99.0 0 MFIVLNAPVRGRYCAPMTQFASPVLHSLLDTDAYKLHMQQAVFHHYYDVHVAAEFRCRGD DLLGIYADAIREQIQAMQHLRLQDDEYQWLSALPFFKADYLNWLREFRFNPEQVTVSNDN GKLDIRLSGPWREVILWEVPLLAVISEMVHRYRSPQADVAQALDTLESKLVDFSALTAGL DMSRFHLMDFGTRRRFSREVQETIVKRLQQESWFVGTSNYDLARRLSLTPMGTQAHEWFQ AHQQISPDLANSQRAALAAWLEEYPDQLGIALTDCITMDAFLRDFGVEFASRYQGLRHDS GDPVEWGEKAIAHYKKLGIDPQSKTLVFSDNLDLRKAVELYRHFSSRVQLSFGIGTRLTC DIPQVKPLNIVIKLVECNGKPVAKLSDSPGKTICHDKAFVRALRKAFDLPHIKKAS >gi|296493408|gb|ADTK01000093.1| GENE 28 30045 - 30314 81 89 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPAFESRRVRTHPRTVTSLSTATRPWSTSATGTTLIIFSLLIVGKNKHLSTRKWGDTYVT WHLQSDKQNSQMQRKISELKVKKGSRLAP >gi|296493408|gb|ADTK01000093.1| GENE 29 30208 - 31608 1682 466 aa, chain + ## HITS:1 COG:asnS KEGG:ns NR:ns ## COG: asnS COG0017 # Protein_GI_number: 16128897 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Escherichia coli K12 # 1 466 1 466 466 959 100.0 0 MSVVPVADVLQGRVAVDSEVTVRGWVRTRRDSKAGISFLAVYDGSCFDPVQAVINNSLPN YNEDVLRLTTGCSVIVTGKVVASPGQGQQFEIQASKVEVAGWVEDPDTYPMAAKRHSIEY LREVAHLRPRTNLIGAVARVRHTLAQALHRFFNEQGFFWVSTPLITASDTEGAGEMFRVS TLDLENLPRNDQGKVDFDKDFFGKESFLTVSGQLNGETYACALSKIYTFGPTFRAENSNT SRHLAEFWMLEPEVAFANLNDIAGLAEAMLKYVFKAVLEERADDMKFFAERVDKDAVSRL ERFIEADFAQVDYTDAVTILENCGRKFENPVYWGVDLSSEHERYLAEEHFKAPVVVKNYP KDIKAFYMRLNEDGKTVAAMDVLAPGIGEIIGGSQREERLDVLDERMLEMGLNKEDYWWY 
RDLRRYGTVPHSGFGLGFERLIAYVTGVQNVRDVIPFPRTPRNASF >gi|296493408|gb|ADTK01000093.1| GENE 30 32210 - 32890 813 226 aa, chain + ## HITS:1 COG:ECs1012 KEGG:ns NR:ns ## COG: ECs1012 COG3203 # Protein_GI_number: 15830266 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein (porin) # Organism: Escherichia coli O157:H7 # 1 225 1 225 362 387 100.0 1e-107 MMKRNILAVIVPALLVAGTANAAEIYNKDGNKVDLYGKAVGLHYFSKGNGENSYGGNGDM TYARLGFKGETQINSDLTGYGQWEYNFQGNNSEGADAQTGNKTRLAFAGLKYADVGSFDY GRNYGVVYDALGYTDMLPEFGGDTAYSDDFFVGRVGGVATYRNSNFFGLVDGLNFAVQYL GKNERDTARRSNGDGVGGSISYEYEGFGIVGAYGAADRTNLQEAQR >gi|296493408|gb|ADTK01000093.1| GENE 31 32847 - 33287 438 146 aa, chain + ## HITS:1 COG:ompF KEGG:ns NR:ns ## COG: ompF COG3203 # Protein_GI_number: 16128896 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein (porin) # Organism: Escherichia coli K12 # 13 146 229 362 362 248 100.0 3e-66 MVQLTVPTCKKLNGKKAEQWATGLKYDANNIYLAANYGETRNATPITNKFTNTSGFANKT QDVLLVAQYQFDFGLRPSIAYTKSKAKDVEGIGDVDLVNYFEVGATYYFNKNMSTYVDYI INQIDSDNKLGVGSDDTVAVGIVYQF >gi|296493408|gb|ADTK01000093.1| GENE 32 33472 - 34662 1512 396 aa, chain + ## HITS:1 COG:ECs1011 KEGG:ns NR:ns ## COG: ECs1011 COG1448 # Protein_GI_number: 15830265 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Escherichia coli O157:H7 # 1 396 1 396 396 810 99.0 0 MFENITAAPADPILGLADLFRADERPGKINLGIGVYKDETGKTPVLTSVKKAEQYLLENE TTKNYLGIDGIPEFGRCTQELLFGKGSALINDKRARTAQTPGGTGALRVAADFLAKNTSV KRVWVSNPSWPNHKSVFNSASLEVREYAYYDAENHTLDFDALINSLNEAQAGDVVLFHGC CHNPTGIDPTLEQWQTLAQLSVEKGWLPLFDFAYQGFARGLEEDAEGLRAFAAMHKELIV ASSYSKNFGLYNERVGACTLVAADSETVDRAFSQMKAAIRANYSNPPAHGASVVATILSN DALRAIWEQELTDMRQRIQRMRQLFVNTLQEKGANRDFSFIIKQNGMFSFSGLTKEQVLR LREEFGVYAVASGRVNVAGMTPDNMAPLCEAIVAVL >gi|296493408|gb|ADTK01000093.1| GENE 33 34884 - 35531 497 215 aa, chain - ## HITS:1 COG:ycbL KEGG:ns NR:ns ## COG: ycbL COG0491 # Protein_GI_number: 16128894 # 
Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Escherichia coli K12 # 1 215 1 215 215 446 99.0 1e-125 MNYRIIPVTAFSQNCSLIWCEQTRLAALVDPGGDAEKIKQEVDDSGLTLMQILLTHGHLD HVGAAAELAQHYGVPVFGPEKEDEFWLQGLPAQSRMFGLEECQPLTPDRWLNEGDTISIG NVTLQVLHCPGHTPGHVVFFDDRAKLLISGDVIFKGGVGRSDFPRGDHNQLISSIKDKLL PLGDDVTFIPGHGPLSTLGYERLHNPFLQDEMPVW >gi|296493408|gb|ADTK01000093.1| GENE 34 35558 - 36106 376 182 aa, chain - ## HITS:1 COG:ECs1009 KEGG:ns NR:ns ## COG: ECs1009 COG3108 # Protein_GI_number: 15830263 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 182 1 182 182 361 100.0 1e-100 MDKFDANRRKLLALGGVALGAAILPTPAFATLSTPRPRILTLNNLHTGESIKAEFFDGRG YIQEELAKLNHFFRDYRANKIKSIDPGLFDQLYRLQGLLGTRKPVQLISGYRSIDTNNEL RARSRGVAKKSYHTKGQAMDFHIEGIALSNIRKAALSMRAGGVGYYPRSNFVHIDTGPAR HW >gi|296493408|gb|ADTK01000093.1| GENE 35 36287 - 38134 1394 615 aa, chain - ## HITS:1 COG:ycbB KEGG:ns NR:ns ## COG: ycbB COG2989 # Protein_GI_number: 16128892 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 615 1 615 615 1189 99.0 0 MLLNMMCGRRLSAISLCLAVTFAPLFNAQADEPEVIPGDSPVAVSEQGEALPQAQATAIM AGIQPLPEGAAEKARTQIESQLPAGYKPVYLNQLQLLYAARDMQPMWENRDAVKAFQQQL AEVAIAGFQPQFNKWVELLTDPGVNGMARDVVLSDAMMGYLHFIANIPVKGTRWLYSSKP YALATPPLSVINQWQLALDKGQLPTFVAGLAPQHPQYAAMHESLLALLSDTKPWPQLTGK ATLRPGQWSNDVPALREILQRTGMLDGGPKITLPGDDTPTDAVVSPSAVTVETAETKPMD KQTTSRSKPAPAVRAAYDNELVEAVKRFQAWQGLGADGAIGPATRDWLNVTPAQRAGVLA LNIQRLRLLPTELSTGIMVNIPAYSLVYYQNGNQVLDSRVIVGRPDRKTPMMSSALNNVV VNPPWNVPPTLARKDILPKVRNDPGYLESHGYTVMRGWNSREAIDPWQVDWSTITASNLP FRFQQAPGPRNSLGRYKFNMPSSEAIYLHDTPNHNLFKRDTRALSSGCVRVNKASDLANM LLQDAGWNDKRISDALKQGDTRYVNIRQSIPVNLYYLTAFVGADGRTQYRTDIYNYDLPA RSSSQIVSKAEQLIR >gi|296493408|gb|ADTK01000093.1| GENE 36 38395 - 42855 5594 1486 aa, chain - ## HITS:1 COG:ECs1007 KEGG:ns NR:ns ## COG: ECs1007 COG3096 # 
Protein_GI_number: 15830261 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Uncharacterized protein involved in chromosome partitioning # Organism: Escherichia coli O157:H7 # 1 1486 1 1486 1486 2519 100.0 0 MIERGKFRSLTLINWNGFFARTFDLDELVTTLSGGNGAGKSTTMAAFVTALIPDLTLLHF RNTTEAGATSGSRDKGLHGKLKAGVCYSMLDTINSRHQRVVVGVRLQQVAGRDRKVDIKP FAIQGLPMSVQPTQLVTETLNERQARVLPLNELKDKLEAMEGVQFKQFNSITDYHSLMFD LGIIARRLRSASDRSKFYRLIEASLYGGISSAITRSLRDYLLPENSGVRKAFQDMEAALR ENRMTLEAIRVTQSDRDLFKHLISEATNYVAADYMRHANERRVHLDKALEFRRELHTSRQ QLAAEQYKHVDMARELAEHNGAEGDLEADYQAASDHLNLVQTALRQQEKIERYEADLDEL QIRLEEQNEVVAEAIERQEENEARAEAAELEVDELKSQLADYQQALDVQQTRAIQYNQAI AALNRAKELCHLPDLTADSAAEWLETFQAKELEATEKMLSLEQKMSMAQTAHSQFEQAYQ LVVAINGPLARNEAWDVARELLREGVDQRHLAEQVQPLRMRLSELEQRLREQQEAERLLA DFCKRQGKNFDIDELEALHQELEARIASLSDSVSNAREERMALRQEQEQLQSRIQSLMQR APVWLAAQNSLNQLSEQCGEEFTSSQDVTEYLQQLLEREREAIVERDEVGARKNAVDEEI ERLSQPGGSEDQRLNALAERFGGVLLSEIYDDVSLEDAPYFSALYGPSRHAIVVPDLSQV TEHLEGLTDCPEDLYLIEGDPQSFDDSVFSVDELEKAVVVKIADRQWRYSRFPEVPLFGR AARESRIESLHAEREVLSERFATLSFDVQKTQRLHQAFSRFIGSHLAVAFESDPEAEIRQ LNSRRVELERALSNHENDNQQQRIQFEQAKEGVTALNRILPRLNLLADDSLADRVDEIRE RLDEAQEAARFVQQFGNQLAKLEPIVSVLQSDPEQFEQLKEDYAYSQQMQRDARQQAFAL TEVVQRRAHFSYSDSAEMLSGNSDLNEKLRERLEQAEAERTRAREALRGHAAQLSQYNQV LASLKSSYDTKKELLNDLQRELQDIGVRADSGAEERARIRRDELHAQLSNNRSRRNQLEK ALTFCEAEMDNLTRKLRKLERDYFEMREQVVTAKAGWCAVMRMVKDNGVERRLHRRELAY LSADDLRSMSDKALGALRLAVADNEHLRDVLRMSEDPKRPERKIQFFVAVYQHLRERIRQ DIIRTDDPVEAIEQMEIELSRLTEELTSREQKLAISSRSVANIIRKTIQREQNRIRMLNQ GLQNVSFGQVNSVRLNVNVRETHAMLLDVLSEQHEQHQDLFNSNRLTFSEALAKLYQRLN PQIDMGQRTPQTIGEELLDYRNYLEMEVEVNRGSDGWLRAESGALSTGEAIGTGMSILVM VVQSWEDESRRLRGKDISPCRLLFLDEAARLDARSIATLFELCERLQMQLIIAAPENISP EKGTTYKLVRKVFQNTEHVHVVGLRGFAPQLPETLPGSDEAPSQAS >gi|296493408|gb|ADTK01000093.1| GENE 37 42855 - 43559 775 234 aa, chain - ## HITS:1 COG:ECs1006 KEGG:ns NR:ns ## COG: ECs1006 COG3095 # Protein_GI_number: 15830260 # Func_class: D Cell cycle control, cell division, chromosome partitioning # 
Function: Uncharacterized protein involved in chromosome partitioning # Organism: Escherichia coli O157:H7 # 1 234 1 234 234 444 99.0 1e-124 MSSTNIEQVMPVKLAQALANPLFPALDSALRSGRHIGLDELDNHAFLMDFQEYLEEFYAR YNVELIRAPEGFFYLRPRSTTLIPRSVLSELDMMVGKILCYLYLSPERLANEGIFTQQEL YDELLTLADEAKLLKLVNNRSTGSDVDRQKLQEKVRSSLNRLRRLGMVWFMGHDSSKFRI TESVFRFGADVRAGDDPREAQRRLIRDGEAMPIENHLQLNDETEENQPDSGEEE >gi|296493408|gb|ADTK01000093.1| GENE 38 43540 - 44862 1594 440 aa, chain - ## HITS:1 COG:mukF KEGG:ns NR:ns ## COG: mukF COG3006 # Protein_GI_number: 16128889 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Uncharacterized protein involved in chromosome partitioning # Organism: Escherichia coli K12 # 1 440 1 440 440 832 100.0 0 MSEFSQTVPELVAWARKNDFSISLPVDRLSFLLAVATLNGERLDGEMSEGELVDAFRHVS DAFEQTSETIGVRANNAINDMVRQRLLNRFTSEQAEGNAIYRLTPLGIGITDYYIRQREF STLRLSMQLSIVAGELKRAADAAEEGGDEFHWHRNVYAPLKYSVAEIFDSIDLTQRLMDE QQQQVKDDIAQLLNKDWRAAISSCELLLSETSGTLRELQDTLEAAGDKLQANLLRIQDAT MTHDDLHFVDRLVFDLQSKLDRIISWGQQSIDLWIGYDRHVHKFIRTAIDMDKNRVFAQR LRQSVQTYFDEPWALTYANADRLLDMRDEEMALRDEEVTGELPEDLEYEEFNEIREQLAA IIEEQLAVYKTRQVPLDLGLVVREYLSQYPRARHFDVARIVIDQAVRLGVAQADFTGLPA KWQPINDYGAKVQAHVIDKY >gi|296493408|gb|ADTK01000093.1| GENE 39 44859 - 45644 694 261 aa, chain - ## HITS:1 COG:smtA KEGG:ns NR:ns ## COG: smtA COG0500 # Protein_GI_number: 16128888 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli K12 # 1 261 1 261 261 536 99.0 1e-152 MQDRNFDDIAEKFSRNIYGTTKGQLRQAILWQDLDHVLAEMGPQKLRVLDAGGGEGQTAI KMAERGHQVILCDLSAQMIDRAKQAAEAKGVSDNMQFIHCAAQDVASHLETPVDLILFHA VLEWVADPRSVLQTLWSVLRPGGVLSLMFYNAHGLLMHNMVAGNFDYVQAGMPKKKKRTL SPDYPRDPAQVYLWLEEAGWQIMGKTGVRVFHDYLREKHQQRDCYEALLELETRYCRQEP YITLGRYIHVTARKPQSKDKV >gi|296493408|gb|ADTK01000093.1| GENE 40 45780 - 46559 621 259 aa, chain + ## HITS:1 COG:ECs1003 KEGG:ns NR:ns ## COG: ECs1003 COG1434 # 
Protein_GI_number: 15830257 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 259 1 259 259 449 99.0 1e-126 MLFTLKKVIGNMLLPLPLMLLIIGAGLALLWFSRFQKTGKIFISIGWLALLLLSLQPVAD RLLRPIESTYPTWNNSQKVDYIVVLGGGYTWNPQWAPSSNLINNSLPRLNEGIRLWRENP GSKLIFTGGVAKTNTVSTAEVGARVAQSLGVPREQIITLDLPKDTEEEAAAVKQAIGDAP FLLVTSASHLPRAMIFFQQEGLNPLPAPANQLAIDSPLNPWERAIPSPVWLMHSDRVGYE TLGRIWQWLKGPSGEPRQE >gi|296493408|gb|ADTK01000093.1| GENE 41 46536 - 47429 693 297 aa, chain - ## HITS:1 COG:no KEGG:JW0902 NR:ns ## KEGG: JW0902 # Name: ycbJ # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 297 1 297 297 588 100.0 1e-166 MEQLRAELSHLLGEKLSRIECVNEKADTALWALYDSQGNPMPLMARSFSTPGKARQLAWK TTMLARSGTVRMPTIYGVMTHEEHPGPDVLLLERMRGVSVEAPARTPERWEQLKDQIVEA LLAWHRQDSRGCVGAVDNTQENFWPSWYRQHVEVLWTTLNQFNNTGLTMQDKRILFRTRE CLPALFEGFNDNCVLIHGNFCLRSMLKDSRSDQLLAMVGPGLMLWAPREYELFRLMDNSL AEDLLWSYLQRAPVAESFIWRRWLYVLWDEVAQLVNTGRFSRRNFDLASKSLLPWLA >gi|296493408|gb|ADTK01000093.1| GENE 42 47583 - 48329 771 248 aa, chain - ## HITS:1 COG:kdsB KEGG:ns NR:ns ## COG: kdsB COG1212 # Protein_GI_number: 16128885 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: CMP-2-keto-3-deoxyoctulosonic acid synthetase # Organism: Escherichia coli K12 # 1 248 1 248 248 479 98.0 1e-135 MSFVVIIPARYASTRLPGKPLVDINGKPMIVHVLERARESGAERIIVATDHEDVARAVEA AGGEVCMTRADHQSGTERLAEVVEKCAFSDDTVIVNVQGDEPMIPATIIRQVADNLAQRQ VGMATLAVPIRNAEEAFNPNAVKVVLDAEGYALYFSRATIPWDRDRFAEGLETVGDNFLR HLGIYGYRAGFIRRYVNWQASPLEHIEMLEQLRVLWYGEKIHVAVAHEVPGTGVDTPEDL ERVRAEMR >gi|296493408|gb|ADTK01000093.1| GENE 43 48326 - 48508 145 60 aa, chain - ## HITS:1 COG:ECs1000 KEGG:ns NR:ns ## COG: ECs1000 COG2835 # Protein_GI_number: 15830254 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 60 1 60 60 119 100.0 2e-27 MDHRLLEIIACPVCNGKLWYNQEKQELICKLDNLAFPLRDGIPVLLETEARVLTADESKS >gi|296493408|gb|ADTK01000093.1| 
GENE 44 48560 - 49792 830 410 aa, chain - ## HITS:1 COG:ycaQ KEGG:ns NR:ns ## COG: ycaQ COG3214 # Protein_GI_number: 16128883 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 410 1 410 410 808 99.0 0 MSLPHLSLADARNLHLAAQGLLNKPRRRASLEDIPATISRMSLLQIDTINIVARSPYLVL FSRLGNYPAQWLDESLARGELMEYWAHEACFMPRSDFRLIRHRMLAPEKMGWKYKDAWMQ EHEAEIAQLIQHIHDKGPVRSADFEHPRKGASGWWEWKPHKRHLEGLFTAGKVMVIERRN FQRVYDLTHRVMPDWDDERDLVSQTEAEIIMLDNSARSLGIFREQWLADYYRLKRPALAA WREARAEQQQIIAVHVEKLGNLWLHADLLPLLERALAGKLTATHSAVLSPFDPVVWDRKR AEQLFDFSYRLECYTPAPKRQYGYFVLPLLHRGQLVGRMDAKMHRQTGILEVISLWLQEG IKPTTMLQKGLRQAITDFASWQQATRVTLGRCPQGLFTDCRTGWEIDPVA >gi|296493408|gb|ADTK01000093.1| GENE 45 49829 - 50815 662 328 aa, chain - ## HITS:1 COG:lpxK KEGG:ns NR:ns ## COG: lpxK COG1663 # Protein_GI_number: 16128882 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Tetraacyldisaccharide-1-P 4'-kinase # Organism: Escherichia coli K12 # 1 328 1 328 328 646 99.0 0 MIEKIWSGESPLWRLLLPLSWLYGLVSGAIRLCYKLKLKRAWRAPVPVVVVGNLTAGGNG KTPVVVWLVEQLQQRGIRVGVVSRGYGGKAESYPLLLSADTTTAQAGDEPVLIYQRTDAP VAVSPVRSDAVKAILAQHPDVQIIVTDDGLQHYRLARDVEIVVIDGVRRFGNGWWLPAGP MRERAGRLKSVDAVIVNGGVPRSGEIPMHLLPGQAVNLRTGTRCDVAQLEHVVAMAGIGH PPRFFATLKMCGVQPEKCVPLADHQSLNHADVSALVSAGQTLVMTEKDAVKCRAFAEENW WYLPVDAQLSGDEPAKLLTQLTSLASGN >gi|296493408|gb|ADTK01000093.1| GENE 46 50812 - 52560 242 582 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 359 568 17 229 245 97 32 1e-19 MHNDKDLSTWQTFRRLWPTIAPFKAGLIVAGVALILNAASDTFMLSLLKPLLDDGFGKTD RSVLVWMPLVVIGLMILRGITSYVSSYCISWVSGKVVMTMRRRLFGHMMGMPVSFFDKQS TGTLLSRITYDSEQVASSSSGALITVVREGASIIGLFIMMFYYSWQLSIILIVLAPIVSI AIRVVSKRFRNISKNMQNTMGQVTTSAEQMLKGHKEVLIFGGQEVETKRFDKVSNRMRLQ GMKMVSASSISDPIIQLIASLALAFVLYAASFPSVMDSLTAGTITVVFSSMIALMRPLKS LTNVNAQFQRGMAACQTLFTILDSEQEKDEGKRVIERATGDVEFRNVTFTYPGRDVPALR NINLKIPAGKTVALVGRSGSGKSTIASLITRFYDIDEGEILMDGHDLREYTLASLRNQVA 
LVSQNVHLFNDTVANNIAYARTEQYSREQIEEAARMAYAMDFINKMDNGLDTVIGENGVL LSGGQRQRIAIARALLRDSPILILDEATSALDTESERAIQAALDELQKNRTSLVIAHRLS TIEKADEIVVVEDGVIVERGTHNDLLEHRGVYAQLHKMQFGQ >gi|296493408|gb|ADTK01000093.1| GENE 47 52597 - 54861 754 754 aa, chain - ## HITS:1 COG:ZycaI_1 KEGG:ns NR:ns ## COG: ZycaI_1 COG0658 # Protein_GI_number: 15800774 # Func_class: R General function prediction only # Function: Predicted membrane metal-binding protein # Organism: Escherichia coli O157:H7 EDL933 # 1 428 27 454 454 774 98.0 0 MKITTVGVCIICGIFPLLILPQLPGTLTLAFLTLFACVLAFIPVKTVRYIALTLLFFVWG ILAAKQILWAGETLTGATQDAIVEITATDGMTTHYGQITHLQGRRIFPAPSLVLYGEYLP QAVCAGQLWSMKLKVRAVHGQLNDGGFDSQRYAIAQHQPLTGRFLQASVIEPNCSLRAQY LASLQTTLQPYPWNAVILGLGMGERLSVPKEIKNIMRDTGTAHLMAISGLHIAFAALLAA GLIRSGQIFLPGRWIHWQMPLIGGICCAAFYAWLTGMQPPALRTVVALATWGMLKLSGRQ WSGWDVWICCLAAILLMDPVAILSQSLWLSAAAVAALIFWYQWFPCPEWQLPPVLRAVVS LIHLQLGITLLLMPVQIVIFHGISLTSFIANLLAIPLVTFITVPLILAAMVVHLSGPLIL EQGLWFLADRSLALLFWGLKSLPEGWINIAECWQWLSFSPWFLLVVWRLNAWRTLPAMCV AGGLLMCWPLWQKPRPDEWQLYMLDVGQGLAMVIARNGKAILYDTGLAWPEGDSGQQLII PWLHWHNLEPEGVILSHEHLDHRGGLDSILHTWPMLWIRSPLNWEHHQPCVRGEAWQWQG LRFSVHWPLQASNDKGNNHSCVVKVDDGTNSILLTGDIEVPAEQKMLSRYWQQVQTTLLQ VPHHGSNTSSSLPLIQRVNGKVALASASRYNAWRLPSNKVKHRYQQQGYQWLDTPHQGQV TVNFSAQGWRISSLREQILPRWYHQWFGVPVDNG >gi|296493408|gb|ADTK01000093.1| GENE 48 55068 - 55352 167 94 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148826039|ref|YP_001290792.1| 50S ribosomal protein L35 [Haemophilus influenzae PittEE] # 1 89 4 91 96 68 38 7e-11 MTKSELIERLATQQSHIPAKTVEDAVKEMLEHMASTLAQGERIEIRGFGSFSLHYRAPRT GRNPKTGDKVELEGKYVPHFKPGKELRDRANIYG >gi|296493408|gb|ADTK01000093.1| GENE 49 55472 - 55570 150 32 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238903721|ref|ZP_04649193.1| ribosomal protein S1 [Escherichia coli BL21(DE3)] # 4 32 542 570 570 62 100 0.0 MQTSPTTQWLKLSKQLKASNSLTLRDFYSEVC >gi|296493408|gb|ADTK01000093.1| GENE 50 55512 - 57185 2807 557 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15800772|ref|NP_286786.1| 30S ribosomal protein S1 
[Escherichia coli O157:H7 EDL933] # 1 557 1 557 557 1085 100 0.0 MTESFAQLFEESLKEIETRPGSIVRGVVVAIDKDVVLVDAGLKSESAIPAEQFKNAQGEL EIQVGDEVDVALDAVEDGFGETLLSREKAKRHEAWITLEKAYEDAETVTGVINGKVKGGF TVELNGIRAFLPGSLVDVRPVRDTLHLEGKELEFKVIKLDQKRNNVVVSRRAVIESENSA ERDQLLENLQEGMEVKGIVKNLTDYGAFVDLGGVDGLLHITDMAWKRVKHPSEIVNVGDE ITVKVLKFDRERTRVSLGLKQLGEDPWVAIAKRYPEGTKLTGRVTNLTDYGCFVEIEEGV EGLVHVSEMDWTNKNIHPSKVVNVGDVVEVMVLDIDEERRRISLGLKQCKANPWQQFAET HNKGDRVEGKIKSITDFGIFIGLDGGIDGLVHLSDISWNVAGEEAVREYKKGDEIAAVVL QVDAERERISLGVKQLAEDPFNNWVALNKKGAIVTGKVTAVDAKGATVELADGVEGYLRA SEASRDRVEDATLVLSVGDEVEAKFTGVDRKNRAISLSVRAKDEADEKDAIATVNKQEDA NFSNNAMAEAFKAAKGE >gi|296493408|gb|ADTK01000093.1| GENE 51 57296 - 57979 267 227 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 [Treponema pallidum subsp. pallidum str. Nichols] # 7 223 36 286 863 107 30 2e-22 MTAIAPVITIDGPSGAGKGTLCKAMAEALQWHLLDSGAIYRVLALAALHHHVDVASEDAL VPLASHLDVRFVSTNGNLEVILEGEDVSGEIRTQEVANAASQVAAFPRVREALLRRQRAF RELPGLIADGRDMGTVVFPDAPVKIFLDASSEERAHRRMLQLQEKGFSVNFERLLAEIKE RDDRDRNRAVAPLVPAADALVLDSTTLSIEQVIEKALQYARQKLALA >gi|296493408|gb|ADTK01000093.1| GENE 52 58152 - 58916 814 254 aa, chain - ## HITS:1 COG:ECs0992 KEGG:ns NR:ns ## COG: ECs0992 COG0501 # Protein_GI_number: 15830246 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Escherichia coli O157:H7 # 1 254 9 262 262 442 99.0 1e-124 MKNTKLLLAIATSAALLTGCQNTHGIDTNMAISSGLNAYKAATLSDADAKAIANQGCAEM DSGNQVASKSSKYGKRLAKIAKALGNNINGTPVNYKVYMTSDVNAWAMANGCVRVYSGLM DMMNDNEIEGVLGHELGHVALGHSLAEMKASYAIVAARDAISATSGVASQLSRSQLGDIA EGAINAKYSRDKESEADDFSFDLLKKRGISTQGLVGSFEKLASLDGGRTQSMFDSHPPST ERAQHIRDRIASGK >gi|296493408|gb|ADTK01000093.1| GENE 53 59085 - 60368 1309 427 aa, chain - ## HITS:1 COG:ECs0991 KEGG:ns NR:ns ## COG: ECs0991 COG0128 # Protein_GI_number: 15830245 # Func_class: E Amino acid transport and metabolism # Function: 
5-enolpyruvylshikimate-3-phosphate synthase # Organism: Escherichia coli O157:H7 # 1 427 1 427 427 846 100.0 0 MESLTLQPIARVDGTINLPGSKSVSNRALLLAALAHGKTVLTNLLDSDDVRHMLNALTAL GVSYTLSADRTRCEIIGNGGPLHAEGALELFLGNAGTAMRPLAAALCLGSNDIVLTGEPR MKERPIGHLVDALRLGGAKITYLEQENYPPLRLQGGFTGGNVDVDGSVSSQFLTALLMTA PLAPEDTVIRIKGDLVSKPYIDITLNLMKTFGVEIENQHYQQFVVKGGQSYQSPGTYLVE GDASSASYFLAAAAIKGGTVKVTGIGRNSMQGDIRFADVLEKMGATICWGDDYISCTRGE LNAIDMDMNHIPDAAMTIATAALFAKGTTTLRNIYNWRVKETDRLFAMATELRKVGAEVE EGHDYIRITPPEKLNFAEIATYNDHRMAMCFSLVALSDTPVTILDPKCTAKTFPDYFEQL ARISQAA >gi|296493408|gb|ADTK01000093.1| GENE 54 60439 - 61527 1075 362 aa, chain - ## HITS:1 COG:ECs0990 KEGG:ns NR:ns ## COG: ECs0990 COG1932 # Protein_GI_number: 15830244 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoserine aminotransferase # Organism: Escherichia coli O157:H7 # 1 362 1 362 362 746 99.0 0 MAQIFNFSSGPAMLPAEVLKQAQQELRDWNGLGTSVMEVSHRGKEFIQVAEEAEKDFRDL LNVPSNYKVLFCHGGGRGQFAAVPLNILGDKTTADYVDAGYWAASAIKEAKKYCTPNVFD AKVTVDGLRAVKPMREWQLSDNAAYMHYCPNETIDGIAIDETPDFGKDVVVAADFSSTIL SRPIDVSRYGVIYAGAQKNIGPAGLTIVIVREDLLGKANIACPSILDYSILNDNDSMFNT PPTFAWYLSGLVFKWLKANGGVAEMDKINQQKAELLYGVIDNSDFYRNDVAKANRSRMNV PFQLADSALDKLFLEESFAAGLHALKGHRVVGGMRASIYNAMPLEGVKALTDFMVEFERR HG >gi|296493408|gb|ADTK01000093.1| GENE 55 61726 - 62418 673 230 aa, chain - ## HITS:1 COG:ycaP KEGG:ns NR:ns ## COG: ycaP COG2323 # Protein_GI_number: 16128873 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 230 1 230 230 463 100.0 1e-130 MKAFDLHRMAFDKVPFDFLGEVALRSLYTFVLVFLFLKMTGRRGVRQMSLFEVLIILTLG SAAGDVAFYDDVPMVPVLIVFITLALLYRLVMWLMAHSEKLEDLLEGKPVVIIEDGELAW SKLNNSNMTEFEFFMELRLRGVEQLGQVRLAILETNGQISVYFFEDDKVKPGLLILPSDC TQRYKVVPESADYACIRCSEIIHMKAGEKQLCPRCANPEWTKASRAKRVT >gi|296493408|gb|ADTK01000093.1| GENE 56 62548 - 64308 1920 586 aa, chain + ## HITS:1 COG:ycaO KEGG:ns NR:ns ## COG: ycaO COG1944 # Protein_GI_number: 16128872 # Func_class: S Function 
unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 586 4 589 589 1186 99.0 0 MTQTFIPGKDAALEDSIARFQQKLSDLGFQIEEASWLNPVPNVWSVHIRDKECALCFTNG KGATKKAALASALGEYFERLSTNYFFADFWLGETIANGPFVHYPNEKWFPLTENDDVPEG LLDDRLRAFYDPENELTGSMLIDLQSGNEDRGICGLPFTRQSDNQTVYIPMNIIGNLYVS NGMSAGNTRNEARVQGLSEVFERYVKNRIIAESISLPEIPADVLARYPAVVEAIETLEAE GFPIFAYDGSLGGQYPVICVVLFNPANGTCFASFGAHPDFGVALERTVTELLQGRGLKDL DVFTPPTFDDEEVAEHTNLETHFIDSSGLISWDLFKQDADYPFVDWNFSGTTEEEFATLM AIFKKEDKEVYIADYEHLGVYACRIIVPGMSDIYPAEDLWLANNSMGSHLRETILSLPGS EWEKEDYLNLIEQLDEEGFDDFTRVRELLGLATGSDNGWYTLRIGELKAMLALAGGDLEQ ALVWTEWTMEFNSSVFSPERANYYRCLQTLLLLAQEEDRQPLQYLNAFVRMYGADAVEAA SAAMSGEAAFYGLQPVDSDLHAFAAHQSLLKAYEKLQRAKAAFWAK >gi|296493408|gb|ADTK01000093.1| GENE 57 64714 - 65571 658 285 aa, chain + ## HITS:1 COG:ECs0987 KEGG:ns NR:ns ## COG: ECs0987 COG2116 # Protein_GI_number: 15830241 # Func_class: P Inorganic ion transport and metabolism # Function: Formate/nitrite family of transporters # Organism: Escherichia coli O157:H7 # 1 285 1 285 285 535 100.0 1e-152 MKADNPFDLLLPAAMAKVAEEAGVYKATKHPLKTFYLAITAGVFISIAFVFYITATTGTG TMPFGMAKLVGGICFSLGLILCVVCGADLFTSTVLIVVAKASGRITWGQLAKNWLNVYFG NLVGALLFVLLMWLSGEYMTANGQWGLNVLQTADHKVHHTFIEAVCLGILANLMVCLAVW MSYSGRSLMDKAFIMVLPVAMFVASGFEHSIANMFMIPMGIVIRDFASPEFWTAVGSAPE NFSHLTVMNFITDNLIPVTIGNIIGGGLLVGLTYWVIYLRENDHH >gi|296493408|gb|ADTK01000093.1| GENE 58 65626 - 67908 2425 760 aa, chain + ## HITS:1 COG:ECs0986 KEGG:ns NR:ns ## COG: ECs0986 COG1882 # Protein_GI_number: 15830240 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli O157:H7 # 1 760 1 760 760 1563 99.0 0 MSELNEKLATAWEGFTKGDWQNEVNVRDFIQKNYTPYEGDESFLAGATEATTTLWDKVME GVKLENRTHAPVDFDTAVASTITSHDAGYINKALEKVVGLQTEAPLKRALIPFGGIKMIE GSCKAYNRELDPMIKKIFTEYRKTHNQGVFDVYTPDILRCRKSGVLTGLPDAYGRGRIIG DYRRVALYGIDYLMKDKYAQFTSLQTDLENGVNLEQTIRLREEIAEQHRALGQMKEMAAK YGYDISGPATNAQEAIQWTYFGYLAAVKSQNGAAMSFGRTSTFLDVYIERDLKAGKITEQ 
EAQEMVDHLVMKLRMVRFLRTPEYDELFSGDPIWATESIGGMGLDGRTLVTKNSFRFLNT LYTMGPSPEPNMTILWSEKLPLNFKKFAAKVSIDTSSLQYENDDLMRPDFNNDDYAIACC VSPMIVGKQMQFFGARANLAKTMLYAINGGVDEKLKMQVGPKSEPIKGDLLNYDEVMERM DHFMDWLAKQYITALNIIHYMHDKYSYEASLMALHDRDVIRTMACGIAGLSVAADSLSAI KYAKVKPIRDEDGLAIDFEIEGEYPQFGNNDPRVDDLAVDLVERFMKKIQKLHTYRDAIP TQSVLTITSNVVYGKKTGNTPDGRRAGAPFGPGANPMHGRDQKGAVASLTSVAKLPFAYA KDGISYTFSIVPNALGKDDEVRKTNLAGLMDGYFHHEASIEGGQHLNVNVMNREMLLDAM ENPEKYPQLTIRVSGYAVRFNSLTKEQQQDVITRTFTQSM >gi|296493408|gb|ADTK01000093.1| GENE 59 68100 - 68840 684 246 aa, chain + ## HITS:1 COG:ECs0985 KEGG:ns NR:ns ## COG: ECs0985 COG1180 # Protein_GI_number: 15830239 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Escherichia coli O157:H7 # 1 246 1 246 246 519 100.0 1e-147 MSVIGRIHSFESCGTVDGPGIRFITFFQGCLMRCLYCHNRDTWDTHGGKEVTVEDLMKEV VTYRHFMNASGGGVTASGGEAILQAEFVRDWFRACKKEGIHTCLDTNGFVRRYDPVIDEL LEVTDLVMLDLKQMNDEIHQNLVGVSNHRTLEFAKYLANKNVKVWIRYVVVPGWSDDDDS AHRLGEFTRDMGNVEKIELLPYHELGKHKWVAMGEEYKLDGVKPPKKETMERVKGILEQY GHKVMF Prediction of potential genes in microbial genomes Time: Mon May 16 15:16:26 2011 Seq name: gi|296493407|gb|ADTK01000094.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont253.8, whole genome shotgun sequence Length of sequence - 24475 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 10, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 53 - 643 356 ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B) - Prom 668 - 727 3.2 + Prom 643 - 702 2.6 2 2 Tu 1 . + CDS 743 - 1651 472 ## COG0583 Transcriptional regulator - Term 1507 - 1538 -0.1 3 3 Tu 1 . - CDS 1652 - 3082 1375 ## COG0531 Amino acid transporters - Prom 3166 - 3225 9.1 - Term 3238 - 3281 8.5 4 4 Tu 1 . - CDS 3292 - 4440 1011 ## COG0477 Permeases of the major facilitator superfamily - Prom 4470 - 4529 3.8 + Prom 4489 - 4548 4.3 5 5 Tu 1 . 
+ CDS 4754 - 5380 567 ## COG1335 Amidases related to nicotinamidase + Term 5388 - 5417 1.1 - Term 5375 - 5405 2.1 6 6 Op 1 9/0.000 - CDS 5416 - 6279 880 ## COG3302 DMSO reductase anchor subunit 7 6 Op 2 16/0.000 - CDS 6281 - 6898 620 ## COG0437 Fe-S-cluster-containing hydrogenase components 1 8 6 Op 3 2/0.800 - CDS 6909 - 9353 2326 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing - Prom 9402 - 9461 4.6 - Term 9541 - 9566 -0.5 9 7 Op 1 8/0.000 - CDS 9592 - 10884 1550 ## COG0172 Seryl-tRNA synthetase - Term 10943 - 10974 -1.0 10 7 Op 2 8/0.000 - CDS 10975 - 12318 1107 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase 11 7 Op 3 10/0.000 - CDS 12329 - 12940 660 ## COG2834 Outer membrane lipoprotein-sorting protein 12 7 Op 4 7/0.400 - CDS 13095 - 17201 3889 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins - Prom 17253 - 17312 2.6 - Term 17236 - 17279 9.2 13 8 Tu 1 . - CDS 17336 - 17830 541 ## COG1522 Transcriptional regulators + Prom 18286 - 18345 5.8 14 9 Op 1 8/0.000 + CDS 18374 - 19339 733 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 + Term 19366 - 19410 8.0 + Prom 19367 - 19426 6.1 15 9 Op 2 14/0.000 + CDS 19462 - 21228 170 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 16 9 Op 3 5/0.400 + CDS 21229 - 22950 1637 ## COG4987 ABC-type transport system involved in cytochrome bd biosynthesis, fused ATPase and permease components 17 9 Op 4 5/0.400 + CDS 22992 - 23696 418 ## COG2360 Leu/Phe-tRNA-protein transferase + Term 23885 - 23921 -1.0 + Prom 23886 - 23945 3.7 18 10 Tu 1 . 
+ CDS 23981 - 24199 257 ## PROTEIN SUPPORTED gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 + Term 24236 - 24275 5.5 Predicted protein(s) >gi|296493407|gb|ADTK01000094.1| GENE 1 53 - 643 356 196 aa, chain - ## HITS:1 COG:ycaK KEGG:ns NR:ns ## COG: ycaK COG2249 # Protein_GI_number: 16128868 # Func_class: R General function prediction only # Function: Putative NADPH-quinone reductase (modulator of drug activity B) # Organism: Escherichia coli K12 # 1 196 1 196 196 401 99.0 1e-112 MQSERIYLVWAHPRHDSLTAHIADAIHQRAMERKIQVTELDLYRRNFNPVMTPEDEPDWK NMDKRYSPEVHQLYSELLEHDTLVVVFPLWWYSFPAMLKGYIDRVWNNGLAYGDGHKLPF NKVRCVALVGGDKESFVQMGWEKNISDYLKNMCSYLGIEDADVTFLCNTVVFDGEELHAS YYQSLLSQVRDMVDAL >gi|296493407|gb|ADTK01000094.1| GENE 2 743 - 1651 472 302 aa, chain + ## HITS:1 COG:ycaN KEGG:ns NR:ns ## COG: ycaN COG0583 # Protein_GI_number: 16128867 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 302 1 302 302 626 99.0 1e-179 MRMNMSDFATFFAVARNQSFRAAGDELGLSSSAISHSIKTLEQRLKIRLFNRTTRSVSLT EAGSNLYERLRPAFDEIQIMLDEMNDFRLTPTGTLKINAARVAARIFLMPLLVGFTREYP DIKVELTTDDSLVDIVQQGFDAGVRLSGIVEKDMISVAIGPPVKLCVAATPEYFARYGKP RHPHDLLNHQCVVFRYPSGKPFHWQFAKELEIAVAGNIILDDVDAELEAVLMGAGIGYLL YEQIKEYLDTGRLECVLEDWSTERPGFQIYYPNRQYMSCGLRAFLDYVKTGQICQSQRHR PQ >gi|296493407|gb|ADTK01000094.1| GENE 3 1652 - 3082 1375 476 aa, chain - ## HITS:1 COG:ycaM KEGG:ns NR:ns ## COG: ycaM COG0531 # Protein_GI_number: 16128866 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 476 65 540 540 887 99.0 0 MAGNVQEKQLRWYNIALMSFITVWGFGNVVNNYANQGLVVVFSWVFIFALYFTPYALIVG QLGSTFKDGRGGVSTWIKHTMGPGLAYLAAWTYWVVHIPYLAQKPQAILIALGWAMKGDG SLIKEYSVVALQGLTLVLFIFFMWVASRGMKSLKIVGSVAGIAMFVMSLLYVAMAVTAPA ITEVHIATTNITWETFIPHIDFTYITTISMLVFAVGGAEKISPYVNQTRNPGKEFPKGML CLAVMVAVCAILGSLAMGMMFDSRNIPDDLMTNGQYYAFQKLGEYYNMGNTLMVIYAIAN TLGQVAALVFSIDAPLKVLLGDADSKYIPASLCRTNASGTPVNGYFLTLVLVAILIMLPT 
LGIGDMNNLYKWLLNLNSVVMPLRYLWVFVAFIAVVRLAQKYKPEYVFIRNKPLAMTVGI WCFAFTAFACLTGIFPKMEAFTAEWTFQLALNVATPFVLVGLGLIFPLLARKANSK >gi|296493407|gb|ADTK01000094.1| GENE 4 3292 - 4440 1011 382 aa, chain - ## HITS:1 COG:ECs0983 KEGG:ns NR:ns ## COG: ECs0983 COG0477 # Protein_GI_number: 15830237 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 382 1 382 382 640 100.0 0 MSTYTRPVMLLLSGLLLLTLAIAVLNTLVPLWLAQEHMSTWQVGVVSSSYFTGNLVGTLL TGYVIKRIGFNRSYYLASFIFAAGCAGLGLMIGFWSWLAWRFVAGVGCAMIWVVVESALM CSGTSRNRGRLLAAYMMVYYVGTFLGQLLVSKVSTELMSVLPWVTGLTLAGILPLLFTRV LNQQAENHDSTSITSMLKLRQARLGVNGCIISGIVLGSLYGLMPLYLNHKGVSNASIGFW MAVLVSAGILGQWPIGRLADKFGRLLVLRVQVFVVILGSIAMLSQAAMAPALFILGAAGF TLYPVAMAWACEKVEHHQLVAMNQALLLSYTVGSLLGPSFTAMLMQNFSDNLLFIMIASV SFIYLLMLLRNAGHTPKPVAHV >gi|296493407|gb|ADTK01000094.1| GENE 5 4754 - 5380 567 208 aa, chain + ## HITS:1 COG:ycaC KEGG:ns NR:ns ## COG: ycaC COG1335 # Protein_GI_number: 16128864 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Escherichia coli K12 # 1 208 1 208 208 429 99.0 1e-120 MTKPYVRLDKNDAAVLLVDHQAGLLSLVRDIEPDKFKNNVLALGDLAKYFNLPTILTTSF ETGPNGPLVPELKAQFPDTPYIARPGNINAWDNEDFVKAVKATGKKQLIIAGVVTEVCVA FPALSAIEEGFDVFVVTDASGTFNEITRHSAWDRMSQAGAQLMTWFGVACELHRDWRNDI EGLATLFSNHIPDYRNLMTSYDTLTKQK >gi|296493407|gb|ADTK01000094.1| GENE 6 5416 - 6279 880 287 aa, chain - ## HITS:1 COG:dmsC KEGG:ns NR:ns ## COG: dmsC COG3302 # Protein_GI_number: 16128863 # Func_class: R General function prediction only # Function: DMSO reductase anchor subunit # Organism: Escherichia coli K12 # 1 287 1 287 287 437 100.0 1e-122 MGSGWHEWPLMIFTVFGQCVAGGFIVLALALLKGDLRAEAQQRVIACMFGLWVLMGIGFI ASMLHLGSPMRAFNSLNRVGASALSNEIASGSIFFAVGGIGWLLAMLKKLSPALRTLWLI 
VTMVLGVIFVWMMVRVYNSIDTVPTWYSIWTPMGFFLTMFMGGPLLGYLLLSLAGVDGWA MRLLPAISVLALVVSGVVSVMQGAELATIHSSVQQAAALVPDYGALMSWRIVLLAVALCL WIAPQLKGYQPAVPLLSVSFILLLAGELIGRGVFYGLHMTVGMAVAS >gi|296493407|gb|ADTK01000094.1| GENE 7 6281 - 6898 620 205 aa, chain - ## HITS:1 COG:dmsB KEGG:ns NR:ns ## COG: dmsB COG0437 # Protein_GI_number: 16128862 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli K12 # 1 205 1 205 205 414 100.0 1e-116 MTTQYGFFIDSSRCTGCKTCELACKDYKDLTPEVSFRRIYEYAGGDWQEDNGVWHQNVFA YYLSISCNHCEDPACTKVCPSGAMHKREDGFVVVDEDVCIGCRYCHMACPYGAPQYNETK GHMTKCDGCYDRVAEGKKPICVESCPLRALDFGPIDELRKKHGDLAAVAPLPRAHFTKPN IVIKPNANSRPTGDTTGYLANPKEV >gi|296493407|gb|ADTK01000094.1| GENE 8 6909 - 9353 2326 814 aa, chain - ## HITS:1 COG:dmsA KEGG:ns NR:ns ## COG: dmsA COG0243 # Protein_GI_number: 16128861 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 30 814 1 785 785 1626 99.0 0 MKTKIPDAVLAAEVSRRGLVKTTAIGGLAMASSALTLPFSRIAHAVDSAIPTKSDEKVIW SACTVNCGSRCPLRMHVVDGEIKYVETDNTGDDNYDGLHQVRACLRGRSMRRRVYNPDRL KYPMKRVGARGEGKFERISWEEAYDIIATNMQRLIKEYGNESIYLNYGTGTLGGTMTRSW PPGNTLVARLMNCCGGYLNHYGDYSSAQIAEGLNYTYGGWADGNSPSDIENSKLVVLFGN NPGETRMSGGGVTYYLEQARQKSNARMIIIDPRYTDTGAGREDEWIPIRPGTDAALVNGL AYVMITENLVDQAFLDKYCVGYDEKTLPASAPKNGHYKAYILGEGPDGVAKTPQWASQIT GIPAEKIIQLAREIGSTKPAFISQGWGPQRHANGEIATRAISMLAILTGNVGINGGNTGA REGSYSLPFVRMPTLENPIQTSISMFMWTDAIERGPEMTALRDGVRGKDKLDVPIKMIWN YAGNCLINQHSEINRTHEILQDDKKCELIVVIDCHMTSSAKYADILLPDCTASEQMDFAL DASCGNMSYVIFNDQVIKPRFECKTIYEMTSELAKRLGVEQQFTEGRTQEEWMRHLYAQS REAIPELPTFEEFRKQGIFKKRDPQGHHVAYKAFREDPQANPLTTPSGKIEIYSQALADI AATWELPEGDVIDPLPIYTPGFESYQDPLNKQYPLQLTGFHYKSRVHSTYGNVDVLKAAC RQEMWINPLDAQKRGINNGDKVRIFNDRGEVHIEAKVTPRMMPGVVALGEGAWYDPDAKR VDKGGCINVLTTQRPSPLAKGNPSHTNLVQVEKV >gi|296493407|gb|ADTK01000094.1| GENE 9 9592 - 10884 1550 430 aa, chain - ## HITS:1 COG:ECs0978 KEGG:ns NR:ns ## 
COG: ECs0978 COG0172 # Protein_GI_number: 15830232 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Seryl-tRNA synthetase # Organism: Escherichia coli O157:H7 # 1 430 1 430 430 865 100.0 0 MLDPNLLRNEPDAVAEKLARRGFKLDVDKLGALEERRKVLQVKTENLQAERNSRSKSIGQ AKARGEDIEPLRLEVNKLGEELDAAKAELDALQAEIRDIALTIPNLPADEVPVGKDENDN VEVSRWGTPREFDFEVRDHVTLGEMHSGLDFAAAVKLTGSRFVVMKGQIARMHRALSQFM LDLHTEQHGYSENYVPYLVNQDTLYGTGQLPKFAGDLFHTRPLEEEADTSNYALIPTAEV PLTNLVRGEIIDEDDLPIKMTAHTPCFRSEAGSYGRDTRGLIRMHQFDKVEMVQIVRPED SMAALEEMTGHAEKVLQLLGLPYRKIILCTGDMGFGACKTYDLEVWIPAQNTYREISSCS NVWDFQARRMQARCRSKSDKKTRLVHTLNGSGLAVGRTLVAVMENYQQADGRIEVPEVLR PYMNGLEYIG >gi|296493407|gb|ADTK01000094.1| GENE 10 10975 - 12318 1107 447 aa, chain - ## HITS:1 COG:ECs0977 KEGG:ns NR:ns ## COG: ECs0977 COG2256 # Protein_GI_number: 15830231 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Escherichia coli O157:H7 # 1 447 1 447 447 872 100.0 0 MSNLSLDFSDNTFQPLAARMRPENLAQYIGQQHLLAAGKPLPRAIEAGHLHSMILWGPPG TGKTTLAEVIARYANADVERISAVTSGVKEIREAIERARQNRNAGRRTILFVDEVHRFNK SQQDAFLPHIEDGTITFIGATTENPSFELNSALLSRARVYLLKSLSTEDIEQVLTQAMED KTRGYGGQDIVLPDETRRAIAELVNGDARRALNTLEMMADMAEVDDSGKRVLKPELLTEI AGERSARFDNKGDRFYDLISALHKSVRGSAPDAALYWYARIITAGGDPLYVARRCLAIAS EDVGNADPRAMQVAIAAWDCFTRVGPAEGERAIAQAIVYLACAPKSNAVYTAFKAALADA RERPDYDVPVHLRNAPTKLMKEMGYGQEYRYAHDEANAYAAGEVYFPPEIAQTRYYFPTN RGLEGKIGEKLAWLAEQDQNSPIKRYR >gi|296493407|gb|ADTK01000094.1| GENE 11 12329 - 12940 660 203 aa, chain - ## HITS:1 COG:ECs0976 KEGG:ns NR:ns ## COG: ECs0976 COG2834 # Protein_GI_number: 15830230 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane lipoprotein-sorting protein # Organism: Escherichia coli O157:H7 # 1 203 2 204 204 378 100.0 1e-105 MKKIAITCALLSSLVASSVWADAASDLKSRLDKVSSFHASFTQKVTDGSGAAVQEGQGDL WVKRPNLFNWHMTQPDESILVSDGKTLWFYNPFVEQATATWLKDATGNTPFMLIARNQSS 
DWQQYNIKQNGDDFVLTPKASNGNLKQFTINVGRDGTIHQFSAVEQDDQRSSYQLKSQQN GAVDAAKFTFTPPQGVTVDDQRK >gi|296493407|gb|ADTK01000094.1| GENE 12 13095 - 17201 3889 1368 aa, chain - ## HITS:1 COG:ECs0975 KEGG:ns NR:ns ## COG: ECs0975 COG1674 # Protein_GI_number: 15830229 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Escherichia coli O157:H7 # 1 1368 1 1342 1342 2083 97.0 0 MSQEYTEDKEVTLTKLSSGRRLLEALLILIVLFAVWLMAALLSFNPSDPSWSQTAWHEPI HNLGGMPGAWLADTLFFIFGVMAYTIPVIIVGGCWFAWRHQSSDEYIDYFAVSLRIIGVL ALILTSCGLAAINADDIWYFASGGVIGSLLSTTLQPLLHSSGGTIALLCVWAAGLTLFTG WSWVTIAEKLGGWILNILTFASNRTRRDDTWVDEDEYEDDEEYEDENHGKQHESRRARIL RGALARRKRLAEKFINPMGRQTDAALFSGKRMDDDEEITYTARGVAADPDDVLFSGNRAT QPEYDEYDPLLNGAPITEPVAVAAAATTATQSWAAPVEPVTQTPPVASVDVPPSQPTVAW QPVPGPQTGEPVIAPAPEGYPQQSQYAQPAVQYNEPLQQPVQPQQPYYAPAAEQPAQQPY YAPAAEQPVQQPYYATAPEQPAQQPYYAPAPEQPVAGNAWQAEEQQSTFAPQSTYQTEQT YQQPAAQEPLYQQPQSVEQQPVVEPEPVVEETKPARPPLYYFEEVEEKRAREREQLAAWY QPIPEPVKEPEPIKSSLKAPSVAAVPPVEAAAAVSPLASGVKKATLATGAAATVAAPVFS LANSGGPRPQVKEGIGPQLPRPKRIRVPTRRELASYGIKLPSQRAAEEKAREAQRNQYDS GDQYNDDEIDAMQQDELARQFAQTQQQRYGEQYQHDVPVNAEDADAAAEAELARQFAQTQ QQRYSGEQPAGANPFSLDDFEFSPMKALLDDGPHEPLFTPIVEPVQQPQQPVAPQQQYQQ PQQPVPPQQQYQQPQQPVAPQPQYQQPQQQVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQ PQYQQPQQPVAPQQQDTLLHPLLMRNGDSRPLHKPTTPLPSLDLLTPPPSEVEPVDTFAL EQMARLVEARLADFRIKADVVNYSPGPVITRFELNLAPGVKAARISNLSRDLARSLSTVA VRVVEVIPGKPYVGLELPNKKRQTVYLREVLDNAKFRDNPSPLTVVLGKDIAGEPVVADL AKMPHLLVAGTTGSGKSVGVNAMILSMLYKAQPEDVRFIMIDPKMLELSVYEGIPHLLTE VVTDMKDAANALRWCVNEMERRYKLMSALGVRNLAGYNEKIAEADRMMRPIPDPYWKPGD SMDAQHPVLKKEPYIVVLVDEFADLMMTVGKKVEELIARLAQKARAAGIHLVLATQRPSV DVITGLIKANIPTRIAFTVSSKIDSRTILDQAGAESLLGMGDMLYSGPNSTLPVRVHGAF VRDQEVHAVVQDWKARGRPQYVDGITSDSESEGGAGGFDGAEELDPLFDQAVQFVTEKRK ASISGVQRQFRIGYNRAARIIEQMEAQGIVSEQGHNGNREVLAPPPFD >gi|296493407|gb|ADTK01000094.1| GENE 13 17336 - 17830 541 164 aa, chain - ## HITS:1 COG:ECs0974 KEGG:ns NR:ns ## COG: ECs0974 COG1522 # 
Protein_GI_number: 15830228 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 164 1 164 164 312 100.0 2e-85 MVDSKKRPGKDLDRIDRNILNELQKDGRISNVELSKRVGLSPTPCLERVRRLERQGFIQG YTALLNPHYLDASLLVFVEITLNRGAPDVFEQFNTAVQKLEEIQECHLVSGDFDYLLKTR VPDMSAYRKLLGETLLRLPGVNDTRTYVVMEEVKQSNRLVIKTR >gi|296493407|gb|ADTK01000094.1| GENE 14 18374 - 19339 733 321 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 10 317 5 306 306 286 49 7e-77 MGTTKHSKLLILGSGPAGYTAAVYAARANLQPVLITGMEKGGQLTTTTEVENWPGDPNDL TGPLLMERMHEHATKFETEIIFDHINKVDLQNRPFRLNGDNGEYTCDALIIATGASARYL GLPSEEAFKGRGVSACATCDGFFYRNQKVAVIGGGNTAVEEALYLSNIASEVHLIHRRDG FRAEKILIKRLMDKVENGNIILHTNRTLEEVTGDQMGVTGVRLRDTQNSDNIESLDVAGL FVAIGHSPNTAIFEGQLELENGYIKVQSGIHGNATQTSIPGVFAAGDVMDHIYRQAITSA GTGCMAALDAERYLDGLADAK >gi|296493407|gb|ADTK01000094.1| GENE 15 19462 - 21228 170 588 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 369 570 150 351 398 70 29 1e-11 MNKSRQKELTRWLKQQSVISQRWLNISRLLGFVSGILIIAQAWFMARILQHMIMENIPRE ALLLPFTLLVLTFVLRAWVVWLRERVGYHAGQHIRFAIRRQVLDRLQQAGPAWIQGKPAG SWATLVLEQIDDMHDYYARYLPQMALAVSVPLLIVVAIFPSNWAAALILLGTAPLIPLFM ALVGMGAADANRRNFLALARLSGHFLDRLRGMETLRIFGRGEAEIESIRSASEDFRQRTM EVLRLAFLSSGILEFFTSLSIALVAVYFGFSYLGELDFGHYDTGVTLAAGFLALILAPEF FQPLRDLGTFYHAKAQAVGAADSLKTFMETPLAHPQRGEVELASTDPVTIEAEDLFITSP EGKTLAGPLNFTLPAGQRAVLVGRSGSGKSSLLNALSGFLSYQGSLRINGIELRDLSPES WRKHLSWVGQNPQLPAATLRDNVLLARPDASEQELQAALDNAWVSEFLPLLPQGVDTPVG DQAARLSVGQAQRVAVARALLNPCSLLLLDEPAASLDAHSEQRVMEALNAASLRQTTLMV THQLEDLADWDVIWVMQDGRIIEQGRYAELSVAGGPFATLLAHRQEEI >gi|296493407|gb|ADTK01000094.1| GENE 16 21229 - 22950 1637 573 aa, chain + ## HITS:1 COG:cydC KEGG:ns NR:ns ## COG: cydC COG4987 # Protein_GI_number: 16128853 # Func_class: C Energy production and conversion; O Posttranslational modification, protein turnover, 
chaperones # Function: ABC-type transport system involved in cytochrome bd biosynthesis, fused ATPase and permease components # Organism: Escherichia coli K12 # 1 573 1 573 573 1045 99.0 0 MRALLPYLALYKRHKWMLSLGIVLAIVTLLASIGLLTLSGWFLSASAVAGVAGLYSFNYM LPAAGVRGAAITRTAGRYFERLVSHDATFRVLQHLRIYTFSKLLPLSPAGLARYRQGELL NRVVADVDTLDHLYLRVISPLVGAFVVIMVVTIGLSFLDFTLAFTLGGIMLLTLFLMPPL FYRAGKSTGQNLTHLRGQYRQQLTAWLQGQAELTIFGASDRYRTQLENTEIQWLEAQRRQ SELTALSQAIMLLIGALAVILMLWMASGGVGGNAQPGALIALFVFCALAAFEALAPVTGA FQHLGQVIASAVRITDLTDQKPEVTFPDTQTRVADRVSLTLRDVQFTYPEQSQQALKGIS LQVNAGEHIAILGRTGCGKSTLLQLLTRAWDPQQGEILLNDSPIASLNEAALRQTISVVP QRVHLFSATLRDNLLLASPGSSDEALAESLRRVGLEKLLEDAGLNSWLGEGGRQLSGGEL RRLAIARALLHDAPLVLLDEPTEGLDATTESQILELLAEMMREKTVLMVTHRLRGLSRFQ QIIVMDNGQIIEQGTHAELLARQGRYYQFKQGL >gi|296493407|gb|ADTK01000094.1| GENE 17 22992 - 23696 418 234 aa, chain + ## HITS:1 COG:aat KEGG:ns NR:ns ## COG: aat COG2360 # Protein_GI_number: 16128852 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Leu/Phe-tRNA-protein transferase # Organism: Escherichia coli K12 # 1 234 1 234 234 493 100.0 1e-139 MRLVQLSRHSIAFPSPEGALREPNGLLALGGDLSPARLLMAYQRGIFPWFSPGDPILWWS PDPRAVLWPESLHISRSMKRFHKRSPYRVTMNYAFGQVIEGCASDREEGTWITRGVVEAY HRLHELGHAHSIEVWREDELVGGMYGVAQGTLFCGESMFSRMENASKTALLVFCEEFIGH GGKLIDCQVLNDHTASLGACEIPRRDYLNYLNQMRLGRLPNNFWVPRCLFSPQE >gi|296493407|gb|ADTK01000094.1| GENE 18 23981 - 24199 257 72 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 [Streptococcus pneumoniae TIGR4] # 1 70 1 70 72 103 65 1e-21 MAKEDNIEMQGTVLETLPNTMFRVELENGHVVTAHISGKMRKNYIRILTGDKVTVELTPY DLSKGRIVFRSR Prediction of potential genes in microbial genomes Time: Mon May 16 15:16:43 2011 Seq name: gi|296493406|gb|ADTK01000095.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont256.1, whole genome shotgun sequence Length of sequence - 49003 bp Number of predicted genes - 44, with homology - 44 Number of transcription units - 18, operones - 12 
average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 65 - 124 6.7 1 1 Op 1 5/0.400 + CDS 151 - 843 564 ## COG2186 Transcriptional regulators 2 1 Op 2 . + CDS 866 - 2293 949 ## COG0477 Permeases of the major facilitator superfamily 3 2 Op 1 13/0.000 - CDS 2259 - 3251 831 ## COG1609 Transcriptional regulators 4 2 Op 2 8/0.200 - CDS 3255 - 4199 753 ## COG0524 Sugar kinases, ribokinase family 5 3 Op 1 16/0.000 - CDS 4310 - 5200 1163 ## COG1879 ABC-type sugar transport system, periplasmic component 6 3 Op 2 21/0.000 - CDS 5225 - 6190 1293 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 7 3 Op 3 9/0.200 - CDS 6195 - 7700 1812 ## COG1129 ABC-type sugar transport system, ATPase component 8 3 Op 4 4/0.400 - CDS 7708 - 8127 393 ## COG1869 ABC-type ribose transport system, auxiliary component - Prom 8222 - 8281 4.1 - Term 8248 - 8288 5.6 9 4 Tu 1 . - CDS 8294 - 10162 1669 ## COG3158 K+ transporter - Prom 10304 - 10363 4.5 + Prom 10289 - 10348 2.8 10 5 Op 1 7/0.400 + CDS 10385 - 11881 1403 ## COG0714 MoxR-like ATPases 11 5 Op 2 . + CDS 11875 - 13326 1119 ## COG2425 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain 12 6 Tu 1 . 
- CDS 13331 - 14281 946 ## COG2502 Asparagine synthetase A - Prom 14393 - 14452 5.0 + Prom 14303 - 14362 9.6 13 7 Op 1 4/0.400 + CDS 14475 - 14933 477 ## COG1522 Transcriptional regulators + Prom 14942 - 15001 2.0 14 7 Op 2 5/0.400 + CDS 15023 - 15466 582 ## COG0716 Flavodoxins + Prom 15693 - 15752 4.7 15 8 Op 1 24/0.000 + CDS 15845 - 17734 1965 ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division 16 8 Op 2 3/0.600 + CDS 17798 - 18421 546 ## COG0357 Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division + Term 18509 - 18560 4.2 + Prom 18894 - 18953 5.8 17 9 Op 1 8/0.200 + CDS 19038 - 19418 195 ## COG3312 F0F1-type ATP synthase, subunit I 18 9 Op 2 40/0.000 + CDS 19427 - 20242 802 ## COG0356 F0F1-type ATP synthase, subunit a 19 9 Op 3 37/0.000 + CDS 20289 - 20528 350 ## COG0636 F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K 20 9 Op 4 38/0.000 + CDS 20590 - 21060 576 ## COG0711 F0F1-type ATP synthase, subunit b 21 9 Op 5 41/0.000 + CDS 21075 - 21608 560 ## COG0712 F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) 22 9 Op 6 42/0.000 + CDS 21621 - 23162 1740 ## COG0056 F0F1-type ATP synthase, alpha subunit 23 9 Op 7 42/0.000 + CDS 23213 - 24076 997 ## COG0224 F0F1-type ATP synthase, gamma subunit 24 9 Op 8 42/0.000 + CDS 24103 - 25485 1611 ## COG0055 F0F1-type ATP synthase, beta subunit 25 9 Op 9 6/0.400 + CDS 25506 - 25925 438 ## COG0355 F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) + Term 25948 - 25981 4.5 26 10 Tu 1 9/0.200 + CDS 26276 - 27646 1719 ## COG1207 N-acetylglucosamine-1-phosphate uridyltransferase (contains nucleotidyltransferase and I-patch acetyltransferase domains) 27 11 Tu 1 1/0.800 + CDS 27808 - 29637 2303 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains + Term 29658 - 29689 2.3 + Prom 29679 - 29738 8.6 28 12 Tu 1 7/0.400 + CDS 29939 - 30511 451 
## COG3539 P pilus assembly protein, pilin FimA + Prom 30580 - 30639 5.9 29 13 Op 1 10/0.000 + CDS 30694 - 31290 272 ## COG3121 P pilus assembly protein, chaperone PapD 30 13 Op 2 6/0.400 + CDS 31315 - 33837 1515 ## COG3188 P pilus assembly protein, porin PapC 31 13 Op 3 2/0.800 + CDS 33848 - 34921 452 ## COG3539 P pilus assembly protein, pilin FimA + Prom 35031 - 35090 9.1 32 14 Op 1 39/0.000 + CDS 35169 - 36209 1181 ## COG0226 ABC-type phosphate transport system, periplasmic component 33 14 Op 2 38/0.000 + CDS 36296 - 37255 1242 ## COG0573 ABC-type phosphate transport system, permease component 34 14 Op 3 41/0.000 + CDS 37255 - 38145 1100 ## COG0581 ABC-type phosphate transport system, permease component 35 14 Op 4 32/0.000 + CDS 38236 - 39009 345 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 36 14 Op 5 1/0.800 + CDS 39024 - 39749 853 ## COG0704 Phosphate uptake regulator + Term 39759 - 39801 9.0 + Prom 39766 - 39825 6.8 37 15 Tu 1 . + CDS 40035 - 40871 557 ## COG3711 Transcriptional antiterminator + Term 40931 - 40961 -0.9 + Prom 40905 - 40964 6.7 38 16 Op 1 8/0.200 + CDS 41004 - 42881 1357 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 39 16 Op 2 1/0.800 + CDS 42900 - 44294 1253 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase + Term 44315 - 44359 5.7 40 17 Op 1 . + CDS 44380 - 45996 1574 ## COG4580 Maltoporin (phage lambda and maltose receptor) 41 17 Op 2 . + CDS 46023 - 47192 1105 ## COG2382 Enterochelin esterase and related enzymes 42 17 Op 3 . + CDS 47207 - 47929 775 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase + Term 47943 - 47984 5.2 - Term 47789 - 47826 -0.9 43 18 Op 1 . - CDS 47991 - 48578 269 ## COG3196 Uncharacterized protein conserved in bacteria 44 18 Op 2 . 
- CDS 48627 - 49001 149 ## EC55989_4186 putative inner membrane protein Predicted protein(s) >gi|296493406|gb|ADTK01000095.1| GENE 1 151 - 843 564 230 aa, chain + ## HITS:1 COG:ZyieP KEGG:ns NR:ns ## COG: ZyieP COG2186 # Protein_GI_number: 15804355 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 EDL933 # 1 230 1 230 230 461 99.0 1e-130 MPLSAQQLAAQKNLSYVLAEKLAQRILKGEYEPGTILPGEIELGEQFGVSRTAVREAVKT LTAKGMVLPRPRIGTRVMPQSNWNFLDQELLTWWMTEENFHQVIDHFLVMRICLEPQACL LAATVGTAEQKAHLNTLMAEMAALKENFRRERWIEVDMAWHEHIYEMSANPFLTSFASLF HSVYHTYFTSITSDTVIKLDLHQAIVDAIVQSDGDAAFKACQALLRSPDK >gi|296493406|gb|ADTK01000095.1| GENE 2 866 - 2293 949 475 aa, chain + ## HITS:1 COG:yieO KEGG:ns NR:ns ## COG: yieO COG0477 # Protein_GI_number: 16131622 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 475 1 475 475 828 99.0 0 MSDKKNRSMAGLPWIAAMAFFMQALDATILNTALPAIAHSLNRSPLAMQSAIISYTLTVA MLIPVSGWLADRFGTRRIFTLAVSLFTLGSLACALSNSLPQLVVFRVIQGIGGAMMMPVA RLALLRAYPRNELLPVLNFVAMPGLVGPILGPVLGGVLVTWATWHWIFLINIPIGIAGLL YARKHMPNFTTARRRFDITGFLLFGLSLVLFSSGIELFGEKIVASWIALTVIVTSIGLLL LYILHARHTPNPLISLDLFKTRTFSIGIVGNIATRLGTGCVPFLMPLMLQVGFGYQAFIA GCMMAPTALGSIIAKSMVTQVLRRLGYRHTLVGITVIIGLMIAQFSLQSPAMAIWMLILP LFILGMAMSTQFTAMNTITLADLTDDNASSGNSVLAVTQQLSISLGVAVSAAVLRVYEGM EGTTTVEQFHYTFITMGIITVASAAMFMLLKTTDGNNLIKRQRKSKPNRVPSESE >gi|296493406|gb|ADTK01000095.1| GENE 3 2259 - 3251 831 330 aa, chain - ## HITS:1 COG:ECs4695 KEGG:ns NR:ns ## COG: ECs4695 COG1609 # Protein_GI_number: 15833949 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 330 1 330 330 659 99.0 0 MATMKDVARLAGVSTSTVSHVINKDRFVSEAITAKVEAAIKELNYAPSALARSLKLNQTH TIGMLITASTNPFYSELVRGVERSCFERGYSLVLCNTEGDEQRMNRNLETLMQKRVDGLL 
LLCTETHQPSREIMQRYPTVPTVMMDWAPFDGDSDLIQDNSLLGGDLATQYLIDKGHTRI ACITGPLDKTPARLRLEGYRAAMKRAGLSIPDGYEVTGDFEFNGGFDAMRQLLSHPLRPQ AVFTGNDAMAVGVYQALYQAELQVPQDIAVIGYDDIELASFMTPPLTTIHQPKDELGELA IDVLIHRITQPTLQQQRLQLTPILMERGSA >gi|296493406|gb|ADTK01000095.1| GENE 4 3255 - 4199 753 314 aa, chain - ## HITS:1 COG:ECs4694 KEGG:ns NR:ns ## COG: ECs4694 COG0524 # Protein_GI_number: 15833948 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli O157:H7 # 6 314 1 309 309 529 100.0 1e-150 MDIPNMQNAGSLVVLGSINADHILNLQSFPTPGETVTGNHYQVAFGGKGANQAVAAGRSG ANIAFIACTGDDSIGESVRQQLATDNIDITPVSVIKGESTGVALIFVNGEGENVIGIHAG ANAALSPALVEAQRERIANASALLMQLESPLESVMAAAKIAHQNKTIVALNPAPARELPD ELLALVDIITPNETEAEKLTGIRVENDEDAAKAAQVLHEKGIRTVLITLGSRGVWASVNG EGQRVPGFRVQAVDTIAAGDTFNGALITALLEEKPLPEAIRFAHAAAAIAVTRKGAQPSV PWREEIDAFLDRQR >gi|296493406|gb|ADTK01000095.1| GENE 5 4310 - 5200 1163 296 aa, chain - ## HITS:1 COG:ECs4693 KEGG:ns NR:ns ## COG: ECs4693 COG1879 # Protein_GI_number: 15833947 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 1 296 1 296 296 493 100.0 1e-139 MNMKKLATLVSAVALSATVSANAMAKDTIALVVSTLNNPFFVSLKDGAQKEADKLGYNLV VLDSQNNPAKELANVQDLTVRGTKILLINPTDSDAVGNAVKMANQANIPVITLDRQATKG EVVSHIASDNVLGGKIAGDYIAKKAGEGAKVIELQGIAGTSAARERGEGFQQAVAAHKFN VLASQPADFDRTKGLNVMQNLLTAHPDVQAVFAQNDEMALGALRALQTAGKSDVMVVGFD GTPDGEKAVNDGKLAATIAQLPDQIGAKGVETADKVLKGEKVQAKYPVDLKLVVKQ >gi|296493406|gb|ADTK01000095.1| GENE 6 5225 - 6190 1293 321 aa, chain - ## HITS:1 COG:ECs4692 KEGG:ns NR:ns ## COG: ECs4692 COG1172 # Protein_GI_number: 15833946 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Escherichia coli O157:H7 # 1 321 1 321 321 481 100.0 1e-135 MTTQTVSGRRYFTKAWLMEQKSLIALLVLIAIVSTLSPNFFTINNLFNILQQTSVNAIMA 
VGMTLVILTSGIDLSVGSLLALTGAVAASIVGIEVNALVAVAAALALGAAIGAVTGVIVA KGRVQAFIATLVMMLLLRGVTMVYTNGSPVNTGFTENADLFGWFGIGRPLGVPTPVWIMG IVFLAAWYMLHHTRLGRYIYALGGNEAATRLSGINVNKIKIIVYSLCGLLASLAGIIEVA RLSSAQPTAGTGYELDAIAAVVLGGTSLAGGKGRIVGTLIGALILGFLNNGLNLLGVSSY YQMIVKAVVILLAVLVDNKKQ >gi|296493406|gb|ADTK01000095.1| GENE 7 6195 - 7700 1812 501 aa, chain - ## HITS:1 COG:rbsA KEGG:ns NR:ns ## COG: rbsA COG1129 # Protein_GI_number: 16131617 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Escherichia coli K12 # 1 501 1 501 501 962 99.0 0 MEALLQLKGIDKAFPGVKALSGAALNVYPGRVMALVGENGAGKSTMMKVLTGIYARDAGT LLWLGKETTFTGPKSSQEAGIGIIHQELNLIPQLTIAENIFLGREFVNRFGKIDWKTMYA EADKLLAKLNLRFKSDKLVGDLSIGDQQMVEIAKVLSFESKVIIMDEPTDALTDTETESL FRVIRELKSQGRGIVYISHRMKEIFEICDDVTVFRDGQFIAEREVASLTEDSLIEMMVGR KLEDQYPHLGKAPGDIRLKVDNLCGPGVNDVSFTLRKGEILGVSGLMGAGRTELMKVLYG ALPRTSGYVTLDGHEVVTRSPQDGLANGIVYISEDRKRDGLVLGMSVKENMSLTALRYFS RAGGSLKHADEQQAVSDFIRLFNVKTPSMEQAIGLLSGGNQQKVAIARGLMTRPKVLILD EPTRGVDVGAKKEIYQLINQFKADGLSIILVSSEMPEVLGMSDRIIVMHEGHLSGEFTRE QATQEVLMAAAVGKLNRVNQE >gi|296493406|gb|ADTK01000095.1| GENE 8 7708 - 8127 393 139 aa, chain - ## HITS:1 COG:rbsD KEGG:ns NR:ns ## COG: rbsD COG1869 # Protein_GI_number: 16131616 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type ribose transport system, auxiliary component # Organism: Escherichia coli K12 # 1 139 13 151 151 249 98.0 9e-67 MKKGTVLNSDISSVISRLGHTDTLVVCDAGLPISKSTTRIDMALTQGVPSFMQVLGVVTN EMQVEAAIIAEEIKQHNPQLHETLLTHLEQLQKHQGNTIEIRYTTHEQFKQQTAESQAVI RSGECSPYANIILCAGVTF >gi|296493406|gb|ADTK01000095.1| GENE 9 8294 - 10162 1669 622 aa, chain - ## HITS:1 COG:ECs4689 KEGG:ns NR:ns ## COG: ECs4689 COG3158 # Protein_GI_number: 15833943 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transporter # Organism: Escherichia coli O157:H7 # 1 622 1 622 622 1177 99.0 0 MSTDNKQSLPAITLAAIGVVYGDIGTSPLYTLRECLSGQFGFGVERDAVFGFLSLIFWLL 
IFVVSIKYLTFVMRADNAGEGGILTLMSLAGRNTSARTTSMLVIMGLIGGSFFYGEVVIT PAISVMSAIEGLEIVAPQLDTWIVPLSIIVLTLLFMIQKHGTAMVGKLFAPIMLTWFLIL AGLGLRSIIANPEVLHALNPMWAVHFFLEYKTVSFIALGAVVLSITGVEALYADMGHFGK FPIRLAWFTVVLPSLTLNYFGQGALLLKNPEAIKNPFFLLAPDWALIPLLIIAALATVIA SQAVISGVFSLTRQAVRLGYLSPMRIIHTSEMESGQIYIPFVNWMLYVAVVIVIVSFEHS SNLAAAYGIAVTGTMVLTSILSTTVARQNWHWNKYFVALILIAFLCVDIPLFTANLDKLL SGGWLPLSLGTVMFIVMTTWKSERFRLLRRMHEHGNSLEAMIASLEKSPPVRVPGTAVYM SRAINVIPFALMHNLKHNKVLHERVILLTLRTEDAPYVHNVRRVQIEQLSPTFWRVVASY GWRETPNVEEVFHRCGLEGLSCRMMETSFFMSHESLILGKRPWYLRLRGKLYLLLQRNAL RAPDQFEIPPNRVIELGTQVEI >gi|296493406|gb|ADTK01000095.1| GENE 10 10385 - 11881 1403 498 aa, chain + ## HITS:1 COG:yieN KEGG:ns NR:ns ## COG: yieN COG0714 # Protein_GI_number: 16131614 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Escherichia coli K12 # 1 498 9 506 506 947 100.0 0 MAHPHLLAERISRLSSSLEKGLYERSHAIRLCLLAALSGESVFLLGPPGIAKSLIARRLK FAFQNARAFEYLMTRFSTPEEVFGPLSIQALKDEGRYERLTSGYLPEAEIVFLDEIWKAG PAILNTLLTAINERQFRNGAHVEKIPMRLLVAASNELPEADSSLEALYDRMLIRLWLDKV QDKANFRSMLTSQQDENDNPVPDALQVTDEEYERWQKEIGEITLPDHVFELIFMLRQQLD KLPDAPYVSDRRWKKAIRLLQASAFFSGRSAVAPVDLILLKDCLWYDAQSLNLIQQQIDV LMTGHAWQQQGMLTRLGAIVQRHLQLQQQQSDKTALTVIRLGGIFSRRQQYQLPVNVTAS TLTLLLQKPLKLHDMEVVHISFERSALEQWLSKGGEIRGKLNGIGFAQKLNLEVDSAQHL VVRDVSLQGSTLALPGSSAEGLPGEIKQQLEELESDWRKQHALFSEQQKCLFIPGDWLGR IEASLQDVGAQIRQAQQC >gi|296493406|gb|ADTK01000095.1| GENE 11 11875 - 13326 1119 483 aa, chain + ## HITS:1 COG:ECs4687 KEGG:ns NR:ns ## COG: ECs4687 COG2425 # Protein_GI_number: 15833941 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Escherichia coli O157:H7 # 1 473 1 473 483 891 100.0 0 MLTLDTLNVMLAVSEEGLIEEMIIALLASPQLAVFFEKFPRLKAAITDDVPRWREALRSR LKDARVPPELTEEVMCYQQSQLLSTPQFIVQLPQILDLLHRLNSPWAEQARQLVDANSTI TSALHTLFLQRWRLSLIVQATTLNQQLLEEEREQLLSEVQERMTLSGQLEPILADNNTAA GRLWDMSAGQLKRGDYQLIVKYGEFLNEQPELKRLAEQLGRSREAKSIPRNDAQMETFRT 
MVREPATVPEQVDGLQQSDDILRLLPPELATLGITELEYEFYRRLVEKQLLTYRLHGESW REKVIERPVVHKDYDEQPRGPFIVCVDTSGSMGGFNEQCAKAFCLALMRIALAENRRCYI MLFSTEIVRYELSGPQGIEQAIRFLSQQFRGGTDLASCFRAIMERLQSREWFDADAVVIS DFIAQRLPDDVTSKVKELQRVHQHRFHAVAMSAHGKPGIMRIFDHIWRFDTGMRSRLLRR WRR >gi|296493406|gb|ADTK01000095.1| GENE 12 13331 - 14281 946 316 aa, chain - ## HITS:1 COG:ECs4686 KEGG:ns NR:ns ## COG: ECs4686 COG2502 # Protein_GI_number: 15833940 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthetase A # Organism: Escherichia coli O157:H7 # 1 316 15 330 330 626 99.0 1e-179 MKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGCEKAVQVKVKVLPDAQFEVVHSLA KWKRQTLGQHDFSAGEGLYTHMKALRPDEDRLSPLHSVYVDQWDWERVMGDGERQFSTLK STVEAIWAGIKATEAAVSEEFGLAPFLPDQIHFVHSQELLSRYPDLDAKGRERAIAKDLG AVFLVGIGGKLSDGHRHDVRAPDYDDWSTPSELGHAGLNGDILVWNPVLEDAFELSSMGI RVDADTLKHQLALTGDEDRLQLEWHQALLRGEMPQTIGGGIGQSRLTMLLLQLPHIGQVQ CGVWPAAVRESVPSLL >gi|296493406|gb|ADTK01000095.1| GENE 13 14475 - 14933 477 152 aa, chain + ## HITS:1 COG:ECs4685 KEGG:ns NR:ns ## COG: ECs4685 COG1522 # Protein_GI_number: 15833939 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 152 1 152 152 296 100.0 8e-81 MENYLIDNLDRGILEALMGNARTAYAELAKQFGVSPGTIHVRVEKMKQAGIITGARIDVS PKQLGYDVGCFIGIILKSAKDYPSALAKLESLDEVTEAYYTTGHYSIFIKVMCRSIDALQ HVLINKIQTIDEIQSTETLIVLQNPIMRTIKP >gi|296493406|gb|ADTK01000095.1| GENE 14 15023 - 15466 582 147 aa, chain + ## HITS:1 COG:ECs4684 KEGG:ns NR:ns ## COG: ECs4684 COG0716 # Protein_GI_number: 15833938 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Escherichia coli O157:H7 # 1 147 1 147 147 275 98.0 2e-74 MADITLISGSTLGGAEYVAEHLAEKLEEAGFTTETLHGPLLEDLSASGIWLIISSTHGAG DIPDNLSPFYEALQEQKPDLSAVRFGAIGIGSREYDTFCGAIDKLEAELKDSGAKQTGET LKINILDHDIPEDPAEEWLGSWINLLK >gi|296493406|gb|ADTK01000095.1| GENE 15 15845 - 17734 1965 629 aa, chain + ## HITS:1 COG:gidA KEGG:ns NR:ns ## COG: gidA COG0445 # Protein_GI_number: 16131609 # Func_class: D Cell cycle 
control, cell division, chromosome partitioning # Function: NAD/FAD-utilizing enzyme apparently involved in cell division # Organism: Escherichia coli K12 # 1 629 1 629 629 1246 99.0 0 MFYPDPFDVIIIGGGHAGTEAAMAAARMGQQTLLLTHNIDTLGQMSCNPAIGGIGKGHLV KEVDALGGLMAKAIDQAGIQFRILNASKGPAVRATRAQADRVLYRQAVRTALENQPNLMI FQQAVEDLIVENDRVVGAVTQMGLKFRAKAVVLTVGTFLDGKIHIGLDNYSGGRAGDPPS IPLSRRLRELPLRVGRLKTGTPPRIDARTIDFSVLAQQHGDNPMPVFSFMGNASQHPQQV PCYITHTNEKTHYVIRSNLDRSPMYAGVIEGVGPRYCPSIEDKVMRFADRNQHQIFLEPE GLTSNEIYPNGISTSLPFDVQMQIVRSMQGMENAKIVRPGYAIEYDFFDPRDLKPTLESK FIQGLFFAGQINGTTGYEEAAAQGLLAGLNAARLSADKEGWAPARSQAYLGVLVDDLCTL GTKEPYRMFTSRAEYRLMLREDNADLRLTEIGRELGLVDDERWARFNEKLENIERERQRL KSTWVTPSAEAAAEVNAHLTAPLSREASGEDLLRRPEMTYEKLTTLTPFAPALTDEQAAE QVEIQVKYEGYIARQQDEIEKQLRNENTLLPATLDYRQVSGLSNEVIAKLNDHKPASIGQ ASRISGVTPAAISILLVWLKKQGMLRRSA >gi|296493406|gb|ADTK01000095.1| GENE 16 17798 - 18421 546 207 aa, chain + ## HITS:1 COG:ECs4682 KEGG:ns NR:ns ## COG: ECs4682 COG0357 # Protein_GI_number: 15833936 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division # Organism: Escherichia coli O157:H7 # 1 207 1 207 207 411 100.0 1e-115 MLNKLSLLLKDAGISLTDHQKNQLIAYVNMLHKWNKAYNLTSVRDPNEMLVRHILDSIVV APYLQGERFIDVGTGPGLPGIPLSIVRPEAHFTLLDSLGKRVRFLRQVQHELKLENIEPV QSRVEEFPSEPPFDGVISRAFASLNDMVSWCHHLPGEQGRFYALKGQMPEDEIALLPEEY QVESVVKLQVPALDGERHLVVIKANKI >gi|296493406|gb|ADTK01000095.1| GENE 17 19038 - 19418 195 126 aa, chain + ## HITS:1 COG:ECs4681 KEGG:ns NR:ns ## COG: ECs4681 COG3312 # Protein_GI_number: 15833935 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit I # Organism: Escherichia coli O157:H7 # 1 126 5 130 130 157 100.0 6e-39 MSVSLVSRNVARKLLLVQLLVVIASGLLFSLKDPFWGVSAISGGLAVFLPNVLFMIFAWR HQAHTPAKGRVAWTFAFGEAFKVLAMLVLLVVALAVLKAVFLPLIVTWVLVLVVQILAPA VINNKG >gi|296493406|gb|ADTK01000095.1| GENE 18 19427 - 20242 802 271 aa, chain + ## HITS:1 COG:STM3871 KEGG:ns 
NR:ns ## COG: STM3871 COG0356 # Protein_GI_number: 16767155 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit a # Organism: Salmonella typhimurium LT2 # 1 271 1 271 271 467 96.0 1e-131 MASENMTPQDYIGHHLNNLQLDLRTFSLVDPQNPPATFWTINIDSMFFSVVLGLLFLVLF RSVAKKATSGVPGKFQTAIELVIGFVNGSVKDMYHGKSKLIAPLALTIFVWVFLMNLMDL LPIDLLPYIAEHVLGLPALRVVPSADVNVTLSMALGVFILILFYSIKMKGIGGFTKELTL QPFNHWAFIPVNLILEGVSLLSKPVSLGLRLFGNMYAGELIFILIAGLLPWWSQWILNVP WAIFHILIITLQAFIFMVLTIVYLSMASEEH >gi|296493406|gb|ADTK01000095.1| GENE 19 20289 - 20528 350 79 aa, chain + ## HITS:1 COG:BU003 KEGG:ns NR:ns ## COG: BU003 COG0636 # Protein_GI_number: 15616633 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K # Organism: Buchnera sp. APS # 1 79 1 79 79 97 83.0 5e-21 MENLNMDLLYMAAAVMMGLAAIGAAIGIGILGGKFLEGAARQPDLIPLLRTQFFIVMGLV DAIPMIAVGLGLYVMFAVA >gi|296493406|gb|ADTK01000095.1| GENE 20 20590 - 21060 576 156 aa, chain + ## HITS:1 COG:ECs4678 KEGG:ns NR:ns ## COG: ECs4678 COG0711 # Protein_GI_number: 15833932 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit b # Organism: Escherichia coli O157:H7 # 1 156 1 156 156 201 100.0 5e-52 MNLNATILGQAIAFVLFVLFCMKYVWPPLMAAIEKRQKEIADGLASAERAHKDLDLAKAS ATDQLKKAKAEAQVIIEQANKRRSQILDEAKAEAEQERTKIVAQAQAEIEAERKRAREEL RKQVAILAVAGAEKIIERSVDEAANSDIVDKLVAEL >gi|296493406|gb|ADTK01000095.1| GENE 21 21075 - 21608 560 177 aa, chain + ## HITS:1 COG:ECs4677 KEGG:ns NR:ns ## COG: ECs4677 COG0712 # Protein_GI_number: 15833931 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) # Organism: Escherichia coli O157:H7 # 1 177 1 177 177 300 100.0 8e-82 MSEFITVARPYAKAAFDFAVEHQSVERWQDMLAFAAEVTKNEQMAELLSGALAPETLAES FIAVCGEQLDENGQNLIRVMAENGRLNALPDVLEQFIHLRAVSEATAEVDVISAAALSEQ QLAKISAAMEKRLSRKVKLNCKIDKSVMAGVIIRAGDMVIDGSVRGRLERLADVLQS 
>gi|296493406|gb|ADTK01000095.1| GENE 22 21621 - 23162 1740 513 aa, chain + ## HITS:1 COG:ECs4676 KEGG:ns NR:ns ## COG: ECs4676 COG0056 # Protein_GI_number: 15833930 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Escherichia coli O157:H7 # 1 513 1 513 513 986 100.0 0 MQLNSTEISELIKQRIAQFNVVSEAHNEGTIVSVSDGVIRIHGLADCMQGEMISLPGNRY AIALNLERDSVGAVVMGPYADLAEGMKVKCTGRILEVPVGRGLLGRVVNTLGAPIDGKGP LDHDGFSAVEAIAPGVIERQSVDQPVQTGYKAVDSMIPIGRGQRELIIGDRQTGKTALAI DAIINQRDSGIKCIYVAIGQKASTISNVVRKLEEHGALANTIVVVATASESAALQYLAPY AGCAMGEYFRDRGEDALIIYDDLSKQAVAYRQISLLLRRPPGREAFPGDVFYLHSRLLER AARVNAEYVEAFTKGEVKGKTGSLTALPIIETQAGDVSAFVPTNVISITDGQIFLETNLF NAGIRPAVNPGISVSRVGGAAQTKIMKKLSGGIRTALAQYRELAAFSQFASDLDDATRKQ LDHGQKVTELLKQKQYAPMSVAQQSLVLFAAERGYLADVELSKIGSFEAALLAYVDRDHA PLMQEINQTGGYNDEIEGKLKGILDSFKATQSW >gi|296493406|gb|ADTK01000095.1| GENE 23 23213 - 24076 997 287 aa, chain + ## HITS:1 COG:ECs4675 KEGG:ns NR:ns ## COG: ECs4675 COG0224 # Protein_GI_number: 15833929 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Escherichia coli O157:H7 # 1 287 1 287 287 555 100.0 1e-158 MAGAKEIRSKIASVQNTQKITKAMEMVAASKMRKSQDRMAASRPYAETMRKVIGHLAHGN LEYKHPYLEDRDVKRVGYLVVSTDRGLCGGLNINLFKKLLAEMKTWTDKGVQCDLAMIGS KGVSFFNSVGGNVVAQVTGMGDNPSLSELIGPVKVMLQAYDEGRLDKLYIVSNKFINTMS QVPTISQLLPLPASDDDDLKHKSWDYLYEPDPKALLDTLLRRYVESQVYQGVVENLASEQ AARMVAMKAATDNGGSLIKELQLVYNKARQASITQELTEIVSGAAAV >gi|296493406|gb|ADTK01000095.1| GENE 24 24103 - 25485 1611 460 aa, chain + ## HITS:1 COG:ECs4674 KEGG:ns NR:ns ## COG: ECs4674 COG0055 # Protein_GI_number: 15833928 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Escherichia coli O157:H7 # 1 460 1 460 460 890 100.0 0 MATGKIVQVIGAVVDVEFPQDAVPRVYDALEVQNGNERLVLEVQQQLGGGIVRTIAMGSS DGLRRGLDVKDLEHPIEVPVGKATLGRIMNVLGEPVDMKGEIGEEERWAIHRAAPSYEEL SNSQELLETGIKVIDLMCPFAKGGKVGLFGGAGVGKTVNMMELIRNIAIEHSGYSVFAGV 
GERTREGNDFYHEMTDSNVIDKVSLVYGQMNEPPGNRLRVALTGLTMAEKFRDEGRDVLL FVDNIYRYTLAGTEVSALLGRMPSAVGYQPTLAEEMGVLQERITSTKTGSITSVQAVYVP ADDLTDPSPATTFAHLDATVVLSRQIASLGIYPAVDPLDSTSRQLDPLVVGQEHYDTARG VQSILQRYQELKDIIAILGMDELSEEDKLVVARARKIQRFLSQPFFVAEVFTGSPGKYVS LKDTIRGFKGIMEGEYDHLPEQAFYMVGSIEEAVEKAKKL >gi|296493406|gb|ADTK01000095.1| GENE 25 25506 - 25925 438 139 aa, chain + ## HITS:1 COG:atpC KEGG:ns NR:ns ## COG: atpC COG0355 # Protein_GI_number: 16131599 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) # Organism: Escherichia coli K12 # 1 139 1 139 139 246 100.0 1e-65 MAMTYHLDVVSAEQQMFSGLVEKIQVTGSEGELGIYPGHAPLLTAIKPGMIRIVKQHGHE EFIYLSGGILEVQPGNVTVLADTAIRGQDLDEARAMEAKRKAEEHISSSHGDVDYAQASA ELAKAIAQLRVIELTKKAM >gi|296493406|gb|ADTK01000095.1| GENE 26 26276 - 27646 1719 456 aa, chain + ## HITS:1 COG:ECs4672 KEGG:ns NR:ns ## COG: ECs4672 COG1207 # Protein_GI_number: 15833926 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylglucosamine-1-phosphate uridyltransferase (contains nucleotidyltransferase and I-patch acetyltransferase domains) # Organism: Escherichia coli O157:H7 # 1 456 1 456 456 890 100.0 0 MLNNAMSVVILAAGKGTRMYSDLPKVLHTLAGKAMVQHVIDAANELGAAHVHLVYGHGGD LLKQALKDDNLNWVLQAEQLGTGHAMQQAAPFFADDEDILMLYGDVPLISVETLQRLRDA KPQGGIGLLTVKLDDPTGYGRITRENGKVTGIVEHKDATDEQRQIQEINTGILIANGADM KRWLAKLTNNNAQGEYYITDIIALAYQEGREIVAVHPQRLSEVEGVNNRLQLSRLERVYQ SEQAEKLLLAGVMLRDPARFDLRGTLTHGRDVEIDTNVIIEGNVTLGHRVKIGTGCVIKN SVIGDDCEISPYTVVEDANLAAACTIGPFARLRPGAELLEGAHVGNFVEMKKARLGKGSK AGHLTYLGDAEIGDNVNIGAGTITCNYDGANKFKTIIGDDVFVGSDTQLVAPVTVGKGAT IAAGTTVTRNVGENALAISRVPQTQKEGWRRPVKKK >gi|296493406|gb|ADTK01000095.1| GENE 27 27808 - 29637 2303 609 aa, chain + ## HITS:1 COG:glmS KEGG:ns NR:ns ## COG: glmS COG0449 # Protein_GI_number: 16131597 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # 
Organism: Escherichia coli K12 # 1 609 1 609 609 1196 100.0 0 MCGIVGAIAQRDVAEILLEGLRRLEYRGYDSAGLAVVDAEGHMTRLRRLGKVQMLAQAAE EHPLHGGTGIAHTRWATHGEPSEVNAHPHVSEHIVVVHNGIIENHEPLREELKARGYTFV SETDTEVIAHLVNWELKQGGTLREAVLRAIPQLRGAYGTVIMDSRHPDTLLAARSGSPLV IGLGMGENFIASDQLALLPVTRRFIFLEEGDIAEITRRSVNIFDKTGAEVKRQDIESNLQ YDAGDKGIYRHYMQKEIYEQPNAIKNTLTGRISHGQVDLSELGPNADELLSKVEHIQILA CGTSYNSGMVSRYWFESLAGIPCDVEIASEFRYRKSAVRRNSLMITLSQSGETADTLAGL RLSKELGYLGSLAICNVPGSSLVRESDLALMTNAGTEIGVASTKAFTTQLTVLLMLVAKL SRLKGLDASIEHDIVHGLQALPSRIEQMLSQDKRIEALAEDFSDKHHALFLGRGDQYPIA LEGALKLKEISYIHAEAYAAGELKHGPLALIDADMPVIVVAPNNELLEKLKSNIEEVRAR GGQLYVFADQDAGFVSSDNMHIIEMPHVEEVIAPIFYTVPLQLLAYHVALIKGTDVDQPR NLAKSVTVE >gi|296493406|gb|ADTK01000095.1| GENE 28 29939 - 30511 451 190 aa, chain + ## HITS:1 COG:ECs4670 KEGG:ns NR:ns ## COG: ECs4670 COG3539 # Protein_GI_number: 15833924 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli O157:H7 # 1 190 1 200 200 193 55.0 1e-49 MKRNIIGGAFTLASLMLAGHALAEDGVVHFVGEIVDTTCEVTSDTADQIVPLGKVSKNAF SGVGSLASPQKFSIKLENCPATYTQAAVRFDGTEAPGGDGDLKVGTPLTAGNPGDFTGTG QAIAATGVGIRIFNQSDNSQVKLYNDSAYTAIDAEGKAEMKFIARYVATNATVTAGTANA DSQFTVEYKK >gi|296493406|gb|ADTK01000095.1| GENE 29 30694 - 31290 272 198 aa, chain + ## HITS:1 COG:STM0176 KEGG:ns NR:ns ## COG: STM0176 COG3121 # Protein_GI_number: 16763566 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Salmonella typhimurium LT2 # 1 197 31 226 227 132 38.0 5e-31 MVYLSNNADKSISVFSKEEKIPYLIQAWVDPFNKEDKSKAPFTVIPPVSRLEPSQEKILR IIHTKGVSLPDDRESVFWLNIKNIPPSASNKATNSLEIAVKTRIKLFWRPANIRLIPEDA APKVKWRREGRNLIAENPNPIHISVMDVIVDGHDVPLNMIRPFETLTLPLPANSAGGQMT WRFINDYGAVSDPIKMTL >gi|296493406|gb|ADTK01000095.1| GENE 30 31315 - 33837 1515 840 aa, chain + ## HITS:1 COG:ECs4667 KEGG:ns NR:ns ## COG: ECs4667 COG3188 # Protein_GI_number: 
15833921 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli O157:H7 # 1 839 1 843 844 1105 63.0 0 MMTTRIVVGLTAGTCLIFSQNLMAEVSVFNPALLEIDHQSGVDIRQFNRANLMPPGVYSV DIFINGKMFERQDVTFVQDNPDADLHACFIAIKKTLSSFGIKVDALKSFNDVDETVCLDP APRIEGSSWQFDSDKLQLNISIPQIYMDAMAYDYISPTRWDEGINALTINYDFSGSHTLR SDYGSQETDTSYLNLRNGLNIGPWRLRNYSTLNTSDGRAEYNSISTWIQRDIAALRSQIM IGDTWTASDIFDSTQIRGARLYTDNDMLPASQNGFAPVVRGIAKSNATVIIRQNGYVIYQ SAVPQGAFEITDLNTASTGGDLDVTIKEEDGSEQRFTQPYASLAILKREGQTDVDVSVGE LRDEDGFTPDVLQAQILHGFSHGITLYGGMQAAENYGSAALGVGKDLGALGAISFDVTHA RANFSHDDTETGQSYRFLYSKRFDDTDTSLRLVGYRYSTEGYYTLNEWASRRNSPEDFWE TGNRRSRVEGTLTQSLGRDYGNLYLTLSRQQYWHTDDVERLMQFGYSSSWKRLSWNVSWS YSNTARQGTGNNHASDNTSEQIYMLSLSVPLSGWWGNSYATYSVSQNDNSGSSHQLGLSG TALERNNLSWNLMQSYNSHDDEVGGNMSLTYDGSYGTVNGSYNYSQNSQRLNYGIRGGIL AHSEGVTLSQELGETIALVKAPGAAGLEIDNMRGAATDWRGYTVKTQLNPYDENRVAISD NYFSKSNIELDNTVVTMVPTRGAVVKAEFVTHVGYRVLFRVLNANGKPVPFGAIAAIQDA SLADSGIVGDRGELYLSGLPEKGQVTLSWGENASTKCIFNYSLSTPESESGLIEQGVTCH >gi|296493406|gb|ADTK01000095.1| GENE 31 33848 - 34921 452 357 aa, chain + ## HITS:1 COG:ECs4665 KEGG:ns NR:ns ## COG: ECs4665 COG3539 # Protein_GI_number: 15833919 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli O157:H7 # 1 356 1 359 360 250 42.0 3e-66 MNKYIKQWCFAVFMLSLSSVALAAPKGICTPDNGVFHSTLDFSGYLITANENKVGTTFNT TVTNGSSYPGRCHCDTGNVGEFPYIYYTSKINQALTYAGVHSNINYYDLNPNLDVGIAID ILGVGYVNAPFEYHANNPSGNTKYNCNRIEPLSISSGAKAIVYFYIKKTFAGKLIIPETK IVTLYGTISRDTPVDYSQPMADVYIRGDITAPQSCEINNLQPVYFDFKEIPAADFSSVVG SAVTTHKITKTVTIECENLGILNTDDISTSFYATEPNTDNSMVVTSNSNVGIKIYDKNNK EIKVNGGELPTDMGKSTVYGEKSGSVTFSAAPASLTGARPAPGQFTATATITVEIVR >gi|296493406|gb|ADTK01000095.1| GENE 32 35169 - 36209 1181 346 aa, chain + ## HITS:1 COG:pstS KEGG:ns NR:ns ## COG: pstS COG0226 # Protein_GI_number: 16131596 # Func_class: P 
Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Escherichia coli K12 # 1 346 1 346 346 632 100.0 0 MKVMRTTVATVVAATLSMSAFSVFAEASLTGAGATFPAPVYAKWADTYQKETGNKVNYQG IGSSGGVKQIIANTVDFGASDAPLSDEKLAQEGLFQFPTVIGGVVLAVNIPGLKSGELVL DGKTLGDIYLGKIKKWDDEAIAKLNPGLKLPSQNIAVVRRADGSGTSFVFTSYLAKVNEE WKNNVGTGSTVKWPIGLGGKGNDGIAAFVQRLPGAIGYVEYAYAKQNNLAYTKLISADGK PVSPTEENFANAAKGADWSKTFAQDLTNQKGEDAWPITSTTFILIHKDQKKPEQGTEVLK FFDWAYKTGAKQANDLDYASLPDSVVEQVRAAWKTNIKDSSGKPLY >gi|296493406|gb|ADTK01000095.1| GENE 33 36296 - 37255 1242 319 aa, chain + ## HITS:1 COG:ECs4663 KEGG:ns NR:ns ## COG: ECs4663 COG0573 # Protein_GI_number: 15833917 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Escherichia coli O157:H7 # 1 319 1 319 319 535 100.0 1e-152 MAATKPAFNPPGKKGDIIFSVLVKLAALIVLLMLGGIIVSLIISSWPSIQKFGLAFLWTK EWDAPNDIYGALVPIYGTLVTSFIALLIAVPVSFGIALFLTELAPGWLKRPLGIAIELLA AIPSIVYGMWGLFIFAPLFAVYFQEPVGNIMSNIPIVGALFSGPAFGIGILAAGVILAIM IIPYIAAVMRDVFEQTPVMMKESAYGIGCTTWEVIWRIVLPFTKNGVIGGIMLGLGRALG ETMAVTFIIGNTYQLDSASLYMPGNSITSALANEFAEAESGLHVAALMELGLILFVITFI VLAASKFMIMRLAKNEGAR >gi|296493406|gb|ADTK01000095.1| GENE 34 37255 - 38145 1100 296 aa, chain + ## HITS:1 COG:ECs4662 KEGG:ns NR:ns ## COG: ECs4662 COG0581 # Protein_GI_number: 15833916 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Escherichia coli O157:H7 # 1 296 1 296 296 507 100.0 1e-144 MAMVEMQTTAALAESRRKMQARRRLKNRIALTLSMATMAFGLFWLIWILMSTITRGIDGM SLALFTEMTPPPNTEGGGLANALAGSGLLILWATVFGTPLGIMAGIYLAEYGRKSWLAEV IRFINDILLSAPSIVVGLFVYTIVVAQMEHFSGWAGVIALALLQVPIVIRTTENMLKLVP DSLREAAYALGTPKWKMISAITLKASVSGIMTGILLAIARIAGETAPLLFTALSNQFWST DMMQPIANLPVTIFKFAMSPFAEWQQLAWAGVLIITLCVLLLNILARVVFAKNKHG >gi|296493406|gb|ADTK01000095.1| GENE 35 38236 - 39009 345 257 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 11 252 2 239 245 137 36 1e-31 MSMVETAPSKIQVRNLNFYYGKFHALKNINLDIAKNQVTAFIGPSGCGKSTLLRTFNKMF ELYPEQRAEGEILLDGDNILTNSQDIALLRAKVGMVFQKPTPFPMSIYDNIAFGVRLFEK LSRADMDERVQWALTKAALWNETKDKLHQSGYSLSGGQQQRLCIARGIAIRPEVLLLDEP CSALDPISTGRIEELITELKQDYTVVIVTHNMQQAARCSDHTAFMYLGELIEFSNTDDLF TKPAKKQTEDYITGRYG >gi|296493406|gb|ADTK01000095.1| GENE 36 39024 - 39749 853 241 aa, chain + ## HITS:1 COG:phoU KEGG:ns NR:ns ## COG: phoU COG0704 # Protein_GI_number: 16131592 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate uptake regulator # Organism: Escherichia coli K12 # 1 241 1 241 241 449 100.0 1e-126 MDSLNLNKHISGQFNAELESIRTQVMTMGGMVEQQLSDAITAMHNQDSDLAKRVIEGDKN VNMMEVAIDEACVRIIAKRQPTASDLRLVMVISKTIAELERIGDVADKICRTALEKFSQQ HQPLLVSLESLGRHTIQMLHDVLDAFARMDIDEAVRIYREDKKVDQEYEGIVRQLMTYMM EDSRTIPSVLTALFCARSIERIGDRCQNICEFIFYYVKGQDFRHVGGDELDKLLAGKDSD K >gi|296493406|gb|ADTK01000095.1| GENE 37 40035 - 40871 557 278 aa, chain + ## HITS:1 COG:bglG KEGG:ns NR:ns ## COG: bglG COG3711 # Protein_GI_number: 16131591 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Escherichia coli K12 # 1 278 1 278 278 521 98.0 1e-148 MNMQITKILNNNVVVVIDDQQREKVVMGRGIGFQKRAGKRINSSGIEKEYALSSHELNGR LSELLSHIPLEVMATCDRIISLAQERLGKLQDSIYISLTDHCQFAIKRFQQNVLLPNPLL WDIQRLYPKEFQLGEEALTIIDKRLGVQLPKDEVGFIAMHLVSAQMSGNMEDVAGVTQLM REMLQLIKFQFSLNYQEESLSYQRLVTHLKFLSWRILGHASINDSDESLQQAVKQNYPQA WQCAERIAIFIGLQYQRKISPAEIMFLAINIERVHKEH >gi|296493406|gb|ADTK01000095.1| GENE 38 41004 - 42881 1357 625 aa, chain + ## HITS:1 COG:bglF_2 KEGG:ns NR:ns ## COG: bglF_2 COG1263 # Protein_GI_number: 16131590 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Escherichia coli K12 # 91 451 1 361 361 684 99.0 0 MTELARKIVAGVGGADNIVSLMHCATRLRFKLKDESKAQAEVLKKTPGIIMVVESGGQFQ 
VVIGNHVADVFLAVNSVAGLDEKAQQAPENDDKGNLLNRFVYVISGIFTPLIGLMAATGI LKGMLALALTFQWTTEQSGTYLILFSASDALFWFFPIILGYTAGKRFGGNPFTAMVIGGA LVHPLILTAFENGQKADALGLDFLGIPVTLLNYSSSVIPIIFSAWLCSILERRLNAWLPS AIKNFFTPLLCLMVITPVTFLLVGPLSTWISELIAAGYLWLYQAVPVFAGAVMGGFWQIF VMFGLHWGLVPLCINNFTVLGYDTMIPLLMPAIMAQVGAALGVFLCERDAQKKVVAGSAA LTSLFGITEPAVYGVNLPRKYPFVIACISGALGATIIGYAQTKVYSFGLPSIFTFMQTIP STGIDFTVWASVIGGVIAIGCAFVGTVMLHFITAKRQPAQGAPQEKTPEVITPPEQGGIC SPMTGEIVPLIHVADTTFASGLLGKGIAILPSVGEVRSPVAGRIASLFATLHAIGIESDD GVEILIHVGIDTVKLDGKFFSAHVNVGDKVNTGDRLISFDIPAIREAGFDLTTPVLISNS DDFTDVLPHGTAQISAGEPLLSIIR >gi|296493406|gb|ADTK01000095.1| GENE 39 42900 - 44294 1253 464 aa, chain + ## HITS:1 COG:bglB KEGG:ns NR:ns ## COG: bglB COG2723 # Protein_GI_number: 16131589 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Escherichia coli K12 # 1 464 1 464 470 984 100.0 0 MKAFPETFLWGGATAANQVEGAWQEDGKGISTSDLQPHGVMGKMEPRILGKENIKDVAID FYHRYPEDIALFAEMGFTCLRISIAWARIFPQGDEVEPNEAGLAFYDRLFDEMAQAGIKP LVTLSHYEMPYGLVKNYGGWANRAVIDHFEHYARTVFTRYQHKVALWLTFNEINMSLHAP FTGVGLAEESGEAEVYQAIHHQLVASARAVKACHSLLPEAKIGNMLLGGLVYPLTCQPQD MLQAMEENRRWMFFGDVQARGQYPGYMQRFFRDHNITIEMTESDAEDLKHTVDFISFSYY MTGCVSHDESINKNAQGNILNMIPNPHLKSSEWGWQIDPVGLRVLLNTLWDRYQKPLFIV ENGLGAKDSVEADGSIQDDYRIAYLNDHLVQVNEAIADGVDIMGYTSWGPIDLVSASHSQ MSKRYGFIYVDRDDNGEGSLTRTRKKSFGWYAEVIKTRGLSLKK >gi|296493406|gb|ADTK01000095.1| GENE 40 44380 - 45996 1574 538 aa, chain + ## HITS:1 COG:yieC KEGG:ns NR:ns ## COG: yieC COG4580 # Protein_GI_number: 16131588 # Func_class: G Carbohydrate transport and metabolism # Function: Maltoporin (phage lambda and maltose receptor) # Organism: Escherichia coli K12 # 1 538 1 538 538 1014 99.0 0 MFRRNLITSAVLLMAPLAFSAQSLAESLTVEQRLELLEKALRETQSELKKYKDEEKKKYT PATVNRSVSTNDQGYAANPFPTSSAAKPDAVLVKNEEKNASETGSIYSSMTLKDFSKFVK DEIGFSYNGYYRSGWGTASHGSPKSWAIGSLGRFGNEYSGWFDLQLKQRVYNENGKRVDA VVMMDGNVGQQYSTGWFGDNAGGENFMQFSDMYVTTKGFLPFAPEADFWVGKHGAPKIEI 
QMLDWKTQRTDAAAGVGLENWKVGPGKIDIALVREDVDDYDRSLQNKQQINTNTIDLRYK DIPLWDKATLMVSGRYVTANESASEKDNQDNNGYYDWKDTWMFGTSLTQKFDKGGFNEFS FLVANNSIASNFGRYAGASPFTTFNGRYYGDHTGGTAVRLTSQGEAYIGDHFIVANAIVY SFGNDIYSYETGAHSDFESIRAVVRPAYIWDQYNQTGVELGYFTQQNKDANSNKFNESGY KTTLFHTFKVNTSMLTSRPEIRFYATYIKALENELDGFTFEDNKDDQFAVGAQAEIWW >gi|296493406|gb|ADTK01000095.1| GENE 41 46023 - 47192 1105 389 aa, chain + ## HITS:1 COG:yieL KEGG:ns NR:ns ## COG: yieL COG2382 # Protein_GI_number: 16131587 # Func_class: P Inorganic ion transport and metabolism # Function: Enterochelin esterase and related enzymes # Organism: Escherichia coli K12 # 1 389 12 400 400 762 99.0 0 MNIKIAALTLAIASGISAQWAIAADMPASPAPTIPVKQYVTQVNADNSVTFRYFAPGAKN VSVVVGVPVPDNIHPMTKDEAGVWSWRTPILKGNLYEYFFNVDGVRSIDTGTAMTKPQRQ VNSSMILVPGSYLDTRSVAHGDLIAITYHSNALQSERQMYVWTPPGYTGMGEPLPVLYFY HGFGDTGRSAIDQGRIPQIMDNLLAEGKIKPMLVVIPDTETDAKGIIPEDFVPQERRKVF YPLNAKAADRELMNDIIPLISKRFNVRKDADGRALAGLSQGGYQALVSGMNHLESFGWLA TFSGVTTTTVPDEGVAARLNDPAAINQQLRNFTVVVGDKDVVTGKDIAGLKTELEQKKIK FDYQEYPGLNHEMDVWRPAYAAFVQKLFK >gi|296493406|gb|ADTK01000095.1| GENE 42 47207 - 47929 775 240 aa, chain + ## HITS:1 COG:yieK KEGG:ns NR:ns ## COG: yieK COG0363 # Protein_GI_number: 16131586 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Escherichia coli K12 # 24 232 1 209 213 434 100.0 1e-122 MKLIITEDYQEMSRVAAHHLLGYMSKTRRVNLAITAGSTPKGMYEYLTTLVKGKPWYDNC YFYNFDEIPFRGKEGEGVTITNLRNLFFTPAGIKEENIQKLTIDNYREHDQKLAREGGLD LVVLGLGADGHFCGNLPNTTHFHEQTVEFPIQGEMVDIVAHGELGGDFSLVPDSYVTMGP KSIMAAKNLLIIVSGAGKAQALKNVLQGPVTEDVPASVLQLHPSLMVIADKAAAAELALG >gi|296493406|gb|ADTK01000095.1| GENE 43 47991 - 48578 269 195 aa, chain - ## HITS:1 COG:yieJ KEGG:ns NR:ns ## COG: yieJ COG3196 # Protein_GI_number: 16131585 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 195 1 195 195 369 98.0 1e-102 
MTQNIRPLPQFKYHPKPLETGAFEQDKTVECDCCEQQTSVYYSGPFYCVDEVEHLCPWCI ADGSAAEKFAGSFQDDASIEGVEFEYDEEDEFAGIKNTYPDEMLKELVERTPGYHGWQQE FWLAHCGDFCAFIGYVGWNDIKDRLDEFANLEEDCENFGIRNSDLAKCLQKRGDCQGYLF RCLHCGKLRLWGDFS >gi|296493406|gb|ADTK01000095.1| GENE 44 48627 - 49001 149 124 aa, chain - ## HITS:1 COG:no KEGG:EC55989_4186 NR:ns ## KEGG: EC55989_4186 # Name: yieI # Def: putative inner membrane protein # Organism: E.coli_55989 # Pathway: not_defined # 1 124 34 157 157 208 100.0 6e-53 KEPLVLLRRIQVLPLFLLLSITTGVIPALLTGVMVACLPEKIGSQKNYRCLAGGIGGVVI TEIYCAVIVHIKGMASSELYENILSGDSLVVRIIPALLAGVVMSRIITRLPGLDISCPET DSLS Prediction of potential genes in microbial genomes Time: Mon May 16 15:17:03 2011 Seq name: gi|296493405|gb|ADTK01000096.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont256.2, whole genome shotgun sequence Length of sequence - 43993 bp Number of predicted genes - 42, with homology - 42 Number of transcription units - 26, operones - 9 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 184 - 849 634 ## COG0637 Predicted phosphatase/phosphohexomutase - Prom 981 - 1040 1.8 + Prom 926 - 985 1.6 2 2 Tu 1 . + CDS 1016 - 2353 1578 ## COG2252 Permeases + Term 2374 - 2410 1.3 - Term 2360 - 2398 4.1 3 3 Op 1 3/0.556 - CDS 2407 - 2973 736 ## COG0431 Predicted flavoprotein 4 3 Op 2 4/0.222 - CDS 2995 - 3756 495 ## COG2091 Phosphopantetheinyl transferase - Prom 3838 - 3897 8.0 5 4 Op 1 9/0.000 - CDS 3901 - 4860 740 ## COG0583 Transcriptional regulator 6 4 Op 2 3/0.556 - CDS 4835 - 6010 908 ## COG0477 Permeases of the major facilitator superfamily - Prom 6037 - 6096 2.5 - Term 6101 - 6130 0.4 7 5 Op 1 3/0.556 - CDS 6143 - 7390 964 ## COG0814 Amino acid permeases - Term 7431 - 7469 3.7 8 5 Op 2 . 
- CDS 7481 - 8896 1940 ## COG3033 Tryptophanase - Prom 8931 - 8990 6.7 - Term 9378 - 9424 11.7 9 6 Op 1 10/0.000 - CDS 9434 - 10798 1602 ## COG0486 Predicted GTPase - Term 10848 - 10878 3.0 10 6 Op 2 22/0.000 - CDS 10904 - 12550 1912 ## COG0706 Preprotein translocase subunit YidC - Prom 12614 - 12673 3.9 11 6 Op 3 . - CDS 12774 - 13100 205 ## COG0594 RNase P protein component 12 6 Op 4 . - CDS 13150 - 13290 228 ## PROTEIN SUPPORTED gi|15804297|ref|NP_290336.1| 50S ribosomal protein L34 - Prom 13386 - 13445 6.3 + Prom 13258 - 13317 4.0 13 7 Tu 1 . + CDS 13403 - 13603 59 ## ECH74115_5131 hypothetical protein + Prom 13688 - 13747 2.5 14 8 Op 1 16/0.000 + CDS 13966 - 15300 1259 ## COG0593 ATPase involved in DNA replication initiation 15 8 Op 2 18/0.000 + CDS 15305 - 16405 1181 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog) 16 8 Op 3 9/0.000 + CDS 16405 - 17478 771 ## COG1195 Recombinational DNA repair ATPase (RecF pathway) 17 8 Op 4 1/0.889 + CDS 17507 - 19921 2871 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit + Term 20053 - 20095 11.9 + Prom 20055 - 20114 6.6 18 9 Tu 1 . + CDS 20161 - 20559 525 ## COG3753 Uncharacterized protein conserved in bacteria + Term 20566 - 20607 5.1 + Prom 20564 - 20623 3.8 19 10 Tu 1 . + CDS 20674 - 21486 1189 ## COG0561 Predicted hydrolases of the HAD superfamily + Term 21495 - 21536 6.4 - Term 21475 - 21531 6.3 20 11 Tu 1 . - CDS 21532 - 22188 318 ## EcE24377A_4205 hypothetical protein - Prom 22331 - 22390 4.9 + Prom 22275 - 22334 8.7 21 12 Op 1 1/0.889 + CDS 22466 - 23155 621 ## COG2186 Transcriptional regulators 22 12 Op 2 2/0.667 + CDS 23152 - 24030 563 ## COG3734 2-keto-3-deoxy-galactonokinase 23 12 Op 3 1/0.889 + CDS 24014 - 24631 530 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase 24 12 Op 4 7/0.111 + CDS 24628 - 25776 1513 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 25 13 Tu 1 . 
+ CDS 25896 - 27188 1375 ## COG0477 Permeases of the major facilitator superfamily + Term 27199 - 27245 5.6 26 14 Tu 1 . - CDS 27185 - 28249 603 ## COG0644 Dehydrogenases (flavoproteins) 27 15 Tu 1 . + CDS 28314 - 29564 893 ## ECSE_3975 hypothetical protein 28 16 Tu 1 . - CDS 29566 - 29829 288 ## COG5645 Predicted periplasmic lipoprotein - Prom 30017 - 30076 3.0 29 17 Tu 1 4/0.222 + CDS 30204 - 30617 578 ## COG0071 Molecular chaperone (small heat shock protein) 30 18 Tu 1 1/0.889 + CDS 30729 - 31157 514 ## COG0071 Molecular chaperone (small heat shock protein) + Term 31173 - 31205 5.4 + Prom 31177 - 31236 4.4 31 19 Tu 1 . + CDS 31354 - 33015 1851 ## COG2985 Predicted permease + Term 33023 - 33069 11.3 32 20 Tu 1 . - CDS 33012 - 33728 525 ## COG2188 Transcriptional regulators + Prom 33764 - 33823 3.7 33 21 Op 1 2/0.667 + CDS 34024 - 35640 1745 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 34 21 Op 2 . + CDS 35640 - 36278 548 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases + Term 36457 - 36503 10.0 35 22 Tu 1 . - CDS 36451 - 37344 614 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 37374 - 37433 2.6 + Prom 37300 - 37359 5.0 36 23 Op 1 4/0.222 + CDS 37511 - 39226 1639 ## COG4146 Predicted symporter 37 23 Op 2 . + CDS 39223 - 40716 1137 ## COG3119 Arylsulfatase A and related enzymes + Term 40723 - 40761 4.2 - Term 40711 - 40749 1.2 38 24 Tu 1 . - CDS 40763 - 41173 348 ## EC55989_4146 conserved hypothetical protein; putative inner membrane protein - Prom 41279 - 41338 7.1 + Prom 41154 - 41213 3.6 39 25 Op 1 . + CDS 41322 - 41669 465 ## COG2149 Predicted membrane protein 40 25 Op 2 . + CDS 41659 - 42021 229 ## ECO111_4500 putative inner membrane protein 41 25 Op 3 . + CDS 42018 - 42515 374 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) - Term 42461 - 42510 2.0 42 26 Tu 1 . 
- CDS 42523 - 43674 1213 ## COG0477 Permeases of the major facilitator superfamily - Prom 43729 - 43788 5.3 Predicted protein(s) >gi|296493405|gb|ADTK01000096.1| GENE 1 184 - 849 634 221 aa, chain - ## HITS:1 COG:yieH KEGG:ns NR:ns ## COG: yieH COG0637 # Protein_GI_number: 16131583 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Escherichia coli K12 # 1 221 1 221 221 466 99.0 1e-131 MSQIEAVFFDCDGTLVDSEVICSRAYVTMFQEFGITLDPEEVFKRFKGVKLYEIIDIVSL EHGVTLAKTEAEHVYRAEVARLFDSELEAIEGAGALLSAITAPMCVVSNGPNNKMQHSMG KLNMLHYFPDKLFSGYDIQRWKPDPALMFHAAKAMNVNVENCILVDDSVAGAQSGIDAGM EVFYFCADPHNKPIVHPKVTTFTHLSQLPELWKARGWDITA >gi|296493405|gb|ADTK01000096.1| GENE 2 1016 - 2353 1578 445 aa, chain + ## HITS:1 COG:STM3851 KEGG:ns NR:ns ## COG: STM3851 COG2252 # Protein_GI_number: 16767135 # Func_class: R General function prediction only # Function: Permeases # Organism: Salmonella typhimurium LT2 # 1 445 43 487 487 735 96.0 0 MSQQHTTQASGQGMLERVFKLREHGTTARTEVIAGFTTFLTMVYIVFVNPQILGVAGMDT SAVFVTTCLIAAFGSIMMGLFANLPVALAPAMGLNAFFAFVVVQAMGLPWQVGMGAIFWG AIGLLLLTIFRVRYWMIANIPVSLRVGITSGIGLFIGMMGLKNAGVIVANPETLVSIGNL TSHSVLLGILGFFIIAILASRNIHAAVLVSIVVTTLLGWMLGDVHYNGIVSAPPSVMTVV GHVDLAGSFNLGLAGVIFSFMLVNLFDSSGTLIGVTDKAGLADEKGKFPRMKQALYVDSI SSVTGSFIGTSSVTAYIESSSGVSVGGRTGLTAVVVGLLFLLVIFLSPLAGMVPGYAAAG ALIYVGVLMTSSLARVNWQDLTESVPAFITAVMMPFSFSITEGIALGFISYCVMKIGTGR LRDLSPCVIIVALMFILKIVFIDAH >gi|296493405|gb|ADTK01000096.1| GENE 3 2407 - 2973 736 188 aa, chain - ## HITS:1 COG:ECs4650 KEGG:ns NR:ns ## COG: ECs4650 COG0431 # Protein_GI_number: 15833904 # Func_class: R General function prediction only # Function: Predicted flavoprotein # Organism: Escherichia coli O157:H7 # 1 188 1 188 188 363 99.0 1e-100 MSEKLKVVTLLGSLRKGSFNGMVARTLPKIAPASMEVNALPSIADIPLYDADVQQEEGFP ATVEALAEQIRQADGVVIVTPEYNYSVPGGLKNAIDWLSRLPDQPLAGKPVLIQTSSMGV IGGARCQYHLRQILVFLDAMVMNKPEFMGGVIQNKVDPQTGEVIDQGTLDHLTGQLTAFG EFIQRVKI >gi|296493405|gb|ADTK01000096.1| GENE 4 2995 - 3756 495 
253 aa, chain - ## HITS:1 COG:yieE KEGG:ns NR:ns ## COG: yieE COG2091 # Protein_GI_number: 16131580 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantetheinyl transferase # Organism: Escherichia coli K12 # 1 253 1 253 253 515 99.0 1e-146 MERKMATHFARGILTEGHLISVRLPSQCHQEARNIPLHRQSRFLASRGLLAELMFMLYGI GELPEIVTLPKGKPVFSDKNLPSFSISYAGNMVGVALTTEGECGLDMELQRATRGFHSPH APDNHTFSSNESLWISKQNDPNEARAQLITLRRSVLKLTGDVLNDDPRDLQLLPIAGRLK CAHVNHVEALCDAEDVLVWSVAVTPTIEKLSVWELDGKHGWKSLPDIHSRANNPTSRMMR FAQLSTVKAFSPN >gi|296493405|gb|ADTK01000096.1| GENE 5 3901 - 4860 740 319 aa, chain - ## HITS:1 COG:yidZ KEGG:ns NR:ns ## COG: yidZ COG0583 # Protein_GI_number: 16131579 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 319 1 319 319 583 99.0 1e-166 MKKSITTLDLNLLLCLQLLMQERSVTKAAKRMNVTPSAVSKSLAKLRAWFDDPLFVNSPL GLSPTPLMVSMEQNLAEWMQMSNLLLDKPHHQTPRGLKFELAAESPLMMIMLNALSKQIY QRYPQATIKLRNWDYDSLDAITRGEVDIGFSGRESHPRSRELLSSLPLAIDYEVLFSDVP CVWLRKDHPALHETWSLDTFLRYPHISICWEQSDTWALDNVLQELGRERTIAMSLPEFEQ SLFMAAQPDNLLLATAPRYCQYYNQLHQLPLVALPLPFDESQQKKLEVPFTLLWHKRNSH NPKIVWLRETIKNLYASMA >gi|296493405|gb|ADTK01000096.1| GENE 6 4835 - 6010 908 391 aa, chain - ## HITS:1 COG:yidY KEGG:ns NR:ns ## COG: yidY COG0477 # Protein_GI_number: 16131578 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 391 1 391 391 661 100.0 0 MSRFLICSFALVLLYPAGIDMYLVGLPRIAADLNASEAQLHIAFSVYLAGMAAAMLFAGK VADRSGRKPVAIPGAALFIIASVFCSLAETSTLFLAGRFLQGLGAGCCYVVAFAILRDTL DDRRRAKVLSLLNGITCIIPVLAPVLGHLIMLKFPWQSLFWAMAMMGIAVLMLSLFILKE TRPAAPAASDKPRENSESLLNRFFLSRVVITTLSVSVILTFVNTSPVLLMEIMGFERGEY ATIMALTAGVSMTVSFSTPFALGIFKPRTLMITSQVLFLAAGITLAVSPSHAVSLFGITL ICAGFSVGFGVAMSQALGPFSLRAGVASSTLGIAQVCGSSLWIWLAAVVGIGAWNMLIGI LIACSIVSLLLIMFVAPGRPVAAHEEIHHHA 
>gi|296493405|gb|ADTK01000096.1| GENE 7 6143 - 7390 964 415 aa, chain - ## HITS:1 COG:tnaB KEGG:ns NR:ns ## COG: tnaB COG0814 # Protein_GI_number: 16131577 # Func_class: E Amino acid transport and metabolism # Function: Amino acid permeases # Organism: Escherichia coli K12 # 1 415 1 415 415 731 99.0 0 MTDQAEKKHSAFWGVMVIAGTVIGGGMFALPVDLAGAWFFWGAFILIIAWFSMLHSGLLL LEANLNYPVGSSFNTITKDLIGNTWNIISGITVAFVLYILTYAYISANGAIISETISMNL GYHANPRIVGICTAIFVASVLWISSLAASRITSLFLGLKIISFVIVFGSFFFQVDYPILR DATSTTAGTSYFPYIFMALPVCLASFGFHGNIPSLIICYGKRKDKLIKSVVFGSLLALVI YLFWLYCTMGNIPRESFKAIISSGGNVDSLVKSFLGTKQHGIIEFCLLVFSNLAVASSFF GVTLGLFDYLADLFKIDNSHGGRFKTVLLTFLPPALLYLIFPNGFIYGIGGAGLCATIWA VIIPAVLAIKARKKFPNQMFTVWGGNLIPAIVILFGITVILCWFGNVFNVLPKFG >gi|296493405|gb|ADTK01000096.1| GENE 8 7481 - 8896 1940 471 aa, chain - ## HITS:1 COG:tnaA KEGG:ns NR:ns ## COG: tnaA COG3033 # Protein_GI_number: 16131576 # Func_class: E Amino acid transport and metabolism # Function: Tryptophanase # Organism: Escherichia coli K12 # 1 471 6 476 476 971 99.0 0 MENFKHLPEPFRIRVIEPVKRTTRAYREEAIIKSGMNPFLLDSEDVFIDLLTDSGTGAVT QSMQAAMMRGDEAYSGSRSYYALAESVKNIFGYQYTIPTHQGRGAEQIYIPVLIKKREQE KGLDRSKMVAFSNYFFDTTQGHSQINGCTVRNVYIKEAFDTDVRYDFKGNFDLEGLERGI EEVGPNNVPYIVATITSNSAGGQPVSLANLKAMYSIAKKYDIPVVMDSARFAENAYFIKQ REAEYKDWTIEQITRETYKYADMLAMSAKKDAMVPMGGLLCMKDDSFFDVYTECRTLCVV QEGFPTYGGLEGGAMERLAVGLYDGMNLDWLAYRIAQVQYLVDGLEEIGVVCQQAGGHAA FVDAGKLLPHIPADQFPAQALACELYKVAGIRAVEIGSFLLGRDPKTGKQLPCPAELLRL TIPRATYTQTHMDFIIEAFKHVKENAANIKGLTFTYEPKVLRHFTAKLKEV >gi|296493405|gb|ADTK01000096.1| GENE 9 9434 - 10798 1602 454 aa, chain - ## HITS:1 COG:thdF KEGG:ns NR:ns ## COG: thdF COG0486 # Protein_GI_number: 16131574 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Escherichia coli K12 # 1 454 1 454 454 859 99.0 0 MSDNDTIVAQATPPGRGGVGILRISGLKAREVAETVLGKLPKPRYADYLPFKDADGSVLD QGIALWFPGPNSFTGEDVLELQGHGGPVILDLLLKRILTIPGLRIARPGEFSERAFLNDK LDLAQAEAIADLIDASSEQAARSALNSLQGAFSARVNHLVEALTHLRIYVEAAIDFPDEE 
IDFLSDGKIEAQLNDVIADLDAVRAEARQGSLLREGMKVVIAGRPNAGKSSLLNALAGRE AAIVTDIAGTTRDVLREHIHIDGMPLHIIDTAGLREASDEVERIGIERAWQEIEQADRVL FMVDGTTTDAVDPAEIWPEFIARLPAKLPITVVRNKADITGETLGMSEVNGHALIRLSAR TGEGVDVLRNHLKQSMGFDTNMEGGFLARRRHLQALEQAAEHLQQGKAQLLGAWAGELLA EELRLAQQNLSEITGEFTSDDLLGRIFSSFCIGK >gi|296493405|gb|ADTK01000096.1| GENE 10 10904 - 12550 1912 548 aa, chain - ## HITS:1 COG:ECs4640 KEGG:ns NR:ns ## COG: ECs4640 COG0706 # Protein_GI_number: 15833894 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YidC # Organism: Escherichia coli O157:H7 # 1 548 1 548 548 1064 100.0 0 MDSQRNLLVIALLFVSFMIWQAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASGQGKL ISVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQAQSGLTGRDGP DNPANGPRPLYNVEKDAYVLAEGQNELQVPMTYTDAAGNTFTKTFVLKRGDYAVNVNYNV QNAGEKPLEISTFGQLKQSITLPPHLDTGSSNFALHTFRGAAYSTPDEKYEKYKFDTIAD NENLNISSKGGWVAMLQQYFATAWIPHNDGTNNFYTANLGNGIAAIGYKSQPVLVQPGQT GAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGWLWFISQPLFKLLKWIHSFVGNWGFSIII ITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMRERLGDDKQRISQEMMALYKAEKVNPL GGCFPLLIQMPIFLALYYMLMGSVELRQAPFALWIHDLSAQDPYYILPILMGVTMFFIQK MSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSGLVLYYIVSNLVTIIQQQLIYRGLEKRGL HSREKKKS >gi|296493405|gb|ADTK01000096.1| GENE 11 12774 - 13100 205 108 aa, chain - ## HITS:1 COG:rnpA KEGG:ns NR:ns ## COG: rnpA COG0594 # Protein_GI_number: 16131572 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNase P protein component # Organism: Escherichia coli K12 # 1 108 12 119 119 190 99.0 5e-49 MLTPSQFTFVFQQPQRAGTPQITILGRLNSLGHPRIGLTVAKKNVRRAHERNRIKRLTRE SFRLRQHELPAMDFVVVAKKGVADLDNRALSEALEKLWRRHCRLARGS >gi|296493405|gb|ADTK01000096.1| GENE 12 13150 - 13290 228 46 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15804297|ref|NP_290336.1| 50S ribosomal protein L34 [Escherichia coli O157:H7 EDL933] # 1 46 1 46 46 92 100 4e-18 MKRTFQPSVLKRNRSHGFRARMATKNGRQVLARRRAKGRARLTVSK >gi|296493405|gb|ADTK01000096.1| GENE 13 13403 - 13603 59 66 aa, chain + ## HITS:1 COG:no 
KEGG:ECH74115_5131 NR:ns ## KEGG: ECH74115_5131 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 66 1 66 66 122 98.0 5e-27 MYTPESILFPYLPRFSAPFRRENVRPHTSENQHGAPGGGLYGLMGKAQGSSWILIRSIKP IFVYGH >gi|296493405|gb|ADTK01000096.1| GENE 14 13966 - 15300 1259 444 aa, chain + ## HITS:1 COG:dnaA KEGG:ns NR:ns ## COG: dnaA COG0593 # Protein_GI_number: 16131570 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # Organism: Escherichia coli K12 # 1 444 24 467 467 884 100.0 0 MWIRPLQAELSDNTLALYAPNRFVLDWVRDKYLNNINGLLTSFCGADAPQLRFEVGTKPV TQTPQAAVTSNVAAPAQVAQTQPQRAAPSTRSGWDNVPAPAEPTYRSNVNVKHTFDNFVE GKSNQLARAAARQVADNPGGAYNPLFLYGGTGLGKTHLLHAVGNGIMARKPNAKVVYMHS ERFVQDMVKALQNNAIEEFKRYYRSVDALLIDDIQFFANKERSQEEFFHTFNALLEGNQQ IILTSDRYPKEINGVEDRLKSRFGWGLTVAIEPPELETRVAILMKKADENDIRLPGEVAF FIAKRLRSNVRELEGALNRVIANANFTGRAITIDFVREALRDLLALQEKLVTIDNIQKTV AEYYKIKVADLLSKRRSRSVARPRQMAMALAKELTNHSLPEIGDAFGGRDHTTVLHACRK IEQLREESHDIKEDFSNLIRTLSS >gi|296493405|gb|ADTK01000096.1| GENE 15 15305 - 16405 1181 366 aa, chain + ## HITS:1 COG:ECs4636 KEGG:ns NR:ns ## COG: ECs4636 COG0592 # Protein_GI_number: 15833890 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Escherichia coli O157:H7 # 1 366 1 366 366 729 100.0 0 MKFTVEREHLLKPLQQVSGPLGGRPTLPILGNLLLQVADGTLSLTGTDLEMEMVARVALV QPHEPGATTVPARKFFDICRGLPEGAEIAVQLEGERMLVRSGRSRFSLSTLPAADFPNLD DWQSEVEFTLPQATMKRLIEATQFSMAHQDVRYYLNGMLFETEGEELRTVATDGHRLAVC SMPIGQSLPSHSVIVPRKGVIELMRMLDGGDNPLRVQIGSNNIRAHVGDFIFTSKLVDGR FPDYRRVLPKNPDKHLEAGCDLLKQAFARAAILSNEKFRGVRLYVSENQLKITANNPEQE EAEEILDVTYSGAEMEIGFNVSYVLDVLNALKCENVRMMLTDSVSSVQIEDAASQSAAYV VMPMRL >gi|296493405|gb|ADTK01000096.1| GENE 16 16405 - 17478 771 357 aa, chain + ## HITS:1 COG:ECs4635 KEGG:ns NR:ns ## COG: ECs4635 COG1195 # Protein_GI_number: 15833889 # Func_class: L Replication, recombination and repair # Function: 
Recombinational DNA repair ATPase (RecF pathway) # Organism: Escherichia coli O157:H7 # 1 357 1 357 357 712 100.0 0 MSLTRLLIRDFRNIETADLALSPGFNFLVGANGSGKTSVLEAIYTLGHGRAFRSLQIGRV IRHEQEAFVLHGRLQGEERETAIGLTKDKQGDSKVRIDGTDGHKVAELAHLMPMQLITPE GFTLLNGGPKYRRAFLDWGCFHNEPGFFTAWSNLKRLLKQRNAALRQVTRYEQLRPWDKE LIPLAEQISTWRAEYSAGIAADMADTCKQFLPEFSLTFSFQRGWEKETEYAEVLERNFER DRQLTYTAHGPHKADLRIRADGAPVEDTLSRGQLKLLMCALRLAQGEFLTRESGRRCLYL IDDFASELDDERRGLLASRLKATQSQVFVSAISAEHVIDMSDENSKMFTVEKGKITD >gi|296493405|gb|ADTK01000096.1| GENE 17 17507 - 19921 2871 804 aa, chain + ## HITS:1 COG:ECs4634 KEGG:ns NR:ns ## COG: ECs4634 COG0187 # Protein_GI_number: 15833888 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Escherichia coli O157:H7 # 1 804 1 804 804 1588 100.0 0 MSNSYDSSSIKVLKGLDAVRKRPGMYIGDTDDGTGLHHMVFEVVDNAIDEALAGHCKEII VTIHADNSVSVQDDGRGIPTGIHPEEGVSAAEVIMTVLHAGGKFDDNSYKVSGGLHGVGV SVVNALSQKLELVIQREGKIHRQIYEHGVPQAPLAVTGETEKTGTMVRFWPSLETFTNVT EFEYEILAKRLRELSFLNSGVSIRLRDKRDGKEDHFHYEGGIKAFVEYLNKNKTPIHPNI FYFSTEKDGIGVEVALQWNDGFQENIYCFTNNIPQRDGGTHLAGFRAAMTRTLNAYMDKE GYSKKAKVSATGDDAREGLIAVVSVKVPDPKFSSQTKDKLVSSEVKSAVEQQMNELLAEY LLENPTDAKIVVGKIIDAARAREAARRAREMTRRKGALDLAGLPGKLADCQERDPALSEL YLVEGDSAGGSAKQGRNRKNQAILPLKGKILNVEKARFDKMLSSQEVATLITALGCGIGR DEYNPDKLRYHSIIIMTDADVDGSHIRTLLLTFFYRQMPEIVERGHVYIAQPPLYKVKKG KQEQYIKDDEAMDQYQISIALDGATLHTNASAPALAGEALEKLVSEYNATQKMINRMERR YPKAMLKELIYQPTLTEADLSDEQTVTRWVNALVSELNDKEQHGSQWKFDVHTNAEQNLF EPIVRVRTHGVDTDYPLDHEFITGGEYRRICTLGEKLRGLLEEDAFIERGERRQPVASFE QALDWLVKESRRGLSIQRYKGLGEMNPEQLWETTMDPESRRMLRVTVKDAIAADQLFTTL MGDAVEPRRAFIEENALKAANIDI >gi|296493405|gb|ADTK01000096.1| GENE 18 20161 - 20559 525 132 aa, chain + ## HITS:1 COG:ECs4633 KEGG:ns NR:ns ## COG: ECs4633 COG3753 # Protein_GI_number: 15833887 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 132 4 135 135 214 98.0 4e-56 
MGLFDEVVGAFLKGDAGKYQAILSWVEEQGGIQVLLEKLQSGGLGAILSTWLSNQQGNQS VSGEQLESALGTNAVSDLGQKLGVDTSTASSLLAEQLPKIIDALSPQGEVSPQANNDLLS AGMELLKGKLFR >gi|296493405|gb|ADTK01000096.1| GENE 19 20674 - 21486 1189 270 aa, chain + ## HITS:1 COG:yidA KEGG:ns NR:ns ## COG: yidA COG0561 # Protein_GI_number: 16131565 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Escherichia coli K12 # 1 270 1 270 270 525 100.0 1e-149 MAIKLIAIDMDGTLLLPDHTISPAVKNAIAAARARGVNVVLTTGRPYAGVHNYLKELHME QPGDYCITYNGALVQKAADGSTVAQTALSYDDYRFLEKLSREVGSHFHALDRTTLYTANR DISYYTVHESFVATIPLVFCEAEKMDPNTQFLKVMMIDEPAILDQAIARIPQEVKEKYTV LKSAPYFLEILDKRVNKGTGVKSLADVLGIKPEEIMAIGDQENDIAMIEYAGVGVAMDNA IPSVKEVANFVTKSNLEDGVAFAIEKYVLN >gi|296493405|gb|ADTK01000096.1| GENE 20 21532 - 22188 318 218 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_4205 NR:ns ## KEGG: EcE24377A_4205 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 218 36 253 253 442 99.0 1e-123 MKLNFKGFFKAAGLFPLALMLSGCISYALVSHTAKGSSGKYQSQSDTITGLSQAKDSNGT KGYVFVGESLDYLITDGADDIVKMLNDPALNRHNIQVADDARFVLNAGKKKFTGTISLYY YWNNEEEKALATHYGFACGVQHCTRSLENLKGTIHEKNKNMDYSKVMAFYHPFKVRFYEY YSPRGIPDGVSAALLPVTVTLDIITAPLQFLVVYAVNQ >gi|296493405|gb|ADTK01000096.1| GENE 21 22466 - 23155 621 229 aa, chain + ## HITS:1 COG:STM3830 KEGG:ns NR:ns ## COG: STM3830 COG2186 # Protein_GI_number: 16767115 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Salmonella typhimurium LT2 # 1 229 1 229 229 409 94.0 1e-114 MTLNKTDRIVITLGKQIVHGKYVPGSPLPAEAELCEEFATSRNIIREVFRSLMAKRLIEM KRYRGAFVAPRNQWNYLDTDVLQWVLENDYDPRLISAMSEVRNLVEPAIARWAAERATSS DLAQIESALNEMIANNQDREAFNEADIRYHEAVLQSVHNPVLQQLSIAISSLQRAVFERT WMGDEANMPQTLQEHKALFDAIRHQDGDAAEQAALTMIASSTRRLKEIT >gi|296493405|gb|ADTK01000096.1| GENE 22 23152 - 24030 563 292 aa, chain + ## HITS:1 COG:dgoK KEGG:ns NR:ns ## COG: dgoK COG3734 # Protein_GI_number: 16131561 # Func_class: G Carbohydrate transport and metabolism # 
Function: 2-keto-3-deoxy-galactonokinase # Organism: Escherichia coli K12 # 1 292 1 292 292 545 98.0 1e-155 MTARYIAIDWGSTNLRAWLYQGDHCLESRQSEAGVTRLNGKSPAAVLAEVTTDWREENTP VVMAGMVGSNVGWKVAPYLSVPARFSSIGEQLTSVGDNIWIIPGLCVSHDDNHNVMRGEE TQLIGARALAPSSLYVMPGTHCKWVQADSQQINDFRTVMTGELHHLLLNHSLIGAGLPPQ ENSADAFAAGLERGLNAPAILPQLFEVRASHVLGTLPREQVSEFLSGLLIGAEVASMRDY VTHQHAITLVAGTSLTARYQQAFQAMGCDVTAVAGDTAFQAGIRSIAHAVAN >gi|296493405|gb|ADTK01000096.1| GENE 23 24014 - 24631 530 205 aa, chain + ## HITS:1 COG:RSc2752 KEGG:ns NR:ns ## COG: RSc2752 COG0800 # Protein_GI_number: 17547471 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Ralstonia solanacearum # 1 201 3 203 213 239 62.0 3e-63 MQWQTKLPLIAILRGIKPDEALAHVGAVIDAGFDAVEIPLNSPQWEQSIPAIVDAYGDKA LIGAGTVLKPEQVDALARMGCQLIVTPNIHSEVIRRAVGYGMTVCPGCATATEAFTALEA GAQALKIFPSSAFGPQYIKALKAVLPSDIAVFAVGGVTPENLAQWIDAGCAGAGLGSDLY RAGQSVERTAQQAAAFVKAYREAVQ >gi|296493405|gb|ADTK01000096.1| GENE 24 24628 - 25776 1513 382 aa, chain + ## HITS:1 COG:dgoA_2 KEGG:ns NR:ns ## COG: dgoA_2 COG4948 # Protein_GI_number: 16131560 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Escherichia coli K12 # 1 382 96 477 477 799 100.0 0 MKITKITTYRLPPRWMFLKIETDEGVVGWGEPVIEGRARTVEAAVHELGDYLIGQDPSRI NDLWQVMYRAGFYRGGPILMSAIAGIDQALWDIKGKVLNAPVWQLMGGLVRDKIKAYSWV GGDRPADVIDGIKTLREIGFDTFKLNGCEELGLIDNSRAVDAAVNTVAQIREAFGNQIEF GLDFHGRVSAPMAKVLIKELEPYRPLFIEEPVLAEQAEYYPKLAAQTHIPLAAGERMFSR FDFKRVLEAGGISILQPDLSHAGGITECYKIAGMAEAYDVTLAPHCPLGPIALAACLHID FVSYNAVLQEQSMGIHYNKGAELLDFVKNKEDFSMVGGFFKPLTKPGLGVEIDEAKVIEF SKNAPDWRNPLWRHEDNSVAEW >gi|296493405|gb|ADTK01000096.1| GENE 25 25896 - 27188 1375 430 aa, chain + ## HITS:1 COG:dgoT KEGG:ns NR:ns ## COG: dgoT COG0477 # Protein_GI_number: 16131559 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and 
metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 430 16 445 445 801 99.0 0 MDIPVNAAKPGRRRYLTLVMIFITVVICYVDRANLAVASAHIQEEFGITKAEMGYVFSAF AWLYTLCQIPGGWFLDRVGSRVTYFIAIFGWSVATLFQGFATGLMSLIGLRAITGIFEAP AFPTNNRMVTSWFPEHERASAVGFYTSGQFVGLAFLTPLLIWIQEMLSWHWVFIVTGGIG IIWSLIWFKVYQPPRLTKGISKAELDYIRDGGGLVDGDAPVKKEARQPLTAKDWKLVFHR KLIGVYLGQFAVASTLWFFLTWFPNYLTQEKGITALKAGFMTTVPFLAAFVGVLLSGWVA DLLVRKGFSQGFARKTPIICGLLISTCIMGANYTNDPMMIMCLMALAFFGNGFASITWSL VSSLAPMRLIGLTGGVFNFAGGLGGITVPLVVGYLAQGYGFAPALVYISAVALIGALSYI LLVGDVKRVG >gi|296493405|gb|ADTK01000096.1| GENE 26 27185 - 28249 603 354 aa, chain - ## HITS:1 COG:yidS KEGG:ns NR:ns ## COG: yidS COG0644 # Protein_GI_number: 16131558 # Func_class: C Energy production and conversion # Function: Dehydrogenases (flavoproteins) # Organism: Escherichia coli K12 # 1 354 8 361 361 744 99.0 0 MEHFDVAIIGLGPAGSALARKLAGKMQVIALDKKHQCGTEGFSKPCGGLLAPDAQRSFIR DGLTLPVDVIANPQIFSVKTVDVAASLTRNYQRSYININRHAFDLWMKSLIPASVEVYHD SLCRKIWREDDKWHVIFRADGWEQHITARYLVGADGANSMVRRHLYPDHQIRKYVAIQQW FAEKHPVPFYSCIFDNSITNCYSWSISKDGYFIFGGAYPMKDGQTRFTTLKEKMSAFQFQ FGKAVKSEKCTVLFPSRWQDFVCGKDNAFLIGEAAGFISASSLEGISYALDSADILRSVL LKQPEKLNTAYWRATRKLRLKLFGKIVKSRCLTAPALRKWIMRSGMAHIPQLKD >gi|296493405|gb|ADTK01000096.1| GENE 27 28314 - 29564 893 416 aa, chain + ## HITS:1 COG:no KEGG:ECSE_3975 NR:ns ## KEGG: ECSE_3975 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 416 1 416 416 842 100.0 0 MMAGTVLYQDRAMKQITFAPRNHLLTNTNTWTPDSQWLVFDVRPSGASFTGETIERVNIH TGEVEVIYRASQGAHVGVVTVHPKSEKYVFIHGPENPDETWHYDFHHRRGVIAEGGKVSN LDAMDITAPYTPGALRGGSHVHVFSPNGERVSFTYNDHVMHELDPALDLRNVGVAAPFGP VNVQKQHPREYSGSHWCVLVSKTTPTPQPGSDEINRAYEEGWVGNHALAFIGDTLSPKGE KVPELFIVELPQDEAGWKAAGDAPLSGTETTLPAPPRGVVQRRLTFTHHRAYPGLVNVPR HWVRCNPQGTQIAFLMRDDNGIVQLWLISPQGGEPRQLTHNKTDIQSAFNWHPSGEWLGF 
VLDNRIACAHAQSGEVEYLTENHANPPSADAVVFSPDGQWLAWMEGGQLWITETDR >gi|296493405|gb|ADTK01000096.1| GENE 28 29566 - 29829 288 87 aa, chain - ## HITS:1 COG:ECs4628 KEGG:ns NR:ns ## COG: ECs4628 COG5645 # Protein_GI_number: 15833882 # Func_class: R General function prediction only # Function: Predicted periplasmic lipoprotein # Organism: Escherichia coli O157:H7 # 1 87 49 135 135 167 98.0 5e-42 MMSHTGGKEGTYPGTRASATMIGDDETNWGTKSLAILDMPFTAVMDTLLLPWDVFRKDSS VRSRVEKSEANAQATNAVIPPARMPDN >gi|296493405|gb|ADTK01000096.1| GENE 29 30204 - 30617 578 137 aa, chain + ## HITS:1 COG:ECs4627 KEGG:ns NR:ns ## COG: ECs4627 COG0071 # Protein_GI_number: 15833881 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Escherichia coli O157:H7 # 1 137 1 137 137 256 100.0 7e-69 MRNFDLSPLYRSAIGFDRLFNHLENNQSQSNGGYPPYNVELVDENHYRIAIAVAGFAESE LEITAQDNLLVVKGAHADEQKERTYLYQGIAERNFERKFQLAENIHVRGANLVNGLLYID LERVIPEAKKPRRIEIN >gi|296493405|gb|ADTK01000096.1| GENE 30 30729 - 31157 514 142 aa, chain + ## HITS:1 COG:ECs4626 KEGG:ns NR:ns ## COG: ECs4626 COG0071 # Protein_GI_number: 15833880 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Escherichia coli O157:H7 # 1 142 3 144 144 275 100.0 2e-74 MRNFDLSPLMRQWIGFDKLANALQNAGESQSFPPYNIEKSDDNHYRITLALAGFRQEDLE IQLEGTRLSVKGTPEQPKEEKKWLHQGLMNQPFSLSFTLAENMEVSGATFVNGLLHIDLI RNEPEPIAAQRIAISERPALNS >gi|296493405|gb|ADTK01000096.1| GENE 31 31354 - 33015 1851 553 aa, chain + ## HITS:1 COG:ECs4625 KEGG:ns NR:ns ## COG: ECs4625 COG2985 # Protein_GI_number: 15833879 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Escherichia coli O157:H7 # 1 553 9 561 561 985 100.0 0 MSDIALTVSILALVAVVGLFIGNVKFRGIGLGIGGVLFGGIIVGHFVSQAGMTLSSDMLH VIQEFGLILFVYTIGIQVGPGFFASLRVSGLRLNLFAVLIVIIGGLVTAILHKLFDIPLP VVLGIFSGAVTNTPALGAGQQILRDLGTPMEMVDQMGMSYAMAYPFGICGILFTMWMLRV 
IFRVNVETEAQQHESSRTNGGALIKTINIRVENPNLHDLAIKDVPILNGDKIICSRLKRE ETLKVPSPDTIIQLGDLLHLVGQPADLHNAQLVIGQEVDTSLSTKGTDLRVERVVVTNEN VLGKRIRDLHFKERYDVVISRLNRAGVELVASGDISLQFGDILNLVGRPSAIDAVANVLG NAQQKLQQVQMLPVFIGIGLGVLLGSIPVFVPGFPAALKLGLAGGPLIMALILGRIGSIG KLYWFMPPSANLALRELGIVLFLSVVGLKSGGDFVNTLVNGEGLSWIGYGALITAVPLIT VGILARMLAKMNYLTMCGMLAGSMTDPPALAFANNLHPTSGAAALSYATVYPLVMFLRII TPQLLAVLFWSIG >gi|296493405|gb|ADTK01000096.1| GENE 32 33012 - 33728 525 238 aa, chain - ## HITS:1 COG:ECs4624 KEGG:ns NR:ns ## COG: ECs4624 COG2188 # Protein_GI_number: 15833878 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 238 1 238 238 446 99.0 1e-125 MIYKSIAERLRIRPNSADFTLNSLLPGEKKLAEEFAVSRMTIRKAIDLLVAWGLVVRRHG SGTYLVRKDVLHQTASLTGLVEVLKRQGKTVTSQVLIFEIMPAPPAIASQLRIQINEQIY FSRRVRFVEGKPLMLEDSYMPVKLFRNLSLQHLEGSKFEYIEQECGILIGGNYESLTPVL ADRLLARQMKVAEHTPLLRITSLSYSESGEFLNYSVMFRNASEYQVEYHLRRLHPEKS >gi|296493405|gb|ADTK01000096.1| GENE 33 34024 - 35640 1745 538 aa, chain + ## HITS:1 COG:glvCm_1 KEGG:ns NR:ns ## COG: glvCm_1 COG1263 # Protein_GI_number: 16132269 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Escherichia coli K12 # 1 447 88 534 534 846 99.0 0 MLSQIQRFGGAMFTPVLLFPFAGIVVGLAILLQNPMFVGESLTDPNSLFAQIVHIIEEGG WTVFRNMPLIFAVGLPIGLAKQAQGRACLAVMVSFLTWNYFINAMGMTWGSYFGVDFTQD AVAGSGLTMMAGIKTLDTSIIGAIIISGIVTALHNRLFDKKLPVFLGIFQGTSYVVIIAF LVMIPCAWLTLLGWPKVQMGIESLQAFLRSAGALGVWVYTFLERILIPTGLHHFIYGPFI FGPAAVEGGIQMYWAQHLQEFSLSAEPLKSLFPEGGFALHGNSKIFGAVGISLAMYFTAA PENRVKVAGLLIPATLTAMLVGITEPLEFTFLFISPLLFAVHAVLAALMSTVMYLFGVVG NMGGGLIDQVLPQNWIPMFSNHADMMLTQIAIGLCFTLLYFVVFRTLILQFNMCTPGRED AEVKLYSKAEYKASRGQTTAAEPKKELDQAAGILQALGGVGNISSINNCATRLRIALHDM SQTLDDEVFKKLGAHGVFRSGDAIQVIIGLHVSQLREQLDSLINSHQSAENVAITEAV >gi|296493405|gb|ADTK01000096.1| GENE 34 35640 - 36278 548 212 aa, chain + ## HITS:1 COG:glvG KEGG:ns NR:ns ## COG: glvG COG1486 # 
Protein_GI_number: 16131551 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Escherichia coli K12 # 1 212 1 212 212 443 99.0 1e-124 MTKFSVVVAGGGSTFTPGIVLMLLANQERFPLRALKFYDNDGARQEVIAEACKVILKEKA PDIAFSYTTDPEVAFSDVDFVMAHIRVGKYPMRELDEKIPLRHGVVGQETCGPGGIAYGM RSIGGVLELVDYMEKYSPNAWMLNYSNPAAIVAEATRRLRPNAKILNICDMPIGIESRMA QIVGLQDRKQMRVRYYGLNHWWSAISRSFRKG >gi|296493405|gb|ADTK01000096.1| GENE 35 36451 - 37344 614 297 aa, chain - ## HITS:1 COG:yidL KEGG:ns NR:ns ## COG: yidL COG2207 # Protein_GI_number: 16131550 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 297 11 307 307 624 100.0 1e-179 MNGKLQSSDVKNETPYNIPLLINENVISSGISLISLWHTYADEHYRVIWPRDKKKPLIAN SWVAVYTVQGCGKILLKNGEQITLHGNCIIFLKPMDIHSYHCEGLVWEQYWMEFTPTSMM DIPVGQQSVIYNGEIYNQELTEVAELITSPEAIKNNLAVAFLTKIIYQWICLMYADGKKD PQRRQIEKLIATLHASLQQRWSVADMAATIPCSEAWLRRLFLRYTGKTPKEYYLDARLDL ALSLLKQQGNSVGEVADTLNFFDSFHFSKAFKHKFGYAPSAVLKNTDQHPTDASPHN >gi|296493405|gb|ADTK01000096.1| GENE 36 37511 - 39226 1639 571 aa, chain + ## HITS:1 COG:yidK KEGG:ns NR:ns ## COG: yidK COG4146 # Protein_GI_number: 16131549 # Func_class: R General function prediction only # Function: Predicted symporter # Organism: Escherichia coli K12 # 1 571 1 571 571 1050 99.0 0 MNSLQILSFVGFTLLVAVITWWKVRKTDTGSQQGYFLAGRSLKAPVIAASLMLTNLSTEQ LVGLSGQAYKSGMSVMGWEVTSAVTLIFLALIFLPRYLKRGIATIPDFLEERYDKTTRII IDFCFLIATGVCFLPIVLYSGALALNSLFHVGESLQISHGAAIWLLVILLGLAGILYAVI GGLRAMAVADSINGIGLVIGGLMVPVFGLIAMGKGSFMQGIEQLTTVHAEKLNSIGGPTD PLPIGAAFTGLILVNTFYWCTNQGIVQRTLASKSLAEGQKGALLTAVLKMLDPLVLVLPG LIAFHLYQDLPKADMAYPTLVNNVLPVPMVGFFGAVLFGAVISTFNGFLNSASTLFSMGI YRRIINQNAEPQQLVTVGRKFGFFIAIVSVLVAPWIANAPQGLYSWMKQLNGIYNVPLVT IIIMGFFFPRIPALAAKVAMGIGIISYITINYLVKFDFHFLYVLACTFCINVVVMLVIGF IKPRATPFTFKDAFAVDMKPWRNVKIASIGILFAMIGVYAGLAEFGGYGTRWLAMISYFI AAVVIVYLIFDSWRHRHDPAVTFTPDAKDSL >gi|296493405|gb|ADTK01000096.1| 
GENE 37 39223 - 40716 1137 497 aa, chain + ## HITS:1 COG:yidJ KEGG:ns NR:ns ## COG: yidJ COG3119 # Protein_GI_number: 16131548 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Escherichia coli K12 # 1 497 1 497 497 1044 99.0 0 MKRPNFLFIMTDTQATNMVGCYSGKPLNTQNIDSLAAEGIRFNSAYTCSPVCTPARAGLF TGIYANQSGPWTNNVAPGKNISTMGRYFKDAGYHTCYIGKWHLDGHDYFGTGECPPEWDA DYWFDGANYLSELTEKEISLWRNGLNSVEDLQANHIDETFTWAHRISNRAVDFLQQPARA EEPFLMVVSYDEPHHPFTCPVEYLEKYADFYYDLGEKAQDDLANKPEHHRLWAQAMPSPV GDDGLYHHPLYFACNDFVDDQIGRVINALTPEQRENTWVIYTSDHGEMMGAHKLISKGAA MYDDITRIPLIIRSPQGERRQVDTPVSHIDLLPTMMALADIEKPEILPGENILAVKEPRG VMVEFNRYEIEHDSFGGFIPVRCWVTDDFKLVLNLFTSDELYDRRNDPNEMHNLIDDIRF ADVRSKMHDALLDYMDKIRDPFRSYQWSLRPWRKDALPRWMGAFRPRPQDGYSPVVRDYD TGLPTQGVKVEEKKQKF >gi|296493405|gb|ADTK01000096.1| GENE 38 40763 - 41173 348 136 aa, chain - ## HITS:1 COG:no KEGG:EC55989_4146 NR:ns ## KEGG: EC55989_4146 # Name: yidI # Def: conserved hypothetical protein; putative inner membrane protein # Organism: E.coli_55989 # Pathway: not_defined # 1 136 14 149 149 217 100.0 1e-55 MLFGAIALMMGIIHFSFGPFSAPPPTFESIVADKTAEIKRGLLAGIKGEKITTVEKKEDV DVDKILDQSGIALAIAALLCAFIGGMRKENRWGIRGALVFGGGTLAFHTLLFGIGIVCSI LLIFLIFSFLTGGSLV >gi|296493405|gb|ADTK01000096.1| GENE 39 41322 - 41669 465 115 aa, chain + ## HITS:1 COG:ECs4617 KEGG:ns NR:ns ## COG: ECs4617 COG2149 # Protein_GI_number: 15833871 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 115 1 115 115 190 100.0 7e-49 MKISRLGEAPDYRFSLANERTFLAWIRTALGFLAAGVGLDQLAPDFATPVIRELLALLLC LFSGGLAMYGYLRWLRNEKAMRLKEDLPYTNSLLIISLILMVVAVIVMGLVLYAG >gi|296493405|gb|ADTK01000096.1| GENE 40 41659 - 42021 229 120 aa, chain + ## HITS:1 COG:no KEGG:ECO111_4500 NR:ns ## KEGG: ECO111_4500 # Name: yidG # Def: putative inner membrane protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 120 1 120 120 221 100.0 8e-57 
MPDSRKARRIADPGLQPERTSLAWFRTMLGYGALMALAIKHNWHQAGMLFWISIGILAIV ALILWHYTRNRNLMDVTNSDFSQFHVVRDKFLISLAVLSLAILFAVTHIHQLIVFIERVA >gi|296493405|gb|ADTK01000096.1| GENE 41 42018 - 42515 374 165 aa, chain + ## HITS:1 COG:yidF KEGG:ns NR:ns ## COG: yidF COG0641 # Protein_GI_number: 16131544 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Escherichia coli K12 # 1 165 1 165 165 339 99.0 1e-93 MTGSQVIDAEEDRHKLVVEYKDALQPADFYHNFKQRGIRSVQLIPYLEFDDRGDLTAASV TAELWGKFLIALFECWVRADISRISIELFDATLQKWCGSENPQPRRDCQACDWHRLCPHA RQETPDSVLCAGYQAFYSYSAPHMRVMRDLIKQHRSPMELMTMLR >gi|296493405|gb|ADTK01000096.1| GENE 42 42523 - 43674 1213 383 aa, chain - ## HITS:1 COG:ECs4614 KEGG:ns NR:ns ## COG: ECs4614 COG0477 # Protein_GI_number: 15833868 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 9 383 22 396 396 619 100.0 1e-177 MLVLLVAVGQMAQTIYIPAIADMARDLNVREGAVQSVMGAYLLTYGVSQLFYGPISDRVG RRPVILVGMSIFMLATLVAVTTSSLTVLIAASAMQGMGTGVGGVMARTLPRDLYERTQLR HANSLLNMGILVSPLLAPLIGGLLDTMWNWRACYLFLLVLCAGVTFSMARWMPETRPVDA PRTRLLTSYKTLFGNSGFNCYLLMLIGGLAGIAAFEACSGVLMGAVLGLSSMTVSILFIL PIPAAFFGAWFAGRPNKRFSTLMWQSVICCLLAGLLMWIPDWFGVMNVWTLLVPAALFFF GAGMLFPLATSGAMEPFPFLAGTAGALVGGLQNIGSGVLASLSAMLPQTGQGSLGLLMTL MGLLIVLCWLPLATRMSHQGQPV Prediction of potential genes in microbial genomes Time: Mon May 16 15:17:27 2011 Seq name: gi|296493404|gb|ADTK01000097.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont256.3, whole genome shotgun sequence Length of sequence - 7899 bp Number of predicted genes - 8, with homology - 7 Number of transcription units - 3, operones - 1 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 35 - 124 64 ## - Prom 310 - 369 5.0 - Term 177 - 210 0.0 2 2 Tu 1 . 
- CDS 382 - 672 74 ## ECIAI39_4274 hypothetical protein - Prom 905 - 964 2.4 3 3 Op 1 32/0.000 + CDS 863 - 2551 1852 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] 4 3 Op 2 1/0.000 + CDS 2555 - 2845 355 ## COG0440 Acetolactate synthase, small (regulatory) subunit + Term 2868 - 2906 6.0 5 3 Op 3 5/0.000 + CDS 2920 - 3510 755 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain + Prom 3524 - 3583 1.7 6 3 Op 4 5/0.000 + CDS 3621 - 5012 1172 ## COG3851 Signal transduction histidine kinase, glucose-6-phosphate specific 7 3 Op 5 3/0.000 + CDS 5022 - 6341 1187 ## COG2271 Sugar phosphate permease + Prom 6393 - 6452 2.3 8 3 Op 6 . + CDS 6479 - 7870 1704 ## COG2271 Sugar phosphate permease Predicted protein(s) >gi|296493404|gb|ADTK01000097.1| GENE 1 35 - 124 64 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSLVDIAILILKLIVAALQLLDAVLKYLK >gi|296493404|gb|ADTK01000097.1| GENE 2 382 - 672 74 96 aa, chain - ## HITS:1 COG:no KEGG:ECIAI39_4274 NR:ns ## KEGG: ECIAI39_4274 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_IAI39 # Pathway: not_defined # 1 96 70 165 165 190 96.0 1e-47 MEVVIGDLVGFVFNNIYTSTALQGRQWKIFQQGKIEGLITFCTELQVTKNPAGARFHQPP AAGNRKAHCIMSTLKVYVFRAERAISARILTYLGHS >gi|296493404|gb|ADTK01000097.1| GENE 3 863 - 2551 1852 562 aa, chain + ## HITS:1 COG:ECs4612 KEGG:ns NR:ns ## COG: ECs4612 COG0028 # Protein_GI_number: 15833866 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Escherichia coli O157:H7 # 1 562 1 562 562 1102 99.0 0 MASSGTTSTRKRFTGAEFIVHFLEQQGIKIVTGIPGGSILPVYDALSQSTQIRHILARHE QGAGFIAQGMARTDGKPAVCMACSGPGATNLVTAIADARLDSIPLICITGQVPASMIGTD 
AFQEVDTYGISIPITKHNYLVRHIEELPQVMSDAFRIAQSGRPGPVWIDIPKDVQTAVFE IETQPAMAEKAAAPAFSEESIRDAAAMINAAKRPVLYLGGGVINAPARVRELAEKAQLPT TMTLMALGMLPKAHPLSLGMLGMHGVRSTNYILQEADLLIVLGARFDDRAIGKTEQFCPN AKIIHVDIDRAELGKIKQPHVAIQADVDDVLAQLIPQVEAQPRAEWHQLVADLQREFPCP IPKACDPLSHYGLINAVAACVDDNAIITTDVGQHQMWTAQAYPLNRPRQWLTSGGLGTMG FGLPAAIGAALANPDRKVLCFSGDGSLMMNIQEMATASENQLDVKIILMNNEALGLVHQQ QSLFYEQGVFAATYPGKINFMQIAAGFGLETCDLNNEADPQAALQEIINRPGPALIHVRI DAEEKVYPMVPPGAANTEMVGE >gi|296493404|gb|ADTK01000097.1| GENE 4 2555 - 2845 355 96 aa, chain + ## HITS:1 COG:ECs4611 KEGG:ns NR:ns ## COG: ECs4611 COG0440 # Protein_GI_number: 15833865 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Escherichia coli O157:H7 # 1 96 1 96 96 184 100.0 3e-47 MQNTTHDNVILELTVRNHPGVMTHVCGLFARRAFNVEGILCLPIQDSDKSHIWLLVNDDQ RLEQMISQIDKLEDVVKVQRNQSDPTMFNKIAVFFQ >gi|296493404|gb|ADTK01000097.1| GENE 5 2920 - 3510 755 196 aa, chain + ## HITS:1 COG:ECs4606 KEGG:ns NR:ns ## COG: ECs4606 COG2197 # Protein_GI_number: 15833860 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 196 1 196 196 375 100.0 1e-104 MITVALIDDHLIVRSGFAQLLGLEPDLQVVAEFGSGREALAGLPGRGVQVCICDISMPDI SGLELLSQLPKGMATIMLSVHDSPALVEQALNAGARGFLSKRCSPDELIAAVHTVATGGC YLTPDIAIKLASGRQDPLTKRERQVAEKLAQGMAVKEIAAELGLSPKTVHVHRANLMEKL GVSNDVELARRMFDGW >gi|296493404|gb|ADTK01000097.1| GENE 6 3621 - 5012 1172 463 aa, chain + ## HITS:1 COG:uhpB KEGG:ns NR:ns ## COG: uhpB COG3851 # Protein_GI_number: 16131538 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase, glucose-6-phosphate specific # Organism: Escherichia coli K12 # 1 463 39 501 501 773 100.0 0 MAVLLFPFGLRLGLMLQCPRGYWPVLLGAEWLLIYWLTQAVGLTHFPLLMIGSLLTLLPV ALISRYRHQRDWRTLLLQGAALTAAALLQSLPWLWHGKESWNALLLTLTGGLTLAPICLV 
FWHYLANNTWLPLGPSLVSQPINWRGRHLVWYLLLFVISLWLQLGLPDELSRFTPFCLAL PIIALAWHYGWQGALIATLMNAIALIASQTWRDHPVDLLLSLLVQSLTGLLLGAGIQRLR ELNQSLQKELARNQHLAERLLETEESVRRDVARELHDDIGQTITAIRTQAGIVQRLAADN ASVKQSGQLIEQLSLGVYDAVRRLLGRLRPRQLDDLTLEQAIRSLMREMELEGRGIVSHL EWRIDESALSENQRVTLFRVCQEGLNNIVKHADASAVTLQGWQQDERLMLVIEDDGSGLP PGSGQQGFGLTGMRERVTALGGTLHISCLHGTRVSVSLPQRYV >gi|296493404|gb|ADTK01000097.1| GENE 7 5022 - 6341 1187 439 aa, chain + ## HITS:1 COG:uhpC KEGG:ns NR:ns ## COG: uhpC COG2271 # Protein_GI_number: 16131537 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Escherichia coli K12 # 1 439 2 440 440 810 100.0 0 MLPFLKAPADAPLMTDKYEIDARYRYWRRHILLTIWLGYALFYFTRKSFNAAVPEILANG VLSRSDIGLLATLFYITYGVSKFVSGIVSDRSNARYFMGIGLIATGIINILFGFSTSLWA FAVLWVLNAFFQGWGSPVCARLLTAWYSRTERGGWWALWNTAHNVGGALIPIVMAAAALH YGWRAGMMIAGCMAIVVGIFLCWRLRDRPQALGLPAVGEWRHDALEIAQQQEGAGLTRKE ILTKYVLLNPYIWLLSFCYVLVYVVRAAINDWGNLYMSETLGVDLVTANTAVTMFELGGF IGALVAGWGSDKLFNGNRGPMNLIFAAGILLSVGSLWLMPFASYVMQATCFFTIGFFVFG PQMLIGMAAAECSHKEAAGAATGFVGLFAYLGASLAGWPLAKVLDTWHWSGFFVVISIAA GISALLLLPFLNAQTPREA >gi|296493404|gb|ADTK01000097.1| GENE 8 6479 - 7870 1704 463 aa, chain + ## HITS:1 COG:ECs4603 KEGG:ns NR:ns ## COG: ECs4603 COG2271 # Protein_GI_number: 15833857 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Escherichia coli O157:H7 # 1 463 1 463 463 845 100.0 0 MLAFLNQVRKPTLDLPLEVRRKMWFKPFMQSYLVVFIGYLTMYLIRKNFNIAQNDMISTY GLSMTQLGMIGLGFSITYGVGKTLVSYYADGKNTKQFLPFMLILSAICMLGFSASMGSGS VSLFLMIAFYALSGFFQSTGGSCSYSTITKWTPRRKRGTFLGFWNISHNLGGAGAAGVAL FGANYLFDGHVIGMFIFPSIIALIVGFIGLRYGSDSPESYGLGKAEELFGEEISEEDKET ESTDMTKWQIFVEYVLKNKVIWLLCFANIFLYVVRIGIDQWSTVYAFQELKLSKAVAIQG FTLFEAGALVGTLLWGWLSDLANGRRGLVACIALALIIATLGVYQHASNEYIYLASLFAL GFLVFGPQLLIGVAAVGFVPKKAIGAADGIKGTFAYLIGDSFAKLGLGMIADGTPVFGLT GWAGTFAALDIAAIGCICLMAIVAVMEERKIRREKKIQQLTVA Prediction of potential genes in microbial genomes Time: Mon May 16 15:17:35 2011 Seq name: 
gi|296493403|gb|ADTK01000098.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont256.4, whole genome shotgun sequence Length of sequence - 10314 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 7, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 1770 1459 ## COG1001 Adenine deaminase - Prom 1820 - 1879 6.1 + Prom 1755 - 1814 5.8 2 2 Op 1 . + CDS 1933 - 3279 1285 ## COG2252 Permeases 3 2 Op 2 . + CDS 3332 - 3784 393 ## LF82_3360 uncharacterized protein YicN + Prom 3904 - 3963 3.7 4 3 Tu 1 . + CDS 3995 - 5185 1123 ## COG2814 Arabinose efflux permease 5 4 Tu 1 . - CDS 5226 - 5519 274 ## SSON_3614 putative transport protein - Prom 5719 - 5778 5.6 + Prom 5658 - 5717 5.8 6 5 Tu 1 . + CDS 5741 - 6559 755 ## COG1464 ABC-type metal ion transport system, periplasmic component/surface antigen - Term 6439 - 6495 5.2 7 6 Op 1 8/0.000 - CDS 6563 - 7486 685 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 7521 - 7580 2.3 8 6 Op 2 . - CDS 7597 - 8781 627 ## COG0477 Permeases of the major facilitator superfamily - Prom 8964 - 9023 9.8 9 7 Tu 1 . 
- CDS 9577 - 10194 397 ## ECIAI39_4246 hypothetical protein Predicted protein(s) >gi|296493403|gb|ADTK01000098.1| GENE 1 3 - 1770 1459 589 aa, chain - ## HITS:1 COG:yicP KEGG:ns NR:ns ## COG: yicP COG1001 # Protein_GI_number: 16131535 # Func_class: F Nucleotide transport and metabolism # Function: Adenine deaminase # Organism: Escherichia coli K12 # 1 578 1 578 588 1184 99.0 0 MNNSINHKFHHISRAEYQELLAVSRGDAVADYIIDNVSILDLINGGEISGPIVIKGRYIA GVGAEYADAPALQRIDARGATAVPGFIDAHLHIESSMMTPVTFETATLPRGLTTVICDPH EIVNVMGEAGFAWFARCAEQARQNQYLQVSSCVPALEGCDVNGASFTLEQMLAWRDHPQV TGLAEMMDYPGVISGQNALLDKLDAFRHLTLDGHCPGLGGKELNAYITAGIENCHESYQL EEGRRKLQLGMSLMIREGSAARNLNALAPLINEFNSPQCMLCTDDRNPWEIAHEGHIDAL IRRLIEQHNVPLHVAYRVASWSTARHFGLNHLGLLAPGKQADIVLLSDARKVTVQQVLVK GEPIDAQTLQAEESARLAQSAPPYGNTIARQPVSASDFALQFTPGKRYRVIDVIHNELIT HSHSSVYSENGFDRDDVSFIAVLERYGQRLAPACGLLGGFGLNEGALAATVSHDSHNIVV IGRSAEEMALAVNQVIQDGGGLCVVRNGQVQSHLPLPIAGLMSTDTAQSLAEQIDALKAA ARECGPLPDEPFIQMAFLSLPVIPALKLTSQGLFDGEKRNPTDWWWWNR >gi|296493403|gb|ADTK01000098.1| GENE 2 1933 - 3279 1285 448 aa, chain + ## HITS:1 COG:yicO KEGG:ns NR:ns ## COG: yicO COG2252 # Protein_GI_number: 16131534 # Func_class: R General function prediction only # Function: Permeases # Organism: Escherichia coli K12 # 1 448 23 470 470 765 99.0 0 MDKKMNNDNTDYVSNESGTLSRLFKLPQHGTTVRTELIAGMTTFLTMVYIVFVNPQILGA AQMDPKVVFVTTCLIAGIGSIAMGIFANLPVALAPAMGLNAFFAFVVVGAMGISWQTGMG AIFWGAVGLFLLTLFRIRYWMISNIPLSLRIGITSGIGLFIALMGLKNTGVIVANKDTLV MIGDLSSHGVLLGILGFFIITVLSSRHFHAAVLVSIVVTSCCGLFFGDVHFSGVYSIPPD ISGVIGEVDLSGALTLELAGIIFSFMLINLFDSSGTLIGVTDKAGLIDGNGKFPNMNKAL YVDSVSSVAGAFIGTSSVTAYIESTSGVAVGGRTGLTAVVVGVMFLLVMFFSPLVAMVPP YATAGALIFVGVLMTSSLARVNWDDFTESVPAFITTVMMPFTFSITEGIALGFMSYCIMK VCTGRWRDLNLCVVVVAALFALKIILVD >gi|296493403|gb|ADTK01000098.1| GENE 3 3332 - 3784 393 150 aa, chain + ## HITS:1 COG:no KEGG:LF82_3360 NR:ns ## KEGG: LF82_3360 # Name: yicN # Def: uncharacterized protein YicN # Organism: E.coli_LF82 # Pathway: not_defined # 1 150 10 159 159 290 99.0 1e-77 
MIWIMLATLAVVFVVGFRVLTSGARKAIRRLSDRLNIDVIPVESMVDQMGKSAGDEFLRY LHRPDESHLQNAAQVLLIWQIVIVDGSEQNLLQWHRILQKARLAAPITDAQVRLALGFLR ETEPEMQDINAFQMRYNAFFQPAEGVHWLH >gi|296493403|gb|ADTK01000098.1| GENE 4 3995 - 5185 1123 396 aa, chain + ## HITS:1 COG:yicM KEGG:ns NR:ns ## COG: yicM COG2814 # Protein_GI_number: 16131532 # Func_class: G Carbohydrate transport and metabolism # Function: Arabinose efflux permease # Organism: Escherichia coli K12 # 1 396 56 451 451 665 99.0 0 MSEFIAENRGADAITRPNWSAVFSVAFCVACLIIVEFLPVSLLTPMAQDLGISEGVAGQS VTVTAFVAMFASLFITQTIQATDRRYVVILFAVLLTLSCLLVSFANSFSLLLIGRACLGL ALGGFWAMSASLTMRLVPPRTVPKALSVIFGAVSIALVIAAPLGSFLGELIGWRNVFNAA AVMGVLCIFWIIKSLPSLPGEPSHQKQNTFRLLQRPGVMAGMIAIFMSFAGQFAFFTYIR PVYMNLAGFSVDGLTLVLLSFGIASFIGTSLSSFILKRSVKLALAGAPLILAVSALVLTL WGSDKIVATGVAIIWGLTFALVPVGWSTWITRSLADQAEKAGSIQVAVIQLANTCGAAIG GYALDNIGLTSPLMLSGTLMLLTALLVTAKVKMKKS >gi|296493403|gb|ADTK01000098.1| GENE 5 5226 - 5519 274 97 aa, chain - ## HITS:1 COG:no KEGG:SSON_3614 NR:ns ## KEGG: SSON_3614 # Name: not_defined # Def: putative transport protein # Organism: S.sonnei # Pathway: not_defined # 1 97 1 97 97 187 100.0 9e-47 MKPTTLLLIFTFFAMPGIVYAESPFSSLQSAKEKTTVLQDLRKICTPQASLSDEAWEKLM LSDENNKQHIREAIVAMERNNQSNYWEALGKVECPDM >gi|296493403|gb|ADTK01000098.1| GENE 6 5741 - 6559 755 272 aa, chain + ## HITS:1 COG:nlpA KEGG:ns NR:ns ## COG: nlpA COG1464 # Protein_GI_number: 16131531 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface antigen # Organism: Escherichia coli K12 # 1 272 1 272 272 501 99.0 1e-142 MKLTTHHLRTGAALLLAGILLAGCDQSSSDAKHIKVGVINGAEQDVAEVAKKVAKEKYGL DVELVGFSGSLLPNDATNHGELDANVFQHRPFLEQDNQAHGYKLVAVGNTFVFPMAGYSK KITTVAQIKEGATVAIPNDPTNLGRALLLLQKEKLITLKEGKGLLPTALDITDNPRHLQI MELEGAQLPRVLDDPKVDVAIISTTYIQQTGLSPVHDSVFIEDKNSPYVNILVAREDNKN AENVKEFLQSYQSPEVAKAAETIFNGGAVPGW >gi|296493403|gb|ADTK01000098.1| GENE 7 6563 - 7486 685 307 aa, chain - ## HITS:1 COG:yicL KEGG:ns NR:ns ## COG: yicL COG0697 # 
Protein_GI_number: 16131530 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli K12 # 1 307 1 307 307 498 99.0 1e-141 MGSTRKGMLNVLIAAVLWGSSGVCAQYIMEQSQMSSQFLTMTRLIFAGLILLTLSFVHGD KIFSIINNHKDAISLLIFSVVGALTVQLTFLLTIEKSNAATATVLQFLSPTIIVAWFSLV RKSRPGILVFCAILTSLVGTFLLVTHGNPTSLSISPAALFWGIASAFAAAFYTTYPSTLI ARYGTLPVVGWSMLIGGLILLPFYARQGTNFVVNGSLILAFFYLVVIGTSLTFSLYLKGA QLIGGPKASILSCAEPLSSALLSLLLLGITFTLPDWLGTLLILSSVILISMDSRRRARKI NRPARHE >gi|296493403|gb|ADTK01000098.1| GENE 8 7597 - 8781 627 394 aa, chain - ## HITS:1 COG:setC KEGG:ns NR:ns ## COG: setC COG0477 # Protein_GI_number: 16131529 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 394 1 394 394 707 99.0 0 MQKTATTPSKILDLTAAAFLLVAFLTGIAGALQTPTLSIFLADELKARPIMVGFFFTGSA IMGILVSQFLARHSDKQGDRKLLILLCCLFGVLACTLFAWNRNYFILLSTGVLLSSFAST ANPQMFTLAREHADRTGRETVMFSTFLRAQISLAWVIGPPLAYELAMGFSFKVMYLTAAI AFVVCGLIVWLFLPSIQRNIPVVTQPVEILPSTHRKRDTRLLFVVCSMMWAANNLYMINM PLFIIDELHLTDKLAGEMIGIAAGLEIPMMLIAGYYMKHIGKRLLMLIAIVSGMCFYASV LMATTPAVELELQILNAIFLGILCGIGMLYFQDLMPEKIGSATTLYANTSRVGWIIAGSV DGIMVEIWSYHALFWLAIGMLGIAMICLLFIKDI >gi|296493403|gb|ADTK01000098.1| GENE 9 9577 - 10194 397 205 aa, chain - ## HITS:1 COG:no KEGG:ECIAI39_4246 NR:ns ## KEGG: ECIAI39_4246 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_IAI39 # Pathway: not_defined # 1 205 76 280 280 417 96.0 1e-115 MEKPETAMKAITRNLDREIWRDLMQRSGMLSLMDAQARDTWYRSLEYDNFPEISEANILS TFEQLHQNKDEVFERGVINVFRGLSWNYKTNCPCKFGSKIIVNNLVRWDRWGFHLITGQQ ADRLADLERMLHLFSGKPIPDNRENITIHLDDHIQSVQGKEDYEDEMFSIRYFKKGSAHI TFRKPELVDRLNEIIAKHYPGVLPS Prediction of potential genes in microbial genomes Time: Mon 
May 16 15:17:44 2011 Seq name: gi|296493402|gb|ADTK01000099.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont258.1, whole genome shotgun sequence Length of sequence - 506 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 69 - 504 294 ## EcE24377A_1416 exonuclease family protein Predicted protein(s) >gi|296493402|gb|ADTK01000099.1| GENE 1 69 - 504 294 145 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_1416 NR:ns ## KEGG: EcE24377A_1416 # Name: not_defined # Def: exonuclease family protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 145 465 609 823 273 99.0 2e-72 MQAANISQPDADKLLAASRGEFVEGISDPNDPKWVKGIQTRDSVNQNQHESERNYQKAEQ NSPNALQNEPETKQPEPVAQQEVEKVCTACGQTGGGNCPDCGAVMGDATYQETFDEEYQV EVQEDDPEEMEGAEHPHKENTGGNQ Prediction of potential genes in microbial genomes Time: Mon May 16 15:17:47 2011 Seq name: gi|296493401|gb|ADTK01000100.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont260.1, whole genome shotgun sequence Length of sequence - 2327 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 144 - 713 489 ## ECS88_2076 hypothetical protein 2 2 Tu 1 . 
+ CDS 973 - 1374 208 ## ECB_01907 hypothetical protein Predicted protein(s) >gi|296493401|gb|ADTK01000100.1| GENE 1 144 - 713 489 189 aa, chain + ## HITS:1 COG:no KEGG:ECS88_2076 NR:ns ## KEGG: ECS88_2076 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 189 13 201 201 390 100.0 1e-107 MTYKYNPFWQQRIRETVRHALNVHPRLTALRVDLRFPDVPAATDAAVISRFINALKARID AYQKRKHREGKRVHPTTLHYVWAREFGECKGKKHYHLMLLVNRDTWCRAGDYRAPGSLAG MIKQAWCSALGVDVGCHATLVHFPAWPAVWLERDDDTGFQQVLERADYLAKEHTKAHCTG ERNFGCSRS >gi|296493401|gb|ADTK01000100.1| GENE 2 973 - 1374 208 133 aa, chain + ## HITS:1 COG:no KEGG:ECB_01907 NR:ns ## KEGG: ECB_01907 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 133 12 144 144 264 98.0 7e-70 MYAKSFIALDGNGRLTGARTAQAAPYANYTCHLCGSALRYHPQYDTELPWFEHTDDRLTE HGQQCPYVRPERREIQLIKRLQQFVPDALPVVRKASWHCRQCHHDYYGEQYCTNCQTGGF SIPRTTQEEICEF Prediction of potential genes in microbial genomes Time: Mon May 16 15:18:02 2011 Seq name: gi|296493400|gb|ADTK01000101.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont287.1, whole genome shotgun sequence Length of sequence - 30495 bp Number of predicted genes - 27, with homology - 27 Number of transcription units - 14, operones - 7 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 113 - 598 481 ## ECO111_0312 hypothetical protein 2 1 Op 2 . + CDS 614 - 1090 411 ## COG2003 DNA repair proteins 3 1 Op 3 . + CDS 1159 - 1380 284 ## ECB_02806 hypothetical protein 4 1 Op 4 . + CDS 1454 - 1582 59 ## ECED1_2355 antitoxin of the YeeV-YeeU toxin-antitoxin system; CP4-44 prophage 5 1 Op 5 . + CDS 1650 - 1796 61 ## ECP_3015 hypothetical protein - Term 2114 - 2165 -0.7 6 2 Op 1 4/0.000 - CDS 2400 - 3647 673 ## COG0477 Permeases of the major facilitator superfamily - Term 3661 - 3699 -0.4 7 2 Op 2 . - CDS 3719 - 4633 377 ## COG0524 Sugar kinases, ribokinase family - Prom 4779 - 4838 3.6 + Prom 4749 - 4808 5.5 8 3 Tu 1 . 
+ CDS 4849 - 6282 634 ## COG1621 Beta-fructosidases (levanase/invertase) 9 4 Tu 1 . - CDS 6290 - 7240 570 ## COG1609 Transcriptional regulators - Prom 7385 - 7444 5.5 + Prom 7364 - 7423 4.6 10 5 Op 1 4/0.000 + CDS 7528 - 7746 204 ## COG2610 H+/gluconate symporter and related permeases 11 5 Op 2 . + CDS 7764 - 9092 1192 ## COG3048 D-serine dehydratase 12 6 Op 1 19/0.000 - CDS 9200 - 10738 516 ## COG0477 Permeases of the major facilitator superfamily 13 6 Op 2 . - CDS 10738 - 11901 728 ## COG1566 Multidrug resistance efflux pump + Prom 12144 - 12203 6.9 14 7 Op 1 12/0.000 + CDS 12317 - 12931 387 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain + Term 13142 - 13182 -0.9 + Prom 13461 - 13520 3.4 15 7 Op 2 . + CDS 13590 - 16529 1095 ## COG0642 Signal transduction histidine kinase + Term 16536 - 16574 4.4 - Term 16522 - 16562 4.8 16 8 Op 1 2/0.875 - CDS 16585 - 17730 450 ## COG1804 Predicted acyl-CoA transferases/carnitine dehydratase - Term 17739 - 17775 2.2 17 8 Op 2 3/0.500 - CDS 17804 - 18748 727 ## COG0679 Predicted permeases 18 8 Op 3 4/0.000 - CDS 18818 - 20512 1129 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] - Term 20530 - 20558 0.5 19 8 Op 4 . - CDS 20566 - 21816 1137 ## COG1804 Predicted acyl-CoA transferases/carnitine dehydratase - Prom 21842 - 21901 8.4 - Term 22226 - 22274 10.3 20 9 Tu 1 . - CDS 22329 - 22961 578 ## SDY_2573 hypothetical protein - Prom 23052 - 23111 7.2 + Prom 23028 - 23087 5.1 21 10 Tu 1 . + CDS 23257 - 23532 57 ## SSON_2467 hypothetical protein 22 11 Tu 1 . - CDS 23609 - 23851 282 ## SDY_2575 hypothetical protein - Prom 23893 - 23952 4.3 23 12 Tu 1 . + CDS 24204 - 25124 848 ## COG1560 Lauroyl/myristoyl acyltransferase + Term 25158 - 25198 4.1 - Term 25562 - 25593 4.1 24 13 Tu 1 . 
- CDS 25616 - 26854 1279 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 27062 - 27121 4.7 + Prom 27010 - 27069 6.5 25 14 Op 1 9/0.000 + CDS 27249 - 28928 1452 ## COG3275 Putative regulator of cell autolysis 26 14 Op 2 3/0.500 + CDS 28943 - 29677 731 ## COG3279 Response regulator of the LytR/AlgR family 27 14 Op 3 . + CDS 29690 - 30494 535 ## COG2207 AraC-type DNA-binding domain-containing proteins Predicted protein(s) >gi|296493400|gb|ADTK01000101.1| GENE 1 113 - 598 481 161 aa, chain + ## HITS:1 COG:no KEGG:ECO111_0312 NR:ns ## KEGG: ECO111_0312 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 161 1 161 161 323 93.0 1e-87 MKTISHNSTTPSVSVTAASGNDQPQLVATLVPDEQRISFWPQHFGLIPQWVTLEPRVFGW MDRLCEDYCGGIWNLYTLNNGGAFMAPEPDDDDDETWVLFNAMNGNRAEMSPEAAGIAAC LMTYSHHACRTECYAMTVHYYRLRDYALQHPECSAIMRIID >gi|296493400|gb|ADTK01000101.1| GENE 2 614 - 1090 411 158 aa, chain + ## HITS:1 COG:ECs2803 KEGG:ns NR:ns ## COG: ECs2803 COG2003 # Protein_GI_number: 15832057 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Escherichia coli O157:H7 # 1 158 1 158 158 294 96.0 5e-80 MQQLSFLPGNMTPGERSLIQRALKTLDRHLHEPGVAFTSTHAAREWLILNMAGLEREEFR VLYLNNQNQLIAGETLFTGTINRTEVHPREVIKRALYHNAAAVVLAHNHPSGEVTPSKAD RLITERLVQALALVDIRVPDHLIVGGSQVFSFAEHGLL >gi|296493400|gb|ADTK01000101.1| GENE 3 1159 - 1380 284 73 aa, chain + ## HITS:1 COG:no KEGG:ECB_02806 NR:ns ## KEGG: ECB_02806 # Name: yeeT # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 73 1 73 73 150 100.0 1e-35 MKIITRGEAMRIHQQHPASRLFPFCTGKYRWHGSAEAYTGREVQDIPGVLAVFAERRKDS FGPYVRLMSVTLN >gi|296493400|gb|ADTK01000101.1| GENE 4 1454 - 1582 59 42 aa, chain + ## HITS:1 COG:no KEGG:ECED1_2355 NR:ns ## KEGG: ECED1_2355 # Name: yeeU # Def: antitoxin of the YeeV-YeeU toxin-antitoxin system; CP4-44 prophage # Organism: E.coli_ED1a # Pathway: not_defined # 1 42 1 42 122 91 97.0 7e-18 
MSDTLPGTTLPDDNHDRPWWGLPCTVTPCFGARLVQEGKQLH >gi|296493400|gb|ADTK01000101.1| GENE 5 1650 - 1796 61 48 aa, chain + ## HITS:1 COG:no KEGG:ECP_3015 NR:ns ## KEGG: ECP_3015 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_536 # Pathway: not_defined # 1 29 90 118 142 62 100.0 5e-09 MKQLELMLTSGELNPRHQHTVTLYAKGLTWPTPPAVVVMFIWLFIRHR >gi|296493400|gb|ADTK01000101.1| GENE 6 2400 - 3647 673 415 aa, chain - ## HITS:1 COG:ECs3241 KEGG:ns NR:ns ## COG: ECs3241 COG0477 # Protein_GI_number: 15832495 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 415 1 415 415 743 99.0 0 MALNIPFRNAYYRFASSYSFLFFISWSLWWSLYAIWLKGHLGLTGTELGTLYSVNQFTSI LFMMFYGIVQDKLGLKKPLIWCMSFILVLTGPFMIYVYEPLLQSNFSVGLILGALFFGLG YLAGCGLLDSFTEKMARNFHFEYGTARAWGSFGYAIGAFFAGIFFSISPHINFWLVSLFG AVFMMINMCFKDKDHQCVAADAGGGKKEDFIAVFKDRNFWVFVIFIVGTWSFYNIFDQQL FPVFYAGLFESHDVGTRLYGYLNSFQVVLEALCMAIIPFFVNRVGPKNALLIGVVIMALR ILSCALFVNPWIISLVKLLHAIEVPLCVISVFKYSVANFDKRLSSTIFLIGFQIASSLGI VLLSTPTGILFDHAGYQTVFFAISGIVCLMLLFGIFFLSKKREQIVMETPVPSAI >gi|296493400|gb|ADTK01000101.1| GENE 7 3719 - 4633 377 304 aa, chain - ## HITS:1 COG:ECs3242 KEGG:ns NR:ns ## COG: ECs3242 COG0524 # Protein_GI_number: 15832496 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli O157:H7 # 1 304 1 304 304 607 98.0 1e-174 MSAKVWVLGDAVVDLLPESDGRLLPCPGGAPANVAVGIARLGGISGFIGRVGDDPFGALM QRTLLTEGVDITYLKQDEWHRTSTVLVDLNDQGERSFTFMVRPSADLFLETTDLPCWRHG EWLHLCSIALSAEPSRTSAFTAMTAIRHAGGFVSFDPNIREDLWQDEHLLRLCLRQALQL ADVVKLSEEEWRLISGKTQNDQDICALAKEYEIAMLLVTKGAEGVVVCYRGQVHHFAGMS VNCVDSTGAGDAFVAGLLTGLSSTGLSTDEREMRRIIDLAQRCGALAVTAKGAMTALPCR QELE >gi|296493400|gb|ADTK01000101.1| GENE 8 4849 - 6282 634 477 aa, chain + ## HITS:1 COG:ECs3243 KEGG:ns NR:ns ## COG: ECs3243 COG1621 # 
Protein_GI_number: 15832497 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Escherichia coli O157:H7 # 1 477 1 477 477 967 98.0 0 MTQSRLHAAQNALAKLHEHRGNTFYPHFHLAPPAGWMNDPNGLIWFNDRYHAFYQHHPMS EHWGPMHWGHATSDDMIHWQHEPIALAPGDDNDKDGCFSGSAVDDNGVLSLIYTGHVWLD GAGNDDAIREVQCLATSRDGIHFEKQGVILTPPEGIMHFRDPKVWREADTWWMVVGAKDP GNTGQILLYRGSSLREWTFDRVLAHADAGESYMWECPDFFSLGDQHYLMFSPQGMNAEGY SYRNRFQSGVIPGMWSPGRLFAQSGHFTELDNGHDFYAPQSFLAKDGRRIVIGWMDMWES PMPSKREGWAGCMTLARELSESNGKLLQRPVHEAESLRQQHQSVSPRTISNKYVLQENAQ AVEIQLQWALKNSDAEHYGLQLGTGMRLYIDNQSERLVLWRYYPHENLDGYRSIPLPQRD TLALRIFIDTSSVEVFINDGEAVMSSRIYPQPEERELSLYASHGVAVLQHGALWLLG >gi|296493400|gb|ADTK01000101.1| GENE 9 6290 - 7240 570 316 aa, chain - ## HITS:1 COG:ECs3244 KEGG:ns NR:ns ## COG: ECs3244 COG1609 # Protein_GI_number: 15832498 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 316 16 331 331 630 99.0 0 MTVSRVMHNAESVRPATRDRVLQAIQTLNYVPDLSARKMRAQGRKPSTLAVLAQDTATTP FSVDILLAIEQTASEFGWNSFLINIFSEDDAARAARQLLAHRPDGIIYTTMGLRHITLPE SLYGENIVLANCVADDPALPSYIPDDYTAQYESTQHLLAAGYRQPLCFWLPESALATGYR RQGFEQAWRDAGRDLAEVKQFHMATGDDHYTDLASLLNDHFKSGKPDFDVLICGNDRAAF VAYQVLLAKGVRIPQDVAVMGFDNLVGVGHLFLPPLTTIQLPHDIIGREAALHIIEGREG GRVTRIPCPLLIRCST >gi|296493400|gb|ADTK01000101.1| GENE 10 7528 - 7746 204 72 aa, chain + ## HITS:1 COG:dsdX KEGG:ns NR:ns ## COG: dsdX COG2610 # Protein_GI_number: 16130297 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Escherichia coli K12 # 1 72 374 445 445 126 100.0 9e-30 MLPLYPDISPEIIAIAIGSGAIGCTIVTDSLFWLVKQYCGATLNETFKYYTTATFIASVV ALAGTFLLSFII >gi|296493400|gb|ADTK01000101.1| GENE 11 7764 - 9092 1192 442 aa, chain + ## HITS:1 COG:dsdA KEGG:ns NR:ns ## COG: dsdA COG3048 # Protein_GI_number: 16130298 # Func_class: E Amino acid transport and metabolism # Function: D-serine dehydratase # 
Organism: Escherichia coli K12 # 1 442 1 442 442 871 100.0 0 MENAKMNSLIAQYPLVKDLVALKETTWFNPGTTSLAEGLPYVGLTEQDVQDAHARLSRFA PYLAKAFPETAATGGIIESELVAIPAMQKRLEKEYQQPISGQLLLKKDSHLPISGSIKAR GGIYEVLAHAEKLALEAGLLTLDDDYSKLLSPEFKQFFSQYSIAVGSTGNLGLSIGIMSA RIGFKVTVHMSADARAWKKAKLRSHGVTVVEYEQDYGVAVEEGRKAAQSDPNCFFIDDEN SRTLFLGYSVAGQRLKAQFAQQGRIVDADNPLFVYLPCGVGGGPGGVAFGLKLAFGDHVH CFFAEPTHSPCMLLGVHTGLHDQISVQDIGIDNLTAADGLAVGRASGFVGRAMERLLDGF YTLSDQTMYDMLGWLAQEEGIRLEPSALAGMAGPQRVCASVSYQQMHGFSAEQLRNTTHL VWATGGGMVPEEEMNQYLAKGR >gi|296493400|gb|ADTK01000101.1| GENE 12 9200 - 10738 516 512 aa, chain - ## HITS:1 COG:emrY KEGG:ns NR:ns ## COG: emrY COG0477 # Protein_GI_number: 16130299 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 512 1 512 512 910 99.0 0 MAITKSTPAPLTGGTLWCVTIALSLATFMQMLDSTISNVAIPTISGFLGASTDEGTWVIT SFGVANAIAIPVTGRLAQRIGELRLFLLSVTFFSLSSLMCSLSTNLDVLIFFRVVQGLMA GPLIPLSQSLLLRNYPPEKRTFALALWSMTVIIAPICGPILGGYICDNFSWGWIFLINVP MGIIVLTLCLTLLKGRETETSPVKMNLPGLTLLVLGVGGLQIMLDKGRDLDWFNSSTIII LTVVSVISLISLVIWESTSENPILDLSLFKSRNFTIGIVSITCAYLFYSGAIVLMPQLLQ ETMGYNAIWAGLAYAPIGIMPLLISPLIGRYGNKIDMRLLVTFSFLMYAVCYYWRSVTFM PTIDFTGIILPQFFQGFAVACFFLPLTTISFSGLPDNKFANASSMSNFFRTLSGSVGTSL TMTLWGRRESLHHSQLTATIDQFNPVFNSSSQIMDKYYGSLSGVLNEINNEITQQSLSIS ANEIFRMAAIAFILLTVLVWFAKPPFTAKGIG >gi|296493400|gb|ADTK01000101.1| GENE 13 10738 - 11901 728 387 aa, chain - ## HITS:1 COG:emrK KEGG:ns NR:ns ## COG: emrK COG1566 # Protein_GI_number: 16130300 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Escherichia coli K12 # 1 387 1 387 387 711 99.0 0 MEQINSNKKHSNRRKYFSLLAVVLFIAFSGAYAYWSMELEDMISTDDAYVTGNADPISAQ VSGSVTVVNHKDTNYVRQGDILVSLDKTDATIALNKAKNNLANIVRQTNKLYLQDKQYSA EVASARIQYQQSLEDYNRRVPLAKQGVISKETLEHTKDTLISSKAALNAAIQAYKANKAL 
VMNTPLNRQPQVVEAADATKEAWLALKRTDIKSPVTGYIAQRSVQVGETVSPGQSLMAVV PARQMWVNANFKETQLTDVRIGQSVNIISDLYGENVVFHGRVTGINMGTGNAFSLLPAQN ATGNWIKIVQRVPVEVSLDPKELMEHPLRIGLSMTATIDTKNEDIAEMPELASTVTSMPA YTSKALVIDTSPIEKEISNIISHNGQF >gi|296493400|gb|ADTK01000101.1| GENE 14 12317 - 12931 387 204 aa, chain + ## HITS:1 COG:ECs3248 KEGG:ns NR:ns ## COG: ECs3248 COG2197 # Protein_GI_number: 15832502 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 204 1 204 204 392 100.0 1e-109 MNAIIIDDHPLAIAAIRNLLIKNDIEILAELTEGGSAVQRVETLKPDIVIIDVDIPGVNG IQVLETLRKRQYSGIIIIVSAKNDHFYGKHCADAGANGFVSKKEGMNNIIAAIEAAKNGY CYFPFSLNRFVGSLTSDQQKLDSLSKQEISVMRYILDGKDNNDIAEKMFISNKTVSTYKS RLMEKLECKSLMDLYTFAQRNKIG >gi|296493400|gb|ADTK01000101.1| GENE 15 13590 - 16529 1095 979 aa, chain + ## HITS:1 COG:evgS_2 KEGG:ns NR:ns ## COG: evgS_2 COG0642 # Protein_GI_number: 16130302 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 313 732 1 420 420 840 99.0 0 MISRYFTHSLNVVKYYNSPRQYNFFLTRKESVILNEVLNRFVDALTNEVRYEVSQNWLDT GNLAFLNKPLELTEHEKQWIKQHPNLKVLENPYSPPYSMTDENGSVRGVMGDILNIITLQ TGLNFSPITVSHNIHAGTQLSPGGWDIIPGAIYSEDRENNVLFAEAFITTPYVFVMQKAP DSEQTLKKGMKVAIPYYYELHSQLKEMYPEVEWIQVDNASAAFHKVKEGELDALVATQLN SRYMIDHYYPNELYHFLIPGVPNASLSFAFPRGEPELKDIINKALNAIPPSEVLRLTEKW IKMPNVTIDTWDLYSEQFYIVTTLSVLLVGSSLLWGFYLLRSVRRRKVIQGDLENQISFR KALSDSLPNPTYVVNWQGNVISHNSAFEHYFTADYYKNAMLPLENSDSPFKDVFSNAHEV TAETKENRTIYTQVFEIDNGIEKRCINHWHTLCNLPASDNAVYICGWQDITETRDLINAL EVEKNKAIKATVAKSQFLATMSHEIRTPISSIMGFLELLSGSGLSKEQRVEAISLAYATG QSLLGLIGEILDVDKIESGNYQLQPQWVDIPTLVQNTCHSFGAIAASKSIALSCSSTFPD HYLVKIDPQAFKQVLSNLLSNALKFTTEGAVKITTSLVHIDDNHAVIKMTIMDSGSGLSQ EEQQQLFKRYSQTSAGRQQTGSGLGLMICKELIKNMQGDLSLESHPGIGTTFTITIPVEI SQQVATVEAKAEHPITLPEKLSILIADDHPTNRLLLKRQLNLLGYDVDEATDGVQALHKV SMQHYDLLITDVNMPNMDGFELTRKLREQNSSLPIWGLTANAQANEREKGLSCGMNLCLF 
KPLTLDVLKTHLSQLHQVAHIAPQYRHLDIEALKNNTANNLQLMQEILMTFQHETHKDLP AAFQALEAGDNRTFHQCIHRIHGAANILNLQKLINISHQLEITPVSDDSKPEILQLLNSV KEHIAELDQEIAVFCQKND >gi|296493400|gb|ADTK01000101.1| GENE 16 16585 - 17730 450 381 aa, chain - ## HITS:1 COG:yfdE KEGG:ns NR:ns ## COG: yfdE COG1804 # Protein_GI_number: 16130303 # Func_class: C Energy production and conversion # Function: Predicted acyl-CoA transferases/carnitine dehydratase # Organism: Escherichia coli K12 # 1 381 14 394 394 796 100.0 0 MTNNESKGPFEGLLVIDMTHVLNGPFGTQLLCNMGARVIKVEPPGHGDDTRTFGPYVDGQ SLYYSFINHGKESVVLDLKNDHDKSIFINMLKQADVLAENFRPGTMEKLGFSWETLQEIN PRLIYASSSGFGHTGPLKDAPAYDTIIQAMSGIMMETGYPDAPPVRVGTSLADLCGGVYL FSGIVSALYGREKSQRGAHVDIAMFDATLSFLEHGLMAYIATGKSPQRLGNRHPYMAPFD VFNTQDKPITICCGNDKLFSALCQALELTELVNDPRFSSNILRVQNQAILKQYIERTLKT QAAEVWLARIHEVGVPVAPLLSVAEAIKLPQTQARNMLIEAGGIMMPGNPIKISGCADPH VMPGAATLDQHGEQIRQEFSS >gi|296493400|gb|ADTK01000101.1| GENE 17 17804 - 18748 727 314 aa, chain - ## HITS:1 COG:ECs3252 KEGG:ns NR:ns ## COG: ECs3252 COG0679 # Protein_GI_number: 15832506 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Escherichia coli O157:H7 # 1 314 1 314 314 531 100.0 1e-151 MLTFFIGDLLPIIVIMLLGYFSGRRETFSEDQARAFNKLVLNYALPAALFVSITRANREM IFADTRLTLVSLVVIVGCFFFSWFGCYKFFKRTHAEAAVCALIAGSPTIGFLGFAVLDPI YGDSVSTGLVVAIISIIVNAITIPIGLYLLNPSSGADGKKNSNLSALISAAKEPVVWAPV LATILVLVGVKIPAAWDPTFNLIAKANSGVAVFAAGLTLAAHKFEFSAEIAYNTFLKLIL MPLALLLVGMACHLNSEHLQMMVLAGALPPAFSGIIIASRFNVYTRTGTASLAVSVLGFV VTAPLWIYVSRLVS >gi|296493400|gb|ADTK01000101.1| GENE 18 18818 - 20512 1129 564 aa, chain - ## HITS:1 COG:ECs3253 KEGG:ns NR:ns ## COG: ECs3253 COG0028 # Protein_GI_number: 15832507 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Escherichia coli O157:H7 # 1 564 1 564 564 1079 99.0 0 
MSDQLQMTDGMHIIVEALKQNNIDTIYGVVGIPVTDMARHAQAEGIRYIGFRHEQSAGYA AAASGFLTQKPGICLTVSAPGFLNGLTALANATVNGFPMIMISGSSDRAIVDLQQGDYEE LDQMNAAKPYAKAAFRVNQPQDLGIALARAIRVSVSGRPGGVYLDLPANVLAATMEKDEA LTTIVKVENPSPALLPCPKSVTSAISLLAKAERPLIILGKGAAYSQADEQLREFIESTQI PFLPMSMAKGILEDTHPLSAAAARSFALANADVVMLVGARLNWLLAHGKKGWAADTQFIQ LDIEPQEIDSNRPIAVPVVGDIASSMQGMLAELKQNTFTTPLVWRDILNIHKQQNAQKMH EKLSTDTQPLNYFNALSAVRDVLRENQDIYLVNEGANTLDNARNIIDMYKPRRRLDCGTW GVMGIGMGYAIGASVTSGSPVVAIEGDSAFGFSGMEIETICRYNLPVTIVIFNNGGIYRG DGVDLSGAGAPSPTDLLHHARYDKLMDAFRGVGYNVTTTDELRHALTTGIQSRKPTIINV VIDPAAGTESGHITKLNPKQVAGN >gi|296493400|gb|ADTK01000101.1| GENE 19 20566 - 21816 1137 416 aa, chain - ## HITS:1 COG:yfdW KEGG:ns NR:ns ## COG: yfdW COG1804 # Protein_GI_number: 16130306 # Func_class: C Energy production and conversion # Function: Predicted acyl-CoA transferases/carnitine dehydratase # Organism: Escherichia coli K12 # 1 416 1 416 416 830 100.0 0 MSTPLQGIKVLDFTGVQSGPSCTQMLAWFGADVIKIERPGVGDVTRHQLRDIPDIDALYF TMLNSNKRSIELNTKTAEGKEVMEKLIREADILVENFHPGAIDHMGFTWEHIQEINPRLI FGSIKGFDECSPYVNVKAYENVAQAAGGAASTTGFWDGPPLVSAAALGDSNTGMHLLIGL LAALLHREKTGRGQRVTMSMQDAVLNLCRVKLRDQQRLDKLGYLEEYPQYPNGTFGDAVP RGGNAGGGGQPGWILKCKGWETDPNAYIYFTIQEQNWENTCKAIGKPEWITDPAYSTAHA RQPHIFDIFAEIEKYTVTIDKHEAVAYLTQFDIPCAPVLSMKEISLDPSLRQSGSVVEVE QPLRGKYLTVGCPMKFSAFTPDIKAAPLLGEHTAAVLQELGYSDDEIAAMKQNHAI >gi|296493400|gb|ADTK01000101.1| GENE 20 22329 - 22961 578 210 aa, chain - ## HITS:1 COG:no KEGG:SDY_2573 NR:ns ## KEGG: SDY_2573 # Name: not_defined # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 210 1 210 210 326 100.0 3e-88 MKRLIMATMVTAILASSTVWAADNAPVAAQQQTQQVQQTQKTAAAERISEQGLYAMRDVQ VARLALFHGDPEKAKELTNEASALLSDDSTEWAKFAKPGKKTNLNDDQYIVINASVGISE SYVATPEKEAAIKIANEKMAKGDKKGAMEELRLAGVGVMENQYLMPLKQTRNALADAQKL LDKKQYYEANLALKGAEDGIIVDSEALFVN >gi|296493400|gb|ADTK01000101.1| GENE 21 23257 - 23532 57 91 aa, chain + ## HITS:1 COG:no KEGG:SSON_2467 NR:ns ## KEGG: SSON_2467 # Name: not_defined # Def: hypothetical protein # Organism: 
S.sonnei # Pathway: not_defined # 1 91 1 91 91 160 100.0 2e-38 MKVNLILFSLFLLVSIMACNVFAFSISGGVSERSYKETEKTSAMTTTHSTKLQPSQAILF KMREDAPPLNLTEEITPTYPTKANYLIHPVR >gi|296493400|gb|ADTK01000101.1| GENE 22 23609 - 23851 282 80 aa, chain - ## HITS:1 COG:no KEGG:SDY_2575 NR:ns ## KEGG: SDY_2575 # Name: not_defined # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 80 1 80 80 111 100.0 7e-24 MIYLWMFLALCIVCVSGYIGQVLNVVSAVSSFFGMVILAALIYYFTMWLTGGNELVTGIF MFLAPACGLMIRFMVGYGRR >gi|296493400|gb|ADTK01000101.1| GENE 23 24204 - 25124 848 306 aa, chain + ## HITS:1 COG:ECs3258 KEGG:ns NR:ns ## COG: ECs3258 COG1560 # Protein_GI_number: 15832512 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lauroyl/myristoyl acyltransferase # Organism: Escherichia coli O157:H7 # 1 306 23 328 328 625 100.0 1e-179 MFPQCKFSREFLHPRYWLTWFGLGVLWLWVQLPYPVLCFLGTRIGAMARPFLKRRESIAR KNLELCFPQHSAEEREKMIAENFRSLGMALVETGMAWFWPDSRVRKWFDVEGLDNLKRAQ MQNRGVMVVGVHFMSLELGGRVMGLCQPMMATYRPHNNQLMEWVQTRGRMRSNKAMIGRN NLRGIVGALKKGEAVWFAPDQDYGRKGSSFAPFFAVENVATTNGTYVLSRLSGAAMLTVT MVRKADYSGYRLFITPEMEGYPTDENQAAAYMNKIIEKEIMRAPEQYLWIHRRFKTRPVG ESSLYI >gi|296493400|gb|ADTK01000101.1| GENE 24 25616 - 26854 1279 412 aa, chain - ## HITS:1 COG:yfdZ KEGG:ns NR:ns ## COG: yfdZ COG0436 # Protein_GI_number: 16130311 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Escherichia coli K12 # 1 412 1 412 412 847 100.0 0 MADTRPERRFTRIDRLPPYVFNITAELKMAARRRGEDIIDFSMGNPDGATPPHIVEKLCT VAQRPDTHGYSTSRGIPRLRRAISRWYQDRYDVEIDPESEAIVTIGSKEGLAHLMLATLD HGDTVLVPNPSYPIHIYGAVIAGAQVRSVPLVEGVDFFNELERAIRESYPKPKMMILGFP SNPTAQCVELEFFEKVVALAKRYDVLVVHDLAYADIVYDGWKAPSIMQVPGARDVAVEFF TLSKSYNMAGWRIGFMVGNKTLVSALARIKSYHDYGTFTPLQVAAIAALEGDQQCVRDIA EQYKRRRDVLVKGLHEAGWMVEMPKASMYVWAKIPEPYAAMGSLEFAKKLLNEAKVCVSP GIGFGDYGDTHVRFALIENRDRIRQAIRGIKAMFRADGLLPASSKHIHENAE >gi|296493400|gb|ADTK01000101.1| GENE 25 27249 - 28928 1452 559 aa, chain + ## HITS:1 COG:ECs3260 
KEGG:ns NR:ns ## COG: ECs3260 COG3275 # Protein_GI_number: 15832514 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Escherichia coli O157:H7 # 1 559 7 565 565 1123 100.0 0 MLLAVFDRAALMLICLFFLIRIRLFRELLHKSAHSPKELLAVTAIFSLFALFSTWSGVPV EGSLVNVRIIAVMSGGILFGPWVGIITGVIAGIHRYLIDIGGVTAIPCFITSILAGCISG WINLKIPKAQRWRVGILGGMLCETLTMILVIVWAPTTALGIDIVSKIGIPMILGSVCIGF IVLLVQSVEGEKEASAARQAKLALDIANKTLPLFRHVNSESLRKVCEIIRDDIHADAVAI TNTDHVLAYVGVGEHNYQNGDDFISPTTRQAMNYGKIIIKNNDEAHRTPEIHSMLVIPLW EKGVVTGTLKIYYCHAHQITSSLQEMAVGLSQIISTQLEVSRAEQLREMANKAELRALQS KINPHFLFNALNAISSSIRLNPDTARQLIFNLSRYLRYNIELKDDEQIDIKKELYQIKDY IAIEQARFGDKLTVIYDIDEEVNCCIPSLLIQPLVENAIVHGIQPCKGKGVVTISVAECG NRVRIAVRDTGHGIDPKVIERVEANEMPGNKIGLLNVHHRVKLLYGEGLHIRRLEPGTEI AFYIPNQRTPVASQATLLL >gi|296493400|gb|ADTK01000101.1| GENE 26 28943 - 29677 731 244 aa, chain + ## HITS:1 COG:ECs3261 KEGG:ns NR:ns ## COG: ECs3261 COG3279 # Protein_GI_number: 15832515 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Escherichia coli O157:H7 # 1 244 1 244 244 483 100.0 1e-136 MKVIIVEDEFLAQQELSWLIKEHSQMEIVGTFDDGLDVLKFLQHNRVDAIFLDINIPSLD GVLLAQNISQFAHKPFIVFITAWKEHAVEAFELEAFDYILKPYQESRITGMLQKLEAAWQ QQQTSSTPAATVTRENDTINLVKDERIIVTPINDIYYAEAHEKMTFVYTRRESYVMPMNI TEFCSKLPPSHFFRCHRSFCVNLNKIREIEPWFNNTYILRLKDLDFEVPVSRSKVKEFRQ LMHL >gi|296493400|gb|ADTK01000101.1| GENE 27 29690 - 30494 535 268 aa, chain + ## HITS:1 COG:ECs3262 KEGG:ns NR:ns ## COG: ECs3262 COG2207 # Protein_GI_number: 15832516 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 268 1 268 285 540 99.0 1e-154 MKAPGLPADQQFFADLFSGLVLNPQLLGRVWFASQPASLPVGSLCIDFPRLDIVLRGEYG NLLEAKQQRMVEGEMLFIPARAANLPINNKPVMLLSLVFAPTWLGLSFYDSRTTSLLHPA RQIQLPSLQRGEGEAMLTALTHLSRSPLEQNIIQPLVLSLLHLCRNVVNMPPGNSQPRGD FLYHSICNWVQDNYAQPLTRESVAQFFNITPNHLSKLFAQHGTMRFIEYVRWVRMAKARM 
ILQKYHLSIHEVAQRCGFPDSDYFCRVF Prediction of potential genes in microbial genomes Time: Mon May 16 15:18:19 2011 Seq name: gi|296493399|gb|ADTK01000102.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont287.2, whole genome shotgun sequence Length of sequence - 5538 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.000 - CDS 40 - 2535 2552 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) 2 1 Op 2 3/0.000 - CDS 2560 - 3597 903 ## COG1363 Cellulase M and related proteins 3 1 Op 3 3/0.000 - CDS 3597 - 4682 1102 ## COG0006 Xaa-Pro aminopeptidase 4 1 Op 4 . - CDS 4697 - 5536 964 ## COG1299 Phosphotransferase system, fructose-specific IIC component Predicted protein(s) >gi|296493399|gb|ADTK01000102.1| GENE 1 40 - 2535 2552 831 aa, chain - ## HITS:1 COG:ECs3263_2 KEGG:ns NR:ns ## COG: ECs3263_2 COG1080 # Protein_GI_number: 15832517 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Escherichia coli O157:H7 # 98 683 1 586 586 1117 99.0 0 MLTIQFLCPLPNGLHARPAWELKEQCSQWQSEITFFNHRQNAKADAKSSLALIGTGTLFN DSCSLNISGSDEEQARRVLEEYILVRFIDSDSVQPTQAELTAHPLPRSLSRLNPDLLYGN VLASGVGVGTLTLLQSDSLDSYRAIPASAQDSTRLEHSLATLAEQLNQQLRERDGESKTI LSAHLSLIQDDEFAGNIRRLMTEQHQGLGAAIISNMEQVCAKLSASASDYLRERVSDIRD ISEQLLHITWPELKPRNNLVLEKPTILVAEDLTPSQFLSLDLKNLAGMILEKTGRTSHTL ILARASAIPVLSGLPLDAIARYAGQPAVLDAQCGVLAINPNDAVSGYYQVAQTLADKRQK QQAQAAAQLAYSRDNKRIDIAANIGTALEAPGAFANGAEGVGLFRTEMLYMDRDSAPDEQ EQFEAYQQVLLAAGDKPIIFRTMDIGGDKSIPYLNIPQEENPFLGYRAVRIYPEFAGLFR TQLRAILRAASFGNAQLMIPMVHSLDQILWVKGEIQKAIVELKRDGLRHAETITLGIMVE VPSVCYIIDHFCDEVDFFSIGSNDMTQYLYAVDRNNPRVSPLYNPITPSFLRMLQQIVTT AHQRGKWVGICGELGGESRYLPLLLGLGLDELSMSSPRIPAVKSQLRQLDSEACRELARQ ACECRSAQEIEALLTAFTPEEDVRPLLALENIFVDQDFSNKEQAIQFLCGNLGVNGRTEH PFELEEDVWQREEIVTTGVGFGVAIPHTKSQWIRHSSISIARLVKPVDWQSEMGEVELVI 
MLTLGANEGMNHVKVFSQLARKLVNKNFRQSLFAAQDAQSILTLLETELTF >gi|296493399|gb|ADTK01000102.1| GENE 2 2560 - 3597 903 345 aa, chain - ## HITS:1 COG:ypdE KEGG:ns NR:ns ## COG: ypdE COG1363 # Protein_GI_number: 16130316 # Func_class: G Carbohydrate transport and metabolism # Function: Cellulase M and related proteins # Organism: Escherichia coli K12 # 1 345 1 345 345 674 99.0 0 MDLSLLKALSEADAIASSEQEVRQILLEEADRLQKEVRFDGLGSVLIRLNESTGPKVMIC AHMDEVGFMVRSISREGAIDVLPVGNVRMAARQLQPVRITTREECKIPGLLDGDRQGNDV SAMRVDIGARSYDEVMQAGIRPGDRVTFDTTFQVLPHQRVMGKAFDDRLGCYLLVTLLRE LHDAELPAEVWLVASSSEEVGLRGGQTATRAVSPDVAIVLDTACWAKNFDYGAANHRQIG NGPMLVLSDKSLIAPPKLTAWIETVAAEIGVPLQADMFSNGGTDGGAVHLTGTGVPTVVM GPATRHGHCAASIADCRDILQMEQLLSALIQRLTRETVVQLTDFR >gi|296493399|gb|ADTK01000102.1| GENE 3 3597 - 4682 1102 361 aa, chain - ## HITS:1 COG:ypdF KEGG:ns NR:ns ## COG: ypdF COG0006 # Protein_GI_number: 16130317 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Escherichia coli K12 # 1 361 1 361 361 696 97.0 0 MTLLASLRDWLKAQQLDAVLLSSRQNKQPHLGISTGSGYVVISRESAHILVDSRYFVEVE ARAQGYQLHLLDATNTLTTIVNQIIADEQLQTLGFEGQQVSWETAHRWKSELNAKLVSAT PDVLRQIKTPEEVEIIRLACGIADRGAEHIRRFIQAGMSEREIAAELEWFMRQQGAEKAS FDTIVASGWRGALPHGKASDKIVAAGEFVTLDFGALYQGYCSDMTRTLLVNGEGVSAESH PLFNVYQIVLQAQLAAISAIRPGVRCQQVDDAARQVITEAGFSHYFGHNTGHAIGIEVHE DPRFSPRDTTTLQPGMLLTVEPGIYLPGQGGVRIEDVVLVTPQGAEVLYAMPKTVLLTGE A >gi|296493399|gb|ADTK01000102.1| GENE 4 4697 - 5536 964 279 aa, chain - ## HITS:1 COG:ECs3266 KEGG:ns NR:ns ## COG: ECs3266 COG1299 # Protein_GI_number: 15832520 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Escherichia coli O157:H7 # 1 279 137 415 415 476 100.0 1e-134 MSTQPTQLLNFDPSTMQWATSSPVPSTFIGALIISIVAGYLVKWMNQKIQLPDFLLAFKT TFLLPILSAIFVMLAMYYVITPFGGWINGGIRTVLTAAGEKGALMYAMGIAAATAIDLGG PINKAAGFVAFSFTTDHVLPVTARSIAIVIPPIGLGLATIIDRRLTGKRLFNAQLYPQGK TAMFLAFMGISEGAIPFALESPITAIPSYMVGAIVGSTAAVWLGAVQWFPESAIWAWPLV 
TNLGVYMAGIALGAVITALMVVFLRLMMFRKGKLLIDSL Prediction of potential genes in microbial genomes Time: Mon May 16 15:18:23 2011 Seq name: gi|296493398|gb|ADTK01000103.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont287.3, whole genome shotgun sequence Length of sequence - 12368 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 9, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 2 1 Op 2 3/1.000 - CDS 436 - 762 537 ## COG1445 Phosphotransferase system fructose-specific component IIB - Prom 792 - 851 3.0 - Term 932 - 967 5.1 3 2 Tu 1 . - CDS 981 - 1946 1024 ## COG0837 Glucokinase - Prom 1981 - 2040 2.5 + Prom 1916 - 1975 5.0 4 3 Tu 1 . + CDS 2150 - 3406 956 ## COG0038 Chloride channel protein EriC + Term 3449 - 3482 1.3 + Prom 3422 - 3481 2.6 5 4 Tu 1 . + CDS 3521 - 3847 344 ## ECDH10B_2555 hypothetical protein - Term 3860 - 3903 3.9 6 5 Tu 1 . - CDS 3988 - 5163 1476 ## COG1914 Mn2+ and Fe2+ transporters of the NRAMP family + Prom 5429 - 5488 5.0 7 6 Tu 1 . + CDS 5562 - 6764 1437 ## COG1972 Nucleoside permease + Term 6772 - 6822 12.7 - Term 6756 - 6813 16.6 8 7 Tu 1 . - CDS 6814 - 9003 1511 ## COG2200 FOG: EAL domain - Prom 9038 - 9097 3.2 - TRNA 9212 - 9287 86.5 # Ala GGC 0 0 - TRNA 9327 - 9402 86.5 # Ala GGC 0 0 + Prom 9532 - 9591 5.6 9 8 Op 1 . + CDS 9623 - 9982 404 ## SDY_2596 hypothetical protein 10 8 Op 2 . + CDS 9984 - 10376 205 ## JW2394 predicted DNA-binding transcriptional regulator + Term 10385 - 10419 3.5 - Term 10368 - 10413 6.3 11 9 Tu 1 . 
- CDS 10428 - 11843 1580 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases - Prom 11902 - 11961 4.9 + TRNA 12102 - 12177 94.3 # Val TAC 0 0 + TRNA 12222 - 12297 94.3 # Val TAC 0 0 Predicted protein(s) >gi|296493398|gb|ADTK01000103.1| GENE 1 3 - 414 481 137 aa, chain - ## HITS:1 COG:ECs3266 KEGG:ns NR:ns ## COG: ECs3266 COG1299 # Protein_GI_number: 15832520 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Escherichia coli O157:H7 # 1 135 1 135 415 230 100.0 5e-61 MAIKKRSATVVPGASGAAAAVKNPQASKSSFWGELPQHVMSGISRMVPTLIMGGVILAFS QLIAYSWLKIPAEIGIMDALNSGKFSGFDLSLLKFAWLSQSFGGVLFGFAIPMFAAFVAN SIGGKLAFPAGFIGGGG >gi|296493398|gb|ADTK01000103.1| GENE 2 436 - 762 537 108 aa, chain - ## HITS:1 COG:ECs3267 KEGG:ns NR:ns ## COG: ECs3267 COG1445 # Protein_GI_number: 15832521 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system fructose-specific component IIB # Organism: Escherichia coli O157:H7 # 1 108 1 108 108 169 100.0 1e-42 MSKKLIALCACPMGLAHTFMAAQALEEAAVEAGYEVKIETQGADGIQNRLTAQDIAEATI IIHSVAVTPEDNERFESRDVYEITLQDAIKNAAGIIKEIEEMIASEQQ >gi|296493398|gb|ADTK01000103.1| GENE 3 981 - 1946 1024 321 aa, chain - ## HITS:1 COG:ECs3268 KEGG:ns NR:ns ## COG: ECs3268 COG0837 # Protein_GI_number: 15832522 # Func_class: G Carbohydrate transport and metabolism # Function: Glucokinase # Organism: Escherichia coli O157:H7 # 1 321 1 321 321 654 99.0 0 MTKYALVGDVGGTNARLALCDIASGEISQAKTYSGLDYPSLEAVIRVYLEEHKVEVKDGC IAIACPITGDWVAMTNHTWAFSIAEMKKNLGFSHLEIINDFTAVSMAIPMLKKEHLIQFG GAEPVEGKPIAVYGAGTGLGVAHLVHVDKRWVSLPGEGGHVDFAPNSEEEAIILEILRAE IGHVSAERVLSGPGLVNLYRAIVKADNRLPENLKPKDITERALADSCTDCRRALSLFCVI MGRFGGNLALTLGTFGGVYIAGGIVPRFLEFFKASGFRAAFEDKGRFKEYVHDIPVYLIV HNNPGLLGSGAHLRQTLGHIL >gi|296493398|gb|ADTK01000103.1| GENE 4 2150 - 3406 956 418 aa, chain + ## HITS:1 COG:ECs3269 KEGG:ns NR:ns ## COG: ECs3269 COG0038 # Protein_GI_number: 15832523 # Func_class: P Inorganic ion transport 
and metabolism # Function: Chloride channel protein EriC # Organism: Escherichia coli O157:H7 # 1 418 1 418 418 634 99.0 0 MLHPRARIMLLLSLPAVAIGIASSLILIVVMKIASVLQNLLWQRLPGTLGIAQDSPLWII GVLTLTGIAVGLVIRFSQGHAGPDPACEPLIGAPVPPSALPGLIVALILGLAGGVSLGPE HPIMTVNIALAVAIGARLLPRVNRMEWTILASAGTIGALFGTPVAAALIFSQTLNGSSEV PLWDRLFAPLMAAAAGALTTGLFFHPHFSLPIAHYGQMEMTDILSGAIVAAIAIAAGMVA VWCLPRLHAMMNQMKNPVLVLGIGGFILGILGVIGGPVSLFKGLDEMQQMVANQAFSTSD YFLLAVIKLAALVVAAASGFRGGRIFPAVFVGVALGLMLHEHVPAVPAAITVSCAILGIV LVVTRDGWLSLFMAAVVVPNTTLLPLLCIVMLPAWLLLAGKPMMMVNRPKQQPPHDNV >gi|296493398|gb|ADTK01000103.1| GENE 5 3521 - 3847 344 108 aa, chain + ## HITS:1 COG:no KEGG:ECDH10B_2555 NR:ns ## KEGG: ECDH10B_2555 # Name: ypeC # Def: hypothetical protein # Organism: E.coli_DH10B # Pathway: not_defined # 1 108 1 108 108 165 100.0 6e-40 MFRSLFLAAALMAFTPLAANAGEITLLPSIKLQIGDRDHYGNYWDGGHWRDRDYWHRNYE WRKNRWWRHDNGYHRGWDKRKAYERGYREGWRDRDDHRGKGRGHGHRH >gi|296493398|gb|ADTK01000103.1| GENE 6 3988 - 5163 1476 391 aa, chain - ## HITS:1 COG:ECs3271 KEGG:ns NR:ns ## COG: ECs3271 COG1914 # Protein_GI_number: 15832525 # Func_class: P Inorganic ion transport and metabolism # Function: Mn2+ and Fe2+ transporters of the NRAMP family # Organism: Escherichia coli O157:H7 # 1 391 22 412 412 657 99.0 0 MGPAFIAAIGYIDPGNFATNIQAGASFGYQLLWVVVWANLMAMLIQLLSAKLGIATGKNL AEQIRDHYPRPVVWFYWVQAEIIAMATDLAEFIGAAIGFKLILGVSLLQGAVLTGIATFL ILMLQRRGQKPLEKVIGGLLLFVAAAYIVELIFSQPNLAQLGKGMVIPSLPTSEAVFLAA GVLGATIMPHVIYLHSSLTQHLHGGSRQQRYSATKWDVAIAMTIAGFVNLAMMATAAAAF HFSGHTGVADLDEAYLTLQPLLSHAAATVFGLSLVAAGLSSTVVGTLAGQVVMQGFIRFH IPLWVRRTVTMLPSFIVILMGLDPTRILVMSQVLLSFGIALALVPLLIFTSDSKLMGDLV NSKRVKQTGWVIVVLVVALNIWLLVGTALGL >gi|296493398|gb|ADTK01000103.1| GENE 7 5562 - 6764 1437 400 aa, chain + ## HITS:1 COG:ECs3272 KEGG:ns NR:ns ## COG: ECs3272 COG1972 # Protein_GI_number: 15832526 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside permease # Organism: Escherichia coli O157:H7 # 1 400 1 400 400 653 99.0 0 
MDRVLHFVLALAVVAILALLVSSDRKKIRIRYVIQLLVIEVLLAWFFLNSDVGLGFVKGF SEMFEKLLGFANEGTNFVFGSMNDQGLAFFFLKVLCPIVFISALIGILQHIRVLPVIIRA IGFLLSKVNGMGKLESFNAVSSLILGQSENFIAYKDILGKISRNRMYTMAATAMSTVSMS IVGAYMTMLEPKYVVAALVLNMFSTFIVLSLINPYRVDASEENIQMSNLHEGQSFFEMLG EYILAGFKVAIIVAAMLIGFIALIAALNALFATVTGWVGYSISFQGILGYIFYPIAWVMG VPSSEALQVGSIMATKLVSNEFVAMMDLQKIASTLSPRAEGIISVFLVSFANFSSIGIIA GAVKGLNEEQGNVVSRFGLKLVYGSTLVSVLSASIAALVL >gi|296493398|gb|ADTK01000103.1| GENE 8 6814 - 9003 1511 729 aa, chain - ## HITS:1 COG:yfeA_3 KEGG:ns NR:ns ## COG: yfeA_3 COG2200 # Protein_GI_number: 16130327 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 472 729 1 258 258 524 100.0 1e-148 MFVEHNLIKNIKIFTLAFTLTVVLIQLSRFISPLAIIHSSYIFLAWMPLCVMLSILFIFG WRGVVPVLCGMFCTNLWNFHLSFLQTAVMLGSQTFVVLCACAILRWQLGTRWRYGLTSRY VWQRLFWLGLVTPIGIKCSMYLVGSFFDFPLKISTFFGDADAIFTVVDLLSLFTAVLIYN MLFYYLTRMIVSPHFAQILWRRDIAPSLGKEKRAFTLSWLAALSVLLLLLCTPYENDFIA GYLVPVFFIIFTLGVGKLRYPFLNLTWAVSTLCLLNYNQNFLQGVETEYSLAFILAVLIS FSVCLLYMVRIYHRSEWLNRRWHLQALTDPLTLLPNFRALEQAPEQEAGKSFCCLRIDNL EFMSRHYGLMMRVHCIRSICRTLLPLMQENEKLYQLPGSELLLVLSGPETEGRLQHMVNI LNSRQIHWNNTGLDMGYGAAWGRFDGNQETLQPLLGQLSWLAEQSCAHHHVLALDSREEM VSGQTTKQVLLLNTIRTALDQGDLLLYAQPIRNKEGEGYDEILARLKYDGGIMTPDKFLP LIAQFNLSARFDLQVLESLLKWLATHPCDKKGPRFSVNLMPLTLLQKNIAGRIIRLFKRY HISPQAVILEITEEQAFSNAESSMYNIEQLHKFGFRIAIDDFGTGYANYERLKRLQADII KIDGVFVKDIVTNTLDAMIVRSITDLAKAKSLSVVAEFVETQQQQALLHKLGVQYLQGYL IGRPQPLAD >gi|296493398|gb|ADTK01000103.1| GENE 9 9623 - 9982 404 119 aa, chain + ## HITS:1 COG:no KEGG:SDY_2596 NR:ns ## KEGG: SDY_2596 # Name: yfeC # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 119 1 119 119 216 100.0 2e-55 MFKERMTPDELARLTGYSRQTINKWVRKEGWTTSPKPGVQGGKARLVHVNEQVREYIRNA ERPEGQGEAPALSGDAPLEVLLVTLAKEMTPVEQKQFTSLLLREGIIGLLQRLGIRDSK >gi|296493398|gb|ADTK01000103.1| GENE 10 9984 - 10376 205 130 aa, chain + ## HITS:1 COG:no KEGG:JW2394 NR:ns ## KEGG: JW2394 # Name: yfeD # Def: predicted DNA-binding 
transcriptional regulator # Organism: E.coli_J # Pathway: not_defined # 1 130 1 130 130 250 100.0 1e-65 MKRLRNKMTTEELAECLGVAKQTVNRWIREKGWKTEKFPGVKGGRARLILVDTQVCEFIQ NTPAFHNTPMLMEAEERIAEYAPGARAPAYRQIINAIDNMTDIEQEKVAQFLSREGIRNF LARLDIDESA >gi|296493398|gb|ADTK01000103.1| GENE 11 10428 - 11843 1580 471 aa, chain - ## HITS:1 COG:gltX KEGG:ns NR:ns ## COG: gltX COG0008 # Protein_GI_number: 16130330 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Escherichia coli K12 # 1 471 1 471 471 983 100.0 0 MKIKTRFAPSPTGYLHVGGARTALYSWLFARNHGGEFVLRIEDTDLERSTPEAIEAIMDG MNWLSLEWDEGPYYQTKRFDRYNAVIDQMLEEGTAYKCYCSKERLEALREEQMAKGEKPR YDGRCRHSHEHHADDEPCVVRFANPQEGSVVFDDQIRGPIEFSNQELDDLIIRRTDGSPT YNFCVVVDDWDMEITHVIRGEDHINNTPRQINILKALKAPVPVYAHVSMINGDDGKKLSK RHGAVSVMQYRDDGYLPEALLNYLVRLGWSHGDQEIFTREEMIKYFTLNAVSKSASAFNT DKLLWLNHHYINALPPEYVATHLQWHIEQENIDTRNGPQLADLVKLLGERCKTLKEMAQS CRYFYEDFAEFDADAAKKHLRPVARQPLEVVRDKLAAITDWTAENVHHAIQATADELEVG MGKVGMPLRVAVTGAGQSPALDVTVHAIGKTRSIERINKALDFIAERENQQ Prediction of potential genes in microbial genomes Time: Mon May 16 15:18:36 2011 Seq name: gi|296493397|gb|ADTK01000104.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont287.4, whole genome shotgun sequence Length of sequence - 15024 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 9, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 142 - 168 0.1 1 1 Op 1 . - CDS 252 - 1136 488 ## COG0583 Transcriptional regulator 2 1 Op 2 . - CDS 1177 - 1335 96 ## EcSMS35_2559 hypothetical protein - Term 1345 - 1379 5.2 3 2 Op 1 1/1.000 - CDS 1388 - 2644 975 ## COG0477 Permeases of the major facilitator superfamily 4 2 Op 2 . - CDS 2704 - 3537 878 ## COG0005 Purine nucleoside phosphorylase - Prom 3743 - 3802 3.8 5 3 Tu 1 . + CDS 3786 - 4550 783 ## ECS88_2597 hypothetical protein + Term 4552 - 4599 2.5 6 4 Tu 1 . 
- CDS 4589 - 5515 646 ## COG0583 Transcriptional regulator - Prom 5541 - 5600 3.7 + Prom 5489 - 5548 5.0 7 5 Tu 1 . + CDS 5605 - 6603 701 ## COG0385 Predicted Na+-dependent transporter + Term 6614 - 6659 3.1 - Term 6552 - 6596 -0.8 8 6 Op 1 3/0.667 - CDS 6600 - 6818 274 ## COG3530 Uncharacterized protein conserved in bacteria 9 6 Op 2 7/0.333 - CDS 6820 - 8835 2140 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II) - Term 8846 - 8874 1.3 10 6 Op 3 . - CDS 8906 - 9889 769 ## COG3115 Cell division protein - Prom 10065 - 10124 2.7 + Prom 10024 - 10083 3.3 11 7 Tu 1 8/0.333 + CDS 10248 - 10883 537 ## COG2981 Uncharacterized protein involved in cysteine biosynthesis + Term 10923 - 10955 4.1 + Prom 10976 - 11035 4.4 12 8 Tu 1 6/0.667 + CDS 11068 - 12039 702 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 + Term 12068 - 12095 0.1 + Prom 12200 - 12259 5.8 13 9 Op 1 25/0.000 + CDS 12423 - 12680 357 ## COG1925 Phosphotransferase system, HPr-related proteins 14 9 Op 2 10/0.000 + CDS 12725 - 14452 1796 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) 15 9 Op 3 . 
+ CDS 14493 - 15002 634 ## COG2190 Phosphotransferase system IIA components Predicted protein(s) >gi|296493397|gb|ADTK01000104.1| GENE 1 252 - 1136 488 294 aa, chain - ## HITS:1 COG:xapR KEGG:ns NR:ns ## COG: xapR COG0583 # Protein_GI_number: 16130331 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 294 1 294 294 561 99.0 1e-160 MERVYRTDLKLLRYFLAVAEELHFGRAAARLNMSQPPLSIHIKELENQLGTQLFIRHSRS VVLTHAGKILMEESRRLLVNANNVLARVEQIGRGEAGRIELGVVGTAMWGRMRPVMRRFL RENPNVEVLFREKMPAMQMALLERRELDAGIWRMATEPPTGFTSLRLHESAFLVAMPEEH HLSSFSTVPLEALRDEYFVTMPPVYTDWDFLQRVCQQVGFSPVVIREVNEPQTVLAMVSM GIGITLIADSYAQMNWPGVIFRPLKQRIPADLYIVYETQQVTPAMVKLLAALTQ >gi|296493397|gb|ADTK01000104.1| GENE 2 1177 - 1335 96 52 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_2559 NR:ns ## KEGG: EcSMS35_2559 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 52 1 52 52 69 100.0 3e-11 MKDILFSSGVGFGIGALFTIVRLPIPVPNVLPGILSIVFMYVGYLVVKYFMP >gi|296493397|gb|ADTK01000104.1| GENE 3 1388 - 2644 975 418 aa, chain - ## HITS:1 COG:xapB KEGG:ns NR:ns ## COG: xapB COG0477 # Protein_GI_number: 16130332 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 418 1 418 418 704 99.0 0 MSIAMRLKVMSFLQYFIWGSWLVTLGSYMINTLHFTGANVGMVYSSKGIAAIIMPGIMGI IADKWLRAERAYMLCHLVCAGVLFYAASVTDPDMMFWVMLVNAMAFMPTIALSNSVSYSC LAQAGLDPVTAFPPIRVFGTVGFIVAMWAVSLLHLELNSLQLYIASGASLLLSAYALTLP KIPVAEKKATTSLASKLGLDAFVLFKNPRMAIFFLFAMMLGAVLQITNVFGNPFLHDFAR NPEFADSFVVKYPSILLSVSQMAEVGFILTIPFFLKRFGIKTVMLMSMVAWTLRFGFFAY GDPSTTGFILLLLSMIVYGCAFDFFNISGSVFVEQEVDSSIRASAQGLFMTMVNGVGAWV GSILSGMAVDYFSVDGVKDWQTIWLVFAGYALFLAVIFFFGFKYNHDPEKIKHRAVAH >gi|296493397|gb|ADTK01000104.1| GENE 4 2704 - 3537 878 277 aa, chain - ## HITS:1 COG:xapA KEGG:ns NR:ns ## COG: xapA COG0005 # 
Protein_GI_number: 16130333 # Func_class: F Nucleotide transport and metabolism # Function: Purine nucleoside phosphorylase # Organism: Escherichia coli K12 # 1 277 1 277 277 563 100.0 1e-160 MSQVQFSHNPLFCIDIIKTYKPDFTPRVAFILGSGLGALADQIENAVAISYEKLPGFPVS TVHGHAGELVLGHLQGVPVVCMKGRGHFYEGRGMTIMTDAIRTFKLLGCELLFCTNAAGS LRPEVGAGSLVALKDHINTMPGTPMVGLNDDRFGERFFSLANAYDAEYRALLQKVAKEEG FPLTEGVFVSYPGPNFETAAEIRMMQIIGGDVVGMSVVPEVISARHCDLKVVAVSAITNM AEGLSDVKLSHAQTLAAAELSKQNFINLICGFLRKIA >gi|296493397|gb|ADTK01000104.1| GENE 5 3786 - 4550 783 254 aa, chain + ## HITS:1 COG:no KEGG:ECS88_2597 NR:ns ## KEGG: ECS88_2597 # Name: yfeN # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 254 1 254 254 442 96.0 1e-123 MKKHLLTLTLSSILAIPVVSHAEFKGGFADIGVHYLDWTSRTTEKSSTKSHKDDFGYLEF EGGANFSWGEMYGFFDWENFYNGRHNKPGSEQRYTFKNTNRIYLGDTGFNLYLHAYGTYG SANRVNFHDDMFLYGIGYNFTGSGWWFKPFFAKRYTDQTYYTGDNGYVAGWVAGYNFMLG SEKFTLTNWNEYEFDRDATYAAGNGGKEGLNGAVALWWNATSHITTGIQYRYADDKLGED FYQDAIIYSIKFNF >gi|296493397|gb|ADTK01000104.1| GENE 6 4589 - 5515 646 308 aa, chain - ## HITS:1 COG:yfeR KEGG:ns NR:ns ## COG: yfeR COG0583 # Protein_GI_number: 16130335 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 308 1 308 308 575 100.0 1e-164 MNYSLKQLKVFVTVAQEKSFSRAGERIGLSQSAVSHSVKELENHTGVRLLDRTTREVVLT DAGQQLALRLERLLDELNSTLRDTGRMGQQLSGKVRVAASQTISAHLIPQCIAESHRRYP DIQFVLHDRPQQWVMESIRQGDVDFGIVIDPGPVGDLQCEAILSEPFFLLCHRDSALAVE DYVPWQALQGAKLVLQDYASGSRPLIDAALARNGIQANIVQEIGHPATLFPMVAAGIGIS ILPALALPLPEGSPLVVKRITPVVERQLMLVRRKNRSLSTAAEALWDVVRDQGNALMAGR EGDPLYQI >gi|296493397|gb|ADTK01000104.1| GENE 7 5605 - 6603 701 332 aa, chain + ## HITS:1 COG:ECs3282 KEGG:ns NR:ns ## COG: ECs3282 COG0385 # Protein_GI_number: 15832536 # Func_class: R General function prediction only # Function: Predicted Na+-dependent transporter # Organism: Escherichia coli O157:H7 # 1 332 1 332 332 592 99.0 1e-169 MKLFRILDPFTLTLITVVLLASFFPARGDFVPFFENLTTAAIALLFFMHGAKLSREAIIA 
GGGHWRLHLWVICSTFVLFPILGVLFAWWKPVNVDPMLYSGFLYLCILPATVQSAIAFTS MAGGNVAAAVCSASASSLLGIFLSPLLVGLVMNVHGAGGSLEQVGKIMLQLLLPFVLGHL SRPWIGDWVSRNKKWIAKTDQTSILLVVYTAFSEAVVNGIWHKVGWGSLLFIVVVSCVLL AIVIVVNVFMARRLGFNKADEITIVFCGSKKSLANGIPMANILFPTSVIGMMVLPLMIFH QIQLMVCAVLARRYKRQTEQLQAQQESSADKA >gi|296493397|gb|ADTK01000104.1| GENE 8 6600 - 6818 274 72 aa, chain - ## HITS:1 COG:Z3676 KEGG:ns NR:ns ## COG: Z3676 COG3530 # Protein_GI_number: 15802943 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 EDL933 # 1 72 1 72 72 123 100.0 9e-29 MEKEQLIEIANTIMPFGKYKGRRLIDLPEEYLLWFARKDEFPAGKLGELMQITLLIKTEG LTQLVQPLKRPL >gi|296493397|gb|ADTK01000104.1| GENE 9 6820 - 8835 2140 671 aa, chain - ## HITS:1 COG:lig KEGG:ns NR:ns ## COG: lig COG0272 # Protein_GI_number: 16130337 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Escherichia coli K12 # 1 671 1 671 671 1314 99.0 0 MESIEQQLTELRTTLRHHEYLYHVMDAPEIPDAEYDRLMRELRELETKHPELITPDSPTQ RVGAAPLAAFSQIRHEVPMLSLDNVFDEESFLAFNKRVQDRLKNNEKVTWCCELKLDGLA VSILYENGVLVSAATRGDGTTGEDITSNVRTIRAIPLKLHGENIPARLEVRGEVFLPQAG FEKINEDARRTGGKVFANPRNAAAGSLRQLDPRITAKRPLTFFCYGVGVLEGGELPDTHL GRLLQFKKWGLPVSDRVTLCESAEEVLAFYHKVEEDRPTLGFDIDGVVIKVNSLEQQEQL GFVARAPRWAVAFKFPAQEQMTFVRDVEFQVGRTGAITPVARLEPVHVAGVLVSNATLHN ADEIERLGLRIGDKVVIRRAGDVIPQVVNVVLSERPEDTREVVFPTHCPVCGSDVERVEG EAVARCTGGLICGAQRKESLKHFVSRRAMDVDGMGDKIIDQLVEKEYVHTPADLFKLTAG KLTGLERMGPKSAQNVVNALEKAKETTFARFLYALGIREVGEATAAGLAAYFGTLEALEA ASIEELQKVPDVGIVVASHVHNFFAEESNRNVISELLAEGVHWPAPIVINAEEIDSPFAG KTVVLTGSLSQMSRDDAKARLVELGAKVAGSVSKKTDLVIAGEAAGSKLAKAQELGIEVI DEAEMLRLLGS >gi|296493397|gb|ADTK01000104.1| GENE 10 8906 - 9889 769 327 aa, chain - ## HITS:1 COG:ZzipA KEGG:ns NR:ns ## COG: ZzipA COG3115 # Protein_GI_number: 15802945 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Escherichia coli 
O157:H7 EDL933 # 1 327 2 328 328 474 99.0 1e-133 MQDLRLILIIVGAIAIIALLVHGFWTSRKERSSMFRDRPLKRMKSKRDDDSYDEDVEDDE GVGEVRVHRVNHAPANAQEHEAARPSPQHQYQPPYASAQPRQPVQQPPEAQVPPQHAPRP AQPVQQPAYQPQPEQPLQQPVSPQVAPAPQPVHSAPQPAQQAFQPAEPVAAPQPEPVAEP APVMDKPKRKEAVIIMNVAAHHGSELNGELLLNSIQQAGFIFGDMNIYHRHLSPDGSGPA LFSLANMVKPGTFDPEMKDFTTPGVTIFMQVPSYGDELQNFKLMLQSAQHIADEVGGVVL DDQRRMMTPQKLREYQDIIREVKDANA >gi|296493397|gb|ADTK01000104.1| GENE 11 10248 - 10883 537 211 aa, chain + ## HITS:1 COG:ECs3285 KEGG:ns NR:ns ## COG: ECs3285 COG2981 # Protein_GI_number: 15832539 # Func_class: E Amino acid transport and metabolism # Function: Uncharacterized protein involved in cysteine biosynthesis # Organism: Escherichia coli O157:H7 # 1 211 43 253 253 382 99.0 1e-106 MGGAFWWLFTQLDVWIPTLMSYVPDWLQWLSYLLWPLAVISVLLVFGYFFSTIANWIAAP FNGLLAEQLEARLTGATPPDTGIFGIMKDVPRIMKREWQKFAWYLPRAIVLLILYLIPGI GQTVAPVLWFLFSAWMLAIQYCDYPFDNHKVPFKEMRTALRTRKITNMQFGALTSLFTMI PLLNLFIMPVAVCGATAMWVDCYRDKHAMWR >gi|296493397|gb|ADTK01000104.1| GENE 12 11068 - 12039 702 323 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 4 312 3 304 308 275 49 2e-73 MSKIFEDNSLTIGHTPLVRLNRIGNGRILAKVESRNPSFSVKCRIGANMIWDAEKRGVLK PGVELVEPTSGNTGIALAYVAAARGYKLTLTMPETMSIERRKLLKALGANLVLTEGAKGM KGAIQKAEEIVASNPEKYLLLQQFSNPANPEIHEKTTGPEIWEDTDGQVDVFIAGVGTGG TLTGVSRYIKGTKGKTDLISVAVEPTDSPVIAQALAGEEIKPGPHKIQGIGAGFIPANLD LKLVDKVIGITNEEAISTARRLMEEEGILAGISSGAAVAAALKLQEDESFTNKNIVVILP SSGERYLSTALFADLFTEKELQQ >gi|296493397|gb|ADTK01000104.1| GENE 13 12423 - 12680 357 85 aa, chain + ## HITS:1 COG:ECs3287 KEGG:ns NR:ns ## COG: ECs3287 COG1925 # Protein_GI_number: 15832541 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Escherichia coli O157:H7 # 1 85 1 85 85 129 100.0 2e-30 MFQQEVTITAPNGLHTRPAAQFVKEAKGFTSEITVTSNGKSASAKSLFKLQTLGLTQGTV VTISAEGEDEQKAVEHLVKLMAELE >gi|296493397|gb|ADTK01000104.1| GENE 14 12725 - 14452 
1796 575 aa, chain + ## HITS:1 COG:ECs3288 KEGG:ns NR:ns ## COG: ECs3288 COG1080 # Protein_GI_number: 15832542 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Escherichia coli O157:H7 # 1 575 1 575 575 1043 100.0 0 MISGILASPGIAFGKALLLKEDEIVIDRKKISADQVDQEVERFLSGRAKASAQLETIKTK AGETFGEEKEAIFEGHIMLLEDEELEQEIIALIKDKHMTADAAAHEVIEGQASALEELDD EYLKERAADVRDIGKRLLRNILGLKIIDLSAIQDEVILVAADLTPSETAQLNLKKVLGFI TDAGGRTSHTSIMARSLELPAIVGTGSVTSQVKNDDYLILDAVNNQVYVNPTNEVIDKMR AVQEQVASEKAELAKLKDLPAITLDGHQVEVCANIGTVRDVEGAERNGAEGVGLYRTEFL FMDRDALPTEEEQFAAYKAVAEACGSQAVIVRTMDIGGDKELPYMNFPKEENPFLGWRAI RIAMDRKEILRDQLRAILRASAFGKLRIMFPMIISVEEVRALRKEIEIYKQELRDEGKAF DESIEIGVMVETPAAATIARHLAKEVDFFSIGTNDLTQYTLAVDRGNDMISHLYQPMSPS VLNLIKQVIDASHAEGKWTGMCGELAGDERATLLLLGMGLDEFSMSAISIPRIKKIIRNT NFEDAKVLAEQALAQPTTDELMTLVNKFIEEKTIC >gi|296493397|gb|ADTK01000104.1| GENE 15 14493 - 15002 634 169 aa, chain + ## HITS:1 COG:crr KEGG:ns NR:ns ## COG: crr COG2190 # Protein_GI_number: 16130343 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Escherichia coli K12 # 1 169 1 169 169 285 100.0 4e-77 MGLFDKLKSLVSDDKKDTGTIEIIAPLSGEIVNIEDVPDVVFAEKIVGDGIAIKPTGNKM VAPVDGTIGKIFETNHAFSIESDSGVELFVHFGIDTVELKGEGFKRIAEEGQRVKVGDTV IEFDLPLLEEKAKSTLTPVVISNMDEIKELIKLSGSVTVGETPVIRIKK Prediction of potential genes in microbial genomes Time: Mon May 16 15:18:46 2011 Seq name: gi|296493396|gb|ADTK01000105.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont287.5, whole genome shotgun sequence Length of sequence - 11159 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 6, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 832 788 ## COG2240 Pyridoxal/pyridoxine/pyridoxamine kinase - Prom 915 - 974 3.1 + Prom 787 - 846 2.5 2 2 Tu 1 . 
+ CDS 937 - 1305 385 ## ECO103_2938 hypothetical protein 3 3 Op 1 . - CDS 1308 - 2297 562 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 4 3 Op 2 17/0.000 - CDS 2353 - 3450 1346 ## COG1118 ABC-type sulfate/molybdate transport systems, ATPase component 5 3 Op 3 17/0.000 - CDS 3440 - 4315 1116 ## COG4208 ABC-type sulfate transport system, permease component 6 3 Op 4 7/0.000 - CDS 4315 - 5148 939 ## COG0555 ABC-type sulfate transport system, permease component 7 3 Op 5 1/1.000 - CDS 5148 - 6164 1366 ## COG4150 ABC-type sulfate transport system, periplasmic component - Prom 6186 - 6245 4.8 - Term 6286 - 6322 5.1 8 4 Tu 1 . - CDS 6335 - 7126 248 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 - Prom 7175 - 7234 3.9 9 5 Tu 1 . - CDS 7255 - 8112 774 ## COG1737 Transcriptional regulators - Prom 8244 - 8303 3.5 + Prom 8176 - 8235 6.4 10 6 Op 1 9/0.000 + CDS 8276 - 9172 891 ## COG2103 Predicted sugar phosphate isomerase 11 6 Op 2 1/1.000 + CDS 9176 - 10600 1482 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 12 6 Op 3 . 
+ CDS 10605 - 11135 465 ## COG1680 Beta-lactamase class C and other penicillin binding proteins Predicted protein(s) >gi|296493396|gb|ADTK01000105.1| GENE 1 1 - 832 788 277 aa, chain - ## HITS:1 COG:ECs3290 KEGG:ns NR:ns ## COG: ECs3290 COG2240 # Protein_GI_number: 15832544 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal/pyridoxine/pyridoxamine kinase # Organism: Escherichia coli O157:H7 # 1 277 1 277 283 538 98.0 1e-153 MSSLLLFNDKSRALQADIVAVQSQVVYGSVGNSIAVPAIKQNGLNVFAVPTVLLSNTPHY DTFYGGAIPDEWFSGYLRALQERDALRQLRAVTTGYMGTASQIKILAEWLTALRKDHPDL LIMVDPVIGDIDSGIYVKPDLPEAYRQYLLPLAQGITPNIFELEILTGKNCRDLDSAIAA AKSLLSDTLKWVVITSASGNEENQEMQVVVVSADSVNVISHSRVKTDLKGTGDLFCAQLI SGLLKGKALNDAVHRAGLRVLEVMRYTKQNESDELIL >gi|296493396|gb|ADTK01000105.1| GENE 2 937 - 1305 385 122 aa, chain + ## HITS:1 COG:no KEGG:ECO103_2938 NR:ns ## KEGG: ECO103_2938 # Name: yfeK # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 122 1 122 122 234 100.0 5e-61 MKKIICLVITLLMTLPAYAKLTAHEEARINAMLEGLAQKKDLIFVRNGDEHTCDEAVSHL RLKLGNTRNRIDTAEQFIDKVASSSSITGKPYIVKIPGKSDENAQPFLHALIAQTDKTVP AQ >gi|296493396|gb|ADTK01000105.1| GENE 3 1308 - 2297 562 329 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 24 318 1 304 308 221 39 2e-57 MARFVTCRPDKTRKRRIRQHHVWIEIVSTLEQTIGNTPLVKLQRMGPDNGSEVWLKLEGN NPAGSVKDRAALSMIVEAEKRGEIKPGDVLIEATSGNTGIALAMIAALKGYRMKLLMPDN MSQERRAAMRAYGAELILVTKEQGMEGARDLALEMANRGEGKLLDQFNNPDNPYAHYTTT GPEIWQQTGGRITHFVSSMGTTGTITGVSRFMREQSKPVTIVGLQPEEGSSIPGIRRWPA EYLPGIFNASLVDEVLDIHQRDAENTMRELAVREGIFCGVSSGGAVAGALRVAKANPGAV VVAIICDRGDRYLSTGVFGEEHFSQGAGI >gi|296493396|gb|ADTK01000105.1| GENE 4 2353 - 3450 1346 365 aa, chain - ## HITS:1 COG:cysA KEGG:ns NR:ns ## COG: cysA COG1118 # Protein_GI_number: 16130348 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type sulfate/molybdate transport systems, ATPase component # Organism: Escherichia coli K12 # 1 
365 1 365 365 718 99.0 0 MSIEIDNIKKSFGRTQVLNDISLDIPSGQMVALLGPSGSGKTTLLRIIAGLEHQTSGHIR FHGTDVSRLHARDRKVGFVFQHYALFRHMTVFDNIAFGLTVLPRRDRPNAAAIKAKVTKL LEMVQLAHLADRYPAQLSGGQKQRVALARALAVEPQILLLDEPFGALDAQVRKELRRWLR QLHEELKFTSVFVTHDQEEATEVADRVVVMSQGNIEQADAPNQVWREPATRFVLEFMGEV NRLQGTIRGGQFHVGAHRWPLGYTPAYQGPVDLFLRPWEVDISRRTSLDSPLPVQVLEAS PKGHYTQLVVQPLGWYNEPLTVVMHGDDAPQRGERLFVGLQHARLYNGDERIETRDEELA LAQSA >gi|296493396|gb|ADTK01000105.1| GENE 5 3440 - 4315 1116 291 aa, chain - ## HITS:1 COG:cysWm KEGG:ns NR:ns ## COG: cysWm COG4208 # Protein_GI_number: 16132224 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type sulfate transport system, permease component # Organism: Escherichia coli K12 # 1 291 1 291 291 538 100.0 1e-153 MAEVTQLKRYDARPINWGKWFLIGIGMLVSAFILLVPMIYIFVQAFSKGLMPVLQNLADP DMLHAIWLTVMIALIAVPVNLVFGILLAWLVTRFNFPGRQLLLTLLDIPFAVSPVVAGLV YLLFYGSNGPLGGWLDEHNLQIMFSWPGMVLVTIFVTCPFVVRELVPVMLSQGSQEDEAA ILLGASGWQMFRRVTLPNIRWALLYGVVLTNARAIGEFGAVSVVSGSIRGETLSLPLQIE LLEQDYNTVGSFTAAALLTLMAIITLFLKSMLQWRLENQEKRAQQEEHHEH >gi|296493396|gb|ADTK01000105.1| GENE 6 4315 - 5148 939 277 aa, chain - ## HITS:1 COG:cysU KEGG:ns NR:ns ## COG: cysU COG0555 # Protein_GI_number: 16130349 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type sulfate transport system, permease component # Organism: Escherichia coli K12 # 1 277 1 277 277 439 99.0 1e-123 MFAVSSRRVLPGFTLSLGTSLLFVCLILLLPLSALVMQLAQMSWAQYWEVITNPQVVAAY KVTLLSAFVASIFNGVFGLLMAWILTRYRFPGRTLLDALMDLPFALPTAVAGLTLASLFS VNGFYGEWLAKFDIKVTYTWLGIAVAMAFTSIPFVVRTVQPVLEELGPEYEEAAETLGAT RWQSFCKVVLPELSPALVAGVALSFTRSLGEFGAVIFIAGNIAWKTEVTSLMIFVRLQEF DYPAASAIASVILAASLILLFSINTLQSRFGRRVVGH >gi|296493396|gb|ADTK01000105.1| GENE 7 5148 - 6164 1366 338 aa, chain - ## HITS:1 COG:ECs3296 KEGG:ns NR:ns ## COG: ECs3296 COG4150 # Protein_GI_number: 15832550 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type sulfate transport system, periplasmic component # Organism: Escherichia 
coli O157:H7 # 1 338 1 338 338 650 100.0 0 MAVNLLKKNSLALVASLLLAGHVQATELLNSSYDVSRELFAALNPPFEQQWAKDNGGDKL TIKQSHAGSSKQALAILQGLKADVVTYNQVTDVQILHDKGKLIPADWQSRLPNNSSPFYS TMGFLVRKGNPKNIHDWNDLVRSDVKLIFPNPKTSGNARYTYLAAWGAADKADGGDKAKT EQFMTQFLKNVEVFDTGGRGATTTFAERGLGDVLISFESEVNNIRKQYEAQGFEVVIPKT NILAEFPVAWVDKNVQANGTEKAAKAYLNWLYSPQAQTIITDYYYRVNNPEVMDKLKDKF PQTELFRVEDKFGSWPEVMKTHFTSGGELDKLLAAGRN >gi|296493396|gb|ADTK01000105.1| GENE 8 6335 - 7126 248 263 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 7 251 4 238 242 100 30 6e-21 MGKLTGKTALITGALQGIGEGIARTFARHGANLILLDISPEIEKLADELCGRGHRCTAVV ADVRDPASVAAAIKRAKEKEGRIDILVNNAGVCRLGSFLDMSDDDRDFHIDINIKGVWNV TKAVLPEMIARKDGRIVMMSSVTGDMVADPGETAYALTKAAIVGLTKSLAVEYAQSGIRV NAICPGYVRTPMAESIARQSNPEDPESVLTEMAKAIPMRRLADPLEVGELAAFLASDESS YLTGTQNVIDGGSTLPETVSVGI >gi|296493396|gb|ADTK01000105.1| GENE 9 7255 - 8112 774 285 aa, chain - ## HITS:1 COG:yfeT KEGG:ns NR:ns ## COG: yfeT COG1737 # Protein_GI_number: 16130352 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 285 1 285 285 476 99.0 1e-134 MLYLTKISNAGSEFTENEQKIADFLQANVSELQSVSSRQMAKQLGISQSSIVKFAQKLGA QGFTELRMALIGEYSASREKTNATALHLHSSITSDDSLEVIARKLNREKELALEQTCALF DYARLQKIIEVISKAPFIQITGLGGSALVGRDLSFKLMKIGYRVACEADTHVQATVSQAL KKGDVQIAISYSGSKKEIVLCAEAARKQGATVIAITSLADSPLRRLAHFTLDTVSGETEW RSSSMSTRTAQNSVTDLLFVGLVQLNDVESLKMIQRSSELTQRLK >gi|296493396|gb|ADTK01000105.1| GENE 10 8276 - 9172 891 298 aa, chain + ## HITS:1 COG:yfeU KEGG:ns NR:ns ## COG: yfeU COG2103 # Protein_GI_number: 16130353 # Func_class: R General function prediction only # Function: Predicted sugar phosphate isomerase # Organism: Escherichia coli K12 # 1 298 1 298 298 542 99.0 1e-154 MQLEKMITEGSNTASAEIDRVSTLEMCRIINDEDKTVPLAVERVLPDIAAAIDVIHAQVS GGGRLIYLGAGTSGRLGILDASECPPTYGVKPGLVVGLIAGGEYAIQHAVEGAEDSREGG VNDLKNINLTAQDVVVGIAASGRTPYVIAGLEYARQLGCRTVGISCNPGSAVSTTAEFAI 
TPIVGAEVVTGSSRMKAGTAQKLVLNMLSTGLMIKSGKVFGNLMVDVVATNEKLHVRQVN IVKNATGCNAEQAEAALIACERNCKTAIVMVLKNLDAAEAKKRLDQHGGFIRQVLDKE >gi|296493396|gb|ADTK01000105.1| GENE 11 9176 - 10600 1482 474 aa, chain + ## HITS:1 COG:yfeV_2 KEGG:ns NR:ns ## COG: yfeV_2 COG1263 # Protein_GI_number: 16130354 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Escherichia coli K12 # 101 474 1 374 374 654 99.0 0 MAKEISSELLNTILTRVGGPGNIASCGNCMTRLRLGVHDSSLVDPNIKTLEGVKGVILTS DQVQVVFGPGKAHRAAKAMSELLGEAPVQDAAEIAAQNKRQLKAKQTSGVQQFLAKFATI FTPLIPGFIAAGLLLGIATLIATVMHVPADAQGTLPDALNFMKVFSKGLFTFLVILVGYN AAQAFGGTGVNGAIIAALFLLGYNPAATTGYYAGFHDFFGLPIDPRGNIIGVLIAAWACA RIEGMVRRFMPDDLDMLLTSLITLLITATLAYLIIMPLGGWLFEGMSWLFMHLNSNPFGC AVLAGLFLIAVVFGVHQGFIPVYLALMDSQGFNSLFPILSMAGAGQVGAALALYWRAQPH SALRSQVRGAIIPGLLGVGEPLIYGVTLPRMKPFITACLGGAAGGLFIGLIAWWGLPMGL NSAFGPSGLVALPLMTSAQGILPAMAVYAGGILVAWVCGFIFTTLFGCRNVNLD >gi|296493396|gb|ADTK01000105.1| GENE 12 10605 - 11135 465 176 aa, chain + ## HITS:1 COG:yfeW KEGG:ns NR:ns ## COG: yfeW COG1680 # Protein_GI_number: 16130355 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Escherichia coli K12 # 1 175 30 204 463 353 98.0 1e-97 MKRTMLYLSLLAVSCSVSAAKYPVLTESSPEKAGFNVERLNQMDRWISQQVDAGYPGVNL LIIKDNQIVYRKAWGAAKKYDGSVLMEQPVKATTGTLYDLASNTKMYATNFALQKLMSEG KLHPDDRIAKYIPGFADSPNDTIKGKNTLRISDLLHHSGGFPADPQYPNKAVAGAW Prediction of potential genes in microbial genomes Time: Mon May 16 15:18:56 2011 Seq name: gi|296493395|gb|ADTK01000106.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont289.1, whole genome shotgun sequence Length of sequence - 26195 bp Number of predicted genes - 35, with homology - 35 Number of transcription units - 19, operones - 7 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 32 - 529 198 ## COG3772 Phage-related lysozyme (muraminidase) 2 1 Op 2 . 
- CDS 529 - 744 235 ## APECO1_513 phage lysis protein - Prom 777 - 836 1.5 + Prom 1061 - 1120 8.0 3 2 Tu 1 . + CDS 1318 - 2415 1052 ## COG3203 Outer membrane protein (porin) + Term 2468 - 2497 -0.3 - Term 2456 - 2485 0.5 4 3 Tu 1 . - CDS 2605 - 2988 256 ## SSON_2441 hypothetical protein - Prom 3039 - 3098 2.3 5 4 Op 1 . - CDS 3211 - 3573 243 ## COG4570 Holliday junction resolvase 6 4 Op 2 . - CDS 3570 - 3860 168 ## ECO26_0590 hypothetical protein 7 4 Op 3 . - CDS 3853 - 4023 80 ## EFER_2709 prophage protein NinE 8 4 Op 4 . - CDS 4023 - 4478 342 ## ECUMN_0599 hypothetical protein - Prom 4579 - 4638 2.3 - Term 4588 - 4626 3.3 9 5 Tu 1 . - CDS 4673 - 5179 177 ## ECUMN_0598 conserved hypothetical protein; putative exported protein - Prom 5323 - 5382 5.0 - Term 5353 - 5397 6.1 10 6 Tu 1 . - CDS 5417 - 6025 62 ## COG3617 Prophage antirepressor - Prom 6256 - 6315 4.0 + Prom 7001 - 7060 4.4 11 7 Tu 1 . + CDS 7118 - 8341 609 ## ECUMN_0595 chromosome segregation ATPase from phage origin, putative coiled-coil + Term 8367 - 8408 5.5 - Term 8405 - 8451 -0.5 12 8 Op 1 . - CDS 8590 - 8892 151 ## ECUMN_0593 Ren protein from phage origin 13 8 Op 2 . - CDS 8889 - 9590 391 ## ECH74115_1601 replication protein P 14 8 Op 3 . - CDS 9587 - 10591 536 ## EFER_2719 replication protein O frm phage origin 15 8 Op 4 . - CDS 10603 - 11142 531 ## LF82_p155 bacteriophage regulatory protein CII 16 8 Op 5 . - CDS 11212 - 11442 143 ## ECUMN_0589 Cro - Prom 11463 - 11522 4.6 + Prom 11464 - 11523 2.6 17 9 Tu 1 . + CDS 11547 - 12236 267 ## COG2932 Predicted transcriptional regulator + Term 12330 - 12380 -0.3 + Prom 12383 - 12442 6.5 18 10 Tu 1 . + CDS 12466 - 12846 267 ## Dbac_1517 lipoprotein + Term 12855 - 12884 1.1 19 11 Op 1 . + CDS 13327 - 13533 218 ## ECIAI1_0759 putative prophage Kil protein (modular protein) 20 11 Op 2 . + CDS 13609 - 13905 325 ## ECUMN_1389 host-nuclease inhibitor protein gam from bacteriophage origin 21 11 Op 3 . 
+ CDS 13911 - 14696 658 ## ECUMN_0583 recombination protein bet from phage origin 22 12 Tu 1 . + CDS 14849 - 15373 461 ## ECUMN_1387 exonuclease from phage origin + Term 15434 - 15468 3.4 23 13 Tu 1 . + CDS 15525 - 15716 132 ## EcolC_2879 hypothetical protein 24 14 Tu 1 . + CDS 15793 - 16008 222 ## ECSP_3284 hypothetical protein + Term 16031 - 16057 -0.6 + Prom 16010 - 16069 3.8 25 15 Op 1 . + CDS 16107 - 16325 134 ## COG1734 DnaK suppressor protein 26 15 Op 2 . + CDS 16373 - 16651 75 ## ECUMN_0578 conserved hypothetical protein from phage origin + Prom 16756 - 16815 1.6 27 16 Tu 1 1/1.000 + CDS 16850 - 18013 237 ## COG0582 Integrase - TRNA 18029 - 18105 89.4 # Arg TCT 0 0 + Prom 18081 - 18140 1.7 28 17 Tu 1 . + CDS 18348 - 18980 231 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain + Term 19140 - 19182 0.6 - Term 18767 - 18809 -0.3 29 18 Op 1 4/0.500 - CDS 18983 - 19498 205 ## COG3539 P pilus assembly protein, pilin FimA 30 18 Op 2 6/0.000 - CDS 19509 - 20516 724 ## COG3539 P pilus assembly protein, pilin FimA 31 18 Op 3 10/0.000 - CDS 20675 - 23137 1509 ## COG3188 P pilus assembly protein, porin PapC 32 18 Op 4 7/0.000 - CDS 23168 - 23860 239 ## COG3121 P pilus assembly protein, chaperone PapD - Prom 23894 - 23953 5.2 - Term 23897 - 23945 0.1 33 18 Op 5 . - CDS 24080 - 24622 260 ## COG3539 P pilus assembly protein, pilin FimA - Prom 24742 - 24801 7.1 + Prom 24806 - 24865 5.5 34 19 Op 1 2/0.500 + CDS 25093 - 25959 886 ## COG0190 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase 35 19 Op 2 . 
+ CDS 25961 - 26173 303 ## COG2501 Uncharacterized conserved protein Predicted protein(s) >gi|296493395|gb|ADTK01000106.1| GENE 1 32 - 529 198 165 aa, chain - ## HITS:1 COG:ybcS KEGG:ns NR:ns ## COG: ybcS COG3772 # Protein_GI_number: 16128538 # Func_class: R General function prediction only # Function: Phage-related lysozyme (muraminidase) # Organism: Escherichia coli K12 # 1 165 1 165 165 310 98.0 8e-85 MPPSLRKAVAAAIGGGAIAIASVLITGPSGNDGLEGVSYIPYKDIVGVWTVCHGHTGKDI MLGKTYTKAECKALLNKDLATVARQINPYIKVDIPETMRGALYSFVYNVGAGNFRTSTLL RKINQGDIKGACDQLRRWTYAGGKQWKGLMTRREIEREICLWGQQ >gi|296493395|gb|ADTK01000106.1| GENE 2 529 - 744 235 71 aa, chain - ## HITS:1 COG:no KEGG:APECO1_513 NR:ns ## KEGG: APECO1_513 # Name: not_defined # Def: phage lysis protein # Organism: E.coli_APEC # Pathway: not_defined # 1 71 30 100 100 134 100.0 1e-30 MKSMDKLTTGVAYGTSAGSAGYWFLQLLDKVTPSQWAAIGVLGSLVFGLLTYLTNLYFKI KEDKRKAARGE >gi|296493395|gb|ADTK01000106.1| GENE 3 1318 - 2415 1052 365 aa, chain + ## HITS:1 COG:nmpC KEGG:ns NR:ns ## COG: nmpC COG3203 # Protein_GI_number: 16128536 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein (porin) # Organism: Escherichia coli K12 # 1 347 21 367 375 614 99.0 1e-176 MKKLTVAISAVAASVLMAMSAQAAEIYNKDSNKLDLYGKVNAKHYFSSNDADDGDTTYAR LGFKGETQINDQLTGFGQWEYEFKGNRAESQGSSKDKTRLAFAGLKFGDYGSIDYGRNYG VAYDIGAWTDVLPEFGGDTWTQTDVFMTGRTTGVATYRNNDFFGLVDGLNFAAQYQGKND RSDFDNYTEGNGDGFGFSATYEYEGFGIGATYAKSDRTDTQVNAGKVLPEVFASGKNAEV WAAGLKYDANNIYLATTYSETQNMTVFADHFVANKAQNFEAVAQYQFDFGLRPSVAYLQS KGKDLGVWGDQDLVKYVDVGATYYFNKNMSTFVDYKINLLDKNDFTKALGVSTDDIVAVG LVYQF >gi|296493395|gb|ADTK01000106.1| GENE 4 2605 - 2988 256 127 aa, chain - ## HITS:1 COG:no KEGG:SSON_2441 NR:ns ## KEGG: SSON_2441 # Name: ybcQ # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 127 1 127 127 252 100.0 2e-66 MRDIQMVLERWGAWAANNHEDVTWSSIAAGFKGLITSKVKSRPQCCDDDAMIICGCMARL KKNNSDLHDLLVDYYVVGMTFMSLAGKHCCSDGYIGKRLQKAEGIIEGMLMALDIRLEMD IVVNNSN 
>gi|296493395|gb|ADTK01000106.1| GENE 5 3211 - 3573 243 120 aa, chain - ## HITS:1 COG:rus KEGG:ns NR:ns ## COG: rus COG4570 # Protein_GI_number: 16128533 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvase # Organism: Escherichia coli K12 # 1 120 1 120 120 238 97.0 2e-63 MNTYHITLPWPPSNNRYYRHNRGRTHISAEGQAYRDNVTRIIKNAMLDIGLAMPVKIRIE CHMPDRRRRDLDNLQKAAFDALTKAGFWLDDAQVVDYRVVKMPVTKGGRLELTITEMGNE >gi|296493395|gb|ADTK01000106.1| GENE 6 3570 - 3860 168 96 aa, chain - ## HITS:1 COG:no KEGG:ECO26_0590 NR:ns ## KEGG: ECO26_0590 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 96 1 96 96 182 98.0 3e-45 MADLRKAARGRECQVRITGVCNGNPETSVLAHIRLAGLCGTGIKPPDLIATIACSACHDE IDRRTHFVDAEYAKECALEGMARTQVIWLKEGVIKA >gi|296493395|gb|ADTK01000106.1| GENE 7 3853 - 4023 80 56 aa, chain - ## HITS:1 COG:no KEGG:EFER_2709 NR:ns ## KEGG: EFER_2709 # Name: not_defined # Def: prophage protein NinE # Organism: E.fergusonii # Pathway: not_defined # 1 56 28 83 83 99 98.0 3e-20 MATPLIRVMNGHIYRVPNRRKRKPELKPSEIPTLLGYTASLVDKKWLRLAARRNHG >gi|296493395|gb|ADTK01000106.1| GENE 8 4023 - 4478 342 151 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_0599 NR:ns ## KEGG: ECUMN_0599 # Name: ybcN # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 151 1 151 151 315 98.0 2e-85 MNLPQDGIKLHRGNFTAIGQQIQPYLEEGKCFRMVLKPWREKRSLSQNALSHMWYSEISE YLISRGKTFATPAWVKDALKHTYLGYETKDLVDVVTGEITTIQSLRHTSDLDTGEMYVFL CKVEAWAMNIGCHLTIPQSCEFQLLRDKQEA >gi|296493395|gb|ADTK01000106.1| GENE 9 4673 - 5179 177 168 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_0598 NR:ns ## KEGG: ECUMN_0598 # Name: not_defined # Def: conserved hypothetical protein; putative exported protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 168 2 169 169 289 100.0 3e-77 MWNFDSADLSAIAAGISAFGTLAAAGSALASWCTSKKALQLQNRVYLYESLKACAERANS SAKDKRGSEWSVNDAADIIRCLVRAMELIKQDSQQKEGNQALMLKQYFVNLLIMELYEEV HNGDAADSVFKSTEPTQVLDNLWSKWQEAIAFFDIWNYPVATEEDLAD 
>gi|296493395|gb|ADTK01000106.1| GENE 10 5417 - 6025 62 202 aa, chain - ## HITS:1 COG:YPO2093 KEGG:ns NR:ns ## COG: YPO2093 COG3617 # Protein_GI_number: 16122332 # Func_class: K Transcription # Function: Prophage antirepressor # Organism: Yersinia pestis # 1 109 1 114 187 121 52.0 8e-28 MTTQLAFHKTTFTPICHNNRIWLTATEVGLALEYADDKAVQRIYSRHSDEFTDMMTRVVK VTTPRGMQESRVFSLRGAHLIAMFARTPVAKEFRRWVLDILDREVQQSPITKQFTDNELC TLAWLWRASDTMLTACQNVTPLLQVAEHREAGRFTSIEQEYPRILNRAREILARETAHVK FQPWQDDKWSRVLPYFRQNLLQ >gi|296493395|gb|ADTK01000106.1| GENE 11 7118 - 8341 609 407 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_0595 NR:ns ## KEGG: ECUMN_0595 # Name: not_defined # Def: chromosome segregation ATPase from phage origin, putative coiled-coil # Organism: E.coli_UMN026 # Pathway: not_defined # 1 407 94 500 500 652 100.0 0 MDGINKQIQELRRTYKEKKEIYDKLVRQISIYSEDVELAELGFYEPHFNFEDSEQFKNKI KSIRDEQKLMLRDKTHSGAVYCTTQWTVEGSRAEGKKMTDRNIRLTTRAFNNECDAAISN CTWKNITKMEERITKAFEAINKLNEQNHIYINTKYLNKKLEELWLTHEYREQKQKEKEEQ AEIRAQMREEERAQREIEKAMQDAEAEERRYKKAIEAARKEMEKVTGDMKQRLENRIAEL EQSLSQAESKHQRALSMAQQTKQGHVYIISNIGSFGENVYKIGMTRRLDPQDRVNELGDA SVPFIFDVHAMIYSEDAPSLEKKLHDVFDKKRVNLVNRRKEFFYVTLDEIKEAVKKHSDS EIEFIETAVAKDFNESLAIRNHENKKSDNSNSSIIPERKTPEFADAI >gi|296493395|gb|ADTK01000106.1| GENE 12 8590 - 8892 151 100 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_0593 NR:ns ## KEGG: ECUMN_0593 # Name: not_defined # Def: Ren protein from phage origin # Organism: E.coli_UMN026 # Pathway: not_defined # 1 100 1 100 100 184 100.0 7e-46 MTGKEAIIHYLGTHKSFCAQDVAAVTGATVTSINQAAAKMARAGILVVDGKVWRTVYYRF ATREEREGKVSTNLIFKECRQSAAMKRVLAVYGDMNLNLL >gi|296493395|gb|ADTK01000106.1| GENE 13 8889 - 9590 391 233 aa, chain - ## HITS:1 COG:no KEGG:ECH74115_1601 NR:ns ## KEGG: ECH74115_1601 # Name: not_defined # Def: replication protein P # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 233 1 233 233 438 97.0 1e-122 MKNIAAQMVNFDREQMRRIANNMPEQYDEKPQVQQVAQIINGVFSQLLATFPASLANRDQ NELNEIRRQWVLAFRENGITTMEQVDAGMRVARRQNRPFLPSPGQFVAWCREEASVTAGL 
PNASELVDMVYEYCRKRGLYPDAESYPWKSNAHYWLVTNLYQNMRANALTDAELRRKAAD ELTCMTARINRGEAIPEPVKQLPVMGGRPLNRAQALAKIAEIKAKFGLKGATV >gi|296493395|gb|ADTK01000106.1| GENE 14 9587 - 10591 536 334 aa, chain - ## HITS:1 COG:no KEGG:EFER_2719 NR:ns ## KEGG: EFER_2719 # Name: not_defined # Def: replication protein O frm phage origin # Organism: E.fergusonii # Pathway: not_defined # 1 334 6 339 339 651 98.0 0 MAKVGLREQNRLSGANRNTLIAGGIMANTAEIFNFPVPDAAQKEPRVADLDDGYTRIANE LLEAVMLAGLTQHQLLVFLAVMRKTYGFNKKLDWVSNEQLSELTGILPHKCSAAKSVLVK RGIFIQSGRNIGINNVVSEWSTLPESGKKNKVYLKEVNLPESGKKSLPKSGKGTYPNQVN TKDKLTKDNIKPFSSENSGESSDQPENDLPVEKPDAAIQSGSRWGTAEDLTAAEWMFDMV KTIAPSARKPNFAGWANDIRLMRERDGRNHRDMCVLFRWACQDNFWSGNVLSPAKLRDKW TQLEINRNKQQAGVIAGKSKLDLTNTDWIYGVDL >gi|296493395|gb|ADTK01000106.1| GENE 15 10603 - 11142 531 179 aa, chain - ## HITS:1 COG:no KEGG:LF82_p155 NR:ns ## KEGG: LF82_p155 # Name: not_defined # Def: bacteriophage regulatory protein CII # Organism: E.coli_LF82 # Pathway: not_defined # 1 179 3 181 181 334 97.0 8e-91 MQPLTYQQTSGFIPTAVINRSQTKQAPGHEKIRDAVRAWSAVDNQDVVAALIVNEYREQG DGTIDFPDDVSRARQKLFRFLDNKFDSEKYRNNVRELTPAILAVLPLEYRGYLVEQDSFM TRLAEMEKELSEAKQAVILNAPRHQKLKEMSEGIVSMFRVDPDLAGPLMAMVTTMLGAI >gi|296493395|gb|ADTK01000106.1| GENE 16 11212 - 11442 143 76 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_0589 NR:ns ## KEGG: ECUMN_0589 # Name: cro # Def: Cro # Organism: E.coli_UMN026 # Pathway: not_defined # 1 76 1 76 76 137 98.0 1e-31 MNPAIKTAINIVGSQKKLGDACEVSQQAVYKWLHNKAKVSPEHVGSIVTATGGVVKAYQI RPDLPKLFPHTEKNAA >gi|296493395|gb|ADTK01000106.1| GENE 17 11547 - 12236 267 229 aa, chain + ## HITS:1 COG:STM0898 KEGG:ns NR:ns ## COG: STM0898 COG2932 # Protein_GI_number: 16764259 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Salmonella typhimurium LT2 # 1 229 1 231 231 158 40.0 7e-39 MKTTLSERLKEARLARGLTQKALGDLVGVSQAAIQKIETGKANQTTKIVEIANALGVRAE WLSSGVGNMSDSTVQPIQSTVSHSKYFKIDVLDIEVSAGPGVINREFVEVLRSVEYSFDD ARHMFDGRKAENIRIINVRGDSMSGTIEPGDLLFVDITVKSFDGDGIYAFLYDDTAHVKR 
LQMMKDKLLVISDNKSYSPWDPIEKDEMNRVFIFGKVIGSMPQTYRKHG >gi|296493395|gb|ADTK01000106.1| GENE 18 12466 - 12846 267 126 aa, chain + ## HITS:1 COG:no KEGG:Dbac_1517 NR:ns ## KEGG: Dbac_1517 # Name: not_defined # Def: lipoprotein # Organism: D.baculatum # Pathway: not_defined # 16 123 23 131 134 107 52.0 2e-22 MRKILIAAMMASVLAGCASSGNQQLKNETEISVQSKLQEGKTTKNEVKSYFGSPDAVSYT DSGNEIWKYAFAKVKVNGTTFIPFYGLFHNGTNGTKKELTILFNDDTIKKYTMSETQINS KSGWAD >gi|296493395|gb|ADTK01000106.1| GENE 19 13327 - 13533 218 68 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_0759 NR:ns ## KEGG: ECIAI1_0759 # Name: not_defined # Def: putative prophage Kil protein (modular protein) # Organism: E.coli_IAI1 # Pathway: not_defined # 1 68 69 136 136 132 100.0 4e-30 MVHQHYGTQTVNRGAVMPGMLVKHKDGTWTASANLRGRLYLHRGIERTYTRDLLVEVFLD GRGNGLNH >gi|296493395|gb|ADTK01000106.1| GENE 20 13609 - 13905 325 98 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_1389 NR:ns ## KEGG: ECUMN_1389 # Name: gam # Def: host-nuclease inhibitor protein gam from bacteriophage origin # Organism: E.coli_UMN026 # Pathway: not_defined # 1 98 41 138 138 169 100.0 2e-41 MNAYYIQDRLEAQSWARHYQQIAREEKEAELADDMEKGLPQHLFESLCIDHLQRHGASKK AITRAFDDDVEFQERMAEHIRYMVETIAHHQVDIDSEV >gi|296493395|gb|ADTK01000106.1| GENE 21 13911 - 14696 658 261 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_0583 NR:ns ## KEGG: ECUMN_0583 # Name: not_defined # Def: recombination protein bet from phage origin # Organism: E.coli_UMN026 # Pathway: not_defined # 1 261 1 261 261 527 100.0 1e-148 MSTALATLAGKLAERVGMDSVDPQELITTLRQTAFKGDASDAQFIALLIVANQYGLNPWT KEIYAFPDKQNGIVPVVGVDGWSRIINENQQFDGMDFEQDNESCTCRIYRKDRNHPICVT EWMDECRREPFKTREGREITGPWQSHPKRMLRHKAMIQCARLAFGFAGIYDKDEAERIVE NTAYTAERQPERDITPVNDETMQEINTLLIALDKTWDDDLLPLCSQIFRRDIRASSELTQ AEAVKALGFLKQKATEQKVAA >gi|296493395|gb|ADTK01000106.1| GENE 22 14849 - 15373 461 174 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_1387 NR:ns ## KEGG: ECUMN_1387 # Name: not_defined # Def: exonuclease from phage origin # Organism: E.coli_UMN026 # Pathway: not_defined # 1 174 53 226 
226 362 98.0 4e-99 MKMSYFHTLLAEVCTGVAPEVNAKALAWGKQYENDARALFEFTSGVNVTESPIIYRDESM RTACSPDGLCSDGNGLELKCPFTSRDFMKFRLGGFEAIKSAYMAQVQYSMWVTRKDAWYF ANYDPRMKREGLHYVVIEQDEKYMASFDEMVPEFIEKMDEALAEIGFVFGEQWR >gi|296493395|gb|ADTK01000106.1| GENE 23 15525 - 15716 132 63 aa, chain + ## HITS:1 COG:no KEGG:EcolC_2879 NR:ns ## KEGG: EcolC_2879 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 63 1 63 63 95 96.0 4e-19 MHKASPVELRTSIEMAHSLAQIGVRFVPIPVETDEEFHTLAASLSQKLEMMVAKAEADER DQV >gi|296493395|gb|ADTK01000106.1| GENE 24 15793 - 16008 222 71 aa, chain + ## HITS:1 COG:no KEGG:ECSP_3284 NR:ns ## KEGG: ECSP_3284 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 71 23 93 93 137 98.0 1e-31 MTHQQENALRSIARQANSEIKKARQQFPDKNVDDICRSVLKKHRETVTLMGFTPTHLSLA IGMLNGVFKER >gi|296493395|gb|ADTK01000106.1| GENE 25 16107 - 16325 134 72 aa, chain + ## HITS:1 COG:ECs0805 KEGG:ns NR:ns ## COG: ECs0805 COG1734 # Protein_GI_number: 15830059 # Func_class: T Signal transduction mechanisms # Function: DnaK suppressor protein # Organism: Escherichia coli O157:H7 # 1 71 1 71 73 107 91.0 5e-24 MADIIDSASGIEELQRNTAIKMRRLNHQAISATHCCECGDPIDERRRLAVQGCRTCASCQ EEIELKNKQWGL >gi|296493395|gb|ADTK01000106.1| GENE 26 16373 - 16651 75 92 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_0578 NR:ns ## KEGG: ECUMN_0578 # Name: not_defined # Def: conserved hypothetical protein from phage origin # Organism: E.coli_UMN026 # Pathway: not_defined # 1 92 1 92 92 175 100.0 5e-43 MFRIIFPNTWYVDHHGTPCKILRSTHNKVHYIRKGRTCIASMFRFNHDFEPVNKADADRI AEEIETAEHIKKLRDMRSKSRGNHGIIQPHTR >gi|296493395|gb|ADTK01000106.1| GENE 27 16850 - 18013 237 387 aa, chain + ## HITS:1 COG:intD KEGG:ns NR:ns ## COG: intD COG0582 # Protein_GI_number: 16128520 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Escherichia coli K12 # 1 387 1 387 387 712 99.0 0 MSLFRRNEIWYASYSLPGGKRIKESLGTKDKRQAQELHDKRKAELWRVEKLGDLPDVTFE 
EACLRWLEEKADKKSLDSDKSRIEFWLEHFEGIRLKDISEAKIYSAVSRMHNRKTKEIWK QKVQAAIRKGKELPVYEPKPVSTQTKAKHLAMIKAILRAAERDWKWLEKAPVIKIPAVRN KRVRWLEKEEAKRLIDECPEPLKSVVKFALATGLRKSNIINLEWQQIDMQRRVAWVNPEE SKSNRAIGVALNDTACKVLRDQIGKHHKWVFVHTKAAKRADGTSTPAVRKMRIDSKTSWL SACRRAGIEDFRFHDLRHTWASWLIQSGVPLSVLQEMGGWESIEMVRRYAHLAPNHLTEH ARKIDDIFGDNVPNMSHSGIMEDIKKA >gi|296493395|gb|ADTK01000106.1| GENE 28 18348 - 18980 231 210 aa, chain + ## HITS:1 COG:ZfimZ KEGG:ns NR:ns ## COG: ZfimZ COG2197 # Protein_GI_number: 15800272 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 EDL933 # 1 210 22 231 231 390 99.0 1e-109 MKPTSVIIMDTHPIIRMSIEVLLQKNSELQIVLKTDDYRITIDYLRTRPVDLIIMDIDLP GTDGFTFLKRIKQIQSTVKVLFLSSKSECFYAGRAIQAGANGFVSKCNDQNDIFHAVQMI LSGYTFFPSKTLNYIKSNKCSTNSSTITVLSNREVTILRYLVSGLSNKEIADKLLLSNKT VSAHKSNIYGKLGLHSIVELIDYAKLYELI >gi|296493395|gb|ADTK01000106.1| GENE 29 18983 - 19498 205 171 aa, chain - ## HITS:1 COG:sfmF KEGG:ns NR:ns ## COG: sfmF COG3539 # Protein_GI_number: 16128518 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 171 1 171 171 327 98.0 8e-90 MRRALFSCFCGLLWSSSGWAADPLGTININLHGNVVDFSCTVNTADIDKTVDLGRWPTTQ LLNAGDTTALVPFSLRLEGCPPGSVAILFTGTPASDTNLLALDDPAMAQTVAIELRNSDR SRLALGEASPTEEVDANGNVTLNFFANYRALASGVRPGVAKADAIFMINYN >gi|296493395|gb|ADTK01000106.1| GENE 30 19509 - 20516 724 335 aa, chain - ## HITS:1 COG:sfmH KEGG:ns NR:ns ## COG: sfmH COG3539 # Protein_GI_number: 16128517 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 11 335 1 325 325 642 99.0 0 MKIICRLLLAMACLCLANISWATVCANSTGVAEDEHYDLSNIFNSTNNQPGQIVVLPEKS GWVGVSAICPPGTLVNYTYRSYVTNFIVQETIDNYKYMQLNDYLLGAMSLVDSVMDIQFP 
PQNYILMGTDPNVSQNLPFGVMDSRLIFRLKVIRPFINMVEIPRQVMFTVYVTSTPYDPL VTPVYTISFGGRVEVPQNCELNAGQIVEFDFGDIGASLFSAAGPGNRPAGVMPQTKSIAV KCTNVAAQAYLTMRLEASAVSGQAMVSDNQDLGFIVADQNDTPITPNDLNSVIPFRLDAA AAANVTLRAWPISITGQKPTEGPFSALGYLRVDYQ >gi|296493395|gb|ADTK01000106.1| GENE 31 20675 - 23137 1509 820 aa, chain - ## HITS:1 COG:ECs0594 KEGG:ns NR:ns ## COG: ECs0594 COG3188 # Protein_GI_number: 15829848 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli O157:H7 # 1 812 1 812 869 1604 99.0 0 MKIPTTTDIPQRYTWCLAGICYSSLAILPSFLSYAESYFNPAFLLENGTSVADLSRFERG NHQPARVYRVDLWRNDEFIGSQDIVFESTTENTGDKSGGLMPCFNQVLLERIGLNSSAFP ELAQQQNNKCINLLKAVPDATINFDFAAMRLNITIPQIALLSSAHGYIPPEEWDEGIPAL LLNYNFTGNRGNGNDSYFFSEFSGINIGPWRLRNNGSWNYFRGNGYHSEQWNNIGTWVQR AIIPLKSELVMGDGNTGSDIFDGVGFRGVRLYSSDNMYPDSQQGFAPTVRGIARTAAQLT IRQNGFIIYQSYVSPGAFEITDLHPTSSNGDLDVTIDERDGNQQNYTIPYSTVPILQREG RFKFDLTAGDFRSGNSQQSSPFFFQGTALGGLPQEFTAYGGTQLSANYTAFLLGLGRNLG NWGAVSLDVTHARSQLADDSRHEGDSIRFLYAKSMNTFGTNFQLMGYRYSTQGFYTLDDV AYRRMEGYEYDYDYDGEHRDEPIIVNYHNLRFSRKDRLQLNISQSLNDFGSLYISGTHQK YWNTSDSDTWYQVGYTSSWVGISYSLSFSWNESVGIPDNERIVGLNVSVPFNVLTKRRYT RENALDRAYASFNANRNSNGQNSWLAGVGGTLLEGHNLSYHVSQGDTSNNGYTGSATANW QAAYATLGVGYNYDRDQHDVNWQLSGGVVGHENGITLSQPFGDTNVLIKAPGAGGVRIEN QTGILTDWRGYAVMPYATVYRYNRIALDTNTMGNSIDVEKNISSVVPTQGALVRANFDTR IGVRALITVTQGGKPVPFGSLVRENSIGITSMWVMTGKFI >gi|296493395|gb|ADTK01000106.1| GENE 32 23168 - 23860 239 230 aa, chain - ## HITS:1 COG:sfmC KEGG:ns NR:ns ## COG: sfmC COG3121 # Protein_GI_number: 16128515 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli K12 # 1 230 1 230 230 436 99.0 1e-122 MMTKIKLLMLIIFYLIISASAHAAGGIALGATRIIYPADAKQTAVWIRNSHTNERFLVNS WIENSSGVKEKSFIITPPLFVSEPKSENTLRIIYTGPPLAADRESLFWMNVKTIPSVDKN ALNGRNVLQLAILSRMKLFLRPIQLQELPAEAPDTLKFSRSGNYINVHNPSPFYVTLVNL 
QVGSQKLGNAMAAPRVSSQIPLPSGVQGKLKFQTVNDYGSVTPVREVNLN >gi|296493395|gb|ADTK01000106.1| GENE 33 24080 - 24622 260 180 aa, chain - ## HITS:1 COG:ECs0592 KEGG:ns NR:ns ## COG: ECs0592 COG3539 # Protein_GI_number: 15829846 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli O157:H7 # 1 180 12 191 191 264 100.0 8e-71 MKLRFISSALAAALFAATGSYAAVVDGGTIHFEGELVNAACSVNTDSADQVVTLGQYRTD IFNAVGNTSALIPFTIQLNDCDPVVAANAAVAFSGQADAINDNLLAIASSTNTTTATGVG IEILDNTSAILKPDGNSFSTNQNLIPGTNVLHFSARYKGTGTSASAGQANADATFIMRYE >gi|296493395|gb|ADTK01000106.1| GENE 34 25093 - 25959 886 288 aa, chain + ## HITS:1 COG:folD KEGG:ns NR:ns ## COG: folD COG0190 # Protein_GI_number: 16128513 # Func_class: H Coenzyme transport and metabolism # Function: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase # Organism: Escherichia coli K12 # 1 288 1 288 288 570 100.0 1e-162 MAAKIIDGKTIAQQVRSEVAQKVQARIAAGLRAPGLAVVLVGSNPASQIYVASKRKACEE VGFVSRSYDLPETTSEAELLELIDTLNADNTIDGILVQLPLPAGIDNVKVLERIHPDKDV DGFHPYNVGRLCQRAPRLRPCTPRGIVTLLERYNIDTFGLNAVVIGASNIVGRPMSMELL LAGCTTTVTHRFTKNLRHHVENADLLIVAVGKPGFIPGDWIKEGAIVIDVGINRLENGKV VGDVVFEDAAKRASYITPVPGGVGPMTVATLIENTLQACVEYHDPQDE >gi|296493395|gb|ADTK01000106.1| GENE 35 25961 - 26173 303 70 aa, chain + ## HITS:1 COG:ybcJ KEGG:ns NR:ns ## COG: ybcJ COG2501 # Protein_GI_number: 16128512 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 70 8 77 77 127 100.0 7e-30 MATFSLGKHPHVELCDLLKLEGWSESGAQAKIAIAEGQVKVDGAVETRKRCKIVAGQTVS FAGHSVQVVA Prediction of potential genes in microbial genomes Time: Mon May 16 15:19:47 2011 Seq name: gi|296493394|gb|ADTK01000107.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont289.2, whole genome shotgun sequence Length of sequence - 2779 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average 
op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 89 - 610 375 ## COG1988 Predicted membrane-bound metal-dependent hydrolases - Term 599 - 634 7.4 2 2 Tu 1 . - CDS 646 - 2031 1730 ## COG0215 Cysteinyl-tRNA synthetase - Prom 2066 - 2125 3.6 + Prom 2025 - 2084 4.3 3 3 Tu 1 . + CDS 2205 - 2699 537 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family + Term 2747 - 2775 1.3 Predicted protein(s) >gi|296493394|gb|ADTK01000107.1| GENE 1 89 - 610 375 173 aa, chain + ## HITS:1 COG:ECs0589 KEGG:ns NR:ns ## COG: ECs0589 COG1988 # Protein_GI_number: 15829843 # Func_class: R General function prediction only # Function: Predicted membrane-bound metal-dependent hydrolases # Organism: Escherichia coli O157:H7 # 1 173 1 173 173 290 100.0 1e-78 MPTVITHAAVPLCIGLGLGSKVIPPRLLFAGIILAMLPDADVLSFNFGVAYGNVFGHRGF THSLVFAFVVPLLCVLIGRRWFRAGLIRCWLFLTVSLLSHSLLDSVTTGGKGVGWLWPWS DERFFAPWQVIKVAPFALSRYTTAYGHQVIISELMWVWLPGMLLMGMLWWRRR >gi|296493394|gb|ADTK01000107.1| GENE 2 646 - 2031 1730 461 aa, chain - ## HITS:1 COG:cysS KEGG:ns NR:ns ## COG: cysS COG0215 # Protein_GI_number: 16128510 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Cysteinyl-tRNA synthetase # Organism: Escherichia coli K12 # 1 461 1 461 461 919 99.0 0 MLKIFNTLTRQKEEFKPIHAGEVGMYVCGITVYDLCHIGHGRTFVAFDVVARYLRFLGYK LKYVRNITDIDDKIIKRANENGESFVALVDRMIAEMHKDFDALNILRPDMEPRATHHIAE IIELTEQLIAKGHAYVADNGDVMFDVPTDPTYGVLSRQDLDQLQAGARVDVVDDKRNPMD FVLWKMSKEGEPSWPSPWGAGRPGWHIECSAMNCKQLGNHFDIHGGGSDLMFPHHENEIA QSTCAHDGQYVNYWMHSGMVMVDREKMSKSLGNFFTVRDVLKYYDAETVRYFLMSGHYRS QLNYSEENLKQARAALERLYTALRGTDKTVAPAGGEAFEARFIEAMDDDFNTPEAYSVLF DMAREVNRLKAEDMAAANAMASHLRKLSAVLGLLEQEPEAFLQSGAQADDSEVAEIEALI QQRLDARKAKDWAAADAARDRLNEMGIVLEDGPQGTTWRRK >gi|296493394|gb|ADTK01000107.1| GENE 3 2205 - 2699 537 164 aa, chain + ## HITS:1 COG:ppiB KEGG:ns NR:ns ## COG: ppiB COG0652 # Protein_GI_number: 16128509 # Func_class: O Posttranslational modification, protein turnover, chaperones # 
Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Escherichia coli K12 # 1 164 1 164 164 328 100.0 2e-90 MVTFHTNHGDIVIKTFDDKAPETVKNFLDYCREGFYNNTIFHRVINGFMIQGGGFEPGMK QKATKEPIKNEANNGLKNTRGTLAMARTQAPHSATAQFFINVVDNDFLNFSGESLQGWGY CVFAEVVDGMDVVDKIKGVATGRSGMHQDVPKEDVIIESVTVSE Prediction of potential genes in microbial genomes Time: Mon May 16 15:20:04 2011 Seq name: gi|296493393|gb|ADTK01000108.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont289.3, whole genome shotgun sequence Length of sequence - 50141 bp Number of predicted genes - 45, with homology - 44 Number of transcription units - 27, operones - 13 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 674 465 ## COG2908 Uncharacterized protein conserved in bacteria + Prom 690 - 749 3.4 2 2 Op 1 29/0.000 + CDS 792 - 1301 593 ## COG0041 Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase 3 2 Op 2 . + CDS 1298 - 2365 1195 ## COG0026 Phosphoribosylaminoimidazole carboxylase (NCAIR synthetase) 4 3 Op 1 . - CDS 2504 - 3397 956 ## COG0549 Carbamate kinase 5 3 Op 2 . - CDS 3394 - 4209 485 ## EC55989_0534 hypothetical protein 6 3 Op 3 . - CDS 4220 - 5479 1238 ## EcolC_3103 hypothetical protein 7 3 Op 4 . - CDS 5489 - 7156 1819 ## COG0074 Succinyl-CoA synthetase, alpha subunit - Prom 7334 - 7393 4.9 + Prom 7366 - 7425 5.9 8 4 Op 1 4/0.688 + CDS 7473 - 8522 962 ## COG2055 Malate/L-lactate dehydrogenases 9 4 Op 2 4/0.688 + CDS 8544 - 9779 1066 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 10 4 Op 3 . 
+ CDS 9790 - 10575 915 ## COG3257 Uncharacterized protein, possibly involved in glyoxylate utilization + Term 10618 - 10649 4.1 - Term 10718 - 10757 10.0 11 5 Op 1 2/1.000 - CDS 10803 - 11948 1101 ## COG1929 Glycerate kinase 12 5 Op 2 2/1.000 - CDS 11970 - 13271 943 ## COG2233 Xanthine/uracil permeases - Term 13281 - 13313 3.0 13 6 Op 1 2/1.000 - CDS 13328 - 14689 1582 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 14 6 Op 2 . - CDS 14749 - 16203 1247 ## COG1953 Cytosine/uracil/thiamine/allantoin permeases - Prom 16258 - 16317 8.1 15 7 Op 1 8/0.062 - CDS 16372 - 17250 1055 ## COG2084 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases 16 7 Op 2 8/0.062 - CDS 17350 - 18126 766 ## COG3622 Hydroxypyruvate isomerase 17 7 Op 3 4/0.688 - CDS 18139 - 19920 2119 ## COG3960 Glyoxylate carboligase - Prom 19941 - 20000 4.2 - Term 19967 - 19997 3.0 18 8 Op 1 4/0.688 - CDS 20010 - 20825 710 ## COG1414 Transcriptional regulator 19 8 Op 2 . - CDS 20903 - 21385 508 ## COG3194 Ureidoglycolate hydrolase - Prom 21447 - 21506 5.0 + Prom 21492 - 21551 5.6 20 9 Op 1 4/0.688 + CDS 21615 - 22541 614 ## COG0583 Transcriptional regulator + Term 22552 - 22601 2.7 21 9 Op 2 . + CDS 22610 - 23704 1141 ## COG2603 Predicted ATPase + Term 23767 - 23811 10.1 22 10 Tu 1 . + CDS 24115 - 24198 56 ## - Term 24198 - 24245 4.0 23 11 Op 1 11/0.062 - CDS 24416 - 26830 2413 ## COG3127 Predicted ABC-type transport system involved in lysophospholipase L1 biosynthesis, permease component 24 11 Op 2 . - CDS 26827 - 27513 313 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) + Prom 27170 - 27229 1.9 25 12 Op 1 5/0.125 + CDS 27451 - 28107 602 ## COG2755 Lysophospholipase L1 and related esterases 26 12 Op 2 3/0.812 + CDS 28097 - 28906 229 ## PROTEIN SUPPORTED gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 27 12 Op 3 . 
+ CDS 28967 - 29821 1263 ## COG3118 Thioredoxin domain-containing protein + Term 29856 - 29897 7.1 - Term 29833 - 29883 12.8 28 13 Op 1 4/0.688 - CDS 29884 - 30663 762 ## COG0390 ABC-type uncharacterized transport system, permease component 29 13 Op 2 . - CDS 30650 - 31327 177 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 31354 - 31413 3.8 + Prom 31318 - 31377 4.0 30 14 Op 1 26/0.000 + CDS 31473 - 32390 1335 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs 31 14 Op 2 . + CDS 32387 - 32845 415 ## COG1585 Membrane protein implicated in regulation of membrane protease activity 32 15 Tu 1 . - CDS 32846 - 33253 369 ## COG0789 Predicted transcriptional regulators - Prom 33293 - 33352 3.8 33 16 Op 1 4/0.688 - CDS 33378 - 34667 1375 ## COG0531 Amino acid transporters 34 16 Op 2 . - CDS 34673 - 35605 882 ## COG2066 Glutaminase - Prom 35687 - 35746 3.4 + Prom 35565 - 35624 7.1 35 17 Tu 1 . + CDS 35867 - 38371 2763 ## COG2217 Cation transport ATPase + Term 38388 - 38419 2.3 36 18 Tu 1 . - CDS 38586 - 38981 359 ## COG3093 Plasmid maintenance system antidote protein - Prom 39110 - 39169 2.6 + Prom 38922 - 38981 2.7 37 19 Tu 1 5/0.125 + CDS 39065 - 39859 757 ## COG3735 Uncharacterized protein conserved in bacteria + Term 39912 - 39960 -0.1 + Prom 39863 - 39922 7.6 38 20 Tu 1 . + CDS 40063 - 40542 719 ## COG2606 Uncharacterized conserved protein + Term 40548 - 40583 7.4 39 21 Tu 1 . - CDS 40579 - 42231 1791 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases - Prom 42276 - 42335 2.6 40 22 Tu 1 . + CDS 42449 - 43669 1161 ## COG0477 Permeases of the major facilitator superfamily + Term 43676 - 43715 8.0 + Prom 43746 - 43805 5.2 41 23 Tu 1 . + CDS 43907 - 45583 501 ## PROTEIN SUPPORTED gi|229845962|ref|ZP_04466074.1| 30S ribosomal protein S2 + Term 45635 - 45700 3.1 - Term 45621 - 45686 3.1 42 24 Tu 1 . 
- CDS 45716 - 47020 1312 ## COG0524 Sugar kinases, ribokinase family - Prom 47113 - 47172 1.7 + Prom 46992 - 47051 2.7 43 25 Tu 1 . + CDS 47172 - 48131 737 ## COG0657 Esterase/lipase + Term 48164 - 48215 7.1 - Term 48067 - 48103 2.0 44 26 Tu 1 . - CDS 48128 - 49090 901 ## COG0276 Protoheme ferro-lyase (ferrochelatase) - Prom 49115 - 49174 3.9 - Term 49199 - 49230 2.5 45 27 Tu 1 . - CDS 49326 - 49970 985 ## COG0563 Adenylate kinase and related kinases - Prom 50011 - 50070 3.9 Predicted protein(s) >gi|296493393|gb|ADTK01000108.1| GENE 1 3 - 674 465 223 aa, chain + ## HITS:1 COG:ybbF KEGG:ns NR:ns ## COG: ybbF COG2908 # Protein_GI_number: 16128508 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 3 223 20 240 240 446 98.0 1e-125 PPAGFLRFLAGEARKADALYILGDLFEAWIGDDDPNPLHRQMAAAIKAVSDSGVPCYFIH GNRDFLLGKRFARESGMTLLPEEKVLELYGRRVLIMHGDTLCTDDAGYQAFRAKVHKPWL QMLFLALPLFVRKRIAARMRANSKEANSSKSLAIMDVNQNAVVSAMEKHQVQGLIHGHTH RPAVHELIANQQPAFRVVLGAWHTEGSMVKVTADDVELIHFPF >gi|296493393|gb|ADTK01000108.1| GENE 2 792 - 1301 593 169 aa, chain + ## HITS:1 COG:ECs0585 KEGG:ns NR:ns ## COG: ECs0585 COG0041 # Protein_GI_number: 15829839 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase # Organism: Escherichia coli O157:H7 # 1 169 1 169 169 280 100.0 1e-75 MSSRNNPARVAIVMGSKSDWATMQFAAEIFEILNVPHHVEVVSAHRTPDKLFSFAESAEE NGYQVIIAGAGGAAHLPGMIAAKTLVPVLGVPVQSAALSGVDSLYSIVQMPRGIPVGTLA IGKAGAANAALLAAQILATHDKELHQRLNDWRKAQTDEVLENPDPRGAA >gi|296493393|gb|ADTK01000108.1| GENE 3 1298 - 2365 1195 355 aa, chain + ## HITS:1 COG:purK KEGG:ns NR:ns ## COG: purK COG0026 # Protein_GI_number: 16128506 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazole carboxylase (NCAIR synthetase) # Organism: Escherichia coli K12 # 1 354 1 354 355 692 99.0 0 MKQVCVLGNGQLGRMLRQAGEPLGIAVWPVGLDAEPAAVPFQQSVITAEIERWPETALTR 
ELARHPAFVNRDVFPIIADRLTQKQLFDKLHLPTAPWQLLAERSEWPAVFDRLGELAIVK RRTGGYDGRGQWRLRANETEQLPAECYGECIVEQGINFSGEVSLVGARGFDGSTVFYPLT HNLHQDGILRTSVAFPQANAQQQAQAEEMLSAIMQELGYVGVMAMECFVTPQGLLINELA PRVHNSGHWTQNGASISQFELHLRAITDLPLPQPVVNSPSVMINLIGSDVNYDWLKLPLV HLHWYDKEVRPGRKVGHLNLTDSDTSRLSATLEALIPLLPPEYASGVMWAQSKFS >gi|296493393|gb|ADTK01000108.1| GENE 4 2504 - 3397 956 297 aa, chain - ## HITS:1 COG:arcC KEGG:ns NR:ns ## COG: arcC COG0549 # Protein_GI_number: 16128505 # Func_class: E Amino acid transport and metabolism # Function: Carbamate kinase # Organism: Escherichia coli K12 # 1 297 1 297 297 551 99.0 1e-157 MKTLVVALGGNALLQRGEALTAENQYRNIASAVPALARLARSYRLAIVHGNGPQVGLLAL QNLAWKEVEPYPLDVLVAESQGMIGYMLAQSLSAQPQMPPVTTVLTRIEVSPDDPAFLQP EKFIGPVYQPEEQEALEAAYGWQMKRDGKYLRRVVASPQPRKILDSEAIELLLKEGHVVI CSGGGGVPVTEDGAGSEAVIDKDLAAALLAEQINADGLVILTDADAVYENWGTPQQRAIR HATPDELAPFAKADGSMGPKVTAVSGYVRSRGKPAWIGALSRIEETLAGEAGTCISL >gi|296493393|gb|ADTK01000108.1| GENE 5 3394 - 4209 485 271 aa, chain - ## HITS:1 COG:no KEGG:EC55989_0534 NR:ns ## KEGG: EC55989_0534 # Name: ylbF # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 271 1 271 271 529 99.0 1e-149 MTIIHPLLASSSAPNYRQSWRLAGVWRRAINLMTESGELLTLHRQGSGFGPGGWMLRRAQ FDALCGGLCGNERPQVVAQGIRLGRFTVKQPQRYCLLRITPPAHPQPLAAAWMQRAEETG LFGPLALAASDPLPAELRQFRHCFQAALNGVKTDWRHWLGKGPGLTPSHDDTLSGMLLAA WYYGALDARSGRQFFACSDNLQLVTTAVSVSYLRYAAQGYFASPLLHFVHALSCPKRTAV AIDSLLALGHTSGADTLLGFWLGQQLLQGKP >gi|296493393|gb|ADTK01000108.1| GENE 6 4220 - 5479 1238 419 aa, chain - ## HITS:1 COG:no KEGG:EcolC_3103 NR:ns ## KEGG: EcolC_3103 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 419 1 419 419 818 100.0 0 MFTSVAQANAAVIEQIRRARPHWLDVQPASSLISELNEGKTLLHAGPPMRWQEMTGPMKG ACVGACLFEGWVKDEAQALAILEQGEVNFIPCHHVNAVGPMGGITSASMPMLVVENVTDG NRAYCNLNEGIGKVMRFGAYGEDVLTRHRWMRDVLMPVLSAALGRMERGIDLTAMMAQGI TMGDEFHQRNIASSALLMRALAPQIARLDHDKQHIAEVMDFLSVTDQFFLNLAMAYCKAA 
MDAGAMIRAGSIVTAMTRNGNMFGIRVSGLGERWFTAPVNTPQGLFFTGFSQEQANPDMG DSAITETFGIGGAAMIAAPGVTRFVGAGGMEAARAVSEEMAEIYLERNMQLQIPGWDFQG ACLGLDIRRVVETGITPLINTGIAHKEAGIGQIGAGTVRAPLACFEQALEALAESMGIG >gi|296493393|gb|ADTK01000108.1| GENE 7 5489 - 7156 1819 555 aa, chain - ## HITS:1 COG:ECs0580 KEGG:ns NR:ns ## COG: ECs0580 COG0074 # Protein_GI_number: 15829834 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, alpha subunit # Organism: Escherichia coli O157:H7 # 1 555 1 555 555 1021 98.0 0 MIHAFIKKGCFQDSVSLMIISRKLSESENVDDVSVMMGTPANKALLDTTGFWHDDFNNAT PNDICVAIRSEAADAGIAQAIMQQLEEALKQLAQGAGSSQALTQVRRWDSACQKLPDASL ALISVAGEYAAELANQALDRNLNVMMFSDNVTLEDEIQLKTRAREKGLLVMGPDCGTSMI AGTPLAFANVMPEGNIGVIGASGTGIQELCSQIALAGEGITHAIGLGGRDLSREVGGISA LTALEMLSADEKSEVLAFVSKPPAEAVRLKIVNAMKATGKPTVALFLGYTPAVARDENVW FASSLDEAARLACLLSRVTARRNAIAPVSSGFICGLYTGGTLAAEAAGLLAGHLGVEADD THHHGMMLDADGHQIIDLGDDFYTVGRPHPMIDPALRNQLIADLGAKPQVRVLLLDVVIG FGATADPAASLVSAWQKACAARSDNQPLYAIATVTGTERDPQCRSQQIATLEDAGIAVVS SLPEATLLAAALIRPLSPATQQHTPSLLENVAVINIGLRSFALELQSASKPVVHYQWSPV AGGNKKLARLLERLQ >gi|296493393|gb|ADTK01000108.1| GENE 8 7473 - 8522 962 349 aa, chain + ## HITS:1 COG:ylbC KEGG:ns NR:ns ## COG: ylbC COG2055 # Protein_GI_number: 16128501 # Func_class: C Energy production and conversion # Function: Malate/L-lactate dehydrogenases # Organism: Escherichia coli K12 # 1 349 1 349 349 701 98.0 0 MKISRETLHQLIENKLCQAGLKREHAATVAEVLVYADARGIHSHGAVRVEYYAERISKGG TNREPEFRLEETGPCSAILHADNAAGQVAAKMGMEHAIKTAQQNGVAVVGISRMGHSGAI SYFVQQAARAGLIGISMCQSDPMVVPFGGAEIYYGTNPLAFAAPGEGNEILTFDMATTVQ AWGKVLDARSRNMSIPDTWAVDKNGAPTTDPFAVHALLPAAGPKGYGLMMMIDVLSGVLL GLPFGRQVSSMYDDLHAGRNLGQLHVVINPNFFSSSELFRQHLSQTMRELNAITPAPGFN QVYYPGQDQDIKQRKAAVEGIEIVDDIYQYLISDALYNTSYETKNPFAQ >gi|296493393|gb|ADTK01000108.1| GENE 9 8544 - 9779 1066 411 aa, chain + ## HITS:1 COG:ylbB KEGG:ns NR:ns ## COG: ylbB COG0624 # Protein_GI_number: 16128500 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine 
deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Escherichia coli K12 # 1 411 1 411 411 870 99.0 0 MITHFRQAIEETLPWLSSFGADPTGGMTRLLYSPEWLETQQQFKKRMAASGLETRFDEVG NLYGRLSGTEYPQEVVLSGSHIDTVVNGGNLDGQFGALAAWLAIDWLKTQYGAPLRTVEV VAMAEEEGSRFPYVFWGSKNIFGLANPDDVRNICDAKGNSFVDAMKACGFTLPNAPLTPR QDIKAFVELHIEQGCVLESNGQSIGVVNAIVGQRRYTVTLNGESNHAGTTPMGYRRDTVY AFSRICHQSVEKAKRMGDPLVLTFGKVEPRPNTVNVVPGKTTFTIDCRHTDAAVLRDFTQ QLENDMRAICDEMDIGIDIDLWMDEEPVPMNKELVATLTELCESEKLNYRVMHSGAGHDA QIFAPRVPTCMIFIPSINGISHNPAERTNITDLAEGVKTLALMLYQLAWQK >gi|296493393|gb|ADTK01000108.1| GENE 10 9790 - 10575 915 261 aa, chain + ## HITS:1 COG:ECs0577 KEGG:ns NR:ns ## COG: ECs0577 COG3257 # Protein_GI_number: 15829831 # Func_class: R General function prediction only # Function: Uncharacterized protein, possibly involved in glyoxylate utilization # Organism: Escherichia coli O157:H7 # 1 261 1 261 261 496 99.0 1e-140 MGYLNNVTGYRDDLLANRAIVKHGNFALLTPDGLVKNIIPGFENCDATILSTPKLGASFV DYLVTLHQNGGNQQGFGGEGIETFLYVISGNITAKAEGKTFALSEGGYLYCPPGSLMTFV NAQAEDSQIFLYKRRYVPVEGHAPWLVSGNASELERIHYEGMDDVILLDFLPKELGFDMN MHILSFAPGASHGYIETHVQEHGAYILSGQGVYNLDNNWIPVKKGDYIFMGAYSLQAGYG VGRGEAFSYIYSKDCNRDVEI >gi|296493393|gb|ADTK01000108.1| GENE 11 10803 - 11948 1101 381 aa, chain - ## HITS:1 COG:ybbZ KEGG:ns NR:ns ## COG: ybbZ COG1929 # Protein_GI_number: 16128498 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerate kinase # Organism: Escherichia coli K12 # 1 381 1 381 381 642 98.0 0 MKIVIAPDSFKESLSAEKCCQAIKAGFSTLFPDANYICLPIADGGEGTVDAMVAATGGNI VTLEVCGPMGEKVNAFYGLTGDGKTAVIEMAAASGLMLVAPEKRNPLLASSFGTGELIRH ALDNGIRHIILGIGGSATVDGGMGMAQALGVRFLDADGQVLAANGGNLARVASIEMEECD PRLANSHIEVACDVDNPLVGARGAAAVFGPQKGATPEMVEELEQGLQNYARVLQQQTEIN VCQMAGGGAAGGMGIAAAVFLNADIKPGIEIVLNAVNLAQAVQGAALVITGEGRIDSQTA GGKAPLGVASVAKQFNVPVIGIAGVLGDGVEVVHQYGIDAVFSILPRLAPLAEVLASGET NLFNSARNIACAIKIGQGIKN >gi|296493393|gb|ADTK01000108.1| GENE 12 11970 - 13271 943 433 aa, chain - ## HITS:1 COG:ECs0575 KEGG:ns NR:ns ## COG: 
ECs0575 COG2233 # Protein_GI_number: 15829829 # Func_class: F Nucleotide transport and metabolism # Function: Xanthine/uracil permeases # Organism: Escherichia coli O157:H7 # 1 433 3 435 435 747 99.0 0 MFNFAVSRESLLSGFQWFFFIFCNTVVVPPTLLSAFQLPQSSLLTLTQYAFLATALACFA QAFCGHRRAIMEGPGGLWWGTILTITLGEASRGTPINDIATSLAVGIALSGVLTMLIGFS GLGHRLARLFTPSVMVLFMLMLGAQLTTIFFKGMLGLPFGIADPNFKIQLPPFALSVAVM CLVLGMIIFLPQRFARYGLLVGTITGWLLWYFCFPSSHSLSGELHWQWFPLGSGGALSPG IILTAVITGLVNISNTYGAIRGTDVFYPQQGAGNTRYRRSFVATGFMTLITVPLAVIPFS PFVSSIGLLTQTGDYTRRSFIYGSVICLLVALVPALTRLFCSIPLPVSSAVMLVSYLPLL FSALVFSQQITFTARNIYRLALPLFVGIFLMALPPVYLQDLPLTLRPLLSNGLLVGILLA VLMDNLIPWERIE >gi|296493393|gb|ADTK01000108.1| GENE 13 13328 - 14689 1582 453 aa, chain - ## HITS:1 COG:ybbX KEGG:ns NR:ns ## COG: ybbX COG0044 # Protein_GI_number: 16128496 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Escherichia coli K12 # 1 453 1 453 453 939 98.0 0 MSFDLIIKNGTVILENEARVVDIAVKGGKIAAIGQDLGDAKEVMDASGLVVSPGMVDAHT HISEPGRSHWEGYETGTRAAAKGGITTMIEMPLNQLPATVDRASIELKFDAAKGKLTIDA AQLGGLVSYNIDRLHELDEVGVVGFKCFVATCGDRGIDNDFRDVNDWQFFKGAQKLGELG QPVLVHCENALICDALGEEAKREGRVTAHDYVASRPVFTEVEAIRRVLYLAKVAGCRLHV CHISSPEGVEEVTRARQEGQDVTCESCPHYFVLDTDQFEEIGTLAKCSPPIRDLENQKGM WEKLFNGEIDCLVSDHSPCPPEMKAGNIMEAWGGIAGLQNCMDVMFDEAVQKRGMSLPMF GKLIATNAADIFGLQQKGRIAPGKDADFVFIQPNSSYVLTNDDLEYRHKVSPYVGRTIGA RITKTILRGDVIYDIEQGFPVAPKGQFILKHQQ >gi|296493393|gb|ADTK01000108.1| GENE 14 14749 - 16203 1247 484 aa, chain - ## HITS:1 COG:ECs0572 KEGG:ns NR:ns ## COG: ECs0572 COG1953 # Protein_GI_number: 15829826 # Func_class: F Nucleotide transport and metabolism; H Coenzyme transport and metabolism # Function: Cytosine/uracil/thiamine/allantoin permeases # Organism: Escherichia coli O157:H7 # 1 463 1 463 463 812 99.0 0 MEHQRKLFQQRGYSEDLLPKTQSQRTWKTFNYFTLWMGSVHNVPNYVMVGGFFILGLSTF SIMLAIILSAFFIAAVMVLNGAAGSKYGVPFAMILRASYGVRGALFPGLLRGGIAAIMWF 
GLQCYAGSLACLILIGKIWPGFLTLGGDFTLLGLSLPGLITFLLFWLVNVGIGFGGGKVL NKFTAILNPCIYIVFGGMAIWAISLVGLGPIFDYIPSGIQKAENSGFLFLVVINAVVAVW AAPAVSASDFTQNAHSFREQALGQTLGLVVAYILFAVAGVCIIAGASIHYGADTWNVLDI VQRWDSLFASFFAVLVILMTTISTNATGNIIPAGYQIAAIAPTKLTYKNGVLIASIISLL ICPWKLMENQDSIYLFLDIIGGMLGPVIGVMMAHYFVVMRGKINLDELYTAPGDYKYYDN GFNLTAFSVTLVAVILSLGGKFIPFMEPLSRVSWFVGVIVAFAAYALLKKRTTAEKTGEQ KTIG >gi|296493393|gb|ADTK01000108.1| GENE 15 16372 - 17250 1055 292 aa, chain - ## HITS:1 COG:ybbQ KEGG:ns NR:ns ## COG: ybbQ COG2084 # Protein_GI_number: 16128493 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases # Organism: Escherichia coli K12 # 1 292 1 292 292 551 100.0 1e-157 MKLGFIGLGIMGTPMAINLARAGHQLHVTTIGPVADELLSLGAVSVETARQVTEASDIIF IMVPDTPQVEEVLFGENGCTKASLKGKTIVDMSSISPIETKRFARQVNELGGDYLDAPVS GGEIGAREGTLSIMVGGDEAVFERVKPLFELLGKNITLVGGNGDGQTCKVANQIIVALNI EAVSEALLFASKAGADPVRVRQALMGGFASSRILEVHGERMIKRTFNPGFKIALHQKDLN LALQSAKALALNLPNTATCQELFNTCAANGGSQLDHSALVQALELMANHKLA >gi|296493393|gb|ADTK01000108.1| GENE 16 17350 - 18126 766 258 aa, chain - ## HITS:1 COG:gip KEGG:ns NR:ns ## COG: gip COG3622 # Protein_GI_number: 16128492 # Func_class: G Carbohydrate transport and metabolism # Function: Hydroxypyruvate isomerase # Organism: Escherichia coli K12 # 1 258 1 258 258 553 99.0 1e-158 MLRFSANLSMLFGEYDFLARFEKAAQCGFRGVEFMFPYDYDIEELKQVLASNKLEHTLHN LPAGDWAAGERGIACIPGREEEFRDGVAAAIRYARALGNKKINCLVGKTPAGFSSEQIHA TLVENLRYAANMLMKEDILLLIEPINHFDIPGFHLTGTRQALKLIDDVGCCNLKIQYDIY HMQRMEGELTNTMTQWADKIGHLQIADNPHRGEPGTGEINYDYLFKVIENSDYNGWVGCE YKPQTTTEAGLRWMDPYR >gi|296493393|gb|ADTK01000108.1| GENE 17 18139 - 19920 2119 593 aa, chain - ## HITS:1 COG:ECs0568 KEGG:ns NR:ns ## COG: ECs0568 COG3960 # Protein_GI_number: 15829822 # Func_class: R General function prediction only # Function: Glyoxylate carboligase # Organism: Escherichia coli O157:H7 # 1 593 1 593 593 1206 100.0 0 MAKMRAVDAAMYVLEKEGITTAFGVPGAAINPFYSAMRKHGGIRHILARHVEGASHMAEG 
YTRATAGNIGVCLGTSGPAGTDMITALYSASADSIPILCITGQAPRARLHKEDFQAVDIE AIAKPVSKMAVTVREAALVPRVLQQAFHLMRSGRPGPVLVDLPFDVQVAEIEFDPDMYEP LPVYKPAASRMQIEKAVEMLIQAERPVIVAGGGVINADAAALLQQFAELTSVPVIPTLMG WGCIPDDHELMAGMVGLQTAHRYGNATLLASDMVFGIGNRFANRHTGSVEKYTEGRKIVH IDIEPTQIGRVLCPDLGIVSDAKAALTLLVEVAQEMQKAGRLPCRKEWVADCQQRKRTLL RKTHFDNVPVKPQRVYEEMNKAFGRDVCYVTTIGLSQIAAAQMLHVFKDRHWINCGQAGP LGWTIPAALGVCAADPKRNVVAISGDFDFQFLIEELAVGAQFNIPYIHVLVNNAYLGLIR QSQRAFDMDYCVQLAFENINSSEVNGYGVDHVKVAEGLGCKAIRVFKPEDIAPAFEQAKA LMAQYRVPVVVEVILERVTNISMGSELDNVMEFEDIADNAADAPTETCFMHYE >gi|296493393|gb|ADTK01000108.1| GENE 18 20010 - 20825 710 271 aa, chain - ## HITS:1 COG:ECs0567 KEGG:ns NR:ns ## COG: ECs0567 COG1414 # Protein_GI_number: 15829821 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 271 1 271 271 528 100.0 1e-150 MTEVRRRGRPGQAEPVAQKGAQALERGIAILQYLEKSGGSSSVSDISLNLDLPLSTTFRL LKVLQAADFVYQDSQLGWWHIGLGVFNVGAAYIHNRDVLSVAGPFMRRLMLLSGETVNVA IRNGNEAVLIGQLECKSMVRMCAPLGSRLPLHASGAGKALLYPLAEEELMSIILQTGLQQ FTPTTLVDMPTLLKDLEQARELGYTVDKEEHVVGLNCIASAIYDDVGSVVAAISISGPSS RLTEDRFVSQGELVRDTARDISTALGLKAHP >gi|296493393|gb|ADTK01000108.1| GENE 19 20903 - 21385 508 160 aa, chain - ## HITS:1 COG:ECs0566 KEGG:ns NR:ns ## COG: ECs0566 COG3194 # Protein_GI_number: 15829820 # Func_class: F Nucleotide transport and metabolism # Function: Ureidoglycolate hydrolase # Organism: Escherichia coli O157:H7 # 1 160 1 160 160 331 100.0 3e-91 MKLQVLPLSQEAFSAYGDVIETQQRDFFHINNGLVERYHDLALVEILEQDRTLISINRAQ PANLPLTIHELERHPLGTQAFIPMKGEVFVVVVALGDDKPDLSTLRAFITNGEQGVNYHR NVWHHPLFAWQRVTDFLTIDRGGSDNCDVESIPEQELCFA >gi|296493393|gb|ADTK01000108.1| GENE 20 21615 - 22541 614 308 aa, chain + ## HITS:1 COG:ECs0565 KEGG:ns NR:ns ## COG: ECs0565 COG0583 # Protein_GI_number: 15829819 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 308 1 308 308 612 100.0 1e-175 MFDPETLRTFIAVAETGSFSKAAERLCKTTATISYRIKLLEENTGVALFFRTTRSVTLTA 
AGEHLLSQARDWLSWLESMPSELQQVNDGVERQVNIVINNLLYNPQAVAQLLAWLNERYP FTQFHISRQIYMGVWDSLLYEGFSLAIGVTGTEALANTFSLDPLGSVQWRFVMAADHPLA NVEEPLTEAQLRRFPAVNIEDSARTLTKRVAWRLPGQKEIIVPDMETKIAAHLAGVGIGF LPKSLCQSMIDNQQLVSRVIPTMRPPSPLSLAWRKFGSGKAVEDIVTLFTQRRPEISGFL EIFGNPRS >gi|296493393|gb|ADTK01000108.1| GENE 21 22610 - 23704 1141 364 aa, chain + ## HITS:1 COG:ybbB KEGG:ns NR:ns ## COG: ybbB COG2603 # Protein_GI_number: 16128487 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Escherichia coli K12 # 1 364 1 364 364 713 97.0 0 MQERHTEQDYRALLIADTPIIDVRAPIEFEHGAMPAAINLPLMNNDERAAVGTCYKQQGS DAALALGHKLVAGEIRQQRMDAWRAACLQNPQGILCCARGGQRSHIVQRWLHEVGINYPL VEGGYKVLRQTAIQATIELSQKPIVLIGGCTGCGKTLLVQQQPNGVDLEGLARHRGSAFG RTLQPQLSQASFENLLAAEMLKTDARQNLRLWVLEDESRMIGSNHLPECLRERMTQAAIA VVEDPFEIRLERLNEEYFLRMHHDFTHAYGDEQGWQEYCEYLHHGLSAIKRRLGLQRYNE LAARLDAALTTQLTTGSTDGHLAWLVPLLEEYYDPMYRYQLEKKAEKVVFRGEWAEVAEW VKAQ >gi|296493393|gb|ADTK01000108.1| GENE 22 24115 - 24198 56 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKTFLDKVLIVLKALIALLELIRQFIE >gi|296493393|gb|ADTK01000108.1| GENE 23 24416 - 26830 2413 804 aa, chain - ## HITS:1 COG:ybbP KEGG:ns NR:ns ## COG: ybbP COG3127 # Protein_GI_number: 16128480 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Predicted ABC-type transport system involved in lysophospholipase L1 biosynthesis, permease component # Organism: Escherichia coli K12 # 1 804 1 804 804 1429 99.0 0 MIARWFWREWRSPSLLIVWLALSLAVACVLALGNISDRMEKGLSQQSREFMAGDRALRSS REVPQAWLEEAQKRGLKVGKQLTFATMTFAGDTPQLANVKAVDDIYPMYGDLQTNPPGLK PQAGSVLLAPRLMALLNLKTGDTIDVGDATLRIAGEVIQEPDSGFNPFQMAPRLMMNLAD VDKTGAVQPGSRVTWRYKFGGNENQLDGYEKWLLPQLKPEQRWYGLEQDEGALGRSMERS QQFLLLSALLTLLLAVAAVAVAMNHYCRSRYDLVAILKTLGAGRAQLRKLIVGQWLMVLT LSAVTGGAIGLLFENVLMVLLKPVLPAALPPASLWPWLWALGTMTVISLLVGLRPYRLLL ATQPLRVLRNDVVANVWPLKFYLPIVSVVVVLLLAGLMGGSMLLWAVLAGAVVLALLCGV LGWMLLNVLRRMTLKSLPLRLAVSRLLRQPWSTLSQLSAFSLSFMLLALLLVLRGDLLDR 
WQQQLPPESPNYFLINIATEQVAPLKAFLAEHHIVPESFYPVVRARLTAINDKPTEGNED EALNRELNLTWQNTRPDHNPIVAGNWPPKADEVSMEEGLAKRLNVALGDTVTFMGDTQEF RAKVTSLRKVDWESLRPNFYFIFPEGALDGQPQSWLTSFRWENGNGMLTQLNRQFPTISL LDIGAILKQVGQVLEQVSRALEVMVVLVTACGVLLLLAQVQVGMRQRHQELVVWRTLGAG KKLLRTTLWCEFAMLGFVSGLVAAIGAETALAVLQAKVFDFPWEPDWRLWIVLPCSGALL LSLFGGWLGARLVKGKALFRQFAG >gi|296493393|gb|ADTK01000108.1| GENE 24 26827 - 27513 313 228 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 25 228 20 223 223 125 36 6e-28 MPAENIVEVHHLKKSVGQGEHELSILTGVELVVKRGETIALVGESGSGKSTLLAILAGLD DGSSGEVSLVGQPLHNMDEEARAKLRAKHVGFVFQSFMLIPTLNALENVELPALLRGESS AESRNGAKALLEQLGLGKRLDHLPAQLSGGEQQRVALARAFNGRPDVLFADEPTGNLDRQ TGDKIADLLFSLNREHGTTLIMVTHDLQLAARCDRCLRLVNGQLQEEA >gi|296493393|gb|ADTK01000108.1| GENE 25 27451 - 28107 602 218 aa, chain + ## HITS:1 COG:tesA KEGG:ns NR:ns ## COG: tesA COG2755 # Protein_GI_number: 16128478 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Escherichia coli K12 # 11 218 1 208 208 404 100.0 1e-113 MLPLTDGLLKMMNFNNVFRWHLPFLFLVLLTFRAAAADTLLILGDSLSAGYRMSASAAWP ALLNDKWQSKTSVVNASISGDTSQQGLARLPALLKQHQPRWVLVELGGNDGLRGFQPQQT EQTLRQILQDVKAANAEPLLMQIRLPANYGRRYNEAFSAIYPKLAKEFDVPLLPFFMEEV YLKPQWMQDDGIHPNRDAQPFIADWMAKQLQPLVNHDS >gi|296493393|gb|ADTK01000108.1| GENE 26 28097 - 28906 229 269 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 [alpha proteobacterium BAL199] # 14 268 3 249 259 92 32 3e-18 MTHKATEILTGKVMQKSVLITGCSSGIGLESALELKRQGFHVLAGCRKPDDVERMNSMGF TGVLIDLDSPESVDRAADEVIALTDNCLYGIFNNAGFGMYGPLSTISRAQMEQQFSANFF GAHQLTMRLLPAMLPHGEGRIVMTSSVMGLISTPGRGAYAASKYALEAWSDALRMELRHS GIKVSLIEPGPIRTRFTDNVNQTQSDKPVENPGIAARFTLGPEAVVDKVRHAFISEKPKM RYPVTLVTWAVMVLKRLLPGRVMDKILQG >gi|296493393|gb|ADTK01000108.1| GENE 27 28967 - 29821 1263 284 aa, chain + ## HITS:1 COG:ECs0555 KEGG:ns NR:ns ## COG: ECs0555 
COG3118 # Protein_GI_number: 15829809 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Thioredoxin domain-containing protein # Organism: Escherichia coli O157:H7 # 1 284 13 296 296 483 100.0 1e-136 MSVENIVNINESNLQQVLEQSMTTPVLFYFWSERSQHCLQLTPILESLAAQYNGQFILAK LDCDAEQMIAAQFGLRAIPTVYLFQNGQPVDGFQGPQPEEAIRALLDKVLPREEELKAQQ AMQLMQEGNYTDALPLLKDAWQLSNQNGEIGLLLAETLIALNRSEDAEAVLKTIPLQDQD TRYQGLVAQIELLKQAADTPEIQQLQQQVAENPEDAALATQLALQLHQVGRNEEALELLF GHLRKDLTAADGQTRKTFQEILAALGTGDALASKYRRQLYALLY >gi|296493393|gb|ADTK01000108.1| GENE 28 29884 - 30663 762 259 aa, chain - ## HITS:1 COG:ECs0554 KEGG:ns NR:ns ## COG: ECs0554 COG0390 # Protein_GI_number: 15829808 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Escherichia coli O157:H7 # 1 259 10 268 268 443 99.0 1e-124 MNSHNITNESLALALMLVVVAILISHKEKLALEKDILWSVGRAIIQLIIVGYVLKYIFSV DDASLTLLMVLFICFNAAWNAQKRSKYIAKAFISSFVAITVGAGITLAVLILSGSIEFIP MQVIPIAGMIAGNAMVAVGLCYNNLGQRVISEQQQIQEKLSLGATPKQASAILIRDSIRA ALIPTVDSAKTVGLVSLPGMMSGLIFAGIDPVKAIKYQIMVTFMLLSTASLSTIIACYLT YRKFYNSRHQLVVTQLKKK >gi|296493393|gb|ADTK01000108.1| GENE 29 30650 - 31327 177 225 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 7 208 1 210 245 72 30 4e-12 MQENSPLLQLQNVGYLAGDTKILNNINFSLRAGEFKLITGPSGCGKSTLLKIVASLISPT SGTLLFEGEDVSTLKPEIYRQQVSYCAQTPTLFGDTVYDNLIFPWQIRNQQPDPAIFLDF LERFALPDSILTKDIAELSGGEKQRISLIRNLQFIPKVLLLDEITSALDESNKHNVNEMI HRYVREQNIAVLWVTHDKDEINHADKVITLQPHAGEMQEARYELA >gi|296493393|gb|ADTK01000108.1| GENE 30 31473 - 32390 1335 305 aa, chain + ## HITS:1 COG:ECs0552 KEGG:ns NR:ns ## COG: ECs0552 COG0330 # Protein_GI_number: 15829806 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Escherichia coli O157:H7 # 1 305 1 305 305 536 100.0 1e-152 
MLIFIPILIFVALVIVGAGVKIVPQGYQWTVERFGRYTKTLQPGLSLVVPFMDRIGRKIN MMEQVLDIPSQEVISKDNANVTIDAVCFIQVIDAPRAAYEVSNLELAIINLTMTNIRTVL GSMELDEMLSQRDSINSRLLRIVDEATNPWGIKVTRIEIRDVRPPAELISSMNAQMKAER TKRAYILEAEGIRQAEILKAEGEKQSQILKAEGERQSAFLQAEARERSAEAEARATKMVS EAIASGDIQAVNYFVAQKYTEALQQIGSSSNSKVVMMPLEASSLMGSIAGIAELVKDSAN KRTQP >gi|296493393|gb|ADTK01000108.1| GENE 31 32387 - 32845 415 152 aa, chain + ## HITS:1 COG:ybbJ KEGG:ns NR:ns ## COG: ybbJ COG1585 # Protein_GI_number: 16128472 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Membrane protein implicated in regulation of membrane protease activity # Organism: Escherichia coli K12 # 2 152 1 151 151 241 100.0 3e-64 MMELMVVHPHIFWLSLGGLLLAAEMLGGNGYLLWSGVAAVITGLVVWLVPLGWEWQGVMF AILTLLAAWLWWKWLSRRVREQKHSDSHLNQRGQQLIGRRFVLESPLVNGRGHMRVGDSS WPVSASEDLGAGTHVEVIAIEGITLHIRAVSS >gi|296493393|gb|ADTK01000108.1| GENE 32 32846 - 33253 369 135 aa, chain - ## HITS:1 COG:ybbI KEGG:ns NR:ns ## COG: ybbI COG0789 # Protein_GI_number: 16128471 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 135 1 135 135 266 100.0 8e-72 MNISDVAKITGLTSKAIRFYEEKGLVTPPMRSENGYRTYTQQHLNELTLLRQARQVGFNL EESGELVNLFNDPQRHSADVKRRTLEKVAEIERHIEELQSMRDQLLALANACPGDDSADC PIIENLSGCCHHRAG >gi|296493393|gb|ADTK01000108.1| GENE 33 33378 - 34667 1375 429 aa, chain - ## HITS:1 COG:ybaT KEGG:ns NR:ns ## COG: ybaT COG0531 # Protein_GI_number: 16128470 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 429 2 430 430 668 100.0 0 MNTEGNNGNKPLGLWNVVSIGIGAMVGAGIFALLGQAALLMEASTWVAFAFGGIVAMFSG YAYARLGASYPSNGGIIDFFRRGLGNGVFSLALSLLYLLTLAVSIAMVARAFGAYAVQFL HEGSQEEHLILLYALGIIAVMTLFNSLSNHAVGRLEVILVGIKMMILLLLIIAGVWSLQP AHISVSAPPSSGAFFSCIGITFLAYAGFGMMANAADKVKDPQVIMPRAFLVAIGVTTLLY ISLALVLLSDVSALELEKYADTAVAQAASPLLGHVGYVIVVIGALLATASAINANLFAVF 
NIMDNMGSERELPKLMNKSLWRQSTWGNIIVVVLIMLMTAALNLGSLASVASATFLICYL AVFVVAIRLRHDIHASLPILIVGTLVMLLVIVGFIYSLWSQGSRALIWIIGSLLLSLIVA MVMKRNKTV >gi|296493393|gb|ADTK01000108.1| GENE 34 34673 - 35605 882 310 aa, chain - ## HITS:1 COG:ybaS KEGG:ns NR:ns ## COG: ybaS COG2066 # Protein_GI_number: 16128469 # Func_class: E Amino acid transport and metabolism # Function: Glutaminase # Organism: Escherichia coli K12 # 1 310 1 310 310 605 99.0 1e-173 MLDANKLQQAVDQAYTQFHSLNGGQNADYIPFLANVPGQLAAVAIVTCDGKVYSAGDSDY RFALESISKVCTLALALEDVGPQAVQDKIGADPTGLPFNSVIALELHGGKPLSPLVNAGA IATTSLINAENVEQRWQRILHIQQQLAGEQVALSDEVNQSEQTTNFHNRAIAWLLYSAGY LYCDAMEACDVYTRQCSTLLNTVELATLGATLAAGGVNPLTHKRVLQADNVPYILAEMMM EGLYGRSGDWAYRVGLPGKSGVGGGILAVVPGVMGIAAFSPPLDEEGNSVRGQKMVASVA KQLGYNVFKG >gi|296493393|gb|ADTK01000108.1| GENE 35 35867 - 38371 2763 834 aa, chain + ## HITS:1 COG:ybaR KEGG:ns NR:ns ## COG: ybaR COG2217 # Protein_GI_number: 16128468 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Escherichia coli K12 # 1 834 1 834 834 1467 99.0 0 MSQIIDLTLDGLSCGHCVKRVKESLEQRPDVEQADVSITEAHVTGTASAEQLIETIKQAG YDASVSHPKAKPLAESSIPSEALTAVSEALPAATADDDDSQQLLLSGMSCASCVTRVQNA LQSVPGVTQARVNLAERTALVMGSASPQDLVQAVEKAGYGAEAIEDDAKRRERQQETAVA TMKRFRWQAIVALAVGIPVMVWGMIGDNMMVTADNRSLWLVIGLITLAVMVFAGGHFYRS AWKSLLNGAATMDTLVALGTGVAWLYSMSVNLWPQWFPMEARHLYYEASAMIIGLINLGH MLEARARQRSSKALEKLLDLTPPTARLVTDEGEKSVPLAEVQPGMLLRLTTGDRVPVDGE ITQGEAWLDEAMLTGEPIPQQKGEGESVHAGTVVQDGSVLFRASAVGSHTTLSRIIRMVR QAQSSKPEIGQLADKISAVFVPVVVVIALVSAAIWYFFGPAPQIVYTLVIATTVLIIACP CALGLATPMSIISGVGRAAEFGVLVRDADALQRASTLDTVVFDKTGTLTEGKPQVVAVKT FADVDEAQALRLAAALEQGSSHPLARAILDKAGDMQLPQVNGFRTLRGLGVSGEAEGHAL LLGNQALLNEQQVGTKAIEAEITAQASQGATPVLLAIDGKAVALLAVRDPLRSDSVAALQ RLHKAGYRLVMLTGDNPTTANAIAKEAGIDEVIAGVLPDGKAEAIKRLQSEGRQVAMVGD GINDAPALAQADVGIAMGGGSDVAIETAAITLMRHSLMGVADALAISRATLRNMKQNLLG AFIYNSIGIPVAAGILWPFTGTLLNPVVAGAAMALSSITVVSNANRLLRFKPKE >gi|296493393|gb|ADTK01000108.1| GENE 36 38586 - 38981 359 131 aa, chain - ## HITS:1 
COG:ECs0536 KEGG:ns NR:ns ## COG: ECs0536 COG3093 # Protein_GI_number: 15829790 # Func_class: R General function prediction only # Function: Plasmid maintenance system antidote protein # Organism: Escherichia coli O157:H7 # 1 131 1 131 131 225 98.0 2e-59 MIQYVLASLFTGKQQLKTMKQATRKPTTPGDILLYEYLEPLDLKINELAELLHVHRNSVS ALINNNRKLTTEMAFRLAKVFDTTVDFRLNLQAAVDLWEVENNMRTQEELGRIETVAEYL ARREERAKKVA >gi|296493393|gb|ADTK01000108.1| GENE 37 39065 - 39859 757 264 aa, chain + ## HITS:1 COG:ybaP KEGG:ns NR:ns ## COG: ybaP COG3735 # Protein_GI_number: 16128466 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 264 1 264 264 504 99.0 1e-143 MDLLYRVKTLWAALRGNHYTWPAIDITLPGNRHFHLIGSIHMGSHDMAPLPTRLLKKLKN ADALIVEADVSTSDTPFANLPACEALEERISEEQLQNLQHISQEMGISPSLFSTQPLWQI AMVLQATQAQKLGLRAEYGIDYQLLQAAKQQHKPVIELEGAENQIAMLLQLPDKGLALLD DTLTHWHTNARLLQQMMSWWLNAPPQNNEITLPNTFSQSLYDVLMHQRNLAWRDKLRAMP PGRYVVAVGALHLYGEGNLPQMLR >gi|296493393|gb|ADTK01000108.1| GENE 38 40063 - 40542 719 159 aa, chain + ## HITS:1 COG:ybaK KEGG:ns NR:ns ## COG: ybaK COG2606 # Protein_GI_number: 16128465 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 159 1 159 159 292 100.0 2e-79 MTPAVKLLEKNKISFQIHTYEHDPAETNFGDEVVKKLGLNPDQVYKTLLVAVNGDMKHLA VAVTPVAGQLDLKKVAKALGAKKVEMADPMVAQRSTGYLVGGISPLGQKKRLPTIIDAPA QEFATIYVSGGKRGLDIELAAGDLAKILDAKFADIARRD >gi|296493393|gb|ADTK01000108.1| GENE 39 40579 - 42231 1791 550 aa, chain - ## HITS:1 COG:ushA KEGG:ns NR:ns ## COG: ushA COG0737 # Protein_GI_number: 16128464 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases # Organism: Escherichia coli K12 # 1 550 1 550 550 1117 100.0 0 MKLLQRGVALALLTTFTLASETALAYEQDKTYKITVLHTNDHHGHFWRNEYGEYGLAAQK TLVDGIRKEVAAEGGSVLLLSGGDINTGVPESDLQDAEPDFRGMNLVGYDAMAIGNHEFD NPLTVLRQQEKWAKFPLLSANIYQKSTGERLFKPWALFKRQDLKIAVIGLTTDDTAKIGN 
PEYFTDIEFRKPADEAKLVIQELQQTEKPDIIIAATHMGHYDNGEHGSNAPGDVEMARAL PAGSLAMIVGGHSQDPVCMAAENKKQVDYVPGTPCKPDQQNGIWIVQAHEWGKYVGRADF EFRNGEMKMVNYQLIPVNLKKKVTWEDGKSERVLYTPEIAENQQMISLLSPFQNKGKAQL EVKIGETNGRLEGDRDKVRFVQTNMGRLILAAQMDRTGADFAVMSGGGIRDSIEAGDISY KNVLKVQPFGNVVVYADMTGKEVIDYLTAVAQMKPDSGAYPQFANVSFVAKDGKLNDLKI KGEPVDPAKTYRMATLNFNATGGDGYPRLDNKPGYVNTGFIDAEVLKAYIQKSSPLDVSV YEPKGEVSWQ >gi|296493393|gb|ADTK01000108.1| GENE 40 42449 - 43669 1161 406 aa, chain + ## HITS:1 COG:fsr KEGG:ns NR:ns ## COG: fsr COG0477 # Protein_GI_number: 16128463 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 406 1 406 406 676 100.0 0 MAMSEQPQPVAGAAASTTKARTSFGILGAISLSHLLNDMIQSLILAIYPLLQSEFSLTFM QIGMITLTFQLASSLLQPVVGYWTDKYPMPWSLPIGMCFTLSGLVLLALAGSFGAVLLAA ALVGTGSSVFHPESSRVARMASGGRHGLAQSIFQVGGNFGSSLGPLLAAVIIAPYGKGNV AWFVLAALLAIVVLAQISRWYSAQHRMNKGKPKATIINPLPRNKVVLAVSILLILIFSKY FYMASISSYYTFYLMQKFGLSIQNAQLHLFAFLFAVAAGTVIGGPVGDKIGRKYVIWGSI LGVAPFTLILPYASLHWTGVLTVIIGFILASAFSAILVYAQELLPGRIGMVSGLFFGFAF GMGGLGAAVLGLIADHTSIELVYKICAFLPLLGMLTIFLPDNRHKD >gi|296493393|gb|ADTK01000108.1| GENE 41 43907 - 45583 501 558 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229845962|ref|ZP_04466074.1| 30S ribosomal protein S2 [Haemophilus influenzae 7P49H1] # 4 522 5 513 618 197 26 1e-49 MHHATPLITTIVGGLVLAFILGMLANKLRISPLVGYLLAGVLAGPFTPGFVADTKLAPEL AELGVILLMFGVGLHFSLKDLMAVKAIAIPGAIAQIAVATLLGMALSAVLGWSLMTGIVF GLCLSTASTVVLLRALEERQLIDSQRGQIAIGWLIVEDLVMVLTLVLLPAVAGMMEQGDV GFATLAVDMGITIGKVIAFIAIMMLVGRRLVPWIMARSAATGSRELFTLSVLALALGIAF GAVELFDVSFALGAFFAGMVLNESELSHRAAHDTLPLRDAFAVLFFVSVGMLFDPLILIQ QPLAVLATLAIILFGKSLAAFFLVRLFGHSQRTALTIAASLAQIGEFAFILAGLGMALNL LPQAGQNLVLAGAILSIMLNPVLFALLEKYLAKTETLEEQTLEEAIEEEKQIPVDICNHA LLVGYGRVGSLLGEKLLASDIPLVVIETSRTRVDELRERGVRAVLGNAANEEIMQLAHLE 
CAKWLILTIPNGYEAGEIVASARAKNPDIEIIARAHYDDEVTYITERGANQVVMGEREIA RTMLELLETPPAGGGVTG >gi|296493393|gb|ADTK01000108.1| GENE 42 45716 - 47020 1312 434 aa, chain - ## HITS:1 COG:ECs0530 KEGG:ns NR:ns ## COG: ECs0530 COG0524 # Protein_GI_number: 15829784 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli O157:H7 # 1 434 1 434 434 914 100.0 0 MKFPGKRKSKHYFPVNARDPLLQQFQPENETSAAWVVGIDQTLVDIEAKVDDEFIERYGL SAGHSLVIEDDVAEALYQELKQKNLITHQFAGGTIGNTMHNYSVLADDRSVLLGVMCSNI EIGSYAYRYLCNTSSRTDLNYLQGVDGPIGRCFTLIGESGERTFAISPGHMNQLRAESIP EDVIAGASALVLTSYLVRCKPGEPMPEATMKAIEYAKKYNVPVVLTLGTKFVIAENPQWW QQFLKDHVSILAMNEDEAEALTGESDPLLASDKALDWVDLVLCTAGPIGLYMAGFTEDEA KRKTQHPLLPGAIAEFNQYEFSRAMRHKDCQNPLRVYSHIAPYMGGPEKIMNTNGAGDGA LAALLHDITANSYHRSNVPNSSKHKFTWLTYSSLAQVCKYANRVSYQVLNQHSPRLTRGL PEREDSLEESYWDR >gi|296493393|gb|ADTK01000108.1| GENE 43 47172 - 48131 737 319 aa, chain + ## HITS:1 COG:aes KEGG:ns NR:ns ## COG: aes COG0657 # Protein_GI_number: 16128460 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Escherichia coli K12 # 1 319 1 319 319 656 98.0 0 MKPENKLPVLDLISAEMKTVVNTLQSDLPSWPATGTIAEQRQYYTLERRFWNAGAPEMAT RAYMVPTKYGQVETRLFCPQPDSPATLFYLHGGGFILGNLDTHDRIMRLLASYSQCTVIG IDYPLSPEARFPQAIEEIVAACCYFHQQAEDYQINMSRIGFAGDSAGAMLALASALWLRD KQIDCGKIAGVLLWYGLYGLRDSVTRRLLGGVWDGLTQQDLQMYEEAYLSNDADRESPYY CLFNNDLTREVPPCFIAGAEFDPLLDDSRLLYQTLAAHQQPCEFKLYPGTLHAFLHYSRM MKTADEALRDGAQFFTAQL >gi|296493393|gb|ADTK01000108.1| GENE 44 48128 - 49090 901 320 aa, chain - ## HITS:1 COG:ECs0528 KEGG:ns NR:ns ## COG: ECs0528 COG0276 # Protein_GI_number: 15829782 # Func_class: H Coenzyme transport and metabolism # Function: Protoheme ferro-lyase (ferrochelatase) # Organism: Escherichia coli O157:H7 # 1 320 1 320 320 622 99.0 1e-178 MRQTKTGILLANLGTPDAPTPEAVKRYLKQFLSDRRVVDTSRLLWWPLLRGVILPLRSPR VAKLYASVWMEDGSPLMVYSRQQQQALAQRLPDTPVALGMSYGSPSLESAVDELLAEHVD HIVVLPLYPQFSCSTVGAVWDELARILARKRSIPGISFIRDYADNHDYINALANSVRASF 
AKHGEPDLLLLSYHGIPQRYADEGDDYPQRCRTTTRELASALGMAPEKVMMTFQSRFGRE PWLMPYTDETLKMLGEKGVGHIQVMCPGFAADCLETLEEIAEQNREVFLGAGGKKYEYIP ALNATPEHIEMMANLVAAYR >gi|296493393|gb|ADTK01000108.1| GENE 45 49326 - 49970 985 214 aa, chain - ## HITS:1 COG:ECs0527 KEGG:ns NR:ns ## COG: ECs0527 COG0563 # Protein_GI_number: 15829781 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate kinase and related kinases # Organism: Escherichia coli O157:H7 # 1 214 1 214 214 404 100.0 1e-113 MRIILLGAPGAGKGTQAQFIMEKYGIPQISTGDMLRAAVKSGSELGKQAKDIMDAGKLVT DELVIALVKERIAQEDCRNGFLLDGFPRTIPQADAMKEAGINVDYVLEFDVPDELIVDRI VGRRVHAPSGRVYHVKFNPPKVEGKDDVTGEELTTRKDDQEETVRKRLVEYHQMTAPLIG YYSKEAEAGNTKYAKVDGTKPVAEVRADLEKILG Prediction of potential genes in microbial genomes Time: Mon May 16 15:20:27 2011 Seq name: gi|296493392|gb|ADTK01000109.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont289.4, whole genome shotgun sequence Length of sequence - 28782 bp Number of predicted genes - 27, with homology - 27 Number of transcription units - 19, operones - 6 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 52 - 1926 2587 ## COG0326 Molecular chaperone, HSP90 family - Prom 1948 - 2007 3.3 - Term 1989 - 2034 9.0 2 2 Op 1 23/0.000 - CDS 2036 - 2641 519 ## COG0353 Recombinational DNA repair protein (RecF pathway) 3 2 Op 2 30/0.000 - CDS 2641 - 2970 227 ## PROTEIN SUPPORTED gi|149916415|ref|ZP_01904934.1| 30S ribosomal protein S21 - Term 2993 - 3021 1.6 4 2 Op 3 8/0.143 - CDS 3023 - 4954 1969 ## COG2812 DNA polymerase III, gamma/tau subunits - Prom 4991 - 5050 2.5 - Term 4995 - 5031 2.2 5 3 Tu 1 . - CDS 5083 - 5634 823 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins 6 4 Tu 1 . - CDS 5787 - 6164 316 ## COG2832 Uncharacterized protein conserved in bacteria - Prom 6192 - 6251 4.0 + Prom 6151 - 6210 6.3 7 5 Op 1 . + CDS 6234 - 6761 544 ## COG3923 Primosomal replication protein N'' 8 5 Op 2 . 
+ CDS 6775 - 6936 188 ## ECS88_0463 hypothetical protein - Term 6978 - 7014 2.4 9 6 Tu 1 . - CDS 7148 - 10504 3541 ## COG3264 Small-conductance mechanosensitive channel - Prom 10556 - 10615 3.6 10 7 Tu 1 . + CDS 10516 - 10707 87 ## gi|300919868|ref|ZP_07136335.1| conserved domain protein + Term 10894 - 10928 0.6 - Term 10595 - 10624 1.4 11 8 Tu 1 . - CDS 10638 - 11285 549 ## COG1309 Transcriptional regulator - Prom 11346 - 11405 5.0 + Prom 11289 - 11348 3.6 12 9 Op 1 27/0.000 + CDS 11427 - 12620 1270 ## COG0845 Membrane-fusion protein 13 9 Op 2 . + CDS 12643 - 15792 3299 ## COG0841 Cation/multidrug efflux pump + Term 15812 - 15839 1.5 14 10 Op 1 . + CDS 16338 - 16712 251 ## ECDH10B_0417 hypothetical protein 15 10 Op 2 . + CDS 16738 - 16956 168 ## UTI89_C0487 hemolysin expression-modulating protein + Term 17050 - 17090 -0.5 + Prom 16958 - 17017 5.8 16 11 Tu 1 . + CDS 17128 - 17679 412 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Prom 17700 - 17759 3.4 17 12 Tu 1 . + CDS 17795 - 18265 436 ## G2583_0570 hypothetical protein + Prom 18344 - 18403 4.6 18 13 Tu 1 . + CDS 18423 - 19979 1320 ## COG4943 Predicted signal transduction protein containing sensor and EAL domains + Term 19986 - 20033 -0.9 - Term 19867 - 19904 2.2 19 14 Tu 1 . - CDS 20021 - 20374 461 ## COG5507 Uncharacterized conserved protein - 5S_RRNA 20485 - 20598 100.0 # ECU82664 [D:55896..56009] # 4.5S ribosomal RNA # Escherichia coli # Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Escherichia. 20 15 Tu 1 . + CDS 20753 - 21064 279 ## COG3695 Predicted methylated DNA-protein cysteine methyltransferase + Term 21177 - 21213 2.7 - Term 21053 - 21084 4.1 21 16 Tu 1 . - CDS 21095 - 21667 448 ## COG3126 Uncharacterized protein conserved in bacteria - Prom 21726 - 21785 2.6 + Prom 21786 - 21845 2.9 22 17 Tu 1 . 
+ CDS 21885 - 22745 719 ## COG1946 Acyl-CoA thioesterase + Term 22751 - 22790 6.1 - Term 22731 - 22786 12.6 23 18 Op 1 24/0.000 - CDS 22794 - 24080 1445 ## COG0004 Ammonia permease 24 18 Op 2 4/0.714 - CDS 24110 - 24448 484 ## COG0347 Nitrogen regulatory protein PII - Prom 24546 - 24605 4.8 - Term 24557 - 24592 4.5 25 19 Op 1 35/0.000 - CDS 24629 - 26410 225 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 26 19 Op 2 3/0.857 - CDS 26403 - 28175 205 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 27 19 Op 3 . - CDS 28205 - 28663 369 ## COG1522 Transcriptional regulators - Prom 28694 - 28753 5.5 Predicted protein(s) >gi|296493392|gb|ADTK01000109.1| GENE 1 52 - 1926 2587 624 aa, chain - ## HITS:1 COG:ECs0526 KEGG:ns NR:ns ## COG: ECs0526 COG0326 # Protein_GI_number: 15829780 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone, HSP90 family # Organism: Escherichia coli O157:H7 # 1 624 1 624 624 1181 99.0 0 MKGQETRGFQSEVKQLLHLMIHSLYSNKEIFLRELISNASDAADKLRFRALSNPDLYEGD GELRVRVSFDKDKRTLTISDNGVGMTRDEVIDHLGTIAKSGTKSFLESLGSDQAKDSQLI GQFGVGFYSAFIVADKVTVRTRAAGEKPENGVFWESAGEGEYTVADITKEDRGTEITLHL REGEDEFLDDWRVRSIISKYSDHIALPVEIEKREEKDGETVISWEKINKAQALWTRNKSE ITDEEYKEFYKHIAHDFNDPLTWSHNRVEGKQEYTSLLYIPSQAPWDMWNRDHKHGLKLY VQRVFIMDDAEQFMPNYLRFVRGLIDSSDLPLNVSREILQDSTVTRNLRNALTKRVLQML EKLAKDDAEKYQTFWQQFGLVLKEGPAEDFANQEAIAKLLRFASTYTDSSAQTVSLEDYV SRMKEGQEKIYYITADSYAAAKSSPHLELLRKKGIEVLLLSDRIDEWMMNYLTEFDGKPF QSVSKVDESLEKLADEVDESAKEAEKALTPFIDRVKALLGERVKDVRLTHRLTDTPAIVS TDADEMSTQMAKLFAAAGQKVPEVKYIFELNPDHVLVKRAADTEDEAKFSEWVELLLDQA LLAERGTLEDPNLFIRRMNQLLVS >gi|296493392|gb|ADTK01000109.1| GENE 2 2036 - 2641 519 201 aa, chain - ## HITS:1 COG:ECs0525 KEGG:ns NR:ns ## COG: ECs0525 COG0353 # Protein_GI_number: 15829779 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Escherichia coli O157:H7 # 1 201 1 201 
201 399 100.0 1e-111 MQTSPLLTQLMEALRCLPGVGPKSAQRMAFTLLQRDRSGGMRLAQALTRAMSEIGHCADC RTFTEQEVCNICSNPRRQENGQICVVESPADIYAIEQTGQFSGRYFVLMGHLSPLDGIGP DDIGLDRLEQRLAEEKITEVILATNPTVEGEATANYIAELCAQYDVEASRIAHGVPVGGE LEMVDGTTLSHSLAGRHKIRF >gi|296493392|gb|ADTK01000109.1| GENE 3 2641 - 2970 227 109 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149916415|ref|ZP_01904934.1| 30S ribosomal protein S21 [Roseobacter sp. AzwK-3b] # 5 109 9 114 114 92 41 4e-18 MFGKGGLGNLMKQAQQMQEKMQKMQEEIAQLEVTGESGAGLVKVTINGAHNCRRVEIDPS LLEDDKEMLEDLVAAAFNDAARRIEETQKEKMASVSSGMQLPPGFKMPF >gi|296493392|gb|ADTK01000109.1| GENE 4 3023 - 4954 1969 643 aa, chain - ## HITS:1 COG:ECs0523 KEGG:ns NR:ns ## COG: ECs0523 COG2812 # Protein_GI_number: 15829777 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Escherichia coli O157:H7 # 1 643 1 643 643 1114 99.0 0 MSYQVLARKWRPQTFADVVGQEHVLTALANGLSLGRIHHAYLFSGTRGVGKTSIARLLAK GLNCETGITATPCGVCDNCREIEQGRFVDLIEIDAASRTKVEDTRDLLDNVQYAPARGRF KVYLIDEVHMLSRHSFNALLKTLEEPPEHVKFLLATTDPQKLPVTILSRCLQFHLKALDV EQIRHQLEHILNEEHIAHEPRALQLLARAAEGSLRDALSLTDQAIASGDGQVSTQAVSAM LGTLDDDQALSLVEAMVEANGERVMALINEAAARGIEWEALLVEMLGLLHRIAMVQLSPA ALGNDMAAIELRMRELARTIPPTDIQLYYQTLLIGRKELPYAPDRRMGVEMTLLRALAFH PRMPLPEPEVPRQSFAPVAPTAVMTPTQVPPQPQSAPQQAPTVPLPETTSQVLAARQQLQ RVQGATKAKKSEPAAATRARPVNNAALERLASVTDRVQARPVPSALEKAPAKKEAYRWKA TTPVMQQKEVVATPKALKKALEHEKTPELAAKLAAEAIERDAWAAQVSQLSLPKLVEQVA LNAWKEESDNAVCLHLRSSQRHLNNRGAQQKLAEALSTLKGSTVELTIVEDDNPAVRTPL EWRQAIYEEKLAQARESIIADNNIQTLRRFFDAELDEESIRPI >gi|296493392|gb|ADTK01000109.1| GENE 5 5083 - 5634 823 183 aa, chain - ## HITS:1 COG:apt KEGG:ns NR:ns ## COG: apt COG0503 # Protein_GI_number: 16128453 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Escherichia coli K12 # 1 183 1 183 183 349 100.0 1e-96 MTATAQQLEYLKNSIKSIQDYPKPGILFRDVTSLLEDPKAYALSIDLLVERYKNAGITKV 
VGTEARGFLFGAPVALGLGVGFVPVRKPGKLPRETISETYDLEYGTDQLEIHVDAIKPGD KVLVVDDLLATGGTIEATVKLIRRLGGEVADAAFIINLFDLGGEQRLEKQGITSYSLVPF PGH >gi|296493392|gb|ADTK01000109.1| GENE 6 5787 - 6164 316 125 aa, chain - ## HITS:1 COG:ECs0521 KEGG:ns NR:ns ## COG: ECs0521 COG2832 # Protein_GI_number: 15829775 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 125 1 125 125 199 100.0 1e-51 MQRIILIIIGWLAVVLGTLGVVLPVLPTTPFILLAAWCFARSSPRFHAWLLYRSWFGSYL RFWQKHHAMPRGVKPRAILLILLTFAISLWFVQMPWVRIMLLVILACLLFYMWRIPVIDE KQEKH >gi|296493392|gb|ADTK01000109.1| GENE 7 6234 - 6761 544 175 aa, chain + ## HITS:1 COG:priC KEGG:ns NR:ns ## COG: priC COG3923 # Protein_GI_number: 16128451 # Func_class: L Replication, recombination and repair # Function: Primosomal replication protein N'' # Organism: Escherichia coli K12 # 1 175 1 175 175 244 99.0 8e-65 MKTALLLEKLEGQLATLRQRCAPVAQFATLSARFDRHLFQTRATTLQACLDEAGDNLAAL RHAVEQQQLPQVAWLAEHLAAQLEAIAREASAWSLREWDSAPPKIARWQRKRIQHQDFER RLREMVAERRARLARVTDLVEQQTLHREVEAYEARLARCRHALEKIENRLARLTR >gi|296493392|gb|ADTK01000109.1| GENE 8 6775 - 6936 188 53 aa, chain + ## HITS:1 COG:no KEGG:ECS88_0463 NR:ns ## KEGG: ECS88_0463 # Name: ybaM # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 53 1 53 53 72 100.0 4e-12 MSLENAPDDVKLAVDLIVLLEENQIPARTVLRALDIVKRDYEKKLTRDDEAEK >gi|296493392|gb|ADTK01000109.1| GENE 9 7148 - 10504 3541 1118 aa, chain - ## HITS:1 COG:ECs0518 KEGG:ns NR:ns ## COG: ECs0518 COG3264 # Protein_GI_number: 15829772 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Escherichia coli O157:H7 # 1 1118 3 1120 1120 2076 99.0 0 MFQYYKRSRHFVFSAFIAFVFVLLCQNTAFARASSNGDLPTKADLQAQLDSLNKQKDLSA QDKLVQQDLTDTLATLDKIDRIKEETVQLRQKVAEAPEKMRQATAALTALSDVDNDEETR KILSTLSLRQLETRVAQALDDLQNAQNDLASYNSQLVSLQTQPERVQNAMYNASQQLQQI RSRLDGTDVGETALRPSQKVLMQAQQALLNAEIDQQRKSLEGNTVLQDTLQKQRDYVTAN 
SARLEHQLQLLQEAVNSKRLTLTEKTAQEAVSPDEAARIQANPLVKQELEINQQLSQRLI TATENGNQLMQQNIKVKNWLERALQSERNIKEQIAVLKGSLLLSRILYQQQQTLPSADEL ENMTNRIADLRLEQFEVNQQRDALFQSDAFVNKLEEGHTNEVNSEVHDALLQVVDMRREL LDQLNKQLGNQLMMAINLQINQQQLMSVSKNLKSILTQQIFWVNSNRPMDWDWIKAFPQS LKDEFKSMKITVNWEKAWPAVFIAFLAGLPLLLIAGLIHWRLGWLKAYQQKLASAVGSLR NDSQLNTPKAILIDLIRALPVCLIILAVGLILLTMQLNISELLWSFSKKLAIFWLVFGLC WKVLEKNGVAVRHFGMPEQQTSHWRRQIVRISLALLPIHFWSVVAELSPLHLMDDVLGQA MIFFNLLLIAFLVWPMCRESWRDKESHTMRLVTITVLSIIPIALMVLTATGYFYTTLRLS GRWIETVYLVIIWNLLYQTVLRGLSVAARRIAWRRALARRQNLVKEGAEGAEPPEEPTIA LEQVNQQTLRITMLLMFALFGVMFWAIWSDLITVFSYLDSITLWHYNGTEAGAAVVKNVT MGSLLFAIIASMVAWALIRNLPGLLEVLVLSRLNMRQGASYAITTILNYIIIAVGAMTVF GSLGVSWDKLQWLAAALSVGLGFGLQEIFGNFVSGLIILFERPVRIGDTVTIGSFSGTVS KIRIRATTITDFDRKEVIIPNKAFVTERLINWSLTDTTTRLVIRLGVAYGSDLEKVRKVL LKAATEHPRVMHEPMPEVFFTAFGASTLDHELRLYVRELRDRSRTVDELNRTIDQLCREN DINIAFNQLEVHLHNEKGDEVTEVKRDYKGDDPTPAVG >gi|296493392|gb|ADTK01000109.1| GENE 10 10516 - 10707 87 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|300919868|ref|ZP_07136335.1| ## NR: gi|300919868|ref|ZP_07136335.1| conserved domain protein [Escherichia coli MS 115-1] # 1 61 1 61 71 103 98.0 4e-21 MKVSEPEERPEKVKPQEYHDAVNQNSNDENVQEKSWSQIQGYSLVAGLRSVGHRRYISSK MAT >gi|296493392|gb|ADTK01000109.1| GENE 11 10638 - 11285 549 215 aa, chain - ## HITS:1 COG:acrR KEGG:ns NR:ns ## COG: acrR COG1309 # Protein_GI_number: 16128448 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 215 1 215 215 415 99.0 1e-116 MARKTKQEAQETRQHILDVALRLFSQQGVSSTLLGEIAKAAGVTRGAIYWHFKDKSDLFS EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE >gi|296493392|gb|ADTK01000109.1| GENE 12 11427 - 12620 1270 397 aa, chain + ## HITS:1 COG:ECs0516 KEGG:ns NR:ns ## COG: ECs0516 COG0845 # Protein_GI_number: 15829770 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: 
Escherichia coli O157:H7 # 1 397 1 397 397 660 100.0 0 MNKNRGFTPLAVVLMLSGSLALTGCDDKQAQQGGQQMPAVGVVTVKTEPLQITTELPGRT SAYRIAEVRPQVSGIILKRNFKEGSDIEAGVSLYQIDPATYQATYDSAKGDLAKAQAAAN IAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTAAKAAVETARINLAYTKVTSPI SGRIGKSNVTEGALVQNGQATALATVQQLDPIYVDVTQSSNDFLRLKQELANGTLKQENG KAKVSLITSDGIKFPQDGTLEFSDVTVDQTTGSITLRAIFPNPDHTLLPGMFVRARLEEG LNPNAILVPQQGVTRTPRGDATVLVVGADDKVETRPIVASQAIGDKWLVTEGLKAGDRVV ISGLQKVRPGVQVKAQEVTADNNQQAASGAQPEQSKS >gi|296493392|gb|ADTK01000109.1| GENE 13 12643 - 15792 3299 1049 aa, chain + ## HITS:1 COG:acrB KEGG:ns NR:ns ## COG: acrB COG0841 # Protein_GI_number: 16128446 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Escherichia coli K12 # 1 1049 1 1049 1049 2000 100.0 0 MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPVAQYPTIAPPAVTISASYPGADAKTVQDT VTQVIEQNMNGIDNLMYMSSNSDSTGTVQITLTFESGTDADIAQVQVQNKLQLAMPLLPQ EVQQQGVSVEKSSSSFLMVVGVINTDGTMTQEDISDYVAANMKDAISRTSGVGDVQLFGS QYAMRIWMNPNELNKFQLTPVDVITAIKAQNAQVAAGQLGGTPPVKGQQLNASIIAQTRL TSTEEFGKILLKVNQDGSRVLLRDVAKIELGGENYDIIAEFNGQPASGLGIKLATGANAL DTAAAIRAELAKMEPFFPSGLKIVYPYDTTPFVKISIHEVVKTLVEAIILVFLVMYLFLQ NFRATLIPTIAVPVVLLGTFAVLAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVERVM AEEGLPPKEATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGSTGAIYRQFSITIVSAMAL SVLVALILTPALCATMLKPIAKGDHGEGKKGFFGWFNRMFEKSTHHYTDSVGGILRSTGR YLVLYLIIVVGMAYLFVRLPSSFLPDEDQGVFMTMVQLPAGATQERTQKVLNEVTHYYLT KEKNNVESVFAVNGFGFAGRGQNTGIAFVSLKDWADRPGEENKVEAITMRATRAFSQIKD AMVFAFNLPAIVELGTATGFDFELIDQAGLGHEKLTQARNQLLAEAAKHPDMLTSVRPNG LEDTPQFKIDIDQEKAQALGVSINDINTTLGAAWGGSYVNDFIDRGRVKKVYVMSEAKYR MLPDDIGDWYVRAADGQMVPFSAFSSSRWEYGSPRLERYNGLPSMEILGQAAPGKSTGEA MELMEQLASKLPTGVGYDWTGMSYQERLSGNQAPSLYAISLIVVFLCLAALYESWSIPFS VMLVVPLGVIGALLAATFRGLTNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGL IEATLDAVRMRLRPILMTSLAFILGVMPLVISTGAGSGAQNAVGTGVMGGMVTATVLAIF FVPVFFVVVRRRFSRKNEDIEHSHTVDHH >gi|296493392|gb|ADTK01000109.1| GENE 14 16338 - 16712 251 124 aa, chain + ## HITS:1 COG:no KEGG:ECDH10B_0417 NR:ns ## KEGG: ECDH10B_0417 # Name: ybaJ # Def: 
hypothetical protein # Organism: E.coli_DH10B # Pathway: not_defined # 1 124 1 124 124 243 100.0 2e-63 MDEYSPKRHDIAQLKFLCETLYHDCLANLEESNHGWVNDPTSAINLQLNELIEHIATFAL NYKIKYNEDNKLIEQIDEYLDDTFMLFSSYGINMQDLQKWRKSGNRLFRCFVNATKENPA SLSC >gi|296493392|gb|ADTK01000109.1| GENE 15 16738 - 16956 168 72 aa, chain + ## HITS:1 COG:no KEGG:UTI89_C0487 NR:ns ## KEGG: UTI89_C0487 # Name: hha # Def: hemolysin expression-modulating protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 72 5 76 76 130 100.0 1e-29 MSEKPLTKTDYLMRLRRCQTIDTLERVIEKNKYELSDNELAVFYSAADHRLAELTMNKLY DKIPSSVWKFIR >gi|296493392|gb|ADTK01000109.1| GENE 16 17128 - 17679 412 183 aa, chain + ## HITS:1 COG:ylaD KEGG:ns NR:ns ## COG: ylaD COG0110 # Protein_GI_number: 16128443 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Escherichia coli K12 # 1 183 1 183 183 379 99.0 1e-105 MSTEKEKMIAGELYRSADETLSRDRLRARQLIHRYNHSLAEEHTLRQQILADLFGQVTEA YIEPTFRCDYGYNIFLGNNFFANFDCVMLDVCPIRIGDNCMLAPGVHIYTATHPIDPVAR NSGAELGKPVTIGNNVWIGGRAVINPGVTIGDNVVVASGAVVTKDVPNNVVVGGNPARII KKL >gi|296493392|gb|ADTK01000109.1| GENE 17 17795 - 18265 436 156 aa, chain + ## HITS:1 COG:no KEGG:G2583_0570 NR:ns ## KEGG: G2583_0570 # Name: ylaC # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 156 14 169 169 304 100.0 8e-82 MTEIQRLLTETIESLNTREKRDNKPRFSISFIRKHPGLFIGMYVAFFATLAVMLQSETLS GSVWLLVVLFILLNGFFFFDVYPRYRYEDIDVLDFRVCYNGEWYNTRFVPAALVEAILNS PRVADVHKEQLQKMIVRKGELSFYDIFTLARAESTS >gi|296493392|gb|ADTK01000109.1| GENE 18 18423 - 19979 1320 518 aa, chain + ## HITS:1 COG:ECs0510 KEGG:ns NR:ns ## COG: ECs0510 COG4943 # Protein_GI_number: 15829764 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing sensor and EAL domains # Organism: Escherichia coli O157:H7 # 1 518 1 518 518 1040 99.0 0 MLVRTRHLVGLISGVLILSVLLPVGLSIWLAHQQVETSFIEELDTYSSRVAIRANKVATQ 
GKDALQELERWQGAACSEAHLMEMRRVSYSYRYIQEVVYIDNNVPQCSSLEHESPPDTFP EPGKISKDGYRVWLTSHNDLGIIRYMVAMGTAHYVVMIDPASFIDVIPYSSWQIDAAIIG NAHNVVITSSDEIAQGIITRLQKTPGEHIENNGIIYDILPFPEMNISIITWASTKMLQKG WHRQVFIWLPLGLVIGLLAAMFVLRILRRIQSPHHRLQDAIENRDICVHYQPIVSLANGK IVGAEALARWPQTDGSWLSPDSFIPLAQQTGLSEPLTLLIIRSVFEDMGDWLRQHPQQHI SINLESTVLTSEKIPQLLREMINHYQVNPRQIALELTEREFADPKTSAPIISRYREAGHE IYLDDFGTGYSSLSYLQDLDVDILKIDKSFVDALEYKNVTPHIIEMAKTLKLKMVAEGIE TSKQEEWLRQHGVHYGQGWLYSKALPKEDFLRWAEQHL >gi|296493392|gb|ADTK01000109.1| GENE 19 20021 - 20374 461 117 aa, chain - ## HITS:1 COG:ECs0509 KEGG:ns NR:ns ## COG: ECs0509 COG5507 # Protein_GI_number: 15829763 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 117 1 117 117 213 100.0 9e-56 MKYVDGFVVAVPADKKDAYREMAAKAAPLFKEFGALRIVECWASDVPDGKVTDFRMAVKA EENEEVVFSWIEYPSKEVRDAANQKMMSDPRMKEFGESMPFDGKRMIYGGFESIIDE >gi|296493392|gb|ADTK01000109.1| GENE 20 20753 - 21064 279 103 aa, chain + ## HITS:1 COG:ECs0508 KEGG:ns NR:ns ## COG: ECs0508 COG3695 # Protein_GI_number: 15829762 # Func_class: L Replication, recombination and repair # Function: Predicted methylated DNA-protein cysteine methyltransferase # Organism: Escherichia coli O157:H7 # 1 103 27 129 129 202 100.0 8e-53 MEKEDSFPQRVWQIVAAIPEGYVTTYGDVAKLAGSPRAARQVGGVLKRLPEGSTLPWHRV VNRHGTISLTGPDLQRQRQALLAEGVMVSGSGQIDLQRYRWNY >gi|296493392|gb|ADTK01000109.1| GENE 21 21095 - 21667 448 190 aa, chain - ## HITS:1 COG:ECs0507 KEGG:ns NR:ns ## COG: ECs0507 COG3126 # Protein_GI_number: 15829761 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 190 1 190 190 235 100.0 4e-62 MKLVHMASGLAVAIALAACADKSADIQTPAPAANTSISATQQPAIQQPNVSGTVWIRQKV ALPPDAVLTVTLSDASLADAPSKVLAQKAVRTEGKQSPFSFVLPFNPADVQPNARILLSA AITVNDKLVFITDTVQPVINQGGTKADLTLVPVQQTAVPVQASGGATTTVPSTSPTQVNP SSAVPAPTQY >gi|296493392|gb|ADTK01000109.1| GENE 22 21885 - 22745 719 286 aa, chain + ## HITS:1 COG:ECs0506 KEGG:ns NR:ns 
## COG: ECs0506 COG1946 # Protein_GI_number: 15829760 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA thioesterase # Organism: Escherichia coli O157:H7 # 1 286 1 286 286 557 100.0 1e-159 MSQALKNLLTLLNLEKIEEGLFRGQSEDLGLRQVFGGQVVGQALYAAKETVPEERLVHSF HSYFLRPGDSKKPIIYDVETLRDGNSFSARRVAAIQNGKPIFYMTASFQAPEAGFEHQKT MPSAPAPDGLPSETQIAQSLAHLLPPVLKDKFICDRPLEVRPVEFHNPLKGHVAEPHRQV WIRANGSVPDDLRVHQYLLGYASDLNFLPVALQPHGIGFLEPGIQIATIDHSMWFHRPFN LNEWLLYSVESTSASSARGFVRGEFYTQDGVLVASTVQEGVMRNHN >gi|296493392|gb|ADTK01000109.1| GENE 23 22794 - 24080 1445 428 aa, chain - ## HITS:1 COG:ECs0505 KEGG:ns NR:ns ## COG: ECs0505 COG0004 # Protein_GI_number: 15829759 # Func_class: P Inorganic ion transport and metabolism # Function: Ammonia permease # Organism: Escherichia coli O157:H7 # 1 428 1 428 428 727 100.0 0 MKIATIKTGLASLAMLPGLVMAAPAVADKADNAFMMICTALVLFMTIPGIALFYGGLIRG KNVLSMLTQVTVTFALVCILWVVYGYSLAFGEGNNFFGNINWLMLKNIELTAVMGSIYQY IHVAFQGSFACITVGLIVGALAERIRFSAVLIFVVVWLTLSYIPIAHMVWGGGLLASHGA LDFAGGTVVHINAAIAGLVGAYLIGKRVGFGKEAFKPHNLPMVFTGTAILYIGWFGFNAG SAGTANEIAALAFVNTVVATAAAILGWIFGEWALRGKPSLLGACSGAIAGLVGVTPACGY IGVGGALIIGVVAGLAGLWGVTMLKRLLRVDDPCDVFGVHGVCGIVGCIMTGIFAASSLG GVGFAEGVTMGHQLLVQLESIAITIVWSGVVAFIGYKLADLTVGLRVPEEQEREGLDVNS HGENAYNA >gi|296493392|gb|ADTK01000109.1| GENE 24 24110 - 24448 484 112 aa, chain - ## HITS:1 COG:ECs0504 KEGG:ns NR:ns ## COG: ECs0504 COG0347 # Protein_GI_number: 15829758 # Func_class: E Amino acid transport and metabolism # Function: Nitrogen regulatory protein PII # Organism: Escherichia coli O157:H7 # 1 112 1 112 112 196 100.0 1e-50 MKLVTVIIKPFKLEDVREALSSIGIQGLTVTEVKGFGRQKGHAELYRGAEYSVNFLPKVK IDVAIADDQLDEVIDIVSKAAYTGKIGDGKIFVAELQRVIRIRTGEADEAAL >gi|296493392|gb|ADTK01000109.1| GENE 25 24629 - 26410 225 593 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 346 565 135 355 398 91 35 6e-18 MRSFSQLWPTLKRLLAYGSPWRKPLGIAVLMMWVAAAAEVSGPLLISYFIDNMVAKNNLP 
LKMVAGLAAAYVGLQLFAAGLHYAQSLLFNRAAVGVVQQLRTDVMDAALRQPLSEFDTQP VGQVISRVTNDTEVIRDLYVTVVATVLRSAALVGAMLVAMFSLDWRMALVAIMIFPVVLV VMVIYQRYSTPIVRRVRAYLADINDGFNEIINGMSVIQQFRQQARFGERMGEASRSHYMA RMQTLRLDGFLLRPLLSLFSSLILCGLLMLFGFSASGTIEVGVLYAFISYLGRLNEPLIE LTTQQAMLQQAVVAGERVFELMDGPRQQYGNDDRPLQSGTIEVDNVSFAYRDDNLVLKNI NLSVPSRNFVALVGHTGSGKSTLASLLMGYYPLTEGEIRLDGRPLSSLSHSALRQGVAMV QQDPVVLADTFLANVTLGRDISEERVWQALETVQLAELARSMSDGIYTPLGEQGNNLSVG QKQLLALARVLVETPQILILDEATASIDSGTEQAIQHALAAVREHTTLVVIAHRLSTIVD ADTILVLHRGQAVEQGTHQQLLAAQGRYWQMYQLQLAGEELAASVREEESLSA >gi|296493392|gb|ADTK01000109.1| GENE 26 26403 - 28175 205 590 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 342 557 137 351 398 83 28 1e-15 MRLFAQLSWYFRREWRRYLGAVALLVIIAMLQLVPPKVVGIVVDGVTEQHFTTGQILMWI ATMVLIAVVVYLLRYVWRVLLFGASYQLAVELREDYYRQLSRQHPEFYLRHRTGDLMARA TNDVDRVVFAAGEGVLTLVDSLVMGCAVLIMMSTQISWQLTLFALLPMPVMAIMIKRNGD ALHERFKLAQAAFSSLNDRTQESLTSIRMIKAFGLEDRQSALFAADAEDTGKKNMRVARI DARFDPTIYIAIGMANLLAIGGGSWMVVQGSLTLGQLTSFMMYLGLMIWPMLALAWMFNI VERGSAAYSRIRAMLAEAPVVNDGSEPVPEGRGELDVNIHQFTYPQTDHPALENVNFALK PGQMLGICGPTGSGKSTLLSLIQRHFDVSEGDIRFHDIPLTKLQLDSWRSRLAVVSQTPF LFSDTVANNIALGCPNATQQEIEHVARLASVHDDILRLPQGYDTEVGERGVMLSGGQKQR ISIARALLVNAEILILDDALSAVDGRTEHQILHNLRQWGQGRTVIISAHRLSALTEASEI IVMQHGHIAQRGNHDELAQQSGWYRDMYRYQQLEAALDDVPEIREEAVDA >gi|296493392|gb|ADTK01000109.1| GENE 27 28205 - 28663 369 152 aa, chain - ## HITS:1 COG:ybaO KEGG:ns NR:ns ## COG: ybaO COG1522 # Protein_GI_number: 16128432 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 152 30 181 181 308 100.0 2e-84 MLDKIDRKLLALLQQDCTLSLQALAEAVNLTTTPCWKRLKRLEDDGILIGKVALLDPEKI GLGLTAFVLIKTQHHSSEWYCRFVTVVTEMPEVLGFWRMAGEYDYLMRVQVADMKRYDEF YKRLVNSVPGLSDVTSSFAMEQIKYTTSLPIE Prediction of potential genes in microbial genomes Time: Mon May 16 15:21:14 2011 Seq name: gi|296493391|gb|ADTK01000110.1| Escherichia coli MS 84-1 
E_coli84-1-1.0_Cont289.5, whole genome shotgun sequence Length of sequence - 107404 bp Number of predicted genes - 104, with homology - 103 Number of transcription units - 52, operones - 24 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 857 - 916 3.0 2 2 Op 1 5/0.200 + CDS 941 - 2641 953 ## COG4533 ABC-type uncharacterized transport system, periplasmic component + Term 2665 - 2693 -0.0 3 2 Op 2 . + CDS 2706 - 3401 669 ## COG0603 Predicted PP-loop superfamily ATPase 4 3 Op 1 5/0.200 - CDS 3453 - 3851 396 ## COG0824 Predicted thioesterase - Prom 3882 - 3941 2.7 5 3 Op 2 6/0.000 - CDS 3945 - 4316 589 ## COG1555 DNA uptake protein and related DNA-binding proteins - Prom 4363 - 4422 2.3 - Term 4386 - 4423 7.1 6 3 Op 3 9/0.000 - CDS 4467 - 6338 2276 ## COG0760 Parvulin-like peptidyl-prolyl isomerase - Prom 6460 - 6519 3.5 - Term 6463 - 6494 2.5 7 3 Op 4 16/0.000 - CDS 6530 - 6802 414 ## COG0776 Bacterial nucleoid DNA-binding protein - Prom 6858 - 6917 3.4 - Term 6937 - 6978 8.5 8 4 Op 1 18/0.000 - CDS 7011 - 9365 2969 ## COG0466 ATP-dependent Lon protease, bacterial type - Prom 9475 - 9534 4.5 - Term 9489 - 9519 2.7 9 4 Op 2 24/0.000 - CDS 9553 - 10827 238 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 10 4 Op 3 29/0.000 - CDS 10953 - 11576 641 ## COG0740 Protease subunit of ATP-dependent Clp proteases - Prom 11610 - 11669 5.2 - Term 11751 - 11788 6.0 11 4 Op 4 2/0.600 - CDS 11822 - 13120 1736 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) - Prom 13259 - 13318 6.0 - Term 13373 - 13411 6.2 12 5 Tu 1 . 
- CDS 13464 - 13781 250 ## COG0271 Stress-induced morphogen (activity unknown) - Prom 13948 - 14007 5.4 + Prom 13985 - 14044 3.4 13 6 Op 1 1/0.850 + CDS 14086 - 14664 666 ## COG3056 Uncharacterized lipoprotein 14 6 Op 2 5/0.200 + CDS 14708 - 16183 1391 ## COG0477 Permeases of the major facilitator superfamily + Prom 16347 - 16406 9.4 15 7 Op 1 25/0.000 + CDS 16643 - 17590 810 ## COG1622 Heme/copper-type cytochrome/quinol oxidases, subunit 2 16 7 Op 2 20/0.000 + CDS 17612 - 19603 2055 ## COG0843 Heme/copper-type cytochrome/quinol oxidases, subunit 1 17 7 Op 3 16/0.000 + CDS 19593 - 20207 598 ## COG1845 Heme/copper-type cytochrome/quinol oxidase, subunit 3 18 7 Op 4 7/0.000 + CDS 20207 - 20536 341 ## COG3125 Heme/copper-type cytochrome/quinol oxidase, subunit 4 19 7 Op 5 2/0.600 + CDS 20551 - 21438 905 ## COG0109 Polyprenyltransferase (cytochrome oxidase assembly factor) + Term 21459 - 21485 -1.0 + Prom 21507 - 21566 3.2 20 8 Tu 1 . + CDS 21587 - 22951 1617 ## COG0477 Permeases of the major facilitator superfamily + Term 23074 - 23101 -0.8 - Term 22896 - 22944 2.5 21 9 Tu 1 . - CDS 23079 - 23570 743 ## COG1666 Uncharacterized protein conserved in bacteria - Prom 23596 - 23655 3.0 + Prom 23566 - 23625 2.9 22 10 Op 1 4/0.350 + CDS 23738 - 24649 801 ## COG1893 Ketopantoate reductase 23 10 Op 2 . + CDS 24612 - 25202 697 ## COG0693 Putative intracellular protease/amidase 24 11 Tu 1 . - CDS 25256 - 26704 1603 ## COG0301 Thiamine biosynthesis ATP pyrophosphatase - Prom 26899 - 26958 3.5 + Prom 26648 - 26707 3.1 25 12 Op 1 22/0.000 + CDS 26910 - 27152 363 ## COG1722 Exonuclease VII small subunit 26 12 Op 2 13/0.000 + CDS 27152 - 28051 897 ## COG0142 Geranylgeranyl pyrophosphate synthase 27 12 Op 3 3/0.450 + CDS 28076 - 29938 2024 ## COG1154 Deoxyxylulose-5-phosphate synthase + Prom 29940 - 29999 3.8 28 13 Tu 1 . 
+ CDS 30118 - 31092 975 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) - Term 31084 - 31137 3.4 29 14 Op 1 12/0.000 - CDS 31146 - 31664 583 ## COG1267 Phosphatidylglycerophosphatase A and related proteins 30 14 Op 2 11/0.000 - CDS 31642 - 32619 1037 ## COG0611 Thiamine monophosphate kinase 31 14 Op 3 17/0.000 - CDS 32697 - 33116 509 ## COG0781 Transcription termination factor 32 14 Op 4 6/0.000 - CDS 33136 - 33606 602 ## COG0054 Riboflavin synthase beta-chain 33 14 Op 5 14/0.000 - CDS 33695 - 34798 847 ## COG1985 Pyrimidine reductase, riboflavin biosynthesis 34 14 Op 6 . - CDS 34802 - 35197 420 ## COG1327 Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains - Prom 35278 - 35337 4.9 + Prom 35318 - 35377 3.1 35 15 Tu 1 . + CDS 35402 - 35941 463 ## EcHS_A0483 hypothetical protein + Term 35948 - 35987 -0.6 + Prom 36103 - 36162 7.7 36 16 Tu 1 . + CDS 36240 - 37124 1114 ## COG3248 Nucleoside-binding outer membrane protein + Term 37137 - 37165 -1.0 - Term 37123 - 37151 -1.0 37 17 Tu 1 . - CDS 37301 - 37648 400 ## c0520 hypothetical protein - Prom 37677 - 37736 4.5 - Term 37716 - 37757 8.1 38 18 Op 1 31/0.000 - CDS 37777 - 38748 1003 ## COG0341 Preprotein translocase subunit SecF 39 18 Op 2 25/0.000 - CDS 38759 - 40573 2035 ## COG0342 Preprotein translocase subunit SecD 40 18 Op 3 15/0.000 - CDS 40634 - 40966 496 ## COG1862 Preprotein translocase subunit YajC 41 18 Op 4 17/0.000 - CDS 40989 - 42116 1046 ## COG0343 Queuine/archaeosine tRNA-ribosyltransferase 42 18 Op 5 . - CDS 42172 - 43242 1123 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) - Prom 43268 - 43327 4.0 + Prom 43211 - 43270 1.9 43 19 Tu 1 . + CDS 43335 - 43916 502 ## COG3124 Uncharacterized protein conserved in bacteria - Term 43755 - 43797 -1.0 44 20 Tu 1 . 
- CDS 43921 - 45735 1555 ## COG0366 Glycosidases 45 21 Op 1 5/0.200 - CDS 45894 - 47267 1562 ## COG1113 Gamma-aminobutyrate permease and related permeases - Term 47286 - 47321 6.4 46 21 Op 2 4/0.350 - CDS 47343 - 48662 1383 ## COG1114 Branched-chain amino acid permeases - Prom 48811 - 48870 5.3 - Term 49007 - 49042 5.1 47 22 Op 1 40/0.000 - CDS 49069 - 50364 1129 ## COG0642 Signal transduction histidine kinase - Term 50381 - 50420 3.3 48 22 Op 2 . - CDS 50422 - 51111 666 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 51135 - 51194 5.2 + Prom 51111 - 51170 3.8 49 23 Op 1 28/0.000 + CDS 51301 - 52503 944 ## COG0420 DNA repair exonuclease 50 23 Op 2 2/0.600 + CDS 52500 - 55643 3113 ## COG0419 ATPase involved in DNA repair 51 23 Op 3 . + CDS 55685 - 56953 933 ## COG2814 Arabinose efflux permease + Term 57052 - 57101 2.2 - Term 57038 - 57087 6.0 52 24 Tu 1 . - CDS 57096 - 58004 294 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase - Prom 58093 - 58152 4.3 + Prom 58039 - 58098 4.1 53 25 Tu 1 . + CDS 58129 - 59040 1169 ## COG2974 DNA recombination-dependent growth factor C 54 26 Tu 1 . - CDS 59198 - 59401 131 ## ECH74115_0466 hypothetical protein - Prom 59576 - 59635 8.7 - Term 59645 - 59681 3.0 55 27 Op 1 . - CDS 59688 - 59972 285 ## COG3123 Uncharacterized protein conserved in bacteria 56 27 Op 2 . - CDS 60044 - 60721 591 ## B21_00341 hypothetical protein - Prom 60880 - 60939 11.1 57 28 Op 1 . - CDS 60979 - 61170 266 ## LF82_2523 uncharacterized protein YaiA 58 28 Op 2 1/0.850 - CDS 61220 - 61744 364 ## COG0703 Shikimate kinase - Prom 61795 - 61854 6.2 59 29 Tu 1 . - CDS 61921 - 62385 386 ## COG1671 Uncharacterized protein conserved in bacteria - Prom 62476 - 62535 3.5 + Prom 62404 - 62463 4.3 60 30 Tu 1 . + CDS 62505 - 63314 905 ## COG0345 Pyrroline-5-carboxylate reductase + Term 63323 - 63375 11.3 - Term 63278 - 63326 -1.0 61 31 Tu 1 . 
- CDS 63331 - 64431 690 ## COG2199 FOG: GGDEF domain - Prom 64459 - 64518 4.6 - Term 64491 - 64531 5.3 62 32 Tu 1 . - CDS 64548 - 64868 269 ## APECO1_1624 hypothetical protein - Term 64939 - 64983 9.1 63 33 Op 1 . - CDS 64987 - 66402 1299 ## COG1785 Alkaline phosphatase - Prom 66442 - 66501 5.8 64 33 Op 2 . - CDS 66503 - 66763 288 ## SSON_0357 hypothetical protein - Prom 66947 - 67006 8.0 + Prom 66948 - 67007 7.7 65 34 Tu 1 . + CDS 67226 - 68320 1123 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes + Term 68329 - 68366 5.1 - Term 68164 - 68206 0.9 66 35 Tu 1 . - CDS 68344 - 68556 358 ## UTI89_C0397 hypothetical protein - Prom 68636 - 68695 5.7 + Prom 68706 - 68765 3.8 67 36 Tu 1 . + CDS 68816 - 69124 294 ## APECO1_1629 hypothetical protein + Term 69125 - 69167 -0.9 68 37 Op 1 . - CDS 69183 - 70277 1040 ## B21_00329 hypothetical protein 69 37 Op 2 . - CDS 70290 - 71510 1272 ## COG1133 ABC-type long-chain fatty acid transport system, fused permease and ATPase components 70 37 Op 3 . - CDS 71530 - 71724 99 ## ECIAI1_0372 hypothetical protein - Prom 71808 - 71867 3.4 + Prom 71727 - 71786 4.1 71 38 Tu 1 . + CDS 71862 - 73019 1032 ## COG1680 Beta-lactamase class C and other penicillin binding proteins + Term 73029 - 73077 10.6 72 39 Op 1 . - CDS 73020 - 73688 332 ## SSON_0350 putative DNA-binding transcriptional regulator 73 39 Op 2 . - CDS 73731 - 76682 2627 ## COG3468 Type V secretory pathway, adhesin AidA + Prom 76543 - 76602 3.5 74 40 Tu 1 . + CDS 76640 - 76831 98 ## 75 41 Tu 1 . + CDS 77206 - 78180 1028 ## COG0113 Delta-aminolevulinic acid dehydratase 76 42 Op 1 5/0.200 - CDS 78286 - 79137 882 ## COG2175 Probable taurine catabolism dioxygenase 77 42 Op 2 7/0.000 - CDS 79134 - 79961 1004 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 78 42 Op 3 6/0.000 - CDS 79958 - 80725 267 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 79 42 Op 4 . 
- CDS 80738 - 81700 1134 ## COG4521 ABC-type taurine transport system, periplasmic component - Prom 81780 - 81839 4.1 80 43 Tu 1 . + CDS 81654 - 81878 120 ## EcE24377A_0388 hypothetical protein + Prom 82044 - 82103 7.9 81 44 Op 1 1/0.850 + CDS 82316 - 82987 416 ## COG2120 Uncharacterized proteins, LmbE homologs 82 44 Op 2 . + CDS 82997 - 84193 646 ## COG1215 Glycosyltransferases, probably involved in cell wall biogenesis 83 44 Op 3 . + CDS 84042 - 84752 355 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 84 44 Op 4 . + CDS 84754 - 85527 826 ## SSON_0337 hypothetical protein + Term 85638 - 85696 19.7 + Prom 85631 - 85690 6.1 85 45 Op 1 2/0.600 + CDS 85712 - 85987 275 ## COG1937 Uncharacterized protein conserved in bacteria 86 45 Op 2 12/0.000 + CDS 86022 - 87131 1271 ## COG1062 Zn-dependent alcohol dehydrogenases, class III + Term 87147 - 87179 2.2 87 45 Op 3 . + CDS 87224 - 88057 658 ## COG0627 Predicted esterase 88 46 Tu 1 . - CDS 88283 - 88822 624 ## COG3122 Uncharacterized protein conserved in bacteria - Prom 88850 - 88909 3.1 89 47 Tu 1 . - CDS 88924 - 90135 1086 ## COG0477 Permeases of the major facilitator superfamily - Prom 90187 - 90246 4.0 90 48 Op 1 6/0.000 - CDS 90311 - 91324 1241 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases 91 48 Op 2 4/0.350 - CDS 91321 - 92271 985 ## COG4569 Acetaldehyde dehydrogenase (acetylating) 92 48 Op 3 1/0.850 - CDS 92268 - 93077 862 ## COG3971 2-keto-4-pentenoate hydratase 93 48 Op 4 . - CDS 93087 - 93953 903 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 94 48 Op 5 . - CDS 93971 - 94915 775 ## ECO111_0384 3-(2,3-dihydroxyphenyl)propionate dioxygenase 95 48 Op 6 . - CDS 94917 - 96581 1801 ## COG0654 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases - Prom 96672 - 96731 5.8 + Prom 96620 - 96679 6.1 96 49 Op 1 . 
+ CDS 96772 - 97605 799 ## COG1414 Transcriptional regulator 97 49 Op 2 1/0.850 + CDS 97673 - 98764 989 ## COG1609 Transcriptional regulators + Prom 98769 - 98828 4.4 98 50 Op 1 3/0.450 + CDS 98887 - 101961 2357 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 101969 - 102001 5.4 99 50 Op 2 4/0.350 + CDS 102013 - 103266 1204 ## COG0477 Permeases of the major facilitator superfamily 100 50 Op 3 . + CDS 103332 - 103943 302 ## COG0110 Acetyltransferase (isoleucine patch superfamily) - Term 103944 - 103980 3.5 101 51 Op 1 3/0.450 - CDS 104046 - 105200 1080 ## COG2807 Cyanate permease 102 51 Op 2 4/0.350 - CDS 105233 - 105703 750 ## COG1513 Cyanate lyase 103 51 Op 3 . - CDS 105734 - 106393 489 ## COG0288 Carbonic anhydrase - Prom 106486 - 106545 3.4 + Prom 106418 - 106477 5.4 104 52 Tu 1 . + CDS 106503 - 107402 491 ## COG0583 Transcriptional regulator Predicted protein(s) >gi|296493391|gb|ADTK01000110.1| GENE 1 23 - 841 690 272 aa, chain - ## HITS:1 COG:cof KEGG:ns NR:ns ## COG: cof COG0561 # Protein_GI_number: 16128431 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Escherichia coli K12 # 1 272 5 276 276 557 99.0 1e-159 MARLAAFDMDGTLLMPDHHLGEKTLSTLARLRERDITLTFATGRHALEMQHILGALSLDA YLITGNGTRVHSLEGELLHRDDLPADVAELVLYQQWDTRASMHIFNDDGWFTGKEIPALL QAFVYSGFRYQIIDVKKMPLGSVTKICFCGDHDDLTRLQIQLYEALGERAHLCFSATDCL EVLPVGCNKGAALTVLTQHLGLSLRDCMAFGDAMNDREMLGSVGSGFIMGNAMPQLRAEL PHLPVIGHCRNQAVSHYLTHWLDYPHLPYSPE >gi|296493391|gb|ADTK01000110.1| GENE 2 941 - 2641 953 566 aa, chain + ## HITS:1 COG:ECs0499 KEGG:ns NR:ns ## COG: ECs0499 COG4533 # Protein_GI_number: 15829753 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 1 566 1 566 566 1103 99.0 0 MRLLNRLNQYQRLWQPSAGKPQTVTVSELAERCFCSERHVRTLLRQAQEAGWLEWQAQSG RGKRGQLRFLVTPESLRNAMMEQALETGKQQDVLELAQLAPGELRTLLQPFMGGQWQNDT 
PTLRIPYYRPLEPLQPGFLPGRAEQHLAGQIFSGLTRFDNNTQRPIGDLAHHWETSTDGL RWDFYLRSTLHWHNGDAVKASHLHQRLLMLLQLPALDQLFISVKRIEVTHPQCLTFFLHR PDYWLAHRLASYCSHLAHPQFPLIGTGPFRLTQFTAELVRLESHDYYHLRHPLLKAVEYW ITPPLFEKDLGTSCRHPVQITIGKPEELQRVSQVSSGISLGFCYLTLRKSPRLSLWQARK VISIIHQSGLLQTLEVGENLITASHALLPGWTIPHWQVPDEVKLPKTLTLVYHLPIELHT MAERLQATLAAEGCELTIIFHNAKNWDDTTLLAHADLMMGDRLIGEAPEYTLEQWLRCDP LWPHVFDAPAYAHLQSTLDAVQVMPDEENRFNALKAVFSQLMADATLTPLFNYHYRISAP PGVNGVRLTPRGWFEFTEAWLPAPSQ >gi|296493391|gb|ADTK01000110.1| GENE 3 2706 - 3401 669 231 aa, chain + ## HITS:1 COG:ybaX KEGG:ns NR:ns ## COG: ybaX COG0603 # Protein_GI_number: 16128429 # Func_class: R General function prediction only # Function: Predicted PP-loop superfamily ATPase # Organism: Escherichia coli K12 # 1 231 1 231 231 483 99.0 1e-136 MKRAVVVFSGGQDSTTCLVQALQQYDEVHCVTFDYGQRHRAEIDVARELALKLGARAHKV LDVTLLNELAVSSLTRDSIPVPDYEPEADGIPNTFVPGRNILFLTLAAIYAYQVKAEAVI TGVCETDFSGYPDCRDEFVKALNHAVSLGMAKDIRFETPLMWIDKAETWALADYYGKLDL VRNETLTCYNGIKGDGCGHCAACNLRANGLNHYLADKPTVMAAMKQKTGLR >gi|296493391|gb|ADTK01000110.1| GENE 4 3453 - 3851 396 132 aa, chain - ## HITS:1 COG:ybaW KEGG:ns NR:ns ## COG: ybaW COG0824 # Protein_GI_number: 16128428 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Escherichia coli K12 # 1 132 1 132 132 251 100.0 3e-67 MQTQIKVRGYHLDVYQHVNNARYLEFLEEARWDGLENSDSFQWMTAHNIAFVVVNININY RRPAVLSDLLTITSQLQQLNGKSGILSQVITLEPEGQVVADALITFVCIDLKTQKALALE GELREKLEQMVK >gi|296493391|gb|ADTK01000110.1| GENE 5 3945 - 4316 589 123 aa, chain - ## HITS:1 COG:ybaV KEGG:ns NR:ns ## COG: ybaV COG1555 # Protein_GI_number: 16128427 # Func_class: L Replication, recombination and repair # Function: DNA uptake protein and related DNA-binding proteins # Organism: Escherichia coli K12 # 1 123 1 123 123 177 97.0 5e-45 MKHGIKALLITLSLACAGMSHSALAAAPAAKPTTVETKAEAPAAQSKAAVPAKASDEEGT RVSINNASAEELARAMNGVGLKKAQAIVSYREEYGPFKTVEDLKQVPGMGNSLVERNLAV LTL >gi|296493391|gb|ADTK01000110.1| GENE 6 4467 - 6338 2276 623 aa, chain - ## 
HITS:1 COG:ybaU KEGG:ns NR:ns ## COG: ybaU COG0760 # Protein_GI_number: 16128426 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like peptidyl-prolyl isomerase # Organism: Escherichia coli K12 # 1 623 1 623 623 1106 99.0 0 MMDSLRTAANSLVLKIIFGIIIVSFILTGVSGYLIGGGNNYAAKVNDQEISRGQFENAFN SERNRMQQQLGDQYSELAANEGYMKTLRQQVLNRLIDEALLDQYARELKLGISDEQVKQA IFATPAFQVDGKFDNSRYNGILNQMGMTADQYAQALRNQLTTQQLINGVAGTDFMLKGET DELAALVAQQRVVREATIDVNALAAKQPVTEQEIASYYEQNKNNFMTPEQFRVSYIKLDA ATMQQPVSDADIQSYYDQHQDQFTQPQRTRYSIIQTKTEDEAKAVLDELNKGGDFAALAK EKSADIISARNGGDMGWLEDATIPDELKNAGLKEKGQLSGVIKSSVGFLIVRLDDIQPAK VKSLDEVRDDIAAKVKHEKALDAYYALQQKVSDAASNDTESLAGAEQTAGVKATQTGWFS KDNLPEELNFKPVADAIFNGGLVGENGAPGINSDIITVDGDRAFVLRISEHKPEAVKPLA DVQEQVKALVQHNKAEQQAKVDAEKLLVDLKAGKGAEAMQAAGLKFGEPKTLSRSGRDPI SQAAFALPLPAKDKPSYGMATDMQGNVVLLALDEVKQGSMPEDQKKAMVQGITQNNAQIV FEALMSNLRKEAKIKIGDALEQQ >gi|296493391|gb|ADTK01000110.1| GENE 7 6530 - 6802 414 90 aa, chain - ## HITS:1 COG:YPO3154 KEGG:ns NR:ns ## COG: YPO3154 COG0776 # Protein_GI_number: 16123316 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Yersinia pestis # 1 90 1 90 90 126 92.0 1e-29 MNKSQLIDKIAAGADISKAAAGRALDAIIASVTESLKEGDDVALVGFGTFAVKERAARTG RNPQTGKEITIAAAKVPSFRAGKALKDAVN >gi|296493391|gb|ADTK01000110.1| GENE 8 7011 - 9365 2969 784 aa, chain - ## HITS:1 COG:lon KEGG:ns NR:ns ## COG: lon COG0466 # Protein_GI_number: 16128424 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Escherichia coli K12 # 1 784 1 784 784 1482 100.0 0 MNPERSERIEIPVLPLRDVVVYPHMVIPLFVGREKSIRCLEAAMDHDKKIMLVAQKEAST DEPGVNDLFTVGTVASILQMLKLPDGTVKVLVEGLQRARISALSDNGEHFSAKAEYLESP TIDEREQEVLVRTAISQFEGYIKLNKKIPPEVLTSLNSIDDPARLADTIAAHMPLKLADK QSVLEMSDVNERLEYLMAMMESEIDLLQVEKRIRNRVKKQMEKSQREYYLNEQMKAIQKE LGEMDDAPDENEALKRKIDAAKMPKEAKEKAEAELQKLKMMSPMSAEATVVRGYIDWMVQ 
VPWNARSKVKKDLRQAQEILDTDHYGLERVKDRILEYLAVQSRVNKIKGPILCLVGPPGV GKTSLGQSIAKATGRKYVRMALGGVRDEAEIRGHRRTYIGSMPGKLIQKMAKVGVKNPLF LLDEIDKMSSDMRGDPASALLEVLDPEQNVAFSDHYLEVDYDLSDVMFVATSNSMNIPAP LLDRMEVIRLSGYTEDEKLNIAKRHLLPKQIERNALKKGELTVDDSAIIGIIRYYTREAG VRGLEREISKLCRKAVKQLLLDKSLKHIEINGDNLHDYLGVQRFDYGRADNENRVGQVTG LAWTEVGGDLLTIETACVPGKGKLTYTGSLGEVMQESIQAALTVVRARAEKLGINPDFYE KRDIHVHVPEGATPKDGPSAGIAMCTALVSCLTGNPVRADVAMTGEITLRGQVLPIGGLK EKLLAAHRGGIKTVLIPFENKRDLEEIPDNVIADLDIHPVKRIEEVLTLALQNEPSGMQV VTAK >gi|296493391|gb|ADTK01000110.1| GENE 9 9553 - 10827 238 424 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 159 386 250 432 466 96 31 6e-19 MTDKRKDGSGKLLYCSFCGKSQHEVRKLIAGPSVYICDECVDLCNDIIREEIKEVAPHRE RSALPTPHEIRNHLDDYVIGQEQAKKVLAVAVYNHYKRLRNGDTSNGVELGKSNILLIGP TGSGKTLLAETLARLLDVPFTMADATTLTEAGYVGEDVENIIQKLLQKCDYDVQKAQRGI VYIDEIDKISRKSDNPSITRDVSGEGVQQALLKLIEGTVAAVPPQGGRKHPQQEFLQVDT SKILFICGGAFAGLDKVISHRVETGSGIGFGATVKAKSDKASEGELLAQVEPEDLIKFGL IPEFIGRLPVVATLNELSEEALIQILKEPKNALTKQYQALFNLEGVDLEFRDEALDAIAK KAMARKTGARGLRSIVEAALLDTMYDLPSMEDVEKVVIDESVIDGQSKPLLIYGKPEAQQ ASGE >gi|296493391|gb|ADTK01000110.1| GENE 10 10953 - 11576 641 207 aa, chain - ## HITS:1 COG:ECs0491 KEGG:ns NR:ns ## COG: ECs0491 COG0740 # Protein_GI_number: 15829745 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Escherichia coli O157:H7 # 1 207 1 207 207 417 100.0 1e-117 MSYSGERDNFAPHMALVPMVIEQTSRGERSFDIYSRLLKERVIFLTGQVEDHMANLIVAQ MLFLEAENPEKDIYLYINSPGGVITAGMSIYDTMQFIKPDVSTICMGQAASMGAFLLTAG AKGKRFCLPNSRVMIHQPLGGYQGQATDIEIHAREILKVKGRMNELMALHTGQSLEQIER DTERDRFLSAPEAVEYGLVDSILTHRN >gi|296493391|gb|ADTK01000110.1| GENE 11 11822 - 13120 1736 432 aa, chain - ## HITS:1 COG:ECs0490 KEGG:ns NR:ns ## COG: ECs0490 COG0544 # Protein_GI_number: 15829744 # Func_class: O Posttranslational 
modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Escherichia coli O157:H7 # 1 432 1 432 432 738 100.0 0 MQVSVETTQGLGRRVTITIAADSIETAVKSELVNVAKKVRIDGFRKGKVPMNIVAQRYGA SVRQDVLGDLMSRNFIDAIIKEKINPAGAPTYVPGEYKLGEDFTYSVEFEVYPEVELQGL EAIEVEKPIVEVTDADVDGMLDTLRKQQATWKEKDGAVEAEDRVTIDFTGSVDGEEFEGG KASDFVLAMGQGRMIPGFEDGIKGHKAGEEFTIDVTFPEEYHAENLKGKAAKFAINLKKV EERELPELTAEFIKRFGVEDGSVEGLRAEVRKNMERELKSAIRNRVKSQAIEGLVKANDI DVPAALIDSEIDVLRRQAAQRFGGNEKQALELPRELFEEQAKRRVVVGLLLGEVIRTNEL KADEERVKGLIEEMASAYEDPKEVIEFYSKNKELMDNMRNVALEEQAVEAVLAKAKVTEK ETTFNELMNQQA >gi|296493391|gb|ADTK01000110.1| GENE 12 13464 - 13781 250 105 aa, chain - ## HITS:1 COG:ECs0489 KEGG:ns NR:ns ## COG: ECs0489 COG0271 # Protein_GI_number: 15829743 # Func_class: T Signal transduction mechanisms # Function: Stress-induced morphogen (activity unknown) # Organism: Escherichia coli O157:H7 # 1 105 12 116 116 204 100.0 3e-53 MMIRERIEEKLRAAFQPVFLEVVDESYRHNVPAGSESHFKVVLVSDRFTGERFLNRHRMI YSTLAEELSTTVHALALHTYTIKEWEGLQDTVFASPPCRGAGSIA >gi|296493391|gb|ADTK01000110.1| GENE 13 14086 - 14664 666 192 aa, chain + ## HITS:1 COG:yajG KEGG:ns NR:ns ## COG: yajG COG3056 # Protein_GI_number: 16128419 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized lipoprotein # Organism: Escherichia coli K12 # 1 192 35 226 226 355 100.0 2e-98 MFKKILFPLVALFMLAGCAKPPTTIEVSPTITLPQQDPSLMGVTVSINGADQRTDQALAK VTRDNQIVTLTASRDLRFLLQEVLEKQMTARGYMVGPNGPVNLQIIVSQLYADVSQGNVR YNIATKADIAIIATAQNGNKMTKNYRASYNVEGAFQASNKNIADAVNSVLTDTIADMSQD TSIHEFIKQNAR >gi|296493391|gb|ADTK01000110.1| GENE 14 14708 - 16183 1391 491 aa, chain + ## HITS:1 COG:ECs0487 KEGG:ns NR:ns ## COG: ECs0487 COG0477 # Protein_GI_number: 15829741 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: 
Escherichia coli O157:H7 # 1 491 1 491 491 843 99.0 0 MSSQYLRIFQQPRSAILLILGFASGLPLALTSGTLQAWMTVENIDLKTIGFFSLVGQAYV FKFLWSPLMDRYTPPFFGRRRGWLLATQILLLVAIAAMGFLEPGTQLRWMAALAVVIAFC SASQDIVFDAWKTDVLPAEERGAGAAISVLGYRLGMLVSGGLALWLADKWLGWQGMYWLM AALLIPCIIATLLAPEPTDTIPVPKTLEQAVVAPLRDFFGRNNAWLILLLIVLYKLGDAF AMSLTTTFLIRGVGFDAGEVGVVNKTLGLLATIVGALYGGILMQRLSLFRALLIFGILQG ASNAGYWLLSITDKHLYSMGAAVFFENLCGGMGTSAFVALLMTLCNKSFSATQFALLSAL SAVGRVYVGPVAGWFVEAHGWSTFYLFSVAAAVPGLILLLVCRQTLEYTRVNDNFISRTE YPAGYAFAMWTLAAGVSLLAVWLLLLTMDALDLTHFSFLPALLEVGVLVALSGVVLGGLL DYLALRKTHLT >gi|296493391|gb|ADTK01000110.1| GENE 15 16643 - 17590 810 315 aa, chain + ## HITS:1 COG:cyoA KEGG:ns NR:ns ## COG: cyoA COG1622 # Protein_GI_number: 16128417 # Func_class: C Energy production and conversion # Function: Heme/copper-type cytochrome/quinol oxidases, subunit 2 # Organism: Escherichia coli K12 # 1 315 1 315 315 627 99.0 1e-180 MRLRKYNKSLGWLSLFAGTVLLSGCNSALLDPKGQIGLEQRSLILTAFGLMLIVVIPAIL MAVGFAWKYRASNKDAKYSPNWSHSNKVEAVVWTVPILIIIFLAVLTWKTTHALEPSKPL AHDEKPITIEVVSMDWKWFFIYPEQGIATVNEIAFPANTPVYFKVTSNSVMNSFFIPRLG SQIYAMAGMQTRLHLIANEPGTYDGISASYSGPGFSGMKFKAIATPDRAAFDQWVAKAKQ SPNSMSDMAAFEKLAAPSEYNQVEYFSNVKPDLFADVINKFMAHGKSMDMTQPEGEHSAH EGMEGMDMSHAESAH >gi|296493391|gb|ADTK01000110.1| GENE 16 17612 - 19603 2055 663 aa, chain + ## HITS:1 COG:ECs0485 KEGG:ns NR:ns ## COG: ECs0485 COG0843 # Protein_GI_number: 15829739 # Func_class: C Energy production and conversion # Function: Heme/copper-type cytochrome/quinol oxidases, subunit 1 # Organism: Escherichia coli O157:H7 # 1 663 1 663 663 1241 100.0 0 MFGKLSLDAVPFHEPIVMVTIAGIILGGLALVGLITYFGKWTYLWKEWLTSVDHKRLGIM YIIVAIVMLLRGFADAIMMRSQQALASAGEAGFLPPHHYDQIFTAHGVIMIFFVAMPFVI GLMNLVVPLQIGARDVAFPFLNNLSFWFTVVGVILVNVSLGVGEFAQTGWLAYPPLSGIE YSPGVGVDYWIWSLQLSGIGTTLTGINFFVTILKMRAPGMTMFKMPVFTWASLCANVLII ASFPILTVTVALLTLDRYLGTHFFTNDMGGNMMMYINLIWAWGHPEVYILILPVFGVFSE IAATFSRKRLFGYTSLVWATVCITVLSFIVWLHHFFTMGAGANVNAFFGITTMIIAIPTG VKIFNWLFTMYQGRIVFHSAMLWTIGFIVTFSVGGMTGVLLAVPGADFVLHNSLFLIAHF 
HNVIIGGVVFGCFAGMTYWWPKAFGFKLNETWGKRAFWFWIIGFFVAFMPLYALGFMGMT RRLSQQIDPQFHTMLMIAASGAVLIALGILCLVIQMYVSIRDRDQNRDLTGDPWGGRTLE WATSSPPPFYNFAVVPHVHERDAFWEMKEKGEAYKKPDHYEEIHMPKNSGAGIVIAAFST IFGFAMIWHIWWLAIVGFAGMIITWIVKSFDEDVDYYVPVAEIEKLENQHFDEITKAGLK NGN >gi|296493391|gb|ADTK01000110.1| GENE 17 19593 - 20207 598 204 aa, chain + ## HITS:1 COG:cyoC KEGG:ns NR:ns ## COG: cyoC COG1845 # Protein_GI_number: 16128415 # Func_class: C Energy production and conversion # Function: Heme/copper-type cytochrome/quinol oxidase, subunit 3 # Organism: Escherichia coli K12 # 1 204 1 204 204 368 100.0 1e-102 MATDTLTHATAHAHEHGHHDAGGTKIFGFWIYLMSDCILFSILFATYAVLVNGTAGGPTG KDIFELPFVLVETFLLLFSSITYGMAAIAMYKNNKSQVISWLALTWLFGAGFIGMEIYEF HHLIVNGMGPDRSGFLSAFFALVGTHGLHVTSGLIWMAVLMVQIARRGLTSTNRTRIMCL SLFWHFLDVVWICVFTVVYLMGAM >gi|296493391|gb|ADTK01000110.1| GENE 18 20207 - 20536 341 109 aa, chain + ## HITS:1 COG:ECs0483 KEGG:ns NR:ns ## COG: ECs0483 COG3125 # Protein_GI_number: 15829737 # Func_class: C Energy production and conversion # Function: Heme/copper-type cytochrome/quinol oxidase, subunit 4 # Organism: Escherichia coli O157:H7 # 1 109 1 109 109 159 100.0 8e-40 MSHSTDHSGASHGSVKTYMTGFILSIILTVIPFWMVMTGAASPAVILGTILAMAVVQVLV HLVCFLHMNTKSDEGWNMTAFVFTVLIIAILVVGSIWIMWNLNYNMMMH >gi|296493391|gb|ADTK01000110.1| GENE 19 20551 - 21438 905 295 aa, chain + ## HITS:1 COG:ECs0482 KEGG:ns NR:ns ## COG: ECs0482 COG0109 # Protein_GI_number: 15829736 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Polyprenyltransferase (cytochrome oxidase assembly factor) # Organism: Escherichia coli O157:H7 # 1 295 2 296 296 521 99.0 1e-148 MFKQYLQVTKPGIIFGNLISVIGGFLLASKGSIDYPLFIYTLVGVSLVVASGCVFNNYID RDIDRKMERTKNRVLVKGLISPAVSLVYATLLGFAGFMLLWFGANPLACWLGVMGFVVYV GVYSLYMKRHSVYGTLIGSLSGAAPPVIGYCAVTGEFDSGAAILLAIFSLWQMPHSYAIA IFRFKDYQAANIPVLPVVKGISVAKNHITLYIIAFAVATLMLSLGGYAGYKYLVVAAAVS VWWVGMALRGYKVADDRIWARKLFGFSIIAITALSVMMSVDFMVPDSHTLLAAVW >gi|296493391|gb|ADTK01000110.1| GENE 20 21587 - 22951 
1617 454 aa, chain + ## HITS:1 COG:yajR KEGG:ns NR:ns ## COG: yajR COG0477 # Protein_GI_number: 16128412 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 454 3 456 456 810 98.0 0 MNDYKMTPGERRATWGLGTVFSLRMLGMFMVLPVLTTYGMALQGASEALIGIAIGIYGLT QAVFQIPFGLLSDRIGRKPLIVGGLAVFAAGSVIAALSDSIWGIILGRALQGSGAIAAAV MALLSDLTREQNRTKAMAFIGVSFGITFAIAMVLGPIITHKLGLHALFWMIAILATTGIA LTIWVVPNSSTHVLNRESGMVKGSFSKVLAEPRLLKLNFGIMCLHMLLMSTFVALPGQLA DAGFPAAEHWKVYLATMLIAFGSVVPFIIYAEVKRKMKQVFVFCVGLIVVAEIVLWNAQT QFWQLVVGVQLFFVAFNLMEALLPSLISKESPAGYKGTAMGVYSTSQFLGVAIGGSLGGW IDGMFDGQGVFLAGAMLAAVWLAVASTMKEPPYVSSLRIEIPADIAANEALKVRLLETEG VKEVLIAEEEHSAYVKIDSKVTNRFEVEQAIRQA >gi|296493391|gb|ADTK01000110.1| GENE 21 23079 - 23570 743 163 aa, chain - ## HITS:1 COG:ECs0480 KEGG:ns NR:ns ## COG: ECs0480 COG1666 # Protein_GI_number: 15829734 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 163 7 169 169 280 100.0 7e-76 MPSFDIVSEVDLQEARNAVDNASREVESRFDFRNVEASFELNDASKTIKVLSESDFQVNQ LLDILRAKLLKRGIEGSSLDVPENIVHSGKTWFVEAKLKQGIESATQKKIVKMIKDSKLK VQAQIQGDEIRVTGKSRDDLQAVMAMVRGGDLGQPFQFKNFRD >gi|296493391|gb|ADTK01000110.1| GENE 22 23738 - 24649 801 303 aa, chain + ## HITS:1 COG:apbA KEGG:ns NR:ns ## COG: apbA COG1893 # Protein_GI_number: 16128410 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate reductase # Organism: Escherichia coli K12 # 1 303 1 303 303 622 99.0 1e-178 MKITVLGCGALGQLWLTALCKQGHEVQGWLRVPQPYCSVNLVETDGSIFNESLTANDPDF LATSDLLLVTLKAWQVSDAVKSLASTLPVATPILLIHNGMGTIEELQNIQQPLLMGTTTH AARRDGNVIIHVANGITHIGPARQQDGDYSYLADILQTVLPDVAWHNNIRAELWRKLAVN CVINPLTAIWNCPNGELRHHPQEIMQICEEVAAVIEREGHHTSAEDLRDYVMQVIDATAE NISSMLQDIRALRHTEIDYINGFLLRRARAHGIAVPENTRLFEMVKRKESEYERIGTGLP RPW 
>gi|296493391|gb|ADTK01000110.1| GENE 23 24612 - 25202 697 196 aa, chain + ## HITS:1 COG:ECs0478 KEGG:ns NR:ns ## COG: ECs0478 COG0693 # Protein_GI_number: 15829732 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Escherichia coli O157:H7 # 1 196 3 198 198 382 99.0 1e-106 MSASALVCLAPGSEETEAVTTIDLLVRGGIKVTTASVASDGNLAITCSRGVKLLADAPLV EVADGEYDVIVLPGGIKGAECFRDSTLLVETVKQFHRSGRIVAAICAAPATVLVPHDIFP IGNMTGFPTLKDKIPAEQWQDKRVVWDARVKLLTSQGPGTAIDFGLKIIDLLVGREKAHE VASQLVMAAGIYNYYE >gi|296493391|gb|ADTK01000110.1| GENE 24 25256 - 26704 1603 482 aa, chain - ## HITS:1 COG:ECs0477_1 KEGG:ns NR:ns ## COG: ECs0477_1 COG0301 # Protein_GI_number: 15829731 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis ATP pyrophosphatase # Organism: Escherichia coli O157:H7 # 1 402 1 402 402 791 99.0 0 MKFIIKLFPEITIKSQSVRLRFIKILTGNIRNVLKHYDETLAVVRHWDNIEVRAKDENQR LAIRDALTRIPGIHHILEVEDVPFTDMHDIFEKALVQYRDQLEGKTFCVRVKRRGKHDFS SIDVERYVGGGLNQHIESARVKLTNPEVTVHLEVEDDRLLLIKGRYEGIGGFPIGTQEDV LSLISGGFDSGVSSYMLMRRGCRVHYCFFNLGGAAHEIGVRQVAHYLWNRFGSSHRVRFV AINFEPVVGEILEKIDDGQMGVILKRMMVRAASKVAERYGVQALVTGEALGQVSSQTLTN LRLIDNVSDTLILRPLISYDKEHIINLARQIGTEDFARTMPEYCGVISKSPTVKAVKSKI EAEEEKFDFSILDKVVEEANNVDIREIAQQTEQEVVEVETVNGFGPNDVILDIRSIDEQE DKPLKVEGIDVVSLPFYKLSTKFGDLDQNRTWLLWCERGVMSRLQALYLREQGFNNVKVY RP >gi|296493391|gb|ADTK01000110.1| GENE 25 26910 - 27152 363 80 aa, chain + ## HITS:1 COG:ECs0476 KEGG:ns NR:ns ## COG: ECs0476 COG1722 # Protein_GI_number: 15829730 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII small subunit # Organism: Escherichia coli O157:H7 # 1 80 1 80 80 112 100.0 1e-25 MPKKNEAPASFEKALSELEQIVTRLESGDLPLEEALNEFERGVQLARQGQAKLQQAEQRV QILLSDNEDASLTPFTPDNE >gi|296493391|gb|ADTK01000110.1| GENE 26 27152 - 28051 897 299 aa, chain + ## HITS:1 COG:ispA KEGG:ns NR:ns ## COG: ispA COG0142 # Protein_GI_number: 16128406 # Func_class: H Coenzyme transport and metabolism # Function: 
Geranylgeranyl pyrophosphate synthase # Organism: Escherichia coli K12 # 1 299 1 299 299 534 99.0 1e-152 MDFPQQLEACVKQANQALSRFIAPLPFQNTPVVETMQYGALLGGKRLRPFLVYATGHMFG VSTNTLDAPAAAVECIHAYSLIHDDLPAMDDDDLRRGLPTCHVKFGEANAILAGDALQTL AFSILSDANMPEVSDRDRISMISELASASGIAGMCGGQALDLDAEGKHVPLDALERIHRH KTGALIRAAVRLGALSAGDKGRRALPVLDKYAESIGLAFQVQDDILDVVGDTATLGKRQG ADQQLGKSTYPALLGLEQARKKARDLIDDARQSLKQLAEQSLDTSALEALADYIIQRNK >gi|296493391|gb|ADTK01000110.1| GENE 27 28076 - 29938 2024 620 aa, chain + ## HITS:1 COG:dxs KEGG:ns NR:ns ## COG: dxs COG1154 # Protein_GI_number: 16128405 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Escherichia coli K12 # 1 620 1 620 620 1269 99.0 0 MSFDIAKYPTLALVDSTQELRLLPKESLPKLCDELRRYLLDSVSRSSGHFASGLGTVELT VALHYVYNTPFDQLIWDVGHQAYPHKILTGRRDKIGTIRQKGGLHPFPWRGESEYDVLSV GHSSTSISAGIGIAVAAEKEGKNRRTVCVIGDGAITAGMAFEAMNHAGDIRPDMLVVLND NEMSISENVGALNNHLAQLLSGKLYSSLREGGKKVFSGVPPIKELLKRTEEHIKGMVVPG TLFEELGFNYIGPVDGHDVLGLITTLKNMRDLKGPQFLHIMTKKGRGYEPAEKDPITFHA VPKFDPSSGCLPKSSGGLPSYSKIFGDWLCETAAKDNKLMAITPAMREGSGMVEFSRKFP DRYFDVAIAEQHAVTFAAGLAIGGYKPIVAIYSTFLQRAYDQVLHDVAIQKLPVLFAIDR AGIVGADGQTHQGAFDLSYLRCIPEMVIMTPSDENECRQMLYTGYHYNDGPSAVRYPRGN AVGVELTPLEKLPIGKGIVKRRGEKLAILNFGTLMPEAAKVAESLNATLVDMRFVKPLDE ALILEMAASHEALVTVEENAIMGGAGSGVNEVLMAHRKPVPVLNIGLPDFFIPQGTQEEM RAELGLDAAGMEAKIKAWLA >gi|296493391|gb|ADTK01000110.1| GENE 28 30118 - 31092 975 324 aa, chain + ## HITS:1 COG:yajO KEGG:ns NR:ns ## COG: yajO COG0667 # Protein_GI_number: 16128404 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Escherichia coli K12 # 1 324 25 348 348 652 98.0 0 MQYNPLGKTDLRVSRLCLGCMTFGEPDRGNHAWTLPEESSRPIIKRALEGGINFFDTANS FSDGSSEEIVGRALRDFARREDVVVATKVFHRVGDLPEGLSRAQILRSIDDSLRRLGMDY VDILQIHRWDYNTPIEETLEALNDVVKAGKARYIGASSMHASQFAQALKLQKQHGWAQFI SMQDHYNLIYREEEREMLPLCYQEGVAVIPWSPLARGRLTRPWGETTARLVSDEVGRNLY 
KESDENDAQIAERLTGVSEELGATRAQVALAWLLSKPGIAAPIIGTSREEQLDELLNAVD ITLKPEQIAELETPYKPHAVVGFK >gi|296493391|gb|ADTK01000110.1| GENE 29 31146 - 31664 583 172 aa, chain - ## HITS:1 COG:pgpA KEGG:ns NR:ns ## COG: pgpA COG1267 # Protein_GI_number: 16128403 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphatase A and related proteins # Organism: Escherichia coli K12 # 1 172 1 172 172 313 100.0 8e-86 MTILPRHKDVAKSRLKMSNPWHLLAVGFGSGLSPIVPGTMGSLAAIPFWYLMTFLPWQLY SLVVMLGICIGVYLCHQTAKDMGVHDHGSIVWDEFIGMWITLMALPTNDWQWVAAGFVIF RILDMWKPWPIRWFDRNVHGGMGIMIDDIVAGVISAGILYFIGHHWPLGILS >gi|296493391|gb|ADTK01000110.1| GENE 30 31642 - 32619 1037 325 aa, chain - ## HITS:1 COG:thiL KEGG:ns NR:ns ## COG: thiL COG0611 # Protein_GI_number: 16128402 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate kinase # Organism: Escherichia coli K12 # 1 325 1 325 325 637 99.0 0 MACGEFSLIARYFDRVRSSRLDVELGIGDDCALLNIPEKQTLAISTDTLVAGNHFLHDID PADLAYKALAVNLSDLAAMGADPAWLTLALTLPDVDEAWLESFSDSLFDLLNYYDMQLIG GDTTRGPLSMTLGIHGFVPMGRALTRSGAKPGDWIYVTGTPGDSAAGLAILQNRLQVADA KDADYLIKRHLRPSPRILQGQALRDLANSAIDLSDGLISDLGHIVKASDCGARIDLALLP FSDALSRHVEPEQALRWALSGGEDYELCFTVPELNRGALDVALGHLGVPFTCIGQMTADI EGLCFIRDGEPVTFDWKGYDHFATP >gi|296493391|gb|ADTK01000110.1| GENE 31 32697 - 33116 509 139 aa, chain - ## HITS:1 COG:ECs0469 KEGG:ns NR:ns ## COG: ECs0469 COG0781 # Protein_GI_number: 15829723 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Escherichia coli O157:H7 # 1 139 1 139 139 245 100.0 2e-65 MKPAARRRARECAVQALYSWQLSQNDIADVEYQFLAEQDVKDVDVLYFRELLAGVATNTA YLDGLMKPYLSRLLEELGQVEKAVLRIALYELSKRSDVPYKVAINEAIELAKSFGAEDSH KFVNGVLDKAAPVIRPNKK >gi|296493391|gb|ADTK01000110.1| GENE 32 33136 - 33606 602 156 aa, chain - ## HITS:1 COG:ECs0468 KEGG:ns NR:ns ## COG: ECs0468 COG0054 # Protein_GI_number: 15829722 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase beta-chain # Organism: Escherichia coli O157:H7 # 1 
156 1 156 156 258 100.0 4e-69 MNIIEANVATPDARVAITIARFNNFINDSLLEGAIDALKRIGQVKDENITVVWVPGAYEL PLAAGALAKTGKYDAVIALGTVIRGGTAHFEYVAGGASNGLAHVAQDSEIPVAFGVLTTE SIEQAIERAGTKAGNKGAEAALTALEMINVLKAIKA >gi|296493391|gb|ADTK01000110.1| GENE 33 33695 - 34798 847 367 aa, chain - ## HITS:1 COG:ribD_2 KEGG:ns NR:ns ## COG: ribD_2 COG1985 # Protein_GI_number: 16128399 # Func_class: H Coenzyme transport and metabolism # Function: Pyrimidine reductase, riboflavin biosynthesis # Organism: Escherichia coli K12 # 144 367 1 224 224 432 97.0 1e-121 MQDEYYMARALKLAQRGRFTTHPNPNVGCVIVKDGEIVGEGYHQRAGEPHAEVHALRMAG EKAKGATAYVTLEPCSHHGRTPPCCDALIAAGVARVVASMQDPNPQVAGRGLYRLQQAGI DVSHGLMMSEAEQLNKGFLKRMRTGFPYIQLKLGASLDGRTAMASGESQWITSPQARRDV QRQRAQSHAILTSSATVLADDPALTVRWSELDEQTQALYPQQNLRQPVRIVIDSQNRVTP EHRIVQQPGETWFARTQEDSSEWPETVRTLLIPEHKGHLDLVVLMMQLGKQQINSIWVEA GPTLAGALLQAGLVDELIVYIAPKLLGSDARGLCTLPGLEKLADAPQFKFKEIRHVGPDV CLHLVGA >gi|296493391|gb|ADTK01000110.1| GENE 34 34802 - 35197 420 131 aa, chain - ## HITS:1 COG:ECs0466 KEGG:ns NR:ns ## COG: ECs0466 COG1327 # Protein_GI_number: 15829720 # Func_class: K Transcription # Function: Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains # Organism: Escherichia coli O157:H7 # 1 131 19 149 149 231 99.0 2e-61 MGEGSSVRRRRQCLVCNERFTTFEVAELVMPRVVKSNDVREPFNEEKLRSGMLRALEKRP VSSDDVEMAINHIKSQLRATGEREVPSKMIGNLVMEQLKKLDKVAYIRFASVYRSFEDIK EFGEEIARLED >gi|296493391|gb|ADTK01000110.1| GENE 35 35402 - 35941 463 179 aa, chain + ## HITS:1 COG:no KEGG:EcHS_A0483 NR:ns ## KEGG: EcHS_A0483 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_HS # Pathway: not_defined # 1 179 21 199 199 295 100.0 6e-79 MNTNVFRLLLLGSLFSLSACVQQSEVRQMKHSVSTLNQEMTQLNKETVKITQQNRLNAKS SSGVYLLPGAKTPARLESQIGTLRMSLVNITPDADGTTLTLRIQGESNDPLPAFSGTVEY GQIQGTIDNFQEINVQNQLINAPASVLAPSDVDIPLQLKGISVEQLDFVRIHDIQPVMQ >gi|296493391|gb|ADTK01000110.1| GENE 36 36240 - 37124 1114 294 aa, chain + ## HITS:1 COG:Ztsx KEGG:ns NR:ns ## COG: Ztsx COG3248 # Protein_GI_number: 
15800140 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Nucleoside-binding outer membrane protein # Organism: Escherichia coli O157:H7 EDL933 # 1 294 1 294 294 523 99.0 1e-148 MKKTLLAAGAVLALSSSFTVNAAENDKPQYLSDWWHQSVNVVGSYHTRFGPQIRNDTYLE YEAFAKKDWFDFYGYADAPVFFGGNSDAKGIWNHGSPLFMEIEPRFSIDKLTNTDLSFGP FKEWYFANNYIYDMGRNKDGRQSTWYMGLGTDIDTGLPMSLSMNVYAKYQWQNYGAANEN EWDGYRFKIKYFVPITDLWGGQLSYIGFTNFDWGSDLGDDSGNAINGIKTRTNNSIASSH ILALNYDHWHYSVVARYWHDGGQWNDDAELNFGNGNFNVRSTGWGGYLVVGYNF >gi|296493391|gb|ADTK01000110.1| GENE 37 37301 - 37648 400 115 aa, chain - ## HITS:1 COG:no KEGG:c0520 NR:ns ## KEGG: c0520 # Name: yajD # Def: hypothetical protein # Organism: E.coli_CFT073 # Pathway: not_defined # 1 115 24 138 138 236 100.0 2e-61 MAIIPKNYARLESGYREKALKIYPWVCGRCSREFVYSNLRELTVHHIDHDHTNNPEDGSN WELLCLYCHDHEHSKYTEADQYGTTVIAGEDAQKDVGEAKYNPFADLKAMMNKKK >gi|296493391|gb|ADTK01000110.1| GENE 38 37777 - 38748 1003 323 aa, chain - ## HITS:1 COG:ECs0460 KEGG:ns NR:ns ## COG: ECs0460 COG0341 # Protein_GI_number: 15829714 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecF # Organism: Escherichia coli O157:H7 # 1 323 1 323 323 613 100.0 1e-175 MAQEYTVEQLNHGRKVYDFMRWDYWAFGISGLLLIAAIVIMGVRGFNWGLDFTGGTVIEI TLEKPAEIDVMRDALQKAGFEEPMLQNFGSSHDIMVRMPPAEGETGGQVLGSQVLKVINE STNQNAAVKRIEFVGPSVGADLAQTGAMALMAALLSILVYVGFRFEWRLAAGVVIALAHD VIITLGILSLFHIEIDLTIVASLMSVIGYSLNDSIVVSDRIRENFRKIRRGTPYEIFNVS LTQTLHRTLITSGTTLMVILMLYLFGGPVLEGFSLTMLIGVSIGTASSIYVASALALKLG MKREHMLQQKVEKEGADQPSILP >gi|296493391|gb|ADTK01000110.1| GENE 39 38759 - 40573 2035 604 aa, chain - ## HITS:1 COG:ECs0459 KEGG:ns NR:ns ## COG: ECs0459 COG0342 # Protein_GI_number: 15829713 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecD # Organism: Escherichia coli O157:H7 # 1 604 12 615 615 1113 100.0 0 MLIVVIVIGLLYALPNLFGEDPAVQITGARGVAASEQTLIQVQKTLQEEKITAKSVALEE 
GAILARFDSTDTQLRAREALMGVMGDKYVVALNLAPATPRWLAAIHAEPMKLGLDLRGGV HFLMEVDMDTALGKLQEQNIDSLRSDLREKGIPYTTVRKENNYGLSITFRDAKARDEAIA YLSKRHPDLVISSQGSNQLRAVMSDARLSEAREYAVQQNINILRNRVNQLGVAEPVVQRQ GADRIVVELPGIQDTARAKEILGATATLEFRLVNTNVDQAAAASGRVPGDSEVKQTREGQ PVVLYKRVILTGDHITDSTSSQDEYNQPQVNISLDSAGGNIMSNFTKDNIGKPMATLFVE YKDSGKKDANGRAVLVKQEEVINIANIQSRLGNSFRITGINNPNEARQLSLLLRAGALIA PIQIVEERTIGPTLGMQNIEQGLEACLAGLLVSILFMIIFYKKFGLIATSALIANLILIV GIMSLLPGATLSMPGIAGIVLTLAVAVDANVLINERIKEELSNGRTVQQAIDEGYRGAFS SIFDANITTLIKVIILYAVGTGAIKGFAITTGIGVATSMFTAIVGTRAIVNLLYGGKRVK KLSI >gi|296493391|gb|ADTK01000110.1| GENE 40 40634 - 40966 496 110 aa, chain - ## HITS:1 COG:ECs0458 KEGG:ns NR:ns ## COG: ECs0458 COG1862 # Protein_GI_number: 15829712 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YajC # Organism: Escherichia coli O157:H7 # 1 110 1 110 110 201 100.0 4e-52 MSFFISDAVAATGAPAQGSPMSLILMLVVFGLIFYFMILRPQQKRTKEHKKLMDSIAKGD EVLTNGGLVGRVTKVAENGYIAIALNDTTEVVIKRDFVAAVLPKGTMKAL >gi|296493391|gb|ADTK01000110.1| GENE 41 40989 - 42116 1046 375 aa, chain - ## HITS:1 COG:ECs0457 KEGG:ns NR:ns ## COG: ECs0457 COG0343 # Protein_GI_number: 15829711 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Queuine/archaeosine tRNA-ribosyltransferase # Organism: Escherichia coli O157:H7 # 1 375 1 375 375 798 100.0 0 MKFELDTTDGRARRGRLVFDRGVVETPCFMPVGTYGTVKGMTPEEVEATGAQIILGNTFH LWLRPGQEIMKLHGDLHDFMQWKGPILTDSGGFQVFSLGDIRKITEQGVHFRNPINGDPI FLDPEKSMEIQYDLGSDIVMIFDECTPYPADWDYAKRSMEMSLRWAKRSRERFDSLGNKN ALFGIIQGSVYEDLRDISVKGLVDIGFDGYAVGGLAVGEPKADMHRILEHVCPQIPADKP RYLMGVGKPEDLVEGVRRGIDMFDCVMPTRNARNGHLFVTDGVVKIRNAKYKSDTGPLDP ECDCYTCRNYSRAYLHHLDRCNEILGARLNTIHNLRYYQRLMAGLRKAIEEGKLESFVTD FYQRQGREVPPLNVD >gi|296493391|gb|ADTK01000110.1| GENE 42 42172 - 43242 1123 356 aa, chain - ## HITS:1 COG:ECs0456 KEGG:ns NR:ns ## COG: ECs0456 COG0809 # Protein_GI_number: 15829710 # Func_class: J Translation, ribosomal structure and biogenesis # 
Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Escherichia coli O157:H7 # 1 356 1 356 356 705 99.0 0 MRVTDFSFELPESLIAHYPMPERSSCRLLSLDGPTGALTHGTFTDLLDKLNPGDLLVFNN TRVIPARLFGRKASGGKIEVLVERMLDDKRILAHIRASKAPKPGAELLLGDDESINATMT ARHGALFEVEFNDDRSVLDILNSIGHMPLPPYIDRPDEDADRELYQTVYSEKPGAVAAPT AGLHFDEPLLEKLRAKGVEMAFVTLHVGAGTFQPVRVDTIEDHIMHSEYAEVPQDVVDAV LAAKARGNRVIAVGTTSVRSLESAAQAAKNDLIEPFFDDTQIFIYPGFQYKVVDALVTNF HLPESTLIMLVSAFAGYQHTMNAYKAAVEEKYRFFSYGDAMFITYNPQAINERVGE >gi|296493391|gb|ADTK01000110.1| GENE 43 43335 - 43916 502 193 aa, chain + ## HITS:1 COG:yajB KEGG:ns NR:ns ## COG: yajB COG3124 # Protein_GI_number: 16128389 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 193 1 193 193 351 99.0 5e-97 MNFLAHLHLAHLAESSLSGNLLADFVRGNPEESFPPDVVAGIHMHRRIDVLTDNLPEVRE AREWFRSETRRVAPITLDVMWDHFLSRHWSQLSPDFPLQEFVCYAREQVMTILPDSPPRF INLNNYLWSEQWLVRYRDMDFIQNVLNGMASRRPRLDALRDSWYDLDAHYDALETRFWQF YPRMMAQASHKAL >gi|296493391|gb|ADTK01000110.1| GENE 44 43921 - 45735 1555 604 aa, chain - ## HITS:1 COG:malZ KEGG:ns NR:ns ## COG: malZ COG0366 # Protein_GI_number: 16128388 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Escherichia coli K12 # 1 604 2 605 605 1243 99.0 0 MLNAWHLPVPPFVKQSKDQLLITLWLTGEDPPQRIMLRTEHDNEEMSVSMHKQRSQQQPG VTAWRAAIDLSSGQPRRRYSFKLLWHDRQRWFTPQGFSRMPPARLEQFAVDVPDIGPQWA ADQIFYQIFPDRFARSLPREAEQDHVYYHHAAGQEIILRDWDEPVTAQAGGSTFYGGDLD GISEKLPYLKKLGVTALYLNPVFKAPSVHKYDTEDYRHVDPQFGGDGALLRLRHNTQQLG MRLVLDGVFNHSGDSHAWFDRHNRGTGGACHNPESPWRDWYSFSDDGTALDWLGYASLPK LDYQSESLVNEIYRGEDSIVRHWLKAPWSMDGWRLDVVHMLGEAGGARNNMQHVAGITEA AKETQPEAYIVGEHFGDARQWLQADVEDAAMNYRGFTFPLWGFLANTDISYDPQQIDAQT CMAWMDNYRAGLSHQQQLRMFNQLDSHDTARFKTLLGRDIARLPLAVVWLFTWPGVPCIY YGDEVGLDGKNDPFCRKPFPWQVEKQDTALFALYQRMIALRKKSQALRHGGCQVLYAEDN VVVFVRVLNQQRVLVAINRGEACEVVLPASPLLNAVQWQCKEGHGQLTDGILALPAISAT VWMN >gi|296493391|gb|ADTK01000110.1| GENE 45 45894 - 47267 
1562 457 aa, chain - ## HITS:1 COG:ECs0452 KEGG:ns NR:ns ## COG: ECs0452 COG1113 # Protein_GI_number: 15829706 # Func_class: E Amino acid transport and metabolism # Function: Gamma-aminobutyrate permease and related permeases # Organism: Escherichia coli O157:H7 # 1 457 1 457 457 783 100.0 0 MESKNKLKRGLSTRHIRFMALGSAIGTGLFYGSADAIKMAGPSVLLAYIIGGIAAYIIMR ALGEMSVHNPAASSFSRYAQENLGPLAGYITGWTYCFEILIVAIADVTAFGIYMGVWFPT VPHWIWVLSVVLIICAVNLMSVKVFGELEFWFSFFKVATIIIMIVAGFGIIIWGIGNGGQ PTGIHNLWSNGGFFSNGWLGMVMSLQMVMFAYGGIEIIGITAGEAKDPEKSIPRAINSVP MRILVFYVGTLFVIMSIYPWNQVGTAGSPFVLTFQHMGITFAASILNFVVLTASLSAINS DVFGVGRMLHGMAEQGSAPKIFSKTSRRGIPWVTVLVMTTALLFAVYLNYIMPENVFLVI ASLATFATVWVWIMILLSQIAFRRRLPPEEVKALKFKVPGGVATTIGGLIFLLFIIGLIG YHPDTRISLYVGFAWIVVLLIGWMFKRRHDRQLAENQ >gi|296493391|gb|ADTK01000110.1| GENE 46 47343 - 48662 1383 439 aa, chain - ## HITS:1 COG:ECs0451 KEGG:ns NR:ns ## COG: ECs0451 COG1114 # Protein_GI_number: 15829705 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid permeases # Organism: Escherichia coli O157:H7 # 1 439 1 439 439 741 100.0 0 MTHQLRSRDIIALGFMTFALFVGAGNIIFPPMVGLQAGEHVWTAAFGFLITAVGLPVLTV VALAKVGGGVDSLSTPIGKVAGVLLATVCYLAVGPLFATPRTATVSFEVGIAPLTGDSAL PLFIYSLVYFAIVILVSLYPGKLLDTVGNFLAPLKIIALVILSVAAIVWPAGSISTATEA YQNAAFSNGFVNGYLTMDTLGAMVFGIVIVNAARSRGVTEARLLTRYTVWAGLMAGVGLT LLYLALFRLGSDSASLVDQSANGAAILHAYVQHTFGGGGSFLLAALIFIACLVTAVGLTC ACAEFFAQYVPLSYRTLVFILGGFSMVVSNLGLSQLIQISVPVLTAIYPPCIALVVLSFT RSWWHNSSRVIAPPMFISLLFGILDGIKASAFSDILPSWAQRLPLAEQGLAWLMPTVVMV VLAIIWDRAAGRQVTSSAH >gi|296493391|gb|ADTK01000110.1| GENE 47 49069 - 50364 1129 431 aa, chain - ## HITS:1 COG:ECs0450 KEGG:ns NR:ns ## COG: ECs0450 COG0642 # Protein_GI_number: 15829704 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli O157:H7 # 1 431 1 431 431 835 99.0 0 MLERLSWKRLVLELLLCCLPAFILGAFFGYLPWFLLASVTGLLIWHFWNLLRLSWWLWVD RSMTPPPGRGSWEPLLYGLHQMQLRNKKRRRELGNLIKRFRSGAESLPDAVVLTTEEGGI 
FWCNGLAQQILGLRWPEDNGQNILNLLRYPEFTQYLKTRDFSRPLNLVLNTGRHLEIRVM PYTHKQLLMVARDVTQMHQLEGARRNFFANVSHELRTPLTVLQGYLEMMDEQPLGGAVRE KALHTMREQTQRMEGLVKQLLTLSKIEAAPTQLLNEKVDVPMMLRVVEREAQTLSQKKQT FTFEIDNGLKVSGNEDQLRSAISNLVYNAVNHTPEGTHITVRWQRVPHGAEFSVEDNGPG IAPEHIPRLTERFYRVDKARSRQTGGSGLGLAIVKHAVNHHESRLNIESTVGKGTRFSFV IPERLIAKNSD >gi|296493391|gb|ADTK01000110.1| GENE 48 50422 - 51111 666 229 aa, chain - ## HITS:1 COG:ECs0449 KEGG:ns NR:ns ## COG: ECs0449 COG0745 # Protein_GI_number: 15829703 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 229 1 229 229 449 100.0 1e-126 MARRILVVEDEAPIREMVCFVLEQNGFQPVEAEDYDSAVNQLNEPWPDLILLDWMLPGGS GIQFIKHLKRESMTRDIPVVMLTARGEEEDRVRGLETGADDYITKPFSPKELVARIKAVM RRISPMAVEEVIEMQGLSLDPTSHRVMAGEEPLEMGPTEFKLLHFFMTHPERVYSREQLL NHVWGTNVYVEDRTVDVHIRRLRKALEPGGHDRMVQTVRGTGYRFSTRF >gi|296493391|gb|ADTK01000110.1| GENE 49 51301 - 52503 944 400 aa, chain + ## HITS:1 COG:ECs0448 KEGG:ns NR:ns ## COG: ECs0448 COG0420 # Protein_GI_number: 15829702 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: Escherichia coli O157:H7 # 1 400 1 400 400 803 100.0 0 MRILHTSDWHLGQNFYSKSREAEHQAFLDWLLETAQTHQVDAIIVAGDVFDTGSPPSYAR TLYNRFVVNLQQTGCHLVVLAGNHDSVATLNESRDIMAFLNTTVVASAGHAPQILPRRDG TPGAVLCPIPFLRPRDIITSQAGLNGIEKQQHLLAAITDYYQQHYADACKLRGDQPLPII ATGHLTTVGASKSDAVRDIYIGTLDAFPAQNFPPADYIALGHIHRAQIIGGMEHVRYCGS PIPLSFDECGKSKYVHLVTFSNGKLESVENLNVPVTQPMAVLKGDLASITAQLEQWRDVS QEPPVWLDIEITTDEYLHDIQRKIQALTESLPVEVLLVRRSREQRERVLASQQRETLSEL SVEEVFNRRLALEELDESQQQRLQHLFTTTLHTLAGEHEA >gi|296493391|gb|ADTK01000110.1| GENE 50 52500 - 55643 3113 1047 aa, chain + ## HITS:1 COG:ZsbcC KEGG:ns NR:ns ## COG: ZsbcC COG0419 # Protein_GI_number: 15800123 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Escherichia coli O157:H7 EDL933 # 1 
1047 1 1047 1047 1476 98.0 0 MKILSLRLKNLNSLKGEWKIDFTREPFASNGLFAITGPTGAGKTTLLDAICLALYHETPR LSNVSQSQNDLMTRDTAECLAEVEFEVKGEAYRAFWSQNRARNQPDGNLQVPRVELARCA DGKILADKVKDKLELTATLTGLDYGRFTRSMLLSQGQFAAFLNAKPKERAELLEELTGTE IYGQISAMVFEQHKSARTELEKLQAQASGVALLTPEQVQSLTASLQVLTDEEKQLLTAQQ QEQQSLNWLTRQDELQQEASRRQQALQQVLAEEEKAQPQLAALSLAQPARNLRPHWERIA EHSAALAHIRQQIEEVNTRLQSTMALRASIRHHAAKQSAELQQQQQSLNTWLQEHDRFRQ WNNELAGWRAQFSQQTSDREHLRQWQQQLTHAEQKLNTLAAITLTLTADEVASALAQHAE QRPLRQRLVALHGQIVPQQKRLAQLQVAIQNVTQEQTQRNAALNEMRQRYKEKTQQLADV KTICEQEARIKTLEAQRAQLQAGQPCPLCGSTSHPAVEAYQALEPGVNQSRLLALENEVK KLGEEGAALRGQLDALTKQLQRDENEAQSLRQDEQALTQQWQAVTASLNITLQPQDDIQP WLDAQDEHERQLRLLSQRHELQGQIAAHNQQIIQYQQQIEQRQQQLLTALAGYALTLPQE DEEESWLATRQQEAQSWQQRQNELTALQNRIQQLTPILETLPQSDDLPHSEETVALDNWR QVHEQCLALHSQQQTLQQQDVLAAQSLQKAQAQFDTALQASVFDDQQAFLAALMDEQTLT QLEQLKQNLENQRRQAQTLVTQTAETLAQHQQHRPDGLALTVTVEQIQQELAQTHQKLRE NTTSQGEIRQQLKQDADNRQQQQTLMQQIAQMTQQVEDWGYLNSLIGSKEGDKFRKFAQG LTLDNLVHLANQQLTRLHGRYLLQRKASEALEVEVVDTWQADAVRDTRTLSGGESFLVSL ALALALSDLVSHKTRIDSLFLDEGFGTLDSETLDTALDALDALNASGKTIGVISHVEAMK ERIPVQIKVKKINGLGYSKLESTFAVK >gi|296493391|gb|ADTK01000110.1| GENE 51 55685 - 56953 933 422 aa, chain + ## HITS:1 COG:ECs0446 KEGG:ns NR:ns ## COG: ECs0446 COG2814 # Protein_GI_number: 15829700 # Func_class: G Carbohydrate transport and metabolism # Function: Arabinose efflux permease # Organism: Escherichia coli O157:H7 # 29 422 1 394 394 630 99.0 1e-180 MALLVVILQAITLLATVIGSRSGSCDGGMKKVILSLALGTFGLGMAEFGIMGVLTELAHN VGISIPAAGHMISYYALGVVVGAPIIALFSSRYSLKHILLFLVALCVIGNAMFTLSSSYM MLAIGRLVSGFPHGAFFGVGAIVLSKIIKPGKVTAAVAGVVSGMTVANLLGIPLGTYLSQ EFSWRYTFLLIAVFNIAVMASVYFWVPDIRDEAKGKLREQFHFLRSPAPWLIFAATMFGN AGVFAWFSYVKPYMMFISGFSETAMTFIMMLVGLGMVLGNMLSGRISGRYSPLRIAAVTD FIIVLALLMLFFCGGMKTTSLIFAFICCAGLFALSAPLQILLLQNAKGGELLGAAGGQIA FNLGSAVGAYCGGMMLTLGLAYNYVALPAALLSFAAMSSLLLYGRYKRQQAADSPVLAKP LG >gi|296493391|gb|ADTK01000110.1| GENE 52 57096 - 58004 294 302 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase 
[Streptococcus pneumoniae D39] # 3 298 6 316 319 117 31 2e-25 MRIGIDLGGTKTEVIALGDAGEQLYRHRLPTPRDDYRQTIETIATLVDMAEQATGQRGTV GMGIPGSISPYTGVVKNANSTWLNGQPFDKDLSARLQREVRLANDANCLAVSEAVDGAAA GAQTVFAVIIGTGCGAGVAFNGRAHIGGNGTAGEWGHNPLPWMDEDELRYREEVPCYCGK QGCIETFISGTGFATDYRRLSGHALKGSEIIRLVEESDPVAELALRRYELRLAKSLAHVV NILDPDVIVLGGGMSNVDRLYQTVGQLIKQFVFGGECETPVRKAKHGDSSGVRGAAWLWP QE >gi|296493391|gb|ADTK01000110.1| GENE 53 58129 - 59040 1169 303 aa, chain + ## HITS:1 COG:rdgC KEGG:ns NR:ns ## COG: rdgC COG2974 # Protein_GI_number: 16128378 # Func_class: L Replication, recombination and repair # Function: DNA recombination-dependent growth factor C # Organism: Escherichia coli K12 # 1 303 1 303 303 572 100.0 1e-163 MLWFKNLMVYRLSREISLRAEEMEKQLASMAFTPCGSQDMAKMGWVPPMGSHSDALTHVA NGQIVICARKEEKILPSPVIKQALEAKIAKLEAEQARKLKKTEKDSLKDEVLHSLLPRAF SRFSQTMMWIDTVNGLIMVDCASAKKAEDTLALLRKSLGSLPVVPLSMENPIELTLTEWV RSGSAAQGFQLLDEAELKSLLEDGGVIRAKKQDLTSEEITNHIEAGKVVTKLALDWQQRI QFVMCDDGSLKRLKFCDELRDQNEDIDREDFAQRFDADFILMTGELAALIQNLIEGLGGE AQR >gi|296493391|gb|ADTK01000110.1| GENE 54 59198 - 59401 131 67 aa, chain - ## HITS:1 COG:no KEGG:ECH74115_0466 NR:ns ## KEGG: ECH74115_0466 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 67 655 721 721 114 89.0 8e-25 MTEPQDRSLAINNPQLSADVKTAWLKEDPSLLLFVEQPDLSQLRDLVKTGATRKIRSEAR HRLEEKQ >gi|296493391|gb|ADTK01000110.1| GENE 55 59688 - 59972 285 94 aa, chain - ## HITS:1 COG:ECs0441 KEGG:ns NR:ns ## COG: ECs0441 COG3123 # Protein_GI_number: 15829695 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 94 1 94 94 175 100.0 2e-44 MLQSNEYFSGKVKSIGFSSSSTGRASVGVMVEGEYTFSTAEPEEMTVISGALNVLLPDAT DWQVYEAGSVFNVPGHSEFHLQVAEPTSYLCRYL >gi|296493391|gb|ADTK01000110.1| GENE 56 60044 - 60721 591 225 aa, chain - ## HITS:1 COG:no KEGG:B21_00341 NR:ns ## KEGG: B21_00341 # Name: aroM # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined 
# 1 225 1 225 225 404 100.0 1e-111 MSASLAILTIGIVPMQEVLPLLTEYIDEDNISHHSLLGKLSREEVMAEYAPEAGEDTILT LLNDNQLAHVSRRKVERDLQGVVEVLDNRGYDVIILMSTANISSMTARNTIFLEPSRILP PLVSSIVEDHQVGVIVPVEEMLPVQAQKWQILQKSPVFSLGNPIHDSEQKIIDAGKELLA KGADVIMLDCLGFHQRHRDLLQKQLDVPVLLSNVLIARLAAELLV >gi|296493391|gb|ADTK01000110.1| GENE 57 60979 - 61170 266 63 aa, chain - ## HITS:1 COG:no KEGG:LF82_2523 NR:ns ## KEGG: LF82_2523 # Name: yaiA # Def: uncharacterized protein YaiA # Organism: E.coli_LF82 # Pathway: not_defined # 1 63 1 63 63 90 100.0 2e-17 MPTKPPYPREAYIVTIEKGKPGQTVTWYQLRADHPKPDSLISEHPTAQEAMDAKKRYEDP DKE >gi|296493391|gb|ADTK01000110.1| GENE 58 61220 - 61744 364 174 aa, chain - ## HITS:1 COG:ECs0438 KEGG:ns NR:ns ## COG: ECs0438 COG0703 # Protein_GI_number: 15829692 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Escherichia coli O157:H7 # 1 174 1 174 174 333 99.0 1e-91 MTQPLFLIGPRGCGKTTVGMALADSLNRRFVDTDQWLQSQLNMTVAEIVEREEWAGFRAR ETAALEAVTAPSTVIATGGGIILTEFNRHFMQNNGIVVYLCAPVSVLVNRLQAAPEEDLR PTLTGKPLSEEVQEVLEERDALYREVAHIIIDATNEPGQVISEIRSALAQTINC >gi|296493391|gb|ADTK01000110.1| GENE 59 61921 - 62385 386 154 aa, chain - ## HITS:1 COG:ECs0436 KEGG:ns NR:ns ## COG: ECs0436 COG1671 # Protein_GI_number: 15829691 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 152 41 192 192 290 99.0 5e-79 MTIWVDADACPNVIKEILYRAAERMQMPLVLVANQSLRVPPSRFIRTLRVAAGFDVADNE IVRQCEAGDLVITADIPLAAEAIEKGAAAINPRGERYTPATIRERLTMRDFMDTLRASGI QTGGPDSLSQRDRQAFAAELEKWWLEVQRSRGQM >gi|296493391|gb|ADTK01000110.1| GENE 60 62505 - 63314 905 269 aa, chain + ## HITS:1 COG:proC KEGG:ns NR:ns ## COG: proC COG0345 # Protein_GI_number: 16128371 # Func_class: E Amino acid transport and metabolism # Function: Pyrroline-5-carboxylate reductase # Organism: Escherichia coli K12 # 1 269 1 269 269 489 100.0 1e-138 MEKKIGFIGCGNMGKAILGGLIASGQVLPGQIWVYTPSPDKVAALHDQFGINAAESAQEV 
AQIADIIFAAVKPGIMIKVLSEITSSLNKDSLVVSIAAGVTLDQLARALGHDRKIIRAMP NTPALVNAGMTSVTPNALVTPEDTADVLNIFRCFGEAEVIAEPMIHPVVGVSGSSPAYVF MFIEAMADAAVLGGMPRAQAYKFAAQAVMGSAKMVLETGEHPGALKDMVCSPGGTTIEAV RVLEEKGFRAAVIEAMTKCMEKSEKLSKS >gi|296493391|gb|ADTK01000110.1| GENE 61 63331 - 64431 690 366 aa, chain - ## HITS:1 COG:yaiC_2 KEGG:ns NR:ns ## COG: yaiC_2 COG2199 # Protein_GI_number: 16128370 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 196 366 1 171 171 356 100.0 3e-98 MNDENFFKKAAAHGEEPPLTPQNEHQRSGLRFARRVRLPRAVGLAGMFLPIASTLVSHPP PGWWWLVLVGWAFVWPHLAWQIASRAVDPLSREIYNLKTDAVLAGMWVGVMGVNVLPSTA MLMIMCLNLMGAGGPRLFVAGLVLMVVSCLVTLELTGITVSFNSAPLEWWLSLPIIVIYP LLFGWVSYQTATKLAEHKRRLQVMSTRDGMTGVYNRRHWETMLRNEFDNCRRHNRDATLL IIDIDHFKSINDTWGHDVGDEAIVALTRQLQITLRGSDVIGRFGGDEFAVIMSGTPAESA ITAMLRVHEGLNTLRLPNTPQVTLRISVGVAPLNPQMSHYREWLKSADLALYKAKKAGRN RTEVAA >gi|296493391|gb|ADTK01000110.1| GENE 62 64548 - 64868 269 106 aa, chain - ## HITS:1 COG:no KEGG:APECO1_1624 NR:ns ## KEGG: APECO1_1624 # Name: psiF # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 106 7 112 112 161 100.0 5e-39 MKITLLVTLLFGLVFLTTVGAAERTLTPQQQRMTSCNQQATAQALKGDARKTYMSDCLKN SKSAPGEKSLTPQQQKMRECNNQATQQSLKGDDRNKFMSACLKKAA >gi|296493391|gb|ADTK01000110.1| GENE 63 64987 - 66402 1299 471 aa, chain - ## HITS:1 COG:phoA KEGG:ns NR:ns ## COG: phoA COG1785 # Protein_GI_number: 16128368 # Func_class: P Inorganic ion transport and metabolism # Function: Alkaline phosphatase # Organism: Escherichia coli K12 # 1 471 24 494 494 865 99.0 0 MKQSTIALALLPLLFTPVTKARTPEMPVLENRAAQGDITAPGGARRLTGDQTAALRDSLS DKPAKNIILLIGDGMGDSEITAARNYAEGAGGFFKGIDALPLTGQYTHYALNKKTGKPDY VTDSAASATAWSTGVKTYNGALGVDIHEKDHPTILEMAKAAGLATGNVSTAELQDATPAA LVAHVTSRKCYGPSATSEKCPGNALEKGGKGSITEQLLNARADVTLGGGAKTFAETATAG EWQGKTLREQAQARGYQLVSDAASLNAVTEANQQKPLLGLFADGNMPVRWQGPKATYHGN IDKPAVTCTPNPQRNDSVPTLAQMTDKAIELLSKNEKGFFLQVEGASIDKQDHAANPCGQ IGETVDLDEAVQRALEFAKKDGNTLVIVTADHAHASQIVAPDTKAPGLTQALNTKDGAVM 
VMSYGNSEEDSQEHTGSQLRIAAYGPHAANVVGLTDQTDLFYTMKAALGLK >gi|296493391|gb|ADTK01000110.1| GENE 64 66503 - 66763 288 86 aa, chain - ## HITS:1 COG:no KEGG:SSON_0357 NR:ns ## KEGG: SSON_0357 # Name: yaiB # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 86 1 86 86 132 100.0 5e-30 MKNLIAELLFKLAQKEEESKELCAQVEALEIIVTAMLRNMAQNDQQRLIDQVEGALYEVK PDASIPDDDTELLRDYVKKLLKHPRQ >gi|296493391|gb|ADTK01000110.1| GENE 65 67226 - 68320 1123 364 aa, chain + ## HITS:1 COG:ECs0431 KEGG:ns NR:ns ## COG: ECs0431 COG1181 # Protein_GI_number: 15829685 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Escherichia coli O157:H7 # 1 364 1 364 364 728 100.0 0 MEKLRVGIVFGGKSAEHEVSLQSAKNIVDAIDKSRFDVVLLGIDKQGQWHVSDASNYLLN ADDPAHIALRPSATSLAQVPGKHEHQLIDAQNGQPLPTVDVIFPIVHGTLGEDGSLQGML RVANLPFVGSDVLASAACMDKDVTKRLLRDAGLNIAPFITLTRANRHNISFAEVESKLGL PLFVKPANQGSSVGVSKVTSEEQYAIAVDLAFEFDHKVIVEQGIKGREIECAVLGNDNPQ ASTCGEIVLTSDFYAYDTKYIDEDGAKVVVPAAIAPEINDKIRAIAVQAYQTLGCAGMAR VDVFLTPENEVVINEINTLPGFTNISMYPKLWQASGLGYTDLITRLIELALERHAADNAL KTTM >gi|296493391|gb|ADTK01000110.1| GENE 66 68344 - 68556 358 70 aa, chain - ## HITS:1 COG:no KEGG:UTI89_C0397 NR:ns ## KEGG: UTI89_C0397 # Name: yaiZ # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 70 45 114 114 138 100.0 5e-32 MNLPVKIRRDWHYYAFAIGLIFILNGVVGLLGFEAKGWQTYAVGLVTWVISFWLAGLIIR RRDEETENAQ >gi|296493391|gb|ADTK01000110.1| GENE 67 68816 - 69124 294 102 aa, chain + ## HITS:1 COG:no KEGG:APECO1_1629 NR:ns ## KEGG: APECO1_1629 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 102 1 102 102 200 100.0 1e-50 MADFTLSKSLFSGKYRNASSTPGNIAYALFVLFCFWAGAQLLNLLVHAPGVYERLMQVQE TGRPRVEIGLGVGTIFGLIPFLVGCLIFAVVALWLHWRHRRQ >gi|296493391|gb|ADTK01000110.1| GENE 68 69183 - 70277 1040 364 aa, chain - ## HITS:1 COG:no KEGG:B21_00329 NR:ns ## KEGG: B21_00329 # Name: yaiW # Def: hypothetical protein # Organism: 
E.coli_BL21 # Pathway: not_defined # 1 364 1 364 364 680 98.0 0 MSRVNPLSSLSLLAVLVLAGCSSQAPQPLKKGEKAIDVGSVVRQKMPASVKDRDAWAKDL ATTFESQGLAPTLENVCSVLAVAQQESNYQADPAVPGLSKIAWQEIDRRAERMHIPPFLV HTALKIKSPNGKSYSERLDSVRTEKQLSAIFDDLISMVPMGQTLFGSLNPVRTGGPMQVS IAFAEQHTKGYPWKMDGTVRQEVFSRRGGLWFGTYHLLNYPASYSAPIYRFADFNAGWYA SRNAAFQNAVSKASGVKLALDGDLIRYDSKEPGKTELATRKLAAKLGMSDSEIRRQLEKG DSFSFEETALYKKVYQLAEAKTGKSLPREMLPGIQLESPKITRNLTTAWFAKRVDERRAR CMKQ >gi|296493391|gb|ADTK01000110.1| GENE 69 70290 - 71510 1272 406 aa, chain - ## HITS:1 COG:ECs0427 KEGG:ns NR:ns ## COG: ECs0427 COG1133 # Protein_GI_number: 15829681 # Func_class: I Lipid transport and metabolism # Function: ABC-type long-chain fatty acid transport system, fused permease and ATPase components # Organism: Escherichia coli O157:H7 # 1 406 1 406 406 757 100.0 0 MFKSFFPKPGTFFLSAFVWALIAVIFWQAGGGDWVARITGASGQIPISAARFWSLDFLIF YAYYIVCVGLFALFWFIYSPHRWQYWSILGTALIIFVTWFLVEVGVAVNAWYAPFYDLIQ TALSSPHKVTIEQFYREVGVFLGIALIAVVISVLNNFFVSHYVFRWRTAMNEYYMANWQQ LRHIEGAAQRVQEDTMRFASTLENMGVSFINAIMTLIAFLPVLVTLSAHVPELPIIGHIP YGLVIAAIVWSLMGTGLLAVVGIKLPGLEFKNQRVEAAYRKELVYGEDDATRATPPTVRE LFSAVRKNYFRLYFHYMYFNIARILYLQVDNVFGLFLLFPSIVAGTITLGLMTQITNVFG QVRGAFQYLINSWTTLVELMSIYKRLRSFEHELDGDKIQEVTHTLS >gi|296493391|gb|ADTK01000110.1| GENE 70 71530 - 71724 99 64 aa, chain - ## HITS:1 COG:no KEGG:ECIAI1_0372 NR:ns ## KEGG: ECIAI1_0372 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 64 1 64 64 118 98.0 7e-26 MPAFVILRVLIRLCKQSLVCETRRNNSYGQAGELSALLWVTMGGFIWLTLCSGHAVNTQQ LLKR >gi|296493391|gb|ADTK01000110.1| GENE 71 71862 - 73019 1032 385 aa, chain + ## HITS:1 COG:ECs0426 KEGG:ns NR:ns ## COG: ECs0426 COG1680 # Protein_GI_number: 15829680 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Escherichia coli O157:H7 # 1 385 1 385 385 757 99.0 0 MKRSLLFSAVLCAASLTSVHAAQPITEPEFASDIVDRYADHIFYGSGATGMALVVIDGNQ 
RVFRSYGETRPGNNVRPQLDSVVRIASLTKLMTSEMLVKLLDQGTVKLNDPLSKYAPPGA RVPTYNGTPITLVNLATHTSALPREQPGGAAHRPVFVWPTREQRWKYLSTAKLKAAPGSQ AAYSNLAFDLLADALANASGKPYTQLFEEQITRPLGMKDTTYTPSPDQCRRLMVAERGAS PCNNTLAAIGSGGVYSTPGDMMRWMQQYLSSDFYQRSNQADRMQTLIYQRAQFTKVIGMD VPGKADALGLGWVYMAPKEGRPGIIQKTGGGGGFITYMAMIPQKNIGAFVVVTRSPLTRF KNMSDGINDLVTELSGNKQLVIPAS >gi|296493391|gb|ADTK01000110.1| GENE 72 73020 - 73688 332 222 aa, chain - ## HITS:1 COG:no KEGG:SSON_0350 NR:ns ## KEGG: SSON_0350 # Name: yaiV # Def: putative DNA-binding transcriptional regulator # Organism: S.sonnei # Pathway: not_defined # 1 222 1 222 222 448 100.0 1e-125 MPVKDLTGITAKDAQMLSVVKPLQEFGKLDKCLSRYGTRFEFNNEKQVIFSSDVNNEDTF VILEGVISLRREENVLIGITQAPYIMGLADGLMKNDIPYKLISEGNCTGYHLPAKQTITL IEQNQLWRDAFYWLAWQNRILELRDVQLIGHNSYEQIRATLLSMIDWNEELRSRIGVMNY IHQRTRISRSVVAEVLAALRKGGYIEMNKGKLVAINRLPSEY >gi|296493391|gb|ADTK01000110.1| GENE 73 73731 - 76682 2627 983 aa, chain - ## HITS:1 COG:yaiU KEGG:ns NR:ns ## COG: yaiU COG3468 # Protein_GI_number: 16128359 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 517 983 1 467 467 719 97.0 0 MHSWKKKLVVSQLALACTLAITSQANAANYDTWGYHDNTTTDLVWPYDTDNDGIADAGGM DPMGSDGNYVTYNGFVYYNNANGDFDTVFKGDTVNGTISTYYLNHDYASGSVNQVDISNS VIHGSITSELPFGYYSITPEGYNGYTFNNGTTTDLDYNWYDGDVFTLNVSNSTIDDDYEA LYFTDTHLDGDKSKSTNETFYTALGVAVNLDVESNINITNNSRVAGIALSEGNTSNTTYT SEYHQWDNNINVSNSTVTSGSQTPLEESDGFFGKSAEPSDYAGNGGQNDVALSFTDNAGS NYSMKNNVNFDHSTLLGDVEFTSHWNNDGVFFYSTGHDSNGDGVLDTNGGWVDDAQNVDE LNITLNNGSKWVGSANMSAEVIAPADMYDVAPNSLTPGATIEANDWGRIIDNKVFQSGVF NVALNNSSEWNTVNSSVIDTLAVNNGSQVNVTDSSLVSDTIGLTNGSSLNIGANGVVATD HLTVDSYSTVNLTESTGWNNYSNLYTNTITVTNGGVLDVNVDQFDTEAFRTDKLELTSGN IADHNGNVISGVFDINSSDYVLNADLVNDRTWDTSKSNYGYGIVAMNSDGHLTINGNGDV DNGTELDNSSVDNVVAATGNYKVRIDNATGAGTIADYKDKELIRVNDANTNATFSAANKA DLGAYTYQAEQRGNTVVLQQMELTDYANMALSIPSANTNIWNLEQDTVGTRLTNSRHGLA 
DNGGAWGSYFGGNFNGDNGTINYDQDVNGIMVGVDTKVDGNNAKWIVGAAAGFAKGDMND RSGQVDQDSQTAYIYSSAHFANNVFVDGSLSYSHFNNDLSATMSNGTYVDGSTNSDAWGF GLKAGYDFKLGDAGYVTPYGSVSGLFQSGDDYQLSNDMKVDGQSYDSMRYELGVDAGYTF TYSEDQALTPYFKLAYVYDDSNNDNDVNGDSIDNGTEGSAVRVGLGTQFSFTKNFSAYTD ANYLGGGDVDQDWSANVGVKYTW >gi|296493391|gb|ADTK01000110.1| GENE 74 76640 - 76831 98 63 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLIVILQVSFSRSAFVIPPQGRNKLIHQLSMHRKHIMFRANDYAALQTNGSIILSPKSEC GHI >gi|296493391|gb|ADTK01000110.1| GENE 75 77206 - 78180 1028 324 aa, chain + ## HITS:1 COG:ECs0423 KEGG:ns NR:ns ## COG: ECs0423 COG0113 # Protein_GI_number: 15829677 # Func_class: H Coenzyme transport and metabolism # Function: Delta-aminolevulinic acid dehydratase # Organism: Escherichia coli O157:H7 # 1 324 12 335 335 620 100.0 1e-178 MTDLIQRPRRLRKSPALRAMFEETTLSLNDLVLPIFVEEEIDDYKAVEAMPGVMRIPEKH LAREIERIANAGIRSVMTFGISHHTDETGSDAWREDGLVARMSRICKQTVPEMIVMSDTC FCEYTSHGHCGVLCEHGVDNDATLENLGKQAVVAAAAGADFIAPSAAMDGQVQAIRQALD AAGFKDTAIMSYSTKFASSFYGPFREAAGSALKGDRKSYQMNPMNRREAIRESLLDEAQG ADCLMVKPAGAYLDIVRELRERTELPIGAYQVSGEYAMIKFAALAGAIDEEKVVLESLGS IKRAGADLIFSYFALDLAEKKILR >gi|296493391|gb|ADTK01000110.1| GENE 76 78286 - 79137 882 283 aa, chain - ## HITS:1 COG:ECs0422 KEGG:ns NR:ns ## COG: ECs0422 COG2175 # Protein_GI_number: 15829676 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Probable taurine catabolism dioxygenase # Organism: Escherichia coli O157:H7 # 1 283 1 283 283 566 100.0 1e-161 MSERLSITPLGPYIGAQISGADLTRPLSDNQFEQLYHAVLRHQVVFLRDQAITPQQQRAL AQRFGELHIHPVYPHAEGVDEIIVLDTHNDNPPDNDNWHTDVTFIETPPAGAILAAKELP STGGDTLWTSGIAAYEALSVPFRQLLSGLRAEHDFRKSFPEYKYRKTEEEHQRWREAVAK NPPLLHPVVRTHPVSGKQALFVNEGFTTRIVDVSEKESEALLGFLFAHITKPEFQVRWRW QPNDIAIWDNRVTQHYANADYLPQRRIMHRATILGDKPFYRAG >gi|296493391|gb|ADTK01000110.1| GENE 77 79134 - 79961 1004 275 aa, chain - ## HITS:1 COG:ECs0421 KEGG:ns NR:ns ## COG: ECs0421 COG0600 # Protein_GI_number: 15829675 # Func_class: P Inorganic ion transport and metabolism # Function: 
ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Escherichia coli O157:H7 # 1 275 1 275 275 416 98.0 1e-116 MSVLINEKLHSHRLKWRWPLSRQVTLSIGTLAVLLTVWWAVAALQLISPLFLPPPQQVLA KLLTIAGPQGFMDATLWQHLAASLTRIVLALLAAVLIGIPVGIAMGLSPTVRGILDPVIE LYRPVPPLAYLPLMVIWFGIGETSKILLIYLAIFAPVAMSALAGVKSAQQVRIRAAQSLG ASRAQVLWFVILPGALPEILTGLRIGLGVGWSTLVAAELIAATRGLGFMVQSAGEFLATD VVLAGIAVIAIIAFLLELGLRALQRRLTPWHGEVQ >gi|296493391|gb|ADTK01000110.1| GENE 78 79958 - 80725 267 255 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 205 1 215 245 107 31 2e-22 MLQISHLYADYGGKPALEDINLTLESGELLVVLGPSGCGKTTLLNLIAGFVPYQHGSILL AGKRIEGPGAERGVVFQNEGLLPWRNVQDNVAFGLQLAGIEKMQRLEIAHQMLKKVGLEG AEKRYIWQLSGGQRQRVGIARALAANPQLLLLDEPFGALDAFTRDQMQTLLLKLWQETGK QVLLITHDIEEAVFMATELVLLSSGPGRVLERLSLNFARRFVAGESSRSIKSDPQFIAMR EYVLSRVFEQREAFS >gi|296493391|gb|ADTK01000110.1| GENE 79 80738 - 81700 1134 320 aa, chain - ## HITS:1 COG:tauA KEGG:ns NR:ns ## COG: tauA COG4521 # Protein_GI_number: 16128350 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type taurine transport system, periplasmic component # Organism: Escherichia coli K12 # 1 320 20 339 339 585 99.0 1e-167 MAISSRNTLLAALAFIAFQAQAVNVTVAYQTSAEPAKVAQADNTFAKESGASVDWRKFDS GASIVRALASGDVQIGNLGSSPLAVAASQQVPIEVFLLASKLGNSEALVVKKTISKPEDL IGKRIAVPFISTTHYSLLAALKHWGIKPGQVEIVNLQPPAIIAAWQRGDIDGAYVWAPAV NALEKDGKVLTDSEQVGQWGAPTLDVWVVRKDFAEKHPEVVKAFAKSAIDAQQPYIANPD VWLKQPENISKLARLSGVPEGDVPGLVKGNTYLTPQQQTAELTGPVNKAIIDTAQFLKEQ GKVPAVANDYSQYVTSRFVQ >gi|296493391|gb|ADTK01000110.1| GENE 80 81654 - 81878 120 74 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_0388 NR:ns ## KEGG: EcE24377A_0388 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 74 1 74 74 132 98.0 4e-30 MNASAARSVLRDEIAMIVCSPVLLWEQYSGIKTFIKRISRYRTDDFILSKKTAKERLNMQ LLRSNNWENYYTSF >gi|296493391|gb|ADTK01000110.1| GENE 81 82316 - 82987 416 223 aa, chain 
+ ## HITS:1 COG:yaiS KEGG:ns NR:ns ## COG: yaiS COG2120 # Protein_GI_number: 16128349 # Func_class: S Function unknown # Function: Uncharacterized proteins, LmbE homologs # Organism: Escherichia coli K12 # 50 185 1 136 136 274 99.0 8e-74 MDKVLDSALLSSANKRKGILAIGAHPDDIELGCGASLARLAQKGIYIAAVVMTTGNSGTD GIIDRHEESRNALKILGCHQTIHLNFADTRAHLQLNDMISALEDIIKNQIPSDVEIMRVY TMHDADRHQDHLAVYQASMVACRTIPQILGYETPSTWLSFMPQVFESVKEEYFTVKLAAL KKHKSQERRDYMRHDRLRAVAQFRGQQVNSDLGEGFVIHKMIL >gi|296493391|gb|ADTK01000110.1| GENE 82 82997 - 84193 646 398 aa, chain + ## HITS:1 COG:yaiP KEGG:ns NR:ns ## COG: yaiP COG1215 # Protein_GI_number: 16128348 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases, probably involved in cell wall biogenesis # Organism: Escherichia coli K12 # 1 398 1 398 398 798 99.0 0 MKTWIFICMSIAMLLWFLSTLRRKPSQKKGCIDAIIPAYNEGPCLAQSLDNLLRNPYFCR VICVNDGSTDNTEAVMAEVKRKWGDRFVAVTQKNTGKGGALMNGLNYATCDQVFLSDADT YVPPDQDGMGYMLAEIERGADAVGGIPSTALKGAGLLPHIRATVKLPMIVMKRTLQQLLG GAPFIISGACGMFRTDVLRKFGFSDRTKVEDLDLTWTLVANGYRIRQANRCIVYSQECNS PREEWRRWRRWIVGYAVCMRLHKRLLFSRFGIFSIFPMLLVVIYGVGIYLTTWFNEFITT GPHGVVLAMFPLIWIGVVCVIGAFSAWFHRCWLLVPLAPLSVVYVLLAYAIWIIYGLIAF FTGREPQRDKPTRYSALVEASTAYSQPSVTGTEKLSEA >gi|296493391|gb|ADTK01000110.1| GENE 83 84042 - 84752 355 236 aa, chain + ## HITS:1 COG:b0359 KEGG:ns NR:ns ## COG: b0359 COG0110 # Protein_GI_number: 16128344 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Escherichia coli K12 # 100 236 11 147 147 267 99.0 1e-71 MPSGLFMDLLPFLLDANLSATNPPAIPHWWKRQPLIPNLLSQELKNYLKLNVKEKNIQIA DQVIIDETAGEVVIGANTRICHGAVIQGPVVIGANCLIGNYAFIRPGTIISNGVKIGFAT EIKNAVIEAEATIGPQCFIADSVVANQAYLGAQVRTSNHRLDEQPVSVRTPEGIIATGCD KLGCYIGQRSRLGVQVIILPGRIISPNTQLGPRVIVERNLPAGTYSLRQELIRTGD >gi|296493391|gb|ADTK01000110.1| GENE 84 84754 - 85527 826 257 aa, chain + ## HITS:1 COG:no KEGG:SSON_0337 NR:ns ## KEGG: SSON_0337 # Name: yaiO # Def: hypothetical protein # Organism: S.sonnei # 
Pathway: not_defined # 1 257 1 257 257 483 100.0 1e-135 MIKRTLLAAAIFSALPAYAGLTSITAGYDFTDYSGDHGNRNLAYAELVAKVENATLLFNL SQGRRDYETEHFNATRGQGAVWYKWNNWLTTRTGIAFADNTPVFARQDFRQDINLALLPK TLFTTGYRYTKYYDDVEVDAWQGGVSLYTGPVITSYRYTHYDSSDAGGSYSNMISVRLND PRGTGYTQLWLSRGTGAYTYDWTPETRYGSMKSVSLQRIQPLTEQLNLGLTAGKVWYDTP TDDYNGLQLAAHLTWKF >gi|296493391|gb|ADTK01000110.1| GENE 85 85712 - 85987 275 91 aa, chain + ## HITS:1 COG:ECs0412 KEGG:ns NR:ns ## COG: ECs0412 COG1937 # Protein_GI_number: 15829666 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 91 8 98 98 154 100.0 4e-38 MPSTPEEKKKVLTRVRRIRGQIDALERSLEGDAECRAILQQIAAVRGAANGLMAEVLESH IRETFDRNDCYSREVSQSVDDTIELVRAYLK >gi|296493391|gb|ADTK01000110.1| GENE 86 86022 - 87131 1271 369 aa, chain + ## HITS:1 COG:ECs0411 KEGG:ns NR:ns ## COG: ECs0411 COG1062 # Protein_GI_number: 15829665 # Func_class: C Energy production and conversion # Function: Zn-dependent alcohol dehydrogenases, class III # Organism: Escherichia coli O157:H7 # 1 369 1 369 369 727 100.0 0 MKSRAAVAFAPGKPLEIVEIDVAPPKKGEVLIKVTHTGVCHTDAFTLSGDDPEGVFPVVL GHEGAGVVVEVGEGVTSVKPGDHVIPLYTAECGECEFCRSGKTNLCVAVRETQGKGLMPD GTTRFSYNGQPLYHYMGCSTFSEYTVVAEVSLAKINPEANHEHVCLLGCGVTTGIGAVHN TAKVQPGDSVAVFGLGAIGLAVVQGARQAKAGRIIAIDTNPKKFDLARRFGATDCINPND YDKPIKDVLLDINKWGIDHTFECIGNVNVMRAALESAHRGWGQSVIIGVAGSGQEISTRP FQLVTGRVWKGSAFGGVKGRSQLPGMVEDAMKGDIDLEPFVTHTMSLDEINDAFDLMHEG KSIRTVIRY >gi|296493391|gb|ADTK01000110.1| GENE 87 87224 - 88057 658 277 aa, chain + ## HITS:1 COG:yaiM KEGG:ns NR:ns ## COG: yaiM COG0627 # Protein_GI_number: 16128340 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Escherichia coli K12 # 1 277 1 277 277 561 97.0 1e-160 MELIEKHASFGGWQNVYRHYSQSLKCEMNVGVYLPPKAANEKLPVLYWLSGLTCNEQNFI TKSGMQRYAAEHNIIVVAPDTSPRGSHVADADRYDLGQGAGFYLNATQAPWNEHYKMYDY IRNELPDLVMQHFPATTRKSISGHSMGGLGALVLALRNPDEYVSVSAFSPIVSPSQVPWG 
QQAFAAYLGENKDAWLDYDPVSLISQGQRVAEIMVDQGLSDDFYAEQLRTPNLEKICQEM NIKTLIRYHEGYDHSYYFVSSFIGEHIAYHANKLNMR >gi|296493391|gb|ADTK01000110.1| GENE 88 88283 - 88822 624 179 aa, chain - ## HITS:1 COG:yaiL KEGG:ns NR:ns ## COG: yaiL COG3122 # Protein_GI_number: 16128339 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 179 40 218 218 295 100.0 4e-80 MAKLTLQEQLLKAGLVTSKKAAKVERTAKKSRVQAREARAAVEENKKAQLERDKQLSEQQ KQAALAKEYKAQVKQLIEMNRITIANGDIGFNFTDGNLIKKIFVDKLTQAQLINGRLAIA RLLVDNNSEGEYAIIPASVADKIAQRDASSIVLHSALSAEEQDEDDPYADFKVPDDLMW >gi|296493391|gb|ADTK01000110.1| GENE 89 88924 - 90135 1086 403 aa, chain - ## HITS:1 COG:mhpT KEGG:ns NR:ns ## COG: mhpT COG0477 # Protein_GI_number: 16128338 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 13 403 28 418 418 571 99.0 1e-163 MSTRTPSSSSSRLMLTIGLCFLVALMEGLDLQAAGIAAGGIAQAFALDKMQMGWIFSAGI LGLLPGALVGGMLADRYGRKRILIGSVALFGLFSLATAIAWDFPSLVFARLMTGVGLGAA LPNLIALTSEAAGPRFRGTAVSLMYCGVPIGAALAATLGFAGANLAWQTVFWVGGVVPLI LVPLLMRWLPESAVFAGEKQAAPPLRALFAPETATATLLLWLCYFFTLLVVYMLINWLPL LLVEQGFQPSQAAGVMFALQMGAASGTLMLGALMDKLRPVTMSLLIYSGMLASLLALGTV SSFNGMLLAGFVAGLFATGGQSVLYALAPLFYSSQIRATGVGTAVAVGRLGAMSGPLLAG KMLALGTGTVGVMAASAPGILVAGLAVFILMSRRSRMQPCADA >gi|296493391|gb|ADTK01000110.1| GENE 90 90311 - 91324 1241 337 aa, chain - ## HITS:1 COG:mhpE KEGG:ns NR:ns ## COG: mhpE COG0119 # Protein_GI_number: 16128337 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Escherichia coli K12 # 1 337 1 337 337 631 100.0 0 MNGKKLYISDVTLRDGMHAIRHQYSLENVRQIAKALDDARVDSIEVAHGDGLQGSSFNYG FGAHSDLEWIEAAADVVKHAKIATLLLPGIGTIHDLKNAWQAGARVVRVATHCTEADVSA QHIQYARELGMDTVGFLMMSHMTTPENLAKQAKLMEGYGATCIYVVDSGGAMNMSDIRDR 
FRALKAELKPETQTGMHAHHNLSLGVANSIAAVEEGCDRIDASLAGMGAGAGNAPLEVFI AAADKLGWQHGTDLYALMDAADDLVRPLQDRPVRVDRETLALGYAGVYSSFLRHCETAAA RYGLSAVDILVELGKRRMVGGQEDMIVDVALDLRNNK >gi|296493391|gb|ADTK01000110.1| GENE 91 91321 - 92271 985 316 aa, chain - ## HITS:1 COG:mhpF KEGG:ns NR:ns ## COG: mhpF COG4569 # Protein_GI_number: 16128336 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acetaldehyde dehydrogenase (acetylating) # Organism: Escherichia coli K12 # 1 316 1 316 316 578 100.0 1e-165 MSKRKVAIIGSGNIGTDLMIKILRHGQHLEMAVMVGIDPQSDGLARARRMGVATTHEGVI GLMNMPEFADIDIVFDATSAGAHVKNDAALREAKPDIRLIDLTPAAIGPYCVPVVNLEAN VDQLNVNMVTCGGQATIPMVAAVSRVARVHYAEIIASIASKSAGPGTRANIDEFTETTSR AIEVVGGAAKGKAIIVLNPAEPPLMMRDTVYVLSDEASQDDIEASINEMAEAVQAYVPGY RLKQRVQFEVIPQDKPVNLPGVGQFSGLKTAVWLEVEGAAHYLPAYAGNLDIMTSSALAT AEKMAQSLARKAGEAA >gi|296493391|gb|ADTK01000110.1| GENE 92 92268 - 93077 862 269 aa, chain - ## HITS:1 COG:mhpD KEGG:ns NR:ns ## COG: mhpD COG3971 # Protein_GI_number: 16128335 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: 2-keto-4-pentenoate hydratase # Organism: Escherichia coli K12 # 1 269 3 271 271 533 98.0 1e-151 MTKHTLEQLAADLRRAAEQGEAIAPLRDLIGIDNAEAAYAIQHINVQYDVVQGRRVVGRK VGLTHPKVQQQLGVDQPDFGTLFADMCYGDNEIIPFSRVLQPRIEAEIALVLNRDLPATD ITFDELYNAIEWVLPALEVVGSRIRDWSIQFVDTVADNASCGVYVIGGPAQRPAGLDLKN CAMKMTRNNEEVSSGRGSECLGHPLNAAVWLARKMASLGEPLRAGDIILTGALGPMVAVN AGDRFEAHIEGIGSVAATFSSAAPKGSLS >gi|296493391|gb|ADTK01000110.1| GENE 93 93087 - 93953 903 288 aa, chain - ## HITS:1 COG:mhpC KEGG:ns NR:ns ## COG: mhpC COG0596 # Protein_GI_number: 16128334 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Escherichia coli K12 # 1 288 22 309 309 602 99.0 1e-172 MSYQPQTEAATSRFLNVEEAGKTLRIHFNDCGQGDETVVLLHGSGPGATGWANFSRNIDP LVEAGYRVILLDCPGWGKSDSIVNSGSRSDLNARILKSVVDQLDIAKIHLLGNSMGGHSS 
VAFTLKWPERVGKLVLMGGGTGGMSLFTPMPTEGIKRLNQLYRQPTIENLKLMMDIFVFD TSDLTDALFEARLNNMLSRRDHLENFVKSLEANPKQFPDFGPRLAEIKAQTLIVWGRNDR FVPMDAGLRLLSGIAGSELHIFRDCGHWAQWEHADAFNQLVLNFLARP >gi|296493391|gb|ADTK01000110.1| GENE 94 93971 - 94915 775 314 aa, chain - ## HITS:1 COG:no KEGG:ECO111_0384 NR:ns ## KEGG: ECO111_0384 # Name: mhpB # Def: 3-(2,3-dihydroxyphenyl)propionate dioxygenase # Organism: E.coli_O111_H- # Pathway: Phenylalanine metabolism [PATH:eoi00360] # 1 314 1 314 314 642 100.0 0 MHAYLHCLSHSPLVGYVDPAQEVLDEVNGVIASARERIAAFSPELVVLFAPDHYNGFFYD VMPPFCLGVGATAIGDFGSAAGELPVPVELAEACAHAVMKSGIDLAVSYCMQVDHGFAQP LEFLLGGLDKVPVLPVFINGVATPLPGFQRTRMLGEAIGRFTSTLNKRVLFLGSGGLSHQ PPVPELAKADAHMRDRLLGSGKDLPASERELRQQRVISAAEKFVEDQRTLHPLNPIWDNQ FMTLLEQGRIQELDAVSNEELSAIAGKSTHEIKTWVAAFAAISTFGNWRSEGRYYRPIPE WIAGFGSLSARTEN >gi|296493391|gb|ADTK01000110.1| GENE 95 94917 - 96581 1801 554 aa, chain - ## HITS:1 COG:mhpA KEGG:ns NR:ns ## COG: mhpA COG0654 # Protein_GI_number: 16128332 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases # Organism: Escherichia coli K12 # 1 554 1 554 554 1131 98.0 0 MAIQHPDIQPAVNHSVQVAIAGAGPVGLMMANYLGQMGIDVLVVEKLDKLIDYPRAIGID DEALRTMQSVGLVENVLPHTTPWHAMRFLTPKGRCFADIQPMTDEFGWPRRNAFIQPQVD AVMLEGVSRFPNVRCLFSRELEAFSQQNDEVTLHLKTAEGQREIVKAQWLVACDGGASFV RRTLNVPFEGKTAPNQWIVVDIANDPLSTPHIYLCCDPVRPYVSAALPHAVRRFEFMVMP GETEEQLREPQNMRKLLSKVLPNPDNVELIRQRVYTHNARLAQRFRIDRVLLAGDAAHIM PVWQGQGYNSGMRDAFNLAWKLALVIQGKARDALLDTYQQERRDHAKAMIDLSVTTGNVL APPKRWQGTLRDGVSWLLNYLPPVKRYFLEMRFKPMPQYYGGALVREGEAKHSPVGKMFI QPKVTLENGDVTLLDNAIGANFAVIGWGCNPLWGMSDEQIQQWRALGTRFIQVVPEVQIH TAQDNHDGVLRVGDTQGRLRSWFAQHNASLVVMRPDRFVAATAIPQTLGNTLNKLASVMT LTRPDADVSVEKVA >gi|296493391|gb|ADTK01000110.1| GENE 96 96772 - 97605 799 277 aa, chain + ## HITS:1 COG:ECs0401 KEGG:ns NR:ns ## COG: ECs0401 COG1414 # Protein_GI_number: 15829655 # Func_class: K Transcription # Function: Transcriptional 
regulator # Organism: Escherichia coli O157:H7 # 1 277 39 315 315 501 99.0 1e-142 MQNNEQTEYKTVRGLTRGLMLLNMLNKLDGGASVGLLAELSGLHRTTVRRLLETLQEEGY VRRSPSDDSFRLTIKVRQLSEGFRDEQWISALAAPLLGDLLREVVWPTDVSTLDVDAMVV RETTHRFSRLSFHRAMVGRRLPLLKTASGLTWLAFCPEQERKELIEMLAARPGDDYQLAR EPLKLQAILARARKEGYGQNYRGWDQEEKIASIAVPLRSEQRVIGCLNLVYMASAMTIEQ AAEKHLPALQRVAKQIEEGVESQAILVAGRRSGVHLR >gi|296493391|gb|ADTK01000110.1| GENE 97 97673 - 98764 989 363 aa, chain + ## HITS:1 COG:lacI KEGG:ns NR:ns ## COG: lacI COG1609 # Protein_GI_number: 16128330 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 4 363 1 360 360 645 99.0 0 MVNVKPVTLYDVAEYAGVSYQTVSRVVNQASHVSAKTREKVEAAMAELNYIPNRVAQQLA GKQSLLIGVATSSLALHAPSQIVAAIKSRADQLGASVVVSMVERSGVEACKAAVHNLLAQ RVSGLIINYPLDDQDAIAVEAACTNVPALFLDVSDQTPINSIIFSHEDGTRLGVEHLVAL GHQQIALLAGPLSSVSARLRLAGWHKYLTRNQIQPIAEREGDWSAMSGFQQTMQMLNEGI VPTAMLVANDQMALGAMRAITESGLRVGADISVVGYDDTEDSSCYIPPLTTIKQDFRLLG QTSVDRLLQLSQGQAVKGNQLLPVSLVKRKTTLAPNTQTASPRALADSLMQLARQVSRLE SGQ >gi|296493391|gb|ADTK01000110.1| GENE 98 98887 - 101961 2357 1024 aa, chain + ## HITS:1 COG:lacZ KEGG:ns NR:ns ## COG: lacZ COG3250 # Protein_GI_number: 16128329 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Escherichia coli K12 # 1 1024 1 1024 1024 2083 99.0 0 MTMITDSLAVVLQRRDWENPGVTQLNRLAAHPPFASWRNSEEARTDRPSQQLRSLNGEWR FAWFPAPEAVPESWLECDLPEADTVVVPSNWQMHGYDAPIYTNVTYPITVNPPFVPTENP TGCYSLTFNVDESWLQEGQTRIIFDGVNSAFHLWCNGRWVGYGQDSRLPSEFDLSAFLRA GENRLAVMVLRWSDGSYLEDQDMWRMSGIFRDVSLLHKPTTQISDFHVATRFNDDFSRAV LEAEVQMCGELRDYLRVTVSLWQGETQVASGTAPFGGEIIDERGSYADRVTLRLNVENPK LWSAEIPNLYRAVVELHTADGTLIEAEACDVGFREVRIENGLLLLNGKPLLIRGVNRHEH HPLHGQVMDEQTMVQDILLMKQNNFNAVRCSHYPNHPLWYTLCDRYGLYVVDEANIETHG MVPMNRLTDDPRWLPAMSERVTRMVQRDRNHPSVIIWSLGNESGHGANHDALYRWIKSVD PSRPVQYEGGGADTTATDIICPMYARVDEDQPFPAVPKWSIKKWLSLPGETRPLILCEYA HAMGNSLGGFAKYWQAFRQYPRLQGGFVWDWVDQSLIKYDENGNPWSAYGGDFGDTPNDR 
QFCMNGLVFADRTPHPALTEAKHQQQFFQFRLSGQTIEVTSEYLFRHSDNELLHWMVALD GKPLASGEVPLDVAPQGKQLIELPELPQPESAGQLWLTVRVVQPNATAWSEAGHISAWQQ WRLAENLSVTLPSASHIIPQLTTSETDFCIELGNKRWQFNRQSGLLSQMWIGDEKQLLTP LRDQFTRAPLDNDIGVSEATRIDPNAWVERWKAAGHYQAEAALLQCSADTLADAVLITTA HAWQHQGKTLFISRKTYRIDGSGQMAITVDVEVASDTPHPARIGLTCQLAQVAERVNWLG LGPQENYPDRLTAACFDRWDLPLSDMYTPYVFPSENGLRCGTRELNYGPHQWRGDFQFNI SRYSQQQLMETSHRHLLHAEEGTWLNIDGFHMGIGGDDSWSPSVSAEFQLSAGRYHYQLV WCQK >gi|296493391|gb|ADTK01000110.1| GENE 99 102013 - 103266 1204 417 aa, chain + ## HITS:1 COG:lacY KEGG:ns NR:ns ## COG: lacY COG0477 # Protein_GI_number: 16128328 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 417 1 417 417 687 100.0 0 MYYLKNTNFWMFGLFFFFYFFIMGAYFPFFPIWLHDINHISKSDTGIIFAAISLFSLLFQ PLFGLLSDKLGLRKYLLWIITGMLVMFAPFFIFIFGPLLQYNILVGSIVGGIYLGFCFNA GAPAVEAFIEKVSRRSNFEFGRARMFGCVGWALCASIVGIMFTINNQFVFWLGSGCALIL AVLLFFAKTDAPSSATVANAVGANHSAFSLKLALELFRQPKLWFLSLYVIGVSCTYDVFD QQFANFFTSFFATGEQGTRVFGYVTTMGELLNASIMFFAPLIINRIGGKNALLLAGTIMS VRIIGSSFATSALEVVILKTLHMFEVPFLLVGCFKYITSQFEVRFSATIYLVCFCFFKQL AMIFMSVLAGNMYESIGFQGAYLVLGLVALGFTLISVFTLSGPGPLSLLRRQVNEVA >gi|296493391|gb|ADTK01000110.1| GENE 100 103332 - 103943 302 203 aa, chain + ## HITS:1 COG:ECs0395 KEGG:ns NR:ns ## COG: ECs0395 COG0110 # Protein_GI_number: 15829649 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Escherichia coli O157:H7 # 1 203 1 203 203 405 98.0 1e-113 MNMSMTERIKAGKLFTDMCEGLPEKRLRGKTLMYEFNHSHPSEVEKRESLLKEMFATVGE NAWVEPPVYFSYGSNIHIGRNFYANFNLTIVDDYTVTIGDNVLIAPNVTLSVTGHPVHHE LRKNGEMYSFPITIGNNVWIGSHVVINPGVTIGDNSVIGAGSIVTKDIPPNVVAAGVPCR VIREINDRDKQYYFKDYKVESSV >gi|296493391|gb|ADTK01000110.1| GENE 101 104046 - 105200 1080 384 aa, chain - ## HITS:1 COG:cynX KEGG:ns NR:ns ## COG: cynX COG2807 
# Protein_GI_number: 16128326 # Func_class: P Inorganic ion transport and metabolism # Function: Cyanate permease # Organism: Escherichia coli K12 # 1 384 1 384 384 585 98.0 1e-167 MLLVLVLIGLNMRPLLTSVGPLLPQLRQASGMSFSVAALLTALPVVTMGGLALAGSWLHQ HVSERRSVAISLLLIAVGALMRELYPQSVLLLSSALLGGVGIGIIQAVMPSVIKRRFQQR TPQVMGLWSAALMGGGGLGAAITPWLVQHSESWYQTLAWWALPAVVALFAWWWQSAREVA SSHKTTTTPVRVVFTPRAWTLGVYFGLINGGYASLIAWLPAFYIEIGASAQYSGSLLALM TLGQAAGALLMPAMARHQDRRKLLMLALVLQLVGFCGFIWLPLQLPVLWAMVCGLGLGGA FPLCLLLALDHSAQPAIAGKLVAFMQGIGFIIAGLAPWFSGVLRSISGNYLMDWAFHALC VVGLMIITLRFAPARFPQLWVKEA >gi|296493391|gb|ADTK01000110.1| GENE 102 105233 - 105703 750 156 aa, chain - ## HITS:1 COG:cynS KEGG:ns NR:ns ## COG: cynS COG1513 # Protein_GI_number: 16128325 # Func_class: P Inorganic ion transport and metabolism # Function: Cyanate lyase # Organism: Escherichia coli K12 # 1 156 1 156 156 290 99.0 9e-79 MIQSQINRNIRLDLADAILLSKAKKDLSFAEIADGTGLAEAFVTAALLGQQALPADAARL VGAKLDLDEDAILLLQMIPLRGCIDDRIPTDPTMYRFYEMLQVYGTTLKALVHEKFGDGI ISAINFKLDVKKVADPEGGERAVITLDGKYLPTKPF >gi|296493391|gb|ADTK01000110.1| GENE 103 105734 - 106393 489 219 aa, chain - ## HITS:1 COG:ZcynT KEGG:ns NR:ns ## COG: ZcynT COG0288 # Protein_GI_number: 15800068 # Func_class: P Inorganic ion transport and metabolism # Function: Carbonic anhydrase # Organism: Escherichia coli O157:H7 EDL933 # 1 219 1 219 219 429 100.0 1e-120 MKEIIDGFLKFQREAFPKREALFKQLATQQSPRTLFISCSDSRLVPELVTQREPGDLFVI RNAGNIVPSYGPEPGGVSASVEYAVAALRVSDIVICGHSNCGAMTAIASCQCMDHMPAVS HWLRYADSARVVNEARPHSDLPSKAAAMVRENVIAQLANLQTHPSVRLALEEGRIALHGW VYDIESGSIAAFDGATRQFVPLAANPRVCAIPLRQPTAA >gi|296493391|gb|ADTK01000110.1| GENE 104 106503 - 107402 491 299 aa, chain + ## HITS:1 COG:ECs0391 KEGG:ns NR:ns ## COG: ECs0391 COG0583 # Protein_GI_number: 15829645 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 299 1 299 299 544 98.0 1e-155 MLSRHINYFLAVAEHGSFTRAASALHVSQPALSQQIRQLEESLGVPLFDRSGRTIRLTDA 
GEVWRQYASRALQELGAGKRAIHDVADLTRGSLRIAVTPTFTSYFIGPLMADFYARYPGI TLQLQEMSQEKIEDLLCRDELDVGIAFAPVHSPELEAIPLLTESLALVVAKHHPLAACEQ VALSRLHDEKLVLLSAEFATREQIDHYCEKAGLHPQVVIEANSISAVLELIRRTSLSTLL PAAIATQHDGLKAISLAPPLLERTAVLLRRKNSWQTAAAKAFLHMALEECADVGENESR Prediction of potential genes in microbial genomes Time: Mon May 16 15:22:13 2011 Seq name: gi|296493390|gb|ADTK01000111.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont289.6, whole genome shotgun sequence Length of sequence - 39369 bp Number of predicted genes - 33, with homology - 33 Number of transcription units - 18, operones - 8 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 25 - 62 1.1 1 1 Op 1 6/0.000 - CDS 87 - 1370 1376 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases 2 1 Op 2 2/0.750 - CDS 1360 - 2619 1425 ## COG1457 Purine-cytosine permease and related proteins - Prom 2657 - 2716 5.6 - Term 2872 - 2900 -1.0 3 2 Op 1 4/0.250 - CDS 2957 - 4843 2023 ## COG0365 Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases 4 2 Op 2 8/0.000 - CDS 4883 - 6334 1626 ## COG2079 Uncharacterized protein involved in propionate catabolism 5 2 Op 3 9/0.000 - CDS 6368 - 7537 1256 ## COG0372 Citrate synthase 6 2 Op 4 . - CDS 7582 - 8472 1008 ## COG2513 PEP phosphonomutase and related enzymes - Prom 8556 - 8615 4.0 + Prom 8528 - 8587 8.0 7 3 Tu 1 . + CDS 8711 - 10297 1481 ## COG1221 Transcriptional regulators containing an AAA-type ATPase domain and a DNA-binding domain + Term 10300 - 10345 2.7 8 4 Tu 1 . - CDS 10395 - 10670 370 ## ECUMN_0373 hypothetical protein - Prom 10792 - 10851 4.5 + Prom 10512 - 10571 6.3 9 5 Tu 1 . + CDS 10820 - 11488 515 ## COG1280 Putative threonine efflux protein + Term 11615 - 11648 -0.4 - TRNA 11740 - 11820 28.4 # Pseudo ??? 0 0 10 6 Tu 1 . - CDS 12162 - 12977 350 ## EC55989_0333 hypothetical protein - Prom 13004 - 13063 5.5 11 7 Tu 1 . 
- CDS 13220 - 14269 1094 ## COG1064 Zn-dependent alcohol dehydrogenases 12 8 Op 1 3/0.750 - CDS 14646 - 16028 1403 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases 13 8 Op 2 . - CDS 16038 - 16988 1102 ## COG0549 Carbamate kinase - Prom 17045 - 17104 6.6 14 9 Op 1 . - CDS 17131 - 18549 1715 ## ECO103_0299 hypothetical protein 15 9 Op 2 . - CDS 18549 - 20096 1950 ## COG0074 Succinyl-CoA synthetase, alpha subunit 16 9 Op 3 . - CDS 20086 - 20949 401 ## JW0311 hypothetical protein 17 9 Op 4 . - CDS 20989 - 21594 620 ## COG0666 FOG: Ankyrin repeat - Prom 21628 - 21687 4.0 18 10 Op 1 . + CDS 21852 - 22349 426 ## ECIAI1_0315 conserved hypothetical protein; putative inner membrane protein 19 10 Op 2 . + CDS 22441 - 23373 801 ## COG0583 Transcriptional regulator - Term 23224 - 23274 1.5 20 11 Tu 1 . - CDS 23415 - 24503 715 ## COG2200 FOG: EAL domain - Prom 24670 - 24729 4.5 + Prom 24744 - 24803 5.1 21 12 Tu 1 . + CDS 24823 - 25008 93 ## ECBD_3343 hypothetical protein + Term 25147 - 25183 -1.0 - Term 25325 - 25359 4.0 22 13 Tu 1 . - CDS 25378 - 27411 2251 ## COG1292 Choline-glycine betaine transporter - Prom 27463 - 27522 3.0 + Prom 27394 - 27453 5.0 23 14 Op 1 6/0.000 + CDS 27522 - 28127 594 ## COG1309 Transcriptional regulator 24 14 Op 2 10/0.000 + CDS 28141 - 29613 1593 ## COG1012 NAD-dependent aldehyde dehydrogenases 25 14 Op 3 . + CDS 29627 - 31297 1664 ## COG2303 Choline dehydrogenase and related flavoproteins + Prom 31365 - 31424 2.2 26 15 Tu 1 . + CDS 31510 - 32178 41 ## B21_00269 hypothetical protein + Term 32283 - 32317 0.8 - Term 32269 - 32307 7.2 27 16 Op 1 13/0.000 - CDS 32421 - 33116 536 ## COG1556 Uncharacterized conserved protein 28 16 Op 2 17/0.000 - CDS 33109 - 34536 1184 ## COG1139 Uncharacterized conserved protein containing a ferredoxin-like domain 29 16 Op 3 3/0.750 - CDS 34547 - 35266 524 ## COG0247 Fe-S oxidoreductase - Prom 35389 - 35448 7.6 30 17 Tu 1 . 
- CDS 35794 - 36648 352 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 36758 - 36817 8.9 + Prom 36699 - 36758 5.3 31 18 Op 1 . + CDS 36874 - 38199 379 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 32 18 Op 2 . + CDS 38296 - 38544 106 ## G2583_0402 hypothetical protein 33 18 Op 3 . + CDS 38556 - 39149 527 ## COG3059 Predicted membrane protein Predicted protein(s) >gi|296493390|gb|ADTK01000111.1| GENE 1 87 - 1370 1376 427 aa, chain - ## HITS:1 COG:ECs0390 KEGG:ns NR:ns ## COG: ECs0390 COG0402 # Protein_GI_number: 15829644 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Escherichia coli O157:H7 # 1 427 1 427 427 879 99.0 0 MSNNALQTIINARLPGKEGLWQIHLQDGKISAIDAQSGVMPITENSLDAEQGLVIPPFVE PHIHLDTTQTAGQPNWNQSGTLFEGIERWAERKALLTHDDVKQRAWQTLKWQIANGIQHV RTHVDVSDATLTALKAMLEVKQEVAPWIDLQIVAFPQEGILSYPNGEALLEEALRLGADV VGAIPHFEFTREYGVESLHKTFALAQKYDRLIDVHCDEIDDEQSRFVETVAALAHREGMG ARVTASHTTAMHSYNGAYTSRLFRLLKMSGINFVANPLVNIHLQGRFDTYPKRRGITRVK EMLESGINVCFGHDDVFDPWYPLGTANMLQVLHMGLHVCQLMGYGQINDGLNLITHHSAR TLNLQDYGIAAGNSANLIILPAENGFDALRRQVPVRYSVRGGKVIASTQPAQTTVYLEQP EAIDYKR >gi|296493390|gb|ADTK01000111.1| GENE 2 1360 - 2619 1425 419 aa, chain - ## HITS:1 COG:codB KEGG:ns NR:ns ## COG: codB COG1457 # Protein_GI_number: 16128321 # Func_class: F Nucleotide transport and metabolism # Function: Purine-cytosine permease and related proteins # Organism: Escherichia coli K12 # 1 419 1 419 419 671 99.0 0 MSQDNNFSQGPVPQSARKGVLALTFVMLGLTFFSASMWTGGTLGTGLSYHDFFLAVLIGN LLLGIYTSFLGYIGAKTGLTTHLLARFSFGVKGSWLPSLLLGGTQVGWFGVGVAMFAIPV GKATGLDINLLIAVSGLLMTVTVFFGISALTVLSVIAVPAIACLGGYSVWLAVNGMGGLD ALKAVVPAQPLDFNVALALVVGSFISAGTLTADFVRFGRNAKLAVLVAMVAFFLGNSLMF IFGAAGAAALGMADISDVMIAQGLLLPAIVVLGLNIWTTNDNALYASGLGFANITGMSSK TLSVINGIIGTVCALWLYNNFVGWLTFLSAAIPPVGGVIIADYLMNRRRYEHFATTRMMS VNWVAILAVALGIAAGHWLPGIVPVNAVLGGALSYLILNPILNRKTTAAMTHVEVNSVE 
>gi|296493390|gb|ADTK01000111.1| GENE 3 2957 - 4843 2023 628 aa, chain - ## HITS:1 COG:ECs0388 KEGG:ns NR:ns ## COG: ECs0388 COG0365 # Protein_GI_number: 15829642 # Func_class: I Lipid transport and metabolism # Function: Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases # Organism: Escherichia coli O157:H7 # 1 628 1 628 628 1284 99.0 0 MSFSEFYQRSINEPEQFWAEQARRIDWQTPFTQTLDHSNPPFARWFCEGRTNLCHNAIDR WLEKQPEALALIAVSSETEEERTFTFRQLHDEVNAVASMLRSLGVQRGDRVLVYMPMIAE AHITLLACARIGAIHSVVFGGFASHSVAARIDDAKPVLIVSADAGARGGKIIPYKKLLDD AISQAQHQPRHVLLVDRGLAKMARVSGRDVDFASLRHQHIGARVPVAWLESNETSCILYT SGTTGKPKGVQRDVGGYAVALATSMDTIFGGKAGGVFFCASDIGWVVGHSYIVYAPLLAG MATIVYEGLPTWPDCGVWWKIVEKYQVSRMFSAPTAIRVLKKFPTAEIRKHDLSSLEVLY LAGEPLDEPTASWVSNTLDVPVIDNYWQTESGWPIMAIARGLDDRPTRLGSPGVPMYGYN VQLLNEVTGEPCGVNEKGMLVVEGPLPPGCIQTIWGDDDRFVKTYWSLFSRPVYATFDWG IRDADGYHFILGRTDDVINVAGHRLGTREIEESISSHPGVAEVAVVGVKDALKGQVAVAF VIPKESDSLEDRDVAHSQEKAIMALVDSQIGNFGRPAHVWFVSQLPKTRSGKMLRRTIQA ICEGRDPGDLTTIDDPASLDQIRQAMEE >gi|296493390|gb|ADTK01000111.1| GENE 4 4883 - 6334 1626 483 aa, chain - ## HITS:1 COG:prpD KEGG:ns NR:ns ## COG: prpD COG2079 # Protein_GI_number: 16128319 # Func_class: R General function prediction only # Function: Uncharacterized protein involved in propionate catabolism # Organism: Escherichia coli K12 # 1 483 1 483 483 1006 99.0 0 MSAQINNIRPEFDREIVDIVDYVMNYEISSKVAYDTAHYCLLDTLGCGLEALEYPACKKL LGPIVPGTVVPNGVRVPGTQFQLDPVQAAFNIGAMIRWLDFNDTWLAAEWGHPSDNLGGI LATADWLSRNAVASGKAPLTMKQVLTAMIKAHEIQGCIALENSFNRVGLDHVLLVKVAST AVVAEMLGLTREEILNAVSLAWVDGQSLRTYRHAPNTGTRKSWAAGDATSRAVRLALMAK TGEMGYPSALTAPVWGFYDVSFKGESFRFQRPYGSYVMENVLFKISFPAEFHSQTAVEAA MTLYEQMQAAGKTAADIEKVTIRTHEACIRIIDKKGPLNNPADRDHCIQYMVAIPLLFGR LTAADYEDNVAQDKRIDALREKINCFEDPAFTADYHDPEKRAIANAITLEFTDGTRFEEV VVEYPIGHARRRQDGIPKLVDKFKINLARQFPTRQQQRILEVSLDRTRLEQMPVNEYLDL YVI >gi|296493390|gb|ADTK01000111.1| GENE 5 6368 - 7537 1256 389 aa, chain - ## HITS:1 COG:prpC KEGG:ns NR:ns ## COG: prpC COG0372 # Protein_GI_number: 16128318 # Func_class: C Energy 
production and conversion # Function: Citrate synthase # Organism: Escherichia coli K12 # 1 389 1 389 389 775 99.0 0 MSDTTILQNSTHVIKPKKSVALSGVPAGNTALCTVGKSGNDLHYRGYDILDLAEHCEFEE VAHLLIHGKLPTRDELAAYKTKLKALRGLPANVRTVLEALPAASHPMDVMRTGVSALGCT LPEKEGHTVSGARDIADKLLASLSSILLYWYHYSHNGERIQPETDDDSIGGHFLHLLHGE KPSQSWEKAMHISLVLYAEHEFNASTFTSRVIAGTGSDMYSAIIGAIGALRGPKHGGANE VSLEIQQRYETPDEAEADIRKRVENKEVVIGFGHPVYTIADPRHQVIKRVAKQLSQEGGS LKMYNIADRLETVMWESKKMFPNLDWFSAVSYNMMGVPTEMFTPLFVIARVTGWAAHIIE QRQDNKIIRPSANYVGPEDCPFVALDKRQ >gi|296493390|gb|ADTK01000111.1| GENE 6 7582 - 8472 1008 296 aa, chain - ## HITS:1 COG:prpB KEGG:ns NR:ns ## COG: prpB COG2513 # Protein_GI_number: 16128316 # Func_class: G Carbohydrate transport and metabolism # Function: PEP phosphonomutase and related enzymes # Organism: Escherichia coli K12 # 1 296 1 296 296 558 99.0 1e-159 MSLHSPGKAFRAALSKENPLQIVGTINANHALLAQRAGYQAIYLSGGGVAAGSLGLPDLG ISTLDDVLTDIRRITDVCSLPLLVDADIGFGSSAFNVARTVKSMIKAGAAGLHIEDQVGA KRCGHRPNKAIVSKEEMVDRIRAAVDAKTDPDFVIMARTDALAVEGLDAAIERAQAYVEA GAEMLFPEAITELAMYRQFADAVQVPILANITEFGATPLFTTDELRSAHVAMALYPLSAF RAMNRAAEHVYNVLRQEGTQKSVIDTMQTRNELYESINYYQYEEKLDDLFARSQVK >gi|296493390|gb|ADTK01000111.1| GENE 7 8711 - 10297 1481 528 aa, chain + ## HITS:1 COG:ECs0384 KEGG:ns NR:ns ## COG: ECs0384 COG1221 # Protein_GI_number: 15829638 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulators containing an AAA-type ATPase domain and a DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 528 1 528 528 1028 99.0 0 MAHPPRLNDDKPVIWTVSVTRLFELFRDISLEFDHLANITPIQLGFEKAVTYIRKKLANE RCDAVIAAGSNGAYLKSRLSVPVILIKPSGYDVLQALAKAGKLTSSIGVVTYQETIPALV AFQKTFNLRLDQRSYITEEDARGQINELKANGTEAVVGAGLITDLAEEAGMTGIFIYSAV TVRQAFSDALDMTRMSLRHNTHDATRNALRTRYVLGDMLGQSPQMEQVRQTILLYARSSA AVLIEGETGTGKELAAQAIHREYFARHDARQGKKSHPFVAVNCGAIAESLLEAELFGYEE GAFTGSRRGGRAGLFEIAHGGTLFLDEIGEMPLPLQTRLLRVLEEKEVTRVGGHQPVPVD VRVISATHCNLEEDMRQGQFRRDLFYRLSILRLQLPPLRERVADILPLAESFLKVSLAAL 
SAPFSAALRQGLQASETVLVHYDWPGNIRELRNMMERLALFLSVEPTPDLTPQFMQLLLP ELARESAKIPAPRLLTPQQALEKFNGDKTAAANYLGISRTTFWRRLKS >gi|296493390|gb|ADTK01000111.1| GENE 8 10395 - 10670 370 91 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_0373 NR:ns ## KEGG: ECUMN_0373 # Name: yahO # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 91 1 91 91 134 98.0 1e-30 MKIISKMLVGALALAVTNVYAAELMTKAEFEKVESQYEKIGDISTSNEMSTADAKEDLIK KADEKGADVLVLTSGQTDNKIHGTANIYKKK >gi|296493390|gb|ADTK01000111.1| GENE 9 10820 - 11488 515 222 aa, chain + ## HITS:1 COG:yahN KEGG:ns NR:ns ## COG: yahN COG1280 # Protein_GI_number: 16128313 # Func_class: E Amino acid transport and metabolism # Function: Putative threonine efflux protein # Organism: Escherichia coli K12 # 1 222 2 223 223 400 99.0 1e-111 MKLLHLFMDEITMDPLHAVYLTVGLFVITFFNPGANLFVVVQTSLASGRRAGVLTGLGVA LGDAFYSGLGLFGLATLITQCEEIFSLIRIVGGAYLLWFAWCSMRRQSTPQMSTLQQPIS APWYVFFRRGLITDLSNPQTVLFFISIFSVTLNAETPTWARLMAWAGIVLASIIWRVFLS QAFSLPAVRRAYGRMQRVASRVIGAIIGVFALRLIYEGVTQR >gi|296493390|gb|ADTK01000111.1| GENE 10 12162 - 12977 350 271 aa, chain - ## HITS:1 COG:no KEGG:EC55989_0333 NR:ns ## KEGG: EC55989_0333 # Name: yahL # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 271 1 271 271 483 100.0 1e-135 MISLKAPHNNLMPYTQQSILNTVKNNQLPEDIKSSLVSCVDIFKVLIKQYYDYPYDCRDD LVDDDKLIHLMAAVRDCEWSDDNALTINVQFNDFPGFYDWMDYPDHPVKFVFRILENQKG TVWVYDQDDAFLDIKANVQAGRFTGLKKLVQFIDSVRTDCKCILLEHHMPLLRLFPKGKE CMHVEKWLREMSSIPETDAPIKQALAHGLLLHLKNIYPVFPESLVMLLLSVLDVKTYRDD ARLNEWISNRVQELGDRYYPVNKHVKIRYTL >gi|296493390|gb|ADTK01000111.1| GENE 11 13220 - 14269 1094 349 aa, chain - ## HITS:1 COG:ECs0379 KEGG:ns NR:ns ## COG: ECs0379 COG1064 # Protein_GI_number: 15829633 # Func_class: R General function prediction only # Function: Zn-dependent alcohol dehydrogenases # Organism: Escherichia coli O157:H7 # 1 349 1 349 349 687 99.0 0 MKIKAVGAYSAKQPLEPMDITRREPGPNDVKIEIAYCGVCHSDLHQVRSEWAGTVYPCVP GHEIVGRVVAVGDQVEKHAPGDLVGVGCIVDSCKHCEECEDGLENYCDHMTGTYNSPTPD 
EPGHTLGGYSQQIVVHERYVLRIRHPQEQLAAVAPLLCAGITTYSPLRHWQAGPGKKVGV VGIGGLGHMGIKLAHAMGAHVVAFTTSEAKREAAKALGADEVVNSRNADEMAAHLKSFDF ILNTVAAPHNLDDFTTLLKRDGTMTLVGAPATPHKSPEVFNLIMKRRAIAGSMIGGIPET QEMLDFCAEHGIVADIEMIRADQINEAYERMLRGDVKYRFVIDNRTLTD >gi|296493390|gb|ADTK01000111.1| GENE 12 14646 - 16028 1403 460 aa, chain - ## HITS:1 COG:yahJ KEGG:ns NR:ns ## COG: yahJ COG0402 # Protein_GI_number: 16128309 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Escherichia coli K12 # 1 460 1 460 460 916 99.0 0 MKESNSRREFLSQSGKMVTAAALFGTSVPLAHAAVAGTLNCEANNTMKITDPHYYLDNVL LETGFDYENGVAVQTRTARQTVEIQDGKIVALRENKQHPDATLPHYDAGGKLMLPTTRDM HIHLDKTFYGGPWRSLNRPAGTTIQDMIKLEQKMLPELQPYTQERAEKLIDLLQSKGTTI ARSHCNIEPVSGLKNLQNLQAVLARRQAGFECEIVAFPQHGLLLSKSEPLMREAMQAGAH YVGGLDPTSVDGAMEKSLDTMFQIALDYDKGVDIHLHETTPAGVAAINYMVETVEKTPQL KGKLTISHAFALATLNEQQVDELAHRMAAQQISIASTVPIGTLHMPLKQLHDKGVKVMTG TDSVIDHWSPYGLGDMLEKANLYAQLYIRPNEQNLSRSLFLATGDVLPLNEKGERVWPKA QDDASFVLVDASCSAEAVARISPRTATFHKGQLVWGSVAG >gi|296493390|gb|ADTK01000111.1| GENE 13 16038 - 16988 1102 316 aa, chain - ## HITS:1 COG:yahI KEGG:ns NR:ns ## COG: yahI COG0549 # Protein_GI_number: 16128308 # Func_class: E Amino acid transport and metabolism # Function: Carbamate kinase # Organism: Escherichia coli K12 # 1 316 1 316 316 578 98.0 1e-165 MKELVVVAIGGNSIIKDNASQSIEHQAEAVKAVADTVLEMLASDYDIVLTHGNGPQVGLD LRRAEIAHEREGLPLTPLANCVADTQGGIGYLIQQALNNRLARHGEKKAVTVVTQVEVDK NDPGFAHPTKPIGAFFSESQRDKLQKANPDWCFVEDAGRGYRRVVASPEPKRIVEAPAIK ALIQQGFVVIGAGGGGIPVVRTEAGDYQSVDAVIDKDLSTALLAREIHADILVITTGVEK VCIHFGKPQQQALDRVDIATMTRYMQEGHFPPGSMLPKIIASLTFLEQGGKEVIITTPEC LPAALRGETGTHIIKT >gi|296493390|gb|ADTK01000111.1| GENE 14 17131 - 18549 1715 472 aa, chain - ## HITS:1 COG:no KEGG:ECO103_0299 NR:ns ## KEGG: ECO103_0299 # Name: yahG # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 472 1 472 472 944 100.0 0 
MSQSLFSQPLNVINVGIAMFSDDLKKQHVEVTQLDWTPPGQGNMQVVQALDNIADSPLAD KIAAANQQALERIIQSHPVLIGFDQAINVVPGMTPKTILHAGPPITWEKMCGAMKGAVTG ALVFEGLAKDLDEATELAASGEITFSPCHEHDCVGSMAGVTSASMFMHIVKNKTYGNIAY TNMSEQMAKILRMGANDQSVIDRLNWMRDVQGPMLRDAMKIIGEIDLRLMLAQALHMGDE CHNRNNAGTTLLIQALTPGIIQAGYSVEQQREVFEFVASSDYFSGPTWMAMCKAAMDAAH GIEYSTVVTTMARNGVEFGLRVSGLPGQWFTGPAQQVIGPMFAGYKPEDSGLDIGDSAIT ETYGIGGFAMATAPAIVALVGGTVEEAIDFSRQMREITLGENPNVTIPLLGFMGVPSAID ITRVGSSGILPVINTAIAHKDAGVGMIGAGIVHPPFACFEKAILGWCERYGV >gi|296493390|gb|ADTK01000111.1| GENE 15 18549 - 20096 1950 515 aa, chain - ## HITS:1 COG:yahF KEGG:ns NR:ns ## COG: yahF COG0074 # Protein_GI_number: 16128305 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, alpha subunit # Organism: Escherichia coli K12 # 1 515 1 515 515 974 98.0 0 MSVKIVIKPNTYFDSVSLMSISTRANKLDGVEQAFVAMATEMNKGVLKNLGLLTPELEQA KNGDLMIVINGKSGADNEQLLVEIEELFNTKAQSGSHEARYATIASAKKHIPESNLAVIS VNGLFAAREARQALQNDLNVMLFSDNVSVEDELALKQLAHEKGLLMMGPDCGTAIINGAA LCFGNAVRRGNIGIVGASGTGSQELSVRIHEFGGGVSQLIGTGGRDLSEKIGGLMMLDAI GMLENDPQTEIIVLISKPPAPAVARKVLERARACRKPVVVCFLGRVETPVDEQGLQFARG SKEAALKAVMLSGVKQENLDLHTLNQPLIADVRARLQPQQKYIRGLFCGGTLCDETMFAV MEKHGDVYSNIQPDPEFRLQDINRSIKHTFLDFGDDDFTNGKPHPMIDPTNRISRLIEEA RDPEVAVIVMDFVLGFGSHEDPVGSTIEAIKEAKAIAAAEGRELIILAYVLGTDLDTPSL EQQSQMLLDAGVILASSSTNTGLLAREFICKGEEA >gi|296493390|gb|ADTK01000111.1| GENE 16 20086 - 20949 401 287 aa, chain - ## HITS:1 COG:no KEGG:JW0311 NR:ns ## KEGG: JW0311 # Name: yahE # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 287 1 287 287 585 97.0 1e-166 MWALTADADFLAQRGQGQVEQVFARAVNIALPARQQLLTLLCEEYDNAPNSCRLALTHFN GLFRHGDKVQFDEQGITVGQHLHIEMSRCRRWLSPTLQMTAVNFHLIAWQQWHDIIHQHL GENETLFNYRGDNPFYQALNKELHIKRRAVIQAVNDKQNIASAVASMMGLGIGLTPSADD YLTGLALVLFIPGHPAEKYKEEFYLGLQRGRNNTTLLSAITLEAALQQRCRENIHSFIHN IIYDIPGNATQAIEKIKHIGSSSGCDMLYGMADGCALSQTYGGNYVS >gi|296493390|gb|ADTK01000111.1| GENE 17 20989 - 21594 620 201 aa, chain - ## HITS:1 COG:ECs0367 KEGG:ns NR:ns ## COG: ECs0367 COG0666 # 
Protein_GI_number: 15829621 # Func_class: R General function prediction only # Function: FOG: Ankyrin repeat # Organism: Escherichia coli O157:H7 # 1 201 1 201 201 389 98.0 1e-108 MTIKNLPADYLLAAQQGDIDKVKTCLALGVDINTCDRQGKTAITLASLYQQYACVQALID AGADINKQDLTCLNPFLMSCLNGDLTLLRIILPAKPDLNCVTRFGGVGLTPACEKGHLSI VKELLAHTEINVNQTNHVGWTPLLEAIVLNDGGIKQQAIVQLLLEHGASPHLTDKYGKTP LELARERGFEEIAQLLIAAGA >gi|296493390|gb|ADTK01000111.1| GENE 18 21852 - 22349 426 165 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_0315 NR:ns ## KEGG: ECIAI1_0315 # Name: yahC # Def: conserved hypothetical protein; putative inner membrane protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 165 1 165 165 252 100.0 4e-66 MNGLTATGVTVGICAGLWQLVSSHVGLSQGWELLGTIGFVAFCSFYAAGGGKSGFIRSLA VNYSGMVWAFFAALAAGWLASVSGISGFWASVITTVPFSAVVVWQGRFWLLSFIPGGFLG MTLFFASGMNWTVTLLGFLAGNCVGIISEYGGQKLSEATTKRDGY >gi|296493390|gb|ADTK01000111.1| GENE 19 22441 - 23373 801 310 aa, chain + ## HITS:1 COG:yahB KEGG:ns NR:ns ## COG: yahB COG0583 # Protein_GI_number: 16128301 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 310 1 310 310 633 99.0 0 MNSIFTEENLLAFTTAARFGSFSKAAEELGLTTSAISYTIKRMETGLDVVLFTRSTRSIE LTESGRYFFRKATDLLNDFYAIKRSIDTISQGIEARVRICINQLLYTPKHTARLLQVLKK QFPTCQITVTTEVYNGVWDAIINNQANIAIGAPDTLLDGGGIDYTEIGAIRWAFAIAPDH PLAFVPEPIAESQLRLYPNIMVEDTAHTINKKVGWLLHGQESILVPDFNTKCQCQILGEG IGFLPDYMVREAMAQSLLVTRQIHNPRQDSRMLLATQHSATGQVTQWIKKQFAPNGILTG IYQDLLHREN >gi|296493390|gb|ADTK01000111.1| GENE 20 23415 - 24503 715 362 aa, chain - ## HITS:1 COG:yahA_2 KEGG:ns NR:ns ## COG: yahA_2 COG2200 # Protein_GI_number: 16128300 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 101 362 1 262 262 549 99.0 1e-156 MNSCDFRVFLQEFGTTVHLSLPGSVSEKERLLLKLLMQGMSVTEISQYRNRSAKTISHQK KQLFEKLGIQSDITFWRDIFFQYNPEIISATGSNSHRYINDNHYHHIVTPEAISLALENH EFKPWIQPVFCAQTGVLTGCEVLVRWEHPQTGIIPPDQFIPLAESSGLIVIMTRQLMKQT 
ADILMPVKHLLPDNFHIGINVSAGCFLAAGFEKECLNLVKKLGNDKIKLVLELTERNPIP VTPEARAIFDSLHQHNITFALDDFGTGYATYRYLQAFPVDFIKIDKSFVQMASVDEISGH IVDNIVELARKPGLSIVAEGVETQEQADLMIGKGVHFLQGYLYSPPVPGNKFVSEWVMKA GG >gi|296493390|gb|ADTK01000111.1| GENE 21 24823 - 25008 93 61 aa, chain + ## HITS:1 COG:no KEGG:ECBD_3343 NR:ns ## KEGG: ECBD_3343 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_BL21_DE3 # Pathway: not_defined # 1 61 1 61 61 115 100.0 6e-25 MSKGALYEFNNPDQLKIPLPHKHIASTFNDIMSKDVGYAYVSLLYACPLKTHSLRLNPFS K >gi|296493390|gb|ADTK01000111.1| GENE 22 25378 - 27411 2251 677 aa, chain - ## HITS:1 COG:ECs0360 KEGG:ns NR:ns ## COG: ECs0360 COG1292 # Protein_GI_number: 15829614 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Choline-glycine betaine transporter # Organism: Escherichia coli O157:H7 # 1 677 1 677 677 1333 100.0 0 MTDLSHSREKDKINPVVFYTSAGLILLFSLTTILFRDFSALWIGRTLDWVSKTFGWYYLL AATLYIVFVVCIACSRFGSVKLGPEQSKPEFSLLSWAAMLFAAGIGIDLMFFSVAEPVTQ YMQPPEGAGQTIEAARQAMVWTLFHYGLTGWSMYALMGMALGYFSYRYNLPLTIRSALYP IFGKRINGPIGHSVDIAAVIGTIFGIATTLGIGVVQLNYGLSVLFDIPDSMAAKAALIAL SVIIATISVTSGVDKGIRVLSELNVALALGLILFVLFMGDTSFLLNALVLNVGDYVNRFM GMTLNSFAFDRPVEWMNNWTLFFWAWWVAWSPFVGLFLARISRGRTIRQFVLGTLIIPFT FTLLWLSVFGNSALYEIIHGGAAFAEEAMVHPERGFYSLLAQYPAFTFSASVATITGLLF YVTSADSGALVLGNFTSQLKDINSDAPGWLRVFWSVAIGLLTLGMLMTNGISALQNTTVI MGLPFSFVIFFVMAGLYKSLKVEDYRRESANRDTAPRPLGLQDRLSWKKRLSRLMNYPGT RYTKQMMETVCYPAMEEVAQELRLRGAYVELKSLPPEEGQQLGHLDLLVHMGEEQNFVYQ IWPQQYSVPGFTYRARSGKSTYYRLETFLLEGSQGNDLMDYSKEQVITDILDQYERHLNF IHLHREAPGHSVMFPDA >gi|296493390|gb|ADTK01000111.1| GENE 23 27522 - 28127 594 201 aa, chain + ## HITS:1 COG:ECs0359 KEGG:ns NR:ns ## COG: ECs0359 COG1309 # Protein_GI_number: 15829613 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 7 201 1 195 195 357 98.0 7e-99 MNGVAKMPKLGMQSIRRRQLIDATLEAINEVGMHDATIAQIARRAGVSTGIISHYFRDKN GLLEATMRDITSQLRDAVLNRLHALPQGSAEQRLQAIVAGNFDETQVSSAAMKAWLAFWA 
SSMHQPMLYRLQQVSSRRLLSNLVSEFRRELPREQAQEAGYGLAALIDGLWLRAALSGKP LDKPLAHSLTRHFITQHLPTD >gi|296493390|gb|ADTK01000111.1| GENE 24 28141 - 29613 1593 490 aa, chain + ## HITS:1 COG:betB KEGG:ns NR:ns ## COG: betB COG1012 # Protein_GI_number: 16128297 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Escherichia coli K12 # 1 490 1 490 490 956 99.0 0 MSRMAEQQLYIHGGYTSATSGRTFETINPANGNVLATVQAAGREDVDRAVKSALQGQKIW ASMTAMERSRILRRAVDILRERNDELAKLETLDTGKAYSETSTVDIVTGADVLEYYAGLI PALEGSQIPLRETSFVYTRREPLGVVAGIGAWNYPIQIALWKSAPALAAGNAMIFKPSEV TPLTALKLAEIYSEAGLPDGVFNVLPGVGAETGQYLTEHPGIAKVSFTGGVASGKKVMAN SAASSLKEVTMELGGKSPLIVFDDADLDLAADIAMMANFFSSGQVCTNGTRVFVPAKCKA AFEQKILARVERIRAGDVFDPQTNFGPLVSFPHRDNVLRYIAKGKEEGARVLCGGDVLKG DGFDNGAWVAPTVFTDCSDDMTIVREEIFGPVMSILTYESEDEVIRRANDTDYGLAAGIV TADLNRAHRVIHQLEAGICWINTWGESPAEMPVGGYKHSGIGRENGVMTLQSYTQVKSIQ VEMAKFQSIF >gi|296493390|gb|ADTK01000111.1| GENE 25 29627 - 31297 1664 556 aa, chain + ## HITS:1 COG:betA KEGG:ns NR:ns ## COG: betA COG2303 # Protein_GI_number: 16128296 # Func_class: E Amino acid transport and metabolism # Function: Choline dehydrogenase and related flavoproteins # Organism: Escherichia coli K12 # 1 556 1 556 556 1170 100.0 0 MQFDYIIIGAGSAGNVLATRLTEDPNTSVLLLEAGGPDYRFDFRTQMPAALAFPLQGKRY NWAYETEPEPFMNNRRMECGRGKGLGGSSLINGMCYIRGNALDLDNWAQEPGLENWSYLD CLPYYRKAETRDMGENDYHGGDGPVSVTTSKPGVNPLFEAMIEAGVQAGYPRTDDLNGYQ QEGFGPMDRTVTPQGRRASTARGYLDQAKSRPNLTIRTHAMTDHIIFDGKRAVGVEWLEG DSTIPTRATANKEVLLCAGAIASPQILQRSGVGNAELLAEFDIPLVHELPGVGENLQDHL EMYLQYECKEPVSLYPALQWWNQPKIGAEWLFGGTGVGASNHFEAGGFIRSREEFAWPNI QYHFLPVAINYNGSNAVKEHGFQCHVGSMRSPSRGHVRIKSRDPHQHPAILFNYMSHEQD WQEFRDAIRITREIMHQPALDQYRGREISPGVECQTDEQLDEFVRNHAETAFHPCGTCKM GYDEMSVVDGEGRVHGLEGLRVVDASIMPQIITGNLNATTIMIGEKIADMIRGQEALPRS TAGYFVANGMPVRAKK >gi|296493390|gb|ADTK01000111.1| GENE 26 31510 - 32178 41 222 aa, chain + ## HITS:1 COG:no KEGG:B21_00269 NR:ns ## KEGG: B21_00269 # Name: ykgH # Def: hypothetical protein # Organism: 
E.coli_BL21 # Pathway: not_defined # 1 222 1 222 222 441 100.0 1e-122 MREQIKQDIDLIEILFYLKKKIRVILFIMAICMAMVLLFLYINKDNIKVIYSLKINQTTP GILVSCDSNNNFACQTTMTEDVIQRITTFFHTSPDVKNREIRLEWSGDKRALPTAEEEIS RVQASIIKWYASEYHNGRQVLDEIQTPSAINSELYTKMIYLTRNWSLYPNGDGCVTISSP EIKNKYPAAICLALGFFLSIVISVMFCLVKKMVDEYQQNSGQ >gi|296493390|gb|ADTK01000111.1| GENE 27 32421 - 33116 536 231 aa, chain - ## HITS:1 COG:ykgG KEGG:ns NR:ns ## COG: ykgG COG1556 # Protein_GI_number: 16128293 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 231 52 282 282 443 99.0 1e-124 MDNRGEFLNNVAQALGRPLRLEPQAEDAPLNNYANERLTQLNQQQRCDAFIQFASDVMLT RCELTSEAKAAEAAIRLCKELGDQSVVISGDTRLEELGISERLQQECNAVVWDPAKGAEN ISQAEQAKVGVVYAEYGLTESGGVILFSAAERGRSLSLLPEYSLFILRKSTILPRVAQLA EKLHQKAQAGERMPSCINIISGPSSTADIELIKVVGVHGPVKAVYLIIEDC >gi|296493390|gb|ADTK01000111.1| GENE 28 33109 - 34536 1184 475 aa, chain - ## HITS:1 COG:ykgF KEGG:ns NR:ns ## COG: ykgF COG1139 # Protein_GI_number: 16128292 # Func_class: C Energy production and conversion # Function: Uncharacterized conserved protein containing a ferredoxin-like domain # Organism: Escherichia coli K12 # 1 475 1 475 475 1001 99.0 0 MLIKTSNTDFKTRIRQQIEDPIMRKAVANAQQRIGANRQKMVDELGHWEEWRDRAAQIRD HVLSNLDAYLYQLSEKVTQNGGHVYFARTKEDATRYILQVAQRKNARKVVKSKSMVTEEI GVNHVLQDAGIQVIETDLGEYILQLDQDPPSHVVVPAIHKDRHQIRRVLHERLGYEGPET PEAMTLFIRQKIREDFLSAEIGITGCNFAVAETGSVCLVTNEGNARMCTTLPKTHIAVMG MERIAPTFAEVDVLITMLARSAVGARLTGYNTWLTGPREAGHVDGPEEFHLVIVDNGRSE VLASEFRDVLRCIRCGACMNTCPAYRHIGGHGYGSIYPGPIGAVISPLLGGYKDFKDLPY ACSLCTACDNVCPVRIPLSKLILRHRRVMAEKGITAKAEQRAIKMFAYANSHPGLWKVGM MAGAHAASWFINGGKTPLKFGAISDWMEARDLPEADGESFRSWFKKHQAQEKKNG >gi|296493390|gb|ADTK01000111.1| GENE 29 34547 - 35266 524 239 aa, chain - ## HITS:1 COG:ykgE KEGG:ns NR:ns ## COG: ykgE COG0247 # Protein_GI_number: 16128291 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Escherichia coli K12 # 1 239 1 239 239 480 98.0 1e-135 
MNVNFFVTCIGDALKSRMARDSVLLLEKLGCRVNFPEKQGCCGQPAINSGYIKEAIPGMK NLIAALEDNDDPIISPAGSCTYAVKSYPMYLADEPEWASRAAKVAARMQDLTSFIVNKLG VVDVGASLQGRAVYHPSCSLARKLGVKDEPLTLLKNVRGLELFTFAEQDTCCGFGGMFSV KMAEISGEMVKEKVAHLMEVRPEYLIGADVSCLLNISGRLQREGQKVKVMHIAEVLMSR >gi|296493390|gb|ADTK01000111.1| GENE 30 35794 - 36648 352 284 aa, chain - ## HITS:1 COG:ECs0343 KEGG:ns NR:ns ## COG: ECs0343 COG2207 # Protein_GI_number: 15829597 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 284 1 284 284 561 99.0 1e-160 MDALSRLLMLNAPQGTIDKNCVLGSDWQLPHGAGELSVIRWHALTQGAAKLEMPTGEIFT LRPGNVVLLPQNSAHRLSHVDNESTCIVCGTLRLQHSARYFLTSLPETLFLAPVNHSVEY NWLREAIPFLQQESRLAMPGVDALCSQICATFFTLAVREWIAQVNTEKNILSLLLHPRLG AVIQQMLEMPGHAWTVESLASIAHMSRASFAQLFRDVSGTTPLAVLTKLRLQIAAQMFSR EMLPVVVIAESVGYASESSFHKAFVREFGCTPGEYRERVRQLAP >gi|296493390|gb|ADTK01000111.1| GENE 31 36874 - 38199 379 441 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 1 433 1 444 458 150 26 1e-35 MNKYQAVIIGFGKAGKTLAVTLAKAGWRVALIEQSNAMYGGTCINIGCIPTKTLVHDAQQ HTDFVRAIQRKNEVVNFLRNKNFHNLADMPNIDVIDGQAEFINNHSLRVHRPEGNLEIHG EKIFINTGAQTVVPPIPGITTTPGVYDSTGLLNLKELPGHLGILGGGYIGVEFASMFANF GSKVTILEAASLFLPREERDIADNIATILRDQGVDIILNAHVERISHHENQVQVHSEHAQ LAVDALLIASGRQPATASLHPENAGIAVNERGAIVVDKRLHTTADNIWAMGDVTGGLQFT YISLDDYRIVRDELLGEGKRSTDDRKNVPYSVFMTPPLSRVGMTEEQARESGADIQVVTL PVAAIPRARVMNDTRGVLKAIVDNKTQRILGASLLCVDSHEMINIVKMVMDAGLPYSILR DQIFTHPSMSESLNDLFSLVK >gi|296493390|gb|ADTK01000111.1| GENE 32 38296 - 38544 106 82 aa, chain + ## HITS:1 COG:no KEGG:G2583_0402 NR:ns ## KEGG: G2583_0402 # Name: ykgI # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 82 2 83 83 120 97.0 1e-26 MVFYMFKKSVLFATLLSGVMAFSTNADDKTILKHISVSSVSASPTVLEDAIADIARKYNA SSWKVTSMRIDNNSTATAVLYK >gi|296493390|gb|ADTK01000111.1| GENE 33 38556 - 39149 527 197 aa, chain + ## HITS:1 COG:ykgB KEGG:ns NR:ns ## COG: ykgB COG3059 
# Protein_GI_number: 16128286 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 197 4 200 200 386 99.0 1e-107 MEKYLHLLSRGDKIGLTLIRLSIAIVFMWIGLLKFVPYEADSITPFVANSPLMSFFYEHP EDYKQYLTHEGEYKPEARAWQSANNTYGFSNGLGVVEVIIALLVLANPVNRWLGLLGGLM AFTTPLVTLSFLITTPEAWVPALGDAHHGFPYLSGAGRLVLKDTLMLAGAVMIMADSARE ILKQRSNESSSTLKTEY Prediction of potential genes in microbial genomes Time: Mon May 16 15:22:40 2011 Seq name: gi|296493389|gb|ADTK01000112.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont294.1, whole genome shotgun sequence Length of sequence - 791 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 15:23:01 2011 Seq name: gi|296493388|gb|ADTK01000113.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont296.1, whole genome shotgun sequence Length of sequence - 63740 bp Number of predicted genes - 65, with homology - 65 Number of transcription units - 26, operones - 16 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 815 854 ## COG1680 Beta-lactamase class C and other penicillin binding proteins + Term 822 - 861 5.3 - Term 810 - 849 5.3 2 2 Op 1 . - CDS 873 - 1772 1118 ## COG2837 Predicted iron-dependent peroxidase - Term 1781 - 1812 2.1 3 2 Op 2 . - CDS 1868 - 2443 440 ## SDY_2630 hypothetical protein 4 2 Op 3 . - CDS 2504 - 2953 271 ## G2583_2964 hypothetical protein 5 2 Op 4 . - CDS 2940 - 3365 559 ## COG0456 Acetyltransferases - Prom 3393 - 3452 4.3 + Prom 3411 - 3470 4.2 6 3 Op 1 4/0.444 + CDS 3579 - 4448 815 ## COG0860 N-acetylmuramoyl-L-alanine amidase 7 3 Op 2 . 
+ CDS 4452 - 5351 690 ## COG0408 Coproporphyrinogen III oxidase + Term 5593 - 5626 -0.6 8 4 Op 1 2/0.778 - CDS 5357 - 6409 933 ## COG2207 AraC-type DNA-binding domain-containing proteins 9 4 Op 2 4/0.444 - CDS 6455 - 6955 429 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein 10 4 Op 3 6/0.000 - CDS 6968 - 7627 771 ## COG4816 Ethanolamine utilization protein 11 4 Op 4 8/0.000 - CDS 7637 - 8524 1069 ## COG4302 Ethanolamine ammonia-lyase, small subunit 12 4 Op 5 5/0.222 - CDS 8545 - 9906 1464 ## COG4303 Ethanolamine ammonia-lyase, large subunit 13 4 Op 6 4/0.444 - CDS 9918 - 11321 1288 ## COG4819 Ethanolamine utilization protein, possible chaperonin protecting lyase from inhibition 14 4 Op 7 2/0.778 - CDS 11318 - 12544 1592 ## COG3192 Ethanolamine utilization protein - Prom 12578 - 12637 4.2 15 5 Op 1 2/0.778 - CDS 12661 - 13848 1193 ## COG1454 Alcohol dehydrogenase, class IV 16 5 Op 2 4/0.444 - CDS 13838 - 14674 929 ## COG4820 Ethanolamine utilization protein, possible chaperonin 17 5 Op 3 4/0.444 - CDS 14685 - 16088 838 ## PROTEIN SUPPORTED gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P 18 5 Op 4 4/0.444 - CDS 16100 - 16387 253 ## COG4576 Carbon dioxide concentrating mechanism/carboxysome shell protein - Term 16449 - 16486 5.6 19 6 Op 1 1/0.778 - CDS 16494 - 16787 467 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein 20 6 Op 2 4/0.444 - CDS 16826 - 17842 871 ## COG0280 Phosphotransacetylase 21 6 Op 3 4/0.444 - CDS 17839 - 18642 820 ## COG4812 Ethanolamine utilization cobalamin adenosyltransferase 22 6 Op 4 4/0.444 - CDS 18639 - 19340 755 ## COG4766 Ethanolamine utilization protein 23 6 Op 5 4/0.444 - CDS 19315 - 19794 383 ## COG4917 Ethanolamine utilization protein 24 6 Op 6 1/0.778 - CDS 19807 - 20142 396 ## COG4810 Ethanolamine utilization protein - Prom 20313 - 20372 5.5 25 7 Tu 1 . 
- CDS 20435 - 22714 2559 ## COG0281 Malic enzyme - Prom 22835 - 22894 3.5 + Prom 22797 - 22856 3.3 26 8 Op 1 13/0.000 + CDS 23003 - 23953 915 ## COG0176 Transaldolase 27 8 Op 2 . + CDS 23973 - 25976 2274 ## COG0021 Transketolase + Term 25992 - 26049 5.5 28 9 Tu 1 . - CDS 26072 - 27115 733 ## ECO103_2975 hypothetical protein 29 10 Op 1 3/0.667 - CDS 27241 - 27816 691 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 30 10 Op 2 . - CDS 27884 - 29863 1449 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases - Prom 29950 - 30009 2.3 + Prom 29932 - 29991 2.8 31 11 Tu 1 . + CDS 30069 - 31769 1361 ## COG3850 Signal transduction histidine kinase, nitrate/nitrite-specific + Prom 31774 - 31833 4.7 32 12 Tu 1 . + CDS 31933 - 35046 3233 ## COG0841 Cation/multidrug efflux pump + Term 35051 - 35099 7.5 + Prom 35480 - 35539 5.7 33 13 Op 1 9/0.000 + CDS 35585 - 35941 322 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family 34 13 Op 2 . + CDS 35945 - 37072 1014 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 35 13 Op 3 . + CDS 37100 - 37300 316 ## G2583_2995 hypothetical protein - Term 37285 - 37318 -0.9 36 14 Tu 1 . - CDS 37410 - 38108 794 ## COG0400 Predicted esterase 37 15 Op 1 4/0.444 - CDS 38182 - 40158 1346 ## COG1444 Predicted P-loop ATPase fused to an acetyltransferase 38 15 Op 2 . - CDS 40212 - 41075 789 ## COG2321 Predicted metalloprotease - Prom 41122 - 41181 2.1 39 16 Op 1 7/0.000 - CDS 41222 - 41620 197 ## COG1487 Predicted nucleic acid-binding protein, contains PIN domain 40 16 Op 2 . - CDS 41620 - 41847 263 ## COG4456 Virulence-associated protein and related proteins - Prom 41877 - 41936 6.9 - Term 41943 - 41980 8.0 41 17 Tu 1 . - CDS 42001 - 42714 1122 ## COG0152 Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase + Prom 42699 - 42758 3.6 42 18 Tu 1 . 
+ CDS 42786 - 43016 90 ## SDY_2665 hypothetical protein + Term 43101 - 43131 2.7 - Term 42888 - 42919 3.2 43 19 Op 1 9/0.000 - CDS 42927 - 43961 898 ## COG3317 Uncharacterized lipoprotein 44 19 Op 2 . - CDS 43978 - 44856 791 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase - Prom 45055 - 45114 3.0 + Prom 44914 - 44973 6.0 45 20 Op 1 4/0.444 + CDS 45002 - 45574 345 ## COG2716 Glycine cleavage system regulatory protein 46 20 Op 2 1/0.778 + CDS 45574 - 46044 508 ## COG1225 Peroxiredoxin + Prom 46196 - 46255 3.4 47 21 Op 1 4/0.444 + CDS 46297 - 46914 293 ## COG1142 Fe-S-cluster-containing hydrogenase components 2 48 21 Op 2 5/0.222 + CDS 46914 - 47585 506 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit + Prom 47618 - 47677 2.8 49 22 Op 1 10/0.000 + CDS 47699 - 48931 922 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit 50 22 Op 2 1/0.778 + CDS 49056 - 49889 782 ## COG0650 Formate hydrogenlyase subunit 4 51 23 Op 1 1/0.778 + CDS 49992 - 50735 553 ## COG1009 NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit 52 23 Op 2 3/0.667 + CDS 50555 - 51136 254 ## COG1009 NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit 53 23 Op 3 7/0.000 + CDS 51148 - 51798 607 ## COG4237 Hydrogenase 4 membrane component (E) 54 23 Op 4 7/0.000 + CDS 51803 - 53395 1296 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit 55 23 Op 5 5/0.222 + CDS 53385 - 55052 1712 ## COG3261 Ni,Fe-hydrogenase III large subunit 56 23 Op 6 6/0.000 + CDS 55062 - 55607 372 ## COG1143 Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) 57 23 Op 7 . + CDS 55604 - 56005 86 ## COG3260 Ni,Fe-hydrogenase III small subunit 58 23 Op 8 . + CDS 55932 - 56360 223 ## COG3260 Ni,Fe-hydrogenase III small subunit 59 23 Op 9 . 
+ CDS 56353 - 56766 391 ## SSON_2571 putative protein processing element 60 23 Op 10 1/0.778 + CDS 56796 - 58808 1529 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains 61 23 Op 11 . + CDS 58830 - 59678 392 ## COG2116 Formate/nitrite family of transporters + Term 59683 - 59726 11.5 - Term 59671 - 59712 7.3 62 24 Tu 1 . - CDS 59716 - 60777 1372 ## COG0628 Predicted permease - Prom 60847 - 60906 4.5 + Prom 60862 - 60921 3.1 63 25 Op 1 6/0.000 + CDS 60990 - 62453 1692 ## COG4783 Putative Zn-dependent protease, contains TPR repeats 64 25 Op 2 . + CDS 62474 - 62833 533 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family 65 26 Tu 1 . - CDS 62971 - 63738 638 ## COG0593 ATPase involved in DNA replication initiation Predicted protein(s) >gi|296493388|gb|ADTK01000113.1| GENE 1 3 - 815 854 270 aa, chain + ## HITS:1 COG:yfeW KEGG:ns NR:ns ## COG: yfeW COG1680 # Protein_GI_number: 16130355 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Escherichia coli K12 # 10 270 203 463 463 526 99.0 1e-149 KSVSWQHHPGALYSQDKGQTLEMIKRTPLEYQPGSKHIYSDVDYMLLGFIVESVTGQPLD RYVEESIYRPLGLTHTVFNPLLKGFKPQQIAATELNGNTRDGVIHFPNIRTSTLWGQVHD EKAFYSMGGVSGHAGLFSNTGDIAVLMQTMLNGGGYGDVQLFSAETVKMFTTSSKEDATF GLGWRVNGNATMTPTFGTLASPQTYGHTGWTGTVTVIDPVNHMAIVMLSNKPHSPVADPQ KNPNMFESGQLPIATYGWVVDQVYAALKQK >gi|296493388|gb|ADTK01000113.1| GENE 2 873 - 1772 1118 299 aa, chain - ## HITS:1 COG:yfeX KEGG:ns NR:ns ## COG: yfeX COG2837 # Protein_GI_number: 16130356 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted iron-dependent peroxidase # Organism: Escherichia coli K12 # 1 299 10 308 308 615 99.0 1e-176 MSQVQSGILPEHCRAAIWIEANVKGEVDALRAASKTFADKLATFEAKFPDAHLGAVVAFG NNTWRALSGGVGAEELKDFPGYGKGLAPTTQFDVLIHILSLRHDVNFSVAQAAMEAFGDC IEVKEEIHGFRWVEERDLSGFVDGTENPAGEETRREVAVIKDGVDAGGSYVFVQRWEHNL KQLNRMSVHDQEMMIGRTKEANEEIDGDERPETSHLTRVDLKEDGKGLKIVRQSLPYGTA 
SGTHGLYFCAYCARLHNIEQQLLSMFGDTDGKRDAMLRFTKPVTGGYYFAPSLDKLMAL >gi|296493388|gb|ADTK01000113.1| GENE 3 1868 - 2443 440 191 aa, chain - ## HITS:1 COG:no KEGG:SDY_2630 NR:ns ## KEGG: SDY_2630 # Name: not_defined # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 191 1 191 191 378 100.0 1e-104 MKSLRLMLCAMPLMLTGCSTMSSVNWSAANPWNWFGSSTKVSEQGVGELTASTPLQEQAI ADALDGDYRLRSGMKTANGNVVRFFEVMKGDNVAMVINGDQGTISRIDVLDSDIPADTGV KIGTPFSDLYSKAFGNCQKADGDDNRAVECKAEGSQHISYQFSGEWSGPEGLMPSDDTLK NWKVSKIIWRR >gi|296493388|gb|ADTK01000113.1| GENE 4 2504 - 2953 271 149 aa, chain - ## HITS:1 COG:no KEGG:G2583_2964 NR:ns ## KEGG: G2583_2964 # Name: yfeZ # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 149 3 151 151 239 100.0 3e-62 MKSTEFHPVHYDAHGRLRLPLLFWLVLLLQARTWVLFVIAGASREQGTALLNLFYPDHDN FWLGLIPGIPAVLAFLLSGRRATFPRTWRVLYFLLLLAQVVLLCWQPWLWLNGESVSGIG LALVVADIVALIWLLTNRRLRACFNEVKE >gi|296493388|gb|ADTK01000113.1| GENE 5 2940 - 3365 559 141 aa, chain - ## HITS:1 COG:ECs3305 KEGG:ns NR:ns ## COG: ECs3305 COG0456 # Protein_GI_number: 15832559 # Func_class: R General function prediction only # Function: Acetyltransferases # Organism: Escherichia coli O157:H7 # 1 141 38 178 178 293 100.0 7e-80 MEIRVFRQEDFEEVITLWERCDLLRPWNDPEMDIERKMNHDVSLFLVAEVNGEVVGTVMG GYDGHRGSAYYLGVHPEFRGRGIANALLNRLEKKLIARGCPKIQINVPEDNDMVLGMYER LGYEHADVLSLGKRLIEDEEY >gi|296493388|gb|ADTK01000113.1| GENE 6 3579 - 4448 815 289 aa, chain + ## HITS:1 COG:ECs3306 KEGG:ns NR:ns ## COG: ECs3306 COG0860 # Protein_GI_number: 15832560 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Escherichia coli O157:H7 # 1 289 1 289 289 532 100.0 1e-151 MSTFKPLKTLTSRRQVLKAGLAALTLSGMSQAIAKDEPLKTSNGHSKPKAKKSGGKRVVV LDPGHGGIDTGAIGRNGSKEKHVVLAIAKNVRSILRNHGIDARLTRSGDTFIPLYDRVEI AHKHGADLFMSIHADGFTNPKAAGASVFALSNRGASSAMAKYLSERENRADEVAGKKATD KDHLLQQVLFDLVQTDTIKNSLTLGSHILKKIKPVHKLHSRNTEQAAFVVLKSPSVPSVL 
VETSFITNPEEERLLGTAAFRQKIATAIAEGVISYFHWFDNQKAHSKKR >gi|296493388|gb|ADTK01000113.1| GENE 7 4452 - 5351 690 299 aa, chain + ## HITS:1 COG:hemF KEGG:ns NR:ns ## COG: hemF COG0408 # Protein_GI_number: 16130361 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase # Organism: Escherichia coli K12 # 1 299 1 299 299 620 98.0 1e-178 MKPDAHQVKQFLLNLQDTICQQLTAVDGAEFVEDSWQREAGGGGRSRVLRNGGVFEQAGV NFSHVHGEAMPASATAHRPELAGRSFEAMGVSLVVHPHNPYVPTSHANVRFFIAEKPGAE PVWWFGGGFDLTPFYGFEEDAIHWHRTARDLCLPFGEDVYPRYKKWCDDYFYLKHRNEQR GIGGLFFDDLNTPDFDHCFAFMQAVGKGYTDAYLPIVERRKAMAYGERERNFQLYRRGRY VEFNLVWDRGTLFGLQTGGRTESILMSMPPLVRWEYDYQPKDGSPEAALSEFIKVRDWV >gi|296493388|gb|ADTK01000113.1| GENE 8 5357 - 6409 933 350 aa, chain - ## HITS:1 COG:ECs3308 KEGG:ns NR:ns ## COG: ECs3308 COG2207 # Protein_GI_number: 15832562 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 350 1 350 350 711 99.0 0 MKKTRTANLHHLYHEPLPENLKLTPKVEVDNVHQRQTTDVYEHALTITAWQQIYDQLHPG KFHGEFTEILLDDIQVFREYTGLALRQSCLVWPNSFWFGIPATRGEQGFIGSQCLGSAEI ATRPGGTEFELSTPDDYTILGVVLSEDVITRQANFLHNPDRVLHMLRSQSALEVKEQHKA ALWGFVQQALATFCENPENLHQPAVRKVLGDNLLMAMGAMLEDAQPMVTAESISHQSYRR LLSRAREYVLENMSEPVTVLDLCNQLHVSRRTLQNAFHAILGIGPNAWLKRIRLNAVRRE LISPWSQSTTVKDAAMQWGFWHLGQFATDYQQLFAEKPSLTLHQRMREWG >gi|296493388|gb|ADTK01000113.1| GENE 9 6455 - 6955 429 166 aa, chain - ## HITS:1 COG:eutK KEGG:ns NR:ns ## COG: eutK COG4577 # Protein_GI_number: 16130363 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Escherichia coli K12 # 1 166 3 168 168 288 100.0 3e-78 MINALGLLEVDGMVAAIDAADAMLKAANVRLLSHEVLDPGRLTLVVEGDLAACRAALDAG CAAAMRTGRVISRKEIGRPDDDTQWLVTGFNRQPKQPVREPDAPVIVAESADELLALLTS VRQGMTAGEVAAHFGWPLEKARNALEQLFSAGTLRKRSSRYRLKPH >gi|296493388|gb|ADTK01000113.1| GENE 10 6968 - 7627 771 219 aa, chain - 
## HITS:1 COG:ECs3310 KEGG:ns NR:ns ## COG: ECs3310 COG4816 # Protein_GI_number: 15832564 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Escherichia coli O157:H7 # 1 219 1 219 219 374 98.0 1e-104 MPALDLIRPSVTAMRVIASVNAEFARELKLPPHIRSLGLISADSDDVTYIAADEATKQAM VEVVYGRSLYAGAAHGPSPTAGEVLIMLGGPNPAEVRAGLDAMVAHIENGAAFQWANDAE NTAFLAHVVSRTGSYLSSTAGITLGDPMAYLVAPPLEATYGIDAALKSADVQLVTYVPPP SETNYSAAFLTGSQAACKAACNAFTDAVLEIARNPIQRA >gi|296493388|gb|ADTK01000113.1| GENE 11 7637 - 8524 1069 295 aa, chain - ## HITS:1 COG:eutC KEGG:ns NR:ns ## COG: eutC COG4302 # Protein_GI_number: 16130365 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine ammonia-lyase, small subunit # Organism: Escherichia coli K12 # 1 295 1 295 295 554 99.0 1e-158 MDQKQIEEIVRSVMASMGQTAPAPSEAKCATTNCAAPVTSESCALDLGSAEAKAWIGVEN PHRADVLTELRRSTVARVCTGRAGPRPRTQALLRFLADHSRSKDTVLKEVPEEWVKAQGL LEVRSEISDKNLYLTRPDMGRRLCAEAVEALKAQCVANPDVQVVISDGLSTDAITVNYEE ILPPLMAGLKQAGLKVGTPFFVRYGRVKIEDQIGEILGAKVVILLVGERPGLGQSESLSC YAVYSPRMATTVEADRTCISNIHQGGTPPVEAAAVIVDLAKRMLEQKASGINMTR >gi|296493388|gb|ADTK01000113.1| GENE 12 8545 - 9906 1464 453 aa, chain - ## HITS:1 COG:eutB KEGG:ns NR:ns ## COG: eutB COG4303 # Protein_GI_number: 16130366 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine ammonia-lyase, large subunit # Organism: Escherichia coli K12 # 1 453 15 467 467 922 99.0 0 MKLKTTLFSNVYQFKDVKEVLAKANELRSGDVLAGVAAASSQERVAAKQVLSEMTVADIR NNPVIAYEDDCVTRLIQDDVNETAYNQIKNWSISELREYVLSDETSVDDIAFTRKGLTSE VVAAVAKICSNADLIYGAKKMPVIKKANTTIGIPGTFSARLQPNDTRDDVQSIAAQIYEG LSFGVGDAVIGVNPVTDDVENLSRVLDTIYGVIDKFNIPTQGCVLAHVTTQIEAIRRGAP GGLIFQSICGSEKGLKEFGVELAMLDEARAVGAEFNRIAGENCLYFETGQGSALSAGANF GADQVTMEARNYGLARHYDPFIVNTVVGFIGPEYLYNDRQIIRAGLEDHFMGKLSGISMG CDCCYTNHADADQNLNENLMILLATAGCNYIMGMPLGDDIMLNYQTTAFHDTATVRQLLN LRPSPEFERWLESMGIMANGRLTKRAGDPSLFF >gi|296493388|gb|ADTK01000113.1| GENE 13 9918 - 11321 1288 467 aa, chain - ## HITS:1 COG:ECs3313 
KEGG:ns NR:ns ## COG: ECs3313 COG4819 # Protein_GI_number: 15832567 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein, possible chaperonin protecting lyase from inhibition # Organism: Escherichia coli O157:H7 # 1 467 1 467 467 884 99.0 0 MNTRQLLSVGIDIGTTTTQVIFSHLELVNRAAVSQVPRYEFIKREISWQSPVFFTPVDKQ GGLKEAELKSLILEQYQAAGIAPESVDSGAIIITGESAKTRNARPAVMALSQSLGDFVVA SAGPHLESVIAGHGAGAQTLSEQRLCRVLNIDIGGGTANYALFDAGKISGTACLNVGGRL LETDSQGRVVYAHKPGQMIVDECFGAGTDARSLTGAQLVQVTRRMAALIVEVIDGTLSPL AQALMQTGLLPAGVTPEIITLSGGVGECYRHQPADPFCFADIGPLLATALHDHPRLREMN VQFPAQTVRATVIGAGAHTLSLSGSTIWLEGVQLPLRNLPVTIPIDETDLVSAWQQALLQ LDLDPKTDAYVLALPASLPVRYAAVLTVINALVDFVARFPNPHPLLVVAGQDFGKALGML LRPQLQQLPLAVIDEVIVRAGDYIDIGTPLFGGSVVPVTVKSLAFPS >gi|296493388|gb|ADTK01000113.1| GENE 14 11318 - 12544 1592 408 aa, chain - ## HITS:1 COG:eutH KEGG:ns NR:ns ## COG: eutH COG3192 # Protein_GI_number: 16130377 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Escherichia coli K12 # 1 408 1 408 408 663 100.0 0 MGINEIIMYIMMFFMLIAAVDRILSQFGGSARFLGKFGKSIEGSGGQFEEGFMAMGALGL AMVGMTALAPVLAHVLGPVIIPVYEMLGANPSMFAGTLLACDMGGFFLAKELAGGDVAAW LYSGLILGSMMGPTIVFSIPVALGIIEPSDRRYLALGVLAGIVTIPIGCIAGGLVAMYSG VQINGQPVEFTFALILMNMIPVIIVAILVALGLKFIPEKMINGFQIFAKFLVALITLGLA AAVVKFLLGWELIPGLDPIFMAPGDKPGEVMRAIEVIGSISCVLLGAYPMVLLLTRWFEK PLMSVGKVLNMNNIAAAGMVATLANNIPMFGMMKQMDTRGKVINCAFAVSAAFALGDHLG FAAANMNAMIFPMIVGKLIGGVTAIGVAMMLVPKEDATATKTEAEAQS >gi|296493388|gb|ADTK01000113.1| GENE 15 12661 - 13848 1193 395 aa, chain - ## HITS:1 COG:ECs3315 KEGG:ns NR:ns ## COG: ECs3315 COG1454 # Protein_GI_number: 15832569 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Escherichia coli O157:H7 # 1 395 10 404 404 711 98.0 0 MQNELQTALFQAFDTLNLQRVKTFSVPPVTLCGLGAVSSCGQQAQTRGLKHLFVMADSFL HQAGMTAGLTRSLAVKGIAMTLWLCPVGEPCITDVCAAVAQLRESGCDGVIAFGGGSVLD 
AAKAVALLVTNPDSTLAEMSETSVLQPRLPLIAIPTTAGTGSETTNVTVIIDAVSGRKQV LAHASLMPDVAILDAALTEGVPSHVTAMTGIDALTHAIEAYSALNATPFTDSLAIGAIAM IGKSLPKAVGYGHDLAARESMLLASCMAGMAFSSAGLGLCHAMAHQPGAALHIPHGLANA MLLPTVMEFNRMVCRERFSQIGRALRTKKSDDRDAINAVSELIAEVGIGKRLGDVGATSA HYGAWAQAAQEDICLRSNPRTASLEQIVGLYAAAQ >gi|296493388|gb|ADTK01000113.1| GENE 16 13838 - 14674 929 278 aa, chain - ## HITS:1 COG:eutJ KEGG:ns NR:ns ## COG: eutJ COG4820 # Protein_GI_number: 16130379 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein, possible chaperonin # Organism: Escherichia coli K12 # 1 278 1 278 278 562 99.0 1e-160 MAHDEQWLTPRLQTAATLCNQTPAATESPLWLGVDLGTCDVVSMVVDRDGQPVAVCLDWA DVVRDGIVWDFFGAVTIVRRHLDTLEQQFGRRFSHAATSFPPGTDPRISINVLESAGLEV SHVLDEPTAVADLLQLDNAGVVDIGGGTTGIAIVKKGKVTYSADEATGGHHISLTLAGNR RISLEEAEQYKRGHGDEIWPAVKPVYEKMADIVARHIEGQGITDLWLAGGSCMQPGVAEL FRKQFPALQVHLPQHSLFMTPLAIASSGREKAEGLYAK >gi|296493388|gb|ADTK01000113.1| GENE 17 14685 - 16088 838 467 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P [Lactobacillus reuteri DSM 20016] # 1 454 1 461 477 327 40 1e-88 MNQQDIEQVVKAVLLKMQSSDTPSAAVHEMGVFASLDDAVAAAKVAQQGLKSVAMRQLAI AAIREAGEKHARDLAELAVSETGMGRVEDKFAKNVAQARGTPGVECLSPQVLTGDNGLTL IENAPWGVVASVTPSTNPAATVINNAISLIAAGNSVIFAPHPAAKKVSQRAITLLNQAIV AAGGPENLLVTVANPDIETAQRLFKFPGIGLLVVTGGEAVVEAARKHTNKRLIAAGAGNP PVVVDETADLARAAQSIVKGASFDNNIICADEKVLIVVDSVADELMRLMEGQHAVKLTAE QAQQLQPVLLKNIDERGKGTVSRDWVGRDAAKIAAAIGLNVPQETRLLFVETTAEHPFAV TELMMPVLPVVRVANVADAIALAVKLEGGCHHTAAMHSRNIENMNQMANAIDTSIFVKNG PCIAGLGLGGEGWTTMTITTPTGEGVTSARTFVRLRRCVLVDAFRIV >gi|296493388|gb|ADTK01000113.1| GENE 18 16100 - 16387 253 95 aa, chain - ## HITS:1 COG:cchB KEGG:ns NR:ns ## COG: cchB COG4576 # Protein_GI_number: 16130381 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Escherichia coli K12 # 1 95 1 
95 95 176 100.0 9e-45 MKLAVVTGQIVCTVRHHGLAHDKLLMVEMIDPQGNPDGQCAVAIDNIGAGTGEWVLLVSG SSARQAHKSETSPVDLCVIGIVDEVVSGGQVIFHK >gi|296493388|gb|ADTK01000113.1| GENE 19 16494 - 16787 467 97 aa, chain - ## HITS:1 COG:ECs3319 KEGG:ns NR:ns ## COG: ECs3319 COG4577 # Protein_GI_number: 15832573 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Escherichia coli O157:H7 # 1 97 15 111 111 153 100.0 9e-38 MEALGMIETRGLVALIEASDAMVKAARVKLVGVKQIGGGLCTAMVRGDVAACKAATDAGA AAAQRIGELVSVHVIPRPHGDLEEVFPIGLKGDSSNL >gi|296493388|gb|ADTK01000113.1| GENE 20 16826 - 17842 871 338 aa, chain - ## HITS:1 COG:eutI KEGG:ns NR:ns ## COG: eutI COG0280 # Protein_GI_number: 16130383 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Escherichia coli K12 # 1 338 1 338 338 622 99.0 1e-178 MIIERCRELALRAPARVVFPDALDQRVLKAAQYLHQQGLATPILVANPFELRQFALSHGV AMDGLQVIDPHGNLAMREEFAHRWLARAGEKTPPDALEKLTDPLMFAAAMVSAGKADVCI AGNLSSTANVLRAGLRIIGLQPGCKTLSSIFLMLPQYSGPALGFADCSVVPQPTAAQLAD IALASAETWRAITGEEPRVAMLSFSSNGSARHPCVANVQQATEIVRERAPKLVVDGELQF DAAFVPEVAAQKAPASPLQGKANVMVFPSLEAGNIGYKIAQRLGGYRAVGPLIQGLAAPM HDLSRGCSVQEIIELALVAAVPRQTKVNRESSLQTLVE >gi|296493388|gb|ADTK01000113.1| GENE 21 17839 - 18642 820 267 aa, chain - ## HITS:1 COG:eutT KEGG:ns NR:ns ## COG: eutT COG4812 # Protein_GI_number: 16130384 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization cobalamin adenosyltransferase # Organism: Escherichia coli K12 # 1 267 1 267 267 508 100.0 1e-144 MKDFITEAWLRANHTLSEGAEIHLPADSRLTPSARELLESRHLRIKFIDEQGRLFVDDEQ QQPQPVHGLTSSDEHPQACCELCRQPVAKKPDTLTHLSAEKMVAKSDPRLGFRAVLDSTI ALAVWLQIELAEPWQPWLADIRSRLGNIMRADALGEPLGCQAIVGLSDEDLHRLSHQPLR YLDHDHLVPEASHGRDAALLNLLRTKVRETETVAAQVFITRSFEVLRPDILQALNRLSST VYVMMILSVTKQPLTVKQIQQRLGETQ >gi|296493388|gb|ADTK01000113.1| GENE 22 18639 - 19340 755 233 aa, chain - ## HITS:1 
COG:eutQ KEGG:ns NR:ns ## COG: eutQ COG4766 # Protein_GI_number: 16130385 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Escherichia coli K12 # 1 233 1 233 233 454 99.0 1e-128 MKKLITANDIREAHARGEQAMSVVLRASIITPEAREVADLLGFTITECDESIPVTASVPA SVPADKTESQRIRETIIAQLPEGQFTESLVAQLMEKVMKEKQSLEQGAMQPSFKSVTGKG GIKVIDGSSVKFGRFDGAEPHCVGLTDLVTGDDGSSMAAGFMQWENAFFPWTLNYDEIDM VLEGELHVRHEGETMIAKAGDVMFIPKGSSIEFGTTSSVKFLYVAWPANWQSL >gi|296493388|gb|ADTK01000113.1| GENE 23 19315 - 19794 383 159 aa, chain - ## HITS:1 COG:ECs3323 KEGG:ns NR:ns ## COG: ECs3323 COG4917 # Protein_GI_number: 15832577 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Escherichia coli O157:H7 # 1 159 1 159 159 315 99.0 2e-86 MKRIAFVGSVGAGKTTLFNALQGNYTLARKTQAVEFNDKGDIDTPGEYFSHPRWYHALIT TLQDVDMLIYVHGANDPESRLPAGLLDIGVSKRQIAVISKTDMPDADVAATRKLLLETGF EEPIFELNSHDPQSVQQLVDYLASLTKQEEAGEKTHHSE >gi|296493388|gb|ADTK01000113.1| GENE 24 19807 - 20142 396 111 aa, chain - ## HITS:1 COG:ECs3324 KEGG:ns NR:ns ## COG: ECs3324 COG4810 # Protein_GI_number: 15832578 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Escherichia coli O157:H7 # 1 111 25 135 135 213 100.0 1e-55 MDKERIIQEFVPGKQVTLAHLIAHPGEELAKKIGVPDAGAIGIMTLTPGETAMIAGDLAL KAADVHIGFLDRFSGALVIYGSVGAVEEALSQTVSGLGRLLNYTLCEMTKS >gi|296493388|gb|ADTK01000113.1| GENE 25 20435 - 22714 2559 759 aa, chain - ## HITS:1 COG:maeB_1 KEGG:ns NR:ns ## COG: maeB_1 COG0281 # Protein_GI_number: 16130388 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Escherichia coli K12 # 1 434 1 434 434 849 100.0 0 MDDQLKQSALDFHEFPVPGKIQVSPTKPLATQRDLALAYSPGVAAPCLEIEKDPLKAYKY TARGNLVAVISNGTAVLGLGNIGALAGKPVMEGKGVLFKKFAGIDVFDIEVDELDPDKFI EVVAALEPTFGGINLEDIKAPECFYIEQKLRERMNIPVFHDDQHGTAIISTAAILNGLRV VEKNISDVRMVVSGAGAAAIACMNLLVALGLQKHNIVVCDSKGVIYQGREPNMAETKAAY 
AVVDDGKRTLDDVIEGADIFLGCSGPKVLTQEMVKKMARAPMILALANPEPEILPPLAKE VRPDAIICTGRSDYPNQVNNVLCFPFIFRGALDVGATAINEEMKLAAVRAIAELAHAEQS EVVASAYGDQDLSFGPEYIIPKPFDPRLIVKIAPAVAKAAMESGVATRPIADFDVYIDKL TEFVYKTNLFMKPIFSQARKAPKRVVLPEGEEARVLHATQELVTLGLAKPILIGRPNVIE MRIQKLGLQIKAGVDFEIVNNESDPRFKEYWTEYFQIMKRRGVTQEQAQRALISNPTVIG AIMVQRGEADAMICGTVGDYHEHFSVVKNVFGYRDGVHTAGAMNALLLPSGNTFIADTYV NDEPDAEELAEITLMAAETVRRFGIEPRVALLSHSNFGSSDCPSSSKMRQALELVRDRAP DLMIDGEMHGDAALVEAIRNDRMPDSPLKGSANILVMPNMEAARISYNLLRVSSSEGVTV GPVLMGVAKPVHVLTPIASVRRIVNMVALAVVEAQTQPL >gi|296493388|gb|ADTK01000113.1| GENE 26 23003 - 23953 915 316 aa, chain + ## HITS:1 COG:ECs3326 KEGG:ns NR:ns ## COG: ECs3326 COG0176 # Protein_GI_number: 15832580 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Escherichia coli O157:H7 # 1 316 1 316 316 635 100.0 0 MNELDGIKQFTTVVADSGDIESIRHYHPQDATTNPSLLLKAAGLSQYEHLIDDAIAWGKK NGKTQEQQVVAACDKLAVNFGAEILKIVPGRVSTEVDARLSFDKEKSIEKARHLVDLYQQ QGVEKSRILIKLASTWEGIRAAEELEKEGINCNLTLLFSFAQARACAEAGVFLISPFVGR IYDWYQARKPMDPYVVEEDPGVKSVRNIYDYYKQHHYETIVMGASFRRTEQILALTGCDR LTIAPNLLKELQEKVSPVVRKLIPPSQTFPRPAPMSEAEFRWEHNQDAMAVEKLSEGIRL FAVDQRKLEDLLAAKL >gi|296493388|gb|ADTK01000113.1| GENE 27 23973 - 25976 2274 667 aa, chain + ## HITS:1 COG:tktB KEGG:ns NR:ns ## COG: tktB COG0021 # Protein_GI_number: 16130390 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase # Organism: Escherichia coli K12 # 1 667 1 667 667 1359 99.0 0 MSRKDLANAIRALSMDAVQKANSGHPGAPMGMADIAEVLWNDFLKHNPTDPTWYDRDRFI LSNGHASMLLYSLLHLTGYDLPLEELKNFRQLHSKTPGHPEIGYTPGVETTTGPLGQGLA NAVGLAIAERTLAAQFNQPDHEIVDHFTYVFMGDGCLMEGISHEVCSLAGTLGLGKLIGF YDHNGISIDGETEGWFTDDTAKRFEAYHWHVIHEIDGHDPQAVKEAILEAQSVKDKPSLI ICRTVIGFGSPNKAGKEEAHGAPLGEEEVALARQKLGWHHPPFEIPKKIYHAWDAREKGE KAQQSWNEKFAAYKKAHPQLAEEFTRRMSGGLPKDWEKTTQKYINELQANPAKIATRKAS QNTLNAYGPMLPELLGGSADLAPSNLTIWKGSVSLKEDPAGNYIHYGVREFGMTAIANGI AHHGGFVPYTATFLMFVEYARNAARMAALMKARQIMVYTHDSIGLGEDGPTHQAVEQLAS 
LRLTPNFSTWRPCDQVEAAVGWKLAVERHNGPTALILSRQNLAQVERTPDQVKEIARGGY VLKDSGGKPDIILIATGSEMEITLQAAEKLAGEGRNVRVVSLPSTDIFDAQDEEYRESVL PSNVAARVAVEAGIADYWYKYVGLKGAIVGMTGYGESAPADKLFPFFGFTAENIVAKAHK VLGVKGA >gi|296493388|gb|ADTK01000113.1| GENE 28 26072 - 27115 733 347 aa, chain - ## HITS:1 COG:no KEGG:ECO103_2975 NR:ns ## KEGG: ECO103_2975 # Name: ypfG # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 347 1 347 347 672 100.0 0 MRYRIFLLFFFALLPTSLVWAAPAQRAFSDWQVTCNNQNFCVARNTGDHNGLVMTLSRSA GAHTDAVLRIERGGLKSPDASEGEIAPRLLLDGEPLALSGDKWRISPWLLVTDDTATITA FLQMIQEGKAITLRDGNQTISLSGLKAALLFIDAQQKRVGSETAWIKKGDEPPLSVPPAP ALKEVAVVNPTPTPLSLEERNDLLDYGNWRMNGLRCSLDPLRREVNVTALTDDKALMMIS CEAGAYNTIDLAWIVSRKKPLASRPVRLRLPFNNGQETNELELMNATFDEKSRELVTLAK GRGLSDCGIQARWRFDGQRFRLVRYAAEPTCDNWHGPDAWPTLWITR >gi|296493388|gb|ADTK01000113.1| GENE 29 27241 - 27816 691 191 aa, chain - ## HITS:1 COG:yffH KEGG:ns NR:ns ## COG: yffH COG0494 # Protein_GI_number: 16130392 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli K12 # 1 191 1 191 191 370 98.0 1e-103 MTQQITLIKDKILSDNYFTLHNITYDLTRKDGEVIRHKREVYDRGNGATILLYNAKKKTV VLIRQFRVATWVNGNESGQLIETCAGLLDNDEPEVCIRKEAIEETGYEVGEVRKLFELYM SPGGVTELIHFFIAEYSDNQRANAGGGVEDEDIEVLELPFSQALEMIKTGEIRDGKTVLL LNYLQTSHLMD >gi|296493388|gb|ADTK01000113.1| GENE 30 27884 - 29863 1449 659 aa, chain - ## HITS:1 COG:aegA_2 KEGG:ns NR:ns ## COG: aegA_2 COG0493 # Protein_GI_number: 16130393 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Escherichia coli K12 # 181 659 1 479 479 997 99.0 0 MNRFIMANSQQCLGCHACEIACVMAHNDEQHVLSQHHFHPRITVIKHQQQRSAVTCHHCE DAPCARSCPNGAISHVDDSIQVNQQKCIGCKSCVVACPFGTMQIVLTPVAAGKVKATAHK CDLCAGRENGPACVENCPADALQLVTDVALSGMAKSRRLRTARQEHQPWHASTAAQEMPV 
MSKVEQMQATPARGEPDKLAIEARKTGFDEIYLPFRADQAQREASRCLKCGEHSVCEWTC PLHNHIPQWIELVKAGNIDAAVELSHQTNTLPEITGRVCPQDRLCEGACTIRDEHGAVTI GNIERYISDQALAKGWRPDLSHVTKVDKRVAIIGAGPAGLACADVLTRNGVAVTVYDRHP EIGGLLTFGIPSFKLDKSLLARRREIFSAMGIHFELNCEVGKDVSLDSLLEQYDAVFVGV GTYRSMKAGLPNEDAPGVYDALPFLIANTKQVMGLEELPEEPFINTAGLNVVVLGGGDTA MDCVRTALRHGASNVTCAYRRDEANMPGSKKEVKNAREEGANFEFNVQPVALELNEQGHV CGIRFLRTRLGEPDAQGRRRPVPIEGSEFVMPADAVIMAFGFNPHGMPWLESHGVTVDKW GRIIADVESQYRYQTTNPKIFAGGDAVRGADLVVTAMAEGRHAAQGIIDWLGVKSVKSH >gi|296493388|gb|ADTK01000113.1| GENE 31 30069 - 31769 1361 566 aa, chain + ## HITS:1 COG:narQ KEGG:ns NR:ns ## COG: narQ COG3850 # Protein_GI_number: 16130394 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase, nitrate/nitrite-specific # Organism: Escherichia coli K12 # 1 566 1 566 566 1075 99.0 0 MIVKRPVSASLARAFFYIVLLSILSTGIALLTLASSLRDAEAINIAGSLRMQSYRLGYDL QSGSPQLNAHRQLFQQALHSPVLTNLNVWYVPEAVKTRYAHLNANWLEMNNRLSKGDLPW YQANINNYVNQIDLFVLALQHYAERKMLLVVAISLAGGIGIFTLVFFTLRRIRHQVVAPL NQLVTASQRIEHGQFDSPPLDTSLPNELGLLAKTFNQMSSELHKLYLSLEASVEEKTRDL HEAKRRLEVLYQCSQALNTSQIDVHCFRHILQIVRDNEAAEYLELNVGENWRISEGQPNP ELPMQILPVTMQETVYGELHWQNSHVSSSEPLLNSVSSMLGRGLYFNQAQKHFQQLLLME ERATIARELHDSLAQVLSYLRIQLTLLKRSIPEDNATAQSIMADFSQALNDAYRQLRELL TTFRLTLQQADLPSALREMLDTLQNQTSAKLTLDCRLPTLALDAQMQVHLLQIIREAVLN AMKHANASEIAVSCVTAPDGNHTVYIRDNGIGIGEPKEPEGHYGLNIMRERAERLGGTLT FSQPSGGGTLVSISFRSAEGEESQLM >gi|296493388|gb|ADTK01000113.1| GENE 32 31933 - 35046 3233 1037 aa, chain + ## HITS:1 COG:acrD KEGG:ns NR:ns ## COG: acrD COG0841 # Protein_GI_number: 16130395 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Escherichia coli K12 # 1 1037 1 1037 1037 2040 99.0 0 MANFFIDRPIFAWVLAILLCLTGTLAIFSLPVEQYPDLAPPNVRVTANYPGASAQTLENT VTQVIEQNMTGLDNLMYMSSQSSGTGQASVTLSFKAGTDPDEAVQQVQNQLQSAMRKLPQ AVQNQGVTVRKTGDTNILTIAFVSTDGSMDKQDIADYVASNIQDPLSRVNGVGDIDAYGS QYSMRIWLDPAKLNSFQMTAKDVTDAIESQNAQIAVGQLGGTPSVDKQALNATINAQSLL 
QTPEQFRDITLRVNQDGSEVRLGDVATVEMGAEKYDYLSRFNGKPASGLGVKLASGANEM ATAELVLNRLDELAQYFPHGLEYKVAYETTSFVKASIEDVVKTLLEAIALVFLVMYLFLQ NFRATLIPTIAVPVVLMGTFSVLYAFGYSVNTLTMFAMVLAIGLLVDDAIVVVENVERIM SEEGLTPREATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGTTGAIYRQFSITIVTAMVL SVLVAMILTPALCATLLKPLKKGEHHGQKGFFAWFNQMFNRNAERYEKGVAKILHRSLRW IVIYVLLLGGMVFLFLRLPTSFLPLEDRGMFTTSVQLPSGSTQQQTLKVVEQIEKYYFTH EKDNIMSVFATVGSGPGGNGQNVARMFIRLKDWSERDSKTGTSFAIIERATKAFNKIKEA RVIASSPPAISGLGSSAGFDMELQDHAGAGHDALMAARNQLLALAAENPELTRVRHNGLD DSPQLQIDIDQRKAQALGVDIDDINDTLQTAWGSSYVNDFMDRGRVKKVYVQAAAPYRML PDDINLWYVRNKDGGMVPFSAFATSRWETGSPRLERYNGYSAVEIVGEAAPGVSTGTAMD IMESLVKQLPNGFGLEWTAMSYQERLSGAQAPALYAISLLVVFLCLAALYESWSVPFSVM LVVPLGVIGALLATWMRGLENDVYFQVGLLTVIGLSAKNAILIVEFANEMNQKGHDLFEA TLHACRQRLRPILMTSLAFIFGVLPMATSTGAGSGGQHAVGTGVMGGMISATILAIYFVP LFFVLVRRRFPLKPRPE >gi|296493388|gb|ADTK01000113.1| GENE 33 35585 - 35941 322 118 aa, chain + ## HITS:1 COG:ECs3333 KEGG:ns NR:ns ## COG: ECs3333 COG1393 # Protein_GI_number: 15832587 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Escherichia coli O157:H7 # 1 118 1 118 118 243 98.0 5e-65 MVTLYGIKNCDTIKKARRWLEANNIDYRFHDYRVDGLDSELLNGFINELGWEALLNTRGT TWRKLDETTRNKITDAASAAALMTEMPAIIKRPLLCAPGKPMLLGFSDSSYQQFFHEV >gi|296493388|gb|ADTK01000113.1| GENE 34 35945 - 37072 1014 375 aa, chain + ## HITS:1 COG:dapE KEGG:ns NR:ns ## COG: dapE COG0624 # Protein_GI_number: 16130397 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Escherichia coli K12 # 1 375 1 375 375 778 100.0 0 MSCPVIELTQQLIRRPSLSPDDAGCQALLIERLQAIGFTVERMDFADTQNFWAWRGQGET LAFAGHTDVVPPGDADRWINPPFEPTIRDGMLFGRGAADMKGSLAAMVVAAERFVAQHPN HTGRLAFLITSDEEASAHNGTVKVVEALMARNERLDYCLVGEPSSIEVVGDVVKNGRRGS LTCNLTIHGVQGHVAYPHLADNPVHRAAPFLNELVAIEWDQGNEFFPATSMQIANIQAGT GSNNVIPGELFVQFNFRFSTELTDEMIKAQVLALLEKHQLRYTVDWWLSGQPFLTARGKL 
VDAVVNAVEHYNEIKPQLLTTGGTSDGRFIARMGAQVVELGPVNATIHKINECVNAADLQ LLARMYQRIMEQLVA >gi|296493388|gb|ADTK01000113.1| GENE 35 37100 - 37300 316 66 aa, chain + ## HITS:1 COG:no KEGG:G2583_2995 NR:ns ## KEGG: G2583_2995 # Name: ypfN # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 66 1 66 66 85 100.0 7e-16 MDWLAKYWWILVIVFLVGVLLNVIKDLKRVDHKKFLANKPELPPHRDFNDKWDDDDDWPK KDQPKK >gi|296493388|gb|ADTK01000113.1| GENE 36 37410 - 38108 794 232 aa, chain - ## HITS:1 COG:ypfH KEGG:ns NR:ns ## COG: ypfH COG0400 # Protein_GI_number: 16130398 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Escherichia coli K12 # 1 232 9 240 240 456 99.0 1e-128 MKHDHFVVQSPDKPAQQLLLLFHGVGDNPVAMGEIGSWFAPLFPDALVVSVGGAEPSGNP AGRQWFSVQGITEDNRQARVDAIMPTFIETVRYWQKQSGVGANATALIGFSQGAIMVLES IKAEPGLASRVIAFNGRYASLPETASTATTIHLIHGGEDPVIDLAHAVAAQEALISAGGD VTLDIVEDLGHAIDNRSMQFALDHLRYTIPKHYFDEALSGGKPGDDDVIEMM >gi|296493388|gb|ADTK01000113.1| GENE 37 38182 - 40158 1346 658 aa, chain - ## HITS:1 COG:ECs3336 KEGG:ns NR:ns ## COG: ECs3336 COG1444 # Protein_GI_number: 15832590 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase fused to an acetyltransferase # Organism: Escherichia coli O157:H7 # 1 658 14 671 671 1261 98.0 0 MKREGIRRLLVLSGEEGWCFDHALKLRDALPGDWLWISPQPDAENHCSPSALQTLLGREF RHAVFDARHGFDAAAFAALSGTLKAGSWLVLLLPVWEEWENQPDADSLRWSDCPDPIATP HFVQHLKRVLTADNDAILWRQNQPFSLAHFTPRTDWHPATGAPQPEQQQLLQQLLTMPPG VAAVTAARGRGKSALAGQLISRIAGSAIVTAPAKAATDVLAQFAGEKFRFIAPDALLASD EQADWLVVDEAAAIPAPLLHQLVSRFPRTLLTTTVQGYEGTGRGFLLKFCARFPHLHHFE LQQPIRWAQGCPLEKMVSEALVFDDENFTHTPQGNIVISAFEQTLWRSEPQTPLKVYQLL SGAHYRTSPLDLRRMMDAPGQHFLQAAGENEIAGALWLVDEGGLSQQLSQAVWAGFRRPR GNLVAQSLAAHGSNPLAATLRGRRVSRIAVHPARQREGVGQQLIASALQYRPGLDYLSVS FGYTGELWRFWQRCGFVLVRMGNHREASSGCYTAMALLPMSDAGKQLAEREHYRLRRDAQ ALAQWNGETLPVDPLNDAILSDDDWLELAGFAFAHRPLLTSLGCLMRLLQTSELALPALR GRLQKNVSDAQLCTTLKLSGRKMLLVRQREEAAQALFALNDVRTERLRDRITQWQFFH >gi|296493388|gb|ADTK01000113.1| GENE 38 
40212 - 41075 789 287 aa, chain - ## HITS:1 COG:ECs3337 KEGG:ns NR:ns ## COG: ECs3337 COG2321 # Protein_GI_number: 15832591 # Func_class: R General function prediction only # Function: Predicted metalloprotease # Organism: Escherichia coli O157:H7 # 1 287 1 287 287 535 99.0 1e-152 MRWQGRRESDNVEDRRNSSGGPSMGGPGFRLPSGKGGLILLLVVLVAGYYGVDLTGLMTG QPVSQQQSTRSISPNEDEAAKFTSVILATTEDTWGQQFEKMGKTYQQPKLVMYRGMTRTG CGAGQSIMGPFYCPADGTVYIDLSFYDDMKDKLGADGDFAQGYVIAHEVGHHVQKLLGIE PKVRQLQQNATQAEVNRLSVRMELQADCFAGVWGHNMQQQGVLETGDLEEALNAAQAIGD DRLQQQSQGRVVPDSFTHGTSQQRYSWFKRGFDSGDPAQCNTFGKSI >gi|296493388|gb|ADTK01000113.1| GENE 39 41222 - 41620 197 132 aa, chain - ## HITS:1 COG:PSLT106 KEGG:ns NR:ns ## COG: PSLT106 COG1487 # Protein_GI_number: 17233504 # Func_class: R General function prediction only # Function: Predicted nucleic acid-binding protein, contains PIN domain # Organism: Salmonella typhimurium LT2 # 1 132 1 132 132 251 93.0 3e-67 MLKFMLDTNICIFTIKNKPVHVRERFNLNSGRMCISSVTLMELIYGAEKSQMPERNLAVI EGFVSRLVVLDYDSAAATHTGQIRAELARQGRPVGPFDQMIAGHARSRGLIVVTNNTREF ERVAGIRLEDWS >gi|296493388|gb|ADTK01000113.1| GENE 40 41620 - 41847 263 75 aa, chain - ## HITS:1 COG:PSLT107 KEGG:ns NR:ns ## COG: PSLT107 COG4456 # Protein_GI_number: 17233505 # Func_class: S Function unknown # Function: Virulence-associated protein and related proteins # Organism: Salmonella typhimurium LT2 # 1 75 2 76 76 124 85.0 4e-29 METTVFLSNRSQAVRLPKAVALPEDVKKVEIIAIGRTRIITPAGESWDSWFDGENVSADF MDIRDQPAMQERESF >gi|296493388|gb|ADTK01000113.1| GENE 41 42001 - 42714 1122 237 aa, chain - ## HITS:1 COG:ECs3338 KEGG:ns NR:ns ## COG: ECs3338 COG0152 # Protein_GI_number: 15832592 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase # Organism: Escherichia coli O157:H7 # 1 237 1 237 237 464 100.0 1e-131 MQKQAELYRGKAKTVYSTENPDLLVLEFRNDTSAGDGARIEQFDRKGMVNNKFNYFIMSK LAEAGIPTQMERLLSDTECLVKKLDMVPVECVVRNRAAGSLVKRLGIEEGIELNPPLFDL 
FLKNDAMHDPMVNESYCETFGWVSKENLARMKELTYKANDVLKKLFDDAGLILVDFKLEF GLYKGEVVLGDEFSPDGSRLWDKETLEKMDKDRFRQSLGGLIEAYEAVARRLGVQLD >gi|296493388|gb|ADTK01000113.1| GENE 42 42786 - 43016 90 76 aa, chain + ## HITS:1 COG:no KEGG:SDY_2665 NR:ns ## KEGG: SDY_2665 # Name: not_defined # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 76 8 83 83 133 100.0 2e-30 MRRERLIRTTSDKQISCELPLKSRFPADAHVCVSYQKKGPDDSSPVFLLAKRSLEDSYQR VVLTLSQSMTFRIDEL >gi|296493388|gb|ADTK01000113.1| GENE 43 42927 - 43961 898 344 aa, chain - ## HITS:1 COG:nlpB KEGG:ns NR:ns ## COG: nlpB COG3317 # Protein_GI_number: 16130402 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized lipoprotein # Organism: Escherichia coli K12 # 1 344 2 345 345 624 100.0 1e-179 MAYSVQKSRLAKVAGVSLVLLLAACSSDSRYKRQVSGDEAYLEAAPLAELHAPAGMILPV TSGDYAIPVTNGSGAVGKALDIRPPAQPLALVSGARTQFTGDTASLLVENGRGNTLWPQV VSVLQAKNYTITQRDDAGQTLTTDWVQWNRLDEDEQYRGRYQISVKPQGYQQAVTVKLLN LEQAGKPVADAASMQRYSTEMMNVISAGLDKSATDAANAAQNRASTTMDVQSAADDTGLP MLVVRGPFNVVWQRLPAALEKVGMKVTDSTRSQGNMAVTYKPLSDSDWQELGASDPGLAS GDYKLQVGDLDNRSSLQFIDPKGHTLTQSQNDALVAVFQAAFSK >gi|296493388|gb|ADTK01000113.1| GENE 44 43978 - 44856 791 292 aa, chain - ## HITS:1 COG:ECs3340 KEGG:ns NR:ns ## COG: ECs3340 COG0329 # Protein_GI_number: 15832594 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Escherichia coli O157:H7 # 1 292 1 292 292 578 99.0 1e-165 MFTGSIVAIVTPMDEKGNVCRASLKKLIDYHVASGTSAIVSVGTTGESATLNHDEHADVV MMTLELADGRIPVIAGTGANATAEAISLTQRFNDSGIVGCLTVTPYYNRPSQEGLYQHFK AIAEHTDLPQILYNVPSRTGCDLLPETVGRLTKVKNIIGIKEATGNLTRVNQIKELVSDD FVLLSGDDASALDFMQLGGHGVISVTANVAARDMAQMCKLAAEGHFAEARVINQRLMPLH NKLFVEPNPIPVKWACKELGLVATDTLRLPMTPITDSGRETVRAALKHAGLL >gi|296493388|gb|ADTK01000113.1| GENE 45 45002 - 45574 345 190 aa, chain + ## HITS:1 COG:ECs3341 KEGG:ns NR:ns ## COG: ECs3341 COG2716 # Protein_GI_number: 15832595 # Func_class: E 
Amino acid transport and metabolism # Function: Glycine cleavage system regulatory protein # Organism: Escherichia coli O157:H7 # 1 190 23 212 212 374 99.0 1e-104 MTLSSQHYLVITALGADRPGIVNTITRHVSSCGCNIEDSRLAMLGEEFTFIMLLSGSWNA ITLIESTLPLKGAELDLLIVMKRTTARPRPPMPASVWVQVDVADSPHLIERFTALFDAHH MNIAELVSRTQPAENERAAQLHIQITAHSPASADAANIEQAFKALCTELNAQGSINVVNY SQHDEQDGVK >gi|296493388|gb|ADTK01000113.1| GENE 46 45574 - 46044 508 156 aa, chain + ## HITS:1 COG:ECs3342 KEGG:ns NR:ns ## COG: ECs3342 COG1225 # Protein_GI_number: 15832596 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Escherichia coli O157:H7 # 1 156 1 156 156 322 100.0 1e-88 MNPLKAGDIAPKFSLPDQDGEQVNLTDFQGQRVLVYFYPKAMTPGCTVQACGLRDNMDEL KKAGVDVLGISTDKPEKLSRFAEKELLNFTLLSDEDHQVCEQFGVWGEKSFMGKTYDGIH RISFLIDADGKIEHVFDDFKTSNHHDVVLNWLKEHA >gi|296493388|gb|ADTK01000113.1| GENE 47 46297 - 46914 293 205 aa, chain + ## HITS:1 COG:hyfA KEGG:ns NR:ns ## COG: hyfA COG1142 # Protein_GI_number: 16130406 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 2 # Organism: Escherichia coli K12 # 1 204 14 217 218 376 98.0 1e-104 MNRFVVAEPLWCTGCNTCLAACSDVHKTQGLQQHPRLALAKTSTITAPVVCHHCEEAPCL QVCPVNAISQRDDAIQLNESLCIGCKLCAVVCPFGAISASGSRPVNAHAQYVFQAEGSLK DGEENVLPQHALLRWEPGVQTVAVKCDLCDFLPEGPACVRACPNQALRLITDDSLQRQMK EKQRLAASWFANGGEDPLSLTQEQH >gi|296493388|gb|ADTK01000113.1| GENE 48 46914 - 47585 506 223 aa, chain + ## HITS:1 COG:ECs3344 KEGG:ns NR:ns ## COG: ECs3344 COG0651 # Protein_GI_number: 15832598 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit # Organism: Escherichia coli O157:H7 # 1 223 1 223 672 295 89.0 5e-80 MDALQLLTWSLILYLFASLASLFLLGLDRLAIKLSGITSLVGGVIGIISGITQLHAGVTL VAHFATPFDFADLTLRMDSLSAFMVLVISLLVVVCSLYSLTYMREYEGKGAAAMGFFMNL FIASMVALLVMDNAFWFIVLFEMMSLSSWFLVIARQDKTSINAGMLYFFIAHAGSVLIMI 
AFLLMGRESGSLDFASFARFHFLRGWRRRCFCWPFSVLARKPG >gi|296493388|gb|ADTK01000113.1| GENE 49 47699 - 48931 922 410 aa, chain + ## HITS:1 COG:ECs3344 KEGG:ns NR:ns ## COG: ECs3344 COG0651 # Protein_GI_number: 15832598 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit # Organism: Escherichia coli O157:H7 # 1 410 263 672 672 703 98.0 0 MDLLAQTGLPLWWGTLVMAIGAISALLGVLYALAEQDIKRLLAWSTVENVGIILLAVGVA MDGLSLHDPLLTVVGLLGALFHLLNHALFKGLLFLGAGAIISRLHTHDMEKMGALAKRMP WTAAACLIGCLAISAIPPLNGFISEWYTWQSLFSLSRVEAVALQLAGPIAMVMLAVTGGL AVMCFVKMYGITFCGAPRSTHAEEAQEVPNTMIVAMLLLAALCVLIALSASWLAPKIMHI AHAFTNTPPVTVASGIALVPGTFHTRVTPSLLLLLLLAMPLLPGLYWLWCRSRRAAFRRT GDAWACGYGWENAMAPSGNGVMQPLRVVFSALFRLRQQFDPTLRLNKGLAHVTARAQSTE PFWDERVIRPIVSATQRLAKEIQHLQSGDFRLYCLYVVAALVVLLIAIAV >gi|296493388|gb|ADTK01000113.1| GENE 50 49056 - 49889 782 277 aa, chain + ## HITS:1 COG:ECs3345 KEGG:ns NR:ns ## COG: ECs3345 COG0650 # Protein_GI_number: 15832599 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 4 # Organism: Escherichia coli O157:H7 # 1 277 46 322 322 439 99.0 1e-123 MHSRRGPGIWQDYRDIHKLFKRQEVAPISSGLMFRLMPWVLISSMLVLAMALPLFITVSP FAGGGDLITLIYLLALFRFFFALSGLDTGSPFAGVGASRELTLGILVEPMLILSLLVLAL IAGSTHIEMISNTLATGWNSPLTTVLALLACGFACFIEMGKIPFDVAEAEQELQEGPLTE YSGAGLALAKWGLGLKQVVMAALFVALFLPFGRAQELSLACLLTSLVVTLLKVLLIFVLA SIAENTLARGRFLLIHHVTWLGFSLAALAWVFWLTGL >gi|296493388|gb|ADTK01000113.1| GENE 51 49992 - 50735 553 247 aa, chain + ## HITS:1 COG:hyfD KEGG:ns NR:ns ## COG: hyfD COG1009 # Protein_GI_number: 16130409 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit # Organism: Escherichia coli K12 # 2 224 31 253 479 390 99.0 1e-108 MGVLFAALTTLCMLSLISAFYQADKVAVTLTLVNVGDVALFGLVIDRVSTLILFVVVFLG 
LLVTIYSTGYLTDKNREHPHNGTNRYYAFLLVFIGAMAGLVLSSTLLGQLLFFEITGGCS WALISYYQSDKAQRSALKALLITHIGSLGLYLAAATLFLQTGTFVLSAMSELHGDARYLV YGGILFAAWGKSAQLPMQAWLPDAMEAPTPISAYLHAASMVKVGGWHWRAASPTSSTTRS LKACFSL >gi|296493388|gb|ADTK01000113.1| GENE 52 50555 - 51136 254 193 aa, chain + ## HITS:1 COG:ECs3346 KEGG:ns NR:ns ## COG: ECs3346 COG1009 # Protein_GI_number: 15832600 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit # Organism: Escherichia coli O157:H7 # 37 193 323 479 479 285 99.0 2e-77 MGEIGPATDASVATGRNGSANTDQRLSPRRIDGESGRLALEGSIAYIVNHAFAKSLFFLV AGALSYSCGTRLLPRLRGVLHTLPLPGVGFCVAALAITGVPPFNGFFSKFPLFAACFALS VEYWILLPAMILLMIESVASFAWFIRWFGRVVPGKPSEAVADAAPLPGSMRLVLIVLIVM SLISSVIAATWLQ >gi|296493388|gb|ADTK01000113.1| GENE 53 51148 - 51798 607 216 aa, chain + ## HITS:1 COG:ECs3347 KEGG:ns NR:ns ## COG: ECs3347 COG4237 # Protein_GI_number: 15832601 # Func_class: C Energy production and conversion # Function: Hydrogenase 4 membrane component (E) # Organism: Escherichia coli O157:H7 # 1 216 1 216 216 339 100.0 2e-93 MTGSMIVNNLAGLMMLTSLFVISVKSYRLSCGFYACQSLVLVSIFATLSCLFAAEQLLIW SASAFITKVLLVPLIMTYAARNIPQNIPEKALFGPAMMALLAALIVLLCAFVVQPVKLPM ATGLKPALAVALGHFLLGLLCIVSQRNILRQIFGYCLMENGSHLVLALLAWRAPELVEIG IATDAIFAVIVMVLLARKIWRTHGTLDVNNLTALKG >gi|296493388|gb|ADTK01000113.1| GENE 54 51803 - 53395 1296 530 aa, chain + ## HITS:1 COG:ECs3348 KEGG:ns NR:ns ## COG: ECs3348 COG0651 # Protein_GI_number: 15832602 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit # Organism: Escherichia coli O157:H7 # 1 530 1 526 526 859 98.0 0 MSYSVMFALLLLTPLLFSLLCFACRKRGLSATCTVTVLHSLGITLLLILALWVVQTAADA GEIFAAGLWLHIDGLGGLFLAILGVIGFLTGVYSIGYMRHEVAHGELSPVTLCDYYGFFH LFLFTMLLVVTSNNLIVMWAAIEATTLSSAFLVGIYGQRSSLEAAWKYIIICTVGVAFGL 
FGTVLVYANVASVMPQAEMAIFWSEVLKQSSLLDPTLMLLAFVFVLIGFGTKTGLFPMHA WLPDAHSEAPSPVSALLSAVLLNCALLVLIRYYIIICQAIGSDFPNRLLLIFGMLSVAVA AFFILVQRDIKRLLAYSSVENMGLVAVALGIGGPLGIFAALLHTLNHSLAKTLLFCGSGN VLLKYGTRDLNVVCGMLKIMPFTAVLFGGGALALALALAGMPPFNIFLSEFMTITAGLAR NHLLIIVLLLLLLTLVLAGLVRMAARVLMAKPPQAVNRGDLGWLTTSPMVILLVMMLAMG THIPQPVIRILAGASTIVLSGTHDLPAQRSTWHDFLPSGTASVSEKHSER >gi|296493388|gb|ADTK01000113.1| GENE 55 53385 - 55052 1712 555 aa, chain + ## HITS:1 COG:ECs3349_2 KEGG:ns NR:ns ## COG: ECs3349_2 COG3261 # Protein_GI_number: 15832603 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase III large subunit # Organism: Escherichia coli O157:H7 # 156 555 1 416 416 820 95.0 0 MNVNSSSNRGEAILAALKTQFPGAELDEERQTPEQVTITVKINLLPDVVHYLYYQHDGWL PVLFGNDERTLNGHYAVYYALSMEGAEKCWIVVKALVDADSREFPSVTPRVPAAVWGERE IRDMYGLIPVGLPDQRRLVLPDDWPEDMHPLRKDAMDYRLRPEPTTDSETYAFINEGNSD ARVIPVGPLHITSDEPGHFRLFVDGEQIVDADYRLFYVHRGMEKLAETRMGYNEVTFLSD RVCGICGFAHSVAYTNSVENALGIEVPQRAHTIRSILLEVERLHSHLLNLGLSCHFVGFD TGFMQFFRVREKSMTMAELLIGSRKTYGLNLIGGVRRDILKEQRLQTLKLVREMRADVSE LVEMLLATPNMEQRTQGIGILDRQIARDLRFDHPYADYGNIPKTLFTFTGGDVFSRVMVR VKETFDSLAMLEFALDNMPDTPLLTEGFSYKPHAFALGFVEAPRGEDVHWSMLGDNQKLF RWRCRAATYANWPVLRYMLRGNTVSDAPLIIGSLDPCYSCTDRVTLVDVRKRQSKTVPYK EIERYGIDRNRSPLK >gi|296493388|gb|ADTK01000113.1| GENE 56 55062 - 55607 372 181 aa, chain + ## HITS:1 COG:ECs3350 KEGG:ns NR:ns ## COG: ECs3350 COG1143 # Protein_GI_number: 15832604 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) # Organism: Escherichia coli O157:H7 # 1 181 1 181 181 301 97.0 3e-82 MLKLLKTIMRAGTATVKYPFAPLEVSPGFRGKPDLMPSQCIACGACACACPANALTIQTD DQQNSRTWQLYLGRCIYCGRCEEVCPTRAIQLTNNFELTVTNKADLYTRATFHLQRCSRC ERPFAQQKTVALAAELLAQQQNAPQNREMLWAQASVCPECKQRATLLNDDTDVPLVAKEQ L >gi|296493388|gb|ADTK01000113.1| GENE 57 55604 - 56005 86 133 aa, chain + ## HITS:1 COG:ZhyfI KEGG:ns NR:ns ## COG: ZhyfI COG3260 # Protein_GI_number: 15803012 # Func_class: C 
Energy production and conversion # Function: Ni,Fe-hydrogenase III small subunit # Organism: Escherichia coli O157:H7 EDL933 # 1 90 1 90 252 181 98.0 3e-46 MSPVLTQHVSQPITLDEQTQKMKRHLLQDIRRSAYVYRVDCGGCNACEIEIFAAITPVFD AERFGIKVVSSPRHADILLFTGAVTRAMRILHFGRMSLPPIIKSVFPTARAVPAADFPRS LQRLGRQRHHCPH >gi|296493388|gb|ADTK01000113.1| GENE 58 55932 - 56360 223 142 aa, chain + ## HITS:1 COG:hyfI KEGG:ns NR:ns ## COG: hyfI COG3260 # Protein_GI_number: 16130414 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase III small subunit # Organism: Escherichia coli K12 # 7 142 117 252 252 281 99.0 2e-76 MRCRRRIFHDLYSVWGGSDTIVPIDVWIPGCPPTPAATIHGFAVALGLLQQKIHAVDYRD PTGVTMQPLWPQIPPSQRIAIEREARRLAGYRQGREICDRLLRHLSDDPTGNRVNTWLRD ADDPRLNSIVQQLFRLLRGLHD >gi|296493388|gb|ADTK01000113.1| GENE 59 56353 - 56766 391 137 aa, chain + ## HITS:1 COG:no KEGG:SSON_2571 NR:ns ## KEGG: SSON_2571 # Name: not_defined # Def: putative protein processing element # Organism: S.sonnei # Pathway: not_defined # 1 137 22 158 158 272 98.0 2e-72 MTEECGEIVFWTLRKKFVASSDEMPEHSSQVMYYSLAIGHHVGVIDCLNVAFPCPLTEYE DWLAQVEEEQARRKMLGVMNFGEIVIDASHTALLTRAFAPLADDATSVWQARSIQFIHLL DEIVQEPAIYLMARKIA >gi|296493388|gb|ADTK01000113.1| GENE 60 56796 - 58808 1529 670 aa, chain + ## HITS:1 COG:ECs3353 KEGG:ns NR:ns ## COG: ECs3353 COG3604 # Protein_GI_number: 15832607 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Escherichia coli O157:H7 # 8 670 1 663 663 1307 99.0 0 MAMSDEAMFAPPQGITIEAVNGMLAERLAQKHGKASLLRAFIPMPPPFSPVQLIELHVLK SNFYYRYHDDGSDVTATTEYQGEMVDYSRHAVLLGSSGMAELRFIRTHGSRFTPQDCTLF NWLARIITPVLQSWLNDEEQQVALRLLEKDRDHHRVLVDITNAVLSHLDLDDLIADVARE IHHFFGLASVSMVLGDHRKNEKFSLWCSDLSASHCACLPRNMPGDSVLLTQTLQTRQPTL THRADDLFLWQRDPLLLLLASNGCESALLIPLTFGNHTPGALLLAHTSSTLFSEENCQLL QHIADRIAIAVGNADAWRSMTDLQESLQQENHQLSEQLLSNLGIGDIIYQSQAMEDLLQQ VDIVAKSDSTVLICGETGTGKEVIARAIHQLSPRRDKPLVKINCAAIPASLLESELFGHD 
KGAFTGAINTHRGRFEIADGGTLFLDEIGDLPLELQPKLLRVLQEREIERLGGSRTIPVN VRVIAATNRDLWQMVEDRQFRSDLFYRLNVFPLELPPLRDRPEDIPLLAKHFTQKMARHM NRSIDAIPTEALRQLMSWDWPGNVRELENVIERAVLLTRGNSLNLHLNVRQSRLLPTLNE DSALRSSMAQLLHPTTPENDEEERQRIVQVLRETNGIVAGPRGAATRLGMKRTTLLSRMQ RLGISVREVL >gi|296493388|gb|ADTK01000113.1| GENE 61 58830 - 59678 392 282 aa, chain + ## HITS:1 COG:focB KEGG:ns NR:ns ## COG: focB COG2116 # Protein_GI_number: 16130417 # Func_class: P Inorganic ion transport and metabolism # Function: Formate/nitrite family of transporters # Organism: Escherichia coli K12 # 1 281 1 281 282 501 99.0 1e-142 MRNKLSFDLQLSARKAAIAERIAAHKIARSKVSVFLMAMSAGVFMAIGFTFYLSVIADAP SSQALTHLVGGLCFTLGFILLAVCGTSLFTSSVMTVMAKSRGVISWRTWLINALLVACGN LAGIACFSLLIWFSGLVMSENAMWGVAVLHCAEGKMHHTFTESVSLGIMCNLMVCLALWM SYCGRSLCDKIVAMILPITLFVASGFEHCIANLFVIPFAIAIRHFAPPPFWQLVHSSADN FPALTVSHFITANLLPVMLGNIIGGAVLVSICYRAIYLRQES >gi|296493388|gb|ADTK01000113.1| GENE 62 59716 - 60777 1372 353 aa, chain - ## HITS:1 COG:ECs3355 KEGG:ns NR:ns ## COG: ECs3355 COG0628 # Protein_GI_number: 15832609 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Escherichia coli O157:H7 # 1 353 1 353 353 587 99.0 1e-168 MLEMLMQWYRRRFSDPEAIALLVILVAGFGIIFFFSGLLAPLLVAIVLAYLLEWPTVRLQ SIGCSRRWATSIVLVVFVGILLLMAFVVLPIAWQQGIYLIRDMPGMLNKLSDFAATLPRR YPALMDAGIIDAMAENMRSRMLTMGDSVVKISLASLVGLLTIAVYLVLVPLMVFFLLKDK EQMLNAVRRVLPRNRGLAGQVWKEMNQQITNYIRGKVLEMIVVGIATWLGFLLFGLNYSL LLAVLVGFSVLIPYIGAFVVTIPVIGVALFQFGAGTEFWSCFAVYLIIQALDGNLLVPVL FSEAVNLHPLVIILSVVIFGGLWGFWGVFFAIPLATLIKAVIHAWPDGQIAQE >gi|296493388|gb|ADTK01000113.1| GENE 63 60990 - 62453 1692 487 aa, chain + ## HITS:1 COG:yfgC KEGG:ns NR:ns ## COG: yfgC COG4783 # Protein_GI_number: 16130419 # Func_class: R General function prediction only # Function: Putative Zn-dependent protease, contains TPR repeats # Organism: Escherichia coli K12 # 1 487 1 487 487 872 100.0 0 MFRQLKKNLVATLIAAMTIGQVAPAFADSADTLPDMGTSAGSTLSIGQEMQMGDYYVRQL 
RGSAPLINDPLLTQYINSLGMRLVSHANSVKTPFHFFLINNDEINAFAFFGGNVVLHSAL FRYSDNESQLASVMAHEISHVTQRHLARAMEDQQRSAPLTWVGALGSILLAMASPQAGMA ALTGTLAGTRQGMISFTQQNEQEADRIGIQVLQRSGFDPQAMPTFLEKLLDQARYSSRPP EILLTHPLPESRLADARNRANQMRPMVVQSSEDFYLAKARTLGMYNSGRNQLTSDLLDEW AKGNVRQQRAAQYGRALQAMEANKYDEARKTLQPLLAAEPGNAWYLDLATDIDLGQNKAN EAINRLKNARDLRTNPVLQLNLANAYLQGGQPQEAANILNRYTFNNKDDSNGWDLLAQAE AALNNRDQELAARAEGYALAGRLDQAISLLSSASSQVKLGSLQQARYDARIDQLRQLQER FKPYTKM >gi|296493388|gb|ADTK01000113.1| GENE 64 62474 - 62833 533 119 aa, chain + ## HITS:1 COG:ECs3357 KEGG:ns NR:ns ## COG: ECs3357 COG1393 # Protein_GI_number: 15832611 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Escherichia coli O157:H7 # 1 119 1 119 119 204 99.0 4e-53 MTKQVKIYHNPRCSKSRETLNLLKENGVESEVVLYLETPADAATLRDLLKMLGMNSAREL MRQKEDLYKELNLADSSLSEEALIQAMVDNPKLMERPIVVANGKARIGRPPEQVLEIVG >gi|296493388|gb|ADTK01000113.1| GENE 65 62971 - 63738 638 255 aa, chain - ## HITS:1 COG:ECs3358 KEGG:ns NR:ns ## COG: ECs3358 COG0593 # Protein_GI_number: 15832612 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # Organism: Escherichia coli O157:H7 # 8 255 1 248 248 493 99.0 1e-139 VQTSWDSVVNFSRFCEILVEVSLNTPAQLSLPLYLPDDETFASFWPGDNSSLLAALQNVL RQEHSGYIYLWAREGAGRSHLLHAACAELSQRGDAVGYVPLDKRTWFVPEVLDGMEHLSL VCIDNIECIAGDELWEMAIFDLYNRILESGKTRLLITGDRPPRQLNLGLPDLASRLDWGQ IYKLQPLSDEDKLQALQLRARLRGFELPEDVGRFLLKRLDREMRTLFMTLDQLDRASITA QRKLTIPFVKEILKL Prediction of potential genes in microbial genomes Time: Mon May 16 15:23:24 2011 Seq name: gi|296493387|gb|ADTK01000114.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont296.2, whole genome shotgun sequence Length of sequence - 13739 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 7, operones - 4 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 - CDS 24 - 1313 1017 ## PROTEIN SUPPORTED 
gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 - Prom 1338 - 1397 8.0 - Term 1348 - 1380 3.1 2 1 Op 2 . - CDS 1399 - 2025 787 ## COG0035 Uracil phosphoribosyltransferase - Prom 2061 - 2120 7.5 + Prom 2247 - 2306 7.0 3 2 Op 1 21/0.000 + CDS 2350 - 3387 1243 ## PROTEIN SUPPORTED gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase 4 2 Op 2 4/0.500 + CDS 3387 - 4025 714 ## COG0299 Folate-dependent phosphoribosylglycinamide formyltransferase PurN + Term 4042 - 4073 2.4 + Prom 4101 - 4160 2.8 5 3 Op 1 11/0.000 + CDS 4196 - 6262 2068 ## COG0855 Polyphosphate kinase 6 3 Op 2 . + CDS 6267 - 7808 1412 ## COG0248 Exopolyphosphatase + Term 7814 - 7854 10.3 - Term 7594 - 7641 2.1 7 4 Tu 1 . - CDS 7847 - 10090 1382 ## COG2200 FOG: EAL domain - Prom 10227 - 10286 6.5 - Term 10216 - 10266 1.5 8 5 Tu 1 . - CDS 10307 - 10597 126 ## EC55989_2789 hypothetical protein - Prom 10776 - 10835 8.6 + Prom 10766 - 10825 9.0 9 6 Op 1 . + CDS 10947 - 11462 395 ## SSON_2587 putative outer membrane lipoprotein 10 6 Op 2 . + CDS 11478 - 12017 599 ## ECP_2508 hypothetical protein + Term 12055 - 12102 6.9 - Term 12050 - 12081 3.4 11 7 Tu 1 . 
- CDS 12112 - 13689 2003 ## COG0519 GMP synthase, PP-ATPase domain/subunit Predicted protein(s) >gi|296493387|gb|ADTK01000114.1| GENE 1 24 - 1313 1017 429 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 3 423 2 428 447 396 47 1e-110 MTRRAIGVSERPPLLQTIPLSLQHLFAMFGATVLVPVLFHINPATVLLFNGIGTLLYLFI CKGKIPAYLGSSFAFISPVLLLLPLGYEVALGGFIMCGVLFCLVSFIVKKAGTGWLDVLF PPAAMGAIVAVIGLELAGVAAGMAGLLPAEGQTPDSKTIIISITTLAVTVLGSVLFRGFL AIIPILIGVLVGYALSFAMGIVDTTPIINAHWFALPTLYTPRFEWFAILTILPAALVVIA EHVGHLVVTANIVKKDLLRDPGLHRSMFANGLSTVISGFFGSTPNTTYGENIGVMAITRV YSTWVIGGAAIFAILLSCVGKLAAAIQMIPLPVMGGVSLLLYGVIGASGIRVLIESKVDY NKAQNLILTSVILIIGVSGAKVNIGAAELKGMALATIVGIGLSLIFKLISVLRPEEVVLD AEDADITDK >gi|296493387|gb|ADTK01000114.1| GENE 2 1399 - 2025 787 208 aa, chain - ## HITS:1 COG:ECs3360 KEGG:ns NR:ns ## COG: ECs3360 COG0035 # Protein_GI_number: 15832614 # Func_class: F Nucleotide transport and metabolism # Function: Uracil phosphoribosyltransferase # Organism: Escherichia coli O157:H7 # 1 208 10 217 217 394 100.0 1e-110 MKIVEVKHPLVKHKLGLMREQDISTKRFRELASEVGSLLTYEATADLETEKVTIEGWNGP VEIDQIKGKKITVVPILRAGLGMMDGVLENVPSARISVVGMYRNEETLEPVPYFQKLVSN IDERMALIVDPMLATGGSVIATIDLLKKAGCSSIKVLVLVAAPEGIAALEKAHPDVELYT ASIDQGLNEHGYIIPGLGDAGDKIFGTK >gi|296493387|gb|ADTK01000114.1| GENE 3 2350 - 3387 1243 345 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase [Acinetobacter baumannii SDF] # 2 345 7 356 356 483 68 1e-136 MTDKTSLSYKDAGVDIDAGNALVGRIKGVVKKTRRPEVMGGLGGFGALCALPQKYREPVL VSGTDGVGTKLRLAMDLKRHDTIGIDLVAMCVNDLVVQGAEPLFFLDYYATGKLDVDTAS AVISGIAEGCLQSGCSLVGGETAEMPGMYHGEDYDVAGFCVGVVEKSEIIDGSKVSDGDV LIALGSSGPHSNGYSLVRKILEVSGCDPQTTELDGKPLADHLLAPTRIYVKSVLELIEKV DVHAIAHLTGGGFWENIPRVLPDNTQAVIDESSWQWPEVFNWLQTAGNVERHEMYRTFNC GVGMIIALPAPEVDKALALLNANGENAWKIGIIKASDSEQRVVIE >gi|296493387|gb|ADTK01000114.1| GENE 4 3387 - 4025 714 212 aa, chain + ## HITS:1 COG:purN KEGG:ns NR:ns ## COG: purN 
COG0299 # Protein_GI_number: 16130425 # Func_class: F Nucleotide transport and metabolism # Function: Folate-dependent phosphoribosylglycinamide formyltransferase PurN # Organism: Escherichia coli K12 # 1 212 1 212 212 430 99.0 1e-121 MNIVVLISGNGSNLQAIIDACKTNKIKGTVRAVFSNKADAFGLERARQAGIATHTLIASA FDSREAYDRELIHEIDMYAPDVVVLAGFMRILSPAFVSHYAGRLLNIHPSLLPKYPGLHT HRQALENGDEEHGTSVHFVTDELDGGPVILQAKVPVFAGDTEDDITARVQTQEHAIYPLV ISWFADGRLKMHENAAWLDGQRLPPQGYAADE >gi|296493387|gb|ADTK01000114.1| GENE 5 4196 - 6262 2068 688 aa, chain + ## HITS:1 COG:ECs3363 KEGG:ns NR:ns ## COG: ECs3363 COG0855 # Protein_GI_number: 15832617 # Func_class: P Inorganic ion transport and metabolism # Function: Polyphosphate kinase # Organism: Escherichia coli O157:H7 # 1 688 1 688 688 1347 100.0 0 MGQEKLYIEKELSWLSFNERVLQEAADKSNPLIERMRFLGIYSNNLDEFYKVRFAELKRR IIISEEQGSNSHSRHLLGKIQSRVLKADQEFDGLYNELLLEMARNQIFLINERQLSVNQQ NWLRHYFKQYLRQHITPILINPDTDLVQFLKDDYTYLAVEIIRGDTIRYALLEIPSDKVP RFVNLPPEAPRRRKPMILLDNILRYCLDDIFKGFFDYDALNAYSMKMTRDAEYDLVHEME ASLMELMSSSLKQRLTAEPVRFVYQRDMPNALVEVLREKLTISRYDSIVPGGRYHNFKDF INFPNVGKANLVNKPLPRLRHIWFDKAQFRNGFDAIRERDVLLYYPYHTFEHVLELLRQA SFDPSVLAIKINIYRVAKDSRIIDSMIHAAHNGKKVTVVVELQARFDEEANIHWAKRLTE AGVHVIFSAPGLKIHAKLFLISRKENGEVVRYAHIGTGNFNEKTARLYTDYSLLTADARI TNEVRRVFNFIENPYRPVTFDYLMVSPQNSRRLLYEMVDREIANAQQGLPSGITLKLNNL VDKGLVDRLYAASSSGVPVNLLVRGMCSLIPNLEGISDNIRAISIVDRYLEHDRVYIFEN GGDKKVYLSSADWMTRNIDYRIEVATPLLDPRLKQRVLDIIDILFSDTVKARYIDKELSN RYVPRGNRRKVRAQLAIYDYIKSLEQPE >gi|296493387|gb|ADTK01000114.1| GENE 6 6267 - 7808 1412 513 aa, chain + ## HITS:1 COG:ECs3364 KEGG:ns NR:ns ## COG: ECs3364 COG0248 # Protein_GI_number: 15832618 # Func_class: F Nucleotide transport and metabolism; P Inorganic ion transport and metabolism # Function: Exopolyphosphatase # Organism: Escherichia coli O157:H7 # 1 513 1 513 513 1009 100.0 0 MPIHDKSPRPQEFAAVDLGSNSFHMVIARVVDGAMQIIGRLKQRVHLADGLGPDNMLSEE AMTRGLNCLSLFAERLQGFSPASVCIVGTHTLRQALNATDFLKRAEKVIPYPIEIISGNE 
EARLIFMGVEHTQPEKGRKLVIDIGGGSTELVIGENFEPILVESRRMGCVSFAQLYFPGG VINKENFQRARMAAAQKLETLTWQFRIQGWNVAMGASGTIKAAHEVLMEMGEKDGIITPE RLEKLVKEVLRHRNFASLSLPGLSEERKTVFVPGLAILCGVFDALAIRELRLSDGALREG VLYEMEGRFRHQDVRSRTASSLANQYHIDSEQARRVLDTTMQMYEQWREQQPKLAHPQLE ALLRWAAMLHEVGLNINHSGLHRHSAYILQNSDLPGFNQEQQLMMATLVRYHRKAIKLDD LPRFTLFKKKQFLPLIQLLRLGVLLNNQRQATTTPPTLTLITDDSHWTLRFPHDWFSQNA LVLLDLEKEQEYWEGVAGWRLKIEEESTPEIAA >gi|296493387|gb|ADTK01000114.1| GENE 7 7847 - 10090 1382 747 aa, chain - ## HITS:1 COG:Z3766_3 KEGG:ns NR:ns ## COG: Z3766_3 COG2200 # Protein_GI_number: 15803026 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli O157:H7 EDL933 # 476 747 1 272 272 556 99.0 1e-158 MKLNATYIKIRDKWWGLPLFLPSLILPIFAHINTFAHISSGEVFLFYLPLALMISMMMFF SWAALPGIALGIFVRKYAELGFYETLSLTANFIIIIILCWGGYRVFTPRRNNVSHGDTRL ISQRIFWQIVFPATLFLILFQFAAFVGLLASRENLVGVMPFNLGTLINYQALLVGNLIGV PLCYFIIRVVRNPFYLRSYYSQLKQQVDAKVTKKEFALWLLALGALLLLLCMPLNEKSTI FSTNYTLSLLLPLMMWGAMRYGYKLISLLWAVVLMISIHSYQNYIPIYPGYTTQLTITSS SYLVFSFIVNYMAVLATRQRAVVRRIQRLAYVDPVVHLPNVRALNRALRDAPWSALCYLR IPGMEMLVKNYGIMLRIQYKQKLSHWLSPLLEPGEDVYQLSGNDLALRLNTESHQERITA LDSHLKQFRFFWDGMPMQPQIGVSYCYVRSPVNHIYLLLGELNTVAELSIVTNAPENMQR RGAMYLQRELKDKVAMMNRLQQALEHNHFFLMAQPITGMRGDVYHEILLRMKGENDELIG PDSFLPVAHEFGLSSSIDMWVIEHTLQFMAENRAKMPAHRFAINLSPTSVCQARFPVEVS QLLAKYQIEAWQLIFEVTESNALTNVKQAQITLQHLQELGCQIAIDDFGTGYASYARLKN VNADLLKIDGSFIRNIVSNSLDYQIVASICHLARMKKMRVVAEYVENEEIREAVLSLGID YMQGYLIGKPQPLIDTLNEIEPIRESA >gi|296493387|gb|ADTK01000114.1| GENE 8 10307 - 10597 126 96 aa, chain - ## HITS:1 COG:no KEGG:EC55989_2789 NR:ns ## KEGG: EC55989_2789 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 96 1 96 96 186 100.0 3e-46 MLSFFFALMVLPGTDGRVDKTAKEEDKADEQYDTGHATVKSVSFSHTGCLTHKRFLEVCP TLWTVLTLRHNRTKWSILRYVKKRQWIHILMLWKLI >gi|296493387|gb|ADTK01000114.1| GENE 9 10947 - 11462 395 171 aa, chain + ## HITS:1 COG:no KEGG:SSON_2587 NR:ns ## KEGG: SSON_2587 # Name: not_defined # Def: 
putative outer membrane lipoprotein # Organism: S.sonnei # Pathway: not_defined # 1 171 2 172 172 266 100.0 3e-70 MKFKKCLLPVAMLASFTLAGCQSNADDHAADVYQTDQLNTKQETKTVNIISILPAKVAVD NSQNKRNAQAFGALIGAVAGGVIGHNVGSGSNSGTTAGAVGGGAVGAAAGSMVNDKTLVE GVSLTYKEGTKVYTSTQVGKECQFTTGLAVVITTTYNETRIQPNTKCPEKS >gi|296493387|gb|ADTK01000114.1| GENE 10 11478 - 12017 599 179 aa, chain + ## HITS:1 COG:no KEGG:ECP_2508 NR:ns ## KEGG: ECP_2508 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_536 # Pathway: not_defined # 1 179 1 179 179 176 99.0 3e-43 MKKVFLCAILASLSYPAIASSLQDQLSAVAEAEQQGKNEEQRQHDEWVAERNREIQQEKQ RRANAQAAANKRAATAAANKKARQDKLDAEASADKKRDQSYEDELRSLEIQKQKLALAKE EARVKRENEFIDQELKHKAAQTDVVQSEADANRNMTEGGRDLMKSVGKAEENKSDSWFN >gi|296493387|gb|ADTK01000114.1| GENE 11 12112 - 13689 2003 525 aa, chain - ## HITS:1 COG:ZguaA_2 KEGG:ns NR:ns ## COG: ZguaA_2 COG0519 # Protein_GI_number: 15803031 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Escherichia coli O157:H7 EDL933 # 207 525 1 319 319 658 99.0 0 MTENIHKHRILILDFGSQYTQLVARRVRELGVYCELWAWDVTEAQIRDFNPSGIILSGGP ESTTEENSPRAPQYVFEAGVPVFGVCYGMQTMAMQLGGHVEASNEREFGYAQVEVVNDSA LVRGIEDALTADGKPLLDVWMSHGDKVTAIPSDFVTVASTESCPFAIMANEEKRFYGVQF HPEVTHTRQGMRMLERFVRDICQCEALWTPAKIIDDAVARIREQVGDDKVILGLSGGVDS SVTAMLLHRAIGKNLTCVFVDNGLLRLNEAEQVLDMFGDHFGLNIVHVPAEDRFLSALAG ENDPEAKRKIIGRVFVEVFDEEALKLEDVKWLAQGTIYPDVIESAASATGKAHVIKSHHN VGGLPKEMKMGLVEPLKELFKDEVRKIGLELGLPYDMLYRHPFPGPGLGVRVLGEVKKEY CDLLRRADAIFIEELRKADLYDKVSQAFTVFLPVRSVGVMGDGRKYDWVVSLRAVETIDF MTAHWAHLPYDFLGRVSNRIINEVNGISRVVYDISGKPPATIEWE Prediction of potential genes in microbial genomes Time: Mon May 16 15:23:34 2011 Seq name: gi|296493386|gb|ADTK01000115.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont296.3, whole genome shotgun sequence Length of sequence - 3129 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score 
pairs(N/Pv) 1 1 Tu 1 . - CDS 13 - 1479 1673 ## COG0516 IMP dehydrogenase/GMP reductase - Prom 1516 - 1575 6.7 + Prom 1547 - 1606 5.0 2 2 Tu 1 . + CDS 1641 - 3011 1021 ## COG1570 Exonuclease VII, large subunit Predicted protein(s) >gi|296493386|gb|ADTK01000115.1| GENE 1 13 - 1479 1673 488 aa, chain - ## HITS:1 COG:YPO2871_3 KEGG:ns NR:ns ## COG: YPO2871_3 COG0516 # Protein_GI_number: 16123063 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Yersinia pestis # 207 486 1 280 281 478 94.0 1e-134 MLRIAKEALTFDDVLLVPAHSTVLPNTADLSTQLTKTIRLNIPMLSAAMDTVTEARLAIA LAQEGGIGFIHKNMSIERQAEEVRRVKKHESGVVTDPQTVLPTTTLREVKELTERNGFAG YPVVTEENELVGIITGRDVRFVTDLNQPVSVYMTPKERLVTVREGEAREVVLAKMHEKRV EKALVVDDEFHLIGMITVKDFQKAERKPNACKDEQGRLRVGAAVGAGAGNEERVDALVAA GVDVLLIDSSHGHSEGVLQRIRETRAKYPDLQIIGGNVATAAGARALAEAGCSAVKVGIG PGSICTTRIVTGVGVPQITAVADAVEALEGTGIPVIADGGIRFSGDIAKAIAAGASAVMV GSMLAGTEESPGEIELYQGRSYKSYRGMGSLGAMSKGSSDRYFQSDNAADKLVPEGIEGR VAYKGRLKEIIHQQMGGLRSCMGLTGCGTIDELRTKAEFVRISGAGIQESHVHDVTITKE SPNYRLGS >gi|296493386|gb|ADTK01000115.1| GENE 2 1641 - 3011 1021 456 aa, chain + ## HITS:1 COG:xseA KEGG:ns NR:ns ## COG: xseA COG1570 # Protein_GI_number: 16130434 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII, large subunit # Organism: Escherichia coli K12 # 1 456 1 456 456 814 99.0 0 MLPSQSPAIFTVSRLNQTVRLLLEHEMGQVWISGEISNFTQPASGHWYFTLKDDTAQVRC AMFRNSNRRVTFRPQHGQQVLVRANITLYEPRGDYQIIVESMQPAGEGLLQQKYEQLKAK LQAEGLFDQQYKKPLPSPAHCVGVITSKTGAALHDILHVLKRRDPSLPVIIYPTAVQGDD APGQIVRAIELANQRNECDVLIVGRGGGSLEDLWSFNDERVARAIFASRIPVVSAVGHET DVTIADFVADLRAPTPSAAAEVVSRNQQELLRQVQSTRQRLEMAMDYYLANRTRRFTQIH HRLQQQHPQLRLARQQTMLERLQKRMSFALENQLKRTGQQQQRLTQRLNQQNPQPKIHRA QTRIQQLEYRLAETLRVQLSATRERFGNAVTHLEAVSPLSTLARGYSVTTATDGNVLKKV KQVKAGEMLTTRLEDGWIESEVKNIQPVKKSRKKVH Prediction of potential genes in microbial genomes Time: Mon May 16 15:25:36 2011 Seq name: gi|296493385|gb|ADTK01000116.1| Escherichia coli MS 84-1 
E_coli84-1-1.0_Cont297.1, whole genome shotgun sequence Length of sequence - 38207 bp Number of predicted genes - 60, with homology - 59 Number of transcription units - 11, operones - 9 average op.length - 6.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 88 - 147 4.2 1 1 Tu 1 . + CDS 270 - 824 410 ## COG0663 Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily 2 2 Op 1 . - CDS 800 - 1057 160 ## B21_03082 hypothetical protein 3 2 Op 2 8/0.000 - CDS 1054 - 1872 378 ## COG0169 Shikimate 5-dehydrogenase 4 2 Op 3 6/0.000 - CDS 1877 - 2449 386 ## COG0009 Putative translation factor (SUA5) 5 2 Op 4 7/0.000 - CDS 2454 - 2996 111 ## COG0551 Zn-finger domain associated with topoisomerase type I 6 2 Op 5 8/0.000 - CDS 3025 - 3498 602 ## COG2922 Uncharacterized protein conserved in bacteria 7 2 Op 6 . - CDS 3470 - 4594 414 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake - Prom 4731 - 4790 2.4 + Prom 4626 - 4685 3.5 8 3 Op 1 26/0.000 + CDS 4724 - 5233 555 ## COG0242 N-formylmethionyl-tRNA deformylase 9 3 Op 2 20/0.000 + CDS 5248 - 6195 964 ## COG0223 Methionyl-tRNA formyltransferase + Term 6212 - 6240 2.1 10 3 Op 3 8/0.000 + CDS 6268 - 7530 1060 ## COG0144 tRNA and rRNA cytosine-C5-methylases 11 3 Op 4 7/0.000 + CDS 7552 - 8928 1151 ## COG0569 K+ transport systems, NAD-binding component + Term 8930 - 8969 8.6 + Prom 8965 - 9024 6.9 12 3 Op 5 . + CDS 9058 - 9468 471 ## COG1970 Large-conductance mechanosensitive channel 13 4 Op 1 2/1.000 - CDS 9465 - 9683 60 ## COG3036 Uncharacterized protein conserved in bacteria 14 4 Op 2 . - CDS 9739 - 10164 426 ## COG0789 Predicted transcriptional regulators 15 4 Op 3 . 
- CDS 10175 - 10543 261 ## SBO_3287 hypothetical protein - Prom 10572 - 10631 6.0 - Term 10604 - 10629 -0.5 16 5 Op 1 50/0.000 - CDS 10650 - 11033 636 ## PROTEIN SUPPORTED gi|15803821|ref|NP_289855.1| 50S ribosomal protein L17 17 5 Op 2 26/0.000 - CDS 11074 - 12063 1093 ## COG0202 DNA-directed RNA polymerase, alpha subunit/40 kD subunit 18 5 Op 3 36/0.000 - CDS 12089 - 12709 1038 ## PROTEIN SUPPORTED gi|15803823|ref|NP_289857.1| 30S ribosomal protein S4 19 5 Op 4 48/0.000 - CDS 12743 - 13132 669 ## PROTEIN SUPPORTED gi|15803824|ref|NP_289858.1| 30S ribosomal protein S11 20 5 Op 5 3/0.667 - CDS 13149 - 13505 586 ## PROTEIN SUPPORTED gi|15803825|ref|NP_289859.1| 30S ribosomal protein S13 - Prom 13551 - 13610 3.9 - Term 13581 - 13622 8.2 21 6 Op 1 . - CDS 13652 - 13768 198 ## PROTEIN SUPPORTED gi|15803826|ref|NP_289860.1| 50S ribosomal protein L36 22 6 Op 2 53/0.000 - CDS 13800 - 15131 1258 ## PROTEIN SUPPORTED gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 23 6 Op 3 48/0.000 - CDS 15139 - 15573 715 ## PROTEIN SUPPORTED gi|15803828|ref|NP_289862.1| 50S ribosomal protein L15 24 6 Op 4 50/0.000 - CDS 15577 - 15756 291 ## PROTEIN SUPPORTED gi|15803829|ref|NP_289863.1| 50S ribosomal protein L30 25 6 Op 5 56/0.000 - CDS 15760 - 16263 834 ## PROTEIN SUPPORTED gi|15803830|ref|NP_289864.1| 30S ribosomal protein S5 26 6 Op 6 46/0.000 - CDS 16278 - 16631 572 ## PROTEIN SUPPORTED gi|15803831|ref|NP_289865.1| 50S ribosomal protein L18 27 6 Op 7 55/0.000 - CDS 16641 - 17174 902 ## PROTEIN SUPPORTED gi|15803832|ref|NP_289866.1| 50S ribosomal protein L6 28 6 Op 8 50/0.000 - CDS 17187 - 17579 641 ## PROTEIN SUPPORTED gi|15803833|ref|NP_289867.1| 30S ribosomal protein S8 29 6 Op 9 50/0.000 - CDS 17613 - 17918 513 ## PROTEIN SUPPORTED gi|15803834|ref|NP_289868.1| 30S ribosomal protein S14 30 6 Op 10 48/0.000 - CDS 17933 - 18472 911 ## PROTEIN SUPPORTED gi|15803835|ref|NP_289869.1| 50S ribosomal protein L5 31 6 Op 11 57/0.000 - CDS 18487 - 18801 516 ## PROTEIN SUPPORTED 
gi|15803836|ref|NP_289870.1| 50S ribosomal protein L24 32 6 Op 12 50/0.000 - CDS 18812 - 19183 617 ## PROTEIN SUPPORTED gi|15803837|ref|NP_289871.1| 50S ribosomal protein L14 - Prom 19255 - 19314 7.1 - Term 19283 - 19326 10.3 33 7 Op 1 50/0.000 - CDS 19348 - 19602 431 ## PROTEIN SUPPORTED gi|15803838|ref|NP_289872.1| 30S ribosomal protein S17 34 7 Op 2 50/0.000 - CDS 19602 - 19793 301 ## PROTEIN SUPPORTED gi|15803839|ref|NP_289873.1| 50S ribosomal protein L29 35 7 Op 3 50/0.000 - CDS 19793 - 20203 694 ## PROTEIN SUPPORTED gi|15803840|ref|NP_289874.1| 50S ribosomal protein L16 36 7 Op 4 61/0.000 - CDS 20216 - 20917 1183 ## PROTEIN SUPPORTED gi|15833433|ref|NP_312206.1| 30S ribosomal protein S3 37 7 Op 5 59/0.000 - CDS 20935 - 21267 537 ## PROTEIN SUPPORTED gi|15803842|ref|NP_289876.1| 50S ribosomal protein L22 38 7 Op 6 60/0.000 - CDS 21282 - 21560 484 ## PROTEIN SUPPORTED gi|15803843|ref|NP_289877.1| 30S ribosomal protein S19 39 7 Op 7 61/0.000 - CDS 21577 - 22398 1438 ## PROTEIN SUPPORTED gi|15803844|ref|NP_289878.1| 50S ribosomal protein L2 40 7 Op 8 61/0.000 - CDS 22416 - 22718 490 ## PROTEIN SUPPORTED gi|15803845|ref|NP_289879.1| 50S ribosomal protein L23 41 7 Op 9 58/0.000 - CDS 22715 - 23320 992 ## PROTEIN SUPPORTED gi|15803846|ref|NP_289880.1| 50S ribosomal protein L4 42 7 Op 10 40/0.000 - CDS 23331 - 23960 1056 ## PROTEIN SUPPORTED gi|15803847|ref|NP_289881.1| 50S ribosomal protein L3 43 7 Op 11 . - CDS 23993 - 24304 513 ## PROTEIN SUPPORTED gi|15803848|ref|NP_289882.1| 30S ribosomal protein S10 - Prom 24475 - 24534 4.3 + Prom 24275 - 24334 2.2 44 8 Tu 1 . + CDS 24367 - 24573 92 ## 45 9 Op 1 . - CDS 24542 - 24961 294 ## B21_03124 hypothetical protein 46 9 Op 2 . - CDS 24963 - 26432 405 ## JW3285 general secretory pathway component, cryptic - Prom 26458 - 26517 11.4 47 10 Op 1 . 
+ CDS 26849 - 27427 57 ## COG3031 Type II secretory pathway, component PulC 48 10 Op 2 6/0.000 + CDS 27411 - 29363 1499 ## COG1450 Type II secretory pathway, component PulD 49 10 Op 3 24/0.000 + CDS 29373 - 30854 1086 ## COG2804 Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB 50 10 Op 4 10/0.000 + CDS 30851 - 32047 755 ## COG1459 Type II secretory pathway, component PulF 51 10 Op 5 12/0.000 + CDS 32057 - 32494 523 ## COG2165 Type II secretory pathway, pseudopilin PulG 52 10 Op 6 12/0.000 + CDS 32502 - 33011 359 ## COG2165 Type II secretory pathway, pseudopilin PulG 53 10 Op 7 12/0.000 + CDS 33008 - 33385 260 ## COG2165 Type II secretory pathway, pseudopilin PulG 54 10 Op 8 7/0.000 + CDS 33378 - 33965 472 ## COG4795 Type II secretory pathway, component PulJ 55 10 Op 9 4/0.667 + CDS 33958 - 34941 508 ## COG3156 Type II secretory pathway, component PulK 56 10 Op 10 . + CDS 34956 - 36119 527 ## COG3297 Type II secretory pathway, component PulL 57 10 Op 11 . + CDS 36116 - 36577 271 ## EcHS_A3528 general secretion pathway protein M 58 10 Op 12 . + CDS 36577 - 37242 389 ## COG1989 Type II secretory pathway, prepilin signal peptidase PulO and related peptidases + Term 37305 - 37340 2.6 59 11 Op 1 9/0.000 - CDS 37283 - 37759 641 ## COG2193 Bacterioferritin (cytochrome b1) 60 11 Op 2 . 
- CDS 37831 - 38025 169 ## COG2906 Bacterioferritin-associated ferredoxin - Prom 38046 - 38105 4.0 Predicted protein(s) >gi|296493385|gb|ADTK01000116.1| GENE 1 270 - 824 410 184 aa, chain + ## HITS:1 COG:ECs4145 KEGG:ns NR:ns ## COG: ECs4145 COG0663 # Protein_GI_number: 15833399 # Func_class: R General function prediction only # Function: Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily # Organism: Escherichia coli O157:H7 # 1 184 73 256 256 385 100.0 1e-107 MSDVLRPYRDLFPQIGQRVMIDDSSVVIGDVRLADDVGIWPLVVIRGDVHYVQIGARTNI QDGSMLHVTHKSSYNPDGNPLTIGEDVTVGHKVMLHGCTIGNRVLVGMGSILLDGAIVED DVMIGAGSLVPQNKRLESGYLYLGSPVKQIRPLSDEEKAGLRYSANNYVKWKDEYLDQGN QTQP >gi|296493385|gb|ADTK01000116.1| GENE 2 800 - 1057 160 85 aa, chain - ## HITS:1 COG:no KEGG:B21_03082 NR:ns ## KEGG: B21_03082 # Name: yrdB # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 85 1 85 85 155 100.0 4e-37 MNQAIQFPDREEWDENKKCVCFPALVNGMQLTCAISGESLAYRFTGDTPEQWLASFRQHR WDLEEEAENLIQEQSEDDQGWVWLP >gi|296493385|gb|ADTK01000116.1| GENE 3 1054 - 1872 378 272 aa, chain - ## HITS:1 COG:aroE KEGG:ns NR:ns ## COG: aroE COG0169 # Protein_GI_number: 16131162 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Escherichia coli K12 # 1 272 1 272 272 543 98.0 1e-154 METYAVFGNPIAHSKSPFIHQQFAQQLNIEHPYGRVLAPINDFINTLNAFFSAGGKGANV TVPFKEEAFARADELTERAALAGAVNTLKRLEDGRLQGDNTDGIGLLSDLERLSFIRPGL RILLIGAGGASRGVLLPLLSLDCAVTITNRTVSRAEELAKLFAHTGSIQALSMDELEGHE FDLIINATSSGISGDIPAIPSSLIHPGIYCYDMFYQKGKTPFLAWCEQRGSKCNADGLGM LVAQAAHAFLLWHGVLPDVEPVIKQLQEELSA >gi|296493385|gb|ADTK01000116.1| GENE 4 1877 - 2449 386 190 aa, chain - ## HITS:1 COG:ECs4148 KEGG:ns NR:ns ## COG: ECs4148 COG0009 # Protein_GI_number: 15833402 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Escherichia coli O157:H7 # 1 190 1 190 190 373 99.0 1e-103 MNNNLQGDAIAAAIDVLNEERVIAYPTEAVFGVGCDPDSETAVMRLLELKQRPVDKGLIL 
IAANYEQLKPYIDDTMLTDAQRETIFSRWPGPVTFVFPAPATTPRWLTGRFDSLAVRVTD HPLVVALCQAYGKPLVSTSANLSGLPPCRTVDEVRAQFGAAFPVVPGETGGRLNPSEIRD ALTGELFRQG >gi|296493385|gb|ADTK01000116.1| GENE 5 2454 - 2996 111 180 aa, chain - ## HITS:1 COG:ZyrdD KEGG:ns NR:ns ## COG: ZyrdD COG0551 # Protein_GI_number: 15803811 # Func_class: L Replication, recombination and repair # Function: Zn-finger domain associated with topoisomerase type I # Organism: Escherichia coli O157:H7 EDL933 # 9 180 1 172 172 323 98.0 8e-89 MAKSALFTVRNNESCPKCGAELVIRSGKHGPFLGCSQYPACDYVRPLKSSADGHIVKVLE GQVCLACGANLVLRQGRFGMFIGCSNYPECEHTELIDKPDETAITCPQCRTGHLVQRRSR YGKTFHSCDRYPECQFAINFKPIAGECPECHYPLLIEKKTAQGVKHFCASKQCGKPVSAE >gi|296493385|gb|ADTK01000116.1| GENE 6 3025 - 3498 602 157 aa, chain - ## HITS:1 COG:ECs4150 KEGG:ns NR:ns ## COG: ECs4150 COG2922 # Protein_GI_number: 15833404 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 157 1 157 157 272 100.0 2e-73 MFDVLMYLFETYIHTEAELRVDQDKLEQDLTDAGFDREDIYNALLWLEKLADYQEGLAEP MQLASDPLSMRIYTPEECERLDASCRGFLLFLEQIQVLNLETREMVIERVLALDTAEFDL EDLKWVILMVLFNIPGCENAYQQMEELLFEVNEGMLH >gi|296493385|gb|ADTK01000116.1| GENE 7 3470 - 4594 414 374 aa, chain - ## HITS:1 COG:smf KEGG:ns NR:ns ## COG: smf COG0758 # Protein_GI_number: 16132235 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Escherichia coli K12 # 1 374 1 374 374 743 98.0 0 MVDTDIWLRLMSISSLYGDDMVRIAHWLAKQSHIDAVVLQQTGLTLRQAQRFLSFPRKSI ESSLCWLEQPNHHLIPADSEFYPPQLLVTTDYPGALFVEGELHALHSFQLAVVGSRAHSW YGERWGRLFCETLATRGVTITSGLARGIDGVAHKAALQVNGVSIAVLGNGLNTIHPRRHA RLAASLLEHGGALVSEFPLDVPPLAYNFPRRNRIISGLSKGVLVVEAALRSGSLVTARCA LEQGREVFALPGPIGNPGSEGPHWLIKQGAILVTEPEEILENLQFGLHWLPDAPENSFYS PDQQDVALPFPELLANVGDEVTPVDVVAERAGQPVPEVVTQLLELELAGWIAAVPGGYVR LRRACHVRRTNVFV >gi|296493385|gb|ADTK01000116.1| GENE 8 
4724 - 5233 555 169 aa, chain + ## HITS:1 COG:ECs4152 KEGG:ns NR:ns ## COG: ECs4152 COG0242 # Protein_GI_number: 15833406 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Escherichia coli O157:H7 # 1 169 1 169 169 291 100.0 3e-79 MSVLQVLHIPDERLRKVAKPVEEVNAEIQRIVDDMFETMYAEEGIGLAATQVDIHQRIIV IDVSENRDERLVLINPELLEKSGETGIEEGCLSIPEQRALVPRAEKVKIRALDRDGKPFE LEADGLLAICIQHEMDHLVGKLFMDYLSPLKQQRIRQKVEKLDRLKARA >gi|296493385|gb|ADTK01000116.1| GENE 9 5248 - 6195 964 315 aa, chain + ## HITS:1 COG:ECs4153 KEGG:ns NR:ns ## COG: ECs4153 COG0223 # Protein_GI_number: 15833407 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA formyltransferase # Organism: Escherichia coli O157:H7 # 1 315 1 315 315 612 99.0 1e-175 MSESLRIIFAGTPDFAARHLDALLSSGHNVVGVFTQPDRPAGRGKKLMPSPVKVLAEEKG LPVFQPVSLRPQENQQLVADLQADVMVVVAYGLILPKAVLEMPRLGCINVHGSLLPRWRG AAPIQRSLWAGDAETGVTIMQMDVGLDTGDMLYKLSCPITAEDTSGTLYDKLAELGPQGL ITTLKQLADGTAKPEVQDETLVTYAEKLSKEEARIDWSLSAAQLERCIRAFNPWPMSWLE IEGQPVKVWKASVIDTATNAAPGTILEANKQGIQVATGDGILNLLSLQPAGKKAMSAQDL LNSRREWFVPGNRLV >gi|296493385|gb|ADTK01000116.1| GENE 10 6268 - 7530 1060 420 aa, chain + ## HITS:1 COG:sun KEGG:ns NR:ns ## COG: sun COG0144 # Protein_GI_number: 16131168 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Escherichia coli K12 # 1 420 10 429 429 838 99.0 0 MAAQAVEQVVEQGQSLSNILPPLQQKVSDKDKALLQELCFGVLRTLSQLDWLINKLMARP MTGKQRTVHYLIMVGLYQLLYTRIPPHAALAETVEGAIAIKRPQLKGLINGVLRQFQRQQ EELLAEFNASDARYLHPSWLLKRLQKAYPEQWQSIVEANNQRPPMWLRVNRTHHSRDSWL ALLDEAGMKGFPHADYPDAVRLETPAPVHALPGFEDGWVTVQDASAQGCMTWLAPQNGEH ILDLCAAPGGKTTHILEVAPEAQVVAVDIDEQRLSRVYDNLKRLGMKATVKQGDGRYPSQ WCGEQQFDRILLDAPCSATGVIRRHPDIKWLRRDRDIPELAQLQSEILDAIWPHLKSGGT LVYATCSVLPEENSLQIKAFLQRTADAELCETGTPEQPGKQNLPGTEEGDGFFYAKLIKK >gi|296493385|gb|ADTK01000116.1| GENE 11 7552 - 8928 1151 458 aa, chain + ## HITS:1 COG:ECs4155 
KEGG:ns NR:ns ## COG: ECs4155 COG0569 # Protein_GI_number: 15833409 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Escherichia coli O157:H7 # 1 458 1 458 458 871 100.0 0 MKIIILGAGQVGGTLAENLVGENNDITVVDTNGERLRTLQDKFDLRVVQGHGSHPRVLRE AGADDADMLVAVTSSDETNMVACQVAYSLFNTPNRIARIRSPDYVRDADKLFHSDAVPID HLIAPEQLVIDNIYRLIEYPGALQVVNFAEGKVSLAVVKAYYGGPLIGNALSTMREHMPH IDTRVAAIFRHDRPIRPQGSTIVEAGDEVFFIAASQHIRAVMSELQRLEKPYKRIMLVGG GNIGAGLARRLEKDYSVKLIERNQQRAAELAEKLQNTIVFFGDASDQELLAEEHIDQVDL FIAVTNDDEANIMSAMLAKRMGAKKVMVLIQRRAYVDLVQGSVIDIAISPQQATISALLS HVRKADIVGVSSLRRGVAEAIEAVAHGDESTSRVVGRVIDEIKLPPGTIIGAVVRGNDVM IANDNLRIEQGDHVIMFLTDKKFITDVERLFQPSPFFL >gi|296493385|gb|ADTK01000116.1| GENE 12 9058 - 9468 471 136 aa, chain + ## HITS:1 COG:ECs4156 KEGG:ns NR:ns ## COG: ECs4156 COG1970 # Protein_GI_number: 15833410 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Large-conductance mechanosensitive channel # Organism: Escherichia coli O157:H7 # 1 136 1 136 136 243 100.0 5e-65 MSIIKEFREFAMRGNVVDLAVGVIIGAAFGKIVSSLVADIIMPPLGLLIGGIDFKQFAVT LRDAQGDIPAVVMHYGVFIQNVFDFLIVAFAIFMAIKLINKLNRKKEEPAAAPAPTKEEV LLTEIRDLLKEQNNRS >gi|296493385|gb|ADTK01000116.1| GENE 13 9465 - 9683 60 72 aa, chain - ## HITS:1 COG:yhdL KEGG:ns NR:ns ## COG: yhdL COG3036 # Protein_GI_number: 16132253 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 72 1 72 72 128 98.0 3e-30 MSRYQHTKGQIKDNAIEALLHDPLFRQRVEKNKKGKGSYMRKGKHGNRGNWEASGKKVNH FFTTGLLLSGVC >gi|296493385|gb|ADTK01000116.1| GENE 14 9739 - 10164 426 141 aa, chain - ## HITS:1 COG:ECs4157 KEGG:ns NR:ns ## COG: ECs4157 COG0789 # Protein_GI_number: 15833411 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 141 1 141 141 247 100.0 4e-66 MYRIGELAKMAEVTPDTIRYYEKQQMMEHEVRTEGGFRLYTESDLQRLKFIRHARQLGFS 
LESIRELLSIRIDPEHHTCQESKGIVQERLQEVEARIAELQSMQRSLQRLNDACCGTAHS SVYCSILEALEQGASGVKSGC >gi|296493385|gb|ADTK01000116.1| GENE 15 10175 - 10543 261 122 aa, chain - ## HITS:1 COG:no KEGG:SBO_3287 NR:ns ## KEGG: SBO_3287 # Name: yhdN # Def: hypothetical protein # Organism: S.boydii # Pathway: not_defined # 1 122 1 122 122 223 99.0 1e-57 MWLLDQWAERHIAEAQAKGEFDNLAGSGEPLILDDDSHVPPELRAGYRLLKNAGCLPPEL EQRREAIQLLDILRGIRHDDPQYQEVSRRLSLLELKLRQAGLSTDFLRGDYADKLLDKIN DN >gi|296493385|gb|ADTK01000116.1| GENE 16 10650 - 11033 636 127 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803821|ref|NP_289855.1| 50S ribosomal protein L17 [Escherichia coli O157:H7 EDL933] # 1 127 1 127 127 249 100 2e-65 MRHRKSGRQLNRNSSHRQAMFRNMAGSLVRHEIIKTTLPKAKELRRVVEPLITLAKTDSV ANRRLAFARTRDNEIVAKLFNELGPRFASRAGGYTRILKCGFRAGDNAPMAYIELVDRSE KAEAAAE >gi|296493385|gb|ADTK01000116.1| GENE 17 11074 - 12063 1093 329 aa, chain - ## HITS:1 COG:ECs4160 KEGG:ns NR:ns ## COG: ECs4160 COG0202 # Protein_GI_number: 15833414 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, alpha subunit/40 kD subunit # Organism: Escherichia coli O157:H7 # 1 329 1 329 329 613 100.0 1e-175 MQGSVTEFLKPRLVDIEQVSSTHAKVTLEPLERGFGHTLGNALRRILLSSMPGCAVTEVE IDGVLHEYSTKEGVQEDILEILLNLKGLAVRVQGKDEVILTLNKSGIGPVTAADITHDGD VEIVKPQHVICHLTDENASISMRIKVQRGRGYVPASTRIHSEEDERPIGRLLVDACYSPV ERIAYNVEAARVEQRTDLDKLVIEMETNGTIDPEEAIRRAATILAEQLEAFVDLRDVRQP EVKEEKPEFDPILLRPVDDLELTVRSANCLKAEAIHYIGDLVQRTEVELLKTPNLGKKSL TEIKDVLASRGLSLGMRLENWPPASIADE >gi|296493385|gb|ADTK01000116.1| GENE 18 12089 - 12709 1038 206 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803823|ref|NP_289857.1| 30S ribosomal protein S4 [Escherichia coli O157:H7 EDL933] # 1 206 1 206 206 404 100 1e-112 MARYLGPKLKLSRREGTDLFLKSGVRAIDTKCKIEQAPGQHGARKPRLSDYGVQLREKQK VRRIYGVLERQFRNYYKEAARLKGNTGENLLALLEGRLDNVVYRMGFGATRAEARQLVSH KAIMVNGRVVNIASYQVSPNDVVSIREKAKKQSRVKAALELAEQREKPTWLEVDAGKMEG TFKRKPERSDLSADINEHLIVELYSK >gi|296493385|gb|ADTK01000116.1| GENE 19 12743 - 13132 669 129 aa, chain - ## 
PROTEIN SUPPORTED ## NR: gi|15803824|ref|NP_289858.1| 30S ribosomal protein S11 [Escherichia coli O157:H7 EDL933] # 1 129 1 129 129 262 100 3e-69 MAKAPIRARKRVRKQVSDGVAHIHASFNNTIVTITDRQGNALGWATAGGSGFRGSRKSTP FAAQVAAERCADAVKEYGIKNLEVMVKGPGPGRESTIRALNAAGFRITNITDVTPIPHNG CRPPKKRRV >gi|296493385|gb|ADTK01000116.1| GENE 20 13149 - 13505 586 118 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803825|ref|NP_289859.1| 30S ribosomal protein S13 [Escherichia coli O157:H7 EDL933] # 1 118 1 118 118 230 99 1e-59 MARIAGINIPDHKHAVIALTSIYGVGKTRSKAILAAAGIAEDVKISELSEGQIDTLRDEV AKFVVEGDLRREISMSIKRLMDLGCYRGLRHRRGLPVRGQRTKTNARTRKGPRKPIKK >gi|296493385|gb|ADTK01000116.1| GENE 21 13652 - 13768 198 38 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803826|ref|NP_289860.1| 50S ribosomal protein L36 [Escherichia coli O157:H7 EDL933] # 1 38 1 38 38 80 100 1e-14 MKVRASVKKLCRNCKIVKRDGVIRVICSAEPKHKQRQG >gi|296493385|gb|ADTK01000116.1| GENE 22 13800 - 15131 1258 443 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 [alpha proteobacterium BAL199] # 12 440 16 444 447 489 56 1e-137 MAKQPGLDFQSAKGGLGELKRRLLFVIGALIVFRIGSFIPIPGIDAAVLAKLLEQQRGTI IEMFNMFSGGALSRASIFALGIMPYISASIIIQLLTVVHPTLAEIKKEGESGRRKISQYT RYGTLVLAIFQSIGIATGLPNMPGMQGLVINPGFAFYFTAVVSLVTGTMFLMWLGEQITE RGIGNGISIIIFAGIVAGLPPAIAHTIEQARQGDLHFLVLLLVAVLVFAVTFFVVFVERG QRRIVVNYAKRQQGRRVYAAQSTHLPLKVNMAGVIPAIFASSIILFPATIASWFGGGTGW NWLTTISLYLQPGQPLYVLLYASAIIFFCFFYTALVFNPRETADNLKKSGAFVPGIRPGE QTAKYIDKVMTRLTLVGALYITFICLIPEFMRDAMKVPFYFGGTSLLIVVVVIMDFMAQV QTLMMSSQYESALKKANLKGYGR >gi|296493385|gb|ADTK01000116.1| GENE 23 15139 - 15573 715 144 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803828|ref|NP_289862.1| 50S ribosomal protein L15 [Escherichia coli O157:H7 EDL933] # 1 144 1 144 144 280 100 1e-74 MRLNTLSPAEGSKKAGKRLGRGIGSGLGKTGGRGHKGQKSRSGGGVRRGFEGGQMPLYRR LPKFGFTSRKAAITAEVRLSDLAKVEGGVVDLNTLKAANIIGIQIEFAKVILAGEVTTPV TVRGLRVTKGARAAIEAAGGKIEE >gi|296493385|gb|ADTK01000116.1| GENE 24 15577 - 15756 291 59 aa, chain 
- ## PROTEIN SUPPORTED ## NR: gi|15803829|ref|NP_289863.1| 50S ribosomal protein L30 [Escherichia coli O157:H7 EDL933] # 1 59 1 59 59 116 100 2e-25 MAKTIKITQTRSAIGRLPKHKATLLGLGLRRIGHTVEREDTPAIRGMINAVSFMVKVEE >gi|296493385|gb|ADTK01000116.1| GENE 25 15760 - 16263 834 167 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803830|ref|NP_289864.1| 30S ribosomal protein S5 [Escherichia coli O157:H7 EDL933] # 1 167 1 167 167 325 100 2e-88 MAHIEKQAGELQEKLIAVNRVSKTVKGGRIFSFTALTVVGDGNGRVGFGYGKAREVPAAI QKAMEKARRNMINVALNNGTLQHPVKGVHTGSRVFMQPASEGTGIIAGGAMRAVLEVAGV HNVLAKAYGSTNPINVVRATIDGLENMNSPEMVAAKRGKSVEEILGK >gi|296493385|gb|ADTK01000116.1| GENE 26 16278 - 16631 572 117 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803831|ref|NP_289865.1| 50S ribosomal protein L18 [Escherichia coli O157:H7 EDL933] # 1 117 1 117 117 224 100 5e-58 MDKKSARIRRATRARRKLQELGATRLVVHRTPRHIYAQVIAPNGSEVLVAASTVEKAIAE QLKYTGNKDAAAAVGKAVAERALEKGIKDVSFDRSGFQYHGRVQALADAAREAGLQF >gi|296493385|gb|ADTK01000116.1| GENE 27 16641 - 17174 902 177 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803832|ref|NP_289866.1| 50S ribosomal protein L6 [Escherichia coli O157:H7 EDL933] # 1 177 1 177 177 352 100 3e-96 MSRVAKAPVVVPAGVDVKINGQVITIKGKNGELTRTLNDAVEVKHADNTLTFGPRDGYAD GWAQAGTARALLNSMVIGVTEGFTKKLQLVGVGYRAAVKGNVINLSLGFSHPVDHQLPAG ITAECPTQTEIVLKGADKQVIGQVAADLRAYRRPEPYKGKGVRYADEVVRTKEAKKK >gi|296493385|gb|ADTK01000116.1| GENE 28 17187 - 17579 641 130 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803833|ref|NP_289867.1| 30S ribosomal protein S8 [Escherichia coli O157:H7 EDL933] # 1 130 1 130 130 251 100 5e-66 MSMQDPIADMLTRIRNGQAANKAAVTMPSSKLKVAIANVLKEEGFIEDFKVEGDTKPELE LTLKYFQGKAVVESIQRVSRPGLRIYKRKDELPKVMAGLGIAVVSTSKGVMTDRAARQAG LGGEIICYVA >gi|296493385|gb|ADTK01000116.1| GENE 29 17613 - 17918 513 101 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803834|ref|NP_289868.1| 30S ribosomal protein S14 [Escherichia coli O157:H7 EDL933] # 1 101 1 101 101 202 100 3e-51 MAKQSMKAREVKRVALADKYFAKRAELKAIISDVNASDEDRWNAVLKLQTLPRDSSPSRQ 
RNRCRQTGRPHGFLRKFGLSRIKVREAAMRGEIPGLKKASW >gi|296493385|gb|ADTK01000116.1| GENE 30 17933 - 18472 911 179 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803835|ref|NP_289869.1| 50S ribosomal protein L5 [Escherichia coli O157:H7 EDL933] # 1 179 1 179 179 355 100 2e-97 MAKLHDYYKDEVVKKLMTEFNYNSVMQVPRVEKITLNMGVGEAIADKKLLDNAAADLAAI SGQKPLITKARKSVAGFKIRQGYPIGCKVTLRGERMWEFFERLITIAVPRIRDFRGLSAK SFDGRGNYSMGVREQIIFPEIDYDKVDRVRGLDITITTTAKSDEEGRALLAAFDFPFRK >gi|296493385|gb|ADTK01000116.1| GENE 31 18487 - 18801 516 104 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803836|ref|NP_289870.1| 50S ribosomal protein L24 [Escherichia coli O157:H7 EDL933] # 1 104 1 104 104 203 100 1e-51 MAAKIRRDDEVIVLTGKDKGKRGKVKNVLSSGKVIVEGINLVKKHQKPVPALNQPGGIVE KEAAIQVSNVAIFNAATGKADRVGFRFEDGKKVRFFKSNSETIK >gi|296493385|gb|ADTK01000116.1| GENE 32 18812 - 19183 617 123 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803837|ref|NP_289871.1| 50S ribosomal protein L14 [Escherichia coli O157:H7 EDL933] # 1 123 1 123 123 242 100 3e-63 MIQEQTMLNVADNSGARRVMCIKVLGGSHRRYAGVGDIIKITIKEAIPRGKVKKGDVLKA VVVRTKKGVRRPDGSVIRFDGNACVLLNNNSEQPIGTRIFGPVTRELRSEKFMKIISLAP EVL >gi|296493385|gb|ADTK01000116.1| GENE 33 19348 - 19602 431 84 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803838|ref|NP_289872.1| 30S ribosomal protein S17 [Escherichia coli O157:H7 EDL933] # 1 84 1 84 84 170 100 1e-41 MTDKIRTLQGRVVSDKMEKSIVVAIERFVKHPIYGKFIKRTTKLHVHDENNECGIGDVVE IRECRPLSKTKSWTLVRVVEKAVL >gi|296493385|gb|ADTK01000116.1| GENE 34 19602 - 19793 301 63 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803839|ref|NP_289873.1| 50S ribosomal protein L29 [Escherichia coli O157:H7 EDL933] # 1 63 1 63 63 120 100 1e-26 MKAKELREKSVEELNTELLNLLREQFNLRMQAASGQLQQSHLLKQVRRDVARVKTLLNEK AGA >gi|296493385|gb|ADTK01000116.1| GENE 35 19793 - 20203 694 136 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803840|ref|NP_289874.1| 50S ribosomal protein L16 [Escherichia coli O157:H7 EDL933] # 1 136 1 136 136 271 100 3e-72 
MLQPKRTKFRKMHKGRNRGLAQGTDVSFGSFGLKAVGRGRLTARQIEAARRAMTRAVKRQ GKIWIRVFPDKPITEKPLAVRMGKGKGNVEYWVALIQPGKVLYEMDGVPEELAREAFKLA AAKLPIKTTFVTKTVM >gi|296493385|gb|ADTK01000116.1| GENE 36 20216 - 20917 1183 233 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15833433|ref|NP_312206.1| 30S ribosomal protein S3 [Escherichia coli O157:H7 str. Sakai] # 1 233 1 233 233 460 100 1e-129 MGQKVHPNGIRLGIVKPWNSTWFANTKEFADNLDSDFKVRQYLTKELAKASVSRIVIERP AKSIRVTIHTARPGIVIGKKGEDVEKLRKVVADIAGVPAQINIAEVRKPELDAKLVADSI TSQLERRVMFRRAMKRAVQNAMRLGAKGIKVEVSGRLGGAEIARTEWYREGRVPLHTLRA DIDYNTSEAHTTYGVIGVKVWIFKGEILGGMAAVEQPEKPAAQPKKQQRKGRK >gi|296493385|gb|ADTK01000116.1| GENE 37 20935 - 21267 537 110 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803842|ref|NP_289876.1| 50S ribosomal protein L22 [Escherichia coli O157:H7 EDL933] # 1 110 1 110 110 211 100 5e-54 METIAKHRHARSSAQKVRLVADLIRGKKVSQALDILTYTNKKAAVLVKKVLESAIANAEH NDGADIDDLKVTKIFVDEGPSMKRIMPRAKGRADRILKRTSHITVVVSDR >gi|296493385|gb|ADTK01000116.1| GENE 38 21282 - 21560 484 92 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803843|ref|NP_289877.1| 30S ribosomal protein S19 [Escherichia coli O157:H7 EDL933] # 1 92 1 92 92 191 100 7e-48 MPRSLKKGPFIDLHLLKKVEKAVESGDKKPLRTWSRRSTIFPNMIGLTIAVHNGRQHVPV FVTDEMVGHKLGEFAPTRTYRGHAADKKAKKK >gi|296493385|gb|ADTK01000116.1| GENE 39 21577 - 22398 1438 273 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803844|ref|NP_289878.1| 50S ribosomal protein L2 [Escherichia coli O157:H7 EDL933] # 1 273 1 273 273 558 100 1e-158 MAVVKCKPTSPGRRHVVKVVNPELHKGKPFAPLLEKNSKSGGRNNNGRITTRHIGGGHKQ AYRIVDFKRNKDGIPAVVERLEYDPNRSANIALVLYKDGERRYILAPKGLKAGDQIQSGV DAAIKPGNTLPMRNIPVGSTVHNVEMKPGKGGQLARSAGTYVQIVARDGAYVTLRLRSGE MRKVEADCRATLGEVGNAEHMLRVLGKAGAARWRGVRPTVRGTAMNPVDHPHGGGEGRNF GKHPVTPWGVQTKGKKTRSNKRTDKFIVRRRSK >gi|296493385|gb|ADTK01000116.1| GENE 40 22416 - 22718 490 100 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803845|ref|NP_289879.1| 50S ribosomal protein L23 [Escherichia coli O157:H7 EDL933] # 1 100 1 100 100 193 100 2e-48 
MIREERLLKVLRAPHVSEKASTAMEKSNTIVLKVAKDATKAEIKAAVQKLFEVEVEVVNT LVVKGKVKRHGQRIGRRSDWKKAYVTLKEGQNLDFVGGAE >gi|296493385|gb|ADTK01000116.1| GENE 41 22715 - 23320 992 201 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803846|ref|NP_289880.1| 50S ribosomal protein L4 [Escherichia coli O157:H7 EDL933] # 1 201 1 201 201 386 100 1e-107 MELVLKDAQSALTVSETTFGRDFNEALVHQVVVAYAAGARQGTRAQKTRAEVTGSGKKPW RQKGTGRARSGSIKSPIWRSGGVTFAARPQDHSQKVNKKMYRGALKSILSELVRQDRLIV VEKFSVEAPKTKLLAQKLKDMALEDVLIITGELDENLFLAARNLHKVDVRDATGIDPVSL IAFDKVVMTADAVKQVEEMLA >gi|296493385|gb|ADTK01000116.1| GENE 42 23331 - 23960 1056 209 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803847|ref|NP_289881.1| 50S ribosomal protein L3 [Escherichia coli O157:H7 EDL933] # 1 209 1 209 209 411 100 1e-114 MIGLVGKKVGMTRIFTEDGVSIPVTVIEVEANRVTQVKDLANDGYRAIQVTTGAKKANRV TKPEAGHFAKAGVEAGRGLWEFRLAEGEEFTVGQSISVELFADVKKVDVTGTSKGKGFAG TVKRWNFRTQDATHGNSLSHRVPGSIGQNQTPGKVFKGKKMAGQMGNERVTVQSLDVVRV DAERNLLLVKGAVPGATGSDLIVKPAVKA >gi|296493385|gb|ADTK01000116.1| GENE 43 23993 - 24304 513 103 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803848|ref|NP_289882.1| 30S ribosomal protein S10 [Escherichia coli O157:H7 EDL933] # 1 103 1 103 103 202 100 3e-51 MQNQRIRIRLKAFDHRLIDQATAEIVETAKRTGAQVRGPIPLPTRKERFTVLISPHVNKD ARDQYEIRTHLRLVDIVEPTEKTVDALMRLDLAAGVDVQISLG >gi|296493385|gb|ADTK01000116.1| GENE 44 24367 - 24573 92 68 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGECNRSYVAPRLGALLGSQIRLTEVQIEPAVNYDKPAHYTYLTTERKRIASKVNSLRVI SQVRSVHF >gi|296493385|gb|ADTK01000116.1| GENE 45 24542 - 24961 294 139 aa, chain - ## HITS:1 COG:no KEGG:B21_03124 NR:ns ## KEGG: B21_03124 # Name: pioO # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 139 1 139 139 266 100.0 2e-70 MFEFYIAAREQKETGHPGIFSRQKHSTIIYVICLLLICLWFAGMVLVGGYARQLWVLWIV KAEVTVEAETPAFKQSTQHYFFKKQPLPVVESVEEEDDPGVAVENAPSSSEDEENTVEES DEKAGLRERVKNALNELER >gi|296493385|gb|ADTK01000116.1| GENE 46 24963 - 26432 405 489 aa, chain - ## HITS:1 COG:no KEGG:JW3285 NR:ns ## KEGG: JW3285 # 
Name: gspA # Def: general secretory pathway component, cryptic # Organism: E.coli_J # Pathway: not_defined # 1 489 1 489 489 949 100.0 0 MSTRREVILSWLCEKRQTWRLCYLLGEAGSGKTWLAQQLQKDKHRRVITLSLVVSWQGKA AWIVTDDNAAEQGCRDSAWTRDEMAGQLLHALHRTDSRCPLIIIENAHLNHRRILDDLQR AISLIPDGQFLLIGRPDRKVERDFKKQGIELVSIGRLTEHELKASILEGQNIDQPDLLLT ARVLKRIALLCRGDRRKLALAGETIRLLQQAEQTSVFTAKQWRMIYRILGDNRPRKMQLA VVMSGTIIALTCGWLLLSSFTATLPVPAWLIPVTPVVKQDMTKDIAHVVMRDSEALSVLY GVWGYEVPADSAWCDQAVRAGLACKSGNASLQTLVDQNLPWIASLKVGDKKLPVVVVRVG EASVDVLVGQQTWTLTHKWFESVWTGDYLLLWKMSPEGESTITRDSSEEEILWLETMLNR ALHISTEPSAEWRPLLVEKIKQFQKSHHLKTDGVVGFSTLVHLWQVAGESAYLYRDEANI SPETTVKGK >gi|296493385|gb|ADTK01000116.1| GENE 47 26849 - 27427 57 192 aa, chain + ## HITS:1 COG:gspC KEGG:ns NR:ns ## COG: gspC COG3031 # Protein_GI_number: 16131203 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulC # Organism: Escherichia coli K12 # 1 192 80 271 271 362 99.0 1e-100 MAVNQETPKLSIALNGIVLTSNDETSFVLINEGSEQKRYSLNEALESAPGTFIRKINKTS VVFETHGHYEKVTLHPGLPDIIKQPDSESQNVLADYIIATPIRDGEQIYGLRLNPRKGLN AFTTSLLQPGDIALRINNLSLTHPDEVSQALSLLLTQQSAQFTIRRNGVPRLINVSVGEL TGMNGLRHERTQ >gi|296493385|gb|ADTK01000116.1| GENE 48 27411 - 29363 1499 650 aa, chain + ## HITS:1 COG:gspD KEGG:ns NR:ns ## COG: gspD COG1450 # Protein_GI_number: 16131204 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulD # Organism: Escherichia coli K12 # 1 650 5 654 654 1221 99.0 0 MKGLNKITCCLLAALLMPCAGHAENEQYGANFNNADIRQFVEIVGQHLGKTILIDPSVQG TISVRSNDTFSQQEYYQFFLSILDLYGYSVITLDNGFLKVVRSANVKTSPGMIADSSRPG VGDELVTRIVPLENVPARDLAPLLRQMMDAGSVGNVVHYEPSNVLILTGRASTINKLIEV IKRVDVIGTEKQQIIHLEYASAEDLAEILNQLISESHGKSQMPALLSAKIVADKRTNSLI ISGPEKARQRITSLLKSLDVEESEEGNTRVYYLKYAKATNLVEVLTGVSEKLKDEKGNAR KPSSSGAMDNVAITADEQTNSLVITADQSVQEKLATVIARLDIRHAQVLVEAIIVEVQDG NGLNLGVQWANKNVGAQQFTNTGLPIFNAAQGVADYKKNGGITSANPAWDMFSAYNGMAA 
GFFNGDWGVLLTALASNNKNDILATPSIVTLDNKLASFNVGQDVPVLSGSQTTSGDNVFN TVERKTVGTKLKVTPQVNEGDAVLLEIEQEVSSVDSSSNSTLGPTFNTRTIQNAVLVKTG ETVVLGGLLDDFSKEQVSKVPLLGDIPLVGQLFRYTSTERAKRNLMVFIRPTIIRDDDVY HSLSKEKYTRYRQEQQQRIDGKSKALVGSEDLPVLDENTFNSHAPAPSSR >gi|296493385|gb|ADTK01000116.1| GENE 49 29373 - 30854 1086 493 aa, chain + ## HITS:1 COG:gspE KEGG:ns NR:ns ## COG: gspE COG2804 # Protein_GI_number: 16131205 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB # Organism: Escherichia coli K12 # 1 493 1 493 493 931 99.0 0 MRIHSPYPASWALAQRIGYLYSEGEIIYLADTPFERLLDIQRQVGQCQTMTSLSQADFEA RLEAVFHQNTGESQQIAQDIDQSVDLLSLSEEMPANEDLLNEDSAAPVIRLINAILSEAI KETASDIHIETYEKTMSIRFRIDGVLRTILQPNKKLAALLISRIKVMARLDIAEKRIPQD GRISLRIGRRNIDVRVSTLPSIYGERAVLRLLDKNSLQLSLNNLGMTAADKQDLENLIQL PHGIILVTGPTGSGKSTTLYAILSALNTPGRNILTVEDPVEYELEGIGQTQVNTRVDMSF ARGLRAILRQDPDVVMVGEIRDTETAQIAVQASLTGHLVLSTLHTNSASGAVTRLRDMGV ESFLLSSSLAGIIAQRLVRRLCPQCRQFTPVSPQQAQMFKYHQLAVTTIGTPVGCPHCHQ SGYQGRMAIHEMMVVTPELRAAIHENVDEQALERLVRQQHNALIKNGLQKVISGDTSWDE VMRVASATLESEA >gi|296493385|gb|ADTK01000116.1| GENE 50 30851 - 32047 755 398 aa, chain + ## HITS:1 COG:hofF KEGG:ns NR:ns ## COG: hofF COG1459 # Protein_GI_number: 16131206 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Escherichia coli K12 # 1 398 1 398 398 717 100.0 0 MNYRYRAMTQDGQKLQGIIDANDERQARLRLREEGLFLLDIRPQKSSGVKTRRPRISHSE LTLFTRQLATLSAAALPLEESLAVIGQQSSNKRLGDVLNQVRSAILEGHPLSDALQHFPT LFDSLYRTLVKAGEKSGLLAPVLEKLADYNENRQKIRSKLIQSLIYPCMLTTVAIGVVII LLTAVVPKITEQFVHMKQQLPLSTRILLGLSDTLQRTGPTLLATVFIVAVGFWLWLKRGN NRHRFHAMLLRVALIGPLICAINSARYLRTLSILQSSGVPLLDGMNLSTESLNNLEIRQR LANAAENVRQGNSIHLSLEQTAIFPPMMLYMVASGEKSGQLGTLMVRAADNQETLQQNRI ALTLSIFEPALIITMALIVLFIVVSVLQPLLQLNSMIN >gi|296493385|gb|ADTK01000116.1| GENE 51 32057 - 32494 523 145 aa, chain + ## 
HITS:1 COG:hofG KEGG:ns NR:ns ## COG: hofG COG2165 # Protein_GI_number: 16131207 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, pseudopilin PulG # Organism: Escherichia coli K12 # 1 145 1 145 145 258 99.0 2e-69 MRATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKVVSDIVALENALDMYK LDNHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDL LSAGPDGEMGTEDDITNWGLSKKKK >gi|296493385|gb|ADTK01000116.1| GENE 52 32502 - 33011 359 169 aa, chain + ## HITS:1 COG:hofH KEGG:ns NR:ns ## COG: hofH COG2165 # Protein_GI_number: 16131208 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, pseudopilin PulG # Organism: Escherichia coli K12 # 1 169 1 169 169 316 98.0 1e-86 MNQQRGFTLLEMMLVLALVAITASVVLFTYGREDVTNTRARETAARFTAALELAIDRATL SGQPVGIHFSDSAWCIMVPGKTPSAWRWVPLQEDAADESQNDWDEELSIHLQPFKPDDSN QPQVVILADGQITPFSLLMANAGTGEPLLTLVCSGSWPLDQTLARDTRP >gi|296493385|gb|ADTK01000116.1| GENE 53 33008 - 33385 260 125 aa, chain + ## HITS:1 COG:gspI KEGG:ns NR:ns ## COG: gspI COG2165 # Protein_GI_number: 16131209 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, pseudopilin PulG # Organism: Escherichia coli K12 # 1 125 14 138 138 205 100.0 2e-53 MNKQSGMTLLEVLLAMSIFTAVALTLMSSMQGQRNAIERMRNETLALWIADNQLQSQDSF GEENTSSSGKELINGEEWNWRSDIHSSKDGTLLERTITVTLPSGQTTSLTRYQSIDNKSG QAQDD >gi|296493385|gb|ADTK01000116.1| GENE 54 33378 - 33965 472 195 aa, chain + ## HITS:1 COG:gspJ KEGG:ns NR:ns ## COG: gspJ COG4795 # Protein_GI_number: 16131210 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulJ # Organism: Escherichia coli K12 # 1 195 1 195 195 325 100.0 4e-89 MINRQQGFTLLEVMAALAIFSMLSVLAFMIFSQASELHQRSQKEIQQFNQLQRTITILDN DLLQLVARRNRSTDKIMVLGEEAIFTTQSRDPLAPLSEAQTLLTVHWYLRNHTLYRAVRT 
SVDGRKDQPAQAMLEHVESFLLESNSGESQELPLSVTLHLQTQQYGGLQRRFALPEQLAR EESPAQTQAGNNNHE >gi|296493385|gb|ADTK01000116.1| GENE 55 33958 - 34941 508 327 aa, chain + ## HITS:1 COG:gspK KEGG:ns NR:ns ## COG: gspK COG3156 # Protein_GI_number: 16131211 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulK # Organism: Escherichia coli K12 # 1 327 1 327 327 587 100.0 1e-167 MNNEQRGVALLIVLMLLALMAALAADMTLSFHSQLQRTRQVNHHLQRQYDIELAEKLALA SLTQDVKDNDRQTTLQQYWAQPQQLQLEDGNTVKWQLRDAQHCFNLNALAKISDDPLASP DFPAQVFSALLINAGIDRGNTDEIVQSIADYIDVDDSPRFHGAEDSFYQSQTPPRHSANQ MLFLTGELRQIKGITENIYQRLIPYVCVLPTTELSINLNMLTENDIPLFRALFLNNITDA DARVLLQKRPREGWLTTDAFLYWAQQDFSGVKPLVAQVKRHLFPYSRYFTLSTESISDEQ SQGWQSHIFFNRKQQSAQIYRRTLQLY >gi|296493385|gb|ADTK01000116.1| GENE 56 34956 - 36119 527 387 aa, chain + ## HITS:1 COG:gspL KEGG:ns NR:ns ## COG: gspL COG3297 # Protein_GI_number: 16131212 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulL # Organism: Escherichia coli K12 # 1 387 2 388 388 712 99.0 0 MPESLMVIRSSSTLRKHWEWMTFSADSVSSVHTLTDDLPLESLADQPGAGNVHLLIPPEG LLYRSLTLPNAKYKLTAQTLQWLAEETLPDNTQDWHWTVVDKQNESVEVIGIQSEKLSRY LERLHTAGLNVTRVLPDGCYLPWEVDSWTLVNQQKSWLIRSAAHAFNELDEHWLQHLAAQ FPPENMLCYGVVPHGVAAANPLIQHPEIPSLSLYSADIAFQRYDMLHGIFRKQKTVSKSG KWLARLAVSCLVLAILSFVGSRSIALWHTLKIEDQLQQQQQETWQRYFPQIKRTHNFHFY FKQQLAQQYPEAVPLLYHLQTLLLEHPELQLMEANYSQKQKSLTLKMSAKSEANIDRFCE LTQSWLPMEKTEKDPVSGVWTVRNSGK >gi|296493385|gb|ADTK01000116.1| GENE 57 36116 - 36577 271 153 aa, chain + ## HITS:1 COG:no KEGG:EcHS_A3528 NR:ns ## KEGG: EcHS_A3528 # Name: not_defined # Def: general secretion pathway protein M # Organism: E.coli_HS # Pathway: Bacterial secretion system [PATH:ecx03070] # 1 153 9 161 161 291 100.0 5e-78 MIKSWWAEKSTSEKQIVAALAVLSLGVFCWLGVIKPIDTYIAEHQSHAQKIKKDIKWMQD QASTHGLLGHPALTQPIKNILLEEAKRENLAITLENGPDNTLTIHPVTAPLENVSRWLTT AQVTYGIVIEDLQFTLAGNEEITLRHLSFREQQ 
>gi|296493385|gb|ADTK01000116.1| GENE 58 36577 - 37242 389 221 aa, chain + ## HITS:1 COG:hofD KEGG:ns NR:ns ## COG: hofD COG1989 # Protein_GI_number: 16131214 # Func_class: N Cell motility; O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, prepilin signal peptidase PulO and related peptidases # Organism: Escherichia coli K12 # 1 221 1 221 225 354 99.0 7e-98 MTMLLPLFILVGFIAGYFVNAIAYHLSPLEDKTALTFRQVLVHFRQKKYAWHDTVPLILC VAAAIACALAPFTPIVTGALFLYFCFVLTLSVIDFRTQLLPDKLTLPLLWLGLVFNAQYG LIDLHDAVYGAVAGYGVLWCVYWGVWLVCHKEGLGYGDFKLLAAAGAWCGWQTLPMILLI ASLGGIGYAIVSQLLQRRTITTIAFGPWLALGSMINLGYLA >gi|296493385|gb|ADTK01000116.1| GENE 59 37283 - 37759 641 158 aa, chain - ## HITS:1 COG:bfr KEGG:ns NR:ns ## COG: bfr COG2193 # Protein_GI_number: 16131215 # Func_class: P Inorganic ion transport and metabolism # Function: Bacterioferritin (cytochrome b1) # Organism: Escherichia coli K12 # 1 158 1 158 158 278 100.0 3e-75 MKGDTKVINYLNKLLGNELVAINQYFLHARMFKNWGLKRLNDVEYHESIDEMKHADRYIE RILFLEGLPNLQDLGKLNIGEDVEEMLRSDLALELDGAKNLREAIGYADSVHDYVSRDMM IEILRDEEGHIDWLETELDLIQKMGLQNYLQAQIREEG >gi|296493385|gb|ADTK01000116.1| GENE 60 37831 - 38025 169 64 aa, chain - ## HITS:1 COG:bfd KEGG:ns NR:ns ## COG: bfd COG2906 # Protein_GI_number: 16131216 # Func_class: P Inorganic ion transport and metabolism # Function: Bacterioferritin-associated ferredoxin # Organism: Escherichia coli K12 # 1 64 1 64 64 119 100.0 1e-27 MYVCLCNGISDKKIRQAVRQFSPHSFQQLKKFIPVGNQCGKCVRAAREVMEDELMQLPEF KESA Prediction of potential genes in microbial genomes Time: Mon May 16 15:25:58 2011 Seq name: gi|296493384|gb|ADTK01000117.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont297.2, whole genome shotgun sequence Length of sequence - 2699 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 55 - 2412 2068 ## EcHS_A3533 bifunctional chitinase/lysozyme (EC:3.2.1.14 3.2.1.17) - Prom 2593 - 2652 6.6 Predicted protein(s) >gi|296493384|gb|ADTK01000117.1| GENE 1 55 - 2412 2068 785 aa, chain - ## HITS:1 COG:no KEGG:EcHS_A3533 NR:ns ## KEGG: EcHS_A3533 # Name: chiA # Def: bifunctional chitinase/lysozyme (EC:3.2.1.14 3.2.1.17) # Organism: E.coli_HS # Pathway: Amino sugar and nucleotide sugar metabolism [PATH:ecx00520] # 18 785 130 897 897 1416 97.0 0 MIGMGLVCSALPALAMEAWNNQQGGNKYQVIFDGKIYENAWWVASSNCPGDAKSNDASNP WRYVRAATATEISETSNPQSCTSAPQPSPDVKPAPDVKPAPDVQPAPADKSNDNYAVVAW KGQEGSSTWYVIYNGGIYKNAWWVGAANCPGDAKENDASNPWRYVRAATATEISQYGNPG SCSVKPDNNGGAVTPVDPTPETPVTPTPDNNEPSTPADSGNDYSLQTWSGQEGSEIYHVI FNGNVYKNAWWVGSKDCPRGTSAENSNNPWRLVRTATAAELSQYGNPTTCEIDNGGVIVA DGFQASKAYSADSIVDYNDAHYKTSVDQDAWGFVPGGDNPWKKYEPAKAWSASTVYVKGD RVVVDGQAYEALFWTQSDNPALVANQNATGSNSRPWKPLGKAQSYSNEELNNAPQFNPET LYASDTLIRFNGVNYISQSKVQKVSPSDSNPWRVFVDWTGTKERVGTPKKAWPKHVYAPY VDFTLNTIPDLAALAKNHNVNHFTLAFVVSKDANTCLPTWGTAYGMQNYAQYSKIKALRE AGGDVMLSIGGANNAPLAASCKNVDDLMQHYYDIVDNLNLKVLDFDIEGTWVADQASIER RNLAVKKVQDKWKSEGKDIAIWYTLPILPTGLTPEGMNVLSDAKAKGVELAGVNVMTMDY GNAICQSANTEGQNIHGKCATSAIANLHSQLKGLHPNKSDAEIDAMMGTTPMVGVNDVQG EVFYLSDARLVMQDAQKRNLGMVGIWSIARDLPGGTNLSPEFHGLTKEQAPKYAFSEIFA PFTKQ Prediction of potential genes in microbial genomes Time: Mon May 16 15:26:10 2011 Seq name: gi|296493383|gb|ADTK01000118.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont305.1, whole genome shotgun sequence Length of sequence - 8153 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 1, operones - 1 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.000 - CDS 108 - 1754 1788 ## COG0579 Predicted dehydrogenase - Prom 1845 - 1904 5.9 2 1 Op 2 4/0.000 - CDS 1972 - 3615 195 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 3 1 Op 3 4/0.000 - CDS 3691 - 4338 477 ## COG3145 Alkylated DNA repair protein 4 1 Op 4 3/0.000 - CDS 4341 - 5405 
612 ## COG2169 Adenosine deaminase 5 1 Op 5 4/0.000 - CDS 5479 - 6534 1194 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis - Prom 6559 - 6618 2.0 - Term 6590 - 6631 5.0 6 1 Op 6 . - CDS 6646 - 7749 1321 ## COG3203 Outer membrane protein (porin) Predicted protein(s) >gi|296493383|gb|ADTK01000118.1| GENE 1 108 - 1754 1788 548 aa, chain - ## HITS:1 COG:ECs3099 KEGG:ns NR:ns ## COG: ECs3099 COG0579 # Protein_GI_number: 15832353 # Func_class: R General function prediction only # Function: Predicted dehydrogenase # Organism: Escherichia coli O157:H7 # 1 548 1 548 548 1072 99.0 0 MKKVTAMLFSMAVGLNAVSMAAKAKASEEQETDVLLIGGGIMSATLGTYLRELEPEWSMT MVERLEGVAQESSNGWNNAGTGHSALMELNYTPQNADGSISIEKAVAINEAFQISRQFWA HQVERGVLRTPRSFINTVPHMSFVWGEDNVNFLRARYAALQQSSLFRGMRYSEDHAQIKE WAPLVMEGRDPQQKVAATRTEIGTDVNYGEITRQLIASLQKKSNFSLQLSSEVRALKRND DNTWTVTVADLKNGTAQNIRAKFVFIGAGGAALKLLQESGIPEAKDYAGFPVGGQFLVSE NPDVVNHHLAKVYGKASVGAPPMSVPHIDTRVLDGKRVVLFGPFATFSTKFLKNGSLWDL MSSTTTSNVMPMMHVGLDNFDLVKYLVSQVMLSEEDRFEALKEYYPQAKKEDWRLWQAGQ RVQIIKHDAEKGGVLRLGTEVVSDQQGTIAALLGASPGASTAAPIMLDLLEKVFGDRVSS PQWQATLKAIVPSYGRKLNGDVAATERELQYTSEVLGLKYDKPQAADSTPKPQLKPKPVQ KEVADIAL >gi|296493383|gb|ADTK01000118.1| GENE 2 1972 - 3615 195 547 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 326 541 133 357 398 79 30 7e-15 MELLVLVWRQYRWPFISVMALSLASAALGIGLIAFINQRLIETADTSLLVLPEFLGLLLL LMAVTLGSQLALTTLGHHFVYRLRSEFIKRILDTHVERIEQLGSASLLAGLTSDVRNITI AFVRLPELVQGIILTIGSAAYLWMLSGKMLLVTAIWMAITIWGGFVLVARVYKHMATLRE TEDKLYTDFQTVLEGRKELTLNRERAEYVFNNLYIPDAQEYRHHIIRADTFHLSAVNWSN IMMLGAIGLVFWMANSLGWADTNVAATYSLTLLFLRTPLLSAVGALPTLLTAQVAFNKLN KFALAPFKAEFPRPQAFPNWQTLELRNVTFSYQDNAFSVGPINLTIKRGELLFLIGGNGS GKSTLAMLLTGLYQPQSGEILLDGKPVSGEQPEDYRKLFSAVFTDVWLFDQLLGPEGKPA NPQLVEKWLAQLKMAHKLELSNGRIVNLKLSKGQKKRVALLLALAEERDIILLDEWAADQ DPHFRREFYQVLLPLMQEMGKTIFAISHDDHYFIHADRLLEMRNGQLSELTGEERDAASR DAVARTA >gi|296493383|gb|ADTK01000118.1| GENE 
3 3691 - 4338 477 215 aa, chain - ## HITS:1 COG:ECs3101 KEGG:ns NR:ns ## COG: ECs3101 COG3145 # Protein_GI_number: 15832355 # Func_class: L Replication, recombination and repair # Function: Alkylated DNA repair protein # Organism: Escherichia coli O157:H7 # 1 215 2 216 216 442 98.0 1e-124 MDLFADAEPWQEPLAAGAVILHRFAFNAAEQLIRDINDVASQSPFRQMVTPGGYTMSVAM TNCGHLGWTSHRQGYLYSPIDPQTNKPWPAMPQSFHNLCQRAATAAGYPDFQPDACLINR YAPGAKLSLHQDKDEPDLRAPIVSVSLGLPAIFQFGGLKRNDPLKRLLLEHGDVVVWGGE SRLFYHGIQPLKAGFHPLTTDCRYNLTFRQAGKKE >gi|296493383|gb|ADTK01000118.1| GENE 4 4341 - 5405 612 354 aa, chain - ## HITS:1 COG:ada_1 KEGG:ns NR:ns ## COG: ada_1 COG2169 # Protein_GI_number: 16130150 # Func_class: F Nucleotide transport and metabolism # Function: Adenosine deaminase # Organism: Escherichia coli K12 # 1 184 1 184 184 363 100.0 1e-100 MKKATCLTDDQRWQSVLARDPNADGEFVFAVRTTGIFCRPSCRARHALRENVSFYANASE ALAAGFRPCKRCQPEKANAQQHRLDKITHACRLLEQETPVTLEALADQVAMSPFHLHRLF KATTGMTPKAWQQAWRARRLRESLAKGESVTTSILNAGFPDSSSYYRKADETLGMTAKQF RHGGENLAVRYALADCELGRCLVAESERGICAILLGDDDATLISELQQMFPAADNAPADL MFQQHVREVIASLNQRDTPLTLPLDIRGTAFQQQVWQALRTIPCGETVSYQQLANAIGKP KAVRAVASACAANKLAIIIPCHRVVRGDGTLSGYRWGVSRKAQLLRREAENEER >gi|296493383|gb|ADTK01000118.1| GENE 5 5479 - 6534 1194 351 aa, chain - ## HITS:1 COG:ECs3103 KEGG:ns NR:ns ## COG: ECs3103 COG1477 # Protein_GI_number: 15832357 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Escherichia coli O157:H7 # 1 351 1 351 351 691 99.0 0 MEISFTRVALLAAALFFVGCDQKPQPAKTHATEVTVLEGKTMGTFWRASIPGIDAKRSAE LKEKIQTQLDADDQLLSTYKKDSALMRFNDSQSLSPWPVSEAMADIVTTSLRIGAKTDGA MDITVGPLVNLWGFGPEQQPVQIPSQEQIDAMKAKTGLQHLTVINQSHQQYLQKDLPDLN VDLSTVGEGYAADHLARLMEQEGISRYLVSVGGALNSRGMNGEGQPWRVAIQKPTDKENA VQAVVDINGHGISTSGSYRNYYELDGKRLSHVIDPQTGRPIEHNLVSVTVIAPTALEADA WDTGLMVLGPEKAKEVVRREGLAVYMITKEGDSFKTWMSPQFKSFLVSEKN >gi|296493383|gb|ADTK01000118.1| GENE 6 6646 - 7749 1321 367 aa, chain - ## HITS:1 COG:ECs3104 KEGG:ns 
NR:ns ## COG: ECs3104 COG3203 # Protein_GI_number: 15832358 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein (porin) # Organism: Escherichia coli O157:H7 # 1 367 1 367 367 632 97.0 0 MKVKVLSLLVPALLVAGAANAAEVYNKDGNKLDLYGKVDGLHYFSDNKDVDGDQTYMRLG FKGETQVTDQLTGYGQWEYQIQGNSAENENNSWTRVAFAGLKFQDVGSFDYGRNYGVVYD VTSWTDVLPEFGGDTYGSDNFMQQRGNGFATYRNTDFFGLVDGLNFAVQYQGKNGSVSGE GMTNNGRGALRQNGDGVGGSITYDYEGFGIGGAISSSKRTDDQNSPLYIGNGDRAETYTG GLKYDANNIYLAAQYTQTYNATRVGSLGWANKAQNFEAVAQYQFDFGLRPSVAYLQSKGK NLGTIAGRNYDDEDILKYVDVGATYYFNKNMSTYVDYKINLLDDNQFTRDAGINTDNIVA LGLVYQF Prediction of potential genes in microbial genomes Time: Mon May 16 15:26:12 2011 Seq name: gi|296493382|gb|ADTK01000119.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont305.2, whole genome shotgun sequence Length of sequence - 3697 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 43 - 102 5.3 1 1 Op 1 12/0.000 + CDS 323 - 2995 2600 ## COG0642 Signal transduction histidine kinase 2 1 Op 2 . 
+ CDS 3012 - 3662 807 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain Predicted protein(s) >gi|296493382|gb|ADTK01000119.1| GENE 1 323 - 2995 2600 890 aa, chain + ## HITS:1 COG:ZyojN_1 KEGG:ns NR:ns ## COG: ZyojN_1 COG0642 # Protein_GI_number: 15802769 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli O157:H7 EDL933 # 1 700 1 700 700 1276 99.0 0 MRQKETTATTRFSLLPGSITRFFLLLIIVLLVTMGVMVQSAVNAWLKDKSYQIVDITHAI QKRVDTWRYVTWQIYDNIAATTSPSSGEGLQETRLKQDVYYLEKPRRKTEALIFGSHDNS TLEMTQRMSTYLDTLWGAENVPWSMYYLNGQDNSLVLISTLPLKDLTSGFKESTVSDIVD SRRAEMLQQANALDERESFSNMRRLAWQNGHYFTLRTTFNQPGHLATVVAFDLPINDLIP PGMPLDSFRLEQDATATGNNDNEKEGTDSVSIHFNSTKIEIASALNSTDMRLVWQVPYGT LLLDTLQNILLPLLLNIGLLALALFGYTTFRHFSSRSTESLPNTAVNNELRILRAINEEI VSLLPLGLLVHDQESNRTVISNKIADHLLPHLNLQNITTMAEQHQGIIQATINNELYEIR MFRSQVAPRTQIFIIRDQDREVLVNKKLKQAQRLYEKNQQGRMTFMKNIGDALKEPAQSL AESAAKLNAPESKQLANQADVLVRLVDEIQLANMLADDSWKSETVLFSVQDLIDEVVPSV LPAIKRKGLQLLINNHLKAHDMRRGDRDALRRILLLLMQYAVTSTQLGKITLEVDQDESS EDRLTFRILDTGEGVSIHEMDNLHFPFINLTQNDRYGKADPLAFWLSDQLARKLGGHLNI KTRDGLGTRYSVHIKMLAADPEVEEEEERLLDDVCVMVDVTSAEIRNIVTRQLENWGATC ITPDERLISQDYDIFLTDNPSNLTASGLLLSDDESGVREIGPGQLCVNFNMSNAMQEAVL QLIEVQLAQEEVTESPLGGDENAQLHASGYYALFVDTVPDDVKRLYTEAATSDFAALAQT AHRLKGVFAMLNLVPGKQLCETLEHLIREKDVPGIEKYISDIDSYVKSLL >gi|296493382|gb|ADTK01000119.1| GENE 2 3012 - 3662 807 216 aa, chain + ## HITS:1 COG:ECs3106 KEGG:ns NR:ns ## COG: ECs3106 COG2197 # Protein_GI_number: 15832360 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 216 1 216 216 400 100.0 1e-111 MNNMNVIIADDHPIVLFGIRKSLEQIEWVNVVGEFEDSTALINNLPKLDAHVLITDLSMP GDKYGDGITLIKYIKRHFPSLSIIVLTMNNNPAILSAVLDLDIEGIVLKQGAPTDLPKAL AALQKGKKFTPESVSRLLEKISAGGYGDKRLSPKESEVLRLFAEGFLVTEIAKKLNRSIK TISSQKKSAMMKLGVENDIALLNYLSSVTLSPADKD 
Prediction of potential genes in microbial genomes Time: Mon May 16 15:26:30 2011 Seq name: gi|296493381|gb|ADTK01000120.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont305.3, whole genome shotgun sequence Length of sequence - 64643 bp Number of predicted genes - 50, with homology - 50 Number of transcription units - 25, operones - 11 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 58 - 2859 1952 ## COG0642 Signal transduction histidine kinase - Prom 2991 - 3050 1.9 2 2 Op 1 13/0.000 + CDS 3074 - 4900 1437 ## COG0642 Signal transduction histidine kinase 3 2 Op 2 1/1.000 + CDS 4897 - 6282 1021 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains + Term 6294 - 6351 2.6 + Prom 6341 - 6400 5.2 4 3 Op 1 21/0.000 + CDS 6478 - 7140 612 ## COG1788 Acyl CoA:acetate/3-ketoacid CoA transferase, alpha subunit 5 3 Op 2 4/0.250 + CDS 7140 - 7790 517 ## COG2057 Acyl CoA:acetate/3-ketoacid CoA transferase, beta subunit + Term 7803 - 7830 -0.8 + Prom 7828 - 7887 3.6 6 4 Op 1 4/0.250 + CDS 7922 - 9109 958 ## COG2031 Short chain fatty acids transporter 7 4 Op 2 . + CDS 9140 - 10324 1211 ## COG0183 Acetyl-CoA acetyltransferase + Term 10335 - 10374 8.2 - Term 10313 - 10368 9.2 8 5 Op 1 5/0.250 - CDS 10398 - 11174 678 ## COG4676 Uncharacterized protein conserved in bacteria 9 5 Op 2 4/0.250 - CDS 11179 - 12828 1003 ## COG5445 Predicted secreted protein 10 5 Op 3 3/0.375 - CDS 12829 - 17433 3801 ## COG2373 Large extracellular alpha-helical protein 11 5 Op 4 2/0.750 - CDS 17367 - 17930 290 ## COG3234 Uncharacterized protein conserved in bacteria 12 5 Op 5 . - CDS 17987 - 19675 1333 ## COG4685 Uncharacterized protein conserved in bacteria - Prom 19750 - 19809 3.5 - Term 19776 - 19802 -1.0 13 6 Tu 1 . - CDS 19824 - 22451 3197 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit - Prom 22489 - 22548 6.4 + Prom 22512 - 22571 5.2 14 7 Tu 1 . 
+ CDS 22598 - 23320 724 ## COG2227 2-polyprenyl-3-methyl-5-hydroxy-6-metoxy-1,4-benzoquinol methylase 15 8 Tu 1 . - CDS 23448 - 27152 2689 ## COG3468 Type V secretory pathway, adhesin AidA + Prom 27700 - 27759 3.0 16 9 Op 1 24/0.000 + CDS 27896 - 30181 2685 ## COG0209 Ribonucleotide reductase, alpha subunit + Prom 30297 - 30356 5.7 17 9 Op 2 8/0.000 + CDS 30416 - 31546 1406 ## COG0208 Ribonucleotide reductase, beta subunit 18 9 Op 3 . + CDS 31546 - 31800 182 ## COG0633 Ferredoxin + Term 31896 - 31928 -0.9 19 10 Tu 1 . - CDS 31854 - 32504 589 ## EcolC_1414 hypothetical protein - Prom 32689 - 32748 4.5 + Prom 32640 - 32699 4.1 20 11 Tu 1 . + CDS 32719 - 32925 219 ## COG0583 Transcriptional regulator + Prom 33304 - 33363 6.4 21 12 Tu 1 . + CDS 33391 - 34461 467 ## ECUMN_2578 hypothetical protein + Term 34643 - 34691 1.3 - Term 34468 - 34514 -0.6 22 13 Op 1 6/0.250 - CDS 34680 - 35756 1183 ## COG0584 Glycerophosphoryl diester phosphodiesterase 23 13 Op 2 . - CDS 35761 - 37119 1375 ## COG2271 Sugar phosphate permease - Prom 37215 - 37274 5.1 + Prom 37209 - 37268 3.5 24 14 Op 1 9/0.000 + CDS 37392 - 39020 1760 ## COG0578 Glycerol-3-phosphate dehydrogenase 25 14 Op 2 8/0.000 + CDS 39010 - 40269 1022 ## COG3075 Anaerobic glycerol-3-phosphate dehydrogenase 26 14 Op 3 3/0.375 + CDS 40266 - 41456 1101 ## COG0247 Fe-S oxidoreductase + Term 41469 - 41509 10.6 + Prom 41510 - 41569 2.2 27 15 Tu 1 . + CDS 41650 - 42552 688 ## COG5464 Uncharacterized conserved protein - Term 42545 - 42589 4.2 28 16 Tu 1 1/1.000 - CDS 42593 - 42982 380 ## COG3836 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase 29 17 Op 1 . - CDS 43118 - 44320 1134 ## COG1058 Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA - Prom 44342 - 44401 4.2 30 17 Op 2 . - CDS 44420 - 44962 653 ## B21_02135 hypothetical protein + Prom 45039 - 45098 6.4 31 18 Tu 1 . 
+ CDS 45241 - 45666 429 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes + Term 45670 - 45713 1.6 - Term 45659 - 45694 5.1 32 19 Tu 1 . - CDS 45705 - 46307 217 ## SFV_2322 protein induced by aluminum - Prom 46455 - 46514 5.8 + Prom 46500 - 46559 7.4 33 20 Op 1 5/0.250 + CDS 46597 - 47754 896 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 34 20 Op 2 12/0.000 + CDS 47758 - 48726 872 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 35 20 Op 3 8/0.000 + CDS 48726 - 50708 1605 ## COG0451 Nucleoside-diphosphate-sugar epimerases 36 20 Op 4 6/0.250 + CDS 50705 - 51595 741 ## COG0726 Predicted xylanase/chitin deacetylase 37 20 Op 5 5/0.250 + CDS 51595 - 53247 1368 ## COG1807 4-amino-4-deoxy-L-arabinose transferase and related glycosyltransferases of PMT family 38 20 Op 6 9/0.000 + CDS 53244 - 53579 376 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 39 20 Op 7 . + CDS 53579 - 53965 502 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Term 53779 - 53822 2.9 40 21 Tu 1 . - CDS 53959 - 54225 209 ## B21_02145 hypothetical protein - Prom 54259 - 54318 4.9 41 22 Op 1 6/0.250 - CDS 54335 - 55690 1178 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II 42 22 Op 2 5/0.250 - CDS 55687 - 56649 976 ## COG1441 O-succinylbenzoate synthase 43 22 Op 3 9/0.000 - CDS 56649 - 57506 887 ## COG0447 Dihydroxynaphthoic acid synthase 44 22 Op 4 15/0.000 - CDS 57521 - 58279 572 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 45 22 Op 5 10/0.000 - CDS 58276 - 59775 1542 ## COG1165 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase - Prom 59827 - 59886 46.3 - SSU_RRNA 59839 - 60192 100.0 # AY958844 [D:1..1893] # 16S ribosomal RNA # uncultured bacterium # Bacteria; environmental samples. 
46 23 Op 1 1/1.000 - CDS 60188 - 61330 866 ## COG1169 Isochorismate synthase 47 23 Op 2 4/0.250 - CDS 61409 - 61714 506 ## COG4575 Uncharacterized conserved protein 48 23 Op 3 . - CDS 61769 - 62230 484 ## COG2153 Predicted acyltransferase - Prom 62251 - 62310 4.9 + Prom 62210 - 62269 5.0 49 24 Tu 1 . + CDS 62295 - 63212 609 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III + Term 63234 - 63260 -0.6 + Prom 63284 - 63343 4.0 50 25 Tu 1 . + CDS 63403 - 64620 391 ## ECO26_3260 deubiquitinase Predicted protein(s) >gi|296493381|gb|ADTK01000120.1| GENE 1 58 - 2859 1952 933 aa, chain - ## HITS:1 COG:rcsC_1 KEGG:ns NR:ns ## COG: rcsC_1 COG0642 # Protein_GI_number: 16130155 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 700 1 700 700 1439 100.0 0 MFRALALVLWLLIAFSSVFYIVNALHQRESEIRQEFNLSSDQAQRFIQRTSDVMKELKYI AENRLSAENGVLSPRGRETQADVPAFEPLFADSDCSAMSNTWRGSLESLAWFMRYWRDNF SAAYDLNRVFLIGSDNLCMANFGLRDMPVERDTALKALHERINKYRNAPQDDSGSNLYWI SEGPRPGVGYFYALTPVYLANRLQALLGVEQTIRMENFFLPGTLPMGVTILDENGHTLIS LTGPESKIKGDPRWMQERSWFGYTEGFRELVLKKNLPPSSLSIVYSVPVDKVLERIRMLI LNAILLNVLAGAALFTLARMYERRIFIPAESDALRLEEHEQFNRKIVASAPVGICILRTA DGVNILSNELAHTYLNMLTHEDRQRLTQIICGQQVNFVDVLTSNNTNLQISFVHSRYRNE NVAICVLVDVSSRVKMEESLQEMAQAAEQASQSKSMFLATVSHELRTPLYGIIGNLDLLQ TKELPKGVDRLVTAMNNSSSLLLKIISDILDFSKIESEQLKIEPREFSPREVMNHITANY LPLVVRKQLGLYCFIEPDVPVALNGDPMRLQQVISNLLSNAIKFTDTGCIVLHVRADGDY LSIRVRDTGVGIPAKEVVRLFDPFFQVGTGVQRNFQGTGLGLAICEKLISMMDGDISVDS EPGMGSQFTVRIPLYGAQYPQKKGVEGLSGKRCWLAVRNASLCQFLETSLQRSGIVVTTY EGQEPTSEDVLITDEVVSKKWQGRAVVTFCRRHIGIPLEKAPGEWVHSVAAPHELPALLA RIYLIEMESDDPANALPSTDKAVSDNDDMMILVVDDHPINRRLLADQLGSLGYQCKTAND GVDALNVLSKNHIDIVLSDVNMPNMDGYRLTQRIRQLGLTLPVIGVTANALAEEKQRCLE SGMDSCLSKPVTLDVIKQTLTVYAERVRKSRES >gi|296493381|gb|ADTK01000120.1| GENE 2 3074 - 4900 1437 608 aa, chain + ## HITS:1 COG:atoS_3 KEGG:ns NR:ns ## COG: atoS_3 COG0642 # Protein_GI_number: 16130156 # Func_class: T Signal 
transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 331 608 1 278 278 529 100.0 1e-150 MHYMKWIYPRRLRNQMILMAILMVIVPTLTIGYIVETEGRSAVLSEKEKKLSAVVNLLNQ ALGDRYDLYIDLPREERIRALNAELAPITENITHAFPGIGAGYYNKMLDAIITYAPSALY QNNVGVTIAADHPGREVMRTNTPLVYSGRQVRGDILNSMLPIERNGEILGYIWANELTED IRRQAWKMDVRIIIVLTAGLLISLLLIVLFSRRLSANIDIITDGLSTLAQNIPTRLPQLP GEMGQISQSVNNLAQALRETRTLNDLIIENAADGVIAIDRQGDVTTMNPAAEVITGYQRH ELVGQPYSMLFDNTQFYSPVLDTLEHGTEHVALEISFPGRDRTIELSVTTSRIHNTHGEM IGALVIFSDLTARKETQRRMAQAERLATLGELMAGVAHEVRNPLTAIRGYVQILRQQTSD PIHQEYLSVVLKEIDSINKVIQQLLEFSRPRHSQWQQVSLNALVEETLVLVQTAGVQARV DFISELDNELSPINADRELLKQVLLNILINAVQAISARGKIRIQTWQYSDSQQAISIEDN GCGIDLSLQKKIFDPFFTTKASGTGLGLALSQRIINAHQGDIRVASLPGYGATFTLILPI NPQGNQTV >gi|296493381|gb|ADTK01000120.1| GENE 3 4897 - 6282 1021 461 aa, chain + ## HITS:1 COG:atoC KEGG:ns NR:ns ## COG: atoC COG2204 # Protein_GI_number: 16130157 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 1 461 1 461 461 911 100.0 0 MTAINRILIVDDEDNVRRMLSTAFALQGFETHCANNGRTALHLFADIHPDVVLMDIRMPE MDGIKALKEMRSHETRTPVILMTAYAEVETAVEALRCGAFDYVIKPFDLDELNLIVQRAL QLQSMKKEIRHLHQALSTSWQWGHILTNSPAMMDICKDTAKIALSQASVLISGESGTGKE LIARAIHYNSRRAKGPFIKVNCAALPESLLESELFGHEKGAFTGAQTLRQGLFERANEGT LLLDEIGEMPLVLQAKLLRILQEREFERIGGHQTIKVDIRIIAATNRDLQAMVKEGTFRE DLFYRLNVIHLILPPLRDRREDISLLANHFLQKFSSENQRDIIDIDPMAMSLLTAWSWPG NIRELSNVIERAVVMNSGPIIFSEDLPPQIRQPVCNAGEVKTAPVGERNLKEEIKRVEKR IIMEVLEQQEGNRTRTALMLGISRRALMYKLQEYGIDPADV >gi|296493381|gb|ADTK01000120.1| GENE 4 6478 - 7140 612 220 aa, chain + ## HITS:1 COG:atoD KEGG:ns NR:ns ## COG: atoD COG1788 # Protein_GI_number: 16130158 # Func_class: I Lipid transport and metabolism # Function: Acyl CoA:acetate/3-ketoacid CoA transferase, alpha subunit # Organism: Escherichia coli K12 # 1 220 1 220 220 411 100.0 1e-115 
MKTKLMTLQDATGFFRDGMTIMVGGFMGIGTPSRLVEALLESGVRDLTLIANDTAFVDTG IGPLIVNGRVRKVIASHIGTNPETGRRMISGEMDVVLVPQGTLIEQIRCGGAGLGGFLTP TGVGTVVEEGKQTLTLDGKTWLLERPLRADLALIRAHRCDTLGNLTYQLSARNFNPLIAL AADITLVEPDELVETGELQPDHIVTPGAVIDHIIVSQESK >gi|296493381|gb|ADTK01000120.1| GENE 5 7140 - 7790 517 216 aa, chain + ## HITS:1 COG:atoA KEGG:ns NR:ns ## COG: atoA COG2057 # Protein_GI_number: 16130159 # Func_class: I Lipid transport and metabolism # Function: Acyl CoA:acetate/3-ketoacid CoA transferase, beta subunit # Organism: Escherichia coli K12 # 1 216 1 216 216 426 100.0 1e-119 MDAKQRIARRVAQELRDGDIVNLGIGLPTMVANYLPEGIHITLQSENGFLGLGPVTTAHP DLVNAGGQPCGVLPGAAMFDSAMSFALIRGGHIDACVLGGLQVDEEANLANWVVPGKMVP GMGGAMDLVTGSRKVIIAMEHCAKDGSAKILRRCTMPLTAQHAVHMLVTELAVFRFIDGK MWLTEIADGCDLATVRAKTEARFEVAADLNTQRGDL >gi|296493381|gb|ADTK01000120.1| GENE 6 7922 - 9109 958 395 aa, chain + ## HITS:1 COG:atoE KEGG:ns NR:ns ## COG: atoE COG2031 # Protein_GI_number: 16130160 # Func_class: I Lipid transport and metabolism # Function: Short chain fatty acids transporter # Organism: Escherichia coli K12 # 1 395 46 440 440 715 99.0 0 MVKMWGDGFWNLLAFGMQMALIIVTGHALASSAPVKSLLRTAASAAKTPVQGVMLVTFFG SVACVINWGFGLVVGAMFAREVARRVPGSDYPLLIACAYIGFLTWGGGFSGSMPLLAATP GNPVEHIAGLIPVGDTLFSGFNIFITVTLIVVMPFITRMMMPKPSDVVSIDPKLLMEEAD FQKQLPKDAPPSERLEESRILTLIIGALGIAYLAMYFSEHGFNITINTVNLMFMIAGLLL HKTPMAYMRAISAAARSTAGILVQFPFYAGIQLMMEHSGLGGLITEFFINVANKDTFPVM TFFSSALINFAVPSGGGHWVIQGPFVIPAAQALGADLGKSVMAIAYGEQWMNMAQPFWAL PALAIAGLGVRDIMGYCITALLFSGVIFVIGLTLF >gi|296493381|gb|ADTK01000120.1| GENE 7 9140 - 10324 1211 394 aa, chain + ## HITS:1 COG:atoB KEGG:ns NR:ns ## COG: atoB COG0183 # Protein_GI_number: 16130161 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA acetyltransferase # Organism: Escherichia coli K12 # 1 394 1 394 394 650 98.0 0 MKNCVIVSAVRTAIGSFNGSLASTSAIDLGATVIKAAIERAKIDSLHIDEVIMGNVLQAG LGQNPARQALLKSGLAETVCGFTVNKVCGSGLKSVALAAQAIQAGQAQSIVAGGMENMSL APYLLDAKARSGYRLGDGQVYDVILRDGLMCATHGYHMGITAENVAKEYGITREMQDELA 
LHSQRKAAAAIESGAFTAEIVPVNVVTRKKTFVFSQDEFPKANSTAEALGALRPAFDKAG TVTAGNASGINDGAAALVIMEESAALAAGLTPLARIKSYASGGVPPALMGMGPVPATQKA LQLAGLQLADIDLIEANEAFAAQFLAVGKTLGFDPEKVNVNGGAIALGHPIGASGARILV TLLHAMQARDKTLGLATLCIGGGQGIAMVIERLN >gi|296493381|gb|ADTK01000120.1| GENE 8 10398 - 11174 678 258 aa, chain - ## HITS:1 COG:ECs3108 KEGG:ns NR:ns ## COG: ECs3108 COG4676 # Protein_GI_number: 15832362 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 258 1 258 258 514 99.0 1e-146 MRKIFLPLLLVALSPVVHSEGVQEVEIDAPLSGWHPVEGEDASFSQSINYPASSVNMADD QNISAQIRGKIKNYAAAGKVQQGRLVVNGASMPQRIESDGSFARPYIFTEGSNSVQVISP DGQSRQKMQFYSTPGTGTIRARLRLVLSWDTDNTDLDLHVVTPDGEHAWYGNTVLKNSGA LDMDVTTGYGPEIFAMPAPVHGRYQVYINYYGGRSETELTTAQLTLITDEGSVNEKQETF IVPMRNAGELTLVKSFDW >gi|296493381|gb|ADTK01000120.1| GENE 9 11179 - 12828 1003 549 aa, chain - ## HITS:1 COG:yfaQ KEGG:ns NR:ns ## COG: yfaQ COG5445 # Protein_GI_number: 16130163 # Func_class: S Function unknown # Function: Predicted secreted protein # Organism: Escherichia coli K12 # 1 549 1 549 549 1076 99.0 0 MNWRRIVWLLALVTLPTLAEETPLQLALRGAQHDQLYQLSSSGVTKVSALPDTLTTPLGS LWKLYVYAWLEDTHQPEQPYQCRGNSPEEVYCCQAGESITRDTALVRSCGLYFAPQRLHI GADVWGQYWQQRQAPAWLASLTTLKPETSVTVKSLLDSLATLPAQNKAQEVLLDVVLDEA KIGVASMLGSRVRVKTWSWFADDKQEIRQGGFAGWLTDGTPLWVTGSGTSKTVLTRYATV LNRVLPVPTQVASGQCVEVELFARYPLKKITAEKSTTAVKPGVLNGRYRVTFTNGNHITF VSHGETTLLSEKGKLKLQSHLDREEYVARVLDREAKSTPPEAAKAMTVAIRTFLQQNANR EGDCLTIPDSSATQRVSASPATTGARTMAAWTQDLIYAGDPVHYHGSRATEGTLSWRQAT AQAGQGERYDQILAFAYPDNSLSRWGAPRSTCQLLPKAKAWLAKKKPQWRRILQAETGYN EPDVFAVCRLVSGFPYTDRQQKRLFIRNFFTLQDRLDLTHEYLHLAFDGYPTGLDENYIE TLTRQLLMD >gi|296493381|gb|ADTK01000120.1| GENE 10 12829 - 17433 3801 1534 aa, chain - ## HITS:1 COG:ECs3111 KEGG:ns NR:ns ## COG: ECs3111 COG2373 # Protein_GI_number: 15832365 # Func_class: R General function prediction only # Function: Large extracellular alpha-helical protein # Organism: Escherichia coli O157:H7 # 1 1534 1 1534 1534 2979 98.0 0 
MDTQRFQSQFHWHLSFKFSGAIAACLSLSLVGTGLANADDSLPSSNYAPPAGGTFFLLAD SSFSSSEEAKVRLEAPGRDYRRYQMEEYGGVDVRLYRIPDPMAFLRQQKNLHRIVVQPQY LGDGLNNTLTWLWDNWYGKSRRVMQRTFSSQSRQNVTQALPELQLGNAIIKPSRYVQNNQ FSPLKKYPLVEQFRYPLWQAKPFEPQQGVKLEGASSNFISPQPGNIYIPLGQQEPGLYLV EAMVGGYRATTVVFVSDTVALSKVSGKELLVWTAGKKQGEAKPGSEILWTDGLGVMTRGV TDDSGTLQLQHISPERSYILGKDAEGGVFVSENFFYESEIYNTRLYIFTDRPLYRAGDRV DVKVIGREFHDPLHSSPIVSAPAKLSVLDANGSLLQTVNVTLDARNGGQGSFRLPENAVA GGYELRLAYRNQVYSSSFRVANYIKPHFEIGLALDKKEFKTGEAVSGKLQLLYPDGEPVK NARVQLSLRAQQLSMVGNDLRYAGRFPVSLEGSETVSDASGHVTLNLPAADKPSRYLLTV SASDGAAYRVTTTKEILIERGLAHYSLSTAAQYSNSGESVVFRYAALESSKQVPVTYEWL RLEDRTSHSGELPSGGKSFTVNFAKPGNYNLTLRDKDGLILAGLSHAVSGKGSTAHTGTV DIVADKTLYQPGETAKMLITFPEPIDEALLTLERDRVEQQSLLSHPANWLTLQRLNDTQY EARVRVSNSFAPNITFSVLYTRNGQYSFQNAGIKVAVPQLDIRVKTDKTHYQPGELVNVE LTSSLKGKPVSAQLTVGVVDEMIYALQPEIAPNIGKFFYPLGRNNVRTSSSLSFISYDQA LSSEPVAPGATNRSERRVKMLERPRREEVDTAAWMPSLTTDKQGKAYFTFLMPDSLTRWR ITARGMNGDGLVGQGRAYLRSEKNLYMKWSMPTVYRVGDKPAAGLFIFSQQDNEPVALVT KFAGAEMRQTLTLHKGANYISLTQNIQQSGLLSAELQQNGQVQDSISTKLSFVDNSWPVE QQKNVMLGGGDNALMLPEQASNIRLQSSETPQEIFRNNLDALVDEPWGGVINTGSRLIPL SLAWRSLADHQSAAANDIRQMIQDNRLRLMQLAGPGARFTWWGEDGNGDAFLTAWAWYAD WQASQAIGVTQQPEYWQHMLDSYAEQADNVPLLHRALVLAWAQEMNLPCKTLLKGLDEAI ARRGTKTEDFSEEDPSDINDSLIFDTPESPLADAVANVLTMTLLKKAQLKSTVMPQVQQY ARDKAANSNQPLAHTVVLLNSGGDATQTAAILSGLTAEQSTIERALAMNWLAKYMATMPP VVLPAPAGAWAKHKLTGGGEYWRWVGQGVPDILSFGDELSPQNVQVRWREPAKTAQQSNI PVTVERQLYRLIPGEEEMSFTLQPVTSNEIDSDALYLDEITLTSEQDAVLRYGQVEVPLP PGADVERTTWGISVNKPNAAKQQGQLLEKARNEMGELAYMVPVKELTGTVTFRHLLRFSQ KGQFVLPPARYVRSYAPAQQSVAAGSEWTGMQVK >gi|296493381|gb|ADTK01000120.1| GENE 11 17367 - 17930 290 187 aa, chain - ## HITS:1 COG:ECs3112 KEGG:ns NR:ns ## COG: ECs3112 COG3234 # Protein_GI_number: 15832366 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 187 30 216 216 371 97.0 1e-103 MLNVEQSGLFRAWFVRIAQEQLRQGPSPRWYQQDCAGLVRFAANEALKVHDSKWLKSNGI 
ASQYLPPEMTLKPEQRQLAQNWNQGNGKTGPYVTAINLIQYNSQFIGQDINQALPGDMIF FDQGDAQHLMVWMGRYVIYHTGSATKTDNGMRAVSLQQLMTWKDTRWILNDSNPNFIGIY RLNFLAR >gi|296493381|gb|ADTK01000120.1| GENE 12 17987 - 19675 1333 562 aa, chain - ## HITS:1 COG:yfaA KEGG:ns NR:ns ## COG: yfaA COG4685 # Protein_GI_number: 16130165 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 562 17 578 578 1117 99.0 0 MSGEKKAKGWRFYGLVGFGAIALLSAGVWALQYAGSGPEKTLSPLVVHNNLQIDLNEPDL FLDSDSLSQLPKDLLTIPFLHDVLSEDFVFYYQNHADRLGIEGSIRRIVYEHDLTLKDKL FSSLLDQPAQAALWHDKQGHLSHYMVLIQRSGLSKLLEPLLFAATSDSQLSKTEISSIKI NSETIPVYQLRYNGNNALMFATYQDKMLVFSSTDMLFKDDQQDTEATAIASDLLSGKKRW QASFGLEERAAEKTPVRQRIVASARLLGFGYQRLMPSFAGVRFEMGNDGWHSFVALNDES ASVDASFDFTPVWNSMPAGASFCVAVPYSHGIAEEMLSHISQENDKLNGALDGAAGLCWY EDSKLQTPLFVGQFDGTAEQAQLPGKLFTQNIGAHESKAPEGVLPVSQTQQGEAQIWRRE VSSRYGQYPKAQAAQPDQLMSDYFFRVSLAMQNKTLLFSLDDTLVNNALQTLNKTRPAMV DVIPTDGIVPLYINPQGIAKLLRNETLTSLPKNLEPVFYNAAQTLLMPKLDALSQQPRYV MKLAQMEPGAAWQWLPITWQPL >gi|296493381|gb|ADTK01000120.1| GENE 13 19824 - 22451 3197 875 aa, chain - ## HITS:1 COG:gyrA KEGG:ns NR:ns ## COG: gyrA COG0188 # Protein_GI_number: 16130166 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Escherichia coli K12 # 1 875 1 875 875 1657 99.0 0 MSDLAREITPVNIEEELKSSYLDYAMSVIVGRALPDVRDGLKPVHRRVLYAMNVLGNDWN KAYKKSARVVGDVIGKYHPHGDLAVYNTIVRMAQPFSLRYMLVDGQGNFGSIDGDSAAAM RYTEIRLAKIAHELMADLEKETVDFVDNYDGTEKIPDVMPTKIPNLLVNGSSGIAVGMAT NIPPHNLTEVINGCLAYIDDEDISIEGLMEHIPGPDFPTAAIINGRRGIEEAYRTGRGKV YIRARAEVEVDAKTGRETIIVHEIPYQVNKARLIEKIAELVKEKRVEGISALRDESDKDG MRIVIEVKRDAVGEVVLNNLYSQTQLQVSFGINMVALHHGQPKIMNLKDIIAAFVRHRRE VVTRRTIFELRKARDRAHILEALAVALANIDPIIELIRHAPTPAEAKTALVANPWQLGNV AAMLERAGDDAARPEWLEPEFGVRDGLYYLTEQQAQAILDLRLQKLTGLEHEKLLDEYKE LLDQIAELLRILGSADRLMEVIREELELVREQFGDKRRTEITANSADINLEDLITQEDVV VTLSHQGYVKYQPLSEYEAQRRGGKGKSAARIKEEDFIDRLLVANTHDHILCFSSRGRVY 
SMKVYQLPEATRGARGRPIVNLLPLEQDERITAILPVTEFEEGVKVFMATANGTVKKTVL TEFNRLRTAGKVAIKLVDGDELIGVDLTSGEDEVMLFSAEGKVVRFKESSVRAMGCNTTG VRGIRLGEGDKVVSLIVPRGDGAILTATQNGYGKRTAVAEYPTKSRATKGVISIKVTERN GLVVGAVQVDDCDQIMMITDAGTLVRTRVSEISIVGRNTQGVILIRTAEDENVVGLQRVA EPVDEEDLDTIDGSAAEGDDEIAPEVDVDDEPEEE >gi|296493381|gb|ADTK01000120.1| GENE 14 22598 - 23320 724 240 aa, chain + ## HITS:1 COG:ubiG KEGG:ns NR:ns ## COG: ubiG COG2227 # Protein_GI_number: 16130167 # Func_class: H Coenzyme transport and metabolism # Function: 2-polyprenyl-3-methyl-5-hydroxy-6-metoxy-1,4-benzoquinol methylase # Organism: Escherichia coli K12 # 1 240 1 240 240 496 99.0 1e-140 MNAEKSPENHNVDHEEIAKFEAVASRWWDLEGEFKPLHRINPLRLGYIAERAGGLFGKKV LDVGCGGGILAESMAREGATVTGLDMGFEPLQVAKLHALESGIQVDYVQETVEEHAAKHA GQYDVVTCMEMLEHVPDPQSVVRACAQLVKPGGDVFFSTLNRNGKSWLMAVVGAEYILRM VPKGTHDVKKFIKPAELLGWVDQTSLKERHITGLHYNPITNTFKLGPGVDVNYMLHTQNK >gi|296493381|gb|ADTK01000120.1| GENE 15 23448 - 27152 2689 1234 aa, chain - ## HITS:1 COG:yfaL_2 KEGG:ns NR:ns ## COG: yfaL_2 COG3468 # Protein_GI_number: 16130168 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 789 1234 1 446 446 745 99.0 0 MIASLFSANGVAAVTDSCQGYDVKASCQASRQSLSGITQDWSIADGQWLVFSDMTNNASG GAVFLQQGAEFSLLPENETGMTLFANNTVTGEYNNGGAIFAKENSTLNLTDVIFSGNVAG GYGGAIYSSGTNDTGAVDLRVTNAMFRNNIANDGKGGAIYTINNDVYLSDVIFDNNQAYT STSYSDGDGGAIDVTDNNSDSKHPSGYTIVNNTAFTNNTAEGYGGAIYTNSVTAPYLIDI SVDDSYSQNGGVLVDENNSAAGYGDGPSSAAGGFMYLGLSEVTFDIADGKTLVIGNTEND GAVDSIAGTGLITKTGSGDLVLNADNNDFTGEMQIENGEVTLGRSNSLMNVGDTHCQDDP QDCYGLTIGSIDQYQNQAELNVGSTQQTFVHALTGFQNGTLNIDAGGNVTVNQGSFAGII EGAGQLTIAQNGSYVLAGAQSMALTGDIVVDDGAVLSLEGDAADLTALQDDPQSIVLNGG VLDLSDFSTWQSGTSYNDGLEVSGSSGTVIGSQDVVDLAGGDNLHIGGDGKDGVYVVVDA SDGQVSLANNNSYLGTTQIASGTLMVSDNSQLGDTHYNRQVIFTDKQQESVMEITSDVDT RSDAAGHGRDIEMRADGEVAVDAGVDTQWGALMADSSGQHQDEGSTLTKTGAGTLELTAS 
GTTQSAVRVEEGTLKGDVADILPYASSLWVGDGATFVTGADQDIQSIDAISSGTIDISDG TVLRLTGQDTSVALNASLFNGDGTLVNATDGVTLTGELNTNLETDSLTYLSNVTVNGNLT NTSGAVSLQNGVAGDTLTVNGDYTGGGTLLLDSELNGDDSVSDQLVMNGNTAGNTTVVVN SITGIGEPTSTGIKVVDFAADPTQFQNNAQFSLAGSGYVNMGAYDYTLVEDNNDWYLRSQ EVTPPSPPDQDPTPDPDPTPDPDPTPDPEPTPAYQPVLNAKVGGYLNNLRAANQAFMMER RDHAGGDGQTLNLRVIGGDYHYTAAGQLAQHEDTSTVQLSGDLFSGRWGTDGEWMLGIVG GYSDNQGDSRSNMTGTRADNQNHGYAVGLTSSWFQHGNQKQGAWLDSWLQYAWFSNDVSE QEDGTDHYHSSGIIASLEAGYQWLPGRGVVIEPQAQVIYQGVQQDDFTAANRARVSQSQG DDIQTRLGLHSEWRTAVHVIPTLDLNYYHDPHSTEIEEDGSTISDDAVKQRGEIKVGVTG NISQRVSLRGSVAWQKGSDDFAQTAGFLSMTVKW >gi|296493381|gb|ADTK01000120.1| GENE 16 27896 - 30181 2685 761 aa, chain + ## HITS:1 COG:nrdA KEGG:ns NR:ns ## COG: nrdA COG0209 # Protein_GI_number: 16130169 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Escherichia coli K12 # 1 761 1 761 761 1582 99.0 0 MNQNLLVTKRDGSTERINLDKIHRVLDWAAEGLHNVSISQVELRSHIQFYDGIKTSDIHE TIIKAAADLISRDAPDYQYLAARLAIFHLRKKAYGQFEPPALYDHVVKMVEMGKYDNHLL EDYTEEEFKQMDTFIDHDRDMTFSYAAVKQLEGKYLVQNRVTGEIYESAQFLYILVAACL FSNYPRETRLQYVKRFYDAVSTFKISLPTPIMSGVRTPTRQFSSCVLIECGDSLDSINAT SSAIVKYVSQRAGIGINAGRIRALGSPIRGGEAFHTGCIPFYKHFQTAVKSCSQGGVRGG AATLFYPMWHLEVESLLVLKNNRGVEGNRVRHMDYGVQINKLMYTRLLKGEDITLFSPSD VPGLYDAFFADQEEFERLYTKYEKDDSIRKQRVKAVELFSLMMQERASTGRIYIQNVDHC NTHSPFDPAIAPVRQSNLCLEIALPTKPLNDVNDENGEIALCTLSAFNLGAINSLDELEE LAILAVRALDALLDYQDYPIPAAKRGAMGRRTLGIGVINFAYYLAKHGKRYSDGSANNLT HKTFEAIQYYLLKASNELAKEQGACPWFNETTYAKGILPIDTYKKDLDTIANEPLHYDWE ALRESIKTHGLRNSTLSALMPSETSSQISNATNGIEPPRGYVSIKASKDGILRQVVPDYE HLHDAYELLWEMPGNDGYLQLVGIMQKFIDQSISANTNYDPSRFPSGKVPMQQLLKDLLT AYKFGVKTLYYQNTRDGAEDAQDDLVPSIQDDGCESGACKI >gi|296493381|gb|ADTK01000120.1| GENE 17 30416 - 31546 1406 376 aa, chain + ## HITS:1 COG:ECs3118 KEGG:ns NR:ns ## COG: ECs3118 COG0208 # Protein_GI_number: 15832372 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, beta subunit # Organism: Escherichia coli O157:H7 # 1 376 
1 376 376 757 99.0 0 MAYTTFSQTKNDQLKEPMFFGQPVNVARYDQQKYDIFEKLIEKQLSFFWRPEEVDVSRDR IDYQALPEHEKHIFISNLKYQTLLDSIQGRSPNVALLPLISIPELETWVETWAFSETIHS RSYTHIIRNIVNDPSVVFDDIVTNEQIQKRAEGISSYYDELIEMTSYWHLLGEGTHTVNG KTVTVSLRELKKKLYLCLMSVNALEAIRFYVSFACSFAFAERELMEGNAKIIRLIARDEA LHLTGTQHMLNLLRSGADDPEMAEIAEECKQECYDLFVQAAQQEKDWADYLFRDGSMIGL NKDILCQYVEYITNIRMQAVGLDLPFQTRSNPIPWINTWLVSDNVQVAPQEVEVSSYLVG QIDAEVDTDDLSNFQL >gi|296493381|gb|ADTK01000120.1| GENE 18 31546 - 31800 182 84 aa, chain + ## HITS:1 COG:yfaE KEGG:ns NR:ns ## COG: yfaE COG0633 # Protein_GI_number: 16130171 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Escherichia coli K12 # 1 84 1 84 84 159 100.0 9e-40 MARVTLRITGTQLLCQDEHPSLLAALESHNVAVEYQCREGYCGSCRTRLVAGQVDWIAEP LAFIQPGEILPCCCRAKGDIEIEM >gi|296493381|gb|ADTK01000120.1| GENE 19 31854 - 32504 589 216 aa, chain - ## HITS:1 COG:no KEGG:EcolC_1414 NR:ns ## KEGG: EcolC_1414 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 216 1 216 216 430 100.0 1e-119 MAVSAKYDEFNHWWATEGDWVEEPNYRRNGMSGVQCVERNGKKLYVKRMTHHLFHSVRYP FGRPTIVREVAVIKELERAGVIVPKIVFGEAVKIEGEWRALLVTEDMAGFISIADWYAQH AVSPYSDEVRQAMLKAVALAFKKMHSINRQHGCCYVRHIYVKTEGKAEAGFLDLEKSRRR LRRDKAINHDFRQLEKYLEPIPKADWEQVKAYYYAM >gi|296493381|gb|ADTK01000120.1| GENE 20 32719 - 32925 219 68 aa, chain + ## HITS:1 COG:ECs3122 KEGG:ns NR:ns ## COG: ECs3122 COG0583 # Protein_GI_number: 15832376 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 68 225 292 292 117 89.0 7e-27 MNFIRQGLGIALQPELTLKSIAGELCSVPLESTFYRQISLLAKEKPVEGSPLFLLQTCTE QLVVSGKI >gi|296493381|gb|ADTK01000120.1| GENE 21 33391 - 34461 467 356 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_2578 NR:ns ## KEGG: ECUMN_2578 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 356 13 368 368 594 94.0 1e-168 MPNTSIHLSRCNILQNNKLQPEEVYKESQQTAKLEIFCDEFLKISQSRYGLSTSADSIAN 
LLTFFTKASDAIDRIKTQKIDVHSYGFVPIRHFVEHVITYKNEFAADYEPYTLTFTRGEN NEGVLSIESKEGSISQRTINLNEYKTAINIINEHVTKENIHNTVQSLTEKDISKINSSDK HHKISSEESIKSQLYSDQKKYADLLLHSEKNTEWYKYASSEERYDKFKNSSKEIKNTYKQ IVLAQKKLNQMKYINKLGGELIDIADKKLAPLINDSFSYTRDFFAYSKQENNIFTFDNSK FVDPKEKEGLMIQHSNGQLVITGKYCPEGVQTAFTQEQYDKLIRYINIFFTFPKCE >gi|296493381|gb|ADTK01000120.1| GENE 22 34680 - 35756 1183 358 aa, chain - ## HITS:1 COG:ECs3124 KEGG:ns NR:ns ## COG: ECs3124 COG0584 # Protein_GI_number: 15832378 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Escherichia coli O157:H7 # 1 358 1 358 358 717 99.0 0 MKLTLKNLSMAIMMSTIVMGSSAMAADSNEKIVIAHRGASGYLPEHTLPAKAMAYAQGAD YLEQDLVMTKDDHLVVLHDHYLDRVTDVADRFPDRARKDGRYYAIDFTLDEIKSLKFTEG FDIENGKKVQTYPGRFPMGKSDFRVHTFEEEIEFVQGLNHSTGKNIGIYPEIKAPWFHHQ EGKDIAAKTLEVLKKYGYTGKDDKVYLQCFDADELKRIKNELEPKMGMELNLVQLIAYTD WNETQQKQPDGSWVNYNYDWMFKPGAMKQVAEYADGIGPDYHMLIEETSQPGNIKLTGMV QDAQQNKLVVHPYTVRSDKLPEYTTDVNQLYDALYNKAGVNGLFTDFPDKAVKFLNKE >gi|296493381|gb|ADTK01000120.1| GENE 23 35761 - 37119 1375 452 aa, chain - ## HITS:1 COG:glpT KEGG:ns NR:ns ## COG: glpT COG2271 # Protein_GI_number: 16130175 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Escherichia coli K12 # 1 452 1 452 452 884 99.0 0 MLSIFKPAPHKARLPAAEIDPTYRRLRWQIFLGIFFGYAAYYLVRKNFALAMPYLVEQGF SRGDLGFALSGISIAYGFSKFIMGSVSDRSNPRVFLPAGLILAAAVMLFMGFVPWATSSI AVMFVLLFLCGWFQGMGWPPCGRTMVHWWSQKERGGIVSVWNCAHNVGGGIPPLLFLLGM AWFNDWHAALYMPAFCAILVALFAFAMMRDTPQSCGLPPIEEYKNDYPDDYNEKAEQELT AKQIFMQYVLPNKLLWYIAIANVFVYLLRYGILDWSPTYLKEVKHFALDKSSWAYFLYEY AGIPGTLLCGWMSDKVFRGNRGATGVFFMTLVTIATIVYWMNPAGNPTVDMICMIVIGFL IYGPVMLIGLHALELAPKKAAGTAAGFTGLFGYLGGSVAASAIVGYTVDFFGWDGGFMVM IGGSILAVILLIVVMIGEKRRHEQLLQKRNGG >gi|296493381|gb|ADTK01000120.1| GENE 24 37392 - 39020 1760 542 aa, chain + ## HITS:1 COG:ECs3126 KEGG:ns NR:ns ## COG: ECs3126 COG0578 # Protein_GI_number: 15832380 # Func_class: C Energy production and conversion # 
Function: Glycerol-3-phosphate dehydrogenase # Organism: Escherichia coli O157:H7 # 1 542 1 542 542 1051 99.0 0 MKTRDSQSSDVIIIGGGATGAGIARDCALRGLRVILVERHDIATGATGRNHGLLHSGARY AVTDAESARECISENQILKRIARHCVEPTNGLFITLPEDDLSFQATFIRACEEAGISAEA IDPQQARIIEPVVNPALIGAVKVPDGTVDPFRLTAANMLDAKEHGAVILTAHEVTGLIRE GATVCGVRVRNHLTGETQALHAPVVVNAAGIWGQHIAEYADLRIRMFPAKGSLLIMDHRI NQHVINRCRKPSDADILVPGDTISLIGTTSLRIDYNEIDDNRVTAEEVDILLREGEKLAP VMAKTRILRAYSGVRPLVASDDDPSGRNVSRGIVLLDHAERDGLDGFITITGGKLMTYRL MAEWATDAVCRKLGNTRPCTTADLALPGSQEPAEVTLRKVISLPAPLRGSAVYRHGDRTP AWLSEGRLHRSLVCECEAVTAGEVQYAVENLNVNSLLDLRRRTRVGMGTCQGELCACRAA GLLQRFNVTTSAQSIEQLSTFLNERWKGVQPIAWGDALRESEFTRWVYQGLCGLEKEQKD AL >gi|296493381|gb|ADTK01000120.1| GENE 25 39010 - 40269 1022 419 aa, chain + ## HITS:1 COG:glpB KEGG:ns NR:ns ## COG: glpB COG3075 # Protein_GI_number: 16130177 # Func_class: E Amino acid transport and metabolism # Function: Anaerobic glycerol-3-phosphate dehydrogenase # Organism: Escherichia coli K12 # 1 419 1 419 419 803 99.0 0 MRFDTVIMGGGLAGLLCGLQLQKHGLRCAIVTRGQSALHFSSGSLDLLSHLPDGQPVTDI HSGLESLRQQAPAHPYSLLEPQRVLDLACQAQALIAESGAQLQGSVELAHQRVTPLGTLR ATWLSSPEVPVWPLPAKKICVVGISGLMDFQAHLAAASLRELDLSVETAEIELPELDVLR NNATEFRAVNIARFLDNEENWPLLLDALIPVANTCEMILMPACFGLADDKLWRWLNEKLP CSLMLLPTLPPSVLGIRLQNQLQRQFVRQGGVWMPGDEVKKVTCKNGVVNEIWTRNHADI PLRPRFAVLASGSFFSGGLVAERNGIREPILGLDVLQTATRGEWYKGDFFAPQPWQQFGV TTDQTLRPSQAGQTIENLFAIGSVLGGFDPIAQGCGGGVCAVSALHAAQQIAQRAGGQQ >gi|296493381|gb|ADTK01000120.1| GENE 26 40266 - 41456 1101 396 aa, chain + ## HITS:1 COG:ECs3128 KEGG:ns NR:ns ## COG: ECs3128 COG0247 # Protein_GI_number: 15832382 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Escherichia coli O157:H7 # 1 396 1 396 396 828 99.0 0 MNDTSFENCIKCTVCTTACPVSRVNPGYPGPKQAGPDGERLRLKDGALYDEALKYCINCK RCEVACPSDVKIGDIIQRARAKYDTTRPSLRNFVLSHTDLMGSVSTPFAPIVNTATSLKP VRQLLDAALKIDHRRTLPKYSFGTFRRWYRSIAAQQAQYKDQVAFFHGCFVNYNHPQLGK DLIKVLNAMGTGVQLLSKEKCCGVPLIANGFTDKARKQAITNVESIREAVGVKGIPVIAT 
SSTCTFALRDEYPEVLNVDNKGLRDHIELATRWLWRKLDEGKTLPLKPLPLKVVYHTPCH MEKMGWTLYTLELLRKIPGLELTVLDSQCCGIAGTYGFKKENYPTSQAIGAPLFRQIEES GADLVITDCETCKWQIEMSTSLRCEHPITLLAQALA >gi|296493381|gb|ADTK01000120.1| GENE 27 41650 - 42552 688 300 aa, chain + ## HITS:1 COG:ECs3129 KEGG:ns NR:ns ## COG: ECs3129 COG5464 # Protein_GI_number: 15832383 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 300 1 308 308 581 96.0 1e-166 MTESTTSSPHDAVFKTFMFTPETARDFLEIHLPEPLRKLCNLQTLRLEPTSFIEKSLRAY YSDVLWSVETSDGDGYIYCVIEHQSSAEKNMAFRLMRYATAAMQRHLDKGYDRVPLVVPL LFYHGETSPYPYSLNWLDEFDDPQLARQLYTEAFPLVDITIVPDDEIMQHRRIALLELIQ KHIRDRDLIGMVDRITTLLVRGFTNDSQLQTLFNYLLQCGDTSRFTRFIEEIAERSPLQK ERLMTIAERLRQEGHQIGWQEGMHEQAIKIALRMLEQGIDRDQVLAATQLSEADLAANNH >gi|296493381|gb|ADTK01000120.1| GENE 28 42593 - 42982 380 129 aa, chain - ## HITS:1 COG:yfaU KEGG:ns NR:ns ## COG: yfaU COG3836 # Protein_GI_number: 16130180 # Func_class: G Carbohydrate transport and metabolism # Function: 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase # Organism: Escherichia coli K12 # 1 129 139 267 267 257 100.0 3e-69 MAQVNDSLCLLVQVESKTALDNLDEILDVEGIDGVFIGPADLSASLGYPDNAGHPEVQRI IETSIRRIRAAGKAAGFLAVAPDMAQQCLAWGANFVAVGVDTMLYSDALDQRLAMFKSGK NGPRIKGSY >gi|296493381|gb|ADTK01000120.1| GENE 29 43118 - 44320 1134 400 aa, chain - ## HITS:1 COG:yfaY KEGG:ns NR:ns ## COG: yfaY COG1058 # Protein_GI_number: 16130184 # Func_class: R General function prediction only # Function: Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA # Organism: Escherichia coli K12 # 1 400 1 400 400 795 99.0 0 MLKVEMLSTGDEVLHGQIVDTNAAWLADFFFHQGLPLSRRNTVGDNLDDLVTILRERSQH ADVLIVNGGLGPTSDDLSALAAATAKGEGLVLHEAWLKEMERYFHERGRVMAPSNRKQAE LPASAEFINNPVGTACGFAVQLNRCLMFFTPGVPSEFKVMVEHEILPRLRERFSLPQPPV CLRLTTFGRSESDLAQSLDTLQLPPGVTMGYRSSMPIIELKLTGPASEQQAMEKLWLDVK RVAGQSVIFEGTEGLPAQISRELQNRQFSLTLSEQFTGGLLALQLSRAGAPLLACEVVPS QEETLAQTAHWITERRANHFAGLALAVSGFENEHLNFALATPDGTFALRVRFSTTRYSLA 
IRQEVCAMMALNMLRRWLNGQDIASEHGWIEVIESMTLSV >gi|296493381|gb|ADTK01000120.1| GENE 30 44420 - 44962 653 180 aa, chain - ## HITS:1 COG:no KEGG:B21_02135 NR:ns ## KEGG: B21_02135 # Name: yfaZ # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 180 1 180 180 274 99.0 1e-72 MKKIALAGLAGMLLVSASVNAMSISGQAGKEYTNIGVGFGTESTGLALSGNWTHNDDDGD VAGVGLGLNLPLGPLMATVGGKGVYTNPNYGDEGYAAAVGGGLQWKIGNSFRLFGEYYYS PDSLSSGIQSYEEANAGARYTIMRPVSIEAGYRYLNLSGKDGNRDNAVADGPYVGVNASF >gi|296493381|gb|ADTK01000120.1| GENE 31 45241 - 45666 429 141 aa, chain + ## HITS:1 COG:yfaO KEGG:ns NR:ns ## COG: yfaO COG0494 # Protein_GI_number: 16130186 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli K12 # 1 141 1 141 141 282 98.0 1e-76 MRQRTIVCPLIQNDGAYLLCKMADDRGVFPGQWALSGGGVESGERIEEALRREIREELGE QLLLTEITPWTFSDDIRTKTYADGRKEEIYMIYLIFDCVSANREVKINEEFQDYAWVKPE DLVHYDLNVATRKTLRLKGLL >gi|296493381|gb|ADTK01000120.1| GENE 32 45705 - 46307 217 200 aa, chain - ## HITS:1 COG:no KEGG:SFV_2322 NR:ns ## KEGG: SFV_2322 # Name: ais # Def: protein induced by aluminum # Organism: S.flexneri_8401 # Pathway: not_defined # 1 200 20 219 219 381 98.0 1e-105 MLAFCRSSLKSKKYFIILLALAAIAGLGTHAAWSSNGLPRIDNKTLARLAQQHPVVVLFR HAERCDRSTNQCLSDKTGITVKGTQDARELGNAFSADIPDFDLYSSNTVRTIQSATWFSA GKKLTVDKRLLQCGNEIYSAIKDLQSKAPDKNIVIFTHNHCLTYIAKNKRDATFKPDYLD GLVMHVEKGKVYLDGEFVNH >gi|296493381|gb|ADTK01000120.1| GENE 33 46597 - 47754 896 385 aa, chain + ## HITS:1 COG:yfbE KEGG:ns NR:ns ## COG: yfbE COG0399 # Protein_GI_number: 16130188 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Escherichia coli K12 # 1 385 6 390 390 743 98.0 0 MAEGKAMSEFLPFSRPAMGVEELAAVKEVLESGWITTGPKNQALEQAFCQLTGNQHAIAV SSATAGMHITLMALEIGKGDEVITPSLTWVSTLNMISLLGATPVMVDVDRDTLMVTPEAI 
ESAITPRTKAIIPVHYAGAPADIDAIRAIGERYGIAVIEDAAHAVGTYYKGRHIGAKGTA IFSFHAIKNITCAEGGLIVTDNENLARQLRMLKFHGLGVDAYDRQTWGRAPQAEVLTPGY KYNLTDINAAIALTQLVKLEHLNTRRREIAQQYQQALAALPFQPLSLPDWPHVHAWHLFI IRVDEQCCGISRNALMEALKERGIGTGLHFRAAHTQKYYRERFPTLSLPNTEWNSERICS LSLFPDMTTADADRVITALQQLAGQ >gi|296493381|gb|ADTK01000120.1| GENE 34 47758 - 48726 872 322 aa, chain + ## HITS:1 COG:yfbF KEGG:ns NR:ns ## COG: yfbF COG0463 # Protein_GI_number: 16130189 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Escherichia coli K12 # 1 322 1 322 322 658 100.0 0 MFEIHPVKKVSVVIPVYNEQESLPELIRRTTTACESLGKEYEILLIDDGSSDNSAHMLVE ASQAENSHIVSILLNRNYGQHSAIMAGFSHVTGDLIITLDADLQNPPEEIPRLVAKADEG YDVVGTVRQNRQDSWFRKTASKMINRLIQRTTGKAMGDYGCMLRAYRRHIVDAMLHCHER STFIPILANIFARRAIEIPVHHAEREFGESKYSFMRLINLMYDLVTCLTTTPLRMLSLLG SIIAIGGFSIAVLLVILRLTFGPQWAAEGVFMLFAVLFTFIGAQFIGMGLLGEYIGRIYT DVRARPRYFVQQVIRPSSKENE >gi|296493381|gb|ADTK01000120.1| GENE 35 48726 - 50708 1605 660 aa, chain + ## HITS:1 COG:yfbG_2 KEGG:ns NR:ns ## COG: yfbG_2 COG0451 # Protein_GI_number: 16130190 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Escherichia coli K12 # 306 660 1 355 355 767 100.0 0 MKTVVFAYHDMGCLGIEALLAAGYEISAIFTHTDNPGEKAFYGSVAHLAAERGIPVYAPD NVNHPLWVERIAQLSPEVIFSFYYRHLICDEILQLAPAGAFNLHGSLLPKYRGRAPLNWV LVNGETETGVTLHRMVKRADAGAIVAQLRVAIAPDDIAITLHHKLCHAARQLLEQTLPAI KHGNILEIAQRENEATCFGRRTPDDSFLEWHKPASVLHNMVRAVADPWPGAFTYVGNQKF TVWSSRVHPHASKAQPGSVISIAPLLIACGDGALEIVTGQAGDGITMQGSQLAQTLGLVQ GSRLNSQPACTARRRTRVLILGVNGFIGNHLTERLLREDHYEVYGLDIGSDAISRFLNHP HFHFVEGDISIHSEWIEYHVKKCDVVLPLVAIATPIEYTRNPLRVFELDFEENLRIIRYC VKYRKRIIFPSTSEVYGMCSDKYFDEDHSNLIVGPVNKPRWIYSVSKQLLDRVIWAYGEK EGLQFTLFRPFNWMGPRLDNLNAARIGSSRAITQLILNLVEGSPIKLIDGGKQKRCFTDI RDGIEALYRIIENAGNRCDGEIINIGNPENEASIEELGEMLLASFEKHPLRHHFPPFAGF RVVESSSYYGKGYQDVEHRKPSIRNAHRCLDWEPKIDMQETIDETLDFFLRTVDLTDKPS 
>gi|296493381|gb|ADTK01000120.1| GENE 36 50705 - 51595 741 296 aa, chain + ## HITS:1 COG:yfbH KEGG:ns NR:ns ## COG: yfbH COG0726 # Protein_GI_number: 16130191 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Escherichia coli K12 # 1 296 1 296 296 613 100.0 1e-175 MTKVGLRIDVDTFRGTREGVPRLLEILSKHNIQASIFFSVGPDNMGRHLWRLVKPQFLWK MLRSNAASLYGWDILLAGTAWPGKEIGHANADIIREAAKHHEVGLHAWDHHAWQARSGNW DRQTMIDDIARGLRTLEEIIGQPVTCSAAAGWRADQKVIEAKEAFHLRYNSDCRGAMPFR PLLESGNPGTAQIPVTLPTWDEVIGRDVKAEDFNGWLLNRILRDKGTPVYTIHAEVEGCA YQHNFVDLLKRAAQEGVTFCPLSELLSETLPLGQVVRGNIAGREGWLGCQQIAGSR >gi|296493381|gb|ADTK01000120.1| GENE 37 51595 - 53247 1368 550 aa, chain + ## HITS:1 COG:yfbI KEGG:ns NR:ns ## COG: yfbI COG1807 # Protein_GI_number: 16130192 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 4-amino-4-deoxy-L-arabinose transferase and related glycosyltransferases of PMT family # Organism: Escherichia coli K12 # 1 550 1 550 550 1041 99.0 0 MKSVRYLIGLFAFIACYYLLPISTRLLWQPDETRYAEISREMLASGDWIVPHLLGLRYFE KPIAGYWINSIGQWLFGANNFGVRAGVIFATLLTAALVTWFTLRLWRDKRLALLATVIYL SLFIVYAIGTYAVLDPFIAFWLVAGMCSFWLAMQAQTWKGKSAGFLLLGITCGMGVMTKG FLALAVPVLSVLPWVATQKRWKDLFIYGWLAVISCVLTVLPWGLAIAQREPNFWHYFFWV EHIQRFALDDAQHRAPFWYYVPVIIAGSLPWLGLLPGALYTGWKNRKHSATVYLLSWTIM PLLFFSVAKGKLPTYILSCFASLAMLMAHYALLAAKNNPLALRINGWINIAFGVTGIIAT FVVSPWGPMNTPVWQTFEIYKVFCAWSIFSLWAFFGWYTLTNVEKTWSFAALCPLGLALL VGFSIPDRVMEGKHPQFFVEMTQESLQPSRYILTDSVGVAAGLAWSLQRDDIIMYRQTGE LKYGLNYPDAKGRFVSGDEFANWLNQHRQEGIITLVLSVDRDEDINSLAIPPADAIDRQE RLVLIQYRPK >gi|296493381|gb|ADTK01000120.1| GENE 38 53244 - 53579 376 111 aa, chain + ## HITS:1 COG:Z3516 KEGG:ns NR:ns ## COG: Z3516 COG0697 # Protein_GI_number: 15802807 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli O157:H7 EDL933 # 1 111 1 111 111 157 
99.0 5e-39 MIWLTLVFASLLSVAGQLCQKQATCFVAINKRRKHIVLWLGLALACLGLAMVLWLLVLQN VPVGIAYPMLSLNFVWVTLAAVKLWHEPVSPRHWCGVAFIIGGIVILGSTV >gi|296493381|gb|ADTK01000120.1| GENE 39 53579 - 53965 502 128 aa, chain + ## HITS:1 COG:yfbJ KEGG:ns NR:ns ## COG: yfbJ COG0697 # Protein_GI_number: 16130193 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli K12 # 1 128 95 222 222 204 99.0 2e-53 MGLMWGLFSVIIASVAQLSLGFAASHLPPMTHLWDFIAALLAFGLDARILLLGLLGYLLS VFCWYKTLHKLALSKAYALLSMSYVLVWIASMVLPGWEGTFSLKALLGVACIMSGLMLIF LPMTKQRY >gi|296493381|gb|ADTK01000120.1| GENE 40 53959 - 54225 209 88 aa, chain - ## HITS:1 COG:no KEGG:B21_02145 NR:ns ## KEGG: B21_02145 # Name: pmrD # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 88 1 88 88 171 100.0 8e-42 MEWLVKKSCCNKQDNRHVLMLCDAGGAIKMIAEVKSDFAVKVGDLLSPLQNALYCINREK LHTVKVLSASSYSPDEWERQCKVAGKTQ >gi|296493381|gb|ADTK01000120.1| GENE 41 54335 - 55690 1178 451 aa, chain - ## HITS:1 COG:menE KEGG:ns NR:ns ## COG: menE COG0318 # Protein_GI_number: 16130195 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Escherichia coli K12 # 1 451 1 451 451 853 98.0 0 MIFSDWPWRHWRQVRGEAIALRLNDEQLNWRELCARVDELATGFAAQGVVEGSGVMLRAW NTPQTLLAWLALLQCGARVLPVNPQLPQPLLEELLPNLTLQFALVPDGENTFPALTSLHI QRVEGAHAATWQPTRLCSMTLTSGSTGLPKAAVHTYQAHLASAEGVLSLIPFGDHDDWLL SLPLFHVSGQGIMWRWLYAGARMTVRDKQPLEQMLAGCTHASLVPTQLWRLLVNRSSVSL KAVLLGGAAIPVELTEQAREQGIRCFCGYGLTEFASTVCAKEADGLADVGSPLPGREVKI VNNEVWLRAASMAEGYWRNGQLVSLVNDEGWYATRDRGEMHNGKLTIVGRLDNLFFSGGE GIQPEEVERVIAAHPAVLQVFIVPVADKEFGHRPVAVMEYDHESVDLSEWVKDKLASFQQ PVRWLTLPPELKNGGIKISRQALKEWVQRQQ >gi|296493381|gb|ADTK01000120.1| GENE 42 55687 - 56649 976 320 aa, chain - ## HITS:1 COG:ECs3149 KEGG:ns NR:ns ## 
COG: ECs3149 COG1441 # Protein_GI_number: 15832403 # Func_class: H Coenzyme transport and metabolism # Function: O-succinylbenzoate synthase # Organism: Escherichia coli O157:H7 # 1 320 1 320 320 623 99.0 1e-178 MRSAQVYRWQIPMDAGVVLRDRRLKTRDGLYVCLREGEREGWGEISPLPGFSQETWEEAQ SVLLAWVNNWLAGDCELPQMTSVAFGVSCALAELADTLPQAANYRAAPLCNGDPDDLILK LADMPGEKVAKVKVGLYEAVRDGMVVNLLLEAIPDLHLRLDANRAWTPLKGQQFAKYVNP DYRHRIAFLEEPCKTRDDSRAFARETGIAIAWDESLREPDFAFVAEEGVRAVVIKPTLTG SLEKVREQVQAAHALGLTAVISSSIESSLGLTQLARIAAWLTPDTIPGLDTLDLMQAQQV RRWPGSTLPVVEVDALERLL >gi|296493381|gb|ADTK01000120.1| GENE 43 56649 - 57506 887 285 aa, chain - ## HITS:1 COG:menB KEGG:ns NR:ns ## COG: menB COG0447 # Protein_GI_number: 16130197 # Func_class: H Coenzyme transport and metabolism # Function: Dihydroxynaphthoic acid synthase # Organism: Escherichia coli K12 # 1 285 1 285 285 602 100.0 1e-172 MIYPDEAMLYAPVEWHDCSEGFEDIRYEKSTDGIAKITINRPQVRNAFRPLTVKEMIQAL ADARYDDNIGVIILTGAGDKAFCSGGDQKVRGDYGGYKDDSGVHHLNVLDFQRQIRTCPK PVVAMVAGYSIGGGHVLHMMCDLTIAADNAIFGQTGPKVGSFDGGWGASYMARIVGQKKA REIWFLCRQYDAKQALDMGLVNTVVPLADLEKETVRWCREMLQNSPMALRCLKAALNADC DGQAGLQELAGNATMLFYMTEEGQEGRNAFNQKRQPDFSKFKRNP >gi|296493381|gb|ADTK01000120.1| GENE 44 57521 - 58279 572 252 aa, chain - ## HITS:1 COG:yfbB KEGG:ns NR:ns ## COG: yfbB COG0596 # Protein_GI_number: 16130198 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Escherichia coli K12 # 1 252 1 252 252 509 98.0 1e-144 MILHAQAKHGKPGLPWLVFLHGFSGDCHEWQEVGEAFADYSRLYVDLPGHGGSAAISVDG FDDVTDLLRKTLVSYNILDFWLVGYSLGGRVAMMAACQGLTGLCGVIVEGGHPGLQNAEQ RAERQRSDRQWAQRFCTEPLTAVFADWYQQPVFASLNDDQRRELVVLRSNNNGATLAAML EATSLAVQPDLRANLSARTFAFYYLCGERDSKFRALAAELAADCHVIPRAGHNAHRENPA GVIASLAQILRF >gi|296493381|gb|ADTK01000120.1| GENE 45 58276 - 59775 1542 499 aa, chain - ## HITS:1 COG:menD KEGG:ns NR:ns ## COG: menD COG1165 # Protein_GI_number: 16130199 # Func_class: H Coenzyme transport and metabolism # Function: 
2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase # Organism: Escherichia coli K12 # 1 499 58 556 556 994 99.0 0 MGHLALGLAKVSKQPVAVIVTSGTAVANLYPALIEAGLTGEKLILLTADRPPELIDCGAN QAIRQPGMFASHPTHSISLPRPTQDIPARWLVSTIDHALGTLHAGGVHINCPFAEPLYGE MDDTGLSWQQRLGDWWQDDKPWLREAPRLESEKQRDWFFWRQKRGVVVAGRMSAEEGKKV ALWAQTLGWPLIGDVLSQTGQPLPCADLWLGNAKATSELQQAQIVVQLGSSLTGKRLLQW QASCEPEEYWIVDDIEGRLDPAHHRGRRLIANIADWLELHPAEKRQPWCVEIPRLAEQAM QAVIARRDAFGEAQLAHRICDYLPEQGQLFVGNSLVVRLIDALSQLPAGYPVYSNRGASG IDGLLSTAAGVQRASGKPTLAIVGDLSALYDLNALALLRQVSAPLVLIVVNNNGGQIFSL LPTPQSERERFYLMPQNVHFEHAAAMFELKYHRPQNWQELETAFADAWRTPTTTVIEMVV NDTDGAQTLQQLLAQVSHL >gi|296493381|gb|ADTK01000120.1| GENE 46 60188 - 61330 866 380 aa, chain - ## HITS:1 COG:menF KEGG:ns NR:ns ## COG: menF COG1169 # Protein_GI_number: 16130200 # Func_class: H Coenzyme transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Isochorismate synthase # Organism: Escherichia coli K12 # 76 380 1 305 356 594 99.0 1e-170 MQSLTTALENLLRHLSQEIPATPGIRVIDIPFPLKDAFDALSWLASQQTYPQFYWQQRNG DEEAAVLGAITRFTSLDQAQRFLRQHPEHADLRIWGLNAFDPSQGNLLLPRLEWRRCGGK ATLRLTLFSESSLQHDAIQAKEFIATLVSIKPLPGLHLTTTREQHWPDKTGWTQLIELAT KTIAEGELDKVVLARATDLHFASPVNAAAMMAASRRLNLNCYHFYMAFDGENAFLGSSPE RLWRRRDKALRTEALAGTVANNPDDKQAQQLGEWLMADDKNQRENMLVVEDICQRLQADT QTLDVLPPQVLRLRKVQHLRRCIWTSLNKADDVICLHQLQPTAAVAGLPRDLARQFIARH EPFTREWYAGSAGYLSLQQS >gi|296493381|gb|ADTK01000120.1| GENE 47 61409 - 61714 506 101 aa, chain - ## HITS:1 COG:ECs3154 KEGG:ns NR:ns ## COG: ECs3154 COG4575 # Protein_GI_number: 15832408 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 84 1 84 101 143 98.0 9e-35 MSNQFGDTRIDDDLTLLSETLEEVLRSSGDTADQKYVELKARAEKALDDVKKRVSQASDS YYYRAKQAVYRADDYVHEKPWQGIGVGAAVGLVLGLLLARR >gi|296493381|gb|ADTK01000120.1| GENE 48 61769 - 62230 484 153 aa, chain - ## HITS:1 COG:elaA KEGG:ns NR:ns ## COG: elaA COG2153 # Protein_GI_number: 16130202 # Func_class: R General 
function prediction only # Function: Predicted acyltransferase # Organism: Escherichia coli K12 # 1 153 1 153 153 315 100.0 2e-86 MIEWQDLHHSELSVSQLYALLQLRCAVFVVEQNCPYQDIDGDDLTGDNRHILGWKNDELV AYARILKSDDDLEPVVIGRVIVSEALRGEKVGQQLMSKTLETCTHHWPDKPVYLGAQAHL QNFYQSFGFIPVTEVYEEDGIPHIGMAREVIQA >gi|296493381|gb|ADTK01000120.1| GENE 49 62295 - 63212 609 305 aa, chain + ## HITS:1 COG:elaC KEGG:ns NR:ns ## COG: elaC COG1234 # Protein_GI_number: 16130203 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Escherichia coli K12 # 1 305 7 311 311 627 100.0 1e-179 MELIFLGTSAGVPTRTRNVTAILLNLQHPTQSGLWLFDCGEGTQHQLLHTAFNPGKLDKI FISHLHGDHLFGLPGLLCSRSMSGIIQPLTIYGPQGIREFVETALRISGSWTDYPLEIVE IGAGEILDDGLRKVTAYPLEHPLECYGYRIEEHDKPGALNAQALKAAGVPPGPLFQELKA GKTITLEDGRQINGADYLAAPVPGKALAIFGDTGPCDAALDLAKGVDVMVHEATLDITME AKANSRGHSSTRQAATLAREAGVGKLIITHVSSRYDDKGCQHLLRECRSIFPATELANDF TVFNV >gi|296493381|gb|ADTK01000120.1| GENE 50 63403 - 64620 391 405 aa, chain + ## HITS:1 COG:no KEGG:ECO26_3260 NR:ns ## KEGG: ECO26_3260 # Name: elaD # Def: deubiquitinase # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 405 2 406 406 770 100.0 0 MVTVVSNYCQLSQTQLSQTFAEKFTVTEELLQSLKKTALSGDEESIELLHNIALGYDEFG KKAEDILYHIVRNPTNETLSIIRLIKNACLKLYNLAHTATNSHLKPTGPDNSDVLLFKKL FSPSKLMTIIGDEIPLISEKQSLSKVLLNDENNELSDDTNFWDKNRQLTTDEIACYLQKI AANAKNTQVNYPTGLYVPYSTRTHLEDALNENIKSDPSWPNEVQLFHINTGGHWILVSLQ KIVNKKNNKLQIKCVIFNSLRALGHDKENSLKRVINSFNSELMGEMSNNNIKVHLNEPEI IFLHADLQQYLSQSCGAFVCMAAQEVIEQRESNSDSAPYTLLKNYADRFKKYSAEEQYEI DFQHRLVNRNCYLDKYGDARINASYTQLEIKHSQPKNRASGKRVS Prediction of potential genes in microbial genomes Time: Mon May 16 15:26:55 2011 Seq name: gi|296493380|gb|ADTK01000121.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont305.4, whole genome shotgun sequence Length of sequence - 3593 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End 
Score pairs(N/Pv) 1 1 Tu 1 . - CDS 37 - 1827 1259 ## COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain - Prom 1935 - 1994 7.8 + Prom 1779 - 1838 8.9 2 2 Tu 1 . + CDS 1929 - 2351 221 ## COG2234 Predicted aminopeptidases 3 3 Tu 1 . + CDS 2499 - 2900 140 ## COG2234 Predicted aminopeptidases + Prom 2918 - 2977 5.0 4 4 Tu 1 . + CDS 3003 - 3506 572 ## SDY_2471 hypothetical protein + Term 3541 - 3578 9.1 Predicted protein(s) >gi|296493380|gb|ADTK01000121.1| GENE 1 37 - 1827 1259 596 aa, chain - ## HITS:1 COG:yfbK KEGG:ns NR:ns ## COG: yfbK COG2304 # Protein_GI_number: 16130205 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Escherichia coli K12 # 13 596 1 575 575 994 93.0 0 MSGACIYLKGFYMRNKNIIMLLMSSLILSGCGPESENKESQQQQPSTPTDQQVLAAQQAA IKEAEQRSVADKATADAKAKALAQQEAQQYSDKQALQGRLQAAPKYQHAAREKAASQIAN PGTARYKQFDDNPVKQVAQNPLATFSLDVDTGSYANVRRFLNHGLLPPPDAVRVEEIVNY FPYDWDIKDKQSIPATKPIPFAMRYELAPAPWNEQLTLLKIDILAKDHKSEELPASNLVF LIDTSGSMISDERLPLIQSSLKLLVKELREQDNIAIVTYAGDSRIALPSISGSHKAEINA AIDSLDAEGSTNGGAGLELAYQQAAKGFIKGGINRILLATDGDFNVGIDDPKSIESMVKK QRESGVTLSTFGVGNSNYNEAMMVRIADVGNGNYSYIDTLSEAQKVLNSEMRQTLITVAK DVKAQIEFNPAWVTEYRQIGYEKRQLRAEDFNNDNVDAGDIGAGKHITLLFELTLNGQKA SIDKLRYAPDNKSVKSDKTKELAWLKIRWKYPQGKESQLVEFPLGPTINAPSEDMRFCAA VAAYGQKLRGSEYLNNTSWQQIKQWAQQAKGEDPQGYRAEFIRLIELADGVTDISQ >gi|296493380|gb|ADTK01000121.1| GENE 2 1929 - 2351 221 140 aa, chain + ## HITS:1 COG:yfbL KEGG:ns NR:ns ## COG: yfbL COG2234 # Protein_GI_number: 16130206 # Func_class: R General function prediction only # Function: Predicted aminopeptidases # Organism: Escherichia coli K12 # 1 140 3 142 325 263 93.0 9e-71 MKKINFAFIILFLFSLPLIIFYQPWVNALPPMPRHANPEQLEKTVRYLTQTVHPRSADNI DNLNRSAEYIKEVFVSSGARITSQDVPITGGPYKNIVADYGPADGPLIIIGAHYDSASSY ENDELTYTPGADDNASGVAG >gi|296493380|gb|ADTK01000121.1| GENE 3 2499 - 2900 140 133 aa, chain + ## HITS:1 COG:yfbL KEGG:ns NR:ns ## COG: yfbL 
COG2234 # Protein_GI_number: 16130206 # Func_class: R General function prediction only # Function: Predicted aminopeptidases # Organism: Escherichia coli K12 # 1 133 193 325 325 255 93.0 2e-68 MIALEMIGYYDSAPGSQDYPYPAMSWLYPDRGDFIAVVGRMQDINAVRQVKATLLSSRYL SVYSMNAPGFIPGIDFSDHLNYWQHDIPAVMITDTAFYRNKQYHLPGDTADRLNYQKMAQ VVDGVINLLYNSK >gi|296493380|gb|ADTK01000121.1| GENE 4 3003 - 3506 572 167 aa, chain + ## HITS:1 COG:no KEGG:SDY_2471 NR:ns ## KEGG: SDY_2471 # Name: yfbM # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 167 1 167 167 322 100.0 4e-87 MGMIGYFAEIDSEKINQLLESTEKPLMDNIHDTLSGLRRLDIDKRWDFLHFGLTGTSAFD PAKNDPLSRAVLGEHSLEDGIDGFLGLTWNQELAATIDRLESLDRSELRKQFSIKRLNEM EIYPGVTFSEELEGQLFASIMLDMEKLISAYRRMLRQGNHALTVIVG Prediction of potential genes in microbial genomes Time: Mon May 16 15:27:03 2011 Seq name: gi|296493379|gb|ADTK01000122.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont305.5, whole genome shotgun sequence Length of sequence - 15669 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 1, operones - 1 average op.length - 13.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 1 - 42 5.4 1 1 Op 1 22/0.000 - CDS 62 - 1519 1805 ## COG1007 NADH:ubiquinone oxidoreductase subunit 2 (chain N) 2 1 Op 2 30/0.000 - CDS 1526 - 3055 1686 ## COG1008 NADH:ubiquinone oxidoreductase subunit 4 (chain M) - Prom 3088 - 3147 1.9 3 1 Op 3 26/0.000 - CDS 3219 - 5060 2185 ## COG1009 NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit 4 1 Op 4 30/0.000 - CDS 5057 - 5359 472 ## COG0713 NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K) 5 1 Op 5 28/0.000 - CDS 5356 - 5910 727 ## COG0839 NADH:ubiquinone oxidoreductase subunit 6 (chain J) 6 1 Op 6 31/0.000 - CDS 5922 - 6464 678 ## COG1143 Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) 7 1 Op 7 18/0.000 - CDS 6479 - 7456 1184 ## COG1005 NADH:ubiquinone oxidoreductase subunit 1 (chain 
H) 8 1 Op 8 12/0.000 - CDS 7453 - 10179 2964 ## COG1034 NADH dehydrogenase/NADH:ubiquinone oxidoreductase 75 kD subunit (chain G) 9 1 Op 9 23/0.000 - CDS 10232 - 11569 1456 ## COG1894 NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit 10 1 Op 10 15/0.000 - CDS 11566 - 12066 457 ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit 11 1 Op 11 9/0.000 - CDS 12069 - 13871 2130 ## COG0649 NADH:ubiquinone oxidoreductase 49 kD subunit 7 12 1 Op 12 30/0.000 - CDS 13965 - 14627 684 ## COG0377 NADH:ubiquinone oxidoreductase 20 kD subunit and related Fe-S oxidoreductases 13 1 Op 13 . - CDS 14643 - 15080 434 ## COG0838 NADH:ubiquinone oxidoreductase subunit 3 (chain A) - Prom 15176 - 15235 5.9 Predicted protein(s) >gi|296493379|gb|ADTK01000122.1| GENE 1 62 - 1519 1805 485 aa, chain - ## HITS:1 COG:ECs3160 KEGG:ns NR:ns ## COG: ECs3160 COG1007 # Protein_GI_number: 15832414 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 2 (chain N) # Organism: Escherichia coli O157:H7 # 61 485 1 425 425 703 99.0 0 MTITPQNLIALLPLLIVGLTVVVVMLSIAWRRNHFLNATLSVIGLNAALVSLWFVGQAGA MDVTPLMRVDGFAMLYTGLVLLASLATCTFAYPWLEGYNDNKDEFYLLVLIAALGGILLA NANHLASLFLGIELISLPLFGLVGYAFRQKRSLEASIKYTILSAAASSFLLFGMALVYAQ SGDLSFVALGKNLGDGMLNEPLLLAGFGLMIVGLGFKLSLVPFHLWTPDVYQGAPAPVST FLATASKIAIFGVVMRLFLYVPVGDSEAIRVVLAIIAFASIIFGNLMALSQTNIKRLLGY SSISHLGYLLVALIALQTGEMSMEAVGGYLAGYLFSSLGAFGVVSLMSSPYRGPDADSLF SYRGLFWHRPILAAVMTVMMLSLAGIPMTLGFIGKFYVLAVGVQAHLWWLVGAVVVGSAI GLYYYLRVAVSLYLHAPEQPGRDAPSNWQYSAGGIVVLISALLVLVLGVWPQPLISIVRL AMPLM >gi|296493379|gb|ADTK01000122.1| GENE 2 1526 - 3055 1686 509 aa, chain - ## HITS:1 COG:ECs3161 KEGG:ns NR:ns ## COG: ECs3161 COG1008 # Protein_GI_number: 15832415 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 4 (chain M) # Organism: Escherichia coli O157:H7 # 1 509 1 509 509 885 99.0 0 MLLPWLILIPFIGGFLCWQTERFGVKVPRWIALITMGLTLALSLQLWLQGGYSLTQSAGI 
PQWQSEFDMPWIPRFGISIHLAIDGLSLLMVVLTGLLGVLAVLCSWKEIEKYQGFFHLNL MWILGGVIGVFLAIDMFLFFFFWEMMLVPMYFLIALWGHKASDGKTRITAATKFFIYTQA SGLVMLIAILALVFVHYNATGVWTFNYEELLNTSMSSGVEYLLMLGFFIAFAVKMPVVPM HGWLPDAHSQAPTAGSVDLAGILLKTAAYGLLRFSLPLFPNASAEFAPIAMWLGVIGIFY GAWMAFAQTDIKRLIAYTSVSHMGFVLIAIYTGSQLAYQGAVIQMIAHGLSAAGLFILCG QLYERIHTRDMRMMGGLWSKMKWLPALSLFFAVATLGMPGTGNFVGEFMILFGSFQVVPV ITVISTFGLVFASVYSLAMLHRAYFGKAKSQIASQELPGMSLRELFMILLLVVLLVLLGF YPQPILDTSHSAIGNIQQWFVNSVTTTRP >gi|296493379|gb|ADTK01000122.1| GENE 3 3219 - 5060 2185 613 aa, chain - ## HITS:1 COG:nuoL KEGG:ns NR:ns ## COG: nuoL COG1009 # Protein_GI_number: 16130213 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit # Organism: Escherichia coli K12 # 1 613 1 613 613 1053 99.0 0 MNMLALTIILPLIGFVLLAFSRGRWSENVSAIVGVGSVGLAALVTAFIGVDFFANGEQSY SQPLWTWMSVGDFNIGFNLVLDGLSLTMLSVVTGVGFLIHMYASWYMRGEEGYSRFFAYT NLFIASMVVLVLADNLLLMYLGWEGVGLCSYLLIGFYYTDPKNGAAAMKAFVVTRVGDVF LAFALFILYNELGTLNFREMVELAPAHFADGNNMLMWATLMLLGGAVGKSAQLPLQTWLA DAMAGPTPVSALIHAATMVTAGVYLIARTHGLFLMTPEVLHLVGIVGAVTLLLAGFAALV QTDIKRVLAYSTMSQIGYMFLALGVQAWDAAIFHLMTHAFFKALLFLASGSVILACHHEQ NIFKMGGLRKSIPLVYLCFLVGGAALSALPLVTAGFFSKDEILAGAMANGHINLMVAGLV GAFMTSLYTFRMIFIVFHGKEQIHAHAVKGVTHSLPLIVLLILSTFVGALIVPPLQGVLP QTTELAHGSMLTLEITSGVVAVVGILLAAWLWLGKRTLVTSIANSAPGRLLGTWWYNAWG FDWLYDKVFVKPFLGIAWLLKRDPLNSMMNIPAVLSRFAGKGLLLSENGYLRWYVASMSI GAVVVLALLMVLR >gi|296493379|gb|ADTK01000122.1| GENE 4 5057 - 5359 472 100 aa, chain - ## HITS:1 COG:ECs3163 KEGG:ns NR:ns ## COG: ECs3163 COG0713 # Protein_GI_number: 15832417 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K) # Organism: Escherichia coli O157:H7 # 1 100 1 100 100 148 100.0 3e-36 MIPLQHGLILAAILFVLGLTGLVIRRNLLFMLIGLEIMINASALAFVVAGSYWGQTDGQV MYILAISLAAAEASIGLALLLQLHRRRQNLNIDSVSEMRG >gi|296493379|gb|ADTK01000122.1| GENE 5 5356 
- 5910 727 184 aa, chain - ## HITS:1 COG:ECs3164 KEGG:ns NR:ns ## COG: ECs3164 COG0839 # Protein_GI_number: 15832418 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 6 (chain J) # Organism: Escherichia coli O157:H7 # 1 184 1 184 184 317 100.0 7e-87 MEFAFYICGLIAILATLRVITHTNPVHALLYLIISLLAISGVFFSLGAYFAGALEIIVYA GAIMVLFVFVVMMLNLGGSEIEQERQWLKPQVWIGPAILSAIMLVVIVYAILGVNDQGID GTPISAKAVGITLFGPYVLAVELASMLLLAGLVVAFHVGREERAGEVLSNRKDDSAKRKT EEHA >gi|296493379|gb|ADTK01000122.1| GENE 6 5922 - 6464 678 180 aa, chain - ## HITS:1 COG:ECs3165 KEGG:ns NR:ns ## COG: ECs3165 COG1143 # Protein_GI_number: 15832419 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) # Organism: Escherichia coli O157:H7 # 1 180 1 180 180 371 100.0 1e-103 MTLKELLVGFGTQVRSIWMIGLHAFAKRETRMYPEEPVYLPPRYRGRIVLTRDPDGEERC VACNLCAVACPVGCISLQKAETKDGRWYPEFFRINFSRCIFCGLCEEACPTTAIQLTPDF EMGEYKRQDLVYEKEDLLISGPGKYPEYNFYRMAGMAIDGKDKGEAENEAKPIDVKSLLP >gi|296493379|gb|ADTK01000122.1| GENE 7 6479 - 7456 1184 325 aa, chain - ## HITS:1 COG:ECs3166 KEGG:ns NR:ns ## COG: ECs3166 COG1005 # Protein_GI_number: 15832420 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 1 (chain H) # Organism: Escherichia coli O157:H7 # 1 325 1 325 325 593 100.0 1e-169 MSWISPELIEILLTILKAVVILLVVVTCGAFMSFGERRLLGLFQNRYGPNRVGWGGSLQL VADMIKMFFKEDWIPKFSDRVIFTLAPMIAFTSLLLAFAIVPVSPGWVVADLNIGILFFL MMAGLAVYAVLFAGWSSNNKYSLLGAMRASAQTLSYEVFLGLSLMGVVAQAGSFNMTDIV NSQAHVWNVIPQFFGFITFAIAGVAVCHRHPFDQPEAEQELADGYHIEYSGMKFGLFFVG EYIGIVTISALMVTLFFGGWQGPLLPPFIWFALKTAFFMMMFILIRASLPRPRYDQVMSF GWKICLPLTLINLLVTAAVILWQAQ >gi|296493379|gb|ADTK01000122.1| GENE 8 7453 - 10179 2964 908 aa, chain - ## HITS:1 COG:ECs3167 KEGG:ns NR:ns ## COG: ECs3167 COG1034 # Protein_GI_number: 15832421 # Func_class: C Energy production and conversion # Function: NADH dehydrogenase/NADH:ubiquinone 
oxidoreductase 75 kD subunit (chain G) # Organism: Escherichia coli O157:H7 # 1 908 3 910 910 1891 99.0 0 MATIHVDGKEYEVNGADNLLEACLSLGLDIPYFCWHPALGSVGACRQCAVKQYQNAEDTR GRLVMSCMTPASDGTFISIDDEEAKQFRESVVEWLMTNHPHDCPVCEEGGNCHLQDMTVM TGHSFRRYRFTKRTHRNQDLGPFISHEMNRCIACYRCVRYYKDYADGTDLGVYGAHDNVY FGRPEDGTLESEFSGNLVEICPTGVFTDKTHSERYNRKWDMQFAPSICQQCSIGCNISPG ERYGELRRIENRYNGTVNHYFLCDRGRFGYGYVNLKDRPRQPVQRRGDDFITLNAEQAMQ GAADILRQSKKVIGIGSPRASVESNFALRELVGEENFYTGIAHGEQEHLQLALKVLREGG IYTPALREIESYDAVLVLGEDVTQTGARVALAVRQAVKGKAREMAAAQKVADWQIAAILN IGQRAKHPLFVTNVDDTRLDDIAAWTYRAPVEDQARLGFAIAHALDNSAPAVDGIEPELQ SKIDVIVQALAGAKKPLIISGTNAGSIEVIQAAANVAKALKGRGADVGITMIARSVNSMG LGIMGGGSLEEALTELETGRADAVVVLENDLHRHASATRVNAALAKAPLVMVVDHQRTAI MENAHLVLSAASFAESDGTVINNEGRAQRFFQVYDPAYYDSKTVMLESWRWLHSLHSTLL SREVDWTQLDHVIDAVVAKIPELAGIKDAAPDATFRIRGQKLAREPHRYSGRTAMRANIS VHEPRQPQDIDTMFTFSMEGNNQPTAHRSQVPFAWAPGWNSPQAWNKFQDEVGGKLRFGD PGVRLFETSENGLDYFTSVPARFQPQDGKWRIAPYYHLFGSDELSQRAPVFQSRMPQPYI KLNPADAAKLGVNAGTHVSFSYDGNTVTLPVEIAEGLTAGQVGLPMGMSGIAPVLAGAHL EDLKEAQQ >gi|296493379|gb|ADTK01000122.1| GENE 9 10232 - 11569 1456 445 aa, chain - ## HITS:1 COG:ECs3168 KEGG:ns NR:ns ## COG: ECs3168 COG1894 # Protein_GI_number: 15832422 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit # Organism: Escherichia coli O157:H7 # 1 445 1 445 445 925 100.0 0 MKNIIRTPETHPLTWRLRDDKQPVWLDEYRSKNGYEGARKALTGLSPDEIVNQVKDAGLK GRGGAGFSTGLKWSLMPKDESMNIRYLLCNADEMEPGTYKDRLLMEQLPHLLVEGMLISA FALKAYRGYIFLRGEYIEAAVNLRRAIAEATEAGLLGKNIMGTGFDFELFVHTGAGRYIC GEETALINSLEGRRANPRSKPPFPATSGVWGKPTCVNNVETLCNVPAILANGVEWYQNIS KSKDAGTKLMGFSGRVKNPGLWELPFGTTAREILEDYAGGMRDGLKFKAWQPGGAGTDFL TEAHLDLPMEFESIGKAGSRLGTALAMAVDHEINMVSLVRNLEEFFARESCGWCTPCRDG LPWSVKILRALERGEGQPGDIETLEQLCRFLGPGKTFCAHAPGAVEPLQSAIKYFREEFE AGIKQPFSNTHLINGIQPNLLKERW >gi|296493379|gb|ADTK01000122.1| GENE 10 11566 - 12066 457 166 aa, chain - ## HITS:1 COG:nuoE KEGG:ns NR:ns ## COG: nuoE COG1905 # Protein_GI_number: 16130220 # 
Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Escherichia coli K12 # 1 166 1 166 166 340 100.0 8e-94 MHENQQPQTEAFELSAAEREAIEHEMHHYEDPRAASIEALKIVQKQRGWVPDGAIHAIAD VLGIPASDVEGVATFYSQIFRQPVGRHVIRYCDSVVCHINGYQGIQAALEKKLNIKPGQT TFDGRFTLLPTCCLGNCDKGPNMMIDEDTHAHLTPEAIPELLERYK >gi|296493379|gb|ADTK01000122.1| GENE 11 12069 - 13871 2130 600 aa, chain - ## HITS:1 COG:nuoC_2 KEGG:ns NR:ns ## COG: nuoC_2 COG0649 # Protein_GI_number: 16130221 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 49 kD subunit 7 # Organism: Escherichia coli K12 # 202 600 1 399 399 844 99.0 0 MVNNMTDLTAQEPAWQTRDHLDDPVIGELRNRFGPDAFTVQATRTGVPVVWIKREQLLEV GDFLKKLPKPYVMLFDLHGMDERLRTHREGLPAADFSVFYHLISIDRNRDIMLKVALAEN DLHVPTFTKLFPNANWYERETWDLFGITFDGHPNLRRIMMPQTWKGHPLRKDYPARATEF SPFELTKAKQDLEMEALTFKPEEWGMKRGTENEDFMFLNLGPNHPSAHGAFRIVLQLDGE EIVDCVPDIGYHHRGAEKMGERQSWHSYIPYTDRIEYLGGCVNEMPYVLAVEKLAGITVP DRVNVIRVMLSELFRINSHLLYISTFIQDVGAMTPVFFAFTDRQKIYDLVEAITGFRMHP AWFRIGGVAHDLPRGWDRLLREFLDWMPKRLASYEKAALQNTILKGRSQGVAAYGAKEAL EWGTTGAGLRATGIDFDVRKARPYSGYENFDFEIPVGGGVSDCYTRVMLKVEELRQSLRI LEQCLNNMPEGPFKADHPLTTPPPKERTLQHIETLITHFLQVSWGPVMPANESFQMIEAT KGINSYYLTSDGSTMSYRTRIRTPSYAHLQQIPAAIRGSLVSDLIVYLGSIDFVMSDVDR >gi|296493379|gb|ADTK01000122.1| GENE 12 13965 - 14627 684 220 aa, chain - ## HITS:1 COG:ECs3171 KEGG:ns NR:ns ## COG: ECs3171 COG0377 # Protein_GI_number: 15832425 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 20 kD subunit and related Fe-S oxidoreductases # Organism: Escherichia coli O157:H7 # 1 220 1 220 220 457 100.0 1e-129 MDYTLTRIDPNGENDRYPLQKQEIVTDPLEQEVNKNVFMGKLNDMVNWGRKNSIWPYNFG LSCCYVEMVTSFTAVHDVARFGAEVLRASPRQADLMVVAGTCFTKMAPVIQRLYDQMLEP KWVISMGACANSGGMYDIYSVVQGVDKFIPVDVYIPGCPPRPEAYMQALMLLQESIGKER RPLSWVVGDQGVYRANMQSERERKRGERIAVTNLRTPDEI >gi|296493379|gb|ADTK01000122.1| GENE 13 14643 - 15080 434 145 aa, chain - ## HITS:1 COG:ECs3172 KEGG:ns NR:ns ## 
COG: ECs3172 COG0838 # Protein_GI_number: 15832426 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 3 (chain A) # Organism: Escherichia coli O157:H7 # 1 145 3 147 147 266 100.0 9e-72 MSTSTEVIAHHWAFAIFLIVAIGLCCLMLVGGWFLGGRARARSKNVPFESGIDSVGSARL RLSAKFYLVAMFFVIFDVEALYLFAWSTSIRESGWVGFVEAAIFIFVLLAGLVYLVRIGA LDWTPARSRRERMNPETNSIANRQR Prediction of potential genes in microbial genomes Time: Mon May 16 15:27:21 2011 Seq name: gi|296493378|gb|ADTK01000123.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont305.6, whole genome shotgun sequence Length of sequence - 55008 bp Number of predicted genes - 53, with homology - 53 Number of transcription units - 25, operones - 13 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 16 - 954 929 ## COG0583 Transcriptional regulator + Prom 1763 - 1822 1.9 2 2 Op 1 6/0.125 + CDS 1874 - 3091 1091 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 3 2 Op 2 . + CDS 3175 - 3774 698 ## COG1896 Predicted hydrolases of HD superfamily - Term 3568 - 3603 -0.5 4 3 Tu 1 2/0.875 - CDS 3833 - 5665 1548 ## COG0471 Di- and tricarboxylate transporters - Prom 5691 - 5750 2.5 - Term 5703 - 5732 1.1 5 4 Op 1 . - CDS 5752 - 6402 490 ## COG0637 Predicted phosphatase/phosphohexomutase 6 4 Op 2 2/0.875 - CDS 6413 - 6907 650 ## COG3013 Uncharacterized conserved protein - Term 6934 - 6964 -0.5 7 4 Op 3 . - CDS 6990 - 7445 338 ## COG3092 Uncharacterized protein conserved in bacteria - Prom 7603 - 7662 4.8 + Prom 7502 - 7561 3.0 8 5 Op 1 14/0.000 + CDS 7783 - 8985 1315 ## COG0282 Acetate kinase + Term 8989 - 9028 10.1 9 5 Op 2 1/0.875 + CDS 9060 - 11204 2337 ## COG0857 BioD-like N-terminal domain of phosphotransacetylase + Term 11242 - 11282 6.8 + Prom 11234 - 11293 4.1 10 6 Tu 1 . 
+ CDS 11394 - 12914 1488 ## COG1288 Predicted membrane protein + Term 12924 - 12954 3.0 - Term 12914 - 12939 -0.5 11 7 Tu 1 3/0.750 - CDS 12947 - 13489 594 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 12 8 Op 1 2/0.875 - CDS 13547 - 14098 527 ## COG0622 Predicted phosphoesterase 13 8 Op 2 . - CDS 14154 - 14798 540 ## COG0625 Glutathione S-transferase - Prom 14826 - 14885 2.0 + Prom 14846 - 14905 3.9 14 9 Op 1 3/0.750 + CDS 14934 - 15581 574 ## COG0625 Glutathione S-transferase 15 9 Op 2 3/0.750 + CDS 15638 - 16000 401 ## COG1539 Dihydroneopterin aldolase 16 9 Op 3 . + CDS 16021 - 16914 883 ## COG1090 Predicted nucleoside-diphosphate sugar epimerase + Term 16936 - 16981 -0.4 - Term 16879 - 16940 1.8 17 10 Tu 1 . - CDS 16962 - 17852 479 ## COG5464 Uncharacterized conserved protein - Prom 17881 - 17940 3.4 - Term 18005 - 18042 6.0 18 11 Op 1 6/0.125 - CDS 18049 - 18822 258 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 19 11 Op 2 12/0.000 - CDS 18830 - 19546 632 ## COG4160 ABC-type arginine/histidine transport system, permease component 20 11 Op 3 12/0.000 - CDS 19543 - 20229 679 ## COG4215 ABC-type arginine transport system, permease component 21 11 Op 4 9/0.000 - CDS 20319 - 21101 939 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain - Prom 21212 - 21271 4.3 - Term 21161 - 21204 0.2 22 11 Op 5 5/0.375 - CDS 21322 - 22104 950 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain - Prom 22304 - 22363 2.4 - Term 22331 - 22361 3.6 23 12 Op 1 5/0.375 - CDS 22370 - 22939 560 ## COG0163 3-polyprenyl-4-hydroxybenzoate decarboxylase 24 12 Op 2 18/0.000 - CDS 23034 - 24551 1545 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase 25 12 Op 3 7/0.125 - CDS 24588 - 25076 388 ## COG1286 Uncharacterized membrane protein, required for colicin V production - Prom 25112 - 25171 4.8 - 
Term 25165 - 25206 6.4 26 13 Op 1 7/0.125 - CDS 25336 - 25998 654 ## COG3147 Uncharacterized protein conserved in bacteria 27 13 Op 2 15/0.000 - CDS 25988 - 27256 1198 ## COG0285 Folylpolyglutamate synthase - Term 27276 - 27321 10.6 28 13 Op 3 5/0.375 - CDS 27326 - 28240 1018 ## COG0777 Acetyl-CoA carboxylase beta subunit - Prom 28334 - 28393 4.7 29 14 Op 1 5/0.375 - CDS 28396 - 29055 602 ## COG0586 Uncharacterized membrane-associated protein 30 14 Op 2 5/0.375 - CDS 29138 - 29950 586 ## COG0101 Pseudouridylate synthase 31 14 Op 3 5/0.375 - CDS 29950 - 30963 1235 ## COG0136 Aspartate-semialdehyde dehydrogenase - Term 30976 - 31016 5.1 32 14 Op 4 . - CDS 31029 - 32165 1098 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases + Prom 32182 - 32241 3.4 33 15 Tu 1 . + CDS 32264 - 33259 864 ## ECO103_2785 flagella biosynthesis regulator - Term 33178 - 33208 1.8 34 16 Tu 1 . - CDS 33256 - 34434 1076 ## COG0477 Permeases of the major facilitator superfamily - Prom 34553 - 34612 4.5 - Term 34590 - 34619 2.1 35 17 Tu 1 . - CDS 34718 - 35938 1293 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase - Prom 35974 - 36033 3.5 + Prom 36013 - 36072 3.7 36 18 Tu 1 . + CDS 36097 - 38103 1534 ## COG0665 Glycine/D-amino acid oxidases (deaminating) + Term 38350 - 38391 1.9 37 19 Op 1 . - CDS 38224 - 38502 366 ## ECB_02250 hypothetical protein 38 19 Op 2 5/0.375 - CDS 38536 - 39084 490 ## COG3101 Uncharacterized protein conserved in bacteria 39 19 Op 3 7/0.125 - CDS 39084 - 39893 859 ## COG0730 Predicted permeases 40 19 Op 4 7/0.125 - CDS 39893 - 40717 526 ## COG3770 Murein endopeptidase 41 19 Op 5 3/0.750 - CDS 40721 - 41806 1150 ## COG0082 Chorismate synthase 42 19 Op 6 . - CDS 41841 - 42773 1632 ## PROTEIN SUPPORTED gi|191165479|ref|ZP_03027320.1| ribosomal large subunit L3 protein glutamine methyltransferase - Prom 42911 - 42970 2.7 + Prom 42741 - 42800 3.4 43 20 Tu 1 . 
+ CDS 42939 - 43490 553 ## COG2840 Uncharacterized protein conserved in bacteria + Term 43717 - 43757 2.1 44 21 Tu 1 . - CDS 43612 - 44109 191 ## SF2408 hypothetical protein - Prom 44155 - 44214 4.7 45 22 Op 1 4/0.625 - CDS 44471 - 44995 175 ## COG3539 P pilus assembly protein, pilin FimA 46 22 Op 2 4/0.625 - CDS 44992 - 45462 248 ## COG3539 P pilus assembly protein, pilin FimA 47 22 Op 3 7/0.125 - CDS 45459 - 45872 228 ## COG3539 P pilus assembly protein, pilin FimA - Term 45892 - 45930 1.3 48 23 Op 1 10/0.000 - CDS 45982 - 46734 442 ## COG3121 P pilus assembly protein, chaperone PapD 49 23 Op 2 6/0.125 - CDS 46754 - 49399 2419 ## COG3188 P pilus assembly protein, porin PapC - Term 49437 - 49474 4.6 50 23 Op 3 3/0.750 - CDS 49481 - 50044 628 ## COG3539 P pilus assembly protein, pilin FimA - Prom 50222 - 50281 8.4 - Term 50672 - 50716 10.1 51 24 Tu 1 . - CDS 50728 - 51198 467 ## COG2062 Phosphohistidine phosphatase SixA - Prom 51316 - 51375 5.5 52 25 Op 1 20/0.000 - CDS 51416 - 53560 1654 ## COG1250 3-hydroxyacyl-CoA dehydrogenase 53 25 Op 2 . 
- CDS 53560 - 54870 1311 ## COG0183 Acetyl-CoA acetyltransferase - Prom 54901 - 54960 4.9 Predicted protein(s) >gi|296493378|gb|ADTK01000123.1| GENE 1 16 - 954 929 312 aa, chain - ## HITS:1 COG:lrhA KEGG:ns NR:ns ## COG: lrhA COG0583 # Protein_GI_number: 16130224 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 312 1 312 312 599 99.0 1e-171 MISANRPIINLDLDLLRTFVAVADLNTFAAAAAAVCRTQSAVSQQMQRLEQLVGKELFAR HGRNKLLTEHGIQLLGYARKILRFNDEACSSLMFSNLQGVLTIGASDESADTILPFLLNR VSSVYPKLALDVRVKRNAYMAEMLESQEVDLMVTTHRPSAFKALNLRTSPTHWYCAAEYV LQKGEPIPLVLLDDPSPFRDMVLATLNKADIPWRLAYVASTLPAVRAAVKAGLGVTARPV EMMSPDLRVLSGVDGLPPLPDTEYLLCYDPSSNNELAQVIYQAMESYHNPWQYSPMSAPE GDDSLLIERDIE >gi|296493378|gb|ADTK01000123.1| GENE 2 1874 - 3091 1091 405 aa, chain + ## HITS:1 COG:yfbQ KEGG:ns NR:ns ## COG: yfbQ COG0436 # Protein_GI_number: 16130225 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Escherichia coli K12 # 1 405 1 405 405 853 100.0 0 MSPIEKSSKLENVCYDIRGPVLKEAKRLEEEGNKVLKLNIGNPAPFGFDAPDEILVDVIR NLPTAQGYCDSKGLYSARKAIMQHYQARGMRDVTVEDIYIGNGVSELIVQAMQALLNSGD EMLVPAPDYPLWTAAVSLSSGKAVHYLCDESSDWFPDLDDIRAKITPRTRGIVIINPNNP TGAVYSKELLMEIVEIARQHNLIIFADEIYDKILYDDAEHHSIAPLAPDLLTITFNGLSK TYRVAGFRQGWMVLNGPKKHAKGYIEGLEMLASMRLCANVPAQHAIQTALGGYQSISEFI TPGGRLYEQRNRAWELINDIPGVSCVKPRGALYMFPKIDAKRFNIHDDQKMVLDFLLQEK VLLVQGTAFNWPWPDHFRIVTLPRVDDIELSLSKFARFLSGYHQL >gi|296493378|gb|ADTK01000123.1| GENE 3 3175 - 3774 698 199 aa, chain + ## HITS:1 COG:ECs3175 KEGG:ns NR:ns ## COG: ECs3175 COG1896 # Protein_GI_number: 15832429 # Func_class: R General function prediction only # Function: Predicted hydrolases of HD superfamily # Organism: Escherichia coli O157:H7 # 1 199 1 199 199 372 99.0 1e-103 MKQSHFFAHLSRLKLINRWPLMRNVRTENVSEHSLQVAMVAHALATIKNRKFGGNVNAER IALLAMYHDASEVLTGDLPTPVKYFNSQIAQEYKAIEKIAQQKLVDMVPEELRDIFAPLI DEHAYSDEEKSLVKQADALCAYLKCLEELAAGNNEFLLAKTRLEATLEARRSQEMDYFME VFVPSFHLSLDEISQDSPL 
>gi|296493378|gb|ADTK01000123.1| GENE 4 3833 - 5665 1548 610 aa, chain - ## HITS:1 COG:ECs3176 KEGG:ns NR:ns ## COG: ECs3176 COG0471 # Protein_GI_number: 15832430 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Escherichia coli O157:H7 # 1 610 1 610 610 1085 100.0 0 MNGELIWVLSLLAVAIVLFATGRVRMDAVALFVIVAFALSGTLTVPEVFSGFSDPNVVLI AALFIIGDGLVRTGVATVMGTWLVKVAGNSEIKMLVLLMLTVAGLGAFMSSTGVVAIFIP VVLSVAMRMQTSPSRLMMPLSFAGLISGMMTLVATPPNLVVNSELLREGYHGFSFFSVTP IGLVVLVLGILYMLVMRFMLKGDTQTPQREGWTRRTFRDLIREYRLTGRARRLAIRPGSP MIGQRLDDLKLRERYGANVIGVERWRRFRRVIVNVNGVSEFRARDVLLIDMSAADVDLRQ FCSEQLLEPMVLRGEYFSDQALDVGMAEISLIPESELIGKSVREIGFRTRYGLNVVGLKR NGVALEGSLADEPLLLGDIILVVGNWKLIGMLAKQGRDFVALNLPEEVSEASPAHSQAPH AIFCLVLMVALMLTDEIPNPVAAIIACLLMGKFRCIDAESAYKSIHWPSIILIVGMMPFA VALQKTGGVALAVKGLMDIGGGYGPHMMLGCLFVLSAVIGLFISNTATAVLMAPIALAAA KTMGVSPYPFAMVVAMAASAAFMTPVSSPVNTLVLGPGNYSFSDFVKLGVPFTIIVMAVC VVMIPMLFPF >gi|296493378|gb|ADTK01000123.1| GENE 5 5752 - 6402 490 216 aa, chain - ## HITS:1 COG:ECs3177 KEGG:ns NR:ns ## COG: ECs3177 COG0637 # Protein_GI_number: 15832431 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Escherichia coli O157:H7 # 1 216 1 216 216 393 99.0 1e-109 MRCKGFLFDLDGTLVDSLPAVERAWSNWARRHGLAPEEVLAFIHGKQAITSLRHFMAGKS EADIAAEFTRLEQIEATETEGITALPGAIALLSHLNKAGIPWAIVTSGSMPVARARHKIA GLPAPEVFVTAERVKRGKPEPDAYLLGAQLLGLAPQECVVVEDAPAGVLSGLAAGCHVIA VNAPADTPRLNEVDLVLHSLEQITVTKQPNGDVIIQ >gi|296493378|gb|ADTK01000123.1| GENE 6 6413 - 6907 650 164 aa, chain - ## HITS:1 COG:ECs3178 KEGG:ns NR:ns ## COG: ECs3178 COG3013 # Protein_GI_number: 15832432 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 164 7 170 170 318 100.0 3e-87 MEMTNAQRLILSNQYKMMTMLDPANAERYRRLQTIIERGYGLQMRELDREFGELKEETCR TIIDIMEMYHALHVSWSNLQDQQSIDERRVTFLGFDAATEARYLGYVRFMVNVEGRYTHF DAGTHGFNAQTPMWEKYQRMLNVWHACPRQYHLSANEINQIINA 
>gi|296493378|gb|ADTK01000123.1| GENE 7 6990 - 7445 338 151 aa, chain - ## HITS:1 COG:yfbV KEGG:ns NR:ns ## COG: yfbV COG3092 # Protein_GI_number: 16130230 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 151 1 151 151 296 100.0 7e-81 MSTPDNRSVNFFSLFRRGQHYSKTWPLEKRLAPVFVENRVIKMTRYAIRFMPPIAVFTLC WQIALGGQLGPAVATALFALSLPMQGLWWLGKRSVTPLPPAILNWFYEVRGKLQESGQVL APVEGKPDYQALADTLKRAFKQLDKTFLDDL >gi|296493378|gb|ADTK01000123.1| GENE 8 7783 - 8985 1315 400 aa, chain + ## HITS:1 COG:ECs3180 KEGG:ns NR:ns ## COG: ECs3180 COG0282 # Protein_GI_number: 15832434 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Escherichia coli O157:H7 # 1 400 1 400 400 803 100.0 0 MSSKLVLVLNCGSSSLKFAIIDAVNGEEYLSGLAECFHLPEARIKWKMDGNKQEAALGAG AAHSEALNFIVNTILAQKPELSAQLTAIGHRIVHGGEKYTSSVVIDESVIQGIKDAASFA PLHNPAHLIGIEEALKSFPQLKDKNVAVFDTAFHQTMPEESYLYALPYNLYKEHGIRRYG AHGTSHFYVTQEAAKMLNKPVEELNIITCHLGNGGSVSAIRNGKCVDTSMGLTPLEGLVM GTRSGDIDPAIIFHLHDTLGMSVDAINKLLTKESGLLGLTEVTSDCRYVEDNYATKEDAK RAMDVYCHRLAKYIGAYTALMDGRLDAVVFTGGIGENAAMVRELSLGKLGVLGFEVDHER NLAARFGKSGFINKEGTRPAVVIPTNEELVIAQDASRLTA >gi|296493378|gb|ADTK01000123.1| GENE 9 9060 - 11204 2337 714 aa, chain + ## HITS:1 COG:pta_1 KEGG:ns NR:ns ## COG: pta_1 COG0857 # Protein_GI_number: 16130232 # Func_class: R General function prediction only # Function: BioD-like N-terminal domain of phosphotransacetylase # Organism: Escherichia coli K12 # 1 391 1 391 391 752 100.0 0 MSRIIMLIPTGTSVGLTSVSLGVIRAMERKGVRLSVFKPIAQPRTGGDAPDQTTTIVRAN SSTTTAAEPLKMSYVEGLLSSNQKDVLMEEIVANYHANTKDAEVVLVEGLVPTRKHQFAQ SLNYEIAKTLNAEIVFVMSQGTDTPEQLKERIELTRNSFGGAKNTNITGVIVNKLNAPVD EQGRTRPDLSEIFDDSSKAKVNNVDPAKLQESSPLPVLGAVPWSFDLIATRAIDMARHLN ATIINEGDINTRRVKSVTFCARSIPHMLEHFRAGSLLVTSADRPDVLVAACLAAMNGVEI GALLLTGGYEMDARISKLCERAFATGLPVFMVNTNTWQTSLSLQSFNLEVPVDDHERIEK VQEYVANYINADWIESLTATSERSRRLSPPAFRYQLTELARKAGKRIVLPEGDEPRTVKA 
AAICAERGIATCVLLGNPAEINRVAASQGVELGAGIEIVDPEVVRESYVGRLVELRKNKG MTETVAREQLEDNVVLGTLMLEQDEVDGLVSGAVHTTANTIRPPLQLIKTAPGSSLVSSV FFMLLPEQVYVYGDCAINPDPTAEQLAEIAIQSADSAAAFGIEPRVAMLSYSTGTSGAGS DVEKVREATRLAQEKRPDLMIDGPLQYDAAVMADVAKSKAPNSPVAGRATVFIFPDLNTG NTTYKAVQRSADLISIGPMLQGMRKPVNDLSRGALVDDIVYTIALTAIQSAQQQ >gi|296493378|gb|ADTK01000123.1| GENE 10 11394 - 12914 1488 506 aa, chain + ## HITS:1 COG:yfcC KEGG:ns NR:ns ## COG: yfcC COG1288 # Protein_GI_number: 16130233 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 506 8 513 513 921 100.0 0 MSAITESKPTRRWAMPDTLVIIFFVAILTSLATWVVPVGMFDSQEVQYQVDGQTKTRKVV DPHSFRILTNEAGEPEYHRVQLFTTGDERPGLMNFPFEGLTSGSKYGTAVGIIMFMLVIG GAFGIVMRTGTIDNGILALIRHTRGNEILFIPALFILFSLGGAVFGMGEEAVAFAIIIAP LMVRLGYDSITTVLVTYIATQIGFASSWMNPFCVVVAQGIAGVPVLSGSGLRIVVWVIAT LIGLIFTMVYASRVKKNPLLSRVHESDRFFREKQADVEQRPFTFGDWLVLIVLTAVMVWV IWGVIVNAWFIPEIASQFFTMGLVIGIIGVVFRLNGMTVNTMASSFTEGARMMIAPALLV GFAKGILLLVGNGEAGDASVLNTILNSIANAISGLDNAVAAWFMLLFQAVFNFFVTSGSG QAALTMPLLAPLGDLVGVNRQVTVLAFQFGDGFSHIIYPTSASLMATLGVCRVDFRNWLK VGATLLGLLFIMSSVVVIGAQLMGYH >gi|296493378|gb|ADTK01000123.1| GENE 11 12947 - 13489 594 180 aa, chain - ## HITS:1 COG:ECs3183 KEGG:ns NR:ns ## COG: ECs3183 COG0494 # Protein_GI_number: 15832437 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli O157:H7 # 1 180 1 180 180 329 100.0 1e-90 MEQRRLASTEWVDIVNEENEVIAQASREQMRAQCLRHRATYIVVHDGMGKILVQRRTETK DFLPGMLDATAGGVVQADEQLLESARREAEEELGIAGVPFAEHGQFYFEDKNCRVWGALF SCVSHGPFALQEDEVSEVCWLTPEEITARCDEFTPDSLKALALWMKRNAKNEAVETETAE >gi|296493378|gb|ADTK01000123.1| GENE 12 13547 - 14098 527 183 aa, chain - ## HITS:1 COG:ECs3184 KEGG:ns NR:ns ## COG: ECs3184 COG0622 # Protein_GI_number: 15832438 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Escherichia coli O157:H7 # 1 183 2 
184 184 362 100.0 1e-100 MKLMFASDIHGSLPATERVLELFAQSGAQWLVILGDVLNHGPRNALPEGYAPAKVAERLN EVAHKVIAVRGNCDSEVDQMLLHFPITAPWQQVLLEKQRLFLTHGHLFGPENLPALNQND VLVYGHTHLPVAEQRGEIFHFNPGSVSIPKGGNPASYGMLDNDVLSVIALNDQSIIAQVA INP >gi|296493378|gb|ADTK01000123.1| GENE 13 14154 - 14798 540 214 aa, chain - ## HITS:1 COG:yfcF KEGG:ns NR:ns ## COG: yfcF COG0625 # Protein_GI_number: 16130236 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione S-transferase # Organism: Escherichia coli K12 # 1 214 1 214 214 417 99.0 1e-117 MSKPAITLWSDAHFFSPYVLSAWVALQEKGLSFHIKTIDLDSGEHLQPTWQGYGQTRRVP LLQIDDFELSESSAIAEYLEDRFAPPTWERIYPLDLENRARARQIQAWLRSDLMPIREER PTDVVFAGAKKAPLTAEGKASAEKLFAMAEHLLALGQPNLFGEWCIADTDLALMINRLVL HGDEVPERLVDYATFQWQRASVQRFIALSAKQSG >gi|296493378|gb|ADTK01000123.1| GENE 14 14934 - 15581 574 215 aa, chain + ## HITS:1 COG:yfcG KEGG:ns NR:ns ## COG: yfcG COG0625 # Protein_GI_number: 16130237 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione S-transferase # Organism: Escherichia coli K12 # 1 215 1 215 215 442 100.0 1e-124 MIDLYFAPTPNGHKITLFLEEAELDYRLIKVDLGKGGQFRPEFLRISPNNKIPAIVDHSP ADGGEPLSLFESGAILLYLAEKTGLFLSHETRERAATLQWLFWQVGGLGPMLGQNHHFNH AAPQTIPYAIERYQVETQRLYHVLNKRLENSPWLGGENYSIADIACWPWVNAWTRQRIDL AMYPAVKNWHERIRSRPATGQALLKAQLGDERSDS >gi|296493378|gb|ADTK01000123.1| GENE 15 15638 - 16000 401 120 aa, chain + ## HITS:1 COG:ECs3187 KEGG:ns NR:ns ## COG: ECs3187 COG1539 # Protein_GI_number: 15832441 # Func_class: H Coenzyme transport and metabolism # Function: Dihydroneopterin aldolase # Organism: Escherichia coli O157:H7 # 1 120 1 120 120 211 100.0 2e-55 MAQPAAIIRIKNLRLRTFIGIKEEEINNRQDIVINVTIHYPADKARTSEDINDALNYRTV TKNIIQHVENNRFSLLEKLTQDVLDIAREHHWVTYAEVEIDKLHALRYADSVSMTLSWQR >gi|296493378|gb|ADTK01000123.1| GENE 16 16021 - 16914 883 297 aa, chain + ## HITS:1 COG:yfcH KEGG:ns NR:ns ## COG: yfcH COG1090 # Protein_GI_number: 16130239 # Func_class: R General function prediction only # Function: 
Predicted nucleoside-diphosphate sugar epimerase # Organism: Escherichia coli K12 # 1 297 1 297 297 607 100.0 1e-174 MNIVITGGTGLIGRHLIPRLLELGHQITVVTRNPQKASSVLGPRVTLWQGLADQSNLNGV DAVINLAGEPIADKRWTHEQKERLCQSRWNITQKLVDLINASDTPPSVLISGSATGYYGD LGEVVVTEEEPPHNEFTHKLCARWEEIACRAQSDKTRVCLLRTGVVLAPDGGILGKMLPP FRLGLGGPIGSGRQYLAWIHIDDMVNGILWLLDNELRGPFNMVSPYPVRNEQFAHALGHA LHRPAILRVPATAIRLLMGESSVLVLGGQRALPKRLEEAGFAFRWYDLEEALADVVR >gi|296493378|gb|ADTK01000123.1| GENE 17 16962 - 17852 479 296 aa, chain - ## HITS:1 COG:yfcI KEGG:ns NR:ns ## COG: yfcI COG5464 # Protein_GI_number: 16130240 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 296 1 296 296 582 98.0 1e-166 MTISTTSTPHDAVFKSFLRHPDTARDFIDIHLPAPLRKLCDLTTLKLEPNSFIDDDLRQY YSDLLWSVKTQEGVGYIYVVIEHQSKPEELMAFRMMRYSIAAMQNHLDAGYKELPLVIPM LFYHGCRSPYPYSLCWLDEFAEPAIARKIYSSAFPLVDITVVPDDEIMQHRKMALLELIQ KHIRQRDLLGLVDQIVSLLVTGNTNDRQLKALFNYVLQTGDAQRFRAFIGEITERAPQEK EKLMTIADRLREEGAMQGKHEEALRIAQEMLDRGLDRELVMMVTRLSPDDLIAQSH >gi|296493378|gb|ADTK01000123.1| GENE 18 18049 - 18822 258 257 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 20 232 20 221 223 103 34 2e-21 MSENKLNVIDLHKRYGEHEVLKGVSLQANAGDVISIIGSSGSGKSTFLRCINFLEKPSEG SIVVSGQTINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWSHMTVLENVMEAPIQV LGLSKQEARERAVKYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPEVLLFDEPT SALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFARHVSTHVIFLHQGKIEEEGAPEQLF GNPQSPRLQQFLKGSLK >gi|296493378|gb|ADTK01000123.1| GENE 19 18830 - 19546 632 238 aa, chain - ## HITS:1 COG:ECs3191 KEGG:ns NR:ns ## COG: ECs3191 COG4160 # Protein_GI_number: 15832445 # Func_class: E Amino acid transport and metabolism # Function: ABC-type arginine/histidine transport system, permease component # Organism: Escherichia coli O157:H7 # 1 238 1 238 238 434 100.0 1e-122 MIEILHEYWKPLLWTDGYRFTGVAITLWLLILSVVIGGVLALFLAIGRVSSNKYIQFPIW 
LFTYIFRGTPLYVQLLVFYSGMYTLEIVKGTEFLNAFFRSGLNCTVLALTLNTCAYTTEI FAGAIRSVPHGEIEAARAYGFSTFKMYRCIILPSALRIALPAYSNEVILMLHSTALAFTA TVPDLLKIARDINAATYQPFTAFGIAAVLYLIISYVLISLFRRAEKRWLQHVKPSSTH >gi|296493378|gb|ADTK01000123.1| GENE 20 19543 - 20229 679 228 aa, chain - ## HITS:1 COG:hisQ KEGG:ns NR:ns ## COG: hisQ COG4215 # Protein_GI_number: 16130243 # Func_class: E Amino acid transport and metabolism # Function: ABC-type arginine transport system, permease component # Organism: Escherichia coli K12 # 1 228 1 228 228 394 99.0 1e-109 MLYGFSGVILQGALVTLELAISSVVLAVIIGLIGAGGKLSQNRLSGLIFEGYTTLIRGVP DLVLMLLIFYGLQIALNTVTEAMGVGQIDIDPMVAGIITLGFIYGAYFTETFRGAFMAVP KGHIEAATAFGFTRGQVFRRIMFPAMMRYALPGIGNNWQVILKSTALVSLLGLEDVVKAT QLAGKSTWEPFYFAIVCGVIYLVFTTVSNGVLLFLERRYSVGVKRADL >gi|296493378|gb|ADTK01000123.1| GENE 21 20319 - 21101 939 260 aa, chain - ## HITS:1 COG:ECs3193 KEGG:ns NR:ns ## COG: ECs3193 COG0834 # Protein_GI_number: 15832447 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Escherichia coli O157:H7 # 24 260 24 260 260 464 100.0 1e-131 MKKLVLSLSLVLAFSSATAAFAAIPQNIRIGTDPTYAPFESKNSQGELVGFDIDLAKELC KRINTQCTFVENPLDALIPSLKAKKIDAIMSSLSITEKRQQEIAFTDKLYAADSRLVVAK NSDIQPTVESLKGKRVGVLQGTTQETFGNEHWAPKGIEIVSYQGQDNIYSDLTAGRIDAA FQDEVAASEGFLKQPVGKDYKFGGPSVKDEKLFGVGTGMGLRKEDNELREALNKAFAEMR ADGTYEKLAKKYFDFDVYGG >gi|296493378|gb|ADTK01000123.1| GENE 22 21322 - 22104 950 260 aa, chain - ## HITS:1 COG:argT KEGG:ns NR:ns ## COG: argT COG0834 # Protein_GI_number: 16130245 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Escherichia coli K12 # 1 260 1 260 260 464 99.0 1e-131 MKKSILALSLLVGLSAAASSYAALPETVRIGTDTTYAPFSSKDAKGDFVGFDIDLGNEMC KRMQVKCTWVASDFDALIPSLKAKKIDAIISSLSITDKRQQEIAFSDKLYAADSRLIAAK 
GSPIQPTLDSLKGKHVGVLQGSTQEAYANETWRSKGVDVVAYANQDLVYSDLAAGRLDAA LQDEVAASEGFLKQPAGKDFAFAGSSVKDKKYFGDGTGVGLRKDDAELTAAFNKALGELR QDGTYDKMAKKYFDFNVYGD >gi|296493378|gb|ADTK01000123.1| GENE 23 22370 - 22939 560 189 aa, chain - ## HITS:1 COG:ECs3195 KEGG:ns NR:ns ## COG: ECs3195 COG0163 # Protein_GI_number: 15832449 # Func_class: H Coenzyme transport and metabolism # Function: 3-polyprenyl-4-hydroxybenzoate decarboxylase # Organism: Escherichia coli O157:H7 # 1 189 1 189 189 356 100.0 2e-98 MKRLIVGISGASGAIYGVRLLQVLRDVTDIETHLVMSQAARQTLSLETDFSLREVQALAD VTHDARDIAASISSGSFQTLGMVILPCSIKTLSGIVHSYTDGLLTRAADVVLKERRPLVL CVRETPLHLGHLRLMTQAAEIGAVIMPPVPAFYHRPQSLDDVINQTVNRVLDQFAITLPE DLFARWQGA >gi|296493378|gb|ADTK01000123.1| GENE 24 23034 - 24551 1545 505 aa, chain - ## HITS:1 COG:purF KEGG:ns NR:ns ## COG: purF COG0034 # Protein_GI_number: 16130247 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Escherichia coli K12 # 1 505 1 505 505 1027 100.0 0 MCGIVGIAGVMPVNQSIYDALTVLQHRGQDAAGIITIDANNCFRLRKANGLVSDVFEARH MQRLQGNMGIGHVRYPTAGSSSASEAQPFYVNSPYGITLAHNGNLTNAHELRKKLFEEKR RHINTTSDSEILLNIFASELDNFRHYPLEADNIFAAIAATNRLIRGAYACVAMIIGHGMV AFRDPNGIRPLVLGKRDIDENRTEYMVASESVALDTLGFDFLRDVAPGEAIYITEEGQLF TRQCADNPVSNPCLFEYVYFARPDSFIDKISVYSARVNMGTKLGEKIAREWEDLDIDVVI PIPETSCDIALEIARILGKPYRQGFVKNRYVGRTFIMPGQQLRRKSVRRKLNANRAEFRD KNVLLVDDSIVRGTTSEQIIEMAREAGAKKVYLASAAPEIRFPNVYGIDMPSATELIAHG REVDEIRQIIGADGLIFQDLNDLIDAVRAENPDIQQFECSVFNGVYVTKDVDQGYLDFLD TLRNDDAKAVQRQNEVENLEMHNEG >gi|296493378|gb|ADTK01000123.1| GENE 25 24588 - 25076 388 162 aa, chain - ## HITS:1 COG:cvpA KEGG:ns NR:ns ## COG: cvpA COG1286 # Protein_GI_number: 16130248 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, required for colicin V production # Organism: Escherichia coli K12 # 1 162 1 162 162 268 99.0 4e-72 MVWIDYAIIAVIAFSSLVSLIRGFVREALSLVTWGCAFFVASHYYTYLSVWFTGFEDELV 
RNGIAIAVLFIATLIVGAIVNFVIGQLVEKTGLSGTDRVLGVCFGALRGVLIVAAILFFL DSFTGVSKSEDWSKSQLIPQFSFIIRWFFDYLQSSSSFLPRA >gi|296493378|gb|ADTK01000123.1| GENE 26 25336 - 25998 654 220 aa, chain - ## HITS:1 COG:ZdedD KEGG:ns NR:ns ## COG: ZdedD COG3147 # Protein_GI_number: 15802861 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 EDL933 # 1 220 1 220 220 297 99.0 1e-80 MASKFQNRLVGTIVLVALGVIVLPGLLDGQKKHYQDEFAAIPLVPKAGDRDEPDMMPAAT QALPTQPPEGAAEEVRAGDAAAPSLDPATIAANNTEFEPEPAPVAPPKPKPVEPPKPKVE APPAPKPEPKPVVEEKAAPTGKAYVVQLGALKNADKVNEIVGKLRGAGYRVYTSPSTPVQ GKITRILVGPDASKDKLKGSLGELKQLSGLSGVVMGYTPN >gi|296493378|gb|ADTK01000123.1| GENE 27 25988 - 27256 1198 422 aa, chain - ## HITS:1 COG:ECs3199 KEGG:ns NR:ns ## COG: ECs3199 COG0285 # Protein_GI_number: 15832453 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Escherichia coli O157:H7 # 1 422 1 422 422 811 99.0 0 MIIKRTPQAASPLASWLSYLENLHSKTIDLGLERVSQVAARLGVLKPAPFVFTVAGTNGK GTTCRTLESILMAAGYKVGVYSSPHLVRYTERVRVQGQELPESAHTASFAEIESARGDIS LTYFEYGTLSALWLFKQAQLDVVILEVGLGGRLDATNIVDADVAVVTSIALDHTDWLGPD RESIGREKAGIFRSAKPAIVGEPEMPSTIADVAQEKGALLQRRGVEWNYSVTDHDWAFSD AHGTLENLPLPLVPQPNAATALAALRASGLEVSENAIRDGIASAILPGRFQIVSESPRVI FDVAHNPHAAEYLTGRMKALPKNGRVLAVIGMLHDKDIAGTLAWLKSVVDDWYCAPLEGP RGATAEQLLEHLGNGKSFDSVAQAWDAAMADAKAEDTVLVCGSFHTVAHVMEVIDARRSG GK >gi|296493378|gb|ADTK01000123.1| GENE 28 27326 - 28240 1018 304 aa, chain - ## HITS:1 COG:ECs3200 KEGG:ns NR:ns ## COG: ECs3200 COG0777 # Protein_GI_number: 15832454 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase beta subunit # Organism: Escherichia coli O157:H7 # 1 304 1 304 304 584 100.0 1e-167 MSWIERIKSNITPTRKASIPEGVWTKCDSCGQVLYRAELERNLEVCPKCDHHMRMTARNR LHSLLDEGSLVELGSELEPKDVLKFRDSKKYKDRLASAQKETGEKDALVVMKGTLYGMPV VAAAFEFAFMGGSMGSVVGARFVRAVEQALEDNCPLICFSASGGARMQEALMSLMQMAKT SAALAKMQERGLPYISVLTDPTMGGVSASFAMLGDLNIAEPKALIGFAGPRVIEQTVREK 
LPPGFQRSEFLIEKGAIDMIVRRPEMRLKLASILAKLMNLPAPNPEAPREGVVVPPVPDQ EPEA >gi|296493378|gb|ADTK01000123.1| GENE 29 28396 - 29055 602 219 aa, chain - ## HITS:1 COG:STM2367 KEGG:ns NR:ns ## COG: STM2367 COG0586 # Protein_GI_number: 16765694 # Func_class: S Function unknown # Function: Uncharacterized membrane-associated protein # Organism: Salmonella typhimurium LT2 # 1 219 1 219 219 374 92.0 1e-104 MDLIYFLIDFILHIDVHLAELVAEYGVWVYAILFLILFCETGLVVTPFLPGDSLLFVAGA LASLETNDLNVHMMVVLMLIAAIVGDAVNYTIGRLFGDKLFSNPDSKIFRRSYLDKTHQF YERHGGKTIILARFVPIVRTFAPFVAGMGHMSYRHFAAYNVIGALLWVLIFTYAGYFFGT LPFIQSNLKLMIVGIIFVSILPGVFEFIRHKRAAARAAK >gi|296493378|gb|ADTK01000123.1| GENE 30 29138 - 29950 586 270 aa, chain - ## HITS:1 COG:ECs3202 KEGG:ns NR:ns ## COG: ECs3202 COG0101 # Protein_GI_number: 15832456 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Escherichia coli O157:H7 # 1 270 1 270 270 553 99.0 1e-157 MSDQQQPPVYKIALGIEYDGSKYYGWQRQNEVRSVQEKLEKALSQVANEPITVFCAGRTD AGVHGTGQVVHFETTAQRKDAAWTLGVNANLPGDIAVRWVKAVPDDFHARFSATARRYRY IIYNHRLRPAVLSKGVTHFYEPLDAERMHRAAQCLLGENDFTSFRAVQCQSRTPWRNVMH INVTRHGPYVVVDIKANAFVHHMVRNIVGSLMEVGAHNQPESWIAELLAAKDRTRAAATA KAEGLYLVAVDYPDRYDLPKPPMGPLFLAD >gi|296493378|gb|ADTK01000123.1| GENE 31 29950 - 30963 1235 337 aa, chain - ## HITS:1 COG:usg KEGG:ns NR:ns ## COG: usg COG0136 # Protein_GI_number: 16130254 # Func_class: E Amino acid transport and metabolism # Function: Aspartate-semialdehyde dehydrogenase # Organism: Escherichia coli K12 # 1 337 1 337 337 638 99.0 0 MSEGWNIAVLGATGAVGEALLETLAERQFPVGEIYALARNESAGEQLRFGGKTITVQDAA EFDWTQAQLAFFVAGKEATAAWVEEATNSGCLVIDSSGLFALEPDVPLVVPEVNPFVLTD YRNRNVIAVPDSLTSQLLAALKPLIDQGGLSRISVTSLISASAQGKKAVDALAGQSAKLL NGIPIDEEDFFGRQLAFNMLPLLPDSEGSVREERRIVDEVRKILQDEGLMISASVVQAPV FYGHAQMVNFEALRPLAAEEACDAFAQGEDIVLSEENEFPTQVGDASGTPHLSVGCVRND YGMPEQVQFWSVADNVRFGGALMAVKIAEKLVQEYLY >gi|296493378|gb|ADTK01000123.1| GENE 32 31029 - 32165 1098 378 aa, chain - ## HITS:1 COG:ECs3204 KEGG:ns 
NR:ns ## COG: ECs3204 COG0111 # Protein_GI_number: 15832458 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Escherichia coli O157:H7 # 1 378 1 378 378 737 98.0 0 MKILVDENMPYARDLFSRLGEVTAVPGRPIPVAQLADADALMVRSVTKVNESLLAGKPIK FVGTATAGTDHVDEAWLKQAGIGFSAAPGCNAIAVVEYVFSSLLMLAERDGFSLHDRTVG IVGVGNVGRRLQARLEALGIKTLLCDPPRADRGDEGDFRSLDELVQHADILTFHTPLFKD GPYKTLHLADEKLIRSLKPGAILINACRGAVVDNTALLTCLNEGQKLSVVLDVWEGEPEL NVELLTKVDIGTPHIAGYTLEGKARGTTQVFEAYSKFIGHEQHVALDTLLPAPEFGRITL HGPLDQPTLKRLVHLVYDVRRDDAPLRKVAGIPGEFDKLRKNYLERREWSSLYVICDDAS AASLLCKLGFNAVHHPAR >gi|296493378|gb|ADTK01000123.1| GENE 33 32264 - 33259 864 331 aa, chain + ## HITS:1 COG:no KEGG:ECO103_2785 NR:ns ## KEGG: ECO103_2785 # Name: flk # Def: flagella biosynthesis regulator # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 331 1 331 331 519 100.0 1e-146 MIQPISGPPPGQPPGQGDNLPSGAGNQPLSSQQRTSLESLMTKVTSLTQQQRAELWAGIR HDIGLSGDSPLLSRHFPAAEHNLAQRLLAAQKSHSARQLLAQLGEYLRLGNNRQAVTDYI RHNFGQTPLNQLSPEQLKTILTLLQEGKMVIPQPQQREATDRPLLPAEHNALKQLVTKLA AATGEPSKQIWQSMLELSGVKDGELIPAKLFNHLVTWLQARQTLSQQNTPTLESLQMALK QPLDASELAALSAYIQQKYGLSAQSSLSSAQAEDILNQLYQRRVKGIDPRDMQPLLNPFP PMMDTLQNMATRPALWILLVAIILMLVWLVR >gi|296493378|gb|ADTK01000123.1| GENE 34 33256 - 34434 1076 392 aa, chain - ## HITS:1 COG:yfcJ KEGG:ns NR:ns ## COG: yfcJ COG0477 # Protein_GI_number: 16130257 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 392 1 392 392 588 99.0 1e-168 MTAVSQTETRSSANFSLFRIAFAVFLTYMTVGLPLPVIPLFVHHDLGYGNTMVGIAVGIQ FLATVLTRGYAGRLADQYGAKRSALQGMLACGLAGGALLLAAILPVSAPFKFALLVVGRL ILGFGESQLLTGALTWGLGIVGPKHSGKVMSWNGMAIYGALAVGAPLGLLIHSHYGFAAL AITTMVLPLLAWACNGTVRKVPALAGERPSLWSVVGLIWKPGLGLALQGVGFAVIGTFVS 
LYFASKGWAMAGFTLTAFGGAFVVMRVMFGWMPDRFGGVKVAIVSLLVETVGLLLLWQAP GAWVALAGAALTGAGCSLIFPALGVEVVKRVPSQVRGTALGGYAAFQDIALGVSGPLAGM LATTFGYSSVFLAGAISAVLGIIVTILSFRRG >gi|296493378|gb|ADTK01000123.1| GENE 35 34718 - 35938 1293 406 aa, chain - ## HITS:1 COG:ECs3207 KEGG:ns NR:ns ## COG: ECs3207 COG0304 # Protein_GI_number: 15832461 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Escherichia coli O157:H7 # 1 406 1 406 406 770 100.0 0 MKRAVITGLGIVSSIGNNQQEVLASLREGRSGITFSQELKDSGMRSHVWGNVKLDTTGLI DRKVVRFMSDASIYAFLSMEQAIADAGLSPEAYQNNPRVGLIAGSGGGSPRFQVFGADAM RGPRGLKAVGPYVVTKAMASGVSACLATPFKIHGVNYSISSACATSAHCIGNAVEQIQLG KQDIVFAGGGEELCWEMACEFDAMGALSTKYNDTPEKASRTYDAHRDGFVIAGGGGMVVV EELEHALARGAHIYAEIVGYGATSDGADMVAPSGEGAVRCMQMAMHGVDTPIDYLNSHGT STPVGDVKELAAIREVFGDKSPAISATKAMTGHSLGAAGVQEAIYSLLMLEHGFIAPSIN IEELDEQAAGLNIVTETTDRELTTVMSNSFGFGGTNATLVMRKLKD >gi|296493378|gb|ADTK01000123.1| GENE 36 36097 - 38103 1534 668 aa, chain + ## HITS:1 COG:yfcK_2 KEGG:ns NR:ns ## COG: yfcK_2 COG0665 # Protein_GI_number: 16130259 # Func_class: E Amino acid transport and metabolism # Function: Glycine/D-amino acid oxidases (deaminating) # Organism: Escherichia coli K12 # 256 668 1 413 413 832 99.0 0 MKHYSIQPANLEFNAEGTPVSRDFDDVYFSNDNGLEETRYVFLGGNQLEVRFPEHPHPLF VVAESGFGTGLNFLTLWQAFDQFREAHPQAQLQRLHFISFEKFPLTRADLALAHQHWPEL APWAEQLQAQWPMPLPGCHRLLLDEGRVTLDLWFGDINELTSQLDDSLNQKVDAWFLDGF APAKNPDMWTQNLFNAMARLARPGGTLATFTSAGFVRRGLQDAGFTMQKRKGFGRKREML CGVMEQTLPLPCSAPWFNRTGSSKREAAIIGGGIASALLSLALLRRGWQVTLYCADEAPA LGASGNRQGALYPLLSKHDEALNRFFSNAFTFARRFYDQLPVKFDHDWCGVTQLGWDEKS QHKIAQMLSMDLPAELAVAVEANAVEQITGVATNCSGITYPQGGWLCPAELTRNVLELAQ QQGLQIYYQYQLQNLSRKDDCWLLNFAGDQQATHSVVVLANGHQISRFSQTSTIPVYSVA GQVSHIPTTPELAELKQVLCYDGYLTPQNPANQHHCIGASYHRGSEDTAYSEDDQQQNRQ RLIDCFPQAQWAKEVDVSDKEARCGVRCATRDHLPMVGNVPDYEATLVEYASLAEQKDEA VSAPVFDDLFMFAALGSRGLCSAPLCAEILAAQMSDEPIPMDASTLAALNPNRLWVRKLL KGKAVKAG 
>gi|296493378|gb|ADTK01000123.1| GENE 37 38224 - 38502 366 92 aa, chain - ## HITS:1 COG:no KEGG:ECB_02250 NR:ns ## KEGG: ECB_02250 # Name: yfcL # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 92 1 92 92 154 100.0 1e-36 MIAEFESRILALIDGMVDHASDDELFASGYLRGHLTLAIAELESGDDHSAQAVHTTVSQS LEKAIGAGELSPRDQALVTDMWENLFQQASQQ >gi|296493378|gb|ADTK01000123.1| GENE 38 38536 - 39084 490 182 aa, chain - ## HITS:1 COG:yfcM KEGG:ns NR:ns ## COG: yfcM COG3101 # Protein_GI_number: 16130261 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 182 1 182 182 375 98.0 1e-104 MNSTHHYEQLIEIFNSCFADEFNTRLIKGDDEPIYLPADAEVPYNRIVFAHGFYASAIHE ISHWCIAGKARRELVDFGYWYCPDGRDAQTQSQFEDVEVKPQALDWLFCVAAGYPFNVSC DNLEGDFEPDRVVFQRRVHAQVMDYLANGIPERPARFIKALQNYYHTPELTAEQFPWPEA LN >gi|296493378|gb|ADTK01000123.1| GENE 39 39084 - 39893 859 269 aa, chain - ## HITS:1 COG:ECs3211 KEGG:ns NR:ns ## COG: ECs3211 COG0730 # Protein_GI_number: 15832465 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Escherichia coli O157:H7 # 1 269 1 269 269 450 100.0 1e-126 METFNSLFMVSPLLLGVLFFVAMLAGFIDSIAGGGGLLTIPALMAAGMSPANALATNKLQ ACGGSISATIYFIRRKVVSLSDQKLNIAMTFVGSMSGALLVQYVQADVLRQILPILVICI GLYFLLMPKLGEEDRQRRMYGLPFALIAGGCVGFYDGFFGPAAGSFYALAFVTLCGFNLA KATAHAKLLNATSNIGGLLLFILGGKVIWATGFVMLVGQFLGARMGSRLVLSKGQKLIRP MIVIVSAVMSAKLLYDSHGQEILHWLGMN >gi|296493378|gb|ADTK01000123.1| GENE 40 39893 - 40717 526 274 aa, chain - ## HITS:1 COG:mepA KEGG:ns NR:ns ## COG: mepA COG3770 # Protein_GI_number: 16130263 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Murein endopeptidase # Organism: Escherichia coli K12 # 1 274 1 274 274 515 99.0 1e-146 MNKTAIALLALLASSASLAATPWQKITQPVPGSAQSIGSFSNGCIVGADTLPIQSEHYQV MRTDQRRYFGHPDLVMFIQRLSSQVSNLGMGTVLIGDMGMPAGGRFNGGHASHQTGLDVD IFLQLPKTRWTSAQLLRPQALDLVSRDGKHVVSTLWKPEIFSLIKLAAQDKDVTRIFVNP 
AIKQQLCLDAGTDRDWLRKVRPWFQHRAHMHVRLRCPADSLECEDQPLPPPGDGCGAELQ SWFAPPKPGTTKPEKKTPPPLPPSCQALLDEHVI >gi|296493378|gb|ADTK01000123.1| GENE 41 40721 - 41806 1150 361 aa, chain - ## HITS:1 COG:ECs3213 KEGG:ns NR:ns ## COG: ECs3213 COG0082 # Protein_GI_number: 15832467 # Func_class: E Amino acid transport and metabolism # Function: Chorismate synthase # Organism: Escherichia coli O157:H7 # 1 361 1 361 361 696 99.0 0 MAGNTIGQLFRVTTFGESHGLALGCIVDGVPPGIPLTEADLQHDLDRRRPGTSRYTTQRR EPDQVKILSGVFEGVTTGTSIGLLIENTDQRSQDYSAIKDVFRPGHADYTYEQKYGLRDY RGGGRSSARETAMRVAAGAIAKKYLAEKFGIEIRGCLTQMGDIPLEIKDWSLVEQNPFFC PDPDKIDALDELMRALKKEGDSIGAKVTVVASGVPAGLGEPVFDRLDADIAHALMSINAV KGVEIGDGFDVVALRGSQNRDEITKDGFQSNHAGGILGGISSGQQIIAHMALKPTSSITV PGRTINRFGEEVEMITKGRHDPCVGIRAVPIAEAMLAIVLMDHLLRQRAQNADVKTDIPR W >gi|296493378|gb|ADTK01000123.1| GENE 42 41841 - 42773 1632 310 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|191165479|ref|ZP_03027320.1| ribosomal large subunit L3 protein glutamine methyltransferase [Escherichia coli B7A] # 1 310 1 310 310 633 99 0.0 MDKIFVDEAVNELQTIQDMLRWSVSRFSAANIWYGHGTDNPWDEAVQLVLPSLYLPLDIP EDMRTARLTSSEKHRIVERVIRRVNERIPVAYLTNKAWFCGHEFYVDERVLVPRSPIGEL INNKFAGLISKQPQHILDMCTGSGCIAIACAYAFPEAEVDAVDISPDALAVAEQNIEEHG LIHNVIPIRSDLFRDLPKVQYDLIVTNPPYVDAEDMSDLPNEYRHEPELGLASGTDGLKL TRRILGNAADYLADDGVLICEVGNSMVHLMEQYPDVPFTWLEFDNGGDGVFMLTKEQLLA AREYFAIYKD >gi|296493378|gb|ADTK01000123.1| GENE 43 42939 - 43490 553 183 aa, chain + ## HITS:1 COG:yfcN KEGG:ns NR:ns ## COG: yfcN COG2840 # Protein_GI_number: 16130266 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 183 1 183 183 349 100.0 2e-96 MKKKTTLSEEDQALFRQLMAGTRKIKQDTIVHRPQRKKISEVPVKRLIQEQADASHYFSD EFQPLLNTEGPVKYVRPDVSHFEAKKLRRGDYSPELFLDLHGLTQLQAKQELGALIAACR REHVFCACVMHGHGKHILKQQTPLWLAQHPHVMAFHQAPKEYGGDAALLVLIEVEEWLPP ELP >gi|296493378|gb|ADTK01000123.1| GENE 44 43612 - 44109 191 165 aa, chain - ## HITS:1 COG:no KEGG:SF2408 NR:ns ## KEGG: SF2408 # Name: not_defined # 
Def: hypothetical protein # Organism: S.flexneri # Pathway: not_defined # 1 165 126 290 296 318 100.0 5e-86 MTFKITVASDDYGCPWIASFYSYTDLPGFGSYTAPTVHNTICPTIPVASYDISWSENYVS HNKALRIQSTGSTVTTTLSTYLMEGGRLCDGSNFSDNDGRGAYCRAVSELLTFTSYGCDK STVTVTPTRHPVTDKVLHDIVVNVNTSSGQPIDSTCRFQYVLNEL >gi|296493378|gb|ADTK01000123.1| GENE 45 44471 - 44995 175 174 aa, chain - ## HITS:1 COG:ECs3217 KEGG:ns NR:ns ## COG: ECs3217 COG3539 # Protein_GI_number: 15832471 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli O157:H7 # 1 174 10 182 182 229 72.0 2e-60 MKKKRTLFFISSLMLLGSGTTIAGDNLHFTGNLISKSCTPVINGSQLAEVHFPAIAASDL MNLGQSERVPLVFQLKDCHSSTLFNVKVTLTGTEDSALPGFLAFDSSSSASGAGIGIETA AGTSVPINNTTGVTLPLNQGNNSLNFNTWLQAKSGRDVTSGDFSATVTATFEYF >gi|296493378|gb|ADTK01000123.1| GENE 46 44992 - 45462 248 156 aa, chain - ## HITS:1 COG:ECs3218 KEGG:ns NR:ns ## COG: ECs3218 COG3539 # Protein_GI_number: 15832472 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli O157:H7 # 1 156 1 156 156 253 79.0 1e-67 MKRISLILLWGFCSMALSNVSFHGYLVQPPNCTISNAQTIEITFQDVLIDDINGSNYEQT VPYSITCDTAVRDPLMEMTLSWSGTPSDFDNAAVSSNITGLGIQLKQAGQSFTINTPLVV NETDLPVLTAVPVKKSGVVLPEADFEAWATLQVDYQ >gi|296493378|gb|ADTK01000123.1| GENE 47 45459 - 45872 228 137 aa, chain - ## HITS:1 COG:Z3598 KEGG:ns NR:ns ## COG: Z3598 COG3539 # Protein_GI_number: 15802882 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli O157:H7 EDL933 # 1 137 61 197 197 228 83.0 3e-60 MNLRVLVDAPPPCTVNGAAVEFGNVFINKINGVDYKRPIDYSLVCNNLAMDDLRLQMQAT TVVINGETVISTGIPGFGIRVQKSSDHTILDLTSGSWLPFNFSSGVPVLEAVPVKQSGTT LAAAEFNASATIVVDYQ >gi|296493378|gb|ADTK01000123.1| GENE 48 45982 - 46734 442 250 aa, chain - ## HITS:1 COG:ECs3220 KEGG:ns 
NR:ns ## COG: ECs3220 COG3121 # Protein_GI_number: 15832474 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli O157:H7 # 1 250 1 250 252 417 86.0 1e-116 MSDLLCSAKLGATTLALLLSAASLSAQASVTPDRTRLIFNESDKSISVTLRNNDPKLPYL AQSWIEDEKGNKISSPLTVLPPVQRIDSMMNGQVKVQGMPDINKLPADRESLFYFNVREI PPKSNKANTLQIALQTRIKLFWRPKALENVSMKNPWQYKVTLTRNGQEFTVNNPTPYYVI FSNASTQKNGNPAAGFSPIVMSPKTSAPLNVKMGSVPVLTYVNDYGARMPLFFTCNGNSC QVDEEQSRKG >gi|296493378|gb|ADTK01000123.1| GENE 49 46754 - 49399 2419 881 aa, chain - ## HITS:1 COG:yfcUm KEGG:ns NR:ns ## COG: yfcUm COG3188 # Protein_GI_number: 16132272 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli K12 # 1 881 1 881 881 1711 98.0 0 MPDHSLFRLRILPWCIALAMSGSYSSVWAEDDIQFDSRFLELKGDTKIDLKRFSSQGYVE PGKYNLQVQLNKQPLAEEYDIYWYAGEDDASKSYACLTPELVAQFGLKEDVAKNLQWSHD AKCLKSGQLEGMEIKADLSQSALVISLPQAYLEYTYPDWDPPSRWDDGISGIVADYSINA QTRHEENGGDDSNEISGNGTVGVNLGPWRMRADWQTNYQHTRSNDDDEEFSGDDTQKKWE WSRYYAWRALPSLKAKLALGEDYLNSDIFDGFNYVGGSVSTDDQMLPPNLRGYAPDISGV AHTTAKVTVSQMGRVIYETQVPAGPFRLQDLGDSVSGTLHIRIEEQNGQVQEYDISTASM PYLTRPGQVRYKIMMGRPQEWGHHVEGEFFSGAEASWGIANGWSLYGGALGDENYQSAAL GVGRDLSTFGAVAFDVTHSHTKLDKDTAYGKGSLDGNSFRVSYSKDFDQLNSRVTFAGYR FSEENFMTMSEYLDASDSGMVRTGNDKEMYTATYNQNFRDAGVSVYLNYTRHTYWDREEQ TNYNIMLSHYFNMGSIRNMSVSLTGYRYEYDNRADKGMYISLSMPWGDNSTVSYNGNYGS GTDSSQVGYFSRVDDATHYQLNVGTSDKHTSVDGYYSHDGSLAQVDLSANYHEGQYTSAG LSLQGGATLTAHGGALHRTQNMGGTRLLIDADGVADVPVEGNGAAVYTNMFGKAVVSDVN NYYRNQAYIDLNKLPENAEATQSVVQATLTEGAIGYRKFAVISGQKAMAVLRLQDGSHPP FGAEVKNDNEQTVGLVDDDGNVYLAGVKPGEHMSVFWSGVAHCDINLPDPLPADLFNGLL LPCQHKGNVAPVVPDDIKPVIQEQTQQVTPTNPPVSVSANQ >gi|296493378|gb|ADTK01000123.1| GENE 50 49481 - 50044 628 187 aa, chain - ## HITS:1 COG:yfcV KEGG:ns NR:ns ## COG: yfcV COG3539 # Protein_GI_number: 16130272 # Func_class: N Cell motility; U 
Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 187 1 187 187 296 98.0 2e-80 MSKFVKTAIAAAMVMGVFTSTATIAAGNNGTARFYGTIEDSVCSIVPDDHKLEVDMGDIG AEKLKNNGTTTPKSFQIRLQDCVFDTQETMTTTFTGTVSSANSGNYYTIFNTDTGAAFNN VSLAIGDSLGTSYKSGMGIDQKIVKDTSTNKGKAKQTLNFKAWLVGAADAPDLGNFEANT TFQITYL >gi|296493378|gb|ADTK01000123.1| GENE 51 50728 - 51198 467 156 aa, chain - ## HITS:1 COG:sixA KEGG:ns NR:ns ## COG: sixA COG2062 # Protein_GI_number: 16130273 # Func_class: T Signal transduction mechanisms # Function: Phosphohistidine phosphatase SixA # Organism: Escherichia coli K12 # 1 156 6 161 161 303 100.0 9e-83 MRHGDAALDAASDSVRPLTTNGCDESRLMANWLKGQKVEIERVLVSPFLRAEQTLEEVGD CLNLPSSAEVLPELTPCGDVGLVSAYLQALTNEGVASVLVISHLPLVGYLVAELCPGETP PMFTTSAIASVTLDESGNGTFNWQMSPCNLKMAKAI >gi|296493378|gb|ADTK01000123.1| GENE 52 51416 - 53560 1654 714 aa, chain - ## HITS:1 COG:yfcX_2 KEGG:ns NR:ns ## COG: yfcX_2 COG1250 # Protein_GI_number: 16130274 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyacyl-CoA dehydrogenase # Organism: Escherichia coli K12 # 308 714 1 407 407 787 98.0 0 MEMTSAFTLNVRLDNIAVITIDVPGEKMNTLKAEFASQVRAIIKQLRENKELRGVVFVSA KPDNFIAGADINMIGNCKTAQEAEALARQGQQLMAEIHALPIPVIAAIHGACLGGGLELA LACHGRVCTDDPKTVLGLPEVQLGLLPGSGGTQRLPRLIGVSTALEMILTGKQLRAKQAL KLGLVDDVVPHSILLEAAVELAKKDRPSSRPLPVRERILAGPLGRALLFKMVGKKTEHKT QGNYPAIERILEVVETGLAQGTSSGYDAEARAFGELAMTPQSQALRSIFFASTDVKKDPG SDAPPAPLNSVGILGGGLMGGGIAYVTACKAGLPVRIKDINPQGINHALKYSWDQLEGKV RRRHLKASERDKQLALISGTTDYRGFAHRDLIIEAVFENLELKQQMVAEVEQNCAAHTIF ASNTSSLPIGDIAAHATRPEQVIGLHFFSPVEKMPLVEIIPHAGTSVQTIATTVKLAKKQ GKTPIVVRDKAGFYVNRILAPYINEAIRMLTEGERVEHIDAALVKFGFPVGPIQLLDEVG IDTGTKIIPVLEAAYGERFSAPANVVSSILNDDRKGRKNGRGFYLYGQKGRKSKKQVDPA IYPLIGAQGQGRLSAPQVAERCVMLMLNEAVRCVDEQVIRSVRDGDIGAVFGIGFPPFLG GPFRYIDSLGAGEVVAIMQRLATQYGSRFTPCERLVEMGARGESFWKTTATDLQ >gi|296493378|gb|ADTK01000123.1| GENE 53 53560 - 54870 1311 436 aa, chain - ## HITS:1 COG:yfcY KEGG:ns NR:ns 
## COG: yfcY COG0183 # Protein_GI_number: 16130275 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA acetyltransferase # Organism: Escherichia coli K12 # 1 436 1 436 436 790 99.0 0 MGQVLPLVTRQGDRIAIVSGLRTPFARQATAFHGIPAVDLGKIVVGELLARSEIPAEVIE QLVFGQVVQMPEAPNIAREIVLGTGMNVHTDAYSVSRACATSFQAVANVAESLMAGTIRA GIAGGADSSSVLPIGVSKKLARVLVDVNKARTMSQRLKLFSRLRLRDLMPVPPAVAEYST GLRMGDTAEQMAKTYGITREQQDALAHRSHQRAAQAWSDGKLKEEVMTAFIPPYKQPLVE DNNIRGNSSLADYAKLRPAFDRKHGTVTAANSTPLTDGAAAVILMTESRVKELGLVPLGY LRSYAFTAIDVWQDMLLGPAWSTPLALERAGLTMSELTLIDMHEAFAAQTLANIQLLGSE RFAREVLGRAHATGEVDDSKFNVLGGSIAYGHPFAATGARMITQTLHELRRRGGGFGLVT ACAAGGLGAAMVLEAE Prediction of potential genes in microbial genomes Time: Mon May 16 15:27:36 2011 Seq name: gi|296493377|gb|ADTK01000124.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont305.7, whole genome shotgun sequence Length of sequence - 12226 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 11, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 43 - 327 397 ## COG3691 Uncharacterized protein conserved in bacteria + Prom 539 - 598 6.2 2 2 Tu 1 . + CDS 699 - 2039 1328 ## COG2067 Long-chain fatty acid transport protein + Term 2061 - 2093 5.6 + Prom 2318 - 2377 9.9 3 3 Tu 1 . + CDS 2407 - 3465 507 ## ECO111_3092 hypothetical protein + Term 3612 - 3644 2.0 - Term 3590 - 3642 11.0 4 4 Tu 1 . - CDS 3647 - 4402 886 ## COG2853 Surface lipoprotein - Prom 4481 - 4540 4.3 5 5 Tu 1 . + CDS 4697 - 5629 1009 ## COG2116 Formate/nitrite family of transporters + TRNA 5705 - 5779 66.4 # Arg CCT 0 0 + Prom 5706 - 5765 79.3 6 6 Tu 1 . + CDS 5955 - 7145 423 ## PROTEIN SUPPORTED gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 7 7 Tu 1 . - CDS 7496 - 7666 60 ## COG2801 Transposase and inactivated derivatives - Prom 7835 - 7894 7.6 + Prom 7892 - 7951 3.9 8 8 Op 1 . + CDS 8177 - 8902 80 ## Ping_2978 hypothetical protein 9 8 Op 2 . 
+ CDS 8902 - 9930 245 ## COG1479 Uncharacterized conserved protein + Term 9942 - 9976 4.4 + Prom 10467 - 10526 4.3 10 9 Tu 1 . + CDS 10583 - 10711 67 ## ECS88_3283 hypothetical protein + Term 10765 - 10806 -0.4 - Term 10703 - 10747 11.1 11 10 Tu 1 . - CDS 10760 - 11266 178 ## KPN_pKPN3p05866 ATP-dependent helicase - Prom 11405 - 11464 3.1 12 11 Tu 1 . - CDS 11699 - 12217 376 ## COG2801 Transposase and inactivated derivatives Predicted protein(s) >gi|296493377|gb|ADTK01000124.1| GENE 1 43 - 327 397 94 aa, chain - ## HITS:1 COG:yfcZ KEGG:ns NR:ns ## COG: yfcZ COG3691 # Protein_GI_number: 16130276 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 94 9 102 102 159 98.0 2e-39 MSKCSADETPVCCCMDVGTIMDNSDCTASYSRVFASRAEAEQTLAALTEKARSVESEPCK ITPTFTEESDGVRLDIDFTFACEAEMLIFQLGLR >gi|296493377|gb|ADTK01000124.1| GENE 2 699 - 2039 1328 446 aa, chain + ## HITS:1 COG:ECs3227 KEGG:ns NR:ns ## COG: ECs3227 COG2067 # Protein_GI_number: 15832481 # Func_class: I Lipid transport and metabolism # Function: Long-chain fatty acid transport protein # Organism: Escherichia coli O157:H7 # 1 446 3 448 448 836 99.0 0 MSQKTLFTKSALAVAVALISTQAWSAGFQLNEFSSSGLGRAYSGEGAIADDAGNVSRNPA LITMFDRPTFSAGAVYIDPDVNISGTSPSGRSLKADNIAPTAWVPNMHFVAPINDQFGWG ASITSNYGLATEFNDTYAGGSVGGTTDLETMNLNLSGAYRLNNAWSFGLGFNAVYARAKI ERFAGDLGQLVAGQIMQSPAGQTPQGQALAATANGIDSNTKIAHLNGNQWGFGWNAGILY ELDKNNRYALTYRSEVKIDFKGNYSSDLNRAFNNYGLPIPTATGGATQSGYLTLNLPEMW EVSGYNRVDPQWAIHYSLAYTSWSQFQQLKATSTSGDTLFQKHEGFKDAYRIALGTTYYY DDNWTFRTGIAFDDSPVPAQNRSISIPDQDRFWLSAGTTYAFNKDASVDVGVSYMHGQSV KINEGPYQFESEGKAWLFGTNFNYAF >gi|296493377|gb|ADTK01000124.1| GENE 3 2407 - 3465 507 352 aa, chain + ## HITS:1 COG:no KEGG:ECO111_3092 NR:ns ## KEGG: ECO111_3092 # Name: yfdF # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 352 1 352 352 627 99.0 1e-178 MLPSISINNTSATYPESINENNNDEVNGLVQELKNLFNGKEGISTCIKHLLELIKNAIRV 
NDDPYRFNINNSSVTYIDIGSNDTDHITIGIDNQEPIELPANYKDKELVRTIINDNIVEK THDINNKEMIFSALKEIYDGDPGFIFDKISHKLRHTVTEFDESGNSEPTDLFTWYGKDKK GDSLAIVIKNKNGNDYLSLGYYDQDDYHIQRGIRINGDSLTQYCSEKAMNASAWFESSKA IMAESFATGSDHQVINELNGERLREPNEVFKRLGRAIRYNFQVDDAKFRRDNVKEIISNL FANKVDVDHPENKHKDFKDLEDNVEKRLQNRQAKYQNEINQLSAPGVNFDDI >gi|296493377|gb|ADTK01000124.1| GENE 4 3647 - 4402 886 251 aa, chain - ## HITS:1 COG:ECs3229 KEGG:ns NR:ns ## COG: ECs3229 COG2853 # Protein_GI_number: 15832483 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Surface lipoprotein # Organism: Escherichia coli O157:H7 # 1 251 1 251 251 514 100.0 1e-146 MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADGLYPVLSWLTWPM SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA IQDDLKDIDSE >gi|296493377|gb|ADTK01000124.1| GENE 5 4697 - 5629 1009 310 aa, chain + ## HITS:1 COG:ECs3230 KEGG:ns NR:ns ## COG: ECs3230 COG2116 # Protein_GI_number: 15832484 # Func_class: P Inorganic ion transport and metabolism # Function: Formate/nitrite family of transporters # Organism: Escherichia coli O157:H7 # 1 310 1 310 310 594 100.0 1e-170 MDNDKIDQHSDEIEVESEEKERGKKIEIDEDRLPSRAMAIHEHIRQDGEKELERDAMALL WSAIAAGLSMGASLLAKGIFHVELEGVPGSFLLENLGYTFGFIIVIMARQQLFTENTVTA VLPVMQKPTMSNVGLLMRLWGVVLLGNILGTGIAAWAFEYMPIFNEETRDAFVKIGMDVM KNTPSEMFANAIISGWLIATMVWMFPAAGAAKIVVIILMTWLIALGDTTHIVVGSVEILY LVFNGTLHWSDFIWPFALPTLAGNICGGTFIFALMSHAQIRNDMSNKRKAEARQKAERAE NIKKNDKNPA >gi|296493377|gb|ADTK01000124.1| GENE 6 5955 - 7145 423 396 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 [Campylobacter concisus 13826] # 10 392 14 406 406 167 27 3e-41 MPLTARQVETAKPKDKIYKIADGGGLYLQVNPNGSKYWRMKYHYAGKEKKLSFGTYPTIT LAEARKRRDEAKKIHADGKDPGEVKKAENLEKNLSEGRTFATIATEWYNAKVSGWSASYA DYVDRAFKNNVFPYIGNRPISDIEPLELLAVLQRIESRGACELANKVRQRCGEVFRYAIV TGRARYNPASDLVIAMKSHKRTHYPFLLPPELPEFLYKLENYTGSIVTREATKLLMLTGL 
RTVELRMGEWCEIDFSQKIWEVPPTRMKMRKEHIVPLSNQAIRSLQILKELTGRYRYIFA GRNDVNRPISDASVNMVLKKIGYDKKATGHGFRHTMSTILHEKGFNSAWIEMQLAHTDKN SIRGTYNHARYLEGRREMMQWYADYLDAIRKDSPAI >gi|296493377|gb|ADTK01000124.1| GENE 7 7496 - 7666 60 56 aa, chain - ## HITS:1 COG:YPO2640 KEGG:ns NR:ns ## COG: YPO2640 COG2801 # Protein_GI_number: 16122850 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Yersinia pestis # 15 48 5 38 87 57 82.0 7e-09 MSSSATLGYPHEEVFSDEQIISILREAVAGVSARELCRKNAISDATFYTGRRSLVE >gi|296493377|gb|ADTK01000124.1| GENE 8 8177 - 8902 80 241 aa, chain + ## HITS:1 COG:no KEGG:Ping_2978 NR:ns ## KEGG: Ping_2978 # Name: not_defined # Def: hypothetical protein # Organism: P.ingrahamii # Pathway: not_defined # 5 240 107 344 346 166 39.0 6e-40 MSNIYIYAYQELVSYLRELFIDYNEKVSSEDLRKLESGWRDYLMKLDKNENIERVDIITY NYDIYLERLLDLLNIRFDIPNISDGNAKFKIYKPHGSISFLYEGYMSKDNYNIQKELSLN NGDISKFECRHENLSLYMPLIPIIPPAGEAHRYAQNWANSIKESIEATMESYNPNDDFFI CGISYWHVDRAEIDAIIRKINSDVNVRMINPGNVDGLSAILGMVFNSYVHHSSSEVLKEL A >gi|296493377|gb|ADTK01000124.1| GENE 9 8902 - 9930 245 342 aa, chain + ## HITS:1 COG:VCA0199 KEGG:ns NR:ns ## COG: VCA0199 COG1479 # Protein_GI_number: 15600969 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Vibrio cholerae # 24 202 267 447 638 62 24.0 1e-09 MEILKRDSNSINIASFWEGFSLGKFNFDPPYQRDSVWDEEKQSFFIDSILRNYPIPPIFL HQKIDDDTGRITFEVVDGKQRLTAIVNFINGQITSASEEEDDELTGVFFSDLSTSKYAEV KKLFWRYQMPIEYIDTEDERIIDSIFDRLNRNGERLNGQELRNAKYHDTDFYKKIVEYSQ REYWQKLLEHVDKKRMEDKEFVSELIFTILEDEVFGATQDIIDSLYEKYCTKNGDETNEA FSIFEKTTDYLVDMNIDFKSHKASGVSHLYGIFSYALYCSNNNVPVEEASNRLSTFLDSL WEKDTQHNAELRREYRKTMSSSTKSKTQRSRRIEVLLNLLHD >gi|296493377|gb|ADTK01000124.1| GENE 10 10583 - 10711 67 42 aa, chain + ## HITS:1 COG:no KEGG:ECS88_3283 NR:ns ## KEGG: ECS88_3283 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 42 66 107 107 67 97.0 2e-10 MGLKPGPKPIAKSTGKPDQRRRDNKDTPGNTPGLKPSKSTGK 
>gi|296493377|gb|ADTK01000124.1| GENE 11 10760 - 11266 178 168 aa, chain - ## HITS:1 COG:no KEGG:KPN_pKPN3p05866 NR:ns ## KEGG: KPN_pKPN3p05866 # Name: not_defined # Def: ATP-dependent helicase # Organism: K.pneumoniae # Pathway: not_defined # 1 167 565 731 731 296 80.0 2e-79 MSVHDVVRQEMLAIYRENDYRIAVGNKRVDYADAAARSLFAEGSDNFQRYNLQNHCFIAS GQNCYVIPWMGDKIVNTITSLLIRCGFKASSFAGVIEVEGSGVASVQRALKKMLLSGLPS EFELAAVVPDKYLDKYDEYLPETLLATGYGAQAYDTEGTCIWLQKHLQ >gi|296493377|gb|ADTK01000124.1| GENE 12 11699 - 12217 376 172 aa, chain - ## HITS:1 COG:ECs1208 KEGG:ns NR:ns ## COG: ECs1208 COG2801 # Protein_GI_number: 15830462 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 172 125 296 296 338 98.0 3e-93 MAERPDQLWVADFTYVSTWRGFVYVAFIIDVFAGYIVGWRVSSSMETTFVLDALEQALWA RRPSGTIHHSDKGSQYVSLAYTERLKEAGLLASTGSTGDSYDNAMAESINGLYKAEVIHR KSWKNRAEVELATLTWVDWYNNRRLLGRLGHTPPAEAEKAYYASIGNDDLAA Prediction of potential genes in microbial genomes Time: Mon May 16 15:27:50 2011 Seq name: gi|296493376|gb|ADTK01000125.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont309.1, whole genome shotgun sequence Length of sequence - 2734 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 461 - 520 4.7 1 1 Op 1 . + CDS 682 - 1080 77 ## ECSE_P1-0108 prepilin peptidase PilU 2 1 Op 2 . + CDS 1080 - 2504 400 ## ECSE_P1-0107 shufflon protein + Term 2663 - 2707 5.7 3 2 Tu 1 . 
- CDS 2501 - 2734 69 ## ECSE_P1-0107 shufflon protein Predicted protein(s) >gi|296493376|gb|ADTK01000125.1| GENE 1 682 - 1080 77 132 aa, chain + ## HITS:1 COG:no KEGG:ECSE_P1-0108 NR:ns ## KEGG: ECSE_P1-0108 # Name: not_defined # Def: prepilin peptidase PilU # Organism: E.coli_SE11 # Pathway: not_defined # 1 132 87 218 218 240 100.0 1e-62 MAVTDALTGLLPGTFTRRFLIAGMLSQITTDIWWFRTTEFATAAIVLFCLHKLVNRHRLN IGTGDLWLIAGITAWSGLYNAIWCVLLGTGGFVLWHSTWCIKGHKEGPLGPWLCFGHVLL LLDNLYQPLWVI >gi|296493376|gb|ADTK01000125.1| GENE 2 1080 - 2504 400 474 aa, chain + ## HITS:1 COG:no KEGG:ECSE_P1-0107 NR:ns ## KEGG: ECSE_P1-0107 # Name: not_defined # Def: shufflon protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 367 1 367 430 613 98.0 1e-174 MKKYDRGWASLETGAALLIVMLLIAWGAGIWQDYIQTKGWQTEARLVSNWTSAARSYIGK NYTTLQGSSTTTTPAVITTTMLKNTGFLSSGFTETNSEGQRLQAYVVRNAQNPELLQAMV VSSGGTPYPVKALIQMAKDITTGLGGYIQDGKTATGALRSWSVALSNYGAKSGNGHIAVL LSTDELSGAAEDTDRLYRFQVNGRPDLNKMHTAIDMGSNNLNNVGAVNAQTGNFSGNVNG VNGTFSGQVKGNSGNFDVNVTAGGDIRSNNGWLITRNSKGWLNETHGGGFYMSDGSWVRS VNSKGIYTGGQVKGGTVRADGRLYTGEYLQLERTAVAGASCSPNGLVGRDNTGAILSCQS GTWKTSGSLNGSYTNLGSHRGSFSGRNSGGSTLFIYASGGNGGSAGGACANTSRLQGYVG GTLISVNASNNPAYGKTAFISFAVPAGTSYQITSYPTENTSCGAGVFSVFGYQT >gi|296493376|gb|ADTK01000125.1| GENE 3 2501 - 2734 69 77 aa, chain - ## HITS:1 COG:no KEGG:ECSE_P1-0107 NR:ns ## KEGG: ECSE_P1-0107 # Name: not_defined # Def: shufflon protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 77 354 430 430 154 97.0 1e-36 AILSCQSGTWGTIGGKLKVTQLSTTGYLGQFDFCAIARMGNAEDAHYCQVVESPAGSRKW YKYEHKTGCIASCVTLN Prediction of potential genes in microbial genomes Time: Mon May 16 15:28:03 2011 Seq name: gi|296493375|gb|ADTK01000126.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont309.2, whole genome shotgun sequence Length of sequence - 4041 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 277 - 603 165 ## SPAB_05365 hypothetical protein - Prom 689 - 748 1.6 2 2 Tu 1 . + CDS 587 - 1741 509 ## COG0582 Integrase + Prom 1799 - 1858 3.0 3 3 Op 1 . + CDS 1892 - 2716 119 ## SeHA_A0107 TraE 4 3 Op 2 . + CDS 2805 - 4004 903 ## SeHA_A0106 TraF Predicted protein(s) >gi|296493375|gb|ADTK01000126.1| GENE 1 277 - 603 165 108 aa, chain - ## HITS:1 COG:no KEGG:SPAB_05365 NR:ns ## KEGG: SPAB_05365 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Paratyphi_B # Pathway: not_defined # 20 107 364 450 451 99 53.0 3e-20 MRGDGMLTSKEAKNNANNHSLQACQSGTWKSSSASIWTNIKTFTLYPKNTQVLGRFKLCI NTYRIDGREMAETEVVPIDMPDSNGEMTWQAKNYTQYSSYFMKITCLK >gi|296493375|gb|ADTK01000126.1| GENE 2 587 - 1741 509 384 aa, chain + ## HITS:1 COG:RSc1554 KEGG:ns NR:ns ## COG: RSc1554 COG0582 # Protein_GI_number: 17546273 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Ralstonia solanacearum # 4 284 73 354 354 167 40.0 3e-41 MPSPRIRKMSLSRALDKYLKTVSVHKKGHQQEFYRSNVIKRYPIALRNMDEITTVDIATY RDVRLAEINPRTGKPITGNTVRLELALLSSLFNIARVEWGTCRTNPVELVRKPKVSSGRD RRLTSSEERRLSRYFREKNLMLYVIFHLALETAMRQGEILALRWEHIDLRHGVAHLPETK NGHSRDVPLSRRARNFLQMMPVNLHGNVFDYTASGFKNAWRIATQRLRIEDLHFHDLRHE AISRFFELGSLNVMEIAAISGHRSMNMLKRYTHLRAWQLVSKLDARRRQTQKVAAWFVPY PAHITTINEENGQKAHRIEIGDFDNLHVTATTKEEAVHRASEVLLRTLAIAAQKGERVPS PGALPVNDPDYIMICPLNPGSTPL >gi|296493375|gb|ADTK01000126.1| GENE 3 1892 - 2716 119 274 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0107 NR:ns ## KEGG: SeHA_A0107 # Name: not_defined # Def: TraE # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 274 1 274 274 540 100.0 1e-152 MRLNTTGIAARMMLSLDKKAIGEKLINGRQINPTPLQVRYPQGVREVLGIMSEQLAISTA DLTRILLEDALHNMFMPADNTAGNIISRIEYIMLSHDINAPLLATLLSPWNIRSTVIQDP ARLADYLSADALEHLAECFNLNPDWLNGHENYPIALSGEWPDTADNFRMLINDSSNTEVI FWHSFPFAGNTKREYYGVILRQKKEINGSVIYPALSLSPTILNDEKRKWLTEYTTRQNTT MSLRRVTLRPGLAGNLITGQILPVSLFNTSLLPW >gi|296493375|gb|ADTK01000126.1| GENE 4 2805 - 4004 903 399 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0106 NR:ns 
## KEGG: SeHA_A0106 # Name: not_defined # Def: TraF # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 399 2 400 400 703 99.0 0 MKKNHITRTIIASAVLFSFNAAAATSYFEARNDAMGGTGVASSHYGVAPLANPALLTKHN SNDDFSLLLPSVGAQVADPDDVSNKADDVKDDWDLFDSAVDNQHGVQQAAANLKHRLQEF RNINADAQVGVSAVAAMANDTLPFALMLKSYGTVSVNGKVNDADLDYLDKVANGTITDVD KNALTSRAFGRAAVITDVGISFAKELETAGQKWSLGVTPKYQRVDLFNYNVTVRDYDKDD FDGDKYHNTKNGFNADIGAYTDLNDNWTVGLVAQNIIPRSIDTKVVNGFKETFKVRPQAT AGVSWHNDLFTTALDVDLTPASGFTSDSKRQFASVGAEFNAWKWAQLRAGYRQNMASNSG SAFTAGVGISPFDVVHIDVSGLVGTDHDYGAMAQLQFTF Prediction of potential genes in microbial genomes Time: Mon May 16 15:28:18 2011 Seq name: gi|296493374|gb|ADTK01000127.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont309.3, whole genome shotgun sequence Length of sequence - 14488 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 5, operones - 3 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 70 - 654 117 ## SeHA_A0105 TraG + Term 737 - 772 -0.9 + Prom 867 - 926 1.9 2 2 Tu 1 . + CDS 1049 - 1507 214 ## SeHA_A0103 lipoprotein + Term 1530 - 1563 0.7 + Prom 1509 - 1568 1.7 3 3 Op 1 . + CDS 1669 - 2322 389 ## SeHA_A0102 lipoprotein 4 3 Op 2 . + CDS 2319 - 3467 583 ## COG2805 Tfp pilus assembly protein, pilus retraction ATPase PilT 5 3 Op 3 . + CDS 3464 - 3754 130 ## SeHA_A0100 TraK 6 3 Op 4 . + CDS 3769 - 4320 309 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes + Term 4354 - 4390 4.0 7 4 Op 1 . + CDS 4410 - 8177 1770 ## COG4643 Uncharacterized protein conserved in bacteria 8 4 Op 2 . + CDS 8195 - 8542 220 ## SeHA_A0097 TraL 9 4 Op 3 . + CDS 8617 - 9231 342 ## SC071 TraM-like protein 10 4 Op 4 . + CDS 9242 - 10225 539 ## SeHA_A0095 TraN 11 4 Op 5 . + CDS 10228 - 11517 725 ## SC069 TraO-like protein 12 4 Op 6 . + CDS 11517 - 12221 351 ## SeHA_A0093 TraP 13 4 Op 7 . + CDS 12221 - 12748 381 ## SeHA_A0092 TraQ 14 4 Op 8 . 
+ CDS 12799 - 13203 388 ## SeHA_A0091 TraR + Term 13213 - 13249 5.2 15 5 Op 1 . + CDS 13267 - 13455 191 ## SeHA_A0090 hypothetical protein 16 5 Op 2 . + CDS 13439 - 14239 341 ## SeHA_A0089 hypothetical protein 17 5 Op 3 . + CDS 14329 - 14475 61 ## SeHA_A0088 TraU Predicted protein(s) >gi|296493374|gb|ADTK01000127.1| GENE 1 70 - 654 117 194 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0105 NR:ns ## KEGG: SeHA_A0105 # Name: not_defined # Def: TraG # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 194 1 194 194 382 100.0 1e-105 MMKPRSSYSKTAFILLFSVFLVAAVTKAKSSLPDITLEQAKEINADNTVIFLFRHGERCD RSDMPCYSDKSGITITGTEKAQQEGIKFATIFSEYDIYSSNAVRTIQTAKFFSGKEPVVM DSLSDCNNDLYKTLESIARESHKRNIVIMTHNHCLSFLARDRLGKKFKPAYLDALIMHYD GTRLILDGKYNKEA >gi|296493374|gb|ADTK01000127.1| GENE 2 1049 - 1507 214 152 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0103 NR:ns ## KEGG: SeHA_A0103 # Name: not_defined # Def: lipoprotein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 152 1 152 152 315 100.0 3e-85 MNRIIPAGCLAALFLAGCNSPSQHTPVPRNYGAASDVAVTRHTQYTLWDKGVFSGQLPVS TGSHLLTANSDRLSFDWEGDAIELLNELARVRGMQFNYNGVRLPLPVNLHVRDMTFSNTL RLIEAQTAWRATIHQYPGLLQVSFMQPENRKK >gi|296493374|gb|ADTK01000127.1| GENE 3 1669 - 2322 389 217 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0102 NR:ns ## KEGG: SeHA_A0102 # Name: not_defined # Def: lipoprotein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 217 56 272 272 411 100.0 1e-114 MLTDAGKTLGFRGGKAQRSWELIQALNARESTLNALYDFRPLISPEGWLPPVIDEAQDVA HITPDQIRTSSRVWTIIRPERFVSNPPGWRDWLLRGLSTTATPGTEGSVVPEDSVQRKVW ETALRQGWQEGRQNADLTLEANQKTLTRDYRGMMLYSLLWRQGMITRPDVSDQMQTVTGD GKKLVTGDRVRRLKNHAEFNLQKSHWRPLIGTEGGSR >gi|296493374|gb|ADTK01000127.1| GENE 4 2319 - 3467 583 382 aa, chain + ## HITS:1 COG:yggR KEGG:ns NR:ns ## COG: yggR COG2805 # Protein_GI_number: 16130851 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Tfp pilus assembly protein, pilus retraction ATPase PilT # Organism: Escherichia coli K12 # 8 354 3 
335 341 137 29.0 3e-32 MNIPDEFGLFPFQRFTADELRHFFVWCAAHKVSDVDLTGGSPVSVSRFGRRVRCSSATLP TTLMSSLIDELFGREVIPRVLAGNPVDRTIQINGDASGRYGLKRGERIRLRSHLIQGTSG AEEKAISITMRVIPTEIPDILSMNIEPDLLEAMVCKSGLGFVCGETGSGKSTLCSALYRY IMDNFPDAKIVTYEDPVEYILGNENDLLPPHQAEIGRDVVSFAAGLRSAVRRNPEIIGVG EIRDNETADAAVQAGNTGHYCLSTMHTKSPGETLARLLGLFPPVIRDSMAWAVLSLLQFI LVQVLVRTNDGGRKAVREYIVINDELRDNLSGMPHAEWGHHIDAIIRQEKRRIRDQILEM YIRNEVDRREAILFIPPGELRS >gi|296493374|gb|ADTK01000127.1| GENE 5 3464 - 3754 130 96 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0100 NR:ns ## KEGG: SeHA_A0100 # Name: not_defined # Def: TraK # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 96 1 96 96 179 100.0 2e-44 MKREGFWRYATPPVNIQGVPLPFTMIYLFICPFPSKTTFWICTGIILFFVILDRSGWTVR TICTRIFSILRGAIASGRPWWYRHHTESPRDWTGLD >gi|296493374|gb|ADTK01000127.1| GENE 6 3769 - 4320 309 183 aa, chain + ## HITS:1 COG:YPMT1.73 KEGG:ns NR:ns ## COG: YPMT1.73 COG1502 # Protein_GI_number: 16082865 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Yersinia pestis # 20 175 8 161 162 119 38.0 3e-27 MSGKKTIVTLLRVSLLACPLLFTTPSFAMIDTPSVKVGFSPEGSASALVLDTINSAESSI RMMAYSFTDPDVMHALAKAKKRGVDVRIVVDDKGNTNRASQEAMKYINLLDIPLRTVDAF PIHHDKVIIVDGNTVETGSYNFSRAAARKNSENVVVLKNMPDVAAQYLEHWQDRWNKGTD WRP >gi|296493374|gb|ADTK01000127.1| GENE 7 4410 - 8177 1770 1255 aa, chain + ## HITS:1 COG:RSc1865_1 KEGG:ns NR:ns ## COG: RSc1865_1 COG4643 # Protein_GI_number: 17546584 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Ralstonia solanacearum # 74 374 10 304 359 133 33.0 2e-30 MPSVSRYRTWLAVPADEIEDLKKAHPPMNGHTPVIWDKEHKLWFARPGADLSRLDRWLPR PQDVSMNGSDPVTEFAQVLENAGLVLKELPVMDGKIHRVPTADDKKGQQSGAYRGFLDGR PAGWYRDYRSADNSPITWTFSGGEQTDPRARLHLKAHSMQRREDAERELKAQYNRQAAYA RRYINKWPQATAHEYLTRKGIQAAPGVRVNNKNELVIPFSNRNGAIRSYQRIPVTGGKDA RILIDSEKTGNWFALGTPRNGQPVLFAEGYATAASLHEATGLPVLMTVDGGNMIAVAENA 
RQKWTQSPFIFCADNDHAIRVNKGIVSATKAAELTGGSVIFPAFTDAEKAQGLTDFNDLD ASRGRAAFQHVINAQLEHIGVSTPNSNTPEMREALVIGNLVFTPVHTEEKTMTPTEYPET SPDTGHSHDQGPSLPSATQQEQPAASSSNIADETQSFTSHATENNGKDERHADNVQAQTE SSVVQDSPAESPASPESTTASAPDEPAHPAEPPEKVVSVATDKDWREFEAELSQPEKNES QQESGTSPEIPAPAATPASPEDSPSSETMNESAEPSPSPDAVKEEMTVMTEPDDKQSTGI SPDTPREEAVNHVTEDIPLQPEPFLPESDGPEEYENYSAYQELMNNENPEHLQENSDMPS QPDTGHVQADESELQATTDTVEPAVRDDPTPQQPVQSTSSSDNTSSFLDKARGFFTRKKT DSQAHENSDPTPSPETTTTTPAPDSIVYAPERPDAPISLNLDEIIKSLEGEERADRTVLY KLDGKPAFIDRVNRLEMVNGASNDDRSVLAALAVATDFYGGVIELTGSDAFKQKAMQLII EHNIKVRMKFPDQRAALEKLRKEMAVGKDTVVTHKPTPELNRNTPEQPAVPDPVQEKEAT QSPASPVAPEASTVSTVPASSPAEPGKTADAAQGEEPPGKLRPGESVTAVLHNFGRAEYA PGKGESFFVELKNRSGSKLYWGEQLESLVKNHQKGDVVTLTLQNREQFILPGEQKARFRN KWSMESVTNGISVSHDNPDKGQRIQAIPVETFMKVAAQISQGWPEEMKALRMPENVGSHL FIGEDRHPVSAPQNANQVTEITSAAPDKLTPVLGSVDKDTRELNLLLVQSADEHLQGVVR LNGTLYPALATPSADNSQLVINALTDKGLRFAGYGEAVNHDADSTNRPAPELMQFHLKTR EEPLFAAVYTPEKQPDALYRNLGFEQSWQQWSNSQKPEDRQEKTLHQDLSHSPGR >gi|296493374|gb|ADTK01000127.1| GENE 8 8195 - 8542 220 115 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0097 NR:ns ## KEGG: SeHA_A0097 # Name: not_defined # Def: TraL # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 115 1 115 115 207 100.0 8e-53 MNMHKSLVAVLGLILISSTASAATCRAGSAAQAGSNTGYERARAAADAWSQRENDVSSSL QSCLSRIKKISINLPQFPSLDDILSQLENQVCDAVVDKVNEKLPGNIDPWKDYNL >gi|296493374|gb|ADTK01000127.1| GENE 9 8617 - 9231 342 204 aa, chain + ## HITS:1 COG:no KEGG:SC071 NR:ns ## KEGG: SC071 # Name: traM # Def: TraM-like protein # Organism: S.enterica_Choleraesuis # Pathway: not_defined # 1 204 27 230 230 404 100.0 1e-111 MKQSEQRANNVPALIKALLCTGTCLLISITGNAIQYWHSTNVEREYFATDNGRLVRLAPT SQPAWSQNDAMAFGSQALATAFNLDFVHYRSQISSLSPRFSDEGFVGYVNALQASNILET IKKEKMNLTATTGAGVLVRQGQMSDGVWFWTFQYPVRMRLVGQTTSKPEQSFVFEITIQR VDPRLKPSGMEIRQMISRNAGPNS >gi|296493374|gb|ADTK01000127.1| GENE 10 9242 - 10225 539 327 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0095 NR:ns ## KEGG: SeHA_A0095 # Name: not_defined # Def: TraN # Organism: 
S.enterica_Heidelberg # Pathway: not_defined # 1 327 1 327 327 573 99.0 1e-162 MQSRYLLSTLLLVCSAATSADNAGWQNARTPQTTNSASDHVQNASQNTGGVPATLVKGEL PAPGQASPLVQDAARLDSELSADEIRSLRSLMADNERAINAPITSVVPRISSLTVNLSPG ASLPLVRTAMNNLSVVTFTDINGSPWPQSDPPYNAAPKLFDVQYNENMVTITPLRPWASG NISVYLKGLSVPVILNVTSGETDTPSSSQEMDSRLDLRIPRQGPTSPVVSIPTDKIALHD ATLQAFLDGIPPRDPSVKRLKFTGNVPDTTIWQHGDDLLVRSRAMLRDEFEQTLSSADGT HLWKLPVTPLLTFSVNGQSIHVTPELE >gi|296493374|gb|ADTK01000127.1| GENE 11 10228 - 11517 725 429 aa, chain + ## HITS:1 COG:no KEGG:SC069 NR:ns ## KEGG: SC069 # Name: traO # Def: TraO-like protein # Organism: S.enterica_Choleraesuis # Pathway: not_defined # 1 429 1 429 429 747 99.0 0 MSAEQDAGKSGKKLAALLGLGSIILFGGGYIAFSKLSGSNSDMQSAVNINSAASGGTRSV TETPHYRELLRADNERGAAAAARNNQTFIASLPQGLDIPDTQEKQQQPAAKPENYAHRQA SGTPQEDRAASEKRMERLQKLIVRIKDQHPAGSTPTIATTMWNKSPAETTGQNGTQQFAL KNASLSTPVAEKGIQLIPALTRIPAYIDTAVDSDNPSSKVIATIPAGPWAGATLFSPGVK LVGNGVEIHFDRMSWNGMDLKVNAYAQREDNLMSSVASNVNTRWFKHIILPSVLGGVGSI GTLYKDANTQVIQGNYGTVTGRVGMPSGEAVAGVIAGGMAERGSQILTRQAESEPYKQVE VYQHEVVSILFVDPVMTNDARSSSLSSGISPSVNRTSQAEQRSQARMQTAMEQRKAVMQR RYDEQPETP >gi|296493374|gb|ADTK01000127.1| GENE 12 11517 - 12221 351 234 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0093 NR:ns ## KEGG: SeHA_A0093 # Name: not_defined # Def: TraP # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 234 1 234 234 422 100.0 1e-117 MKPETEIDETGSFGEPEEKPAPFWKRSVWGISVATWGLCAVVLLAAIWYLFLRAPSETGM PPFNDADAGVQTWQTTQESSPSVQSTETMTVRTGEDMSQLARDVKTELDNRDEKIQATLN MLHDSINKLGEAIKKDEEYAQETRRQLDDIRSRLNGIMTQKSVTESSSTPHPAKKKTSSV LNGMKIMSMETGMAWIRWQGSTWAVREGQTLGNVVIQRIDPTTRTIITSAGTLR >gi|296493374|gb|ADTK01000127.1| GENE 13 12221 - 12748 381 175 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0092 NR:ns ## KEGG: SeHA_A0092 # Name: not_defined # Def: TraQ # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 175 1 175 175 275 100.0 6e-73 MNMDALTAIENFASSIFSAGMDFLFTWGEFIGVISMITLFARARSAGPVKMSPGKFIAGM LTSCMLVSLPAMINAGGVQMGFRADSFGPIAYVQPQTFGAAAGAANAVLSLAKLAGVGFV 
MNGISIWRKAGLDGHTALSASESVSKGNVKFIAGVLLVFIDRVLNALLASIGIVF >gi|296493374|gb|ADTK01000127.1| GENE 14 12799 - 13203 388 134 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0091 NR:ns ## KEGG: SeHA_A0091 # Name: not_defined # Def: TraR # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 134 1 134 134 186 95.0 2e-46 MSRIRVVAARLYAAFIHAQLFVMSCRSKLTKFLLLIPMMLLPKAVLADGDLADMVRNVEQ GAKTAQSSSLTIAQFIGVILFLGGLIGLKKVGKQGGMGLASCIVSIVIGAVLVAGPEMMS RSQKQLGISSISIG >gi|296493374|gb|ADTK01000127.1| GENE 15 13267 - 13455 191 62 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0090 NR:ns ## KEGG: SeHA_A0090 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 62 1 62 62 112 100.0 3e-24 MSAWSPEHITLRRTENGNVIAHDSRDDQEFIFAECCSDEIANIILEAVKRYTSTEPAHAT QH >gi|296493374|gb|ADTK01000127.1| GENE 16 13439 - 14239 341 266 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0089 NR:ns ## KEGG: SeHA_A0089 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 266 1 266 266 523 100.0 1e-147 MQHSIKDLWLYPFPEIDVVHTQEPLLPEPELTTPGRCICCRQNVRHRFRLDDSWPLRQLT DTISDTRVRLNKATEHLDKLKKRGEPVATGEKEKYNTAVKAAERALEQARLSARRLSLRH VQKAEITSTESLSEKEQELFHEDGPPYSLCAFCHAWHSLNGYAAAQGVMVWLPDLHPSTV VALNRRSLQEVFSNDKFRVRRGREALSALMQNRLAVEDKFRSFRPADFADVFRRYPPSGR SPLREKMNGIALILTPDSFIKKEYVD >gi|296493374|gb|ADTK01000127.1| GENE 17 14329 - 14475 61 48 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0088 NR:ns ## KEGG: SeHA_A0088 # Name: not_defined # Def: TraU # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 46 1 46 1014 94 93.0 9e-19 MNINRFIFTIEDCLNTLSRFSVASSFVEYCDLRTVIGLDRQDRGNRLI Prediction of potential genes in microbial genomes Time: Mon May 16 15:28:58 2011 Seq name: gi|296493373|gb|ADTK01000128.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont315.1, whole genome shotgun sequence Length of sequence - 3090 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op 
Conserved S Start End Score pairs(N/Pv) + Prom 20 - 79 2.5 1 1 Op 1 3/0.000 + CDS 304 - 1587 1289 ## COG0477 Permeases of the major facilitator superfamily + Term 1596 - 1631 4.2 + Prom 1597 - 1656 2.3 2 1 Op 2 . + CDS 1676 - 3089 1128 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases Predicted protein(s) >gi|296493373|gb|ADTK01000128.1| GENE 1 304 - 1587 1289 427 aa, chain + ## HITS:1 COG:ydfJ KEGG:ns NR:ns ## COG: ydfJ COG0477 # Protein_GI_number: 16129502 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 427 1 427 427 788 100.0 0 MDFQLYSLGAALVFHEIFFPESSTAMALILAMGTYGAGYVARIVGAFIFGKMGDRIGRKK VLFITITMMGICTTLIGVLPTYAQIGVFAPILLVTLRIIQGLGAGAEISGAGTMLAEYAP KGKRGIISSFVAMGTNCGTLSATAIWAFMFFILSKEELLAWGWRIPFLASVVVMVFAIWL RMNLKESPVFEKVNDSNQPTAKPAPAGSMFQSKSFWLATGLRFGQAGNSGLIQTFLAGYL VQTLLFNKAIPTDALMISSILGFMTIPFLGWLSDKIGRRIPYIIMNTSAIVLAWPMLSII VDKSYAPSTIMVALIVIHNCAVLGLFALENITMAEMFGCKNRFTRMAISKEIGGLIASGF GPILAGIFCTMTESWYPIAIMIMAYSVIGLISALKMPEVKDRDLSALEDAAEDQPRVVRA AQPSRSL >gi|296493373|gb|ADTK01000128.1| GENE 2 1676 - 3089 1128 471 aa, chain + ## HITS:1 COG:ydfI KEGG:ns NR:ns ## COG: ydfI COG0246 # Protein_GI_number: 16129501 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Escherichia coli K12 # 1 471 1 471 486 946 98.0 0 MGNNLLSAKATLPVYDRNNLAPRIIHLGFGAFHRAHQGVYADILATEHFSDWGYYEVNLI GGEQQIADLQQQDNLYTVAEMSADAWTARVVGVVKKALHVQMDGLETVLAAMCEPQIAIV SLTITEKGYFHSPATGQLMLDHPMVAADVQNPHQPKTATGVIVEALARRKAAGLPAFTVM SCDNMPENGHVMRDVVTSYAKAVDVKLAQWIEDNVTFPSTMVDRIVPAVTEDTLAKIEQL TGVGDPAGVACEPFRQWVIEDNFVAGRPEWEKAGAELVSDVLPYEEMKLRMLNGSHSFLA YLGYLAGYQHINDCMEDEHYRYAAYGLMLQEQAPTLKVQGVDLQDYANRLIARYSNPALR HRTWQIAMDGSQKLPQRMLDSVRWHLAHDSKFDLLALGVAGWMRYVGGVDEQGNPIEISD 
PLLPVIQKAVQSSAEGKARVQSLLAIKAIFGDDLPDNSLFTEKVTEAYLSL Prediction of potential genes in microbial genomes Time: Mon May 16 15:29:01 2011 Seq name: gi|296493372|gb|ADTK01000129.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont315.2, whole genome shotgun sequence Length of sequence - 6595 bp Number of predicted genes - 8, with homology - 7 Number of transcription units - 6, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 18 - 221 362 ## ECIAI1_1560 putative selenium carrying protein - Prom 263 - 322 1.5 - Term 355 - 390 6.7 2 2 Op 1 5/0.000 - CDS 399 - 1085 708 ## COG1802 Transcriptional regulators - Prom 1111 - 1170 5.5 - Term 1133 - 1169 3.0 3 2 Op 2 . - CDS 1174 - 1920 837 ## COG4221 Short-chain alcohol dehydrogenase of unknown specificity 4 2 Op 3 . - CDS 1932 - 2150 59 ## - Prom 2210 - 2269 2.7 5 3 Tu 1 . + CDS 2057 - 4102 1767 ## COG0339 Zn-dependent oligopeptidases - Term 4093 - 4146 2.2 6 4 Tu 1 . - CDS 4147 - 4665 235 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase - Prom 4876 - 4935 10.1 + Prom 4821 - 4880 9.1 7 5 Tu 1 . + CDS 4941 - 5333 283 ## COG3111 Uncharacterized conserved protein + Term 5345 - 5386 6.2 + Prom 5501 - 5560 6.9 8 6 Tu 1 . 
+ CDS 5588 - 6478 498 ## COG2199 FOG: GGDEF domain Predicted protein(s) >gi|296493372|gb|ADTK01000129.1| GENE 1 18 - 221 362 67 aa, chain - ## HITS:1 COG:no KEGG:ECIAI1_1560 NR:ns ## KEGG: ECIAI1_1560 # Name: ydfZ # Def: putative selenium carrying protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 67 1 67 67 114 100.0 1e-24 MTTYDRNRNAITTGSRVMVSGTGHTGKILSIDTEGLTAEQIRRGKTVVVEGCEEKLAPLD LIRLGMN >gi|296493372|gb|ADTK01000129.1| GENE 2 399 - 1085 708 228 aa, chain - ## HITS:1 COG:ECs2149 KEGG:ns NR:ns ## COG: ECs2149 COG1802 # Protein_GI_number: 15831403 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 228 1 228 228 414 100.0 1e-116 MTVETQLNPTQPVNQQIYRILRRDIVHCLIAPGTPLSEKEVSVRFNVSRQPVREAFIKLA ENGLIQIRPQRGSYVNKISMAQVRNGSFIRQAIECAVARRAASMITESQCYQLEQNLHQQ RIAIERKQLDDFFELDDNFHQLLTQIADCQLAWDTIENLKATVDRVRYMSFDHVSPPEML LRQHLDIFSALQKRDGDAVERAMTQHLQEISESVRQIRQENSDWFSEE >gi|296493372|gb|ADTK01000129.1| GENE 3 1174 - 1920 837 248 aa, chain - ## HITS:1 COG:ydfG KEGG:ns NR:ns ## COG: ydfG COG4221 # Protein_GI_number: 16129498 # Func_class: R General function prediction only # Function: Short-chain alcohol dehydrogenase of unknown specificity # Organism: Escherichia coli K12 # 1 248 1 248 248 504 100.0 1e-143 MIVLVTGATAGFGECITRRFIQQGHKVIATGRRQERLQELKDELGDNLYIAQLDVRNRAA IEEMLASLPAEWCNIDILVNNAGLALGMEPAHKASVEDWETMIDTNNKGLVYMTRAVLPG MVERNHGHIINIGSTAGSWPYAGGNVYGATKAFVRQFSLNLRTDLHGTAVRVTDIEPGLV GGTEFSNVRFKGDDGKAEKTYQNTVALTPEDVSEAVWWVSTLPAHVNINTLEMMPVTQSY AGLNVHRQ >gi|296493372|gb|ADTK01000129.1| GENE 4 1932 - 2150 59 72 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPGDSDDWQFDQNEEPDTAVCFAQGKDSLLSFSSPEREVFHSVGLLITECNLANQYYPLL SITCYVIDTQKR >gi|296493372|gb|ADTK01000129.1| GENE 5 2057 - 4102 1767 681 aa, chain + ## HITS:1 COG:ECs2147 KEGG:ns NR:ns ## COG: ECs2147 COG0339 # Protein_GI_number: 15831401 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent oligopeptidases # Organism: Escherichia coli O157:H7 
# 1 681 1 681 681 1321 99.0 0 MTTMNPFLVQSTLPYLAPHFDQIANHHYRPAFDEGMQQKRAEIAAIALNPQTPDFNNTIL ALEQSGELLTRVTSVFFAMTAAHTNDELQRLDEQFSAELAELANDIYLNGELFARVDAVW QRRESLGLDSESIRLVEVIHQRFVLAGAKLAQADKAKLKVLNTEAATLTSQFNQRLLAAN KSGGLVVNDIAQLAGMSEQEIALAAVAAREKGLDNKWLIPLLNTTQQPALAEMRDRATRE KLFIAGWTRAEKNDGNDTRAIIQRLVEIRAQQAKLLGFPHYAAWKIADQMAKTPEAALNF MREIVPAARQRASDELASIQAVIDKQQGGFSAQPWDWAFYAEQVRREKFDLDEAQLKPYF ELNTVLNEGVFWTANQLFGIKFVERFDIPVYHPDVRVWEIFDHNGVGLALFYGDFFARDS KSGGAWMGNFVEQSTLNETHPVIYNVCNYQKPAAGEPALLLWDDVITLFHEFGHTLHGLF ARQRYATLSGTNTPRDFVEFPSQINEHWATHPQVFARYARHYQSGAAMPDELQQKMRNAS LFNKGYEMSELLSAALLDMRWHCLEENEAMQDVDDFELRALVAENMDLPAIPPRYRSSYF AHIFGGGYAAGYYAYLWTQMLADDGYQWFVEQGGLTRENGQRFREAILSRGNSEDLERLY RQWRGKAPQIMPMLQHRGLNI >gi|296493372|gb|ADTK01000129.1| GENE 6 4147 - 4665 235 172 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 18 172 747 902 904 95 36 1e-19 MNINKDKIVQLADTDTIENLTSALSQRLIADQLRLTTAESCTGGKLASALCAAEDTPKFY GAGFVTFTDQAKMKILSVSQQSLERYSAVSEKVAAEMATGAIERADADVSIAITGYGGPE GGEDGTPAGTVWFAWHIKGQTYTAVMHFAGDCETVLALAVRFALVQLLQLLL >gi|296493372|gb|ADTK01000129.1| GENE 7 4941 - 5333 283 130 aa, chain + ## HITS:1 COG:ydeI KEGG:ns NR:ns ## COG: ydeI COG3111 # Protein_GI_number: 16129495 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 130 1 130 130 244 99.0 3e-65 MKFQAIVLASFLVMPYALADDQGGLKQDAAPPPPHAIEDGYRGTDDAKKMTVDFAKNMHD GASVSLRGNLISHKGEDRYVFRDKSGEINVVIPAAVFDGREVQPDQMINISGSLDKKSAP AVVRVTHLQK >gi|296493372|gb|ADTK01000129.1| GENE 8 5588 - 6478 498 296 aa, chain + ## HITS:1 COG:ydeH_2 KEGG:ns NR:ns ## COG: ydeH_2 COG2199 # Protein_GI_number: 16129494 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 121 296 1 176 176 361 99.0 1e-99 MIKKTTEIDAILLNLNKAIDAHYQWLVSMFHSVVARDASKPEITDNHSYGLCQFGRWIDH 
LGPLDNDELPYVRLMDSAHQHMHNCGRELMLAIVENHWQDAHFDAFQEGLLSFTAALTDY KIYLLTIRSNMDVLTGLPGRRVLDESFDHQLRNTEPLNLYLMLLDIDRFKLVNDTYGHLI GDVVLRTLATYLASWTRDYETVYRYGGEEFIIIVKAANDEEACRAGVRICQLVDNHAITH SEGHINITVTAGVSRAFPEEPLDVVIGRADRAMYEGKQTGRNRCMFIDEQNVINRV Prediction of potential genes in microbial genomes Time: Mon May 16 15:29:19 2011 Seq name: gi|296493371|gb|ADTK01000130.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont315.3, whole genome shotgun sequence Length of sequence - 34670 bp Number of predicted genes - 31, with homology - 31 Number of transcription units - 17, operones - 6 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 258 - 1445 900 ## COG0477 Permeases of the major facilitator superfamily - Prom 1511 - 1570 3.7 + Prom 1424 - 1483 4.7 2 2 Tu 1 . + CDS 1640 - 2539 751 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 2542 - 2592 4.3 - Term 2530 - 2580 7.3 3 3 Tu 1 . - CDS 2584 - 2997 267 ## COG2188 Transcriptional regulators - Prom 3171 - 3230 3.7 + Prom 3227 - 3286 3.1 4 4 Op 1 10/0.000 + CDS 3481 - 3792 382 ## COG1440 Phosphotransferase system cellobiose-specific component IIB + Prom 3794 - 3853 4.9 5 4 Op 2 13/0.000 + CDS 3907 - 5214 712 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 6 4 Op 3 . + CDS 5256 - 5567 333 ## COG1447 Phosphotransferase system cellobiose-specific component IIA + Term 5571 - 5610 6.0 7 5 Op 1 1/0.875 + CDS 5623 - 7296 1316 ## COG4580 Maltoporin (phage lambda and maltose receptor) 8 5 Op 2 . + CDS 7321 - 8760 1051 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase - Term 8764 - 8815 -0.8 9 6 Op 1 . - CDS 8817 - 9035 118 ## SbBS512_E1673 hypothetical protein 10 6 Op 2 3/0.500 - CDS 9067 - 9450 300 ## COG2207 AraC-type DNA-binding domain-containing proteins 11 6 Op 3 . - CDS 9470 - 9904 412 ## COG1846 Transcriptional regulators - Prom 9932 - 9991 7.1 + Prom 10024 - 10083 5.4 12 7 Tu 1 . 
+ CDS 10116 - 10781 621 ## COG2095 Multiple antibiotic transporter + Term 10786 - 10811 -0.5 - Term 10770 - 10803 3.8 13 8 Tu 1 . - CDS 10806 - 11996 868 ## COG2814 Arabinose efflux permease - Prom 12029 - 12088 6.6 14 9 Tu 1 . - CDS 12146 - 12853 216 ## EcHS_A1609 hypothetical protein - Prom 13008 - 13067 4.8 15 10 Tu 1 . - CDS 13338 - 14219 724 ## COG0583 Transcriptional regulator - Prom 14252 - 14311 3.8 + Prom 14237 - 14296 2.4 16 11 Op 1 3/0.500 + CDS 14320 - 15708 1407 ## COG1012 NAD-dependent aldehyde dehydrogenases 17 11 Op 2 . + CDS 15784 - 16698 654 ## COG2066 Glutaminase 18 11 Op 3 . + CDS 16698 - 17057 241 ## B21_01491 hypothetical protein + Prom 17080 - 17139 3.4 19 12 Tu 1 . + CDS 17196 - 18614 1224 ## COG2199 FOG: GGDEF domain + Prom 18616 - 18675 3.4 20 13 Tu 1 . + CDS 18841 - 20292 1694 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases + Term 20309 - 20342 3.8 + Prom 20323 - 20382 4.1 21 14 Tu 1 . + CDS 20499 - 21413 610 ## COG3781 Predicted membrane protein 22 15 Op 1 3/0.500 - CDS 21417 - 22175 586 ## COG4106 Trans-aconitate methyltransferase 23 15 Op 2 5/0.375 - CDS 22232 - 22522 387 ## COG1359 Uncharacterized conserved protein 24 15 Op 3 6/0.375 - CDS 22546 - 23421 870 ## COG1830 DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes 25 15 Op 4 16/0.000 - CDS 23448 - 24464 1125 ## COG1879 ABC-type sugar transport system, periplasmic component 26 15 Op 5 11/0.000 - CDS 24476 - 25468 1084 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 27 15 Op 6 21/0.000 - CDS 25468 - 26496 1015 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 28 15 Op 7 . - CDS 26490 - 28025 174 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 28263 - 28322 4.9 + Prom 28091 - 28150 6.2 29 16 Op 1 5/0.375 + CDS 28274 - 29227 889 ## COG2390 Transcriptional regulator, contains sigma factor-related N-terminal domain 30 16 Op 2 . 
+ CDS 29306 - 30898 1040 ## COG1070 Sugar (pentulose and hexulose) kinases + Term 30904 - 30949 -0.9 + Prom 31198 - 31257 7.1 31 17 Tu 1 . + CDS 31429 - 34669 2194 ## COG4625 Uncharacterized protein with a C-terminal OMP (outer membrane protein) domain Predicted protein(s) >gi|296493371|gb|ADTK01000130.1| GENE 1 258 - 1445 900 395 aa, chain - ## HITS:1 COG:ydeF KEGG:ns NR:ns ## COG: ydeF COG0477 # Protein_GI_number: 16129493 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 395 1 395 395 659 99.0 0 MNLSLRRSTSALLASSLLLTIGRGATLPFMTIYLSRQYSLSVDLIGYAMTIALTIGVVFS LGFGILADKFDKKRYMLLAITAFASGFIAIPLVNNVKLVVLFFALINCAYSVFATVLKAW FADNLSSTSKTKIFSINYTMLNIGWTIGPPLGTLLVMQSINLPFWLAAICSAFPMLFIQI WVKRSEKIIATETGSVWSPKVLLQDKALLWFTCSGFLASFVSGAFASCISQYVMVIADGD FAEKVVAVVLPVNAAMVVTLQYSVGRRLNPANIRALMTAGTLCFVIGLVGFIFSGNSLLL WGMSAAVFTVGEIIYAPGEYMLIDHIAPPGMKASYFSAQSLGWLGAAINPLVSGVVLTSL PPSSLFVILALVIIAAWVLMLKGIRARPWGQPALC >gi|296493371|gb|ADTK01000130.1| GENE 2 1640 - 2539 751 299 aa, chain + ## HITS:1 COG:ECs2140 KEGG:ns NR:ns ## COG: ECs2140 COG0697 # Protein_GI_number: 15831394 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli O157:H7 # 1 299 1 299 299 464 96.0 1e-130 MSRKDGVLALLVVVVWGLNFVVIKVGLHNMPPLMLAGLRFMLVAFPAIFFVARPKVPLNL LLGYGLTISFAQFAFLFCAINFGMPAGLASLVLQAQAFFTIVLGAFTFGERLHGKQLAGI ALAIFGVLVLIEDSLNGQHVAMLGFMLTLAAAFSWACGNIFNKKIMSHSTRPAVMSLVIW SALIPIIPFFVASLILDGSASMIHSLVAIDMTTILSLMYLAFVATIVGYGIWGTLLGRYE TWRVAPLSLLVPVVGLASAALLLDERLTGLQFLGAVLIMTGLYINVFGLRWRKAVKVRG >gi|296493371|gb|ADTK01000130.1| GENE 3 2584 - 2997 267 137 aa, chain - ## HITS:1 COG:ECs4624 KEGG:ns NR:ns ## COG: ECs4624 COG2188 # 
Protein_GI_number: 15833878 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 3 137 98 232 238 105 35.0 2e-23 MKFEVQDASPTIATELNLVTGEQVYYIKRLRFIEDNAAQLEETWMSVARFPDLTVSHMQK SKFSYIENECGIKIIGTFETFSPTFPTPEIASILRISPRDPILKIQTQAVDSNSIPLDYS LLYSNIFEFQVKYFFPR >gi|296493371|gb|ADTK01000130.1| GENE 4 3481 - 3792 382 103 aa, chain + ## HITS:1 COG:lin2905 KEGG:ns NR:ns ## COG: lin2905 COG1440 # Protein_GI_number: 16801964 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Listeria innocua # 1 97 1 95 100 100 57.0 5e-22 MKKILLVCAAGMSTSMLVKRMIDHANAISLEVNISALAIAEAKGKIKNNEVDVVLLGPQV RFQKPEIEAVAQGKMPVAVIEMKDYGTMNGQAVLEFAMKLLQE >gi|296493371|gb|ADTK01000130.1| GENE 5 3907 - 5214 712 435 aa, chain + ## HITS:1 COG:lin2906 KEGG:ns NR:ns ## COG: lin2906 COG1455 # Protein_GI_number: 16801965 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Listeria innocua # 1 432 1 432 450 565 69.0 1e-161 MGLMASFERGMERFLVPVAIKLNSQKHVAAVRDGFVFTFPIIMASSLIILINFAILSPDG FIAGLLHLNSIFPNLEKAQAIFTPVMNGSVNIMSIMIAFLVARNMAISYEQDDLLCGLTA IGAFFIVYTPYQMIDGQAFLTTKYLGAQGLFVAVIVALITSEIFCRLARNPKITITMPAA VPPAVARSFKVLLPIFFVMVFFSALNYCLTLISPAGLNDLIYTLIQTPLKHMGTNIFAVI ILGAVGNFLWVLGIHGPNTTSAIRETVFSEANLENLSWAAQHGTTWGAPYPITWTSINDA FANCGGSGMTLGLLLAIFIASKRAEYRDLAKMSFIPGIFNINEPIMFGLPIVLNPIMMVP FIMVPIVNCAIGYFFVSMEIIPPVAYAVPWTTPGPLISFLGTGGNWLALLVGFLCLGVAT MIYLPLLLPPTKSIT >gi|296493371|gb|ADTK01000130.1| GENE 6 5256 - 5567 333 103 aa, chain + ## HITS:1 COG:YPO2680 KEGG:ns NR:ns ## COG: YPO2680 COG1447 # Protein_GI_number: 16122885 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIA # Organism: Yersinia pestis # 3 102 16 115 115 82 51.0 2e-16 MFADEELVMELLVNAGQARSDAMEAIRCAGQKDWQGATQLMASSESACLQAHKIQTALIS 
QDEGCGKIKVNLILIHAQDHLMNAILCQDLAREIISLRKELHA >gi|296493371|gb|ADTK01000130.1| GENE 7 5623 - 7296 1316 557 aa, chain + ## HITS:1 COG:yieC KEGG:ns NR:ns ## COG: yieC COG4580 # Protein_GI_number: 16131588 # Func_class: G Carbohydrate transport and metabolism # Function: Maltoporin (phage lambda and maltose receptor) # Organism: Escherichia coli K12 # 20 557 19 538 538 545 52.0 1e-155 MNIKTLNVSLLSFSIITALFPLNAMATKLTIEQRLELLENELSQNKQELKATQNELGVYK SRLSTLQKSITENKYKSASLAEISATSPVADNIKNENGEQNSFTAAHTINGSQQVAVIES KGDKTTIESVTLKDISKYIKDDIGFSYQGYFRSGWGTGNHGSPQTYAAGSLGRFGNEMSG WFDLTLNQRVYNQDGKTANAVVTYDGNVGEQYNDAWFGDSANENIMQFSDIYLTTRGFLP FAPEADFWVGKHKLPQYEIQMLDWKTLTTDVAAGVGIENWALGVGLFDMSLSRDDVDVYS RDFSRTSQMNTNSVDVRYRNIPLWDDATLSLMAKYSAPNKTDQQQDNENDNSYFEMKDSW MLTSVLRQNLQRDTFNEFTLQVANNSYASSFASFSDASNTMAHGRYYYGDHTNGIAWRLI SQGEMYLTDNIIMANALVYSHGEDVYSYESGAHSDFDSIRTVIRPAWIWNTWNQTGLELG WFKQQNKTQQGVTLNESAYKTILWHALKVGESILGSRPEIRFYGTYINILDNELSNFKFN ENSKDEFMAGIQAEVWW >gi|296493371|gb|ADTK01000130.1| GENE 8 7321 - 8760 1051 479 aa, chain + ## HITS:1 COG:SP0303 KEGG:ns NR:ns ## COG: SP0303 COG2723 # Protein_GI_number: 15900236 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Streptococcus pneumoniae TIGR4 # 1 479 1 477 478 739 71.0 0 MSGFKKGFLWGGAVAAHQLEGGWNEGGKGISIADVMTAGAHGVPREVTEGVIDGLNYPNH EAIDFYHRYKTDIQLFAEMGFKCFRTSIAWTRIFPQGDEQEPNEEGLQFYDDLFDECLKQ GMEPVVTLSHFEMPYHLVTKYGGWRNRKLIDFFIRFASTVFTRYKEKVKYWMTFNEINNQ VNFSESLCPFTNSGILYSPEEDINEREQIMYQAVHYELVASALAVQTGKSINPEFNIGCM IAMCPIYPLTCAPNDMMMATKAMHRRYWFTDVHARGYYPQHMLNYFARKGFNLDITPEDN AILASGCVDFIGFSYYMSFTTQFSPDNPQLDYVEPRDLVSNPYIDTSEWGWQIDPAGLRY SLNWFWDHFQLPLFIVENGFGAVDQRQADGTVNDHYRIDYFASHIREMKKAVVEDGVDLI GYTPWGCIDLVSAGTGEMKKRYGMIYVDKDNEGKGTLERIRKASFYWYRDLIANNGENI >gi|296493371|gb|ADTK01000130.1| GENE 9 8817 - 9035 118 72 aa, chain - ## HITS:1 COG:no KEGG:SbBS512_E1673 NR:ns ## KEGG: SbBS512_E1673 # Name: marB # Def: hypothetical protein # Organism: 
S.boydii_CDC3083-94 # Pathway: not_defined # 1 71 1 71 79 122 98.0 5e-27 MKPLSSAIAAALILFSARGVAEQTTQPVVTSCANVVVVPPSQEQPPFDLNHMGTGSDKSD ALGVPYYNQHAM >gi|296493371|gb|ADTK01000130.1| GENE 10 9067 - 9450 300 127 aa, chain - ## HITS:1 COG:ECs2138 KEGG:ns NR:ns ## COG: ECs2138 COG2207 # Protein_GI_number: 15831392 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 127 3 129 129 233 100.0 9e-62 MSRRNTDAITIHSILDWIEDNLESPLSLEKVSERSGYSKWHLQRMFKKETGHSLGQYIRS RKMTEIAQKLKESNEPILYLAERYGFESQQTLTRTFKNYFDVPPHKYRMTNMQGESRFLH PLNHYNS >gi|296493371|gb|ADTK01000130.1| GENE 11 9470 - 9904 412 144 aa, chain - ## HITS:1 COG:STM1520 KEGG:ns NR:ns ## COG: STM1520 COG1846 # Protein_GI_number: 16764865 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Salmonella typhimurium LT2 # 1 144 1 144 144 255 90.0 2e-68 MKSTSDLFNEIIPLGRLIHMVNQKKDRLLNEYLSPLDITAAQFKVLCSIRCAACITPVEL KKVLSVDLGALTRMLDRLVCKGWVERLPNPNDKRGVLVKLTTSGAAICEQCHQLVGQDLH QELTKNLTADEVATPEHLLKKVLP >gi|296493371|gb|ADTK01000130.1| GENE 12 10116 - 10781 621 221 aa, chain + ## HITS:1 COG:marC KEGG:ns NR:ns ## COG: marC COG2095 # Protein_GI_number: 16129488 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Escherichia coli K12 # 1 221 1 221 221 366 100.0 1e-101 MLDLFKAIGLGLVVLLPLANPLTTVALFLGLAGNMNSAERNRQSLMASVYVFAIMMVAYY AGQLVMDTFGISIPGLRIAGGLIVAFIGFRMLFPQQKAIDSPEAKSKSEELEDEPSANIA FVPLAMPSTAGPGTIAMIISSASTVRQSSTFADWVLMVAPPLIFFLVAVILWGSLRSSGA IMRLVGKGGIEAISRLMGFLLVCMGVQFIINGILEIIKTYH >gi|296493371|gb|ADTK01000130.1| GENE 13 10806 - 11996 868 396 aa, chain - ## HITS:1 COG:ydeA KEGG:ns NR:ns ## COG: ydeA COG2814 # Protein_GI_number: 16129487 # Func_class: G Carbohydrate transport and metabolism # Function: Arabinose efflux permease # Organism: Escherichia coli K12 # 1 396 1 396 396 572 100.0 1e-163 
MTTNTVSRKVAWLRVVTLAVAAFIFNTTEFVPVGLLSDIAQSFHMQTAQVGIMLTIYAWV VALMSLPFMLMTSQVERRKLLICLFVVFIASHVLSFLSWSFTVLVISRIGVAFAHAIFWS ITASLAIRMAPAGKRAQALSLIATGTALAMVLGLPLGRIVGQYFGWRMTFFAIGIGALIT LLCLIKLLPLLPSEHSGSLKSLPLLFRRPALMSIYLLTVVVVTAHYTAYSYIEPFVQNIA GFSANFATALLLLLGGAGIIGSVIFGKLGNQYASALVSTAIALLLVCLALLLPAANSEIH LGVLSIFWGIAMMIIGLGMQVKVLALAPDATDVAMALFSGIFNIGIGAGALVGNQVSLHW SMSMIGYVGAVPAFAALIWSIIIFRRWPVTLEEQTQ >gi|296493371|gb|ADTK01000130.1| GENE 14 12146 - 12853 216 235 aa, chain - ## HITS:1 COG:no KEGG:EcHS_A1609 NR:ns ## KEGG: EcHS_A1609 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_HS # Pathway: not_defined # 1 235 137 371 371 456 99.0 1e-127 MGRVFQTGVERVLFLFLNDFIEQFPMINLGVPIKRAHTPHIEPLPPDRHTAADYLRQFDL LVLNFISRGNFVILPRLWSNSEVHRWFVNKDPNLITAILDITDGELKEDLLQSLMDSLGS NKHVLPEVCICFLSLLAEQESPHFQDFFLFFANMLLHYHQFMNPNESDLNDVLMPASLSD DKIIKHMARRTLKLFVKNETPPKVTHEDLVKNRPRSPVRPPIPATAKTPDLPERH >gi|296493371|gb|ADTK01000130.1| GENE 15 13338 - 14219 724 293 aa, chain - ## HITS:1 COG:ECs2133 KEGG:ns NR:ns ## COG: ECs2133 COG0583 # Protein_GI_number: 15831387 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 293 1 293 293 585 98.0 1e-167 MDLTQLEMFNAVAEAGSITQAAAIVHRVPSNLTTRLRQLETELGVDLFIRENQRLRLSPA GHNFLRYSQQILALVDEARSVVAGDEPQGLFSLGSLESTAAVRIPATLAEFNHRYPKIQF SLSTGPSGTMLEGVLEGKLNAAFIDGPINHTAIDGIPVYREELMIVTPQGHAPVIRASQV NGSNIYAFRANCSYRRHFESWFHADGAAPGTIHEMESYHGMLACVVAGAGIALIPRSMLE SMPGHHQVEAWPLAEQWRWLTTWLVWRRGAKTRPLEAFIQLLDVPDSAKQGYQ >gi|296493371|gb|ADTK01000130.1| GENE 16 14320 - 15708 1407 462 aa, chain + ## HITS:1 COG:yneI KEGG:ns NR:ns ## COG: yneI COG1012 # Protein_GI_number: 16129484 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Escherichia coli K12 # 1 462 9 470 470 886 99.0 0 MTITPATHAISINPATGEQLSVLPWAGANDIENALQLAAAGFRDWRETNIDYRAEKLRGI GKALRARSEKMAQMITREMGKPINQARAEVAKSANLCDWYAEHGPAMLKAEPTLVENQQA 
VIEYRPLGTILAIMPWNFPLWQVMRGAVPIILAGNGYLLKHAPNVMGCAQLIAQVFKDAG IPQGVYGWLNADNDGVSQMIKDSRIAAVTVTGSVRAGAAIGAQAGAALKKCVLELGGSDP FIVLNDADLELAVKAAVAGRYQNTGQVCAAAKRFIIEEGIASAFTERFVAAAAALKMGDP RDEENALGPMARFDLRDELHHQVEKTLAQGARLLLGGEKMAGAGNYYPPTVLANVTPEMT AFREEMFGPVAAITVAKDAEHALELANDSEFGLSATIFTTDETQARQMAARLECGGVFIN GYCASDARVAFGGVKKSGFGRELSHFGLHEFCNIQTVWKDRI >gi|296493371|gb|ADTK01000130.1| GENE 17 15784 - 16698 654 304 aa, chain + ## HITS:1 COG:ECs2131 KEGG:ns NR:ns ## COG: ECs2131 COG2066 # Protein_GI_number: 15831385 # Func_class: E Amino acid transport and metabolism # Function: Glutaminase # Organism: Escherichia coli O157:H7 # 1 304 5 308 308 612 100.0 1e-175 MDNAILENILRQVRPLIGQGKVADYIPALATVDGSRLGIAICTVDGQLFQAGDAQERFSI QSISKVLSLVVAMRHYSEEEIWQRVGKDPSGSPFNSLVQLEMEQGIPRNPFINAGALVVC DMLQGRLSAPRQRMLEVVRGLSGVSDISYDTVVARSEFEHSARNAAIAWLMKSFGNFHHD VTTVLQNYFHYCALKMSCVELARTFVFLANQGKAIHIDEPVVTPMQARQINALMATSGMY QNAGEFAWRVGLPAKSGVGGGIVAIVPHEMAIAVWSPELDDAGNSLAGIAVLEQLTKQLG RSVY >gi|296493371|gb|ADTK01000130.1| GENE 18 16698 - 17057 241 119 aa, chain + ## HITS:1 COG:no KEGG:B21_01491 NR:ns ## KEGG: B21_01491 # Name: yneG # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 119 1 119 119 239 100.0 3e-62 MQSLDPLFARLSRSKFRSRFRLGMKERQYCLEKGAPVIEQHAADFVAKRLAPALPANDGK QTPMRGHPVFIAQHATATCCRGCLAKWHNIPQGVSLSEEQQRYIVAVIYHWLVVQMNQP >gi|296493371|gb|ADTK01000130.1| GENE 19 17196 - 18614 1224 472 aa, chain + ## HITS:1 COG:Z2182_2 KEGG:ns NR:ns ## COG: Z2182_2 COG2199 # Protein_GI_number: 15801615 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli O157:H7 EDL933 # 288 472 1 185 185 387 100.0 1e-107 MHVHPISTFRLFQEGHLLRNSIAIFALTTLFYFIGAELRLVHELSLFWPLNGVMAGVFAR YVWLNRLHYYAISYVAMLVYDAITTEWGLVSLVINFSNMMFIVTVALLVVRDKRLGKNKY EPVSALRLFNYCLIAALLCAIVGAIGSVSIDSLDFWPLLADWFSEQFSTGVLIVPCMLTL AIPGVLPRFKAEQMMPAIALIVSVIASVVIGGAGSLAFPLPALIWCAVRYTPQVTCLLTF VTGAVEIVLVANSVIDISVGSPFSIPEMFSARLGIATMAICPIMVSFSVAAINSLMKQVA 
LRADFDFLTQVYSRSGLYEALKSPSLKQTQHLTVMLLDIDYFKSINDNYGHECGDKVLSV FARHIQKIVGDKGLVARMGGEEFAVAVPSVNPVDGLLMAEKIRKGVELQPFTWQQKTLYL TVSIGVGSGRASYRTLTDDFNKLMVEADTCLYRSKKDGRNRTSTMRYGEEVV >gi|296493371|gb|ADTK01000130.1| GENE 20 18841 - 20292 1694 483 aa, chain + ## HITS:1 COG:ECs2128 KEGG:ns NR:ns ## COG: ECs2128 COG0246 # Protein_GI_number: 15831382 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Escherichia coli O157:H7 # 1 483 1 483 483 994 99.0 0 MKTLNRRDFPGAQYPERIIQFGEGNFLRAFVDWQIDLLNEHTDLNSGVVVVRPIETSFPP SLSTQDGLYTTIIRGLNEKGEAVSDARLIRSVNREISVYSEYDEFLKLAHNPEMRFVFSN TTEAGISYHAGDKFDDAPAVSYPAKLTRLLFERFSHFNGALDKGWIIIPCELIDYNGDAL RELVLRYAQEWALPEAFIQWLDQANSFCSTLVDRIVTGYPRDEVAKLEEELGYHDGFLDT AEHFYLFVIQGPKSLATELRLDKYPLNVLIVDDIKPYKERKVAILNGAHTALVPVAFQAG LDTVGEAMNDAEICAFVEKAIYEEIIPVLDLPRDELESFASAVTGRFRNPYIKHQLLSIA LNGMTKFRTRILPQLLAGQKAKGTLPARLTFALAALIAFYRGERNGETYPVQDDAHWLER YQQLWSQHRDRVIGTQELVAIVLAEKDHWEQDLTQVPGLVEQVANDLDAILEKGMREAVR PLC >gi|296493371|gb|ADTK01000130.1| GENE 21 20499 - 21413 610 304 aa, chain + ## HITS:1 COG:yneE KEGG:ns NR:ns ## COG: yneE COG3781 # Protein_GI_number: 16129479 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 304 18 321 321 594 99.0 1e-170 MIVRPQQHWLRRIFVWHGSVLSKISSRLLLNFLFSIAVIFMLPWYTHLGIKFTLAPFSIL GVAIAIFLGFRNNAGYARYVEARKLWGQLMIASRSLLREVKTTLPDSASVREFARLQIAF AHCLRMTLRKQPQAEVLAHYLKTEDLQPVLASNSPANRILLIMGEWLAVQRRNGQLSDIL FISLNDRLNDISAVLAGCERIAYTPIPFAYTLILHRTVYLFCIMLPFALVVDLHYMTPFI SVLISYTFISLDCLAEELEDPFGTENNDLPLDAICNAIEIDLLQMNDEAEIPAKILPDRH YQLT >gi|296493371|gb|ADTK01000130.1| GENE 22 21417 - 22175 586 252 aa, chain - ## HITS:1 COG:tam KEGG:ns NR:ns ## COG: tam COG4106 # Protein_GI_number: 16129478 # Func_class: R General function prediction only # Function: Trans-aconitate methyltransferase # Organism: Escherichia coli K12 # 1 252 1 252 252 491 99.0 1e-139 MSDWNPSLYLHFAAERSRPAVELLARVPLENVEYVADLGCGPGNSTALLQQRWPAARITG 
IDSSPAMIAEARSALPDCQFVEADIRNWQPVQALDLIFANASLQWLPDHYELFPHLVSLL NPQGVLAVQMPDNWLEPTHVLMREVAWEQNYPDRGREPLAGVHAYYDILSEAGCEVDIWR TTYYHQMPSHQAIIDWVTATGLRPWLQDLTESEQQLFLKRYHQMLEEQYPLQENGQILLA FPRLFIVARRTE >gi|296493371|gb|ADTK01000130.1| GENE 23 22232 - 22522 387 96 aa, chain - ## HITS:1 COG:ECs2125 KEGG:ns NR:ns ## COG: ECs2125 COG1359 # Protein_GI_number: 15831379 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 96 1 96 96 188 100.0 2e-48 MHVTLVEINVHEDKVDEFIEVFRQNHLGSVQEEGNLRFDVLQDPEVNSRFYIYEAYKDED AVAFHKTTPHYKTCVAKLESLMTGPRKKRLFNGLMP >gi|296493371|gb|ADTK01000130.1| GENE 24 22546 - 23421 870 291 aa, chain - ## HITS:1 COG:yneB KEGG:ns NR:ns ## COG: yneB COG1830 # Protein_GI_number: 16129476 # Func_class: G Carbohydrate transport and metabolism # Function: DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes # Organism: Escherichia coli K12 # 1 291 1 291 291 594 99.0 1e-170 MADLDDIKDGKDFRTDQPQQNIPFTLKGCGALDWGMQSRLSRIFNPKTGKTVMLAFDHGY FQGPTTGLERIDINIAPLFEHADVLMCTRGILRSVVPPATNKPVVLRASGANSILAELSN EAVALSMDDAVRLNSCAVAAQVYIGSEYEHQSIKNIIQLVDAGMKVGMPTMAVTGVGKDM VRDQRYFSLATRIAAEMGAQIIKTYYVEKGFERIVAGCPVPIVIAGGKKLPEREALEMCW QAIDQGASGVDMGRNIFQSDHPVAMMKAVQAVVHHNETADRAYELYLSEKQ >gi|296493371|gb|ADTK01000130.1| GENE 25 23448 - 24464 1125 338 aa, chain - ## HITS:1 COG:ECs2123 KEGG:ns NR:ns ## COG: ECs2123 COG1879 # Protein_GI_number: 15831377 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 1 338 1 340 340 640 99.0 0 MTLHRFKKIALLLGIAAISMNVQAAERIAFIPKLVGVGFFTSGGNGAQQAGKELGVDVTY DGPTEPSVSGQVQLINNFVNQGYNAIIVSAVSPDGLCPALKRAMQRGVRVLTWDSDTKPE CRSYYINQGTPAQLGGMLVDMAARQVNKDKAKVAFFYSSPTVTDQNQWVKEAKAKIAKEH PGWEIVTTQFGYNDATKSLQTAEGILKAYSDLDAIIAPDANALPAAAQAAENLKNDKVAI VGFSTPNVMRPYVERGTVKEFGLWDVVQQGKISVYVADALLKKGSMKTGDKLDIQGVGQV EVSPNSVQGYDYEADGNGIVLLPERVIFNKENIGKYDF >gi|296493371|gb|ADTK01000130.1| GENE 26 
24476 - 25468 1084 330 aa, chain - ## HITS:1 COG:ydeZ KEGG:ns NR:ns ## COG: ydeZ COG1172 # Protein_GI_number: 16129474 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Escherichia coli K12 # 1 330 1 330 330 511 100.0 1e-145 MRIRYGWELALAALLVIEIVAFGAINPRMLDLNMLLFSTSDFICIGIVALPLTMVIVSGG IDISFGSTIGLCAIALGVLFQSGVPMPLAILLTLLLGALCGLINAGLIIYTKVNPLVITL GTLYLFAGSALLLSGMAGATGYEGIGGFPMAFTDFANLDVLGLPVPLIIFLICLLVFWLW LHKTHAGRNVFLIGQSPRVALYSAIPVNRTLCALYAMTGLASAVAAVLLVSYFGSARSDL GASFLMPAITAVVLGGANIYGGSGSIIGTAIAVLLVGYLQQGLQMAGVPNQVSSALSGAL LIVVVVGRSVSLHRQQIKEWLARRANNPLP >gi|296493371|gb|ADTK01000130.1| GENE 27 25468 - 26496 1015 342 aa, chain - ## HITS:1 COG:ydeY KEGG:ns NR:ns ## COG: ydeY COG1172 # Protein_GI_number: 16129473 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Escherichia coli K12 # 1 342 1 342 342 540 99.0 1e-153 MLKFIQNNREITALLAVVLLFALPGFLDRQYLSVQTLTMVYSSAQILILLAMGATLVMLT RNIDVSVGSITGMCAVLLGMLLNAGYSLPVACVATLLLGLLAGFFNGVLVAWLKIPAIVA TLGTLGLYRGIMLLWTGGKWIEGLPAELKQLSAPLLFGVSAIGWLTIILVAFMAWLLAKT AFGRSFYATGDNLQGARQLGVRTEAIRIVAFSLNGCMAALAGIVFASQIGFIPNQTGTGL EMKAIAACVLGGISLLGGSGAIIGAVLGAWFLTQIDSVLVLLRIPAWWNDFIAGLVLLAV LVFDGRLRCALERNLRRQKYARFMTPPPSVKPASSGKKREAA >gi|296493371|gb|ADTK01000130.1| GENE 28 26490 - 28025 174 511 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 268 481 1 217 245 71 26 6e-12 MQTSDTRALPLLCARSVYKQYSGVNVLKGIDFTLHQGEVHALLGGNGAGKSTLMKIIAGI TPADSGTLEIGGNNYARLTPVHAHQLGIYLVPQEPLLFPSLSIKENILFGMAKKQLSMQK MKNLLAALGCQFDLHSLAGSLDVADRQMVEILRGLMRDSRILILDEPTASLTPAETERLF SRLQELLATGVGIVFISHKLPEIRQIADRISVMRDGTIALSGKTSELSTDDIIQAITPAV REKSLSASQKLWLELPGNRPQHAAGTPVLTLENLTGEGFRNVSLTLNAGEILGLAGLVGA GRTELAETLYGLRTLRGGRIMLNGKEINKLSTGERLLRGLVYLPEDRQSSGLNLDASLAW 
NVCALTHNLRGFWAKTAKDNATLERYRRALNIKFNQPEQAARTLSGGNQQKILIAKCLEA SPQVLIVDEPTRGVDVSARNDIYQLLRSIAAQNVAVLLISSDLEEIELMTDRVYVMHQGE ITHSALTGRDINVETIMRVAFGDSQRQEASC >gi|296493371|gb|ADTK01000130.1| GENE 29 28274 - 29227 889 317 aa, chain + ## HITS:1 COG:ECs2119 KEGG:ns NR:ns ## COG: ECs2119 COG2390 # Protein_GI_number: 15831373 # Func_class: K Transcription # Function: Transcriptional regulator, contains sigma factor-related N-terminal domain # Organism: Escherichia coli O157:H7 # 1 317 1 317 317 586 99.0 1e-167 MTINDSAISEQGMCEEEQVARIAWFYYHDGLTQSEISDRLGLTRLKVSRLLEKGHQSGII RVQINSRFEGCLEYETQLRRQFSLQHVRVIPGLADADVGGRLGIGAAHMLMSLLQPQQML AIGFGEATMNTLQRLSGFISSQQIRLVTLSGGVGSYMTGIGQLNAACSVNIIPAPLRASS ADIARTLKNENCVKDVLLAAQAADVAIVGIGAVSQQDDATIIRSGYISQGEQLMIGRKGA VGDILGYFFDAKGDVVTDIKIHNELIGLPLSSLKTIPVRVGVAGGENKAEAIAAAMKGGY INALVTDQDTAAAILRS >gi|296493371|gb|ADTK01000130.1| GENE 30 29306 - 30898 1040 530 aa, chain + ## HITS:1 COG:ydeV KEGG:ns NR:ns ## COG: ydeV COG1070 # Protein_GI_number: 16129470 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Escherichia coli K12 # 1 530 1 530 530 1081 99.0 0 MARLFTPSESKYYLMALDAGTGSIRAVIFDLEGNQIAVGQAEWRHLAVPDVPGSMEFDLN KNWQLACECMRQALHNAGIAPEYIAAVSACSMREGIVLYNNEGAPIWACANVDARAAREV SELKELHNNTFENEVYRATGQTLALSAIPRLLWLAHHRSDIYRQASTITMISDWLAYMLS GELAVDPSNAGTTGLLDLTTRDWKPALLDMAGLRADILSPVKETGTLLGVVSSQAAELCG LKAGTPVVVGGGDVQLGCLGLGVVRPAQTAVLGGTFWQQVVNLAAPVTDPEMNVRVNPHV IPGMVQAESISFFTGLTMRWFRDAFCAEEKLIAERLGIDTYTLLEEMASRVPPGSWGVMP IFSDRMRFKTWYHAAPSFINLSIDPDKCNKATLFRALEENAAIVSACNLQQIADFSNIHP SSLVFAGGGSKGKLWSQILADVSGLPVNIPVVKEATALGCAIAAGVGAGIFSSMAETGER LVRWERTHTPDPEKHELYQDSRDKWQAVYQDQLGLVDHGLTTSLWKAPGL >gi|296493371|gb|ADTK01000130.1| GENE 31 31429 - 34669 2194 1080 aa, chain + ## HITS:1 COG:AGl3085 KEGG:ns NR:ns ## COG: AGl3085 COG4625 # Protein_GI_number: 15891657 # Func_class: S Function unknown # Function: Uncharacterized protein with a C-terminal OMP (outer membrane protein) domain # Organism: 
Agrobacterium tumefaciens strain C58 (Cereon) # 110 752 142 778 1341 103 25.0 3e-21 MNRIYRVIWNCTLQVFQACSELTRRAGKTSTVNLRKSSGLTTKFSRLTLGVLLALSGSAS GASLEVDNDQITNIDTDVAYDAYLVGWYGTGVLNILAGGNASLTTITTSVIGANEDSEGT VNVLGGTWRLYDSGNNARPLNVGQSGTGTLNIKQKGHVDGGYLRLGSSTGGVGTVNVEGE DSVLTTELFEIGSYGTGSLNITDKGYVTSSIVAILGYQAGSNGQVVVEKGGEWLIKNNDS SIEFQIGNQGMGEATIREGGLVTAENTIIGGNATGVGTLNVQDQDSVITVRRLYNGYFGN GAVNISNNGLINNKEYSLVGVEDGSHGVVNVTDKGHWSFLGTGEAFRYIYIGDAGDGELN VSREGKVDSGIITAGMKETGTGNITVKDKNSVITNLGTNLGYDGHGEMNISNEGLVVSNG GSSLGYGENGVGNVSITTGGMWEVNKNVYTTIGVAGVGNLNISDGGKFVSQNITFLGDKA SGIGTLNLMDGTSSFDTVGINVGNFGSGIVNVSNGATLNSTGYGFIGGNASGKGIVNIST DSLWNLKTSSTNAQLLQVGVLGTGELNITTGGIVKARDTQIALNDKSKGDVRVDGQNSLL ETFNMYVGTSGTGTLTLTNSGTLNVEGGEVYLGVFEPAVGTLNIGAAHGEAAADAGYITN ATKVEFGSGEGVFVFNHTNNSDAGYQVDMLITGDDKDGKVIHDAGHTVFNAGNTYSGKTL VNDGLLTIASHTADGVTGMGSSEVTIASPGTLDILASTNSAGDYTLTNALKGDGLMRVQL SSSDKMFGFTHATGTEFAGVAQLKDSTFTLERDNTAALTHAMLQSDSENTTSVKVGEQSI GGLAMNGGTLIFDTDIPAATLAEGYISVDTLVVGAGDYTWKGRNYQVNGTGDVLIDVPKP WNDPMANNPLTTLNLLEHDDSHVGVQLVKAQTVIGSGGSLTLRDLQGDEVEADKTLHIAQ NGTVVAEGDYGFRLTTAPGDGLYVNYGLKALNIHGGQKLTLAEHGGAYGATADMSAKIGG EGDLAINTVRQVSLSNGQNDYQGATYVQMGTLRTDADGALGNTRELNISNAAIVDLNGST Prediction of potential genes in microbial genomes Time: Mon May 16 15:29:35 2011 Seq name: gi|296493370|gb|ADTK01000131.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont315.4, whole genome shotgun sequence Length of sequence - 26351 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 4, operones - 3 average op.length - 6.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 2 - 2194 1737 ## COG3468 Type V secretory pathway, adhesin AidA + Term 2204 - 2240 7.1 + Prom 2305 - 2364 5.4 2 1 Op 2 8/0.000 + CDS 2403 - 2669 271 ## COG1396 Predicted transcriptional regulators 3 1 Op 3 2/0.000 + CDS 2669 - 3991 577 ## COG3550 Uncharacterized protein related to capsule biosynthesis enzymes + Prom 4034 - 4093 3.4 4 1 Op 4 1/0.000 + CDS 4309 - 4488 56 ## COG2207 
AraC-type DNA-binding domain-containing proteins + Prom 4666 - 4725 5.6 5 1 Op 5 7/0.000 + CDS 4943 - 5506 309 ## COG3539 P pilus assembly protein, pilin FimA + Term 5548 - 5594 -0.4 + Prom 5650 - 5709 4.0 6 1 Op 6 10/0.000 + CDS 5867 - 6577 436 ## COG3121 P pilus assembly protein, chaperone PapD 7 1 Op 7 6/0.000 + CDS 6619 - 9270 884 ## COG3188 P pilus assembly protein, porin PapC 8 1 Op 8 4/0.000 + CDS 9284 - 9814 350 ## COG3539 P pilus assembly protein, pilin FimA 9 1 Op 9 . + CDS 9827 - 10330 253 ## COG3539 P pilus assembly protein, pilin FimA 10 1 Op 10 . + CDS 10390 - 11304 334 ## ECIAI1_1512 putative fimbrial-like exported adhesin protein + Prom 11383 - 11442 9.9 11 2 Tu 1 . + CDS 11638 - 13917 1262 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing + Term 13931 - 13974 8.2 + Prom 14059 - 14118 5.0 12 3 Op 1 . + CDS 14165 - 14362 167 ## G2583_1863 two-component-system connector protein YneN 13 3 Op 2 2/0.000 + CDS 14437 - 15198 393 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 15246 - 15282 3.0 + Prom 15354 - 15413 4.7 14 3 Op 3 4/0.000 + CDS 15600 - 17282 1309 ## COG3119 Arylsulfatase A and related enzymes + Term 17295 - 17327 5.0 15 3 Op 4 1/0.000 + CDS 17334 - 18491 448 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) + Prom 18701 - 18760 4.6 16 3 Op 5 . + CDS 18782 - 18982 148 ## COG4178 ABC-type uncharacterized transport system, permease and ATPase components + Term 18986 - 19018 0.9 + Prom 19125 - 19184 3.1 17 4 Op 1 . + CDS 19271 - 20467 727 ## COG4178 ABC-type uncharacterized transport system, permease and ATPase components 18 4 Op 2 . + CDS 20505 - 22877 1823 ## JW1490 predicted porin protein + Term 22897 - 22926 -0.5 19 4 Op 3 3/0.000 + CDS 22934 - 25717 1867 ## COG0612 Predicted Zn-dependent peptidases + Term 25724 - 25760 6.3 + Prom 25872 - 25931 6.7 20 4 Op 4 . 
+ CDS 26079 - 26349 201 ## COG0076 Glutamate decarboxylase and related PLP-dependent proteins Predicted protein(s) >gi|296493370|gb|ADTK01000131.1| GENE 1 2 - 2194 1737 730 aa, chain + ## HITS:1 COG:ydeU KEGG:ns NR:ns ## COG: ydeU COG3468 # Protein_GI_number: 16129468 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 265 730 1 466 466 826 98.0 0 RGPPQTVETFTGQMGSTVLFKEGALTVNKGGISQGELTGGGNLNVTGGTLAIEGLNARYN ALTSISPNAEVSLDNTQGLGRGNIANDGLLTLKNVTGELRNSISGKGIVSATARTDVELD GDNSRFVGQFNIDTGSALSVNEQKNLGDASVINNGLLTISTERSWAMTHSISGSGDVTKL GTGILTLNNDSAAYQGTTDIVGGEIAFGSDSAINMASQHINIHNSGVMSGNVTTAGDVNV MPGGTLRVAKTTIGGNLENGGTVQMNSEGGKPGNVLTVNGNYTGNNGLMTFNATLGGDNS PTDKMNVKGDTQGNTRVRVDNIGGVGAQTVNGIELIEVGGNSAGNFALTTGTVEAGAYVY TLAKGKGNDEKNWYLTSKWDGVTPPDTPDPINNPPVVDPEGPSVYRPEAGSYISNIAAAN SLFSHRLHDRLGEPQYTDSLHSQGSASSMWMRHVGGHERSRAGDGQLNTQANRYVLQLGG DLAQWSSNAQDRWHLGVMAGYANQHSNTQSNRVGYKSDGRISGYSAGIYATWYQNDANKT GAYVDSWALYNWFDNSVSSDNRSADDYDSRGVTASVEGGYTFEAGTFSGSEGTLNTWYVQ PQVQITWMGVKDSDHTRKDGTRIETEGDGNVQTRLGVKTYLNSHHQRDDGKQREFQPYIE ANWINNSKVYAVKMNGQTVSRDGARNLGEVRTGVEAKVNNNLSLWGNVGVQLGDKGYSDT QGMLGVKYSW >gi|296493370|gb|ADTK01000131.1| GENE 2 2403 - 2669 271 88 aa, chain + ## HITS:1 COG:hipB KEGG:ns NR:ns ## COG: hipB COG1396 # Protein_GI_number: 16129467 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 88 1 88 88 137 97.0 4e-33 MMSFQKIYSPTQLANAMKLVRQQNGWTQSELAKKIGIKQATISNFENNPDNTTLTTFFKI LQSLELSMTLCDTKNASPESTEQQDLEW >gi|296493370|gb|ADTK01000131.1| GENE 3 2669 - 3991 577 440 aa, chain + ## HITS:1 COG:hipA KEGG:ns NR:ns ## COG: hipA COG3550 # Protein_GI_number: 16129466 # Func_class: R General function prediction only # Function: Uncharacterized protein related to capsule biosynthesis enzymes # Organism: Escherichia coli K12 # 1 440 1 440 440 896 98.0 0 
MPKLVTWMNNQRVGELTKLANGAHTFKYAPEWLASRYARPLSLSLPLQRGNITSDAVFNF FDNLLPDSPIVRDRIVKRYHAKSRQPFDLLSEIGRDSVGAVTLLPENETITRPIMAWEKL TEARLEEVLTAYKADIPLGMIREENDFRISVAGAQEKTALLRIGNDWCIPKGITPTTHII KLPIGEIRQPNATLDLSQSVDNEYYCLLLAKELGLNVPDAEIIKAGRVRALAVERFDRRW NTERTVLLRLPQEDMCQTFGLPSSVKYESDGGPGIARIMAFLMGSSEALRDRYDFMKFQV FQWLIGATDGHAKNFSVFIQAGGSYRLTPFYDIISAFPVLGGTGIHISDLKLAMGLNASK GKKTAIDKIYPRHFLATAKVLRFPEVQMHEILSDFARMIPAALDNVKTSLPTDFPENVVT AVETNVLRLHGRLSREYGSK >gi|296493370|gb|ADTK01000131.1| GENE 4 4309 - 4488 56 59 aa, chain + ## HITS:1 COG:yneL KEGG:ns NR:ns ## COG: yneL COG2207 # Protein_GI_number: 16129465 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 59 1 59 59 115 98.0 3e-26 MSPLRYQKWLRLNEVRRPMLNEHYDVTTAAYAVGYESYPISVGNIRGCLESHPREILPG >gi|296493370|gb|ADTK01000131.1| GENE 5 4943 - 5506 309 187 aa, chain + ## HITS:1 COG:ECs2113 KEGG:ns NR:ns ## COG: ECs2113 COG3539 # Protein_GI_number: 15831367 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli O157:H7 # 1 187 1 187 187 283 99.0 1e-76 MKLKHVGMIAVSVLAMSSAAVSAAEGDESVTTTVNGGVIHFKGEVVNAACAIDSESMNQT VELGQVRSSRLAKAGDLSSAVGFNIKLNDCDTNVSSNAAVAFLGTTVTSNDDTLALQSSA AGSAQNVGIQILDRTGEVLILDGATFSAKTDLIDGTNILPFQARYIALGQSVAGTANADA TFKVQYL >gi|296493370|gb|ADTK01000131.1| GENE 6 5867 - 6577 436 236 aa, chain + ## HITS:1 COG:Z2201 KEGG:ns NR:ns ## COG: Z2201 COG3121 # Protein_GI_number: 15801632 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli O157:H7 EDL933 # 1 236 4 239 239 435 99.0 1e-122 MQTTRTPYSISFMATVLLLLLFACHSTVANAAVALGATRVIYPANQKQVLLPVTNNDPAS VYLIQSWIENAGDQKDTQFVITPPLFSMQGKKENTLRIINATNHQLPGDRESLFWVNVKA IPAMEKDQKNENTLQLAIISRIKMFYRPTNLAMAPEEAPAMLRFRRSGSKLTLINPTPYF 
ITVTNMKAGNSNLPNTMVPPKGEVSVDITHAATGDISFQTINDYGALTPRIKATMQ >gi|296493370|gb|ADTK01000131.1| GENE 7 6619 - 9270 884 883 aa, chain + ## HITS:1 COG:Z2203 KEGG:ns NR:ns ## COG: Z2203 COG3188 # Protein_GI_number: 15801633 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli O157:H7 EDL933 # 1 883 1 883 883 1664 98.0 0 MTAIRAAFKAYRMHQVLIMPRFARLTIALGLATAVFPVDAEYYFNPRFLSNDLAESVDLS AFTKGREAPPGTYRVDIYLNDEFMTSRDITFIADDNNAELIPCLSTDLLVSLGIKKSALL DNKEHSAEKHVPDNSACTPLQDRLADASTEFDVGQQHLSLSVPQIYVGRMARGYVSPDLW EEGINAGLLNYSFNGNSINNRSNHNAGKSNYAYLNLQSGINIGSWRLRDNSTWSYNSGSS NSSDSNKWQHINTSAERDIIPLRSRLTVGDSYTDGDIFDSVNFRGLKINSTEAMLPDSQH GFAPVIHGIARGTAQVSVKQNGYDVYQTTVPPGPFTIDDINSATNGGDLQVTIKEADGSI QTLYVPYSSVPVLQRAGYTRYALAMGEYRSGNNLQSSPKFIQGSLMHGLEGNWTPYGGMQ IAEDYQAFNLGIGKDLGLFGAFSFDITQANTTLADGTRHSGQSVKSVYSKSFYQTGTNIQ VAGYRYSTQGFYNLSDSAYSRMSGYTVKPPTGDTNEQTQFIDYFNLFYSKRGQEQISISQ QLGNYGTTFFSASRQSYWNTSRSDQQISFGLNVPFGDITTSLNYSYSNNIWQNDRDHLLA FTLNVPFSHWMRTDSQSAFRNSNASYSMSNDLKGGMTNLSGVYGTLLPDNNLNYSVQVGN THGGNTSSGTSGYSSLNYRGAYGNTNVGYSRNGDSSQIYYGMSGGIIAHADGITFGQPLG DTMVLVKAPGADNVKIENQTGIHTDWRGYAILPFATEYRENRVALNANSLADNVELDETV VTVIPTHGAIARATFNAQIGGKVLMTLKYGNKSVPFGAIVTHGENKNGSIVAENGQVYLT GLPQSGKLQVSWGNDKNSNCIVDYKLPEVSPGTLLNQQTAICR >gi|296493370|gb|ADTK01000131.1| GENE 8 9284 - 9814 350 176 aa, chain + ## HITS:1 COG:ECs2109 KEGG:ns NR:ns ## COG: ECs2109 COG3539 # Protein_GI_number: 15831363 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli O157:H7 # 1 176 1 176 176 338 99.0 2e-93 MKYNNVIFLGLCLGLTTYSALSADSVIKISGRVLDYGCTVSSDSLNFTVDLQKNSARQFP TTGSTSPAVPFQITLSECSKGTTGVRVAFNGIEDAENNTLLKLDEGSNTASGLGIEILDG NMRPVKLNDLHAGMQWIPLVPEQNNILPYSARLKSTQKSVNPGLVRASATFTLEFQ >gi|296493370|gb|ADTK01000131.1| GENE 9 9827 - 10330 253 167 aa, chain + ## HITS:1 COG:ydeR KEGG:ns NR:ns ## COG: 
ydeR COG3539 # Protein_GI_number: 16129462 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 167 1 167 167 305 98.0 2e-83 MKRLHKRFLLATFCALFTATLQAADVTITVNGRVVAKPCTIQTKEANVNLGDLYTRNLQQ PGSASGWHNITLSLTDCPVETSAVTAIVTGSTDNTGYYKNEGTAENIQIELRDDQDATLK NGDSKTVIVDEITRNAQFPLKARAITVNGNACQGTIEALINVIYTWQ >gi|296493370|gb|ADTK01000131.1| GENE 10 10390 - 11304 334 304 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_1512 NR:ns ## KEGG: ECIAI1_1512 # Name: ydeQ # Def: putative fimbrial-like exported adhesin protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 304 1 304 304 572 100.0 1e-162 MGKTISIKVLFGIYLLLMAGKVFAFSCNVDGGSSIGAGTTSVYVNLDPVIQPGQNLVVDL SQHISCWNDYGGWYDTDHINLVQGSAFAGSLQSYKGSLYWNNVTYPFPLTTNTNVLDIGD KTPMPLPLKLYITPVGAAGGVVIKAGEVIARIHMYKIATLGSGNPRNFTWNIISNNSVVM PTGGCTVDSRNVTVDLPDFPGSAEIPLGVYCSSEQKLSFYLSGATTDSARQVFANTAPDA TKASGVGVSLMRNGKILATGENVSLGTVNKSKVPLGLSATYGQTGNKVSAGTVQSVIGVT FIYE >gi|296493370|gb|ADTK01000131.1| GENE 11 11638 - 13917 1262 759 aa, chain + ## HITS:1 COG:ydeP KEGG:ns NR:ns ## COG: ydeP COG0243 # Protein_GI_number: 16129460 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 759 1 759 759 1564 99.0 0 MKKKIESYQGAAGGWGAVKSVANAVRKQMDIRQDVIAMFDMNKPEGFDCPGCAWPDPKHS ASFDICENGAKAIAWEVTDKQVNASFFAENTVQSLLTWGDHELEAAGRLTQPLKYDAVSD CYKPLSWQQAFDEIGARLQSYSDPNQVEFYTSGRTSNEAAFLYQLFAREYGSNNFPDCSN MCHEPTSVGLAASIGVGKGTVLLEDFEKCDLVICIGHNPGTNHPRMLTSLRALVKRGAKM IAINPLQERGLERFTAPQNPFEMLTNSETQLASAYYNVRIGGDMALLKGMMRLLIERDDA ASAAGRPSLLDDEFIQTHTVGFDELRRDVLNSEWKDIERISGLSQTQIAELADAYAAAER TIICYGMGITQHEHGTQNVQQLVNLLLMKGNIGKPGAGICPLRGHSNVQGDRTVGITEKP SAEFLARLGERYGFTPPHAPGHAAIASMQAICTGQARALICMGGNFALAMPDREASAVPL TQLDLAVHVATKLNRSHLLTARHSYILPVLGRSEIDMQKNGAQAVTVEDSMSMIHASRGM LKPAGVMLKSECAVVAGIAQAALPQSVVAWEYLVEDYDRIRNDIEAVLPEFADYNQRIRH 
PGGFHLINAAAERRWMTPSGKANFITSKGLLEDPSSAFNSKLVMATVRSHDQYNTTIYGM DDRYRGVFGQRDVVFMSAKQAKICRVKNGERVNLIALTPDGKRSSRRMDRLKVVIYPMAD RSLVTYFPESNHMLTLDNHDPLSGIPGYKSIPVELEPSN >gi|296493370|gb|ADTK01000131.1| GENE 12 14165 - 14362 167 65 aa, chain + ## HITS:1 COG:no KEGG:G2583_1863 NR:ns ## KEGG: G2583_1863 # Name: not_defined # Def: two-component-system connector protein YneN # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 65 1 65 65 88 96.0 8e-17 MHATTVKNKITQRDNYKEIMSVIVVVLLLTLTLIAIFSAIDQLGISEMGRIARDLTHFII NSLQD >gi|296493370|gb|ADTK01000131.1| GENE 13 14437 - 15198 393 253 aa, chain + ## HITS:1 COG:ydeO KEGG:ns NR:ns ## COG: ydeO COG2207 # Protein_GI_number: 16129458 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 253 1 253 253 503 98.0 1e-142 MSLVCSVIFIHHAFNANILDKDYAFSDGEILMVDNAVRTHFEPYERHFKEIGFNENTIKK YLQCTNIQTVTVPVPAKFLRASNVPTGLLNEMIAYLNSEERNHHNFSELLLFSCLSIFAA CKGFITLLTNGVLSVSGKVRNIVNMKLAHPWKLKDICDCLYISESLLKKKLKQEQTTFSQ ILLDARMQHAKNLIRVEGSVNKIAEQCGYASTSYFIYAFRKHFGNSPKRVSKEYRCQSHT GMNTGNTMSALAI >gi|296493370|gb|ADTK01000131.1| GENE 14 15600 - 17282 1309 560 aa, chain + ## HITS:1 COG:ydeN KEGG:ns NR:ns ## COG: ydeN COG3119 # Protein_GI_number: 16129457 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Escherichia coli K12 # 1 560 12 571 571 1090 100.0 0 MKSALKKSVVSTSISLILASGMAAFAAHAADDVKLKATKTNVAFSDFTPTEYSTKGKPNI IVLTMDDLGYGQLPFDKGSFDPKTMENREVVDTYKIGIDKAIEAAQKSTPTLLSLMDEGV RFTNGYVAHGVSGPSRAAIMTGRAPARFGVYSNTDAQDGIPLTETFLPELFQNHGYYTAA VGKWHLSKISNVPVPEDKQTRDYHDNFTTFSAEEWQPQNRGFDYFMGFHAAGTAYYNSPS LFKNRERVPAKGYISDQLTDEAIGVVDRAKTLDQPFMLYLAYNAPHLPNDNPAPDQYQKQ FNTGSQTADNYYASVYSVDQGVKRILEQLKKNGQYDNTIILFTSDNGAVIDGPLPLNGAQ KGYKSQTYPGGTHTPMFMWWKGKLQPGNYDKLISAMDFYPTALDAADISIPKDLKLDGVS LLPWLQDKKQGEPHKNLTWITSYSHWFDEENIPFWDNYHKFVRHQSDDYPHNPNTEDLSQ FSYTVRNNDYSLVYTVENNQLGLYKLTDLQQKDNLAAANPQVVKEMQGVVREFIDSSQPP LSEVNQEKFNNIKKALSEAK 
>gi|296493370|gb|ADTK01000131.1| GENE 15 17334 - 18491 448 385 aa, chain + ## HITS:1 COG:ydeM KEGG:ns NR:ns ## COG: ydeM COG0641 # Protein_GI_number: 16129456 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Escherichia coli K12 # 1 385 6 390 390 794 99.0 0 MHVTAKPSSFQCNLKCDYCFYLEKESQFTHEKWMDDSTLKEFIKQYIAASGNQVYFTWQG GEPTLAGLDFFRKVIHYQQRYAGQKRIFNALQTNGILLNNEWCSFLKEHEFLVGISIDGP QELHDRYRRSNSGNGTFAKVIAAIERLKSYQVEFNTLTVINNVNVHYPLEVYHFLKSIGS KHMQFIELLETGTPNIDFSGHSENTFRIIDFSVPPTAYGKFMSTIFMQWVKNDVGEIFIR QFESFVSRFLGNGHTSCIFQESCKDNLVVESNGDIYECDHFVYPQYKIGNINKSELKTMN SVQLTAQKKRISAKCQQCAYKPICNGGCPKHRITKVNNETVSYFCEGYKILFSTMVPYMN AMVELAKNRVPLYHIMDVAKQMENN >gi|296493370|gb|ADTK01000131.1| GENE 16 18782 - 18982 148 66 aa, chain + ## HITS:1 COG:ECs2101 KEGG:ns NR:ns ## COG: ECs2101 COG4178 # Protein_GI_number: 15831355 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease and ATPase components # Organism: Escherichia coli O157:H7 # 1 66 1 66 561 109 100.0 1e-24 MITIPITFCMLIAKYLCLLKPFWLRKNNKTSVLLIIIILAMILGVVKIQVWLNDWNNDFF NALSQK >gi|296493370|gb|ADTK01000131.1| GENE 17 19271 - 20467 727 398 aa, chain + ## HITS:1 COG:ECs2101 KEGG:ns NR:ns ## COG: ECs2101 COG4178 # Protein_GI_number: 15831355 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease and ATPase components # Organism: Escherichia coli O157:H7 # 1 398 164 561 561 791 98.0 0 MLITFTVILWQSAGTLSFTVGGTEWNIQGYMVYTVVLIVIGGTLFTHKVGKRIRPLNVEK QRSEATFRTNLVQHNKQAELIALSNAESLQRQELRDNFHTIKENWHRLMNRQRWLDYWQN IYSRSLSVLPYFLLLPQFISGQINLGGLMKSRQAFMLVSNNLSWFIYKYDELAELAAVID RLYEFHQLTEQRPTNKPKNCQHAVQVANASIRTPDNKIILENLNFHVSPGKWLLLKGYSG AGKTTLLKTLSHCWPWFKGDISSPADSWYVSQTPLIKSGLLKEIICKALPLPVDDKSLSE VLHQVGLGKLAARIHDHDRWGDILSSGEKQRIALARLILRRPKWIFLDETTSHLEEQEAI RLLRLVREKLPTSGVIMVTHQPGVWNLADDICDISTVL >gi|296493370|gb|ADTK01000131.1| GENE 18 20505 - 22877 1823 790 aa, 
chain + ## HITS:1 COG:no KEGG:JW1490 NR:ns ## KEGG: JW1490 # Name: yddB # Def: predicted porin protein # Organism: E.coli_J # Pathway: not_defined # 1 790 1 790 790 1518 99.0 0 MKRVLIPGVILCGADVAQAVDDKNMYMHFFEEMTVYAPVPVPVNGNTHYTSESIERLPTG NGNISDLLRTNPAVRMDSTQSTSLNQGDIRPEKISIHGASPYQNAYLIDGISATNNLNPA NESDASSATNISGMSQGYYLDVSLLDNVTLYDSFVPVEFGRFNGGVIDAKIKRFNADDSK VKLGYRTTRSDWLTSHIDENNKSAFNQGSSGSTYYSPDFKKNFYTLSFNQELADNFGVTA GLSRRQSDITRADYVSNDGIVAGRAQYKNVIDTALSKFTWFASDRFTHDLTLKYTGSSRD YNTSTFPQSDREMGNKSYGLAWDMDTQLAWAKLRTTVGWDHISDYTRHDHDIWYTELSCT YGDITGRCTRGGLGHISQAVDNYTFKTRLDWQKFAVGNVSHQPYFGAEYIYSDAWTERHN QSESYVINAAGKKTNHTIYHKGKGRLGIDNYTLYMADRISWRNVSLMPGVRYDYDNYLSN HNISPRFMTEWDIFANQTSMITAGYNRYYGGDILDMGLRDIRNSWTESVSGNKTLTRYQD LKTPYNDELAMGLQQKIGKNVIARANYVYREAHDQISKSSRTDSATKTTITEYNNDGKTK THSFSLSFELAEPLHIRQVDINPQIVFSYIKSKGNLSLNNGYEESNTGDNQVVYNGNLVS YDSVPVADFNNPLKISLNMDFTHQPSGLVWANTLAWQEARKARIILGKTNAQYISEYSDY KQYVDEKLDSSLTWDTRLSWTPQFLKQQNLTISADILNVLDSKTAVDTTNTGVATYASGR TFWLDVSMKF >gi|296493370|gb|ADTK01000131.1| GENE 19 22934 - 25717 1867 927 aa, chain + ## HITS:1 COG:pqqL KEGG:ns NR:ns ## COG: pqqL COG0612 # Protein_GI_number: 16129453 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Escherichia coli K12 # 1 927 5 931 931 1681 98.0 0 MRNLCFLLTLVATLLLPGRLIAAALPQDEKLITGQLDNGLRYMIYPHAHPKDQVNLWLQI HTGSLQEEDNERGVAHFVEHMMFNGTKTWPGNKVIETFESMGLRFGRDVNAYTSYDETVY QVSLPTTQKQNLQQVMAIFSEWSNAATFEKLEVDAERGVITEEWRAHQDAKWRTSQARRP FLLANTRNLDREPIGLMDTVATVTPAQLRQFYQRWYQPNNMTFIVVGDIDSKEALALIKD NLSKLPANKAAENRVWPTKAENHLRFNIINDKENRVNGIALYYRLPMVQVNDEQSFVEQA EWSMLVQLFNQRLQERIQSGELKTISGGTARSVKIAPDYQSLFFRVNARDDNMQDAANAL MAELATIDQHGFSAEELDDVKSTRLTWLKNAVDQQAERDLRMLTSRLASSSLNNTPFLSP EETYQLSKRLWQQITVQSLAEKWQQLRKNQDAFWEQMVNNELAAKKALSPAAILALEKEY ANKKLAAYIFPGRNLSLTVDADPQAEISSKETLAENLTSLTLSNGARVILAKSAGEEQKL QITAVSNKGDLSFPAQQKSLITLANKAVSGSGVGELSSSSLKRWSAENSVTMSSKVSGMN TLLSVSARTNNPEPGFQLINQRITHSTINDNIWASLQNAQIQALKTLDQRPAEKFAQQMY 
ETRYADDRTKLLQENQIVQFTAADALAADRQLFSSPADITFVIVGNVSEDKLVALITRYL GSIKHSDSPLAAGKPLTRATDNASVTVKEQNEPVAQVSQWKRYDSRTPVNLATRMALDAF NVALAKDLRVNIREQASGAYSVSSRLSVDPQAKDISHLLAFTCQPERHDELLTLANEVMV KRLAKGISEQELNEYQQNVQRSLDIQQRSVQQLANTIVNSLIQYDDPAAWTEQEQLLKQM TVENVNTAVKQYLSHPVNTYTGVLLQK >gi|296493370|gb|ADTK01000131.1| GENE 20 26079 - 26349 201 90 aa, chain + ## HITS:1 COG:ECs2098 KEGG:ns NR:ns ## COG: ECs2098 COG0076 # Protein_GI_number: 15831352 # Func_class: E Amino acid transport and metabolism # Function: Glutamate decarboxylase and related PLP-dependent proteins # Organism: Escherichia coli O157:H7 # 1 90 1 90 466 191 98.0 4e-49 MDKKQVTDLRSELLDSRFGAKSISTIAESKRFPLHEMRDDVAFQIINDELYLDGNARQNL ATFCQTWDDDNVHKLMDLSINKNWIDKEEY Prediction of potential genes in microbial genomes Time: Mon May 16 15:29:51 2011 Seq name: gi|296493369|gb|ADTK01000132.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont316.1, whole genome shotgun sequence Length of sequence - 1149 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 72 - 131 1.9 1 1 Tu 1 . 
+ CDS 194 - 1148 364 ## COG4584 Transposase and inactivated derivatives Predicted protein(s) >gi|296493369|gb|ADTK01000132.1| GENE 1 194 - 1148 364 318 aa, chain + ## HITS:1 COG:AGl6 KEGG:ns NR:ns ## COG: AGl6 COG4584 # Protein_GI_number: 15890093 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 318 31 351 530 293 46.0 2e-79 MPTVPISMRKLKEILRLKYGVGLSHRQIGRSLAISPSVVSRYANRAAQLGIKQWPLPTGW DDTKLKHAFLQTQVKMKKHSLPDWATVHRELRNKCVTLQLLWEEYCERNPGGFYSYNHYC RMYREWLKTTSPSMRQVHKAGEKLFVDYCGPTVGVTDPETGEIRTAQVIVAVLGASSYTW AEATWSQQLEDWVMSHVRCFQWLGGVPELVVPDNLKSATSRACKYDPDVNPTYQQMLEHY NVAVLPARPRKPKDKAKAEVGVQVVERWIMARIRHEIFYSLASLNQRIRELLERLNNKIM QKLGYSRAELFIQLDKPA Prediction of potential genes in microbial genomes Time: Mon May 16 15:29:52 2011 Seq name: gi|296493368|gb|ADTK01000133.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont322.1, whole genome shotgun sequence Length of sequence - 3927 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 5, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 138 - 347 94 ## gi|300903850|ref|ZP_07121754.1| hypothetical protein HMPREF9536_01963 - Prom 507 - 566 4.8 2 2 Tu 1 . - CDS 890 - 1102 132 ## gi|300903852|ref|ZP_07121756.1| hypothetical protein HMPREF9536_01965 - Prom 1256 - 1315 3.9 - Term 1408 - 1446 -1.0 3 3 Tu 1 . - CDS 1451 - 1612 115 ## UTI89_C4885 hypothetical protein - Prom 1819 - 1878 2.9 + Prom 2233 - 2292 5.4 4 4 Tu 1 . + CDS 2388 - 2864 86 ## ECP_4526 hypothetical protein + Term 2872 - 2916 -0.5 + Prom 2894 - 2953 5.1 5 5 Tu 1 . 
+ CDS 2974 - 3714 628 ## COG3637 Opacity protein and related surface antigens + Term 3793 - 3827 5.2 Predicted protein(s) >gi|296493368|gb|ADTK01000133.1| GENE 1 138 - 347 94 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|300903850|ref|ZP_07121754.1| ## NR: gi|300903850|ref|ZP_07121754.1| hypothetical protein HMPREF9536_01963 [Escherichia coli MS 84-1] # 1 69 45 113 113 123 98.0 4e-27 MGYAENKSNSVVKNAYVKFNLLKDGVVIGQTIDTASNLEPGQKWKIQAFINTFNGSPDSF KVTEVKIVN >gi|296493368|gb|ADTK01000133.1| GENE 2 890 - 1102 132 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|300903852|ref|ZP_07121756.1| ## NR: gi|300903852|ref|ZP_07121756.1| hypothetical protein HMPREF9536_01965 [Escherichia coli MS 84-1] # 1 70 19 88 88 125 100.0 7e-28 MDTCFKILQLKFDQKLANRCIGLAKHQWGLISILRHRRIINTSMVRSNTSVVSPWIKSSN CSQIKTRREF >gi|296493368|gb|ADTK01000133.1| GENE 3 1451 - 1612 115 53 aa, chain - ## HITS:1 COG:no KEGG:UTI89_C4885 NR:ns ## KEGG: UTI89_C4885 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 51 1 51 99 93 98.0 2e-18 MALLWNQAAGSINLEFISYWNSGVLAAPDNHLTLEERSALQKLWGGLETGDAI >gi|296493368|gb|ADTK01000133.1| GENE 4 2388 - 2864 86 158 aa, chain + ## HITS:1 COG:no KEGG:ECP_4526 NR:ns ## KEGG: ECP_4526 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_536 # Pathway: not_defined # 1 122 1 122 122 229 100.0 2e-59 MSAQEKTVQIDSFDAKFLDINTLSRLIYERKSFVIENVSDVSETVKRVEHEIEKTKLSCR VYTEYRSTALAGSLWSPTVILGVASAVAIGVHNLSTWNPDYEIGKNYIKRRLSVKYKKAE VLCSVTPVPLPVCILGRLYSHNFRQAFTSSLRRYIYFV >gi|296493368|gb|ADTK01000133.1| GENE 5 2974 - 3714 628 246 aa, chain + ## HITS:1 COG:STM0306 KEGG:ns NR:ns ## COG: STM0306 COG3637 # Protein_GI_number: 16763689 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Opacity protein and related surface antigens # Organism: Salmonella typhimurium LT2 # 1 246 1 239 239 147 39.0 2e-35 MNKVFVVSVVVAACVFAANAGAKEGKSGFYLTGKAGASVVSLSDQRFLSGDEEETSKYKG GDDHDTVFSGGIAAGYDFYPQFSIPVRTELEFYARGKADSKYNVDKDSWSGGYWRDDLKN 
EVSVNTLMLNAYYDFRNDSAFTPWVSAGIGYARIHQKTTGISTWDYEYGSSGRESLSRSG SADNFAWSLGAGVRYDVTPDIALDLSYRYLDAGDSSVSYKDEWGDKYKSEVDVKSHDIML GMTYNF Prediction of potential genes in microbial genomes Time: Mon May 16 15:30:06 2011 Seq name: gi|296493367|gb|ADTK01000134.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont322.2, whole genome shotgun sequence Length of sequence - 4560 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 43 - 102 3.3 1 1 Tu 1 . + CDS 303 - 1493 623 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase - Term 1807 - 1851 0.1 2 2 Tu 1 . - CDS 1853 - 2896 541 ## ECUMN_3323 SHI-2 pathogenicity island protein - Prom 2993 - 3052 6.0 3 3 Tu 1 . - CDS 3152 - 4417 374 ## PROTEIN SUPPORTED gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 - Prom 4473 - 4532 2.8 Predicted protein(s) >gi|296493367|gb|ADTK01000134.1| GENE 1 303 - 1493 623 396 aa, chain + ## HITS:1 COG:yjgX KEGG:ns NR:ns ## COG: yjgX COG2194 # Protein_GI_number: 16132096 # Func_class: R General function prediction only # Function: Predicted membrane-associated, metal-dependent hydrolase # Organism: Escherichia coli K12 # 98 245 1 148 148 263 91.0 4e-70 MIPVYHFLVSAAILVFMVIFWRTHHRGHRNWLALLLFVLCSVNSWPLRMVKGTVVGTTDT LREMQRYKQLNQHGADNWKILPGVPLYDTIVIVTGESVRRDYMSVYGYPVPTTPWLNTAP GLFIDGYTSAAASTVPSLSRTLIYDYEQNPDSGNNVVALAAKAGYSTWWISNQGKLGEHD TRISVIASDAEHTVFLKKGSFASRKTDDMLLLQETERALADKSSPKVIFLHMMGSHPNPC DRLHSWPNHYLEQYPRKVACYLASISKLDNFLGQLDGILRRHSRHFVMLYFSDHGLSVSD SANPVHHDGHVQGGYSVPLIITASDITSHQSVSRKISARHFAGIFQWLTGIRTENIPPFN LLTDEDNEPIMVFNGERNVPADSLKPQPLILPVKGK >gi|296493367|gb|ADTK01000134.1| GENE 2 1853 - 2896 541 347 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_3323 NR:ns ## KEGG: ECUMN_3323 # Name: not_defined # Def: SHI-2 pathogenicity island protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 347 11 357 357 717 99.0 0 
MTGKLRFEVNDNQGCFIFPETWFGSLLDEFEELIDAYDADEISETSYINKLRRLARQEND FIDVHAHLAYVFLEQNAPRKALNAALKGLAVGNRLIPEGFSGRIIWIHPDNRPFLRALYA AILANAHLQRHQDAIMLIEKILDYNPEDNHGARWLLGPELLRTGAHEQARHILQEHADEF SPYWYELGLLHFLNGELVKAATAFRRGFAANTYIAEILCGNLHPFPLAVWHNFSGGPDTA EDYYATYHPLWGQYPEALLFVNWLYNHSSVLHERAEIIKCAEMLMQEDDFEICESILRQQ EKLRERIDETLSEKIVQKCRNMNGEYVWPWILPFSAAGMKHTGIQYQ >gi|296493367|gb|ADTK01000134.1| GENE 3 3152 - 4417 374 421 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 [Campylobacter concisus 13826] # 49 401 60 402 406 148 27 7e-36 MALTDAKIRAAKPTDKAYKLTDGAGMFLLVHPNGSRYWRLRYRILGKEKTLALGVYPEVS LSEARTKRDEARKLISEGIDPCEQKRVKKVVPDLQLSFEHIARRWHASNKQWAQSHSDKV LKSLETHVFPFIGNRDITTLNTPDLLIPVRAAEAKQIYEIASRLQQRISAVMRYAVQSGI IRYNPALDMAGALTSVKRQHRPALDLSRLPELLSRINSYKGQPVTRLAVMLNLLVFIRSS ELRYARWAEIDIDNSMWTIPAERKPLPGVKFSHRGSKMRTPHLVPLSKQAVAILTELQTW AGENGLIFTGAHDPRKPISENTVNKALRVMGYDTTKEVCGHGFRAMACSALIESGLWSRD AVERQMSHQERNGVRAAYIHKAEHLEERRLMLQWWADFLDANREECISPFEYAKVNNPLK R Prediction of potential genes in microbial genomes Time: Mon May 16 15:30:12 2011 Seq name: gi|296493366|gb|ADTK01000135.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont332.1, whole genome shotgun sequence Length of sequence - 1728 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 306 289 ## COG3209 Rhs family protein - Prom 354 - 413 8.1 - Term 754 - 797 1.1 2 2 Op 1 . - CDS 886 - 1299 174 ## ECIAI1_0251 conserved hypothetical protein, hypothetical protein 450 3 2 Op 2 . 
- CDS 1329 - 1631 83 ## COG5433 Transposase Predicted protein(s) >gi|296493366|gb|ADTK01000135.1| GENE 1 3 - 306 289 101 aa, chain - ## HITS:1 COG:Z0268 KEGG:ns NR:ns ## COG: Z0268 COG3209 # Protein_GI_number: 15799917 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli O157:H7 EDL933 # 1 101 1 98 1404 183 90.0 8e-47 MSGKPAARQGDMTRKGLDIVQGSAGVLIGAPTGVACSVCPKKKDSPNYGNPVNPVLGAKV LPGETDIALPGPLPFILSRAYSSYRTRTPAPVGVFGPGWKA >gi|296493366|gb|ADTK01000135.1| GENE 2 886 - 1299 174 137 aa, chain - ## HITS:1 COG:no KEGG:ECIAI1_0251 NR:ns ## KEGG: ECIAI1_0251 # Name: not_defined # Def: conserved hypothetical protein, hypothetical protein 450 # Organism: E.coli_IAI1 # Pathway: not_defined # 1 137 10 146 159 231 97.0 6e-60 MLVGAAVGAVVIGGGIVIYNNREAVGDAIEKSMEMRGEAELLNAQSQMECWREPYGAFSS VFGNDDSSKENPNVGKDLTDEEKNCLGSGSTGGSGGWEPDEDGEIRNTYRSIKDAPQYPR GVRNVQNGTTRNVVKDQ >gi|296493366|gb|ADTK01000135.1| GENE 3 1329 - 1631 83 100 aa, chain - ## HITS:1 COG:yhhI KEGG:ns NR:ns ## COG: yhhI COG5433 # Protein_GI_number: 16131356 # Func_class: L Replication, recombination and repair # Function: Transposase # Organism: Escherichia coli K12 # 1 100 279 378 378 202 98.0 2e-52 MMVRYYISSADLTAEKFATAIRNHWHVENNLHWRLDVVMNEDDCKIRRGNAAELFSGIRH IAINILTNDKVFKAGLRRKMRKAAMDRNYLASVLAGSGLS Prediction of potential genes in microbial genomes Time: Mon May 16 15:30:16 2011 Seq name: gi|296493365|gb|ADTK01000136.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont334.1, whole genome shotgun sequence Length of sequence - 6270 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 32 - 181 171 ## JW1988 hypothetical protein + Term 336 - 374 5.0 - Term 1028 - 1062 6.9 2 2 Op 1 2/0.000 - CDS 1079 - 1408 503 ## COG2926 Uncharacterized protein conserved in bacteria - Prom 1513 - 1572 3.9 3 2 Op 2 4/0.000 - CDS 1580 - 2638 914 ## COG1289 Predicted membrane protein - Prom 2764 - 2823 4.1 - Term 2711 - 2745 1.1 4 2 Op 3 3/0.000 - CDS 2836 - 3309 457 ## COG3449 DNA gyrase inhibitor - Prom 3333 - 3392 7.4 5 2 Op 4 . - CDS 3428 - 4471 960 ## COG1686 D-alanyl-D-alanine carboxypeptidase - Prom 4567 - 4626 2.9 + Prom 4655 - 4714 2.8 6 3 Tu 1 . + CDS 4803 - 6230 1264 ## COG2925 Exonuclease I Predicted protein(s) >gi|296493365|gb|ADTK01000136.1| GENE 1 32 - 181 171 49 aa, chain + ## HITS:1 COG:no KEGG:JW1988 NR:ns ## KEGG: JW1988 # Name: yeeW # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 49 16 64 64 89 100.0 4e-17 MGHIVVDIDGVNITELINKAAENGYSLRVVDDRDSTETPATYASPHQLL >gi|296493365|gb|ADTK01000136.1| GENE 2 1079 - 1408 503 109 aa, chain - ## HITS:1 COG:ECs2809 KEGG:ns NR:ns ## COG: ECs2809 COG2926 # Protein_GI_number: 15832063 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 109 23 131 131 177 100.0 4e-45 METTKPSFQDVLEFVRLFRRKNKLQREIQDVEKKIRDNQKRVLLLDNLSDYIKPGMSVEA IQGIIASMKGDYEDRVDDYIIKNAELSKERRDISKKLKAMGEMKNGEAK >gi|296493365|gb|ADTK01000136.1| GENE 3 1580 - 2638 914 352 aa, chain - ## HITS:1 COG:yeeA KEGG:ns NR:ns ## COG: yeeA COG1289 # Protein_GI_number: 16129949 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 352 1 352 352 689 100.0 0 MRADKSLSPFEIRVYRHYRIVHGTRVALAFLLTFLIIRLFTIPESTWPLVTMVVIMGPIS FWGNVVPRAFERIGGTVLGSILGLIALQLELISLPLMLVWCAAAMFLCGWLALGKKPYQG LLIGVTLAIVVGSPTGEIDTALWRSGDVILGSLLAMLFTGIWPQRAFIHWRIQLAKSLTE YNRVYQSAFSPNLLERPRLESHLQKLLTDAVKMRGLIAPASKETRIPKSIYEGIQTINRN LVCMLELQINAYWATRPSHFVLLNAQKLRDTQHMMQQILLSLVHALYEGNPQPVFANTEK 
LNDAVEELRQLLNNHHDLKVVETPIYGYVWLNMETAHQLELLSNLICRALRK >gi|296493365|gb|ADTK01000136.1| GENE 4 2836 - 3309 457 157 aa, chain - ## HITS:1 COG:sbmC KEGG:ns NR:ns ## COG: sbmC COG3449 # Protein_GI_number: 16129950 # Func_class: L Replication, recombination and repair # Function: DNA gyrase inhibitor # Organism: Escherichia coli K12 # 1 157 1 157 157 317 100.0 5e-87 MNYEIKQEEKRTVAGFHLVGPWEQTVKKGFEQLMMWVDSKNIVPKEWVAVYYDNPDETPA EKLRCDTVVTVPGYFTLPENSEGVILTEITGGQYAVAVARVVGDDFAKPWYQFFNSLLQD SAYEMLPKPCFEVYLNNGAEDGYWDIEMYVAVQPKHH >gi|296493365|gb|ADTK01000136.1| GENE 5 3428 - 4471 960 347 aa, chain - ## HITS:1 COG:dacD KEGG:ns NR:ns ## COG: dacD COG1686 # Protein_GI_number: 16129951 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Escherichia coli K12 # 1 347 44 390 390 712 99.0 0 MDYTTGQILTAGNEHQQRNPASLTKLMTGYVVDRAIDSHRITPDDIVTVGRDAWAKDNPV FVGSSLMFLKEGDRVSVRDLSRGLIVDSGNDACVALADYIAGGQRQFVEMMNNYAEKLHL KDTHFETVHGLDAPGQHSSAYDLAVLSRAIIHGEPEFYHMYSEKSLTWNGITQQNRNGLL WDKTMNVDGLKTGHTSGAGFNLIASAVDGQRRLIAVVMGADSAKGREEEARKLLRWGQQN FTTVQILHRGKKVGTERIWYGDKENIDLGTEQEFWMVLPKAEIPHIKAKYTLDGKELTAP ISAHQRVGEIELYDRDKQLAHWPLVTLESVGEGSMFSRLSDYFHHKA >gi|296493365|gb|ADTK01000136.1| GENE 6 4803 - 6230 1264 475 aa, chain + ## HITS:1 COG:ECs2813 KEGG:ns NR:ns ## COG: ECs2813 COG2925 # Protein_GI_number: 15832067 # Func_class: L Replication, recombination and repair # Function: Exonuclease I # Organism: Escherichia coli O157:H7 # 1 475 1 475 475 973 99.0 0 MMNDGKQQSTFLFHDYETFGTHPALDRPAQFAAIRTDSEFNVIGEPEVFYCKPADDYLPQ PGAVLITGITPQEARAKGENEAAFAARIHSLFTVPKTCILGYNNVRFDDEVTRNVFYRNF YDPYAWSWQHDNSRWDLLDVMRACYALRPEGINWPENDDGLPSFRLEHLTKANGIEHSNA HDAMADVYATIAMAKLVKTRQPRLFDYLFTHRNKHKLMALIDVPQMKPLVHVSGMFGAWR GNTSWVAPLAWHPENRNAVIMVDLAGDISPLLELDSDTLRERLYTAKADLGDNAAVPVKL VHINKCPVLAQANTLRPEDADRLGINRQHCLDNLKILRENPQVREEVVAIFAEAEPFTPS DNVDAQLYSGFFSDADRAAMKIVLETEPRNLPALDITFVDKRIEKLLFNYRARNFPGTLD YAEQQRWLEHRRQVFTPEFLQGYAEEIQMLAQQYADDKEKVALLKALWQYAEEIV 
Prediction of potential genes in microbial genomes Time: Mon May 16 15:30:21 2011 Seq name: gi|296493364|gb|ADTK01000137.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont334.2, whole genome shotgun sequence Length of sequence - 5949 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 3, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 10/0.000 - CDS 42 - 269 309 ## COG0425 Predicted redox protein, regulator of disulfide bond formation 2 1 Op 2 2/0.000 - CDS 283 - 1341 761 ## COG2391 Predicted transporter component - Prom 1453 - 1512 7.3 - Term 1452 - 1509 14.5 3 2 Op 1 4/0.000 - CDS 1520 - 2878 1520 ## COG0531 Amino acid transporters - Prom 3077 - 3136 4.2 - Term 3024 - 3056 2.3 4 2 Op 2 2/0.000 - CDS 3145 - 4074 823 ## COG0583 Transcriptional regulator 5 2 Op 3 . - CDS 4120 - 4944 797 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Term 4988 - 5025 1.5 6 3 Op 1 6/0.000 - CDS 5027 - 5281 97 ## COG4115 Uncharacterized protein conserved in bacteria 7 3 Op 2 . 
- CDS 5278 - 5529 322 ## COG2161 Antitoxin of toxin-antitoxin stability system - Prom 5556 - 5615 9.1 Predicted protein(s) >gi|296493364|gb|ADTK01000137.1| GENE 1 42 - 269 309 75 aa, chain - ## HITS:1 COG:ECs2814 KEGG:ns NR:ns ## COG: ECs2814 COG0425 # Protein_GI_number: 15832068 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted redox protein, regulator of disulfide bond formation # Organism: Escherichia coli O157:H7 # 1 75 1 75 75 124 100.0 6e-29 MAIKKLDVVTQVCPFPLIEAKAALAEMASGDELVIEFDCTQATEAIPQWAAEEGHAITDY QQIGDAAWSITVQKA >gi|296493364|gb|ADTK01000137.1| GENE 2 283 - 1341 761 352 aa, chain - ## HITS:1 COG:yeeE KEGG:ns NR:ns ## COG: yeeE COG2391 # Protein_GI_number: 16129954 # Func_class: R General function prediction only # Function: Predicted transporter component # Organism: Escherichia coli K12 # 1 352 1 352 352 618 98.0 1e-177 MFSMILSGLICGALLGFVMQRGRFCLTGGFRDMYIAKNNRMFYALLIAISVQSVGVFALI QAGLLTYEAGAFPWLGTVIGGYLFGLGIVLAGGCATGTWYRAGEGLIGSWIALFTYMVMS AVMRSPHASGLNQTLQYYSTEHNSIAETFNLSVWPLVTVLLVITLWVVMKELKKPKLKVA TLPPRRTGIAHILFEKRWHPFVTAVLIGLIALLAWPLSEATGRMFGLGITSPTANILQFL VAGDVKYINWGVFLVLGIFVGSFIAAKASREFRVRAADAQTTLRSGLGGVLMGFGASIAG GCSIGNGLVMTAMMTWQGWIGLVFMILGVWTASWLVYVRPQRKARLATAAAN >gi|296493364|gb|ADTK01000137.1| GENE 3 1520 - 2878 1520 452 aa, chain - ## HITS:1 COG:ECs2816 KEGG:ns NR:ns ## COG: ECs2816 COG0531 # Protein_GI_number: 15832070 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli O157:H7 # 1 452 3 454 454 816 100.0 0 MSHNVTPNTSRVELRKTLTLVPVVMMGLAYMQPMTLFDTFGIVSGLTDGHVPTAYAFALI AILFTALSYGKLVRRYPSAGSAYTYAQKSISPTVGFMVGWSSLLDYLFAPMINILLAKIY FEALVPSIPSWMFVVALVAFMTAFNLRSLKSVANFNTVIVVLQVVLIAVILGMVVYGVFE GEGAGTLASTRPFWSGDAHVIPMITGATILCFSFTGFDGISNLSEETKDAERVIPRAIFL TALIGGMIFIFATYFLQLYFPDISRFKDPDASQPEIMLYVAGKAFQVGALIFSTITVLAS GMAAHAGVARLMYVMGRDGVFPKSFFGYVHPKWRTPAMNIILVGAIALLAINFDLVMATA LINFGALVAFTFVNLSVISQFWIREKRNKTLKDHFQYLFLPMCGALTVGALWVNLEESSM 
VLGLIWAAIGLIYLACVTKSFRNPVPQYEDVA >gi|296493364|gb|ADTK01000137.1| GENE 4 3145 - 4074 823 309 aa, chain - ## HITS:1 COG:yeeY KEGG:ns NR:ns ## COG: yeeY COG0583 # Protein_GI_number: 16129956 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 309 8 316 316 618 99.0 1e-177 MKPLLDVLMILDALEKEGSFAAASAKLYKTPSALSYTVHKLESDLNIQLLDRSGHRAKFT RTGKMLLEKGREVLHTVRELEKQAIKLHEGWENELVIGVDDTFPFSLLAPLIEAFYQHHS VTRLKFINGVLGGSWDALTQGRADIIVGAMHEPPSSSEFGFSRLGDLEQVFAVAPHHPLA LEEEPLNRRIIKRYRAIMVGDTAQAGASTASQLLDEQEAITVFDFKTKLELQISGLGCGY LPRYLAQRFLDSGALIEKKVVAQTLFEPVWIGWNEQTAGLASGWWRDEILANSAIAGVYA KSDDGKSAI >gi|296493364|gb|ADTK01000137.1| GENE 5 4120 - 4944 797 274 aa, chain - ## HITS:1 COG:ECs2818 KEGG:ns NR:ns ## COG: ECs2818 COG0451 # Protein_GI_number: 15832072 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Escherichia coli O157:H7 # 1 274 1 274 274 528 100.0 1e-150 MKKVAIVGLGWLGMPLAMSLSARGWQVTGSKTTQDGVEAARMSGIDSYLLRMEPELVCDS DDLDALMDADALVITLPARRSGPGDEFYLQAVQELVDSALAHRIPRIIFTSSTSVYGDAQ GTVKETTPRNPVTNSGRVLEELEDWLHNLPGTSVDILRLAGLVGPGRHPGRFFAGKTAPD GEHGVNLVHLEDVIGAITLLLQAPKGGHIYNICAPAHPARNVFYPQMARLLGLEPPQFRN SLDSGKGKIIDGSRICNELGFEYQYPDPLVMPLE >gi|296493364|gb|ADTK01000137.1| GENE 6 5027 - 5281 97 84 aa, chain - ## HITS:1 COG:AGc3658 KEGG:ns NR:ns ## COG: AGc3658 COG4115 # Protein_GI_number: 15889305 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 84 1 88 89 107 56.0 5e-24 MKLIWSEESWDDYLYWQETDKRIVKKINELIKDTRRTPFEGKGKPEPLKHNLSGFWSRRI TEEHRLVYAVTDDSLLIAACRYHY >gi|296493364|gb|ADTK01000137.1| GENE 7 5278 - 5529 322 83 aa, chain - ## HITS:1 COG:yefM KEGG:ns NR:ns ## COG: yefM COG2161 # Protein_GI_number: 16129958 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Antitoxin of toxin-antitoxin 
stability system # Organism: Escherichia coli K12 # 1 83 10 92 92 140 100.0 8e-34 MRTISYSEARQNLSATMMKAVEDHAPILITRQNGEACVLMSLEEYNSLEETAYLLRSPAN ARRLMDSIDSLKSGKGTEKDIIE Prediction of potential genes in microbial genomes Time: Mon May 16 15:30:25 2011 Seq name: gi|296493363|gb|ADTK01000138.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont334.3, whole genome shotgun sequence Length of sequence - 9153 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 2, operones - 2 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 18/0.000 + CDS 14 - 913 1060 ## COG0040 ATP phosphoribosyltransferase 2 1 Op 2 19/0.000 + CDS 919 - 2223 1291 ## COG0141 Histidinol dehydrogenase 3 1 Op 3 13/0.000 + CDS 2220 - 3290 1113 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 4 1 Op 4 18/0.000 + CDS 3290 - 4357 1233 ## COG0131 Imidazoleglycerol-phosphate dehydratase 5 1 Op 5 25/0.000 + CDS 4357 - 4947 677 ## COG0118 Glutamine amidotransferase 6 1 Op 6 23/0.000 + CDS 4947 - 5684 876 ## COG0106 Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase 7 1 Op 7 24/0.000 + CDS 5666 - 6442 937 ## COG0107 Imidazoleglycerol-phosphate synthase 8 1 Op 8 . + CDS 6436 - 7044 678 ## COG0139 Phosphoribosyl-AMP cyclohydrolase 9 2 Op 1 25/0.000 - CDS 7083 - 8198 684 ## COG0438 Glycosyltransferase 10 2 Op 2 . 
- CDS 8208 - 9152 844 ## COG0438 Glycosyltransferase Predicted protein(s) >gi|296493363|gb|ADTK01000138.1| GENE 1 14 - 913 1060 299 aa, chain + ## HITS:1 COG:hisG KEGG:ns NR:ns ## COG: hisG COG0040 # Protein_GI_number: 16129960 # Func_class: E Amino acid transport and metabolism # Function: ATP phosphoribosyltransferase # Organism: Escherichia coli K12 # 1 299 1 299 299 574 100.0 1e-164 MTDNTRLRIAMQKSGRLSDDSRELLARCGIKINLHTQRLIAMAENMPIDILRVRDDDIPG LVMDGVVDLGIIGENVLEEELLNRRAQGEDPRYFTLRRLDFGGCRLSLATPVDEAWDGPL SLNGKRIATSYPHLLKRYLDQKGISFKSCLLNGSVEVAPRAGLADAICDLVSTGATLEAN GLREVEVIYRSKACLIQRDGEMEESKQQLIDKLLTRIQGVIQARESKYIMMHAPTERLDE VIALLPGAERPTILPLAGDQQRVAMHMVSSETLFWETMEKLKALGASSILVLPIEKMME >gi|296493363|gb|ADTK01000138.1| GENE 2 919 - 2223 1291 434 aa, chain + ## HITS:1 COG:hisD KEGG:ns NR:ns ## COG: hisD COG0141 # Protein_GI_number: 16129961 # Func_class: E Amino acid transport and metabolism # Function: Histidinol dehydrogenase # Organism: Escherichia coli K12 # 1 434 1 434 434 780 98.0 0 MSFNTIIDWNSCTAEQQRQLLMRPAISASESITRTVNDILDNVKARGDEALREYSAKFDK TTVTALKVSAEEIAAASERLSDELKQAMAVAVKNIETFHTAQKLPPVDVETQPGVRCQQV TRPVDSVGLYIPGGSAPLFSTVLMLATPARIANCKKVVLCSPPPIADEILYAAQLCGVQD VFNVGGAQAIAALAFGTESVPKVDKIFGPGNAFVTEAKRQVSQRLDGAAIDMPAGPSEVL VIADSGATPDFVASDLLSQAEHGPDSQVILLTPAADMARRVAEAVERLLAELPRAETARQ ALNASRLIVTKDLAQCVEISNQYGPEHLIIQTRNARDLVDGITSAGSVFLGDWSPESAGD YASGTNHVLPTYGYTATCSSLGLADFQKRMTVQELSKEGFSALASTIETLAAAERLTAHK NAVTLRVNALKEQA >gi|296493363|gb|ADTK01000138.1| GENE 3 2220 - 3290 1113 356 aa, chain + ## HITS:1 COG:hisC KEGG:ns NR:ns ## COG: hisC COG0079 # Protein_GI_number: 16129962 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Escherichia coli K12 # 1 356 1 356 356 708 99.0 0 MSTVTITDLARENVRNLTPYQSARRLGGNGDVWLNANEYPTAVEFQLTQQTLNRYPECQP KAVIENYAQYAGVKPEQVLVSRGADEGIELLIRAFCEPGKDAILYCPPTYGMYSVSAETI GVECRTVPTLDNWQLDLQGISDKLDGIKVVYVCSPNNPTGQLINPQDFRTLLELTRGKAI 
VVADEAYIEFCPQASLAGWLAEYPHLAILRTLSKAFALAGLRCGFTLANEEVINLLMKVI APYPLSTPVADIAAQALSPQGIVAMRERVAQIIAEREYLIAALKKISCVEQVFDSETNYI LARFKASSAVFKSLWDQGIILRDQNKQPSLSGCLRITVGTREESQRVIDALRAEQV >gi|296493363|gb|ADTK01000138.1| GENE 4 3290 - 4357 1233 355 aa, chain + ## HITS:1 COG:hisB_2 KEGG:ns NR:ns ## COG: hisB_2 COG0131 # Protein_GI_number: 16129963 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate dehydratase # Organism: Escherichia coli K12 # 149 355 1 207 207 430 99.0 1e-120 MSQKYLFIDRDGTLISEPPSDFQVDRFDKLAFEPGVIPELLKLQKAGYKLVMITNQDGLG TQSFPQADFDGPHNLMMQIFTSQGVQFDEVLICPHLPADECDCRKPKVKLVERYLAEQAM DRANSYVIGDRATDIQLAENMGINGLRYDRETLNWPMIGEQLTKRDRYAHVVRNTKETQI DVQVWLDREGGSKINTGVGFFDHMLDQIATHGGFRMEINVKGDLYIDDHHTVEDTALALG EALKIALGDKRGICRFGFVLPMDECLARCALDISGRPHLEYKAEFTYQRVGDLSTEMIEH FFRSLSYTMGVTLHLKTKGKNDHHRVESLFKAFGRTLRQAIRVEGDTLPSSKGVL >gi|296493363|gb|ADTK01000138.1| GENE 5 4357 - 4947 677 196 aa, chain + ## HITS:1 COG:hisH KEGG:ns NR:ns ## COG: hisH COG0118 # Protein_GI_number: 16129964 # Func_class: E Amino acid transport and metabolism # Function: Glutamine amidotransferase # Organism: Escherichia coli K12 # 1 196 1 196 196 415 99.0 1e-116 MNVVILDTGCANLNSVKSAIARHGYEPKVSRDPDVVLLADKLFLPGVGTAQAAMDQVRER ELFDLIKACTQPVLGICLGMQLLGRRSEESNGVDLLGIIDEDVPKMTDFGLPLPHMGWNR VYPQAGNRLFQGIEDGAYFYFVHSYAMPVNPWTIAQCNYGEPFTAAVQKDNFYGVQFHPE RSGAAGANLLKNFLEM >gi|296493363|gb|ADTK01000138.1| GENE 6 4947 - 5684 876 245 aa, chain + ## HITS:1 COG:ECs2825 KEGG:ns NR:ns ## COG: ECs2825 COG0106 # Protein_GI_number: 15832079 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase # Organism: Escherichia coli O157:H7 # 1 245 2 246 246 463 99.0 1e-130 MIIPALDLIDGTVVRLHQGDYGKQRDYGNDPLPRLQDYAAQGAEVLHLVDLTGAKDPAKR QIPLIKTLVAGVNVPVQVGGGVRSEEDVAALLEAGVARVVVGSTAVKSPEMVKGWFERFG ADALVLALDVRIDEQGNKQVAVSGWQENSGVSLEQLVETYLPVGLKHVLCTDISRDGTLA 
GSNVSLYEEVCARYPQVAFQSSGGIGDIDDVAALRGTGVRGVIVGRALLEGKFTVKEAIA CWQNA >gi|296493363|gb|ADTK01000138.1| GENE 7 5666 - 6442 937 258 aa, chain + ## HITS:1 COG:ECs2826 KEGG:ns NR:ns ## COG: ECs2826 COG0107 # Protein_GI_number: 15832080 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate synthase # Organism: Escherichia coli O157:H7 # 1 258 1 258 258 519 100.0 1e-147 MLAKRIIPCLDVRDGQVVKGVQFRNHEIIGDIVPLAKRYAEEGADELVFYDITASSDGRV VDKSWVSRVAEVIDIPFCVAGGIKSLEDAAKILSFGADKISINSPALADPTLITRLADRF GVQCIVVGIDTWYDAETGKYHVNQYTGDESRTRVTQWETLDWVQEVQKRGAGEIVLNMMN QDGVRNGYDLEQLKKVREVCHVPLIASGGAGTMEHFLEAFRDADVDGALAASVFHKQIIN IGELKAYLATQGVEIRIC >gi|296493363|gb|ADTK01000138.1| GENE 8 6436 - 7044 678 202 aa, chain + ## HITS:1 COG:hisI_1 KEGG:ns NR:ns ## COG: hisI_1 COG0139 # Protein_GI_number: 16129967 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosyl-AMP cyclohydrolase # Organism: Escherichia coli K12 # 1 112 1 112 112 239 100.0 3e-63 MLTEQQRRELDWEKTDGLMPVIVQHAVSGEVLMLGYMNPEALDKTLESGKVTFFSRTKQR LWTKGETSGNFLNVVSIAPDCDNDTLLVLANPIGPTCHKGTSSCFGDTAHQWLFLYQLEQ LLAERKSADPETSYTAKLYASGTKRIAQKVGEEGVETALAATVHDRFELTNEASDLMYHL LVLLQDQGLDLGEVIDNLKNRH >gi|296493363|gb|ADTK01000138.1| GENE 9 7083 - 8198 684 371 aa, chain - ## HITS:1 COG:PA5447 KEGG:ns NR:ns ## COG: PA5447 COG0438 # Protein_GI_number: 15600640 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Pseudomonas aeruginosa # 1 371 1 372 381 365 49.0 1e-101 MRVLHVYKTYYPDTYGGIEQVIYQLSQGCARRGIAADVFTFSPDKDTGPVAYEDHRVIYN KQLFEIASTPFSLKALKRFKLIKDDYDIINYHFPFPFMDMLHLSARPDARTVVTYHSDIV KQKRLMKLSQPLQERFLSGVDCIVASSPNYVASSQTLKKYLDKTVVIPFGLEQQDVQHDP QRVAHWRETVGDKFFLFVGTFRYYKGLHILMDAAECSRLPVVVVGGGPLESEVRREAQQR GLSNVMFTGMLNDEDKFILFQLCRGVVFPSHLRSEAFGITLLEGARFARPLISCEIGTGT SFINQDKVSGCVIPPNDSQALVEAMNELWNNEETSNRYGENSRRRFEEMFTADHMIDAYV NLYTTLLESKS >gi|296493363|gb|ADTK01000138.1| GENE 10 8208 - 9152 844 314 aa, chain - ## HITS:1 COG:PA5448 KEGG:ns NR:ns 
## COG: PA5448 COG0438 # Protein_GI_number: 15600641 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Pseudomonas aeruginosa # 32 312 90 370 375 192 36.0 5e-49 AFLRRQTLLIEAYRLLHPRRQAWALRDYKDYIYHGPNFYLPHKLERAVTTFHDISIFTCP EYHPKDRVRYMEKSLHESLDSAKLILTVSDFSRSEIIRLFNYPAERIVTTKLACSSDYIP RSPAECLPVLQKYQLAWQAYALYIGTMEPRKNIRGLLHAYQLLPMEIRMRYPLILSGYRG WEDDVLWQLVERGTREGWIRYLGYVPDEDLPYLYAAARVFVYPSFYEGFGLPILEAMSCG VPVVCSNVTSLPEVVGDAGLVADPNDIDAISAQILQSLQDDSWREIATARGLAQAKQFSW ENCATQTINAYKLL Prediction of potential genes in microbial genomes Time: Mon May 16 15:30:28 2011 Seq name: gi|296493362|gb|ADTK01000139.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont334.4, whole genome shotgun sequence Length of sequence - 10219 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 1, operones - 1 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 209 105 ## ECIAI1_2098 mannosyltransferase B 2 1 Op 2 11/0.000 - CDS 261 - 3902 1715 ## COG0438 Glycosyltransferase 3 1 Op 3 1/0.000 - CDS 3905 - 5182 758 ## COG0500 SAM-dependent methyltransferases 4 1 Op 4 26/0.000 - CDS 5182 - 6396 543 ## COG1134 ABC-type polysaccharide/polyol phosphate transport system, ATPase component 5 1 Op 5 . - CDS 6396 - 7190 544 ## COG1682 ABC-type polysaccharide/polyol phosphate export systems, permease component 6 1 Op 6 11/0.000 - CDS 7192 - 8568 1109 ## COG1109 Phosphomannomutase 7 1 Op 7 . 
- CDS 8592 - 10007 1311 ## COG0836 Mannose-1-phosphate guanylyltransferase - Prom 10031 - 10090 7.2 Predicted protein(s) >gi|296493362|gb|ADTK01000139.1| GENE 1 2 - 209 105 69 aa, chain - ## HITS:1 COG:no KEGG:ECIAI1_2098 NR:ns ## KEGG: ECIAI1_2098 # Name: wbdB # Def: mannosyltransferase B # Organism: E.coli_IAI1 # Pathway: not_defined # 1 69 1 69 385 142 100.0 6e-33 MSFIMKIIFATEPIKYPLTGIGRYSLELVKRLAVAREIEELKLFHGASFIEQIPLVENKS DTKASNHGR >gi|296493362|gb|ADTK01000139.1| GENE 2 261 - 3902 1715 1213 aa, chain - ## HITS:1 COG:aq_1080 KEGG:ns NR:ns ## COG: aq_1080 COG0438 # Protein_GI_number: 15606357 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Aquifex aeolicus # 451 1147 129 823 885 459 35.0 1e-128 MRIVIDLQGAQTESRFRGIGRYSIAIARGIIRNNSRHEIFIALSAMLDESIADIKAQFAD LLPAENIVVWHAVGPVRAMDQGNEWRRESAELIREAFLESLCPDVVFITSLFEGHVDDAA TSVHKFSRQYKVAVLHHDLIPLVQAETYLQDDVYKPYYLQKVEWLKNADLLLTNSAYTAQ EAIEHLHLQGDHVQNIAAAVDSQFCMAEVTASEKETVLGHYGIQREFMLYAPGGFDSRKN FKRLIEAYAGLSDALRRSHQLVIVSKLSIGDRQYLESLASGNGLQQGELVLTGYVPEDEL IQLYRLCKLFIFASLHEGFGLPVLEAMSCGAPVIGSNVTSIPEVIGNPEALFDPYSVSSM RDKIAQCLTDDTFLARLKDMAQQQARNFSWDKAAVTALEAFEKIAVEDTGTAQVLPEALI QKILAISQGQPDDRDLRLCATAIDYNLKTAELYQIDDKSLNWRVEGPFDSSYSLALVNRE FARALSADGVEVLLHSTEGPGDFAPDASFMAQSENSDLLAFYNQCQTRKSNEKIDILSRN IYPPRVTKMDAKVKFLHCYAWEETGFPQPWINEFNRELDGVLCTSEHVRKILIDNGLNVP AFVVGNGCDHWLNIPAETTKDVDHGTFRFLHVSSCFPRKGIQAMLQAWGRAFTRRDNVIL IIKTFNNPHNEIDAWLAQAQAQFIDYPKVEVIKEDMSATELKGLYESCDVLVAPGCAEGF GLPIAEAMLSGLPAIVTNWSGQLDFVNSQNSWLVDYQFTRVKTHFGLFSSAWASVDIDNL IDALKAAASTDKSVLRDMADAGRELLLQQFTWKAVADRSCRAVKTLRGHIDIAQHRARIG WLTTWNTKCGIATYSQHLVESAPHGADVVFAPQVSAGDLVCADEEFVLRNWIVGKESNYL ENLQPHIDALRLDVIVIQFNYGFFNHRELSAFIRRQHDAGRSVVMTMHSTVDPLEKEPSW NFRLAEMKEALALCDRLLVHSIADMNRLKDLGLTANVALFPHGVINYSAASVTRQQQSLP LIASYGFCLPHKGLMELVESVHRLKQAGKPVRLRLVNAEYPVGESRDLVAELKAAAQRLG VTDLIDMHNDFLPDAESLRLLSEADLLIFAYQNTGESASGAVRYGMATQKPVAVTPLAIF DDLDDAVFKFDGCSVDDISQGIDRILNSIREQDAWATRTQQRADAWREQHDYQAVSRRLV NMCQGLAKAKYFK 
>gi|296493362|gb|ADTK01000139.1| GENE 3 3905 - 5182 758 425 aa, chain - ## HITS:1 COG:aq_1079 KEGG:ns NR:ns ## COG: aq_1079 COG0500 # Protein_GI_number: 15606356 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Aquifex aeolicus # 4 216 213 414 416 124 35.0 3e-28 MGSSFYRSFEERHRGSVEEIKRRLSFYLPFLAGLKDIYPDGVIADIGCGRGEWLEILNEN GIVNIGVDLDDGMLARAREAGLNVQKMDCLQFLQSQADQSLIALTGFHIAEHLPFEVLQQ LVMHTLRVLKPGGLLILETPNPENVSVGTCSFYMDPTHNHPLPPPLLEFLPIHYGFNRAI TVRLQEKEVLQSPDAAVNLVDVLKGVSPDYSIIAQKAAPTDILERFDTLFTQQYGLTLDA LSNRYDAILRQQFSSVVSRLETLNQTYMQQISQMSETIQTLQGEVDDLSHVIDQNHQLHQ QMADLHNSRSWRITQPLRWLSLQRQLLRQEGAKVRARRAAKKILRKGMALSLVFFHRYPK SKVYLFKVLRKTGCYTLLQRLFQRVMLVQSDTMMMQSRRYDVGTEEMTSRAMSIYNELKN KNTEK >gi|296493362|gb|ADTK01000139.1| GENE 4 5182 - 6396 543 404 aa, chain - ## HITS:1 COG:PA5450 KEGG:ns NR:ns ## COG: PA5450 COG1134 # Protein_GI_number: 15600643 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate transport system, ATPase component # Organism: Pseudomonas aeruginosa # 1 404 1 405 421 531 61.0 1e-151 MSYIRVNNVGKAYRQYHSKTGRLIEWLSPLNTKRHNLKWILSDINFEVAPGEAVGIIGIN GAGKSTLLKLITGTSRPTTGEIEISGRVAALLELGMGFHSDFTGRQNVYMSGQLLGLSSE KITELMPQIEEFAEIGDYIDQPVRVYSSGMQVRLAFSVATAIRPDVLIIDEALSVGDAYF QHKSFERIRKFRQEGTTLLLVSHDKQAIQSICDRAILLNKGQIEMEGEPEAVMDYYNALL ADKQNQSIKQVEHNGKTQTVSGTGEVTISEVHLLDEQGNVTEFVSVGHRVSLQVNVEVKD DIPELVVGYMIKDRLGQPIFGTNTYHLNQTLTSLKKGEKRSFLFSFDARLGVGSYSVAVA LHTSSTHLGKNYEWRDLAVVFNVVNTEQQEFVGVSWLPPELEIS >gi|296493362|gb|ADTK01000139.1| GENE 5 6396 - 7190 544 264 aa, chain - ## HITS:1 COG:PA5451 KEGG:ns NR:ns ## COG: PA5451 COG1682 # Protein_GI_number: 15600644 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate export systems, permease component # Organism: 
Pseudomonas aeruginosa # 4 264 5 265 265 255 56.0 8e-68 MRDLLTTIYRYRGFIWSSVKRDFQARYQTSMLGALWLVLQPLSMILVYTLVFSEVMKARM PDNTGSFAYSIYLCSGVLTWGLFTEMLDKGQSVFINNANLIKKLSFPKICLPIIVTLSAV LNFAIIFSLFLIFIIVTGNFPGWLFLSVIPVLLLQILFAGGLGMILGVMNVFFRDVGQLV GVALQFWFWFTPIVYVLNSLPAWAKNLMMYNPMTRIMQSYQSIFAYHLAPNWYSLWPVLA LAIIFCVIGFRMFRKHAADMVDEL >gi|296493362|gb|ADTK01000139.1| GENE 6 7192 - 8568 1109 458 aa, chain - ## HITS:1 COG:YPO3097 KEGG:ns NR:ns ## COG: YPO3097 COG1109 # Protein_GI_number: 16123271 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Yersinia pestis # 1 458 1 457 457 740 74.0 0 MTQLTCFKAYDIRGELGEELNEDIAYRIGRAYGEFLKPGKIVVGGDVRLTSESLKLALAR GLMDAGTDVLDIGLSGTEEIYFATFHLGVDGGIEVTASHNPMNYNGMKLVRENAKPISGD TGLRDIQRLAEENQFPPVDPARRGTLRQISVLKEYVDHLMGYVDLANFTRPLKLVVNSGN GAAGHVIDEVEKRFAAAGVPVTFIKVHHQPDGHFPNGIPNPLLPECRQDTADAVREHQAD MGIAFDGDFDRCFLFDDEASFIEGYYIVGLLAEAFLQKQPGAKIIHDPRLTWNTVDIVTR NGGQPVMSKTGHAFIKERMRQEDAIYGGEMSAHHYFRDFAYCDSGMIPWLLVAELLCLKN SSLKSLVAERQKAFPASGEINRKLRNAAEAIARIRAQYEPAAAHIDTTDGISIEYPEWRF NLRTSNTEPVVRLNVESRADVALMNEKTTELLHLLSGE >gi|296493362|gb|ADTK01000139.1| GENE 7 8592 - 10007 1311 471 aa, chain - ## HITS:1 COG:YPO3099_1 KEGG:ns NR:ns ## COG: YPO3099_1 COG0836 # Protein_GI_number: 16123273 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannose-1-phosphate guanylyltransferase # Organism: Yersinia pestis # 1 355 3 354 354 445 60.0 1e-125 MLLPVIMAGGTGSRLWPMSRELYPKQFLRLFGQNSMLQETITRLSGLEIHEPMVICNEEH RFLVAEQLRQLNKLSNNIILEPVGRNTAPAIALAALQATRYGDDPLMLVLAADHIINNQS AFHDDIRVAEQYADEGHLVTFGIVPNAPETGYGYIQRGVALTDSAHAPYQVARFVEKPDR ERAEVYLASGEYYWNSGMFMFRAKKYLSELAKYRPDILETCQAAVNAADNGSDFINIPHD IFCECPDESVDYAVMEKTADAVVVGLDADWSDVGSWSALWEVSPKDGQGNVLSGDAWVHN SENCYINSDEKLVAAIGVENLVIVSTKDAVLVMNRERSQDVKKAVEFLKQNQRSEYKRHR EIYRPWGRCDVVVQTPRFNVNRITVKPGGAFSMQMHHHRAEHWVILAGTGQVTVNGKQFL LSENQSTFIPIGAEHCLENPGCIPLEVLEIQSGSYLGEDDIIRIKDQYGRC Prediction of potential genes in microbial genomes Time: Mon May 16 15:30:31 2011 Seq 
name: gi|296493361|gb|ADTK01000140.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont340.1, whole genome shotgun sequence Length of sequence - 793 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 47 - 628 452 ## EcSMS35_3205 hypothetical protein Predicted protein(s) >gi|296493361|gb|ADTK01000140.1| GENE 1 47 - 628 452 193 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_3205 NR:ns ## KEGG: EcSMS35_3205 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 193 68 260 260 360 94.0 1e-98 MTEDGQALVMLKNGTIGITGLMQGCPNGVQTLLGSRISINGNLIPTSQMCNQQTGFRAVE VEAGQAPEMVKKAAHSIAERDVSVLQAFGVRMEFTRGDMLKVCPKFVTSLAGFSPKQTSV INKDSVLQAARQAYSREYDEETTETADFDSYEIKGNKVEFEVFNPGYRTYDKVTVTVGAD GNATDASVEFIGK Prediction of potential genes in microbial genomes Time: Mon May 16 15:30:35 2011 Seq name: gi|296493360|gb|ADTK01000141.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont344.1, whole genome shotgun sequence Length of sequence - 2909 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 1053 482 ## COG5525 Bacteriophage tail assembly protein 2 2 Tu 1 . - CDS 1158 - 1715 203 ## LF82_p278 prophage protein - Prom 1766 - 1825 6.6 3 3 Tu 1 . + CDS 2070 - 2237 74 ## gi|293405052|ref|ZP_06649044.1| predicted protein 4 4 Tu 1 . 
- CDS 2258 - 2464 204 ## ECUMN_1835 conserved hypothetical protein; DLP12 prophage - Prom 2498 - 2557 3.5 Predicted protein(s) >gi|296493360|gb|ADTK01000141.1| GENE 1 3 - 1053 482 350 aa, chain - ## HITS:1 COG:ECs0825 KEGG:ns NR:ns ## COG: ECs0825 COG5525 # Protein_GI_number: 15830079 # Func_class: R General function prediction only # Function: Bacteriophage tail assembly protein # Organism: Escherichia coli O157:H7 # 1 350 34 383 700 574 78.0 1e-164 MRVPKGAGNSVPWDPELTPYIIEPMNCLASREYDAVIFVGPARTGKTIGLIDGWIVYTIV CDPSDMLVVQMTEDKAREHSKKRLDRTFRSSAAVKKRMSPRRNDNNVHDKTFRDGSFLKI GWPSVNIMSSSDYRFVALTDYDRFPENIDSEGDGFSLASKRTTTFMSAGMTLVESSPGRD ICDSKWRRKSPHEAPPTTGILSLYNRGDRRRWYWSCPHCGEYFQPAMDAMTGYRNEPDPF KASEAAYLLCPHCSGIITAEKKRELNSAGVWLREGQVIDRNGNVSGEPRRSRIASFWMEG PAAAYQTWAQLVYKLLTAEQEYEATGSEETLRAVINTDWGLPYLPRASME >gi|296493360|gb|ADTK01000141.1| GENE 2 1158 - 1715 203 185 aa, chain - ## HITS:1 COG:no KEGG:LF82_p278 NR:ns ## KEGG: LF82_p278 # Name: not_defined # Def: prophage protein # Organism: E.coli_LF82 # Pathway: not_defined # 11 185 1 175 175 335 99.0 4e-91 MIFTCQMELLMSNVSGIGDAYYWSVFKIAEAFGLHRDTVKKRLLAANTPVAATVRGNPVY ALQHVGPALFSVKHEAADSVHDPSRMEPKERKDWYQSENERIKLEKEQRKLIPVDEVVIV YSSMRKAVVQVLETIPDVLERDCALTPQAVGAVQQAIDDLRYTLQEKSYEACAAELIPDE EGESL >gi|296493360|gb|ADTK01000141.1| GENE 3 2070 - 2237 74 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293405052|ref|ZP_06649044.1| ## NR: gi|293405052|ref|ZP_06649044.1| predicted protein [Escherichia coli FVEC1412] # 1 55 1 55 55 80 100.0 4e-14 MRTINASSSAMQPASEQDDIVNSPNYEDIAVVELQLCISLTVTECDIDSVTGEIV >gi|296493360|gb|ADTK01000141.1| GENE 4 2258 - 2464 204 68 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_1835 NR:ns ## KEGG: ECUMN_1835 # Name: ybcW # Def: conserved hypothetical protein; DLP12 prophage # Organism: E.coli_UMN026 # Pathway: not_defined # 1 68 1 68 68 108 98.0 4e-23 MNKEQSADELSLDLIRVKNMLNSTISMSYPDVVIACIEHKVSLEAFRAIEAALVKHDKNS KDYSLVVD Prediction of potential genes in microbial genomes Time: Mon May 16 15:30:45 2011 Seq name: 
gi|296493359|gb|ADTK01000142.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont344.2, whole genome shotgun sequence Length of sequence - 7893 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 8, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 377 - 417 4.0 1 1 Tu 1 . - CDS 441 - 614 257 ## ECS88_5024 hypothetical protein - Prom 721 - 780 6.6 - Term 1051 - 1084 2.2 2 2 Tu 1 . - CDS 1288 - 1500 362 ## COG1278 Cold shock proteins 3 3 Op 1 . - CDS 1864 - 2346 323 ## ECUMN_1840 conserved hypothetical protein; Qin prophage 4 3 Op 2 . - CDS 2358 - 2774 281 ## COG3772 Phage-related lysozyme (muraminidase) 5 4 Op 1 . - CDS 2888 - 3199 323 ## LF82_2837 uncharacterized protein YdfR 6 4 Op 2 . - CDS 3204 - 3410 255 ## ECUMN_1843 putative S lysis protein; Qin prophage - Prom 3522 - 3581 6.6 7 5 Tu 1 . - CDS 4172 - 4387 202 ## COG1278 Cold shock proteins + Prom 4541 - 4600 3.7 8 6 Tu 1 . + CDS 4700 - 4900 135 ## COG1278 Cold shock proteins - Term 5177 - 5220 6.3 9 7 Op 1 . - CDS 5322 - 6074 296 ## JW1551 predicted antitermination protein Q 10 7 Op 2 . - CDS 6088 - 7137 437 ## ECIAI1_1603 conserved hypothetical protein; Qin prophage - Prom 7339 - 7398 3.4 - Term 7340 - 7377 -0.9 11 8 Tu 1 . 
- CDS 7484 - 7735 219 ## ECED1_1418 conserved hypothetical protein; Qin prophage - Prom 7807 - 7866 7.5 Predicted protein(s) >gi|296493359|gb|ADTK01000142.1| GENE 1 441 - 614 257 57 aa, chain - ## HITS:1 COG:no KEGG:ECS88_5024 NR:ns ## KEGG: ECS88_5024 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 57 32 88 88 78 92.0 8e-14 MNIENLKTKAEADISEYITKKIIELKKKTGKEVTSIQFTAREKMTGLESYDVKINLI >gi|296493359|gb|ADTK01000142.1| GENE 2 1288 - 1500 362 70 aa, chain - ## HITS:1 COG:cspI KEGG:ns NR:ns ## COG: cspI COG1278 # Protein_GI_number: 16129511 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Escherichia coli K12 # 1 70 1 70 70 134 100.0 5e-32 MSNKMTGLVKWFNPEKGFGFITPKDGSKDVFVHFSAIQSNDFKTLTENQEVEFGIENGPK GPAAVHVVAL >gi|296493359|gb|ADTK01000142.1| GENE 3 1864 - 2346 323 160 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_1840 NR:ns ## KEGG: ECUMN_1840 # Name: ydfP # Def: conserved hypothetical protein; Qin prophage # Organism: E.coli_UMN026 # Pathway: not_defined # 1 160 6 165 165 259 100.0 2e-68 MVIFLVLSGFIVGNVWSDRGWQKKWAERDAAALSQEVNAQFAARIIEQGRTIARDEAVKD AQQKSAEISARAAYLSDSVNQLRAEAKKYAIRLDAAKHTADLAAAVRGKTTKTAEGMLTN MLGDIAAEAQLYAEIADERYIAGVTCQQIYESLRDKKHQM >gi|296493359|gb|ADTK01000142.1| GENE 4 2358 - 2774 281 138 aa, chain - ## HITS:1 COG:ydfQ KEGG:ns NR:ns ## COG: ydfQ COG3772 # Protein_GI_number: 16129513 # Func_class: R General function prediction only # Function: Phage-related lysozyme (muraminidase) # Organism: Escherichia coli K12 # 1 138 40 177 177 286 100.0 7e-78 MAYRDGSGIWTICRGATVVDGKTVFPNMKLSKEKCDQVNAIERDKALAWVERNIKVPLTE PQKAGIASFCPYNIGPGKCFPSTFYKRLNAGDRKGACEAIRWWIKDGGRDCRIRSNNCYG QVIRRDQESALTCWGIEQ >gi|296493359|gb|ADTK01000142.1| GENE 5 2888 - 3199 323 103 aa, chain - ## HITS:1 COG:no KEGG:LF82_2837 NR:ns ## KEGG: LF82_2837 # Name: ydfR # Def: uncharacterized protein YdfR # Organism: E.coli_LF82 # Pathway: not_defined # 1 103 1 103 103 173 94.0 2e-42 
MTQNYELIVKGIRNFENKVTVTLALRDKKRFDGEIFDLDVAMDRVEGAALEFYEAAARRS VRQVFLEVAEKLSEKVESYLQHQYSFKIENPANKHERPHHKYI >gi|296493359|gb|ADTK01000142.1| GENE 6 3204 - 3410 255 68 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_1843 NR:ns ## KEGG: ECUMN_1843 # Name: essQ # Def: putative S lysis protein; Qin prophage # Organism: E.coli_UMN026 # Pathway: not_defined # 1 68 4 71 71 125 100.0 6e-28 MDKLTTGVAYGTSAGNAGFWALQLLDKVTPSQWAAIGVLGSLVFGLLTYLTNLYFKIKED RRKAARGE >gi|296493359|gb|ADTK01000142.1| GENE 7 4172 - 4387 202 71 aa, chain - ## HITS:1 COG:cspB KEGG:ns NR:ns ## COG: cspB COG1278 # Protein_GI_number: 16129516 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Escherichia coli K12 # 1 71 1 71 71 133 100.0 1e-31 MSNKMTGLVKWFNADKGFGFISPVDGSKDVFVHFSAIQNDNYRTLFEGQKVTFSIESGAK GPAAANVIITD >gi|296493359|gb|ADTK01000142.1| GENE 8 4700 - 4900 135 66 aa, chain + ## HITS:1 COG:cspF KEGG:ns NR:ns ## COG: cspF COG1278 # Protein_GI_number: 16129517 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Escherichia coli K12 # 1 66 5 70 70 128 100.0 3e-30 MTGIVKTFDGKSGKGLITPSDGRIDVQLHVSALNLRDAEEITTGLRVEFCRINGLRGPSA ANVYLS >gi|296493359|gb|ADTK01000142.1| GENE 9 5322 - 6074 296 250 aa, chain - ## HITS:1 COG:no KEGG:JW1551 NR:ns ## KEGG: JW1551 # Name: ydfT # Def: predicted antitermination protein Q # Organism: E.coli_J # Pathway: not_defined # 1 250 1 250 250 521 99.0 1e-147 MNLEALPKYYSPKSPKLSDDAPATGTGCLTITDVMAAQGMVQSKAPLGLALFLAKVGVQD PQFAIEGLLNYAMALDNPTLNKLSEEIRLQIIPYLVSFAFADYSRSAASKARCEHCSGTG FYNVLREVVKHYRRGESVIKEEWVKELCQHCHGKGEVSTACRGCKGKGIVLDEKRTRFHG VPVYKICGRCNGNRFSRLPTTLARRHVQKLVPDLTDYQWYKGYADVIGKLVTKCWQEEAY AEAQLRKVTR >gi|296493359|gb|ADTK01000142.1| GENE 10 6088 - 7137 437 349 aa, chain - ## HITS:1 COG:no KEGG:ECIAI1_1603 NR:ns ## KEGG: ECIAI1_1603 # Name: ydfU # Def: conserved hypothetical protein; Qin prophage # Organism: E.coli_IAI1 # Pathway: not_defined # 1 349 1 349 349 706 98.0 0 MRVLLRPVLVPELGLVVLKPGRESIQIFHNPRVLVEPEPKSMRNLPSGVVPAVRQPLVED 
KTLLPFFSNERVIRAAGGVGALSDWLLRHVTSCQWPNGDYHHSETVIHRYGTGAMVLCWH CDNQLRDQTSESLDLLAQQNLTAWVIDVIRHAISGTQERELSLAELSWWAVCNQVVDALP EAVSRRSLGLPAEKICSVYRESDIVPGEQTATSILKQRTKNLAPLPYAYQQQKPPQEKTV VSITVDPESPESFMKLPKRRRWVKEKYTRWVKTQPCACCGMPADDPHHLIGHGQGGMGTK AHDLFVLPLCRKHHNELHTDTVAFEEKYGSQLELIFRFIDRALAIGVLA >gi|296493359|gb|ADTK01000142.1| GENE 11 7484 - 7735 219 83 aa, chain - ## HITS:1 COG:no KEGG:ECED1_1418 NR:ns ## KEGG: ECED1_1418 # Name: rem # Def: conserved hypothetical protein; Qin prophage # Organism: E.coli_ED1a # Pathway: not_defined # 1 83 1 83 83 157 100.0 1e-37 MMNIEELRKIFCEDGLYAVCVENGNIVSHYRILCLRKNGAALINFVDARVTDGFILREGE FVTSLQALKEIGIKAGFSAFAEE Prediction of potential genes in microbial genomes Time: Mon May 16 15:31:04 2011 Seq name: gi|296493358|gb|ADTK01000143.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont344.3, whole genome shotgun sequence Length of sequence - 3569 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 177 179 ## EcSMS35_1181 lambda phage S protein family protein - Prom 225 - 284 3.6 + Prom 220 - 279 2.9 2 2 Tu 1 . + CDS 510 - 689 65 ## + Term 738 - 773 3.2 - TRNA 619 - 695 84.2 # Arg TCT 0 0 - TRNA 709 - 785 57.8 # Arg TCG 0 0 - TRNA 793 - 868 84.5 # Met CAT 0 0 - Term 970 - 1029 5.0 3 3 Op 1 . - CDS 1079 - 1768 304 ## APECO1_1032 putative Q antiterminator encoded by prophage CP-933P 4 3 Op 2 . - CDS 1765 - 2124 137 ## COG4570 Holliday junction resolvase 5 3 Op 3 . 
- CDS 2137 - 3114 506 ## EC55989_1726 conserved hypothetical protein; Qin prophage Predicted protein(s) >gi|296493358|gb|ADTK01000143.1| GENE 1 3 - 177 179 58 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_1181 NR:ns ## KEGG: EcSMS35_1181 # Name: not_defined # Def: lambda phage S protein family protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 58 1 58 71 103 100.0 2e-21 MKSMDKISTGIAYGTSAGSAGYWFLQWLDQVSPSQWAAIGVLGSLVLGFLTYLTNLYF >gi|296493358|gb|ADTK01000143.1| GENE 2 510 - 689 65 59 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPVSFNKLPAKQTTVNRLNCETFKKKARKSEPGKISLARCTGFEPVTDCLEGNCSVRLS >gi|296493358|gb|ADTK01000143.1| GENE 3 1079 - 1768 304 229 aa, chain - ## HITS:1 COG:no KEGG:APECO1_1032 NR:ns ## KEGG: APECO1_1032 # Name: not_defined # Def: putative Q antiterminator encoded by prophage CP-933P # Organism: E.coli_APEC # Pathway: not_defined # 1 229 3 231 231 399 97.0 1e-110 MNNQYLQFVREQLIIATADLSGATKGQLEAWQENAMFDTGRYRRKKIRYRDEVTGKMITR DNPPIPGKQSLANGSSIALVSPVEFSTSSWRRALLSLEEHHKAWLLWCYGESICWEYQIA ITQWAWNEFNTQSGTRKIAGKTQERLKKLIWLAAQAVKAELFGGEGYEYQELALLAGVTT KNWSKTFTRHWVAMKHIFQRLDSEALLFVMRTRSKQKAAFSKQSVAKVD >gi|296493358|gb|ADTK01000143.1| GENE 4 1765 - 2124 137 119 aa, chain - ## HITS:1 COG:ECs2751 KEGG:ns NR:ns ## COG: ECs2751 COG4570 # Protein_GI_number: 15832005 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvase # Organism: Escherichia coli O157:H7 # 1 119 1 119 119 215 94.0 1e-56 MLIDLVLPYPPTVNTYWRRRGSTYFISEEGKRYRRAVALIVRQQRLKLSLSGRLAIKIIA EPPDKRRRDLDNILKAPLDALTHAGLLIDDEQFDEINIVRGQLVPGGRLGIKITELECA >gi|296493358|gb|ADTK01000143.1| GENE 5 2137 - 3114 506 325 aa, chain - ## HITS:1 COG:no KEGG:EC55989_1726 NR:ns ## KEGG: EC55989_1726 # Name: ydfU # Def: conserved hypothetical protein; Qin prophage # Organism: E.coli_55989 # Pathway: not_defined # 1 325 25 349 349 652 98.0 0 MQVFHNPRVLVEPEPKSMRNLPSGVVPAVRQPLAEDKSLLPFFSNERVIRAAGGAGALSD WLLRHIKSCQWPHGDYHHSETVIHRYGTGAMVLCWHCDNQLRDQTSESLEQLAHQNLSAW 
MIDVIGHAISGTQERELSLAELSWWAVRNQVADALPEAVLRRSLGLRAEKIRSMYRESDI VPGEQTATSILKQRTKNLAPLPHAHQQQNPPQEKTVVSIAVDPESPESFMKRPKRRRWVN EKYTRWVKTQPCACCGKPADDPHHLIGHGQGGMGTKSHDIFTLPLCREHHNELHADPLAF EEKHGSQVDLIFRFLDHAFATGVLG Prediction of potential genes in microbial genomes Time: Mon May 16 15:31:18 2011 Seq name: gi|296493357|gb|ADTK01000144.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont344.4, whole genome shotgun sequence Length of sequence - 314 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 37 - 249 135 ## ECBD_2081 Hok/gef cell toxic protein Predicted protein(s) >gi|296493357|gb|ADTK01000144.1| GENE 1 37 - 249 135 70 aa, chain - ## HITS:1 COG:no KEGG:ECBD_2081 NR:ns ## KEGG: ECBD_2081 # Name: not_defined # Def: Hok/gef cell toxic protein # Organism: E.coli_BL21_DE3 # Pathway: not_defined # 1 70 1 70 70 108 94.0 9e-23 MLDTCRLASYVPKGKEKQAMKQQKAMLIALIVICLTVIVTALVTRKDLWGVRIRTGQTEA AVFTAYEHEE Prediction of potential genes in microbial genomes Time: Mon May 16 15:31:20 2011 Seq name: gi|296493356|gb|ADTK01000145.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont344.5, whole genome shotgun sequence Length of sequence - 489 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 9 - 54 1.3 1 1 Tu 1 . 
- CDS 225 - 476 178 ## COG1636 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|296493356|gb|ADTK01000145.1| GENE 1 225 - 476 178 83 aa, chain - ## HITS:1 COG:XF2273 KEGG:ns NR:ns ## COG: XF2273 COG1636 # Protein_GI_number: 15838864 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 1 78 138 215 236 135 76.0 1e-32 MQQVNECGRRAVAHYPGMVYWDYNWRKQGGSSRMIEISKREKFYQQEYCGCVYSLRDTNL HRKSQGRPLIKIGQLHYGKEEKE Prediction of potential genes in microbial genomes Time: Mon May 16 15:31:21 2011 Seq name: gi|296493355|gb|ADTK01000146.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont344.6, whole genome shotgun sequence Length of sequence - 2884 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 305 - 1327 501 ## Sbal195_0765 hypothetical protein 2 1 Op 2 . - CDS 1330 - 1869 -32 ## gi|300903949|ref|ZP_07121840.1| hypothetical protein HMPREF9536_02052 - Prom 2085 - 2144 4.8 - Term 1981 - 2028 2.0 3 2 Tu 1 . 
- CDS 2153 - 2818 445 ## COG1636 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|296493355|gb|ADTK01000146.1| GENE 1 305 - 1327 501 340 aa, chain - ## HITS:1 COG:no KEGG:Sbal195_0765 NR:ns ## KEGG: Sbal195_0765 # Name: not_defined # Def: hypothetical protein # Organism: S.baltica_OS195 # Pathway: not_defined # 1 337 1 337 339 472 71.0 1e-132 MLKLEELLEYAEQLKDDDAAKISLYFITRHLKAGMSRTARVVDKFDFKIIKAPIAPDIAK FFKYTLSNQIISHASKDDIVMKKYTVIDDDIDNKIYAYAMNNAISFSKVINNDIKNDKPV VLTSLAEVQNDLWAYCIKVQKGADVTYSFRKISRGKVTTNEPQNMTQRVFALFDKTDKEL RSFDGSAVNFDDKIDCIYIKDQFYVFHKKSFEAIVGLEVEFTEAAQKTLNTIKELDLIEG LDVIEQAILHKPSLRKILTHIAEKGNHTALEKNDVQAMNDVLKMFQNEEFKTNEHGKLVI EDERQGRNFLKLLNDYYKQGMTTKKYYGTDSGNVINPIKA >gi|296493355|gb|ADTK01000146.1| GENE 2 1330 - 1869 -32 179 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|300903949|ref|ZP_07121840.1| ## NR: gi|300903949|ref|ZP_07121840.1| hypothetical protein HMPREF9536_02052 [Escherichia coli MS 84-1] # 1 179 26 204 204 318 100.0 9e-86 MCFVQLYTYRSYLNWKGISVESLTIFFKYFGAVSVIGVLSLFGIVGLTIFLKNIKRRCAT SGRTVRVIDIENKNNESISYLFTYIIPFVFQDLSTLTNVIPIAILLTVTALIYINSSMIL INPTISINYTLYQVTYLDLESDKKRTGMVLTKSKYLEEDDLLDVEDVGPKLFYAESHKE >gi|296493355|gb|ADTK01000146.1| GENE 3 2153 - 2818 445 221 aa, chain - ## HITS:1 COG:XF2273 KEGG:ns NR:ns ## COG: XF2273 COG1636 # Protein_GI_number: 15838864 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 8 215 8 214 236 359 80.0 2e-99 MTIADFKRPKLELPNGANKLLLHSCCAPCSGEVMEALQASGIDYTIFFYNPNIHPQKEYL IRKDENIRFAEQHGVPFIDADYDTDNWFERAKGMEWEPERGIRCTMCFDMRFERTALYAA ENGFSVISSSLGISRWKNMQQVNDCGRRAVAHYPGMVYWDYNWRKQGGSSRMIEISKREK FYQQEYCGCVYSLRDTNLHRKSQGRPLIKIGQLHYDKEEKE Prediction of potential genes in microbial genomes Time: Mon May 16 15:31:34 2011 Seq name: gi|296493354|gb|ADTK01000147.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont346.1, whole genome shotgun sequence Length of sequence - 1345 bp Number of predicted genes - 1, with homology - 1 
Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 46 - 1116 268 ## S2705 putative transposase Predicted protein(s) >gi|296493354|gb|ADTK01000147.1| GENE 1 46 - 1116 268 356 aa, chain + ## HITS:1 COG:no KEGG:S2705 NR:ns ## KEGG: S2705 # Name: not_defined # Def: putative transposase # Organism: S.flexneri_2457T # Pathway: not_defined # 1 356 142 497 497 717 100.0 0 MGWVLGIVYRVIATHLVKKAGHTHQVAKTGAVTLIQRFGSALNLNVHFHMLFLDGVYVEQ SHGSARFRWVKAPTSPELTQLTHTIAHRVGRYLERQGLLERDVENSYLASDAVDDDPMTP LLGHSITYRIAVGSQAGRKVFTLQTLPTSGDPFGDGIGKVAGSSLHAGVAARADERKKLE RLCRYISRPAVSEKRLSLTRGGNVRYQLKTPYRDGTTHVIFEPLDFIARLAALVPKPRVN LTRFHGVFAPNSRHRALVTPAKRGRGNKVRVADEPATPAQRRASMTWAQRLKRVFNIDIE TCSGCGGAMKVIACIEDPIVIKQILDHLKHKAETSGTRALPESRAPPAELLLGLFD Prediction of potential genes in microbial genomes Time: Mon May 16 15:31:40 2011 Seq name: gi|296493353|gb|ADTK01000148.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont347.1, whole genome shotgun sequence Length of sequence - 1272 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 824 958 ## ECH74115_1625 phage major capsid protein E - Term 833 - 858 -0.5 2 1 Op 2 . 
- CDS 880 - 1212 299 ## ECO103_0542 putative head-DNA stabilization protein Predicted protein(s) >gi|296493353|gb|ADTK01000148.1| GENE 1 2 - 824 958 274 aa, chain - ## HITS:1 COG:no KEGG:ECH74115_1625 NR:ns ## KEGG: ECH74115_1625 # Name: not_defined # Def: phage major capsid protein E # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 274 1 274 341 540 98.0 1e-152 MSMYTTAQLLAANEQKFKFDPLFLRLFFRESYPFTTEKVYLSQIPGLVNMALYVSPIVSG EVIRSRGGSTSEFTPGYVKPKHEVNPQMTLRRLPDEDPQNLADPAYRRRRIIMQNMRDEE LAIAQVEEMQAVSAVLKGKYTMTGEAFDPVEVDMGRSEENNITQSGGTEWSKRDKSTYDP TDDIEAYALNASGVVNIIVFDPKGWALFRSFKAVREKLDTRRGSHSELETAVKDLGKAVS YKGMYGDAAIVVYSGQYVENGVKKNFLPDNTMVL >gi|296493353|gb|ADTK01000148.1| GENE 2 880 - 1212 299 110 aa, chain - ## HITS:1 COG:no KEGG:ECO103_0542 NR:ns ## KEGG: ECO103_0542 # Name: not_defined # Def: putative head-DNA stabilization protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 110 3 112 112 174 100.0 1e-42 MTSKETFTHYQPQGNSDPAHTATAPGGLSAKAPAMTPLMLDTSSRKLVAWDGTTDGAAVG ILAVAADQTSTTLTFYKSGTFRYEDVLWPEAASDETKKRTAFAGTAISIV Prediction of potential genes in microbial genomes Time: Mon May 16 15:31:55 2011 Seq name: gi|296493352|gb|ADTK01000149.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont348.1, whole genome shotgun sequence Length of sequence - 31505 bp Number of predicted genes - 33, with homology - 32 Number of transcription units - 10, operones - 5 average op.length - 5.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 12/0.000 - CDS 2 - 2024 1668 ## COG3210 Large exoproteins involved in heme utilization or adhesion 2 1 Op 2 . - CDS 2037 - 3644 731 ## COG2831 Hemolysin activation/secretion protein - Prom 3723 - 3782 3.0 + Prom 4246 - 4305 6.8 3 2 Op 1 . + CDS 4325 - 4714 279 ## c1204 hypothetical protein + Term 4724 - 4758 4.4 4 2 Op 2 . + CDS 4779 - 5837 614 ## COG0500 SAM-dependent methyltransferases 5 2 Op 3 . 
+ CDS 5854 - 6600 465 ## c1202 hypothetical protein 6 2 Op 4 4/0.000 + CDS 6579 - 7418 352 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase 7 2 Op 5 4/0.000 + CDS 7393 - 7650 392 ## COG0236 Acyl carrier protein 8 2 Op 6 3/0.000 + CDS 7662 - 7913 403 ## COG0236 Acyl carrier protein 9 2 Op 7 3/0.000 + CDS 7918 - 8499 310 ## COG4648 Predicted membrane protein 10 2 Op 8 2/0.250 + CDS 8496 - 9854 524 ## COG0365 Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases 11 2 Op 9 3/0.000 + CDS 9841 - 10194 223 ## COG0764 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases 12 2 Op 10 2/0.250 + CDS 10185 - 11861 1424 ## COG4261 Predicted acyltransferase 13 2 Op 11 . + CDS 11865 - 12287 476 ## COG0824 Predicted thioesterase 14 2 Op 12 . + CDS 12284 - 12889 720 ## CKO_04918 hypothetical protein 15 2 Op 13 . + CDS 12858 - 15176 2179 ## COG4258 Predicted exporter 16 2 Op 14 . + CDS 15173 - 15757 510 ## ECH74115_4810 hypothetical protein 17 2 Op 15 3/0.000 + CDS 15759 - 16928 1004 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase 18 2 Op 16 4/0.000 + CDS 16973 - 17389 308 ## COG4706 Predicted 3-hydroxylacyl-(acyl carrier protein) dehydratase 19 2 Op 17 11/0.000 + CDS 17386 - 18120 182 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 20 2 Op 18 . + CDS 18117 - 19346 1180 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase + Term 19356 - 19390 3.6 21 3 Tu 1 . - CDS 20186 - 21217 705 ## COG1609 Transcriptional regulators - Prom 21435 - 21494 6.1 + Prom 21225 - 21284 3.3 22 4 Op 1 13/0.000 + CDS 21488 - 21931 422 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 23 4 Op 2 11/0.000 + CDS 21947 - 22234 166 ## PROTEIN SUPPORTED gi|148984431|ref|ZP_01817719.1| PTS system, IIB component, putative 24 4 Op 3 . + CDS 22268 - 23503 1169 ## COG3037 Uncharacterized protein conserved in bacteria + Term 23517 - 23568 6.1 - Term 23509 - 23552 1.4 25 5 Tu 1 . 
- CDS 23719 - 24003 109 ## COG2963 Transposase and inactivated derivatives - Prom 24169 - 24228 5.0 + Prom 24207 - 24266 5.1 26 6 Tu 1 . + CDS 24324 - 24479 68 ## - Term 24365 - 24403 6.0 27 7 Op 1 . - CDS 24425 - 25438 686 ## EC55989_4931 deoxyribose specific mutarotase 28 7 Op 2 2/0.250 - CDS 25450 - 26766 809 ## COG0738 Fucose permease 29 7 Op 3 . - CDS 26794 - 27714 777 ## COG0524 Sugar kinases, ribokinase family + Prom 27897 - 27956 3.9 30 8 Tu 1 . + CDS 28017 - 28799 491 ## COG1349 Transcriptional regulators of sugar metabolism - Term 29700 - 29753 -0.7 31 9 Op 1 . - CDS 29802 - 30236 119 ## ECUMN_4871 hypothetical protein 32 9 Op 2 . - CDS 30224 - 30625 144 ## ECB_01907 hypothetical protein 33 10 Tu 1 . - CDS 30792 - 31361 531 ## ECUMN_3393 hypothetical protein Predicted protein(s) >gi|296493352|gb|ADTK01000149.1| GENE 1 2 - 2024 1668 674 aa, chain - ## HITS:1 COG:PA2462 KEGG:ns NR:ns ## COG: PA2462 COG3210 # Protein_GI_number: 15597658 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Large exoproteins involved in heme utilization or adhesion # Organism: Pseudomonas aeruginosa # 41 673 42 720 5627 244 35.0 4e-64 MNQPPVHFTYRLLSYLVSAIIAGQPLLPAVGAVITPQNGAGMDKAANGVPVVNIATPNGA GISHNRFTDYNVGKEGLILNNATGKLNPTQLGGLIQNNPNLKAGGEAKGIINEVTGGNRS LLQGYTEVAGKAANVIVANPYGITCDGCGFINTPHATLTTGRPVMNADGSLQALEVTEGS ITINGAGLDGTRSDAVSIIARATEVNAALHAKDLTVTAGANRITADGRVSALKGEGDVPK VAVDTGALGGMYARRIHLTSTESGVGVNLGNLYARDGDITLDASGRLTVNNSLATGAVTA KGQGVTLTGDHKAGGNLSVSSRSDIVLSNGTLNSDKNLSLTAGGRITQQNEKLTAGRDVT LAAKNITQDTASQINAARDIVTVASDTLTTQGQITAGQNLTASATTLTQDGILLAKSHAG LDAGTLNNSGAVQGASLTLGSTTLSNSGSLLSGGPLTVNTRDFTQSGRTGAKGKVDITAS GKLTSTGSLVSDDALVLKAQDVTQNGVLSGGKGLTVSAQTLSSGKKSVTHSDAAMTLNVT TVALDGETSAGDTLRVQADRLSTAAGAQLQSGKNLSINARDARLAGTQAAQQTMVVNASE KLTHSGKSSAPSLSLSAPELTSSGVLVGSALNTQSQTLTNSGLLQGEASLTVNTQRLDNQ QNGTLYSAAGLTLD >gi|296493352|gb|ADTK01000149.1| GENE 2 2037 - 3644 731 535 aa, chain - ## HITS:1 COG:PA2463 KEGG:ns NR:ns ## COG: PA2463 COG2831 # 
Protein_GI_number: 15597659 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Hemolysin activation/secretion protein # Organism: Pseudomonas aeruginosa # 3 535 27 565 565 274 33.0 4e-73 MLSPGVRSAIQQQQQQLLDENQRQRDALERSAPLTITPSREMSAGTEGPCFTVSRIVVSG ATRLTSAETDRLVAPWVNQCLNITGLTAVTDAVTDGYIRRGYITSRAFLTEQDLSGGVLH ITVMEGRLQQIRAEGADLPARTLKMVFPGMEGKVLNLRDIEQGMEQINRLRTEPVQIEIS PGDREGWSVVTLTASPEWPVTGSAGIDNSGQKNTGTGQLNGVLSFNNPLGLADNWFVSGG RSSDFSVSHDARNFAAGVSLPYGYTLVDYTYSWSDYLSTIDNRGWRWRSTGDLQTHRLGL SHVLFRNGDMKTALTGGLQHRIIHNYLDDVLLQGSSRKLTSFSVGLNHTHKFLSDVGTLN PVFTRGMPWFGAESDHGKRGDLPVNQFRKWSVSASFQRPVTDRVWWLTSAYAQWSPDRLH GVEQLSLGGESSVRGFKEQYISGNNGGYLRNELSWSLFSLPYVGTVRAVAALDGGWLHSD RDDPYSSGTLWGGAAGLSTTSDYVSGSFTAGLPLVYPDWLAPDHLTVYWRVAVAF >gi|296493352|gb|ADTK01000149.1| GENE 3 4325 - 4714 279 129 aa, chain + ## HITS:1 COG:no KEGG:c1204 NR:ns ## KEGG: c1204 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_CFT073 # Pathway: not_defined # 1 129 1 129 129 230 99.0 1e-59 MKRYIKWFAITIFISMLSACARTAPVQQISTTVSVGHTQEQVKNAILKAGAQRKWIMTQV SPGVIKARYQTRNHVAEVRITYTATYYNIKYDSSLNLQASDGKIHKNYNRWVRNLDKDIQ VNLSTGATL >gi|296493352|gb|ADTK01000149.1| GENE 4 4779 - 5837 614 352 aa, chain + ## HITS:1 COG:Z4850 KEGG:ns NR:ns ## COG: Z4850 COG0500 # Protein_GI_number: 15803988 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli O157:H7 EDL933 # 1 352 1 352 352 701 94.0 0 MYEQDTLSALDAITEAQRIAFAPMLFQAALCLRNAGILDYLDQQGKQGAPLNAITEHTVL NEYAVGVLLDMGLSGRIITCKEGIYYLAKIGHYLLHDTMTRVNMDFTQDVCYQGLFFLAD SLNEGKPSGLKVFGDWPTIYPALSQLPDAARDSWFAFDHYYSDGAFNAALPYVFANNPTT LYDVGGNTGKWALRCCKFNENIAVTLLDLPQQIVLAKENIANAGFSDRIDFHAVDMLSDA PLPGEADIWWMSQFLDCFSPEQIISILSKIASVMKPGAKLCIMELFWDAQRFEATSFSLN ASSLYFTCMANGNSRFYSAEKFYDYLNKAGFQVAERHDNLGVGHTLLICQKK >gi|296493352|gb|ADTK01000149.1| GENE 5 5854 - 6600 465 248 aa, chain + ## HITS:1 COG:no KEGG:c1202 
NR:ns ## KEGG: c1202 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_CFT073 # Pathway: not_defined # 1 248 1 248 248 478 98.0 1e-133 MVYRGRALMKFALNITNWQALAPGLNDVQQWQAWSRQPWAIDPAAPLAKLSELPMMTARR LSSGSKLAVECGLTMLRRHQPDAVLYTSRHGELERNYRIVHALATEQALSPTDFALSVHN SSVGNLTIVAKQPIVSSSLSAGRDSFQQGLCEVLSLLQAGYQRVLMVDFDGFLPEFYHPQ LPAEMPTWPYAVALVIEAGDDWQCETQPAIAVNETTLPQSILFLQHYLQNADAFSLPGER VQWRWSRR >gi|296493352|gb|ADTK01000149.1| GENE 6 6579 - 7418 352 279 aa, chain + ## HITS:1 COG:Z4852 KEGG:ns NR:ns ## COG: Z4852 COG0204 # Protein_GI_number: 15803990 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Escherichia coli O157:H7 EDL933 # 7 279 1 273 273 454 95.0 1e-128 MALEPQMSKLMSRLGWAWRLVMTGLCFVLFGLGGLLLSVVWFNILLVLVWDTSRRRRLAR RSIAASFRLFLTVAKGLGVLDYRIDGAEILRQERGCLVVANHPTLIDYVLLASVMPETDC LVKSALLKNPFLGGVVRAADYLINSEAETLLPRCQQRLAQGDTILIFPEGTRTKPGEKMT LQRGAANIAVRCTSDLRIVTIGCSEHLLDKQSKWYDVPPARPFFTVEVRGRVEINKFYDA TSQEPTLAARQLNRHLLLQLQQSCLPLSGINNASALSGN >gi|296493352|gb|ADTK01000149.1| GENE 7 7393 - 7650 392 85 aa, chain + ## HITS:1 COG:ECs4328 KEGG:ns NR:ns ## COG: ECs4328 COG0236 # Protein_GI_number: 15833582 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Escherichia coli O157:H7 # 1 85 1 85 85 143 96.0 9e-35 MQALYLEIKNLIISTLNLDDLTPDDIDINAPLFGDGLGLDSIDALELGLAVKNEYGIVLS AESEEMRQHFFSVATLASFIAAQRA >gi|296493352|gb|ADTK01000149.1| GENE 8 7662 - 7913 403 83 aa, chain + ## HITS:1 COG:ECs4329 KEGG:ns NR:ns ## COG: ECs4329 COG0236 # Protein_GI_number: 15833583 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Escherichia coli O157:H7 # 1 83 1 83 83 113 93.0 1e-25 MTDQQTIYQEVSALLVKLFEIDPQDIKPEARLYEDLELDSIDAVDMIVHQQKKTGKKIKP EEFKAVRTVQDVVEAVERLLQEG >gi|296493352|gb|ADTK01000149.1| GENE 9 7918 - 8499 
310 193 aa, chain + ## HITS:1 COG:ECs4330 KEGG:ns NR:ns ## COG: ECs4330 COG4648 # Protein_GI_number: 15833584 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 190 1 190 193 243 83.0 1e-64 MSGIRSLPMIKLLTGLLLLAWPFLIWLGLVHNSLHWLLPLMALLLLLRLRQPRRQTGPLQ VVTKIVAVVGIVLCVSSFLLKTHQLLLFYPVVVNAVMLAVFGGSLRSTMPIVERLARLQE PDLPEKAVRYTRRVTQVWCLFFIFNGSIALFTTLYGNMPLWTAWNGIIAYLLIGFLTAGE WLIRCQMIKRKTP >gi|296493352|gb|ADTK01000149.1| GENE 10 8496 - 9854 524 452 aa, chain + ## HITS:1 COG:ECs4331 KEGG:ns NR:ns ## COG: ECs4331 COG0365 # Protein_GI_number: 15833585 # Func_class: I Lipid transport and metabolism # Function: Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases # Organism: Escherichia coli O157:H7 # 1 452 1 453 453 622 66.0 1e-178 MNQTLSLSQWLTASRPVTTPVAWLGEYTWTLGHLRHDVALLIDHLRDQPGNRWALCFENS YLFIVALLATLHSGKIPVIPGHNRAALLNVQRSLFDCVLSDKRLGWNGPQFVVSTSHQMT TSTIAFDDIASDAFIELFTSGSTGQPKQIAKPVISLDQEANMLATHFADRLLGCRFVASV MPQHLYGLTFRIFLPMALGLPLHAAMLWYSEQLAALSHTYRYAFVSSPAFLKRLDLHLTP PPVEMILSAGGMLPWQDVEKTSTWLNIWPDEIYGSTETGILAWRYRQQDNIPWFTFSDVH LSQESEGVRVFSPLIPAEGLVLDDMLQFNENGQFHLIGRRGRVVKIEEKRISLVEIEQRL LALDGIMDVAAIPLIRNGRQAIGVVVVPNKEARQIWQCSGGKTLELSWRKALLPWLEPVA VPRYWRIIDEIPVNNMNKRVYAQLQELFHDTP >gi|296493352|gb|ADTK01000149.1| GENE 11 9841 - 10194 223 117 aa, chain + ## HITS:1 COG:ECs4332 KEGG:ns NR:ns ## COG: ECs4332 COG0764 # Protein_GI_number: 15833586 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases # Organism: Escherichia coli O157:H7 # 1 117 1 117 117 176 72.0 1e-44 MIRHEIERHQVQPQEMEIVLYLDPMLYWFNGHFAVQPLLPGVAQLDWVMHYATTLLAPGW RFRSIQNVKFLAPLIPETTVTLQLTWQETSQVLTFCYQRHDGDARHTASSGKIRLCR >gi|296493352|gb|ADTK01000149.1| GENE 12 10185 - 11861 1424 558 aa, chain + ## HITS:1 COG:Z4858_2 KEGG:ns NR:ns ## COG: Z4858_2 COG4261 # Protein_GI_number: 15803996 # Func_class: R General function prediction only # Function: Predicted acyltransferase # 
Organism: Escherichia coli O157:H7 EDL933 # 247 558 1 312 312 624 97.0 1e-178 MSVNFSPCVLIPCYNHGAMMPGVLARLKPFNLPCIVVDDGSDAATQQQLDNLVAEQPGVT LIRLAENTGKGAAVMRGLQAAADAGFSHAVQVDADGQHAIEDIPKLLALAEQHPVALISG QPIYDDSIPRSRLYGRWVTHVWVWIETLSLQLKDSMCGFRVYPVAPTLQLAKYATIGKRM DFDTEVMVRLYWQGNTSYFVPTRVTYPQDGLSHFDALKDNVRISLMHTRLFFGMLPRIPS LLMRRSSSHWARQSEVKGLWGMRLMLLVWRLLGRTAFSALLYPVVGVYWLTAARARKASQ DWLARVRQHQPQAAKLNSYQHFLRFGNAMLDKIASWRGELQLGHDVLFAPGAEAALDVRD PRGKLLLASHLGDVEVCRALAKIQGYKTINALVFSENAQRFKQIMQEMAPQAGINLMPVT DIGPETAILLKEKLDNGEWVAIVGDRIAVTPQRGGDWRVCWSPFMGQPAPFPQGPFILAS ILRCPVNLIFALRQRGKLHIHCETFADPLLLPRGERQQALQNAIDHYAARLEHYALQSPL DWFNFFDFWQLPEIQDKE >gi|296493352|gb|ADTK01000149.1| GENE 13 11865 - 12287 476 140 aa, chain + ## HITS:1 COG:ECs4334 KEGG:ns NR:ns ## COG: ECs4334 COG0824 # Protein_GI_number: 15833588 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Escherichia coli O157:H7 # 1 140 1 140 140 277 98.0 4e-75 MLNDPRFTAEVELTIPFHDVDMMGVAWHGNYFRYFEVAREALLNQFNYGYRQMKESGYLW PVVDARVKYRHALTFEQRIRVRAHIEEFENRLRIGYQIFDAETGKRATTGYTIQVAVEEQ SRELCFVSPDILFERIGVKP >gi|296493352|gb|ADTK01000149.1| GENE 14 12284 - 12889 720 201 aa, chain + ## HITS:1 COG:no KEGG:CKO_04918 NR:ns ## KEGG: CKO_04918 # Name: not_defined # Def: hypothetical protein # Organism: C.koseri # Pathway: not_defined # 1 201 1 201 201 331 92.0 9e-90 MKFLPLLVLLISPFVSALTLDDLQQRFTEQPVIRAHFDQTRTIKDLPQPLRSQGQMLIAR DQGLLWDQTSPFPMQLLLDDKRMVQVINGQPPQIITAENNPQMFQFNHLLRALFQADRKV LEQNFRVEFADKGEGRWTLRLTPTTTPLDKIFNTIDLAGQTYLESIQLNDKQGDRTDIAL TQHQLTPAQLTDDERQRFATQ >gi|296493352|gb|ADTK01000149.1| GENE 15 12858 - 15176 2179 772 aa, chain + ## HITS:1 COG:Z4861 KEGG:ns NR:ns ## COG: Z4861 COG4258 # Protein_GI_number: 15803999 # Func_class: R General function prediction only # Function: Predicted exporter # Organism: Escherichia coli O157:H7 EDL933 # 1 772 1 772 772 1343 97.0 0 MTNANALPPSKRPALLWGLVCLVMAAALLILLPQSRLNSSVLAMLPKQAMGDIPPALNDG 
FMQRLDRQLVWLVSPGKEANPRVAQEWLTLLQKSAALGDIKGPMDAASQQAWGAFFWQHR NGLIDPDTRARLQNGGEAQAKWILSQLYSAFSGVSGKELQNDPLMLMRGSQLAMAKNGQR LRLMDGWLVTQDPQGNYWYLLHGELAGSSFDMQQTHQLITTLNTLEKALKTRYPQAQLLS RGTVFYSDYASQQAKQDISTLGVATLLGVILLIVAVFRSLRPLLLCVISIGIGALAGTAA TLLIFGELHLMTLVMSMSVIGISADYTLYYLTERMVHGNDVSPWQSLAKVRNALLLALLT TVAAYLIMMLAPFPGIRQMAIFAAVGLSASCLTVLFWHPWLCRGLPVRPVPAMALMLRWL AAWRRNKKLSRGLPVALALFSLAGMSMLRVDDDISQLQALPQHILAQEKAITALTGQSVD QKWFVVYGDSPQQTLRRLEKYTASLEYAKKEGLISNYRTIPLNSLARQEEDLDLLKTAAP TVTKALQNAGLTAVKPDLNAMPVKVDEWLASPASEGWRLLWLTLENGESGVLVPVEGVKS SALLQEIATYYPCGIAWVDRKSTFDELFALYRYVLTGLLLVALAVIACGAVARLGWRKGL ISLVPSVLSLGCGLAVLAMSGQAVNLFSLLALVLVLGIGINYTLFFSNPRGTPLTSLLAI ALAMLTTLLTLGMLVFSATQAISSFGIVLVSGIFTAFLLSPLAMPDKKRTKK >gi|296493352|gb|ADTK01000149.1| GENE 16 15173 - 15757 510 194 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_4810 NR:ns ## KEGG: ECH74115_4810 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 194 1 194 194 375 99.0 1e-103 MIKSTFWRAFALTATLILTGCSHSQPEQEGRPQAWLQPGTRITLPAPGISPAVNSQQLLT GSFNGKTQSLLVMLNADDQKITLAGLSSVGIRLFLVTYDAQGLRAEQSIVVPQLPPASQV LADVMLSHWPISAWQPQLPAGWTLRDNGDKRELRNASGKLVTEITYLNRQGKRVPISIEQ HVFKYHITIQYLGD >gi|296493352|gb|ADTK01000149.1| GENE 17 15759 - 16928 1004 389 aa, chain + ## HITS:1 COG:ECs4338 KEGG:ns NR:ns ## COG: ECs4338 COG0304 # Protein_GI_number: 15833592 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Escherichia coli O157:H7 # 1 389 1 389 389 732 98.0 0 MIYISAVGMINALGNNLDEIAANLARGVAPGMRPRAGWLQGHPQAVLAGVDGELPLIPEK FAAHRSRNNQILLAALAQLQPQVDDAIAKYGCERIAIVLGTSTSGLHEGDTHVNLHTHGQ PSTTWHYAQQELGDPSRFLSHWLALDGPAYTLSTACSSSARAIISGRRLIEAGLVDAAIV GGADTLSRMPINGFHSLESLSPTLCQPFGRDRAGITIGEGAGLMLLTREPQPIALLGVGE SSDAYHISAPHPQGEGAIRAINQALTDAQLTPDDVGYINLHGTATQLNDQIESMVVNALF GERVPCSSTKHLTGHTLGAAGITEAAISMLILQRDLPLPAQDFSLSPRDPTLPPCGIIEK PQPLARPVILSNSFAFGGNNASILLGRVS 
>gi|296493352|gb|ADTK01000149.1| GENE 18 16973 - 17389 308 138 aa, chain + ## HITS:1 COG:ECs4339 KEGG:ns NR:ns ## COG: ECs4339 COG4706 # Protein_GI_number: 15833593 # Func_class: I Lipid transport and metabolism # Function: Predicted 3-hydroxylacyl-(acyl carrier protein) dehydratase # Organism: Escherichia coli O157:H7 # 1 138 17 154 154 272 97.0 1e-73 MLLLEDVVSVSDDSAVCRVTVSPSGVLAPFLDPDGNLPGWFALELMAQTVGVWSGWHRHQ QGKNSIELGMILGARELLCAAGILPAGQTLTITVKLLMQDDRFGSFECSINDGEATGRIN TFQPTAEELTTLFQQGAS >gi|296493352|gb|ADTK01000149.1| GENE 19 17386 - 18120 182 244 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 1 241 1 238 242 74 25 7e-13 LMSRSVLVTGASKGIGRAIACQLAADGFNIGVHYHRDATGAQETLNAIVANGGNGRLLSF DVANREQCREVLEHEIAQHGAWYGVVSNAGIARDAAFPALSDDDWDAVIHTNLDSFYNVI QPCIMPMIGSRQGGRIITLSSVSGVMGNRGQVNYSAAKAGIIGATKALAIELAKRKITVN CIAPGLIDTGMIEMEESALKEAMSMIPMKRMGQAEEVAGLASYLMSDIAGYVTRQVISIN GGML >gi|296493352|gb|ADTK01000149.1| GENE 20 18117 - 19346 1180 409 aa, chain + ## HITS:1 COG:ECs4341 KEGG:ns NR:ns ## COG: ECs4341 COG0304 # Protein_GI_number: 15833595 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Escherichia coli O157:H7 # 1 409 1 409 409 822 100.0 0 MTRRVVITGMGGVTAFGENWQDVSARLLAYENAVRKMPEWQVYDGLHTLLGAPVDDFTLP EHYTRKRIRAMGRVSQMSTRASELALEQAGLIGDPILTSGETGIAYGSSTGSTGPVSEFA TMLTEKHTNNITGTTYVQMMPHTTAVNTGLFFGLRGRVIPTSSACTSGSQAIGYAWEAIR HGYQTVMVAGGAEELCPSEAAVFDTLFATSQHNDAPKTTPSPFDENRDGLVIGEGAGTLI LEELEHAKARGATIYGEIVGFATNCDAAHITQPQRETMQYCMEQSLKIAGLSAQDIGYIS AHGTATDRGDMAESLATATIYGDNVPLSSLKSYFGHTLGACGALEAWMSLQMMREGWFAP TLNLNKPDPNCGALDYIMHEARKVDCEFLQSNNFAFGGINTSIIIKRWP >gi|296493352|gb|ADTK01000149.1| GENE 21 20186 - 21217 705 343 aa, chain - ## HITS:1 COG:YPO2568 KEGG:ns NR:ns ## COG: YPO2568 COG1609 # Protein_GI_number: 16122786 # Func_class: K Transcription # Function: 
Transcriptional regulators # Organism: Yersinia pestis # 1 343 1 344 344 541 77.0 1e-154 MEKTRKRRGTGRVTLQEVANFAGVGTMTVSRALRTPEQVSDKLREKIEQAVEELGYIPNR TAGALASGHSHTVAVLVPSLTDKASSRFMQSLQQVLNKNEFQLLLGCHEYNQLKEAEILM TLLQGNPAALVIFGSQLADKTHQILEKTNIPTINVIGSPFSPAQITIETAFFEASHKLTE HLLEQGYKNIGFIGAHMDNRLQRQQLNGWHKAMLSHYQNTDLVITMPDAASLQLGRYALN EILLRQPELDAVICSHEEIALGIMFECQRRLLKIPGNIAVACLDGSDNCDQTHPTLTSIR IDYKKMGIETGKLLIGLLNNNHDESEESRIVQFNYQIELRQST >gi|296493352|gb|ADTK01000149.1| GENE 22 21488 - 21931 422 147 aa, chain + ## HITS:1 COG:YPO2569 KEGG:ns NR:ns ## COG: YPO2569 COG1762 # Protein_GI_number: 16122787 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Yersinia pestis # 1 147 1 147 147 221 74.0 4e-58 MLKNLLNTEVVQVVEQVKDWREAVAISCRPLIENGSIEPRYVDAIYHSHDTIGPYYVVGP GIAMPHARPEEGANKLSLALTLIPSGVNFDADENDPVKLLIVLAATDSTSHIEAISQLAK LFDNEKDIQAILTAKTTQDILSVIARY >gi|296493352|gb|ADTK01000149.1| GENE 23 21947 - 22234 166 95 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148984431|ref|ZP_01817719.1| PTS system, IIB component, putative [Streptococcus pneumoniae SP3-BS71] # 1 95 2 94 94 68 37 5e-11 MKITVVCGNGLGTSLMMEMSIKSILKDLSVSADVDHVDLGSAKGTPSDIYVGTKDIAEQL AAQSVAGKIVALENMIDKKAMRERLSVVLTELGAL >gi|296493352|gb|ADTK01000149.1| GENE 24 22268 - 23503 1169 411 aa, chain + ## HITS:1 COG:YPO2782 KEGG:ns NR:ns ## COG: YPO2782 COG3037 # Protein_GI_number: 16122986 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Yersinia pestis # 1 411 8 418 418 665 95.0 0 MSDVLSEPAVLVGLIALIGLIAQKKPVTECIKGTVKTIMGFVILGAGAGLVVSSLGDFAN IFQHAFGIQGVVPNNEAIVSVAQKSFGKEMAMIMFFAMVINIMIARFTPWKFIFLTGHHT LFMSMMVAVILATAGMTGITLIAVGSLVVGVAMVFFPAIAHPYMKKVTGSDDVAIGHFST LSYVLAGFIGSKFGNKEHSTEDMNVPKSLLFLRDTPVAISFTMSIIFLVTCLFAGADAVK ELSGGKNWFMFSIMQSITFAAGVYIILQGVRMVIAEIVPAFKGISDKLVPNARPALDCPV VFPYAPNAVLVGFLSSFAAGLIGMFTLYLLNMIVIIPGVVPHFFVGAAAGVFGNATGGRR 
GAILGAFAQGLLITFLPVFLLPVLGDIGFANTTFSDADFGALGILLGIIVR >gi|296493352|gb|ADTK01000149.1| GENE 25 23719 - 24003 109 94 aa, chain - ## HITS:1 COG:ECs1337 KEGG:ns NR:ns ## COG: ECs1337 COG2963 # Protein_GI_number: 15830591 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 90 1 90 141 128 80.0 3e-30 MEQKTLSVEPRRSFSNEFKLQMVKLALQPRGSVARIAREHDINDNLLFKWLRLSLIKGRI SRRLPVTNSSGIGVELLPVEMTLDERPYPVFYAR >gi|296493352|gb|ADTK01000149.1| GENE 26 24324 - 24479 68 51 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIVYKGNAIIVLSWRVKKRAYLNDKRAAVQALSGYSIPVVTVKVRVCCGAT >gi|296493352|gb|ADTK01000149.1| GENE 27 24425 - 25438 686 337 aa, chain - ## HITS:1 COG:no KEGG:EC55989_4931 NR:ns ## KEGG: EC55989_4931 # Name: deoM # Def: deoxyribose specific mutarotase # Organism: E.coli_55989 # Pathway: not_defined # 1 337 1 337 337 713 100.0 0 MSTRINLWRALFGEKPRILLENSDFTVTSFRYDSGVEGLKIANSRGHLIILPWMGQMIWD AQFDGHGLTMCNMFRQPKPATEVIETYGCFAFHSGLLANGCPSAEDTHLLHGEMACAAMD EAWLELDGDMLRLNGRYEYVMGFGHHYLAQPTVVLHKSSTLFDIKMAVTNLASVDMPLQY MCHMNYAYIPNATFSQNIPDEILRLRESVPSHVNPTAQWLAFNQRIMQGEASLSTLSQPE FYDPEIVFFADKLDAYTDQPEFRMISPDGTTFVTRFYSAELNYVTRWILYNGEQQVAAFA LPATCRPEGYLAAQRNGTLIQVAPQQTRTFTVTTGIE >gi|296493352|gb|ADTK01000149.1| GENE 28 25450 - 26766 809 438 aa, chain - ## HITS:1 COG:STM3792 KEGG:ns NR:ns ## COG: STM3792 COG0738 # Protein_GI_number: 16767076 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Salmonella typhimurium LT2 # 1 438 1 438 438 751 96.0 0 MNDKNIIQMPDGYLNKTPLFQFILLSCLFPLWGCAAALNDILITQFKSVFSLSNFASALV QSAFYGGYFFIAIPASLVIKKTSYKVAILIGLTLYIGGCTLFFPASHMATYTMFLAAIFA IAIGLSFLETAANTYSSMIGPKAYATLRLNISQTFYPIGAASGILLGKYLVFSEGESLEK QMSGMNAEQIHNFKVLMLENTLEPYKYMIMILVVVMVLFLLTRFPTCKVAQTSHHKRPSA MDTLRYLARNPRFRRGIVAQFLYVGMQVAVWSFTIRLALELGDINERDASNFMVYSFACF FIGKFIANILMTRFNPEKVLILYSVIGALFLAYVALAPSFSAVYVAVLVSVLFGPCWATI YAGTLDTVDNEHTEMAGAVIVMAIVGAAVVPAIQGYIADMFHSLQLSFLVSMLCFVYVGV 
YFWRESKVRTALAEVTAS >gi|296493352|gb|ADTK01000149.1| GENE 29 26794 - 27714 777 306 aa, chain - ## HITS:1 COG:STM3793 KEGG:ns NR:ns ## COG: STM3793 COG0524 # Protein_GI_number: 16767077 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Salmonella typhimurium LT2 # 1 306 1 306 306 599 96.0 1e-171 MDIAVIGSNMVDLITYTNQMPKEGETLEAPAFKIGCGGKGANQAVAAAKLNSKVLMLTKV GDDIFADNTIRNLESWGINTTYVEKVPCTSSGVAPIFVNANSSNSILIIKGANKFLSPED IDRAAEDLKKCKLIVLQLEVQLETVYHAIEFGKKNGIEVLLNPAPALRELDMSYACKCDF FIPNETELEILTGMSVDTYDHIRLAARSLVDKGLNNIIVTMSEKGALWMTRDQEVHVPAF KVNAVDTSGAGDAFIGCFSHYYVQSGDVEAALKKAALFAAFSVTGKGTQSSYPSIEQFNE FLTLNE >gi|296493352|gb|ADTK01000149.1| GENE 30 28017 - 28799 491 260 aa, chain + ## HITS:1 COG:STM3794 KEGG:ns NR:ns ## COG: STM3794 COG1349 # Protein_GI_number: 16767078 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Salmonella typhimurium LT2 # 1 260 1 261 261 430 82.0 1e-120 METKQKERIRRLIEILKKTDRIHLKDAARMLEVSVMTIRRDLHQEDEPLPLTLLGGYIVM VHKPAPSMPVIQDVPRNHRDDLPIAILAAGMVNENDLIFFDNGQEIPLVISMIPDAITFT GICYSHRVFVALNEKPNVTAILCGGTYRARSDAFYDASNSSPLDSLNPRKIFISASGVHD HFGVSWFNPEDLATKRKAMARGLRKILLARHALFDEVASASLAPLSAFDVLISERPLPAD YVTHCRNASVKIITPDSEDE >gi|296493352|gb|ADTK01000149.1| GENE 31 29802 - 30236 119 144 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_4871 NR:ns ## KEGG: ECUMN_4871 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 144 30 173 173 290 96.0 1e-77 MRILNCYMANDSKGHFVTAKEAAKHNRRDILCCVSCGCPLTLKRGNHGQPPWFEHDQMTV AAKILLRCTWLDPAEKEARRLHLQGMTVPDYTVKVRKWFCVMCDEDYEGEKCCPRCGTGV YSRAWGRQEVPSEDARADNPLQRL >gi|296493352|gb|ADTK01000149.1| GENE 32 30224 - 30625 144 133 aa, chain - ## HITS:1 COG:no KEGG:ECB_01907 NR:ns ## KEGG: ECB_01907 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 133 12 144 144 247 91.0 9e-65 
MYAKSFLALDGNGRLTGARTAQTAPYDRYTCHLCGSALRYHPQYDTERPWFEHTDDGLTA HGQQCPYVRPERREVRLIKRLQQFVPDALPVVRKASWHCRQCHHDYYGEQYCTNCQTGGF SIPRTTQEEICEF >gi|296493352|gb|ADTK01000149.1| GENE 33 30792 - 31361 531 189 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_3393 NR:ns ## KEGG: ECUMN_3393 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 189 13 201 201 370 95.0 1e-101 MTYKYNPFWQQRIRETVRHALNVHPRLTALRVDLRFPDVPAATDAAVISRFINALKARID AYQKRKHREGKRVHPTTLHYVWAREFGECKGKKHYHLMLLVNRDTWCRAGDYRDPGSLAG MIKQAWCSALGVDAGCYATLVHFPVRPAVWLERDDETGFQQVLERAGYLAKEHTKARGTG ERNFGCSRG Prediction of potential genes in microbial genomes Time: Mon May 16 15:32:25 2011 Seq name: gi|296493351|gb|ADTK01000150.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont360.1, whole genome shotgun sequence Length of sequence - 784 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 15:32:25 2011 Seq name: gi|296493350|gb|ADTK01000151.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont368.1, whole genome shotgun sequence Length of sequence - 596 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 595 435 ## COG5492 Bacterial surface proteins containing Ig-like domains Predicted protein(s) >gi|296493350|gb|ADTK01000151.1| GENE 1 1 - 595 435 198 aa, chain + ## HITS:1 COG:Z0972 KEGG:ns NR:ns ## COG: Z0972 COG5492 # Protein_GI_number: 15800508 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Escherichia coli O157:H7 EDL933 # 1 198 34 231 249 331 97.0 7e-91 SDTDWLRLAMVKDLQPGEMTADAEDDTYLDDEDADWKTTTQGQKSVGDTSATLAWRPGDS GQKKLVQLFDSGEVCAFRIKYPNGTVDVFRGWLSSLGKTIASKDVMTRTVKISGVGRPYL AEEGTETVSVTGLTVSPASASVKVGATTTLTFTVKPDGASDKAISVHSSDPQTATVTLNG LVATVKGVKQGSVSIVGM Prediction of potential genes in microbial genomes Time: Mon May 16 15:32:26 2011 Seq name: gi|296493349|gb|ADTK01000152.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont375.1, whole genome shotgun sequence Length of sequence - 679 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 678 541 ## COG5525 Bacteriophage tail assembly protein Predicted protein(s) >gi|296493349|gb|ADTK01000152.1| GENE 1 3 - 678 541 225 aa, chain + ## HITS:1 COG:ECs1630 KEGG:ns NR:ns ## COG: ECs1630 COG5525 # Protein_GI_number: 15830884 # Func_class: R General function prediction only # Function: Bacteriophage tail assembly protein # Organism: Escherichia coli O157:H7 # 1 225 395 619 641 441 100.0 1e-124 YLTAGIDSQLDRYEMRVWGWGPGEESWLIDRQIIMGRHDDEQTLLRVDEAINKTYTRRNG AEMSVSRICWDTGGIDPTIVYERSKKHGLFRVIPIKGASVYGKPVASMPRKRNKNGVYLT EIGTDTAKEQIYNRFTLTPEGDEPLPGAVHFPNNPDIFDLTEAQQLTAEEQVEKWVDGRK KILWDSKKRRNEALDCFVYALAALRISISRWQLDLSALLASLQEE Prediction of potential genes in microbial genomes Time: Mon May 16 15:32:28 2011 Seq name: gi|296493348|gb|ADTK01000153.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont377.1, whole genome shotgun sequence Length of sequence - 5138 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 2 - 1244 1004 ## COG4644 Transposase and inactivated derivatives, TnpA family 2 1 Op 2 . - CDS 1248 - 1808 369 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs - Prom 1879 - 1938 4.2 3 2 Tu 1 . + CDS 2982 - 3695 310 ## COG0840 Methyl-accepting chemotaxis protein + Term 3716 - 3757 0.2 - Term 3750 - 3793 -0.8 4 3 Tu 1 . - CDS 3843 - 4040 160 ## pECS88_0115 colicin V precursor (microcin V bacteriocin) - Prom 4201 - 4260 6.2 5 4 Op 1 . + CDS 4562 - 4816 143 ## COG0673 Predicted dehydrogenases and related proteins 6 4 Op 2 . 
+ CDS 4788 - 5114 71 ## APECO1_O1CoBM84 hypothetical protein Predicted protein(s) >gi|296493348|gb|ADTK01000153.1| GENE 1 2 - 1244 1004 414 aa, chain - ## HITS:1 COG:CAP0094 KEGG:ns NR:ns ## COG: CAP0094 COG4644 # Protein_GI_number: 15004798 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives, TnpA family # Organism: Clostridium acetobutylicum # 7 298 8 301 314 133 25.0 6e-31 MPRRLILSATERDTLLALPESQDDLIRYYTFNDSDLSLIRQRRGDANRLGFAVQLCLLRY PGYALGTDSELPEPVILWVAKQVQAEPASWAKYGERDVTRREHAQELRTYLQLAPFGLSD FRALVRELTELAQQTDKGLLLAGQALESLRQKRRILPALSVIDRACSEAIARANRRVYRA LVEPLTDSHRAKLDELLKLKAGSSITWLTWLRQAPLKPNSRHMLEHIERLKTFQLVDLPE GLGRHIHQNRLLKLAREGGQMTPKDLGKFEPQRRYATLAAVVLESTATVIDELVDLHDRI LVKLFSGAKHKHQQQFQKQGKAINDKVRLYSRIGQALLEAKESGSDPYAAIEAVIPWDEF TESVSEAELLARPEGFDHLHLVGENFATLRRYTPALLEVLELRAAPAAQGVLAA >gi|296493348|gb|ADTK01000153.1| GENE 2 1248 - 1808 369 186 aa, chain - ## HITS:1 COG:DRC0005 KEGG:ns NR:ns ## COG: DRC0005 COG1961 # Protein_GI_number: 10957532 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Deinococcus radiodurans # 2 184 3 185 185 194 59.0 7e-50 MQGHRIGYVRVSSFDQNPERQLEQTQVSKVFTDKASGKDTQRPQLEALLSFVREGDTVVV HSMDRLARNLDDLRRLVQKLTQRGVRIEFLKEGLVFTGEDSPMANLMLSVMGAFAEFERA LIRERQREGIALAKQRGAYRGRKKALSDEQAATLRQRATAGEPKAQLAREFNISRETLYQ YLRTDD >gi|296493348|gb|ADTK01000153.1| GENE 3 2982 - 3695 310 237 aa, chain + ## HITS:1 COG:VC0512_2 KEGG:ns NR:ns ## COG: VC0512_2 COG0840 # Protein_GI_number: 15640536 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Vibrio cholerae # 1 235 163 397 399 231 67.0 6e-61 MVASIQEVASNAQHAADAAGRADTETASGQRLVAHTSQSITALEGEIRQATQVIHELEGQ SNEISKVLDVIRGIAEQTNLLALNAAIEAARAGEQGRGFAVVADEVRSLAARTQQSTTDI QSMISALQERAQSAVTVMEQSSRQAHTSVAHAEEAATALDGIGQRVNEITDMNAQIATAV EQQGAVSEDINRSIINIRDAADTNVQTGQNNLQSAKSVAQLTSALSELAKQFWEKRG 
>gi|296493348|gb|ADTK01000153.1| GENE 4 3843 - 4040 160 65 aa, chain - ## HITS:1 COG:no KEGG:pECS88_0115 NR:ns ## KEGG: pECS88_0115 # Name: cvaC # Def: colicin V precursor (microcin V bacteriocin) # Organism: E.coli_S88 # Pathway: not_defined # 1 55 1 55 103 62 96.0 6e-09 MRTLTLNELDSVSGGASGRDIAMAIGTLSGQFVAGGIGAAAGGVAGGAIYDYGGTAEFGK NRTLS >gi|296493348|gb|ADTK01000153.1| GENE 5 4562 - 4816 143 84 aa, chain + ## HITS:1 COG:BS_yhjJ KEGG:ns NR:ns ## COG: BS_yhjJ COG0673 # Protein_GI_number: 16078117 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus subtilis # 8 75 137 204 350 59 39.0 1e-09 MRLRNEKKLGELHFFSAEFNKNSALTRHHLTWRDSAQQSKSSGALGDLSCHLLDLFCFIG ESPVVVHGIKTVKGTRGYKIRWSG >gi|296493348|gb|ADTK01000153.1| GENE 6 4788 - 5114 71 108 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM84 NR:ns ## KEGG: APECO1_O1CoBM84 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 107 206 312 339 226 97.0 3e-58 MGTKSDGQVEVDDNGYVMGSSEKGAYFRVHASKSETDHNLGLHIQLVFENGEIRYSTHHE NRLLLILFNDTNTETIGFDALKRLPDPPRELPFWSDSFIHLHDDWCAR Prediction of potential genes in microbial genomes Time: Mon May 16 15:32:36 2011 Seq name: gi|296493347|gb|ADTK01000154.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont381.1, whole genome shotgun sequence Length of sequence - 12578 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 6, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 38 - 113 82.1 # Trp CCA 0 0 - Term 107 - 140 3.1 1 1 Tu 1 . - CDS 209 - 1048 546 ## COG0583 Transcriptional regulator - Prom 1073 - 1132 6.1 + Prom 1029 - 1088 6.9 2 2 Tu 1 . + CDS 1167 - 1505 418 ## COG3085 Uncharacterized protein conserved in bacteria - Term 1248 - 1281 -0.2 3 3 Tu 1 . 
- CDS 1530 - 3050 905 ## COG0606 Predicted ATPase with chaperone activity - Prom 3255 - 3314 6.9 4 4 Op 1 7/0.200 + CDS 3641 - 5287 1294 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] 5 4 Op 2 5/0.400 + CDS 5284 - 5547 244 ## COG3978 Acetolactate synthase (isozyme II), small (regulatory) subunit 6 4 Op 3 5/0.400 + CDS 5567 - 6496 1145 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase 7 4 Op 4 8/0.200 + CDS 6561 - 8411 2027 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase 8 4 Op 5 . + CDS 8414 - 9958 1497 ## COG1171 Threonine dehydratase + Term 10051 - 10081 1.0 - Term 9955 - 9992 6.1 9 5 Tu 1 . - CDS 10010 - 10903 769 ## COG0583 Transcriptional regulator - Prom 10947 - 11006 4.5 + Prom 10940 - 10999 8.1 10 6 Tu 1 . + CDS 11053 - 12528 2031 ## COG0059 Ketol-acid reductoisomerase Predicted protein(s) >gi|296493347|gb|ADTK01000154.1| GENE 1 209 - 1048 546 279 aa, chain - ## HITS:1 COG:ECs4698 KEGG:ns NR:ns ## COG: ECs4698 COG0583 # Protein_GI_number: 15833952 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 279 1 279 279 536 100.0 1e-152 MDTELLKTFLEVSRTRHFGRAAESLYLTQSAVSFRIRQLENQLGVNLFTRHRNNIRLTAA GEKLLPYAETLMSTWQAARKEVAHTSRHNEFSIGASASLWECMLNQWLGRLYQNQDAHTG LQFEARIAQRQSLVKQLHERQLDLLITTEAPKMDEFSSQLLGYFTLALYTSAPSKLKGDL NYLRLEWGPDFQQHEAGLIGADEVPILTTSSAELAQQQIAMLNGCTWLPVSWARKKGGLH TVVDSTTLSRPLYAIWLQNSDKNTLIRDLLKINVLDEVY >gi|296493347|gb|ADTK01000154.1| GENE 2 1167 - 1505 418 112 aa, chain + ## HITS:1 COG:ECs4699 KEGG:ns NR:ns ## COG: ECs4699 COG3085 # Protein_GI_number: 15833953 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 112 1 112 112 208 100.0 2e-54 MAESFTTTNRYFDNKHYPRGFSRHGDFTIKEAQLLERHGYAFNELDLGKREPVTEEEKLF 
VAVCRGEREPVTEAERVWSKYMTRIKRPKRFHTLSGGKPQVEGAEDYTDSDD >gi|296493347|gb|ADTK01000154.1| GENE 3 1530 - 3050 905 506 aa, chain - ## HITS:1 COG:yifB KEGG:ns NR:ns ## COG: yifB COG0606 # Protein_GI_number: 16131625 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATPase with chaperone activity # Organism: Escherichia coli K12 # 1 506 11 516 516 981 99.0 0 MSLSIVHTRAALGVNAPPITVEVHISKGLPGLTMVGLPETTVKEARDRVRSAIINSGYEY PAKKITINLAPADLPKEGGRYDLPIAIALLAASEQLTANKLDEYELVGELALTGALRGVP GAISSATEAIKSGRKIIVAKDNEDEVGLINGEGCLIADHLQAVCAFLEGKHALERPKPTD AVSRALQHDLSDVVGQEQGKRGLEITAAGGHNLLLIGPPGTGKTMLASRINGLLPDLSNE EALESAAILSLVNAESVQKQWRQRPFRSPHHSASLTAMVGGGAIPGPGEISLAHNGVLFL DELPEFERRTLDALREPIESGQIHLSRTRAKITYPARFQLVAAMNPSPTGHYQGNHNRCT PEQTLRYLNRLSGPFLDRFDLSLEIPLPPPGILSKTVVPGESSATVKQRVMAARERQFKR QNKLNAWLDSPEIRQFCKLESEDAMWLEGTLIHLGLSIRAWQRLLKVARTIADIDQSDII TRQHLQEAVSYRAIDRLLIHLQKLLT >gi|296493347|gb|ADTK01000154.1| GENE 4 3641 - 5287 1294 548 aa, chain + ## HITS:1 COG:ECs4702 KEGG:ns NR:ns ## COG: ECs4702 COG0028 # Protein_GI_number: 15833956 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Escherichia coli O157:H7 # 1 548 1 548 548 1106 99.0 0 MNGAQWVVHALRAQGVNTVFGYPGGAIMPVYDALYDGGVEHLLCRHEQGAAMAAIGYARA TGKTGVCIATSGPGATNLITGLADALLDSIPVVAITGQVSAPFIGTDAFQEVDVLGLSLA CTKHSFLVQSLEELPRIMAEAFDVASSGRPGPVLVDIPKDIQLASGDLEPWFTTVENEVT FPHAEVEQARQMLAKAQKPMLYVGGGVGMAQAVPALREFLATTKMPATCTLKGLGAVEAD YPYYLGMLGMHGTKAANFAVQECDLLIAVGARFDDRVTGKLNTFAPHASVIHMDIDPAEM NKLRQAHVALQGDLNALLPALQQPLNINDWQLHCAQLRDEHAWRYDHPGDAIYAPLLLKQ LSDRKPADCVVTTDVGQHQMWAAQHIAHTRPENFITSSGLGTMGFGLPAAVGAQVARPND TVVCISGDGSFMMNVQELGTVKRKQLPLKIVLLDNQRLGMVRQWQQLFFQERYSETTLTD NPDFLMLASAFGIPGQHITRKDQVEAALDTMLNSDGPYLLHVSIDELENVWPLVPPGASN SEMLEKLS >gi|296493347|gb|ADTK01000154.1| 
GENE 5 5284 - 5547 244 87 aa, chain + ## HITS:1 COG:ECs4703 KEGG:ns NR:ns ## COG: ECs4703 COG3978 # Protein_GI_number: 15833957 # Func_class: S Function unknown # Function: Acetolactate synthase (isozyme II), small (regulatory) subunit # Organism: Escherichia coli O157:H7 # 1 87 1 87 87 143 100.0 8e-35 MMQHQVNVSARFNPETLERVLRVVRHRGFHVCSMNMAAASDAQNINIELTVASPRSVDLL FSQLNKLVDVAHVAICQSTTTSQQIRA >gi|296493347|gb|ADTK01000154.1| GENE 6 5567 - 6496 1145 309 aa, chain + ## HITS:1 COG:ECs4704 KEGG:ns NR:ns ## COG: ECs4704 COG0115 # Protein_GI_number: 15833958 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Escherichia coli O157:H7 # 1 309 1 309 309 644 100.0 0 MTTKKADYIWFNGEMVRWEDAKVHVMSHALHYGTSVFEGIRCYDSHKGPVVFRHREHMQR LHDSAKIYRFPVSQSIDELMEACRDVIRKNNLTSAYIRPLIFVGDVGMGVNPPAGYSTDV IIAAFPWGAYLGAEALEQGIDAMVSSWNRAAPNTIPTAAKAGGNYLSSLLVGSEARRHGY QEGIALDVNGYISEGAGENLFEVKDGVLFTPPFTSSALPGITRDAIIKLAKELGIEVREQ VLSRESLYLADEVFMSGTAAEITPVRSVDGIQVGEGRCGPVTKRIQQAFFGLFTGETEDK WGWLDQVNQ >gi|296493347|gb|ADTK01000154.1| GENE 7 6561 - 8411 2027 616 aa, chain + ## HITS:1 COG:ECs4705 KEGG:ns NR:ns ## COG: ECs4705 COG0129 # Protein_GI_number: 15833959 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Escherichia coli O157:H7 # 1 616 1 616 616 1211 99.0 0 MPKYRSATTTHGRNMAGARALWRATGMTDADFGKPIIAVVNSFTQFVPGHVHLRDLGKLV AEQIEAAGGVAKEFNTIAVDDGIAMGHGGMLYSLPSRELIADSVEYMVNAHCADAMVCIS NCDKITPGMLMASLRLNIPVIFVSGGPMEAGKTKLSDQIIKLDLVDAMIQGADPKVSDSQ SDQVERSACPTCGSCSGMFTANSMNCLTEALGLSQPGNGSLLATHADRKQLFLNAGKRIV ELTKRYYEQNDESALPRNIASKAAFENAMTLDIAMGGSTNTVLHLLAAAQEAEIDFTMSD IDKLSRKVPQLCKVAPSTQKYHMEDVHRAGGVIGILGELDRAGLLNRDVKNVLGLTLPQT LEQYDVMLTQDDAVKNMFRAGPAGIRTTQAFSQDCRWDSLDDDRANGCIRSLEHAYSKDG GLAVLYGNFAENGCIVKTAGVDDSILKFTGPAKVYESQDDAVEAILGGKVVAGDVVVIRY 
EGPKGGPGMQEMLYPTSFLKSMGLGKACALITDGRFSGGTSGLSIGHVSPEAASGGSIGL IEDGDLIAIDIPNRGIQLQVSDAELAARREAQEARGDKAWTPKNRERQVSFALRAYASLA TSADKGAVRDKSKLGG >gi|296493347|gb|ADTK01000154.1| GENE 8 8414 - 9958 1497 514 aa, chain + ## HITS:1 COG:ECs4706 KEGG:ns NR:ns ## COG: ECs4706 COG1171 # Protein_GI_number: 15833960 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Escherichia coli O157:H7 # 1 514 1 514 514 1009 100.0 0 MADSQPLSGAPEGAEYLRAVLRAPVYEAAQVTPLQKMEKLSSRLDNVILVKREDRQPVHS FKLRGAYAMMAGLTEEQKAHGVITASAGNHAQGVAFSSARLGVKALIVMPTATADIKVDA VRGFGGEVLLHGANFDEAKAKAIELSQQQGFTWVPPFDHPMVIAGQGTLALELLQQDAHL DRVFVPVGGGGLAAGVAVLIKQLMPQIKVIAVEAEDSACLKAALDAGHPVDLPRVGLFAE GVAVKRIGDETFRLCQEYLDDIITVDSDAICAAMKDLFEDVRAVAEPSGALALAGMKKYI AQHNIRGERLAHILSGANVNFHGLRYVSERCELGEQREALLAVTIPEEKGSFLKFCQLLG GRSVTEFNYRFADAKNACIFVGVRLSRGLEERKEILQMLNDGGYSVVDLSDDEMAKLHVR YMVGGRPSHPLQERLYSFEFPESPGALLRFLNTLGTHWNISLFHYRSHGTDYGRVLAAFE LGDHEPDFETRLNELGYDCHDETNNPAFRFFLAG >gi|296493347|gb|ADTK01000154.1| GENE 9 10010 - 10903 769 297 aa, chain - ## HITS:1 COG:ECs4707 KEGG:ns NR:ns ## COG: ECs4707 COG0583 # Protein_GI_number: 15833961 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 297 1 297 297 565 100.0 1e-161 MDLRDLKTFLHLAESRHFGRSARAMHVSPSTLSRQIQRLEEDLGQPLFVRDNRTVTLTEA GEELRVFAQQTLLQYQQLRHTIDQQGPSLSGELHIFCSVTAAYSHLPPILDRFRAEHPSV EIKLTTGDAADAMEKVVTGEADLAIAGKPETLPGAVAFSMLENLAVVLIAPALPCPVRNQ VSVEKPDWSTVPFIMADQGPVRRRIELWFRRNKISNPMIYATVGGHEAMVSMVALGCGVA LLPEVVLENSPEPVRNRVMILDRSDEKTPFELGVCAQKKRLHEPLIEAFWRILPNHK >gi|296493347|gb|ADTK01000154.1| GENE 10 11053 - 12528 2031 491 aa, chain + ## HITS:1 COG:ilvC KEGG:ns NR:ns ## COG: ilvC COG0059 # Protein_GI_number: 16131632 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Ketol-acid reductoisomerase # Organism: Escherichia coli K12 # 1 491 1 491 491 967 99.0 0 MANYFNTLNLRQQLAQLGKCRFMGRDEFADGASYLQGKKVVIVGCGAQGLNQGLNMRDSG 
LDISYALRKEAIAEKRASWRKATENGFKVGTYEELIPQADLVVNLTPDKQHSDVVRTVQP LMKDGAALGYSHGFNIVEVGEQIRKDITVVMVAPKCPGTEVREEYKRGFGVPTLIAVHPE NDPKGEGMAIAKAWAAATGGHRAGVLESSFVAEVKSDLMGEQTILCGMLQAGSLLCFDKL VEEGTDPAYAEKLIQFGWETITEALKQGGITLMMDRLSNPAKLRAYALSEQLKEIMAPLF QKHMDDIISGEFSSGMMADWANDDKKLLTWREETGKTAFETAPQYEGKIGEQEYFDKGVL MIAMVKAGVELAFETMVDSGIIEESAYYESLHELPLIANTIARKRLYEMNVVISDTAEYG NYLFSYACVPLLKPFMAELQPGDLGKAIPEGAVDNAQLRDVNEAIRSHAIEQVGKKLRGY MTDMKRIAVAG Prediction of potential genes in microbial genomes Time: Mon May 16 15:32:44 2011 Seq name: gi|296493346|gb|ADTK01000155.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont381.2, whole genome shotgun sequence Length of sequence - 23362 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 9, operones - 2 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 25 - 65 10.6 1 1 Tu 1 . - CDS 69 - 350 265 ## COG0760 Parvulin-like peptidyl-prolyl isomerase - Prom 378 - 437 4.8 - Term 434 - 475 6.0 2 2 Tu 1 . - CDS 549 - 998 117 ## COG3692 Uncharacterized protein conserved in bacteria - Prom 1162 - 1221 1.8 + Prom 1129 - 1188 3.2 3 3 Tu 1 . + CDS 1215 - 3236 2146 ## COG0210 Superfamily I DNA and RNA helicases + Term 3300 - 3338 4.2 - Term 3229 - 3273 3.2 4 4 Tu 1 . - CDS 3283 - 4767 1616 ## COG0248 Exopolyphosphatase 5 5 Tu 1 . - CDS 4901 - 6166 1288 ## COG0513 Superfamily II DNA and RNA helicases - Prom 6302 - 6361 2.4 + Prom 6127 - 6186 4.6 6 6 Tu 1 . + CDS 6297 - 6626 165 ## PROTEIN SUPPORTED gi|124485582|ref|YP_001030198.1| ribosomal protein L12E/L44/L45/RPP1/RPP2-like protein + Term 6679 - 6712 0.2 + Prom 6712 - 6771 3.8 7 7 Op 1 . + CDS 6953 - 8212 1311 ## COG1158 Transcription termination factor 8 7 Op 2 . 
+ CDS 8222 - 8440 166 ## ECO103_4381 hypothetical protein 9 7 Op 3 5/0.333 + CDS 8452 - 9555 844 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 10 7 Op 4 4/0.333 + CDS 9567 - 10613 1055 ## COG3765 Chain length determinant protein + Term 10627 - 10660 5.2 11 8 Op 1 10/0.000 + CDS 10669 - 11799 983 ## COG0381 UDP-N-acetylglucosamine 2-epimerase 12 8 Op 2 4/0.333 + CDS 11796 - 13058 1170 ## COG0677 UDP-N-acetyl-D-mannosaminuronate dehydrogenase 13 8 Op 3 16/0.000 + CDS 13058 - 14125 1200 ## COG1088 dTDP-D-glucose 4,6-dehydratase 14 8 Op 4 4/0.333 + CDS 14144 - 15025 961 ## COG1209 dTDP-glucose pyrophosphorylase 15 8 Op 5 4/0.333 + CDS 15003 - 15677 398 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 16 8 Op 6 6/0.000 + CDS 15682 - 16812 1181 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 17 8 Op 7 . + CDS 16814 - 18064 1169 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 18 8 Op 8 . + CDS 18061 - 19140 876 ## EcolC_4210 4-alpha-L-fucosyltransferase 19 8 Op 9 . + CDS 19137 - 20489 1314 ## ECH74115_5227 putative common antigen polymerase 20 8 Op 10 4/0.333 + CDS 20492 - 21232 860 ## COG1922 Teichoic acid biosynthesis proteins + Term 21246 - 21295 6.1 + Prom 21263 - 21322 4.5 21 9 Tu 1 . 
+ CDS 21423 - 22808 1325 ## COG1113 Gamma-aminobutyrate permease and related permeases + TRNA 22911 - 22987 89.5 # Arg CCG 0 0 + TRNA 23046 - 23121 84.9 # His GTG 0 0 + TRNA 23142 - 23228 69.1 # Leu CAG 0 0 + TRNA 23271 - 23347 92.7 # Pro TGG 0 0 Predicted protein(s) >gi|296493346|gb|ADTK01000155.1| GENE 1 69 - 350 265 93 aa, chain - ## HITS:1 COG:ECs4709 KEGG:ns NR:ns ## COG: ECs4709 COG0760 # Protein_GI_number: 15833963 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like peptidyl-prolyl isomerase # Organism: Escherichia coli O157:H7 # 1 93 1 93 93 188 100.0 2e-48 MAKTAAALHILVKEEKLALDLLEQIKNGADFGKLAKKHSICPSGKRGGDLGEFRQGQMVP AFDKVVFSCPVLEPTGPLHTQFGYHIIKVLYRN >gi|296493346|gb|ADTK01000155.1| GENE 2 549 - 998 117 149 aa, chain - ## HITS:1 COG:yifNm KEGG:ns NR:ns ## COG: yifNm COG3692 # Protein_GI_number: 16132261 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 149 15 163 163 285 95.0 1e-77 MAINFSPKVGEILECNFGNYPVSQNGQFSTTYYDGRIPPEMIKNRLVVVLNGKINGNACI VIPLSTTRDHDKLNRGMHVEIASNVINGLQFFDQQIRWAKADLVQQVSRNRLNRARTYRG YLNQCLPHELVADIQRAVIKSINAISLIN >gi|296493346|gb|ADTK01000155.1| GENE 3 1215 - 3236 2146 673 aa, chain + ## HITS:1 COG:rep KEGG:ns NR:ns ## COG: rep COG0210 # Protein_GI_number: 16131634 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Escherichia coli K12 # 1 673 1 673 673 1320 99.0 0 MRLNPGQQQAVEFVTGPCLVLAGAGSGKTRVITNKIAHLIRGCGYQARHIAAVTFTNKAA REMKERVGQTLGRKEARGLMISTFHTLGLDIIKREYAALGMKANFSLFDDTDQLALLKEL TEGLIEDDKVLLQQLISTISNWKNDLKPPSQAAASAIGERDRIFAHCYGLYDAHLKACNV LDFDDLILLPTLLLQRNEEVRERWQNKIRYLLVDEYQDTNTSQYELVKLLVGSRARFTVV GDDDQSIYSWRGARPQNLVLLSQDFPALKVIKLEQNYRSSGRILKAANILIANNPHVFEK RLFSELGYGTELKVLSANNEEHEAERVTGELIAHHFVNKTQYKDYAILYRGNHQSRVFEK FLMQNRIPYKISGGTSFFSRPEIKDLLAYLRVLTNPDDDSAFLRIVNTPKREIGPATLKK LGEWAMTRNKSMFTASFDMGLSQTLSGRGYEALTRFTHWLAEIQRLAEREPIAAVRDLIH 
GMDYESWLYETSPSPKAAEMRMKNVNQLFSWMTEMLEGSELDEPMTLTQVVTRFTLRDMM ERGESEEELDQVQLMTLHASKGLEFPYVYMVGMEEGFLPHQSSIDEDNIDEERRLAYVGI TRAQKELTFTLCKERRQYGELVRPEPSRFLLELPQDDLIWEQERKVVSAEERMQKGQSHL ANLKAMMAAKRGK >gi|296493346|gb|ADTK01000155.1| GENE 4 3283 - 4767 1616 494 aa, chain - ## HITS:1 COG:ECs4712 KEGG:ns NR:ns ## COG: ECs4712 COG0248 # Protein_GI_number: 15833966 # Func_class: F Nucleotide transport and metabolism; P Inorganic ion transport and metabolism # Function: Exopolyphosphatase # Organism: Escherichia coli O157:H7 # 1 494 1 494 494 959 99.0 0 MGSTSSLYAAIDLGSNSFHMLVVREVAGSIQTLTRIKRKVRLAAGLNSENALSNEAMERG WQCLRLFAERLQDIPPSQIRVVATATLRLAVNAGDFIAKAQEILGCPVQVISGEEEARLI YQGVAHTTGGADQRLVVDIGGASTELVTGTGAQTTSLFSLSMGCVTWLERYFADRNLGQE NFDAAEKAAREVLRPVADELRYHGWKVCVGASGTVQALQEIMMAQGMDERITLEKLQQLK QRAIHCGRLEELEIDGLTLERALVFPSGLAILIAIFTELNIQCMTLAGGALREGLVYGML HLAVEQDIRSRTLRNIQRRFMIDIDQAQRVAKVAANFFDQVENEWYLEAISRDLLISACQ LHEIGLSVDFKQAPQHAAYLVRNLDLPGFTPAQKKLLATLLLNQTNPVDLSSLHQQNAVP PRVAEQLCRLLRLAIIFASRRRDDLVPEMTLQANHELLTLTLPQGWLTQHPLGKEIIAQE SQWQSYVHWPLEVH >gi|296493346|gb|ADTK01000155.1| GENE 5 4901 - 6166 1288 421 aa, chain - ## HITS:1 COG:rhlB KEGG:ns NR:ns ## COG: rhlB COG0513 # Protein_GI_number: 16131636 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Escherichia coli K12 # 1 421 1 421 421 862 100.0 0 MSKTHLTEQKFSDFALHPKVVEALEKKGFHNCTPIQALALPLTLAGRDVAGQAQTGTGKT MAFLTSTFHYLLSHPAIADRKVNQPRALIMAPTRELAVQIHADAEPLAEATGLKLGLAYG GDGYDKQLKVLESGVDILIGTTGRLIDYAKQNHINLGAIQVVVLDEADRMYDLGFIKDIR WLFRRMPPANQRLNMLFSATLSYRVRELAFEQMNNAEYIEVEPEQKTGHRIKEELFYPSN EEKMRLLQTLIEEEWPDRAIIFANTKHRCEEIWGHLAADGHRVGLLTGDVAQKKRLRILD EFTRGDLDILVATDVAARGLHIPAVTHVFNYDLPDDCEDYVHRIGRTGRAGASGHSISLA CEEYALNLPAIETYIGHSIPVSKYNPDALMTDLPKPLRLTRPRTGNGPRRTGAPRNRRRS G >gi|296493346|gb|ADTK01000155.1| GENE 6 6297 - 6626 165 109 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|124485582|ref|YP_001030198.1| ribosomal protein L12E/L44/L45/RPP1/RPP2-like protein [Methanocorpusculum labreanum Z] # 5 106 17 116 120 68 34 5e-11 MSDKIIHLTDDSFDTDVLKADGAILVDFWAEWCGPCKMIAPILDEIADEYQGKLTVAKLN IDQNPGTAPKYGIRGIPTLLLFKNGEVAATKVGALSKGQLKEFLDANLA >gi|296493346|gb|ADTK01000155.1| GENE 7 6953 - 8212 1311 419 aa, chain + ## HITS:1 COG:ECs4716 KEGG:ns NR:ns ## COG: ECs4716 COG1158 # Protein_GI_number: 15833970 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Escherichia coli O157:H7 # 1 419 1 419 419 815 100.0 0 MNLTELKNTPVSELITLGENMGLENLARMRKQDIIFAILKQHAKSGEDIFGDGVLEILQD GFGFLRSADSSYLAGPDDIYVSPSQIRRFNLRTGDTISGKIRPPKEGERYFALLKVNEVN FDKPENARNKILFENLTPLHANSRLRMERGNGSTEDLTARVLDLASPIGRGQRGLIVAPP KAGKTMLLQNIAQSIAYNHPDCVLMVLLIDERPEEVTEMQRLVKGEVVASTFDEPASRHV QVAEMVIEKAKRLVEHKKDVIILLDSITRLARAYNTVVPASGKVLTGGVDANALHRPKRF FGAARNVEEGGSLTIIATALIDTGSKMDEVIYEEFKGTGNMELHLSRKIAEKRVFPAIDY NRSGTRKEELLTTQEELQKMWILRKIIHPMGEIDAMEFLINKLAMTKTNDDFFEMMKRS >gi|296493346|gb|ADTK01000155.1| GENE 8 8222 - 8440 166 72 aa, chain + ## HITS:1 COG:no KEGG:ECO103_4381 NR:ns ## KEGG: ECO103_4381 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 72 1 72 72 135 100.0 6e-31 MPKTPRVYVAFCFYICNLNAALAMLGKFLEFAGMLCNLHIKWLIFAQDWWVRNGLSLLNK GLRGYTSANNFL >gi|296493346|gb|ADTK01000155.1| GENE 9 8452 - 9555 844 367 aa, chain + ## HITS:1 COG:rfe KEGG:ns NR:ns ## COG: rfe COG0472 # Protein_GI_number: 16131640 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Escherichia coli K12 # 1 367 1 367 367 613 99.0 1e-175 MNLLTVSTDLISIFLFTTLFLFFARKVAKKVGLVDKPNFRKRHQGLIPLVGGISVYAGIC FTFGIVDYYIPHASLYLACAGVLVFIGALDDRFDISVKIRATIQAAVGIVMMVFGKLYLS SLGYIFGSWEMVLGPFGYFLTLFAVWAVINAFNMVDGIDGLLGGLSCVSFAAIGMILWFD GQTSLAIWCFAMIAAILPYIMLNLGILGRRYKVFMGDAGSTLIGFTVIWILLETTQGKTH 
PISPVTALWIIAIPLMDMVAIMYRRLRKGMSPFSPDRQHIHHLIMRAGFTSRQAFVLITL AAALLASIGVLAEYSHFVPEWVMLVLFLLAFFLYGYCIKRAWKVARFIKRVKRRLRRNRG GSPNLTK >gi|296493346|gb|ADTK01000155.1| GENE 10 9567 - 10613 1055 348 aa, chain + ## HITS:1 COG:ECs4718 KEGG:ns NR:ns ## COG: ECs4718 COG3765 # Protein_GI_number: 15833972 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Chain length determinant protein # Organism: Escherichia coli O157:H7 # 1 348 2 349 349 682 100.0 0 MTQPMPGKPAEDAENELDIRGLFRTLWAGKLWIIGMGLAFALIALAYTFFARQEWSSTAI TDRPTVNMLGGYYSQQQFLRNLDVRSNMASADQPSVMDEAYKEFVMQLASWDTRREFWLQ TDYYKQRMVGNSKADAALLDEMINNIQFIPGDFTRAVNDSVKLIAETAPDANNLLRQYVA FASQRAASHLNDELKGAWAARTIQMKAQVKRQEEVAKAIYDRRMNSIEQALKIAEQHNIS RSATDVPAEELPDSEMFLLGRPMLQARLENLQAVGPAFDLDYDQNRAMLNTLNVGPTLDP RFQTYRYLRTPEEPVKRDSPRRAFLMIMWGIVGGLIGAGVALTRRCSK >gi|296493346|gb|ADTK01000155.1| GENE 11 10669 - 11799 983 376 aa, chain + ## HITS:1 COG:ECs4719 KEGG:ns NR:ns ## COG: ECs4719 COG0381 # Protein_GI_number: 15833973 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine 2-epimerase # Organism: Escherichia coli O157:H7 # 1 376 15 390 390 748 99.0 0 MKVLTVFGTRPEAIKMAPLVHALAKDPFFEAKVCVTAQHREMLDQVLKLFSIVPDYDLNI MQPGQGLTEITCRILEGLKPILAEFKPDVVLVHGDTTTTLATSLAAFYQRIPVGHVEAGL RTGDLYSPWPEEANRTLTGHLAMYHFSPTETSRQNLLRENVADSRIFITGNTVIDALLWV RDQVMSSDTLRSELAANYPFIDPDKKMILVTGHRRESFGRGFEEICHALADIATTHQDIQ IVYPVHLNPNVREPVNRILGHVKNVILIDPQEYLPFVWLMNHAWLILTDSGGIQEEAPSL GKPVLVMRDTTERPEAVTAGTVRLVGTDKQRIVEEVTRLLKDENEYQAMSRAHNPYGDGQ ACSRILEALKNNRISL >gi|296493346|gb|ADTK01000155.1| GENE 12 11796 - 13058 1170 420 aa, chain + ## HITS:1 COG:ECs4720 KEGG:ns NR:ns ## COG: ECs4720 COG0677 # Protein_GI_number: 15833974 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetyl-D-mannosaminuronate dehydrogenase # Organism: Escherichia coli O157:H7 # 1 420 1 420 420 850 100.0 0 MSFATISVIGLGYIGLPTAAAFASRQKQVIGVDINQHAVDTINRGEIHIVEPDLASVVKT 
AVEGGFLRASTTPVEADAWLIAVPTPFKGDHEPDMTYVESAARSIAPVLKKGALVILEST SPVGSTEKMAEWLAEMRPDLTFPQQVGEQADVNIAYCPERVLPGQVMVELIKNDRVIGGM TPVCSARASELYKIFLEGECVVTNSRTAEMCKLTENSFRDVNIAFANELSLICADQGINV WELIRLANRHPRVNILQPGPGVGGHCIAVDPWFIVAQNPQQARLIRTAREVNDHKPFWVI DQVKAAVADCLAATDKRASELKIACFGLAFKPNIDDLRESPAMEIAELIAQWHSGETLVV EPNIHQLPKKLTGLCTLAQLDEALATADVLVMLVDHSQFKVINGDNVHQQYVVDAKGVWR >gi|296493346|gb|ADTK01000155.1| GENE 13 13058 - 14125 1200 355 aa, chain + ## HITS:1 COG:ECs4721 KEGG:ns NR:ns ## COG: ECs4721 COG1088 # Protein_GI_number: 15833975 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Escherichia coli O157:H7 # 1 355 1 355 355 738 98.0 0 MRKILITGGAGFIGSALVRYIINETSDAVVVVDKLTYAGNLMSLAPVAQSERFAFEKVDI CDRAELARVFTEHQPDCVMHLAAESHVDRSIDGPAAFIETNIVGTYTLLEAARAYWNTLT EDKKSAFRFHHISTDEVYGDLHSTDDFFTETTPYAPSSPYSASKASSDHLVRAWLRTYGL PTLITNCSNNYGPYHFPEKLIPLMILNALAGKPLPVYGNGQQIRDWLYVEDHARALYCVA TTGKVGETYNIGGHNERKNLDVVETICELLEELAPNKPHGVAHYRDLITFVADRPGHDLR YAIDASKIARELGWLPQETFESGMRKTVQWYLANESWWKQVQDGSYQGERLGLKG >gi|296493346|gb|ADTK01000155.1| GENE 14 14144 - 15025 961 293 aa, chain + ## HITS:1 COG:rffH KEGG:ns NR:ns ## COG: rffH COG1209 # Protein_GI_number: 16131645 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Escherichia coli K12 # 1 293 1 293 293 596 100.0 1e-170 MKGIILAGGSGTRLHPITRGVSKQLLPIYDKPMIYYPLSVLMLAGIREILIITTPEDKGY FQRLLGDGSEFGIQLEYAEQPSPDGLAQAFIIGETFLNGEPSCLVLGDNIFFGQGFSPKL RHVAARTEGATVFGYQVMDPERFGVVEFDDNFRAISLEEKPKQPKSNWAVTGLYFYDSKV VEYAKQVKPSERGELEITSINQMYLEAGNLTVELLGRGFAWLDTGTHDSLIEASTFVQTV EKRQGFKIACLEEIAWRNGWLDDEGVKRAASSLAKTGYGQYLLELLRARPRQY >gi|296493346|gb|ADTK01000155.1| GENE 15 15003 - 15677 398 224 aa, chain + ## HITS:1 COG:ECs4723 KEGG:ns NR:ns ## COG: ECs4723 COG0454 # Protein_GI_number: 15833977 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: 
Escherichia coli O157:H7 # 44 224 1 181 181 321 98.0 8e-88 MPVRASIEPLTWENAFFGVNSAIVRITSEAPLLTPDALAPWSRVQAKIAASNTGELDALQ QLGFSLVEGEVDLALPVNNASDSGAVVAQETDIPALRQLASAAFAQGRFRAPWYAPDASS RFYAQWIENAVRGTFDHQCLILRAASGDIRGYVSLRELNATDARIGLLAGRGAGAELMQT ALNWAYARGKTTLRVATQMGNTAALKRYIQSGANVESTAYWLYR >gi|296493346|gb|ADTK01000155.1| GENE 16 15682 - 16812 1181 376 aa, chain + ## HITS:1 COG:wecE KEGG:ns NR:ns ## COG: wecE COG0399 # Protein_GI_number: 16131647 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Escherichia coli K12 # 1 376 1 376 376 796 99.0 0 MIPFNAPPVVGTELDYMQSAMGSGKLCGDGGFTRRCQQWLEQRFGSAKVLLTPSCTASLE MAALLLDIQPGDEVIMPSYTFVSTANAFVLRGAKIVFVDVRPATMNIDETLIEAAITDKT RVIVPVHYAGVACEMDTIMALAKKHNLFVVEDAAQGVMSTYKGRALGTIGHIGCFSFHET KNYTAGGEGGATLINDKALIERAEIIREKGTNRSQFFRGQVDKYTWRDIGSSYLMSDLQA AYLWAQLEAADRINQQRLALWQNYYDALAPLAKAGRIELPSIPDGCVQNAHMFYIKLRDI DDRSALINFLKEAEIMAVFHYIPLHGCPAGEHFGEFHGEDRYTTKESERLLRLPLFYNLS PVNQRTVIATLLNYFS >gi|296493346|gb|ADTK01000155.1| GENE 17 16814 - 18064 1169 416 aa, chain + ## HITS:1 COG:wzxE KEGG:ns NR:ns ## COG: wzxE COG2244 # Protein_GI_number: 16131648 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Escherichia coli K12 # 1 416 1 416 416 697 99.0 0 MSLAKASLWTAASTLVKIGAGLLVGKLLAVSFGPAGLGLAANFRQLITVLGVLAGAGIFN GVTKYVAQYHDNPQQLRRVVGTSSAMVLGFSTLMALVFVLAAAPISQGLFGNTDYQGLVR LVALVQMGIAWGNLLLALMKGFRDAAGNALSLIVGSLIGVLAYYVSYRLGGYEGALLGLA LIPALVVIPAAIMLIKRGVIPLSYLKPSWDNGLAGQLSKFTLMALITSVTLPVAYIMMRK LLAAQYSWDEVGIWQGVSSISDAYLQFITASFSVYLLPTLSRLTEKRDITREVVKSLKFV LPAVAAASFTVWLLRDFAIWLLLSNKFTAMRDLFAWQLVGDVLKVGAYVFGYLVIAKASL RFYILAEVSQFTLLMVFAHWLIPAHGALGAAQAYMATYIVYFSLCCGVFFLWRRRA >gi|296493346|gb|ADTK01000155.1| GENE 18 18061 - 19140 876 359 aa, chain + ## HITS:1 COG:no KEGG:EcolC_4210 NR:ns ## KEGG: EcolC_4210 # Name: not_defined # 
Def: 4-alpha-L-fucosyltransferase # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 359 1 359 359 744 99.0 0 MTVLIHVLGSDIPHHNRTVLRFFNDALAATSEHAREFMVVGKDDGLSDSCPALSVQFFPG KKSLAEAIIAKAKANRQQRFFFHGQFNPTLWLALLSGGIKPSQFFWHIWGADLYELSSGL RYKLFYPLRRLAQKRVGYVFATRGDLSFFAKTHPKVRGELLYFPTRMDPSLNTMANDRQR EGKMTILVGNSGDRSNEHVAALRAVHQQFGDTVKVVVPMGYPPNNEAYIEEVRQAGLELF SEENLQVLSEKLEFDAYLALLRQCDLGYFIFARQQGIGTLCLLIQAGIPCVLNRENPFWQ DMTEQHLPVLFTTDDLNEDIVLEAQRQLASVDKNTIAFFSPNYLQGWQRALAIAAGEVA >gi|296493346|gb|ADTK01000155.1| GENE 19 19137 - 20489 1314 450 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_5227 NR:ns ## KEGG: ECH74115_5227 # Name: wzyE # Def: putative common antigen polymerase # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 450 1 450 450 749 100.0 0 MSLLQFSGLFVVWLLCTLFIATLTWFEFRRVRFNFNVFFSLLFLLTFFFGFPLTSVLVFR FDVGVAPPEILLQALLSAGCFYAVYYVTYKTRLRKRVADVPRRPLFTMNRVETNLTWVIL MGIALVSVGIFFMHNGFLLFRLNSYSQIFSSEVSGVALKRFFYFFIPAMLVVYFLRQDSK AWLFFLVSTVAFGLLTYMIVGGTRANIIIAFAIFLFIGIIRGWISLWMLAAAGVLGIVGM FWLALKRYGMNVSGDEAFYTFLYLTRDTFSPWENLALLLQNYDNIDFQGLAPIVRDFYVF IPSWLWPGRPSMVLNSANYFTWEVLNNHSGLAISPTLIGSLVVMGGALFIPLGAIVVGLI IKWFDWLYELGNRETNRYKAAILHSFCFGAIFNMIVLAREGLDSFVSRVVFFIVVFGACL MIAKLLYWLFESAGLIHKRTKSSLRTQVEG >gi|296493346|gb|ADTK01000155.1| GENE 20 20492 - 21232 860 246 aa, chain + ## HITS:1 COG:wecG KEGG:ns NR:ns ## COG: wecG COG1922 # Protein_GI_number: 16131650 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Teichoic acid biosynthesis proteins # Organism: Escherichia coli K12 # 1 246 1 246 246 476 99.0 1e-134 MNNNTTAPTYTLRGLQLIGWRDMQHALDYLFADGQLKQGTLVAINAEKMLTIEDNAEVRE LINAAEFKYADGISVVRSVRKKYPQAQVSRVAGADLWEELMARAGKEGTPVFLVGGKPEV LAQTEAKLRNQWNVNIVGSQDGYFKPEQRQALFERIHASGAQIVTVAMGSPKQEIFMRDC RLVHPDALYMGVGGTYDVFTGHVKRAPKIWQTLGLEWLYRLLSQPSRIKRQLRLLRYLRW HYTGNL >gi|296493346|gb|ADTK01000155.1| GENE 21 21423 - 22808 1325 461 aa, chain + ## HITS:1 COG:ECs4729 KEGG:ns NR:ns ## COG: ECs4729 COG1113 # Protein_GI_number: 15833983 # Func_class: E Amino acid transport 
and metabolism # Function: Gamma-aminobutyrate permease and related permeases # Organism: Escherichia coli O157:H7 # 1 461 1 461 461 808 100.0 0 MADNKPELQRGLEARHIELIALGGTIGVGLFMGAASTLKWAGPSVLLAYIIAGLFVFFIM RSMGEMLFLEPVTGSFAVYAHRYMSPFFGYLTAWSYWFMWMAVGISEITAIGVYVQFWFP EMAQWIPALIAVALVALANLAAVRLYGEIEFWFAMIKVTTIIVMIVIGLGVIFFGFGNGG QSIGFSNLTEHGGFFAGGWKGFLTALCIVVASYQGVELIGITAGEAKNPQVTLRSAVGKV LWRILIFYVGAIFVIVTIFPWNEIGSNGSPFVLTFAKIGITAAAGIINFVVLTAALSGCN SGMYSCGRMLYALAKNRQLPAAMAKVSRHGVPVAGVAVSIAILLIGSCLNYIIPNPQRVF VYVYSASVLPGMVPWFVILISQLRFRRAHKAAIASHPFRSILFPWANYVTMAFLICVLIG MYFNEDTRMSLFVGIIFMLAVTAIYKVFGLNRHGKAHKLEE Prediction of potential genes in microbial genomes Time: Mon May 16 15:32:59 2011 Seq name: gi|296493345|gb|ADTK01000156.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont381.3, whole genome shotgun sequence Length of sequence - 2254 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 50 - 109 3.2 1 1 Tu 1 . 
+ CDS 148 - 1383 923 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) Predicted protein(s) >gi|296493345|gb|ADTK01000156.1| GENE 1 148 - 1383 923 411 aa, chain + ## HITS:1 COG:ECs4730 KEGG:ns NR:ns ## COG: ECs4730 COG0641 # Protein_GI_number: 15833984 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Escherichia coli O157:H7 # 1 411 1 411 411 875 98.0 0 MLQQVPTRAFHVMAKPSGSDCNLNCDYCFYLEKQSLYREKPVTHMDDDTLEAYVRHYIAA SEPQNEVAFTWQGGEPTLLGLDFYRRAVKLQAKYGAGRKISNSFQTNGVLLDDKWCAFLA ENHFLVGLSLDGPAEIHNQYRVTKGGRPTHKLVMRALTLLQKHHVDYNVLVCVNRTSAQQ PLQVYDFLCDAGVEFIQFIPVVERLADETVARDGLKLHAPGDIQGELTEWSVRPDKFGEF LVAIFDHWIKRDVGKIFVMNIEWAFANFVGAPGAVCHHQPTCGRSVIVEHNGDVYACDHY VYPQYRLGNMHQQTIAEMIDSPQQQVFGEDKFKQLPAQCRSCNVLKACWGGCPKHRFMLD ASGKPGLNYLCAGYQRYFRHLPPYLKAMADLLAHGRPASDIMQAHLLVVSK Prediction of potential genes in microbial genomes Time: Mon May 16 15:33:11 2011 Seq name: gi|296493344|gb|ADTK01000157.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont381.4, whole genome shotgun sequence Length of sequence - 36744 bp Number of predicted genes - 36, with homology - 36 Number of transcription units - 19, operones - 8 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 11/0.083 - CDS 14 - 1210 1441 ## COG3071 Uncharacterized enzyme of heme biosynthesis 2 1 Op 2 10/0.083 - CDS 1213 - 2406 1295 ## COG2959 Uncharacterized enzyme of heme biosynthesis 3 1 Op 3 23/0.000 - CDS 2428 - 3168 505 ## COG1587 Uroporphyrinogen-III synthase 4 1 Op 4 . - CDS 3165 - 4106 1003 ## COG0181 Porphobilinogen deaminase - Prom 4249 - 4308 3.8 + Prom 4280 - 4339 5.3 5 2 Tu 1 . + CDS 4493 - 7039 2269 ## COG3072 Adenylate cyclase 6 3 Tu 1 . 
- CDS 7079 - 7399 519 ## COG1965 Protein implicated in iron transport, frataxin homolog - Prom 7423 - 7482 5.3 + Prom 7696 - 7755 4.0 7 4 Op 1 1/0.917 + CDS 7862 - 8065 221 ## COG5567 Predicted small periplasmic lipoprotein 8 4 Op 2 4/0.583 + CDS 8102 - 8926 784 ## COG0253 Diaminopimelate epimerase 9 4 Op 3 10/0.083 + CDS 8923 - 9630 662 ## COG3159 Uncharacterized protein conserved in bacteria 10 4 Op 4 8/0.083 + CDS 9627 - 10523 861 ## COG4973 Site-specific recombinase XerC 11 4 Op 5 5/0.500 + CDS 10523 - 11239 513 ## COG1011 Predicted hydrolase (HAD superfamily) + Term 11269 - 11312 5.0 12 5 Tu 1 . + CDS 11323 - 13485 2269 ## COG0210 Superfamily I DNA and RNA helicases + Term 13499 - 13523 -1.0 13 6 Tu 1 . - CDS 13632 - 14396 694 ## COG3698 Predicted periplasmic protein 14 7 Tu 1 . + CDS 14766 - 15716 1118 ## COG0598 Mg2+ and Co2+ transporters + Term 15728 - 15764 7.3 15 8 Op 1 . - CDS 15759 - 16124 328 ## B21_03645 hypothetical protein 16 8 Op 2 . - CDS 16153 - 16533 99 ## JW5590 predicted inner membrane protein - Prom 16566 - 16625 1.8 17 9 Op 1 1/0.917 - CDS 16628 - 17518 965 ## COG2962 Predicted permeases 18 9 Op 2 . - CDS 17570 - 18055 481 ## COG2050 Uncharacterized protein, possibly involved in aromatic compounds catabolism - Prom 18094 - 18153 4.3 + Prom 18095 - 18154 4.0 19 10 Tu 1 . + CDS 18202 - 19071 1004 ## COG2829 Outer membrane phospholipase A + Term 19155 - 19188 4.5 + Prom 19116 - 19175 4.5 20 11 Op 1 4/0.583 + CDS 19198 - 21033 1837 ## COG0514 Superfamily II DNA helicase 21 11 Op 2 . + CDS 21097 - 21717 763 ## COG1280 Putative threonine efflux protein + Term 21741 - 21782 2.6 - Term 21729 - 21770 6.4 22 12 Tu 1 . - CDS 21779 - 22399 498 ## COG1280 Putative threonine efflux protein - Prom 22472 - 22531 4.9 + Prom 22305 - 22364 3.6 23 13 Op 1 5/0.500 + CDS 22510 - 23532 719 ## COG2267 Lysophospholipase 24 13 Op 2 3/0.667 + CDS 23540 - 24340 933 ## COG0561 Predicted hydrolases of the HAD superfamily 25 13 Op 3 . 
+ CDS 24416 - 25315 982 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 26 14 Tu 1 . - CDS 25203 - 26156 734 ## COG0583 Transcriptional regulator - Prom 26190 - 26249 4.9 + Prom 26167 - 26226 3.0 27 15 Tu 1 . + CDS 26274 - 28535 2507 ## COG0620 Methionine synthase II (cobalamin-independent) + Term 28543 - 28575 6.3 - Term 28450 - 28487 1.8 28 16 Tu 1 . - CDS 28574 - 29389 847 ## COG0412 Dienelactone hydrolase and related enzymes - Prom 29417 - 29476 5.3 + Prom 29345 - 29404 6.5 29 17 Tu 1 . + CDS 29651 - 30412 896 ## COG2820 Uridine phosphorylase + Term 30429 - 30463 5.2 + Prom 30461 - 30520 3.0 30 18 Op 1 5/0.500 + CDS 30568 - 31980 1601 ## COG1322 Uncharacterized protein conserved in bacteria + Prom 31989 - 32048 3.2 31 18 Op 2 7/0.167 + CDS 32075 - 32830 363 ## PROTEIN SUPPORTED gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15 32 18 Op 3 10/0.083 + CDS 32844 - 33449 704 ## COG3165 Uncharacterized protein conserved in bacteria 33 18 Op 4 6/0.167 + CDS 33527 - 35086 1460 ## COG0661 Predicted unusual protein kinase + Term 35107 - 35143 1.0 34 19 Op 1 16/0.000 + CDS 35165 - 35434 200 ## PROTEIN SUPPORTED gi|90022866|ref|YP_528693.1| ribosomal protein L25 35 19 Op 2 28/0.000 + CDS 35438 - 35953 476 ## COG1826 Sec-independent protein secretion pathway components 36 19 Op 3 . 
+ CDS 35956 - 36732 833 ## COG0805 Sec-independent protein secretion pathway component TatC Predicted protein(s) >gi|296493344|gb|ADTK01000157.1| GENE 1 14 - 1210 1441 398 aa, chain - ## HITS:1 COG:ECs4732 KEGG:ns NR:ns ## COG: ECs4732 COG3071 # Protein_GI_number: 15833986 # Func_class: H Coenzyme transport and metabolism # Function: Uncharacterized enzyme of heme biosynthesis # Organism: Escherichia coli O157:H7 # 12 398 12 398 398 728 100.0 0 MLKVLLLFVLLIAGIVVGPMIAGHQGYVLIQTDNYNIETSVTGLAIILILAMVVLFAIEW LLRRIFRTGAHTRGWFVGRKRRRARKQTEQALLKLAEGDYQQVEKLMAKNADHAEQPVVN YLLAAEAAQQRGDEARANQHLERAAELAGNDTIPVEITRVRLQLARNENHAARHGVDKLL EVTPRHPEVLRLAEQAYIRTGAWSSLLDIIPSMAKAHVGDEEHRAMLEQQAWIGLMDQAR ADNGSEGLRNWWKNQSRKTRHQVALQVAMAEHLIECDDHDTAQQIIIDGLKRQYDDRLLL PIPRLKTNNPEQLEKVLRQQIKNVGDRPLLWSTLGQSLMKHGEWQEASLAFRAALKQRPD AYDYAWLADALDRLHKPEEAAAMRRDGLMLTLQNNPPQ >gi|296493344|gb|ADTK01000157.1| GENE 2 1213 - 2406 1295 397 aa, chain - ## HITS:1 COG:hemX KEGG:ns NR:ns ## COG: hemX COG2959 # Protein_GI_number: 16131655 # Func_class: H Coenzyme transport and metabolism # Function: Uncharacterized enzyme of heme biosynthesis # Organism: Escherichia coli K12 # 1 397 1 393 393 578 98.0 1e-165 MTEQEKTSAVVEETREAVDTTSQPVATEKKSKNNTALILSAVAIAIALAAGVGLYGWGKQ QAVNQTATSDALANQLTALQKAQESQKAELEGIIKQQAAQLKQANRQQETLAKQLDEVQQ KVATISGSDAKTWLLAQADFLVKLAGRKLWSDQDVTTAAALLKSADASLADMNDPSLITV RRAITDDIASLSAVSQVDYDGIILKLNQLSNQVDNLRLADNDSDGSPMDSDGEELSSSIS EWRINLQKSWQNFMDNFITIRRRDDTAVPLLAPNQDIYLRENIRSRLLVAAQAVPRHQQE TYRQALENVSTWVRAYYDTDDATTKAFLDEVDQLSQQNISMDLPETLQSQAMLEKLMQTR VRNLLAQPAAGTTEAKPAPAPAPAPQADTSAAAPQGE >gi|296493344|gb|ADTK01000157.1| GENE 3 2428 - 3168 505 246 aa, chain - ## HITS:1 COG:ECs4734 KEGG:ns NR:ns ## COG: ECs4734 COG1587 # Protein_GI_number: 15833988 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III synthase # Organism: Escherichia coli O157:H7 # 1 246 1 246 246 475 100.0 1e-134 MSILVTRPSPAGEELVSRLRTLGQVAWHFPLIEFSPGRQLPQLADQLAALGESDLLFALS 
QHAVAFAQSQLHQQDRKWPRLPNYFAIGRTTALALHTVSGQKILYPQDREISEVLLQLPE LQNIAGKRALILRGNGGRELIGDTLTARGAEVTFCECYQRCAIHYDGAEEAMRWQSREVT TVVVTSGEMLQQLWSLIPQWYREHWLLHCRLLVVSERLAKLARELGWQDIKVADNADNDA LLRALQ >gi|296493344|gb|ADTK01000157.1| GENE 4 3165 - 4106 1003 313 aa, chain - ## HITS:1 COG:ECs4735 KEGG:ns NR:ns ## COG: ECs4735 COG0181 # Protein_GI_number: 15833989 # Func_class: H Coenzyme transport and metabolism # Function: Porphobilinogen deaminase # Organism: Escherichia coli O157:H7 # 1 313 8 320 320 593 99.0 1e-169 MLDNVLRIATRQSPLALWQAHYVKDKLMASHPGLVVELVPMVTRGDVILDTPLAKVGGKG LFVKELEVALLENRADIAVHSMKDVPVEFPQGLGLVTICEREDPRDAFVSNNYDSLDALP AGSIVGTSSLRRQCQLAERRPDLIIRSLRGNVGTRLSKLDNGEYDAIILAVAGLKRLGLE SRIRAALPPEISLPAVGQGAVGIECRLDDARTRELLAALNHHETALRVTAERAMNTRLEG GCQVPIGSYAELIDGEIWLRALVGAPDGSQIIRGERRGAPQDAEQMGISLAEELLNNGAR EILAEVYNGDAPA >gi|296493344|gb|ADTK01000157.1| GENE 5 4493 - 7039 2269 848 aa, chain + ## HITS:1 COG:ECs4736 KEGG:ns NR:ns ## COG: ECs4736 COG3072 # Protein_GI_number: 15833990 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate cyclase # Organism: Escherichia coli O157:H7 # 1 848 1 848 848 1781 100.0 0 MYLYIETLKQRLDAINQLRVDRALAAMGPAFQQVYSLLPTLLHYHHPLMPGYLDGNVPKG ICLYTPDETQRHYLNELELYRGMSVQDPPKGELPITGVYTMGSTSSVGQSCSSDLDIWVC HQSWLDSEERQLLQRKCSLLESWAASLGVEVSFFLIDENRFRHNESGSLGGEDCGSTQHI LLLDEFYRTAVRLAGKRILWNMVPCDEEEHYDDYVMTLYAQGVLTPNEWLDLGGLSSLSA EEYFGASLWQLYKSIDSPYKAVLKTLLLEAYSWEYPNPRLLAKDIKQRLHDGEIVSFGLD PYCMMLERVTEYLTAIEDFTRLDLVRRCFYLKVCEKLSRERACVGWRRAVLSQLVSEWGW DEARLAMLDNRANWKIDQVREAHNELLDAMMQSYRNLIRFARRNNLSVSASPQDIGVLTR KLYAAFEALPGKVTLVNPQISPDLSEPNLTFIYVPPGRANRSGWYLYNRAPNIESIISHQ PLEYNRYLNKLVAWAWFNGLLTSRTRLYIKGNGIVDLPKLQEMVADVSHHFPLRLPAPTP KALYSPCEIRHLAIIVNLEYDPTAAFRNQVVHFDFRKLDVFSFGENQNCLVGSVDLLYRN SWNEVRTLHFNGEQSMIEALKTILGKMHQDAAPPDSVEVFCYSQHLRGLIRTRVQQLVSE CIELRLSSTRQETGRFKALRVSGQTWGLFFERLNVSVQKLENAIEFYGAISHNKLHGLSV QVETNHVKLPAVVDGFASEGIIQFFFEETQDENGFNIYILDESNRVEVYHHCEGSKEELV RDVSRFYSSSHDRFTYGSSFINFNLPQFYQIVKVDGREQVIPFRTKSIGNMPPANQDHDT 
PLLQQYFS >gi|296493344|gb|ADTK01000157.1| GENE 6 7079 - 7399 519 106 aa, chain - ## HITS:1 COG:STM3943 KEGG:ns NR:ns ## COG: STM3943 COG1965 # Protein_GI_number: 16767214 # Func_class: P Inorganic ion transport and metabolism # Function: Protein implicated in iron transport, frataxin homolog # Organism: Salmonella typhimurium LT2 # 1 106 1 106 106 193 94.0 5e-50 MNDSEFHRLADQLWLTIEERLDDWDGDSDIDCEINGGVLTITFENGSKIIINRQEPLHQV WLATKQGGYHFDLKGDEWICDRSGETFWDLLEQAATQQAGETVSFR >gi|296493344|gb|ADTK01000157.1| GENE 7 7862 - 8065 221 67 aa, chain + ## HITS:1 COG:Z5325 KEGG:ns NR:ns ## COG: Z5325 COG5567 # Protein_GI_number: 15804397 # Func_class: N Cell motility # Function: Predicted small periplasmic lipoprotein # Organism: Escherichia coli O157:H7 EDL933 # 1 67 1 67 67 109 100.0 1e-24 MKNVFKALTVLLTLFSLTGCGLKGPLYFPPADKNAPPPTKPVETQTQSTVPDKNDRATGD GPSQVNY >gi|296493344|gb|ADTK01000157.1| GENE 8 8102 - 8926 784 274 aa, chain + ## HITS:1 COG:ECs4739 KEGG:ns NR:ns ## COG: ECs4739 COG0253 # Protein_GI_number: 15833993 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Escherichia coli O157:H7 # 1 274 2 275 275 568 100.0 1e-162 MQFSKMHGLGNDFMVVDAVTQNVFFSPELIRRLADRHLGVGFDQLLVVEPPYDPELDFHY RIFNADGSEVAQCGNGARCFARFVRLKGLTNKRDIRVSTANGRMVLTVTDDDLVRVNMGE PNFEPSAVPFRANKAEKTYIMRAAEQTILCGVVSMGNPHCVIQVDDVDTAAVETLGPVLE SHERFPERANIGFMQVVKREHIRLRVYERGAGETQACGSGACAAVAVGIQQGLLAEEVRV ELPGGRLDIAWKGPGHPLYMTGPAVHVYDGFIHL >gi|296493344|gb|ADTK01000157.1| GENE 9 8923 - 9630 662 235 aa, chain + ## HITS:1 COG:yigA KEGG:ns NR:ns ## COG: yigA COG3159 # Protein_GI_number: 16131662 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 235 1 235 235 444 99.0 1e-125 MKQPGEELQETLTELDDRAVVDYLIKNPEFFIRNARAVEAIRVPHPVRGTVSLVEWHMAR ARNHIHVLEENMVLLMEQAIANEGLFYRLLYLQRSLTAASSLDDMLMRFHRWARDLGLAG ASLRLFPDRWRLGAPSNHTHLALSRQSFEPLRIQRLGQEQHYLGPLNGPELLVVLPEAKA 
VGSVAMSMLGSDADLGVVLFTSRDASHYQQGQGTQLLHEIALMLPELLERWIERV >gi|296493344|gb|ADTK01000157.1| GENE 10 9627 - 10523 861 298 aa, chain + ## HITS:1 COG:xerC KEGG:ns NR:ns ## COG: xerC COG4973 # Protein_GI_number: 16131663 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerC # Organism: Escherichia coli K12 # 1 298 1 298 298 585 99.0 1e-167 MTDLHTDVERYLRYLSVERQLSPITLLNYQRQLEAIINFASENGLQSWQQCDAAMVRNFA VRSRRKGLGAASLALRLSALRSFFDWLVSQNELKANPAKGVSAPKAPRHLPKNIDVDDMN RLLDIDINDPLAVRDRAMLEVMYGAGLRLSELVGLDIKHLDLESGEVWVMGKGSKERRLP IGRNAVAWIEHWLDLRDLFGSEDDALFLSKLGKRISARNVQKRFAEWGIKQGLNNHVHPH KLRHSFATHMLESSGDLRGVQELLGHANLSTTQIYTHLDFQHLASVYDAAHPRAKRGK >gi|296493344|gb|ADTK01000157.1| GENE 11 10523 - 11239 513 238 aa, chain + ## HITS:1 COG:yigB KEGG:ns NR:ns ## COG: yigB COG1011 # Protein_GI_number: 16131664 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Escherichia coli K12 # 1 238 1 238 238 486 100.0 1e-137 MRFYRPLGRISALTFDLDDTLYDNRPVILRTEREALTFVQNYHPALRSFQNEDLQRLRQA VREAEPEIYHDVTRWRFRSIEQAMLDAGLSAEEASAGAHAAMINFAKWRSRIDVPQQTHD TLKQLAKKWPLVAITNGNAQPELFGLGDYFEFVLRAGPHGRSKPFSDMYFLAAEKLNVPI GEILHVGDDLTTDVGGAIRSGMQACWIRPENGDLMQTWDSRLLPHLEISRLASLTSLI >gi|296493344|gb|ADTK01000157.1| GENE 12 11323 - 13485 2269 720 aa, chain + ## HITS:1 COG:ECs4743 KEGG:ns NR:ns ## COG: ECs4743 COG0210 # Protein_GI_number: 15833997 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Escherichia coli O157:H7 # 1 720 1 720 720 1431 99.0 0 MDVSYLLDSLNDKQREAVAAPRSNLLVLAGAGSGKTRVLVHRIAWLMSVENCSPYSIMAV TFTNKAAAEMRHRIGQLMGTSQGGMWVGTFHGLAHRLLRAHHMDANLPQDFQILDSEDQL RLLKRLIKAMNLDEKQWPPRQAMWYINSQKDEGLRPHHIQSYGNPVEQTWQKVYQAYQEA CDRAGLVDFAELLLRAHELWLNKPHILQHYRERFTNILVDEFQDTNNIQYAWIRLLAGDT GKVMIVGDDDQSIYGWRGAQVENIQRFLNDFPGAETIRLEQNYRSTSNILSAANALIENN NGRLGKKLWTDGADGEPISLYCAFNELDEARFVVNRIKTWQDNGGALAECAILYRSNAQS 
RVLEEALLQASMPYRIYGGMRFFERQEIKDALSYLRLIANRNDDAAFERVVNTPTRGIGD RTLDVVRQTSRDRQLTLWQACRELLQEKALAGRAASALQRFMELIDALAQETADMPLHVQ TDRVIKDSGLRTMYEQEKGEKGQTRIENLEELVTATRQFSYNEEDEDLMPLQAFLSQAAL EAGEGQADTWQDAVQLMTLHSAKGLEFPQVFIVGMEEGMFPSQMSLDEGGRLEEERRLAY VGVTRAMQKLTLTYAETRRLYGKEVYHRPSRFIGELPEECVEEVRLRATVSRPVSHQRMG TPMVENDSGYKLGQRVRHAKFGEGTIVNMEGSGEHSRLQVAFQGQGIKWLVAAYARLETV >gi|296493344|gb|ADTK01000157.1| GENE 13 13632 - 14396 694 254 aa, chain - ## HITS:1 COG:ECs4745 KEGG:ns NR:ns ## COG: ECs4745 COG3698 # Protein_GI_number: 15833999 # Func_class: S Function unknown # Function: Predicted periplasmic protein # Organism: Escherichia coli O157:H7 # 1 254 1 254 254 508 97.0 1e-144 MAHRLLIGKGMITLNLKRIFLALTLLPLFAVAADDCALSDPMLIVQAYTVNPQTERVKMY WQKANGEAWGTLHALLVDMNSQGQVQMAMNGGIYDESYAPLGLYIENGQQKVALNLASGE GNFFIRPGGVFYVAGDKVGIVRLDAFKASKEIQFAVQSGPMLMENGVINPRIHPNVASRK IRNGVGINKHGNAVFLLSQQATNFYDFSCYAKAKLNVEQLLYLDGTISHMYMKGGAIPWQ RYPFVTMISVERKG >gi|296493344|gb|ADTK01000157.1| GENE 14 14766 - 15716 1118 316 aa, chain + ## HITS:1 COG:ECs4746 KEGG:ns NR:ns ## COG: ECs4746 COG0598 # Protein_GI_number: 15834000 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Escherichia coli O157:H7 # 1 316 1 316 316 607 100.0 1e-174 MLSAFQLENNRLTRLEVEESQPLVNAVWIDLVEPDDDERLRVQSELGQSLATRPELEDIE ASARFFEDDDGLHIHSFFFFEDAEDHAGNSTVAFTIRDGRLFTLRERELPAFRLYRMRAR SQSMVDGNAYELLLDLFETKIEQLADEIENIYSDLEQLSRVIMEGHQGDEYDEALSTLAE LEDIGWKVRLCLMDTQRALNFLVRKARLPGGQLEQAREILRDIESLLPHNESLFQKVNFL MQAAMGFINIEQNRIIKIFSVVSVVFLPPTLVASSYGMNFEFMPELKWSFGYPGAIIFMI LAGLAPYLYFKRKNWL >gi|296493344|gb|ADTK01000157.1| GENE 15 15759 - 16124 328 121 aa, chain - ## HITS:1 COG:no KEGG:B21_03645 NR:ns ## KEGG: B21_03645 # Name: yigF # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 121 6 126 126 218 100.0 4e-56 MNDGSLSEKWKYRFNFYDQHGFPGFWGATPEYKAAFKALKVRQRLTIQMNFIAFFCSWIY LFVLGLWKKAIIVLLLGILSLFVGALIGVNILGIAVAAYVAVNTNKWFYEKEVKGLNTWS L 
>gi|296493344|gb|ADTK01000157.1| GENE 16 16153 - 16533 99 126 aa, chain - ## HITS:1 COG:no KEGG:JW5590 NR:ns ## KEGG: JW5590 # Name: yigG # Def: predicted inner membrane protein # Organism: E.coli_J # Pathway: not_defined # 1 126 1 126 126 182 99.0 4e-45 MLRMFIPTSNGKISRRRYIFSFILINFIFAFLIIFFNDGEAGFLVIVSTIVLHYLVINMN CQRLRDSGFIYIKTYVFGTLAVYIISIITMIAEDFACSGNGSMIFLICYFSTFSMLMLAP TDSSKQ >gi|296493344|gb|ADTK01000157.1| GENE 17 16628 - 17518 965 296 aa, chain - ## HITS:1 COG:ECs4749 KEGG:ns NR:ns ## COG: ECs4749 COG2962 # Protein_GI_number: 15834003 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Escherichia coli O157:H7 # 1 295 1 295 295 508 99.0 1e-144 MDAKQTRQGVLLALAAYFIWGIAPAYFKLIYYVPADEILTHRVIWSFFFMVVLMSICRQW SYLKTLIQTPQKIFMLAVSAVLIGGNWLLFIWAVNNHHMLEASLGYFINPLVNIVLGMIF LGERFRRMQWLAVILAICGVLVQLWTFGSLPIIALGLAFSFAFYGLVRKKIAVEAQTGML IETMWLLPVAAIYLFAIADSSTSHMGQNPMSLNLLLIAAGIVTTVPLLCFTAAATRLRLS TLGFFQYIGPTLMFLLAVTFYGEKPGADKMVTFAFIWVALAIFVMDAIYTQRRTSK >gi|296493344|gb|ADTK01000157.1| GENE 18 17570 - 18055 481 161 aa, chain - ## HITS:1 COG:ECs4750 KEGG:ns NR:ns ## COG: ECs4750 COG2050 # Protein_GI_number: 15834004 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein, possibly involved in aromatic compounds catabolism # Organism: Escherichia coli O157:H7 # 1 161 1 161 161 306 99.0 1e-83 MKYKDDMSAVLTAEQALKLVGEMFVYHMPFNRALGMELERYEKEFAQLAFKNQPMMVGNW AQSILHGGVIASALDVAAGLVCVGSTLTRHETISEDELRQRLSRMGTIDLRVDYLRPGRG ERFTATSSLLRAGNKVAVARVELHNEEQLYIASATATYMVG >gi|296493344|gb|ADTK01000157.1| GENE 19 18202 - 19071 1004 289 aa, chain + ## HITS:1 COG:ECs4751 KEGG:ns NR:ns ## COG: ECs4751 COG2829 # Protein_GI_number: 15834005 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane phospholipase A # Organism: Escherichia coli O157:H7 # 1 289 1 289 289 572 100.0 1e-163 MRTLQGWLLPVFMLPMAVYAQEATVKEVHDAPAVRGSIIANMLQEHDNPFTLYPYDTNYL 
IYTQTSDLNKEAIASYDWAENARKDEVKFQLSLAFPLWRGILGPNSVLGASYTQKSWWQL SNSEESSPFRETNYEPQLFLGFATDYRFAGWTLRDVEMGYNHDSNGRSDPTSRSWNRLYT RLMAENGNWLVEVKPWYVVGNTDDNPDITKYMGYYQLKIGYHLGDAVLSAKGQYNWNTGY GGAELGLSYPITKHVRLYTQVYSGYGESLIDYNFNQTRVGVGVMLNDLF >gi|296493344|gb|ADTK01000157.1| GENE 20 19198 - 21033 1837 611 aa, chain + ## HITS:1 COG:ECs4752 KEGG:ns NR:ns ## COG: ECs4752 COG0514 # Protein_GI_number: 15834006 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Escherichia coli O157:H7 # 1 611 1 611 611 1254 99.0 0 MNVAQAEVLNLESGAKQVLQETFGYQQFRPGQEEIIDTVLSGRDCLVVMPTGGGKSLCYQ IPALLLNGLTVVVSPLISLMKDQVDQLQANGVAAACLNSTQTREQQLEVMTGCRTGQIRL LYIAPERLMLDNFLEHLAHWNPVLLAVDEAHCISQWGHDFRPEYAALGQLRQRFPTLPFM ALTATADDTTRQDIVRLLGLNDPLIQISSFDRPNIRYMLMEKFKPLDQLMRYVQEQRGKS GIIYCNSRAKVEDTAARLQSKGISAAAYHAGLENNVRADVQEKFQRDDLQIVVATVAFGM GINKPNVRFVVHFDIPRNIESYYQETGRAGRDGLPAEAILFYDPADMAWLRRCLEEKPQG QLQDIERHKLNAMGAFAEAQTCRRLVLLNYFGEGRQEPCGNCDICLDPPKQYDGSTDAQI ALSTIGRVNQRFGMGYVVEVIRGANNQRIRDYGHDKLKVYGMGRDKSHEHWVSVIRQLIH LGLVTQNIAQHSALQLTEAARPVLRGESSLQLAVPRIVALKPKAMQKSFGGNYDRKLFAK LRKLRKSIADESNVPPYVVFNDATLIEMAEQMPITASEMLSVNGVGMRKLERFGKPFMAL IRAHVDGDDEE >gi|296493344|gb|ADTK01000157.1| GENE 21 21097 - 21717 763 206 aa, chain + ## HITS:1 COG:ECs4753 KEGG:ns NR:ns ## COG: ECs4753 COG1280 # Protein_GI_number: 15834007 # Func_class: E Amino acid transport and metabolism # Function: Putative threonine efflux protein # Organism: Escherichia coli O157:H7 # 1 206 1 206 206 363 99.0 1e-101 MLMLFLTVAMVHIVALMSPGPDFFFVSQTAVSRSRKEAMMGVLGITCGVMVWAGIALLGL HLIIEKMAWLHTLIMVGGGLYLCWMGYQMLRGALKKEAVSAPAPQIELAKSGRSFLKGLL TNLANPKAIIYFGSVFSLFVGDNVGTTARWGIFALIIVETLAWFTVVASLFALPQMRRGY QRLAKWIDGFAGALFAGFGIHLIISR >gi|296493344|gb|ADTK01000157.1| GENE 22 21779 - 22399 498 206 aa, chain - ## HITS:1 COG:ECs4754 KEGG:ns NR:ns ## COG: ECs4754 COG1280 # Protein_GI_number: 15834008 # Func_class: E Amino acid transport and metabolism # Function: Putative threonine efflux protein # 
Organism: Escherichia coli O157:H7 # 1 206 1 206 206 352 100.0 3e-97 MTLEWWFAYLLTSIILSLSPGSGAINTMTTSLNHGYRGAVASIAGLQTGLAIHIVLVGVG LGTLFSRSVIAFEVLKWAGAAYLIWLGIQQWRAAGAIDLKSLASTQSRRHLFQRAVFVNL TNPKSIVFLAALFPQFIMPQQPQLMQYIVLGVTTIVVDIIVMIGYATLAQRIALWIKGPK QMKALNKIFGSLFMLVGALLASARHA >gi|296493344|gb|ADTK01000157.1| GENE 23 22510 - 23532 719 340 aa, chain + ## HITS:1 COG:ECs4755 KEGG:ns NR:ns ## COG: ECs4755 COG2267 # Protein_GI_number: 15834009 # Func_class: I Lipid transport and metabolism # Function: Lysophospholipase # Organism: Escherichia coli O157:H7 # 1 340 1 340 340 699 99.0 0 MFQQQKDWETRENAFAAFTMGPLTDFWRQRDEAEFTGVDDIPVRFVRFRAQHHDRVVVIC PGRIESYVKYAELAYDLFHLGFDVLIIDHRGQGRSGRLLADPHLGHVNRFNDYVDDLAAF WQQEVQPGPWRKRYILAHSMGGAISTLFLQRHPGVCDAIALTAPMFGIVIRMPSFMARQI LNWAEAHPRFRDGYAIGTGRWRALPFAINVLTHSRQRYRRNLRFYADDPTIRVGGPTYHW VRESILAGEQVLAGAGDDATPTLLLQAEEERVVDNRMHDRFCELRTAAGHPVEGGRPLVI KGAYHEILFEKDAMRSVALHAIVDFFNRHNSPSGNRSTEV >gi|296493344|gb|ADTK01000157.1| GENE 24 23540 - 24340 933 266 aa, chain + ## HITS:1 COG:Z5347 KEGG:ns NR:ns ## COG: Z5347 COG0561 # Protein_GI_number: 15804418 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Escherichia coli O157:H7 EDL933 # 1 266 40 305 305 555 99.0 1e-158 MYQVVASDLDGTLLSPDHTLSPYAKETLKLLTAHGINFVFATGRHHVDVGQIRDNLEIKS YMITSNGARVHDLDGNLIFAHNLDRDIASDLFGVVNDNPDIITNVYRDDEWFMNRHRPEE MRFFKEAVFKYALYEPGLLEPEGVSKVFFTCDSHEQLLPLEQAINARWGDRVNVSFSTLT CLEVMAGGVSKGHALEAVAKKLGYSLKDCIAFGDGMNDAEMLSMAGKGCIMGSAHQRLKD LHPELEVIGTNADDAVPHYLRKLYLS >gi|296493344|gb|ADTK01000157.1| GENE 25 24416 - 25315 982 299 aa, chain + ## HITS:1 COG:ECs4757 KEGG:ns NR:ns ## COG: ECs4757 COG0697 # Protein_GI_number: 15834011 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli O157:H7 # 1 299 1 299 299 540 100.0 1e-153 
MALLIITTILWAFSFSFYGEYLAGHVDSYFAVLVRVGLAALVFLPFLRTRGNSLKTVGLY MLVGAMQLGVMYMLSFRAYLYLTVSELLLFTVLTPLYITLIYDIMSKRRLRWGYAFSALL AVIGAGIIRYDQVTDHFWTGLLLVQLSNITFAIGMVGYKRLMETRPMPQHNAFAWFYLGA FLVAVIAWFLLGNAQKMPQTTLQWGILVFLGVVASGIGYFMWNYGATQVDAGTLGIMNNM HVPAGLLVNLAIWHQQPHWPTFITGALVILASLWVHRKWVAPRSSQTADDRRRDCALSE >gi|296493344|gb|ADTK01000157.1| GENE 26 25203 - 26156 734 317 aa, chain - ## HITS:1 COG:ECs4758 KEGG:ns NR:ns ## COG: ECs4758 COG0583 # Protein_GI_number: 15834012 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 317 1 317 317 613 99.0 1e-175 MIEVKHLKTLQALRNCGSLAAAAATLHQTQSALSHQFSDLEQRLGFRLFVRKSQPLRFTP QGEILLQLANQVLPQISQALQACNEPQQTRLRIAIECHSCIQWLTPALENFHKNWPQVEM DFKSGVTFDPQPALQQGELDLVMTSDILPRSGLHYSPMFDYEVRLVLAPDHPLAAKTRIT PEDLASETLLIYPVQRSRLDVWRHFLQPAGVSPSLKSVDNTLLLIQMVAARMGIAALPHW VVESFERQGLVVTKTLGEGLWSRLYAAVRDGEQRQPITEAFIRSARNHACDHLPFVKSAE RPTYDAPTVRPGSPARL >gi|296493344|gb|ADTK01000157.1| GENE 27 26274 - 28535 2507 753 aa, chain + ## HITS:1 COG:metE KEGG:ns NR:ns ## COG: metE COG0620 # Protein_GI_number: 16131678 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase II (cobalamin-independent) # Organism: Escherichia coli K12 # 1 753 1 753 753 1516 98.0 0 MTILNHTLGFPRVGLRRELKKAQESYWAGNSTREELLTVGRELRARHWDQQKQAGIDLLP VGDFAWYDHVLTTSLLLGNVPPRHQNKDGSVDIDTLFRIGRGRAPTGEPAAAAEMTKWFN TNYHYMVPEFVKGQQFKLTWTQLLEEVDEALALGHNVKPVLLGPVTYLWLGKVKGEQFDR LSLLNDILPVYQQVLAELAKRGIEWVQIDEPALVLELPQAWLDAYKPAYDALQGQVKLLL TTYFEGVTPNLDTITALPVQGLHVDLVHGKDDVVELHKRLPSDWLLSAGLINGRNVWRAD LTEKYAQIKDIVGKRDLWVASSCSLLHSPIDLSVETRLDAEVKSWFAFALQKCHELALLR DALNSGDTAALAEWSAPIQARRHSTRVHNPAVEKRLAAITAQDSQRANVYEVRAEAQRAR FKLPAWPTTTIGSFPQTTEIRTLRLDFKKGNLDANNYRTGIAEHIRQAIVEQERLGLDVL VHGEAERNDMVEYFGEHLDGFVFTQNGWVQSYGSRCVKPPIVIGDVSRPAPITVEWAKYA QSLTDKPVKGMLTGPVTILCWSFPREDVSRETIAKQIALALRDEVADLEAAGIGIIQIDE PALREGLPLRRSDWDAYLQWGVEAFRINAAVAKDDTQIHTHMCYCEFNDIMDSIAALDAD VITIETSRSDMELLESFEEFDYPNEIGPGVYDIHSPNVPSVEWIEALLKKAAKRIPAERL 
WVNPDCGLKTRGWPETRAALANMVQAAQNLRLG >gi|296493344|gb|ADTK01000157.1| GENE 28 28574 - 29389 847 271 aa, chain - ## HITS:1 COG:ECs4760 KEGG:ns NR:ns ## COG: ECs4760 COG0412 # Protein_GI_number: 15834014 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Dienelactone hydrolase and related enzymes # Organism: Escherichia coli O157:H7 # 1 271 23 293 293 546 98.0 1e-155 MATTQQSGFAPAASPLASTIVQTPDDAIVAGFTSIPSQGDNMPAYHARPKQSDGPLPVVI VVQEIFGVHEHIRDICRRLALEGYLAIAPELYFREGDPNDFADIPTLLSGLVAKVPDSQV LADLDHVASWASRNGGDVHRLMITGFCWGGRITWLYAAHNPQLKAAVAWYGKLTGDKSLN SPKQPVDIATDLNAPVLGLYGGQDNSIPQESVETMRQALRAANAKAEIIVYPDAGHAFNA DYRPSYHAESAKDGWQRMLEWFKQYGGKKSL >gi|296493344|gb|ADTK01000157.1| GENE 29 29651 - 30412 896 253 aa, chain + ## HITS:1 COG:udp KEGG:ns NR:ns ## COG: udp COG2820 # Protein_GI_number: 16131680 # Func_class: F Nucleotide transport and metabolism # Function: Uridine phosphorylase # Organism: Escherichia coli K12 # 1 253 1 253 253 489 100.0 1e-138 MSKSDVFHLGLTKNDLQGATLAIVPGDPDRVEKIAALMDKPVKLASHREFTTWRAELDGK PVIVCSTGIGGPSTSIAVEELAQLGIRTFLRIGTTGAIQPHINVGDVLVTTASVRLDGAS LHFAPLEFPAVADFECTTALVEAAKSIGATTHVGVTASSDTFYPGQERYDTYSGRVVRHF KGSMEEWQAMGVMNYEMESATLLTMCASQGLRAGMVAGVIVNRTQQEIPNAETMKQTESH AVKIVVEAARRLL >gi|296493344|gb|ADTK01000157.1| GENE 30 30568 - 31980 1601 470 aa, chain + ## HITS:1 COG:ECs4762 KEGG:ns NR:ns ## COG: ECs4762 COG1322 # Protein_GI_number: 15834016 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 470 6 475 475 803 99.0 0 MVYAVIALVGVAIGWLFASYQYAQQKAEQLAEREEMVAELSAAKQQITQSEHWRAECELL NNEVRSLQSINTSLEADLREVTTRMEAAQQHADDKIRQMINSEQRLSEQFENLANRIFEH SNRRVDEQNRQSLNSLLSPLREQLDGFRRQVQDSFGKEAQERHTLTHEIRNLQQLNAQMA QEAINLTRALKGDNKTQGNWGEVVLTRVLEASGLREGYEYETQVSIENDARSRMQPDVIV RLPQGKDVVIDAKMTLVAYERYFNAEDDYTRESALQEHIASVRNHIRLLGRKDYQQLPGL RTLDYVLMFIPVEPAFLLALDRQPELITEALKNNIMLVSPTTLLVALRTIANLWRYEHQS RNAQQIADRASKLYDKMRLFIDDMSAIGQSLDKAQDNYRQAMKKLSSGRGNVLAQAEAFR 
GLGVEIKREINPDLAEQAVSQDEEYRLRSVPEQPNDEAYQRDDEYNQQSR >gi|296493344|gb|ADTK01000157.1| GENE 31 32075 - 32830 363 251 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15 [Kordia algicida OT-1] # 28 250 1 221 221 144 34 8e-34 MVDKSQETTHFGFQTVAKEQKADMVAHVFHSVASKYDVMNDLMSFGIHRLWKRFTIDCSG VRRGQTVLDLAGGTGDLTAKFSRLVGETGKVVLADINESMLKMGREKLRNIGVIGNVEYV QANAEALPFPDNTFDCITISFGLRNVTDKDKALRSMYRVLKPGGRLLVLEFSKPIIEPLS KAYDAYSFHVLPRIGSLVANDADSYRYLAESIRMHPDQDTLKAMMQDAGFESVDYYNLTA GVVALHRGYKF >gi|296493344|gb|ADTK01000157.1| GENE 32 32844 - 33449 704 201 aa, chain + ## HITS:1 COG:yigP KEGG:ns NR:ns ## COG: yigP COG3165 # Protein_GI_number: 16131683 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 201 1 201 201 372 99.0 1e-103 MPFKPLVTAGIESLLNTFLYRSPALKTARLRLLGKVLRVEVKGFSTSLILVFSERQVDVL GEWAGDADCTVIAYASVLPKLRDRQQLTALIRSGELEVQGDIQVVQNFVALADLAEFDPA ELLAPYTGDIAAEGISKAMRGGAKFLHHGIKRQQRYVAEAITEEWRMAPGPLEVAWFAEE TAAVERAVDALTKRLEKLEAK >gi|296493344|gb|ADTK01000157.1| GENE 33 33527 - 35086 1460 519 aa, chain + ## HITS:1 COG:ECs4765 KEGG:ns NR:ns ## COG: ECs4765 COG0661 # Protein_GI_number: 15834019 # Func_class: R General function prediction only # Function: Predicted unusual protein kinase # Organism: Escherichia coli O157:H7 # 1 519 28 546 546 1056 100.0 0 MRITLPLRLWRYSLFWMPNRHKDKLLGERLRLALQELGPVWIKFGQMLSTRRDLFPPHIA DQLALLQDKVAPFDGKLAKQQIEAAMGGLPVEAWFDDFEIKPLASASIAQVHTARLKSNG KEVVIKVIRPDILPVIKADLKLIYRLARWVPRLLPDGRRLRPTEVVREYEKTLIDELNLL RESANAIQLRRNFEDSPMLYIPEVYPDYCSEGMMVMERIYGIPVSDVAALEKNGTNMKLL AERGVQVFFTQVFRDSFFHADMHPGNIFVSYEHPENPKYIGIDCGIVGSLNKEDKRYLAE NFIAFFNRDYRKVAELHVDSGWVPPDTNVEEFEFAIRTVCEPIFEKPLAEISFGHVLLNL FNTARRFNMEVQPQLVLLQKTLLYVEGVGRQLYPQLDLWKTAKPFLESWIKDQVGIPALV RAFKEKAPFWVEKMPELPELVYDSLRQGKYLQHSVDKIARELQSNHVRQGQSRYFLGIGA TLVLSGTFLLVSRPEWGLMPGWLMAGGLIAWFVGWRKTR >gi|296493344|gb|ADTK01000157.1| GENE 34 35165 - 35434 200 89 aa, chain + ## PROTEIN SUPPORTED ## 
NR: gi|90022866|ref|YP_528693.1| ribosomal protein L25 [Saccharophagus degradans 2-40] # 1 79 3 81 83 81 51 6e-15 MGGISIWQLLIIAVIVVLLFGTKKLGSIGSDLGASIKGFKKAMSDDEPKQDKTSQDADFT AKTIADKQADTNQEQAKTEDAKRHDKEQV >gi|296493344|gb|ADTK01000157.1| GENE 35 35438 - 35953 476 171 aa, chain + ## HITS:1 COG:ECs4767 KEGG:ns NR:ns ## COG: ECs4767 COG1826 # Protein_GI_number: 15834021 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Sec-independent protein secretion pathway components # Organism: Escherichia coli O157:H7 # 1 171 1 171 171 267 100.0 9e-72 MFDIGFSELLLVFIIGLVVLGPQRLPVAVKTVAGWIRALRSLATTVQNELTQELKLQEFQ DSLKKVEKASLTNLTPELKASMDELRQAAESMKRSYVANDPEKASDEAHTIHNPVVKDNE AAHEGVTPAAAQTQASSPEQKPETTPEPVVKPAADAEPKTAAPSPSSSDKP >gi|296493344|gb|ADTK01000157.1| GENE 36 35956 - 36732 833 258 aa, chain + ## HITS:1 COG:ECs4768 KEGG:ns NR:ns ## COG: ECs4768 COG0805 # Protein_GI_number: 15834022 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Sec-independent protein secretion pathway component TatC # Organism: Escherichia coli O157:H7 # 1 258 1 258 258 465 100.0 1e-131 MSVEDTQPLITHLIELRKRLLNCIIAVIVIFLCLVYFANDIYHLVSAPLIKQLPQGSTMI ATDVASPFFTPIKLTFMVSLILSAPVILYQVWAFIAPALYKHERRLVVPLLVSSSLLFYI GMAFAYFVVFPLAFGFLANTAPEGVQVSTDIASYLSFVMALFMAFGVSFEVPVAIVLLCW MGITSPEDLRKKRPYVLVGAFVVGMLLTPPDVFSQTLLAIPMYCLFEIGVFFSRFYVGKG RNREEENDAEAESEKTEE Prediction of potential genes in microbial genomes Time: Mon May 16 15:33:18 2011 Seq name: gi|296493343|gb|ADTK01000158.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont381.5, whole genome shotgun sequence Length of sequence - 3669 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 33 - 815 572 ## COG0084 Mg-dependent DNase - Term 740 - 777 1.0 2 2 Tu 1 . 
- CDS 812 - 1300 366 ## COG0250 Transcription antiterminator - Prom 1394 - 1453 6.4 + Prom 1333 - 1392 3.9 3 3 Op 1 6/0.000 + CDS 1467 - 2960 1608 ## COG0043 3-polyprenyl-4-hydroxybenzoate decarboxylase and related decarboxylases 4 3 Op 2 . + CDS 3006 - 3669 612 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases Predicted protein(s) >gi|296493343|gb|ADTK01000158.1| GENE 1 33 - 815 572 260 aa, chain + ## HITS:1 COG:ZtatD KEGG:ns NR:ns ## COG: ZtatD COG0084 # Protein_GI_number: 15804431 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Escherichia coli O157:H7 EDL933 # 1 260 5 264 264 537 100.0 1e-153 MFDIGVNLTSSQFAKDRDDVVARAFDAGVNGLLITGTNLRESQQAQKLARQYSSCWSTAG VHPHDSSQWQAATEEAIIELAAQPEVVAIGECGLDFNRNFSTPEEQERAFVAQLRIAADL NMPVFMHCRDAHERFMTLLEPWLDKLPGAVLHCFTGTREEMQACVAHGIYIGITGWVCDE RRGLELRELLPLIPAEKLLIETDAPYLLPRDLTPKPSSRRNEPAHLPHILQRIAHWRGED AAWLAATTDANVKTLFGIAF >gi|296493343|gb|ADTK01000158.1| GENE 2 812 - 1300 366 162 aa, chain - ## HITS:1 COG:ECs4770 KEGG:ns NR:ns ## COG: ECs4770 COG0250 # Protein_GI_number: 15834024 # Func_class: K Transcription # Function: Transcription antiterminator # Organism: Escherichia coli O157:H7 # 1 162 1 162 162 330 100.0 9e-91 MQSWYLLYCKRGQLQRAQEHLERQAVNCLAPMITLEKIVRGKRTAVSEPLFPNYLFVEFD PEVIHTTTINATRGVSHFVRFGASPAIVPSAVIHQLSVYKPKDIVDPATPYPGDKVIITE GAFEGFQAIFTEPDGEARSMLLLNLINKEIKHSVKNTEFRKL >gi|296493343|gb|ADTK01000158.1| GENE 3 1467 - 2960 1608 497 aa, chain + ## HITS:1 COG:ubiD KEGG:ns NR:ns ## COG: ubiD COG0043 # Protein_GI_number: 16131689 # Func_class: H Coenzyme transport and metabolism # Function: 3-polyprenyl-4-hydroxybenzoate decarboxylase and related decarboxylases # Organism: Escherichia coli K12 # 1 497 1 497 497 1036 100.0 0 MDAMKYNDLRDFLTLLEQQGELKRITLPVDPHLEITEIADRTLRAGGPALLFENPKGYSM PVLCNLFGTPKRVAMGMGQEDVSALREVGKLLAFLKEPEPPKGFRDLFDKLPQFKQVLNM PTKRLRGAPCQQKIVSGDDVDLNRIPIMTCWPEDAAPLITWGLTVTRGPHKERQNLGIYR 
QQLIGKNKLIMRWLSHRGGALDYQEWCAAHPGERFPVSVALGADPATILGAVTPVPDTLS EYAFAGLLRGTKTEVVKCISNDLEVPASAEIVLEGYIEQGETAPEGPYGDHTGYYNEVDS FPVFTVTHITQREDAIYHSTYTGRPPDEPAVLGVALNEVFVPILQKQFPEIVDFYLPPEG CSYRLAVVTIKKQYAGHAKRVMMGVWSFLRQFMYTKFVIVCDDDVNARDWNDVIWAITTR MDPARDTVLVENTPIDYLDFASPVSGLGSKMGLDATNKWPGETQREWGRPIKKDPDVVAH IDAIWDELAIFNNGKSA >gi|296493343|gb|ADTK01000158.1| GENE 4 3006 - 3669 612 221 aa, chain + ## HITS:1 COG:ECs4772 KEGG:ns NR:ns ## COG: ECs4772 COG0543 # Protein_GI_number: 15834026 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Escherichia coli O157:H7 # 1 221 1 221 233 456 100.0 1e-128 MTTLSCKVTSVEAITDTVYRVRIVPDAAFSFRAGQYLMVVMDERDKRPFSMASTPDEKGF IELHIGASEINLYAKAVMDRILKDHQIVVDIPHGEAWLRDDEERPMILIAGGTGFSYARS ILLTALARNPNRDITIYWGGREEQHLYDLCELEALSLKHPGLQVVPVVEQPEAGWRGRTG TVLTAVLQDHGTLAEHDIYIAGRFEMAKIARDLFCSERNAR Prediction of potential genes in microbial genomes Time: Mon May 16 15:33:21 2011 Seq name: gi|296493342|gb|ADTK01000159.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont381.6, whole genome shotgun sequence Length of sequence - 7805 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 20 - 52 2.1 1 1 Op 1 20/0.000 - CDS 204 - 1367 1316 ## COG0183 Acetyl-CoA acetyltransferase 2 1 Op 2 . - CDS 1377 - 3566 2690 ## COG1250 3-hydroxyacyl-CoA dehydrogenase - Prom 3629 - 3688 4.8 + Prom 3666 - 3725 2.4 3 2 Op 1 2/1.000 + CDS 3756 - 5087 1638 ## COG0006 Xaa-Pro aminopeptidase 4 2 Op 2 4/1.000 + CDS 5087 - 5701 457 ## COG1739 Uncharacterized conserved protein 5 2 Op 3 6/0.000 + CDS 5740 - 7191 1472 ## COG0168 Trk-type K+ transport systems, membrane components 6 2 Op 4 . 
+ CDS 7203 - 7748 488 ## COG4635 Flavodoxin Predicted protein(s) >gi|296493342|gb|ADTK01000159.1| GENE 1 204 - 1367 1316 387 aa, chain - ## HITS:1 COG:ECs4773 KEGG:ns NR:ns ## COG: ECs4773 COG0183 # Protein_GI_number: 15834027 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA acetyltransferase # Organism: Escherichia coli O157:H7 # 1 387 1 387 387 728 98.0 0 MEQVVIVDAIRTPMGRSKGGAFRNVRAEDLSAHLMRSLLARNPALEAAALDDIYWGCVQQ TLEQGFNIARNAALLAEVPHSVPAVTVNRLCGSSMQALHDAARMIMTGDAQACLVGGVEH MGHVPMSHGVDFHPGLSRNVAKAAGMMGLTAEMLARMHGISREMQDAFAARSHARAWAAT QSAAFKNEIIPTGGHDADGVLKQFNYDEVIRPETTVEALATLRPAFDPVNGTVTAGTSSA LSDGAAAMLVMSESRAHELGLKPRARVRSMAVVGCDPSIMGYGPVPASKLALKKAGLSAS DIGVFEMNEAFAAQILPCIKDLGLMEQIDEKINLNGGAIALGHPLGCSGARISTTLLNLM ERKDVQFGLATMCIGLGQGIATVFERV >gi|296493342|gb|ADTK01000159.1| GENE 2 1377 - 3566 2690 729 aa, chain - ## HITS:1 COG:fadB_2 KEGG:ns NR:ns ## COG: fadB_2 COG1250 # Protein_GI_number: 16131692 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyacyl-CoA dehydrogenase # Organism: Escherichia coli K12 # 307 729 1 423 423 839 99.0 0 MLYKGDTLYLDWLEDGIAELVFDAPGSVNKLDTATVASLGEAIGVLEQQSDLKGLLLRSN KAAFIVGADITEFLSLFLVPEEQLSQWLQFANSVFNRLEDLPVPTIAAVNGYALGGGCEC VLATDYRLATPDLRIGLPETKLGIMPGFGGSVRMPRMLGADSALEIIAAGKDVGADQALK IGLVDGVVKAEKLVEGAKAVLRQAINGDLDWKAKRQPKLEPLKLSKIEATMSFTIAKGMV AQTAGKHYPAPITAVKTIEAAARFGREEALNLENKSFVPLAHTNEARALVGIFLNDQYVK GKAKKLTKDVETPKQAAVLGAGIMGGGIAYQSAWKGVPVVMKDINDKSLTLGMTEAAKLL NKQLERGKIDGLKLAGVISTIHPTLDYAGFDRVDVVVEAVVENPKVKKAVLAETEQKVRP DTVLASNTSTIPISELANALERPENFCGMHFFNPVHRMPLVEIIRGEKSSDETIAKVVAW ASKMGKTPIVVNDCPGFFVNRVLFPYFAGFSQLLRDGADFRKIDKVMEKQFGWPMGPAYL LDVVGIDTAHHAQAVMAAGFPQRMQKDYRDAIDALFDANRFGQKNGLGFWRYKEDSKGKP KKEEDAAVEDLLAEVSQPKRDFSEEEIIARMMIPMVNEVVRCLEEGIIATPAEADMALVY GLGFPPFHGGAFRWLDTLGSAKYLDMAQQYQHLGPLYEVPEGLRNKARHNEPYYPPVEPA RPVGDLKTA >gi|296493342|gb|ADTK01000159.1| GENE 3 3756 - 5087 1638 443 aa, chain + ## HITS:1 COG:pepQ KEGG:ns NR:ns ## COG: pepQ COG0006 # Protein_GI_number: 16131693 # 
Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Escherichia coli K12 # 1 443 1 443 443 927 100.0 0 MESLASLYKNHIATLQERTRDALARFKLDALLIHSGELFNVFLDDHPYPFKVNPQFKAWV PVTQVPNCWLLVDGVNKPKLWFYLPVDYWHNVEPLPTSFWTEDVEVIALPKADGIGSLLP AARGNIGYIGPVPERALQLGIEASNINPKGVIDYLHYYRSFKTEYELACMREAQKMAVNG HRAAEEAFRSGMSEFDINIAYLTATGHRDTDVPYSNIVALNEHAAVLHYTKLDHQAPEEM RSFLLDAGAEYNGYAADLTRTWSAKSDNDYAQLVKDVNDEQLALIATMKAGVSYVDYHIQ FHQRIAKLLRKHQIITDMSEEAMVENDLTGPFMPHGIGHPLGLQVHDVAGFMQDDSGTHL AAPAKYPYLRCTRILQPGMVLTIEPGIYFIESLLAPWREGQFSKHFNWQKIEALKPFGGI RIEDNVVIHENNVENMTRDLKLA >gi|296493342|gb|ADTK01000159.1| GENE 4 5087 - 5701 457 204 aa, chain + ## HITS:1 COG:ECs4776 KEGG:ns NR:ns ## COG: ECs4776 COG1739 # Protein_GI_number: 15834030 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 204 2 205 205 373 100.0 1e-103 MESWLIPAAPVTVVEEIKKSRFITLLAHTDGVEAAKAFVESVRAEHPDARHHCVAWVAGA PDDSQQLGFSDDGEPAGTAGKPMLAQLMGSGVGEITAVVVRYYGGILLGTGGLVKAYGGG VNQALRQLTTQRKTPLTEYTLQCEYSQLTGIEALLGQCDGKIINSDYQAFVLLRVALPAA KVAEFSAKLADFSRGSLQLLAIEE >gi|296493342|gb|ADTK01000159.1| GENE 5 5740 - 7191 1472 483 aa, chain + ## HITS:1 COG:ECs4777 KEGG:ns NR:ns ## COG: ECs4777 COG0168 # Protein_GI_number: 15834031 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Escherichia coli O157:H7 # 1 483 1 483 483 848 100.0 0 MHFRAITRIVGLLVILFSGTMIIPGLVALIYRDGAGRAFTQTFFVALAIGSMLWWPNRKE KGELKSREGFLIVVLFWTVLGSVGALPFIFSESPNLTITDAFFESFSGLTTTGATTLVGL DSLPHAILFYRQMLQWFGGMGIIVLAVAILPILGVGGMQLYRAEMPGPLKDNKMRPRIAE TAKTLWLIYVLLTVACALALWFAGMDAFDAIGHSFATIAIGGFSTHDASIGYFDSPTINT IIAIFLLISGCNYGLHFSLLSGRSLKVYWRDPEFRMFIGVQFTLVVICTLVLWFHNVYSS ALMTINQAFFQVVSMATTAGFTTDSIARWPLFLPVLLLCSAFIGGCAGSTGGGLKVIRIL LLFKQGNRELKRLVHPNAVYSIKLGNRALPERILEAVWGFFSAYALVFIVSMLAIIATGV DDFSAFASVVATLNNLGPGLGVVADNFTSMNPVAKWILIANMLFGRLEVFTLLVLFTPTF WRE >gi|296493342|gb|ADTK01000159.1| GENE 6 
7203 - 7748 488 181 aa, chain + ## HITS:1 COG:ECs4778 KEGG:ns NR:ns ## COG: ECs4778 COG4635 # Protein_GI_number: 15834032 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism # Function: Flavodoxin # Organism: Escherichia coli O157:H7 # 1 181 1 181 181 358 100.0 2e-99 MKTLILFSTRDGQTREIASYLASELKELGIQADVANVHRIEEPQWENYDRVVIGASIRYG HYHSAFQEFVKKHATRLNSMPSAFYSVNLVARKPEKRTPQTNSYARKFLMNSQWRPDRCA VIAGALRYPRYRWYDRFMIKLIMKMSGGETDTRKEVVYTDWEQVANFAREIAHLTDKPTL K Prediction of potential genes in microbial genomes Time: Mon May 16 15:33:22 2011 Seq name: gi|296493341|gb|ADTK01000160.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont386.1, whole genome shotgun sequence Length of sequence - 869 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 15:33:23 2011 Seq name: gi|296493340|gb|ADTK01000161.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont393.1, whole genome shotgun sequence Length of sequence - 1302 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 1 - 990 594 ## COG3328 Transposase and inactivated derivatives Predicted protein(s) >gi|296493340|gb|ADTK01000161.1| GENE 1 1 - 990 594 329 aa, chain + ## HITS:1 COG:YPO0011 KEGG:ns NR:ns ## COG: YPO0011 COG3328 # Protein_GI_number: 16120364 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Yersinia pestis # 16 328 30 350 402 305 48.0 7e-83 QSGQALTGKDGILTPLIKQLTETALAAELDSHLAQDFEVNRKNGSGKKTIKAPTGSFELA TPRDRNGYFEPQLVKKHQTTLSDELERKIIRLFALGMSYQDISREIEDLYAFSVSTATIS AVTDKVIPELKQWQQRPLEKVYPFVWLDAIHYKIREDGCYQSKAVYTVLALNLEGKKEVL GLYLSESEGANFWLSVLSDLQNRGVEDILIACVDGLTGFPEAINSIYPQTEVQLCVIHQI RNSIKYVASKHHKAFMADLKPVYRAVSETALDELEAKRGQQYPVVLQSWRRKRENLSAYF RYPANIRKVIYTTNAIESVHRQFRKLTKT Prediction of potential genes in microbial genomes Time: Mon May 16 15:33:23 2011 Seq name: gi|296493339|gb|ADTK01000162.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont415.1, whole genome shotgun sequence Length of sequence - 1209 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 329 251 ## COG5511 Bacteriophage capsid protein 2 1 Op 2 . - CDS 329 - 541 180 ## ECUMN_1831 conserved hypothetical protein from putative prophage 3 1 Op 3 . 
- CDS 538 - 1209 480 ## COG5525 Bacteriophage tail assembly protein Predicted protein(s) >gi|296493339|gb|ADTK01000162.1| GENE 1 2 - 329 251 109 aa, chain - ## HITS:1 COG:ECs0827 KEGG:ns NR:ns ## COG: ECs0827 COG5511 # Protein_GI_number: 15830081 # Func_class: R General function prediction only # Function: Bacteriophage capsid protein # Organism: Escherichia coli O157:H7 # 1 109 25 133 374 214 100.0 2e-56 MAILDDVIGVFSPGWKAARLRSRAVIQAYEAVKTTRTHKARRENRTADQLSQYGAVSLRE QARYLDNNHDLVIGVFDKLEERVVGKNGIIVEPHPVLRNGAIARDLAAE >gi|296493339|gb|ADTK01000162.1| GENE 2 329 - 541 180 70 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_1831 NR:ns ## KEGG: ECUMN_1831 # Name: not_defined # Def: conserved hypothetical protein from putative prophage # Organism: E.coli_UMN026 # Pathway: not_defined # 1 70 3 72 72 115 100.0 5e-25 MNQNDIEAMIQRYTEAEMAVLDGKSVTFNGQQMTMENLSEIRQGRQEWERRLAALITRRR GHPGYRLARF >gi|296493339|gb|ADTK01000162.1| GENE 3 538 - 1209 480 223 aa, chain - ## HITS:1 COG:ECs0825 KEGG:ns NR:ns ## COG: ECs0825 COG5525 # Protein_GI_number: 15830079 # Func_class: R General function prediction only # Function: Bacteriophage tail assembly protein # Organism: Escherichia coli O157:H7 # 1 223 478 700 700 437 98.0 1e-122 ESWPLASDPSQQMRLMAMAVDSGGEDGVTDNAYKFWRRCRRDGLGKRIYLFKGDSIRRAK LITRTFPDNTGRTGRRAQAAGDVPLWLLQTDALKDRVNNALWRDSPGPGYVHFPDWLGSW FYDELTYEERSRDGKWSKPGRGANEAFDLMVYAEALVILHGYEKIRWPDAPEWASRETWL ECVPDSTEPSPSPEPVSTPVKKQKRKKTVTDDVNPWLTSGGWL Prediction of potential genes in microbial genomes Time: Mon May 16 15:33:27 2011 Seq name: gi|296493338|gb|ADTK01000163.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont431.1, whole genome shotgun sequence Length of sequence - 4596 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 4, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 29 - 78 4.5 1 1 Op 1 . - CDS 167 - 565 98 ## APECO1_508 hypothetical protein 2 1 Op 2 . 
- CDS 606 - 1571 285 ## ECUMN_1852 hypothetical protein 3 1 Op 3 . - CDS 1552 - 2073 184 ## ECUMN_1853 hypothetical protein 4 1 Op 4 . - CDS 2057 - 2281 214 ## ECO103_1708 DNA-binding transcriptional regulator DicC - Prom 2303 - 2362 2.9 + Prom 2221 - 2280 2.5 5 2 Tu 1 . + CDS 2362 - 2769 151 ## COG1396 Predicted transcriptional regulators + Prom 2807 - 2866 4.1 6 3 Op 1 . + CDS 2938 - 3093 179 ## ECO26_2204 hypothetical protein 7 3 Op 2 . + CDS 3095 - 3670 185 ## EC55989_1736 hypothetical protein + Prom 3940 - 3999 3.1 8 4 Op 1 . + CDS 4157 - 4345 122 ## ECSP_2138 putative inhibitor of cell division encoded by cryptic prophage CP-933P 9 4 Op 2 . + CDS 4342 - 4533 149 ## ECO111_1046 hypothetical protein Predicted protein(s) >gi|296493338|gb|ADTK01000163.1| GENE 1 167 - 565 98 132 aa, chain - ## HITS:1 COG:no KEGG:APECO1_508 NR:ns ## KEGG: APECO1_508 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 132 1 132 138 237 97.0 1e-61 MAKVFTQEEREKIKGQVVELVRRSGRETLRQLEAKTGATRYLMSVLARELVASGDVYNSG YGLFPSEQARKDWQNARKKLSRAKAKKTSVVDPDLIWSLPDGEIRRYDRHQNIICCECRK SEVMQRILAFYQ >gi|296493338|gb|ADTK01000163.1| GENE 2 606 - 1571 285 321 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_1852 NR:ns ## KEGG: ECUMN_1852 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 321 19 339 339 628 100.0 1e-179 MSLLFAERPLVINTQLAMKIGLNEAIVLQQLHYWLRDTNSGMECDGVRWIYNTTEQWLEQ FPFWSESTLKRAFASLKTLGLLRCEKLNKSKRDMTNFYTINYGSELLDDGKLSESIGSKC AAPSGQNDTMEEVKMKRSIGSKRPNVIGSKWPDDPTENTTEITTENKNTFRPEASQPDPQ TAEQDFLIRHPGAVVFSAKKRQWGSQEDLACAQWIWGRIVGLYEQAASDDGEIMRPKEPN WTVWANDVRTMRMLDGRSHRQICEMFGRVQRDPFWVKNIMSPSKLREKWDELVIRLGRSP VQRCVNHISEPDTEIPPGFRG >gi|296493338|gb|ADTK01000163.1| GENE 3 1552 - 2073 184 173 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_1853 NR:ns ## KEGG: ECUMN_1853 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 173 1 173 173 330 97.0 9e-90 
MKITPEQVCEALDAWVCRPGMTQEQATILITEAFWTLKERPNIDVQRVIDEGGAVDQRAL GVNRVKIFERWKAIDTRDKREKFTALVPAIMEAIRINDFRLYREISDGKSITYMIAGLNK EYGDVVESGLLFADPAVVDRETDELIEKAIAFKLAYRQQYQQKAGWNYEPSFC >gi|296493338|gb|ADTK01000163.1| GENE 4 2057 - 2281 214 74 aa, chain - ## HITS:1 COG:no KEGG:ECO103_1708 NR:ns ## KEGG: ECO103_1708 # Name: not_defined # Def: DNA-binding transcriptional regulator DicC # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 74 1 75 75 130 93.0 1e-29 MLKVDAITFFGSKTKLANAAGVKLSVAAWGELVPEGRAMRLQEASGGELQYDPKVYDEYR KAKRAGRLNNENHP >gi|296493338|gb|ADTK01000163.1| GENE 5 2362 - 2769 151 135 aa, chain + ## HITS:1 COG:ECs1765 KEGG:ns NR:ns ## COG: ECs1765 COG1396 # Protein_GI_number: 15831019 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 135 1 135 135 155 65.0 1e-38 MDTRTLGQRVLARRKELRLTQREAARLAGVAHVTISQWERDETQPVGKRLFALADALKCS PTWLMFGDEDKAPVPAQELHVETELTPSHKELIELFDALPSSEQEALLSEMRARVENFNK LFEEMLKARKNKSIK >gi|296493338|gb|ADTK01000163.1| GENE 6 2938 - 3093 179 51 aa, chain + ## HITS:1 COG:no KEGG:ECO26_2204 NR:ns ## KEGG: ECO26_2204 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 51 53 103 103 99 100.0 5e-20 MDTIDLGNSESLVCGVFPNQDGTFTAMTYTKSKTFKTENGARRWLERNSGE >gi|296493338|gb|ADTK01000163.1| GENE 7 3095 - 3670 185 191 aa, chain + ## HITS:1 COG:no KEGG:EC55989_1736 NR:ns ## KEGG: EC55989_1736 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 191 15 205 205 375 98.0 1e-103 MDFDTIMEKAYEEYFEDLAEGEEALSFSEFKQALRIRMCSHNDAEHKYEKQNQTAENFVL EPGETLFKIPVTCPICGFTSEELDDSCNNQETTKYVEDDTECARRTIISTSPNSRTNKSH FERVINPLPQTNKKDAGGQKTQWNNGISVKKRRSIKDKQRSANNCKRFIPVINFHVLKRI YWWLRHCLMNP >gi|296493338|gb|ADTK01000163.1| GENE 8 4157 - 4345 122 62 aa, chain + ## HITS:1 COG:no KEGG:ECSP_2138 NR:ns ## KEGG: ECSP_2138 # Name: not_defined # Def: putative inhibitor of cell division encoded by cryptic prophage CP-933P # Organism: 
E.coli_O157_TW14359 # Pathway: not_defined # 1 62 36 97 97 127 98.0 9e-29 MKTLLPNVNTSEGCFEIGVTISNPVFTEDAINKRKHERELLNKICILSMLARLRPIQKGC WQ >gi|296493338|gb|ADTK01000163.1| GENE 9 4342 - 4533 149 63 aa, chain + ## HITS:1 COG:no KEGG:ECO111_1046 NR:ns ## KEGG: ECO111_1046 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 63 4 66 66 120 100.0 2e-26 MNTAFALVLTVFLVSGEPVDIAVSVHRTMQECVTAATEQKIPGNCYPVDKVIHQDNNEIP AGL Prediction of potential genes in microbial genomes Time: Mon May 16 15:33:50 2011 Seq name: gi|296493337|gb|ADTK01000164.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont435.1, whole genome shotgun sequence Length of sequence - 16673 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 8, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 220 - 272 9.1 1 1 Tu 1 . - CDS 283 - 1257 803 ## COG0583 Transcriptional regulator - Prom 1324 - 1383 7.5 - Term 1405 - 1435 2.1 2 2 Tu 1 . - CDS 1467 - 4064 2874 ## COG0550 Topoisomerase IA - Prom 4226 - 4285 3.7 + Prom 4264 - 4323 4.8 3 3 Tu 1 . + CDS 4444 - 4695 403 ## ECIAI1_1292 hypothetical protein + Term 4698 - 4738 6.3 - Term 4686 - 4726 6.6 4 4 Tu 1 . - CDS 4731 - 5780 963 ## COG0616 Periplasmic serine proteases (ClpP class) - Prom 5810 - 5869 3.6 + Prom 5889 - 5948 2.1 5 5 Op 1 5/0.333 + CDS 6000 - 6758 719 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 6 5 Op 2 . + CDS 6755 - 7345 640 ## COG2109 ATP:corrinoid adenosyltransferase + Term 7356 - 7389 2.5 - Term 7344 - 7376 1.5 7 6 Tu 1 4/0.667 - CDS 7385 - 8257 1162 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases - Term 8295 - 8324 1.2 8 7 Op 1 6/0.333 - CDS 8358 - 8978 726 ## COG0009 Putative translation factor (SUA5) 9 7 Op 2 . 
- CDS 8975 - 9856 412 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family) - Prom 9888 - 9947 4.6 + Prom 9908 - 9967 4.0 10 8 Op 1 10/0.000 + CDS 10129 - 11691 1661 ## COG0147 Anthranilate/para-aminobenzoate synthases component I 11 8 Op 2 21/0.000 + CDS 11691 - 13286 1518 ## COG0547 Anthranilate phosphoribosyltransferase 12 8 Op 3 13/0.000 + CDS 13290 - 14648 1185 ## COG0134 Indole-3-glycerol phosphate synthase 13 8 Op 4 37/0.000 + CDS 14660 - 15853 1318 ## COG0133 Tryptophan synthase beta chain 14 8 Op 5 . + CDS 15853 - 16659 391 ## PROTEIN SUPPORTED gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc Predicted protein(s) >gi|296493337|gb|ADTK01000164.1| GENE 1 283 - 1257 803 324 aa, chain - ## HITS:1 COG:ECs1847 KEGG:ns NR:ns ## COG: ECs1847 COG0583 # Protein_GI_number: 15831101 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 324 1 324 324 647 100.0 0 MKLQQLRYIVEVVNHNLNVSSTAEGLYTSQPGISKQVRMLEDELGIQIFSRSGKHLTQVT PAGQEIIRIAREVLSKVDAIKSVAGEHTWPDKGSLYIATTHTQARYALPNVIKGFIERYP RVSLHMHQGSPTQIADAVSKGNADFAIATEALHLYEDLVMLPCYHWNRAIVVTPDHPLAG KKAITIEELAQYPLVTYTFGFTGRSELDTAFNRAGLTPRIVFTATDADVIKTYVRLGLGV GVIASMAVDPVADPDLVRVDAHDIFSHSTTKIGFRRSTFLRSYMYDFIQRFAPHLTRDVV DAAVALRSNEEIEVMFKDIKLPEK >gi|296493337|gb|ADTK01000164.1| GENE 2 1467 - 4064 2874 865 aa, chain - ## HITS:1 COG:topA_1 KEGG:ns NR:ns ## COG: topA_1 COG0550 # Protein_GI_number: 16129235 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Escherichia coli K12 # 1 592 1 592 592 1167 100.0 0 MGKALVIVESPAKAKTINKYLGSDYVVKSSVGHIRDLPTSGSAAKKSADSTSTKTAKKPK KDERGALVNRMGVDPWHNWEAHYEVLPGKEKVVSELKQLAEKADHIYLATDLDREGEAIA WHLREVIGGDDARYSRVVFNEITKNAIRQAFNKPGELNIDRVNAQQARRFMDRVVGYMVS PLLWKKIARGLSAGRVQSVAVRLVVEREREIKAFVPEEFWEVDASTTTPSGEALALQVTH QNDKPFRPVNKEQTQAAVSLLEKARYSVLEREDKPTTSKPGAPFITSTLQQAASTRLGFG VKKTMMMAQRLYEAGYITYMRTDSTNLSQDAVNMVRGYISDNFGKKYLPESPNQYASKEN 
SQEAHEAIRPSDVNVMAESLKDMEADAQKLYQLIWRQFVACQMTPAKYDSTTLTVGAGDF RLKARGRILRFDGWTKVMPALRKGDEDRILPAVNKGDALTLVELTPAQHFTKPPARFSEA SLVKELEKRGIGRPSTYASIISTIQDRGYVRVENRRFYAEKMGEIVTDRLEENFRELMNY DFTAQMENSLDQVANHEAEWKAVLDHFFSDFTQQLDKAEKDPEEGGMRPNQMVLTSIDCP TCGRKMGIRTASTGVFLGCSGYALPPKERCKTTINLVPENEVLNVLEGEDAETNALRAKR RCPKCGTAMDSYLIDPKRKLHVCGNNPTCDGYEIEEGEFRIKGYDGPIVECEKCGSEMHL KMGRFGKYMACTNEECKNTRKILRNGEVAPPKEDPVPLPELPCEKSDAYFVLRDGAAGVF LAANTFPKSRETRAPLVEELYRFRDRLPEKLRYLADAPQQDPEGNKTMVRFSRKTKQQYV SSEKDGKATGWSAFYVDGKWVEGKK >gi|296493337|gb|ADTK01000164.1| GENE 3 4444 - 4695 403 83 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_1292 NR:ns ## KEGG: ECIAI1_1292 # Name: yciN # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 83 1 83 83 158 100.0 7e-38 MNKETQPIDRETLLKEANKIIREHEDTLAGIEATGVTQRNGVLVFTGDYFLDEQGLPTAK STAVFNMFKHLAHVLSEKYHLVD >gi|296493337|gb|ADTK01000164.1| GENE 4 4731 - 5780 963 349 aa, chain - ## HITS:1 COG:sohB KEGG:ns NR:ns ## COG: sohB COG0616 # Protein_GI_number: 16129233 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Escherichia coli K12 # 1 349 1 349 349 593 99.0 1e-169 MELLSEYGLFLAKIVTVVLAIAAIAAIIVNVAQRNKRQRGELRVNNLSEQYKEMKDELAA ALMDSHQQKQWHKAQKKKHKQEAKAAKAKAKLGEVATDSKPRVWVLDFKGSMDAHEVNSL REEITAVLAAFKPQDQVVLRLESPGGMVHGYGLAASQLQRLRDKNIPLTVTVDKVAASGG YMMACVADKIVSAPFAIVGSIGVVAQMPNFNRFLKSKDIDIELHTAGQYKRTLTLLGENT EEGREKFREELNETHQLFKDFVKRMRPSLDIEQVATGEHWYGQQAVEKGLVDEINTSDEV ILSLMEGREVVNVRYMQRKRLIDRFTGSAAESADRLLLRWWQRGQKPLM >gi|296493337|gb|ADTK01000164.1| GENE 5 6000 - 6758 719 252 aa, chain + ## HITS:1 COG:ECs1843 KEGG:ns NR:ns ## COG: ECs1843 COG1028 # Protein_GI_number: 15831097 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related 
to short-chain alcohol dehydrogenases) # Organism: Escherichia coli O157:H7 # 1 252 1 252 252 494 99.0 1e-140 MHYQPKQDLLNDRIILVTGASDGIGREAAMTYARYGATVILLGRNEEKLRQVASHINEET GRQPQWFILDLLTCTSEDCQQLAQRITVNYPRLDGVLHNAGLLGDVCPMSEQNPQVWQDV IQVNVNATFMLTQALLPLLLKSDAGSLVFTSSSVGRQGRANWGAYAASKFATEGMMQVLA DEYQQRLRVNCINPGGTRTAMRASAFPTEDPQKLKTPADIMPLYLWLMGDDSRRKTGMTF DAQPGRKPGISQ >gi|296493337|gb|ADTK01000164.1| GENE 6 6755 - 7345 640 196 aa, chain + ## HITS:1 COG:btuR KEGG:ns NR:ns ## COG: btuR COG2109 # Protein_GI_number: 16129231 # Func_class: H Coenzyme transport and metabolism # Function: ATP:corrinoid adenosyltransferase # Organism: Escherichia coli K12 # 1 196 1 196 196 389 99.0 1e-108 MSDERYQQRQQRVKEKVDARVAQAQDERGIIIVFTGNGKGKTTAAFGTATRAVGHGKKVG VVQFIKGTWPNGERNLLEPHGVEFQVMATGFTWDTQNRESDTAACREVWQHAKRMLADSS LDMVLLDELTYMVAYDYLPLEEVVQALNERPHQQTVIITGRGCHRDILELADTVSELRPI KHAFDAGVKAQIGIDY >gi|296493337|gb|ADTK01000164.1| GENE 7 7385 - 8257 1162 290 aa, chain - ## HITS:1 COG:ECs1841 KEGG:ns NR:ns ## COG: ECs1841 COG1187 # Protein_GI_number: 15831095 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Escherichia coli O157:H7 # 1 290 1 290 290 539 100.0 1e-153 MSEKLQKVLARAGHGSRREIESIIEAGRVSVDGKIAKLGDRVEVTPGLKIRIDGHLISVR ESAEQICRVLAYYKPEGELCTRNDPEGRPTVFDRLPKLRGARWIAVGRLDVNTCGLLLFT TDGELANRLMHPSREVEREYAVRVFGQVDDAKLRDLSRGVQLEDGPAAFKTIKFSGGEGI NQWYNVTLTEGRNREVRRLWEAVGVQVSRLIRVRYGDIPLPKGLPRGGWTELDLAQTNYL RELVELPPETSSKVAVEKDRRRMKANQIRRAVKRHSQVSGGRRSGGRNNG >gi|296493337|gb|ADTK01000164.1| GENE 8 8358 - 8978 726 206 aa, chain - ## HITS:1 COG:ECs1839 KEGG:ns NR:ns ## COG: ECs1839 COG0009 # Protein_GI_number: 15831093 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Escherichia coli O157:H7 # 1 206 13 218 218 419 100.0 1e-117 MSQFFYIHPDNPQQRLINQAVEIVRKGGVIVYPTDSGYALGCKIEDKNAMERICRIRQLP 
DGHNFTLMCRDLSELSTYSFVDNVAFRLMKNNTPGNYTFILKGTKEVPRRLLQEKRKTIG MRVPSNPIAQALLEALGEPMLSTSLMLPGSEFTESDPEEIKDRLEKQVDLIIHGGYLGQK PTTVIDLTDDTPVVVREGVGDVKPFL >gi|296493337|gb|ADTK01000164.1| GENE 9 8975 - 9856 412 293 aa, chain - ## HITS:1 COG:yciV KEGG:ns NR:ns ## COG: yciV COG0613 # Protein_GI_number: 16129227 # Func_class: R General function prediction only # Function: Predicted metal-dependent phosphoesterases (PHP family) # Organism: Escherichia coli K12 # 1 293 1 293 293 582 100.0 1e-166 MSDTNYAVIYDLHSHTTASDGCLTPEALVHRAVEMRVGTLAITDHDTTAAIAPAREEISR SGLALNLIPGVEISTVWENHEIHIVGLNIDITHPLMCEFLAQQTERRNQRAQLIAERLEK AQIPGALEGAQRLAQGGAVTRGHFARFLVECGKASSMADVFKKYLARGKTGYVPPQWCTI EQAIDVIHHSGGKAVLAHPGRYNLSAKWLKRLVAHFAEHHGDAMEVAQCQQSPNERTQLA ALARQHHLWASQGSDFHQPCPWIELGRKLWLPAGVEGVWQLWEQPQNTTEREL >gi|296493337|gb|ADTK01000164.1| GENE 10 10129 - 11691 1661 520 aa, chain + ## HITS:1 COG:ECs1836 KEGG:ns NR:ns ## COG: ECs1836 COG0147 # Protein_GI_number: 15831090 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Escherichia coli O157:H7 # 1 520 1 520 520 1027 98.0 0 MQTQKPTLELLTCEGAYRDNPTALFHQLCGDRPATLLLESADIDSKDDLKSLLLVDSALR ITALGDTVTIQALSGNGEALLTLLDNALPAGVENEQLPNCRVLRFPPVSPLLDEDARLCS LSVFDAFRLLQNLLNVPKEEREAMFFGGLFSYDLVAGFEDLPQLSAENNCPDFCFYLAET LMVIDHQKKSTRIQASLFAPNEEEKQRLTARLNELRQQLTEAAPPLPVVSVPHMRCECNQ SDEEFGGVVRLLQKAIRAGEIFQVVPSRRFSLPCPSPLAAYYVLKKSNPSPYMFFMQDND FTLFGASPESSLKYDATSRQIEIYPIAGTRPRGRRADGSLDRDLDSRIELEMRTDHKELS EHLMLVDLARNDLARICTPGSRYIADLTKVDRYSYVMHLVSRVVGELRHDLDALHAYRAC MNMGTLSGAPKVRAMQLIAEAEGRRRGSYGGAVGYFTAHGDLDTCIVIRSALVENGIATV QAGAGVVLDSIPQSEADETRNKARAVLRAIATAHHAQETF >gi|296493337|gb|ADTK01000164.1| GENE 11 11691 - 13286 1518 531 aa, chain + ## HITS:1 COG:trpD_2 KEGG:ns NR:ns ## COG: trpD_2 COG0547 # Protein_GI_number: 16129224 # Func_class: E Amino acid transport and metabolism # Function: Anthranilate phosphoribosyltransferase # Organism: Escherichia 
coli K12 # 197 531 1 335 335 600 99.0 1e-171 MADILLLDNIDSFTYNLADQLRSNGHNVVIYRNHIPAQTLIERLATMSNPVLMLSPGPGV PSEAGCMPELLTRLRGKLPIIGICLGHQAIVEAYGGYVGQAGEILHGKASSIEHDGQAMF AGLTNPLPVARYHSLVGSNIPAGLTINAHFNGMVMAVRHDADRVCGFQFHPESILTTQGA RLLEQTLAWAQQKLEPANTLQPILEKLYQAQTLSQQESHQLFSAVVRGELKPEQLAAALV SMKIRGEHPNEIAGAATALLENAAPFPRPDYLFADIVGTGGDGSNSINISTASAFVAAAC GLKVAKHGNRSVSSKSGSSDLLAAFGIILDMNADKSRQALDELGVCFLFAPKYHTGFRHA MPVRQQLKTRTLFNVLGPLINPAHPPLALIGVYSPELVLPIAETLRVLGYQRAAVVHSGG MDEVSLHAPTIVAELHDGEIKSYQLTAEDFGLTPYHQEQLAGGTPEENRDILTRLLQGKG DAAHEAAVAANVAMLMRLHGHEDLQANAQTVLEVLRSGSAYDRVTALAARG >gi|296493337|gb|ADTK01000164.1| GENE 12 13290 - 14648 1185 452 aa, chain + ## HITS:1 COG:trpC_1 KEGG:ns NR:ns ## COG: trpC_1 COG0134 # Protein_GI_number: 16129223 # Func_class: E Amino acid transport and metabolism # Function: Indole-3-glycerol phosphate synthase # Organism: Escherichia coli K12 # 1 253 2 254 254 497 98.0 1e-140 MQTVLAKIVADKAIWVETRKQQQPLASFQNEVQPSTRHFYDALQGARTAFILECKKASPS KGVIRDDFDPARIAAIYKHYASAISVLTDEKYFQGSFDFLPIVSQIAPQPILCKDFIIDP YQIYLARYYQADACLLMLSVLDDEQYRQLAAVAHSLEMGVLTEVSNEEELERAIALGAKV VGINNRDLRDLSIDLNRTRELAPKLGHNVTVISESGINTYAQVRELSHFANGFLIGSALM AHDDLHAAVRRVLLGENKVCGLTRGQDAKAAYDAGAIYGGLIFVATSPRCVNVEQAQEVM AAAPLQYVGVFRNHDIADVVDKAKVLSLAAVQLHGNEDQLYIDTLREALPAHVAIWKALS VGETLPARELQHVDKYVLDNGQGGSGQRFDWSLLNGQSLGNVLLAGGLGADNCVEAAQTG CAGLDFNSAVESQPGIKDARLLASVFQTLRAY >gi|296493337|gb|ADTK01000164.1| GENE 13 14660 - 15853 1318 397 aa, chain + ## HITS:1 COG:trpB KEGG:ns NR:ns ## COG: trpB COG0133 # Protein_GI_number: 16129222 # Func_class: E Amino acid transport and metabolism # Function: Tryptophan synthase beta chain # Organism: Escherichia coli K12 # 1 397 1 397 397 796 100.0 0 MTTLLNPYFGEFGGMYVPQILMPALRQLEEAFVSAQKDPEFQAQFNDLLKNYAGRPTALT KCQNITAGTNTTLYLKREDLLHGGAHKTNQVLGQALLAKRMGKTEIIAETGAGQHGVASA LASALLGLKCRIYMGAKDVERQSPNVFRMRLMGAEVIPVHSGSATLKDACNEALRDWSGS YETAHYMLGTAAGPHPYPTIVREFQRMIGEETKAQILEREGRLPDAVIACVGGGSNAIGM FADFINETNVGLIGVEPGGHGIETGEHGAPLKHGRVGIYFGMKAPMMQTEDGQIEESYSI 
SAGLDFPSVGPQHAYLNSTGRADYVSITDDEALEAFKTLCLHEGIIPALESSHALAHALK MMRENPDKEQLLVVNLSGRGDKDIFTVHDILKARGEI >gi|296493337|gb|ADTK01000164.1| GENE 14 15853 - 16659 391 268 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc [Roseobacter sp. AzwK-3b] # 1 249 1 248 263 155 35 2e-37 MERYESLFAQLKERKEGAFVPFVTLGDPGIEQSLKIIDTLIEAGADALELGIPFSDPLAD GPTIQNATLRAFAAGVTPAQCFEMLALIRQKHPTIPIGLLMYANLVFNKGIDEFYAQCEK VGVDSVLVADVPIEESAPFRQAALRHNVAPIFICPPNADDDLLRQIASYGRGYTYLLSRA GVTGAENRAALPLNHLVAKLKEYNAAPPLQGFGISAPDQVKAAIDAGAAGAISGSAIVKI IEQHINEPEKMLAALKVFVQPMKAATRS Prediction of potential genes in microbial genomes Time: Mon May 16 15:33:59 2011 Seq name: gi|296493336|gb|ADTK01000165.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont435.2, whole genome shotgun sequence Length of sequence - 19683 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 12, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 161 - 220 4.8 1 1 Op 1 3/0.400 + CDS 359 - 538 222 ## COG3729 General stress protein + Term 579 - 618 6.2 2 1 Op 2 4/0.400 + CDS 624 - 1124 615 ## COG3685 Uncharacterized protein conserved in bacteria 3 1 Op 3 . + CDS 1170 - 1676 611 ## COG3685 Uncharacterized protein conserved in bacteria + Term 1700 - 1734 6.0 4 2 Tu 1 . - CDS 1736 - 2290 666 ## COG3047 Outer membrane protein W - Prom 2403 - 2462 7.0 + Prom 2422 - 2481 7.0 5 3 Op 1 . + CDS 2731 - 3474 565 ## ECP_1303 hypothetical protein 6 3 Op 2 9/0.000 + CDS 3504 - 4043 543 ## COG2917 Intracellular septation protein A + Term 4045 - 4086 5.5 + Prom 4060 - 4119 2.1 7 3 Op 3 . + CDS 4148 - 4546 359 ## COG1607 Acyl-CoA hydrolase + Term 4549 - 4590 11.3 - Term 4537 - 4578 11.3 8 4 Tu 1 . - CDS 4586 - 5305 566 ## COG0810 Periplasmic protein TonB, links inner and outer membranes - Prom 5337 - 5396 5.6 + Prom 5265 - 5324 2.3 9 5 Tu 1 . 
+ CDS 5433 - 5825 287 ## COG2350 Uncharacterized protein conserved in bacteria + Term 5834 - 5890 5.1 + Prom 6040 - 6099 4.8 10 6 Tu 1 . + CDS 6245 - 7378 744 ## COG1226 Kef-type K+ transport systems, predicted NAD-binding component + Term 7403 - 7443 8.1 - Term 7391 - 7429 3.1 11 7 Tu 1 . - CDS 7433 - 7606 79 ## EC55989_1347 hypothetical protein - Prom 7631 - 7690 6.2 + Prom 7662 - 7721 7.8 12 8 Op 1 1/1.000 + CDS 7749 - 9209 1172 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes 13 8 Op 2 . + CDS 9244 - 9573 456 ## COG3099 Uncharacterized protein conserved in bacteria + Term 9597 - 9628 4.1 - Term 9585 - 9616 4.1 14 9 Op 1 44/0.000 - CDS 9626 - 10630 853 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 15 9 Op 2 44/0.000 - CDS 10627 - 11640 577 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 16 9 Op 3 49/0.000 - CDS 11652 - 12557 882 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 17 9 Op 4 21/0.000 - CDS 12575 - 13495 763 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components - Term 13525 - 13577 11.1 18 9 Op 5 . - CDS 13581 - 15212 1751 ## COG4166 ABC-type oligopeptide transport system, periplasmic component - Prom 15343 - 15402 4.2 + Prom 15499 - 15558 2.4 19 10 Tu 1 . + CDS 15631 - 15792 115 ## ECS88_1310 hypothetical protein + Term 15915 - 15944 0.4 - Term 15898 - 15936 4.6 20 11 Tu 1 . - CDS 15949 - 16596 463 ## COG2095 Multiple antibiotic transporter - Prom 16784 - 16843 6.9 21 12 Tu 1 . 
+ CDS 17073 - 19683 2620 ## COG1454 Alcohol dehydrogenase, class IV Predicted protein(s) >gi|296493336|gb|ADTK01000165.1| GENE 1 359 - 538 222 59 aa, chain + ## HITS:1 COG:STM1728 KEGG:ns NR:ns ## COG: STM1728 COG3729 # Protein_GI_number: 16765072 # Func_class: R General function prediction only # Function: General stress protein # Organism: Salmonella typhimurium LT2 # 1 55 1 55 60 58 89.0 4e-09 MAEHRGGSGNFAEDREKASDAGRKGGQHSGGNFKNDPQRASEAGKKGGQQSGGNKSGKS >gi|296493336|gb|ADTK01000165.1| GENE 2 624 - 1124 615 166 aa, chain + ## HITS:1 COG:ECs1830 KEGG:ns NR:ns ## COG: ECs1830 COG3685 # Protein_GI_number: 15831084 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 166 1 166 166 234 100.0 7e-62 MNMKTIEDVFIHLLSDTYSAEKQLTRALAKLARATSNEKLSQAFHAHLEETHGQIERIDQ VVESESNLKIKRMKCVAMEGLIEEANEVIESTEKNEVRDAALIAAAQKVEHYEIASYGTL ATLAEQLGYRKAAKLLKETLEEEKATDIKLTDLALNNVNKKAENKA >gi|296493336|gb|ADTK01000165.1| GENE 3 1170 - 1676 611 168 aa, chain + ## HITS:1 COG:ECs1829 KEGG:ns NR:ns ## COG: ECs1829 COG3685 # Protein_GI_number: 15831083 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 168 1 168 168 308 99.0 3e-84 MNRIEHYHDWLRDAHAMEKQAESMLESMASRIDNYPELRARIEQHLSETKNQIVQLETIL DRNDISRSVIKDSMSKMAALGQSIGGIFPSDEIVKGSISGYVFEQFEIACYTSLLAAAKN AGDTASIPTIEAILNEEKQMADWLIQHIPQTTEKFLIRSETDGVEAKK >gi|296493336|gb|ADTK01000165.1| GENE 4 1736 - 2290 666 184 aa, chain - ## HITS:1 COG:ompW KEGG:ns NR:ns ## COG: ompW COG3047 # Protein_GI_number: 16129217 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein W # Organism: Escherichia coli K12 # 1 184 29 212 212 358 100.0 3e-99 MRAGSATVRPTEGAGGTLGSLGGFSVTNNTQLGLTFTYMATDNIGVELLAATPFRHKIGT RATGDIATVHHLPPTLMAQWYFGDASSKFRPYVGAGINYTTFFDNGFNDHGKEAGLSDLS LKDSWGAAGQVGVDYLINRDWLVNMSVWYMDIDTTANYKLGGAQQHDSVRLDPWVFMFSA GYRF >gi|296493336|gb|ADTK01000165.1| GENE 5 2731 - 
3474 565 247 aa, chain + ## HITS:1 COG:no KEGG:ECP_1303 NR:ns ## KEGG: ECP_1303 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_536 # Pathway: not_defined # 1 247 1 247 247 352 99.0 8e-96 MSITAQSVYRDTGNFFRNQFMTILLVSLLCAFITVVLGHVFSPSDAQLAQLNDGVPVSGS SGLFDLVQNMSPEQQQILLQASAASTFSGLIGNAILAGGVILIIQLVSAGQRVSALRAIG ASAPILPKLFILIFLTTLLVQIGIMLVVVPGIIMAILLALAPVMLVQDKMGIFASMRSSM RLTWANMRLVAPAVLSWLLAKTLLLLFASSFAALTPEIGAVLANTLSNLISAILLIYLFR LYMLIRQ >gi|296493336|gb|ADTK01000165.1| GENE 6 3504 - 4043 543 179 aa, chain + ## HITS:1 COG:ECs1754 KEGG:ns NR:ns ## COG: ECs1754 COG2917 # Protein_GI_number: 15831008 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Intracellular septation protein A # Organism: Escherichia coli O157:H7 # 1 179 1 179 179 291 100.0 6e-79 MKQFLDFLPLVVFFAFYKIYDIYAATAALIVATAIVLIYSWVRFRKVEKMALITFVLVVV FGGLTLFFHNDEFIKWKVTVIYALFAGALLVSQWVMKKPLIQRMLGKELTLPQPVWSKLN LAWAVFFILCGLANIYIAFWLPQNIWVNFKVFGLTALTLIFTLLSGIYIYRHMPQEDKS >gi|296493336|gb|ADTK01000165.1| GENE 7 4148 - 4546 359 132 aa, chain + ## HITS:1 COG:yciA KEGG:ns NR:ns ## COG: yciA COG1607 # Protein_GI_number: 16129214 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA hydrolase # Organism: Escherichia coli K12 # 1 132 1 132 132 264 100.0 3e-71 MSTTHNVPQGDLVLRTLAMPADTNANGDIFGGWLMSQMDIGGAILAKEIAHGRVVTVRVE GMTFLRPVAVGDVVCCYARCVQKGTTSVSINIEVWVKKVASEPIGQRYKATEALFKYVAV DPEGKPRALPVE >gi|296493336|gb|ADTK01000165.1| GENE 8 4586 - 5305 566 239 aa, chain - ## HITS:1 COG:tonB KEGG:ns NR:ns ## COG: tonB COG0810 # Protein_GI_number: 16129213 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein TonB, links inner and outer membranes # Organism: Escherichia coli K12 # 1 239 1 239 239 283 99.0 2e-76 MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVAPADLEPPQA VQPPPEPVVEPEPEPEPIPEPPKEAPAVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR PASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF 
DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ >gi|296493336|gb|ADTK01000165.1| GENE 9 5433 - 5825 287 130 aa, chain + ## HITS:1 COG:ECs1751 KEGG:ns NR:ns ## COG: ECs1751 COG2350 # Protein_GI_number: 15831005 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 130 1 130 130 253 100.0 4e-68 MHGLNKDIIFPLQIFALSNNVLYNFPEQGVVPVLYVIYAQDKADSLEKRLSVRPAHLARL QLLHDEGRLLTAGPMPAVDSNDPGAAGFTGSTVIAEFESLEAAQAWADADPYVAAGVYEH VSVKPFKKVF >gi|296493336|gb|ADTK01000165.1| GENE 10 6245 - 7378 744 377 aa, chain + ## HITS:1 COG:ECs1750 KEGG:ns NR:ns ## COG: ECs1750 COG1226 # Protein_GI_number: 15831004 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, predicted NAD-binding component # Organism: Escherichia coli O157:H7 # 1 377 41 417 417 684 100.0 0 MSVNLLDIFHIKAFSELDLSLLANAPLFMLGVFLVLNSIGLLFRAKLAWAISIILLLIAL IYTLHFYPWLKFSIGFCIFTLVFLLILRKDFSHSSAAAGTIFAFISFTTLLFYSTYGALY LSEGFNPRIESLMTAFYFSIETMSTVGYGDIVPVSESARLFTISVIISGITVFATSMTSI FGPLIRGGFNKLVKGNNHTMHRKDHFIVCGHSILAINTILQLNQRGQNVTVISNLPEDDI KQLEQRLGDNADVIPGDSNDSSVLKKAGIDRCRAILALSDNDADNAFVVLSAKDMSSDVK TVLAVSDSKNLNKIKMVHPDIILSPQLFGSEILARVLNGEEINNDMLVSMLLNSGHGIFS DNDEQETKADSKESAQK >gi|296493336|gb|ADTK01000165.1| GENE 11 7433 - 7606 79 57 aa, chain - ## HITS:1 COG:no KEGG:EC55989_1347 NR:ns ## KEGG: EC55989_1347 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 57 38 94 94 83 100.0 3e-15 MKRSRTEVGRWRMQRQASRRKSRWLEGQSRRNMRIHSIRKCILNKQRNSLLFAIYNI >gi|296493336|gb|ADTK01000165.1| GENE 12 7749 - 9209 1172 486 aa, chain + ## HITS:1 COG:cls KEGG:ns NR:ns ## COG: cls COG1502 # Protein_GI_number: 16129210 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes # Organism: Escherichia coli K12 # 1 486 1 486 486 978 100.0 0 
MTTVYTLVSWLAILGYWLLIAGVTLRILMKRRAVPSAMAWLLIIYILPLVGIIAYLAVGE LHLGKRRAERARAMWPSTAKWLNDLKACKHIFAEENSSVAAPLFKLCERRQGIAGVKGNQ LQLMTESDDVMQALIRDIQLARHNIEMVFYIWQPGGMADQVAESLMAAARRGIHCRLMLD SAGSVAFFRSPWPELMRNAGIEVVEALKVNLMRVFLRRMDLRQHRKMIMIDNYIAYTGSM NMVDPRYFKQDAGVGQWIDLMARMEGPIATAMGIIYSCDWEIETGKRILPPPPDVNIMPF EQASGHTIHTIASGPGFPEDLIHQALLTAAYSAREYLIMTTPYFVPSDDLLHAICTAAQR GVDVSIILPRKNDSMLVGWASRAFFTELLAAGVKIYQFEGGLLHTKSVLVDGELSLVGTV NLDMRSLWLNFEITLAIDDKGFGADLAAVQDDYISRSRLLDARLWLKRPLWQRVAERLFY FFSPLL >gi|296493336|gb|ADTK01000165.1| GENE 13 9244 - 9573 456 109 aa, chain + ## HITS:1 COG:yciU KEGG:ns NR:ns ## COG: yciU COG3099 # Protein_GI_number: 16129209 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 109 27 135 135 201 100.0 2e-52 MDMDLNNRLTEDETLEQAYDIFLELAADNLDPADVLLFNLQFEERGGAELFDPAEDWQEH VDFDLNPDFFAEVVIGLADSEDGEINDVFARILLCREKDHKLCHIIWRE >gi|296493336|gb|ADTK01000165.1| GENE 14 9626 - 10630 853 334 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 329 1 325 329 333 51 7e-91 MNAVTEGRKVLLEIADLKVHFEIKDGKQWFWQPPKTLKAVDGVTLRLYEGETLGVVGESG CGKSTFARAIIGLVKATDGHVAWLGKELLGMKPDEWRAVRSDIQMIFQDPLASLNPRMTI GEIIAEPLRTYHPKMSRQEVRERVKAMMLKVGLLPNLINRYPHEFSGGQCQRIGIARALI LEPKLIICDEPVSALDVSIQAQVVNLLQQLQREMGLSLIFIAHDLAVVKHISDRVLVMYL GHAVELGTYDEVYHNPLHPYTRALMSAVPIPDPDLEKNKTIQLLEGELPSPINPPSGCVF RTRCPIAGPECAKTRPVLEGSFRHAVSCLKVDPL >gi|296493336|gb|ADTK01000165.1| GENE 15 10627 - 11640 577 337 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 12 317 4 311 329 226 40 7e-59 MSVIETATVPLAQQQADALLNVKDLRVTFSTPDGDVTAVNDLNFSLRAGETLGIVGESGS GKSQTAFALMGLLAANGRIGGSATFNGREILNLPEHELNKLRAEQISMIFQDPMTSLNPY MRVGEQLMEVLMLHKNMSKAEAFEESVRMLDAVKMPAARKRMKMYPHEFSGGMRQRVMIA MALLCRPKLLIADEPTTALDVTVQAQIMTLLNELKREFNTAIIMITHDLGVVAGICDKVL 
VMYAGRTMEYGNARDVFYQPVHPYSIGLLNAVPRLDAEGETMLTIPGNPPNLLRLPKGCP FQPRCPHAMEICSSAPPLEEFTPGRLRACFKPVEELL >gi|296493336|gb|ADTK01000165.1| GENE 16 11652 - 12557 882 301 aa, chain - ## HITS:1 COG:STM1744 KEGG:ns NR:ns ## COG: STM1744 COG1173 # Protein_GI_number: 16765088 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Salmonella typhimurium LT2 # 1 301 2 302 302 509 96.0 1e-144 MLSKKNSETLENFSEKLEVEGRSLWQDARRRFMHNRAAVASLIVLVLIALFVILAPMLSQ FAYDDTDWAMMSSAPDMESGHYFGTDSSGRDLLVRVAIGGRISLMVGVAAALVAVVVGTL YGSLSGYLGGKVDSVMMRLLEILNSFPFMFFVILLVTFFGQNILLIFVAIGMVSWLDMAR IVRGQTLSLKRKEFIEAAQVGGVSTSGIVIRHIVPNVLGVVVVYASLLVPSMILFESFLS FLGLGTQEPLSSWGALLSDGANSMEVSPWLLLFPAGFLVVTLFCFNFIGDGLRDALDPKD R >gi|296493336|gb|ADTK01000165.1| GENE 17 12575 - 13495 763 306 aa, chain - ## HITS:1 COG:ECs1744 KEGG:ns NR:ns ## COG: ECs1744 COG0601 # Protein_GI_number: 15830998 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Escherichia coli O157:H7 # 1 306 1 306 306 549 100.0 1e-156 MLKFILRRCLEAIPTLFILITISFFMMRLAPGSPFTGERTLPPEVMANIEAKYHLNDPIM TQYFSYLKQLAHGDFGPSFKYKDYSVNDLVASSFPVSAKLGAAAFFLAVILGVSAGVIAA LKQNTKWDYTVMGLAMTGVVIPSFVVAPLLVMIFAIILHWLPGGGWNGGALKFMILPMVA LSLAYIASIARITRGSMIEVLHSNFIRTARAKGLPMRRIILRHALKPALLPVLSYMGPAF VGIITGSMVIETIYGLPGIGQLFVNGALNRDYSLVLSLTILVGALTILFNAIVDVLYAVI DPKIRY >gi|296493336|gb|ADTK01000165.1| GENE 18 13581 - 15212 1751 543 aa, chain - ## HITS:1 COG:ECs1743 KEGG:ns NR:ns ## COG: ECs1743 COG4166 # Protein_GI_number: 15830997 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 1 543 1 543 543 1082 100.0 0 MTNITKRSLVAAGVLAALMAGNVALAADVPAGVTLAEKQTLVRNNGSEVQSLDPHKIEGV 
PESNISRDLFEGLLVSDLDGHPAPGVAESWDNKDAKVWTFHLRKDAKWSDGTPVTAQDFV YSWQRSVDPNTASPYASYLQYGHIAGIDEILEGKKPITDLGVKAIDDHTLEVTLSEPVPY FYKLLVHPSTSPVPKAAIEKFGEKWTQPGNIVTNGAYTLKDWVVNERIVLERSPTYWNNA KTVINQVTYLPIASEVTDVNRYRSGEIDMTYNNMPIELFQKLKKEIPDEVHVDPYLCTYY YEINNQKPPFNDVRVRTALKLGMDRDIIVNKVKAQGDMPAYGYTPPYTDGAKLTQPEWFG WSQEKRNEEAKKLLAEAGYTADKPLTINLLYNTSDLHKKLAIAASSLWKKNIGVNVKLVN QEWKTFLDTRHQGTFDVARAGWCADYNEPTSFLNTMLSNSSMNTAHYKSPAFDSIMAETL KVTDEAQRTALYTKAEQQLDKDSAIVPVYYYVNARLVKPWVGGYTGKDPLDNTYTRNMYI VKH >gi|296493336|gb|ADTK01000165.1| GENE 19 15631 - 15792 115 53 aa, chain + ## HITS:1 COG:no KEGG:ECS88_1310 NR:ns ## KEGG: ECS88_1310 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 12 53 63 104 104 76 100.0 2e-13 MSIFSDYSSSSEMHNNLTIDYYLALSSTKGSGITNIISIILQQAQDYDVAKIT >gi|296493336|gb|ADTK01000165.1| GENE 20 15949 - 16596 463 215 aa, chain - ## HITS:1 COG:ychE KEGG:ns NR:ns ## COG: ychE COG2095 # Protein_GI_number: 16129203 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Escherichia coli K12 # 1 215 1 215 215 377 100.0 1e-105 MIQTFFDFPVYFKFFIGLFALVNPVGIIPVFISMTSYQTAAARNKTNLTANLSVAIILWI SLFLGDTILQLFGISIDSFRIAGGILVVTIAMSMISGKLGEDKQNKQEKSETAVRESIGV VPLALPLMAGPGAISSTIVWGTRYHSISYLFGFFVAIALFALCCWGLFRMAPWLVRVLRQ TGINVITRIMGLLLMALGIEFIVTGIKGIFPGLLN >gi|296493336|gb|ADTK01000165.1| GENE 21 17073 - 19683 2620 870 aa, chain + ## HITS:1 COG:ECs1741_2 KEGG:ns NR:ns ## COG: ECs1741_2 COG1454 # Protein_GI_number: 15830995 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Escherichia coli O157:H7 # 448 870 1 423 444 858 100.0 0 MAVTNVAELNALVERVKKAQREYASFTQEQVDKIFRAAALAAADARIPLAKMAVAESGMG IVEDKVIKNHFASEYIYNAYKDEKTCGVLSEDDTFGTITIAEPIGIICGIVPTTNPTSTA IFKSLISLKTRNAIIFSPHPRAKDATNKAADIVLQAAIAAGAPKDLIGWIDQPSVELSNA LMHHPDINLILATGGPGMVKAAYSSGKPAIGVGAGNTPVVIDETADIKRAVASVLMSKTF 
DNGVICASEQSVVVVDSVYDAVRERFATHGGYLLQGKELKAVQDVILKNGALNAAIVGQP AYKIAELAGFSVPENTKILIGEVTVVDESEPFAHEKLSPTLAMYRAKDFEDAVEKAEKLV AMGGIGHTSCLYTDQDNQPARVSYFGQKMKTARILINTPASQGGIGDLYNFKLAPSLTLG CGSWGGNSISENVGPKHLINKKTVAKRAENMLWHKLPKSIYFRRGSLPIALDEVITDGHK RALIVTDRFLFNNGYADQITSVLKAAGVETEVFFEVEADPTLSIVRKGAELANSFKPDVI IALGGGSPMDAAKIMWVMYEHPETHFEELALRFMDIRKRIYKFPKMGVKAKMIAVTTTSG TGSEVTPFAVVTDDATGQKYPLADYALTPDMAIVDANLVMDMPKSLCAFGGLDAVTHAME AYVSVLASEFSDGQALQALKLLKEYLPASYHEGSKNPVARERVHSAATIAGIAFANAFLG VCHSMAHKLGSQFHIPHGLANALLICNVIRYNANDNPTKQTAFSQYDRPQARRRYAEIAD HLGLSAPGDRTAAKIEKLLAWLETLKAELGIPKSIREAGVQEADFLANVDKLSEDAFDDQ CTGANPRYPLISELKQILLDTYYGRDYVEG Prediction of potential genes in microbial genomes Time: Mon May 16 15:34:07 2011 Seq name: gi|296493335|gb|ADTK01000166.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont435.3, whole genome shotgun sequence Length of sequence - 1932 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 1396 - 1455 4.5 2 2 Tu 1 . 
+ CDS 1491 - 1904 554 ## COG2916 DNA-binding protein H-NS Predicted protein(s) >gi|296493335|gb|ADTK01000166.1| GENE 1 270 - 887 269 205 aa, chain - ## HITS:1 COG:ECs1740 KEGG:ns NR:ns ## COG: ECs1740 COG1435 # Protein_GI_number: 15830994 # Func_class: F Nucleotide transport and metabolism # Function: Thymidine kinase # Organism: Escherichia coli O157:H7 # 1 205 1 205 205 404 99.0 1e-113 MAQLYFYYSAMNAGKSTALLQSSYNYQERGMRTVVYTAEIDDRFGAGKVSSRIGLSSPAK LFNQNSSLFAEIRAEHEQQAIHCVLVDECQFLTRQQVYELSEVVDQLDIPVLCYGLRTDF RGELFIGSQYLLAWSDKLVELKTICFCGRKASMVLRLDQAGRPYNEGEQVVIGGNERYVS VCRKHYKEALQVGSLTAIQERHRHD >gi|296493335|gb|ADTK01000166.1| GENE 2 1491 - 1904 554 137 aa, chain + ## HITS:1 COG:ECs1739 KEGG:ns NR:ns ## COG: ECs1739 COG2916 # Protein_GI_number: 15830993 # Func_class: R General function prediction only # Function: DNA-binding protein H-NS # Organism: Escherichia coli O157:H7 # 1 137 1 137 137 197 100.0 4e-51 MSEALKILNNIRTLRAQARECTLETLEEMLEKLEVVVNERREEESAAAAEVEERTRKLQQ YREMLIADGIDPNELLNSLAAVKSGTKAKRAQRPAKYSYVDENGETKTWTGQGRTPAVIK KAMDEQGKSLDDFLIKQ Prediction of potential genes in microbial genomes Time: Mon May 16 15:34:10 2011 Seq name: gi|296493334|gb|ADTK01000167.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont435.4, whole genome shotgun sequence Length of sequence - 5067 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 46 - 87 8.8 1 1 Op 1 5/0.000 - CDS 110 - 1018 990 ## COG1210 UDP-glucose pyrophosphorylase - Prom 1155 - 1214 4.9 - Term 1142 - 1176 -0.4 2 1 Op 2 4/1.000 - CDS 1220 - 2233 581 ## COG0784 FOG: CheY-like receiver - Term 2252 - 2281 -0.3 3 1 Op 3 . - CDS 2325 - 3230 398 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily - Prom 3480 - 3539 3.2 + Prom 3239 - 3298 3.8 4 2 Op 1 4/1.000 + CDS 3343 - 3801 200 ## PROTEIN SUPPORTED gi|90021194|ref|YP_527021.1| ribosomal protein L20 5 2 Op 2 . 
+ CDS 3851 - 4693 860 ## COG0788 Formyltetrahydrofolate hydrolase + TRNA 4853 - 4937 66.9 # Tyr GTA 0 0 + TRNA 4972 - 5056 66.9 # Tyr GTA 0 0 Predicted protein(s) >gi|296493334|gb|ADTK01000167.1| GENE 1 110 - 1018 990 302 aa, chain - ## HITS:1 COG:ECs1738 KEGG:ns NR:ns ## COG: ECs1738 COG1210 # Protein_GI_number: 15830992 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose pyrophosphorylase # Organism: Escherichia coli O157:H7 # 1 302 1 302 302 589 100.0 1e-168 MAAINTKVKKAVIPVAGLGTRMLPATKAIPKEMLPLVDKPLIQYVVNECIAAGITEIVLV THSSKNSIENHFDTSFELEAMLEKRVKRQLLDEVQSICPPHVTIMQVRQGLAKGLGHAVL CAHPVVGDEPVAVILPDVILDEYESDLSQDNLAEMIRRFDETGHSQIMVEPVADVTAYGV VDCKGVELAPGESVPMVGVVEKPKADVAPSNLAIVGRYVLSADIWPLLAKTPPGAGDEIQ LTDAIDMLIEKETVEAYHMKGKSHDCGNKLGYMQAFVEYGIRHNTLGTEFKAWLEEEMGI KK >gi|296493334|gb|ADTK01000167.1| GENE 2 1220 - 2233 581 337 aa, chain - ## HITS:1 COG:STM1753_1 KEGG:ns NR:ns ## COG: STM1753_1 COG0784 # Protein_GI_number: 16765097 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Salmonella typhimurium LT2 # 1 134 1 134 134 248 91.0 2e-65 MTQPLVGKQILIVEDEQVFRSLLDSWFSSLGATTVLAADGVDALELLGGFTPDLMICDIA MPRMNGLKLLEHIRNRGDQTPVLVISATENMADIAKALRLGVEDVLLKPVKDLNRLREMV FACLYPSMFNSRVEEEERLFRDWDAMVDNPAAAAKLLQELQPPVQQVISHCRVNYRQLVA ADKPGLVLDIAALSENDLAFYCLDVTRAGHNGVLAALLLRALFNGLLQEQLAHQNQRLPE LGALLKQVNHLLRQANLPGQFPLLVGYYHRELKNLILVSAGLNATLNTGEHQVQISNGVP LGTLGNAYLNQLSQRCDAWQCQIWGTGGRLRLMLSAE >gi|296493334|gb|ADTK01000167.1| GENE 3 2325 - 3230 398 301 aa, chain - ## HITS:1 COG:ECs1736 KEGG:ns NR:ns ## COG: ECs1736 COG1752 # Protein_GI_number: 15830990 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Escherichia coli O157:H7 # 1 301 14 314 314 610 100.0 1e-174 MRKIKIGLALGSGAARGWSHIGVINALKKVGIEIDIVAGCSIGSLVGAAYACDRLSALED WVTSFSYWDVLRLMDLSWQRGGLLRGERVFNQYREIMPETEIENCSRRFAAVATNLSTGR 
ELWFTEGDLHLAIRASCSIPGLMAPVAHNGYWLVDGAVVNPIPISLTRALGADIVIAVDL QHDAHLMQQDLLSFNVSEENSENGDSLPWHARLKERLGSITTRRAVTAPTATEIMTTSIQ VLENRLKRNRMAGDPPDILIQPVCPQISTLDFHRAHAAIAAGQLAVEKKMDELLPLVRTN I >gi|296493334|gb|ADTK01000167.1| GENE 4 3343 - 3801 200 152 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90021194|ref|YP_527021.1| ribosomal protein L20 [Saccharophagus degradans 2-40] # 4 127 3 131 134 81 37 1e-15 MSQLCPCGSAVEYSLCCHPYVSGEKVAPDPEHLMRSRYCAFVMQDADYLIKTWHPSCGAA ALRAELMAGFAHTEWLGLTVFEHCWQDADNIGFVSFVARFTEGGKTGAIIERSRFLKENG QWYYIDGTRPQFGRNDPCPCGSGKKFKKCCGQ >gi|296493334|gb|ADTK01000167.1| GENE 5 3851 - 4693 860 280 aa, chain + ## HITS:1 COG:purU KEGG:ns NR:ns ## COG: purU COG0788 # Protein_GI_number: 16129193 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate hydrolase # Organism: Escherichia coli K12 # 1 280 1 280 280 563 99.0 1e-161 MHSLQRKVLRTICPDQKGLIARITNICYKHELNIVQNNEFVDHRTGRFFMRTELEGIFND STLLADLDSALPEGSVRELNPAGRRRIVILVTKEAHCLGDLLMKANYGGLDVEIAAVIGN HDTLRSLVERFDIPFELVSHEGLSRNEHDQKMADAIDAYQPDYVVLAKYMRVLTPEFVAR FPNKIINIHHSFLPAFIGARPYHQAYERGVKIIGATAHYVNDNLDEGPIIMQDVIHVDHT YTAEDMMRAGRDVEKNVLSRALYKVLAQRVFVYGNRTIIL Prediction of potential genes in microbial genomes Time: Mon May 16 15:34:10 2011 Seq name: gi|296493333|gb|ADTK01000168.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont439.1, whole genome shotgun sequence Length of sequence - 639 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 579 396 ## ECUMN_1861 putative exonuclease from phage origin Predicted protein(s) >gi|296493333|gb|ADTK01000168.1| GENE 1 3 - 579 396 192 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_1861 NR:ns ## KEGG: ECUMN_1861 # Name: not_defined # Def: putative exonuclease from phage origin # Organism: E.coli_UMN026 # Pathway: not_defined # 1 192 1 192 823 389 100.0 1e-107 MSKVFICAAIPDELATREEGAVAVATAIEAGDERRARAKFHWQFLEHYPAAQDCAYKFIV CEDKPGIPRPALDSWDAEYMQENRWDEESASFVPVETESDPMNVTFDKLAPEVQNAVMVK FDTCENITVDMVISAQELLQEDMATFDGHIVEALMKMPEVNAMYPELKLHAIGWVKHKCI PGAKWPEIQAEM Prediction of potential genes in microbial genomes Time: Mon May 16 15:34:14 2011 Seq name: gi|296493332|gb|ADTK01000169.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont444.1, whole genome shotgun sequence Length of sequence - 2640 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 26 - 163 79 ## SbBS512_E0872 putative tail fiber assembly protein homolog 2 1 Op 2 . - CDS 163 - 1869 599 ## COG5301 Phage-related tail fibre protein 3 2 Tu 1 . 
+ CDS 1670 - 2639 367 ## ECIAI1_1567 conserved hypothetical protein; putative exported protein Predicted protein(s) >gi|296493332|gb|ADTK01000169.1| GENE 1 26 - 163 79 45 aa, chain - ## HITS:1 COG:no KEGG:SbBS512_E0872 NR:ns ## KEGG: SbBS512_E0872 # Name: not_defined # Def: putative tail fiber assembly protein homolog # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 1 45 1 45 190 96 97.0 3e-19 MAFRMSEQARTIKIYNLLAGTNEFIGEGDAYIPPHTGLPANSTDM >gi|296493332|gb|ADTK01000169.1| GENE 2 163 - 1869 599 568 aa, chain - ## HITS:1 COG:STM4200 KEGG:ns NR:ns ## COG: STM4200 COG5301 # Protein_GI_number: 16767450 # Func_class: R General function prediction only # Function: Phage-related tail fibre protein # Organism: Salmonella typhimurium LT2 # 5 79 167 248 581 65 54.0 3e-10 MALEDASTTKKGIVQLSSATNSTSESLAATPKAVKAAYDLANGKYTAQDATTAQKGIIQL SSATNSTSETLAATPKAVKTAYDNAEKRLQKDQNGADIPGKDTFTKNIGACRAFGGSVST ITGNWTTAQFIEWLDSQGAFNHPYWMCKGSWSYGNNKIITDTDCGNIHLAGAVIEVMGIK SAMTIRITTPTTSTGGGTTNAQFTYINHGTDYSPGWRRDYNSRNKPTASEIGALPSGGTA VSSVNLSSKGRVTALTDNTQGATGLELYEVYNNGYPTAYGNIIHLKGMTAVGEGELLIGW SGTSGAHAPAFIRSRRDTTDANWSPWAQLYTSAHPPAEFYPVGAPIPWPSDTVPSGYALM QGQTFDKSAYPKLAAAYPSGVIPDMRGWTIKGKPASGRAVLSQEQDGIKSHTHSASASST DLGTKTTSSFDYGTKSTNNTGAHTHSLSGSTNAAGNHSHRDGRRFNPSVFKDTYQYGYTS SGQNTWGVQGSVGMSTGWLANTSTDGNHSHSLSGTAVSAGAHAHTVGIGAHTHSVAIGSH GHTITVNAAGNAENTVKNIAFNYIVRLA >gi|296493332|gb|ADTK01000169.1| GENE 3 1670 - 2639 367 323 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_1567 NR:ns ## KEGG: ECIAI1_1567 # Name: not_defined # Def: conserved hypothetical protein; putative exported protein # Organism: E.coli_IAI1 # Pathway: not_defined # 22 316 1 295 334 140 94.0 6e-32 MLLVALLSWIIPFCAVVASCAVYFPLARSYAALTAFGVAASDSEVLLVALLSCTIPFFVV LASSSATADAISSARFAAVSARVAADSAVLLLCAAAVALPAASVAFVDAVVALPFAADAC LVASSFEADADDADDAAELADDAAAVFEDSARVSDAFAFVSDVFAAEADLAAAVACSVAS PAFVVAVEADDAALSADFPAAVALAEACPALVDAALADDAAAVFEPAAAEALCPAAVSED LAFVSDVFAAFAEFPAAVAEEAALLALKDAFVSDDFAASFEAAASRAEVAASDAFVIAVD AEVAADCSDAAAFVSDVFAAPAL Prediction of 
potential genes in microbial genomes Time: Mon May 16 15:34:24 2011 Seq name: gi|296493331|gb|ADTK01000170.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont447.1, whole genome shotgun sequence Length of sequence - 9863 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 232 - 1080 590 ## SeHA_A0070 hypothetical protein + Term 1088 - 1131 9.2 2 2 Tu 1 . - CDS 1165 - 1500 260 ## EcE24377A_D0059 hypothetical protein - Prom 1646 - 1705 4.7 + Prom 1573 - 1632 7.0 3 3 Op 1 . + CDS 1734 - 2066 130 ## SeHA_A0073 relaxosome component 4 3 Op 2 . + CDS 2077 - 4776 1685 ## SeHA_A0074 putative relaxase/mobilization nuclease domain protein + Term 4783 - 4820 7.1 - Term 4771 - 4807 10.1 5 4 Op 1 . - CDS 4813 - 7104 830 ## SeHA_A0075 TrbC 6 4 Op 2 . - CDS 7097 - 8167 532 ## SeHA_A0076 TrbB 7 4 Op 3 . - CDS 8186 - 9394 508 ## SeHA_A0077 conjugal transfer protein - Prom 9634 - 9693 4.2 Predicted protein(s) >gi|296493331|gb|ADTK01000170.1| GENE 1 232 - 1080 590 282 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0070 NR:ns ## KEGG: SeHA_A0070 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 282 1 282 282 591 100.0 1e-168 MNQTLPTADLNTAGTTDVIPSVAIDRIIAQRNEGIALFMQAMECLATARRILLDASGDIF LYGFEDCVTDSVRCMDKPEEAKRNITRLADRKIWDRLMTDTGMYTFMSSCQRDEWNSQLM SDTCPEITLDNVLATFRHLNASKMQTFEQGLIDVYRKLSWDYRTNNPCRLGKRIIIENLL YRWSNGRVTLDCSGREALDDLVRPFYLLEGRNVPDFRSSIGAQYGEFIGNGDNVGKLLEG EYFTVRGYQKGTVHIVFKRPDLVEKLNDIIARHYPGSLPPRV >gi|296493331|gb|ADTK01000170.1| GENE 2 1165 - 1500 260 111 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_D0059 NR:ns ## KEGG: EcE24377A_D0059 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 111 1 111 111 150 98.0 2e-35 MASKKFYSDDDIQLAKAALSELPDLTAQRKTLRDFLDAIRDDIIVLVRTKGYTLADVRDT LKNAGYEVGEKALRDIIREAESKKPSRRSSSKTASKKTDSARKDGIDMNNN 
>gi|296493331|gb|ADTK01000170.1| GENE 3 1734 - 2066 130 110 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0073 NR:ns ## KEGG: SeHA_A0073 # Name: not_defined # Def: relaxosome component # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 110 17 126 126 179 99.0 2e-44 MSDSAVRKKSEVRQKTVVRTLRFSPVEDETIRKKAEDSGLTVSAYIRNAALNKRINSRTD DAFLKELMRLGRMQKHLFVQGKRTGDKEYAEVLVAITELTNTLRKQLMEG >gi|296493331|gb|ADTK01000170.1| GENE 4 2077 - 4776 1685 899 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0074 NR:ns ## KEGG: SeHA_A0074 # Name: not_defined # Def: putative relaxase/mobilization nuclease domain protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 899 1 899 899 1784 99.0 0 MNAVIPKKRRDGKSSFEDLVSYVSVRDDMTDEELNLSSSSQAEQPHRSRFSRLVDYATRL RNESFVALVDVMKDGCEWVNFYGVTCFHNCTSLETAAADMEYIARQAHYAKDDTDPVFHY ILSWQSHESPRPEQIYDSVRHTLKSLGLADHQYVSAVHTDTDNLHVHVAVNRVHPETGYL NRLSWSQEKLSRACRELELKHGFAPDNGCWVHAPGNRIVRKTAVERDRQNAWTRGKKQTF REYVAQTAVAGLRSEPVNDWLSLHRRLAEDGLYLSQMDGKFLVMDGWDRNREGVQLDSFG PSWCAEKLIKKMGDYTPVPKDIFSQVEAPGRYNPDFIAADVRPEKIAETESLQQYACRHL GERLPEMAREGRLENCQAIHRTLAEAGLWMRVQHGHLVICDGYDHNQTPVRADSVWSLLT LDNVNQLDGGWQPVPTDIFRQVTPTERFRGRRMESCPATDKEWHRMRTGTGPQGAIKREL FSDKESLWGYSISHCSPQIEEMITQGEFTWQRCHELFAQQGLMLQKQHHGLVVVDAFNHE QTPVKASSIHPDLTLGRAEPQAGPFVSAPADLFDRVQPESRYNPELAVSDRYGVSSKRDP MLRRQRREARAEARADLRARYLAWREQWRKPDLRYGERCREIHQACRLRKSHIRAQYDDP ALRKLHYHIAEVQRMQALIRLKEDIRDERQKLIADGKWYPPSYRQWVEIQAAQGDRAAVS QLRGWDYRDRRKDRSRTTTTDRCVVLCEPGGTPVYGNTGDLEARLQKNGSVRFRDRRTGE FVCTDYGDRVVFRNHHDRNALADKLDLIAPVLFGRDPRMGFEPEGNDKQFNQVFAEMVAW HNVTGRTGHEDYRITRPDVDHHREGSERYYRDYIAANSNDDASLPPPEQDKRWEPPSPG >gi|296493331|gb|ADTK01000170.1| GENE 5 4813 - 7104 830 763 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0075 NR:ns ## KEGG: SeHA_A0075 # Name: not_defined # Def: TrbC # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 763 1 763 763 1562 99.0 0 MSEHRVNPELLHRTAWGNPVWNALQSLNIYGFCLVASLVASFIWPLALPACLLFTLITML VFSLQRWRCPLRMPMTLECADPSQDRMIKRSLFSFWPTLFQYEVILESPASGIFYVGYQR 
VRDIGRELWLSMDDLTRHIMFFATTGGGKTETIFAWAINPLCWARGFTLVDGKAQNDTAR TIWYLARRFGREDDVEVINFMNGGKSRSEIILSGEKTRPQSNTWNPFCYSTEAFTAETMQ SMLPQNVQGGEWQSRAIAMNKALVFGTKFWCVREGKTMSLQMLREHMTLEGMAKLYCRGL DDQWPEEAIAPLRNYLQDVPGFDLSLVRTPSAWTEEPRKQHAYLSGQFSETFSTFTEAFG DIFAEDSGDIDIRDSIHSDRILMVMIPALDTSAHTTSALGRMFITQKSMILARDLGYRLE GTDSDALEVKKYKGRFPYLCFLYEVGAYYTDRIAVEATQVRSLDFALILMAQDQERIEGQ TTATNTATLMQNTGTKFAGRIVSEGSTARTLKSAAGEEARARMNNLQRQDGIFGESWIDS PQISILMESKINVQELIELHPGEFFSIFRGETVPSASFFIPDDEKSCSSDPVVINRYISV DAPRLDRLRRLVPRTTQRRIPSPENVSAIIGVLTAKPSRKRRKIRTEPHTIVDTFQQRIA GRQAAMAMLEEYDTDINARESALWETAVNTLKTTTREERRIRYITLNRPELPETKEENQI SVRAERAGINLLTLPQDNNHPTGRPVNGFHHKKTNRPDWDGMY >gi|296493331|gb|ADTK01000170.1| GENE 6 7097 - 8167 532 356 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0076 NR:ns ## KEGG: SeHA_A0076 # Name: not_defined # Def: TrbB # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 356 1 356 356 727 100.0 0 MTIEYFATRIAKHISVQAIWPDGRTEIIAVLPRDALSGLVTRNNQLIHIAFDAMGTRNRE TVLIDKQEHNADQILTAIDSCLRLEYLMERRFSRTPLFRGIIASVVLFVMATIGFSLFRY VDRVFWDDTTPEAVQTAGEPRLLPPHLNHTVPLNEGIQLPVPPKKDVQVPEKTITSKNPE AAAARHNLATVLKRNADRGMFTINLSSGHERTLFAFLDPACPNCRLLEPALKRLASDFNV VIYPVSVIGGEESTDRVAPLLCEKDAQKRAAGWHRLYSADNGMMTPPEETTPADETCLKA ARAAIDVNNVAFRKFGFAGTPWVLSDTGWHLPTGILQETGTLNLFLKTTDSESGHE >gi|296493331|gb|ADTK01000170.1| GENE 7 8186 - 9394 508 402 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0077 NR:ns ## KEGG: SeHA_A0077 # Name: not_defined # Def: conjugal transfer protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 402 1 402 402 770 100.0 0 MSYNRQPVAEDPMQIWGAVGVLLILLLFVIWLFLPEVVYASCLILHTLWGLVDWGPFHNY AAPRYNLLAMTGNNAANISYSQWVNVMEQTIGILWMYLLPVTLWCLWEWYQHPGQSRFTR RPVDITRLPHIFASLSPAIAPVLADGDPEKLFHGGKRPERRVALTPEAFVEQHTLITNMQ LDVAAARRCFMAQLGKPLTSWKDMAPHEKALFAIFGLQYFLDDRKAALKLMDTLNLSCRI KSKRDSGKFCTPVYSLAKSAFQRVIKSNGAQQWLKQHRYVRSGLVWLYAHDLRLTPPNWI WLKGVDRTLFYALHRANTTKGFIEGAGVVAVARAEAEAMRFGLPCPEPCVDEAVEGLRRD MLSLGLIWDEPQPDRDRKRRILTNWSLTDDILPRTPATDNEF Prediction of potential genes in microbial genomes 
Time: Mon May 16 15:35:01 2011 Seq name: gi|296493330|gb|ADTK01000171.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont447.2, whole genome shotgun sequence Length of sequence - 10226 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 6, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 57 - 308 226 ## SeHA_A0079 hypothetical protein - Prom 551 - 610 4.8 - Term 1067 - 1115 10.1 2 2 Tu 1 . - CDS 1232 - 1408 211 ## SC082 hypothetical protein - Prom 1494 - 1553 1.7 - Term 1872 - 1903 3.2 3 3 Tu 1 . - CDS 1924 - 2124 233 ## SeHA_A0083 surface exclusion protein - Prom 2366 - 2425 4.9 - Term 2374 - 2417 5.2 4 4 Op 1 . - CDS 2614 - 4821 459 ## SC079 integral membrane protein 5 4 Op 2 . - CDS 4879 - 5322 206 ## SeHA_A0085 TraX - Prom 5364 - 5423 1.6 6 5 Op 1 . - CDS 5492 - 6694 839 ## ECSE_P1-0086 TraW protein 7 5 Op 2 . - CDS 6661 - 7086 125 ## SeHA_A0087 TraV - Prom 7204 - 7263 2.4 - Term 7211 - 7239 1.0 8 6 Tu 1 . 
- CDS 7275 - 10226 1351 ## ECSE_P1-0088 TraU protein Predicted protein(s) >gi|296493330|gb|ADTK01000171.1| GENE 1 57 - 308 226 83 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0079 NR:ns ## KEGG: SeHA_A0079 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 83 1 83 83 157 98.0 1e-37 MSEKWSKEFDSCFQVGSIEEIADALEKRSGSEENALAMLAMTAFTLMTRRGICEMQNVAP DGKGIRLELIGFRGNETIPDTLH >gi|296493330|gb|ADTK01000171.1| GENE 2 1232 - 1408 211 58 aa, chain - ## HITS:1 COG:no KEGG:SC082 NR:ns ## KEGG: SC082 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Choleraesuis # Pathway: not_defined # 1 58 37 94 94 108 98.0 4e-23 MNLVDAFVKKVISGPYEEYGKWWIDVEYISWGVPGKTRLMFESKEQALEVKEGYKFLT >gi|296493330|gb|ADTK01000171.1| GENE 3 1924 - 2124 233 66 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0083 NR:ns ## KEGG: SeHA_A0083 # Name: not_defined # Def: surface exclusion protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 66 155 220 220 112 72.0 5e-24 MEIHTKYVNMPMLELPIGVNSARSIANTLHAMASRGYDYPVDFPRLIQEKRKEWEQIAGM PVAEVF >gi|296493330|gb|ADTK01000171.1| GENE 4 2614 - 4821 459 735 aa, chain - ## HITS:1 COG:no KEGG:SC079 NR:ns ## KEGG: SC079 # Name: traY # Def: integral membrane protein # Organism: S.enterica_Choleraesuis # Pathway: not_defined # 1 735 11 745 745 1193 86.0 0 MAFAHTQFYSGVTVKILLRALCAGLAISSLPAMASVTYQDIVSAATNPDDLSRQALVTIF GDVVTNPLSTSAPTLIGSMFGAFNSIIAVLAVVWFMFIGIRHVVRSGHQGQVFSTGRDVV GTLSVVAGFLMIVPTGNGWSLAQLIMLWGASIMGVGSANVMVQLAADNIANGYSMTVQPV QASTRTAARGIFEMELCKYAVNAGLNDFNQTAKSSTSLMTESAKTASGNYTVTVSNGSGI CGTASLSVEGNGTTDQSTIGKFFNPFSKNEYSGVISAQRAAMDNMISDMDNAASEFVTTF LEKRNSGNGTLPDIETRIQRAADEYERAVQKSLPIDNGEQSRKEALKSYLTTYGWVTLGA WYQTFATANQRLAELADRAPAVTSMSSLGEVGDTDLFSAVMGAYKTQLQNSTYTPPLGTI TSADEQLLSKTSTPQDALIKPTEKFGQWLTNNIATEWTDSGTSSNQVNPLIKMKNIGDKT MVAAEGIWITYTSARVITSGTGSSIFGKFIDLTTSAISTLNALLEALAPPVYFLLLLLFC AGFSLSIYLPFIPFIFWMTGIGNWVISVLIGCTAGPLWGATHLGTSQDRGSRAAYGYIYL 
IDSMIRPPIMVFGFFFASVAIIAIGTILNFLFGAALTNVQISSFTGIFSLAGFLLLYARS CTTIVAAAFALQAYLPDHVINFLGGRDGVNTLGNMAGSIKEIFAGSNRNIRHAPNIREEQ IKRMNNDNNKDGIKG >gi|296493330|gb|ADTK01000171.1| GENE 5 4879 - 5322 206 147 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0085 NR:ns ## KEGG: SeHA_A0085 # Name: not_defined # Def: TraX # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 147 48 194 194 263 100.0 1e-69 MLEQQKRRLDRFRQVNNKKAQLHLTWEEALQASHMSVDDLDRRFRRRRTVWRFCCWSLLA IALFLSGMLFAASSLPLTTLVRAISTLVLILSGVALCASRALIVTYRLWQLHERKVSEPE QGTFRDFLNDRNGWRNATLIAVTSKQY >gi|296493330|gb|ADTK01000171.1| GENE 6 5492 - 6694 839 400 aa, chain - ## HITS:1 COG:no KEGG:ECSE_P1-0086 NR:ns ## KEGG: ECSE_P1-0086 # Name: not_defined # Def: TraW protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 400 1 400 400 646 99.0 0 MQRKTLLAALIAILSGTACQAHAYSVTVVASRPVEQQVVPRMEAIKGVLSDILSTQTATG TAINQNSEKLASVIAQNGQATRQQMIFSNETQRLEEARKSFTVPDSICSESASGIATESK SASASAASKLSKGGGVSNRNIRDRLASAANSPVREAYDGAAIHASYCTEAEYARFGGTAV CPSVGEIPGGDSQVRSIYHGAGTADTPAALTWDQKQIDAATAYMKNTSRPSAGRALGKGE VNTQSGRTYVGLQNEYNGIIDSASNPQLTLIADSTPNETTRKALAETLQSDSAAAYFDQV ASPEAKARGYMSTREFEAFEAGRRYANTAYLVDLQEMQGDNLLRELVRITAQMNWQLNDL KEQIRQGNVISGQQLALTARQYYEKQLGSLEKTINQANAR >gi|296493330|gb|ADTK01000171.1| GENE 7 6661 - 7086 125 141 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0087 NR:ns ## KEGG: SeHA_A0087 # Name: not_defined # Def: TraV # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 141 64 204 204 265 100.0 3e-70 MRLLLPSSTDELITTAIWQVSVPDEAPFDVQHVIQWILPSPLASLMSDDTWLWPSVPGTP ALSAGDWPVMDTDPSTLRRVRRRSFSIIDNRNRKGYITRRYQCFPLAPLPEDRKYDSLPD LVLHPEQETIKCKEKHYSQPL >gi|296493330|gb|ADTK01000171.1| GENE 8 7275 - 10226 1351 983 aa, chain - ## HITS:1 COG:no KEGG:ECSE_P1-0088 NR:ns ## KEGG: ECSE_P1-0088 # Name: not_defined # Def: TraU protein # Organism: E.coli_SE11 # Pathway: not_defined # 11 983 42 1014 1014 1998 99.0 0 DQYGSDESLTDRERRPWLNSPYIAATKRGEYLSVFEVSGAFREMDEASDQTGPGSLESLI TSMSDSLNTAYKNSGHKISCVFERDPEMGKEEIEDMVAPQKRSLANTGIQLQDVVDEKVT 
TLSPWLVRERCWLAIWSGPDLISNSDRTAHDELVRRLAERVPKARFAQSPWQWTLSALKI RHEAFLDNVEQALRHSSDGLILRLLDIHEVGREIRRQTERYSTPRNWQPHLPEDAQPAGY RWTDDESVLHAPSLHLQLFNTQVTTQGNLVQAGGLWHGMVSITLPPQNLQTFNELVRAVP RAVPWRIRMDLMPGGMKALNLKKTLLTYSSFISAVRPMYESVMTLAATDEKEPVCIMTIM ASTWGKTREICARNQAILKSAIEGWGVCGTTTTFGDPRRAWVNTILAASGGSGPVPLYPP LSHAISLFPLNRAGSVWRGKGNLMLHTEDGSAFEVGLASSQQNKHTELAPGDPGLGKSVL INTLSEIQISSAQKNLPFIAYIDKGYSAQGLVQLIRDSLPPERKDEAVGIILSNDPEYTR NLFDVMYGAKKPITPEKNFMSSVLCALCVDTGTGQPCNPGDTRQIINQLIELAFKEYGEN NPRLYRASTEELVDSALQDSGLYEKHDAAWWARSTWFEVRDMLHNAGYIMAAQRAHYQAM PQLPEVSSMLGHTSLRDVFGTVQRDGSNELLLDYIRRALEQGHNDYPMISGYTRFMINPE TRVIAVDLNNVAGDKTPAGRLKTGIMYLLAGQIAGGDFTLPQYRDEVLKQLPREYHEIAL KRINQLDQEVKTKVYDELHNARGIDFIWENLDTQEREQRKFAIRTVLSTQYLRNYPESVL KSANTLWLIRYKPEDIPVLRDNFNVPEFMLKRFLKMPEGPAPDGSGVPVLGVFRVKSGTL ARILKFTVGPLELWALNSSPKDSALRKTLTNKLGSVRARKILAENFPRGSATSLIEHRAG QHNSDNVIEDLASELIRKQGYNL Prediction of potential genes in microbial genomes Time: Mon May 16 15:35:35 2011 Seq name: gi|296493329|gb|ADTK01000172.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont453.1, whole genome shotgun sequence Length of sequence - 831 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 - CDS 3 - 399 287 ## COG1310 Predicted metal-dependent protease of the PAD1/JAB1 superfamily - Term 466 - 515 1.0 2 1 Op 2 . 
- CDS 549 - 746 144 ## COG4672 Phage-related protein Predicted protein(s) >gi|296493329|gb|ADTK01000172.1| GENE 1 3 - 399 287 132 aa, chain - ## HITS:1 COG:Z0978_1 KEGG:ns NR:ns ## COG: Z0978_1 COG1310 # Protein_GI_number: 15800514 # Func_class: R General function prediction only # Function: Predicted metal-dependent protease of the PAD1/JAB1 superfamily # Organism: Escherichia coli O157:H7 EDL933 # 1 70 26 95 95 147 98.0 3e-36 MSPEDWLQAEMQGEIVALVHSHPGGLPWLSEADRRLQVQSDLPWWLVCRGAIHKFRCVPH LTGRRFEHGVTDCYTLFRDAYHLAGIEMPDFHRGDDWWRHGQNLYLDNLEDTGLYQVPLS SAQPGDVLLCCF >gi|296493329|gb|ADTK01000172.1| GENE 2 549 - 746 144 65 aa, chain - ## HITS:1 COG:Z0977 KEGG:ns NR:ns ## COG: Z0977 COG4672 # Protein_GI_number: 15800513 # Func_class: S Function unknown # Function: Phage-related protein # Organism: Escherichia coli O157:H7 EDL933 # 1 65 168 232 232 139 95.0 9e-34 MLANTCTWIYRGDECGYNGPAVADEYDQPTSDISKDKCSKCLSGCKFRNNVGNFGGFLSI NKLSQ Prediction of potential genes in microbial genomes Time: Mon May 16 15:35:35 2011 Seq name: gi|296493328|gb|ADTK01000173.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont458.1, whole genome shotgun sequence Length of sequence - 569 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 557 543 ## COG3468 Type V secretory pathway, adhesin AidA Predicted protein(s) >gi|296493328|gb|ADTK01000173.1| GENE 1 3 - 557 543 184 aa, chain + ## HITS:1 COG:flu KEGG:ns NR:ns ## COG: flu COG3468 # Protein_GI_number: 16129941 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 1 184 908 1091 1091 339 99.0 2e-93 DNNDFRARGWGWLGSLETGLPFSITDNLMLEPQLQYTWQGLSLDDGQDNAGYVKFGHGSA QHVRAGFRLGSHNDMTFGEGTSSRAPLRDSAKHSVSELPVNWWVQPSVIRTFSSRGDMRV GTSTAGSGMTFSPSQNGTSLDLQAGLEARVRENITLGVQAGYAHSVSGSSAEGYNGQATL NVTF Prediction of potential genes in microbial genomes Time: Mon May 16 15:35:36 2011 Seq name: gi|296493327|gb|ADTK01000174.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont473.1, whole genome shotgun sequence Length of sequence - 2876 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 186 151 ## ECIAI1_0779 putative lipoprotein; DLP12 prophage - Prom 211 - 270 3.4 + Prom 1091 - 1150 5.2 2 2 Op 1 . + CDS 1231 - 1725 532 ## ECUMN_1833 conserved hypothetical protein from putative prophage 3 2 Op 2 . 
+ CDS 1725 - 2874 657 ## COG5525 Bacteriophage tail assembly protein Predicted protein(s) >gi|296493327|gb|ADTK01000174.1| GENE 1 3 - 186 151 61 aa, chain - ## HITS:1 COG:no KEGG:ECIAI1_0779 NR:ns ## KEGG: ECIAI1_0779 # Name: borD # Def: putative lipoprotein; DLP12 prophage # Organism: E.coli_IAI1 # Pathway: not_defined # 1 61 17 77 113 114 100.0 1e-24 MKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHHFFVSGIGQEKTVDAAKICGG A >gi|296493327|gb|ADTK01000174.1| GENE 2 1231 - 1725 532 164 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_1833 NR:ns ## KEGG: ECUMN_1833 # Name: not_defined # Def: conserved hypothetical protein from putative prophage # Organism: E.coli_UMN026 # Pathway: not_defined # 1 154 1 154 164 273 100.0 1e-72 MDRELKNLTLNISQLAALSGVHRQTAAARLQNLPVAGGHESNLKLYRVVDIVSAFLALPP PVAEGEMDAHERKAWYQSERERLKFEQETAQLIPASDVRREFAIWAKAVVQVLETLPDIL ERDCGLQPAAVSRVQSIIDDLRDQIALRVTEAGADDEEELQQEE >gi|296493327|gb|ADTK01000174.1| GENE 3 1725 - 2874 657 383 aa, chain + ## HITS:1 COG:ECs0825 KEGG:ns NR:ns ## COG: ECs0825 COG5525 # Protein_GI_number: 15830079 # Func_class: R General function prediction only # Function: Bacteriophage tail assembly protein # Organism: Escherichia coli O157:H7 # 1 383 1 383 700 775 99.0 0 MLNQETAKAARTDSGYILRAPRRMRVADAVAQYMRVPMGAGNSVPWDPLVAPYVIEPMNC LASREYDAVIFVGPARTGKTIGLIDGWVIYNVICDPADMLIIQMTEEKAREHSKKRLART FRVSPEVVSRLSPNKNDNNVYDRTFLAGNYLKIGWPSVNIMSSSDYKCVALTDYDRFPED IDGEGDAFSLASKRTTTFMSSGMTLVESSPGRDVKDVKWRRTSPHEAPPTTGILSLYNRG DRRRWYWPCPHCGEYFQPCGDVVAGFRDIADPVLASEAAYIQCPSCSGRIMPEHKRELNG RGVWLRDGESINADGSRYGDPRRSRIASFWMEGPAAAYQTLSQLVYKLLTAEQEYETTGS EETLKTVINTDWGLPYLPRASME Prediction of potential genes in microbial genomes Time: Mon May 16 15:35:41 2011 Seq name: gi|296493326|gb|ADTK01000175.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont474.1, whole genome shotgun sequence Length of sequence - 981 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 
1 Op 1 6/0.000 - CDS 100 - 429 168 ## COG4718 Phage-related protein 2 1 Op 2 . - CDS 429 - 959 527 ## COG5281 Phage-related minor tail protein Predicted protein(s) >gi|296493326|gb|ADTK01000175.1| GENE 1 100 - 429 168 109 aa, chain - ## HITS:1 COG:ECs0838 KEGG:ns NR:ns ## COG: ECs0838 COG4718 # Protein_GI_number: 15830092 # Func_class: S Function unknown # Function: Phage-related protein # Organism: Escherichia coli O157:H7 # 1 109 1 109 109 215 99.0 2e-56 METFHWKVRPDMNVVSEPKVVTVKLGDGYEQRRAAGLNNQLSIYSVTIRVRKCEHPSLKA FLERHGGVRAFQWTPPYDWKPIRVVCRKWSASVGALWVTITADFEQVVA >gi|296493326|gb|ADTK01000175.1| GENE 2 429 - 959 527 176 aa, chain - ## HITS:1 COG:ECs0837 KEGG:ns NR:ns ## COG: ECs0837 COG5281 # Protein_GI_number: 15830091 # Func_class: S Function unknown # Function: Phage-related minor tail protein # Organism: Escherichia coli O157:H7 # 1 176 846 1021 1021 329 98.0 1e-90 MGNGLATFVTTGKLNFKSFTSSVLSDMAKILAQATMMKSIKGIGSVLGFDLSSLSLNANG GIYQSADLSRYSGTVVNRPTFFAFAKGAGVMGEAGPEAILPLRRGADGKLGVVADIGGSG MGMFAPQYNIEINNDGTNGQIGPAALKVVYDLGKKAAADFMQQQARDGGRLSGAYR Prediction of potential genes in microbial genomes Time: Mon May 16 15:35:43 2011 Seq name: gi|296493325|gb|ADTK01000176.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont497.1, whole genome shotgun sequence Length of sequence - 4137 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 129 177 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 - Prom 378 - 437 5.0 - TRNA 244 - 319 94.8 # Thr GGT 0 0 - TRNA 326 - 400 64.8 # Gly TCC 0 0 - TRNA 517 - 601 66.9 # Tyr GTA 0 0 - TRNA 610 - 685 91.8 # Thr TGT 0 0 2 2 Tu 1 . - CDS 870 - 1064 84 ## EcSMS35_4423 hypothetical protein - Prom 1313 - 1372 2.1 3 3 Tu 1 . + CDS 1071 - 2156 719 ## COG1072 Panthothenate kinase 4 4 Op 1 6/1.000 - CDS 2026 - 2991 714 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase 5 4 Op 2 . 
- CDS 2988 - 4016 681 ## COG0812 UDP-N-acetylmuramate dehydrogenase - Prom 4039 - 4098 2.8 Predicted protein(s) >gi|296493325|gb|ADTK01000176.1| GENE 1 1 - 129 177 43 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 42 1 42 407 72 76 4e-13 MSKEKFERTKPHVNVGTIGHVDHGKTTLTAAITTVLAKTYGGA >gi|296493325|gb|ADTK01000176.1| GENE 2 870 - 1064 84 64 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_4423 NR:ns ## KEGG: EcSMS35_4423 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 64 1 64 64 120 98.0 2e-26 MLFYTHKHVSGGHSSIGKAIFAMARHSSVEWVNGQEGNTRWGGIIRKNLCQEAIVEKQCD RYSI >gi|296493325|gb|ADTK01000176.1| GENE 3 1071 - 2156 719 361 aa, chain + ## HITS:1 COG:ECs4901 KEGG:ns NR:ns ## COG: ECs4901 COG1072 # Protein_GI_number: 15834155 # Func_class: H Coenzyme transport and metabolism # Function: Panthothenate kinase # Organism: Escherichia coli O157:H7 # 1 308 9 316 316 627 100.0 1e-179 MTPYLQFDRNQWAALRDSVPMTLSEDEIARLKGINEDLSLEEVAEIYLPLSRLLNFYISS NLRRQAVLEQFLGTNGQRIPYIISIAGSVAVGKSTTARVLQALLSRWPEHRRVELITTDG FLHPNQVLKERGLMKKKGFPESYDMHRLVKFVSDLKSGVPNVTAPVYSHLIYDVIPDGDK TVVQPDILILEGLNVLQSGMDYPHDPHHVFVSDFVDFSIYVDAPEDLLQTWYINRFLKFR EGAFTDPDSYFHNYAKLTKEEAIKTAMTLWKEINWLNLKQNILPTRERASLILTKSANHA VEEVRLRKYFSGELMPPFLIFLHYAGIFHRPSRVLLFHPAQVIKPPVCLFRVKCQIFLYH Q >gi|296493325|gb|ADTK01000176.1| GENE 4 2026 - 2991 714 321 aa, chain - ## HITS:1 COG:ECs4900_2 KEGG:ns NR:ns ## COG: ECs4900_2 COG0340 # Protein_GI_number: 15834154 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Escherichia coli O157:H7 # 77 321 1 245 245 469 99.0 1e-132 MKDYTVPLKLIALLANGEFHSGEQLGETLGMSRAAINKHIQTLRDWGVDVFTVPGKGYSL PEPIQLLNAEQILGQLDGGSVAVLPVIDSTNQYLLDRIGELKSGDACVAEYQQAGRGRRG RKWFSPFGANLYLSMFWRLEQGPAAAIGLSLVIGIVMAEVLRKLGADKVRVKWPNDLYLQ DRKLAGILVELTGKTGDAAQIVIGAGINMAMRRVEESVVNQGWITLQEAGINLDRNTLAA 
MLIRELRAALELFEQEGLAPYLSRWEKLDNFINRPVKLIIGDKEIFGISRGIDKQGALLL EQDGIIKPWMGGEISLRSAEK >gi|296493325|gb|ADTK01000176.1| GENE 5 2988 - 4016 681 342 aa, chain - ## HITS:1 COG:ECs4899 KEGG:ns NR:ns ## COG: ECs4899 COG0812 # Protein_GI_number: 15834153 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate dehydrogenase # Organism: Escherichia coli O157:H7 # 1 342 1 342 342 714 99.0 0 MNHSLKPWNTFGIDHNAQHIVCAEDEQQLLNAWQHATAEGQPVLILGEGSNVLFLEDYRG TVIINRIKGIEIHDEPDAWYLHVGAGENWHRLVKYTLQEGMPGLENLALIPGCVGSSPIQ NIGAYGVELQRVCAYVDCVELATGKQVRLTAKECRFGYRDSIFKHEYQDRFAIVAVGLRL PKEWQPVLTYGDLTRLDPTTVTPQQVFNAVCHMRTTKLPDPKVNGNAGSFFKNPVVSAET AEALLSQFPTAPNYPQADGSVKLAAGWLIDQCQLKGMQMGGAAVHRQQALVLINEDNAKS EDVVQLAHHVRQKVGEKFNVWLEPEVRFIGASGEVSAVETIS Prediction of potential genes in microbial genomes Time: Mon May 16 15:35:46 2011 Seq name: gi|296493324|gb|ADTK01000177.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont508.1, whole genome shotgun sequence Length of sequence - 908 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 - CDS 1 - 330 173 ## PROTEIN SUPPORTED gi|148984516|ref|ZP_01817804.1| 50S ribosomal protein L9 2 1 Op 2 . 
- CDS 327 - 863 159 ## COG2963 Transposase and inactivated derivatives Predicted protein(s) >gi|296493324|gb|ADTK01000177.1| GENE 1 1 - 330 173 110 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148984516|ref|ZP_01817804.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP3-BS71] # 1 102 1 102 107 71 38 3e-13 LMISLPAGSRIWLVAGITDMRNGFNGLASKVQNVLKDDPFSGHLFIFRGRRGDQIKVLWA DSDGLCLFTKRLERGRFVWPVTRDGKVHLTPAQLSMLLEGINWKHPKRTE >gi|296493324|gb|ADTK01000177.1| GENE 2 327 - 863 159 178 aa, chain - ## HITS:1 COG:Z4315_2 KEGG:ns NR:ns ## COG: Z4315_2 COG2963 # Protein_GI_number: 15803508 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 EDL933 # 52 177 2 133 134 94 43.0 1e-19 MFVRFRRAGLSWPLPAGMSEQELDACLYGQFSTVPVVRPESTVISEAPVVKKRPRRPNFP YEFKIALVEQSLQPGACVAQIARENGINDNLLFNWRHQYRKGGLLPSGKNMPALLPVTLT PEPDNKIPAPAQEPEQINTPSDSLCCELVLPAGTLRLKGKLTPALLQTLIREIKGSSH Prediction of potential genes in microbial genomes Time: Mon May 16 15:36:06 2011 Seq name: gi|296493323|gb|ADTK01000178.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont523.1, whole genome shotgun sequence Length of sequence - 61651 bp Number of predicted genes - 56, with homology - 56 Number of transcription units - 30, operones - 12 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 173 - 232 5.7 1 1 Tu 1 . + CDS 292 - 1590 1013 ## COG0477 Permeases of the major facilitator superfamily 2 2 Op 1 3/0.789 - CDS 1587 - 1910 278 ## COG5544 Predicted periplasmic lipoprotein - Term 1917 - 1954 3.7 3 2 Op 2 3/0.789 - CDS 1956 - 3311 1333 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes 4 3 Op 1 3/0.789 - CDS 3425 - 6085 2241 ## COG1042 Acyl-CoA synthetase (NDP forming) 5 3 Op 2 3/0.789 - CDS 6117 - 6641 447 ## COG3148 Uncharacterized conserved protein - Term 6845 - 6877 5.4 6 4 Tu 1 . 
- CDS 6884 - 7303 178 ## PROTEIN SUPPORTED gi|124485582|ref|YP_001030198.1| ribosomal protein L12E/L44/L45/RPP1/RPP2-like protein - Prom 7362 - 7421 6.1 + Prom 7311 - 7370 3.2 7 5 Tu 1 . + CDS 7510 - 8547 1003 ## COG0566 rRNA methylases + Term 8555 - 8592 8.5 - Term 8541 - 8580 5.1 8 6 Tu 1 . - CDS 8595 - 9284 618 ## COG0692 Uracil DNA glycosylase - Prom 9455 - 9514 5.9 + Prom 9346 - 9405 7.3 9 7 Tu 1 . + CDS 9589 - 9972 541 ## COG3445 Acid-induced glycyl radical enzyme + Term 9995 - 10030 7.4 - Term 9983 - 10018 7.4 10 8 Tu 1 . - CDS 10028 - 10615 487 ## COG1280 Putative threonine efflux protein - Prom 10652 - 10711 4.3 + Prom 10633 - 10692 3.8 11 9 Tu 1 . + CDS 10718 - 11599 515 ## COG0583 Transcriptional regulator + Term 11607 - 11636 -0.5 12 10 Tu 1 . - CDS 11808 - 13142 1450 ## COG0513 Superfamily II DNA and RNA helicases - Prom 13274 - 13333 4.2 + Prom 13176 - 13235 2.4 13 11 Tu 1 . + CDS 13274 - 14011 733 ## COG4123 Predicted O-methyltransferase + Term 14176 - 14213 -0.9 14 12 Tu 1 . - CDS 13996 - 15618 1189 ## COG0029 Aspartate oxidase - Prom 15857 - 15916 5.1 + Prom 15749 - 15808 5.0 15 13 Op 1 . 
+ CDS 15874 - 16029 78 ## ECBD_1107 hypothetical protein 16 13 Op 2 11/0.000 + CDS 16026 - 16601 341 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 17 13 Op 3 10/0.000 + CDS 16634 - 17284 536 ## COG3073 Negative regulator of sigma E activity 18 13 Op 4 8/0.000 + CDS 17284 - 18240 823 ## COG3026 Negative regulator of sigma E activity 19 13 Op 5 4/0.632 + CDS 18237 - 18716 340 ## COG3086 Positive regulator of sigma E activity + Term 18727 - 18771 -0.9 + Prom 18770 - 18829 5.6 20 14 Op 1 14/0.000 + CDS 18914 - 20713 2083 ## COG0481 Membrane GTPase LepA 21 14 Op 2 13/0.000 + CDS 20729 - 21703 1033 ## COG0681 Signal peptidase I + Term 21716 - 21761 14.1 + Prom 21754 - 21813 1.8 22 15 Op 1 18/0.000 + CDS 21975 - 22655 654 ## COG0571 dsRNA-specific ribonuclease 23 15 Op 2 16/0.000 + CDS 22652 - 23557 1011 ## COG1159 GTPase 24 15 Op 3 9/0.000 + CDS 23569 - 24297 623 ## COG1381 Recombinational DNA repair protein (RecF pathway) 25 15 Op 4 8/0.000 + CDS 24309 - 25040 998 ## COG0854 Pyridoxal phosphate biosynthesis protein 26 15 Op 5 . + CDS 25040 - 25420 293 ## COG0736 Phosphopantetheinyl transferase (holo-ACP synthase) + Term 25439 - 25499 12.4 - Term 26080 - 26114 1.1 27 16 Op 1 1/0.947 - CDS 26115 - 26375 256 ## COG1145 Ferredoxin 28 16 Op 2 . - CDS 26431 - 27279 815 ## COG1737 Transcriptional regulators + Prom 27386 - 27445 3.4 29 17 Op 1 2/0.842 + CDS 27488 - 28123 481 ## COG0560 Phosphoserine phosphatase 30 17 Op 2 . + CDS 28181 - 28684 472 ## COG0590 Cytosine/adenosine deaminases 31 18 Tu 1 . - CDS 28681 - 30237 1743 ## COG4623 Predicted soluble lytic transglycosylase fused to an ABC-type amino acid-binding protein - Prom 30385 - 30444 4.0 32 19 Tu 1 . + CDS 30498 - 34382 4297 ## COG0046 Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain + Term 34443 - 34472 0.4 33 20 Tu 1 . + CDS 34985 - 36367 1035 ## COG0642 Signal transduction histidine kinase + Prom 36392 - 36451 2.5 34 21 Op 1 . 
+ CDS 36532 - 37245 436 ## EcHS_A2708 hypothetical protein 35 21 Op 2 4/0.632 + CDS 37235 - 38569 1308 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 36 21 Op 3 . + CDS 38630 - 38968 498 ## COG0347 Nitrogen regulatory protein PII 37 22 Tu 1 . - CDS 39013 - 40203 1279 ## COG1018 Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 - Prom 40241 - 40300 7.3 + Prom 40407 - 40466 5.6 38 23 Tu 1 . + CDS 40531 - 41784 1542 ## COG0112 Glycine/serine hydroxymethyltransferase + Term 41811 - 41840 2.1 - Term 41799 - 41828 2.1 39 24 Tu 1 . - CDS 41857 - 43050 1212 ## COG1940 Transcriptional regulator/sugar kinase - Prom 43075 - 43134 3.6 + Prom 42964 - 43023 2.9 40 25 Op 1 . + CDS 43168 - 44151 1078 ## ECH74115_3782 tetratricopeptide repeat protein 41 25 Op 2 . + CDS 44212 - 46449 1980 ## EcE24377A_2834 TPR repeat-containing protein + Prom 46458 - 46517 9.9 42 26 Op 1 16/0.000 + CDS 46546 - 47529 1066 ## COG1879 ABC-type sugar transport system, periplasmic component 43 26 Op 2 21/0.000 + CDS 47552 - 49063 188 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 44 26 Op 3 3/0.789 + CDS 49088 - 50086 1089 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 45 26 Op 4 2/0.842 + CDS 50161 - 51213 1027 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 46 26 Op 5 . 
+ CDS 51225 - 52097 777 ## COG2017 Galactose mutarotase and related enzymes + Term 52107 - 52150 4.6 - Term 52095 - 52138 8.4 47 27 Op 1 1/0.947 - CDS 52145 - 52567 449 ## COG2259 Predicted membrane protein 48 27 Op 2 6/0.105 - CDS 52664 - 53866 1003 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 49 27 Op 3 3/0.789 - CDS 53876 - 54688 797 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 50 27 Op 4 4/0.632 - CDS 54685 - 55005 172 ## COG2146 Ferredoxin subunits of nitrite reductase and ring-hydroxylating dioxygenases 51 27 Op 5 6/0.105 - CDS 55005 - 55523 434 ## COG5517 Small subunit of phenylpropionate dioxygenase 52 27 Op 6 . - CDS 55520 - 56881 1338 ## COG4638 Phenylpropionate dioxygenase and related ring-hydroxylating dioxygenases, large terminal subunit - Prom 56930 - 56989 2.2 + Prom 56839 - 56898 5.4 53 28 Op 1 9/0.000 + CDS 57017 - 57907 719 ## COG0583 Transcriptional regulator + Prom 57950 - 58009 2.1 54 28 Op 2 . + CDS 58067 - 59206 958 ## COG0477 Permeases of the major facilitator superfamily + Term 59370 - 59434 -0.8 55 29 Tu 1 . - CDS 59198 - 60478 829 ## COG3711 Transcriptional antiterminator - Prom 60584 - 60643 3.9 56 30 Tu 1 . 
- CDS 60669 - 61523 668 ## COG1073 Hydrolases of the alpha/beta superfamily - Prom 61561 - 61620 2.0 Predicted protein(s) >gi|296493323|gb|ADTK01000178.1| GENE 1 292 - 1590 1013 432 aa, chain + ## HITS:1 COG:kgtP KEGG:ns NR:ns ## COG: kgtP COG0477 # Protein_GI_number: 16130512 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 432 1 432 432 752 100.0 0 MAESTVTADSKLTSSDTRRRIWAIVGASSGNLVEWFDFYVYSFCSLYFAHIFFPSGNTTT QLLQTAGVFAAGFLMRPIGGWLFGRIADKHGRKKSMLLSVCMMCFGSLVIACLPGYETIG TWAPALLLLARLFQGLSVGGEYGTSATYMSEVAVEGRKGFYASFQYVTLIGGQLLALLVV VVLQHTMEDAALREWGWRIPFALGAVLAVVALWLRRQLDETSQQETRALKEAGSLKGLWR NRRAFIMVLGFTAAGSLCFYTFTTYMQKYLVNTAGMHANVASGIMTAALFVFMLIQPLIG ALSDKIGRRTSMLCFGSLAAIFTVPILSALQNVSSPYAAFGLVMCALLIVSFYTSISGIL KAEMFPAQVRALGVGLSYAVANAIFGGSAEYVALSLKSIGMETAFFWYVTLMAVVAFLVS LMLHRKGKGMRL >gi|296493323|gb|ADTK01000178.1| GENE 2 1587 - 1910 278 107 aa, chain - ## HITS:1 COG:STM2653 KEGG:ns NR:ns ## COG: STM2653 COG5544 # Protein_GI_number: 16765973 # Func_class: R General function prediction only # Function: Predicted periplasmic lipoprotein # Organism: Salmonella typhimurium LT2 # 1 107 1 107 107 154 75.0 5e-38 MRILFVCSLLLLSGCSHMANDSWSGQDKAQHFIASAMLSAAGNEYSQHQGMSRDRSAMFG LMFSVSLGASKELWDSRPEGSGWSWKDLAWDVAGASPGYTVWQLTRH >gi|296493323|gb|ADTK01000178.1| GENE 3 1956 - 3311 1333 451 aa, chain - ## HITS:1 COG:ECs3452 KEGG:ns NR:ns ## COG: ECs3452 COG1502 # Protein_GI_number: 15832706 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Escherichia coli O157:H7 # 1 431 2 432 452 877 100.0 0 MLSKFKRNKHQQHLAQLPKISQSVDDVDFFYAPADFRETLLEKIASAKQRICIVALYLEQ DDGGKGILNALYEAKRQRPELDVRVLVDWHRAQRGRIGAAASNTNADWYCRMAQENPGVD 
VPVYGVPINTREALGVLHFKGFIIDDSVLYSGASLNDVYLHQHDKYRYDRYHLIRNRKMS DIMFEWVTQNIMNGRGVNRLDDVNRPKSPEIKNDIRLFRQELRDAAYHFQGDADNDQLSV TPLVGLGKSSLLNKTIFHLMPCAEQKLTICTPYFNLPAILVRNIIQLLREGKKVEIIVGD KTANDFYIPEDEPFKIIGALPYLYEINLRRFLSRLQYYVNTDQLVVRLWKDDDNTYHLKG MWVDDKWMLITGNNLNPRAWRLDLENAILIHDPQLELAPQREKELELIREHTTIVKHYRD LQSIADYPVKVRKLIRRLRRIRIDRLISRIL >gi|296493323|gb|ADTK01000178.1| GENE 4 3425 - 6085 2241 886 aa, chain - ## HITS:1 COG:yfiQ_1 KEGG:ns NR:ns ## COG: yfiQ_1 COG1042 # Protein_GI_number: 16130509 # Func_class: C Energy production and conversion # Function: Acyl-CoA synthetase (NDP forming) # Organism: Escherichia coli K12 # 1 709 1 709 709 1363 99.0 0 MSQRGLEALLRPKSIAVIGASMKPNRAGYLMMRNLLAGGFNGPVLPVTPAWKAVLGVLAW PDIASLPFTPDLAVLCTNASRNLALLEELGEKGCKTCIILSAPASQHEDLRACALRHNMR LLGPNSLGLLAPWQGLNASFSPVPIKRGKLAFISQSAAVSNTILDWAQQREMGFSYFIAL GDSLDIDVDELLDYLARDSKTSAILLYLEQLSDARRFVSAARSASRNKPILVIKSGRSPA AQRLLNTTAGMDPAWDAAIQRAGLLRVQDTHELFSAVETLSHMRPLRGDRLMIISNGAAP AALALDALWSRNGKLATLSEETCQKLRDALPGHVAISNPLDLRDDASSEHYIKTLDILLH SQDFDALMVIHSPSAAAPATESAQVLIEAVKHHPRSKYVSLLTNWCGEHSSQEARRLFSE AGLPTYRTPEGTITAFMHMVEYRRNQKQLRETPALPSNLTSNTAEAHLLLQQAIAEGATS LDTHEVQPILQAYGMNTLPTWIASDSTEAVHIAEQIGYPVALKLRSPDIPHKSEVQGVML YLRTANEVQQAANAIFDRVKMAWPQARVHGLLVQSMANRAGAQELRVVVEHDPVFGPLIM LGEGGVEWRPEDQAVVALPPLNMNLARYLVIQGIKSKKIRARSALRPLDVAGLSQLLVQV SNLIVDCPEIQRLDIHPLLASGSEFTALDVTLDISPFEGDNESRLAVRPYPHQLEEWVEL KNGERCLFRPILPEDEPQLQQFISRVTKEDLYYRYFSEINEFTHEDLANMTQIDYDREMA FVAVRRIDQTEEILGVTRAISDPDNIDAEFAVLVRSDLKGLGLGRRLMEKLITYTRDHGL QRLNGITMPNNRGMVALARKLGFNVDIQLEEGIVGLTLNLAQREES >gi|296493323|gb|ADTK01000178.1| GENE 5 6117 - 6641 447 174 aa, chain - ## HITS:1 COG:ECs3449 KEGG:ns NR:ns ## COG: ECs3449 COG3148 # Protein_GI_number: 15832703 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 174 67 240 240 348 100.0 4e-96 MFDTEPMKPSNTGRLIADILPDTVAFQWSRTEPSQDLLDLVQNPDYQPMVVFPASYADEQ REVIFTPPAGKPPLFIMLDGTWPEARKMFRKSPYLDNLPVISVDLSRLSAYRLREAQAEG 
QYCTAEVAIALLDMAGDTGAAAGLGEHFTRFKTRYLAGKTQHLGSITAEQLESV >gi|296493323|gb|ADTK01000178.1| GENE 6 6884 - 7303 178 139 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|124485582|ref|YP_001030198.1| ribosomal protein L12E/L44/L45/RPP1/RPP2-like protein [Methanocorpusculum labreanum Z] # 55 136 35 115 120 73 46 3e-12 MNTVCTHCQAINRIPDDRIEDAAKCGRCGHDLFDGEVINATGETLDKLLKDDLPVVIDFW APWCGPCRNFAPIFEDVAQERSGKVRFVKVNTEAERELSSRFGIRSIPTIMIFKNGQVVD MLNGAVPKAPFDSWLNESL >gi|296493323|gb|ADTK01000178.1| GENE 7 7510 - 8547 1003 345 aa, chain + ## HITS:1 COG:ECs3447 KEGG:ns NR:ns ## COG: ECs3447 COG0566 # Protein_GI_number: 15832701 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Escherichia coli O157:H7 # 1 345 1 345 345 672 100.0 0 MNDEMKGKSGKVKVMYVRSDDDSDKRTHNPRTGKGGGRPGKSRADGGRRPARDDKQSQPR DRKWEDSPWRTVSRAPGDETPEKADHGGISGKSFIDPEVLRRQRAEETRVYGENACQALF QSRPEAIVRAWFIQSVTPRFKEALRWMAANRKAYHVVDEAELTKASGTEHHGGVCFLIKK RNGTTVQQWVSQAGAQDCVLALENESNPHNLGGMMRSCAHFGVKGVVVQDAALLESGAAI RTAEGGAEHVQPITGDNIVNVLDDFRQAGYTVVTTSSEQGKPLFKTSLPAKMVLVLGQEY EGLPDAARDPNDLRVKIDGTGNVAGLNISVATGVLLGEWWRQNKA >gi|296493323|gb|ADTK01000178.1| GENE 8 8595 - 9284 618 229 aa, chain - ## HITS:1 COG:ung KEGG:ns NR:ns ## COG: ung COG0692 # Protein_GI_number: 16130505 # Func_class: L Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Escherichia coli K12 # 1 229 1 229 229 466 100.0 1e-131 MANELTWHDVLAEEKQQPYFLNTLQTVASERQSGVTIYPPQKDVFNAFRFTELGDVKVVI LGQDPYHGPGQAHGLAFSVRPGIAIPPSLLNMYKELENTIPGFTRPNHGYLESWARQGVL LLNTVLTVRAGQAHSHASLGWETFTDKVISLINQHREGVVFLLWGSHAQKKGAIIDKQRH HVLKAPHPSPLSAHRGFFGCNHFVLANQWLEQRGETPIDWMPVLPAESE >gi|296493323|gb|ADTK01000178.1| GENE 9 9589 - 9972 541 127 aa, chain + ## HITS:1 COG:ECs3445 KEGG:ns NR:ns ## COG: ECs3445 COG3445 # Protein_GI_number: 15832699 # Func_class: R General function prediction only # Function: Acid-induced glycyl radical enzyme # Organism: Escherichia coli O157:H7 # 1 127 1 127 127 243 100.0 9e-65 
MITGIQITKAANDDLLNSFWLLDSEKGEARCIVAKAGYAEDEVVAVSKLGDIEYREVPVE VKPEVRVEGGQHLNVNVLRRETLEDAVKHPEKYPQLTIRVSGYAVRFNSLTPEQQRDVIA RTFTESL >gi|296493323|gb|ADTK01000178.1| GENE 10 10028 - 10615 487 195 aa, chain - ## HITS:1 COG:yfiK KEGG:ns NR:ns ## COG: yfiK COG1280 # Protein_GI_number: 16130503 # Func_class: E Amino acid transport and metabolism # Function: Putative threonine efflux protein # Organism: Escherichia coli K12 # 1 195 1 195 195 327 99.0 7e-90 MTPTLLSAFWTYTLITAMTPGPNNILALSSATSHGFRQSTRVLAGMSLGFLIVMLLCAGI SFSLAVIDPAAVHLLSWAGAAYIVWLAWKIATSPTKEDGLQAKPISFWASFALQFVNVKI ILYGVTALSTFVLPQTQALSWVVGVSVLLAMIGTFGNVCWALAGHLFQQLFRQYGRQLNI VLALLLVYCAVRIFY >gi|296493323|gb|ADTK01000178.1| GENE 11 10718 - 11599 515 293 aa, chain + ## HITS:1 COG:yfiE KEGG:ns NR:ns ## COG: yfiE COG0583 # Protein_GI_number: 16130502 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 293 16 308 308 565 98.0 1e-161 MDLRRFITLKTVVEEGSFLRASQKLCCTQSTVTFHIQQLEQEFSVQLFEKIGRRMCLTRE GKKLLPHIYELTRVMDTLREAAKKESDPDGELRVVSGETLLSYRMPQVLQRFRQRAPKVR LSLQSLNCYVIRDALLNDEADVGVFYRVGNDDALNRRELGEQSLVLVASPQIADVDFTEP GRHNACSFIINEPQCVFRQIFESTLRQRRITVENTIELLSIESIKRCVAANIGVSYLPRF AVEKELESGELIELPFGEQSQTITAMCAHHAGKAVSPAMHTFIQCVEESFVAG >gi|296493323|gb|ADTK01000178.1| GENE 12 11808 - 13142 1450 444 aa, chain - ## HITS:1 COG:ECs3442 KEGG:ns NR:ns ## COG: ECs3442 COG0513 # Protein_GI_number: 15832696 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Escherichia coli O157:H7 # 1 444 1 444 444 778 100.0 0 MTVTTFSELELDESLLEALQDKGFTRPTAIQAAAIPPALDGRDVLGSAPTGTGKTAAYLL PALQHLLDFPRKKSGPPRILILTPTRELAMQVADHARELAKHTHLDIATITGGVAYMNHA EVFSENQDIVVATTGRLLQYIKEENFDCRAVETLILDEADRMLDMGFAQDIEHIAGETRW RKQTLLFSATLEGDAIQDFAERLLEDPVEVSANPSTRERKKIHQWYYRADDLEHKTALLV HLLKQPEATRSIVFVRKRERVHELANWLREAGINNCYLEGEMVQGKRNEAIKRLTEGRVN 
VLVATDVAARGIDIPDVSHVFNFDMPRSGDTYLHRIGRTARAGRKGTAISLVEAHDHLLL GKVGRYIEEPIKARVIDELRPKTRAPSEKQTGKPSKKVLAKRAEKKKAKEKEKPRVKKRH RDTKNIGKRRKPSGTGVPPQTTEE >gi|296493323|gb|ADTK01000178.1| GENE 13 13274 - 14011 733 245 aa, chain + ## HITS:1 COG:yfiC KEGG:ns NR:ns ## COG: yfiC COG4123 # Protein_GI_number: 16130500 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Escherichia coli K12 # 1 245 41 285 285 502 97.0 1e-142 MSQSTSVLRRNGFTFKQFFVAHDRCAMKVGTDGILLGAWAPVAGVKRCLDIGAGSGLLAL MLAQRTSDSVIIDAVELESEAAAQAQENINQSPWAERINVHTADIQQWVTQQTARFDLII SNPPYYQQGVECATPQREQARYTTTLDHPSLLTCAAECITEEGFFCVVLPEQIGNGFTEL ALSMGWHLRLRTDVAENEARLPHRVLLAFSPQAGECFSDRLVIRGPDQNYSEAYTALTQA FYLFM >gi|296493323|gb|ADTK01000178.1| GENE 14 13996 - 15618 1189 540 aa, chain - ## HITS:1 COG:nadB KEGG:ns NR:ns ## COG: nadB COG0029 # Protein_GI_number: 16130499 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate oxidase # Organism: Escherichia coli K12 # 1 540 1 540 540 1117 98.0 0 MNTLPEHSCDVLIIGSGAAGLSLALRLADQHQVIVLSKGPVTEGSTFYAQGGIAAVFDET DSIDSHVEDTLIAGAGICDRHAVEFVASNARSCVQWLIDQGVLFDTHVQPNGEESYHLTR EGGHSHRRILHAADATGREVQSTLVSKAQNHPNIRVLERSNAVDLIVSDKIGLPGTRRVV GAWVWNRNKETVETCHAKAVVLATGGASKVYQYTTNPDISSGDGIAMAWRAGCRVANLEF NQFHPTALYHPQARNFLLTEALRGEGAYLKRPDGTRFMPDFDERGELAPRDIVARAIDHE MKRLGADCMFLDISHKPADFIRQHFPMIYEKLLGLGIDLTQEPVPIVPAAHYTCGGVMVD DHGRTDVEGLYAIGEVSYTGLHGANRMASNSLLECLVYGWSAAEDITRRMPDAHGVSTLP PWDESRVENPDERVVIQHNWHELRLFMWDYVGIVRTTKRLERALRRITMLQQEIDEYYAH FRVSNNLLELRNLVQVAELIVRCAMMRKESRGLHFTLDYPELLTHSGPSILSPGNHYINR >gi|296493323|gb|ADTK01000178.1| GENE 15 15874 - 16029 78 51 aa, chain + ## HITS:1 COG:no KEGG:ECBD_1107 NR:ns ## KEGG: ECBD_1107 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_BL21_DE3 # Pathway: not_defined # 1 51 1 51 51 66 100.0 3e-10 MIRLQHDKQKQMRYGTLQKRDTLTLCLLKLQLMEWRFDSAWKFGLGRLYLG >gi|296493323|gb|ADTK01000178.1| GENE 16 16026 - 16601 341 191 aa, chain + ## HITS:1 COG:ECs3439 KEGG:ns NR:ns ## COG: 
ECs3439 COG1595 # Protein_GI_number: 15832693 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Escherichia coli O157:H7 # 1 191 1 191 191 360 100.0 1e-100 MSEQLTDQVLVERVQKGDQKAFNLLVVRYQHKVASLVSRYVPSGDVPDVVQEAFIKAYRA LDSFRGDSAFYTWLYRIAVNTAKNYLVAQGRRPPSSDVDAIEAENFESGGALKEISNPEN LMLSEELRQIVFRTIESLPEDLRMAITLRELDGLSYEEIAAIMDCPVGTVRSRIFRAREA IDNKVQPLIRR >gi|296493323|gb|ADTK01000178.1| GENE 17 16634 - 17284 536 216 aa, chain + ## HITS:1 COG:rseA KEGG:ns NR:ns ## COG: rseA COG3073 # Protein_GI_number: 16130497 # Func_class: T Signal transduction mechanisms # Function: Negative regulator of sigma E activity # Organism: Escherichia coli K12 # 1 216 1 216 216 329 100.0 2e-90 MQKEQLSALMDGETLDSELLNELAHNPEMQKTWESYHLIRDSMRGDTPEVLHFDISSRVM AAIEEEPVRQPATLIPEAQPAPHQWQKMPFWQKVRPWAAQLTQMGVAACVSLAVIVGVQH YNGQSETSQQPETPVFNTLPMMGKASPVSLGVPSEATANNGQQQQVQEQRRRINAMLQDY ELQRRLHSEQLQFEQAQTQQAAVQVPGIQTLGTQSQ >gi|296493323|gb|ADTK01000178.1| GENE 18 17284 - 18240 823 318 aa, chain + ## HITS:1 COG:ECs3437 KEGG:ns NR:ns ## COG: ECs3437 COG3026 # Protein_GI_number: 15832691 # Func_class: T Signal transduction mechanisms # Function: Negative regulator of sigma E activity # Organism: Escherichia coli O157:H7 # 1 318 1 318 318 611 99.0 1e-175 MKQLWFAMSLVTGSLLFSANASATPASGALLQQMNLASQSLNYELSFISINKQGVESLRY RHARLDNRPLAQLLQMDGPRREVVQRGNEISYFEPGLEPFTLNGDYIVDSLPSLIYTDFK RLSPYYDFISVGRTRIADRLCEVIRVVARDGTRYSYIVWMDTESKLPMRVDLLDRDGETL EQFRVIAFNVNQDISSSMQTLAKANLPPLLSVPVGEKAKFSWTPTWLPQGFSEVSSSRRP LPTMDNMPIESRLYSDGLFSFSVNVNRATPSSTDQMLRTGRRTVSTSVRDNAEIPIVGEL PPQTAKRIAENIKFGAAQ >gi|296493323|gb|ADTK01000178.1| GENE 19 18237 - 18716 340 159 aa, chain + ## HITS:1 COG:rseC KEGG:ns NR:ns ## COG: rseC COG3086 # Protein_GI_number: 16130495 # Func_class: T Signal transduction mechanisms # Function: Positive regulator of sigma E activity # Organism: Escherichia coli K12 # 1 159 1 159 159 294 100.0 5e-80 
MIKEWATVVSWQNGQALVSCDVKASCSSCASRAGCGSRVLNKLGPQTTHTIVVPCDEPLV PGQKVELGIAEGSLLSSALLVYMSPLVGLFLIASLFQLLFASDVAALCGAILGGIGGFLI ARGYSRKFAARAEWQPIILSVALPPGLVRFETSSEDASQ >gi|296493323|gb|ADTK01000178.1| GENE 20 18914 - 20713 2083 599 aa, chain + ## HITS:1 COG:ECs3435 KEGG:ns NR:ns ## COG: ECs3435 COG0481 # Protein_GI_number: 15832689 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane GTPase LepA # Organism: Escherichia coli O157:H7 # 1 599 1 599 599 1167 100.0 0 MKNIRNFSIIAHIDHGKSTLSDRIIQICGGLSDREMEAQVLDSMDLERERGITIKAQSVT LDYKASDGETYQLNFIDTPGHVDFSYEVSRSLAACEGALLVVDAGQGVEAQTLANCYTAM EMDLEVVPVLNKIDLPAADPERVAEEIEDIVGIDATDAVRCSAKTGVGVQDVLERLVRDI PPPEGDPEGPLQALIIDSWFDNYLGVVSLIRIKNGTLRKGDKVKVMSTGQTYNADRLGIF TPKQVDRTELKCGEVGWLVCAIKDIHGAPVGDTLTLARNPAEKALPGFKKVKPQVYAGLF PVSSDDYEAFRDALGKLSLNDASLFYEPESSSALGFGFRCGFLGLLHMEIIQERLEREYD LDLITTAPTVVYEVETTSREVIYVDSPSKLPAVNNIYELREPIAECHMLLPQAYLGNVIT LCVEKRGVQTNMVYHGNQVALTYEIPMAEVVLDFFDRLKSTSRGYASLDYNFKRFQASDM VRVDVLINGERVDALALITHRDNSQNRGRELVEKMKDLIPRQQFDIAIQAAIGTHIIARS TVKQLRKNVLAKCYGGDISRKKKLLQKQKEGKKRMKQIGNVELPQEAFLAILHVGKDNK >gi|296493323|gb|ADTK01000178.1| GENE 21 20729 - 21703 1033 324 aa, chain + ## HITS:1 COG:lepB KEGG:ns NR:ns ## COG: lepB COG0681 # Protein_GI_number: 16130493 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Escherichia coli K12 # 1 324 1 324 324 652 100.0 0 MANMFALILVIATLVTGILWCVDKFFFAPKRRERQAAAQAAAGDSLDKATLKKVAPKPGW LETGASVFPVLAIVLIVRSFIYEPFQIPSGSMMPTLLIGDFILVEKFAYGIKDPIYQKTL IETGHPKRGDIVVFKYPEDPKLDYIKRAVGLPGDKVTYDPVSKELTIQPGCSSGQACENA LPVTYSNVEPSDFVQTFSRRNGGEATSGFFEVPKNETKENGIRLSERKETLGDVTHRILT VPIAQDQVGMYYQQPGQQLATWIVPPGQYFMMGDNRDNSADSRYWGFVPEANLVGRATAI WMSFDKQEGEWPTGLRLSRIGGIH >gi|296493323|gb|ADTK01000178.1| GENE 22 21975 - 22655 654 226 aa, chain + ## HITS:1 COG:ECs3433 KEGG:ns NR:ns ## COG: ECs3433 COG0571 # Protein_GI_number: 15832687 # Func_class: K Transcription # Function: dsRNA-specific ribonuclease # 
Organism: Escherichia coli O157:H7 # 1 226 1 226 226 422 100.0 1e-118 MNPIVINRLQRKLGYTFNHQELLQQALTHRSASSKHNERLEFLGDSILSYVIANALYHRF PRVDEGDMSRMRATLVRGNTLAELAREFELGECLRLGPGELKSGGFRRESILADTVEALI GGVFLDSDIQTVEKLILNWYQTRLDEISPGDKQKDPKTRLQEYLQGRHLPLPTYLVVQVR GEAHDQEFTIHCQVSGLSEPVVGTGSSRRKAEQAAAEQALKKLELE >gi|296493323|gb|ADTK01000178.1| GENE 23 22652 - 23557 1011 301 aa, chain + ## HITS:1 COG:era KEGG:ns NR:ns ## COG: era COG1159 # Protein_GI_number: 16130491 # Func_class: R General function prediction only # Function: GTPase # Organism: Escherichia coli K12 # 1 301 1 301 301 600 100.0 1e-172 MSIDKSYCGFIAIVGRPNVGKSTLLNKLLGQKISITSRKAQTTRHRIVGIHTEGAYQAIY VDTPGLHMEEKRAINRLMNKAASSSIGDVELVIFVVEGTRWTPDDEMVLNKLREGKAPVI LAVNKVDNVQEKADLLPHLQFLASQMNFLDIVPISAETGLNVDTIAAIVRKHLPEATHHF PEDYITDRSQRFMASEIIREKLMRFLGAELPYSVTVEIERFVSNERGGYDINGLILVERE GQKKMVIGNKGAKIKTIGIEARKDMQEMFEAPVHLELWVKVKSGWADDERALRSLGYVDD L >gi|296493323|gb|ADTK01000178.1| GENE 24 23569 - 24297 623 242 aa, chain + ## HITS:1 COG:ECs3431 KEGG:ns NR:ns ## COG: ECs3431 COG1381 # Protein_GI_number: 15832685 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Escherichia coli O157:H7 # 1 242 1 242 242 470 100.0 1e-132 MEGWQRAFVLHSRPWSETSLMLDVFTEESGRVRLVAKGARSKRSTLKGALQPFTPLLLRF GGRGEVKTLRSAEAVSLALPLSGITLYSGLYINELLSRVLEYETRFSELFFDYLHCIQSL AGVTGTPEPALRRFELALLGHLGYGVNFTHCAGSGEPVDDTMTYRYREEKGFIASVVIDN KTFTGRQLKALNAREFPDADTLRAAKRFTRMALKPYLGGKPLKSRELFRQFMPKRTVKTH YE >gi|296493323|gb|ADTK01000178.1| GENE 25 24309 - 25040 998 243 aa, chain + ## HITS:1 COG:ECs3430 KEGG:ns NR:ns ## COG: ECs3430 COG0854 # Protein_GI_number: 15832684 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal phosphate biosynthesis protein # Organism: Escherichia coli O157:H7 # 1 243 1 243 243 438 100.0 1e-123 MAELLLGVNIDHIATLRNARGTAYPDPVQAAFIAEQAGADGITVHLREDRRHITDRDVRI LRQTLDTRMNLEMAVTEEMLAIAVETKPHFCCLVPEKRQEVTTEGGLDVAGQRDKMRDAC 
KRLADAGIQVSLFIDADEEQIKAAAEVGAPFIEIHTGCYADAKTDAEQAQELARIAKAAT FAASLGLKVNAGHGLTYHNVKAIAAIPEMHELNIGHAIIGRAVMTGLKDAVAEMKRLMLE ARG >gi|296493323|gb|ADTK01000178.1| GENE 26 25040 - 25420 293 126 aa, chain + ## HITS:1 COG:acpS KEGG:ns NR:ns ## COG: acpS COG0736 # Protein_GI_number: 16130488 # Func_class: I Lipid transport and metabolism # Function: Phosphopantetheinyl transferase (holo-ACP synthase) # Organism: Escherichia coli K12 # 1 126 1 126 126 240 100.0 5e-64 MAILGLGTDIVEIARIEAVIARSGDRLARRVLSDNEWAIWKTHHQPVRFLAKRFAVKEAA AKAFGTGIRNGLAFNQFEVFNDELGKPRLRLWGEALKLAEKLGVANMHVTLADERHYACA TVIIES >gi|296493323|gb|ADTK01000178.1| GENE 27 26115 - 26375 256 86 aa, chain - ## HITS:1 COG:yfhL KEGG:ns NR:ns ## COG: yfhL COG1145 # Protein_GI_number: 16130487 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Escherichia coli K12 # 1 86 1 86 86 153 100.0 6e-38 MALLITKKCINCDMCEPECPNEAISMGDHIYEINSDKCTECVGHYETPTCQKVCPIPNTI VKDPAHVETEEQLWDKFVLMHHADKI >gi|296493323|gb|ADTK01000178.1| GENE 28 26431 - 27279 815 282 aa, chain - ## HITS:1 COG:ECs3427 KEGG:ns NR:ns ## COG: ECs3427 COG1737 # Protein_GI_number: 15832681 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 282 25 306 306 525 100.0 1e-149 MNGLLRIRQRYQGLAQSDKKLADYLLLQPDTARHLSSQQLANEAGVSQSSVVKFAQKLGY KGFPALKLALSEALASQPESPSVPIHNQIRGDDPLRLVGEKLIKENTAAMYATLNVNSEE KLHECVTMLRSARRIILTGIGASGLVAQNFAWKLMKIGFNAAAVRDMHALLATVQASSPD DLLLAISYTGVRRELNLAADEMLRVGGKVLAITGFTPNALQQRASHCLYTIAEEQATNSA SISACHAQGMLTDLLFIALIQQDLELAPERIRHSEALVKKLV >gi|296493323|gb|ADTK01000178.1| GENE 29 27488 - 28123 481 211 aa, chain + ## HITS:1 COG:STM2569 KEGG:ns NR:ns ## COG: STM2569 COG0560 # Protein_GI_number: 16765889 # Func_class: E Amino acid transport and metabolism # Function: Phosphoserine phosphatase # Organism: Salmonella typhimurium LT2 # 1 211 1 211 211 365 91.0 1e-101 MATHERRVVFFDLDGTLHQQDMFGSFLRYLLRRQPLNALLVLPLLPIIAIALLIKGRAAR 
WPMSLLLWGCTFGHSEARLQTLQADFVRWFRDNVTAFPLVQERLTTYLLSSDADIWLITG SPQPLVEAVYFDTPWLPRVNLIASQIQRGYGGWVLTMRCLGHEKVAQLERKIGTPLRLYS GYSDSNQDNPLLYFCQHRWRVTPRGELQQLE >gi|296493323|gb|ADTK01000178.1| GENE 30 28181 - 28684 472 167 aa, chain + ## HITS:1 COG:yfhC KEGG:ns NR:ns ## COG: yfhC COG0590 # Protein_GI_number: 16130484 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Escherichia coli K12 # 1 167 12 178 178 337 99.0 5e-93 MSEVEFSHEYWMRHALTLAKRAWDEREVPVGAVLVHNNRVIGEGWNRPIGRHDPTAHAEI MALRQGGLVMQNYRLIDATLYVTLEPCVMCAGAMIHSRIGRVVFGARDAKTGAAGSLMDV LHHPGMNHRVEITEGILADECAALLSDFFRMRRQEIKAQKKAQSSTD >gi|296493323|gb|ADTK01000178.1| GENE 31 28681 - 30237 1743 518 aa, chain - ## HITS:1 COG:yfhD KEGG:ns NR:ns ## COG: yfhD COG4623 # Protein_GI_number: 16130483 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted soluble lytic transglycosylase fused to an ABC-type amino acid-binding protein # Organism: Escherichia coli K12 # 47 518 1 472 472 913 99.0 0 MKKLKINYLFIGILALLLAVALWPSIPWFGKADNRIAAIQARGELRVSTIHTPLTYNEIN GKPFGLDYELAKQFADYLGVKLKVTVRQNISQLFDDLDNGNADLLAAGLVYNSERVKNYQ PGPTYYSVSQQLVYKVGQYRPRTLGNLTAEQLTVAPGHVVVNDLQTLKETKFPELSWKVD DKKGSAELMEDVIEGKLDYTIADSVAISLFQRVHPELAVALDITDEQPVTWFSPLDGDNT LSAALLDFFNEMNEDGTLARIEEKYLGHGDDFDYVDTRTFLRAVDAVLPQLKPLFEKYAE EIDWRLLAAIAYQESHWDAQATSPTGVRGMMMLTKNTAQSLGITDRTDAEQSISGGVRYL QDMMSKVPESVPENERIWFALAAYNMGYAHMLDARALTAKTKGNPDSWADVKQRLPLLSQ KPYYSKLTYGYARGHEAYAYVENIRKYQISLVGYLQEKEKQATEAAMQLAQDYPAVSPTE LGKEKFPFLSFLSQSSSNYLTHSPSLLFSRKGSEEKQN >gi|296493323|gb|ADTK01000178.1| GENE 32 30498 - 34382 4297 1294 aa, chain + ## HITS:1 COG:ECs3423_1 KEGG:ns NR:ns ## COG: ECs3423_1 COG0046 # Protein_GI_number: 15832677 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain # Organism: Escherichia coli O157:H7 # 1 984 2 985 985 1994 99.0 0 
MEILRGSPALSAFRINKLLARFQAARLPVHNIYAEYVHFADLNAPLNDDEHAQLERLLKY GPALASHAPQGKLLLVTPRPGTISPWSSKATDIAHNCGLQQVNRLERGVAYYIEAGTLTN EQWQQVTAELHDRMMETVFFDLDDAEQLFAHHQPTPVTSVDLLGQGRQALIDANLRLGLA LAEDEIDYLQDAFTKLGRNPNDIELYMFAQANSEHCRHKIFNADWIIDGEQQPKSLFRMI KNTFETTPDHVLSAYKDNAAVMEGSEVGRYFADHETGRYDFHQEPAHILMKVETHNHPTA ISPWPGAATGSGGEIRDEGATGRGAKPKAGLVGFSVSNLRIPGFEQPWEEDFGKPERIVT ALDIMTEGPLGGAAFNNEFGRPALNGYFRTYEEKVNSHNGEELRGYHKPIMLAGGIGNIR ADHVQKGEINVGAKLVVLGGPAMNIGLGGGAASSMASGQSDADLDFASVQRDNPEMERRC QEVIDRCWQLGDANPILFIHDVGAGGLSNAMPELVSDGGRGGKFELRDILSDEPGMSPLE IWCNESQERYVLAIAADQLPLFDELCKRERAPYAVIGEATEELHLSLHDRHFDNQPIDLP LDVLLGKTPKMTRDVQTLKAKGDALAREGITIADAVKRVLHLPTVAEKTFLVTIGDRSVT GMVARDQMVGPWQVPVANCAVTTASLDSYYGEAMAIGERAPVALLDFAASARLAVGEALT NIAATQIGDIKRIKLSANWMAAAGHPGEDAGLYEAVKAVGEELCPALGLTIPVGKDSMSM KTRWQEGNEEREMTSPLSLVISAFARVEDVRHTITPQLSTEDNALLLIDLGKGNNALGAT ALAQVYRQLGDKPADVRDVAQLKGFYDAIQALVAQRKLLAYHDRSDGGLLVTLAEMAFAG HCGINADIASLGDDRLAALFNEELGAVIQVRAADREAVESVLAQHGLADCVHYVGQAVSG DRFVITANGQTVFSESRTTLRVWWAETTWQMQRLRDNPECADQEHQAKSNDADPGLNVKL SFDINEDVAAPYIATGARPKVAVLREQGVNSHVEMAAAFHRAGFDAIDVHMSDLLTGRTG LEDFHALVACGGFSYGDVLGAGEGWAKSILFNDRVRDEFATFFHRPQTLALGVCNGCQMM SNLRELIPGSELWPRFVRNTSDRFEARFSLVEVTQSPSLLLQGMVGSQMPIAVSHGEGRV EVRDAAHLAALESKGLVALRYVDNFGKVTETYPANPNGSPNGITAVTTESGRVTIMMPHP ERVFRTVSNSWHPENWGEDGPWMRIFRNARKQLG >gi|296493323|gb|ADTK01000178.1| GENE 33 34985 - 36367 1035 460 aa, chain + ## HITS:1 COG:ECs3422 KEGG:ns NR:ns ## COG: ECs3422 COG0642 # Protein_GI_number: 15832676 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli O157:H7 # 15 460 51 496 496 840 99.0 0 MLAFLLILLPLLVLAWQAWQSLNALSDQAALVNRTTLIDARRSEAMTNAALEMERSYRQY CVLDDPTLAKVYQSQRKRYSEMLDAHAGVLPDDKLYQALRQDLNNLAQLQCNNSGPDAAA AARLEAFASANTEMVQATRTVVFSRGQQLQREIAERGQYFGWQSLVLFLVSLVMVLLFTR MIIGPVKNIERMINRLGEGRSLGNSVSFSGPSELRSVGQRILWLSERLSWLESQRHQFLR HLSHELKTPLASMREGTELLADQVVGPLTPEQKEVVSILDSSSRNLQKLIEQLLDYNRKQ 
ADSAVELENVELAPLVETVVSAHSLPARAKMMHTDVDLKATACLAEPMLLMSVLDNLYSN AVHYGAESGNICLRSSSHGARVYIDVINTGTPIPQEERAMIFEPFFQGSHQRKGAVKGSG LGLSIARDCIRRMQGELYLVDESGQDVCFRIELPSSKNTK >gi|296493323|gb|ADTK01000178.1| GENE 34 36532 - 37245 436 237 aa, chain + ## HITS:1 COG:no KEGG:EcHS_A2708 NR:ns ## KEGG: EcHS_A2708 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_HS # Pathway: not_defined # 1 237 3 239 239 410 99.0 1e-113 MRHIFQRLLPRRLWLAGLPCLALLGCVQNHNKPAIDTPAEEKIPVYQLADYLSTECSDIW ALQGKSTETNPLYWLRAMDCADRLMPAQSRQQARQYDDGSWQNTFKQGILLADAKITPYE RRQLVARIEALSTEIPAQVRPLYQLWRDGQALQLQLAEERQRYSKLQQSSDSELDTLRQQ HHVLQQQLELTTRKLENLTDIERQLSTRKPAGNFSPDTPHESEKPAPSTHEVTPDEP >gi|296493323|gb|ADTK01000178.1| GENE 35 37235 - 38569 1308 444 aa, chain + ## HITS:1 COG:ECs3420 KEGG:ns NR:ns ## COG: ECs3420 COG2204 # Protein_GI_number: 15832674 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli O157:H7 # 1 444 1 444 444 837 100.0 0 MSHKPAHLLLVDDDPGLLKLLGLRLTSEGYSVVTAESGAEGLRVLNREKVDLVISDLRMD EMDGMQLFAEIQKVQPGMPVIILTAHGSIPDAVAATQQGVFSFLTKPVDKDALYQAIDDA LEQSAPATDERWREAIVTRSPLMLRLLEQARLVAQSDVSVLINGQSGTGKEIFAQAIHNA SPRNSKPFIAINCGALPEQLLESELFGHARGAFTGAVSNREGLFQAAEGGTLFLDEIGDM PAPLQVKLLRVLQERKVRPLGSNRDIDINVRIISATHRDLPKAMARGEFREDLYYRLNVV SLKIPALAERTEDIPLLANHLLRQAAERHKPFVRAFSTDAMKRLMTASWPGNVRQLVNVI EQCVALTSSPVISDALVEQALEGENTALPTFVEARNQFELNYLRKLLQITKGNVTHAARM AGRNRTEFYKLLSRHELDANDFKE >gi|296493323|gb|ADTK01000178.1| GENE 36 38630 - 38968 498 112 aa, chain + ## HITS:1 COG:ECs3419 KEGG:ns NR:ns ## COG: ECs3419 COG0347 # Protein_GI_number: 15832673 # Func_class: E Amino acid transport and metabolism # Function: Nitrogen regulatory protein PII # Organism: Escherichia coli O157:H7 # 1 112 1 112 112 191 100.0 4e-49 MKKIDAIIKPFKLDDVREALAEVGITGMTVTEVKGFGRQKGHTELYRGAEYMVDFLPKVK IEIVVPDDIVDTCVDTIIRTAQTGKIGDGKIFVFDVARVIRIRTGEEDDAAI >gi|296493323|gb|ADTK01000178.1| GENE 37 39013 - 
40203 1279 396 aa, chain - ## HITS:1 COG:ZhmpA_2 KEGG:ns NR:ns ## COG: ZhmpA_2 COG1018 # Protein_GI_number: 15803077 # Func_class: C Energy production and conversion # Function: Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 # Organism: Escherichia coli O157:H7 EDL933 # 150 396 1 247 247 513 99.0 1e-145 MLDAQTIATVKATIPLLVETGPKLTAHFYDRMFTHNPELKEIFNMSNQRNGDQREALFNA IAAYASNIENLPALLPAVEKIAQKHTSFQIKPEQYNIVGEHLLATLDEMFSPGQEVLDAW GKAYGVLANVFINREAEIYNENASKAGGWEGTRDFRIVAKTPRSALITSFELEPVDGGAV AEYRPGQYLGVWLKPEGFPHQEIRQYSLTRKPDGKGYRIAVKREEGGQVSNWLHNHANVG DVVKLVAPAGDFFMAVADDTPVTLISAGVGQTPMLAMLDTLAKAGHTAQVNWFHAAENGD VHAFADEVKELGQSLPRFTAHTWYRQPSEADRAKGQFDSEGLMDLSKQEGAFSDPTMQFY LCGPVGFMQFAAKQLVDLGVKQENIHYECFGPHKVL >gi|296493323|gb|ADTK01000178.1| GENE 38 40531 - 41784 1542 417 aa, chain + ## HITS:1 COG:glyA KEGG:ns NR:ns ## COG: glyA COG0112 # Protein_GI_number: 16130476 # Func_class: E Amino acid transport and metabolism # Function: Glycine/serine hydroxymethyltransferase # Organism: Escherichia coli K12 # 1 417 1 417 417 811 100.0 0 MLKREMNIADYDAELWQAMEQEKVRQEEHIELIASENYTSPRVMQAQGSQLTNKYAEGYP GKRYYGGCEYVDIVEQLAIDRAKELFGADYANVQPHSGSQANFAVYTALLEPGDTVLGMN LAHGGHLTHGSPVNFSGKLYNIVPYGIDATGHIDYADLEKQAKEHKPKMIIGGFSAYSGV VDWAKMREIADSIGAYLFVDMAHVAGLVAAGVYPNPVPHAHVVTTTTHKTLAGPRGGLIL AKGGSEELYKKLNSAVFPGGQGGPLMHVIAGKAVALKEAMEPEFKTYQQQVAKNAKAMVE VFLERGYKVVSGGTDNHLFLVDLVDKNLTGKEADAALGRANITVNKNSVPNDPKSPFVTS GIRVGTPAITRRGFKEAEAKELAGWMCDVLDSINDEAVIERIKGKVLDICARYPVYA >gi|296493323|gb|ADTK01000178.1| GENE 39 41857 - 43050 1212 397 aa, chain - ## HITS:1 COG:ECs3416 KEGG:ns NR:ns ## COG: ECs3416 COG1940 # Protein_GI_number: 15832670 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Escherichia coli O157:H7 # 1 397 3 399 399 813 99.0 0 MRACINNQQIRHHNKCVILELLYRQKRANKSTLARLAQISIPAVSNILQELESEKRVVNI DDESQTRGHSSGTWLIAPEGDWTLCLNVTPTSIECQVANACLSPKGEFEYLQIDAPTPQA 
LLSEIEKCWHRHRKLWPDRTINLALAIHGQVDPVTGVSQTMPQAPWTTPVEVKYLLEEKL GIRVMVDNDCVMLALAEKWQNNSQERDFCVINVDYGIGSSFVINEQIYRGSLYGSGQIGH TIVNPDGVVCDCGRYGCLETVASLSALKKQARVWLKSQPVNTQLDPEKLTTAQLIAAWQS GEPWITSWVDRSANAIGLSLYNFLNILNINQIWMYGRSCAFGENWLNTIIRQTGFNPFDR DEGPSVKATQIGFGQLSRAQQVLGIGYLYVEAQLRQI >gi|296493323|gb|ADTK01000178.1| GENE 40 43168 - 44151 1078 327 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_3782 NR:ns ## KEGG: ECH74115_3782 # Name: not_defined # Def: tetratricopeptide repeat protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 327 32 358 1124 698 99.0 0 MTPVKVWQERVEIPTYETGPQDIHPMFLENRVYQGSSGAVYPYGVTDTLSEQKTLKSWQA VWLENDYIKVMILPELGGRVHRAWDKVKQRDFVYHNEVIKPALVGLLGPWISGGIEFNWP QHHRPTTFMPVDFTLEAHEDGAQTVWVGETEPMHGLQVMTGFTLRPEQAALEIASRVYNG NATPRHFLWWANPAVKGGEGHQSVFPPDVTAVFDHGKRAVSAFPIATGTYYKVDYSAGVD ISRYKNVPVPTSYMAEKSQYDFVGAWCHDEDGGLLHVANHHIAPGKKQWSWGHSEFGQAW DKSLTDNNGPYIELMTGIFADNQPDFT >gi|296493323|gb|ADTK01000178.1| GENE 41 44212 - 46449 1980 745 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_2834 NR:ns ## KEGG: EcE24377A_2834 # Name: not_defined # Def: TPR repeat-containing protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 745 380 1124 1124 1486 99.0 0 MVQNASRDAVIKLQRSERGIEWGLYAISPLNGYRLAIREIGKCNALLDDAVALMPATAIQ GVLHGINPERLTIELSDADGNIVLSYQEHQPQELPLPDVAKAPLAAQDITSTDEAWFIGQ HLEQYHHASRSPFDYYLRGVALDPLDYRCNLALAMLEYNRADFPQAVAYATQALKRAHAL NKNPQCGQASLIRASAYERQGQYQQAEEDFWRAVWSGNSKAGGYYGLARLAARNGNFDAG LDFCQQSLRACPTNQEVLCLHNLLLVLSGRQDNARVQREKLLRDYPLNATLWWLNWFDGR SESALAQWRGLCQGRDVNALMTAGQLINWGMPTLAAEMLNALDCQRTLPLYLQASLLPKA ERGELVAKAIDVFPQFVRFPNTLEEVAALESIEECWFARHLLACFYYNKRSYNKAITLWQ RCVEMSPEFADGWRGLAIHAWNKQHDYELAARYLDNAYQLAPQDARLLFERDLLDKLSGA TPEKRLARLENNLEIALKRDDMTAELLNLWHLTGQADKAADILATRKFHPWEGGEGKVTS QFILNQLLRAWQHLDARQPQQASELLHAALHYPENLSEGRLPGQTDNDIWFWQAICANAQ GDETEAMRCLRLAATGDRTINIHSYYNDQPVDYLFWQGMALRLLGEQHTAQQLFSEMKQW AQEMAKTSIEADFFAVSQPDLLSLYGDLQQQHKEKCLMVAMLASAGLGEVAQYESARAEL TAINPAWPKAALFTTVMPFIFNYVH >gi|296493323|gb|ADTK01000178.1| GENE 42 46546 - 47529 
1066 327 aa, chain + ## HITS:1 COG:ECs3414 KEGG:ns NR:ns ## COG: ECs3414 COG1879 # Protein_GI_number: 15832668 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 1 327 1 327 327 601 99.0 1e-172 MPKKMRTTRNLLLMATLLGSALFARAAEKEMTIGAIYLDTQGYYAGVRQGVQDAAKDSSV QVQLIETNAQGDISKESTFVDTLVARNVDAIILSAVSENGSSRTVRRASEAGIPVICYNT CINQKGVDKYVSAYLVGDPLEFGKKLGNAAADYFIANKIDQPKIAVINCEAFEVCVQRRK GFEEVLKSRVPGAQIVANQEGTVLDKAISVGEKLIISTPDLNAIMGESGGATLGAVKAVR NQNQAGKIAVFGSDMTTEIAQELENNQVLKAVVDISGKKMGNAVLAQTLKVINKQADGEK VIQVPIDLYTKTEDGKQWLATHVDGLP >gi|296493323|gb|ADTK01000178.1| GENE 43 47552 - 49063 188 503 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 278 480 21 212 305 77 27 2e-13 MFTATEAVPVAKVVAGNKRYPGVVALDNVNFTLNKGEVRALLGKNGAGKSTLIRMLTGSE RPDSGDIWIGETRLEGDEATLTRRAAELGGRAVYQELSLVEGLTVAENLCLGQWPRRNGM IDYLQMAQDAQRCLQALGVDVSPEQLVSTLSPAQKQLVEIARVMKGEPRVVILDEPTSSL ASAEVELVISAVKKMSALGVAVIYVSHRMEEIRRIASCATVMRDGQVAGDVMLENTSTHH IVSLMLGRDHVDIAPVAPQEIVDQAVLEVRALRHKPKLEDISFTLRRGEVLGIAGLLGAG RSELLKAIVGLEEYEQGEIVINGEKITRPDYGDMLKRGIGYTPENRKEAGIIPWLGVDEN TVLTNRQKISANGVLQWSTIRRLTEEVMQRMTVKAASSETPIGTLSGGNQQKVVIGRWVY AASQILLLDEPTRGVDIEAKQQIYRIVRELAAEGKSVVFISSEVEELPLVCDRILLLQHG TFSQEFHSPVNVDELMSAILSVH >gi|296493323|gb|ADTK01000178.1| GENE 44 49088 - 50086 1089 332 aa, chain + ## HITS:1 COG:ECs3412 KEGG:ns NR:ns ## COG: ECs3412 COG1172 # Protein_GI_number: 15832666 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Escherichia coli O157:H7 # 1 332 1 332 332 493 99.0 1e-139 MSASSLPLPQGKSVSLKQFVSRHINEIGLLVVIAILYLVFSLNAPGFISLNNQMNVLRDA ATIGIAAWAMTLIIISGEIDVSVGPMVAFVSVCLAFLLQFEVPLAIACLLVLLLGALMGT LAGVLRGVFNVPSFVATLGLWSALRGMGLFMTNALPVPIDENEVLDWLGGQFLGVPVSAL 
IMIVLFALFVFISRKTAFGRSVFAVGGNATAAQLCGINVRRVRILIFTLSGLLAAVTGIL LAARLGSGNAGAANGLEFDVIAAVVVGGTALSGGRGSLFGTLLGVLVITLIGNGLVLLGI NSFFQQVVRGVIIVVAVLANILLTQRSSKAKR >gi|296493323|gb|ADTK01000178.1| GENE 45 50161 - 51213 1027 350 aa, chain + ## HITS:1 COG:yphC KEGG:ns NR:ns ## COG: yphC COG1063 # Protein_GI_number: 16130470 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 350 15 364 364 723 99.0 0 MLAAYLPGNSTVDLREVAVPTPGINQVLIKMKSSGICGSDVHYIYHQHRATAAAPDKPLY QGFINGHEPCGQIVAMGQGCRHFKEGDRVLVYHISGCGFCPNCRRGFPISCTGKGKAAYG WQRDGGHAEYLLAEEKDLILLPDALSYEDGAFISCGVGTAYEGILRGEVSGSDNVLVVGL GPVGMMAMMLAKGRGAKRIIGVDMLPERLAMAKQLGVMDHGYLATTEGLPQIIAELTHGG ADVALDCSGNAAGRLLALQSTADWGRVVYIGETGKVEFEVSADLMHHQRRIIGSWVTSLF HMEKCAHDLTDWKLWPRNAITHRFSLEQAGDAYALMASGKCGKVVINFPD >gi|296493323|gb|ADTK01000178.1| GENE 46 51225 - 52097 777 290 aa, chain + ## HITS:1 COG:yphB KEGG:ns NR:ns ## COG: yphB COG2017 # Protein_GI_number: 16130469 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Escherichia coli K12 # 1 290 1 290 290 596 100.0 1e-170 MTIYTLSHGSLKLDVSDQGGVIEGFWRDTTPLLRPGKKSGVATDASCFPLVPFANRVSGN RFVWQGREYQLQPNVEWDAHYLHGDGWLGEWQCVSHSDDSLCLVYEHRSGVYHYRVSQAF HLTADTLTVTLSVTNQGAETLPFGTGWHPYFPLSPQTRIQAQASGYWLEREQWLAGEFCE QLPQELDFNQPAPLPRQWVNNGFAGWNGQARIEQPQEGYAIIMETTPPAPCYFIFVSDPA FDKGYAFDFFCLEPMSHAPDDHHRPEGGDLIALAPGESTTSEMSLRVEWL >gi|296493323|gb|ADTK01000178.1| GENE 47 52145 - 52567 449 140 aa, chain - ## HITS:1 COG:yphA KEGG:ns NR:ns ## COG: yphA COG2259 # Protein_GI_number: 16130468 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 140 25 164 164 236 100.0 7e-63 MNTLRYFDFGAARPVLLLIARIAVVLIFIIFGFPKMMGFDGTVQYMASLGAPMPMLAAII AVVMEVPAAILIVLGFFTRPLAVLFIFYTLGTAVIGHHYWDMTGDAVGPNMINFWKNVSI AGAFLLLAITGPGAISLDRR 
>gi|296493323|gb|ADTK01000178.1| GENE 48 52664 - 53866 1003 400 aa, chain - ## HITS:1 COG:ECs3408 KEGG:ns NR:ns ## COG: ECs3408 COG0446 # Protein_GI_number: 15832662 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Escherichia coli O157:H7 # 1 400 1 400 400 747 97.0 0 MKEKTIIIVGGGQAAAMAAASLRQQGFTGELHLFSDERHLPYERPPLSKSMLLEDSPQLQ QVLPANWWQENNVHLHSGVTIKTLGRDTRELVLTNGESWHWDQLFIATGAAARPLPLLDA LGERCFTLRHAGDAARLREVLQPERSVVIVGAGTIGLELAASATQRRCKVTVIELAATVM GRNAPPPVQRYLLQRHQQAGVRILLNNAIEHVVDGEKVELTLQSGETLQADVVIYGIGIS ANEQLAREANLDTANGIVIDEACRTCDPAIFAGGDVAITRLDNGALHRCESWENANNQAQ IAAAAMLGLPLPLLPPPWFWSDQYSDNLQFIGDMRGDDWLCRGNPETQKAIWFNLQNGVL IGAVTLNQGREIRPIRKWIQSGKTFDAKLLIDENIALKSL >gi|296493323|gb|ADTK01000178.1| GENE 49 53876 - 54688 797 270 aa, chain - ## HITS:1 COG:hcaB KEGG:ns NR:ns ## COG: hcaB COG1028 # Protein_GI_number: 16130466 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Escherichia coli K12 # 1 270 1 270 270 507 99.0 1e-144 MSDLHNESIFITGGGSGLGLALVERFIEEGAQVATLELSAAKVASLRQRFGEHILAVEGN VTCYADYQRAVDQILTRSGKLDCFIGNAGIWDHNASLVNTPAETLETGFHELFNVNVLGY LLGAKACAPALIASEGSMIFTLSNAAWYPGGGGPLYTASKHAATGLIRQLAYELAPKVRV NGVGPCGMASDLRGPQALGQSETSIMQSLTPEKIAAILPLQFFPQPADFTGPYVMLASRR NNRALSGVMINADAGLAIRGIRHVAAGLDL >gi|296493323|gb|ADTK01000178.1| GENE 50 54685 - 55005 172 106 aa, chain - ## HITS:1 COG:ECs3406 KEGG:ns NR:ns ## COG: ECs3406 COG2146 # Protein_GI_number: 15832660 # Func_class: P Inorganic ion transport and metabolism; R General function prediction only # Function: Ferredoxin subunits of nitrite reductase and ring-hydroxylating dioxygenases # Organism: Escherichia coli O157:H7 # 1 106 1 106 106 209 100.0 1e-54 MNRIYACPVADVPEGEALRIDTSPVIALFNVGGEFYAINDRCSHGNASMSEGYLEDDATV 
ECPLHAASFCLKTGKALCLPATDPLTTYPVHVEGGDIFIDLPEAQP >gi|296493323|gb|ADTK01000178.1| GENE 51 55005 - 55523 434 172 aa, chain - ## HITS:1 COG:hcaA2 KEGG:ns NR:ns ## COG: hcaA2 COG5517 # Protein_GI_number: 16130464 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Small subunit of phenylpropionate dioxygenase # Organism: Escherichia coli K12 # 1 172 1 172 172 320 100.0 1e-87 MSAQVSLELHHRISQFLFHEASLLDDWKFRDWLAQLDEEIRYTMRTTVNAQTRDRRKGVQ PPTTWIFNDTKDQLERRIARLETGMAWAEEPPSRTRHLISNCQISETDIPNVFAVRVNYL LYRAQKERDETFYVGTRFDKVRRLEDDNWRLLERDIVLDQAVITSHNLSVLF >gi|296493323|gb|ADTK01000178.1| GENE 52 55520 - 56881 1338 453 aa, chain - ## HITS:1 COG:ECs3404 KEGG:ns NR:ns ## COG: ECs3404 COG4638 # Protein_GI_number: 15832658 # Func_class: P Inorganic ion transport and metabolism; R General function prediction only # Function: Phenylpropionate dioxygenase and related ring-hydroxylating dioxygenases, large terminal subunit # Organism: Escherichia coli O157:H7 # 1 453 1 453 453 952 100.0 0 MTTPSDLNIYQLIDTQNGRVTPRIYTDPDIYQLELERIFGRCWLFLAHESQIPKPGDFFN TYMGEDAVVVVRQKDGSIKAFLNQCRHRAMRVSYADCGNTRAFTCPYHGWSYGINGELID VPLEPRAYPQGLCKSHWGLNEVPCVESYKGLIFGNWDTSAPGLRDYLGDIAWYLDGMLDR REGGTEIVGGVQKWVINCNWKFPAEQFASDQYHALFSHASAVQVLGAKDDGSDKRLGDGQ TARPVWETAKDALQFGQDGHGSGFFFTEKPDANVWVDGAVSSYYRETYAEAEQRLGEVRA LRLAGHNNIFPTLSWLNGTATLRVWHPRGPDQVEVWAFCITDKAASDEVKAAFENSATRA FGPAGFLEQDDSENWCEIQKLLKGHRARNSKLCLEMGLGQEKRRDDGIPGITNYIFSETA ARGMYQRWADLLSSESWQEVLDKTAAYQQEVMK >gi|296493323|gb|ADTK01000178.1| GENE 53 57017 - 57907 719 296 aa, chain + ## HITS:1 COG:ECs3403 KEGG:ns NR:ns ## COG: ECs3403 COG0583 # Protein_GI_number: 15832657 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 296 1 296 296 568 100.0 1e-162 MELRHLRYFVAVAQALNFTRAAEKLHTSQPSLSSQIRDLENCVGVPLLVRDKRKVALTAA GECFLQDALAILEQAENAKLRARKIVQEDRQLTIGFVPSAEVNLLPKVLPMFRLRQPDTL IELVSLITTQQEEKIRRGELDVGLMRHPVYSPEIDYLELFDEPLVVVLPVDHPLAHEKEI 
TAAQLDGVNFVSTDPAYSGSLAPIVKAWFAQENSQPNIVQVATNILVTMNLVGMGLGVTL IPGYMNNFNTGQVVFRPIAGNVPSIALLMAWKKGEMKPALRDFIAIVQERLASVTA >gi|296493323|gb|ADTK01000178.1| GENE 54 58067 - 59206 958 379 aa, chain + ## HITS:1 COG:hcaT KEGG:ns NR:ns ## COG: hcaT COG0477 # Protein_GI_number: 16130461 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 379 1 379 379 636 100.0 0 MVLQSTRWLALGYFTYFFSYGIFLPFWSVWLKGIGLTPETIGLLLGAGLVARFLGSLLIA PRVSDPSRLISALRVLALLTLLFAVAFWAGAHVAWLMLVMIGFNLFFSPLVPLTDALANT WQKQFPLDYGKVRLWGSVAFVIGSALTGKLVTMFDYRVILALLTLGVASMLLGFLIRPTI QPQGASRQQESTGWSAWLALVRQNWRFLACVCLLQGAHAAYYGFSAIYWQAAGYSASAVG YLWSLGVVAEVIIFALSNKLFRRCSARDMLLISAICGVVRWGIMGATTALPWLIVVQILH CGTFTVCHLAAMRYIAARQGSEVIRLQAVYSAVAMGGSIAIMTVFAGFLYQYLGHGVFWV MALVALPAMFLRPKVVPSC >gi|296493323|gb|ADTK01000178.1| GENE 55 59198 - 60478 829 426 aa, chain - ## HITS:1 COG:ECs3401 KEGG:ns NR:ns ## COG: ECs3401 COG3711 # Protein_GI_number: 15832655 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Escherichia coli O157:H7 # 1 426 8 433 433 808 98.0 0 MMPTLAPPSVLSAPQRRCQILLTLFQPGLTATTATFSELNGVDDDIASLDISETGQEILR YHQLTLTAGYDGSYRVEGTVLNQRLCLFHWLRRGFRLCPSFITSHFTPALKSELKRRGIA RNFYDDTNLQALVNLCSRRLQKRFETRDIHFLCLYLQYCLLQHHAGITPQFNPLQRRWAE SCLEFQVAQEIGRHWQRRALQPVPPDEPLFMALLFSMLRVPDPLRDAHQRDRQLRQSIKR LVNHFRELGNVRFYDEQGLCDQLYTHLAQALNRSLFAIGIDNTLPEEFARLYPRLVRTTR AALAGFESEYGVHLSDEESGLVAVIFGAWLMQENDLHEKQIILLTGNDSEREAQIEQQLR ELTLLPLNIKHMSVKAFLQTGAPRGAALIIAPYTMPLPLFSPPLIYTDLTLTTHQQEQIR KMLESA >gi|296493323|gb|ADTK01000178.1| GENE 56 60669 - 61523 668 284 aa, chain - ## HITS:1 COG:yfhR KEGG:ns NR:ns ## COG: yfhR COG1073 # Protein_GI_number: 16130459 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Escherichia coli K12 # 1 284 10 
293 293 587 100.0 1e-168 MALPVNKRVPKILFILFVVAFCVYLVPRVAINFFYYPDDKIYGPDPWSAESVEFTAKDGT RLQGWFIPSSTGPADNAIATIIHAHGNAGNMSAHWPLVSWLPERNFNVFMFDYRGFGKSK GTPSQAGLLDDTQSAINVVRHRSDVNPQRLVLFGQSIGGANILDVIGRGDREGIRAVILD STFASYATIANQMIPGSGYLLDESYSGENYIASVSPIPLLLIHGKADHVIPWQHSEKLYS LAKEPKRLILIPDGEHIDAFSDRHGDVYREQMVDFILSALNPQN Prediction of potential genes in microbial genomes Time: Mon May 16 15:36:30 2011 Seq name: gi|296493322|gb|ADTK01000179.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont523.2, whole genome shotgun sequence Length of sequence - 10253 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 6, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 14 - 817 1027 ## COG0483 Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family - Prom 853 - 912 3.9 2 2 Tu 1 8/1.000 + CDS 936 - 1676 955 ## COG0565 rRNA methylase + Prom 1852 - 1911 4.7 3 3 Op 1 13/0.000 + CDS 1946 - 2434 540 ## COG1959 Predicted transcriptional regulator 4 3 Op 2 20/0.000 + CDS 2546 - 3760 1616 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes 5 3 Op 3 14/0.000 + CDS 3788 - 4174 553 ## COG0822 NifU homolog involved in Fe-S cluster formation 6 3 Op 4 10/1.000 + CDS 4191 - 4514 431 ## COG0316 Uncharacterized conserved protein + Term 4530 - 4569 5.0 7 4 Op 1 11/1.000 + CDS 4610 - 5125 593 ## COG1076 DnaJ-domain-containing proteins 1 8 4 Op 2 13/0.000 + CDS 5142 - 6992 2271 ## COG0443 Molecular chaperone 9 4 Op 3 9/1.000 + CDS 6994 - 7329 392 ## COG0633 Ferredoxin 10 4 Op 4 2/1.000 + CDS 7341 - 7541 354 ## COG2975 Uncharacterized protein conserved in bacteria + Term 7547 - 7580 5.2 11 5 Tu 1 . + CDS 7600 - 8883 1480 ## COG0260 Leucyl aminopeptidase + Prom 8923 - 8982 5.6 12 6 Tu 1 . 
+ CDS 9025 - 9801 701 ## ECSP_3466 enhanced serine sensitivity protein SseB + Term 9876 - 9902 -0.6 Predicted protein(s) >gi|296493322|gb|ADTK01000179.1| GENE 1 14 - 817 1027 267 aa, chain - ## HITS:1 COG:ECs3399 KEGG:ns NR:ns ## COG: ECs3399 COG0483 # Protein_GI_number: 15832653 # Func_class: G Carbohydrate transport and metabolism # Function: Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family # Organism: Escherichia coli O157:H7 # 1 267 1 267 267 530 100.0 1e-150 MHPMLNIAVRAARKAGNLIAKNYETPDAVEASQKGSNDFVTNVDKAAEAVIIDTIRKSYP QHTIITEESGELEGTDQDVQWVIDPLDGTTNFIKRLPHFAVSIAVRIKGRTEVAVVYDPM RNELFTATRGQGAQLNGYRLRGSTARDLDGTILATGFPFKAKQYATTYINIVGKLFNECA DFRRTGSAALDLAYVAAGRVDGFFEIGLRPWDFAAGELLVREAGGIVSDFTGGHNYMLTG NIVAGNPRVVKAMLANMRDELSDALKR >gi|296493322|gb|ADTK01000179.1| GENE 2 936 - 1676 955 246 aa, chain + ## HITS:1 COG:yfhQ KEGG:ns NR:ns ## COG: yfhQ COG0565 # Protein_GI_number: 16130457 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylase # Organism: Escherichia coli K12 # 1 246 1 246 246 483 100.0 1e-136 MLQNIRIVLVETSHTGNMGSVARAMKTMGLTNLWLVNPLVKPDSQAIALAAGASDVIGNA HIVDTLDEALAGCSLVVGTSARSRTLPWPMLDPRECGLKSVAEAANTPVALVFGRERVGL TNEELQKCHYHVAIAANPEYSSLNLAMAVQVIAYEVRMAWLATQENGEQVEHEETPYPLV DDLERFYGHLEQTLLATGFIRENHPGQVMNKLRRLFTRARPESQELNILRGILASIEQQN KGNKAE >gi|296493322|gb|ADTK01000179.1| GENE 3 1946 - 2434 540 162 aa, chain + ## HITS:1 COG:ECs3397 KEGG:ns NR:ns ## COG: ECs3397 COG1959 # Protein_GI_number: 15832651 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 162 1 162 162 278 100.0 4e-75 MRLTSKGRYAVTAMLDVALNSEAGPVPLADISERQGISLSYLEQLFSRLRKNGLVSSVRG PGGGYLLGKDASSIAVGEVISAVDESVDATRCQGKGGCQGGDKCLTHALWRDLSDRLTGF LNNITLGELVNNQEVLDVSGRQHTHDAPRTRTQDAIDVKLRA >gi|296493322|gb|ADTK01000179.1| GENE 4 2546 - 3760 1616 404 aa, chain + ## HITS:1 COG:ECs3396 KEGG:ns NR:ns ## COG: ECs3396 COG1104 # Protein_GI_number: 15832650 # 
Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes # Organism: Escherichia coli O157:H7 # 1 404 9 412 412 818 99.0 0 MKLPIYLDYSATTPVDPRVAEKMMQFMTMDGTFGNPASRSHRFGWQAEEAVDIARNQIAD LVGADPREIVFTSGATESDNLAIKGAANFYQKKGKHIITSKTEHKAVLDTCRQLEREGFE VTYLAPQRNGIIDLKELEAAMRDDTILVSIMHVNNEIGVVQDIAAIGEMCRARGIIYHVD ATQSVGKLPIDLSQLKVDLMSFSGHKIYGPKGIGALYVRRKPRVRIEAQMHGGGHERGMR SGTLPVHQIVGMGEAYRIAKEEMATEMERLRGLRDRLWNGIKDIEEVYLNGDLEHGAPNI LNVSFNYVEGESLIMALKDLAVSSGSACTSASLEPSYVLRALGLNDELAHSSIRFSLGRF TTEEEIDYTIELVRKSIGRLRDLSPLWEMYKQGVDLNSIEWAHH >gi|296493322|gb|ADTK01000179.1| GENE 5 3788 - 4174 553 128 aa, chain + ## HITS:1 COG:ECs3395 KEGG:ns NR:ns ## COG: ECs3395 COG0822 # Protein_GI_number: 15832649 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Escherichia coli O157:H7 # 1 128 1 128 128 237 100.0 4e-63 MAYSEKVIDHYENPRNVGSFDNNDENVGSGMVGAPACGDVMKLQIKVNDEGIIEDARFKT YGCGSAIASSSLVTEWVKGKSLDEAQAIKNTDIAEELELPPVKIHCSILAEDAIKAAIAD YKSKREAK >gi|296493322|gb|ADTK01000179.1| GENE 6 4191 - 4514 431 107 aa, chain + ## HITS:1 COG:ECs3394 KEGG:ns NR:ns ## COG: ECs3394 COG0316 # Protein_GI_number: 15832648 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 107 1 107 107 207 100.0 3e-54 MSITLSDSAAARVNTFLANRGKGFGLRLGVRTSGCSGMAYVLEFVDEPTPEDIVFEDKGV KVVVDGKSLQFLDGTQLDFVKEGLNEGFKFTNPNVKDECGCGESFHV >gi|296493322|gb|ADTK01000179.1| GENE 7 4610 - 5125 593 171 aa, chain + ## HITS:1 COG:ECs3393 KEGG:ns NR:ns ## COG: ECs3393 COG1076 # Protein_GI_number: 15832647 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-domain-containing proteins 1 # Organism: Escherichia coli O157:H7 # 1 171 1 171 171 269 99.0 2e-72 MDYFTLFGLPARYQLDTQALSLRFQDLQRQYHPDKFASGSQAEQLAAVQQSATINQAWQT LRHPLMRAEYLLSLHGFDLASEQHTVRDTAFLMEQLELREELDEIEQAKDEARLESFIKR 
VKKMFDTRHQLMVEQLDNETWAAAADTVRKLRFLDKLRSSAEQLEEKLLDF >gi|296493322|gb|ADTK01000179.1| GENE 8 5142 - 6992 2271 616 aa, chain + ## HITS:1 COG:ECs3392 KEGG:ns NR:ns ## COG: ECs3392 COG0443 # Protein_GI_number: 15832646 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Escherichia coli O157:H7 # 1 616 1 616 616 1088 99.0 0 MALLQISEPGLSAAPHQRRLAAGIDLGTTNSLVATVRSGQAETLADHEGRHLLPSVVHYQ QQGHSVGYDARTNAALDTANTISSVKRLMGRSLADIQQRYPHLPYQFQASENGLPMIETA AGLLNPVRVSADILKALAARATEALAGELDGVVITVPAYFDDAQRQGTKDAARLAGLHVL RLLNEPTAAAIAYGLDSGQEGVIAVYDLGGGTFDISILRLSRGVFEVMATGGDSALGGDD FDHLLADYIREQAGIPDRSDNRVQRELLDAAIAAKIALSDADSVTVNVAGWQGEISREQF NELIAPLVKRTLLACRRALKDAGVEASEVLEVVMVGGSTRVPLVRERVGEFFGRPPLTSI DPDKVVAIGAAIQADILVGNKPDSEMLLLDVIPLSLGLETMGGLVEKVIPRNTTIPVARA QDFTTFKDGQTAMSIHVMQGERELVQDCRSLARFALRGIPALPAGGAHIRVTFQVDADGL LSVTAMEKSTGVEASIQVKPSYGLTDSEIASMIKDSMSYAEQDVKARMLAEQKVEAARVL ESLHGALAADAALLSAAERQVIDDAAAHLSEVAQGNDVDAIEQAIKNVDKQTQDFAARRM DQSVRRALKGHSVDEV >gi|296493322|gb|ADTK01000179.1| GENE 9 6994 - 7329 392 111 aa, chain + ## HITS:1 COG:ECs3391 KEGG:ns NR:ns ## COG: ECs3391 COG0633 # Protein_GI_number: 15832645 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Escherichia coli O157:H7 # 1 111 1 111 111 207 100.0 3e-54 MPKIVILPHQDLCPDGAVLEANSGETILDAALRNGIEIEHACEKSCACTTCHCIVREGFD SLPESSEQEDDMLDKAWGLEPESRLSCQARVTDEDLVVEIPRYTINHAREH >gi|296493322|gb|ADTK01000179.1| GENE 10 7341 - 7541 354 66 aa, chain + ## HITS:1 COG:ECs3390 KEGG:ns NR:ns ## COG: ECs3390 COG2975 # Protein_GI_number: 15832644 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 66 1 66 66 114 100.0 5e-26 MGLKWTDSREIGEALYDAYPDLDPKTVRFTDMHQWICDLEDFDDDPQASNEKILEAILLV WLDEAE >gi|296493322|gb|ADTK01000179.1| GENE 11 7600 - 8883 1480 427 aa, chain + ## HITS:1 COG:ECs3389 KEGG:ns NR:ns ## COG: ECs3389 COG0260 # Protein_GI_number: 15832643 # 
Func_class: E Amino acid transport and metabolism # Function: Leucyl aminopeptidase # Organism: Escherichia coli O157:H7 # 1 427 30 456 456 852 99.0 0 MTEAMKITLSTQPADARWGEKATYSINNDGITLHLNGADDLGLIQRAARKIDGLGIKHVQ LSGEGWDADRCWAFWQGYKAPKGTRKVEWPDLDDAQRQELDNRLMIIDWVRDTINAPAEE LGPSQLAQRAVDLISNVAGDRVTYRITKGEDLRDQGYMGLHTVGRGSERSPVLLALDYNP TGDKEAPVYACLVGKGITFDSGGYSIKQTAFMDSMKSDMGGAATVTGALAFAITRGLNKR VKLFLCCADNLISGNAFKLGDIITYRNGKKVEVMNTDAEGRLVLADGLIDASAQKPEMII DAATLTGAAKTALGNDYHALFSFDDALAGRLLASASQENEPFWRLPLAEFHRSQLPSNFA ELNNTGSAAYPAGASTAAGFLSHFVENYQQGWLHIDCSATYRKAPVEQWSAGATGLGVRT IANLLTA >gi|296493322|gb|ADTK01000179.1| GENE 12 9025 - 9801 701 258 aa, chain + ## HITS:1 COG:no KEGG:ECSP_3466 NR:ns ## KEGG: ECSP_3466 # Name: sseB # Def: enhanced serine sensitivity protein SseB # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 258 5 262 262 484 100.0 1e-135 MSETKNELEDLLEKAATEPAHRPAFFRTLLESTVWVPGTAAQGEAVVEDSALDLQHWEKE DGTSVIPFFTSLEALQQAVEDEQAFVVMPVRTLFEMTLGETLFLNAKLPTGKEFMPREIS LLIGEEGNPLSSQEILEGGESLILSEVAEPPAQMIDSLTTLFKTIKPVKRAFICSIKENE EAQPNLLIGIEADGDIEEIIQVAGSVATDTLPGDEPIDICQVKKGEKGISHFITEHIAPF YERRWGGFLRDFKQNRII Prediction of potential genes in microbial genomes Time: Mon May 16 15:36:40 2011 Seq name: gi|296493321|gb|ADTK01000180.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont523.3, whole genome shotgun sequence Length of sequence - 17486 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 5, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 4 - 31 -0.1 1 1 Op 1 . - CDS 54 - 278 184 ## COG2897 Rhodanese-related sulfurtransferase 2 1 Op 2 . 
- CDS 299 - 898 446 ## COG2897 Rhodanese-related sulfurtransferase - Prom 1055 - 1114 2.7 + Prom 1008 - 1067 4.2 3 2 Op 1 7/0.000 + CDS 1105 - 6066 4489 ## COG2373 Large extracellular alpha-helical protein 4 2 Op 2 1/1.000 + CDS 6067 - 8379 1290 ## COG4953 Membrane carboxypeptidase/penicillin-binding protein PbpC + Prom 8419 - 8478 3.7 5 3 Op 1 11/0.000 + CDS 8528 - 8959 488 ## COG0105 Nucleoside diphosphate kinase + Prom 8963 - 9022 2.5 6 3 Op 2 5/0.000 + CDS 9109 - 10263 1318 ## COG0820 Predicted Fe-S-cluster redox enzyme + Term 10270 - 10315 10.3 + Prom 10400 - 10459 6.0 7 4 Op 1 10/0.000 + CDS 10548 - 11561 616 ## COG1426 Uncharacterized protein conserved in bacteria 8 4 Op 2 11/0.000 + CDS 11588 - 12706 1201 ## COG0821 Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis + Term 12716 - 12747 4.5 9 4 Op 3 12/0.000 + CDS 12817 - 14091 1449 ## COG0124 Histidyl-tRNA synthetase 10 4 Op 4 9/0.000 + CDS 14109 - 14729 685 ## COG2976 Uncharacterized protein conserved in bacteria 11 4 Op 5 7/0.000 + CDS 14740 - 15918 1107 ## COG1520 FOG: WD40-like repeat + Term 15939 - 15980 10.4 + Prom 15941 - 16000 1.8 12 5 Tu 1 . 
+ CDS 16036 - 17485 1851 ## COG1160 Predicted GTPases Predicted protein(s) >gi|296493321|gb|ADTK01000180.1| GENE 1 54 - 278 184 74 aa, chain - ## HITS:1 COG:sseA KEGG:ns NR:ns ## COG: sseA COG2897 # Protein_GI_number: 16130446 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Escherichia coli K12 # 1 74 261 334 334 150 98.0 7e-37 MREGELKTTDELDAIFFGRGVSYDKPIIVSCGSGVTAAVVLLALATLDVPNVKLYDGAWS EWGARADLPVEPVK >gi|296493321|gb|ADTK01000180.1| GENE 2 299 - 898 446 199 aa, chain - ## HITS:1 COG:ECs3387 KEGG:ns NR:ns ## COG: ECs3387 COG2897 # Protein_GI_number: 15832641 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Escherichia coli O157:H7 # 1 166 54 219 334 343 99.0 2e-94 MSTTWFVGADWLAEHIDDPEIQIIDARMASPGQEDRNVAQEYLNGHIPGAVFFDIEALSD HTSPLPHMLPRPETFAVAMRELGVNQDKHLIVYDEGNLFSAPRAWWMLRTFGVEKVSILG GGLAGWQRDDLLLEEGAVELPEGEFNAAFNPEAVVKVTDVLLASHKIRRKLLMPARLHVL TQKLMNLAQVYVADIFPVH >gi|296493321|gb|ADTK01000180.1| GENE 3 1105 - 6066 4489 1653 aa, chain + ## HITS:1 COG:yfhM KEGG:ns NR:ns ## COG: yfhM COG2373 # Protein_GI_number: 16130445 # Func_class: R General function prediction only # Function: Large extracellular alpha-helical protein # Organism: Escherichia coli K12 # 1 1653 1 1653 1653 3207 99.0 0 MKKLRVAACMLMLALAGCDNNDNAPTAVKKDAPSEVTKAASSENASSAKLSVPERQKLAQ QSAGKVLTLLDLSEVQLDGAATLVLTFSIPLDPDQDFSRVIHVVDKKSGKVDGAWELSDN LKELRLRHLEPKRDLIVTIGKEVKALNNATFSKDYEKTITTRDIQPSVGFASRGSLLPGK VVEGLPVMALNVNNVDVNFFRVKPESLPAFISQWEYRNSLANWQSDKLLQMADLVYTGRF DLNPARNTREKLLLPLGDIKPLQQAGVYLAVMNQAGRYDYSNPATLFTLSDIGVSAHRYH NRLDIFTQSLENGAAQQGIEVSLLNEKGLTLTQATSDAQGHVQLENDKNAALLLARKDGQ TTLLDLKLPALDLAEFNIAGAPGYSKQFFMFGPRDLYRPGETVILNGLLRDADGKALPNQ PIKLDVIKPDGQVLRSVVSQPENGLYHFTWPLDSNAATGMWHIRANTGDNQYRMWDFHVE DFMPERMALNLTGEKTPLTPKDEVKFSVVGYYLYGAPANGNTLQGQLFLRPLREAVSALP GFEFGDIAAENLSRTLDEVQLTLDDKGRGEVSTESQWKETHSPLQVIFQGSLLESGGRPV 
TRRAEQAIWPADALPGIRPQFASKSVYDYRTDSTVKQPIVDEGSNAGFDIVYSDAQGVKK AVSGLQVRLIRERRDYYWNWSEDEGWQSQFDQKDLIENEQTLDLQADETGKVSFPVEWGA YRLEVKAPNEAVSSVRFWAGYSWQDNSDGSGAVRPDRVTLKLDKASYRPGDTIKLHIAAP TAGKGYAMVESSEGPLWWQEIDVPAQGLDLTIPVDKTWNRHDLYLSTLVVRPGDKSRSAT PKRAVGVLHLPLGDENRRLDLALETPAKMRPNQPLTVKIKASTKNGEKPKQVNVLVSAVD SGVLNITDYVTPDPWQAFFGQKRYGADIYDIYGQVIEGQGRLAALRFGGDGDELKRGGKP PVNHVNIVAQQALPVTLNEQGEGSVTLPIGDFNGELRVMAQAWTADDFGSNESKVIVAAP VIAELNMPRFMASGDTSRLTLDITNLTDKPQKLNVALTASGLLELVSDSPAAVELAPGVR TTLFIPVRALPGYGDGEIQATISGLALPDETVADQQKQWKIGVRPAFPAQTVNYGTALQP GETWAIPADGLQNFSPVTLEGQLLLSGKPPLNIARYIKELKAYPYGCLEQTASGLFPSLY TNAAQLQALGIKGDSDEKRRASVDIGISRLLQMQRDNGGFALWDKNGDEEYWLTAYVMDF LVRAGEQGYSVPTDAINRGNERLLRYLQDPGMMSIPYADNLKASKFAVQSYAALVLARQQ KAPLGALREIWEHRADAASGLPLLQLGVALKTMGDATRGEEAIVLALKTPRNSDERIWLG DYGSPLRDNALMLSLLEENKLLPDEQYTLLNTLSQQAFGERWLSTQESNALFLAARTIQD LPGKWQAQTSFSAEPLTGEKTLNSNLNSDQLATLQVRNSGDQPLWLRMDASGYPQSAPLP ANNVLQIERHILGTDGKSKSLDSLRSGDLVLVWLQVKASNSVPDALVVDLLPAGLELENQ NLANGSASLEQSGGEVQNLLNQMQQASIKHIEFRDDRFVAAVAVDEYQPVTLVYLARAVT PGTYQVPQPMVESMYVPQWRATGAAEDLLIVRP >gi|296493321|gb|ADTK01000180.1| GENE 4 6067 - 8379 1290 770 aa, chain + ## HITS:1 COG:pbpC KEGG:ns NR:ns ## COG: pbpC COG4953 # Protein_GI_number: 16130444 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase/penicillin-binding protein PbpC # Organism: Escherichia coli K12 # 1 770 1 770 770 1478 99.0 0 MPRLLTKRGCWITLAAAPFLLFLAAWGADKLWPLPLHEVNPARVVVAQDGTPLWRFADAD GIWRYPVTIEDVSPRYLEALINYEDHWFWKHPGVNPFSVARAAWQDLTSGRVISGGSTLT MQVARLLDPHPKTFGGKIRQLWRALQLEWHLSKREILTLYLNRAPFGGTLQGIGAASWAY LGKSPANLSYSEAAMLAVLPQAPSRLRPDRWPERAEAARNKVLERMAVQGVWSREQVKES REEPIWLAPRQMPQLAPLFSRMMLGKSKSDKIVTTLDAGLQRRLEELAQNWKGRLPPRSS LAMIVVDHTDMRVRGWVGSVDLNDDSRFGHVDMVNAIRSPGSVLKPFVYGLALDEGLIHP ASLLQDVPRRTGDYRPGNFDSGFHGPISMSEALVRSLNLPAVQVLEAYGPKRFAAKLRNV GLPLYLPNGAAPNLSLILGGAGAKLEDMAAAYTAFARHGKAGKLRLQPDDPLLERPLMSS GAAWIIRRIMADEAKPLPDSALPRVAPLAWKTGTSYGYRDAWAIGVNARYVIGIWTGRPD 
GTPVVGQFGFASAVPLLNQVNNILLSRSANLPEDPRPNSVTRGVICWPGGQSLPEGDGNC RRRLATWLLDGSQPPTLLLPEQEGINGIHFPIWLDENGKRVAADCPQARQEMINVWPLPL EPWLPASERRAVRLPPASTICPPYGHDAKLPLQLTGVRDGAIIKRLPGAAEATLPLQSSG GAGERWWFLNGEPLTERGRNVTLHLTDKGDYQLLVMDDVGQIATVKFVMQ >gi|296493321|gb|ADTK01000180.1| GENE 5 8528 - 8959 488 143 aa, chain + ## HITS:1 COG:ECs3380 KEGG:ns NR:ns ## COG: ECs3380 COG0105 # Protein_GI_number: 15832634 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside diphosphate kinase # Organism: Escherichia coli O157:H7 # 1 143 1 143 143 277 100.0 4e-75 MAIERTFSIIKPNAVAKNVIGNIFARFEAAGFKIVGTKMLHLTVEQARGFYAEHDGKPFF DGLVEFMTSGPIVVSVLEGENAVQRHRDLLGATNPANALAGTLRADYADSLTENGTHGSD SVESAAREIAYFFGEGEVCPRTR >gi|296493321|gb|ADTK01000180.1| GENE 6 9109 - 10263 1318 384 aa, chain + ## HITS:1 COG:yfgB KEGG:ns NR:ns ## COG: yfgB COG0820 # Protein_GI_number: 16130442 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Escherichia coli K12 # 1 384 1 384 384 790 100.0 0 MSEQLVTPENVTTKDGKINLLDLNRQQMREFFKDLGEKPFRADQVMKWMYHYCCDNFDEM TDINKVLRGKLKEVAEIRAPEVVEEQRSSDGTIKWAIAVGDQRVETVYIPEDDRATLCVS SQVGCALECKFCSTAQQGFNRNLRVSEIIGQVWRAAKIVGAAKVTGQRPITNVVMMGMGE PLLNLNNVVPAMEIMLDDFGFGLSKRRVTLSTSGVVPALDKLGDMIDVALAISLHAPNDE IRDEIVPINKKYNIETFLAAVRRYLEKSNANQGRVTIEYVMLDHVNDGTEHAHQLAELLK DTPCKINLIPWNPFPGAPYGRSSNSRIDRFSKVLMSYGFTTIVRKTRGDDIDAACGQLAG DVIDRTKRTLRKRMQGEAIDIKAV >gi|296493321|gb|ADTK01000180.1| GENE 7 10548 - 11561 616 337 aa, chain + ## HITS:1 COG:ECs3378 KEGG:ns NR:ns ## COG: ECs3378 COG1426 # Protein_GI_number: 15832632 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 337 1 337 337 543 99.0 1e-154 MNTEATHDQNEALTTGARLRNAREQLGLSQQAVAERLCLKVSTVRDIEEDKAPADLASTF LRGYIRSYARLVHIPEEELLPGLEKQAPLRAAKVAPMQSFSLGKRRKKRDGWLMTFTWLV LFVVIGLSGAWWWQDHKAQQEEITTMADQSSAELSSNSEQGQSVPLNTSTTTDPATTSTP PASVDTTATNTQTPAVTAPAPAVDPQQNAVVSPSQANVDTAATPAPTATTTPDGAAPLPT 
DQAGVTTPAADPNALVMNFTADCWLEVTDATGKKLFSGMQRKDGNLNLTGQAPYKLKIGA PAAVQIQYQGKPVDLSRFIRTNQVARLTLNAEQSPAQ >gi|296493321|gb|ADTK01000180.1| GENE 8 11588 - 12706 1201 372 aa, chain + ## HITS:1 COG:ECs3377 KEGG:ns NR:ns ## COG: ECs3377 COG0821 # Protein_GI_number: 15832631 # Func_class: I Lipid transport and metabolism # Function: Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis # Organism: Escherichia coli O157:H7 # 1 372 1 372 372 711 100.0 0 MHNQAPIQRRKSTRIYVGNVPIGDGAPIAVQSMTNTRTTDVEATVNQIKALERVGADIVR VSVPTMDAAEAFKLIKQQVNVPLVADIHFDYRIALKVAEYGVDCLRINPGNIGNEERIRM VVDCARDKNIPIRIGVNAGSLEKDLQEKYGEPTPQALLESAMRHVDHLDRLNFDQFKVSV KASDVFLAVESYRLLAKQIDQPLHLGITEAGGARSGAVKSAIGLGLLLSEGIGDTLRVSL AADPVEEIKVGFDILKSLRIRSRGINFIACPTCSRQEFDVIGTVNALEQRLEDIITPMDV SIIGCVVNGPGEALVSTLGVTGGNKKSGLYEDGVRKDRLDNNDMIDQLEARIRAKASQLD EARRIDVQQVEK >gi|296493321|gb|ADTK01000180.1| GENE 9 12817 - 14091 1449 424 aa, chain + ## HITS:1 COG:ECs3376 KEGG:ns NR:ns ## COG: ECs3376 COG0124 # Protein_GI_number: 15832630 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Histidyl-tRNA synthetase # Organism: Escherichia coli O157:H7 # 1 424 1 424 424 860 100.0 0 MAKNIQAIRGMNDYLPGETAIWQRIEGTLKNVLGSYGYSEIRLPIVEQTPLFKRAIGEVT DVVEKEMYTFEDRNGDSLTLRPEGTAGCVRAGIEHGLLYNQEQRLWYIGPMFRHERPQKG RYRQFHQLGCEVFGLQGPDIDAELIMLTARWWRALGISEHVTLELNSIGSLEARANYRDA LVAFLEQHKEKLDEDCKRRMYTNPLRVLDSKNPEVQALLNDAPALGDYLDEESREHFAGL CKLLESAGIAYTVNQRLVRGLDYYNRTVFEWVTNSLGSQGTVCAGGRYDGLVEQLGGRAT PAVGFAMGLERLVLLVQAVNPEFKADPVVDIYLVASGADTQSAAMALAERLRDELPGVKL MTNHGGGNFKKQFARADKWGARVAVVLGESEVANGTAVVKDLRSGEQTAVAQDSVAAHLR TLLG >gi|296493321|gb|ADTK01000180.1| GENE 10 14109 - 14729 685 206 aa, chain + ## HITS:1 COG:ECs3375 KEGG:ns NR:ns ## COG: ECs3375 COG2976 # Protein_GI_number: 15832629 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 206 1 206 206 344 99.0 5e-95 MEIYENENDQVEAVKRFFAENGKALAVGVILGVGALIGWRYWNSHQVDSARSASLAYQNA 
VTAVSEGKPDSIPAAEKFAAENKNTYGALASLELAQQFVDKNELEKAAAQLQQGLADTSD ENLKAVINLRLARVQVQLKQADAALKTLDTIKGEGWAAIVADLRGEALLSKGDKQGARSA WEAGVKSDVTPALSEMMQMKINNLSI >gi|296493321|gb|ADTK01000180.1| GENE 11 14740 - 15918 1107 392 aa, chain + ## HITS:1 COG:ECs3374 KEGG:ns NR:ns ## COG: ECs3374 COG1520 # Protein_GI_number: 15832628 # Func_class: S Function unknown # Function: FOG: WD40-like repeat # Organism: Escherichia coli O157:H7 # 13 392 13 392 392 716 99.0 0 MQLRKLLLPGLLSVTLLSGCSLFNSEEDVVKMSPLPTVENQFTPTTAWSTSVGSGIGNFY SNLHPALADNVVYTADRAGLVKALNADDGKEIWSVSLAEKDGWFSKEPALLSGGVTVSGG HVYIGSEKAQVYALNTSDGTVAWQTKVAGEALSRPVVSDGLVLIHTSNGQLQALNEADGA VKWTVNLDMPSLSLRGESAPATAFGAAVVGGDNGRVSAVLMEQGQMIWQQRISQATGSTE IDRLSDVDTTPVVVNGVVFALAYNGNLTALDLRSGQIMWKRELGSVNDFIVDGNRIYLVD QNDRVMALTIDGGVTLWAQSDLLHRLLTSPVLYNGNLVVGDSEGYLHWINVEDGRFVAQQ KVDSSGFQTEPVAADGKLLIQAKDGTVYSITR >gi|296493321|gb|ADTK01000180.1| GENE 12 16036 - 17485 1851 483 aa, chain + ## HITS:1 COG:ECs3373 KEGG:ns NR:ns ## COG: ECs3373 COG1160 # Protein_GI_number: 15832627 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Escherichia coli O157:H7 # 1 483 14 496 503 938 100.0 0 MVPVVALVGRPNVGKSTLFNRLTRTRDALVADFPGLTRDRKYGRAEIEGREFICIDTGGI DGTEDGVETRMAEQSLLAIEEADVVLFMVDARAGLMPADEAIAKHLRSREKPTFLVANKT DGLDPDQAVVDFYSLGLGEIYPIAASHGRGVLSLLEHVLLPWMEDLAPQEEVDEDAEYWA QFEAEENGEEEEEDDFDPQSLPIKLAIVGRPNVGKSTLTNRILGEERVVVYDMPGTTRDS IYIPMERDGREYVLIDTAGVRKRGKITDAVEKFSVIKTLQAIEDANVVMLVIDAREGISD QDLSLLGFILNSGRSLVIVVNKWDGLSQEVKEQVKETLDFRLGFIDFARVHFISALHGSG VGNLFESVREAYDSSTRRVGTSMLTRIMTMAVEDHQPPLVRGRRVKLKYAHAGGYNPPIV VIHGNQVKDLPDSYKRYLMNYFRKSLDVMGSPIRIQFKEGENPYANKRNTLTPTQMRKRK RLM Prediction of potential genes in microbial genomes Time: Mon May 16 15:36:41 2011 Seq name: gi|296493320|gb|ADTK01000181.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont525.1, whole genome shotgun sequence Length of sequence - 3052 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average 
op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 12 - 71 2.9 1 1 Tu 1 . + CDS 315 - 1172 170 ## COG3464 Transposase and inactivated derivatives 2 2 Op 1 1/0.000 - CDS 1824 - 2858 853 ## COG1735 Predicted metal-dependent hydrolase with the TIM-barrel fold 3 2 Op 2 . - CDS 2861 - 3052 112 ## COG3238 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|296493320|gb|ADTK01000181.1| GENE 1 315 - 1172 170 285 aa, chain + ## HITS:1 COG:mll5961 KEGG:ns NR:ns ## COG: mll5961 COG3464 # Protein_GI_number: 13474973 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Mesorhizobium loti # 2 284 171 454 521 140 34.0 3e-33 MLIVNLDTHRPLVLLPGRDQRTLATWFRKYPEIQVVSRDRSGVYATAAREGAPQARQVAD RWHLLKNIGDEPERMMYRHMPLIRLVVRELSLKKSPEPEISVPVASLRRLERLKQHIRKK RHQRWTEVMALHNKGCSFREISRITGLSRVTVSRWVGSGTFPEMSTRPPKRGLLDPWREW LKEQRECGNYNSGRIWREMVARGVTGSETIVRDAVAKWRKGWIPPVTTAARLPSVSRVSR WLMPWRIIRGEENYAFRFISLMCEKEPELKIAQQLVLEFYRILKT >gi|296493320|gb|ADTK01000181.1| GENE 2 1824 - 2858 853 344 aa, chain - ## HITS:1 COG:STM3550 KEGG:ns NR:ns ## COG: STM3550 COG1735 # Protein_GI_number: 16766836 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Salmonella typhimurium LT2 # 1 344 1 344 344 619 86.0 1e-177 MKDYLQTVTGPVAREDMGLTLPHEHLFNNLSSVVDAPCYPFSQRLVDKKVTAEIQWALKH DPYCCADNMDRKPIEDVIFEINNFISLGGRTIVDATGSESIGRDAQALREVALKTGLNIV ASSGPYLEKFESQRIHKTVDELATTIDKELNQGIGDTDIRAGMIGEIGVSPTFTEAEHNS LRAASLAQINNPHVAMNIHMPGWLRRGDEVLDIVLGEMGVSPNKVSLAHSDPSGKDVAYQ RKMLDKGVWLEFDMIGLDITFPKEGIAPGVQETADAVAHLIELGYADQLVLSHDVFLKQM WAKNGGNGWGFVPDVFLAYLAERGVDKTILKKLCIDNPGRLLTA >gi|296493320|gb|ADTK01000181.1| GENE 3 2861 - 3052 112 63 aa, chain - ## HITS:1 COG:STM3549 KEGG:ns NR:ns ## COG: STM3549 COG3238 # Protein_GI_number: 16766835 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Salmonella 
typhimurium LT2 # 11 63 269 321 323 83 88.0 1e-16 VVWRFRGPVYQLLGSVLIDVLIPSLGNTVYLVTIIGTLFALVGAIVTTIPEYRASKTMKK MEV Prediction of potential genes in microbial genomes Time: Mon May 16 15:36:43 2011 Seq name: gi|296493319|gb|ADTK01000182.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont534.1, whole genome shotgun sequence Length of sequence - 3891 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 1, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 13/0.000 + CDS 36 - 512 434 ## COG3444 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB 2 1 Op 2 13/0.000 + CDS 551 - 1354 880 ## COG3715 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC 3 1 Op 3 2/0.000 + CDS 1344 - 2135 766 ## COG3716 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID + Term 2143 - 2179 4.7 4 1 Op 4 2/0.000 + CDS 2190 - 2891 675 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase + Term 2970 - 3021 7.3 + Prom 3171 - 3230 6.3 5 1 Op 5 . 
+ CDS 3292 - 3876 471 ## COG3539 P pilus assembly protein, pilin FimA Predicted protein(s) >gi|296493319|gb|ADTK01000182.1| GENE 1 36 - 512 434 158 aa, chain + ## HITS:1 COG:ECs4018 KEGG:ns NR:ns ## COG: ECs4018 COG3444 # Protein_GI_number: 15833272 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB # Organism: Escherichia coli O157:H7 # 1 158 1 158 158 281 100.0 4e-76 MSSPNILLTRIDNRLVHGQVGVTWTSTIGANLLVVVDDVVANDDIQQKLMGITAETYGFG IRFFTIEKTINVIGKAAPHQKIFLICRTPQTVRKLVEGGIDLKDVNVGNMHFSEGKKQIS SKVYVDDQDLTDLRFIKQRGVNVFIQDVPGDQKEQIPD >gi|296493319|gb|ADTK01000182.1| GENE 2 551 - 1354 880 267 aa, chain + ## HITS:1 COG:agaC KEGG:ns NR:ns ## COG: agaC COG3715 # Protein_GI_number: 16131031 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC # Organism: Escherichia coli K12 # 1 267 1 267 267 479 100.0 1e-135 MHEITLLQGLSLAALVFVLGIDFWLEALFLFRPIIVCTLTGAILGDIQTGLITGGLTELA FAGLTPAGGVQPPNPIMAGLMTTVIAWSTGVDAKTAIGLGLPFSLLMQYVILFFYSAFSL FMTKADKCAKEADTAAFSRLNWTTMLIVASAYAVIAFLCTYLAQGAMQALVKAMPAWLTH GFEVAGGILPAVGFGLLLRVMFKAQYIPYLIAGFLFVCYIQVSNLLPVAVLGAGFAVYEF FNAKSRQQAQPQPVASKNEEEDYSNGI >gi|296493319|gb|ADTK01000182.1| GENE 3 1344 - 2135 766 263 aa, chain + ## HITS:1 COG:ECs4019 KEGG:ns NR:ns ## COG: ECs4019 COG3716 # Protein_GI_number: 15833273 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID # Organism: Escherichia coli O157:H7 # 1 263 1 263 263 504 100.0 1e-143 MGSEISKKDITRLGFRSSLLQASFNYERMQAGGFTWAMLPILKKIYKDDKPGLSAAMKDN LEFINTHPNLVGFLMGLLISMEEKGENRDTIKGLKVALFGPIAGIGDAIFWFTLLPIMAG ICSSFASQGNLLGPILFFAVYLLIFFLRVGWTHVGYSVGVKAIDKVRENSQMIARSATIL GITVIGGLIASYVHINVVTSFAIDSTHSVALQQDFFDKVFPNILPMAYTLLMYYFLRVKK AHPVLLIGVTFVLSIVCSAFGIL >gi|296493319|gb|ADTK01000182.1| GENE 4 2190 - 2891 675 
233 aa, chain + ## HITS:1 COG:agaI KEGG:ns NR:ns ## COG: agaI COG0363 # Protein_GI_number: 16131033 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Escherichia coli K12 # 1 233 19 251 251 456 98.0 1e-128 MQTLQQVENYTALSERASEYLLAVIRSKPDAVICLATGATPLLTYHYLVEKIHQQQVDVS QLTFVKLDEWVDLPLTMPGTCETFLQQHIVQPLGLREDQLISFRSEEINETECERVTNLI ARKGGLDLCVLGLGKNGHLGLNEPGESLQPACHISQLDARTQQHEMLKTAGRPVTRGITL GLKDILNAREVLLLVTGEGKQDATERFLTAKVSTAIPASFLWLHSNFICLINT >gi|296493319|gb|ADTK01000182.1| GENE 5 3292 - 3876 471 194 aa, chain + ## HITS:1 COG:ECs4020 KEGG:ns NR:ns ## COG: ECs4020 COG3539 # Protein_GI_number: 15833274 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli O157:H7 # 1 194 1 194 194 371 100.0 1e-103 MNKVTKTAIAGLLALFAGNAAATDGEIVFDGEILKSACEINDSDKKIEVALGHYNAEQFR SVGDRSPKIPFTIPLVNCPVTGWEHDNGNVEASFRLWLETRDNGTVPNFPNLAKVGSFAG TAATGVGIRIDDAESGNLMPLNAMGNDNTVYQIPADSAGIVNVDLIAYYVSTVEASEITP GEADAVVNVTLDYR Prediction of potential genes in microbial genomes Time: Mon May 16 15:36:48 2011 Seq name: gi|296493318|gb|ADTK01000183.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont534.2, whole genome shotgun sequence Length of sequence - 12190 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 8, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 10/0.167 + CDS 82 - 744 358 ## COG3121 P pilus assembly protein, chaperone PapD 2 1 Op 2 6/0.167 + CDS 774 - 3164 1553 ## COG3188 P pilus assembly protein, porin PapC 3 2 Tu 1 . + CDS 3301 - 4392 433 ## COG3539 P pilus assembly protein, pilin FimA + Term 4496 - 4522 -0.3 - Term 4383 - 4427 8.1 4 3 Tu 1 . 
- CDS 4435 - 5295 1038 ## COG0313 Predicted methyltransferases - Prom 5333 - 5392 6.5 + Prom 5113 - 5172 2.7 5 4 Op 1 10/0.167 + CDS 5360 - 7396 1843 ## COG3107 Putative lipoprotein 6 4 Op 2 11/0.000 + CDS 7354 - 7749 185 ## COG0792 Predicted endonuclease distantly related to archaeal Holliday junction resolvase 7 4 Op 3 11/0.000 + CDS 7769 - 8359 482 ## COG0279 Phosphoheptose isomerase 8 4 Op 4 . + CDS 8369 - 8944 548 ## COG2823 Predicted periplasmic or secreted lipoprotein - Term 8861 - 8895 2.0 9 5 Op 1 2/1.000 - CDS 9058 - 10098 1279 ## COG0701 Predicted permeases - Term 10105 - 10157 3.2 10 5 Op 2 . - CDS 10171 - 10806 644 ## COG0702 Predicted nucleoside-diphosphate-sugar epimerases - Prom 10882 - 10941 2.2 11 6 Tu 1 . + CDS 10934 - 11452 587 ## COG0693 Putative intracellular protease/amidase + Term 11690 - 11736 1.1 - Term 11366 - 11394 1.4 12 7 Tu 1 . - CDS 11432 - 11875 408 ## COG3787 Uncharacterized protein conserved in bacteria - Prom 11897 - 11956 3.1 + Prom 11842 - 11901 4.2 13 8 Tu 1 . 
+ CDS 11926 - 12190 100 ## COG2827 Predicted endonuclease containing a URI domain Predicted protein(s) >gi|296493318|gb|ADTK01000183.1| GENE 1 82 - 744 358 220 aa, chain + ## HITS:1 COG:yraI KEGG:ns NR:ns ## COG: yraI COG3121 # Protein_GI_number: 16131035 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli K12 # 1 220 12 231 231 444 98.0 1e-125 MLCSFCIGQALAGGIVLQRTRVIYDASRKEAALPVANKGAETPYLLQSWVDNIDGTSRAP FIITPPLFRLEAGDDSSLRIIKTADNLPENKESLFYINVRAIPAKKKSDNVNANELTLVF KTRIKMFYRPAHLKGRVNDAWKSLEFKRSDHSLNIYNPTEYYVVFAGLAVDKTDLTSKIE YIAPGEHKQLPLPASGGKNVKWAAINDYGGSSGTETRPLQ >gi|296493318|gb|ADTK01000183.1| GENE 2 774 - 3164 1553 796 aa, chain + ## HITS:1 COG:ECs4022 KEGG:ns NR:ns ## COG: ECs4022 COG3188 # Protein_GI_number: 15833276 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli O157:H7 # 1 796 1 796 838 1504 99.0 0 MPQRHHQGHKRTPKQLALIIKRCLPMVLTGSGMLCTTANAEEYYFDPIMLETTKSGMQTT DLSRFSKKYAQLPGTYQVDIWLNKKKVSQKKITFTANAEQLLQPQFTVEQLRELGIKVDE IPALAEKDDDSVINSLEQIIPGTAAEFDFNHQRLNLSIPQIALYRDARGYVSPSRWDDGI PTLFTNYSFTGSDNRYRQGNRSQRQYLNMQNGANFGPWRLRNYSTWTRNDQTSSWNTISS YLQRDIKALKSQLLLGESATSGSIFSSYTFTGVQLASDDNMLPNSQRGFAPTVRGIANSS AIVTIRQNGYVIYQSNVPAGAFEINDLYPSSNSGDLEVTIEESDGTQRRFIQPYSSLPMM QRPGHLKYSATAGRYRADANSDSKEPEFAEATAIYGLNNTFTLYSGLLGSEDYYALGIGI GGTLGALGALSMDINRADTQFDNQHSFHGYQWRTQYIKDIPETNTNIAVSYYRYTNDGYF SFDEANTRNWDYNSRQKSEIQFNISQTIFDGVSLYASGSQQDYWGNNEKNRNISVGVSGQ QWGIGYSLNYQYSRYTDQNNDRALSLNLSIPLERWLPRSRVSYQMTSQKDRPTQHEMRLD GSLLDDGRLSYSLEQSLDDDNNHNSSVNASYRSPYGTFSAGYSYGNDSSQYNYGVTGGVV IHPHGVTLSQYLGNAFALIDANGASGVRIQNYPGIATDPFGYAVVPYLTTYQENRLSVDT TQLPDNVDLEQTTQFVVPNRGAMVAARFNANIGYRVLVTVSDRNGKPLPFGALASNDETG QQSIVDEGGILYLSGI >gi|296493318|gb|ADTK01000183.1| GENE 3 3301 - 4392 433 363 aa, chain + ## HITS:1 COG:yraK KEGG:ns NR:ns ## COG: yraK 
COG3539 # Protein_GI_number: 16131037 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 363 1 363 363 664 98.0 0 MKRAPLITGLLLISTSCAYASSGGCGADSTSGATNYSSVVDDVTVNQTDNVTGREFTSAT LSSTNWQYACSCSAGKAVKLVYMVSPVLTTTGHQAGYYKLNDSLDIKTTLKANDIPGLVT DQTVSVNTRFTQIKSNTVYSAATQTGVCQGDTSRYGPVNIGANTTFTLYVTKPFLGSMTI PKTDIAVIKGAWVDGMGSPSTGDFHDLVKLSIQGNLTAPQSCKINQGDVIKVNFGFINGQ KFTTRNAMPDGFTPVDFDITYDCGDTSKIKNSLQMRIDGTTGVVDQYNLVARRRSSDNAP DVGIRIENLGGGVANIPFQNGILPVDPSGHGTVNMRAWPVNLVGGELETGKFQGTATITV IVR >gi|296493318|gb|ADTK01000183.1| GENE 4 4435 - 5295 1038 286 aa, chain - ## HITS:1 COG:ECs4027 KEGG:ns NR:ns ## COG: ECs4027 COG0313 # Protein_GI_number: 15833281 # Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Escherichia coli O157:H7 # 1 286 1 286 286 533 100.0 1e-151 MKQHQSADNSQGQLYIVPTPIGNLADITQRALEVLQAVDLIAAEDTRHTGLLLQHFGINA RLFALHDHNEQQKAETLLAKLQEGQNIALVSDAGTPLINDPGYHLVRTCREAGIRVVPLP GPCAAITALSAAGLPSDRFCYEGFLPAKSKGRRDALKAIEAEPRTLIFYESTHRLLDSLE DIVAVLGESRYVVLARELTKTWETIHGAPVGELLAWVKEDENRRKGEMVLIVEGHKAQEE DLPADALRTLALLQAELPLKKAAALAAEIHGVKKNALYKYALEQQG >gi|296493318|gb|ADTK01000183.1| GENE 5 5360 - 7396 1843 678 aa, chain + ## HITS:1 COG:ECs4028 KEGG:ns NR:ns ## COG: ECs4028 COG3107 # Protein_GI_number: 15833282 # Func_class: R General function prediction only # Function: Putative lipoprotein # Organism: Escherichia coli O157:H7 # 1 678 1 678 678 1192 99.0 0 MVPSTFSRLKAARCLPVVLAALIFAGCGTHTPDQSTAYMQGTAQADSAFYLQQMQQSSDD TRINWQLLAIRALVKEGKTGQAVELFNQLPQELNDSQRREKTLLAVEIKLAQKDFAGAQN LLAKITPADLEQNQQARYWQAKIDASQGRPSIDLLRALIAQEPLLGAKEKQQNIDATWQA LSSMTQEQANTLVINADENILQGWLDLQRVWFDNRNDPDMMKAGIADWQKRYPNNPGAKM LPTQLVNVKAFKPASTNKIALLLPLNGQAAVFGRTIQQGFEAAKNIGTQPVAAQVAAAPA ADVAEQPQPQTVDGVASPAQASVSDLTGEQPAAQPVPVSAPATSTAAVSAPANPSAELKI YDTSSQPLSQILSQVQQDGASIVVGPLLKNNVEELLKSNTPLNVLALNQPENIENRVNIC 
YFALSPEDEARDAARHIRDQGKQAPLVLIPRSSLGDRVANAFAQEWQKLGGGTVLQQKFG STSELRAGVNGGSGIALTGSPITPRATTDSGMTTNNPTLQTTPTDDQFTNNGGRVDAVYI VATPGEIAFIKPMIAMRNGSQSGATLYASSRSAQGTAGPDFRLEMEGLQYSEIPMLAGGN LPLMQQALSAVNNDYSLARMYAMGVDAWSLANHFSQMRQVQGFEINGNTGSLTANPDCVI NRKLSWLQYQQGQVVPAS >gi|296493318|gb|ADTK01000183.1| GENE 6 7354 - 7749 185 131 aa, chain + ## HITS:1 COG:yraN KEGG:ns NR:ns ## COG: yraN COG0792 # Protein_GI_number: 16131040 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase # Organism: Escherichia coli K12 # 1 131 1 131 131 248 99.0 2e-66 MATVPTRSGSPRQLTTKQTGDAWEAQARRWLEGKGLRFIAANVNERGGEIDLIMREGRTT VFVEVRYRRSALYGGAAASVTRSKQHKLLQTARLWLARHNGSFDTVDCRFDVVAFTGNEV EWIKDAFNDHS >gi|296493318|gb|ADTK01000183.1| GENE 7 7769 - 8359 482 196 aa, chain + ## HITS:1 COG:ECs4030 KEGG:ns NR:ns ## COG: ECs4030 COG0279 # Protein_GI_number: 15833284 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoheptose isomerase # Organism: Escherichia coli O157:H7 # 1 196 1 196 196 370 100.0 1e-103 MQERIKACFTESIQTQIAAAEALPDAISRAAMTLVQSLLNGNKILCCGNGTSAANAQHFA ASMINRFETERPSLPAIALNTDNVVLTAIANDRLHDEVYAKQVRALGHAGDVLLAISTRG NSRDIVKAVEAAVTRDMTIVALTGYDGGELAGLLGPQDVEIRIPSHRSARIQEMHMLTVN CLCDLIDNTLFPHQDD >gi|296493318|gb|ADTK01000183.1| GENE 8 8369 - 8944 548 191 aa, chain + ## HITS:1 COG:ECs4031 KEGG:ns NR:ns ## COG: ECs4031 COG2823 # Protein_GI_number: 15833285 # Func_class: R General function prediction only # Function: Predicted periplasmic or secreted lipoprotein # Organism: Escherichia coli O157:H7 # 1 191 1 191 191 296 100.0 2e-80 MKALSPIAVLISALLLQGCVAAAVVGTAAVGTKAATDPRSVGTQVDDGTLEVRVNSALSK DEQIKKEARINVTAYQGKVLLVGQSPNAELSARAKQIAMGVDGANEVYNEIRQGQPIGLG EASNDTWITTKVRSQLLTSDLVKSSNVKVTTENGEVFLMGLVTEREAKAAADIASRVSGV KRVTTAFTFIK >gi|296493318|gb|ADTK01000183.1| GENE 9 9058 - 10098 1279 346 aa, chain - ## HITS:1 COG:yraQ KEGG:ns NR:ns ## COG: yraQ COG0701 # Protein_GI_number: 16131043 # Func_class: R 
General function prediction only # Function: Predicted permeases # Organism: Escherichia coli K12 # 1 346 1 346 346 573 99.0 1e-163 MTGQSSSQAATPIQWWKPALFFLVVIAGLWYVKWEPYYGKAFTAAETHSIGKSILAQADA NPWQAAVDYAMIYFLAVWKAAVLGVILGSLIQVLIPRDWLLRTLGQSRFRGTLLGTLFSL PGMMCTCCAAPVAAGMRRQQVSMGGALAFWMGNPVLNPATLVFMGFVLGWGFAAIRLVAG LVMVLLIATLVQKWVRETPQTQAPVEIDIPEAQGGFFSRWGRALWTLFWSTIPVYILAVL VLGAARVWLFPHADGTVDNSLMWVVAMAVAGCLFVIPTAAEIPIVQTMMLAGMGTAPALA LLMTLPAVSLPSLIMLRKAFPAKALWLTGAMVAVSGVIVGGLALLF >gi|296493318|gb|ADTK01000183.1| GENE 10 10171 - 10806 644 211 aa, chain - ## HITS:1 COG:yraR KEGG:ns NR:ns ## COG: yraR COG0702 # Protein_GI_number: 16131044 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate-sugar epimerases # Organism: Escherichia coli K12 # 1 211 16 226 226 421 99.0 1e-118 MSQVLITGATGLVGGHLLRMLINEPKVNAIAAPTRRPLGDMPGVFNPHDPQLTDALAQVT DPIDIVFCCLGTTRREAGSKEAFIHADYTLVVDTALTGRRLGAQHMLVVSAMGANAHSPF FYNRVKGEMEEALIAQNWPKLTIARPSMLLGDRSKQRMNETLFAPLFRLLPGNWKSIDAR DVARVMLAESMRPEHEGVTILSSSELRKRAE >gi|296493318|gb|ADTK01000183.1| GENE 11 10934 - 11452 587 172 aa, chain + ## HITS:1 COG:ECs4034 KEGG:ns NR:ns ## COG: ECs4034 COG0693 # Protein_GI_number: 15833288 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Escherichia coli O157:H7 # 1 172 15 186 186 330 100.0 7e-91 MSKKIAVLITDEFEDSEFTSPADEFRKAGHEVITIEKQAGKTVKGKKGEASVTIDKSIDE VTPAEFDALLLPGGHSPDYLRGDNRFVTFTRDFVNSGKPVFAICHGPQLLISADVIRGRK LTAVKPIIIDVKNAGAEFYDQEVVVDKDQLVTSRTPDDLPAFNREALRLLGA >gi|296493318|gb|ADTK01000183.1| GENE 12 11432 - 11875 408 147 aa, chain - ## HITS:1 COG:yhbP KEGG:ns NR:ns ## COG: yhbP COG3787 # Protein_GI_number: 16131046 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 147 1 147 147 291 100.0 3e-79 METLIAISRWLAKQHVVTWCVQQEGELWCANAFYLFDAQKVAFYILTEEKTRHAQMSGPQ 
AAVAGTVNGQPKTVALIRGVQFKGEIRRLEGEESDLARKAYNRRFPVARMLSAPVWEIRL DEIKFTDNTLGFGKKMIWLRDSGTEQA >gi|296493318|gb|ADTK01000183.1| GENE 13 11926 - 12190 100 88 aa, chain + ## HITS:1 COG:yhbQ KEGG:ns NR:ns ## COG: yhbQ COG2827 # Protein_GI_number: 16131047 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease containing a URI domain # Organism: Escherichia coli K12 # 1 84 1 84 100 152 98.0 2e-37 MTPWFLYLIRTADNKLYTGITTDVERRYQQHQSGKGAKALRGKGELTLAFSAPVGDRSLA LRAEYRVKQLTKRQKERLVAEGAGGGGG Prediction of potential genes in microbial genomes Time: Mon May 16 15:36:58 2011 Seq name: gi|296493317|gb|ADTK01000184.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont534.3, whole genome shotgun sequence Length of sequence - 24823 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 9, operones - 4 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 - CDS 1 - 511 563 ## COG3153 Predicted acetyltransferase 2 1 Op 2 . - CDS 505 - 1029 724 ## COG3154 Putative lipid carrier protein - Prom 1070 - 1129 4.4 + Prom 1149 - 1208 4.4 3 2 Op 1 13/0.000 + CDS 1238 - 2233 875 ## COG0826 Collagenase and related proteases 4 2 Op 2 2/0.750 + CDS 2242 - 3120 925 ## COG0826 Collagenase and related proteases + Prom 3235 - 3294 5.7 5 3 Tu 1 . + CDS 3326 - 4333 945 ## COG2141 Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases + Term 4387 - 4422 0.2 6 4 Tu 1 . - CDS 4451 - 5695 1542 ## COG0814 Amino acid permeases - Prom 5782 - 5841 8.4 - Term 5771 - 5805 3.5 7 5 Tu 1 . 
- CDS 5849 - 7738 2427 ## COG0513 Superfamily II DNA and RNA helicases - Prom 7764 - 7823 5.7 - Term 7846 - 7882 1.3 8 6 Op 1 6/0.000 - CDS 7918 - 8802 587 ## COG4785 Lipoprotein NlpI, contains TPR repeats - Term 8866 - 8899 5.9 9 6 Op 2 26/0.000 - CDS 8911 - 11046 188 ## PROTEIN SUPPORTED gi|229537485|ref|ZP_04426621.1| ribosomal protein S1 - Term 11243 - 11271 1.3 10 6 Op 3 14/0.000 - CDS 11293 - 11562 445 ## PROTEIN SUPPORTED gi|16131057|ref|NP_417634.1| 30S ribosomal subunit protein S15 - Prom 11639 - 11698 4.1 - Term 11580 - 11618 -0.8 11 6 Op 4 26/0.000 - CDS 11711 - 12655 1038 ## COG0130 Pseudouridine synthase 12 6 Op 5 32/0.000 - CDS 12655 - 13056 628 ## COG0858 Ribosome-binding factor A - Prom 13093 - 13152 2.3 - Term 13067 - 13093 1.0 13 6 Op 6 20/0.000 - CDS 13220 - 15892 3073 ## COG0532 Translation initiation factor 2 (IF-2; GTPase) 14 6 Op 7 32/0.000 - CDS 15917 - 17404 1026 ## PROTEIN SUPPORTED gi|17988250|ref|NP_540884.1| transcription elongation factor NusA 15 6 Op 8 . - CDS 17432 - 17854 380 ## COG0779 Uncharacterized protein conserved in bacteria - TRNA 18091 - 18167 86.1 # Met CAT 0 0 + Prom 18326 - 18385 4.4 16 7 Tu 1 . + CDS 18515 - 19858 1623 ## COG0137 Argininosuccinate synthase + Term 19866 - 19907 5.7 17 8 Tu 1 . - CDS 19866 - 21491 634 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase - Prom 21533 - 21592 7.1 - TRNA 21950 - 22036 70.3 # Leu GAG 0 0 - Term 21900 - 21942 9.2 18 9 Op 1 7/0.000 - CDS 22051 - 22383 363 ## COG1314 Preprotein translocase subunit SecG - Prom 22511 - 22570 4.0 - Term 22560 - 22587 -0.1 19 9 Op 2 9/0.000 - CDS 22611 - 23948 1566 ## COG1109 Phosphomannomutase 20 9 Op 3 . 
- CDS 23941 - 24789 847 ## COG0294 Dihydropteroate synthase and related enzymes Predicted protein(s) >gi|296493317|gb|ADTK01000184.1| GENE 1 1 - 511 563 170 aa, chain - ## HITS:1 COG:ECs4037 KEGG:ns NR:ns ## COG: ECs4037 COG3153 # Protein_GI_number: 15833291 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Escherichia coli O157:H7 # 1 167 1 167 167 320 97.0 7e-88 MLIRVEIPIDAPGIDALLRRSFESDAEAKLVHDLREDGFLTLGLVATDDEGQVIGYVAFS PVDVQGEDLQWVGMAPLAVDEKYRGQGLARQLVYEGLDSLNEFGYAAVVTLGDPALYSRF GFELAAHHDLRCRWPGTESAFQVHRLADDALNGVTGLVEYHDFFFFFFFF >gi|296493317|gb|ADTK01000184.1| GENE 2 505 - 1029 724 174 aa, chain - ## HITS:1 COG:ECs4038 KEGG:ns NR:ns ## COG: ECs4038 COG3154 # Protein_GI_number: 15833292 # Func_class: I Lipid transport and metabolism # Function: Putative lipid carrier protein # Organism: Escherichia coli O157:H7 # 1 174 1 174 174 325 100.0 2e-89 MLDKLRSRIVHLGPSLLSVPVKLTPFALKRQVLEQVLSWQFRQALDDGELEFLEGRWLSI HVRDIDLQWFTSVVNGKLVVSQNAQADVSFSADASDLLMIAARKQDPDTLFFQRRLVIEG DTELGLYVKNLMDAIELEQMPKALRMMLLQLADFVEAGMKTAPETKQTSVGEPC >gi|296493317|gb|ADTK01000184.1| GENE 3 1238 - 2233 875 331 aa, chain + ## HITS:1 COG:ECs4039 KEGG:ns NR:ns ## COG: ECs4039 COG0826 # Protein_GI_number: 15833293 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Escherichia coli O157:H7 # 1 331 1 331 331 678 100.0 0 MELLCPAGNLPALKAAIENGADAVYIGLKDDTNARHFAGLNFTEKKLQEAVSFVHQHRRK LHIAINTFAHPDGYARWQRAVDMAAQLGADALILADLAMLEYAAERYPHIERHVSVQASA TNEEAINFYHRHFDVARVVLPRVLSIHQVKQLARVTPVPLEVFAFGSLCIMSEGRCYLSS YLTGESPNTVGACSPARFVRWQQTPQGLESRLNEVLIDRYQDGENAGYPTLCKGRYLVDG ERYHALEEPTSLNTLELLPELMAANIASVKIEGRQRSPAYVSQVAKVWRQAIDRCKADPQ NFVPQSAWMETLGSMSEGTQTTLGAYHRKWQ >gi|296493317|gb|ADTK01000184.1| GENE 4 2242 - 3120 925 292 aa, chain + ## HITS:1 COG:yhbV KEGG:ns NR:ns ## COG: yhbV COG0826 # Protein_GI_number: 16131051 # Func_class: O Posttranslational modification, protein 
turnover, chaperones # Function: Collagenase and related proteases # Organism: Escherichia coli K12 # 1 292 7 298 298 600 100.0 1e-171 MKYSLGPVLWYWPKETLEEFYQQAATSSADVIYLGEAVCSKRRATKVGDWLEMAKSLAGS GKQIVLSTLALVQASSELGELKRYVENGEFLIEASDLGVVNMCAERKLPFVAGHALNCYN AVTLKILLKQGMMRWCMPVELSRDWLVNLLNQCDELGIRNQFEVEVLSYGHLPLAYSARC FTARSEDRPKDECETCCIKYPNGRNVLSQENQQVFVLNGIQTMSGYVYNLGNELASMQGL VDVVRLSPQGTDTFAMLDAFRANENGAAPLPLTANSDCNGYWRRLAGLELQA >gi|296493317|gb|ADTK01000184.1| GENE 5 3326 - 4333 945 335 aa, chain + ## HITS:1 COG:yhbW KEGG:ns NR:ns ## COG: yhbW COG2141 # Protein_GI_number: 16131052 # Func_class: C Energy production and conversion # Function: Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases # Organism: Escherichia coli K12 # 1 335 1 335 335 674 99.0 0 MTDKTIAFSLLDLAPIPEGSSAREAFSHSLDLARLAEKRGYHRYWLAEHHNMTGIASAAT SVLIGYLAANTTTLHLGSGGVMLPNHSPLVIAEQFGTLNTLYPGRIDLGLGRAPGSDQRT MMALRRHMSGDIDNFPRDVAELVDWFDARDPNPNVRPVPGYGEKIPVWLLGSSLYSAQLA AQLGLPFAFASHFAPDMLFQALHLYRSNFKPSARLEKPYAMVCINIIAADSNRDAEFLFT SMQQAFVKLRRGETGQLPPPIQNMDQFWSPSEQYGVQQALSMSLVGDKAKVRHGLQSILR ETDADEIMVNGQIFDHQARLHSFELAMDVKEELLG >gi|296493317|gb|ADTK01000184.1| GENE 6 4451 - 5695 1542 414 aa, chain - ## HITS:1 COG:ECs4042 KEGG:ns NR:ns ## COG: ECs4042 COG0814 # Protein_GI_number: 15833296 # Func_class: E Amino acid transport and metabolism # Function: Amino acid permeases # Organism: Escherichia coli O157:H7 # 1 414 1 414 414 694 99.0 0 MATLTTTQTSPSLLGGVVIIGGTIIGAGMFSLPVVMSGAWFFWSMAALIFTWFCMLHSGL MILEANLNYRIGSSFDTITKDLLGKGWNVVNGISIAFVLYILTYAYISASGSILHHTFAE MSLNVPARAAGFGFALLVAFVVWLSTKAVSRMTAIVLGAKVITFFLTFGSLLGHVQPATL FNVAESNASYAPYLLMTLPFCLASFGYHGNVPSLMKYYGKDPKTIVKCLVYGTLMALALY TIWLLATMGNIPRPEFIGIAEKGGNIDVLVQALSGVLNSRSLDLLLVVFSNFAVASSFLG VTLGLFDYLADLFGFDDSALGRLKTALLTFAPPVVGGLLFPNGFLYAIGYAGLAATIWAA IVPALLARASRKRFGSPKFRVWGGKPMITLILVFGVGNALVHILSSFNLLPVYQ >gi|296493317|gb|ADTK01000184.1| GENE 7 5849 - 7738 2427 629 aa, chain - ## HITS:1 
COG:ECs4043 KEGG:ns NR:ns ## COG: ECs4043 COG0513 # Protein_GI_number: 15833297 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Escherichia coli O157:H7 # 1 629 18 646 646 1055 99.0 0 MAEFETTFADLGLKAPILEALNDLGYEKPSPIQAECIPHLLNGRDVLGMAQTGSGKTAAF SLPLLQNLDPELKAPQILVLAPTRELAVQVAEAMTDFSKHMRGVNVVALYGGQRYDVQLR ALRQGPQIVVGTPGRLLDHLKRGTLDLSKLSGLVLDEADEMLRMGFIEDVETIMAQIPEG HQTALFSATMPEAIRRITRRFMKEPQEVRIQSSVTTRPDISQSYWTVWGMRKNEALVRFL EAEDFDAAIIFVRTKNATLEVAEALERNGYNSAALNGDMNQALREQTLERLKDGRLDILI ATDVAARGLDVERISLVVNYDIPMDSESYVHRIGRTGRAGRAGRALLFVENRERRLLRNI ERTMKLTIPEVELPNAELLGKRRLEKFAAKVQQQLESSDLDQYRALLSKIQPTAEGEELD LETLAAALLKMAQGERTLIVPPDAPMRPKREFRDRDDRGPRDRNDRGPRGDREDRPRRER RDVGDMQLYRIEVGRDDGVEVRHIVGAIANEGDISSRYIGNIKLFASHSTIELPKGMPGE VLQHFTRTRILNKPMNMQLLGDAQPHTGGERRGGGRGFGGERREGGRNFSGERREGGRGD GRRFSGERREGRAPRRDDSTGRRRFGGDA >gi|296493317|gb|ADTK01000184.1| GENE 8 7918 - 8802 587 294 aa, chain - ## HITS:1 COG:ECs4044 KEGG:ns NR:ns ## COG: ECs4044 COG4785 # Protein_GI_number: 15833298 # Func_class: R General function prediction only # Function: Lipoprotein NlpI, contains TPR repeats # Organism: Escherichia coli O157:H7 # 1 294 1 294 294 557 99.0 1e-159 MKPFLRWCFVATALTLAGCSNTSWRKSEVLAVPLQPTLQQEVILARMEQILASRALTDDE RAQLLYERGVLYDSLGLRALARNDFSQALAIRPDMPEVFNYLGIYLTQAGNFDAAYEAFD SVLELDPTYNYAHLNRGIALYYGGRDKLAQDDLLAFYQDDPNDPFRSLWLYLAEQKLDEK QAKEVLKQHFEKSDKEQWGWNIVEFYLGNISEQTLMERLKADATDNTSLAEHLSETNFYL GKYYLSLGDLDSATALFKLAVANNVYNFVEHRYALLELSLLGQDQDDLAESDQQ >gi|296493317|gb|ADTK01000184.1| GENE 9 8911 - 11046 188 711 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229537485|ref|ZP_04426621.1| ribosomal protein S1 [Planctomyces limnophilus DSM 3776] # 621 703 433 516 557 77 46 1e-13 MLNPIVRKFQYGQHTVTLETGMMARQATAAVMVSMDDTAVFVTVVGQKKAKPGQDFFPLT VNYQERTYAAGRIPGSFFRREGRPSEGETLIARLIDRPIRPLFPEGFVNEVQVIATVVSV NPQVNPDIVAMIGASAALSLSGIPFNGPIGAARVGYINDQYVLNPTQDELKESKLDLVVA 
GTEAAVLMVESEAELLSEDQMLGAVVFGHEQQQVVIQNINELVKEAGKPRWDWQPEPVNE ALNARVAALAEARLSDAYRITDKQERYAQVDVIKSETIATLLAEDETLDENELGEILHAI EKNVVRSRVLAGEPRIDGREKDMIRGLDVRTGVLPRTHGSALFTRGETQALVTATLGTAR DAQVLDELMGERTDTFLFHYNFPPYSVGETGMVGSPKRREIGHGRLAKRGVLAVMPDMDK FPYTVRVVSEITESNGSSSMASVCGASLALMDAGVPIKAAVAGIAMGLVKEGDNYVVLSD ILGDEDHLGDMDFKVAGSRDGISALQMDIKIEGITKEIMQVALNQAKGARLHILGVMEQA INAPRGDISEFAPRIHTIKINPDKIKDVIGKGGSVIRALTEETGTTIEIEDDGTVKIAAT DGEKAKHAIRRIEEITAEIEVGRVYTGKVTRIVDFGAFVAIGGGKEGLVHISQIADKRVE KVTDYLQMGQEVPVKVLEVDRQGRIRLSIKEATEQSQPAAAPEAPAAEQGE >gi|296493317|gb|ADTK01000184.1| GENE 10 11293 - 11562 445 89 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|16131057|ref|NP_417634.1| 30S ribosomal subunit protein S15 [Escherichia coli str. K-12 substr. MG1655] # 1 89 1 89 89 176 100 2e-43 MSLSTEATAKIVSEFGRDANDTGSTEVQVALLTAQINHLQGHFAEHKKDHHSRRGLLRMV SQRRKLLDYLKRKDVARYTQLIERLGLRR >gi|296493317|gb|ADTK01000184.1| GENE 11 11711 - 12655 1038 314 aa, chain - ## HITS:1 COG:ECs4047 KEGG:ns NR:ns ## COG: ECs4047 COG0130 # Protein_GI_number: 15833301 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridine synthase # Organism: Escherichia coli O157:H7 # 1 314 1 314 314 615 100.0 1e-176 MSRPRRRGRDINGVLLLDKPQGMSSNDALQKVKRIYNANRAGHTGALDPLATGMLPICLG EATKFSQYLLDSDKRYRVIARLGQRTDTSDADGQIVEERPVTFSAEQLAAALDTFRGDIE QIPSMYSALKYQGKKLYEYARQGIEVPREARPITVYELLFIRHEGNELELEIHCSKGTYI RTIIDDLGEKLGCGAHVIYLRRLAVSKYPVERMVTLEHLRELVEQAEQQDIPAAELLDPL LMPMDSPASDYPVVNLPLTSSVYFKNGNPVRTSGAPLEGLVRVTEGENGKFIGMGEIDDE GRVAPRRLVVEYPA >gi|296493317|gb|ADTK01000184.1| GENE 12 12655 - 13056 628 133 aa, chain - ## HITS:1 COG:ECs4048 KEGG:ns NR:ns ## COG: ECs4048 COG0858 # Protein_GI_number: 15833302 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-binding factor A # Organism: Escherichia coli O157:H7 # 1 133 1 133 133 239 100.0 1e-63 MAKEFGRPQRVAQEMQKEIALILQREIKDPRLGMMTTVSGVEMSRDLAYAKVYVTFLNDK DEDAVKAGIKALQEASGFIRSLLGKAMRLRIVPELTFFYDNSLVEGMRMSNLVTSVVKHD EERRVNPDDSKED 
>gi|296493317|gb|ADTK01000184.1| GENE 13 13220 - 15892 3073 890 aa, chain - ## HITS:1 COG:ECs4049 KEGG:ns NR:ns ## COG: ECs4049 COG0532 # Protein_GI_number: 15833303 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 2 (IF-2; GTPase) # Organism: Escherichia coli O157:H7 # 1 890 1 890 890 1394 100.0 0 MTDVTIKTLAAERQTSVERLVQQFADAGIRKSADDSVSAQEKQTLIDHLNQKNSGPDKLT LQRKTRSTLNIPGTGGKSKSVQIEVRKKRTFVKRDPQEAERLAAEEQAQREAEEQARREA EESAKREAQQKAEREAAEQAKREAAEQAKREAAEKDKVSNQQDDMTKNAQAEKARREQEA AELKRKAEEEARRKLEEEARRVAEEARRMAEENKWTDNAEPTEDSSDYHVTTSQHARQAE DESDREVEGGRGRGRNAKAARPKKGNKHAESKADREEARAAVRGGKGGKRKGSSLQQGFQ KPAQAVNRDVVIGETITVGELANKMAVKGSQVIKAMMKLGAMATINQVIDQETAQLVAEE MGHKVILRRENELEEAVMSDRDTGAAAEPRAPVVTIMGHVDHGKTSLLDYIRSTKVASGE AGGITQHIGAYHVETENGMITFLDTPGHAAFTSMRARGAQATDIVVLVVAADDGVMPQTI EAIQHAKAAQVPVVVAVNKIDKPEADPDRVKNELSQYGILPEEWGGESQFVHVSAKAGTG IDELLDAILLQAEVLELKAVRKGMASGAVIESFLDKGRGPVATVLVREGTLHKGDIVLCG FEYGRVRAMRNELGQEVLEAGPSIPVEILGLSGVPAAGDEVTVVRDEKKAREVALYRQGK FREVKLARQQKSKLENMFANMTEGEVHEVNIVLKADVQGSVEAISDSLLKLSTDEVKVKI IGSGVGGITETDATLAAASNAILVGFNVRADASARKVIEAESLDLRYYSVIYNLIDEVKA AMSGMLSPELKQQIIGLAEVRDVFKSPKFGAIAGCMVTEGVVKRHNPIRVLRDNVVIYEG ELESLRRFKDDVNEVRNGMECGIGVKNYNDVRTGDVIEVFEIIEIQRTIA >gi|296493317|gb|ADTK01000184.1| GENE 14 15917 - 17404 1026 495 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|17988250|ref|NP_540884.1| transcription elongation factor NusA [Brucella melitensis 16M] # 4 482 9 483 537 399 43 1e-111 MNKEILAVVEAVSNEKALPREKIFEALESALATATKKKYEQEIDVRVQIDRKSGDFDTFR RWLVVDEVTQPTKEITLEAARYEDESLNLGDYVEDQIESVTFDRITTQTAKQVIVQKVRE AERAMVVDQFREHEGEIITGVVKKVNRDNISLDLGNNAEAVILREDMLPRENFRPGDRVR GVLYSVRPEARGAQLFVTRSKPEMLIELFRIEVPEIGEEVIEIKAAARDPGSRAKIAVKT NDKRIDPVGACVGMRGARVQAVSTELGGERIDIVLWDDNPAQFVINAMAPADVASIVVDE DKHTMDIAVEAGNLAQAIGRNGQNVRLASQLSGWELNVMTVDDLQAKHQAEAHAAIDTFT KYLDIDEDFATVLVEEGFSTLEELAYVPMKELLEIEGLDEPTVEALRERAKNALATIAQA QEESLGDNKPADDLLNLEGVDRDLAFKLAARGVCTLEDLAEQGIDDLADIEGLTDEKAGA LIMAARNICWFGDEA 
>gi|296493317|gb|ADTK01000184.1| GENE 15 17432 - 17854 380 140 aa, chain - ## HITS:1 COG:ECs4051 KEGG:ns NR:ns ## COG: ECs4051 COG0779 # Protein_GI_number: 15833305 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 140 13 152 152 273 100.0 5e-74 MITAPVEALGFELVGIEFIRGRTSTLRIYIDSEDGINVDDCADVSHQVSAVLDVEDPITV AYNLEVSSPGLDRPLFTAEHYARFVGEEVTLVLRMAVQNRRKWQGVIKAVDGEMITVTVE GKDEVFALSNIQKANLVPHF >gi|296493317|gb|ADTK01000184.1| GENE 16 18515 - 19858 1623 447 aa, chain + ## HITS:1 COG:argG KEGG:ns NR:ns ## COG: argG COG0137 # Protein_GI_number: 16131063 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate synthase # Organism: Escherichia coli K12 # 1 447 1 447 447 911 99.0 0 MTTILKHLPVGQRIGIAFSGGLDTSAALLWMRQKGAVPYAYTANLGQPDEEDYDAIPRRA MEYGAENARLIDCRKQLVAEGIAAIQCGAFHNTTGGLTYFNTTPLGRAVTGTMLVAAMKE DGVNIWGDGSTYKGNDIERFYRYGLLTNAELQIYKPWLDTDFIDELGGRHEMSEFMIACG FDYKMSVEKAYSTDSNMLGATHEAKDLEYLNSSVKIVNPIMGVKFWDESVKIPAEEVTVR FEQGHPVALNGKTFSDDVEMMLEANRIGGRHGLGMSDQIENRIIEAKSRGIYEAPGMALL HIAYERLLTGIHNEDTIEQYHAHGRQLGRLLYQGRWFDSQALMLRDSLQRWVASQITGEV TLELRRGNDYSILNTVSENLTYKPERLTMEKGDSVFSPDDRIGQLTMRNLDITDTREKLF GYAKTGLLSSSATSGVPQVENLENKGQ >gi|296493317|gb|ADTK01000184.1| GENE 17 19866 - 21491 634 541 aa, chain - ## HITS:1 COG:yhbX KEGG:ns NR:ns ## COG: yhbX COG2194 # Protein_GI_number: 16131064 # Func_class: R General function prediction only # Function: Predicted membrane-associated, metal-dependent hydrolase # Organism: Escherichia coli K12 # 1 541 7 547 547 1091 99.0 0 MTVFNKFARSFKSHWLLYLCVILFGITNLVASSGAHMVQRLLFFVLTILVVKRISSLPLR LLVAAPFVLLTAADMSISLYSWCTFGTTFNDGFAISVLQSDPDEVVKMLGMYIPYLCAFA FLSLLFLAVIIKYDVSLPTKKVTGILLLIVISGSLFSACQFAYKDAKNKKAFSPYILASR FATYTPFFNLNYFALAAKEHQRLLSIANTVPYFQLSVRDTGIDTYVLIVGESVRVDNMSL YGYTRSTTPQVEAQRKQIKLFNQAISGAPYTALSVPLSLTADSVLSHDIHNYPDNIINMA NQAGFQTFWLSSQSAFRQNGTAVTSIAMRAMETVYVRGFDELLLPHLSQALQQNTQQKKL 
IVLHLNGSHEPACSAYPQSSAVFQPQDDQDACYDNSIHYTDSLLGQVFELLKDRRASVMY FADHGLERDPTKKNVYFHGGREASQQAYHVPMFIWYSPVLGDGVDRTTENNIFSTAYNNY LINAWMGVTKPEQPQTLEEVIAHYKGDSRVVDANHDVFDYVMLRKEFTEDKQGNPTPEGQ G >gi|296493317|gb|ADTK01000184.1| GENE 18 22051 - 22383 363 110 aa, chain - ## HITS:1 COG:ECs4054 KEGG:ns NR:ns ## COG: ECs4054 COG1314 # Protein_GI_number: 15833308 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecG # Organism: Escherichia coli O157:H7 # 1 110 1 110 110 185 99.0 2e-47 MYEALLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLFGSSGSGNFMTRMTALLA TLFFIISLVLGNINSNKTNKGSEWENLSAPVKTEQTQPAAPAKPTSDIPN >gi|296493317|gb|ADTK01000184.1| GENE 19 22611 - 23948 1566 445 aa, chain - ## HITS:1 COG:mrsA KEGG:ns NR:ns ## COG: mrsA COG1109 # Protein_GI_number: 16131066 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Escherichia coli K12 # 1 445 1 445 445 836 100.0 0 MSNRKYFGTDGIRGRVGDAPITPDFVLKLGWAAGKVLARHGSRKIIIGKDTRISGYMLES ALEAGLAAAGLSALFTGPMPTPAVAYLTRTFRAEAGIVISASHNPFYDNGIKFFSIDGTK LPDAVEEAIEAEMEKEISCVDSAELGKASRIVDAAGRYIEFCKATFPNELSLSELKIVVD CANGATYHIAPNVLRELGANVIAIGCEPNGVNINAEVGATDVRALQARVLAEKADLGIAF DGDGDRVIMVDHEGNKVDGDQIMYIIAREGLRQGQLRGGAVGTLMSNMGLELALKQLGIP FARAKVGDRYVLEKMQEKGWRIGAENSGHVILLDKTTTGDGIVAGLQVLAAMARNHMSLH DLCSGMKMFPQILVNVRYTAGSGDPLEHESVKAVTAEVEAALGNRGRVLLRKSGTEPLIR VMVEGEDEAQVTEFAHRIADAVKAV >gi|296493317|gb|ADTK01000184.1| GENE 20 23941 - 24789 847 282 aa, chain - ## HITS:1 COG:ECs4056 KEGG:ns NR:ns ## COG: ECs4056 COG0294 # Protein_GI_number: 15833310 # Func_class: H Coenzyme transport and metabolism # Function: Dihydropteroate synthase and related enzymes # Organism: Escherichia coli O157:H7 # 1 282 16 297 297 556 100.0 1e-158 MKLFAQGTSLDLSHPHVMGILNVTPDSFSDGGTHNSLIDAVKHANLMINAGATIIDVGGE STRPGAAEVSVEEELQRVIPVVEAIAQRFEVWISVDTSKPEVIRESAKVGAHIINDIRSL SEPGALEAAAETGLPVCLMHMQGNPKTMQEAPKYDDVFAEVNRYFIEQIARCEQAGIAKE 
KLLLDPGFGFGKNLSHNYSLLARLAEFHHFNLPLLVGMSRKSMIGQLLNVGPSERLSGSL ACAVIAAMQGAHIIRVHDVKETVEAMRVVEATLSAKENKRYE Prediction of potential genes in microbial genomes Time: Mon May 16 15:37:01 2011 Seq name: gi|296493316|gb|ADTK01000185.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont534.4, whole genome shotgun sequence Length of sequence - 5487 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 7 - 43 7.1 1 1 Op 1 13/0.000 - CDS 81 - 2024 1683 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 - Prom 2054 - 2113 4.2 2 1 Op 2 . - CDS 2115 - 2744 548 ## COG0293 23S rRNA methylase - Prom 2822 - 2881 4.2 + Prom 2643 - 2702 3.0 3 2 Tu 1 . + CDS 2870 - 3163 454 ## PROTEIN SUPPORTED gi|188532496|ref|YP_001906293.1| Predicted RNA-binding protein containing KH domain, possibly ribosomal protein + Term 3249 - 3288 7.4 - Term 3244 - 3270 1.7 4 3 Tu 1 . - CDS 3319 - 3795 623 ## COG0782 Transcription elongation factor - Prom 3942 - 4001 5.7 + Prom 3881 - 3940 3.2 5 4 Tu 1 . + CDS 4043 - 5476 1097 ## COG2027 D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 4) Predicted protein(s) >gi|296493316|gb|ADTK01000185.1| GENE 1 81 - 2024 1683 647 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. 
McKiel] # 1 599 1 597 636 652 56 0.0 MSDMAKNLILWLVIAVVLMSVFQSFGPSESNGRKVDYSTFLQEVNNDQVREARINGREIN VTKKDSNRYTTYIPVQDPKLLDNLLTKNVKVVGEPPEEPSLLASIFISWFPMLLLIGVWI FFMRQMQGGGGKGAMSFGKSKARMLTEDQIKTTFADVAGCDEAKEEVAELVEYLREPSRF QKLGGKIPKGVLMVGPPGTGKTLLAKAIAGEAKVPFFTISGSDFVEMFVGVGASRVRDMF EQAKKAAPCIIFIDEIDAVGRQRGAGLGGGHDEREQTLNQMLVEMDGFEGNEGIIVIAAT NRPDVLDPALLRPGRFDRQVVVGLPDVRGREQILKVHMRRVPLAPDIDAAIIARGTPGFS GADLANLVNEAALFAARGNKRVVSMVEFEKAKDKIMMGAERRSMVMTEAQKESTAYHEAG HAIIGRLVPEHDPVHKVTIIPRGRALGVTFFLPEGDAISASRQKLESQISTLYGGRLAEE IIYGPEHVSTGASNDIKVATNLARNMVTQWGFSEKLGPLLYAEEEGEVFLGRSVAKAKHM SDETARIIDQEVKALIERNYNRARQLLTDNMDILHAMKDALMKYETIDAPQIDDLMARRD VRPPAGWEEPGASNNSGDNGSPKAPRPVDEPRTPNPGNTMSEQLGDK >gi|296493316|gb|ADTK01000185.1| GENE 2 2115 - 2744 548 209 aa, chain - ## HITS:1 COG:ECs4058 KEGG:ns NR:ns ## COG: ECs4058 COG0293 # Protein_GI_number: 15833312 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 23S rRNA methylase # Organism: Escherichia coli O157:H7 # 1 209 1 209 209 409 100.0 1e-114 MTGKKRSASSSRWLQEHFSDKYVQQAQKKGLRSRAWFKLDEIQQSDKLFKPGMTVVDLGA APGGWSQYVVTQIGGKGRIIACDLLPMDPIVGVDFLQGDFRDELVMKALLERVGDSKVQV VMSDMAPNMSGTPAVDIPRAMYLVELALEMCRDVLAPGGSFVVKVFQGEGFDEYLREIRS LFTKVKVRKPDSSRARSREVYIVATGRKP >gi|296493316|gb|ADTK01000185.1| GENE 3 2870 - 3163 454 97 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|188532496|ref|YP_001906293.1| Predicted RNA-binding protein containing KH domain, possibly ribosomal protein [Erwinia tasmaniensis Et1/99] # 1 97 1 97 97 179 93 4e-45 MNLSTKQKQHLKGLAHPLKPVVLLGSNGLTEGVLAEIEQALEHHELIKVKIATEDRETKT LIVEAIVRETGACNVQVIGKTLVLYRPTKERKISLPR >gi|296493316|gb|ADTK01000185.1| GENE 4 3319 - 3795 623 158 aa, chain - ## HITS:1 COG:STM3299 KEGG:ns NR:ns ## COG: STM3299 COG0782 # Protein_GI_number: 16766595 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Salmonella typhimurium LT2 # 1 158 1 158 158 282 96.0 2e-76 MQAIPMTLRGAEKLREELDFLKSVRRPEIIAAIAEAREHGDLKENAEYHAAREQQGFCEG 
RIKDIEAKLSNAQVIDVTKMPNNGRVIFGATVTVLNLDSDEEQTYRIVGDDEADFKQNLI SVNSPIARGLIGKEEDDVVVIKTPGGEVEFEVIKVEYL >gi|296493316|gb|ADTK01000185.1| GENE 5 4043 - 5476 1097 477 aa, chain + ## HITS:1 COG:dacB KEGG:ns NR:ns ## COG: dacB COG2027 # Protein_GI_number: 16131072 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 4) # Organism: Escherichia coli K12 # 1 477 1 477 477 952 99.0 0 MRFSRFIIGLTSCIAFSVQAANVDEYVTQLPAGANLALMVQKVGASAPAIDYHSQQMALP ASTQKVITALAALIQLGPDFRFTTTLETKGNVENGVLKGDLVARFGADPTLKRQDIRNMV ATLKKSGVNQIDGNVLIDTSIFASHDKAPGWPWNDMTQCFSAPPAAAIVDRNCFSVSLYS AQKPGDMAFIRVASYYPVTMFSQVRTLPRGSAEAQYCELDVVPGDLNRFTLTGCLPQRSE PLPLAFAVQDGASYAGAILKDELKQAGITWSGTLLRQTQVNEPGTVVASKQSAPLHDLLK IMLKKSDNMIADTVFRMIGHARFNVPGTWRAGSDAVRQILRQQAGVDIGNTIIADGSGLS RHNLIAPATMMQVLQYIAQHDNELNFISMLPLAGYDGSLQYRAGLHQAGVDGKVSAKTGS LQGVYNLAGFITTASGQRMAFVQYLSGYAVEPADQRNRRIPLVRFESRLYKDIYQNN Prediction of potential genes in microbial genomes Time: Mon May 16 15:37:02 2011 Seq name: gi|296493315|gb|ADTK01000186.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont534.5, whole genome shotgun sequence Length of sequence - 2341 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 - CDS 99 - 1271 1371 ## COG0536 Predicted GTPase 2 1 Op 2 . 
- CDS 1287 - 2252 751 ## PROTEIN SUPPORTED gi|46133178|ref|ZP_00156740.2| COG0697: Permeases of the drug/metabolite transporter (DMT) superfamily Predicted protein(s) >gi|296493315|gb|ADTK01000186.1| GENE 1 99 - 1271 1371 390 aa, chain - ## HITS:1 COG:ECs4062 KEGG:ns NR:ns ## COG: ECs4062 COG0536 # Protein_GI_number: 15833316 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Escherichia coli O157:H7 # 1 390 1 390 390 687 99.0 0 MKFVDEASILVVAGDGGNGCVSFRREKYIPKGGPDGGDGGDGGDVWMEADENLNTLIDYR FEKSFRAERGQNGASRDCTGKRGKDVTIKVPVGTRVIDQGTGETMGDMTKHGQRLLVAKG GWHGLGNTRFKSSVNRTPRQKTNGTPGDKRELLLELMLLADVGMLGMPNAGKSTFIRAVS AAKPKVADYPFTTLVPSLGVVRMDNEKSFVVADIPGLIEGAAEGAGLGIRFLKHLERCRV LLHLIDIDPIDGTDPVENARIIISELEKYSQDLAAKPRWLVFNKIDLLDKAEAEEKAKAI AEALGWEDKYYLISAASGLGVKDLCWDVMTFIIENPVVQAEEAKQPEKVEFMWDDYHRQQ LEEIAEEDDEDWDDDWDEDDEEGVEFIYKR >gi|296493315|gb|ADTK01000186.1| GENE 2 1287 - 2252 751 321 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46133178|ref|ZP_00156740.2| COG0697: Permeases of the drug/metabolite transporter (DMT) superfamily [Haemophilus influenzae R2866] # 1 303 1 300 306 293 49 7e-80 MKQQAGIGILLALTTAICWGALPIAMKQVLEVMEPPTIVFYRFLMASIGLGAILAVKKRL PPLRVFRKPRWLILLAVATAGLFGNFILFSSSLQYLSPTASQVIGQLSPVGMMVASVFIL KEKMRSTQVVGALMLLSGLVMFFNTSLVEIFTKLTDYTWGVIFGVGAATVWVSYGVAQKV LLRRLASPQILFLLYTLCTIALFPLAKPGVIAQLSHWQLACLIFCGLNTLVGYGALAEAM ARWQAAQVSAIITLTPLFTLFFSDLLSLAWPDFFARPMLNLLGYLGAFVVVAGAMYSAIG HRIWGGLRKHTTVVSQPRAGE Prediction of potential genes in microbial genomes Time: Mon May 16 15:37:04 2011 Seq name: gi|296493314|gb|ADTK01000187.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont534.6, whole genome shotgun sequence Length of sequence - 1916 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 1 - 29 2.1 1 1 Op 1 32/0.000 - CDS 67 - 324 437 ## PROTEIN SUPPORTED gi|15803725|ref|NP_289759.1| 50S ribosomal protein L27 2 1 
Op 2 . - CDS 345 - 680 562 ## PROTEIN SUPPORTED gi|224585100|ref|YP_002638899.1| 50S ribosomal protein L21 - Prom 722 - 781 3.6 + Prom 782 - 841 3.2 3 2 Tu 1 . + CDS 915 - 1886 983 ## COG0142 Geranylgeranyl pyrophosphate synthase Predicted protein(s) >gi|296493314|gb|ADTK01000187.1| GENE 1 67 - 324 437 85 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803725|ref|NP_289759.1| 50S ribosomal protein L27 [Escherichia coli O157:H7 EDL933] # 1 85 1 85 85 172 100 1e-43 MAHKKAGGSTRNGRDSEAKRLGVKRFGGESVLAGSIIVRQRGTKFHAGANVGCGRDHTLF AKADGKVKFEVKGPKNRKFISIEAE >gi|296493314|gb|ADTK01000187.1| GENE 2 345 - 680 562 111 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|224585100|ref|YP_002638899.1| 50S ribosomal protein L21 [Salmonella enterica subsp. enterica serovar Paratyphi C strain RKS4594] # 1 111 1 111 111 221 97 5e-58 MCAEAEFYMYAVFQSGGKQHRVSEGQTVRLEKLDIATGETVEFAEVLMIANGEEVKIGVP FVDGGVIKAEVVAHGRGEKVKIVKFRRRKHYRKQQGHRQWFTDVKITGISA >gi|296493314|gb|ADTK01000187.1| GENE 3 915 - 1886 983 323 aa, chain + ## HITS:1 COG:ispB KEGG:ns NR:ns ## COG: ispB COG0142 # Protein_GI_number: 16131077 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Escherichia coli K12 # 1 323 1 323 323 624 100.0 1e-179 MNLEKINELTAQDMAGVNAAILEQLNSDVQLINQLGYYIVSGGGKRIRPMIAVLAARAVG YEGNAHVTIAALIEFIHTATLLHDDVVDESDMRRGKATANAAFGNAASVLVGDFIYTRAF QMMTSLGSLKVLEVMSEAVNVIAEGEVLQLMNVNDPDITEENYMRVIYSKTARLFEAAAQ CSGILAGCTPEEEKGLQDYGRYLGTAFQLIDDLLDYNADGEQLGKNVGDDLNEGKPTLPL LHAMHHGTPEQAQMIRTAIEQGNGRHLLEPVLEAMNACGSLEWTRQRAEEEADKAIAALQ VLPDTPWREALIGLAHIAVQRDR Prediction of potential genes in microbial genomes Time: Mon May 16 15:37:14 2011 Seq name: gi|296493313|gb|ADTK01000188.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont534.7, whole genome shotgun sequence Length of sequence - 26441 bp Number of predicted genes - 27, with homology - 27 Number of transcription units - 8, operones - 5 average op.length - 4.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 43 - 102 
4.1 1 1 Tu 1 . + CDS 213 - 491 313 ## COG3423 Predicted transcriptional regulator + Term 495 - 549 7.2 2 2 Op 1 11/0.000 - CDS 539 - 1798 1546 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase 3 2 Op 2 6/0.000 - CDS 1853 - 2122 305 ## COG5007 Predicted transcriptional regulator, BolA superfamily - Prom 2168 - 2227 1.6 - Term 2164 - 2216 6.0 4 2 Op 3 10/0.000 - CDS 2267 - 2560 372 ## COG3113 Predicted NTP binding protein (contains STAS domain) 5 2 Op 4 13/0.000 - CDS 2560 - 3195 900 ## COG2854 ABC-type transport system involved in resistance to organic solvents, auxiliary component 6 2 Op 5 16/0.000 - CDS 3214 - 3765 605 ## COG1463 ABC-type transport system involved in resistance to organic solvents, periplasmic component 7 2 Op 6 23/0.000 - CDS 3770 - 4552 796 ## COG0767 ABC-type transport system involved in resistance to organic solvents, permease component 8 2 Op 7 . - CDS 4560 - 5369 592 ## COG1127 ABC-type transport system involved in resistance to organic solvents, ATPase component - Prom 5437 - 5496 4.4 + Prom 5457 - 5516 5.5 9 3 Op 1 6/0.000 + CDS 5579 - 6556 654 ## COG0530 Ca2+/Na+ antiporter 10 3 Op 2 13/0.000 + CDS 6570 - 7556 892 ## COG0794 Predicted sugar phosphate isomerase involved in capsule formation 11 3 Op 3 11/0.000 + CDS 7577 - 8143 732 ## COG1778 Low specificity phosphatase (HAD superfamily) 12 3 Op 4 12/0.000 + CDS 8140 - 8715 459 ## COG3117 Uncharacterized protein conserved in bacteria 13 3 Op 5 19/0.000 + CDS 8684 - 9241 542 ## COG1934 Uncharacterized protein conserved in bacteria 14 3 Op 6 17/0.000 + CDS 9248 - 9973 287 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 15 3 Op 7 11/0.000 + CDS 10021 - 11454 1336 ## COG1508 DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog 16 3 Op 8 11/0.000 + CDS 11477 - 11764 462 ## PROTEIN SUPPORTED gi|227335124|ref|ZP_03838780.1| hypothetical protein CIT292_04930 + Term 11793 - 11830 8.2 17 3 Op 9 8/0.000 + CDS 11882 - 12373 427 ## COG1762 
Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 18 3 Op 10 7/0.000 + CDS 12419 - 13273 855 ## COG1660 Predicted P-loop-containing kinase 19 3 Op 11 . + CDS 13270 - 13542 308 ## COG1925 Phosphotransferase system, HPr-related proteins + Prom 13586 - 13645 2.6 20 4 Tu 1 . + CDS 13756 - 14388 515 ## B21_03023 hypothetical protein - Term 14151 - 14191 3.7 21 5 Op 1 3/1.000 - CDS 14385 - 15056 524 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) 22 5 Op 2 2/1.000 - CDS 15110 - 15763 782 ## COG3155 Uncharacterized protein involved in an early stage of isoprenoid biosynthesis - Prom 15817 - 15876 3.7 23 6 Op 1 4/1.000 - CDS 15993 - 18329 2691 ## COG0642 Signal transduction histidine kinase - Prom 18353 - 18412 3.4 24 6 Op 2 . - CDS 18425 - 19330 898 ## COG1242 Predicted Fe-S oxidoreductase 25 7 Op 1 21/0.000 + CDS 20029 - 24489 4700 ## COG0069 Glutamate synthase domain 2 26 7 Op 2 3/1.000 + CDS 24502 - 25920 1597 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases + Term 25930 - 25976 6.1 + Prom 25947 - 26006 2.3 27 8 Tu 1 . 
+ CDS 26104 - 26439 179 ## PROTEIN SUPPORTED gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 Predicted protein(s) >gi|296493313|gb|ADTK01000188.1| GENE 1 213 - 491 313 92 aa, chain + ## HITS:1 COG:Znlp KEGG:ns NR:ns ## COG: Znlp COG3423 # Protein_GI_number: 15803728 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Escherichia coli O157:H7 EDL933 # 1 92 1 92 92 174 100.0 4e-44 MESNFIDWHPADIIAGLRKKGTSMAAESRRNGLSSSTLANALSRPWPKGEMIIAKALGTD PWVIWPSRYHDPQTHEFIDRTQLMRSYTKPKK >gi|296493313|gb|ADTK01000188.1| GENE 2 539 - 1798 1546 419 aa, chain - ## HITS:1 COG:murA KEGG:ns NR:ns ## COG: murA COG0766 # Protein_GI_number: 16131079 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Escherichia coli K12 # 1 419 1 419 419 812 100.0 0 MDKFRVQGPTKLQGEVTISGAKNAALPILFAALLAEEPVEIQNVPKLKDVDTSMKLLSQL GAKVERNGSVHIDARDVNVFCAPYDLVKTMRASIWALGPLVARFGQGQVSLPGGCTIGAR PVDLHISGLEQLGATIKLEEGYVKASVDGRLKGAHIVMDKVSVGATVTIMCAATLAEGTT IIENAAREPEIVDTANFLITLGAKISGQGTDRIVIEGVERLGGGVYRVLPDRIETGTFLV AAAISRGKIICRNAQPDTLDAVLAKLRDAGADIEVGEDWISLDMHGKRPKAVNVRTAPHP AFPTDMQAQFTLLNLVAEGTGFITETVFENRFMHVPELSRMGAHAEIESNTVICHGVEKL SGAQVMATDLRASASLVLAGCIAEGTTVVDRIYHIDRGYERIEDKLRALGANIERVKGE >gi|296493313|gb|ADTK01000188.1| GENE 3 1853 - 2122 305 89 aa, chain - ## HITS:1 COG:ECs4069 KEGG:ns NR:ns ## COG: ECs4069 COG5007 # Protein_GI_number: 15833323 # Func_class: K Transcription # Function: Predicted transcriptional regulator, BolA superfamily # Organism: Escherichia coli O157:H7 # 1 89 1 89 89 179 100.0 1e-45 MIEDPMENNEIQSVLMNALSLQEVHVSGDGSHFQVIAVGELFDGMSRVKKQQTVYGPLME YIADNRIHAVSIKAYTPAEWARDRKLNGF >gi|296493313|gb|ADTK01000188.1| GENE 4 2267 - 2560 372 97 aa, chain - ## HITS:1 COG:yrbB KEGG:ns NR:ns ## COG: yrbB COG3113 # Protein_GI_number: 16131081 # Func_class: R General function prediction only # Function: Predicted NTP binding protein (contains STAS domain) # Organism: Escherichia coli 
K12 # 1 97 33 129 129 170 100.0 6e-43 MSESLSWMQTGDTLALSGELDQDVLLPLWEMREEAVKGITCIDLSRVSRVDTGGLALLLH LIDLAKKQGNNVTLQGVNDKVYTLAKLYNLPADVLPR >gi|296493313|gb|ADTK01000188.1| GENE 5 2560 - 3195 900 211 aa, chain - ## HITS:1 COG:yrbC KEGG:ns NR:ns ## COG: yrbC COG2854 # Protein_GI_number: 16131082 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: ABC-type transport system involved in resistance to organic solvents, auxiliary component # Organism: Escherichia coli K12 # 1 211 1 211 211 394 100.0 1e-109 MFKRLMMVALLVIAPLSAATAADQTNPYKLMDEAAQKTFDRLKNEQPQIRANPDYLRTIV DQELLPYVQVKYAGALVLGQYYKSATPAQREAYFAAFREYLKQAYGQALAMYHGQTYQIA PEQPLGDKTIVPIRVTIIDPNGRPPVRLDFQWRKNSQTGNWQAYDMIAEGVSMITTKQNE WGTLLRTKGIDGLTAQLKSISQQKITLEEKK >gi|296493313|gb|ADTK01000188.1| GENE 6 3214 - 3765 605 183 aa, chain - ## HITS:1 COG:ECs4072 KEGG:ns NR:ns ## COG: ECs4072 COG1463 # Protein_GI_number: 15833326 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: ABC-type transport system involved in resistance to organic solvents, periplasmic component # Organism: Escherichia coli O157:H7 # 1 183 1 183 183 335 100.0 3e-92 MQTKKNEIWVGIFLLAALLAALFVCLKAANVTSIRTEPTYTLYATFDNIGGLKARSPVSI GGVVVGRVADITLDPKTYLPRVTLEIEQRYNHIPDTSSLSIRTSGLLGEQYLALNVGFED PELGTAILKDGDTIQDTKSAMVLEDLIGQFLYGSKGDDNKNSGDAPAAAPGNNETTEPVG TTK >gi|296493313|gb|ADTK01000188.1| GENE 7 3770 - 4552 796 260 aa, chain - ## HITS:1 COG:ECs4073 KEGG:ns NR:ns ## COG: ECs4073 COG0767 # Protein_GI_number: 15833327 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: ABC-type transport system involved in resistance to organic solvents, permease component # Organism: Escherichia coli O157:H7 # 1 260 1 260 260 438 100.0 1e-123 MLLNALASLGHKGIKTLRTFGRAGLMLFNALVGKPEFRKHAPLLVRQLYNVGVLSMLIIV VSGVFIGMVLGLQGYLVLTTYSAETSLGMLVALSLLRELGPVVAALLFAGRAGSALTAEI GLMRATEQLSSMEMMAVDPLRRVISPRFWAGVISLPLLTVIFVAVGIWGGSLVGVSWKGI 
DSGFFWSAMQNAVDWRMDLVNCLIKSVVFAITVTWISLFNGYDAIPTSAGISRATTRTVV HSSLAVLGLDFVLTALMFGN >gi|296493313|gb|ADTK01000188.1| GENE 8 4560 - 5369 592 269 aa, chain - ## HITS:1 COG:ECs4074 KEGG:ns NR:ns ## COG: ECs4074 COG1127 # Protein_GI_number: 15833328 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: ABC-type transport system involved in resistance to organic solvents, ATPase component # Organism: Escherichia coli O157:H7 # 1 269 1 269 269 505 100.0 1e-143 MEQSVANLVDMRDVSFTRGNRCIFDNISLTVPRGKITAIMGPSGIGKTTLLRLIGGQIAP DHGEILFDGENIPAMSRSRLYTVRKRMSMLFQSGALFTDMNVFDNVAYPLREHTQLPAPL LHSTVMMKLEAVGLRGAAKLMPSELSGGMARRAALARAIALEPDLIMFDEPFVGQDPITM GVLVKLISELNSALGVTCVVVSHDVPEVLSIADHAWILADKKIVAHGSAQALQANPDPRV RQFLDGIADGPVPFRYPAGDYHADLLPGS >gi|296493313|gb|ADTK01000188.1| GENE 9 5579 - 6556 654 325 aa, chain + ## HITS:1 COG:ECs4075 KEGG:ns NR:ns ## COG: ECs4075 COG0530 # Protein_GI_number: 15833329 # Func_class: P Inorganic ion transport and metabolism # Function: Ca2+/Na+ antiporter # Organism: Escherichia coli O157:H7 # 1 325 1 325 325 482 100.0 1e-136 MLLATALLIVGLLLVVYSADRLVFAASILCRTFGIPPLIIGMTVVSIGTSLPEIIVSLAA SLHEQRDLAVGTALGSNIINILLILGLAALVRPFTVHSDVLRRELPLMLLVSVVAGSVLY DGQLSRSDGIFLLFLAVLWLLFIVKLARQAERQGTDSLTREQLAELPRDGGLPVAFLWLG IALIIMPVATRMVVDNATVLANYFAISELTMGLTAIAIGTSLPELATAIAGVRKGENDIA VGNIIGANIFNIVIVLGLPALITPGEIDPLAYSRDYSVMLLVSIIFALLCWRRSPQPGRG VGVLLTGGFIVWLAMLYWLSPILVE >gi|296493313|gb|ADTK01000188.1| GENE 10 6570 - 7556 892 328 aa, chain + ## HITS:1 COG:yrbH_1 KEGG:ns NR:ns ## COG: yrbH_1 COG0794 # Protein_GI_number: 16131087 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar phosphate isomerase involved in capsule formation # Organism: Escherichia coli K12 # 1 212 1 212 212 424 100.0 1e-118 MSHVELQPGFDFQQAGKEVLAIERECLAELDQYINQNFTLACEKMFWCKGKVVVMGMGKS GHIGRKMAATFASTGTPSFFVHPGEAAHGDLGMVTPQDVVIAISNSGESSEITALIPVLK RLHVPLICITGRPESSMARAADVHLCVKVAKEACPLGLAPTSSTTATLVMGDALAVALLK 
ARGFTAEDFALSHPGGALGRKLLLRVNDIMHTGDEIPHVKKTASLRDALLEVTRKNLGMT VICDDNMMIEGIFTDGDLRRVFDMGVDVRQLSIADVMTPGGIRVRPGILAVEALNLMQSR HITSVMVADGDHLLGVLHMHDLLRAGVV >gi|296493313|gb|ADTK01000188.1| GENE 11 7577 - 8143 732 188 aa, chain + ## HITS:1 COG:ECs4077 KEGG:ns NR:ns ## COG: ECs4077 COG1778 # Protein_GI_number: 15833331 # Func_class: R General function prediction only # Function: Low specificity phosphatase (HAD superfamily) # Organism: Escherichia coli O157:H7 # 1 188 1 188 188 354 99.0 5e-98 MSKAGASLATCYGPVSADVMAKAENIRLLILDVDGVLSDGLIYMGNNGEELKAFNVRDGY GIRCALTSDIEVAIITGRKAKLVEDRCATLGITHLYQGQSNKLIAFSDLLEKLAIAPENV AYVGDDLIDWPVMEKVGLSVAVADAHPLLIPRADYVTRIAGGRGAVREVCDLLLLAQGKL DEAKGQSI >gi|296493313|gb|ADTK01000188.1| GENE 12 8140 - 8715 459 191 aa, chain + ## HITS:1 COG:ECs4078 KEGG:ns NR:ns ## COG: ECs4078 COG3117 # Protein_GI_number: 15833332 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 191 1 191 191 354 100.0 7e-98 MSKARRWVIIVLSLAVLVMIGINMAEKDDTAQVVVNNNDPTYKSEHTDTLVYNPEGALSY RLIAQHVEYYSDQAVSWFTQPVLTTFDKDKIPTWSVKADKAKLTNDRMLYLYGHVEVNAL VPDSQLRRITTDNAQINLVTQDVTSEDLVTLYGTTFNSSGLKMRGNLRSKNAELIEKVRT SYEIQNKQTQP >gi|296493313|gb|ADTK01000188.1| GENE 13 8684 - 9241 542 185 aa, chain + ## HITS:1 COG:ECs4079 KEGG:ns NR:ns ## COG: ECs4079 COG1934 # Protein_GI_number: 15833333 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 185 1 185 185 330 100.0 1e-90 MKFKTNKLSLNLVLASSLLAASIPAFAVTGDTDQPIHIESDQQSLDMQGNVVTFTGNVIV TQGTIKINADKVVVTRPGGEQGKEVIDGYGKPATFYQMQDNGKPVEGHASQMHYELAKDF VVLTGNAYLQQVDSNIKGDKITYLVKEQKMQAFSDKGKRVTTVLVPSQLQDKNNKGQTPA QKKGN >gi|296493313|gb|ADTK01000188.1| GENE 14 9248 - 9973 287 241 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 2 235 3 233 305 115 31 4e-25 
MATLTAKNLAKAYKGRRVVEDVSLTVNSGEIVGLLGPNGAGKTTTFYMVVGIVPRDAGNI IIDDDDISLLPLHARARRGIGYLPQEASIFRRLSVYDNLMAVLQIRDDLSAEQREDRANE LMEEFHIEHLRDSMGQSLSGGERRRVEIARALAANPKFILLDEPFAGVDPISVIDIKRII EHLRDSGLGVLITDHNVRETLAVCERAYIVSQGHLIAHGTPTEILQDEHVKRVYLGEDFR L >gi|296493313|gb|ADTK01000188.1| GENE 15 10021 - 11454 1336 477 aa, chain + ## HITS:1 COG:ECs4081 KEGG:ns NR:ns ## COG: ECs4081 COG1508 # Protein_GI_number: 15833335 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog # Organism: Escherichia coli O157:H7 # 1 477 1 477 477 873 100.0 0 MKQGLQLRLSQQLAMTPQLQQAIRLLQLSTLELQQELQQALESNPLLEQIDTHEEIDTRE TQDSETLDTADALEQKEMPEELPLDASWDTIYTAGTPSGTSGDYIDDELPVYQGETTQTL QDYLMWQVELTPFSDTDRAIATSIVDAVDDTGYLTVPLEDILESMGDEEIDIDEVEAVLK RIQRFDPVGVAAKDLRDCLLIQLSQFDKTTPWLEEARLIISDHLDLLANHDFRTLMRVTR LKEDVLKEAVNLIQSLDPRPGQSIQTGEPEYVIPDVLVRKHNGHWTVELNSDSIPRLQIN QHYASMCNNARNDGDSQFIRSNLQDAKWLIKSLESRNDTLLRVSRCIVEQQQAFFEQGEE YMKPMVLADIAQAVEMHESTISRVTTQKYLHSPRGIFELKYFFSSHVNTEGGGEASSTAI RALVKKLIAAENPAKPLSDSKLTSLLSEQGIMVARRTVAKYRESLSIPPSNQRKQLV >gi|296493313|gb|ADTK01000188.1| GENE 16 11477 - 11764 462 95 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227335124|ref|ZP_03838780.1| hypothetical protein CIT292_04930 [Citrobacter youngae ATCC 29220] # 1 95 1 95 95 182 94 2e-45 MQLNITGNNVEITEALREFVTAKFAKLEQYFDRINQVYVVLKVEKVTHTSDATLHVNGGE IHASAEGQDMYAAIDGLIDKLARQLTKHKDKLKQH >gi|296493313|gb|ADTK01000188.1| GENE 17 11882 - 12373 427 163 aa, chain + ## HITS:1 COG:ptsN KEGG:ns NR:ns ## COG: ptsN COG1762 # Protein_GI_number: 16131094 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Escherichia coli K12 # 1 163 1 163 163 303 100.0 1e-82 MTNNDTTLQLSSVLNRECTRSRVHCQSKKRALEIISELAAKQLSLPPQVVFEAILTREKM GSTGIGNGIAIPHGKLEEDTLRAVGVFVQLETPIAFDAIDNQPVDLLFALLVPADQTKTH LHTLSLVAKRLADKTICRRLRAAQSDEELYQIITDTEGTPDEA 
>gi|296493313|gb|ADTK01000188.1| GENE 18 12419 - 13273 855 284 aa, chain + ## HITS:1 COG:ECs4084 KEGG:ns NR:ns ## COG: ECs4084 COG1660 # Protein_GI_number: 15833338 # Func_class: R General function prediction only # Function: Predicted P-loop-containing kinase # Organism: Escherichia coli O157:H7 # 1 284 1 284 284 556 100.0 1e-158 MVLMIVSGRSGSGKSVALRALEDMGFYCVDNLPVVLLPDLARTLADREISAAVSIDVRNM PESPEIFEQAMSNLPDAFSPQLLFLDADRNTLIRRYSDTRRLHPLSSKNLSLESAIDKES DLLEPLRSRADLIVDTSEMSVHELAEMLRTRLLGKRERELTMVFESFGFKHGIPIDADYV FDVRFLPNPHWDPKLRPMTGLDKPVAAFLDRHTEVHNFIYQTRSYLELWLPMLETNNRSY LTVAIGCTGGKHRSVYIAEQLADYFRSRGKNVQSRHRTLEKRKP >gi|296493313|gb|ADTK01000188.1| GENE 19 13270 - 13542 308 90 aa, chain + ## HITS:1 COG:ECs4085 KEGG:ns NR:ns ## COG: ECs4085 COG1925 # Protein_GI_number: 15833339 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Escherichia coli O157:H7 # 1 90 1 90 90 149 100.0 2e-36 MTVKQTVEITNKLGMHARPAMKLFELMQGFDAEVLLRNDEGTEAEANSVIALLMLDSAKG RQIEVEATGPQEEEALAAVIALFNSGFDED >gi|296493313|gb|ADTK01000188.1| GENE 20 13756 - 14388 515 210 aa, chain + ## HITS:1 COG:no KEGG:B21_03023 NR:ns ## KEGG: B21_03023 # Name: yrbL # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 210 1 210 210 426 100.0 1e-118 MIRLSEQSPLGTGRHRKCYAHPEDAQRCIKIVYHRGDGGDKEIRRELKYYAHLGRRLKDW SGIPRYHGTVETDCGTGYVYDVIADFDGKPSITLTEFAEQCRYEEDIAQLRQLLKQLKRY LQDNRIVTMSLKPQNILCHRISESEVIPVVCDNIGESTLIPLATWSKWCCLRKQERLWKR FIAQPALAIALQKDLQPRESKTLALTSREA >gi|296493313|gb|ADTK01000188.1| GENE 21 14385 - 15056 524 223 aa, chain - ## HITS:1 COG:mtgA KEGG:ns NR:ns ## COG: mtgA COG0744 # Protein_GI_number: 16131098 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Escherichia coli K12 # 1 223 20 242 242 437 100.0 1e-123 MVVLAVFWGGGIALFSVAPVPFSAVMVERQVSAWLHGNFRYVAHSDWVSMDQISPWMGLA 
VIAAEDQKFPEHWGFDVASIEKALAHNERNENRIRGASTISQQTAKNLFLWDGRSWVRKG LEAGLTLGIETVWSKKRILTVYLNIAEFGDGVFGVEAAAQRYFHKPASKLTRSEAALLAA VLPNPLRFKVSSPSGYVRSRQAWILRQMYQLGGEPFMQQHQLD >gi|296493313|gb|ADTK01000188.1| GENE 22 15110 - 15763 782 217 aa, chain - ## HITS:1 COG:yhbL KEGG:ns NR:ns ## COG: yhbL COG3155 # Protein_GI_number: 16131099 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein involved in an early stage of isoprenoid biosynthesis # Organism: Escherichia coli K12 # 1 217 4 220 220 407 99.0 1e-113 MKKIGVILSGCGVYDGSEIHEAVLTLLAISRSGAQAVCFAPDKQQVDVINHLTGEAMTET RNVLIEAARITRGEIRPLAQADAAELDALIVPGGFGAAKNLSNFATLGSECTVDRELKAL AQAMHQAGKPLGFMCIAPAMLPKIFDFPLRLTIGTDIDTAEVLEEMGAEHVPCPVDDIVV DEDNKIVTTPAYMLAQNIAEAASGIDKLVSRVLVLAE >gi|296493313|gb|ADTK01000188.1| GENE 23 15993 - 18329 2691 778 aa, chain - ## HITS:1 COG:ZarcB_1 KEGG:ns NR:ns ## COG: ZarcB_1 COG0642 # Protein_GI_number: 15803750 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli O157:H7 EDL933 # 1 562 1 562 562 1055 100.0 0 MKQIRLLAQYYVDLMMKLGLVRFSMLLALALVVLAIVVQMAVTMVLHGQVESIDVIRSIF FGLLITPWAVYFLSVVVEQLEESRQRLSRLVQKLEEMRERDLSLNVQLKDNIAQLNQEIA VREKAEAELQETFGQLKIEIKEREETQIQLEQQSSFLRSFLDASPDLVFYRNEDKEFSGC NRAMELLTGKSEKQLVHLKPADVYSPEAAAKVIETDEKVFRHNVSLTYEQWLDYPDGRKA CFEIRKVPYYDRVGKRHGLMGFGRDITERKRYQDALERASRDKTTFISTISHELRTPLNG IVGLSRILLDTELTAEQEKYLKTIHVSAVTLGNIFNDIIDMDKMERRKVQLDNQPVDFTS FLADLENLSALQAQQKGLRFNLEPTLPLPHQVITDGTRLRQILWNLISNAVKFTQQGQVT VRVRYDEGDMLHFEVEDSGIGIPQDELDKIFAMYYQVKDSHGGKPATGTGIGLAVSRRLA KNMGGDITVTSEQGKGSTFTLTIHAPSVAEEVDDAFDEDDMPLPALNVLLVEDIELNVIV ARSVLEKLGNSVDVAMTGKAALEMFKPGEYDLVLLDIQLPDMTGLDISRELTKRYPREDL PPLVALTANVLKDKQEYLNAGMDDVLSKPLSVPALTAMIKKFWDTQDDEESTVTTEENSK SEALLDIPMLEQYLELVGPKLITDGLAVFEKMMPGYVSVLESNLTAQDKKGIVEEGHKIK GAAGSVGLRHLQQLGQQIQSPDLPAWEDNVGEWIEEMKEEWRHDVEVLKAWVAKATKK >gi|296493313|gb|ADTK01000188.1| GENE 24 18425 - 19330 898 301 aa, chain - ## HITS:1 
COG:ECs4090 KEGG:ns NR:ns ## COG: ECs4090 COG1242 # Protein_GI_number: 15833344 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductase # Organism: Escherichia coli O157:H7 # 1 301 9 309 309 627 99.0 1e-180 MFGGDLTRRYGQKVHKLTLHGGFSCPNRDGTIGRGGCTFCNVASFADEAQQHRSIAEQLA HQANLVNRAKRYLAYFQAYTSTFAEVQVLRSMYQQAVSQANIVGLCVGTRPDCVPDAVLD LLCEYKDQGYEVWLELGLQTAHDKTLHRINRGHDFACYQRTTQLARQRGLKVCSHLIVGL PGEGQAECLQTLERVVETGVDGIKLHPLHIVKGSIMAKAWEAGRLNGIELEDYTLTAGEM IRHTPPEVIYHRISASARRPTLLAPLWCENRWTGMVELDRYLNEHGVQGSALERPWIPPT E >gi|296493313|gb|ADTK01000188.1| GENE 25 20029 - 24489 4700 1486 aa, chain + ## HITS:1 COG:ECs4091_2 KEGG:ns NR:ns ## COG: ECs4091_2 COG0069 # Protein_GI_number: 15833345 # Func_class: E Amino acid transport and metabolism # Function: Glutamate synthase domain 2 # Organism: Escherichia coli O157:H7 # 379 1194 1 816 816 1647 99.0 0 MLYDKSLERDNCGFGLIAHIEGEPSHKVVRTAIHALARMQHRGAILADGKTGDGCGLLLQ KPDRFFRIVAQERGWRLAKNYAVGMLFLNKDPELAAAARRIVEEELQRETLSIVGWRDVP TNEGVLGEIALSSLPHIEQIFVNAPAGWRPRDMERRLFIARRRIEKRLEADKDFYVCSLS NLVNIYKGLCMPADLPRFYLDLADLRLESAICLFHQRFSTNTVPRWPLAQPFRYLAHNGE INTITGNRQWARARTYKFQTPLIPDLHDAAPFVNETGSDSSSMDNMLELLLAGGMDIIRA MRLLVPPAWQNNPDMDPELRAFFDFNSMHMEPWDGPAGIVMSDGRFAACNLDRNGLRPAR YVITKDKLITCASEVGIWDYQPDEVVEKGRVGPGELMVIDTRSGRILHSAETDDDLKSRH PYKEWMEKNVRRLVPFEDLPDEEVGSRELDDDTLASYQKQFNYSAEELDSVIRVLGENGQ EAVGSMGDDTPFAVLSSQPRIIYDYFRQQFAQVTNPPIDPLREAHVMSLATSIGREMNVF CEAEGQAHRLSFKSPILLYSDFKQLTTMKEEHYRADTLDITFDVTKTTLEATVKELCDKA EKMVRSGTVLLVLSDRNIAKDRLPVPAPMAVGAIQTRLVDQSLRCDANIIVETASARDPH HFAVLLGFGATAIYPYLAYETLGRLVDTHAIAKDYRTVMLNYRNGINKGLYKIMSKMGIS TIASYRCSKLFEAVGLHDDVVGLCFQGAVSRIGGASFEDFQQDLLNLSKRAWLARKPISQ GGLLKYVHGGEYHAYNPDVVRTLQQAVQSGEYSDYQEYAKLVNERPATTLRDLLAITPGE NAVNIADVEPASELFKRFDTAAMSIGALSPEAHEALAEAMNSIGGNSNSGEGGEDPARYG TNKVSRIKQVASGRFGVTPAYLVNADVIQIKVAQGAKPGEGGQLPGDKVTPYIAKLRYSV PGVTLISPPPHHDIYSIEDLAQLIFDLKQVNPKAMISVKLVSEPGVGTIATGVAKAYADL ITIAGYDGGTGASPLSSVKYAGCPWELGLVETQQALVANGLRHKIRLQVDGGLKTGVDII 
KAAILGAESFGFGTGPMVALGCKYLRICHLNNCATGVATQDDKLRKNHYHGLPFKVTNYF EFIARETRELMAQLGVTRLVDLIGRTDLLKELDGFTAKQQKLALSKLLETAEPHPGKALY CTENNPPFDNGLLNAQLLQQAKPFVDERQSKTFWFDIRNTDRSVGASLSGYIAQTHGDQG LAADPIKAYFNGTAGQSFGVWNAGGVELYLTGDANDYVGKGMAGGLIAIRPPVGSAFRSH EASIIGNTCLYGATGGRLYAAGRAGERFGVRNSGAITVVEGIGDNGCEYMTGGIVCILGK TGVNFGAGMTGGFAYVLDESGDFRKRVNPELVEVLSVDDLAIHEEHLRGLITEHVQHTGS QRGEEILANWSTFATKFALVKPKSSDVKALLGHRSRSAAELRVQAQ >gi|296493313|gb|ADTK01000188.1| GENE 26 24502 - 25920 1597 472 aa, chain + ## HITS:1 COG:gltD KEGG:ns NR:ns ## COG: gltD COG0493 # Protein_GI_number: 16131103 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Escherichia coli K12 # 1 472 1 472 472 975 100.0 0 MSQNVYQFIDLQRVDPPKKPLKIRKIEFVEIYEPFSEGQAKAQADRCLSCGNPYCEWKCP VHNYIPNWLKLANEGRIFEAAELSHQTNTLPEVCGRVCPQDRLCEGSCTLNDEFGAVTIG NIERYINDKAFEMGWRPDMSGVKQTGKKVAIIGAGPAGLACADVLTRNGVKAVVFDRHPE IGGLLTFGIPAFKLEKEVMTRRREIFTGMGIEFKLNTEVGRDVQLDDLLSDYDAVFLGVG TYQSMRGGLENEDADGVYAALPFLIANTKQLMGFGETRDEPFVSMEGKRVVVLGGGDTAM DCVRTSVRQGAKHVTCAYRRDEENMPGSRREVKNAREEGVEFKFNVQPLGIEVNGNGKVS GVKMVRTEMGEPDAKGRRRAEIVAGSEHIVPADAVIMAFGFRPHNMEWLAKHSVELDSQG RIIAPEGSDNAFQTSNPKIFAGGDIVRGSDLVVTAIAEGRKAADGIMNWLEV >gi|296493313|gb|ADTK01000188.1| GENE 27 26104 - 26439 179 112 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 [Haemophilus parasuis 29755] # 19 111 12 104 339 73 40 1e-12 MESLSERTSTGYQQIHDGIIHLVDSARTETVRSVNALMTATYWEIGRRIVEFEQGGEARA AYGAQLIKRLSKDLSLRYKRGFSAKNLRQMRLFYLFFQHVEIRQTVSGELTP Prediction of potential genes in microbial genomes Time: Mon May 16 15:37:25 2011 Seq name: gi|296493312|gb|ADTK01000189.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont534.8, whole genome shotgun sequence Length of sequence - 17986 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 15, operones - 4 average op.length - 2.5 N Tu/Op 
Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 28 - 741 434 ## PROTEIN SUPPORTED gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 + Term 759 - 791 4.9 2 2 Op 1 3/0.833 - CDS 801 - 1265 352 ## COG2731 Beta-galactosidase, beta subunit 3 2 Op 2 5/0.333 - CDS 1262 - 2137 322 ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 4 2 Op 3 1/1.000 - CDS 2134 - 2823 773 ## COG3010 Putative N-acetylmannosamine-6-phosphate epimerase - Term 2832 - 2863 2.4 5 2 Op 4 4/0.333 - CDS 2871 - 4361 1570 ## COG0477 Permeases of the major facilitator superfamily - Prom 4401 - 4460 3.0 6 3 Tu 1 . - CDS 4470 - 5363 924 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase - Prom 5398 - 5457 3.7 7 4 Tu 1 . - CDS 5485 - 6276 904 ## COG2186 Transcriptional regulators - Prom 6440 - 6499 8.8 + Prom 6494 - 6553 5.7 8 5 Tu 1 . + CDS 6656 - 8023 1210 ## COG3069 C4-dicarboxylate transporter + Term 8036 - 8075 5.0 - Term 8024 - 8063 5.0 9 6 Op 1 13/0.000 - CDS 8066 - 8563 497 ## COG2969 Stringent starvation protein B 10 6 Op 2 . - CDS 8569 - 9207 663 ## PROTEIN SUPPORTED gi|46133488|ref|ZP_00157281.2| COG0625: Glutathione S-transferase 11 7 Tu 1 . + CDS 9149 - 9358 65 ## ECP_3312 hypothetical protein + Term 9463 - 9518 -0.3 - Term 9522 - 9575 3.4 12 8 Op 1 59/0.000 - CDS 9602 - 9994 650 ## PROTEIN SUPPORTED gi|15803764|ref|NP_289798.1| 30S ribosomal protein S9 13 8 Op 2 5/0.333 - CDS 10010 - 10513 881 ## PROTEIN SUPPORTED gi|226956764|ref|YP_002807559.1| 50S ribosomal subunit protein L13 - Prom 10595 - 10654 5.6 - Term 10602 - 10634 3.1 14 9 Tu 1 . - CDS 10657 - 11628 769 ## COG1485 Predicted ATPase - Prom 11674 - 11733 2.2 + Prom 11805 - 11864 3.3 15 10 Tu 1 . 
+ CDS 11978 - 12376 504 ## COG3105 Uncharacterized protein conserved in bacteria + Term 12387 - 12419 3.8 + Prom 12426 - 12485 4.2 16 11 Op 1 6/0.167 + CDS 12530 - 13897 1484 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain + Prom 13901 - 13960 4.0 17 11 Op 2 . + CDS 13987 - 15054 1157 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain + Term 15076 - 15105 1.2 - Term 15058 - 15100 4.5 18 12 Tu 1 . - CDS 15116 - 16054 1239 ## COG0039 Malate/lactate dehydrogenases - Prom 16258 - 16317 4.3 + Prom 16404 - 16463 5.2 19 13 Tu 1 . + CDS 16489 - 16959 552 ## COG1438 Arginine repressor + Term 16964 - 17003 5.4 + Prom 17208 - 17267 7.2 20 14 Tu 1 . + CDS 17324 - 17587 327 ## S3493 hypothetical protein + Term 17615 - 17645 4.1 21 15 Tu 1 . - CDS 17643 - 17915 343 ## COG2732 Barstar, RNAse (barnase) inhibitor Predicted protein(s) >gi|296493312|gb|ADTK01000189.1| GENE 1 28 - 741 434 237 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 [Haemophilus parasuis 29755] # 1 234 103 338 339 171 38 2e-42 LPWSTYVRLLSVKNADARSFYEKETLRCGWSVRQLERQIATQFYERTLLSHDKSAMLQQH APTETHILPQQAIRDPFVLEFLELKDEYSESDFEEALINHLMDFMLELGDDFAFVGRQRR LRIDDNWFRVDLLFFHRRLRCLLIVDLKVGKFSYSDAGQMNMYLNYAKEHWTLPDENPPI GLVLCAEKGAGEAHYALAGLPNTVLASEYKMQLPDEKRLADELVRTQAVLEEGYRRR >gi|296493312|gb|ADTK01000189.1| GENE 2 801 - 1265 352 154 aa, chain - ## HITS:1 COG:yhcH KEGG:ns NR:ns ## COG: yhcH COG2731 # Protein_GI_number: 16131111 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Escherichia coli K12 # 1 154 1 154 154 285 100.0 3e-77 MMMGEVQSLPSAGLHPALQDALTLALAARPQEKAPGRYELQGDNIFMNVMTFNTQSPVEK KAELHEQYIDIQLLLNGEERILFGMAGTARQCEEFHHEDDYQLCSTIDNEQAIILKPGMF AVFMPGEPHKPGCVVGEPGEIKKVVVKVKADLMA >gi|296493312|gb|ADTK01000189.1| GENE 3 1262 - 2137 322 291 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal 
protein L33 [Bacillus selenitireducens MLS10] # 6 287 8 314 323 128 30 2e-29 MTTLAIDIGGTKLAAALIGADGQIRDRRELPTPASQTPEALRDALSALVSPLQAHAQRVA IASTGIIRDGSLLALNPHNLGGLLHFPLVKTLEQLTNLPTIAINDAQAAAWAEYQALEGD VTEMVFITVSTGVGGGVVSGGKLLTGPGGLAGHIGHTLADPHGPVCGCGRTGCVEAIASG RGIAAAAQGELAGADARTIFTRAGQGDEQAQQLIHRSARTLARLIADIKATTDCQCVVVG GSVGLAEGYLALVETYLAQEPAAFHVDLLAAHYRHDAGLLGAALLAQGEKL >gi|296493312|gb|ADTK01000189.1| GENE 4 2134 - 2823 773 229 aa, chain - ## HITS:1 COG:nanE KEGG:ns NR:ns ## COG: nanE COG3010 # Protein_GI_number: 16131113 # Func_class: G Carbohydrate transport and metabolism # Function: Putative N-acetylmannosamine-6-phosphate epimerase # Organism: Escherichia coli K12 # 1 229 1 229 229 402 100.0 1e-112 MSLLAQLDQKIAANGGLIVSCQPVPDSPLDKPEIVAAMALAAEQAGAVAIRIEGVANLQA TRAVVSVPIIGIVKRDLEDSPVRITAYIEDVDALAQAGADIIAIDGTDRPRPVPVETLLA RIHHHGLLAMTDCSTPEDGLACQKLGAEIIGTTLSGYTTPETPEEPDLALVKTLSDAGCR VIAEGRYNTPAQAADAMRHGAWAVTVGSAITRLEHICQWYNTAMKKAVL >gi|296493312|gb|ADTK01000189.1| GENE 5 2871 - 4361 1570 496 aa, chain - ## HITS:1 COG:ECs4097 KEGG:ns NR:ns ## COG: ECs4097 COG0477 # Protein_GI_number: 15833351 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 496 11 506 506 917 99.0 0 MSTTTQNIPWYRHLNRAQWRAFSAAWLGYLLDGFDFVLIALVLTEVQGEFGLTTVQAASL ISAAFISRWFGGLMLGAMGDRYGRRLAMVTSIVLFSAGTLACGFAPGYITMFIARLVIGM GMAGEYGSSATYVIESWPKHLRNKASGFLISGFSVGAVVAAQVYSLVVPVWGWRALFFIG ILPIIFALWLRKNIPEAEDWKEKHGGKAPVRTMVDILYRGEHRIANIVMTLAAATALWFC FAGNLQNAAIVAVLGLLCAAIFISFMVQSTGKRWPTGVMLMVVVLFAFLYSWPIQALLPT YLKTDLAYDPHTVANVLFFSGFGAAVGCCVGGFLGDWLGTRKAYVCSLLASQLLIIPVFA IGGANVWVLGLLLFFQQMLGQGIAGILPKLIGGYFDTDQRAAGLGFTYNVGALGGALAPI IGALIAQRLDLGTALASLSFSLTFVVILLIGLDMPSRVQRWLRPEALRTHDAIDGKPFSG AVPFGSAKNDLVKTKS >gi|296493312|gb|ADTK01000189.1| GENE 6 4470 - 5363 924 297 aa, chain - ## HITS:1 
COG:ECs4098 KEGG:ns NR:ns ## COG: ECs4098 COG0329 # Protein_GI_number: 15833352 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Escherichia coli O157:H7 # 1 297 1 297 297 595 100.0 1e-170 MATNLRGVMAALLTPFDQQQALDKASLRRLVQFNIQQGIDGLYVGGSTGEAFVQSLSERE QVLEIVAEEAKGKIKLIAHVGCVSTAESQQLAASAKRYGFDAVSAVTPFYYPFSFEEHCD HYRAIIDSADGLPMVVYNIPALSGVKLTLDQINTLVTLPGVGALKQTSGDLYQMEQIRRE HPDLVLYNGYDEIFASGLLAGADGGIGSTYNIMGWRYQGIVKALKEGDIQTAQKLQTECN KVIDLLIKTGVFRGLKTVLHYMDVVSVPLCRKPFGPVDEKYLPELKALAQQLMQERG >gi|296493312|gb|ADTK01000189.1| GENE 7 5485 - 6276 904 263 aa, chain - ## HITS:1 COG:nanR KEGG:ns NR:ns ## COG: nanR COG2186 # Protein_GI_number: 16131116 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 263 1 263 263 496 100.0 1e-140 MGLMNAFDSQTEDSSPAIGRNLRSRPLARKKLSEMVEEELEQMIRRREFGEGEQLPSERE LMAFFNVGRPSVREALAALKRKGLVQINNGERARVSRPSADTIIGELSGMAKDFLSHPGG IAHFEQLRLFFESSLVRYAAEHATDEQIDLLAKALEINSQSLDNNAAFIRSDVDFHRVLA EIPGNPIFMAIHVALLDWLIAARPTVTDQALHEHNNVSYQQHIAIVDAIRRHDPDEADRA LQSHLNSVSATWHAFGQTTNKKK >gi|296493312|gb|ADTK01000189.1| GENE 8 6656 - 8023 1210 455 aa, chain + ## HITS:1 COG:dcuD KEGG:ns NR:ns ## COG: dcuD COG3069 # Protein_GI_number: 16131117 # Func_class: C Energy production and conversion # Function: C4-dicarboxylate transporter # Organism: Escherichia coli K12 # 1 455 1 455 455 744 99.0 0 MFGIIISVIVLITMGYLILKNYKPQVVLAAAGIFLMMCGVWLGFGGVLDPAKSSGYLIVD IYNEILRMLSNRIAGLGLSIMAVGGYARYMERIGASRAMVSLLSRPLKLIRSPYIILSAT YVIGQIMAQFITSASGLGMLLMVTLFPTLVSLGVSRLSAVAVIATTMSIEWGILETNSIF AAQVAGMKIATYFFHYQLPVASCVIISVAISHFFVQRAFDKKDKNINHEQAEQKALDNVP PLYYAILPVMPLILMLGSLFLAHVGLMQSELHLVVVMLLSLTVTMFVEFFRKHNLRETMD DVQAFFDGMGTQFANVVTLVVAGEIFAKGLTTIGTVDAVIRGAEHSGLGGIGVMIIMALV IAICAIVMGSGNAPFMSFASLIPNIAAGLHVPAVVMIMPMHFATTLARAVSPITAVVVVT SGIAGVSPFAVVKRTAIPMAVGFVVNMIATITLFY >gi|296493312|gb|ADTK01000189.1| GENE 9 8066 - 8563 
497 165 aa, chain - ## HITS:1 COG:ECs4101 KEGG:ns NR:ns ## COG: ECs4101 COG2969 # Protein_GI_number: 15833355 # Func_class: R General function prediction only # Function: Stringent starvation protein B # Organism: Escherichia coli O157:H7 # 1 133 1 133 165 253 100.0 1e-67 MDLSQLTPRRPYLLRAFYEWLLDNQLTPHLVVDVTLPGVQVPMEYARDGQIVLNIAPRAV GNLELANDEVRFNARFGGIPRQVSVPLAAVLAIYARENGAGTMFEPEAAYDEDTSIMNDE EASADNETVMSVIDGDKPDHDDDTHPDDEPPQPPRGGRPALRVVK >gi|296493312|gb|ADTK01000189.1| GENE 10 8569 - 9207 663 212 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46133488|ref|ZP_00157281.2| COG0625: Glutathione S-transferase [Haemophilus influenzae R2866] # 1 203 1 203 212 259 62 7e-69 MAVAANKRSVMTLFSGPTDIYSHQVRIVLAEKGVSFEIEHVEKDNPPQDLIDLNPNQSVP TLVDRELTLWESRIIMEYLDERFPHPPLMPVYPVARGESRLYMHRIEKDWYTLMNTIING SASEADAARKQLREELLAIAPVFGQKPYFLSDEFSLVDCYLAPLLWRLPQLGIEFSGPGA KELKGYMTRVFERDSFLASLTEAEREMRLGRS >gi|296493312|gb|ADTK01000189.1| GENE 11 9149 - 9358 65 69 aa, chain + ## HITS:1 COG:no KEGG:ECP_3312 NR:ns ## KEGG: ECP_3312 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_536 # Pathway: not_defined # 1 69 1 69 69 123 100.0 3e-27 MSVGPENSVITERLLAATAMKTSRYSQNFYCYQPPGGQSEVVLPSKERLSLFENQTKNEQ YPTFGQKIG >gi|296493312|gb|ADTK01000189.1| GENE 12 9602 - 9994 650 130 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803764|ref|NP_289798.1| 30S ribosomal protein S9 [Escherichia coli O157:H7 EDL933] # 1 130 1 130 130 254 100 2e-67 MAENQYYGTGRRKSSAARVFIKPGNGKIVINQRSLEQYFGRETARMVVRQPLELVDMVEK LDLYITVKGGGISGQAGAIRHGITRALMEYDESLRSELRKAGFVTRDARQVERKKVGLRK ARRRPQFSKR >gi|296493312|gb|ADTK01000189.1| GENE 13 10010 - 10513 881 167 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|226956764|ref|YP_002807559.1| 50S ribosomal subunit protein L13 [Escherichia sp. 
1_1_43] # 1 167 1 167 167 343 99 4e-94 MSCEPQQLKTFGCSPTCNYLLGKLLMKTFTAKPETVKRDWYVVDATGKTLGRLATELARR LRGKHKAEYTPHVDTGDYIIVLNADKVAVTGNKRTDKVYYHHTGHIGGIKQATFEEMIAR RPERVIEIAVKGMLPKGPLGRAMFRKLKVYAGNEHNHAAQQPQVLDI >gi|296493312|gb|ADTK01000189.1| GENE 14 10657 - 11628 769 323 aa, chain - ## HITS:1 COG:ECs4105 KEGG:ns NR:ns ## COG: ECs4105 COG1485 # Protein_GI_number: 15833359 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Escherichia coli O157:H7 # 1 323 53 375 375 656 99.0 0 MARVGKLWGKREDTKHTPVRGLYMWGGVGRGKTWLMDLFYQSLPGERKQRLHFHRFMMRV HEELTALQGQTDPLEIIADRFKAETDVLCFDEFFVSDITDAMLLGGLMKALFARGITLVA TSNIPPDELYRNGLQRARFLPAIDAIKQHCDVMNVDAGVDYRLRTLTQAHLWLSPLNDET RAQMDKLWLALAGAKRENSPTLEINHRPLATMGVENQTLAVSFTTLCVDARSQHDYIALS RLFHTVMLFDVPVMTRLMESEARRFIALVDEFYERHVKLVVSAEVPLYEIYQGDRLKFEF QRCLSRLQEMQSEEYLKREHLAG >gi|296493312|gb|ADTK01000189.1| GENE 15 11978 - 12376 504 132 aa, chain + ## HITS:1 COG:STM3347 KEGG:ns NR:ns ## COG: STM3347 COG3105 # Protein_GI_number: 16766642 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Salmonella typhimurium LT2 # 1 132 3 134 134 229 96.0 9e-61 MTWEYALIGLVVGIIIGAVAMRFGNRKLRQQQALQYELEKNKAELDEYREELVSHFARSA ELLDTMAHDYRQLYQHMAKSSSSLLPELSAEANPFRNRLAESEASNDQAPVQMPRDYSEG ASGLLRTGAKRD >gi|296493312|gb|ADTK01000189.1| GENE 16 12530 - 13897 1484 455 aa, chain + ## HITS:1 COG:degQ KEGG:ns NR:ns ## COG: degQ COG0265 # Protein_GI_number: 16131124 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Escherichia coli K12 # 1 455 1 455 455 777 100.0 0 MKKQTQLLSALALSVGLTLSASFQAVASIPGQVADQAPLPSLAPMLEKVLPAVVSVRVEG TASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIINASKGYVLTNNHVINQAQKISIQ LNDGREFDAKLIGSDDQSDIALLQIQNPSKLTQIAIADSDKLRVGDFAVAVGNPFGLGQT ATSGIVSALGRSGLNLEGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSV 
GIGFAIPSNMARTLAQQLIDFGEIKRGLLGIKGTEMSADIAKAFNLDVQRGAFVSEVLPG SGSAKAGVKAGDIITSLNGKPLNSFAELRSRIATTEPGTKVKLGLLRNGKPLEVEVTLDT STSSSASAEMITPALEGATLSDGQLKDGGKGIKIDEVVKGSPAAQAGLQKDDVIIGVNRD RVNSIAEMRKVLAAKPAIIALQIVRGNESIYLLMR >gi|296493312|gb|ADTK01000189.1| GENE 17 13987 - 15054 1157 355 aa, chain + ## HITS:1 COG:ECs4108 KEGG:ns NR:ns ## COG: ECs4108 COG0265 # Protein_GI_number: 15833362 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Escherichia coli O157:H7 # 1 355 1 355 355 645 100.0 0 MFVKLLRSVAIGLIVGAILLVAMPSLRSLNPLSTPQFDSTDETPASYNLAVRRAAPAVVN VYNRGLNTNSHNQLEIRTLGSGVIMDQRGYIITNKHVINDADQIIVALQDGRVFEALLVG SDSLTDLAVLKINATGGLPTIPINARRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIG LNPTGRQNFLQTDASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPFQL ATKIMDKLIRDGRVIRGYIGIGGREIAPLHAQGGGIDQLQGIVVNEVSPDGPAANAGIQV NDLIISVDNKPAISALETMDQVAEIRPGSVIPVVVMRDDKQLTLQVTIQEYPATN >gi|296493312|gb|ADTK01000189.1| GENE 18 15116 - 16054 1239 312 aa, chain - ## HITS:1 COG:ECs4109 KEGG:ns NR:ns ## COG: ECs4109 COG0039 # Protein_GI_number: 15833363 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Escherichia coli O157:H7 # 1 312 1 312 312 557 100.0 1e-159 MKVAVLGAAGGIGQALALLLKTQLPSGSELSLYDIAPVTPGVAVDLSHIPTAVKIKGFSG EDATPALEGADVVLISAGVARKPGMDRSDLFNVNAGIVKNLVQQVAKTCPKACIGIITNP VNTTVAIAAEVLKKAGVYDKNKLFGVTTLDIIRSNTFVAELKGKQPGEVEVPVIGGHSGV TILPLLSQVPGVSFTEQEVADLTKRIQNAGTEVVEAKAGGGSATLSMGQAAARFGLSLVR ALQGEQGVVECAYVEGDGQYARFFSQPLLLGKNGVEERKSIGTLSAFEQNALEGMLDTLK KDIALGEEFVNK >gi|296493312|gb|ADTK01000189.1| GENE 19 16489 - 16959 552 156 aa, chain + ## HITS:1 COG:ECs4110 KEGG:ns NR:ns ## COG: ECs4110 COG1438 # Protein_GI_number: 15833364 # Func_class: K Transcription # Function: Arginine repressor # Organism: Escherichia coli O157:H7 # 1 156 1 156 156 293 100.0 6e-80 MRSSAKQEELVKAFKALLKEEKFSSQGEIVAALQEQGFDNINQSKVSRMLTKFGAVRTRN 
AKMEMVYCLPAELGVPTTSSPLKNLVLDIDYNDAVVVIHTSPGAAQLIARLLDSLGKAEG ILGTIAGDDTIFTTPANGFTVKDLYEAILELFDQEL >gi|296493312|gb|ADTK01000189.1| GENE 20 17324 - 17587 327 87 aa, chain + ## HITS:1 COG:no KEGG:S3493 NR:ns ## KEGG: S3493 # Name: yhcN # Def: hypothetical protein # Organism: S.flexneri_2457T # Pathway: not_defined # 1 87 18 104 104 134 100.0 1e-30 MKIKTTVAALSVLSVLSFGAFAADSIDAAQAQNREAIGTVSVSGVASSPMDIREMLNKKA EEKGATAYQITEARSGDTWHATAELYK >gi|296493312|gb|ADTK01000189.1| GENE 21 17643 - 17915 343 90 aa, chain - ## HITS:1 COG:ECs4112 KEGG:ns NR:ns ## COG: ECs4112 COG2732 # Protein_GI_number: 15833366 # Func_class: K Transcription # Function: Barstar, RNAse (barnase) inhibitor # Organism: Escherichia coli O157:H7 # 1 90 1 90 90 157 98.0 6e-39 MNIYTFDFDEIESQEDFYRDFSQAFGLAKDKVRDLDSLWDVLMNDVLPLPLEIEFVHLGE KTRRRFGALILLFDEAEEELEGHLRFNVRH Prediction of potential genes in microbial genomes Time: Mon May 16 15:37:42 2011 Seq name: gi|296493311|gb|ADTK01000190.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont534.9, whole genome shotgun sequence Length of sequence - 36751 bp Number of predicted genes - 31, with homology - 31 Number of transcription units - 14, operones - 9 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.333 - CDS 13 - 1980 1530 ## COG1289 Predicted membrane protein 2 1 Op 2 . - CDS 1986 - 2918 963 ## COG1566 Multidrug resistance efflux pump 3 1 Op 3 . - CDS 2926 - 3129 125 ## G2583_3962 hypothetical protein - Prom 3236 - 3295 5.5 + Prom 3212 - 3271 7.2 4 2 Tu 1 . + CDS 3312 - 4241 882 ## COG0583 Transcriptional regulator 5 3 Tu 1 . 
- CDS 4369 - 5814 1773 ## COG0312 Predicted Zn-dependent proteases and their inactivated homologs 6 4 Op 1 5/1.000 - CDS 5970 - 9770 3297 ## COG3164 Predicted membrane protein 7 4 Op 2 8/0.333 - CDS 9838 - 11307 1551 ## COG1530 Ribonucleases G and E 8 4 Op 3 7/0.333 - CDS 11297 - 11890 594 ## COG0424 Nucleotide-binding protein implicated in inhibition of septum formation 9 4 Op 4 19/0.000 - CDS 11899 - 12387 427 ## COG2891 Cell shape-determining protein 10 4 Op 5 22/0.000 - CDS 12387 - 13490 1173 ## COG1792 Cell shape-determining protein 11 4 Op 6 2/1.000 - CDS 13556 - 14599 1042 ## COG1077 Actin-like ATPase involved in cell morphogenesis - Prom 14639 - 14698 4.9 12 5 Tu 1 . - CDS 14904 - 16844 1227 ## COG2200 FOG: EAL domain - Prom 16936 - 16995 3.7 + Prom 16899 - 16958 3.4 13 6 Op 1 . + CDS 16996 - 17970 1094 ## COG0604 NADPH:quinone reductase and related Zn-dependent oxidoreductases 14 6 Op 2 . + CDS 18007 - 18171 70 ## EcolC_0453 hypothetical protein 15 7 Op 1 27/0.000 + CDS 18948 - 19418 464 ## COG0511 Biotin carboxyl carrier protein 16 7 Op 2 6/0.333 + CDS 19429 - 20778 1402 ## COG0439 Biotin carboxylase + Term 20798 - 20826 3.0 + Prom 20803 - 20862 5.4 17 8 Op 1 4/1.000 + CDS 20887 - 21129 251 ## COG3924 Predicted membrane protein 18 8 Op 2 3/1.000 + CDS 21119 - 22570 1509 ## COG4145 Na+/panthothenate symporter 19 8 Op 3 7/0.333 + CDS 22582 - 23463 1543 ## PROTEIN SUPPORTED gi|194435103|ref|ZP_03067340.1| ribosomal protein L11 methyltransferase + Term 23517 - 23571 11.8 + Prom 23585 - 23644 5.5 20 9 Op 1 12/0.333 + CDS 23792 - 24757 1101 ## PROTEIN SUPPORTED gi|42631300|ref|ZP_00156838.1| COG0042: tRNA-dihydrouridine synthase 21 9 Op 2 1/1.000 + CDS 24783 - 25079 395 ## COG2901 Factor for inversion stimulation Fis, transcriptional activator + Term 25109 - 25146 1.4 22 10 Op 1 . + CDS 25165 - 26049 747 ## COG0863 DNA modification methylase + Prom 26054 - 26113 1.9 23 10 Op 2 . 
+ CDS 26133 - 26312 112 ## EcE24377A_3748 hypothetical protein + Term 26326 - 26362 0.1 - Term 26240 - 26281 2.3 24 11 Tu 1 . - CDS 26315 - 26977 486 ## COG1309 Transcriptional regulator - Prom 27113 - 27172 7.7 + Prom 27072 - 27131 7.4 25 12 Op 1 27/0.000 + CDS 27376 - 28533 922 ## COG0845 Membrane-fusion protein 26 12 Op 2 . + CDS 28545 - 31649 2713 ## COG0841 Cation/multidrug efflux pump + Term 31660 - 31695 3.5 + Prom 31666 - 31725 4.7 27 13 Tu 1 . + CDS 31902 - 32123 269 ## ECH74115_4585 putative lipoprotein + Term 32129 - 32164 5.6 + Prom 32451 - 32510 2.2 28 14 Op 1 6/0.333 + CDS 32554 - 33579 950 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain + Term 33587 - 33637 5.5 29 14 Op 2 6/0.333 + CDS 33647 - 34828 905 ## COG4597 ABC-type amino acid transport system, permease component 30 14 Op 3 34/0.000 + CDS 34838 - 35941 810 ## COG0765 ABC-type amino acid transport system, permease component 31 14 Op 4 . + CDS 35949 - 36707 534 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 Predicted protein(s) >gi|296493311|gb|ADTK01000190.1| GENE 1 13 - 1980 1530 655 aa, chain - ## HITS:1 COG:yhcP KEGG:ns NR:ns ## COG: yhcP COG1289 # Protein_GI_number: 16131130 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 655 1 655 655 1277 99.0 0 MGIFSIANQHIRFAVKLATAIVLALFVGFHFQLETPRWAVLTAAIVAAGPAFAAGGEPYS GAIRYRGFLRIIGTFIGCIAGLVIIIAMIRAPLLMILVCCIWAGFCTWISSLVRIENSYA WGLAGYTALIIVITIQPEPLLTPQFAVERCSEIVIGIVCAIMADLLFSPRSIKQEVDREL ESLLVAQYQLMQLCIKHGDGEVVDKAWGDLVRRTTALQGMRSNLNMESSRWARANRRLKA INTLSLTLITQSCETYLIQNTRPELITDTFREFFDTPVETAQDVHKQLKRLRRVIAWTGE RETPVTIYSWVAAATRYQLLKRGVISNTKINATEEEILQGEPEVKVESAERHHAMVNFWR TTLSCILGTLFWLWTGWTSGSGAMVMIAVVTSLAMRLPNPRMVAIDFIYGTLAALPLGLL YFLVIIPNTQQSMLLLCISLAVLGFFLGIEVQKRRLGSMGALASTINIIVLDNPMTFHFS QFLDSALGQIVGCVLAFTVILLVRDKSRDRTGRVLLNQFVSAAVSAMTTNVARRKENHLP ALYQQLFLLMNKFPGDLPKFRLALTMIIAHQRLRDAPIPVNEDLSAFHRQMRRTADHVIS 
ARSDDKRRRYFGQLLEELEIYQEKLRIWQAPPQVTEPVHRLAGMLHKYQHALTDS >gi|296493311|gb|ADTK01000190.1| GENE 2 1986 - 2918 963 310 aa, chain - ## HITS:1 COG:yhcQ KEGG:ns NR:ns ## COG: yhcQ COG1566 # Protein_GI_number: 16131131 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Escherichia coli K12 # 1 310 1 310 310 580 100.0 1e-165 MKTLIRKFSRTAITVVLVILAFIAIFNAWVYYTESPWTRDARFSADVVAIAPDVSGLITQ VNVHDNQLVKKGQILFTIDQPRYQKALEEAQADVAYYQVLAQEKRQEAGRRNRLGVQAMS REEIDQANNVLQTVLHQLAKAQATRDLAKLDLERTVIRAPADGWVTNLNVYTGEFITRGS TAVALVKQNSFYVLAYMEETKLEGVRPGYRAEITPLGSNKVLKGTVDSVAAGVTNASSTR DDKGMATIDSNLEWVRLAQRVPVRIRLDNQQENIWPAGTTATVVVTGKQDRDESQDSFFR KMAHRLREFG >gi|296493311|gb|ADTK01000190.1| GENE 3 2926 - 3129 125 67 aa, chain - ## HITS:1 COG:no KEGG:G2583_3962 NR:ns ## KEGG: G2583_3962 # Name: aaeX # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 67 24 90 90 103 100.0 3e-21 MSLFPVIVVFGLSFPPIFFELLLSLAIFWLVRRVLVPTGIYDFVWHPALFNTALYCCLFY LISRLFV >gi|296493311|gb|ADTK01000190.1| GENE 4 3312 - 4241 882 309 aa, chain + ## HITS:1 COG:ECs4116 KEGG:ns NR:ns ## COG: ECs4116 COG0583 # Protein_GI_number: 15833370 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 309 1 309 309 623 100.0 1e-178 MERLKRMSVFAKVVEFGSFTAAARQLQMSVSSISQTVSKLEDELQVKLLNRSTRSIGLTE AGRIYYQGCRRMLHEVQDVHEQLYAFNNTPIGTLRIGCSSTMAQNVLAGLTAKMLKEYPG LSVNLVTGIPAPDLIADGLDVVIRVGALQDSSLFSRRLGAMPMVVCAAKSYLTQYGIPEK PADLSSHSWLEYSVRPDNEFELIAPEGISTRLIPQGRFVTNDPMTLVRWLTAGAGIAYVP LMWVINEINRGELEILLPRYQSDPRPVYALYTEKDKLPLKVQVVINSLTDYFVEVGKLFQ EMHGRGKEK >gi|296493311|gb|ADTK01000190.1| GENE 5 4369 - 5814 1773 481 aa, chain - ## HITS:1 COG:tldD KEGG:ns NR:ns ## COG: tldD COG0312 # Protein_GI_number: 16131134 # Func_class: R General function prediction only # Function: Predicted Zn-dependent proteases and their inactivated homologs # Organism: Escherichia coli K12 # 1 481 1 481 481 889 100.0 0 
MSLNLVSEQLLAANGLKHQDLFAILGQLAERRLDYGDLYFQSSYHESWVLEDRIIKDGSY NIDQGVGVRAISGEKTGFAYADQISLLALEQSAQAARTIVRDSGDGKVQTLGAVEHSPLY TSVDPLQSMSREEKLDILRRVDKVAREADKRVQEVTASLSGVYELILVAATDGTLAADVR PLVRLSVSVLVEEDGKRERGASGGGGRFGYEFFLADLDGEVRADAWAKEAVRMALVNLSA VAAPAGTMPVVLGAGWPGVLLHEAVGHGLEGDFNRRGTSVFSGQVGELVASELCTVVDDG TMVDRRGSVAIDDEGTPGQYNVLIENGILKGYMQDKLNARLMGMTPTGNGRRESYAHLPM PRMTNTYMLPGKSTPQEIIESVEYGIYAPNFGGGQVDITSGKFVFSTSEAYLIENGKVTK PVKGATLIGSGIETMQQISMVGNDLKLDNGVGVCGKEGQSLPVGVGQPTLKVDNLTVGGT A >gi|296493311|gb|ADTK01000190.1| GENE 6 5970 - 9770 3297 1266 aa, chain - ## HITS:1 COG:yhdR+P KEGG:ns NR:ns ## COG: yhdR+P COG3164 # Protein_GI_number: 16132254 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 1266 1 1266 1266 2507 99.0 0 MRRLPGILLLTGAALVVIAALLVSGLRIALPHLDAWRPEILNKIESATGMPVEASQLSAS WQNFGPTLEAHDIRAELKDGGEFSVKRVTLALDVWQSLLHMRWQFRDLTFWQLRFRTNTP ITSGGGNDSLEASHISDLFLRQFDHFDLRDSEVSFLTPSGQRAELAIPQLTWLNDPRRHR AEGLVSLSSLTGQHGVMQVRMDLRDDEGLLSNGRVWLQADDIDLKPWLGKWMQDNIALET AQFSLEGWMTIDKGDVTGGDVWLKQGGASWLGEKQTHTLSVDNLTAHITRENPGWQFSIP DTRITMDGKPWPSGALTLAWIPEQDVGGKDNKRSDELRIRASNLELAGLEGVRPLAAKLS PALGDVWRSTQPSGKINTLALDIPLQAADKTRFQASWSDLAWKQWKLLPGAEHFSGTLSG SVENGLLTASMKQAKMPYETVFRAPLEIADGQATISWLNNDKGFQLDGRNIDVKAKAVHA RGGFRYLQPANDEPWLGILAGISTDDGSQAWRYFPENLMGKDLVDYLSGAIQGGEADNAT LVYGGNPQLFPYKHNEGQFEVLVPLRNAKFAFQPDWPALTNLDIELDFINDGLWMKTDGV NLGGVRASNLTAVIPDYSKEKLLIDADIKGPGKAVGPYFDETPLKDSLGATLQELQLDGD VNARLHLDIPLNGELVTAKGEVTLRNNSLFIKPLDSTLKNLSGKFSFINGDLQSEPLTAS WFNQPLNVDFSTKEGAKAYQVAVNLNGNWQPAKTGVLPEAVNEALSGSVAWDGKVGIDLP YHAGATYNVELNGDLKNVSSHLPSPLAKPAGEPLAVNVKVDGNLNSFELTGQAGADNHFN SRWLLGQKLTLDRAIWAADSKTLPPLPEQSGVELNMPPMNGAEWLALFQKGAAESVGGAA SFPQHITLRTPMLSLGNQQWNNLSIVSQPTANGTLVEAQGREINATLAMRNNAPWLANIK YLYYNPSVAKTRGDSTPSSPFPTTERINFRGWPDAQIRCTECWFWGQKFGRIDSDITISG DTLTLTNGLIDTGFSRLTADGEWVNNPGNERTSLKGKLRGQKIDPAAEFFGVTTPIRQSS FNVDYDLHWRKAPWQPDEATLNGIIHTQLGKGEITEINTGHAGQLLRLLSVDALMRKLRF 
DFRDTFGEGFYFDSIRSTAWIKDGVMHTDDTLVDGLEADIAMKGSVNLVRRDLNMEAVVA PEISATVGVAAAFAVNPIVGAAVFAASKVLGPLWSKVSILRYHISGPLDDPQINEVLRQP RKEKAQ >gi|296493311|gb|ADTK01000190.1| GENE 7 9838 - 11307 1551 489 aa, chain - ## HITS:1 COG:ZcafA KEGG:ns NR:ns ## COG: ZcafA COG1530 # Protein_GI_number: 15803780 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonucleases G and E # Organism: Escherichia coli O157:H7 EDL933 # 1 489 7 495 495 917 100.0 0 MTAELLVNVTPSETRVAYIDGGILQEIHIEREARRGIVGNIYKGRVSRVLPGMQAAFVDI GLDKAAFLHASDIMPHTECVAGEEQKQFTVRDISELVRQGQDLMVQVVKDPLGTKGARLT TDITLPSRYLVFMPGASHVGVSQRIESESERERLKKVVAEYCDEQGGFIIRTAAEGVGEA ELASDAAYLKRVWTKVMERKKRPQTRYQLYGELALAQRVLRDFADAELDRIRVDSRLTYE ALLEFTSEYIPEMTSKLEHYTGRQPIFDLFDVENEIQRALERKVELKSGGYLIIDQTEAM TTVDINTGAFVGHRNLDDTIFNTNIEATQAIARQLRLRNLGGIIIIDFIDMNNEDHRRRV LHSLEQALSKDRVKTSVNGFSALGLVEMTRKRTRESIEHVLCNECPTCHGRGTVKTVETV CYEIMREIVRVHHAYDSDRFLVYASPAVAEALKGEESHSLAEVEIFVGKQVKVQIEPLYN QEQFDVVMM >gi|296493311|gb|ADTK01000190.1| GENE 8 11297 - 11890 594 197 aa, chain - ## HITS:1 COG:yhdE KEGG:ns NR:ns ## COG: yhdE COG0424 # Protein_GI_number: 16131136 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Nucleotide-binding protein implicated in inhibition of septum formation # Organism: Escherichia coli K12 # 1 197 1 197 197 367 98.0 1e-102 MTSLYLASGSPRRQELLAQLGVTFERIVTGIEEQRQPQESAQQYVVRLAREKAQAGVAQT AQDLPVLGADTIVILNGEVLEKPRDAEHAAQMLRKLSGQTHQVMTAVALADSQHILDCLV VTDVTFRTLTDEDIAGYVASGEPLDKAGAYGIQGLGGCFVRKINGSYHAVVGLPLVETYE LLSNFNALREKRDKHDG >gi|296493311|gb|ADTK01000190.1| GENE 9 11899 - 12387 427 162 aa, chain - ## HITS:1 COG:ECs4121 KEGG:ns NR:ns ## COG: ECs4121 COG2891 # Protein_GI_number: 15833375 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell shape-determining protein # Organism: Escherichia coli O157:H7 # 1 162 1 162 162 227 100.0 6e-60 MASYRSQGRWVIWLSFLIALLLQIMPWPDNLIVFRPNWVLLILLYWILALPHRVNVGTGF VMGAILDLISGSTLGVRVLAMSIIAYLVALKYQLFRNLALWQQALVVMLLSLVVDIIVFW 
AEFLVINVSFRPEVFWSSVVNGVLWPWIFLLMRKVRQQFAVQ >gi|296493311|gb|ADTK01000190.1| GENE 10 12387 - 13490 1173 367 aa, chain - ## HITS:1 COG:ECs4122 KEGG:ns NR:ns ## COG: ECs4122 COG1792 # Protein_GI_number: 15833376 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell shape-determining protein # Organism: Escherichia coli O157:H7 # 1 367 1 367 367 592 99.0 1e-169 MKPIFSRGPSLQIRLILAVLVALGIIIADSRLGTFSQIRTYMDTAVSPFYFVSNAPRELL DGVSQTLASRDQLELENRALRQELLLKNSELLMLGQYKQENARLRELLGSPLRQDEQKMV TQVISTVNDPYSDQVVIDKGSVNGVYEGQPVISDKGVVGQVVAVAKLTSRVLLICDATHA LPIQVLRNDIRVIAAGNGCTDDLQLEHLPANTDIRVGDVLVTSGLGGRFPEGYPVAVVSS VKLDTQRAYTVIQARPTAGLQRLRYLLLLWGADRNGANPMTPEEVHRVANERLMQMMPQV LPSPDAMGPKFPDPATGIAQPTPQQPTTGNAATAPAAPTQPAANRSPQRATPPQSGAQPP ARAPGGQ >gi|296493311|gb|ADTK01000190.1| GENE 11 13556 - 14599 1042 347 aa, chain - ## HITS:1 COG:ECs4123 KEGG:ns NR:ns ## COG: ECs4123 COG1077 # Protein_GI_number: 15833377 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Escherichia coli O157:H7 # 1 347 21 367 367 649 100.0 0 MLKKFRGMFSNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSPKSVAAVGHDAK QMLGRTPGNIAAIRPMKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQ VERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNG VVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRN LAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALL RNLDRLLMEETGIPVVVAEDPLTCVARGGGKALEMIDMHGGDLFSEE >gi|296493311|gb|ADTK01000190.1| GENE 12 14904 - 16844 1227 646 aa, chain - ## HITS:1 COG:ECs4124_3 KEGG:ns NR:ns ## COG: ECs4124_3 COG2200 # Protein_GI_number: 15833378 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli O157:H7 # 395 646 1 252 252 499 100.0 1e-141 MRLTTKFSAFVTLLTGLTIFVTLLGCSLSFYNAIQYKFSHRVQAVATAIDTHLVSNDFST LRPQITELMMSADIVRVDLLHGDKQVYTLARNGSYRPVGSSDLFRELSVPLIKHPGMSLR LVYQDPMGNYFHSLMTTAPLTGAIGFIIVMLFLAVRWLQRQLAGQELLETRATRILNGER 
GSNVLGTIYEWPPRTSSALDTLLREIQNAREQHSRLDTLIRSYAAQDVKTGLNNRLFFDN QLATLLEDQEKVGTHGIVMMIRLPDFNMLSDTWGHSQVEEQFFTLTNLLSTFMMRYPGAL LARYHRSDFAALLPHRTLKEAESIAGQLIKAVDTLPNNKMLDRDDMIHIGICAWRSGQDT EQVMEHAESATRNAGLQGGNSWAIYDDSLPEKGRGNVRWRTLIEQMLSRGGPRLYQKPAV TREGQVHHRELMCRIFDGNEEVSSAEYMPMVLQFGLSEEYDRLQISRLIPLLRYWPEENL AIQVTVESLIRPRFQRWLRDTLMQCEKSQRKRIIIELAEADVGQHISRLQPVIRLVNALG VRVAVNQAGLTLVSTSWIKELNVELLKLHPGLVRNIEKRTENQLLVQSLVEACSGTSTQV YATGVRSRSEWQTLIQRGVTGGQGDFFASSQPLDTNVKKYSQRYSV >gi|296493311|gb|ADTK01000190.1| GENE 13 16996 - 17970 1094 324 aa, chain + ## HITS:1 COG:ECs4125 KEGG:ns NR:ns ## COG: ECs4125 COG0604 # Protein_GI_number: 15833379 # Func_class: C Energy production and conversion; R General function prediction only # Function: NADPH:quinone reductase and related Zn-dependent oxidoreductases # Organism: Escherichia coli O157:H7 # 1 324 1 324 324 639 100.0 0 MQALLLEQQDGKTLASVQTLDESRLPEGDVTVDVHWSSLNYKDALAITGKGKIIRNFPMI PGIDFAGTVRTSEDPRFHAGQEVLLTGWGVGENHWGGLAEQARVKGDWLVAMPQGLDARK AMIIGTAGFTAMLCVMALEDAGVRPQDGEIVVTGASGGVGSTAVALLHKLGYQVVAVSGR ESTHEYLKSLGASRILPRDEFAESRPLEKQVWAGAIDTVGDKVLAKVLAQMNYGGCVAAC GLAGGFTLPTTVMPFILRNVRLQGVDSVMTPPERRAQAWQRLVADLPESFYTQAAKEISL SEAPNFAEAIINNQIQGRTLVKVN >gi|296493311|gb|ADTK01000190.1| GENE 14 18007 - 18171 70 54 aa, chain + ## HITS:1 COG:no KEGG:EcolC_0453 NR:ns ## KEGG: EcolC_0453 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 48 1 48 276 98 93.0 5e-20 MKSLFNRLTGKAVSRTAFVEHLGQKVIQHHPNCKVMISTDHKLMRIDTPLNSYY >gi|296493311|gb|ADTK01000190.1| GENE 15 18948 - 19418 464 156 aa, chain + ## HITS:1 COG:ECs4127 KEGG:ns NR:ns ## COG: ECs4127 COG0511 # Protein_GI_number: 15833381 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Escherichia coli O157:H7 # 1 156 1 156 156 246 100.0 2e-65 MDIRKIKKLIELVEESGISELEISEGEESVRISRAAPAASFPVMQQAYAAPMMQQPAQSN AAAPATVPSMEAPAAAEISGHIVRSPMVGTFYRTPSPDAKAFIEVGQKVNVGDTLCIVEA 
MKMMNQIEADKSGTVKAILVESGQPVEFDEPLVVIE >gi|296493311|gb|ADTK01000190.1| GENE 16 19429 - 20778 1402 449 aa, chain + ## HITS:1 COG:ECs4128 KEGG:ns NR:ns ## COG: ECs4128 COG0439 # Protein_GI_number: 15833382 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Escherichia coli O157:H7 # 1 449 1 449 449 882 99.0 0 MLDKIVIANRGEIALRILRACKELGIKTVAVHSSADRDLKHVLLADETVCIGPAPSVKSY LNIPAIISAAEITGAVAIHPGYGFLSENANFAEQVERSGFIFIGPKAETIRLMGDKVSAI AAMKKAGVPCVPGSDGPLGDDMDKNRAIAKRIGYPVIIKASGGGGGRGMRVVRGDAELAQ SISMTRAEAKAAFSNDMVYMEKYLENPRHVEIQVLADGQGNAIYLAERDCSMQRRHQKVV EEAPAPGITPELRRYIGERCAKACVDIGYRGAGTFEFLFENGEFYFIEMNTRIQVEHPVT EMITGVDLIKEQLRIAAGQPLSIKQEEVHVRGHAVECRINAEDPNTFLPSPGKITRFHAP GGFGVRWESHIYAGYTVPPYYDSMIGKLICYGENRDVAIARMKNALQELIIDGIKTNVDL QIRIMNDENFQHGGTNIHYLEKKLGLQEK >gi|296493311|gb|ADTK01000190.1| GENE 17 20887 - 21129 251 80 aa, chain + ## HITS:1 COG:ECs4129 KEGG:ns NR:ns ## COG: ECs4129 COG3924 # Protein_GI_number: 15833383 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 80 1 80 80 149 100.0 2e-36 MDTRFVQAHKEARWALGLTLLYLAVWLVAAYLPGVAPGFTGFPRWFEMACILTPLLFIGL CWAMVKFIYRDIPLEDDDAA >gi|296493311|gb|ADTK01000190.1| GENE 18 21119 - 22570 1509 483 aa, chain + ## HITS:1 COG:ECs4130 KEGG:ns NR:ns ## COG: ECs4130 COG4145 # Protein_GI_number: 15833384 # Func_class: H Coenzyme transport and metabolism # Function: Na+/panthothenate symporter # Organism: Escherichia coli O157:H7 # 1 483 3 485 485 798 99.0 0 MQLEVILPLVAYLVVVFGISVYAMRKRSTGTFLNEYFLGSRSMGGIVLAMTLTATYISAS SFIGGPGAAYKYGLGWVLLAMIQLPAVWLSLGILGKKFAILARRYNAVTLNDMLFARYQS RLLVWLASLSLLVAFVGAMTVQFIGGARLLETAAGIPYETGLLIFGISIALYTAFGGFRA SVLNDTMQGLVMLIGTVVLLIGVVHAAGGLSNAVQTLQTIDPQLVTPQGADDILSPAFMT SFWVLVCFGVIGLPHTAVRCISYKDSKAVHRGIIIGTIVVAILMFGMHLAGALGRAVIPD LTVPDLVIPTLMVKVLPPFAAGIFLAAPMAAIMSTINAQLLQSSATIIKDLYLNIRPDQM QNETRLKRMSAVITLVLGALLLLAAWKPPEMIIWLNLLAFGGLEAVFLWPLVLGLYWERA NAKGALSAMIVGGVLYAVLATLNIQYLGFHPIVPSLLLSLVAFLVGNRFGTSVPQATVLT 
TDK >gi|296493311|gb|ADTK01000190.1| GENE 19 22582 - 23463 1543 293 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|194435103|ref|ZP_03067340.1| ribosomal protein L11 methyltransferase [Shigella dysenteriae 1012] # 1 293 1 293 293 598 100 1e-170 MPWIQLKLNTTGANAEDLSDALMEAGAVSITFQDTHDTPVFEPLPGETRLWGDTDVIGLF DAETDMNDVVAILENHPLLGAGFAHKIEQLEDKDWEREWMDNFHPMRFGERLWICPSWRD VPDENAVNVMLDPGLAFGTGTHPTTSLCLQWLDSLDLTGKTVIDFGCGSGILAIAALKLG AAKAIGIDIDPQAIQASRDNAERNGVSDRLELYLPKDQPEEMKADVVVANILAGPLRELA PLISVLPVSGGLLGLSGILASQAESVCEAYVDSFALDPVVEKEEWCRITGRKN >gi|296493311|gb|ADTK01000190.1| GENE 20 23792 - 24757 1101 321 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42631300|ref|ZP_00156838.1| COG0042: tRNA-dihydrouridine synthase [Haemophilus influenzae R2866] # 1 317 28 344 353 428 65 1e-119 MRIGQYQLRNRLIAAPMAGITDRPFRTLCYEMGAGLTVSEMMSSNPQVWESDKSRLRMVH IDEPGIRTVQIAGSDPKEMADAARINVESGAQIIDINMGCPAKKVNRKLAGSALLQYPDV VKSILTEVVNAVDVPVTLKIRTGWAPEHRNCEEIAQLAEDCGIQALTIHGRTRACLFNGE AEYDSIRAVKQKVSIPVIANGDITDPLKARAVLDYTGADALMIGRAAQGRPWIFREIQHY LDTGELLPPLPLAEVKRLLCAHVRELHDFYGPAKGYRIARKHVSWYLQEHAPNDQFRRTF NAIEDASEQLEALEAYFENFA >gi|296493311|gb|ADTK01000190.1| GENE 21 24783 - 25079 395 98 aa, chain + ## HITS:1 COG:ECs4133 KEGG:ns NR:ns ## COG: ECs4133 COG2901 # Protein_GI_number: 15833387 # Func_class: K Transcription; L Replication, recombination and repair # Function: Factor for inversion stimulation Fis, transcriptional activator # Organism: Escherichia coli O157:H7 # 1 98 1 98 98 162 100.0 1e-40 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN >gi|296493311|gb|ADTK01000190.1| GENE 22 25165 - 26049 747 294 aa, chain + ## HITS:1 COG:ECs4134 KEGG:ns NR:ns ## COG: ECs4134 COG0863 # Protein_GI_number: 15833388 # Func_class: L Replication, recombination and repair # Function: DNA modification methylase # Organism: Escherichia coli O157:H7 # 1 294 3 296 296 603 98.0 1e-172 MRTGCEPTRFGNEAKTIIHGDALAELKKLPTESVDLIFADPPYNIGKNFDGLIEAWKEDL 
FIDWLFEVIAECHRVLKKQGSMYIMNSTENMPFIDLQCRKLFTIKSRIVWSYDSSGVQAK KHYGSMYEPILMMVKDAKNYTFNGDAILVEAKTGSQRALIDYRKNPPQPYNHQKVPGNVW DFPRVRYLMDEYENHPTQKPEALLKRIILASSNPGDIVLDPFAGSFTTGAVAIASGRKFI GIEINSEYIKMGLRRLDVASHYSAEELAKVKKRKTGNRSKRCRVSEVDPDLIAK >gi|296493311|gb|ADTK01000190.1| GENE 23 26133 - 26312 112 59 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_3748 NR:ns ## KEGG: EcE24377A_3748 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 59 10 68 68 87 100.0 2e-16 MIRKYWWLVVFAVFVFLFDTLLMQWIELLATETDKCRNMNSVNPLKLVNCDELNFQDRM >gi|296493311|gb|ADTK01000190.1| GENE 24 26315 - 26977 486 220 aa, chain - ## HITS:1 COG:ECs4136 KEGG:ns NR:ns ## COG: ECs4136 COG1309 # Protein_GI_number: 15833390 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 220 1 220 220 426 100.0 1e-119 MAKRTKAEALKTRQELIETAIAQFAQHGVSKTTLNDIADAANVTRGAIYWHFENKTQLFN EMWLQQPSLRELIQEHLTAGLEHDPFQQLREKLIVGLQYIAKIPRQQALLKILYHKCEFN DEMLAEGVIREKMGFNPQTLREVLQACQQQGCVANNLDLDVVMIIIDGAFSGIVQNWLMN MAGYDLYKQAPALVDNVLRMFMPDENITKLIHQTNELSVM >gi|296493311|gb|ADTK01000190.1| GENE 25 27376 - 28533 922 385 aa, chain + ## HITS:1 COG:acrE KEGG:ns NR:ns ## COG: acrE COG0845 # Protein_GI_number: 16131153 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli K12 # 1 385 1 385 385 674 100.0 0 MTKHARFFLLPSFILISAALIAGCNDKGEEKAHVGEPQVTVHIVKTAPLEVKTELPGRTN AYRIAEVRPQVSGIVLNRNFTEGSDVQAGQSLYQIDPATYQANYDSAKGELAKSEAAAAI AHLTVKRYVPLVGTKYISQQEYDQAIADARQADAAVIAAKATVESARINLAYTKVTAPIS GRIGKSTVTEGALVTNGQTTELATVQQLDPIYVDVTQSSNDFMRLKQSVEQGNLHKENAT SNVELVMENGQTYPLKGTLQFSDVTVDESTGSITLRAVFPNPQHTLLPGMFVRARIDEGV QPDAILIPQQGVSRTPRGDATVLIVNDKSQVEARPVVASQAIGDKWLISEGLKSGDQVIV SGLQKARPGEQVKATTDTPADTASK >gi|296493311|gb|ADTK01000190.1| GENE 26 28545 - 31649 2713 1034 aa, chain + ## HITS:1 COG:acrF KEGG:ns NR:ns ## COG: acrF COG0841 # Protein_GI_number: 16131154 # Func_class: V Defense mechanisms # 
Function: Cation/multidrug efflux pump # Organism: Escherichia coli K12 # 1 1034 1 1034 1034 1949 99.0 0 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ EVQQQGISVEKSSSSYLMVAGFVSDNPDTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRL KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMEFGKIRDG FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKVYVQADAKFRM LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF VPVFFVVIRRCFKG >gi|296493311|gb|ADTK01000190.1| GENE 27 31902 - 32123 269 73 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_4585 NR:ns ## KEGG: ECH74115_4585 # Name: not_defined # Def: putative lipoprotein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 73 1 73 73 115 100.0 6e-25 MKRLIPVALLTALLAGCAHDSPCVPVYDDQGRLVHTNTCMKGTTQDNWETAGAIAGGAAA VAGLTMGIIALSK >gi|296493311|gb|ADTK01000190.1| GENE 28 32554 - 33579 950 341 aa, chain + ## HITS:1 COG:yhdW KEGG:ns NR:ns ## COG: yhdW COG0834 # Protein_GI_number: 16131156 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Escherichia coli K12 # 37 341 1 305 305 620 99.0 1e-177 MKKMMIATLAAASVLLAVANQAHAGATLDAVQKKGFVQCGISDGLPGFSYADADGKFSGI 
DVDVCRGVAAAVFGDDTKVKYTPLTAKERFTALQSGEVDLLSRNTTWTSSRDAGMGMAFT GVTYYDGIGFLTHDKAGLKSAKELDGATVCIQAGTDTELNVADYFKANNMKYTPVTFDRS DESAKALESGRCDTLASDQSQLYALRIKLSNPAEWIVLPEVISKEPLGPVVRRGDDEWFS IVRWTLFAMLNAEEMGINSQNVDEKAANPATPDMAHLLGKEGDYGKDLKLDNKWAYNIIK QVGNYSEIFERNVGSESPLKIKRGQNNLWNNGGIQYAPPVR >gi|296493311|gb|ADTK01000190.1| GENE 29 33647 - 34828 905 393 aa, chain + ## HITS:1 COG:ECs4142 KEGG:ns NR:ns ## COG: ECs4142 COG4597 # Protein_GI_number: 15833396 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Escherichia coli O157:H7 # 1 393 7 399 399 698 99.0 0 MSHRRSTVKGSLSFANPTVRAWLFQILAVVAVVGIVGWLFHNTVTNLNNRGITSGFAFLD RGAGFGIVQHLIDYQQGDTYGRVFIVGLLNTLLVSALCIVFASVLGFFIGLARLSDNWLL RKLSTIYIEIFRNIPPLLQIFFWYFAVLRNLPGPRQAVSAFDLAFLSNRGLYIPSPQLGD GFIAFILAVVMAIVLSVGLFRFNKTYQIKTGQLRRTWPIAAVLIIGLPLLAQWLFGAALH WDVPALRGFNFRGGMVLIPELAALTLALSVYTSAFIAEIIRAGIQAVPYGQHEAARSLGL PNPVTLRQVIIPQALRVIIPPLTSQYLNIVKNSSLAAAIGYPDMVSLFAGTVLNQTGQAI ETIAMTMSVYLIISLTISLLMNIYNRRIAIVER >gi|296493311|gb|ADTK01000190.1| GENE 30 34838 - 35941 810 367 aa, chain + ## HITS:1 COG:ECs4143 KEGG:ns NR:ns ## COG: ECs4143 COG0765 # Protein_GI_number: 15833397 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Escherichia coli O157:H7 # 1 367 2 368 368 674 98.0 0 MTKVLLSQPSRPASHNSSRAMVWVRKNLFSSWSNSLLTIGCIWLMWELIPPLLNWAFLQA NWVGSTRADCTKAGACWVFIHERFGQFMYGLYPHDQRWRINLALLIGLVSIAPMFWKILP HRGRYIAVWAVIYPLIVWWLMYGGFFGLERVETRQWGGLTLTLIIASVGIAGALPWGILL ALGRRSHMPIVRILSVIFIEFWRGVPLITVLFMSSVMLPLFMAEGTSIDKLIRALVGVIL FQSAYVAEVVRGGLQALPKGQYEAAESLALGYWKTQGLVILPQALKLVIPGLVNTIIALF KDTSLVIIIGLFDLFSSVQQATVDPTWLGMSTEGYVFAALIYWIFCFSMSRYSQHLEKRF NTGRTPH >gi|296493311|gb|ADTK01000190.1| GENE 31 35949 - 36707 534 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 12 250 1 242 245 210 45 1e-53 
MSQILLQPANAMITLENVNKWYGQFHVLKNINLTVQPGERIVLCGPSGSGKSTTIRCINH LEEHQQGRIVVDGIELNEDIRNIERVRQEVGMVFQHFNLFPHLTVLQNCTLAPIWVRKMP KKEAEALAMHYLERVRIAEHAHKFPGQISGGQQQRVAIARSLCMKPKIMLFDEPTSALDP EMVKEVLDTMIGLAQSGMTMLCVTHEMGFARTVADRVIFMDRGEIVEQAAPDEFFTHPKS ERTRAFLSQVIH Prediction of potential genes in microbial genomes Time: Mon May 16 15:37:50 2011 Seq name: gi|296493310|gb|ADTK01000191.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont534.10, whole genome shotgun sequence Length of sequence - 413 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - 5S_RRNA 142 - 263 100.0 # AE006468 [R:3566622..3566743] # 5S ribosomal RNA # Salmonella enterica subsp. enterica serovar Typhimurium str. LT2 # Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Salmonella; Salmonella enterica subsp. enterica serovar Typhimurium. - TRNA 302 - 377 88.7 # Thr GGT 0 0 Predicted protein(s) Prediction of potential genes in microbial genomes Time: Mon May 16 15:37:51 2011 Seq name: gi|296493309|gb|ADTK01000192.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont542.1, whole genome shotgun sequence Length of sequence - 1697 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 10/0.000 - CDS 278 - 604 152 ## COG3668 Plasmid stabilization system protein 2 1 Op 2 . - CDS 601 - 864 211 ## COG3609 Predicted transcriptional regulators containing the CopG/Arc/MetJ DNA-binding domain 3 1 Op 3 . 
- CDS 936 - 1697 498 ## EcSMS35_3215 hypothetical protein Predicted protein(s) >gi|296493309|gb|ADTK01000192.1| GENE 1 278 - 604 152 108 aa, chain - ## HITS:1 COG:RSc3224 KEGG:ns NR:ns ## COG: RSc3224 COG3668 # Protein_GI_number: 17547943 # Func_class: R General function prediction only # Function: Plasmid stabilization system protein # Organism: Ralstonia solanacearum # 1 94 1 94 99 101 51.0 3e-22 MKLTVSPLAEHDLEAIGDWIAQDNPVRAISFTEELYQQCLLIAESPVLYRERPELGAGIR GCSYGRYLLLFRVLDTEVRIERIVHGSRDINRVLAEPDDKPSDHFWQP >gi|296493309|gb|ADTK01000192.1| GENE 2 601 - 864 211 87 aa, chain - ## HITS:1 COG:RSc3225 KEGG:ns NR:ns ## COG: RSc3225 COG3609 # Protein_GI_number: 17547944 # Func_class: K Transcription # Function: Predicted transcriptional regulators containing the CopG/Arc/MetJ DNA-binding domain # Organism: Ralstonia solanacearum # 1 85 1 87 88 82 56.0 3e-16 MPTSVSLSPYFETFIREQIESGRYNNTSEVIRAGLRALEEREQQIKLESLQSAVTAGINS GESKSAEEVFGRLTHKYKKMAEGEQPI >gi|296493309|gb|ADTK01000192.1| GENE 3 936 - 1697 498 253 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_3215 NR:ns ## KEGG: EcSMS35_3215 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 253 36 288 288 513 99.0 1e-144 VQIDTLIRQLTEISVLTESIGGKTARDWAMKQDFRCGCWLMDKPETAMKAITRNLDREIW RDLMQRSGMLSLMDAQARDTWYRSLEYDNFPEISEANILSTFEQLHQNKDEVFERGVINV FRGLSWNYENNSPCKFGSKIIVNNLVRWDRWGFHLITGQQADRLADLERMLYLFSGKPIP DNRENITIHLDDHIQSVQGKEDYEDEMFSIRYFKKGSAHITFRKPELVDRLNDIIARHYS EALASTVNKSRKA Prediction of potential genes in microbial genomes Time: Mon May 16 15:37:57 2011 Seq name: gi|296493308|gb|ADTK01000193.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont544.1, whole genome shotgun sequence Length of sequence - 6888 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 12/0.000 + CDS 104 - 1162 908 ## COG4658 Predicted NADH:ubiquinone oxidoreductase, subunit RnfD 2 1 Op 2 13/0.000 
+ CDS 1166 - 1786 583 ## COG4659 Predicted NADH:ubiquinone oxidoreductase, subunit RnfG 3 1 Op 3 10/0.000 + CDS 1790 - 2485 683 ## COG4660 Predicted NADH:ubiquinone oxidoreductase, subunit RnfE 4 1 Op 4 3/1.000 + CDS 2485 - 3120 657 ## COG0177 Predicted EndoIII-related endonuclease + Prom 3652 - 3711 5.9 5 2 Tu 1 4/1.000 + CDS 3731 - 5233 1571 ## COG3104 Dipeptide/tripeptide permease + Term 5257 - 5291 5.2 + Prom 5259 - 5318 4.2 6 3 Tu 1 . + CDS 5339 - 5944 707 ## COG0625 Glutathione S-transferase + Term 5949 - 5994 5.9 - Term 5936 - 5978 5.0 7 4 Tu 1 . - CDS 5988 - 6848 871 ## COG2240 Pyridoxal/pyridoxine/pyridoxamine kinase Predicted protein(s) >gi|296493308|gb|ADTK01000193.1| GENE 1 104 - 1162 908 352 aa, chain + ## HITS:1 COG:ECs2339 KEGG:ns NR:ns ## COG: ECs2339 COG4658 # Protein_GI_number: 15831593 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfD # Organism: Escherichia coli O157:H7 # 1 352 1 352 352 632 99.0 0 MVFRIASSPYTHNQRQTSRIMLLVLLAAVPGIAAQLWFFGWGTLVQILLASVSALLAEAL VLKLRKQSVAATLKDNSALLTGLLLAVSIPPLAPWWMVVLGTVFAVIIAKQLYGGLGQNP FNPAMIGYVVLLISFPVQMTSWLPPHEIAVNIPGFIDAIQVIFSGHTASGGDMNTLRLGI DGISQATPLDTFKTSVRAGHSVEQIMQYPIYSGILAGAGWQWVNLAWLAGGLWLLWQKAI RWHIPLSFLVTLALCATLGWLFSPETLAAPQIHLLSGATMLGAFFILTDPVTASTTNRGR LIFGALAGLLVWLIRSFGGYPDGVAFAVLLANITVPLIDYYTRPRVYGHRKG >gi|296493308|gb|ADTK01000193.1| GENE 2 1166 - 1786 583 206 aa, chain + ## HITS:1 COG:ydgP KEGG:ns NR:ns ## COG: ydgP COG4659 # Protein_GI_number: 16129589 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfG # Organism: Escherichia coli K12 # 1 206 1 206 206 393 100.0 1e-109 MLKTIRKHGITLALFAAGSTGLTAAINQMTKTTIAEQASLQQKALFDQVLPAERYNNALA QSCYLVTAPELGKGEHRVYIAKQDDKPVAAVLEATAPDGYSGAIQLLVGADFNGTVLGTR VTEHHETPGLGDKIELRLSDWITHFAGKKISGADDAHWAVKKDGGDFDQFTGATITPRAV VNAVKRAGLYAQTLPAQLSQLPACGE >gi|296493308|gb|ADTK01000193.1| GENE 3 1790 - 2485 683 231 aa, chain + ## HITS:1 COG:ECs2341 
KEGG:ns NR:ns ## COG: ECs2341 COG4660 # Protein_GI_number: 15831595 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfE # Organism: Escherichia coli O157:H7 # 1 231 1 231 231 395 99.0 1e-110 MSEIKDVIVQGLWKNNSALVQLLGLCPLLAVTSTATNALGLGLATTLVLTLTNLTISTLR HWTPAEIRIPIYVMIIASVVSAVQMLINAYAFGLYQSLGIFIPLIVTNCIVVGRAEAFAA KKGPALSALDGFSIGMGATCAMFVLGSLREIIGNGTLFDGADALLGSWAKVLRVEIFHTD SPFLLAMLPPGAFIGLGLMLAGKYLIDERMKKRRAEAAAERALPNGETGNV >gi|296493308|gb|ADTK01000193.1| GENE 4 2485 - 3120 657 211 aa, chain + ## HITS:1 COG:nth KEGG:ns NR:ns ## COG: nth COG0177 # Protein_GI_number: 16129591 # Func_class: L Replication, recombination and repair # Function: Predicted EndoIII-related endonuclease # Organism: Escherichia coli K12 # 1 211 1 211 211 422 99.0 1e-118 MNKAKRLEILTRLRENNPHPTTELNFSSPFELLIAVLLSAQATDVSVNKATAKLYPVANT PAAMHELGVEGVKTYIKTIGLYNSKAENIIKTCRILLERHNGEVPEDRAALEALPGVGRK TANVVLNTAFGWPTIAVDTHIFRVCNRTQFAPGKNVEQVEEKLLKVVPAEFKVDCHHWLI LHGRYTCIARKPRCGSCIIEDLCEYKEKVDI >gi|296493308|gb|ADTK01000193.1| GENE 5 3731 - 5233 1571 500 aa, chain + ## HITS:1 COG:ydgR KEGG:ns NR:ns ## COG: ydgR COG3104 # Protein_GI_number: 16129592 # Func_class: E Amino acid transport and metabolism # Function: Dipeptide/tripeptide permease # Organism: Escherichia coli K12 # 1 487 1 487 500 879 100.0 0 MSTANQKPTESVSLNAFKQPKAFYLIFSIELWERFGYYGLQGIMAVYLVKQLGMSEADSI TLFSSFSALVYGLVAIGGWLGDKVLGTKRVIMLGAIVLAIGYALVAWSGHDAGIVYMGMA AIAVGNGLFKANPSSLLSTCYEKNDPRLDGAFTMYYMSVNIGSFFSMIATPWLAAKYGWS VAFALSVVGLLITIVNFAFCQRWVKQYGSKPDFEPINYRNLLLTIIGVVALIAIATWLLH NQEVARMALGVVAFGIVVIFGKEAFAMKGAARRKMIVAFILMLEAIIFFVLYSQMPTSLN FFAIRNVEHSILGLAVEPEQYQALNPFWIIIGSPILAAIYNKMGDTLPMPTKFAIGMVMC SGAFLILPLGAKFASDAGIVSVSWLVASYGLQSIGELMISGLGLAMVAQLVPQRLMGFIM GSWFLTTAGANLIGGYVAGMMAVPDNVTDPLMSLEVYGRVFLQIGVATAVIAVLMLLTAP KLHRMTQDDAADKAAKAAVA >gi|296493308|gb|ADTK01000193.1| GENE 6 5339 - 5944 707 201 aa, chain + ## HITS:1 COG:ECs2344 KEGG:ns NR:ns ## COG: ECs2344 COG0625 # 
Protein_GI_number: 15831598 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione S-transferase # Organism: Escherichia coli O157:H7 # 1 201 1 201 201 387 99.0 1e-108 MKLFYKPGACSLASHITLRESGKDFTLVSVDLMKKRLENGDDYFAVNPKGQVPALLLDDG TLLTEGVAIMQYLADSVPDRQLLAPVNSISRYKTIEWLNYIATELHKGFTPLFRPDTPEE YKPTVRAQLEKKLQYVNEALKDEHWICGQRFTIADAYLFTVLRWAYAVKLNLEGFEHIAA FMQRMAERPEVQDALSAEGLK >gi|296493308|gb|ADTK01000193.1| GENE 7 5988 - 6848 871 286 aa, chain - ## HITS:1 COG:pdxY KEGG:ns NR:ns ## COG: pdxY COG2240 # Protein_GI_number: 16129594 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal/pyridoxine/pyridoxamine kinase # Organism: Escherichia coli K12 # 1 286 2 287 287 568 100.0 1e-162 MKNILAIQSHVVYGHAGNSAAEFPMRRLGANVWPLNTVQFSNHTQYGKWTGCVMPPSHLT EIVQGIAAIDKLHTCDAVLSGYLGSAEQGEHILGIVRQVKAANPQAKYFCDPVMGHPEKG CIVAPGVAEFHVRHGLPASDIIAPNLVELEILCEHAVNNVEEAVLAARELIAQGPQIVLV KHLARAGYSRDRFEMLLVTADEAWHISRPLVDFGMRQPVGVGDVTSGLLLVKLLQGATLQ EALEHVTAAVYEIMVTTKAMQEYELQVVAAQDRIAKPEHYFSATKL Prediction of potential genes in microbial genomes Time: Mon May 16 15:37:59 2011 Seq name: gi|296493307|gb|ADTK01000194.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont544.2, whole genome shotgun sequence Length of sequence - 4447 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 - CDS 43 - 1317 833 ## PROTEIN SUPPORTED gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 - Prom 1382 - 1441 3.2 2 1 Op 2 3/1.000 - CDS 1446 - 2102 634 ## COG0259 Pyridoxamine-phosphate oxidase 3 1 Op 3 3/1.000 - CDS 2161 - 2484 303 ## COG3895 Predicted periplasmic protein 4 2 Tu 1 . - CDS 2588 - 3697 751 ## COG2377 Predicted molecular chaperone distantly related to HSP70-fold metalloproteases - Prom 3767 - 3826 3.2 + Prom 3813 - 3872 6.9 5 3 Tu 1 . 
+ CDS 3971 - 4438 516 ## COG3133 Outer membrane lipoprotein Predicted protein(s) >gi|296493307|gb|ADTK01000194.1| GENE 1 43 - 1317 833 424 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 [Phaeobacter gallaeciensis BS107] # 4 421 7 414 418 325 42 4e-89 MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALYCGFDPTADSLHLGHLVPLLCLKR FQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENS AIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQG YDFACLNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTE GGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQ YVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLM QALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCL ICWK >gi|296493307|gb|ADTK01000194.1| GENE 2 1446 - 2102 634 218 aa, chain - ## HITS:1 COG:pdxH KEGG:ns NR:ns ## COG: pdxH COG0259 # Protein_GI_number: 16129596 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxamine-phosphate oxidase # Organism: Escherichia coli K12 # 1 218 1 218 218 432 100.0 1e-121 MSDNDELQQIAHLRREYTKGGLRRRDLPADPLTLFERWLSQACEAKLADPTAMVVATVDE HGQPYQRIVLLKHYDEKGMVFYTNLGSRKAHQIENNPRVSLLFPWHTLERQVMVIGKAER LSTLEVMKYFHSRPRDSQIGAWVSKQSSRISARGILESKFLELKQKFQQGEVPLPSFWGG FRVSLEQIEFWQGGEHRLHDRFLYQRENDAWKIDRLAP >gi|296493307|gb|ADTK01000194.1| GENE 3 2161 - 2484 303 107 aa, chain - ## HITS:1 COG:STM1447 KEGG:ns NR:ns ## COG: STM1447 COG3895 # Protein_GI_number: 16764795 # Func_class: R General function prediction only # Function: Predicted periplasmic protein # Organism: Salmonella typhimurium LT2 # 1 107 3 109 109 189 84.0 7e-49 MKKLLIIILPVLLSGCSAFNQLVERMQTDTLEYQCDEKPLTVKLNNPRQEVSFVYDNQLL HLKQGISASGARYTDGIYVFWSKGDEATVYKRDRIVLNNCQLQNPQR >gi|296493307|gb|ADTK01000194.1| GENE 4 2588 - 3697 751 369 aa, chain - ## HITS:1 COG:ydhH KEGG:ns NR:ns ## COG: ydhH COG2377 # Protein_GI_number: 16129598 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted molecular chaperone distantly related 
to HSP70-fold metalloproteases # Organism: Escherichia coli K12 # 1 369 1 369 369 712 99.0 0 MKSGRFIGVMSGTSLDGVDVVLATIDEHRVAQLASLSWPIPVSLKQAVLDICQGQQLTLS QFGQLDTQLGRLFADAVNALLKEQNLQARDIVAIGCHGQTVWHEPTGVAPHTLQIGDNNQ IVARTGITVVGDFRRRDIALGGQGAPLVPAFHHALLAHPTERRMVLNIGGIANLSLLIPG QPVGGYDTGPGNMLMDAWIWRQAGKPYDKDAEWARAGKVILPLLQNMLSDPYFSQPAPKS TGREYFNYGWLERHLRHFPGVDPRDVQATLAELTAVTISEQVLLSGGCERLMVCGGGSRN PLLMARLAALLPGTEVTTTDAVGISGDDMEALAFAWLAWRTLAGLPGNLPSVTGASQETV LGAIFPANP >gi|296493307|gb|ADTK01000194.1| GENE 5 3971 - 4438 516 155 aa, chain + ## HITS:1 COG:slyB KEGG:ns NR:ns ## COG: slyB COG3133 # Protein_GI_number: 16129599 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane lipoprotein # Organism: Escherichia coli K12 # 1 155 1 155 155 212 99.0 2e-55 MIKRVLVVSMVGLSLVGCVNNDTLSGDVYTASEAKQVQNVSYGTIVNVRPVQIQGGDDSN VIGAIGGAVLGGFLGNTVGGGTGRSLATAAGAVAGGVAGQGVQSAMNKTQGVELEIRKDD GNTIMVVQKQGNTRFSPGQRVVLASNGSQVTVSPR Prediction of potential genes in microbial genomes Time: Mon May 16 15:38:07 2011 Seq name: gi|296493306|gb|ADTK01000195.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont544.3, whole genome shotgun sequence Length of sequence - 24553 bp Number of predicted genes - 22, with homology - 21 Number of transcription units - 16, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 25 - 465 353 ## COG1846 Transcriptional regulators - Prom 613 - 672 6.8 + Prom 457 - 516 9.9 2 2 Op 1 . + CDS 660 - 896 128 ## ECUMN_1934 conserved hypothetical protein; putative inner membrane protein 3 2 Op 2 6/0.333 + CDS 899 - 1756 704 ## COG1566 Multidrug resistance efflux pump 4 2 Op 3 . + CDS 1756 - 3768 1239 ## COG1289 Predicted membrane protein + Term 3906 - 3943 2.2 - Term 3661 - 3697 0.0 5 3 Op 1 3/0.889 - CDS 3769 - 4305 708 ## COG2032 Cu/Zn superoxide dismutase 6 3 Op 2 . 
- CDS 4371 - 5267 1061 ## COG4989 Predicted oxidoreductase - Prom 5290 - 5349 4.1 + Prom 5388 - 5447 2.2 7 4 Op 1 6/0.333 + CDS 5658 - 6257 562 ## COG1309 Transcriptional regulator 8 4 Op 2 4/0.667 + CDS 6294 - 7391 1171 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family 9 4 Op 3 7/0.111 + CDS 7472 - 7879 216 ## PROTEIN SUPPORTED gi|15900839|ref|NP_345443.1| lactoylglutathione lyase + Term 7888 - 7923 5.2 + Prom 7895 - 7954 3.9 10 5 Op 1 3/0.889 + CDS 7982 - 8629 717 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases + Prom 8633 - 8692 1.6 11 5 Op 2 . + CDS 8722 - 13338 3223 ## COG1201 Lhr-like helicases + Term 13358 - 13399 -0.8 - Term 13334 - 13374 8.5 12 6 Tu 1 . - CDS 13389 - 13736 488 ## COG0278 Glutaredoxin-related protein - Prom 13770 - 13829 4.7 + Prom 13957 - 14016 7.1 13 7 Tu 1 . + CDS 14070 - 14885 484 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) + Term 14889 - 14930 8.1 + Prom 14901 - 14960 4.4 14 8 Tu 1 . + CDS 15013 - 15594 462 ## PROTEIN SUPPORTED gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent + Term 15708 - 15736 -1.0 15 9 Tu 1 . - CDS 15829 - 16998 1248 ## COG2814 Arabinose efflux permease - Prom 17044 - 17103 3.4 - Term 17064 - 17113 2.4 16 10 Tu 1 . - CDS 17164 - 17253 62 ## - Prom 17283 - 17342 5.6 + Prom 17336 - 17395 4.0 17 11 Tu 1 . + CDS 17552 - 18577 1072 ## COG1609 Transcriptional regulators + Term 18584 - 18630 0.5 - Term 18441 - 18493 2.1 18 12 Tu 1 . - CDS 18574 - 19506 645 ## COG0583 Transcriptional regulator - Prom 19535 - 19594 4.4 + Prom 19528 - 19587 2.9 19 13 Tu 1 . + CDS 19619 - 20830 1128 ## COG0477 Permeases of the major facilitator superfamily + Term 21030 - 21062 1.0 + Prom 20834 - 20893 7.3 20 14 Tu 1 . + CDS 21121 - 22269 1016 ## COG2230 Cyclopropane fatty acid synthase and related methyltransferases + Term 22281 - 22313 4.6 21 15 Tu 1 . 
- CDS 22309 - 22950 741 ## COG0307 Riboflavin synthase alpha chain - Prom 23164 - 23223 3.6 + Prom 23025 - 23084 5.7 22 16 Tu 1 . + CDS 23165 - 24538 1458 ## COG0534 Na+-driven multidrug efflux pump Predicted protein(s) >gi|296493306|gb|ADTK01000195.1| GENE 1 25 - 465 353 146 aa, chain - ## HITS:1 COG:ECs2351 KEGG:ns NR:ns ## COG: ECs2351 COG1846 # Protein_GI_number: 15831605 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 146 1 146 146 241 100.0 4e-64 MKLESPLGSDLARLVRIWRALIDHRLKPLELTQTHWVTLHNIHQLPPDQSQIQLAKAIGI EQPSLVRTLDQLEEKGLISRQTCASDRRAKRIKLTEKAEPLISEMEAVINKTRAEILHGI SAEELEQLITLIAKLEHNIIELQAKG >gi|296493306|gb|ADTK01000195.1| GENE 2 660 - 896 128 78 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_1934 NR:ns ## KEGG: ECUMN_1934 # Name: ydhI # Def: conserved hypothetical protein; putative inner membrane protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 78 1 78 78 105 100.0 4e-22 MKFMLNATGLPLQDLVFGASVYFPPFFKAFAFGFVIWLVVHRLLRGWIYAGDIWHPLLMD LSLFAICVCLALAILIAW >gi|296493306|gb|ADTK01000195.1| GENE 3 899 - 1756 704 285 aa, chain + ## HITS:1 COG:ECs2353 KEGG:ns NR:ns ## COG: ECs2353 COG1566 # Protein_GI_number: 15831607 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Escherichia coli O157:H7 # 1 285 15 299 299 548 99.0 1e-156 MSIKTIKYFSTIIVAVVAVLAGWWLWNYYMQSPWTRDGKIRAEQVSITPQVSGRIVELNI KDNQLVKAGDLLLTIDKTPFQIAELNAQAQLAKAQSDLAKANNEANRRRHLSQNFISAEE LDTANLNVKAMQASVNAAQATLKQTQWQLAQTEIRAPVSGWVTNLTTRIGDYADTGKPLF ALVDSHSFYVIGYFEETKLRHIREGAPAQITLYSDNKTLQGHVSSIGRAIYDQSVESDSS LIPDVKPNVPWVRLAQRVPVRFALDKVPGDVTLVSGTTCSIAVGQ >gi|296493306|gb|ADTK01000195.1| GENE 4 1756 - 3768 1239 670 aa, chain + ## HITS:1 COG:ydhK KEGG:ns NR:ns ## COG: ydhK COG1289 # Protein_GI_number: 16129603 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 670 1 670 670 1221 99.0 0 MNASSWSLRNLPWFRATLAQWRYALRNTIAMCLALTVAYYLNLDEPYWAMTSAAVVSFPT 
VGGVISKSLGRIAGSLLGAIAALLLAGHTLNEPWFFLLSMSAWLGFCTWACAHFTNNVAY AFQLAGYTAAIIAFPMVNITEASQLWDIAQARVCEVIVGILCGGMMMMILPSSSDATALL TALKNMHARLLEHASLLWQPETTDAIRAAHEGVIGQILTMNLLRIQAFWSHYRFRQQNAR LNALLHQQLRMTSVISSLRRMLLNWPSPPGATREILEQLLTALASSQTDVYTVARIIAPL RPTNVADYRHVAFWQRLRYFCRLYLQSSQELHRLQSGVDDHTRLPRTSGLARHTDNAEAM WSGLRTFCTLMMIGAWSIASQWDAGANALTLAAISCVLYSAVAAPFKSLSLLMRTLVLLS LFSFVVKFGLMVQISDLWQFLLFLFPLLATMQLLKLQMPKFAALWGQLIVFMGSFIAVTN PPVYDFADFLNDNLAKIVGVALAWLAFAILRPGSDARKSRRHIRALRRDFVDQLSRHPTL SESEFESLTYHHVSQLSNSQDALARRWLLRWGVVLLNCSHVVWQLRDWESRSDPLSRVRD NCISLLRGVMSERGVQQKSLAATLEELQRICDSLARHHQPAVRELAAIVWRLYCSLSQLE QAPPQGTLAS >gi|296493306|gb|ADTK01000195.1| GENE 5 3769 - 4305 708 178 aa, chain - ## HITS:1 COG:ECs2355 KEGG:ns NR:ns ## COG: ECs2355 COG2032 # Protein_GI_number: 15831609 # Func_class: P Inorganic ion transport and metabolism # Function: Cu/Zn superoxide dismutase # Organism: Escherichia coli O157:H7 # 6 178 1 173 173 301 100.0 3e-82 MNGGPMKRFSLAILALVVATGAQAASEKVEMNLVTSQGVGQSIGSVTITETDKGLEFSPD LKALPPGEHGFHIHAKGSCQPATKDGKASAAESAGGHLDPQNTGKHEGPEGAGHLGDLPA LVVNNDGKATDAVIAPRLKSLDEIKDKALMVHVGGDNMSDQPKPLGGGGERYACGVIK >gi|296493306|gb|ADTK01000195.1| GENE 6 4371 - 5267 1061 298 aa, chain - ## HITS:1 COG:ydhF KEGG:ns NR:ns ## COG: ydhF COG4989 # Protein_GI_number: 16129605 # Func_class: R General function prediction only # Function: Predicted oxidoreductase # Organism: Escherichia coli K12 # 1 298 1 298 298 606 98.0 1e-173 MVQRITIAPQGPEFSRFVMGYWRLMDWNMPARQLVSFIEEHLDLGVTTVDHADIYGGYQC EAAFGEALKLAPHLRERMEIVSKCGIATTAREENVIGHYITDRDHIIKSTEQSLINLATD HLDLLLIHRPDPLMDADEVADAFKHLHQSGKVRHFGVSNFTPAQFALLQSRLPFTLATNQ VEISPVHQPLLLDGTLDQLQQLRVRPMAWSCLGGGRLFNDDYFQPLRDELAVVAEELNAG SIEQVVYAWVLRLPSQPLPIIGSGKIERVRAAVEAETLKMTRQQWFRIRKAALGYDVP >gi|296493306|gb|ADTK01000195.1| GENE 7 5658 - 6257 562 199 aa, chain + ## HITS:1 COG:ydhM KEGG:ns NR:ns ## COG: ydhM COG1309 # Protein_GI_number: 16129607 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 
# 1 199 1 199 199 391 100.0 1e-109 MNKHTEHDTREHLLATGEQLCLQRGFTGMGLSELLKTAEVPKGSFYHYFRSKEAFGVAML ERHYAAYHQRLTELLQSGEGNYRDRILAYYQQTLNQFCQHGTISGCLTVKLSAEVCDLSE DMRSAMDKGARGVIALLSQALENGRENHCLTFCGEPLQQAQVLYALWLGANLQAKISRSF EPLENALAHVKNIIATPAV >gi|296493306|gb|ADTK01000195.1| GENE 8 6294 - 7391 1171 365 aa, chain + ## HITS:1 COG:nemA KEGG:ns NR:ns ## COG: nemA COG1902 # Protein_GI_number: 16129608 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Escherichia coli K12 # 1 365 1 365 365 702 99.0 0 MSSEKLYSPLKVGAITAANRIFMAPLTRLRSIEPGDIPTPLMAEYYRQRASAGLIISEAT QISAQAKGYAGAPGIHSPEQIAAWKKITAGVHAENGHMAVQLWHTGRISHASLQPGGQAP VAPSALSAGTRTSLRDENGQAIRVETSMPRALELEEIPGIVNDFRQAIANAREAGFDLVE LHSAHGYLLHQFLSPSSNHRTDQYGGSVENRARLVLEVVDAGIEEWGADRIGIRISPIGT FQNTDNGPNEEADALYLIEQLGKRGIAYLHMSEPDWAGGEPYTDAFREKVRARFHGPIIG AGAYTVEKAETLIGKGLIDAVAFGRDWIANPDLVARLQRKAELNPQRAESFYGGGAEGYT DYPTL >gi|296493306|gb|ADTK01000195.1| GENE 9 7472 - 7879 216 135 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900839|ref|NP_345443.1| lactoylglutathione lyase [Streptococcus pneumoniae TIGR4] # 2 127 4 126 126 87 38 6e-17 MRLLHTMLRVGDLQRSIDFYTKVLGMKLLRTSENPEYKYSLAFVGYGPETEEAVIELTYN WGVDKYELGTAYGHIALSVDNAAEACEKIRQNGGNVTREAGPVKGGTTVIAFVEDPDGYK IELIEEKDAGRGLGN >gi|296493306|gb|ADTK01000195.1| GENE 10 7982 - 8629 717 215 aa, chain + ## HITS:1 COG:ECs2361 KEGG:ns NR:ns ## COG: ECs2361 COG0847 # Protein_GI_number: 15831615 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Escherichia coli O157:H7 # 1 215 1 215 215 440 99.0 1e-123 MSDNAQLTGLCDRFRGFYPVVIDVETAGFNAKTDALLEIAAITLKMDEQGWLMPDTTLHF HVEPFVGANLQPEALAFNGIDPNDPDRGAVSEYEALHEIFKVVRKGIKASGCNRAIMVAH NANFDHSFMMAAAERASLKRNPFHPFATFDTAALAGLALGQTVLSKACQTAGMDFDSTQA HSALYDTERTAVLFCEIVNRWKRLGGWPLPTAEEV >gi|296493306|gb|ADTK01000195.1| GENE 11 8722 - 13338 3223 1538 aa, chain + ## HITS:1 COG:lhr KEGG:ns NR:ns ## 
COG: lhr COG1201 # Protein_GI_number: 16129611 # Func_class: R General function prediction only # Function: Lhr-like helicases # Organism: Escherichia coli K12 # 1 1538 1 1538 1538 2918 99.0 0 MADNPDPSSLLPDVFSPATRDWFLRAFKQPTAVQSQTWHVAARGEHALVIAPTGSGKTLA AFLYALDRLFREGGEDTREAHKRKTSRILYISPIKALGTDVQRNLQIPLKGIADERRRRG ETEVNLRVGIRTGDTPAQERSKLTRNPPDILITTPESLYLMLTSRARETLRGVETVIIDE VHAVAGSKRGAHLALSLERLDALLHTSAQRIGLSATVRSASDVAAFLGGDRPVTVVNPPA MRHPQIRIVVPVANMDDVSSVASGTGEDSHAGREGSIWPYIETGILDEVLRHRSTIVFTN SRGLAEKLTARLNELYAARLQRSPSIAVDAAHFESTSGATSNRVQSSDVFIARSHHGSVS KEQRAITEQALKSGELRCVVATSSLELGIDMGAVDLVIQVATPLSVASGLQRIGRAGHQV GGVSKGLFFPRTRRDLVDSAVIVECMFAGRLENLTPPHNPLDVLAQQTVAAAAMDALQVD EWYSRVRRAAPWKDLPRRVFDATLDMLSGRYPSGDFSAFRPKLVWNRETGILTARPGAQL LAVTSGGTIPDRGMYSVLLPEGEEKAGSRRVGELDEEMVYESRVNDIITLGATSWRIQQI TRDQVIVTPAPGRSARLPFWRGEGNGRPAELGEMIGDFLHLLADGAFFSGTIPPWLAEEN TIANIQGLIEEQRNATGIVPGSRHLVLERCRDEIGDWRIILHSPYGRRVHEPWALAIAGR IHALWGADASVVASDDGIVARIPDTDGKLPDAAIFLFEPEKLLQIVREAVGSSALFAARF RECAARALLMPGRTPGHRTPLWQQRLRASQLLEIAQGYPDFPVILETLRECLQDVYDLPA LERLMRRLNGGEIQISDVTTTTPSPFATSLLFGYVAEFMYQSDAPLAERRASVLSLDSEL LRNLLGQVDPGELLDPQVIRQVEEELQRLAPGRRAKGEEGLFDLLRELGPMTVEDLAQRH TGSSEEVASYLENLLAVKRIFPAMISGQERLACMDDAARLRDALGVRLPESLPEIYLHRV SYPLRDLFLRYLRAHALVTAEQLAHEFSLGIAIVEEQLQQLREQGLVMNLQQDIWVSDEV FRRLRLRSLQAAREATRPVAATTYARLLLERQGVLPATDGSPALFASTSPGVYEGVDGVM RVIEQLAGVGLPASLWESQILPARVRDYSSEMLDELLATGAVIWSGQKKLGEDDGLVALH LQEYAAESFTPAEADQANRSALQQAIVAVLADGGAWFAQQISQRIRDKIGESVDLSALQE ALWALVWQGVITSDIWAPLRALTRSSSNARTSTRRSHRARRGRPVYAQPVSPRVSYNTPN LAGRWSLLQVEPLNDTERMLALAENMLDRYGIISRQAVIAENIPGGFPSMQTLCRSMEDS GRIMRGRFVEGLGGAQFAERLTIDRLRDLATQATQTRHYTPVALSANDPANVWGNLLPWP AHPATLVPTRRAGALVVVSGGKLLLYLAQGGKKMLVWQEKEELLAPEVFHALTTALRREP RLRFTLTEVNDLPVRQTPMFTLLREAGFSSSPQGLDWG >gi|296493306|gb|ADTK01000195.1| GENE 12 13389 - 13736 488 115 aa, chain - ## HITS:1 COG:ECs2363 KEGG:ns NR:ns ## COG: ECs2363 COG0278 # Protein_GI_number: 15831617 # Func_class: O Posttranslational modification, protein turnover, chaperones 
# Function: Glutaredoxin-related protein # Organism: Escherichia coli O157:H7 # 1 115 1 115 115 225 100.0 1e-59 MSTTIEKIQRQIAENPILLYMKGSPKLPSCGFSAQAVQALAACGERFAYVDILQNPDIRA ELPKYANWPTFPQLWVDGELVGGCDIVIEMYQRGELQQLIKETAAKYKSEEPDAE >gi|296493306|gb|ADTK01000195.1| GENE 13 14070 - 14885 484 271 aa, chain + ## HITS:1 COG:ydhO KEGG:ns NR:ns ## COG: ydhO COG0791 # Protein_GI_number: 16129613 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Escherichia coli K12 # 1 271 1 271 271 422 100.0 1e-118 MARINRISITLCALLFTTLPLTPMAHASKQARESSATTHITKKADKKKSTATTKKTQKTA KKAASKSTTKSKTASSVKKSSITASKNAKTRSKHAVNKTASASFTEKCTKRKGYKSHCVK VKNAASGTLADAHKAKVQKATKVAMNKLMQQIGKPYRWGGSSPRTGFDCSGLVYYAYKDL VKIRIPRTANEMYHLRDAAPIERSELKNGDLVFFRTQGRGTADHVGVYVGNGKFIQSPRT GQEIQITSLSEDYWQRHYVGARRVMTPKTLR >gi|296493306|gb|ADTK01000195.1| GENE 14 15013 - 15594 462 193 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent [Streptococcus pneumoniae TIGR4] # 1 193 1 199 201 182 44 2e-45 MSFELPALPYAKDALAPHISAETIEYHYGKHHQTYVTNLNNLIKGTAFEGKSLEEIIRSS EGGVFNNAAQVWNHTFYWNCLAPNAGGEPTGKVAEAIAASFGSFADFKAQFTDAAIKNFG SGWTWLVKNSDGKLAIVSTSNAGTPLTTDATPLLTVDVWEHAYYIDYRNARPGYLEHFWA LVNWEFVAKNLAA >gi|296493306|gb|ADTK01000195.1| GENE 15 15829 - 16998 1248 389 aa, chain - ## HITS:1 COG:ydhP KEGG:ns NR:ns ## COG: ydhP COG2814 # Protein_GI_number: 16129615 # Func_class: G Carbohydrate transport and metabolism # Function: Arabinose efflux permease # Organism: Escherichia coli K12 # 1 389 1 389 389 571 100.0 1e-163 MKINYPLLALAIGAFGIGTTEFSPMGLLPVIARGVDVSIPAAGMLISAYAVGVMVGAPLM TLLLSHRARRSALIFLMAIFTLGNVLSAIAPDYMTLMLSRILTSLNHGAFFGLGSVVAAS VVPKHKQASAVATMFMGLTLANIGGVPAATWLGETIGWRMSFLATAGLGVISMVSLFFSL PKGGAGARPEVKKELAVLMRPQVLSALLTTVLGAGAMFTLYTYISPVLQSITHATPVFVT AMLVLIGVGFSIGNYLGGKLADRSVNGTLKGFLLLLMVIMLAIPFLARNEFGAAISMVVW GAATFAVVPPLQMRVMRVASEAPGLSSSVNIGAFNLGNALGAAAGGAVISAGLGYSFVPV 
MGAIVAGLALLLVFMSARKQPETVCVANS >gi|296493306|gb|ADTK01000195.1| GENE 16 17164 - 17253 62 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSTDLKFSLVTTIIVLGLIVAVGLTAALH >gi|296493306|gb|ADTK01000195.1| GENE 17 17552 - 18577 1072 341 aa, chain + ## HITS:1 COG:ECs2367 KEGG:ns NR:ns ## COG: ECs2367 COG1609 # Protein_GI_number: 15831621 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 341 1 341 341 700 100.0 0 MATIKDVAKRANVSTTTVSHVINKTRFVAEETRNAVWAAIKELHYSPSAVARSLKVNHTK SIGLLATSSEAAYFAEIIEAVEKNCFQKGYTLILGNAWNNLEKQRAYLSMMAQKRVDGLL VMCSEYPEPLLAMLEEYRHIPMVVMDWGEAKADFTDAVIDNAFEGGYMAGRYLIERGHRE IGVIPGPLERNTGAGRLAGFMKAMEEAMIKVPESWIVQGDFEPESGYRAMQQILSQPHRP TAVFCGGDIMAMGALCAADEMGLRVPQDVSLIGYDNVRNARYFTPALTTIHQPKDSLGET AFNMLLDRIVNKREEPQSIEVHPRLIERRSVADGPFRDYRR >gi|296493306|gb|ADTK01000195.1| GENE 18 18574 - 19506 645 310 aa, chain - ## HITS:1 COG:ECs2368 KEGG:ns NR:ns ## COG: ECs2368 COG0583 # Protein_GI_number: 15831622 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 310 1 310 310 621 99.0 1e-178 MWSEYSLEVVDAVARNGSFSAAAQELHRVPSAVSYTVRQLEEWLAVPLFERRHRDVELTA AGAWFLKEGRSVVKKMQITRQQCQQIANGWRGQLAIAVDNIVRPERTRQMIVDFYRHFDD VELLVFQEVFNGVWDALSDGRVELAIGATRAIPVGGRYAFRDMGMLSWSCVVASHHPLAL MDGPFSDDTLRNWPSLVREDTSRTLPKRITWLLDNQKRLVVPDWESSATCISAGLCIGMV PTHFAKPWLNEGKWVALELENPFPDSACCLTWQQNDMSPALTWLLEYLGDSETLNKEWLR EPEETPATGD >gi|296493306|gb|ADTK01000195.1| GENE 19 19619 - 20830 1128 403 aa, chain + ## HITS:1 COG:ECs2369 KEGG:ns NR:ns ## COG: ECs2369 COG0477 # Protein_GI_number: 15831623 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 403 1 403 403 706 99.0 0 MQPGKRFLVWLAGLSVLGFLATDMYLPAFAAIQADLQTPASAVSASLSLFLAGFAAAQLL 
WGPLSDRYGRKPVLLIGLTIFALGSLGMLWVENAATLLVLRFVQAVGVCAAAVIWQALVT DYYPSQKVNRIFATIMPLVGLSPALAPLLGSWLLVHFSWQAIFATLFAITVVLILPIFWL KPTTKARNNSQDGLTFTDLLRSKTYRGNVLIYAACSASFFAWLTGSPFILSEMGYSPAVI GLSYVPQTIAFLIGGYGCRAALQKWQGKQLLPWLLVLFAVSVIATWAAGFISHVSLVEIL IPFCVMAIANGAIYPIVVAQALRPFPHATGRAAALQNTLQLGLCFLASLVVSWLISISTP LLTTTSVMLSTVVLVALGYMMQRCEEVGCQNHGNAEVAHSESH >gi|296493306|gb|ADTK01000195.1| GENE 20 21121 - 22269 1016 382 aa, chain + ## HITS:1 COG:cfa KEGG:ns NR:ns ## COG: cfa COG2230 # Protein_GI_number: 16129619 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cyclopropane fatty acid synthase and related methyltransferases # Organism: Escherichia coli K12 # 1 382 1 382 382 812 100.0 0 MSSSCIEEVSVPDDNWYRIANELLSRAGIAINGSAPADIRVKNPDFFKRVLQEGSLGLGE SYMDGWWECDRLDMFFSKVLRAGLENQLPHHFKDTLRIAGARLFNLQSKKRAWIVGKEHY DLGNDLFSRMLDPFMQYSCAYWKDADNLESAQQAKLKMICEKLQLKPGMRVLDIGCGWGG LAHYMASNYDVSVVGVTISAEQQKMAQERCEGLDVTILLQDYRDLNDQFDRIVSVGMFEH VGPKNYDTYFAVVDRNLKPEGIFLLHTIGSKKTDLNVDPWINKYIFPNGCLPSVRQIAQS SEPHFVMEDWHNFGADYDTTLMAWYERFLAAWPEIADNYSERFKRMFTYYLNACAGAFRA RDIQLWQVVFSRGVENGLRVAR >gi|296493306|gb|ADTK01000195.1| GENE 21 22309 - 22950 741 213 aa, chain - ## HITS:1 COG:ribC KEGG:ns NR:ns ## COG: ribC COG0307 # Protein_GI_number: 16129620 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase alpha chain # Organism: Escherichia coli K12 # 1 213 1 213 213 426 100.0 1e-119 MFTGIVQGTAKLVSIDEKPNFRTHVVELPDHMLDGLETGASVAHNGCCLTVTEINGNHVS FDLMKETLRITNLGDLKVGDWVNVERAAKFSDEIGGHLMSGHIMTTAEVAKILTSENNRQ IWFKVQDSQLMKYILYKGFIGIDGISLTVGEVTPTRFCVHLIPETLERTTLGKKKLGARV NIEIDPQTQAVVDTVERVLAARENAMNQPGTEA >gi|296493306|gb|ADTK01000195.1| GENE 22 23165 - 24538 1458 457 aa, chain + ## HITS:1 COG:ECs2372 KEGG:ns NR:ns ## COG: ECs2372 COG0534 # Protein_GI_number: 15831626 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Escherichia coli O157:H7 # 1 457 1 457 457 796 99.0 0 
MQKYISEARLLLALAIPVILAQIAQTAMGFVDTVMAGGYSATDMAAVAIGTSIWLPAILF GHGLLLALTPVIAQLNGSGRRERIAHQVRQGFWLAGFVSVLIMLVLWNAGYIIRSMENID PALADKAVGYLRALLWGAPGYLFFQVARNQCEGLAKTKPGMVMGFIGLLVNIPVNYIFIY GHFGMPELGGVGCGVATAAVYWVMFLAMVSYIKRARSMRDIRNEKGTAKPDPAVMKRLIQ LGLPIALALFFEVTLFAVVALLVSPLGIVDVAGHQIALNFSSLMFVLPMSLAAAVTIRVG YRLGQGSTLDAQTAARTGLMVGVCMATLTAIFTVSLREQIALLYNDNPEVVTLAAHLMLL AAVYQISDSIQVIGSGILRGYKDTRSIFYITFTAYWVLGLPSGYILALTDLVVKPMGPAG FWIGFIIGLTSAAIMMMLRMRFLQRLPSVIILQRASR Prediction of potential genes in microbial genomes Time: Mon May 16 15:38:17 2011 Seq name: gi|296493305|gb|ADTK01000196.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont544.4, whole genome shotgun sequence Length of sequence - 12884 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 5, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 27 - 1283 969 ## COG3468 Type V secretory pathway, adhesin AidA - Prom 1320 - 1379 3.3 + TRNA 1591 - 1667 92.3 # Val GAC 0 0 + TRNA 1672 - 1748 96.8 # Val GAC 0 0 + Prom 1674 - 1733 80.2 2 2 Op 1 . + CDS 1856 - 2161 324 ## SSON_1489 hypothetical protein + Term 2176 - 2202 -0.7 + Prom 2167 - 2226 3.3 3 2 Op 2 . + CDS 2287 - 3891 941 ## COG4529 Uncharacterized protein conserved in bacteria + Term 3904 - 3938 -0.5 4 3 Op 1 . - CDS 3903 - 4223 140 ## ECIAI1_1720 hypothetical protein 5 3 Op 2 . - CDS 4283 - 4714 211 ## ECIAI1_1720 hypothetical protein 6 3 Op 3 3/0.000 - CDS 4718 - 5503 555 ## COG4117 Thiosulfate reductase cytochrome B subunit (membrane anchoring protein) 7 3 Op 4 . - CDS 5500 - 6168 266 ## COG0437 Fe-S-cluster-containing hydrogenase components 1 8 3 Op 5 . - CDS 6232 - 6870 482 ## SbBS512_E1872 hypothetical protein 9 3 Op 6 5/0.000 - CDS 6883 - 8985 1978 ## COG2414 Aldehyde:ferredoxin oxidoreductase 10 3 Op 7 . - CDS 9006 - 9632 370 ## COG0437 Fe-S-cluster-containing hydrogenase components 1 - Term 10043 - 10083 8.2 11 4 Tu 1 . 
- CDS 10088 - 10297 252 ## LF82_2868 uncharacterized protein YdhZ - Prom 10386 - 10445 7.2 + Prom 10565 - 10624 8.3 12 5 Op 1 5/0.000 + CDS 10854 - 12266 1422 ## COG0469 Pyruvate kinase + Term 12296 - 12329 4.5 + Prom 12300 - 12359 5.7 13 5 Op 2 . + CDS 12577 - 12813 338 ## COG4238 Murein lipoprotein + Term 12824 - 12877 1.2 Predicted protein(s) >gi|296493305|gb|ADTK01000196.1| GENE 1 27 - 1283 969 418 aa, chain - ## HITS:1 COG:ydhQ KEGG:ns NR:ns ## COG: ydhQ COG3468 # Protein_GI_number: 16129622 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 1 418 1 418 418 694 98.0 0 MGSDAKNLMNDGNVQIVKTGEVIGATQLTEGELIVEAGGRAENTVVTGAGWLKVATGGIA KCTQYGNNGTLSVSDGAIATDIVQSEGGAISLSTLATVNGRHPEGEFSVDQGYACGLLLE NGGNLRVLEGHRAEKIILDQEGGLLVNGTTSAVVVDEGGELLVYPGGEASNCEINQGGVF MLAGKASDTLLAGGTMNNLGGEDSDTIVENGSIYRLGTDGLQLYSSGKTQNLSVNVGGRA EVHAGTLENAVIQGGTVILLSPTSADENFVVEEDRAPVELTGSVALLDGASMIIGYGADL QQSTITVQQGGVLILDGSTVKGDSVTFSVGNINLNGGKLWLITGAATHVQLKVKRLRGEG AICLQTSAKEISPDFINVKGEVTGDIHVEITDASRQTLCNALKLQPDEDGIGATLQPA >gi|296493305|gb|ADTK01000196.1| GENE 2 1856 - 2161 324 101 aa, chain + ## HITS:1 COG:no KEGG:SSON_1489 NR:ns ## KEGG: SSON_1489 # Name: not_defined # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 101 1 101 101 191 100.0 8e-48 MATLLQLHFAFNGPFGDAMAEQLKPLAESINQEPGFLWKVWTESEKNHEAGGIYLFTDEK SALAYLEKHTARLKNLGVEEVVAKVFDVNEPLSQINQAKLA >gi|296493305|gb|ADTK01000196.1| GENE 3 2287 - 3891 941 534 aa, chain + ## HITS:1 COG:ECs2375 KEGG:ns NR:ns ## COG: ECs2375 COG4529 # Protein_GI_number: 15831629 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 534 1 534 534 1075 98.0 0 MKKIAIVGAGPTGIYTLFSLLQQQTPLSISIFEQADEAGVGMPYSDEENSKMMLANIASI EIPPINCTYLEWLQKQEASHLQRYGVKKETLHDRQFLPRILLGEYFRDQFLRLVDQARKQ 
KFAVAVYESCQVTDLQITNAGVMLATNQGLPSETFDLAVIATGHVWPDEEEATRTYFPSP WSGLMEAKVDACNVGIMGTSLSGLDAAMAVAIQHGSFIEDDKQHVVFHRDNASEKLNITL MSRTGILPEADFYCPIPYEPLHIVTDQALNAEIQKGEEGLLDRVFRLIVEEIKFADPDWS QRIALESLNVDSFAQAWFAERKQRDPFDWAEKNLQEVESNKREKHTVPWRYIILRLHEAV QEIVPHLNEHDHKRFSKGLARVFIDNYAAIPSESIRRLLALREAGIIHILALGEDYEMEI NESRTVLKTEDNSYSFDVFIDARGQRPLKVKDIPFPGLREQLQKTGDEIPDVGEDYTLQQ PEDIRGRVAFGALPWLMHDQPFVQGLTACAEIGEAMARAVVKPASRARRRLSFD >gi|296493305|gb|ADTK01000196.1| GENE 4 3903 - 4223 140 106 aa, chain - ## HITS:1 COG:no KEGG:ECIAI1_1720 NR:ns ## KEGG: ECIAI1_1720 # Name: ydhT # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 106 165 270 270 202 100.0 3e-51 MRVRVCTLGERCHVASNGDLVQIASFGANARIANSGDNVHIIASGENSTVVSTGVVDSII LGPGGSAALAYHDGERVRFAVAIEGENNIRTGVRYRLNEQHQFVEC >gi|296493305|gb|ADTK01000196.1| GENE 5 4283 - 4714 211 143 aa, chain - ## HITS:1 COG:no KEGG:ECIAI1_1720 NR:ns ## KEGG: ECIAI1_1720 # Name: ydhT # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 129 1 129 270 268 98.0 6e-71 MIITRADLREWRIGAVMYRWFLRHFPRGGSYADIHHALIEEGYTDWAESLVEYAWKKWLA DENFAHQEVSSMQKLATDPGEIPFCSQFARSDDHARIGCCEDNARIATAGYAAQIASMGY SVRIGSVALTVTLAAPVNAHGLL >gi|296493305|gb|ADTK01000196.1| GENE 6 4718 - 5503 555 261 aa, chain - ## HITS:1 COG:ydhU KEGG:ns NR:ns ## COG: ydhU COG4117 # Protein_GI_number: 16129626 # Func_class: C Energy production and conversion # Function: Thiosulfate reductase cytochrome B subunit (membrane anchoring protein) # Organism: Escherichia coli K12 # 1 261 1 261 261 479 98.0 1e-135 MNPSQHAEQFQSQLANYVPQFTPEFWPVWLIIAGVLLVGMWLVQGLHALLRARGVKKSAT DHGEKVYLYSKAVRLWHWSNALLFVLLLASGLINHFAMVGATAVKSLVAVHEVCGFLLLA CWLGFVLINAVGDNGHHYRIRRQGWLERAAKQTRFYLFGIMQGEEHPFPATTQSKFNPLQ QVAYIGVMYGLLPLLLLTGLLCLYPQAVGDVFPGVRYWLLQAHFALAFISLFFIFGHLYL CTTGRTPHETFKSMVDGYHRH >gi|296493305|gb|ADTK01000196.1| GENE 7 5500 - 6168 266 222 aa, chain - ## HITS:1 COG:ECs2378 KEGG:ns NR:ns ## COG: ECs2378 COG0437 # Protein_GI_number: 15831632 # Func_class: C Energy 
production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli O157:H7 # 1 222 18 239 239 456 99.0 1e-128 MSFTRRKFVLGMGTVIFFTGSASSLLANTRQEKEVRYAMIHDESRCNGCNICTRACRKTN HVPAQGSRLSIAHIPVTDNDNETQYHFFRQSCQHCEDAPCIDVCPTGASWRDEQGIVRVE KSQCIGCSYCIGACPYQVRYLNPVTKVADKCDFCAESRLAKGFPPICVSACPEHALIFGR EDSPEIQAWLQDNKYYQYQLPGAGKPHLYRRFGQHLIKKENV >gi|296493305|gb|ADTK01000196.1| GENE 8 6232 - 6870 482 212 aa, chain - ## HITS:1 COG:no KEGG:SbBS512_E1872 NR:ns ## KEGG: SbBS512_E1872 # Name: not_defined # Def: hypothetical protein # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 1 212 4 215 215 372 100.0 1e-102 MNHRDELPLAKVSEVDEAKRQWLQGMRHPVDTVTEPEPAEILAEFIRQHSAAGQLVARAV FLSPPYSVAEEELSVLLESIKQNGDYADIACMTGSQDDYYYSTQAMSENYAAMSLQVVEQ DICRAIAHAVRFECQTYPRPYKVAMLMQAPYYFQEAQIEAAIAAMDVAPEYADIRQVESS TAELYLFSERFMTYGKAYGLCEWFEVEQFQNP >gi|296493305|gb|ADTK01000196.1| GENE 9 6883 - 8985 1978 700 aa, chain - ## HITS:1 COG:ydhV KEGG:ns NR:ns ## COG: ydhV COG2414 # Protein_GI_number: 16129629 # Func_class: C Energy production and conversion # Function: Aldehyde:ferredoxin oxidoreductase # Organism: Escherichia coli K12 # 1 699 1 699 700 1488 99.0 0 MANGWTGNILRVNLTTGNITLEDSSKFKSFVGGMGFGYKIMYDEVPPGTKPFDEANKLVF ATGPLTGSGAPCSSRVNITSLSTFTKGNLVVDAHMGGFFAAQMKFAGYDVIIIEGKAKSP VWLKIKDDKVSLEKADFLWGKGTRATTEEICRLTSPETCVAAIGQAGENLVPLSGMLNSR NHSGGAGTGAIMGSKNLKAIAVEGTKGVNIADRQEMKRLNDYMMTELIGANNNHVVPSTP QSWAEYSDPKSRWTARKGLFWGAAEGGPIETGEIPPGNQNTVGFRTYKSVFDLGPAAEKY TVKMSGCHSCPIRCMTQMNIPRVKEFGVPSTGGNTCVANFVHTTIFPNGPKDFEDKDDGR VIGNLVGLNLFDDYGLWCNYGQLHRDFTYCYSKGVFKRVLPAEEYAEIHWDQLEAGDVNF IKDFYYRLAHRVGELSHLADGSYAIAERWNLGEEYWGYAKNKLWSPFGYPVHHANEASAQ VGSIVNCMFNRDCMTHTHINFIGSGLPLKLQREVAKELFGSEDAYDETKNYTPINDAKIK YAKWSLLRVCLHNAVTLCNWVWPMTVSPLKSRNYRGDLALEAKFFKAITGEEMTQEKLDL AAERIFTLHRAYTVKLMQTKDMRNEHDLICSWVFDKDPQIPVFTEGTDKMDRDDMHASLT MFYKEMGWDPQLGSPTRETLQRLGLEDIAADLAAHNLLPV >gi|296493305|gb|ADTK01000196.1| GENE 10 9006 - 9632 370 208 aa, chain - ## HITS:1 COG:ECs2381 
KEGG:ns NR:ns ## COG: ECs2381 COG0437 # Protein_GI_number: 15831635 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli O157:H7 # 1 208 1 208 208 366 99.0 1e-101 MNPVDRPLLDIGLTRLEFLRISGKGLAALTIAPALLSLLGCKQEDIDSGTVGLINTPKGV LVTQRARCTGCHRCEISCTNFNDGSVGTFFSRIKIHRNYFFGDNGVGSGGGLYGDLNYTA DTCRQCKEPQCMNVCPIGAITWQQKEGCITVDHKRCIGCSACTTACPWMMATVNTESKKS SKCVLCGECANACPTGALKIIEWKDITV >gi|296493305|gb|ADTK01000196.1| GENE 11 10088 - 10297 252 69 aa, chain - ## HITS:1 COG:no KEGG:LF82_2868 NR:ns ## KEGG: LF82_2868 # Name: ydhZ # Def: uncharacterized protein YdhZ # Organism: E.coli_LF82 # Pathway: not_defined # 1 69 1 69 69 109 100.0 3e-23 MGNRTKEDELYREMCRVVGKVVLEMRDLGQEPKHIVIAGVLRTALANKRIQRSELEKQAM ETVINALVK >gi|296493305|gb|ADTK01000196.1| GENE 12 10854 - 12266 1422 470 aa, chain + ## HITS:1 COG:ECs2383 KEGG:ns NR:ns ## COG: ECs2383 COG0469 # Protein_GI_number: 15831637 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Escherichia coli O157:H7 # 1 470 1 470 470 884 100.0 0 MKKTKIVCTIGPKTESEEMLAKMLDAGMNVMRLNFSHGDYAEHGQRIQNLRNVMSKTGKT AAILLDTKGPEIRTMKLEGGNDVSLKAGQTFTFTTDKSVIGNSEMVAVTYEGFTTDLSVG NTVLVDDGLIGMEVTAIEGNKVICKVLNNGDLGENKGVNLPGVSIALPALAEKDKQDLIF GCEQGVDFVAASFIRKRSDVIEIREHLKAHGGENIHIISKIENQEGLNNFDEILEASDGI MVARGDLGVEIPVEEVIFAQKMMIEKCIRARKVVITATQMLDSMIKNPRPTRAEAGDVAN AILDGTDAVMLSGESAKGKYPLEAVSIMATICERTDRVMNSRLEFNNDNRKLRITEAVCR GAVETAEKLDAPLIVVATQGGKSARAVRKYFPDATILALTTNEKTAHQLVLSKGVVPQLV KEITSTDDFYRLGKELALQSGLAHKGDVVVMVSGALVPSGTTNTASVHVL >gi|296493305|gb|ADTK01000196.1| GENE 13 12577 - 12813 338 78 aa, chain + ## HITS:1 COG:ECs2384 KEGG:ns NR:ns ## COG: ECs2384 COG4238 # Protein_GI_number: 15831638 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Murein lipoprotein # Organism: Escherichia coli O157:H7 # 1 78 1 78 78 105 100.0 2e-23 MKATKLVLGAVILGSTLLAGCSSNAKIDQLSSDVQTLNAKVDQLSNDVNAMRSDVQAAKD DAARANQRLDNMATKYRK Prediction of 
potential genes in microbial genomes Time: Mon May 16 15:38:37 2011 Seq name: gi|296493304|gb|ADTK01000197.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont544.5, whole genome shotgun sequence Length of sequence - 20629 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 6, operones - 4 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 4/1.000 - CDS 35 - 1039 766 ## COG1376 Uncharacterized protein conserved in bacteria - Prom 1118 - 1177 2.8 2 2 Op 1 7/0.000 - CDS 1188 - 1604 562 ## COG2166 SufE protein probably involved in Fe-S center assembly 3 2 Op 2 24/0.000 - CDS 1617 - 2837 1232 ## COG0520 Selenocysteine lyase 4 2 Op 3 41/0.000 - CDS 2834 - 4105 1086 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 5 2 Op 4 41/0.000 - CDS 4080 - 4826 762 ## COG0396 ABC-type transport system involved in Fe-S cluster assembly, ATPase component 6 2 Op 5 3/1.000 - CDS 4836 - 6362 1456 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 7 2 Op 6 . - CDS 6332 - 6700 480 ## COG0316 Uncharacterized conserved protein - Prom 6883 - 6942 4.5 - Term 6977 - 7027 12.4 8 3 Op 1 . - CDS 7248 - 7436 314 ## SbBS512_E1886 hypothetical protein - Prom 7461 - 7520 5.9 - Term 7448 - 7490 6.1 9 3 Op 2 6/0.000 - CDS 7536 - 7946 313 ## COG2050 Uncharacterized protein, possibly involved in aromatic compounds catabolism 10 3 Op 3 . - CDS 7943 - 10999 2620 ## COG0277 FAD/FMN-containing dehydrogenases 11 4 Tu 1 . + CDS 11388 - 12500 1281 ## COG0628 Predicted permease + Term 12508 - 12540 6.3 + Prom 12744 - 12803 6.3 12 5 Op 1 . 
+ CDS 12929 - 13285 136 ## ECO103_1832 hypothetical protein + Prom 13303 - 13362 2.0 13 5 Op 2 21/0.000 + CDS 13385 - 14599 891 ## COG0477 Permeases of the major facilitator superfamily + Prom 14617 - 14676 3.2 14 5 Op 3 4/1.000 + CDS 14826 - 16091 776 ## COG0477 Permeases of the major facilitator superfamily 15 5 Op 4 8/0.000 + CDS 16103 - 16969 859 ## COG0169 Shikimate 5-dehydrogenase 16 5 Op 5 3/1.000 + CDS 17000 - 17758 885 ## COG0710 3-dehydroquinate dehydratase + Prom 17773 - 17832 3.6 17 6 Op 1 . + CDS 17903 - 19498 1578 ## COG4670 Acyl CoA:acetate/3-ketoacid CoA transferase 18 6 Op 2 . + CDS 19512 - 20628 1319 ## COG1960 Acyl-CoA dehydrogenases Predicted protein(s) >gi|296493304|gb|ADTK01000197.1| GENE 1 35 - 1039 766 334 aa, chain - ## HITS:1 COG:ynhG KEGG:ns NR:ns ## COG: ynhG COG1376 # Protein_GI_number: 16129634 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 334 1 334 334 606 99.0 1e-173 MKRASLLTLTLIGAFSAIQAAWAVDYPLPPTGSRLVGQNQTYTVQEGDKNLQAIARRFDT AAMLILEANNTIAPVPKPGTTITIPSQLLLPDAPRQGIIVNLAELRLYYYPPGENIVQVY PIGIGLQGLETPVMETRVGQKIPNPTWTPTAGIRQRSLERGIKLPPVVPAGPNNPLGRYA LRLAHGNGEYLIHGTSAPDSVGLRVSSGCIRMNAPDIKALFSSVRTGTPVKVINEPVKYS VEPNGMRYVEVHRPLSAEEQQNVQTMPYTLPAGFTQFKDNKAVDQKLVDKALYRRAGYPV SVSSGATPAASNAPSVESAQNGEPEQGNMLRATQ >gi|296493304|gb|ADTK01000197.1| GENE 2 1188 - 1604 562 138 aa, chain - ## HITS:1 COG:ynhA KEGG:ns NR:ns ## COG: ynhA COG2166 # Protein_GI_number: 16129635 # Func_class: R General function prediction only # Function: SufE protein probably involved in Fe-S center assembly # Organism: Escherichia coli K12 # 1 126 1 126 138 249 99.0 1e-66 MAVLPDKEKLLRNFLRCANWEEKYLYIIELGQRLPELRDEDRSPQNSIQGCQSQVWIVMR QNAQGIIELQGDSDAAIVKGLIAVVFILYDQMTPQDIVNFDVRPWFEKMALTQHLTPSRS QGLEAMIRAIRAKAAALS >gi|296493304|gb|ADTK01000197.1| GENE 3 1617 - 2837 1232 406 aa, chain - ## HITS:1 COG:csdB KEGG:ns NR:ns ## COG: csdB COG0520 # Protein_GI_number: 16129636 # Func_class: E Amino acid 
transport and metabolism # Function: Selenocysteine lyase # Organism: Escherichia coli K12 # 1 406 1 406 406 826 99.0 0 MTFSVDKVRADFPVLSREVNGLPLAYLDSAASAQKPSQVIDAEAEFYRHGYAAVHRGIHT LSAQATEKMENVRKRASLFINARSAEELVFVRGTTEGINLVANSWGNSNVRAGDNIIISQ MEHHANIVPWQMLCARVGAELRVIPLNPDGTLQLETLPTLFDEKTRLLAITHVSNVLGTE NPLAEMITLAHQHGAKVLVDGAQAVMHHPVDVQALDCDFYVFSGHKLYGPTGIGILYVKE ALLQEMPPWEGGGSMIATVSLSEGTTWTKAPWRFEAGTPNTGGIIGLGAALEYVSALGLN NIAEYEQNLMHYALSQLESVPDLTLYGPQNRLGVIAFNLGKHHAYDVGSFLDNYGIAVRT GHHCAMPLMAYYNVPAMCRASLAMYNTHEEVDRLVTGLQRIHRLLG >gi|296493304|gb|ADTK01000197.1| GENE 4 2834 - 4105 1086 423 aa, chain - ## HITS:1 COG:ECs2388 KEGG:ns NR:ns ## COG: ECs2388 COG0719 # Protein_GI_number: 15831642 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Escherichia coli O157:H7 # 1 423 1 423 423 798 98.0 0 MAGLPNSSNALQQWHHLFEAEGVKRSPQAQQHLQQLLRTGLPTRKHENWKYTPLEGLTNS QFVSIAGEISPQQRDALALTLDAVRLVFVDGRYVPALSDATEGSGYEVSINDDRQGLPDA IQAEVFLHLTESLAQSVTHIAVKRGPRPAKPLLLMHITQGVAGEEVNTAHYRHHLDLAEG AEATVIEHFVSLNDARHFTGARFTINVAANAHLQHIKLAFENPLSHHFAHNDLLLADDAT AFSHSFLLGGAVLRHNTSTQLNGENSMLRINSLAMPVKNEVCDTRTWLEHNKGFCNSRQL HKTIVSDKGRAVFNGLINVAQHAIKTDGQMTNNNLLMGKLAEVDTKPQLEIYADDVKCSH GATVGRIDDEQMFYLRSRGINQQDAQQMIIYAFAAELTEALRDEGLKQQVLARIGQRLPG GAR >gi|296493304|gb|ADTK01000197.1| GENE 5 4080 - 4826 762 248 aa, chain - ## HITS:1 COG:ECs2389 KEGG:ns NR:ns ## COG: ECs2389 COG0396 # Protein_GI_number: 15831643 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, ATPase component # Organism: Escherichia coli O157:H7 # 1 248 1 248 248 482 99.0 1e-136 MLSIKDLHVSVEDKAILRGLSLAVRPGEVHAIMGPNGSGKSTLSATLAGREDYEVTGGTV EFKGKDLLALSPEDRAGEGIFMAFQYPVEIPGVSNQFFLQTALNAVRSYRGQETLDRFDF QDLMEEKIALLKMPEDLLTRSVNVGFSGGEKKRNDILQMAVLEPGLCILDESDSGLDIDA 
LKVVADGVNSLRDGKRSFIIVTHYQRILDYIKPDYVHVLYQGRIVKSGDFTLVKQLEEQG YGWLTEQQ >gi|296493304|gb|ADTK01000197.1| GENE 6 4836 - 6362 1456 508 aa, chain - ## HITS:1 COG:ynhE KEGG:ns NR:ns ## COG: ynhE COG0719 # Protein_GI_number: 16129639 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Escherichia coli K12 # 1 508 1 508 508 1056 100.0 0 MWLWRKLWGIGGTMSRNTEATDDVKTWTGGPLNYKEGFFTQLATDELAKGINEEVVRAIS AKRNEPEWMLEFRLNAYRAWLEMEEPHWLKAHYDKLNYQDYSYYSAPSCGNCDDTCASEP GAVQQTGANAFLSKEVEAAFEQLGVPVREGKEVAVDAIFDSVSVATTYREKLAEQGIIFC SFGEAIHDHPELVRKYLGTVVPGNDNFFAALNAAVASDGTFIYVPKGVRCPMELSTYFRI NAEKTGQFERTILVADEDSYVSYIEGCSAPVRDSYQLHAAVVEVIIHKNAEVKYSTVQNW FPGDNNTGGILNFVTKRALCEGENSKMSWTQSETGSAITWKYPSCILRGDNSIGEFYSVA LTSGHQQADTGTKMIHIGKNTKSTIISKGISAGHSQNSYRGLVKIMPTATNARNFTQCDS MLIGANCGAHTFPYVECRNNSAQLEHEATTSRIGEDQLFYCLQRGISEEDAISMIVNGFC KDVFSELPLEFAVEAQKLLAISLEHSVG >gi|296493304|gb|ADTK01000197.1| GENE 7 6332 - 6700 480 122 aa, chain - ## HITS:1 COG:ydiC KEGG:ns NR:ns ## COG: ydiC COG0316 # Protein_GI_number: 16129640 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 122 1 122 122 252 100.0 9e-68 MDMHSGTFNPQDFAWQGLTLTPAAAIHIRELVAKQPGMVGVRLGVKQTGCAGFGYVLDSV SEPDKDDLLFEHDGAKLFVPLQAMPFIDGTEVDFVREGLNQIFKFHNPKAQNECGCGESF GV >gi|296493304|gb|ADTK01000197.1| GENE 8 7248 - 7436 314 62 aa, chain - ## HITS:1 COG:no KEGG:SbBS512_E1886 NR:ns ## KEGG: SbBS512_E1886 # Name: not_defined # Def: hypothetical protein # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 1 62 28 89 89 109 100.0 2e-23 MSTQLDPTQLAIEFLRRDQSNLSPAQYLKRLKQLELEFADLLTLSSAELKEEIYFAWRLG VH >gi|296493304|gb|ADTK01000197.1| GENE 9 7536 - 7946 313 136 aa, chain - ## HITS:1 COG:ydiI KEGG:ns NR:ns ## COG: ydiI COG2050 # Protein_GI_number: 16129642 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized 
protein, possibly involved in aromatic compounds catabolism # Organism: Escherichia coli K12 # 1 136 1 136 136 276 100.0 9e-75 MIWKRKITLEALNAMGEGNMVGFLDIRFEHIGDDTLEATMPVDSRTKQPFGLLHGGASVV LAESIGSVAGYLCTEGEQKVVGLEINANHVRSAREGRVRGVCKPLHLGSRHQVWQIEIFD EKGRLCCSSRLTTAIL >gi|296493304|gb|ADTK01000197.1| GENE 10 7943 - 10999 2620 1018 aa, chain - ## HITS:1 COG:ydiJ_1 KEGG:ns NR:ns ## COG: ydiJ_1 COG0277 # Protein_GI_number: 16129643 # Func_class: C Energy production and conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Escherichia coli K12 # 1 542 1 542 542 1119 99.0 0 MIPQISQAPGVVQLVLNFLQELEQQGFTGDTATSYADRLTMSTDNSIYQLLPDAVVFPRS TADVALIARLAAQERYSSLIFTPRGGGTGTNGQALNQGIIVDMSRHMNRIIEINPEEGWV RVEAGVIKDQLNQYLKPFGYFFAPELSTSNRATLGGMINTDASGQGSLVYGKTSDHVLGV RAVLLGGDILDTQPLPVELAETLGKSNTTIGRIYNTVYQRCRQQRQLIIDNFPKLNRFLT GYDLRHVFNDEMTEFDLTRILTGSEGTLAFITEARLDITRLPKVRRLVNVKYDSFYSALR NAPFMVEARALSVETVDSKVLNLAREDIVWHSVSELITDVPDKEMLGLNIVEFAGDDEAL IDERVNALCVRLDELIVSQQAGVIGWQVCRELAGVERIYAMRKKAVGLLGNAKGAAKPIP FAEDTCVPPEHLADYIAEFRALLDSHGLSYGMFGHVDAGVLHVRPALDMCDPQQEILMKQ ISDDVVALTAKYGGLLWGEHGKGFRAEYSPAFFGEELFAELRKVKAAFDPHNRLNPGKIC PPEGLDAPMMKVDAVKRGTFDRQIPIAVRQQWRGAMECNGNGLCFNFDARSPMCPSMKIT QNRIHSPKGRATLVREWLRLLADRGVDPLKLEQELPESGVSLRTLIARTRNSWHANKGEY DFSHEVKEAMSGCLACKACSTQCPIKIDVPEFRSRFLQLYHTRYLRPLRDHLVATVESYA PLMARAPKTFNFFINQPLVRKLSEKHIGMVDLPLLSVPSLQQQMVGHRSANMTLEQLEAL NAEQKARTVLVVQDPFTSYYDAQVVADFVRLVEKLGFQPVLLPFSPNGKAQHIKGFLNRF AKTAKKTADFLNRMAKLGMPMVGVDPALVLCYRDEYKLALGEERGEFNVLLANEWLASAL ESQPVATVSGESWYFFGHCTEVTALPGAPAQWAAIFARFGAKLENVSVGCCGMAGTYGHE AKNHENSLGIYELSWHQAMQRLPRNRCLATGYSCRSQVKRVEGTGVRHPVQALLEIIK >gi|296493304|gb|ADTK01000197.1| GENE 11 11388 - 12500 1281 370 aa, chain + ## HITS:1 COG:ECs2395 KEGG:ns NR:ns ## COG: ECs2395 COG0628 # Protein_GI_number: 15831649 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Escherichia coli O157:H7 # 1 370 1 370 370 584 99.0 1e-167 
MVNVRQPRDVAQILLSVLFLAIMIVACLWIVQPFILGFAWAGTVVIATWPVLLRLQKIMF GRRSLAVLVMTLLLVMVFIIPIALLVNSIVDGSGPLIKAISSGDMTLPDLAWLNTIPVIG AKLYAGWHNLLDMGGTAIMAKVRPYIGTTTTWFVGQAAHIGRFMVHCALMLLFSALLYWR GEQVAQGIRHFATRLAGVRGDAAVLLAAQAIRAVALGVVVTALVQAVLGGIGLAVSGVPY ATLLTVLMILSCLVQLGPLPVLIPAIIWLYWTGDTTWGTVLLVWSGVVGTLDNVIRPMLI RMGADLPLILILSGVIGGLIAFGMIGLFIGPVLLAVSWRLFAAWVEEVPPPTDQPEEILE ELGEIEKPNK >gi|296493304|gb|ADTK01000197.1| GENE 12 12929 - 13285 136 118 aa, chain + ## HITS:1 COG:no KEGG:ECO103_1832 NR:ns ## KEGG: ECO103_1832 # Name: ydiL # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 118 1 118 118 241 100.0 7e-63 MNAYELQALRHIFAMTIDECATWIAQTGDSESWRQWENGKCAIPDRVVEQLLAMRQQRKK HLHAIIEKINNRIGNNTMRFFPDLTAFQQVYPDGNFIDWKIYQSVAAELYAHDLERLC >gi|296493304|gb|ADTK01000197.1| GENE 13 13385 - 14599 891 404 aa, chain + ## HITS:1 COG:ECs2397 KEGG:ns NR:ns ## COG: ECs2397 COG0477 # Protein_GI_number: 15831651 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 404 1 404 404 712 98.0 0 MKNPYYPTALGLYFNYLVHGMGVILMSLNMASLETLWQTNAAGVSIVISSLGIGRLSVLL FAGLLSDRFGRRPFIMLGMCCYMAFFFGILHTNNIIIAYVFGFLAGMANSFLDAGTYPSL MEAFPRSPGTANILIKAFVSSGQFLLPLIISLLVWAELWFGWSFMIAAGIMFINALFLYL CTFPPHPGHRLPVIKKTTSSTEHRCSIIDLASYTLYGYISMATFYLVSQWLAQYGQFVAG MSYTMSIKLLSIYTVGSLLCVFITAPLIRNTVRPTTLLMLYTFISFIALFTVCLHPTFYV VIIFAFVIGFTSAGGVVQIGLTLMAERFPYAKGKATGIYYSAGSIATFTIPLITAHLSQR SIADIMWFDTAIAAIGFLLALFIGLRSRKKTRHHSLKENVAPGG >gi|296493304|gb|ADTK01000197.1| GENE 14 14826 - 16091 776 421 aa, chain + ## HITS:1 COG:ydiN KEGG:ns NR:ns ## COG: ydiN COG0477 # Protein_GI_number: 16129647 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major 
facilitator superfamily # Organism: Escherichia coli K12 # 1 421 3 423 423 719 99.0 0 MSQNKAFSTPFILAVLCIYFSYFLHGISVITLAQNMSSLAEKFSTDNAGIAYLISGIGLG RLISILFFGVISDKFGRRAVILMAVIMYLLFFFGIPACPNLTLAYGLAVCVGIANSALDT GGYPALMECFPKASGSAVILVKAMVSFGQMFYPMLVSYMLLNNIWYGYGLIIPGILFVLI TLMLLKSKFPSQLVDASVANELPQMNSKPLVWLEGVSSVLFGVAAFSTFYVIVVWMPKYA MAFAGMSEAEALKTISYYSMGSLVCVFIFAALLKKMVRPIWANVFNSALATITAAIIYLY PSPLVCNAGAFVIGFSAAGGILQLGVSVMSEFFPKSKAKVTSIYMMMGGLANFVIPLITG YLSNIGLQYIIVLDFTFALLALITAIIVFIRYYRVFIIPENDVRFGERKFSTRLNTIKHR G >gi|296493304|gb|ADTK01000197.1| GENE 15 16103 - 16969 859 288 aa, chain + ## HITS:1 COG:ydiB KEGG:ns NR:ns ## COG: ydiB COG0169 # Protein_GI_number: 16129648 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Escherichia coli K12 # 1 288 1 288 288 581 100.0 1e-166 MDVTAKYELIGLMAYPIRHSLSPEMQNKALEKAGLPFTYMAFEVDNDSFPGAIEGLKALK MRGTGVSMPNKQLACEYVDELTPAAKLVGAINTIVNDDGYLRGYNTDGTGHIRAIKESGF DIKGKTMVLLGAGGASTAIGAQGAIEGLKEIKLFNRRDEFFDKALAFAQRVNENTDCVVT VTDLADQQAFAEALASADILTNGTKVGMKPLENESLVNDISLLHPGLLVTECVYNPHMTK LLQQAQQAGCKTIDGYGMLLWQGAEQFTLWTGKDFPLEYVKQVMGFGA >gi|296493304|gb|ADTK01000197.1| GENE 16 17000 - 17758 885 252 aa, chain + ## HITS:1 COG:ECs2400 KEGG:ns NR:ns ## COG: ECs2400 COG0710 # Protein_GI_number: 15831654 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate dehydratase # Organism: Escherichia coli O157:H7 # 1 252 1 252 252 448 99.0 1e-126 MKTVTVKDLVIGTGAPKIIVSLMAKDIARVKSEALAYREADFDILEWRVDHFADLSNVES VMAAAKILRETMPEKPLLFTFRSAKEGGEQAISTEAYIALNRAAIDSGLVDMIDLELFTG DDQVKETVAYAHAHDVKVVMSNHDFHKTPEAEEIIARLRKMQSFDADIPKIALMPQSTSD VLTLLAATLEMQEQYADRPIITMSMAKTGVISRLAGEVFGSAATFGAVKKASAPGQISVN DLRTVLTILHQA >gi|296493304|gb|ADTK01000197.1| GENE 17 17903 - 19498 1578 531 aa, chain + ## HITS:1 COG:ECs2401 KEGG:ns NR:ns ## COG: ECs2401 COG4670 # Protein_GI_number: 15831655 # Func_class: I Lipid transport and metabolism # Function: Acyl CoA:acetate/3-ketoacid CoA transferase # Organism: 
Escherichia coli O157:H7 # 1 531 1 531 531 1065 99.0 0 MKPVKPPRINGRVPVLSAQEAVNYIPDEATLCVLGAGGGILEATTLITALADKYKQTQTP RNLSIISPTGLGDRADRGISPLAQEGLVKWALCGHWGQSPRISELAEQNKIIAYNYPQGV LTQTLRAAAAHQPGIISDIGIGTFVDPRQQGGKLNEVTKEDLIKLVEFDNKEYLYYKAIA PDIAFIRATTCDSEGYATFEDEVMYLDALVIAQAVHNNGGIVMMQVQKMVKKATLHPKSV RIPGYLVDIVVVDPDQSQLYGGAPVNRFISGDFTLDDSTKLSLPLNQRKLVARRALFEMR KGAVGNVGVGIADGIGLVAREEGCADDFILTVETGPIGGITSQGIAFGANVNTRAILDMT SQFDFYHGGGLDVCYLSFAEVDQHGNVGVHKFNGKIMGTGGFIDISATSKKIIFCGTLTA GSLKTEIADGKLHIVQEGRVNKFIRELPEITFSGKIALERGLDVRYITERAVFTLKEDGL HLIEIAPGVDLQKDILDKMDFTPVISPELKLMDERLFIDAAMGFVLPEAAH >gi|296493304|gb|ADTK01000197.1| GENE 18 19512 - 20628 1319 372 aa, chain + ## HITS:1 COG:ECs2402 KEGG:ns NR:ns ## COG: ECs2402 COG1960 # Protein_GI_number: 15831656 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Escherichia coli O157:H7 # 1 372 19 390 401 779 100.0 0 MDFSLTEEQELLLASIRELITTNFPEEYFRTCDQNGTYPREFMRALADNGISMLGVPEEF GGIPADYVTQMLALMEVSKCGAPAFLITNGQCIHSMRRFGSAEQLRKTAESTLETGDPAY ALALTEPGAGSDNNSATTTYTRKNGKVYINGQKTFITGAKEYPYMLVLARDPQPKDPKKA FTLWWVDSSKPGIKINPLHKIGWHMLSTCEVYLDNVEVEESDMVGEEGMGFLNVMYNFEM ERLINAARSTGFAECAFEDAARYANQRIAFGKPIGHNQMIQEKLALMAIKIDNMRNMVLK VAWQADQHQSLRTSAALAKLYCARTAMEVIDDAIQIMGGLGYTDEARVSRFWRDVRCERI GGGTDEIMIYVA Prediction of potential genes in microbial genomes Time: Mon May 16 15:38:44 2011 Seq name: gi|296493303|gb|ADTK01000198.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont544.6, whole genome shotgun sequence Length of sequence - 9988 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 5, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 34 - 945 597 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 1061 - 1120 6.5 + Prom 1090 - 1149 6.1 2 2 Op 1 29/0.000 + CDS 1261 - 2025 800 ## COG2086 Electron transfer flavoprotein, beta subunit 3 2 Op 2 9/0.333 + CDS 2045 - 2983 1032 ## COG2025 Electron transfer flavoprotein, alpha subunit 4 2 Op 3 12/0.000 + CDS 3039 - 4328 1175 ## COG0644 Dehydrogenases (flavoproteins) 5 2 Op 4 2/1.000 + CDS 4325 - 4618 349 ## COG2440 Ferredoxin-like protein + Term 4631 - 4665 5.0 6 3 Tu 1 . + CDS 4675 - 6321 1166 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II + Term 6346 - 6382 4.5 - Term 6334 - 6370 7.7 7 4 Tu 1 . - CDS 6378 - 8756 2804 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase - Prom 8973 - 9032 8.0 + Prom 8937 - 8996 10.0 8 5 Tu 1 . + CDS 9089 - 9922 868 ## COG1806 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|296493303|gb|ADTK01000198.1| GENE 1 34 - 945 597 303 aa, chain - ## HITS:1 COG:ydiP KEGG:ns NR:ns ## COG: ydiP COG2207 # Protein_GI_number: 16129652 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 303 1 303 303 622 100.0 1e-178 MYQRCFDNASETLFVAGKTPRLSRFAFSDDPKWESGHHVHDNETELIYVKKGVARFTIDS SLYVAHADDIVVIERGRLHAVASDVNDPATTCTCALYGFQFQGAEENQLLQPHSCPVIAA GQGKEVIKTLFNELSVILPQSKNSQTSSLWDAFAYTLAILYYENFKNAYRSEQGYIKKDV LIKDILFYLNNNYREKITLEQLSKKFRASVSYICHEFTKEYRISPINYVIQRRMTEAKWS LTNTELSQAEISWRVGYENVDHFAKLFLRHVGCSPSDYRRQFKNCFAEQEILSEFPQPVS LVG >gi|296493303|gb|ADTK01000198.1| GENE 2 1261 - 2025 800 254 aa, chain + ## HITS:1 COG:ydiQ KEGG:ns NR:ns ## COG: ydiQ COG2086 # Protein_GI_number: 16129653 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: Escherichia coli K12 # 1 254 1 254 254 466 99.0 1e-131 MKIITCFKLVPEEQDIVVTPEYTLNFDNADAKISQFDLNAIEAASQLTTDDDEIAALTVG GSLLQNSKVRKDVLSRGPHSLYLVQDAQLEHALPLDTAKALAAAIEKIGFDLLIFGEGSG 
DLYAQQVGLLVGEILQLPVINAVSAIQRQGNTLVIERTLEDDVEVIELSVPAVLCVTSDI NVPRIPSMKAILGAGKKPVNQWQASDIDWSQSAPLAELVGIRVPPQTERKHIIIDNDSPE AIAELAEHLKKALN >gi|296493303|gb|ADTK01000198.1| GENE 3 2045 - 2983 1032 312 aa, chain + ## HITS:1 COG:ECs2405 KEGG:ns NR:ns ## COG: ECs2405 COG2025 # Protein_GI_number: 15831659 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Escherichia coli O157:H7 # 1 312 1 312 312 616 99.0 1e-176 MSQLNNVWVFSDNPERYAELFGGAQQWGQQVYAIVQNTDQAQAVMPYGPKCIYVLEQNDA LQRTENYAESIAALLKDKHPAMLLLAATKRGKALAARLSVQLDAALVNDATAVDIVDGHI CAEHRMYGGLAFAQEKINSPLAIITLAPGVQEPCTSDTSHQCPTETVPYVAPRHEILCRE RRAKAASSVDLSKAKRVVGVGRGLAAQDDLKMVHELAAVLNAEVGCSRPIAEGENWMERE RYIGVSGVLLKSDLYLTLGISGQIQHMVGGNGAKVIVAINKDKNAPIFNYADYGLVGDIY KVVPALISQLSR >gi|296493303|gb|ADTK01000198.1| GENE 4 3039 - 4328 1175 429 aa, chain + ## HITS:1 COG:ECs2406 KEGG:ns NR:ns ## COG: ECs2406 COG0644 # Protein_GI_number: 15831660 # Func_class: C Energy production and conversion # Function: Dehydrogenases (flavoproteins) # Organism: Escherichia coli O157:H7 # 1 429 1 429 429 822 98.0 0 MSDDKFDAIVVGAGVAGSVAALVMARAGLDVLVIERGDSAGCKNMTGGRLYAHTLEAIIP GFAASAPVERKVTREKISFLTEESAVTLDFHREQPDVPQHASYTVLRNRLDPWLMEQAEQ AGAQFIPGVRVDALVREGNKVTGVQAGDDILEANVVILADGVNSMLGRSLGMVPASDPHH YAVGVKEVIGLTPEQINDRFNITGEEGAAWLFAGSPSDGLMGGGFLYTNNDSVSLGLVCG LGDIAHAQKSVPQMLEDFKQHPAIRPLISGGKLLEYSAHMVPEGGLAMVPQLVNDGVMIV GDAAGFCLNLGFTVRVMDLAIASAQAAATTVIAAKEREDFSASSLAQYKRELEQSCVMRD MQHFRKIPALMENPRLFSQYPRMVADIMNEMFTIDGKPNQPVRKMIMGHAKKIGLINLLK DGIKGATAL >gi|296493303|gb|ADTK01000198.1| GENE 5 4325 - 4618 349 97 aa, chain + ## HITS:1 COG:ECs2407 KEGG:ns NR:ns ## COG: ECs2407 COG2440 # Protein_GI_number: 15831661 # Func_class: C Energy production and conversion # Function: Ferredoxin-like protein # Organism: Escherichia coli O157:H7 # 1 97 1 97 97 207 100.0 3e-54 MSQNATVNVDIKLGVNKFHVDEGHPHIILAANPDINEFRKLMKACPAGLYKQDDAGNIHF DSAGCLECGTCRVLCGNTILEQWQYPAGTFGIDFRYG >gi|296493303|gb|ADTK01000198.1| 
GENE 6 4675 - 6321 1166 548 aa, chain + ## HITS:1 COG:ydiD KEGG:ns NR:ns ## COG: ydiD COG0318 # Protein_GI_number: 16129657 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Escherichia coli K12 # 1 548 19 566 566 1116 99.0 0 MKVTLTFNEQRRAAYRQQGLWGDASLADYWQQTARAMPDKIAVVDNHGASYTYSALDHAA SCLANWMLAKGIESGDRIAFQLPGWCEFTVIYLACLKTGAVSVPLLPSWREAELVWVLNK CQAKMFFAPTLFKQTRPVDLILPLQNQLPQLQQIVGVDKLAPATSSLSLSQIIADNTSLT TAITTHGDELAAVLFTSGTEGLPKGVMLTHNNILASERAYCARLNLTWQDVFMMPAPLGH ATGFLHGVTAPFLIGARSVLLDIFTPDACLALLEQQRCTCMLGATPFVYDLLNVLEKQPA DLSALRFFLCGGTTIPKKVARECQQRGIKLLSVYGSTESSPHAVVNLDDPLSRFMHTDGY AAAGVEIKVVDDARKTLPPGCEGEEASRGPNVFMGYFDEPELTARALDEEGWYYSGDLCR MDEAGYIKITGRKKDIIVRGGENISSREVEDILLQHPKIHDACVVAMPDERLGERSCAYV VLKAPHHSLSLEEVVAFFSRKRVAKYKYPEHIVVIEKLPRTASGKIQKFLLRKDIMRRLT QDVCEEIE >gi|296493303|gb|ADTK01000198.1| GENE 7 6378 - 8756 2804 792 aa, chain - ## HITS:1 COG:ppsA KEGG:ns NR:ns ## COG: ppsA COG0574 # Protein_GI_number: 16129658 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: Escherichia coli K12 # 1 792 1 792 792 1609 100.0 0 MSNNGSSPLVLWYNQLGMNDVDRVGGKNASLGEMITNLSGMGVSVPNGFATTADAFNQFL DQSGVNQRIYELLDKTDIDDVTQLAKAGAQIRQWIIDTPFQPELENAIREAYAQLSADDE NASFAVRSSATAEDMPDASFAGQQETFLNVQGFDAVLVAVKHVFASLFNDRAISYRVHQG YDHRGVALSAGVQRMVRSDLASSGVMFSIDTESGFDQVVFITSAWGLGEMVVQGAVNPDE FYVHKPTLAANRPAIVRRTMGSKKIRMVYAPTQEHGKQVKIEDVPQEQRDIFSLTNEEVQ ELAKQAVQIEKHYGRPMDIEWAKDGHTGKLFIVQARPETVRSRGQVMERYTLHSQGKIIA EGRAIGHRIGAGPVKVIHDISEMNRIEPGDVLVTDMTDPDWEPIMKKASAIVTNRGGRTC HAAIIARELGIPAVVGCGDATERMKDGENVTVSCAEGDTGYVYAELLEFSVKSSSVETMP DLPLKVMMNVGNPDRAFDFACLPNEGVGLARLEFIINRMIGVHPRALLEFDDQEPQLQNE IREMMKGFDSPREFYVGRLTEGIATLGAAFYPKRVIVRLSDFKSNEYANLVGGERYEPDE ENPMLGFRGAGRYVSDSFRDCFALECEAVKRVRNDMGLTNVEIMIPFVRTVDQAKAVVEE LARQGLKRGENGLKIIMMCEIPSNALLAEQFLEYFDGFSIGSNDMTQLALGLDRDSGVVS 
ELFDERNDAVKALLSMAIRAAKKQGKYVGICGQGPSDHEDFAAWLMEEGIDSLSLNPDTV VQTWLSLAELKK >gi|296493303|gb|ADTK01000198.1| GENE 8 9089 - 9922 868 277 aa, chain + ## HITS:1 COG:ECs2410 KEGG:ns NR:ns ## COG: ECs2410 COG1806 # Protein_GI_number: 15831664 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 277 1 277 277 550 99.0 1e-156 MENAVDRHVFYISDGTAITAEVLGHAVMSQFPVTISSITLPFVENESRARAVKDQIDAIY HQTGVRPLVFYSIVLPEIRAIILQSEGFCQDIVQALVAPLQQEMKLDPTPIAHRTHGLNP NNLNKYDARIAAIDYTLAHDDGISLRNLDQAQVILLGVSRCGKTPTSLYLAMQFGIRAAN YPFIADDMDNLVLPASLKPLQHKLFGLTIDPERLAAIREERRENSRYASLRQCRMEVAEV EALYRKNQIPWINSTNYSVEEIATKILDIMGLSRRMY Prediction of potential genes in microbial genomes Time: Mon May 16 15:38:57 2011 Seq name: gi|296493302|gb|ADTK01000199.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont544.7, whole genome shotgun sequence Length of sequence - 36568 bp Number of predicted genes - 40, with homology - 39 Number of transcription units - 23, operones - 8 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 1.8 1 1 Tu 1 . + CDS 112 - 1158 891 ## COG0722 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase + Term 1164 - 1199 5.8 + Prom 1202 - 1261 3.9 2 2 Tu 1 . 
+ CDS 1290 - 1481 187 ## COG4256 Hemin uptake protein 3 3 Op 1 4/0.200 - CDS 1485 - 2921 1147 ## COG0397 Uncharacterized conserved protein 4 3 Op 2 5/0.000 - CDS 2984 - 3697 204 ## COG2200 FOG: EAL domain - Prom 3754 - 3813 5.6 - Term 3894 - 3930 9.6 5 4 Op 1 4/0.200 - CDS 3944 - 4408 273 ## PROTEIN SUPPORTED gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 6 4 Op 2 5/0.000 - CDS 4486 - 5223 631 ## COG4138 ABC-type cobalamin transport system, ATPase component 7 4 Op 3 5/0.000 - CDS 5235 - 5786 598 ## COG0386 Glutathione peroxidase 8 4 Op 4 4/0.200 - CDS 5849 - 6829 876 ## COG4139 ABC-type cobalamin transport system, permease component - Prom 6855 - 6914 10.3 9 5 Op 1 13/0.000 - CDS 6930 - 7229 318 ## PROTEIN SUPPORTED gi|148826992|ref|YP_001291745.1| 50S ribosomal protein L35 10 5 Op 2 40/0.000 - CDS 7234 - 9621 2696 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit 11 5 Op 3 13/0.000 - CDS 9636 - 10619 1259 ## COG0016 Phenylalanyl-tRNA synthetase alpha subunit - Prom 10803 - 10862 5.5 - Term 11026 - 11057 2.1 12 5 Op 4 46/0.000 - CDS 11069 - 11425 578 ## PROTEIN SUPPORTED gi|15802128|ref|NP_288150.1| 50S ribosomal protein L20 13 5 Op 5 36/0.000 - CDS 11478 - 11690 358 ## PROTEIN SUPPORTED gi|110805473|ref|YP_688993.1| 50S ribosomal protein L35 - Term 11714 - 11755 9.5 14 5 Op 6 16/0.000 - CDS 11772 - 12251 665 ## PROTEIN SUPPORTED gi|167856598|ref|ZP_02479300.1| 50S ribosomal protein L35 15 5 Op 7 . - CDS 12318 - 14186 1826 ## COG0441 Threonyl-tRNA synthetase - Prom 14431 - 14490 7.1 + Prom 14738 - 14797 4.8 16 6 Op 1 . + CDS 14819 - 16666 414 ## COG0666 FOG: Ankyrin repeat 17 6 Op 2 . + CDS 16737 - 16949 143 ## SFV_1500 hypothetical protein + Term 16967 - 17009 -0.2 - Term 16957 - 16993 6.2 18 7 Tu 1 . - CDS 17002 - 17760 611 ## COG3137 Putative salt-induced outer membrane protein - Prom 17889 - 17948 11.7 + Prom 17904 - 17963 6.1 19 8 Tu 1 . 
+ CDS 18047 - 18976 930 ## COG1105 Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) 20 9 Tu 1 . + CDS 19077 - 19367 319 ## ECH74115_2442 hypothetical protein + Prom 19394 - 19453 3.4 21 10 Tu 1 . + CDS 19473 - 20333 829 ## COG3001 Fructosamine-3-kinase + Term 20351 - 20379 1.3 - Term 20339 - 20367 1.3 22 11 Tu 1 . - CDS 20374 - 20910 508 ## SDY_1819 hypothetical protein - Prom 21080 - 21139 3.3 + Prom 20942 - 21001 3.2 23 12 Tu 1 . + CDS 21057 - 21725 760 ## COG0637 Predicted phosphatase/phosphohexomutase + Term 21727 - 21770 12.2 + Prom 21805 - 21864 4.5 24 13 Op 1 1/1.000 + CDS 21888 - 22478 297 ## COG1988 Predicted membrane-bound metal-dependent hydrolases 25 13 Op 2 . + CDS 22611 - 24002 1741 ## COG1823 Predicted Na+/dicarboxylate symporter + Term 24014 - 24056 5.0 26 14 Tu 1 . - CDS 24006 - 24293 114 ## JW1719 hypothetical protein - Prom 24376 - 24435 5.7 + Prom 24321 - 24380 7.0 27 15 Tu 1 . + CDS 24402 - 24605 78 ## - Term 24918 - 24962 -0.8 28 16 Tu 1 . - CDS 25098 - 25361 205 ## SSON_1427 cell division modulator - Prom 25517 - 25576 4.1 + Prom 25431 - 25490 2.7 29 17 Tu 1 . + CDS 25544 - 27805 2153 ## COG0753 Catalase + Term 27809 - 27859 13.6 - Term 27798 - 27844 13.3 30 18 Op 1 5/0.000 - CDS 27852 - 28610 645 ## COG3394 Uncharacterized protein conserved in bacteria 31 18 Op 2 4/0.200 - CDS 28623 - 29975 1535 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases 32 19 Op 1 5/0.000 - CDS 30080 - 30919 723 ## COG2207 AraC-type DNA-binding domain-containing proteins 33 19 Op 2 13/0.000 - CDS 30930 - 31280 552 ## COG1447 Phosphotransferase system cellobiose-specific component IIA 34 19 Op 3 10/0.000 - CDS 31331 - 32689 1550 ## COG1455 Phosphotransferase system cellobiose-specific component IIC - Prom 32709 - 32768 4.4 - Term 32695 - 32728 1.4 35 19 Op 4 . 
- CDS 32774 - 33094 405 ## COG1440 Phosphotransferase system cellobiose-specific component IIB - Prom 33201 - 33260 4.6 - Term 33327 - 33362 4.5 36 20 Tu 1 . - CDS 33393 - 33731 453 ## ECIAI39_1315 DNA-binding transcriptional activator OsmE - Prom 33832 - 33891 1.7 37 21 Tu 1 . + CDS 33933 - 34760 880 ## COG0171 NAD synthase + Term 34799 - 34838 8.2 + Prom 34908 - 34967 3.6 38 22 Tu 1 . + CDS 34990 - 35877 537 ## COG0322 Nuclease subunit of the excinuclease complex 39 23 Op 1 . - CDS 35837 - 36103 162 ## COG3758 Uncharacterized protein conserved in bacteria 40 23 Op 2 . - CDS 36104 - 36412 202 ## COG3758 Uncharacterized protein conserved in bacteria - Prom 36481 - 36540 2.4 Predicted protein(s) >gi|296493302|gb|ADTK01000199.1| GENE 1 112 - 1158 891 348 aa, chain + ## HITS:1 COG:ECs2411 KEGG:ns NR:ns ## COG: ECs2411 COG0722 # Protein_GI_number: 15831665 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Escherichia coli O157:H7 # 1 348 1 348 348 717 99.0 0 MNRTDELRTARIESLVTPAELALRYPVTPGVATHVTDSRRRIEKILNGEDKRLLVIIGPC SIHDLTAAMEYATRLQSLRNQYQSRLEIVMRTYYEKPRTVVGWKGLISDPDLNGSYRVNH GLELARKLLLQVNELGVPTATEFLDMVTGQFIADLISWGAIGARTTESQIHREMASALSC PVGFKNGTDGNTRIAVDAIRAARASHMFLSPDKNGQMTIYQTSGNPYGHIIMRGGKKPNY HADDIAAACDTLHEFDLPEHLVVDFSHGNCQKQHRRQLEVCEDICQQIRNGSRAIAGIMA ESFLREGTQKIVGGQPLTYGQSITDPCLGWEDTERLVEKLASAVDTRF >gi|296493302|gb|ADTK01000199.1| GENE 2 1290 - 1481 187 63 aa, chain + ## HITS:1 COG:ECs2412 KEGG:ns NR:ns ## COG: ECs2412 COG4256 # Protein_GI_number: 15831666 # Func_class: P Inorganic ion transport and metabolism # Function: Hemin uptake protein # Organism: Escherichia coli O157:H7 # 1 63 1 63 63 107 100.0 6e-24 MRYTDSRKLTPETDANHKTASPQPIRRISSQTLLGPDGKLIIDHDGQEYLLRKTQAGKLL LTK >gi|296493302|gb|ADTK01000199.1| GENE 3 1485 - 2921 1147 478 aa, chain - ## HITS:1 COG:ydiU KEGG:ns NR:ns ## COG: ydiU COG0397 # Protein_GI_number: 16129662 # Func_class: S Function unknown # Function: 
Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 478 1 478 478 957 99.0 0 MTLSFVTRWRDELPETYTALSPTPLNNARLIWHNTELANTLSIPSSLFKNGAGVWGGEAL LPGMSPLAQVYSGHQFGVWAGQLGDGRGILLGEQLLADGTTMDWHLKGAGLTPYSRMGDG RAVLRSTIRESLASEAMHYLGIPTTRALSIVTSDSPVYRETAEPGAMLMRVAPSHLRFGH FEHFYYRRESEKVRQLADFAIRHYWSHLEDDEDKYRLWFSDVVARTASLIAQWQTVGFAH GVMNTDNMSLLGLTLDYGPFGFLDDYEPGFICNHSDHQGRYSFDNQPAVALWNLQRLAQT LSPFVAVDALNEALDSYQQVLLTHYGERMRQKLGFMTEQKEDNALLNELFSLMARERSDY TRTFRMLSLTEQHSAASPLRDEFIDRAAFDDWFARYRGRLQQDEVSDSERQQLMQSVNPA LVLRNWLAQRAIEAAEKGDMTELHRLHEALRNPFSDRADDYVSRPPDWGKRLEVSCSS >gi|296493302|gb|ADTK01000199.1| GENE 4 2984 - 3697 204 237 aa, chain - ## HITS:1 COG:ydiV KEGG:ns NR:ns ## COG: ydiV COG2200 # Protein_GI_number: 16129663 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 1 237 1 237 237 483 100.0 1e-137 MKIFLENLYHSDCYFLPIRDNQQVLVGVELITHFSSEDGTVRIPTSRVIAQLTEEQHWQL FSEQLELLKSCQHFFIQHKLFAWLNLTPQVATLLLERDNYAGELLKYPFIELLINENYPH LNEGKDNRGLLSLSQVYPLVLGNLGAGNSTMKAVFDGLFTRVMLDKSFIQQQITHRSFEP FIRAIQAQISPCCNCIIAGGIDTAEILAQITPFDFHALQGCLWPAVPINQITTLVQR >gi|296493302|gb|ADTK01000199.1| GENE 5 3944 - 4408 273 154 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 [Haemophilus parasuis 29755] # 40 154 65 174 175 109 45 2e-23 MRFCLILITALLLAGCSHHKAPPPNARLSDSITVIAGLNDQLQSWHGTPYRYGGMTRRGV DCSGFVVVTMRDRFDLQLPRETKQQASIGTQIDKDELLPGDLVFFKTGSGQNGLHVGIYD TNNQFIHASTSKGVMRSSLDNVYWQKNFWQARRI >gi|296493302|gb|ADTK01000199.1| GENE 6 4486 - 5223 631 245 aa, chain - ## HITS:1 COG:ECs2416 KEGG:ns NR:ns ## COG: ECs2416 COG4138 # Protein_GI_number: 15831670 # Func_class: H Coenzyme transport and metabolism # Function: ABC-type cobalamin transport system, ATPase component # Organism: Escherichia coli O157:H7 # 1 245 5 249 249 455 99.0 1e-128 MQLQDVAESTRLGPLSGEVRAGEILHLVGPNGAGKSTLLARMAGMTSGKGSIQFAGQPLE AWSATKLALHRAYLSQQQTPPFAMPVWHYLTLHQHDKTRTELLNDVAGALALDDKLGRST 
NQLSGGEWQRVRLAAVVLQITPQANPAGQLLLLDEPMNSLDVAQQSALDKILSALCQQGL AIVMSSHDLNHTLRHAHRAWLLKGGKMLASGRREEVLTPPNLAQAYGMNFRRLDIEGHRM LISTI >gi|296493302|gb|ADTK01000199.1| GENE 7 5235 - 5786 598 183 aa, chain - ## HITS:1 COG:ECs2417 KEGG:ns NR:ns ## COG: ECs2417 COG0386 # Protein_GI_number: 15831671 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione peroxidase # Organism: Escherichia coli O157:H7 # 1 183 1 183 183 374 98.0 1e-104 MQDSILTTVVKDIDGEVTTLEKYAGNVLLIVNVASKCGLTPQYEQLENIQKAWADRGFVV LGFPCNQFLEQEPGSDEEIKTYCTTTWGVTFPMFSKIEVNGEGRHPLYQKLIAAAPTAVA PEESGFYARMVSKGRAPLYPDDILWNFEKFLVGRDGKVIQRFSPDMTPEDPIVMESIKLA LAK >gi|296493302|gb|ADTK01000199.1| GENE 8 5849 - 6829 876 326 aa, chain - ## HITS:1 COG:btuC KEGG:ns NR:ns ## COG: btuC COG4139 # Protein_GI_number: 16129667 # Func_class: H Coenzyme transport and metabolism # Function: ABC-type cobalamin transport system, permease component # Organism: Escherichia coli K12 # 1 326 1 326 326 468 99.0 1e-132 MLTLARQQQRQNIRWLLCLSVLMLLALLLSLCAGEQWISPGDWFTPRGELFVWQIRLPRT LAVLLVGAALAISGAVMQALFENPLAEPGLLGVSNGACVGLIAAVLLGQGQLPNWALGLC AIAGALIITLILLRFARRHLSTSRLLLAGVALGIICSALMTWAIYFSTSVDLRQLMYWMM GGFGGVDWRQSWLMLALIPVLLWICCQSRPMNMLALGEISARQLGLPLWFWRNVLVAATG WMVGVSVALAGAIGFIGLVIPHILRLCGLTDHRVLLPGCALAGASALLLADIVARLALAA AELPIGVVTATLGAPVFIWLLLKARR >gi|296493302|gb|ADTK01000199.1| GENE 9 6930 - 7229 318 99 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148826992|ref|YP_001291745.1| 50S ribosomal protein L35 [Haemophilus influenzae PittGG] # 3 92 4 93 96 127 66 1e-28 MALTKAEMSEYLFDKLGLSKRDAKELVELFFEEIRRALENGEQVKLSGFGNFDLRDKNQR PGRNPKTGEDIPITARRVVTFRPGQKLKSRVENASPKDE >gi|296493302|gb|ADTK01000199.1| GENE 10 7234 - 9621 2696 795 aa, chain - ## HITS:1 COG:ECs2420_2 KEGG:ns NR:ns ## COG: ECs2420_2 COG0072 # Protein_GI_number: 15831674 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Escherichia coli O157:H7 # 140 795 1 656 656 1293 
99.0 0 MKFSELWLREWVNPAIDSDALANQITMAGLEVDGVEPVAGSFHGVVVGEVVECAQHPNAD KLRVTKVNVGGDRLLDIVCGAPNCRQGLRVAVATIGAVLPGDFKIKAAKLRGEPSEGMLC SFSELGISDDHSGIIELPADAPIGTDIREYLKLDDNTIEISVTPNRADCLGIIGVARDVA VLNQLPLVEPEIVPVGATIDDTLPITVEAPEACPRYLGRVVKGINVKAPTPLWMKEKLRR CGIRSIDAVVDVTNYVLLELGQPMHAFDKDRIEGGIVVRMAKEGETLVLLDGTEAKLNAD TLVIADHNKALAMGGIFGGEHSGVNDETQNVLLECAFFSPLSITGRARRHGLHTDASHRY ERGVDPALQHKAMERATRLLIDICGGEAGPVIDITSEATLPKRATITLRRSKLDRLIGHH IADEQVTDILRRLGCEVTEGKDEWQAVAPSWRFDMEIEEDLVEEVARVYGYNNIPDEPVQ ASLIMGTHREADLSLKRVKTLLNDKGYQEVITYSFVDPKVQQMIHPGVEALLLPSPISVE MSAMRLSLWTGLLATVVYNQNRQQNRVRIFESGLRFVPDTQAPLGIRQDLMLAGVICGNR YEEHWNLAKETVDFYDLKGDLESVLDLTGKLNEVEFRAEANPALHPGQSAAIYLKGERIG FVGVVHPELERKLDLNGRTLVFELEWNKLADRVVPQAREISRFPANRRDIAVVVAENVPA ADILSECKKVGVNQVVGVNLFDVYRGKGVAEGYKSLAISLILQDTSRTLEEEEIAATVAK CVEALKERFQASLRD >gi|296493302|gb|ADTK01000199.1| GENE 11 9636 - 10619 1259 327 aa, chain - ## HITS:1 COG:ECs2421 KEGG:ns NR:ns ## COG: ECs2421 COG0016 # Protein_GI_number: 15831675 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase alpha subunit # Organism: Escherichia coli O157:H7 # 1 327 1 327 327 668 100.0 0 MSHLAELVASAKAAISQASDVAALDNVRVEYLGKKGHLTLQMTTLRELPPEERPAAGAVI NEAKEQVQQALNARKAELESAALNARLAAETIDVSLPGRRIENGGLHPVTRTIDRIESFF GELGFTVATGPEIEDDYHNFDALNIPGHHPARADHDTFWFDATRLLRTQTSGVQIRTMKA QQPPIRIIAPGRVYRNDYDQTHTPMFHQMEGLIVDTNISFTNLKGTLHDFLRNFFEEDLQ IRFRPSYFPFTEPSAEVDVMGKNGKWLEVLGCGMVHPNVLRNVGIDPEVYSGFAFGMGME RLTMLRYGVTDLRSFFENDLRFLKQFK >gi|296493302|gb|ADTK01000199.1| GENE 12 11069 - 11425 578 118 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15802128|ref|NP_288150.1| 50S ribosomal protein L20 [Escherichia coli O157:H7 EDL933] # 1 118 1 118 118 227 100 9e-59 MARVKRGVIARARHKKILKQAKGYYGARSRVYRVAFQAVIKAGQYAYRDRRQRKRQFRQL WIARINAAARQNGISYSKFINGLKKASVEIDRKILADIAVFDKVAFTALVEKAKAALA >gi|296493302|gb|ADTK01000199.1| GENE 13 11478 - 11690 358 70 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|110805473|ref|YP_688993.1| 50S ribosomal protein 
L35 [Shigella flexneri 5 str. 8401] # 1 70 1 70 70 142 98 3e-33 MEVIKMPKIKTVRGAAKRFKKTGKGGFKHKHANLRHILTKKATKRKRHLRPKAMVSKGDL GLVIACLPYA >gi|296493302|gb|ADTK01000199.1| GENE 14 11772 - 12251 665 159 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167856598|ref|ZP_02479300.1| 50S ribosomal protein L35 [Haemophilus parasuis 29755] # 1 158 2 159 159 260 81 7e-69 QEVRLTGLEGEQLGIVSLREALEKAEEAGVDLVEISPNAEPPVCRIMDYGKFLYEKSKSS KEQKKKQKVIQVKEIKFRPGTDEGDYQVKLRSLIRFLEEGDKAKITLRFRGREMAHQQIG MEVLNRVKDDLQELAVVESFPTKIEGRQMIMVLAPKKKQ >gi|296493302|gb|ADTK01000199.1| GENE 15 12318 - 14186 1826 622 aa, chain - ## HITS:1 COG:thrS KEGG:ns NR:ns ## COG: thrS COG0441 # Protein_GI_number: 16129675 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Escherichia coli K12 # 1 622 21 642 642 1310 99.0 0 MDVALDIGPGLAKACIAGRVNGELVDACDLIENDAQLSIITAKDEEGLEIIRHSCAHLLG HAIKQLWPHTKMAIGPVIDNGFYYDVDLDRTLTQEDVEALEKRMHELAEKNYDVIKKKVS WHEARETFANRGESYKVSILDENIAHDDKPGLYFHEEYVDMCRGPHVPNMRFCHHFKLMK TAGAYWRGDSNNKMLQRIYGTAWADKKALNAYLQRLEEAAKRDHRKIGKQLDLYHMQEEA PGMVFWHNDGWTIFRELEVFVRSKLKEYQYQEVKGPFMMDRVLWEKTGHWDNYKDAMFTT SSENREYCIKPMNCPGHVQIFNQGLKSYRDLPLRMAEFGSCHRNEPSGSLHGLMRVRGFT QDDAHIFCTEEQIRDEVNGCIRLVYDMYSTFGFEKIVVKLSTRPEKRIGSDEMWDRAEAD LAVALEENNIPFEYQLGEGAFYGPKIEFTLYDCLDRAWQCGTVQLDFSLPSRLSASYVGE DNERKVPVMIHRAILGSMERFIGILTEEFAGFFPTWLAPVQVVIMNITDSQSDYVNELTQ KLSNAGIRVKADLRNEKIGFKIREHTLRRVPYMLVCGDKEVESGKVAVRTRRGKDLGSMD VNEVIEKLQQEIRSRSLKQLEE >gi|296493302|gb|ADTK01000199.1| GENE 16 14819 - 16666 414 615 aa, chain + ## HITS:1 COG:ECs2427 KEGG:ns NR:ns ## COG: ECs2427 COG0666 # Protein_GI_number: 15831681 # Func_class: R General function prediction only # Function: FOG: Ankyrin repeat # Organism: Escherichia coli O157:H7 # 1 615 18 632 632 1179 98.0 0 MHIDSDIPTPSSEPINQFARQLITLLDTSDLSSMLSYCVTQEFTANCRKISQNCYSTALF TINFATSPIHAENILITLHYKKEIISLLLETTPIKANHLRSILDYIEQEQLTAENRNHCM KLSKKIHREKTIQPTVNLNGSAFFLQSPSDAIFCRHLSLQYALDSLRNGKGKVNLIKHYS 
SVESIQQHVPLVRDAEFRALLRHPPAGSRVIASKDFGFALDIFFGRMMANNVSHMSAILY IDNHTLSVRLRIKQSAYGQLNYVVSVYDPNDTNVAVRGTHRTARGFLSLDKFISSGPDAQ TWADRYVRNCAIAILPLLPEGVPGAIFTGIATRMPFAPIHPSAMLLIMATGQTQQLITLF RQLHILPEKEIIEIITAQNSVGTPALFLAMMNGHTDNVKIFIQEIQSLVDIHIIHEDNLV KLLQTKSANETPGLYISMLYGFDEIIDIFLNALTTPIAQELLNKKLVMSILAMKTHDGEP GLYAAMENNHPLCVTRFLSKINGIAFKYKLSKANIMDLLKGATAQGTPALYIAMSKGNED VVLSYISTLGAFAKKHSFSQHQLFTLLAAKNHDNMSAVHIAIHHNHYKTVETYYAAINAI SQSLSFSADELKTYL >gi|296493302|gb|ADTK01000199.1| GENE 17 16737 - 16949 143 70 aa, chain + ## HITS:1 COG:no KEGG:SFV_1500 NR:ns ## KEGG: SFV_1500 # Name: not_defined # Def: hypothetical protein # Organism: S.flexneri_8401 # Pathway: not_defined # 1 70 3 72 72 132 98.0 5e-30 MCTTPPDSCNNKKSLDACPFCYTPLSRTRDMQDTGMPTKRFDKKHWKMVVVLLAICGAML LLRWAAMIWG >gi|296493302|gb|ADTK01000199.1| GENE 18 17002 - 17760 611 252 aa, chain - ## HITS:1 COG:ECs2428 KEGG:ns NR:ns ## COG: ECs2428 COG3137 # Protein_GI_number: 15831682 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative salt-induced outer membrane protein # Organism: Escherichia coli O157:H7 # 1 252 1 252 252 461 100.0 1e-130 MKLLKTVPAIVMLAGGMFASLNAAADDSVFTVMDDPASAKKPFEGNLNAGYLAQSGNTKS SSLTADTTMTWYGQTTAWSLWGNASNTSSNDERSSEKYAAGGRSRFNLTDYDYLFGQASW LTDRYNGYRERDVLTAGYGRQFLNGPVHSFRFEFGPGVRYDKYTDNASETQPLGYASGAY AWQLTDNAKFTQGVSVFGAEDTTLNSESALNVAINEHFGLKVAYNVTWNSEPPESAPEHT DRRTTLSLGYSM >gi|296493302|gb|ADTK01000199.1| GENE 19 18047 - 18976 930 309 aa, chain + ## HITS:1 COG:pfkB KEGG:ns NR:ns ## COG: pfkB COG1105 # Protein_GI_number: 16129677 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) # Organism: Escherichia coli K12 # 1 309 1 309 309 570 99.0 1e-162 MVRIYTLTLAPSLDSATITPQIYPEGKLRCTAPVFEPGGGGINVARAIAHLGGTATAIFP AGGATGEHLVSLLADENVPVATVEAKDWTRQNLHVHVEASGEQYRFVMPGAALNEDEFRQ LEEQVLEIESGAILVISGSLPPGVKLEKLTQLISAAQKQGIRCIVDSSGEALSAALAIGN IELVKPNQKELSALVNRELTQPDDVRKAAQEIVNSGKAKRVVVSLGPQGALGVDSENCIQ 
VVPPPVKSQSTVGAGDSMVGAMTLKLAENASLEEMVRFGVAAGSAATLNQGTRLCSHDDT QKIYAYLSR >gi|296493302|gb|ADTK01000199.1| GENE 20 19077 - 19367 319 96 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_2442 NR:ns ## KEGG: ECH74115_2442 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 96 1 96 96 158 100.0 4e-38 MASGDLVRYVITVMLHEDTLTEINELNNYLTRDGFLLTMTDDEGNIHELGTNTFGLISTQ SEEEIRELVSGLTQSATGKDPEITITTWEEWNSNRK >gi|296493302|gb|ADTK01000199.1| GENE 21 19473 - 20333 829 286 aa, chain + ## HITS:1 COG:yniA KEGG:ns NR:ns ## COG: yniA COG3001 # Protein_GI_number: 16129679 # Func_class: G Carbohydrate transport and metabolism # Function: Fructosamine-3-kinase # Organism: Escherichia coli K12 # 1 286 1 286 286 575 100.0 1e-164 MWQAISRLLSEQLGEGEIELRNELPGGEVHAAWHLRYAGHDFFVKCDERELLPGFTAEAD QLELLSRSKTVTVPKVWAVGADRDYSFLVMDYLPPRPLDAHSAFILGQQIARLHQWSDQP QFGLDFDNALSTTPQPNTWQRRWSTFFAEQRIGWQLELAAEKGIAFGNIDAIVEHIQQRL ASHQPQPSLLHGDLWSGNCALGPDGPYIFDPACYWGDRECDLAMLPLHTEQPPQIYDGYQ SVSPLPADFLERQPVYQLYTLLNRARLFGGQHLVIAQQSLDRLLAA >gi|296493302|gb|ADTK01000199.1| GENE 22 20374 - 20910 508 178 aa, chain - ## HITS:1 COG:no KEGG:SDY_1819 NR:ns ## KEGG: SDY_1819 # Name: not_defined # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 178 1 178 178 337 100.0 8e-92 MTYQQAGRIAVLKRILGWVIFIPALISTLISLLKFMNTRQENQEGINAVMLDFTHVMIDM MQANTPFLNLFWYNSPTPNFNGGVNVMFWVIFILIFVGLALQDSGARMSRQARFLREGVE DQLILEKAKGEEGLTREQIESRIVVPHHTIFLQFFSLYILPVICIAAGYVFFSLLGFI >gi|296493302|gb|ADTK01000199.1| GENE 23 21057 - 21725 760 222 aa, chain + ## HITS:1 COG:ECs2433 KEGG:ns NR:ns ## COG: ECs2433 COG0637 # Protein_GI_number: 15831687 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Escherichia coli O157:H7 # 1 222 1 222 222 410 100.0 1e-115 MSTPRQILAAIFDMDGLLIDSEPLWDRAELDVMASLGVDISRRNELPDTLGLRIDMVVDL WYARQPWNGPSRQEVVERVIARAISLVEETRPLLPGVREAVALCKEQGLLVGLASASPLH 
MLEKVLTMFDLRDSFDALASAEKLPYSKPHPQVYLDCAAKLGVDPLTCVALEDSVNGMIA SKAARMRSIVVPAPEAQNDPRFVLANVKLSSLTELTAKDLLG >gi|296493302|gb|ADTK01000199.1| GENE 24 21888 - 22478 297 196 aa, chain + ## HITS:1 COG:ECs2434 KEGG:ns NR:ns ## COG: ECs2434 COG1988 # Protein_GI_number: 15831688 # Func_class: R General function prediction only # Function: Predicted membrane-bound metal-dependent hydrolases # Organism: Escherichia coli O157:H7 # 1 196 5 200 200 379 100.0 1e-105 MTAEGHLLFSIACAVFAKNAELTPVLAQGDWWHIVPSAILTCLLPDIDHPKSFLGQRLKW ISKPIARAFGHRGFTHSLLAVFALLATFYLKVPESWFIPADALQGMVLGYLSHILADMLT PAGVPLLWPCRWRFRLPILVPQKGNQLERFICMALFVWSVWMPHSLPENSAVRWSSQMIN TLQIQFHRLIKHQVEY >gi|296493302|gb|ADTK01000199.1| GENE 25 22611 - 24002 1741 463 aa, chain + ## HITS:1 COG:ydjN KEGG:ns NR:ns ## COG: ydjN COG1823 # Protein_GI_number: 16129683 # Func_class: R General function prediction only # Function: Predicted Na+/dicarboxylate symporter # Organism: Escherichia coli K12 # 1 462 1 462 463 750 99.0 0 MNFPLIANIVVFVVLLFALAQTRHKQWSLAKKVLVGLVMGVVFGLALHTIYGSDSQVLKD SVQWFNIVGNGYVQLLQMIVMPLVFASILSAVARLHNASQLGKISFLTIGTLLFTTLIAA LVGVLVTNLFGLTAEGLVQGGAETARLNAIESNYVGKVSDLSVPQLVLSFIPKNPFADLT GANPTSIISVVIFAAFLGVAALKLLKDDAPKGERVLTAIDTLQSWVMKLVRLVMQLTPYG VLALMTKVVAGSNLQDIIKLGSFVVASYLGLLIMFAVHGILLGINGVSPLKYFRKVWPVL TFAFTSRSSAASIPLNVEAQTRRLGVPESIASFAASFGATIGQNGCAGLYPAMLAVMVAP TVGINPLDPMWIATLVGIVTVSSAGVAGVGGGATFAALIVLPAMGLPVTLVALLISVEPL IDMGRTALNVSGSMTAGTLTSQWLKQTDKAILDSEDDAELAHR >gi|296493302|gb|ADTK01000199.1| GENE 26 24006 - 24293 114 95 aa, chain - ## HITS:1 COG:no KEGG:JW1719 NR:ns ## KEGG: JW1719 # Name: ydjO # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 95 177 271 271 192 100.0 4e-48 MLNINLLFNPLNLPGMGSGVLEDIMSIPDSSLRKRLGYEVLSFSLQAHSLSQECIDKLDI FFADDLFKYESVCIAAMEHLKSKATAPIQNGPLPA >gi|296493302|gb|ADTK01000199.1| GENE 27 24402 - 24605 78 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNNQPVFYIDQIRHFFGSEFTGKIHPNKILISTSYPKLTAIKIINYHSIFNIVLQIRIKT KCWRSWL >gi|296493302|gb|ADTK01000199.1| GENE 
28 25098 - 25361 205 87 aa, chain - ## HITS:1 COG:no KEGG:SSON_1427 NR:ns ## KEGG: SSON_1427 # Name: not_defined # Def: cell division modulator # Organism: S.sonnei # Pathway: not_defined # 1 87 1 87 87 155 100.0 6e-37 MRLVKPVMKKPLRQQNRQIISYVPRTEPAPPEHAIKMDSFRDVWMLRGKYVAFVLMGESF LRSPAFTVPESAQRWANQIRQEGEVTE >gi|296493302|gb|ADTK01000199.1| GENE 29 25544 - 27805 2153 753 aa, chain + ## HITS:1 COG:katE KEGG:ns NR:ns ## COG: katE COG0753 # Protein_GI_number: 16129686 # Func_class: P Inorganic ion transport and metabolism # Function: Catalase # Organism: Escherichia coli K12 # 1 753 1 753 753 1517 99.0 0 MSQHNEKNPHQHQSPLHDSSEAKPGMDSLAPEDGSHRPAAEPTPPGAQPTAPGSLKAPDT RNEKLNSLEDVRKGSENYALTTNQGVRIADDQNSLRAGSRGPTLLEDFILREKITHFDHE RIPERIVHARGSAAHGYFQPYKSLSDITKADFLSDPNKITPVFVRFSTVQGGAGSADTVR DIRGFATKFYTEEGIFDLVGNNTPIFFIQDAHKFPDFVHAVKPEPHWAIPQGQSAHDTFW DYVSLQPETLHNVMWAMSDRGIPRSYRTMEGFGIHTFRLINAEGKATFVRFHWKPLAGKA SLVWDEAQKLTGRDPDFHRRELWEAIEAGDFPEYELGFQLIPEEDEFKFDFDLLDPTKLI PEELVPVQRVGKMVLNRNPDNFFAENEQAAFHPGHIVPGLDFTNDPLLQGRLFSYTDTQI SRLGGPNFHEIPINRPTCPYHNFQRDGMHRMGIDTNPANYEPNSINDNWPRETPPGPKRG GFESYQERVEGNKVRERSPSFGEYYSHPRLFWLSQTPFEQRHIVDGFSFELSKVVRPYIR ERVVDQLAHIDLTLAQAVAKNLGIELTDDQLNITPPPDVNGLKKDPSLSLYAIPDGDVKG RVVAILLNDEVRSADLLAILKALKAKGVHAKLLYSRMGEVTADDGTVLPIAATFAGAPSL TVDAVIVPCGNIADIADNGDANYYLMEAYKHLKPIALAGDARKFKATIKIADQGEEGIVE ADSADGSFMDELLTLMAAHRVWSRIPKIDKIPA >gi|296493302|gb|ADTK01000199.1| GENE 30 27852 - 28610 645 252 aa, chain - ## HITS:1 COG:ECs2439 KEGG:ns NR:ns ## COG: ECs2439 COG3394 # Protein_GI_number: 15831693 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 252 1 252 252 514 99.0 1e-146 MERLLIVNADDFGLSKGQNYGIIEACRNGIVTSTTALVNGQAIDHAVQLSRDEPSLAIGM HFVLTMGKPLTAMPGLTRDGVLGKWIWQLAEEDALPLEEITQELASQYLRFIELFGRKPT HLDSHHHVHMFPQIFPIVARFAAEEGIALRIDRQPLSNAGDLPANLRSSHGFSSAFYGEE ISEALFLQVLDDSSHRGERSLEVMCHPAFVDNTIRQSAYCFPRLTELEVLTSASLKYAIA ERGYRLGSYLDV >gi|296493302|gb|ADTK01000199.1| GENE 31 
28623 - 29975 1535 450 aa, chain - ## HITS:1 COG:ECs2440 KEGG:ns NR:ns ## COG: ECs2440 COG1486 # Protein_GI_number: 15831694 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Escherichia coli O157:H7 # 1 450 1 450 450 933 98.0 0 MSQKLKVVTIGGGSSYTPELLEGFIKRYHELPVSELWLVDVEDGKEKLDIIFELCQRMID NAGVPMKLYKTLDRREALKDADFVTTQLRVGQLPARELDERIPLSHGYLGQETNGAGGLF KGLRTIPVIFDIVKDVEELCPNAWVINFTNPAGMVTEAVYRHTGFKRFIGVCNIPIGMKM FIRDVLMLKDSDDLSIDLFGLNHMVFIKDVLVNGKSRFAELLDGVASGQLKASSVKNIFD LPFSEGLIRSLNLLPCSYLLYYFKQKEMLAIEMGEYYKGGARAQVVQKVEKQLFELYKNP ELKVKPKELEQRGGAYYSDAACEVINAIYNDKQAEHYVNIPHHGHIDNIPADWAVEMTCK LGRDGATPHPRITHFDDKVMGLIHTIKGFEIAASNAALSGEFNDVLLALNLSPLVHSDRD AELLAREMILAHEKWLPNFADCIAELKKAH >gi|296493302|gb|ADTK01000199.1| GENE 32 30080 - 30919 723 279 aa, chain - ## HITS:1 COG:ECs2441 KEGG:ns NR:ns ## COG: ECs2441 COG2207 # Protein_GI_number: 15831695 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 279 2 280 280 518 100.0 1e-147 MQPVINAPEIATAREQQLFNGKNFHVFIYNKTESISGLHQHDYYEFTLVLTGRYFQEING KRVLLERGDFVFIPLGSHHQSFYEFGATRILNVGISKRFFEQHYLPLLPYCFVASQVYRT NNAFLTYVETVISSLNFRETGLEEFVEMVTFYVINRLRHYREEQVIDDVPQWLKSTVEKM HDKEQFSESALENMVTLSAKSQEYLTRATQRYYGKTPMQIINEIRINFAKKQLEMTNYSV TDIAFEAGYSSPSLFIKTFKKLTSFTPKSYRKKLTEFNQ >gi|296493302|gb|ADTK01000199.1| GENE 33 30930 - 31280 552 116 aa, chain - ## HITS:1 COG:ECs2442 KEGG:ns NR:ns ## COG: ECs2442 COG1447 # Protein_GI_number: 15831696 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIA # Organism: Escherichia coli O157:H7 # 1 116 1 116 116 173 99.0 7e-44 MMDLDNIPDTQTEAEELEEVVMGLIINSGQARSLAYAALKQAKQGDFAAAKAMMDQSRMA LNEAHLVQTKLIEGDAGEGKIKVSLVLVHAQDHLMTSMLARELITELIELHEKLKA >gi|296493302|gb|ADTK01000199.1| GENE 34 31331 - 32689 1550 452 aa, chain - ## HITS:1 COG:celB KEGG:ns NR:ns ## 
COG: celB COG1455 # Protein_GI_number: 16129691 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Escherichia coli K12 # 1 452 1 452 452 792 99.0 0 MSNVIASLEKVLLPFAVKIGKQPHVNAIKNGFIRLMPLTLAGAMFVLINNVFLSFGEGSF FYSLGIRLDASTIETLNGLKGIGGNVYNGTLGIMSLMAPFFIGMALAEERKVDALAAGLL SVAAFMTVTPYSVGEAYAVGANWLGGANIISGIIIGLVVAEMFTFIVRRNWVIKLPDSVP ASVSRSFSALIPGFIILSVMGIIAWALNTWGTNFHQIIMDTISTPLASLGSVVGWAYVIF VPLLWFFGIHGALALTALDNGIMTPWALENIATYQQYGSVEAALAAGKTFHIWAKPMLDS FIFLGGSGATLGLILAIFIASRRADYRQVAKLALPSGIFQINEPILFGLPIIMNPVMFIP FVLVQPILAAITLAAYYMGVIPPVTNIAPWTMPTGLGAFFNTNGSVAALLVALFNLGIAT LIYLPFVVVANKAQNAIDKEESEEDIANALKF >gi|296493302|gb|ADTK01000199.1| GENE 35 32774 - 33094 405 106 aa, chain - ## HITS:1 COG:ECs2444 KEGG:ns NR:ns ## COG: ECs2444 COG1440 # Protein_GI_number: 15831698 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Escherichia coli O157:H7 # 1 93 1 93 106 174 100.0 3e-44 MEKKHIYLFCSAGMSTSLLVSKMRAQAEKYEVPVIIEAFPETLAGEKGQNADVVLLGPQI AYMLPEIQRLLPNKPVEVIDSLLYGKVDGLGVLKAAVAAIKKAAAN >gi|296493302|gb|ADTK01000199.1| GENE 36 33393 - 33731 453 112 aa, chain - ## HITS:1 COG:no KEGG:ECIAI39_1315 NR:ns ## KEGG: ECIAI39_1315 # Name: osmE # Def: DNA-binding transcriptional activator OsmE # Organism: E.coli_IAI39 # Pathway: not_defined # 1 112 1 112 112 205 99.0 4e-52 MNKNMAGILSAAAVLTMLAGCTAYDRTKDQFVQPVVKDVKKGMSRAQVAQIAGKPSSEVS MIHARGTCQAYILGQRDGKAETYFVALDDTGHVINSGYQTCAEYDTDPQAAK >gi|296493302|gb|ADTK01000199.1| GENE 37 33933 - 34760 880 275 aa, chain + ## HITS:1 COG:ECs2446 KEGG:ns NR:ns ## COG: ECs2446 COG0171 # Protein_GI_number: 15831700 # Func_class: H Coenzyme transport and metabolism # Function: NAD synthase # Organism: Escherichia coli O157:H7 # 1 275 1 275 275 546 98.0 1e-155 MTLQQQIIKALGAKPQINAEEEIRRSVDFLKSYLRTYPFIKSLVLGISGGQDSTLAGKLC QMAINELRQETGNESLQFIAVRLPYGVQADEQDCQDAIAFIQPDRVLTVNIKGAVLASEQ 
ALREAGIELSDFVRGNEKARERMKAQYSIAGMTSGVVVGTDHAAEAITGFFTKYGDGGTD INPLYRLNKRQGKQLLTALGCPEHLYKKAPTADLEDDRPSLPDEVALGVTYDNIDDYLEG KNVPEQVARTIENWYLKTEHKRRPPITVFDDFWKK >gi|296493302|gb|ADTK01000199.1| GENE 38 34990 - 35877 537 295 aa, chain + ## HITS:1 COG:ECs2447 KEGG:ns NR:ns ## COG: ECs2447 COG0322 # Protein_GI_number: 15831701 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: Escherichia coli O157:H7 # 1 295 1 295 295 592 99.0 1e-169 MVRRLTSPRLEFEAAAIYEYPEHLRSFLNDLPTRPGVYLFHGESDTMPLYIGKSINIRSR VLSHLRTPDEAAMLRQSRRISWICTAGEIGALLLEARLIKEQQPLFNKRLRRNRQLCALQ LNEKRVDVVYAKEVDFSRAPNLFGLFANRRAALQALQSIADEQKLCYGLLGLEPLSRGRA CFRSALKRCAGACCGKESHEEHALRLRQSLERLRVVCWPWQGAVALKEQHPEMTQYHIIQ NWLWLGAVNSLEEATTLIRTPAGFDHDGYKILCKPLLSGNYEITELDPANDQRAS >gi|296493302|gb|ADTK01000199.1| GENE 39 35837 - 36103 162 88 aa, chain - ## HITS:1 COG:ECs2448 KEGG:ns NR:ns ## COG: ECs2448 COG3758 # Protein_GI_number: 15831702 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 88 125 212 212 190 100.0 5e-49 MDFNIMTRLDVCKAKVRIAERTFTTFGSRGGVVFVINGAWQLGDKLLTTDQGACWFDGRH TLRLLQPQGKLLFSEINWLAGHSPDQVQ >gi|296493302|gb|ADTK01000199.1| GENE 40 36104 - 36412 202 102 aa, chain - ## HITS:1 COG:ydjR KEGG:ns NR:ns ## COG: ydjR COG3758 # Protein_GI_number: 16129696 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 102 22 123 212 212 99.0 2e-55 MEYFDMRKMSVNLWRNAAGETREICTFPPAKRDFYWRASIASIAANGEFSLFPGMERIVT LLEGGEMLLESADRFNHTLKPLQPFAFAADQVVKAKLTAGQM Prediction of potential genes in microbial genomes Time: Mon May 16 15:39:29 2011 Seq name: gi|296493301|gb|ADTK01000200.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont544.8, whole genome shotgun sequence Length of sequence - 39688 bp Number of predicted genes - 39, with homology - 39 Number of transcription units - 18, operones - 8 average op.length - 
3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 75 - 560 646 ## COG3678 P pilus assembly/Cpx signaling pathway, periplasmic inhibitor/zinc-resistance associated protein - Prom 671 - 730 5.5 - Term 850 - 883 5.4 2 2 Op 1 5/0.200 - CDS 890 - 1858 769 ## COG2988 Succinylglutamate desuccinylase 3 2 Op 2 7/0.000 - CDS 1851 - 3194 1179 ## COG3724 Succinylarginine dihydrolase 4 2 Op 3 8/0.000 - CDS 3191 - 4669 1223 ## COG1012 NAD-dependent aldehyde dehydrogenases 5 2 Op 4 7/0.000 - CDS 4666 - 5700 1061 ## COG3138 Arginine/ornithine N-succinyltransferase beta subunit 6 2 Op 5 . - CDS 5697 - 6917 1396 ## COG4992 Ornithine/acetylornithine aminotransferase - Prom 7117 - 7176 9.7 + Prom 7106 - 7165 5.3 7 3 Tu 1 . + CDS 7363 - 8169 805 ## COG0708 Exonuclease III + Prom 8257 - 8316 3.5 8 4 Op 1 . + CDS 8336 - 9046 559 ## COG0398 Uncharacterized conserved protein 9 4 Op 2 . + CDS 9051 - 9728 854 ## EcSMS35_1439 hypothetical protein 10 4 Op 3 3/0.600 + CDS 9746 - 10450 592 ## COG0398 Uncharacterized conserved protein 11 4 Op 4 2/0.800 + CDS 10450 - 10998 478 ## COG2128 Uncharacterized conserved protein 12 4 Op 5 3/0.600 + CDS 11008 - 12174 991 ## COG4134 ABC-type uncharacterized transport system, periplasmic component 13 4 Op 6 3/0.600 + CDS 12192 - 13682 1376 ## COG4135 ABC-type uncharacterized transport system, permease component 14 4 Op 7 1/1.000 + CDS 13682 - 14335 200 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 15 4 Op 8 . + CDS 14402 - 15709 1305 ## COG2897 Rhodanese-related sulfurtransferase - Term 15655 - 15681 -1.0 16 5 Tu 1 . - CDS 15718 - 16338 404 ## COG0558 Phosphatidylglycerophosphate synthase - Prom 16380 - 16439 4.6 + Prom 16339 - 16398 4.7 17 6 Tu 1 . + CDS 16425 - 16832 375 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes - Term 16544 - 16572 0.6 18 7 Tu 1 . - CDS 16798 - 17070 95 ## SDY_1515 hypothetical protein - Prom 17257 - 17316 2.7 + Prom 17182 - 17241 4.1 19 8 Tu 1 . 
+ CDS 17306 - 18649 1467 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase - Term 18487 - 18516 -0.2 20 9 Tu 1 . - CDS 18766 - 19806 231 ## EcE24377A_1985 hypothetical protein - Prom 19831 - 19890 6.3 - Term 19847 - 19897 9.8 21 10 Op 1 5/0.200 - CDS 19934 - 21895 1817 ## COG0550 Topoisomerase IA 22 10 Op 2 6/0.200 - CDS 21900 - 22943 1202 ## COG0709 Selenophosphate synthase - Prom 22965 - 23024 2.2 23 11 Tu 1 . - CDS 23060 - 23611 649 ## COG0778 Nitroreductase - Prom 23637 - 23696 4.9 + Prom 23436 - 23495 3.0 24 12 Op 1 . + CDS 23665 - 23829 56 ## ECUMN_2054 hypothetical protein 25 12 Op 2 4/0.600 + CDS 23772 - 25628 1751 ## COG0616 Periplasmic serine proteases (ClpP class) + Term 25641 - 25677 7.2 + Prom 25711 - 25770 4.1 26 13 Op 1 2/0.800 + CDS 25795 - 26811 1026 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D 27 13 Op 2 . + CDS 26822 - 27463 563 ## COG1335 Amidases related to nicotinamidase 28 14 Tu 1 . - CDS 27556 - 28914 674 ## COG0477 Permeases of the major facilitator superfamily - Prom 28970 - 29029 2.0 - Term 28935 - 28969 4.0 29 15 Tu 1 . - CDS 29031 - 29789 647 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 29821 - 29880 5.0 30 16 Op 1 2/0.800 - CDS 29926 - 30906 868 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 31 16 Op 2 4/0.600 - CDS 30916 - 31863 704 ## COG0524 Sugar kinases, ribokinase family 32 16 Op 3 3/0.600 - CDS 31868 - 32704 735 ## COG0191 Fructose/tagatose bisphosphate aldolase 33 16 Op 4 7/0.000 - CDS 32725 - 33768 798 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 34 16 Op 5 7/0.000 - CDS 33785 - 35164 993 ## COG0477 Permeases of the major facilitator superfamily 35 16 Op 6 1/1.000 - CDS 35191 - 36267 1075 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases - Prom 36351 - 36410 3.3 36 17 Op 1 3/0.600 - CDS 36637 - 36909 336 ## COG3139 Uncharacterized protein conserved in bacteria 37 17 Op 2 . 
- CDS 36951 - 37364 238 ## COG0229 Conserved domain frequently associated with peptide methionine sulfoxide reductase + Prom 37472 - 37531 6.2 38 18 Op 1 4/0.600 + CDS 37706 - 38701 1051 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase + Term 38728 - 38757 2.1 + Prom 38705 - 38764 3.5 39 18 Op 2 . + CDS 38785 - 39669 965 ## COG0676 Uncharacterized enzymes related to aldose 1-epimerase Predicted protein(s) >gi|296493301|gb|ADTK01000200.1| GENE 1 75 - 560 646 161 aa, chain - ## HITS:1 COG:spy KEGG:ns NR:ns ## COG: spy COG3678 # Protein_GI_number: 16129697 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; N Cell motility; T Signal transduction mechanisms; P Inorganic ion transport and metabolism # Function: P pilus assembly/Cpx signaling pathway, periplasmic inhibitor/zinc-resistance associated protein # Organism: Escherichia coli K12 # 1 161 1 161 161 226 98.0 1e-59 MRKFTALFVASTLALGAANLAHAADTTTAAPADAKPMMHHKGKFGPHQDMMFKDLNLTDA QKQQIREIMKGQRDQMKRPPLEERRAMHDIIASDTFDKAKAEAQIAKMEEQRKANMLAHM ETQNKIYNILTPEQKKQFNANFEKRLTERPAAKGKMPATAE >gi|296493301|gb|ADTK01000200.1| GENE 2 890 - 1858 769 322 aa, chain - ## HITS:1 COG:ydjS KEGG:ns NR:ns ## COG: ydjS COG2988 # Protein_GI_number: 16129698 # Func_class: E Amino acid transport and metabolism # Function: Succinylglutamate desuccinylase # Organism: Escherichia coli K12 # 1 322 1 322 322 660 99.0 0 MDNFLALTLTGKKPVITEREINGVRWRWLGDGVLELTPLTPPQGALVISAGIHGNETAPV EMLDALLGAISHGEIPLRWRLLVILGNPPALKQGKRYCHSDMNRMFGGRWQLFAESGETC RARELEQCLEDFYDLGKESVRWHLDLHTAIRGSLHPQFGVLPQRDIPWDEKFLTWLGAAG LEALVFHQEPGGTFTHFSARHFGALACTLELGKALPFGQNDLRQFAVTASAIAALLSGES VGIVRTPPLRYRVVSQITRHSPSFEMHMASDTLNFMPFEKGTLLAQDGEERFTVTHDVEY VLFPNPLVALGLRAGLMLEKIS >gi|296493301|gb|ADTK01000200.1| GENE 3 1851 - 3194 1179 447 aa, chain - ## HITS:1 COG:astB KEGG:ns NR:ns ## COG: astB COG3724 # Protein_GI_number: 16129699 # Func_class: E Amino acid transport and metabolism # Function: Succinylarginine dihydrolase 
# Organism: Escherichia coli K12 # 1 447 1 447 447 881 99.0 0 MNAWEVNFDGLVGLTHHYAGLSFGNEASTRHRFQVSNPRQAAKQGLLKMKALADAGFPQA VIPPHERPFIPVLRQLGFSGSDEQVLEKVARQAPHWLSSVSSASPMWVANAATIAPSADT LDGKVHLTVANLNNKFHRSLEAHVTESLLKAIFNDEEKFSVHSALPQVALLGDEGAANHN RLGGHYGEPGMQLFVYGREEGNDTRPSRYPARQTREASEAVARLNQVNPQQVIFAQQNPD VIDQGVFHNDVIAVSNRQVLFCHQQAFARQSQLLANLRARVNGFMAIEVPATQVSVSDAV STYLFNSQLLSRDDGSMMLVLPQECREHAGVWGYLNELLAADNPISELKVFDLRESMANG GGPACLRLRVVLTEEERRAVNPAVMMNDTLFNALNDWVDRYYRDRLTAADLADPQLLREG REALDVLSQLLNLGSVYPFQREGGGNG >gi|296493301|gb|ADTK01000200.1| GENE 4 3191 - 4669 1223 492 aa, chain - ## HITS:1 COG:ECs2452 KEGG:ns NR:ns ## COG: ECs2452 COG1012 # Protein_GI_number: 15831706 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Escherichia coli O157:H7 # 1 492 1 492 492 942 97.0 0 MTLWINGDWITGQGASRVKRNPVSGEVLWQGNDADAAQVEQACRAARAAFPRWARLSLAE RQVVVERFAGLLERNKGELTAIIARETGKPRWEAATEVTAMINKIAISIKAYHARTGEQR SEMPDGAASLRHRPHGVLAVFGPYNFPGHLPNGHIVPALLAGNTIIFKPSELTPWSGEAV MRLWQQAGLPPGVLNLVQGGRETGQALSALEDLDGLLFTGSANTGYQLHRQLSGQPEKIL ALEMGGNNPLIIDEVADIDAAVHLTIQSAFVTAGQRCTCARRLLLKSGAQGDAFLARLVA VSQRLTPGNWDDEPQPFIGGLISEQAAQQVVTAWQQLEAMGGRTLLAPRLLQSETSLLTP GIIEMTGVAGVPDEEVFGPLLRVWRYDSFEEAILMANNTRFGLSCGLVSPEREKFDQLLL EARAGIVNWNKPLTGAASTAPFGGIGASGNHRPSAWYAADYCAWPMASLESDSLTLPATL NPGLDFSDEVVR >gi|296493301|gb|ADTK01000200.1| GENE 5 4666 - 5700 1061 344 aa, chain - ## HITS:1 COG:ECs2453 KEGG:ns NR:ns ## COG: ECs2453 COG3138 # Protein_GI_number: 15831707 # Func_class: E Amino acid transport and metabolism # Function: Arginine/ornithine N-succinyltransferase beta subunit # Organism: Escherichia coli O157:H7 # 1 344 1 344 344 726 99.0 0 MMVIRPVERSDVSALMQLASKTGGGLTSLPANEATLSARIERAIKTWQGELPKSEQGYVF VLEDSETGTVAGICAIEVAVGLNDPWYNYRVGTLVHASKELNVYNALPTLFLSNDHTGSS ELCTLFLDPDWRKEGNGYLLSKSRFMFMAAFRDKFNDKVVAEMRGVIDEHGYSPFWQSLG KRFFSMDFSRADFLCGTGQKAFIAELMPKHPIYTHFLSQEAQDVIGQVHPQTAPARAVLE KEGFRYRNYIDIFDGGPTLECDIDRVRAIRKSRLVEVAEGQPAQGDFPACLVANENYHHF 
RVVLARTDPATERLILTAAQLDALKCHAGDRVRLVRLCAEEKTA >gi|296493301|gb|ADTK01000200.1| GENE 6 5697 - 6917 1396 406 aa, chain - ## HITS:1 COG:cstC KEGG:ns NR:ns ## COG: cstC COG4992 # Protein_GI_number: 16129702 # Func_class: E Amino acid transport and metabolism # Function: Ornithine/acetylornithine aminotransferase # Organism: Escherichia coli K12 # 1 406 1 406 406 824 100.0 0 MSQPITRENFDEWMIPVYAPAPFIPVRGEGSRLWDQQGKEYIDFAGGIAVNALGHAHPEL REALNEQASKFWHTGNGYTNEPVLRLAKKLIDATFADRVFFCNSGAEANEAALKLARKFA HDRYGSHKSGIVAFKNAFHGRTLFTVSAGGQPAYSQDFAPLPADIRHAAYNDINSASALI DDSTCAVIVEPIQGEGGVVPASNAFLQGLRELCNRHNALLIFDEVQTGVGRTGELYAYMH YGVTPDLLTTAKALGGGFPVGALLATEECARVMTVGTHGTTYGGNPLASAVAGKVLELIN TPEMLNGVKQRHDWFVERLNTINHRYGLFSEVRGLGLLIGCVLNADYAGQAKQISQEAAK AGVMVLIAGGNVVRFAPALNVSEEEVTTGLDRFAAACEHFVSRGSS >gi|296493301|gb|ADTK01000200.1| GENE 7 7363 - 8169 805 268 aa, chain + ## HITS:1 COG:xthA KEGG:ns NR:ns ## COG: xthA COG0708 # Protein_GI_number: 16129703 # Func_class: L Replication, recombination and repair # Function: Exonuclease III # Organism: Escherichia coli K12 # 1 268 1 268 268 566 100.0 1e-161 MKFVSFNINGLRARPHQLEAIVEKHQPDVIGLQETKVHDDMFPLEEVAKLGYNVFYHGQK GHYGVALLTKETPIAVRRGFPGDDEEAQRRIIMAEIPSLLGNVTVINGYFPQGESRDHPI KFPAKAQFYQNLQNYLETELKRDNPVLIMGDMNISPTDLDIGIGEENRKRWLRTGKCSFL PEEREWMDRLMSWGLVDTFRHANPQTADRFSWFDYRSKGFDDNRGLRIDLLLASQPLAEC CVETGIDYEIRSMEKPSDHAPVWATFRR >gi|296493301|gb|ADTK01000200.1| GENE 8 8336 - 9046 559 236 aa, chain + ## HITS:1 COG:ECs2456 KEGG:ns NR:ns ## COG: ECs2456 COG0398 # Protein_GI_number: 15831710 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 236 17 252 252 402 97.0 1e-112 MNAERKFLLACLIFALVIYAIHTFGLFDLLTDLPHLQTLIRQSGLFGYSLYILLFIIATL FLLPGSILVIAGGIVFGPLLGTLLSLIAATLASSCSFLLARWLGRDLLLKYVGHSHTFQA IEKGIARNGIDFLILTRLIPLFPYNIQNYAYGLTTIAFWPYTLISALTTLPGIVIYTVMA SDLANEGITLRFILQLCLAGLALFILVQLAKLYARHKHVDLSASRRSPLTHPKNEG >gi|296493301|gb|ADTK01000200.1| GENE 9 9051 - 9728 854 225 aa, chain + ## HITS:1 
COG:no KEGG:EcSMS35_1439 NR:ns ## KEGG: EcSMS35_1439 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 225 1 225 225 435 98.0 1e-121 MSQHYSVSWKKGLAALCLLAVAGLSGCDQKENAAAKVEYDGLSNSQPLRVDANNHTVTML VQINGRFLTDDTRHGIVFKDGSNGHKSLFMAYATPKAFYEALKEAGGTPGENMTMDNKET THVTGSKLDISVNWQGAAKAYSFDEVIVDSNGKKLDMRFGGNLTAAEEKKTGCLVCLDSC PVGIVSNATYTYGAVEKRGEVKFKGNASVLPADNTLATVTFKITE >gi|296493301|gb|ADTK01000200.1| GENE 10 9746 - 10450 592 234 aa, chain + ## HITS:1 COG:ECs2458 KEGG:ns NR:ns ## COG: ECs2458 COG0398 # Protein_GI_number: 15831712 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 234 2 235 235 402 99.0 1e-112 MKIQSRKIWYYRITLIILLFAMLLAWALLPGVHEFINRSVAAFAAVDQQGIERFIQSYGA LAAVVSFLLMILQAIAAPLPAFLITFANASLFGAFWGGLLSWTSSMAGAALCFFIARVMG REVVEKLTGKTVLDSMDGFFTRYGKHTILVCRLLPFVPFDPISYAAGLTSIRFRSFFIAT GLGQLPATIVYSWAGSMLTGGTFWFVTGLFILFALTVVIFMAKKIWLERQKRNA >gi|296493301|gb|ADTK01000200.1| GENE 11 10450 - 10998 478 182 aa, chain + ## HITS:1 COG:ECs2459 KEGG:ns NR:ns ## COG: ECs2459 COG2128 # Protein_GI_number: 15831713 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 182 1 182 182 353 98.0 1e-97 MGLPPLSKIPFILRPQAWLHRRHYGEVLSPIRWWGRIPFIFYLVSMFVGWLERKRSPLDP VVRSLVSARIAQMCLCEFCVDITSMKVAERTGSTDKLLAVADWRQSPLFSDEERLALEYA EAASVTPPTVDDALRTRLAAHFDAQALTELTALIGLQNLSARFNSAMDIPAQGLCRIPEK RS >gi|296493301|gb|ADTK01000200.1| GENE 12 11008 - 12174 991 388 aa, chain + ## HITS:1 COG:ynjB KEGG:ns NR:ns ## COG: ynjB COG4134 # Protein_GI_number: 16129708 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Escherichia coli K12 # 1 388 2 389 389 686 97.0 0 MRHCGWLLGLLSLFSLATHASDWQEIKNEAKGQTVWFNAWGGDTAINRYLDWVSGEMKTH YAINLKIVRLADAADAVKRIQTEAAAGRKTGGSVDLLWVNGENFRTLKEAKLLQTGWAET LPNWRYVDTQLPVREDFSVPTEGAESPWGGAQLTFIARRDVTPQPPQTPQALLEFAKANP 
GTVTYPRPPDFTGTAFLEQLLIMLTPDPAALKVAPDKATFARVTAPLWQYLDALHPYLWR KGKDFPPSPARMDALLKAGTLRLSLTFNPAHAQQKIASGDLPASSYSFGFNQGMIGNVHF VTIPANANASAAAKVVANFLLSPNAQLRKADPAVWGDPSVLDPQKLPDGQRETLQSRMPQ DLPPVLAEPHAGWVNALEQEWLHRYGTH >gi|296493301|gb|ADTK01000200.1| GENE 13 12192 - 13682 1376 496 aa, chain + ## HITS:1 COG:ynjC KEGG:ns NR:ns ## COG: ynjC COG4135 # Protein_GI_number: 16129709 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Escherichia coli K12 # 1 496 1 496 496 677 96.0 0 MVAVIYAPLIPAALTLISPALSLTHWQALFADPQLPQALLATLVSTTIAAVGALLIALLV IVALWPWPKWQRMCARLPWLLAIPHVAFATSALLLFADGGLLYDYFPYFTPLMDKLGIGL GLTLAVKESAFLLWILAAVLSEKRLLQQVIVLDSLGYSRWQCLNWLLLPSVAPALAMAML AIVAWSLSVVDVAIILGPGNPPTLAVISWQWLTQGDADQQTKGALASLLLMLLLAAYVLL SYLLWRSWRRTIPRVDGIRKPATPLLPGTTLASFLPLTGVLCVVLLAILADQSTINSEAL INSLTMGLVATFIALLLLLLWLEWGPQRRQLWLWLPILLPALPLVAGQYTLALWLNLDGS WTAVVWGHLLWVMPWMLFILQPAWQRIDSRLILIAQTLSWSRAKIFFYVKCPLMLRPTLI AFAVGFSVSIAQYMPTLWLGAGRFPTLTTEAVALSSGGSNGIFAAQALWQLLLPLIIFAL TALVAKWVGYVRQGLR >gi|296493301|gb|ADTK01000200.1| GENE 14 13682 - 14335 200 217 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 195 1 200 245 81 30 7e-15 MLCVKNVSLRLPESRLLTNVNFTVDKGDIVTLMGPSGCGKSTLFSWMIGALAGQFSCTGE LWLNEQRIDMLPTAQRQIGILFQDALLFDQFSVGQNLLLALPSTLKGTARRNAVKDALDR AGLAETYHQDPATLSGGQRARVALLRALLAQPKALLLDEPFSRLDVALRDNFRQWVFSEV RELAIPVVQVTHDLQDVPADSSVLDMAQWSENYNKLR >gi|296493301|gb|ADTK01000200.1| GENE 15 14402 - 15709 1305 435 aa, chain + ## HITS:1 COG:ynjE KEGG:ns NR:ns ## COG: ynjE COG2897 # Protein_GI_number: 16129711 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Escherichia coli K12 # 1 435 6 440 440 842 99.0 0 MKRVSQMTALAMALGLACASSWAAELAKPLTLDQLQQQNGKAIDTRPSAFYNGWPQTLNG PSGHEPAALNLSASWLDKMSTEQLNAWIKQHNLKADAPVALYGNDKDVDAVKTRLQKAGF 
THISILSDALSEPSRLQKLPHFEQLVYPQWLHDLQQGKEVTAKPAGDWKVIEAAWGAPKL YLISHIPGADYIDTNEVESEPLWNKVSDEQLKAMLAKHGIRHDTTVILYGRDVYAGARVA QIMLYAGVKDVRLLDGGWQTWSDAGLPVERGTPPKVKAEPDFGVKIPAQPQLMLDMEQAR GLLHRQDASLVSIRSWPEFIGTTSGYSYIKPKGEIAGARWGHAGSDSTHMEDFHNPDGTM RSADDITAMWKAWNIKPEQQVSFYCGTGWRASETFMYARAMGWKNVSVYDGGWYEWSSDP KNPVATGERGPDSSK >gi|296493301|gb|ADTK01000200.1| GENE 16 15718 - 16338 404 206 aa, chain - ## HITS:1 COG:ynjF KEGG:ns NR:ns ## COG: ynjF COG0558 # Protein_GI_number: 16129712 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphate synthase # Organism: Escherichia coli K12 # 1 206 3 208 208 352 98.0 4e-97 MLDRHLHPRIKPLLHQCVRVLDKPGITPDGLTLVGFAIGVLALPFLALGWYLAALVVILL NRLLDGLDGALARRRGLTDAGGFLDISLDFLFYALVPFGFILAAPQQNALAGGWLLFAFI GTGSSFLAFAALAAKHQIDNPGYAHKSFYYLGGLTEGTETILLFVLGCLFPAWFAWFAWI FGALCWMTTFTRVWSGYLTLKSHQRQ >gi|296493301|gb|ADTK01000200.1| GENE 17 16425 - 16832 375 135 aa, chain + ## HITS:1 COG:ynjG KEGG:ns NR:ns ## COG: ynjG COG0494 # Protein_GI_number: 16129713 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli K12 # 1 135 1 135 135 242 98.0 1e-64 MKMIEVVAAIIERDGKILLAQRPAQSDQAGLWEFAGGKVEPDESQRQALVRELREELGIE ATVGEYVTSHQREVSGRIIHLHAWHVPDFHGTLQAHEHQALVWCSPEEALQYPLAPADIP LLESFMALRAARPAD >gi|296493301|gb|ADTK01000200.1| GENE 18 16798 - 17070 95 90 aa, chain - ## HITS:1 COG:no KEGG:SDY_1515 NR:ns ## KEGG: SDY_1515 # Name: not_defined # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 90 1 90 90 179 100.0 3e-44 MSRALFAVVLAFPLIALANPHYRPDVEVNVPPEVFSSGGQSAQPCTQCCVYQDKNYSEGA VIKAEGILLQCQRDDKTLSTNPLVWRRVKP >gi|296493301|gb|ADTK01000200.1| GENE 19 17306 - 18649 1467 447 aa, chain + ## HITS:1 COG:ECs2467 KEGG:ns NR:ns ## COG: ECs2467 COG0334 # Protein_GI_number: 15831721 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine 
dehydrogenase # Organism: Escherichia coli O157:H7 # 1 447 1 447 447 892 99.0 0 MDQTYSLESFLNHVQKRDPNQTEFAQAVREVMTTLWPFLEQNPKYRQMSLLERLVEPERV IQFRVVWVDDRNQVQVNRAWRVQFSSAIGPYKGGMRFHPSVNLSILKFLGFEQTFKNALT TLPMGGGKGGSDFDPKGKSEGEVMRFCQALMTELYRHLGADTDVPAGDIGVGGREVGFMA GMMKKLSNNTACVFTGKGLSFGGSLIRPEATGYGLVYFTEAMLKRHGMGFEGMRVSVSGS GNVAQYAIEKAMEFGARVITASDSSGTVVDESGFTKEKLARLIEIKASRDGRVADYAKEF GLVYLERQQPWSVPVDIALPCATQNELDVDAAHQLIANGVKAVAEGANMPTTIEATELFQ QAGVLFAPGKAANAGGVATSGLEMAQNAARLGWKAEKVDARLHHIMLDIHHACVEHGGEG EQTNYVQGANIAGFVKVADAMLAQGVI >gi|296493301|gb|ADTK01000200.1| GENE 20 18766 - 19806 231 346 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_1985 NR:ns ## KEGG: EcE24377A_1985 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 346 1 346 346 702 99.0 0 MKKVLLQNHPGSEKYSFNGWEIFNSNFERMIKENKAMLLCKWGGYLTCVVAVMFVFAAIT SNGLNERGLITAGCSFLYLLIMMGLIVRAGFKAKKEQLHYYQAKGIEPLSIEKLQALQLI APYRFYHKQWSETLEFWPRKPEPGKDTFQYHVLPFDSIDIISKRRESLEDQWGIEDSESY CALMEHFLSGDHGANTFKANMEEAPEQVIALLNKFAVFPSDYISDCANNRSGKSSAKLIW AAELSWMTSISSTAFQNGTIEEELAWHYIMLASRKAHELFESEEDYQKNSLMGFLYWHIC CYRRKLTDAELEACYRYDKQFWEHYSKKCRWPIRNVPWGASSVKYS >gi|296493301|gb|ADTK01000200.1| GENE 21 19934 - 21895 1817 653 aa, chain - ## HITS:1 COG:topB KEGG:ns NR:ns ## COG: topB COG0550 # Protein_GI_number: 16129717 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Escherichia coli K12 # 1 653 1 653 653 1321 99.0 0 MRLFIAEKPSLARAIADVLPKPHRKGDGFIECGNGQVVTWCIGHLLEQAQPDAYDSRYAR WNLADLPIVPEKWQLQPRPSVTKQLNVIKRFLHEASEIVHAGDPDREGQLLVDEVLDYLQ LAPEKRQQVQRCLINDLNPQAVERAIDRLRSNSEFVPLCVSALARARADWLYGINMTRAY TILGRNAGYQGVLSVGRVQTPVLGLVVRRDEEIENFVAKDFFEVKAHIVTPADERFTAIW QPSEACEPYQDEEGRLLHRPLAEHVVNRISGQPAIVTSYNDKRESESAPLPFSLSALQIE AAKRFGLSAQNVLDICQKLYETHKLITYPRSDCRYLPEEHFAGRHAVMNAISVHAPDLLP QPVVDPDIRNRCWDDKKVDAHHAIIPTARSSAINLTENEAKVYNLIARQYLMQFCPDAVF RKCVIELDIAKGKFVAKARFLAEAGWRALLGSKERDEENDGTPLPVVAKGDELLCEKGEV 
VERQTQPPRHFTDATLLSAMTGIARFVQDKDLKKILRATDGLGTEATRAGIIELLFKRGF LTKKGRYIHSTDAGKALFHSLPEMATRPDMTAHWESVLTQISEKQCRYQDFMQPLVGTLY QLIDQAKRTPVRQFRGIVAPGSGGSADKKKAAPRKRSAKKSPPADEAGSGAIA >gi|296493301|gb|ADTK01000200.1| GENE 22 21900 - 22943 1202 347 aa, chain - ## HITS:1 COG:ECs2470 KEGG:ns NR:ns ## COG: ECs2470 COG0709 # Protein_GI_number: 15831724 # Func_class: E Amino acid transport and metabolism # Function: Selenophosphate synthase # Organism: Escherichia coli O157:H7 # 1 347 1 347 347 679 99.0 0 MSENSIRLTQYSHGAGCGCKISPKVLETILHSEQAKFVDPNLLVGNETRDDAAVYDLGNG TSVISTTDFFMPIVDNPFDFGRIAATNAISDIFAMGGKPIMAIAILGWPINKLSPEIARE VTEGGRYACRQAGIALAGGHSIDAPEPIFGLAVTGIVPTERVKKNSTAQAGCKLFLTKPL GIGVLTTAEKKSLLKPEHQGLATEVMCRMNIAGASFANIEGVKAMTDVTGFGLLGHLSEM CQGAGVQARVDYDAIPKLPGVEEYIKLGAVPGGTERNFASYGHLMGEMPREVRDLLCDPQ TSGGLLLAVMPEAENEVKTTAAEFGIELTAIGELVPARGGRAMVEIR >gi|296493301|gb|ADTK01000200.1| GENE 23 23060 - 23611 649 183 aa, chain - ## HITS:1 COG:ECs2471 KEGG:ns NR:ns ## COG: ECs2471 COG0778 # Protein_GI_number: 15831725 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Escherichia coli O157:H7 # 1 183 1 183 183 362 99.0 1e-100 MDALELLINRRSASRLAEPAPTGEQLQNILRAGMRAPDHKSMQPWHFFVIEGEGRERFSA VLEQGAIAAGSDDKAIDKARNAPFRAPLIITVVAKCEENHKVPRWEQEMSAGCAVMAMQM AAVAQGFGGIWRSGALTEIPVVREAFGCREQDKIVGFLYLGTPQLKASTSINVPDPTPFV TYF >gi|296493301|gb|ADTK01000200.1| GENE 24 23665 - 23829 56 54 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_2054 NR:ns ## KEGG: ECUMN_2054 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 54 1 54 54 103 100.0 3e-21 MLQHSPQFLLTSAGWVITITTSIAPVTGVTLSWENTCEPFGDLLPDFLNGRGVC >gi|296493301|gb|ADTK01000200.1| GENE 25 23772 - 25628 1751 618 aa, chain + ## HITS:1 COG:sppA KEGG:ns NR:ns ## COG: sppA COG0616 # Protein_GI_number: 16129720 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine 
proteases (ClpP class) # Organism: Escherichia coli K12 # 1 618 1 618 618 1191 98.0 0 MRTLWRFIAGFFKWTWRLLNFVREMVLNLFFIFLVLVGVGIWMQVSSSDTKETAGRGALL LDISGVIVDKPDSSQRFSKLSRQLLGASSDRLQENSLFDIVNTIRQAKDDRNITGIVMDL KDFAGGDQPSMQYIGKALKEFRDSGKPVYAVGENYSQGQYYLASFANKIWLSPQGVVDLH GFATNGLYYKSLLDKLKVSTHVFRVGTYKSAVEPFIRDDMSPAAREADSRWIGELWQNYL NTVAANRQIPAQQVFPGAQGLLEGLTKTGGDTAKYALENKLVDALASSAEIEKALTKEFG WSKTDKNYRAISYYDYALKTPADTGDSIGVVFANGAIMDGEEAQGNVGGDTTAAQIRDAR LDPKVKAIVLRVNSPGGSVTASEVIRAELAAARAAGKPVVVSMGGMAASGGYWISTPANY IVANPSTLTGSIGIFGVITTVENSLDSIGVHTDGVSTSPLADVSITKALPPEAQQMMQLS IENGYKRFITLVADARHSTPEQIDKIAQGHVWTGQDAKANGLVDSLGDFDDAVAKAAELA KVKQWHLEYYVDEPTFFDKVMDNMSGSVRAMLPDTFQAMLPAPLASVASTVKSESDKLAA FNDPQNRYAFCLTCANVR >gi|296493301|gb|ADTK01000200.1| GENE 26 25795 - 26811 1026 338 aa, chain + ## HITS:1 COG:ECs2474 KEGG:ns NR:ns ## COG: ECs2474 COG0252 # Protein_GI_number: 15831728 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Escherichia coli O157:H7 # 1 338 1 338 338 678 99.0 0 MQKKSIYVAYTGGTIGMQRSEQGYIPVSGHLQRQLALMPEFHRPEMPDFTIHEYTPLMDS SDMTPEDWQHIAEDIKAHYDEYDGFVILHGTDTMAYTASALSFMLENLGKPVIVTGSQIP LAELRSDGQINLLNALYVAANYPINEVTLFFNNRLYRGNRTTKAHADGFDAFASPNLPPL LEAGIHIRRLNTPPAPHGEGELIVHPITPQPIGVVTIYPGISADVVRNFLRQPVKALILR SYGVGNAPQNKAFLQELQEASDRGIVVVNLTQCMSGKVNMGGYATGNALAHAGVIGGADM TVEATLTKLHYLLSQELDTETIRKAMSQNLRGELTPDD >gi|296493301|gb|ADTK01000200.1| GENE 27 26822 - 27463 563 213 aa, chain + ## HITS:1 COG:ECs2475 KEGG:ns NR:ns ## COG: ECs2475 COG1335 # Protein_GI_number: 15831729 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Escherichia coli O157:H7 # 1 213 7 219 219 439 99.0 1e-123 MPHRALLLVDLQNDFCAGGALAVPEGDSTVEVANRLIDWCQSRGEAVIASQDWHPANHGS FASQHGVEPYTPGQLDGLPQTFWPDHCVQNSEGAQLHPLLNQKAIAAVFHKGENPLVDSY 
SAFFDNGRRQKTSLDDWLRDHEIDELIVMGLATDYCVKFTVLDALQLGYKVNVITDGCRG VNIQPQDSAHAFMEMSAAGATLYTLADWEETQG >gi|296493301|gb|ADTK01000200.1| GENE 28 27556 - 28914 674 452 aa, chain - ## HITS:1 COG:ydjE KEGG:ns NR:ns ## COG: ydjE COG0477 # Protein_GI_number: 16129723 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 452 1 452 452 786 98.0 0 MEQYDQIGARLDRLPLARFHYRIFGIISFSLLLTGFLSYSGNVVLAKLVSNGWSNNFLNA AFTSALMFGYFIGSLTGGFIGDYFGRRRAFRINLLIVGIAATGAAFVPDMYWLIFFRFLM GTGMGALIMVGYASFTEFIPATVRGKWSARLSFVGNWSPMLSAAIGVVVIAFFSWRIMFL LGGIGILLAWFLSGKYFIESPRWLAGKGQIAGAESQLREVEQQIEREKSIRLPQLTLNQS NSKVKVIKGTFWLLFKGEMLRRTLVAITVLIAMNISLYTITVWIPTIFVNSGIDVDKSIL MTAVIMIGAPVGIFIAALIIDHFPRRLFGSALLIIIAVLGYIYSIQTTEWAILIYGLVMI FFLYMYVCFASAVYIPELWPTHLRLRGSGFVNAVGRIVAVFTPYGVAALLTHYGSITVFM VLGVMLVLCALVLSIFGIETRKVSLEEISEVN >gi|296493301|gb|ADTK01000200.1| GENE 29 29031 - 29789 647 252 aa, chain - ## HITS:1 COG:ECs2479 KEGG:ns NR:ns ## COG: ECs2479 COG1349 # Protein_GI_number: 15831733 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli O157:H7 # 1 252 1 252 252 464 100.0 1e-131 MAAKDRIQAIKQMVANDKKVTVSNLSGIFQVTEETIRRDLEKLEDEGFLTRTYGGAVLNT AMLTENIHFYKRASSFYEEKQLIARKALPFIDNKTTMAADSSSTVMELLKLLQDRSDLTL LTNSAEAIHVLAQSEIKVVSTGGELNKNTLSLQGRITKEIISRYHVDIMVMSCKGLDINS GALDSNEAEAEIKKTMIRQATEVALLVDHSKFDRKAFVQLADFSHINYIITDKSPGAEWI AFCKDNNIQLVW >gi|296493301|gb|ADTK01000200.1| GENE 30 29926 - 30906 868 326 aa, chain - ## HITS:1 COG:ydjG KEGG:ns NR:ns ## COG: ydjG COG0667 # Protein_GI_number: 16129725 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Escherichia coli K12 # 1 326 1 326 326 661 99.0 0 
MKKIPLGTTDITLSRMGLGTWAIGGGPAWNGDLDRQICIDTILEAHRCGINLIDTAPGYN FGNSEVIVGQALKKLPREQVVVETKCGIVWERKGSLFNKVGDRQLYKNLSPESIREEVEA SLQRLGIDYIDIYMTHWQSVPPFFTPIAETVAVLNELKAEGKIRAIGAANVDADHIREYL QYGELDIIQAKYSILDRAMENELLPLCRDNGIVVQVYSPLEQGLLTGTITRDYVPGGARA NKVWFQRENMLKVIDMLEQWQPLCARYQCTIPTLALAWILKQSDLISILSGATAPEQVRE NVAALIINLSDADATLMREMAEALER >gi|296493301|gb|ADTK01000200.1| GENE 31 30916 - 31863 704 315 aa, chain - ## HITS:1 COG:ydjH KEGG:ns NR:ns ## COG: ydjH COG0524 # Protein_GI_number: 16129726 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli K12 # 1 315 8 322 322 603 98.0 1e-172 MDNLDVICIGAAIVDIPLQPVSKNIFDVDSYPLERIAMTTGGDAINEATIISRLGHRTAL MSRIGKDAAGQFILDHCRKENIDIQSLKQDVSIDTSINVGLVTEDGERTFVTNRNGSLWK LNIDDVDFSRFSQAKLLSLASIFNSPLLDGKALTEIFTQAKTRQMIICADMIKPRLNETL DDICEALSYVDYLFPNYAEAKLLTGKETLDEIADYFLACGVKTVVIKTGKDGCFIKRGDM TMKVPAVAGITAIDTIGAGDNFASGFIAALLEGKNLRECARFANATAAISVLSVGATTGV KNRKLVEQLLEEYEG >gi|296493301|gb|ADTK01000200.1| GENE 32 31868 - 32704 735 278 aa, chain - ## HITS:1 COG:ydjI KEGG:ns NR:ns ## COG: ydjI COG0191 # Protein_GI_number: 16129727 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Escherichia coli K12 # 1 278 1 278 278 552 99.0 1e-157 MLADIRYWENDATNKHYAIAHFNVWNAEMLMGVIDAAEEAKSPVIISFGTGFVGNTSFED FSHMMVSMARKATVPVITHWDHGRSMEIIHNAWTHGMNSLMRDASAFDFEENIRLTKEAV DFFHPLGIPVEAELGHVGNETVYEEALAGYHYTDPDQAAEFVERTGCDSLAVAIGNQHGV YTSEPQLNFEVVKRVRDAVSVPLVLHGASGISDADIKTAISLGIAKINIHTELCQAAMVA VKENQDQPFLHLEREVRKAVKERALEKIKLFGSDGKAE >gi|296493301|gb|ADTK01000200.1| GENE 33 32725 - 33768 798 347 aa, chain - ## HITS:1 COG:ydjJ KEGG:ns NR:ns ## COG: ydjJ COG1063 # Protein_GI_number: 16129728 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 347 1 347 347 698 99.0 0 
MKNSKAILQVPGTMKIISAEIPVPKEDEVLIKVEYVGICGSDVHGFESGPFIPPKDPNQE IGLGHECAGTVVAVGSRVRKFKPGDRVNIEPGVPCGHCRYCLEGKYNICPDVDFMATQPN YRGALTHYLCHPESFTYKLPDNMDTMEGALVEPAAVGMHAAMLADVKPGKKIIILGAGCI GLMTLQACKCLGATEIAVVDVLEKRLIMAEQLGATVVINGTKEDTIARCQQFTEDMGADI VFETAGSAVTIKQAPYLVMRGGKIMIVGTVPGDSAINFLKINREVTIQTVFRYANRYPVT IEAISSGRFDVKSMVTHIYDYRDVQQAFEESVNNKRDIIKGVIKISD >gi|296493301|gb|ADTK01000200.1| GENE 34 33785 - 35164 993 459 aa, chain - ## HITS:1 COG:ydjK KEGG:ns NR:ns ## COG: ydjK COG0477 # Protein_GI_number: 16129729 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 459 1 459 459 831 99.0 0 MEQITKPHCGARLDRLPDCRWHSSMFAIVAFGLLVCWSNAVGGLILAQLKALGWTDNSTT ATFSAITTAGMFLGALVGGIIGDKTGRRNAFILYEAIHIASMVVGAFSPNMDFLIACRFV MGVGLGALLVTLFAGFTEYMPGRNRGTWSSRVSFIGNWSYPLCSLIAMGLTPLISAEWNW RVQLLIPAILSLIATALAWRYFPESPRWLESRGRYQEAEKVMRSIEEGVIRQTGKPLPPV VIADDGKAPQAVPYSALLTGVLLKRVILGSCVLIAMNVVQYTLINWLPTIFMTQGINLKD SIVLNTMSMFGAPFGIFIAMLVMDKIPRKTMGVGLLILIAVLGYIYSLQTSMLLITLIGF FLITFVYMYVCYASAVYVPEIWPTEAKLRGSGLANAVGRISGIAAPYAVAVLLSSYGVTG VFILLGAVSIVVAIAIATIGIETKGVSVESLSIDAVANK >gi|296493301|gb|ADTK01000200.1| GENE 35 35191 - 36267 1075 358 aa, chain - ## HITS:1 COG:ECs2485 KEGG:ns NR:ns ## COG: ECs2485 COG1063 # Protein_GI_number: 15831739 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli O157:H7 # 1 358 1 358 358 739 100.0 0 MKALARFGKAFGGYKMIDVPQPICGPEDVVIEIKAAAICGADMKHYNVDSGSDEFNSIRG HEFAGCIAQVGEKVKDWKVGQRVVSDNSGHVCGVCPACEQGDFLCCTEKVNLGLDNNTWG GGFSKYCLVPGEILKIHRHALWEIPDGVDYEDAAVLDPICNAYKSIAQQSKFLPGQDVVV IGTGPLGLFSVQMARIMGAVNIVVVGLQEDVAVRFPVAKELGATAVVNGSTEDVVARCQQ ICGKDNLGLVIECSGANIALKQAIEMLRPNGEVVRVGMGFKPLDFSINDITAWNKSIIGH 
MAYDSTSWRNAIRLLASGAIKVKPMITHRIGLSQWREGFDAMVDKTAIKVIMTYDFDE >gi|296493301|gb|ADTK01000200.1| GENE 36 36637 - 36909 336 90 aa, chain - ## HITS:1 COG:ECs2486 KEGG:ns NR:ns ## COG: ECs2486 COG3139 # Protein_GI_number: 15831740 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 90 16 105 105 167 97.0 5e-42 MNLDDIINSMTPEVYQRLSTAVELGKWPDGVALTEEQKANCLQLVMLWQARHNTEAQHMT IDTNGQMVMKSKQQLKEDFGISAKPIAMFK >gi|296493301|gb|ADTK01000200.1| GENE 37 36951 - 37364 238 137 aa, chain - ## HITS:1 COG:ECs2487 KEGG:ns NR:ns ## COG: ECs2487 COG0229 # Protein_GI_number: 15831741 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Conserved domain frequently associated with peptide methionine sulfoxide reductase # Organism: Escherichia coli O157:H7 # 1 137 1 137 137 285 100.0 2e-77 MANKPSAEELKKNLSEMQFYVTQNHGTEPPFTGRLLHNKRDGVYHCLICDAPLFHSQTKY DSGCGWPSFYEPVSEESIRYIKDLSHGMQRIEIRCGNCDAHLGHVFPDGPQPTGERYCVN SASLRFTDGENGEEING >gi|296493301|gb|ADTK01000200.1| GENE 38 37706 - 38701 1051 331 aa, chain + ## HITS:1 COG:ECs2488 KEGG:ns NR:ns ## COG: ECs2488 COG0057 # Protein_GI_number: 15831742 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Escherichia coli O157:H7 # 1 331 1 331 331 633 99.0 0 MTIKVGINGFGRIGRIVFRAAQKRSDIEIVAINDLLDADYMAYMLKYDSTHGRFDGTVEV KDGHLIVNGKKIRVTAERDPANLKWDEVGVDVVAEATGLFLTDETARKHITAGAKKVVMT GPSKDNTPMFVKGANFDKYAGQDIVSNASCTTNCLAPLAKVINDNFGIIEGLMTTVHATT ATQKTVDGPSHKDWRGGRGASQNIIPSSTGAAKAVGKVLPELSGKLTGMAFRVPTPNVSV VDLTVRLEKAATYEQIKAAVKAAAEGEMKGVLGYTEDDVVSTDFNGEVCTSVFDAKAGIA LNDNFVKLVSWYDNETGYSNKVLDLIAHISK >gi|296493301|gb|ADTK01000200.1| GENE 39 38785 - 39669 965 294 aa, chain + ## HITS:1 COG:yeaD KEGG:ns NR:ns ## COG: yeaD COG0676 # Protein_GI_number: 16129734 # Func_class: G Carbohydrate transport and metabolism # Function: Uncharacterized enzymes related to 
aldose 1-epimerase # Organism: Escherichia coli K12 # 1 294 8 301 301 612 98.0 1e-175 MIKKIFALPVIEQISPVLSRRKLDELDLIVVDHPQVKASFALQGAHLLSWKPAGEEEVLW LSNNTPFKNGVAIRGGVPVCWPWFGPAAQQGLPAHGFARNLPWTLKSHREDADGVALTFE LTQSEETKKFWPHDFTLLAHFRVGKTCEIDLESHGEFETTSALHTYFNVGDIAKVSVSGL GDRFIDKVNDAKEDVLTDGIQTFPDRTDRVYLNPQDCSVINDEALNRIIAVGHQHHLNVV GWNPGPALSVSMGDMPDDGYKTFVCVETAYASETQKVTKEKPAHLAQSIRVAKR Prediction of potential genes in microbial genomes Time: Mon May 16 15:39:54 2011 Seq name: gi|296493300|gb|ADTK01000201.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont544.9, whole genome shotgun sequence Length of sequence - 40250 bp Number of predicted genes - 38, with homology - 38 Number of transcription units - 24, operones - 9 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.667 - CDS 23 - 877 764 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase - Term 903 - 931 3.0 2 1 Op 2 . - CDS 967 - 1713 819 ## COG3713 Outer membrane protein V - Prom 1887 - 1946 4.0 3 2 Op 1 12/0.000 + CDS 2149 - 4083 2035 ## COG2766 Putative Ser protein kinase 4 2 Op 2 4/0.417 + CDS 4196 - 5479 947 ## PROTEIN SUPPORTED gi|163739530|ref|ZP_02146940.1| 50S ribosomal protein L34 + Term 5485 - 5523 5.2 + Prom 5610 - 5669 4.8 5 3 Tu 1 5/0.167 + CDS 5758 - 7101 757 ## COG2199 FOG: GGDEF domain + Term 7119 - 7149 1.0 + Prom 7108 - 7167 2.7 6 4 Op 1 2/1.000 + CDS 7300 - 8772 1085 ## COG2199 FOG: GGDEF domain 7 4 Op 2 . + CDS 8815 - 9384 467 ## COG2606 Uncharacterized conserved protein + Term 9443 - 9490 4.2 + Prom 9434 - 9493 3.1 8 5 Tu 1 . + CDS 9593 - 10039 625 ## COG2707 Predicted membrane protein + Term 10058 - 10091 2.0 9 6 Tu 1 . - CDS 9996 - 10814 337 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 10838 - 10897 5.8 + Prom 10832 - 10891 3.4 10 7 Op 1 3/0.667 + CDS 10914 - 12095 581 ## COG2807 Cyanate permease 11 7 Op 2 . + CDS 12150 - 12497 427 ## COG3189 Uncharacterized conserved protein - Term 12435 - 12491 0.6 12 8 Tu 1 . 
- CDS 12519 - 12773 142 ## COG3042 Putative hemolysin - Prom 12798 - 12857 3.9 + Prom 12798 - 12857 6.9 13 9 Tu 1 . + CDS 12956 - 13981 756 ## COG2199 FOG: GGDEF domain + Term 14035 - 14073 2.0 - Term 14197 - 14234 6.3 14 10 Tu 1 . - CDS 14248 - 14496 347 ## COG2261 Predicted membrane protein - Prom 14539 - 14598 3.6 15 11 Op 1 . - CDS 14644 - 14826 271 ## ECB_01765 hypothetical protein 16 11 Op 2 4/0.417 - CDS 14830 - 15189 308 ## COG3615 Uncharacterized protein/domain, possibly involved in tellurite resistance - Prom 15281 - 15340 3.8 17 12 Tu 1 . - CDS 15362 - 16000 500 ## COG1280 Putative threonine efflux protein - Prom 16041 - 16100 3.5 18 13 Tu 1 . - CDS 16127 - 17050 890 ## COG0583 Transcriptional regulator - Prom 17091 - 17150 3.8 + Prom 17059 - 17118 3.0 19 14 Tu 1 . + CDS 17153 - 18238 1058 ## COG0473 Isocitrate/isopropylmalate dehydrogenase + Term 18257 - 18291 6.0 + Prom 18334 - 18393 8.1 20 15 Op 1 3/0.667 + CDS 18489 - 20099 1622 ## COG1292 Choline-glycine betaine transporter 21 15 Op 2 11/0.000 + CDS 20131 - 21255 1162 ## COG4638 Phenylpropionate dioxygenase and related ring-hydroxylating dioxygenases, large terminal subunit + Term 21259 - 21301 5.5 22 15 Op 3 . + CDS 21311 - 22276 629 ## COG1018 Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 + Term 22281 - 22337 7.8 23 16 Op 1 8/0.000 - CDS 22330 - 23445 1009 ## COG0349 Ribonuclease D - Term 23476 - 23508 3.0 24 16 Op 2 7/0.000 - CDS 23527 - 25278 1565 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II - Prom 25298 - 25357 3.1 - Term 25363 - 25391 0.6 25 16 Op 3 7/0.000 - CDS 25417 - 25998 570 ## COG3065 Starvation-inducible outer membrane lipoprotein 26 16 Op 4 8/0.000 - CDS 26038 - 26733 652 ## COG1214 Inactive homolog of metal-dependent proteases, putative molecular chaperone 27 16 Op 5 . - CDS 26791 - 28701 1511 ## COG1199 Rad3-related DNA helicases - Prom 28806 - 28865 1.8 28 17 Tu 1 . 
+ CDS 28833 - 29177 522 ## COG0251 Putative translation initiation inhibitor, yjgF family 29 18 Tu 1 . + CDS 29600 - 29899 230 ## S1533 hypothetical protein + Term 29929 - 29972 5.2 30 19 Tu 1 . - CDS 30019 - 30198 331 ## COG3140 Uncharacterized protein conserved in bacteria - Prom 30221 - 30280 4.7 + Prom 30165 - 30224 5.2 31 20 Op 1 6/0.000 + CDS 30272 - 31633 1053 ## COG0147 Anthranilate/para-aminobenzoate synthases component I 32 20 Op 2 5/0.167 + CDS 31637 - 32215 410 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes + Prom 32247 - 32306 5.2 33 21 Tu 1 . + CDS 32399 - 33763 1485 ## COG1760 L-serine deaminase + Term 33774 - 33811 8.2 + Prom 33778 - 33837 1.9 34 22 Tu 1 . + CDS 33894 - 35492 1116 ## COG2200 FOG: EAL domain 35 23 Tu 1 . - CDS 35496 - 37052 1795 ## COG1253 Hemolysins and related proteins containing CBS domains + Prom 37321 - 37380 4.5 36 24 Op 1 13/0.000 + CDS 37515 - 38486 1157 ## COG3444 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB 37 24 Op 2 13/0.000 + CDS 38549 - 39349 993 ## COG3715 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC 38 24 Op 3 . 
+ CDS 39362 - 40213 1007 ## COG3716 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID Predicted protein(s) >gi|296493300|gb|ADTK01000201.1| GENE 1 23 - 877 764 284 aa, chain - ## HITS:1 COG:ECs2490 KEGG:ns NR:ns ## COG: ECs2490 COG0656 # Protein_GI_number: 15831744 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Escherichia coli O157:H7 # 1 284 1 284 284 540 97.0 1e-153 MQQKMIQFSGDVSLPAIGQGTWYIGEDASQRKTEVAALRAGIELGLTLIDTAEMYADGGA EKVVGEALTGLRDNVFLVSKVYPWNAGGQKAINACEASLRRLNTDYLDLYLLHWSGSFAF EETVAAMEKLIAQGKIRRWGVSNLDHADMQELWQLPGGNQCATNQVLYHLGSRGIEYDLL PWCQQQQMPVMAYSPLAQAGRLRNGLLKNAVVNEIAHAHNISAAQVLLAWVISHQGVMAI PKAATIAHVQQNAAVLEVELSSAELTMLDKAYPAPKGKTALDMV >gi|296493300|gb|ADTK01000201.1| GENE 2 967 - 1713 819 248 aa, chain - ## HITS:1 COG:yeaF KEGG:ns NR:ns ## COG: yeaF COG3713 # Protein_GI_number: 16129736 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein V # Organism: Escherichia coli K12 # 1 248 1 248 248 467 100.0 1e-132 MTKLKLLALGVLIATSAGVAHAEGKFSLGAGVGVVEHPYKDYDTDVYPVPVINYEGDNFW FRGLGGGYYLWNDATDKLSITAYWSPLYFKAKDSGDHQMRHLDDRKSTMMAGLSYAHFTQ YGYLRTTLAGDTLDNSNGIVWDMAWLYRYTNGGLTVTPGIGVQWNSENQNEYYYGVSRKE SARSGLRGYNPNDSWSPYLELSASYNFLGDWSVYGTARYTRLSDEVTDSPMVDKSWTGLI STGITYKF >gi|296493300|gb|ADTK01000201.1| GENE 3 2149 - 4083 2035 644 aa, chain + ## HITS:1 COG:ECs2492 KEGG:ns NR:ns ## COG: ECs2492 COG2766 # Protein_GI_number: 15831746 # Func_class: T Signal transduction mechanisms # Function: Putative Ser protein kinase # Organism: Escherichia coli O157:H7 # 1 644 1 644 644 1293 100.0 0 MNIFDHYRQRYEAAKDEEFTLQEFLTTCRQDRSAYANAAERLLMAIGEPVMVDTAQEPRL SRLFSNRVIARYPAFEEFYGMEDAIEQIVSYLKHAAQGLEEKKQILYLLGPVGGGKSSLA ERLKSLMQLVPIYVLSANGERSPVNDHPFCLFNPQEDAQILEKEYGIPRRYLGTIMSPWA AKRLHEFGGDITKFRVVKVWPSILQQIAIAKTEPGDENNQDISALVGKVDIRKLEHYAQN DPDAYGYSGALCRANQGIMEFVEMFKAPIKVLHPLLTATQEGNYNGTEGISALPFNGIIL 
AHSNESEWVTFRNNKNNEAFLDRVYIVKVPYCLRISEEIKIYEKLLNHSELTHAPCAPGT LETLSRFSILSRLKEPENSSIYSKMRVYDGESLKDTDPKAKSYQEYRDYAGVDEGMNGLS TRFAFKILSRVFNFDHVEVAANPVHLFYVLEQQIEREQFPQEQAERYLEFLKGYLIPKYA EFIGKEIQTAYLESYSEYGQNIFDRYVTYADFWIQDQEYRDPDTGQLFDRESLNAELEKI EKPAGISNPKDFRNEIVNFVLRARANNSGRNPNWTSYEKLRTVIEKKMFSNTEELLPVIS FNAKTSTDEQKKHDDFVDRMMEKGYTRKQVRLLCEWYLRVRKSS >gi|296493300|gb|ADTK01000201.1| GENE 4 4196 - 5479 947 427 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739530|ref|ZP_02146940.1| 50S ribosomal protein L34 [Phaeobacter gallaeciensis BS107] # 1 418 1 433 445 369 44 1e-101 MTWFIDRRLNGKNKSMVNRQRFLRRYKAQIKQSISEAINKRSVTDVDSGESVSIPTEDIS EPMFHQGRGGLRHRVHPGNDHFVQNDRIERPQGGGGGSGSGQGQASQDGEGQDEFVFQIS KDEYLDLLFEDLALPNLKQNQQRQLTEYKTHRAGYTANGVPANISVVRSLQNSLARRTAM TAGKRRELHALEENLAIISNSEPAQLLEEERLRKEIAELRAKIERVPFIDTFDLRYKNYE KRPDPSSQAVMFCLMDVSGSMDQSTKDMAKRFYILLYLFLSRTYKNVEVVYIRHHTQAKE VDEHEFFYSQETGGTIVSSALKLMDEVVKERYNPAQWNIYAAQASDGDNWADDSPLCHEI LAKKLLPVVRYYSYIEITRRAHQTLWREYEHLQSTFDNFAMQHIRDQDDIYPVFRELFHK QNSTAKD >gi|296493300|gb|ADTK01000201.1| GENE 5 5758 - 7101 757 447 aa, chain + ## HITS:1 COG:yeaI_2 KEGG:ns NR:ns ## COG: yeaI_2 COG2199 # Protein_GI_number: 16129739 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 267 447 1 181 181 356 100.0 6e-98 MVSMGTQKLKAQSFFIFSLLLTLILFCITTLYNENTNVKLIPQMNYLMVVVALFFLNAVI FLFMLMKYFTNKQILPTLILSLAFLSGLIYLVETIVIIHKPINGSTLIQTKSNDVSIFYI FRQLSFICLTSLALFCYGKDNILDNNKKKTGILLLALIPFLVFPLLAHNLSSYNADYSLY VVDYCPDNHTATWGINYTKILVCLWAFLLFFIIMRTRLASELWPLIALLCLASLCCNLLL LTLDEYNYTIWYISRGIEVSSKLFVVSFLIYNIFQELQLSSKLAVHDVLTNIYNRRYFFN SVESLLSRPVVKDFCVMLVDINQFKRINAQWGHRVGDKVLVSIVDIIQQSIRPDDILARL EGEVFGLLFTELNSAQAKIIAERMRKNVELLTGFSNRYDVPEQMTISIGTVFSTGDTRNI SLVMTEADKALREAKSEGGNKVIIHHI >gi|296493300|gb|ADTK01000201.1| GENE 6 7300 - 8772 1085 490 aa, chain + ## HITS:1 COG:ZyeaJ_2 KEGG:ns NR:ns ## COG: ZyeaJ_2 COG2199 # Protein_GI_number: 15802200 # Func_class: T Signal transduction mechanisms # Function: FOG: 
GGDEF domain # Organism: Escherichia coli O157:H7 EDL933 # 325 490 1 166 166 326 100.0 7e-89 MLRHFIAASVIVLTSSFLIFELVASDRAMSAYLRYIVQKADSSFLYDKYQNQSIAAHVMR ALAAEQSEVSPEQRRAICEAFESANNTHGLNLTAHKYPGLRGTLQTASTDCDTIVEAAAL LPAFDQAVEGNRHQDDYGSGLGMAEEKFHYYLDLNDRYVYFYEPVNVEYFAMNNWSFLQS GSIGIDRKDIEKVFTGRTVLSSIYQDQRTKQNVMSLLTPVYVAGQLKGIVLLDINKNNLR NIFYTHDRPLLWRFLNVTLTDTDSGRDIIINQSEDNLFQYVSYVHDLPGGIRVSLSIDIL YFITSSWKSVLFWILTALILLNMVRMHFRLYQNVSRENISDAMTGLYNRKILTPELEQRL QKLVQSGSSVMFIAIDMDKLKLINDTLGHQEGDLAITLLAQAIKQSIRKSDYAIRLGGDE FCIILVDSTPQIAAQLPERIEKRLQHIAPQKEIGFSSGIYAMKENDTLHDAYKASDERLY VNKQNKNSRS >gi|296493300|gb|ADTK01000201.1| GENE 7 8815 - 9384 467 189 aa, chain + ## HITS:1 COG:ECs2496 KEGG:ns NR:ns ## COG: ECs2496 COG2606 # Protein_GI_number: 15831750 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 167 1 167 167 320 98.0 7e-88 MTEMAKGSVTHQRLIALLSQEGANFRVVTHEAVGKCEAVSEIRGTALGQGAKALVCKVKG NGVNQHVLAILAADQQADLSQLASHIGGLRASLASPAEVDELTGCVFGAIPPFSFHPKLK LVADPLLFERFDEIAFNAGMLDKSVILKTTDYLRIAQPELLNFRRTAQLACPFNKQNGQN QHNSNGKKR >gi|296493300|gb|ADTK01000201.1| GENE 8 9593 - 10039 625 148 aa, chain + ## HITS:1 COG:yeaL KEGG:ns NR:ns ## COG: yeaL COG2707 # Protein_GI_number: 16129743 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 148 1 148 148 188 100.0 3e-48 MFDVTLLILLGLAALGFISHNTTVAVSILVLIIVRVTPLSTFFPWIEKQGLSIGIIILTI GVMAPIASGTLPPSTLIHSFLNWKSLVAIAVGVIVSWLGGRGVTLMGSQPQLVAGLLVGT VLGVALFRGVPVGPLIAAGLVSLIVGKQ >gi|296493300|gb|ADTK01000201.1| GENE 9 9996 - 10814 337 272 aa, chain - ## HITS:1 COG:yeaM KEGG:ns NR:ns ## COG: yeaM COG2207 # Protein_GI_number: 16129744 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 272 2 273 273 513 98.0 1e-145 MHRLNLNGYEPDRHHEAAVAFCIHAGTDELTSPVHQHRKGQLILALHGAITCTVENALWM VPPQYAVWIPGGVEHSNQVTANAELCFLFIEPSAVTMPTTCCTLKISPLCRELILTLANR 
TTTQRAEPMTHRLIQVLFDELPQQPQQQLHLPVSSHPKIRTMVEMMAKGPVEWGSLGQWA GFFAMSERNLARLIVKETGLSFRQWRQQLQLIMALQGLVKGDTVQKVAHTLGYDSTTAFI TMFKKGLGQTPGRYIAGLTTVSPQSAKPDPRQ >gi|296493300|gb|ADTK01000201.1| GENE 10 10914 - 12095 581 393 aa, chain + ## HITS:1 COG:ECs2500 KEGG:ns NR:ns ## COG: ECs2500 COG2807 # Protein_GI_number: 15831754 # Func_class: P Inorganic ion transport and metabolism # Function: Cyanate permease # Organism: Escherichia coli O157:H7 # 1 393 1 393 393 641 99.0 0 MTCSTSLSGKNRIVLIAGILMIATTLRVTFTGAAPLLDTIRSAYSLTTAQTGLLTTLPLL AFALISPLAAPVARRFGMERSLFAALLLICAGIAIRSLPSPYLLFGGTAVIGGGIALGNV LLPGLIKRDFPHSVARLTGAYSLTMGAAAALGSAMVVPLAFNGFGWQGALLMLMCFPLLA LFLWLPQWRSQQHANLSTSRALHTRGIWRSPLAWQVTLFLGINSLVYYVIIGWLPAILIS HGYSEAQAGSLHGLLQLATAAPGLLIPLFLHHVKDQRGIAAFVALMCAVGAVELCFMPAH AITWTLLFGFGSGATMILGLTFIGLRASSAHQAAALSGMAQSVGYLLAACGPPLMGKIHD ANGDWSVPLMGVAILSLLMAIFGLCAGRDKEIR >gi|296493300|gb|ADTK01000201.1| GENE 11 12150 - 12497 427 115 aa, chain + ## HITS:1 COG:ECs2501 KEGG:ns NR:ns ## COG: ECs2501 COG3189 # Protein_GI_number: 15831755 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 115 8 122 122 221 98.0 2e-58 MNIQCKRVYDPAEQSDGYRILVDRLWPRGIKKTDLALDEWDKEITPSTELRKAFHGEVVD FATFREQYLAELAQHEQEGKRLADIAKKQPLTLLYAAKNTTQNHALVLADWLRSL >gi|296493300|gb|ADTK01000201.1| GENE 12 12519 - 12773 142 84 aa, chain - ## HITS:1 COG:yoaF KEGG:ns NR:ns ## COG: yoaF COG3042 # Protein_GI_number: 16129747 # Func_class: R General function prediction only # Function: Putative hemolysin # Organism: Escherichia coli K12 # 1 84 1 84 84 154 100.0 3e-38 MKIISFVLPCLLVLAGCSTPSQPEAPKPPQIGMANPASVYCQQKGGTLIPVQTAQGVSNN CKLPGGETIDEWALWRRDHPAGEK >gi|296493300|gb|ADTK01000201.1| GENE 13 12956 - 13981 756 341 aa, chain + ## HITS:1 COG:yeaP_2 KEGG:ns NR:ns ## COG: yeaP_2 COG2199 # Protein_GI_number: 16129748 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 168 341 1 174 174 350 100.0 2e-96 
MSDQIIARVSQSLAKEQSLESLVRQLLEMLEMVTDMESTYLTKVDVEARLQHIMFARNSQ KMHIPENFTVSWDYSLCKRAIDENCFFSDEVPDRWGDCIAARNLGITTFLSTPIHLPDGS FYGTLCAASSEKRQWSERAEQVLQLFAGLIAQYIQKEALVEQLREANAALIAQSYTDSLT GLPNRRAIFENLTTLFSLARHLNHKIMIAFIDLDNFKLINDRFGHNSGDLFLIQVGERLN TLQQNGEVIGRLGGDEFLVVSLNNENADISSLRERIQQQIRGEYHLGDVDLYYPGASLGI VEVDPETTDADSALHAADIAMYQEKKHKQKTPFVAHPALHS >gi|296493300|gb|ADTK01000201.1| GENE 14 14248 - 14496 347 82 aa, chain - ## HITS:1 COG:YPO1181 KEGG:ns NR:ns ## COG: YPO1181 COG2261 # Protein_GI_number: 16121476 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Yersinia pestis # 1 82 26 107 107 99 82.0 2e-21 MGILSWIIFGLIAGILAKWIMPGKDGGGFFMTILLGIVGAVVGGWISTLFGFGKVDGFNF GSFVVAVIGAIVVLFIYRKIKS >gi|296493300|gb|ADTK01000201.1| GENE 15 14644 - 14826 271 60 aa, chain - ## HITS:1 COG:no KEGG:ECB_01765 NR:ns ## KEGG: ECB_01765 # Name: yoaG # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 60 1 60 60 96 100.0 3e-19 MGKATYTVTVTNNSNGVSVDYETETPMTLLVPEVAAEVIKDLVNTVRSYDTENEHDVCGW >gi|296493300|gb|ADTK01000201.1| GENE 16 14830 - 15189 308 119 aa, chain - ## HITS:1 COG:ECs2506 KEGG:ns NR:ns ## COG: ECs2506 COG3615 # Protein_GI_number: 15831760 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein/domain, possibly involved in tellurite resistance # Organism: Escherichia coli O157:H7 # 1 119 1 119 119 238 98.0 2e-63 MLQIPQNYIHTRSTPFWNKQTAPAGIFERHLDKGTRPGVYPRLSVMHGAVKYFGYPDEHS AEPDQVILIEAGQFAVFPPEKWHNIEAMTDDTYFNIDFFVAPEVLMEGAQQRKVIHNGK >gi|296493300|gb|ADTK01000201.1| GENE 17 15362 - 16000 500 212 aa, chain - ## HITS:1 COG:ECs2507 KEGG:ns NR:ns ## COG: ECs2507 COG1280 # Protein_GI_number: 15831761 # Func_class: E Amino acid transport and metabolism # Function: Putative threonine efflux protein # Organism: Escherichia coli O157:H7 # 1 212 1 212 212 358 99.0 3e-99 MFAEYGVLNYWTYLVGAIFIVLVPGPNTLFVLKNSVSSGMKGGYLAACGVFIGDAVLMFL AWAGVATLIKTTPILFNIVRYLGAFYLLYLGSKILYATLKGKNSEAKSDEPQYGAIFKRA 
LILSLTNPKAILFYVSFFVQFIDVNAPHTGISFFILATTLELVSFCYLSFLIISGAFVTQ YIRTKKKLAKVGNSLIGLMFVGFAARLATLQS >gi|296493300|gb|ADTK01000201.1| GENE 18 16127 - 17050 890 307 aa, chain - ## HITS:1 COG:ECs2508 KEGG:ns NR:ns ## COG: ECs2508 COG0583 # Protein_GI_number: 15831762 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 307 8 314 314 627 100.0 1e-180 MNNLPLLNDLRVFMLVARRAGFAAVAEELGVSPAFVSKRIALLEQTLNVVLLHRTTRRVT ITEEGERIYEWAQRILQDVGQMMDELSDVRQVPQGMLRIISSFGFGRQVVAPALSALAKA YPQLELRFDVEDRLVDLVNEGVDLDIRIGDDIAPNLIARKLATNYRILCASPEFIAQHGA PKHLTDLSALPCLVIKERDHPFGVWQLRNKEGHHAIKVTGPLSSNHGEIVHQWCLDGQGI ALRSWWDVSENIASGHLVQVLPEYYQPANVWAVYVSRLATSAKVRITVEFLRQYFAEHYP NFSLEHA >gi|296493300|gb|ADTK01000201.1| GENE 19 17153 - 18238 1058 361 aa, chain + ## HITS:1 COG:yeaU KEGG:ns NR:ns ## COG: yeaU COG0473 # Protein_GI_number: 16129754 # Func_class: C Energy production and conversion; E Amino acid transport and metabolism # Function: Isocitrate/isopropylmalate dehydrogenase # Organism: Escherichia coli K12 # 1 361 1 361 361 763 100.0 0 MMKTMRIAAIPGDGIGKEVLPEGIRVLQAAAERWGFALSFEQMEWASCEYYSHHGKMMPD DWHEQLSRFDAIYFGAVGWPDTVPDHISLWGSLLKFRREFDQYVNLRPVRLFPGVPCPLA GKQPGDIDFYVVRENTEGEYSSLGGRVNEGTEHEVVIQESVFTRRGVDRILRYAFELAQS RPRKTLTSATKSNGLAISMPYWDERVEAMAENYPEIRWDKQHIDILCARFVMQPERFDVV VASNLFGDILSDLGPACTGTIGIAPSANLNPERTFPSLFEPVHGSAPDIYGKNIANPIAT IWAGAMMLDFLGNGDERFQQAHNGILAAIEEVIAHGPKTPDMKGNATTPQVADAICKIIL R >gi|296493300|gb|ADTK01000201.1| GENE 20 18489 - 20099 1622 536 aa, chain + ## HITS:1 COG:ECs2510 KEGG:ns NR:ns ## COG: ECs2510 COG1292 # Protein_GI_number: 15831764 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Choline-glycine betaine transporter # Organism: Escherichia coli O157:H7 # 56 536 1 481 481 907 99.0 0 MMSNVKKKDVPLISISLVAILFIAAALSLFPQQSADAANAIYTFVTRTLGSAVQVLVLLA MGLVIYLATSKYGNIRLGEGKPEYSTLSWLFMFICAGLGSSTLYWGVAEWAYYYQTPGLN IAPRSQQALEFSVPYSFFHWGISAWATYTLASLIMAYHFHVRKNKGLSLSGIIAAITGVR 
PQGPWGKLVDLMFLIATVGALTISLVVTAATFTRGLSALTGLPDNFTVQAFVILLSGGIF CLSSWIGINNGLQRLSKMVGWGAFLLPLLVLIVGPTEFITNSIINAIGLTTQNFLQMSLF TDPLGDGSFTRNWTVFYWLWWISYTPGVAMFVTRVSRGRKIKEVIWGLILGSTVGCWFFF GVMESYAIHQFINGVINVPQVLETLGGETAVQQVLMSLPAGKLFLAAYLGVMIIFLASHM DAVAYTMAATSTRNLQEGDDPDRGLRLFWCVVITLIPLSILFTGASLETMKTTVVLTALP FLVILLVKVGGFIRWLKQDYADIPAHQVEHYLPQTPVEALEKTPVLPAGTVFKGDN >gi|296493300|gb|ADTK01000201.1| GENE 21 20131 - 21255 1162 374 aa, chain + ## HITS:1 COG:ECs2511 KEGG:ns NR:ns ## COG: ECs2511 COG4638 # Protein_GI_number: 15831765 # Func_class: P Inorganic ion transport and metabolism; R General function prediction only # Function: Phenylpropionate dioxygenase and related ring-hydroxylating dioxygenases, large terminal subunit # Organism: Escherichia coli O157:H7 # 1 374 1 374 374 798 99.0 0 MNNLSPDFVLPENFCANPQEAWTIPARFYTDQNAFEHEKENVFAKSWICVAHSSELANAN DYVTREIIGESIVLVRGRDKVLRAFYNVCPHRGHQLLSGEGKAKNVITCPYHAWAFKLDG NLAHARNCENVANFDSDKAQLVPVRLEEYAGFVFINMDPNATSVEDQLPGLGAKVLEACP EVQDLKLAARFTTRTPANWKNIVDNYLECYHCGPAHPGFSDSVQVDRYWHTMHGNWTLQY GFAKPSEQSFKFEEGTDAAFHGFWLWPCTMLNVTPIKGMMTVIYEFPVDSETTLQNYDIY FTNEELTDEQKSLIEWYRDVFRPEDLRLVESVQKGLKSRGYRGQGRIMADSSGSGISEHG IAHFHNLLAQVFKD >gi|296493300|gb|ADTK01000201.1| GENE 22 21311 - 22276 629 321 aa, chain + ## HITS:1 COG:yeaX KEGG:ns NR:ns ## COG: yeaX COG1018 # Protein_GI_number: 16129757 # Func_class: C Energy production and conversion # Function: Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 # Organism: Escherichia coli K12 # 1 321 1 321 321 642 98.0 0 MSDYQMFEVQVSQVEPLTEQVKRFTLVATDGKPLPAFTGGSHIIVQMSDGDNHYSNAYSL LSSPHDTSCYQIAVRLEENSRGGSRFLHQQVKVGDRLTISTPNNLFALIPSARKHLFIAG GIGITPFLSHMAELQHSDVDWQLHYCSRNPESCAFRDELVQHPQAEKVHLHHSSTGTRLE LARLLADIEPGTHVYTCGPEALNEAVRSEAARLDIVADTLHFEQFAIEDKTGDAFTLVLA RSGKEFVVPEEMTILQVIENNKAAKVECLCREGVCGTCETAILEGEADHRDQYFSDEERA SQQSMLICCSRAKGKRLVLDL >gi|296493300|gb|ADTK01000201.1| GENE 23 22330 - 23445 1009 371 aa, chain - ## HITS:1 COG:ECs2513 KEGG:ns NR:ns ## COG: ECs2513 COG0349 # 
Protein_GI_number: 15831767 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonuclease D # Organism: Escherichia coli O157:H7 # 1 371 5 375 375 726 99.0 0 MITTDDALASLCEAVRAFPAIALDTEFVRTRTYYPQLGLIQLFDGEHLALIDPLGITDWS PLKAILRDPSITKFLHAGSEDLEVFLNVFGELPQPLIDTQILAAFCGRPMSWGFASMVEE YSGVTLDKSESRTDWLARPLTERQCEYAAADVWYLLPITAKLMVETEASGWLPAALDECR LMQMRRQEVVAPEDAWRDITNAWQLRTRQLACLQLLADWRLRKARERDLAVNFVVREEHL WSVARYMPGSLGELDSLGLSGSEIRFHGKTLLALVEKAQTLPEEALPQPMLNLMDMPGYR KVFKAIKSLITDVSETHKISAELLASRRQINQLLNWHWKLKPQNNLPELISGWRGELMAE ALHNLLQEYPQ >gi|296493300|gb|ADTK01000201.1| GENE 24 23527 - 25278 1565 583 aa, chain - ## HITS:1 COG:fadD KEGG:ns NR:ns ## COG: fadD COG0318 # Protein_GI_number: 16129759 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Escherichia coli K12 # 23 583 1 561 561 1149 99.0 0 MLTTCISFGVAMTTNTHFRGEELKKVWLNRYPADVPTEINPDRYQSLVDMFEQSVARYAD QPAFVNMGEVMTFRKLEERSRAFAAYLQQGLGLKKGDRVALMMPNLLQYPVALFGILRAG MIVVNVNPLYTPRELEHQLNDSGASAIVIVSNFAHTLEKVVDKTAVQHVILTRMGDQLST AKGTVVNFVVKYIKRLVPKYHLPDAISFRSALHNGYRMQYVKPELVPEDLAFLQYTGGTT GVAKGAMLTHRNMLANLEQVNATYGPLLHPGKELVVTALPLYHIFALTINCLLFIELGGQ NLLITNPRDIPGLVKELAKYPFTAITGVNTLFNALLNNKEFQQLDFSSLHLSAGGGMPVQ QVVAERWVKLTGQYLLEGYGLTECAPLVSVNPYDIDYHSGSIGLPVPSTEAKLVDDDDNE VPPGQPGELCVKGPQVMLGYWQRPDATDEIIKNGWLHTGDIAVMDEEGFLRIVDRKKDMI LVSGFNVYPNEIEDVVMQHPGVQEVAAVGVPSGSSGEAVKIFVVKKDPSLTEESLVTFCR RQLTGYKVPKLVEFRDELPKSNVGKILRRELRDEARGKVDNKA >gi|296493300|gb|ADTK01000201.1| GENE 25 25417 - 25998 570 193 aa, chain - ## HITS:1 COG:ECs2515 KEGG:ns NR:ns ## COG: ECs2515 COG3065 # Protein_GI_number: 15831769 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Starvation-inducible outer membrane lipoprotein # Organism: Escherichia coli O157:H7 # 1 193 1 193 193 360 98.0 1e-100 MAVQKNVIKGILAGTFALMLSGCVTVPDAIKGSSPTPQQDLVRVMSAPQLYVGQEARFGG 
KVVAVQNQQGKTRLEIATVPLDSGARPTLGEPSRGRIYADVNGFLDPVDFRGQLVTVVGP ITGAVDGKIGNTPYKFMVMQVTGYKRWHLTQQVIMPPQPIDPWFYGGRGWPYGYGGWGWY NPGPARVQTVVTE >gi|296493300|gb|ADTK01000201.1| GENE 26 26038 - 26733 652 231 aa, chain - ## HITS:1 COG:ECs2516 KEGG:ns NR:ns ## COG: ECs2516 COG1214 # Protein_GI_number: 15831770 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Inactive homolog of metal-dependent proteases, putative molecular chaperone # Organism: Escherichia coli O157:H7 # 1 231 1 231 231 448 99.0 1e-126 MRILAIDTATEACSVALWNDGTVNAHFELCPREHTQRILPMVQDILTTSGTSLTDINALA YGRGPGSFTGVRIGIGIAQGLALGAELPMIGVSTLMTMAQGAWRKNGATRVLAAIDARMG EVYWAEYQRDENGIWRGEETEAVLKPELVHERMQQLSGEWVTVGTGWQAWPDLGKESGLV LRDGEVLLPAAEDMLPIACQMFAEGKTVAVEHAEPVYLRNNVAWKKLPGKE >gi|296493300|gb|ADTK01000201.1| GENE 27 26791 - 28701 1511 636 aa, chain - ## HITS:1 COG:ECs2517 KEGG:ns NR:ns ## COG: ECs2517 COG1199 # Protein_GI_number: 15831771 # Func_class: K Transcription; L Replication, recombination and repair # Function: Rad3-related DNA helicases # Organism: Escherichia coli O157:H7 # 1 636 1 636 636 1265 100.0 0 MTDDFAPDGQLAKAIPGFKPREPQRQMAVAVTQAIEKGQPLVVEAGTGTGKTYAYLAPAL RAKKKVIISTGSKALQDQLYSRDLPTVSKALKYTGNVALLKGRSNYLCLERLEQQALAGG DLPVQILSDVILLRSWSNQTVDGDISTCVSVAEDSQAWPLVTSTNDNCLGSDCPMYKDCF VVKARKKAMDADVVVVNHHLFLADMVVKESGFGELIPEADVMIFDEAHQLPDIASQYFGQ SLSSRQLLDLAKDITIAYRTELKDTQQLQKCADRLAQSAQDFRLQLGEPGYRGNLRELLA NPQIQRAFLLLDDTLELCYDVAKLSLGRSALLDAAFERATLYRTRLKRLKEINQPGYSYW YECTSRHFTLALTPLSVADKFKELMAQKPGSWIFTSATLSVNDDLHHFTSRLGIEQAESL LLPSPFDYSRQALLCVPRNLPQTNQPGSARQLAAMLRPIIEANNGRCFMLCTSHAMMRDL AEQFRATMTLPVLLQGETSKGQLLQQFVSAGNALLVATSSFWEGVDVRGDTLSLVIIDKL PFTSPDDPLLKARMEDCRLRGGDPFDEVQLPDAVITLKQGVGRLIRDADDRGVLVICDNR LVMRPYGATFLASLPPAPRTRDIARAVRFLAIPSSR >gi|296493300|gb|ADTK01000201.1| GENE 28 28833 - 29177 522 114 aa, chain + ## HITS:1 COG:yoaB KEGG:ns NR:ns ## COG: yoaB COG0251 # Protein_GI_number: 16129763 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 
Putative translation initiation inhibitor, yjgF family # Organism: Escherichia coli K12 # 1 114 17 130 130 209 100.0 9e-55 MTIVRIDAEARWSDVVIHNNTLYYTGVPENLDADAFEQTANTLAQIDAVLEKQGSNKSSI LDATIFLADKNDFAAMNKAWDAWVVAGHAPVRCTVQAGLMNPKYKVEIKIVAAV >gi|296493300|gb|ADTK01000201.1| GENE 29 29600 - 29899 230 99 aa, chain + ## HITS:1 COG:no KEGG:S1533 NR:ns ## KEGG: S1533 # Name: not_defined # Def: hypothetical protein # Organism: S.flexneri_2457T # Pathway: not_defined # 1 99 21 119 119 187 100.0 7e-47 MPAVIDKALDFIGAMDVSAPTPSSMNESTAKGIFKYLKELGVPASAADITARADQEGWNP GFTEKMVGWAKKMETGERSVIKNPEYFSTYMQEELKALV >gi|296493300|gb|ADTK01000201.1| GENE 30 30019 - 30198 331 59 aa, chain - ## HITS:1 COG:ECs2520 KEGG:ns NR:ns ## COG: ECs2520 COG3140 # Protein_GI_number: 15831774 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 59 1 59 59 83 100.0 1e-16 MFAGLPSLTHEQQQKAVERIQELMAQGMSSGQAIALVAEELRANHSGERIVARFEDEDE >gi|296493300|gb|ADTK01000201.1| GENE 31 30272 - 31633 1053 453 aa, chain + ## HITS:1 COG:ECs2521 KEGG:ns NR:ns ## COG: ECs2521 COG0147 # Protein_GI_number: 15831775 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Escherichia coli O157:H7 # 1 453 1 453 453 922 98.0 0 MKTLSPAVITLPWRQDAAEFYFSRLSHLPWAMLLHSGYADHPYSRFDIVVADPICTLTTF GKETVVSESEKRTTTTDDPLQVLQQVLDRADIRPTHNEDLPFQGGALGLFGYDLGRRFES LPEIAEQDIVLPDMAVGIYDWALIVDHQRHTVSLLSHNDVNARRAWLESQQFSPQEDFTL TSDWQSNMTREQYGEKFRQVQEYLHSGDCYQVNLAQRFHATYSGDEWQAFLQLNQANRAP FSAFLRLEQGAILSLSPERFILCDNSEIQTRPIKGTLPRLPDPQEDSKQAEKLANSAKDR AENLMIVDLMRNDIGRVAVAGSVKVPELFVVEPFPAVHHLVSTITARLPEQLHASDLLRA AFPGGSITGAPKVRAMEIIDELEPQRRNAWCGSIGYLSFCGNMDTSITIRTLTAINGQIY CSAGGGIVADSQEEAEYQETFDKVNKILRQLEK >gi|296493300|gb|ADTK01000201.1| GENE 32 31637 - 32215 410 192 aa, chain + ## HITS:1 COG:ECs2522 KEGG:ns NR:ns ## COG: ECs2522 COG0494 # Protein_GI_number: 15831776 # 
Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli O157:H7 # 1 192 1 192 192 340 99.0 8e-94 MEYRSLTLDYFLSRFQLLRPQINRETLNHRQAAVLIPIVRRPQPGLLLTQRSIHLRKHAG QVAFPGGAVDDTDASVIAAALREAEEEVAIPPSAVEVIGVLPPVDSVTGYQVTPVVGIIP PDLPYRASEDEVSAVFEMPLAQALHLGRYHPLDIYRRGDSHRVWLSWYEQYFVWGMTAGI IRELALQIGVKP >gi|296493300|gb|ADTK01000201.1| GENE 33 32399 - 33763 1485 454 aa, chain + ## HITS:1 COG:sdaA KEGG:ns NR:ns ## COG: sdaA COG1760 # Protein_GI_number: 16129768 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Escherichia coli K12 # 1 454 1 454 454 923 99.0 0 MISLFDMFKVGIGPSSSHTVGPMKAGKQFVDDLVEKGLLDSVTRVAVDVYGSLSLTGKGH HTDIAIIMGLAGNEPATVDIDSIPGFIRDVEERERLLLAQGRHEVDFPRDNGMRFHNGNL PLHENGMQIHAYNGDEVVYSKTYYSIGGGFIVDEEHFGQDAANEVSVPYPFKSATELLAY CNETSYSLSGLAMQNELALHSKKEIDEYFAHVWQTMQACIDRGMNTEGVLPGPLRVPRRA SALRRMLVSSDKLSNDPMNVIDWVNMFALAVNEENAAGGRVVTAPTNGACGIVPAVLAYY DHFIESVSPDIYTRYFMAAGAIGALYKMNASISGAEVGCQGEVGVACSMAAAGLAELLGG SPEQVCVAAEIGMEHNLGLTCDPVAGQVQVPCIERNAIASVKAINAARMALRRTSAPRVS LDKVIETMYETGKDMNAKYRETSRGGLAIKVQCD >gi|296493300|gb|ADTK01000201.1| GENE 34 33894 - 35492 1116 532 aa, chain + ## HITS:1 COG:yoaD_2 KEGG:ns NR:ns ## COG: yoaD_2 COG2200 # Protein_GI_number: 16129769 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 261 532 1 272 272 561 100.0 1e-159 MQKAQRIIKTYRRNRMIVCTICALVTLASTLSVRFISQRNLNQQRVVQFANHAVEELDKV LLPLQAGSEVLLPLIGLPCSVAHLPLRKQAAKLQTVRSIGLVQDGTLYCSSIFGYRNVPV VDILAELPAPQPLLRLTIDRALIKGSPVLIQWTPAAGSSNAGVMEMINIDLLTAMLLEPQ LPQISSASLTVDKRHLLYGNGLVDSLPQPEDNENYQVSSQRFPFTINVNGPGATALAWHY LPTQLPLAVLLSLLVGYIAWLATAYRMSFSREINLGLAQHEFELFCQPLLNARSQQCIGV EILLRWNNPRQGWISPDVFIPIAEEHHLIVPLTRYVMAETIRQRHVFPMSSQFHVGINVA PSHFRRGVLIKDLNQYWFSAHPIQQLILEITERDALLDVDYRIARELHRKNVKLAIDDFG TGNSSFSWLETLRPDVLKIDKSFTAAIGSDAVNSTVTDIIIALGQRLNIELVAEGVETQE 
QAKYLRRHGVHILQGYLYAQPMPLRDFPKWLAGSQPPPARHNGHITPIMPLR >gi|296493300|gb|ADTK01000201.1| GENE 35 35496 - 37052 1795 518 aa, chain - ## HITS:1 COG:yoaE_2 KEGG:ns NR:ns ## COG: yoaE_2 COG1253 # Protein_GI_number: 16129770 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Escherichia coli K12 # 231 518 1 288 288 540 100.0 1e-153 MEFLMDPSIWAGLLTLVVLEIVLGIDNLVFIAILADKLPPKQRDKARLLGLSLALIMRLG LLSLILWMVTLTKPLFTVMDFSFSGRDLIMLFGGIFLLFKATTELHERLENRDHDSGHGK GYASFWVVVTQIVILDAVFSLDAVITAVGMVNHLPVMMAAVVIAMAVMLLASKPLTRFVN QHPTVVVLCLSFLLMIGLSLVAEGFGFHIPKGYLYAAIGFSIIIEVFNQIARRNFIRHQS TLPLRARTADAILRLMGGKRQANVQHDADNPMPMPIPEGAFAEEERYMINGVLTLASRSL RGIMTPRGEISWVDANLGVDEIREQLLSSPHSLFPVCRGELDEIIGIVRAKELLVALEEG VDVAAIASASPAIIVPETLDPINLLGVLRRARGSFVIVTNEFGVVQGLVTPLDVLEAIAG EFPDADETPEIITDGDGWLVKGGTDLHALQQALDVEHLADDDDIATVAGLVISANGHIPR VGDVIDVGPLHITIIEANDYRVDLVRIVKEQPAHDEDE >gi|296493300|gb|ADTK01000201.1| GENE 36 37515 - 38486 1157 323 aa, chain + ## HITS:1 COG:ECs2527_2 KEGG:ns NR:ns ## COG: ECs2527_2 COG3444 # Protein_GI_number: 15831781 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB # Organism: Escherichia coli O157:H7 # 158 323 1 166 166 311 100.0 1e-84 MTIAIVIGTHGWAAEQLLKTAEMLLGEQENVGWIDFVPGENAETLIEKYNAQLAKLDTTK GVLFLVDTWGGSPFNAASRIVVDKEHYEVIAGVNIPMLVETLMARDDDPSFDELVALAVE TGREGVKALKAKPVEKAAPAPAAAAPKAAPTPAKPMGPNDYMVIGLARIDDRLIHGQVAT RWTKETNVSRIIVVSDEVAADTVRKTLLTQVAPPGVTAHVVDVAKMIRVYNNPKYAGERV MLLFTNPTDVERLVEGGVKITSVNVGGMAFRQGKTQVNNAVSVDEKDIEAFKKLNARGIE LEVRKVSTDPKLKMMDLISKIDK >gi|296493300|gb|ADTK01000201.1| GENE 37 38549 - 39349 993 266 aa, chain + ## HITS:1 COG:ECs2528 KEGG:ns NR:ns ## COG: ECs2528 COG3715 # Protein_GI_number: 15831782 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC # Organism: 
Escherichia coli O157:H7 # 1 266 1 266 266 394 100.0 1e-109 MEITTLQIVLVFIVACIAGMGSILDEFQFHRPLIACTLVGIVLGDMKTGIIIGGTLEMIA LGWMNIGAAVAPDAALASIISTILVIAGHQSIGAGIALAIPLAAAGQVLTIIVRTITVAF QHAADKAADNGNLTAISWIHVSSLFLQAMRVAIPAVIVALSVGTSEVQNMLNAIPEVVTN GLNIAGGMIVVVGYAMVINMMRAGYLMPFFYLGFVTAAFTNFNLVALGVIGTVMAVLYIQ LSPKYNRVAGAPAQAAGNNDLDNELD >gi|296493300|gb|ADTK01000201.1| GENE 38 39362 - 40213 1007 283 aa, chain + ## HITS:1 COG:ECs2529 KEGG:ns NR:ns ## COG: ECs2529 COG3716 # Protein_GI_number: 15831783 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID # Organism: Escherichia coli O157:H7 # 1 283 4 286 286 559 100.0 1e-159 MVDTTQTTTEKKLTQSDIRGVFLRSNLFQGSWNFERMQALGFCFSMVPAIRRLYPENNEA RKQAIRRHLEFFNTQPFVAAPILGVTLALEEQRANGAEIDDGAINGIKVGLMGPLAGVGD PIFWGTVRPVFAALGAGIAMSGSLLGPLLFFILFNLVRLATRYYGVAYGYSKGIDIVKDM GGGFLQKLTEGASILGLFVMGALVNKWTHVNIPLVVSRITDQTGKEHVTTVQTILDQLMP GLVPLLLTFACMWLLRKKVNPLWIIVGFFVIGIAGYACGLLGL Prediction of potential genes in microbial genomes Time: Mon May 16 15:40:02 2011 Seq name: gi|296493299|gb|ADTK01000202.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont544.10, whole genome shotgun sequence Length of sequence - 6886 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 8, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 4/0.500 + CDS 34 - 492 307 ## COG4811 Predicted membrane protein + Term 686 - 726 0.8 2 2 Tu 1 . + CDS 921 - 1487 620 ## COG1971 Predicted membrane protein + Term 1523 - 1564 2.1 - Term 1196 - 1235 3.0 3 3 Tu 1 3/1.000 - CDS 1484 - 2209 559 ## COG0500 SAM-dependent methyltransferases - Prom 2392 - 2451 3.7 - Term 2380 - 2410 1.0 4 4 Tu 1 . - CDS 2459 - 2668 351 ## COG1278 Cold shock proteins - Prom 2740 - 2799 2.6 - Term 3436 - 3479 6.1 5 5 Tu 1 . - CDS 3493 - 3780 249 ## ECS88_1877 hypothetical protein - Prom 3800 - 3859 6.0 + Prom 4074 - 4133 5.2 6 6 Tu 1 . 
+ CDS 4157 - 4396 220 ## ECO103_2017 hypothetical protein + Term 4499 - 4541 2.3 - Term 4485 - 4529 3.1 7 7 Tu 1 . - CDS 4539 - 5330 894 ## COG1414 Transcriptional regulator - Prom 5369 - 5428 6.3 + Prom 5328 - 5387 8.3 8 8 Tu 1 . + CDS 5507 - 6880 1027 ## COG0477 Permeases of the major facilitator superfamily Predicted protein(s) >gi|296493299|gb|ADTK01000202.1| GENE 1 34 - 492 307 152 aa, chain + ## HITS:1 COG:yobD KEGG:ns NR:ns ## COG: yobD COG4811 # Protein_GI_number: 16129774 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 152 1 152 152 256 100.0 1e-68 MTITDLVLILFIAALLAFAIYDQFIMPRRNGPTLLAIPLLRRGRIDSVIFVGLIVILIYN NVTNHGALITTWLLSALALMGFYIFWIRVPKIIFKQKGFFFANVWIEYSRIKAMNLSEDG VLVMQLEQRRLLIRVRNIDDLEKIYKLLVSTQ >gi|296493299|gb|ADTK01000202.1| GENE 2 921 - 1487 620 188 aa, chain + ## HITS:1 COG:ECs2531 KEGG:ns NR:ns ## COG: ECs2531 COG1971 # Protein_GI_number: 15831785 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 188 19 206 206 324 100.0 5e-89 MNITATVLLAFGMSMDAFAASIGKGATLHKPKFSEALRTGLIFGAVETLTPLIGWGMGML ASRFVLEWNHWIAFVLLIFLGGRMIIEGFRGADDEDEEPRRRHGFWLLVTTAIATSLDAM AVGVGLAFLQVNIIATALAIGCATLIMSTLGMMVGRFIGSIIGKKAEILGGLVLIGIGVQ ILWTHFHG >gi|296493299|gb|ADTK01000202.1| GENE 3 1484 - 2209 559 241 aa, chain - ## HITS:1 COG:rrmA KEGG:ns NR:ns ## COG: rrmA COG0500 # Protein_GI_number: 16129776 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli K12 # 1 241 29 269 269 496 100.0 1e-140 MAKEGYVNLLPVQHKRSRDPGDSAEMMQARRAFLDAGHYQPLRDAIVAQLRERLDDKATA VLDIGCGEGYYTHAFADALPEITTFGLDVSKVAIKAAAKRYPQVTFCVASSHRLPFSDTS MDAIIRIYAPCKAEELARVVKPGGWVITATPGPRHLMELKGLIYNEVHLHAPHAEQLEGF TLQQSAELCYPMRLRGDEAVALLQMTPFAWRAKPEVWQTLAAKEVFDCQTDFNIHLWQRS Y >gi|296493299|gb|ADTK01000202.1| GENE 4 2459 - 2668 351 69 aa, chain - ## HITS:1 COG:ECs2533 
KEGG:ns NR:ns ## COG: ECs2533 COG1278 # Protein_GI_number: 15831787 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Escherichia coli O157:H7 # 1 69 1 69 69 120 100.0 5e-28 MAKIKGQVKWFNESKGFGFITPADGSKDVFVHFSAIQGNGFKTLAEGQNVEFEIQDGQKG PAAVNVTAI >gi|296493299|gb|ADTK01000202.1| GENE 5 3493 - 3780 249 95 aa, chain - ## HITS:1 COG:no KEGG:ECS88_1877 NR:ns ## KEGG: ECS88_1877 # Name: yebO # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 95 1 95 95 134 100.0 1e-30 MNEVVNSGVMNIASLVVSVVVLLIGLILWFFINRASSRTNEQIELLEALLDQQKRQNALL RRLCEANEPEKADKKTVESQKSVEDEDIIRLVAER >gi|296493299|gb|ADTK01000202.1| GENE 6 4157 - 4396 220 79 aa, chain + ## HITS:1 COG:no KEGG:ECO103_2017 NR:ns ## KEGG: ECO103_2017 # Name: yobH # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 79 1 79 79 141 100.0 8e-33 MRFIIRTVMLIALVWIGLLLSGYGVLIGSKENAAGLGLQCTYLTARGTSTVQYLHTKSGF LGITDCPLLRKSNIVVDNG >gi|296493299|gb|ADTK01000202.1| GENE 7 4539 - 5330 894 263 aa, chain - ## HITS:1 COG:ECs2537 KEGG:ns NR:ns ## COG: ECs2537 COG1414 # Protein_GI_number: 15831791 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 263 1 263 263 494 100.0 1e-140 MANADLDKQPDSVSSVLKVFGILQALGEEREIGITELSQRVMMSKSTVYRFLQTMKTLGY VAQEGESEKYSLTLKLFELGARALQNVDLIRSADIQMREISRLTKETIHLGALDEDSIVY IHKIDSMYNLRMYSRIGRRNPLYSTAIGKVLLAWRDRDEVKQILEGVEYKRSTERTITST EALLPVLDQVREQGYGEDNEEQEEGLRCIAVPVFDRFGVVIAGLSISFPTLRFSEERLQE YVAMLHTAARKISAQMGYHDYPF >gi|296493299|gb|ADTK01000202.1| GENE 8 5507 - 6880 1027 457 aa, chain + ## HITS:1 COG:yebQ KEGG:ns NR:ns ## COG: yebQ COG0477 # Protein_GI_number: 16129782 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 457 38 494 494 748 99.0 0 
MPKVQADGLPLPQRYGAILTIVIGISMAVLDGAIANVALPTIATDLHATPASSIWVVNAY QIAIVISLLSFSFLGDMFGYRRIYKCGLVVFLLSSLFCALSDSLQMLTLARVIQGFGGAA LMSVNTALIRLIYPQRFLGRGMGINSFIVAVSSAAGPTIAAAILSIASWKWLFLINVPLG IIALLLAMRFLPPNGSRASKPRFDLPSAVMNALTFGLLITALSGFAQGQSLTLIAAELVV MVVVGIFFIRRQLSLPVPLLPVDLLRIPLFSLSICTSVCSFCAQMLAMVSLPFYLQTVLG RSEVETGLLLTPWPLATMVMAPLAGYLIERVHAGLLGALGLFIMAAGLFSLVLLPASPAD INIIWPMILCGAGFGLFQSPNNHTIITSAPRERSGGASGMLGTARLLGQSSGAALVALML NQFGDNGTHVSLMAAAILAVIAAFVSGLRITQPRSRA Prediction of potential genes in microbial genomes Time: Mon May 16 15:40:12 2011 Seq name: gi|296493298|gb|ADTK01000203.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont544.11, whole genome shotgun sequence Length of sequence - 20374 bp Number of predicted genes - 22, with homology - 22 Number of transcription units - 13, operones - 6 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 - CDS 44 - 925 868 ## COG0501 Zn-dependent protease with chaperone function - Prom 1029 - 1088 4.4 - Term 1075 - 1110 6.0 2 1 Op 2 7/0.000 - CDS 1117 - 3165 2158 ## COG0793 Periplasmic protease 3 1 Op 3 4/0.667 - CDS 3185 - 3883 552 ## COG3109 Activator of osmoprotectant transporter ProP - Term 3919 - 3953 3.1 4 1 Op 4 . - CDS 3980 - 4477 303 ## PROTEIN SUPPORTED gi|15902812|ref|NP_358362.1| hypothetical protein spr0768 - Prom 4544 - 4603 3.8 + Prom 4473 - 4532 4.8 5 2 Op 1 11/0.000 + CDS 4607 - 5890 795 ## COG2995 Uncharacterized paraquat-inducible protein A 6 2 Op 2 4/0.667 + CDS 5859 - 8492 2364 ## COG3008 Paraquat-inducible protein B + Term 8501 - 8536 7.2 7 3 Tu 1 . + CDS 8572 - 10011 1204 ## COG0144 tRNA and rRNA cytosine-C5-methylases + Prom 10016 - 10075 5.7 8 4 Op 1 . + CDS 10129 - 10365 226 ## ECSP_2410 hypothetical protein 9 4 Op 2 . + CDS 10386 - 10661 149 ## SSON_1324 hypothetical protein 10 5 Tu 1 . - CDS 10662 - 11318 531 ## COG0639 Diadenosine tetraphosphatase and related serine/threonine protein phosphatases - Term 11628 - 11685 4.4 11 6 Op 1 . 
- CDS 11714 - 12055 362 ## ECO111_2347 hypothetical protein
12 6 Op 2 8/0.000 - CDS 12068 - 12940 749 ## COG1276 Putative copper export protein
13 6 Op 3 . - CDS 12944 - 13318 360 ## COG2372 Uncharacterized protein, homolog of Cu resistance protein CopC
    - Prom 13347 - 13406 4.0
    + Prom 13376 - 13435 2.1
14 7 Tu 1 . + CDS 13457 - 13687 320 ## ECP_1786 DNA polymerase III subunit theta (EC:2.7.7.7)
    + Prom 13690 - 13749 3.7
15 8 Op 1 4/0.667 + CDS 13789 - 14445 476 ## COG0388 Predicted amidohydrolase
16 8 Op 2 . + CDS 14469 - 15131 743 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases
17 9 Tu 1 . - CDS 15128 - 17188 1873 ## COG1770 Protease II
    - Prom 17214 - 17273 4.3
18 10 Tu 1 . - CDS 17397 - 18056 801 ## COG2979 Uncharacterized protein conserved in bacteria
    - Prom 18213 - 18272 2.8
    + Prom 18140 - 18199 3.2
19 11 Tu 1 . + CDS 18263 - 18469 63 ## gi|188494878|ref|ZP_03002148.1| hypothetical protein Ec53638_0054
20 12 Op 1 . - CDS 18383 - 18739 206 ## ECIAI39_1203 hypothetical protein
21 12 Op 2 . - CDS 18806 - 19096 372 ## COG3141 Uncharacterized protein conserved in bacteria
    - Prom 19123 - 19182 5.0
    + Prom 19078 - 19137 4.7
22 13 Tu 1 .
+ CDS 19230 - 20373 1217 ## COG0027 Formate-dependent phosphoribosylglycinamide formyltransferase (GAR transformylase) Predicted protein(s) >gi|296493298|gb|ADTK01000203.1| GENE 1 44 - 925 868 293 aa, chain - ## HITS:1 COG:ECs2539 KEGG:ns NR:ns ## COG: ECs2539 COG0501 # Protein_GI_number: 15831793 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Escherichia coli O157:H7 # 1 293 1 293 293 550 100.0 1e-156 MMRIALFLLTNLAVMVVFGLVLSLTGIQSSSVQGLMIMALLFGFGGSFVSLLMSKWMALR SVGGEVIEQPRNERERWLVNTVATQARQAGIAMPQVAIYHAPDINAFATGARRDASLVAV STGLLQNMSPDEAEAVIAHEISHIANGDMVTMTLIQGVVNTFVIFISRILAQLAAGFMGG NRDEGEESNGNPLIYFAVATVLELVFGILASIITMWFSRHREFHADAGSAKLVGREKMIA ALQRLKTSYEPQEATSMMAFCINGKSKSLSELFMTHPPLDKRIEALRTGEYLK >gi|296493298|gb|ADTK01000203.1| GENE 2 1117 - 3165 2158 682 aa, chain - ## HITS:1 COG:ECs2540 KEGG:ns NR:ns ## COG: ECs2540 COG0793 # Protein_GI_number: 15831794 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Escherichia coli O157:H7 # 1 682 1 682 682 1320 99.0 0 MNMFFRLTALAGLLAIAGQTFAVEDITRADQIPVLKEETQHATVSERVTSRFTRSHYRQF DLDQAFSAKIFDRYLNLLDYSHNVLLASDVEQFAKKKTELGDELRSGKLDVFYDLYNLAQ KRRFERYQYALSVLEKPMDFTGNDTYNLDRSKAPWPKNEAELNALWDSKVKFDELSLKLA GKTDKEIRETLTRRYKFAIRRLAQTNSEDVFSLAMTAFAREIDPHTNYLSPRNTEQFNTE MSLSLEGIGAVLQMDDDYTVINSMVAGGPAAKSKAISVGDKIVGVGQTGKPMVDVIGWRL DDVVALIKGPKGSKVRLEILPAGKGTKTRTVTLTRERIRLEDRAVKMSVKTVGKEKVGVL DIPGFYVGLTDDVKVQLQKLEKQNVSSVIIDLRSNGGGALTEAVSLSGLFIPSGPIVQVR DNNGKVREDSDTDGQVFYKGPLVVLVDRFSASASEIFAAAMQDYGRALVVGEPTFGKGTV QQYRSLNRIYDQMLRPEWPALGSVQYTIQKFYRVNGGSTQRKGVTPDIIMPTGNEETETG EKFEDNALPWDSIDAATYVKSGDLTAFEPELLKEHNARIAKDPEFQNIMKDIARFNAMKD KRNIVSLNYAVREKENNEDDATRLARLNERFKREGKPELKKLDDLPKDYQEPDPYLDETV NIALDLAKLEKARPAEQPAPVK >gi|296493298|gb|ADTK01000203.1| GENE 3 3185 - 3883 552 232 aa, chain - ## HITS:1 COG:proQm KEGG:ns NR:ns ## COG: proQm COG3109 # Protein_GI_number: 16132234 # Func_class: T Signal 
transduction mechanisms # Function: Activator of osmoprotectant transporter ProP # Organism: Escherichia coli K12 # 1 232 1 232 232 385 99.0 1e-107 MENQPKLNSSKEVIAFLAERFPHCFSAEGEARPLKIGIFQDLVDRVAGEMNLSKTQLRSA LRLYTSSWRYLYGVKPGATRVDLDGNPCGELDEQHVEHARKQLEEAKARVQAQRAEQQAK KREAAAAAGEKEDAPRRERKPRPTTPRRKEGAERKPRAQKPVEKAPKTVKAPREEQHTPV SDISALTVGQALKVKAGQNAMDATVLEITKDGVRVQLNSGMSLIVRAEHLVF >gi|296493298|gb|ADTK01000203.1| GENE 4 3980 - 4477 303 165 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15902812|ref|NP_358362.1| hypothetical protein spr0768 [Streptococcus pneumoniae R6] # 3 159 6 161 165 121 41 4e-27 MNKTEFYADLNRDFNALMAGETSFLATLANTSALLYERLTDVNWAGFYLLEDDTLVLGPF QGKIACVRIPVGRGVCGTAVARNQVQRIEDVHAFDGHIACDAASNSEIVLPLVVKNQIIG VLDIDSTVFGRFTDEDEQGLRQLVAQLEKVLATTDYKKFFASVAG >gi|296493298|gb|ADTK01000203.1| GENE 5 4607 - 5890 795 427 aa, chain + ## HITS:1 COG:ECs2543 KEGG:ns NR:ns ## COG: ECs2543 COG2995 # Protein_GI_number: 15831797 # Func_class: S Function unknown # Function: Uncharacterized paraquat-inducible protein A # Organism: Escherichia coli O157:H7 # 1 427 1 427 427 845 99.0 0 MALNTPQITPTKKITVRAIGEELPRGDYQRCPQCDMLFSLPEINSHQSAYCPRCQAKIRD GRDWSLTRLAAMAFTMLLLMPFAWGEPLLHIWLLGIRIDANVMQGIWQMTKQGDTITGAM VFFCVIGAPLILVSSIAYLWFGNRLGMNLRPVLLMLERLKEWVMLDIYLVGIGVASIKVQ DYAHIQAGVGLFSFVALVILTTVTLSHLNVEELWERYYPQRPATRRDEKLRVCLGCHFTG YPDQRGRCPRCHIPLRLRRRHSLQKCWAALLASIVLLLPANLLPISIIYLNGGRQEDTIL SGIMSLASSNIAVAGIVFIASILVPFTKVIVMFTLLLSIHFKCQQGLRTRILLLRMVTWI GRWSMLDLFVISLTMSLINRDQILAFTMGPAAFYFGAAVILTILAVEWLDSRLLWDAHES GNARFDD >gi|296493298|gb|ADTK01000203.1| GENE 6 5859 - 8492 2364 877 aa, chain + ## HITS:1 COG:yebT KEGG:ns NR:ns ## COG: yebT COG3008 # Protein_GI_number: 16129787 # Func_class: R General function prediction only # Function: Paraquat-inducible protein B # Organism: Escherichia coli K12 # 1 877 3 879 879 1726 99.0 0 MSQETPASTTEAQIKNKRRISPFWLLPFIALMIAGWLIWDSYQDRGNTVTIDFMSADGIV PGRTPVRYQGVEVGTVQDISLSDDLRKIEVKVSIKSDMKDALREETQFWLVTPKASLAGV 
SGLDALVGGNYIGMMPGKGKEQDHFVALDTQPKYRLDNGDLMIHLQAPDLGSLNSGSLVY FRKIPVGKVYDYAINPNKQGVVIDVLIERRFTDLVKKGSRFWNVSGVDANVSVSGAKVKL ESLAALVNGAIAFDSPEESKPAEAEDTFGLYEDLAHSQRGVIIKLELPSGAGLTADSTPL MYQGLEVGQLTKLDLNPGGKVTGEMTVDPSVVTLLRENTRIELRNPKLSLSDANLSALLT GKTFELVPGDGEPRKEFVVVPGEKALLHEPDVLTLTLTAPESYGIDAGQPLILHGVQVGQ VIDRKLTSKGVTFTVAIEPQHRELVKGDSKFVVNSRVDVKVGLDGVEFLGASASEWINGG IRILPGDKGEMKASYPLYANLEKALENSLSDLPTTTVSLSAETLPDVQAGSVVLYRKFEV GEVITVRPRANAFDIDLHIKPEYRNLLTSNSVFWAEGGAKVQLNGSGLTVQASPLSRALK GAISFDNLSGASASQRKGDKRILYASETAARAVGGQITLHAFDAGKLAVGMPIRYLGIDI GQIQTLDLITARNEVQAKAVLYPEYVQTFARGGTRFSVVTPQISAAGVEHLDTILQPYIN VEPGRGNPRRDFELQEATITDSRYLDGLSIIVEAPEAGSLGIGTPVLFRGLEVGTVTGMT LGTLSDRVMIAMRISKRYQHLVRNNSVFWLASGYSLDFGLTGGVVKTGTFNQFIRGGIAF ATPPGTPLAPKAQEGKHFLLQESEPKEWREWGTALPK >gi|296493298|gb|ADTK01000203.1| GENE 7 8572 - 10011 1204 479 aa, chain + ## HITS:1 COG:yebU_1 KEGG:ns NR:ns ## COG: yebU_1 COG0144 # Protein_GI_number: 16129788 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Escherichia coli K12 # 1 383 3 385 385 783 98.0 0 MAQHTVYFPDAFLTQMREAMPSTLSFDDFLAACQRPLRRSIRVNTLKISVADFLQLTAPY GWTLTPIPWCEEGFWIERDNEDALPLGSTAEHLSGLFYIQEASSMLPVAALFADDNAPQR VMDVAAAPGSKTTQIAARMNNEGAILANEFSASRVKVLHANISRCGISNVALTHFDGRVF GAAVPEMFDAILLDAPCSGEGVVRKDPDALKNWSPESNQEIAATQRELIDSAFHALRPGG TLVYSTCTLNQEENEAVCLWLKETYPDAVEFLPLGDLFPGANKALTEEGFLHVFPQIYDC EGFFVARLRKTQAIPALPAPKYKVGNFPFSPVKDREAGQIRQATAGVGLNWDENLRLWQR DKELWLFPVGIEALIGKVRFSRLGIKLAETHNKGYRWQHEAVIALASPDNMNAFELTPQE AEEWYRGRDVYPQAAPVADDVLVTFQHQPIGLAKRIGSRLKNSYPRELVRDGKLFTGNA >gi|296493298|gb|ADTK01000203.1| GENE 8 10129 - 10365 226 78 aa, chain + ## HITS:1 COG:no KEGG:ECSP_2410 NR:ns ## KEGG: ECSP_2410 # Name: yebV # Def: hypothetical protein # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 78 1 78 78 157 100.0 2e-37 MKTSVRIGAFEIDDGELHGESPGDRTLTIPCKSDPDLCMQLDAWDAETSIPALLNGEHSV LYRTRYDQQSDAWIMRLA >gi|296493298|gb|ADTK01000203.1| GENE 9 10386 - 10661 149 91 aa, 
chain + ## HITS:1 COG:no KEGG:SSON_1324 NR:ns ## KEGG: SSON_1324 # Name: not_defined # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 91 1 91 91 192 100.0 2e-48 MAGYLSWLFPRCKISPKLNGTAPHFGDEMFALVLFVCYLDGGCEDIVVDVYNTEQQCLYS MSDQRIRHGGCFPIEDFIDGFWRPAQEYGDF >gi|296493298|gb|ADTK01000203.1| GENE 10 10662 - 11318 531 218 aa, chain - ## HITS:1 COG:pphA KEGG:ns NR:ns ## COG: pphA COG0639 # Protein_GI_number: 16129791 # Func_class: T Signal transduction mechanisms # Function: Diadenosine tetraphosphatase and related serine/threonine protein phosphatases # Organism: Escherichia coli K12 # 1 218 2 219 219 429 100.0 1e-120 MKQPAPVYQRIAGHQWRHIWLSGDIHGCLEQLRRKLWHCRFDPWRDLLISVGDVIDRGPQ SLRCLQLLEQHWVCAVRGNHEQMAMDALASQQMSLWLMNGGDWFIALADNQQKQAKTALE KCQHLPFILEVHSRTGKHVIAHADYPDDVYEWQKDVDLHQVLWSRSRLGERQKGQGITGA DHFWFGHTPLRHRVDIGNLHYIDTGAVFGGELTLVQLQ >gi|296493298|gb|ADTK01000203.1| GENE 11 11714 - 12055 362 113 aa, chain - ## HITS:1 COG:no KEGG:ECO111_2347 NR:ns ## KEGG: ECO111_2347 # Name: yebY # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 113 1 113 113 195 100.0 5e-49 MMKKSILTFLLLTSSAAALAAPQVITVSRFEVGKDKWAFNREEVMLTCRPGNALYVINPS TLVQYPLNDIAQKEVASGKTKAQPISVIQIDDPNNPGEKMSLAPFIERAEKLC >gi|296493298|gb|ADTK01000203.1| GENE 12 12068 - 12940 749 290 aa, chain - ## HITS:1 COG:yebZ KEGG:ns NR:ns ## COG: yebZ COG1276 # Protein_GI_number: 16129793 # Func_class: P Inorganic ion transport and metabolism # Function: Putative copper export protein # Organism: Escherichia coli K12 # 1 290 1 290 290 472 98.0 1e-133 MLAFTWIALRFIHFTSLMLVYGFAMYGAWLAPLTIRRLLAKRFLRLQQHAAVWSLISATA MLAVQGGLMGTGWADVFSPNIWQAVLQTQFGGVWLWQIVLALVTLIVALMQPRNMPRLLF MLTTAQFILLAGIGHATLNEGVTAKIHQTNHAIHLICAAAWFGGLLPVLWCMQLIKGRWR HQAIQALMRFSWCGHFAVIGVLASGVLNALLITGFPPTLTTYWGQLLLLKAILVMIMVVI ALANRYVLVPRMRQDEDRAAPWFEWMTKLEWAIGAVVLVIISLLATLEPF >gi|296493298|gb|ADTK01000203.1| GENE 13 12944 - 13318 360 124 aa, chain - ## HITS:1 COG:yobA KEGG:ns NR:ns ## COG: yobA COG2372 # 
Protein_GI_number: 16129794 # Func_class: R General function prediction only # Function: Uncharacterized protein, homolog of Cu resistance protein CopC # Organism: Escherichia coli K12 # 1 124 1 124 124 222 98.0 2e-58 MTSTARSLRYALAILTTSLVTPSVWAHAHLTHQYPAANAQVTAAPQAITLNFSEGVETGF SGAKITGPKNENIKTLPAKRNEQDQKQLIVPLADSLKPGTYIVDWHVVSVDGHKTKGHYT FSVK >gi|296493298|gb|ADTK01000203.1| GENE 14 13457 - 13687 320 76 aa, chain + ## HITS:1 COG:no KEGG:ECP_1786 NR:ns ## KEGG: ECP_1786 # Name: not_defined # Def: DNA polymerase III subunit theta (EC:2.7.7.7) # Organism: E.coli_536 # Pathway: Purine metabolism [PATH:ecp00230]; Pyrimidine metabolism [PATH:ecp00240]; Metabolic pathways [PATH:ecp01100]; DNA replication [PATH:ecp03030]; Mismatch repair [PATH:ecp03430]; Homologous recombination [PATH:ecp03440] # 1 76 30 105 105 136 100.0 2e-31 MLKNLAKLDQTEMDKVNVDLAAAGVAFKERYNMPVIAEAVEREQPEHLRSWFRERLIAHR LASVNLSRLPYEPKLK >gi|296493298|gb|ADTK01000203.1| GENE 15 13789 - 14445 476 218 aa, chain + ## HITS:1 COG:yobB KEGG:ns NR:ns ## COG: yobB COG0388 # Protein_GI_number: 16129796 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Escherichia coli K12 # 1 218 1 218 218 444 100.0 1e-125 MSFWKVAAAQYEPRKTSLTEQVAHHLEFVRAAARQQCQLLVFPSLSLLGCDYSRRALPAP PDLSLLDPLCYAATTWRMTIIAGLPVEYNDRFIRGIAVFAPWRKTPGIYHQSHGACLGRR SRTITVVDEQPQGMDMDPTCSLFTTGQCLGEPDLLASARRLQFFSHQYSIAVLMANARGN SALWDEYGRLIVRADRGSLLLVGQRSSQGWQGDIIPLR >gi|296493298|gb|ADTK01000203.1| GENE 16 14469 - 15131 743 220 aa, chain + ## HITS:1 COG:exoX KEGG:ns NR:ns ## COG: exoX COG0847 # Protein_GI_number: 16129797 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Escherichia coli K12 # 1 220 1 220 220 457 99.0 1e-129 MLRIIDTETCGLQGGIVEIASVDVIDGKIVNPMSQLVRPDRPISPQAMAIHRITEAMVAD KPWIEDVIPHYYGSEWYVAHNASFDRRVLPEMPGEWICTMKLARRLWPGIKYSNMALYKT RKLNVQTPPGLHHHRALYDCYITAALLIDIMNTSGWTAEQMADITGRPSLMTTFTFGKYR 
GKAVSDVAERDPGYLRWLFNNLDSMSPELRLTLKHYLENT >gi|296493298|gb|ADTK01000203.1| GENE 17 15128 - 17188 1873 686 aa, chain - ## HITS:1 COG:ptrB KEGG:ns NR:ns ## COG: ptrB COG1770 # Protein_GI_number: 16129798 # Func_class: E Amino acid transport and metabolism # Function: Protease II # Organism: Escherichia coli K12 # 1 686 1 686 686 1399 99.0 0 MLPKAARIPHAMTLHGDTRIDNYYWLRDDTRSQPEVLDYLQQENSYGHRVMASQQALQDR ILKEIIDRIPQREVSAPYIKNGYRYRQIYEPGCEYAIYQRQSAFSEEWDEWETLLDANKR AAHSEFYSMGGMAITPDNTIMALAEDFLSRRQYGIRFRNLETGNWYPELLDNVEPSFVWA NDSWTFYYVRKHPVTLLPYQVWRHAIGTPASQDKLIYEEKDDTYYVSLHKTTSKHYVVIH LASATTSEVRLLDAEMADAEPFVFLPRRKDHEYSLDHYQHRFYLRSNRHGKNFGLYRTRM RDEQQWEELIPPRENIMLEGFTLFTDWLVVEERQRGLTSLRQINRKTREVIGIAFDDPAY VTWIAYNPEPETARLRYGYSSMTTPDTLFELDMDTGERRVLKQTEVPGFDAANYRSEHLW IVARDGVEVPVSLVYHRKHYRKGHNPLLVYGYGSYGASIDADFSFSRLSLLDRGFVYAIV HVRGGGELGQQWYEDGKFLKKKNTFNDYLDACDALLKLGYGSPSLCYAMGGSAGGMLMGV AINQRPELFHGVIAQVPFVDVVTTMLDESIPLTTGEFEEWGNPQDPQYYEYMKSYSPYDN VTAQAYPHLLVTTGLHDSQVQYWEPAKWVAKLRELKTDDHLLLLCTDMDSGHGGKSGRFK SYEGVAMEYAFLVALAQGTLPATPAD >gi|296493298|gb|ADTK01000203.1| GENE 18 17397 - 18056 801 219 aa, chain - ## HITS:1 COG:ECs2556 KEGG:ns NR:ns ## COG: ECs2556 COG2979 # Protein_GI_number: 15831810 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 219 1 219 219 348 98.0 5e-96 MANWLNQLQSLLGQSSSSTSSSADQGLGKLLVPGALGGLAGLLVANKSARKLLTKYGTNA LLVGGGAVAGTVLWNKYKDKIRAAHQDEPQFGAQSTPLDERTERLILALVFAAKSDGHID AKERAAIDQQLREAGVEEQGRVLIEQAIEQPLDPQRLATGVRNEEEALEIYFLSCAAIDI DHFMERSYLNALGDALKIPQDVRDGIERDLEQQKRTLAE >gi|296493298|gb|ADTK01000203.1| GENE 19 18263 - 18469 63 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|188494878|ref|ZP_03002148.1| ## NR: gi|188494878|ref|ZP_03002148.1| hypothetical protein Ec53638_0054 [Escherichia coli 53638] # 1 68 1 68 68 132 100.0 6e-30 MLIARCDAVASSQAYDCGAFWGGQGTRAASANGQSNYQKPLTPLIFRHSRFAVHADLVMN IGTFTTHG >gi|296493298|gb|ADTK01000203.1| GENE 20 18383 - 18739 206 118 aa, chain - ## 
HITS:1 COG:no KEGG:ECIAI39_1203 NR:ns ## KEGG: ECIAI39_1203 # Name: yebF # Def: hypothetical protein # Organism: E.coli_IAI39 # Pathway: not_defined # 1 118 5 122 122 222 100.0 4e-57 MKKRGAFLGLLLVSACASVFAANNETSKSVTFPKCEGLDAAGIAASVKRDYQQNRVARWA DDQKIVGQADPVAWVSLQDIQGKDDKWSVPLTVRGKSADIHYQVSVDCKAGMAEYQRR >gi|296493298|gb|ADTK01000203.1| GENE 21 18806 - 19096 372 96 aa, chain - ## HITS:1 COG:ECs2558 KEGG:ns NR:ns ## COG: ECs2558 COG3141 # Protein_GI_number: 15831812 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 96 1 96 96 148 100.0 2e-36 MAVEVKYVVIREGEEKMSFTSKKEADAYDKMLDTADLLDTWLTNSPVQMEDEQREALSLW LAEQKDVLSTILKTGKLPSPQVVGAESEEEDASHAA >gi|296493298|gb|ADTK01000203.1| GENE 22 19230 - 20373 1217 381 aa, chain + ## HITS:1 COG:ECs2559 KEGG:ns NR:ns ## COG: ECs2559 COG0027 # Protein_GI_number: 15831813 # Func_class: F Nucleotide transport and metabolism # Function: Formate-dependent phosphoribosylglycinamide formyltransferase (GAR transformylase) # Organism: Escherichia coli O157:H7 # 1 381 1 381 392 729 99.0 0 MTLLGTALRPAATRVMLLGSGELGKEVAIECQRLGVEVIAVDRYADAPAMHVAHRSHVIN MLDGDALRRVVELEKPHYIVPEIEAIATDMLIQLEEEGLNVVPCARATKLTMNREGIRRL AAEELQLPTSTYRFADSESLFREAVAAIGYPCIVKPVMSSSGKGQTFIRSAEQLAQAWEY AQQGGRAGAGRVIVEGVVKFDFEITLLTVSAVDGVHFCAPVGHRQEDGDYRESWQPQQMS PLALERAQEIARKVVLALGGYGLFGVELFVCGDEVIFSEVSPRPHDTGMVTLISQDLSEF ALHVRAFLGLPVGGIRQYGPAASAVILPQLTSQNVTFDNVQNAVGADLQIRLFGKPEIDG SRRLGVALATAGSVVDAIKRA Prediction of potential genes in microbial genomes Time: Mon May 16 15:40:32 2011 Seq name: gi|296493297|gb|ADTK01000204.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont544.12, whole genome shotgun sequence Length of sequence - 17794 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 8, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 14 - 63 -0.5 1 1 Op 1 8/0.000 - CDS 68 - 709 884 ## COG0800 2-keto-3-deoxy-6-phosphogluconate 
aldolase
2 1 Op 2 4/1.000 - CDS 746 - 2557 1622 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase
    - Prom 2665 - 2724 5.8
    - Term 2676 - 2711 3.1
3 2 Tu 1 . - CDS 2792 - 4267 1704 ## COG0364 Glucose-6-phosphate 1-dehydrogenase
    - Prom 4440 - 4499 7.8
4 3 Tu 1 5/1.000 + CDS 4605 - 5474 609 ## COG1737 Transcriptional regulators
    + Term 5493 - 5533 1.4
5 4 Tu 1 . + CDS 5602 - 7044 1310 ## COG0469 Pyruvate kinase
    + Term 7054 - 7087 0.3
    - Term 7033 - 7085 2.6
6 5 Op 1 . - CDS 7104 - 7556 113 ## SSON_1293 hypothetical protein
7 5 Op 2 . - CDS 7601 - 7897 115 ## SSON_1293 hypothetical protein
8 5 Op 3 . - CDS 7968 - 8807 341 ## COG2819 Predicted hydrolase of the alpha/beta superfamily
    - Prom 8829 - 8888 4.3
    - Term 8830 - 8889 5.0
9 6 Op 1 . - CDS 8900 - 9697 342 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold
10 6 Op 2 1/1.000 - CDS 9719 - 11518 735 ## COG0366 Glycosidases
11 6 Op 3 20/0.000 - CDS 11535 - 11912 128 ## COG3833 ABC-type maltose transport systems, permease component
12 6 Op 4 19/0.000 - CDS 12371 - 13624 130 ## COG1175 ABC-type sugar transport systems, permease components
13 6 Op 5 . - CDS 13663 - 14895 503 ## COG2182 Maltose-binding periplasmic proteins/domains
    - Prom 15128 - 15187 8.4
    + Prom 15089 - 15148 3.8
14 7 Op 1 . + CDS 15168 - 16205 705 ## ECIAI1_1935 putative exported protein, putative porin (CymA protein precursor)
15 7 Op 2 . + CDS 16224 - 17180 326 ## COG1044 UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase
    + Prom 17193 - 17252 6.3
16 8 Tu 1 .
+ CDS 17487 - 17753 247 ## SbBS512_E2139 glycosidase Predicted protein(s) >gi|296493297|gb|ADTK01000204.1| GENE 1 68 - 709 884 213 aa, chain - ## HITS:1 COG:STM1884 KEGG:ns NR:ns ## COG: STM1884 COG0800 # Protein_GI_number: 16765226 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Salmonella typhimurium LT2 # 1 212 1 212 213 373 97.0 1e-103 MKNWKTSAESILTTGPVVPVIVVKKLEHAVPMAKALVAGGVRVLEVTLRTECAVDAIRAI AKEVPEAIVGAGTVLNPQQLAEVTEAGAQFAISPGLTEPLLKAATEGTIPLIPGISTVSE LMLGMDYGLKEFKFFPAEANGGVKALQAIAGPFSQVRFCPTGGISPANYRDYLALKSVLC IGGSWLVPADALEAGDYDRITKLAREAVEGAKL >gi|296493297|gb|ADTK01000204.1| GENE 2 746 - 2557 1622 603 aa, chain - ## HITS:1 COG:ECs2561 KEGG:ns NR:ns ## COG: ECs2561 COG0129 # Protein_GI_number: 15831815 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Escherichia coli O157:H7 # 1 603 1 603 603 1203 99.0 0 MNPQLLRVTNRIIERSRETRSAYLARIEQAKTSTVHRSQLACGNLAHGFAACLPEDKASL KSMLRNNIAIITSYNDMLSAHQPYEHYPDIIRKALHEANAVGQVAGGVPAMCDGVTQGQD GMELSLLSREVIAMSAAVGLSHNMFDGALFLGVCDKIVPGLTMAALSFGHLPAVFVPSGP MASGLPNKEKVRIRQLYAEGKVDRMALLESEAASYHAPGTCTFYGTANTNQMVVEFMGMQ LPGSSFVHPDSPLRDALTAAAARQVTRMTGNGNEWMPIGKMIDEKVVVNGIVALLATGGS TNHTMHLVAMARAAGIQINWDDFSDLSDVVPLMARLYPNGPADINHFQAAGGVPVLVREL LKAGLLHEDVNTVAGFGLSRYTLEPWLNNGELDWREGAEKSLDSNVIASFEQPFSHHGGT KVLSGNLGRAVMKTSAVPVENQVIEAPAVVFESQHDVMPAFEAGLLDRDCVVVVRHQGPK ANGMPELHKLMPPLGVLLDRCFKIALVTDGRLSGASGKVPSAIHVTPEAYDGGLLAKVRD GDIIRVNGQTGELTLLVDEAELAAREPHIPDLSASRVGTGRELFSALREKLSGAEQGATC ITF >gi|296493297|gb|ADTK01000204.1| GENE 3 2792 - 4267 1704 491 aa, chain - ## HITS:1 COG:zwf KEGG:ns NR:ns ## COG: zwf COG0364 # Protein_GI_number: 16129805 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate 1-dehydrogenase # Organism: Escherichia coli K12 # 1 491 1 491 491 1012 100.0 0 
MAVTQTAQACDLVIFGAKGDLARRKLLPSLYQLEKAGQLNPDTRIIGVGRADWDKAAYTK VVREALETFMKETIDEGLWDTLSARLDFCNLDVNDTAAFSRLGAMLDQKNRITINYFAMP PSTFGAICKGLGEAKLNAKPARVVMEKPLGTSLATSQEINDQVGEYFEECQVYRIDHYLG KETVLNLLALRFANSLFVNNWDNRTIDHVEITVAEEVGIEGRWGYFDKAGQMRDMIQNHL LQILCMIAMSPPSDLSADSIRDEKVKVLKSLRRIDRSNVREKTVRGQYTAGFAQGKKVPG YLEEEGANKSSNTETFVAIRVDIDNWRWAGVPFYLRTGKRLPTKCSEVVVYFKTPELNLF KESWQDLPQNKLTIRLQPDEGVDIQVLNKVPGLDHKHNLQITKLDLSYSETFNQTHLADA YERLLLETMRGIQALFVRRDEVEEAWKWVDSITEAWAMDNDAPKPYQAGTWGPVASVAMI TRDGRSWNEFE >gi|296493297|gb|ADTK01000204.1| GENE 4 4605 - 5474 609 289 aa, chain + ## HITS:1 COG:ECs2563 KEGG:ns NR:ns ## COG: ECs2563 COG1737 # Protein_GI_number: 15831817 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 289 1 289 289 537 99.0 1e-152 MNMLEKIQSQLEHLSKSERKVAEVILASPDNAIHSSIAALALEANVSEPTVNRFCRSMDT RGFPDFKLHLAQSLANGTPYVNRNVNEDDSVESYTGKIFESAMATLDHVRHSLDKSAINR AVDLLTQAKKIAFFGLGSSAAVAHDAMNKFFRFNVPVVYSDDIVLQRMSCMNCSDGDVVV LISHTGRTKNLVELAQLARENDAMVIALPSAGTPLAREATLAITLDVPEDTDIYMPMVSR LAQLTVIDVLATGFTLRRGAKFRDNLKRVKEALKESRFDKQLLNLSDDR >gi|296493297|gb|ADTK01000204.1| GENE 5 5602 - 7044 1310 480 aa, chain + ## HITS:1 COG:pykA KEGG:ns NR:ns ## COG: pykA COG0469 # Protein_GI_number: 16129807 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Escherichia coli K12 # 1 480 1 480 480 883 99.0 0 MSRRLRRTKIVTTLGPATDRDNNLEKVIAAGANVVRMNFSHGSPEDHKMRADKVREIAAR LGRHVAILGDLQGPKIRVSTFKEGKVFLNIGDKFLLDANLGKGEGDKEKVGIDYKGLPAD VVPGDILLLDDGRVQLKVLEVQGMKVFTEVTVGGPLSNNKGINKLGGGLSAEALTEKDKA DIKTAALIGVDYLAVSFPRCGEDLNYARRLARDAGCDAKIVAKVERAEAVCSQDAMDDII LASDVVMVARGDLGVEIGDPELVGIQKALIRRARQLNRAVITATQMMESMITNPMPTRAE VMDVANAVLDGTDAVMLSAETAAGQYPSETVAAMARVCLGAEKIPSINVSKHRLDVQFDN VEEAIAMSAMYAANHLKGVTAIITMTESGRTALMTSRISSGLPIFAMSRHERTLNLTALY RGVTPVHFDSANDGVAAASEAVNLLRDKGYLMSGDLVIVTQGDVMSTVGSTNTTRILTVE >gi|296493297|gb|ADTK01000204.1| GENE 6 7104 - 7556 113 150 aa, chain - ## HITS:1 COG:no KEGG:SSON_1293 NR:ns ## 
KEGG: SSON_1293 # Name: not_defined # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 150 131 280 280 317 99.0 6e-86 MSKALAQQLELDVTNIVASDEMYCVYAIEPDSAINITKRFAKNLKIYNLPGIQFYGDKAR VVRTVGIGAGCFCDPIQYMEYQADYYITINDSIKTWVQTQYSKDSGLPMAVIGHSVAEEA GMRRLASYLDLHSGYPCIHFTGGCDYDWIE >gi|296493297|gb|ADTK01000204.1| GENE 7 7601 - 7897 115 98 aa, chain - ## HITS:1 COG:no KEGG:SSON_1293 NR:ns ## KEGG: SSON_1293 # Name: not_defined # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 92 17 108 280 188 97.0 5e-47 MVAVSQPTVDKIIFGAEATLINKIGTCWIASMDVLRKAVFEGVNIIITHEPTFYSYADLE GDDLEFSWARKIMDYTRGELSYLKIIEQKRSFCIKIIW >gi|296493297|gb|ADTK01000204.1| GENE 8 7968 - 8807 341 279 aa, chain - ## HITS:1 COG:SP0882 KEGG:ns NR:ns ## COG: SP0882 COG2819 # Protein_GI_number: 15900765 # Func_class: R General function prediction only # Function: Predicted hydrolase of the alpha/beta superfamily # Organism: Streptococcus pneumoniae TIGR4 # 19 277 17 270 274 135 32.0 1e-31 MNDIENYGNRYHVLQSFPMPQLASSREIRILLPASYYESDVSYPVLYMHDGQNLFDNSVS FGPHVWDIPEAVDAFFPNSQYDGVIIVGIDNASSHGKFSRMDEYSPWPRNTSFPLPGWDP DVDYSGGKGALYVDFIVNTLKNYIDSNFRTFPDRNNTAIAGSSMGAYISLFAAILRQDVF SKVGVFSPALWFNDSAMLNFIQENNIVEDLTVYLDVGTQETSGMREDFPEVYISGAEKLC VSLRKQRNVTIDYHLWGGDTHSESAWAKRFPEMLKLFYC >gi|296493297|gb|ADTK01000204.1| GENE 9 8900 - 9697 342 265 aa, chain - ## HITS:1 COG:PAB0058 KEGG:ns NR:ns ## COG: PAB0058 COG2159 # Protein_GI_number: 14520316 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Pyrococcus abyssi # 3 220 17 210 248 80 28.0 2e-15 MIIDAHTHIGDFVKIRMSEDVFLASLDKYNIDFALCSCGSAVEVDHDQNPIPDEDQVTQH DNNERMLRLVRQHSKRIGAFMWIKPRLESCDQDFEDMIASNRDIIYGIKVHPYHSKMAFN SDKVQEYIRLAQKYDLPVVSHTSNDYESSPQLVYEMAHKYPDVNFVMCHMGLATDNQEAI ELIAKLPNLYGDTAWVRADMVYQAIKICGSDKILFGTDNPINGLDTYNDDDFYNKYFAEL KSKLTQSEFEQLMFTNAKKLFKIKI >gi|296493297|gb|ADTK01000204.1| GENE 10 9719 - 11518 735 599 aa, chain - ## HITS:1 COG:BH2927 KEGG:ns 
NR:ns ## COG: BH2927 COG0366 # Protein_GI_number: 15615490 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus halodurans # 5 567 3 536 578 521 47.0 1e-147 MITVSSLKHSAKSEDSYAYDNETLHVRLRTLRGEVDKVILWIGDPYNWAEGGLDGGNMAG TEAFGWIGGNEIEMEQEAVTEYHDHWFAVFKPQKRRCRYGFILFGKEGEKFLFGEKRCVD ISSPECEERELSRLNNFFCFPYLNKIDVLNTPSWVKNTVWYQIFPDRFCNGRPEISPEGV EPWGSTPTSFNFMGGDLWGVIDKLDYLEDLGINGIYFCPIFTSASNHKYDTIDHFSVDPH LGGNIAFHTLIEEAHKRGIKIMLDAVFNHLGADSPIWLDVVRNGANSRYADWFWIHKFPV YPDTPKSEWDFKNFNYETFGNVIEMPKLNTENEECREYLLSIVRYWTQNFDIDGWRLDVA NEVDHHFWRDFRKVIKDIKPECYILGEIWHEGTPWLRGDQFDSLMNYPLTYGIIDYFALQ DTTKQEFMTSVTRSYLCYPKNITEVMFNLLDSHDTARILSVCSNDKRKVKLAYLFMLTQA GSPCIYYGSEIGIDGFKSMTLENNRKCMIWDENKQDLELRQFIRWLIRLRKKHPQWCEAS IQWKDVEHPTVIAYQRDNITFFLNNSEDTANFIYDGRSMEISGFSYEIEGLPAADLYDF >gi|296493297|gb|ADTK01000204.1| GENE 11 11535 - 11912 128 125 aa, chain - ## HITS:1 COG:SA0209 KEGG:ns NR:ns ## COG: SA0209 COG3833 # Protein_GI_number: 15925920 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type maltose transport systems, permease component # Organism: Staphylococcus aureus N315 # 3 124 157 278 279 144 59.0 3e-35 MTWLVKGYFDAIPTSLDEAAKIDGAGHLTIFIQIILPLAKPILVFVGLVSFTGPWMDFIL PTLVLRSEDKMTLAIGIFSWIASNSAENYTLFAAGALLVAVPITLLFVFTQKHITTGLVS GAVKE >gi|296493297|gb|ADTK01000204.1| GENE 12 12371 - 13624 130 417 aa, chain - ## HITS:1 COG:SPy1295 KEGG:ns NR:ns ## COG: SPy1295 COG1175 # Protein_GI_number: 15675248 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Streptococcus pyogenes M1 GAS # 2 413 23 449 453 277 39.0 3e-74 MLPISCLIMGGTQLSRHYYVKGGIFFAIQVCFLLYLSDIVHTLIGLFTLGDVAQIRKGLT VIQGDNSIFMLVEGVIATIIVGLFATIYILNIKDARNSSYCHLTFKQQLYKLYEDKFAFI VLTPAFLASIAFIVLPIVITVLVSFTNYAAPNHIPPKNLVDWVGIKNFIMLFKFKIWSDT FLGVALWTFIWAICATIFTFSFGFILALALAKKNIRFAKVWRLIFILPYAIPTFVTLLIF RLLLNGIGPVNSFLNSIGINSVDFLSVPLNAKITVIAVCVWVGAPYFMLLIAGALTNISR 
DLYEASEVDGATKWQQFKEITLPMVLHQVAPTLVMTFAYNFNNFGAIYLLTQGGPTNPEY RFAGHTDIFITWIYKLTLNFQQYHIASVISIIIFIFLSVIAIWQFRRMKSFKDDVGM >gi|296493297|gb|ADTK01000204.1| GENE 13 13663 - 14895 503 410 aa, chain - ## HITS:1 COG:BS_yvfK KEGG:ns NR:ns ## COG: BS_yvfK COG2182 # Protein_GI_number: 16080469 # Func_class: G Carbohydrate transport and metabolism # Function: Maltose-binding periplasmic proteins/domains # Organism: Bacillus subtilis # 1 406 1 414 421 171 29.0 3e-42 MKLSNIVTVIILAISSTLTPQAMADKLIPETDAELLVWSDATSVEYMKYAAKEFNKDFGY KVKFTFRNIAPMDAASRIMQDGGTTRVADVAEIEHDTLGRLVVAGGVMENMVSADRIKKT FIPGAVSAATYNNISYGFPVSFATLALFYNKDLLSAAPKTFEEINTFSEKFNNSSEHKYA LLWDVQNYYVSRMFITLYGANEFGKSGNDPKALGIASSEAKKGLETMKRLKKANPSNPLD MGNPQVLRGLFNEGKVAAVIDGPWSIQGYIDSGINFGVTRIPTLDGHQPRSFSTVRLAVV SSFTEYPHAAELFADYLTTDKMLMKRYEMTNLIPPVDSLMKKISQTGSEAIKAIIAQANY SDAMPLIPEMSYLWSPMTNAILATWVENKTPGEVLNHAHTIIEEQLSLQE >gi|296493297|gb|ADTK01000204.1| GENE 14 15168 - 16205 705 345 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_1935 NR:ns ## KEGG: ECIAI1_1935 # Name: not_defined # Def: putative exported protein, putative porin (CymA protein precursor) # Organism: E.coli_IAI1 # Pathway: not_defined # 1 345 1 345 345 623 98.0 1e-177 MVLIKKNVLSASIATVLLGSTGAFAANDRGFTSDTDWSEEQAGFNGHVGTTFEYETKSTD NYLGKTVKEYTKTHEIFQAYFSQPNWQFAAIYGLKYIDRRQEEKGYHENETSLQNYISLN KTFTIGNGFDFSLVYDLMYTDGNIDSPYVTGLHTVTVDNSFRPTLTYFNSDWNAGFYSNV EYLYSRNDKSSWGVREEEGQSILFQPYKRFGAWEFGIEFFYQKKHTDEFNHGKINDRGIY TETYYEPIIKYAFENAGNIYLRTRFGHGKTENADSLQYNANRTYFKTIRKATLGYEQNIG DDWIAKIEYKHAWEKEHKNFYDPRDQEKYNVLNQDNIYVHAFYRF >gi|296493297|gb|ADTK01000204.1| GENE 15 16224 - 17180 326 318 aa, chain + ## HITS:1 COG:FN1909 KEGG:ns NR:ns ## COG: FN1909 COG1044 # Protein_GI_number: 19705214 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase # Organism: Fusobacterium nucleatum # 2 309 3 318 332 156 35.0 5e-38 MFSALDIKTLMQGTLYGDPSLRVDTIRPIHSPLVGGLSIVMTPGDLLHIPTTGADIIIGP 
EEIISSNAKAKIVVDYLNVNNLNKVLRYYKVHKYRLFEQENTSTIPDVYIGKHCQIGMNC HFMPGVKIMNCVTIGDNVAIHANTVIKEGTIIGNDVIIDSNNSIGNYSFEYMSDERDSYV RVDSIGRVIIGDDVEIGCNNTIDRGTLGDTIIGKGTRIDNQVQIGHDCIIGNKCLIVSQC GFSGHVVLGDHVITHGQVGIAGHISIGSYSVIKAKSDVSHSCPEKSDLFGYPAKNTREYN KNLAVLNKLTKQHGVYKQ >gi|296493297|gb|ADTK01000204.1| GENE 16 17487 - 17753 247 88 aa, chain + ## HITS:1 COG:no KEGG:SbBS512_E2139 NR:ns ## KEGG: SbBS512_E2139 # Name: not_defined # Def: glycosidase # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 1 88 66 153 153 177 100.0 1e-43 MHYKGDGIYQLIYSEKAGAVTMQFATMNWNPQFTVTGLELPLNQAKTLKKGGMDKNTKVV IPADGKYVWTIQIGPDKKAIAAMVAPCK Prediction of potential genes in microbial genomes Time: Mon May 16 15:40:47 2011 Seq name: gi|296493296|gb|ADTK01000205.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont544.13, whole genome shotgun sequence Length of sequence - 7537 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 51 - 1178 691 ## COG3839 ABC-type sugar transport systems, ATPase components + Prom 1507 - 1566 6.6 2 2 Tu 1 . + CDS 1586 - 2395 427 ## COG1378 Predicted transcriptional regulators + Term 2409 - 2446 6.4 + Prom 2401 - 2460 2.6 3 3 Op 1 . + CDS 2483 - 4351 136 ## COG3533 Uncharacterized protein conserved in bacteria 4 3 Op 2 . 
+ CDS 4388 - 7462 733 ## COG0383 Alpha-mannosidase Predicted protein(s) >gi|296493296|gb|ADTK01000205.1| GENE 1 51 - 1178 691 375 aa, chain + ## HITS:1 COG:YPO0609 KEGG:ns NR:ns ## COG: YPO0609 COG3839 # Protein_GI_number: 16120935 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Yersinia pestis # 1 371 2 369 372 417 55.0 1e-116 MAAVILRKVEKIYDNGFKAVHGIDLDINDGEFMVFVGPSGCAKSTTLRMIAGLEDVSGGE IYISDRLVNTLPPKDRSIAMVFQNYALYPHKTVFDNMAFGLKMQKRPKEEIKKIVEEVAE KLEISDLLCRKPKEMSGGQRQRVAVGRAIVRKPDVFLFDEPLSNLDAKLRVSMRVKIAQL HQSLKDEGHPATMIYVTHDQIEALTLGDRICVLNQGRIMQVDTPVNLYNHPANRFVAGFI GSPSMNLIDAVVREEDGKLYIHIAQNIRLYIPENKQSVLKAYIDKPVCFGIRPEHISLST NRKEMNTVTGALSIIENMGSEKYLYFTVNNKEFIAKVNDQSITTADIGKNLYFNLDTSFC HIFDFYTEENLTNYL >gi|296493296|gb|ADTK01000205.1| GENE 2 1586 - 2395 427 269 aa, chain + ## HITS:1 COG:BS_yrhO KEGG:ns NR:ns ## COG: BS_yrhO COG1378 # Protein_GI_number: 16079765 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 3 264 4 264 275 60 23.0 2e-09 MIHLVNKLINFGFTRTDAQVYISLLKNGRSSGYKIAKEISLSRSSVYSSIDNLYKNGYIF LVDGETKEYEAKSPELILSQIEKSTIDNIKILKKELSQMMFQNDREFIYNVSGFDNLHQK AKEMLSSAEYEVYLNTDFDLHFLKNEICEALNRHVRVIVFSFNKLITPDPRVELYYRFNH EEEYYPSHRFMLVVDMKTVMVFSQRAESLGLFSNNRLLINIIAEHIHSDIYLTKYQQQYP DANALINTLHESGNDMVQDEKHLSDSKNR >gi|296493296|gb|ADTK01000205.1| GENE 3 2483 - 4351 136 622 aa, chain + ## HITS:1 COG:mlr2247 KEGG:ns NR:ns ## COG: mlr2247 COG3533 # Protein_GI_number: 13472070 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Mesorhizobium loti # 377 536 408 576 662 67 24.0 6e-11 MIFFNDIKPTGWLKNKIEWYMNGHIGELDKLAPHLITEHKIYGRDRITPLTVSAEDSVIK KQNNPDDNMMQYYWWDSETQSNWKDGYCRSAYLLDNNEWHEKASDLCLELIDTQDDDGYI GIYDSSLKFNCKAENGEFWAQATALRFVLGCYESTRNESLLTSLTQAAQCIMAGYPKETS HPFVVEHGFAGHSHGLTITDVFYRLYEITNNIEYLSYCLWLYEDYSAGCVSEQDLQYKNV IDPGYLLKGHGVHTYEHIRALAIAASQRPQYQTALDIFLSQLKYYLTPSGAPVGDEWITG 
RTANATYTGYEYCSVHELFSTYVMLIKLTTNIGLADDAEWLFYNASFGMQHPIHHAIMYC KTDNCYEANERKHPDSNQFNTRYKYSPTHEEAAVCCVPNTARTIPAFVENMFQENNNGFT ALFYGPCEFYGTYNNVAVKITQLTEYPHDLSVKCLIEPESSVRFALSFRYPGWAKMMIIN GVTFTTNDTQNSLIILDRTWQYHDVIDIKIIADIQFNTDLCGDTYISRGPVLYAIELESD ILIKKNLLNGKYFNSGYIPCSRADESIQFSYADMESFSYSTSPEGKQYIEGCFYLNKTKI VRKLIPFGQTILRKVTFPNIES >gi|296493296|gb|ADTK01000205.1| GENE 4 4388 - 7462 733 1024 aa, chain + ## HITS:1 COG:lin2123 KEGG:ns NR:ns ## COG: lin2123 COG0383 # Protein_GI_number: 16801189 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Listeria innocua # 31 1022 39 1029 1032 415 29.0 1e-115 MDLTDNIIKLSNKIGQRVYTVVSEMSIEAWITKEPQPFVYRRSGAYQKLTIGSDWGQLFD CAWMRVYGKRPADKFRHKLVALVDIGGEGLIVDKNGSPICGITNKASSYGVPPDKPGKWV VDLSLVSENDEVEFWIDAACNDLFGYVTNGGIITDVHIATCNQLLKSLYYDVEVLFDWIN DGQKFESIHPKGIIPEKIITKRSERTDEIIRILEYIDNTLITFCNEEIIKCQIAIQSIIS KNNNPSEFRIMATGHAHLDIAWMWPLREGRRKAIRTFATALANIEKYPDYIFGASQYQLF HWIKKDYPYFFEKLKKQIAAGRFELQGCFWVECDLNLVSGESLIRQIMHGTRFTLQNFSK KINYVWQPDVFGFPATLPQILKKSGINYIASQKLSQNKINKFNNYLFRWQGLDGSEILMH NFPEDTYDSRARARSLEYIEQNYNEKEICPYALMVYGVGDGGAGPGEEHIERLTRIRNID GLPHVDFSRVDKFFTHADAFRESLPIISGELYFEAHQGCFTSESATKAHNRIMENKLHDA EFFITITNNMTSILRSEFDEIWKAMLTLQFHDILPGSCISRVYHETEKEYLKLEAKTEKI ISDAQSTLLSKIDTSSYKDPHILFNTTCFARNEWININNNWLKARVNSYGYAVIDPKNKI VNGLKAESRSIENNYIKLLFSESGDLISLYDKRYGKEYITENMHSEIRAYHEDAGFFAAW DFASNYRDGESYVLLAEKMTTVISGPKTTMTLIYHYNSSYLRFAFTLTQDSPRVDVQTFI DWHEPNVSLKVKFPVSVQTSLAQCQIQFGVIDRPTHSDDSFAFAKDEIPAHHWVDLSDQE TGIALIAKNKYGYRVKDQTLELTLLRSQHKPGEVVKTDNYDNFLINNFADIGKQKFTFSL FPHHGDYRVGGVVNQAYIMNHPLQIVPVKPHPGELPPSLSFFAIDTNSVIIETIKLAEDG DGIIVRLYESHGLVISAMLSWVCNYHYVQYVDLQENKLDDSFYCNQYCNLEFKPFEIITL RLTQ Prediction of potential genes in microbial genomes Time: Mon May 16 15:40:53 2011 Seq name: gi|296493295|gb|ADTK01000206.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont544.14, whole genome shotgun sequence Length of sequence - 15562 bp Number of predicted genes - 17, with homology - 17 Number of transcription 
units - 7, operones - 5 average op.length - 3.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 1/1.000 - CDS 24 - 995 727 ## COG1560 Lauroyl/myristoyl acyltransferase
    - Prom 1030 - 1089 6.4
    - Term 1064 - 1098 3.5
2 2 Op 1 . - CDS 1115 - 2374 1303 ## COG0739 Membrane proteins related to metalloendopeptidases
3 2 Op 2 . - CDS 2453 - 3400 461 ## COG4531 ABC-type Zn2+ transport system, periplasmic component/surface adhesin
    - Prom 3421 - 3480 5.4
    + Prom 3333 - 3392 6.5
4 3 Op 1 42/0.000 + CDS 3464 - 4219 220 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16
5 3 Op 2 . + CDS 4216 - 5001 895 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components
    - Term 5178 - 5227 -0.8
6 4 Op 1 29/0.000 - CDS 5342 - 6352 1132 ## COG2255 Holliday junction resolvasome, helicase subunit
7 4 Op 2 . - CDS 6361 - 6972 626 ## COG0632 Holliday junction resolvasome, DNA-binding subunit
    - Prom 7003 - 7062 7.4
    + Prom 7156 - 7215 6.0
8 5 Tu 1 . + CDS 7247 - 7849 255 ## ECO111_2384 hypothetical protein
    - Term 7806 - 7843 5.2
9 6 Op 1 8/0.000 - CDS 7851 - 8372 530 ## COG0817 Holliday junction resolvasome, endonuclease subunit
10 6 Op 2 7/0.000 - CDS 8407 - 9147 970 ## COG0217 Uncharacterized conserved protein
11 6 Op 3 5/0.333 - CDS 9176 - 9685 371 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes
12 6 Op 4 . - CDS 9746 - 11518 2145 ## COG0173 Aspartyl-tRNA synthetase
    - Prom 11613 - 11672 4.6
    + Prom 11742 - 11801 6.5
13 7 Op 1 1/1.000 + CDS 11828 - 12394 574 ## COG1335 Amidases related to nicotinamidase
14 7 Op 2 2/0.667 + CDS 12391 - 13209 405 ## COG1801 Uncharacterized conserved protein
15 7 Op 3 2/0.667 + CDS 13262 - 13657 468 ## COG3788 Uncharacterized relative of glutathione S-transferase, MAPEG superfamily
16 7 Op 4 17/0.000 + CDS 13698 - 14441 851 ## COG0500 SAM-dependent methyltransferases
17 7 Op 5 .
+ CDS 14438 - 15409 974 ## COG0500 SAM-dependent methyltransferases + Term 15498 - 15534 -0.4 Predicted protein(s) >gi|296493295|gb|ADTK01000206.1| GENE 1 24 - 995 727 323 aa, chain - ## HITS:1 COG:msbB KEGG:ns NR:ns ## COG: msbB COG1560 # Protein_GI_number: 16129808 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lauroyl/myristoyl acyltransferase # Organism: Escherichia coli K12 # 1 323 1 323 323 654 100.0 0 METKKNNSEYIPEFDKSFRHPRYWGAWLGVAAMAGIALTPPKFRDPILARLGRFAGRLGK SSRRRALINLSLCFPERSEAEREAIVDEMFATAPQAMAMMAELAIRGPEKIQPRVDWQGL EIIEEMRRNNEKVIFLVPHGWAVDIPAMLMASQGQKMAAMFHNQGNPVFDYVWNTVRRRF GGRLHARNDGIKPFIQSVRQGYWGYYLPDQDHGPEHSEFVDFFATYKATLPAIGRLMKVC RARVVPLFPIYDGKTHRLTIQVRPPMDDLLEADDHTIARRMNEEVEIFVGPRPEQYTWIL KLLKTRKPGEIQPYKRKDLYPIK >gi|296493295|gb|ADTK01000206.1| GENE 2 1115 - 2374 1303 419 aa, chain - ## HITS:1 COG:ECs2566 KEGG:ns NR:ns ## COG: ECs2566 COG0739 # Protein_GI_number: 15831820 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Escherichia coli O157:H7 # 1 419 1 419 419 823 100.0 0 MLGSLTVLTLAVAVWRPYVYHRDATPIVKTIELEQNEIRSLLPEASEPIDQAAQEDEAIP QDELDDKIAGEAGVHEYVVSTGDTLSSILNQYGIDMGDITQLAAADKELRNLKIGQQLSW TLTADGELQRLTWEVSRRETRTYDRTAANGFKMTSEMQQGEWVNNLLKGTVGGSFVASAR NAGLTSAEVSAVIKAMQWQMDFRKLKKGDEFAVLMSREMLDGKREQSQLLGVRLRSEGKD YYAIRAEDGKFYDRNGTGLAKGFLRFPTAKQFRISSNFNPRRTNPVTGRVAPHRGVDFAM PQGTPVLSVGDGEVVVAKRSGAAGYYVAIRHGRSYTTRYMHLRKILVKPGQKVKRGDRIA LSGNTGRSTGPHLHYEVWINQQAVNPLTAKLPRTEGLTGSDRREFLAQAKEIVPQLRFD >gi|296493295|gb|ADTK01000206.1| GENE 3 2453 - 3400 461 315 aa, chain - ## HITS:1 COG:ECs2567 KEGG:ns NR:ns ## COG: ECs2567 COG4531 # Protein_GI_number: 15831821 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Zn2+ transport system, periplasmic component/surface adhesin # Organism: Escherichia coli O157:H7 # 1 315 14 328 328 610 99.0 1e-174 MIGRIMLHKKTLLFAALSAALWGGATQAADAAVVASLKPVGFIASAIADGVTETEVLLPD 
GASEHDYSLRPSDVKRLQNADLVVWVGPEMEAFMQKPVSKLPEAKQVTIAQLEDVKPLLM KSIHGDDDDHDHAEKSDEDHHHGDFNMHLWLSPEIARATAVAIHGKLVELMPQSRAKLDA NLKDFEAQLASTETQVGNELAPLKGKGYFVFHDAYGYFEKQFGLTPLGHFTVNPEIQPGA QRLHEIRTQLVEQKATCVFAEPQFRPAVVESVARGTSVRMGTLDPLGTNIKLGKTSYSEF LSQLANQYASCLKGD >gi|296493295|gb|ADTK01000206.1| GENE 4 3464 - 4219 220 251 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 202 1 212 305 89 29 1e-17 MTSLVSLENVSVSFGQRRVLSDVSLELKPGKILTLLGPNGAGKSTLVRVVLGLVTPDEGV IKRNGKLRIGYVPQKLYLDTTLPLTVKRFLRLRPGTHKEDILPALKRVQAGHLINAPMQK LSGGETQRVLLARALLNRPQLLVLDEPTQGVDVNGQVALYDLIDQLRRELDCGVLMVSHD LHLVMAKTDEVLCLNHHICCSGTPEVVSLHPEFISMFGPRGAEQLGIYRHHHNHRHDLQG RIVLRRGNDRS >gi|296493295|gb|ADTK01000206.1| GENE 5 4216 - 5001 895 261 aa, chain + ## HITS:1 COG:ECs2569 KEGG:ns NR:ns ## COG: ECs2569 COG1108 # Protein_GI_number: 15831823 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Escherichia coli O157:H7 # 1 261 1 261 261 397 99.0 1e-110 MIELLFPGWLAGIMLACAAGPLGSFVVWRRMSYFGDTLAHASLLGVAFGLLLDVNPFYAV IAVTLLLAGGLVWLEKRPQLAIDTLLGIMAHSALSLGLVVVSLMSNIRVDLMAYLFGDLL AVTPEDLISIAIGVVIVVAILFWQWRNLLSMTISPDLAFVDGVKLQRVKLLLMLVTALTI GVAMKFVGALIITSLLIIPAATARRFARTPEQMAGVAVLVGMVAVTGGLTFSAFYDTPAG PSVVLCAALLFILSMMKKQAS >gi|296493295|gb|ADTK01000206.1| GENE 6 5342 - 6352 1132 336 aa, chain - ## HITS:1 COG:ECs2570 KEGG:ns NR:ns ## COG: ECs2570 COG2255 # Protein_GI_number: 15831824 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, helicase subunit # Organism: Escherichia coli O157:H7 # 1 336 1 336 336 644 100.0 0 MIEADRLISAGTTLPEDVADRAIRPKLLEEYVGQPQVRSQMEIFIKAAKLRGDALDHLLI FGPPGLGKTTLANIVANEMGVNLRTTSGPVLEKAGDLAAMLTNLEPHDVLFIDEIHRLSP VVEEVLYPAMEDYQLDIMIGEGPAARSIKIDLPPFTLIGATTRAGSLTSPLRDRFGIVQR LEFYQVPDLQYIVSRSARFMGLEMSDDGALEVARRARGTPRIANRLLRRVRDFAEVKHDG 
TISADIAAQALDMLNVDAEGFDYMDRKLLLAVIDKFFGGPVGLDNLAAAIGEERETIEDV LEPYLIQQGFLQRTPRGRMATTRAWNHFGITPPEMP >gi|296493295|gb|ADTK01000206.1| GENE 7 6361 - 6972 626 203 aa, chain - ## HITS:1 COG:ECs2571 KEGG:ns NR:ns ## COG: ECs2571 COG0632 # Protein_GI_number: 15831825 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, DNA-binding subunit # Organism: Escherichia coli O157:H7 # 1 203 1 203 203 379 100.0 1e-105 MIGRLRGIIIEKQPPLVLIEVGGVGYEVHMPMTCFYELPEAGQEAIVFTHFVVREDAQLL YGFNNKQERTLFKELIKTNGVGPKLALAILSGMSAQQFVNAVEREEVGALVKLPGIGKKT AERLIVEMKDRFKGLHGDLFTPAADLVLTSPASPATDDAEQEAVAALVALGYKPQEASRM VSKIARPDASSETLIREALRAAL >gi|296493295|gb|ADTK01000206.1| GENE 8 7247 - 7849 255 200 aa, chain + ## HITS:1 COG:no KEGG:ECO111_2384 NR:ns ## KEGG: ECO111_2384 # Name: yebB # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 200 1 200 200 416 99.0 1e-115 MNINYPAEYEIGDIVFTCIGAALFGQISAASNCWSNHVGIIIGHNGEDFLVAESRVPLST ITTLSRFIKRSANQRYAIKRLDAGLTEQQKQRIVEQVPSRLCKLYHTGFKYESSRQFCSK FVFDIYKEALCIPVGEIETFGQLLNSNPNAKLTFWKFWFLGSIPWERKTVTPASLWHHPG LVLIHAEGVETPQPELTEAV >gi|296493295|gb|ADTK01000206.1| GENE 9 7851 - 8372 530 173 aa, chain - ## HITS:1 COG:ECs2573 KEGG:ns NR:ns ## COG: ECs2573 COG0817 # Protein_GI_number: 15831827 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, endonuclease subunit # Organism: Escherichia coli O157:H7 # 1 173 1 173 173 300 99.0 8e-82 MAIILGIDPGSRVTGYGVICQVGRQLSYLGSGCIRTKVDDLPSRLKLIYAGVTEIITQFQ PDYFAIEQVFMAKNADSALKLGQARGVAIVAAVNQELPVFEYAARQVKQTVVGIGSAEKS QVQHMVRTLLKLPANPQADAADALAIAITHCHVSQNAMQMSESRLNLARGRLR >gi|296493295|gb|ADTK01000206.1| GENE 10 8407 - 9147 970 246 aa, chain - ## HITS:1 COG:ECs2574 KEGG:ns NR:ns ## COG: ECs2574 COG0217 # Protein_GI_number: 15831828 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 246 1 246 246 444 99.0 1e-125 
MAGHSKWANTRHRKAAQDAKRGKIFTKIIRELVTAAKLGGGDPDANPRLRAAVDKALSNN MTRDTLNRAIARGVGGDDDANMETIIYEGYGPGGTAIMIECLSDNRNRTVAEVRHAFSKC GGNLGTDGSVAYLFSKKGVISFEKGDEDTIMEAALEAGAEDVVTYDDGAIDVYTAWEEMG KVRDALEAAGLKADSAEVSMIPSTKADMDAETAPKLMRLIDMLEDCDDVQEVYHNGEISD EVAATL >gi|296493295|gb|ADTK01000206.1| GENE 11 9176 - 9685 371 169 aa, chain - ## HITS:1 COG:ECs2575 KEGG:ns NR:ns ## COG: ECs2575 COG0494 # Protein_GI_number: 15831829 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli O157:H7 # 20 169 1 150 150 292 99.0 2e-79 MSIDNYVNGMSEGSQRRGSVKDKVYKRPVSILVVIYAQDTKRVLMLQRRDDPDFWQSVTG SVEEGETAPQAAMREVKEEVTIDVVAEQLTLIDCQRTVEFEIFSHLRHRYAPGVTRNTES WFCLALPHERQIVFTEHLAYKWLDAPAAAALTKSWSNRQAIEQFVINAA >gi|296493295|gb|ADTK01000206.1| GENE 12 9746 - 11518 2145 590 aa, chain - ## HITS:1 COG:aspS KEGG:ns NR:ns ## COG: aspS COG0173 # Protein_GI_number: 16129819 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Escherichia coli K12 # 1 590 1 590 590 1195 99.0 0 MRTEYCGQLRLSHVGQQVTLCGWVNRRRDLGSLIFIDMRDREGIVQVFFDPDRADALKLA SELRNEFCIQVTGTVRARDEKNINRDMATGEIEVLASSLTIINRADVLPLDSNHVNTEEA RLKYRYLDLRRPEMAQRLKTRAKITSLVRRFMDDHGFLDIETPMLTKATPEGARDYLVPS RVHKGKFYALPQSPQLFKQLLMMSGFDRYYQIVKCFRDEDLRADRQPEFTQIDVETSFMT APQVREVMEALVRHLWLEVKGVDLGDFPVMTFAEAERRYGSDKPDLRNPMELTDVADLLK SVEFAVFAGPANDPKGRVAALRVPGGASLTRKQIDEYGNFVKIYGAKGLAYIKVNERAKG LEGINSPVAKFLNAEIIEAILDRTAAQDGDMIFFGADNKKIVADAMGALRLKVGKDLGLT DESKWAPLWVIDFPMFEDDGEGGLTAMHHPFTSPKDMTAAELKAAPENAVANAYDMVING YEVGGGSVRIHNGDMQQTVFGILGINEEEQREKFGFLLDALKYGTPPHAGLAFGLDRLTM LLTGTDNIRDVIAFPKTTAAACLMTEAPSFANPTALAELGIQVVKKAENN >gi|296493295|gb|ADTK01000206.1| GENE 13 11828 - 12394 574 188 aa, chain + ## HITS:1 COG:ECs2577 KEGG:ns NR:ns ## COG: ECs2577 COG1335 # Protein_GI_number: 15831831 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # 
Function: Amidases related to nicotinamidase # Organism: Escherichia coli O157:H7 # 1 188 12 199 199 372 98.0 1e-103 MLELNAKTTALVVIDLQEGILPFAGGPHTADEVVNRAGKLAAKFRASGQPVFLVRVGWSA DYAEALKQPVDAPSPAKVLPENWWQHPAALGATDSDIEIIKRQWGAFYGTDLELQLRRRG IDTIVLCGISINIGVESTARNAWELGFNLVIAEDACSAASAEQHNNSINHIYPRIARVRS VEEILHAL >gi|296493295|gb|ADTK01000206.1| GENE 14 12391 - 13209 405 272 aa, chain + ## HITS:1 COG:yecE KEGG:ns NR:ns ## COG: yecE COG1801 # Protein_GI_number: 16129821 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 272 1 272 272 539 99.0 1e-153 MIYIGLPQWSHPKWVRLGITSLEEYARHFNCVEGNTTLYALPKPEVVLRWREQTTDDFRF CFKFPATISHQAALRHCDDLVTEFLTRMSPLAPRIGQYWLQLPATFGPRELPALWHFLDS LPGEFNYGVEVRHPQFFAKGEKEQTLNRGLHQRGVNRVILDSRPVHAACPHSEAIRDAQR KKPKVPVHAVLTATNPLIRFIGSDDMTQNRELFQVWLQKLAQWHQTTTPYLFLHTPDIAQ APELVHTLWEDLRKTLPEIGAVPAIPQQSSLF >gi|296493295|gb|ADTK01000206.1| GENE 15 13262 - 13657 468 131 aa, chain + ## HITS:1 COG:ECs2579 KEGG:ns NR:ns ## COG: ECs2579 COG3788 # Protein_GI_number: 15831833 # Func_class: R General function prediction only # Function: Uncharacterized relative of glutathione S-transferase, MAPEG superfamily # Organism: Escherichia coli O157:H7 # 1 131 11 141 141 227 99.0 4e-60 MVSALYAVLSALLLMKFSFDVVRLRLQYRVAYGDGGFSELQSAIRIHGNAVEYIPIAIVL MLFMEMNGAETWMVHICGIVLLAGRLMHYYGFHHRLFRWRRSGMSATWCALLLMVLANLW YMPWELVFSLR >gi|296493295|gb|ADTK01000206.1| GENE 16 13698 - 14441 851 247 aa, chain + ## HITS:1 COG:ECs2580 KEGG:ns NR:ns ## COG: ECs2580 COG0500 # Protein_GI_number: 15831834 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli O157:H7 # 1 247 1 247 247 514 100.0 1e-146 MSHRDTLFSAPIARLGDWTFDERVAEVFPDMIQRSVPGYSNIISMIGMLAERFVQPGTQV YDLGCSLGAATLSVRRNIHHDNCKIIAIDNSPAMIERCRRHIDAYKAPTPVDVIEGDIRD IAIENASMVVLNFTLQFLEPSERQALLDKIYQGLNPGGALVLSEKFSFEDAKVGELLFNM 
HHDFKRANGYSELEISQKRSMLENVMLTDSVETHKARLHKAGFEHSELWFQCFNFGSLVA LKAEDAA >gi|296493295|gb|ADTK01000206.1| GENE 17 14438 - 15409 974 323 aa, chain + ## HITS:1 COG:ECs2581 KEGG:ns NR:ns ## COG: ECs2581 COG0500 # Protein_GI_number: 15831835 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli O157:H7 # 1 323 1 323 323 665 98.0 0 MIDFGNFYSLIAKNHLSHWLETLPAQIANWQREQQHGLFKQWSNAVEFLPEIKPYRLDLL HSVTAESEEPLSTGQIKRIETLMRNLMPWRKGPFSLYGVNIDTEWRSDWKWDRVLPHLSD LTGRTILDVGCGSGYHMWRMIGAGAHLAVGIDPTQLFLCQFEAVRKLLGNDQRAHLLPLG IEQLPALKAFDTVFSMGVLYHRRSPLEHLWQLKDQLVNEGELVLETLVIDGDENTVLVPG DRYAQMRNVYFIPSALALKNWLKKCGFVDIRVVDVCVTTTEEQRRTEWMVTESLSDFLDP HDPSKTVEGYPAPKRAVLIARKP Prediction of potential genes in microbial genomes Time: Mon May 16 15:40:59 2011 Seq name: gi|296493294|gb|ADTK01000207.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont544.15, whole genome shotgun sequence Length of sequence - 7288 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 - CDS 29 - 2458 2507 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing 2 1 Op 2 1/1.000 - CDS 2483 - 3583 564 ## COG3005 Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit - Prom 3769 - 3828 8.7 - Term 3924 - 3959 2.0 3 2 Op 1 3/1.000 - CDS 3971 - 4717 425 ## COG3142 Uncharacterized protein involved in copper resistance 4 2 Op 2 . - CDS 4731 - 5297 372 ## COG3102 Uncharacterized protein conserved in bacteria - Prom 5321 - 5380 4.1 + Prom 5395 - 5454 4.1 5 3 Tu 1 . 
+ CDS 5513 - 7246 2246 ## COG0018 Arginyl-tRNA synthetase Predicted protein(s) >gi|296493294|gb|ADTK01000207.1| GENE 1 29 - 2458 2507 809 aa, chain - ## HITS:1 COG:bisZ KEGG:ns NR:ns ## COG: bisZ COG0243 # Protein_GI_number: 16129825 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 809 7 815 815 1623 99.0 0 MTLTRREFIKHSGIAAGALVVTSAAPLPAWAEEKGGKILTAGRWGAMNVEVKDGKIVSST GALAKTIPNSLQSTAADQVHTTARIQHPMVRKSYLDNPLQPAKGRGEDTYVQVSWEQALK LIHEQHDRIRKANGPSAIFAGSYGWRSSGVLHKAQTLLQRYMNLAGGYSGHSGDYSTGAA QVIMPHVVGSVEVYEQQTSWPLILENSQAVVLWGMNPLNTLKIAWSSTDEQGLEYFHQLK KSGKPVIAIDPIRSETIEFFGDNATWIAPNMGTDVALMLGIAHTLMTQGKHDKVFLEKYT TGYPQFEEYLTGKSDNTPKSAAWAAEITGVPEAQIVKLAELMAANRTMLMAGWGIQRQQY GEQKHWMLVTLAAMLGQIGTPGGGFGFSYHYSNGGNPTRVGGVLPEMSAAIAGQASEAAD DGGMTAIPVARIVDALENPGGKYQHNGKEQTYPNIKMIWWAGGGNFTHHQDTNRLIKAWQ KPEMIVVSECYWTAAAKHADIVLPITTSFERNDLTMTGDYSNQHIVPMKQAVAPQFEARN DFDVFADLAELLKPGGKEIYTEGKDEMAWLKFFYDAAQKGARAQRVTMPMFNAFWQQNKL IEMRRSEKNEQYVRYADFRADPVKNALGTPSGKIEIYSKTLEKFGYKDCPAHPTWLAPDE WKGTADEKQLQLLTAHPAHRLHSQLNYAELRKKYAVADREPITIHTEDAARFGIANGDLV RVWNKRGQILTGAVVTDGIKKGVVCVHEGAWPDLENGLCKNGSANVLTADIPSSQLANAC AGNSALVYIEKYTGNAPKLTAFDKPAVQA >gi|296493294|gb|ADTK01000207.1| GENE 2 2483 - 3583 564 366 aa, chain - ## HITS:1 COG:ECs2583 KEGG:ns NR:ns ## COG: ECs2583 COG3005 # Protein_GI_number: 15831837 # Func_class: C Energy production and conversion # Function: Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit # Organism: Escherichia coli O157:H7 # 1 366 1 366 366 729 98.0 0 MRGKKRIGLLFLLIAVVVGGGGLLLAQKALHKTSDTAFCLSCHSMSKPFEEYQGTVHFSN QKGIRAECADCHIPKSGMDYLFAKLKASKDIYHEFVSGKIDSDDKFETHRQEMAETVWKE LKATDSATCRSCHSFDAMDIASQSESAQKMHNKAQKGGETCIDCHKGIAHFPPEIKMDDN AAHELESQAATSVTNGAHIYPFKTSRIGELATVNPGTDLTVVDASGKQPIVLLQGYQMQG SENTLYLAAGQRLALATLSEEGIKALTVNGEWQADEYGNQWRQASLQGALTDPALADRKP LWQYAEKLDDTYCAGCHAPIAADHYTVNAWPSIAKGMGARTSMSENELDILTRYFQYNAK DITEKQ 
>gi|296493294|gb|ADTK01000207.1| GENE 3 3971 - 4717 425 248 aa, chain - ## HITS:1 COG:cutCm KEGG:ns NR:ns ## COG: cutCm COG3142 # Protein_GI_number: 16132223 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in copper resistance # Organism: Escherichia coli K12 # 1 248 1 248 248 498 100.0 1e-141 MALLEICCYSMECALTAQQNGADRVELCAAPKEGGLTPSLGVLKSVRQRVTIPVHPIIRP RGGDFCYSDGEFAAILEDVRTVRELGFPGLVTGVLDVDGNVDMPRMEKIMAAAGPLAVTF HRAFDMCANPLYTLNNLAELGIARVLTSGQKSDALQGLSKIMELIAHRDAPIIMAGAGVR AENLHHFLDAGVLEVHSSAGAWQASPMRYRNQGLSMSSDEHADEYSRYIVDGAAVAEMKG IIERHQAK >gi|296493294|gb|ADTK01000207.1| GENE 4 4731 - 5297 372 188 aa, chain - ## HITS:1 COG:yecM KEGG:ns NR:ns ## COG: yecM COG3102 # Protein_GI_number: 16129827 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 188 3 190 190 369 98.0 1e-102 MANWQSIDELQDIASDLPRFTHALDELSLRLGLNITPLTADHISLRCHQNATAERWRRGF EQCGELLSENMINGRPICLFKLHEPVQVAHWQFSIVELPWPGEKRYPHEGWEHIEIVLPG DPETLNARALALLSDEGLSLPGISVKTSSPKGEHERLPNPTLAVTDGKTTIKFHPWSIEE IVASEQSA >gi|296493294|gb|ADTK01000207.1| GENE 5 5513 - 7246 2246 577 aa, chain + ## HITS:1 COG:argS KEGG:ns NR:ns ## COG: argS COG0018 # Protein_GI_number: 16129828 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Arginyl-tRNA synthetase # Organism: Escherichia coli K12 # 1 577 1 577 577 1150 99.0 0 MNIQALLSEKVRQAMIAAGAPADCEPQVRQSAKVQFGDYQANGMMAVAKKLGMAPRQLAE QVLTHLDLNGIASKVEIAGPGFINIFLDPAFLAEHVQQALASDRLGVSTPEKQTIVVDYS APNVAKEMHVGHLRSTIIGDAAVRTLEFLGHKVIRANHVGDWGTQFGMLIAWLEKQQQEN AGEMELADLEGFYRDAKKHYDEDEEFAERARNYVVKLQSGDEYFREMWRKLVDITMTQNQ ITYDRLNVTLTRDDVMGESLYNPMLPGIVADLKAKGLAVESEGATVVFLDEFKNKEGEPM GVIIQKKDGGYLYTTTDIACAKYRYETLHADRVLYYIDSRQHQHLMQAWAIVRKAGYVPE SVPLEHHMFGMMLGKDGKPFKTRAGGTVKLADLLDEALERARRLVAEKNPDMPADELEKL ANAVGIGAVKYADLSKNRTTDYIFDWDNMLAFEGNTAPYMQYAYTRVLSVFRKAEINEEQ LAAAPVIIREDREAQLAARLLQFEETLTVVAREGTPHVMCAYLYDLAGLFSGFYEHCPIL 
SAENEEVRNSRLKLAQLTAKTLKLGLDTLGIETVERM Prediction of potential genes in microbial genomes Time: Mon May 16 15:41:10 2011 Seq name: gi|296493293|gb|ADTK01000208.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont544.16, whole genome shotgun sequence Length of sequence - 34413 bp Number of predicted genes - 33, with homology - 33 Number of transcription units - 16, operones - 6 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 32 - 91 2.6 1 1 Tu 1 . + CDS 130 - 618 187 ## COG3755 Uncharacterized protein conserved in bacteria 2 2 Op 1 . - CDS 738 - 1130 149 ## B21_01838 hypothetical protein 3 2 Op 2 13/0.000 - CDS 1130 - 3208 2038 ## COG1298 Flagellar biosynthesis pathway, component FlhA 4 2 Op 3 4/0.667 - CDS 3201 - 4349 1031 ## COG1377 Flagellar biosynthesis pathway, component FlhB - Prom 4457 - 4516 2.8 5 3 Op 1 8/0.000 - CDS 4551 - 5195 772 ## COG3143 Chemotaxis protein 6 3 Op 2 18/0.000 - CDS 5206 - 5595 509 ## COG0784 FOG: CheY-like receiver 7 3 Op 3 13/0.000 - CDS 5610 - 6659 863 ## COG2201 Chemotaxis response regulator containing a CheY-like receiver domain and a methylesterase domain 8 3 Op 4 9/0.000 - CDS 6662 - 7522 651 ## COG1352 Methylase of chemotaxis methyl-accepting proteins 9 3 Op 5 13/0.000 - CDS 7541 - 9142 1554 ## COG0840 Methyl-accepting chemotaxis protein 10 3 Op 6 17/0.000 - CDS 9188 - 10849 1426 ## COG0840 Methyl-accepting chemotaxis protein - Prom 10895 - 10954 3.8 - Term 10930 - 10968 4.5 11 3 Op 7 20/0.000 - CDS 10994 - 11497 589 ## COG0835 Chemotaxis signal transduction protein 12 3 Op 8 5/0.000 - CDS 11518 - 13476 1663 ## COG0643 Chemotaxis protein histidine kinase and related kinases 13 3 Op 9 19/0.000 - CDS 13487 - 14413 726 ## COG1360 Flagellar motor protein 14 3 Op 10 . - CDS 14410 - 15297 939 ## COG1291 Flagellar motor component - Term 15360 - 15397 7.2 15 4 Op 1 . - CDS 15424 - 16002 351 ## ECSP_2465 transcriptional activator FlhC 16 4 Op 2 . 
- CDS 16005 - 16355 307 ## EC55989_2071 transcriptional activator FlhD - Prom 16553 - 16612 4.5 17 5 Tu 1 . + CDS 17135 - 17563 409 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins + Term 17615 - 17644 2.5 18 6 Op 1 8/0.000 - CDS 17570 - 18994 1207 ## COG0380 Trehalose-6-phosphate synthase 19 6 Op 2 1/1.000 - CDS 18969 - 19769 644 ## COG1877 Trehalose-6-phosphatase - Prom 19824 - 19883 3.6 - Term 19860 - 19897 5.4 20 7 Op 1 21/0.000 - CDS 19936 - 20922 1036 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 21 7 Op 2 16/0.000 - CDS 20937 - 22451 183 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 22 7 Op 3 . - CDS 22521 - 23510 1249 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 23560 - 23619 5.1 + Prom 24068 - 24127 6.8 23 8 Tu 1 . + CDS 24307 - 24810 487 ## COG1528 Ferritin-like protein + Term 24852 - 24899 7.9 - Term 24844 - 24883 7.4 24 9 Tu 1 . - CDS 24889 - 25140 286 ## B21_01860 hypothetical protein + Prom 26005 - 26064 7.9 25 10 Tu 1 . + CDS 26098 - 26595 635 ## COG1528 Ferritin-like protein + Term 26604 - 26636 5.4 26 11 Tu 1 . - CDS 26633 - 26872 334 ## ECUMN_2201 hypothetical protein - Prom 26944 - 27003 6.7 + Prom 26921 - 26980 4.6 27 12 Tu 1 . + CDS 27095 - 28273 1177 ## COG0814 Amino acid permeases + Term 28296 - 28337 2.5 - Term 28277 - 28330 3.6 28 13 Tu 1 . - CDS 28335 - 29000 699 ## COG3318 Predicted metal-binding protein related to the C-terminal domain of SecA - Prom 29082 - 29141 2.9 - TRNA 29196 - 29282 71.6 # Leu TAA 0 0 - TRNA 29295 - 29368 51.5 # Cys GCA 0 0 - TRNA 29423 - 29498 93.7 # Gly GCC 0 0 29 14 Op 1 9/0.000 - CDS 29650 - 30198 264 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase 30 14 Op 2 3/1.000 - CDS 30255 - 32021 1665 ## COG0322 Nuclease subunit of the excinuclease complex 31 14 Op 3 . 
- CDS 32084 - 32740 465 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain - Prom 32784 - 32843 5.2 + Prom 32949 - 33008 5.5 32 15 Tu 1 . + CDS 33199 - 33423 358 ## ECDH10B_2056 hypothetical protein + Term 33431 - 33471 8.0 - Term 33417 - 33457 4.2 33 16 Tu 1 . - CDS 33491 - 34177 460 ## COG2771 DNA-binding HTH domain-containing proteins - Prom 34292 - 34351 7.7 Predicted protein(s) >gi|296493293|gb|ADTK01000208.1| GENE 1 130 - 618 187 162 aa, chain + ## HITS:1 COG:yecT KEGG:ns NR:ns ## COG: yecT COG3755 # Protein_GI_number: 16129829 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 162 8 169 169 309 98.0 1e-84 MFKFLVLTLGIISCQAYAEDTVIVNDHDISAIKDCWQKNSDDDTDINVIKSCLRQEYNLV DAQLNKAYGEAYRYIEQVPRTGVKKPDTEQLNLLKKSQRAWLDFRDKECELILSNEDVQD LSDPYSESEWLSCMIIQTNTRTRQLQLYHNSEDFYPSPLTRG >gi|296493293|gb|ADTK01000208.1| GENE 2 738 - 1130 149 130 aa, chain - ## HITS:1 COG:no KEGG:B21_01838 NR:ns ## KEGG: B21_01838 # Name: flhE # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 130 1 130 130 218 100.0 4e-56 MRTLLAILLFPLLVQAAGEGMWQASSVGITLNHRGESMSSAPLSTRQPASGLMTLVAWRY QLIGPTPSGLRVRLCSQSRCVELEGQSGTTVAFSGIAAAEPLRFIWEVPGGGRLIPPLKV QRNEVIVNYR >gi|296493293|gb|ADTK01000208.1| GENE 3 1130 - 3208 2038 692 aa, chain - ## HITS:1 COG:flhA KEGG:ns NR:ns ## COG: flhA COG1298 # Protein_GI_number: 16129831 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FlhA # Organism: Escherichia coli K12 # 1 692 1 692 692 1266 99.0 0 MSNLAAMLRLPANLKSTQWQILAGPILILLILSMMVLPLPAFILDLLFTFNIALSIMVLL VAMFTQRTLEFAAFPTILLFTTLLRLALNVASTRIILMEGHTGAAAAGKVVEAFGHFLVG GNFAIGIVVFVILVIINFMVITKGAGRIAEVGARFVLDGMPGKQMAIDADLNAGLIGEDE AKKRRSEVTQEADFYGSMDGASKFVRGDAIAGILIMVINVVGGLLVGVLQHGMSMGHAAE SYTLLTIGDGLVAQIPALVISTAAGVIVTRVSTDQDVGEQMVNQLFSNPSVMLLSAAVLG 
LLGLVPGMPNLVFLLFTAGLLGLAWWIRGREQKAPAEPKPVKMAENNTVVEATWNDVQLE DSLGMEVGYRLIPMVDFQQDGELLGRIRSIRKKFAQEMGFLPPVVHIRDNMDLQPARYRI LMKGVEIGSGDAYPGRWLAINPGTAAGTLPGEATVDPAFGLNAIWIESALKEQAQIQGYT VVEASTVVATHLNHLISQHAAELFGRQEAQQLLDRVAQEMPKLTEDLVPGVVTLTTLHKV LQNLLDEKVPIRDMRTILETLAEHAPIQSDPHELTAVVRVALGRAITQQWFPGKDEVHVI GLDTPLERLLLQALQGGGGLEPGLADRLLAQTQEALSRQEMLGAPPVLLVNHALRPLLSR FLRRSLPQLVVLSNLELSDNRHIRMTATIGGK >gi|296493293|gb|ADTK01000208.1| GENE 4 3201 - 4349 1031 382 aa, chain - ## HITS:1 COG:flhB KEGG:ns NR:ns ## COG: flhB COG1377 # Protein_GI_number: 16129832 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FlhB # Organism: Escherichia coli K12 # 1 382 1 382 382 694 99.0 0 MSDESDDKTEAPTPHRLEKAREEGQIPRSRELTSLLILLVGVCVIWFGGVSLARRLSGML SAGLHFDHSIINDPNLILGQIILLIREAMLALLPLISGVVLVALISPVMLGGLVFSGKSL QPKFSKLNPLPGIKRMFSAQTGAELLKAILKTILVGSVTGFFLWHHWPQMMRLMAESPIT AMGNAMDLVGLCALLVVLGVIPMVGFDVFFQIFSHLKKLRMSRQDIRDEFKQSEGDPHVK GRIRQMQRAAARRRMMADVPKADVIVNNPTHYSVALQYDENKMSAPKVVAKGAGLLALRI REIGAENNVPTLEAPPLARALYRHAEIGQQIPGQLYAAVAEVLAWVWQLKRWRLAGGQRP VQPTHLPVPEALDFINEKPTHE >gi|296493293|gb|ADTK01000208.1| GENE 5 4551 - 5195 772 214 aa, chain - ## HITS:1 COG:cheZ KEGG:ns NR:ns ## COG: cheZ COG3143 # Protein_GI_number: 16129833 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Chemotaxis protein # Organism: Escherichia coli K12 # 1 214 1 214 214 345 100.0 3e-95 MMQPSIKPADEHSAGDIIARIGSLTRMLRDSLRELGLDQAIAEAAEAIPDARDRLYYVVQ MTAQAAERALNSVEASQPHQDQMEKSAKALTQRWDDWFADPIDLADARELVTDTRQFLAD VPAHTSFTNAQLLEIMMAQDFQDLTGQVIKRMMDVIQEIERQLLMVLLENIPEQESRPKR ENQSLLNGPQVDTSKAGVVASQDQVDDLLDSLGF >gi|296493293|gb|ADTK01000208.1| GENE 6 5206 - 5595 509 129 aa, chain - ## HITS:1 COG:ECs2592 KEGG:ns NR:ns ## COG: ECs2592 COG0784 # Protein_GI_number: 15831846 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Escherichia coli O157:H7 # 1 129 1 129 129 
234 100.0 4e-62 MADKELKFLVVDDFSTMRRIVRNLLKELGFNNVEEAEDGVDALNKLQAGGYGFVISDWNM PNMDGLELLKTIRADGAMSALPVLMVTAEAKKENIIAAAQAGASGYVVKPFTAATLEEKL NKIFEKLGM >gi|296493293|gb|ADTK01000208.1| GENE 7 5610 - 6659 863 349 aa, chain - ## HITS:1 COG:ECs2593 KEGG:ns NR:ns ## COG: ECs2593 COG2201 # Protein_GI_number: 15831847 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Chemotaxis response regulator containing a CheY-like receiver domain and a methylesterase domain # Organism: Escherichia coli O157:H7 # 1 349 1 349 349 654 99.0 0 MSKIRVLSVDDSALMRQIMTEIINSHSDMEMVATAPDPLVARDLIKKFNPDVLTLDVEMP RMDGLDFLEKLMRLRPMPVVMVSSLTGKGSEVTLRALELGAIDFVTKPQLGIREGMLAYS EMIAEKVRTAAKASLAAHKPLSAPTTLKAGPLLSSEKLIAIGASTGGTEAIRHVLQPLPL SSPALLITQHMPPGFTRSFADRLNKLCQIGVKEAEDGERVLPGHAYIAPGDRHMELARSG ANYQIKIHDGPAVNRHRPSVDVLFHSVAKQAGRNAVGVILTGMGNDGAAGMLAMRQAGAW TLAQNEASCVVFGMPREAINMGGVCEVVDLSQVSQQMLAKISAGQAIRI >gi|296493293|gb|ADTK01000208.1| GENE 8 6662 - 7522 651 286 aa, chain - ## HITS:1 COG:cheR KEGG:ns NR:ns ## COG: cheR COG1352 # Protein_GI_number: 16129836 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methylase of chemotaxis methyl-accepting proteins # Organism: Escherichia coli K12 # 1 286 1 286 286 570 99.0 1e-163 MTSSLPCGQTSLLLQMTERLALSDAHFRRISQLIYQRAGIVLADHKRDMVYNRLVRRLRS LGLTDFGHYLNLLESNQHSGEWQAFINSLTTNLTAFFREAHHFPLLADHARRRSGEYRVW SAAASTGEEPYSIAMTLADTLGTAPGRWKVFASDIDTEVLEKARSGIYRHEELKNLTPQQ LQRYFMRGTGPHEGLVRVRQELANYVDFAPLNLLAKQYTVPGPFDAIFCRNVMIYFDQNT QQEILRRFVPLLKPDGLLFAGHSENFSHLERRFTLRGQTVYALSKD >gi|296493293|gb|ADTK01000208.1| GENE 9 7541 - 9142 1554 533 aa, chain - ## HITS:1 COG:tap KEGG:ns NR:ns ## COG: tap COG0840 # Protein_GI_number: 16129837 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Escherichia coli K12 # 1 533 1 533 533 833 99.0 0 MFNRIRISTTLFLILILCGILQIGSNGMSFWAFRDDLQRLNQVEQSNQQRAALAQTRAVM LQASTSLNKAGTLTALSYPADDIKTLMTTARASLTQSTTLFKSFMAMTAGNEHVRALQKE 
TEKSFARWHNDLEHQATWLESNQLSDFLTAPVQGSQNAFDVNFEAWQLEINHVLEAASAQ SQRNYQISALVFISMIIVAAIYISSALWWTRKMIVQPLAIIGSHFDSIAAGNLARPIAVY GRNEITAIFASLKTMQQALRGTVSDVRKGSQEMHIGIAEIVAGNNDLSSRTEQQAASLAQ TAASMEQLTATVGQNADNARQASELAKNAATTAQAGGVQVSTMTHTMQEIATSSQKIGDI ISVIDGIAFQTNILALNAAVEAARAGEQGRGFAVVAGEVRNLASRSAQAAKEIKGLIEES VNRVQQGSKLVNNAAATMIDIVSSVTRVNDIMGEIASASEEQQRGIEQVAQAVSQMDQVT QQNASLVEEAAVATEQLANQADHLSSRVAVFTLEEHEVARHESAQLQIAPVVS >gi|296493293|gb|ADTK01000208.1| GENE 10 9188 - 10849 1426 553 aa, chain - ## HITS:1 COG:tar KEGG:ns NR:ns ## COG: tar COG0840 # Protein_GI_number: 16129838 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Escherichia coli K12 # 1 553 1 553 553 897 100.0 0 MINRIRVVTLLVMVLGVFALLQLISGSLFFSSLHHSQKSFVVSNQLREQQGELTSTWDLM LQTRINLSRSAVRMMMDSSNQQSNAKVELLDSARKTLAQAATHYKKFKSMAPLPEMVATS RNIDEKYKNYYTALTELIDYLDYGNTGAYFAQPTQGMQNAMGEAFAQYALSSEKLYRDIV TDNADDYRFAQWQLAVIALVVVLILLVAWYGIRRMLLTPLAKIIAHIREIAGGNLANTLT IDGRSEMGDLAQSVSHMQRSLTDTVTHVREGSDAIYAGTREIAAGNTDLSSRTEQQASAL EETAASMEQLTATVKQNADNARQASQLAQSASDTAQHGGKVVDGVVKTMHEIADSSKKIA DIISVIDGIAFQTNILALNAAVEAARAGEQGRGFAVVAGEVRNLASRSAQAAKEIKALIE DSVSRVDTGSVLVESAGETMNNIVNAVTRVTDIMGEIASASDEQSRGIDQVALAVSEMDR VTQQNASLVQESAAAAAALEEQASRLTQAVSAFRLAASPLTNKPQTPSRPASEQPPAQPR LRIAEQDPNWETF >gi|296493293|gb|ADTK01000208.1| GENE 11 10994 - 11497 589 167 aa, chain - ## HITS:1 COG:ECs2597 KEGG:ns NR:ns ## COG: ECs2597 COG0835 # Protein_GI_number: 15831851 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Chemotaxis signal transduction protein # Organism: Escherichia coli O157:H7 # 1 167 1 167 167 293 100.0 1e-79 MTGMTNVTKLASEPSGQEFLVFTLGDEEYGIDILKVQEIRGYDQVTRIANTPAFIKGVTN LRGVIVPIVDLRIKFSQVDVDYNDNTVVIVLNLGQRVVGIVVDGVSDVLSLTAEQIRPAP EFAVTLSTEYLTGLGALGDRMLILVNIEKLLNSEEMALLDSAASEVA >gi|296493293|gb|ADTK01000208.1| GENE 12 11518 - 13476 1663 652 aa, chain - ## HITS:1 COG:cheA KEGG:ns NR:ns ## COG: cheA COG0643 # Protein_GI_number: 16129840 # 
Func_class: N Cell motility; T Signal transduction mechanisms # Function: Chemotaxis protein histidine kinase and related kinases # Organism: Escherichia coli K12 # 1 652 3 654 654 1196 99.0 0 MDISDFYQTFFDEADELLADMEQHLLVLQPEAPDAEQLNAIFRAAHSIKGGAGTFGFSVL QETTHLMENLLDEARRGEMQLNTDIINLFLETKDIMQEQLDAYKQSQEPDAASFDYICQA LRQLALEAKGETPSAVTRLSVVAKSEPQDEQNRSQSPRRIILSRLKAGEVDLLEEELGHL TTLTDVVKGADSLSAILPGDIAEDDITAVLCFVIEADQITFETVEVSPKISTPPVLKLAA EQAPTGRVEREKTTRSSESTSIRVAVEKVDQLINLVGELVITQSMLAQRSSELDPVNHGD LITSMGQLQRNARDLQESVMSIRMMPMEYVFSRYPRLVRDLAGKLGKQVELTLVGSSTEL DKSLIERIIDPLTHLVRNSLDHGIELPEKRLAAGKNSVGNLILSAEHQGGNICIEVTDDG AGLNRERILAKAASQGLTVSENMSDDEVAMLIFAPGFSTAEQVTDVSGRGVGMDVVKRNI QEMGGHVEIQSKQGTGTTIRILLPLTLAILDGMSVRVADEVFILPLNAVMESLQPREADL HPLAGGERVLEVRGEYLPIVELWKVFNVAGAKTEATQGIVVILQSGGRRYALLVDQLIGQ HQVVVKNLESNYRKVPGISAATILGDGSVALIVDVSALQAINREQRMANTAA >gi|296493293|gb|ADTK01000208.1| GENE 13 13487 - 14413 726 308 aa, chain - ## HITS:1 COG:ECs2599 KEGG:ns NR:ns ## COG: ECs2599 COG1360 # Protein_GI_number: 15831853 # Func_class: N Cell motility # Function: Flagellar motor protein # Organism: Escherichia coli O157:H7 # 1 308 1 308 308 587 99.0 1e-168 MKNQAHPIIVVKRRKAKSHGAAHGSWKIAYADFMTAMMAFFLVMWLISISSPKELIQIAE YFRTPLATAVTGGDRISNSESPIPGGGDDYTQSQGEVNKQPNIEELKKRMEQSRLRKLRG DLDQLIESDPKLRALRPHLKIDLVQEGLRIQIIDSQNRPMFRTGSADVEPYMRDILRAIA PVLNGIPNRISLSGHTDDFPYASGEKGYSNWELSADRANASRRELMVGGLDSGKVLRVVG MAATMCLSDRGPDDAVNRRISLLVLNKQAEQAILHENAESQNEPVSALEKPEVAPQVSVP TMPSAEPR >gi|296493293|gb|ADTK01000208.1| GENE 14 14410 - 15297 939 295 aa, chain - ## HITS:1 COG:motA KEGG:ns NR:ns ## COG: motA COG1291 # Protein_GI_number: 16129842 # Func_class: N Cell motility # Function: Flagellar motor component # Organism: Escherichia coli K12 # 1 295 1 295 295 547 99.0 1e-155 MLILLGYLVVLGTVFGGYLMTGGNLGALYQPAELVIIAGAGIGSFIVGNNGKAIKGTLKA LPLLFRRSKYTKAMYMDLLALLYRLMAKSRQMGMFSLERDIENPRESEIFASYPRILADS VMLDFIVDYLRLIISGHMNTFEIEALMDEEIETHESEAEVPANSLALVGDSLPAFGIVAA 
VMGVVHALGSADRPAAELGALIAHAMVGTFLGILLAYGFISPLATVLRQKSAETSKMMQC VKVTLLSNLNGYAPPIAVEFGRKTLYSSERPSFIELEEHVRAVKNPQQQTTTEEA >gi|296493293|gb|ADTK01000208.1| GENE 15 15424 - 16002 351 192 aa, chain - ## HITS:1 COG:no KEGG:ECSP_2465 NR:ns ## KEGG: ECSP_2465 # Name: flhC # Def: transcriptional activator FlhC # Organism: E.coli_O157_TW14359 # Pathway: Two-component system [PATH:etw02020]; Flagellar assembly [PATH:etw02040] # 1 192 1 192 192 385 100.0 1e-106 MSEKSIVQEARDIQLAMELITLGARLQMLESETQLSRGRLIKLYKELRGSPPPKGMLPFS TDWFMTWEQNVHASMFCNAWQFLLKTGLCNGVDAVIKAYRLYLEQCPQAEEGPLLALTRA WTLVRFVESGLLQLSSCNCCGGNFITHAHQPVGSFACSLCQPPSRAVKRRKLSQNPADII PQLLDEQRVQAV >gi|296493293|gb|ADTK01000208.1| GENE 16 16005 - 16355 307 116 aa, chain - ## HITS:1 COG:no KEGG:EC55989_2071 NR:ns ## KEGG: EC55989_2071 # Name: flhD # Def: transcriptional activator FlhD # Organism: E.coli_55989 # Pathway: Two-component system [PATH:eck02020]; Flagellar assembly [PATH:eck02040] # 1 116 4 119 119 194 99.0 7e-49 MHISELLKHIYDINLSYLLLAQRLIVQDKASAMFRLGINEEMATTLAALTLPQMVKLAET NQLVCHFRFDSHQTITQLTQDSRVDDLQQIHTGIMLSTRLLNDVNQPEEALRKKRA >gi|296493293|gb|ADTK01000208.1| GENE 17 17135 - 17563 409 142 aa, chain + ## HITS:1 COG:ECs2603 KEGG:ns NR:ns ## COG: ECs2603 COG0589 # Protein_GI_number: 15831857 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Escherichia coli O157:H7 # 1 142 1 142 142 270 98.0 7e-73 MSYSNILVAVAVTPESQQLLAKAVSIARPVKGHISLITLASDPEMYNQLAAPMLEDLRSV MQEETQSFLDKLIQDAGYPVDKTFIAYGELSEHILEVCHKHHFDLVICGNHNHSFFSRAS CSAKRVIASSEVDVLLVPLTGD >gi|296493293|gb|ADTK01000208.1| GENE 18 17570 - 18994 1207 474 aa, chain - ## HITS:1 COG:otsA KEGG:ns NR:ns ## COG: otsA COG0380 # Protein_GI_number: 16129848 # Func_class: G Carbohydrate transport and metabolism # Function: Trehalose-6-phosphate synthase # Organism: Escherichia coli K12 # 1 474 1 474 474 959 100.0 0 MSRLVVVSNRIAPPDEHAASAGGLAVGILGALKAAGGLWFGWSGETGNEDQPLKKVKKGN 
ITWASFNLSEQDLDEYYNQFSNAVLWPAFHYRLDLVQFQRPAWDGYLRVNALLADKLLPL LQDDDIIWIHDYHLLPFAHELRKRGVNNRIGFFLHIPFPTPEIFNALPTYDTLLEQLCDY DLLGFQTENDRLAFLDCLSNLTRVTTRSAKSHTAWGKAFRTEVYPIGIEPKEIAKQAAGP LPPKLAQLKAELKNVQNIFSVERLDYSKGLPERFLAYEALLEKYPQHHGKIRYTQIAPTS RGDVQAYQDIRHQLENEAGRINGKYGQLGWTPLYYLNQHFDRKLLMKIFRYSDVGLVTPL RDGMNLVAKEYVAAQDPANPGVLVLSQFAGAANELTSALIVNPYDRDEVAAALDRALTMS LAERISRHAEMLDVIVKNDINHWQECFISDLKQIVPRSAESQQRDKVATFPKLA >gi|296493293|gb|ADTK01000208.1| GENE 19 18969 - 19769 644 266 aa, chain - ## HITS:1 COG:otsB KEGG:ns NR:ns ## COG: otsB COG1877 # Protein_GI_number: 16129849 # Func_class: G Carbohydrate transport and metabolism # Function: Trehalose-6-phosphatase # Organism: Escherichia coli K12 # 1 266 1 266 266 531 100.0 1e-151 MTEPLTETPELSAKYAWFFDLDGTLAEIKPHPDQVVVPDNILQGLQLLATASDGALALIS GRSMVELDALAKPYRFPLAGVHGAERRDINGKTHIVHLPDAIARDISVQLHTVIAQYPGA ELEAKGMAFALHYRQAPQHEDALMTLAQRITQIWPQMALQQGKCVVEIKPRGTSKGEAIA AFMQEAPFIGRTPVFLGDDLTDESGFAVVNRLGGMSVKIGTGATQASWRLAGVPDVWSWL EMITTALQQKRENNRSDDYESFSRSI >gi|296493293|gb|ADTK01000208.1| GENE 20 19936 - 20922 1036 328 aa, chain - ## HITS:1 COG:araH KEGG:ns NR:ns ## COG: araH COG1172 # Protein_GI_number: 16132221 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Escherichia coli K12 # 1 328 2 329 329 519 100.0 1e-147 MSSVSTSGSGAPKSSFSFGRIWDQYGMLVVFAVLFIACAIFVPNFATFINMKGLGLAISM SGMVACGMLFCLASGDFDLSVASVIACAGVTTAVVINLTESLWIGVAAGLLLGVLCGLVN GFVIAKLKINALITTLATMQIVRGLAYIISDGKAVGIEDESFFALGYANWFGLPAPIWLT VACLIIFGLLLNKTTFGRNTLAIGGNEEAARLAGVPVVRTKIIIFVLSGLVSAIAGIILA SRMTSGQPMTSIGYELIVISACVLGGVSLKGGIGKISYVVAGILILGTVENAMNLLNISP FAQYVVRGLILLAAVIFDRYKQKAKRTV >gi|296493293|gb|ADTK01000208.1| GENE 21 20937 - 22451 183 504 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 275 477 20 217 245 75 25 6e-13 
MQQSTPYLSFRGIGKTFPGVKALTDISFDCYAGQVHALMGENGAGKSTLLKILSGNYAPT TGSVVINGQEMSFSDTTAALNAGVAIIYQELHLVPEMTVAENIYLGQLPHKGGIVNRSLL NYEAGLQLKHLGMDIDPDTPLKYLSIGQWQMVEIAKALARNAKIIAFDEPTSSLSAREID NLFRVIRELRKEGRVILYVSHRMEEIFALSDAITVFKDGRYVKTFTDMQQVDHDALVQAM VGRDIGDIYGWQPRSYGEERLRLDAVKAPGVRTPISLAVRSGEIVGLFGLVGAGRSELMK GMFGGTQITAGQVYIDQQPIDIRKPSHAIAAGMMLCPEDRKAEGIIPVHSVRDNINISAR RKHVLGGCVINNGWEENNADHHIRSLNIKTPGAEQLIMNLSGGNQQKAILGRWLSEEMKV ILLDEPTRGIDVGAKHEIYNVIYALAAQGVAVLFASSDLPEVLGVADRIVVMREGEIAGE LLHEQADERQALSLAMPKVSQAVA >gi|296493293|gb|ADTK01000208.1| GENE 22 22521 - 23510 1249 329 aa, chain - ## HITS:1 COG:araF KEGG:ns NR:ns ## COG: araF COG1879 # Protein_GI_number: 16129851 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli K12 # 1 329 1 329 329 634 100.0 0 MHKFTKALAAIGLAAVMSQSAMAENLKLGFLVKQPEEPWFQTEWKFADKAGKDLGFEVIK IAVPDGEKTLNAIDSLAASGAKGFVICTPDPKLGSAIVAKARGYDMKVIAVDDQFVNAKG KPMDTVPLVMMAATKIGERQGQELYKEMQKRGWDVKESAVMAITANELDTARRRTTGSMD ALKAAGFPEKQIYQVPTKSNDIPGAFDAANSMLVQHPEVKHWLIVGMNDSTVLGGVRATE GQGFKAADIIGIGINGVDAVSELSKAQATGFYGSLLPSPDVHGYKSSEMLYNWVAKDVEP PKFTEVTDVVLITRDNFKEELEKKGLGGK >gi|296493293|gb|ADTK01000208.1| GENE 23 24307 - 24810 487 167 aa, chain + ## HITS:1 COG:ECs2610 KEGG:ns NR:ns ## COG: ECs2610 COG1528 # Protein_GI_number: 15831864 # Func_class: P Inorganic ion transport and metabolism # Function: Ferritin-like protein # Organism: Escherichia coli O157:H7 # 1 167 1 167 167 296 100.0 1e-80 MATAGMLLKLNSQMNREFYASNLYLHLSNWCSEQSLNGTATFLRAQAQSNVTQMMRMFNF MKSVGATPIVKAIDVPGEKLNSLEELFQKTMEEYEQRSSTLAQLADEAKELNDDSTVNFL RDLEKEQQHDGLLLQTILDEVRSAKLAGMCPVQTDQHVLNVVSHQLH >gi|296493293|gb|ADTK01000208.1| GENE 24 24889 - 25140 286 83 aa, chain - ## HITS:1 COG:no KEGG:B21_01860 NR:ns ## KEGG: B21_01860 # Name: yecJ # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 83 1 83 83 138 100.0 5e-32 MSQPLNADQELVSDVVACQLVIKQILDVLDVIAPVEVREKMSSQLKNIDFTNHPAAADPV 
TMRAIQKAIALIELKFTPQGESH >gi|296493293|gb|ADTK01000208.1| GENE 25 26098 - 26595 635 165 aa, chain + ## HITS:1 COG:ECs2613 KEGG:ns NR:ns ## COG: ECs2613 COG1528 # Protein_GI_number: 15831867 # Func_class: P Inorganic ion transport and metabolism # Function: Ferritin-like protein # Organism: Escherichia coli O157:H7 # 1 165 1 165 165 291 100.0 3e-79 MLKPEMIEKLNEQMNLELYSSLLYQQMSAWCSYHTFEGAAAFLRRHAQEEMTHMQRLFDY LTDTGNLPRINTVESPFAEYSSLDELFQETYKHEQLITQKINELAHAAMTNQDYPTFNFL QWYVSEQHEEEKLFKSIIDKLSLAGKSGEGLYFIDKELSTLDTQN >gi|296493293|gb|ADTK01000208.1| GENE 26 26633 - 26872 334 79 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_2201 NR:ns ## KEGG: ECUMN_2201 # Name: yecH # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 79 1 79 79 140 98.0 2e-32 MDSIHGHEVLNMMIESGEQYTHASLEAAIKARFGEQARFHTCSAEGMTAGELVAFLAAKG KFIPSEEGFSTDQSKICRH >gi|296493293|gb|ADTK01000208.1| GENE 27 27095 - 28273 1177 392 aa, chain + ## HITS:1 COG:ECs2615 KEGG:ns NR:ns ## COG: ECs2615 COG0814 # Protein_GI_number: 15831869 # Func_class: E Amino acid transport and metabolism # Function: Amino acid permeases # Organism: Escherichia coli O157:H7 # 1 392 12 403 403 643 99.0 0 MAGTTIGAGMLAMPLAAAGVGFSVTLILLIGLWALMCYTALLLLEVYQHVPADTGLGTLA KRYLGRYGQWLTGFSMMFLMYALTAAYISGAGELLASSISDWTGISMSATAGVLLFTFVA GGVVCVGTSLVDLFNRFLFSAKIIFLVVMLVLLLPHIHKVNLLTLPLQQGLALSAIPVIF TSFGFHGSVPSIVSYMDGNVRKLRWVFITGSAIPLVAYIFWQVATLGSIDSTTFMGLLAN HAGLNGLLQALREMVASPHVELAVHLFADLALATSFLGVALGLFDYLADLFQRSNTVGGR LQTGAITFLPPLAFALFYPRGFVMALGYAGVALAVLALIIPSLLTWQSRKHNPQAGYRVK GGRPALVVVFLCGIAVIGVQFLIAAGLLPEVG >gi|296493293|gb|ADTK01000208.1| GENE 28 28335 - 29000 699 221 aa, chain - ## HITS:1 COG:ECs2616 KEGG:ns NR:ns ## COG: ECs2616 COG3318 # Protein_GI_number: 15831870 # Func_class: R General function prediction only # Function: Predicted metal-binding protein related to the C-terminal domain of SecA # Organism: Escherichia coli O157:H7 # 1 221 1 221 221 438 99.0 1e-123 MKTGPLNESELEWLDDILTKYNTDHAILDVAELDGLLTAVLSSPQEIEPAQWLVAVWGGA 
DYVPRWASEKEMTRFMNLAFQHMADTAERLNEFPEQFEPLFGLREVDGSELTIVEEWCFG YMRGVALSDWSTLPDSLKPALEAIALHGTEDNFERVEKMSPEAFEESVDAIRLAALDLHA YWMAHPQEKAVQQPIKAEEKPGRNDPCPCGSGKKFKQCCLH >gi|296493293|gb|ADTK01000208.1| GENE 29 29650 - 30198 264 182 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 9 175 486 665 904 106 38 2e-22 MQFNIPTLLTLFRVILIPFFVLVFYLPVTWSPFAAALIFCVAAVTDWFDGFLARRWNQST RFGAFLDPVADKVLVAIAMVLVTEHYHSWWVTLPAATMIAREIIISALREWMAELGKRSS VAVSWIGKVKTTAQMVALAWLLWRPNIWVEYAGIALFFVAAVLTLWSMLQYLSAARADLL DQ >gi|296493293|gb|ADTK01000208.1| GENE 30 30255 - 32021 1665 588 aa, chain - ## HITS:1 COG:ECs2651 KEGG:ns NR:ns ## COG: ECs2651 COG0322 # Protein_GI_number: 15831905 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: Escherichia coli O157:H7 # 1 588 1 588 588 1193 100.0 0 MYDAGGTVIYVGKAKDLKKRLSSYFRSNLASRKTEALVAQIQQIDVTVTHTETEALLLEH NYIKLYQPRYNVLLRDDKSYPFIFLSGDTHPRLAMHRGAKHAKGEYFGPFPNGYAVRETL ALLQKIFPIRQCENSVYRNRSRPCLQYQIGRCLGPCVEGLVSEEEYAQQVEYVRLFLSGK DDQVLTQLISRMETASQNLEFEEAARIRDQIQAVRRVTEKQFVSNTGDDLDVIGVAFDAG MACVHVLFIRQGKVLGSRSYFPKVPGGTELSEVVETFVGQFYLQGSQMRTLPGEILLDFN LSDKTLLADSLSELAGRKINVQTKPRGDRARYLKLARTNAATALTSKLSQQSTVHQRLTA LASVLKLPEVKRMECFDISHTMGEQTVASCVVFDANGPLRAEYRRYNITGITPGDDYAAM NQVLRRRYGKAIDDSKIPDVILIDGGKGQLAQAKNVFAELDVSWDKNHPLLLGVAKGADR KAGLETLFFEPEGEGFSLPPDSPALHVIQHIRDESHDHAIGGHRKKRAKVKNTSSLETIE GVGPKRRQMLLKYMGGLQGLRNASVEEIAKVPGISQGLAEKIFWSLKH >gi|296493293|gb|ADTK01000208.1| GENE 31 32084 - 32740 465 218 aa, chain - ## HITS:1 COG:ECs2652 KEGG:ns NR:ns ## COG: ECs2652 COG2197 # Protein_GI_number: 15831906 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 218 1 218 218 415 100.0 1e-116 
MINVLLVDDHELVRAGIRRILEDIKGIKVVGEASCGEDAVKWCRANAVDVVLMDMSMPGI GGLEATRKIARSTADVKIIMLTVHTENPLPAKVMQAGAAGYLSKGAAPQEVVSAIRSVYS GQRYIASDIAQQMALSQIEPEKTESPFASLSERELQIMLMITKGQKVNEISEQLNLSPKT VNSYRYRMFSKLNIHGDVELTHLAIRHGLCNAETLSSQ >gi|296493293|gb|ADTK01000208.1| GENE 32 33199 - 33423 358 74 aa, chain + ## HITS:1 COG:no KEGG:ECDH10B_2056 NR:ns ## KEGG: ECDH10B_2056 # Name: yecF # Def: hypothetical protein # Organism: E.coli_DH10B # Pathway: not_defined # 1 74 1 74 74 103 100.0 1e-21 MSTPDFSTAENNQELANEVSCLKAMLTLMLQAMGQADAGRVMLKMEKQLALIEDETQAAV FSKTVKQIKQAYRQ >gi|296493293|gb|ADTK01000208.1| GENE 33 33491 - 34177 460 228 aa, chain - ## HITS:1 COG:sdiA KEGG:ns NR:ns ## COG: sdiA COG2771 # Protein_GI_number: 16129863 # Func_class: K Transcription # Function: DNA-binding HTH domain-containing proteins # Organism: Escherichia coli K12 # 1 228 13 240 240 442 98.0 1e-124 MLLRFQRMEAAEEVYHEIEFQAQQLEYDYYSLCVRHPVPFTRPKVAFYTNYPEAWVSYYQ AKNFLAIDPVLNPENFSQGHLMWNDDLFSEAQPLWEAARAHGLRRGVTQYLMLPNRALGF LSFSRSSAREIPILSDELQLKMQLLVRESLMALMRLNDEIVMTPEMNFSKREKEILKWTA EGKTSAEIAMILSISENTVNFHQKNMQKKINAPNKTQVACYAAATGLI Prediction of potential genes in microbial genomes Time: Mon May 16 15:41:26 2011 Seq name: gi|296493292|gb|ADTK01000209.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont544.17, whole genome shotgun sequence Length of sequence - 4904 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 34/0.000 - CDS 40 - 792 653 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 2 1 Op 2 1/0.000 - CDS 789 - 1457 670 ## COG0765 ABC-type amino acid transport system, permease component 3 1 Op 3 2/0.000 - CDS 1472 - 2458 1153 ## COG2515 1-aminocyclopropane-1-carboxylate deaminase - Prom 2486 - 2545 8.0 - Term 2500 - 2553 3.9 4 1 Op 4 . 
- CDS 2563 - 3363 1227 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain - Prom 3388 - 3447 4.0 5 2 Op 1 . - CDS 3451 - 3999 406 ## G2583_2372 protein FliZ 6 2 Op 2 . - CDS 4048 - 4767 778 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit - Prom 4797 - 4856 3.0 Predicted protein(s) >gi|296493292|gb|ADTK01000209.1| GENE 1 40 - 792 653 250 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 4 246 2 241 245 256 51 3e-68 MSAIEVKNLVKKFHGQTVLHGIDLEVKPGEVVAIIGPSGSGKTTLLRSINLLEQPEAGTI TVGDITIDTARSLSQQKSLIRQLRQHVGFVFQNFNLFPHRTVLENIIEGPVIVKGEPKEE ATARARELLAKVGLAGKETSYPRRLSGGQQQRVAIARALAMRPEVILFDEPTSALDPELV GEVLNTIRQLAQEKRTMVIVTHEMSFARDVADRAIFMDQGRIVEQGAAKALFAAPQLPRT RQFLEKFLLQ >gi|296493292|gb|ADTK01000209.1| GENE 2 789 - 1457 670 222 aa, chain - ## HITS:1 COG:yecS KEGG:ns NR:ns ## COG: yecS COG0765 # Protein_GI_number: 16129865 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Escherichia coli K12 # 1 222 1 222 222 396 100.0 1e-110 MQESIQLVIDSLPFLLKGAGYTLQLSIGGMFFGLLLGFILALMRLSPIWPVRWLARFYIS IFRGTPLIAQLFMIYYGLPQFGIELDPIPSAMIGLSLNTAAYAAETLRAAISSIDKGQWE AAASIGMTPWQTMRRAILPQAARVALPPLSNSFISLVKDTSLAATIQVPELFRQAQLITS RTLEVFTMYLAASLIYWIMATVLSTLQNHFENQLNRQEREPK >gi|296493292|gb|ADTK01000209.1| GENE 3 1472 - 2458 1153 328 aa, chain - ## HITS:1 COG:yedO KEGG:ns NR:ns ## COG: yedO COG2515 # Protein_GI_number: 16129866 # Func_class: E Amino acid transport and metabolism # Function: 1-aminocyclopropane-1-carboxylate deaminase # Organism: Escherichia coli K12 # 1 328 33 360 360 647 100.0 0 MPLHNLTRFPRLEFIGAPTPLEYLPRFSDYLGREIFIKRDDVTPMAMGGNKLRKLEFLAA DALREGADTLITAGAIQSNHVRQTAAVAAKLGLHCVALLENPIGTTAENYLTNGNRLLLD LFNTQIEMCDALTDPNAQLEELATRVEAQGFRPYVIPVGGSNALGALGYVESALEIAQQC EGAVNISSVVVASGSAGTHAGLAVGLEHLMPESELIGVTVSRSVADQLPKVVNLQQAIAK 
ELELTASAEILLWDDYFAPGYGVPNDEGMEAVKLLARLEGILLDPVYTGKAMAGLIDGIS QKRFKDEGPILFIHTGGAPALFAYHPHV >gi|296493292|gb|ADTK01000209.1| GENE 4 2563 - 3363 1227 266 aa, chain - ## HITS:1 COG:fliY KEGG:ns NR:ns ## COG: fliY COG0834 # Protein_GI_number: 16129867 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Escherichia coli K12 # 1 266 1 266 266 487 99.0 1e-138 MKLAHLGRQALMGVMVVALVAGMSVKSFADEGLLNKVKERGTLLVGLEGTYPPFSFQGDD GKLTGFEVEFAQQLAKHLGVEASLKPTKWDGMLASLDSKRIDVVINQVTISDERKKKYDF STPYTISGIQALVKKGNEGTIKTAADLKGKKVGVGLGTNYEEWLRQNVQGVDVRTYDDDP TKYQDLRVGRIDAILVDRLAALDLVKKTNDTLAVTGEAFSRQESGVALRKGNEDLLKAVN DAIAEMQKDGTLQALSEKWFGADVTK >gi|296493292|gb|ADTK01000209.1| GENE 5 3451 - 3999 406 182 aa, chain - ## HITS:1 COG:no KEGG:G2583_2372 NR:ns ## KEGG: G2583_2372 # Name: fliZ # Def: protein FliZ # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 182 14 195 195 366 98.0 1e-100 MVQHLKRRPLSRYLKDFKHSQTHCAHCRKLLDRITLVRDGKIVNKIEISRLDTLLDENGW QVEQKSWAALCRFCGDLHCKTQSDFFDIIGFKQFLFEQTEMSPGTVREYVVRLRRLGNHL HEQNISLEQLQDGFLDEILAPWLPTTSTNNYRIALRKYQHYQRQTCTGLVQKSSSLPASD IY >gi|296493292|gb|ADTK01000209.1| GENE 6 4048 - 4767 778 239 aa, chain - ## HITS:1 COG:ECs2661 KEGG:ns NR:ns ## COG: ECs2661 COG1191 # Protein_GI_number: 15831915 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Escherichia coli O157:H7 # 1 239 1 239 239 416 100.0 1e-116 MNSLYTAEGVMDKHSLWQRYVPLVRHEALRLQVRLPASVELDDLLQAGGIGLLNAVERYD ALQGTAFTTYAVQRIRGAMLDELRSRDWVPRSVRRNAREVAQAIGQLEQELGRNATETEV AERLGIDIADYRQMLLDTNNSQLFSYDEWREEHGDSIELVTDDHQRENPLQQLLDSNLRQ RVMEAIETLPEREKLVLTLYYQEELNLKEIGAVLEVGESRVSQLHSQAIKRLRTKLGKL Prediction of potential genes in microbial genomes Time: Mon May 16 15:41:37 2011 Seq name: gi|296493291|gb|ADTK01000210.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont544.18, whole genome shotgun sequence Length of 
sequence - 23459 bp Number of predicted genes - 31, with homology - 30 Number of transcription units - 16, operones - 4 average op.length - 4.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 8 - 37 2.3 1 1 Tu 1 . - CDS 56 - 1105 1171 ## COG1344 Flagellin and related hook-associated proteins - Prom 1271 - 1330 3.1 + Prom 1056 - 1115 4.3 2 2 Op 1 15/0.000 + CDS 1353 - 2768 1131 ## COG1345 Flagellar capping protein 3 2 Op 2 . + CDS 2784 - 3194 441 ## COG1516 Flagellin-specific chaperone FliS 4 2 Op 3 . + CDS 3194 - 3559 445 ## ECIAI39_1129 flagellar biosynthesis protein FliT + Term 3567 - 3616 -0.9 5 2 Op 4 . + CDS 3637 - 5124 1480 ## COG0366 Glycosidases + Term 5130 - 5164 5.0 - Term 5116 - 5150 5.0 6 3 Tu 1 . - CDS 5158 - 5571 501 ## SDY_1087 hypothetical protein - Prom 5670 - 5729 6.7 + Prom 5661 - 5720 3.9 7 4 Op 1 10/0.000 + CDS 5758 - 6963 1464 ## COG2391 Predicted transporter component 8 4 Op 2 2/0.750 + CDS 6960 - 7193 314 ## COG0425 Predicted redox protein, regulator of disulfide bond formation + Prom 7223 - 7282 2.7 9 5 Tu 1 . + CDS 7302 - 7799 323 ## COG2135 Uncharacterized conserved protein + Term 7800 - 7830 0.3 10 6 Tu 1 . - CDS 7681 - 7950 100 ## COG0675 Transposase and inactivated derivatives - Prom 8011 - 8070 5.1 + Prom 7844 - 7903 1.5 11 7 Tu 1 . + CDS 8033 - 8236 92 ## COG1943 Transposase and inactivated derivatives + Prom 9010 - 9069 2.8 12 8 Tu 1 . + CDS 9138 - 9386 71 ## 13 9 Tu 1 . - CDS 9520 - 10071 95 ## COG1881 Phospholipid-binding protein - Prom 10178 - 10237 6.9 - Term 10126 - 10152 -0.6 14 10 Tu 1 . - CDS 10240 - 10572 296 ## COG2076 Membrane transporters of cations and cationic drugs - Prom 10597 - 10656 1.9 - Term 10702 - 10747 3.7 15 11 Tu 1 . 
- CDS 10916 - 11230 424 ## COG1677 Flagellar hook-basal body protein - Prom 11254 - 11313 3.6 + Prom 11300 - 11359 2.2 16 12 Op 1 19/0.000 + CDS 11445 - 13103 1432 ## COG1766 Flagellar biosynthesis/type III secretory pathway lipoprotein 17 12 Op 2 15/0.000 + CDS 13096 - 14091 1309 ## COG1536 Flagellar motor switch protein 18 12 Op 3 13/0.000 + CDS 14084 - 14770 749 ## COG1317 Flagellar biosynthesis/type III secretory pathway protein 19 12 Op 4 12/0.000 + CDS 14770 - 16143 1505 ## COG1157 Flagellar biosynthesis/type III secretory pathway ATPase 20 12 Op 5 8/0.000 + CDS 16162 - 16605 494 ## COG2882 Flagellar biosynthesis chaperone 21 12 Op 6 7/0.250 + CDS 16602 - 17729 717 ## COG3144 Flagellar hook-length control protein + Prom 17742 - 17801 2.2 22 13 Op 1 13/0.000 + CDS 17834 - 18298 475 ## COG1580 Flagellar basal body-associated protein 23 13 Op 2 20/0.000 + CDS 18303 - 19307 1063 ## COG1868 Flagellar motor switch protein 24 13 Op 3 6/0.500 + CDS 19304 - 19717 530 ## COG1886 Flagellar motor switch/type III secretory pathway protein 25 13 Op 4 6/0.500 + CDS 19720 - 20085 264 ## COG3190 Flagellar biogenesis protein 26 13 Op 5 16/0.000 + CDS 20085 - 20822 735 ## COG1338 Flagellar biosynthesis pathway, component FliP 27 13 Op 6 17/0.000 + CDS 20832 - 21101 413 ## COG1987 Flagellar biosynthesis pathway, component FliQ 28 13 Op 7 4/0.750 + CDS 21110 - 21895 689 ## COG1684 Flagellar biosynthesis pathway, component FliR + Prom 21974 - 22033 2.7 29 14 Tu 1 . + CDS 22185 - 22808 429 ## COG2771 DNA-binding HTH domain-containing proteins - Term 22715 - 22760 4.0 30 15 Tu 1 . - CDS 22852 - 23040 340 ## LF82_0529 protein DsrB + Prom 23085 - 23144 4.2 31 16 Tu 1 . 
+ CDS 23203 - 23430 279 ## G2583_2404 hypothetical protein Predicted protein(s) >gi|296493291|gb|ADTK01000210.1| GENE 1 56 - 1105 1171 349 aa, chain - ## HITS:1 COG:YPO1842 KEGG:ns NR:ns ## COG: YPO1842 COG1344 # Protein_GI_number: 16122093 # Func_class: N Cell motility # Function: Flagellin and related hook-associated proteins # Organism: Yersinia pestis # 4 347 3 368 369 283 59.0 5e-76 MAQVINTNSLSLITQNNINKNQSALSTSIERLSSGLRINSAKDDAAGQAIANRFTSNIKG LTQAARNANDGISLAQTAEGALSEINNNLQRIRELTVQASTGTNSDSDLSSIQDEIKSRL DEIDRVSGQTQFNGVNVLSKNDSMKIQIGANDNQTISIGLQQIDSTTLNLKGFTVSGMAD FSAAKLTAADDTAIAAADVKDAGGKQVNLLSYTDTASNSTKYAVVDSATGKYMEATVVIT GTAAAVTVGAAEVARAATADPLKALDAAIAKVDKFRSSLGAVQNRLDSAVTNLNNTTTNL SEAQSRIQDADYATEVSNMSKAQIIQQAGNSVLSKANQVPQQVLSLLQG >gi|296493291|gb|ADTK01000210.1| GENE 2 1353 - 2768 1131 471 aa, chain + ## HITS:1 COG:fliD KEGG:ns NR:ns ## COG: fliD COG1345 # Protein_GI_number: 16129871 # Func_class: N Cell motility # Function: Flagellar capping protein # Organism: Escherichia coli K12 # 1 471 1 468 468 441 78.0 1e-123 MASISTLGVGSGLDLSSILDSLEAAEKSTLTPISKQQSSYTAKLSAYGTLKSALESFQTA NTALNKADLFTATSTTSSSSAFSATTTGSAIAGKYTISVSQLAQAQTLTTKNSQKDSKAA IATSDSVLTIQQGGGKDPVTIDISAGNSSLSGIRDAINNAKAGVSASIINVGNGEYRLSI TANDTGSNNAMKLSVSGDSALESFMGYNGTPGDSSNGMIESVTAQNAKLTVNNVEIENSS NTISDSLEDITLNLNDVTTGNQTLTISKDTSKAENAVKAWVDAYNTLQDTFSSLTKYTAV DAGAESQDSSNGALLGDSTLRTIQTQLKTLLANTHSSSNYKTLAQIGITSDASTGKLEIA TDKLQTALKNDAAGIGEMFIGDGKSTGVTTGISNNLTSWLSSTGIIQAAKDGVSKTLNNL TDQYNAASERIDTLMTRYKAQFTQLDVLMNSLNSTSSYLTQQFDTSNSNSK >gi|296493291|gb|ADTK01000210.1| GENE 3 2784 - 3194 441 136 aa, chain + ## HITS:1 COG:ECs2664 KEGG:ns NR:ns ## COG: ECs2664 COG1516 # Protein_GI_number: 15831918 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport; O Posttranslational modification, protein turnover, chaperones # Function: Flagellin-specific chaperone FliS # Organism: Escherichia coli O157:H7 # 1 136 1 136 136 224 99.0 3e-59 
MYAAKGTQAYAQIGVESAVMSASQQQLVTMLFDGVLSALVRARLFMQDNNQQGKGVSLSK AINIIENGLRVSLDEESKDELTQNLIALYSYMVRRLLQANLRNDVSAVEEVEALMRNIAD AWKESLLSPSLIQDPV >gi|296493291|gb|ADTK01000210.1| GENE 4 3194 - 3559 445 121 aa, chain + ## HITS:1 COG:no KEGG:ECIAI39_1129 NR:ns ## KEGG: ECIAI39_1129 # Name: fliT # Def: flagellar biosynthesis protein FliT # Organism: E.coli_IAI39 # Pathway: Flagellar assembly [PATH:ect02040] # 1 121 1 121 121 204 100.0 6e-52 MNNAPHLYFAWQQLVEKSQLMLRLATEEQWDELIASEMAYVNAVQEIAHLTEEVAPSTTM QEQLRPMLRLILDNESKVKQLLQIRMDELAKLVGQSSVQKSVLSAYGDQGGFVLAPQDNL F >gi|296493291|gb|ADTK01000210.1| GENE 5 3637 - 5124 1480 495 aa, chain + ## HITS:1 COG:amyA KEGG:ns NR:ns ## COG: amyA COG0366 # Protein_GI_number: 16129874 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Escherichia coli K12 # 1 495 1 495 495 1033 99.0 0 MRNPTLLQCFHWYYPEGGKLWPELAERADGFNDIGINMVWLPPAYKGASGGYSVGYDSYD LFDLGEFDQKGSIPTKYGDKAQLLAAIDALKRNDIAVLLDVVVNHKMGADEKEAIRVQRV NADDRTQIDEEIIECEGWTRYTFPARAGQYSQFIWDFKCFSGIDHIENPDEDGIFKIVND YTGEGWNDQVDDELGNFDYLMGENIDFRNHAVTEEIKYWARWVMEQTQCDGFRLDAVKHI PAWFYKEWIEHVQEVAPKPLFIVAEYWSHEVDKLQTYIDQVEGKTMLFDAPLQMKFHEAS RMGRDYDMTQIFTGTLVEADPFHAVTLVANHDTQPLQALEAPVEPWFKPLAYALILLREN GVPSVFYPDLYGAHYEDVGGDGQTYPIDMPIIEQLDELILARQRFAHGVQTLFFDHPNCI AFSRSGTDEYPGCVVVMSNGDDGEKTIHLGENYGNKTWRDFLGNRQESVVTDENGEATFF CNGGSVSVWVIEEVI >gi|296493291|gb|ADTK01000210.1| GENE 6 5158 - 5571 501 137 aa, chain - ## HITS:1 COG:no KEGG:SDY_1087 NR:ns ## KEGG: SDY_1087 # Name: yedD # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 137 1 137 137 265 100.0 4e-70 MKKLAIAGALMLLAGCAEVENYNNVVKTPAPDWLAGYWQTKGPQRALVSPEAIGSLIVTK EGDTLDCRQWQRVIAVPGKLTLMSDDLTNVTVKRELYEVERDGNTIEYDGMTMERVDRPT AECAAALDKAPLPTPLP >gi|296493291|gb|ADTK01000210.1| GENE 7 5758 - 6963 1464 401 aa, chain + ## HITS:1 COG:yedE KEGG:ns NR:ns ## COG: yedE COG2391 # Protein_GI_number: 16129876 # Func_class: R General function prediction only # Function: Predicted transporter 
component # Organism: Escherichia coli K12 # 1 401 1 401 401 706 99.0 0 MSWQQFKHAWLIKFWAPIPAVIAAGILSTYYFGITGTFWAVTGEFTRWGGQLLQLFGVHA EEWGYFKIIHLEGSPLTRIDGMMILGMFGGCFAAALWANNVKLRMPRSRIRIMQAIIGGI IAGFGARLAMGCNLAAFFTGIPQFSLHAWFFAIATAIGSWFGARFTLLPIFRIPVKMQKV SAASPLTQKPDQARRRFRLGMLVFFGLLGWALLTAMNQPKLGLAMLFGVGFGLLIERAQI CFTSAFRDMWITGRTHMAKAIIIGMAVSAIGIFSYVQLGVEPKIMWAGPNAVIGGLLFGF GIVLAGGCETGWMYRAVEGQVHYWWVGLGNVIGSTILAYYWDDFAPALATDWDKINLLKT FGPMGGLLVTYLLLFAALMLIIGWEKRFFRRAAPQTAKEIA >gi|296493291|gb|ADTK01000210.1| GENE 8 6960 - 7193 314 77 aa, chain + ## HITS:1 COG:ECs2669 KEGG:ns NR:ns ## COG: ECs2669 COG0425 # Protein_GI_number: 15831923 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted redox protein, regulator of disulfide bond formation # Organism: Escherichia coli O157:H7 # 1 77 1 77 77 145 100.0 2e-35 MKNIVPDYRLDMVGEPCPYPAVATLEAMPQLKKGEILEVVSDCPQSINNIPLDARNHGYT VLDIQQDGPTIRYLIQK >gi|296493291|gb|ADTK01000210.1| GENE 9 7302 - 7799 323 165 aa, chain + ## HITS:1 COG:ECs2670 KEGG:ns NR:ns ## COG: ECs2670 COG2135 # Protein_GI_number: 15831924 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 142 1 142 222 292 97.0 2e-79 MCGRFAQSQTREDYLALLAEDIERDIPYDPEPIGRYNVAPGTKVLLLSERDEHLHLDPVF WGYAPGWWDKPPLINARVETAATSHMFKPLWQHGRAICFADGWFEWKKEGDKKQPYFIYR ADGQAVFIAAIGSTPFERGDEAGRSFCKPAQGLSVTVAMAVRANL >gi|296493291|gb|ADTK01000210.1| GENE 10 7681 - 7950 100 89 aa, chain - ## HITS:1 COG:ydcM KEGG:ns NR:ns ## COG: ydcM COG0675 # Protein_GI_number: 16129391 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 1 77 21 97 402 151 96.0 4e-37 MRRFAGACRFVFNRALARQNENHEVGNKYIPYGKMASWLVEWKNATETQWLKDSPSQPLQ QSLKDPERAYKNFFRLRHHAQTVCYLSRL >gi|296493291|gb|ADTK01000210.1| GENE 11 8033 - 8236 92 67 aa, chain + ## HITS:1 COG:Z5815 KEGG:ns NR:ns ## COG: Z5815 COG1943 # Protein_GI_number: 15804795 # 
Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 EDL933 # 1 44 1 44 138 84 86.0 5e-17 MKKETDIRRGRHCVFLKHVHLVFVTKYRRQIFDHDATEKYALTFQMYVLILKLNGLKWMA DQITSIC >gi|296493291|gb|ADTK01000210.1| GENE 12 9138 - 9386 71 82 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFLVAPNNILTRGSEVSRNLSCYEFDQLSHKIFETFQLDQWDNKAFLQDKDLLLDNSQYK IHIHREIILSDYYQQLDYASPN >gi|296493291|gb|ADTK01000210.1| GENE 13 9520 - 10071 95 183 aa, chain - ## HITS:1 COG:ybcL KEGG:ns NR:ns ## COG: ybcL COG1881 # Protein_GI_number: 16128528 # Func_class: R General function prediction only # Function: Phospholipid-binding protein # Organism: Escherichia coli K12 # 1 183 1 183 183 340 96.0 1e-93 MKKLIVSSVLAFITFSAQAAAFQVTSNEIKTGEQLTTSHVFSGFGCEGGNTSPSLTWSGA PEGTKSFAVTVYDPDAPTGSGWWHWTVANIPATVTYLPADAGRRDGTKLPTGAVQGRNDF GYAGFGGACPPKGDKPHHYQFKVWALKTDKIPVDSNSSGALVGYMLNANKIATAEITPVY EIK >gi|296493291|gb|ADTK01000210.1| GENE 14 10240 - 10572 296 110 aa, chain - ## HITS:1 COG:ECs1614 KEGG:ns NR:ns ## COG: ECs1614 COG2076 # Protein_GI_number: 15830868 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane transporters of cations and cationic drugs # Organism: Escherichia coli O157:H7 # 1 110 1 110 110 186 98.0 8e-48 MNPYIYLGGAILAEVIGTTLMKFSEGFTRLWPSVGTIICYCASFWLLAQTLAYIPTGIAY AIWSGVGIVLISLLSWGFFGQRLDLPAIIGMMLICAGVLVINLLSRSAPH >gi|296493291|gb|ADTK01000210.1| GENE 15 10916 - 11230 424 104 aa, chain - ## HITS:1 COG:ECs2676 KEGG:ns NR:ns ## COG: ECs2676 COG1677 # Protein_GI_number: 15831930 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar hook-basal body protein # Organism: Escherichia coli O157:H7 # 1 104 1 104 104 143 99.0 8e-35 MSAIQGIEGVISQLQTTAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFT LGEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV >gi|296493291|gb|ADTK01000210.1| GENE 16 11445 - 13103 1432 552 aa, chain + ## HITS:1 COG:fliF KEGG:ns NR:ns ## COG: 
fliF COG1766 # Protein_GI_number: 16129885 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis/type III secretory pathway lipoprotein # Organism: Escherichia coli K12 # 1 552 1 552 552 953 98.0 0 MNPAAAQTKSLEWLNRLRANPKIPLIVTGSAAVAVMVALILWAKTPDYRTLFSNLSDQDG GAIVSQLTQMNIPYRFSEASGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGI SQFSEQVNYQRALEGELSRTIETIGPVKGARVHLAMPKPSLFVREQKSPSASVTVNLLPG RALDEGQISAIVHLVSSAVAGLPPGNVTLVDQGGHLLTQSNTSGRDLNDAQLKYASDVEG RIQRRIEAILSPIVGNGNIHAQVTAQLDFASKEQTEEQYRPNGDESHAALRSRQLNESEQ SGSGYPGGVPGALSNQPAPANNAPISTPPANQNNRQQQASTTSNSGPRSTQRNETSNYEV DRTIRHTKMNVGDVQRLSVAVVVNYKTLPDGKPLPLSNEQMKQIEALTREAMGFSEKRGD SLNVVNSPFNSSDESGGALPFWQQQAFIDQLLAAGRWLLVLLVAWLLWRKAVRPQLTRRA EAVKTVQQQALAREEVEDAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPRVVA LVIRQWINNDHE >gi|296493291|gb|ADTK01000210.1| GENE 17 13096 - 14091 1309 331 aa, chain + ## HITS:1 COG:ECs2678 KEGG:ns NR:ns ## COG: ECs2678 COG1536 # Protein_GI_number: 15831932 # Func_class: N Cell motility # Function: Flagellar motor switch protein # Organism: Escherichia coli O157:H7 # 1 331 1 331 331 551 100.0 1e-157 MSNLTGTDKSVILLMTIGEDRAAEVFKHLSQREVQTLSAAMANVTQISNKQLTDVLAEFE QEAEQFAALNINANDYLRSVLVKALGEERAASLLEDILETRDTASGIETLNFMEPQSAAD LIRDEHPQIIATILVHLKRAQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLN GLLDGQNLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENLV DVDDRSIQRLLQEVDSESLLIALKGAEQPLREKFLRNMSQRAADILRDDLANRGPVRLSQ VENEQKAILLIVRRLAETGEMVIGSGEDTYV >gi|296493291|gb|ADTK01000210.1| GENE 18 14084 - 14770 749 228 aa, chain + ## HITS:1 COG:ECs2679 KEGG:ns NR:ns ## COG: ECs2679 COG1317 # Protein_GI_number: 15831933 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis/type III secretory pathway protein # Organism: Escherichia coli O157:H7 # 1 228 8 235 235 360 99.0 1e-100 MSDNLPWKTWTPDDLAPPPAEFVPMVESEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 
AEGRQQGHEQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV >gi|296493291|gb|ADTK01000210.1| GENE 19 14770 - 16143 1505 457 aa, chain + ## HITS:1 COG:ZfliI KEGG:ns NR:ns ## COG: ZfliI COG1157 # Protein_GI_number: 15802376 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis/type III secretory pathway ATPase # Organism: Escherichia coli O157:H7 EDL933 # 13 457 52 496 496 824 99.0 0 MTTRLTRWLTTLDNFEAKMAQLPAVRRYGRLTRATGLVLEATGLQLPLGATCVIERQNGS ETHEVESEVVGFNGQRLFLMPLEEVEGVLPGARVYAKNISAEGLQSGKQLPLGPALLGRV LDGSGKPLDGLPSPDTTETGALITPPFNPLQRTPIEHVLDTGVRPINALLTVGRGQRMGL FAGSGVGKSVLLGMMARYTRADVIVVGLIGERGREVKDFIENILGAEGRARSVVIAAPAD VSPLLRMQGAAYATRIAEDFRDRGQHVLLIMDSLTRYAMAQREIALAIGEPPATKGYPPS VFAKLPALVERAGNGISGGGSITAFYTVLTEGDDQQDPIADSARAILDGHIVLSRRLAEA GHYPAIDIEASISRAITALISEQHYARVRTFKQLLSSFQRNRDLVSVGAYAKGSDPMLDK AIALWPQLEGYLQQGIFERADWEASLQGLERIFPTVS >gi|296493291|gb|ADTK01000210.1| GENE 20 16162 - 16605 494 147 aa, chain + ## HITS:1 COG:fliJ KEGG:ns NR:ns ## COG: fliJ COG2882 # Protein_GI_number: 16129889 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport; O Posttranslational modification, protein turnover, chaperones # Function: Flagellar biosynthesis chaperone # Organism: Escherichia coli K12 # 1 147 1 147 147 206 99.0 2e-53 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG MTSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST AALLAENRLDQKKMDEFAQRAAMRKPE >gi|296493291|gb|ADTK01000210.1| GENE 21 16602 - 17729 717 375 aa, chain + ## HITS:1 COG:fliK KEGG:ns NR:ns ## COG: fliK COG3144 # Protein_GI_number: 16129890 # Func_class: N Cell motility # Function: Flagellar hook-length control protein # Organism: Escherichia coli K12 # 1 375 1 375 375 511 94.0 1e-144 MIRIAPLITADVDNATLPGGKASDAAQDFLALLSEALTGEATTDKATPQLLVATDKPTAK 
GEPLVSDILSDAQQADLLIPVDETPPVINDEQSTLTPLTTAQTITLAAAADNNTAKDEKA DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSAQLTTAQPDDGP GTPAQPLTALVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW QQSLSQHISLFTRQGQQSAELRLQPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS LQGRVTGNSGVDIFA >gi|296493291|gb|ADTK01000210.1| GENE 22 17834 - 18298 475 154 aa, chain + ## HITS:1 COG:ECs2683 KEGG:ns NR:ns ## COG: ECs2683 COG1580 # Protein_GI_number: 15831937 # Func_class: N Cell motility # Function: Flagellar basal body-associated protein # Organism: Escherichia coli O157:H7 # 1 154 1 154 154 289 100.0 2e-78 MTDYAISKKSKRSLWIPILVFITLAACASAGYSYWHSHQVAADDKAQQRVVPSPVFYALD TFTVNLGDADRVLYIGITLRLKDEATRSRLSEYLPEVRSRLLLLFSRQDAAVLATEEGKK NLIAEIKTTLSTPLVAGQPKQDVTDVLYTAFILR >gi|296493291|gb|ADTK01000210.1| GENE 23 18303 - 19307 1063 334 aa, chain + ## HITS:1 COG:ECs2684 KEGG:ns NR:ns ## COG: ECs2684 COG1868 # Protein_GI_number: 15831938 # Func_class: N Cell motility # Function: Flagellar motor switch protein # Organism: Escherichia coli O157:H7 # 1 334 1 334 334 649 100.0 0 MGDSILSQAEIDALLNGDSEVKDEPTASVSGESDIRPYDPNTQRRVVRERLQALEIINER FARHFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSP SLVFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYV RSEMQVKFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENSR NEDQNWRDNLVRQVQHSQLELVANFADISLRLSQILKLKPGDVLPIEKPDRIIAHVDGVP VLTSQYGTLNGQYALRIEHLINPILNSLNEEQPK >gi|296493291|gb|ADTK01000210.1| GENE 24 19304 - 19717 530 137 aa, chain + ## HITS:1 COG:ECs2685 KEGG:ns NR:ns ## COG: ECs2685 COG1886 # Protein_GI_number: 15831939 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar motor switch/type III secretory pathway protein # Organism: Escherichia coli O157:H7 # 1 137 1 137 137 245 100.0 1e-65 MSDMNNPADDNNGAMDDLWAEALSEQKSTSSKSAADAVFQQFGGGDVSGTLQDIDLIMDI PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV RITDIITPSERMRRLSR 
>gi|296493291|gb|ADTK01000210.1| GENE 25 19720 - 20085 264 121 aa, chain + ## HITS:1 COG:STM1978 KEGG:ns NR:ns ## COG: STM1978 COG3190 # Protein_GI_number: 16765316 # Func_class: N Cell motility # Function: Flagellar biogenesis protein # Organism: Salmonella typhimurium LT2 # 1 121 2 125 125 134 66.0 5e-32 MNNHATVQSSAPVSAAPLLQVSGALIAIIALILAAAWLVKRLGFAPKRTGVNGLKISASA SLGARERVVVVDVEDARLVLGVTAGQINLLHKLPPSAPTEEIPQTDFQSVMKNLLKRSGR S >gi|296493291|gb|ADTK01000210.1| GENE 26 20085 - 20822 735 245 aa, chain + ## HITS:1 COG:ECs2687 KEGG:ns NR:ns ## COG: ECs2687 COG1338 # Protein_GI_number: 15831941 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FliP # Organism: Escherichia coli O157:H7 # 1 245 1 245 245 392 99.0 1e-109 MRRLLSVAPVLLWLVTPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA QSFYS >gi|296493291|gb|ADTK01000210.1| GENE 27 20832 - 21101 413 89 aa, chain + ## HITS:1 COG:STM1980 KEGG:ns NR:ns ## COG: STM1980 COG1987 # Protein_GI_number: 16765318 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FliQ # Organism: Salmonella typhimurium LT2 # 1 89 1 89 89 98 95.0 3e-21 MTPESVMMMGTEAMKVALALAAPLLLVALVTGLIISILQAATQINEMTLSFIPKIIAVFI AIIIAGPWMLNLLLDYVRTLFTNLPYIIG >gi|296493291|gb|ADTK01000210.1| GENE 28 21110 - 21895 689 261 aa, chain + ## HITS:1 COG:fliR KEGG:ns NR:ns ## COG: fliR COG1684 # Protein_GI_number: 16129897 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FliR # Organism: Escherichia coli K12 # 1 261 1 261 261 362 98.0 1e-100 MMQVTSDQWLSWLSLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 
NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL NMPVLARIMDMLALLLFLTFNGHLWLISLLADTFHTLPIGGEPLNSNAFLALTKAGSLIF LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC EHLFSEMFNLLADIISELPLI >gi|296493291|gb|ADTK01000210.1| GENE 29 22185 - 22808 429 207 aa, chain + ## HITS:1 COG:ECs2690 KEGG:ns NR:ns ## COG: ECs2690 COG2771 # Protein_GI_number: 15831944 # Func_class: K Transcription # Function: DNA-binding HTH domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 207 1 207 207 382 99.0 1e-106 MSTIIMDLCSYTRLGLTGYLLSRGVKKREINDIETVDDLAIACDSQRPSVVFINEDCFIH DASNSQRIKLIINQHPNTLFIVFMAIANVHFDEYLLVRKNLLISSKSIKPESLDDILGDI LKKETTITSFLNMPTLSLNRTESSMLRMWMAGQGTIQISDQMNIKAKTVSSHKGNIKRKI KTHNKQVIYHVVRLTDNVTNGIFVNMR >gi|296493291|gb|ADTK01000210.1| GENE 30 22852 - 23040 340 62 aa, chain - ## HITS:1 COG:no KEGG:LF82_0529 NR:ns ## KEGG: LF82_0529 # Name: dsrB # Def: protein DsrB # Organism: E.coli_LF82 # Pathway: not_defined # 1 62 1 62 62 124 100.0 1e-27 MKVNDRVTVKTDGGPRRPGVVLAVEEFSEGTMYLVSLEDYPLGIWFFNEAGHQDGIFVEK AE >gi|296493291|gb|ADTK01000210.1| GENE 31 23203 - 23430 279 75 aa, chain + ## HITS:1 COG:no KEGG:G2583_2404 NR:ns ## KEGG: G2583_2404 # Name: yodD # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 75 6 80 80 112 100.0 4e-24 MKTAKEYSDTAKREVSVDVDALLAAINEISESEVHRSQNDSEHVSVDGREYHTWRELADA FELDIHDFSVSEVNR Prediction of potential genes in microbial genomes Time: Mon May 16 15:41:54 2011 Seq name: gi|296493290|gb|ADTK01000211.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont544.19, whole genome shotgun sequence Length of sequence - 7685 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 5, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 199 - 258 2.3 1 1 Tu 1 . 
+ CDS 281 - 1096 805 ## COG3769 Predicted hydrolase (HAD superfamily) - Term 884 - 920 3.7 2 2 Tu 1 2/1.000 - CDS 1093 - 2769 1338 ## COG2199 FOG: GGDEF domain - Prom 2810 - 2869 4.5 - Term 2913 - 2943 0.3 3 3 Op 1 3/1.000 - CDS 3025 - 3207 219 ## COG5475 Uncharacterized small protein 4 3 Op 2 . - CDS 3286 - 4197 1152 ## COG2354 Uncharacterized protein conserved in bacteria - Prom 4299 - 4358 4.9 + Prom 4291 - 4350 5.3 5 4 Tu 1 . + CDS 4376 - 5296 949 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 6 5 Op 1 6/0.000 - CDS 5285 - 5755 364 ## COG3727 DNA G:T-mismatch repair endonuclease 7 5 Op 2 4/0.667 - CDS 5736 - 7154 1345 ## COG0270 Site-specific DNA methylase 8 5 Op 3 . - CDS 7221 - 7649 274 ## COG1418 Predicted HD superfamily hydrolase Predicted protein(s) >gi|296493290|gb|ADTK01000211.1| GENE 1 281 - 1096 805 271 aa, chain + ## HITS:1 COG:ECs2693 KEGG:ns NR:ns ## COG: ECs2693 COG3769 # Protein_GI_number: 15831947 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Escherichia coli O157:H7 # 1 270 1 270 271 530 98.0 1e-150 MLSIQQPLLVFSDLDGTLLDSHSYDWQPAAPWLSRLREANVPVILCSSKTSAEMLYLQKT LGLQGLPLIAENGAVIQLAEQWQDIDGFPRIISGISHGEISQVLNTLREKEHFKFTTFDD VDDATIAEWTGLSRSQAALTQLHEASVTLIWRDSDERMAQFTARLNELGLQFMQGARFWH VLDASAGKDQAANWIIATYQQSSGKRPTTLGLGDGPNDAPLLEVMDYAVIVKGLNREGVH LHDEDPARVWRTQREGPEGWREGLDHFFSAH >gi|296493290|gb|ADTK01000211.1| GENE 2 1093 - 2769 1338 558 aa, chain - ## HITS:1 COG:yedQ_2 KEGG:ns NR:ns ## COG: yedQ_2 COG2199 # Protein_GI_number: 16129902 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 380 558 1 179 179 355 99.0 1e-97 MENQSWLKKLARRLGPGHVVNLCFIVVLLFSTLLTWREVVVLEDAYISSQRNHLENVANA LDKHLQYNVDKLIFLRNGMREALVAPLDFTSLRNAVTEFEQHRDEHAWQIELNRRRTLPV NGVSDALVSEGNLLSRENESLDNEITAALEVGYLLRLAHNSSSMVEQAMYVSRAGFYVST QPTLFTRNVPTRYYGYVTQPWFVGHSQRENRHRAVRWFTSQPEHASNTEPQVTVSVPVDS 
ENYWYGVLGMSIPVRTMQQFLRNAIDKNLDGEYQLYDSKLRLLTSSNPDHPTGNIFDPRE LALLAQAMEHDTRGCIRMNSRYVSWERLDHFDGVLVRVHTLSEGVRGDFGSISIALTLLW ALFTTMLLISWYVIRRMVSNMYVLQSSLQWQAWHDTLTRLYNRGALFEKARPLAKLCQTH QHPFSVIQVDLDHFKAINDRFGHQAGDRVLSHAAGLISSSLRAQDVAGRVGGEEFCVILP GASLTQAAEVAERIRLKLNEKEMLIAKSTTIRISASLGVSSSEETGDYDFEQLQSLADRR LYLAKQAGRNRVFASDNA >gi|296493290|gb|ADTK01000211.1| GENE 3 3025 - 3207 219 60 aa, chain - ## HITS:1 COG:ECs2695 KEGG:ns NR:ns ## COG: ECs2695 COG5475 # Protein_GI_number: 15831949 # Func_class: S Function unknown # Function: Uncharacterized small protein # Organism: Escherichia coli O157:H7 # 1 60 1 60 60 102 100.0 2e-22 MSFMVSEEVTVKEGGPRMIVTGYSSGMVECRWYDGYGVKREAFHETELVPGEGSRSAEEV >gi|296493290|gb|ADTK01000211.1| GENE 4 3286 - 4197 1152 303 aa, chain - ## HITS:1 COG:yedI KEGG:ns NR:ns ## COG: yedI COG2354 # Protein_GI_number: 16129904 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 303 3 305 305 500 99.0 1e-141 MAGSSLLTLLDDIATLLDDISVMGKLAAKKTAGVLGDDLSLNAQQVSGVRANRELPVVWG VAKGSLINKVILVPLALIISAFIPWAITPLLMIGGAFLCFEGVEKVLHMLEARKHKEDPA QSQQRLEKLAAQDPLKFEKDKIKGAIRTDFILSAEIVAITLGIVAEAPLLNQVLVLSGIA LVVTVGVYGLVGVIVKIDDLGYWLAEKSSALMQALGKGLLIIAPWLMKALSIVGTLAMFL VGGGIVVHGIAPLHHAIEHFAGQQSAVVAMILPTVLNLILGFIIGGIVVLGVKAVAKIRG QAH >gi|296493290|gb|ADTK01000211.1| GENE 5 4376 - 5296 949 306 aa, chain + ## HITS:1 COG:ECs2697 KEGG:ns NR:ns ## COG: ECs2697 COG0697 # Protein_GI_number: 15831951 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli O157:H7 # 1 306 1 306 306 446 99.0 1e-125 MRFRQLLPLFGALFALYIIWGSTYFVIRIGVESWPPLMMAGVRFLAAGILLLAFLLLRGH KLPPLRPLLNAALIGLLLLAVGNGMVTVAEHQNVPSGIAAVVVATVPLFTLCFSRLFGIK TRKLEWVGIAIGLTGIIMLNSGGNLSGNPWGAILILIGSISWAFGSVYGSRITLPVGMMA GAIEMLAAGVVLMIASMIAGEKLTALPSLSGFLAVGYLALFGSIIAINAYMYLIRNVSPA 
LATSYAYVNPVVAVLLGTGLGGETLSKIEWLALGVIVFAVVLVTLGKYLFPAKPVVAPVI PDASSE >gi|296493290|gb|ADTK01000211.1| GENE 6 5285 - 5755 364 156 aa, chain - ## HITS:1 COG:vsr KEGG:ns NR:ns ## COG: vsr COG3727 # Protein_GI_number: 16129906 # Func_class: L Replication, recombination and repair # Function: DNA G:T-mismatch repair endonuclease # Organism: Escherichia coli K12 # 1 156 1 156 156 303 95.0 8e-83 MVDVHDKATRSKNMRAIATRDTAIEKRLASLLTGQGLAFRVQDASLPGRPDFVLDEFRCV IFAHGCFWHHHHCYLFKVPATRTDFWLEKIGKNVERDRRDISHLQELGWRVLIVWECALR GREKLTDAALTERLEEWICGEGASAQIDTQGIHLLA >gi|296493290|gb|ADTK01000211.1| GENE 7 5736 - 7154 1345 472 aa, chain - ## HITS:1 COG:ECs2699 KEGG:ns NR:ns ## COG: ECs2699 COG0270 # Protein_GI_number: 15831953 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Escherichia coli O157:H7 # 1 472 1 472 472 957 99.0 0 MQENISVTDSYSTGNAAQAMLEKLLQIYDVKTLVAQLNGVGENHWSAAILKRALANDSAW HRLSEKEFAHLQTLLPKPPAHHPHYAFRFIDLFAGIGGIRRGFESIGGQCVFTSEWNKHA VRTYKANHYCDPATHHFNEDIRDITLSHQEGVSDEAAAEHIRQHIPEHDVLLAGFPCQPF SLAGVSKKNSLGRAHGFACDTQGTLFFDVVRIIDARRPAMFVLENVKNLKSHDQGKTFRI IMQTLDELGYDVADAEDNGPDDPKIIDGKHFLPQHRERIVLVGFRRDLNLKADFTLRDIS KCFPAQRVTLAQLLDPMVEAKYILTPVLWKYLYRYAKKHQVRGNGFGYGMVYPNNPQSVT RTLSARYYKDGAEILIDRGWDMATGEKDFDDPLNQQHRPRRLTPRECARLMGFEAPGEAK FRIPVSDTQAYRQFGNSVVVPVFAAVAKLLEPKIKQAVALRQQEAQHGRRSR >gi|296493290|gb|ADTK01000211.1| GENE 8 7221 - 7649 274 142 aa, chain - ## HITS:1 COG:yedJ KEGG:ns NR:ns ## COG: yedJ COG1418 # Protein_GI_number: 16129908 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Escherichia coli K12 # 2 142 91 231 231 252 96.0 1e-67 MQFPAEKIEAVCHAIAAHSFSAQIAPLTTEAKIVQDADRLEALGAIGLARVFAVSGALGL ALFDGEDPFAQHRPLDDKRYALDHFQTKLLKLPQTMQTARGKQLAQHNAQFLVEFMAKLG AELAGENEGIDHKVIDAFSPAG Prediction of potential genes in microbial genomes Time: Mon May 16 15:41:55 2011 Seq name: gi|296493289|gb|ADTK01000212.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont553.1, whole 
genome shotgun sequence Length of sequence - 840 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 85 - 360 174 ## ECUMN_1822 putative minor tail protein T of putative prophage 2 1 Op 2 . + CDS 332 - 839 479 ## COG5281 Phage-related minor tail protein Predicted protein(s) >gi|296493289|gb|ADTK01000212.1| GENE 1 85 - 360 174 91 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_1822 NR:ns ## KEGG: ECUMN_1822 # Name: not_defined # Def: putative minor tail protein T of putative prophage # Organism: E.coli_UMN026 # Pathway: not_defined # 1 91 19 109 109 173 100.0 2e-42 MLSEMSATELGEWGDYFRMQSFSDVWMDAQFASLKALIVRMVSGSSDAAVADFSLLPEEN GIPERTDEELMHLGEGISGGVRYGPDSQPGH >gi|296493289|gb|ADTK01000212.1| GENE 2 332 - 839 479 169 aa, chain + ## HITS:1 COG:Z0975 KEGG:ns NR:ns ## COG: Z0975 COG5281 # Protein_GI_number: 15800511 # Func_class: S Function unknown # Function: Phage-related minor tail protein # Organism: Escherichia coli O157:H7 EDL933 # 1 169 1 169 1021 190 96.0 9e-49 MDQIANLVIDLGIDAAEFKNEIPRIKNLLNGAASDAERSSARMQRFMERQTQAARQTTQA ASSAATAASVHAQTVEKNAQAHERMAREVEKTRQRMEALSQKMREEQAQAMDLAEAQDKA AAAFYRQIDSVKQASAGLQELQRIQQQIRQSRNSGGIGQQDYLALISEV Prediction of potential genes in microbial genomes Time: Mon May 16 15:42:05 2011 Seq name: gi|296493288|gb|ADTK01000213.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont565.1, whole genome shotgun sequence Length of sequence - 28096 bp Number of predicted genes - 30, with homology - 30 Number of transcription units - 20, operones - 9 average op.length - 2.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 155 - 214 1.9 1 1 Op 1 . + CDS 242 - 808 353 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 2 1 Op 2 . + CDS 815 - 1003 60 ## EcSMS35_A0151 hypothetical protein - Term 915 - 950 6.1 3 2 Tu 1 . - CDS 1035 - 1712 340 ## COG1309 Transcriptional regulator + Prom 1665 - 1724 3.1 4 3 Tu 1 . 
+ CDS 1791 - 2990 458 ## COG0477 Permeases of the major facilitator superfamily 5 4 Tu 1 . - CDS 3022 - 3906 414 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 6 5 Tu 1 . - CDS 4044 - 4208 61 ## EcSMS35_A0155 hypothetical protein 7 6 Tu 1 . + CDS 4485 - 6191 1424 ## COG4644 Transposase and inactivated derivatives, TnpA family 8 7 Tu 1 . - CDS 6246 - 6569 264 ## COG0640 Predicted transcriptional regulators - Prom 6592 - 6651 4.8 + Prom 6591 - 6650 5.6 9 8 Tu 1 . + CDS 6674 - 7807 208 ## COG0701 Predicted permeases + Term 7898 - 7935 1.0 - Term 7890 - 7920 4.3 10 9 Tu 1 . - CDS 8100 - 8717 191 ## EcSMS35_A0146 hypothetical protein - Prom 8809 - 8868 4.6 11 10 Tu 1 . - CDS 8908 - 10464 320 ## EcSMS35_A0145 hypothetical protein - Prom 10585 - 10644 4.7 12 11 Op 1 . - CDS 10727 - 11317 467 ## EcSMS35_A0144 hypothetical protein 13 11 Op 2 . - CDS 11317 - 11574 206 ## EcSMS35_A0143 hypothetical protein - Prom 11694 - 11753 9.2 + Prom 11675 - 11734 6.0 14 12 Tu 1 . + CDS 11928 - 14066 444 ## APECO1_O1CoBM110 hypothetical protein + Term 14131 - 14169 -1.0 - Term 14158 - 14187 2.8 15 13 Op 1 . - CDS 14228 - 14629 383 ## COG1487 Predicted nucleic acid-binding protein, contains PIN domain 16 13 Op 2 . - CDS 14641 - 14871 210 ## EcSMS35_A0140 plasmid maintenance protein VagC - Prom 14941 - 15000 4.4 + Prom 14962 - 15021 3.7 17 14 Op 1 . + CDS 15167 - 15457 359 ## EcSMS35_A0139 hypothetical protein 18 14 Op 2 . + CDS 15447 - 16346 383 ## COG4271 Predicted nucleotide-binding protein containing TIR -like domain + Term 16370 - 16398 2.1 - Term 16356 - 16384 2.1 19 15 Op 1 . - CDS 16396 - 18621 781 ## COG4928 Predicted P-loop ATPase 20 15 Op 2 . - CDS 18623 - 19711 852 ## KPN_pKPN5p08213 plasmid F RepC-like protein, plasmid replication - Prom 19736 - 19795 4.0 + Prom 20097 - 20156 4.4 21 16 Op 1 . + CDS 20263 - 20676 413 ## SeHA_A0030 hypothetical protein 22 16 Op 2 . 
+ CDS 20678 - 21457 350 ## COG0582 Integrase + Term 21564 - 21601 -0.8 + Prom 21499 - 21558 4.2 23 17 Op 1 . + CDS 21636 - 22280 506 ## SeHA_A0032 ParA protein homolog + Prom 22282 - 22341 2.2 24 17 Op 2 . + CDS 22367 - 22675 197 ## EcE24377A_F0053 putative 60 kDa chaperonin + Term 22677 - 22719 6.1 + Prom 22853 - 22912 6.3 25 18 Op 1 . + CDS 23035 - 24069 739 ## SeHA_A0034 plasmid segregation protein ParM 26 18 Op 2 . + CDS 24062 - 24478 94 ## SeHA_A0035 plasmid stability protein 27 19 Op 1 4/0.400 - CDS 24480 - 25754 561 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair 28 19 Op 2 . - CDS 25754 - 26191 192 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) 29 19 Op 3 . - CDS 26188 - 26436 369 ## SeHA_A0038 protein ImpC - Prom 26471 - 26530 8.7 30 20 Tu 1 . + CDS 26830 - 27756 620 ## SeHA_A0041 hypothetical protein + Term 27766 - 27803 8.5 Predicted protein(s) >gi|296493288|gb|ADTK01000213.1| GENE 1 242 - 808 353 188 aa, chain + ## HITS:1 COG:mlr9265 KEGG:ns NR:ns ## COG: mlr9265 COG1961 # Protein_GI_number: 13488272 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Mesorhizobium loti # 1 188 14 201 206 224 69.0 5e-59 MALIGYARVSTAEQDTALQTDALRKAGCERVFEDTASGAKADRPGLADALAYLRDGDVLA VWRLDRLGRSMPHLIETIGALEARGVGFRSLTEAIDTTTPGGRLIFHVFGALGQFERDLI RERTKAGLTAAAARGRKGGRKPVVTADKLQRAREHIANGLNVREAATRLKVSKTALYTAL QSTSAADS >gi|296493288|gb|ADTK01000213.1| GENE 2 815 - 1003 60 62 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_A0151 NR:ns ## KEGG: EcSMS35_A0151 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 9 62 27 80 80 87 98.0 2e-16 MQSSSENDKRTPQQERQKAAREAERGREAWTLGQGMKKPVAGCYGRLTRWKGGGDVVYMA LL >gi|296493288|gb|ADTK01000213.1| GENE 3 1035 - 1712 340 225 aa, chain - ## HITS:1 COG:AGl1301 KEGG:ns NR:ns ## COG: AGl1301 COG1309 # Protein_GI_number: 15890777 # Func_class: K Transcription # Function: 
Transcriptional regulator # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 16 210 52 254 261 161 43.0 1e-39 MFISDKVSSMTKLQPNTVIRAALDLLNEVGVDGLTTRKLAERLGVQQPALYWHFRNKRAL LDALAEAMLAENHTHSVPRADDDWRSFLIGNARSFRQALLAYRDGARIHAGTRPGAPQME TADAQLRFLCEAGFSAGDAVNALMTISYFTVGAVLEEQAGDSDAGERGGTVEQAPLSPLL RAAIDAFDEAGPDAAFEQGLAVIVDGLAKRRLVVRNVEGPRKGDD >gi|296493288|gb|ADTK01000213.1| GENE 4 1791 - 2990 458 399 aa, chain + ## HITS:1 COG:AGl1300 KEGG:ns NR:ns ## COG: AGl1300 COG0477 # Protein_GI_number: 15890776 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 371 2 369 394 300 46.0 3e-81 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA PVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYI ADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFL LPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDR FHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRG WMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYA ASITTWNGWAWIAGAALYLLCLPALRRGLWSGAGQRADR >gi|296493288|gb|ADTK01000213.1| GENE 5 3022 - 3906 414 294 aa, chain - ## HITS:1 COG:AGc468 KEGG:ns NR:ns ## COG: AGc468 COG0697 # Protein_GI_number: 15887623 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 7 280 10 283 287 290 65.0 2e-78 MSLRTPDLLFTAIAPAIWGSTYIVTTQYLPNFSPMTVAMLRALPAGLLLVMIVRQIPTGI WWMRIFILGALNISLFWSLLFISVYRLPGGVAATVGAVQPLMVVFISAALLGSPIRLMAV LGAICGTAGVALLVLTPNAALDPVGVAAGLAGAGSMAFGTVLTRKWQPPVPLLTFTAWQL AAGGLLLVPVALVFDPPIPMPTGTNVLGLAWLGLIGAGLTYFLWFRGISRLEPTVVSLLG FLSPGTAVLLGWLFLDQTLSALQIIGVLLVIGSIWLGQRSNRTPRARIACRKSP 
>gi|296493288|gb|ADTK01000213.1| GENE 6 4044 - 4208 61 54 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_A0155 NR:ns ## KEGG: EcSMS35_A0155 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 54 82 135 135 110 100.0 1e-23 MAGISTTGVVLSSVAWASDADYDVRLVQDCCYDPDRDAHEALLRSGFGGRVQVV >gi|296493288|gb|ADTK01000213.1| GENE 7 4485 - 6191 1424 568 aa, chain + ## HITS:1 COG:CAP0093 KEGG:ns NR:ns ## COG: CAP0093 COG4644 # Protein_GI_number: 15004797 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives, TnpA family # Organism: Clostridium acetobutylicum # 1 563 108 669 673 468 41.0 1e-131 MNADNLRKVPADAPTAFIKPRWKPLVITPEGLDRKFYEICALSELKNALRSGDIWVKGSR QFRDFDDYLLPAEKFAALKREQALPLAINPNSDQYLEERLQLLDEQLATVTRLAKDNELP DAILTESGLKITPLDAAVPDRAQALIDQTSQLLPRIKITELLMDVDDWTGFSRHFTHLKD GAEAKDRTLLLSAILGDAINLGLTKMAESSPGLTYAKLSWLQAWHIRDETYSAALAELVN HQYRHAFAAHWGDGTTSSSDGQRFRAGGRGESTGHVNPKYGSEPGRLFYTHISDQYAPFS TRVVNVGVRDSTYVLDGLLYHESDLRIEEHYTDTAGFTDHVFALMHLLGFRFAPRIRDLG ETKLYVPQGVQAYPTLRPLIGGTLNIKHVRAHWDDILRLASSIKQGTVTASLMLRKLGSY PRQNGLAVALRELGRIERTLFILDWLQSVELRRRVHAGLNKGEARNSLARAVFFNRLGEI RDRSFEQQRYRASGLNLVTAAIVLWNTVYLERATQGLVEAGKPVDGELLQFLSPLGWEHI NLTGDYVWRQSRRLEDGKFRPLRMPGKP >gi|296493288|gb|ADTK01000213.1| GENE 8 6246 - 6569 264 107 aa, chain - ## HITS:1 COG:mll2166 KEGG:ns NR:ns ## COG: mll2166 COG0640 # Protein_GI_number: 13472009 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Mesorhizobium loti # 1 100 1 100 106 103 49.0 1e-22 MQLEEVAKALKELGHPTRLFIFKHLVKAGEQGLPVGELQKQLGIPSSTLSHHISALVSVG LVTQNRESRTLMCVSQYEILEAIIEFLREECCVNSKTDVAEPAGKNG >gi|296493288|gb|ADTK01000213.1| GENE 9 6674 - 7807 208 377 aa, chain + ## HITS:1 COG:Cj1560 KEGG:ns NR:ns ## COG: Cj1560 COG0701 # Protein_GI_number: 15792865 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Campylobacter jejuni # 47 376 1 274 274 147 31.0 4e-35 
MNSWIPMLQDAAEMFVFLAVELSLLFIVISAGVSLIRQKVPDHKIQQMMGARKGKGYLLA SLLGAVTPFCSCSTIPMLRGLLSAKAGFGPTLTFLFVSPLLNPIIVGLMWVTFGWKVTLL YAIIAAGVSVLSSIILDYLGFERHIVEYKNSVSGSCATKCGDSEASVKTSAVSSCCGAGL ALASVKTNCCTSSAKTIINLKTVKKEQNISACCPSILSEKSSESCCSSESQGNRNLTMNA TSGLIKLAMKDALQQFKDVLPYLLLSVLIGSFIYGFIPSEWIAAHAGADNPLAIPLSAVV GIPLYIRAEAVIPLASVLMTKGMGLGALMALIIGSAGASLTEVILLKSMFRMPMIIAFLT VILGMAILMGYLTQFLF >gi|296493288|gb|ADTK01000213.1| GENE 10 8100 - 8717 191 205 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_A0146 NR:ns ## KEGG: EcSMS35_A0146 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 205 1 205 205 375 100.0 1e-103 MGKYQEEKKTPIIMVKKRRTLLPPSLSEKTGVIARSETEKMEESVSEGISSPAVDTHTPD APACKKKKKRRRFPRQPHWTHEFTHDCVEKVKSLFPHLRAEGGGFLPLKIGITNDFAAFL TEHPETELTLDEWSCAISCITTRQVYLQRTAVAGIPRYGLDGLPAGQVSECDALNARRWL AVREQQKLKMKTMQEQSASTEKTER >gi|296493288|gb|ADTK01000213.1| GENE 11 8908 - 10464 320 518 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_A0145 NR:ns ## KEGG: EcSMS35_A0145 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 518 1 518 518 996 100.0 0 MAEKNKKTITGQVLNSIKINKLKCINGLNEIIFKPHALTAILGPNGSGKSTILHAIASIY MPEEGFPGEDHRLMHFFPRSPHAEWNGSDFIVNLTYRKDGVMIENELKNYGKADIRGSRW IQIYARRPLREVYYLGIDKCVPIIESEKKNNIQYETSSVSNDLITNILHYASYILNKPYT SFNQHQQPNGKILIGVESGGLAYSSLSMSAGEQKIFLILETILKADKNALILIDELDLLL HDEALKKLIEVISSHAKDKNKQIIFTTHREMITTLSDKINIRHVVNIQGRSYSFEETKPD AINRLTGESTTPIEIYVEDDLAVAIINKICSSLKASKYVKIFKFGAASNAFTLLASTLIR GDNLSGKLYILDGDKYSTENEKKTALDKVFTGTESRTYELKAAAEGKVKQFNLPNGVKPE QYIHYLITNVPLDGLGGEYLEIIEAARDIRVELDAHNYISNILTKLGIDRPSGLTRVMDL ASRHPEWHQYVSEVTDWLQPVVSDLMERLPENDTVDIT >gi|296493288|gb|ADTK01000213.1| GENE 12 10727 - 11317 467 196 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_A0144 NR:ns ## KEGG: EcSMS35_A0144 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 196 1 196 196 347 100.0 1e-94 MTQSRRPSPLQRRVLIVLAALDEKRPGPVLTRDLERVLERSGEAPVYGPNLRASCRRLED 
AGWLRTLRAPNLQLAVELTDAGRAVAQPLLLAEQDRLRAEQRAAEVVVLPLVPAAGLPAD GTSATDLAVQLNGITYQACRGDFVVRLDGSTCLQLWNKEGRVVRLEGDPLEVAQWLQACH DAGMEVRVQVNESVTP >gi|296493288|gb|ADTK01000213.1| GENE 13 11317 - 11574 206 85 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_A0143 NR:ns ## KEGG: EcSMS35_A0143 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 85 1 85 85 163 100.0 2e-39 MDEERVFSLSYEQLTRFAEKRIRECNLDSQGAIYLCESAKAGAVLIFWHELAINGYASMN AIKRQELIDADFQRLRNLIWPEDDR >gi|296493288|gb|ADTK01000213.1| GENE 14 11928 - 14066 444 712 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM110 NR:ns ## KEGG: APECO1_O1CoBM110 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 712 1 712 712 1421 99.0 0 MDIEIRHCNNIVRAHITLTADKLNIKFAPNGTGKSTLSRAISCAARDDIQGLQALMPFRL RGENPDSTGPIVIGADGIGDVMCFNEEYVSQFTFQPDELISDSFNILIRNQAHAEREREI EEMTQKIRAVFTDHTELNSLIDHLQELSNAFRSTSSGISRSSTGMRGLSGGNKIHHIPAG LENYQPYIRSERRVEWIDWQTKGLEFSPLSDGCCPFCTGDITGKEAQIRQVREEYDKSTI KNLTAIIRLVENLGNYLTESARERLLAITMLQNGPEAEHIEYLVALKRQTDTLTEKLTAL RGLNVFSLQEQQNVREVLTARLIDLQFFPDLQSELMQGITDRLNAALMDLINLAGPLQGK INRHRDSMIRLIAQHKTNINNFLTYAGYKYRVDIAGEGEQRKLRLRHIDFNGYVSGGSQH LSYGERNAFAIVLFMYECLSKNPGLIILDDPISSFDKNKKFAILEMLFRRASGECLKNRT VLMLTHDVEPVIDTLKSVRRLFSNQVTASCLRLSAGVIEELPVNDGDIMTFMQICKSITA SADCEEIIKLIYLRRYFEIVDERGDAYQLLSNLFHRRVVPLDYREPAAAGSGYPKMAPEK IQQALRDIREYVDSFDYPRLQALVSSPDEIKNLYRRCRNGYEKLQVFRLLELDQDHPVIR KFVNETYHIENEFICQLDPSRFDLIPEYVIMECDKLIALPPAANQSSVARIA >gi|296493288|gb|ADTK01000213.1| GENE 15 14228 - 14629 383 133 aa, chain - ## HITS:1 COG:STM3033 KEGG:ns NR:ns ## COG: STM3033 COG1487 # Protein_GI_number: 16766335 # Func_class: R General function prediction only # Function: Predicted nucleic acid-binding protein, contains PIN domain # Organism: Salmonella typhimurium LT2 # 1 131 5 131 132 103 41.0 5e-23 MLDTCICSFIMREQPEAVLKRLEQAVLRGHRIVVSAITYSEMRFGATGPKASPRHVQLVD EFCARLDAILPWDRAAVDATTKIKVALRLAGTPIGPNDTAIAGHAIAAGAILVTNNTREF ERVPDLVLEDWVK 
>gi|296493288|gb|ADTK01000213.1| GENE 16 14641 - 14871 210 76 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_A0140 NR:ns ## KEGG: EcSMS35_A0140 # Name: vagC # Def: plasmid maintenance protein VagC # Organism: E.coli_SECEC # Pathway: not_defined # 1 76 1 76 76 139 100.0 2e-32 MRTVSIFKNGNNRAIRLPRDLDFEGVSELEIVREGDSIILRPVRPTWGSFLEYEKADPDF MAEREDVVSDEGRFNL >gi|296493288|gb|ADTK01000213.1| GENE 17 15167 - 15457 359 96 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_A0139 NR:ns ## KEGG: EcSMS35_A0139 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 96 1 96 96 160 100.0 1e-38 MSVKITGLDKMQKQLKEVERATEALNGSYDVHFDANDPVSIENAIQEAYSMVYERASGYA TNPMVSPLIEHMKENLRQQILDRAEQQRQESGQDGN >gi|296493288|gb|ADTK01000213.1| GENE 18 15447 - 16346 383 299 aa, chain + ## HITS:1 COG:RSc2614 KEGG:ns NR:ns ## COG: RSc2614 COG4271 # Protein_GI_number: 17547333 # Func_class: K Transcription # Function: Predicted nucleotide-binding protein containing TIR -like domain # Organism: Ralstonia solanacearum # 145 267 77 202 233 100 43.0 5e-21 MAIDIFQSLKNCVFDLQRADVQNYQQPLKQLARLLNSENLQSVNAHLTRNVELDTFLARS EDTESSMAGSAVLQWPDEPADILGLKLLLIEKMADDNNFSFNFCHTFFYDRNIIESIRKF TSSLVAPFVRDYQLYVENQHDPEPAVFRPVSRKIFIVHGHDNDALQSVARFISRIGLEEI ILSERPDGSRTIIEKFEAESGDVSFAIVLMTPDDSGSALASESTRLRARQNVLYELGYFA GKLGRGKVLVLRKGDIEIPSDLAGVLYTELDEHGGWKRKLLSELAYAGVPFDKDKALSA >gi|296493288|gb|ADTK01000213.1| GENE 19 16396 - 18621 781 741 aa, chain - ## HITS:1 COG:all7133 KEGG:ns NR:ns ## COG: all7133 COG4928 # Protein_GI_number: 17233149 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase # Organism: Nostoc sp. 
PCC 7120 # 14 680 21 647 706 137 24.0 1e-31 MKILRQLWNQKGLDAAVEDVPEDRYGFGNIAENISRSILTLPLEASNVVGIEGAWGSGKT SLLNLILRNLALKKDAHTHVLHISPWLSGGSPVEALFLPVATVIQQEMEIRYPPKGFKKL WRKYLLSPEAQKVIEYAQDTSSRVLPLVQYIGQFSSIINWIAGGIKVFSDSRLAVDQKTT TKLRAEIAGQLVSLDLKFIVVMDDLDRLEPSQVAEVFRLVRAVADLPRFTHILCYDRQII THAVEHALNIEDGSRYLQKIIQLSFKLPRPEAFDLRNEFRQRAEALYQQINNQPPDSGMV RDLIAVTDTYGAALSTPREIHQAINSLIFLYPGMRDFVYFPDLCLLQLIRVTNPALYDWT EHYLTERSVIETGQGMLSDGEKADFREGLIRCMKTFRASNADSFLTLADWIPGISGHNDE YLNLFEPVSEDFRHIQTTGKRLSSLTHWRYYFAFSSPQNVLPPEFFRQLFEQAGVSEKQQ QLSELLLSKINSVGSLSGTWFEHILSRLTPGLIRERNFEECAGLVHFFFDHTDEVSTRFS IRNPWFSLREMAINEVVRHLLKHMQDIDETRTITLMEKLIVTGASPFWIADFMRDLIWEH GLAQNAVPSPSDALFSRDITERLRDRFAERMNQPELQQQLLLRKSLLGYLYAWRDMSSGE TVKQWVREVTTTDEGLVNLLIRLQTSVFSSHRGAYRRIARDQVSPFFDDWPAVEEKLKVM LSGNELTPEQEALKTALENDD >gi|296493288|gb|ADTK01000213.1| GENE 20 18623 - 19711 852 362 aa, chain - ## HITS:1 COG:no KEGG:KPN_pKPN5p08213 NR:ns ## KEGG: KPN_pKPN5p08213 # Name: repC # Def: plasmid F RepC-like protein, plasmid replication # Organism: K.pneumoniae # Pathway: not_defined # 1 362 6 367 367 639 85.0 0 MLSQLNLRFHKKLIEALKTRAGRENTSVNALAERFLDDGLKTVAPGDGYFQLIADPEATV RQLYRHIILGQTFGTSALSRDELRFVLVHVREAFLRGHNRLATLPALDTLLDITGNLLAW QVEHDRPVDGHYLKGIFRLAGKNWTEEFEAFRAALRPVVDQMYAEHLLRPLESDCFGLAE VPDAVLAEIFTLPRLKAVFPLMLRGLDWNTEQARTLAQELRPVISAVTETIEAGTLRLEI RVDGQHPGERPGAWYTTPRLHLLITGQDFVVPYGWEALSELLGLFTLYARHPEALTHGHQ GERVMFSPPGNVTPEGFFGIDGLRIFMPAEAFETLVRELATRCQEGPLAEALTGLRCLYG DL >gi|296493288|gb|ADTK01000213.1| GENE 21 20263 - 20676 413 137 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0030 NR:ns ## KEGG: SeHA_A0030 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 137 14 150 150 286 100.0 1e-76 MFFIENEGQAVAGTDYWQSVQAQAGYVYLSWNAGAARLLVPDAAKHLLREMRGAEYVIIS KGTLHGRDALELVFEDGSDAPFVIHMLSEQCDRLLPENNQGGGFVVTVWTRGGNQLRYPG KYRVVENLPDVSPWSEH >gi|296493288|gb|ADTK01000213.1| GENE 22 20678 - 21457 350 259 aa, chain + ## HITS:1 COG:PSLT031 KEGG:ns NR:ns ## COG: PSLT031 
COG0582 # Protein_GI_number: 17233417 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Salmonella typhimurium LT2 # 14 256 15 257 260 379 88.0 1e-105 MQHLPAPIHHARDAAQLPVAIDYPAALALRQMSMVHDELPKYLLAPEVSALLHYVPDLRR KMLLATLWNTGARINEALALTRGDFSLTPPYPFVQLATLKQRTEKAARTAGRMPAGQQTH RLVPLSDAWYVSQLQTMVATLKIPMERRNKRTGRTEKARIWEVTDRTVRTWIGEAVAAAA ADGVTFSVPVTPHTFRHSYAMHMLYAGIPLKVLQSLMGHKSISSTEVYTKVFALDVAARH RVQFSMPESDAVSMLKRIP >gi|296493288|gb|ADTK01000213.1| GENE 23 21636 - 22280 506 214 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0032 NR:ns ## KEGG: SeHA_A0032 # Name: not_defined # Def: ParA protein homolog # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 214 1 214 214 409 100.0 1e-113 MPVIANTHPKGGVGKTTSSVNIVGEMKSDTVDLDTHTGLSIILGLRPEGKEISVKVPKTV DELIEIMTPYKNSDKTLLIDCGGFDSDLTRTAIAFADCVIVPSKDSLTERIGLMHFDGVL DEISSIMGTDITAHLYLCKVNPNKKKFPKLDAILPSFKHLKLMKSRISARAEFEDVIETG MGITESVHGRYSAGGKEVIALIEEINHLIENNKQ >gi|296493288|gb|ADTK01000213.1| GENE 24 22367 - 22675 197 102 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_F0053 NR:ns ## KEGG: EcE24377A_F0053 # Name: not_defined # Def: putative 60 kDa chaperonin # Organism: E.coli_E24377A # Pathway: not_defined # 1 102 1 102 102 141 99.0 9e-33 MSKAKFSAQAEKLKNLVDSARLQTSSSAVTATAPSQPNEGAEEVSGRPVAARTKAKNLKA IPLSYFEAHAELKATSKTSLDFSSYIIEAIREKLERDGAIGQ >gi|296493288|gb|ADTK01000213.1| GENE 25 23035 - 24069 739 344 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0034 NR:ns ## KEGG: SeHA_A0034 # Name: not_defined # Def: plasmid segregation protein ParM # Organism: S.enterica_Heidelberg # Pathway: not_defined # 19 344 1 326 326 643 100.0 0 MYEVCLFFLFVYEQRRLVMKIFIDDGSTNIKLAWLEDGDVKTLISPNSFKPEWSFSLLDD AAPANYEIDGEKFSFDPLSADAVVTTETRYQYSDVNVVAIQHALQQTGLKAQPVDVIVTL PISEYLDANNQKNKQNIERKKKNVMREVRVQGSDAFVIRSVSVLPESIPAGFSVLAGLED DESLLIVDLGGTTLDVSHVRSKMTGITKTWCDPNIGVSLITSGVKEQMAVHANTRVSSFQ ADNIIVHRNEPDYLSRRIYNAEQRESIINVINERQKLLIKRVNDVISRFTDYTHVMCVGG GAEIVAEAVKNLTKVPDERFYLSSSPQFDLVMGMIKMKGGVTNE >gi|296493288|gb|ADTK01000213.1| GENE 
26 24062 - 24478 94 138 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0035 NR:ns ## KEGG: SeHA_A0035 # Name: not_defined # Def: plasmid stability protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 138 1 138 138 228 100.0 7e-59 MSDENKSRRCSFELFPDERTGDKIADELIANEKLKERGRFMRAMLVTGAAFAAIDKRLPL LISELLTENTTLDDINKVISSVIPGAFSVEKKLLELLEKQSGLHTSVDCSTPLTEQSLSR NDGEDQTRRNAENMFGDD >gi|296493288|gb|ADTK01000213.1| GENE 27 24480 - 25754 561 424 aa, chain - ## HITS:1 COG:PSLT054 KEGG:ns NR:ns ## COG: PSLT054 COG0389 # Protein_GI_number: 17233501 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Salmonella typhimurium LT2 # 1 423 1 423 424 630 70.0 1e-180 MFALADINSFYASCEKVFRPDLRNEPVIVLSNNDGCVIARSPEAKALGIRMGQPWFQVRQ MRLEKKIHVFSSNYALYHSMSQRVMAVLESLSPAVEPYSIDEMFIDLRGINHCISPEFFG HQLREQVKSWTGLTMGVGIAPTKTLAKSAQWATKQWPQFSGVVALTAENRNRILKLLGLQ PVGEVWGVGHRLTEKLNALGINTALQLAQANTAFIRKNFSVILERTVRELNGESCISLEE APPAKQQIVCSRSFGERITDKDAMHQAVVQYAERAAEKLRGERQYCRQVTTFVRTSPFAV KEPCYSNAAVEKLPLPTQDSRDIIAAACRALNHVWREGYRYMKAGVMLADFTPSGIAQPG LFDEIQPRKNSEKLMKTLDELNQSGKGKVWFAGRGTAPEWQMKREMLSQCYTTKWRDIPL ARLG >gi|296493288|gb|ADTK01000213.1| GENE 28 25754 - 26191 192 145 aa, chain - ## HITS:1 COG:PSLT055 KEGG:ns NR:ns ## COG: PSLT055 COG1974 # Protein_GI_number: 17233502 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Salmonella typhimurium LT2 # 20 142 17 137 140 169 68.0 2e-42 MSTVYHRPADPSGDDSYVRPLFADRCQAGFPSPATDYAEQELDLNSYCISRPAATFFLRA SGESMNQAGVQNGDLLVVDRAEKPQHGDIVIAEIDGEFTVKRLLLRPRPALEPVSDSPEF RTLYPENICIFGVVTHVIHRTRELR >gi|296493288|gb|ADTK01000213.1| GENE 29 26188 - 26436 369 82 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0038 NR:ns ## KEGG: SeHA_A0038 # Name: not_defined # Def: protein ImpC # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 82 1 82 82 148 100.0 6e-35 
MIRIEILFDRQSTKNLKSGTLQALQNEIEQRLKPHYPEIWLRIDQGSAPSVSVTGARNDK DKERILSLLEEIWQDDSWLPAA >gi|296493288|gb|ADTK01000213.1| GENE 30 26830 - 27756 620 308 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0041 NR:ns ## KEGG: SeHA_A0041 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 308 1 308 308 629 100.0 1e-179 MPNWCSNRMYFSGEPAQIAEIKRLASGAVTPLYRRATNEGIQLFLAGSAGLLQITENIRS EQCPGVTAAGRGAVSTENIAFTRWLTHLQNGVLLDEQNCLMLHELWLQSGTGQRRWEGLP DDVRETITVHFTAKRGDWCDIWGSEDVSVWWNRLCDNVVPEKTMPFDLLTVLPTRLDVEV NGFNGGVLNGVPSAYHWYTERYGVKWPCGYDLNISSQGDNCIQVDFDTPWCQPESDVVAA LSRRFGCTLEHWYAEQGCNFCGWQLYERGELVDVLWGELEWSSPTDDDELPEVTGPAWIV DKVAHYGG Prediction of potential genes in microbial genomes Time: Mon May 16 15:43:03 2011 Seq name: gi|296493287|gb|ADTK01000214.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont565.2, whole genome shotgun sequence Length of sequence - 9008 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 163 - 495 220 ## COG4584 Transposase and inactivated derivatives - Prom 712 - 771 1.8 + Prom 984 - 1043 3.3 2 2 Tu 1 . + CDS 1265 - 2242 600 ## EcSMS35_A0095 repFIB replication protein A + Term 2250 - 2299 15.2 - Term 2285 - 2326 0.7 3 3 Tu 1 . - CDS 2527 - 3267 343 ## COG0582 Integrase - Term 3867 - 3902 4.4 4 4 Op 1 . - CDS 3950 - 4753 -149 ## EcSMS35_A0097 Mig-14 family protein - Prom 4860 - 4919 2.8 5 4 Op 2 . - CDS 4921 - 6030 502 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Prom 6205 - 6264 9.2 - Term 6365 - 6411 4.1 6 5 Tu 1 . 
- CDS 6463 - 7416 437 ## COG4571 Outer membrane protease - Prom 7451 - 7510 4.1 Predicted protein(s) >gi|296493287|gb|ADTK01000214.1| GENE 1 163 - 495 220 110 aa, chain - ## HITS:1 COG:YPCD1.97c KEGG:ns NR:ns ## COG: YPCD1.97c COG4584 # Protein_GI_number: 16082780 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Yersinia pestis # 75 102 1 28 308 59 96.0 2e-09 MIKQMRQQGAYIVDIAAQIGCSERTVRRYLKYPEPPARKTRHKMVKLKPFMDYIDMRLAE NVWNSEVIFAEIKAMGYTGGRSMLRYYIQPKRKMRPSKRTRRELPVKRET >gi|296493287|gb|ADTK01000214.1| GENE 2 1265 - 2242 600 325 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_A0095 NR:ns ## KEGG: EcSMS35_A0095 # Name: repA # Def: repFIB replication protein A # Organism: E.coli_SECEC # Pathway: not_defined # 1 325 1 325 325 604 100.0 1e-171 MDKSSGELVTLTPNNNNTVQPVALMRLGVFVPTLKSLKNSKKNTLSRTDATEELTRLSLA RAEGFDKVEITGPRLDMDNDFKTWVGIIHSFARHNVIGDKVELPFVEFAKLCGIPSSQSS RRLRERISPSLKRIAGTVISFSRTDEKHTREYITHLVQSAYYDTERDIVQLQADPRLFEL YQFDRKVLLQLKAINALKRRESAQALYTFIESLPRDPAPISLARLRARLNLKSPVFSQNQ TVRRAMEQLREIGYLDYTEIQRGRTKFFCIHYRRPRLKAPNDESKENPLQPSPAEKVSPE MAEKLALLEKLGITLDDLEKLFKSR >gi|296493287|gb|ADTK01000214.1| GENE 3 2527 - 3267 343 246 aa, chain - ## HITS:1 COG:PSLT031 KEGG:ns NR:ns ## COG: PSLT031 COG0582 # Protein_GI_number: 17233417 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Salmonella typhimurium LT2 # 15 240 16 257 260 174 49.0 1e-43 MNNVIPLQNSPERVSLLPIAPGVDFATALSLRRMATSTGATPAYLLAPEVSALLFYMPDQ RHHMLFATLWNTGMRIGEARMLTPESFDLNGVRPFVRILSEKVRARRGRPPKDEVRLVPL TDISYVRQMESWMITTRPRRREPLWAVTDETMRNWLKQAVRRAEADGVHFSIPVTPHTFR HSYIMHMLYHRQPRKVIQALVGHRDPRSMEVYTRVFALDMAATLAVPFTGDGRDAAEILR TLPPLR >gi|296493287|gb|ADTK01000214.1| GENE 4 3950 - 4753 -149 267 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_A0097 NR:ns ## KEGG: EcSMS35_A0097 # Name: not_defined # Def: Mig-14 family protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 267 36 302 302 560 100.0 1e-158 
MHPDVVSYFMIHHDWKFDFFHYEKDGDIKGSYFLCNGKQIGIMARRSYPLSSDEVLIPFS PHARCFFPDKTNKLSIINKQNIINATWKIARKKQNCIIKESFSPKFEKTRRNEIQRFIRN GGEIKCISQLSDKEISSSYISLFHSRFGGTLPCYEYDNLLMFISHLRELMFGHVLFWDNK PCAIDIVLKSESSCNVYYDVPNGAVLNDENCMKLSPGSVLMWLNIDKARRYCQDNNKKMI YSIGAFRPEWKYKLLWSVPCKVGKCLC >gi|296493287|gb|ADTK01000214.1| GENE 5 4921 - 6030 502 369 aa, chain - ## HITS:1 COG:YPO1559 KEGG:ns NR:ns ## COG: YPO1559 COG0451 # Protein_GI_number: 16121830 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Yersinia pestis # 1 369 11 379 379 488 62.0 1e-138 MKLLLLTGATGFLGGAVLDKLLDNCNNINLLLLVRAPTPQAGLERIKENMRKFNVCEERL HALTNDNILPGDLNNPEAFLMDPRLDEVTHVINCAAIASFGNNPFIWNVNVTGTLAFARR MAKVAGLKRFLHVGTAMSCTPHTGSLVKEESASSETGEHLVEYTHSKATIEYLMRKQCPD LPLLVARPSIIVGHSRLGCLPSTSIFWVFRMGLMLQKFMCSLDDKIDVIPVDYCADALLM LLESSLINGEIVHISAGKESSVTFSAIDEAVARALNCDPVGDRYTKVSYDILAMSRHDFK NIFGPCNERLMLKAIRLYGAFSMLNVCFSNDKLLSIGMPKPPKFTDYIKYCIETTKHLSI QQQMEVDFK >gi|296493287|gb|ADTK01000214.1| GENE 6 6463 - 7416 437 317 aa, chain - ## HITS:1 COG:ECs1663 KEGG:ns NR:ns ## COG: ECs1663 COG4571 # Protein_GI_number: 15830917 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protease # Organism: Escherichia coli O157:H7 # 1 317 1 317 317 475 74.0 1e-134 MYLKILATALSAPVAFAALASDTGLSFTPEKISTEIDFGTLSGKAKERVYLPEEKGRKAS QLDWKYSNAPIVKGAFNWDLLPRVSVGASGWTTLAGRGGNMVDRDWLDTSNPGTWTDESK HPNTRLNFANEFDLNIKGWLLNQPDYQLGLMAGYQENRYSFTAKGGSYIYSSEGGFRDET GSFPDGERAIGYKQHFKMPYIGLTGNYRYDSFEFGGSFKYSGWVKASDNDEHYNPEKRIT YRSDVNNQNYYSVSLHAGYYITPAAKVYVEGTWNRITNKKGDTSLYSRNLNISDHTKNGA GIESYNFMTTAGLKYYF Prediction of potential genes in microbial genomes Time: Mon May 16 15:43:12 2011 Seq name: gi|296493286|gb|ADTK01000215.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont579.1, whole genome shotgun sequence Length of sequence - 1886 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 
average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 552 - 1412 284 ## COG2367 Beta-lactamase class A - Prom 1447 - 1506 2.9 2 2 Tu 1 . - CDS 1595 - 1885 180 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs Predicted protein(s) >gi|296493286|gb|ADTK01000215.1| GENE 1 552 - 1412 284 286 aa, chain - ## HITS:1 COG:SMa1952 KEGG:ns NR:ns ## COG: SMa1952 COG2367 # Protein_GI_number: 16263521 # Func_class: V Defense mechanisms # Function: Beta-lactamase class A # Organism: Sinorhizobium meliloti # 30 279 22 274 281 201 46.0 1e-51 MSIQHFRVALIPFFAAFCLPVFAHPETLVKVKDAEDQLGARVGYIELDLNSGKILESFRP EERFPMMSTFKVLLCGAVLSRVDAGQEQLGRRIHYSQNDLVEYSPVTEKHLTDGMTVREL CSAAITMSDNTAANLLLTTIGGPKELTAFLHNMGDHVTRLDRWEPELNEAIPNDERDTTM PAAMATTLRKLLTGELLTLASRQQLIDWMEADKVAGPLLRSALPAGWFIADKSGAGERGS RGIIAALGPDGKPSRIVVIYTTGSQATMDERNRQIAEIGASLIKHW >gi|296493286|gb|ADTK01000215.1| GENE 2 1595 - 1885 180 96 aa, chain - ## HITS:1 COG:YPCD1.91 KEGG:ns NR:ns ## COG: YPCD1.91 COG1961 # Protein_GI_number: 16082774 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Yersinia pestis # 11 94 100 183 183 108 67.0 2e-24 ITTDYLQQCPDGDMGQMVVTILSAVAQAERRRILERTNEGRQEAKLKGIKFGRRRTVDRN VVLTLHQKGTGATEIAHQLSIARSTVYKILEDERAS Prediction of potential genes in microbial genomes Time: Mon May 16 15:43:13 2011 Seq name: gi|296493285|gb|ADTK01000216.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont581.1, whole genome shotgun sequence Length of sequence - 766 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 585 607 ## COG5492 Bacterial surface proteins containing Ig-like domains 2 1 Op 2 . 
+ CDS 601 - 766 121 ## ECIAI1_0793 minor tail protein G Predicted protein(s) >gi|296493285|gb|ADTK01000216.1| GENE 1 1 - 585 607 194 aa, chain + ## HITS:1 COG:Z1894 KEGG:ns NR:ns ## COG: Z1894 COG5492 # Protein_GI_number: 15801361 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Escherichia coli O157:H7 EDL933 # 1 194 63 256 256 321 99.0 5e-88 ESYDDSYLDDEDPDWAATGQGQKSAGDTSFTLAWMPGEQGQQALLAWFNEGDTRAYKIRF PNGTVDVFRGWVSSIGKAVTAKEVITRTVKVTNVGRPSMAEDRSTVTAATGMTVTPASTS VVKGQSTTLTVAFQPEGATDKSFRAVSADKTKATVSVSGMTITVKGVAAGKVNIPVVSGN GEFAAVAEINVTAS >gi|296493285|gb|ADTK01000216.1| GENE 2 601 - 766 121 55 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_0793 NR:ns ## KEGG: ECIAI1_0793 # Name: not_defined # Def: minor tail protein G # Organism: E.coli_IAI1 # Pathway: not_defined # 1 55 1 55 140 97 100.0 2e-19 MFLKTESFEHNGVTVTLSELSALQRIEHLALMKRQAEQAESDSNRKFTVEDAIRT Prediction of potential genes in microbial genomes Time: Mon May 16 15:43:15 2011 Seq name: gi|296493284|gb|ADTK01000217.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont622.1, whole genome shotgun sequence Length of sequence - 506 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 69 - 504 278 ## ECO103_2434 putative exonuclease Predicted protein(s) >gi|296493284|gb|ADTK01000217.1| GENE 1 69 - 504 278 145 aa, chain + ## HITS:1 COG:no KEGG:ECO103_2434 NR:ns ## KEGG: ECO103_2434 # Name: not_defined # Def: putative exonuclease # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 145 465 609 823 270 99.0 1e-71 MQAANISQPDADELLAVSRGEFVEGISDPNDPKWVKGIQTRDSVNQNQQETEQNDQKAEQ NSPNTQQNEPETKQPEPVVQQEPEKICTACGHSGGGNCPDCGAVMGDATYQEIFDGENQP EVQENDPEEMEGTAHQHKENTGGNQ Prediction of potential genes in microbial genomes Time: Mon May 16 15:43:46 2011 Seq name: gi|296493283|gb|ADTK01000218.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont624.1, whole genome shotgun sequence Length of sequence - 90942 bp Number of predicted genes - 93, with homology - 92 Number of transcription units - 48, operones - 21 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 113 - 241 78 ## EC55989_1434 hypothetical protein 2 1 Op 2 . + CDS 244 - 411 112 ## SSON_1865 hypothetical protein + Term 625 - 660 4.0 + Prom 675 - 734 3.5 3 2 Tu 1 . + CDS 784 - 3465 2593 ## COG1048 Aconitase A + Term 3526 - 3562 4.1 - Term 3458 - 3506 5.5 4 3 Tu 1 . - CDS 3529 - 4119 629 ## COG0807 GTP cyclohydrolase II - Prom 4193 - 4252 1.6 + Prom 4209 - 4268 4.5 5 4 Tu 1 . + CDS 4289 - 5053 432 ## COG0671 Membrane-associated phospholipid phosphatase + Prom 5079 - 5138 2.9 6 5 Op 1 8/0.000 + CDS 5202 - 5510 201 ## PROTEIN SUPPORTED gi|46133578|ref|ZP_00203203.1| COG3771: Predicted membrane protein 7 5 Op 2 7/0.000 + CDS 5517 - 6686 1285 ## COG2956 Predicted N-acetylglucosaminyl transferase + Prom 6768 - 6827 7.7 8 5 Op 3 6/0.067 + CDS 6879 - 7616 635 ## COG0284 Orotidine-5'-phosphate decarboxylase 9 5 Op 4 . + CDS 7616 - 7942 441 ## COG0023 Translation initiation factor 1 (eIF-1/SUI1) and related proteins + Term 7951 - 7982 4.1 - Term 7935 - 7974 8.4 10 6 Tu 1 . 
- CDS 8068 - 8286 343 ## G2583_1624 hypothetical protein - Prom 8434 - 8493 6.1 - Term 8486 - 8544 4.1 11 7 Op 1 . - CDS 8555 - 9304 677 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 9333 - 9392 7.3 12 7 Op 2 . - CDS 9394 - 9567 192 ## EC55989_1446 hypothetical protein - Prom 9647 - 9706 6.3 - Term 9601 - 9643 3.1 13 8 Tu 1 . - CDS 9714 - 11699 1318 ## COG2200 FOG: EAL domain - Prom 11788 - 11847 6.4 - Term 11888 - 11924 6.3 14 9 Op 1 1/0.933 - CDS 11935 - 13869 1979 ## COG4776 Exoribonuclease II 15 9 Op 2 1/0.933 - CDS 13937 - 15064 374 ## COG4950 Uncharacterized protein conserved in bacteria - Prom 15146 - 15205 2.5 - Term 15140 - 15200 6.0 16 10 Tu 1 . - CDS 15208 - 15996 987 ## COG0623 Enoyl-[acyl-carrier-protein] reductase (NADH) - Prom 16077 - 16136 3.6 17 11 Op 1 1/0.933 - CDS 16364 - 16714 129 ## COG2852 Uncharacterized protein conserved in bacteria 18 11 Op 2 8/0.000 - CDS 16785 - 17603 448 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 19 11 Op 3 8/0.000 - CDS 17593 - 18585 468 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 20 11 Op 4 8/0.000 - CDS 18585 - 19475 956 ## COG4171 ABC-type antimicrobial peptide transport system, permease component 21 11 Op 5 8/0.000 - CDS 19462 - 20427 588 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 22 11 Op 6 . - CDS 20424 - 22067 1247 ## COG4166 ABC-type oligopeptide transport system, periplasmic component - Prom 22090 - 22149 4.0 23 12 Tu 1 . - CDS 22380 - 22625 314 ## Z2493 hypothetical protein - Prom 22658 - 22717 3.9 24 13 Tu 1 . - CDS 22759 - 24144 1474 ## COG0531 Amino acid transporters - Prom 24187 - 24246 2.2 - Term 24388 - 24427 1.2 25 14 Tu 1 . 
- CDS 24447 - 25865 1291 ## COG0174 Glutamine synthetase - Prom 25953 - 26012 5.4 + Prom 25950 - 26009 4.6 26 15 Op 1 1/0.933 + CDS 26089 - 26841 247 ## COG2071 Predicted glutamine amidotransferases 27 15 Op 2 4/0.400 + CDS 26868 - 27425 511 ## COG1396 Predicted transcriptional regulators 28 16 Op 1 6/0.067 + CDS 27700 - 29187 1690 ## COG1012 NAD-dependent aldehyde dehydrogenases 29 16 Op 2 4/0.400 + CDS 29189 - 30469 1501 ## COG0665 Glycine/D-amino acid oxidases (deaminating) 30 16 Op 3 . + CDS 30507 - 31772 1144 ## COG0160 4-aminobutyrate aminotransferase and related aminotransferases + Term 31851 - 31896 5.0 - Term 31788 - 31825 2.1 31 17 Tu 1 . - CDS 31903 - 32880 870 ## COG1221 Transcriptional regulators containing an AAA-type ATPase domain and a DNA-binding domain - Prom 32983 - 33042 3.8 + Prom 32856 - 32915 4.5 32 18 Op 1 . + CDS 33047 - 33715 1025 ## COG1842 Phage shock protein A (IM30), suppresses sigma54-dependent transcription 33 18 Op 2 . + CDS 33769 - 33993 308 ## ECDH10B_1422 phage shock protein B 34 18 Op 3 . + CDS 33993 - 34352 538 ## COG1983 Putative stress-responsive transcriptional regulator 35 18 Op 4 . 
+ CDS 34361 - 34582 211 ## ECDH10B_1424 peripheral inner membrane phage-shock protein 36 18 Op 5 1/0.933 + CDS 34657 - 34971 418 ## COG0607 Rhodanese-related sulfurtransferase + Term 34985 - 35018 3.6 + Prom 34991 - 35050 3.9 37 19 Op 1 2/0.867 + CDS 35157 - 36863 1044 ## COG0366 Glycosidases 38 19 Op 2 35/0.000 + CDS 36877 - 38169 1519 ## COG1653 ABC-type sugar transport system, periplasmic component 39 19 Op 3 38/0.000 + CDS 38190 - 39071 962 ## COG1175 ABC-type sugar transport systems, permease components 40 19 Op 4 5/0.200 + CDS 39058 - 39900 930 ## COG0395 ABC-type sugar transport system, permease component 41 19 Op 5 3/0.667 + CDS 39931 - 40983 1240 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 42 19 Op 6 16/0.000 + CDS 41001 - 41789 976 ## COG1082 Sugar phosphate isomerases/epimerases 43 19 Op 7 4/0.400 + CDS 41811 - 42854 892 ## COG0673 Predicted dehydrogenases and related proteins 44 19 Op 8 11/0.000 + CDS 42920 - 45118 1542 ## COG1554 Trehalose and maltose hydrolases (possible phosphorylases) 45 19 Op 9 3/0.667 + CDS 45115 - 45774 588 ## COG0637 Predicted phosphatase/phosphohexomutase 46 19 Op 10 . + CDS 45788 - 46870 1062 ## COG3839 ABC-type sugar transport systems, ATPase components 47 19 Op 11 . + CDS 46915 - 47820 957 ## B21_01307 hypothetical protein - Term 47860 - 47913 5.5 48 20 Tu 1 . - CDS 47931 - 48929 830 ## COG1609 Transcriptional regulators - Prom 48958 - 49017 3.8 + Prom 48858 - 48917 2.2 49 21 Op 1 9/0.000 + CDS 49084 - 50481 1204 ## COG3106 Predicted ATPase 50 21 Op 2 5/0.200 + CDS 50478 - 51539 1221 ## COG3768 Predicted membrane protein + Term 51592 - 51634 1.9 + Prom 51599 - 51658 6.5 51 22 Tu 1 . + CDS 51687 - 53228 1477 ## COG3283 Transcriptional regulator of aromatic amino acids metabolism + Term 53238 - 53276 10.2 - Term 53217 - 53270 11.2 52 23 Tu 1 . - CDS 53272 - 53778 650 ## COG2077 Peroxiredoxin - Prom 53807 - 53866 2.3 + Prom 53775 - 53834 4.3 53 24 Tu 1 . 
+ CDS 53897 - 54862 818 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 54 25 Tu 1 . - CDS 54837 - 55565 390 ## COG2866 Predicted carboxypeptidase - Prom 55602 - 55661 1.5 55 26 Op 1 1/0.933 - CDS 55666 - 55830 171 ## COG0702 Predicted nucleoside-diphosphate-sugar epimerases 56 26 Op 2 . - CDS 55901 - 56818 1003 ## COG1073 Hydrolases of the alpha/beta superfamily - Prom 56887 - 56946 2.4 + Prom 56762 - 56821 3.2 57 27 Tu 1 . + CDS 56959 - 57858 237 ## COG0583 Transcriptional regulator + Term 57999 - 58027 -0.9 58 28 Tu 1 . + CDS 58195 - 59808 1412 ## COG4166 ABC-type oligopeptide transport system, periplasmic component + Term 59820 - 59862 -0.6 - Term 59623 - 59657 0.3 59 29 Tu 1 . - CDS 59859 - 60890 787 ## COG0668 Small-conductance mechanosensitive channel - Prom 60941 - 61000 4.4 + Prom 60846 - 60905 4.8 60 30 Tu 1 . + CDS 61134 - 61391 330 ## ECSP_1858 predicted inner membrane protein + Term 61402 - 61431 2.8 - Term 61389 - 61419 3.0 61 31 Op 1 7/0.000 - CDS 61441 - 62391 1024 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins - Prom 62429 - 62488 3.7 - Term 62461 - 62521 10.1 62 31 Op 2 4/0.400 - CDS 62543 - 63295 786 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Prom 63321 - 63380 6.2 63 32 Op 1 2/0.867 - CDS 63490 - 64005 273 ## COG0350 Methylated DNA-protein cysteine methyltransferase 64 32 Op 2 4/0.400 - CDS 64016 - 65422 1021 ## COG2978 Putative p-aminobenzoyl-glutamate transporter 65 33 Op 1 3/0.667 - CDS 65579 - 67024 993 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 66 33 Op 2 . - CDS 67024 - 68334 985 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase - Prom 68398 - 68457 2.3 + Prom 68347 - 68406 2.3 67 34 Tu 1 . + CDS 68510 - 69418 643 ## COG0583 Transcriptional regulator + Term 69664 - 69700 -0.4 + Prom 69592 - 69651 4.7 68 35 Tu 1 . 
+ CDS 69748 - 70311 622 ## COG2840 Uncharacterized protein conserved in bacteria 69 36 Tu 1 . - CDS 70332 - 71564 920 ## COG2199 FOG: GGDEF domain - Prom 71777 - 71836 3.1 + Prom 71671 - 71730 5.1 70 37 Tu 1 . + CDS 71873 - 72802 778 ## COG0598 Mg2+ and Co2+ transporters + Term 72904 - 72948 2.2 71 38 Tu 1 . + CDS 73355 - 74653 1120 ## COG0513 Superfamily II DNA and RNA helicases + Term 74750 - 74794 -0.9 - Term 74734 - 74771 5.3 72 39 Op 1 3/0.667 - CDS 74782 - 75717 899 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control 73 39 Op 2 . - CDS 75769 - 76698 227 ## COG0582 Integrase - Prom 76777 - 76836 3.9 74 40 Op 1 . - CDS 77006 - 77221 214 ## G2583_1696 hypothetical protein 75 40 Op 2 . - CDS 77300 - 77509 136 ## ECSE_1401 hypothetical protein - Prom 77690 - 77749 2.7 - Term 77706 - 77752 4.2 76 41 Op 1 . - CDS 77753 - 78562 657 ## COG3723 Recombinational DNA repair protein (RecE pathway) 77 41 Op 2 . - CDS 78555 - 81155 1161 ## ECB_01327 exonuclease VIII, 5' -> 3' specific dsDNA exonuclease 78 42 Op 1 . - CDS 81257 - 81532 318 ## ECO26_1920 hypothetical protein 79 42 Op 2 . - CDS 81607 - 81777 62 ## ECSE_1405 hypothetical protein 80 42 Op 3 . - CDS 81777 - 81998 154 ## Z2406 FtsZ inhibitor protein 81 43 Tu 1 . + CDS 82440 - 82928 105 ## APECO1_502 prophage CP-933R superinfection exclusion protein + Term 83020 - 83062 1.4 - Term 82712 - 82752 4.1 82 44 Op 1 . - CDS 82925 - 83080 80 ## B21_01342 hypothetical protein 83 44 Op 2 . - CDS 83091 - 83225 159 ## Z2402 hypothetical protein - Prom 83388 - 83447 5.5 84 45 Tu 1 . - CDS 83534 - 84010 138 ## JW1351 predicted DNA-binding transcriptional regulator + Prom 84055 - 84114 4.2 85 46 Op 1 . + CDS 84134 - 84430 70 ## COG4197 Uncharacterized protein conserved in bacteria, prophage-related 86 46 Op 2 . + CDS 84453 - 84875 292 ## JW1353 hypothetical protein 87 46 Op 3 1/0.933 + CDS 84888 - 85745 99 ## COG3756 Uncharacterized protein conserved in bacteria 88 46 Op 4 . 
+ CDS 85752 - 86498 404 ## COG1484 DNA replication protein 89 46 Op 5 . + CDS 86521 - 87282 554 ## ECs1946 hypothetical protein 90 46 Op 6 . + CDS 87298 - 87729 160 ## Z2394 hypothetical protein + Term 87754 - 87788 4.7 91 47 Op 1 . - CDS 87763 - 88803 59 ## COG0270 Site-specific DNA methylase 92 47 Op 2 . - CDS 88827 - 90113 316 ## Gbro_1454 ATPase - Prom 90326 - 90385 4.2 93 48 Tu 1 . - CDS 90663 - 90842 61 ## Predicted protein(s) >gi|296493283|gb|ADTK01000218.1| GENE 1 113 - 241 78 42 aa, chain + ## HITS:1 COG:no KEGG:EC55989_1434 NR:ns ## KEGG: EC55989_1434 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 42 13 54 54 73 100.0 2e-12 MPSGNQEPRRDPELKRKAWLAVFLGSALFWVVVALLIWKVWG >gi|296493283|gb|ADTK01000218.1| GENE 2 244 - 411 112 55 aa, chain + ## HITS:1 COG:no KEGG:SSON_1865 NR:ns ## KEGG: SSON_1865 # Name: not_defined # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 55 5 59 59 80 100.0 1e-14 MVGQEQLESSPLCQHSDNETETKRECSVVIPDDWQLTSQQQAFIELFAEDDQPKQ >gi|296493283|gb|ADTK01000218.1| GENE 3 784 - 3465 2593 893 aa, chain + ## HITS:1 COG:acnA KEGG:ns NR:ns ## COG: acnA COG1048 # Protein_GI_number: 16129237 # Func_class: C Energy production and conversion # Function: Aconitase A # Organism: Escherichia coli K12 # 1 893 1 891 891 1786 99.0 0 MSSTLREASKDTLQAKDKTYHYYSLPLAAKSLGDITRLPKSLKVLLENLLRWQDGNSVTE EDIHALAGWLKNAHADREIAYRPARVLMQDFTGVPAVVDLAAMREAVKRLGGDTAKVNPL SPVDLVIDHSVTVDRFGDDEAFEENVRLEMERNHERYVFLKWGKQAFSRFSVVPPGTGIC HQVNLEYLGKAVWSELQDGEWIAYPDTLVGTDSHTTMINGLGVLGWGVGGIEAEAAMLGQ PVSMLIPDVVGFKLTGKLREGITATDLVLTVTQMLRKHGVVGKFVEFYGDGLDSLPLADR ATIANMSPEYGATCGFFPIDAVTLDYMRLSGRSEDQVELVEKYAKAQGMWRNPGDEPIFT STLELDMNDVEASLAGPKRPQDRVALPDVPKAFAASNELEVNATHKDRQPVDYVMNGHQY QLPDGAVVIAAITSCTNTSNPSVLMAAGLLAKKAVTLGLKRQPWVKASLAPGSKVVSDYL AKAKLTPYLDELGFNLVGYGCTTCIGNSGPLPDPIETAIKKGDLTVGAVLSGNRNFEGRI HPLVKTNWLASPPLVVAYALAGNMNINLAAEPIGHDRKGDPVYLKDIWPSAPAQEIARAV 
EQVSTEMFRKEYAEVFEGTAEWKEINVTRSDTYGWQEDSTYIRLSPFFDEMQATPAPVED IHGARILAMLGDSVTTDHISPAGSIKPDSPAGRYLQGRGVERKDFNSYGSRRGNHEVMMR GTFANIRIRNEMVPGVEGGMTRHLPDSDVVSIYDAAMRYKQEQTPLAVIAGKEYGSGSSR DWAAKGPRLLGIRVVIAESFERIHRSNLIGMGILPLEFPQGVTRKTLGLTGEEKIDIGDL QNLQPGATVPVTLTRADGSQEVVPCRCRIDTATELTYYQNDGILHYVIRNMLK >gi|296493283|gb|ADTK01000218.1| GENE 4 3529 - 4119 629 196 aa, chain - ## HITS:1 COG:ECs1850 KEGG:ns NR:ns ## COG: ECs1850 COG0807 # Protein_GI_number: 15831104 # Func_class: H Coenzyme transport and metabolism # Function: GTP cyclohydrolase II # Organism: Escherichia coli O157:H7 # 1 196 1 196 196 403 100.0 1e-112 MQLKRVAEAKLPTPWGDFLMVGFEELATGHDHVALVYGDISGHTPVLARVHSECLTGDAL FSLRCDCGFQLEAALTQIAEEGRGILLYHRQEGRNIGLLNKIRAYALQDQGYDTVEANHQ LGFAADERDFTLCADMFKLLGVNEVRLLTNNPKKVEILTEAGINIVERVPLIVGRNPNNE HYLDTKAEKMGHLLNK >gi|296493283|gb|ADTK01000218.1| GENE 5 4289 - 5053 432 254 aa, chain + ## HITS:1 COG:ECs1851 KEGG:ns NR:ns ## COG: ECs1851 COG0671 # Protein_GI_number: 15831105 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Escherichia coli O157:H7 # 1 254 1 254 254 446 100.0 1e-125 MRSIARRTAVGAALLLVMPVAVWISGWRWQPGEQSWLLKAAFWVTETVTQPWGVITHLIL FGWFLWCLRFRIKAAFVLFAILAAAILVGQGVKSWIKDKVQEPRPFVIWLEKTHHIPVDE FYTLKRAERGNLVKEQLAEEKNIPQYLRSHWQKETGFAFPSGHTMFAASWALLAVGLLWP RRRTLTIAILLVWATGVMGSRLLLGMHWPRDLVVATLISWALVAVATWLAQRICGPLTPP AEENREIAQREQES >gi|296493283|gb|ADTK01000218.1| GENE 6 5202 - 5510 201 102 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|46133578|ref|ZP_00203203.1| COG3771: Predicted membrane protein [Haemophilus influenzae R2866] # 1 89 2 90 97 82 35 1e-14 MKYLLIFLLVLAIFVISVTLGAQNDQQVTFNYLLAQGEYRISTLLAVLFAAGFAIGWLIC GLFWLRVRVSLARAERKIKRLENQLSPATDVAVVPHSSAAKE >gi|296493283|gb|ADTK01000218.1| GENE 7 5517 - 6686 1285 389 aa, chain + ## HITS:1 COG:ECs1853 KEGG:ns NR:ns ## COG: ECs1853 COG2956 # Protein_GI_number: 15831107 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted 
N-acetylglucosaminyl transferase # Organism: Escherichia coli O157:H7 # 10 389 10 389 389 721 100.0 0 MLELLFLLLPVAAAYGWYMGRRSAQQNKQDEANRLSRDYVAGVNFLLSNQQDKAVDLFLD MLKEDTGTVEAHLTLGNLFRSRGEVDRAIRIHQTLMESASLTYEQRLLAIQQLGRDYMAA GLYDRAEDMFNQLTDETDFRIGALQQLLQIYQATSEWQKAIDVAERLVKLGKDKQRVEIA HFYCELALQHMASDDLDRAMTLLKKGAAADKNSARVSIMMGRVFMAKGEYAKAVESLQRV ISQDRELVSETLEMLQTCYQQLGKTAEWAEFLQRAVEENTGADAELMLADIIEARDGSEA AQVYITRQLQRHPTMRVFHKLMDYHLNEAEEGRAKESLMVLRDMVGEKVRSKPRYRCQKC GFTAYTLYWHCPSCRAWSTIKPIRGLDGL >gi|296493283|gb|ADTK01000218.1| GENE 8 6879 - 7616 635 245 aa, chain + ## HITS:1 COG:pyrF KEGG:ns NR:ns ## COG: pyrF COG0284 # Protein_GI_number: 16129242 # Func_class: F Nucleotide transport and metabolism # Function: Orotidine-5'-phosphate decarboxylase # Organism: Escherichia coli K12 # 1 245 1 245 245 454 99.0 1e-128 MTLTASSSSRAVTNSPVVVALDYHNRDDALSFVDKIDPRDCRLKVGKEMFTLFGPQFVRE LQQRGFDIFLDLKFHDIPNTAAHAVAAAADLGVWMVNVHASGGARMMTAAREALVPFGKD APLLIAVTVLTSMEASDLVDLGMTLSPADYAERLAALTQKCGLDGVVCSAQEAVRFKQVF GQEFKLVTPGIRPQGSEAGDQRRIMTPEQALSAGVDYMVIGRPVTQSVDPAQTLKAINAS LQRSA >gi|296493283|gb|ADTK01000218.1| GENE 9 7616 - 7942 441 108 aa, chain + ## HITS:1 COG:yciH KEGG:ns NR:ns ## COG: yciH COG0023 # Protein_GI_number: 16129243 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 1 (eIF-1/SUI1) and related proteins # Organism: Escherichia coli K12 # 1 108 2 109 109 177 100.0 5e-45 MSDSNSRLVYSTETGRIDEPKAAPVRPKGDGVVRIQRQTSGRKGKGVCLITGVDLDDAEL TKLAAELKKKCGCGGAVKDGVIEIQGDKRDLLKSLLEAKGMKVKLAGG >gi|296493283|gb|ADTK01000218.1| GENE 10 8068 - 8286 343 72 aa, chain - ## HITS:1 COG:no KEGG:G2583_1624 NR:ns ## KEGG: G2583_1624 # Name: osmB # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 72 1 72 72 74 100.0 1e-12 MFVTSKKMTAAVLAITLAMSLSACSNWSKRDRNTAIGAGAGALGGAVLTDGSTLGTLGGA AVGGVIGHQVGK >gi|296493283|gb|ADTK01000218.1| GENE 11 8555 - 9304 677 249 aa, chain - ## HITS:1 COG:ECs1857 KEGG:ns NR:ns ## COG: ECs1857 
COG1349 # Protein_GI_number: 15831111 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli O157:H7 # 1 249 1 249 249 480 100.0 1e-135 MNSRQQTILQMVIDQGQVSVTDLAKATGVSEVTIRQDLNTLEKLSYLRRAHGFAVSLDSD DVETRMMSNYTLKRELAEFAASLVQPGETIFIENGSSNALLARTLGEQKKNVTIITVSSY IAHLLKDAPCEVILLGGVYQKKSESMVGPLTRQCIQQVHFSKAFIGIDGWQPETGFTGRD MMRTDVVNAVLEKECEAIVLTDSSKFGAVHSYSIGPVERFNRVITDSKIRASDLMHLEHS KLTVHVVDI >gi|296493283|gb|ADTK01000218.1| GENE 12 9394 - 9567 192 57 aa, chain - ## HITS:1 COG:no KEGG:EC55989_1446 NR:ns ## KEGG: EC55989_1446 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 57 15 71 71 105 100.0 4e-22 MSEFDAQRVAERIDIVLDILVAGDYHSAIHNLEILKAELLRQVAESTPDIPKAPWEI >gi|296493283|gb|ADTK01000218.1| GENE 13 9714 - 11699 1318 661 aa, chain - ## HITS:1 COG:yciR_3 KEGG:ns NR:ns ## COG: yciR_3 COG2200 # Protein_GI_number: 16129246 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 401 661 1 261 261 530 100.0 1e-150 MKTVRESTTLYNFLGSHNPYWRLTESSDVLRFSTTETTEPDRTLQLSAEQAARIREMTVI TSSLMMSLTVDESDLSVHLVGRKINKREWAGNASAWHDTPAVARDLSHGLSFAEQVVSEA HSAIVILDSRGNIQRFNRLCEDYTGLKEHDVIGQSVFKLFMSRREAAASRRNNRVFFRSG NAYEVELWIPTCKGQRLFLFRNKFVHSGSGKNEIFLICSGTDITEERRAQERLRILANTD SITGLPNRNAMQDLIDHAINHADNNKVGVVYLDLDNFKKVNDAYGHLFGDQLLRDVSLAI LSCLEHDQVLARPGGDEFLVLASNTSQSALEAMASRILTRLRLPFRIGLIEVYTSCSVGI ALSPEHGSDSTAIIRHADTAMYTAKEGGRGQFCVFTPEMNQRVFEYLWLDTNLRKALEND QLVIHYQPKITWRGEVRSLEALVRWQSPERGLIPPLDFISYAEESGLIVPLGRWVILDVV RQVAKWRDKGINLRVAVNISARQLADQTIFTALKQVLQELNFEYCPIDVELTESCLIEND ELALSVIQQFSQLGAQVHLDDFGTGYSSLSQLARFPIDAIKLDQVFVRDIHKQPVSQSLV RAIVAVAQALNLQVIAEGVESAKEDAFLTKNGINERQGFLFAKPMPAVAFERWYKRYLKR A >gi|296493283|gb|ADTK01000218.1| GENE 14 11935 - 13869 1979 644 aa, chain - ## HITS:1 COG:ECs1859 KEGG:ns NR:ns ## COG: ECs1859 COG4776 # Protein_GI_number: 15831113 # Func_class: K Transcription # 
Function: Exoribonuclease II # Organism: Escherichia coli O157:H7 # 1 644 1 644 644 1281 99.0 0 MFQDNPLLAQLKQQLHSQTPRAEGVVKATEKGFGFLEVDAQKSYFIPPPQMKKVMHGDRI IAVIHSEKERESAEPEELVEPFLTRFVGKVQGKNDRLAIVPDHPLLKDAIPCRAARGLNH EFKEGDWAVAEMRRHPLKGDRSFYAELTQYITFGDDHFVPWWVTLARHNLEKEAPDGVAT EMLDEGLVREDLTALDFVTIDSASTEDMDDALFAKALPDDKLQLIVAIADPTAWIAEGSK LDKAAKIRAFTNYLPGFNIPMLPRELSDDLCSLRANEVRPVLACRMTLSADGTIEDNIEF FAATIESKAKLVYDQVSDWLENTGDWQPESEAIAEQVRLLAQICQRRGEWRHNHALVFKD RPDYRFILGEKGEVLDIVAEPRRIANRIVEEAMIAANICAARVLRDKLGFGIYNVHMGFD PANADALAALLKTHGLHVDAEEVLTLDGFCKLRRELDAQPTGFLDSRIRRFQSFAEISTE PGPHFGLGLEAYATWTSPIRKYGDMINHRLLKAVIKGETATRPQDEITVQMAERRRLNRM AERDVGDWLYARFLKDKAGTDTRFAAEIVDISRGGMRVRLVDNGAIAFIPAPFLHAVRDE LVCSQENGTVQIKGETVYKVTDVIDVTIAEVRMETRSIIARPVA >gi|296493283|gb|ADTK01000218.1| GENE 15 13937 - 15064 374 375 aa, chain - ## HITS:1 COG:yciW_1 KEGG:ns NR:ns ## COG: yciW_1 COG4950 # Protein_GI_number: 16129248 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 210 27 236 236 410 96.0 1e-114 MEQRHITGKSHWYHETQSSTAEYDVLPLVPEAAKVSDPFLLDVILDEETLAPFLSWLVPA RVLAVELFPDQLTVTRSQTFTAYERLSTALTVAQVCGVQRLCNYYSARLTPLPGPDSSRE SNHRLAQITQYARQLASSPSIIDNRSRQHLNDVGLTAWDCVIINQIIGFIGFQARTIATF QAYLGHPVRWLPGLEIQNYADASLFADESIRWRSSYEVEKLPEEYTKSSTAELCQLAETL SLHPISLSLLERLLNSTRVNTQPDNKLAALLCARINGSPACFAACMDSSNEYKKISPLLR KGENEINQWADRHSVEHATVQAIQWLTRAPDRFSAAQFSPLLEHEKSSTQIINLLVWSGL CGWINRLKIALGETY >gi|296493283|gb|ADTK01000218.1| GENE 16 15208 - 15996 987 262 aa, chain - ## HITS:1 COG:ECs1861 KEGG:ns NR:ns ## COG: ECs1861 COG0623 # Protein_GI_number: 15831115 # Func_class: I Lipid transport and metabolism # Function: Enoyl-[acyl-carrier-protein] reductase (NADH) # Organism: Escherichia coli O157:H7 # 1 262 1 262 262 506 100.0 1e-143 MGFLSGKRILVTGVASKLSIAYGIAQAMHREGAELAFTYQNDKLKGRVEEFAAQLGSDIV LQCDVAEDASIDTMFAELGKVWPKFDGFVHSIGFAPGDQLDGDYVNAVTREGFKIAHDIS SYSFVAMAKACRSMLNPGSALLTLSYLGAERAIPNYNVMGLAKASLEANVRYMANAMGPE 
GVRVNAISAGPIRTLAASGIKDFRKMLAHCEAVTPIRRTVTIEDVGNSAAFLCSDLSAGI SGEVVHVDGGFSIAAMNELELK >gi|296493283|gb|ADTK01000218.1| GENE 17 16364 - 16714 129 116 aa, chain - ## HITS:1 COG:ycjD KEGG:ns NR:ns ## COG: ycjD COG2852 # Protein_GI_number: 16129250 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 116 2 117 117 213 94.0 7e-56 MDKIKSNARDLRRNLTLQERKLWRYLRSRRFGDFKFRRQHPVGSYILDFACCSARVVVEL DGGQHDLAVAYDTRRTSWLESQGWTVLRFWHNEIDCNEEAVLEIILQELNRRSPSP >gi|296493283|gb|ADTK01000218.1| GENE 18 16785 - 17603 448 272 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 256 3 263 329 177 35 2e-43 RKSEMIETLLEVRNLSKTFRYRTGWFRRQTVEAVKPLSFTLREGQTLAIIGENGSGKSTL AKMLAGMIEPTSGELLIDDHPLHFGDYSFRSQRIRMIFQDPSTSLNPRQRISQILDFPLR LNTDLEPEQRRKQIIETMRMVGLLPDHVSYYPHMLAPGQKQRLGLARALILRPKVIIADE ALASLDMSMRSQLINLMLELQEKQGISYIYVTQHIGMMKHISDQVLVMHQGEVVERGSTA DVLASPLHELTKRLIAGHFGEALTADAWRKDR >gi|296493283|gb|ADTK01000218.1| GENE 19 17593 - 18585 468 330 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 14 321 31 324 329 184 37 1e-45 MPLLDIRNLTIEFKTGDEWVKAVDRVSMTLTEGEIRGLVGESGSGKSLIAKAICGVNKDN WRVTADRMRFDDIDLLRLSARERRKLVGHNVSMIFQEPQSCLDPSERVGRQLMQNIPAWT YKGRWWQRFGWRKRRAIELLHRVGIKDHKDAMRSFPYELTEGECQKVMIAIALANQPRLL IADEPTNSMEPTTQAQIFRLLTRLNQNSNTTILLISHDLQMLSQWADKINVLYCGQTVET APSKELVTMPHHPYTQALIRAIPDFGSAMPHKSRLNTLPGAIPLLEQLPIGCRLGPRCPY AQRECIVTPRLTGAKNHLYACHFPLNMEKE >gi|296493283|gb|ADTK01000218.1| GENE 20 18585 - 19475 956 296 aa, chain - ## HITS:1 COG:sapC KEGG:ns NR:ns ## COG: sapC COG4171 # Protein_GI_number: 16129253 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Escherichia coli K12 # 1 296 1 296 296 535 100.0 1e-152 MPYDSVYSEKRPPGTLRTAWRKFYSDASAMVGLYGCAGLAVLCIFGGWFAPYGIDQQFLG 
YQLLPPSWSRYGEVSFFLGTDDLGRDVLSRLLSGAAPTVGGAFVVTLAATICGLVLGTFA GATHGLRSAVLNHILDTLLAIPSLLLAIIVVAFAGPSLSHAMFAVWLALLPRMVRSIYSM VHDELEKEYVIAARLDGASTLNILWFAVMPNITAGLVTEITRALSMAILDIAALGFLDLG AQLPSPEWGAMLGDALELIYVAPWTVMLPGAAIMISVLLVNLLGDGVRRAIIAGVE >gi|296493283|gb|ADTK01000218.1| GENE 21 19462 - 20427 588 321 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 1 318 1 317 320 231 35 1e-59 MIIFTLRRILLLIVTLFLLTFVGFSLSYFTPHAPLQGASLWNAWVFWFNGLIHWDFGVSS INGQPIAEQLKEVFPATMELCILAFGFALIVGIPVGMIAGITRHKWQDNLINAIALLGFS IPVFWLALLLTLFCSLTLGWLPVSGRFDLLYEVKPITGFALIDAWLSDSPWRDEMIMSAI RHMILPVITLSVAPTTEVIRLMRISTIEVYDQNYVKAAATRGLSRFTILRRHVLHNALPP VIPRLGLQFSTMLTLAMITEMVFSWPGLGRWLINAIRQQDYAAISAGVMVCGSLVIIVNV ISDILGAMANPLKHKEWYALR >gi|296493283|gb|ADTK01000218.1| GENE 22 20424 - 22067 1247 547 aa, chain - ## HITS:1 COG:sapA KEGG:ns NR:ns ## COG: sapA COG4166 # Protein_GI_number: 16129255 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Escherichia coli K12 # 1 547 1 547 547 1072 99.0 0 MRQVLSSLLVIAGLVSGQAIAAPESPPHADIRDSGFVYCVSGQVNTFNPSKASSGLIVDT LAAQFYDRLLDVDPYTYRLMPELAESWEVLDNGATYRFHLRRDVPFQKTDWFTPTRKMNA DDVVFTFQRIFDRNNPWHNVNGSNFPYFDSLQFTDNVKSVRKLDNHTVEFRLAQPDASFL WHLATHYASVMSAEYARKLEKEDRQEQLDRQPVGTGPYQLSEYRAGQFIRLQRHDDFWRG KPLMPQVVVDLGSGGTGRLSKLLTGECDVLAWPAASQLSILRDDPRLRLTLRPGMNVAYL AFNTAKPPLNNPAVRHALALAINNQRLMQSIYYGTAETAASILPRASWAYDNEAKITEYD PAKSREQLKSLGLENLTLKLWVPTRSQAWNPSPLKTAELIQADMAQVGVKVVIVPVEGRF QEARLMDMSHDLTLSGWATDSNDPDSFFRPLLSCAAIHSQTNLAHWCDPKFDSVLRKALS SQQLAARIEAYDEAQSILAQELPILPLASSLRLQAYRYDIKGLVLSPFGNASFAGVYREK QDEVKKP >gi|296493283|gb|ADTK01000218.1| GENE 23 22380 - 22625 314 81 aa, chain - ## HITS:1 COG:no KEGG:Z2493 NR:ns ## KEGG: Z2493 # Name: ymjA # Def: hypothetical protein # Organism: E.coli_O157 # Pathway: not_defined # 1 81 1 81 81 141 98.0 9e-33 MNHDIPLKYFDIADEYATECAEPVAEAERTPLAHYFQLLLTRLMNNEEISEEAQHEMAAE 
AGINPVRIDEIAEFLNQWGNE >gi|296493283|gb|ADTK01000218.1| GENE 24 22759 - 24144 1474 461 aa, chain - ## HITS:1 COG:ECs1873 KEGG:ns NR:ns ## COG: ECs1873 COG0531 # Protein_GI_number: 15831127 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli O157:H7 # 1 461 19 479 479 860 99.0 0 MAINSPLNIAAQPGKTRLRKSLKLWQVVMMGLAYLTPMTVFDTFGIVSGISDGHVPASYL LALAGVLFTAISYGKLVRQFPEAGSAYTYAQKSINPHVGFMVGWSSLLDYLFLPMINVLL AKIYLSALFPEVPPWVWVVTFVAILTAANLKSVNLVANFNTLFVLVQISIMVVFIFLVVQ GLHKGEGVGTVWSLQPFISENAHLIPIITGATIVCFSFLGFDAVTTLSEETPDAARVIPK AIFLTAVYGGVIFIAASFFMQLFFPDISRFKDPDAALPEIALYVGGKLFQSIFLCTTFVN TLASGLASHASVSRLLYVMGRDNVFPERVFGYVHPKWRTPALNVIMVGIVALSALFFDLV TATALINFGALVAFTFVNLSVFNHFWRRKGMNKSWKDHFHYLLMPLVGALTVGVLWVNLE STSLTLGLVWASLGGAYLWYLIRRYRKVPLYEGDRTPVSET >gi|296493283|gb|ADTK01000218.1| GENE 25 24447 - 25865 1291 472 aa, chain - ## HITS:1 COG:ycjK KEGG:ns NR:ns ## COG: ycjK COG0174 # Protein_GI_number: 16129258 # Func_class: E Amino acid transport and metabolism # Function: Glutamine synthetase # Organism: Escherichia coli K12 # 1 472 27 498 498 971 100.0 0 METNIVEVENFVQQSEERRGSAFTQEVKRYLERYPNTQYVDVLLTDLNGCFRGKRIPVSS LKKLEKGCYFPASVFAMDILGNVVEEAGLGQEMGEPDRTCVPVLGSLTPSAADPEFIGQM LLTMVDEDGAPFDVEPRNVLNRLWQQLRQRGLFPVVAVELEFYLLDRQRDAEGYLQPPCA PGTDDRNTQSQVYSVDNLNHFADVLNDIDELAQLQLIPADGAVAEASPGQFEINLYHTDN VLEACDDALALKRLVRLMAEKHKMHATFMAKPYEEHAGSGMHIHISMQNNRGENVLSDAE GEDSPLLKKMLAGMIDLMPSSMALLAPNVNSYRRFQPGMYVPTQASWGHNNRTVALRIPC GDRHNHRVEYRVAGADANPYLVMAAIFAGILHGLDNELPLQEEVEGNGLEQEGLPFPIRQ SDALGEFIENDHLRRYLGERFCHVYHACKNDELLQFERLITETEIEWMLKNA >gi|296493283|gb|ADTK01000218.1| GENE 26 26089 - 26841 247 250 aa, chain + ## HITS:1 COG:ycjL KEGG:ns NR:ns ## COG: ycjL COG2071 # Protein_GI_number: 16129259 # Func_class: R General function prediction only # Function: Predicted glutamine amidotransferases # Organism: Escherichia coli K12 # 1 250 9 258 258 488 99.0 1e-138 MNNPVIGVVMCRNRLKGHATQTLQEKYLNAIIHAGGLPIALPHALAEPSLLEQLLPKLDG 
IYLPGSPSNVQPHLYGENGDEPDADPGRDLLSMALINAALERRIPIFAICRGLQELVVAT GGSLHRKLCEQPELLEHREDPELPVEQQYAPSHEVQVEEGGLLSALLPECSNFWVNSLHG QGAKVVSPRLRVEARSPDGLVEAVSVINHPFALGVQWHPEWNSSEYALSRILFEGFITAC QHHIAEKQRL >gi|296493283|gb|ADTK01000218.1| GENE 27 26868 - 27425 511 185 aa, chain + ## HITS:1 COG:ycjC KEGG:ns NR:ns ## COG: ycjC COG1396 # Protein_GI_number: 16129260 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 185 1 185 185 344 100.0 5e-95 MSDEGLAPGKRLSEIRQQQGLSQRRAAELSGLTHSAISTIEQDKVSPAISTLQKLLKVYG LSLSEFFSEPEKPDEPQVVINQDDLIEMGSQGVSMKLVHNGNPNRTLAMIFETYQPGTTT GERIKHQGEEIGTVLEGEIVLTINGQDYHLVAGQSYAINTGIPHSFSNTSAGICRIISAH TPTTF >gi|296493283|gb|ADTK01000218.1| GENE 28 27700 - 29187 1690 495 aa, chain + ## HITS:1 COG:aldH KEGG:ns NR:ns ## COG: aldH COG1012 # Protein_GI_number: 16129261 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Escherichia coli K12 # 1 495 1 495 495 972 98.0 0 MNFHHLAYWQDKALSLAIENRLFINGEYTAAAENETFETVDPVTQAPLAKIARGKSVDID RAVSAARGVFERGDWSLSSPAKRKAVLNKLADLMEAHAEELALLETLDTGKPIRHSLRDD IPGAARAIRWYAEAIDKVYGEVATTSSHELAMIVREPVGVIAAIVPWNFPLLLTCWKLGP ALAAGNSVILKPSEKSPLSAIRLAGLAKEAGLPDGVLNVVTGFGHEAGQALSRHNDIDAI AFTGSTRTGKQLLKDAGDSNMKRVWLEAGGKSANLVFADCPDLQKAASATAAGIFYNQGQ VCIAGTRLLLEESIADEFLALLKQQAQNWQPGHPLDPATTMGTLIDCAHADSVHSFIREG ESKGQLLLDGRNAELAAAIGPTIFVDVDPNASLSREEIFGPVLVVTRFTSEDQALQLAND SQYGLGAAVWTRDLSRAHRMSRRLKAGSVFVNNYNDGDMTVPFGGYKQSGNGRDKSLHAL EKFTELKTIWISLEA >gi|296493283|gb|ADTK01000218.1| GENE 29 29189 - 30469 1501 426 aa, chain + ## HITS:1 COG:ordL KEGG:ns NR:ns ## COG: ordL COG0665 # Protein_GI_number: 16129262 # Func_class: E Amino acid transport and metabolism # Function: Glycine/D-amino acid oxidases (deaminating) # Organism: Escherichia coli K12 # 1 426 1 426 426 870 99.0 0 MTEHTSSYYAASANKYAPFDTLNESISCDVCVVGGGYTGLSSALHLAEAGFDVVVLEASR IGFGASGRNGGQLVNSYSRDIDVIEKSYGMDTARMLGSMMFEGGEIIRERIKRYQIDCDY 
RPGGLFVAMNDKQLATLEEQKENWERYGNKQLELLDANAIRREVASDRYTGALLDHSGGH IHPLNLAIGEADAIRLNGGRVYELSAVTQIQHTTPAVVRTAKGQVTAKYVIVAGNAYLGD KVEPELAKRSMPCGTQVITTERLSEDLARSLIPKNYCVEDCNYLLDYYRLTADNRLLYGG GVVYGARDPDDVERLVVPKLLKTFPQLKGVKIDYRWTGNFLLTLSRMPQFGRLDTNIYYM QGYSGHGVTCTHLAGRLIAELLRGDAERFDAFANLPHYPFPGGRTLRVPFTAMGAAYYSL RDRLGV >gi|296493283|gb|ADTK01000218.1| GENE 30 30507 - 31772 1144 421 aa, chain + ## HITS:1 COG:goaG KEGG:ns NR:ns ## COG: goaG COG0160 # Protein_GI_number: 16129263 # Func_class: E Amino acid transport and metabolism # Function: 4-aminobutyrate aminotransferase and related aminotransferases # Organism: Escherichia coli K12 # 1 421 1 421 421 793 99.0 0 MSNNEFHQRRLSATPRGVGVMCNFFAQSAENATLKDVEGNEYIDFAAGIAVLNTGHRHPD LVAAVEQQLQQFTHTAYQIVPYESYVTLAEKINALAPVSGQAKTAFFTTGAEAVENAVKI ARAHTGRPGVIAFSGGFHGRTYMTMALTGKVAPYKIGFGPFPGSVYHVPYPSDLHGISTQ DSLDAIERLFKSDIEAKQVAAIIFEPVQGEGGFNVAPKELVAAIRRLCDEHGIVMIADEV QSGFARTGKLFAMDHYADKPDLMTMAKSLAGGMPLSGVVGNANIMDAPAPGGLGGTYAGN PLAVAAAHAVLNIIDKESLCERANQLGQRLTNTLIDAKESVPAIAAVRGLGSMIAAEFND PQTGEPSAAIAQKIQQRALAQGLLLLTCGAYGNVIRFLYPLTIPDAQFDAAMKILQDALR D >gi|296493283|gb|ADTK01000218.1| GENE 31 31903 - 32880 870 325 aa, chain - ## HITS:1 COG:ECs1880 KEGG:ns NR:ns ## COG: ECs1880 COG1221 # Protein_GI_number: 15831134 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulators containing an AAA-type ATPase domain and a DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 325 6 330 330 636 99.0 0 MAEYKDNLLGEANSFLEVLEQVSHLAPLDKPVLIIGERGTGKELIASRLHYLSSRWQGPF ISLNCAALNENLLDSELFGHEAGAFTGAQKRHPGRFERADGGTLFLDELATAPMMVQEKL LRVIEYGELERVGGSQPLQVNVRLVCATNADLPAMVNEGTFRADLLDRLAFDVVQLPPLR ERESDIMLMAEHFAIQMCREIKLPLFPGFTEHARETLLNYRWPGNIRELKNVVERSVYRH GTSDYPLDDIIIDPFKRRPPEEAIAVSENTSLPTLPLDLREFQMQQEKELLQLSLQQGKY NQKRAAELLGLTYHQFRALLKKHQI >gi|296493283|gb|ADTK01000218.1| GENE 32 33047 - 33715 1025 222 aa, chain + ## HITS:1 COG:ECs1881 KEGG:ns NR:ns ## COG: ECs1881 COG1842 # Protein_GI_number: 15831135 # Func_class: K 
Transcription; T Signal transduction mechanisms # Function: Phage shock protein A (IM30), suppresses sigma54-dependent transcription # Organism: Escherichia coli O157:H7 # 1 222 1 222 222 280 99.0 1e-75 MGIFSRFADIVNANINALLEKAEDPQKLVRLMIQEMEDTLVEVRSTSARALAEKKQLTRR IEQASAREVEWQEKAELALLKDREDLARAALIEKQKLTDLIKSLEHEVTLVDDTLARMKK EIGELENKLSETRARQQALMLRHQAANSSRDVRRQLDSGKLDEAMARFESFERRIDQMEA EAESHSFGKQKSLDDQFAELKADDAISEQLAQLKAKMKQDNQ >gi|296493283|gb|ADTK01000218.1| GENE 33 33769 - 33993 308 74 aa, chain + ## HITS:1 COG:no KEGG:ECDH10B_1422 NR:ns ## KEGG: ECDH10B_1422 # Name: pspB # Def: phage shock protein B # Organism: E.coli_DH10B # Pathway: not_defined # 1 74 1 74 74 127 100.0 9e-29 MSALFLAIPLTIFVLFVLPIWLWLHYSNRSGRSELSQSEQQRLAQLADEAKRMRERIQAL ESILDAEHPNWRDR >gi|296493283|gb|ADTK01000218.1| GENE 34 33993 - 34352 538 119 aa, chain + ## HITS:1 COG:ECs1883 KEGG:ns NR:ns ## COG: ECs1883 COG1983 # Protein_GI_number: 15831137 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Putative stress-responsive transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 119 1 119 119 228 100.0 3e-60 MAGINLNKKLWRIPQQGMVRGVCAGIANYFDVPVKLVRILVVLSIFFGLALFTLVAYIIL SFALDPMPDNMAFGEQLPSSSELLDEVDRELAASETRLREMERYVTSDTFTLRSRFRQL >gi|296493283|gb|ADTK01000218.1| GENE 35 34361 - 34582 211 73 aa, chain + ## HITS:1 COG:no KEGG:ECDH10B_1424 NR:ns ## KEGG: ECDH10B_1424 # Name: pspD # Def: peripheral inner membrane phage-shock protein # Organism: E.coli_DH10B # Pathway: not_defined # 1 73 1 73 73 112 100.0 6e-24 MNTRWQQAGQKVKPGFKLAGKLVLLTALRYGPAGVAGWAIKSVARRPLKMLLAVALEPLL SRAANKLAQRYKR >gi|296493283|gb|ADTK01000218.1| GENE 36 34657 - 34971 418 104 aa, chain + ## HITS:1 COG:pspE KEGG:ns NR:ns ## COG: pspE COG0607 # Protein_GI_number: 16129269 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Escherichia coli K12 # 1 104 1 104 104 206 100.0 1e-53 MFKKGLLALALVFSLPVFAAEHWIDVRVPEQYQQEHVQGAINIPLKEVKERIATAVPDKN 
DTVKVYCNAGRQSGQAKEILSEMGYTHVENAGGLKDIAMPKVKG >gi|296493283|gb|ADTK01000218.1| GENE 37 35157 - 36863 1044 568 aa, chain + ## HITS:1 COG:ycjM KEGG:ns NR:ns ## COG: ycjM COG0366 # Protein_GI_number: 16129270 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Escherichia coli K12 # 1 568 1 568 568 1183 99.0 0 MGPLPRKGNMKQKITDHLDEIYGGTFTATHLQKLVTRLESAKRLITQRRKKHWDESDVVL ITYADQFHSNDLKPLPTFNQFYHQWLQSIFSHVHLLPFYPWSSDDGFSVIDYQQVASEAG EWQDIQQLGECSHLMFDFVCNHMSAKSEWFKNYLQQHPGFEDFFIAVDPQTDLSAVTRPR ALPLLTPFQMRDHSTRHLWTTFSDDQIDLNYRSPEVLLAMVDVLLCYLAKGAEYVRLDAV GFMWKEPGTSCIHLEKTHLIIKLLRSIIDNVAPGTVIITETNVPHKDNIAYFGAGDDEAH MVYQFSLPPLVLHAVQKQNVEALCAWAQNLTLPSSNTTWFNFLASHDGIGLNPLRGLLPE SEILELVEALQQEGALVNWKNNPDGTRSPYEINVTYMDALSRRESSDEERCARFILAHAI LLSFPGVPAIYIQSILGSRNDYAGVEKLGYNRAINRKKYHSKEITRELNDEATLRHAVYH ELSRLITLRRSHNEFHPDNNFTIDTINSSVMRIQRSNADGNCLTGLFNVSKNIQHVNITN LHGRDLISEVDILGNEITLRPWQVMWIK >gi|296493283|gb|ADTK01000218.1| GENE 38 36877 - 38169 1519 430 aa, chain + ## HITS:1 COG:ycjN KEGG:ns NR:ns ## COG: ycjN COG1653 # Protein_GI_number: 16129271 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli K12 # 1 430 1 430 430 814 99.0 0 MIKSKIVLLSALVSCALISGCKEENKTNVSIEFMHSSVEQERQAVISKLIARFEKENPGI TVKQVPVEEDAYNTKVITLSRSGSLPEVIETSHDYAKVMDKEQLIDRKAVATVISNVGEG AFYDGVLRIVRTEDGSAWTGVPVSAWIGGIWYRKDVLAKAGLEEPKNWQQLLDVAQKLND PANKKYGIALPTAESVLTEQSFSQFALSNQANVFNAEGKITLDTPEMMQALTYYRDLAAN TMPGSNDIMEVKDAFMNGTAPMAIYSTYILPAVIKEGDPKNVGFVVPTEKNSAVYGMLTS LTITAGQKTEETEAAEKFVTFMEQADNIADWVMMSPGAALPVNKAVVTTATWKDNDVIKA LGELPNQLIGELPNIQVFGAVGDKNFTRMGDVTGSGVVSSMVHNVTVGKADLPGTLQASQ KKLDELVEQH >gi|296493283|gb|ADTK01000218.1| GENE 39 38190 - 39071 962 293 aa, chain + ## HITS:1 COG:ECs1890 KEGG:ns NR:ns ## COG: ECs1890 COG1175 # Protein_GI_number: 15831144 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # 
Organism: Escherichia coli O157:H7 # 1 293 1 293 293 487 99.0 1e-138 MNRLFSGRSDMPFALLLLAPSLLLLGGLVAWPMVSNIEISFLRLPLNPNIESTFVGVSNY VRILSDPGFWHSLWMTVWYTALVVAGSTVLGLAVAMFFNREFRLRKTARSLVILSYVTPS ISLVFAWKYMFNNGYGIVNYLGVDLLHLYEQAPLWFDNPGSSFVLVVLFAIWRYFPYAFI SFLAILQTIDKSLYEAAEMDGANAWQRFRIVTLPAIMPVLATVVTLRTIWMFYMFADVYL LTTKVDILGVYLYKTAFAFNDLGKAAAISVVLFIIIFAVILLTRKRVNLNGNK >gi|296493283|gb|ADTK01000218.1| GENE 40 39058 - 39900 930 280 aa, chain + ## HITS:1 COG:ycjP KEGG:ns NR:ns ## COG: ycjP COG0395 # Protein_GI_number: 16129273 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Escherichia coli K12 # 1 280 1 280 280 462 100.0 1e-130 MATNKRTLSRIGFYCGLALFLIITLFPFFVMLMTSFKGAKEAISLHPTLLPQQWTLEHYV DIFNPMIFPFVDYFRNSLVVSVVSSVVAVFLGILGAYALSRLRFKGRMTINASFYTVYMF SGILLVVPLFKIITALGIYDTEMALIITMVTQTLPTAVFMLKSYFDTIPDEIEEAAMMDG LNRLQIIFRITVPLAMSGLISVFVYCFMVAWNDYLFASIFLSSASNFTLPVGLNALFSTP DYIWGRMMAASLVTALPVVIMYALSERFIKSGLTAGGVKG >gi|296493283|gb|ADTK01000218.1| GENE 41 39931 - 40983 1240 350 aa, chain + ## HITS:1 COG:ycjQ KEGG:ns NR:ns ## COG: ycjQ COG1063 # Protein_GI_number: 16129274 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 350 1 350 350 697 99.0 0 MKKLVATAPRVAALVEYEDRAILANEVKIRVRFGAPKHGTEVVDFRAASPFIDEDFNGEW QMFTPRPADAPRGIEFGKFQLGNMVVGDIIECGSDVTDYAVGDSVCGYGPLSETVIINAV NNYKLRKMPQGSSWKNAVCYDPAQFAMSGVRDANVRVGDFVVVVGLGAIGQIAIQLAKRA GASVVIGVDPIAYRCDIARRHGADFCLNPIGTDVGKEIKTLTGKQGADVIIETSGYADAL QSALRGLAYGGTISYVAFAKPFAEGFNLGREAHFNNAKIVFSRACSEPNPDYPRWSRKRI EETCWELLMNGYLNCEDLIDPVVTFANSPESYMQYVDQHPEQSIKMGVTF >gi|296493283|gb|ADTK01000218.1| GENE 42 41001 - 41789 976 262 aa, chain + ## HITS:1 COG:ycjR KEGG:ns NR:ns ## COG: ycjR COG1082 # Protein_GI_number: 16129275 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate 
isomerases/epimerases # Organism: Escherichia coli K12 # 1 262 4 265 265 545 99.0 1e-155 MKIGTQNQAFFPENILEKFRYIKEMGFDGFEIDGKLLVNNLEEVKAAIKETGLPVTTACG GYDGWIGDFIEERRLNGLKQIERILEALAEVGGKGIVVPAAWGMFTFRLPPMTSPRSLDG DRKMVSDSLRVLEQVAARTGTVVYLEPLNRYQDHMINTLADARRYIVENDLKHVQIIGDF YHMNIEEDNLAQALHDNRDLLGHVHIADNHRYQPGSGTLDFHALFEQLRADNYQGYVVYE GRIRAEDPAQAYRDSLAWLRTC >gi|296493283|gb|ADTK01000218.1| GENE 43 41811 - 42854 892 347 aa, chain + ## HITS:1 COG:ycjS KEGG:ns NR:ns ## COG: ycjS COG0673 # Protein_GI_number: 16129276 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Escherichia coli K12 # 1 347 5 351 351 730 99.0 0 MTSSPLRVAIIGAGQVADKVHASYYCTRNDLELVAVCDSRLSQAQALAEKYGNASVWDDP QAMLLAVKPDVVSVCSPNRFHYEHTLMALEAGCHVMCEKPPAMTPEQAREMCDTARKLGK VLAYDFHHRFALDTQQLRDQVTNGVLGEIYVTTARALRRCGVPGWGVFTNKELQGGGPLI DIGIHMLDAAMYVLGFPAVKSVNAHSFQKIGTQKSCGQFGEWDPATYSVEDSLFGTIEFH NGGILWLETSFALNIREQSIMNVSFCGDKAGATLFPAHIYTDNNGELMTLMQRELADDNR HLRSMEAFINHVQGKPVMIADAEQGYIIQQLVAALYQSAETGTRVEL >gi|296493283|gb|ADTK01000218.1| GENE 44 42920 - 45118 1542 732 aa, chain + ## HITS:1 COG:ECs1895 KEGG:ns NR:ns ## COG: ECs1895 COG1554 # Protein_GI_number: 15831149 # Func_class: G Carbohydrate transport and metabolism # Function: Trehalose and maltose hydrolases (possible phosphorylases) # Organism: Escherichia coli O157:H7 # 1 732 24 755 755 1462 97.0 0 MAQGNGYLGLRASHEEDYTRQTRGMYLAGLYHRAGKGEINELVNLPDILGMEIAINGEVF SLSREAWQRELDFASGELRRSVVWRTSNGTGYTIASRRFVSADQLPLIVLEITITPLDAD ASVLISTGIDATQTNHGRQHLDETQVRVFGQHLMQGIYTTQDGRSDVAISCCCMVSGDVQ QCYTAKERRLQQHTSAQLHAGETVTLQKLVWIDWRDDRQAVLDEWGSASLRQLEMCAQQS YDQLLAVSTENWRQWWQKRRITVNGGEAHDQQALDYALYHLRIMTPAHDERSSIAAKGLT GEGYKGHVFWDTEVFLLPFHLFSDPTVARSLLRYRWHNLPGAQEKARRNGWQGALFPWES ARSGEEETPEFAAINIRTGLRQKVASAQAEHHLVADIAWAVIQYWQTTGDESFIAHEGMA LLLETAKFWISRAVRVNDRLEIHDVIGPDEYTEHVNNNAFTSYMAYYNVQQALSIARQFG CSDDAFIHRAEMFLKELRLPEIQPDGVLPQDDSFMAKPAINLAKYKAAAGKQTILLDYSR 
AEVNEMQILKQADVVMLNYMLPEQFSAASCLANLQFYEPRTIHDSSLSKAIHGIVAARCG LLTQSYQFWREGTEIDLGADPHSCDDGIHAAATGAIWLGAIQGFAGVSVRDGELHLNPAL PEQWQQLSFPLFWQGCELQVTLDAQRIAIRTSAPVSLRLNGQLISVAEESVFCLGDFILP FNGTATTHQEDE >gi|296493283|gb|ADTK01000218.1| GENE 45 45115 - 45774 588 219 aa, chain + ## HITS:1 COG:ycjU KEGG:ns NR:ns ## COG: ycjU COG0637 # Protein_GI_number: 16129278 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Escherichia coli K12 # 1 219 1 219 219 421 98.0 1e-118 MKLQGVIFDLDGVITDTAHLHFQAWQQIAAEIGISIDAQFNESLKGISRDESLRRILQHG GKEGDFNSQERAQLAYRKNLLYVHSLRELTVSAVLPGIRSLLEDLRAQQIPVGLASVSLN APTILAALELREFFTFCADASQLKNSKPDPEIFLAACAGLGVPPQACIGIEDAQAGIDAI NASGMRSVGIGAGLTGAQLLLPSTESLTWPRLSAFWQNV >gi|296493283|gb|ADTK01000218.1| GENE 46 45788 - 46870 1062 360 aa, chain + ## HITS:1 COG:ECs1897 KEGG:ns NR:ns ## COG: ECs1897 COG3839 # Protein_GI_number: 15831151 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Escherichia coli O157:H7 # 1 360 1 360 360 712 98.0 0 MAQLSLQHIQKIYDNQVHVVKDFNLEIADKEFIVFVGPSGCGKSTTLRMIAGLEEISGGD LLIDGKRMNDVPAKARNIAMVFQNYALYPHMTVYDNMAFGLKMQKIAKEVIDERVNWAAQ ILGLREYLKRKPGALSGGQRQRVALGRAIVREAGVFLMDEPLSNLDAKLRVQMRAEISKL HQKLNTTMIYVTHDQTEAMTMATRIVIMKDGIVQQVGAPKTVYNQPANMFVAGFIGSPAM NFIRGTIDGDKFVTETLKLTIPEEKLAVLKTQGSLHKPIVMGIRPEDIHPDAQEENNISA KISVAELTGAEFMLYTTVGGHELVVRAGALNDYHAGENITIHFDMTKCHFFDAETEIALR >gi|296493283|gb|ADTK01000218.1| GENE 47 46915 - 47820 957 301 aa, chain + ## HITS:1 COG:no KEGG:B21_01307 NR:ns ## KEGG: B21_01307 # Name: ompG # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 301 1 301 301 595 99.0 1e-169 MKKLLPCTALVMCAGMACAQAEEKNDWHFNIGAMYEIENVEGYGEDMDGLAEPSVYFNAA NGPWRIALAYYQEGPVDYSAGKRGTWFDRPELEVHYQFLENDDFSFGLTGGFRNYGYHYV DEPGKDTANMQRWKIAPDWDVKLTDDLRFNGWLSMYKFANDLNTTGYADTRVETETGLQY TFNETVALRVNYYLERGFNMDDSRNNGEFSTQEIRAYLPLTLGNHSVTPYTRIGLDRWSN 
WDWQDDIEREGHDFNRVGLFYGYDFQNGLSVSLEYAFEWQDHDEGDSDKFHYAGVGVNYS F >gi|296493283|gb|ADTK01000218.1| GENE 48 47931 - 48929 830 332 aa, chain - ## HITS:1 COG:ycjW KEGG:ns NR:ns ## COG: ycjW COG1609 # Protein_GI_number: 16129281 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 332 1 332 332 635 99.0 0 MSPTIYDIARVAGVSKSTVSRVLNKQTNISPEAREKVLRAIEELQYQPNKLARALTSSGF DAIMVISTRSTKTTAGNPFFSEVLHAITAKAEEEGFDVILQTSHNPAEDLQKCESKIKQK MIKGIIMLSSPADESFFAQLDKYDIPVVVIGKVEGQYAHVYSVDTDNFGDSIALTDALIE SGHQNIACLHAPLDVHVSVDRVNGYKQSLAAHNIAVRDEWIVDGGYTHETALQAARELLS QSPLPEAVFATDSLKLMSIYRAAAEKNIAIPQQLAVVGYSNETLSFILTPAPGGIDIPTQ ELGQQSCELLFRLISGKPSPQNITVATHMTLK >gi|296493283|gb|ADTK01000218.1| GENE 49 49084 - 50481 1204 465 aa, chain + ## HITS:1 COG:ECs1900 KEGG:ns NR:ns ## COG: ECs1900 COG3106 # Protein_GI_number: 15831154 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Escherichia coli O157:H7 # 1 465 1 465 465 961 99.0 0 MKRLKNELNALVNRGVDRHLRLAVTGLSRSGKTAFITAMVNQLLNIHAGARLPLLSAVRE ERLLGVKRIPQRDFGIPRFTYDEGLAQLYGDPPAWPTPTRGVSEIRLALRFKSNDSLLRH FKDTSTLYLEIVDYPGEWLLDLPMLAQDYLSWSRQMTGLLNGQRGEWSAKWRMMCEGLDP LAPADENRLADIAAAWTDYLHHCKQQGLHFIQPGRFVLPGDMAGAPALQFFPWPDVDTWG ESKLAQADKHTNAGMLRERFNYYCEKVVKGFYKNHFLRFDRQIVLVDCLQPLNSGPQAFN DMRLALTQLMQSFHYGQRTLFRRLFSPVIDKLLFAATKADHVTIDQHANMVSLLQQLIQD AWQNAAFEGISMDCLGLASVQATTSGIIDVNGEKIPALRGNRLSDGAPLTVYPGEVPARL PGQAFWDKQGFQFEAFRPQVMDVDKPLPHIRLDAALEFLIGDKLR >gi|296493283|gb|ADTK01000218.1| GENE 50 50478 - 51539 1221 353 aa, chain + ## HITS:1 COG:ycjF KEGG:ns NR:ns ## COG: ycjF COG3768 # Protein_GI_number: 16129283 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 353 1 353 353 656 99.0 0 MTELLKPRIDFDGPLEVDQNPKFRAQQTFDENQAQNFAPATLDEAQEEEGQVEAVMDAAL RPKRSLWRKMVMGGLALFGASVVGQGVQWTMNAWQTQDWVALGGCAAGALIIGAGVGSVV TEWRRLWRLRQRAHERDEARDLLHSHGTGKGRAFCEKLAQQAGIDQSHPALQRWYASIHE 
TQNDREVVSLYAHLVQPVLDAQARREISRSAAESTLMIAVSPLALVDMAFIAWRNLRLIN RIATLYGIELGYYSRLRLFKLVLLNIAFAGASELVREVGMDWMSQDLAARLSTRAAQGIG AGLLTARLGIKAMELCRPLPWIDDDKPRLGDFRRQLIGQVKETLQKGKTPSEK >gi|296493283|gb|ADTK01000218.1| GENE 51 51687 - 53228 1477 513 aa, chain + ## HITS:1 COG:tyrR KEGG:ns NR:ns ## COG: tyrR COG3283 # Protein_GI_number: 16129284 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulator of aromatic amino acids metabolism # Organism: Escherichia coli K12 # 1 513 1 513 513 1014 100.0 0 MRLEVFCEDRLGLTRELLDLLVLRGIDLRGIEIDPIGRIYLNFAELEFESFSSLMAEIRR IAGVTDVRTVPWMPSEREHLALSALLEALPEPVLSVDMKSKVDMANPASCQLFGQKLDRL RNHTAAQLINGFNFLRWLESEPQDSHNEHVVINGQNFLMEITPVYLQDENDQHVLTGAVV MLRSTIRMGRQLQNVAAQDVSAFSQIVAVSPKMKHVVEQAQKLAMLSAPLLITGDTGTGK DLFAYACHQASPRAGKPYLALNCASIPEDAVESELFGHAPEGKKGFFEQANGGSVLLDEI GEMSPRMQAKLLRFLNDGTFRRVGEDHEVHVDVRVICATQKNLVELVQKGMFREDLYYRL NVLTLNLPPLRDCPQDIMPLTELFVARFADEQGVPRPKLAADLNTVLTRYAWPGNVRQLK NAIYRALTQLDGYELRPQDILLPDYDAATVAVGEDAMEGSLDEITSRFERSVLTQLYRNY PSTRKLAKRLGVSHTAIANKLREYGLSQKKNEE >gi|296493283|gb|ADTK01000218.1| GENE 52 53272 - 53778 650 168 aa, chain - ## HITS:1 COG:ECs1903 KEGG:ns NR:ns ## COG: ECs1903 COG2077 # Protein_GI_number: 15831157 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Escherichia coli O157:H7 # 1 168 1 168 168 310 100.0 9e-85 MSQTVHFQGNPVTVANSIPQAGSKAQTFTLVAKDLSDVTLGQFAGKRKVLNIFPSIDTGV CAASVRKFNQLATEIDNTVVLCISADLPFAQSRFCGAEGLNNVITLSTFRNAEFLQAYGV AIADGPLKGLAARAVVVIDENDNVIFSQLVDEITTEPDYEAALAVLKA >gi|296493283|gb|ADTK01000218.1| GENE 53 53897 - 54862 818 321 aa, chain + ## HITS:1 COG:ycjG KEGG:ns NR:ns ## COG: ycjG COG4948 # Protein_GI_number: 16129286 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Escherichia coli K12 # 1 321 15 335 335 590 99.0 1e-168 
MRTVKVFEEAWPLHTPFVIARGSRSEARVVVVELEEEGIKGTGECTPYPRYGESDASVMA QIMSVVSQLEKGLTREELQKILPAGAARNALDCALWDLAARKQQQSLADLIGITLPETVI TAQTVVIGTPDQMANSASTLWQAGAKLLKVKLDNHLISERMVAIRTAVPDATLIVDANES WRAEGLAARCQLLADLGVAMLEQPLPAQDDAALENFIHPLPICADESCHTRSNLKALKGR YEMVNIKLDKTGGLTEALALATEARAQGFSLMLGCMLCTSRAISAALPLVPQVSFADIDG PTWLAVDVEPALQFTTGELHL >gi|296493283|gb|ADTK01000218.1| GENE 54 54837 - 55565 390 242 aa, chain - ## HITS:1 COG:ECs1905 KEGG:ns NR:ns ## COG: ECs1905 COG2866 # Protein_GI_number: 15831159 # Func_class: E Amino acid transport and metabolism # Function: Predicted carboxypeptidase # Organism: Escherichia coli O157:H7 # 1 242 21 262 262 487 99.0 1e-138 MTVTRPRAERGAFPPGTEHYGRSLLGAPLIWFPAPAASRESGLILAGTHGDENSSVVTLS CALRTLTPSLRRHHVVLCANPDGCQLGLRANANGVDLNRNFPAANWKEGETVYRWNSAAE ERDVVLLTGDKPGSEPETQALCQLIHRIQPAWVVSFHDPLACIEDPRHSELGEWLAQAFE LPLVTSVGYETPGSFGSWCADLNLHCITAEFPPISSDEASEKYLFAMANLLRWHPKDAIR PS >gi|296493283|gb|ADTK01000218.1| GENE 55 55666 - 55830 171 54 aa, chain - ## HITS:1 COG:ECs1906 KEGG:ns NR:ns ## COG: ECs1906 COG0702 # Protein_GI_number: 15831160 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate-sugar epimerases # Organism: Escherichia coli O157:H7 # 1 46 9 54 220 92 97.0 1e-19 MKKVLILGAGGQIARHVINQLADKQTIKQTLFARQPAKIHKPYPTNKMQTTSGK >gi|296493283|gb|ADTK01000218.1| GENE 56 55901 - 56818 1003 305 aa, chain - ## HITS:1 COG:ycjY KEGG:ns NR:ns ## COG: ycjY COG1073 # Protein_GI_number: 16129288 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Escherichia coli K12 # 1 305 6 310 310 581 98.0 1e-166 MNNKVSFTNSNNPTISLSAVIYFPPKFDETRQYQAIVVSHPGGGVKEQTAGTYAKKLAEK GFVTIAYDASYQGESGGEPRQLENPYIRTEDISAVIDYLTTLSYVDNTRIGAMGICAGAG YTANAAIQDRRIKAIGTVSAVNIGSMFRNGWENNVKSIDALPYVEAGSNARTSDISSGEY AVMPLAPMKESDAPNEELRQAWEYYHTPRAQYPTAPGYATLRSLNQIITYDAYHMAEVYL TQPMQIVAGSQAGSKWMSDDLYDRASSQDKRYHIVEGANHMDLYDGKAYVAEAISVLAPF FEETL 
>gi|296493283|gb|ADTK01000218.1| GENE 57 56959 - 57858 237 299 aa, chain + ## HITS:1 COG:ycjZ KEGG:ns NR:ns ## COG: ycjZ COG0583 # Protein_GI_number: 16129289 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 299 1 299 299 566 99.0 1e-161 MKREEIADLMAFVVVAEERSFTRAAARLSMAQSALSQIVRRIEERLGLRLLTRTTRSVVP TEAGEHLLSVLGPMLHDIDSALASLSELQNRPSGTIRITTVEHAAKTILLPAMRTFLKSH PEIDIQLTIDYGLTDVVSERFDAGVRLGGEMDKDMIAIRIGPDIPMAIVGSPDYFSRRSV PTSVSQLIDHQAINLYLPTSGTANRWRLIRGGREVRVRMEGQLLLNTIDLIIDAAIDGHG LAYLPYDQVERAIKEKKLIRVLDKFTPDLPGYHLYYPHRRHAGSAFSLFIDRLKYKGAV >gi|296493283|gb|ADTK01000218.1| GENE 58 58195 - 59808 1412 537 aa, chain + ## HITS:1 COG:mppA KEGG:ns NR:ns ## COG: mppA COG4166 # Protein_GI_number: 16129290 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Escherichia coli K12 # 1 537 8 544 544 1065 100.0 0 MKHSVSVTCCALLVSSISLSYAAEVPSGTVLAEKQELVRHIKDEPASLDPAKAVGLPEIQ VIRDLFEGLVNQNEKGEIVPGVATQWKSNDNRIWTFTLRDNAKWADGTPVTAQDFVYSWQ RLVDPKTLSPFAWFAALAGINNAQAIIDGKATPDQLGVTAVDAHTLKIQLDKPLPWFVNL TANFAFFPVQKANVESGKEWTKPGNLIGNGAYVLKERVVNEKLVVVPNTHYWDNAKTVLQ KVTFLPINQESAATKRYLAGDIDITESFPKNMYQKLLKDIPGQVYTPPQLGTYYYAFNTQ KGPTADQRVRLALSMTIDRRLMTEKVLGTGEKPAWHFTPDVTAGFTPEPSPFEQMSQEEL NAQAKTLLSAAGYGPQKPLKLTLLYNTSENHQKIAIAVASMWKKNLGVDVKLQNQEWKTY IDSRNTGNFDVIRASWVGDYNEPSTFLTLLTSTHSGNISRFNNPAYDKVLAQASTENTVK ARNADYNAAEKILMEQAPIAPIYQYTNGRLIKPWLKGYPINNPEDVAYSRTMYIVKH >gi|296493283|gb|ADTK01000218.1| GENE 59 59859 - 60890 787 343 aa, chain - ## HITS:1 COG:ECs1912 KEGG:ns NR:ns ## COG: ECs1912 COG0668 # Protein_GI_number: 15831166 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Escherichia coli O157:H7 # 1 343 1 343 343 672 100.0 0 MIAELFTNNALNLVIIFGSCAALILMSFWFRRGNRKRKGFLFHAVQFLIYTIIISAVGSI INYVIENYKLKFITPGVIDFICTSLIAVILTIKLFLLINQFEKQQIKKGRDITSARIMSR 
IIKITIIVVLVLLYGEHFGMSLSGLLTFGGIGGLAVGMAGKDILSNFFSGIMLYFDRPFS IGDWIRSPDRNIEGTVAEIGWRITKITTFDNRPLYVPNSLFSSISVENPGRMTNRRITTT IGLRYEDAAKVGVIVEAVREMLKNHPAIDQRQTLLVYFNQFADSSLNIMVYCFTKTTVWA EWLAAQQDVYLKIIDIVQSHGADFAFPSQTLYMDNITPPEQGR >gi|296493283|gb|ADTK01000218.1| GENE 60 61134 - 61391 330 85 aa, chain + ## HITS:1 COG:no KEGG:ECSP_1858 NR:ns ## KEGG: ECSP_1858 # Name: ynaJ # Def: predicted inner membrane protein # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 85 1 85 85 131 98.0 7e-30 MIMAKLKSAKGKKFLFGLLAVFIIAASVVTRATIGGVIEQYNIPMSEWTTSMYVIQSSMI FVYSLVFTVLLAIPLGIYFLGGEEQ >gi|296493283|gb|ADTK01000218.1| GENE 61 61441 - 62391 1024 316 aa, chain - ## HITS:1 COG:ECs1914 KEGG:ns NR:ns ## COG: ECs1914 COG0589 # Protein_GI_number: 15831168 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Escherichia coli O157:H7 # 1 316 1 316 316 634 99.0 0 MAMYQNMLVVIDPNQDDQPALRRAVYLHQRIGGKIKAFLPIYDFSYEMTTLLSPDERTAM RQGVISQRTAWIHEQAKYYLNAGVPIEIKVVWHNRPFEAIIQEVISGGHDLVLKMAHQHD RLEAVIFTPTDWHLLRKCPSPVWMVKDQPWPEGGKALVAVNLASEEPYHNALNEKLVKET IELAEQVNHTEVHLVGAYPVTPINIAIELPEFDPSVYNDAIRGQHLLAMKALRQKFGINE NMTHVEKGLPEEVIPDLAEHLQAGVVVLGTVGRTGISAAFLGNTAEQVIDHLRCDLLVIK PDQYQTPVELDDEEDD >gi|296493283|gb|ADTK01000218.1| GENE 62 62543 - 63295 786 250 aa, chain - ## HITS:1 COG:ECs1915 KEGG:ns NR:ns ## COG: ECs1915 COG0664 # Protein_GI_number: 15831169 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Escherichia coli O157:H7 # 1 250 1 250 250 501 100.0 1e-142 MIPEKRIIRRIQSGGCAIHCQDCSISQLCIPFTLNEHELDQLDNIIERKKPIQKGQTLFK AGDELKSLYAIRSGTIKSYTITEQGDEQITGFHLAGDLVGFDAIGSGHHPSFAQALETSM VCEIPFETLDDLSGKMPNLRQQMMRLMSGEIKGDQDMILLLSKKNAEERLAAFIYNLSRR FAQRGFSPREFRLTMTRGDIGNYLGLTVETISRLLGRFQKSGMLAVKGKYITIENNDALA QLAGHTRNVA >gi|296493283|gb|ADTK01000218.1| GENE 63 63490 - 64005 273 171 aa, chain - 
## HITS:1 COG:ogt KEGG:ns NR:ns ## COG: ogt COG0350 # Protein_GI_number: 16129296 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Escherichia coli K12 # 1 171 1 171 171 348 99.0 2e-96 MLRLLEEKIATPLGPLWVICDEQFRLRAVEWEEYSERMVQLLDIHYRKEGYERISATNPG GLSDKLRDYFAGNLSIIDTLPTATGGTPFQREVWKTLRTIPCGQVMHYGQLAEQLGRPGA ARAVGAANGSNPISIVVPCHRVIGRNGTMTGYAGGVQRKEWLLRHEGYLLL >gi|296493283|gb|ADTK01000218.1| GENE 64 64016 - 65422 1021 468 aa, chain - ## HITS:1 COG:abgT KEGG:ns NR:ns ## COG: abgT COG2978 # Protein_GI_number: 16129297 # Func_class: H Coenzyme transport and metabolism # Function: Putative p-aminobenzoyl-glutamate transporter # Organism: Escherichia coli K12 # 1 468 43 510 510 803 99.0 0 MVTTAILSAFGVSAKNPTDGTPVVVKNLLSVEGLHWFLPNVIKNFSGFAPLGAILALVLG AGLAERVGLLPALMVKMASHVNARYASYMVLFIAFFSHISSDAALVIMPPMGALIFLAVG RHPVAGLLAAIAGVGCGFTANLLIVTTDVLLSGISTEAAAAFNPQMHVSVIDNWYFMASS VVVLTIVGGLITDKIIEPRLGQWQGNSDEKLQTLTESQRFGLRIAGVVSLLFIAAIALMV IPENGILRDPINHTVMPSPFIKGIVPLIILFFFVVSLAYGIATRTIRRQADLPHLMIEPM KEMAGFIVMVFPLAQFVAMFNWSNMGKFIAVGLTDILESSGLSGIPAFVGLALLSSFLCM FIASGSAIWSILAPIFVPMFMLLGFHPAFAQILFRIADSSVLPLAPVSPFVPLFLGFLQR YKPDAKLGTYYSLVLPYPLIFLVVWLLMLLAWYLVGLPIGPGIYPRLS >gi|296493283|gb|ADTK01000218.1| GENE 65 65579 - 67024 993 481 aa, chain - ## HITS:1 COG:abgB KEGG:ns NR:ns ## COG: abgB COG1473 # Protein_GI_number: 16129298 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Escherichia coli K12 # 1 481 1 481 481 946 98.0 0 MQEIYRFIDDAIEADRQRYTDIADQIWDHPETRFEEFWSAEHLASALESASFTVTRNVGN IPNAFIASFGQGKPVIALLGEYDALAGLSQQAGCAQPTSATPGENGHGCGHNLLGTAAFA AAIAVKKWLEQYGQGGTVRFYGCPGEEGGSGKTFMVREGVFDDVDAALTWHPEAFAGMFN TRTLANIQASWCFKGIAAHAANSPHLGRSALDAVTLMTTGTNFLNEHIIEKARVHYAITN SGGISPNVVQAQAEVLYLIRAPEMTDVQHIYDRVAKIAEGAALMTETTVECRFDKACSSY LPNRTLENAMYHALSHFGTPEWNSEELAFAKQIQATLTPNDRQNSLNNIAATGGENGKAF 
ALRHRETVLANEVAPYAATDNVLAASTDVGDVSWKLPVAQCFSPCFAVGTPLHTWQLVSQ GRTSIAHKGMLLAAKTMAATTVNLFIDSGLLQECQQEHQQVTDTQPYHCPIPKNVTPSPL K >gi|296493283|gb|ADTK01000218.1| GENE 66 67024 - 68334 985 436 aa, chain - ## HITS:1 COG:ECs1922 KEGG:ns NR:ns ## COG: ECs1922 COG1473 # Protein_GI_number: 15831176 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Escherichia coli O157:H7 # 1 436 6 441 441 773 99.0 0 MESLNQFVNSLAPKLSHWRRDFHHYAESGWVEFRTATLVAEELQQLGYSLALGREVVNES SRMGLPDELTLQREFERARQQGALAQWIAVFEGGFTGIVATLDTGRPGPVMAFRVDMDAL DLSEEQDVSHRPYRDGFASCNAGMMHACGHDGHTAIGLGLAHTLKQFESGLHGVIKLIFQ PAEEGTRGARAMVDAGVVDDVDYFTAVHIGTGVPAGTVVCGSDNFMATTKFDAHFTGTAA HAGAKPEDGHNALLAAAQATLALHAIAPHSEGASRVNVGVMQAGSGRNVVPASALLKVET RGASDVINQYVFDRAQQAIQGAATMYGVGVETRLMGAATASSPSPQWVAWLQSQAAQVAG VNQAIERVEAPAGSEDATLMMARVQQHQGQASYVVFGTQLAAGHHNEKFDFDEQVLAIAV ETLARTALNFPWTRGI >gi|296493283|gb|ADTK01000218.1| GENE 67 68510 - 69418 643 302 aa, chain + ## HITS:1 COG:ECs1923 KEGG:ns NR:ns ## COG: ECs1923 COG0583 # Protein_GI_number: 15831177 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 302 1 302 302 581 99.0 1e-166 MAFQVKIHQIRAFVEVARQGSIRGASRMLNMSQPALSKSIQELEEGLAAQLFFRRSKGVT LTDAGESFYQHASLILEELRAAQEEIRQRQGQLAGQINIGMGASISRSLMPAVISRFHQQ HPQVKVRIMEGQLVSMINELRQGELDFTINTYYQGPYDHEFTFEKLLEKQFAIFCRPGHP AIGARSIKQLLDYSWTMPTPHGSYYKQLSELLDDQAQTPQVGVVCETFSACISLVAKSDF LSILPEEMGCDPLHGQGLVMLPVSEILPKAAYYLIQRRDSRQTPLTASLITQFRRECGYL QS >gi|296493283|gb|ADTK01000218.1| GENE 68 69748 - 70311 622 187 aa, chain + ## HITS:1 COG:ECs1924 KEGG:ns NR:ns ## COG: ECs1924 COG2840 # Protein_GI_number: 15831178 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 187 1 187 187 379 100.0 1e-105 MNLDDKSLFLDAMEDVQPLKRATDVHWHPTRNQRAPQRIDTLQLDNFLTTGFLDIIPLSQ PLEFRREGLQHGVLDKLRSGKYPQQASLNLLRQPVEECRKMMFSFIQQAMADGLRNVLII 
HGKGRDDKSHANIVRSYVARWLTEFDDVQAYCTALPHHGGSGACYVALRKTAQAKQENWE RHAKRSR >gi|296493283|gb|ADTK01000218.1| GENE 69 70332 - 71564 920 410 aa, chain - ## HITS:1 COG:ydaM_3 KEGG:ns NR:ns ## COG: ydaM_3 COG2199 # Protein_GI_number: 16129302 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 241 410 1 170 170 345 100.0 6e-95 MITHNFNTLDLLTSPVWIVSPFEEQLIYANSAARLLMQDLTFSQLRTGPYSVSSQKELPK YLSDLQNQHDIIEILTVQRKEEETALSCRLVLRELTETEPVIIFEGIEAPATLGLKASRS ANYQRKKQGFYARFFLTNSAPMLLIDPSRDGQIVDANLAALNFYGYNHETMCQKHTWEIN MLGRRVMPIMHEISHLPGGHKPLNFVHKLADGSTRHVQTYAGPIEIYGDKLMLCIVHDIT EQKRLEEQLEHAAHHDAMTGLLNRRQFYHITEPGQMQHLAIAQDYSLLLIDTDRFKHIND LYGHSKGDEVLCALARTLESCARKGDLVFRWGGEEFVLLLPRTPLDTALSLAETIRVSVA KVSISGLPRFTVSIGVAHHEGNESIDELFKRVDDALYRAKNDGRNRVLAA >gi|296493283|gb|ADTK01000218.1| GENE 70 71873 - 72802 778 309 aa, chain + ## HITS:1 COG:ECs1926 KEGG:ns NR:ns ## COG: ECs1926 COG0598 # Protein_GI_number: 15831180 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Escherichia coli O157:H7 # 1 309 19 327 327 629 99.0 1e-180 MLDGRGGVKPLENTDVIDEAHPCWLHLNYVHHESAQWLATTPLLPNNVRDALAGESTRPR VSRLGEGTLITLRCINGSTDERPDQLVAMRVYMDGRLIVSTRQRKVLALDDVVSDLEEGT GPTDCGGWLVDVCDALTDHSSEFIEQLHDKIIDLEDNLLDQQIPPRGFLALLRKQLIVMR RYMAPQRDVYARLASERLPWMSDDQRRRMQDIADRLGRGLDEIDACIARTGVMADEIAQV MQENLARRTYTMSLMAMVFLPSTFLTGLFGVNLGGIPGGGWQFGFSIFCILLVVLIGGVA LWLHRSKWL >gi|296493283|gb|ADTK01000218.1| GENE 71 73355 - 74653 1120 432 aa, chain + ## HITS:1 COG:ECs1927 KEGG:ns NR:ns ## COG: ECs1927 COG0513 # Protein_GI_number: 15831181 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Escherichia coli O157:H7 # 1 432 26 457 457 843 99.0 0 MTPVQAAALPAILAGKDVRVQAKTGSGKTAAFGLGLLQQIDASLFQTQALVLCPTRELAD QVAGELRRLARFLPNTKILTLCGGQPFGMQRDSLQHAPHIIVATPGRLLDHLQKGTVSLD 
ALNTLVMDEADRMLDMGFSDAIDDVIRFAPASRQTLLFSATWPEAIAAISGRVQRDPLAI EIDSTDALPPIEQQFYETSSKGKIPLLQRLLSLHQPSSCVVFCNTKKDCQAVCDALNEVG QSALSLHGDLEQRDRDQTLVRFANGSARVLVATDVAARGLDIKSLELVVNFELAWDPEVH VHRIGRTARAGNSGLAISFCAPEEAQRANIISDMLQIKLNWQTPPANSSIVPLEAEMATL CIDGGKKAKMRPGDVLGALTGDIGLDGADIGKIAVHPAHVYVAVRQAVAHKAWKQLQGGK IKGKTCRVRLLK >gi|296493283|gb|ADTK01000218.1| GENE 72 74782 - 75717 899 311 aa, chain - ## HITS:1 COG:ECs1928 KEGG:ns NR:ns ## COG: ECs1928 COG0037 # Protein_GI_number: 15831182 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Escherichia coli O157:H7 # 1 311 1 311 311 641 99.0 0 MQENQQITKKEQYNLNKLQKRLRRNVGEAIADFNMIEEGDRIMVCLSGGKDSYTMLEILR NLQQSAPINFSLVAVNLDQKQPGFPEHVLPEYLEKLGVEYKIVEENTYGIVKEKIPEGKT TCSLCSRLRRGILYRTATELGATKIALGHHRDDILQTLFLNMFYGGKMKGMPPKLMSDDG KHIVIRPLAYCREKDIQRFADAKAFPIIPCNLCGSQPNLQRQVIADMLRDWDKRYPGRIE TMFSAMQNVVPSHLCDTNLFDFKGINHGSEVVNGGDLAFDREEIPLQPAGWQPEEDENQL DELRLNVVEVK >gi|296493283|gb|ADTK01000218.1| GENE 73 75769 - 76698 227 309 aa, chain - ## HITS:1 COG:intR KEGG:ns NR:ns ## COG: intR COG0582 # Protein_GI_number: 16129306 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Escherichia coli K12 # 1 309 103 411 411 639 100.0 0 MKKTKSQLKTLRIIICESTPISHIRYSDILNYRNELLHGETLYLDNPRSNKKGRTVRTVD NYIALLCSLLRFAYQSGFISTKPFEGVKKLQRNRIKPDPLSKTEFNALMESEKGQSQNLW KFAVYSGLRHGELAALAWEDVDLEKGIVNVRRNLTILDMFGPPKTNAGIRTVTLLQPALE ALKEQYKLTGHHRKSEITFYHREYGRTEKQKLHFVFMPRVCNGKQKPYYSVSSLGARWNA AVKRAGIRRRNPYHTRHTFACWLLTAGANPAFIASQMGHETAQMVYEIYGMWIDDMNDEQ IAMLNARLS >gi|296493283|gb|ADTK01000218.1| GENE 74 77006 - 77221 214 71 aa, chain - ## HITS:1 COG:no KEGG:G2583_1696 NR:ns ## KEGG: G2583_1696 # Name: ydaQ # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 71 9 79 79 148 100.0 5e-35 MAQVIFNEEWMVEYGLMLRTGLGARQIEAYRQNCWVEGFHFKRVSPLGKPDSKRGIIWYN YPKINQFIKDS 
>gi|296493283|gb|ADTK01000218.1| GENE 75 77300 - 77509 136 69 aa, chain - ## HITS:1 COG:no KEGG:ECSE_1401 NR:ns ## KEGG: ECSE_1401 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 69 25 93 93 112 100.0 5e-24 MYKITATIEKEGGTPTNWTRYSKSKLTKSECEKMLSGKKEAGVSREQKVKLINFNCEKLQ SSRIALYSN >gi|296493283|gb|ADTK01000218.1| GENE 76 77753 - 78562 657 269 aa, chain - ## HITS:1 COG:recT KEGG:ns NR:ns ## COG: recT COG3723 # Protein_GI_number: 16129310 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecE pathway) # Organism: Escherichia coli K12 # 1 269 1 269 269 530 100.0 1e-150 MTKQPPIAKADLQKTQGNRAPAAVKNSDVISFINQPSMKEQLAAALPRHMTAERMIRIAT TEIRKVPALGNCDTMSFVSAIVQCSQLGLEPGSALGHAYLLPFGNKNEKSGKKNVQLIIG YRGMIDLARRSGQIASLSARVVREGDEFSFEFGLDEKLIHRPGENEDAPVTHVYAVARLK DGGTQFEVMTRKQIELVRSLSKAGNNGPWVTHWEEMAKKTAIRRLFKYLPVSIEIQRAVS MDEKEPLTIDPADSSVLTGEYSVIDNSEE >gi|296493283|gb|ADTK01000218.1| GENE 77 78555 - 81155 1161 866 aa, chain - ## HITS:1 COG:no KEGG:ECB_01327 NR:ns ## KEGG: ECB_01327 # Name: recE # Def: exonuclease VIII, 5' -> 3' specific dsDNA exonuclease # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 866 1 866 866 1734 99.0 0 MSTKPLFLLRKAKKSSGEPDVVLWASNDFESTCATLDYLIVKSGKKLSSYFKAVATNFPV VNDLPAEGEIDFTWSERYQLSKDSMTWELKPGAAPDNAHYQGNTNVNGEDMTEIEENMLL PISGQELPIRWLAQHGSEKPVTHVSRDGLQALHIARAEELPAVTALAVSHKTSLLDPLEI RELHKLVRDTDKVFPNPGNSNLGLITAFFEAYLNADYTDRGLLTKEWMKGNRVSHITRTA SGANAGGGNLTDRGEGFVHDLTSLARDVATGVLARSMDLDIYNLHPAHAKRIEEIIAENK PPFSVFRDKFITMPGGLDYSRAIVVASVKEAPIGIEVIPAHVTEYLNKVLTETDHANPDP EIVDIACGRSSAPMPQRVTEEGKQDDEEKPQPSGTTAVEQGEAETMEPDATEHHQDTQPL DAQSQVNSVDAKYQELRAELHEARKNIPSKNPVDADKLLAASRGEFVDGISDPNDPKWVK GIQTRDCVYQNQPETEKTSPDMNQPEPVVQQEPEIACNACGQTGGDNCPDCGAVMGDATY QETFDEESQVEAKENDPEEMEGAEHPHNENAGSDPHHDCSDEIGEVADPVIVEDIVPGIY YGISNENYHAGPGISKSQLDDIADTPALYLWRKNAPVDTTKTKTLDLGTAFHCRVLEPEE FSNRFIVAPEFNRRTNAGKEEEKAFLMECASTGKTVITAEEGRKIELMYQSVMALPLGQW 
LVESAGHAESSIYWEDPETGILCRCRPDKIIPEFHWIMDVKTTADIQRFKTAYYDYRYHV QDAFYSDGYEAQFGVQPTFVFLVASTTIECGRYPVEIFMMGEEAKLAGQQEYHRNLRTLS DCLNTDEWPAIKTLSLPRWAKEYAND >gi|296493283|gb|ADTK01000218.1| GENE 78 81257 - 81532 318 91 aa, chain - ## HITS:1 COG:no KEGG:ECO26_1920 NR:ns ## KEGG: ECO26_1920 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 91 17 107 107 161 100.0 7e-39 MITNYEATVVTTDDIVHEVNLEGKRIGYVIKTENKETPFTVVDIDGPSGNVKTLDEGVKK MCLVHIGKNLPAEKKAEFLATLIAMKLKGEI >gi|296493283|gb|ADTK01000218.1| GENE 79 81607 - 81777 62 56 aa, chain - ## HITS:1 COG:no KEGG:ECSE_1405 NR:ns ## KEGG: ECSE_1405 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 56 3 58 58 85 100.0 4e-16 MTKKIKCAYHLCKKDVEESKAIERMLHFMHGILSKDEPRKYCSEACAEKDQMAHEL >gi|296493283|gb|ADTK01000218.1| GENE 80 81777 - 81998 154 73 aa, chain - ## HITS:1 COG:no KEGG:Z2406 NR:ns ## KEGG: Z2406 # Name: kil # Def: FtsZ inhibitor protein # Organism: E.coli_O157 # Pathway: not_defined # 1 73 5 77 77 155 100.0 6e-37 MIAHHFGTDEIPRQCVTPGDYVLHEGRTYIASANNIKKRKLYIRNLTTKTCITDRMIKVF LGRDGLPVKAESW >gi|296493283|gb|ADTK01000218.1| GENE 81 82440 - 82928 105 162 aa, chain + ## HITS:1 COG:no KEGG:APECO1_502 NR:ns ## KEGG: APECO1_502 # Name: sieB # Def: prophage CP-933R superinfection exclusion protein # Organism: E.coli_APEC # Pathway: not_defined # 1 162 42 203 203 313 99.0 1e-84 MLDVFTPLLKLFANEPLERLMYTIIIFGLTLWLIPKEFTAAFNAYTEIPWLFQIIVFAFS FVVAISFSRLRAHIQKHYSLLPEQRVLLRLSEKEIAVFKDFLKTGNLIITSPCRNPVMKK LERKGIIQHQSDSANCSYYLVTEKYSHFMKLFWNSRSRRFNR >gi|296493283|gb|ADTK01000218.1| GENE 82 82925 - 83080 80 51 aa, chain - ## HITS:1 COG:no KEGG:B21_01342 NR:ns ## KEGG: B21_01342 # Name: ydaF # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 51 1 51 51 94 100.0 9e-19 MQKIDLGNNESLVCGVFPNQDGTFTAMTYTKSKTFKTETGARRWLEKHTVS >gi|296493283|gb|ADTK01000218.1| GENE 83 83091 - 83225 159 44 aa, chain - ## HITS:1 COG:no KEGG:Z2402 NR:ns ## 
KEGG: Z2402 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157 # Pathway: not_defined # 1 44 16 59 59 88 100.0 7e-17 MVHYEVVQYLMDCCGITYNQAVQALRSNDWDLWQAEVAIRSNKM >gi|296493283|gb|ADTK01000218.1| GENE 84 83534 - 84010 138 158 aa, chain - ## HITS:1 COG:no KEGG:JW1351 NR:ns ## KEGG: JW1351 # Name: racR # Def: predicted DNA-binding transcriptional regulator # Organism: E.coli_J # Pathway: not_defined # 1 158 1 158 158 307 100.0 6e-83 MLSGKDLGRAIEQAINKKIASGSVKSKAEVARHFKVQPPSIYDWIKKGSISKDKLPELWR FFSDVVGPEHWGLNEYPIPTPTNSDTKSELLDINNLYQAASDEIRAIVAFLLSGNATEPD WVDHDVRAYIAAMEMKVGKYLKALESERKSQNITKTGT >gi|296493283|gb|ADTK01000218.1| GENE 85 84134 - 84430 70 98 aa, chain + ## HITS:1 COG:ydaS KEGG:ns NR:ns ## COG: ydaS COG4197 # Protein_GI_number: 16129318 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria, prophage-related # Organism: Escherichia coli K12 # 1 98 1 98 98 190 100.0 5e-49 MKKENYSFKQACAVVGGQSAMARLLGVSPPSVNQWIKGVRQLPAERCPAIERATRGEVLC EELRPDIDWSYLRRSACCSQNMSVKQLNDSNKSSFDHT >gi|296493283|gb|ADTK01000218.1| GENE 86 84453 - 84875 292 140 aa, chain + ## HITS:1 COG:no KEGG:JW1353 NR:ns ## KEGG: JW1353 # Name: ydaT # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 140 1 140 140 273 100.0 1e-72 MKIKHEHIESVLFALAAEKGQAWVANAITEEYLRQGGGELPLVPGKDWNNQQNIYHRWLK GETKTQREKIQKLIPAILAILPRELRHRLCIFDTLERRALLAAQEALSTAIDAHDDAVQA VYRKAHFSGGGSSDDSVIVH >gi|296493283|gb|ADTK01000218.1| GENE 87 84888 - 85745 99 285 aa, chain + ## HITS:1 COG:ydaU_1 KEGG:ns NR:ns ## COG: ydaU_1 COG3756 # Protein_GI_number: 16129320 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 161 1 161 161 325 100.0 7e-89 MLFVLILSHRAASYGAIMAALPYMQLYIADYLADTMHLSAEEHGAYLLLMFNYWQTGKPI PKNRLAKIARLTNERWADVEPSLQEFFCDNGEEWVHLRIEEDLASVREKLTKKSAAGKAS VQARRSRKEADVQTKQERNLTGVQTDVEVVFEHDVNTKATNKDTDKDLKTDPPLNPPRGN RGVKKFDPLDIALPNWISVSLWREWVEFRQALRKPIRTEQGANGAIRELEKFRQQGFSPE 
QVIRHSIANEYQGLFAPKGVRPETLLRQVNTVSLPDSAIPPGFRG >gi|296493283|gb|ADTK01000218.1| GENE 88 85752 - 86498 404 248 aa, chain + ## HITS:1 COG:ydaV KEGG:ns NR:ns ## COG: ydaV COG1484 # Protein_GI_number: 16129321 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Escherichia coli K12 # 1 248 1 248 248 493 99.0 1e-139 MKNIATGDVLERIRRLAPSHVTAPFKTVAEWREWQLSEGQKRCEEINRQNRQLRVEKILN RSGIQPLHRKCSFSNYQVQNEGQRYALSQAKSIADELMTGCTNFAFSGKPGTGKNHLAAA IGNRLLKDGQTVIVVTVADVMSALHASYDDGQSGEKFLRELCQVDLLVLDEIGIQRETKN EQVVLHQIVDRRTASMRSVGMLTNLNYEAMKTLLGERIMDRMTMNGGRWVNFNWESWRPN VVQPGIAK >gi|296493283|gb|ADTK01000218.1| GENE 89 86521 - 87282 554 253 aa, chain + ## HITS:1 COG:no KEGG:ECs1946 NR:ns ## KEGG: ECs1946 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157J # Pathway: not_defined # 1 253 18 270 270 432 94.0 1e-120 METVFDALKAMGKATSIELAARLDISREEVLNELWELKKAGFVDKSAYTWRVADNNVQQE QPAPAELPEETTTATVAKISESDLTATIEQRGPLTADELATLFGTTSRKVASTLAMAISK ARLIRVNQNGKFRYCMPGGNLPAEPKAASVAETDGKAFPQLAGVALPVQEAATQEDIKTE TVADIVQSLPSFTETQADNLILPSLHMANRELRRAKNHVQKWERVCAALRELNKHRDIVR QIFDSSSRIVSEK >gi|296493283|gb|ADTK01000218.1| GENE 90 87298 - 87729 160 143 aa, chain + ## HITS:1 COG:no KEGG:Z2394 NR:ns ## KEGG: Z2394 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157 # Pathway: not_defined # 1 138 1 138 140 246 94.0 2e-64 MAKVFTPEEREKIKGQVVELVRLSGRETLRALEAKTGASRYYISTLARELVASGDVYNSG YGLFPSEQARKDWQNARKKLSRAKVKKTVVVDPDLIWSLPDGEIRRYDRRLNIICRECRK SEVMQRVLAFYQSNFRRLSGEHD >gi|296493283|gb|ADTK01000218.1| GENE 91 87763 - 88803 59 346 aa, chain - ## HITS:1 COG:NMA0427 KEGG:ns NR:ns ## COG: NMA0427 COG0270 # Protein_GI_number: 15793432 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Neisseria meningitidis Z2491 # 3 344 5 350 351 224 39.0 1e-58 MKKIKVFDFFSGCGGTSQGFHQAGMDIVFGLDFDVDAASSFRANFPQAAFINSDIRLIDN NAINKLVKKHRNDYILFSGCAPCQPYSKQNSNKKNDDPRLDLLKEFSRFVEHYMPDFIFV 
ENVPGMQKFNKNEGTFMMFLEMLSSKGYSVDYKVMPAAWYGVPQTRERLVLIASKDFYVA LPSPTHGVGNTPYSTVKDWIANLPKIEAGEKHNSIPDHEAARLSELNLRRIKCTPEGGSR EFWPDELILECHRNHKGHTDVYGRLSWDKPASGLTTRCISYSNGRFGHPTQNRAISVREA ACLQTFPLDYKFIGSLQSRARQIGNAVPPKMSETIGKHLLNIIKAS >gi|296493283|gb|ADTK01000218.1| GENE 92 88827 - 90113 316 428 aa, chain - ## HITS:1 COG:no KEGG:Gbro_1454 NR:ns ## KEGG: Gbro_1454 # Name: not_defined # Def: ATPase # Organism: G.bronchialis # Pathway: not_defined # 66 423 251 596 597 197 35.0 6e-49 MYHSDFKLRKKKSGVLITKQDYIRSKSKYIKKAFDEKLSNLRLWGAETIVKNITLAPELV EHVSFILGKKYKVIKFLEHRLFGTRGGTALLSTDKLNYTEAFAGSGEFAIVSLILNIYSA KPNSLILLDEPEVSLHPGAQKRMMDVLYSIVEQKKHQVVISTHSPVIVNTLPKDAIKLFV FDEESETAKIVQNIAPDEAFIELGHDINKKTIIVEDKLAKAIIDKAIKNDERLSLSFSVS YIPGGSETILSKHLPSYAVVERNDILFLLDGDKNKKIKPVRISEIADADLVNTMCKYYGC ELIINASGSNGKKNEQESNRLKRQVLEYAFNKVQYLPFDTPEQLLIEKAITPSEKEIIDS QTWSSNDPELYKNQIRLLAQHLYDKEEVNAEEIFCLQQMMTARLKNELPEFIKIRKIITQ ALDRGIIR >gi|296493283|gb|ADTK01000218.1| GENE 93 90663 - 90842 61 59 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTIRAVNIVFCCITAYLPLPFVQAVRSLFLLCANQQTSFICIKNRYHVILNLLIAYYGL Prediction of potential genes in microbial genomes Time: Mon May 16 15:44:52 2011 Seq name: gi|296493282|gb|ADTK01000219.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont624.2, whole genome shotgun sequence Length of sequence - 3843 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 319 - 378 2.3 1 1 Op 1 . + CDS 424 - 1023 421 ## SSON_1778 unknown protein encoded within prophage 2 1 Op 2 . + CDS 1023 - 1313 143 ## STY2051 putative bacteriophage protein 3 1 Op 3 . + CDS 1310 - 1852 215 ## ECSE_1424 hypothetical protein + Term 1933 - 1970 6.5 + Prom 1993 - 2052 3.9 4 2 Tu 1 . + CDS 2074 - 2643 154 ## STY2047 hypothetical protein + Term 2692 - 2748 -0.7 - Term 2560 - 2594 4.5 5 3 Tu 1 . 
- CDS 2612 - 2773 101 ## STY2046 hypothetical protein - Prom 2822 - 2881 3.7 + Prom 2742 - 2801 5.5 6 4 Op 1 . + CDS 2991 - 3332 379 ## G2583_3125 phage holin, lambda family 7 4 Op 2 . + CDS 3336 - 3812 387 ## COG4678 Muramidase (phage lambda lysozyme) Predicted protein(s) >gi|296493282|gb|ADTK01000219.1| GENE 1 424 - 1023 421 199 aa, chain + ## HITS:1 COG:no KEGG:SSON_1778 NR:ns ## KEGG: SSON_1778 # Name: not_defined # Def: unknown protein encoded within prophage # Organism: S.sonnei # Pathway: not_defined # 1 199 1 199 199 398 98.0 1e-110 MAHIQLVKQTSSGLLLPATPESCDFLHQIKIGEWIHADFKRVRNYAFHKRFFKLLQLGFD YWTPVGGAITPRERELLSGFVDYLCESVGREHTPALSDAAEQYLNTVATRRTRDTALLKS FEAFREWVTIQAGFYTEHFYPDGSRGRRAKSIAFANMDETEFQQVYKSVLNVLWNWILFR KFSSPEQVENVAAQLLEFA >gi|296493282|gb|ADTK01000219.1| GENE 2 1023 - 1313 143 96 aa, chain + ## HITS:1 COG:no KEGG:STY2051 NR:ns ## KEGG: STY2051 # Name: not_defined # Def: putative bacteriophage protein # Organism: S.typhi # Pathway: not_defined # 1 96 1 96 96 195 100.0 5e-49 MVDLRKAARGQMCTVRIPGYCNHNPETSVLAHYRLAGTCGTATKPHDMQAAIACSSCHDL IDGRVKTSDYTKEELRLMHAEGVFRTQEIWRKEGYL >gi|296493282|gb|ADTK01000219.1| GENE 3 1310 - 1852 215 180 aa, chain + ## HITS:1 COG:no KEGG:ECSE_1424 NR:ns ## KEGG: ECSE_1424 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 180 1 180 180 343 95.0 2e-93 MIYPTNTGKSGEHLRLTTLESVWIQGKLRMWGRWSYIGGGKTGNMFNQLLASKKLTKTAV NEALRRMKKAGIEKPELEAFLREMINGKQKTWLAHCTDAEALCIDRVISEVLAEHPGLIS VLRQRYEGRGMTKRKMAELLNDAHPEWCFSTCEKRIANWLAVAEYALYIPMRESFAQKIA >gi|296493282|gb|ADTK01000219.1| GENE 4 2074 - 2643 154 189 aa, chain + ## HITS:1 COG:no KEGG:STY2047 NR:ns ## KEGG: STY2047 # Name: not_defined # Def: hypothetical protein # Organism: S.typhi # Pathway: not_defined # 1 189 1 189 189 379 100.0 1e-104 MRELPKDYFLGVDDELVDYLEKQGEETIREIHLSNKTNVENGYKLLNIQIVGIGSSFLLL TQKTNFDFLTAGITTFTLLWTWCAIYLVCTGLSVKVRGLINAPPDHLYHEKYKDMEPSSF KIFADAGYLGPDKLLPLIRRYRLVDLSDTARELLLENEKIRTSLDKARMYTILAPVAAMF ISAVFLYVQ 
>gi|296493282|gb|ADTK01000219.1| GENE 5 2612 - 2773 101 53 aa, chain - ## HITS:1 COG:no KEGG:STY2046 NR:ns ## KEGG: STY2046 # Name: not_defined # Def: hypothetical protein # Organism: S.typhi # Pathway: not_defined # 1 53 48 100 100 85 98.0 4e-16 MSEKNKPQGENKPQHPVAPKPTPTQSTTDFATRRVFVGDSADSVIEHIKKQPR >gi|296493282|gb|ADTK01000219.1| GENE 6 2991 - 3332 379 113 aa, chain + ## HITS:1 COG:no KEGG:G2583_3125 NR:ns ## KEGG: G2583_3125 # Name: not_defined # Def: phage holin, lambda family # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 113 1 113 113 182 96.0 2e-45 MKMHNAPHSWPDLLELLQSWWRGDTPLGAVIMSIIMAGLRIAYFGGGGGWKRKTLEILLC GALTLTFASALEYVGWPKSLSVAIGGGVGLIGVDAIRGAAMRVIGNKFGAHKE >gi|296493282|gb|ADTK01000219.1| GENE 7 3336 - 3812 387 158 aa, chain + ## HITS:1 COG:ECs1622 KEGG:ns NR:ns ## COG: ECs1622 COG4678 # Protein_GI_number: 15830876 # Func_class: G Carbohydrate transport and metabolism # Function: Muramidase (phage lambda lysozyme) # Organism: Escherichia coli O157:H7 # 1 158 1 158 158 303 94.0 7e-83 MQTLNSQRKAFLDMVAWSEGTDNGRQKTRNHGYDVIVGGELFTDYSDHPRKLVTLNPKLK STAAGRYQLLSRWWDAYRKQLGLKDFSPESQDAVALQQIKERGALPMIDRGDIRQAIDRC SNIWASLPGAGYGQFEHKADSLIAKFKEVGGTVREIEV Prediction of potential genes in microbial genomes Time: Mon May 16 15:45:07 2011 Seq name: gi|296493281|gb|ADTK01000220.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont631.1, whole genome shotgun sequence Length of sequence - 604 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 11/0.000 - CDS 3 - 183 181 ## COG2801 Transposase and inactivated derivatives 2 1 Op 2 . 
- CDS 227 - 445 171 ## COG2801 Transposase and inactivated derivatives Predicted protein(s) >gi|296493281|gb|ADTK01000220.1| GENE 1 3 - 183 181 60 aa, chain - ## HITS:1 COG:ECs2478 KEGG:ns NR:ns ## COG: ECs2478 COG2801 # Protein_GI_number: 15831732 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 60 168 227 295 117 96.0 4e-27 METTFVLDALEQALWARRPSGTIHHSDKGSQYVSLAYMERLKEAKLLASTGSTGDSYDNA >gi|296493281|gb|ADTK01000220.1| GENE 2 227 - 445 171 72 aa, chain - ## HITS:1 COG:Z2073 KEGG:ns NR:ns ## COG: Z2073 COG2801 # Protein_GI_number: 15801514 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 EDL933 # 1 67 1 67 215 129 95.0 2e-30 MARCTVVRLMAVMGLAGVLRGKKVRTTVSRKTVAAGDRVNRQFVAERPDQLWVADFTYVS TWQGFVYSGVHH Prediction of potential genes in microbial genomes Time: Mon May 16 15:45:08 2011 Seq name: gi|296493280|gb|ADTK01000221.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont649.1, whole genome shotgun sequence Length of sequence - 1088 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 56 - 262 182 ## ECH74115_1621 gpW 2 1 Op 2 . 
+ CDS 259 - 1087 490 ## COG5511 Bacteriophage capsid protein Predicted protein(s) >gi|296493280|gb|ADTK01000221.1| GENE 1 56 - 262 182 68 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_1621 NR:ns ## KEGG: ECH74115_1621 # Name: not_defined # Def: gpW # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 68 1 68 68 101 100.0 8e-21 MTRQEELAAARAALHDLMTGKRVATVQKDGRRVEFTTTSVSDLKKYIAELEVQTGMTQRR RGPAGFYV >gi|296493280|gb|ADTK01000221.1| GENE 2 259 - 1087 490 276 aa, chain + ## HITS:1 COG:ECs1632 KEGG:ns NR:ns ## COG: ECs1632 COG5511 # Protein_GI_number: 15830886 # Func_class: R General function prediction only # Function: Bacteriophage capsid protein # Organism: Escherichia coli O157:H7 # 1 276 1 276 533 539 98.0 1e-153 MKMSTIPTLLGPDGMTSLREYAGYHGGGSGFGGQLRAWNPPSESVDAALLPNFTRGNARA DDLVRNNGYAANAIQLHQDHIVGAFFRLSHRPSWRYLGIGEEEARAFSREVESAWKEFAE DDCCCIDVERKRTFTMMIREGVAMHAFNGELFVQATWDTSPSRLFRTQFRMVSPKRISNP NNTGDSRNCRAGVQINDSGAALGYYVSEDGYPGWMPQKWTWIPRELPGGRASFIHVFEPV EDGQTRGANVFYSVMEQMKMLDTLQNTQLQSAIVKA Prediction of potential genes in microbial genomes Time: Mon May 16 15:45:10 2011 Seq name: gi|296493279|gb|ADTK01000222.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont650.1, whole genome shotgun sequence Length of sequence - 340 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 339 72 ## COG0616 Periplasmic serine proteases (ClpP class) Predicted protein(s) >gi|296493279|gb|ADTK01000222.1| GENE 1 3 - 339 72 112 aa, chain - ## HITS:1 COG:ECs1633 KEGG:ns NR:ns ## COG: ECs1633 COG0616 # Protein_GI_number: 15830887 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Escherichia coli O157:H7 # 1 112 145 256 439 205 98.0 1e-53 DIIARVRDIKPVWALANDMNCSAGQLLASAASRRLVTQTARTGSIGVMMAHSNYGAALEK QGVEITLIYSGSHKVDGNPYSHLPDDVRETLQSRMDATRQMFAQKVSAYTGL Prediction of potential genes in microbial genomes Time: Mon May 16 15:45:11 2011 Seq name: gi|296493278|gb|ADTK01000223.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont650.2, whole genome shotgun sequence Length of sequence - 779 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 - CDS 3 - 549 497 ## COG0616 Periplasmic serine proteases (ClpP class) 2 1 Op 2 . 
- CDS 530 - 775 266 ## COG5511 Bacteriophage capsid protein Predicted protein(s) >gi|296493278|gb|ADTK01000223.1| GENE 1 3 - 549 497 182 aa, chain - ## HITS:1 COG:ECs1633 KEGG:ns NR:ns ## COG: ECs1633 COG0616 # Protein_GI_number: 15830887 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Escherichia coli O157:H7 # 1 182 1 182 439 325 99.0 2e-89 MTAELRNLPHIASMAFNEPLMLEPAYARVFFCALAGQLGISRLTDAVSGDSLTAGEAPAA LALSVDDDGPRQARSYQVMNGIAVLPVSGTLVSRTRALQPYSGMTGYNGIIARLQQAASD PMVDGILLDMDTPGGMVAGAFDCADIIARVRDIKPVWALANDMNCSAGQLLASAASRRLV TQ >gi|296493278|gb|ADTK01000223.1| GENE 2 530 - 775 266 81 aa, chain - ## HITS:1 COG:ECs1632 KEGG:ns NR:ns ## COG: ECs1632 COG5511 # Protein_GI_number: 15830886 # Func_class: R General function prediction only # Function: Bacteriophage capsid protein # Organism: Escherichia coli O157:H7 # 1 81 453 533 533 149 100.0 1e-36 MAIDGLKEVQEAVMLIEAGLSTYEKECAKRGDDYQEIFAQQVRETMERRAAGLKPPAWAA AAFESGLRQSTEEEKSDSRAA Prediction of potential genes in microbial genomes Time: Mon May 16 15:45:12 2011 Seq name: gi|296493277|gb|ADTK01000224.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont650.3, whole genome shotgun sequence Length of sequence - 251 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 249 172 ## COG5511 Bacteriophage capsid protein Predicted protein(s) >gi|296493277|gb|ADTK01000224.1| GENE 1 3 - 249 172 82 aa, chain - ## HITS:1 COG:ECs1632 KEGG:ns NR:ns ## COG: ECs1632 COG5511 # Protein_GI_number: 15830886 # Func_class: R General function prediction only # Function: Bacteriophage capsid protein # Organism: Escherichia coli O157:H7 # 1 82 412 493 533 165 98.0 2e-41 FLCWLEEAIARRVVTLPSKARFSFQEARSAWGNCDWIGSGRMAIDGLKEVQEAVMLIEAG LSTYEKECAKRGDDYQEIFAQQ Prediction of potential genes in microbial genomes Time: Mon May 16 15:45:12 2011 Seq name: gi|296493276|gb|ADTK01000225.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont650.4, whole genome shotgun sequence Length of sequence - 525 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 519 340 ## COG5511 Bacteriophage capsid protein Predicted protein(s) >gi|296493276|gb|ADTK01000225.1| GENE 1 3 - 519 340 172 aa, chain - ## HITS:1 COG:ECs1632 KEGG:ns NR:ns ## COG: ECs1632 COG5511 # Protein_GI_number: 15830886 # Func_class: R General function prediction only # Function: Bacteriophage capsid protein # Organism: Escherichia coli O157:H7 # 1 172 277 448 533 332 99.0 2e-91 MYAATIESELDTQSAMDFILGANSQEPRERLTGWIGEIAAYYAAAPVRLGGAKVPHLMPG DSLNLQTAQDTDNGYSVFEQSLLRYIAAGLGVSYEQLSRNYAQMSYSTARASANESWAYF MGRRKFVASRQASQMFLCWLEEAIVRRVVTLPSKARFSFQEARSAWGNCDWI Prediction of potential genes in microbial genomes Time: Mon May 16 15:45:17 2011 Seq name: gi|296493275|gb|ADTK01000226.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont661.1, whole genome shotgun sequence Length of sequence - 16435 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 10, operones - 7 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 130 - 160 3.0 1 1 Op 1 . - CDS 177 - 2510 1946 ## COG3378 Predicted ATPase 2 1 Op 2 . 
- CDS 2525 - 2845 336 ## EcSMS35_4010 hypothetical protein 3 2 Op 1 . - CDS 2981 - 3436 385 ## EFER_0442 conserved hypothetical protein from phage origin, putative ATPases with chaperone activity, ATP-binding subunit (COG0542) 4 2 Op 2 . - CDS 3429 - 3716 255 ## EcSMS35_4008 hypothetical protein 5 2 Op 3 . - CDS 3709 - 4113 279 ## EcolC_0047 putative bacteriophage protein - Prom 4224 - 4283 5.0 - Term 4141 - 4167 -0.6 6 3 Tu 1 . - CDS 4296 - 4562 266 ## COG3311 Predicted transcriptional regulator - Prom 4794 - 4853 9.0 7 4 Op 1 . + CDS 5114 - 5848 642 ## EcolC_0049 hypothetical protein 8 4 Op 2 . + CDS 5845 - 6345 185 ## EFER_0447 transactivation protein 9 4 Op 3 . + CDS 6419 - 6991 425 ## EFER_0448 polarity suppression protein (amber mutation-suppressing protein) + Term 7028 - 7064 5.6 - Term 7125 - 7156 1.3 10 5 Op 1 1/1.000 - CDS 7209 - 9167 331 ## COG5635 Predicted NTPase (NACHT family) 11 5 Op 2 2/1.000 - CDS 9171 - 10259 413 ## PROTEIN SUPPORTED gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 - Prom 10416 - 10475 3.8 12 6 Tu 1 . - CDS 11141 - 11623 450 ## COG0691 tmRNA-binding protein - Prom 11748 - 11807 5.3 + Prom 11596 - 11655 2.3 13 7 Op 1 9/0.500 + CDS 11782 - 12231 340 ## COG2867 Oligoketide cyclase/lipid transport protein 14 7 Op 2 . + CDS 12221 - 12511 285 ## COG2914 Uncharacterized protein conserved in bacteria + Term 12518 - 12562 7.8 - Term 12504 - 12550 8.2 15 8 Tu 1 . - CDS 12573 - 12914 347 ## COG2913 Small protein A (tmRNA-binding) - Prom 12967 - 13026 3.2 16 9 Op 1 17/0.000 - CDS 13063 - 14724 1522 ## COG0497 ATPase involved in DNA repair - Prom 14749 - 14808 4.7 17 9 Op 2 . - CDS 14810 - 15688 648 ## COG0061 Predicted sugar kinase - Prom 15810 - 15869 4.2 + Prom 15377 - 15436 2.6 18 10 Op 1 . + CDS 15620 - 15814 188 ## UTI89_C2948 hypothetical protein 19 10 Op 2 . 
+ CDS 15811 - 16404 911 ## COG0576 Molecular chaperone GrpE (heat shock protein) Predicted protein(s) >gi|296493275|gb|ADTK01000226.1| GENE 1 177 - 2510 1946 777 aa, chain - ## HITS:1 COG:Z0339_2 KEGG:ns NR:ns ## COG: Z0339_2 COG3378 # Protein_GI_number: 15799978 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Escherichia coli O157:H7 EDL933 # 371 775 2 406 408 702 81.0 0 MKMNVTATVSHALGHWPRILPALGIQVLKNRHQPCPVCGGSDRFRFDDREGRGTWYCNQC GAGDGLKLVEKVFGVSPSDAATKVAAVTGSLPPADPAVTTAAVAETDAARKNAATLAQTL MAKTRTGTGNAYLTRKGFPGRECLMLTGTHRAGGVSWRAGDLVVPLYDDSGELVNLQLIS ADGRKRTLKGGQVRGTCHTLEGQNQAGKRLWIAEGYATALTVHHLTGETVMVALSSVNLL SLASLARQKHPACQIVLAADRDLSGDGQKKAAAAADACEGVVALPPVFGDWNDAFTQYGG EATRKAIYDAIRPPAESPFDTMSEAEFSAMSTSEKAMRIYEHYGEALAVDANGQLLSRYE NGVWKVLPPQDFARDVAGLFQRLRAPFSSGKVASVVDTLKLIIPQQEAPSRRLIGFRNGV LDTQNGTFHPHSPSHWMRTLCDVDFTPPVDGETLETHAPAFWRWLDRAAGGRAEKRDVIL AALFMVLANRYDWQLFLEVTGPGGSGKSIMAEIATLLAGEDNATSATIETLESPRERAAL TGFSLIRLPDQEKWSGDGAGLKAITGGDAVSVDPKYRDAYSTHIPAVILAVNNNPMRFTD RSGGVSRRRVIIHFPEQIAPQERDPQLKDKITRELAVIVRHLMQKFSDPMLARSLLQSQQ NSDEALNIKRDADPTFDFIGYLETLPQTSGMYMGNASIIPRNYRKYLYHAYLAYMEANGY RNVLSLKMFGLGLPVMLKEYGLNYEKRHTKQGIQTNLTLKEESYGDWLPKCDDPATT >gi|296493275|gb|ADTK01000226.1| GENE 2 2525 - 2845 336 106 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_4010 NR:ns ## KEGG: EcSMS35_4010 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 106 1 106 106 191 100.0 5e-48 MKTPLPPVLRAALYRRAVACAWLTVCERQHRYPHLTLESLEAAIAAELEGFYLRQHGEEK GRQIACALLEDLMESGPLKAAPSLSFLGLVVMDELCARHIKAPVLH >gi|296493275|gb|ADTK01000226.1| GENE 3 2981 - 3436 385 151 aa, chain - ## HITS:1 COG:no KEGG:EFER_0442 NR:ns ## KEGG: EFER_0442 # Name: not_defined # Def: conserved hypothetical protein from phage origin, putative ATPases with chaperone activity, ATP-binding subunit (COG0542) # Organism: E.fergusonii # Pathway: not_defined # 1 151 1 151 151 275 98.0 4e-73 MFDFPQPGEIYRSAGFPDVAVVGILEDGIPWEMPYRCPEIVWNPYRRKFSILVRILADGR 
TTDIPLGRFLREFTCDRPDLFRRSPVNRHAVLKEMAGDPELQKWREKYLDIYPQDPVPVS RAAPVAREWREIPRTEPDPEITPDNSYRNYL >gi|296493275|gb|ADTK01000226.1| GENE 4 3429 - 3716 255 95 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_4008 NR:ns ## KEGG: EcSMS35_4008 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 95 1 95 95 177 98.0 7e-44 MRNKKAPQTVSARHDAREHLSIEAYHKLNRASAVSQFVGGDLIHRELSGLHQLYIPHIFS YLNEDIDFVLNELKAKGLCRDFLAQQQDRGDRTHV >gi|296493275|gb|ADTK01000226.1| GENE 5 3709 - 4113 279 134 aa, chain - ## HITS:1 COG:no KEGG:EcolC_0047 NR:ns ## KEGG: EcolC_0047 # Name: not_defined # Def: putative bacteriophage protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 134 63 196 196 252 99.0 3e-66 MVWRVVCRAGMILFAIACYASESMVAQAGQPPGWPVSDNAGILTPVWAIAIERENSGDSV IYAVIGGCLMATTLTPSHPEFVFVFAAVRRADRHPRICMLRTVAGDERSARRSLVRDYVL SLAARLPVMEVSRA >gi|296493275|gb|ADTK01000226.1| GENE 6 4296 - 4562 266 88 aa, chain - ## HITS:1 COG:ECs0299 KEGG:ns NR:ns ## COG: ECs0299 COG3311 # Protein_GI_number: 15829553 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Escherichia coli O157:H7 # 24 88 22 86 86 102 72.0 2e-22 MQAVFSSPSPAPVTPLMPLPDITQERFLRLPEVMHLCGLSRSTIYELIRKGEFPPQVSLG GKNVAWLHSEVTAWMAGRIAGRKRGYDA >gi|296493275|gb|ADTK01000226.1| GENE 7 5114 - 5848 642 244 aa, chain + ## HITS:1 COG:no KEGG:EcolC_0049 NR:ns ## KEGG: EcolC_0049 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 244 1 244 244 429 99.0 1e-119 MSDNTIPEYLQPALAQLEKARAAHLENARLMDETVTAIERAEQEKNALAQADGNDADDWR TAFRAAGGVLSDELKQRHIERVARRELVQEYDNLAVVLNFERERLKGACDSTATAYRKAH HHLLSLYAEHELEHALNETCEALVRAMHLSILVQENPLANTTGHQGYIAPDKAVMQQVKS SLEQKIKQMQISLTGEPVLRLTGLSAATLPHMDYEVAGTPAQRKVWQDKIDQQGAELKAR GLLS >gi|296493275|gb|ADTK01000226.1| GENE 8 5845 - 6345 185 166 aa, chain + ## HITS:1 COG:no KEGG:EFER_0447 NR:ns ## KEGG: EFER_0447 # Name: not_defined # Def: transactivation protein # Organism: E.fergusonii # Pathway: not_defined 
# 1 166 1 166 166 318 99.0 5e-86 MIYCPSCGHVAHTRRAHFMDDGTKIMIAQCRNIYCSATFEASESFFSDSKDSGMEYISGK QRYRDSLTSASGSMKRPKRMLVTGYCCRRCKGLALSRTSRRLSQEVTERFYVCTDPGCGL VFKTLQTINRFIVRPVTPDELAESLHEKQELPPVRLKTQSYSLRLE >gi|296493275|gb|ADTK01000226.1| GENE 9 6419 - 6991 425 190 aa, chain + ## HITS:1 COG:no KEGG:EFER_0448 NR:ns ## KEGG: EFER_0448 # Name: not_defined # Def: polarity suppression protein (amber mutation-suppressing protein) # Organism: E.fergusonii # Pathway: not_defined # 1 190 1 190 190 328 98.0 7e-89 MESTALQQAFDTCQNNKAAWLQRKNELAAAEQEYLRLLSGEGRNVSRLDELRNIIEVRKW QVNQAAGRYIRSHEAVQHISIRDRLNDFMQQHGTALAAALAPELMGYSELTAIARNCAIQ RATDALREALLSWLAKGEKINYSAQDSDILTTIGFRPDAASVDDSREKFTPAQNMIFSRK SAQLASRQSV >gi|296493275|gb|ADTK01000226.1| GENE 10 7209 - 9167 331 652 aa, chain - ## HITS:1 COG:alr1232 KEGG:ns NR:ns ## COG: alr1232 COG5635 # Protein_GI_number: 17228727 # Func_class: T Signal transduction mechanisms # Function: Predicted NTPase (NACHT family) # Organism: Nostoc sp. PCC 7120 # 103 409 200 534 801 70 22.0 1e-11 MQPVSLDVLVSASLPWAKEVIKNKILPIIESSVRNYLIDVHAARFLNKNLERFLSNVKGQ CSLVNTLAFQNTPVELLDIYEPMSIYYDNEKKPYTCLIKDNADLLDNFNHILITDSAGMG KSTLMKRIALDCIANKEYIPIYIELRRAKDYDFALQVKHQLGLGEDISDNCLKKIPFIYL FDGIDEIPQQLKSKIVNLLSTFANNFSESKIIITSRHDNFLSELHGFSRFKINPLETNQA YDLLRRYDGRGPISSQLIKGLRLENGHNLNEFLSTPLYVSLLFCAYKFKPIIPRKRELFY SQVFDALYESHDLTKELGYVREKFSKLDSTDFHQVLRRLGFWCLKDGGKIEFTKDDLQIT LQNIISKIPGISTSPSLFIQDLIETVPLFVKEGAIIRWSHKSLMEYFAAMFICRDTKERQ KEILTKLYYSPESIRHRNLFSLCADIDYSSFRSSIIKCLLLDLKLSYEELKNEAAQFNCD TIKTKAEISLAGNSIIYLFDKDKENELLNDFFKGDLRAFNELSENGSHIITTITGVANHW LVIARESTIKSDILSILKSRKPSYFHKTHRFFNTDPELESNIMMEVGKQPEVKFNLTLDS LKDINKDKEIRLIESLLSFENTPQIKYDSALFELEQINCDESNGINQLLDGF >gi|296493275|gb|ADTK01000226.1| GENE 11 9171 - 10259 413 362 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 [Campylobacter concisus 13826] # 1 349 44 396 406 163 30 6e-40 KLWRFRYQRPATKQRTMMGLGAFPALSLADARRLRADYLALLANGIDPQIQAEVAEEQQQ 
IALDSIFSTVAANWFQLKSKSVTPDYAKDIWRSLEKDVFPVIGEIPVQQIKARTLVETLE PIKARGALETVRRLVQRINEIMIYAVNTGLIDANPASGIGMAFEKPKKQHMPTLRPEELP KLMRSLVMSNLSVSTRCLIEWQLLTLVRPSEASGARWAEIDLDAKLWTIPAKRMKAKREH IVPLSPQAIEILEVMKPISAHREHVFPSRNDPKQPMNSQTANAALKRIGYGGKLVAHGLR SIASTAMNERGFNPDVIEAALAHSDKNEVRRAYNRSTYLHARIDLMNWWGLKVKNLNVEN KV >gi|296493275|gb|ADTK01000226.1| GENE 12 11141 - 11623 450 160 aa, chain - ## HITS:1 COG:ECs3482 KEGG:ns NR:ns ## COG: ECs3482 COG0691 # Protein_GI_number: 15832736 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: tmRNA-binding protein # Organism: Escherichia coli O157:H7 # 1 160 1 160 160 308 100.0 3e-84 MTKKKAHKPGSATIALNKRARHEYFIEEEFEAGLALQGWEVKSLRAGKANISDSYVLLRD GEAFLFGANITPMAVASTHVVCDPTRTRKLLLNQRELDSLYGRVNREGYTVVALSLYWKN AWCKVKIGVAKGKKQHDKRSDIKEREWQVDKARIMKNAHR >gi|296493275|gb|ADTK01000226.1| GENE 13 11782 - 12231 340 149 aa, chain + ## HITS:1 COG:yfjG KEGG:ns NR:ns ## COG: yfjG COG2867 # Protein_GI_number: 16130538 # Func_class: I Lipid transport and metabolism # Function: Oligoketide cyclase/lipid transport protein # Organism: Escherichia coli K12 # 1 149 10 158 158 295 100.0 2e-80 MEIVMPQISRTALVPYSAEQMYQLVNDVQSYPQFLPGCTGSRILESTPGQMTAAVDVSKA GISKTFTTRNQLTSNQSILMNLVDGPFKKLIGGWKFTPLSQEACRIEFHLDFEFTNKLIE LAFGRVFKELAANMVQAFTVRAKEVYSAR >gi|296493275|gb|ADTK01000226.1| GENE 14 12221 - 12511 285 96 aa, chain + ## HITS:1 COG:yfjF KEGG:ns NR:ns ## COG: yfjF COG2914 # Protein_GI_number: 16130537 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 96 7 102 102 163 98.0 7e-41 MPGKIAVEVAYALPEKQYLQRVTLQEGATVEEAIRASGLLELRTDIDLTKNKVGIYSRPA KLSDSVHDGDRVEIYRPLIADPKELRRQRAEKSANK >gi|296493275|gb|ADTK01000226.1| GENE 15 12573 - 12914 347 113 aa, chain - ## HITS:1 COG:STM2685 KEGG:ns NR:ns ## COG: STM2685 COG2913 # Protein_GI_number: 16766000 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Small protein A (tmRNA-binding) # Organism: 
Salmonella typhimurium LT2 # 1 111 1 111 112 198 92.0 2e-51 MRCKTLTAAAAVLLMLTAGCSTLERVVYRPDINQGNYLTANDVSKIRVGMTQQQVAYALG TPLMSDPFGTNTWFYVFRQQPGHEGVTQQTLTLTFNSSGVLTNIDNKPALSGN >gi|296493275|gb|ADTK01000226.1| GENE 16 13063 - 14724 1522 553 aa, chain - ## HITS:1 COG:recN KEGG:ns NR:ns ## COG: recN COG0497 # Protein_GI_number: 16130535 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Escherichia coli K12 # 1 553 1 553 553 1000 99.0 0 MLAQLTISNFAIVRELEIDFHSGMTVITGETGAGKSIAIDALGLCLGGRAEADMVRTGAA RADLCARFSLKDTPAALRWLEENQLEDGHECLLRRVISSDGRSRGFINGTAVPLSQLREL GQLLIQIHGQHAHQLLTKPEHQKFLLDGYANETSLLQEMTARYQLWHQSCRDLAHHQQLS QERAARAELLQYQLKELNEFNPQPGEFEQIDEEYKRLANSGQLLTTSQNALALMADGEDA NLQSQLYTAKQLVSELIGMDSKLSGVLDMLEEATIQIAEASDELRHYCDRLDLDPNRLFE LEQRISKQISLARKHHVSPEALPQYYQSLLEEQQQLDDQADSQETLALAVTKHHQQALEI ARALHQQRQQYAEELAQLITDSMHALSMPHGQFTIDVKFDEHHLGADGADRIEFRVTTNP GQPMQPIAKVASGGELSRIALAIQVITARKMETPALIFDEVDVGISGPTAAVVGKLLRQL GESTQVMCVTHLPQVAGCGHQHYFVSKETDGAMTETHMQSLNKKARLQELARLLGGSEVT RNTLANAKELLAA >gi|296493275|gb|ADTK01000226.1| GENE 17 14810 - 15688 648 292 aa, chain - ## HITS:1 COG:yfjB KEGG:ns NR:ns ## COG: yfjB COG0061 # Protein_GI_number: 16130534 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar kinase # Organism: Escherichia coli K12 # 1 292 1 292 292 611 100.0 1e-175 MNNHFKCIGIVGHPRHPTALTTHEMLYRWLCTKGYEVIVEQQIAHELQLKNVKTGTLAEI GQLADLAVVVGGDGNMLGAARTLARYDIKVIGINRGNLGFLTDLDPDNAQQQLADVLEGH YISEKRFLLEAQVCQQDCQKRISTAINEVVLHPGKVAHMIEFEVYIDEIFAFSQRSDGLI ISTPTGSTAYSLSAGGPILTPSLDAITLVPMFPHTLSARPLVINSSSTIRLRFSHRRNDL EISCDSQIALPIQEGEDVLIRRCDYHLNLIHPKDYSYFNTLSTKLGWSKKLF >gi|296493275|gb|ADTK01000226.1| GENE 18 15620 - 15814 188 64 aa, chain + ## HITS:1 COG:no KEGG:UTI89_C2948 NR:ns ## KEGG: UTI89_C2948 # Name: yfjC # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 64 20 83 83 118 100.0 7e-26 MCCQCSGVPWVSHNANTLEMIIHFSEVLVAKIDDNVSASLETLKLIPIISEVSEMNAKKT RRNS 
>gi|296493275|gb|ADTK01000226.1| GENE 19 15811 - 16404 911 197 aa, chain + ## HITS:1 COG:ECs3476 KEGG:ns NR:ns ## COG: ECs3476 COG0576 # Protein_GI_number: 15832730 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone GrpE (heat shock protein) # Organism: Escherichia coli O157:H7 # 1 197 1 197 197 315 100.0 3e-86 MSSKEQKTPEGQAPEEIIMDQHEEIEAVEPEASAEQVDPRDEKIANLEAQLAEAQTRERD GILRVKAEMENLRRRTELDIEKAHKFALEKFINELLPVIDSLDRALEVADKANPDMSAMV EGIELTLKSMLDVVRKFGVEVIAETNVPLDPNVHQAIAMVESDDVAPGNVLGIMQKGYTL NGRTIRAAMVTVAKAKA Prediction of potential genes in microbial genomes Time: Mon May 16 15:45:42 2011 Seq name: gi|296493274|gb|ADTK01000227.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont661.2, whole genome shotgun sequence Length of sequence - 11043 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 5, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.500 - CDS 8 - 1249 1306 ## COG4536 Putative Mg2+ and Co2+ transporter CorB 2 1 Op 2 . - CDS 1315 - 2106 712 ## COG4137 ABC-type uncharacterized transport system, permease component - Prom 2137 - 2196 3.9 + Prom 2157 - 2216 5.2 3 2 Op 1 23/0.000 + CDS 2273 - 3634 1638 ## COG0541 Signal recognition particle GTPase + Term 3644 - 3684 3.6 4 2 Op 2 12/0.000 + CDS 3711 - 4019 524 ## PROTEIN SUPPORTED gi|110806723|ref|YP_690243.1| 30S ribosomal protein S16 5 2 Op 3 30/0.000 + CDS 4038 - 4586 189 ## PROTEIN SUPPORTED gi|163796730|ref|ZP_02190688.1| 50S ribosomal protein L19 6 2 Op 4 33/0.000 + CDS 4617 - 5384 635 ## COG0336 tRNA-(guanine-N1)-methyltransferase 7 2 Op 5 . + CDS 5426 - 5773 573 ## PROTEIN SUPPORTED gi|15803128|ref|NP_289159.1| 50S ribosomal protein L19 + Term 5798 - 5844 9.6 8 3 Op 1 6/0.500 - CDS 5849 - 6331 274 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins 9 3 Op 2 . - CDS 6347 - 7570 779 ## COG2199 FOG: GGDEF domain 10 3 Op 3 . 
- CDS 7563 - 8081 163 ## ECH74115_3842 hypothetical protein - Prom 8133 - 8192 4.1 11 4 Tu 1 . - CDS 8231 - 8593 239 ## B21_02455 hypothetical protein - Prom 8639 - 8698 3.0 + Prom 8636 - 8695 7.2 12 5 Op 1 7/0.500 + CDS 8806 - 9876 1106 ## COG0722 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase 13 5 Op 2 . + CDS 9887 - 11008 1125 ## COG0287 Prephenate dehydrogenase Predicted protein(s) >gi|296493274|gb|ADTK01000227.1| GENE 1 8 - 1249 1306 413 aa, chain - ## HITS:1 COG:yfjDm KEGG:ns NR:ns ## COG: yfjDm COG4536 # Protein_GI_number: 16132248 # Func_class: P Inorganic ion transport and metabolism # Function: Putative Mg2+ and Co2+ transporter CorB # Organism: Escherichia coli K12 # 5 413 12 420 420 771 100.0 0 MVVISAYFSGSETGMMTLNRYRLRHMAKQGNRSAKRVEKLLRKPDRLISLVLIGNNLVNI LASALGTIVGMRLYGDAGVAIATGVLTFVVLVFAEVLPKTIAALYPEKVAYPSSFLLAPL QILMMPLVWLLNAITRMLMRMMGIKTDIVVSGSLSKEELRTIVHESRSQISRRNQDMLLS VLDLEKMTVDDIMVPRSEIIGIDINDDWKSILRQLSHSPHGRIVLYRDSLDDAISMLRVR EAWRLMSEKKEFTKETMLRAADEIYFVPEGTPLSTQLVKFQRNKKKVGLVVNEYGDIQGL VTVEDILEEIVGDFTTSMSPTLAEEVTPQNDGSVIIDGTANVREINKAFNWHLPEDDART VNGVILEALEEIPVAGTRVRIGEYDIDILDVQDNMIKQVKVFPVKPLRESVAE >gi|296493274|gb|ADTK01000227.1| GENE 2 1315 - 2106 712 263 aa, chain - ## HITS:1 COG:ECs3474 KEGG:ns NR:ns ## COG: ECs3474 COG4137 # Protein_GI_number: 15832728 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Escherichia coli O157:H7 # 1 263 26 288 288 446 100.0 1e-125 MPVFALLALVAYSVSLALIVPGLLQKNGGWRRMAIISAVIALVCHAIALEARILPDGDSG QNLSLLNVGSLVSLMICTVMTIVASRNRGWLLLPIVYAFALINLALATFMPNEYITHLEA TPGMLVHIGLSLFSYATLIIAALYALQLAWIDYQLKNKKLAFNQEMPPLMSIERKMFHIT QIGVVLLTLTLCTGLFYMHNLFSMENIDKAVLSIVAWFVYIVLLWGHYHEGWRGRRVVWF NVAGAVILTLAYFGSRIVQQLIS >gi|296493274|gb|ADTK01000227.1| GENE 3 2273 - 3634 1638 453 aa, chain + ## HITS:1 COG:ECs3473 KEGG:ns NR:ns ## COG: ECs3473 COG0541 # Protein_GI_number: 15832727 # Func_class: U Intracellular trafficking, secretion, and 
vesicular transport # Function: Signal recognition particle GTPase # Organism: Escherichia coli O157:H7 # 1 453 1 453 453 803 100.0 0 MFDNLTDRLSRTLRNISGRGRLTEDNVKDTLREVRMALLEADVALPVVREFINRVKEKAV GHEVNKSLTPGQEFVKIVRNELVAAMGEENQTLNLAAQPPAVVLMAGLQGAGKTTSVGKL GKFLREKHKKKVLVVSADVYRPAAIKQLETLAEQVGVDFFPSDVGQKPVDIVNAALKEAK LKFYDVLLVDTAGRLHVDEAMMDEIKQVHASINPVETLFVVDAMTGQDAANTAKAFNEAL PLTGVVLTKVDGDARGGAALSIRHITGKPIKFLGVGEKTEALEPFHPDRIASRILGMGDV LSLIEDIESKVDRAQAEKLASKLKKGDGFDLNDFLEQLRQMKNMGGMASLMGKLPGMGQI PDNVKSQMDDKVLVRMEAIINSMTMKERAKPEIIKGSRKRRIAAGCGMQVQDVNRLLKQF DDMQRMMKKMKKGGMAKMMRSMKGMMPPGFPGR >gi|296493274|gb|ADTK01000227.1| GENE 4 3711 - 4019 524 102 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|110806723|ref|YP_690243.1| 30S ribosomal protein S16 [Shigella flexneri 5 str. 8401] # 1 102 1 102 102 206 100 6e-53 MTPDSVPRWGPVVLFTQEDVMVTIRLARHGAKKRPFYQVVVADSRNARNGRFIERVGFFN PIASEKEEGTRLDLDRIAHWVGQGATISDRVAALIKEVNKAA >gi|296493274|gb|ADTK01000227.1| GENE 5 4038 - 4586 189 182 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163796730|ref|ZP_02190688.1| 50S ribosomal protein L19 [alpha proteobacterium BAL199] # 11 177 2 160 179 77 29 4e-14 MSKQLTAQAPVDPIVLGKMGSSYGIRGWLRVFSSTEDAESIFDYQPWFIQKAGQWQQVQL ESWKHHNQDMIIKLKGVDDRDAANLLTNCEIVVDSSKLPQLEEGDYYWKDLMGCQVVTTE GYDLGKVVDMMETGSNDVLVIKANLKDAFGIKERLVPFLDGQVIKKVDLTTRSIEVDWDP GF >gi|296493274|gb|ADTK01000227.1| GENE 6 4617 - 5384 635 255 aa, chain + ## HITS:1 COG:ECs3470 KEGG:ns NR:ns ## COG: ECs3470 COG0336 # Protein_GI_number: 15832724 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-(guanine-N1)-methyltransferase # Organism: Escherichia coli O157:H7 # 1 255 1 255 255 488 100.0 1e-138 MWIGIISLFPEMFRAITDYGVTGRAVKNGLLSIQSWSPRDFTHDRHRTVDDRPYGGGPGM LMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVCGRYEGIDE RVIQTEIDEEWSIGDYVLSGGELPAMTLIDSVSRFIPGVLGHEASATEDSFAEGLLDCPH YTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWLRRPELLENLALTEEQARLLAEF KTEHAQQQHKHDGMA >gi|296493274|gb|ADTK01000227.1| GENE 7 5426 - 5773 573 115 aa, 
chain + ## PROTEIN SUPPORTED ## NR: gi|15803128|ref|NP_289159.1| 50S ribosomal protein L19 [Escherichia coli O157:H7 EDL933] # 1 115 1 115 115 225 100 1e-58 MSNIIKQLEQEQMKQDVPSFRPGDTVEVKVWVVEGSKKRLQAFEGVVIAIRNRGLHSAFT VRKISNGEGVERVFQTHSPVVDSISVKRRGAVRKAKLYYLRERTGKAARIKERLN >gi|296493274|gb|ADTK01000227.1| GENE 8 5849 - 6331 274 160 aa, chain - ## HITS:1 COG:yfiB KEGG:ns NR:ns ## COG: yfiB COG2885 # Protein_GI_number: 16130526 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Escherichia coli K12 # 1 160 1 160 160 309 99.0 1e-84 MIKHLVAPLIFTSLILTGCQSPQGKFTPEQVAAMQSYGFTESAGDWSLGLSDAILFAKND YKLLPESQQQIQTMAAKLASTGLTHARMDGHTDNYGEDSYNEGLSLKRANVVADAWAMGG QIPRSNLTTQGLGKKYPIASNKTAQGRAENRRVAVVITTP >gi|296493274|gb|ADTK01000227.1| GENE 9 6347 - 7570 779 407 aa, chain - ## HITS:1 COG:yfiN_2 KEGG:ns NR:ns ## COG: yfiN_2 COG2199 # Protein_GI_number: 16130525 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 240 407 1 168 168 331 99.0 2e-90 MDNDNSLNKRPTFKRALRNISITSIFITMMLIWLLLSVTSVLTLKQYAQKNLALTAATMT YSLEAAVVFADGPAATETLAALGQQGQFSTAEVRDKQQNILASWHYTRKDPGDTFSNFIS HWLFPAPIIQPIRHNGETIGEVRLTARDSSISHFIWFSLAVLTGCILLASGIAITLTRHL HNGLVEALKNITDVVHDVRSNRNFSRRVSEERIAEFHRFALDFNSLLDEMEEWQLRLQAK NAQLLRTALHDPLTGLANRAAFRSGINTLMNNSDARKTSALLFLDGDNFKYINDTWGHAT GDRVLIEIAKRLAEVGGLRHKAYRLGGDEFAMVLYDVQSESEVQQICSALTQIFNLPFDL HNGHQTTMTLSIGYAMTIEHASAEKLQELADHNMYQAKHQRAEKLVR >gi|296493274|gb|ADTK01000227.1| GENE 10 7563 - 8081 163 172 aa, chain - ## HITS:1 COG:no KEGG:ECH74115_3842 NR:ns ## KEGG: ECH74115_3842 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 172 1 172 172 329 100.0 2e-89 MRFSHRLFLLLILLLTGAPILAQEPSDVAKNVRMMVSGIVSYTRWPALSGPPKLCIFSSS RFSTALQENAATSLPYLPVIIHTQQEAMISGCNGFYFGNESPTFQMELTEQYPSKALLLI AEQNTECIIGSAFCLIIHNNDVRFAVNLDALSRSGVKVNPDVLMLARKKNDG 
>gi|296493274|gb|ADTK01000227.1| GENE 11 8231 - 8593 239 120 aa, chain - ## HITS:1 COG:no KEGG:B21_02455 NR:ns ## KEGG: B21_02455 # Name: yfiL # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 120 2 121 121 238 100.0 5e-62 MKKFIAPLLALLVSGCQIDPYTHAPTLTSTDWYDVGMEDAISGSAIKDDDAFSDSQADRG LYLKGYAEGQKKTCQTDFTYARGLSGKSFPASCNNVENASQLHEVWQKGADENASTIRLN >gi|296493274|gb|ADTK01000227.1| GENE 12 8806 - 9876 1106 356 aa, chain + ## HITS:1 COG:aroF KEGG:ns NR:ns ## COG: aroF COG0722 # Protein_GI_number: 16130522 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Escherichia coli K12 # 1 356 1 356 356 708 99.0 0 MQKDALNNVHITDEQVLMTPEQLKAAFPLSLQQEAQIADSRKTISDIIAGRDPRLLVVCG PCSIHDPETALEYARRFKALAAEVSDSLYLVMRVYFEKPRTTVGWKGLINDPHMDGSFDV EAGLQIARKLLLELVNMGLPLATEALDPNSPQYLGDLFSWSAIGARTTESQTHREMASGL SMPVGFKNGTDGSLATAINAMRAAAQPHRFVGINQAGQVALLQTQGNPDGHVILRGGKAP NYSPADVAQCEKEMEQAGLRPSLMVDCSHGNSNKDYRRQPAVAESVVAQIKDGNRSIIGL MIESNIHEGNQSSEQPRSEMKYGVSVTDACISWEMTDALLREIHQDLNGQLTARVA >gi|296493274|gb|ADTK01000227.1| GENE 13 9887 - 11008 1125 373 aa, chain + ## HITS:1 COG:ECs3463_2 KEGG:ns NR:ns ## COG: ECs3463_2 COG0287 # Protein_GI_number: 15832717 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydrogenase # Organism: Escherichia coli O157:H7 # 100 373 1 274 274 557 100.0 1e-158 MVAELTALRDQIDEVDKALLNLLAKRLELVAEVGEVKSRFGLPIYVPEREASMLASRRAE AEALGVPPDLIEDVLRRVMRESYSSENDKGFKTLCPSLRPVVIVGGGGQMGRLFEKMLTL SGYQVRILEQHDWDRAADIVADAGMVIVSVPIHVTEQVIGKLPPLPKDCILVDLASVKNG PLQAMLAAHDGPVLGLHPMFGPDSGSLAKQVVVWCDGRKPEAYQWFLEQIQVWGARLHRI SAVEHDQNMAFIQALRHFATFAYGLHLAEENVQLEQLLALSSPIYRLELAMVGRLFAQDP QLYADIIMSSERNLALIKRYYKRFGEAIELLEQGDKQAFIDSFRKVEHWFGDYAQRFQSE SRVLLRQANDNRQ Prediction of potential genes in microbial genomes Time: Mon May 16 15:45:48 2011 Seq name: gi|296493273|gb|ADTK01000228.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont661.3, whole genome shotgun sequence 
Length of sequence - 2017 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 - CDS 31 - 1191 946 ## COG0077 Prephenate dehydratase - Prom 1356 - 1415 3.5 - Term 1388 - 1415 1.5 2 1 Op 2 . - CDS 1441 - 1782 574 ## PROTEIN SUPPORTED gi|15803120|ref|NP_289151.1| translation inhibitor protein RaiA - Prom 1842 - 1901 4.1 Predicted protein(s) >gi|296493273|gb|ADTK01000228.1| GENE 1 31 - 1191 946 386 aa, chain - ## HITS:1 COG:ECs3462_2 KEGG:ns NR:ns ## COG: ECs3462_2 COG0077 # Protein_GI_number: 15832716 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydratase # Organism: Escherichia coli O157:H7 # 105 386 1 282 282 567 99.0 1e-161 MTSENPLLALREKISALDEKLLALLAERRELAVEVGKAKLLSHRPVRDIDRERDLLERLI TLGKAHHLDAHYITRLFQLIIEDSVLTQQALLQQHLNKINPHSARIAFLGPKGSYSHLAA RQYAARHFEQFIESGCAKFADIFNQVETGQADYAVVPIENTSSGAINDVYDLLQHTCLSI VGEMTLTIDHCLLVSGTTDLSTINTVYSHPQPFQQCSKFLNRYPHWKIEYTESTSAAMEK VAQAKSPHVAALGSEAGGTLYGLQVLERIEANQRQNFTRFVVLARKAINVSDQVPAKTTL LMATGQQAGALVEALLVLRNHNLIMTRLESRPIHGNPWEEMFYLDIQANLESAEMQKALK ELGEITRSMKVLGCYPSENVVPVDPT >gi|296493273|gb|ADTK01000228.1| GENE 2 1441 - 1782 574 113 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803120|ref|NP_289151.1| translation inhibitor protein RaiA [Escherichia coli O157:H7 EDL933] # 1 113 1 113 113 225 100 2e-59 MTMNITSKQMEITPAIRQHVADRLAKLEKWQTHLINPHIILSKEPQGFVADATINTPNGV LVASGKHEDMYTAINELINKLERQLNKLQHKGEARRAATSVKDANFVEEVEEE Prediction of potential genes in microbial genomes Time: Mon May 16 15:45:50 2011 Seq name: gi|296493272|gb|ADTK01000229.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont661.4, whole genome shotgun sequence Length of sequence - 5614 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 831 - 890 4.8 2 2 Op 1 11/1.000 + CDS 945 - 1925 
1153 ## COG0564 Pseudouridylate synthases, 23S RNA-specific 3 2 Op 2 7/1.000 + CDS 1922 - 2653 478 ## COG1496 Uncharacterized conserved protein + Prom 2673 - 2732 4.7 4 3 Tu 1 . + CDS 2783 - 5356 1845 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 + Term 5361 - 5401 6.2 Predicted protein(s) >gi|296493272|gb|ADTK01000229.1| GENE 1 73 - 810 821 245 aa, chain - ## HITS:1 COG:ECs3458 KEGG:ns NR:ns ## COG: ECs3458 COG4105 # Protein_GI_number: 15832712 # Func_class: R General function prediction only # Function: DNA uptake lipoprotein # Organism: Escherichia coli O157:H7 # 1 245 1 245 245 466 100.0 1e-131 MTRMKYLVAAATLSLFLAGCSGSKEEVPDNPPNEIYATAQQKLQDGNWRQAITQLEALDN RYPFGPYSQQVQLDLIYAYYKNADLPLAQAAIDRFIRLNPTHPNIDYVMYMRGLTNMALD DSALQGFFGVDRSDRDPQHARAAFSDFSKLVRGYPNSQYTTDATKRLVFLKDRLAKYEYS VAEYYTERGAWVAVVNRVEGMLRDYPDTQATRDALPLMENAYRQMQMNAQAEKVAKIIAA NSSNT >gi|296493272|gb|ADTK01000229.1| GENE 2 945 - 1925 1153 326 aa, chain + ## HITS:1 COG:sfhB KEGG:ns NR:ns ## COG: sfhB COG0564 # Protein_GI_number: 16130515 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # Organism: Escherichia coli K12 # 1 326 1 326 326 625 100.0 1e-179 MAQRVQLTATVSENQLGQRLDQALAEMFPDYSRSRIKEWILDQRVLVNGKVCDKPKEKVL GGEQVAINAEIEEEARFEPQDIPLDIVYEDEDIIIINKPRDLVVHPGAGNPDGTVLNALL HYYPPIADVPRAGIVHRLDKDTTGLMVVAKTVPAQTRLVESLQRREITREYEAVAIGHMT AGGTVDEPISRHPTKRTHMAVHPMGKPAVTHYRIMEHFRVHTRLRLRLETGRTHQIRVHM AHITHPLVGDPVYGGRPRPPKGASEAFISTLRKFDRQALHATMLRLYHPISGIEMEWHAP IPQDMVELIEVMRADFEEHKDEVDWL >gi|296493272|gb|ADTK01000229.1| GENE 3 1922 - 2653 478 243 aa, chain + ## HITS:1 COG:ECs3456 KEGG:ns NR:ns ## COG: ECs3456 COG1496 # Protein_GI_number: 15832710 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 243 1 243 243 495 97.0 1e-140 MSKLIVPQWPLPKGVAACSSTRIGGVSLPPYDSLNLGAHCGDNPDHVEENRKRLFAAGNL PSKPVWLEQVHGKDVLKLTGEPYASKRADASYSNTPGTVCAVMTADCLPVLFCNRAGTEV 
AAAHAGWRGLCAGVLEETVSCFADNPENILAWLGPAIGPRAFEVGAEVREAFMAADAKAS TAFIQHGDKYLADIYLLARQRLASVGVEQIFGGDRCTYTENETFFSYRRDKTTGRMASFI WLI >gi|296493272|gb|ADTK01000229.1| GENE 4 2783 - 5356 1845 857 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 857 1 811 815 715 46 0.0 MRLDRLTNKFQLALADAQSLALGHDNQFIEPLHLMSALLNQEGGSVSPLLTSAGINAGQL RTDINQALNRLPQVEGTGGDVQPSQDLVRVLNLCDKLAQKRGDNFISSELFVLAALESRG TLADILKAAGATTANITQAIEQMRGGESVNDQGAEDQRQALKKYTIDLTERAEQGKLDPV IGRDEEIRRTIQVLQRRTKNNPVLIGEPGVGKTAIVEGLAQRIINGEVPEGLKGRRVLAL DMGALVAGAKYRGEFEERLKGVLNDLAKQEGNVILFIDELHTMVGAGKADGAMDAGNMLK PALARGELHCVGATTLDEYRQYIEKDAALERRFQKVFVAEPSVEDTIAILRGLKERYELH HHVQITDPAIVAAATLSHRYIADRQLPDKAIDLIDEAASSIRMQIDSKPEELDRLDRRII QLKLEQQALMKESDEASKKRLDMLNEELSDKERQYSELEEEWKAEKASLSGTQTIKAELE QAKIAIEQARRVGDLARMSELQYGKIPELEKQLEAATQLEGKTMRLLRNKVTDAEIAEVL ARWTGIPVSRMMESEREKLLRMEQELHHRVIGQNEAVDAVSNAIRRSRAGLADPNRPIGS FLFLGPTGVGKTELCKALANFMFDSDEAMVRIDMSEFMEKHSVSRLVGAPPGYVGYEEGG YLTEAVRRRPYSVILLDEVEKAHPDVFNILLQVLDDGRLTDGQGRTVDFRNTVVIMTSNL GSDLIQERFGELDYAHMKELVLGVVSHNFRPEFINRIDEVVVFHPLGEQHIASIAQIQLK RLYKRLEERGYEIHISDEALKLLSENGYDPVYGARPLKRAIQQQIENPLAQQILSGELVP GKVIRLEVNEDRIVAVQ Prediction of potential genes in microbial genomes Time: Mon May 16 15:45:53 2011 Seq name: gi|296493271|gb|ADTK01000230.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont662.1, whole genome shotgun sequence Length of sequence - 8053 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 1, operones - 1 average op.length - 8.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 7 - 1455 752 ## COG0534 Na+-driven multidrug efflux pump - Prom 1667 - 1726 80.1 + TRNA 1649 - 1724 87.1 # Asn GTT 0 0 - Term 1605 - 1647 3.5 2 1 Op 2 9/0.000 - CDS 1762 - 2640 701 ## COG0583 Transcriptional regulator - Prom 2745 - 2804 4.0 3 1 Op 3 3/0.000 - CDS 2814 - 3731 463 ## COG0583 Transcriptional regulator + TRNA 4057 - 4132 87.1 # Asn GTT 0 0 - 
Term 4132 - 4161 2.1 4 1 Op 4 4/0.000 - CDS 4188 - 5120 521 ## COG1376 Uncharacterized protein conserved in bacteria 5 1 Op 5 . - CDS 5185 - 5670 291 ## COG2038 NaMN:DMB phosphoribosyltransferase 6 1 Op 6 11/0.000 - CDS 5703 - 6263 469 ## COG2038 NaMN:DMB phosphoribosyltransferase 7 1 Op 7 8/0.000 - CDS 6275 - 7018 805 ## COG0368 Cobalamin-5-phosphate synthase 8 1 Op 8 . - CDS 7015 - 7557 462 ## COG2087 Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase - Prom 7655 - 7714 6.1 Predicted protein(s) >gi|296493271|gb|ADTK01000230.1| GENE 1 7 - 1455 752 482 aa, chain - ## HITS:1 COG:yeeO KEGG:ns NR:ns ## COG: yeeO COG0534 # Protein_GI_number: 16129928 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Escherichia coli K12 # 1 475 64 538 547 854 99.0 0 MNISSALRQVVHGTRWHAKRKSYKVLFWREITPLAVPIFMENACVLLMGVLSTFLVSWLG KDAMAGVGLADSFNMVIMAFFAAIDLGTTVVVAFSLGKRDRRRARVATRQSLVIMTLFAV LLATLIHHFGEQIIDFVAGDATTEVKALALTYLELTVLSYPAAAITLIGSGALRGAGNTK IPLLINGSLNILNIIISGILIYGLFSWPGLGFVGAGLGLTISRYIGAVAILWVLAIGFNP ALRISLKSYFKPLNFSIIWEVMGIGIPASVESVLFTSGRLLTQMFVAGMGTSVIAGNFIA FSIAALINLPGSALGSASTIITGRRLGVGQIAQAEIQLRHVFWLSTLGLTAIAWLTAPFA GVMASFYTQDPQVKHVVVILIWLNALFMPIWSASWVLPAGFKGARDARYAMWVSMLSMWG CRVVVGYVLGIMLGWGVVGVWMGMFADWAVRAVLFYWRMVTGRWLWKYPRPEPQKWVMLP TY >gi|296493271|gb|ADTK01000230.1| GENE 2 1762 - 2640 701 292 aa, chain - ## HITS:1 COG:ECs2783 KEGG:ns NR:ns ## COG: ECs2783 COG0583 # Protein_GI_number: 15832037 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 292 25 316 316 568 98.0 1e-162 MLFTSQSGVSRHIRELEDELGIEIFVRRGKQLLGMTEPGKALLVIAERILNEASNVRRLA DLFTNDTSGVLTIATTHTQARYSLPEVIKAFRELFPEVRLELIQGTPQEIATLLQNGEAD IGIASERLSNDPQLVAFPWFRWHHSLLVPHDHPLTQITPLTLESIAKWPLITYRQGITGR SRIDDAFARKGLLADIVLSAQDSDVIKTYVALGLGIGLVAEQSSGEQEEKNLIRLDTRHL FDANTVWLGLKRGQLQRNYVWRFLELCNAGLSVEDIKRQVMENSEEEIDYQI >gi|296493271|gb|ADTK01000230.1| GENE 3 2814 - 3731 463 305 aa, chain - ## HITS:1 
COG:nac KEGG:ns NR:ns ## COG: nac COG0583 # Protein_GI_number: 16129930 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 305 1 305 305 535 99.0 1e-152 MNFRRLKYFVKIVDIGSLTQAAEVLHIAQPALSQQVATLEGELNQQLLIRTKRGVTPTDA GKILYTHARAILRQCEQAQLAVHNVGQALSGQVSIGFAPGTAASSITMPLLQAVRAEFPE IVIYLHENSGAVLNEKLINHQLDMAVIYEHSPVAGVSSQALLKEDLFLVGTQDCPGQSVD VNAIAQMNLFLPSDYSAVRLRVDEAFSLRRLTAKVIGEIESIATLTAAIASGMGVAVLPE SAARSLCGAVNGWMSRITTPSMSLSLSLNLPARANLSPQAQAVKELLMSVISSPVMEKRQ WQLVS >gi|296493271|gb|ADTK01000230.1| GENE 4 4188 - 5120 521 310 aa, chain - ## HITS:1 COG:erfK KEGG:ns NR:ns ## COG: erfK COG1376 # Protein_GI_number: 16129931 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 310 1 310 310 600 99.0 1e-171 MRRVNILCSFALLFASHTSLAVTYPLPPEGSRLVGQSLTVTVPDHNTQPLETFAAQYGQG LSNMLEANPGADVFLPKSGSQLTIPQQLILPATVRKGIVVNVAEMRLYYYPPDSNTVEVF PIGIGQAGRETPRNWVTTVERKQEAPTWTPTPNTRREYAKRGESLPAFVPAGPDNPMGLY AIYIGRLYAIHGTNANFGIGLRVSQGCIRLRNDDIKYLFDNVPVGTRVQIIDQPVKYTTE PDGSNWLEVHEPLSRNRAEYESDRKVPLPVTPSLRAFINGQEVDVNRANAALQRRSGMPV QISSGSRQMF >gi|296493271|gb|ADTK01000230.1| GENE 5 5185 - 5670 291 161 aa, chain - ## HITS:1 COG:cobT KEGG:ns NR:ns ## COG: cobT COG2038 # Protein_GI_number: 16129932 # Func_class: H Coenzyme transport and metabolism # Function: NaMN:DMB phosphoribosyltransferase # Organism: Escherichia coli K12 # 1 161 199 359 359 288 98.0 4e-78 MVGIGANLPTDKLANKIDVVRRAITLNQPNPQDGVDVLAKVGGFDLVGMAGVMLGAASCG LPVLLDGFLSYAAALAACQMSPAIKPYLIPSHLSAEKGARIALSHLGLEPYLNMEMRLGE GSGAALAMPIIEAACAIYNNMGELAASNIVLPGNTTSDLNS >gi|296493271|gb|ADTK01000230.1| GENE 6 5703 - 6263 469 186 aa, chain - ## HITS:1 COG:ECs2786 KEGG:ns NR:ns ## COG: ECs2786 COG2038 # Protein_GI_number: 15832040 # Func_class: H Coenzyme transport and metabolism # Function: NaMN:DMB phosphoribosyltransferase # Organism: Escherichia coli O157:H7 # 1 179 1 179 359 323 97.0 2e-88 
MQTLADLLNTIPAIDPAAMSRAQRHIDGLLKPVGSLGRLEALAIQLAGMPGLNGIPHVGK KAVLVMCADHGVWEEGVAISPKEVTAIQAENMTRGTTGVCVLAAQAGANVHVVDVGIDSA EPIPGLINMRVARGSGNIASAPAMSRRQAEKLLLDVICYTRELAKNGVTLFGVGELGMAK RPRQRQ >gi|296493271|gb|ADTK01000230.1| GENE 7 6275 - 7018 805 247 aa, chain - ## HITS:1 COG:ECs2787 KEGG:ns NR:ns ## COG: ECs2787 COG0368 # Protein_GI_number: 15832041 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin-5-phosphate synthase # Organism: Escherichia coli O157:H7 # 1 247 1 247 247 412 98.0 1e-115 MSKLFWAMLSFITRLPVPRRWSQGLDFEHYSRGIITFPLIGLLLGAISGLVFMVLQAWCG VPLAALFSVLVLALMTGGFHLDGLADTCDGVFSARSRDRMLEIMRDSHLGTHGGLALIFV VLAKILVLSELALRGEPILASLAAACAVSRGTAALLMYRHRYAREEGLGNVFIGKIDGRQ TCVTLGLAAIFAAVLLPGMHGVAAMVVTMVAIFILGQLLKRTLGGQTGDTLGAAIELGEL VFLLALL >gi|296493271|gb|ADTK01000230.1| GENE 8 7015 - 7557 462 180 aa, chain - ## HITS:1 COG:ECs2788 KEGG:ns NR:ns ## COG: ECs2788 COG2087 # Protein_GI_number: 15832042 # Func_class: H Coenzyme transport and metabolism # Function: Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase # Organism: Escherichia coli O157:H7 # 1 180 2 181 181 351 98.0 3e-97 MILVTGGARSGKSRHAEALIGDSSQVLYIATSQILDDEMAARIEHHRQGRPAHWRTVERW QHLDELIHADINPHEAVLLECVTTMVTNLLFDYGGDKDPDEWDYQAMEQAINAEIQSLIA ACQRCPAKVVLVTNEVGMGIVPESRLARHFRDIAGRVNQQLAAAANEVWLVVSGIGVKIK Prediction of potential genes in microbial genomes Time: Mon May 16 15:45:55 2011 Seq name: gi|296493270|gb|ADTK01000231.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont663.1, whole genome shotgun sequence Length of sequence - 6290 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 7, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 236 - 616 177 ## COG2801 Transposase and inactivated derivatives + Term 695 - 739 0.2 + Prom 1139 - 1198 1.8 2 2 Tu 1 . + CDS 1219 - 1443 72 ## APECO1_O1CoBM150 hypothetical protein 3 3 Tu 1 . 
+ CDS 1858 - 3051 352 ## COG0523 Putative GTPases (G3E family) 4 4 Tu 1 . + CDS 3190 - 3501 58 ## COG2801 Transposase and inactivated derivatives + Prom 3579 - 3638 2.7 5 5 Tu 1 . + CDS 3688 - 4554 574 ## pECS88_0139 putative transposase (fragment) 6 6 Tu 1 . + CDS 4691 - 5077 116 ## APECO1_O1CoBM154 hypothetical protein + Term 5126 - 5168 2.3 + Prom 5165 - 5224 4.2 7 7 Tu 1 . + CDS 5312 - 6253 137 ## pECS88_0143 hypothetical protein Predicted protein(s) >gi|296493270|gb|ADTK01000231.1| GENE 1 236 - 616 177 126 aa, chain + ## HITS:1 COG:yi22 KEGG:ns NR:ns ## COG: yi22 COG2801 # Protein_GI_number: 16132094 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 1 126 176 301 301 253 95.0 4e-68 MLGAVERRFGNELPASPVEWLTDNGSCYRANETRQFARMLGLEPKSTAVRSPESNGIAES FVKTIKRDYISVMPKPDGLTAAKNLAEAFEHYNEWHPHSALGYRSPREYLRQQASNGLSD NRCLEI >gi|296493270|gb|ADTK01000231.1| GENE 2 1219 - 1443 72 74 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM150 NR:ns ## KEGG: APECO1_O1CoBM150 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 74 1 74 74 143 100.0 2e-33 MNWLRHTYQETLFPGRTYPPETPLLSGLKFFGIYRTQKNADITDRLLSVPNSADPKAMYL STERVPDILGTDVF >gi|296493270|gb|ADTK01000231.1| GENE 3 1858 - 3051 352 397 aa, chain + ## HITS:1 COG:BH1790 KEGG:ns NR:ns ## COG: BH1790 COG0523 # Protein_GI_number: 15614353 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Bacillus halodurans # 6 390 8 390 395 164 31.0 3e-40 METTSVIILNGFLGAGKTTLLKNLLIQAHEKQLSVSVIVNDMSELDVDGVLIANTDIVDA KDNNFVSITADSISSPAGLKKMDLAINHLLDHNKPDVMLIETSGSSHPLPLVKYLRRHSR VRLKAFLTLVDTVMLHEDYNDGKKLIPTLQENLRHNKRGLENLLAEQIMFCNRLLLTRND RLPFDIISAVAKAIHPLNPSVDVLAVSWGNIELSTLLAIPDYNFDRVELLISELEALVGN MDTPCNNEELIWRVIRDDRPFHPQRLWDTCHRFMGMGVYRSKGFFWLPGRDDLALLWNQS AGSISLALIGYWKAGVLEHTDNNLTREERSALQRHIDTASGRFGDRCCQLTIIGNATEVN DFTHALSLCLLTEEEIQWWMSGGVFPDPWPQKVTRLS 
>gi|296493270|gb|ADTK01000231.1| GENE 4 3190 - 3501 58 103 aa, chain + ## HITS:1 COG:b1578 KEGG:ns NR:ns ## COG: b1578 COG2801 # Protein_GI_number: 16129536 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 1 91 105 187 218 94 59.0 4e-20 MQSSYLGGLSTADSDYRVNETRQFARMLGLKSKNTAVRSPVSNGIAESFMKTIKRDYISI MPKLVVCFLWSSLCSVLRKNILEPVEQMNAWIRGAINKKKKWK >gi|296493270|gb|ADTK01000231.1| GENE 5 3688 - 4554 574 288 aa, chain + ## HITS:1 COG:no KEGG:pECS88_0139 NR:ns ## KEGG: pECS88_0139 # Name: not_defined # Def: putative transposase (fragment) # Organism: E.coli_S88 # Pathway: not_defined # 1 288 1 288 288 585 100.0 1e-166 MAINDAGMFTVKEINRLKILQDVIDRNIRPGQAAEMLGITPRHCSRLLKRYRQYGPLGMN NQSRGRTGNRLLPASLTDQALSIIREHYRDFGPTLAREKLEEVHGLVLGKETIRRLMIKA GLWIPRRQRAPKIHQPRYRRPCTGELIQIDGCDHHWFENRGRPCSALVYVDDATSRLMHL LFVKSESTFTYFEATRGYIEKYGKPMILYSDKASVFRVNNKHATTGPGETQFARAMRCLN ITPLCAETSQAKGRVERAHLTLQDRLVKELRLKGICTIEAANAFAEGL >gi|296493270|gb|ADTK01000231.1| GENE 6 4691 - 5077 116 128 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM154 NR:ns ## KEGG: APECO1_O1CoBM154 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 128 1 128 128 241 100.0 8e-63 MLYLIEDNEYSRRAIGKYIDVWHYPDGHKELRLNGVLLPYSTYDRLSEVDPVAIVDNKRL GHVLDVARQVQRKRDNNRSQSLPCSGDEPSRRRHAPSINKSQRSLNEDDLLEAMIKLQGS SEAIFGKR >gi|296493270|gb|ADTK01000231.1| GENE 7 5312 - 6253 137 313 aa, chain + ## HITS:1 COG:no KEGG:pECS88_0143 NR:ns ## KEGG: pECS88_0143 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 313 1 313 313 619 100.0 1e-176 MAIINIYSKRQRKIRGEVNDVYQYNNIPHALRVQIIKIITDSIGFPSSNERYTSYRNEAD KVYAYIHEILSKEYGVFSLKEFAKNDFDALVDFFLKERNTEKCLDFIEICFQILVSHVAK NHYEFKDITSQSPGDAVIELNERFREHGVGYQFESEEIIRIDSQLIHADVVKPTLILLSG EPLFEGANDEFLAAHEHYRHKRYKECLNDCLKSFESIMKAIHDKNNWKYSPNDTASKLIN SCLSQNLIPAYLQSQFTSLKTMLETGIPTIRNKNAGHGQGADIKEVPEELVSYMLHLTAT NLLFLLKCEKNIK Prediction of 
potential genes in microbial genomes Time: Mon May 16 15:46:09 2011 Seq name: gi|296493269|gb|ADTK01000232.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont663.2, whole genome shotgun sequence Length of sequence - 6684 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 44 - 271 61 ## COG1943 Transposase and inactivated derivatives 2 1 Op 2 . - CDS 228 - 416 123 ## ECS88_0034 hypothetical protein - Prom 449 - 508 3.4 3 2 Op 1 3/0.000 - CDS 745 - 2052 963 ## COG1538 Outer membrane protein 4 2 Op 2 13/0.000 - CDS 2119 - 3900 277 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 5 2 Op 3 . - CDS 4056 - 5237 698 ## COG0845 Membrane-fusion protein - Prom 5329 - 5388 8.3 + Prom 5859 - 5918 6.1 6 3 Tu 1 . + CDS 6137 - 6427 92 ## COG2801 Transposase and inactivated derivatives Predicted protein(s) >gi|296493269|gb|ADTK01000232.1| GENE 1 44 - 271 61 75 aa, chain - ## HITS:1 COG:STM0946 KEGG:ns NR:ns ## COG: STM0946 COG1943 # Protein_GI_number: 16764308 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Salmonella typhimurium LT2 # 1 65 1 65 152 129 90.0 2e-30 MGNEKSLAHTRWNCKYHIVFAPKYRRQVFYREKRRAIGCILRKLCEWKSVRILEAECCAD HIHMLQCLSKLSENV >gi|296493269|gb|ADTK01000232.1| GENE 2 228 - 416 123 62 aa, chain - ## HITS:1 COG:no KEGG:ECS88_0034 NR:ns ## KEGG: ECS88_0034 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 62 6 67 67 119 95.0 3e-26 MCIETPQLGWGFRKAFSFEPVIKTPFDLLKHLAVWQLQVSNKKSKGGPNGERKELSAHPM EL >gi|296493269|gb|ADTK01000232.1| GENE 3 745 - 2052 963 435 aa, chain - ## HITS:1 COG:HI1462 KEGG:ns NR:ns ## COG: HI1462 COG1538 # Protein_GI_number: 16273365 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane 
protein # Organism: Haemophilus influenzae # 31 430 51 451 454 166 26.0 1e-40 MLKSDYRAPEVNYPINWTKGDVDGNTSPFDWEEFNDPNLDNWLHLVMTSNNDIAIAALRI HRAQLDAERTGITNTPALKAALSMDGKKQLNNSSGWAKSGSASLGTSYELDLWGKIARQR DVAEWAVHASEEDFRSARLMLLSEASNNYWRIGFVNQQITTLQQSIDYAKETLRLAEVRY RAGNISSLDVIDAQQNLLTQENQLTGLQREHSQLLNQQAVLLGTVPGCQIVEPTTLPKGS LPKVNANIPASILMRRPDISAKEWQLREALATVDIKRSEYYPTFNLTGALGTSSASLLAL LHNPVGSVGANLTLPFLEWRQRDIEVKIARNDYEQRVLEFKQLLYKAMSSIEDALSFRNQ LLLQETRLREELELARKSEWLNEVRYRHGAVRISFWLDAQEKRRQAELRLDENRFNQLQN LAKIYLEFGGASTFP >gi|296493269|gb|ADTK01000232.1| GENE 4 2119 - 3900 277 593 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 201 593 6 413 413 111 25 2e-24 MGCLDVPNRGDYYIDGQNAACLSPDELARVRREHIGFIFQRYHLIPDLSALGNVEIPAIY ANSERDSRRQRATALLGRLGLEGREHHKPCELSGGQQQRVSIARALINGGKIILADEPTG ALDSQSGQEVLAILNELNRRGHTVVMVTHDMKVARHAKRIIELCDGEIIADSGGCVSATE TLPKTNRIRQSYWKTLLDRTRESMQMALKAMKTHRLRTTLTMIGIVFGIASVVTVVALGE GARQETLEEIKSLGTNVVSIYPGQDLFDDSIESIRTLVPADANALAKQGFIDSVSPEVSA SDNIRFLGKSAIASINGVGREHFRVKGIELLQGTTFRDDRNALQEVIIDENTRKAIFDNT GLQALGQIVFLGSVPARVVGIAKSNNRSDASNRITVWMPYSTVMYRIVGKPVLTGISVRL KDNVDNEAAISAISQLLTRRHGIKDFQLYNFEQIRKSIEHTSMTFSILILMVACISLMIG SIGVMNIMLISVTERTHEIGVRMAVGARRSDIMQQFIIEAVLVCLIGGALGIALSYITGA LFNALADGIFAAIYSWQAAVAAFFCSTLIGIIFGYLPARKAARMDPVISLASE >gi|296493269|gb|ADTK01000232.1| GENE 5 4056 - 5237 698 393 aa, chain - ## HITS:1 COG:YPO2999 KEGG:ns NR:ns ## COG: YPO2999 COG0845 # Protein_GI_number: 16123180 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Yersinia pestis # 26 380 29 381 401 275 45.0 9e-74 MKYISHRAIICTLTLLIIIITVLFTFLRSSDVPEYITAPVRKGDIENSVLATGRIDAIER VNVGAQVSGQLKSLKVKQGDHVTKGQLIAEIDDLPQCNDLRNAEAALNEVKAELQSKQAL LKQAELRFKRQLRMLRENASSHEDFESAEAMLATTRAELHSLNAKLVQAQIEVDKKKLAL EYTRVVAPMDGIVIAIVTQQGQTVNSNQSAPTIIKLARLDVMTIKAQISEADITRISVGQ KARFSIFSEPDKHYSATLRAVELAPESVMKDDSLASNTSASGSGTSNASVYYNALFDVPN 
PENRLRIAMTAQVTLITDEAQNTLLVPIQAVHRNEGKKQQVLVLAADGRLEPRNVKTGIT NSVDIQILEGLNVGENVVLSLPDKKEPEERIML >gi|296493269|gb|ADTK01000232.1| GENE 6 6137 - 6427 92 96 aa, chain + ## HITS:1 COG:VC0257 KEGG:ns NR:ns ## COG: VC0257 COG2801 # Protein_GI_number: 15640286 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Vibrio cholerae # 11 91 116 244 290 110 46.0 8e-25 MACLLTTSIVFGDYAAIQVWCGDVTYIWTGKPGGVMFHSDTSRQFRQLLWRYQISQSMSR RGNCWDNSPMERFFRSLKNEWMPVVGYEASARPLTP Prediction of potential genes in microbial genomes Time: Mon May 16 15:46:12 2011 Seq name: gi|296493268|gb|ADTK01000233.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont694.1, whole genome shotgun sequence Length of sequence - 354 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 90 - 353 93 ## COG4718 Phage-related protein Predicted protein(s) >gi|296493268|gb|ADTK01000233.1| GENE 1 90 - 353 93 87 aa, chain - ## HITS:1 COG:ECs1644 KEGG:ns NR:ns ## COG: ECs1644 COG4718 # Protein_GI_number: 15830898 # Func_class: S Function unknown # Function: Phage-related protein # Organism: Escherichia coli O157:H7 # 1 87 23 109 109 154 93.0 4e-38 VRFGDGYSQRAPAGLNADLKTYSVTLSVSREEATALESFLAEHGGWKAFLWTPPYGYRQI KVTCAKWSSQVSMLRVEFSAEFKQVVN Prediction of potential genes in microbial genomes Time: Mon May 16 15:46:13 2011 Seq name: gi|296493267|gb|ADTK01000234.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont694.2, whole genome shotgun sequence Length of sequence - 2293 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 - CDS 1 - 268 107 ## COG4718 Phage-related protein 2 1 Op 2 . 
- CDS 265 - 2106 1693 ## COG5281 Phage-related minor tail protein Predicted protein(s) >gi|296493267|gb|ADTK01000234.1| GENE 1 1 - 268 107 89 aa, chain - ## HITS:1 COG:ECs1644 KEGG:ns NR:ns ## COG: ECs1644 COG4718 # Protein_GI_number: 15830898 # Func_class: S Function unknown # Function: Phage-related protein # Organism: Escherichia coli O157:H7 # 1 89 1 89 109 157 92.0 4e-39 MKTFRWKVKPGMDVTSAPSVREVRFGDGYSQRAPAGLNANLKTYSVTLSVPREEATVLES FLEEHGGWKSFLWTPPYEWRQIKVTCAKW >gi|296493267|gb|ADTK01000234.1| GENE 2 265 - 2106 1693 613 aa, chain - ## HITS:1 COG:ECs1643 KEGG:ns NR:ns ## COG: ECs1643 COG5281 # Protein_GI_number: 15830897 # Func_class: S Function unknown # Function: Phage-related minor tail protein # Organism: Escherichia coli O157:H7 # 1 613 241 849 849 931 97.0 0 MARQFHNVTAEQIAYVAQLQRSGDEAGALQAANEAATKGFDDQTRRLKENMGTLETWADR TARAFKSMWDAVLDIGRPDTAQEMLIKAEAAFKKADDIWNLRKDDYFVNDEARARYWDDR EKARLALEAARKKAEQQTQQDKNAQQQSDTEASRLKYTEEAQKAYERLQTPLEKYTARQE ELNKALKDGKILQADYNTLMAAAKKDYEATLKKPKQSGVKVSAGDRQEDSAHAALLTLQA ELRTLEKHAGANEKISQQRRDLWKAESQFAVLEEAAQRRQLSAQEKSLLAHKDETLEYKR QLAALGDKVTYQERLNALAQQADKFAQQQRAKRAAIDAKSRGLTDRQAEREATEQRLKEQ YGDNPLALNNVMSEQKKTWAAEDQLRGSWMAGLKSGWSEWEESATDSMSQVKSAATQTFD GIAQNMAAMLTGSEQNWRSFTRSVLSMMTEILLKQAMVGIVGSIGSAIGGAVGGGASASG GTAIQAAAAKFHFATGGFTGTGGKYEPAGIVHRGEFVFTKEATSRIGVGNLYRLMRGYAT GGYVGTPGSMADSRSQASGKFEQNNHVVINNDGTNGQIGPQALKAVYDVARKAAMDVVTG QMRDGGLFSGGGR Prediction of potential genes in microbial genomes Time: Mon May 16 15:46:13 2011 Seq name: gi|296493266|gb|ADTK01000235.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont714.1, whole genome shotgun sequence Length of sequence - 1155 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 916 - 1153 205 ## ECS88_3271 hypothetical protein Predicted protein(s) >gi|296493266|gb|ADTK01000235.1| GENE 1 916 - 1153 205 79 aa, chain + ## HITS:1 COG:no KEGG:ECS88_3271 NR:ns ## KEGG: ECS88_3271 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 79 1 79 188 167 100.0 9e-41 MPKSYTPNWFFTALLDNHINQMMARYSCLRALRMDFFYRKDTPDFLQPDHRWLELQLRML LEQVEQFENIVGFFWVIEW Prediction of potential genes in microbial genomes Time: Mon May 16 15:46:17 2011 Seq name: gi|296493265|gb|ADTK01000236.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont714.2, whole genome shotgun sequence Length of sequence - 5100 bp Number of predicted genes - 11, with homology - 9 Number of transcription units - 9, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 35 - 187 75 ## ECP_4589 hypothetical protein 2 2 Op 1 . + CDS 434 - 1006 461 ## c5193 hypothetical protein 3 2 Op 2 . + CDS 1075 - 1308 232 ## COG3311 Predicted transcriptional regulator + Term 1327 - 1370 2.0 + Prom 1424 - 1483 3.3 4 3 Tu 1 . + CDS 1628 - 1816 81 ## COG2274 ABC-type bacteriocin/lantibiotic exporters, contain an N-terminal double-glycine peptidase domain + Prom 1838 - 1897 4.6 5 4 Op 1 . + CDS 1923 - 2069 177 ## 6 4 Op 2 . + CDS 2140 - 2418 69 ## ECP_0286 hypothetical protein + Term 2434 - 2471 3.0 7 5 Tu 1 . - CDS 2592 - 3005 84 ## COG1266 Predicted metal-dependent membrane protease - Prom 3212 - 3271 4.8 - Term 3144 - 3184 -0.9 8 6 Tu 1 . - CDS 3289 - 3465 82 ## - Prom 3618 - 3677 4.1 + Prom 3789 - 3848 6.2 9 7 Tu 1 . + CDS 3901 - 4092 62 ## ECs1375 hypothetical protein + Term 4206 - 4247 2.1 + Prom 4095 - 4154 4.5 10 8 Tu 1 . + CDS 4294 - 4842 94 ## EC55989_4845 hypothetical protein 11 9 Tu 1 . 
- CDS 4861 - 5100 134 ## SeHA_C4697 ProQ family protein Predicted protein(s) >gi|296493265|gb|ADTK01000236.1| GENE 1 35 - 187 75 50 aa, chain + ## HITS:1 COG:no KEGG:ECP_4589 NR:ns ## KEGG: ECP_4589 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_536 # Pathway: not_defined # 1 50 139 188 188 108 98.0 7e-23 MRHNDPESIDNIRGALHYLAKEEQKDGLCAYGCNEVPERPAAGRPRKPHF >gi|296493265|gb|ADTK01000236.1| GENE 2 434 - 1006 461 190 aa, chain + ## HITS:1 COG:no KEGG:c5193 NR:ns ## KEGG: c5193 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_CFT073 # Pathway: not_defined # 1 190 49 238 238 364 91.0 1e-99 MNAIPYFDYSLAPFWPSYQNKVIGVLESALREQSGSRIRRILLRLPCEYDRTFSSRKIWF GMDFIETVSALMNATPGRDLCWLLTHHPEKPEYHAVLCVRQEYFDGPELDRLILDAWSNV LGFASPGEAKQYQKQITRDVVLDSRSPDCEERLKELIWAFSDFARDRRGVHDPEARYLAS NPWYPVAGQL >gi|296493265|gb|ADTK01000236.1| GENE 3 1075 - 1308 232 77 aa, chain + ## HITS:1 COG:Z1188 KEGG:ns NR:ns ## COG: Z1188 COG3311 # Protein_GI_number: 15800709 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Escherichia coli O157:H7 EDL933 # 1 64 1 64 65 117 93.0 5e-27 MLTTTSHDSVWLRADDPLIDMNYITSFTGMTDKWFYKLISEGHFPNPIKLGRSSRWYKSE VEQWMQQRIEESRGAAA >gi|296493265|gb|ADTK01000236.1| GENE 4 1628 - 1816 81 62 aa, chain + ## HITS:1 COG:XF1220 KEGG:ns NR:ns ## COG: XF1220 COG2274 # Protein_GI_number: 15837822 # Func_class: V Defense mechanisms # Function: ABC-type bacteriocin/lantibiotic exporters, contain an N-terminal double-glycine peptidase domain # Organism: Xylella fastidiosa 9a5c # 3 62 595 654 707 87 63.0 8e-18 MKMPMGYETLIGELGEGLSGGQKQRIFIARALYRKPGILFMDEATSSLDTESERFVNAAI KK >gi|296493265|gb|ADTK01000236.1| GENE 5 1923 - 2069 177 48 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGEVKKDIKITVIAFVINYLFFYIPVSLYLSYYYGYNFLNLYMFFYHL >gi|296493265|gb|ADTK01000236.1| GENE 6 2140 - 2418 69 92 aa, chain + ## HITS:1 COG:no KEGG:ECP_0286 NR:ns ## KEGG: ECP_0286 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_536 # 
Pathway: not_defined # 1 75 19 93 110 115 97.0 5e-25 MRKLSENEIKQISGGDGDDGQAELIAIGSLAGTFISPGFGSIAGAYIGDKVHSWATTATV SPSMSPSGIGLSSQFGSGRGTSSASSSAGSGS >gi|296493265|gb|ADTK01000236.1| GENE 7 2592 - 3005 84 137 aa, chain - ## HITS:1 COG:STM2377 KEGG:ns NR:ns ## COG: STM2377 COG1266 # Protein_GI_number: 16765704 # Func_class: R General function prediction only # Function: Predicted metal-dependent membrane protease # Organism: Salmonella typhimurium LT2 # 32 132 116 217 218 59 38.0 2e-09 MIIQLFVPYLLAVRKTEEWMISQMSFSGAILWINIFSSVLLVPVYEEIVFRGCLFNSFKF WFNDNIYTSAIVTSVIFSALHLQYTDFRTFLMLFLVSLVLISAKIKSNGLLMPILLHMSM NAVIAGIQYLASISYTH >gi|296493265|gb|ADTK01000236.1| GENE 8 3289 - 3465 82 58 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MASGPSLFLHYFSHYSISLHSFTDNNYKIKVMAILIQENKRSFYGYLFLLYKRILKDK >gi|296493265|gb|ADTK01000236.1| GENE 9 3901 - 4092 62 63 aa, chain + ## HITS:1 COG:no KEGG:ECs1375 NR:ns ## KEGG: ECs1375 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157J # Pathway: not_defined # 1 63 1 63 63 108 96.0 5e-23 MTTSPFILEFHDNDRDNHYQLIVSVILTYITLSWFNQVVYLMKRVCLWMYCVYVLPGICS YIN >gi|296493265|gb|ADTK01000236.1| GENE 10 4294 - 4842 94 182 aa, chain + ## HITS:1 COG:no KEGG:EC55989_4845 NR:ns ## KEGG: EC55989_4845 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 182 18 199 199 328 90.0 4e-89 MQITEALISEPGDIRRFVQQAVDHWPHLLAFYFTLHSAEGNINGQQIHAFCTSFYRQVHE RITERNHMASPASPLVLRWLREQHGGATIRCLLLLSQTSICHPRASVTVDEQCSQVVDLL QHSWQVISAGGQCRVERCFRVARGDTSGQYVALKTAALSLGLPVVTGITHRPVQRCTLIT AQ >gi|296493265|gb|ADTK01000236.1| GENE 11 4861 - 5100 134 79 aa, chain - ## HITS:1 COG:no KEGG:SeHA_C4697 NR:ns ## KEGG: SeHA_C4697 # Name: not_defined # Def: ProQ family protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 2 79 37 114 114 130 97.0 1e-29 ADDLIQDIAIRELAFGAGALRAAVASYVQSPRYYRALMAGGARYDQKGQPCGEVTPQEQK EAETRLMVLNDRQKARKPR Prediction of potential genes in microbial genomes Time: Mon May 16 15:46:39 2011 Seq name: 
gi|296493264|gb|ADTK01000237.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont714.3, whole genome shotgun sequence Length of sequence - 1245 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 290 64 ## EC55989_0278 activator of ProP osmoprotectant transporter 2 1 Op 2 . - CDS 247 - 1119 374 ## COG2801 Transposase and inactivated derivatives - Prom 1157 - 1216 1.7 Predicted protein(s) >gi|296493264|gb|ADTK01000237.1| GENE 1 2 - 290 64 96 aa, chain - ## HITS:1 COG:no KEGG:EC55989_0278 NR:ns ## KEGG: EC55989_0278 # Name: not_defined # Def: activator of ProP osmoprotectant transporter # Organism: E.coli_55989 # Pathway: not_defined # 8 96 9 107 185 93 55.0 2e-18 MVVSAIASTPQLTINRKPQGIFSKSPATPLQDKTNTVHNEKPLRVRNQTPWRHMTKRQRK NRKRINRLVEHWPELFNREKTQPLKVEIPDDLIQDI >gi|296493264|gb|ADTK01000237.1| GENE 2 247 - 1119 374 290 aa, chain - ## HITS:1 COG:ECs1311 KEGG:ns NR:ns ## COG: ECs1311 COG2801 # Protein_GI_number: 15830565 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 265 1 265 272 526 98.0 1e-149 MCQVFGVSRSGYYNWVQHEPSDRKQSDERLKLEIKVAHIRTRETYGTRRLPTELAENGII VGRDRLARLRKELRLRCKQKRKFRATTNSNHNLPVAPNLLNQTFAPTAPNQVWVADLTYV ATQEGWLYLAGIKDVYTCEIVGYAMGERMTKELTGKALFMALRSQRPPAGLIHHSDRGSQ YCAYDYRVIQEQFGLKTSMSRKGNCYDNAPMESFWGTLKNESLSHYRFNNRDEAISVIRE YIEIFYNRQRRHSRLGNISPAAFRENIIRWLLKKRTNGSVRYCQYTSADH Prediction of potential genes in microbial genomes Time: Mon May 16 15:46:41 2011 Seq name: gi|296493263|gb|ADTK01000238.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont729.1, whole genome shotgun sequence Length of sequence - 763 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 7 - 711 318 ## COG3316 Transposase and inactivated derivatives Predicted protein(s) >gi|296493263|gb|ADTK01000238.1| GENE 1 7 - 711 318 234 aa, chain + ## HITS:1 COG:Cgl0933 KEGG:ns NR:ns ## COG: Cgl0933 COG3316 # Protein_GI_number: 19552183 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Corynebacterium glutamicum # 1 233 1 234 236 274 58.0 9e-74 MNPFKGRHFQRDIILWAVRWYCKYGISYRELQEMLAERGVNVDHSTIYRWVQRYAPEMEK RLRWYWRNPSDLCPWHMDETYVKVNGRWAYLYRAVDSRGRTVDFYLSSRRNSKAAYRFLG KILNNVKKWQIPRFINTDKAPAYGRALALLKREGRCPSDVEHRQIKYRNNVIECDHGKLK RIIGATLGFKSMKTAYATIKGIEVMRALRKGQASAFYYGDPLGEMRLVSRVFEM Prediction of potential genes in microbial genomes Time: Mon May 16 15:46:44 2011 Seq name: gi|296493262|gb|ADTK01000239.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont743.1, whole genome shotgun sequence Length of sequence - 10158 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 4, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 67 - 1389 741 ## E2348_P1_058 conjugal transfer mating pair stabilization protein TraN 2 1 Op 2 . + CDS 1416 - 1673 89 ## APECO1_O1CoBM45 conjugal transfer protein TrbE 3 1 Op 3 . + CDS 1666 - 2409 634 ## pECS88_0084 conjugal pilus assembly protein TraF 4 1 Op 4 . + CDS 2425 - 2772 140 ## APECO1_O1CoBM47 conjugal transfer protein TrbA + Term 2972 - 3029 -0.9 5 2 Tu 1 . - CDS 2774 - 3088 87 ## E2348_P1_054 hypothetical protein - Prom 3114 - 3173 3.7 + Prom 2991 - 3050 5.4 6 3 Op 1 . + CDS 3169 - 3453 341 ## p1ECUMN_0088 conjugal transfer pilin chaperone TraQ 7 3 Op 2 . + CDS 3440 - 3985 452 ## p1ECUMN_0087 conjugal transfer protein TrbB 8 3 Op 3 . + CDS 3975 - 4256 168 ## p1ECUMN_0086 conjugal transfer protein TrbJ 9 3 Op 4 . + CDS 4243 - 4623 183 ## E2348_P1_050 conjugal transfer protein TrbF 10 3 Op 5 . 
+ CDS 4620 - 5996 1107 ## EcSMS35_A0015 conjugal transfer pilus assembly protein TraH 11 3 Op 6 . + CDS 5993 - 8815 1908 ## ECO26_p2-71 conjugal transfer mating pair stabilization protein TraG + Term 8934 - 8968 -0.7 + Prom 8827 - 8886 1.6 12 4 Op 1 . + CDS 9044 - 9328 77 ## ECO26_p2-72 conjugal transfer protein TraS 13 4 Op 2 . + CDS 9360 - 10091 720 ## EcSMS35_A0012 conjugal transfer surface exclusion protein TraT Predicted protein(s) >gi|296493262|gb|ADTK01000239.1| GENE 1 67 - 1389 741 440 aa, chain + ## HITS:1 COG:no KEGG:E2348_P1_058 NR:ns ## KEGG: E2348_P1_058 # Name: traN # Def: conjugal transfer mating pair stabilization protein TraN # Organism: E.coli_0127 # Pathway: not_defined # 1 440 163 602 602 854 99.0 0 MELQGSTTWETRTLEYEMSQLPAREVNGQYVVSITSPVTGEIVDAHYSWSRTYLQKSVPM TITVLGTPLSWNAKYSADASFTPVQKTLTAGVAFTSSHPVRVGNTKFKRHTAMKLRLVVR VKKASYTPYVVWSESCPFSKELGKLTKTECTEAGGNRTLVKDGQSYSMYQSCWAYRDTYV TQSADKGTCQTYTDNPACTLVSHQCAFYSEEGACLHEYATYSCESKTSGKVMVCGGDVFC LDGECDKAQSGQSNDFAEAVSQLAALAAAGKDVAALNGGDVRAFTGQAKFCKKAAAGYSN CCKDSGWGQDIGLAKCSSDEKALAKAKSNKLTVSVGEFCSKKVLGVCLEKKRSYCQFDSK LAQIVQQQGRNGQLRISFGSAKHPDCRGITVDELQKIQFNRLDFTNFYEDLMNNQKIPDS GVLTQKVKEQIADQLKQAGQ >gi|296493262|gb|ADTK01000239.1| GENE 2 1416 - 1673 89 85 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM45 NR:ns ## KEGG: APECO1_O1CoBM45 # Name: trbE # Def: conjugal transfer protein TrbE # Organism: E.coli_APEC # Pathway: not_defined # 1 85 2 86 86 152 100.0 2e-36 MKVIFTSNRFIDFLIRLLITAIVISPVIIWSWDTVKETTADGMLAAAFVILYSGVLLFIL YFCFSALTDLQKPDERKSDERNEDE >gi|296493262|gb|ADTK01000239.1| GENE 3 1666 - 2409 634 247 aa, chain + ## HITS:1 COG:no KEGG:pECS88_0084 NR:ns ## KEGG: pECS88_0084 # Name: traF # Def: conjugal pilus assembly protein TraF # Organism: E.coli_S88 # Pathway: not_defined # 1 247 11 257 257 485 97.0 1e-136 MNKALLPLLLCCFIFPASGKDAGWQWYNEKINPKEKENKPVPAAPRQEPDIMQKLAALQT ATKRALYEAILYPGVDNFVKYFRLQNYWTQQAGLFTMSARKAMLAHPELDYNLQYSHYNG TVRNQLAADQAQQRQAIAKLAEHYGIMFFYRGQDPIDGQLAQVINGFRDTYGLSVIPVSV 
DGVINPLLPDSRTDQGQAQRLGVKYFPAMMLVDPKQGSIRPLSYGFISQDDLAKQFLNVS EDFKPNF >gi|296493262|gb|ADTK01000239.1| GENE 4 2425 - 2772 140 115 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM47 NR:ns ## KEGG: APECO1_O1CoBM47 # Name: trbA # Def: conjugal transfer protein TrbA # Organism: E.coli_APEC # Pathway: not_defined # 1 115 1 115 115 175 97.0 5e-43 MSEDYLKMFTGVVLLIFVIIAGYFFSERNDRKMFLLSSLVFLVINIACLYVLTASLWFLC GAIMSQGAALVVSIVAAALPDVTSFDRFRRIFICIVLSSVWSGVMWFFIRGLMTG >gi|296493262|gb|ADTK01000239.1| GENE 5 2774 - 3088 87 104 aa, chain - ## HITS:1 COG:no KEGG:E2348_P1_054 NR:ns ## KEGG: E2348_P1_054 # Name: artA # Def: hypothetical protein # Organism: E.coli_0127 # Pathway: not_defined # 1 104 1 104 104 145 98.0 4e-34 MEKRSFKEKLEIIRNIIRESLPGNAAIIALIYAASHSLPVNAFPDYLVISLLSIAAGIVV LWLFSIIYIYFCELFRSHWIAVWFIIWSSVINLIILYGFYDRFI >gi|296493262|gb|ADTK01000239.1| GENE 6 3169 - 3453 341 94 aa, chain + ## HITS:1 COG:no KEGG:p1ECUMN_0088 NR:ns ## KEGG: p1ECUMN_0088 # Name: traQ # Def: conjugal transfer pilin chaperone TraQ # Organism: E.coli_UMN026 # Pathway: not_defined # 1 94 1 94 94 166 98.0 3e-40 MISKRRFSLPRLDITGMWVFSLGVWFHIVARLVYSKPWMAFFLAELIAAILVLFGAYQVL DAWIARVSREEREALEARQQAMMEGQQEGGHVSH >gi|296493262|gb|ADTK01000239.1| GENE 7 3440 - 3985 452 181 aa, chain + ## HITS:1 COG:no KEGG:p1ECUMN_0087 NR:ns ## KEGG: p1ECUMN_0087 # Name: trbB # Def: conjugal transfer protein TrbB # Organism: E.coli_UMN026 # Pathway: not_defined # 1 181 1 181 181 305 88.0 6e-82 MSLTKSLLFTLLLSAAAVQASTRDEIERLWNPQGMATQPAQPAAGTSARTAKPAPRWFRL SNGRQGKLADWKGGLFMQGHCPYCHQFDPVLKQLAQQYGFSVFSYTLDGQGDTAFPEALP VPPEVMQTFFPNIPVATPTTFLVNVNTLEALPLLQGATDAAGFMARVDTVLQMYGGKKGA K >gi|296493262|gb|ADTK01000239.1| GENE 8 3975 - 4256 168 93 aa, chain + ## HITS:1 COG:no KEGG:p1ECUMN_0086 NR:ns ## KEGG: p1ECUMN_0086 # Name: trbJ # Def: conjugal transfer protein TrbJ # Organism: E.coli_UMN026 # Pathway: not_defined # 1 93 32 124 126 161 94.0 8e-39 MRNKQVVLLIAGISGIVTGIIVSLNIPFIRQGLFYPASPVEIVVSLSLTFSVSVVFFVGA 
IVGWISVSEIYYSRMTGLNESSEISEGTYNERK >gi|296493262|gb|ADTK01000239.1| GENE 9 4243 - 4623 183 126 aa, chain + ## HITS:1 COG:no KEGG:E2348_P1_050 NR:ns ## KEGG: E2348_P1_050 # Name: trbF # Def: conjugal transfer protein TrbF # Organism: E.coli_0127 # Pathway: not_defined # 1 126 1 126 126 200 76.0 1e-50 MRENKSNPELKIRSTERDYKYISRITGRYAGLSLVFLTAGIVLWTVMDIIFDACIDSWKA DPELNNSSYMWNILIYAIPYALYALAAGFLVTFFSVPNVRINIRKYRDIPAEMSYAPGEH IKGGQE >gi|296493262|gb|ADTK01000239.1| GENE 10 4620 - 5996 1107 458 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_A0015 NR:ns ## KEGG: EcSMS35_A0015 # Name: traH # Def: conjugal transfer pilus assembly protein TraH # Organism: E.coli_SECEC # Pathway: not_defined # 1 457 1 457 458 889 99.0 0 MMPRIKPLLVLCAALLTVTPAASADVNSDMNQFFNKLGFASNTTQPGVWQGQAAGYAYGG SLYARTQVKNVQLISMTLPDINAGCGGIDAYLGSFSFINGEQLQRFVKQIMSNAAGYFFD LALQTTVPEIKTAKDFVQKMASDINSMNLSSCQAAQGIIGGLFPRTQVSQQKVCQDIAGE SNIFADWAASRQGCTVGGKSDSVRDKASDKDKERVTKNINIMWNALSKNRMFDGNKELKE FVMTLTGSLVFGPNGEITPLSARTTDRSIIRAMMEGGTAKIYHCNDSDKCLKVVADTPVT ISRDNALKSQITKLLASIQNKAVSDTPLDDKEKGFISSTTIPVFKYLVDPQMLGVSNSMI YQLTDYIGYDILLQYIQELIQQARAMVATGNYDEAVIGHINDNMNDATRQIAAFQSQVQV QQDALLVVDRQMSYMRQQLSARMLSRYQNNYHFGGSTR >gi|296493262|gb|ADTK01000239.1| GENE 11 5993 - 8815 1908 940 aa, chain + ## HITS:1 COG:no KEGG:ECO26_p2-71 NR:ns ## KEGG: ECO26_p2-71 # Name: not_defined # Def: conjugal transfer mating pair stabilization protein TraG # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 939 1 939 939 1703 98.0 0 MNEVYVIAGGEWLRNNLNAIAAFMGTRTWDSIEKIALTLSVLAVAVMWVQRHNVMDLLGW VAVFVLISLLVNVRTSVQIIDNSDLVKVHRVDNVPVGLAMPLSLTTRIGHAMVASYEMIF TQPDSVTYSKTGMLFGAELVSKSTDFLSRNPEIANLFQDYVQNCVMGDIYLNHKYTLEEL MASADPYTLIFSRPSPLRGVYDSNNNFVTCKDASVSLKDKLNLDTQSGGKTWHYYAQQLF GGRPDPNLLFSTLIGDSYSYFYGSSKSASQIIRKNVTINALKEGITSYAARNGDSASLVN LATTSSMEKQRLAHVSIGHVAMRTLPMTQTILTGIAIGIFPLLVLAAVFNKLTLSVLKGY VFALMWLQSWPLLYAILNSAMTFYAKQNGAPVVLSEISQIQLKYSDLASTAGYLSMMIPP LSWMMVRGLGAGFSSVYSHFASSAISPTASAAGSVVDGNYSYGNMQTENVNGFSWSTNST 
TSFGQMMYQTGSGATATQTRDGNMVMDASGAMSRLPVGINATRQIAVAQQEMAREASNRA ESALHGFSSSIASAWNTLSQFGSNRGSSDSVTSGADSTMSAQDSMMASRMRSAVESYAKA HNISNEQATQELASRSTRASAGMYGDAHAEWGVKPKILGVGGGLGVRGGGRAGIDWSDDD AHQASSGSRASYDARHDIDAKASKDFKEASDYFTSRKVSESGSHTDNNADSRVDQLSAAL NSAKQSYDQYTTNMTRSHEYAEMASRTESMSGQMSEDLSQQFAQYVMKHAPQDAEAILTN TSSPEIAERRRAMAWSFVQEQIQPGVDNAWRESRGDIGKGMESVPSGGGSQDIIADHQGH QAIIEQRTQDSNIRNDVKHQVDNMVTEYKGNIGDTQNSIRGEENIVRGQYSELQNHHKTE ALSQNNKYNEEKSAQERMPGADSPQELMKRAKEYQDKYKQ >gi|296493262|gb|ADTK01000239.1| GENE 12 9044 - 9328 77 94 aa, chain + ## HITS:1 COG:no KEGG:ECO26_p2-72 NR:ns ## KEGG: ECO26_p2-72 # Name: not_defined # Def: conjugal transfer protein TraS # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 94 70 163 163 169 95.0 2e-41 MPVSSLLSPLLSLMVFIIGTLYELRQVSGCISIKKWGQNQLKDQYDGSEKLDFGGIEQTP TIYYNPSTGYPMHGGFDSAGNTFGTRWQDYYDRQ >gi|296493262|gb|ADTK01000239.1| GENE 13 9360 - 10091 720 243 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_A0012 NR:ns ## KEGG: EcSMS35_A0012 # Name: traT # Def: conjugal transfer surface exclusion protein TraT # Organism: E.coli_SECEC # Pathway: not_defined # 1 243 1 243 243 393 99.0 1e-108 MKTKKLMMVTLVSSTLALSGCGAMSTAIKKRNLEVKTQMSETIWLEPASERTVFLQIKNT SDKDMSGLQGKIADAVKAKGYQVVTSPDKAYYWIQANVLKADKMDLRESQGWLNRGYEGA AVGAALGAGITGYNSNSAGATLGVGLAAGLVGMAADAMVEDVNYTMITDVQIAERTKATV TTDNVAALRQGTSGAKIQTSTETGNQHKYQTRVVSNANKVNLKFEEAKPVLEDQLAKSIA NIL Prediction of potential genes in microbial genomes Time: Mon May 16 15:47:30 2011 Seq name: gi|296493261|gb|ADTK01000240.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont743.2, whole genome shotgun sequence Length of sequence - 5749 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 3.0 1 1 Tu 1 . 
+ CDS 182 - 2335 1945 ## COG3505 Type IV secretory pathway, VirD4 components + Term 2457 - 2495 6.0 2 2 Op 1 7/0.000 - CDS 2344 - 2742 393 ## COG1487 Predicted nucleic acid-binding protein, contains PIN domain 3 2 Op 2 . - CDS 2742 - 2969 240 ## COG4456 Virulence-associated protein and related proteins - Prom 3001 - 3060 3.8 + Prom 2920 - 2979 2.0 4 3 Tu 1 . + CDS 3051 - 5748 2145 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member Predicted protein(s) >gi|296493261|gb|ADTK01000240.1| GENE 1 182 - 2335 1945 717 aa, chain + ## HITS:1 COG:PSLT104_1 KEGG:ns NR:ns ## COG: PSLT104_1 COG3505 # Protein_GI_number: 17233468 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Salmonella typhimurium LT2 # 1 610 1 610 616 1163 89.0 0 MSFNAKDMTQGGQIASMRIRMFSQIANIILYCLFIFFWILVGLVLWVKISWQTFVNGCIY WWCTTLEGMRDLIRSQPVYEIQYYGKTFRMNAAQVLHDKYMIWCGEQLWSAFVLASVVAL VICLITFFVVSWILGRQGKQQSENEITGGRQLTDNPKEVARMLKKDGKDSDIRIGDLPII RDSEIQNFCLHGTVGAGKSEVIRRLANYARQRGDMVVIYDRSGEFVKSYYDPSIDKILNP LDARCAAWDLWKECLTQPDFDNTANTLIPMGTKEDPFWQGSGRTIFAEAAYLMRNDPNRS YSKLVDTLLSIKIEKLRTFLRNSPAANLVEEKIEKTAISIRAVLTNYVKAIRYLQGIEHN GEPFTIRDWMRGVREDQKNGWLFISSNADTHASLKPVISMWLSIAIRGLLAMGENRNRRV WFFCDELPTLHKLPDLVEILPEARKFGGCYVFGIQSYAQLEDIYGEKAAATLFDVMNTRA FFRSPSHKIAEFAAGEIGEKEHLKASEQYSYGADPVRDGVSTGKDMERQTLVSYSDIQSL PDLTCYVTLPGPYPAVKLSLKYQTRPKVAPEFIPRDINPEMENRLSAVLAAREAEGRQMA SLFEPDVPEVVSGEDVTQAEQPQQPVSPAINDKKSDAGVNVPAGGIEQELKMKPEEEMEQ QLPPGISESGEVVDMAVYEAWQREQNPDIQQKMQRREEVNINVHRERGEDVEPGDDF >gi|296493261|gb|ADTK01000240.1| GENE 2 2344 - 2742 393 132 aa, chain - ## HITS:1 COG:PSLT106 KEGG:ns NR:ns ## COG: PSLT106 COG1487 # Protein_GI_number: 17233504 # Func_class: R General function prediction only # Function: Predicted nucleic acid-binding protein, contains PIN domain # Organism: Salmonella typhimurium LT2 # 1 132 1 132 132 257 94.0 4e-69 
MLKFMLDTNICIFTIKNKPASVRERFNLNQGRMCISSVTLMELIYGAEKSQMPERNLAVI EGFVSRIDVLDYDAAAATHSGQIRAELARQGRPVGPFDQMIAGHARSRGLIIVTNNTREF ERVDGLRIEDWS >gi|296493261|gb|ADTK01000240.1| GENE 3 2742 - 2969 240 75 aa, chain - ## HITS:1 COG:PSLT107 KEGG:ns NR:ns ## COG: PSLT107 COG4456 # Protein_GI_number: 17233505 # Func_class: S Function unknown # Function: Virulence-associated protein and related proteins # Organism: Salmonella typhimurium LT2 # 1 75 2 76 76 131 92.0 3e-31 METTVFFSNRSQAVRLPKAVALPENVKRVEVIAVGRTRIITPAGETWDEWFDGHNVSADF MDNREQPGMQERESF >gi|296493261|gb|ADTK01000240.1| GENE 4 3051 - 5748 2145 899 aa, chain + ## HITS:1 COG:PSLT108 KEGG:ns NR:ns ## COG: PSLT108 COG0507 # Protein_GI_number: 17233470 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Salmonella typhimurium LT2 # 1 894 1 893 1752 1470 88.0 0 MMSIAQVRSAGSAGNYYTDKDNYYVLGSMGERWAGRGAEQLGLQGSVDKDVFTRLLEGKL PDGADLSRMQDGSNKHRPGYDLTFSAPKSVSVMAMLGGDKRLIDAHNQAVDFAVRQVEAL ASTRVMTDGQSETVLTGNLVMALFNHDTSRDQEPQLHTHAVVANVTQHNGEWKTLSSDKV GKTGFIENVYANQIAFGRLYREKLKEQVEALGYETEVVGKHGMWEMPGVPVEAFSGRSQA IREAVGEDASLKSRDVAALDTRKSKQHVDPEVRMAEWMQTLKETGFDIRAYRDAADQRAE TRTQTPGPASQDGPDVQQAVTQAIAGLSERKVQFTYTDVLARTVGILPPENGVIERARAG IDEAISREQLIPLDREKGLFTSGIHVLDELSVRALSRDIMKQNRVTVHPEKSVPRTAGYS DAVSVLAQDRPSLAIVSGQGGAAGQRERVAELVMMAREQGREVQIIAADRRSQMNLKQDE RLSGELITGRRQLQEGMAFTPGNTVIVDQGEKLSLKETLTLLDGAARHNVQVLITDSGQR TGTGSALMAMKDAGVNTYRWQGGEQRPATIISEPDRNVRYARLAGDFAASVKAGEESVAQ VSGVREQAILTQAIRSELKTQGVLGRPEVTMTALSPVWLDSRSRYLRDMYRPGMVMEQWN PETRSHDRYVIDRVTAQSNSLTLRDAQDETQVVRISSLDSSWSLFRPEKMPVADGERLRV TGKIPGLRVSGGDRLQVASVSEDAMTVVVPGRAEPASLPVSDSPFMALKLENGWVETPGH SVSDSAKVFASVTQMAMDNATLNGLARSGRDVRLYSSLDETRTAEKLARHPSFTVVSEQI KARAGETLLETAISLQKTGLHTPAQQAIHLALPVVESKNLAFSMVDLLTEVRIFMKGDH Prediction of potential genes in microbial genomes Time: Mon May 16 15:47:31 2011 Seq name: gi|296493260|gb|ADTK01000241.1| Escherichia coli MS 84-1 
E_coli84-1-1.0_Cont747.1, whole genome shotgun sequence Length of sequence - 1630 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 15 - 338 397 ## COG5471 Uncharacterized conserved protein 2 1 Op 2 . + CDS 331 - 606 134 ## ECUMN_1827 conserved hypothetical protein from putative prophage 3 1 Op 3 . + CDS 618 - 1196 484 ## ECS88_5033 minor tail protein Z (GPZ) of prophage 4 1 Op 4 . + CDS 1181 - 1594 292 ## ECUMN_1825 minor tail protein U Predicted protein(s) >gi|296493260|gb|ADTK01000241.1| GENE 1 15 - 338 397 107 aa, chain + ## HITS:1 COG:ECs0830 KEGG:ns NR:ns ## COG: ECs0830 COG5471 # Protein_GI_number: 15830084 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 107 16 122 122 182 99.0 2e-46 MAKNFVEEGKTVAIVASAAISSGDLVQVGDVFAVALTDIPQGETGDGITEGVFMLPKLKT DDMKTGKKVYLKSGKVQLTNSGSDPLVGVVWADAGTSAEEVPVKLNV >gi|296493260|gb|ADTK01000241.1| GENE 2 331 - 606 134 91 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_1827 NR:ns ## KEGG: ECUMN_1827 # Name: not_defined # Def: conserved hypothetical protein from putative prophage # Organism: E.coli_UMN026 # Pathway: not_defined # 1 91 1 91 91 178 100.0 5e-44 MSDPFSRLAARMDAITVRKMGKTASINDVDMTVIPGETLAELNALSGPAVSLVVFSSGYR PRRGDRVVYDGQQWTVTRHERFNGKPMIFIE >gi|296493260|gb|ADTK01000241.1| GENE 3 618 - 1196 484 192 aa, chain + ## HITS:1 COG:no KEGG:ECS88_5033 NR:ns ## KEGG: ECS88_5033 # Name: not_defined # Def: minor tail protein Z (GPZ) of prophage # Organism: E.coli_S88 # Pathway: not_defined # 1 192 1 192 192 342 99.0 3e-93 MKGLENAIRNLNSLDTRMVPQASAWAINRVAQKAVSVATRQVAGNTVAGDNQVKGIPLKL VRQRVRVFKASPSGKMTARIRVNRGNLPAIKLGTARVRLARRGGKLQYRGSVLKVGKYLF RDAFIQQLANGRWHVMRRIDGKNRYPIDVVKIPLSGPLTQAFEDARDHIIAAEMPKQLGY ALKQQLRLWLTR >gi|296493260|gb|ADTK01000241.1| GENE 4 1181 - 1594 292 137 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_1825 NR:ns ## KEGG: 
ECUMN_1825 # Name: not_defined # Def: minor tail protein U # Organism: E.coli_UMN026 # Pathway: not_defined # 1 137 1 137 137 264 99.0 9e-70 MADPMNRHTQIRQAVLARLREQCGDSATFFDGLPAFIDAQELPAVAVWLSDAQYTGKMTD EDDWQAVLHIAVFIRAQAPDSELDMWMESTIFPALNDVPALSGLIDTLIPLGFNYQRDNE MATWAMAEITYQITYTN Prediction of potential genes in microbial genomes Time: Mon May 16 15:47:39 2011 Seq name: gi|296493259|gb|ADTK01000242.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont767.1, whole genome shotgun sequence Length of sequence - 720 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 38 - 97 1.5 1 1 Tu 1 . + CDS 183 - 614 108 ## COG0500 SAM-dependent methyltransferases Predicted protein(s) >gi|296493259|gb|ADTK01000242.1| GENE 1 183 - 614 108 143 aa, chain + ## HITS:1 COG:Z1925 KEGG:ns NR:ns ## COG: Z1925 COG0500 # Protein_GI_number: 15801385 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli O157:H7 EDL933 # 1 143 80 222 222 286 99.0 7e-78 MDLNEASLNAASTRAGESKIKHKINHDVFEPYPAALHGQFDSISMFYLLHCLPGNISTKS CVIRNAAQALTDDGTLYGATILGDGVVHNSFGQKLMRIYNQKGIFSNTKDSEEGLTHILS EHFENVKTKVQGTVVMFSASGKK Prediction of potential genes in microbial genomes Time: Mon May 16 15:47:40 2011 Seq name: gi|296493258|gb|ADTK01000243.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont775.1, whole genome shotgun sequence Length of sequence - 805 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 26 - 421 476 ## B21_00741 hypothetical protein 2 1 Op 2 . 
- CDS 418 - 804 146 ## ECO26_0614 putative minor tail protein Predicted protein(s) >gi|296493258|gb|ADTK01000243.1| GENE 1 26 - 421 476 131 aa, chain - ## HITS:1 COG:no KEGG:B21_00741 NR:ns ## KEGG: B21_00741 # Name: U # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 131 1 131 131 233 99.0 1e-60 MKHTELRAAVLDALEKHDTGATLFDGRPAVFDEADFPAVAVYLTGAEYTGEELDSDTWQA ELHIEVFLPAQVPDSELDAWMESRIYPVMSDIPALSDLITSMVASGYDYRRDDDAGLWSS ADLTYVITYEM >gi|296493258|gb|ADTK01000243.1| GENE 2 418 - 804 146 128 aa, chain - ## HITS:1 COG:no KEGG:ECO26_0614 NR:ns ## KEGG: ECO26_0614 # Name: not_defined # Def: putative minor tail protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 128 65 192 192 197 100.0 1e-49 TVKNPQARIKVNRGDLPVIKLGNARVVLSRRRRRKKGQRSSLKGGGSVLVVGNRRIPGAF IQQLKNGRWHVMQRVAGKNRYPIDVVKIPMAVPLTTAFKQNIERIRRERLPKELGYALQH QLRMVIKR Prediction of potential genes in microbial genomes Time: Mon May 16 15:47:45 2011 Seq name: gi|296493257|gb|ADTK01000244.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont775.2, whole genome shotgun sequence Length of sequence - 495 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 15:47:45 2011 Seq name: gi|296493256|gb|ADTK01000245.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont775.3, whole genome shotgun sequence Length of sequence - 210 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 210 65 ## LF82_p168 tail attachment protein Predicted protein(s) >gi|296493256|gb|ADTK01000245.1| GENE 1 3 - 210 65 69 aa, chain - ## HITS:1 COG:no KEGG:LF82_p168 NR:ns ## KEGG: LF82_p168 # Name: not_defined # Def: tail attachment protein # Organism: E.coli_LF82 # Pathway: not_defined # 1 69 39 107 117 137 98.0 1e-31 VRGVFDDPENISYAGQGVRVEGSSPSLFVRTDDVRQLRRGDTLTIGEENFWIDRISPDDG GSCHLWLGR Prediction of potential genes in microbial genomes Time: Mon May 16 15:47:48 2011 Seq name: gi|296493255|gb|ADTK01000246.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont777.1, whole genome shotgun sequence Length of sequence - 1970 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 347 - 916 149 ## JW0693 hypothetical protein 2 1 Op 2 . - CDS 913 - 1716 328 ## COG3209 Rhs family protein Predicted protein(s) >gi|296493255|gb|ADTK01000246.1| GENE 1 347 - 916 149 189 aa, chain - ## HITS:1 COG:no KEGG:JW0693 NR:ns ## KEGG: JW0693 # Name: ybfC # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 189 1 189 189 394 100.0 1e-109 MKRVLFFLLMIFVSFGVIADCEIQAKDHDCFTIFAKGTIFSAFPVLNNKAMWRWYQNEDI GEYYWQTELGTCKNNKFTPSGARLLIRVGSLRLNENHAIKGTLQELINTAEKTAFLGDRF RSYIRAGIYQKKSSDPVQLLAVLDNSIMVKYFKDEKPTYARMTAHLPNKNESYECLIKIQ HELIRSEEK >gi|296493255|gb|ADTK01000246.1| GENE 2 913 - 1716 328 267 aa, chain - ## HITS:1 COG:ECs0729 KEGG:ns NR:ns ## COG: ECs0729 COG3209 # Protein_GI_number: 15829983 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli O157:H7 # 1 267 1133 1399 1399 548 98.0 1e-156 MDPVYTPARKIHLYHCDHRGLPLALVSTEGATEWCAEYDEWGNLLNEENPHQLQQLIRLP GQQYDEESGLYYNRHRYYDPLQGRYITQDPIGLKGGWNFYQYPLNPVQYIDSMGLASKYG HLNNGGYGARPNKPPTPDPSKLPDIAKQLRLPYPIDQASSAPNVFKTFFRALSPYDYTLY CRKWVKPNLTCTPQDDSQYPGMDTKTASDYLPQTNWPTTQLPPGYTCAEPYLFPDINKPD GPATAGIDDLGEILAKMKQRTSRGIRK Prediction of 
potential genes in microbial genomes Time: Mon May 16 15:47:52 2011 Seq name: gi|296493254|gb|ADTK01000247.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont782.1, whole genome shotgun sequence Length of sequence - 1871 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 422 - 934 142 ## EcSMS35_4780 hypothetical protein - Term 1129 - 1172 2.0 2 2 Op 1 . - CDS 1191 - 1424 241 ## COG3311 Predicted transcriptional regulator 3 2 Op 2 . - CDS 1493 - 1870 226 ## ECS88_4653 hypothetical protein Predicted protein(s) >gi|296493254|gb|ADTK01000247.1| GENE 1 422 - 934 142 170 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_4780 NR:ns ## KEGG: EcSMS35_4780 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 170 1 170 170 325 100.0 5e-88 MQITEALLAEPGDIRRFVQQAVDHWPHLLAFRFTLRSAEGSINGQQIQVFCTSFYRQVHE RITESNHMLSPSSPVVLRWLREQHGGATIRCLLLLSQDLFCHPRASATVDEACSQLVDLL QQTWQVISAGGQCRVERCFRVARPDTSEQYVALKTAVQSLMSLVIATIIR >gi|296493254|gb|ADTK01000247.1| GENE 2 1191 - 1424 241 77 aa, chain - ## HITS:1 COG:Z1188 KEGG:ns NR:ns ## COG: Z1188 COG3311 # Protein_GI_number: 15800709 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Escherichia coli O157:H7 EDL933 # 1 64 1 64 65 119 93.0 1e-27 MLTTTSHDSVLLRVDDPLIDMNYITSFTGMTDKWFYKLISEGHFPKPIKLGRSSRWYKSE VEQWMQQRIEESRGAAA >gi|296493254|gb|ADTK01000247.1| GENE 3 1493 - 1870 226 125 aa, chain - ## HITS:1 COG:no KEGG:ECS88_4653 NR:ns ## KEGG: ECS88_4653 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 125 130 254 254 265 98.0 5e-70 TVSALMNATPGRDLCWLLTRHPEKPEYHVVLCVRQEYFDGPELDRLILDAWSNVLGFASP GEAKPYQKQITRDVVLDSRSPDCEDILKELIWAFSDFARDRRGVCDPEARCLAGNPGYPG SAGPF Prediction of potential genes in microbial genomes Time: Mon May 16 15:47:57 2011 Seq name: gi|296493253|gb|ADTK01000248.1| Escherichia 
coli MS 84-1 E_coli84-1-1.0_Cont782.2, whole genome shotgun sequence Length of sequence - 1945 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 219 253 ## EcSMS35_4783 hypothetical protein 2 2 Tu 1 . - CDS 466 - 1032 305 ## EcSMS35_4784 hypothetical protein Predicted protein(s) >gi|296493253|gb|ADTK01000248.1| GENE 1 3 - 219 253 72 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_4783 NR:ns ## KEGG: EcSMS35_4783 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 72 1 72 191 150 98.0 2e-35 MNAIPYFDYSLAPFWPSYQSKVIGVLERALREQSGSRIRRILLRLPCEYDNTFNSRQTWF GMDFIETVSALM >gi|296493253|gb|ADTK01000248.1| GENE 2 466 - 1032 305 188 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_4784 NR:ns ## KEGG: EcSMS35_4784 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 188 1 188 188 394 99.0 1e-108 MPKSYTPNWFFTALLDNHINQMMARYSCLRALRMDFFYRKDTPDFLQPDHRWLELQLRML LEQVEQFENMVGFFWVIEWTADHGFHAHAVFWIDRQRVKKIYPFAERITECWRSITHNSG SVHRCTYQPHYTYNINIPVRHNDPGSIDNIRGVLHYLAKEEQKDGLCAYGCNEVPERPAA GRPRKPHF Prediction of potential genes in microbial genomes Time: Mon May 16 15:48:03 2011 Seq name: gi|296493252|gb|ADTK01000249.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont794.1, whole genome shotgun sequence Length of sequence - 1086 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 156 - 917 285 ## COG3231 Aminoglycoside phosphotransferase - Prom 1015 - 1074 4.2 Predicted protein(s) >gi|296493252|gb|ADTK01000249.1| GENE 1 156 - 917 285 253 aa, chain - ## HITS:1 COG:SMc03094 KEGG:ns NR:ns ## COG: SMc03094 COG3231 # Protein_GI_number: 15966752 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aminoglycoside phosphotransferase # Organism: Sinorhizobium meliloti # 5 253 16 261 261 144 36.0 1e-34 MDADLYGYKWARDNVGQSGATIYRLYGKPDAPELFLKHGKGSVANDVTDEMVRLNWLTEF MPLPTIKHFIRTPDDAWLLTTAIPGKTAFQVLEEYPDSGENIVDALAVFLRRLHSIPVCN CPFNSDRVFRLAQAQSRMNNGLVDASDFDDERNGWPVEQVWKEMHKLLPFSPDSVVTHGD FSLDNLIFDEGKLIGCIDVGRVGIADRYQDLAILWNCLGEFSPSLQKRLFQKYGIDNPDM NKLQFHLMLDEFF Prediction of potential genes in microbial genomes Time: Mon May 16 15:48:03 2011 Seq name: gi|296493251|gb|ADTK01000250.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont796.1, whole genome shotgun sequence Length of sequence - 235 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 15:48:05 2011 Seq name: gi|296493250|gb|ADTK01000251.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont796.2, whole genome shotgun sequence Length of sequence - 3780 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 4, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 336 74 ## SbBS512_E1436 hypothetical protein 2 1 Op 2 . - CDS 377 - 1219 277 ## APECO1_379 hypothetical protein 3 1 Op 3 . - CDS 1189 - 1446 77 ## SBO_0971 hypothetical protein 4 1 Op 4 . - CDS 1518 - 1943 268 ## ECS88_1338 conserved hypothetical protein from phage origin 5 1 Op 5 . - CDS 1940 - 2194 107 ## APECO1_1042 putative regulatory protein - Prom 2217 - 2276 2.4 + Prom 2018 - 2077 3.9 6 2 Tu 1 . + CDS 2274 - 2693 326 ## COG1396 Predicted transcriptional regulators + Prom 2735 - 2794 3.8 7 3 Op 1 . 
+ CDS 2910 - 3146 190 ## ECS88_1335 conserved hypothetical protein from phage origin 8 3 Op 2 . + CDS 3106 - 3276 190 ## ECUMN_1857 hypothetical protein 9 3 Op 3 . + CDS 3306 - 3524 209 ## EcSMS35_1156 hypothetical protein 10 4 Tu 1 . - CDS 3528 - 3692 81 ## ECSP_1720 hypothetical protein Predicted protein(s) >gi|296493250|gb|ADTK01000251.1| GENE 1 3 - 336 74 111 aa, chain - ## HITS:1 COG:no KEGG:SbBS512_E1436 NR:ns ## KEGG: SbBS512_E1436 # Name: not_defined # Def: hypothetical protein # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 1 111 1 111 141 192 99.0 3e-48 MAKVFTQEEREKIKRQVVELVRLSGRETLRQLEAKTGATRYLMSVLARELVASGDVYNSG YGLFPSEQARKDWQNARKKLSRAKVKKPAVVDPDLIWSLPDGEIRRYDRRL >gi|296493250|gb|ADTK01000251.1| GENE 2 377 - 1219 277 280 aa, chain - ## HITS:1 COG:no KEGG:APECO1_379 NR:ns ## KEGG: APECO1_379 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 280 77 352 352 499 97.0 1e-140 MQGRVLDGDLITGWEKRQVLKEDNGNISQTAKSPAERKRAQRERERKQEQNGDCHGASRN VTHMSRRVTTDKDTDKDTDQEDQNTMVHGVKNATNQAGDVQTVNPGQPAGTTPEADSAYA LKADSGAVQQVMTARPEQSHQLQQPEADSAIQREADRVVPENTGQPVGRVDYPDVFEQVW REYPLRAGANPKKSAFSAWKARLREGVPREAMLDGVRRYARYLAATGKTGTEFVQRATTF FGPDRNFENPWLLPVSGTNNQRCVNHISEPDNEIPPGFRG >gi|296493250|gb|ADTK01000251.1| GENE 3 1189 - 1446 77 85 aa, chain - ## HITS:1 COG:no KEGG:SBO_0971 NR:ns ## KEGG: SBO_0971 # Name: not_defined # Def: hypothetical protein # Organism: S.boydii # Pathway: not_defined # 1 82 1 82 356 154 90.0 1e-36 MANAWLRLWHDMPNDPKWRTIARVSGQPIATVMAVYIHLLVSASRNVTRGHIDVTTEDLA SALDVTEEVIDSFCRRCRGGYLMVI >gi|296493250|gb|ADTK01000251.1| GENE 4 1518 - 1943 268 141 aa, chain - ## HITS:1 COG:no KEGG:ECS88_1338 NR:ns ## KEGG: ECS88_1338 # Name: not_defined # Def: conserved hypothetical protein from phage origin # Organism: E.coli_S88 # Pathway: not_defined # 1 141 25 165 165 272 97.0 3e-72 MKIKHEHIRMAMNAWAHPDGEKVPAAKITKAYFELGMTFPELYDDSHPEALARNTQKIFR WVEKDTPDAVEKIQALLPAIEKAMPPLLVARMRSHSSAYFRELVETRERLVRDADDFVAV 
AIAGFNQMNRGGPAGNAVAVH >gi|296493250|gb|ADTK01000251.1| GENE 5 1940 - 2194 107 84 aa, chain - ## HITS:1 COG:no KEGG:APECO1_1042 NR:ns ## KEGG: APECO1_1042 # Name: not_defined # Def: putative regulatory protein # Organism: E.coli_APEC # Pathway: not_defined # 1 84 25 108 108 158 100.0 7e-38 MNQKTLEDVIKTVRVAVVADVCGVSQRAIYKWMDNGKLPRTEYTGETNYAEKIALASNGL FSADAILTIGRNKTTTKKLMGVDS >gi|296493250|gb|ADTK01000251.1| GENE 6 2274 - 2693 326 139 aa, chain + ## HITS:1 COG:ECs1941 KEGG:ns NR:ns ## COG: ECs1941 COG1396 # Protein_GI_number: 15831195 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli O157:H7 # 37 139 1 103 103 167 99.0 4e-42 MVHEDKARKEFASRLALACENAGYEQHGRQAEIARRMKLTPKAVSKWFNGETIPRREKLR ELATLIGTTPTYLLGEDTEESGQVRFYQELNPRQKIIIDLLDELPDSETDELLKTLEEKK QKYNAIYEELARKKKQKAS >gi|296493250|gb|ADTK01000251.1| GENE 7 2910 - 3146 190 78 aa, chain + ## HITS:1 COG:no KEGG:ECS88_1335 NR:ns ## KEGG: ECS88_1335 # Name: ydfA # Def: conserved hypothetical protein from phage origin # Organism: E.coli_S88 # Pathway: not_defined # 1 78 1 78 78 134 97.0 1e-30 MTAGFNFNNYAAGFCSATPALRGNEVSMDTIDLGNNESLVYGVFPNQDGTFTAMTYTKSK TFKTENGARRWLERNSGE >gi|296493250|gb|ADTK01000251.1| GENE 8 3106 - 3276 190 56 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_1857 NR:ns ## KEGG: ECUMN_1857 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 55 1 55 205 95 87.0 4e-19 MVPVAGWKETQVSDMDFDTIMEKAYEEYFEDLAEGEEALSFSEFKQALSSSAKSNG >gi|296493250|gb|ADTK01000251.1| GENE 9 3306 - 3524 209 72 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_1156 NR:ns ## KEGG: EcSMS35_1156 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 72 1 72 72 113 100.0 2e-24 MQKREPVIIAPDYTDDELYEWMRQKINAAQDLKWANEARAKQAENLSALEQDITNLEKAA ALSIARMITYPR >gi|296493250|gb|ADTK01000251.1| GENE 10 3528 - 3692 81 54 aa, chain - ## HITS:1 COG:no KEGG:ECSP_1720 NR:ns ## KEGG: ECSP_1720 # Name: not_defined # Def: 
hypothetical protein # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 54 58 111 111 96 98.0 3e-19 MNQLPRRALSRYGLTFQGNILSVNCQCRMLTRVRRTHSTSPVENSLITNLSFVG Prediction of potential genes in microbial genomes Time: Mon May 16 15:48:26 2011 Seq name: gi|296493249|gb|ADTK01000252.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont829.1, whole genome shotgun sequence Length of sequence - 9005 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 6, operones - 2 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 405 190 ## ECED1_1434 endopeptidase (lysis protein) from bacteriophage origin + Prom 434 - 493 2.5 2 2 Tu 1 . + CDS 561 - 827 198 ## KP1_2180 hypothetical protein - Term 794 - 824 4.3 3 3 Tu 1 . - CDS 833 - 1372 338 ## KP1_2181 hypothetical protein - Prom 1446 - 1505 3.1 4 4 Tu 1 . + CDS 1511 - 1861 85 ## COG1403 Restriction endonuclease + Prom 1913 - 1972 2.4 5 5 Op 1 . + CDS 2009 - 2491 375 ## APECO1_1026 putative phage terminase 6 5 Op 2 2/0.000 + CDS 2491 - 4248 1279 ## COG4626 Phage terminase-like protein, large subunit + Term 4321 - 4361 0.3 7 5 Op 3 3/0.000 + CDS 4396 - 5622 950 ## COG4695 Phage-related protein 8 5 Op 4 5/0.000 + CDS 5615 - 6214 572 ## COG3740 Phage head maturation protease 9 5 Op 5 . + CDS 6229 - 7446 1161 ## COG4653 Predicted phage phi-C31 gp36 major capsid-like protein + Term 7467 - 7496 1.2 10 6 Op 1 . + CDS 7523 - 7840 243 ## APECO1_1021 hypothetical protein 11 6 Op 2 . + CDS 7849 - 8187 94 ## COG5614 Bacteriophage head-tail adaptor 12 6 Op 3 . + CDS 8187 - 8633 266 ## APECO1_1019 hypothetical protein 13 6 Op 4 . 
+ CDS 8630 - 8974 348 ## APECO1_1018 hypothetical protein Predicted protein(s) >gi|296493249|gb|ADTK01000252.1| GENE 1 1 - 405 190 134 aa, chain + ## HITS:1 COG:no KEGG:ECED1_1434 NR:ns ## KEGG: ECED1_1434 # Name: rz # Def: endopeptidase (lysis protein) from bacteriophage origin # Organism: E.coli_ED1a # Pathway: not_defined # 1 134 50 183 192 209 83.0 2e-53 AVNHYRDNAITYKEQRDKKVSELKQLTATIADMQQRQRDVAALDAKYSRELANAKAENET LRADVAAGRKRLRVNASCSAAVREATGPTSVDNATSPRLADTAERDYFTLRERLMTMQKQ LEGAQLYIREQCLR >gi|296493249|gb|ADTK01000252.1| GENE 2 561 - 827 198 88 aa, chain + ## HITS:1 COG:no KEGG:KP1_2180 NR:ns ## KEGG: KP1_2180 # Name: not_defined # Def: hypothetical protein # Organism: K.pneumoniae_NTUH-K2044 # Pathway: not_defined # 1 76 1 76 76 120 84.0 2e-26 MNVENLSNAHYIYNEMKELQRQKGILESGAGLGVTIQSAYQYSAFLEAIRPHAVAELNSR IEEKKSALVNLGASFSIYEHNKAGWKPA >gi|296493249|gb|ADTK01000252.1| GENE 3 833 - 1372 338 179 aa, chain - ## HITS:1 COG:no KEGG:KP1_2181 NR:ns ## KEGG: KP1_2181 # Name: not_defined # Def: hypothetical protein # Organism: K.pneumoniae_NTUH-K2044 # Pathway: not_defined # 1 179 1 179 179 324 92.0 7e-88 MDIKDKINTILLCDIAIHLGIETDIDPQLVKYAVSSGNDWVIKAEYSHLDVDEPSKEDRD FVTAVLNMYRGLSNAFRKLSDDEQKELVRDHHLKIHDGAIQLPGFDGNNECDYFSIIEAY QKIDRFPEQKQPIANTHSRTEHLYNAMLDEFKKIDAVNRSWDLSKEELASILSTAPRSF >gi|296493249|gb|ADTK01000252.1| GENE 4 1511 - 1861 85 116 aa, chain + ## HITS:1 COG:RSc1680 KEGG:ns NR:ns ## COG: RSc1680 COG1403 # Protein_GI_number: 17546399 # Func_class: V Defense mechanisms # Function: Restriction endonuclease # Organism: Ralstonia solanacearum # 1 113 1 115 124 68 41.0 4e-12 MPALIPRACRKRGCAGTTTDSSGYCDKHRGEGWVQHQRGLSRHQRGYGSKWDAIRARILK RDNHLCQNCLRNGRAVEARTVDHIIPKAHGGTDADSNLQSLCWPCHKAKTARERIN >gi|296493249|gb|ADTK01000252.1| GENE 5 2009 - 2491 375 160 aa, chain + ## HITS:1 COG:no KEGG:APECO1_1026 NR:ns ## KEGG: APECO1_1026 # Name: not_defined # Def: putative phage terminase # Organism: E.coli_APEC # Pathway: not_defined # 1 160 1 160 160 292 96.0 3e-78 
MSGPPKTPPRLHLIRGNPSKRPVKDSKKTAKKDEKCLPKIPQHLGSQGKYWFRRMAEELN AEGIISQLDARALELLVEAYTEYRHHCETLDVEGYTYRTETQNGDVLIKAHPAAAMKADA WKRIRAMLAEFGMSPASRAKVNTAGPDDVDPLAELLKARD >gi|296493249|gb|ADTK01000252.1| GENE 6 2491 - 4248 1279 585 aa, chain + ## HITS:1 COG:ECs1598 KEGG:ns NR:ns ## COG: ECs1598 COG4626 # Protein_GI_number: 15830852 # Func_class: R General function prediction only # Function: Phage terminase-like protein, large subunit # Organism: Escherichia coli O157:H7 # 9 562 6 528 553 268 32.0 2e-71 MAKVADGIRYAERVVAGEIVAGEFVRLACQRFLDDLKYGEERGIYFSEPRAQHILNFYKF VPHVKGALAGQPIELMDWHVFILINIFGFVIPLVNEETGEIVMRSDGSGRPVMVRRFRTA YNEVARKNAKSTLSSGIGLYMTGADGEGGAEVYSAATTRDQARIVFEDAKNMVRKARSTL GRLFDFNKLAIYQEQSASKFEPLSSDANNLDGLNIHCAIIDELHAHKTRDVWDVLETATG ARLQSLLFGITTAGFNKEGICYEQRDYAIKVLRGYNSDVEGAVKDDSYFAIIYTLDEGDD PFDETVWQKANPGLGICKRWDDLRRLAKKAKEQVSARVNFFTKHMNVWVTAESAWMDMIK WEKCEYIAPQHELKTYPMWVGVDLAHKIDICAAAKLWRTDNGHVHADFKFWLPEGRLERC SRQQAELYRKWAEMDKLILTDGDVIDHAQIKSDLLEWIGGENLRELGFDPWSAMQFSLAL AEEGIPLVEVPQTVRNLSEAMKETESLVYAGRFHHSNHPVMNWMMSNVTVKPDKNDNIFP NKSTPEAKIDGPVALFTAMSRFLVNGGGVNDFLSTLDPDEDLLIL >gi|296493249|gb|ADTK01000252.1| GENE 7 4396 - 5622 950 408 aa, chain + ## HITS:1 COG:RSc1682 KEGG:ns NR:ns ## COG: RSc1682 COG4695 # Protein_GI_number: 17546401 # Func_class: S Function unknown # Function: Phage-related protein # Organism: Ralstonia solanacearum # 1 389 1 391 407 481 56.0 1e-136 MLLDALFRSEPLENPSVPVTGEAAETDNIFARDVYVSPETSMKLAAVYACIYVISSSVAQ MPLHVMRKTNEHVQPARDHPLFWLVHDEPNAWQTSYKWRELKQRHVLGWGNGYTWVKRNR RGEVTSLECCMPWETTLLNTGGRHTYGVYNEEGAFAVSPDDMIHIRALGNNQKMGLSPIM QHAETIGMGMSGQQYTSAFFNGNARPAGIISVKNELNEQSWGRLKNMWQRAVTALRSQEN KTMLLPAQLDYRALTVSPVDAQIIDMTKLNRSMIAGIFNVPAHMINDLEKATFSNITQQA IQFVRYTMMPWVANWEQELNRRLFTRTERAAGYYVRFNLTGLLRGTPQERAQFYHFAITD GWMSRNEARAFEDMNPVDGLDEMLVSVNAANPLNNFKDTKGKEEKNDE >gi|296493249|gb|ADTK01000252.1| GENE 8 5615 - 6214 572 199 aa, chain + ## HITS:1 COG:STM2236 KEGG:ns NR:ns ## COG: STM2236 COG3740 # Protein_GI_number: 16765564 # Func_class: R General 
function prediction only # Function: Phage head maturation protease # Organism: Salmonella typhimurium LT2 # 1 164 1 165 172 240 73.0 1e-63 MNDRETRCYSGEVRAEQYDNAPTHILGYGSVFNSRSEPLWGFREIIKPGAFDDVLNDDVR GLFNHDPNFILGRSSAGTLSLSVDERGLRYDIVAPGTPTICDLVLSPMLRGDINQSSFAF RVARDGESWYEDDEGIVIREITRISRLYDVSPVTYPAYQDADSGVRSMKAWQEARASGAL KKAVNERMARERLLTLLNA >gi|296493249|gb|ADTK01000252.1| GENE 9 6229 - 7446 1161 405 aa, chain + ## HITS:1 COG:RSc1684 KEGG:ns NR:ns ## COG: RSc1684 COG4653 # Protein_GI_number: 17546403 # Func_class: R General function prediction only # Function: Predicted phage phi-C31 gp36 major capsid-like protein # Organism: Ralstonia solanacearum # 1 405 1 409 412 420 56.0 1e-117 MKLHEMKQKRNTIAKDMRALHEKIGDNAWTDEQRAEWNRAKAELDALDEQIAREEELRRQ DQAYVDESGPEERQNNEAENGKKAVEEKRAAAFNRFLRAGFAELNAEERNLMRELRAQSV TTDSQGGYTVPTQMRNKIIDTMKAYGGIASVAQLLTTSTGQDITWSTSDGTTEEGELLAE NTAATEQDVTFGTAILGAKKLSSKIIRVSNELLQDSGVDIESYLANRIAQRIGRGEAKYL VQGTGTGSPLQPKGLAASVTGTIQTAASAAFTWKEMNALKHAIDPAYRGGPKYRWVFNDA TLQTIEEMEDGQKRPLWLPDIAGGTPATVLGIPYVIDQAIDGIGTGKKFIFLGDFNRFII RRVTYMELKRLVERYAEFDQVAFLAFHRFDCVLEDVAAIKALTGK >gi|296493249|gb|ADTK01000252.1| GENE 10 7523 - 7840 243 105 aa, chain + ## HITS:1 COG:no KEGG:APECO1_1021 NR:ns ## KEGG: APECO1_1021 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 105 58 162 162 217 100.0 9e-56 MAAIVEKLRAQCRIDTDDATDDELLMLYFRAACRKAENFINRKLYEETVPEGDPEGVLIA DDVLLALMLLVGHWYENRENSSDVSKAPVPFGFSSLLEPYRFIPL >gi|296493249|gb|ADTK01000252.1| GENE 11 7849 - 8187 94 112 aa, chain + ## HITS:1 COG:ECs1593 KEGG:ns NR:ns ## COG: ECs1593 COG5614 # Protein_GI_number: 15830847 # Func_class: R General function prediction only # Function: Bacteriophage head-tail adaptor # Organism: Escherichia coli O157:H7 # 1 106 1 106 112 102 51.0 1e-22 MQAGRLRDRVIILNVTTARSPSGHPVETVTEGATVWAEVKGISGREIISGGAETAQATVR VWMRFRRDVTATSRLKVLTGAFKGAILGIEGPPIPDARATRLEILCSLKGNV >gi|296493249|gb|ADTK01000252.1| GENE 12 8187 - 8633 266 148 aa, chain + ## HITS:1 
COG:no KEGG:APECO1_1019 NR:ns ## KEGG: APECO1_1019 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 148 2 149 149 270 100.0 2e-71 MDFSLDFSGLADIARDLETLSRAENNKVLRDATRAGAEVMRDAVVERAPERTGKLKKNVV VLTQRSKRRGEIISGVHIRGRNLRTGNSDNSMKASDPRNAFYWRFVELGTINMPAHPFIR PAFDTTEELAAQVAIQRMNQAIDEVLSK >gi|296493249|gb|ADTK01000252.1| GENE 13 8630 - 8974 348 114 aa, chain + ## HITS:1 COG:no KEGG:APECO1_1018 NR:ns ## KEGG: APECO1_1018 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 114 1 114 114 195 97.0 4e-49 MREATLYSLLSQLAGGQVYPYVVPLTEGKPAVSPPWLVFSVVSDTVSDVLDGQAESRITV QIDVWATVPDDADNIREQALDAVRKLAPSVISKTQGYDPDSRLSRATLEFQVIA Prediction of potential genes in microbial genomes Time: Mon May 16 15:48:44 2011 Seq name: gi|296493248|gb|ADTK01000253.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont829.2, whole genome shotgun sequence Length of sequence - 5552 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 1, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 48 - 752 587 ## APECO1_1017 phage major tail subunit 2 1 Op 2 . + CDS 767 - 1138 322 ## EcSMS35_1196 phage tail assembly chaperone 3 1 Op 3 . + CDS 1162 - 1440 162 ## APECO1_1015 phage-related minor tail protein 4 1 Op 4 1/0.000 + CDS 1487 - 4714 2197 ## COG5281 Phage-related minor tail protein + Prom 4896 - 4955 3.4 5 1 Op 5 . 
+ CDS 5046 - 5550 452 ## COG4672 Phage-related protein Predicted protein(s) >gi|296493248|gb|ADTK01000253.1| GENE 1 48 - 752 587 234 aa, chain + ## HITS:1 COG:no KEGG:APECO1_1017 NR:ns ## KEGG: APECO1_1017 # Name: gp14 # Def: phage major tail subunit # Organism: E.coli_APEC # Pathway: not_defined # 1 234 18 251 251 400 97.0 1e-110 MSSNFERSQLTKIMISSAPVTAETLDSASYLGLSCTIKEVQFTAGQKQDIDVTTLCSVEQ ENINGLGAASEISMSGNFYLNAAQNALRSAYDNDTTYGFKVIFPSGNGFTFMAEVRQHTW SAGTNGVVAATFSLRLKGKPVLTTEPLKVKVDLKSTLRVASRAKLEMAVEAVGGVPPYSY AWKKGGSPVSGQTAATFSKASAASGDAGAYTCEISDSASPVNKVTSTSCTVTVS >gi|296493248|gb|ADTK01000253.1| GENE 2 767 - 1138 322 123 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_1196 NR:ns ## KEGG: EcSMS35_1196 # Name: not_defined # Def: phage tail assembly chaperone # Organism: E.coli_SECEC # Pathway: not_defined # 1 123 1 123 123 229 100.0 2e-59 MTKNIRNLALATMSGFRHKTVDVPEWEGATVVLREPSAEAWLRWQEIVKAKDDETPLSVA ERARRNLEADVELFIDVLCDTGLQPVFSEDDREQVIAVYGPVHARLLRQSLELISDAGEV KKK >gi|296493248|gb|ADTK01000253.1| GENE 3 1162 - 1440 162 92 aa, chain + ## HITS:1 COG:no KEGG:APECO1_1015 NR:ns ## KEGG: APECO1_1015 # Name: not_defined # Def: phage-related minor tail protein # Organism: E.coli_APEC # Pathway: not_defined # 1 92 5 96 96 157 96.0 1e-37 MMLALRMGRTLSELRREMSASEIMMWAEFDRFSPLGDERADFRAAQIVSAVYGAQGVKVS LNDALLQWEKEQTEGASDPFAGLENALLIVSQ >gi|296493248|gb|ADTK01000253.1| GENE 4 1487 - 4714 2197 1075 aa, chain + ## HITS:1 COG:ECs2240 KEGG:ns NR:ns ## COG: ECs2240 COG5281 # Protein_GI_number: 15831494 # Func_class: S Function unknown # Function: Phage-related minor tail protein # Organism: Escherichia coli O157:H7 # 1 1075 1 1080 1080 1338 76.0 0 MATLRELIIKISANSRSFQSEISRASRMGQDYYRTMQNGGQQSAAASREMRRALAEVTDQ INTAKSSALNMAGAFAGAFATGHLISLADEWNSVNARLKQASQSSDDFQASQRELMAISQ RTGTAFSDNASLFARSAASMREYGYSSEEVLKVTEAISTGLKLSGASTAEASSVIMQFSQ ALAQGVLRGEEFNSVNENGDRVIRALAAGMGVARKDLKAMADNGKLTADKVVPALISQLG ALRDEYAAMPDTVSSSATKVENAFMAWVGGANEASGVTKTLSGVLNGVADNIDTVAAAAG 
ALVAVGVARYFGNMASSAGSATAGLITAARNEVALAEAQLRGTQIATARARAAVYRAQQA VVAARGTERQAAAEAKLTAAQASLTRNIAARTAAQTTLNTVTSVGSRLLSGALGLVGGVP GLVMLGATAWYTMYQNQEQARESARQYAATIDEIRQKTSAMSLPEASDNEEKTRQALDEQ NRLIDEQKSKIKSLQEKIAGYQYVLANPGWTTDNGFMINHMTSVKTVTEGLAEATNQLAV EQSRLTQMQGKAQSIQDVLAGLEERRVALIRQQAAGQNKAYQSLLIMNGQHTEFNRLLGL GNELLQQRQGLVNVPLRLPQTTLDDKQQTALNNSKRELALSRLKGEARERARLGYAADDL GFVGEAYQTARQNYINNSLDAWRNNQANKPKAHKKTEAEKTEDIYKRLIKQQKEQIALAG QNTELAKMKYQVSQGELSTLSEAQKKTLLQNAALIDQKKIREQLAAYESSLADSNASTRA SNDAQLLGYGEGSRMRERLQEMWSIRHEFEQKNNELLRQYQAGEIEEALWKQEKELNKKY LEERLSDQQDYYAKADALRNNWNAGLQEGLTNWADSATDYASQAADAVVSTMDGLVSNIS DALAGNVVDWRNWGSSILQEVSKILMNAAIVNGLKSLSGAGWWLGTVGGWISGAVANAKG GVYTSANLSAYSNTIVDTPTYFAFAKGAGLMGEAGPEAIMPLTRAADGSLGVRAIGNVNS GGGVVYSPVYHISIQNQGSNGEIDARSARGLVDLIDSRVVSIMQSSRRDGGLYSA >gi|296493248|gb|ADTK01000253.1| GENE 5 5046 - 5550 452 168 aa, chain + ## HITS:1 COG:ECs1645 KEGG:ns NR:ns ## COG: ECs1645 COG4672 # Protein_GI_number: 15830899 # Func_class: S Function unknown # Function: Phage-related protein # Organism: Escherichia coli O157:H7 # 1 168 1 168 232 322 94.0 1e-88 MQDIQQETLNECTKSEQSALVVLWEIDLTEVGGERYFFCNEQNEKGEPVTWQGRQYQEYP IQGSGFEMNGKGSSARPTLKVSNLHGMVTGMAEDLQSLVGGTVVRRKVYARFLDAVNFVN GNSDADPEQEVISRWRIEQCSELSAVSASFVLSTPTETDGAVFPGRIM Prediction of potential genes in microbial genomes Time: Mon May 16 15:48:52 2011 Seq name: gi|296493247|gb|ADTK01000254.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont830.1, whole genome shotgun sequence Length of sequence - 1620 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 40 - 747 357 ## S2721 putative DNA helicase 2 1 Op 2 . 
+ CDS 734 - 1585 433 ## ECO111_p1-125 replication protein C Predicted protein(s) >gi|296493247|gb|ADTK01000254.1| GENE 1 40 - 747 357 235 aa, chain + ## HITS:1 COG:no KEGG:S2721 NR:ns ## KEGG: S2721 # Name: not_defined # Def: putative DNA helicase # Organism: S.flexneri_2457T # Pathway: not_defined # 1 235 41 275 275 390 85.0 1e-107 MLALQLAAQIAGGPDLLEVGELPTGPVIYLPAEDPPTAIHHRLHALGAHLSAEERQAVAD GLLIQPLIGSLPNIMAPEWFDGLKRAAEGRRLMVLDTLRRFHIEEENASGPMAQVIGRME AIAADTGCSIVFLHHASKGAAMMGAGDQQQASRGSSVLVDNIRWQSYLSSMTSAEAEEWG VDDDQRRFFVRFGVSKANYGAPFADRWFRRHDGGVLKPAVLERQRKSKGVPRGEA >gi|296493247|gb|ADTK01000254.1| GENE 2 734 - 1585 433 283 aa, chain + ## HITS:1 COG:no KEGG:ECO111_p1-125 NR:ns ## KEGG: ECO111_p1-125 # Name: not_defined # Def: replication protein C # Organism: E.coli_O111_H- # Pathway: not_defined # 1 283 1 283 283 559 100.0 1e-158 MVKPKNKHSLSHVRHDPAHCLAPGLFRALKRGERKRSKLDVTYDYGDGKRIEFSGPEPLG ADDLRILQGLVAMAGPNGLVLGPEPKTEGGRQLRLFLEPKWEAVTADAMVVKGSYRALAK EIGAEVDSGGALKHIQDCIERLWKVSIIAQNGRKRQGFRLLSEYASDEADGRLYVALNPL IAQAVMGGGQHVRISMDEVRALDSETARLLHQRLCGWIDPGKTGKASIDTLCGYVWPSEA SGSTMRKRRQRVREALPELVALGWTVTEFAAGKYDITRPKAAG Prediction of potential genes in microbial genomes Time: Mon May 16 15:49:01 2011 Seq name: gi|296493246|gb|ADTK01000255.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont834.1, whole genome shotgun sequence Length of sequence - 3135 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 186 96 ## UTI89_C1305 lambdoid prophage DLP12 precursor Bor-like protein - Prom 230 - 289 5.5 - Term 381 - 421 -0.3 2 2 Tu 1 . - CDS 477 - 887 140 ## COG5562 Phage envelope protein - Prom 1124 - 1183 3.5 + Prom 1080 - 1139 3.4 3 3 Tu 1 . + CDS 1173 - 1379 217 ## JW0548 hypothetical protein - Term 1443 - 1507 4.0 4 4 Tu 1 . 
- CDS 1544 - 1738 72 ## ECSP_1534 hypothetical protein + Prom 1982 - 2041 6.1 5 5 Op 1 4/0.000 + CDS 2127 - 2672 435 ## COG4220 Phage DNA packaging protein, Nu1 subunit of terminase + Prom 2711 - 2770 2.8 6 5 Op 2 . + CDS 2827 - 3133 271 ## COG5525 Bacteriophage tail assembly protein Predicted protein(s) >gi|296493246|gb|ADTK01000255.1| GENE 1 3 - 186 96 61 aa, chain - ## HITS:1 COG:no KEGG:UTI89_C1305 NR:ns ## KEGG: UTI89_C1305 # Name: ybcU # Def: lambdoid prophage DLP12 precursor Bor-like protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 61 17 77 113 99 88.0 4e-20 MKKMLLATALALLITGCAQQTFTVQNKQTAVAPKETITHHFFVSGIGQKKTVDAAKICGG A >gi|296493246|gb|ADTK01000255.1| GENE 2 477 - 887 140 136 aa, chain - ## HITS:1 COG:ybcV KEGG:ns NR:ns ## COG: ybcV COG5562 # Protein_GI_number: 16128541 # Func_class: R General function prediction only # Function: Phage envelope protein # Organism: Escherichia coli K12 # 1 136 15 150 150 244 98.0 3e-65 MAQVAIFKEIFDQVRKDLNCELFYSELKRHNVSHYIYYLATDNIHIVLENDNTVLIKGLK KVVNVKFSRNTHLIETSFDRLKSREITFQQYRENLAKAGVFRWVTNIHEHKRYYYTFDNS LLFTESIQNTTQIFPR >gi|296493246|gb|ADTK01000255.1| GENE 3 1173 - 1379 217 68 aa, chain + ## HITS:1 COG:no KEGG:JW0548 NR:ns ## KEGG: JW0548 # Name: ybcW # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 68 1 68 68 107 95.0 2e-22 MNKEQSADELSLDLIRVKNMLNSTISMSYPDVVIACIEHKVSLEAFRAIEAALVKHDNNM KDYSLVVD >gi|296493246|gb|ADTK01000255.1| GENE 4 1544 - 1738 72 64 aa, chain - ## HITS:1 COG:no KEGG:ECSP_1534 NR:ns ## KEGG: ECSP_1534 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 64 1 64 64 93 100.0 3e-18 MSTKNRTRRTTTRNIRFPNQMIEQINIALDQKGSGNFSAWVIEACRRRLTSEKRAYTSIK SDEE >gi|296493246|gb|ADTK01000255.1| GENE 5 2127 - 2672 435 181 aa, chain + ## HITS:1 COG:nohB KEGG:ns NR:ns ## COG: nohB COG4220 # Protein_GI_number: 16128543 # Func_class: L Replication, recombination and repair # Function: Phage DNA packaging protein, Nu1 subunit of terminase 
# Organism: Escherichia coli K12 # 1 181 1 181 181 328 98.0 4e-90 MEVNKKQLADIFGASIRTIQNWQEQGMPVLRGGGKGNEVLYDSAAVIKWYAERDAEIENE KMRREVEELRQASETDLQPGTIEYERHRLTRAQADAQELKNARDSAEVVETAFCTFVLSR IAGEIASILDGIPLSVQRRFPELENRHVDFLKRDIIKAMNKAAALDELIPGLLSEYIEQS G >gi|296493246|gb|ADTK01000255.1| GENE 6 2827 - 3133 271 102 aa, chain + ## HITS:1 COG:ECs1630 KEGG:ns NR:ns ## COG: ECs1630 COG5525 # Protein_GI_number: 15830884 # Func_class: R General function prediction only # Function: Bacteriophage tail assembly protein # Organism: Escherichia coli O157:H7 # 1 102 61 162 641 222 100.0 2e-58 MNAMGSDYIREVNVVKSARVGYSKMLLGVYAYFIEHKQRNTLIWLPTDGDAENFMKTHVE PTIRDIPSLLALAPWYGKKHRDNTLTMKRFTNGRGFWCLGGK Prediction of potential genes in microbial genomes Time: Mon May 16 15:49:06 2011 Seq name: gi|296493245|gb|ADTK01000256.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont834.2, whole genome shotgun sequence Length of sequence - 82 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 15:49:07 2011 Seq name: gi|296493244|gb|ADTK01000257.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont834.3, whole genome shotgun sequence Length of sequence - 154 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 152 106 ## COG5525 Bacteriophage tail assembly protein Predicted protein(s) >gi|296493244|gb|ADTK01000257.1| GENE 1 2 - 152 106 50 aa, chain + ## HITS:1 COG:ECs1630 KEGG:ns NR:ns ## COG: ECs1630 COG5525 # Protein_GI_number: 15830884 # Func_class: R General function prediction only # Function: Bacteriophage tail assembly protein # Organism: Escherichia coli O157:H7 # 1 50 61 110 641 110 100.0 9e-25 MNAMGSDYIREVNVVKSARVGYSKMLLGVYAYFIEHKQRNTLIWLPTDGD Prediction of potential genes in microbial genomes Time: Mon May 16 15:49:07 2011 Seq name: gi|296493243|gb|ADTK01000258.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont834.4, whole genome shotgun sequence Length of sequence - 85 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 15:49:08 2011 Seq name: gi|296493242|gb|ADTK01000259.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont834.5, whole genome shotgun sequence Length of sequence - 337 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 335 195 ## COG5525 Bacteriophage tail assembly protein Predicted protein(s) >gi|296493242|gb|ADTK01000259.1| GENE 1 2 - 335 195 111 aa, chain + ## HITS:1 COG:ECs1630 KEGG:ns NR:ns ## COG: ECs1630 COG5525 # Protein_GI_number: 15830884 # Func_class: R General function prediction only # Function: Bacteriophage tail assembly protein # Organism: Escherichia coli O157:H7 # 1 111 187 297 641 240 99.0 5e-64 IEQEGSPTFLGDKRIEGSVWPKSIRGSTPKVRGTCQIERAASESPHFMRFHVAGPHCGEE QYLKFGDKETPFGLKWTPDDPSSVFYLCEHNACVIRQQELDFTDARYICEK Prediction of potential genes in microbial genomes Time: Mon May 16 15:49:08 2011 Seq name: gi|296493241|gb|ADTK01000260.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont834.6, whole genome shotgun sequence Length of sequence - 400 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 41 - 398 179 ## COG5525 Bacteriophage tail assembly protein Predicted protein(s) >gi|296493241|gb|ADTK01000260.1| GENE 1 41 - 398 179 119 aa, chain + ## HITS:1 COG:ECs1630 KEGG:ns NR:ns ## COG: ECs1630 COG5525 # Protein_GI_number: 15830884 # Func_class: R General function prediction only # Function: Bacteriophage tail assembly protein # Organism: Escherichia coli O157:H7 # 1 118 270 387 641 245 99.0 1e-65 MFYLCEHNACVIRQQELDFTDARYICEKTGIWTRDGILWFSSSGEEIEPPDSVTFHIWTA YSPFTTWVQIVKDWMKTKGDTGKRKTFVNTTLGETWEAKIGERPDAEVMAERKEHYSAP Prediction of potential genes in microbial genomes Time: Mon May 16 15:49:09 2011 Seq name: gi|296493240|gb|ADTK01000261.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont834.7, whole genome shotgun sequence Length of sequence - 95 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 15:49:09 2011 Seq name: gi|296493239|gb|ADTK01000262.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont836.1, whole genome shotgun sequence Length of sequence - 532 bp Number of predicted genes - 
1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 7 - 532 429 ## COG0616 Periplasmic serine proteases (ClpP class) Predicted protein(s) >gi|296493239|gb|ADTK01000262.1| GENE 1 7 - 532 429 175 aa, chain + ## HITS:1 COG:ECs1633 KEGG:ns NR:ns ## COG: ECs1633 COG0616 # Protein_GI_number: 15830887 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Escherichia coli O157:H7 # 1 175 250 424 439 263 98.0 1e-70 MSAYTGLSVQAVLDTEAAVYSGQEAIDAGLADELVNSTDAITVMRDALDARKSRLSGGRM TKETQSTTVSATASQADVTDVVPATEGENASAAQPDVNAQITAAVAAENSRIMGILNCEE AHGREEQARVLAETPGMTVETARRILAAAPQSAQARSDTALDRLMQGAPAPLAAG Prediction of potential genes in microbial genomes Time: Mon May 16 15:49:18 2011 Seq name: gi|296493238|gb|ADTK01000263.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont846.1, whole genome shotgun sequence Length of sequence - 33718 bp Number of predicted genes - 27, with homology - 27 Number of transcription units - 6, operones - 3 average op.length - 8.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 54 - 221 82 ## COG0582 Integrase + Term 275 - 316 1.0 - Term 266 - 301 2.2 2 2 Op 1 . - CDS 422 - 1867 702 ## COG0515 Serine/threonine protein kinase 3 2 Op 2 . - CDS 1888 - 2430 156 ## EFER_3095 hypothetical protein 4 2 Op 3 . - CDS 2438 - 4123 978 ## COG0433 Predicted ATPase 5 2 Op 4 . - CDS 4113 - 5270 708 ## EFER_3097 hypothetical protein 6 2 Op 5 . - CDS 5273 - 6121 288 ## EFER_3098 hypothetical protein 7 2 Op 6 . - CDS 6132 - 10418 2637 ## EFER_3099 hypothetical protein 8 2 Op 7 . - CDS 10415 - 11701 504 ## EFER_3100 hypothetical protein 9 2 Op 8 . - CDS 11719 - 13272 503 ## PLES_23601 hypothetical protein 10 2 Op 9 . 
- CDS 13275 - 15422 1779 ## EFER_3102 putative type I restriction-modification system methyltransferase subunit 11 2 Op 10 . - CDS 15473 - 15859 72 ## EFER_3103 hypothetical protein 12 2 Op 11 5/0.000 - CDS 15856 - 17199 305 ## COG4268 McrBC 5-methylcytosine restriction system component 13 2 Op 12 . - CDS 17192 - 19267 914 ## COG1401 GTPase subunit of restriction endonuclease - Prom 19311 - 19370 11.8 14 3 Tu 1 . - CDS 19437 - 20309 435 ## COG3440 Predicted restriction endonuclease - Prom 20331 - 20390 4.3 15 4 Op 1 2/0.000 - CDS 20521 - 21501 190 ## COG2801 Transposase and inactivated derivatives - Term 21520 - 21561 2.5 16 4 Op 2 . - CDS 21566 - 22672 708 ## COG3055 Uncharacterized protein conserved in bacteria 17 4 Op 3 . - CDS 22692 - 23408 440 ## G2583_5112 hypothetical protein - Prom 23520 - 23579 6.1 18 5 Tu 1 . - CDS 24383 - 24604 84 ## ECUMN_4918 hypothetical protein - Prom 24841 - 24900 6.4 + Prom 24577 - 24636 6.3 19 6 Op 1 12/0.000 + CDS 24865 - 25467 50 ## COG0582 Integrase + Prom 25759 - 25818 7.9 20 6 Op 2 3/0.000 + CDS 25944 - 26540 353 ## COG0582 Integrase 21 6 Op 3 4/0.000 + CDS 27021 - 27569 479 ## COG3539 P pilus assembly protein, pilin FimA 22 6 Op 4 7/0.000 + CDS 27634 - 28173 324 ## COG3539 P pilus assembly protein, pilin FimA 23 6 Op 5 10/0.000 + CDS 28210 - 28935 448 ## COG3121 P pilus assembly protein, chaperone PapD 24 6 Op 6 6/0.000 + CDS 29002 - 31638 1458 ## COG3188 P pilus assembly protein, porin PapC 25 6 Op 7 4/0.000 + CDS 31648 - 32178 149 ## COG3539 P pilus assembly protein, pilin FimA 26 6 Op 8 . + CDS 32191 - 32694 254 ## COG3539 P pilus assembly protein, pilin FimA 27 6 Op 9 . 
+ CDS 32714 - 33616 489 ## EcSMS35_4847 protein FimH Predicted protein(s) >gi|296493238|gb|ADTK01000263.1| GENE 1 54 - 221 82 55 aa, chain + ## HITS:1 COG:YPO3438 KEGG:ns NR:ns ## COG: YPO3438 COG0582 # Protein_GI_number: 16123586 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Yersinia pestis # 1 53 367 419 419 86 75.0 1e-17 MSHQERSSVRAAYIHKAEHLGERMLMLQWWAAYLDANRMAAITPFDYAKINSSFR >gi|296493238|gb|ADTK01000263.1| GENE 2 422 - 1867 702 481 aa, chain - ## HITS:1 COG:all4518 KEGG:ns NR:ns ## COG: all4518 COG0515 # Protein_GI_number: 17232010 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Nostoc sp. PCC 7120 # 2 279 4 298 520 110 31.0 8e-24 MPGTNKRLGDKYLLLECLGDGSHGWVWRAERLKDGKIVAVKIPKDITREDRQLAEGKELL NVEPHENIVQIFDMGRIDNEWFYIEMEYFPSQTLAQKLDDRQRNFGQTYERLFRIYRQVL CAVHYLAELPVPISHGDIKPHNILVGERDLVKLTDFGSSALPEEIYVRTRENGGTVLYSA PEFSNVDSRRGSLEELLLGDIYSLGVLLYQLLTGKLPHDTPAQVQRHAPFKLPTEINSSI HTDLEQVVLTCLQKRAEDRFSTISDLIHAFDSAATKQLKVGAIAPVFQTKPAQDWSSAVL EAMSDQSYQKAAQLASQEYSRSKDLQALHQQLIALYRANRLFDFEKVVEDNKAILLEGKL TQPAALFELIVKANLQLRNISQANFWLLQRKQAEAENAATMYLESSILGLEAKYSEARVL LEKVNQTTPMKFHVLSQLILVCEQMRDYSGAAAYLKAALRVAPLDNSMKEKKELYAKLGV I >gi|296493238|gb|ADTK01000263.1| GENE 3 1888 - 2430 156 180 aa, chain - ## HITS:1 COG:no KEGG:EFER_3095 NR:ns ## KEGG: EFER_3095 # Name: not_defined # Def: hypothetical protein # Organism: E.fergusonii # Pathway: not_defined # 1 180 20 199 199 375 98.0 1e-103 MPEVAWVNETFPLLASGKLLDQDVLVHSLGCSVWNTFGHEQSFMAVLECPAPGTFGADIR SDSGWFHKSSASPVCLIEFERFDGSAKGQQKLEEKLKNLLEAAQRWNHCPKTLVLSAWSQ GLVGVPDTQKLKDICRMGFTSSTGTQVIAAPDVEVVFSRFLFIKNLNMIVLDRIHYEVLM >gi|296493238|gb|ADTK01000263.1| GENE 4 2438 - 4123 978 561 aa, chain - ## HITS:1 COG:MK0692 KEGG:ns NR:ns ## COG: MK0692 COG0433 # Protein_GI_number: 20094129 # Func_class: R General function prediction only # Function: Predicted 
ATPase # Organism: Methanopyrus kandleri AV19 # 50 555 50 510 527 196 30.0 1e-49 MAYKVLGKLVGNTGDASQLTMVVQDSFSVRRGEFVRIMHQERKDEGQVAVLGRVTKISRT NMLYNAGFGDGVTELELLPGANVTGENMFAQVELVGYRDPVTRQIKIPRRPLNPGTAVET VDFQFLSDFYEFNEHSSLHLGNLVGYDKGENTVPVFIDVNKLVTEHLAVLAMTGSGKSYT VGRIIERLVAVNNGTVVVFDPHGEYGKALAGGNLQFSSFLEATDDKRDQEALPLIKQTFE KLQAAGAGIQVYSPQHESFKHKYASKNTPLALQFDHFEMDDIAEILPGLTEPQQRVLDVA IRYWRYADKTEPRDINRLRFLLGDGIEELKEWDDLSEAEAKALSGRSAAVASMKLSRVLN EAKSFYSATMAEPTDIYKMVGRPSNQQGRLVVVDLQGLSDTAKQVICALLSSEILKAASS KTDPLRPCFIIYEEGHNFAPAGGNAVSHRIIKKIAGEGRKFGVGFGIVSQRPSKLDSDVT SQCNTLITMRLKNPDDQRFIAKASDMVSKADLEELPSLSTGEALVCGRSIPAPLLVKVGS KALIHGGESPEVLRVWGSFSG >gi|296493238|gb|ADTK01000263.1| GENE 5 4113 - 5270 708 385 aa, chain - ## HITS:1 COG:no KEGG:EFER_3097 NR:ns ## KEGG: EFER_3097 # Name: not_defined # Def: hypothetical protein # Organism: E.fergusonii # Pathway: not_defined # 1 385 1 385 385 741 98.0 0 MNENKTMNLAQSLAIDSQLRLPGSAGRLLKQVRDIEQVAPLFRDAGLVKAVPKPTIVNLR VAGIATKSCLDKVLGGFIYASAAVRTDLLIDDNKVSTVGSDSVSELDDIDFEAQRKRLDL IEIRHAYQLVEKKLLGYEPGDLILLDTPLFLARDMAPLERNVRHKQEYEKTRQVIETFWQ NHRSRIFPWNQNGVVLASILAERFSAIVSIAKQDLRTEQGRKQILASDGFSKELMQNLEG LDESLVGVGDLRFINGILGNYSRTIAFRLSEQEKRIEPNIEAKQGLIGFHFRSARGGQIK MVQLAGDEPEWKSADLDLVASRLMVLDMQNKGNAMPLPQLLGYQQLGILPKFATFYRQGL HLALKNNDIEAGWLAGLDEGLNNGL >gi|296493238|gb|ADTK01000263.1| GENE 6 5273 - 6121 288 282 aa, chain - ## HITS:1 COG:no KEGG:EFER_3098 NR:ns ## KEGG: EFER_3098 # Name: not_defined # Def: hypothetical protein # Organism: E.fergusonii # Pathway: not_defined # 1 282 10 291 291 522 95.0 1e-147 MSRYLSAALASNRKGRFLQTVAGATPLMKDWISSPPASGLLIVQAEELTDANTMQHLYHW AMQAGCAALVINLKAEQFTLLAQLPYPLDWQLVPASLRGQEPGLTALLASETDQAIAGFT GSADRYQHQAGDVVHTRYIRKHSNSGLLAFTTLPLWSLTLLDHSELLVSWLNWFVDHAGI AERIIEPKAPSTDYTPDKHDLVVLLLLYAGGGMNLQALSEHNAVKLMFDVNSLDIVKRGE MLRQHDFIDDAGITATGKTCLQASQYWAYAPLLGEQLHTGTL >gi|296493238|gb|ADTK01000263.1| GENE 7 6132 - 10418 2637 1428 aa, chain - ## HITS:1 COG:no KEGG:EFER_3099 NR:ns ## KEGG: EFER_3099 # Name: not_defined # 
Def: hypothetical protein # Organism: E.fergusonii # Pathway: not_defined # 1 1428 1 1428 1428 2751 98.0 0 MSFELNEQMVDSLAKQHRWAATGVSDYPPRDPLVGQSRFFKRFQTFLHTVDHDDDRFAHV FAVEAEWGRGKSRLGHELVAQINDCSRGWFVRDEQGQLHDKQLFNRDTQDKYLALYIRYS QVASDYQNSDNWFAFGLYQALLPLATQTFDGSIQSEIAKQAYERLMPFGFKSELLADALQ LNANHSEEDLYTDQTLVVSLVQAAYKVLKSFGVQYMLVVLDELETVAEAATFGLEDEQDK RLDGQAIRLIGKAIKEEDPRRKLPWLRYVALCSPLLGQQLREIQSVARRFELVELEHNAF ADVSDYVASLLKDGKLGFNYPAGLVEAAYAMSGANFGWFNVVMANVDAVLSQYQQAGRSI KDVGDLFEAVLAGSGRVARHVLDKGAIEGIQTRDQALLTECRSLLYGQLPVRLAEMGYAT ELLALKNEDLEPIAAHYRKLSIDRLQCRQALEQAKFRRDKDEWYYPAVDQALNLDTLLDN LRTFSIKEQAASGDKGAVLLPLSKGEFKYLLSLLYDHPAIEFAADALWHKLVGETTELDA SEATHIGPSVAMLLRLDLRYRSAQQNSLIFKDPAMGDAHEQAMSELSKQSSQKPLIKLLA RITGFVRLLDNNWTYDENLLSNKAGSDDAPLAILTEPRGRQGGLLTLDGFKLHPEGKALF AWVSNLIELNNLHTFAANHSHQNGRTPVLALTASAHLMEQYSRLDERNELRDAILLHYVN PSEADQLERIGLSLAACQLHGVNLTADSFTAKFKNKLHALTTFATDAIHKWRQRLQQRGL IAWPLKVDGKLSPNDRDLLFKGWYQLAIAHPELNGILDLQQQHGVPVNELSSLLDKLKVP GSYIAKGYTADEHAGLFTELNNVQRSQAQIPLFLARIAHPNKKHKWQFEGFKQQFYFAYV AETSVTAKGVFNDWMWWCGELNLLTLTNPTEKQAIWEHYPRSRLENAIREAQNWFRGNDM GSYAANVEVMSRVYGYARINEMFAPLGKNKLGFVTVEAKEQLEKAQSLFNVLKQQEEQLA DMVEANDTKVLAGLIHKRAEVLELVAKVKPLNSSRPMLKDAHILSLEDKTTSLYQRIEQA CLFAEFVECSAERINNRLVDLIIDVETECAPLTNFPKRLYTNTLRTIGHILDGALKDDTS SATGRKEQQADSDTLLHYLRKLDLGRAHDKLSALAQEVGLNLHNDQQLPIAEIQGHILSS YRNCKERFSKLVNNLTEQKLRAQQLQEWLSSATAEYQFADDIAELPKLVMKLQLIEDAIA DLPNDAESKRQSMQNSLRNGQFSSLRDLPEELLKPARGQLTPIQGQLLKIEERLNQVRRN SIAQVNSWLPLLKPLLASQKQAEPAALTLEDVSNLGVRELQQVCESTQAKWQSQGEQILN DTGLTLADWQPIYKALSQNQEPVLTPEQQKGLVDKGIVKMRLTFATGL >gi|296493238|gb|ADTK01000263.1| GENE 8 10415 - 11701 504 428 aa, chain - ## HITS:1 COG:no KEGG:EFER_3100 NR:ns ## KEGG: EFER_3100 # Name: not_defined # Def: hypothetical protein # Organism: E.fergusonii # Pathway: not_defined # 1 428 6 433 433 766 96.0 0 MDSITTRQNNDPVNSEVALDLCLQICQQGGLIASKAALLLAAVPALRSLLQPIIQPKKND AETDIISAFSLTAPLLDAFNDLSQSGEWQLALLGLNPDVRQHWINLAAARCQEAGAMSDP MVLVKLIQQLGNASEWVLAQLENADFSPQIIASPLAQTERDLLGHSLNDNAAIPALCRIL 
RTSHTLFTVSEQNEPPASIQAVDVTAKQLTNNWCSGRLLALPNTLLDEHDLNPNADWLLV SRSGHDNVPLTELFAQQPWLFLLSLIVFVQDAWAAEQRGGLLLTLPAGQNAFAPGQINVA VQGIEGDEVSLGSLAEFLVLLLGELNIPLYPALDANTESINRLNHVLSTFIAELLAKKIW QFTEAGRGESGQYRIHTSFSDACYSLPLAPLFGYKSQTLQRAIKQLAQNCYANKKRAANR LNLQGSSL >gi|296493238|gb|ADTK01000263.1| GENE 9 11719 - 13272 503 517 aa, chain - ## HITS:1 COG:no KEGG:PLES_23601 NR:ns ## KEGG: PLES_23601 # Name: not_defined # Def: hypothetical protein # Organism: P.aeruginosa_LESB58 # Pathway: not_defined # 8 510 8 500 503 130 27.0 9e-29 MYKVRVLQEIDDRLDAEYYNPAALSTLKKMETKGTVTTFGDLVDEGYRVVYHGADSIYGL KDSETLPFLSPTQIDANGSISFEDAEKLPLYYKDKYPKGLGKTGEILVEVKGNVSKVGII PSTFPKNLMISGSLYKATLDPKRVDSHYVLAFLKSKHGQILKNRLTSNTIINYIAKDALY SIPVIELKEKAQKYIGDKVRQAERLRAWAKLLEYRVQSFHSQFIPDQMNLDFGKKTRKVS SDRMTERLDAHFYPGVVEQYLNTHPCKFKSLNALTTVIYNGQTQPESIDEFCEQITVANL STSFIKGQPRKVSSPSKVDKFIKAHDILMCNAAHNKSYIGRELTYVHTDKKLLPSTEVMT IRVNRDLLPASYVRTYLLSRLGYVKIQSTIRGITAHSYPDDIALLDIYIPEVASNKKHQW FKQDDLLVQAGCAVELSQKLTSCAKTLVEALIEGQLTEQQLIQAQQALEDGDNSLDQAIL SKLSAEGYAIEGATPLFSDVDELYSLLEEAVQAEAEE >gi|296493238|gb|ADTK01000263.1| GENE 10 13275 - 15422 1779 715 aa, chain - ## HITS:1 COG:no KEGG:EFER_3102 NR:ns ## KEGG: EFER_3102 # Name: not_defined # Def: putative type I restriction-modification system methyltransferase subunit # Organism: E.fergusonii # Pathway: not_defined # 1 715 1 715 715 1464 99.0 0 MAKEAKKQDPQQQLEALFNKLVKQHSEFTQPLVELPEHEFAKAGLIQGLVEGNLTPPVML IALTEGHWEYAADTAKFDDLAYQLLSDIEGDWPVYAVVDDGLNQRILALFGADGSDGTHG ADTLPGLEELRSFDRMERDPTFRWSMRVYTRLMQRFDAFHENVYRVTKDKVNDKNDIIEE VAKLLFLETFRLHHDEDLTFKDDEGNTLRFRDVFDWQYVESHGDKAVAQIKAAFEEFKNH ENYVVISDDGSRNPIFSKETHLRLSVAKNYQDLLEAIQNLGPVKTNDGKIAKEHGTLADV SGDLLGRVFDVFLRANFESKGGLGVYLTPNPVKQAMLEIAMHDIDDDDEMRSRLANGDFR FCDPTCGSFGFGSVALSQIDKWIDFKLVLADDKKESLKQKLRDCAFTGADAAPRMVMLAR VNMALQGAPKAQIFYTDNSLTTNALKPNSFDLICTNPPFGTPKFDKGKNGQNSKANYEAN MEQVLGGFRPTKKVVDSYNAWYDHVKMKWQDIGDLELDENGEPKWAGYRTDLRRTGGTDK KPFYSLQPTTAGLALGSKPDSKGNWQPVGATIDPAVLFIDRCLQLLKPGGRLLIVLPDGI 
LCNSGDRYVREYIMGKKDEKTGEFVGGKAIVKAVISLPSDCFKLSGTGAKTSILYLQKRH ANPNQPEQFLPEPQTDVFMAVAETLGYVVKNNIEDYNAGVANDLDKIVSVYKRGE >gi|296493238|gb|ADTK01000263.1| GENE 11 15473 - 15859 72 128 aa, chain - ## HITS:1 COG:no KEGG:EFER_3103 NR:ns ## KEGG: EFER_3103 # Name: not_defined # Def: hypothetical protein # Organism: E.fergusonii # Pathway: not_defined # 1 128 1 128 128 206 100.0 2e-52 MTHEYQRQSLEQLITQQTATPRSAAVVFNENQTLWQTLGFSQSQVSLWLASLPQSNHLKG PTTATYQVIPDIAAHLVSLLHQAGGRMPLAQVLKKLPAGITTSEQQIRKLAQKHAQIEIK GPLLVLIN >gi|296493238|gb|ADTK01000263.1| GENE 12 15856 - 17199 305 447 aa, chain - ## HITS:1 COG:YPO0388 KEGG:ns NR:ns ## COG: YPO0388 COG4268 # Protein_GI_number: 16120722 # Func_class: V Defense mechanisms # Function: McrBC 5-methylcytosine restriction system component # Organism: Yersinia pestis # 6 411 8 420 438 311 42.0 2e-84 MGEVISVFEYDLLGSGKAASVGAKPVPPQVFNYLEALSLASNQGSQFLKLTSRSGFKLLQ VQNYAGMLSTPHGFQLEILPKVGKNLTAANARQTLLTMLSHLPGFRHIQTQQATLQAQRM PLLEIFINQFLHSVSQLLKQGLRSDYVSKQGNLAFMKGKLMLSAQLRHNAVNRHKFCVDY DDYMPDCAANRLLHSALDKLLSLKLSSENQRWLYELRFAFDGIPLSRDIERDINNLRLER GMAHYNEPMAWAQLILRGMSPSALQGNTKAISLLFPMEAVFESFVAQTLPYELPSHLKVL PQAATYSLVKHGVKDCFKLRPDLLIQSNKPVQTEMVMDTKWKLVNSSQQTKSLYGLAQAD FYQMFAYGQKYLGGNGEMYLIYPAHDDFSQPIPQHFAFSETLKLWVVPYRIMAKRGERMM WPNDVSCTMSPKLDRMAASVSYSGGMR >gi|296493238|gb|ADTK01000263.1| GENE 13 17192 - 19267 914 691 aa, chain - ## HITS:1 COG:YPO0387 KEGG:ns NR:ns ## COG: YPO0387 COG1401 # Protein_GI_number: 16120721 # Func_class: V Defense mechanisms # Function: GTPase subunit of restriction endonuclease # Organism: Yersinia pestis # 436 680 436 680 687 248 53.0 4e-65 MTLVDSVEAGRLTISELIDALAKDKNYTASRWYQRYRAFTTLLQQTSTFAEPATDGLVKQ LWYERDNGIASIRQGVPSLAEYQQSLPLLRELTERIRQQPDEETYQYVGNALQQAKENGL LKRMYWSLRNRVFAAFSPENYTSTVDENAFNKAAEFLNQHFHLGLVLTGNWLQKNYELKQ AIHAQSPDTDPYYVNMAIWHLYELLRERDNEQKQEKVASTTSITSSEPIENKIILHSPTN VIFFGPPGTGKTFRLQQKMKEYTSHAVPADRDAWLDSRLESLNWMQVITLVLLDLGKRTK VRQIIEHMWFQRKALLNGRNGNLSNTAWAALQSYTVPESLTVDYKNRREPAVFNKTDNSE 
WLLVDSQLEQVEDLVELYAELKRGPKSAEAIQRFAVVTFHQSYGYEEFIEGIRARSDESG NISYPIEPGIFMRLCQRANADPGHRYAIFIDEINRGNISKIFGELISLIEVDKRAGMPNA MSLQLAYSGDHFSVPANVDIIGAMNTADRSLALMDTALRRRFDFVEMMPDLSLLSGAKVK GIELESLLEKLNSRIEALYDREHTLGHAFFMPVKNALDAGDEEAAFKQLKIAFQKKIIPL LQEYFFDDWNKIRLVLADNQKQDDNLQFVIEKTDDLDTLFGNNHGLRRHDQQSTAYELKD FDQEIWNIPQAYRSIYQPQQTPLDEQAVNHG >gi|296493238|gb|ADTK01000263.1| GENE 14 19437 - 20309 435 290 aa, chain - ## HITS:1 COG:Z5892 KEGG:ns NR:ns ## COG: Z5892 COG3440 # Protein_GI_number: 15804871 # Func_class: V Defense mechanisms # Function: Predicted restriction endonuclease # Organism: Escherichia coli O157:H7 EDL933 # 1 280 12 291 294 583 98.0 1e-166 MASSKSLQQAIANIKIWHKGEQRAPHKPLLLLYVLAGYLNGHPRLFDYGSEIYEPLHSLL ERFGPQRSQYRPDMPFWRLQGDGFWQLHNAELCSTAGSSRQPPVKELNEYHVAGGFDEQH YALVTGNKKLINTLAQQILEAHFTESIQEEIADELGFDLQQIRKQRDPLFRKNVLRAYNY QCAICGFNMRHDDTTVALEAAHIKWKQHGGPCEIPNGLALCAIHHKAFDKGSIGLDENMR VLVSDAVNGGGIVERLFWDFDGKTIALPQVRKNYPYEVFVEWHRKEAFRG >gi|296493238|gb|ADTK01000263.1| GENE 15 20521 - 21501 190 326 aa, chain - ## HITS:1 COG:ECs5268 KEGG:ns NR:ns ## COG: ECs5268 COG2801 # Protein_GI_number: 15834522 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 326 1 326 326 666 97.0 0 MNAIISPDYYYVLTVAGQSNAMAYGEGLPLPDREDAPHPRIKQLARFAHTHPGGPSCHFN DIIPLTHCPHDVQDMQGYHHPLATNHQTQYGTVGQALHIARKLLPFIPDNAGILIVPCCR GGSAFTAGSEGTYSERHGASHDACRWGTDTPLYQDLVSRTRAALVKNPQNKFLGVCWMQG EFDLMTSDYVSHPQHFNHMVEAFRRDLKQYHSQLNNITDAPWFCGDTTWYWKENFPHAYE AIYGNYQNNVLANIIFVDFQQQGERGLTNAPDEDPDDLSTGYYGSAYRSPENWTTALRSS HFSAAARRGIISDSFVEAILQFLREK >gi|296493238|gb|ADTK01000263.1| GENE 16 21566 - 22672 708 368 aa, chain - ## HITS:1 COG:yjhT KEGG:ns NR:ns ## COG: yjhT COG3055 # Protein_GI_number: 16132131 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 368 37 404 404 704 99.0 0 MNKTITALAIMMASFAANASVLPETPVPFKSGTGAIDNDTVYIGLGSTGTAWYKLDTQAK 
DKKWTALAAFPGGPRDQATSAFIDGNLYVFGGIGKNSEGLTQVFNDVHKYNPKTNSWVKL MSHAPMGMAGHVTFVHNGKAYVTGGVNQNIFNGYFEDLNEAGKDSTAIDKINAHYFDKKA EDYFFNKFLLSFDPSTQQWSYAGESPWYGTAGAAVVNKGDKTWLINGEAKPGLRTDAVFE LDFTGNNLKWNKLAPVSSPDGVAGGFAGISNDSLIFAGGAGFKGSRENYQNGKNYAHEGL KKSYSTDIHLWHNGKWDKSGELSQGRAYGVSLPWNNSLLIIGGETAGGKAVTDSVLISVK DNKVTVQN >gi|296493238|gb|ADTK01000263.1| GENE 17 22692 - 23408 440 238 aa, chain - ## HITS:1 COG:no KEGG:G2583_5112 NR:ns ## KEGG: G2583_5112 # Name: nanC # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 238 4 241 241 454 100.0 1e-126 MKKAKILSGVLLLCFSSPLISQAATLDVRGGYRSGSHAYETRLKVSEGWQNGWWASMESN TWNTIHDNKKENAALNDVQVEVNYAIKLDDQWTVRPGMLTHFSSNGTRYGPYVKLSWDAT KDLNFGIRYRYDWKAYRQQDLSGDMSRDNVHRWDGYVTYHINSDFTFAWQTTLYSKQNDY RYANHKKWATENAFVLQYHMTPDITPYIEYDYLDRQGVYNGRDNLSENSYRIGVSFKL >gi|296493238|gb|ADTK01000263.1| GENE 18 24383 - 24604 84 73 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_4918 NR:ns ## KEGG: ECUMN_4918 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 73 1 73 131 130 98.0 1e-29 MQLILSIYHVICSVTSKINIKNSAKFNIRSTSRLTYDIISQLHLPQNGYFMERSNFLVAI IMLILRSLAIIPG >gi|296493238|gb|ADTK01000263.1| GENE 19 24865 - 25467 50 200 aa, chain + ## HITS:1 COG:ECs5271 KEGG:ns NR:ns ## COG: ECs5271 COG0582 # Protein_GI_number: 15834525 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Escherichia coli O157:H7 # 1 200 1 200 200 402 100.0 1e-112 MKNKADNKKRNFLTHSEIESLLKAANTGPHAARNYCLTLLCFIHGFRASEICRLRISDID LKAKCIYIHRLKKGFSTTHPLLNKEVQALKNWLSIRTSYPHAESEWVFLSRKGNPLSRQQ FYHIISTSGGNAGLSLEIHPHMLRHSCGFALANMGIDTRLIQDYLGHRNIRHTVWYTASN AGRFYGIWDRARGRQRHAVL >gi|296493238|gb|ADTK01000263.1| GENE 20 25944 - 26540 353 198 aa, chain + ## HITS:1 COG:fimE KEGG:ns NR:ns ## COG: fimE COG0582 # Protein_GI_number: 16132134 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Escherichia coli K12 # 1 198 1 198 198 367 100.0 1e-102 
MSKRRYLTGKEVQAMMQAVCYGATGARDYCLILLAYRHGMRISELLDLHYQDLDLNEGRI NIRRLKNGFSTVHPLRFDEREAVERWTQERANWKGADRTDAIFISRRGSRLSRQQAYRII RDAGIEAGTVTQTHPHMLRHACGYELAERGADTRLIQDYLGHRNIRHTVRYTASNAARFA GLWERNNLINEKLKREEV >gi|296493238|gb|ADTK01000263.1| GENE 21 27021 - 27569 479 182 aa, chain + ## HITS:1 COG:ECs5273 KEGG:ns NR:ns ## COG: ECs5273 COG3539 # Protein_GI_number: 15834527 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli O157:H7 # 1 182 1 182 182 218 90.0 7e-57 MKIKTLAIVVLSALSLSSAAALADTTTVNGGTVHFKGEVVNAACAVDAGSVDQTVQLGQV RTASLAQEGATSSAVGFNIQLNDCDTTVADKAAIAFLGTAIDGTHPNVLALQSSAAGSAT KVGVQILDRTGAALALDGATFSAQTTLNNGTNTIPFQARYFATGAATPGAANADATFKVQ YQ >gi|296493238|gb|ADTK01000263.1| GENE 22 27634 - 28173 324 179 aa, chain + ## HITS:1 COG:fimI KEGG:ns NR:ns ## COG: fimI COG3539 # Protein_GI_number: 16132136 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 179 37 215 215 368 99.0 1e-102 MKRKRLFLLASLLPMFALAGNKWNTTLPGGNMKFQGVIIAETCRIEAGDKQMTVNMGQIS SNRFHAVGEDSAPVPFVIHLRECSTVVSERVGVAFHGVADGKNPDVLSVGEGPGIATNIG VALFDDEGNLVPINRPPANWKRLYSGSTSLHFIAKYRATGRRVTGGIANAQAWFSLTYQ >gi|296493238|gb|ADTK01000263.1| GENE 23 28210 - 28935 448 241 aa, chain + ## HITS:1 COG:ECs5275 KEGG:ns NR:ns ## COG: ECs5275 COG3121 # Protein_GI_number: 15834529 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli O157:H7 # 1 241 1 241 241 456 99.0 1e-128 MSNKNVNVRKSQEITFCLLAGILIFMAMMVAGRAEAGVALGATRVIYPAGQKQVQLAVTN NDENSTYLIQSWVENADGVKDGRFIVTPPLFAMKGKKENTLRILDATNNQLPQDRESLFW MNVKAIPSMDKSKLTENTLQLAIISRIKLYYRPAKLALPPDQAAEKLRFRRSANSLTLIN PTPYYLTVTELNAGARVLENALVPPMGESTVKLPSDAGSNITYRTINDYGALTPKMTGVM E >gi|296493238|gb|ADTK01000263.1| GENE 24 29002 - 
31638 1458 878 aa, chain + ## HITS:1 COG:ECs5276 KEGG:ns NR:ns ## COG: ECs5276 COG3188 # Protein_GI_number: 15834530 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli O157:H7 # 1 878 1 878 878 1722 98.0 0 MSYLNLRLYQRNTQCLHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQA VADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLN TASVSGMNLLADDACVPLTSMIHDATAHLDVGQQRLNLTIPQAFMSNRARGYIPPELWDP GINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNRSDRSSGSK NKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPV IHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTV PYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRY RAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYR YSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRT STLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNI PFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGD GNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNETVVL VKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP TRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLA GKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR >gi|296493238|gb|ADTK01000263.1| GENE 25 31648 - 32178 149 176 aa, chain + ## HITS:1 COG:fimF KEGG:ns NR:ns ## COG: fimF COG3539 # Protein_GI_number: 16132139 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 176 1 176 176 320 100.0 1e-87 MRNKPFYLLCAFLWLAVSHALAADSTITIRGYVRDNGCSVAAESTNFTVDLMENAAKQFN NIGATTPVVPFRILLSPCGNAVSAVKVGFTGVADSHNANLLALENTVSAASGLGIQLLNE QQNQIPLNAPSSALSWTTLTPGKPNTLNFYARLMATQVPVTAGHINATATFTLEYQ >gi|296493238|gb|ADTK01000263.1| GENE 26 32191 - 32694 254 167 aa, chain + ## HITS:1 COG:fimG KEGG:ns NR:ns ## COG: fimG COG3539 # Protein_GI_number: 16132140 # Func_class: N Cell motility; U Intracellular trafficking, secretion, 
and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 167 1 167 167 255 100.0 3e-68 MKWCKRGYVLAAILALASATIQAADVTITVNGKVVAKPCTVSTTNATVDLGDLYSFSLMS AGAASAWHDVALELTNCPVGTSRVTASFSGAADSTGYYKNQGTAQNIQLELQDDSGNTLN TGATKTVQVDDSSQSAHFPLQVRALTVNGGATQGTIQAVISITYTYS >gi|296493238|gb|ADTK01000263.1| GENE 27 32714 - 33616 489 300 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_4847 NR:ns ## KEGG: EcSMS35_4847 # Name: fimH # Def: protein FimH # Organism: E.coli_SECEC # Pathway: not_defined # 1 300 1 300 300 528 100.0 1e-149 MKRVITLFAVLLMGWSVNAWSFACKTANGTAIPIGGGSANVYVNLAPAVNVGQNLVVDLS TQIFCHNDYPETITDYVTLQRGSAYGGVLSNFSGTVKYSGSSYPFPTTSETPRVVYNSRT DKPWPVALYLTPVSSAGGVAIKAGSLIAVLILRQTNNYNSDDFQFVWNIYANNDVVVPTG GCDVSARDVTVTLPDYPGSVPIPLTVYCAKSQNLGYYLSGTTADAGNSIFTNTASFSPAQ GVGVQLTRNGTIIPANNTVSLGAVGTSAVSLGLTANYARTGGQVTAGNVQSIIGVTFVYQ Prediction of potential genes in microbial genomes Time: Mon May 16 15:50:34 2011 Seq name: gi|296493237|gb|ADTK01000264.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont846.2, whole genome shotgun sequence Length of sequence - 53428 bp Number of predicted genes - 51, with homology - 50 Number of transcription units - 29, operones - 12 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 184 - 294 77 ## - Prom 319 - 378 2.4 + Prom 457 - 516 3.3 2 2 Op 1 7/0.000 + CDS 634 - 1818 1467 ## COG1312 D-mannonate dehydratase + Term 1825 - 1861 8.6 3 2 Op 2 3/0.167 + CDS 1899 - 3359 1810 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases + Term 3384 - 3427 8.6 + Prom 3417 - 3476 5.1 4 3 Tu 1 . + CDS 3574 - 4347 926 ## COG2186 Transcriptional regulators + Term 4361 - 4398 4.1 - Term 4363 - 4411 0.3 5 4 Tu 1 . - CDS 4488 - 5318 334 ## EC55989_4990 hypothetical protein - Prom 5436 - 5495 6.8 6 5 Tu 1 . + CDS 5990 - 6382 240 ## EcHS_A4554 DNA replication/recombination/repair protein 7 6 Op 1 . - CDS 6375 - 7286 785 ## COG0583 Transcriptional regulator 8 6 Op 2 . 
- CDS 7351 - 8523 1260 ## ECO111_5184 isoaspartyl dipeptidase 9 6 Op 3 3/0.167 - CDS 8536 - 8997 523 ## COG0700 Uncharacterized membrane protein 10 6 Op 4 . - CDS 8994 - 9689 639 ## COG3314 Uncharacterized protein conserved in bacteria - Prom 9888 - 9947 4.2 + Prom 9831 - 9890 5.0 11 7 Tu 1 . + CDS 9927 - 10481 337 ## COG1859 RNA:NAD 2'-phosphotransferase + Term 10558 - 10606 2.9 12 8 Tu 1 2/0.500 - CDS 10494 - 11615 797 ## COG0477 Permeases of the major facilitator superfamily - Prom 11665 - 11724 1.9 13 9 Op 1 . - CDS 11740 - 12600 515 ## COG3204 Uncharacterized protein conserved in bacteria 14 9 Op 2 . - CDS 12665 - 12922 165 ## EC55989_5000 hypothetical protein 15 9 Op 3 4/0.167 - CDS 12919 - 13686 492 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) 16 9 Op 4 1/0.667 - CDS 13696 - 14847 1362 ## COG1775 Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB - Prom 14885 - 14944 1.7 17 10 Op 1 4/0.167 - CDS 14963 - 16210 1100 ## COG2733 Predicted membrane protein 18 10 Op 2 . - CDS 16284 - 17516 1123 ## COG0477 Permeases of the major facilitator superfamily - Prom 17740 - 17799 4.5 + Prom 17721 - 17780 4.3 19 11 Tu 1 . + CDS 17983 - 18903 529 ## COG5464 Uncharacterized conserved protein - Term 18835 - 18871 2.4 20 12 Tu 1 . - CDS 19057 - 20469 1014 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs - Prom 20537 - 20596 3.3 + Prom 20512 - 20571 3.7 21 13 Tu 1 . + CDS 20646 - 20810 95 ## COG5457 Uncharacterized conserved small protein + Term 20817 - 20852 3.6 + Prom 21075 - 21134 4.0 22 14 Tu 1 . + CDS 21162 - 23702 1697 ## COG2366 Protein related to penicillin acylase - Term 23695 - 23745 -0.0 23 15 Op 1 3/0.167 - CDS 23780 - 24736 1007 ## COG0523 Putative GTPases (G3E family) 24 15 Op 2 9/0.000 - CDS 24747 - 24950 274 ## COG2879 Uncharacterized small protein - Term 24961 - 24990 3.5 25 15 Op 3 . 
- CDS 25000 - 27150 2547 ## COG1966 Carbon starvation protein, predicted membrane protein - Term 27498 - 27536 6.6 26 16 Op 1 4/0.167 - CDS 27543 - 28055 631 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family 27 16 Op 2 . - CDS 28073 - 29635 1680 ## COG2368 Aromatic ring hydroxylase - Prom 29692 - 29751 2.6 - Term 29804 - 29843 5.3 28 17 Op 1 5/0.167 - CDS 29886 - 30776 752 ## COG2207 AraC-type DNA-binding domain-containing proteins 29 17 Op 2 6/0.083 - CDS 30786 - 32162 1343 ## COG0477 Permeases of the major facilitator superfamily - Term 32190 - 32224 0.3 30 18 Op 1 5/0.167 - CDS 32337 - 33125 1048 ## COG3836 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase 31 18 Op 2 2/0.500 - CDS 33136 - 33939 999 ## COG3971 2-keto-4-pentenoate hydratase 32 18 Op 3 4/0.167 - CDS 34007 - 34387 436 ## COG3232 5-carboxymethyl-2-hydroxymuconate isomerase 33 18 Op 4 4/0.167 - CDS 34397 - 35248 843 ## COG3384 Uncharacterized conserved protein 34 18 Op 5 3/0.167 - CDS 35250 - 36716 1488 ## COG1012 NAD-dependent aldehyde dehydrogenases 35 18 Op 6 . - CDS 36713 - 38002 1271 ## COG0179 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) - Prom 38064 - 38123 6.8 + Prom 38088 - 38147 8.2 36 19 Tu 1 . + CDS 38274 - 38720 388 ## COG1846 Transcriptional regulators + Prom 38735 - 38794 2.1 37 20 Tu 1 . + CDS 38839 - 40503 1606 ## COG0840 Methyl-accepting chemotaxis protein - Term 40502 - 40543 7.1 38 21 Op 1 21/0.000 - CDS 40552 - 41022 439 ## COG0477 Permeases of the major facilitator superfamily 39 21 Op 2 3/0.167 - CDS 41037 - 41921 886 ## COG0477 Permeases of the major facilitator superfamily - Prom 41953 - 42012 1.8 - Term 41976 - 42019 1.7 40 22 Tu 1 . - CDS 42136 - 43050 492 ## COG1802 Transcriptional regulators - Prom 43075 - 43134 4.0 + Prom 43039 - 43098 3.9 41 23 Tu 1 . 
+ CDS 43189 - 44211 1090 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases + Term 44239 - 44277 3.4 42 24 Tu 1 . - CDS 44350 - 46641 2084 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily - Prom 46693 - 46752 4.2 - Term 46840 - 46884 5.1 43 25 Op 1 . - CDS 46895 - 47386 499 ## SSON_4506 hypothetical protein 44 25 Op 2 . - CDS 47438 - 48175 813 ## COG1484 DNA replication protein 45 25 Op 3 . - CDS 48178 - 48717 529 ## COG5529 Pyocin large subunit - Prom 48758 - 48817 2.2 - Term 48760 - 48807 7.8 46 26 Op 1 12/0.000 - CDS 48825 - 49298 429 ## COG3610 Uncharacterized conserved protein 47 26 Op 2 . - CDS 49289 - 50059 359 ## COG2966 Uncharacterized conserved protein 48 27 Op 1 . + CDS 50729 - 51403 310 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 49 27 Op 2 . + CDS 51415 - 52038 332 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain + Term 52048 - 52083 2.4 50 28 Tu 1 . - CDS 52076 - 52864 633 ## COG4114 Uncharacterized Fe-S protein - Prom 52954 - 53013 6.5 + Prom 52815 - 52874 3.5 51 29 Tu 1 . 
+ CDS 53005 - 53241 230 ## UTI89_C5074 hypothetical protein - TRNA 53280 - 53366 69.1 # Leu CAG 0 0 Predicted protein(s) >gi|296493237|gb|ADTK01000264.1| GENE 1 184 - 294 77 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHVLNILWVVFGIGLMLVLTKTAGSDTFAALAMPLR >gi|296493237|gb|ADTK01000264.1| GENE 2 634 - 1818 1467 394 aa, chain + ## HITS:1 COG:uxuA KEGG:ns NR:ns ## COG: uxuA COG1312 # Protein_GI_number: 16132143 # Func_class: G Carbohydrate transport and metabolism # Function: D-mannonate dehydratase # Organism: Escherichia coli K12 # 1 394 1 394 394 809 99.0 0 MEQTWRWYGPNDPVSLADVRQAGATGVVTALHHIPNGEVWSVEEILKRKAIVEDAGLVWS VVESVPIHEDIKTHTGNYEQWIANYQQTLRNLAQCGIRTVCYNFMPVLDWTRTDLEYVLP DGSKALRFDQIEFAAFEMHILKRLGAEADYTEEEIAQAAVRFATMSDEDKARLTRNIIAG LPGAEEGYTLDQFRKHLELYKDIDKAKLRENFAVFLKAIIPVAEEVGVRMAVHPDDPPRP ILGLPRIVSTIEDMQWMVDTVNSMANGFTMCTGSYGVRADNDLVDMIKQFGPRIYFTHLR STMREDNPKTFHEAAHLNGDVDMYEVVKAIVEEEHRRKAEGKEDLIPMRPDHGHQMLDDL KKKTNPGYSAIGRLKGLAEVRGVELAIQRAFFSR >gi|296493237|gb|ADTK01000264.1| GENE 3 1899 - 3359 1810 486 aa, chain + ## HITS:1 COG:uxuB KEGG:ns NR:ns ## COG: uxuB COG0246 # Protein_GI_number: 16132144 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Escherichia coli K12 # 1 486 1 486 486 1013 99.0 0 MTTIVDSNLPVARPSWDHSRLESRIVHLGCGAFHRAHQALYTHHLLESTDSDWGICEVNL MPGNDRVLIENLKKQQLLYTVAEKGAESTELKIIGSMKEALHPEIDGCEGILNAMARPQT AIVSLTVTEKGYCADAASGQLDLNNPLIKHDLENPTAPKSAIGYIVEALRLRREKGLKAF TVMSCDNVRENGHVAKVAVLGLAQARDPQLAAWIEENVTFPCTMVDRIVPAATPETLQEI ADQLGVYDPCAIACEPFRQWVIEDNFVNGRPDWDKVGAQFVADVVPFEMMKLRMLNGSHS FLAYLGYLGGYETIADTMTNPAYRKAAFALMMQEQAPTLSMPEGTDLNAYATLLIERFSN PSLRHRTWQIAMDGSQKLPQRLLDPVRLHLQNGGSWRHLALGVAGWMRYTQGVDEQGNAI DVVDPMLAEFQKINAQYQGADRVKALLGLSGIFADDLPQNADFVGAVTAAYQQLCERGAR ECVAAL >gi|296493237|gb|ADTK01000264.1| GENE 4 3574 - 4347 926 257 aa, chain + ## HITS:1 COG:uxuR KEGG:ns NR:ns ## COG: uxuR COG2186 # Protein_GI_number: 16132145 # Func_class: K Transcription # Function: 
Transcriptional regulators # Organism: Escherichia coli K12 # 1 257 1 257 257 498 100.0 1e-141 MKSATSAQRPYQEVGAMIRDLIIKTPYNPGERLPPEREIAEMLDVTRTVVREALIMLEIK GLVEVRRGAGIYVLDNSGSQNTDSPDANVCNDAGPFELLQARQLLESNIAEFAALQATRE DIVKMRQALQLEERELASSAPGSSESGDMQFHLAIAEATHNSMLVELFRQSWQWRENNPM WIQLHSHLDDSLYRKEWLGDHKQILAALIKKDARAAKLAMWQHLENVKQRLLEFSNVDDI YFDGYLFDSWPLDKVDA >gi|296493237|gb|ADTK01000264.1| GENE 5 4488 - 5318 334 276 aa, chain - ## HITS:1 COG:no KEGG:EC55989_4990 NR:ns ## KEGG: EC55989_4990 # Name: yjiC # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 276 1 276 276 552 97.0 1e-156 MSMPLSNALQSQIITDNHFLHHPKVDSELTRKYERARLDKENIYLLPLARGNNHNYDGKS VVEIRKLDISKEPWPFNYVTETCREFNGITTTGRMLYRNLKITSALDEIYGGICKKAHAT TELAEGLRLNLFMKSPFDPVEDYTVHEITLGPGCNVPGYAGTTIGYISTLPASQAKRWTN EQPRIDIYIDQIMTVTGVANSSGFALAALLNANIELGNDPIIGIEAYPGTAEIHAKMGYK VIPGDENAPLKRMTLQPSSLPELFELKNGEWNYIGK >gi|296493237|gb|ADTK01000264.1| GENE 6 5990 - 6382 240 130 aa, chain + ## HITS:1 COG:no KEGG:EcHS_A4554 NR:ns ## KEGG: EcHS_A4554 # Name: iraD # Def: DNA replication/recombination/repair protein # Organism: E.coli_HS # Pathway: not_defined # 1 130 4 133 133 259 100.0 2e-68 MMRQSLQAVLPEISGNKTSLLRKSVCSDILTLFNSPHSALPSLLVSGMPEWQVHNQSDKH LQSWYCRQLRSALLFHEPRIAALQVNFKEAYCHTLAISLEIMLYHDGEPLTFDLVWDNGG WCSAMLENVS >gi|296493237|gb|ADTK01000264.1| GENE 7 6375 - 7286 785 303 aa, chain - ## HITS:1 COG:yjiE KEGG:ns NR:ns ## COG: yjiE COG0583 # Protein_GI_number: 16132148 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 303 1 303 303 606 100.0 1e-173 MDDCGAILHNIETKWLYDFLTLEKCRNFSQAAVSRNVSQPAFSRRIRALEQAIGVELFNR QVTPLQLSEQGKIFHSQIRHLLQQLESNLAELRGGSDYAQRKIKIAAAHSLSLGLLPSII SQMPPLFTWAIEAIDVDEAVDKLREGQSDCIFSFHDEDLLEAPFDHIRLFESQLFPVCAS DEHGEALFNLAQPHFPLLNYSRNSYMGRLINRTLTRHSELSFSTFFVSSMSELLKQVALD GCGIAWLPEYAIQQEIRSGKLVVLNRDELVIPIQAYAYRMNTRMNPVAERFWRELRELEI VLS >gi|296493237|gb|ADTK01000264.1| GENE 8 7351 - 8523 1260 390 aa, chain - ## HITS:1 COG:no 
KEGG:ECO111_5184 NR:ns ## KEGG: ECO111_5184 # Name: iadA # Def: isoaspartyl dipeptidase # Organism: E.coli_O111_H- # Pathway: not_defined # 1 390 1 390 390 761 100.0 0 MIDYTAAGFTLLQGAHLYAPEDRGICDVLVANGKIIAVASNIPSDIVPDCTVVDLSGQIL CPGFIDQHVHLIGGGGEAGPTTRTPEVALSRLTEAGVTSVVGLLGTDSISRHPESLLAKT RALNEEGISAWMLTGAYHVPSRTITGSVEKDVAIIDRVIGVKCAISDHRSAAPDVYHLAN MAAESRVGGLLGGKPGVTVFHMGDSKKALQPVYDLLENCDVPISKLLPTHVNRNVPLFEQ ALEFARKGGTIDITSSIDEPVAPAEGIARAVQAGIPLARVTLSSDGNGSQPFFDDEGNLT HIGVAGFETLLETVQVLVKDYDFSISDALRPLTSSVAGFLNLTGKGEILPGNDADLLVMT PELRIEQVYARGKLMVKDGKACVKGTFETA >gi|296493237|gb|ADTK01000264.1| GENE 9 8536 - 8997 523 153 aa, chain - ## HITS:1 COG:ECs5287 KEGG:ns NR:ns ## COG: ECs5287 COG0700 # Protein_GI_number: 15834541 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Escherichia coli O157:H7 # 1 153 1 153 153 259 99.0 1e-69 MTTQVRKNVMDMFIDGARRGFTIATTNLLPNVVMAFVIIQALKITGLLDWVGHICEPVMA LWGLPGEAATVLLAALMSMGGAVGVAASLATAGTLTGHDVTVLLPAMYLMGNPVQNVGRC LGTAEVNAKYYPHIITVCVINALLSIWVMQLIV >gi|296493237|gb|ADTK01000264.1| GENE 10 8994 - 9689 639 231 aa, chain - ## HITS:1 COG:ECs5288 KEGG:ns NR:ns ## COG: ECs5288 COG3314 # Protein_GI_number: 15834542 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 231 1 231 231 399 100.0 1e-111 MGIVMTQQGDAVAGELATEKVGIKGYLAFFLTIIFFSGVFSGTDSWWRVFDFSVLNGSFG QLPGANGATTSFRGAGGAGAKDGFLFALELAPSVILSLGIISITDGLGGLRAAQQLMTPV LKPLLGIPGICSLALIANLQNTDAAAGMTKELAQEGEITERDKVIFAAYQTSGSAIITNY FSSGVAVFAFLGTSVIVPLAVILVFKFVGANILRVWLNFEERRNPTQGAQA >gi|296493237|gb|ADTK01000264.1| GENE 11 9927 - 10481 337 184 aa, chain + ## HITS:1 COG:kptA KEGG:ns NR:ns ## COG: kptA COG1859 # Protein_GI_number: 16132152 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNA:NAD 2'-phosphotransferase # Organism: Escherichia coli K12 # 1 184 35 218 218 364 98.0 1e-101 MAKYNEKELADTSKFLSFVLRHKPEAIGIVLDREGWADIDKLILCAQKAGKRLTRTLLDT 
VVATSDKKRFSYSSDGRCIRAVQGHSTSQVAISFAEKTPPQFLYHGTASRFLDEIKKQGL IAGERHYVHLSADEATARKVGARHGSPVILTVKAQEMAKRGIPFWQAENGVWLTSTVAVE FLEW >gi|296493237|gb|ADTK01000264.1| GENE 12 10494 - 11615 797 373 aa, chain - ## HITS:1 COG:yjiJ KEGG:ns NR:ns ## COG: yjiJ COG0477 # Protein_GI_number: 16132153 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 373 20 392 392 545 97.0 1e-155 MLVLTLGMGLGRFLYTPMLPVMMAEGAFSFSQLSWIASGNYAGYLAGSLLFSFGAFHQPS RLRPFLLVSALASGLLILAMAWLPPFILVLLIRFLAGVASAGMLIFGSTLIMQHTRHPFV LAALFSGVGIGIALGNEYVLAGLHFALSSQTLWQGAGALSGMMLIALTLLMPSKKHAIAP MPLAKTEQQIMSWWLLAILYGLAGFGYIIVATYLPLMAKDAGSPLLTAHLWTLVGLSIVP GCFSWLWAAKRWGALPCLTANLLVQAISVLLTLASDSPLLLIISSIGFGGTFMGTTSLVM TIARQLSVPGNLNLLGFVTLIYGIGQILGPALTSMLGNGTSALAGATLCGAAALFIAALI STVQLFKLQVVTS >gi|296493237|gb|ADTK01000264.1| GENE 13 11740 - 12600 515 286 aa, chain - ## HITS:1 COG:yjiK KEGG:ns NR:ns ## COG: yjiK COG3204 # Protein_GI_number: 16132154 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 286 38 323 323 553 98.0 1e-157 MTKSISLSKRISVIVILFAIVAVCTFFVQSCARKSNHAASFQNYHATIDGKEIAGITNNI SSLTWSAQSNTLFSTINKPAAIVEMTTNGDFIRTIPLDFVKDLETIEYIGDNQFVISDER DYAIYVISLTPNSEVKILKKIKIPLQDSPTNCGFEGLAYSRQDHTFWFFKEKNPIEVYKV NGLLSSNELHISKDEALQRQFTLDDVSGAEFNQQKNTLLVLSHESRALQEVTLVGEVIGE ISLTKGSRGLSHNIKQAEGIAMDASGNIYIVSEPNRFYRFTPKSSH >gi|296493237|gb|ADTK01000264.1| GENE 14 12665 - 12922 165 85 aa, chain - ## HITS:1 COG:no KEGG:EC55989_5000 NR:ns ## KEGG: EC55989_5000 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 85 1 85 85 176 100.0 2e-43 MKEFLFLFHSTVGVIQTRKALQAAGMTFRVSDIPRDLRGGCGLCIWLTCPPGEEIQWVIP GLTESIYCQQDGVWRCIAHYRVSPR >gi|296493237|gb|ADTK01000264.1| GENE 15 12919 - 13686 492 255 aa, chain - 
## HITS:1 COG:ECs5297 KEGG:ns NR:ns ## COG: ECs5297 COG1924 # Protein_GI_number: 15834551 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Escherichia coli O157:H7 # 1 255 1 255 255 486 98.0 1e-137 MTYSIGIDSGSTATKGILLADGVITRRFLVPTPFRPATAITEAWETLREGLETTPFLTLT GYGRQLVDFADKQVTEISCHGLGARFLAPATRAVIDIGGQDSKVIQLDDDGNLCDFLMND KCAAGTGRFLEVISRTLGTSVEQLDSITENVTPHAITSMCTVFAESEVISLRSAGVAPEA ILAGVINAMARRSANFIARLSCEAPILFTGGVSHCQTFARMLESHLGMPVNTHPDAQFAG AIGAAVIGQRVRTRR >gi|296493237|gb|ADTK01000264.1| GENE 16 13696 - 14847 1362 383 aa, chain - ## HITS:1 COG:yjiM KEGG:ns NR:ns ## COG: yjiM COG1775 # Protein_GI_number: 16132156 # Func_class: E Amino acid transport and metabolism # Function: Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB # Organism: Escherichia coli K12 # 1 383 8 390 390 783 98.0 0 MSLITDLPAIFDQFSEARQKGFLTVMDLKERGIPLVGTYCTFMPQEIPMAAGAVVVSLCS TSDETIEEAEKDLPRNLCPLIKSSYGFGKTDKCPYFYFSDLVVGETTCDGKKKMYEYMAE FKPVHVMQLPNSVKDDASRALWKAEMLRLQKTIEERFGHEISEDALRDAIALKNRERRAL ANFYHLGQLNPPALSGSDILKVVYGATFRFDKEALINELDAMTARVRQQWEEGQRLALRP RILITGCPIGGAAEKVVRAIEENGGWVVGYENCTGAKATEQCVAETGDVYDALADKYLAI GCSCVSPNDQRLQMLSQMVEEYQVDGVVDVILQACHTYAVESLAIKRHVRQQHNIPYIAI ETDYSTSDVGQLSTRVAAFIEML >gi|296493237|gb|ADTK01000264.1| GENE 17 14963 - 16210 1100 415 aa, chain - ## HITS:1 COG:yjiN KEGG:ns NR:ns ## COG: yjiN COG2733 # Protein_GI_number: 16132157 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 13 415 24 426 426 778 99.0 0 MLALSLLLIAAATFVVTLFLPPNFWVSGVKAIAEAAMVGALADWFAVVALFRRVPIPIIS RHTAIIPRNKDRIGENLGQFVQEKFLDTQSLVALIRRHEPALLIGNWFSQPENARRVGQH LLQIMSGFLELTDDARIQRLLKRAVHRAIDKVDLSGTSALMLESMTKNERHQVLLDTLIA QLIALLQRDKSRKFIAQQIVRWLESEHPLKAKILPTEWLGEHSAELVSDAVNSLLDDISR DRAHQIRHAFDRATFALIDKLKNDPEMAARADAVKSYLKEDEAFNRYLSELWGDLREWLK ADINSEDSRVKERIARAGQWFGETLIADDALRASLNGHLEQAAHRVAPEFSAFLTRHISD 
TVKSWDARDMSRQIELNIGKDLQFIRVNGTLVGGCIGLILYLLSQLPALFPLGNF >gi|296493237|gb|ADTK01000264.1| GENE 18 16284 - 17516 1123 410 aa, chain - ## HITS:1 COG:ECs5300 KEGG:ns NR:ns ## COG: ECs5300 COG0477 # Protein_GI_number: 15834554 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 410 1 410 410 710 99.0 0 MPRFFARHAATLFFPMALILYDFAAYLSTDLIQPGIINVVRDFNADVSLAPAAVSLYLAG GMALQWLLGPLSDRIGRRPVLITGALIFTLACAATMFTTSMTQFLIARAIQGTSICFIAT VGYVTVQEAFGQTKGIKLMAIITSIVLIAPIIGPLSGAALMHFVHWKVLFAIIAVMGFIS FVGLLLAMPETVKRGAVPFSAKSVLRDFRNVFCNRLFLFGATTISLSYIPMMSWVAVSPV ILIDAGGLTTSQFAWTQVPVFGAVIVANAIVARFVKDPTEPRFIWRAVPIQLVGLALLII GNLLSPHVWLWSVLGTSLYAFGIGLIFPTLFRFTLFSNNLPKGTVSASLNMVILMVMSVS VEIGRWLWFNGGRLPFHLLAVVAGIIVVFTLAGLLNRVRQHQAAELAEER >gi|296493237|gb|ADTK01000264.1| GENE 19 17983 - 18903 529 306 aa, chain + ## HITS:1 COG:ECs5301 KEGG:ns NR:ns ## COG: ECs5301 COG5464 # Protein_GI_number: 15834555 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 256 1 256 260 480 95.0 1e-135 MTNFTTSTPHDALFKSFLTHPDTARDFMEIHLPKDLRELCDLDSLKLESASFVDEKLRAL HSDILWSVKTREGDGYIYVVIEHQSREDIHMAFRLMRYSMAVMQRHIEHDKRRQLPLVIP MLFYHGSRSPYPWSLCWLDEFADPTTARKLYSAAFPLVDVTVVPDDEIVQHRRVALLELI QKHIRQRDLMELIDQLVVLLVTECANDSQITALLNYILLTGDEARFKKFISELTRRMPHH RERIMTIAERIYNDGCIKGEQRILRLFLQNGVDPEWIQKITGLSAEQMQALEQPLPDREH DSWLEY >gi|296493237|gb|ADTK01000264.1| GENE 20 19057 - 20469 1014 470 aa, chain - ## HITS:1 COG:yjiR KEGG:ns NR:ns ## COG: yjiR COG1167 # Protein_GI_number: 16132161 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Escherichia coli K12 # 1 470 1 470 470 940 
98.0 0 MTRYQHLATLLAERIEQGLYRHGEKLPSVRSLSQEHGVSISTVQQAYQTLETMKLITPQP RSGYFVAQRKAQPPVPPMTRPVQRPVEITQWDQVLDMLVAHSDSSIVPLSKSTPDVETPS LKPLWRELSRVVQHNLQTVLGYDLLAGQRVLREQIARLMLDSGSVVTADDIIITSGCHNS MSLALMAVCKPGDIVAVESPCYYGSMQMLRGMGVKVIEIPTDPETGISVEALELALEQWP IKGIILVPNCNNPLGFIMPDARKRAVLSLAQRHDIVIFEDDVYGELATEYPRPRTIHSWD IDGRVLLCSSFSKSIAPGLRVGWVAPGRYHDKLLHMKYAISSFNVPSTQMAAATFVLEGH YHRHIRRMRQIYQRNLALYTCWIREYFPCEIYITRPKGGFLLWIELPEQVDMVCVARQLY RMKIQVAAGSIFSASGKYRNCLRINCALPLSETNREALKQIGDAVYRAME >gi|296493237|gb|ADTK01000264.1| GENE 21 20646 - 20810 95 54 aa, chain + ## HITS:1 COG:ECs5303 KEGG:ns NR:ns ## COG: ECs5303 COG5457 # Protein_GI_number: 15834557 # Func_class: S Function unknown # Function: Uncharacterized conserved small protein # Organism: Escherichia coli O157:H7 # 1 54 1 54 54 66 100.0 1e-11 MEFHENKAKTPFIRLVQLWQAVRRWRRQMQARRVLQQMSDERLRDIGLRREDVE >gi|296493237|gb|ADTK01000264.1| GENE 22 21162 - 23702 1697 846 aa, chain + ## HITS:1 COG:all3924 KEGG:ns NR:ns ## COG: all3924 COG2366 # Protein_GI_number: 17231416 # Func_class: R General function prediction only # Function: Protein related to penicillin acylase # Organism: Nostoc sp. 
PCC 7120 # 31 661 59 672 847 203 28.0 1e-51 MKNRNRMIVNCVTASLMYYWSLPALAEQSSSEIKIVRDEYGMPHIYANDTWHLFYGYGYV VAQDRLFQMEMARRSTQGTVAEVLGKDFVKFDKDIRRNYWPDAIRAQIDALSPEDMSILQ GYADGMNAWIDKVNTNPETLLPKQFNTFGFTPKRWEPFDVAMIFVGTMANRFSDSTSEID NLALLTALKDKYGVSQGMAVFNQLKWLVNPSAPTTIAVQESSYPLKFNQQNSQTAALLPR YDLPAPMLDRPAKGADGALLALTAGKNRETIAAQFAQGGANGLAGYPTTSNMWVIGKSKA QDAKAIMVNGPQFGWYAPAYTYGIGLHGAGYDVTGNTPFAYPGLVFGHNGVISWGSTAGF GDDVDIFAERLSAEKPGYYLHNGKWVKMLSREETITVKNGQAETFTVWRTVHGNILQTDQ TTQTAYAKSRAWDGKEVASLLAWTHQMKAKNWQEWTQQAAKQALTINWYYADVNGNIGYV HTGAYPDRQSGHDPRLPVPGTGKWDWKGLLPFEMNPKVYNPLSGYIANWNNSPQKDYPAS DLFAFLWGGADRVTEIDRLLEQKPRLTADQAWDVIRQTSRQDLNLRLFLPTLQAATSGLT QSDPRRQLVDTLTRWDGINLLNDDGKTWQQPGSAILNVWLTSMLKRTVVAAVPMPFDKWY SASGYETTQDGPTGSLNISVGAKILYEAVQGDKSPIPQAVDLFAGKPQQEVVLAALEDTW ETLSKRYGNNVSNWKTPAMALTFRANNFFGVPQAAAEETRHQAEYQNRGTENDMIVFSPT TSDRPVLAWDVVAPGQSGFIAPDGTVDKHYEDQLKMYENFGRKSLWLTKQDVEAHKESQE VLHVQR >gi|296493237|gb|ADTK01000264.1| GENE 23 23780 - 24736 1007 318 aa, chain - ## HITS:1 COG:STM4530 KEGG:ns NR:ns ## COG: STM4530 COG0523 # Protein_GI_number: 16767774 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Salmonella typhimurium LT2 # 1 318 1 318 318 586 91.0 1e-167 MNPIAVTLLTGFLGAGKTTLLRHILNEQHGYKIAVIENEFGEVSVDDQLIGDRATQIKTL TNGCICCSRSNELEDALLDLLDNLDKGNIQFDRLVIECTGMADPGPIIQTFFSHEILCQR YLLDGVIALVDAVHADEQMNQFTIAQSQVGYADRILLTKTDVAGEAEKLRERLARINARA PVYTVTHGDIDLGLLFNTNGFMLEENVVSTKPRFHFIADKQNDISSIVVELDYPVDISEV SRVMENLLLESADKLLRYKGMLWIDGEPNRLLFQGVQRLYSADWDRPWGDEKPHSTMVFI GIQLPEDEIRAAFAGLKK >gi|296493237|gb|ADTK01000264.1| GENE 24 24747 - 24950 274 67 aa, chain - ## HITS:1 COG:ECs5312 KEGG:ns NR:ns ## COG: ECs5312 COG2879 # Protein_GI_number: 15834566 # Func_class: S Function unknown # Function: Uncharacterized small protein # Organism: Escherichia coli O157:H7 # 1 67 1 67 67 127 100.0 4e-30 MFGNLGQAKKYLGQAAKMLIGIPDYDNYVEHMKTNHPDKPYMSYEEFFRERQNARYGGDG KGGMRCC >gi|296493237|gb|ADTK01000264.1| GENE 25 25000 - 27150 2547 716 
aa, chain - ## HITS:1 COG:ECs5313 KEGG:ns NR:ns ## COG: ECs5313 COG1966 # Protein_GI_number: 15834567 # Func_class: T Signal transduction mechanisms # Function: Carbon starvation protein, predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 716 6 721 721 1341 100.0 0 MDTKKLFKHIPWVILGIIGAFCLAVVALRRGEHVSALWIVVASVSVYLVAYRYYSLYIAQ KVMKLDPTRATPAVINNDGLNYVPTNRYVLFGHHFAAIAGAGPLVGPVLAAQMGYLPGTL WLLAGVVLAGAVQDFMVLFISSRRNGASLGEMIKEEMGPVPGTIALFGCFLIMIIILAVL ALIVVKALAESPWGVFTVCSTVPIALFMGIYMRFIRPGRVGEVSVIGIVLLVASIYFGGV IAHDPYWGPALTFKDTTITFALIGYAFVSALLPVWLILAPRDYLATFLKIGVIVGLALGI VVLNPELKMPAMTQYIDGTGPLWKGALFPFLFITIACGAVSGFHALISSGTTPKLLANET DARFIGYGAMLMESFVAIMALVAASIIEPGLYFAMNTPPAGLGITMPNLHEMGGENAPII MAQLKDVTAHAAATVSSWGFVISPEQILQTAKDIGEPSVLNRAGGAPTLAVGIAHVFHKV LPMADMGFWYHFGILFEALFILTALDAGTRSGRFMLQDLLGNFIPFLKKTDSLVAGIIGT AGCVGLWGYLLYQGVVDPLGGVKSLWPLFGISNQMLAAVALVLGTVVLIKMKRTQYIWVT VVPAVWLLICTTWALGLKLFSTNPQMEGFFYMASQYKEKIANGTDLTAQQIANMNHIVVN NYTNAGLSILFLIVVYSIIFYGFKTWLAVRNSDKRTDKETPYVPIPEGGVKISSHH >gi|296493237|gb|ADTK01000264.1| GENE 26 27543 - 28055 631 170 aa, chain - ## HITS:1 COG:STM1098 KEGG:ns NR:ns ## COG: STM1098 COG1853 # Protein_GI_number: 16764456 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Salmonella typhimurium LT2 # 1 170 1 170 170 300 84.0 1e-81 MQLDEQRLRFRDAMASLSAAVNIITTEGDAGQCGITATAVCSVTDTPPSLMVCINANSAM NPVFQGNGKLCVNVLNHEQELMARHFAGMTGMAMEERFSLSCWQKGPLAQPVLKGSLASL EGEIRDVQAIGTHLVYLVEIKNIILSAEGHGLIYFKRRFHPVMLEMEAAI >gi|296493237|gb|ADTK01000264.1| GENE 27 28073 - 29635 1680 520 aa, chain - ## HITS:1 COG:STM1099 KEGG:ns NR:ns ## COG: STM1099 COG2368 # Protein_GI_number: 16764457 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Aromatic ring hydroxylase # Organism: Salmonella typhimurium LT2 # 1 520 1 520 520 1061 95.0 0 MKPEDFRASTQRPFTGEEYLKSLQDGREIYIYGERVKDVTTHPAFRNAAASVAQLYDALH 
KPEMQDSLCWNTDTGSGGYTHKFFRVAKSADDLRQQRDAIAEWSRLSYGWMGRTPDYKAA FGCALGANPGFYGQFEQNARNWYTRIQETGLYFNHAIVNPPIDRHLPTDKVKDVYIKLEK ETDAGIIVSGAKVVATNSALTHYNMIGFGSAQVMGENPDFALMFVAPMDADGVKLISRAS YEMVAGATGSPYDYPLSSRFDENDAILVMDNVLIPWENVLIYRDFDRCRRWTMEGGFARM YPLQACVRLAVKLDFITALLKKSLECTGTLEFRGVQADLGEVVAWRNTFWALSDSMCSEA TPWVNGAYLPDHAALQTYRVLAPMAYAKIKNIIERNVTSGLIYLPSSARDLNNPQIDQYL AKYVRGSNGMDHVQRIKILKLMWDAIGSEFGGRHELYEINYSGSQDEIRLQCLRQAQSSG NMDKMMAMVDRCLSEYDQNGWTVPHLHNNDDINMLDKLLK >gi|296493237|gb|ADTK01000264.1| GENE 28 29886 - 30776 752 296 aa, chain - ## HITS:1 COG:STM1108 KEGG:ns NR:ns ## COG: STM1108 COG2207 # Protein_GI_number: 16764466 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Salmonella typhimurium LT2 # 1 294 1 294 298 538 87.0 1e-153 MCDRQIANIDISKEYDESLGTDDVHYQSFARMAAFFGRHMLPHRHEQYFQMHFLNSGQIE LQLDDHRYSVEAPLFVLTPPSVPHAFITESDADGHVLTVREDLIWPLLEVLYPGTRETFG LPGICLSLADKPDELAALEHYWQLIERESVEQLPGREHTLTLLAQAVFTLLLRNAKLDDH AASGMRGELKLFQRFNMLIESHFHQHWTVPDYANELHITESRLTDICRRFANRPPKRLIF DRQLREAKRLLLFSDNAVNNIAWQLGFKDPAYFARFFNRLVGCSPSAYRAKKVPVT >gi|296493237|gb|ADTK01000264.1| GENE 29 30786 - 32162 1343 458 aa, chain - ## HITS:1 COG:STM1107 KEGG:ns NR:ns ## COG: STM1107 COG0477 # Protein_GI_number: 16764465 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium LT2 # 1 458 1 458 458 771 87.0 0 MSDTSPAIPESIDPANQHKALTAGQQAVIKKLFRRLIVFLFVLFIFSFLDRINIGFAGLT MGRDLGLSATMFGLATTLFYAAYVIFGIPSNIMLSIVGARRWIATIMVLWGIASTATMFA TGPTSLYVLRILVGITEAGFLPGILLYLTFWFPAYFRARANALFMVAMPVTTALGSIVSG YILSLDGVMALKGWQWLFLLEGFPSVLLGVMVWFWLDDSPDKAKWLTKEDKKCLQEMMDN DRLTLVQPEGAISHHAMQQRSMWREIFTPVVMMYTLAYFCLTNTLSAISIWTPQILQSFN QGSSNITIGLLAAVPQICTILGMVYWSRHSDRRQERRHHTALPYLFAAAGWLLASATDHN MIQMLGIIMASTGSFSAMAIFWTTPDQSISLRARAIGIAVINATGNIGSALSPFMIGWLK 
DLTGSFNSGLWFVAALLVIGAGIIWAIPMQSSRPRATP >gi|296493237|gb|ADTK01000264.1| GENE 30 32337 - 33125 1048 262 aa, chain - ## HITS:1 COG:STM1106 KEGG:ns NR:ns ## COG: STM1106 COG3836 # Protein_GI_number: 16764464 # Func_class: G Carbohydrate transport and metabolism # Function: 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase # Organism: Salmonella typhimurium LT2 # 1 253 1 253 263 430 89.0 1e-120 MENSFKAALKAGRPQIGLWLGLSSSYSAELLAGAGFDWLLIDGEHAPNNVQTVLTQLQAI APYPSQPVVRPSWNDPVQIKQLLDVGTQTLLVPMVQNADEAREAVRATRYPPAGIRGVGS ALARASRWNRIPDYLQKANDQMCVLVQIETREAMKNLLQILDVEGVDGVFIGPADLSADM GYAGNPQHPEVQAAIEQAIVQIRESGKAPGILIANEQLAKRYLELGALFVAVGVDTTLLA RAAEALAARFGAQATAVKPGVY >gi|296493237|gb|ADTK01000264.1| GENE 31 33136 - 33939 999 267 aa, chain - ## HITS:1 COG:STM1105 KEGG:ns NR:ns ## COG: STM1105 COG3971 # Protein_GI_number: 16764463 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: 2-keto-4-pentenoate hydratase # Organism: Salmonella typhimurium LT2 # 1 267 1 267 267 533 95.0 1e-151 MFDKHTHTLIAQRLDQAEKQREQIRAISLDYPEITIEDAYAVQREWVRLKIAEGRTLKGH KIGLTSKAMQASSQISEPDYGALLDDMFFHDGSDIPTDRFIVPRIEVELAFVLAKPLRGP NCTLFDVYNATDYVIPALELIDARCHNIDPETQRPRKVFDTISDNAANAGVILGGRPIKP DELDLRWISALMYRNGVIEETGVAAGVLNHPANGVAWLANKLAPYDVQLEAGQIILGGSF TRPVPARKGDTFHVDYGNMGSISCRFV >gi|296493237|gb|ADTK01000264.1| GENE 32 34007 - 34387 436 126 aa, chain - ## HITS:1 COG:STM1104 KEGG:ns NR:ns ## COG: STM1104 COG3232 # Protein_GI_number: 16764462 # Func_class: E Amino acid transport and metabolism # Function: 5-carboxymethyl-2-hydroxymuconate isomerase # Organism: Salmonella typhimurium LT2 # 1 126 1 126 126 229 86.0 1e-60 MPQFIVECTENIREEARLPELFASVNTALAATGIFPLGGIRSRAHWIDTWQMADGKHDYA FVHMTLKIGSGRSLESRQEVGEMLFDLIKTHFASLMESRYLALSFEIAELHPTLNFKQNN VHALFK >gi|296493237|gb|ADTK01000264.1| GENE 33 34397 - 35248 843 283 aa, chain - ## HITS:1 COG:STM1103 KEGG:ns NR:ns ## COG: STM1103 COG3384 # Protein_GI_number: 16764461 # Func_class: S Function unknown # Function: Uncharacterized conserved 
protein # Organism: Salmonella typhimurium LT2 # 1 283 1 283 283 571 93.0 1e-163 MGKLALAAKITHVPSMYLSELPGKNHGCRQGAIDGHKEISKRCREMGVDTIIVFDTHWLV NSAYHINCADHFEGVYTSNELPHFIRDMTYNYEGNPELGQLIADEALKLGVRAKAHNIPS LKLEYGTLVPMRYMNEDKHFKVVSISAFCTVHDFADSRKLGEAILKAIEQYDGTVAVLAS GSLSHRFIDDQRAEEGMNSYTREFDCQMDERVVKLWREGQFKEFCNMLPEYADYCYGEGN MHDTVMLLGMLGWDKYDGKVEFITELFPSSGTGQVNAVFPLPA >gi|296493237|gb|ADTK01000264.1| GENE 34 35250 - 36716 1488 488 aa, chain - ## HITS:1 COG:STM1102 KEGG:ns NR:ns ## COG: STM1102 COG1012 # Protein_GI_number: 16764460 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Salmonella typhimurium LT2 # 1 488 1 488 488 988 97.0 0 MKKVNHWINGKNFAGNDYFQTTNPATGEVLADVASGGEAEINQAVAAAKEAFPRWANLPM KERARLMRRLGDLIDQNVPEIAAMETADTGLPIHQTKNVLIPRASHNFEFFAEVCQQMNG KTYPVDDKMLNYTLVQPVGVCALVSPWNVPFMTATWKVAPCLALGNTAVLKMSELSPLTA DRLGELALEAGIPAGVLNVVQGYGATAGDALVRHHDVRAVSFTGGTATGRNIMKNAGLKK YSMELGGKSPVLIFEDADIERALDAALFTIFSINGERCTAGSRIFIQQSIYPEFVKRFAE RANRLRVGDPNDPNTQVGALISQQHWEKVSGYIRLGIEEGATLLAGGPDKPSDLPAHLKG GNFLRPTVLADVDNRMRVAQEEIFGPVACLLPFKDEAEGLRLANDVEYGLASYIWTQDVS KVLRLARGIEAGMVFVNTQNVRDLRQPFGGVKASGTGREGGEYSFEVFAEMKNVCISMGD HPIPKWGV >gi|296493237|gb|ADTK01000264.1| GENE 35 36713 - 38002 1271 429 aa, chain - ## HITS:1 COG:STM1101 KEGG:ns NR:ns ## COG: STM1101 COG0179 # Protein_GI_number: 16764459 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) # Organism: Salmonella typhimurium LT2 # 1 429 1 429 429 667 82.0 0 MKGTIFAVALNHRSQLDAWQEAFQQSPYKAPPKTAVWFIKPRNTVIGCGEPIPFPQGEKV LSGATVALIVGKTATKVGEEDAAEYIAGYALANDVSLPEESFYRPAIKAKCRDGFCPIGE TVALSNVDNLTIYPEINGRPADHWNTADLQRNAAQLLSALSEFATLNPGDAILLGTPQAR VEIQPGDRVRVLAEGFPPLENPVVDEREVTTRKSFPTQPHPHGTLFALGLNYADHASELE FKPPEEPLVFLKAPNTLTGDNQTSVRPNNIEYMHYEAELVVVIGKQARSVSEADAMDYVA GYTVCNDYAIRDYLENYYRPNLRVKSRDGLTPMLSTIVPKEAIPDPHNLTLRTFVNGELL 
QQGTTADLIFSVPFLIAYLSEFMTLNPGDMIATGTPKGLSDVVPGDEVVVEVEGVGRLVN RIVSEETAK >gi|296493237|gb|ADTK01000264.1| GENE 36 38274 - 38720 388 148 aa, chain + ## HITS:1 COG:STM1100 KEGG:ns NR:ns ## COG: STM1100 COG1846 # Protein_GI_number: 16764458 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Salmonella typhimurium LT2 # 1 124 1 124 146 199 83.0 1e-51 MHDSLTIALLQAREAAMSYFRPIVKRHNLTEQQWRIVRILAESPSMDFHDLAYRACILRP SLTGILTRMERDGLVLRLKPINDQRKLYISLTKEGQALYNRAQTQIEEAYRQIEAQFTAE KMQQLTHLLEEFIALGNSRQEDIPGDNE >gi|296493237|gb|ADTK01000264.1| GENE 37 38839 - 40503 1606 554 aa, chain + ## HITS:1 COG:Ztsr KEGG:ns NR:ns ## COG: Ztsr COG0840 # Protein_GI_number: 15804929 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Escherichia coli O157:H7 EDL933 # 1 554 1 554 554 899 99.0 0 MLKRIKIVTSLLLVLAVFGLLQLTSGGLFFNALKNDKENFTVLQTIRQQQSTLNGSWVAL LQTRNTLNRAGIRYMMDQNNIGSGSTVAELMQSASISLKQAEKNWAYYEALPRDPRQSTA AAAEIKRNYDIYHNALAELIQLLGAGKINEFFDQPTQGYQDGFEKQYVAYMEQNDRLYDI AVSDNNASYSQAMWILVGVMIVVLAVIFAVWFGIKASLVAPMNRLIDSIRHIAGGDLVKP IEVDGSNEMGQLAESLRHMQGELMRTVGDVRNGANAIYSGASEIATGNNDLSSRTEQQAA SLEETAASMEQLTATVKQNAENARQASHLALSASETAQRGGKVVDNVVQTMRDISTSSQK IADIISVIDGIAFQTNILALNAAVEAARAGEQGRGFAVVAGEVRNLAQRSAQAAREIKSL IEDSVGKVDVGSTLVESAGETMAEIVSAVTRVTDIMGEIASASDEQSRGIDQVGLAVAEM DRVTQQNAALVEESAAAAAALEEQASRLTEAVAVFRIQQQQQQQRETSAVVKTVTPATPR KMAVADSGENWETF >gi|296493237|gb|ADTK01000264.1| GENE 38 40552 - 41022 439 156 aa, chain - ## HITS:1 COG:yjiZ KEGG:ns NR:ns ## COG: yjiZ COG0477 # Protein_GI_number: 16132177 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 156 298 453 453 293 98.0 1e-79 MAAIPFLFGAAGMLVNGYVTDWLVKGGMAPIKSRKICIIAGMFCSAAFTLVVPQATTSMT 
AVLLIGMALFCIHFAGTSCWGLIHVAVASRMTASVGSIQNFASFICASFAPIITGFIVDT THSFRLALIICGYVTAAGALAYIFLVRQPINDPRKD >gi|296493237|gb|ADTK01000264.1| GENE 39 41037 - 41921 886 294 aa, chain - ## HITS:1 COG:yjiZ KEGG:ns NR:ns ## COG: yjiZ COG0477 # Protein_GI_number: 16132177 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 286 1 286 453 562 99.0 1e-160 MEKENITLDPRSSFTPSSSADIPVPPDGLVQRSTRIKRIQTTAMLLLFFAAVINYLDRSS LSVANLTIREELGLSATEIGALLSVFSLAYGIAQLPCGPLLDRKGPRLMLGLGMFFWSLF QAMSGMVHNFTQFVLVRIGMGIGEAPMNPCGVKVINDWFNIKERGRPMGFFNAASTIGVA VSPPILAAMMLVMGWRGMFITIGVLGIFLAIGWYMLYRNREHVELTAVEQAYLNAGSVNA RRDPLSFAEWRSLFRNRTMWGMMLGFSGINYTAWLYLAWLPGYLVTCKQPITWI >gi|296493237|gb|ADTK01000264.1| GENE 40 42136 - 43050 492 304 aa, chain - ## HITS:1 COG:ECs5317 KEGG:ns NR:ns ## COG: ECs5317 COG1802 # Protein_GI_number: 15834571 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 37 304 1 268 268 516 100.0 1e-146 MSRSQNLRHNVINQVIDDMARGHIPSPLPSQSALAEMYNISRTTVRHILSHLRECGVLTQ VGNDYVIARKPDHDDGFACTTASMSEQNKVFEQAFFTMINQRQLRPGETFSELQLARAAG VSPVVVREYLLKFGRYNLIQSEKRGQWSMKQFDQSYAEQLFELREMLETHSLQHFLNLPD HDPRWLQAKTMLERHRLLRDNIGNSFRMFSQLDRDFHSLLLSAADNIFFDQSLEIISVIF HFHYQWDESDLKQRNIIAVDEHMTILSALICRSDLDATLALRNHLNSAKQSMIRSINENT RYAH >gi|296493237|gb|ADTK01000264.1| GENE 41 43189 - 44211 1090 340 aa, chain + ## HITS:1 COG:yjjN KEGG:ns NR:ns ## COG: yjjN COG1063 # Protein_GI_number: 16132179 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 340 6 345 345 671 99.0 0 MSTMNVLICQQPKELVWKQREIPIPGDNEALIKIKSVGICGTDIHAWGGNQPFFSYPRVL GHEICGEIVGLGKNIADLKNGQQVAVIPYVACQQCPACKSGRTNCCEKISVIGVHQDGGF 
SEYLSVPVANILPADGIDPQAAALIEPFAINAHAVRRAAIAPGEQVLVVGAGPIGLGAAA IAKADGAQVVVADTSPARREHVATRLELPVLDPSAEDFDAQLRAQFGGSLAQKVIDATGN QHAMNNTVNLIRHGGTVVFVGLFKGELQFSDPEFHKKETTMMGSRNATPEDFAKVGRLMA EGKITADMMLTHRYPFATLAETYERDVINNRELIKGVITF >gi|296493237|gb|ADTK01000264.1| GENE 42 44350 - 46641 2084 763 aa, chain - ## HITS:1 COG:mdoB KEGG:ns NR:ns ## COG: mdoB COG1368 # Protein_GI_number: 16132180 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Escherichia coli K12 # 14 763 1 750 750 1515 99.0 0 MSELLSFALFLASVLIYAWKAGRNTWWFAATLTVLGLFVVLNITLFASDYFTGDGINDAV LYTLTNSLTGAGVSKYILPGIGIVLGLTAVFGALGWILRRRRHHPHHFGYSLLALLLALG SVDASPAFRQITELVKSQSRDGDPDFAAYYKEPSKTIPDPKLNLVYIYGESLERTYFDNE AFPDLTPELGALKNEGLDFSHTQQLPGTDYTIAGMVASQCGIPLFAPFEGNASASVSSFF PQNICLGDILKNSGYQNYFVQGANLRFAGKDVFLKSHGFDHLYGSEELKSVVADPHYRND WGFYDDTVLDEAWKKFEELSRSGQRFSLFTLTVDTHHPDGFISRTCNRKKYDFDGKPNQS FSAVSCSQENIAAFINKIKASPWFKDTVIVVSSDHLAMNNTAWKYLNKQDRNNLFFVIRG DKPQQETLAVKRNTMDNGATVLDILGGDNYLGLGRSSLSGQSMSEIFLNIKEKTLAWKPD IIRLWKFPKEMKEFTIDQQKNMIAFSGSHFRLPLLLRVSDKRVEPLPESEYSAPLRFQLA DFAPRDNFVWVDHCYKMAQLWAPELALSTDWCVSQGQLGGQQIVQHVDKTMWKGKTAFKD TVIDMARYKSNVDTLKIVDNDIRYKADSFIFNVAGAPEEVKQFSGISRPESWGRWSNAQL GDEVKIEYKHPLPKKFDLVITAKAYGNNASRPIPVRVGNEEQTLVLGNEVTTTTLHFDNP TDADTLVIVPPEPVSTNEGNILGHSPRKLGIGMVEIKVVEREG >gi|296493237|gb|ADTK01000264.1| GENE 43 46895 - 47386 499 163 aa, chain - ## HITS:1 COG:no KEGG:SSON_4506 NR:ns ## KEGG: SSON_4506 # Name: yjjA # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 163 3 165 165 261 99.0 7e-69 MKTVKHLLCCAIAASALISTGVHAASWKDALSSAASELGNQNSTTQEGGWSLASLTNLLS SGNQALSADNMNNAAGILQYCAKQKLASATDAENIKNQVLEKLGLNSEEQKEDTNYLDGI QGLLKTKDGQQLNLDNIGTTPLAEKVKTKACDLVLKQGLNFIS >gi|296493237|gb|ADTK01000264.1| GENE 44 47438 - 48175 813 245 aa, chain - ## HITS:1 COG:ECs5321 KEGG:ns NR:ns ## COG: ECs5321 COG1484 # Protein_GI_number: 15834575 # Func_class: L Replication, recombination 
and repair # Function: DNA replication protein # Organism: Escherichia coli O157:H7 # 1 245 1 245 245 473 100.0 1e-133 MKNVGDLMQRLQKMMPAHIKPAFKTGEELLAWQKEQGAIRSAALERENRAMKMQRTFNRS GIRPLHQNCSFENYRVECEGQMNALSKARQYVEEFDGNIASFIFSGKPGTGKNHLAAAIC NELLLRGKSVLIITVADIMSAMKDTFRNSGTSEEQLLNDLSNVDLLVIDEIGVQTESKYE KVIINQIVDRRSSSKRPTGMLTNSNMEEMTKLLGERVMDRMRLGNSLWVIFNWDSYRSRV TGKEY >gi|296493237|gb|ADTK01000264.1| GENE 45 48178 - 48717 529 179 aa, chain - ## HITS:1 COG:ECs1768 KEGG:ns NR:ns ## COG: ECs1768 COG5529 # Protein_GI_number: 15831022 # Func_class: R General function prediction only # Function: Pyocin large subunit # Organism: Escherichia coli O157:H7 # 78 153 184 266 346 75 50.0 4e-14 MSSRVLTPDVVGIDALVHDHQTVLAKAEGGVVAVFANNAPAFYAVTPARLAELLALEEKL ARPGSDVALDDQLYQEPQAAPVAVPMGKFAMYPDWQPDADFIRLAALWGVALREPVTTEE LTSFIAYWQAEGKVFHHVQWQQKLARSLQIGRASNGGLPKRDVNTVSEPDSQIPPGFRG >gi|296493237|gb|ADTK01000264.1| GENE 46 48825 - 49298 429 157 aa, chain - ## HITS:1 COG:STM4545 KEGG:ns NR:ns ## COG: STM4545 COG3610 # Protein_GI_number: 16767789 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Salmonella typhimurium LT2 # 1 157 1 157 157 241 87.0 4e-64 MGVIEFLFALAQDMILAAIPAVGFAMVFNVPVRALRWCALLGAIGHGSRMILMTSGLNIE WSTFMASMLVGTIGIQWSRWYLAHPKVFTVAAVIPMFPGISAYTAMISAVKISQLGYSES LMITLLTNFLTASSIVGALSIGLSIPGLWLYRKRPRV >gi|296493237|gb|ADTK01000264.1| GENE 47 49289 - 50059 359 256 aa, chain - ## HITS:1 COG:yjjP KEGG:ns NR:ns ## COG: yjjP COG2966 # Protein_GI_number: 16132185 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 256 22 277 277 499 98.0 1e-141 MQTEQQRAVTRLCIQCGLFLLQHGAESALVDELSSRLGRALGMDSVESSISSNAIVLTTI KDGQCLTSTRKNQDRGINMHVVTEVQHIVILAEHHLLDYKGVEKRFSQIQPLRYPRWLVA LMVGLSCACFCKLNKGGWDGAVITFFASTAAMYIRQLLAQRHLHPQINFCLTAFAATTIS GLLLQLPTFSNTPTIAMAASVLLLVPGFPLINAVADMFKGHINTGLARWAIASLLTLATC VGVVMALTIWGLRGWV >gi|296493237|gb|ADTK01000264.1| GENE 48 50729 - 51403 310 224 aa, chain + ## HITS:1 
COG:ECs5325 KEGG:ns NR:ns ## COG: ECs5325 COG2197 # Protein_GI_number: 15834579 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 224 18 241 241 423 99.0 1e-118 MQAGLKEVMRTHFPEYEIISSASAEDLTLLQLRRSGLVIADLAGESEDPRSVCEHYYSLI SQYREIHWVFMVSRSWYSQAVELLMCPTATLLSDVEPIENLVKTVRSGNTHAERISAMLT SPAMNETHDFSYRSVILTLSERKVLRLLGKGWGINQIASLLKKSNKTISAQKNSAMRRLA IHSNAEMYAWINSAQGARELNLPSVYGDAAEWNTAELRREMSHS >gi|296493237|gb|ADTK01000264.1| GENE 49 51415 - 52038 332 207 aa, chain + ## HITS:1 COG:ECs5326 KEGG:ns NR:ns ## COG: ECs5326 COG2197 # Protein_GI_number: 15834580 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 207 19 225 225 395 99.0 1e-110 MSSIGIESLFRKFAGNPYKLHTYTSQESFQDAMSRISFAAVIFSFSAMRSERREGLSCLT ELAIKFPRTRRLVIADDDIEARLIGSLSPSPLDGVLSKASTLEIFHQELFLSLNGVRQAT DRLNNQWYINQSRTLSPTEREILRFMSRGYSMTQIAEQLKRNIKTIRAHKFNVMSKLGVS SDAGLLEAADILLCMRHCEASNVLHPY >gi|296493237|gb|ADTK01000264.1| GENE 50 52076 - 52864 633 262 aa, chain - ## HITS:1 COG:ECs5327 KEGG:ns NR:ns ## COG: ECs5327 COG4114 # Protein_GI_number: 15834581 # Func_class: R General function prediction only # Function: Uncharacterized Fe-S protein # Organism: Escherichia coli O157:H7 # 1 262 1 262 262 520 98.0 1e-148 MAYRSAPLYEDIIWRTHLQPQDAGLAQAVRATIAKHREHLLEFIRLDEPAPLNAMTLAQW SSPNALSSLLAVYSDHIYRNQPTMIRENKPLISLWAQWYIGLMVPPLMLALLTQEKALDV SPEHFHAEFHETGRVACFWVDVCEDKNATPHSPQQRMETLISQALVPVVQALEATGEING KLIWSNTGYLINWYLTEMKQLLGEATVESLRHALFFEKTLTNGEDNPLWRTVVLRDGLLV RRTCCQRYRLPDVQQCGDCTLK >gi|296493237|gb|ADTK01000264.1| GENE 51 53005 - 53241 230 78 aa, chain + ## HITS:1 COG:no KEGG:UTI89_C5074 NR:ns ## KEGG: UTI89_C5074 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 78 38 115 115 
119 98.0 2e-26 MLQRMLGSGWGVLLPGLLIAGLMYADLSPDQWRIVILMGLVLTPVMLYHKQLRHYVLLPS CLALIAGIMLMIMNLNQG Prediction of potential genes in microbial genomes Time: Mon May 16 15:51:03 2011 Seq name: gi|296493236|gb|ADTK01000265.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont846.3, whole genome shotgun sequence Length of sequence - 14419 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 9, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 59 - 145 69.1 # Leu CAG 0 0 1 1 Tu 1 . - CDS 413 - 1444 264 ## PROTEIN SUPPORTED gi|225082609|ref|YP_002654106.1| ribosomal protein L11 methyltransferase, putative - Prom 1467 - 1526 6.0 + Prom 1465 - 1524 3.0 2 2 Op 1 8/0.000 + CDS 1547 - 1960 399 ## COG3050 DNA polymerase III, psi subunit 3 2 Op 2 4/0.500 + CDS 1929 - 2375 749 ## PROTEIN SUPPORTED gi|15804944|ref|NP_290986.1| ribosomal-protein-alanine N-acetyltransferase 4 2 Op 3 3/1.000 + CDS 2390 - 3067 684 ## COG1011 Predicted hydrolase (HAD superfamily) 5 2 Op 4 5/0.500 + CDS 3158 - 4747 1839 ## COG4108 Peptide chain release factor RF-3 + Term 4758 - 4797 5.8 6 3 Tu 1 . + CDS 5140 - 5745 705 ## COG2823 Predicted periplasmic or secreted lipoprotein 7 4 Tu 1 . + CDS 5872 - 6033 185 ## gi|157368899|ref|YP_001476888.1| hypothetical protein Spro_0654 + Term 6060 - 6102 7.2 + Prom 6068 - 6127 2.3 8 5 Op 1 4/0.500 + CDS 6155 - 7228 842 ## COG4667 Predicted esterase of the alpha-beta hydrolase superfamily 9 5 Op 2 . + CDS 7225 - 8007 752 ## COG0084 Mg-dependent DNase + Term 8030 - 8081 3.0 10 6 Tu 1 . - CDS 8221 - 8955 584 ## COG1180 Pyruvate-formate lyase-activating enzyme - Term 8995 - 9030 4.0 11 7 Tu 1 . - CDS 9056 - 10606 1424 ## JW4343 conserved hypothetical protein - Prom 10710 - 10769 4.7 + Prom 10710 - 10769 4.5 12 8 Tu 1 7/0.500 + CDS 10864 - 11643 1033 ## COG0274 Deoxyribose-phosphate aldolase 13 9 Op 1 4/0.500 + CDS 11770 - 13092 1736 ## COG0213 Thymidine phosphorylase 14 9 Op 2 . 
+ CDS 13144 - 14367 1508 ## COG1015 Phosphopentomutase Predicted protein(s) >gi|296493236|gb|ADTK01000265.1| GENE 1 413 - 1444 264 343 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225082609|ref|YP_002654106.1| ribosomal protein L11 methyltransferase, putative [marine gamma proteobacterium HTCC2148] # 78 334 106 368 371 106 30 1e-22 MSAFTPASEVLLRHSDDFEQSRILFAGDLQDDLPARLDTAASRAHTQQFHHWQVLSRQMG DNARFSLVATVDDVADCDTLIYYWPKNKPEAQFQLMNLLSLLPVGTDIFVVGENRSGVRS AEQMLADYAPLNKVDSARRCGLYFGRLEKQPVFDADKFWGEYSVDGLTVKTLPGVFSRDG LDVGSQLLLSTLTPHTKGKVLDVGCGAGVLSVAFARHSPKIRLTLCDVSAPAVEASRATL AANGVEGEVFASNVFSEVKGRFDMIISNPPFHDGMQTSLDAAQTLIRGAVRHLNSGGELR IVANAFLPYPDVLDETFGFHEVIAQTGRFKVYRAIMTRQAKKG >gi|296493236|gb|ADTK01000265.1| GENE 2 1547 - 1960 399 137 aa, chain + ## HITS:1 COG:ECs5330 KEGG:ns NR:ns ## COG: ECs5330 COG3050 # Protein_GI_number: 15834584 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, psi subunit # Organism: Escherichia coli O157:H7 # 1 137 1 137 137 253 100.0 6e-68 MTSRRDWQLQQLGITQWSLRRPGALQGEIAIAIPAHVRLVMVANDLPALTDPLVSDVLRA LTVSPDQVLQLTPEKIAMLPQGSRCNSWRLGTDEPLSLEGAQVASPALTELRANPTARAA LWQQICTYEHDFFPRND >gi|296493236|gb|ADTK01000265.1| GENE 3 1929 - 2375 749 148 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15804944|ref|NP_290986.1| ribosomal-protein-alanine N-acetyltransferase [Escherichia coli O157:H7 EDL933] # 1 148 1 148 148 293 99 6e-79 MNTISSLETTDLPAAYHIEQRAHAFPWSEKTFASNQGERYLNFQLTQNGKMAAFAITQVV LDEATLFNIAVDPDYQRQGLGRVLLEHLIDELEKRGVATLWLEVRASNAAAIALYESLGF NEATIRRNYYPTTDGREDAIIMALPISM >gi|296493236|gb|ADTK01000265.1| GENE 4 2390 - 3067 684 225 aa, chain + ## HITS:1 COG:ECs5332 KEGG:ns NR:ns ## COG: ECs5332 COG1011 # Protein_GI_number: 15834586 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Escherichia coli O157:H7 # 1 225 1 225 225 462 98.0 1e-130 MKWDWIFFDADETLFTFDSFTGLQRMFLDYSVTFTAEDFQDYQAVNKPLWVDYQNGAITS LQLQHGRFESWAERLKVEAGLLNDAFINAMAEICTPLPGAVSLLNAIRGNAKIGIITNGF 
SALQQVRLERTGLRDYFDLLVISEEVGVAKPNKKIFDYALEQAGNPDRSRVLMVGDTAES DILGGINAGLATCWLNAHHREQPEGIAPTWTVSSLHELEQLLCKH >gi|296493236|gb|ADTK01000265.1| GENE 5 3158 - 4747 1839 529 aa, chain + ## HITS:1 COG:ECs5333 KEGG:ns NR:ns ## COG: ECs5333 COG4108 # Protein_GI_number: 15834587 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptide chain release factor RF-3 # Organism: Escherichia coli O157:H7 # 1 529 1 529 529 1087 99.0 0 MTLSPYLQEVAKRRTFAIISHPDAGKTTITEKVLLFGQAIQTAGTVKGRGSNQHAKSDWM EMEKQRGISITTSVMQFPYHDCLVNLLDTPGHEDFSEDTYRTLTAVDCCLMVIDAAKGVE DRTRKLMEVTRLRDTPILTFMNKLDRDIRDPMELLDEVENELKIGCAPITWPIGCGKLFK GVYHLYKDETYLYQSGKGHTIQEVRIVKGLNNPDLDAAVGEDLAQQLRDELELVKGASNE FDKELFLAGEITPVFFGTALGNFGVDHMLDGLVEWAPAPMPRQTDTRTVQASEDKFTGFV FKIQANMDPKHRDRVAFMRVVSGKYEKGMKLRQVRTAKDVVISDALTFMAGDRSHVEEAY PGDILGLHNHGTIQIGDTFTQGEMMKFTGIPNFAPELFRRIRLKDPLKQKQLLKGLVQLS EEGAVQVFRPISNNDLIVGAVGVLQFDVVVSRLKSEYNVEAVYESVNVATARWVECADAK KFEEFKRKNESQLALDGGDNLAYIATSMVNLRLAQERYPDVQFHQTREH >gi|296493236|gb|ADTK01000265.1| GENE 6 5140 - 5745 705 201 aa, chain + ## HITS:1 COG:ECs5334 KEGG:ns NR:ns ## COG: ECs5334 COG2823 # Protein_GI_number: 15834588 # Func_class: R General function prediction only # Function: Predicted periplasmic or secreted lipoprotein # Organism: Escherichia coli O157:H7 # 1 201 1 201 201 264 99.0 8e-71 MTMTRLKISKTLLAVMLTSAVATGSAYAENNAQTTNESAGQKVDSSMNKVGNFMDDSAIT AKVKAALVDHDNIKSTDISVKTDQKVVTLSGFVESQAQAEEAVKVAKGVEGVTSVSDKLH VRDAKEGSVKGYAGDTATTSEIKAKLLADDIVPSRHVKVETTDGVVQLSGTVDSQAQSDR AESIAKAVDGVKSVKNDLKTK >gi|296493236|gb|ADTK01000265.1| GENE 7 5872 - 6033 185 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|157368899|ref|YP_001476888.1| ## NR: gi|157368899|ref|YP_001476888.1| hypothetical protein Spro_0654 [Serratia proteamaculans 568] # 1 53 33 85 85 77 92.0 4e-13 MFRWGIIFLVIALIAAALGFGGLAGTAAGAAKIVFVVGIILFLVSLFMGRKRP >gi|296493236|gb|ADTK01000265.1| GENE 8 6155 - 7228 842 357 aa, chain + ## HITS:1 COG:yjjU KEGG:ns NR:ns ## COG: yjjU COG4667 # Protein_GI_number: 
16132195 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Escherichia coli K12 # 1 357 1 357 357 731 100.0 0 MGQRIPVTLGNIAPLSLRPFQPGRIALVCEGGGQRGIFTAGVLDEFMRAQFNPFDLYLGT SAGAQNLSAFICNQPGYARKVIMRYTTKREFFDPLRFVRGGNLIDLDWLVEATASQMPLQ MDTAARLFDSGKSFYMCACRQDDYAPNYFLPTKQNWLDVIRASSAIPGFYRSGVSLEGIN YLDGGISDAIPVKEAARQGAKTLVVIRTVPSQMYYTPQWFKRMERWLGDSSLQPLVNLVQ HHETSYRDIQQFIEKPPGKLRIFEIYPPKPLHSIALGSRIPALREDYKLGRLCGRYFLAT VGKLLTEKAPLTRHLVPVVTPESIVIPPAPVANDTLVAEVSDAPQANDPTFNNEDLA >gi|296493236|gb|ADTK01000265.1| GENE 9 7225 - 8007 752 260 aa, chain + ## HITS:1 COG:yjjVm KEGG:ns NR:ns ## COG: yjjVm COG0084 # Protein_GI_number: 16132266 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Escherichia coli K12 # 1 258 1 258 259 491 95.0 1e-139 MICRFIDTHCHFDFPPFSGDEEASLQRAAQAGVGKIIVPATEAENFARVQALAEKYQPLY AALGLHPGMLEKHSDVSLDQLQQALERRPAKVVAVGEIGLDLFGDDPQFERQQWLLDEQL KLAKRYDLPVILHSRRTHDKLAMHLKRHDLPCTGVVHGFSGSLQQAERFVQLGYKIGVGG TITYPRASKTRDVIAKLPLASLLLETDAPDMPLNGFQGQPNRPEQAVRVFDVLCELRPEP EDEIAEVLLNNTYALFSVSG >gi|296493236|gb|ADTK01000265.1| GENE 10 8221 - 8955 584 244 aa, chain - ## HITS:1 COG:yjjW KEGG:ns NR:ns ## COG: yjjW COG1180 # Protein_GI_number: 16132196 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Escherichia coli K12 # 1 244 44 287 287 482 96.0 1e-136 MGRCNDCGECVPQCPHQALQIVDGKVLWNAVVCEQCDTCLKMCPQHATPMAQSMSVDEVL SHVRKAVLFIEGITVSGGEATTQLPFVVALFTAIKNDPQLRHLTCLVDSNGMLSETGWEK LLPVCDGAMLDLKAWGSECHQHLTGRDNQQIKRSICLLAERGKLAELRLLVIPDQVDYLH HIDELATFIKRLGDVPVRLNAFHAHGVYGEAQSWASATPEDVEPLADALKVRGVSRLIFP ALYL >gi|296493236|gb|ADTK01000265.1| GENE 11 9056 - 10606 1424 516 aa, chain - ## HITS:1 COG:no KEGG:JW4343 NR:ns ## KEGG: JW4343 # Name: yjjI # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 516 1 516 516 1073 100.0 0 
MPTSHENALQQRCQQIVTSPVLSPEQKRHFLALEAENNLPYPQLPAEARRALDEGVICDM FEGHAPYKPRYVLPDYARFLANGSEWLELEGAKDLDDALSLLTILYHHVPSVTSMPVYLG QLDALLQPYVRILTQDEIDVRIKRFWRYLDRTLPDAFMHANIGPSDSPITRAILRADAEL KQVSPNLTFIYDPEITPDDLLLEVAKNICECSKPHIANGPVHDKIFTKGGYGIVSCYNSL PLAGGGSTLVRLNLKAIAERSESLDDFFTRTLPHYCQQQIAIIDARCEFLYQQSHFFENS FLVKEGLINPERFVPMFGMYGLAEAVNLLCEKEGIAARYGKEAAANEVGYRISAQLAEFV ANTPVKYGWQKRAMLHAQSGISSDIGTTPGARLPYGDEPDPITHLQTVAPHHAYYYSGIS DILTLDETIKRNPQALVQLCLGAFKAGMREFTANVSGNDLVRVTGYMVRLSDLEKYRAEG SRTNTTWLGEEAARNTRILERQPRVISHEQQMRFSQ >gi|296493236|gb|ADTK01000265.1| GENE 12 10864 - 11643 1033 259 aa, chain + ## HITS:1 COG:ECs5340 KEGG:ns NR:ns ## COG: ECs5340 COG0274 # Protein_GI_number: 15834594 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Escherichia coli O157:H7 # 1 259 1 259 259 446 100.0 1e-125 MTDLKASSLRALKLMDLTTLNDDDTDEKVIALCHQAKTPVGNTAAICIYPRFIPIARKTL KEQGTPEIRIATVTNFPHGNDDIEIALAETRAAIAYGADEVDVVFPYRALMAGNEQVGFD LVKACKEACAAANVLLKVIIETGELKDEALIRKASEISIKAGADFIKTSTGKVAVNATPE SARIMMEVIRDMGVEKTVGFKPAGGVRTAEDAQKYLAIADELFGADWADARHYRFGASSL LASLLKALGHGDGKSASSY >gi|296493236|gb|ADTK01000265.1| GENE 13 11770 - 13092 1736 440 aa, chain + ## HITS:1 COG:ZdeoA KEGG:ns NR:ns ## COG: ZdeoA COG0213 # Protein_GI_number: 15804954 # Func_class: F Nucleotide transport and metabolism # Function: Thymidine phosphorylase # Organism: Escherichia coli O157:H7 EDL933 # 1 440 1 440 440 816 100.0 0 MFLAQEIIRKKRDGHALSDEEIRFFINGIRDNTISEGQIAALAMTIFFHDMTMPERVSLT MAMRDSGTVLDWKSLHLNGPIVDKHSTGGVGDVTSLMLGPMVAACGGYIPMISGRGLGHT GGTLDKLESIPGFDIFPDDNRFREIIKDVGVAIIGQTSSLAPADKRFYATRDITATVDSI PLITASILAKKLAEGLDALVMDVKVGSGAFMPTYELSEALAEAIVGVANGAGVRTTALLT DMNQVLASSAGNAVEVREAVQFLTGEYRNPRLFDVTMALCVEMLISGKLAKDDAEARAKL QAVLDNGKAAEVFGRMVAAQKGPTDFVENYAKYLPTAMLTKAVYADTEGFVSEMDTRALG MAVVAMGGGRRQASDTIDYSVGFTDMARLGDQVDGQRPLAVIHAKDENSWQEAAKAVKAA IKLADKAPESTPTVYRRISE >gi|296493236|gb|ADTK01000265.1| GENE 14 13144 - 14367 1508 407 aa, chain + ## HITS:1 COG:ECs5342 KEGG:ns 
NR:ns ## COG: ECs5342 COG1015 # Protein_GI_number: 15834596 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphopentomutase # Organism: Escherichia coli O157:H7 # 1 407 1 407 407 840 100.0 0 MKRAFIMVLDSFGIGATEDAERFGDVGADTLGHIAEACAKGEADNGRKGPLNLPNLTRLG LAKAHEGSTGFIPAGMDGNAEVIGAYAWAHEMSSGKDTPSGHWEIAGVPVLFEWGYFSDH ENSFPQELLDKLVERANLPGYLGNCHSSGTVILDQLGEEHMKTGKPIFYTSADSVFQIAC HEETFGLDKLYELCEIAREELTNGGYNIGRVIARPFIGDKAGNFQRTGNRHDLAVEPPAP TVLQKLVDEKHGQVVSVGKIADIYANCGITKKVKATGLDALFDATIKEMKEAGDNTIVFT NFVDFDSSWGHRRDVAGYAAGLELFDRRLPELMSLLRDDDILILTADHGCDPTWTGTDHT REHIPVLVYGPKVKPGSLGHRETFADIGQTLAKYFGTSDMEYGKAMF Prediction of potential genes in microbial genomes Time: Mon May 16 15:51:21 2011 Seq name: gi|296493235|gb|ADTK01000266.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont846.4, whole genome shotgun sequence Length of sequence - 19986 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 11, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 57 - 776 904 ## COG0813 Purine-nucleoside phosphorylase + Term 807 - 848 8.6 2 2 Op 1 2/0.700 - CDS 1051 - 1200 149 ## COG1396 Predicted transcriptional regulators 3 2 Op 2 2/0.700 - CDS 1232 - 2248 1103 ## COG0095 Lipoate-protein ligase A 4 2 Op 3 . - CDS 2276 - 2920 768 ## COG3726 Uncharacterized membrane protein affecting hemolysin expression - Prom 3002 - 3061 4.6 + Prom 2929 - 2988 3.1 5 3 Op 1 5/0.300 + CDS 3026 - 3994 1138 ## COG0560 Phosphoserine phosphatase 6 3 Op 2 1/0.800 + CDS 4043 - 5425 1428 ## COG1066 Predicted ATP-dependent serine protease 7 3 Op 3 . + CDS 5446 - 6678 1232 ## COG3172 Predicted ATPase/kinase involved in NAD metabolism - Term 6827 - 6890 2.1 8 4 Tu 1 . 
- CDS 6985 - 8652 2185 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 8749 - 8808 3.2 9 5 Op 1 4/0.300 + CDS 8863 - 10800 1821 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) 10 5 Op 2 . + CDS 10890 - 11216 384 ## COG2973 Trp operon repressor 11 6 Tu 1 . - CDS 11259 - 11771 374 ## COG1986 Uncharacterized conserved protein - Prom 11798 - 11857 4.3 + Prom 11722 - 11781 2.7 12 7 Tu 1 . + CDS 11823 - 12470 631 ## COG0406 Fructose-2,6-bisphosphatase 13 8 Tu 1 . - CDS 12467 - 13336 917 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 13529 - 13588 4.5 + Prom 13443 - 13502 4.3 14 9 Op 1 4/0.300 + CDS 13547 - 14020 536 ## COG3045 Uncharacterized protein conserved in bacteria 15 9 Op 2 40/0.000 + CDS 14033 - 14722 598 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 16 9 Op 3 6/0.200 + CDS 14722 - 16146 1310 ## COG0642 Signal transduction histidine kinase 17 9 Op 4 . + CDS 16204 - 17376 954 ## COG4452 Inner membrane protein involved in colicin E2 resistance 18 9 Op 5 . + CDS 17376 - 17555 221 ## COG4452 Inner membrane protein involved in colicin E2 resistance + Term 17572 - 17602 3.0 - Term 17559 - 17589 3.0 19 10 Tu 1 . - CDS 17614 - 18330 861 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Prom 18715 - 18774 8.1 20 11 Tu 1 . 
+ CDS 18966 - 19652 518 ## COG0565 rRNA methylase Predicted protein(s) >gi|296493235|gb|ADTK01000266.1| GENE 1 57 - 776 904 239 aa, chain + ## HITS:1 COG:ECs5343 KEGG:ns NR:ns ## COG: ECs5343 COG0813 # Protein_GI_number: 15834597 # Func_class: F Nucleotide transport and metabolism # Function: Purine-nucleoside phosphorylase # Organism: Escherichia coli O157:H7 # 1 239 1 239 239 470 100.0 1e-133 MATPHINAEMGDFADVVLMPGDPLRAKYIAETFLEDAREVNNVRGMLGFTGTYKGRKISV MGHGMGIPSCSIYTKELITDFGVKKIIRVGSCGAVLPHVKLRDVVIGMGACTDSKVNRIR FKDHDFAAIADFDMVRNAVDAAKALGIDARVGNLFSADLFYSPDGEMFDVMEKYGILGVE MEAAGIYGVAAEFGAKALTICTVSDHIRTHEQTTAAERQTTFNDMIKIALESVLLGDKE >gi|296493235|gb|ADTK01000266.1| GENE 2 1051 - 1200 149 49 aa, chain - ## HITS:1 COG:ECs5344 KEGG:ns NR:ns ## COG: ECs5344 COG1396 # Protein_GI_number: 15834598 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 49 1 49 87 78 97.0 2e-15 MIDLENQEREIINLMLSQRISWLAAVRIRHKLSLAEVSKMLGISINSLK >gi|296493235|gb|ADTK01000266.1| GENE 3 1232 - 2248 1103 338 aa, chain - ## HITS:1 COG:Z5988_2 KEGG:ns NR:ns ## COG: Z5988_2 COG0095 # Protein_GI_number: 15804958 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate-protein ligase A # Organism: Escherichia coli O157:H7 EDL933 # 1 338 5 342 342 696 99.0 0 MSTLRLLISDSYDPWFNLAVEECIFRQMPATQRVLFLWRNADTVVIGRAQNPWKECNTRR MEEDNVRLARRSSGGGAVFHDLGNTCFTFMAGKPEYDKTISTSIVLNALNALGVSAEASG RNDLVVKTVEGDRKVSGSAYRETKDRGFHHGTLLLNADLSRLANYLNPDKKKLAAKGITS VRSRVTNLTELLPGITHEQVCEAITKAFFAHYGERVEAEIISPDKTPDLPNFAETFARQS SWEWNFGQAPAFSHLLDERFSWGGVELHFDVEKGHITRAQVFTDSLNPAPLEALAGRLQG CLYRADMLQQECEALLVDFPDQEKELRELSTWIAGAVR >gi|296493235|gb|ADTK01000266.1| GENE 4 2276 - 2920 768 214 aa, chain - ## HITS:1 COG:ECs5345_1 KEGG:ns NR:ns ## COG: ECs5345_1 COG3726 # Protein_GI_number: 15834599 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein affecting hemolysin expression # Organism: Escherichia coli O157:H7 # 1 205 
1 205 224 366 100.0 1e-101 MARTKLKFRLHRAVIVLFCLALLVALMQGASWFSQNHQRQRNPQLEELARTLARQVTLNV APLMRTDSPDEKRIQAILDQLTDESRILDAGVYDEQGDLIARSGESVEVRDRLALDGKKA GGYFNQQIVEPIAGKNGPLGYLRLTLDTHTLATEAQQVDNTTNILRLMLLLSLAIGVVLT RTLLQGKRTRWQQSPFLLTASKPVPEEEESEKKE >gi|296493235|gb|ADTK01000266.1| GENE 5 3026 - 3994 1138 322 aa, chain + ## HITS:1 COG:ECs5346 KEGG:ns NR:ns ## COG: ECs5346 COG0560 # Protein_GI_number: 15834600 # Func_class: E Amino acid transport and metabolism # Function: Phosphoserine phosphatase # Organism: Escherichia coli O157:H7 # 1 322 1 322 322 638 99.0 0 MPNITWCDLPEDVSLWPGLPLSLSGDEVMPLDYHAGRSGWLLYGRGLDKQRLTQYQSKLG AAMVIVAAWCVEDYQVIRLAGSLTARATRLAHEAQLDVAPLGKIPHLRTPGLLVMDMDST AIQIECIDEIAKLAGTGEMVAEVTERAMRGELDFTASLRSRVATLKGADANILQQVRENL PLMPGLTQLVLKLETLGWKVAIASGGFTFFAEYLRDKLRLTAVVANELEIMDGKFTGNVI GDIVDAQYKAKTLTRLAQEYEIPLAQTVAIGDGANDLPMIKAAGLGIAYHAKPKVNEKTE VTIRHADLMGVFCILSGSLNQK >gi|296493235|gb|ADTK01000266.1| GENE 6 4043 - 5425 1428 460 aa, chain + ## HITS:1 COG:ECs5347 KEGG:ns NR:ns ## COG: ECs5347 COG1066 # Protein_GI_number: 15834601 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent serine protease # Organism: Escherichia coli O157:H7 # 1 460 1 460 460 902 100.0 0 MAKAPKRAFVCNECGADYPRWQGQCSACHAWNTITEVRLAASPTVARNERLSGYAGSAGV AKVQKLSDISLEELPRFSTGFKEFDRVLGGGVVPGSAILIGGNPGAGKSTLLLQTLCKLA QQMKTLYVTGEESLQQVAMRAHRLGLPTDNLNMLSETSIEQICLIAEEEQPKLMVIDSIQ VMHMADVQSSPGSVAQVRETAAYLTRFAKTRGVAIVMVGHVTKDGSLAGPKVLEHCIDCS VLLDGDADSRFRTLRSHKNRFGAVNELGVFAMTEQGLREVSNPSAIFLSRGDEVTSGSSV MVVWEGTRPLLVEIQALVDHSMMANPRRVAVGLEQNRLAILLAVLHRHGGLQMADQDVFV NVVGGVKVTETSADLALLLAMVSSLRDRPLPQDLVVFGEVGLAGEIRPVPSGQERISEAA KHGFRRAIVPAANVPKKAPEGMQIFGVKKLSDALSVFDDL >gi|296493235|gb|ADTK01000266.1| GENE 7 5446 - 6678 1232 410 aa, chain + ## HITS:1 COG:nadR_3 KEGG:ns NR:ns ## COG: nadR_3 COG3172 # Protein_GI_number: 16132207 # Func_class: H Coenzyme transport and metabolism # Function: Predicted ATPase/kinase involved in NAD metabolism # 
Organism: Escherichia coli K12 # 224 410 1 187 187 382 99.0 1e-106 MSSFDYLKTAIKQQGCTLQQVADASGMTKGYLSQLLNAKIKSPSAQKLEALHRFLGLEFP RQKKTIGVVFGKFYPLHTGHIYLIQRACSQVDELHIIMGFDDTRDRALFEDSAMSQQPTV PDRLRWLLQTFKYQKNIRIHAFNEEGMEPYPHGWDVWSNGIKKFMAEKGIQPDLIYTSEE ADAPQYMEHLGIDTVLVDPKRTFMSISGAQIRENPFRYWEYIPTEVKPFFVRTVAILGGE SSGKSTLVNKLANIFNTTSAWEYGRDYVFSHLGGDEIALQYSDYDKIALGHAQYIDFAVK YANKVAFIDTDFVTTQAFCKKYEGREHPFVQALIDEYRFDLVILLENNTPWVADGLRSLG SSVDRKEFQNLLVEMLEENNIEFVRVEEDDYDSRFLRCVELVREMMGEQR >gi|296493235|gb|ADTK01000266.1| GENE 8 6985 - 8652 2185 555 aa, chain - ## HITS:1 COG:ECs5349 KEGG:ns NR:ns ## COG: ECs5349 COG0488 # Protein_GI_number: 15834603 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Escherichia coli O157:H7 # 1 555 1 555 555 1101 99.0 0 MAQFVYTMHRVGKVVPPKRHILKNISLSFFPGAKIGVLGLNGAGKSTLLRIMAGIDKDIE GEARPQPDIKIGYLPQEPQLNPEHTVRESIEEAVSEVVNALKRLDEVYALYADPDADFDK LAAEQGRLEEIIQAHDGHNLNVQLERAADALRLPDWDAKIANLSGGERRRVALCRLLLEK PDMLLLDEPTNHLDAESVAWLERFLHDFEGTVVAITHDRYFLDNVAGWILELDRGEGIPW EGNYSSWLEQKDQRLAQEASQEAARRKSIEKELEWVRQGTKGRQSKGKARLARFEELNST EYQKRNETNELFIPPGPRLGDKVLEVSNLRKSYGDRLLIDDLSFSIPKGAIVGIIGPNGA GKSTLFRMISDQEQPDSGTITLGETVKLASVDQFRDSMDNSKTVWEEVSGGLDIMKIGNT EMPSRAYVGRFNFKGVDQGKRVGELSGGERGRLHLAKLLQVGGNMLLLDEPTNDLDIETL RALENALLEFPGCAMVISHDRWFLDRIATHILDYQDEGKVEFFEGNFTEYEEYKKRTLGA DALEPKRIKYKRIAK >gi|296493235|gb|ADTK01000266.1| GENE 9 8863 - 10800 1821 645 aa, chain + ## HITS:1 COG:ECs5350 KEGG:ns NR:ns ## COG: ECs5350 COG0741 # Protein_GI_number: 15834604 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Escherichia coli O157:H7 # 1 645 10 654 654 1244 99.0 0 MEKAKQVTWRLLAAGVCLLTVSSVARADSLDEQRSRYAQIKQAWDNRQMDVVEQMMPGLK DYPLYPYLEYRQITDDLMNQPAVTVTNFVRANPTLSPARTLQSRFVNELARREDWRGLLA FSPEKPGTTEAQCNYYYAKWNTGQSEEAWQGAKELWLTGKSQPNACDKLFSVWRASGKQD 
PLAYLERIRLAMKAGNTGLVTVLAGQMPADYQTIASAIISLANNPNTVLTFARTTGATDF TRQMAAVAFASVARQDAENARLMIPSLAQAQQLNEDQIQELRDIVAWRLMGNDVTDEQAK WRDDAIMRSQSTSLIERRVRMALGTGDRRGLNTWLARLPMEAKEKDEWRYWQADLLLERG REAEAKEILHQLMQQRGFYPMVAAQRIGEEYELKIDKAPQNVDSALTQGPEMARVRELMY WNLDNTARSEWANLVKSKSKTEQAQLARYAFNNQWWDLSVQATIAGKLWDHLEERFPLAY NDLFKRYTSGKEIPQSYAMAIARQESAWNPKVKSPVGASGLMQIMPGTATHTVKMFSIPG YSSPGQLLDPETNINIGTSYLQYVYQQFGNNRIFSSAAYNAGPGRVRTWLGNSAGRIDAV AFVESIPFSETRGYVKNVLAYDAYYRYFMGDKPTLMSATEWGRRY >gi|296493235|gb|ADTK01000266.1| GENE 10 10890 - 11216 384 108 aa, chain + ## HITS:1 COG:ECs5351 KEGG:ns NR:ns ## COG: ECs5351 COG2973 # Protein_GI_number: 15834605 # Func_class: K Transcription # Function: Trp operon repressor # Organism: Escherichia coli O157:H7 # 1 108 1 108 108 166 100.0 9e-42 MAQQSPYSAAMAEQRHQEWLRFVDLLKNAYQNDLHLPLLNLMLTPDEREALGTRVRIVEE LLRGEMSQRELKNELGAGIATITRGSNSLKAAPVELRQWLEEVLLKSD >gi|296493235|gb|ADTK01000266.1| GENE 11 11259 - 11771 374 170 aa, chain - ## HITS:1 COG:ECs5352 KEGG:ns NR:ns ## COG: ECs5352 COG1986 # Protein_GI_number: 15834606 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 170 4 173 173 325 98.0 2e-89 MHQVVCATTNPAKIQAILQAFHEIFGEGSCHIASVAVESGVPEQPFGSEETRAGARNRVA NARRLLPEADFWVAIEAGIDGDSTFSWVVIENTSQRGEARSATLPLPAVILQKVREGEAL GPVMSRYTGIDEIGRKEGAIGVFTAGKLTRASVYHQAVILALSPFHNAVY >gi|296493235|gb|ADTK01000266.1| GENE 12 11823 - 12470 631 215 aa, chain + ## HITS:1 COG:ECs5353 KEGG:ns NR:ns ## COG: ECs5353 COG0406 # Protein_GI_number: 15834607 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Escherichia coli O157:H7 # 1 215 1 215 215 406 100.0 1e-113 MLQVYLVRHGETQWNAERRIQGQSDSPLTAKGEQQAMQVATRAKELGITHIISSDLGRTR RTAEIIAQACGCDIIFDSRLRELNMGVLEKRHIDSLTEEEENWRRQLVNGTVDGRIPEGE SMQELSDRVNAALESCRDLPQGSRPLLVSHGIALGCLVSTILGLPAWAERRLRLRNCSIS RVDYQESLWLASGWVVETAGDISHLDAPALDELQR >gi|296493235|gb|ADTK01000266.1| GENE 13 12467 - 13336 917 289 aa, chain - ## 
HITS:1 COG:ECs5354 KEGG:ns NR:ns ## COG: ECs5354 COG2207 # Protein_GI_number: 15834608 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 289 1 289 289 597 100.0 1e-171 MDQAGIIRDLLIWLEGHLDQPLSLDNVAAKAGYSKWHLQRMFKDVTGHAIGAYIRARRLS KSAVALRLTARPILDIALQYRFDSQQTFTRAFKKQFAQTPALYRRSPEWSAFGIRPPLRL GEFTMPEHKFVTLEDTPLIGVTQSYSCSLEQISDFRHEMRYQFWHDFLGNAPTIPPVLYG LNETRPSQDKDDEQEVFYTTALAQDQADGYVLTGHPVMLQGGEYVMFTYEGLGTGVQEFI LTVYGTCMPMLNLTRRKGQDIERYYPAEDAKAGDRPINLRCELLIPIRR >gi|296493235|gb|ADTK01000266.1| GENE 14 13547 - 14020 536 157 aa, chain + ## HITS:1 COG:ECs5355 KEGG:ns NR:ns ## COG: ECs5355 COG3045 # Protein_GI_number: 15834609 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 157 1 157 157 291 100.0 3e-79 MKYKHLILSLSLIMLGPLAHAEEIGSVDTVFKMIGPDHKIVVEAFDDPDVKNVTCYVSRA KTGGIKGGLGLAEDTSDAAISCQQVGPIELSDRIKNGKAQGEVVFKKRTSLVFKSLQVVR FYDAKRNALAYLAYSDKVVEGSPKNAISAVPVMPWRQ >gi|296493235|gb|ADTK01000266.1| GENE 15 14033 - 14722 598 229 aa, chain + ## HITS:1 COG:creB KEGG:ns NR:ns ## COG: creB COG0745 # Protein_GI_number: 16132215 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli K12 # 1 229 1 229 229 448 98.0 1e-126 MQRETVWLVEDEQGIADTLVYMLQQEGFAVEVFERGLPVLDKARQQVPDVMILDVGLPDI SGFELCRQLLALHPALPVLFLTARSEEVDRLLGLEIGADDYVAKPFSPREVCARVRTLLR RVKKFSSPSPVIRIGHFELNEPAAQISWFDTPLTLTRYEFLLLKTLLKSPGRVWSRQQLM DSVWEDAQDTYDRTVDTHIKTLRAKLRAINPDLSPINTHRGMGYSLRGL >gi|296493235|gb|ADTK01000266.1| GENE 16 14722 - 16146 1310 474 aa, chain + ## HITS:1 COG:ECs5357 KEGG:ns NR:ns ## COG: ECs5357 COG0642 # Protein_GI_number: 15834611 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli O157:H7 # 1 474 1 474 474 909 99.0 0 
MRIGMRLLLGYFLLVAVAAWFVLAIFVKEVKPGVRRATEGTLIDTATLLAELARPDLLSG DPTHGQLAQAFNQLQHRPFRANIGGINKVRNEYHVYMTDSQGKVLFDSANKAVGQDYSRW NDVWLTLRGQYGARSTLQNPADPESSVMYVAAPIMDGSRLIGVLSVGKPNAAMAPVIKRS ERRILWASAILLGIALVIGAGMVWWINRSIARLTRYADSVTDNKPVPLPELGSSELRKLA QALESMRVKLEGKNYIEQYVYALTHELKSPLAAIRGAAEILREGPPPEVVARFTDNILTQ NARMQALVETLLRQARLENRQEVVLTVVDVAALFRRVSEARTVQLAEKNITLHVMPTEVN VAAEPALLEQALGNLLDNAIDFTPESGRITLSAEVDQEHVALKVLDTGSGIPDYALSRIF ERFYSLPRANGQKSSGLGLAFVSEVARLFNGEVTLRNVQEGGVLASLRLHRHFT >gi|296493235|gb|ADTK01000266.1| GENE 17 16204 - 17376 954 390 aa, chain + ## HITS:1 COG:ECs5358 KEGG:ns NR:ns ## COG: ECs5358 COG4452 # Protein_GI_number: 15834612 # Func_class: V Defense mechanisms # Function: Inner membrane protein involved in colicin E2 resistance # Organism: Escherichia coli O157:H7 # 1 379 1 379 450 731 99.0 0 MLKSPLFWKMTTLFGAVLLLLIPIMLIRQVIVERADYRSDVEDAIRQSTSGPQKLVGPLI AIPVTELYTVQEDDKTVERKRSFIHFWLPESLMVDGNQNVEERKIGIYTGQVWHSDLTLK ADFDVSRLSELDAPNITLGKPFIVISVGDARGIGVVKAPEVNGTALTIEPGTGLEQGGQG VHIPLPEGDWRKQNLKLNMALNLSGTGDLSVVPAGRNSEMTLTSNWPHPSFLGDFLPAKR EVSESGFQAQWQSSWFANNLGERFASGNDTGWENFPAFSVAVTTPADQYQLTDRATKYAI LLIALTFMAFFVFETLTAQRLHPMQYLLVGLSLVMFYLLLLALSEHTGFTVAWIIASLIG ALMNGIYLQAVLKGWRNSMLFTLALLLLDV >gi|296493235|gb|ADTK01000266.1| GENE 18 17376 - 17555 221 59 aa, chain + ## HITS:1 COG:creD KEGG:ns NR:ns ## COG: creD COG4452 # Protein_GI_number: 16132217 # Func_class: V Defense mechanisms # Function: Inner membrane protein involved in colicin E2 resistance # Organism: Escherichia coli K12 # 1 59 392 450 450 115 100.0 1e-26 MWGLLNSADSALLLGTSVLVVALAGMMFVTRNIDWYAFSLPKMKASKEVTTDDELRIWK >gi|296493235|gb|ADTK01000266.1| GENE 19 17614 - 18330 861 238 aa, chain - ## HITS:1 COG:ECs5359 KEGG:ns NR:ns ## COG: ECs5359 COG0745 # Protein_GI_number: 15834613 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 238 1 238 238 
470 100.0 1e-133 MQTPHILIVEDELVTRNTLKSIFEAEGYDVFEATDGAEMHQILSEYDINLVIMDINLPGK NGLLLARELREQANVALMFLTGRDNEVDKILGLEIGADDYITKPFNPRELTIRARNLLSR TMNLGTVSEERRSVESYKFNGWELDINSRSLIGPDGEQYKLPRSEFRAMLHFCENPGKIQ SRAELLKKMTGRELKPHDRTVDVTIRRIRKHFESTPDTPEIIATIHGEGYRFCGDLED >gi|296493235|gb|ADTK01000266.1| GENE 20 18966 - 19652 518 228 aa, chain + ## HITS:1 COG:ECs5361 KEGG:ns NR:ns ## COG: ECs5361 COG0565 # Protein_GI_number: 15834615 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylase # Organism: Escherichia coli O157:H7 # 1 228 1 228 228 424 99.0 1e-119 MRITIILVAPARAENIGAAARAMKTMGFSELRIVDSQAHLEPATRWVAHGSGDIIDNIKV FPTLAESLHDVDFTVATTARSRAKYHYYATPVELVPLLEEKSSWMSHAALVFGREDSGLT NEELALADVLTGVPMVADYPSLNLGQAVMVYCYQLATLIQQPAKSDTTADQHQLQALRER AMALLTTLAVADDIKLVDWLQQRLGLLEQRDTAMLHRLLHDIEKNITK Prediction of potential genes in microbial genomes Time: Mon May 16 15:51:27 2011 Seq name: gi|296493234|gb|ADTK01000267.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont846.5, whole genome shotgun sequence Length of sequence - 15095 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 7, operones - 5 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 11/0.000 + CDS 78 - 2540 2152 ## COG0527 Aspartokinases 2 1 Op 2 19/0.000 + CDS 2542 - 3474 687 ## COG0083 Homoserine kinase 3 1 Op 3 . + CDS 3475 - 4761 1449 ## COG0498 Threonine synthase + Term 4785 - 4824 4.2 + Prom 4770 - 4829 2.9 4 2 Tu 1 . + CDS 4975 - 5271 140 ## ECUMN_0005 conserved hypothetical protein; putative exported protein - Term 5261 - 5305 2.2 5 3 Op 1 5/0.000 - CDS 5424 - 6200 877 ## COG3022 Uncharacterized protein conserved in bacteria 6 3 Op 2 . - CDS 6270 - 7700 812 ## PROTEIN SUPPORTED gi|145634045|ref|ZP_01789756.1| 50S ribosomal protein L21 - Prom 7757 - 7816 4.3 + Prom 7873 - 7932 1.9 7 4 Op 1 5/0.000 + CDS 7979 - 8932 1230 ## COG0176 Transaldolase + Term 8940 - 8980 9.3 + Prom 8958 - 9017 2.4 8 4 Op 2 . 
+ CDS 9047 - 9634 688 ## PROTEIN SUPPORTED gi|134277849|ref|ZP_01764564.1| ribosomal protein S16 - Term 9624 - 9661 6.2 9 5 Tu 1 4/1.000 - CDS 9669 - 10235 706 ## COG1584 Predicted membrane protein - Prom 10304 - 10363 4.5 - Term 10264 - 10304 3.2 10 6 Op 1 . - CDS 10384 - 11097 588 ## COG4735 Uncharacterized protein conserved in bacteria 11 6 Op 2 . - CDS 11123 - 11527 412 ## ECO103_0013 hypothetical protein - Prom 11565 - 11624 4.9 12 7 Op 1 31/0.000 + CDS 11904 - 13820 2266 ## COG0443 Molecular chaperone + Term 13849 - 13895 11.0 13 7 Op 2 . + CDS 13909 - 15039 1122 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain Predicted protein(s) >gi|296493234|gb|ADTK01000267.1| GENE 1 78 - 2540 2152 820 aa, chain + ## HITS:1 COG:thrA_1 KEGG:ns NR:ns ## COG: thrA_1 COG0527 # Protein_GI_number: 16127996 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Escherichia coli K12 # 1 460 1 460 460 905 99.0 0 MRVLKFGGTSVANAERFLRVADILESNARQGQVATVLSAPAKITNHLVAIIEKTISGQDA LPNISDAERIFAELLTGLTAAQPGFPLAQLKTFVDQEFAQIKHVLHGISLLGQCPDSINA ALICRGEKMSIAIMAGVLEARGHNVTVIDPVEKLLAVGHYLESTVDIAESTRRIAASRIP ADHMVLMAGFTAGNEKGELGVLGRNGSDYSAAVLAACLRADCCEIWTDVDGVYTCDPRQV PDARLLKSMSYQEAMELSYFGAKVLHPRTITPIAQFQIPCLIKNTGNPQAPGTLIGASRD EDELPVKGISNLNNMAMFSVSGPGMKGMVGMAARVFAAMSRARISVVLITQSSSEYSISF CVPQSDCVRAERAMQEEFYLELKEGLLEPLAVTERLAIISVVGDGMRTLRGISAKFFAAL ARANINIVAIAQGSSERSISVVVNNDDATTGVRVTHQMLFNTDQVIEVFVIGVGGVGGAL LEQLKRQQSWLKNKHIDLRVCGVANSKALLTNVHGLNLENWQEELAQAKEPFNLGRLIRL VKEYHLLNPVIVDCTSSQAVADQYADFLREGFHVVTPNKKANTSSMDYYHLLRHAAEKSR RKFLYDTNVGAGLPVIENLQNLLNAGDELMKFSGILSGSLSYIFGKLDEGMSFSEATTLA REMGYTEPDPRDDLSGMDVARKLLILARETGRELELADIEIEPVLPAEFNAEGDVAAFMA NLSQLDDLFAARVAKARDEGKVLRYVGNIDEDGACRVKIAEVDGNDPLFKVKNGENALAF YSHYYQPLPLVLRGYGAGNDVTAAGVFADLLRTLSWKLGV >gi|296493234|gb|ADTK01000267.1| GENE 2 2542 - 3474 687 310 aa, chain + ## HITS:1 COG:ECs0003 KEGG:ns NR:ns ## COG: ECs0003 COG0083 # Protein_GI_number: 15829257 # Func_class: E Amino acid 
transport and metabolism # Function: Homoserine kinase # Organism: Escherichia coli O157:H7 # 1 310 1 310 310 640 99.0 0 MVKVYAPASSANMSVGFDVLGAAVTPVDGALLGDVVTVEAAETFSLNNLGRFADKLPSEP RENIVYQCWERFCQELGKQIPVAMTLEKNMPIGSGLGSSACSVVAALMAMNEHCGKPLND TRLLALMGELEGRISGSIHYDNVAPCFLGGMQLMIEENDIISQQVPGFDEWLWVLAYPGI KVSTAEARAILPAQYRRQDCIAHGRHLAGFIHACYSRQPELAAKLMKDVIAEPYRERLLP GFRQARQAVAEIGAVASGISGSGPTLFALCDKPDTAQRVADWLGKNYLQNQEGFVHICRL DTAGARVLEN >gi|296493234|gb|ADTK01000267.1| GENE 3 3475 - 4761 1449 428 aa, chain + ## HITS:1 COG:ECs0004 KEGG:ns NR:ns ## COG: ECs0004 COG0498 # Protein_GI_number: 15829258 # Func_class: E Amino acid transport and metabolism # Function: Threonine synthase # Organism: Escherichia coli O157:H7 # 1 428 1 428 428 843 100.0 0 MKLYNLKDHNEQVSFAQAVTQGLGKNQGLFFPHDLPEFSLTEIDEMLKLDFVTRSAKILS AFIGDEIPQEILEERVRAAFAFPAPVANVESDVGCLELFHGPTLAFKDFGGRFMAQMLTH IAGDKPVTILTATSGDTGAAVAHAFYGLPNVKVVILYPRGKISPLQEKLFCTLGGNIETV AIDGDFDACQALVKQAFDDEELKVALGLNSANSINISRLLAQICYYFEAVAQLPQEARNQ LVVSVPSGNFGDLTAGLLAKSLGLPVKRFIAATNVNDTVPRFLHDGQWSPKATQATLSNA MDVSQPNNWPRVEELFRRKIWQLKELGYAAVDDETTQQTMRELKELGYTSEPHAAVAYRA LRDQLNPGEYGLFLGTAHPAKFKESVEAILGETLDLPKELAERADLPLLSHNLPADFAAL RKLMMNHQ >gi|296493234|gb|ADTK01000267.1| GENE 4 4975 - 5271 140 98 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_0005 NR:ns ## KEGG: ECUMN_0005 # Name: yaaX # Def: conserved hypothetical protein; putative exported protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 84 1 84 98 134 98.0 7e-31 MKKMQSIVLALSLVLVAPMAAQAAEITLVPSVKLQIGDRDNRGYYWDGGHWRDHGWWKQH YEWRGNRWHPHGPPPPPRHHKKAPHDHHGGHGPGKHHR >gi|296493234|gb|ADTK01000267.1| GENE 5 5424 - 6200 877 258 aa, chain - ## HITS:1 COG:ECs0006 KEGG:ns NR:ns ## COG: ECs0006 COG3022 # Protein_GI_number: 15829260 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 258 1 258 258 508 99.0 1e-144 MLILISPAKTLDYQSPLTTMRYTLPELLDNSQQLIHEARKLTPPQISTLMRISDKLAGIN 
AARFHDWQPDFTPENARQAILAFKGDVYTGLQAETFSEDDFDFAQQHLRMLSGLYGVLRP LDLMQPYRLEMGIRLENARGKDLYQFWGDIITNKLNEALAAQGDNVVINLASDEYFKSVK PKKLNAEIIKPVFLDEKNGKFKIISFYAKKARGLMSRFIIENRLTKPEQLTGFNSEGYFF DEASSSNGELVFKRYEQR >gi|296493234|gb|ADTK01000267.1| GENE 6 6270 - 7700 812 476 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145634045|ref|ZP_01789756.1| 50S ribosomal protein L21 [Haemophilus influenzae PittAA] # 6 447 8 446 456 317 40 3e-86 MPDFFSFINSVLWGSVMIYLLFGAGCWFTFRTGFVQFRYIRQFGKSLKNSIHPQPGGLTS FQSLCTSLAARVGSGNLAGVALAITAGGPGAVFWMWVAAFIGMATSFAECSLAQLYKERD VNGQFRGGPAWYMARGLGMRWMGVLFAVFLLIAYGIIFSGVQANAVARALSFSFDFPPLV TGIILAVFALLAITRGLHGVARLMQGFVPLMAIIWVLTSLVICVMNIGQLPHVIWSIFES AFGWQEAAGGAAGYTLSQAITNGFQRSMFSNEAGMGSTPNAAAAAASWPPHPAAQGIVQM IGIFIDTLVICTASAMLILLAGNGTTYMPLEGIQLIQKAMRVLMGSWGAEFVTLVVILFA FSSIVANYIYAENNLFFLRLNNPKAIWCLRICTFATVIGGTLLSLPLMWQLADIIMACMA ITNLTAILLLSPVVHTIASDYLRQRKLGVRPVFDPLRYPDIGRQLSPDAWDDVSQE >gi|296493234|gb|ADTK01000267.1| GENE 7 7979 - 8932 1230 317 aa, chain + ## HITS:1 COG:ECs0008 KEGG:ns NR:ns ## COG: ECs0008 COG0176 # Protein_GI_number: 15829262 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Escherichia coli O157:H7 # 1 317 1 317 317 609 100.0 1e-174 MTDKLTSLRQYTTVVADTGDIAAMKLYQPQDATTNPSLILNAAQIPEYRKLIDDAVAWAK QQSNDRAQQIVDATDKLAVNIGLEILKLVPGRISTEVDARLSYDTEASIAKAKRLIKLYN DAGISNDRILIKLASTWQGIRAAEQLEKEGINCNLTLLFSFAQARACAEAGVFLISPFVG RILDWYKANTDKKEYAPAEDPGVVSVSEIYQYYKEHGYETVVMGASFRNIGEILELAGCD RLTIAPALLKELAESEGAIERKLSYTGEVKARPARITESEFLWQHNQDPMAVDKLAEGIR KFAIDQEKLEKMIGDLL >gi|296493234|gb|ADTK01000267.1| GENE 8 9047 - 9634 688 195 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|134277849|ref|ZP_01764564.1| ribosomal protein S16 [Burkholderia pseudomallei 305] # 6 191 2 191 194 269 73 8e-72 MNTLRIGLVSISDRASSGVYQDKGIPALEEWLTSALTTPFELETRLIPDEQAIIEQTLCE LVDEMSCHLVLTTGGTGPARRDVTPDATLAVADREMPGFGEQMRQISLHFVPTAILSRQV GVIRKQALILNLPGQPKSIKETLEGVKDAAGNVVVHGIFASVPYCIQLLEGPYVETAPEV VAAFRPKSARRDVSE >gi|296493234|gb|ADTK01000267.1| GENE 
9 9669 - 10235 706 188 aa, chain - ## HITS:1 COG:ECs0010 KEGG:ns NR:ns ## COG: ECs0010 COG1584 # Protein_GI_number: 15829264 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 188 1 188 188 299 100.0 2e-81 MGNTKLANPAPLGLMGFGMTTILLNLHNVGYFALDGIILAMGIFYGGIAQIFAGLLEYKK GNTFGLTAFTSYGSFWLTLVAILLMPKLGLTDAPNAQFLGVYLGLWGVFTLFMFFGTLKG ARVLQFVFFSLTVLFALLAIGNIAGNAAIIHFAGWIGLICGASAIYLAMGEVLNEQFGRT VLPIGESH >gi|296493234|gb|ADTK01000267.1| GENE 10 10384 - 11097 588 237 aa, chain - ## HITS:1 COG:yaaW KEGG:ns NR:ns ## COG: yaaW COG4735 # Protein_GI_number: 16128005 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 237 1 237 237 394 99.0 1e-110 MNVNYLNDSDLDFLQHCSEEQLANFARLLTHNEKGKTRLSSVLMRNELFKSMEGHPEQHR RNWQLIAGELQHFGGDSIANKLRGHGKLYRAILLDVSKRLKLKADKEMSTFEIEQQLLEQ FLRNTWKNMDEEHKQEFLHAVDARVNELEELLPLLMKDKLLAKGVSHLLSSQLTRILRTH AAMSVLGHGLLRGAGLGGPVGAALNGVKAVSGSAYRVTIPAVLQIACLRRMVSATQV >gi|296493234|gb|ADTK01000267.1| GENE 11 11123 - 11527 412 134 aa, chain - ## HITS:1 COG:no KEGG:ECO103_0013 NR:ns ## KEGG: ECO103_0013 # Name: yaaI # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 134 1 134 134 246 100.0 2e-64 MKSVFTISASLAISLMLCCTAQANDHKILGVIAMPRNETNDLALKLPVCRIVKRIQLSAD HGDLQLSGASVYFKAARSASQSLNIPSEIKEGQTTDWININSDNDNKRCVSKITFSGHTV NSSDMATLKIIGDD >gi|296493234|gb|ADTK01000267.1| GENE 12 11904 - 13820 2266 638 aa, chain + ## HITS:1 COG:ECs0014 KEGG:ns NR:ns ## COG: ECs0014 COG0443 # Protein_GI_number: 15829268 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Escherichia coli O157:H7 # 1 638 1 638 638 1103 99.0 0 MGKIIGIDLGTTNSCVAIMDGTTPRVLENAEGDRTTPSIIAYTQDGETLVGQPAKRQAVT NPQNTLFAIKRLIGRRFQDEEVQRDVSIMPFKIIAADNGDAWVEVKGQKMAPPQISAEVL KKMKKTAEDYLGEPVTEAVITVPAYFNDAQRQATKDAGRIAGLEVKRIINEPTAAALAYG 
LDKGTGNRTIAVYDLGGGTFDISIIEIDEVDGEKTFEVLATNGDTHLGGEDFDSRLINYL VEEFKKDQGIDLRNDPLAMQRLKEAAEKAKIELSSAQQTDVNLPYITADATGPKHMNIKV TRAKLESLVEDLVNRSIEPLKVALQDAGLSVSDIDDVILVGGQTRMPMVQKKVAEFFGKE PRKDVNPDEAVAIGAAVQGGVLTGDVKDVLLLDVTPLSLGIETMGGVMTTLIAKNTTIPT KHSQVFSTAEDNQSAVTIHVLQGERKRAADNKSLGQFNLDGINPAPRGMPQIEVTFDIDA DGILHVSAKDKNSGKEQKITIKASSGLNEDEIQKMVRDAEANAEADRKFEELVQARNQGD HLLHSTRKQVEEAGDKLPADDKTAIESALTALETALKGEDKAAIEAKMQELAQVSQKLIE IAQQQHAQQQTAGADASANNAKDDDVVDAEFEEVKDKK >gi|296493234|gb|ADTK01000267.1| GENE 13 13909 - 15039 1122 376 aa, chain + ## HITS:1 COG:ECs0015 KEGG:ns NR:ns ## COG: ECs0015 COG0484 # Protein_GI_number: 15829269 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Escherichia coli O157:H7 # 1 376 1 376 376 698 100.0 0 MAKQDYYEILGVSKTAEEREIKKAYKRLAMKYHPDRNQGDKEAEAKFKEIKEAYEVLTDS QKRAAYDQYGHAAFEQGGMGGGGFGGGADFSDIFGDVFGDIFGGGRGRQRAARGADLRYN MELTLEEAVRGVTKEIRIPTLEECDVCHGSGAKPGTQPQTCPTCHGSGQVQMRQGFFAVQ QTCPHCQGRGTLIKDPCNKCHGHGRVERSKTLSVKIPAGVDTGDRIRLAGEGEAGEHGAP AGDLYVQVQVKQHPIFEREGNNLYCEVPINFAMAALGGEIEVPTLDGRVKLKVPGETQTG KLFRMRGKGVKSVRGGAQGDLLCRVVVETPVGLNEKQKQLLQELQESFGGPTGEHNSPRS KSFFDGVKKFFDDLTR Prediction of potential genes in microbial genomes Time: Mon May 16 15:51:43 2011 Seq name: gi|296493233|gb|ADTK01000268.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont846.6, whole genome shotgun sequence Length of sequence - 36300 bp Number of predicted genes - 34, with homology - 34 Number of transcription units - 12, operones - 6 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 15 - 167 155 ## ECO103_0016 regulatory protein MokC for HokC + Prom 663 - 722 3.9 2 2 Op 1 5/0.000 + CDS 753 - 1919 1245 ## COG3004 Na+/H+ antiporter + Term 1935 - 1966 2.4 3 2 Op 2 . + CDS 1985 - 2884 791 ## COG0583 Transcriptional regulator 4 3 Tu 1 . 
- CDS 2923 - 3348 241 ## G2583_0019 hypothetical protein - Prom 3369 - 3428 9.6 5 4 Op 1 . - CDS 3663 - 3881 66 ## ECUMN_0018 hypothetical protein 6 4 Op 2 . - CDS 3894 - 4154 86 ## COG3188 P pilus assembly protein, porin PapC - Prom 4327 - 4386 2.9 - Term 4316 - 4350 -0.1 7 5 Tu 1 . - CDS 4443 - 5039 261 ## COG3188 P pilus assembly protein, porin PapC 8 6 Op 1 . - CDS 5472 - 5759 107 ## EcE24377A_0020 hypothetical protein 9 6 Op 2 . - CDS 5783 - 6592 -227 ## SSON_0027 hypothetical protein - Prom 6616 - 6675 9.1 - Term 7300 - 7340 2.0 10 7 Tu 1 . - CDS 7376 - 7639 422 ## PROTEIN SUPPORTED gi|15799705|ref|NP_285717.1| 30S ribosomal protein S20 - Prom 7680 - 7739 3.6 + Prom 7886 - 7945 5.2 11 8 Op 1 16/0.000 + CDS 7968 - 8909 391 ## PROTEIN SUPPORTED gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 12 8 Op 2 16/0.000 + CDS 8952 - 11768 3016 ## COG0060 Isoleucyl-tRNA synthetase 13 8 Op 3 7/0.000 + CDS 11768 - 12262 556 ## COG0597 Lipoprotein signal peptidase + Prom 12284 - 12343 1.8 14 8 Op 4 7/0.000 + CDS 12387 - 12836 257 ## PROTEIN SUPPORTED gi|225086978|ref|YP_002658248.1| ribosomal protein S2 15 8 Op 5 3/0.000 + CDS 12838 - 13788 440 ## PROTEIN SUPPORTED gi|227371337|ref|ZP_03854821.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase; SSU ribosomal protein S1P + Term 13817 - 13850 2.1 16 8 Op 6 3/0.000 + CDS 13854 - 14768 747 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase + Term 14785 - 14824 3.3 + Prom 14789 - 14848 3.9 17 8 Op 7 8/0.000 + CDS 14935 - 15756 878 ## COG0289 Dihydrodipicolinate reductase + Term 15806 - 15849 1.0 + Prom 16123 - 16182 5.5 18 8 Op 8 24/0.000 + CDS 16317 - 17360 959 ## COG0505 Carbamoylphosphate synthase small subunit 19 8 Op 9 . + CDS 17378 - 20599 4217 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) + Term 20610 - 20644 1.2 20 9 Tu 1 . - CDS 20607 - 20825 146 ## ECO103_0035 hypothetical protein - Prom 20994 - 21053 2.8 + Prom 20612 - 20671 4.8 21 10 Tu 1 . 
+ CDS 20860 - 21255 315 ## ECO103_0036 DNA-binding transcriptional activator CaiF + Term 21363 - 21399 3.0 - Term 21105 - 21137 -1.0 22 11 Op 1 3/0.000 - CDS 21341 - 21931 416 ## COG0663 Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily 23 11 Op 2 5/0.000 - CDS 21937 - 22830 1058 ## COG1024 Enoyl-CoA hydratase/carnithine racemase 24 11 Op 3 4/0.000 - CDS 22831 - 24384 1191 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II 25 11 Op 4 8/0.000 - CDS 24458 - 25675 1404 ## COG1804 Predicted acyl-CoA transferases/carnitine dehydratase - Prom 25695 - 25754 2.2 26 11 Op 5 4/0.000 - CDS 25804 - 26946 1386 ## COG1960 Acyl-CoA dehydrogenases 27 11 Op 6 . - CDS 26977 - 28491 1596 ## COG1292 Choline-glycine betaine transporter - Prom 28515 - 28574 10.8 + Prom 28828 - 28887 3.9 28 12 Op 1 29/0.000 + CDS 28965 - 29735 874 ## COG2086 Electron transfer flavoprotein, beta subunit 29 12 Op 2 9/0.000 + CDS 29750 - 30691 917 ## COG2025 Electron transfer flavoprotein, alpha subunit 30 12 Op 3 12/0.000 + CDS 30742 - 32028 1306 ## COG0644 Dehydrogenases (flavoproteins) 31 12 Op 4 4/0.000 + CDS 32025 - 32312 361 ## COG2440 Ferredoxin-like protein 32 12 Op 5 4/0.000 + CDS 32370 - 33701 1491 ## COG0477 Permeases of the major facilitator superfamily + Prom 33770 - 33829 4.4 33 12 Op 6 7/0.000 + CDS 33866 - 34339 502 ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B) 34 12 Op 7 . 
+ CDS 34332 - 36194 1035 ## PROTEIN SUPPORTED gi|229845962|ref|ZP_04466074.1| 30S ribosomal protein S2 Predicted protein(s) >gi|296493233|gb|ADTK01000268.1| GENE 1 15 - 167 155 50 aa, chain - ## HITS:1 COG:no KEGG:ECO103_0016 NR:ns ## KEGG: ECO103_0016 # Name: mokC # Def: regulatory protein MokC for HokC # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 50 20 69 69 74 100.0 1e-12 MKQHKAMIVALIVICITAVVAALVTRKDLCEVHIRTGQTEVAVFTAYESE >gi|296493233|gb|ADTK01000268.1| GENE 2 753 - 1919 1245 388 aa, chain + ## HITS:1 COG:ECs0017 KEGG:ns NR:ns ## COG: ECs0017 COG3004 # Protein_GI_number: 15829271 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter # Organism: Escherichia coli O157:H7 # 1 388 1 388 388 632 99.0 0 MKHLHRFFSSDASGGIILIIAAILAMIMANSGATSGWYHDFLETPVQLRVGSLEINKNML LWINDALMAVFFLLVGLEVKRELMQGSLASLRQAAFPVIAAIGGMIVPALLYLAFNYADP ITREGWAIPAATDIAFALGVLALLGSRVPLALKIFLMALAIIDDLGAIIIIALFYTNDLS MASLGVAAVAIAVLAVLNLCGVRRTGVYILVGVVLWTAVLKSGVHATLAGVIVGFFIPLK EKHGRSPAKRLEHVLHPWVAYLILPLFAFANAGVSLQGVTLDGLTSILPLGIIAGLLIGK PLGISLFCWLALRLKLAHLPEGTTYQQIMAVGILCGIGFTMSIFIASLAFGSVDPELINW AKLGILVGSISSAVIGYSWLRVRLRPSV >gi|296493233|gb|ADTK01000268.1| GENE 3 1985 - 2884 791 299 aa, chain + ## HITS:1 COG:nhaR KEGG:ns NR:ns ## COG: nhaR COG0583 # Protein_GI_number: 16128014 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 299 3 301 301 627 99.0 1e-180 MSHINYNHLYYFWHVYKEGSVVGAAEALYLTPQTITGQIRALEERLQGKLFKRKGRGLEP SELGELVYRYADKMFTLSQEMLDIVNYRKESNLLFDVGVADALSKRLVSSVLNAAVVEGE PIHLRCFESTHEMLLEQLSQHKLDMIISDCPIDSTQQEGLFSVRIGECGVSFWCTNPPPE KPFPACLEERRLLIPGRRSMLGRKLLNWFNSQGLNVEILGEFDDAALMKAFGAMHNAIFV APTLYAYDFYADKTVVEIGRVENVMEEYHAIFAERMIQHPAVQRICNTDYSALFSPAAR >gi|296493233|gb|ADTK01000268.1| GENE 4 2923 - 3348 241 141 aa, chain - ## HITS:1 COG:no KEGG:G2583_0019 NR:ns ## KEGG: G2583_0019 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 141 179 319 319 282 100.0 
3e-75 MTCGFEDQSIDTGRFVLSRINNENYQFHHIRFDCIQGEQGELVFEPSDVEYYFEPVTIQE SGTSILKNELENSEDGAGQIGFQLSNDGTNEIQYGKSNYYSFQHPHEGSNQIPLFIRPRT YGNNVSSGQIMSRVKIVVMYN >gi|296493233|gb|ADTK01000268.1| GENE 5 3663 - 3881 66 72 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_0018 NR:ns ## KEGG: ECUMN_0018 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 52 1 52 319 105 92.0 7e-22 MKWLLLITLSLYSFIVQSAPCALTNVGEQRGTYILKPLSMKGNLTAEKLKTPEISLSVQT CKRQPTPLKFIF >gi|296493233|gb|ADTK01000268.1| GENE 6 3894 - 4154 86 86 aa, chain - ## HITS:1 COG:ECs0022 KEGG:ns NR:ns ## COG: ECs0022 COG3188 # Protein_GI_number: 15829276 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli O157:H7 # 1 86 731 816 816 184 98.0 4e-47 MQFTTDQRKPWYIQALRPDGSPLTFGYDVLDLQENNIGVVGQGSRLFIRVDEIPTGIKVA LNDEQNLFCTITFQHVIDENKTYICQ >gi|296493233|gb|ADTK01000268.1| GENE 7 4443 - 5039 261 198 aa, chain - ## HITS:1 COG:ECs0022 KEGG:ns NR:ns ## COG: ECs0022 COG3188 # Protein_GI_number: 15829276 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli O157:H7 # 1 198 436 633 816 358 96.0 3e-99 MAAWRYASQDYRTFSDHLYENDKHYHQSDYDDFYDIGRKNSLSANIMQPLSNNLGNVSLS ALWRNYWGRSGNAKDYQFSYSNNWQHISYTFSASQSYDENNKEEERFNLFISIPFYWGDD IAKTRHQINLSNSTSFSKDGYSSNNTGITGIAGEHDQLNYGIYVNQQQQNNDTSLGTNLS WRTPIAIIDGSYSHSKNA >gi|296493233|gb|ADTK01000268.1| GENE 8 5472 - 5759 107 95 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_0020 NR:ns ## KEGG: EcE24377A_0020 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 95 346 440 440 189 100.0 2e-47 MPGRWTTQLVNKHLGYRYTGVFKTLASIDDKPSRFEILIPLVQTLVRDNVKLNNDVYKEL NKFMHDYDKTSSEMRKYLKSINECMFLMKNIAHQN >gi|296493233|gb|ADTK01000268.1| GENE 9 5783 - 6592 -227 269 aa, chain - ## HITS:1 COG:no KEGG:SSON_0027 
NR:ns ## KEGG: SSON_0027 # Name: not_defined # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 258 1 258 373 449 95.0 1e-125 MDVFDDRTSECLSEEIKIYQECHEKFVNFLKANFNQEIYPQLYTPEIFYEACRNLQSFYA HQETSQNAKYSAIDKKKSHLNQEIKKLIKKNIFPELYNEQCNKIPSSSTDDNQQITWQNF KTSNSAYSKPCGKLSLFKSVPSRLIEKSAYCSTENMATNKFDVVFSYCGDNIKEFILLLP YNKSLEMYELNDQKIQYLTAPNINIHKMSLSNITIEKSNLSYGYYFGCVLSNISCFESDL SNIIFSNGEINNLFIKKATYLEHHSLIPE >gi|296493233|gb|ADTK01000268.1| GENE 10 7376 - 7639 422 87 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15799705|ref|NP_285717.1| 30S ribosomal protein S20 [Escherichia coli O157:H7 EDL933] # 1 87 1 87 87 167 98 1e-40 MANIKSAKKRAIQSEKARKHNASRRSMMRTFIKKVYAAIEAGDKAAAQKAFNEMQPIVDR QAAKGLIHKNKAARHKANLTAQINKLA >gi|296493233|gb|ADTK01000268.1| GENE 11 7968 - 8909 391 313 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 [Bacillus selenitireducens MLS10] # 1 307 1 312 317 155 33 4e-37 MKLIRGIHNLSQAPQEGCVLTIGNFDGVHRGHRALLQGLQEEGRKRNLPVMVMLFEPQPL ELFATDKAPARLTRLREKLRYLAECGVDYVLCVRFDRRFAALTAQNFISDLLVKHLRVKF LAVGDDFRFGAGREGDFLLLQKAGMEYGFDITSTQTFCEGGVRISSTAVRQALADDNLAL AESLLGHPFAISGRVVHGDELGRTIGFPTANVPLRRQVSPVKGVYAVEVLGLGEKPLPGV ANIGTRPTVAGIRQQLEVHLLDVAMDLYGRHIQVVLRKKIRNEQRFASLDELKAQIARDE LTAREFFGLTKPA >gi|296493233|gb|ADTK01000268.1| GENE 12 8952 - 11768 3016 938 aa, chain + ## HITS:1 COG:ileS KEGG:ns NR:ns ## COG: ileS COG0060 # Protein_GI_number: 16128020 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Isoleucyl-tRNA synthetase # Organism: Escherichia coli K12 # 1 938 1 938 938 1956 100.0 0 MSDYKSTLNLPETGFPMRGDLAKREPGMLARWTDDDLYGIIRAAKKGKKTFILHDGPPYA NGSIHIGHSVNKILKDIIVKSKGLSGYDSPYVPGWDCHGLPIELKVEQEYGKPGEKFTAA EFRAKCREYAATQVDGQRKDFIRLGVLGDWSHPYLTMDFKTEANIIRALGKIIGNGHLHK GAKPVHWCVDCRSALAEAEVEYYDKTSPSIDVAFQAVDQDALKAKFAVSNVNGPISLVIW TTTPWTLPANRAISIAPDFDYALVQIDGQAVILAKDLVESVMQRIGVTDYTILGTVKGAE LELLRFTHPFMGFDVPAILGDHVTLDAGTGAVHTAPGHGPDDYVIGQKYGLETANPVGPD 
GTYLPGTYPTLDGVNVFKANDIVVALLQEKGALLHVEKMQHSYPCCWRHKTPIIFRATPQ WFVSMDQKGLRAQSLKEIKGVQWIPDWGQARIESMVANRPDWCISRQRTWGVPMSLFVHK DTEELHPRTLELMEEVAKRVEVDGIQAWWDLDAKEILGDEADQYVKVPDTLDVWFDSGST HSSVVDVRPEFAGHAADMYLEGSDQHRGWFMSSLMISTAMKGKAPYRQVLTHGFTVDGQG RKMSKSIGNTVSPQDVMNKLGADILRLWVASTDYTGEMAVSDEILKRAADSYRRIRNTAR FLLANLNGFDPAKDMVKPEEMVVLDRWAVGCAKAAQEDILKAYEAYDFHEVVQRLMRFCS VEMGSFYLDIIKDRQYTAKADSVARRSCQTALYHIAEALVRWMAPILSFTADEVWGYLPG EREKYVFTGEWYEGLFGLADSEAMNDAFWDELLKVRGEVNKVIEQARADKKVGGSLEAAV TLYAEPELSAKLTALGDELRFVLLTSGATVADYNDAPADAQQSEVLKGLKVALSKAEGEK CPRCWHYTQDVGKVAEHAEICGRCVSNVAGDGEKRKFA >gi|296493233|gb|ADTK01000268.1| GENE 13 11768 - 12262 556 164 aa, chain + ## HITS:1 COG:lspA KEGG:ns NR:ns ## COG: lspA COG0597 # Protein_GI_number: 16128021 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Lipoprotein signal peptidase # Organism: Escherichia coli K12 # 1 164 1 164 164 305 99.0 2e-83 MSQSICSTGLRWLWLVVVVLIIDLGSKYLILQNFALGDTVPLFPSLNLHYARNYGAAFSF LADSGGWQRWFFAGIAIGISVILAVMMYRSKATQKLNNIAYALIIGGALGNLFDRLWHGF VVDMIDFYVGDWHFATFNLADTAICVGAALIVLEGFLPSKAKKQ >gi|296493233|gb|ADTK01000268.1| GENE 14 12387 - 12836 257 149 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225086978|ref|YP_002658248.1| ribosomal protein S2 [gamma proteobacterium NOR5-3] # 1 144 1 143 148 103 36 1e-21 MSESVQSNSAVLVHFTLKLDDGTTAESTRNNGKPALFRLGDASLSEGLEQHLLGLKVGDK TTFSLEPDAAFGVPSPDLIQYFSRREFMDAGEPEIGAIMLFTAMDGSEMPGVIREINGDS ITVDFNHPLAGQTVHFDIEVLEIDPALEA >gi|296493233|gb|ADTK01000268.1| GENE 15 12838 - 13788 440 316 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227371337|ref|ZP_03854821.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase; SSU ribosomal protein S1P [Veillonella parvula DSM 2008] # 1 298 1 287 632 174 34 9e-43 MQILLANPRGFCAGVDRAISIVENALAIYGAPIYVRHEVVHNRYVVDSLRERGAIFIEQI SEVPDGAILIFSAHGVSQAVRNEAKSRDLTVFDATCPLVTKVHMEVARASRRGEESILIG HAGHPEVEGTMGQYSNPEGGMYLVESPDDVWKLTVKNEEKLSFMTQTTLSVDDTSDVIDA 
LRKRFPKIVGPRKDDICYATTNRQEAVRALAEQAEVVLVVGSKNSSNSNRLAELAQRMGK SAFLIDDAKDIQEEWVKEVKCIGVTAGASAPDILVQNVVARLQQLGGGEAIPLEGREENI VFEVPKELRVDIREVD >gi|296493233|gb|ADTK01000268.1| GENE 16 13854 - 14768 747 304 aa, chain + ## HITS:1 COG:yaaF KEGG:ns NR:ns ## COG: yaaF COG1957 # Protein_GI_number: 16128024 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Escherichia coli K12 # 1 304 1 304 304 579 99.0 1e-165 MRLPIFLDTDPGIDDAVAIAAAIFAPELDLQLMTTVAGNVSVEKTTRNALQLLHFWNAEI PLAQGAAVPLVRAPRDAASVHGESGMAGYDFVEHNRKPLGIPAFLAIRDALMRAPEPVTL VAIGPLTNIALLLSQCPECKPYIRRLVIMGGSAGRGNCTPNAEFNIAADPEAAACVFRSG IEIVMCGLDVTNQAILTPDYLATLPELNRTGKMLHALFSHYRSGSMQSGLRMHDLCAIAW LVRPDLFTLKPCFVAVETQGEFTSGTTVVDIDGCLGKPANVKVALDLDVKGFQQWVAEVL ALAS >gi|296493233|gb|ADTK01000268.1| GENE 17 14935 - 15756 878 273 aa, chain + ## HITS:1 COG:dapB KEGG:ns NR:ns ## COG: dapB COG0289 # Protein_GI_number: 16128025 # Func_class: E Amino acid transport and metabolism # Function: Dihydrodipicolinate reductase # Organism: Escherichia coli K12 # 1 273 1 273 273 470 99.0 1e-132 MHDANIRVAIAGAGGRMGRQLIQAALALEGVQLGAALEREGSSLLGSDAGELAGAGKTGV TVQSSLDAVKDDFDVFIDFTRPEGTLNHLAFCRQHGKGMVIGTTGFDEAGKQAIRDAAAD IAIVFAANFSVGVNVMLKLLEKAAKVMGDYTDIEIIEAHHRHKVDAPSGTALAMGEAIAH ALDKDLKDCAVYSREGHTGERVPGTIGFATVRAGDIVGEHTAMFADIGERLEITHKASSR MTFANGAVRSALWLSGKESGLFDMRDVLDLNSL >gi|296493233|gb|ADTK01000268.1| GENE 18 16317 - 17360 959 347 aa, chain + ## HITS:1 COG:ECs0035 KEGG:ns NR:ns ## COG: ECs0035 COG0505 # Protein_GI_number: 15829289 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase small subunit # Organism: Escherichia coli O157:H7 # 1 347 36 382 382 716 100.0 0 MTGYQEILTDPSYSRQIVTLTYPHIGNVGTNDADEESSQVHAQGLVIRDLPLIASNFRNT EDLSSYLKRHNIVAIADIDTRKLTRLLREKGAQNGCIIAGDNPDAALALEKARAFPGLNG MDLAKEVTTAEAYSWTQGSWTLTGGLPEAKKEDELPFHVVAYDFGAKRNILRMLVDRGCR 
LTIVPAQTSAEDVLKMNPDGIFLSNGPGDPAPCDYAITAIQKFLETDIPVFGICLGHQLL ALASGAKTVKMKFGHHGGNHPVKDVEKNVVMITAQNHGFAVDEATLPANLRVTHKSLFDG TLQGIHRTDKPAFSFQGHPEASPGPHDAAPLFDHFIELIEQYRKTAK >gi|296493233|gb|ADTK01000268.1| GENE 19 17378 - 20599 4217 1073 aa, chain + ## HITS:1 COG:ECs0036 KEGG:ns NR:ns ## COG: ECs0036 COG0458 # Protein_GI_number: 15829290 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Escherichia coli O157:H7 # 1 1073 1 1073 1073 2123 99.0 0 MPKRTDIKSILILGAGPIVIGQACEFDYSGAQACKALREEGYRVILVNSNPATIMTDPEM ADATYIEPIHWEVVRKIIEKERPDAVLPTMGGQTALNCALELERQGVLEEFGVTMIGATA DAIDKAEDRRRFDVAMKKIGLETARSGIAHTMEEALAVAADVGFPCIIRPSFTMGGSGGG IAYNREEFEEICARGLDLSPTKELLIDESLIGWKEYEMEVVRDKNDNCIIVCSIENFDAM GIHTGDSITVAPAQTLTDKEYQIMRNASMAVLREIGVETGGSNVQFAVNPKNGRLIVIEM NPRVSRSSALASKATGFPIAKVAAKLAVGYTLDELMNDITGGRTPASFEPSIDYVVTKIP RFNFEKFAGANDRLTTQMKSVGEVMAIGRTQQESLQKALRGLEVGATGFDPKVSLDDPEA LTKIRRELKDAGAERIWYIADAFRAGLSVDGVFNLTNIDRWFLVQIEELVRLEEKVAEVG ITGLNAEFLRQLKRKGFADARLAKLAGVREAEIRKLRDQYDLHPVYKRVDTCAAEFATDT AYMYSTYEEECEANPSTDREKIMVLGGGPNRIGQGIEFDYCCVHASLALREDGYETIMVN CNPETVSTDYDTSDRLYFEPVTLEDVLEIVRIEKPKGVIVQYGGQTPLKLARALEAAGVP VIGTSPDAIDRAEDRERFQHAVERLKLKQPANATVTAIEMAVEKAKEIGYPLVVRPSYVL GGRAMEIVYDEADLRRYFQTAVSVSNDAPVLLDHFLDDAVEVDVDAICDGEMVLIGGIME HIEQAGVHSGDSACSLPAYTLSQEIQDVMRQQVQKLAFELQVRGLMNVQFAVKNNEVYLI EVNPRAARTVPFVSKATGVPLAKVAARVMAGKSLAEQGVTKEVIPPYYSVKEVVLPFNKF PGVDPLLGPEMRSTGEVMGVGRTFAEAFAKAQLGSNSTMKKHGRALLSVREGDKERVVDL AAKLLKQGFELDATHGTAIVLGEAGINPRLVNKVHEGRPHIQDRIKNGEYTYIINTTSGR RAIEDSRVIRRSALQYKVHYDTTLNGGFATAMALNADATEKVISVQEMHAQIK >gi|296493233|gb|ADTK01000268.1| GENE 20 20607 - 20825 146 72 aa, chain - ## HITS:1 COG:no KEGG:ECO103_0035 NR:ns ## KEGG: ECO103_0035 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 72 1 72 72 135 100.0 7e-31 MTRFEAIKQGHIKIVDISIVCNFTVDKCELNPAYVIKNIDSPKDLLNGQKKRSSSENRIT YSIKLADEKYPP 
>gi|296493233|gb|ADTK01000268.1| GENE 21 20860 - 21255 315 131 aa, chain + ## HITS:1 COG:no KEGG:ECO103_0036 NR:ns ## KEGG: ECO103_0036 # Name: caiF # Def: DNA-binding transcriptional activator CaiF # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 131 1 131 131 249 100.0 2e-65 MCEGYVEKPLYLLIAEWMMAENRWVIAREISIHFDIEHSKAVNTLTYILSEVAEISCEVK MIPNKLEGRGCQCQRLVKVVDIDEQIYARLRNNSRDKLVGVRKTPRIPAVPLTELNREQK WQMMLSKSMRR >gi|296493233|gb|ADTK01000268.1| GENE 22 21341 - 21931 416 196 aa, chain - ## HITS:1 COG:ECs0038 KEGG:ns NR:ns ## COG: ECs0038 COG0663 # Protein_GI_number: 15829292 # Func_class: R General function prediction only # Function: Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily # Organism: Escherichia coli O157:H7 # 1 196 8 203 203 389 98.0 1e-108 MSYYAFEGLIPVVHPTAFVHPSAVLIGDVIVGAGVYIGPLASLRGDYGRLIVQAGANIQD GCIMHGYCDTDTIVGENGHIGHGAILHGCVIGRDALVGMNSVIMDGAVIGEESIVAAMSF VKAGFHGEKRQLLMGTPARAVRSVSDDELHWKRLNTKEYQDLVGRCHASLHETQPLRQME ENRPRLQGTTDVTPKR >gi|296493233|gb|ADTK01000268.1| GENE 23 21937 - 22830 1058 297 aa, chain - ## HITS:1 COG:caiD KEGG:ns NR:ns ## COG: caiD COG1024 # Protein_GI_number: 16128030 # Func_class: I Lipid transport and metabolism # Function: Enoyl-CoA hydratase/carnithine racemase # Organism: Escherichia coli K12 # 1 297 1 297 297 580 98.0 1e-166 MKQQGTTLSANNHAIKQYAFFAGMLSSLKKQKWRKGMSESLHLTRNGSILEITLDRPKAN AIDAKTSFEMGEVFLNFRDDPQLRVAIITGAGEKFFSAGWDLKAAAEGEAPDADFGPGGF AGLTEIFNLDKPVIAAVNGYAFGGGFELALAADFIVCADNASFALPEAKLGIVPDSGGVL RLPKILPPAIVNEMVMTGRRMGTEEALRWGIVNRVVSQAELMDNARELAQQLVNSAPLAI AALKEIYRTTSEMPVEEAYRYIRSGVLKHYPSVLHSEDAVEGPLAFAEKRDPVWKGR >gi|296493233|gb|ADTK01000268.1| GENE 24 22831 - 24384 1191 517 aa, chain - ## HITS:1 COG:ECs0040 KEGG:ns NR:ns ## COG: ECs0040 COG0318 # Protein_GI_number: 15829294 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Escherichia coli O157:H7 # 1 517 
6 522 522 1083 99.0 0 MDIIGGQHLRQMWDDLADVYGHKTALICESSGGVVNRYSYLELNQEINRTANLFYTLGIR KGDKVALHLDNCPEFIFCWFGLAKIGAIMVPINARLLREESAWILQNSQACLLVTSAQFY PMYQQIQQEDATQLRHICLTDVALPADDGVSSFTQLKNQQPATLCYAPPLLTDDTAEILF TSGTTSRPKGVVITHYNLRFAGYYSAWQCALRDDDVYLTVMPAFHIDCQCTAAMAAFSAG ATFVLVEKYSARAFWGQVQKYRATITECIPMMIRTLMVQPPSANDRQHRLREVMFYLNLS EQEKDAFCERFGVRLLTSYGMTETIVGIIGDRPGDKRRWPSIGRAGFCYEAEIRDDHNRP LPAGEIGEICIKGVPGKTIFKEYFLNPKATAKVLEADGWLHTGDTGYCDEEGFFYFVDRR CNMIKRGGENVSCVELENIIATHPKIQDIVVVGIKDSIRDEAIKAFVVLNEGETLSEEEF FRFCEQNMAKFKVPSYLEIRKDLPRNCSGKIIRKNLK >gi|296493233|gb|ADTK01000268.1| GENE 25 24458 - 25675 1404 405 aa, chain - ## HITS:1 COG:ECs0041 KEGG:ns NR:ns ## COG: ECs0041 COG1804 # Protein_GI_number: 15829295 # Func_class: C Energy production and conversion # Function: Predicted acyl-CoA transferases/carnitine dehydratase # Organism: Escherichia coli O157:H7 # 1 405 1 405 405 832 99.0 0 MDHLPMPKFGPLAGLRVVFSGIEIAGPFAGQMFAEWGAEVIWIENVAWADTIRVQPNYPQ LSRRNLHALSLNIFKDEGREAFLKLMETTDIFIEASKGPAFARRGITDEVLWQHNPKLVI AHLSGFGQYGTEEYTNLPAYNTIAQAFSGYLIQNGDVDQPMPAFPYTADYFSGLTATTAA LAALHKARETGKGESIDIAMYEVMLRMGQYFMMDYFNGGEMCPRMSKGKDPYYAGCGLYK CADGYIVMELVGITQIEECFKDIGLAHLLSTPEIPEGTQLIHRIECPYGPLVEEKLDAWL AAHTIAEVKERFAELNIACAKVLTVPELESNPQYVARESITQWQTMDGRTCKGPNIMPKF KNNPGQIWRGMPSHGMDTAAILKNIGYSENDIQELVSKGLAKVED >gi|296493233|gb|ADTK01000268.1| GENE 26 25804 - 26946 1386 380 aa, chain - ## HITS:1 COG:ECs0042 KEGG:ns NR:ns ## COG: ECs0042 COG1960 # Protein_GI_number: 15829296 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Escherichia coli O157:H7 # 1 380 1 380 380 788 100.0 0 MDFNLNDEQELFVAGIRELMASENWEAYFAECDRDSVYPERFVKALADMGIDSLLIPEEH GGLDAGFVTLAAVWMELGRLGAPTYVLYQLPGGFNTFLREGTQEQIDKIMAFRGTGKQMW NSAITEPGAGSDVGSLKTTYTRRNGKIYLNGSKCFITSSAYTPYIVVMARDGASPDKPVY TEWFVDMSKPGIKVTKLEKLGLRMDSCCEITFDDVELDEKDMFGREGNGFNRVKEEFDHE RFLVALTNYGTAMCAFEDAARYANQRVQFGEAIGRFQLIQEKFAHMAIKLNSMKNMLYEA AWKADNGTITSGDAAMCKYFCANAAFEVVDSAMQVLGGVGIAGNHRISRFWRDLRVDRVS 
GGSDEMQILTLGRAVLKQYR >gi|296493233|gb|ADTK01000268.1| GENE 27 26977 - 28491 1596 504 aa, chain - ## HITS:1 COG:caiT KEGG:ns NR:ns ## COG: caiT COG1292 # Protein_GI_number: 16128034 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Choline-glycine betaine transporter # Organism: Escherichia coli K12 # 1 504 1 504 504 934 99.0 0 MKNEKRKTGIDPKVFFPPLIIVGILCWLTVRDLDAANVVINAVFSYVTNVWGWAFEWYMV VMLFGWFWLVFGPYAKKRLGNEPPEFSTASWIFMMFASCTSAAVLFWGSIEIYYYISTPP FGLEPNSTGAKELGLAYSLFHWGPLPWATYSFLSVAFAYFFFVRKMEVIRPSSTLVPLVG EKHAKGLFGTIVDNFYLVALIFAMGTSLGLATPLVTECMQWLFGIPHTLQLDAIIITCWI ILNAICVACGLQKGVRIASDVRSYLSFLMLGWVFIVSGASFIMNYFTDSVGMLLMYLPRM LFYTDPIAKGGFPQGWTVFYWAWWVIYAIQMSIFLARISRGRTVRELCFGMVLGLTASTW ILWTVLGSNTLLLIDKNIINIPNLIEQYGVARAIIETWAALPLSTATMWGFFILCFIATV TLVNACSYTLAMSTCREVRDGEEPPLLVRIGWSILVGIIGIVLLALGGLKPIQTAIIAGG CPLFFVNIMVTLSFIKDAKQNWKD >gi|296493233|gb|ADTK01000268.1| GENE 28 28965 - 29735 874 256 aa, chain + ## HITS:1 COG:ECs0044 KEGG:ns NR:ns ## COG: ECs0044 COG2086 # Protein_GI_number: 15829298 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: Escherichia coli O157:H7 # 1 256 13 268 268 426 100.0 1e-119 MKIITCYKCVPDEQDIAVNNADGSLDFSKADAKISQYDLNAIEAACQLKQQAAEAQVTAL SVGGKALTNAKGRKDVLSRGPDELIVVIDDQFEQALPQQTASALAAAAQKAGFDLILCGD GSSDLYAQQVGLLVGEILNIPAVNGVSKIISLTADTLTVERELEDETETLSIPLPAVVAV STDINSPQIPSMKAILGAAKKPVQVWSAADIGFNAEAAWSEQQVAAPKQRERQRIVIEGD GEEQIAAFAENLRKVI >gi|296493233|gb|ADTK01000268.1| GENE 29 29750 - 30691 917 313 aa, chain + ## HITS:1 COG:fixB KEGG:ns NR:ns ## COG: fixB COG2025 # Protein_GI_number: 16128036 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Escherichia coli K12 # 1 313 1 313 313 602 100.0 1e-172 MNTFSQVWVFSDTPSRLPELMNGAQALANQINTFVLNDADGAQAIQLGANHVWKLNGKPD DRMIEDYAGVMADTIRQHGADGLVLLPNTRRGKLLAAKLGYRLKAAVSNDASTVSVQDGK ATVKHMVYGGLAIGEERIATPYAVLTISSGTFDAAQPDASRTGETHTVEWQAPAVAITRT 
ATQARQSNSVDLDKARLVVSVGRGIGSKENIALAEQLCKAIGAELACSRPVAENEKWMEH ERYVGISNLMLKPELYLAVGISGQIQHMVGANASQTIFAINKDKNAPIFQYADYGIVGDA VKILPALTAALAR >gi|296493233|gb|ADTK01000268.1| GENE 30 30742 - 32028 1306 428 aa, chain + ## HITS:1 COG:fixC KEGG:ns NR:ns ## COG: fixC COG0644 # Protein_GI_number: 16128037 # Func_class: C Energy production and conversion # Function: Dehydrogenases (flavoproteins) # Organism: Escherichia coli K12 # 1 428 1 428 428 815 100.0 0 MSEDIFDAIIVGAGLAGSVAALVLAREGAQVLVIERGNSAGAKNVTGGRLYAHSLEHIIP GFADSAPVERLITHEKLAFMTEKSAMTMDYCNGDETSPSQRSYSVLRSKFDAWLMEQAEE AGAQLITGIRVDNLVQRDGKVVGVEADGDVIEAKTVILADGVNSILAEKLGMAKRVKPTD VAVGVKELIELPKSVIEDRFQLQGNQGAACLFAGSPTDGLMGGGFLYTNENTLSLGLVCG LHHLHDAKKSVPQMLEDFKQHPAVAPLIAGGKLVEYSAHVVPEAGINMLPELVGDGVLIA GDAAGMCMNLGFTIRGMDLAIAAGEAAAKTVLSAMKSDDFSKQKLAEYRQHLESGPLRDM RMYQKLPAFLDNPRMFSGYPELAVGVARDLFTIDGSAPELMRKKILRHGKKVGFINLIKD GMKGVTVL >gi|296493233|gb|ADTK01000268.1| GENE 31 32025 - 32312 361 95 aa, chain + ## HITS:1 COG:ECs0047 KEGG:ns NR:ns ## COG: ECs0047 COG2440 # Protein_GI_number: 15829301 # Func_class: C Energy production and conversion # Function: Ferredoxin-like protein # Organism: Escherichia coli O157:H7 # 1 95 1 95 95 199 98.0 1e-51 MTSPVNVDVKLGVNKFNVDEEHPHIVVKADADKQVLELLVKACPAGLYKKQDDGSVRFDY AGCLECGTCRILGLGSALEQWEYPRGTFGVEFRYG >gi|296493233|gb|ADTK01000268.1| GENE 32 32370 - 33701 1491 443 aa, chain + ## HITS:1 COG:yaaU KEGG:ns NR:ns ## COG: yaaU COG0477 # Protein_GI_number: 16128039 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 443 1 443 443 834 100.0 0 MQPSRNFDDLKFSSIHRRILLWGSGGPFLDGYVLVMIGVALEQLTPALKLDADWIGLLGA GTLAGLFVGTSLFGYISDKVGRRKMFLIDIIAIGVISVATMFVSSPVELLVMRVLIGIVI GADYPIATSMITEFSSTRQRAFSISFIAAMWYVGATCADLVGYWLYDVEGGWRWMLGSAA IPCLLILIGRFELPESPRWLLRKGRVKECEEMMIKLFGEPVAFDEEQPQQTRFRDLFNRR 
HFPFVLFVAAIWTCQVIPMFAIYTFGPQIVGLLGLGVGKNAALGNVVISLFFMLGCIPPM LWLNTAGRRPLLIGSFAMMTLALAVLGLIPDMGIWLVVMAFAVYAFFSGGPGNLQWLYPN ELFPTDIRASAVGVIMSLSRIGTIVSTWALPIFINNYGISNTMLMGAGISLFGLLISVAF APETRGMSLAQTSNMTIRGQRMG >gi|296493233|gb|ADTK01000268.1| GENE 33 33866 - 34339 502 157 aa, chain + ## HITS:1 COG:yabF KEGG:ns NR:ns ## COG: yabF COG2249 # Protein_GI_number: 16128040 # Func_class: R General function prediction only # Function: Putative NADPH-quinone reductase (modulator of drug activity B) # Organism: Escherichia coli K12 # 1 157 20 176 176 313 100.0 1e-85 MLEQARTLEGVEIRSLYQLYPDFNIDIAAEQEALSRADLIVWQHPMQWYSIPPLLKLWID KVFSHGWAYGHGGTALHGKHLLWAVTTGGGESHFEIGAHPGFDVLSQPLQATAIYCGLNW LPPFAMHCTFICDDETLEGQARHYKQRLLEWQEAHHG >gi|296493233|gb|ADTK01000268.1| GENE 34 34332 - 36194 1035 620 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229845962|ref|ZP_04466074.1| 30S ribosomal protein S2 [Haemophilus influenzae 7P49H1] # 6 586 9 612 618 403 38 1e-112 MDSHTLIQALIYLGSAALIVPIAVRLGLGSVLGYLIAGCIIGPWGLRLVTDAESILHFAE IGVVLMLFIIGLELDPQRLWKLRAAVFGGGALQMVICGGLLGLFCMLLGLRWQVAELIGM TLALSSTAIAMQAMNERNLMVTQMGRSAFAVLLFQDIAAIPLVAMIPLLATSSASTTMGA FALSALKVAGALVLVVLLGRYVTRPALRFVARSGLREVFSAVALFLVFGFGLLLEEVGLS MAMGAFLAGVLLASSEYRHALESDIEPFKGLLLGLFFIGVGMSIDFGTLLENPLRIVILL LGFLIIKIAMLWLIARPLQVPNKQRRWFAVLLGQGSEFAFVVFGAAQMANVLEPEWAKSL TLAVALSMAATPILLVILNRLEQSSTEEAREADEIDEEQPRVIIAGFGRFGQITGRLLLS SGVKMVVLDHDPDHIETLRKFGMKVFYGDATRMDLLESAGAAKAEVLINAIDDPQTNLQL TEMVKEHFPHLQIIARARDVDHYIRLRQAGVEKPERETFEGALKTGRLALESLGLGPYEA RERADVFRRFNIQMVEEMAMVENDTKARAAVYKRTSAMLSEIITEDREHLSLIQRHGWQG TEEGKHTGNMADEPETKPSS Prediction of potential genes in microbial genomes Time: Mon May 16 15:52:08 2011 Seq name: gi|296493232|gb|ADTK01000269.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont846.7, whole genome shotgun sequence Length of sequence - 27669 bp Number of predicted genes - 22, with homology - 22 Number of transcription units - 11, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 59 - 118 
7.0 1 1 Tu 1 . + CDS 142 - 621 648 ## COG0262 Dihydrofolate reductase + Term 646 - 691 8.4 - Term 435 - 469 0.5 2 2 Op 1 8/0.000 - CDS 699 - 1541 756 ## COG0639 Diadenosine tetraphosphatase and related serine/threonine protein phosphatases 3 2 Op 2 8/0.000 - CDS 1548 - 1925 345 ## COG2967 Uncharacterized protein affecting Mg2+/Co2+ transport 4 2 Op 3 12/0.000 - CDS 1928 - 2749 693 ## COG0030 Dimethyladenosine transferase (rRNA methylation) 5 2 Op 4 13/0.000 - CDS 2746 - 3735 461 ## PROTEIN SUPPORTED gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 6 2 Op 5 16/0.000 - CDS 3735 - 5021 1389 ## COG0760 Parvulin-like peptidyl-prolyl isomerase 7 2 Op 6 . - CDS 5074 - 7428 2058 ## COG1452 Organic solvent tolerance protein OstA - Prom 7573 - 7632 2.4 + Prom 7409 - 7468 3.3 8 3 Tu 1 . + CDS 7683 - 8498 804 ## COG1076 DnaJ-domain-containing proteins 1 + Term 8600 - 8644 6.1 9 4 Op 1 7/0.000 - CDS 8615 - 9274 218 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 10 4 Op 2 5/0.400 - CDS 9286 - 12192 3624 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family 11 5 Op 1 3/1.000 - CDS 12357 - 14708 1998 ## COG0417 DNA polymerase elongation subunit (family B) - Term 14713 - 14753 5.8 12 5 Op 2 5/0.400 - CDS 14783 - 15478 680 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases - Prom 15598 - 15657 1.7 13 6 Op 1 7/0.000 - CDS 15678 - 17180 1568 ## COG2160 L-arabinose isomerase 14 6 Op 2 . - CDS 17191 - 18891 1728 ## COG1069 Ribulose kinase - Prom 19039 - 19098 4.2 + Prom 18998 - 19057 4.8 15 7 Op 1 4/0.800 + CDS 19179 - 20108 686 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 20112 - 20171 3.3 16 7 Op 2 . 
+ CDS 20194 - 20958 837 ## COG0586 Uncharacterized membrane-associated protein - Term 21008 - 21046 1.3 17 8 Op 1 11/0.000 - CDS 21072 - 21770 282 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 18 8 Op 2 14/0.000 - CDS 21754 - 23364 1598 ## COG1178 ABC-type Fe3+ transport system, permease component 19 8 Op 3 2/1.000 - CDS 23340 - 24323 1025 ## COG4143 ABC-type thiamine transport system, periplasmic component - Term 24330 - 24381 9.0 20 9 Tu 1 . - CDS 24487 - 26142 1415 ## COG4533 ABC-type uncharacterized transport system, periplasmic component + Prom 26150 - 26209 3.1 21 10 Tu 1 . + CDS 26231 - 26362 137 ## ECIAI1_0070 hypothetical protein + Term 26380 - 26426 0.1 + Prom 26389 - 26448 6.8 22 11 Tu 1 . + CDS 26482 - 27642 813 ## COG0477 Permeases of the major facilitator superfamily Predicted protein(s) >gi|296493232|gb|ADTK01000269.1| GENE 1 142 - 621 648 159 aa, chain + ## HITS:1 COG:folA KEGG:ns NR:ns ## COG: folA COG0262 # Protein_GI_number: 16128042 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Escherichia coli K12 # 1 159 1 159 159 332 100.0 1e-91 MISLIAALAVDRVIGMENAMPWNLPADLAWFKRNTLNKPVIMGRHTWESIGRPLPGRKNI ILSSQPGTDDRVTWVKSVDEAIAACGDVPEIMVIGGGRVYEQFLPKAQKLYLTHIDAEVE GDTHFPDYEPDDWESVFSEFHDADAQNSHSYCFEILERR >gi|296493232|gb|ADTK01000269.1| GENE 2 699 - 1541 756 280 aa, chain - ## HITS:1 COG:apaH KEGG:ns NR:ns ## COG: apaH COG0639 # Protein_GI_number: 16128043 # Func_class: T Signal transduction mechanisms # Function: Diadenosine tetraphosphatase and related serine/threonine protein phosphatases # Organism: Escherichia coli K12 # 1 280 1 280 280 587 100.0 1e-167 MATYLIGDVHGCYDELIALLHKVEFTPGKDTLWLTGDLVARGPGSLDVLRYVKSLGDSVR LVLGNHDLHLLAVFAGISRNKPKDRLTPLLEAPDADELLNWLRRQPLLQIDEEKKLVMAH AGITPQWDLQTAKECARDVEAVLSSDSYPFFLDAMYGDMPNNWSPELRGLGRLRFITNAF TRMRFCFPNGQLDMYSKESPEEAPAPLKPWFAIPGPVAEEYSIAFGHWASLEGKGTPEGI YALDTGCCWGGTLTCLRWEDKQYFVQPSNRHKDLGEAAAS >gi|296493232|gb|ADTK01000269.1| GENE 
3 1548 - 1925 345 125 aa, chain - ## HITS:1 COG:ECs0055 KEGG:ns NR:ns ## COG: ECs0055 COG2967 # Protein_GI_number: 15829309 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein affecting Mg2+/Co2+ transport # Organism: Escherichia coli O157:H7 # 1 125 1 125 125 241 100.0 2e-64 MINSPRVCIQVQSVYIEAQSSPDNERYVFAYTVTIRNLGRAPVQLLGRYWLITNGNGRET EVQGEGVVGVQPLIAPGEEYQYTSGAIIETPLGTMQGHYEMIDENGVPFSIDIPVFRLAV PTLIH >gi|296493232|gb|ADTK01000269.1| GENE 4 1928 - 2749 693 273 aa, chain - ## HITS:1 COG:ksgA KEGG:ns NR:ns ## COG: ksgA COG0030 # Protein_GI_number: 16128045 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Escherichia coli K12 # 1 273 1 273 273 538 100.0 1e-153 MNNRVHQGHLARKRFGQNFLNDQFVIDSIVSAINPQKGQAMVEIGPGLAALTEPVGERLD QLTVIELDRDLAARLQTHPFLGPKLTIYQQDAMTFNFGELAEKMGQPLRVFGNLPYNIST PLMFHLFSYTDAIADMHFMLQKEVVNRLVAGPNSKAYGRLSVMAQYYCNVIPVLEVPPSA FTPPPKVDSAVVRLVPHATMPHPVKDVRVLSRITTEAFNQRRKTIRNSLGNLFSVEVLTG MGIDPAMRAENISVAQYCQMANYLAENAPLQES >gi|296493232|gb|ADTK01000269.1| GENE 5 2746 - 3735 461 329 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 [Flavobacteriales bacterium ALC-1] # 7 324 9 326 346 182 34 3e-45 MVKTQRVVITPGEPAGIGPDLVVQLAQREWPVELVVCADATLLTDRAAMLGLPLTLRTYS PNSPAQPQTAGTLTLLPVALRESVTAGQLAVENGHYVVETLARACDGCLNGEFAALITGP VHKGVINDAGIPFTGHTEFFEERSQAKKVVMMLATEELRVALATTHLPLRDIADAITPAL LHEVIAILHHDLRTKFGIAEPRILVCGLNPHAGEGGHMGTEEIDTIIPLLDELRAQGMKL NGPLPADTLFQPKYLDNADAVLAMYHDQGLPVLKYQGFGRGVNITLGLPFIRTSVDHGTA LELAGRGEADVGSFITALNLAIKMIVNTQ >gi|296493232|gb|ADTK01000269.1| GENE 6 3735 - 5021 1389 428 aa, chain - ## HITS:1 COG:ECs0058 KEGG:ns NR:ns ## COG: ECs0058 COG0760 # Protein_GI_number: 15829312 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like peptidyl-prolyl isomerase # Organism: Escherichia coli O157:H7 # 1 428 1 428 428 770 99.0 0 
MKNWKTLLLGIAMIANTSFAAPQVVDKVAAVVNNGVVLESDVDGLMQSVKLNAAQARQQL PDDATLRHQIMERLIMDQIILQMGQKMGVKISDEQLDQAIANIAKQNNMTLDQMRSRLAY DGLNYNTYRNQIRKEMIISEVRNNEVRRRITILPQEVESLAQQVGNQNDASTELNLSHIL IPLPENPTSDQVNEAESQARAIVDQARNGADFGKLAIAHSADQQALNGGQMGWGRIQELP GIFAQALSTAKKGDIVGPIRSGVGFHILKVNDLRGESKNISVTEVHARHILLKPSPIMTD EQARVKLEQIAADIKSGKTTFAAAAKEFSQDPGSANQGGDLGWATADIFDPAFRDALTRL NKGQMSAPVHSSFGWHLIELLDTRNVDKTDAAQKDRAYRMLMNRKFSEEAASWMQEQRAS AYVKILSN >gi|296493232|gb|ADTK01000269.1| GENE 7 5074 - 7428 2058 784 aa, chain - ## HITS:1 COG:imp KEGG:ns NR:ns ## COG: imp COG1452 # Protein_GI_number: 16128048 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Organic solvent tolerance protein OstA # Organism: Escherichia coli K12 # 1 784 1 784 784 1553 99.0 0 MKKRIPTLLATMIATALYSQQGLAADLASQCMLGVPSYDRPLVQGDTNDLPVTINADHAK GDYPDDAVFTGSVDIMQGNSRLQADEVQLHQKEAPGQPEPVRTVDALGNVHYDDNQVILK GPKGWANLNTKDTNVWEGDYQMVGRQGRGKADLMKQRGENRYTILDNGSFTSCLPGSDTW SVVGSEIIHDREEQVAEIWNARFKVGPVPIFYSPYLQLPVGDKRRSGFLIPNAKYTTTNY FEFYLPYYWNIAPNMDATITPHYMHRRGNIMWENEFRYLSQAGAGLMELDYLPSDKVYED EHPNDDSSRRWLFYWNHSGVMDQVWRFNVDYTKVSDPSYFNDFDNKYGSSTDGYATQKFS VGYAVQNFNATVSTKQFQVFSEQNTSSYSAEPQLDVNYYQNDVGPFDTRIYGQAVHFVNT RDDMPEATRVHLEPTINLPLSNNWGSINTEAKLLATHYQQTNLDWYNSRNTTKLDESVNR VMPQFKVDGKMVFERDMEMLAPGYTQTLEPRAQYLYVPYRDQSDIYNYDSSLLQSDYSGL FRDRTYGGLDRIASANQVTTGVTSRIYDDAAVERFNISVGQIYYFTESRTGDDNITWEND DKTGSLVWAGDTYWRISERWGLRGGIQYDTRLDNVATSNSSIEYRRDEDRLVQLNYRYAS PEYIQATLPKYYSTAEQYKNGISQVGAVASWPIADRWSIVGAYYYDTNANKQADSMLGVQ YSSCCYAIRVGYERKLNGWDNDKQHAVYDNAIGFNIELRGLSSNYGLGTQEMLRSNILPY QNSL >gi|296493232|gb|ADTK01000269.1| GENE 8 7683 - 8498 804 271 aa, chain + ## HITS:1 COG:STM0094 KEGG:ns NR:ns ## COG: STM0094 COG1076 # Protein_GI_number: 16763484 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-domain-containing proteins 1 # Organism: Salmonella typhimurium LT2 # 1 271 1 270 270 504 95.0 1e-142 MQYWGKIIGVAVALLMGGGFWGVVLGLLIGHMFDKARSRKMAWFANQRERQALFFATTFE 
VMGHLTKSKGRVTEADIHIASQLMDRMNLHGASRTAAQNAFRVGKSDNYPLREKMRQFRS VCFGRFDLIRMFLEIQIQAAFADGSLHPNERAVLYVIAEELGISRAQFDQFLRMMQGGAQ FGGGYQQQTGGGNWQQAQRGPTLEDACNVLGVKPTDDATTIKRAYRKLMSEHHPDKLVAK GLPPEMMEMAKQKAQEIQQAYELIKQQKGFK >gi|296493232|gb|ADTK01000269.1| GENE 9 8615 - 9274 218 219 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 16 218 83 281 285 88 31 4e-17 MGMENYNPPQEPWLVILYQDDHIMVVNKPSGLLSVPGRLEEHKDSVMTRIQRDYPQAESV HRLDMATSGVIVVALTKAAERELKRQFREREPKKQYVARVWGHPSPAEGLVDLPLICDWP NRPKQKVCYETGKPAQTEYEVVEYAADNTARVVLKPITGRSHQLRVHMLALGHPILGDRF YASPEARAMAPRLLLHAEMLTITHPAYGNSMTFKAPADF >gi|296493232|gb|ADTK01000269.1| GENE 10 9286 - 12192 3624 968 aa, chain - ## HITS:1 COG:ECs0063 KEGG:ns NR:ns ## COG: ECs0063 COG0553 # Protein_GI_number: 15829317 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Escherichia coli O157:H7 # 1 968 1 968 968 1915 100.0 0 MPFTLGQRWISDTESELGLGTVVAVDARTVTLLFPSTGENRLYARSDSPVTRVMFNPGDT ITSHDGWQMQVEEVKEENGLLTYIGTRLDTEESGVALREVFLDSKLVFSKPQDRLFAGQI DRMDRFALRYRARKYSSEQFRMPYSGLRGQRTSLIPHQLNIAHDVGRRHAPRVLLADEVG LGKTIEAGMILHQQLLSGAAERVLIIVPETLQHQWLVEMLRRFNLRFALFDDERYAEAQH DAYNPFDTEQLVICSLDFARRSKQRLEHLCEAEWDLLVVDEAHHLVWSEDAPSREYQAIE QLAEHVPGVLLLTATPEQLGMESHFARLRLLDPNRFHDFAQFVEEQKNYRPVADAVAMLL AGNKLSNDELNMLGEMIGEQDIEPLLQAANSDSEDAQSARQELVSMLMDRHGTSRVLFRN TRNGVKGFPKRELHTIKLPLPTQYQTAIKVSGIMGARKSAEDRARDMLYPERIYQEFEGD NATWWNFDPRVEWLMGYLTSHRSQKVLVICAKAATALQLEQVLREREGIRAAVFHEGMSI IERDRAAAWFAEEDTGAQVLLCSEIGSEGRNFQFASHMVMFDLPFNPDLLEQRIGRLDRI GQAHDIQIHVPYLEKTAQSVLVRWYHEGLDAFEHTCPTGRTIYDSVYNDLINYLASPDQT EGFDDLIKNCREQHEALKAQLEQGRDRLLEIHSNGGEKAQALAESIEEQDDDTNLIAFAM NLFDIIGINQDDRGDNMIVLTPSDHMLVPDFPGLSEDGITITFDREVALAREDAQFITWE HPLIRNGLDLILSGDTGSSTISLLKNKALPVGTLLVELIYVVEAQAPKQLQLNRFLPPTP VRMLLDKNGNNLAAQVEFETFNRQLNAVNRHTGSKLVNAVQQDVHAILQLGEAQIEKSAR 
ALIDAARNEADEKLSAELSRLEALRAVNPNIRDDELTAIESNRQQVMESLDQAGWRLDAL RLIVVTHQ >gi|296493232|gb|ADTK01000269.1| GENE 11 12357 - 14708 1998 783 aa, chain - ## HITS:1 COG:polB KEGG:ns NR:ns ## COG: polB COG0417 # Protein_GI_number: 16128054 # Func_class: L Replication, recombination and repair # Function: DNA polymerase elongation subunit (family B) # Organism: Escherichia coli K12 # 1 783 1 783 783 1625 99.0 0 MAQAGFILTRHWRDTPQGTEVSFWLATDNGPLQVTLAPQESVAFIPADQVPRAQHILQGE QGFRLTPLALKDFHRQPVYGLYCRAHRQLMNYEKRLREGGVTVYEADVRPPERYLMERFI TSPVWVEGDIRNGAIVNARLKPHPDYRPPLKWVSIDIETTRHGELYCIGLEGCGQRIVYM LGPENGDASALDFELEYVASRPQLLEKLNAWFANYDPDVIIGWNVVQFDLRMLQKHAERY RIPLRLGRDNSELEWREHGFKNGVFFAQAKGRLIIDGIEALKSAFWNFSSFSLETVAQEL LGEGKSIDNPWDRMDEIDRRFAEDKPALATYNLKDCELVTQIFHKTEIMPFLLERATVNG LPVDRHGGSVAAFGHLYFPRMHRAGYVAPNLGEVPPHASPGGYVMDSRPGLYDSVLVLDY KSLYPSIIRTFLIDPVGLVEGMAQPDPEHSTEGFLDAWFSREKHCLPEIVTNIWHGRDEA KRQGNKPLSQALKIIMNAFYGVLGTTACRFFDPRLASSITMRGHQIMRQTKALIEAQGYD VIYGDTDSTFVWLKGAHSEEEAAKIGRALVQHVNAWWAETLQKQRLTSALELEYETHFCR FLMPTIRGADTGSKKRYAGLIQEGDKQRMVFKGLETVRTDWTPLAQQFQQELYLRIFRNE PYQEYIRETIDKLMAGELDARLVYRKRLRRPLSEYQRNVPPHVRAARLADEENQKRGRPL QYQNRGTIKYVWTTNGPEPLDYQRSPLDYEHYLTRQLQPVAEGILPFIEDNFATLMTGQL GLF >gi|296493232|gb|ADTK01000269.1| GENE 12 14783 - 15478 680 231 aa, chain - ## HITS:1 COG:araD KEGG:ns NR:ns ## COG: araD COG0235 # Protein_GI_number: 16128055 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli K12 # 1 231 1 231 231 469 99.0 1e-132 MLEDLKRLVLEANLALPKHNLVTLTWGNVSAVDRERGVFVIKPSGVDYSVMTADDMVVVS IATGEVVEGTKKPSSDTPTHRLLYQAFPSIGGIVHTHSRHATIWAQAGQSIPATGTTHAD YFYGTIPCTRKMTDAEINGEYEWETGNVIVETFEKQGIDAAQMPGVLVHSHGPFAWGKNA EDAVHNAIVLEEVAYMGIFCRQLAPQLPDMQQTLLDKHYLRKHGAKAYYGQ >gi|296493232|gb|ADTK01000269.1| GENE 13 15678 - 17180 1568 500 aa, chain - ## HITS:1 COG:araA KEGG:ns NR:ns ## COG: araA COG2160 # Protein_GI_number: 16128056 # Func_class: G Carbohydrate transport 
and metabolism # Function: L-arabinose isomerase # Organism: Escherichia coli K12 # 1 500 1 500 500 1053 99.0 0 MTIFDNYEVWFVIGSQHLYGPETLRQVTQHAEHVVNALNTEAKLPCKLVLKPLGTTPDEI TAICRDANYDDRCAGLVVWLHTFSPAKMWINGLTMLNKPLLQFHTQFNAALPWDSIDMDF MNLNQTAHGGREFGFIGARMRQQHAVVTGHWQDKQAHERIGSWMRQAVSKQDTRHLKVCR FGDNMREVAVTDGDKVAAQIKFGFSVNTWAVGDLVQVVNSISDGDVNALVDEYESCYTMT PATQIHGEKRQNVLEAARIELGMKRFLEQGGFHAFTTTFEDLHGLKQLPGLAVQRLMQQG YGFAGEGDWKTAALLRIMKVMSTGLQGGTSFMEDYTYHFEKGNDLVLGSHMLEVCPSIAV EEKPILDVQHLGIGGKDDPARLIFNTQTGPAIVASLIDLGDRYRLLVNCIDTVKTPHSLP KLPVANALWKAQPDLPTASEAWILAGGAHHTVFSHALNLNDMRQFAEMHDIEITVIDTAT RLPAFKDALRWNEVYYGFRR >gi|296493232|gb|ADTK01000269.1| GENE 14 17191 - 18891 1728 566 aa, chain - ## HITS:1 COG:araB KEGG:ns NR:ns ## COG: araB COG1069 # Protein_GI_number: 16128057 # Func_class: C Energy production and conversion # Function: Ribulose kinase # Organism: Escherichia coli K12 # 1 566 1 566 566 1102 99.0 0 MAIAIGLDFGSDSVRALAVDCASGEEIATSVEWYPRWQKGQFCDAPNNQFRHHPRDYIES MEAALKTVLAELSVEQRAAVVGIGVDTTGSTPAPIDADGNVLALRPEFAENPNAMFVLWK DHTAVEEAEEITRLCHAPGNVDYSRYIGGIYSSEWFWAKILHVTRQDSAVAQSAASWIEL CDWVPALLSGTTRPQDIRRGRCSAGHKSLWHESWGGLPPASFFDELDPILNRHLPSPLFT DTWTADIPVGTLCPEWAQRLGLPESVVISGGAFDCHMGAVGAGAQPNALVKVIGTSTCDI LIADKQSVGERAVKGICGQVDGSVVPGFIGLEAGQSAFGDIYAWFGRVLGWPLEQLAAQH PELKAQINASQKQLLPALTEAWAKNPSLDHLPVVLDWFNGRRTPNANQRLKGVITDLNLA TDAPLLFGGLIAATAFGARAIMECFTDQGIAVNNVMALGGIARKNQVIMQACCDVLNRPL QIVASDQCCALGAAIFAAVAAKVHADIPSAQQKMASAVEKTLQPCSEQAQRFEQLYRRYQ QWAMSAEQHYLPTSAPAQAAQAVPTL >gi|296493232|gb|ADTK01000269.1| GENE 15 19179 - 20108 686 309 aa, chain + ## HITS:1 COG:ECs0068 KEGG:ns NR:ns ## COG: ECs0068 COG2207 # Protein_GI_number: 15829322 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 18 309 1 292 292 622 100.0 1e-178 MQYGQLVSSLNGGSMKSMAEAQNDPLLPGYSFNAHLVAGLTPIEANGYLDFFIDRPLGMK GYILNLTIRGQGVVKNQGREFVCRPGDILLFPPGEIHHYGRHPEAREWYHQWVYFRPRAY 
WHEWLNWPSIFANTGFFRPDEAHQPHFSDLFGQIINAGQGEGRYSELLAINLLEQLLLRR MEAINESLHPPMDNRVREACQYISDHLADSNFDIASVAQHVCLSPSRLSHLFRQQLGISV LSWREDQRISQAKLLLSTTRMPIATVGRNVGFDDQLYFSRVFKKCTGASPSEFRAGCEEK VNDVAVKLS >gi|296493232|gb|ADTK01000269.1| GENE 16 20194 - 20958 837 254 aa, chain + ## HITS:1 COG:ECs0069 KEGG:ns NR:ns ## COG: ECs0069 COG0586 # Protein_GI_number: 15829323 # Func_class: S Function unknown # Function: Uncharacterized membrane-associated protein # Organism: Escherichia coli O157:H7 # 1 253 1 253 254 445 99.0 1e-125 MQALLEHFITQSTVYSLMAVVLVAFLESLALVGLILPGTVLMAGLGALIGSGELSFWHAW LAGIVGCLLGDWISFWLGWRFKKPLHRWSFLKKNKALLDKTEHALHQHSMFTILVGRFVG PTRPLVPMVAGMLDLPVAKFITPNIIGCLLWPPFYFLPGILAGAAIDIPAGMQSGEFKWL LLATAVFLWVGGWLCWRLWRSGKATDRLSHYLSRGRLLWLTPLISAIGVVALVVLIRHPL MPVYIDILRKVVGG >gi|296493232|gb|ADTK01000269.1| GENE 17 21072 - 21770 282 232 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 20 217 22 226 245 113 35 2e-24 MLKLTDITWLYHHLPMRFSLTVERGEQVAILGPSGAGKSTLLNLIAGFLTPASGSLTIDG VDHTTTPPSRRPVSMLFQENNLFSHLTVAQNIGLGLNPGLKLNAAQQEKMHAIARQMGID NLMARLPGELSGGQRQRVALARCLVREQPILLLDEPFSALDPALRQEMLTLVSTSCQQQK MTLLMVSHSVEDAARIATRSVVVADGRIAWQGKTNELLSGKASASALLGITG >gi|296493232|gb|ADTK01000269.1| GENE 18 21754 - 23364 1598 536 aa, chain - ## HITS:1 COG:thiP KEGG:ns NR:ns ## COG: thiP COG1178 # Protein_GI_number: 16128061 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Escherichia coli K12 # 1 536 1 536 536 889 98.0 0 MATRRQPLIPGWLIPGVSAATLVVAVALAAFLALWWNAPQGNWVAVWQDSYLWHVVRFSF WQAFLSALLSVVPAIFLARALYRRRFPGRLALLRLCAMTLILPVLVAVFGILSVYGRQGW LASLCQSLGLEWTFSPYGLQGILLAHVFFNLPMASRLLLQALENIPGEQRQLAAQLGMRG WHFFRFVEWPWLRRQIPPVAALIFMLCFASFATVLSLGGGPQATTIELAIYQALSYDYDP ARAAMLALIQMVCCLGLVLLSQRLSKAIAPGTTLLQGWRDPDDRLHSRICDTVLIVLALL LLLPPLLAVIVDGVNRQLPEVLAQPVLWQALWTSLRIALAAGVLCVVLTMMLLWSSRELR ARQKMLAGQALEMSGMLILAMPGIVLATGFFLLLNNTIGLPQSADGIVIFTNALMAIPYA 
LKVLENPMRDITARYSMLCQSLGIEGWSRLKVVELRALKRPLAQALAFACVLSIGDFGVV ALFGNDDFRTLPFYLYQQIGSYRSQDGAVTALILLLLCFLLFTVIEKLPGRNVKTD >gi|296493232|gb|ADTK01000269.1| GENE 19 23340 - 24323 1025 327 aa, chain - ## HITS:1 COG:tbpA KEGG:ns NR:ns ## COG: tbpA COG4143 # Protein_GI_number: 16128062 # Func_class: H Coenzyme transport and metabolism # Function: ABC-type thiamine transport system, periplasmic component # Organism: Escherichia coli K12 # 13 327 13 327 327 617 99.0 1e-176 MLKKCLPLLLLCTAPVFAKPVLTVYTYDSFAADWGPGPKIKKAFEADCNCELKLVALEDG VSLLNRLRMEGKNSKADVVLGLDNNLLDAASKTGLFAKSGVAADAVNVPGGWNNDTFVPF DYGYFAFVYDKNKLKNPPQSLKELVESDQNWRVIYQDPRTSTPGLGLLLWMQKVYGDDAP QAWQKLAKKTVTVTKGWSEAYGLFLKGESDLVLSYTTSPAYHILEEKKDNYAAANFSEGH YLQVEVAARTAASKQPELAQKFLQFMVSPAFQNAIPTGNWMYPVANVTLPAGFEQLTKPA TTLEFTPAEVAAQRQAWISEWQRAVSR >gi|296493232|gb|ADTK01000269.1| GENE 20 24487 - 26142 1415 551 aa, chain - ## HITS:1 COG:ECs0074 KEGG:ns NR:ns ## COG: ECs0074 COG4533 # Protein_GI_number: 15829328 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 1 551 1 551 552 1110 99.0 0 MPSARLQQQFIRLWQCCEGKSQDTTLNELAALLSCSRRHMRTLLNTMQDRGWLTWEAEVG RGKRSRLTFLYTGLALQQQRAEDLLEQDRIDQLVQLVGDKATVRQMLVSHLGRSFRQGRH ILRVLYYRPLRNLLPGSALRRSETHIARQIFSSLTRINEENGELEADIAHHWQQISPLHW RFFLRPGVHFHHGRELEMDDVIASLKRINTLPLYSHIADIVSPTPWTLDIHLTQPDRWLP LLLGQVPAMILPREWETLSNFASHPIGTGPYAVIRNTTNQLKIQAFDDFFGYRALIDEVN VWVLPEIADEPAGGLMLKGPQGEEKEIESRLEEGCYYLLFDSRTHRGANQQVRDWVSYVL SPTNLVYFAEEQYQQLWFPAYGLLPRWHHARTIKSEKPAGLESLTLTFYQDHSEHRVIAG IMQQILASHQVTLEIKEISYDQWHEGEIESDIWLNSANFTLPLDFSLFAHLCEVPLLQHC IPIDWQADAARWRNGEMNLANWCQQLVASKAMVPLIHHWLIIQGQRSMRGLRMNTLGWFD FKSAWFAPPDP >gi|296493232|gb|ADTK01000269.1| GENE 21 26231 - 26362 137 43 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_0070 NR:ns ## KEGG: ECIAI1_0070 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 43 57 99 99 84 97.0 1e-15 
MRQFYQHYFTATAKLCWLRWLSVPQRLTMLEGLMQWDDRNSES >gi|296493232|gb|ADTK01000269.1| GENE 22 26482 - 27642 813 386 aa, chain + ## HITS:1 COG:setA KEGG:ns NR:ns ## COG: setA COG0477 # Protein_GI_number: 16128064 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 386 7 392 392 656 99.0 0 MARRMNGVYAAFMLVAFMMGVAGALQAPTLSLFLSREVGAQPFWIGLFYTVNAIAGIGVS LWLAKRSDSQGDRRKLIIFCCLMAIGNALLFAFNRHYLTLITCGVLLASLANTAMPQLFA LAREYADNSAREVVMFSSVMRAQLSLAWVIGPPLAFMLALNYGFTVMFSIAAGIFTLSLV LIAFMLPSVARVELPSENALSMQGGWQDSNVRMLFVASTLMWTCNTMYIIDMPLWISSEL GLPDKLAGFLMGTAAGLEIPAMILAGYYVKRYGKRRMMVIAVAAGVLFYTGLIFFHSRMA LMTLQLFNAVFIGIVAGIGMLWFQDLMPGRAGAATTLFTNSISTGVILAGVIQGAIAQSW GHFAVYWVIAVISVVALFLTAKVKDV Prediction of potential genes in microbial genomes Time: Mon May 16 15:52:12 2011 Seq name: gi|296493231|gb|ADTK01000270.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont846.8, whole genome shotgun sequence Length of sequence - 4659 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 30/0.000 - CDS 3 - 501 709 ## COG0066 3-isopropylmalate dehydratase small subunit 2 1 Op 2 10/0.000 - CDS 512 - 1912 1632 ## COG0065 3-isopropylmalate dehydratase large subunit 3 1 Op 3 11/0.000 - CDS 1915 - 3006 1211 ## COG0473 Isocitrate/isopropylmalate dehydrogenase 4 1 Op 4 . 
- CDS 3006 - 4577 1746 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases Predicted protein(s) >gi|296493231|gb|ADTK01000270.1| GENE 1 3 - 501 709 166 aa, chain - ## HITS:1 COG:leuD KEGG:ns NR:ns ## COG: leuD COG0066 # Protein_GI_number: 16128065 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase small subunit # Organism: Escherichia coli K12 # 1 166 1 166 201 340 100.0 1e-93 MAEKFIKHTGLVVPLDAANVDTDAIIPKQFLQKVTRTGFGAHLFNDWRFLDEKGQQPNPD FVLNFPQYQGASILLARENFGCGSSREHAPWALTDYGFKVVIAPSFADIFYGNSFNNQLL PVKLSDAEVDELFALVKANPGIHFDVDLEAQEVKAGEKTYRFTIDA >gi|296493231|gb|ADTK01000270.1| GENE 2 512 - 1912 1632 466 aa, chain - ## HITS:1 COG:leuC KEGG:ns NR:ns ## COG: leuC COG0065 # Protein_GI_number: 16128066 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase large subunit # Organism: Escherichia coli K12 # 1 466 1 466 466 927 100.0 0 MAKTLYEKLFDAHVVYEAENETPLLYIDRHLVHEVTSPQAFDGLRAHGRPVRQPGKTFAT MDHNVSTQTKDINACGEMARIQMQELIKNCKEFGVELYDLNHPYQGIVHVMGPEQGVTLP GMTIVCGDSHTATHGAFGALAFGIGTSEVEHVLATQTLKQGRAKTMKIEVQGKAAPGITA KDIVLAIIGKTGSAGGTGHVVEFCGEAIRDLSMEGRMTLCNMAIEMGAKAGLVAPDETTF NYVKGRLHAPKGKDFDDAVAYWKTLQTDEGATFDTVVTLQAEEISPQVTWGTNPGQVISV NDNIPDPASFADPVERASAEKALAYMGLKPGIPLTEVAIDKVFIGSCTNSRIEDLRAAAE IAKGRKVAPGVQALVVPGSGPVKAQAEAEGLDKIFIEAGFEWRLPGCSMCLAMNNDRLNP GERCASTSNRNFEGRQGRGGRTHLVSPAMAAAAAVTGHFADIRNIK >gi|296493231|gb|ADTK01000270.1| GENE 3 1915 - 3006 1211 363 aa, chain - ## HITS:1 COG:leuB KEGG:ns NR:ns ## COG: leuB COG0473 # Protein_GI_number: 16128067 # Func_class: C Energy production and conversion; E Amino acid transport and metabolism # Function: Isocitrate/isopropylmalate dehydrogenase # Organism: Escherichia coli K12 # 1 363 2 364 364 742 100.0 0 MSKNYHIAVLPGDGIGPEVMTQALKVLDAVRNRFAMRITTSHYDVGGAAIDNHGQPLPPA TVEGCEQADAVLFGSVGGPKWEHLPPDQQPERGALLPLRKHFKLFSNLRPAKLYQGLEAF CPLRADIAANGFDILCVRELTGGIYFGQPKGREGSGQYEKAFDTEVYHRFEIERIARIAF 
ESARKRRHKVTSIDKANVLQSSILWREIVNEIATEYPDVELAHMYIDNATMQLIKDPSQF DVLLCSNLFGDILSDECAMITGSMGMLPSASLNEQGFGLYEPAGGSAPDIAGKNIANPIA QILSLALLLRYSLDADDAACAIERAINRALEEGIRTGDLARGAAAVSTDEMGDIIARYVA EGV >gi|296493231|gb|ADTK01000270.1| GENE 4 3006 - 4577 1746 523 aa, chain - ## HITS:1 COG:ECs0078 KEGG:ns NR:ns ## COG: ECs0078 COG0119 # Protein_GI_number: 15829332 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Escherichia coli O157:H7 # 1 523 1 523 523 1013 99.0 0 MSQQVIIFDTTLRDGEQALQASLSVKEKLQIALALERMGVDVMEVGFPVSSPGDFESVQT IARQVKNSRVCALARCVEKDIDVAAESLKVAEAFRIHTFIATSPMHIATKLRSTLDEVIE RAIYMVKRARNYTDDVEFSCEDAGRTPIADLARVVEAAINAGATTINIPDTVGYTMPFEF AGIISGLYERVPNIDKAIISVHTHDDLGLAVGNSLAAVHAGARQVEGAMNGIGERAGNCS LEEVIMAIKVRKDILNVHTAINHQEIWRTSQLVSQICNMPIPANKAIVGSGAFAHSSGIH QDGVLKNRENYEIMTPESIGLNQIQLNLTSRSGRAAVKHRMDEMGYKESEYNLDNLYDAF LKLADKKGQVFDYDLEALAFIGKQQEEPEHFRLDYFSVQSGSNDIATAAVKLACGEEVKA EAANGNGPVDAVYQAINRITDYNVELVKYSLTAKGHGKDALGQVDIVANYNGRRFHGVGL ATDIVESSAKAMVHVLNNIWRAAEVEKELQRKAQHNENNKETV Prediction of potential genes in microbial genomes Time: Mon May 16 15:52:23 2011 Seq name: gi|296493230|gb|ADTK01000271.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont846.9, whole genome shotgun sequence Length of sequence - 30986 bp Number of predicted genes - 27, with homology - 27 Number of transcription units - 7, operones - 4 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 783 - 1745 833 ## COG0583 Transcriptional regulator + Term 1748 - 1806 14.5 + Prom 1788 - 1847 10.0 2 2 Op 1 32/0.000 + CDS 2063 - 3787 1351 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] 3 2 Op 2 3/1.000 + CDS 3790 - 4281 512 ## COG0440 Acetolactate synthase, small (regulatory) subunit + Term 4307 - 4339 1.5 + Prom 4373 - 4432 5.7 4 3 Tu 1 . 
+ CDS 4461 - 5465 1099 ## COG1609 Transcriptional regulators + Prom 5963 - 6022 4.1 5 4 Op 1 29/0.000 + CDS 6067 - 6525 302 ## COG2001 Uncharacterized protein conserved in bacteria 6 4 Op 2 12/0.000 + CDS 6530 - 7468 1067 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis 7 4 Op 3 12/0.000 + CDS 7465 - 7830 312 ## COG3116 Cell division protein 8 4 Op 4 26/0.000 + CDS 7846 - 9612 1508 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 9 4 Op 5 26/0.000 + CDS 9599 - 11086 1536 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase 10 4 Op 6 28/0.000 + CDS 11083 - 12441 1315 ## COG0770 UDP-N-acetylmuramyl pentapeptide synthase 11 4 Op 7 28/0.000 + CDS 12435 - 13517 1319 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 12 4 Op 8 25/0.000 + CDS 13520 - 14836 1334 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase 13 4 Op 9 31/0.000 + CDS 14836 - 16080 1426 ## COG0772 Bacterial cell division membrane protein 14 4 Op 10 26/0.000 + CDS 16101 - 17144 911 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase 15 4 Op 11 11/0.000 + CDS 17198 - 18673 1505 ## COG0773 UDP-N-acetylmuramate-alanine ligase 16 4 Op 12 18/0.000 + CDS 18666 - 19586 962 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes 17 4 Op 13 25/0.000 + CDS 19588 - 20418 661 ## COG1589 Cell division septal protein 18 4 Op 14 35/0.000 + CDS 20415 - 21677 1088 ## COG0849 Actin-like ATPase involved in cell division 19 4 Op 15 11/0.000 + CDS 21738 - 22889 1225 ## COG0206 Cell division GTPase + Prom 22900 - 22959 1.8 20 4 Op 16 . + CDS 22990 - 23907 793 ## COG0774 UDP-3-O-acyl-N-acetylglucosamine deacetylase + Term 23934 - 23991 2.1 + Prom 23955 - 24014 4.9 21 5 Op 1 . 
+ CDS 24063 - 24650 306 ## G2583_0101 secretion monitor protein + Term 24681 - 24709 0.5 22 5 Op 2 8/0.000 + CDS 24712 - 27417 3433 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) + Term 27439 - 27469 3.0 23 5 Op 3 . + CDS 27477 - 27866 330 ## PROTEIN SUPPORTED gi|42631237|ref|ZP_00156775.1| COG0494: NTP pyrophosphohydrolases including oxidative damage repair enzymes - Term 28014 - 28058 3.7 24 6 Op 1 7/0.000 - CDS 28082 - 28279 136 ## COG3024 Uncharacterized protein conserved in bacteria 25 6 Op 2 7/0.000 - CDS 28289 - 29032 778 ## COG4582 Uncharacterized protein conserved in bacteria 26 6 Op 3 . - CDS 29032 - 29652 628 ## COG0237 Dephospho-CoA kinase - Prom 29754 - 29813 4.8 + Prom 29610 - 29669 3.0 27 7 Tu 1 . + CDS 29877 - 30920 1021 ## COG0516 IMP dehydrogenase/GMP reductase Predicted protein(s) >gi|296493230|gb|ADTK01000271.1| GENE 1 783 - 1745 833 320 aa, chain + ## HITS:1 COG:leuO KEGG:ns NR:ns ## COG: leuO COG0583 # Protein_GI_number: 16128070 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 320 54 373 373 628 99.0 1e-180 MTVELSMPEVQTDHSETAELSKPQLRMVDLNLLTVFDAVMQEQNITRAAHVLGMSQPAVS NAVARLKVMFNDELFVRYGRGIQPTARAFQLFGSVRQALQLVQNELPGSGFEPASSERVF HLCVCSPLDSILTSQIYNHIEQIAPNIHVMFKSSLNQNTEHQLRYQETEFVISYEDFHRP EFTSVPLFKDEMVLVASKNHPTIKGPLLKHDVYNEQHAAVSLDRFASFSQPWYDTVDKQA SIAYQGMAMMSVLSVVSQTHLVAIAPRWLAEEFAESLELQVLPLPLKQNSRTCYLSWHEA AGRDKGHQWMEEQLVSICKR >gi|296493230|gb|ADTK01000271.1| GENE 2 2063 - 3787 1351 574 aa, chain + ## HITS:1 COG:ilvI KEGG:ns NR:ns ## COG: ilvI COG0028 # Protein_GI_number: 16132273 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Escherichia coli K12 # 1 574 1 574 574 1163 99.0 0 MEMLSGAEMVVRSLIDQGVKQVFGYPGGAVLDIYDALHTVGGIDHVLVRHEQAAVHMADG 
LARATGEVGVVLVTSGPGATNAITGIATAYMDSIPLVVLSGQVATSLIGYDAFQECDMVG ISRPVVKHSFLVKQTEDIPQVLKKAFWLAASGRPGPVVVDLPKDILNPANKLPYVWPESV SMRSYNPTTTGHKGQIKRALQTLVAAKKPVVYVGGGAIMAGCHQQLKETVEALNLPVVSS LMGLGAFPATHRQALGMLGMHGTYEANMTMHNADVIFAVGVRFDDRTTNNLAKYCPNATV LHIDIDPTSISKTVTADIPIVGDARQVLEQMLELLSQESAHQPLDEIRDWWQQIEQWRAR QCLKYDTHSEKIKPQAVIETLWRLTKGDAYVTSDVGQHQMFAALYYPFDKPRRWINSGGL GTMGFGLPAALGVKMALPEETVVCVTGDGSIQMNIQELSTALQYELPVLVVNLNNRYLGM VKQWQDMIYSGRHSQSYMQSLPDFVRLAEAYGHVGIQISHPHELESKLSEALEQVCNNRL VFVDVTVDGSEHVYPMQIRGGGMDEMWLSKTERT >gi|296493230|gb|ADTK01000271.1| GENE 3 3790 - 4281 512 163 aa, chain + ## HITS:1 COG:ECs0082 KEGG:ns NR:ns ## COG: ECs0082 COG0440 # Protein_GI_number: 15829336 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Escherichia coli O157:H7 # 1 163 1 163 163 281 99.0 3e-76 MRRILSVLLENESGALSRVIGLFSQRGYNIESLTVAPTDDPTLSRMTIQTVGDEKVLEQI EKQLHKLVDVLRVSELGQGAHVEREIMLVKIQASGYGRDEVKRNTEIFRGQIIDVTPSLY TVQLAGTSDKLDAFLASIRDVAKIVEVARSGVVGLSRGDKIMR >gi|296493230|gb|ADTK01000271.1| GENE 4 4461 - 5465 1099 334 aa, chain + ## HITS:1 COG:ECs0084 KEGG:ns NR:ns ## COG: ECs0084 COG1609 # Protein_GI_number: 15829338 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 334 1 334 334 655 100.0 0 MKLDEIARLAGVSRTTASYVINGKAKQYRVSDKTVEKVMAVVREHNYHPNAVAAGLRAGR TRSIGLVIPDLENTSYTRIANYLERQARQRGYQLLIACSEDQPDNEMRCIEHLLQRQVDA IIVSTSLPPEHPFYQRWANDPFPIVALDRALDREHFTSVVGADQDDAEMLAEELRKFPAE TVLYLGALPELSVSFLREQGFRTAWKDDPREVHFLYANSYEREAAAQLFEKWLETHPMPQ ALFTTSFALLQGVMDVTLRRDGKLPSDLAIATFGDNELLDFLQCPVLAVAQRHRDVAERV LEIVLASLDEPRKPKPGLTRIKRNLYRRGVLSRS >gi|296493230|gb|ADTK01000271.1| GENE 5 6067 - 6525 302 152 aa, chain + ## HITS:1 COG:ECs0085 KEGG:ns NR:ns ## COG: ECs0085 COG2001 # Protein_GI_number: 15829339 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 152 1 152 152 296 100.0 7e-81 
MFRGATLVNLDSKGRLSVPTRYREQLLENAAGQMVCTIDIHHPCLLLYPLPEWEIIEQKL SRLSSMNPVERRVQRLLLGHASECQMDGAGRLLIAPVLRQHAGLTKEVMLVGQFNKFELW DETTWHQQVKEDIDAEQLATGDLSERLQDLSL >gi|296493230|gb|ADTK01000271.1| GENE 6 6530 - 7468 1067 312 aa, chain + ## HITS:1 COG:ECs0086 KEGG:ns NR:ns ## COG: ECs0086 COG0275 # Protein_GI_number: 15829340 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Escherichia coli O157:H7 # 1 312 2 313 313 593 100.0 1e-169 MENYKHTTVLLDEAVNGLNIRPDGIYIDGTFGRGGHSRLILSQLGEEGRLLAIDRDPQAI AVAKTIDDPRFSIIHGPFSALGEYVAERDLIGKIDGILLDLGVSSPQLDDAERGFSFMRD GPLDMRMDPTRGQSAAEWLQTAEEADIAWVLKTYGEERFAKRIARAIVERNREQPMTRTK ELAEVVAAATPVKDKFKHPATRTFQAVRIWVNSELEEIEQALKSSLNVLAPGGRLSIISF HSLEDRIVKRFMRENSRGPQVPAGLPMTEEQLKKLGGRQLRALGKLMPGEEEVAENPRAR SSVLRIAERTNA >gi|296493230|gb|ADTK01000271.1| GENE 7 7465 - 7830 312 121 aa, chain + ## HITS:1 COG:ECs0087 KEGG:ns NR:ns ## COG: ECs0087 COG3116 # Protein_GI_number: 15829341 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Escherichia coli O157:H7 # 1 121 1 121 121 224 100.0 4e-59 MISRVTEALSKVKGSMGSHERHALPGVIGDDLLRFGKLPLCLFICIILTAVTVVTTAHHT RLLTAQREQLVLERDALDIEWRNLILEENALGDHSRVERIATEKLQMQHVDPSQENIVVQ K >gi|296493230|gb|ADTK01000271.1| GENE 8 7846 - 9612 1508 588 aa, chain + ## HITS:1 COG:ECs0088 KEGG:ns NR:ns ## COG: ECs0088 COG0768 # Protein_GI_number: 15829342 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Escherichia coli O157:H7 # 1 588 1 588 588 1162 100.0 0 MKAAAKTQKPKRQEEHANFISWRFALLCGCILLALAFLLGRVAWLQVISPDMLVKEGDMR SLRVQQVSTSRGMITDRSGRPLAVSVPVKAIWADPKEVHDAGGISVGDRWKALANALNIP LDQLSARINANPKGRFIYLARQVNPDMADYIKKLKLPGIHLREESRRYYPSGEVTAHLIG FTNVDSQGIEGVEKSFDKWLTGQPGERIVRKDRYGRVIEDISSTDSQAAHNLALSIDERL QALVYRELNNAVAFNKAESGSAVLVDVNTGEVLAMANSPSYNPNNLSGTPKEAMRNRTIT 
DVFEPGSTVKPMVVMTALQRGVVRENSVLNTIPYRINGHEIKDVARYSELTLTGVLQKSS NVGVSKLALAMPSSALVDTYSRFGLGKATNLGLVGERSGLYPQKQRWSDIERATFSFGYG LMVTPLQLARVYATIGSYGIYRPLSITKVDPPVPGERVFPESIVRTVVHMMESVALPGGG GVKAAIKGYRIAIKTGTAKKVGPDGRYINKYIAYTAGVAPASQPRFALVVVINDPQAGKY YGGAVSAPVFGAIMGGVLRTMNIEPDALTTGDKNEFVINQGEGTGGRS >gi|296493230|gb|ADTK01000271.1| GENE 9 9599 - 11086 1536 495 aa, chain + ## HITS:1 COG:murE KEGG:ns NR:ns ## COG: murE COG0769 # Protein_GI_number: 16128078 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Escherichia coli K12 # 1 495 1 495 495 937 99.0 0 MADRNLRDLLAPWVPDAPSRALREMTLDSRVAAAGDLFVAVVGHQADGRRYIPQAIAQGV AAIIAEAKDEATDGEIREMHGVPVIYLSQLNERLSALAGRFYHEPSDNLRLVGVTGTNGK TTTTQLLAQWSQLLGETSAVMGTVGNGLLGKVIPTENTTGSAVDVQHELAGLVDQGATFC AMEVSSHGLVQHRVAALKFAASVFTNLSRDHLDYHGDMEHYEAAKWLLYSEHHCGQAIVN ADDEVGRRWLAKLPDAVAVSMEDHINPNCHGRWLKATEVNYHDSGATIRFSSSWGDGEIE SHLMGAFNVSNLLLALATLLALGYPLADLLKTAARLQPVCGRMEVFTAPGKPTVVVDYAH TPDALEKALQAARLHCAGKLWCVFGCGGDRDKGKRPLMGAIAEEFADVAVVTDDNPRTEE PRAIINDILAGMLDAGHAKVMEGRAEAVTCAVMQAKENDVVLVAGKGHEDYQIVGNQRLD YSDRVTVARLLGVIA >gi|296493230|gb|ADTK01000271.1| GENE 10 11083 - 12441 1315 452 aa, chain + ## HITS:1 COG:murF KEGG:ns NR:ns ## COG: murF COG0770 # Protein_GI_number: 16128079 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide synthase # Organism: Escherichia coli K12 # 1 452 1 452 452 814 99.0 0 MISVTLSQLTDILNGELQGADITLDAVTTDTRKLTPGCLFVALKGERFDAHDFADQAKAG GAGALLVSRPLDIDLPQLIVKDTRLAFGELAAWVRQQVPARVVALTGSSGKTSVKEMTAA ILSQCGNTLYTAGNLNNDIGVPMTLLRLTPEYDYAVIELGANHQGEIAWTVSLTRPEAAL VNNLAAAHLEGFGSLAGVAKAKGEIFSGLPENGIAIMNADNNDWLNWQSVIGSRKVWRFS PNAANSDFTATNIHVTSHGTEFTLQTPTGSVDVLLPLPGRHNIANALAAAALSMSVGATL DAIKAGLANLKAVPGRLFPIQLAENQLLLDDSYNANVGSMTAAVQVLAEMPGYRVLVVGD MAELGAESEACHVQVGEAAKAAGIDRVLSVGKQSHAISTASGVGEHFADKTALITRLKSL IAEQQVITILVKGSRSAAMEEVVRALQENGTC >gi|296493230|gb|ADTK01000271.1| GENE 11 12435 - 13517 1319 360 aa, chain + ## 
HITS:1 COG:STM0125 KEGG:ns NR:ns ## COG: STM0125 COG0472 # Protein_GI_number: 16763515 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Salmonella typhimurium LT2 # 1 360 1 360 360 642 97.0 0 MLVWLAEHLVKYYSGFNVFSYLTFRAIVSLLTALFISLWMGPRMIAHLQKLSFGQVVRND GPESHFSKRGTPTMGGIMILTAIVISVLLWAYPSNPYVWCVLVVLVGYGVIGFVDDYRKV VRKDTKGLIARWKYFWMSVIALGVAFALYLAGKDTPATQLVVPFFKDVMPQLGLFYILLA YFVIVGTGNAVNLTDGLDGLAIMPTVFVAGGFALVAWATGNMNFASYLHIPYLRHAGELV IVCTAIVGAGLGFLWFNTYPAQVFMGDVGSLALGGALGIIAVLLRQEFLLVIMGGVFVVE TLSVILQVGSFKLRGQRIFRMAPIHHHYELKGWPEPRVIVRFWIISLMLVLIGLATLKVR >gi|296493230|gb|ADTK01000271.1| GENE 12 13520 - 14836 1334 438 aa, chain + ## HITS:1 COG:ECs0092 KEGG:ns NR:ns ## COG: ECs0092 COG0771 # Protein_GI_number: 15829346 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: Escherichia coli O157:H7 # 1 438 1 438 438 851 99.0 0 MADYQGKNVVIIGLGLTGLSCVDFFLARGVTPRVMDTRMTPPGLDKLPEAVERHTGSLND EWLMAADLIVASPGIALAHPSLSAAADAGIEIVGDIELFCREAQAPIVAITGSNGKSTVT TLVGEMAKAAGVNVGVGGNIGLPALMLLDDECELYVLELSSFQLETTSSLQAVAATILNV TEDHMDRYPFGLQQYRAAKLRIYENAKVCVVNADDALTMPIRGADERCVSFGVNMGDYHL NHQQGETWLRVKGEKVLNVKEMKLSGQHNYTNALAALALADAAGLPRASSLKALTTFTGL PHRFEVVLEHNGVRWINDSKATNVGSTEAALNGLHVDGTLHLLLGGDGKSADFSPLARYL NGDNVRLYCFGRDGAQLAALHPEVAEQTETMEQAMRLLAPRVQPGDMVLLSPACASLDQF KNFEQRGNEFARLAKELG >gi|296493230|gb|ADTK01000271.1| GENE 13 14836 - 16080 1426 414 aa, chain + ## HITS:1 COG:ECs0093 KEGG:ns NR:ns ## COG: ECs0093 COG0772 # Protein_GI_number: 15829347 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Escherichia coli O157:H7 # 1 414 1 414 414 735 100.0 0 MRLSLPRLKMPRLPGFSILVWISTALKGWVMGSREKDTDSLIMYDRTLLWLTFGLAAIGF IMVTSASMPIGQRLTNDPFFFAKRDGVYLILAFILAIITLRLPMEFWQRYSATMLLGSII 
LLMIVLVVGSSVKGASRWIDLGLLRIQPAELTKLSLFCYIANYLVRKGDEVRNNLRGFLK PMGVILVLAVLLLAQPDLGTVVVLFVTTLAMLFLAGAKLWQFIAIIGMGISAVVLLILAE PYRIRRVTAFWNPWEDPFGSGYQLTQSLMAFGRGELWGQGLGNSVQKLEYLPEAHTDFIF AIIGEELGYVGVVLALLMVFFVAFRAMSIGRKALEIDHRFSGFLACSIGIWFSFQALVNV GAAAGMLPTKGLTLPLISYGGSSLLIMSTAIMMLLRIDYETRLEKAQAFVRGSR >gi|296493230|gb|ADTK01000271.1| GENE 14 16101 - 17144 911 347 aa, chain + ## HITS:1 COG:ECs0094 KEGG:ns NR:ns ## COG: ECs0094 COG0707 # Protein_GI_number: 15829348 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Escherichia coli O157:H7 # 1 347 9 355 355 638 100.0 0 MVMAGGTGGHVFPGLAVAHHLMAQGWQVRWLGTADRMEADLVPKHGIEIDFIRISGLRGK GIKALIAAPLRIFNAWRQARAIMKAYKPDVVLGMGGYVSGPGGLAAWSLGIPVVLHEQNG IAGLTNKWLAKIATKVMQAFPGAFPNAEVVGNPVRTDVLALPLPQQRLAGREGPVRVLVV GGSQGARILNQTMPQVAAKLGDSVTIWHQSGKGSQQSVEQAYAEAGQPQHKVTEFIDDMA AAYAWADVVVCRSGALTVSEIAAAGLPALFVPFQHKDRQQYWNALPLEKAGAAKIIEQPQ LSVDAVANTLAGWSRETLLTMAERARAASIPDATERVANEVSRAARA >gi|296493230|gb|ADTK01000271.1| GENE 15 17198 - 18673 1505 491 aa, chain + ## HITS:1 COG:murC KEGG:ns NR:ns ## COG: murC COG0773 # Protein_GI_number: 16128084 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate-alanine ligase # Organism: Escherichia coli K12 # 1 491 1 491 491 908 100.0 0 MNTQQLAKLRSIVPEMRRVRHIHFVGIGGAGMGGIAEVLANEGYQISGSDLAPNPVTQQL MNLGATIYFNHRPENVRDASVVVVSSAISADNPEIVAAHEARIPVIRRAEMLAELMRFRH GIAIAGTHGKTTTTAMVSSIYAEAGLDPTFVNGGLVKAAGVHARLGHGRYLIAEADESDA SFLHLQPMVAIVTNIEADHMDTYQGDFENLKQTFINFLHNLPFYGRAVMCVDDPVIRELL PRVGRQTTTYGFSEDADVRVEDYQQIGPQGHFTLLRQDKEPMRVTLNAPGRHNALNAAAA VAVATEEGIDDEAILRALESFQGTGRRFDFLGEFPLEPVNGKSGTAMLVDDYGHHPTEVD ATIKAARAGWPDKNLVMLFQPHRFTRTRDLYDDFANVLTQVDTLLMLEVYPAGEAPIPGA DSRSLCRTIRGRGKIDPILVPDPARVAEMLAPVLTGNDLILVQGAGNIGKIARSLAEIKL KPQTPEEEQHD >gi|296493230|gb|ADTK01000271.1| GENE 16 18666 - 19586 962 306 aa, chain + ## HITS:1 COG:ddlB KEGG:ns NR:ns ## COG: ddlB COG1181 # Protein_GI_number: 16128085 # 
Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Escherichia coli K12 # 1 306 1 306 306 581 100.0 1e-166 MTDKIAVLLGGTSAEREVSLNSGAAVLAGLREGGIDAYPVDPKEVDVTQLKSMGFQKVFI ALHGRGGEDGTLQGMLELMGLPYTGSGVMASALSMDKLRSKLLWQGAGLPVAPWVALTRA EFEKGLSDKQLAEISALGLPVIVKPSREGSSVGMSKVVAENALQDALRLAFQHDEEVLIE KWLSGPEFTVAILGEEILPSIRIQPSGTFYDYEAKYLSDETQYFCPAGLEASQEANLQAL VLKAWTTLGCKGWGRIDVMLDSDGQFYLLEANTSPGMTSHSLVPMAARQAGMSFSQLVVR ILELAD >gi|296493230|gb|ADTK01000271.1| GENE 17 19588 - 20418 661 276 aa, chain + ## HITS:1 COG:ftsQ KEGG:ns NR:ns ## COG: ftsQ COG1589 # Protein_GI_number: 16128086 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division septal protein # Organism: Escherichia coli K12 # 1 265 1 265 276 490 100.0 1e-138 MSQAALNTRNSEEEVSSRRNNGTRLAGILFLLTVLTTVLVSGWVVLGWMEDAQRLPLSKL VLTGERHYTRNDDIRQSILALGEPGTFMTQDVNIIQTQIEQRLPWIKQVSVRKQWPDELK IHLVEYVPIARWNDQHMVDAEGNTFSVPPERTSKQVLPMLYGPEGSANEVLQGYREMGQM LAKDRFTLKEAAMTARRSWQLTLNNDIKLNLGRGDTMKRLARFVELYPVLQQQAQTDGKR ISYVDLRYDSGAAVGWAPLPPEESTQQQNQAQAEQQ >gi|296493230|gb|ADTK01000271.1| GENE 18 20415 - 21677 1088 420 aa, chain + ## HITS:1 COG:ECs0098 KEGG:ns NR:ns ## COG: ECs0098 COG0849 # Protein_GI_number: 15829352 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell division # Organism: Escherichia coli O157:H7 # 1 420 1 420 420 818 100.0 0 MIKATDRKLVVGLEIGTAKVAALVGEVLPDGMVNIIGVGSCPSRGMDKGGVNDLESVVKC VQRAIDQAELMADCQISSVYLALSGKHISCQNEIGMVPISEEEVTQEDVENVVHTAKSVR VRDEHRVLHVIPQEYAIDYQEGIKNPVGLSGVRMQAKVHLITCHNDMAKNIVKAVERCGL KVDQLIFAGLASSYSVLTEDERELGVCVVDIGGGTMDIAVYTGGALRHTKVIPYAGNVVT SDIAYAFGTPPSDAEAIKVRHGCALGSIVGKDESVEVPSVGGRPPRSLQRQTLAEVIEPR YTELLNLVNEEILQLQEKLRQQGVKHHLAAGIVLTGGAAQIEGLAACAQRVFHTQVRIGA PLNITGLTDYAQEPYYSTAVGLLHYGKESHLNGEAEVEKRVTASVGSWIKRLNSWLRKEF >gi|296493230|gb|ADTK01000271.1| GENE 19 21738 - 22889 1225 383 aa, chain + ## HITS:1 COG:ECs0099 
KEGG:ns NR:ns ## COG: ECs0099 COG0206 # Protein_GI_number: 15829353 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Escherichia coli O157:H7 # 1 383 1 383 383 642 100.0 0 MFEPMELTNDAVIKVIGVGGGGGNAVEHMVRERIEGVEFFAVNTDAQALRKTAVGQTIQI GSGITKGLGAGANPEVGRNAADEDRDALRAALEGADMVFIAAGMGGGTGTGAAPVVAEVA KDLGILTVAVVTKPFNFEGKKRMAFAEQGITELSKHVDSLITIPNDKLLKVLGRGISLLD AFGAANDVLKGAVQGIAELITRPGLMNVDFADVRTVMSEMGYAMMGSGVASGEDRAEEAA EMAISSPLLEDIDLSGARGVLVNITAGFDLRLDEFETVGNTIRAFASDNATVVIGTSLDP DMNDELRVTVVATGIGMDKRPEITLVTNKQVQQPVMDRYQQHGMAPLTQEQKPVAKVVND NAPQTAKEPDYLDIPAFLRKQAD >gi|296493230|gb|ADTK01000271.1| GENE 20 22990 - 23907 793 305 aa, chain + ## HITS:1 COG:ECs0100 KEGG:ns NR:ns ## COG: ECs0100 COG0774 # Protein_GI_number: 15829354 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-acyl-N-acetylglucosamine deacetylase # Organism: Escherichia coli O157:H7 # 1 305 1 305 305 636 100.0 0 MIKQRTLKRIVQATGVGLHTGKKVTLTLRPAPANTGVIYRRTDLNPPVDFPADAKSVRDT MLCTCLVNEHDVRISTVEHLNAALAGLGIDNIVIEVNAPEIPIMDGSAAPFVYLLLDAGI DELNCAKKFVRIKETVRVEDGDKWAEFKPYNGFSLDFTIDFNHPAIDSSNQRYAMNFSAD AFMRQISRARTFGFMRDIEYLQSRGLCLGGSFDCAIVVDDYRVLNEDGLRFEDEFVRHKM LDAIGDLFMCGHNIIGAFTAYKSGHALNNKLLQAVLAKQEAWEYVTFQDDAELPLAFKAP SAVLA >gi|296493230|gb|ADTK01000271.1| GENE 21 24063 - 24650 306 195 aa, chain + ## HITS:1 COG:no KEGG:G2583_0101 NR:ns ## KEGG: G2583_0101 # Name: secM # Def: secretion monitor protein # Organism: E.coli_O55_H7 # Pathway: Protein export [PATH:eok03060]; Bacterial secretion system [PATH:eok03070] # 1 195 33 227 227 384 98.0 1e-106 MLWTAGFNDKICALNTFEYDRDGNNVSGILTRWRQFGKRYFWPHLLLGMVAASLGLPALS NAAEPNAPAKATTRNHEPSAKVNFGQLALLEANTRRPNSNYSVDYWHQHAIRTVIRHLSF AMAPQTLPVAEESLPLQAQHLALLDTLSALLTQEGTPSEKGYRIDYAHFTPQAKFSTPVW ISQAQGIRAGPQRLT >gi|296493230|gb|ADTK01000271.1| GENE 22 24712 - 27417 3433 901 aa, chain + ## HITS:1 COG:ECs0102 KEGG:ns NR:ns ## COG: ECs0102 COG0653 # Protein_GI_number: 15829356 # Func_class: U 
Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Escherichia coli O157:H7 # 1 901 1 901 901 1724 100.0 0 MLIKLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKGEVL ENLIPEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNA LTGKGVHVVTVNDYLAQRDAENNRPLFEFLGLTVGINLPGMPAPAKREAYAADITYGTNN EYGFDYLRDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSSEMYKRVN KIIPHLIRQEKEDSETFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYS PANIMLMHHVTAALRAHALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAK EGVQIQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMIR KDLPDLVYMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLN AKFHANEAAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEKIKAD WQVRHDAVLAAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDALMRIFAS DRVSGMMRKLGMKPGEAIEHPWVTKAIANAQRKVESRNFDIRKQLLEYDDVANDQRRAIY SQRNELLDVSDVSETINSIREDVFKATIDAYIPPQSLEEMWDIPGLQERLKNDFDLDLPI AEWLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLA AMDYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEVISTLSKVQVRMPEEVEEL EQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCPCGSGKKYKQCHGRL Q >gi|296493230|gb|ADTK01000271.1| GENE 23 27477 - 27866 330 129 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42631237|ref|ZP_00156775.1| COG0494: NTP pyrophosphohydrolases including oxidative damage repair enzymes [Haemophilus influenzae R2866] # 2 126 4 128 136 131 48 5e-30 MKKLQIAVGIIRNENNEIFITRRAADAHMANKLEFPGGKIEMGETPEQAVVRELQEEVGI TPQHFSLFEKLEYEFPDRHITLWFWLVERWEGKPWGKEGQPGEWMALVDLNADDFPPANE PVIAKLKRL >gi|296493230|gb|ADTK01000271.1| GENE 24 28082 - 28279 136 65 aa, chain - ## HITS:1 COG:ECs0105 KEGG:ns NR:ns ## COG: ECs0105 COG3024 # Protein_GI_number: 15829359 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 65 1 65 65 117 100.0 7e-27 MSETITVNCPTCGKTVVWGEISPFRPFCSKRCQLIDLGEWAAEEKRIPSSGDLSESDDWS EEPKQ >gi|296493230|gb|ADTK01000271.1| GENE 25 28289 - 29032 778 247 aa, chain - ## HITS:1 
COG:yacF KEGG:ns NR:ns ## COG: yacF COG4582 # Protein_GI_number: 16128095 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 247 1 247 247 484 100.0 1e-137 MQTQVLFEHPLNEKMRTWLRIEFLIQQLTVNLPIVDHAGALHFFRNVSELLDVFERGEVR TELLKELDRQQRKLQTWIGVPGVDQSRIEALIQQLKAAGSVLISAPRIGQFLREDRLIAL VRQRLSIPGGCCSFDLPTLHIWLHLPQAQRDSQVETWIASLNPLTQALTMVLDLIRQSAP FRKQTSLNGFYQDNGGDADLLRLNLSLDSQLYPQISGHKSRFAIRFMPLDTENGQVPERL DFELACC >gi|296493230|gb|ADTK01000271.1| GENE 26 29032 - 29652 628 206 aa, chain - ## HITS:1 COG:ECs0107 KEGG:ns NR:ns ## COG: ECs0107 COG0237 # Protein_GI_number: 15829361 # Func_class: H Coenzyme transport and metabolism # Function: Dephospho-CoA kinase # Organism: Escherichia coli O157:H7 # 1 206 1 206 206 360 100.0 1e-100 MRYIVALTGGIGSGKSTVANAFADLGINVIDADIIARQVVEPGAPALHAIADHFGANMIA ADGTLQRRALRERIFANPEEKNWLNALLHPLIQQETQHQIQQATSPYVLWVVPLLVENSL YKKANRVLVVDVSPETQLKRTMQRDDVTREHVEQILAAQATREARLAVADDVIDNNGAPD AIASDVARLHAHYLQLASQFVSQEKP >gi|296493230|gb|ADTK01000271.1| GENE 27 29877 - 30920 1021 347 aa, chain + ## HITS:1 COG:guaC KEGG:ns NR:ns ## COG: guaC COG0516 # Protein_GI_number: 16128097 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Escherichia coli K12 # 1 347 1 347 347 695 99.0 0 MRIEEDLKLGFKDVLIRPKRSTLKSRSDVELERQFTFKHSGQSWSGVPIIAANMDTVGTF SMASALASFDILTAVHKHYSVEEWQAFINNSSADVLKHVMVSTGTSDADFEKTKQILDLN PALNFVCIDVANGYSEHFVQFVAKAREAWPTKTICAGNVVTGEMCEELILSGADIVKVGI GPGSVCTTRVKTGVGYPQLSAVIECADAAHGLGGMIVSDGGCTTPGDVAKAFGGGADFVM LGGMLAGYEESGGRIVEENGEKFMLFYGMSSESAMKRHVGGVAEYRAAEGKTVKLPLRGP VENTARDILGGLRSACTYVGASRLKELTKRTTFIRVQEQENRIFNNL Prediction of potential genes in microbial genomes Time: Mon May 16 15:52:29 2011 Seq name: gi|296493229|gb|ADTK01000272.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont846.10, whole genome shotgun sequence Length of sequence - 5683 bp Number of predicted genes - 6, with homology - 6 Number of transcription units 
- 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 24/0.000 - CDS 32 - 1234 856 ## COG1459 Type II secretory pathway, component PulF 2 1 Op 2 8/0.000 - CDS 1224 - 2609 856 ## COG2804 Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB 3 1 Op 3 5/1.000 - CDS 2619 - 3059 488 ## COG4969 Tfp pilus assembly protein, major pilin PilA - Prom 3136 - 3195 1.7 4 2 Tu 1 . - CDS 3262 - 4155 460 ## PROTEIN SUPPORTED gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 - Prom 4176 - 4235 2.0 + Prom 3973 - 4032 2.0 5 3 Op 1 6/1.000 + CDS 4243 - 4794 509 ## COG3023 Negative regulator of beta-lactamase expression 6 3 Op 2 . + CDS 4791 - 5645 846 ## COG3725 Membrane protein required for beta-lactamase induction Predicted protein(s) >gi|296493229|gb|ADTK01000272.1| GENE 1 32 - 1234 856 400 aa, chain - ## HITS:1 COG:ECs0110 KEGG:ns NR:ns ## COG: ECs0110 COG1459 # Protein_GI_number: 15829364 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Escherichia coli O157:H7 # 1 400 1 400 400 711 98.0 0 MASKQLWRWHGITGDGNAQDGMLWAESRALLLMALQQQMVTPLSLKRIAINSAQWRGDKS AEVIHQLATLLKAGLTLSEGLALLAEQHPSKQWQALLQSLAHDLEQGIAFSNALLPWSEV FPPLYQAMIRTGELTGKLDECCFELARQQKAQRQLTDKVKSALRYPIIILAMAIMVVVAI LHFVLPEFAAIYKTFNTPLPALTQGIMTLADFSGEWGWLLVLFGFLLAIANKLLMRRPTW LIARQKLLLRIPIMGSLMRGQKLTQIFTILALTQSAGITFLQGVESVRETMRCPYWVQLL TQIQHDISNGHPIWLALKNAGEFSPLCLQLVRTGEASGSLDLMLDNLAHHHRENTMALAD NLAALLEPALLIITGGIIGTLVVAMYLPIFHLGDAMSGMG >gi|296493229|gb|ADTK01000272.1| GENE 2 1224 - 2609 856 461 aa, chain - ## HITS:1 COG:hofB KEGG:ns NR:ns ## COG: hofB COG2804 # Protein_GI_number: 16128100 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB # Organism: Escherichia coli K12 # 1 461 1 461 461 881 98.0 0 
MNIPQLTALCLRYQGVLLDASEEVVHVAVVDAPSHELLDALHFATTKRIEITCWTRQQME GHASRTQQTLPVAVQEKHQPKAELLARTLQSALEQRASDIHIEPADNAYRIRLRIDGVLH PLPDVSPDAGVALTARLKVLGNLDIAEHRLPQDGQFTVELAGNAVSFRIATLPCRGGEKV VLRLLQQVGQALDVNTLGMQPLQLADFAHALQQPQGLVLVTGPTGSGKTVTLYSALQTLN TADINICSVEDPVEIPIAGLNQTQIHPRAGLTFQGVLRALLRQDPDVIMIGEIRDGETAE IAIKAAQTGHLVLSTLHTNSTCETLVRLQQMGVARWMLSSALTLVIAQRLVRKLCPHCRQ QQGEPIHIPVNVWPSPLPHWQAPGCVHCYHGFYGRTALFEVLPITPVIRQLISANTDVES LETHARQAGMCTLFENGCLAVEQGLTTFEELIRVLGMPHGE >gi|296493229|gb|ADTK01000272.1| GENE 3 2619 - 3059 488 146 aa, chain - ## HITS:1 COG:ppdD KEGG:ns NR:ns ## COG: ppdD COG4969 # Protein_GI_number: 16128101 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Tfp pilus assembly protein, major pilin PilA # Organism: Escherichia coli K12 # 1 146 1 146 146 292 99.0 2e-79 MDKQRGFTLIELMVVIGIIAILSAIGIPAYQNYLRKAALTDMLQTFVPYRTAVELCALEH GGLDTCDGGSNGIPSPTTTHYVSAMSVAKGVVSLTGQESLNGLSVVMTPGWDNANGVTGW TRNCNIQSDSALQQACEDVFRFDDAN >gi|296493229|gb|ADTK01000272.1| GENE 4 3262 - 4155 460 297 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 [Kordia algicida OT-1] # 27 295 15 283 286 181 37 9e-46 MPPRRYNPDTRRDELLERINLDIPGAVAQALREDLGGTVDANNDITAKLLPENSRSHATV ITRENGVFCGKRWVEEVFIQLAGDDVTIIWHVDDGDVINANQSLFELEGPSRVLLTGERT ALNFVQTLSGVASKVRHYVELLEGTNTQLLDTRKTLPGLRSALKYAVLCGGGANHRLGLS DAFLIKENHIIASGSVRQAVEKASWLHPDAPVEVEVENLEELDEALKAGADIIMLDNFET EQMREAVKRTNGKALLEVSGNVTDKTLREFAETGVDFISVGALTKHVQALDLSMRFR >gi|296493229|gb|ADTK01000272.1| GENE 5 4243 - 4794 509 183 aa, chain + ## HITS:1 COG:ampD KEGG:ns NR:ns ## COG: ampD COG3023 # Protein_GI_number: 16128103 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Escherichia coli K12 # 1 183 1 183 183 384 100.0 1e-107 MLLEQGWLVGARRVPSPHYDCRPDDETPTLLVVHNISLPPGEFGGPWIDALFTGTIDPQA HPFFAEIAHLRVSAHCLIRRDGEIVQYVPFDKRAWHAGVSQYQGRERCNDFSIGIELEGT 
DTLAYTDAQYQQLAAVTRALIDCYPDIAKNMTGHCDIAPDRKTDPGPAFDWARFRVLVSK ETT >gi|296493229|gb|ADTK01000272.1| GENE 6 4791 - 5645 846 284 aa, chain + ## HITS:1 COG:ampE KEGG:ns NR:ns ## COG: ampE COG3725 # Protein_GI_number: 16128104 # Func_class: V Defense mechanisms # Function: Membrane protein required for beta-lactamase induction # Organism: Escherichia coli K12 # 1 284 1 284 284 518 100.0 1e-147 MTLFTTLLVLIFERLFKLGEHWQLDHRLEAFFRRVKHFSLGRTLGMTIIAMGVTFLLLRA LQGVLFNVPTLLVWLLIGLLCIGAGKVRLHYHAYLTAASRNDSHARATMAGELTMIHGVP AGCDEREYLRELQNALLWINFRFYLAPLFWLIVGGTWGPVTLMGYAFLRAWQYWLARYQT PHHRLQSGIDAVLHVLDWVPVRLAGVVYALIGHGEKALPAWFASLGDFHTSQYQVLTRLA QFSLAREPHVDKVETPKAAVSMAKKTSFVVVVVIALLTIYGALV Prediction of potential genes in microbial genomes Time: Mon May 16 15:52:32 2011 Seq name: gi|296493228|gb|ADTK01000273.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont846.11, whole genome shotgun sequence Length of sequence - 7509 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 41 - 1411 1444 ## COG1113 Gamma-aminobutyrate permease and related permeases - Prom 1503 - 1562 5.2 + Prom 1682 - 1741 7.4 2 2 Op 1 6/0.000 + CDS 1955 - 2719 716 ## COG2186 Transcriptional regulators + Term 2724 - 2763 2.8 + Prom 2744 - 2803 1.7 3 2 Op 2 13/0.000 + CDS 2880 - 5543 3324 ## COG2609 Pyruvate dehydrogenase complex, dehydrogenase (E1) component 4 2 Op 3 . 
+ CDS 5558 - 7450 2186 ## COG0508 Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes + Term 7451 - 7498 0.0 Predicted protein(s) >gi|296493228|gb|ADTK01000273.1| GENE 1 41 - 1411 1444 456 aa, chain - ## HITS:1 COG:aroP KEGG:ns NR:ns ## COG: aroP COG1113 # Protein_GI_number: 16128105 # Func_class: E Amino acid transport and metabolism # Function: Gamma-aminobutyrate permease and related permeases # Organism: Escherichia coli K12 # 1 456 2 457 457 826 100.0 0 MEGQQHGEQLKRGLKNRHIQLIALGGAIGTGLFLGSASVIQSAGPGIILGYAIAGFIAFL IMRQLGEMVVEEPVAGSFSHFAYKYWGSFAGFASGWNYWVLYVLVAMAELTAVGKYIQFW YPEIPTWVSAAVFFVVINAINLTNVKVFGEMEFWFAIIKVIAVVAMIIFGGWLLFSGNGG PQATVSNLWDQGGFLPHGFTGLVMMMAIIMFSFGGLELVGITAAEADNPEQSIPKATNQV IYRILIFYIGSLAVLLSLMPWTRVTADTSPFVLIFHELGDTFVANALNIVVLTAALSVYN SCVYCNSRMLFGLAQQGNAPKALASVDKRGVPVNTILVSALVTALCVLINYLAPESAFGL LMALVVSALVINWAMISLAHMKFRRAKQEQGVVTRFPALLYPLGNWICLLFMAAVLVIML MTPGMAISVYLIPVWLIVLGIGYLFKEKTAKAVKAH >gi|296493228|gb|ADTK01000273.1| GENE 2 1955 - 2719 716 254 aa, chain + ## HITS:1 COG:ECs0117 KEGG:ns NR:ns ## COG: ECs0117 COG2186 # Protein_GI_number: 15829371 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 254 1 254 254 402 100.0 1e-112 MAYSKIRQPKLSDVIEQQLEFLILEGTLRPGEKLPPERELAKQFDVSRPSLREAIQRLEA KGLLLRRQGGGTFVQSSLWQSFSDPLVELLSDHPESQYDLLETRHALEGIAAYYAALRST DEDKERIRELHHAIELAQQSGDLDAESNAVLQYQIAVTEAAHNVVLLHLLRCMEPMLAQN VRQNFELLYSRREMLPLVSSHRTRIFEAIMAGKPEEAREASHRHLAFIEEILLDRSREES RRERSLRRLEQRKN >gi|296493228|gb|ADTK01000273.1| GENE 3 2880 - 5543 3324 887 aa, chain + ## HITS:1 COG:ECs0118 KEGG:ns NR:ns ## COG: ECs0118 COG2609 # Protein_GI_number: 15829372 # Func_class: C Energy production and conversion # Function: Pyruvate dehydrogenase complex, dehydrogenase (E1) component # Organism: Escherichia coli O157:H7 # 1 887 1 887 887 1837 100.0 0 MSERFPNDVDPIETRDWLQAIESVIREEGVERAQYLIDQLLAEARKGGVNVAAGTGISNY 
INTIPVEEQPEYPGNLELERRIRSAIRWNAIMTVLRASKKDLELGGHMASFQSSATIYDV CFNHFFRARNEQDGGDLVYFQGHISPGVYARAFLEGRLTQEQLDNFRQEVHGNGLSSYPH PKLMPEFWQFPTVSMGLGPIGAIYQAKFLKYLEHRGLKDTSKQTVYAFLGDGEMDEPESK GAITIATREKLDNLVFVINCNLQRLDGPVTGNGKIINELEGIFEGAGWNVIKVMWGSRWD ELLRKDTSGKLIQLMNETVDGDYQTFKSKDGAYVREHFFGKYPETAALVADWTDEQIWAL NRGGHDPKKIYAAFKKAQETKGKATVILAHTIKGYGMGDAAEGKNIAHQVKKMNMDGVRH IRDRFNVPVSDADIEKLPYITFPEGSEEHTYLHAQRQKLHGYLPSRQPNFTEKLELPSLQ DFGALLEEQSKEISTTIAFVRALNVMLKNKSIKDRLVPIIADEARTFGMEGLFRQIGIYS PNGQQYTPQDREQVAYYKEDEKGQILQEGINELGAGCSWLAAATSYSTNNLPMIPFYIYY SMFGFQRIGDLCWAAGDQQARGFLIGGTSGRTTLNGEGLQHEDGHSHIQSLTIPNCISYD PAYAYEVAVIMHDGLERMYGEKQENVYYYITTLNENYHMPAMPEGAEEGIRKGIYKLETI EGSKGKVQLLGSGSILRHVREAAEILAKDYGVGSDVYSVTSFTELARDGQDCERWNMLHP LETPRVPYIAQVMNDAPAVASTDYMKLFAEQVRTYVPADDYRVLGTDGFGRSDSRENLRH HFEVDASYVVVAALGELAKRGEIDKKVVADAIAKFNIDADKVNPRLA >gi|296493228|gb|ADTK01000273.1| GENE 4 5558 - 7450 2186 630 aa, chain + ## HITS:1 COG:ECs0119 KEGG:ns NR:ns ## COG: ECs0119 COG0508 # Protein_GI_number: 15829373 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes # Organism: Escherichia coli O157:H7 # 1 630 1 630 630 941 99.0 0 MAIEIKVPDIGADEVEITEILVKVGDKVEAEQSLITVEGDKASMEVPSPQAGIVKEIKVS VGDKTQTGALIMIFDSADGAADAAPAQAEEKKEAAPAAAPAAAAAKDVNVPDIGSDEVEV TEILVKVGDKVEAEQSLITVEGDKASMEVPAPFAGTVKEIKVNVGDKVSTGSLIMVFEVA GEAGAAAPAAKQEAAPAAAPAPAAGVKEVNVPDIGGDEVEVTEVMVKVGDKVAAEQSLIT VEGDKASMEVPAPFAGVVKELKVNVGDKVKTGSLIMIFEVEGAAPAAAPAKQEAAAPAPA AKAEAPAAAPAAKAEGKSEFAENDAYVHATPLIRRLAREFGVNLAKVKGTGRKGRILRED VQAYVKEAIKRAEAAPAATGGGIPGLVPWPEVDFSKFGEIEEVELGRIQKISGANLSRNW VMIPHVTHFDKTDITELEAFRKQQNEEAAKRKLDVKITPVVFIMKAVAAALEQMPRFNSS LSEDGQRLTLKKYINIGVAVDTPNGLVVPVFKDVNKKGIIELSRELMTISKKARDGKLTA GEMQGGCFTISSIGGLGTTHFAPIVNAPEVAILGVSKSAMEPVWNGKEFVPRLMLPISLS FDHRVIDGADGARFITIINNTLSDIRRLVM Prediction of potential genes in microbial genomes Time: Mon May 16 15:52:38 2011 Seq name: 
gi|296493227|gb|ADTK01000274.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont846.12, whole genome shotgun sequence Length of sequence - 18579 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 12, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 82 - 141 7.8 1 1 Tu 1 . + CDS 219 - 1643 682 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 + Term 1667 - 1722 5.0 - Term 1666 - 1695 -0.9 2 2 Tu 1 . - CDS 1714 - 3528 718 ## ECO111_0118 hypothetical protein - Prom 3742 - 3801 4.4 + Prom 3717 - 3776 8.6 3 3 Tu 1 . + CDS 3883 - 6480 2967 ## COG1049 Aconitase B + Term 6516 - 6553 4.5 + Prom 6538 - 6597 4.1 4 4 Tu 1 . + CDS 6655 - 7017 463 ## COG3112 Uncharacterized protein conserved in bacteria + Term 7173 - 7211 2.2 - Term 7009 - 7050 10.2 5 5 Op 1 9/0.000 - CDS 7055 - 7849 849 ## COG1586 S-adenosylmethionine decarboxylase 6 5 Op 2 . - CDS 7865 - 8731 894 ## COG0421 Spermidine synthase - Prom 8770 - 8829 4.8 - Term 8778 - 8811 3.8 7 6 Tu 1 . - CDS 8837 - 9184 351 ## ECSE_0122 hypothetical protein - Prom 9326 - 9385 2.0 + Prom 9252 - 9311 4.8 8 7 Tu 1 . + CDS 9350 - 10900 1601 ## COG2132 Putative multicopper oxidases - Term 10953 - 10990 2.2 9 8 Tu 1 . - CDS 11102 - 13492 2524 ## COG4993 Glucose dehydrogenase - Prom 13529 - 13588 3.7 + Prom 13500 - 13559 6.4 10 9 Tu 1 . + CDS 13698 - 14234 843 ## COG0634 Hypoxanthine-guanine phosphoribosyltransferase + Term 14243 - 14281 4.2 11 10 Tu 1 . 
- CDS 14275 - 14937 668 ## COG0288 Carbonic anhydrase - Prom 15033 - 15092 5.4 + Prom 14948 - 15007 4.2 12 11 Op 1 45/0.000 + CDS 15046 - 15972 946 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 13 11 Op 2 3/0.600 + CDS 15969 - 16739 860 ## COG0842 ABC-type multidrug transport system, permease component + Term 16747 - 16786 6.6 14 12 Op 1 4/0.400 + CDS 16844 - 17284 305 ## COG2893 Phosphotransferase system, mannose/fructose-specific component IIA + Term 17300 - 17335 1.1 15 12 Op 2 . + CDS 17348 - 18578 753 ## COG0726 Predicted xylanase/chitin deacetylase Predicted protein(s) >gi|296493227|gb|ADTK01000274.1| GENE 1 219 - 1643 682 474 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 6 452 3 444 458 267 33 4e-71 MSTEIKTQVVVLGAGPAGYSAAFRCADLGLETVIVERYNTLGGVCLNVGCIPSKALLHVA KVIEEAKALAEHGIVFGEPKTDIDKIRTWKEKVINQLTGGLAGMAKGRKVKVVNGLGKFT GANTLEVEGENGKTVINFDNAIIAAGSRPIQLPFIPHEDPRIWDSTDALELKEVPERLLV MGGGIIGLEMGTVYHALGSQIDVVEMFDQVIPAADKDIVKVFTKRISKKFNLMLETKVTA VEAKEDGIYVTMEGKKAPAEPQRYDAVLVAIGRVPNGKNLDAGKAGVEVDDRGFIRVDKQ LRTNVPHIFAIGDIVGQPMLAHKGVHEGHVAAEVIAGKKHYFDPKVIPSIAYTEPEVAWV GLTEKEAKEKGISYETATFPWAASGRAIASDCADGMTKLIFDKESHRVIGGAIVGTNGGE LLGEIGLAIEMGCDAEDIALTIHAHPTLHESVGLAAEVFEGSITDLPNPKAKKK >gi|296493227|gb|ADTK01000274.1| GENE 2 1714 - 3528 718 604 aa, chain - ## HITS:1 COG:no KEGG:ECO111_0118 NR:ns ## KEGG: ECO111_0118 # Name: yacH # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 604 1 617 617 978 97.0 0 MKMTLPFKPHVLALICSAGLCAASTGLYIKSRTVEAPVEPQSTQQTAPDITAVTLPATVS APPVTPAVVKSAFSTAQIDQWVAPVALYPDSLLSQVLMASTYPANVAQAVQWSHDNPLKQ GDAAIQAVSDQPWDASVKSLVAFPQLMALMGENPQWVQNLGDAFLAQPQDVMDSVQRLRQ LAQQTGSLKSSTEQKVITTTKKAVPVKQTVTAPVIPSNTVLTASPVITEPATTVISIEPA NPDVVYIPNYNPTVVYGNWANTAYPPVYLPPPAGEPFIDSFVRGFGYSMGVATTYALFSS IDWDDDDHDHHHRDGNGWQHNGDNINIDVNNFNRITGEHLTDKNMAWRHNPNYRNGVPYH 
DQDMAKRFHQTDVNGGMSATQLPAPTRDSQRQAAASQFQQRTHAAPVITRDTQRQAAAQR FNEAENYGSYDDFRDFSRRQPLTQQQKDAARQRYQSASPEQRQAVREKMQTNPQIQQRRD AARERIQPASPEQRQAVREKMQTNPQIQQRRDAARERIQSASPEQRQVFKEKVQQRPLNQ QQRDNARQRVQSASPEQRQVFREKVQESRPQRLNDSNHTVRLNNEQRSAVRERLSERGAR RQER >gi|296493227|gb|ADTK01000274.1| GENE 3 3883 - 6480 2967 865 aa, chain + ## HITS:1 COG:acnB KEGG:ns NR:ns ## COG: acnB COG1049 # Protein_GI_number: 16128111 # Func_class: C Energy production and conversion # Function: Aconitase B # Organism: Escherichia coli K12 # 1 865 1 865 865 1705 100.0 0 MLEEYRKHVAERAAEGIAPKPLDANQMAALVELLKNPPAGEEEFLLDLLTNRVPPGVDEA AYVKAGFLAAIAKGEAKSPLLTPEKAIELLGTMQGGYNIHPLIDALDDAKLAPIAAKALS HTLLMFDNFYDVEEKAKAGNEYAKQVMQSWADAEWFLNRPALAEKLTVTVFKVTGETNTD DLSPAPDAWSRPDIPLHALAMLKNAREGIEPDQPGVVGPIKQIEALQQKGFPLAYVGDVV GTGSSRKSATNSVLWFMGDDIPHVPNKRGGGLCLGGKIAPIFFNTMEDAGALPIEVDVSN LNMGDVIDVYPYKGEVRNHETGELLATFELKTDVLIDEVRAGGRIPLIIGRGLTTKAREA LGLPHSDVFRQAKDVAESDRGFSLAQKMVGRACGVKGIRPGAYCEPKMTSVGSQDTTGPM TRDELKDLACLGFSADLVMQSFCHTAAYPKPVDVNTHHTLPDFIMNRGGVSLRPGDGVIH SWLNRMLLPDTVGTGGDSHTRFPIGISFPAGSGLVAFAAATGVMPLDMPESVLVRFKGKM QPGITLRDLVHAIPLYAIKQGLLTVEKKGKKNIFSGRILEIEGLPDLKVEQAFELTDASA ERSAAGCTIKLNKEPIIEYLNSNIVLLKWMIAEGYGDRRTLERRIQGMEKWLANPELLEA DADAEYAAVIDIDLADIKEPILCAPNDPDDARPLSAVQGEKIDEVFIGSCMTNIGHFRAA GKLLDAHKGQLPTRLWVAPPTRMDAAQLTEEGYYSVFGKSGARIEIPGCSLCMGNQARVA DGATVVSTSTRNFPNRLGTGANVFLASAELAAVAALIGKLPTPEEYQTYVAQVDKTAVDT YRYLNFNQLSQYTEKADGVIFQTAV >gi|296493227|gb|ADTK01000274.1| GENE 4 6655 - 7017 463 120 aa, chain + ## HITS:1 COG:ECs0123 KEGG:ns NR:ns ## COG: ECs0123 COG3112 # Protein_GI_number: 15829377 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 120 17 136 136 217 100.0 3e-57 MDYEFLRDITGVVKVRMSMGHEVVGHWFNEEVKENLALLDEVEQAAHALKGSERSWQRAG HEYTLWMDGEEVMVRANQLEFAGDEMEEGMNYYDEESLSLCGVEDFLQVVAAYRNFVQQK >gi|296493227|gb|ADTK01000274.1| GENE 5 7055 - 7849 849 264 aa, chain - ## HITS:1 COG:ECs0124 KEGG:ns NR:ns ## COG: ECs0124 
COG1586 # Protein_GI_number: 15829378 # Func_class: E Amino acid transport and metabolism # Function: S-adenosylmethionine decarboxylase # Organism: Escherichia coli O157:H7 # 1 264 1 264 264 537 100.0 1e-153 MKKLKLHGFNNLTKSLSFCIYDICYAKTAEERDGYIAYIDELYNANRLTEILSETCSIIG ANILNIARQDYEPQGASVTILVSEEPVDPKLIDKTEHPGPLPETVVAHLDKSHICVHTYP ESHPEGGLCTFRADIEVSTCGVISPLKALNYLIHQLESDIVTIDYRVRGFTRDINGMKHF IDHEINSIQNFMSDDMKALYDMVDVNVYQENIFHTKMLLKEFDLKHYMFHTKPEDLTDSE RQEITAALWKEMREIYYGRNMPAV >gi|296493227|gb|ADTK01000274.1| GENE 6 7865 - 8731 894 288 aa, chain - ## HITS:1 COG:speE KEGG:ns NR:ns ## COG: speE COG0421 # Protein_GI_number: 16128114 # Func_class: E Amino acid transport and metabolism # Function: Spermidine synthase # Organism: Escherichia coli K12 # 1 288 1 288 288 600 99.0 1e-171 MAEKKQWHETLHDQFGQYFAVDNVLYHEKTDHQDLIIFENAAFGRVMALDGVVQTTERDE FIYHEMMTHVPLLTHGHAKHVLIIGGGDGAMLREVTRHKNVESITMVEIDAGVVSFCRQY LPNHNAGSYDDPRFKLVIDDGVNFVNQTSQTFDVIISDCTDPIGPGESLFTSAFYEGCKR CLNPGGIFVAQNGVCFLQQEEAIDSHRKLSHYFSDVGFYQAAIPTYYGGIMTFAWATDND ALRHLSTEIIQARFLASGLKCRYYNPAIHTAAFALPQYLQDALASQPS >gi|296493227|gb|ADTK01000274.1| GENE 7 8837 - 9184 351 115 aa, chain - ## HITS:1 COG:no KEGG:ECSE_0122 NR:ns ## KEGG: ECSE_0122 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 115 42 156 156 231 100.0 8e-60 MKTFFRTVLFGSLMAVCANSYALSESEAEDMADLTAVFVFLKNDCGYQNLPNGQIRRALV FFAQQNQWDLSNYDTFDMKALGEDSYRDLSGIGIPVAKKCKALARDSLSLLAYVK >gi|296493227|gb|ADTK01000274.1| GENE 8 9350 - 10900 1601 516 aa, chain + ## HITS:1 COG:yacK KEGG:ns NR:ns ## COG: yacK COG2132 # Protein_GI_number: 16128116 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Putative multicopper oxidases # Organism: Escherichia coli K12 # 1 516 1 516 516 1047 99.0 0 MQRRDFLKYSVALGVASALPLWSRAVFAAERPTLPIPDLLTTDARNRIQLTIGAGQSTFG GKTATTWGYNGNLLGPAVKLQHGKAVTVDIYNQLTEETTLHWHGLEVPGEVDGGPQGIIP PGGKRSVTLNVDQPAATCWFHPHQHGKTGRQVAMGLAGLVVIEDDEILKLMLPKQWGIDD 
VPVIVQDKKFSADGQIDYQLDVMTAAVGWFGDTLLTNGAIYPQHAAPRGWLRLRLLNGCN ARSLNFATSDNRPLYVIASDGGLLPEPVKVNELPVLMGERFEVLVEVNDNKPFDLVTLPV SQMGMAIAPFDKPHPVMRIQPIAISASGALPDTLSSLPALPSLEGLTVRKLQLSMDPMLD MMGMQMLMEKYGDQAMAGMDHSQMMGHMGHGNMNHMNHGGKFDFHHANKINGQAFDMNKP MFAAAKGQYERWVISGVGDMMLHPFHIHGTQFRILSENGKPPAAHRAGWKDTVKVEGNVS EVLVKFNHDAPKEHAYMAHCHLLEHEDTGMMLGFTV >gi|296493227|gb|ADTK01000274.1| GENE 9 11102 - 13492 2524 796 aa, chain - ## HITS:1 COG:gcd KEGG:ns NR:ns ## COG: gcd COG4993 # Protein_GI_number: 16128117 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose dehydrogenase # Organism: Escherichia coli K12 # 1 796 1 796 796 1520 99.0 0 MAINNTGSRRLLVTLTALFAALCGLYLLIGGGWLVAIGGSWYYPIAGLVMLGVAWMLWLS KRAALWLYAALLLGTMIWGVWEVGFDFWALTPRSDILVFFGIWLILPFVWRRLVIPASGA VAALVVALLISGGILTWAGFNDPQEINGTLSADATPAEAISPVADQDWPAYGRNQEGQRF SPLKQINADNVHKLKEAWVFRTGDVKQPNDPGEITNEVTPIKVGDTLYLCTAHQRLFALD AASGKEKWHYDPELKTNESFQHVTCRGVSYHEAKAETASPEVMADCPRRIILPVNDGRLI AINAENGKLCETFANKGVLNLQSNMPDTKPGLYEPTSPPIITDKTIVMAGSVTDNFSTRE TSGVIRGFDVNTGELLWAFDPGAKNPNAIPSDEHTFTFNSPNSWAPAAYDAKLDLVYLPM GVTTPDIWGGNRTPEQERYASSILALNATTGKLAWSYQTVHHDLWDMDLPAQPTLADITV NGQKVPVIYAPAKTGNIFVLDRRNGELVVPAPEKPVPQGAAKGDYVTPTQPFSELSFRPT KDLSGADMWGATMFDQLVCRVMFHQMRYEGIFTPPSEQGTLVFPGNLGMFEWGGISVDPN REVAIANPMALPFVSKLIPRGPGNPMEQPKDAKGTGTESGIQPQYGVPYGVTLNPFLSPF GLPCKQPAWGYISALDLKTNEVVWKKRIGTPQDSMPFPMPVPVPFNMGMPMLGGPISTAG NVLFIAATADNYLRAYNMSNGEKLWQGRLPAGGQATPMTYEVNGKQYVVISAGGHGSFGT KMGDYIVAYALPDDVK >gi|296493227|gb|ADTK01000274.1| GENE 10 13698 - 14234 843 178 aa, chain + ## HITS:1 COG:ECs0129 KEGG:ns NR:ns ## COG: ECs0129 COG0634 # Protein_GI_number: 15829383 # Func_class: F Nucleotide transport and metabolism # Function: Hypoxanthine-guanine phosphoribosyltransferase # Organism: Escherichia coli O157:H7 # 1 178 5 182 182 337 100.0 9e-93 MKHTVEVMIPEAEIKARIAELGRQITERYKDSGSDMVLVGLLRGSFMFMADLCREVQVSH EVDFMTASSYGSGMSTTRDVKILKDLDEDIRGKDVLIVEDIIDSGNTLSKVREILSLREP KSLAICTLLDKPSRREVNVPVEFIGFSIPDEFVVGYGIDYAQRYRHLPYIGKVILLDE 
>gi|296493227|gb|ADTK01000274.1| GENE 11 14275 - 14937 668 220 aa, chain - ## HITS:1 COG:yadF KEGG:ns NR:ns ## COG: yadF COG0288 # Protein_GI_number: 16128119 # Func_class: P Inorganic ion transport and metabolism # Function: Carbonic anhydrase # Organism: Escherichia coli K12 # 1 220 1 220 220 457 100.0 1e-129 MKDIDTLISNNALWSKMLVEEDPGFFEKLAQAQKPRFLWIGCSDSRVPAERLTGLEPGEL FVHRNVANLVIHTDLNCLSVVQYAVDVLEVEHIIICGHYGCGGVQAAVENPELGLINNWL LHIRDIWFKHSSLLGEMPQERRLDTLCELNVMEQVYNLGHSTIMQSAWKRGQKVTIHGWA YGIHDGLLRDLDVTATNRETLEQRYRHGISNLKLKHANHK >gi|296493227|gb|ADTK01000274.1| GENE 12 15046 - 15972 946 308 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 1 302 1 304 311 369 63 1e-101 MTIALELQQLKKTYPGGVQALRGIDLQVEAGDFYALLGPNGAGKSTTIGIISSLVNKTSG RVSVFGYDLEKDVVNAKRQLGLVPQEFNFNPFETVQQIVVNQAGYYGVERKEAYIRSEKY LKQLDLWGKRNERARMLSGGMKRRLMIARALMHEPKLLILDEPTAGVDIELRRSMWGFLK DLNDKGTTIILTTHYLEEAEMLCRNIGIIQHGELVENTSMKALLAKLKSETFILDLAPKS PLPKLDGYQYRLVDTATLEVEVLREQGINSVFTQLSEQGIQVLSMRNKANRLEELFVSLV NEKQGDRA >gi|296493227|gb|ADTK01000274.1| GENE 13 15969 - 16739 860 256 aa, chain + ## HITS:1 COG:ECs0132 KEGG:ns NR:ns ## COG: ECs0132 COG0842 # Protein_GI_number: 15829386 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Escherichia coli O157:H7 # 1 256 1 256 256 428 99.0 1e-120 MMHLYWVALKSIWAKEIHRFMRIWVQTLVPPVITMTLYFIIFGNLIGSRIGDMHGFSYMQ FIVSGLIMMSVITNAYANVASSFFGAKFQRNIEELLVAPVPTHVIIAGYVGGGVARGLFV GILVTAISLFFVPFQVHSWVFVALTLVLTAVLFSLAGLLNGVFAKTFDDISLVPTFVLTP LTYLGGVFYSLTLLPPFWQGLSHLNPIVYMISGFRYGFLGINDVPLVTTFGVLVVFIVAF YLICWSLIQRGRGLRS >gi|296493227|gb|ADTK01000274.1| GENE 14 16844 - 17284 305 146 aa, chain + ## HITS:1 COG:yadI KEGG:ns NR:ns ## COG: yadI COG2893 # Protein_GI_number: 16128122 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose-specific component IIA # Organism: 
Escherichia coli K12 # 1 146 1 146 146 302 100.0 1e-82 MLGWVITCHDDRAQEILDALEKKHGALLQCRAVNFWRGLSSNMLSRMMCDALHEADSGEG VIFLTDIAGAPPYRVASLLSHKHSRCEVISGVTLPLIEQMMACRETMTSSEFRERIVELG APEVSSLWHQQQKNPPFVLKHNLYEY >gi|296493227|gb|ADTK01000274.1| GENE 15 17348 - 18578 753 410 aa, chain + ## HITS:1 COG:yadE KEGG:ns NR:ns ## COG: yadE COG0726 # Protein_GI_number: 16128123 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Escherichia coli K12 # 1 400 1 400 409 814 98.0 0 MYKQAVILLLMLFTASVSAALPARYMQTIENAAIWAQIGDKMVTVGNIRAGQIIAVEPTA ASYYAFNFGFGKGFIDKGHLEPVQGRQKVEDGLGDLNKPLSNQNLITWKDTPVYNAPSAG SAPFGVLADNLRYPILHKLKDRLNQTWYQIRIGDRLAYISALDAQPDNGLPVLTYHHILR DEENTRFRHTSTTTSVRAFNNQMAWLRDRGYATLSMAQLEGYVKNKINLPARAVVITFDD GLKSVSRYAYPVLKQYGMKATAFVVTSRIKRHPQKWNPKSLQFMSVSELNEIRDVFDFQS HTHFLHRVDGYRRPILLSRSEHNILFDFARSRRALAQFNPHVWYLSYPFGGFNDKAVKAA NDAGFHLAVTTMKGKVKPGDNPLLLKRLYILRTDSLETMSSCLPVIPADG Prediction of potential genes in microbial genomes Time: Mon May 16 15:52:49 2011 Seq name: gi|296493226|gb|ADTK01000275.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont846.13, whole genome shotgun sequence Length of sequence - 3405 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 17 - 397 427 ## COG0853 Aspartate 1-decarboxylase - Prom 457 - 516 2.6 2 2 Tu 1 . + CDS 347 - 559 81 ## EcE24377A_0134 hypothetical protein + Prom 584 - 643 1.8 3 3 Tu 1 . + CDS 671 - 1573 757 ## COG5464 Uncharacterized conserved protein + Term 1575 - 1619 2.1 - Term 1600 - 1642 10.2 4 4 Op 1 19/0.000 - CDS 1647 - 2498 943 ## COG0414 Panthothenate synthetase 5 4 Op 2 . 
- CDS 2510 - 3304 937 ## COG0413 Ketopantoate hydroxymethyltransferase - Prom 3332 - 3391 4.4 Predicted protein(s) >gi|296493226|gb|ADTK01000275.1| GENE 1 17 - 397 427 126 aa, chain - ## HITS:1 COG:ECs0135 KEGG:ns NR:ns ## COG: ECs0135 COG0853 # Protein_GI_number: 15829389 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate 1-decarboxylase # Organism: Escherichia coli O157:H7 # 1 126 1 126 126 249 100.0 8e-67 MIRTMLQGKLHRVKVTHADLHYEGSCAIDQDFLDAAGILENEAIDIWNVTNGKRFSTYAI AAERGSRIISVNGAAAHCASVGDIVIIASFVTMPDEEARTWRPNVAYFEGDNEMKRTAKA IPVQVA >gi|296493226|gb|ADTK01000275.1| GENE 2 347 - 559 81 70 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_0134 NR:ns ## KEGG: EcE24377A_0134 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 7 70 1 64 64 103 98.0 2e-21 MSHFHAVEFALQHRANHNFYLSTLSLTKQAVPALRKFSRSIARFLMSVYSSDGICVSSLR TAPKRAMYRL >gi|296493226|gb|ADTK01000275.1| GENE 3 671 - 1573 757 300 aa, chain + ## HITS:1 COG:yadD KEGG:ns NR:ns ## COG: yadD COG5464 # Protein_GI_number: 16128125 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 300 1 300 300 578 98.0 1e-165 MDAPSTTPHDAVFKQFLMHAETARDFLEIHLPVELRELCDLNTLHLESGSFIEESLKGHS TDVLYSVQMQGNPGYLHVVIEHQSKPDKKMAFRMMRYSIAAMHRHLEADHDKLPLVVPIL FYQGEATPYPLSMCWFDIFYSPELARRVYNSPFPLVDITITPDDEIMQHRRIAILELLQK HIRQRDLMLLLEQLVTLIDEGYTSGSQLVAMQNYMLQRGHTEQADLFYGVLRDRETGGES MMTLAQWFEEKGIEKGIQQGRQEERQEFALRLLSKGMSREDVAEMANLPLAEIDKVINLI >gi|296493226|gb|ADTK01000275.1| GENE 4 1647 - 2498 943 283 aa, chain - ## HITS:1 COG:panC KEGG:ns NR:ns ## COG: panC COG0414 # Protein_GI_number: 16128126 # Func_class: H Coenzyme transport and metabolism # Function: Panthothenate synthetase # Organism: Escherichia coli K12 # 1 283 1 283 283 548 100.0 1e-156 MLIIETLPLLRQQIRRLRMEGKRVALVPTMGNLHDGHMKLVDEAKARADVVVVSIFVNPM QFDRPEDLARYPRTLQEDCEKLNKRKVDLVFAPSVKEIYPNGTETHTYVDVPGLSTMLEG ASRPGHFRGVSTIVSKLFNLVQPDIACFGEKDFQQLALIRKMVADMGFDIEIVGVPIMRA 
KDGLALSSRNGYLTAEQRKIAPGLYKVLSSIADKLQAGERDLDEIITIAGQELNEKGFRA DDIQIRDADTLLEVSETSKRAVILVAAWLGDARLIDNKMVELA >gi|296493226|gb|ADTK01000275.1| GENE 5 2510 - 3304 937 264 aa, chain - ## HITS:1 COG:ECs0138 KEGG:ns NR:ns ## COG: ECs0138 COG0413 # Protein_GI_number: 15829392 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate hydroxymethyltransferase # Organism: Escherichia coli O157:H7 # 1 264 1 264 264 508 96.0 1e-144 MKPTTIASLQKCKQDKKRFATITAYDYSFAKLFADEGLNVMLVGDSLGMTVQGHDSTLPV TVADIAYHTAAVRRGAPNCLLLADLPFMAYATPEQAFENAATVMRAGANMVKIEGGEWLV ETVKMLTERAVPVCGHLGLTPQSVNIFGGYKVQGRGDEAGDQLLSDALALEAAGAQLLVL ECVPVELAKRITDALAIPVIGIGAGNVTDGQILVMHDAFGITGGHIPKFAKNFLAETGDI RAAVRQYMAEVESGVYPGEEHSFH Prediction of potential genes in microbial genomes Time: Mon May 16 15:52:56 2011 Seq name: gi|296493225|gb|ADTK01000276.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont846.14, whole genome shotgun sequence Length of sequence - 14868 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 4, operones - 3 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 1250 161 ## COG3539 P pilus assembly protein, pilin FimA 2 1 Op 2 . - CDS 1303 - 1899 224 ## SSON_0145 putative fimbrial protein 3 1 Op 3 . 
- CDS 1926 - 2528 400 ## B21_00135 hypothetical protein 4 1 Op 4 6/0.000 - CDS 2543 - 2926 239 ## COG3539 P pilus assembly protein, pilin FimA - Prom 3054 - 3113 3.6 - Term 3040 - 3078 -0.9 5 1 Op 5 10/0.000 - CDS 3129 - 5729 1531 ## COG3188 P pilus assembly protein, porin PapC 6 1 Op 6 7/0.000 - CDS 5764 - 6456 452 ## COG3121 P pilus assembly protein, chaperone PapD - Prom 6493 - 6552 7.8 - Term 6521 - 6575 6.4 7 2 Op 1 3/0.000 - CDS 6609 - 7193 478 ## COG3539 P pilus assembly protein, pilin FimA 8 2 Op 2 7/0.000 - CDS 7563 - 8042 590 ## COG0801 7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase 9 2 Op 3 1/1.000 - CDS 8039 - 9403 1396 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase - Prom 9428 - 9487 3.5 10 3 Op 1 3/0.000 - CDS 9496 - 10392 549 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases - Term 10398 - 10430 3.0 11 3 Op 2 6/0.000 - CDS 10459 - 10914 665 ## COG1734 DnaK suppressor protein - Prom 10968 - 11027 2.3 - Term 10992 - 11027 -0.1 12 3 Op 3 3/0.000 - CDS 11092 - 11796 368 ## COG1489 DNA-binding protein, stimulates sugar fermentation 13 3 Op 4 . - CDS 11811 - 12341 497 ## COG1514 2'-5' RNA ligase - Prom 12363 - 12422 4.3 14 4 Tu 1 . 
+ CDS 12370 - 14844 2087 ## COG1643 HrpA-like helicases Predicted protein(s) >gi|296493225|gb|ADTK01000276.1| GENE 1 2 - 1250 161 416 aa, chain - ## HITS:1 COG:yadC KEGG:ns NR:ns ## COG: yadC COG3539 # Protein_GI_number: 16128128 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 416 1 404 412 461 63.0 1e-129 MKKFFRHFLFLILCLSCYTASAGTDDNVGYIVGNSYGVGPSDQKWRETGPNGDATVIFRY ATSTNNLVFYKPTQLGPTGVKLQWSQLDTASGGGFLYCNRSDSTSGSAMRIENAMVDSGK MYGSHKLFNTSVPGLYYTLLISNMWSAYGTVTNVSSPGIYIGDSAEQYFSWYNPSEDVLY WSCNNANSTRKYWAVGGIYQTLTIEFYTDTNFDPTVTQQIKLSSSSNYLYSFKAYGAGQG INEHSYFIKIDFDLLNVKLTNPTCFTAMLSGTSVTGSTVKMGEYSAEQIRNGATPVPFDI SLQNCVRVTNIETKLVSTKVGTENGQLLGNTLTGNDAAKGVGVLIEGLATSKNPLMTLKP NDSNSVYKDYDPRGKDDTTGGVYPDQDTGITYPLHFQATLQQDGTIPIEAGEFKAT >gi|296493225|gb|ADTK01000276.1| GENE 2 1303 - 1899 224 198 aa, chain - ## HITS:1 COG:no KEGG:SSON_0145 NR:ns ## KEGG: SSON_0145 # Name: yadK # Def: putative fimbrial protein # Organism: S.sonnei # Pathway: not_defined # 1 198 1 198 198 390 100.0 1e-107 MHPTQRKLMKRIILFLSLQFPITYPAIAGQDIDLVANVKNSTCKSGISNQGNIDLGVVGV GYFSDNVTPESYQPGGKEFTVTVSDCALQGTGDVLNQLHIDFRALSGVMAAGSRQIFANE VATGAKNVGVVLFSIQDPTNIFNVISSAGNSRSVYPVMTSALHNSSWKFYARMQKIDPAL DVISGQVMSHILVDVYYE >gi|296493225|gb|ADTK01000276.1| GENE 3 1926 - 2528 400 200 aa, chain - ## HITS:1 COG:no KEGG:B21_00135 NR:ns ## KEGG: B21_00135 # Name: yadL # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 200 1 200 200 308 100.0 9e-83 MTFKNLRYGLSSSVVLAASLFSVPSYAATDSIGLTVITTVEIGTCTATLVNDSDQDISVV DFGDVYISEINAKTKVKTFKLKFKDCAGIPNKKAQIKLTKRATCEGTANDGAGFANGSTA ADKASAVAVEVWSTETPATGSATQFSCVTPASQEVTISNAANAVVYYPMSARLVVEKNKT VSNVTAGKFSAPATFTVTYN >gi|296493225|gb|ADTK01000276.1| GENE 4 2543 - 2926 239 127 aa, chain - ## HITS:1 COG:yadM KEGG:ns NR:ns ## COG: yadM COG3539 # Protein_GI_number: 16128131 # Func_class: N Cell motility; U Intracellular trafficking, 
secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 127 77 203 203 233 100.0 7e-62 MGLDKIANKTTESQADFKLVASGCSSGISWIDTTLTGNASSSSPKLIIPQSGDSSSTTSN IGMGFKKRTTDDATFLKPNSAEKIRWSTDEMQPDKGLEMTVALRETDAGQGVPGNFRALA TFNFIYQ >gi|296493225|gb|ADTK01000276.1| GENE 5 3129 - 5729 1531 866 aa, chain - ## HITS:1 COG:htrE KEGG:ns NR:ns ## COG: htrE COG3188 # Protein_GI_number: 16128132 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli K12 # 1 866 1 865 865 1631 97.0 0 MTIKPTKNYHNHLTRIATFCALLYCNSAFCAELVEYDHTFLMGQNASNIDLSRYSEGNPA IPGMYDVSVYVNDQPIINQSITFIEIEGKKNAQACITLKNLLQFHINSPDINNEKAVLLA RDETLGNCLNLTEIIPQASVRYDVNEQRLDIDVPQAWVMKNYQNYVDPSLWENGINAAML SYNLNGYHSETPGRRNDSIYAAFNGGMNLGAWRLRASGNYNWMTDSGSNYDFKNRYIQRD IASLRSQLILGESYTTGETFDSVSIRGIRLYSDSRMLPPTLASFAPIIHGVANTNAKVTI TQGGYKIYETTVPPGAFVIDDLSPSGYGSDLIVTVEESDGSKRTFSQPFSSVVQMLRPGV GRWDISGGQVLKDDIQDEPNLFQASYYYGLNNYLTGYTGIQITDNNYTAGLLGLGLNTSV GAFSFDVTHSNVRIPDDKTYQGQSYRVSWNKLFEETSTSLNIAAYRYSTQNYLGLNDALT LIDEVKHPEQDLEPKSMRNYSRMKNQVTVSINQPLKFEKKDYGSFYLSGSWSDYWASGQN RSNYSIGYSNSASWGSYSVSAQRSWNEDGDTDDSVYLSFTIPIEKLLDTEQRTSGFQSID TQMSSDFKGNNQLNVSSSGYSDNARLSYSVNTGYTMNKASKDLSYVGGYASYESPWGTLA GSVSANSDNSRQVSLSTDGGFVLHSGGLTFSNDSFSDSDTLAVVQAPGAQGARINYGNST IDRWGYGVTSALSPYHENRIALDINDLENDVELKSTSAVAVPRQGSVVFADFETVQGQSA IMNITRSDGKNIPFAADIYDEQGNVIGNVGQGGQAFVRGIEQQGNISIKWLEESKPVSCL AHYQQSSEAEKIAQSIILNGIRCQIE >gi|296493225|gb|ADTK01000276.1| GENE 6 5764 - 6456 452 230 aa, chain - ## HITS:1 COG:ecpD KEGG:ns NR:ns ## COG: ecpD COG3121 # Protein_GI_number: 16128133 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli K12 # 1 230 17 246 246 444 98.0 1e-125 MAFSSSSIADIVISGTRVIYKSDQKSVNVRLENKGNNPLLVQSWLDTGDDNAEPGSITVP 
FTATPPVSRIDAKRGQTIKLMYTASTSLPKDRESVFWFNVLEVPPKPDAEKVANQSLLQL AFRTRIKLFYRPDGLKGNPSEAPLALKWFWSGSEGKASLRVTNPTPYYVSFSSGDLEASG KRYPIDVKMIAPFSDEVMKVTGLNGKASSAKVHFYAINDFGGAIEGNASL >gi|296493225|gb|ADTK01000276.1| GENE 7 6609 - 7193 478 194 aa, chain - ## HITS:1 COG:yadN KEGG:ns NR:ns ## COG: yadN COG3539 # Protein_GI_number: 16128134 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 194 1 194 194 326 100.0 2e-89 MSKKLGFALSGLMLAMVAGTASADMDGGQLNISGLVVDNTCETRVDGGNKDGLILLQTAT VGEIDAGVLNDTVGAKAKPFSITVDCSKANPNPGSTAKMTFGSVFFGNSKGTLNNDMSIN NPSDGVNIALHNIDGSTIKQVQINNPGDVYTKALDATTKSAVYDFKASYVRAVADQTATA GYVKTNTAYTITYQ >gi|296493225|gb|ADTK01000276.1| GENE 8 7563 - 8042 590 159 aa, chain - ## HITS:1 COG:folK KEGG:ns NR:ns ## COG: folK COG0801 # Protein_GI_number: 16128135 # Func_class: H Coenzyme transport and metabolism # Function: 7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase # Organism: Escherichia coli K12 # 1 159 1 159 159 307 96.0 5e-84 MTVAYIAIGSNLASPLEQVNAALKALGDIPESHILAVSSFYRTPPLGPQDQPDYLNAAVA LETSLAPEELLNHTQRIELQQGRVRKAERWGPRTLDLDIMLFGNEVINTERLTVPHYDMK NRGFMLWPLFEIAPELAFPDGETLREVLHTRAFDKLSKW >gi|296493225|gb|ADTK01000276.1| GENE 9 8039 - 9403 1396 454 aa, chain - ## HITS:1 COG:ECs0147 KEGG:ns NR:ns ## COG: ECs0147 COG0617 # Protein_GI_number: 15829401 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Escherichia coli O157:H7 # 1 454 1 454 454 860 99.0 0 MLSREESEAEQAVARPQVTVIPREQHAISRKDISENALKVMYRLNKAGYEAWLVGGGVRD LLLGKKPKDFDVTTNATPEQVRKLFRNCRLVGRRFRLAHVMFGPEIIEVATFRGHHEGNV SDRTTSQRGQNGMLLRDNIFGSIEEDAQRRDFTINSLYYSVADFTVRDYVGGMKDLKDGV IRLIGNPETRYREDPVRMLRAVRFAAKLGMRISPETAEPIPRLATLLNDIPPARLFEESL KLLQAGYGYDTYKLLCEYHLFQPLFPTITRYFTENGDSPMERIIEQVLKNTDTRIHNDMR VNPAFLFAAMFWYPLLETAQKIAQESGLTYHDAFALAMNDVLDEACRSLAIPKRLTTLTR 
DIWQLQLRMSRRQGKRAWKLLEHPKFRAAYDLLALRAEVERNAELQRLVKWWGEFQVSAP PDQKGMLNELDEEPSPRRRTRRPRKRAPRREGTA >gi|296493225|gb|ADTK01000276.1| GENE 10 9496 - 10392 549 298 aa, chain - ## HITS:1 COG:yadB KEGG:ns NR:ns ## COG: yadB COG0008 # Protein_GI_number: 16128137 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Escherichia coli K12 # 1 298 11 308 308 611 100.0 1e-175 MTDTQYIGRFAPSPSGELHFGSLIAALGSYLQARARQGRWLVRIEDIDPPREVPGAAETI LRQLEHYGLHWDGDVLWQSQRHDAYREALAWLHEQGLSYYCTCTRARIQSIGGIYDGHCR VLHHGPDNAAVRIRQQHPVTQFTDQLRGIIHADEKLAREDFIIHRRDGLFAYNLAVVVDD HFQGVTEIVRGADLIEPTVRQISLYQLFGWKVPDYIHLPLALNPQGAKLSKQNHAPALPK GDPRPVLIAALQFLGQQAEAHWQDFSVEQILQSAVKNWRLTAVPESAIVNSTFSNASC >gi|296493225|gb|ADTK01000276.1| GENE 11 10459 - 10914 665 151 aa, chain - ## HITS:1 COG:ECs0149 KEGG:ns NR:ns ## COG: ECs0149 COG1734 # Protein_GI_number: 15829403 # Func_class: T Signal transduction mechanisms # Function: DnaK suppressor protein # Organism: Escherichia coli O157:H7 # 1 151 1 151 151 261 100.0 5e-70 MQEGQNRKTSSLSILAIAGVEPYQEKPGEEYMNEAQLAHFRRILEAWRNQLRDEVDRTVT HMQDEAANFPDPVDRAAQEEEFSLELRNRDRERKLIKKIEKTLKKVEDEDFGYCESCGVE IGIRRLEARPTADLCIDCKTLAEIREKQMAG >gi|296493225|gb|ADTK01000276.1| GENE 12 11092 - 11796 368 234 aa, chain - ## HITS:1 COG:ECs0150 KEGG:ns NR:ns ## COG: ECs0150 COG1489 # Protein_GI_number: 15829404 # Func_class: R General function prediction only # Function: DNA-binding protein, stimulates sugar fermentation # Organism: Escherichia coli O157:H7 # 1 234 1 234 234 467 100.0 1e-132 MEFSPPLQRATLIQRYKRFLADVITPDGRELTLHCPNTGAMTGCATPGDTVWYSTSDNTK RKYPHTWELTQSQSGAFICVNTLWANRLTKEAILNESISELSGYSSLKSEVKYGAERSRI DFMLQADSRPDCYIEVKSVTLAENEQGYFPDAVTERGQKHLRELMSVAAEGQRAVIFFAV LHSAITRFSPARHIDEKYAQLLSEAQQRGVEILAYKAEISAEGMALKKSLPVTL >gi|296493225|gb|ADTK01000276.1| GENE 13 11811 - 12341 497 176 aa, chain - ## HITS:1 COG:ligT KEGG:ns NR:ns ## COG: ligT COG1514 # Protein_GI_number: 16128140 # Func_class: J Translation, 
ribosomal structure and biogenesis # Function: 2'-5' RNA ligase # Organism: Escherichia coli K12 # 1 176 4 179 179 336 100.0 1e-92 MSEPQRLFFAIDLPAEIREQIIHWRATHFPPEAGRPVAADNLHLTLAFLGEVSAEKEKAL SLLAGRIRQPGFTLTLDDAGQWLRSRVVWLGMRQPPRGLIQLANMLRSQAARSGCFQSNR PFHPHITLLRDASEAVTIPPPGFNWSYAVTEFTLYASSFARGRTRYTPLKRWALTQ >gi|296493225|gb|ADTK01000276.1| GENE 14 12370 - 14844 2087 824 aa, chain + ## HITS:1 COG:hrpB KEGG:ns NR:ns ## COG: hrpB COG1643 # Protein_GI_number: 16128141 # Func_class: L Replication, recombination and repair # Function: HrpA-like helicases # Organism: Escherichia coli K12 # 1 824 1 824 824 1476 99.0 0 MLQCGAKNVNPLERFVSSLPVAAVLSELLTALDCAPQVLLSAPTGAGKSTWLPLQLLAHP GINGKIILLEPRRLAARNVAQRLAELLNEKPGDTVGYRMRAQNCVGPNTRLEVVTEGVLT RMIQRDPELSGVGLVILDEFHERSLQADLALALLLDVQQGLRDDLKLLIMSATLDNDRLQ QMLPEAPVVISEGRSFPVERRYLPLPAHQRFDDAVAVATAEMLRQESGSLLLFLPGVGEI QRVQEQLASRIGSDVLLCPLYGALSLNDQRKAILPAPQGMRKVVLATNIAETSLTIEGIR LVVDCAQERVARFDPRTGLTRLITQRVSQASMTQRAGRAGRLEPGICLHLIAKEQAERAA AQSEPEILQSDLSGLLMELLQWGCSDPAQMSWLDQPPVVNLLAAKRLLQMLGALEGERLS AQGQKMAALGNDPRLAAMLVSAKNDDEAATAAKIAAILEEPPRMGNSDLGVAFSRNQPAW QQRSQQLLKRLNVRGGEADSSLIAPLLAGAFADRIARRRGQDGRYQLANGMGAMLDANDA LSRHEWLIAPLLLQGSASPDARILLALPVDIDELVQRCPQLVQQSDTVEWDDAQGTLKAW RRLQIGLLTVKVQPLAKPSEDELHQAMLNGIRDKDLSVLNWTAEAEQLRLRLLCAAKWLP EYDWPAVDDESLLATLETWLLPHMTGVHSLRGLKSLDIYQALRGLLDWGMQQRLDSELPA HYTVPTGSRIAIRYHEDNPPALAVRMQEMFGEATNPTIAQGRVPLVLELLSPAQRPLQIT RDLSAFWKGAYREVQKEMKGRYPKHVWPDDPANTAPTRRTKKYS Prediction of potential genes in microbial genomes Time: Mon May 16 15:53:03 2011 Seq name: gi|296493224|gb|ADTK01000277.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont846.15, whole genome shotgun sequence Length of sequence - 2676 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 3 - 62 1.7 1 1 Tu 1 . 
+ CDS 104 - 2638 2973 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) Predicted protein(s) >gi|296493224|gb|ADTK01000277.1| GENE 1 104 - 2638 2973 844 aa, chain + ## HITS:1 COG:mrcB KEGG:ns NR:ns ## COG: mrcB COG0744 # Protein_GI_number: 16128142 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Escherichia coli K12 # 1 844 1 844 844 1521 99.0 0 MAGNDREPIGRKGKPTRPVKQKVSRRRYEDDDDYDDYDDYEDEEPMPRKGKGKGKGRKPR GKRGWLWLLLKLAIVFAVLIAIYGVYLDQKIRSRIDGKVWQLPAAVYGRMVNLEPDMTIS KNEMVKLLEATQYRQVSKMTRPGEFTVQANSIEMIRRPFDFPDSKEGQVRARLTFDGDHL ATIVNMENNRQFGFFRLDPRLITMISSPNGEQRLFVPRSGFPDLLVDTLLATEDRHFYEH DGISLYSIGRAVLANLTAGRTVQGASTLTQQLVKNLFLSSERSYWRKANEAYMALIMDAR YSKDRILELYMNEVYLGQSGDNEIRGFPLASLYYFGRPVEELSLDQQALLVGMVKGASIY NPWRNPKLALERRNLVLRLLQQQQIIDQELYDMLSARPLGVQPRGGVISPQPAFMQLVRQ ELQAKLGDKVKDLSGVKIFTTFDSVAQDAAEKAAVEGIPALKKQRKLSDLETAIVVVDRF SGEVRAMVGGSEPQFAGYNRAMQARRSIGSLAKPATYLTALSQPKIYRLNTWIADAPIAL RQPNGQVWSPQNDDRRYSESGRVMLVDALTRSMNVPTVNLGMALGLPAVTETWIKLGVPK DQLHPVPAMLLGALNLTPIEVAQAFQTIASGGNRAPLSALRSVIAEDGKVLYQSFPQAER AVPAQAAYLTLWTMQQVVQRGTGRQLGAKYPNLHLAGKTGTTNNNVDTWFAGIDGSTVTI TWVGRDNNQPTKLYGASGAMSIYQRYLANQTPTPLNLVPPEDIADMGVDYDGNFVCSGGM RVLPVWTSDPQSLCQQSEMQQQPSGNPFDQSSQPQQQPQQQPAQQEQKDSDGVAGWIKDM FGSN Prediction of potential genes in microbial genomes Time: Mon May 16 15:53:04 2011 Seq name: gi|296493223|gb|ADTK01000278.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont846.16, whole genome shotgun sequence Length of sequence - 2415 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 131 - 190 6.2 1 1 Tu 1 . 
+ CDS 211 - 2400 1757 ## COG1629 Outer membrane receptor proteins, mostly Fe transport Predicted protein(s) >gi|296493223|gb|ADTK01000278.1| GENE 1 211 - 2400 1757 729 aa, chain + ## HITS:1 COG:STM0191 KEGG:ns NR:ns ## COG: STM0191 COG1629 # Protein_GI_number: 16763581 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Salmonella typhimurium LT2 # 1 729 1 729 729 1356 94.0 0 MARSKTAQPKHSLRKIAVVVATAVSGMSVYAQAAVEPKEDTITVTAAPAPQESAWGPAAT IAAKHSATATKTDTPIEKTPQSISVVTNEEMQMHQFQSVKEALGYTPGVTVSSRGASNTY DFVIIRGFSSVGLNQNNYLDGLKLQGNFYNDAVIDPYMLERVELMRGPTSVLYGKSNPGG IISMVSKRPTTEPLKEIQFKMGTDNLFQTGFDFSDALDDNGEFSYRLTGLARSTNEQQKN SESQRYTIAPSFSWRPDDKTNFTFLSYFQNEPETGYYGWLPKEGTVEPLPNGKRLPTDFN EGASNNTYSRNQKMVGYSFEHGFNDTFTVRQNLRFSEMKTSQKSVYGTGIANDGHTLNRG TVVDNERLQNFSVDTQLESKFATGEVEHTLLTGVDFMRMRNDINASFGSAPSIDLYNKYH PEYFAFGNAEPYQMNESKQTGIYVQDQAEWNKWVFTLGGRYDWSKQATTVRENSYTPTEG YIERNDHQFTWRGGVNYLFDNGISPYFSYSQSFEPSAFDLWSNPRVSYKPSKGEQYEAGV KYVPNDMPVVVTGAVYQLTKTNNLTADPTNPLAQVPAGEIRARGVELEAKAALNANINLT ASYTYTDAEYTKDTNLKGKTPEQVPEHMASLWGDYTFNEGPLSGLTLGTGGRFIGSSYGD PANTFKVGSAAVMDAVVKYDLARFGMAGSSLAVNVNNLLDREYVASCFQTYGCFWGAERQ VVATATFRF Prediction of potential genes in microbial genomes Time: Mon May 16 15:53:06 2011 Seq name: gi|296493222|gb|ADTK01000279.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont846.17, whole genome shotgun sequence Length of sequence - 3783 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 14/0.000 + CDS 35 - 832 229 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 2 1 Op 2 33/0.000 + CDS 832 - 1722 842 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 3 1 Op 3 . 
+ CDS 1719 - 3701 1829 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component Predicted protein(s) >gi|296493222|gb|ADTK01000279.1| GENE 1 35 - 832 229 265 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 9 235 274 507 563 92 29 4e-19 MQEYTNHSDTTFALRNISFRVPGRTLLHPLSLTFPAGKVTGLIGHNGSGKSTLLKMLGRH QPPTEGEILLDTQPLESWSSKAFARKVAYLPQQLPPAEGLTVRELVAIGRYPWHGALGRF GAADREKVEEAISLVGLKPLAHRLVDSLSGGERQRAWIAMLVAQDSRCLLLDEPTSALDI AHQVDVLALVHRLSQERGLTVIAVLHDINMAARYCDYLVALRGGEMIAQGTPAEIMRGET LEMIYGIPMGILPHPAGAAPVSFVY >gi|296493222|gb|ADTK01000279.1| GENE 2 832 - 1722 842 296 aa, chain + ## HITS:1 COG:fhuD KEGG:ns NR:ns ## COG: fhuD COG0614 # Protein_GI_number: 16128145 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Escherichia coli K12 # 1 296 1 296 296 582 99.0 1e-166 MSGLPLISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVA DTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSSEMLARIAPGR GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLT TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH DNSKDMDALMATPLWQAMPFVRTGRFQRVPAVWFYGATLSAMHFVRVLDNAIGGKA >gi|296493222|gb|ADTK01000279.1| GENE 3 1719 - 3701 1829 660 aa, chain + ## HITS:1 COG:fhuB KEGG:ns NR:ns ## COG: fhuB COG0609 # Protein_GI_number: 16128146 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Escherichia coli K12 # 1 660 1 660 660 964 99.0 0 MSKRIALFPALLLALLVIVATALTWMNFSQALPRSQWAQAAWSPDIDVIEQMIFHYSLLP RLAISLLVGAGLGLVGVLFQQVLRNPLAEPTTLGVATGAQLGITVTTLWAIPGAMASQFA ALVGACVVGLIVFGVAWGKRLSPVTLILAGLVVSLYCGAINQLLVIFHHDQLQSMFLWST GTLTQTDWGGVERLWPQLLGGVMLTLLLLRPLTLMGLDDGVARNLGLALSLARLAALSLA IVISALLVNAVGIIGFIGLFAPLLAKMLGARRLLPRLMLASLIGALILWLSDQIILWLTR VWMEVSTGSVTALIGAPLLLWLLPRLRSISAPDMKVNDRVAAERQHVLAFALAGGVLLII 
AVVVALSFGRDAHGWTWASGALLDDLMPWRWPRIMAALFAGVMLAVAGCIIQRLTGNPMA SPEVLGISSGAAFGVVLMLFLVPGNAFGWLLPAGSLGAAVTLLIIMIAAGRGGFSPHRML LAGMALSTAFTMLLMMLQASGDPRMAQVLTWISGSTYNATDAQVWRTGIVMVILLAITPL CRRWLTILPLGGDTARAVGMALTPTRIALLLLAACLTATATMTIGPLSFVGLMAPHIARM MGFRRTMPHIVISALVGGLLLVFADWCGRMVLFPFQIPAGLLSTFIGAPYFIYLLRKQSR Prediction of potential genes in microbial genomes Time: Mon May 16 15:53:09 2011 Seq name: gi|296493221|gb|ADTK01000280.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont848.1, whole genome shotgun sequence Length of sequence - 8338 bp Number of predicted genes - 6, with homology - 5 Number of transcription units - 2, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 317 - 383 20.3 1 1 Op 1 1/0.000 - CDS 459 - 2657 1997 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Term 2708 - 2734 1.0 2 1 Op 2 3/0.000 - CDS 2742 - 4019 815 ## COG3486 Lysine/ornithine N-monooxygenase 3 1 Op 3 5/0.000 - CDS 4016 - 5758 1474 ## COG4264 Siderophore synthetase component 4 1 Op 4 5/0.000 - CDS 5758 - 6705 248 ## PROTEIN SUPPORTED gi|229250525|ref|ZP_04374554.1| acetyltransferase, ribosomal protein N-acetylase 5 1 Op 5 . - CDS 6706 - 8106 1132 ## COG4264 Siderophore synthetase component - Prom 8152 - 8211 3.3 + Prom 7975 - 8034 2.7 6 2 Tu 1 . 
+ CDS 8059 - 8214 79 ## Predicted protein(s) >gi|296493221|gb|ADTK01000280.1| GENE 1 459 - 2657 1997 732 aa, chain - ## HITS:1 COG:YPO0994 KEGG:ns NR:ns ## COG: YPO0994 COG1629 # Protein_GI_number: 16121296 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Yersinia pestis # 4 732 2 726 726 966 67.0 0 MISKKYTLWALNPLLLTMMAPAVAQQTDDETFVVSANRSNRTVAEMAQTTWVIENAELEQ QIQGGKELKDALAQLIPGLDVSSRSRTNYGMNVRGRPLVVLVDGVRLNSSRTDSRQLDSI DPFNIDHIEVISGATSLYGGGSTGGLINIVTKKGQPETMMEFEAGTKSGFSSSKDHDERI AGAVSGGNEHISGRLSVAYQKFGGWFDGNGDATLLDNTQTGLQYSDRLDIMGTGTLNIDE SRQLQLITQYYKSQGDDDYGLNLGKGFSAIRGTSTPFVSNGLNSDRIPGTERHLISLQYS DSAFLGQELVGQVYYRDESLRFYPFPTVNANKQVTAFSSSQQDTDQYGMKLTLNSKPMDG WQITWGLDADHERFTSNQMFFDLAQASASGGLNNKKIYTTGRYPSYDITNLAVFLQSGYD INNLFTLNGGVRYQYTENKIDDFIGYAQQRQIAAGKATSADAIPGGSVDYDNFLFNAGLL MHITERQQAWLNFSQGVELPDPGKYYGRGIYGAAVNGHLPLTKSVNVSDSKLEGVKVDSY ELGWRFTGNNLRTQIAAYYSISDKSVVANKDLTISVVDDKRRIYGVEGAVDYLIPDTDWS TGVNFNVLKTESKVNGTWQKYDVKTASPSKATAYIGWAPDPWSLRVQSTTSFDVSDAQGY KVDGYTTVDLLGSYQLPVGTLSFSIENLFDRDYTTVWGQRAPLYYSPGYGPASLYDYKGR GRTFGLNYSVLF >gi|296493221|gb|ADTK01000280.1| GENE 2 2742 - 4019 815 425 aa, chain - ## HITS:1 COG:YPO0993 KEGG:ns NR:ns ## COG: YPO0993 COG3486 # Protein_GI_number: 16121295 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Lysine/ornithine N-monooxygenase # Organism: Yersinia pestis # 1 423 1 423 442 554 59.0 1e-157 MKKSVDFIGVGTGPFNLSIAALSHQIEELDCLFFDEHPHFSWHPGMLVPDCHMQTVFLKD LVSAVAPTNPYSFVNYLVKHKKFYRFLTSRLRTVSREEFSDYLRWAAEDMNNLYFSHTVE NIDFDKKRRLFLVQTSQGEYFARNICLGTGKQPYLPPCVKHMTQSCFHASEMNLRRPDLS GKRITVVGGGQSGADLFLNALRGEWGEAAEINWVSRRNNFNALDEAAFADEYFTPEYISG FSGLEEDIRHQLLDEQKMTSDGITADSLLTIYRELYHRFEVLRKPRNIRLLPSRSVTTLE SSGPGWKLLMEHHLDQGRESLESDVVIFATGYRSALPQILPSLMPLITMHDKNTFKVRDD FTLEWSGPKENNIFVVNASMQTHGIAEPQLSLMAWRSARILNRVMGRDLFDLSMPPALIQ WRSGT >gi|296493221|gb|ADTK01000280.1| GENE 3 4016 - 5758 1474 580 aa, chain - ## HITS:1 
COG:YPO0992 KEGG:ns NR:ns ## COG: YPO0992 COG4264 # Protein_GI_number: 16121294 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Siderophore synthetase component # Organism: Yersinia pestis # 2 577 5 579 582 781 66.0 0 MNHKDWDLVNRRLVAKMLSELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIWGWLWI DAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASD LINLNADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRC DNEMDIHQLLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEG RMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASR WLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLK PDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALI AHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRDVTSRLSADYLIHDL QTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMKKHPQMSERFALFSLFRPQIIR VVLNPVKLTWPDLDGGSRMLPNYLEDLQNPLWLVTQEYES >gi|296493221|gb|ADTK01000280.1| GENE 4 5758 - 6705 248 315 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229250525|ref|ZP_04374554.1| acetyltransferase, ribosomal protein N-acetylase [Catenulispora acidiphila DSM 44928] # 150 306 29 184 194 100 33 5e-21 MSGANIVHSGYGLRCEKLDKPLNLGWGLDNSAVLHWPGELPTGWLCDALDQIFIAAPQLS AVVLPWSEWCEEPQALTLFGQVQSDIIHRSAFWQLPLWLSSPANRASGKMVFDAEREIYF PQRPPRPQGEVYRRYDPRIRRMLSFRIADPVSDAERFTRWMNDPRVEYFWEQSGSLEVQI AYLERQLTSKHAFPLIGCFDDRPFSYFEIYWAAEDRIGRHYSWQPFDRGLHLLVGEQQWR GAHYVQSWLRGVTHYLLLNEPRTQRTVLEPRTDNQRLFRHLEPAGYRTIKEFDFPHKRSR MVMADRHHFFTEVGL >gi|296493221|gb|ADTK01000280.1| GENE 5 6706 - 8106 1132 466 aa, chain - ## HITS:1 COG:all0394 KEGG:ns NR:ns ## COG: all0394 COG4264 # Protein_GI_number: 17227890 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Siderophore synthetase component # Organism: Nostoc sp. 
PCC 7120 # 27 461 166 605 606 197 31.0 4e-50 MLESYAHTQQTIDARHDWAILREKALNFGEAEQALLTGHAFHPAPKSHEPFNRQEAERYL PDMAPHFPLRWFSVDKTQIAGESLHLNLQQRLTRFAAENAPQLLNELSDNQWLFPLHPWQ GEYLLQQVWCQALFAKGLIRDLGEAGTSWLPTTSSRSLYCATSRDMIKFSLSVRLTNSVR TLSVKEVERGMRLARLAQTDGWQMLQARFPTFRVMQEDGWAGLRDLNGNIMQESLFSLRE NLLLEQPQSQTNVLVSLTQAAPDGGDSLLVSAVKRLSDRLGITVQQAAHAWVDAYCQQVL KPLFTAEADYGLVLLAHQQNILVQMLGDLPVGFIYRDCQGSAFMPHATEWLDTIDEAQAE NIFTREQLLRYFPYYLLVNSTFAVTAALGAAGLDSEANLMARVRTLLAEVRDQVTHKTCL NYVLESPYWNVKGNFFCYLNDHNENTIVDPSVIYFDFANPLQAQEV >gi|296493221|gb|ADTK01000280.1| GENE 6 8059 - 8214 79 51 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAGINCLLRMGVAFQHALVESFAEGNWQKKLMPELINNKPGESGKVDRITA Prediction of potential genes in microbial genomes Time: Mon May 16 15:53:14 2011 Seq name: gi|296493220|gb|ADTK01000281.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont848.2, whole genome shotgun sequence Length of sequence - 1738 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 131 - 190 6.5 1 1 Tu 1 . 
+ CDS 240 - 1433 606 ## COG0477 Permeases of the major facilitator superfamily Predicted protein(s) >gi|296493220|gb|ADTK01000281.1| GENE 1 240 - 1433 606 397 aa, chain + ## HITS:1 COG:YPO0988 KEGG:ns NR:ns ## COG: YPO0988 COG0477 # Protein_GI_number: 16121292 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 16 391 20 395 407 311 57.0 1e-84 MSSKIEDTPQKTLSCWPLAFSAGLLGIGQNGLLVVLPVLVIQTNLSLSVWAALLMLGSML FLPSSPWWGKQISLTGSKTVVLWALGGYGVSFTLLGLGSVLMATGAVTTAVGLGILIIAR IVYGLTVSAMVPACQVWALQRAGEGNRMAALATISSGLSCGRLFGPLCAAAMLVIHPLAP VWMLMAAPVWAVVMLLRLPGTPPQPTPERKSVSLKRDCLPYLLCAMLLAAAMSMMQLGLS PALTRQFDTDTTTISQQVAWLLGLSAIAALIAQFVVLRPQRLTPVALLLSAGVLMSSGLA IMLTEQLWLFYLGCAVLSFGAALATPAYQLLLNDKLADGAGAGWLTASHTLGYGLCALLV PLASKTGVSIALIVTALFAAVLFTIVSACIWHYRTIK Prediction of potential genes in microbial genomes Time: Mon May 16 15:53:15 2011 Seq name: gi|296493219|gb|ADTK01000282.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont854.1, whole genome shotgun sequence Length of sequence - 4866 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 287 193 ## EC55989_4880 hypothetical protein - Prom 509 - 568 5.0 + Prom 842 - 901 4.3 2 2 Tu 1 . + CDS 957 - 1487 321 ## EC55989_4882 hypothetical protein + Term 1501 - 1541 1.4 - Term 2421 - 2466 4.0 3 3 Tu 1 . - CDS 2651 - 2980 154 ## ECIAI39_4580 hypothetical protein 4 4 Tu 1 . 
+ CDS 3975 - 4859 428 ## COG3596 Predicted GTPase Predicted protein(s) >gi|296493219|gb|ADTK01000282.1| GENE 1 2 - 287 193 95 aa, chain - ## HITS:1 COG:no KEGG:EC55989_4880 NR:ns ## KEGG: EC55989_4880 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 95 1 95 243 195 98.0 5e-49 MNQPIHNDYWLSRFESILNSALAQHRAVSLIRVDLRFPEHMPVTIMDPDPDSAVISRFFE SLKAKIQAYQRKKRRTNKRVRATTLHYFWCREFGK >gi|296493219|gb|ADTK01000282.1| GENE 2 957 - 1487 321 176 aa, chain + ## HITS:1 COG:no KEGG:EC55989_4882 NR:ns ## KEGG: EC55989_4882 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 176 1 176 176 301 100.0 4e-81 MKLTEESLNLVIDILNDLICEGKMLNDVERRSLLNAVMAITAVKERSVVSVSSRKEQSKK EKKEREIDPRFPNAGVAWNEADESLLRDVLEDVPDDEIGKHLFWLAEKLGRTPYSVACKV RQIKKLPAEWKDQFRKISDRIRSSGMSVSEYLQRQGLLSPESGEENNPVSHNNNVA >gi|296493219|gb|ADTK01000282.1| GENE 3 2651 - 2980 154 109 aa, chain - ## HITS:1 COG:no KEGG:ECIAI39_4580 NR:ns ## KEGG: ECIAI39_4580 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_IAI39 # Pathway: not_defined # 1 109 299 407 407 219 97.0 3e-56 MANSIFYTLQKTNVALWLHLEMFALVAFIKLYTGKGFEFIKNMCAFIDYNNGIIKSTHEE LLGEIKHNIHVNLKVNPDNFPELINIYNSCFSKNIFLYDINDGNFKWCS >gi|296493219|gb|ADTK01000282.1| GENE 4 3975 - 4859 428 294 aa, chain + ## HITS:1 COG:ECs1395 KEGG:ns NR:ns ## COG: ECs1395 COG3596 # Protein_GI_number: 15830649 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Escherichia coli O157:H7 # 12 294 10 289 290 188 36.0 1e-47 MPFSNSSLSAQVKSYLTFLPEEIRQKILEHLHGVIHYEPVIGIMGKSGTGKSSLCNAIFQ SRICATHPLNGCPRQAHRLTLQLGERRMTLVDLPGIGETPQHDQEYRALYRQLLPELDLI IWILRSDERAYAADIAMHQFLLNEGADPSRFLFVLSHADRMFPAEEWNATEKCPSRHQEL SLATVTARVATLFPSSFPVLPVAAPAGWNLPALVSLMIHALPPQATSAVYSHIRGENRSE QARKHAQQTFGDAIGKSFDDAVARFSFPAWMLQLLRKARDRIIHLLITLWERLF Prediction of potential genes in microbial genomes Time: Mon May 16 15:53:23 2011 Seq name: gi|296493218|gb|ADTK01000283.1| Escherichia coli MS 
84-1 E_coli84-1-1.0_Cont860.1, whole genome shotgun sequence Length of sequence - 974 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 + CDS 1 - 495 307 ## COG4733 Phage-related protein, tail component + Term 516 - 546 3.0 2 1 Op 2 . + CDS 564 - 972 280 ## COG3637 Opacity protein and related surface antigens Predicted protein(s) >gi|296493218|gb|ADTK01000283.1| GENE 1 1 - 495 307 164 aa, chain + ## HITS:1 COG:Z2344 KEGG:ns NR:ns ## COG: Z2344 COG4733 # Protein_GI_number: 15801760 # Func_class: S Function unknown # Function: Phage-related protein, tail component # Organism: Escherichia coli O157:H7 EDL933 # 1 164 814 977 977 319 93.0 2e-87 ANAGTLNNVTINENCRVLGKLSANQIEGDLVKTVGKAFPRDSRAPERWPSGTITVRVYDD QPFDRQIVIPAVAFRGAKHERENNDIYSSCRLIVKKNGAEIYNRTALDNTLVYTGVIDMP AGRGHMTLEFSVSAWLVNDWYPTASISDLLVVVMKKSTAGITIS >gi|296493218|gb|ADTK01000283.1| GENE 2 564 - 972 280 136 aa, chain + ## HITS:1 COG:ECs0843 KEGG:ns NR:ns ## COG: ECs0843 COG3637 # Protein_GI_number: 15830097 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Opacity protein and related surface antigens # Organism: Escherichia coli O157:H7 # 1 136 1 136 199 243 94.0 5e-65 MRKVCAAILSAAICLAVSGAPAWASEQQATLSAGYLHARTNAPGSDNLNGINVKYRYEFT DTLGLITSFSYANAEDEQKTHYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYAMAGVA YSRVSTFSGDYLRVTD Prediction of potential genes in microbial genomes Time: Mon May 16 15:53:24 2011 Seq name: gi|296493217|gb|ADTK01000284.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont863.1, whole genome shotgun sequence Length of sequence - 1097 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 65 - 124 4.8 1 1 Tu 1 . 
+ CDS 226 - 906 268 ## COG2378 Predicted transcriptional regulator Predicted protein(s) >gi|296493217|gb|ADTK01000284.1| GENE 1 226 - 906 268 226 aa, chain + ## HITS:1 COG:yfjR KEGG:ns NR:ns ## COG: yfjR COG2378 # Protein_GI_number: 16130552 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Escherichia coli K12 # 1 224 1 225 233 155 42.0 6e-38 MSDNRSRHDRLAVRLSLIISRLMAGESLSLKTLSDEFGVTERTLQRDFHQRLVHLDLEYR NGRYSLRRQSSPGAIPEMLSFIQNTGIARILPLRNGRLITCLTDNQEPSPCLIWLPAPDI TATFPECFSQLILAIRQCIHISLMTERWYPSLEPCRLIYYSGSWYLIALQKGKLQVFPLA DIKSVSLTSERFERRGHIHSLVAEERFISALPHFSFIHKLINTFNL Prediction of potential genes in microbial genomes Time: Mon May 16 15:53:25 2011 Seq name: gi|296493216|gb|ADTK01000285.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont864.1, whole genome shotgun sequence Length of sequence - 3893 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 61 - 447 112 ## gi|160112811|sp|Q9X2V9|MCJC_ECOLX RecName: Full=Microcin J25-processing protein mcjC gi|78322019|emb|CAJ40934.1| microcin J25 processing protein + Prom 744 - 803 6.7 2 2 Tu 1 . + CDS 878 - 1312 110 ## gi|160112811|sp|Q9X2V9|MCJC_ECOLX RecName: Full=Microcin J25-processing protein mcjC gi|78322019|emb|CAJ40934.1| microcin J25 processing protein + Term 1357 - 1401 -0.2 3 3 Tu 1 . + CDS 1902 - 3050 194 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 4 4 Tu 1 . 
- CDS 3375 - 3740 168 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs Predicted protein(s) >gi|296493216|gb|ADTK01000285.1| GENE 1 61 - 447 112 128 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160112811|sp|Q9X2V9|MCJC_ECOLX ## NR: gi|160112811|sp|Q9X2V9|MCJC_ECOLX RecName: Full=Microcin J25-processing protein mcjC # 1 127 98 224 513 240 99.0 2e-62 MDIFVSDKISDIKFLNPDMTFSLNIKMAEHYLSGNRIATQESLITGIYKVNNGEFIKFNN QLKPVLLRDEFSITKKNNSTIDSIIDNIEMMRDNRKIALLFSGGLDSALIFHTLKESGNK FCAYHFFF >gi|296493216|gb|ADTK01000285.1| GENE 2 878 - 1312 110 144 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160112811|sp|Q9X2V9|MCJC_ECOLX ## NR: gi|160112811|sp|Q9X2V9|MCJC_ECOLX RecName: Full=Microcin J25-processing protein mcjC # 1 144 370 513 513 281 98.0 6e-75 MKLASAQFFATDYTGKINKLTPFLHKNIIQHYAGLPVFSLFNQHFDRYPVRYEAFQRFGS DIFWKKNKRSSSQLIFRILSGKKDELVNTIKQSGLIEILGINHIELESILYENTTTRLTT ELPYILNLYRLAKFIQLQSIDYKG >gi|296493216|gb|ADTK01000285.1| GENE 3 1902 - 3050 194 382 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 147 341 5 201 223 79 28 4e-15 MDITLNSYSLLSDTVDNMIAAKKNNALRLISERYEDALTQENNAQKKYWLLSSKVLLLNS LLAVILFGSVFIYNILGVLNGVVSIGHFIMITSYIILLSTPVENIGALLSEIRQSMSSLA GFIQRHAENKATSPSIPFLNMERKLNLSIRELSFSYSDDKKILNSVSLDLFTGKMYSLTG PSGSGKSTLVKIISGYYKNYFGDIYLNDISLRNISDEDLNDAIYYLTQDDYIFMDTLRFN LRLANYDASENEMFKVLKLANLSVVNNEPVSLDTHLINRGNNYSGGQKQRISLARLFLRK PAIIIIDEATSALDYINESEILSSIRTHFPDALIINISHRINLLECSDCVYVLNEGNIVA SGHFRDLMVSNEYISGLASVTE >gi|296493216|gb|ADTK01000285.1| GENE 4 3375 - 3740 168 121 aa, chain - ## HITS:1 COG:AGl1981 KEGG:ns NR:ns ## COG: AGl1981 COG1961 # Protein_GI_number: 15891106 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 2 113 128 241 250 93 43.0 8e-20 MIRQKDIRVMAVNVPTTWINSGMSEFDSRLFAAINDMLLDMLAAVARRDYEQRRERQKQG 
IEKARKDGKYKGRKPNQARYDAINRLIESGSSWSQVQKVPGCSRGTISSAIKRKSGLKSS S Prediction of potential genes in microbial genomes Time: Mon May 16 15:53:40 2011 Seq name: gi|296493215|gb|ADTK01000286.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont865.1, whole genome shotgun sequence Length of sequence - 5465 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 225 - 284 6.1 1 1 Op 1 25/0.000 + CDS 330 - 1244 749 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin 2 1 Op 2 42/0.000 + CDS 1247 - 2071 237 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 3 1 Op 3 10/0.000 + CDS 2068 - 2925 777 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 4 1 Op 4 . + CDS 2922 - 3779 358 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 5 2 Tu 1 . - CDS 3724 - 3957 83 ## - Prom 4023 - 4082 4.5 + Prom 4034 - 4093 6.5 6 3 Op 1 . + CDS 4247 - 4681 241 ## COG0148 Enolase 7 3 Op 2 . 
+ CDS 4735 - 4950 88 ## pECS88_0016 hypothetical protein Predicted protein(s) >gi|296493215|gb|ADTK01000286.1| GENE 1 330 - 1244 749 304 aa, chain + ## HITS:1 COG:STM2861 KEGG:ns NR:ns ## COG: STM2861 COG0803 # Protein_GI_number: 16766167 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Salmonella typhimurium LT2 # 8 304 9 304 305 467 75.0 1e-131 MLSIKKVTMLLGCLVLTCSIAFQASAAEKFKVITTFTIIADMAKNVAGDAAEVSSITKPG AEIHEYQPTPGDIKRAQGAQLILANGMNLELWFQRFYQHLNGVPEVIVSSGVTPVGITEG PYEGKPNPHAWMSPDNALIYVDNIRDAFIKYDPANAQTYQRNADTYKAKITQTLAPLRKQ IAELPENQRWMVTSEGAFSYLARDLGLKELYLWPINADQQGTPQQVRKVVDIVKKNHIPA VFSESTISDKPARQVARETGAHYGGVLYVDSLSTENGPVPTYIDLLKVTTSTLVQGIKAG KREK >gi|296493215|gb|ADTK01000286.1| GENE 2 1247 - 2071 237 274 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 1 235 7 238 318 95 26 6e-20 MQSAGIVVNDVTVTWRNGHTALRDASFTVPGGSIAALVGVNGSGKSTLFKAIMGFVRLTS GKISVLGIPTRQALQKNLVAYVPQSEEVDWSFPVLVEDVVMMGRYGHMGMLRIAKKRDRQ IVTDALERVDMVDFRHRQIGELSGGQKKRVFLARAIAQQGDVILLDEPFTGVDVKTEAKI ISLLRELRAEGKTMLVSTHNLGSVTTFCDYTVMVKGTVLASGPTDTTFTAENLELAFSGV LRHVTLNGSEESIITDDVRPFVAHRPSAVQREER >gi|296493215|gb|ADTK01000286.1| GENE 3 2068 - 2925 777 285 aa, chain + ## HITS:1 COG:STM2863 KEGG:ns NR:ns ## COG: STM2863 COG1108 # Protein_GI_number: 16766169 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Salmonella typhimurium LT2 # 1 277 1 277 286 426 87.0 1e-119 MNVLLEPFSYEYMLNAMWVSAMVGGLCAFLSCYLMLKGWSLIGDALSHSIVPGVAGAYML GLPFSLGAFFSGGLAAGSMLFLNQLTRLKEDAIIGLIFSSFFGLGLFMVSLNPTSVNIQT IVLGNILAIDPADILQLTIIGILSIIVLFFKWKDLMVTFFDENHARAIGLHPGRLKILFF TLLSVSTVAALQTVGAFLVICLVVTPGATAWLLTDRFPRLLMIAVTIGSVTSFLGAWVSY FLDGATGGIIVVAQTLLFLLAFVFAPTHGLLANRRRAHKALEDRS >gi|296493215|gb|ADTK01000286.1| GENE 4 2922 - 3779 358 285 aa, chain + ## HITS:1 COG:STM2864 
KEGG:ns NR:ns ## COG: STM2864 COG1108 # Protein_GI_number: 16766170 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Salmonella typhimurium LT2 # 2 274 3 275 282 324 68.0 1e-88 MMALLLEPLQFTFMSHALLISLVVSIPCALLSVFLVLKSWALMGDAMSHAVFPGIVLAWI LGLPLATGAFVAGVFCAVATGYLKDNSRIKQDTVMGIVFSGMFAAGLILYIAVKPDVHLD HILFGDMLGITIGDIIQTVIIAGLVTLVISVKWRDFLLFSFDYQQAQASGLHTRWLHYGL LCMVSLTIVATLKAVGIILSISLLIAPGAIAVLLTQRFHIALLLATGISILVSMTGVWLS FFIDSAPAPTIVVLFAVMFIMTFTVTSINARTKENAEPRDLLSSD >gi|296493215|gb|ADTK01000286.1| GENE 5 3724 - 3957 83 77 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFCHYFPHFTALFATPENRHYLLSSPQQLSGNKTYQNHKSVHYSFKRIANKRMLLAECSA NRMITNPVVRHFPLCVR >gi|296493215|gb|ADTK01000286.1| GENE 6 4247 - 4681 241 144 aa, chain + ## HITS:1 COG:ML0255 KEGG:ns NR:ns ## COG: ML0255 COG0148 # Protein_GI_number: 15827047 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Mycobacterium leprae # 8 143 304 439 447 167 54.0 7e-42 MIFLPLTVAEDDWAGWKILHGALGEQIELVGDDLFVTNVKYIQRGIDENLANSALIKLNQ IGSLSETFDAVQLCHDNNWGTFISHRSGETVDSFIADMTVAMRAGHLKTGAPCRGERIEK YNQLMRIEDELGSSAQFAGKSAFK >gi|296493215|gb|ADTK01000286.1| GENE 7 4735 - 4950 88 71 aa, chain + ## HITS:1 COG:no KEGG:pECS88_0016 NR:ns ## KEGG: pECS88_0016 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 71 20 90 90 110 100.0 2e-23 MKAHVFVISMLTGIVVTYAVLLLGCLFIDKTLPTVDVVILSLVVGASAQQLSRVLMSINR TFPYLACSQRI Prediction of potential genes in microbial genomes Time: Mon May 16 15:53:47 2011 Seq name: gi|296493214|gb|ADTK01000287.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont867.1, whole genome shotgun sequence Length of sequence - 604 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 429 419 ## COG3501 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|296493214|gb|ADTK01000287.1| GENE 1 3 - 429 419 142 aa, chain - ## HITS:1 COG:Z0267 KEGG:ns NR:ns ## COG: Z0267 COG3501 # Protein_GI_number: 15799916 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 EDL933 # 1 142 1 142 713 257 97.0 4e-69 MSTGLRFTLEVDGLPPDAFAVVSFHLNQSLSSLFSLDLSLVSQQFLSLEFQQILDKMAYL TIWQGDDVQRRVKGMVTWFELGENDKNQMLYSMKVCPPLWRTGLRQNFRIFQNEDIESIL GTILQENGVTEWSPLFSEPHPS Prediction of potential genes in microbial genomes Time: Mon May 16 15:53:57 2011 Seq name: gi|296493213|gb|ADTK01000288.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont867.2, whole genome shotgun sequence Length of sequence - 32192 bp Number of predicted genes - 32, with homology - 32 Number of transcription units - 10, operones - 5 average op.length - 5.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 2 2 Op 1 . + CDS 1251 - 1751 462 ## COG3516 Uncharacterized protein conserved in bacteria 3 2 Op 2 . + CDS 1786 - 2010 169 ## SSON_0253 hypothetical protein 4 2 Op 3 5/0.571 + CDS 2061 - 3536 1266 ## COG3517 Uncharacterized protein conserved in bacteria 5 2 Op 4 8/0.429 + CDS 3543 - 3956 298 ## COG3518 Uncharacterized protein conserved in bacteria 6 2 Op 5 9/0.143 + CDS 3960 - 5810 1207 ## COG3519 Uncharacterized protein conserved in bacteria 7 2 Op 6 6/0.571 + CDS 5774 - 6856 725 ## COG3520 Uncharacterized protein conserved in bacteria 8 2 Op 7 6/0.571 + CDS 6881 - 8161 924 ## COG3456 Uncharacterized conserved protein, contains FHA domain 9 2 Op 8 8/0.429 + CDS 8158 - 8682 450 ## COG3521 Uncharacterized protein conserved in bacteria 10 2 Op 9 8/0.429 + CDS 8685 - 10016 1087 ## COG3522 Uncharacterized protein conserved in bacteria 11 2 Op 10 4/0.714 + CDS 10021 - 10782 707 ## COG3455 Uncharacterized protein conserved in bacteria 12 2 Op 11 . 
+ CDS 10791 - 13556 514 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 13 2 Op 12 . + CDS 13649 - 14296 526 ## ECO103_0217 hypothetical protein 14 2 Op 13 5/0.571 + CDS 14301 - 15713 1197 ## COG3515 Uncharacterized protein conserved in bacteria 15 2 Op 14 . + CDS 15732 - 17750 1638 ## COG3523 Uncharacterized protein conserved in bacteria 16 2 Op 15 5/0.571 + CDS 17829 - 19256 1328 ## COG3523 Uncharacterized protein conserved in bacteria 17 2 Op 16 2/1.000 + CDS 19267 - 20619 1300 ## COG3515 Uncharacterized protein conserved in bacteria 18 2 Op 17 . + CDS 20643 - 21125 353 ## COG3157 Hemolysin-coregulated protein (uncharacterized) 19 2 Op 18 . + CDS 21169 - 22083 72 ## EC55989_0217 inner membrane protein YafU 20 2 Op 19 . + CDS 22093 - 22572 -72 ## ECO103_0211 hypothetical protein + Term 22619 - 22656 -0.1 21 3 Tu 1 . - CDS 22709 - 23494 612 ## EC55989_0216 putative aminopeptidase - Prom 23567 - 23626 8.0 22 4 Op 1 . - CDS 23713 - 23955 66 ## gi|213863415|ref|ZP_03386670.1| hypothetical protein SentesT_32125 - TRNA 23822 - 23898 90.7 # Asp GTC 0 0 23 4 Op 2 . - CDS 24031 - 24771 695 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases - Prom 24807 - 24866 4.3 + Prom 24745 - 24804 2.1 24 5 Tu 1 . + CDS 24827 - 25294 482 ## COG0328 Ribonuclease HI + Term 25305 - 25359 6.0 25 6 Tu 1 . - CDS 25291 - 26013 408 ## COG0500 SAM-dependent methyltransferases - Prom 26044 - 26103 3.8 + Prom 25822 - 25881 2.7 26 7 Op 1 . + CDS 26047 - 26802 399 ## COG0491 Zn-dependent hydrolases, including glyoxylases + Prom 26841 - 26900 1.5 27 7 Op 2 . + CDS 27012 - 28232 733 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) + Term 28240 - 28298 7.4 - Term 28232 - 28278 8.2 28 8 Op 1 4/0.714 - CDS 28280 - 29050 518 ## COG0500 SAM-dependent methyltransferases 29 8 Op 2 . 
- CDS 29128 - 29907 449 ## COG3021 Uncharacterized protein conserved in bacteria - Prom 29987 - 30046 5.7 + Prom 30069 - 30128 4.1 30 9 Tu 1 . + CDS 30169 - 31083 503 ## COG0583 Transcriptional regulator + Term 31119 - 31161 7.1 - Term 31038 - 31071 3.8 31 10 Op 1 . - CDS 31080 - 31883 1123 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase 32 10 Op 2 . - CDS 31859 - 32191 108 ## EC55989_0204 hypothetical protein - TRNA 32046 - 32122 90.7 # Asp GTC 0 0 Predicted protein(s) >gi|296493213|gb|ADTK01000288.1| GENE 1 35 - 553 618 172 aa, chain - ## HITS:1 COG:ECs0234 KEGG:ns NR:ns ## COG: ECs0234 COG3157 # Protein_GI_number: 15829488 # Func_class: S Function unknown # Function: Hemolysin-coregulated protein (uncharacterized) # Organism: Escherichia coli O157:H7 # 1 172 1 172 172 348 100.0 2e-96 MPTPCYISITGQTQGNITAGAFTADSVGNIYVQGHEDEMLVQEFLHNVTVPTDPQSGQPA GQRAHKPFIFTVALNKAVPLMYNALASGEMLPTTELHWWRTSVEGKQEHYFTTRLTDSTI VDMKLHMPHCQDPAKREFTQLLEVSLAYRKIEWEHVKSGTSGADDWRAPLEA >gi|296493213|gb|ADTK01000288.1| GENE 2 1251 - 1751 462 166 aa, chain + ## HITS:1 COG:ECs0233 KEGG:ns NR:ns ## COG: ECs0233 COG3516 # Protein_GI_number: 15829487 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 166 1 166 166 288 98.0 4e-78 MSKKFEGSVAPRERINISYVPKTDGQTAEVELPLNMLVVGDTGNTQETSSLDERQAVSVN KHNFGAVMAEAAIGLNFTVPATLKGSTTDDELNVALNIKSLDDFSPDSVARQVPEVNKLL ELREALTALKGPMGNLPAFRTQLQALLENEESREQLLKEIGLVSNK >gi|296493213|gb|ADTK01000288.1| GENE 3 1786 - 2010 169 74 aa, chain + ## HITS:1 COG:no KEGG:SSON_0253 NR:ns ## KEGG: SSON_0253 # Name: not_defined # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 74 1 74 74 138 100.0 6e-32 MTAHKNISGTYHLSVADILQVVYQVCFSPSVEINQDGVAALITTLDRRISDLLDEIIHFC EFQQSASHWQRVLH >gi|296493213|gb|ADTK01000288.1| GENE 4 2061 - 3536 1266 491 aa, chain + ## HITS:1 COG:ECs0231 KEGG:ns NR:ns ## COG: ECs0231 COG3517 # Protein_GI_number: 15829485 # Func_class: S 
Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 463 1 463 463 941 99.0 0 MSLQEEELVSSHAGQPEQASSLLDQIMAQTRIQPGSEGYDVARQGVTAFIASILQSTASA EPVNKLAVDSMIADIDERISRQMDVIIHAPAFQQVESFWRSLKTMVDRVDFRENIKVNVL HVTKQELLEDFEFAPEIIQSGFYKHVYSSGFGQFGGEPIAAVLGAYEFKNTAPDMKLLQY VSAVGAMAHAPFLSSVSPEFMGLNSWTELPNIKDLYAIFEGPAYTKWRALRDSEDSRYLG LTAPRFLLRQPYSPTDNPVKNFNYYEDVSQNHEDYLWGNTAWMLACNIADSFAKYRWCPN IIGPQSGGAVKDLPVHLFETMGQIQAKIPTEVLVTDRREFELAEEGFITLTMRKDSDNAA FFSANSVQKPKHFPGKDAETNYKLGTQLPYLFIINRLAHYIKVLQREQLGSWKERSDLER ELNTWIRQYIADQENPPADVRSRKPLRAAKVEVMDVEGEPGWYQVALSVRPHFKFMGANF ELSLVGRLDRE >gi|296493213|gb|ADTK01000288.1| GENE 5 3543 - 3956 298 137 aa, chain + ## HITS:1 COG:ECs0230 KEGG:ns NR:ns ## COG: ECs0230 COG3518 # Protein_GI_number: 15829484 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 137 1 137 137 246 99.0 6e-66 MIPDRERTTGSLFERMEASSARNRQGGSIHSLRQSIRQNLRNILNTRSGSCRGAPELGID EPEGAENFRESMSRAIEQCIERYEPRISHAEVQAVVSSASSPLDMTFHITAWVTFNETHE VLEFDMAPNGSQHYRVD >gi|296493213|gb|ADTK01000288.1| GENE 6 3960 - 5810 1207 616 aa, chain + ## HITS:1 COG:ECs0229 KEGG:ns NR:ns ## COG: ECs0229 COG3519 # Protein_GI_number: 15829483 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 12 616 12 616 616 1248 100.0 0 MEFEERYFREELDYLRQLSKLLATEKPHLARFLAEKDADPDIERLLEGVAFLTGNLRQKI EDEFPELTHGLIKMLWPNYLRPVPAMTLIEYTPDMDKSSVPVLIPRNEQFTTNAGEIRVD EVLPSDAKKEEPPPCTFTLCRDIWLLPVRLEQIENRSTTRNGVINITFSVAPGTDFRTLD LNKLRFWLGNDDNYTRDQLYLWFCEYLQGADLTVGEQHIRLPEFMLKAVGFEPQDAMLPW PKNVHSGYRILQEYFCYPDAFLFFDLCGCPALPDGLQAEFFTLQLRFSRPLPVDIRLRRD SLRLYCAPAINLFIHHAEAITLDNRRADYPLVPSRHYPQHYDVFSVNSVVSQVQDMFRKK DLGRPVSTQAARQWPAFESFSHQMEYSRKREVVYWHHRTKTSLFHRGFDHTLAFIHADGS YPSDESLLSNEVVSVSLTCTNRELPSQIRSGDITGTTGKNAAVASFRNITRPTQPLWPVI DGSLHWSLLSAMNLNYLSLLDTDALKQVIANFDRHAIHHPQTARLSQQKLDAIERLETRP 
VDRLFTGIPVRGLASTLYLHPEPFVCEGEMYLLGTVLSHFLSLYASVNSFHMLTVVNTES QETWKWTERIGQHPLI >gi|296493213|gb|ADTK01000288.1| GENE 7 5774 - 6856 725 360 aa, chain + ## HITS:1 COG:ECs0228 KEGG:ns NR:ns ## COG: ECs0228 COG3520 # Protein_GI_number: 15829482 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 360 1 360 360 738 99.0 0 MDGKNRAASSYLSPGNPPADKEQNDPLAQVFHNACSYNFFAMAELLHRLAKGEKGTPELS LRDDPAQETLRFSADASLAFPCSDISALKRDTSGAFRMTTTFMGLQGSQSPLPGYYLDHL AWKAVHEQSPVGDFLDMFSHRLTQFVWHIWRKYRYHISFRNGGVDAFSQRMYSLVGLGHR QLRDKLAINHSKMLAYSGILANPGRSPEIICGLVSHCFDLSEVTLQNWQRRKVDIEPDQQ NSLGSYSLKNGEKLAGRSVLGNFVLGTRVPDLSGKFQLSITSLTRKQFLSFLPSGENFLP LTMFVSFILRDQLAWDLHLGLAPEQVGAMRLGDNKSALLGWTSFLGTPEERPSVTIRVRS >gi|296493213|gb|ADTK01000288.1| GENE 8 6881 - 8161 924 426 aa, chain + ## HITS:1 COG:Z0258 KEGG:ns NR:ns ## COG: Z0258 COG3456 # Protein_GI_number: 15799907 # Func_class: T Signal transduction mechanisms # Function: Uncharacterized conserved protein, contains FHA domain # Organism: Escherichia coli O157:H7 EDL933 # 1 426 8 433 433 817 99.0 0 MPEEKLQTLSLQVINGSELESGRAARCLFTQQGNVGHGPECHWSVQDRQQSIPAQAFTVI LHDGTFCLRPQTAQLWLNQAKVTATSDLIQLRQGDEIQIGRLMVRVHLNRGDIPHYDEEM ATPETIVTNRDMLTDTLLSTEGAPHYPGMTHRHQLADTVVNGFSADPLQALQSESLITTG DPLSGIAAVRPSAPLSDPASNGGINTPFMDLPPIYASPGDRNDDVSAAEMAQRHLAVTPL LRGLGGSLTMSNSDDADDFLEEAGRTLQAAIKGLLDLQQQRNSLSDKHLRPLEDNPLRLN MDYATALDVMFAEGKSPVHLAAPAAVSESLRNVRHHEEANRAAIVESLRVLLDAFSPQNL LRRFVQYRRSHELRQPLDDAGAWQMYSHYYEELASDRQQGFEMLFNEVYAQVYDRVLREK QREPEA >gi|296493213|gb|ADTK01000288.1| GENE 9 8158 - 8682 450 174 aa, chain + ## HITS:1 COG:ECs0226 KEGG:ns NR:ns ## COG: ECs0226 COG3521 # Protein_GI_number: 15829480 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 174 1 174 174 328 98.0 4e-90 MNRKAFLACVLMCVILTGCETAKKISEVIKNPDIQVGSLKEQPSEITVTLLTEPDTNTNA EGESAAVDVQLVYLTDDSKLLAADYDQIASTPLPDVLGKNYIDHQDFNLLPDTIKTLPPV 
KLDEKTQFIGVVAYFSDDQATEWKQIETVEGTGHHYRLLVHVRQSSIEMKKEDE >gi|296493213|gb|ADTK01000288.1| GENE 10 8685 - 10016 1087 443 aa, chain + ## HITS:1 COG:ECs0225 KEGG:ns NR:ns ## COG: ECs0225 COG3522 # Protein_GI_number: 15829479 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 443 1 443 443 883 98.0 0 MATTRNKVMWQEGMLMRPHHFQQQQRYNDYLDNQRFRAMNDLSWGFTELTLNNELLAQGK IMIDSASGTLPDGTVFSIPDQDALPDPLHPQNFPDERSRNIYLALPVASDVRNEISDGRR TGRYRLNYADVRDLHSEEGDARTLTLGQLTPRIMSGAEDMSAYITLPLCRISDRHADGSL TLDDDFIPSCQNIQVSKKLRVYLKEVQGAIGGRASDLANRIGSPAQSGIADVAEFMMLQL LNRNQTRFTHRARRSQLHPEDFYLDLAGLLGELMTFTEPSRLPCPLDVYDHHDLTKTFKT LLPEVKRALHTVLSPRAVNLPLHLRDGIWQADIHDTELLQSATFVLAVAANVPVDQIQRQ FIQQSKISSPEKIRNMVSVQIPGIPLRALMVAPRQLPYHSGFSYFELDKSGQAWTEMAAA GAVALHVSGSFPDLNMQLWAIRG >gi|296493213|gb|ADTK01000288.1| GENE 11 10021 - 10782 707 253 aa, chain + ## HITS:1 COG:ECs0224 KEGG:ns NR:ns ## COG: ECs0224 COG3455 # Protein_GI_number: 15829478 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 253 1 253 253 480 97.0 1e-135 MDEGSLSLPPFTGHDEKSQRNYHLALRGNSLNPMIDAATPLLGMVMRLSTMNSQTMPEHL FAQVVTDVQAVEQLLQEQGYEPGVIISFRYILCTFIDEAALGNGWSNKNEWIKQSLLVHF HNEAWGGEKVFILLERLIREPKRYQDLLEFLWICFSLGFRGRYKVAAQDQGEFEQIYRRL YHVLHKLRGDAPFPLLHQDKKTQGGRYQLISRLTVKHIFCGGVVVLALFYLFYLLRLDSQ TQDILHQLNKLLR >gi|296493213|gb|ADTK01000288.1| GENE 12 10791 - 13556 514 921 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 579 881 477 789 815 202 36 2e-51 MIQIDLPTLVKRLNLFSRQALEMAASECMSQQAAEITVSHVLIQMLAMPRSDLRVITRQG DIGMEELRQALTVENYTTARSADSYPAFSPMLVEWLKEGWLLASAEMQHSELRGGVLLLA LLHSPLRYIPPAAARLLTGINRDRLQQDFVQWTQESAESVVPDADGKGAGTMTDASDTLL ARYAKNMTADARNGRLDPVLCRDHEIDLMIDILCRRRKNNPVVVGEAGVGKSALIEGLAL RIVAGQVPDKLKNTDIMTLDLGALQAGASVKGEFEKRFKGLMAEVISSPVPVILFIDEAH 
TLIGAGNQQGGLDISNLLKPALARGELKTIAATTWSEYKKYFEKDAALSRRFQLVKVSEP NAAEATIILRGLSAVYEQSHGVLIDDDALQAAATLSERYLSGRQLPDKAIDVLDTACARV AINLSSPPKQISALTTLSHQQEAEIRQLERELRIGLRTDTSRMTEVLVQYDETLTALDEL EAAWHQQQTLVREIIALRQQLLGVAEDDAAPLPDADTVEDTQPESESEQDNTGAVPADEA DREQPEETAETVSPVQRLAQLTAELDALHNDRLLVSPHVDKKQIAAVIAEWTGVPLNRLS QNEMSVITDLTKWLGDTIKGQDLAIASLHKHLLTARADLRRPGRPLGAFLLAGPSGVGKT ETVLQLAELLYGGRQYLTTINMSEFQEKHTVSRLIGSPPGYVGYGEGGVLTEAIRQKPYS VVLLDEVEKAHPDVLNLFYQAFDKGEMADGEGRLIDCKNIVFFLTSNLGYQVIVEHADDP ETMQEVLYPVLADFFKPALLARMEVVPYLPLSKETLATIIAGKLARLDNVLRSRFGAEVV IEPEVTDEIMSRVTRAENGARMLESVIDGDMLPPLSLLLLQKMAANTAIARIRLSAVDGA FTADVEDAQNDESVTKDETVL >gi|296493213|gb|ADTK01000288.1| GENE 13 13649 - 14296 526 215 aa, chain + ## HITS:1 COG:no KEGG:ECO103_0217 NR:ns ## KEGG: ECO103_0217 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 215 33 247 247 400 100.0 1e-110 MAGVAVAVATTTPPDATATLQAMQSCRRESAALERLDCYDRLLAPLSPSGFDGALVKAGF VGEAWTRATEQEKHRQGNTTELLVTQVPGERPTVVITTPAIGHVPPRPVLMFSCVDNITR MQVALMHPLDVHDIAVTLNADNRALRSHWFVRENGTLLESSRGLSGIDEIKQLFGAKTLT VDTGADNAAGKLTFNIDGLARAIAPLRDACHWAGE >gi|296493213|gb|ADTK01000288.1| GENE 14 14301 - 15713 1197 470 aa, chain + ## HITS:1 COG:ECs0220 KEGG:ns NR:ns ## COG: ECs0220 COG3515 # Protein_GI_number: 15829474 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 207 470 1 264 264 456 92.0 1e-128 MAMDLRDPNVWISHLLENLPEEKLASALKDDNPNWEYIDGEIVKLGSLAHSQLDIPELQR RGLVILASESKDFRLLAHLLRTLQHAGDPHLALRLLALYVEHYWAVAAPQNMAHKKRFAS QVIKRFETGIEGFSQNAATAQRDALSGELAKLAQCWQSHNAPELAQATDDLFALYQRAFN RAAPAPVPTPAASGSSPQTTATSESGVTQPSAPAPQIVIDSHDDKAWRDTLLKVAAILCE RQPDSPQGYRLRRHALWQNITSTPQAESDGRTPLAAVSADMVADYHAQLGSADMALWQQV EKSVLLAPYWLDGHCLSAQTALRLGYKQVADAIRDEVIRFLERLPQLTGLLFNDHTPFIS EQTKQWLAASPDAKVAPVAQIGEESKAARACFAEQGLEAALRYLDMLPEGDPRDQFHRQY LAAQLTEEAGLVQLAQQQYRMLFRMGLQMMVADWEPSLLEQLEQKFTAEQ >gi|296493213|gb|ADTK01000288.1| GENE 15 15732 - 17750 1638 672 aa, 
chain + ## HITS:1 COG:Z0250 KEGG:ns NR:ns ## COG: Z0250 COG3523 # Protein_GI_number: 15799899 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 EDL933 # 31 672 1 642 1144 1228 98.0 0 MFRLPTPRLLSGLKSALRPAMPRFKVSAFWLLILAWIFLLVWIWWKGPTWTLYEEQWLKP LANRWLATAAWGIIALMWLTVRVMKRLQQLEKMQKQQREEAVDPLSVELNAQQRYLDRWL LRLQRHLDNRRFLWQLPWYMVIGPAGSGKTTLLREGFPSDIIYAPEGARGAEQRLYLTPH VGKQAVIFDIDGTLCAPADADILHRRLWEHALGWLKEKRARQPLNGIILTLDLPDLLTAD KRRREHLLQTLRSRLQDIRQHLHCQLPVYVVLTRLDLLQGFAALFQSLNRQDRDAILGVT FTRRAHENDDWRTELNAFWQTWVDRMNLALPDLMVAQTHTRASLFSFSRQMQGSREPLVS LLEGLLDGENMNVMLRGVYLTSSLQRGQMDDIFTQSAARQYRLGNNPLASWPLVDTAPYF TRSLFPQALLAEPNLATESRAWLIRSRRRLTVFSATGGVAALLLITGWHHYYNGNYQSGI TVLKQAKAFMDVPPPQGEDDFGNLQLPLLNPVRDATLAYGDWGDRSRLADMGLYQGRRIG PYVEQTYLQLLEQRYLPSLFNGLVKAMNAAPPESEEKLAVLRVMRMLEDKSGRNNQVVKQ YMAKRWSEKFHGQRDIQAQLMSHLDYALAHTDWHAERQAGDGDAISRWTPYDKPVVSAQK ELSKLPVYQRVY >gi|296493213|gb|ADTK01000288.1| GENE 16 17829 - 19256 1328 475 aa, chain + ## HITS:1 COG:Z0250 KEGG:ns NR:ns ## COG: Z0250 COG3523 # Protein_GI_number: 15799899 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 EDL933 # 1 475 670 1144 1144 946 99.0 0 MFTSADDNKLVVPQFLTRYGLQSYFVKQRDELVELTAMDSWVLNLTRSVKYSDADRAEIQ RQLTEQYISDYTATWRAGMDNLNIRNFESIGQLTGALEQVISGDQPLQRALTVLRDNTQP GVFSEKLSAKEREEALAEPDYQLLTRLGHEFAPENSTLAVQKDKESTMQAVYQQLTELHR YLLAIQNAPVPGKSALKAVQLRLDQNSSDPIFATRQMAKTLPAPLNRWVGRLTDQAWHVV MVEAVHYMEVDWRDSVVKPFNEQLANNYPFNPRSAQDASLDAFERFFKPDGILDTFYQQN LKLFIDNDLSLEDGDNNVIIREDIIAQLETAQKIRDIFFSKQNGLGTSFAVETVSLSGNK RRSVLNLDGQLVDYSQGRNYTAHLVWPNNMREGNESKLTLIGTSGNAPRSISFSGPWAQF RLFGAGQLTGVQDGNFTVRFSVDGGAMTYRVHTDTEDNPFSGGLFSQFGLSDTLY >gi|296493213|gb|ADTK01000288.1| GENE 17 19267 - 20619 1300 450 aa, chain + ## HITS:1 COG:Z0249 KEGG:ns NR:ns ## COG: Z0249 COG3515 # Protein_GI_number: 15799898 # Func_class: S Function unknown # Function: Uncharacterized protein conserved 
in bacteria # Organism: Escherichia coli O157:H7 EDL933 # 1 440 31 470 499 816 99.0 0 MNSNVLTQTIVTGSDPRGLPEFSAIREEINKASHPSQPELNWKLVESLALAIFKANGVDL HTATYYTLARTRTQGLAGFCEGAELLAAMVSHDWDKFWPQGGPARTEMLDWFNSRTGNIL RQQISFAESDLPLIYRTERALQLICDKLQQVELKRVPRVENLLYFMQNTRKRLEPQLKSN TENAAQTTVRTLIYAPETQASSTPEAVVPPLPGLPEMKVEVRSLTENPPQASVIKQGSTV RGFIAGIACSVAVASALWWWQVYPVQQQLLQVNDTAQGAATVWMASPELENYERRLQQLL DTSPVQPLETGMQMMRVADSRWPESLQQQQASTQWNEALKTRAQSSPQLRGWLQTRQDLH AFADLVMQREKEGLTLSYIKNVIWQAERGLGQETPVESLLTQYHDARAQKQNTDTLEKQI NERLEGVLSRWLLLKNNVMPEAATGTTAEK >gi|296493213|gb|ADTK01000288.1| GENE 18 20643 - 21125 353 160 aa, chain + ## HITS:1 COG:ECs0216 KEGG:ns NR:ns ## COG: ECs0216 COG3157 # Protein_GI_number: 15829470 # Func_class: S Function unknown # Function: Hemolysin-coregulated protein (uncharacterized) # Organism: Escherichia coli O157:H7 # 1 159 1 158 159 182 54.0 2e-46 MANISYLSLSGETQGLISAGCSTLDSVGNKAQPEHKDQIMVYALMHSISRSQNVNHHELI ITKPVDKSSPLLAKALSDNEKMAICEFILYRTSKAGIYQPYYKINLSKARISSIDFVTPH AVLEKELEPQERIAFIYEDISWEHTLAGTNAMSKWQDRVQ >gi|296493213|gb|ADTK01000288.1| GENE 19 21169 - 22083 72 304 aa, chain + ## HITS:1 COG:no KEGG:EC55989_0217 NR:ns ## KEGG: EC55989_0217 # Name: yafU # Def: inner membrane protein YafU # Organism: E.coli_55989 # Pathway: not_defined # 1 304 6 309 309 633 100.0 1e-180 MAGLRNNNNTQNAQWADYVGDILRGAQPINQLVPQHPYLNDVPLIDELRHQNTHHVILLT LDVAKKILSPITSFDYIHFITTHPSGIKDTLAWLVNASKLMTEFDDNGKIIFNLNALKYT KASYFEILGEKYIKITTSSPWLLEKLGKYIFSSRAPQVLELAIGWRGALSESIKGVKFCI WFSVAWRTIEFIMSSERDLVNFLGDFSMDVAKAVIAGGVATAIGSLASFACVSFGFPVIL VGGAILLTGIVCTVVLNEIDAQCHLSEKLKYAIRDGLKRQQELDKWKRENMTPFMYVLNT PPVI >gi|296493213|gb|ADTK01000288.1| GENE 20 22093 - 22572 -72 159 aa, chain + ## HITS:1 COG:no KEGG:ECO103_0211 NR:ns ## KEGG: ECO103_0211 # Name: ykfM # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 159 1 159 159 275 99.0 5e-73 MRLHVKLKEFLSMFFMAILFFPAFNASLFFTGVKPLYSIIKCSTEIFYDWRMLILCFGFT SFSFLNIHVILLTIIKSFLIKKTKVVNFATDITIQLTLIFLLIAIVIAPLIAPFVTGYVN 
ANYHPCGNNTGIFPGAIYIKNGMKCNNEYISRKEDSAVK >gi|296493213|gb|ADTK01000288.1| GENE 21 22709 - 23494 612 261 aa, chain - ## HITS:1 COG:no KEGG:EC55989_0216 NR:ns ## KEGG: EC55989_0216 # Name: yafT # Def: putative aminopeptidase # Organism: E.coli_55989 # Pathway: not_defined # 1 261 1 261 261 490 100.0 1e-137 MNSKKLCCICVLFSLLAGCASESSIDEKKKKAQVTQSNINKNTPQQLTDKDLFGNETTLA VSEEDIQAALDGDEFRVPLNSPVILVQSGNRAPETIMQEEMRKYYTVSTFSGIPDRQKPL TCNKNKDKNENEDVASAENMNWMQALRFVAAKGHQKAIIVYQDTLQTGKYDSALKSTVWS DYKNEKLTDAISLRYLVRFTLVDVATGEWATWSPVNYEYKVLPPLPDKNEASTTDMTEQQ IMQLKQKTYKAMVKDLVNRYQ >gi|296493213|gb|ADTK01000288.1| GENE 22 23713 - 23955 66 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|213863415|ref|ZP_03386670.1| ## NR: gi|213863415|ref|ZP_03386670.1| hypothetical protein SentesT_32125 [Salmonella enterica subsp. enterica serovar Typhi str. M223] # 32 80 1 49 49 94 100.0 3e-18 MTKVEAIRNIRLVPNGTQRGAVVQSVRIPACHAGGRGFESRPFRHYSLMKMSSESRKIFN FAVFLYLNSTISLFNDFTLA >gi|296493213|gb|ADTK01000288.1| GENE 23 24031 - 24771 695 246 aa, chain - ## HITS:1 COG:dnaQ KEGG:ns NR:ns ## COG: dnaQ COG0847 # Protein_GI_number: 16128202 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Escherichia coli K12 # 4 246 1 243 243 485 99.0 1e-137 MTAMSTAITRQIVLDTETTGMNQIGAHYEGHKIIEIGAVEVVNRRLTGNNFHVYLKPDRL VDPEAFGVHGIADEFLLDKPTFAEVADEFMDYIRGAELVIHNAAFDIGFMDYEFSLLKRD IPKTNTFCKVTDSLAVARKMFPGKRNSLDALCARYEIDNSKRTLHGALLDAQILAEVYLA MTGGQTSMAFAMEGETQQQQGEATIQRIVRQASKLRVVFATDDELAAHEARLDLVQKKGG SCLWRA >gi|296493213|gb|ADTK01000288.1| GENE 24 24827 - 25294 482 155 aa, chain + ## HITS:1 COG:ECs0210 KEGG:ns NR:ns ## COG: ECs0210 COG0328 # Protein_GI_number: 15829464 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HI # Organism: Escherichia coli O157:H7 # 1 155 1 155 155 314 100.0 5e-86 MLKQVEIFTDGSCLGNPGPGGYGAILRYRGREKTFSAGYTRTTNNRMELMAAIVALEALK 
EHCEVILSTDSQYVRQGITQWIHNWKKRGWKTADKKPVKNVDLWQRLDAALGQHQIKWEW VKGHAGHPENERCDELARAAAMNPTLEDTGYQVEV >gi|296493213|gb|ADTK01000288.1| GENE 25 25291 - 26013 408 240 aa, chain - ## HITS:1 COG:yafS KEGG:ns NR:ns ## COG: yafS COG0500 # Protein_GI_number: 16128200 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli K12 # 1 240 7 246 246 484 99.0 1e-137 MKPARVPQTVVAPDCWGDLPWGELYRKALERQLNPWFTKMYGFHLLKIGNLSAEINCEAC AVSHQVNVSAQGMPVQVQADPLHLPFADKSVDVCLLAHTLPWCTDPHRLLREADRVLIDD GWLVISGFNPISLMGLRKLVPVLRKTSPYNSRMFTLMRQLDWLSLLNFEVLHASRFHVLP WNKHGGKLLNAHIPALGCLQLIVARKRTIPLTLNPMKQSKNKPRIRQAVGATRQCRKPQA >gi|296493213|gb|ADTK01000288.1| GENE 26 26047 - 26802 399 251 aa, chain + ## HITS:1 COG:ECs0208 KEGG:ns NR:ns ## COG: ECs0208 COG0491 # Protein_GI_number: 15829462 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Escherichia coli O157:H7 # 1 251 1 251 251 516 99.0 1e-146 MNLNSIPAFDDNYIWVLNDEAGRCLIVDPGDAEPVLNAIATNNWQPEAIFLTHHHHDHVG GVKELVEKFPQIVVYGPQETQDKGTTQVVKDGETAFVLGHEFSVIATPGHTLGHICYFSK PYLFCGDTLFSGGCGRLFEGTPSQMYQSLKKLSALPDDTLVCCAHEYTLSNMKFALSILP HDLSINDYYRKVKELRAKNQITLPVILKNERQINVFLRTEDIDLINVINEETLLQQPEER FAWLRSKKDRF >gi|296493213|gb|ADTK01000288.1| GENE 27 27012 - 28232 733 406 aa, chain + ## HITS:1 COG:dniR KEGG:ns NR:ns ## COG: dniR COG0741 # Protein_GI_number: 16128198 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Escherichia coli K12 # 1 406 47 452 452 775 100.0 0 MDDGTSIAPDGDLWAFIGDELKMGIPENDRIREQKQKYLRNKSYLHDVTLRAEPYMYWIA GQVKKRNMPMELVLLPIVESAFDPHATSGANAAGIWQIIPSTGRNYGLKQTRNYDARRDV VASTTAALNMMQRLNKMFDGDWLLTVAAYNSGEGRVMKAIKTNKARGKSTDFWSLPLPQE TKQYVPKMLALSDILKNSKRYGVRLPTTDESRALARVHLSSPVEMAKVADMAGISVSKLK 
TFNAGVKGSTLGASGPQYVMVPKKHADQLRESLASGEIAAVQSTLVADNTPLNSRVYTVR SGDTLSSIASRLGVSTKDLQQWNKLRGSKLKPGQSLTIGAGSSAQRLANNSDSITYRVRK GDSLSSIAKRHGVNIKDVMRWNSDTANLQPGDKLTLFVKNNNMPDS >gi|296493213|gb|ADTK01000288.1| GENE 28 28280 - 29050 518 256 aa, chain - ## HITS:1 COG:ECs0206 KEGG:ns NR:ns ## COG: ECs0206 COG0500 # Protein_GI_number: 15829460 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli O157:H7 # 1 256 1 256 256 471 96.0 1e-133 MTTQSHHDHVEKQFSSQACEYLTSTVHASGRDLQRLAVRLADYPGASVLDMGCGAGHASF VAAQNVSTVVAYDLSAHMLDVVAQAAEARQLKNITTRQGYAESLPFADNAFDIVISRYSA HHWHDVGAALREVNRILKPGGRLIVMDVMSPGHPVRDIWLQTVEALRDISHVRNYASGEW LTLINEANLIVDNLITDKLPLEFSSWVARMRTPEALVDAIRIYQQSASTEVRTYFALQND GFFTSDIIMVDAHKAA >gi|296493213|gb|ADTK01000288.1| GENE 29 29128 - 29907 449 259 aa, chain - ## HITS:1 COG:ECs0205 KEGG:ns NR:ns ## COG: ECs0205 COG3021 # Protein_GI_number: 15829459 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 259 8 266 266 512 100.0 1e-145 MRYVAGQPAERILPPGSFASIGQALPPGEPLSTEERIRILVWNIYKQQRAEWLSVLKNYG KDAHLVLLQEAQTTPELVQFATANYLAADQVPAFVLPQHPSGVMTLSAAHPVYCCPLRER EPILRLAKSALVTVYPLPDTRLLMVVNIHAVNFSLGVDVYSKQLLPIGDQIAHHSGPVIM AGDFNAWSRRRMNALYRFAREMSLRQVRFTDDQRRRAFGRPLDFVFYRGLNVSEASVLVT RASDHNPLLVEFSPGKPDK >gi|296493213|gb|ADTK01000288.1| GENE 30 30169 - 31083 503 304 aa, chain + ## HITS:1 COG:yafC KEGG:ns NR:ns ## COG: yafC COG0583 # Protein_GI_number: 16128195 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 304 1 304 304 575 99.0 1e-164 MKATSEELAIFVSVVESGSFSRAAEQLGQANSAVSRAVKKLEMKLGVSLLNRTTRQLSLT EEGERYFRRVQSILQEMASAESEIMETRNTPRGLLRIDAATPVVLHFLMPLIKPFRERYP EVTLSLVSSETIINLIERKVDVAIRAGTLTDSSLRARPLFNSYRKIIASPDYISRYGKPE TIDDLKQHICLGFTEPASLNTWPIARSDGQLHEVKYGLSSNSGETLKQLCLSGNGIACLS 
DYMIDKEIARGELVELMADKVLPVEMPFSAVYYSDRAVSTRIRAFIDFLSEHVKTAPGGA VREA >gi|296493213|gb|ADTK01000288.1| GENE 31 31080 - 31883 1123 267 aa, chain - ## HITS:1 COG:yafB KEGG:ns NR:ns ## COG: yafB COG0656 # Protein_GI_number: 16128194 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Escherichia coli K12 # 1 267 1 267 267 512 100.0 1e-145 MAIPAFGLGTFRLKDDVVISSVITALELGYRAIDTAQIYDNEAAVGQAIAESGVPRHELY ITTKIWIENLSKDKLIPSLKESLQKLRTDYVDLTLIHWPSPNDEVSVEEFMQALLEAKKQ GLTREIGISNFTIPLMEKAIAAVGAENIATNQIELSPYLQNRKVVAWAKQHGIHITSYMT LAYGKALKDEVIARIAAKHNATPAQVILAWAMGEGYSVIPSSTKRKNLESNLKAQNLQLD AEDKKAIAALDCNDRLVSPEGLAPEWD >gi|296493213|gb|ADTK01000288.1| GENE 32 31859 - 32191 108 110 aa, chain - ## HITS:1 COG:no KEGG:EC55989_0204 NR:ns ## KEGG: EC55989_0204 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 2 62 8 68 91 117 98.0 9e-26 RELPGIKLSSKPVIKPVVVKEFGGAVVQSVRIPACHAGGRGFESRPFRHLLRSLELTLEV FFRLYIYYCQNRKNPLHFTLFFLNSLKPIITSVNENSIKRGILWLSLHLV Prediction of potential genes in microbial genomes Time: Mon May 16 15:54:21 2011 Seq name: gi|296493212|gb|ADTK01000289.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont871.1, whole genome shotgun sequence Length of sequence - 604 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 131 - 167 4.4 1 1 Tu 1 . 
- CDS 184 - 561 392 ## COG4723 Phage-related protein, tail component Predicted protein(s) >gi|296493212|gb|ADTK01000289.1| GENE 1 184 - 561 392 125 aa, chain - ## HITS:1 COG:ECs0841 KEGG:ns NR:ns ## COG: ECs0841 COG4723 # Protein_GI_number: 15830095 # Func_class: S Function unknown # Function: Phage-related protein, tail component # Organism: Escherichia coli O157:H7 # 1 125 91 215 215 216 97.0 6e-57 MPRLAGAKSGGVFQAVLGAALIAVAWWNPAGWLGAAALSGMYAAGASMILGGVAQMLAPK ARTPRTQTTDNGKQNTYFSSLDNMVAQGNVLPVLYGEMRVGSRVVSQEISTADEGDGGQV VVIGR Prediction of potential genes in microbial genomes Time: Mon May 16 15:54:21 2011 Seq name: gi|296493211|gb|ADTK01000290.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont886.1, whole genome shotgun sequence Length of sequence - 1778 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 393 - 452 3.8 1 1 Tu 1 . + CDS 513 - 746 221 ## ECBD_2095 uncharacterized protein YdfK + Prom 957 - 1016 4.5 2 2 Tu 1 . 
+ CDS 1063 - 1653 249 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs Predicted protein(s) >gi|296493211|gb|ADTK01000290.1| GENE 1 513 - 746 221 77 aa, chain + ## HITS:1 COG:no KEGG:ECBD_2095 NR:ns ## KEGG: ECBD_2095 # Name: not_defined # Def: uncharacterized protein YdfK # Organism: E.coli_BL21_DE3 # Pathway: not_defined # 1 77 1 77 77 138 100.0 6e-32 MKSKDTLKWFPAQLPEVRIILGDAVVEVAKQGRPINTRTLLDYIEGNIKKKSWLDNKELL QTAISVLKDNQNLNGKM >gi|296493211|gb|ADTK01000290.1| GENE 2 1063 - 1653 249 196 aa, chain + ## HITS:1 COG:pinR KEGG:ns NR:ns ## COG: pinR COG1961 # Protein_GI_number: 16129335 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Escherichia coli K12 # 1 196 1 196 196 360 100.0 1e-99 MSRIFAYCRISTLDQTTENQRREIESAGFKIKPQQIIEEHISGSAATSERPGFNRLLARL KCGDQLIVTKLDRLGCNAMDIRKTVEQLTETGIRVHCLALGGIDLTSPTGKMMMQVISAV AEFERDLLLERTHSGIVRARGAGKRFGRPPVLNEEQKQVVFERIKSGVSISAIAREFKTS RQTILRAKAKLQTPDI Prediction of potential genes in microbial genomes Time: Mon May 16 15:54:24 2011 Seq name: gi|296493210|gb|ADTK01000291.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont897.1, whole genome shotgun sequence Length of sequence - 1736 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 371 - 862 188 ## COG3231 Aminoglycoside phosphotransferase + Term 898 - 938 -0.9 + Prom 967 - 1026 1.8 2 2 Tu 1 . 
+ CDS 1093 - 1698 454 ## COG3570 Streptomycin 6-kinase Predicted protein(s) >gi|296493210|gb|ADTK01000291.1| GENE 1 371 - 862 188 163 aa, chain + ## HITS:1 COG:SMc02722 KEGG:ns NR:ns ## COG: SMc02722 COG3231 # Protein_GI_number: 15966137 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aminoglycoside phosphotransferase # Organism: Sinorhizobium meliloti # 3 108 112 217 303 132 59.0 2e-31 MGQQLGAVHSLSVDQCPFERRLSRMFGRAVDVVSRNAVNPDFLPDEDKSTPQLDLLARVE RELPVRLDQERTDMVVCHGDPCMPNFMVDPKTLQCTGLIDLGRLGTADRYADLALMIANA EENWAAPDEAERAFAVLFNVLGIEAPDRERLAFYLRLDPLTWG >gi|296493210|gb|ADTK01000291.1| GENE 2 1093 - 1698 454 201 aa, chain + ## HITS:1 COG:mll8451 KEGG:ns NR:ns ## COG: mll8451 COG3570 # Protein_GI_number: 13476976 # Func_class: V Defense mechanisms # Function: Streptomycin 6-kinase # Organism: Mesorhizobium loti # 1 197 77 276 279 216 54.0 2e-56 MLLEYAGERMLSHIVAEHGDYQATEIAAELMAKLYAASEEPLPSALLPIRDRFAALFQRA RDDQNAGCQTDYVHAAIIADQMMSNASELRGLHGDLHHENIMFSSRGWLVIDPVGLVGEV GFGAANMFYDPADRDDLCLDPRRIAQMADAFSRALDVDPRRLLDQAYAYGCLSAAWNADG EEEQRDLAIAAAIKQVRQTSY Prediction of potential genes in microbial genomes Time: Mon May 16 15:54:25 2011 Seq name: gi|296493209|gb|ADTK01000292.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont901.1, whole genome shotgun sequence Length of sequence - 511 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 15:54:25 2011 Seq name: gi|296493208|gb|ADTK01000293.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont905.1, whole genome shotgun sequence Length of sequence - 268 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 97 - 268 88 ## COG4723 Phage-related protein, tail component Predicted protein(s) >gi|296493208|gb|ADTK01000293.1| GENE 1 97 - 268 88 57 aa, chain + ## HITS:1 COG:ECs0841 KEGG:ns NR:ns ## COG: ECs0841 COG4723 # Protein_GI_number: 15830095 # Func_class: S Function unknown # Function: Phage-related protein, tail component # Organism: Escherichia coli O157:H7 # 1 57 1 57 215 112 98.0 1e-25 MAATHTLPLASPGMARICLYGDLQRFGRRIDLRVKTGSEAIRALAMQIPAFRQKLSD Prediction of potential genes in microbial genomes Time: Mon May 16 15:54:26 2011 Seq name: gi|296493207|gb|ADTK01000294.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont905.2, whole genome shotgun sequence Length of sequence - 738 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 + CDS 1 - 546 609 ## COG4723 Phage-related protein, tail component + Term 563 - 598 -0.4 2 1 Op 2 . + CDS 606 - 738 68 ## COG4733 Phage-related protein, tail component Predicted protein(s) >gi|296493207|gb|ADTK01000294.1| GENE 1 1 - 546 609 181 aa, chain + ## HITS:1 COG:ECs0841 KEGG:ns NR:ns ## COG: ECs0841 COG4723 # Protein_GI_number: 15830095 # Func_class: S Function unknown # Function: Phage-related protein, tail component # Organism: Escherichia coli O157:H7 # 1 181 35 215 215 304 93.0 7e-83 KTGAEAIRALSTQVPAFRQKLNDGWYQVRIAGRDAGETELSARLNEPLANGAVIHIVPRL AGAKSGGVFQVVLGAALIAVAWWNPVGWLGAAAVSGMYAAGASMILGGVAQMLAPKARTP TAASTDNGKQNTYFSSLDNMVAQGNVLPVLYGEMRVGSRVVSQEISTADEGDGGQVVVIG R >gi|296493207|gb|ADTK01000294.1| GENE 2 606 - 738 68 44 aa, chain + ## HITS:1 COG:ECs1648 KEGG:ns NR:ns ## COG: ECs1648 COG4733 # Protein_GI_number: 15830902 # Func_class: S Function unknown # Function: Phage-related protein, tail component # Organism: Escherichia coli O157:H7 # 1 44 1 44 1132 89 97.0 2e-18 MGKGSSKGHTPREAKDNLKSTQLLSVIDAISEGPIEGPVDGLKS Prediction of potential genes in microbial genomes Time: Mon May 16 15:54:30 2011 Seq 
name: gi|296493206|gb|ADTK01000295.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont908.1, whole genome shotgun sequence Length of sequence - 15735 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 7, operones - 4 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 97 - 129 1.1 1 1 Op 1 13/0.000 - CDS 369 - 2465 176 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 2 1 Op 2 . - CDS 2458 - 3732 698 ## COG0845 Membrane-fusion protein - Prom 3881 - 3940 12.3 + Prom 4304 - 4363 4.8 3 2 Op 1 . + CDS 4414 - 4611 124 ## pECS88_0101 conserved hypothetical protein YacA, possible repressor 4 2 Op 2 . + CDS 4608 - 4790 150 ## ECSE_P2-0093 hypothetical protein + Term 4905 - 4950 -0.9 5 3 Tu 1 . + CDS 4982 - 5197 74 ## APECO1_O1CoBM131 hypothetical protein + Term 5351 - 5410 4.5 + Prom 5651 - 5710 4.8 6 4 Op 1 . + CDS 5757 - 6791 201 ## COG0722 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase 7 4 Op 2 . + CDS 6712 - 6912 75 ## APECO1_O1CoBM135 hypothetical protein + Term 6973 - 7028 0.4 + Prom 7500 - 7559 7.1 8 5 Tu 1 . + CDS 7657 - 9825 1540 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins + Term 9843 - 9880 6.1 - Term 9578 - 9634 2.5 9 6 Op 1 1/1.000 - CDS 9870 - 10826 538 ## COG2819 Predicted hydrolase of the alpha/beta superfamily 10 6 Op 2 1/1.000 - CDS 10911 - 12140 833 ## COG2382 Enterochelin esterase and related enzymes - Prom 12161 - 12220 6.4 11 7 Tu 1 . 
- CDS 12244 - 15735 2246 ## COG1132 ABC-type multidrug transport system, ATPase and permease components Predicted protein(s) >gi|296493206|gb|ADTK01000295.1| GENE 1 369 - 2465 176 698 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 497 696 135 335 398 72 28 2e-12 MTNRNFRQIINLLDLRWQRRVPVIHQTETAECGLACLAMICGHFGKNIDLIYLRRKFNLS ARGATLAGINGIAEQLGMATRALSLELDELRVLKTPCILHWDFSHFVVLVSVKRNRYVLH DPARGIRYISREEMSRYFTGVALEVWPGSEFQSETLQTRISLRSLINSIYGIKRTLAKIF CLSVVIEAINLLMPVGTQLVMDHAIPAGDRGLLTLISAALMFFILLKAATSTLRAWSSLV MSTLINVQWQSGLFDHLLRLPLAFFERRKLGDIQSRFDSLDTLRATFTTSVIGFIMDSIM VVGVCVMMLLYGGYLTWIVLCFTTIYIFIRLVTYGNYRQISEECLVREARAASYFMETLY GIATVKIQGMVGIRGAHWLNMKIDAINSGIKLTRMDLLFGGINTFVTACDQIVILWLGAG LVIDNQMTIGMFVAFSSFRGQFSERVASLTSFLLQLRIMSLHNERIADIALHEKEEKKPE IEIVADMGPISLETNGLSYRYDSQSAPIFSALSLSVAPGESVAITGASGAGKTTLMKVLC GLFEPDSGRVLINGIDIRQIGINNYHRMIACVMQDDRLFSGSIRENICGFAEEMDEEWMV ECARASHIHDVIMNMPMGYETLIGELGEGLSGGQKQRIFIARALYRKPGILFMDEATSAL DSESEHFVNVAIKNMNITRVIIAHRETTLRTVDRVISI >gi|296493206|gb|ADTK01000295.1| GENE 2 2458 - 3732 698 424 aa, chain - ## HITS:1 COG:YPO0099 KEGG:ns NR:ns ## COG: YPO0099 COG0845 # Protein_GI_number: 16120446 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Yersinia pestis # 16 424 2 411 411 369 46.0 1e-102 MFRQDALENRKMKWQGRAILLPGIPLWLIMLGSIVFITAFLMFIIVGTYSRRVNVSGEVT TWPRAVNIYSGVQGFVVRQFVHEGQLIKKGDPVYLIDISKSTRNGIVTDNHRRDIENQLV RVNNIISRLEESKKITLDTLEKQRLQYTDAFRRSSDIIQRAEEGIKIMKNNMENYRYYQS KGLINKDQLTNQVALYYQQQNNLLSLSGQNEQNALQITTLESQIQTQAADFDNRIYQMEL QRLELQKELVNTDVEGEIIIRALSDGKVDSLSVTVGQMVNTGDSLLQVIPENIENYYLIL WVPNDAVPYISAGDKVNIRYEAFPSEKFGQFSATVKTISRTPASTQEMLTYKGAPQNTPG ASVPWYKVIATPEKQIIRYDEKYLPLENGMKAESTLFLEKRRIYQWMLSPFYDMKHSATG PIND >gi|296493206|gb|ADTK01000295.1| GENE 3 4414 - 4611 124 65 aa, chain + ## HITS:1 COG:no KEGG:pECS88_0101 NR:ns ## KEGG: pECS88_0101 # Name: yacA # Def: conserved hypothetical 
protein YacA, possible repressor # Organism: E.coli_S88 # Pathway: not_defined # 1 65 25 89 89 108 100.0 6e-23 MDRNGSQLIRDFMRQTVERQHNTWFRDQVEAGRQQLERGDVLPHDMVESSAAAWRDEMSR KVAGK >gi|296493206|gb|ADTK01000295.1| GENE 4 4608 - 4790 150 60 aa, chain + ## HITS:1 COG:no KEGG:ECSE_P2-0093 NR:ns ## KEGG: ECSE_P2-0093 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 43 1 43 104 82 93.0 4e-15 MMEIFWTMLASQDRKRIREYIAEQNLIAAIELDERIGCSASLLFSLSFISVQVHDNIITV >gi|296493206|gb|ADTK01000295.1| GENE 5 4982 - 5197 74 71 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM131 NR:ns ## KEGG: APECO1_O1CoBM131 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 71 45 115 115 119 100.0 4e-26 MPFWQVTLLMGGLWGISWGCAMWFMYWGPSGMVAGEAIIISITSGFLFGLLMASFHWWRR KVNRLPPWNDV >gi|296493206|gb|ADTK01000295.1| GENE 6 5757 - 6791 201 344 aa, chain + ## HITS:1 COG:aroH KEGG:ns NR:ns ## COG: aroH COG0722 # Protein_GI_number: 16129660 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Escherichia coli K12 # 10 342 8 342 348 332 49.0 8e-91 MQSASKLIYRGKLLGSLPTVGEIHKEIAVSEETVTWISLQRDIIANILLGKDPRLLVIVG PCSIHDVQAAVDYAKRLFVLQNKYLSKMYIVMRTYFEKPRTRKGWKGIMHDPDLNGSYDV EKGIRYARQCLSSITTMRVATATEFLDPFLTPYIADLICWGAIGARTTESQTHRQLASGL HCPVGFKNSTDGNINLAIDAIIAAREQHIVYMTSLTNSISTLLTDGNPHGHLILRGGREP NYGLSDITKAVKLMHDEGINHRLIIDCSHGNSGKVAKRQISVARQVIDNRKKIPGYVAGI MVESFLQDGKQSDSFPLEYGQSVTDECISWQQTEQLLSTLAAQL >gi|296493206|gb|ADTK01000295.1| GENE 7 6712 - 6912 75 66 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM135 NR:ns ## KEGG: APECO1_O1CoBM135 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 66 1 66 66 113 100.0 2e-24 MVSLLLMNAYHGNRQNNYYQLWQHSFNFNYCTYIAFKGGKMKLLSLRHSISVIREADILS LQFMII >gi|296493206|gb|ADTK01000295.1| GENE 8 7657 - 9825 1540 722 aa, chain + ## HITS:1 COG:STM2777 KEGG:ns NR:ns ## COG: STM2777 
COG4771 # Protein_GI_number: 16766089 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Salmonella typhimurium LT2 # 1 722 3 726 726 1154 82.0 0 MRINKILWSLTVLLVGLNSQVSVAKSSDDDNDETLVVEATAEQVLKQQPGVSVITSEDIK KTPPVNDLSDIIRKMPGVNLTGNSASGTRGNNRQIDIRGMGPENTLILIDGVPVTSRNSV RYSWRGERDTRGDTNWVPPEQVERIEVIRGPAAARYGSGAAGGVVNIITKRPTNDWHGSL SLYTNQPESSDEGATRRANFSLSGPLAGNALTTRLYGNLNKTDADSWDINSPVGTKNAAG HEGVRNKDINGVVSWKLNPQQILDFEAGYSRQGNIYAGDTQNSSSSAVTESLAKSGKETN RLYRQNYGITHNGIWDWGQSRFGVYYEKTNNTRMNEGLSGGGEGRILAGEKFTTNRLSSW RTSGELNIPLNVMVDQTLTVGAEWNRDKLDDPSSTSLTVNDSDISGSAADRSSKNHSQIS ALYIEDNIEPVPGTNIIPGVRFDYLSDSGGNFSPSLNLSQELGDYFKVKAGVARTFKAPN LYQSSEGYLLYSKGNGCPKDITSGGCYLIGNKDLDPEISVNKEIGLEFTWEDYHASVTYF RNDYQNKIVAGDNVIGQTASGAYILKWQNGGKALVDGIEASMSFPLVKDRLNWNTNATWM ITSEQKDTGNPLSVIPKYTINNSLNWTITQAFSASVNWTLYGRQKPRTHAETRSEDTGGL SGKELGAYSLVGTNFNYDINKNLRLNVGVSNILNKQIFRSSEGANTYNEPGRAYYAGVTA SF >gi|296493206|gb|ADTK01000295.1| GENE 9 9870 - 10826 538 318 aa, chain - ## HITS:1 COG:STM2776 KEGG:ns NR:ns ## COG: STM2776 COG2819 # Protein_GI_number: 16766088 # Func_class: R General function prediction only # Function: Predicted hydrolase of the alpha/beta superfamily # Organism: Salmonella typhimurium LT2 # 1 314 1 311 311 390 61.0 1e-108 MYAREYRSTRPHKAIFFHLSCLTLICSAQVYAKPDMRPLGPNIADKGSVFYHFSTTSFDS VDGTRHYRVWTAVPNTTAPASGYPILYMLDGNAVMDRLDDELLKQLSEKTPPVIVAVGYQ TNLPFDLNSRAYDYTPAAESRKTDLHSGRFSRKSGGSNNFRQLLETRIAPKVEQGLNIDR QRRGLWGHSYGGLFVLDSWLSSSYFRSYYSASPSLGRGYDALLSRVTAVEPLQFCAKHLA IMEGSATQGDNRETHAVGVLSKIHTTLTILKDKGVNVVFWDFPNLGHGPMFNASFRQALL DISGENANYTAGCHELSH >gi|296493206|gb|ADTK01000295.1| GENE 10 10911 - 12140 833 409 aa, chain - ## HITS:1 COG:STM2775 KEGG:ns NR:ns ## COG: STM2775 COG2382 # Protein_GI_number: 16766087 # Func_class: P Inorganic ion transport and metabolism # Function: Enterochelin esterase and related enzymes # Organism: Salmonella typhimurium LT2 # 8 407 9 412 414 545 67.0 1e-155 
MLNMQQHPSAIASLRNQLAAGHIANLTDFWREAESLNVPLVTPVEGAEDEREVTFLWRAR HPLQGVYLRLNRVTDKEHVEKGMMSALPETDIWTLTLRLPASYCGSYSLLEIPPGTTAET IALSGGRFSTLAGKADPLNKMPEINVRGNAKESVLTLDKAPALSEWNGGFHTGQLLTSMR IIAGKSRQIRLYIPDVDISQPLGLVVLPDGETWFDHLGVCAAIDAAINNGRIVPVAVLGI DNINEHERTEILGGRSKLIKDIAGHLLPMIRAEQPQRQWADRSRTVLAGQSLGGISALMG ARYAPETFGLVLSHSPSMWWTPERTSRPGLFSETDTSWVSEHLLSAPPQGVRISLCVGSL EGSTVPHVQQLHQRLITAGVESHCAIYTGGHDYAWWRGALIDGIGLLQG >gi|296493206|gb|ADTK01000295.1| GENE 11 12244 - 15735 2246 1163 aa, chain - ## HITS:1 COG:STM2774 KEGG:ns NR:ns ## COG: STM2774 COG1132 # Protein_GI_number: 16766086 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Salmonella typhimurium LT2 # 18 1149 74 1208 1217 1733 79.0 0 GGGNPARLPWLACGLLLIAFFDFIGNYVRRGYAGMLSLWVQHTLRGRVFDSIQKLDGAGQ DALRTGQVISRTNSDLQQVHTLLQMCPVPLAVFTYYIAGIAVMLWMSPAMTLIVVCVLVC LAITALRARRRVFAQTGLASDQLANLTEHIREVLAQISVVKSCVAEMRETHWLDRQSRQI VRVRIGAVISQAMPGATMLALPVLGQIVLLCYGGWSVMHGRIDLGTFVAFASFLAMLTGP TRVLASFLVIAQRTQASVERVFALIDTRSQMEDGTESINSQVVGLELENMSFDYHHGDRH ILSDISFSLRAGETVAVVGASGSGKSTLLMLLARFYDPCSGKIWLNTSEGRQNLRDIRLE ALRRRVGIVFEDAFLFAGTVAENIAYGHPQATADDIRRAAAAAGASDFINALPKGFDSLL TERGTNLSGGQRQRIALARALITAPDVLILDDTTSAVDAVTEAEINTALGRYADEGHMLL VIARRRSTLQLASRVVVLDKGRMVDTGTPAELEARCPAFRALMTGDSDFLATSHNSHNEL WPAEPATQDDVTDTGDKGFVARMTRVPENAVQQALAGKGRKVTSLLKPVAWMFVIAALLI ALDSAAGVGVLILLQHGIDSGVAAGDMSTIGLCALLALSLVIVGWCSYSLQTVFAARAAE SVQHSVRLRSFGHMLRLGLPWHEKHADSRLTRMTVDVDSLARFLQNGLAGAATSLVTMFA IAATMFWLDPFLALTALSAVPVAALATMIYRRLSTPAYAQARLEIGKVNSTLQEKVSGMR VVQSHGQQELEGARLRALSERFRATRVRAQKYLAVYFPFLTFCTEASYAAVLLVGASQVA AGEMTAGVLAAFFLLLGQFYGPVQQLSGIVDAWQQATASGKHIDELLATEGTENLGSSSV LPVTGALHLDEVTFSYPDSHEPALNKLTLTIPEGMVVAVVGRSGAGKSTLIKLIAGLYFP THGNIRIGVQMLDDASLAEYRRQIGLVDQDVALFSSDIAENIRYSRPSATNEDVEIASQR AGLYEMVCNLPQGFRTPVNNGGADLSAGQRQLIALSRAQLANAHILLLDEATSCLDRTSE ERLMSSLTDVVHAGKHSALIVAHRLTTAQRCDLIAVIDKGLLAEYGTHEQLLSAGGLYTR LWHDSVSSTALHRQHNMKEETPG Prediction of potential genes in microbial 
genomes Time: Mon May 16 15:54:39 2011 Seq name: gi|296493205|gb|ADTK01000296.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont908.2, whole genome shotgun sequence Length of sequence - 5454 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 3 - 198 137 ## COG1132 ABC-type multidrug transport system, ATPase and permease components - Term 211 - 241 0.3 2 1 Op 2 . - CDS 338 - 1453 810 ## COG1819 Glycosyl transferases, related to UDP-glucuronosyltransferase 3 2 Tu 1 . - CDS 2689 - 3249 383 ## pECS88_0130 hypothetical protein 4 3 Tu 1 . - CDS 3434 - 3724 192 ## EC55989_4862 hypothetical protein - Prom 3750 - 3809 3.0 5 4 Tu 1 . - CDS 4599 - 4892 277 ## EFER_2699 putative lipoprotein; DLP12 prophage - Prom 4926 - 4985 7.0 Predicted protein(s) >gi|296493205|gb|ADTK01000296.1| GENE 1 3 - 198 137 65 aa, chain - ## HITS:1 COG:STM2774 KEGG:ns NR:ns ## COG: STM2774 COG1132 # Protein_GI_number: 16766086 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Salmonella typhimurium LT2 # 1 60 1 60 1217 102 81.0 1e-22 MPANHTPTPAQSWIVRLARVCWERKKLSVIVVVASVSTILLAALTPLLTRQAVNDALAGN PAPPP >gi|296493205|gb|ADTK01000296.1| GENE 2 338 - 1453 810 371 aa, chain - ## HITS:1 COG:STM2773 KEGG:ns NR:ns ## COG: STM2773 COG1819 # Protein_GI_number: 16766085 # Func_class: G Carbohydrate transport and metabolism; C Energy production and conversion # Function: Glycosyl transferases, related to UDP-glucuronosyltransferase # Organism: Salmonella typhimurium LT2 # 1 371 1 371 371 662 86.0 0 MRILFVGPPLYGLLYPVLSLAQAFRVNGHEVLIASGGQFAQKAAEAGLVVFDAAPGLDSE AGYRHHEAQRKKSNIGTQMGNFSFFSEEMADHLVEFAGHWRPDLIIYPPLGVIGPLIAAK YDIPVVMQTVGFGHTPWHIKGVTRSLTDAYRRHNVGATPRDMAWIDVTPPSMSILENDGE PIIPMQYVPYNGGAVWEPWWERRPERKRLLVSLGTVKPMVDGLDLIAWVMDSASEVDAEI ILHISANARSDLRSLPSNVRLVDWIPMGVFLNGADGFIHHGGAGNTLTALHAGIPQIVFG 
QGADRPVNARVVAERGCGIIPGDVGLSSNMINAFLNNRSLRKASEEVAAEMAAQPCPGEV AKSLITMVQKG >gi|296493205|gb|ADTK01000296.1| GENE 3 2689 - 3249 383 186 aa, chain - ## HITS:1 COG:no KEGG:pECS88_0130 NR:ns ## KEGG: pECS88_0130 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 186 26 211 211 388 100.0 1e-107 MDAIELRMGEMRRYIITMIDEHSDYVLALAVPSLNSDIVNHFFSRAARLFPVGISQIITD NGKEFLGSFDKTLQEAAIKHLWTYPYTPKMNAICERFNRALREQFIEFNEILLFEDLALF NLKLAEYLALYNSKRLHKALALTTPVEYILKENKNCNMWWTHTQQTHRPPRLGRYVGWLV QTSGYG >gi|296493205|gb|ADTK01000296.1| GENE 4 3434 - 3724 192 96 aa, chain - ## HITS:1 COG:no KEGG:EC55989_4862 NR:ns ## KEGG: EC55989_4862 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 96 6 101 251 201 100.0 1e-50 MADIATMRMKALHFWDKHGISAASEAFGVSCRTLYWWRQLLNKGGPEGLIPHSKAPLVRR KKHWHPDVLKEIRRLRTELPNLGKEQIFVRLKPWCA >gi|296493205|gb|ADTK01000296.1| GENE 5 4599 - 4892 277 97 aa, chain - ## HITS:1 COG:no KEGG:EFER_2699 NR:ns ## KEGG: EFER_2699 # Name: borD # Def: putative lipoprotein; DLP12 prophage # Organism: E.fergusonii # Pathway: not_defined # 1 97 17 113 113 181 100.0 6e-45 MKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHHFFVSGIGQEKTVDAAKICGG AENVVKTETQQTFVNGLLGFITFGIYTPLEARVYCSQ Prediction of potential genes in microbial genomes Time: Mon May 16 15:54:46 2011 Seq name: gi|296493204|gb|ADTK01000297.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont917.1, whole genome shotgun sequence Length of sequence - 633 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 120 - 518 480 ## ECSP_1541 putative DNA packaging protein of prophage CP-933X Predicted protein(s) >gi|296493204|gb|ADTK01000297.1| GENE 1 120 - 518 480 132 aa, chain + ## HITS:1 COG:no KEGG:ECSP_1541 NR:ns ## KEGG: ECSP_1541 # Name: not_defined # Def: putative DNA packaging protein of prophage CP-933X # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 132 14 145 145 211 99.0 4e-54 MTKDELIARLRSLGEQLNRDVSLTGTKEELALRVAELEEELDDTDETAGQDTPLSRENVL TGHENEVGSAQPDTVILDTSELVTVVALVKLHTDALHATRDEPVAFVLPGTAFRVSAGVA AEMTERGLARMQ Prediction of potential genes in microbial genomes Time: Mon May 16 15:54:54 2011 Seq name: gi|296493203|gb|ADTK01000298.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont923.1, whole genome shotgun sequence Length of sequence - 16460 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 5, operones - 3 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/1.000 - CDS 197 - 1057 902 ## COG0191 Fructose/tagatose bisphosphate aldolase 2 1 Op 2 9/0.000 - CDS 1070 - 2224 945 ## COG2222 Predicted phosphosugar isomerases - Prom 2313 - 2372 6.1 - Term 2521 - 2556 -0.0 3 1 Op 3 3/1.000 - CDS 2575 - 3708 1010 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase 4 1 Op 4 4/0.500 - CDS 3705 - 4139 606 ## COG2893 Phosphotransferase system, mannose/fructose-specific component IIA 5 1 Op 5 13/0.000 - CDS 4157 - 5035 972 ## COG3716 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID 6 1 Op 6 13/0.000 - CDS 5025 - 5804 1039 ## COG3715 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC 7 1 Op 7 1/1.000 - CDS 5815 - 6288 549 ## COG3444 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB 8 1 Op 8 . - CDS 6311 - 7573 1321 ## COG4573 Predicted tagatose 6-phosphate kinase - Prom 7675 - 7734 6.0 + Prom 7620 - 7679 6.0 9 2 Tu 1 . 
+ CDS 7840 - 8649 695 ## COG1349 Transcriptional regulators of sugar metabolism + Term 8709 - 8772 2.1 - Term 8526 - 8568 2.1 10 3 Op 1 . - CDS 8704 - 9168 202 ## EcolC_0570 hypothetical protein 11 3 Op 2 3/1.000 - CDS 9168 - 9503 240 ## COG2002 Regulators of stationary/sporulation gene expression - Prom 9532 - 9591 7.6 - Term 9591 - 9631 2.5 12 4 Tu 1 . - CDS 9652 - 11223 1239 ## COG2721 Altronate dehydratase - Prom 11424 - 11483 7.0 + Prom 11510 - 11569 7.8 13 5 Op 1 6/0.000 + CDS 11598 - 12932 1643 ## COG0477 Permeases of the major facilitator superfamily 14 5 Op 2 1/1.000 + CDS 12948 - 13718 783 ## COG3836 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase 15 5 Op 3 1/1.000 + CDS 13748 - 14638 1163 ## COG2084 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases 16 5 Op 4 . + CDS 14735 - 15880 1069 ## COG1929 Glycerate kinase Predicted protein(s) >gi|296493203|gb|ADTK01000298.1| GENE 1 197 - 1057 902 286 aa, chain - ## HITS:1 COG:ECs4017 KEGG:ns NR:ns ## COG: ECs4017 COG0191 # Protein_GI_number: 15833271 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Escherichia coli O157:H7 # 1 286 1 286 286 577 100.0 1e-165 MSIISTKYLLQDAQANGYAVPAFNIHNAETIQAILEVCSEMRSPVILAGTPGTFKHIALE EIYALCSAYSTTYNMPLALHLDHHESLDDIRRKVHAGVRSAMIDGSHFPFAENVKLVKSV VDFCHSQDCSVEAELGRLGGVEDDMSVDAESAFLTDPQEAKRFVELTGVDSLAVAIGTAH GLYSKTPKIDFQRLAEIREVVDVPLVLHGASDVPDEFVRRTIELGVTKVNVATELKIAFA GAVKAWFAENPQGNDPRYYMRVGMDAMKEVVRNKINVCGSANRISA >gi|296493203|gb|ADTK01000298.1| GENE 2 1070 - 2224 945 384 aa, chain - ## HITS:1 COG:agaS KEGG:ns NR:ns ## COG: agaS COG2222 # Protein_GI_number: 16131028 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted phosphosugar isomerases # Organism: Escherichia coli K12 # 1 384 1 384 384 770 99.0 0 MPKNYTPAAAATGTWTEEEIRHQPRAWIRSLTNIDALRSALNNFLEPLLRKENLRIILTG AGTSAFIGDIIAPWLASHTGKNFSAVPTTDLVTNPMDYLNPAHPLLLISFGRSGNSPESV 
AAVELANQFVPECYHLPITCNEAGALYQNAINSDNAFALLMPAETHDRGFAMTSSITTMM ASCLAVFAPETINSQTFRDVADRCQAILTSLGDFSEGVFGYAPWKRIVYLGSGGLQGAAR ESALKVLELTAGKLAAFYDSPTGFRHGPKSLVDDETLVVVFVSSHPYTRQYDLDLLAELR RDNQAMRVIAIAAESSDIVAAGPHIILPPSRHFIDVEQAFCFLMYAQTFALMQSLHMGNT PDTPSASGTVNRVVQGVIIHPWQA >gi|296493203|gb|ADTK01000298.1| GENE 3 2575 - 3708 1010 377 aa, chain - ## HITS:1 COG:ECs4015 KEGG:ns NR:ns ## COG: ECs4015 COG1820 # Protein_GI_number: 15833269 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Escherichia coli O157:H7 # 1 377 8 384 384 719 98.0 0 MTHVLRARRLLTEEGWLDDHQLCIADGVIAAIEPIPVGVTERDAELLCPAYIDTHVHGGA GVDVMDDAPDVLDKLAMHKAREGVGSWLPTTVTAPLNTIHAALKRIAQRCQRGGPGAQVL GSYLEGPYFTPQNKGAHPPELFRELEIAELDQLIAVSQHTLRVVALAPEKEGALQAIRHL KQQNVRVMLGHSAATWQQTRAAFDAGADGLVHCYNGMTGLHHREPGMVGAGLTDKRAWLE LIADGHHVHPAAMSLCCCCAKERIVLITDAMQAAGMPDGRYTLCGEKVQMHGGVVRTASG GLAGSTLSVDAAVRNMVELTGVTPAEAIHMASLHPARMLGVDGVLGSLKPGKRASIVALD SGLHVQQIWIQSQLASF >gi|296493203|gb|ADTK01000298.1| GENE 4 3705 - 4139 606 144 aa, chain - ## HITS:1 COG:ECs4014 KEGG:ns NR:ns ## COG: ECs4014 COG2893 # Protein_GI_number: 15833268 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose-specific component IIA # Organism: Escherichia coli O157:H7 # 1 144 1 144 144 256 100.0 1e-68 MLSIILTGHGGFASGMEKAMKQILGEQSQFIAIDFPETSSTALLTSQLEEAIAQLDCEDG IVFLTDLLGGTPFRVASTLAMQKPGCEVITGTNLQLLLEMVLEREGLSGEEFRVQALECG HRGLTSLVDELGRCHEECPVEEGI >gi|296493203|gb|ADTK01000298.1| GENE 5 4157 - 5035 972 292 aa, chain - ## HITS:1 COG:ECs4013 KEGG:ns NR:ns ## COG: ECs4013 COG3716 # Protein_GI_number: 15833267 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID # Organism: Escherichia coli O157:H7 # 1 292 1 292 292 561 100.0 1e-160 MASNQTTLPNVSENEETLLTGVNENVYEDQSIGAELTKKDINRVAWRSMLLQASFNYERM 
QASGWLYGLLPALKKIHTNKRDLARAMKGHMGFFNTHPFLVTFVIGIILAMERSKQDVNS IQSTKIAVGAPLGGIGDAMFWLTLLPICGGIGASLALQGSILGAVVFIVLFNVVHLGLRF GLAHYAYRMGVAAIPLIKANTKKVGHAASIVGMTVIGALVATYVRLSTTLEITAGDAVVK LQADVIDKLMPAFLPLVYTLTMFWLVRRGWSPLRLIAVTVVLGIVGKFCHFL >gi|296493203|gb|ADTK01000298.1| GENE 6 5025 - 5804 1039 259 aa, chain - ## HITS:1 COG:ECs4012 KEGG:ns NR:ns ## COG: ECs4012 COG3715 # Protein_GI_number: 15833266 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC # Organism: Escherichia coli O157:H7 # 1 259 1 259 259 429 100.0 1e-120 MEISLLQAFALGIIAFIAGLDMFNGLTHMHRPVVLGPLVGLVLGDLHTGILTGGTLELVW MGLAPLAGAQPPNVIIGTIVGTAFAITTGVKPDVAVGVAVPFAVAVQMGITFLFSVMSGV MSRCDRMAENADTRGIERVNYLALLALGTFYFLCAFLPIYFGAEHAKTIIDVLPQRLIDG LGVAGGIMPAIGFAVLLKIMMKNVYIPYFILGFVAAAWLKLPVLAIAAAALAMALIDLLR KSPEPTQPAAQKEEFEDGI >gi|296493203|gb|ADTK01000298.1| GENE 7 5815 - 6288 549 157 aa, chain - ## HITS:1 COG:ZagaV KEGG:ns NR:ns ## COG: ZagaV COG3444 # Protein_GI_number: 15803671 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB # Organism: Escherichia coli O157:H7 EDL933 # 1 157 13 169 169 305 99.0 2e-83 MPNIVLSRIDERLIHGQVGVQWVGFAGANLVLVANDEVAEDPVQQNLMEMVLAEGIAVRF WTLQKVIDNIHRAADRQKILLVCKTPADFLTLVKGGVPVNRINVGNMHYANGKQQIAKTV SVDAGDIAAFNDLKAAGVECFVQGVPTEPAVDLFKLL >gi|296493203|gb|ADTK01000298.1| GENE 8 6311 - 7573 1321 420 aa, chain - ## HITS:1 COG:agaZ KEGG:ns NR:ns ## COG: agaZ COG4573 # Protein_GI_number: 16131024 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted tagatose 6-phosphate kinase # Organism: Escherichia coli K12 # 1 420 7 426 426 843 99.0 0 MVRQHKAGKTNAIYAVCSAHPLVLEAAIRYASANQTPLLIEATSNQVDQFGGYTGMTPAD FRGFVCQLADSLNFPQDALILGGDHLGPNRWQNLPAAQAMANADDLIKSYVAAGFKKIHL DCSMSCQDDPIPLTDDIVAERAARLAKVAEETCLEHFGEADLEYVIGTEVPVPGGAHETL 
SELAVTTPDAARATLEAHRHAFEKQGLNAIWPRIIALVVQPGVEFDHTNVIDYQPAKASA LSQMVENYETLIFEAHSTDYQTPQSLRQLVIDHFAILKVGPALTFALREALFSLAAIEEE LVPAKACSGLRQVLEDVMLDRPEYWQSHYHGDGNARRLARGYSYSDRVRYYWPDSQIDDA FAHLVRNLADSQIPLPLISQYLPLQYVKVRSGELQPTPRELIINHIQDILAQYHTACEGQ >gi|296493203|gb|ADTK01000298.1| GENE 9 7840 - 8649 695 269 aa, chain + ## HITS:1 COG:ECs4009 KEGG:ns NR:ns ## COG: ECs4009 COG1349 # Protein_GI_number: 15833263 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli O157:H7 # 1 269 1 269 269 492 100.0 1e-139 MSNTDASGEKRVTGTSERREQIIQRLRQQGSVQVNDLSALYGVSTVTIRNDLAFLEKQGI AVRAYGGALICDSTTPSVEPSVEDKSALNTAMKRSVAKAAVELIQPGHRVILDSGTTTFE IARLMRKHTDVIAMTNGMNVANALLEAEGVELLMTGGHLRRQSQSFYGDQAEQSLQNYHF DMLFLGVDAIDLERGVSTHNEDEARLNRRMCEVAERIIVVTDSSKFNRSSLHKIIDTQRI DMIIVDEGIPADSLEGLRKAGVEVILVGE >gi|296493203|gb|ADTK01000298.1| GENE 10 8704 - 9168 202 154 aa, chain - ## HITS:1 COG:no KEGG:EcolC_0570 NR:ns ## KEGG: EcolC_0570 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 154 1 154 154 311 100.0 4e-84 MDFPQRVNGWALYAHPCFQETYDALVAEVEALKGKDPENYQRKAATKLLAVVHKVIEEHI TVNPSSPAFRHGKSLGSGKNKDWSRVKFGAGRYRLFFRYSEKEKVIILGWMNDENTLRTY GKKTDAYTVFSKMLKRGHPPADWESLTQETEENH >gi|296493203|gb|ADTK01000298.1| GENE 11 9168 - 9503 240 111 aa, chain - ## HITS:1 COG:sohA KEGG:ns NR:ns ## COG: sohA COG2002 # Protein_GI_number: 16131021 # Func_class: K Transcription # Function: Regulators of stationary/sporulation gene expression # Organism: Escherichia coli K12 # 1 111 1 111 111 216 100.0 8e-57 MPANARSHAVLTTESKVTIRGQTTIPAPVREALKLKPGQDSIHYEILPGGQVFMCRLGDE QEDHTMNAFLRFLDADIQNNPQKTRPFNIQQGKKLVAGMDVNIDDEIGDDE >gi|296493203|gb|ADTK01000298.1| GENE 12 9652 - 11223 1239 523 aa, chain - ## HITS:1 COG:yhaG KEGG:ns NR:ns ## COG: yhaG COG2721 # Protein_GI_number: 16131020 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: 
Escherichia coli K12 # 1 523 1 523 523 1055 99.0 0 MANIEIRQETPTAFYIKVHDTDNVAIIVNDNGLKAGTRFPDGLELIEHIPQGHKVALLDI PANGEIIRYGEVIGYAVRAIPRGSWIDESMVVLPEAPPLHTLPLATKVPEPLPPLEGYTF EGYRNADGSVGTKNLLGITTSVHCVAGVVDYVVKIIERDLLPKYPNVDGVVGLNHLYGCG VAINAPAAVVPIRTIHNISLNPNFGGEVIVIGLGCEKLQPERLLTGTDDVQAIPVESASI VSLQDEKHVGFQSMVEDILQVAERHLQKLNQRQRETCPASELVVGMQCGGSDAFSGVTAN PAVGYASDLLVRCGATVMFSEVTEVRDAIHLLTPRAVNEEVGKRLLEEMEWYDNYLNIGK TDRSANPSPGNKKGGLANVVEKALGSIAKSGKSAIVEVLSPGQRPTKRGLIYAATPASDF VCGTQQVASGITVQVFTTGRGTPYGLMAVPVIKMATRTELANRWFDLMDINAGTIATGEE TIEEVGWKLFHFILDVASGKKKTFSDQWGLHNQLAVFNPAPVT >gi|296493203|gb|ADTK01000298.1| GENE 13 11598 - 12932 1643 444 aa, chain + ## HITS:1 COG:ECs4005 KEGG:ns NR:ns ## COG: ECs4005 COG0477 # Protein_GI_number: 15833259 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 444 1 444 444 817 99.0 0 MILDTVDVKKKGVHTRYLILLIIFIVTAVNYADRATLSIAGTEVAKELQLSAVSMGYIFS AFGWAYLLMQIPGGWLLDKFGSKKVYTYSLFFWSLFTFLQGFVDMFPLAWAGISMFFMRF MLGFSEAPSFPANARIVAAWFPTKERGTASAIFNSAQYFSLALFSPLLGWLTFAWGWEHV FTVMGVIGFVLTALWIKLIHNPTDHPRMSAEELKFISENGAVVDMDHKKPGSAAASGPKL HYIKQLLSNRMMFGVFFGQYFINTITWFFLTWFPIYLVQEKGMSILKVGLVASIPALCGF AGGVLGGVFSDYLIKRGLSLTLARKLPIVLGMLLASTIILCNYTNNTTLVVMLMALAFFG KGFGALGWSVISDTAPKEIVGLCGGVFNVFGNVASIVTPLVIGYLVSELHSFNAALVFVG CSALMAMVCYLFVVGDIKRMELQK >gi|296493203|gb|ADTK01000298.1| GENE 14 12948 - 13718 783 256 aa, chain + ## HITS:1 COG:ECs4004 KEGG:ns NR:ns ## COG: ECs4004 COG3836 # Protein_GI_number: 15833258 # Func_class: G Carbohydrate transport and metabolism # Function: 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase # Organism: Escherichia coli O157:H7 # 1 256 1 256 256 503 99.0 1e-142 MNNDVFPNKFKAALAAKQVQIGCWSALSNPISTEVLGLAGFDWLVLDGEHAPNDISTFIP QLMALKGSASAPVVRVPTNEPVIIKRLLDIGFYNFLIPFVETKEEAEQAVASTRYPPEGI 
RGVSVSHRANMFGTVADYFAQSNKNITILVQIESQQGVDNIDAIAATEGVDGIFVGPSDL AAALGHLGNASHPDVQKAIQHIFNRASAHGKPSGILAPIEADARRYLEWGATFVAVGSDL GVFRSATQKLADTFKK >gi|296493203|gb|ADTK01000298.1| GENE 15 13748 - 14638 1163 296 aa, chain + ## HITS:1 COG:ECs4003 KEGG:ns NR:ns ## COG: ECs4003 COG2084 # Protein_GI_number: 15833257 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases # Organism: Escherichia coli O157:H7 # 1 296 4 299 299 512 99.0 1e-145 MTMKVGFIGLGIMGKPMSKNLLKAGYSLVVADRNPKAIADVIAAGAETASTAKAIAEQCD VIITMLPNSPHVKEVALGENGIIEGAKPGTVLIDMSSIAPLASREISEALKAKGIDMLDA PVSGGEPKAIDGTLSVMVGGDKAIFDKYYDLMKAMAGSVVHTGEIGAGNVTKLANQVIVA LNIAAMSEALTLATKAGVNPDLVYQAIRGGLAGSTVLDAKAPMVMDRNFKPGFRIDLHIK DLANALDTSHGVGAQLPLTAAVMEMMQALRADGLGTADHSALACYYEKLAKVEVTR >gi|296493203|gb|ADTK01000298.1| GENE 16 14735 - 15880 1069 381 aa, chain + ## HITS:1 COG:ECs4002 KEGG:ns NR:ns ## COG: ECs4002 COG1929 # Protein_GI_number: 15833256 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerate kinase # Organism: Escherichia coli O157:H7 # 1 381 28 408 408 674 99.0 0 MKIVIAPDSYKESLSASEVAQAIEKGFREIFPDAQYVSIPVADGGEGTVEAMIAATQGSE RHAWVTGPLGEKVNASWGISGDGKTAFIEMAAASGLELVPAEKRDPLVTTSRGTGELILQ ALESGATNIIIGIGGSATNDGGAGMVQALGAKLCDANGNEIGFGGGSLNTLNDIDISGLD PRLKDCVIRVACDVTNPLVGDNGASRIFGPQKGASEAMIVELDNNLSHYAEVIKKALHVD VKDVPGAGAAGGMGAALMAFLGAELKSGIEIVTTALNLEEHIHDCTLVITGEGRIDSQSI HGKVPIGVANVAKKYHKPVIGIAGSLTDDVGVVHQHGIDAVFSVLTSIGTLDEAFRGAYD NICRASRNIAATLAIGMRNAG Prediction of potential genes in microbial genomes Time: Mon May 16 15:55:07 2011 Seq name: gi|296493202|gb|ADTK01000299.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont923.2, whole genome shotgun sequence Length of sequence - 35685 bp Number of predicted genes - 34, with homology - 34 Number of transcription units - 17, operones - 10 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 387 - 1490 356 ## ECB_02988 hypothetical protein 2 1 Op 2 . - CDS 1512 - 2051 40 ## EcE24377A_3595 hypothetical protein - Prom 2087 - 2146 3.5 + Prom 2637 - 2696 8.0 3 2 Op 1 4/0.667 + CDS 2840 - 3778 841 ## COG0583 Transcriptional regulator + Prom 3787 - 3846 4.9 4 2 Op 2 4/0.667 + CDS 3877 - 4866 952 ## COG1171 Threonine dehydratase 5 2 Op 3 3/1.000 + CDS 4888 - 6219 1353 ## COG0814 Amino acid permeases 6 2 Op 4 3/1.000 + CDS 6245 - 7453 1018 ## COG0282 Acetate kinase 7 2 Op 5 1/1.000 + CDS 7487 - 9781 2562 ## COG1882 Pyruvate-formate lyase 8 2 Op 6 1/1.000 + CDS 9795 - 10184 421 ## COG0251 Putative translation initiation inhibitor, yjgF family 9 2 Op 7 10/0.333 + CDS 10256 - 11620 1220 ## COG1760 L-serine deaminase + Term 11645 - 11674 2.1 + Prom 11650 - 11709 4.8 10 3 Op 1 3/1.000 + CDS 11895 - 13226 1307 ## COG0814 Amino acid permeases 11 3 Op 2 . + CDS 13254 - 14564 1188 ## COG3681 Uncharacterized conserved protein + Term 14658 - 14715 3.2 12 4 Op 1 . - CDS 14776 - 14940 179 ## G2583_3831 hypothetical protein 13 4 Op 2 . - CDS 14963 - 15664 675 ## COG1741 Pirin-related protein - Prom 15722 - 15781 1.7 14 5 Tu 1 . + CDS 15769 - 16665 924 ## COG0583 Transcriptional regulator + Term 16688 - 16719 3.2 - Term 16675 - 16705 3.0 15 6 Tu 1 6/0.333 - CDS 16716 - 17072 426 ## COG3152 Predicted membrane protein - Prom 17221 - 17280 6.4 - Term 17266 - 17300 3.5 16 7 Tu 1 4/0.667 - CDS 17314 - 17679 334 ## COG3152 Predicted membrane protein - Prom 17720 - 17779 5.2 - Term 17809 - 17845 2.4 17 8 Op 1 3/1.000 - CDS 17972 - 18958 1013 ## COG0435 Predicted glutathione S-transferase - Term 18971 - 19000 2.1 18 8 Op 2 . - CDS 19028 - 19420 438 ## COG2259 Predicted membrane protein - Prom 19515 - 19574 4.4 - Term 19566 - 19596 3.4 19 9 Op 1 . - CDS 19606 - 19905 408 ## ECSP_4074 hypothetical protein 20 9 Op 2 7/0.333 - CDS 19895 - 20299 410 ## COG5393 Predicted membrane protein 21 9 Op 3 . - CDS 20302 - 20607 450 ## COG4575 Uncharacterized conserved protein 22 9 Op 4 . 
- CDS 20645 - 21013 444 ## SSON_3256 hypothetical protein - Prom 21094 - 21153 4.1 - Term 21058 - 21096 6.4 23 10 Op 1 . - CDS 21160 - 21528 276 ## JW3067 conserved hypothetical protein 24 10 Op 2 . - CDS 21547 - 22209 497 ## COG0586 Uncharacterized membrane-associated protein - Prom 22378 - 22437 6.0 - Term 22507 - 22542 4.1 25 11 Tu 1 . - CDS 22554 - 23330 965 ## COG2186 Transcriptional regulators - Prom 23523 - 23582 4.8 26 12 Op 1 . + CDS 23719 - 24342 375 ## EC55989_3511 putative fimbrial protein 27 12 Op 2 . + CDS 24372 - 24872 531 ## ECO103_3841 putative fimbrial subunit + Term 24902 - 24932 3.0 28 13 Op 1 . + CDS 24946 - 27648 1897 ## EC55989_3509 conserved hypothetical protein; putative exported protein 29 13 Op 2 . + CDS 27645 - 28733 544 ## ECO103_3839 putative minor pilin and initiator protein - Term 28758 - 28801 8.2 30 14 Tu 1 . - CDS 28821 - 30119 1722 ## COG0477 Permeases of the major facilitator superfamily - Prom 30357 - 30416 3.0 + Prom 30436 - 30495 3.3 31 15 Op 1 3/1.000 + CDS 30602 - 32014 1674 ## COG1904 Glucuronate isomerase 32 15 Op 2 . + CDS 32029 - 33516 1526 ## COG2721 Altronate dehydratase + Term 33543 - 33579 6.4 33 16 Tu 1 . + CDS 33599 - 34150 274 ## EC55989_3504 conserved hypothetical protein; putative inner membrane protein - Term 34110 - 34146 6.2 34 17 Tu 1 . 
- CDS 34155 - 35399 1356 ## COG3633 Na+/serine symporter - Prom 35523 - 35582 6.5 Predicted protein(s) >gi|296493202|gb|ADTK01000299.1| GENE 1 387 - 1490 356 367 aa, chain - ## HITS:1 COG:no KEGG:ECB_02988 NR:ns ## KEGG: ECB_02988 # Name: yhaC # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 367 1 367 367 648 98.0 0 MFPVSSIGNDISSDLVRRKMNDLPESPTGNNLEALAPGIEKLKQTSIEMVTLLNTLQPGG KCIITGDFQKELAYLQNVILYNVSSLRLDFFGFKAQIIQRSDNTCELTINEPLKNQEIST GNININCPLKDIYNEIRRLNVIFSCGTGDIVDLSSLDLCNVDLDYYDFTDKHMANTILNP FKLNSTNFTNANMFQVNFVSSTQNATISWDYLLKITPVLISISDMYSEEKIKFVESCLNE PGDITEEQLKIMRFAIIKSIPRATLTDKLENELTKEIYKSSSKINNCLNRIKLPEMIDFS SEKIHDYIDIIIEDYENIKENAYLVIPQINYTMDLNIEDSSSEELLSDNGIEKDDNSSDN YFEVEKI >gi|296493202|gb|ADTK01000299.1| GENE 2 1512 - 2051 40 179 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_3595 NR:ns ## KEGG: EcE24377A_3595 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 179 8 186 186 363 100.0 1e-99 MKGFPIAHIFHPSIPPMHAVVNNHNRNIDYWTVKRKFAEIVSTNDVNKIYSISNELRRVL SAITALNFYHGDVPSVMIRIQPENMSPFIIDISTGEHDDYIIQTLDVGTFAPFGEQCTCS AVNKKELECIKETISKYCAKFTRKEAILTPLVHFNKTSITSDCWQILFFSPDHFNNDFY >gi|296493202|gb|ADTK01000299.1| GENE 3 2840 - 3778 841 312 aa, chain + ## HITS:1 COG:ECs3998 KEGG:ns NR:ns ## COG: ECs3998 COG0583 # Protein_GI_number: 15833252 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 312 1 312 312 593 100.0 1e-169 MSTILLPKTQHLVVFQEVIRSGSIGSAAKELGLTQPAVSKIINDIEDYFGVELVVRKNTG VTLTPAGQLLLSRSESITREMKNMVNEISGMSSEAVVEVSFGFPSLIGFTFMSGMINKFK EVFPKAQVSMYEAQLSSFLPAIRDGRLDFAIGTLSAEMKLQDLHVEPLFESEFVLVASKS RTCTGTTTLESLKNEQWVLPQTNMGYYSELLTTLQRNGISIENIVKTDSVVTIYNLVLNA DFLTVIPCDMTSPFGSNQFITIPVEETLPVAQYAAVWSKNYRIKKAASVLVELAKEYSSY NGCRRRQLIEVG >gi|296493202|gb|ADTK01000299.1| GENE 4 3877 - 4866 952 329 aa, chain + ## HITS:1 COG:ECs3997 KEGG:ns NR:ns ## COG: ECs3997 COG1171 # Protein_GI_number: 15833251 # Func_class: E Amino acid transport 
and metabolism # Function: Threonine dehydratase # Organism: Escherichia coli O157:H7 # 1 329 1 329 329 590 100.0 1e-169 MHITYDLPVAIDDIIEAKQRLAGRIYKTGMPRSNYFSERCKGEIFLKFENMQRTGSFKIR GAFNKLSSLTDAEKRKGVVACSAGNHAQGVSLSCAMLGIDGKVVMPKGAPKSKVAATCDY SAEVVLHGDNFNDTIAKVSEIVEMEGRIFIPPYDDPKVIAGQGTIGLEIMEDLYDVDNVI VPIGGGGLIAGIAVAIKSINPTIRVIGVQSENVHGMAASFHSGEITTHRTTGTLADGCDV SRPGNLTYEIVRELVDDIVLVSEDEIRNSMIALIQRNKVVTEGAGALACAALLSGKLDQY IQNRKTVSIISGGNIDLSRVSQITGFVDA >gi|296493202|gb|ADTK01000299.1| GENE 5 4888 - 6219 1353 443 aa, chain + ## HITS:1 COG:ECs3996 KEGG:ns NR:ns ## COG: ECs3996 COG0814 # Protein_GI_number: 15833250 # Func_class: E Amino acid transport and metabolism # Function: Amino acid permeases # Organism: Escherichia coli O157:H7 # 1 443 1 443 443 795 100.0 0 MSTSDSIVSSQTKQSSWRKSDTTWTLGLFGTAIGAGVLFFPIRAGFGGLIPILLMLVLAY PIAFYCHRALARLCLSGSNPSGNITETVEEHFGKTGGVVITFLYFFAICPLLWIYGVTIT NTFMTFWENQLGFAPLNRGFVALFLLLLMAFVIWFGKDLMVKVMSYLVWPFIASLVLISL SLIPYWNSAVIDQVDLGSLSLTGHDGILITVWLGISIMVFSFNFSPIVSSFVVSKREEYE KDFGRDFTERKCSQIISRASMLMVAVVMFFAFSCLFTLSPANMAEAKAQNIPVLSYLANH FASMTGTKTTFAITLEYAASIIALVAIFKSFFGHYLGTLEGLNGLVLKFGYKGDKTKVSL GKLNTISMIFIMGSTWVVAYANPNILDLIEAMGAPIIASLLCLLPMYAIRKAPSLAKYRG RLDNVFVTVIGLLTILNIVYKLF >gi|296493202|gb|ADTK01000299.1| GENE 6 6245 - 7453 1018 402 aa, chain + ## HITS:1 COG:ECs3995 KEGG:ns NR:ns ## COG: ECs3995 COG0282 # Protein_GI_number: 15833249 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Escherichia coli O157:H7 # 1 402 5 406 406 783 99.0 0 MNEFPVVLVINCGSSSIKFSVLDASDCEVLMSGIADGINSENAFLSVNGGEPAPLAHHSY EGALKAIAFELEKRNLNDSVALIGHRIAHGGSIFTESAIITDEVIDNIRRVSPLAPLHNY ANLSGIESAQQLFPGVTQVAVFDTSFHQTMAPEAYLYGLPWKYYEELGVRRYGFHGTSHR YVSQRAHSLLNLAEDDSGLVVAHLGNGASICAVRNGQSVDTSMGMTPLEGLMMGTRSGDV DFGAMSWVASQTNQSLGDLERVVNKESGLLGISGLSSDLRVLEKAWHEGHERAQLAIKTF VHRIARHIAGHAASLRRLDGIIFTGGIGENSSLIRRLVMEHLAVLGVVIDTEMNNRSNSF GERIVSSENARVICAVIPTNEEKMIALDAIHLGKVNAPAEFA >gi|296493202|gb|ADTK01000299.1| GENE 7 7487 - 9781 2562 764 
aa, chain + ## HITS:1 COG:ECs3994 KEGG:ns NR:ns ## COG: ECs3994 COG1882 # Protein_GI_number: 15833248 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli O157:H7 # 1 764 1 764 764 1582 99.0 0 MKVDIDTSDKLYADAWLGFKGTDWKNEINVRDFIQHNYTPYEGDESFLAEATPATTELWE KVMEGIRIENATHAPVDFDTNIATTITAHDAGYINQPLEKIVGLQTDAPLKRALHPFGGI NMIKSSFHAYGREMDSEFEYLFTDLRKTHNQGVFDVYSPDMLRCRKSGVLTGLPDGYGRG RIIGDYRRVALYGISYLVRERELQFADLQSRLEKGEDLEATIRLREELAEHRHALLQIQE MAAKYGFDISRPAQNAQEAVQWLYFAYLAAVKSQNGGAMSLGRTASFLDIYIERDFKAGV LNEQQAQELIDHFIMKIRMVRFLRTPEFDSLFSGDPIWATEVIGGMGLDGRTLVTKNSFR YLHTLHTMGPAPEPNLTILWSEELPIAFKKYAAQVSIVTSSLQYENDDLMRTDFNSDDYA IACCVSPMVIGKQMQFFGARANLAKTLLYAINGGVDEKLKIQVGPKTAPLMDDVLDYDKV MDSLDHFMDWLAVQYISALNIIHYMHDKYSYEASLMALHDRDVYRTMACGIAGLSVATDS LSAIKYARVKPIRDENGLAVDFEIDGEYPQYGNNDERVDSIACDLVERFMKKIKALPTYR NAVPTQSILTITSNVVYGQKTGNTPDGRRAGTPFAPGANPMHGRDRKGAVASLTSVAKLP FTYAKDGISYTFSIVPAALGKEDPVRKTNLVGLLDGYFHHEADVEGGQHLNVNVMNREML LDAIEHPEKYPNLTIRVSGYAVRFNALTREQQQDVISRTFTQAL >gi|296493202|gb|ADTK01000299.1| GENE 8 9795 - 10184 421 129 aa, chain + ## HITS:1 COG:tdcF KEGG:ns NR:ns ## COG: tdcF COG0251 # Protein_GI_number: 16131006 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Escherichia coli K12 # 1 129 22 150 150 249 100.0 1e-66 MKKIIETQRAPGAIGPYVQGVDLGSMVFTSGQIPVCPQTGEIPADVQDQARLSLENVKAI VVAAGLSVGDIIKMTVFITDLNDFATINEVYKQFFDEHQATYPTRSCVQVARLPKDVKLE IEAIAVRSA >gi|296493202|gb|ADTK01000299.1| GENE 9 10256 - 11620 1220 454 aa, chain + ## HITS:1 COG:Z4464 KEGG:ns NR:ns ## COG: Z4464 COG1760 # Protein_GI_number: 15803651 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Escherichia coli O157:H7 EDL933 # 1 454 3 456 456 899 99.0 0 MISAFDIFKIGIGPSSSHTVGPMNAGKSFIDRLESSGLLTATSHIVVDLYGSLSLTGKGH ATDVAIIMGLAGNSPQDVVIDEIPAFIELVTRSGRLPVASGAHIVDFPVAKNIIFHPEML 
PRHENGMRITAWKGQEALLSKTYYSVGGGFIVEEEHFGLSHDVETSVPYDFHSAGELLKM CDYNGLSISGLMMHNELALRSKAEIDAGFARIWQVMHDGIERGMNTEGVLPGPLNVPRRA VALRRQLVSSDNISNDPMNVIDWINMYALAVSEENAAGGRVVTAPTNGACGIIPAVLAYY DKFRRPVNERSIARYFLAAGAIGALYKMNASISGAEVGCQGEIGVACSMAAAGLTELLGG SPAQVCNAAEIAMEHNLGLTCDPVAGQVQIPCIERNAINAVKAVNAARMAMRRTSAPRVS LDKVIETMYETGKDMNDKYRETSRGGLAIKVVCG >gi|296493202|gb|ADTK01000299.1| GENE 10 11895 - 13226 1307 443 aa, chain + ## HITS:1 COG:ECs3991 KEGG:ns NR:ns ## COG: ECs3991 COG0814 # Protein_GI_number: 15833245 # Func_class: E Amino acid transport and metabolism # Function: Amino acid permeases # Organism: Escherichia coli O157:H7 # 1 443 1 443 443 788 99.0 0 MEIASNKGVIADASTPAGRAGMSESEWREAIKFDSTDTGWVIMSIGMAIGAGIVFLPVQV GLMGLWVFLLSSVIGYPAMYLFQRLFINTLAESPECKDYPSVISGYLGKNWGILLGALYF VMLVIWMFVYSTAITNDSASYLHTFGVTEGLLSDSPFYGLVLICILVAISSRGEKLLFKI STGMVLTKLLVVAALGVSMVGMWHLYNVGSLPPLGLLVKNAIITLPFTLTSILFIQTLSP MVISYRSREKSIEVARHKALRAMNIAFGILFVTVFFYAVSFTLAMGHDEAVKAYEQNISA LAIAAQFISGDGAAWVKVVSVILNIFAVMTAFFGVYLGFREATQGIVMNILRRKMPAEKI NENLVQRGIMIFAILLAWSAIVLNAPVLSFTSICSPIFGMVGCLIPAWLVYKVPALHKYK GMSLYLIIVTGLLLCVSPFLAFS >gi|296493202|gb|ADTK01000299.1| GENE 11 13254 - 14564 1188 436 aa, chain + ## HITS:1 COG:yhaN+M KEGG:ns NR:ns ## COG: yhaN+M COG3681 # Protein_GI_number: 16132252 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 436 1 436 436 749 99.0 0 MFDSTLNPLWQRYILAVQEEVKPALGCTEPISLALAAAVAAAELEGPVERVEAWVSPNLM KNGLGVTVPGTGMVGLPIAAALGALGGNANAGLEVLKDATAQAIADAKALLAAGKVSVKI QEPCNEILFSRAKVWNGEKWACVTIVGGHTNIVHIETHNGVVFTQQACVAEGEQESPLTV LSRTTLAEILKFVNEVPFAAIRFILDSAKLNCALSQEGLSGKWGLHIGATLEKQCERGLL AKDLSSSIVIRTSAASDARMGGATLPAMSNSGSGNQGITATMPVVVVAEHFGADDERLAR ALMLSHLSAIYIHNQLPRLSALCAATTAAMGAAAGMAWLVDGRYETISMAISSMIGDVSG MICDGASNSCAMKVSTSASAAWKAVLMALDDTAVTGNEGIVAHDVEQSIANLCALASHSM QQTDRQIIEIMASKAR >gi|296493202|gb|ADTK01000299.1| GENE 12 14776 - 14940 179 54 aa, chain - ## HITS:1 COG:no KEGG:G2583_3831 NR:ns ## KEGG: G2583_3831 # Name: 
yhaL # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 54 3 56 56 62 100.0 5e-09 MSKKSAKKRQPVKPVVAKEPARTAKNFGYEEMLSELEAIVADAETRLAEDEATA >gi|296493202|gb|ADTK01000299.1| GENE 13 14963 - 15664 675 233 aa, chain - ## HITS:1 COG:yhaK KEGG:ns NR:ns ## COG: yhaK COG1741 # Protein_GI_number: 16131001 # Func_class: R General function prediction only # Function: Pirin-related protein # Organism: Escherichia coli K12 # 1 233 1 233 233 473 99.0 1e-133 MITTRTARQCGQADYGWLQARYTFSFGHYFDPKLLGYASLRVLNQEVLAPGAAFQPRTYP KVDILNVILDGEAEYRDSEGNHVQASAGEALLLSTQPGVSYSEHNLSKDKPLTRMQLWLD ACPQRENPLIQKLALNMGKQQLIASPEGTMGSLQLRQQVWLHHIVLDKGESANFQLHGPR AYLQSIHGKFHALTHHEEKAALTCGDGAFIRDEANITLVADSPLRALLIDLPV >gi|296493202|gb|ADTK01000299.1| GENE 14 15769 - 16665 924 298 aa, chain + ## HITS:1 COG:ECs3987 KEGG:ns NR:ns ## COG: ECs3987 COG0583 # Protein_GI_number: 15833241 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 298 1 298 298 550 99.0 1e-156 MAKERALTLEALRVMDAIDRRGSFAAAADELGRVPSALSYTMQKLEEELDVVLFDRSGHR TKFTNVGRMLLERGRVLLEAADKLTTDAEALARGWETHLTIVTEALVPTPAFFPLIDKLA AKANTQLAIITEVLAGAWERLEQGRADIVIAPDMHFRSSSEINSRKLYTLMNVYVAAPDH PIHQEPEPLSEVTRVKYRGIAVADTARERPVLTVQLLDKQPRLTVSTIEDKRQALLAGLG VATMPYPMVEKDIAEGRLRVVSPESTSEIDIIMAWRRDSMGEAKSWCLREIPKLFSGK >gi|296493202|gb|ADTK01000299.1| GENE 15 16716 - 17072 426 118 aa, chain - ## HITS:1 COG:ECs3986 KEGG:ns NR:ns ## COG: ECs3986 COG3152 # Protein_GI_number: 15833240 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 118 1 118 118 195 83.0 1e-50 MQWYLAVLKNYVGFSGRARRKEYWMFTLINAIVGAIINVIQLILGLEFPFLSLIYLAATI IPVIALCVRRLHDTDRSGAWALLYLVPIIGWLFLFVFACLEGNSGSNRYGNDPKFGSN >gi|296493202|gb|ADTK01000299.1| GENE 16 17314 - 17679 334 121 aa, chain - ## HITS:1 COG:ECs3985 KEGG:ns NR:ns ## COG: ECs3985 COG3152 # Protein_GI_number: 15833239 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: 
Escherichia coli O157:H7 # 1 121 1 121 121 223 100.0 5e-59 MDWYLKVLKNYVGFRGRARRKEYWMFILVNIIFTFVLGLLDKMLGWQRAGGEGILTTIYG ILVFLPWWAVQFRRLHDTDRSAWWALLFLIPFIGWLIIIVFNCQAGTPGENRFGPDPKLE P >gi|296493202|gb|ADTK01000299.1| GENE 17 17972 - 18958 1013 328 aa, chain - ## HITS:1 COG:yqjG KEGG:ns NR:ns ## COG: yqjG COG0435 # Protein_GI_number: 16130997 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted glutathione S-transferase # Organism: Escherichia coli K12 # 1 328 1 328 328 660 99.0 0 MGQLIDGVWHDTWYDTKSTGGKFQRSASAFRNWLTADGAPGPTGTGGFIAEKDRYHLYVS LACPWAHRTLIMRKLKGLEPFISVSVVNPLMLENGWTFDDSFPGATGDTLYQNEFLYQLY LHADPHYSGRVTVPVLWDKKNHTIVSNESAEIIRMFNTAFDTLGAKAGDYYPPALQTKID ELNGWIYDTVNNGVYKAGFATSQEAYDEAVAKVFESLARLEQILGQHRYLTGNQLTEADI RLWTTLVRFDPVYVTHFKCDKHRISDYLNLYGFLRDIYQMPGIAETVNFDHIRNHYFRSH KTINPTGIISIGPWQDLDEPHGRDVRFG >gi|296493202|gb|ADTK01000299.1| GENE 18 19028 - 19420 438 130 aa, chain - ## HITS:1 COG:ECs3983 KEGG:ns NR:ns ## COG: ECs3983 COG2259 # Protein_GI_number: 15833237 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 130 31 160 160 220 98.0 6e-58 MKKLEDVGVLVARILMPILFITAGWGKITAYSGTQQYMEAMGVPGFMLPLVILLEFGGGL AILFGFLTRTTALFTAGFTLLTAFLFHSNFAEGVNSLMFMKNLTISGGFLLLAITGPGAY SIDRLLNKKW >gi|296493202|gb|ADTK01000299.1| GENE 19 19606 - 19905 408 99 aa, chain - ## HITS:1 COG:no KEGG:ECSP_4074 NR:ns ## KEGG: ECSP_4074 # Name: yqjK # Def: hypothetical protein # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 99 1 99 99 140 100.0 1e-32 MSSKVERERRKAQLLSQIQQQRLDLSASRREWLEATGAYDRRWNMLLSLRSWALVGSSVM AIWTIRHPNMLVRWARRGFGVWSAWRLVKTTLKQQQLRG >gi|296493202|gb|ADTK01000299.1| GENE 20 19895 - 20299 410 134 aa, chain - ## HITS:1 COG:STM3230 KEGG:ns NR:ns ## COG: STM3230 COG5393 # Protein_GI_number: 16766529 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Salmonella typhimurium LT2 # 1 130 1 130 132 198 89.0 2e-51 
MADTHHAQGPGKSVLGIGQRIVSIMVEMVETRLRLAVVELEEEKANLFQLLLMLGLTMLF AAFGLMSLMVLIIWAVDPQYRLNAMIATTVVLLLLALIGGIWTLRKSRKSTLLRHTRHEL ANDRQLLEEESREQ >gi|296493202|gb|ADTK01000299.1| GENE 21 20302 - 20607 450 101 aa, chain - ## HITS:1 COG:ECs3980 KEGG:ns NR:ns ## COG: ECs3980 COG4575 # Protein_GI_number: 15833234 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 101 1 101 101 150 100.0 4e-37 MSKEHTTEHLRAELKSLSDTLEEVLSSSGEKSKEELSKIRSKAEQALKQSRYRLGETGDA IAKQTRVAAARADEYVRENPWTGVGIGAAIGVVLGVLLSRR >gi|296493202|gb|ADTK01000299.1| GENE 22 20645 - 21013 444 122 aa, chain - ## HITS:1 COG:no KEGG:SSON_3256 NR:ns ## KEGG: SSON_3256 # Name: yqjC # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 122 6 127 127 154 100.0 8e-37 MKYRIALAVSLFALSAGSYATTLCQEKEQNILKEISYAEKHQNQNRIDGLNKALSEVRAN CSDSQLRADHHKKIAKQKDEVAERQQDLAEAKQKGDADKIAKRERKLAEAQEELKKLEAR DY >gi|296493202|gb|ADTK01000299.1| GENE 23 21160 - 21528 276 122 aa, chain - ## HITS:1 COG:no KEGG:JW3067 NR:ns ## KEGG: JW3067 # Name: yqjB # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 122 6 127 127 208 99.0 6e-53 MSLRQLAWSGTVLLLVGTLLLAWSAVRQQESTLAIRAVHQGTTMPDGFSIWHHLDAHGIP FKSITPKNDTLLITFDSSDQSAAAKAVLDRTLPHGYIIAQQDNNSQAMQWLTRLRDNSHR FG >gi|296493202|gb|ADTK01000299.1| GENE 24 21547 - 22209 497 220 aa, chain - ## HITS:1 COG:STM3226 KEGG:ns NR:ns ## COG: STM3226 COG0586 # Protein_GI_number: 16766525 # Func_class: S Function unknown # Function: Uncharacterized membrane-associated protein # Organism: Salmonella typhimurium LT2 # 1 220 1 220 220 372 95.0 1e-103 MELLTQLLQALWAQDFETLANPSMIGMLYFVLFVILFLENGLLPAAFLPGDSLLVLVGVL IAKGAMGYPQTILLLTVAASLGCWVSYIQGRWLGNTRTVQNWLSHLPAHYHQRAHHLFHK HGLSALLIGRFIAFVRTLLPTIAGLSGLNNARFQFFNWMSGLLWVLILTTLGYMLGKTPV FLKYEDQLMSCLMLLPVVLLVFGLAGSLVVLWKKKYGNRG >gi|296493202|gb|ADTK01000299.1| GENE 25 22554 - 23330 965 258 aa, chain - ## HITS:1 COG:ECs3976 KEGG:ns NR:ns ## COG: ECs3976 COG2186 # 
Protein_GI_number: 15833230 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 258 6 263 263 498 100.0 1e-141 MEITEPRRLYQQLAADLKERIEQGVYLVGDKLPAERFIADEKNVSRTVVREAIIMLEVEG YVEVRKGSGIHVVSNQPRHQQAADNNMEFANYGPFELLQARQLIESNIAEFAATQVTKQD IMKLMAIQEQARGEQCFRDSEWDLQFHIQVALATQNSALAAIVEKMWTQRSHNPYWKKLH EHIDSRTVDNWCDDHDQILKALIRKDPHAAKLAMWQHLENTKIMLFNETSDDFEFNADRY LFAENPVVHLDTATSGSK >gi|296493202|gb|ADTK01000299.1| GENE 26 23719 - 24342 375 207 aa, chain + ## HITS:1 COG:no KEGG:EC55989_3511 NR:ns ## KEGG: EC55989_3511 # Name: not_defined # Def: putative fimbrial protein # Organism: E.coli_55989 # Pathway: not_defined # 1 207 32 238 238 384 100.0 1e-105 MAVSINSQGEGNVRVISKSNEVQYIKATVFRIDNPSTPQENEVEIKSGDANHLVVMPPKF ALPAGSSKTVRFVAMEPEQKEKNYRVKFEAVPSIDDVATDKKDLSMQLTVNLIWGIVVSV PPQQPIAKLEVNAAQKLVNAGNQRLKILTIAYCKNNSKENCKIQTVNKNIFPGQERNLES ISGYDKIVVKYNNWITKDNGEFELAVH >gi|296493202|gb|ADTK01000299.1| GENE 27 24372 - 24872 531 166 aa, chain + ## HITS:1 COG:no KEGG:ECO103_3841 NR:ns ## KEGG: ECO103_3841 # Name: not_defined # Def: putative fimbrial subunit # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 166 1 166 166 259 100.0 3e-68 MKKVFAKSLLVAAMFSVAGSALAVQKDITVTANVDAALDMTQTDNTALPKAVEMQYLPGQ GLQSYQLMTKIWSNDVTKDVKMQLVSPAQLVQSLDASKIVPLTVTWGGEEIKADAATTFT ATKIFASDALTNGSLAKNLMFAQTTKGVLETGIYRGVVSIYLSQDI >gi|296493202|gb|ADTK01000299.1| GENE 28 24946 - 27648 1897 900 aa, chain + ## HITS:1 COG:no KEGG:EC55989_3509 NR:ns ## KEGG: EC55989_3509 # Name: not_defined # Def: conserved hypothetical protein; putative exported protein # Organism: E.coli_55989 # Pathway: not_defined # 1 900 1 900 900 1686 100.0 0 MDFRIAMDKKLLALLILASLSPAEATLTKIPAGFEVIAQGQQEYIEVYFSGKSLGKYYAM VNLDTVTFLDPASLYNKLELDVDDQKIAHIVKEKLSQPLARHGELACGYVRTDSGCGFLN TDTLEIIYNDEESSATLFINPQWNSAFDAKSLYLNPDKNTVNAFIHQQDINVLAQDDYQS LSIQGNGALGITENSYIGAHWNFNGYDADDVSDSNADVSDLYYRYDFLRRYYVQAGRMDN RTLFNAQGGNFTFNFLPLGAIDGMRIGSTLSYLNQAQSQQGTPVMVLLSRNSRVDAYRNE 
QLLGSFYLNSGSQFIDTSSFPPGSYSVALKVYENNQLTRTELVPFTKTGGLTDGNAQWFL QAGKTTSQVSDDESSAYQLGVRLPLHPQYELYAGLANADDVSAFELGNNWTADLGGVGNL AISASVFRNDDGGKGDMQQANWSNPGWPTLGFYRTNSDGDACTTDSRESYNALSCYESIS ATVSQNFVGWNMMLGYTRTQNNTDDSLRWDKQQSFENNYLRQTTAQSISETVQLSASRAF VMRDWILSTSVGVFHRNDNGGDNDDNGLYLSFSLSDTPTMDSNNNSHSTNVSTDYRYSEQ DGDQTSWQLSHTFYNDSFSHKELGVTVGGLNTDTINSAVNGRWDGQYGNVYATVSDSYDR KNHDHLSAFTGTYSSTLAVSRYGVNLGASGTDDLLGAVLVDVKGFSEQDEESQDLQLEAR VAGSRTLQLGQSDSVLFPYPGFQSGFVEVNDSSQGNQQGTTNIINGAGNRELMLLPGKLR YREVSASFNYNYIGRLLLPAAVKKFPIVGLNSAMLLVAEDGGFTLEINGSEKELYLLSGQ QFLKCPLSVVKKRASIRYSGDVTCSVVTYSQLPESIQVQAQLKQPKLRGNVQTAQREVAP >gi|296493202|gb|ADTK01000299.1| GENE 29 27645 - 28733 544 362 aa, chain + ## HITS:1 COG:no KEGG:ECO103_3839 NR:ns ## KEGG: ECO103_3839 # Name: not_defined # Def: putative minor pilin and initiator protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 362 1 362 362 758 100.0 0 MRNRLIAAILGLFGTLTGVQAAPDVTSEITYDLASGRADYYFWKDEASAGNNGYMWYECS YPDLQQTCTANGNISTVQIYLTEQRSGMRWPVKLKGFKTAIVSSDEAPPGCKGGKGLQTN LKDSNRSSCTEDGQHYYIYDTKFLTLYLEQTEMKNLPIGGVWKGKVKLHSNSPAQDYFAN ITLNTLDPNHIDVFFPEFAHATPRVQLDLHPTGSVNGSNYAQDLTMLDMCLYDGFNGNAI SYEIMLKDEGRPAAGRRDGYFSIYRQGGTTTDEGERIDYRVKMYNPETGGQIDVRNNENM VWNSINLKRVRPVVLPGIRYAVMCVPTPLTLAVDKFSVMDKQAGYYMGKLSVIFTPSLPT IN >gi|296493202|gb|ADTK01000299.1| GENE 30 28821 - 30119 1722 432 aa, chain - ## HITS:1 COG:ECs3975 KEGG:ns NR:ns ## COG: ECs3975 COG0477 # Protein_GI_number: 15833229 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 432 41 472 472 788 99.0 0 MRKIKGLRWYMIALVTLGTVLGYLTRNTVAAAAPTLMEELNISTQQYSYIIAAYSAAYTV MQPVAGYVLDVLGTKIGYAMFAVLWAVFCGATALAGSWGGLAIARGAVGAAEAAMIPAGL KASSEWFPAKERSIAVGYFNVGSSIGAMIAPPLVVWAIVMHSWQMAFIISGALSFIWAMA WLIFYKHPRDQKHLTDEERDYIINGQEAQHQVSTAKKMSVGQILRNRQFWGIALPRFLAE 
PAWGTFNAWIPLFMFKVYGFNLKEIAMFAWMPMLFADLGCILGGYLPPLFQRWFGVNLIV SRKMVVTLGAVLMIGPGMIGLFTNPYVAIMLLCIGGFAHQALSGALITLSSDVFGRNEVA TANGLTGMSAWLASTLFALVVGALADTIGFSPLFAVLAVFDLLGALVIWTVLQNKPAIEV AQETHNDPAPQH >gi|296493202|gb|ADTK01000299.1| GENE 31 30602 - 32014 1674 470 aa, chain + ## HITS:1 COG:uxaC KEGG:ns NR:ns ## COG: uxaC COG1904 # Protein_GI_number: 16130987 # Func_class: G Carbohydrate transport and metabolism # Function: Glucuronate isomerase # Organism: Escherichia coli K12 # 1 470 1 470 470 991 99.0 0 MTPFMTEDFLLDTEFARRLYHDYAKDQPIFDYHCHLPPQQIAEDYRFKNLYDIWLKGDHY KWRAMRTNGVAERLCTGDASDREKFDAWAATVPHTIGNPLYHWTHLELRRPFGITGKLLS PSTADEIWNECNELLAQDNFSARGIMQQMNVKMVGTTDDPIDSLEHHAEIAKDGSFTIKV LPSWRPDKAFNIEQATFNDYMAKLGEVSDTDIRRFADLQTALTKRLDHFAAHGCKVSDHA LDVVMFAEANEAELDSILARRLAGEPLSEHEVAQFKTAVLVFLGAEYARRGWVQQYHIGA LRNNNLRQFKLLGPDVGFDSINDRPMAEELSKLLSKQNEENLLPKTILYCLNPRDNEVLG TMIGNFQGEGMPGKMQFGSGWWFNDQKDGMERQMTQLAQLGLLSRFVGMLTDSRSFLSYT RHEYFRRILCQMIGRWVEAGEAPADINLLGEMVKNICFNNARDYFAIELN >gi|296493202|gb|ADTK01000299.1| GENE 32 32029 - 33516 1526 495 aa, chain + ## HITS:1 COG:uxaA KEGG:ns NR:ns ## COG: uxaA COG2721 # Protein_GI_number: 16130986 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: Escherichia coli K12 # 1 495 1 495 495 1021 99.0 0 MQYIKIHALDNVAVALADLAEGTEVSVDNQTVTLRQDVARGHKFALTDIAKGANVIKYGL PIGYALADIAAGEHVHAHNTRTNLSDLDQYRYQPDFQDLPAQAADREVQIYRRANGDVGV RNELWILPTVGCVNGIARQIQNRFLKETNNAEGTDGVFLFSHTYGCSQLGDDHINTRTML QNMVRHPNAGAVLVIGLGCENNQVAAFRETLGDIDPERVHFMICQQQDDEIEAGIEHLHQ LYNVMRNDKREPGKLSELKFGLECGGSDGLSGITANPMLGRFSDYVIANGGTTVLTEVPE MFGAEQLLMDHCRDEATFEKLVTMVNDFKQYFIAHDQPIYENPSPGNKAGGITTLEDKSL GCTQKAGSSVVVDVLRYGERLKTPGLNLLSAPGNDAVATSALAGAGCHMVLFSTGRGTPY GGFVPTVKIATNSELAAKKKHWIDFDAGQLIHGKAMPQLLEEFIDTIVEFANGKQTCNER NDFRELAIFKSGVTL >gi|296493202|gb|ADTK01000299.1| GENE 33 33599 - 34150 274 183 aa, chain + ## HITS:1 COG:no KEGG:EC55989_3504 NR:ns ## KEGG: EC55989_3504 # Name: ygjV # Def: conserved hypothetical protein; putative 
inner membrane protein # Organism: E.coli_55989 # Pathway: not_defined # 1 183 1 183 183 350 99.0 1e-95 MTAYWLAQGVGVIAFLIGITTFFNRDERRFKKQLSVYSAVIGVHFFLLGTYPAGASAILN AIRTLITLRTRSLWVMAIFIVLTGGIGLAKFHHPVELLPVIGTIVSTWALFRCKGLTMRC VMWFSTCCWVIHNFWAGSIGGTMIEVSFLLMNGLNIIRFWRMQKRGIDPFKVEKPPPAVD ERG >gi|296493202|gb|ADTK01000299.1| GENE 34 34155 - 35399 1356 414 aa, chain - ## HITS:1 COG:ECs3971 KEGG:ns NR:ns ## COG: ECs3971 COG3633 # Protein_GI_number: 15833225 # Func_class: E Amino acid transport and metabolism # Function: Na+/serine symporter # Organism: Escherichia coli O157:H7 # 1 414 1 414 414 700 100.0 0 MTTQRSPGLFRRLAHGSLVKQILVGLVLGILLAWISKPAAEAVGLLGTLFVGALKAVAPI LVLMLVMASIANHQHGQKTNIRPILFLYLLGTFSAALAAVVFSFAFPSTLHLSSSAGDIS PPSGIVEVMRGLVMSMVSNPIDALLKGNYIGILVWAIGLGFALRHGNETTKNLVNDMSNA VTFMVKLVIRFAPIGIFGLVSSTLATTGFSTLWGYAQLLVVLVGCMLLVALVVNPLLVWW KIRRNPFPLVLLCLRESGVYAFFTRSSAANIPVNMALCEKLNLDRDTYSVSIPLGATINM AGAAITITVLTLAAVNTLGIPVDLPTALLLSVVASLCACGASGVAGGSLLLIPLACNMFG ISNDIAMQVVAVGFIIGVLQDSCETALNSSTDVLFTAAACQAEDDRLANSALRN Prediction of potential genes in microbial genomes Time: Mon May 16 15:55:59 2011 Seq name: gi|296493201|gb|ADTK01000300.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont923.3, whole genome shotgun sequence Length of sequence - 44329 bp Number of predicted genes - 37, with homology - 37 Number of transcription units - 23, operones - 8 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 74 - 1039 1192 ## COG0861 Membrane protein TerC, possibly involved in tellurium resistance - Prom 1248 - 1307 6.7 - Term 1244 - 1296 6.6 2 2 Op 1 3/0.786 - CDS 1322 - 2326 796 ## COG0673 Predicted dehydrogenases and related proteins 3 2 Op 2 3/0.786 - CDS 2387 - 3070 503 ## COG2949 Uncharacterized membrane protein 4 2 Op 3 . - CDS 3147 - 3650 471 ## COG1451 Predicted metal-dependent hydrolase - Prom 3706 - 3765 2.0 + Prom 3631 - 3690 4.6 5 3 Tu 1 . 
+ CDS 3735 - 4871 447 ## PROTEIN SUPPORTED gi|225082609|ref|YP_002654106.1| ribosomal protein L11 methyltransferase, putative + Prom 5019 - 5078 3.5 6 4 Tu 1 . + CDS 5142 - 5558 311 ## COG5499 Predicted transcription regulator containing HTH domain - Term 5547 - 5593 5.6 7 5 Tu 1 . - CDS 5603 - 7621 1757 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family - Prom 7662 - 7721 7.2 8 6 Op 1 . - CDS 7947 - 8420 404 ## LF82_3229 uncharacterized protein YgjK 9 6 Op 2 . - CDS 8359 - 10299 2134 ## EcolC_0620 putative glycosyl hydrolase 10 6 Op 3 . - CDS 10316 - 11386 1031 ## B21_02898 hypothetical protein - Prom 11454 - 11513 4.6 - Term 11440 - 11500 9.3 11 7 Op 1 3/0.786 - CDS 11520 - 12953 1361 ## COG0531 Amino acid transporters 12 7 Op 2 . - CDS 13016 - 13465 401 ## COG2731 Beta-galactosidase, beta subunit 13 7 Op 3 1/0.857 - CDS 13462 - 16554 2980 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 16674 - 16733 3.3 - Term 16679 - 16729 13.2 14 8 Tu 1 . - CDS 16738 - 17721 993 ## COG1609 Transcriptional regulators - Prom 17910 - 17969 3.9 + Prom 17860 - 17919 3.6 15 9 Tu 1 . + CDS 17940 - 18272 477 ## COG0073 EMAP domain + Term 18278 - 18318 8.1 - Term 18265 - 18306 9.1 16 10 Tu 1 . - CDS 18314 - 19603 1406 ## COG4992 Ornithine/acetylornithine aminotransferase 17 11 Tu 1 . + CDS 20111 - 21631 1466 ## COG0840 Methyl-accepting chemotaxis protein - Term 21653 - 21696 3.2 18 12 Tu 1 . - CDS 21785 - 22408 759 ## COG1695 Predicted transcriptional regulators - Prom 22615 - 22674 6.6 + Prom 22593 - 22652 6.3 19 13 Tu 1 . + CDS 22696 - 23460 747 ## COG2375 Siderophore-interacting protein + Term 23481 - 23514 5.4 - TRNA 23514 - 23589 93.9 # Met CAT 0 0 + Prom 23633 - 23692 6.7 20 14 Tu 1 . 
+ CDS 23714 - 24220 432 ## COG3663 G:T/U mismatch-specific DNA glycosylase + Term 24225 - 24255 2.6 - Term 24211 - 24243 3.0 21 15 Op 1 31/0.000 - CDS 24299 - 26140 2371 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) 22 15 Op 2 8/0.143 - CDS 26335 - 28080 1320 ## COG0358 DNA primase (bacterial type) - Prom 28103 - 28162 3.0 - Term 28104 - 28144 6.1 23 16 Tu 1 . - CDS 28191 - 28406 357 ## PROTEIN SUPPORTED gi|15803607|ref|NP_289640.1| 30S ribosomal protein S21 - Prom 28542 - 28601 6.3 + Prom 28561 - 28620 4.0 24 17 Tu 1 . + CDS 28644 - 29657 660 ## PROTEIN SUPPORTED gi|227425790|ref|ZP_03908856.1| SSU ribosomal protein S18P alanine acetyltransferase - Term 29489 - 29524 -0.7 25 18 Op 1 3/0.786 - CDS 29700 - 31163 1483 ## COG0471 Di- and tricarboxylate transporters - Term 31172 - 31203 1.7 26 18 Op 2 11/0.000 - CDS 31211 - 31840 248 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase 27 18 Op 3 . - CDS 31813 - 32721 262 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase - Prom 32805 - 32864 3.5 + Prom 32746 - 32805 3.9 28 19 Tu 1 . + CDS 32931 - 33863 904 ## COG0583 Transcriptional regulator 29 20 Tu 1 . - CDS 33876 - 34493 528 ## COG0344 Predicted membrane protein - Prom 34532 - 34591 2.7 + Prom 34482 - 34541 3.8 30 21 Op 1 4/0.429 + CDS 34598 - 34966 431 ## COG1539 Dihydroneopterin aldolase 31 21 Op 2 . + CDS 35056 - 35877 985 ## COG1968 Uncharacterized bacitracin resistance protein - Term 35947 - 35991 4.1 32 22 Op 1 7/0.143 - CDS 36058 - 37296 1173 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase 33 22 Op 2 . - CDS 37360 - 37980 666 ## COG3103 SH3 domain protein - Prom 38007 - 38066 2.2 34 23 Op 1 . + CDS 37999 - 38205 62 ## ECUMN_3537 hypothetical protein 35 23 Op 2 5/0.214 + CDS 38222 - 39523 1561 ## COG3025 Uncharacterized conserved protein 36 23 Op 3 5/0.214 + CDS 39546 - 42386 2970 ## COG1391 Glutamine synthetase adenylyltransferase 37 23 Op 4 . 
+ CDS 42434 - 43867 1744 ## COG2870 ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase + Term 44066 - 44097 3.2 Predicted protein(s) >gi|296493201|gb|ADTK01000300.1| GENE 1 74 - 1039 1192 321 aa, chain - ## HITS:1 COG:ECs3970 KEGG:ns NR:ns ## COG: ECs3970 COG0861 # Protein_GI_number: 15833224 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane protein TerC, possibly involved in tellurium resistance # Organism: Escherichia coli O157:H7 # 1 321 1 321 321 550 99.0 1e-156 MNTVGTPLLWGGFAVVVAIMLAIDLLLQGRRGAHAMTMKQAAAWSLVWVTLSLLFNAAFW WYLVQTEGRAVADPQALAFLTGYLIEKSLAVDNVFVWLMLFSYFSVPAALQRRVLVYGVL GAIVLRTIMIFTGSWLISQFDWILYIFGAFLLFTGVKMALAHEDESGIGDKPLVRWLRGH LRMTDTIDNEHFFVRKNGLLYATPLMLVLILVELSDVIFAVDSIPAIFAVTTDPFIVLTS NLFAILGLRAMYFLLAGVAERFSMLKYGLAVILVFIGIKMLIVDFYHIPIAVSLGVVFGI LVMTFIINAWVNYRHDKQRGG >gi|296493201|gb|ADTK01000300.1| GENE 2 1322 - 2326 796 334 aa, chain - ## HITS:1 COG:ygjR KEGG:ns NR:ns ## COG: ygjR COG0673 # Protein_GI_number: 16130982 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Escherichia coli K12 # 1 334 1 334 334 656 98.0 0 MEYPRVMIRFAVIGTNWITRQFVEAAHESGKYKLTAVYSRSLEQAQHFANDFSVEHLFTS LEAMAESDAIDAVYIASPNALHFSQTQLFLSHKTHVICEKPLASNLAEVDAAIACARENQ VVLFEAFKTACLPNFHLLRQALPKVGKLRKVFFNYCQYSSRYQRYLDGENPNTFNPSFSN GSIMDIGFYCLASAVALFGEPKSVQATASLLASGVDAHGVVVMDYGDFSVTLQHSKVSDS VLASEIQGEAGSLVIEKLSECQKVCFVPRGSQMQDLTQPQHINTMLYEAELFATLVDEHL VDHPGLAVSRITAKLLTEIRRQTGVIFPADSVKL >gi|296493201|gb|ADTK01000300.1| GENE 3 2387 - 3070 503 227 aa, chain - ## HITS:1 COG:ygjQ KEGG:ns NR:ns ## COG: ygjQ COG2949 # Protein_GI_number: 16130981 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Escherichia coli K12 # 4 227 7 230 230 456 97.0 1e-128 MQPRLLLRICFSRRTLKIGCLLLLIAGATIFIADRVMVNASKQLTWSDVNAVPARNVGLL LGARPGNRYFTRRIDTAAALYHAGKVKWLLVSGDNGRKNYDEASGMQQALIAKGVPAKVI FCDYAGFSTLDSVVRAKKVFGENHITIISQEFHNQRAIWLAKQYGIDAIGFNAPDLNMKH 
GFYTQLREKLARVSAVIDAKILHRQPKYLGPSVMIGPFSEHGCPAKE >gi|296493201|gb|ADTK01000300.1| GENE 4 3147 - 3650 471 167 aa, chain - ## HITS:1 COG:ECs3967 KEGG:ns NR:ns ## COG: ECs3967 COG1451 # Protein_GI_number: 15833221 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Escherichia coli O157:H7 # 1 167 13 179 179 335 100.0 2e-92 MSNLTYLQGYPEQLLSQVRTLINEQRLGDVLAKRYPGTHDYATDKALWQYTQDLKNQFLR NAPPINKVMYDNKIHVLKNALGLHTAVSRVQGGKLKAKAEIRVATVFRNAPEPFLRMIVV HELAHLKEKEHNKAFYQLCCHMEPQYHQLEFDTRLWLTQLSLGQDKI >gi|296493201|gb|ADTK01000300.1| GENE 5 3735 - 4871 447 378 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225082609|ref|YP_002654106.1| ribosomal protein L11 methyltransferase, putative [marine gamma proteobacterium HTCC2148] # 13 368 18 365 371 176 33 2e-43 MSHLDNGFRSLTLQRFPATDDVNPLQAWEAADEYLLQQLDDTEIRGPVLILNDAFGALSC ALAEHKPYSIGDSYISELATRENLRLNGIDESSVKFLDSTADYPQQPGVVLIKVPKTLAL LEQQLRALRKVVTSDTRIIAGAKARDIHTSTLELFEKVLGPTTTTLAWKKARLINCTFNE PPLADAPQTVSWKLEGTDWTIHNHANVFSRTGLDIGARFFMQHLPENLEGEIVDLGCGNG VIGLTLLDKNPQAKVVFVDESPMAVASSRMNVETNMPEALDRCEFMINNALSGVEPFRFN AVLCNPPFHQQHALTDNVAWEMFHHARRCLKINGELYIVANRHLDYFHKLKKIFGNCTTI ATNNKFVVLKTVKLGRRR >gi|296493201|gb|ADTK01000300.1| GENE 6 5142 - 5558 311 138 aa, chain + ## HITS:1 COG:ygjM KEGG:ns NR:ns ## COG: ygjM COG5499 # Protein_GI_number: 16130977 # Func_class: K Transcription # Function: Predicted transcription regulator containing HTH domain # Organism: Escherichia coli K12 # 1 138 1 138 138 259 100.0 1e-69 MIAIADILQAGEKLTAVAPFLAGIQNEEQYTQALELVDHLLLNDPENPLLDLVCAKITAW EESAPEFAEFNAMAQAMPGGIAVIRTLMDQYGLTLSDLPEIGSKSMVSRVLSGKRKLTLE HAKKLATRFGISPALFID >gi|296493201|gb|ADTK01000300.1| GENE 7 5603 - 7621 1757 672 aa, chain - ## HITS:1 COG:ZygjL_1 KEGG:ns NR:ns ## COG: ZygjL_1 COG1902 # Protein_GI_number: 15803622 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Escherichia coli O157:H7 EDL933 # 1 354 1 354 354 721 
98.0 0 MSYPSLFAPLDLGFTTLKNRVLMGSMHTGLEEYPDGAERLAAFYAERARHGVALIVSGGI APDLTGVGMEGGAMLNDASQIPHHRTITEAVHQEGGKIALQILHTGRYSYQPHLVAPSAF QAPINRFVPHELTHEEILQLIDDFARCAQLAREAGYDGVEVMGSEGYLINEFLTLRTNQR SDQWGGDYRNRMRFAVEVVRAVRERVGNDFIIIYRLSMLDLVEGGGTFAETVELAQAIEA AGATIINTGIGWHEARIPTIATPVPRGAFTWVTRKLKGHVSLPLVTTNRINDPQVADDIL SRGDADMVSMARPFLADAELLSKAQSGRADEINTCIGCNQACLDQIFVGKVTSCLVNPRA CHETKMPILPAVQKKNLAVVGAGPAGLAFAINAAARGHQVTLFDAHSEIGGQFNIAKQIP GKEEFYETLRYYHRMIEVTGVTLKLNHTVTADQLQAFDETILASGIVPRVPPIDGIDHPK VLSYLDVLRDKAPVGNKVAIIGCGGIGFDTAMYLSQPGESTSQNIARFCNEWGIDSSLQQ AGGLSPQGMQIPRSPRQIVMLQRKASKPGQGLGKTTGWIHRTTLLSRGVKMIPSVSYQKI DDDGLHVVINGETQVLAVDNVVICAGQEPNRALAQPLIDSGKTVHLIGGCDVAMELDARR AIAQGTRLALEI >gi|296493201|gb|ADTK01000300.1| GENE 8 7947 - 8420 404 157 aa, chain - ## HITS:1 COG:no KEGG:LF82_3229 NR:ns ## KEGG: LF82_3229 # Name: ygjK # Def: uncharacterized protein YgjK # Organism: E.coli_LF82 # Pathway: not_defined # 16 157 642 783 783 295 98.0 4e-79 MTCVLKINRWRTAARAKPIVERGKGPEGWSPLFNGAATQANADAVVKVMLDPKEFNTFVP LGTAALTNPAFGADIYWRGHVWVDQFWFGLKGMERYGYRDDALKLADTFFRHAKGLTADG PIQENYNPLTGAQQGAPNFSWSAAHLYMLYNDFFRKQ >gi|296493201|gb|ADTK01000300.1| GENE 9 8359 - 10299 2134 646 aa, chain - ## HITS:1 COG:no KEGG:EcolC_0620 NR:ns ## KEGG: EcolC_0620 # Name: not_defined # Def: putative glycosyl hydrolase # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 643 1 643 783 1296 99.0 0 MKIKTILTSVTCALLISFSAHAANADNYKNVINRTGAPQYMKDYDYDDHQRFNPFFDLGA WHGHLLPDGPNTMGGFPGVALLTEEYINFMASNFDHLTVWQDGKKVDFTLEAYSIPGALV QKLTAKDVQVEMTLRFATPRTSLLETKIISNKPLDLVWDGELLEKLEAKEGKPLSDKTIA GEYPDYQRKISATRDGLKVTFGKVRATWDLLTSGESEYQVHKSLPVQTEINGNRFTSKAH INGSTTLYTTYSHLLTSQEVSKEQMQIRDILARPAFYLTASQQRWEEYLKKGLTNPDATP EQTRVAVKAIETLNGNWRSPGGAVKFNTVTPSVTGRWFSGNQTWPWDTWKQAFAMAHFNP DIAKENIRAVFSWQIQPGDRVRPQDVGFVPDLIAWNLSPERGGDGGNWNERNTKPSLAAW SVMEVYNVTQDKTWVAEMYPKLVAYHDWWLRNRDHNGNGVPEYGATRDKAHNTESGEMLF TVKKGDKEETQSGLNNYARVVEKGQYDSLEIPAQVAASWESGRDDAAVFGFIDKEQLDKY VANGGKRSDWTVKFAENRSQDGTLLGYSLLQESVDQASYMYSDNHYLAEMATILGKPEEA 
KRYRQLAQQLADYINTCMFDPTTQFYYDVRIEDKPLANGCAGKTDC >gi|296493201|gb|ADTK01000300.1| GENE 10 10316 - 11386 1031 356 aa, chain - ## HITS:1 COG:no KEGG:B21_02898 NR:ns ## KEGG: B21_02898 # Name: ygjJ # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 356 1 356 356 698 100.0 0 MKLITAPCRALLALPFCYAFSAAGEEARPAEHDDTKTPAITSTSSPSFRFYGELGVGGYM DLEGENKHKYSDGTYIEGGLEMKYGSWFGLIYGEGWTVQADHDGNAWVPDHSWGGFEGGI NRFYGGYRTNDGTEIMLSLRQDSSLDDLQWWGDFTPDLGYVIPNTRDIMTALKVQNLSGN FRYSVTATPAGHHDESKAWLHFGKYDRYDDKYTYPAMMNGYIQYDLAEGITWMNGLEITD GTGQLYLTGLLTPNFAARAWHHTGRADGLDVPGSESGMMVSAMYEALKGVYLSTAYTYAK HRPDHADDETTSFMQFGIWYEYGGGRFATAFDSRFYMKNASHDPSDQIFLMQYFYW >gi|296493201|gb|ADTK01000300.1| GENE 11 11520 - 12953 1361 477 aa, chain - ## HITS:1 COG:ECs3960 KEGG:ns NR:ns ## COG: ECs3960 COG0531 # Protein_GI_number: 15833214 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli O157:H7 # 1 477 1 477 477 809 99.0 0 MSDTKRNTIGKFGLLSLTFAAVYSFNNVINNNIELGLASAPMFFLATIFYFIPFCLIIAE FVSLNKNSEAGVYAWVKSSLGGRWAFITAYTYWFVNLFFFTSLLPRVIAYASYAFLGYEY IMTPVATTIISMVLFAFSTWVSTNGAKMLGPITSVTSTLMLLLKLSYILLAGTALVGGVQ PADPITVDAMIPNFNWAFLGVTTWIFMAAGGAESVAVYVNDVKGGSKSFVKVIILAGIFI GVLYSVSSVLINVFVSSKELKFTGGSVQVFHGMAAYFGLPEALMNRFVGLVSFTAMFGSL LMWTATPVKIFFSEIPEGIFGKKTVELNENGVPARAAWIQFLIVIPLMIIPMLGSNTVQD LMNTIINMTAAASMLPPLFIMLAYLNLRAKLDHLPRDFRMGSRRTGIIVVSMLIAIFAVG FVASTFPTGANILTIIFYNVGGIVIFLGFAWWKYSKYIKGLTAEERHIEATPASNVD >gi|296493201|gb|ADTK01000300.1| GENE 12 13016 - 13465 401 149 aa, chain - ## HITS:1 COG:ebgC KEGG:ns NR:ns ## COG: ebgC COG2731 # Protein_GI_number: 16130972 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Escherichia coli K12 # 1 149 1 149 149 287 100.0 4e-78 MRIIDNLEQFRQIYASGKKWQRCVEAIENIDNIQPGVAHSIGDSLTYRVETDSATDALFT GHRRYFEVHYYLQGQQKIEYAPKETLQVVEYYRDETDREYLKGCGETVEVHEGQIVICDI HEAYRFICNNAVKKVVLKVTIEDGYFHNK >gi|296493201|gb|ADTK01000300.1| GENE 13 13462 - 16554 
2980 1030 aa, chain - ## HITS:1 COG:ECs3958 KEGG:ns NR:ns ## COG: ECs3958 COG3250 # Protein_GI_number: 15833212 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Escherichia coli O157:H7 # 1 1030 13 1042 1042 2177 99.0 0 MNRWENIQLTHENRLAPRAYFFSYDSVAQARTFARETSSLFLPLSGQWNFHFFDHPLQVP EAFTSELMADWGHITVPAMWQMEGHGKLQYTDEGFPFPIDVPFVPSDNPTGAYQRIFTLS DGWQGKQTLIKFDGVETYFEVYVNGQYVGFSKGSRLTAEFDISAMVKTGDNLLCVRVMQW ADSTYVEDQDMWWSAGIFRDVYLIGKQLTHINDFTVRTDFDEAYCDATLSCEVVLENLSA SPVVTTLEYTLFDGERVVHSSAIDHLAIEKLTSASFAFTVEQPQQWSAESPYLYHLVMTL KDANGNVLEVVPQRVGFRDIKVRDGLFWINNRYVMLHGVNRHDNDHRKGRAVGMDRVEKD LQLMKQHNINSVRTAHYPNDPRFYELCDIYGLFVMAETDVESHGFANVGDISRITDDPQW EKVYVERIVRHIHAQKNHPSIIIWSLGNESGYGCNIRAMYHAAKALDDTRLVHYEEDRDA EVVDIISTMYTRVPLMNEFGEYPHPKPRIICEYAHAMGNGPGGLTEYQNVFYKHDCIQGH YVWEWCDHGIQAQDDNGNVWYKFGGDYGDYPNNYNFCLDGLIYSDQTPGPGLKEYKQVIA PVKIHALDLTRGELKVENKLWFTTLDDYTLHAEVRAEGETLATQQIKLRDVAPNSEAPLQ ITLPQLDAREAFLNITVTKDSRTRYSEAGHSIATYQFPLKENTAQPVPFAPNNARPLTLE DDRLSCTVRGYNFAITFSKMSGKPTSWQVNGESLLTREPKINFFKPMIDNHKQEYEGLWQ PNHLQIMQEHLRDFAVEQSDGEVLIISRTVIAPPVFDFGMRCTYIWRIAADGQVNVALSG ERYGDYPHIIPCIGFTMGINGEYDQVAYYGRGPGENYADSQQANIIDIWRSTVDAMFENY PFPQNNGNRQHVRWTALTNRHGNGLLVVPQRPINFSAWHYTQENIHAAQHCNELQRSDDI TLNLDHQLLGLGSNSWGSEVLDSWRVWFRDFSYGFTLLPVSGGEATAQSLASYEFGAGFF STNLHSENKQ >gi|296493201|gb|ADTK01000300.1| GENE 14 16738 - 17721 993 327 aa, chain - ## HITS:1 COG:ebgR KEGG:ns NR:ns ## COG: ebgR COG1609 # Protein_GI_number: 16130970 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 327 1 327 327 655 99.0 0 MATLKDIAIEAGVSLATVSRVLNDDPTLNVKEETKHRILEIAEKLEYKTSSARKLQTGAV NQHHILAIYSYQQELEINDPYYLAIRHGIETQCEKLGIELTNCYEHSGLPDIKNVTGILI VGKPTPALRAAACALTDNICFIDFHEPGSGYDAVDIDLARISKEIIDFYINQGVNRIGFI GGEDEPGKADIREVAFAEYGRLKQVVREEDIWRGGFSSSSGYELAKQLLAQEDYPKALFV ASDSIAIGVLRAIHERGLNIPQDISLISVNDIPTARFTFPPLSTVRIHSEMMGSQGVNLV YEKARDGRALPLLVFVPSKLKLRGTTR >gi|296493201|gb|ADTK01000300.1| GENE 15 
17940 - 18272 477 110 aa, chain + ## HITS:1 COG:ECs3956 KEGG:ns NR:ns ## COG: ECs3956 COG0073 # Protein_GI_number: 15833210 # Func_class: R General function prediction only # Function: EMAP domain # Organism: Escherichia coli O157:H7 # 1 110 1 110 110 193 99.0 8e-50 METVAYADFARLEMRVGKIVEVKRHENADKLYIVQVDVGEKTLQTVTSLVPYYSEEELMG KTVVVLCNLQKAKMRGETSECMLLCAETDDGSESVLLTPERMMPAGVRVV >gi|296493201|gb|ADTK01000300.1| GENE 16 18314 - 19603 1406 429 aa, chain - ## HITS:1 COG:ECs3955 KEGG:ns NR:ns ## COG: ECs3955 COG4992 # Protein_GI_number: 15833209 # Func_class: E Amino acid transport and metabolism # Function: Ornithine/acetylornithine aminotransferase # Organism: Escherichia coli O157:H7 # 1 429 68 496 496 862 100.0 0 MKALNREVIEYFKEHVNPGFLEYRKSVTAGGDYGAVEWQAGGLNTLVDTQGQEFIDCLGG FGIFNVGHRNPVVVSAVQNQLAKQPLHSQELLDPLRAMLAKTLAALTPGKLKYSFFCNSG TESVEAALKLAKAYQSPRGKFTFIATSGAFHGKSLGALSATAKSTFRKPFMPLLPGFRHV PFGNIEAMRTALNECKKTGDDVAAVILEPIQGEGGVILPPPGYLTAVRKLCDEFGALMIL DEVQTGMGRTGKMFACEHENVQPDILCLAKALGGGVMPIGATIATEEVFSVLFDNPFLHT TTFGGNPLACAAALATINVLLEQNLPAQAEQKGDMLLDGFRQLAREYPDLVQEARGKGML MAIEFVDNEIGYNFASEMFRQRVLVAGTLNNAKTIRIEPPLTLTIEQCELVIKAARKALA AMRVSVEEA >gi|296493201|gb|ADTK01000300.1| GENE 17 20111 - 21631 1466 506 aa, chain + ## HITS:1 COG:Zaer_2 KEGG:ns NR:ns ## COG: Zaer_2 COG0840 # Protein_GI_number: 15803613 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Escherichia coli O157:H7 EDL933 # 125 506 1 382 382 674 99.0 0 MSSHPYVTQQNTPLAGDTTLMSTTDLQSYITHANDTFVQVSGFTLQELQGQPHNMVRHPD MPKAAFADMWFTLKKGEPWSGIVKNRRKNGDHYWVRANAVPMVREGKISGYMSIRTRATD EEIAAVEPLYKALNAGRTGKRIHKGLVVRKGWLGKLPSLPLRWRARGVMTLMFILLAAML WFVAAPVVTYILCALVMLLASACFEWQIVRPIENVARQALKVATGERNSVEHLNRSDELG LTLRAVGQLGLMCRWLINDVSSQVSSVRNGSETLAKGTDELNEHTQQTVDNVQQTVATMN QMAASVKQNSATASAADKLSITASNAAVQGGEAMTTVIKTMDDIADSTQRIGTITSLIND IAFQTNILALNAAVEAARAGEQGKGFAVVAGEVRHLASRSANAANDIRKLIDASADKVQS GSQQVHAAGRTMEDIVAQVKNVTQLIAQISHSTLEQADGLSSLTRAVDELNLITQKNAEL 
VEESAQVSAMVKHRASRLEDAVTVLH >gi|296493201|gb|ADTK01000300.1| GENE 18 21785 - 22408 759 207 aa, chain - ## HITS:1 COG:ECs3953 KEGG:ns NR:ns ## COG: ECs3953 COG1695 # Protein_GI_number: 15833207 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 207 1 207 207 350 99.0 7e-97 MSHHHEGCCKHEGQPRHEGCCKGEKSEHEHCGHGHQHEHGQCCGGRHGRGGGRRQRFFGH GELRLVILDILSRDDSHGYELIKAIENLTQGNYTPSPGVIYPTLDFLQEQSLITIREEEG GKKQIALTEQGAQWLEENREQVEMIEERIKARCVGAALRQNPQMKRALDNFKAVLDLRVN QSDITDAQIKKIIAVIDRAAFDITQLD >gi|296493201|gb|ADTK01000300.1| GENE 19 22696 - 23460 747 254 aa, chain + ## HITS:1 COG:ECs3952 KEGG:ns NR:ns ## COG: ECs3952 COG2375 # Protein_GI_number: 15833206 # Func_class: P Inorganic ion transport and metabolism # Function: Siderophore-interacting protein # Organism: Escherichia coli O157:H7 # 1 254 1 254 254 514 99.0 1e-146 MNNSPRYPQRVRNDLRFRELTVLRVERISAGFQRIVLGGEALDGFTSRGFDDHSKLFFPQ PDAHFVPPTVTEEGIVWPEGPRPPSRDYTPLYDELRHELVIDFFIHDGGVASGWAMQAQP GDKLTVAGPRGSLVVPEDYAYQLYVCDESGMPALRRRLEMLSKLAVKPQVSALVSVRDNA CQDYLAHLDGFNIEWLAHDEQAVDARLAQMQIPADDYFIWITGEGKVVKNLSRRFEAEQY DPQRVRAAAYWHAK >gi|296493201|gb|ADTK01000300.1| GENE 20 23714 - 24220 432 168 aa, chain + ## HITS:1 COG:ygjF KEGG:ns NR:ns ## COG: ygjF COG3663 # Protein_GI_number: 16130964 # Func_class: L Replication, recombination and repair # Function: G:T/U mismatch-specific DNA glycosylase # Organism: Escherichia coli K12 # 1 168 1 168 168 336 100.0 1e-92 MVEDILAPGLRVVFCGINPGLSSAGTGFPFAHPANRFWKVIYQAGFTDRQLKPQEAQHLL DYRCGVTKLVDRPTVQANEVSKQELHAGGRKLIEKIEDYQPQALAILGKQAYEQGFSQRG AQWGKQTLTIGSTQIWVLPNPSGLSRVSLEKLVEAYRELDQALVVRGR >gi|296493201|gb|ADTK01000300.1| GENE 21 24299 - 26140 2371 613 aa, chain - ## HITS:1 COG:ECs3950 KEGG:ns NR:ns ## COG: ECs3950 COG0568 # Protein_GI_number: 15833204 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Escherichia coli O157:H7 # 1 613 1 613 613 1031 99.0 0 
MEQNPQSQLKLLVTRGKEQGYLTYAEVNDHLPEDIVDSDQIEDIIQMINDMGIQVMEEAP DADDLMLAENTADEDAAEAAAQVLSSVESEIGRTTDPVRMYMREMGTVELLTREGEIDIA KRIEDGINQVQCSVAEYPEAITYLLEQYDRVEAEEARLSDLITGFVDPNAEEDLAPTATH VGSELSQEDLDDDEDEDEEDGDDDSADDDNSIDPELAREKFAELRAQYVVTRDTIKAKGR SHATAQEEILKLSEVFKQFRLVPKQFDYLVNSMRVMMDRVRTQERLIMKLCVEQCKMPKK NFITLFTGNETSDTWFNAAIAMNKPWSEKLHDVSEEVHRALQKLQQIEEETGLTIEQVKD INRRMSIGEAKARRAKKEMVEANLRLVISIAKKYTNRGLQFLDLIQEGNIGLMKAVDKFE YRRGYKFSTYATWWIRQAITRSIADQARTIRIPVHMIETINKLNRISRQMLQEMGREPTP EELAERMLMPEDKIRKVLKIAKEPISMETPIGDDEDSHLGDFIEDTTLELPLDSATTESL RAATHDVLAGLTAREAKVLRMRFGIDMNTDHTLEEVGKQFDVTRERIRQIEAKALRKLRH PSRSEVLRSFLDD >gi|296493201|gb|ADTK01000300.1| GENE 22 26335 - 28080 1320 581 aa, chain - ## HITS:1 COG:ECs3949 KEGG:ns NR:ns ## COG: ECs3949 COG0358 # Protein_GI_number: 15833203 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Escherichia coli O157:H7 # 1 581 1 581 581 1191 100.0 0 MAGRIPRVFINDLLARTDIVDLIDARVKLKKQGKNFHACCPFHNEKTPSFTVNGEKQFYH CFGCGAHGNAIDFLMNYDKLEFVETVEELAAMHNLEVPFEAGSGPSQIERHQRQTLYQLM DGLNTFYQQSLQQPVATSARQYLEKRGLSHEVIARFAIGFAPPGWDNVLKRFGGNPENRQ SLIDAGMLVTNDQGRSYDRFRERVMFPIRDKRGRVIGFGGRVLGNDTPKYLNSPETDIFH KGRQLYGLYEAQQDNAEPNRLLVVEGYMDVVALAQYGINYAVASLGTSTTADHIQLLFRA TNNVICCYDGDRAGRDAAWRALETALPYMTDGRQLRFMFLPDGEDPDTLVRKEGKEAFEA RMEQAMPLSAFLFNSLMPQVDLSTPDGRARLSTLALPLISQVPGETLRIYLRQELGNKLG ILDDSQLERLMPKAAESGVSRPVPQLKRTTMRILIGLLVQNPELATLVPPLENLDENKLP GLGLFRELVNTCLSQPGLTTGQLLEHYRGTNNAATLEKLSMWDDIADKNIAEQTFTDSLN HMFDSLLELRQEELIARERTHGLSNEERLELWTLNQELAKK >gi|296493201|gb|ADTK01000300.1| GENE 23 28191 - 28406 357 71 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803607|ref|NP_289640.1| 30S ribosomal protein S21 [Escherichia coli O157:H7 EDL933] # 1 71 1 71 71 142 100 4e-33 MPVIKVRENEPFDVALRRFKRSCEKAGVLAEVRRREFYEKPTTERKRAKASAVKRHAKKL ARENARRTRLY >gi|296493201|gb|ADTK01000300.1| GENE 24 28644 - 29657 660 337 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227425790|ref|ZP_03908856.1| SSU ribosomal protein S18P 
alanine acetyltransferase [Atopobium parvulum DSM 20469] # 3 327 480 813 832 258 44 3e-68 MRVLGIETSCDETGIAIYDDEKGLLANQLYSQVKLHADYGGVVPELASRDHVRKTVPLIQ AALKESGLTAKDIDAVAYTAGPGLVGALLVGATVGRSLAFAWDVPAIPVHHMEGHLLAPM LEDNPPEFPFVALLVSGGHTQLISVTGIGQYELLGESIDDAAGEAFDKTAKLLGLDYPGG PLLSKMAAQGTAGRFVFPRPMTDRPGLDFSFSGLKTFAANTIRDNGTDDQTRADIARAFE DAVVDTLMIKCKRALDQTGFKRLVMAGGVSANRTLRAKLAEMMKKRRGEVFYARPEFCTD NGAMIAYAGMVRFKAGATADLGVSVRPRWPLAELPAA >gi|296493201|gb|ADTK01000300.1| GENE 25 29700 - 31163 1483 487 aa, chain - ## HITS:1 COG:ygjE KEGG:ns NR:ns ## COG: ygjE COG0471 # Protein_GI_number: 16130959 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Escherichia coli K12 # 1 487 1 487 487 802 100.0 0 MKPSTEWWRYLAPLAVIAIIALLPVPAGLENHTWLYFAVFTGVIVGLILEPVPGAVVAMV GISIIAILSPWLLFSPEQLAQPGFKFTAKSLSWAVSGFSNSVIWLIFAAFMFGTGYEKTG LGRRIALILVKKMGHRTLFLGYAVMFSELILAPVTPSNSARGAGIIYPIIRNLPPLYQSQ PNDSSSRSIGSYIMWMGIVADCVTSAIFLTAMAPNLLLIGLMKSASHATLSWGDWFLGML PLSILLVLLVPWLAYVLYPPVLKSGDQVPRWAETELQAMGPLCSREKRMLGLMVGALVLW IFGGDYIDAAMVGYSVVALMLLLRIISWDDIVSNKAAWNVFFWLASLITLATGLNNTGFI SWFGKLLAGSLSGYSPTMVMVALIVVFYLLRYFFASATAYTSALAPMMIAAALAMPEIPL PVFCLMVGAAIGLGSILTPYATGPSPIYYGSGYLPTADYWRLGAIFGLIFLVLLVITGLL WMPVVLL >gi|296493201|gb|ADTK01000300.1| GENE 26 31211 - 31840 248 209 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 1 199 305 504 508 100 30 1e-45 SVSHPERVMKKILTTPIKAEDLQDIRVGDVIYLTGTLVTCRDVCHRRLIELKRPIPYDLN GKAIFHAGPIVRKNGDKWEMVSVGPTTSMRMESFEREFIEQTGVKLVVGKGGMGPLTEEG CQKFKALHVIFPAGCAVLAATQVEEIEEVHWTELGMPESLWVCRVKEFGPLIVSIDTHGN NLIAENKKLFAERRDPIVEEICEHVHYIK >gi|296493201|gb|ADTK01000300.1| GENE 27 31813 - 32721 262 302 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 24 300 21 297 508 105 27 1e-45 MSESNKQQAVNKLTEIVANFTAMISTRMPDDVVDKLKQLKDAETSSMGKIIYHTMFDNMQ 
KAIDLNRPACQDTGEIMFFVKVGSRFPLLGELQSILKQAVEEATVKAPLRHNAVEIFDEV NTGKNTGSGVPWVTWDIIPDNDDAEIEVYMAGGGCTLPGRSKVLMPSEGYEGVVKFVFEN ISTLAVNACPPVLVGVGIATSVETAAVLSRKAILRPIGSRHPNPKAAELELRLEEGLNRL GIGPQGLTGNSSVMGVHIESAARHPSTIGVAVSTGCWAHRRGTLLVHADLTFENLSHTRS AL >gi|296493201|gb|ADTK01000300.1| GENE 28 32931 - 33863 904 310 aa, chain + ## HITS:1 COG:ygiP KEGG:ns NR:ns ## COG: ygiP COG0583 # Protein_GI_number: 16130956 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 310 1 310 310 639 99.0 0 MLNSWPLAKDLQVLVEIVHSGSFSAAAATLGQTPAFVTKRIQILENTLATTLLNRSARGV ALTESGQRCYEHALEILTQYHRLVDDVTQIKTRPEGMIRIGCSFGFGRSHIAPAITELMR NYPELQVHFELFDRQIDLVQDNIDLDIRINDEIPDYYIAHLLTKNKRILCAAPEYLQKYP QPQSLQELSRHDCLVTKERDMTHGIWELGNGQEKKSVKVSGHLSSNSGEIVLQWALEGKG IMLRSEWDVLPFLESGKLVQVLPEYAQSANIWAVYREPLYRSMKLRVCVEFLAAWCQQRL GKPDEGYQVM >gi|296493201|gb|ADTK01000300.1| GENE 29 33876 - 34493 528 205 aa, chain - ## HITS:1 COG:ECs3942 KEGG:ns NR:ns ## COG: ECs3942 COG0344 # Protein_GI_number: 15833196 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 205 1 205 205 388 100.0 1e-108 MSAIAPGMILIAYLCGSISSAILVCRLCGLPDPRTSGSGNPGATNVLRIGGKGAAVAVLI FDVLKGMLPVWGAYELGVSPFWLGLIAIAACLGHIWPVFFGFKGGKGVATAFGAIAPIGW DLTGVMAGTWLLTVLLSGYSSLGAIVSALIAPFYVWWFKPQFTFPVSMLSCLILLRHHDN IQRLWRRQETKIWTKFKRKREKDPE >gi|296493201|gb|ADTK01000300.1| GENE 30 34598 - 34966 431 122 aa, chain + ## HITS:1 COG:folB KEGG:ns NR:ns ## COG: folB COG1539 # Protein_GI_number: 16130954 # Func_class: H Coenzyme transport and metabolism # Function: Dihydroneopterin aldolase # Organism: Escherichia coli K12 # 1 122 2 123 123 222 100.0 1e-58 MDIVFIEQLSVITTIGVYDWEQTIEQKLVFDIEMAWDNRKAAKSDDVADCLSYADIAETV VSHVEGARFALVERVAEEVAELLLARFNSPWVRIKLSKPGAVARAANVGVIIERGNNLKE NN >gi|296493201|gb|ADTK01000300.1| GENE 31 35056 - 35877 985 273 aa, chain + ## HITS:1 COG:ECs3940 KEGG:ns NR:ns ## COG: ECs3940 COG1968 # Protein_GI_number: 15833194 # Func_class: V Defense 
mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: Escherichia coli O157:H7 # 1 273 1 273 273 482 100.0 1e-136 MSDMHSLLIAAILGVVEGLTEFLPVSSTGHMIIVGHLLGFEGDTAKTFEVVIQLGSILAV VVMFWRRLFGLIGIHFGRPLQHEGESKGRLTLIHILLGMIPAVVLGLLFHDTIKSLFNPI NVMYALVVGGLLLIAAECLKPKEPRAPGLDDMTYRQAFMIGCFQCLALWPGFSRSGATIS GGMLMGVSRYAASEFSFLLAVPMMMGATALDLYKSWGFLTSGDIPMFAVGFITAFVVALI AIKTFLQLIKRISFIPFAIYRFIVAAAVYVVFF >gi|296493201|gb|ADTK01000300.1| GENE 32 36058 - 37296 1173 412 aa, chain - ## HITS:1 COG:cca KEGG:ns NR:ns ## COG: cca COG0617 # Protein_GI_number: 16130952 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Escherichia coli K12 # 1 412 1 412 412 837 100.0 0 MKIYLVGGAVRDALLGLPVKDRDWVVVGSTPQEMLDAGYQQVGRDFPVFLHPQTHEEYAL ARTERKSGSGYTGFTCYAAPDVTLEDDLKRRDLTINALAQDDNGEIIDPYNGLGDLQNRL LRHVSPAFGEDPLRVLRVARFAARYAHLGFRIADETLALMREMTHAGELEHLTPERVWKE TESALTTRNPQVFFQVLRDCGALRVLFPEIDALFGVPAPAKWHPEIDTGIHTLMTLSMAA MLSPQVDVRFATLCHDLGKGLTPPELWPRHHGHGPAGVKLVEQLCQRLRVPNEIRDLARL VAEFHDLIHTFPMLNPKTIVKLFDSIDAWRKPQRVEQLALTSEADVRGRTGFESADYPQG RWLREAWEVAQSVPTKAVVEAGFKGVEIREELTRRRIAAVASWKEQRCPKPE >gi|296493201|gb|ADTK01000300.1| GENE 33 37360 - 37980 666 206 aa, chain - ## HITS:1 COG:ECs3938 KEGG:ns NR:ns ## COG: ECs3938 COG3103 # Protein_GI_number: 15833192 # Func_class: T Signal transduction mechanisms # Function: SH3 domain protein # Organism: Escherichia coli O157:H7 # 1 206 1 206 206 348 100.0 4e-96 MPKLRLIGLTLLALSATAVSHAEETRYVSDELNTWVRSGPGDHYRLVGTVNAGEEVTLLQ TDANTNYAQVKDSSGRTAWIPLKQLSTEPSLRSRVPDLENQVKTLTDKLTNIDNTWNQRT AEMQQKVAQSDSVINGLKEENQKLKNELIVAQKKVDAASVQLDDKQRTIIMQWFMYGGGV LGLGLLLGLVLPHLIPSRKRKDRWMN >gi|296493201|gb|ADTK01000300.1| GENE 34 37999 - 38205 62 68 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_3537 NR:ns ## KEGG: ECUMN_3537 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 68 1 68 68 119 100.0 4e-26 
MESGTIVVASVSLRKAFDINRNPLCQVLPLLQTFTPVKLPVAWQFGAKYYLLTKKIALSS PFIVTYVT >gi|296493201|gb|ADTK01000300.1| GENE 35 38222 - 39523 1561 433 aa, chain + ## HITS:1 COG:ygiF KEGG:ns NR:ns ## COG: ygiF COG3025 # Protein_GI_number: 16130950 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 433 1 433 433 843 100.0 0 MAQEIELKFIVNHSAVEALRDHLNTLGGEHHDPVQLLNIYYETPDNWLRGHDMGLRIRGE NGRYEMTMKVAGRVTGGLHQRPEYNVALSEPTLDLAQLPTEVWPNGELPADLASRVQPLF STDFYREKWLVAVDGSQIEIALDQGEVKAGEFAEPICELELELLSGDTRAVLKLANQLVS QTGLRQGSLSKAARGYHLAQGNPAREIKPTTILHVAAKADVEQGLEAALELALAQWQYHE ELWVRGNDAAKEQVLAAISLVRHTLMLFGGIVPRKASTHLRDLLTQCEATIASAVSAVTA VYSTETAMAKLALTEWLVSKAWQPFLDAKAQGKISDSFKRFADIHLSRHAAELKSVFCQP LGDRYRDQLPRLTRDIDSILLLAGYYDPVVAQAWLENWQGLHHAIATGQRIEIEHFRNEA NNQEPFWLHSGKR >gi|296493201|gb|ADTK01000300.1| GENE 36 39546 - 42386 2970 946 aa, chain + ## HITS:1 COG:glnE KEGG:ns NR:ns ## COG: glnE COG1391 # Protein_GI_number: 16130949 # Func_class: O Posttranslational modification, protein turnover, chaperones; T Signal transduction mechanisms # Function: Glutamine synthetase adenylyltransferase # Organism: Escherichia coli K12 # 1 946 1 946 946 1811 99.0 0 MKPLSSPLQQYWQTVVERLPEPLAEESLSAQAKSVLTFSDFVQDSISAHPEWLTELESQP PQADEWQHYVAWLQEALSNVSDEAGLMRELRLFRRRIMVRIAWAQTLALVTEESILQQLS YLAETLIVAARDWLYDACCREWGTPCNAQGEAQPLLILGMGKLGGGELNFSSDIDLIFAW PEHGCTQGGRRELDNAQFFTRMGQRLIKVLDQPTQDGFVYRVDMRLRPFGESGPLVLSFA ALEDYYQEQGRDWERYAMVKARIMGDSEGVYANELRAMLRPFVFRRYIDFSVIQSLRNMK GMIAREVRRRGLTDNIKLGAGGIREIEFIVQVFQLIRGGREPSLQSRSLLPTLSAIAELH LLSENDAEQLRVAYLFLRRLENLLQSINDEQTQTLPSDELNRARLAWAMDFADWPQLTGA LTAHMTNVRRVFNELIGDDESETQEESLSEQWRELWQDALQEDDTTPVLAHLSEDDRKQV LTLIADFRKELDKRTIGPRGRQVLDHLMPHLLSDVCAREDAAVTLSRITALLVGIVTRTT YLELLSEFPAALKHLISLCAASPMIASQLARYPLLLDELLDPNTLYQPTATDAYRDELRQ YLLRVPEDDEEQQLEALRQFKQAQLLRIAAADIAGTLPVMKVSDHLTWLAEAMIDAVVQQ AWVQMVARYGKPNHLNEREGRGFAVVGYGKLGGWELGYSSDLDLIFLHDCPMDAMTDGER EIDGRQFYLRLAQRIMHLFSTRTSSGILYEVDARLRPSGAAGMLVTSAEAFADYQKNEAW 
TWEHQALVRARVVYGDPQLTAHFDAVRREIMTLPREGKTLQTEVREMREKMRAHLGNKHR DRFDIKADEGGITDIEFITQYLVLRYAHEKPKLTRWSDNVRILELLAQNDIMEEQEAMAL TRAYTTLRDELHHLALQELPGHVSEDCFTAERELVRASWQKWLVEE >gi|296493201|gb|ADTK01000300.1| GENE 37 42434 - 43867 1744 477 aa, chain + ## HITS:1 COG:rfaE KEGG:ns NR:ns ## COG: rfaE COG2870 # Protein_GI_number: 16130948 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase # Organism: Escherichia coli K12 # 1 477 1 477 477 916 100.0 0 MKVTLPEFERAGVMVVGDVMLDRYWYGPTSRISPEAPVPVVKVNTIEERPGGAANVAMNI ASLGANARLVGLTGIDDAARALSKSLADVNVKCDFVSVPTHPTITKLRVLSRNQQLIRLD FEEGFEGVDPQPLHERINQALSSIGALVLSDYAKGALASVQQMIQLARKAGVPVLIDPKG TDFERYRGATLLTPNLSEFEAVVGKCKTEEEIVERGMKLIADYELSALLVTRSEQGMSLL QPGKAPLHMPTQAQEVYDVTGAGDTVIGVLAATLAAGNSLEEACFFANAAAGVVVGKLGT STVSPIELENAVRGRADTGFGVMTEEELKLAVAAARKRGEKVVMTNGVFDILHAGHVSYL ANARKLGDRLIVAVNSDASTKRLKGDSRPVNPLEQRMIVLGALEAVDWVVSFEEDTPQRL IAGILPDLLVKGGDYKPEEIAGSKEVWANGGEVLVLNFEDGCSTTNIIKKIQQDKKG Prediction of potential genes in microbial genomes Time: Mon May 16 15:56:42 2011 Seq name: gi|296493200|gb|ADTK01000301.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont923.4, whole genome shotgun sequence Length of sequence - 73416 bp Number of predicted genes - 77, with homology - 77 Number of transcription units - 41, operones - 16 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 4 - 40 3.0 1 1 Op 1 . - CDS 57 - 1718 1753 ## COG2268 Uncharacterized protein conserved in bacteria 2 1 Op 2 . - CDS 1745 - 2374 167 ## ECO103_3727 putative inner membrane protein - Prom 2499 - 2558 5.8 + Prom 2490 - 2549 6.9 3 2 Tu 1 . + CDS 2643 - 2843 200 ## SSON_3186 glycogen synthesis protein GlgS + Term 2844 - 2882 3.5 - Term 2837 - 2864 1.5 4 3 Op 1 . - CDS 2886 - 3950 427 ## SSON_3184 hypothetical protein 5 3 Op 2 10/0.000 - CDS 3952 - 4701 471 ## COG3121 P pilus assembly protein, chaperone PapD 6 3 Op 3 . 
- CDS 4717 - 5370 356 ## COG3188 P pilus assembly protein, porin PapC 7 3 Op 4 6/0.000 - CDS 5410 - 7230 819 ## COG3188 P pilus assembly protein, porin PapC 8 3 Op 5 . - CDS 7281 - 7832 468 ## COG3539 P pilus assembly protein, pilin FimA - Prom 7934 - 7993 8.8 - Term 8048 - 8099 9.1 9 4 Tu 1 . - CDS 8115 - 8405 319 ## COG2960 Uncharacterized protein conserved in bacteria - Prom 8433 - 8492 3.7 10 5 Tu 1 . + CDS 8779 - 9432 781 ## COG0108 3,4-dihydroxy-2-butanone 4-phosphate synthase + Term 9463 - 9504 5.6 + Prom 9542 - 9601 5.0 11 6 Tu 1 . + CDS 9693 - 9863 242 ## ECUMN_3528 hypothetical protein + Term 9881 - 9923 5.7 - Term 9780 - 9811 -0.8 12 7 Tu 1 . - CDS 9921 - 10694 897 ## COG0428 Predicted divalent heavy-metal cations transporter - Prom 10791 - 10850 3.5 + Prom 10687 - 10746 5.5 13 8 Tu 1 . + CDS 10837 - 11625 834 ## COG3384 Uncharacterized conserved protein + Term 11626 - 11673 9.6 - Term 11616 - 11655 9.1 14 9 Op 1 5/0.062 - CDS 11663 - 12823 1139 ## COG0754 Glutathionylspermidine synthase 15 9 Op 2 . - CDS 12829 - 13404 399 ## COG5463 Predicted integral membrane protein - Prom 13528 - 13587 2.5 16 10 Tu 1 . - CDS 13648 - 15129 1684 ## COG1538 Outer membrane protein - Prom 15244 - 15303 6.1 + Prom 15125 - 15184 4.7 17 11 Op 1 8/0.000 + CDS 15334 - 15963 619 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 18 11 Op 2 7/0.000 + CDS 15964 - 16386 181 ## COG3151 Uncharacterized protein conserved in bacteria 19 11 Op 3 7/0.000 + CDS 16411 - 17238 733 ## COG1409 Predicted phosphohydrolases 20 11 Op 4 7/0.000 + CDS 17238 - 17819 477 ## COG3150 Predicted esterase 21 11 Op 5 . + CDS 17848 - 19740 1948 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit + Term 19750 - 19786 6.1 22 12 Op 1 . - CDS 19785 - 20654 363 ## Ping_0266 outer membrane protein 23 12 Op 2 . - CDS 20668 - 21363 623 ## PSPPH_3003 oligogalacturonate-specific porin protein KdgM - Prom 21409 - 21468 4.0 24 13 Op 1 . 
- CDS 21498 - 22622 426 ## COG4222 Uncharacterized protein conserved in bacteria 25 13 Op 2 . - CDS 22650 - 23564 541 ## COG0524 Sugar kinases, ribokinase family 26 13 Op 3 . - CDS 23569 - 24561 180 ## RD1_2418 glucosamine--fructose-6-phosphate aminotransferase, putative 27 13 Op 4 2/0.750 - CDS 24567 - 25643 617 ## COG1397 ADP-ribosylglycohydrolase 28 13 Op 5 . - CDS 25657 - 26409 459 ## COG2188 Transcriptional regulators - Prom 26462 - 26521 6.5 + Prom 26284 - 26343 2.5 29 14 Op 1 . + CDS 26368 - 26625 112 ## gi|300905841|ref|ZP_07123574.1| hypothetical protein HMPREF9536_03833 30 14 Op 2 . + CDS 26696 - 28027 735 ## COG2211 Na+/melibiose symporter and related transporters + Term 28036 - 28068 0.5 - Term 28024 - 28056 2.1 31 15 Op 1 4/0.188 - CDS 28071 - 28385 395 ## COG1359 Uncharacterized conserved protein 32 15 Op 2 2/0.750 - CDS 28416 - 28997 670 ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B) 33 16 Op 1 40/0.000 - CDS 29107 - 30456 1160 ## COG0642 Signal transduction histidine kinase 34 16 Op 2 . - CDS 30453 - 31085 792 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 31176 - 31235 4.6 + Prom 31133 - 31192 8.5 35 17 Op 1 3/0.438 + CDS 31264 - 31656 543 ## COG3111 Uncharacterized conserved protein 36 17 Op 2 . + CDS 31709 - 32191 441 ## COG3449 DNA gyrase inhibitor + Term 32233 - 32262 -0.9 + Prom 32300 - 32359 3.8 37 18 Op 1 . 
+ CDS 32396 - 32692 286 ## B21_02844 hypothetical protein 38 18 Op 2 1/0.938 + CDS 32694 - 33089 209 ## COG1396 Predicted transcriptional regulators + Term 33096 - 33132 3.3 + Prom 33135 - 33194 2.7 39 19 Tu 1 4/0.188 + CDS 33222 - 34829 1526 ## COG4166 ABC-type oligopeptide transport system, periplasmic component + Prom 34859 - 34918 2.4 40 20 Tu 1 5/0.062 + CDS 34967 - 37225 2535 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit + Term 37371 - 37399 -0.9 + Prom 37321 - 37380 2.3 41 21 Op 1 7/0.000 + CDS 37459 - 38196 617 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase + Term 38215 - 38249 2.8 42 21 Op 2 2/0.750 + CDS 38271 - 39683 1188 ## COG2132 Putative multicopper oxidases + Term 39684 - 39731 9.1 + Prom 39706 - 39765 6.3 43 22 Tu 1 . + CDS 39794 - 42013 2079 ## COG1032 Fe-S oxidoreductase + Term 42020 - 42060 5.1 - Term 42008 - 42047 9.2 44 23 Op 1 . - CDS 42056 - 42313 293 ## COG4238 Murein lipoprotein 45 23 Op 2 . - CDS 42364 - 43290 648 ## EC55989_3430 hypothetical protein - Prom 43394 - 43453 2.9 - Term 43438 - 43476 7.2 46 24 Tu 1 1/0.938 - CDS 43490 - 44317 891 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase - Term 44326 - 44361 7.4 47 25 Tu 1 . - CDS 44422 - 45585 1501 ## COG1979 Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family - Prom 45748 - 45807 5.3 48 26 Tu 1 . + CDS 45779 - 46678 854 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 46670 - 46707 7.2 49 27 Tu 1 5/0.062 - CDS 46718 - 47377 575 ## COG0586 Uncharacterized membrane-associated protein - Prom 47456 - 47515 3.9 50 28 Tu 1 . - CDS 47517 - 48704 1054 ## COG0626 Cystathionine beta-lyases/cystathionine gamma-synthases - Prom 48856 - 48915 5.0 + Prom 48691 - 48750 3.0 51 29 Op 1 30/0.000 + CDS 48971 - 49690 788 ## COG0811 Biopolymer transport proteins 52 29 Op 2 . 
+ CDS 49697 - 50122 527 ## COG0848 Biopolymer transport protein + Term 50360 - 50402 8.5 - Term 50348 - 50388 8.0 53 30 Tu 1 . - CDS 50394 - 51278 963 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) - Prom 51438 - 51497 4.1 + Prom 51385 - 51444 3.2 54 31 Tu 1 . + CDS 51469 - 51963 612 ## COG2862 Predicted membrane protein + Term 51971 - 52003 5.3 55 32 Tu 1 . - CDS 52003 - 53043 969 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) - Prom 53184 - 53243 3.3 + Prom 53026 - 53085 1.8 56 33 Tu 1 . + CDS 53200 - 54087 953 ## COG0412 Dienelactone hydrolase and related enzymes + Prom 54114 - 54173 2.5 57 34 Tu 1 . + CDS 54206 - 54493 310 ## ECIAI1_3148 hypothetical protein + Prom 54521 - 54580 3.5 58 35 Op 1 4/0.188 + CDS 54682 - 55800 1132 ## COG1740 Ni,Fe-hydrogenase I small subunit 59 35 Op 2 6/0.000 + CDS 55803 - 56789 999 ## COG0437 Fe-S-cluster-containing hydrogenase components 1 60 35 Op 3 4/0.188 + CDS 56779 - 57957 1530 ## COG5557 Polysulphide reductase 61 35 Op 4 5/0.062 + CDS 57954 - 59657 1871 ## COG0374 Ni,Fe-hydrogenase I large subunit 62 35 Op 5 . + CDS 59657 - 60151 609 ## COG0680 Ni,Fe-hydrogenase maturation factor 63 35 Op 6 . + CDS 60144 - 60632 490 ## SSON_3137 hydrogenase 2-specific chaperone 64 35 Op 7 4/0.188 + CDS 60625 - 60966 317 ## COG0375 Zn finger protein HypA/HybF (possibly regulating hydrogenase expression) 65 35 Op 8 . + CDS 60979 - 61227 370 ## COG0298 Hydrogenase maturation factor - Term 61166 - 61204 2.1 66 36 Tu 1 . - CDS 61350 - 62216 1003 ## COG0625 Glutathione S-transferase - Prom 62278 - 62337 2.7 + Prom 62167 - 62226 3.1 67 37 Tu 1 . + CDS 62421 - 64280 2044 ## COG0754 Glutathionylspermidine synthase - Term 64086 - 64127 1.8 68 38 Tu 1 . - CDS 64372 - 64635 90 ## EcSMS35_3273 hypothetical protein - Prom 64711 - 64770 1.8 + Prom 64308 - 64367 5.2 69 39 Tu 1 . 
+ CDS 64572 - 66071 1365 ## COG0306 Phosphate/sulphate permeases + Term 66078 - 66124 8.1 70 40 Tu 1 . - CDS 66120 - 66812 584 ## B21_02811 hypothetical protein - Prom 66836 - 66895 5.7 + Prom 66813 - 66872 5.4 71 41 Op 1 . + CDS 67001 - 67699 342 ## EcHS_A3165 putative lipoprotein 72 41 Op 2 . + CDS 67731 - 68489 632 ## B21_02809 hypothetical protein 73 41 Op 3 . + CDS 68535 - 69884 839 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 74 41 Op 4 . + CDS 69884 - 70708 589 ## ECO111_3811 hypothetical protein 75 41 Op 5 . + CDS 70720 - 71280 447 ## COG3054 Predicted transcriptional regulator 76 41 Op 6 22/0.000 + CDS 71311 - 72381 968 ## COG0795 Predicted permeases 77 41 Op 7 . + CDS 72378 - 73412 777 ## COG0795 Predicted permeases Predicted protein(s) >gi|296493200|gb|ADTK01000301.1| GENE 1 57 - 1718 1753 553 aa, chain - ## HITS:1 COG:ECs3933 KEGG:ns NR:ns ## COG: ECs3933 COG2268 # Protein_GI_number: 15833187 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 553 1 553 553 811 98.0 0 MDDIVNSVPSWMFTAIIAVCILFIIGIIFARLYRRASAEQAFVRTGLGGQKVVMSGGAIV MPIFHEIIPINMNTLKLEVSRSTIDSLITKDRMRVDVVVAFFVRVKPSVEGIATAAQTLG QRTLSPEDLRMLVEDKFVDALRATAAQMTMHELQDTRENFVQGVQNTVAEDLSKNGLELE SVSLTNFNQTSKEHFNPNNAFDAEGLTKLTQETERRRRERNEVEQDVEVAVREKNRDALS RKLEIEQQEAFMTLEQEQQVKTRTAEQNARIAAFEAERRREAEQTRILAERQIQETEIER EQAVRSRKVEAEREVRIKEIEQQQVTEIANQTKSIAIAAKSEQQSQAEARANLALAEAVS AQQNVETTRQTAEADRAKQVALIAAAQDAETKAVELTVRAKAEKEAAEMQAAAIVELAEA TRKKGLAEAEAQRALNDAINVLSDEQTSLKFKLALLQALPAVIEKSVEPMKSIDGIKIIQ VDGLNRGSAAGDANTGNVGGGNLAEQALSAALSYRTQAPLIDSLLNEIGVSGGSLAALTS PLTSTTPVAENVE >gi|296493200|gb|ADTK01000301.1| GENE 2 1745 - 2374 167 209 aa, chain - ## HITS:1 COG:no KEGG:ECO103_3727 NR:ns ## KEGG: ECO103_3727 # Name: yqiJ # Def: putative inner membrane protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 209 1 209 209 409 100.0 1e-113 
MILFADYNTPYLFAISFVLLIGLLEILALICGHMLSGALDAHLDHYDSITTGHISQALHY LNIGRLPALVVLCLLAGFFGLIGILLQHACIMVWQSPLSNLFVVPVSLLFTIIAVHYTGK IVAPWIPRDHSSAITEEEYIGSMALITGHQATSGNPCEGKLTDQFGQIHYLLLEPEEGKF FTKGDKVLIICRLSATRYLAENNPWPQIL >gi|296493200|gb|ADTK01000301.1| GENE 3 2643 - 2843 200 66 aa, chain + ## HITS:1 COG:no KEGG:SSON_3186 NR:ns ## KEGG: SSON_3186 # Name: glgS # Def: glycogen synthesis protein GlgS # Organism: S.sonnei # Pathway: not_defined # 1 66 1 66 66 126 100.0 2e-28 MDHSLNSLNNFDFLARSFARMHAEGRPVDILAVTGNMDEEHRTWFCARYAWYCQQMMQAR ELELEH >gi|296493200|gb|ADTK01000301.1| GENE 4 2886 - 3950 427 354 aa, chain - ## HITS:1 COG:no KEGG:SSON_3184 NR:ns ## KEGG: SSON_3184 # Name: yqiI # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 354 1 354 354 689 99.0 0 MRYLLIVIIFIIGFSALPVWAMDCYAEHEGGNTVVIGYVPRISIPSNGKKGDKIWQSSEY FMNVFCNNALPAPSPGEEYPSAWTNIMMFLPGGQDFYNQNSYIFGVTYNGVDYDSTAPNA IAAPECIDIKGAGTFNNHYKNPAVCSGGPEPQLSVTFPARVQLYIKLAKNANRVNKNMVL PDEYIALEFKGMSGAGAIEVDKNLTFRIRGLNNIHVLDCFVNVALEPADGVVDFGKINSR TIKNTSVSETFSVVMTKDPGAACTEQFNILGSFFTTDILSDYSHLDMGNGLLLKIFHKDG TATEFNRFSQFASFSSSSAPSVTAPFRAELSANPAETVVEGPFSKDVILKITYN >gi|296493200|gb|ADTK01000301.1| GENE 5 3952 - 4701 471 249 aa, chain - ## HITS:1 COG:yqiH KEGG:ns NR:ns ## COG: yqiH COG3121 # Protein_GI_number: 16130943 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli K12 # 1 249 4 252 252 480 93.0 1e-135 MRYLNTKNLIAAGVLLACMSSIAWGAIIPDRTRIIMNESDKGEALKLTNQSKNLPYLAQT WIEDTKGNKSRDFIVTVPPMVRLNPSEQIQIRMISQEKIAQLPNDRETLFYFNVREIPPK TDKKNVMQVTMQHALKLFWRPKAIELEDDGVMSYEKVEIIRRNDGSIRFNNKMPYHVTLG YIGTNGITMLPQTQSLMVTPYSHADTQFKNVPSAFQVGYINDFGGLSFYEINCPTVNNSC NVSVAKRDK >gi|296493200|gb|ADTK01000301.1| GENE 6 4717 - 5370 356 217 aa, chain - ## HITS:1 COG:yqiG KEGG:ns NR:ns ## COG: yqiG COG3188 # Protein_GI_number: 16130942 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and 
vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli K12 # 1 217 605 821 821 417 97.0 1e-117 MKASLRASYQHNTENGRLYLSGTSQRDSYYSLNASWNGSFTATRHGAAFHDYSGSADSRF MIDADGAEDIPLNNKRAVTNRYGIGVIPSVSSYITTSLSVDTRNLPENVDIENSVITTTL TEGAIGYAKLDTRKGYQIMGVIRLADGSHPPLGISVKDKTSHKELGLVADGGFVYLNGIQ DDSKLTLRWGDKSCFIQPPNSSNLTTGTVILPCISQN >gi|296493200|gb|ADTK01000301.1| GENE 7 5410 - 7230 819 606 aa, chain - ## HITS:1 COG:yqiG KEGG:ns NR:ns ## COG: yqiG COG3188 # Protein_GI_number: 16130942 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli K12 # 26 606 10 590 821 1114 97.0 0 MDQMYKKLKLTTISELIKNIYCSLSVLIIGCASAYAVEFNKDLIEAEDRENVNLSQFETD GQLPVGKYSLNALINNKRTPIHLDLQWVLIDNQTAVCLTPEQLTLLGFTDEIIEEAQQNL IDGCYPIEKEKQITTYLDKGKMQLSISAPQAWLKYKDANWTPPELWDHGIAGAFLDYNLY ASHYAPHQGDNSQNISSYGQAGVNLGAWRLRTDYQYDQSFNNGKSQANNLDFPRIYLFRP IPAINAKLTIGQYDTESSIFDSFHFSGVSLKSDENMLPPDLRGYAPQITGVAQTNAKVTV SQNNRIIYQENVPPGPFAITNLFNTLQGQLDVKVEEEDGQVTQWQVASNSIPYLTRKGQI RYTTAMGKPTSVGGDSLQQPFFWTGEFSWGWLNNVSLYGGSVLTNRDYQSLAAGVGFNLN SLGSLSFDVTRSDAQLHNQDKETGYSYRANYSKRFESTGSQLTFAGYRFSDKNFVTMNEY INDTNHYTNYQDEKESYIVTFNQYLESLRLNTYVSLARNTYWDASSNVNYSLSLSRDFDI GPLKNVSTSLTFSRINWEEDNQDQLYLNISIPWGTSRTLSYGMQRNQDNKISHTASWYDS SDRNNS >gi|296493200|gb|ADTK01000301.1| GENE 8 7281 - 7832 468 183 aa, chain - ## HITS:1 COG:ygiL KEGG:ns NR:ns ## COG: ygiL COG3539 # Protein_GI_number: 16130939 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 183 1 183 183 337 99.0 7e-93 MSAFKKSLLVAGVAMILSNNVFADEGHGIVKFKGEVISAPCSIKPGDEDLTVNLGEVADT VLKSDQKSLAEPFTIHLQDCMLSQGGTTYSKAKVTFTTANTMTGQTDLLKNTKETEIGGA TGVGVRILDSQSGEVTLGTPVVITFNNTNSYQELNFKARMESPSKDATPGNVYAQADYKI AYE >gi|296493200|gb|ADTK01000301.1| GENE 9 8115 - 8405 319 96 aa, chain - ## HITS:1 COG:yqiC KEGG:ns NR:ns 
## COG: yqiC COG2960 # Protein_GI_number: 16130938 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 96 21 116 116 134 98.0 4e-32 MIDPKKIEQIARQVHESMPKGIREFGEDVEKKIRQTLQAQLTRLDLVSREEFDVQTQVLL RTREKLALLEQRMSELENRSTEIKKQPDPETLPPTL >gi|296493200|gb|ADTK01000301.1| GENE 10 8779 - 9432 781 217 aa, chain + ## HITS:1 COG:ECs3929 KEGG:ns NR:ns ## COG: ECs3929 COG0108 # Protein_GI_number: 15833183 # Func_class: H Coenzyme transport and metabolism # Function: 3,4-dihydroxy-2-butanone 4-phosphate synthase # Organism: Escherichia coli O157:H7 # 1 217 1 217 217 412 99.0 1e-115 MNQTLLSSFGTPFERVENALAALREGRGVMVLDDEDRENEGDMIFPAETMTVDQMALTIR HGSGIVCLCITEDRRKQLDLPMMVENNTSAYGTGFTVTIEAAEGVTTGVSAADRITTVRA AIADGAKPSDLNRPGHVFPLRAQAGGVLTRGGHTEATIDLMTLAGFKPAGVLCELTNDDG TMARAPECIEFANKHNMALVTIEDLVAYRQAHERKAS >gi|296493200|gb|ADTK01000301.1| GENE 11 9693 - 9863 242 56 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_3528 NR:ns ## KEGG: ECUMN_3528 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 56 9 64 64 102 100.0 5e-21 MFIAWYWIVLIALVVVGYFLHLKRYCRAFRQDRDALLEARNKYLNSTREETAEKVE >gi|296493200|gb|ADTK01000301.1| GENE 12 9921 - 10694 897 257 aa, chain - ## HITS:1 COG:ECs3928 KEGG:ns NR:ns ## COG: ECs3928 COG0428 # Protein_GI_number: 15833182 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted divalent heavy-metal cations transporter # Organism: Escherichia coli O157:H7 # 1 257 1 257 257 389 99.0 1e-108 MSVPLILTILAGAATFIGAFLGVLGQKPSNRLLAFSLGFAAGIMLLISLMEMLPAALAAE GMSPVLGYGMFIFGLLGYFGLDRMLPHAHPQDLMQKSVQPLPKSIKRTAILLTLGISLHN FPEGIATFVTASSNLELGFGIALAVALHNIPEGLAVAGPGYAATGSKRTAILWAGISGLA EILGGVLAWLILGSMISPVVMAAIMAAVAGIMVALSVDELMPLAKEIDPNNNPSYGVLCG MSVMGFSLVLLQTAGIG >gi|296493200|gb|ADTK01000301.1| GENE 13 10837 - 11625 834 262 aa, chain + ## HITS:1 COG:ygiD KEGG:ns NR:ns ## COG: ygiD COG3384 # Protein_GI_number: 16130935 # Func_class: S Function unknown # 
Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 262 10 271 271 505 98.0 1e-143 MSSTRMPALFLGHGSPMNVLEDNLYTRSWQKLGMTLPRPQAIVVVSAHWFTRGTGVTAME TPPTIHDFGGFPQALYDTHYPAPGSPVLAQHLVELLAPVPVTLDKEAWGFDHGSWGVLIK MYPDADIPMVQLSIDSSKPAAWHFEMGRKLAALRDEGIMLVASGNVVHNLRTVKWHGDSS PYPWATSFNEYVKANLTWQGPVEQHPLVNYLDHEGGTLSNPTPEHYLPLLYVLGAWDGQE PITIPVEGIEMGSLSMLSVQVG >gi|296493200|gb|ADTK01000301.1| GENE 14 11663 - 12823 1139 386 aa, chain - ## HITS:1 COG:ECs3926 KEGG:ns NR:ns ## COG: ECs3926 COG0754 # Protein_GI_number: 15833180 # Func_class: E Amino acid transport and metabolism # Function: Glutathionylspermidine synthase # Organism: Escherichia coli O157:H7 # 1 386 1 386 386 791 100.0 0 MERVSITERPDWREKAHEYGFNFHTMYGEPYWCEDAYYKLTLAQVEKLEEVTAELHQMCL KVVEKVIASDELMTKFRIPKHTWSFVRQSWLTHQPSLYSRLDLAWDGTGEPKLLENNADT PTSLYEAAFFQWIWLEDQLNAGNLPEGSDQFNSLQEKLIDRFVELREQYGFQLLHLTCCR DTVEDRGTIQYLQDCATEAEIATEFLYIDDIGLGEKGQFTDLQDQVISNLFKLYPWEFML REMFSTKLEDAGVRWLEPAWKSIISNKALLPLLWEMFPNHPNLLPAYFAEDDHPQMEKYV VKPIFSREGANVSIIENGKTIEAAEGPYGEEGMIVQQFHPLPKFGDSYMLIGSWLVNDQP AGIGIREDRALITQDMSRFYPHIFVE >gi|296493200|gb|ADTK01000301.1| GENE 15 12829 - 13404 399 191 aa, chain - ## HITS:1 COG:ygiB KEGG:ns NR:ns ## COG: ygiB COG5463 # Protein_GI_number: 16130933 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Escherichia coli K12 # 1 191 44 234 234 301 100.0 4e-82 MLAGCEKSDETVSLYQNADDCSAANPGKSAECTTAYNNALKEAERTAPKYATREDCVAEF GEGQCQQAPAQAGMAPENQAQAQQSSGSFWMPLMAGYMMGRLMGGGAGFAQQPLFSSKNP ASPAYGKYTDATGKNYGAAQPGRTMTVPKTAMAPKPATTTTVTRGGFGESVAKQSTMQRS ATGTSSRSMGG >gi|296493200|gb|ADTK01000301.1| GENE 16 13648 - 15129 1684 493 aa, chain - ## HITS:1 COG:tolC KEGG:ns NR:ns ## COG: tolC COG1538 # Protein_GI_number: 16130931 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Escherichia coli K12 # 1 493 3 495 495 801 100.0 0 
MKKLLPILIGLSLSGFSSLSQAENLMQVYQQARLSNPELRKSAADRDAAFEKINEARSPL LPQLGLGADYTYSNGYRDANGINSNATSASLQLTQSIFDMSKWRALTLQEKAAGIQDVTY QTDQQTLILNTATAYFNVLNAIDVLSYTQAQKEAIYRQLDQTTQRFNVGLVAITDVQNAR AQYDTVLANEVTARNNLDNAVEQLRQITGNYYPELAALNVENFKTDKPQPVNALLKEAEK RNLSLLQARLSQDLAREQIRQAQDGHLPTLDLTASTGISDTSYSGSKTRGAAGTQYDDSN MGQNKVGLSFSLPIYQGGMVNSQVKQAQYNFVGASEQLESAHRSVVQTVRSSFNNINASI SSINAYKQAVVSAQSSLDAMEAGYSVGTRTIVDVLDATTTLYNAKQELANARYNYLINQL NIKSALGTLNEQDLLALNNALSKPVSTNPENVAPQTPEQNAIADGYAPDSPAPVVQQTSA RTTTSNGHNPFRN >gi|296493200|gb|ADTK01000301.1| GENE 17 15334 - 15963 619 209 aa, chain + ## HITS:1 COG:ECs3922 KEGG:ns NR:ns ## COG: ECs3922 COG0494 # Protein_GI_number: 15833176 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli O157:H7 # 1 209 1 209 209 402 99.0 1e-112 MLKPDNLPVTLGKNDVEIIARETLYRGFFSLDLYRFRHRLFNGQMSHEVRREIFERGHAA VLLPFDPVRDEVVLIEQIRIAAYDTSETPWLLEMVAGMIEEGESVEDVARREAIEEAGLI VKRTKPVLSFLASPGGTSERSSIMVGEVDATTASGIHGLADENEDIRVHVVSREQAYQWV EEGKIDNAASVIALQWLQLHHQALKNEWA >gi|296493200|gb|ADTK01000301.1| GENE 18 15964 - 16386 181 140 aa, chain + ## HITS:1 COG:ECs3921 KEGG:ns NR:ns ## COG: ECs3921 COG3151 # Protein_GI_number: 15833175 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 140 1 140 140 274 100.0 4e-74 MKRYTPDFPEMMRLCEMNFSQLRRLLPRNDAPGETVSYQVANAQYRLTIVESTRYTTLVT IEQTAPAISYWSLPSMTVRLYHDAMVAEVCSSQQIFRFKARYDYPNKKLHQRDEKHQINQ FLADWLRYCLAHGAMAIPVY >gi|296493200|gb|ADTK01000301.1| GENE 19 16411 - 17238 733 275 aa, chain + ## HITS:1 COG:ECs3920 KEGG:ns NR:ns ## COG: ECs3920 COG1409 # Protein_GI_number: 15833174 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Escherichia coli O157:H7 # 1 275 1 275 275 528 100.0 1e-150 MESLLTLPLAGEARVRILQITDTHLFAQKHEALLGVNTWESYQAVLEAIRPHQHEFDLIV 
ATGDLAQDQSSAAYQHFAEGIASFRAPCVWLPGNHDFQPAMYSALQDAGISPAKRVFIGE QWQILLLDSQVFGVPHGELSEFQLEWLERKLADAPERHTLLLLHHHPLPAGCSWLDQHSL RNAGELDTVLAKFPHVKYLLCGHIHQELDLDWNGRRLLATPSTCVQFKPHCSNFTLDTIA PGWRTLELHADGTLTTEVHRLADTRFQPDTASEGY >gi|296493200|gb|ADTK01000301.1| GENE 20 17238 - 17819 477 193 aa, chain + ## HITS:1 COG:ECs3919 KEGG:ns NR:ns ## COG: ECs3919 COG3150 # Protein_GI_number: 15833173 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Escherichia coli O157:H7 # 1 193 1 193 193 390 100.0 1e-109 MSTLLYLHGFNSSPRSAKASLLKNWLAEHHPDVEMIIPQLPPYPSDAAELLESIVLEHGG DSLGIVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESR HIYDLKVMQIDPLEAPDLIWLLQQTGDEVLDYRQAVAYYASCRQTVIEGGNHAFTGFEDY FNPIVDFLGLHHL >gi|296493200|gb|ADTK01000301.1| GENE 21 17848 - 19740 1948 630 aa, chain + ## HITS:1 COG:parE KEGG:ns NR:ns ## COG: parE COG0187 # Protein_GI_number: 16130926 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Escherichia coli K12 # 1 630 1 630 630 1300 100.0 0 MTQTYNADAIEVLTGLEPVRRRPGMYTDTTRPNHLGQEVIDNSVDEALAGHAKRVDVILH ADQSLEVIDDGRGMPVDIHPEEGVPAVELILCRLHAGGKFSNKNYQFSGGLHGVGISVVN ALSKRVEVNVRRDGQVYNIAFENGEKVQDLQVVGTCGKRNTGTSVHFWPDETFFDSPRFS VSRLTHVLKAKAVLCPGVEITFKDEINNTEQRWCYQDGLNDYLAEAVNGLPTLPEKPFIG NFAGDTEAVDWALLWLPEGGELLTESYVNLIPTMQGGTHVNGLRQGLLDAMREFCEYRNI LPRGVKLSAEDIWDRCAYVLSVKMQDPQFAGQTKERLSSRQCAAFVSGVVKDAFILWLNQ NVQAAELLAEMAISSAQRRMRAAKKVVRKKLTSGPALPGKLADCTAQDLNRTELFLVEGD SAGGSAKQARDREYQAIMPLKGKILNTWEVSSDEVLASQEVHDISVAIGIDPDSDDLSQL RYGKICILADADSDGLHIATLLCALFVKHFRALVKHGHVYVALPPLYRIDLGKEVYYALT EEEKEGVLEQLKRKKGKPNVQRFKGLGEMNPMQLRETTLDPNTRRLVQLTIDDEDDQRTD AMMDMLLAKKRSEDRRNWLQEKGDMAEIEV >gi|296493200|gb|ADTK01000301.1| GENE 22 19785 - 20654 363 289 aa, chain - ## HITS:1 COG:no KEGG:Ping_0266 NR:ns ## KEGG: Ping_0266 # Name: not_defined # Def: outer membrane protein # Organism: P.ingrahamii # Pathway: not_defined # 6 282 5 284 293 207 41.0 4e-52 
MNNKFKIILWMMLSALPALSYAQVKNIVTGEYSAVSKIITEKETLYPSDKILLVFDIDNT LLTSGTRIGGDIWYQWQTDKLPLKPDDSQKVPCLYENTISMLYELSPMQLTEPQLPTMLK QWQQKHTAFALTSRAPDTLFPTLRELKRNGIDFSSSALRKKEDTLPPFEKGKLKRSWLYS QGVFFSSGQDKGVILDYMLDKMGNKYDAIVFIDDGQANIHAMTKMFSKDKWFSTDFTIIH YTRVEDELIKQQGAVITTAQAEKMATDWKTLAGSLATLFPDRNKLCPIK >gi|296493200|gb|ADTK01000301.1| GENE 23 20668 - 21363 623 231 aa, chain - ## HITS:1 COG:no KEGG:PSPPH_3003 NR:ns ## KEGG: PSPPH_3003 # Name: not_defined # Def: oligogalacturonate-specific porin protein KdgM # Organism: P.syringae_phaseolicola # Pathway: not_defined # 12 231 19 239 240 118 34.0 2e-25 MLSCLSPTLVCAEQPHGFIDYRHEYLDDTRTHADRVEFGTFFSNGIGLMGELRYNTDEGD KDKWDPSQFGNNGTGLSVVYRFKPLDDKKFWLEPMFWLDTTQYWSTYEYGLSAGYDFSKE WKVSGRFRYDMDKATDKSKGYGNDDRNNRRYDVWIDYRPQKTNFQYQLNLVYYNNGYLTW NDGHKNYTASFKVGYKIGSWIPYMSIADYKGVDKTSSNRQIRWRWGLTYTF >gi|296493200|gb|ADTK01000301.1| GENE 24 21498 - 22622 426 374 aa, chain - ## HITS:1 COG:all3208 KEGG:ns NR:ns ## COG: all3208 COG4222 # Protein_GI_number: 17230700 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. 
PCC 7120 # 15 365 30 373 394 149 32.0 1e-35 MKIRYAAIFLALPLLCTTAQANSFLHYAGEFNFRTGEKQNNVTIAGLSAITFDAKKNIFY AINDSRNNNKEGDASLYTLKIQVSSRGIEQVNFLNQRPLLDAKQQPFVTNTVDAEGLALT HNGKSLLWSSELGAPLRLSTLDGVMEKDFTSLFPARFNISSDKESSNGIRSGNAWEGLTV TPDGKSLFIAVESSLKQDGPIASPINSGTARLLQFSIDADGRPSKQLHEYLYITDPVPQV SKFGINDNGVSEVLALNDHQLLVIERSGRNVSAGFNDWDYSVRVYMVDLTAASDIKNIDS LQDWSNKSTLQPVSKKLLIDFADYTSSADCIEGVTFGPLIDGHTSLIFVSDNNFQPHQQT KFYLFIDKENKLKI >gi|296493200|gb|ADTK01000301.1| GENE 25 22650 - 23564 541 304 aa, chain - ## HITS:1 COG:VCA0131 KEGG:ns NR:ns ## COG: VCA0131 COG0524 # Protein_GI_number: 15600902 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Vibrio cholerae # 6 303 4 305 306 111 29.0 2e-24 MKKFDVAALGSGNIDIFLSIPSLPTRGGKVVGTRLGEQAGGTVANSACAMGQLGLNVVSV SCVGNDHSASIILDGFKKYHVNCDFVQVIPELIANTAIIFIDELGEKTLVYSPGSDHEWD KEKALQAIAQSRYFYTMPADIEKFRMLAEYSRSQMTKVVVDIEPHIIATPEHLARILHLA DIAIFNYDGFIRGYATEPDFTLLHDIQDEYQLDAVVVTLDVRGVIAVKGNEQVKIGSYTI PVIDTTGAGDTFNGAFVYSLIKNMPLIEALKFASATAAINITALGARGHLPTPSEVNLFL SQHD >gi|296493200|gb|ADTK01000301.1| GENE 26 23569 - 24561 180 330 aa, chain - ## HITS:1 COG:no KEGG:RD1_2418 NR:ns ## KEGG: RD1_2418 # Name: not_defined # Def: glucosamine--fructose-6-phosphate aminotransferase, putative # Organism: R.denitrificans # Pathway: not_defined # 40 313 78 349 388 71 26.0 6e-11 MPARTPVMPLAQTREEINATAASLRHLAAIWSECFSTFQLDKRAVERVCIVGSGDSWTVA LCAAAWLGKYTHLFCYALQTWDFLQTDLTRYQKETLIIILSASGRPSMTVDALCHAVCSN AQVLGVTNCPGTPFCAITENMLYTWANKQGIPTQSSSVTLYALLRLAQKLCPGLTPLQIE EDLEGKFSQINQHWQQKKRHFYQQKEITFLGSGLSWGLAISGSNLLSCGPQIRATALPLE EFYHSLRLHQAGVGQHYILLPATSCESPFYLATQKAIVEQGATAELISFIPDASESNNLF LVMQWLYEMCWHLSCDYVDAGGQRVSHREK >gi|296493200|gb|ADTK01000301.1| GENE 27 24567 - 25643 617 358 aa, chain - ## HITS:1 COG:STM4067 KEGG:ns NR:ns ## COG: STM4067 COG1397 # Protein_GI_number: 16767333 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ADP-ribosylglycohydrolase # Organism: 
Salmonella typhimurium LT2 # 11 345 6 341 346 168 37.0 2e-41 MNKIEQAVIYNKILGSLACAGMGDALGAATELYSIDEIKAQWGGFLNAFVSPPADTFAGS LNGIAGLITDDSSQMYVFSEGLVEAGFDNFTNNDWLACLLRWADMQPYANYKGPTTEQIV KALKEGRPTNTIGRIGTSSRQAPNVGTTNGAGMRVAPAGLIWPGKKEKACHLALITCLPS HDTNIAIASACAIAAATSQAMLPESSLTSLLDAAIWGANYGERLAKQYARCVAGPSIAMR IQLAADIARRANDLESCLREMEGLVGNSVAAHESIPAAIGLLLYCKGEPWETIHACANIG NDTDSIATMAGAIAGAWRGFDALPEDKYAFFRAVNNKDFDIEAIASGLTLLALQAQEK >gi|296493200|gb|ADTK01000301.1| GENE 28 25657 - 26409 459 250 aa, chain - ## HITS:1 COG:BH0419 KEGG:ns NR:ns ## COG: BH0419 COG2188 # Protein_GI_number: 15612982 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 16 239 13 234 240 107 32.0 2e-23 MIRIYKASKGQLKHREIFDEIKKKIDNGEWKEGKAIPTERELAEHFLSSRTTIRKAIESL KQRGYLHSVHGQGTFTLPARSRENHLLHSFTDDIKARGGVPAQNILEIGYIPLSDIIRKN LELDIHVHTVFCIKRIRYMGSTPLGIQTSWLALNDGQEITQEELLATGSLYILLEEKLAI KPLEATEIISARLPSPLERQLLELTADDVVLSCTRVTLSVERKPMEYVEMVYPASRYSYK ARITKDSFSV >gi|296493200|gb|ADTK01000301.1| GENE 29 26368 - 26625 112 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|300905841|ref|ZP_07123574.1| ## NR: gi|300905841|ref|ZP_07123574.1| hypothetical protein HMPREF9536_03833 [Escherichia coli MS 84-1] # 21 85 1 65 65 106 98.0 5e-22 MLKLTFTGFINSYHDKASFQVVDTNSTFFYFYCAVCFFVICDIKQKKPHLLLKIIKLNEI GFYVMLLNINSVVCCGIYQCIYSHS >gi|296493200|gb|ADTK01000301.1| GENE 30 26696 - 28027 735 443 aa, chain + ## HITS:1 COG:STM4065 KEGG:ns NR:ns ## COG: STM4065 COG2211 # Protein_GI_number: 16767331 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Salmonella typhimurium LT2 # 1 438 1 439 444 506 59.0 1e-143 MKLTINEKIGFGAGDMAIAIVMMSLAMIITYFYTDVFGLKPVDLGILLFSVRILDAVIDP VVGTMTDRTNTRWGKYRPWLLFMSIPFGISIWLMFTTPDTDYSVKLLWAWATYVLLTLTY TLIAIPYVSLISVITDDPQERLSANGYRFVMTKIAMFAVTIIVPLSAMYLGKNNVKLGYQ IAMGAMGILATCLCLYCFFSVRERIYHPKPSLGMAAQFRLLIKNDQWLILGAVIAIIMFG GIIRNSVAAYYAKYYLNGGDALISPFLTSGVIASVLAMIACTWLTRLYDKIKIFRYTQLL 
AFVVGGAMYFVQPNSIVLAFTFYIVVTFLTDIQLPIYWASIAESVDYGEMKTGQRVSGLA FGGILFFQKFGMGLAGGFIGIALAYLDYQPGVEQTPQALWGICLLMTIFPAILNLITGLI MRFYLINNEFYEQIKARLQTAEE >gi|296493200|gb|ADTK01000301.1| GENE 31 28071 - 28385 395 104 aa, chain - ## HITS:1 COG:ECs3911 KEGG:ns NR:ns ## COG: ECs3911 COG1359 # Protein_GI_number: 15833165 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 104 1 104 104 199 100.0 9e-52 MLTVIAEIRTRPGQHHRQAVLDQFAKIVPTVLKEEGCHGYAPMVDCAAGVSFQSMAPDSI VMIEQWESIAHLEAHLQTPHMKAYSEAVKGDVLEMNIRILQPGI >gi|296493200|gb|ADTK01000301.1| GENE 32 28416 - 28997 670 193 aa, chain - ## HITS:1 COG:ECs3910 KEGG:ns NR:ns ## COG: ECs3910 COG2249 # Protein_GI_number: 15833164 # Func_class: R General function prediction only # Function: Putative NADPH-quinone reductase (modulator of drug activity B) # Organism: Escherichia coli O157:H7 # 1 193 1 193 193 400 100.0 1e-112 MSNILIINGAKKFAHSNGQLNDTLTEVADGTLRDLGHDVRIVRADSDYDVKAEVQNFLWA DVVIWQMPGWWMGAPWTVKKYIDDVFTEGHGTLYASDGRTRKDPSKKYGSGGLVQGKKYM LSLTWNAPMEAFTEKDQFFHGVGVDGVYLPFHKANQFLGMEPLPTFIANDVIKMPDVPRY TEEYRKHLVEIFG >gi|296493200|gb|ADTK01000301.1| GENE 33 29107 - 30456 1160 449 aa, chain - ## HITS:1 COG:ygiY KEGG:ns NR:ns ## COG: ygiY COG0642 # Protein_GI_number: 16130922 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 449 1 449 449 848 99.0 0 MKFTQRLSLRVRLTLIFLILASVTWLLSSFVAWKQTTDNVDELFDTQLMLFAKRLSTLDL NEINAADRMAQTPNRLKHGHVDDDALTFAIFTHDGRMVLNDGDNGEDIPYSYQREGFADG QLVGEDDPWRFVWMTSPDGKYRIVVGQEWEYREDMALAIVAGQLIPWLVALPIMLIIMMV LLGRELAPLNKLALALRMRDPDSEKPLNATGVPSEVRPLVESLNQLFARTHAMMVRERRF TSDAAHELRSPLTALKVQTEVAQLSDDDPQARKKALLQLHSGIDRATRLVDQLLTLSRLD SLDNLQGVAEIPLEDLLQSSVMDIYHTAQQAKIDVRLTLNVQGIKRTGQPLLLSLLVRNL LDNAVRYSPQGSVVDVTLNADNFIVRDNGPGVTPEALARIGERFYRPPGQTATGSGLGLS IVQRIAKLHGMNVEFGNAEQGGFEAKVSW >gi|296493200|gb|ADTK01000301.1| GENE 34 30453 - 31085 792 210 aa, chain - ## HITS:1 COG:ECs3907 
KEGG:ns NR:ns ## COG: ECs3907 COG0745 # Protein_GI_number: 15833161 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 210 10 219 219 404 99.0 1e-113 MLIGDGIKTGLSKMGFSVDWFTQGRQGKEALYSAPYDAVILDLTLPGMDGRDILREWREK GQREPVLILTARDALAERVEGLRLGADDYLCKPFALIEVAARLEALMRRTNGQASNELRH GNVMLDPDKRIATLAGEPLTLKPKEFALLELLMRNAGRVLPRKLIEEKLYTWDEEVTSNA VEVHVHHLRRKLGSDFIRTVHGIGYTLGEK >gi|296493200|gb|ADTK01000301.1| GENE 35 31264 - 31656 543 130 aa, chain + ## HITS:1 COG:ECs3906 KEGG:ns NR:ns ## COG: ECs3906 COG3111 # Protein_GI_number: 15833160 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 130 1 130 130 243 100.0 5e-65 MKKFAAVIAVMALCSAPVMAAEQGGFSGPSATQSQAGGFQGPNGSVTTVESAKSLRDDTW VTLRGNIVERISDDLYVFKDASGTINVDIDHKRWNGVTVTPKDTVEIQGEVDKDWNSVEI DVKQIRKVNP >gi|296493200|gb|ADTK01000301.1| GENE 36 31709 - 32191 441 160 aa, chain + ## HITS:1 COG:ECs3905 KEGG:ns NR:ns ## COG: ECs3905 COG3449 # Protein_GI_number: 15833159 # Func_class: L Replication, recombination and repair # Function: DNA gyrase inhibitor # Organism: Escherichia coli O157:H7 # 1 160 1 160 160 332 98.0 3e-91 MTNLTLDVNIIDFPSIPVAMLPHRCSPELLNYSVAKFIMWRKETGLSPVNQSQTFGVAWD DPATTAPEAFRFDICGSVSEPIPDNRYGVSNGELTGGRYAVARHVGELDDISHTIWGIIR HWLPASGEKMRKAPILFHYTNLAEGVTEQRLETDVYVPLA >gi|296493200|gb|ADTK01000301.1| GENE 37 32396 - 32692 286 98 aa, chain + ## HITS:1 COG:no KEGG:B21_02844 NR:ns ## KEGG: B21_02844 # Name: ygiU # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 98 1 98 98 193 100.0 2e-48 MEKRTPHTRLSQVKKLVNAGQVRTTRSALLNADELGLDFDGMCNVIIGLSESDFYKSMTT YSDHTIWQDVYRPRLVTGQVYLKITVIHDVLIVSFKEK >gi|296493200|gb|ADTK01000301.1| GENE 38 32694 - 33089 209 131 aa, chain + ## HITS:1 COG:ygiT KEGG:ns NR:ns ## COG: ygiT COG1396 # Protein_GI_number: 16130917 # Func_class: K 
Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 131 1 131 131 263 100.0 6e-71 MKCPVCHQGEMVSGIKDIPYTFRGRKTVLKGIHGLYCVHCEESIMNKEESDAFMAQVKAF RASVNAETVAPEFIVKVRKKLSLTQKEASEIFGGGVNAFSRYEKGNAQPHPSTIKLLRVL DKHPELLNEIR >gi|296493200|gb|ADTK01000301.1| GENE 39 33222 - 34829 1526 535 aa, chain + ## HITS:1 COG:ygiS KEGG:ns NR:ns ## COG: ygiS COG4166 # Protein_GI_number: 16130916 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Escherichia coli K12 # 1 535 1 535 535 1059 99.0 0 MYTRNLLWLVSLVSAAPLYAADVPANTPLAPQQVFRYNNHSDPGTLDPQKVEENTAAQIV LDLFEGLVWMDGEGQVQPAQAERWEILDGGKRYIFHLRSGLQWSDGQPLTAEDFVLGWQR AVDPKTASPFAGYLAQAHINNAAAIVAGKADVTSLGVKATDDRTLEVTLEQPVPWFTTML AWPTLFPVPHHVIAKHGDSWSKPENMVYNGAFVLDQWVVNEKITARKNPKYRDAQHTVLQ QVEYLALDNSVTGYNRYRAGEVDLTWVPAQQIPAIEKSLPGELRIIPRLNSEYYNFNLEK PPFNDVRVRRALYLTVDRQLIAQKVLGLRTPATTLTPPEVKGFSATTFDELQKPMSERVA MAKALLKQAGYDASHPLRFELFYNKYDLHEKTAIALSSEWKKWLGAQVTLRTMEWKTYLD ARRAGDFMLSRQSWDATYNDASSFLNTLKSDSEENVGHWKNAQYDALLNQATQITDATKR NALYQQAEVIINQQTPLIPIYYQPLIKLLKPYVGGFPLHNPQDYVYSKELYIKAH >gi|296493200|gb|ADTK01000301.1| GENE 40 34967 - 37225 2535 752 aa, chain + ## HITS:1 COG:ECs3903 KEGG:ns NR:ns ## COG: ECs3903 COG0188 # Protein_GI_number: 15833157 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Escherichia coli O157:H7 # 1 752 1 752 752 1458 99.0 0 MSDMAERLALHEFTENAYLNYSMYVIMDRALPFIGDGLKPVQRRIVYAMSELGLNASAKF KKSARTVGDVLGKYHPHGDIACYEAMVLMAQPFSYRYPLVDGQGNWGAPDDPKSFAEMRY TESRLSKYSELLLSELGQGTADWVPNFDGTLQEPKMLPARLPNILLNGTTGIAVGMATDI PPHNLREVAQAAIALIDQPKTTLDQLLDIVQGPDYPTEAEIITSRAEIRKIYENGRGSVR MRAVWKKEDGAVVISALPHQVSGARVLEQIAAQMRNKKLPMVDDLRDESDHENPTRLVIV PRSNRVDMDQVMNHLFATTDLEKSYRINLNMIGLDGRPAVKNLLEILSEWLVFRRDTVRR RLNYRLEKVLKRLHILEGLLVAFLNIDEVIEIIRNEDEPKPALMSRFGLTETQAEAILEL 
KLRHLAKLEEMKIRGEQSELEKERDQLQGILASERKMNNLLKKELQADAQAYGDDRRSPL QEREEAKAMSEHDMLPSEPVTIVLSQMGWVRSAKGHDIDAPGLNYKAGDSFKAAVKGKSN QPVVFVDSTGRSYAIDPITLPSARGQGEPLTGKLTLPPGATVDHMLMESDDQKLLMASDA GYGFVCTFNDLVARNRAGKALITLPENAHVMPPVVIEDASDMLLAITQAGRMLMFPVSDL PQLSKGKGNKIINIPSAEAARGEDGLAQLYVLPPQSTLTIHVGKRKIKLRPEELQKVTGE RGRRGTLMRGLQRIDRVEIDSPRRASSGDSEE >gi|296493200|gb|ADTK01000301.1| GENE 41 37459 - 38196 617 245 aa, chain + ## HITS:1 COG:ECs3902 KEGG:ns NR:ns ## COG: ECs3902 COG0204 # Protein_GI_number: 15833156 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Escherichia coli O157:H7 # 1 245 1 245 245 515 100.0 1e-146 MLYIFRLIITVIYSILVCVFGSIYCLFSPRNPKHVATFGHMFGRLAPLFGLKVECRKPAD AESYGNAIYIANHQNNYDMVTASNIVQPPTVTVGKKSLLWIPFFGQLYWLTGNLLIDRNN RTKAHGTIAEVVNHFKKRRISIWMFPEGTRSRGRGLLPFKTGAFHAAIAAGVPIIPVCVS TTSNKINLNRLHNGLVIVEMLPPIDVSQYGKDQVRELAAHCRSIMEQKIAELDKEVAERE AAGKV >gi|296493200|gb|ADTK01000301.1| GENE 42 38271 - 39683 1188 470 aa, chain + ## HITS:1 COG:ECs3901 KEGG:ns NR:ns ## COG: ECs3901 COG2132 # Protein_GI_number: 15833155 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Putative multicopper oxidases # Organism: Escherichia coli O157:H7 # 1 470 1 470 470 930 100.0 0 MSLSRRQFIQASGIALCAGAVPLKASAAGQQQPLPVPPLLESRRGQPLFMTVQRAHWSFT PGTRASVWGINGRYLGPTIRVWKGDDVKLIYSNRLTENVSMTVAGLQVPGPLMGGPARMM SPNADWAPVLPIRQNAATLWYHANTPNRTAQQVYNGLAGMWLVEDEVSKSLPIPNHYGVD DFPVIIQDKRLDNFGTPEYNEPGSGGFVGDTLLVNGVQSPYVEVSRGWVRLRLLNASNSR RYQLQMSDGRPLHVISGDQGFLPAPVSVKQLSLAPGERREILVDMSNGDEVSITCGEAAS IVDRIRGFFEPSSILVSTLVLTLRPTGLLPLVTDSLPMRLLPTEIMAGSPIRSRDISLGD DPGINGQLWDVNRIDVTAQQGTWERWTVRADEPQAFHIEGVMFQIRNVNGAMPFPEDRGW KDTVWVDGQVELLVYFGQPSWAHFPFYFNSQTLEMADRGSIGQLLVNPVP >gi|296493200|gb|ADTK01000301.1| GENE 43 39794 - 42013 2079 739 aa, chain + ## HITS:1 COG:ECs3900 KEGG:ns NR:ns ## COG: ECs3900 COG1032 # Protein_GI_number: 15833154 # Func_class: C Energy production and conversion # Function: Fe-S 
oxidoreductase # Organism: Escherichia coli O157:H7 # 1 739 1 739 739 1516 99.0 0 MSSISLIQPDRDLFSWPQYWAACFGPAPFLPMSREEMDQLGWDSCDIILVTGDAYVDHPS FGMAICGRMLEVQGFRVGIIAQPDWSSKDDFMRLGKPNLFFGVTAGNMDSMINRYTADRR LRHDDAYTPDNVAGKRPDRATLVYTQRCKEAWKDVPVILGGIEASLRRTAHYDYWSDTVR RSVLVDSKADMLMFGNGERPLVEVAHRLAMGEPISEIRDVRNTAIIVKEALPGWSGVDST RLDTPGKIDPIPHPYGEDLPCADNKPVAPKKQEAKAVTVQPPRPKPWEKTYVLLPSFEKV KGDKVLYAHASRILHHETNPGCARALMQKHGDRYVWINPPAIPLSTEEMDSVFALPYKRV PHPAYGNARIPAYEMIRFSVNIMRGCFGGCSFCSITEHEGRIIQSRSEDSIINEIEAIRD TVPGFTGVISDLGGPTANMYMLRCKSPRAEQTCRRLSCVYPDICPHMDTNHEPTINLYRR ARDLKGIKKILIASGVRYDIAVEDPRYIKELATHHVGGYLKIAPEHTEEGPLSKMMKPGM GSYDRFKELFDTYSKQAGKEQYLIPYFISAHPGTRDEDMVNLALWLKKHRFRLDQVQNFY PSPLANSTTMYYTGKNPLAKIGYKSEDVFVPKGDKQRRLHKALLRYHDPANWPLIRQALE AMGKKHLIGSRRDCLVPAPTIEEMREARRQNRNTRPALTKHTPMATQRQTPATAKKASST QSRPVNAGAKKRPKAAVGR >gi|296493200|gb|ADTK01000301.1| GENE 44 42056 - 42313 293 85 aa, chain - ## HITS:1 COG:yqhH KEGG:ns NR:ns ## COG: yqhH COG4238 # Protein_GI_number: 16130912 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Murein lipoprotein # Organism: Escherichia coli K12 # 1 85 1 85 85 149 100.0 2e-36 MKTIFTVGAVVLATCLLSGCVNEQKVNQLASNVQTLNAKIARLEQDMKALRPQIYAAKSE ANRANTRLDAQDYFDCLRCLRMYAE >gi|296493200|gb|ADTK01000301.1| GENE 45 42364 - 43290 648 308 aa, chain - ## HITS:1 COG:no KEGG:EC55989_3430 NR:ns ## KEGG: EC55989_3430 # Name: yqhG # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 308 1 308 308 609 100.0 1e-173 MKIILLFLAALASFTVHAQPPSLTVEQTVRHIYQNYKSDATAPYFGETGERAITSARIQQ ALTLNDNLTLPGNIGWLDYDPVCDCQDFGDLVLESVAITQTDADHADAVVRFRIFKDDKE KTTQTLKMVAENGRWVIDDIVSNHGSVLQAVNSENEKTLAALASLQKEQPEAFVAELFEH IADYSWPWTWVVSDSYRQAVNAFYKTTFKTANNPDEDMQIERQFIYDNPICFGEESLFSR VDEIRVLEKTADSARIHVRFTLTNGNNEEQELVLQRREGKWEIADFIRPNSGSLLKQIEA KTAARLKQ >gi|296493200|gb|ADTK01000301.1| GENE 46 43490 - 44317 891 275 aa, chain - ## HITS:1 COG:STM3165 KEGG:ns NR:ns ## COG: STM3165 COG0656 # Protein_GI_number: 16766465 # Func_class: R General 
function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Salmonella typhimurium LT2 # 1 275 1 275 275 497 92.0 1e-141 MANPTVIKLQDGNVMPQLGLGVWQASNEEVITAIQKALEVGYRSIDTAAAYKNEEGVGKA LKNASVNREELFITTKLWNDDHKRPREALLDSLKKLQLDYIDLYLMHWPVPAIDHYVEAW KGMIELQKEGLIKSIGVCNFQIHHLQRLIDETGVTPVINQIELHPLMQQRQLHAWNATHK IQTESWSPLAQGGKGVFDQKVIRDLADKYGKTPAQIVIRWHLDSGLVVIPKSVTPSRIAE NFDVWDFRLDKDELGEIAKLDQGKRLGPDPDQFGG >gi|296493200|gb|ADTK01000301.1| GENE 47 44422 - 45585 1501 387 aa, chain - ## HITS:1 COG:yqhD KEGG:ns NR:ns ## COG: yqhD COG1979 # Protein_GI_number: 16130909 # Func_class: C Energy production and conversion # Function: Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family # Organism: Escherichia coli K12 # 1 387 1 387 387 779 99.0 0 MNNFNLHTPTRILFGKGAIAGLREQIPHDARVLITYGGGSVKKTGVLDQVLDALKGMDVL EFGGIEPNPAYETLMNAVKLVREQKVTFLLAVGGGSVLDGTKFIAAAANYPENIDPWHIL QTGGKEIKSAIPMGCVLTLPATGSESNAGAVISRKTTGDKQAFHSAHVQPVFAVLDPVYT YTLPPRQVANGVVDAFVHTVEQYVTKPVDAKIQDRFAEGILLTLIEDGPKALKEPENYDV RANVMWGATQALNGLIGAGVPQDWATHMLGHELTAMHGLDHAQTLAIVLPALWNEKRETK RAKLLQYAERVWNITEGSDDERIDAAIAATRNFFEQLGVPTHLSDYGLDGSSIPALLKKL EEHGMTQLGENHDITLDVSRRIYEAAR >gi|296493200|gb|ADTK01000301.1| GENE 48 45779 - 46678 854 299 aa, chain + ## HITS:1 COG:yqhC KEGG:ns NR:ns ## COG: yqhC COG2207 # Protein_GI_number: 16130908 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 299 77 375 375 594 99.0 1e-170 MKREEICRLLADKVNKLKIKENSLSELLPDVRLLYGETPFARTPVMYEPGIIILFSGHKI GYINERVFRYDANEYLLLTVPLPFECETYATSEVPLAGLRLNVDILQLQELLMDIGEDEH FQPSMAASGINSATLSEEILCAAERLLDVMERPLDARILGKQIIREILYYVLTGPCGGAL LALVSRQTHFSLISRVLKRIENKYTENLSVEQLAAEANMSVSAFHHNFKSVTSTSPLQYL KNYRLHKARMMIIHDGMKASAAAMRVGYESASQFSREFKRYFGVTPGEDAARMRAMQGN >gi|296493200|gb|ADTK01000301.1| GENE 49 46718 - 47377 575 219 aa, chain - ## HITS:1 COG:ECs3893 KEGG:ns NR:ns ## COG: ECs3893 COG0586 # Protein_GI_number: 15833147 # 
Func_class: S Function unknown # Function: Uncharacterized membrane-associated protein # Organism: Escherichia coli O157:H7 # 1 219 1 219 219 384 100.0 1e-106 MAVIQDIIAALWQHDFAALADPHIVSVVYFVMFATLFLENGLLPASFLPGDSLLILAGAL IAQGVMDFLPTIAILTAAASLGCWLSYIQGRWLGNTKTVKGWLAQLPAKYHQRATCMFDR HGLLALLAGRFLAFVRTLLPTMAGISGLPNRRFQFFNWLSGLLWVSVVTSFGYALSMIPF VKRHEDQVMTFLMILPIALLTAGLLGTLFVVIKKKYCNA >gi|296493200|gb|ADTK01000301.1| GENE 50 47517 - 48704 1054 395 aa, chain - ## HITS:1 COG:ECs3892 KEGG:ns NR:ns ## COG: ECs3892 COG0626 # Protein_GI_number: 15833146 # Func_class: E Amino acid transport and metabolism # Function: Cystathionine beta-lyases/cystathionine gamma-synthases # Organism: Escherichia coli O157:H7 # 1 395 1 395 395 814 99.0 0 MADKKLDTQLVNAGRSKKYTLGAVNSVIQRASSLVFESMEAKKHATRNRANGELFYGRRG TLTHFSLQQAMCELEGGAGCALFPCGAAAVANSILAFVEQGDHVLMTNTAYEPSQDFCSK ILSKLGVTTSWFDPLIGADIVKHLQPNTKIVFLESPGSITMEVHDVPAIVAAVRSVVPDA IIMIDNTWAAGVLFKALDFGIDVSIQAATKYLVGHSDAMIGTAVCNARCWEQLRENAYLM GQMVDADTAYITSRGLRTLGVRLRQHHESSLKVAEWLAEHPQVARVNHPALPGSKGHEFW KRDFTGSSGLFSFVLKKKLSNEELANYLDNFSLFSMAYSWGGYESLILANQPEHIAAIRP QGEIDFSGTLIRLHIGLEDVDDLIADLDAGFARIV >gi|296493200|gb|ADTK01000301.1| GENE 51 48971 - 49690 788 239 aa, chain + ## HITS:1 COG:ECs3890 KEGG:ns NR:ns ## COG: ECs3890 COG0811 # Protein_GI_number: 15833144 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Escherichia coli O157:H7 # 1 239 6 244 244 441 100.0 1e-124 MQTDLSVWGMYQHADIVVKCVMIGLILASVVTWAIFFSKSVEFFNQKRRLKREQQLLAEA RSLNQANDIAADFGSKSLSLHLLNEAQNELELSEGSDDNEGIKERTSFRLERRVAAVGRQ MGRGNGYLATIGAISPFVGLFGTVWGIMNSFIGIAQTQTTNLAVVAPGIAEALLATAIGL VAAIPAVVIYNVFARQIGGFKAMLGDVAAQVLLLQSRDLDLEASAAAHPVRVAQKLRAG >gi|296493200|gb|ADTK01000301.1| GENE 52 49697 - 50122 527 141 aa, chain + ## HITS:1 COG:ECs3889 KEGG:ns NR:ns ## COG: ECs3889 COG0848 # Protein_GI_number: 15833143 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: 
Biopolymer transport protein # Organism: Escherichia coli O157:H7 # 1 141 1 141 141 252 100.0 2e-67 MAMHLNENLDDNGEMHDINVTPFIDVMLVLLIIFMVAAPLATVDVKVNLPASTSTPQPRP EKPVYLSVKADNSMFIGNDPVTDETMITALNALTEGKKDTTIFFRADKTVDYETLMKVMD TLHQAGYLKIGLVGEETAKAK >gi|296493200|gb|ADTK01000301.1| GENE 53 50394 - 51278 963 294 aa, chain - ## HITS:1 COG:ECs3887 KEGG:ns NR:ns ## COG: ECs3887 COG1028 # Protein_GI_number: 15833141 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Escherichia coli O157:H7 # 1 294 1 294 294 557 100.0 1e-159 MSHLKDPTTQYYTGEYPKQKQPTPGIQAKMTPVPDCGEKTYVGSGRLKDRKALVTGGDSG IGRAAAIAYAREGADVAISYLPVEEEDAQDVKKIIEECGRKAVLLPGDLSDEKFARSLVH EAHKALGGLDIMALVAGKQVAIPDIADLTSEQFQKTFAINVFALFWLTQEAIPLLPKGAS IITTSSIQAYQPSPHLLDYAATKAAILNYSRGLAKQVAEKGIRVNIVAPGPIWTALQISG GQTQDKIPQFGQQTPMKRAGQPAELAPVYVYLASQESSYVTAEVHGVCGGEHLG >gi|296493200|gb|ADTK01000301.1| GENE 54 51469 - 51963 612 164 aa, chain + ## HITS:1 COG:ECs3886 KEGG:ns NR:ns ## COG: ECs3886 COG2862 # Protein_GI_number: 15833140 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 164 1 164 164 271 100.0 5e-73 MERFLENAMYASRWLLAPVYFGLSLALVALALKFFQEIIHVLPNIFSMAESDLILVLLSL VDMTLVGGLLVMVMFSGYENFVSQLDISENKEKLNWLGKMDATSLKNKVAASIVAISSIH LLRVFMDAKNVPDNKLMWYVIIHLTFVLSAFVMGYLDRLTRHNH >gi|296493200|gb|ADTK01000301.1| GENE 55 52003 - 53043 969 346 aa, chain - ## HITS:1 COG:ECs3885 KEGG:ns NR:ns ## COG: ECs3885 COG0667 # Protein_GI_number: 15833139 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Escherichia coli O157:H7 # 1 346 1 346 346 689 98.0 0 MVWLANPERYGQMQYRYCGKSGLRLPALSLGLWHNFGHVNALESQRAILRKAFDLGITHF DLANNYGPPPGSAEENFGRLLREDFAAYRDELIISTKAGYDMWPGPYGSGGSRKYLLASL 
DQSLKRMGLEYVDIFYSHRVDENTPMEETASALAHAVQSGKALYVGISSYSPERTQKMVE LLHEWKIPLLIHQPSYNLLNRWVDKSGLLDTLQNNGVGCIAFTPLAQGLLTGKYLNGIPQ DSRMYREGNKVRGLTPKMLTEANLNSLRLLNEMAQQRGQSMAQMALSWLLKDDRVTSVLI GASRAEQLEENVQALNNLTFSTEELAQIDQHIADGELNLWQASSDK >gi|296493200|gb|ADTK01000301.1| GENE 56 53200 - 54087 953 295 aa, chain + ## HITS:1 COG:Z4353 KEGG:ns NR:ns ## COG: Z4353 COG0412 # Protein_GI_number: 15803544 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Dienelactone hydrolase and related enzymes # Organism: Escherichia coli O157:H7 EDL933 # 1 295 14 308 308 582 98.0 1e-166 MPRLTAKDFPQELLDYYDYYAHGKISKREFLNLAAKYAVGGMTALALFDLLKPNYALATQ VEFTDPEIVAEYITYPSPNGHGEVRGYLVKPAKMSGKTPAVVVVHENRGLNPYIEDVARR VAKAGYIALAPDGLSSVGGYPGNDDKGRELQQQVDPTKLMNDFFAAIEFMQRYPQAAGKV GITGFCYGGGVSNAAAVAYPELACAVPFYGRQAPTADVAKIEAPLLLHFAELDTRINEGW PAYEAALKANNKVYEAYIYPGVNHGFHNDSTPRYDKSAADLAWQRTLKWFDKYLS >gi|296493200|gb|ADTK01000301.1| GENE 57 54206 - 54493 310 95 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_3148 NR:ns ## KEGG: ECIAI1_3148 # Name: yghW # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 95 1 95 95 184 100.0 8e-46 MNNHFGKGLMAGLKATHADSAVNVTKFCADYKRGFVLGYSHRMYEKTGDRQLSAWEAGIL TRRYGLDKEMVMDFFRENNSCSTLRFFMAGYRLEN >gi|296493200|gb|ADTK01000301.1| GENE 58 54682 - 55800 1132 372 aa, chain + ## HITS:1 COG:ECs3882 KEGG:ns NR:ns ## COG: ECs3882 COG1740 # Protein_GI_number: 15833136 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase I small subunit # Organism: Escherichia coli O157:H7 # 1 372 1 372 372 703 100.0 0 MTGDNTLIHSHGINRRDFMKLCAALAATMGLSSKAAAEMAESVTNPQRPPVIWIGAQECT GCTESLLRATHPTVENLVLETISLEYHEVLSAAFGHQVEENKHNALEKYKGQYVLVVDGS IPLKDNGIYCMVAGEPIVDHIRKAAEGAAAIIAIGSCSAWGGVAAAGVNPTGAVSLQEVL PGKTVINIPGCPPNPHNFLATVAHIITYGKPPKLDDKNRPTFAYGRLIHEHCERRPHFDA GRFAKEFGDEGHREGWCLYHLGCKGPETYGNCSTLQFCDVGGVWPVAIGHPCYGCNEEGI GFHKGIHQLANVENQTPRSQKPDVNAKEGGNVSAGAIGLLGGVVGLVAGVSVMAVRELGR QQKKDNADSRGE >gi|296493200|gb|ADTK01000301.1| GENE 
59 55803 - 56789 999 328 aa, chain + ## HITS:1 COG:ECs3881 KEGG:ns NR:ns ## COG: ECs3881 COG0437 # Protein_GI_number: 15833135 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli O157:H7 # 1 328 1 328 328 674 100.0 0 MNRRNFIKAASCGALLTGALPSVSHAAAENRPPIPGSLGMLYDSTLCVGCQACVTKCQDI NFPERNPQGEQTWSNNDKLSPYTNNIIQVWTSGTGVNKDQEENGYAYIKKQCMHCVDPNC VSVCPVSALKKDPKTGIVHYDKDVCTGCRYCMVACPYNVPKYDYNNPFGALHKCELCNQK GVERLDKGGLPGCVEVCPAGAVIFGTREELMAEAKKRLALKPGSEYHYPRQTLKSGDTYL HTVPKYYPHLYGEKEGGGTQVLVLTGVPYENLDLPKLDDLSTGARSENIQHTLYKGMMLP LAVLAGLTVLVRRNTKNDHHDGGDDHES >gi|296493200|gb|ADTK01000301.1| GENE 60 56779 - 57957 1530 392 aa, chain + ## HITS:1 COG:hybB KEGG:ns NR:ns ## COG: hybB COG5557 # Protein_GI_number: 16130895 # Func_class: C Energy production and conversion # Function: Polysulphide reductase # Organism: Escherichia coli K12 # 1 392 1 392 392 719 100.0 0 MSHDPQPLGGKIISKPVMIFGPLIVICMLLIVKRLVFGLGSVSDLNGGFPWGVWIAFDLL IGTGFACGGWALAWAVYVFNRGQYHPLVRPALLASLFGYSLGGLSITIDVGRYWNLPYFY IPGHFNVNSVLFETAVCMTIYIGVMALEFAPALFERLGWKVSLQRLNKVMFFIIALGALL PTMHQSSMGSLMISAGYKVHPLWQSYEMLPLFSLLTAFIMGFSIVIFEGSLVQAGLRGNG PDEKSLFVKLTNTISVLLAIFIVLRFGELIYRDKLSLAFAGDFYSVMFWIEVLLMLFPLV VLRVAKLRNDSRMLFLSALSALLGCATWRLTYSLVAFNPGGGYAYFPTWEELLISIGFVA IEICAYIVLIRLLPILPPLKQNDHNRHEASKA >gi|296493200|gb|ADTK01000301.1| GENE 61 57954 - 59657 1871 567 aa, chain + ## HITS:1 COG:ECs3879 KEGG:ns NR:ns ## COG: ECs3879 COG0374 # Protein_GI_number: 15833133 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase I large subunit # Organism: Escherichia coli O157:H7 # 1 567 1 567 567 1183 99.0 0 MSQRITIDPVTRIEGHLRIDCEIENGVVSKAWASGTMWRGMEEIVKNRDPRDAWMIVQRI CGVCTTTHALSSVRAAESALNIDVPVNAQYIRNIILAAHTTHDHIVHFYQLSALDWVDIT SALQADPTKASEMLKGVSTWHLNSPEEFTKVQNKIKDLVASGQLGIFANGYWGHPAMKLP PEVNLIAVAHYLQALECQRDANRVVALLGGKTPHIQNLAVGGVANPINLDGLGVLNLERL IYIKSFIDKLSDFVEQVYKVDTAVIAAFYPEWLTRGKGAVNYLSVPEFPTDSKNGSFLFP 
GGYIENADLSSYRPITSHSDEYLIKGIQESAKHSWYKDEAPQAPWEGTTIPAYDGWSDDG KYSWVKSPTFYGKTVEVGPLANMLVKLAAGRESTQNKLNEIVAIYQKLTGNTLEVAQLHS TLGRIIGRTVHCCELQDILQNQYSALITNIGKGDHTTFVKPNIPATGEFKGVGFLEAPRG MLSHWMVIKDGIISNYQAVVPSTWNSGPRNFNDDVGPYEQSLVGTPVADPNKPLEVVRTI HSFDPCMACAVHVVDADGNEVVSVKVL >gi|296493200|gb|ADTK01000301.1| GENE 62 59657 - 60151 609 164 aa, chain + ## HITS:1 COG:ECs3878 KEGG:ns NR:ns ## COG: ECs3878 COG0680 # Protein_GI_number: 15833132 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase maturation factor # Organism: Escherichia coli O157:H7 # 1 164 1 164 164 289 100.0 2e-78 MRILVLGVGNILLTDEAIGVRIVEALEQRYILPDYVEILDGGTAGMELLGDMANRDHLII ADAIVSKKSAPGTMMILRDEEVPALFTNKISPHQLGLADVLSALRFTGEFPKKLTLVGVI PESLEPHIGLTPTVEAMIEPALEQVLAALRESGVEAIPREAIHD >gi|296493200|gb|ADTK01000301.1| GENE 63 60144 - 60632 490 162 aa, chain + ## HITS:1 COG:no KEGG:SSON_3137 NR:ns ## KEGG: SSON_3137 # Name: hybE # Def: hydrogenase 2-specific chaperone # Organism: S.sonnei # Pathway: not_defined # 1 162 1 162 162 327 100.0 7e-89 MTEEIAGFQTSPKAQVQAAFEEIARRSMHDLSFLHPSMPVYVSDFTLFEGQWTGCVITPW MLSAVIFPGPDQLWPLRKVSEKIGLQLPYGTMTFTVGELDGVSQYLSCSLMSPLSHSMSI EEGQRLTDDCARMILSLPVTNPDVPHAGRRALLFGRRSGENA >gi|296493200|gb|ADTK01000301.1| GENE 64 60625 - 60966 317 113 aa, chain + ## HITS:1 COG:hybF KEGG:ns NR:ns ## COG: hybF COG0375 # Protein_GI_number: 16130891 # Func_class: R General function prediction only # Function: Zn finger protein HypA/HybF (possibly regulating hydrogenase expression) # Organism: Escherichia coli K12 # 1 113 1 113 113 200 100.0 5e-52 MHELSLCQSAVEIIQRQAEQHDVKRVTAVWLEIGALSCVEESAVRFSFEIVCHGTVAQGC DLHIVYKPAQAWCWDCSQVVEIHQHDAQCPLCHGERLRVDTGDSLIVKSIEVE >gi|296493200|gb|ADTK01000301.1| GENE 65 60979 - 61227 370 82 aa, chain + ## HITS:1 COG:ECs3875 KEGG:ns NR:ns ## COG: ECs3875 COG0298 # Protein_GI_number: 15833129 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Escherichia coli O157:H7 # 1 
82 1 82 82 152 100.0 1e-37 MCIGVPGQVLAVGEDIHQLAQVEVCGIKRDVNIALICEGNPADLLGQWVLVHVGFAMSII DEDEAKATLDALRQMDYDITSA >gi|296493200|gb|ADTK01000301.1| GENE 66 61350 - 62216 1003 288 aa, chain - ## HITS:1 COG:ECs3874 KEGG:ns NR:ns ## COG: ECs3874 COG0625 # Protein_GI_number: 15833128 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione S-transferase # Organism: Escherichia coli O157:H7 # 1 288 17 304 304 602 100.0 1e-172 MTDNTYQPAKVWTWDKSAGGAFANINRPVSGPTHEKTLPVGKHPLQLYSLGTPNGQKVTI MLEELLALGVTGAEYDAWLIRIGDGDQFSSGFVEVNPNSKIPALRDHTHNPPIRVFESGS ILLYLAEKFGYFLPQDLAKRTETMNWLFWLQGAAPFLGGGFGHFYHYAPVKIEYAINRFT MEAKRLLDVLDKQLAQHKFVAGDEYTIADMAIWPWFGNVVLGGVYDAAEFLDAGSYKHVQ RWAKEVGERPAVKRGRIVNRTNGPLNEQLHERHDASDFETNTEDKRQG >gi|296493200|gb|ADTK01000301.1| GENE 67 62421 - 64280 2044 619 aa, chain + ## HITS:1 COG:gsp_2 KEGG:ns NR:ns ## COG: gsp_2 COG0754 # Protein_GI_number: 16130888 # Func_class: E Amino acid transport and metabolism # Function: Glutathionylspermidine synthase # Organism: Escherichia coli K12 # 233 619 1 387 387 825 100.0 0 MSKGTTSQDAPFGTLLGYAPGGVAIYSSDYSSLDPQEYEDDAVFRSYIDDEYMGHKWQCV EFARRFLFLNYGVVFTDVGMAWEIFSLRFLREVVNDNILPLQAFPNGSPRAPVAGALLIW DKGGEFKDTGHVAIITQLHGNKVRIAEQNVIHSPLPQGQQWTRELEMVVENGCYTLKDTF DDTTILGWMIQTEDTEYSLPQPEIAGELLKISGARLENKGQFDGKWLDEKDPLQNAYVQA NGQVINQDPYHYYTITESAEQELIKATNELHLMYLHATDKVLKDDNLLALFDIPKILWPR LRLSWQRRRHHMITGRMDFCMDERGLKVYEYNADSASCHTEAGLILERWAEQGYKGNGFN PAEGLINELAGAWKHSRARPFVHIMQDKDIEENYHAQFMEQALHQAGFETRILRGLDELG WDAAGQLIDGEGRLVNCVWKTWAWETAFDQIREVSDREFAAVPIRTGHPQNEVRLIDVLL RPEVLVFEPLWTVIPGNKAILPILWSLFPHHRYLLDTDFTVNDELVKTGYAVKPIAGRCG SNIDLVSHHEEVLDKTSGKFAEQKNIYQQLWCLPKVDGKYIQVCTFTVGGNYGGTCLRGD ESLVIKKESDIEPLIVVKK >gi|296493200|gb|ADTK01000301.1| GENE 68 64372 - 64635 90 87 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_3273 NR:ns ## KEGG: EcSMS35_3273 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 42 87 1 46 46 75 93.0 8e-13 
MPEQAITKALCIYQGQQINLAYIRLRHFEFTNGRIISDFNGMGKVKYHFLTIKRVLFLVY QTISIHSKPLEIIKTQVFHWLIYSHMK >gi|296493200|gb|ADTK01000301.1| GENE 69 64572 - 66071 1365 499 aa, chain + ## HITS:1 COG:pitB KEGG:ns NR:ns ## COG: pitB COG0306 # Protein_GI_number: 16130887 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate/sulphate permeases # Organism: Escherichia coli K12 # 1 499 1 499 499 891 99.0 0 MLNLFVGLDIYTGLLLLLALAFVLFYEAINGFHDTANAVATVIYTRAMQPQLAVVMAAFF NFFGVLLGGLSVAYAIVHMLPTDLLLNMGSTHGLAMVFSMLLAAIIWNLGTWFFGLPASS SHTLIGAIIGIGLTNALLTGSSVMDALNLREVTKIFSSLIVSPIVGLVIAGGLIFLLRRY WSGTKKRDRIHRIPEDRKKKKGKRKPPFWTRIALIVSAAGVAFSHGANDGQKGIGLVMLV LVGIAPAGFVVNMNASGYEITRTRDAVTNFEHYLQQHPELPQKLIAMEPPLPAASTDGTQ VTEFHCHPANTFDAIARVKTMLPGNMESYEPLSVSQRSQLRRIMLCISDTSAKLAKLPGV SKEDQNLLKKLRSDMLSTIEYAPVWIIMAVALALGIGTMIGWRRVAMTIGEKIGKRGMTY AQGMAAQMTAAVSIGLASYIGMPVSTTHVLSSAVAGTMVVDGGGLQRKTVTSILMAWVFT LPAAIFLSGGLYWIALQLI >gi|296493200|gb|ADTK01000301.1| GENE 70 66120 - 66812 584 230 aa, chain - ## HITS:1 COG:no KEGG:B21_02811 NR:ns ## KEGG: B21_02811 # Name: yghT # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 230 1 230 230 462 100.0 1e-129 MQSITPPLIAVIGSDGSGKSTVCEHLITVVEKYGAAERVHLGKQAGNVGRAVTKLPLMGK SLHKTIERNQVKTAKKLPGPVPALVITAFVARRLLRFRHMLACRRRGLIVLTDRYPQDQI PGAYDGTVFPPNVEGGRFVSWLASQERKAFHWMASHKPDLVIKLNVDLEVACARKPDHKR ESLARKIAITPQLTFGGAQLVDIDANQPLEQVLVDAEKAITDFMTARGYH >gi|296493200|gb|ADTK01000301.1| GENE 71 67001 - 67699 342 232 aa, chain + ## HITS:1 COG:no KEGG:EcHS_A3165 NR:ns ## KEGG: EcHS_A3165 # Name: not_defined # Def: putative lipoprotein # Organism: E.coli_HS # Pathway: not_defined # 1 232 6 237 237 471 99.0 1e-131 MSIINSTPVHVIAIVGCDGSGKSTLTASLVNELAARMPTEHIYLGQSSGRIGEWISQLPV IGAPFGRYLRSKAAHVHEKPSTPPGNITALVIYLLSCWRAYKFRKMLCKSQQGFLLITDR YPQVEVPGFRFDGPQLAKTTGGNGWIKMLRQRELKLYQWMASYLPVLLIRLGIDEQTAFA RKPDHQLAALQEKIAVTPQLTFNGAKILELDGRQPADEILQASLRAIHAALS >gi|296493200|gb|ADTK01000301.1| GENE 72 67731 - 68489 632 252 aa, chain + ## HITS:1 COG:no 
KEGG:B21_02809 NR:ns ## KEGG: B21_02809 # Name: yghR # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 252 1 252 252 478 99.0 1e-134 MDALQTQTVISTTAPQPSYIPGLIAVVGCDGTGKSTLTTDLVKSLQQHWQTERRYLGLLS GEDGDKIKRLPLVGVWLERRLAAKSSKTQSMKTKSPALWAAVIMYCFSLRRMANLRKVQR LAQSGVLVVSDRFPQAEISGFYYDGPGIGVERATGKISMFLAQRERRLYQQMAQYRPELI IRLGIDIETAISRKPDHDYAELQDKIGVMSKIGYNGTKILEIDSRAPYSEVLEQAQKAVS LVAIVSDRRSLT >gi|296493200|gb|ADTK01000301.1| GENE 73 68535 - 69884 839 449 aa, chain + ## HITS:1 COG:yghQ KEGG:ns NR:ns ## COG: yghQ COG2244 # Protein_GI_number: 16130883 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Escherichia coli K12 # 42 355 12 325 325 585 98.0 1e-167 MAGFNIKHWFADGAFRTIIRNSAWLGSSNVVSALLGLLALSCAGKGMTPAMFGVLVIVQS YAKSISDFIKFQTWQLVVQYGTPALTNNNPHQFRNVVSFSFSLDIVSGAVAIVGGIALLP FLSHSLGLDDQSFWLAALYCTLIPSMASSTPTGILRAVDRFDLIAVQQATKPFLRTAGSV VAWYFDFGFAGFVIAWYVSNLVGGTMYWWFAARELRRRNIHNAFKLNLFESARHIKGAWS FVWSTNIAHSIWSARNSCSTVLVGIVLGPAAAGLFKIAMTFFDAAGTPAGLLGKSFYPEV MRLDPRTTRPWLLGVKSGLLAGGIGILVALAVFIVGKPLISLVFGVKYLEAYDLIQVMLG AIVISMLGFPQESLLLMAGKQRAFLVAQTIASIGYIVLLFMFCHLFGVLGAAFAYFGGQC LDVALSLIPTLKAFFQRHSLLYNAAGEKS >gi|296493200|gb|ADTK01000301.1| GENE 74 69884 - 70708 589 274 aa, chain + ## HITS:1 COG:no KEGG:ECO111_3811 NR:ns ## KEGG: ECO111_3811 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 274 1 274 274 534 100.0 1e-150 MMRKYFPLEASERLFVAIEEDDVVDAQVSLPPTIALSCTTEIIHDNYALCLQFWLNGVDR QELLRLVRKQAKGDELTADERKQFKYMRARYKHLRFAQRLYLKKHQAGFLFGKTTVFLGR FQDGFRNGKKNIVSYYGNLLRIYLSSPVWSLVNYSLRHSQLESVSSFIAYRQKQMHTLKE IIAKPRLTGREFHDVRKIISQQVSYYDTLRSLDPENKEALQISRFLAAINGLMGDKHDDM VADDMENRQSYDAPLALDSDIRQRLELLISRFPL >gi|296493200|gb|ADTK01000301.1| GENE 75 70720 - 71280 447 186 aa, chain + ## HITS:1 COG:ytfJ KEGG:ns NR:ns ## COG: ytfJ COG3054 # Protein_GI_number: 16132038 # Func_class: R General function prediction only # 
Function: Predicted transcriptional regulator # Organism: Escherichia coli K12 # 1 184 1 183 184 205 54.0 4e-53 MSSRLIIALIIMLLAPGVQAHNFVTGKTVTPVYIQEGGELLLNSDDEIHYQKWNSTQLAG KVRIIQYIAGRKSAKKKNSLLIKAVEAANFPQDRFQPTTIVNTDDAIFGTGYFVVGKIEK NKRRYPWAQFVIDGNGQGRVAWRLPEQSSTILVLNKAGQIQWAKDGSLTPEEVDHVIALA QKLINE >gi|296493200|gb|ADTK01000301.1| GENE 76 71311 - 72381 968 356 aa, chain + ## HITS:1 COG:PA3828 KEGG:ns NR:ns ## COG: PA3828 COG0795 # Protein_GI_number: 15599023 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Pseudomonas aeruginosa # 3 343 2 351 372 124 29.0 3e-28 MKLVEHYIMRGTRRLVLIIVGFLIFIFASYSAQRYLTEAANGTLALDVVLDIVFYKVLIA LEMLLPVGLYVSVGVTLGQMYTDSEITAISAAGGSPGRLYKAVLYLAIPLSIFVTLLSMY GRPWAYAQIYQLEQQSQSELDVRQLRAKKFNTNDNGRMILSQTVDQDNNRLTDALIYTST ANRTRIFRARSVDVVDPSPEKPTVMLHNGTAYLLDHQGRDDNEQIYRNLQLHLNPLVQSP NVKRKAKSVTELARSVFPADHAELQWRQSRGLTALLMALLAISLSRVKPRQGRFSTLLPL TLLFIAIFYGGDVCRTLVANGAIPLIPGLWLVPGLMLMGLLMLVARDFSLLQKFSR >gi|296493200|gb|ADTK01000301.1| GENE 77 72378 - 73412 777 344 aa, chain + ## HITS:1 COG:PA3827 KEGG:ns NR:ns ## COG: PA3827 COG0795 # Protein_GI_number: 15599022 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Pseudomonas aeruginosa # 1 333 1 346 355 127 29.0 4e-29 MNVFSRYLIRHLFLGFAAAAGLLLPLFTTFNLINELDDVSPGGYRWTQAVLVVLMTLPRT LVELSPFIALLGGIVGLGQLSKNSELTAIRSTGFSIFRIALVALVAGILWTVSLGAIDEW VASPLQQQALQIKSTATALGEDDDITGNMLWARRGNEFVTVKSLNEQGQPVGVEIFHYRD DLSLESYIYARSATIKDDKTWVLHGVNHKKWLNGKETLETLDNLAWQSAFTSMNLEELSM PGNTFSVRQLNHYIHYLQETGQPILTLAMILLAVPFTFTAPRSPGMGSRLAVGVIVGLLT WISYQIMVNLGLLFALSAPVTALGLPVAFVLVALSLVYWYDRQH Prediction of potential genes in microbial genomes Time: Mon May 16 15:57:49 2011 Seq name: gi|296493199|gb|ADTK01000302.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont923.5, whole genome shotgun sequence Length of sequence - 33121 bp Number of predicted genes - 28, with homology - 28 Number of transcription units - 9, operones - 5 average op.length - 4.8 N 
Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 10 - 1182 1279 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes 2 1 Op 2 . - CDS 1182 - 1430 284 ## S3224 hypothetical protein 3 1 Op 3 2/1.000 - CDS 1462 - 2376 726 ## COG0702 Predicted nucleoside-diphosphate-sugar epimerases 4 1 Op 4 . - CDS 2373 - 4061 1236 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II - Prom 4211 - 4270 8.9 + Prom 4207 - 4266 7.0 5 2 Tu 1 . + CDS 4465 - 5607 1000 ## ECO111_3803 putative DNA-binding transcriptional regulator 6 3 Tu 1 . - CDS 5614 - 6378 552 ## COG2186 Transcriptional regulators - Prom 6457 - 6516 9.3 + Prom 6421 - 6480 2.3 7 4 Op 1 9/0.000 + CDS 6625 - 8124 1530 ## COG0277 FAD/FMN-containing dehydrogenases 8 4 Op 2 15/0.000 + CDS 8124 - 9176 968 ## COG0277 FAD/FMN-containing dehydrogenases 9 4 Op 3 1/1.000 + CDS 9187 - 10410 1123 ## COG0247 Fe-S oxidoreductase 10 4 Op 4 1/1.000 + CDS 10415 - 10819 611 ## COG3193 Uncharacterized protein, possibly involved in utilization of glycolate and propanediol 11 4 Op 5 1/1.000 + CDS 10841 - 13012 2461 ## COG2225 Malate synthase + Term 13040 - 13085 -0.9 + Prom 13224 - 13283 4.4 12 5 Tu 1 . + CDS 13367 - 15049 1655 ## COG1620 L-lactate permease + Term 15059 - 15097 7.1 13 6 Tu 1 . + CDS 15535 - 20100 4260 ## ECO111_3795 putative lipoprotein AcfD homolog precursor 14 7 Op 1 . + CDS 20248 - 21057 466 ## COG1989 Type II secretory pathway, prepilin signal peptidase PulO and related peptidases 15 7 Op 2 . + CDS 21123 - 21533 256 ## JW2938 hypothetical protein + Prom 21567 - 21626 2.9 16 8 Op 1 . 
+ CDS 21680 - 22510 668 ## COG3031 Type II secretory pathway, component PulC 17 8 Op 2 6/0.000 + CDS 22540 - 24600 2095 ## COG1450 Type II secretory pathway, component PulD 18 8 Op 3 24/0.000 + CDS 24600 - 26093 1450 ## COG2804 Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB 19 8 Op 4 10/0.000 + CDS 26093 - 27316 1183 ## COG1459 Type II secretory pathway, component PulF 20 8 Op 5 . + CDS 27333 - 27788 599 ## COG2165 Type II secretory pathway, pseudopilin PulG 21 8 Op 6 . + CDS 27825 - 28355 409 ## EcE24377A_3423 general secretion pathway protein H 22 8 Op 7 12/0.000 + CDS 28352 - 28723 340 ## COG2165 Type II secretory pathway, pseudopilin PulG 23 8 Op 8 7/0.000 + CDS 28720 - 29319 463 ## COG4795 Type II secretory pathway, component PulJ 24 8 Op 9 4/0.000 + CDS 29322 - 30299 978 ## COG3156 Type II secretory pathway, component PulK 25 8 Op 10 2/1.000 + CDS 30296 - 31474 683 ## COG3297 Type II secretory pathway, component PulL 26 8 Op 11 . + CDS 31476 - 32012 419 ## COG3149 Type II secretory pathway, component PulM + Term 32131 - 32157 0.1 - Term 32112 - 32152 2.6 27 9 Op 1 . - CDS 32293 - 32865 314 ## B21_02762 hypothetical protein - Term 32888 - 32918 1.0 28 9 Op 2 . 
- CDS 32962 - 33120 63 ## EC55989_4897 hypothetical protein Predicted protein(s) >gi|296493199|gb|ADTK01000302.1| GENE 1 10 - 1182 1279 390 aa, chain - ## HITS:1 COG:CC1162 KEGG:ns NR:ns ## COG: CC1162 COG0156 # Protein_GI_number: 16125414 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Caulobacter vibrioides # 1 388 1 389 404 402 53.0 1e-112 MGLYDKYARLAGERLQFSDNGLTPFGTCIDEVYSATEGRIGNKKVILAGTNNYLGLTFNH DAISEGQAALAAQGTGTTGSRMANGSYAPHLALEKEIAEFFNRPTAIVFSTGYTANLGVI SALADHNAVVLLDADSHASIYDACSLGGAEIIRFRHNDAKDLERRMVRLGERAKEAIIIV EGIYSMLGDVAPLAEIVDIKRRLGGYLIVDEAHSFGVLGATGRGLAEAVGVEDDVDIIVG TFSKSLASIGGFAVGSEAMEVLRYGSRPYIFTASPSPSCIATVRSSLRTIATQPELRQKL MDNANHLYDGLQKLGYELSLHISPVVPVIIGSKEEGLRIWRELISLGVYVNLILPPAAPA GITLLRCSVNAAHSREQIDAIIQAFATLKQ >gi|296493199|gb|ADTK01000302.1| GENE 2 1182 - 1430 284 82 aa, chain - ## HITS:1 COG:no KEGG:S3224 NR:ns ## KEGG: S3224 # Name: not_defined # Def: hypothetical protein # Organism: S.flexneri_2457T # Pathway: not_defined # 1 82 1 82 82 124 100.0 1e-27 MVNREIVMDYILSCLQDLVENGVEIKPDSDLVNDLGLESIKVMDLLMMLEDRFDISIPIN ILLDVKTPAQLMETLLPWLENK >gi|296493199|gb|ADTK01000302.1| GENE 3 1462 - 2376 726 304 aa, chain - ## HITS:1 COG:CC1164 KEGG:ns NR:ns ## COG: CC1164 COG0702 # Protein_GI_number: 16125416 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate-sugar epimerases # Organism: Caulobacter vibrioides # 5 287 9 288 317 118 33.0 2e-26 MNQTVAVTGATGFIGKYIINNLLARGFHVRALTRTARAHVNDNLTWVRGSLEDTHSLSKL VAGASAVVHCAGQVRGHKEEIFTHCNVDGSLRLMQAAKESGFCQRFLFISSLAARHPELS WYANSKHVAEQRLTAMADEITLGVFRPTAVYGPGDKELKPLFDWMLRGLLPRLGAPDTQL SFLHVTDFAQAVGQWLSAETVQTQTYELCDGVPGGYDWQHVRQLAADARCGSVRMVGIPL PVLTCLADISTALSRLAGKEPMLTRSKIRELTHADWSASNNRISEDINWFPGISLEHALR NGLF >gi|296493199|gb|ADTK01000302.1| GENE 4 2373 - 4061 1236 562 aa, chain - ## HITS:1 COG:CC1165 KEGG:ns NR:ns ## COG: CC1165 COG0318 # 
Protein_GI_number: 16125417 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Caulobacter vibrioides # 1 547 11 560 567 444 43.0 1e-124 MRYADFPTLVDALDYAALSSAGMNFYDRRCQLEDQLEYQTLKTRAEAGAKRLLSLNLKKG DRVALIAETSSGFVEAFFACQYAGLVAVPLAIPMGVGQRDSWSAKLQGLLASCQPAAIIT GDEWLPLVNAATHDNPELHVLSHAWFKALPEADVALQRPVPNDIAYLQYTSGSTRFPRGV IITHREVMANLRAISHDGIKLRPGDRCVSWLPFYHDMGLVGFLLTPVATQLSVDYLRTQD FAMRPLQWLKLISKNRGTVSVAPPFGYELCQRRVNEKDLAELDLSCWRVAGIGAEPISAE QLHQFAECFRQVNFDNKTFMPCYGLAENALAVSFSDEASGVVVNEVDRDILEYQGKAVAP GAETRAVSTFVNCGKALPEHGIEIRNEAGMPVAERVVGHICISGPSLMSGYFGDQASQDE IAATGWLDTGDLGYLLDGYLYVTGRIKDLIIIRGRNIWPQDIEYIAEQEPEIHSGDAIAF VTAQEKIILQIQCRISDEERRGQLIHALAARIQSEFGVTAAIELLPPHSIPRTSSGKPAR AEAKKRYQKAYAASLHVQESLA >gi|296493199|gb|ADTK01000302.1| GENE 5 4465 - 5607 1000 380 aa, chain + ## HITS:1 COG:no KEGG:ECO111_3803 NR:ns ## KEGG: ECO111_3803 # Name: yghO # Def: putative DNA-binding transcriptional regulator # Organism: E.coli_O111_H- # Pathway: not_defined # 1 380 1 380 380 795 99.0 0 MECDLLMIKIEKVINKNDLKAFIAFPSSLYPDDPNWIPPLFIERNEHLSAKNPGTDHIIW QAWVAKKAGQIVGRITAQIDTLHCERYGEDTGHFGMIDAIDDPQVFAALFGAAEAWLKSQ GASKISGPFSLNINQESGLLIEGFDTPPCAMMPHGKPWYAAHIEQLGYHKGIDLLAWWMQ RTDLTFSPALKKLMDQVRKKVTIRCINRQRFAEEMQILREIFNSGWQHNWGFVPFTEHEF ATMGDQLKYLVPDDMIYIAEIDSAPCAFIVGLPNINEAIADLNGSLFPFGWAKLLWRLKV SGVRTARVPLMGVRDEYQFSRIGPVIALLLIEALRDPFARRKIDALEMSWILETNTGMNN MLERIGAEPYKRYRLYEKQI >gi|296493199|gb|ADTK01000302.1| GENE 6 5614 - 6378 552 254 aa, chain - ## HITS:1 COG:glcC KEGG:ns NR:ns ## COG: glcC COG2186 # Protein_GI_number: 16130880 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 254 1 254 254 472 100.0 1e-133 MKDERRPICEVVAESIERLIIDGVLKVGQPLPSERRLCEKLGFSRSALREGLTVLRGRGI IETAQGRDSRVARLNRVQDTSPLIHLFSTQPRTLYDLLDVRALLEGESARLAATLGTQAD FVVITRCYEKMLAASENNKEISLIEHAQLDHAFHLAICQASHNQVLVFTLQSLTDLMFNS 
VFASVNNLYHRPQQKKQIDRQHARIYNAVLQRLPHVAQRAARDHVRTVKKNLHDIELEGH HLIRSAVPLEMNLS >gi|296493199|gb|ADTK01000302.1| GENE 7 6625 - 8124 1530 499 aa, chain + ## HITS:1 COG:glcD KEGG:ns NR:ns ## COG: glcD COG0277 # Protein_GI_number: 16130879 # Func_class: C Energy production and conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Escherichia coli K12 # 1 499 1 499 499 1000 99.0 0 MSILYEERLDGALPDVDRTSVLMALREHVPGLEILHTDEEIIPYECDGLSAYRTRPLLVV LPKQMEQVTAILAVCHRLRVPVVTRGAGTGLSGGALPLEKGVLLVMARFKEILDINPVGR RARVQPGVRNLAISQAVAPHNLYYAPDPSSQIACSIGGNVAENAGGVHCLKYGLTVHNLL KIEVQTLDGEALTLGSDALDSPGFDLLALFTGSEGMLGVTTEVTVKLLPKPPVARVLLAS FDSVEKAGLAVGDIIANGIIPGGLEMMDNLSIRAAEDFIHAGYPVDAEAILLCELDGVES DVQEDCERVNDILLKAGATDVRLAQDEAERVRFWAGRKNAFPAVGRISPDYYCMDGTIPR RALPGVLEGIARLSQQYDLRVANVFHAGDGNMHPLILFDANEPGEFARAEELGGKILELC VEVGGSISGEHGIGREKINQMCAQFNSDEITTFHAVKAAFDPDGLLNPGKSIPTLHRCAE FGAMHVHHGHLPFPELERF >gi|296493199|gb|ADTK01000302.1| GENE 8 8124 - 9176 968 350 aa, chain + ## HITS:1 COG:glcF_1 KEGG:ns NR:ns ## COG: glcF_1 COG0277 # Protein_GI_number: 16130878 # Func_class: C Energy production and conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Escherichia coli K12 # 1 318 1 318 369 634 98.0 0 MLRECDYSQALLEQVNQAISDKTPLVIQGSNSKAFLGRPVTGQTLDVRCHRGIVNYDPTE LVITARAGTPLVAIEAALESAGQMLPCEPPHYGEEATWGGMVACGLAGPRRPWSGSVRDF VLGTRIITGAGKHLRFGGEVMKNVAGYDLSRLMAGSYGCLGVLTEISMKVLPRPRASLSL RREISLQEAMNEIAQWQLQPLPISGLCYFDNALWIRLEGGEGSVKAARELLGGEEVAGQF WQQLREQQLPFFSLPGTLWRISLPSDAPMMDLPGEQLIDWGGALRWLKSTAEDNQIHRIA RNAGGHATRFSAGDGGFAPLSAPLFRYHQQLKQQLDPCGVFNPGRMYAEL >gi|296493199|gb|ADTK01000302.1| GENE 9 9187 - 10410 1123 407 aa, chain + ## HITS:1 COG:glcF_2 KEGG:ns NR:ns ## COG: glcF_2 COG0247 # Protein_GI_number: 16130878 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Escherichia coli K12 # 16 407 1 392 392 775 99.0 0 MQTQLTEEMRQNARALEADSILRACVHCGFCTATCPTYQLLGDELDGPRGRIYLIKQVLE GNEVTLKTQEHLDRCLTCRNCETTCPSGVRYHNLLDIGRDIVEQKVKRPLPERILREGLR 
QLVPRPAVFRALTQVGLVLRPFLPEQVRAKLPAETVKAKPRPPLRHKRRVLMLEGCAQPT LSPNTNAATARVLDRLGISVMSANEAGCCGAVDYHLNAQEKGLARARNNIDAWWPAIEAG AEAILQTASGCGAFVKEYGQMLKNDALYADKARQVSELAVDLVELLREEPLEKLAIRGDK KLAFHCPCTLQHAQKLNGEVEKVLLRLGFTLTDVPDSHLCCGSAGTYALTHPDLARQLRD NKMNALESGKPEMIVTANIGCQTHLASAGRTSVRHWIEIVEQALEKE >gi|296493199|gb|ADTK01000302.1| GENE 10 10415 - 10819 611 134 aa, chain + ## HITS:1 COG:glcG KEGG:ns NR:ns ## COG: glcG COG3193 # Protein_GI_number: 16130877 # Func_class: R General function prediction only # Function: Uncharacterized protein, possibly involved in utilization of glycolate and propanediol # Organism: Escherichia coli K12 # 1 118 1 118 134 198 100.0 2e-51 MKTKVILSQQMASAIIAAGQEEAQKNNWSVSIAVADDGGHLLALSRMDDCAPIAAYISQE KARTAALGRRETKGYEEMVNNGRTAFVTAPLLTSLEGGVPVVVDGQIIGAVGVSGLTGAQ DAQVAKAAAAVLAK >gi|296493199|gb|ADTK01000302.1| GENE 11 10841 - 13012 2461 723 aa, chain + ## HITS:1 COG:glcB KEGG:ns NR:ns ## COG: glcB COG2225 # Protein_GI_number: 16130876 # Func_class: C Energy production and conversion # Function: Malate synthase # Organism: Escherichia coli K12 # 1 723 1 723 723 1462 98.0 0 MSQTITQGRLRIDANFKRFVDEEVLPGTGLDAAAFWRNFDEIVHDLAPENRQLLAERDRI QAALDEWHRSNPGPVKDKAAYKSFLRELGYLVPQLERVTVETTGIDSEITSQAGPQLVVP AMNARYALNAANARWGSLYDALYGSDIIPQEGAMVSGYDPQRGAQVIAWVRRFLDESLPL EKGSYQDVVAFKVVDKQLRIQLKNVKETTLRTPAQFVGYRGDTAAPTCILLKNNGLHIEL QIDANGRIGKDDPAHINDVIVEAAISTILDCEDSVAAVDAEDKILLYRNLLGLMQGTLQE KMEKNGRQIVRKLNDDRHYTAADGSEISLHGRSLLFIRNVGHLMTIPVIWDSEGNEIPEG ILDGVMTGAIALYDLKVQKNSRTGSVYIVKPKMHGPQEVAFANKLFTRVETMLGMAPNTL KMGIMDEERRTSLNLRSCIAQARNRVAFINTGFLDRTGDEMHSVMEAGPMLRKNQMKSTP WIKAYERNNVLSGLFCGLRGKAQIGKGMWAMPDLMADMFSQKGDQLRAGANTAWVPSPTA ATLHALHYHQTNVQSVQANIAQTEFNAEFEPLLDDLLTIPVAENANWSVEEIQQELDNNV QGILGYVVRWVEQGIGCSKVPDIHNVALMEDRATLRISSQHIANWLRHGILTKEQVQASL ENMAKVVDQQNAGDPAYRPMAGNFANSCAFKAASDLIFLGVKQPNGYTEPLLHAWRLREK ESH >gi|296493199|gb|ADTK01000302.1| GENE 12 13367 - 15049 1655 560 aa, chain + ## HITS:1 COG:yghK KEGG:ns NR:ns ## COG: yghK COG1620 # Protein_GI_number: 16130875 
# Func_class: C Energy production and conversion # Function: L-lactate permease # Organism: Escherichia coli K12 # 1 560 1 560 560 914 99.0 0 MVTWTQMYMPMGGLGLSALVALIPIIFFFVALAVLRLKGHVAGAITLILSILIAIFAFKM PIDMAFAAAGYGFIYGLWPIAWIIVAAVFLYKLTVASGQFDIIRSSVISITDDQRLQVLL IGFSFGALLEGAAGFGAPVAITGALLVGLGFKPLYAAGLCLIANTAPVAFGALGVPILVA GQVTGIDPFHIGAMAGRQLPFLSVLVPFWLVAMMDGWKGVKETWPAALVAGGSFAVTQFF TSNYIGPELPDITSALVSIVSLALFLKVWRPKNTETAISMGQSAGAMVVNKPSSGGPVPS EYSLGQIIRAWSPFLILTVLVTIWTMKPFKALFAPGGAFYSLVINFQIPHLHQQVLKAAP IVAQPTPMDAVFKFDPLSAGGTAIFIAAIISIFILGVGIKKGIGVFAETLISLKWPILSI GMVLAFAFVTNYSGMSTTLALVLAGTGVMFPFFSPFLGWLGVFLTGSDTSSNALFGSLQS TTAQQINVSDTLLVAANTSGGVTGKMISPQSIAVACAATGMVGRESELFRYTVKHSLIFA SVIGVITLLQAYVFTGMLVS >gi|296493199|gb|ADTK01000302.1| GENE 13 15535 - 20100 4260 1521 aa, chain + ## HITS:1 COG:no KEGG:ECO111_3795 NR:ns ## KEGG: ECO111_3795 # Name: yghJ # Def: putative lipoprotein AcfD homolog precursor # Organism: E.coli_O111_H- # Pathway: not_defined # 1 1521 1 1521 1521 2928 99.0 0 MNKKFKYKKSLLAAILSATLLAGCDGGGSGSSSDTPPVDSGTGSLPEVKPDPTPNPEPTP EPTPDPEPTPEPTPDPEPTPEPEPEPVPTKTGYLTLGGSLRVTGDITCNDESSDGFTFTP GDKVTCVAGNNTTIATFDTQSEAARSLRAVEKVSFSLEDAQELAGSDNKKSNALSLVTSM NSCPANTEQVCLEFSSVIESKRFDSLYKQIDLAPEEFKKLVNEEVENNAATDKAPSTHTS PVVPVTTPGTKPDLNASFVSANAEQFYQYQPSEIILSEGRLVDSQGYGVAGVNYYTNSGR GVTGENGEFSFSWGETISFGIDTFELGSVRGNKSTIALTELGDEVRGANIDQLIHRYSTT GQNNTRVVPDDVRKVFAEYPNVINEIINLSLSNGATLDEGEQVVNLPNEFIEQFKTGQAK EIDTAICAKTDGCNEARWFSLTTRNVNDGQIQGVINKLWGVDTNYKSVSKFHVFHDSTNF YGSTGNARGQAVVNISNAAFPILMARNDKNYWLAFGEKRAWDKNELAYITEAPSLVEPEN VTRDTATFNLPFISLGQVGEGKLMVIGNPHYNSILRCPNGYSWNGGVNKDGQCTLNSDPD DMKNFMENVLRYLSDDKWTPDAKASMTVGTNLDTVYFKRHGQVTGNSAAFDFHPDFAGIS VEHLSSYGDLDPQEMPLLILNGFEYVTQVGNDPYAIPLRADTSKPKLTQQDVTDLIAYLN KGGSVLIMENVMSNLKEESASGFVRLLDAAGLSMALNKSVVNNDPQGYPNRVRQQRATGI WVYERYPAVDGALPYTIDSKTGEVKWKYQVENKPDDKPKLEVASWLEDVDGKQETRYAFI DEADHKTEDSLKAAKEKIFAAFPGLKECTNPAYHYEVNCLEYRPGTGVPVTGGMYVPQYT QLSLNADTAKAMVQAADLGTNIQRLYQHELYFRTNGRKGERLSSVDLERLYQNMSVWLWN 
DTSYRYEEGKNDELGFKTFTEFLNCYANDAYAGGTKCSADLKKSLVDNNMIYGDGSSKAG MMNPSYPLNYMEKPLTRLMLGRSWWDLNIKVDVEKYPGAVSEEGQNVTETISLYSNPTKW FAGNMQSTGLWAPAQKEVTIKSNANVPVTVTVALADDLTGREKHEVALNRPPRVTKTYSL DASGTVKFKVPYGGLIYIKGNSSTNESASFTFTGVVKAPFYKDGAWKNDLNSPAPLGELE SDAFVYTTPKKNLNASNYTGGLEQFANDLDTFASSMNDFYGRDETSGKHRMFTYKNLTGH KHRFTNDVQISIGDAHSGYPVMNSSFSTNSTTLPTTPLNDWLIWHEVGHNAAETPLTVPG ATEVANNVLALYMQDRYLGKMNRVADDITVAPEYLEESNGQAWARGGAGDRLLMYAQLKE WAEKNFDIKKWYPEGELPKFFSDREGMKGWNLFQLMHRKARGDDVGDKTFGGKNYCAESN GNAADTLMLCASWVAQTDLSEFFKKWNPGANAYQLPGASEMSFEGGVSQSAYNTLASLKL PKPEQGPETINKVTEHKMSVE >gi|296493199|gb|ADTK01000302.1| GENE 14 20248 - 21057 466 269 aa, chain + ## HITS:1 COG:pppA KEGG:ns NR:ns ## COG: pppA COG1989 # Protein_GI_number: 16130872 # Func_class: N Cell motility; O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, prepilin signal peptidase PulO and related peptidases # Organism: Escherichia coli K12 # 1 269 42 310 310 476 96.0 1e-134 MLFDVFQQYPAAMPILATVGGLIIGSFLNVVIWRYPIMLRQQMAEFHGEMPSAQSKISLA LPRSHCPHCQQTIRIRDNIPLLSWLMLKGRCRDCQAKISKRYPLVELLTALAFLLASLVW PESGWALAVMILSAWLIAASVIDLDHQWLPDVFTQGVLWTGLIAAWAQQSPLTLQDAVTG VLVGFIAFYSLRWIAGVVLRKEALGMGDVLLFAALGSWVGPLSLPNVALIASCCGLIYAV ITKRGSTTLPFGPCLSLGGIATIYLQALF >gi|296493199|gb|ADTK01000302.1| GENE 15 21123 - 21533 256 136 aa, chain + ## HITS:1 COG:no KEGG:JW2938 NR:ns ## KEGG: JW2938 # Name: yghG # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 136 1 136 136 227 100.0 1e-58 MSIKQMPGRVLISLLLSVTGLLSGCASHNENASLLAKKQAQNISQNLPIKSAGYTLVLAQ SSGTTVKMTIISEAGTQTTQTPDAFLTSYQRQMCADPTVKLMITEGINYSITINDTRTGN QYQRKLDRTTCGIVKA >gi|296493199|gb|ADTK01000302.1| GENE 16 21680 - 22510 668 276 aa, chain + ## HITS:1 COG:yghF KEGG:ns NR:ns ## COG: yghF COG3031 # Protein_GI_number: 16130870 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulC 
# Organism: Escherichia coli K12 # 1 276 1 276 288 501 94.0 1e-142 MFWLMLLIISAKVAHSLWRYFSFSAEYTAVSPSANKPPRADAKTFDKNDVQLISQQNWFG KYQPVATPVKQPEPVPVAETRLNVVLRGIAFGARPGAVIEEGGKQQVYLQGETLGSHNAV IEEINRDHVMLRYQGKIERLSLAEEGHSTVAVTNKKAVSDEAKQAVAEPAVSAPVEIPTA VRQALAKDPQKIFNYIQLTPVRKEGIVGYAVKPGADRSLFDASGFKEGDIAIALNQQDFT DPRAMIALMRQLPSMDSIQLTVLRKGARHDISIALR >gi|296493199|gb|ADTK01000302.1| GENE 17 22540 - 24600 2095 686 aa, chain + ## HITS:1 COG:gspD KEGG:ns NR:ns ## COG: gspD COG1450 # Protein_GI_number: 16131204 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulD # Organism: Escherichia coli K12 # 18 665 5 639 654 502 44.0 1e-142 MFWRDITLSVWRKKTTGLKTKKRLLPLVLAAALCSSPVWAEEATFTANFKDTDLKSFIET VGANLNKTIIMGPGVQGKVSIRTMTPLNERQYYQLFLNLLEAQGYAVVPMENDVLKVVKS SAAKVEPLPLVGEGSDNYAGDEMVTKVVPVRNVSVRELAPILRQMIDSAGSGNVVNYDPS NVIMLTGRASVVERLTEVIQRVDHAGNRTEEVIPLDNASASEIARVLESLTKNSGENQPA TLKSQIVADERTNSVIVSGDPATRDKMRRLIRRLDSEMERSGNSQVFYLKYSKAEDLVDV LKQVSGTLTAAKEEAEGTVGSGREIVSIAASKHSNALIVTAPQDIMQSLQSVIEQLDIRR AQVHVEALIVEVAEGSNINFGVQWASKDAGLMQFANGTQIPIGTLGAAISQAKPQKGSTV ISENGATTINPDTNGDLSTLAQLLSGFSGTAVGVVKGDWMALVQAVKNDSSSNVLSTPSI TTLDNQEAFFMVGQDVPVLTGSTVGSNNSNPFNTVERKKVGIMLKVTPQINEGNAVQMVI EQEVSKVEGQTSLDVVFGERKLKTTVLANDGELIVLGGLMDDQAGESVAKVPLLGDIPLI GNLFKSTADKKEKRNLMVFIRPTILRDGMAADGVSQRKYNYMRAEQIYRDEQGLSLMPHT AQPVLPAQNQALPPEVRAFLNAGRTR >gi|296493199|gb|ADTK01000302.1| GENE 18 24600 - 26093 1450 497 aa, chain + ## HITS:1 COG:VC2732 KEGG:ns NR:ns ## COG: VC2732 COG2804 # Protein_GI_number: 15642726 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB # Organism: Vibrio cholerae # 14 497 16 503 503 635 67.0 0 MVPVAQETTANTVRLPYSFSRRFSLVAWCEASLEILHVHPLSLSVLQELQRGLNAPFTLR QIDEAEFEQRLNAVWQRDSSEARQLMEDLGSAEDFFTLAEELPETEDLLESDDDAPIIKL 
INAMLAEAIKEGASDIHIETFEKSLVIRFRVDGTLHEMLRPGRKLASLLVSRIKVMARLD IAEKRVPQDGRIALLLGGRAIDVRVSTMPSAWGERVVLRLLDKNQARLTLERLGLSQQLT AQLRQLLHKPHGIFLVTGPTGSGKSTTLYAGLQELNNHSRNILTVEDPIEYMIEGIGQTQ VNTRVGMTFARGLRAILRQDPDVVMVGEIRDTETAEIAVQASLTGHLVLSTLHTNTAVGA ITRLQDMGVEPFLLSSSLTGVMAQRLVRTLCPDCRQPAPATDEEKRLLGITDARTVTLYH PQGCPACNHKGFRGRTAIHELIVVDATLRDLIHRQAGELELERYVRQHSAGIRSNGIEKV LAGETSLDEVLRVTMEA >gi|296493199|gb|ADTK01000302.1| GENE 19 26093 - 27316 1183 407 aa, chain + ## HITS:1 COG:VC2731 KEGG:ns NR:ns ## COG: VC2731 COG1459 # Protein_GI_number: 15642725 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Vibrio cholerae # 1 405 1 404 406 418 56.0 1e-116 MALFYYQALERNGRKTKGMIEADSARHARQLLRGKELIPVHIEARMNTSSGGMLQRRRHA HRRVAAADLALFTRQLATLVQAAMPLETCLQAVSEQSEKLHVKSLGMALRSRIQEGYTLS DSLREHPRVFDSLFCSMVAAGEKSGHLDVVLNRLADYTEQRQRLKSRLLQAMLYPLVLLV VATGVVTILLTAVVPKIIEQFDHLGHALPASTRTLIAMSDALQASGVYWLAGLLGLLVLG QRLLKNPAMRLRWDQTVLRLPVTGRVARGLNTARFSRTLSILTASSVPLLEGIQTAAAVS ANRYVEQQLLLAADRVREGSSLRAALAELRLFPPMMLYMIASGEQSGELETMLEQAAVNQ EREFDTQVGLALGLFEPALVVMMAGVVLFIVIAILEPMLQLNNMVGM >gi|296493199|gb|ADTK01000302.1| GENE 20 27333 - 27788 599 151 aa, chain + ## HITS:1 COG:VC2730 KEGG:ns NR:ns ## COG: VC2730 COG2165 # Protein_GI_number: 15642724 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, pseudopilin PulG # Organism: Vibrio cholerae # 6 151 2 146 146 230 76.0 6e-61 MYSLSRTQKPRAGFTLLEVMVVIVILGVLASLVVPNLLGNKEKADRQKAISDIVALENAL DMYRLDNGRYPTTEQGLEALIQQPANMADSRNYRTGGYIKRLPKDPWGNDYQYLSPGEKG LFDVYTLGADGQENGEGAGADIGNWNLQEFQ >gi|296493199|gb|ADTK01000302.1| GENE 21 27825 - 28355 409 176 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_3423 NR:ns ## KEGG: EcE24377A_3423 # Name: gspH # Def: general secretion pathway protein H # Organism: E.coli_E24377A # Pathway: Bacterial secretion system [PATH:ecw03070] # 1 176 12 187 187 
332 100.0 3e-90 MLVIFLIGLASSGVVQTFATDSEPPAKKAAQDFLTRFAQFKDRAVIEGQTLGVLIDAPGY QFMQRRQGQWLPVSATRLSAQVTVPKQVQMLLQPGSDIWQKEYALELQRRRLTLHDIELE LQKEAKKKTPQIRFSPFEPATPFTLRFYSAAQNACWAVKLAHDGALSLNQCDERMP >gi|296493199|gb|ADTK01000302.1| GENE 22 28352 - 28723 340 123 aa, chain + ## HITS:1 COG:VC2728 KEGG:ns NR:ns ## COG: VC2728 COG2165 # Protein_GI_number: 15642722 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, pseudopilin PulG # Organism: Vibrio cholerae # 2 112 4 112 117 79 41.0 2e-15 MKRGFTLLEVMLALAIFALAATAVLQIASGALSNQQILEEKTVAGWVAENQTALLYLMTR EQRAVRQQGESDMAGSRWYWRTTPLSTGNALLQAVDIEVSLHEDFSSVIQSRRAWFSAVG GQQ >gi|296493199|gb|ADTK01000302.1| GENE 23 28720 - 29319 463 199 aa, chain + ## HITS:1 COG:VC2727 KEGG:ns NR:ns ## COG: VC2727 COG4795 # Protein_GI_number: 15642721 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulJ # Organism: Vibrio cholerae # 2 197 10 207 221 117 38.0 9e-27 MRRTRAGFTLLEMLVAIAIFASLALMAQQVTNGVTRVNSAVAGHDQKLNLMQQTMSFLTH DLTQMMPRPVRGDQGQREPALLAGAGVLASESEGMRFVRGGVVNPLMRLPRSNLLTVGYR IHDGYLERLAWPLTDAAGSVKPTMQKLIPADSLRLQFYDGTRWQESWSSVQAIPVAVRMT LHSPQWGEIERIWLLRGPQ >gi|296493199|gb|ADTK01000302.1| GENE 24 29322 - 30299 978 325 aa, chain + ## HITS:1 COG:VC2726 KEGG:ns NR:ns ## COG: VC2726 COG3156 # Protein_GI_number: 15642720 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulK # Organism: Vibrio cholerae # 7 305 5 307 336 212 43.0 7e-55 MITSPPKRGMALVVVLVLLAVMMLVTITLSGRMQQQLGRTRSQQEYQQALWYSASAESLA LSALSLSLKNEKRVHLAQPWASGPRFFPLPQGQIAVTLRDAQACFNLNALAQPTTASRPL AVQQLIALITRLDVPAYRAELIAESLWEFIDEDRSVQTRLGREDSEYLARSVPFYAANQP LADISEMRVVQGMDAGLYQKLKPLVCALPMTRQQININTLDVTQSVILEALFDPWLSPVQ ARALLQQRPAKGWEDVDQFLAQPLLADVDERTKKQLKTVLSVDSNYFWLRSDITVNEIEL TMNSLIVRMGPQHFSVLWHQTGESE >gi|296493199|gb|ADTK01000302.1| GENE 25 30296 - 31474 
683 392 aa, chain + ## HITS:1 COG:yghE KEGG:ns NR:ns ## COG: yghE COG3297 # Protein_GI_number: 16130869 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulL # Organism: Escherichia coli K12 # 107 392 1 286 286 531 97.0 1e-150 MSSILEIFFPLCAADPIHWQRRTPDVEHGIWSDVANEQLQQWLQTDAIRLYIPGEWISVW QVELPVVARKQIPTILPALLEEELNQDIDELHFAPLKIDQQLATVAVIHQQHMRNIAQWL QENGITRATVAPDWMSIPCGFMAGDAQRVICRIDECRGWSAGLALAPVMFRAQLNEQDLP LSLTVVGIAPEKLSAWAGADAERLTVTALPAVTTYGEPEGNLLTGPWQPRVSYRKQWARW RVMILPILLILVALAVERGVTLWSVSEQVAQSRTQAEEQFLTLFPEQKRIVNLRSQVTMA LKKYRPQADDTRLLAELSAIASTLKSASLSDIEMRGFTFDQKRQTLHLQLRAANFASFDK LRSALAADFVVQQDALQKEGDAVSGGVTLRRK >gi|296493199|gb|ADTK01000302.1| GENE 26 31476 - 32012 419 178 aa, chain + ## HITS:1 COG:yghD KEGG:ns NR:ns ## COG: yghD COG3149 # Protein_GI_number: 16130868 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulM # Organism: Escherichia coli K12 # 1 178 1 178 178 333 98.0 8e-92 MLRDKFIHYFQQWRERQLSRGEHWLAQHLAGRSPREKGMLLAAVVFLFSVGYYVLIWQPL SERIEQQETILQQLVAMNTRLKSAAPDIIAARKSATTTPAQVSRVISDSASAHSVVIRRI AERGENIQVWIEPVVFNDLLKWLNALDEKYALRVTQIDVSAAEKPGMVNVQRLEFGRG >gi|296493199|gb|ADTK01000302.1| GENE 27 32293 - 32865 314 190 aa, chain - ## HITS:1 COG:no KEGG:B21_02762 NR:ns ## KEGG: B21_02762 # Name: ybl125 # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 190 1 189 189 372 97.0 1e-102 MSKITVSRPEVVNENTDVICSTSVRSRSLEYDNFPEISEANILSTFEQLHQNKDEVFERG VINVFKGLSWDYKTNSPCKFGSKIIVNNLVRWDQWGFHLISGMQADRLADLERMLHLLSG KPIPDNRGNITINLDDHIQSVQGKGRYEDEMFIIKYFKKGGSAHITFKRLELIDRINDII ARHFPSVLSA >gi|296493199|gb|ADTK01000302.1| GENE 28 32962 - 33120 63 52 aa, chain - ## HITS:1 COG:no KEGG:EC55989_4897 NR:ns ## KEGG: EC55989_4897 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 52 138 189 189 101 92.0 1e-20 
IAWLQDNIDCESGIIFDNNEDKTDSAALLPCIEQAREDIRTLRQLQLQHQNR Prediction of potential genes in microbial genomes Time: Mon May 16 15:58:24 2011 Seq name: gi|296493198|gb|ADTK01000303.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont939.1, whole genome shotgun sequence Length of sequence - 1255 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 37 - 201 56 ## EcSMS35_1129 hypothetical protein - Prom 287 - 346 4.0 + Prom 973 - 1032 2.2 2 2 Tu 1 . + CDS 1115 - 1253 68 ## COG2963 Transposase and inactivated derivatives Predicted protein(s) >gi|296493198|gb|ADTK01000303.1| GENE 1 37 - 201 56 54 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_1129 NR:ns ## KEGG: EcSMS35_1129 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 54 1 54 54 82 100.0 3e-15 MPLITHLLNVEELSRLLKNVALSVINLFILDDVSEKMEYIPDWVVVTRLNIWWR >gi|296493198|gb|ADTK01000303.1| GENE 2 1115 - 1253 68 46 aa, chain + ## HITS:1 COG:ECs1381 KEGG:ns NR:ns ## COG: ECs1381 COG2963 # Protein_GI_number: 15830635 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 46 1 46 108 86 95.0 1e-17 MTKNTRFSPEVRQRAVRMVLESQSEYDSQWATICSIAPKIGCTPET Prediction of potential genes in microbial genomes Time: Mon May 16 15:58:26 2011 Seq name: gi|296493197|gb|ADTK01000304.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont942.1, whole genome shotgun sequence Length of sequence - 1353 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 101 - 1192 179 ## Pnap_4432 hypothetical protein - Prom 1218 - 1277 4.3 Predicted protein(s) >gi|296493197|gb|ADTK01000304.1| GENE 1 101 - 1192 179 363 aa, chain - ## HITS:1 COG:no KEGG:Pnap_4432 NR:ns ## KEGG: Pnap_4432 # Name: not_defined # Def: hypothetical protein # Organism: P.naphthalenivorans # Pathway: not_defined # 92 358 87 351 355 101 29.0 6e-20 MEMSQLKQPIFLKKIKKVINTIPGLEEQIFACRNKKRSDNPLLFIDRKDEERILMSRLQS QQKNEELASKLESLFHGNELSSPHSILCFIYWRYTKKIYRLSEDIISDVANTYVDNIPAQ ILKELPSWSIYVSAENLHTILPTSYPIHGFFFYPFLDNNGNIIRLFIIDDLKQSQGTTGL KEKDVDVVNNIIRIKDSREGLLDSRKMECIDGEVVVTVNEKLKDFRDREFNLLNAQISMV LYICSQINDIKEKNQFKRSEKHKKHVHTHHELPAQNIREWDVGIRMGQAIRQYRQAEPTG KERTTIGSKRPHIRRGHWHTYWTGSKKPELAHERKPRLIWLPPVPVNLEDVNKLPVVITP IDK Prediction of potential genes in microbial genomes Time: Mon May 16 15:58:32 2011 Seq name: gi|296493196|gb|ADTK01000305.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont948.1, whole genome shotgun sequence Length of sequence - 1771 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 33 - 509 437 ## COG2003 DNA repair proteins 2 1 Op 2 . + CDS 572 - 793 299 ## ECB_02806 hypothetical protein 3 1 Op 3 . + CDS 867 - 1235 238 ## ECS88_2101 antitoxin of the YeeV-YeeU toxin-antitoxin system; CP4-44 prophage 4 1 Op 4 . 
+ CDS 1324 - 1698 295 ## JW1987 toxin of the YeeV-YeeU toxin-antitoxin system Predicted protein(s) >gi|296493196|gb|ADTK01000305.1| GENE 1 33 - 509 437 158 aa, chain + ## HITS:1 COG:ECs1403 KEGG:ns NR:ns ## COG: ECs1403 COG2003 # Protein_GI_number: 15830657 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Escherichia coli O157:H7 # 1 158 1 158 158 296 98.0 2e-80 MQQLSFLPGEMTPGERSLILRALKTLDRHLHEPGVAFTSTHAAREWLILNMAGLEREEFR VLYLNNQNQLIAGEPLFTGTINRTEVHPREVIKRALYHNAAAVVLAHNHPSGEVTPSKAD RLITERLVQALGLVDIRVPDHLIVGGNQVFSFAEHGLL >gi|296493196|gb|ADTK01000305.1| GENE 2 572 - 793 299 73 aa, chain + ## HITS:1 COG:no KEGG:ECB_02806 NR:ns ## KEGG: ECB_02806 # Name: yeeT # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 73 1 73 73 150 100.0 1e-35 MKIITRGEAMRIHQQHPASRLFPFCTGKYRWHGSAEAYTGREVQDIPGVLAVFAERRKDS FGPYVRLMSVTLN >gi|296493196|gb|ADTK01000305.1| GENE 3 867 - 1235 238 122 aa, chain + ## HITS:1 COG:no KEGG:ECS88_2101 NR:ns ## KEGG: ECS88_2101 # Name: yeeU # Def: antitoxin of the YeeV-YeeU toxin-antitoxin system; CP4-44 prophage # Organism: E.coli_S88 # Pathway: not_defined # 1 122 1 122 122 249 99.0 2e-65 MSDTLPGTTLPDDNHDRPWWGLPCTVTPCFGARLVQEGNQLHYLADRAGIRGLFSDADAY HLDQAFPLLMKQLELMLTSGELNPRHQHTVTLYAKGLTCKADTLSSCGYVYLAVYPTPEM KN >gi|296493196|gb|ADTK01000305.1| GENE 4 1324 - 1698 295 124 aa, chain + ## HITS:1 COG:no KEGG:JW1987 NR:ns ## KEGG: JW1987 # Name: yeeV # Def: toxin of the YeeV-YeeU toxin-antitoxin system # Organism: E.coli_J # Pathway: not_defined # 1 124 1 124 124 249 99.0 2e-65 MKTLPVLPGQAASSRPSPVEIWQILLSRLLDQHYGLTLNDTPFADERVIEQHIEAGISLC DAVNFLVEKYALVRTDQPGFSACTRSQLINSIDILRARRATGLMARDNYRTVNNITLGKY PEAK Prediction of potential genes in microbial genomes Time: Mon May 16 15:58:43 2011 Seq name: gi|296493195|gb|ADTK01000306.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont950.1, whole genome shotgun sequence Length of sequence - 18204 bp Number of predicted genes - 16, with homology - 16 
Number of transcription units - 10, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 82 - 597 231 ## COG0732 Restriction endonuclease S subunits + Prom 1115 - 1174 7.5 2 2 Tu 1 . + CDS 1264 - 1485 106 ## CP0034 hypothetical protein + Term 1608 - 1655 5.6 + Prom 1554 - 1613 3.3 3 3 Tu 1 . + CDS 1659 - 2579 454 ## COG0524 Sugar kinases, ribokinase family + Prom 2630 - 2689 5.5 4 4 Op 1 . + CDS 2747 - 4264 989 ## COG4580 Maltoporin (phage lambda and maltose receptor) + Term 4275 - 4302 0.1 5 4 Op 2 1/0.000 + CDS 4325 - 5695 856 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 6 4 Op 3 5/0.000 + CDS 5701 - 7104 564 ## COG1621 Beta-fructosidases (levanase/invertase) 7 4 Op 4 . + CDS 7130 - 8137 496 ## COG1609 Transcriptional regulators 8 5 Tu 1 . - CDS 8689 - 9162 444 ## COG2916 DNA-binding protein H-NS - Prom 9393 - 9452 3.9 + Prom 9379 - 9438 4.0 9 6 Tu 1 . + CDS 9481 - 10218 407 ## COG3637 Opacity protein and related surface antigens + Prom 10547 - 10606 4.5 10 7 Op 1 . + CDS 10638 - 11534 302 ## COG2378 Predicted transcriptional regulator 11 7 Op 2 . + CDS 11583 - 12662 901 ## SeHA_C4730 hypothetical protein 12 7 Op 3 . + CDS 12709 - 14280 855 ## SeHA_C4731 hypothetical protein 13 7 Op 4 . + CDS 14277 - 14543 96 ## COG0727 Predicted Fe-S-cluster oxidoreductase + Term 14609 - 14651 1.5 + Prom 14594 - 14653 3.2 14 8 Tu 1 . + CDS 14689 - 14880 124 ## SeHA_C4734 hypothetical protein + Term 14888 - 14934 2.1 + Prom 15065 - 15124 3.0 15 9 Tu 1 . + CDS 15180 - 16619 617 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase - Term 16661 - 16697 3.0 16 10 Tu 1 . 
- CDS 16763 - 17806 482 ## SeHA_C4737 ShiA-like protein - Prom 17834 - 17893 3.1 Predicted protein(s) >gi|296493195|gb|ADTK01000306.1| GENE 1 82 - 597 231 171 aa, chain + ## HITS:1 COG:Cj1551c KEGG:ns NR:ns ## COG: Cj1551c COG0732 # Protein_GI_number: 15792859 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Campylobacter jejuni # 1 123 17 138 380 121 47.0 5e-28 MGQSPAGESYNEDGIGTLFFQGSTDFGWLFPTPRQYTTSPTRMAKKGDILLSVRAPVGDM NIANADCCIGRGLAALNSKSRSDGFLFYVMKYFKQVFERRNAEGTTFGSMTKDDLHSLQV VCPEPGLLKRYDDIVSEYNKMIFTRSLENQDLIKLRDWLLPILMNGQVKIK >gi|296493195|gb|ADTK01000306.1| GENE 2 1264 - 1485 106 73 aa, chain + ## HITS:1 COG:no KEGG:CP0034 NR:ns ## KEGG: CP0034 # Name: not_defined # Def: hypothetical protein # Organism: S.flexneri # Pathway: not_defined # 14 73 1 60 60 80 66.0 1e-14 MSRKPSSQRSFWLLYQPVKQLVVRKFDLSRVIATSIDTEKIKVRFSLSDVGNFLFSHGLS LTFVVKIYSLWRN >gi|296493195|gb|ADTK01000306.1| GENE 3 1659 - 2579 454 306 aa, chain + ## HITS:1 COG:ECs3242 KEGG:ns NR:ns ## COG: ECs3242 COG0524 # Protein_GI_number: 15832496 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli O157:H7 # 1 303 1 303 304 389 64.0 1e-108 MSTRVWVLGDAVVDLLPESQGRLLQCPGGAPANVAVGIARLGGKSAFIGKVGDDPFGRFM YQTLSTENVDTRYMSLDPQHRTSIVAVGLDEQGERNFTFMVRPSADLFLQPDGLPAFGPG EWLHLCSIALSAEPSRSTAFLAMEKIRQAGGNISFDPNIRSDLWQSEALLRKYLDRALSL ANIAKLSEEELLFISGESQVQQGAYSLVQRYSLTLLLITQGKNGVLVYFQGQFIHYPAKP VSVVDTTGAGDAFVAGLLAGLADSGIPTNTRQLERIIVQAQICGALATTAKGAMTALPRQ HDLPSQ >gi|296493195|gb|ADTK01000306.1| GENE 4 2747 - 4264 989 505 aa, chain + ## HITS:1 COG:STM4231 KEGG:ns NR:ns ## COG: STM4231 COG4580 # Protein_GI_number: 16767481 # Func_class: G Carbohydrate transport and metabolism # Function: Maltoporin (phage lambda and maltose receptor) # Organism: Salmonella typhimurium LT2 # 89 505 20 452 452 77 25.0 6e-14 MYKKTTLAVLVALLTDATTVHAQTDISSIESRLAAVGQRLKNAESRAQAAEARAKTAELQ 
VQKLAETQQQNQLTTQEVAQRTVQLEQKSAENSGFEFHGYARSGLLMNDAASSSKSGPYL TPAGETGGAVGRLGNEADTYVELNVEHKQTLDNGATTRFKAMLADGQRDYNDWTGGSSNL NIRQAFAELGALPSFTGAFKDSTVWAGKRFDRDNFDIHWLDSDVVFLAGTGGGIYDVKWN DTFRSNFSLYGRNFGDLDDIDNNVQNYILTMNHYAGPFQLMVSGLRAKDNDDRKDANGDL IQTDAANTGVHALVGLHNDTFYGLREGTAKTALLYGHGLGAEVKGIGSDGALLSEANTWR FASYGTTPLGSGWYVAPAILAQSSKDRYVKGDSYEWVTFNTRLIKEVTQNFALAFEGSYQ YMDLKPKGYQNHNAVNGSFYKLTFAPTLKANDINNFFSRPELRLFATWMDWSSKLDDFAS NDAFGSSGFNTGGEWNFGVQMETWF >gi|296493195|gb|ADTK01000306.1| GENE 5 4325 - 5695 856 456 aa, chain + ## HITS:1 COG:RSp1285_2 KEGG:ns NR:ns ## COG: RSp1285_2 COG1263 # Protein_GI_number: 17549504 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Ralstonia solanacearum # 91 443 1 354 374 387 61.0 1e-107 MDFDKIAQSLLPLLGGKENIASAAHCATRLRLVLVDDTLADQHAIGQIDGVKGCFRNSGQ MQIIFGTGVVNKVYAAFIQVAGISESSKADIARLAAQKLNPFQRIARLLSNIFVPIIPAI VASGLLMGLLGMVKTYGWVNTDNAIYILLDMCSSAAFIILPILIGFTAAREFGGNPYLGA TLGGILTHPALTNAWGVAAGFQTMNFFGFEIAMIGYQGTVFPVLLAVWFMSIVEKQLRRF IPDALDLILTPFLTVVISGFIALLIIGPAGRALGDGISFVLSTLIAHAGWLAGLLFGGLY SAIVITGIHHSFHAIEAGLLGNPAIGVNFLLPIWAMANVAQGGACLAVWFKTKDTKIKAI TLPSAFSAMLGITEAAIFGINLRFVKPFIAALIGGAAGGAWVVSVHVYMTAVGLTAIPGM AIVQPTSLVNYIIGMVIAFAVAFSLSLLFKYKTDEE >gi|296493195|gb|ADTK01000306.1| GENE 6 5701 - 7104 564 467 aa, chain + ## HITS:1 COG:BH1858 KEGG:ns NR:ns ## COG: BH1858 COG1621 # Protein_GI_number: 15614421 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Bacillus halodurans # 22 465 26 485 487 367 42.0 1e-101 MPSQMPAILQAVMKGQSKALADAHYPCWHLAPVTGLMNDPNGFCWSGGRYHLFYQWNPLA CDHKYKCWGHWSSTDLLHWQHEPLALMPDKEYDRNGCYSGSAVNNQGVLTLCYTGNVKFD DGSRTAWQCLATENNQGGFDKLGPVISLPDGYTGHVRDPKVWKHNSQWYMVLGAQDKEKR GKVLLYSSVDLNTWSFHGEIAGNGLNEIDNAGYMWECPDLFALDGEYILLCCPQGMVREH ERYLNTYPCAWLHGQFDYETGKFMHGAFSELDAGFEFYAPQTMEAPDGRRLLVGWMGGPD 
GEEMLQPTRKHHWQHQMTCFRELSFQKGKLFQMPIRELAQLREAEHFWQGKADHAPHVEI ERLEMDIIPSGELYLNFGNALALHLNDDGIQLQRRSLAGQEKLTRYWRGSVTSLKILCDS SSVEIFINNGEGVMSNRYFPHHPASLILQGESDVTLRYWSLRACMVE >gi|296493195|gb|ADTK01000306.1| GENE 7 7130 - 8137 496 335 aa, chain + ## HITS:1 COG:PM1847 KEGG:ns NR:ns ## COG: PM1847 COG1609 # Protein_GI_number: 15603712 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Pasteurella multocida # 5 324 7 330 339 275 42.0 1e-73 MRKTKRVTIKDIAELAGVSKATASLVLNGRGKELRVAQETRERVLSIAQQQHYQPSIHAR SLRDNRSHTIGLVVPEITNYGFAVFSHELETLCREAGVQLLISCTDENPSQETMVVNNMI ARQVDGLIVASSMLHDNDYQKLSEQLPVVLFDRHMNGSTLPLVITDSVSPTAALVADIAR SHPDEFYFLGGQPRLSPTRDRLEGFTQGLQQAGVTLQPEWIIHGNYHPSSGYEMFAALCA RLGRPPKALFTAACGLLEGVLRYMSQYNLLDSTIHLASFDDHYLYDSLSVRIDTIQQDNR QLAFHCFELISQLIEGETPSPLQRYLPASLQKRYR >gi|296493195|gb|ADTK01000306.1| GENE 8 8689 - 9162 444 157 aa, chain - ## HITS:1 COG:VC1130 KEGG:ns NR:ns ## COG: VC1130 COG2916 # Protein_GI_number: 15641143 # Func_class: R General function prediction only # Function: DNA-binding protein H-NS # Organism: Vibrio cholerae # 11 147 4 137 137 79 42.0 3e-15 MCKLTKEEEYSIISRTMTNIRSLRTYVRELDFEQLLDMQEKLNTVIEERREDAERKAAER KELETKCQQVFEYIVSVGLDPEEWQGPVSAIAGTTETKRKPKGGVRKAKYIFEDENGETR TWSGNGKMPLALRKQVNGDCTLETFLIENPNQHEQSE >gi|296493195|gb|ADTK01000306.1| GENE 9 9481 - 10218 407 245 aa, chain + ## HITS:1 COG:STM0306 KEGG:ns NR:ns ## COG: STM0306 COG3637 # Protein_GI_number: 16763689 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Opacity protein and related surface antigens # Organism: Salmonella typhimurium LT2 # 1 245 1 239 239 137 36.0 2e-32 MKKVIAVSALAMAGMFSAQTLADESKTGFYVTGKAGASVMSLSEQRFVDGEGAWADKYKG GDKSDTVFGAGLAVGYDFYQHYNVPVRTEVEFYGRGNAESKYRLSYWESVGGAEFDDAQN KLSVNTLMLNAYYDFRNSSAFTPWISAGLGYARVHHKTSYIYTDNSPAGSEVYSASASKY ENNLAWSLGVGVKYDVTQDFSLDLSYRYLDAGDSTLTYKDEDGAKYKSSVDVRSNEFMLG ATYNF >gi|296493195|gb|ADTK01000306.1| GENE 10 10638 - 11534 302 298 aa, chain + ## HITS:1 COG:Cj0571 KEGG:ns NR:ns ## COG: Cj0571 
COG2378 # Protein_GI_number: 15791931 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Campylobacter jejuni # 12 292 10 285 290 123 30.0 3e-28 MADSTRNRSAERLIDILVELQNYGVVSRHNLMKKYNITERTAYRDLNMLSPFIEACGDGK YRLISARAGNQSKESLHKSLARLLDTDAIFPERDEGFWQKLENRATEKHIRVQFHNPEHT IRDDLRKYLDVLEKAICNSNVCQIAYAGKIRIVHPYKLTNQRSIWYLLATEENKLKSFSL AKIKWLDIKKEKFAKSDEIQSLVSESCDPWVSDKTFDVVLFIHSNIAHYFLRRDLLPYQQ LLHKHDNGITLSCKASHKNQIIPLILYWLPNVEIIEPVWLKEAVLTMLGKYLTAENVS >gi|296493195|gb|ADTK01000306.1| GENE 11 11583 - 12662 901 359 aa, chain + ## HITS:1 COG:no KEGG:SeHA_C4730 NR:ns ## KEGG: SeHA_C4730 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 359 1 359 359 546 96.0 1e-154 MALSFIILGVAAGAAAFGGKKAYDGYQKKSEAEDIFEKAKERYEEHKGKFEQAEALTSSK LTTLGNKELEIGKDFSEFDRIAAELLARLEKEGNKDLKLSVPQHNLNKIHNLAISATEYI STVVAAGVSGAAAGFAVYSGVMAFAAASTGTPIAALSGAAAYNATMATIGGGSLAAGGMG MAGGAMVLGTAVAAPLIAIAGWAYDRHATKALENAHQCACDVERAVKKMNLAREHFAKVN QYVDEILAALNRMHAVFIKHYFEPLKSMHSVITEQQDISVSDEVIVLIDHGYSLAAIMTD IITTPLFKPRKQENGEAVIKDNVIEMETDDNGMNVINKEELDDVLASSVVKFDDFSSRS >gi|296493195|gb|ADTK01000306.1| GENE 12 12709 - 14280 855 523 aa, chain + ## HITS:1 COG:no KEGG:SeHA_C4731 NR:ns ## KEGG: SeHA_C4731 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 523 1 523 523 981 97.0 0 MNVEKFDWTEELHQTMVKSLVTSFGLDFLLLNDRKGGDVDTIHNVRNGIYATEAERQRYE GRGDYDSHHYHSHENYIATNRKGKRAHQAGTLTDAYTGEVFASDDNKNLDHIISAKEIHD DPGRILAERDGAELANDASNLAFTHESLNKSKKADSMDAFICTLQKNREDTLRQLAELES QVTLTEKEKKRLISLRKKADADTERMERADKYAREKYEATINQSYYLSSKFATAVASASL NSGFRMGTRQMLGLILAEIWFELRERIPHIIRQQRNSFCAGKCLSEITSALRACQERVQE KFSSFLIAFKDGVIGGILSGITTTLLNIFLTTERMIVRLIREMWNNLVQAFKVLVFNPEG LSPGQLAKAVTKLIAAGVAVAGGVMINEAMAQILIFPFGSELAAFCGALTTGILGLVMNY YLEYSEVMQRVWAFLDKFKDKFQRSLEYFQAVNAELDRYLVELTALEFSVDAAALNHFSL HLDSVNSEIERGLLLREEVSRRGIALPFESGNVASVRDWLNKR >gi|296493195|gb|ADTK01000306.1| GENE 13 14277 - 14543 96 88 aa, chain + 
## HITS:1 COG:jhp0701 KEGG:ns NR:ns ## COG: jhp0701 COG0727 # Protein_GI_number: 15611768 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster oxidoreductase # Organism: Helicobacter pylori J99 # 7 82 24 98 117 60 36.0 5e-10 MKLHPPFPCNQCGACCRHVNRAGETQCLDRGDGICRHYQTDSHLCAIYDKRPQICRVEDQ YLLNYQSQYSWQEFIALNQAACLILNKL >gi|296493195|gb|ADTK01000306.1| GENE 14 14689 - 14880 124 63 aa, chain + ## HITS:1 COG:no KEGG:SeHA_C4734 NR:ns ## KEGG: SeHA_C4734 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 63 3 65 65 77 96.0 1e-13 MLLKGLVGMALAWLVIGFGMELFFLADRVHPALRVIVGLGMLFLMIMIAGGMCGSLLSRK KTS >gi|296493195|gb|ADTK01000306.1| GENE 15 15180 - 16619 617 479 aa, chain + ## HITS:1 COG:yhbX KEGG:ns NR:ns ## COG: yhbX COG2194 # Protein_GI_number: 16131064 # Func_class: R General function prediction only # Function: Predicted membrane-associated, metal-dependent hydrolase # Organism: Escherichia coli K12 # 1 476 60 535 547 670 66.0 0 MKSITLRLIIAFPFTLLTAADISISLYSWCTFGTTFNDGFAISILQTDPDEVIRMFRMYV VYVIAFIILFLLFFCSAINKTSSLPSEKVTVITFLLLITVTLYSSFQFALKKQYQINEVD PYIVASRFATYTPFFNLNYFALAAKEYQRLMTIADTIPHYDLVITDNNTDVFVLVIGESA RTDNMSIYGYSRPTTPELQKQKSRLKLFTQAISGAPYTALAVPLALSADSVLHHDIRNYP DNIINMANQAGFDTWWLSAQSAFRQNGTAVASIAMRARNRIYVRGYDELLLPHLAEALNS NPGSRKLIVLHLTGSHEPVCSNWPRDKAVFKPLDTEEVCYDNSIHYTDSLLGQVFAMLET RRASVMYFSDHGLEYDPTKEHAYFHGGIKPSQQAYHVPMFIWYSPTLGEHVDRQTVNSVF STAYNDYLINAWMGVTRPAQPKTPEEVITRWQGKSVVFDASHNVFDYKVLRKNFNELQE >gi|296493195|gb|ADTK01000306.1| GENE 16 16763 - 17806 482 347 aa, chain - ## HITS:1 COG:no KEGG:SeHA_C4737 NR:ns ## KEGG: SeHA_C4737 # Name: not_defined # Def: ShiA-like protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 347 1 347 347 700 98.0 0 MNDRLRFEVNDNQGRFVFPDTWFGPLLGEFEEVLDAYDTDEISETSYINKLRRLAQREPD FIDIHAHLAYAFLEQNAPRKALNAALKGLAAGNRLIPESFSGEIIWMHPENRPYLRALYA AILANVHLQRHQDAVMLTDKILAYNPEDNQGARWLLGSELLRTGDHKQAFSVLKEHADEF 
SPYWYELGLLHFLNGEHVKAATAFRHGFATNTYIAEMLCGNLHPFPLAVRHNFSGSLDTA EDYYATYSPLWGQYPEALLFVNWLYNHSSVLHERAEIIKCAEMLMQEDDFEICESILRQQ KLLRERIDETLSEEIVQKCRNINGEYVWPWILPFSAAGMKHSSIQHQ Prediction of potential genes in microbial genomes Time: Mon May 16 15:59:05 2011 Seq name: gi|296493194|gb|ADTK01000307.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont950.2, whole genome shotgun sequence Length of sequence - 1455 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 59 - 1249 401 ## PROTEIN SUPPORTED gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 - Prom 1370 - 1429 3.4 Predicted protein(s) >gi|296493194|gb|ADTK01000307.1| GENE 1 59 - 1249 401 396 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 [Campylobacter concisus 13826] # 10 392 14 406 406 159 28 2e-39 MALTARQVETARPKEKDYKLSDERGLYLLVKTTGARYWRLKYRIAGKEKKLALGVYPDVS LAEARIKRDDARKIISEGGDPGEKKRKEKLTQKISATNTFHALATEWHQHKSLSWSESYA RSVLEALDKDIFPYLGKRSVTDILPLEMLEILRRIEKRGSLEKLRKVRQYCNQIFRYAIA PGRATVNPASELTSTLAAPKAAHFPHLRADELPVFLRKLAEYHGSPVTRMATNLLLLTGL RTIELRSAEWSEIDFDNALWTIPESRMKMRRKHVVPLSRQATDILLQLKTFSGQYRLVFP GRCDINKPMSEASINMVLKRIGYDGRATGHGFRHTMSTILHEQGFNSAWIEMQLAHVDKN AIRGTYNHAQYLDGRREMMQWYADYIDSLSRQESQG Prediction of potential genes in microbial genomes Time: Mon May 16 15:59:06 2011 Seq name: gi|296493193|gb|ADTK01000308.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont950.3, whole genome shotgun sequence Length of sequence - 2834 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.000 - CDS 119 - 694 763 ## COG1309 Transcriptional regulator 2 1 Op 2 7/0.000 - CDS 731 - 2428 1477 ## COG4232 Thiol:disulfide interchange protein 3 1 Op 3 . 
- CDS 2404 - 2742 225 ## COG1324 Uncharacterized protein involved in tolerance to divalent cations Predicted protein(s) >gi|296493193|gb|ADTK01000308.1| GENE 1 119 - 694 763 191 aa, chain - ## HITS:1 COG:ECs5116 KEGG:ns NR:ns ## COG: ECs5116 COG1309 # Protein_GI_number: 15834370 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 191 9 199 199 368 99.0 1e-102 MQREDVLGEALKLLELQGIANTTLEMVAERVDYPLDELRRFWPDKEAILYDALRYLSQQI DVWRRQLMLDETQTAEQKLLARYQALSECVKNNRYPGCLFIAACTFYPDPGHPIHQLADQ QKSAAYDFTHELLTTLEVDDPAMVAKQMELVLEGCLSRMLVNRSQADVDTAHRLAEDILR FARCRQGGALT >gi|296493193|gb|ADTK01000308.1| GENE 2 731 - 2428 1477 565 aa, chain - ## HITS:1 COG:dsbD KEGG:ns NR:ns ## COG: dsbD COG4232 # Protein_GI_number: 16131961 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol:disulfide interchange protein # Organism: Escherichia coli K12 # 1 565 1 565 565 1092 99.0 0 MAQRIFTLILLLCSTSVFAGLFDAPGRSQFVPADQAFTFDFQQNQHDLNLTWQIKDGYYL YRKQIRITPEHAKIADVQLPQGVWHEDEFYGKSEIYRDRLTLPVTINQASAGATLTVTYQ GCADAGFCYPPETKTVPLSEVVANNAASQPVSVSQQEQHTAQLPFSALWALLIGIGIAFT PCVLPMYPLISGIVLGGKQRLSTARALLLTFIYVQGMALTYTALGLVVAAAGLQFQAALQ HPYVLIGLAIVFTLLAMSMFGLFTLQLPSSLQTRLTLMSNRQQGGSPGGVFVMGAIAGLI CSPCTTAPLSAILLYIAQSGNMWLGGGTLYLYALGMGLPLMLITVFGNRLLPKSGPWMEQ VKTAFGFVILALPVFLLERVIGDVWGLRLWSALGVAFFGWAFITSLQAKRGWMRVVQIIL LAAALVSVRPLQDWAFGATHTAQTQTHLNFTQIKTVDELNQALVEAKGKPVMLDLYADWC VACKEFEKYTFSDPQVQKALADTVLLQANVTANDAQDVALLKHLNVLGLPTILFFDGQGQ EHPQARVTGFMDAETFSAHLRDRQP >gi|296493193|gb|ADTK01000308.1| GENE 3 2404 - 2742 225 112 aa, chain - ## HITS:1 COG:ECs5118 KEGG:ns NR:ns ## COG: ECs5118 COG1324 # Protein_GI_number: 15834372 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in tolerance to divalent cations # Organism: Escherichia coli O157:H7 # 1 112 1 112 112 198 100.0 2e-51 MLDEKSSNTASVVVLCTAPDEATAQDLAAKVLAEKLAACATLIPGATSLYYWEGKLEQEY 
EVQMILKTTVSHQQALLECLKSHHPYQTPELLVLPVTHGDTDYLSWLNASLR Prediction of potential genes in microbial genomes Time: Mon May 16 15:59:11 2011 Seq name: gi|296493192|gb|ADTK01000309.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont950.4, whole genome shotgun sequence Length of sequence - 13555 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 12, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 57 - 1358 1284 ## COG2704 Anaerobic C4-dicarboxylate transporter - Prom 1394 - 1453 3.7 - Term 1415 - 1452 7.1 2 2 Tu 1 . - CDS 1476 - 2912 1548 ## COG1027 Aspartate ammonia-lyase - Prom 3070 - 3129 5.7 + Prom 3133 - 3192 3.1 3 3 Tu 1 . + CDS 3249 - 3725 347 ## COG3030 Protein affecting phage T7 exclusion by the F plasmid + Term 3734 - 3778 -0.4 4 4 Tu 1 . - CDS 3741 - 4997 1199 ## COG0531 Amino acid transporters - Prom 5062 - 5121 2.3 + Prom 5094 - 5153 6.0 5 5 Op 1 41/0.000 + CDS 5273 - 5566 469 ## COG0234 Co-chaperonin GroES (HSP10) 6 5 Op 2 . + CDS 5610 - 7256 2383 ## PROTEIN SUPPORTED gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 + Term 7286 - 7323 6.0 + Prom 7284 - 7343 3.8 7 6 Tu 1 . + CDS 7394 - 7747 345 ## ECIAI1_4377 hypothetical protein + Term 7802 - 7843 -0.2 - Term 7794 - 7826 -0.2 8 7 Tu 1 . - CDS 7940 - 8809 818 ## EcE24377A_4701 hypothetical protein - Prom 8933 - 8992 6.6 9 8 Tu 1 . - CDS 9204 - 10232 936 ## COG1509 Lysine 2,3-aminomutase - Prom 10277 - 10336 3.3 + Prom 10171 - 10230 3.7 10 9 Tu 1 . + CDS 10274 - 10840 587 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) + Term 10847 - 10887 10.5 + Prom 11047 - 11106 4.2 11 10 Tu 1 . + CDS 11128 - 11274 119 ## COG5510 Predicted small secreted protein + Term 11282 - 11319 6.2 + Prom 11314 - 11373 3.4 12 11 Tu 1 . 
+ CDS 11449 - 11766 411 ## COG2076 Membrane transporters of cations and cationic drugs 13 12 Op 1 3/0.600 - CDS 11763 - 12296 702 ## COG3040 Bacterial lipocalin - Prom 12317 - 12376 2.0 - Term 12345 - 12374 2.1 14 12 Op 2 . - CDS 12385 - 13389 824 ## COG1680 Beta-lactamase class C and other penicillin binding proteins Predicted protein(s) >gi|296493192|gb|ADTK01000309.1| GENE 1 57 - 1358 1284 433 aa, chain - ## HITS:1 COG:ECs5119 KEGG:ns NR:ns ## COG: ECs5119 COG2704 # Protein_GI_number: 15834373 # Func_class: R General function prediction only # Function: Anaerobic C4-dicarboxylate transporter # Organism: Escherichia coli O157:H7 # 1 433 1 433 433 704 99.0 0 MLVVELIIVLLAIFLGARLGGIGIGFAGGLGVLVLAAIGVKPGNIPFDVISIIMAVIAAI SAMQVAGGLDYLVHQTEKLLRRNPKYITILAPIVTYFLTIFAGTGNISLATLPVIAEVAK EQGVKPCRPLSTAVVSAQIAITASPISAAVVYMSSVMEGHGISYLHLLSVVIPSTLLAVL VMSFLVTMLFNSKLSDDPIYRKRLEEGLVELRGEKQIEIKSGAKTSVWLFLLGVVGVVIY AIINSPSMGLVEKPLMNTTNAILIIMLSVATLTTVICKVDTDNILNSSTFKAGMSACICI LGVAWLGDTFVSNNIDWIKDTAGEVIQGHPWLLAVIFFFASALLYSQAATAKALMPMALA LNVSPLTAVASFAAVSGLFILPTYPTLVAAVQMDDTGTTRIGKFVFNHPFFIPGTLGVAL AVCFGFLLGSFML >gi|296493192|gb|ADTK01000309.1| GENE 2 1476 - 2912 1548 478 aa, chain - ## HITS:1 COG:ECs5120 KEGG:ns NR:ns ## COG: ECs5120 COG1027 # Protein_GI_number: 15834374 # Func_class: E Amino acid transport and metabolism # Function: Aspartate ammonia-lyase # Organism: Escherichia coli O157:H7 # 1 478 16 493 493 932 100.0 0 MSNNIRIEEDLLGTREVPADAYYGVHTLRAIENFYISNNKISDIPEFVRGMVMVKKAAAM ANKELQTIPKSVANAIIAACDEVLNNGKCMDQFPVDVYQGGAGTSVNMNTNEVLANIGLE LMGHQKGEYQYLNPNDHVNKCQSTNDAYPTGFRIAVYSSLIKLVDAINQLREGFERKAVE FQDILKMGRTQLQDAVPMTLGQEFRAFSILLKEEVKNIQRTAELLLEVNLGATAIGTGLN TPKEYSPLAVKKLAEVTGFPCVPAEDLIEATSDCGAYVMVHGALKRLAVKMSKICNDLRL LSSGPRAGLNEINLPELQAGSSIMPAKVNPVVPEVVNQVCFKVIGNDTTVTMAAEAGQLQ LNVMEPVIGQAMFESVHILTNACYNLLEKCINGITANKEVCEGYVYNSIGIVTYLNPFIG HHNGDIVGKICAETGKSVREVVLERGLLTEAELDDIFSVQNLMHPAYKAKRYTDESEQ >gi|296493192|gb|ADTK01000309.1| GENE 3 3249 - 3725 347 158 aa, chain + ## HITS:1 
COG:yjeG KEGG:ns NR:ns ## COG: yjeG COG3030 # Protein_GI_number: 16132265 # Func_class: R General function prediction only # Function: Protein affecting phage T7 exclusion by the F plasmid # Organism: Escherichia coli K12 # 1 158 1 158 158 241 100.0 3e-64 MRWLPFIAIFLYVYIEISIFIQVAHVLGVLLTLVLVIFTSVIGMSLVRNQGFKNFVLMQQ KMAAGENPAAEMIKSVSLIIAGLLLLLPGFFTDFLGLLLLLPPVQKHLTVKLMPHLRFSR MPGGGFSAGTGGGNTFDGEYQRKDDERDRLDHKDDRQD >gi|296493192|gb|ADTK01000309.1| GENE 4 3741 - 4997 1199 418 aa, chain - ## HITS:1 COG:yjeH KEGG:ns NR:ns ## COG: yjeH COG0531 # Protein_GI_number: 16131966 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 418 1 418 418 702 100.0 0 MSGLKQELGLAQGIGLLSTSLLGTGVFAVPALAALVAGNNSLWAWPVLIILVFPIAIVFA ILGRHYPSAGGVAHFVGMAFGSRLERVTGWLFLSVIPVGLPAALQIAAGFGQAMFGWHSW QLLLAELGTLALVWYIGTRGASSSANLQTVIAGLIVALIVAIWWAGDIKPANIPFPAPGN IELTGLFAALSVMFWCFVGLEAFAHLASEFKNPERDFPRALMIGLLLAGLVYWGCTVVVL HFDAYGEKMAAAASLPKIVVQLFGVGALWIACVIGYLACFASLNIYIQSFARLVWSQAQH NPDHYLARLSSRHIPNNALNAVLGCCVVSTLVIHALEINLDALIIYANGIFIMIYLLCML AGCKLLQGRYRLLAVVGGLLCVLLLAMVGWKSLYALIMLAGLWLLLPKRKTPENGITT >gi|296493192|gb|ADTK01000309.1| GENE 5 5273 - 5566 469 97 aa, chain + ## HITS:1 COG:ECs5123 KEGG:ns NR:ns ## COG: ECs5123 COG0234 # Protein_GI_number: 15834377 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Co-chaperonin GroES (HSP10) # Organism: Escherichia coli O157:H7 # 1 97 1 97 97 158 100.0 2e-39 MNIRPLHDRVIVKRKEVETKSAGGIVLTGSAAAKSTRGEVLAVGNGRILENGEVKPLDVK VGDIVIFNDGYGVKSEKIDNEEVLIMSESDILAIVEA >gi|296493192|gb|ADTK01000309.1| GENE 6 5610 - 7256 2383 548 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 [Haemophilus parasuis 29755] # 1 548 1 547 547 922 87 0.0 MAAKDVKFGNDARVKMLRGVNVLADAVKVTLGPKGRNVVLDKSFGAPTITKDGVSVAREI ELEDKFENMGAQMVKEVASKANDAAGDGTTTATVLAQAIITEGLKAVAAGMNPMDLKRGI DKAVTAAVEELKALSVPCSDSKAIAQVGTISANSDETVGKLIAEAMDKVGKEGVITVEDG 
TGLQDELDVVEGMQFDRGYLSPYFINKPETGAVELESPFILLADKKISNIREMLPVLEAV AKAGKPLLIIAEDVEGEALATLVVNTMRGIVKVAAVKAPGFGDRRKAMLQDIATLTGGTV ISEEIGMELEKATLEDLGQAKRVVINKDTTTIIDGVGEEAAIQGRVAQIRQQIEEATSDY DREKLQERVAKLAGGVAVIKVGAATEVEMKEKKARVEDALHATRAAVEEGVVAGGGVALI RVASKLADLRGQNEDQNVGIKVALRAMEAPLRQIVLNCGEEPSVVANTVKGGDGNYGYNA ATEEYGNMIDMGILDPTKVTRSALQYAASVAGLMITTECMVTDLPKNDAADLGAAGGMGG MGGMGGMM >gi|296493192|gb|ADTK01000309.1| GENE 7 7394 - 7747 345 117 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_4377 NR:ns ## KEGG: ECIAI1_4377 # Name: yjeI # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 117 1 117 117 198 100.0 6e-50 MHVKYLAGIVGAALLMAGCSSSNELSAAGQSVRIVDEQPGAECQLIGTATGKQSNWLSGQ HGEEGGSMRGAANDLRNQAAAMGGNVIYGISSPSQGMLSSFVPTDSQIIGQVYKCPN >gi|296493192|gb|ADTK01000309.1| GENE 8 7940 - 8809 818 289 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_4701 NR:ns ## KEGG: EcE24377A_4701 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 289 1 289 289 553 100.0 1e-156 MAISIKGVNTGVIRKSNNFIALALKIKEPRNKESLFFMSVMELRDLLIALESRLHQKHKL DAAARLQYEQARDKVIKKMAENIPEILVDELKNADINRRVNTLELTDNQGENLTFVLTLH DGNKCELVVNELQIEMLARAIIHAINNAEMRELALRITSLLDFLPLYDVDCQENGNLEYD TYSQPEWKHNLFDHYLAVLYRFKDESGKEQFSGAVVKTREATPGKEVEAITRRMLDFSPR LKKLVGVPCQVYVRTVAANNAQPLTQDQCLRALHHLRVQSTSKTAPQAK >gi|296493192|gb|ADTK01000309.1| GENE 9 9204 - 10232 936 342 aa, chain - ## HITS:1 COG:yjeK KEGG:ns NR:ns ## COG: yjeK COG1509 # Protein_GI_number: 16131971 # Func_class: E Amino acid transport and metabolism # Function: Lysine 2,3-aminomutase # Organism: Escherichia coli K12 # 1 342 1 342 342 671 98.0 0 MAHIVTLNTPSREDWLTQLADVVTDPDELLRLLNIDADEKLLAGRSAKKLFALRVPRSFI DRMEKGNPDDPLLRQVLTSQDEFVVASGFSTDPLEEQHSVVPGLLHKYHNRALLLVKGGC AVNCRYCFRRHFPYAENQGNKRNWQTALEYVAAHPELDEMIFSGGDPLMAKDHELDWLLT QLEAIPHIKRLRIHSRLPIVIPARITEALVERFARSTLQILLVNHINHANEVDETFRQAM AKLRRVGVTLLNQSVLLRGVNDNAQTLANLSNALFDAGVMPYYLHVLDKVQGAAHFMVSD DEARQIMRELLTLVSGYLVPKLAREIGGEPSKTPLDLQLRQQ 
>gi|296493192|gb|ADTK01000309.1| GENE 10 10274 - 10840 587 188 aa, chain + ## HITS:1 COG:ECs5128 KEGG:ns NR:ns ## COG: ECs5128 COG0231 # Protein_GI_number: 15834382 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Escherichia coli O157:H7 # 1 188 1 188 188 377 100.0 1e-105 MATYYSNDFRAGLKIMLDGEPYAVEASEFVKPGKGQAFARVKLRRLLTGTRVEKTFKSTD SAEGADVVDMNLTYLYNDGEFWHFMNNETFEQLSADAKAIGDNAKWLLDQAECIVTLWNG QPISVTPPNFVELEIVDTDPGLKGDTAGTGGKPATLSTGAVVKVPLFVQIGEVIKVDTRS GEYVSRVK >gi|296493192|gb|ADTK01000309.1| GENE 11 11128 - 11274 119 48 aa, chain + ## HITS:1 COG:STM4336 KEGG:ns NR:ns ## COG: STM4336 COG5510 # Protein_GI_number: 16767585 # Func_class: S Function unknown # Function: Predicted small secreted protein # Organism: Salmonella typhimurium LT2 # 1 48 1 48 48 62 95.0 2e-10 MVKKTIAAIFSVLVLSTVLTACNTTRGVGEDISDGGNAISGAATKAQQ >gi|296493192|gb|ADTK01000309.1| GENE 12 11449 - 11766 411 105 aa, chain + ## HITS:1 COG:ECs5129 KEGG:ns NR:ns ## COG: ECs5129 COG2076 # Protein_GI_number: 15834383 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane transporters of cations and cationic drugs # Organism: Escherichia coli O157:H7 # 1 105 51 155 155 160 100.0 4e-40 MSWIILVIAGLLEVVWAVGLKYTHGFSRLTPSVITVTAMIVSMALLAWAMKSLPVGTAYA VWTGIGAVGAAITGIVLLGESANPMRLASLALIVLGIIGLKLSTH >gi|296493192|gb|ADTK01000309.1| GENE 13 11763 - 12296 702 177 aa, chain - ## HITS:1 COG:ECs5130 KEGG:ns NR:ns ## COG: ECs5130 COG3040 # Protein_GI_number: 15834384 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Bacterial lipocalin # Organism: Escherichia coli O157:H7 # 1 177 1 177 177 365 99.0 1e-101 MRLLPLVAAATAAFLVVACSSPTPPRGVTVVNNFDAKRYLGTWYEIARFDHRFERGLEKV TATYSLRDDGGLNVINKGYNPDREMWQQSEGKAYFTGAPTRAALKVSFFGPFYGGYNVIA LDREYRHALVCGPDRDYLWILSRTPTISDEVKQEMLAVATREGFDVSKFIWVQQPGS >gi|296493192|gb|ADTK01000309.1| GENE 14 12385 - 13389 824 334 aa, chain - ## HITS:1 
COG:ECs5131 KEGG:ns NR:ns ## COG: ECs5131 COG1680 # Protein_GI_number: 15834385 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Escherichia coli O157:H7 # 1 334 44 377 377 638 97.0 0 MAVAVIYQGKPYYFTWGYADIAKKQPVTQQTLFELGSVSKTFTGVLGGDAIARGEIKLSD PATKYWPELTAKQWNGITLLHLATYTAGGLPLQVPDEVKSSSDLLRFYQNWQPAWAPGTQ RLYANSSIGLFGALAVKPSGLSFEQAMQTRVFQPLKLNHTWINVPPPEEKNYAWGYREGK AVHVSPGALDAEAYGVKSTIEDMARWVRSNMNPRDINDKTLQQGIQLAQSRYWQTGDMYQ GLGWEMLDWPVNPDSIINGSGNKIALAAHPVKAITPPTPAVRASWVHKTGATSGFGSYVA FIPEKELGIVMLANKNYPNPARVAAAWQILNALQ Prediction of potential genes in microbial genomes Time: Mon May 16 15:59:34 2011 Seq name: gi|296493191|gb|ADTK01000310.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont950.5, whole genome shotgun sequence Length of sequence - 46930 bp Number of predicted genes - 47, with homology - 47 Number of transcription units - 18, operones - 9 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 2 1 Op 2 10/0.000 - CDS 392 - 787 503 ## COG3029 Fumarate reductase subunit C 3 1 Op 3 36/0.000 - CDS 798 - 1532 866 ## COG0479 Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit 4 1 Op 4 . - CDS 1525 - 3333 1942 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit - Prom 3498 - 3557 3.4 + Prom 3456 - 3515 7.1 5 2 Op 1 . + CDS 3658 - 4635 914 ## COG2269 Truncated, possibly inactive, lysyl-tRNA synthetase (class II) + Term 4668 - 4696 -1.0 + Prom 4674 - 4733 3.3 6 2 Op 2 . + CDS 4899 - 6356 1349 ## COG0531 Amino acid transporters + Term 6463 - 6510 5.7 - Term 6355 - 6406 10.8 7 3 Op 1 5/0.286 - CDS 6507 - 9830 3547 ## COG3264 Small-conductance mechanosensitive channel 8 3 Op 2 2/0.857 - CDS 9852 - 10820 970 ## COG0688 Phosphatidylserine decarboxylase - Prom 10851 - 10910 2.1 9 3 Op 3 . - CDS 10916 - 11968 972 ## COG1162 Predicted GTPases - Prom 12048 - 12107 2.7 + Prom 11961 - 12020 1.9 10 4 Tu 1 . 
+ CDS 12063 - 12608 761 ## COG1949 Oligoribonuclease (3'->5' exoribonuclease) + Term 12841 - 12909 30.4 + TRNA 12819 - 12894 93.7 # Gly GCC 0 0 + TRNA 12930 - 13005 93.7 # Gly GCC 0 0 11 5 Tu 1 . - CDS 13275 - 14414 951 ## COG1600 Uncharacterized Fe-S protein - Prom 14439 - 14498 5.0 + Prom 14323 - 14382 2.8 12 6 Op 1 6/0.000 + CDS 14428 - 15960 1044 ## PROTEIN SUPPORTED gi|153825000|ref|ZP_01977667.1| ribosomal protein S15 13 6 Op 2 13/0.000 + CDS 15932 - 16393 537 ## COG0802 Predicted ATPase or kinase 14 6 Op 3 10/0.000 + CDS 16412 - 17749 1002 ## COG0860 N-acetylmuramoyl-L-alanine amidase 15 6 Op 4 12/0.000 + CDS 17759 - 19606 1776 ## COG0323 DNA mismatch repair enzyme (predicted ATPase) 16 6 Op 5 15/0.000 + CDS 19644 - 20549 559 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase 17 6 Op 6 16/0.000 + CDS 20635 - 20943 274 ## COG1923 Uncharacterized host factor I protein + Term 20965 - 21010 6.2 18 6 Op 7 8/0.000 + CDS 21020 - 22300 733 ## PROTEIN SUPPORTED gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 19 6 Op 8 21/0.000 + CDS 22386 - 23645 1414 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs 20 6 Op 9 11/0.000 + CDS 23648 - 24652 1289 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs + Term 24682 - 24720 7.2 21 6 Op 10 6/0.000 + CDS 24734 - 24931 245 ## COG3242 Uncharacterized protein conserved in bacteria + Prom 24952 - 25011 4.0 22 6 Op 11 5/0.286 + CDS 25035 - 26333 1437 ## COG0104 Adenylosuccinate synthase + Prom 26459 - 26518 4.2 23 7 Op 1 2/0.857 + CDS 26538 - 26963 273 ## COG1959 Predicted transcriptional regulator 24 7 Op 2 7/0.000 + CDS 27002 - 29443 1244 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 + Prom 29524 - 29583 7.1 25 7 Op 3 4/0.429 + CDS 29623 - 30354 455 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 + Term 30369 - 30410 7.4 + Prom 30367 - 30426 6.9 26 8 Op 1 6/0.000 + CDS 30481 - 30882 342 ## COG3789 
Uncharacterized protein conserved in bacteria 27 8 Op 2 . + CDS 30901 - 31599 876 ## COG1842 Phage shock protein A (IM30), suppresses sigma54-dependent transcription + Term 31605 - 31643 6.1 28 9 Op 1 . + CDS 31650 - 32309 678 ## ECO103_4976 hypothetical protein 29 9 Op 2 5/0.286 + CDS 32327 - 32725 363 ## COG3766 Predicted membrane protein 30 9 Op 3 5/0.286 + CDS 32735 - 33373 331 ## COG5463 Predicted integral membrane protein 31 9 Op 4 1/0.857 + CDS 33376 - 34539 1071 ## COG0754 Glutathionylspermidine synthase 32 9 Op 5 . + CDS 34623 - 36248 1117 ## COG1960 Acyl-CoA dehydrogenases 33 10 Tu 1 . - CDS 36365 - 36652 254 ## SSON_4370 hypothetical protein - Prom 36717 - 36776 4.3 34 11 Tu 1 . - CDS 36789 - 37118 194 ## APECO1_2203 hypothetical protein - Prom 37145 - 37204 2.6 + Prom 37066 - 37125 4.1 35 12 Tu 1 . + CDS 37300 - 38049 537 ## COG1073 Hydrolases of the alpha/beta superfamily 36 13 Tu 1 . - CDS 38046 - 38801 814 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 38826 - 38885 3.6 - Term 38848 - 38899 11.1 37 14 Tu 1 . - CDS 38909 - 39973 1156 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold - Prom 40220 - 40279 4.8 + Prom 40197 - 40256 5.2 38 15 Op 1 11/0.000 + CDS 40328 - 41725 1593 ## COG3037 Uncharacterized protein conserved in bacteria 39 15 Op 2 13/0.000 + CDS 41741 - 42046 321 ## COG3414 Phosphotransferase system, galactitol-specific IIB component 40 15 Op 3 8/0.000 + CDS 42056 - 42520 607 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 41 15 Op 4 9/0.000 + CDS 42534 - 43184 861 ## COG0269 3-hexulose-6-phosphate synthase and related proteins 42 15 Op 5 8/0.000 + CDS 43194 - 44048 1049 ## COG3623 Putative L-xylulose-5-phosphate 3-epimerase 43 15 Op 6 . + CDS 44048 - 44734 811 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases + Term 44941 - 44965 -1.0 - Term 44792 - 44822 3.0 44 16 Tu 1 . 
- CDS 44863 - 45138 276 ## LF82_3468 UPF0379 protein YjfY - Prom 45190 - 45249 3.1 + Prom 45314 - 45373 3.6 45 17 Tu 1 . + CDS 45465 - 45860 680 ## PROTEIN SUPPORTED gi|221800783|ref|NP_418621.2| 30S ribosomal subunit protein S6 + Term 45987 - 46031 0.3 46 18 Op 1 27/0.000 + CDS 46186 - 46413 385 ## PROTEIN SUPPORTED gi|15834432|ref|NP_313205.1| 30S ribosomal protein S18 47 18 Op 2 . + CDS 46455 - 46904 720 ## PROTEIN SUPPORTED gi|15804792|ref|NP_290833.1| 50S ribosomal protein L9 Predicted protein(s) >gi|296493191|gb|ADTK01000310.1| GENE 1 22 - 381 440 119 aa, chain - ## HITS:1 COG:ECs5132 KEGG:ns NR:ns ## COG: ECs5132 COG3080 # Protein_GI_number: 15834386 # Func_class: C Energy production and conversion # Function: Fumarate reductase subunit D # Organism: Escherichia coli O157:H7 # 1 119 1 119 119 207 99.0 3e-54 MINPNPKRSDEPVFWGLFGAGGMWSAIIAPVMILLVGILLPLGLFPGDALSYERVLAFAQ SFIGRVFLFLMIVLPLWCGLHRMHHAMHDLKIHVPAGKWVFYGLAAILTVVTLIGIVTI >gi|296493191|gb|ADTK01000310.1| GENE 2 392 - 787 503 131 aa, chain - ## HITS:1 COG:ECs5133 KEGG:ns NR:ns ## COG: ECs5133 COG3029 # Protein_GI_number: 15834387 # Func_class: C Energy production and conversion # Function: Fumarate reductase subunit C # Organism: Escherichia coli O157:H7 # 1 131 1 131 131 225 100.0 2e-59 MTTKRKPYVRPMTSTWWKKLPFYRFYMLREGTAVPAVWFSIELIFGLFALKNGPEAWAGF VDFLQNPVIVIINLITLAAALLHTKTWFELAPKAANIIVKDEKMGPEPIIKSLWAVTVVA TIVILFVALYW >gi|296493191|gb|ADTK01000310.1| GENE 3 798 - 1532 866 244 aa, chain - ## HITS:1 COG:ECs5134 KEGG:ns NR:ns ## COG: ECs5134 COG0479 # Protein_GI_number: 15834388 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit # Organism: Escherichia coli O157:H7 # 1 244 1 244 244 516 100.0 1e-146 MAEMKNLKIEVVRYNPEVDTAPHSAFYEVPYDATTSLLDALGYIKDNLAPDLSYRWSCRM AICGSCGMMVNNVPKLACKTFLRDYTDGMKVEALANFPIERDLVVDMTHFIESLEAIKPY IIGNSRTADQGTNIQTPAQMAKYHQFSGCINCGLCYAACPQFGLNPEFIGPAAITLAHRY 
NEDSRDHGKKERMAQLNSQNGVWSCTFVGYCSEVCPKHVDPAAAIQQGKVESSKDFLIAT LKPR >gi|296493191|gb|ADTK01000310.1| GENE 4 1525 - 3333 1942 602 aa, chain - ## HITS:1 COG:frdA KEGG:ns NR:ns ## COG: frdA COG1053 # Protein_GI_number: 16131979 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Escherichia coli K12 # 1 602 1 602 602 1170 100.0 0 MQTFQADLAIVGAGGAGLRAAIAAAQANPNAKIALISKVYPMRSHTVAAEGGSAAVAQDH DSFEYHFHDTVAGGDWLCEQDVVDYFVHHCPTEMTQLELWGCPWSRRPDGSVNVRRFGGM KIERTWFAADKTGFHMLHTLFQTSLQFPQIQRFDEHFVLDILVDDGHVRGLVAMNMMEGT LVQIRANAVVMATGGAGRVYRYNTNGGIVTGDGMGMALSHGVPLRDMEFVQYHPTGLPGS GILMTEGCRGEGGILVNKNGYRYLQDYGMGPETPLGEPKNKYMELGPRDKVSQAFWHEWR KGNTISTPRGDVVYLDLRHLGEKKLHERLPFICELAKAYVGVDPVKEPIPVRPTAHYTMG GIETDQNCETRIKGLFAVGECSSVGLHGANRLGSNSLAELVVFGRLAGEQATERAATAGN GNEAAIEAQAAGVEQRLKDLVNQDGGENWAKIRDEMGLAMEEGCGIYRTPELMQKTIDKL AELQERFKRVRITDTSSVFNTDLLYTIELGHGLNVAECMAHSAMARKESRGAHQRLDEGC TERDDVNFLKHTLAFRDADGTTRLEYSDVKITTLPPAKRVYGGEADAADKAEAANKKEKA NG >gi|296493191|gb|ADTK01000310.1| GENE 5 3658 - 4635 914 325 aa, chain + ## HITS:1 COG:ECs5136 KEGG:ns NR:ns ## COG: ECs5136 COG2269 # Protein_GI_number: 15834390 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Truncated, possibly inactive, lysyl-tRNA synthetase (class II) # Organism: Escherichia coli O157:H7 # 1 325 11 335 335 661 99.0 0 MSETASWQPSASIPNLLKRAAIMAEIRRFFADRGVLEVETPCMSQATVTDIHLVPFETRF VGPGHSQGMNLWLMTSPEYHMKRLLVAGCGPVFQLCRSFRNEEMGRYHNPEFTMLEWYRP HYDMYRLMNEVDDLLQQVLDCPAAESLSYQQAFLRYLEIDPLSADKTQLREVAAKLDLSN VADTEEDRDTLLQLLFTFGVEPNIGKEKPTFVYHFPASQASLAQISTEDHRVAERFEVYY KGIELANGFHELTDAREQQQRFEQDNRKRAARGLPQHPIDQNLIDALKVGMPDCSGVALG VDRLVMLALGAETLAEVIAFSVDRA >gi|296493191|gb|ADTK01000310.1| GENE 6 4899 - 6356 1349 485 aa, chain + ## HITS:1 COG:ECs5137 KEGG:ns NR:ns ## COG: ECs5137 COG0531 # Protein_GI_number: 15834391 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia 
coli O157:H7 # 1 485 30 514 514 854 99.0 0 MIFTSVFGFANSPSAYYLMGYSAIPFYIFSALLFFIPFALMMAEMGAAYRKEEGGIYSWM NNSVGPRFAFIGTFMWFSSYIIWMVSTSAKVWVPFSTFLYGSDMTQHWRIAGLEPTQVVG LLAVAWMILVTVVASKGINKIARITAVGGIAVMCLNLVLLLVSITILLLNGGHFAQDINF LASPNPGYQSGLAMLSFVVFAIFAYGGIEAVGGLVDKTENPEKNFAKGIVFAAIVISIGY SLAIFLWGVSTNWQQVLSNGSVNLGNITYVLMKSLGMTLGNALHLSPEASLSLGVWFARI TGLSMFLAYTGAFFTLCYSPLKAIIQGTPKALWPEPMTRLNAMGMPSIAMWMQCGLVTIF ILLVSFGGGTASAFFNKLTLMANVSMTLPYLFLALAFPFFKARQDLDRPFVIFKTRMSAM IATVVVVLVVTFANVFTIIQPVVEAGDWDSTLWMIGGPVFFSLLAMAIYQNYCSRMANKP ELALD >gi|296493191|gb|ADTK01000310.1| GENE 7 6507 - 9830 3547 1107 aa, chain - ## HITS:1 COG:yjeP KEGG:ns NR:ns ## COG: yjeP COG3264 # Protein_GI_number: 16131984 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Escherichia coli K12 # 1 1107 1 1107 1107 2058 100.0 0 MRLIITFLMAWCLSWGAYAATAPDSKQITQELEQAKAAKPAQPEVVEALQSALNALEERK GSLERIKQYQQVIDNYPKLSATLRAQLNNMRDEPRSVSPGMSTDALNQEILQVSSQLLDK SRQAQQEQERAREIADSLNQLPQQQTDARRQLNEIERRLGTLTGNTPLNQAQNFALQSDS ARLKALVDELELAQLSANNRQELARLRSELAEKESQQLDAYLQALRNQLNSQRQLEAERA LESTELLAENSADLPKDIVAQFKINRELSAALNQQAQRMDLVASQQRQAASQTLQVRQAL NTLREQSQWLGSSNLLGEALRAQVARLPEMPKPQQLDTEMAQLRVQRLRYEDLLNKQPLL RQIHQADGQPLTAEQNRILEAQLRTQRELLNSLLQGGDTLLLELTKLKVSNGQLEDALKE VNEATHRYLFWTSDVRPMTIAWPLEIAQDLRRLISLDTFSQLGKASVMMLTSKETILPLF GALILVGCSIYSRRYFTRFLERSAAKVGKVTQDHFWLTLRTLFWSILVASPLPVLWMTLG YGLREAWPYPLAVAIGDGVTATVPLLWVVMICATFARPNGLFIAHFGWPRERVSRGMRYY LMSIGLIVPLIMALMMFDNLDDREFSGSLGRLCFILICGALAVVTLSLKKAGIPLYLNKE GSGDNITNHMLWNMMIGAPLVAILASAVGYLATAQALLARLETSVAIWFLLLVVYHVIRR WMLIQRRRLAFDRAKHRRAEMLAQRARGEEEAHHHSSPEGAIEVDESEVDLDAISAQSLR LVRSILMLIALLSVIVLWSEIHSAFGFLENISLWDVTSTVQGVESLEPITLGAVLIAILV FIITTQLVRNLPALLELAILQHLDLTPGTGYAITTITKYLLMLIGGLVGFSMIGIEWSKL QWLVAALGVGLGFGLQEIFANFISGLIILFEKPIRIGDTVTIRDLTGSVTKINTRATTIS DWDRKEIIVPNKAFITEQFINWSLSDSVTRVVLTIPAPADANSEEVTEILLTAARRCSLV IDNPAPEVFLVDLQQGIQIFELRIYAAEMGHRMPLRHEIHQLILAGFHAHGIDMPFPPFQ MRLESLNGKQTGRTLTSAGKGRQAGSL 
>gi|296493191|gb|ADTK01000310.1| GENE 8 9852 - 10820 970 322 aa, chain - ## HITS:1 COG:ECs5139 KEGG:ns NR:ns ## COG: ECs5139 COG0688 # Protein_GI_number: 15834393 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine decarboxylase # Organism: Escherichia coli O157:H7 # 1 322 1 322 322 664 100.0 0 MLNSFKLSLQYILPKLWLTRLAGWGASKRAGWLTKLVIDLFVKYYKVDMKEAQKPDTASY RTFNEFFVRPLRDEVRPIDTDPNVLVMPADGVISQLGKIEEDKILQAKGHNYSLEALLAG NYLMADLFRNGTFVTTYLSPRDYHRVHMPCNGILREMIYVPGDLFSVNHLTAQNVPNLFA RNERVICLFDTEFGPMAQILVGATIVGSIETVWAGTITPPREGIIKRWTWPAGENDGSVA LLKGQEMGRFKLGSTVINLFAPGKVNLVEQLESLSVTKIGQPLAVSTETFVTPDAEPAPL PAEEIEAEHDASPLVDDKKDQV >gi|296493191|gb|ADTK01000310.1| GENE 9 10916 - 11968 972 350 aa, chain - ## HITS:1 COG:ECs5140 KEGG:ns NR:ns ## COG: ECs5140 COG1162 # Protein_GI_number: 15834394 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Escherichia coli O157:H7 # 14 350 1 337 337 692 99.0 0 MSKNKLSKGQQRRVNANHQRRLKTSKEKPDYDDNLFGEPDEGIVISRFGMHADVESADGD VHRCNIRRTIRSLVTGDRVVWRPGKPAAEGVNVKGIVEAVHERTSVLTRPDFYDGVKPIA ANIDQIVIVSAILPELSLNIIDRYLVACETLQIEPIIVLNKIDLLDDEGMAFVNEQMDIY RNIGYRVLMVSSHTQDGLKPLEEALTGRISIFAGQSGVGKSSLLNALLGLQKEILTNDVS DNSGLGQHTTTAARLYHFPHGGDVIDSPGVREFGLWHLEPEQITQGFVEFHDYLGLCKYR DCKHDTDPGCAIREAVEEGKIAETRFENYHRILESMAQVKTRKNFSDTDD >gi|296493191|gb|ADTK01000310.1| GENE 10 12063 - 12608 761 181 aa, chain + ## HITS:1 COG:orn KEGG:ns NR:ns ## COG: orn COG1949 # Protein_GI_number: 16131987 # Func_class: A RNA processing and modification # Function: Oligoribonuclease (3'->5' exoribonuclease) # Organism: Escherichia coli K12 # 1 181 24 204 204 367 100.0 1e-102 MSANENNLIWIDLEMTGLDPERDRIIEIATLVTDANLNILAEGPTIAVHQSDEQLALMDD WNVRTHTASGLVERVKASTMGDREAELATLEFLKQWVPAGKSPICGNSIGQDRRFLFKYM PELEAYFHYRYLDVSTLKELARRWKPEILDGFTKQGTHQAMDDIRESVAELAYYREHFIK L >gi|296493191|gb|ADTK01000310.1| GENE 11 13275 - 14414 951 379 aa, chain - ## HITS:1 COG:yjeS KEGG:ns NR:ns ## COG: yjeS COG1600 # Protein_GI_number: 16131988 # 
Func_class: C Energy production and conversion # Function: Uncharacterized Fe-S protein # Organism: Escherichia coli K12 # 1 379 1 379 379 784 100.0 0 MSEPLDLNQLAQKIKQWGLELGFQQVGITDTDLSESEPKLQAWLDKQYHGEMDWMARHGM LRARPHELLPGTLRVISVRMNYLPANAAFASTLKNPKLGYVSRYALGRDYHKLLRNRLKK LGEMIQQHCVSLNFRPFVDSAPILERPLAEKAGLGWTGKHSLILNREAGSFFFLGELLVD IPLPVDQPVEEGCGKCVACMTICPTGAIVEPYTVDARRCISYLTIELEGAIPEELRPLMG NRIYGCDDCQLICPWNRYSQLTTEEDFSPRKPLHAPELIELFAWSEEKFLKVTEGSAIRR IGHLRWLRNIAVALGNAPWDETILTALESRKGEHPLLDEHIAWAIAQQIERRNACIVEVQ LPKKQRLVRVIEKGLPRDA >gi|296493191|gb|ADTK01000310.1| GENE 12 14428 - 15960 1044 510 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|153825000|ref|ZP_01977667.1| ribosomal protein S15 [Vibrio cholerae MZO-2] # 8 494 3 490 490 406 43 1e-112 MKKNPVSIPHTVWHADDIRRGEREAADALGLTLYELMLRAGEAAFQVCRSAYPDARHWLV LCGHGNNGGDGYVVARLATAVGIEVTLLAQESDKPLPEEAALAREAWLNAGGEIHASNIV WPESVDLIVDALLGTGLQQAPRESISQLIDHANSHPAPIAAVDIPSGLLAETGATPGAVI NADHTITFIALKPGLLTGKARDVTGQLHFDSLGLDSWLAGQETKIQRFSAEQLSHWLKPR RPTSHKGDHGRLVIIGGDHGTAGAIRMTGEAALRAGAGLVRVLTRSENIAPLLTARPELM VHELTMDSLTESLEWADVVVIGPGLGQQEWGKKALQKVENFRKPMLWDADALNLLAINPD KRHNRVITPHPGEAARLLGCSVAEIESDRLHCAKRLVQRYGGVAVLKGAGTVVAAHPDAL GIIDAGNAGMASGGMGDVLSGIIGALLGQKLSPYDAACAGCVAHGAAADVLAARFGTRGM LATDLFSTLQRIVNPEVTDKNHDESSNSAP >gi|296493191|gb|ADTK01000310.1| GENE 13 15932 - 16393 537 153 aa, chain + ## HITS:1 COG:ECs5144 KEGG:ns NR:ns ## COG: ECs5144 COG0802 # Protein_GI_number: 15834398 # Func_class: R General function prediction only # Function: Predicted ATPase or kinase # Organism: Escherichia coli O157:H7 # 1 153 1 153 153 313 100.0 1e-85 MMNRVIPLPDEQATLDLGERVAKACDGATVIYLYGDLGAGKTTFSRGFLQALGHQGNVKS PTYTLVEPYTLDNLMVYHFDLYRLADPEELEFMGIRDYFANDAICLVEWPQQGTGVLPDP DVEIHIDYQAQGREARVSAVSSAGELLLARLAG >gi|296493191|gb|ADTK01000310.1| GENE 14 16412 - 17749 1002 445 aa, chain + ## HITS:1 COG:amiB KEGG:ns NR:ns ## COG: amiB COG0860 # Protein_GI_number: 16131991 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
N-acetylmuramoyl-L-alanine amidase # Organism: Escherichia coli K12 # 1 445 1 445 445 816 99.0 0 MMYRIRNWLVATLLLLCTPVGAATLSDIQVSNGNQQARITLSFIGDPDYAFSHQSKRTVA LDIKQTGVIQGLPLLFSGNNLVKAIRSGTPKDAQTLRLVVDLTENGKTEAVKRQNGSNYT VVFTINADAPPPPPPPPVVAKRVETPAVVAPRVSEPARNPFKTESNRTTGVISSNTVTRP AARATANTGDKIIIAIDAGHGGQDPGAIGPGGTREKNVTIAIARKLRTLLNDDPMFKGVL TRDGDYFISVMGRSDVARKQNANFLVSIHADAAPNRSATGASVWVLSNRRANSEMASWLE QHEKQSELLGGAGDVLANSQSDPYLSQAVLDLQFGHSQRVGYDVATSMISQLQRIGEIHK RRPEHASLGVLRSPDIPSVLVETGFISNNSEERLLASDDYQQQLAEAIYKGLRNYFLAHP MQSAPQGATAQTASTVTTPDRTLPN >gi|296493191|gb|ADTK01000310.1| GENE 15 17759 - 19606 1776 615 aa, chain + ## HITS:1 COG:mutL KEGG:ns NR:ns ## COG: mutL COG0323 # Protein_GI_number: 16131992 # Func_class: L Replication, recombination and repair # Function: DNA mismatch repair enzyme (predicted ATPase) # Organism: Escherichia coli K12 # 1 615 1 615 615 1162 99.0 0 MPIQVLPPQLANQIAAGEVVERPASVVKELVENSLDAGATRIDIDIERGGAKLIRIRDNG CGIKKDELALALARHATSKIASLDDLEAIISLGFRGEALASISSVSRLTLTSRTAEQQEA WQAYAEGRDMNVTVKPAAHPVGTTLEVLDLFYNTPARRKFLRTEKTEFNHIDEIIRRIAL ARFDVTINLSHNGKIVRQYRAVPEGGQKERRLGAICGTAFLEQALAIEWQHGDLTLRGWV ADPNHTTPALAEIQYCYVNGRMMRDRLINHAIRQACEDKLGADQQPAFVLYLEIDPHQVD VNVHPAKHEVRFHQSRLVHDFIYQGVLSVLQQQLETPLPLDDEPQPAPRAIPENRVAAGR NHFAEPAAREPVAPRYTPAPASGSRPAAPWPNAQPGYQKQQGEVYRQLLQTPAPMQKLKA PEPQEPALAANSQSFGRVLTIVHSDCALLERDGNISLLSLPVAERWLRQAQLTPGEAPVC AQPLLIPLRLKVSAEEKSALEKAQSALAELGIDFQSDAQHVTIRAVPLPLRQQNLQILIP ELIGYLAKQSVFEPGNIAQWIARNLMSEHAQWSMAQAITLLADVERLCPQLVKTPPGGLL QSVDLHPAIKALKDE >gi|296493191|gb|ADTK01000310.1| GENE 16 19644 - 20549 559 301 aa, chain + ## HITS:1 COG:miaA KEGG:ns NR:ns ## COG: miaA COG0324 # Protein_GI_number: 16131993 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Escherichia coli K12 # 1 301 16 316 316 580 100.0 1e-165 MGPTASGKTALAIELRKILPVELISVDSALIYKGMDIGTAKPNAEELLAAPHRLLDIRDP SQAYSAADFRRDALAEMADITAAGRIPLLVGGTMLYFKALLEGLSPLPSADPEVRARIEQ 
QAAEQGWESLHRQLQEVDPVAAARIHPNDPQRLSRALEVFFISGKTLTELTQTSGDALPY QVHQFAIAPASRELLHQRIEQRFHQMLASGFEAEVRALFARGDLHTDLPSIRCVGYRQMW SYLEGEISYDEMVYRGVCATRQLAKRQITWLRGWEGVHWLDSEKPEQARDEVLQVVGAIA G >gi|296493191|gb|ADTK01000310.1| GENE 17 20635 - 20943 274 102 aa, chain + ## HITS:1 COG:ECs5148 KEGG:ns NR:ns ## COG: ECs5148 COG1923 # Protein_GI_number: 15834402 # Func_class: R General function prediction only # Function: Uncharacterized host factor I protein # Organism: Escherichia coli O157:H7 # 1 102 1 102 102 177 100.0 5e-45 MAKGQSLQDPFLNALRRERVPVSIYLVNGIKLQGQIESFDQFVILLKNTVSQMVYKHAIS TVVPSRPVSHHSNNAGGGTSSNYHHGSSAQNTSAQQDSEETE >gi|296493191|gb|ADTK01000310.1| GENE 18 21020 - 22300 733 426 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 [Roseobacter sp. AzwK-3b] # 50 398 55 407 425 286 46 1e-76 MFDRYDAGEQAVLVHIYFTQDKDMEDLQEFESLVSSAGVEALQVITGSRKAPHPKYFVGE GKAVEIAEAVKATGASVVLFDHALSPAQERNLERLCECRVIDRTGLILDIFAQRARTHEG KLQVELAQLRHLATRLVRGWTHLERQKGGIGLRGPGETQLETDRRLLRNRIVQIQSRLER VEKQREQGRQSRIKADVPTVSLVGYTNAGKSTLFNRITEARVYAADQLFATLDPTLRRID VADVGETVLADTVGFIRHLPHDLVAAFKATLQETRQATLLLHVIDAADVRVQENIEAVNT VLEEIDAHEIPTLLVMNKIDMLEDFEPRIDRDEENKPIRVWLSAQTGAGIPQLFQALTER LSGEVAQHTLRLPPQEGRLRSRFYQLQAIEKEWMEEDGSVSLQVRMPIVDWRRLCKQEPA LIDYLI >gi|296493191|gb|ADTK01000310.1| GENE 19 22386 - 23645 1414 419 aa, chain + ## HITS:1 COG:ECs5150 KEGG:ns NR:ns ## COG: ECs5150 COG0330 # Protein_GI_number: 15834404 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Escherichia coli O157:H7 # 1 419 1 419 419 672 100.0 0 MAWNQPGNNGQDRDPWGSSKPGGNSEGNGNKGGRDQGPPDLDDIFRKLSKKLGGLGGGKG TGSGGGSSSQGPRPQLGGRVVTIAAAAIVIIWAASGFYTIKEAERGVVTRFGKFSHLVEP GLNWKPTFIDEVKPVNVEAVRELAASGVMLTSDENVVRVEMNVQYRVTNPEKYLYSVTSP DDSLRQATDSALRGVIGKYTMDRILTEGRTVIRSDTQRELEETIRPYDMGITLLDVNFQA ARPPEEVKAAFDDAIAARENEQQYIREAEAYTNEVQPRANGQAQRILEEARAYKAQTILE 
AQGEVARFAKLLPEYKAAPEITRERLYIETMEKVLGNTRKVLVNDKGGNLMVLPLDQMLK GGNAPAAKSDNGASNLLRLPPASSSTTSGASNTSSTSQGDIMDQRRANAQRNDYQRQGE >gi|296493191|gb|ADTK01000310.1| GENE 20 23648 - 24652 1289 334 aa, chain + ## HITS:1 COG:ECs5151 KEGG:ns NR:ns ## COG: ECs5151 COG0330 # Protein_GI_number: 15834405 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Escherichia coli O157:H7 # 1 334 1 334 334 566 100.0 1e-161 MRKSVIAIIIIVLVVLYMSVFVVKEGERGITLRFGKVLRDDDNKPLVYEPGLHFKIPFIE TVKMLDARIQTMDNQADRFVTKEKKDLIVDSYIKWRISDFSRYYLATGGGDISQAEVLLK RKFSDRLRSEIGRLDVKDIVTDSRGRLTLEVRDALNSGSAGTEDEVTTPAADNAIAEAAE RVTAETKGKVPVINPNSMAALGIEVVDVRIKQINLPTEVSEAIYNRMRAEREAVARRHRS QGQEEAEKLRATADYEVTRTLAEAERQGRIMRGEGDAEAAKLFADAFSKDPDFYAFIRSL RAYENSFSGNQDVMVMSPDSDFFRYMKTPTSATR >gi|296493191|gb|ADTK01000310.1| GENE 21 24734 - 24931 245 65 aa, chain + ## HITS:1 COG:ECs5152 KEGG:ns NR:ns ## COG: ECs5152 COG3242 # Protein_GI_number: 15834406 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 65 1 65 65 80 100.0 6e-16 MNSTIWLALALVLVLEGLGPMLYPKAWKKMISAMTNLPDNILRRFGGGLVVAGVVVYYML RKTIG >gi|296493191|gb|ADTK01000310.1| GENE 22 25035 - 26333 1437 432 aa, chain + ## HITS:1 COG:ECs5153 KEGG:ns NR:ns ## COG: ECs5153 COG0104 # Protein_GI_number: 15834407 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate synthase # Organism: Escherichia coli O157:H7 # 1 432 1 432 432 867 100.0 0 MGNNVVVLGTQWGDEGKGKIVDLLTERAKYVVRYQGGHNAGHTLVINGEKTVLHLIPSGI LRENVTSIIGNGVVLSPAALMKEMKELEDRGIPVRERLLLSEACPLILDYHVALDNAREK ARGAKAIGTTGRGIGPAYEDKVARRGLRVGDLFDKETFAEKLKEVMEYHNFQLVNYYKAE AVDYQKVLDDTMAVADILTSMVVDVSDLLDQARQRGDFVMFEGAQGTLLDIDHGTYPYVT SSNTTAGGVATGSGLGPRYVDYVLGILKAYSTRVGAGPFPTELFDETGEFLCKQGNEFGA TTGRRRRTGWLDTVAVRRAVQLNSLSGFCLTKLDVLDGLKEVKLCVAYRMPDGREVTTTP LAADDWKGVEPIYETMPGWSESTFGVKDRSGLPQAALNYIKRIEELTGVPIDIISTGPDR TETMILRDPFDA 
>gi|296493191|gb|ADTK01000310.1| GENE 23 26538 - 26963 273 141 aa, chain + ## HITS:1 COG:ECs5154 KEGG:ns NR:ns ## COG: ECs5154 COG1959 # Protein_GI_number: 15834408 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 141 1 141 141 274 99.0 4e-74 MQLTSFTDYGLRALIYMASLPEGRMTSISEVTDVYGVSRNHMVKIINQLSRAGYVTAVRG KNGGIRLGKSASAIRIGDVVRELEPLSLVNCSSEFCHITPACRLKQALSKAVQSFLTELD NYTLADLVEENQPLYKLLLVE >gi|296493191|gb|ADTK01000310.1| GENE 24 27002 - 29443 1244 813 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 21 728 4 705 730 483 37 1e-136 MSQDPFQEREAEKYANPIPSREFILEHLTKREKPASRDELAVELHIEGEEQLEGLRRRLR AMERDGQLVFTRRQCYALPERLDLVKGTVIGHRDGYGFLRVEGRKDDLYLSSEQMKTCIH GDQVLAQPLGADRKGRREARIVRVLVPKTSQIVGRYFTEAGVGFVVPDDSRLSFDILIPP DQIMGARMGFVVVVELTQRPTRRTKAVGKIVEVLGDNMGTGMAVDIALRTHEIPYIWPQA VEQQVAGLKEEVPEEAKAGRVDLRDLPLVTIDGEDARDFDDAVYCEKKRGGGWRLWVAIA DVSYYVRPPTPLDREARNRGTSVYFPSQVIPMLPEVLSNGLCSLNPQVDRLCMVCEMTVS SKGRLTGYKFYEAVMSSHARLTYTKVWHILQGDQDLREQYAPLVKHLEELHNLYKVLDKA REERGGISFESEEAKFIFNAERRIERIEQTQRNDAHKLIEECMILANISAARFVEKAKEP ALFRIHDKPSTEAITSFRSVLAELGLELPGGNKPEPRDYAELLESVADRPDAEMLQTMLL RSMKQAIYDPENRGHFGLALQSYAHFTSPIRRYPDLTLHRAIKYLLAKEQGHQGNTTETG GYHYSMEEMLQLGQHCSMAERRADEATRDVADWLKCDFMLDQVGNVFKGVISSVTGFGFF VRLDDLFIDGLVHVSSLDNDYYRFDLVGQRLMGESSGQTYRLGDRVEVRVEAVNMDERKI DFSLISSERAPRNVGKTAREKAKKGDAGKKGGKRRQVGKKVNFEPDSAFRGEKKTKPKAA KKDARKAKKPSAKTQKIAAATKAKRAAKKKVAE >gi|296493191|gb|ADTK01000310.1| GENE 25 29623 - 30354 455 243 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 3 242 7 246 255 179 40 2e-44 MSEMIYGIHAVQALLERAPERFQEVFILKGREDKRLLPLIHALESQGVVIQLANRQYLDE KSDGAVHQGIIARVKPGRQYQENDLPDLIASLDQPFLLILDGVTDPHNLGACLRSADAAG VHAVIVPKDRSAQLNATAKKVACGAAESVPLIRVTNLARTMRMLQEENIWIVGTAGEADH TLYQSKMTGRLALVMGAEGEGMRRLTREHCDELISIPMAGSVSSLNVSVATGICLFEAVR QRS 
>gi|296493191|gb|ADTK01000310.1| GENE 26 30481 - 30882 342 133 aa, chain + ## HITS:1 COG:ECs5157 KEGG:ns NR:ns ## COG: ECs5157 COG3789 # Protein_GI_number: 15834411 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 133 1 133 133 228 99.0 2e-60 MTWNPLALATALQTVPEQNIDVTNSESALIIKMNDYGDLQINILFTSRQMIIETFICPVS SISNPDEFNTFLLRNQKMMPLSSVGISSVQQEEYYIVFGALSLKSSLEDILLEITSLVDN ALDLAEITEEYSH >gi|296493191|gb|ADTK01000310.1| GENE 27 30901 - 31599 876 232 aa, chain + ## HITS:1 COG:yjfJ KEGG:ns NR:ns ## COG: yjfJ COG1842 # Protein_GI_number: 16132004 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Phage shock protein A (IM30), suppresses sigma54-dependent transcription # Organism: Escherichia coli K12 # 1 232 1 232 232 291 100.0 7e-79 MGILKSLFTLGKSFISQAEESIEETQGVRMLEQHIRDAKAELDKAGKSRVDLLARVKLSH DKLKDLRERKASLEARALEALSKNVNPSLINEVAEEIARLENLITAEEQVLSNLEVSRDG VEKAVTATAQRIAQFEQQMEVVKATEAMQRAQQAVTTSTVGASSSVSTAAESLKRLQTRQ AERQARLDAAAQLEKVADGRDLDEKLAEAGIGGSNKSSAQDVLARLQRQQGE >gi|296493191|gb|ADTK01000310.1| GENE 28 31650 - 32309 678 219 aa, chain + ## HITS:1 COG:no KEGG:ECO103_4976 NR:ns ## KEGG: ECO103_4976 # Name: yjfK # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 219 1 219 219 422 100.0 1e-117 MSGFFQRLFGKDNKPAIARGPLGLHLNSGFTLDTLAFRLLEDELLIALPGEEFTVAAVSH IDLGGGSQIFRYYTSGDEFLQINTTGGEDIDDIDDIKLFVYEESYGISKESHWREAINAK VMGAMILNWQEKRWQRFFNSEEPGNIEPVYMLEKVENQNHAKWEVHNFTMGYQRQVTEDT YEYLLLNGEESFNDLGEPEWLFSRALGVDIPLTSLHIIG >gi|296493191|gb|ADTK01000310.1| GENE 29 32327 - 32725 363 132 aa, chain + ## HITS:1 COG:ECs5160 KEGG:ns NR:ns ## COG: ECs5160 COG3766 # Protein_GI_number: 15834414 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 132 1 132 132 220 100.0 6e-58 MHILDSLLAFSAYFFIGVAMVIIFLFIYSKITPHNEWQLIKNNNTAASLAFSGTLLGYVI PLSSAAINAVSIPDYFAWGGIALVIQLLVFAGVRLYMPALSEKIINHNTAAGMFMGTAAL 
AGGIFNAACMTW >gi|296493191|gb|ADTK01000310.1| GENE 30 32735 - 33373 331 212 aa, chain + ## HITS:1 COG:ECs5161 KEGG:ns NR:ns ## COG: ECs5161 COG5463 # Protein_GI_number: 15834415 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Escherichia coli O157:H7 # 1 212 1 212 212 372 94.0 1e-103 MARKRKSRNNSKIGHGAISRIGRPHNPFEPRRNRYAQKYLTLVLMGGAVFLIVKGWSESS DVDNDGDGTFYSTVQDCIDDGNNADICARGWNNAKAAFYADVPKNMTQQNCQSKYENCYY DNVEQSWIPVISGFLLSRVIRKDRDEQFVYNSGGSSFASRPVWRNTSGDYSWRSGSGKKE SYSSGGFTTKKASTVSRGGYGRSSSARGHWGG >gi|296493191|gb|ADTK01000310.1| GENE 31 33376 - 34539 1071 387 aa, chain + ## HITS:1 COG:ECs5162 KEGG:ns NR:ns ## COG: ECs5162 COG0754 # Protein_GI_number: 15834416 # Func_class: E Amino acid transport and metabolism # Function: Glutathionylspermidine synthase # Organism: Escherichia coli O157:H7 # 1 387 1 387 387 793 98.0 0 MLRHNVPVRRDLDQIAANNGFDFHIIDNEIYWDESRAYRFTLRQIEEQIEKPTAELHQMC LEVVDRAVKDEEILTQLAIPPLYWDVIAESWRARDPSLYGRMDFAWCGNAPVKLLEYNAD TPTSLYESAYFQWLWLEDARRSGIIPRDADQYNAIQERLISRFSELYSREPFYFCCCQDT DEDRTTVLYLQDCAQQAGQESRFIYIEDLGLGVGGVLTDLDDNVIQRAFKLYPLEWMMRD DNGPLLCKRREQWVEPLWKSILSNKGLMPLLWRFFPGHPNLLASWFEGEKSQIAAGESYV RKPIYSREGGNVTIFDGQNNVVDHADGDYADERMIYQAFQPLPRFGDSYTLIGSWIVDDE ACGMGIREDNTLITKDTSRFVPHYIAG >gi|296493191|gb|ADTK01000310.1| GENE 32 34623 - 36248 1117 541 aa, chain + ## HITS:1 COG:aidB KEGG:ns NR:ns ## COG: aidB COG1960 # Protein_GI_number: 16132009 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Escherichia coli K12 # 1 541 6 546 546 1102 99.0 0 MHWQTHTVFNQPIPLNNSNLYLSDGALCEAVTREGAGWDSDFLASIGQQLGTAESLELGR LANVNPPELLRYDAQGRRLDDVRFHPAWHLLMQALCTNRVHNLAWEEDARSGAFVARAAR FMLHAQVEAGSLCPITMTFAATPLLLQMLPAPFQDWTTPLLSDRYDSHLLPGGQKRGLLI GMGMTEKQGGSDVMSNTTRAERLEDGSYRLVGHKWFFSVPQSDAHLVLAQTTGGLSCFFV PRFLPDGQRNAIRLERLKDKLGNRSNASCEVEFQDAIGWLLGQEGEGIRLILKMGGMTRF DCALGSHAMMRRAFSLAIYHAHQRHVFGNPLIQQPLMRHVLSRMALQLEGQTALLFRLAR 
AWDRRADAKEALWARLFTPAAKFVICKRGMPFVAEAMEVLGGIGYCEESELPRLYREMPV NSIWEGSGNIMCLDVLRVLNKQAGVYDLLSEAFVEVKGQDRYFDRAVRRLQQQLRKPAEE LRREITHQLFLLGCGAQMLKYASPPMAQAWCQVMLDTRGGVRLSEQIQNDLLLRATGGVC V >gi|296493191|gb|ADTK01000310.1| GENE 33 36365 - 36652 254 95 aa, chain - ## HITS:1 COG:no KEGG:SSON_4370 NR:ns ## KEGG: SSON_4370 # Name: yjfN # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 95 6 100 100 172 100.0 2e-42 MELTMKQLLASPSLQLVTYPASATAQSAEFASADCVTGLNEIGQISVSNISGDPQDVERI VALKADEQGASWYRIITMYEDQQPDNWRVQAILYA >gi|296493191|gb|ADTK01000310.1| GENE 34 36789 - 37118 194 109 aa, chain - ## HITS:1 COG:no KEGG:APECO1_2203 NR:ns ## KEGG: APECO1_2203 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 109 34 142 142 201 99.0 6e-51 MFSRKRNSVIYRFASLLLVLMLSACSALQGTPQPAPPVTDHPQEIRRDQTQGLQRIGSVS TMVRGSPDDALAEIRAKAVAAKADYYVVVMVDETIVTGQWYSQAILYRK >gi|296493191|gb|ADTK01000310.1| GENE 35 37300 - 38049 537 249 aa, chain + ## HITS:1 COG:yjfP KEGG:ns NR:ns ## COG: yjfP COG1073 # Protein_GI_number: 16132012 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Escherichia coli K12 # 1 249 1 249 249 498 99.0 1e-141 MIEIESRELADIPVLHAYPVGQKDTPLPCVIFYHGFTSSSLVYSYFAVALAQAGLRVIMP DAPDHGSRFSGDAARRLNQFWQILLQSMQEFTTLRAAIAEENWLLDDRLAVGGASMGAMT ALGITARHPTVRCTASMMGAGYFTSLARSLFPPLIPETTAQQNEFNNIVAPLAEWEATNH LEQLSDRPLLLWHGLDDDVVPADESLRLQQALSETGRDKLLTCSWQPGVRHRITPEALDA AVTFFRQHL >gi|296493191|gb|ADTK01000310.1| GENE 36 38046 - 38801 814 251 aa, chain - ## HITS:1 COG:ECs5167 KEGG:ns NR:ns ## COG: ECs5167 COG1349 # Protein_GI_number: 15834421 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli O157:H7 # 1 251 1 251 251 482 100.0 1e-136 MTEAQRHQILLEMLAQLGFVTVEKVVERLGISPATARRDINKLDESGKLKKVRNGAEAIT QQRPRWTPMNLHQAQNHDEKVRIAKAASQLVNPGESVVINCGSTAFLLGREMCGKPVQII 
TNYLPLANYLIDQEHDSVIIMGGQYNKSQSITLSPQGSENSLYAGHWMFTSGKGLTAEGL YKTDMLTAMAEQKMLSVVGKLVVLVDSSKIGERAGMLFSRADQIDMLITGKNANPEILQQ LEAQGVSILRV >gi|296493191|gb|ADTK01000310.1| GENE 37 38909 - 39973 1156 354 aa, chain - ## HITS:1 COG:ECs5168 KEGG:ns NR:ns ## COG: ECs5168 COG2220 # Protein_GI_number: 15834422 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Escherichia coli O157:H7 # 1 354 3 356 356 752 100.0 0 MSKVKSITRESWILSTFPEWGSWLNEEIEQEQVAPGTFAMWWLGCTGIWLKSEGGTNVCV DFWCGTGKQSHGNPLMKQGHQMQRMAGVKKLQPNLRTTPFVLDPFAIRQIDAVLATHDHN DHIDVNVAAAVMQNCADDVPFIGPKTCVDLWIGWGVPKERCIVVKPGDVVKVKDIEIHAL DAFDRTALITLPADQKAAGVLPDGMDDRAVNYLFKTPGGSLYHSGDSHYSNYYAKHGNEH QIDVALGSYGENPRGITDKMTSADMLRMGEALNAKVVIPFHHDIWSNFQADPQEIRVLWE MKKDRLKYGFKPFIWQVGGKFTWPLDKDNFEYHYPRGFDDCFTIEPDLPFKSFL >gi|296493191|gb|ADTK01000310.1| GENE 38 40328 - 41725 1593 465 aa, chain + ## HITS:1 COG:sgaT KEGG:ns NR:ns ## COG: sgaT COG3037 # Protein_GI_number: 16132015 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 465 20 484 484 805 99.0 0 MEILYNIFTVFFNQVMTNAPLLLGIVTCLGYILLRKSVSVIIKGTIKTIIGFMLLQAGSG ILTSTFKPVVAKMSEVYGINGAISDTYASMMATIDRMGDAYSWVGYAVLLALALNICYVL LRRITGIRTIMLTGHIMFQQAGLIAVTLFIFGYSMWTTIICTAILVSLYWGITSNIMYKP TQEVTDGCGFSIGHQQQFASWIAYKVAPFLGKKEESVEDLKLPGWLNIFHDNIVSTAIVM TIFFGAILLSFGIDTVQAMAGKVNWTVYILQTGFSFAVAIFIITQGVRMFVAELSEAFNG ISQRLIPGAVLAIDCAAIYSFAPNAVVWGFMWGTIGQLIAVGILVACGSSILIIPGFIPM FFSNATIGVFANHFGGWRAALKICLVMGMIEIFGCVWAVKLTGMSAWMGMADWSILAPPM MQGFFSIGIAFMAVIIVIALAYMFFAGRALRAEEDAEKQLAEQSA >gi|296493191|gb|ADTK01000310.1| GENE 39 41741 - 42046 321 101 aa, chain + ## HITS:1 COG:ECs5170 KEGG:ns NR:ns ## COG: ECs5170 COG3414 # Protein_GI_number: 15834424 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIB component # Organism: Escherichia coli O157:H7 # 1 101 1 101 101 196 100.0 6e-51 
MTVRILAVCGNGQGSSMIMKMKVDQFLTQSNIDHTVNSCAVGEYKSELSGADIIIASTHI AGEITVTGNKYVVGVRNMLSPADFGPKLLEVIKEHFPQDVK >gi|296493191|gb|ADTK01000310.1| GENE 40 42056 - 42520 607 154 aa, chain + ## HITS:1 COG:ECs5171 KEGG:ns NR:ns ## COG: ECs5171 COG1762 # Protein_GI_number: 15834425 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Escherichia coli O157:H7 # 1 154 1 154 154 287 96.0 5e-78 MKLRDSLAENKSIRLQAEAETWQDAVKIGVDLLVAADVVEPRYYQAILDAVEQHGPYFVL APGLAMPHGRPEEGVKKTGFALVTLKKPLEFNHEDNDPVDILITMAAVDANTHQEVGIMQ IVNLFEDEENFDRLRACRTEQEVLDLIDRTNAAA >gi|296493191|gb|ADTK01000310.1| GENE 41 42534 - 43184 861 216 aa, chain + ## HITS:1 COG:ECs5172 KEGG:ns NR:ns ## COG: ECs5172 COG0269 # Protein_GI_number: 15834426 # Func_class: G Carbohydrate transport and metabolism # Function: 3-hexulose-6-phosphate synthase and related proteins # Organism: Escherichia coli O157:H7 # 1 216 1 216 216 420 100.0 1e-117 MSLPMLQVALDNQTMDSAYETTRLIAEEVDIIEVGTILCVGEGVRAVRDLKALYPHKIVL ADAKIADAGKILSRMCFEANADWVTVICCADINTAKGALDVAKEFNGDVQIELTGYWTWE QAQQWRDAGIQQVVYHRSRDAQAAGVAWGEADITAIKRLSDMGFKVTVTGGLALEDLPLF KGIPIHVFIAGRSIRDAASPVEAARQFKRSIAELWG >gi|296493191|gb|ADTK01000310.1| GENE 42 43194 - 44048 1049 284 aa, chain + ## HITS:1 COG:ECs5173 KEGG:ns NR:ns ## COG: ECs5173 COG3623 # Protein_GI_number: 15834427 # Func_class: G Carbohydrate transport and metabolism # Function: Putative L-xylulose-5-phosphate 3-epimerase # Organism: Escherichia coli O157:H7 # 1 284 1 284 284 582 100.0 1e-166 MLSKQIPLGIYEKALPAGECWLERLQLAKTLGFDFVEMSVDETDERLSRLDWSREQRLAL VNAIVETGVRVPSMCLSAHRRFPLGSEDDAVRAQGLEIMRKAIQFAQDVGIRVIQLAGYD VYYQEANNETRRRFRDGLKESVEMASRAQVTLAMEIMDYPLMNSISKALGYAHYLNNPWF QLYPDIGNLSAWDNDVQMELQAGIGHIVAVHVKDTKPGVFKNVPFGEGVVDFERCFETLK QSGYCGPYLIEMWSETAEDPAAEVAKARDWVKARMAKAGMVEAA >gi|296493191|gb|ADTK01000310.1| GENE 43 44048 - 44734 811 228 aa, chain + ## HITS:1 COG:ECs5174 KEGG:ns NR:ns ## 
COG: ECs5174 COG0235 # Protein_GI_number: 15834428 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli O157:H7 # 1 228 1 228 228 468 99.0 1e-132 MQKLKQQVFEANMDLPRYGLVTFTWGNVSAIDRERGLVVIKPSGVAYETMKADDMVVVDM SGKVVEGKYRPSSDTATHLELYRRYPSLGGIVHTHSTHATAWAQAGLAIPALGTTHADYF FGDIPCTRGLSEEEVQGEYELNTGKVIIETLGNAEPLHTPGIVVYQHGPFAWGKDAHDAV HNAVVMEEVAKMAWIARSINPQLNHIDSFLMNKHFMRKHGPNAYYGQK >gi|296493191|gb|ADTK01000310.1| GENE 44 44863 - 45138 276 91 aa, chain - ## HITS:1 COG:no KEGG:LF82_3468 NR:ns ## KEGG: LF82_3468 # Name: yjfY # Def: UPF0379 protein YjfY # Organism: E.coli_LF82 # Pathway: not_defined # 1 91 1 91 91 141 100.0 6e-33 MFSRVLALLAVLLLSANTWAAIEINNHQARNMDDVQSLGVIYINHNFATESEARQALNEE TDAQGATYYHVILMREPGSNGNMHASADIYR >gi|296493191|gb|ADTK01000310.1| GENE 45 45465 - 45860 680 131 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|221800783|ref|NP_418621.2| 30S ribosomal subunit protein S6 [Escherichia coli str. K-12 substr. MG1655] # 1 131 1 131 135 266 100 2e-70 MRHYEIVFMVHPDQSEQVPGMIERYTAAITGAEGKIHRLEDWGRRQLAYPINKLHKAHYV LMNVEAPQEVIDELETTFRFNDAVIRSMVMRTKHAVTEASPMVKAKDERRERRDDFANET ADDAEAGDSEE >gi|296493191|gb|ADTK01000310.1| GENE 46 46186 - 46413 385 75 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15834432|ref|NP_313205.1| 30S ribosomal protein S18 [Escherichia coli O157:H7 str. 
Sakai] # 1 75 1 75 75 152 100 3e-36 MARYFRRRKFCRFTAEGVQEIDYKDIATLKNYITESGKIVPSRITGTRAKYQRQLARAIK RARYLSLLPYTDRHQ >gi|296493191|gb|ADTK01000310.1| GENE 47 46455 - 46904 720 149 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15804792|ref|NP_290833.1| 50S ribosomal protein L9 [Escherichia coli O157:H7 EDL933] # 1 149 1 149 149 281 100 4e-75 MQVILLDKVANLGSLGDQVNVKAGYARNFLVPQGKAVPATKKNIEFFEARRAELEAKLAE VLAAANARAEKINALETVTIASKAGDEGKLFGSIGTRDIADAVTAAGVEVAKSEVRLPNG VLRTTGEHEVSFQVHSEVFAKVIVNVVAE Prediction of potential genes in microbial genomes Time: Mon May 16 15:59:55 2011 Seq name: gi|296493190|gb|ADTK01000311.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont950.6, whole genome shotgun sequence Length of sequence - 30840 bp Number of predicted genes - 27, with homology - 27 Number of transcription units - 18, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 29 - 59 0.1 1 1 Tu 1 9/0.250 - CDS 71 - 1042 687 ## PROTEIN SUPPORTED gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 - Term 1053 - 1083 3.0 2 2 Op 1 11/0.000 - CDS 1095 - 2396 1260 ## PROTEIN SUPPORTED gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 3 2 Op 2 . - CDS 2439 - 2915 268 ## PROTEIN SUPPORTED gi|90020580|ref|YP_526407.1| ribosomal protein S3 4 2 Op 3 . - CDS 2936 - 5017 1511 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 5 2 Op 4 3/0.500 - CDS 5014 - 6555 1654 ## COG4670 Acyl CoA:acetate/3-ketoacid CoA transferase 6 2 Op 5 1/0.875 - CDS 6565 - 7341 742 ## COG1024 Enoyl-CoA hydratase/carnithine racemase 7 2 Op 6 . - CDS 7351 - 8193 772 ## COG1082 Sugar phosphate isomerases/epimerases 8 2 Op 7 . - CDS 8203 - 8994 871 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) + Prom 9089 - 9148 4.1 9 3 Tu 1 . + CDS 9207 - 9854 539 ## COG1309 Transcriptional regulator + Term 9918 - 9959 2.1 10 4 Tu 1 . 
- CDS 9838 - 10476 570 ## COG3061 Cell envelope opacity-associated protein A - Prom 10501 - 10560 2.8 + Prom 10564 - 10623 4.9 11 5 Tu 1 . + CDS 10695 - 11315 904 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 + Prom 11376 - 11435 3.7 12 6 Tu 1 . + CDS 11624 - 13027 1472 ## COG1113 Gamma-aminobutyrate permease and related permeases + Term 13037 - 13064 -0.8 + Prom 13065 - 13124 3.3 13 7 Op 1 . + CDS 13294 - 13728 182 ## ECO111_5094 hypothetical protein + Prom 13737 - 13796 4.5 14 7 Op 2 . + CDS 13827 - 14894 571 ## ECO26_5378 hypothetical protein - Term 15088 - 15127 7.1 15 8 Tu 1 . - CDS 15141 - 15803 876 ## COG2846 Regulator of cell morphogenesis and NO signaling - Prom 15826 - 15885 5.3 16 9 Tu 1 . - CDS 15911 - 16876 895 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 17 10 Tu 1 . - CDS 16984 - 17844 884 ## COG0702 Predicted nucleoside-diphosphate-sugar epimerases - Prom 17918 - 17977 3.1 + Prom 17711 - 17770 1.5 18 11 Tu 1 . + CDS 17843 - 18313 469 ## COG1733 Predicted transcriptional regulators + Term 18382 - 18422 4.2 - Term 18361 - 18405 -0.7 19 12 Tu 1 . - CDS 18442 - 20385 2205 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases - Prom 20435 - 20494 5.1 + Prom 20479 - 20538 4.3 20 13 Tu 1 . + CDS 20575 - 21315 788 ## COG1218 3'-Phosphoadenosine 5'-phosphosulfate (PAPS) 3'-phosphatase + Term 21324 - 21361 5.1 21 14 Tu 1 . - CDS 21305 - 21862 632 ## COG3054 Predicted transcriptional regulator - Prom 22036 - 22095 5.4 + Prom 21976 - 22035 5.0 22 15 Tu 1 . + CDS 22187 - 22393 253 ## G2583_5047 hypothetical protein + Term 22418 - 22450 6.3 - Term 22406 - 22438 6.3 23 16 Tu 1 . - CDS 22455 - 23741 1564 ## COG1253 Hemolysins and related proteins containing CBS domains - Prom 23902 - 23961 3.4 - Term 23963 - 24000 2.4 24 17 Tu 1 . 
- CDS 24121 - 24759 697 ## COG0225 Peptide methionine sulfoxide reductase - Prom 24815 - 24874 3.3 + Prom 24774 - 24833 2.8 25 18 Op 1 16/0.000 + CDS 24965 - 26698 1553 ## COG0729 Outer membrane protein 26 18 Op 2 6/0.375 + CDS 26695 - 30474 4068 ## COG2911 Uncharacterized protein conserved in bacteria 27 18 Op 3 . + CDS 30477 - 30818 230 ## COG2105 Uncharacterized conserved protein Predicted protein(s) >gi|296493190|gb|ADTK01000311.1| GENE 1 71 - 1042 687 323 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 [Lentisphaera araneosa HTCC2155] # 24 316 48 339 346 269 43 2e-71 MKIKVSAGIIGAVLMLSASQSWAVTLKLSHNQDKSHPVHKAMEFFAKKSKEYSNGDITIR IYPNGTLGTQRETMELIRSGAIPLVKTNAAEMEAFENSYKLFSLPYLFRDRDHYYQVMQG DIGRKILDSTKSKGYFGLTFYDGGARSFYGNKPVLKPDDLKGMKVRVQPSPGAVEMIKVM GGNPTPLDYGELYTALQQGVVDMAENSVMALTTMRHGEVAKSFSLDEHTMVPDVVLMSNA AFDKLSPENQAVILKAAKESMSYMKDLWSEEEKQEFAKLDKMGVKVYQVDKAPFIEKVQP MYANFAKDNPALAPMLADIQAAK >gi|296493190|gb|ADTK01000311.1| GENE 2 1095 - 2396 1260 433 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 [Algoriphagus sp. 
PR1] # 5 432 4 430 431 489 55 1e-138 MDYWLPIIVLFGAFFFMLALGVPIVYAIGLSTLASISTQLDFNSALSVVSQKLASGLDSF TLLAIPFFILSGNIMNHGGIARRLINFARILGGRLPGSLAHCNILANMLFGAISGSAVAS AAAMGGVMHPQQVKEGYDPAFSTAVNVASAPTGLLIPPSNTLIVYSLVSGGTSIAALFLA GYVPGILLGLALMVIAGIIAVRRGYPKPERPTLRQAGVAIWMAIPSIFLIILIMGGVLSG IFTPTEASAIAVIYTLFLALVLYREISVKDLPKIFLESVITTAIVLLLIGSSMGMSWAMS NADVPFLILDLLNTISDNPIIILLIINIILLIIGTFMDMTPAVLIFTPIFLPVVTELGMD PIHFGIVMVLNMCIGICTPPVGSVLFVGCSVSKLPINKIIKPMLPFYAVMVLVLAMVTYI PQISMALPRALGY >gi|296493190|gb|ADTK01000311.1| GENE 3 2439 - 2915 268 158 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020580|ref|YP_526407.1| ribosomal protein S3 [Saccharophagus degradans 2-40] # 1 148 1 148 164 107 37 7e-23 MRNIIRSINALLAALNITILAIIVACVTWQVAARFIFTSPSIFTDELSRLLFICLGLFGG AYTAGQNRHLAIDLLPMMLKGKARRHLFLCIQIIVIIFATIIMVYGGGLLTMDTFDSGQT SPALGWQMGYIYMSIPISGVLIIIYTIDMVLTELKQPL >gi|296493190|gb|ADTK01000311.1| GENE 4 2936 - 5017 1511 693 aa, chain - ## HITS:1 COG:TM0427 KEGG:ns NR:ns ## COG: TM0427 COG1053 # Protein_GI_number: 15643193 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Thermotoga maritima # 29 689 5 664 664 357 35.0 4e-98 MSQLDDTILDALTHVTFPKGFAQAEPAWVVTVDGVDYPLWQTDALVVGSGAAGLRAAVEL KRRQQNVLIATAGLYMGTSACSGSDKQTLFTAATAGNGDNFTKLAEALASGGAMDHDTAY VEAVGSLHTLGGLQYLGLELPEDRYGAILRYQTDHDEAGRATSCGPRTSRLMVKVLLEEV QRLAIPVLTSATVIKLLHQRDENGEDRVAGAILATGHRAHNPWGLAIVTAPNVVLATGGP GELYRDSVYPHKCFGSLGLALEEGLTLTNLTESQFGIGTPRSTFPWNLSGTYVQVIPYIY SVDAEGNEYNFLADYYRTTQELASNIFRKGYQWPFHATRVMDFGSSLLDMAVAQEQQSGR QVFMDFNRNPEAVPGDLPFSLDRLDDDVRAYLENNDALAPSPIERLQRMNPLSISLYKMH GYDLTTQPLQFAMNNQHMNGGIEVDIWGQTSLPGCFAVGEVAGTHGVTRPGGAALNAGQV FAVRLARFIGCTQKRNIDGDIAQLVAPTLASIREIITQAHDNGTGMPLSVVREKIQARMS DHAGFICHADKVRRATRDALLLSEFVQRHGLAIKHVGEVAELFMWRHMALTSAAVLTQLT HYIDAGGGSRGARIILDRDGNSIPQTRNGFCDAWRFRSERTEDKKDKLLIHYCNGIFHVR ETPVREFPIIRGIWFEKNWPGFLNGTIYQPQDE >gi|296493190|gb|ADTK01000311.1| GENE 5 5014 - 6555 1654 513 aa, chain - ## HITS:1 
COG:BH3898 KEGG:ns NR:ns ## COG: BH3898 COG4670 # Protein_GI_number: 15616460 # Func_class: I Lipid transport and metabolism # Function: Acyl CoA:acetate/3-ketoacid CoA transferase # Organism: Bacillus halodurans # 3 507 4 507 525 417 46.0 1e-116 MRKITTAEALAAQIQDGATIAISGNGGGMVEADHILAAIEARFLQTGHPRDLTLIHSLGI GDRDCKGTNRFAHAEMLKRIIAGHFTWSPKMQALVKNNTIEAYCFPGGVIQALLREIGAG RPGLFTHVGLGSFVDPRNGGGKSNECTTDDLVELIEIDGETKLRYRPFKVDYAILRGTYA DPRGNVSLEEEAIDMDSYSMALAAHNSGGKVFVQVRDVLEAGAIEPRRVKLPGILVDGIV EHREQPQTYLGGYDLTISGQHRRLSSNDAIELVSHPVRRLIARRAARELVAGASTNFGFG IPGGIPGVALREGVPYQSLWLSVEQGVHNGMMLDDAFFGCARNADAIIPSLDQFEFYSGG GIDITFLGMGEMDQYGNVNVSHLNGNLIGPGGFLEIAQNARKVVFCGTFDAKGSKIDVTP DGLHITQSGQIPKLVTQVEKITFSAAYAQQSGQEVLYITERAVFQLTAEGVELIEIAPGV EIERDILPYMAFRPIIKHPRLMESSLFTPMEDA >gi|296493190|gb|ADTK01000311.1| GENE 6 6565 - 7341 742 258 aa, chain - ## HITS:1 COG:AF2273_2 KEGG:ns NR:ns ## COG: AF2273_2 COG1024 # Protein_GI_number: 11499854 # Func_class: I Lipid transport and metabolism # Function: Enoyl-CoA hydratase/carnithine racemase # Organism: Archaeoglobus fulgidus # 16 255 17 259 262 162 34.0 8e-40 MSDQPVLFSRVAASCRLTLNREDKCHAINEEMIESLDHYLNEIENDTTLRLVELTATGDK FFCAGGDIKSWSAYSPLDMGRKWIKRGNDVFNRLRNLPQLTVANLNGHTIGGGIELALCC DIRIARPGAKFSNPEVMLGMVPGWMGIERVLNQVGPVVGRQMLMLGKRLTAQEAQAANLI DEVVEIEQVESWMANQLAQLEKCGPVALAHIKQLILALENKHADYPHQLLAGLMSATQDC QQATRAFAEKSSVSFHNQ >gi|296493190|gb|ADTK01000311.1| GENE 7 7351 - 8193 772 280 aa, chain - ## HITS:1 COG:SMc04130 KEGG:ns NR:ns ## COG: SMc04130 COG1082 # Protein_GI_number: 15963875 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Sinorhizobium meliloti # 11 280 6 274 274 270 50.0 3e-72 MRDLLQRPDLFSINTATLGYKTPLPAIIDACAARGIGAIAPWRRELQGEDLQQIAHQLAA SNMSVSGLCRSTYYTAPTLAERKLAIDDNRRALDDAAVLNAACYMQVVGGLPTGTKDLYE AREQVKQGIRQLLPHSKDVGVPIALEPLHPMTAADRSCLCMLRQALDWCDELDPDGEFGL GVAVDVYHVWWDPDLASQILRAGKRILAFHVSDWLVPTTDLVNDRGMPGDGVINIPSIRR LVENAGFNGAIELEIFSPYWWQKDINSTLDISVDRIAHYC 
>gi|296493190|gb|ADTK01000311.1| GENE 8 8203 - 8994 871 263 aa, chain - ## HITS:1 COG:mlr3057 KEGG:ns NR:ns ## COG: mlr3057 COG1028 # Protein_GI_number: 13472685 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Mesorhizobium loti # 4 263 2 253 253 206 43.0 5e-53 MKSTRPVAVITGAARGIGKGCALELARGGFNLLINDLPDADSVEKLHITQQECIAEGVEV ICFPADVGDLSLHEEMLDAAQNQWGRLDCLLNNAGISVKKRGDLLDLEPDSFDQNIAINT RAPFFLAQAFSKRLLAQPKPEAELPHRSIIFVSSINAIMLAMNRGEYTIAKTAVSAAARL FAARLCNEQIGVYEVRPGLIKTDMTIPATAYYDELIAKGLVPWGRWGYPADIASTVRAMA EGKLIYTCGQAVAIDGGLSMPRF >gi|296493190|gb|ADTK01000311.1| GENE 9 9207 - 9854 539 215 aa, chain + ## HITS:1 COG:ytfA KEGG:ns NR:ns ## COG: ytfA COG1309 # Protein_GI_number: 16132027 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 108 215 1 108 108 210 99.0 2e-54 MYLDTSNVQSLKEKLLLCAVNEFAEYGYEGARVDNIVKAAGCSKQTVYHHFGNKENLFIE VLEYTWNDIRQKEKALDFSDLPPQKAIEKIIDFTWDYYIANPWFLKIVHSENQSKGVHYA KSQRLLEINHAHLQLMESLLDEGKKHNIFKPDIDPLQVNINIAALGGYYLINQHTLGLVY HISMVSPQALEARRKVIKETILSWLLVDPSSTAHE >gi|296493190|gb|ADTK01000311.1| GENE 10 9838 - 10476 570 212 aa, chain - ## HITS:1 COG:ytfB KEGG:ns NR:ns ## COG: ytfB COG3061 # Protein_GI_number: 16132028 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell envelope opacity-associated protein A # Organism: Escherichia coli K12 # 1 212 13 224 224 363 100.0 1e-100 MPGRFELKPTLEKVWHAPDNFRFMDPLPPMHRRGIIIAAIVLVVGFLLPSDDTPNAPVVT REAQLDIQSQSQPPTEEQLRAQLVTPQNDPDQVAPVAPEPIQEGQPEEQPQTTQTQPFQP DSGIDNQWRSYRVEPGKTMAQLFRDHGLPATDVYAMAQVEGAGKPLSNLQNGQMVKIRQN ASGVVTGLTIDTGNNQQVLFTRQPDGSFIRAR >gi|296493190|gb|ADTK01000311.1| GENE 11 10695 - 11315 904 206 aa, chain + ## HITS:1 COG:ECs5185 KEGG:ns NR:ns ## COG: ECs5185 COG0545 # Protein_GI_number: 15834439 # Func_class: O 
Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Escherichia coli O157:H7 # 1 206 54 259 259 395 100.0 1e-110 MTTPTFDTIEAQASYGIGLQVGQQLSESGLEGLLPEALVAGIADALEGKHPAVPVDVVHR ALREIHERADAVRRQRFQAMAAEGVKYLEENAKKEGVNSTESGLQFRVINQGEGAIPART DRVRVHYTGKLIDGTVFDSSVARGEPAEFPVNGVIPGWIEALTLMPVGSKWELTIPQELA YGERGAGASIPPFSTLVFEVELLEIL >gi|296493190|gb|ADTK01000311.1| GENE 12 11624 - 13027 1472 467 aa, chain + ## HITS:1 COG:ECs5186 KEGG:ns NR:ns ## COG: ECs5186 COG1113 # Protein_GI_number: 15834440 # Func_class: E Amino acid transport and metabolism # Function: Gamma-aminobutyrate permease and related permeases # Organism: Escherichia coli O157:H7 # 1 467 1 467 470 840 99.0 0 MVDQVKVVADDQAPAEQSLRRNLTNRHIQLIAIGGAIGTGLFMGSGKTISLAGPSIIFVY MIIGFMLFFVMRAMGELLLSNLEYKSFSDFASDLLGPWAGYFTGWTYWFCWVVTGMADVV AITAYAQFWFPDLSDWVASLAVIVLLLTLNLATVKMFGEMEFWFAMIKIVAIVSLIVVGL VMVAMHFQSPTGVEASFAHLWNDGGWFPKGLSGFFAGFQIAVFAFVGIELVGTTAAETKD PEKSLPRAINSIPIRIIMFYVFALIVIMSVTPWSSVVPEKSPFVELFVLVGLPAAASVIN FVVLTSAASSANSGVFSTSRMLFGLAQEGVAPKAFAKLSKRAVPAKGLTFSCICLLGGVV MLYVNPSVIGAFTMITTVSAILFMFVWTIILCSYLVYRKQRPHLHEKSIYKMPLGKLMCW VCMAFFVFVIVLLTLEDDTRQALLVTPLWFIALGLGWLFIGKKRMAK >gi|296493190|gb|ADTK01000311.1| GENE 13 13294 - 13728 182 144 aa, chain + ## HITS:1 COG:no KEGG:ECO111_5094 NR:ns ## KEGG: ECO111_5094 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 144 1 144 144 290 100.0 1e-77 MAKHSWLLIAALPLSISPSWGADFYYRQQEKGTVYVAEQKGEKDEILSELPGVNFSRLWR IANLANNQETRLLSDFNPDKFDCDDRNNCQHTWLTDGRSVLWSGKVLKNPPGEPNVDAAS FQAFGAFAADKRSVYFDGQRTDDR >gi|296493190|gb|ADTK01000311.1| GENE 14 13827 - 14894 571 355 aa, chain + ## HITS:1 COG:no KEGG:ECO26_5378 NR:ns ## KEGG: ECO26_5378 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 355 19 373 373 701 99.0 0 MAILHPQECWLLERIMSPEYYRRRFEGWQEFVELCERQVAEWSKTMPLDVRRRPLCEQID 
VVWGGRVLPNIRSTLKSVQYDFIQLQQGDLRVLQSGGNISSDMKGLIDYPSDWMSLAAQK QYDRLKWRGAHYNNLIRRTSGGYWYDGELTYYYEESLHGPLALPMQLPLYELDSSVYLRE DDSVTVAGLYLPDIPDASAQLLYRSEHIPEAWQGRVRTKYVNEAGIQEYYWESGAWAKCN WKRIRRVANRFINVPPEGFFPQGMPEELYNWPQREAQYVTGRQRIAAFSGEACPHSGEWS IFVEGRQATVTLEQGEQMPEWTDRKMEGEYKRGEKFHVLWSLMNRHDGGSVWVEA >gi|296493190|gb|ADTK01000311.1| GENE 15 15141 - 15803 876 220 aa, chain - ## HITS:1 COG:ytfE KEGG:ns NR:ns ## COG: ytfE COG2846 # Protein_GI_number: 16132031 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Regulator of cell morphogenesis and NO signaling # Organism: Escherichia coli K12 # 1 220 1 220 220 423 99.0 1e-118 MAYRDQPLGELALSIPRASALFRKYDMDYCCGGKQTLSRAAARKELDVEVIEAELAKLAE QPIEKDWRSAPLAEIIDHIIVRYHDRHREQLPELILQATKVERVHADKPSVPKGLTKYLT MLHEELSSHMMKEEQILFPMIKQGMGSQAMGPISVMESEHDEAGELLEVIKHTTNNVTPP PEACTTWKAMYNGINELIDDLMDHISLENNVLFPRALAGE >gi|296493190|gb|ADTK01000311.1| GENE 16 15911 - 16876 895 321 aa, chain - ## HITS:1 COG:ECs5188 KEGG:ns NR:ns ## COG: ECs5188 COG0697 # Protein_GI_number: 15834442 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli O157:H7 # 1 321 4 324 324 555 98.0 1e-158 MISGVLYALLAGLMWGLIFVGPLIVPEYPAMLQSMGRYLALGLIALPIAWLGRVRLRQLA RRDWLTALMLTMMGNLIYYFCLASAIQRTGAPVSTMIIGTLPVVIPVFANLLYSQRDGKL AWGKLAPALICIGIGLASVNIAELNHGLPDFDWARYTSGIVLALVSVVCWAWYALRNARW LRENPDKHPMMWATAQALVTLPVSLIGYLVACYWLNTQTPDFSLPFGPRPLVFISLMVAI AVLCSWVGALCWNVASQRLPTVILGPLIVFETLAGLLYTFLIRQQMPPLMTLSGIALLVV GVVIAVRAKPEKPLTESVSES >gi|296493190|gb|ADTK01000311.1| GENE 17 16984 - 17844 884 286 aa, chain - ## HITS:1 COG:ytfG KEGG:ns NR:ns ## COG: ytfG COG0702 # Protein_GI_number: 16132033 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate-sugar epimerases # Organism: Escherichia coli 
K12 # 1 286 1 286 286 466 98.0 1e-131 MIAITGATGQLGHYVIESLMKTVPASQIVAIVRNPAKAQALAAQGITVRQADYGDEAALT SALQGVEKLLLISSSEVGQRAPQHRNVINAAKAAGVKFIAYTSLLHADTSPLGLADEHIE TEKMLADSGIVYTLLRNGWYTENYLASAPAALEHGVFIGAAGDGKIASATRADYAAAAAR VISEAGHEGKVYELAGDSAWTLTQLAAELTKQSGKSVTYQNLSEADFAAALKSVGLPDGL ADMLADSDVGASKGGLFDNSKTLSKLIGRPTTTLAESVSHLFNVNN >gi|296493190|gb|ADTK01000311.1| GENE 18 17843 - 18313 469 156 aa, chain + ## HITS:1 COG:ytfH KEGG:ns NR:ns ## COG: ytfH COG1733 # Protein_GI_number: 16132034 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 156 1 156 156 305 100.0 2e-83 MGSLHRFVLCLNTLTPKLTFSKYVQKGKYEMSQVSLSQQLKEGNLFAEQCPSREVLKHVT SRWGVLILVALREGTHRFSDLRRKIGGVSEKMLAQSLQALEQDGFLNRIAYPVVPPHVEY SLTPLGEQVSEKVAALADWIELNLPEVLAVRDERAA >gi|296493190|gb|ADTK01000311.1| GENE 19 18442 - 20385 2205 647 aa, chain - ## HITS:1 COG:cpdB KEGG:ns NR:ns ## COG: cpdB COG0737 # Protein_GI_number: 16132035 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases # Organism: Escherichia coli K12 # 1 647 1 647 647 1271 99.0 0 MIKFSATLLATLIAASVNAATVDLRIMETTDLHSNMMDFDYYKDTATEKFGLVRTASLIN DARNEVKNSVLVDNGDLIQGSPLADYMSAKGLKAGDVHPVYKALNTLDYTVGTLGNHEFN YGLDYLKNALAGAKFPYVNANVIDARTKQPMFTPYLIKDTEVVDKDGKKQTLKIGYIGVV PPQIMGWDKANLSGKVTVNDITETVRKYVPEMREKGADVVVVLAHSGLSADPYKVMAENS VYYLSEIPGVNAIMFGHAHAVFPGKDFADIEGADITKGTLNGVPAVMPGMWGDHLGVVDL QLSNDSGKWQVTQAKAEARPIYDIANKKSLAAEDSKLVETLKADHDATRQFVSKPIGKSA DNMYSYLALVQDDPTVQVVNNAQKAYVEHYIQGDPDLAKLPVLSAAAPFKVGGRKNDPAS YVEVEKGQLTFRNAADLYLYPNTLIVVKASGKEVKEWLECSAGQFNQIDPNSTKPQSLIN WDGFRTYNFDVIDGVNYQIDVTQPARYDGECQMINANAERIKNLTFNGKPIDPNAMFLVA TNNYRAYGGKFAGTGDSHIAFASPDENRSVLAAWIADESKRAGEIHPAADNNWRLAPIAG DKKLDIRFETSPSDKAAAFIKEKGQYPMNKVATDDIGFAIYQVDLSK >gi|296493190|gb|ADTK01000311.1| GENE 20 20575 - 21315 788 246 aa, chain + ## HITS:1 COG:cysQ KEGG:ns NR:ns ## COG: cysQ COG1218 # Protein_GI_number: 16132036 # Func_class: P Inorganic ion 
transport and metabolism # Function: 3'-Phosphoadenosine 5'-phosphosulfate (PAPS) 3'-phosphatase # Organism: Escherichia coli K12 # 1 246 1 246 246 476 99.0 1e-134 MLDQVCQLARNAGDAIMQVYDGTKPMDVVSKADNSPVTAADIAAHTVIMDGLRTLTPDIP VLSEEDPPGWEVRQHWQRYWLVDPLDGTKEFIKRNGEFTVNIALIDHGKPILGVVYAPVM NVMYSAAEGKAWKEECGVRKQIQVRDARPPLVVISRSHADAELKEYLQQLGEHQTTSIGS SLKFCLVAEGQAQLYPRFGPTNIWDTAAGHAVAAAAGAHVHDWQGKPLDYTPRESFLNPG FRVSIY >gi|296493190|gb|ADTK01000311.1| GENE 21 21305 - 21862 632 185 aa, chain - ## HITS:1 COG:ECs5194 KEGG:ns NR:ns ## COG: ECs5194 COG3054 # Protein_GI_number: 15834448 # Func_class: R General function prediction only # Function: Predicted transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 183 1 183 184 353 98.0 1e-97 MTLRKILALTCLLLPMMASAHQFETGQRVPPIGITDRGELVLDKDQFSYKTWNSAQLVGK VRVLQHIAGRTSAKEKNATLIEAIKSAKLPHDRYQTTTIVNTDDAIPGSGMFVRSSLESN KKLYPWSQFIVDSNGVARGAWQLDEESSAVVVLDKDGRVQWAKDGALTQEEVQQVMDLLH KLINK >gi|296493190|gb|ADTK01000311.1| GENE 22 22187 - 22393 253 68 aa, chain + ## HITS:1 COG:no KEGG:G2583_5047 NR:ns ## KEGG: G2583_5047 # Name: ytfK # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 68 14 81 81 119 100.0 4e-26 MKIFQRYNPLQVAKYVKILFRGRLYIKDVGAFEFDKGKILIPKVKDKLHLSVMSEVNRQV MRLQTEMA >gi|296493190|gb|ADTK01000311.1| GENE 23 22455 - 23741 1564 428 aa, chain - ## HITS:1 COG:ECs5196 KEGG:ns NR:ns ## COG: ECs5196 COG1253 # Protein_GI_number: 15834450 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Escherichia coli O157:H7 # 1 428 20 447 447 842 99.0 0 MSEISLAASRKIKLKLLADEGNINAQRVLNMQENPGMFFTVVQIGLNAVAILGGIVGDAA FSPAFHSLFSRYMSAELSEQLSFILSFSLVTGMFILFADLTPKRIGMIAPEAVALRIINP MRFCLYVCTPLVWFFNGLANMIFRIFKLPMVRKDDITSDDIYAVVEAGALAGVLRKQEHE LIENVFELESRTVPSSMTPRENVIWFDLHEDEQSLKNKVAEHPHSKFLVCNEDIDHIIGY VDSKDLLNRVLANQSLALNSGVQIRNTLIVPDTLTLSEALESFKTAGEDFAVIMNEYALV VGIITLNDVMTTLMGDLVGQGLEEQIVARDENSWLIDGGTPIDDVMRVLDIDEFPQSGNY 
ETIGGFMMFMLRKIPKRTDSVKFAGYKFEVVDIDNYRIDQLLVTRIDSKATALSPKLPDA KDKEESVA >gi|296493190|gb|ADTK01000311.1| GENE 24 24121 - 24759 697 212 aa, chain - ## HITS:1 COG:msrA KEGG:ns NR:ns ## COG: msrA COG0225 # Protein_GI_number: 16132041 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptide methionine sulfoxide reductase # Organism: Escherichia coli K12 # 1 212 1 212 212 405 100.0 1e-113 MSLFDKKHLVSPADALPGRNTPMPVATLHAVNGHSMTNVPDGMEIAIFAMGCFWGVERLF WQLPGVYSTAAGYTGGYTPNPTYREVCSGDTGHAEAVRIVYDPSVISYEQLLQVFWENHD PAQGMRQGNDHGTQYRSAIYPLTPEQDAAARASLERFQAAMLAADDDRHITTEIANATPF YYAEDDHQQYLHKNPYGYCGIGGIGVCLPPEA >gi|296493190|gb|ADTK01000311.1| GENE 25 24965 - 26698 1553 577 aa, chain + ## HITS:1 COG:ECs5198 KEGG:ns NR:ns ## COG: ECs5198 COG0729 # Protein_GI_number: 15834452 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein # Organism: Escherichia coli O157:H7 # 1 577 1 577 577 1171 100.0 0 MRYIRQLCCVSLLCLSGSAVAANVRLQVEGLSGQLEKNVRAQLSTIESDEVTPDRRFRAR VDDAIREGLKALGYYQPTIEFDLRPPPKKGRQVLIAKVTPGVPVLIGGTDVVLRGGARTD KDYLKLLDTRPAIGTVLNQGDYENFKKSLTSIALRKGYFDSEFTKAQLGIALGLHKAFWD IDYNSGERYRFGHVTFEGSQIRDEYLQNLVPFKEGDEYESKDLAELNRRLSATGWFNSVV VAPQFDKARETKVLPLTGVVSPRTENTIETGVGYSTDVGPRVKATWKKPWMNSYGHSLTT STSISAPEQTLDFSYKMPLLKNPLEQYYLVQGGFKRTDLNDTESDSTTLVASRYWDLSSG WQRAINLRWSLDHFTQGEITNTTMLFYPGVMISRTRSRGGLMPTWGDSQRYSIDYSNTAW GSDVDFSVFQAQNVWIRTLYDRHRFVTRGTLGWIETGDFDKVPPDLRFFAGGDRSIRGYK YKSIAPKYANGDLKGASKLITGSLEYQYNVTGKWWGAVFVDSGEAVSDIRRSDFKTGTGV GVRWESPVGPIKLDFAVPVADKDEHGLQFYIGLGPEL >gi|296493190|gb|ADTK01000311.1| GENE 26 26695 - 30474 4068 1259 aa, chain + ## HITS:1 COG:ytfN KEGG:ns NR:ns ## COG: ytfN COG2911 # Protein_GI_number: 16132043 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 1259 1 1259 1259 2435 99.0 0 MSLWKKISLGVVIVILLLLGSVAFLVGTTSGLHLVFKAADRWVPGLDIGKVTGGWRDLTL SDVRYEQPGVAVKAGNLHLAVGLECLWNSSVCINDLALKDIQVNIDSKKMPPSEQVEEEE 
DSGPLDLSTPYPITLTRVALDNVNIKIDDTTVSVMDFTSGLNWQEKTLTLKPTSLKGLLI ALPKVAEVAQEEVVEPKIENPQPEEKPLGETLKDLFSRPVLPEMTDLHLPLNLNIEEFKG EQLRVTGDTDITVRTMLLKVSSIDGNTKLDALDIDSNQGIVNASGTAQLSDNWPVDITLN STLNVEPLKGEKVKLKVGGALREQLEIGVNLSGPVDMDLRAQTRLAEAGLPLNVEVNSKQ LYWPFTGEKQYQADDLKLKLTGKMTDYTLSMRTAVKGQEIPPATITLDAKGNEQQVNLDK LTVAALEGKTELKALLDWQQAISWRGELTLNGINTAKEFPEWPSKLNGLIKTRGSLYGGT WQMEVPELKLTGNVKQNKVNVDGTLKGNSYMQWMIPGLHLELGPNSAEVKGELGVKDLNL DATINAPGLDNALPGLGGTAKGLVKVRGTVEAPQLLADITARGLRWQELSVAQVRVEGDI KSTDQIAGKLDVRVEQISQPDVNINLVTLNAKGSEKQHELQLRIQGEPVSGQLNLAGSFD RKEERWKGTLSNTRFQTPVGPWSLTRDIALDYRNKEQKISIGPHCWLNPNAELCVPQTID AGAEGRAVVNLNRFDLAMLKPFMPETTQASGIFTGKADVAWDTTKEGLPQGSITLSGRNV QVTQTVNDAALPVAFQTLNLTAELRNNRAELGWTIRLTNNGQFDGQVQVTDPQGRRNLGG NVNIRNFNLAMINPIFTRGEKAAGMVSANLRLGGDVQSPQLFGQLQVTGVDIDGNFMPFD MQPSQLAVNFNGMRSTLAGTVRTQQGEIYLNGDADWSQIENWRARVTAKGSKVRITVPPM VRMDVSPDVVFEATPNLFTLDGRVDVPWARIVVHDLPESAVGVSSDVVMLNDNLQPEEPK TASIPINSNLIVHVGNNVRIDAFGLKARLTGDLNVVQDKQGLGLNGQINIPEGRFHAYGQ DLIVRKGELLFSGPPDQPYLNIEAIRNPDATEDDVIAGVRVTGLADEPKAEIFSDPAMSQ QAALSYLLRGQGLESDQSDSAAMTSMLIGLGVAQSGQIVGKIGETFGVSNLALDTQGVGD SSQVVVSGYVLPGLQVKYGVGIFDSIATLTLRYRLMPKLYLEAVSGVDQALDLLYQFEF >gi|296493190|gb|ADTK01000311.1| GENE 27 30477 - 30818 230 113 aa, chain + ## HITS:1 COG:ECs5200 KEGG:ns NR:ns ## COG: ECs5200 COG2105 # Protein_GI_number: 15834454 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 113 1 113 113 226 100.0 1e-59 MRIFVYGSLRHKQGNSHWMTNAQLLGDFSIDNYQLYSLGHYPGAVPGNGTVHGEVYRIDN ATLAELDALRTRGGEYARQLIQTPYGSAWMYVYQRPVDGLKLIESGDWLDRDK Prediction of potential genes in microbial genomes Time: Mon May 16 16:00:06 2011 Seq name: gi|296493189|gb|ADTK01000312.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont950.7, whole genome shotgun sequence Length of sequence - 2841 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 
21 - 80 5.9 1 1 Op 1 8/0.000 + CDS 216 - 467 318 ## COG2336 Growth regulator 2 1 Op 2 . + CDS 461 - 811 352 ## COG2337 Growth inhibitor - Term 801 - 843 -0.6 3 2 Tu 1 . - CDS 891 - 1421 734 ## COG0221 Inorganic pyrophosphatase - Prom 1618 - 1677 4.7 + Prom 1503 - 1562 4.3 4 3 Tu 1 . + CDS 1731 - 2687 1091 ## COG1879 ABC-type sugar transport system, periplasmic component + Term 2745 - 2793 0.1 Predicted protein(s) >gi|296493189|gb|ADTK01000312.1| GENE 1 216 - 467 318 83 aa, chain + ## HITS:1 COG:chpS KEGG:ns NR:ns ## COG: chpS COG2336 # Protein_GI_number: 16132046 # Func_class: T Signal transduction mechanisms # Function: Growth regulator # Organism: Escherichia coli K12 # 1 83 3 85 85 154 100.0 3e-38 MRITIKRWGNSAGMVIPNIVMKELNLQPGQSVEAQVSNNQLILTPISRRYSLDELLAQCD MNAAELSEQDVWGKSTPAGDEIW >gi|296493189|gb|ADTK01000312.1| GENE 2 461 - 811 352 116 aa, chain + ## HITS:1 COG:ECs5203 KEGG:ns NR:ns ## COG: ECs5203 COG2337 # Protein_GI_number: 15834457 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Escherichia coli O157:H7 # 1 116 1 116 116 211 99.0 3e-55 MVKKSEFERGDIVLVGFDPASGHEQQGAGRPALVLSVQAFNQLGMTLVAPITQGGNFARY AGFSVPLHCEEGDVHGVVLVNQVRMMDLRARLAKRIGLAADEVVEEALLRLQAVGE >gi|296493189|gb|ADTK01000312.1| GENE 3 891 - 1421 734 176 aa, chain - ## HITS:1 COG:ECs5204 KEGG:ns NR:ns ## COG: ECs5204 COG0221 # Protein_GI_number: 15834458 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase # Organism: Escherichia coli O157:H7 # 1 176 1 176 176 341 98.0 4e-94 MSLLNVPAGKDLPEDIYVVIEIPANADPIKYEIDKESGALFVDRFMSTAMFYPCNYGYIN HTLSLDGDPVDVLVPTPYPLQPGSVIRCRPVGVLKMTDEAGEDAKLIAVPHTKLSKEYDH IKDVNDLPELLKAQIAHFFEHYKDLEKGKWVKVEGGENAEAAKAEIVASFERAKNK >gi|296493189|gb|ADTK01000312.1| GENE 4 1731 - 2687 1091 318 aa, chain + ## HITS:1 COG:ECs5205 KEGG:ns NR:ns ## COG: ECs5205 COG1879 # Protein_GI_number: 15834459 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component 
# Organism: Escherichia coli O157:H7 # 1 318 1 318 318 571 99.0 1e-163 MWKRLLVVSAVSAAMSSMALAAPLTVGFSQVGSESGWRAAETNVAKSEAEKRGITLKIAD GQQKQENQIKAVRSFVAQGVDAIFIAPVVATGWEPVLKEAKDAEIPVFLLDRSIDVKDKS LYMTTVTADNILEGKLIGDWLVKEVNGKPCNVVELQGTVGASVAIDRKKGFAEAIKNAPN IKIIRSQSGDFTRSKGKEVMESFIKAENNGKNICMVYAHNDDMVIGAIQAIKEAGLKPGK DILTGSIDGVPDIYKAMIDGEANASVELTPNMAGPAFDALEKYKKDGTMPEKLTLTKSTL YLPDTAKEELEKKKNMGY Prediction of potential genes in microbial genomes Time: Mon May 16 16:00:08 2011 Seq name: gi|296493188|gb|ADTK01000313.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont950.8, whole genome shotgun sequence Length of sequence - 3626 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 21/0.000 + CDS 72 - 1574 205 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 2 1 Op 2 11/0.000 + CDS 1588 - 2610 1093 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 3 1 Op 3 . 
+ CDS 2597 - 3592 1191 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components Predicted protein(s) >gi|296493188|gb|ADTK01000313.1| GENE 1 72 - 1574 205 500 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 268 477 9 214 245 83 26 2e-16 MTTDQHQEILRTEGLSKFFPGVKALDNVDFSLRRGEIMALLGENGAGKSTLIKALTGVYH ADRGTIWLEGQAISPKNTAHAQQLGIGTVYQEVNLLPNMSVADNLFIGREPKRFGFLRRK EMEKRATELMTSYGFSLDVREPLNRFSVAMQQIVAICRAIDLSAKVLILDEPTASLDTQE VELLFGLMRQLRDRGVSLIFVTHFLDQVYQVSDRITVLRNGSFVGCRETRELPQIELVKM MLGRELDTHALQRAGRTLLSDKPVAAFKNYGKKGTIAPFDLEVRPGEIVGLAGLLGSGRT ETAEVIFGIKPADSGTALIKGKPQTLRSPHQASVLGIGFCPEDRKTDGIIAAASVRENII LALQAQRGWLRPISRKEQQEIAERFIRQLGIRTPSTEQPIEFLSGGNQQKVLLSRWLLTR PQFLILDEPTRGIDVGAHAEIIRLIETLCADGLALLVISSELEELVGYADRVIIMRDRKQ VAEIPLAELSVPAIMNAIAA >gi|296493188|gb|ADTK01000313.1| GENE 2 1588 - 2610 1093 340 aa, chain + ## HITS:1 COG:ECs5207 KEGG:ns NR:ns ## COG: ECs5207 COG1172 # Protein_GI_number: 15834461 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Escherichia coli O157:H7 # 1 340 2 341 341 489 98.0 1e-138 MPQSLPDTTPPKRRFRWPTGMSQLVALLLVLLVDSLVAPHFWQVVLQDGRLFGSPIDILN RAAPVALLAIGMTLVIATGGIDLSVGAVMAIAGATTAAMTVAGFSLPIVLLSALGTGILA GLWNGILVAILKIQPFVATLILMVAGRGVAQLITAGQIVTFNSPDLSWFGSGSLLFLPTP VIIAVLTLLLFWLLTRKTALGMFIEAVGINIRAAKNAGVNTRIIVMLTYVLSGLCAAIAG IIVAADIRGADANNAGLWLELDAILAVVIGGGSLMGGRFNLLLSVVGALIIQGMNTGILL SGFPPEMNQVVKAVVVLCVLIVQSQRFISLIKGVRSHDKT >gi|296493188|gb|ADTK01000313.1| GENE 3 2597 - 3592 1191 331 aa, chain + ## HITS:1 COG:yjfF KEGG:ns NR:ns ## COG: yjfF COG1172 # Protein_GI_number: 16132053 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Escherichia coli K12 # 9 331 1 323 323 519 99.0 1e-147 
MIKRNLPLMITIGVFVLGYFYCLTQFPGFASTRVICNILTDNAFLGIIAVGMTFVILSGG IDLSVGSVIAFTGVFLAKVVGDFGLSPLLAFPLVLVMGCAFGAFMGLLIDALKIPAFIIT LAGMFFLRGVSYLVSEESIPINHPIYDTLSSLAWKIPGGGRLSAMGLLMLAVVVIGIFLA HRTRFGNQVYAIGGNATSANLMGISTRSTTIRIYMLSTGLATLAGIVFSIYTQAGYALAG VGVELDAIASVVIGGTLLSGGVGTVLGTLFGVAIQGLIQTYINFDGTLSSWWTKIAIGIL LFIFIALQRGLTVLWENRQSSPVTRVNIAQR Prediction of potential genes in microbial genomes Time: Mon May 16 16:00:17 2011 Seq name: gi|296493187|gb|ADTK01000314.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont950.9, whole genome shotgun sequence Length of sequence - 26186 bp Number of predicted genes - 23, with homology - 23 Number of transcription units - 19, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 26 - 1024 1082 ## COG0158 Fructose-1,6-bisphosphatase - Prom 1080 - 1139 2.2 + Prom 972 - 1031 4.9 2 2 Tu 1 . + CDS 1200 - 2573 1605 ## COG0773 UDP-N-acetylmuramate-alanine ligase + Term 2616 - 2667 5.9 3 3 Tu 1 . - CDS 2724 - 3275 774 ## COG3028 Uncharacterized protein conserved in bacteria - Prom 3295 - 3354 7.1 + Prom 3266 - 3325 3.1 4 4 Op 1 . + CDS 3369 - 4721 1379 ## COG0312 Predicted Zn-dependent proteases and their inactivated homologs + Prom 4823 - 4882 2.4 5 4 Op 2 . + CDS 4904 - 5290 547 ## COG3783 Soluble cytochrome b562 + Term 5299 - 5342 11.7 - Term 5217 - 5251 -0.6 6 5 Tu 1 . - CDS 5335 - 5799 415 ## COG0602 Organic radical activating enzymes - Prom 5863 - 5922 4.3 - Term 5837 - 5881 4.1 7 6 Tu 1 . - CDS 5957 - 8095 2417 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase - Prom 8274 - 8333 6.5 - Term 8439 - 8476 4.4 8 7 Op 1 9/0.200 - CDS 8489 - 10144 1383 ## COG0366 Glycosidases 9 7 Op 2 7/0.200 - CDS 10194 - 11612 1468 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific - Prom 11664 - 11723 4.4 - Term 11668 - 11694 -1.0 10 8 Tu 1 . - CDS 11734 - 12681 773 ## COG1609 Transcriptional regulators - Prom 12715 - 12774 7.1 11 9 Tu 1 . 
+ CDS 13060 - 15756 2751 ## COG0474 Cation transport ATPase + Term 15879 - 15915 2.4 12 10 Op 1 6/0.300 - CDS 15962 - 16348 547 ## COG0251 Putative translation initiation inhibitor, yjgF family 13 10 Op 2 19/0.000 - CDS 16421 - 16882 549 ## COG1781 Aspartate carbamoyltransferase, regulatory subunit 14 10 Op 3 . - CDS 16895 - 17830 1031 ## COG0540 Aspartate carbamoyltransferase, catalytic chain - Prom 18057 - 18116 8.3 15 11 Tu 1 . - CDS 18249 - 18644 291 ## COG0251 Putative translation initiation inhibitor, yjgF family - Prom 18694 - 18753 2.6 - Term 18718 - 18755 -0.0 16 12 Tu 1 . - CDS 18775 - 19488 272 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 - Prom 19512 - 19571 5.5 + Prom 19474 - 19533 5.0 17 13 Tu 1 . + CDS 19559 - 20152 269 ## COG1309 Transcriptional regulator + Prom 20212 - 20271 3.8 18 14 Tu 1 . + CDS 20297 - 20749 547 ## COG2731 Beta-galactosidase, beta subunit + Term 20758 - 20797 4.3 + Prom 20764 - 20823 8.0 19 15 Tu 1 . + CDS 20872 - 22455 746 ## B21_04083 hypothetical protein + Term 22590 - 22629 2.0 - Term 22574 - 22619 2.7 20 16 Tu 1 . - CDS 22632 - 23645 1051 ## COG0078 Ornithine carbamoyltransferase - Prom 23679 - 23738 6.8 + Prom 23631 - 23690 3.8 21 17 Tu 1 . + CDS 23807 - 24223 740 ## COG3076 Uncharacterized protein conserved in bacteria + Term 24238 - 24275 1.6 - Term 24226 - 24263 2.4 22 18 Tu 1 . - CDS 24269 - 24772 547 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Prom 24922 - 24981 6.9 + Prom 24744 - 24803 5.7 23 19 Tu 1 . 
+ CDS 24965 - 26161 493 ## COG4269 Predicted membrane protein Predicted protein(s) >gi|296493187|gb|ADTK01000314.1| GENE 1 26 - 1024 1082 332 aa, chain - ## HITS:1 COG:fbp KEGG:ns NR:ns ## COG: fbp COG0158 # Protein_GI_number: 16132054 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1,6-bisphosphatase # Organism: Escherichia coli K12 # 1 332 1 332 332 685 100.0 0 MKTLGEFIVEKQHEFSHATGELTALLSAIKLGAKIIHRDINKAGLVDILGASGAENVQGE VQQKLDLFANEKLKAALKARDIVAGIASEEEDEIVVFEGCEHAKYVVLMDPLDGSSNIDV NVSVGTIFSIYRRVTPVGTPVTEEDFLQPGNKQVAAGYVVYGSSTMLVYTTGCGVHAFTY DPSLGVFCLCQERMRFPEKGKTYSINEGNYIKFPNGVKKYIKFCQEEDKSTNRPYTSRYI GSLVADFHRNLLKGGIYLYPSTASHPDGKLRLLYECNPMAFLAEQAGGKASDGKERILDI IPETLHQRRSFFVGNDHMVEDVERFIREFPDA >gi|296493187|gb|ADTK01000314.1| GENE 2 1200 - 2573 1605 457 aa, chain + ## HITS:1 COG:ZyjfG KEGG:ns NR:ns ## COG: ZyjfG COG0773 # Protein_GI_number: 15804823 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate-alanine ligase # Organism: Escherichia coli O157:H7 EDL933 # 1 457 1 457 457 950 99.0 0 MRIHILGICGTFMGGLAMLARQLGHEVTGSDANVYPPMSTLLEKQGIELIQGYDASQLDP QPDLVIIGNAMTRGNPCVEAVLEKNIPYMSGPQWLHDFVLRDRWGLAVAGTHGKTTTAGM ATWILEQCGYKPGFVIGGVPGNFEVSARLGESDFFVIEADEYDCAFFDKRSKFVHYCPRT LILNNLEFDHADIFDDLKAIQKQFHHLVRIVPGQGRIIWPENDINLKQTMAMGCWSEQEL VGEQGHWQAKKLTTDASEWEVLLDGEKVGEVKWSLVGEHNMHNGLMAIAAARHVGVAPAD AANALGSFINARRRLELRGEANGVTVYDDFAHHPTAILATLAALRGKVGGTARIIAVLEP RSNTMKMGICKDDLAPSLGRADEVFLLQPAHIPWQVAEVAEACVQPAHWSGDVDTLADMV VKTAQPGDHILVMSNGGFGGIHQKLLDGLAKKAEAAQ >gi|296493187|gb|ADTK01000314.1| GENE 3 2724 - 3275 774 183 aa, chain - ## HITS:1 COG:ECs5211 KEGG:ns NR:ns ## COG: ECs5211 COG3028 # Protein_GI_number: 15834465 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 183 1 183 183 271 100.0 7e-73 MTKQPEDWLDDVPGDDIEDEDDEIIWVSKSEIKRDAEELKRLGAEIVDLGKNALDKIPLD ADLRAAIELAQRIKMEGRRRQLQLIGKMLRQRDVEPIRQALDKLKNRHNQQVVLFHKLEN 
LRDRLIDQGDDAIAEVLNLWPDADRQQLRTLIRNAKKEKEGNKPPKSARQIFQYLRELAE NEG >gi|296493187|gb|ADTK01000314.1| GENE 4 3369 - 4721 1379 450 aa, chain + ## HITS:1 COG:ECs5212 KEGG:ns NR:ns ## COG: ECs5212 COG0312 # Protein_GI_number: 15834466 # Func_class: R General function prediction only # Function: Predicted Zn-dependent proteases and their inactivated homologs # Organism: Escherichia coli O157:H7 # 1 450 1 450 450 885 100.0 0 MALAMKVISQVEAQRKILEEAVSTALELASGKSDGAEVAVSKTTGISVSTRYGEVENVEF NSDGALGITVYHQNRKGSASSTDLSPQAIARTVQAALDIARYTSPDPCAGVADKELLAFD APDLDLFHPAEVSPDEAIELAARAEQAALQADKRITNTEGGSFNSHYGVKVFGNSHGMLQ GYCSTRHSLSSCVIAEENGDMERDYAYTIGRAMSDLQTPEWVGADCARRTLSRLSPRKLS TMKAPVIFANEVATGLFGHLVGAIAGGSVYRKSTFLLDSLGKQILPDWLTIEEHPHLLKG LASTPFDSEGVRTERRDIIKDGILTQWLLTSYSARKLGLKSTGHAGGIHNWRIAGQGLSF EQMLKEMGTGLVVTELMGQGVSAITGDYSRGAAGFWVENGEIQYPVSEITIAGNLKDMWR NIVTVGNDIETRSNIQCGSVLLPEMKIAGQ >gi|296493187|gb|ADTK01000314.1| GENE 5 4904 - 5290 547 128 aa, chain + ## HITS:1 COG:STM4439 KEGG:ns NR:ns ## COG: STM4439 COG3783 # Protein_GI_number: 16767685 # Func_class: C Energy production and conversion # Function: Soluble cytochrome b562 # Organism: Salmonella typhimurium LT2 # 1 128 1 128 128 176 85.0 1e-44 MRKSLLAILAVSSLVFSSASFAADLEDNMETLNDNLKVIEKADNAAQVKDALTKMRAAAL DAQKATPPKLEDKSPDSPEMKDFRHGFDILVGQIDDALKLANEGKVKEAQAAAEQLKTTR NAYHQKYR >gi|296493187|gb|ADTK01000314.1| GENE 6 5335 - 5799 415 154 aa, chain - ## HITS:1 COG:ECs5214 KEGG:ns NR:ns ## COG: ECs5214 COG0602 # Protein_GI_number: 15834468 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Escherichia coli O157:H7 # 1 154 1 154 154 322 98.0 2e-88 MNYHQYYPVDIVNGPGTRCTLFVSGCVHECPGCYNKSTWRVNSGQPFTKAMEDKIIKDLN DTRIKRQGISLSGGDPLHPQNVPDILKLVKRIRAECPGKDIWVWTGYKLDELNAAQMQVV DLINVLVDGKFVQDLKDPSLIWRGSSNQVVHHLR >gi|296493187|gb|ADTK01000314.1| GENE 7 5957 - 8095 2417 712 aa, chain - ## HITS:1 COG:nrdD KEGG:ns NR:ns ## COG: nrdD COG1328 # Protein_GI_number: 16132060 # 
Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Escherichia coli K12 # 1 712 1 712 712 1507 100.0 0 MTPHVMKRDGCKVPFKSERIKEAILRAAKAAEVDDADYCATVAAVVSEQMQGRNQVDINE IQTAVENQLMSGPYKQLARAYIEYRHDRDIEREKRGRLNQEIRGLVEQTNASLLNENANK DSKVIPTQRDLLAGIVAKHYARQHLLPRDVVQAHERGDIHYHDLDYSPFFPMFNCMLIDL KGMLTQGFKMGNAEIEPPKSISTATAVTAQIIAQVASHIYGGTTINRIDEVLAPFVTASY NKHRKTAEEWNIPDAEGYANSRTIKECYDAFQSLEYEVNTLHTANGQTPFVTFGFGLGTS WESRLIQESILRNRIAGLGKNRKTAVFPKLVFAIRDGLNHKKGDPNYDIKQLALECASKR MYPDILNYDQVVKVTGSFKTPMGCRSFLGVWENENGEQIHDGRNNLGVISLNLPRIALEA KGDEATFWKLLDERLVLARKALMTRIARLEGVKARVAPILYMEGACGVRLNADDDVSEIF KNGRASISLGYIGIHETINALFGGEHVYDNEQLRAKGIAIVERLRQAVDQWKEETGYGFS LYSTPSENLCDRFCRLDTAEFGVVPGVTDKGYYTNSFHLDVEKKVNPYDKIDFEAPYPPL ANGGFICYGEYPNIQHNLKALEDVWDYSYQHVPYYGTNTPIDECYECGFTGEFECTSKGF TCPKCGNHDASRVSVTRRVCGYLGSPDARPFNAGKQEEVKRRVKHLGNGQIG >gi|296493187|gb|ADTK01000314.1| GENE 8 8489 - 10144 1383 551 aa, chain - ## HITS:1 COG:ECs5216 KEGG:ns NR:ns ## COG: ECs5216 COG0366 # Protein_GI_number: 15834470 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Escherichia coli O157:H7 # 1 551 1 551 551 1129 98.0 0 MTNLPHWWQNGVIYQIYPKSFQDTTGSGTGDLRGVIQRLDYLHKLGVDAIWLTPFYVSPQ VDNGYDVANYTAIDPTYGTLDDFDELVTQAKSRGIRIILDMVFNHTSTQHAWFREALNKE SPYRQFYIWRDGEPETPPNNWRSKFGGSAWRWHAESEQYYLHLFAPEQADLNWENPAVRA ELKKVCEFWADRGVDGLRLDVVNLISKDPRFPEDLDGDGRRFYTDGPRAHEFLHEMNRDV FTPRGLMTVGEMSSTSLEHCQRYAALTGSELSMTFNFHHLKVDYPGGEKWTLAKPDFVAL KTLFRHWQQGMHNVAWNALFWCNHDQPRIVSRFGDEGEYRVPAAKMLAMVLHGMQGTPYI YQGEEIGMTNPHFTRITDYRDVESLNMFAELRNNGRDADELLAILASKSRDNSRTPMQWS NGDNAGFTAGEPWIGLGDNYQQINVEAALTDESSVFYTYQKLIALRKQEAVLTWGNYQDL LPNSPVLWCYRREWKGQTLLVIANLSRGIQPWQPGQMRGNWQLVMHNYEEASPQPCAMNL RPFEAVWWLQK >gi|296493187|gb|ADTK01000314.1| GENE 9 10194 - 11612 1468 472 aa, chain - ## HITS:1 COG:ECs5217_2 KEGG:ns NR:ns ## COG: ECs5217_2 COG1263 # Protein_GI_number: 15834471 # Func_class: G Carbohydrate transport and metabolism 
# Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Escherichia coli O157:H7 # 92 472 1 381 381 690 100.0 0 MSKINQTDIDRLIELVGGRGNIATVSHCITRLRFVLNQPANARPKEIEQLPMVKGCFTNA GQFQVVIGTNVGDYYQALIASTGQAQVDKEQVKKAARQNMKWHEQLISHFAEIFFPLLPA LISGGLILGFRNVIGDLPMSNGQTLAQMYPSLQTIYDFLWLIGEAIFFYLPVGICWSAVK KMGGTPILGIVLGVTLVSPQLMNAYLLGQQLPEVWDFGMFSIAKVGYQAQVIPALLAGLA LGVIETRLKRIVPDYLYLVVVPVCSLILAVFLAHALIGPFGRMIGDGVAFAVRHLMTGSF APIGAALFGFLYAPLVITGVHQTTLAIDLQMIQSMGGTPVWPLIALSNIAQGSAVIGIII SSRKHNEREISVPAAISAWLGVTEPAMYGINLKYRFPMLCAMIGSGLAGLLCGLNGVMAN GIGVGGLPGILSIQPSYWQVFALAMAIAIIIPIVLTSFIYQRKYRLGTLDIV >gi|296493187|gb|ADTK01000314.1| GENE 10 11734 - 12681 773 315 aa, chain - ## HITS:1 COG:treR KEGG:ns NR:ns ## COG: treR COG1609 # Protein_GI_number: 16132063 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 315 1 315 315 619 99.0 1e-177 MQNRLTIKDIARLSGVGKSTVSRVLNNESGVSQRTRERVEAVMNQHGFSPSRSARAMRGQ SDKVVAIIVTRLDSLSENLAVQTMLPAFYEQGYDPIMMESQFSPQLVAEHLGVLKRRNID GVVLFGFTGITEEMLAHWQSSLVLLARDAKGFASVCYDDEGAIKILMQRLYDQGHRNISY LGVPHSDVTTGKRRHEAYLAFCKAHKLHPVAALPGLAMKQGYENVAKVITPETTALLCAT DTLALGASKYLQEQRIDTLQLASVGNTPLMKFLHPEIVTVDPGYAEAGRQAACQLIAQVT GRSEPQQIIIPATLS >gi|296493187|gb|ADTK01000314.1| GENE 11 13060 - 15756 2751 898 aa, chain + ## HITS:1 COG:ECs5219 KEGG:ns NR:ns ## COG: ECs5219 COG0474 # Protein_GI_number: 15834473 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Escherichia coli O157:H7 # 1 898 1 898 898 1816 100.0 0 MFKEIFTRLIRHLPSRLVHRDPLPGAQQTVNTVVPPSLSAHCLKMAVMPEEELWKTFDTH PEGLNQAEVESAREQHGENKLPAQQPSPWWVHLWVCYRNPFNILLTILGAISYATEDLFA AGVIALMVAISTLLNFIQEARSTKAADALKAMVSNTATVLRVINDKGENGWLEIPIDQLV PGDIIKLAAGDMIPADLRILQARDLFVAQASLTGESLPVEKAATTRQPEHSNPLECDTLC FMGTTVVSGTAQAMVIATGANTWFGQLAGRVSEQESEPNAFQQGISRVSMLLIRFMLVMA PVVLLINGYTKGDWWEAALFALSVAVGLTPEMLPMIVTSTLARGAVKLSKQKVIVKHLDA 
IQNFGAMDILCTDKTGTLTQDKIVLENHTDISGKTSERVLHSAWLNSHYQTGLKNLLDTA VLEGTDEESARSLASRWQKIDEIPFDFERRRMSVVVAENTEHHQLVCKGALQEILNVCSQ VRHNGEIVPLDDIMLRKIKRVTDTLNRQGLRVVAVATKYLPAREGDYQRADESDLILEGY IAFLDPPKETTAPALKALKASGITVKILTGDSELVAAKVCHEVGLDAGEVVIGSDIETLS DDELANLAQRTTLFARLTPMHKERIVTLLKREGHVVGFMGDGINDAPALRAADIGISVDG AVDIAREAADIILLEKSLMVLEEGVIEGRRTFANMLKYIKMTASSNFGNVFSVLVASAFL PFLPMLPLHLLIQNLLYDVSQVAIPFDNVDDEQIQKPQRWNPADLGRFMIFFGPISSIFD ILTFCLMWWVFHANTPETQTLFQSGWFVVGLLSQTLIVHMIRTRRVPFIQSCASWPLMIM TVIVMIVGIALPFSPLASYLQLQALPLSYFPWLVAILAGYMTLTQLVKGFYSRRYGWQ >gi|296493187|gb|ADTK01000314.1| GENE 12 15962 - 16348 547 128 aa, chain - ## HITS:1 COG:ECs5220 KEGG:ns NR:ns ## COG: ECs5220 COG0251 # Protein_GI_number: 15834474 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Escherichia coli O157:H7 # 1 128 14 141 141 226 99.0 6e-60 MSKTIATENAPAAIGPYVQGVDLGNMIITSGQIPVNPKTGEVPADVAAQARQSLDNVKAI VEAAGLKVGDIVKTTVFVKDLNDFATVNATYEAFFTEHNAAFPARSCVEVARLPKDVKIE IEAIAVRR >gi|296493187|gb|ADTK01000314.1| GENE 13 16421 - 16882 549 153 aa, chain - ## HITS:1 COG:ECs5221 KEGG:ns NR:ns ## COG: ECs5221 COG1781 # Protein_GI_number: 15834475 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, regulatory subunit # Organism: Escherichia coli O157:H7 # 1 153 1 153 153 304 100.0 5e-83 MTHDNKLQVEAIKRGTVIDHIPAQIGFKLLSLFKLTETDQRITIGLNLPSGEMGRKDLIK IENTFLSEDQVDQLALYAPQATVNRIDNYEVVGKSRPSLPERIDNVLVCPNSNCISHAEP VSSSFAVRKRANDIALKCKYCEKEFSHNVVLAN >gi|296493187|gb|ADTK01000314.1| GENE 14 16895 - 17830 1031 311 aa, chain - ## HITS:1 COG:ECs5222 KEGG:ns NR:ns ## COG: ECs5222 COG0540 # Protein_GI_number: 15834476 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, catalytic chain # Organism: Escherichia coli O157:H7 # 1 311 1 311 311 613 100.0 1e-176 MANPLYQKHIISINDLSRDDLNLVLATAAKLKANPQPELLKHKVIASCFFEASTRTRLSF 
ETSMHRLGASVVGFSDSANTSLGKKGETLADTISVISTYVDAIVMRHPQEGAARLATEFS GNVPVLNAGDGSNQHPTQTLLDLFTIQETQGRLDNLHVAMVGDLKYGRTVHSLTQALAKF DGNRFYFIAPDALAMPQYILDMLDEKGIAWSLHSSIEEVMAEVDILYMTRVQKERLDPSE YANVKAQFVLRASDLHNAKANMKVLHPLPRVDEIATDVDKTPHAWYFQQAGNGIFARQAL LALVLNRDLVL >gi|296493187|gb|ADTK01000314.1| GENE 15 18249 - 18644 291 131 aa, chain - ## HITS:1 COG:ECs5225 KEGG:ns NR:ns ## COG: ECs5225 COG0251 # Protein_GI_number: 15834479 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Escherichia coli O157:H7 # 1 131 1 131 131 264 100.0 3e-71 MVERTAVFPAGRHSLYAEHRYSAAIRSGDLLFVSGQVGSREDGTPEPDFQQQVRLAFDNL HATLAAAGCTFDDIIDVTSFHTDPEKQFEDIMTVKNEIFSAPPYPTWTAVGVTWLAGFDF EIKVIARIPEL >gi|296493187|gb|ADTK01000314.1| GENE 16 18775 - 19488 272 237 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 7 234 4 239 242 109 34 2e-23 MGAFTGKTVLILGGSRGIGAAIVRRFVTDGANVRFTYAGSKDAAKRLAQETGATAVFTDS ADRDAVIDVVRKSGALDILVVNAGIGVFGEALELNADDIDRLFKINIHAPYHASVEAARQ MPEGGRILIIGSVNGERMPVAGMAAYAASKSALQGMARGLARDFGPRGITINVVQPGPID TDANPANGPMRDMLHSFMAIKRHGQPEEVAGMVAWLAGPEASFVTGAMHTIDGAFGA >gi|296493187|gb|ADTK01000314.1| GENE 17 19559 - 20152 269 197 aa, chain + ## HITS:1 COG:STM1674 KEGG:ns NR:ns ## COG: STM1674 COG1309 # Protein_GI_number: 16765017 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Salmonella typhimurium LT2 # 2 197 1 196 196 215 54.0 4e-56 MVTKKQSRVPGRPRRFAPEQAVSAAKVLFHQKGFDAVSVAEVTDYLGITPPSLYAAFGSK AGLFSRVLNEYVGTEAIPLADILRDDRPVGECLVEVLKEAARRYSQNGGCAGCMVLEGIH SHDPLARDIAVQYYHAAETTIYDYIARRHPQSAQCVTDFMSTVMSGLSAKAREGHSIEQL CATAALAGEAIKTLLKE >gi|296493187|gb|ADTK01000314.1| GENE 18 20297 - 20749 547 150 aa, chain + ## HITS:1 COG:ECs5229 KEGG:ns NR:ns ## COG: ECs5229 COG2731 # Protein_GI_number: 15834483 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # 
Organism: Escherichia coli O157:H7 # 1 150 4 153 153 305 100.0 2e-83 MIIGNIHNLQPWLPQELRQAIEHIKAHVTAETPKGKHDIEGNRLFYLISEDMTEPYEARR AEYHARYLDIQIVLKGQEGMTFSTQPAGTPDTDWLADKDIAFLPEGVDEKTVILNEGDFV VFYPGEVHKPLCAVGAPAQVRKAVVKMLMA >gi|296493187|gb|ADTK01000314.1| GENE 19 20872 - 22455 746 527 aa, chain + ## HITS:1 COG:no KEGG:B21_04083 NR:ns ## KEGG: B21_04083 # Name: yjgL # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 526 1 526 550 905 91.0 0 MSKISGWNFSQNITSADNCKQKNEDLDTWYVGMNDFARIAGGQNSRSNILSPRAFLEFLA KIFTLGYVDFSKRSNEAGRNMMAHIKSSSYIKNNDGSEIMKFVMNNPEGERADLSKVEIE ITLASAFNNGIREGHTVIIFTQPDGSTNRYEGKSFERKDESSLHLITNKILACYQREANK EIARLLNIPQELNNSQDLNNSQVSCKDSVDSTITDLLEKPLNNALLAIRKEHLLLMPYVC NESISYLLGEKGILKEIDDLNAVNNYLLNNKKATDNEINDIKVNLSHILIDSLDDAKVNL TPVIDSILEIFLKFPYINDVRILDWCFNKRMQYFGDSEKIKYACSVINHIDFSRDQSKDF SCDQSKIKIAETLFFNLDKEHYKNSRKLQELIWDKLVAYVNDFNLSNQEKSRLILRLFDD VKLLFNEVPVSILVNDIFLKDFFMKQPDFAKWYFYQLIKKYEGEQLYLNELGYVYGNEEK TNEIVKKHPGYVIKIFEEKMGNELKIRTRMMKILRNGKINIYEYINI >gi|296493187|gb|ADTK01000314.1| GENE 20 22632 - 23645 1051 337 aa, chain - ## HITS:1 COG:ECs5231 KEGG:ns NR:ns ## COG: ECs5231 COG0078 # Protein_GI_number: 15834485 # Func_class: E Amino acid transport and metabolism # Function: Ornithine carbamoyltransferase # Organism: Escherichia coli O157:H7 # 1 334 1 334 334 673 99.0 0 MSGFYHKHFLKLLDFTPAELNSLLQLAAKLKADKKSGKEEAKLTGKNIALIFEKDSTRTR CSFEVAAYDQGARVTYLGPSGSQIGHKESIKDTARVLGRMYDGIQYRGYGQEIVETLAEY AGVPVWNGLTNEFHPTQLLADLLTMQEHLPGKAFNEMTLVYAGDARNNMGNSMLEAAALT GLDLRLVAPQACWPEAALVTECRALAQQNGGNITLTEDVAKGVEGADFIYTDVWVSMGEA KEKWAERIALLRDYQVNSKMMQLTGNPEVKFLHCLPAFHDDQTTLGKKMAEEFGLHGGME VTDEVFESAASIVFDQAENRMHTIKAVMVATLSKLNN >gi|296493187|gb|ADTK01000314.1| GENE 21 23807 - 24223 740 138 aa, chain + ## HITS:1 COG:ECs5232 KEGG:ns NR:ns ## COG: ECs5232 COG3076 # Protein_GI_number: 15834486 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 138 1 138 138 187 100.0 
5e-48 MANPEQLEEQREETRLIIEELLEDGSDPDALYTIEHHLSADDLETLEKAAVEAFKLGYEV TDPEELEVEDGDIVICCDILSECALNADLIDAQVEQLMTLAEKFDVEYDGWGTYFEDPNG EDGDDEDFVDEDDDGVRH >gi|296493187|gb|ADTK01000314.1| GENE 22 24269 - 24772 547 167 aa, chain - ## HITS:1 COG:STM4473 KEGG:ns NR:ns ## COG: STM4473 COG0454 # Protein_GI_number: 16767718 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Salmonella typhimurium LT2 # 1 167 1 167 167 261 80.0 3e-70 MNNIAPQSPVMRRLTLQDNPAIARVIRQVSAEYGLTADKGYTVADPNLDELYQVYSQPGH AYWVVEYEGEVVGGGGIAPLAGSESDICELQKMYFLPAIRGKGLAKKLALMAMEQAREMG FKRCYLETTAFLKEAIALYEHLGFEHIDYALGCTGHVDCEVRMLREL >gi|296493187|gb|ADTK01000314.1| GENE 23 24965 - 26161 493 398 aa, chain + ## HITS:1 COG:ECs5234 KEGG:ns NR:ns ## COG: ECs5234 COG4269 # Protein_GI_number: 15834488 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 398 1 398 398 652 98.0 0 MAQVINEMDVPSHSFVFHGTGERYFLICVVNVLLTIITLGIYLPWALMKCKRYLYANMEV NGQRFSYGITGGNVFVSCLVFVFFYFAILMTVSADMPLVGCVLTLSLLVLLIFMAAKGLR YQALMTSLNGVRFSFNCSLKGFWWVTFFLPILMAIGMGTVFFISTKMLHANSSSSVIISV VLMAIVGIVSIGIFNGTLYSLVMSFLWSNTSFGIHRFKVKLDTTYCIKYAILAFLALLPF LAVAGYIIFDQILNAYDSSVYANDDIENLQQFMEMQRKMIIAQLIYYFGIAVSTSYLTVS LRNHFMSNLSLNDGRIRFRSTLTYHGMLYRMCALVVISGITGGLAYPLLKIWMIDWQAKN TYLLGDLDDLPLINKEEQPDKGFLASISRGIMPSLPFL Prediction of potential genes in microbial genomes Time: Mon May 16 16:00:34 2011 Seq name: gi|296493186|gb|ADTK01000315.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont950.10, whole genome shotgun sequence Length of sequence - 35680 bp Number of predicted genes - 30, with homology - 30 Number of transcription units - 18, operones - 9 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.167 - CDS 40 - 2895 3461 ## COG0525 Valyl-tRNA synthetase 2 1 Op 2 12/0.167 - CDS 2895 - 3338 446 ## COG2927 DNA polymerase III, chi subunit - Prom 3404 - 3463 2.6 - Term 3357 
- 3400 1.2 3 2 Tu 1 . - CDS 3498 - 5009 1590 ## COG0260 Leucyl aminopeptidase - Prom 5168 - 5227 5.4 + Prom 5189 - 5248 3.8 4 3 Op 1 22/0.000 + CDS 5297 - 6376 914 ## COG0795 Predicted permeases 5 3 Op 2 . + CDS 6376 - 7458 1442 ## COG0795 Predicted permeases + Term 7463 - 7500 2.4 - Term 7455 - 7481 -1.0 6 4 Tu 1 1/1.000 - CDS 7577 - 9079 1360 ## COG0433 Predicted ATPase 7 5 Op 1 2/0.667 - CDS 9157 - 10155 725 ## COG1609 Transcriptional regulators 8 5 Op 2 5/0.167 - CDS 10222 - 11541 1247 ## COG2610 H+/gluconate symporter and related permeases - Term 11558 - 11588 3.0 9 6 Op 1 10/0.167 - CDS 11606 - 12370 774 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 10 6 Op 2 . - CDS 12394 - 13425 814 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases - Prom 13518 - 13577 12.9 + Prom 13557 - 13616 3.7 11 7 Tu 1 . + CDS 13642 - 14205 419 ## COG3265 Gluconate kinase - Term 14163 - 14199 3.0 12 8 Tu 1 . - CDS 14209 - 15228 1012 ## COG1064 Zn-dependent alcohol dehydrogenases - Prom 15451 - 15510 80.3 + TRNA 15424 - 15508 79.1 # Leu CAA 0 0 + Prom 15433 - 15492 80.4 13 9 Tu 1 . + CDS 15695 - 16960 387 ## PROTEIN SUPPORTED gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 + Term 16964 - 16997 5.2 + Prom 17803 - 17862 4.6 14 10 Tu 1 . + CDS 17936 - 18199 108 ## gi|300906120|ref|ZP_07123839.1| conserved domain protein - Term 18366 - 18403 -0.2 15 11 Tu 1 . - CDS 18538 - 18762 178 ## ECP_2965 hypothetical protein - Prom 18892 - 18951 6.8 16 12 Op 1 . - CDS 19377 - 20369 250 ## ECP_2966 PixG protein 17 12 Op 2 . - CDS 20392 - 20655 101 ## ECP_2967 PixF protein - Prom 20809 - 20868 6.3 18 13 Op 1 . - CDS 20994 - 21557 237 ## ECP_2968 PixJ protein 19 13 Op 2 10/0.167 - CDS 21594 - 22325 553 ## COG3121 P pilus assembly protein, chaperone PapD 20 13 Op 3 . - CDS 22394 - 24853 1876 ## COG3188 P pilus assembly protein, porin PapC - Prom 24896 - 24955 4.0 21 14 Op 1 . 
- CDS 25025 - 25612 388 ## ECP_2971 PixH protein 22 14 Op 2 . - CDS 25698 - 26231 443 ## COG3539 P pilus assembly protein, pilin FimA 23 14 Op 3 . - CDS 26276 - 26551 72 ## SNSL254_A4668 repressor protein - Prom 26647 - 26706 11.8 24 15 Tu 1 . - CDS 27357 - 27776 252 ## ECP_2974 hypothetical protein 25 16 Op 1 3/0.167 - CDS 28189 - 29436 848 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 26 16 Op 2 3/0.167 - CDS 29426 - 31435 1352 ## COG4192 Signal transduction histidine kinase regulating phosphoglycerate transport system 27 16 Op 3 . - CDS 31432 - 32703 681 ## COG1840 ABC-type Fe3+ transport system, periplasmic component - Prom 32920 - 32979 5.6 + Prom 32792 - 32851 6.0 28 17 Tu 1 . + CDS 33048 - 34412 1144 ## COG2271 Sugar phosphate permease + Prom 34531 - 34590 4.6 29 18 Op 1 6/0.167 + CDS 34704 - 35354 222 ## COG2963 Transposase and inactivated derivatives 30 18 Op 2 . + CDS 35351 - 35680 173 ## PROTEIN SUPPORTED gi|148984516|ref|ZP_01817804.1| 50S ribosomal protein L9 Predicted protein(s) >gi|296493186|gb|ADTK01000315.1| GENE 1 40 - 2895 3461 951 aa, chain - ## HITS:1 COG:valS KEGG:ns NR:ns ## COG: valS COG0525 # Protein_GI_number: 16132080 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Valyl-tRNA synthetase # Organism: Escherichia coli K12 # 1 951 1 951 951 1939 100.0 0 MEKTYNPQDIEQPLYEHWEKQGYFKPNGDESQESFCIMIPPPNVTGSLHMGHAFQQTIMD TMIRYQRMQGKNTLWQVGTDHAGIATQMVVERKIAAEEGKTRHDYGREAFIDKIWEWKAE SGGTITRQMRRLGNSVDWERERFTMDEGLSNAVKEVFVRLYKEDLIYRGKRLVNWDPKLR TAISDLEVENRESKGSMWHIRYPLADGAKTADGKDYLVVATTRPETLLGDTGVAVNPEDP RYKDLIGKYVILPLVNRRIPIVGDEHADMEKGTGCVKITPAHDFNDYEVGKRHALPMINI LTFDGDIRESAQVFDTKGNESDVYSSEIPAEFQKLERFAARKAVVAAVDALGLLEEIKPH DLTVPYGDRGGVVIEPMLTDQWYVRADVLAKPAVEAVENGDIQFVPKQYENMYFSWMRDI QDWCISRQLWWGHRIPAWYDEAGNVYVGRNEDEVRKENNLGADVVLRQDEDVLDTWFSSA LWTFSTLGWPENTDALRQFHPTSVMVSGFDIIFFWIARMIMMTMHFIKDENGKPQVPFHT VYMTGLIRDDEGQKMSKSKGNVIDPLDMVDGISLPELLEKRTGNMMQPQLADKIRKRTEK 
QFPNGIEPHGTDALRFTLAALASTGRDINWDMKRLEGYRNFCNKLWNASRFVLMNTEGQD CGFNGGEMTLSLADRWILAEFNQTIKAYREALDSFRFDIAAGILYEFTWNQFCDWYLELT KPVMNGGTEAELRGTRHTLVTVLEGLLRLAHPIIPFITETIWQRVKVLCGITADTIMLQP FPQYDASQVDEAALADTEWLKQAIVAVRNIRAEMNIAPGKPLELLLRGCSADAERRVNEN RGFLQTLARLESITVLPADDKGPVSVTKIIDGAELLIPMAGLINKEDELARLAKEVAKIE GEISRIENKLANEGFVARAPEAVIAKEREKLEGYAEAKAKLIEQQAVIAAL >gi|296493186|gb|ADTK01000315.1| GENE 2 2895 - 3338 446 147 aa, chain - ## HITS:1 COG:ECs5236 KEGG:ns NR:ns ## COG: ECs5236 COG2927 # Protein_GI_number: 15834490 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, chi subunit # Organism: Escherichia coli O157:H7 # 1 147 1 147 147 290 100.0 9e-79 MKNATFYLLDNDTTVDGLSAVEQLVCEIAAERWRSGKRVLIACEDEKQAYRLDEALWARP AESFVPHNLAGEGPRGGAPVEIAWPQKRSSSPRDILISLRTSFADFATAFTEVVDFVPYE DSLKQLARERYKAYRVAGFNLNTATWK >gi|296493186|gb|ADTK01000315.1| GENE 3 3498 - 5009 1590 503 aa, chain - ## HITS:1 COG:ECs5237 KEGG:ns NR:ns ## COG: ECs5237 COG0260 # Protein_GI_number: 15834491 # Func_class: E Amino acid transport and metabolism # Function: Leucyl aminopeptidase # Organism: Escherichia coli O157:H7 # 1 503 1 503 503 999 99.0 0 MEFSVKSGSPEKQRSACIVVGVFEPRRLSPIAEQLDKISDGYISALLRRGELEGKPGQTL LLHHVPNVLSERILLIGCGKERELDERQYKQVIQKTINTLNDTGSMEAVCFLTELHVKGR NNYWKVRQAVETAKETLYSFDQLKTNKSEPRRPLRKMVFNVPTRRELTSGERAIQHGLAI AAGIKAAKDLGNMPPNICNAAYLASQARQLADSYSKNVITRVIGEQQMRELGMHSYLAVG QGSQNESLMSVIEYKGNASEDARPIVLVGKGLTFDSGGISIKPSEGMDEMKYDMCGAAAV YGVMRMVAELQLPINVIGVLAGCENLPGGRAYRPGDVLTTMSGQTVEVLNTDAEGRLVLC DVLTYVERFEPEAVIDVATLTGACVIALGHHITGLMANHNPLAHELIAASEQSGDRAWRL PLGDEYQEQLESNFADMANIGGRPGGAITAGCFLSRFTRKYNWAHLDIAGTAWRSGKAKG ATGRPVALLAQFLLNRAGFNGEE >gi|296493186|gb|ADTK01000315.1| GENE 4 5297 - 6376 914 359 aa, chain + ## HITS:1 COG:ECs5238 KEGG:ns NR:ns ## COG: ECs5238 COG0795 # Protein_GI_number: 15834492 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Escherichia coli O157:H7 # 1 359 8 366 366 621 99.0 1e-178 
MRETLKSQLAILFILLLIFFCQKLVRILGAAVDGDIPANLVLSLLGLGVPEMAQLILPLS LFLGLLMTLGKLYTESEITVMHACGLSKAVLVKAAMILAVFTAIVAAVNVMWAGPWSSRH QDEVLAEAKANPGMAALAQGQFQQATNGSSVLFIESVDGSDFKDVFLAQIRPKGNARPSV VVADSGHLTQLRDGSQVVTLNQGTRFEGTALLRDFRITDFQDYQAIIGHQAVALDPNDTD QMDMRTLWNTDTDRARAELNWRITLVFTVFMMALMVVPLSVVNPRQGRVLSMLPAMLLYL LFFLIQTSLKSNGGKGKLDPTLWMWTVNLIYLALAIVLNLWDTVPVRRLRASFSRKGAV >gi|296493186|gb|ADTK01000315.1| GENE 5 6376 - 7458 1442 360 aa, chain + ## HITS:1 COG:ECs5239 KEGG:ns NR:ns ## COG: ECs5239 COG0795 # Protein_GI_number: 15834493 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Escherichia coli O157:H7 # 1 360 2 361 361 659 100.0 0 MQPFGVLDRYIGKTIFTTIMMTLFMLVSLSGIIKFVDQLKKAGQGSYDALGAGMYTLLSV PKDVQIFFPMAALLGALLGLGMLAQRSELVVMQASGFTRMQVALSVMKTAIPLVLLTMAI GEWVAPQGEQMARNYRAQAMYGGSLLSTQQGLWAKDGNNFVYIERVKGDEELGGISIYAF NENRRLQSVRYAATAKFDPEHKVWRLSQVDESDLTNPKQITGSQTVSGTWKTNLTPDKLG VVALDPDALSISGLHNYVKYLKSSGQDAGRYQLNMWSKIFQPLSVAVMMLMALSFIFGPL RSVPMGVRVVTGISFGFVFYVLDQIFGPLTLVYGIPPIIGALLPSASFFLISLWLLMRKS >gi|296493186|gb|ADTK01000315.1| GENE 6 7577 - 9079 1360 500 aa, chain - ## HITS:1 COG:yjgR KEGG:ns NR:ns ## COG: yjgR COG0433 # Protein_GI_number: 16132085 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Escherichia coli K12 # 1 500 1 500 500 949 98.0 0 MSEPLLIARTPDTELFLLPGMANRHGLITGATGTGKTVTLQKLAESLSEIGVPVFMADVK GDLTGIAQAGTASEKLLARLKNIGVNDWQPHANPVVVWDIFGEKGHPVRATVSDLGPLLL ARLLNLNDVQSGVLNIIFRIADDQGLLLLDFKDLRAITQYIGDNAKSFQNQYGNISSASV GAIQRGLLSLEQQGAAHFFGEPMLDIKDWMRTDTNGKGVINILSAEKLYQMPKLYAASLL WMLSELYEQLPEAGDLEKPKLVFFFDEAHLLFNDAPQVLLDKIEQVIRLIRSKGVGVWFV SQNPSDIPDNVLGQLGNRVQHALRAFTPKDQKAVKAAAQTMRANPAFDTEKAIQELGTGE ALISFLDVKGSPSVVERAMVIAPCSRMGPVTEDERNGLINHSPVYGKYEDEVDRESAYEM LQKGVQASIEQQNNPPAKGKEIAVDDGILGGLKDILFGTTGPRGGKKDGVVQTMAKSAAR QVTNQIVRGMLGSLLGGRRR >gi|296493186|gb|ADTK01000315.1| GENE 7 9157 - 10155 725 332 aa, chain - ## HITS:1 COG:idnR KEGG:ns NR:ns ## COG: idnR COG1609 # Protein_GI_number: 
16132086 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 332 1 332 332 657 99.0 0 MRNHRISLQDIATLAGVTKMTVSRYIRSPKKVAKETGERIAKIMEEINYIPNRAPGILLN AQSYTLGILIPSFQNQLFADILAGIESVTSEHNYQTLIANYNYDRDSEEESVINLLSYNI DGIILSEKYHTIRTVKFLRSATIPVVELMDVQGERLDMEVGFDNRQAAFDMVCTMLEKRV RHKILYLGSKDDTRDEQRYQGYCDAMMLHNLFPLRMNPRAISSIHLGMQLMRDALSANPD LDGVFCTNDDIAMGALLLCRERNLAVPEQISIAGFHGLEIGRQMIPSLASVITPRFDIGR MAAQMLLSKIKNNDHNHNTVDLGYQIYHGNTL >gi|296493186|gb|ADTK01000315.1| GENE 8 10222 - 11541 1247 439 aa, chain - ## HITS:1 COG:idnT KEGG:ns NR:ns ## COG: idnT COG2610 # Protein_GI_number: 16132087 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Escherichia coli K12 # 1 439 1 439 439 703 99.0 0 MPLIIIAAGVALLLILMIGFKVNGFIALVLVAAVVGFAEGMDAQAVLHSIQNGIGSTLGG LAMILGFGAMLGKLISDTGAAQRIATTLIATFGKKRVQWALVITGLVVGLAMFFEVGFVL LLPLVFTIVASSGLPLLYVGVPMVAALSVTHCFLPPHPGPTAIATIFEANLGTTLLYGFI ITIPTVIVAGPLFSKLLTRFEKAPPEGLFNPHLFSEEEMPSFWNSIFAAVIPVILMAIAA VCEITLPKTNTVRLFFEFVGNPAVALFIAIVIAIFTLGRRNGRSIEQIMDIIGDSIGAIA MIVFIIAGGGAFKQVLVDSGVGHYISHLMTGTTLSPLLMCWTVAALLRIALGSATVAAIT TAGVVLPIINVTHADPALMVLATGAGSVIASHVNDPGFWLFKGYFNLTVGETLRTWTVME TLISIMGLLGVLAINAVLH >gi|296493186|gb|ADTK01000315.1| GENE 9 11606 - 12370 774 254 aa, chain - ## HITS:1 COG:idnO KEGG:ns NR:ns ## COG: idnO COG1028 # Protein_GI_number: 16132088 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Escherichia coli K12 # 1 254 1 254 254 509 99.0 1e-144 MNDLFSLAGKNILITGSAQGIGFLLATGLGKYGAQIIINDITAERAELAVEKLHQEGIQA VAAPFNVTHKHEIDAAVEHIEKDIGPIDVLVNNAGIQRRHPFTEFPEREWNDVIAVNQTA VFLVSQAVTRHMVERKAGKVINICSMQSELGRDTITPYAASKGAVKMLTRGMCVELARHN 
IQVNGIAPGYFKTEMTKALVEDEAFTAWLCKRTPAARWGDPQELIGAAVFLSSKASDFVN GHLLFVDGGMLVAV >gi|296493186|gb|ADTK01000315.1| GENE 10 12394 - 13425 814 343 aa, chain - ## HITS:1 COG:idnD KEGG:ns NR:ns ## COG: idnD COG1063 # Protein_GI_number: 16132089 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 343 1 343 343 708 99.0 0 MQVKTQSCVVAGKKTVAVTEQTIDWNNNGTLVQITRGGICGSDLHYYQEGKVGNFMIKAP MVLGHEVIGKVIHSDSSELHEGQTVAINPSKPCGHCKYCIEHNENQCTEMRFFGSAMYFP HVDGGFTRYKMVETSQCVPYPAKADEKVMAFAEPLAVAIHAAHQAGELQGKRVFISGVGP IGCLIVSAVKTLGAAEIVCADVSPRSLSLGKEMGADVLVNPQNDDMDHWKAEKGYFDVSF EVSGHPSSVNTCLEVTRARGVMVQVGMGGAMAEFPMMTLIGKEISLKGSFRFTSEFNTAV SWLANGVINPLPLLSAEYPFTDLEEALRFAGDKTQAAKVQLVF >gi|296493186|gb|ADTK01000315.1| GENE 11 13642 - 14205 419 187 aa, chain + ## HITS:1 COG:idnK KEGG:ns NR:ns ## COG: idnK COG3265 # Protein_GI_number: 16132090 # Func_class: G Carbohydrate transport and metabolism # Function: Gluconate kinase # Organism: Escherichia coli K12 # 1 187 1 187 187 380 100.0 1e-106 MAGESFILMGVSGSGKTLIGSKVAALLSAKFIDGDDLHPAKNIDKMSQGIPLSDEDRLPW LERLNDASYSLYKKNETGFIVCSSLKKQYRDILRKGSPHVHFLWLDGDYETILARMQRRA GHFMPVALLKSQFEALERPQADEQDIVRIDINHDIANVTEQCRQAVLAIRQNRICAKEGS ASDQRCE >gi|296493186|gb|ADTK01000315.1| GENE 12 14209 - 15228 1012 339 aa, chain - ## HITS:1 COG:yjgB KEGG:ns NR:ns ## COG: yjgB COG1064 # Protein_GI_number: 16132091 # Func_class: R General function prediction only # Function: Zn-dependent alcohol dehydrogenases # Organism: Escherichia coli K12 # 1 339 15 353 353 674 98.0 0 MSMIKSYAAKEAGGELEVYEYDPGELKPQDVEVQVDYCGICHSDLSMIDNEWGFSQYPLV AGHEVIGRVVALGTPAQDKGLQVGQRVGIGWTARSCGHCDACISGNQINCEQGAVPTIMN RGGFAEKLRADWQWVIPLPENIDIESAGPLLCGGITVFKPLLMHHITATSRVGVIGIGGL GHIAIKLLHAMGCEVTAFSSNPAKEQEVLAMGADKVVNSRDPQALKTLAGQFDLIINTVN VSLDWQPYFEALTYGGNFHTVGAVLTPLSVPAFTLIAGDRSISGSATGTPYELRKLMRFA ARSKVAPTTELFPMSKINDAIQHVRDGKARYRVVLKADF 
>gi|296493186|gb|ADTK01000315.1| GENE 13 15695 - 16960 387 421 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 [Campylobacter concisus 13826] # 49 401 60 402 406 153 29 1e-36 MALTDIKVRAAKPTDKQYKLTDGGGMHLLVHPNGSKYWRLQYRYEGKQKMLALGVYPEIT LADARVRRDEARKLLANGVDPGDKKKNDKVEQSKARTFKEVAIEWHGTNKKWSEDHAHRV LKSLEDNLFAALGERNIAELKTRDLLAPIKAVEMSGRLEVAARLQQRTTAIMRYAVQSGL IDYNPAQEMAGAVASCNRQHRPALELKRIPELLTKIDSYTGRPLTRWATELSLLIFIRSS ELRFARWSEIDFEASIWTIPPEREPIPGVKHSHRGSKMRTTHLVPLSTQALAILKQIKQF CGAHDLIFIGDHDSHKPMSENTVNSALRVMGYDTKVEVCGHGFRTMACSSLVESGLWSRD AVERQMSHMERNSVRAAYIHKAEHLEERRLMLQWWADFLDANREKFISPFEYAKINNPLK Q >gi|296493186|gb|ADTK01000315.1| GENE 14 17936 - 18199 108 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|300906120|ref|ZP_07123839.1| ## NR: gi|300906120|ref|ZP_07123839.1| conserved domain protein [Escherichia coli MS 84-1] # 1 87 1 87 87 139 100.0 4e-32 MPKSLDELSHAENWLNLRLQRQKSRINNFFGKNAVLSIIGLSYRAAPMMKKVAKTIEKRL NGMRHGVSNGNEESMNNKIRMLRIKAR >gi|296493186|gb|ADTK01000315.1| GENE 15 18538 - 18762 178 74 aa, chain - ## HITS:1 COG:no KEGG:ECP_2965 NR:ns ## KEGG: ECP_2965 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_536 # Pathway: not_defined # 1 74 187 260 260 145 100.0 4e-34 MFTQGLERCKQSHYIIETILELSKKLNIKTIAEGVENMELVNKLMLLGVDYFQGYYFGQA IELCSFIKEYRKTV >gi|296493186|gb|ADTK01000315.1| GENE 16 19377 - 20369 250 330 aa, chain - ## HITS:1 COG:no KEGG:ECP_2966 NR:ns ## KEGG: ECP_2966 # Name: not_defined # Def: PixG protein # Organism: E.coli_536 # Pathway: not_defined # 1 330 1 330 330 625 98.0 1e-177 MKYVKKILGISFFSLLPLSAEAESYCNLLWANNTLPSASVNVSFDGNTSGIFPLNLQAGG LSTVIKRAAKNMDTFVTLDGTYRYWIQYPEAWQTTPDGLKYRITSELEVSGTQTAGVKTV VTSVGYHTWENTYGCRDVGGTYDFEVASVSGVNIEIDRGTARPGVYSIQLPVKVAYEENK GNYDGKNGGGWREFPVSMKSFSPVDSKGISITISSKCNVGEQSLSVNMGDNITPDEAKSG VEKKVNFSLTCNAPAKVSLSLKGTDIVDGVNNKTKCGSGSCSLNFDNDSSSKILEVNQGT YQVPITVRFQDANPVAGGFDGSAVLSVDIL >gi|296493186|gb|ADTK01000315.1| GENE 17 20392 - 20655 101 87 aa, chain - ## 
HITS:1 COG:no KEGG:ECP_2967 NR:ns ## KEGG: ECP_2967 # Name: not_defined # Def: PixF protein # Organism: E.coli_536 # Pathway: not_defined # 1 87 86 172 172 159 100.0 2e-38 MTGKDNVLATNIDGLGIELYQGGEGTGNHLILGSGSSGYGYEVINALSEKNVERTTFTFT AKIYKAEGVTINSGEFSASALINIVYL >gi|296493186|gb|ADTK01000315.1| GENE 18 20994 - 21557 237 187 aa, chain - ## HITS:1 COG:no KEGG:ECP_2968 NR:ns ## KEGG: ECP_2968 # Name: not_defined # Def: PixJ protein # Organism: E.coli_536 # Pathway: not_defined # 1 187 1 187 187 381 100.0 1e-105 MNRATIGFYLAVVLGSCSMNGVSQADELLTRDDFFVADESRHQWVTEHNGRTGALNVKGA LVSSPCILDTPEVNFPLQKDKGRYVLNLKVSRCGDGHSDIPERQSTEHGNIVVKQSAVLK TGKDQILLSAWRNGGLSRKNLQHGDNQITYYINEAQYQKIAKTQMPVTAHKTSDSNRGVM HLNIMYE >gi|296493186|gb|ADTK01000315.1| GENE 19 21594 - 22325 553 243 aa, chain - ## HITS:1 COG:YPO0699 KEGG:ns NR:ns ## COG: YPO0699 COG3121 # Protein_GI_number: 16121020 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Yersinia pestis # 24 242 20 237 239 279 61.0 3e-75 MQKMKPALKKTLMAVACLSAVPAAQAAVSLDRTRAIFNGDEKSMTLNIANDNKQLPYLAQ AWVENEKKEKITTGPIIATPPVQRLEPGSKSMVRLTSTPDISRLPQDRESLFYFSLREIP PKSDKANVLQIALQTKIKLFYRPESIKAKPNAVWQDQLVLNKTSGGYRIDNPTPYYVTVI GIGGSEKQAREGEFDAVMLAPKSTQMVKSGTYNTPYLSYINDYGGRPVLQFSCGGSRCTA VKK >gi|296493186|gb|ADTK01000315.1| GENE 20 22394 - 24853 1876 819 aa, chain - ## HITS:1 COG:YPO0698 KEGG:ns NR:ns ## COG: YPO0698 COG3188 # Protein_GI_number: 16121019 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Yersinia pestis # 1 818 15 825 826 819 50.0 0 MLSLAGVPVYAVDFNTDVLDAADRQNIDFSRFSRAGYIMPGQYQMEIRVNGQDISPSAFQ IAFLEPPFSDSDNEKPLPEPCLTPEIVSRMGLTEASQEKVTYWNNGQCADFRQLSGVEIR PNPAEGMLYINMPQAWLEYSDASWLPPSRWDNGIPGLLFDYNINGTVNKPHQGKQSQSLN YNGTAGANFGAWRLRADYQGNLNHTTGSAQGTDSQFTWSRFYMYRAIPRWRANLTLGENY INSEIFSSWRYTGASLESDDRMLPPKLRGYAPQVSGIADTNARVVISQQGRILYDSTVPA 
GPFTIQDLDSSVRGRLDVEVIEQDGRKKTFQVDTAYVPYLTRPGQVRYKLVSGRSRTYEH TMEGPVFAAGEASWGISNTWSLYGGSIVAGDYNALAVGLGRDLSKFGTVSADVTQSVARI PGYDTKQGKSWRLSYSKRFDEVNTDITFAGYRFSERNYMTMDQYLNARYRNDFTGREKEL YTVTLNKNFEDWKASVNLQYSHQTYWDRRTSDYYTLSVNRYFDAFSFKNIALGISASRSK YLNRDNDSAFVRLSVPWGTGTASYSGSMSNDRYTNTVGYSDTLNNGLSSYSLNAGVNSGG GQPSQRQMSAYYNHNGSLTNLSASFSAVENGYSSFGMSASGGATVTMKGAALHAGGMNGG TRLLVDTDGVGGVPVDGGRVYTNRWGIGVVTDVSSYYRNTTSVDLNKLPEDMEATRSVVE SVLTEGAIGYREFEVLKGSRLFAVLRMSDNSYPPFGASVTNAKGRELGMVADSGLAWLSG VNPGETLNVGWDGRTQCVVDIPAHPDPAQQLLLPCRQVK >gi|296493186|gb|ADTK01000315.1| GENE 21 25025 - 25612 388 195 aa, chain - ## HITS:1 COG:no KEGG:ECP_2971 NR:ns ## KEGG: ECP_2971 # Name: not_defined # Def: PixH protein # Organism: E.coli_536 # Pathway: not_defined # 1 195 1 195 195 399 99.0 1e-110 MLRRMILSVLFLWCAMAQALPSASFPPPGMTLPEYWGEEHVWWDGRASFKGQVIAPACTL SMEDAWQEIDMGTTPLRDLQNSPAGPEKKFRLRLRNCELTGAGKQVYTATRVRVTFDGIP GETPDKFSLTGQAEGINLQIMDNYGYPARAGKSMPPLILRGNEDGLDYSLRIVRNGYPLK AGDYYAALRFKLDYE >gi|296493186|gb|ADTK01000315.1| GENE 22 25698 - 26231 443 177 aa, chain - ## HITS:1 COG:YPO2759 KEGG:ns NR:ns ## COG: YPO2759 COG3539 # Protein_GI_number: 16122963 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Yersinia pestis # 2 176 4 175 176 90 38.0 2e-18 MLKSTLGIAIAFSLCAVAHAANQGQSVVNFKGTVIDAPCGIDPDSADQTIDFGQISKSHL KNDGISVKKDLDIKLVNCDFTDPTAKKTVSVTFSGTPVNGHTDELGTAGDTGTAVVVSAS DGSMVKFDGTTSSNATQLQDGDNTLKYTTWVKKSSAAGTEVKEGDFTAVANFNLSYQ >gi|296493186|gb|ADTK01000315.1| GENE 23 26276 - 26551 72 91 aa, chain - ## HITS:1 COG:no KEGG:SNSL254_A4668 NR:ns ## KEGG: SNSL254_A4668 # Name: not_defined # Def: repressor protein # Organism: S.enterica_Newport # Pathway: not_defined # 10 88 21 99 101 73 49.0 3e-12 MLLRNGMGLLSKDRYTSPGSMSIEHFRCLVEISSINSQKTIMAMEDYFVHGKTRKEACER NNVAQSYFSISVKKFIKISNAVAQASKFYRR >gi|296493186|gb|ADTK01000315.1| GENE 24 27357 - 27776 252 139 aa, chain - ## HITS:1 COG:no KEGG:ECP_2974 NR:ns 
## KEGG: ECP_2974 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_536 # Pathway: not_defined # 1 139 26 164 164 263 99.0 2e-69 MKAGISVRSGRRIEKGQRAKNSVRHWSTRKDPLEAVWDSMLVPLLKERPVLTPTTLLEML QDKYPGQYPNSFRRTMQRRGREWKLQSGAEQEVMFRQWHQPGLRGLLDFTKLKGVVVTIA GKLLVHMLYNFRLEWSHWS >gi|296493186|gb|ADTK01000315.1| GENE 25 28189 - 29436 848 415 aa, chain - ## HITS:1 COG:STM2396 KEGG:ns NR:ns ## COG: STM2396 COG2204 # Protein_GI_number: 16765722 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Salmonella typhimurium LT2 # 1 414 1 414 415 661 80.0 0 MLSSEYSILLIDDDADVLDAYTQLLEQSGYRVFACNNPFEAQAWIQPDWPGIVLSDVCMP GCSGIDLMMLFHQDDQQLPILLITGHGDVPMAVDAVKKGAWDFLQKPVDPGKLLSLVEEA LRQRQSIIARRQYCQQTLQVELIGRSEWINQYRRRLQQLSETDIAVWLYGAPGTGRMTGA RYLHQFGRNAQGEFVYRELTPDNAPQLNDFIALAQGGTLVLSHPEHLTREQQYHLVQLQS QEHRPFRLIGIGDTSLVELAASNHIIAELYYCFAMTQIACLPLTQRPDDIEPLFRHYLCK ACQRLNHPVPEVGKEMFKEMMRRMWPNNVRELANAAELFTVGILPLAETANPLMHVGTPA PLDRRVEDAERQIITEALNIHQGRINEVAEYLQIPRKKLYLRMKKYGLSKEHYKN >gi|296493186|gb|ADTK01000315.1| GENE 26 29426 - 31435 1352 669 aa, chain - ## HITS:1 COG:STM2397 KEGG:ns NR:ns ## COG: STM2397 COG4192 # Protein_GI_number: 16765723 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase regulating phosphoglycerate transport system # Organism: Salmonella typhimurium LT2 # 6 669 5 668 668 973 83.0 0 MKNITLWQRLRQVSISTSLRCAFLMGALLTLIVSSVSLYSWHEQSSQIRYSLDEYFPRIH SAFLIEGNLNLVVDQLNEFLQAPNTTVRLQLRTQIIQHLDTIERLSRGLSSRERQQLTVI LRDSRSLLSELDRALYNMFLLREKVSELSARIDWLHDDFTTELNSLVQDFTWQQGTLLDQ IASRQGDTAQYLKRSREVQNEQQQVYTLARIENQIVDDLRDRLNELKSGRDDDTQVETHL RYFENLKKTADENIRMLDDWPGTITLRQTIDELLDMGIVKNKMPDTMREYVAAQKALEDA SRTREATLGRFRTLLEAQLGSTHQQMQMFNQRMEQIVRVSGGLILVATALALLLAWVFNH YFIRSRLVKRFTLLNQAVVQIGLGGTETTIPVYGNDELGRIAGLLRHTLGQLNVQKQQLE QEITDRKVIEADLRATQDELIQTAKLAVVGQTMTTLAHEINQPLNALSMYLFTARRAIEQ 
TQKEQASMMLGKAEGVISRIDAIIRSLRQFTRRAELETSLHAVDLAQMFSAAWELLAMRH RSLQATLVLPQGTATVSGDEVRTQQVLVNVLANALDVCGQGAVITVNWQMQGKTLNVFIG DNGPGWPEALLPSLLKPFTTSKEVGLGIGLSICVSLMEQMKGELRLASTMTRNACVVLQF RLTDVEDAK >gi|296493186|gb|ADTK01000315.1| GENE 27 31432 - 32703 681 423 aa, chain - ## HITS:1 COG:STM2398 KEGG:ns NR:ns ## COG: STM2398 COG1840 # Protein_GI_number: 16765724 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, periplasmic component # Organism: Salmonella typhimurium LT2 # 29 423 3 397 397 631 84.0 0 MTWITNRFFSWCEIGIKTMILLLLLMDTGAVRAQRNELVMATTFSPGATAWIIQRWQTEP ESVMIRTLNRTSASLEQLLYTANVENVDLILTSSPMLLQHLQEHQKLAPFDDAPAESQNL VPESIRATSVAVAISGFGLLINRPALSVKHLPAPADWDDLALPIYQDALLMSSPSRSDTN HLMVESLLQQKGWVKGWETLLTSAGNLVTISSRSFGVADKIKSGLGVAGPVIDNYANLLL NDPHLSFTYFPRSAVSPTYVAILKKSPHADAARRFIHYLLSPKGQRILADANTGKYPVTP LAADNPRATQQQLLMNQPPLNYHLILKRQRLVQRLFDTAISFRLAQLKDAWRALHSTEAR LKKPLPEIRALLTQVPVAAASSEDPVWLAQFDNKSFTEQQMMKWQLWFLNNQRLAIKKLE ELK >gi|296493186|gb|ADTK01000315.1| GENE 28 33048 - 34412 1144 454 aa, chain + ## HITS:1 COG:STM2399 KEGG:ns NR:ns ## COG: STM2399 COG2271 # Protein_GI_number: 16765725 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Salmonella typhimurium LT2 # 1 447 1 447 463 803 90.0 0 MLSIFKTGQAADSVPAEKIQVTYRRYRMQALLSVFLGYLAYYIVRNNFTLSTPYLKEQLD LSATQIGVLSSCMLIAYGISKGVMSSLADKASPKVFMACGLVLCAIVNVGLGFSTAFWIF AVLVILNGLFQGMGVGPSFITIANWFPRRERGRVGAFWNISHNVGGGIVAPIVGAAFALL GSEHWQSASYIVPACVAIVFAVIVLILGKGSPRQEGLPSLEEMMPEEKVVLNTRQTVKAP ENMSAFQIFCTYVLRNKNAWYVSLVDVFVYMVRFGMISWLPIYLLTVKHFSKEQMSVAFL FFEWAAIPSTLLAGWLSDKLFKGRRMPLAMICMALIFICLIGYWKSESLFMVTIFAAIVG CLIYVPQFLASVQTMEIVPSFAVGSAVGLRGFMSYIFGASLGTSLFGIMVDHIGWHGGFY LLGCGIICCIIFCWLSHRGAIELERHRAAYIKEH >gi|296493186|gb|ADTK01000315.1| GENE 29 34704 - 35354 222 216 aa, chain + ## HITS:1 COG:Z4315_2 KEGG:ns NR:ns ## COG: Z4315_2 COG2963 # Protein_GI_number: 15803508 # Func_class: L Replication, recombination and repair # Function: 
Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 EDL933 # 91 215 3 133 134 90 43.0 2e-18 MNARKAVLADNPELILRVLQLRFDESLSYPRISAQTGISKTAIFSLVRRFHQVFTDWPLS GEYSCVQLARALFPGRYPSAPTVTQTVKTEKTRRNQFSPEFKWRLVQQTLLPGACVAQIA RENGINDNLLFNWRRLWRNGGLQSPGEHETSLLPVTLTPEPDNKIPAPVQIPEQTNTLSD SLCCELVLPAGTLRLKGKLTPALLQTLIREIKGSSH >gi|296493186|gb|ADTK01000315.1| GENE 30 35351 - 35680 173 110 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148984516|ref|ZP_01817804.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP3-BS71] # 1 102 1 102 107 71 38 8e-12 LMISLPAGSRIWLVAGITDMRNGFNGLASKVQNVLKDDPFSGHLFIFRGRRGDQIKVLWA DSDGLCLFTKRLERGRFVWPVTRDGKVHLTPAQLSMLLEGINWKHPKRTE Prediction of potential genes in microbial genomes Time: Mon May 16 16:01:00 2011 Seq name: gi|296493185|gb|ADTK01000316.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont951.1, whole genome shotgun sequence Length of sequence - 1915 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 497 - 535 7.7 1 1 Tu 1 . 
- CDS 598 - 1812 289 ## COG0477 Permeases of the major facilitator superfamily Predicted protein(s) >gi|296493185|gb|ADTK01000316.1| GENE 1 598 - 1812 289 404 aa, chain - ## HITS:1 COG:SMc00744 KEGG:ns NR:ns ## COG: SMc00744 COG0477 # Protein_GI_number: 15966381 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Sinorhizobium meliloti # 1 384 1 384 399 427 67.0 1e-119 MTTTRPAWAYTLPAALLLMAPFDILASLAMDIYLPVVPAMPGILNTTPAMIQLTLSLYMV MLGVGQVIFGPLSDRIGRRPILLAGATAFVIASLGAAWSSTAPAFVAFRLLQAVGASAML VATFATVRDVYANRPEGVVIYGLFSSVLAFVPALGPIAGALIGEFLGWQAIFITLAILAM LALLNAGFRWHETRPLDQVKTRRSVLPIFASPAFWVYTVGFSAGMGTFFVFFSTAPRVLI GQAEYSEIGFSFAFATVALVMIVTTRFAKSFVARWGIAGCVARGMALLVCGAVLLGIGEL YGSPSFLTFILPMWVVAVGIVFTVSVTANGALAEFDDIAGSAVAFYFCVQSLIVSIVGTL AVALLNGDTAWPVICYATAMAVLVSLGLVLLRLRGAATEKSPVV Prediction of potential genes in microbial genomes Time: Mon May 16 16:01:02 2011 Seq name: gi|296493184|gb|ADTK01000317.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont951.2, whole genome shotgun sequence Length of sequence - 6797 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 124 - 846 295 ## COG3843 Type IV secretory pathway, VirD2 components (relaxase) 2 1 Op 2 1/0.000 - CDS 848 - 3814 1781 ## COG4644 Transposase and inactivated derivatives, TnpA family 3 1 Op 3 . - CDS 3817 - 4377 393 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs - Prom 4442 - 4501 2.4 4 2 Tu 1 . - CDS 4503 - 4703 153 ## ASA_P4G104 transposition modulator TnpM - Prom 4894 - 4953 2.8 - Term 4858 - 4894 0.4 5 3 Tu 1 . - CDS 5056 - 6069 459 ## COG0582 Integrase - Prom 6092 - 6151 3.3 + Prom 5930 - 5989 3.0 6 4 Tu 1 . 
+ CDS 6234 - 6767 201 ## KPN_pKPN5p08204 aminoglycoside adenylyltransferase Predicted protein(s) >gi|296493184|gb|ADTK01000317.1| GENE 1 124 - 846 295 240 aa, chain - ## HITS:1 COG:mlr6191 KEGG:ns NR:ns ## COG: mlr6191 COG3843 # Protein_GI_number: 13475173 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD2 components (relaxase) # Organism: Mesorhizobium loti # 23 236 435 650 656 162 45.0 4e-40 MLYFPFSETTPSYTCRVPGGDYEGYVDAHVRRLEALRRAGIVERIDADQWRIPDDLVSRA AAHDAGRDSQASVRVLSPVDLNKQIGSDGATWLDRRLIHGETADLAPTGFGQQVREAMDQ RREHHIEQGDATRSRDSRVFYRRNLLAILREREVAGVGSDMALSKGLPFRAATDGESVSG KFTGTVHLSSGKFAVVEKSHEFTLVPWRPIIDRQLGREVMGIVQGGSVSWQLGRQRGLER >gi|296493184|gb|ADTK01000317.1| GENE 2 848 - 3814 1781 988 aa, chain - ## HITS:1 COG:CAP0093 KEGG:ns NR:ns ## COG: CAP0093 COG4644 # Protein_GI_number: 15004797 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives, TnpA family # Organism: Clostridium acetobutylicum # 317 985 2 671 673 538 39.0 1e-152 MPRRSILSAAERESLLALPDSKDDLIRHYTFNDTDLSIIRQRRGPANRLGFAVQLCYLRF PGVILGVDELPFPPLLKLVADQLKVGVESWNEYGQREQTRREHLSELQTVFGFRPFTMSH YRQAVQMLTELAMQTDKGIVLASALIGHLRRQSVILPALNAVERASAEAITRANRRIYDA LAEPLADAHRRRLDDLLKRRDNGKTTWLAWLRQSPAKPNSRHMLEHIERLKAWQALDLPT GIERLVHQNRLLKIAREGGQMTPADLAKFEPQRRYATLVALATEGMATVTDEIIDLHDRI LGKLFNAAKNKHQQQFQASGKAINAKVRLYGRIGQALIDAKQSGRDAFAAIEAVMSWDSF AESVTEAQKLAQPDDFDFLHRIGESYATLRRYAPEFLAVLKLRAAPAAKNVLDAIEVLRG MNTDNARKLPADAPTGFIKPRWQKLVMTDAGIDRRYYELCALSELKNSLRSGDIWVQGSR QFKDFEDYLVPPEKFTSLKQSSELPLAVATDCEQYLHERLTLLEAQLATVNRMAAANDLP DAIITESGLKITPLDAAVPDTAQALIDQTAMVLPHVKITELLLEVDEWTGFTRHFTHLKS GDLAKDKNLLLTTILADAINLGLTKMAESCPGTTYAKLAWLQAWHTRDETYSTALAELVN AQFRHPFAGHWGDGTTSSSDGQNFRTASKAKSTGHINPKYGSSPGRTFYTHISDQYAPFH TKVVNVGLRDSTYVLDGLLYHESDLRIEEHYTDTAGFTDHVFALMHLLGFRFAPRIRDLG DTKLYIPKGDAAYDALKPMIGGTLNIKHVRAHWDEILRLATSIKQGTVTASLMLRKLGSY PRQNGLAVALRELGRIERTLFILDWLQSVELRRRVHAGLNKGEARNALARAVFFNRLGEI 
RDRSFEQQRYRASGLNLVTAAIVLWNTVYLERAAHALRGNGHAVDDSLLQYLSPLGWEHI NLTGDYLWRSSAKIGAGKFRPLRPLQPA >gi|296493184|gb|ADTK01000317.1| GENE 3 3817 - 4377 393 186 aa, chain - ## HITS:1 COG:DRC0005 KEGG:ns NR:ns ## COG: DRC0005 COG1961 # Protein_GI_number: 10957532 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Deinococcus radiodurans # 3 184 4 185 185 199 61.0 3e-51 MTGQRIGYIRVSTFDQNPERQLEGVKVDRAFSDKASGKDVKRPQLEALISFARTGDTVVV HSMDRLARNLDDLRRIVQTLTQRGVHIEFVKEHLSFTGEDSPMANLMLSVMGAFAEFERA LIRERQREGIALAKQRGAYRGRKKSLSSERIAELRQRVEAGEQKTKLAREFGISRETLYQ YLRTDQ >gi|296493184|gb|ADTK01000317.1| GENE 4 4503 - 4703 153 66 aa, chain - ## HITS:1 COG:no KEGG:ASA_P4G104 NR:ns ## KEGG: ASA_P4G104 # Name: tnpM # Def: transposition modulator TnpM # Organism: A.salmonicida # Pathway: not_defined # 1 66 129 194 194 120 100.0 2e-26 MNANEPSTSCCVCCKEIPLDAAFTPEGAEYVEHFCGLECYQRFQARASTATETSVKPDAC DSPPSG >gi|296493184|gb|ADTK01000317.1| GENE 5 5056 - 6069 459 337 aa, chain - ## HITS:1 COG:VCA0291 KEGG:ns NR:ns ## COG: VCA0291 COG0582 # Protein_GI_number: 15601056 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Vibrio cholerae # 15 330 4 320 320 275 45.0 1e-73 MKTATAPLPPLRSVKVLDQLRERIRYLHYSLRTEQAYVHWVRAFIRFHGVRHPATLGSSE VEAFLSWLANERKVSVSTHRQALAALLFFYGKVLCTDLPWLQEIGRPRPSRRLPVVLTPD EVVRILGFLEGEHRLFAQLLYGTGMRISEGLQLRVKDLDFDHGTIIVREGKGSKDRALML PESLAPSLREQLSRARAWWLKDQAEGRSGVALPDALERKYPRAGHSWPWFWVFAQHTHST DPRSGVVRRHHMYDQTFQRAFKRAVEQAGITKPATPHTLRHSFATALLRSGYDIRTVQDL LGHSDVSTTMIYTHVLKVGGAGVRSPLDALPPLTSER >gi|296493184|gb|ADTK01000317.1| GENE 6 6234 - 6767 201 177 aa, chain + ## HITS:1 COG:no KEGG:KPN_pKPN5p08204 NR:ns ## KEGG: KPN_pKPN5p08204 # Name: not_defined # Def: aminoglycoside adenylyltransferase # Organism: K.pneumoniae # Pathway: not_defined # 1 177 73 249 249 357 100.0 1e-97 MDTTQVTLIHKILAAADERNLPLWIGGGWAIDARLGRVTRKHDDIDLTFPGERRGELEAI 
VEMLGGRVMEELDYGFLAEIGDELLDCEPAWWADEAYEIAEAPQGSCPEAAEGVIAGRPV RCNSWEAIIWDYFYYADEVPPVDWPTKHIESYRLACTSLGAEKVEVLRAAFRSRYAA Prediction of potential genes in microbial genomes Time: Mon May 16 16:01:08 2011 Seq name: gi|296493183|gb|ADTK01000318.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont951.3, whole genome shotgun sequence Length of sequence - 2281 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 56 - 847 413 ## COG1708 Predicted nucleotidyltransferases 2 2 Tu 1 . - CDS 1127 - 2170 586 ## COG3547 Transposase and inactivated derivatives Predicted protein(s) >gi|296493183|gb|ADTK01000318.1| GENE 1 56 - 847 413 263 aa, chain + ## HITS:1 COG:STM1264 KEGG:ns NR:ns ## COG: STM1264 COG1708 # Protein_GI_number: 16764615 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferases # Organism: Salmonella typhimurium LT2 # 1 257 1 257 262 210 44.0 3e-54 MREAVIAEVSTQLSEVVGVIERHLEPTLLAVHLYGSAVDGGLKPHSDIDLLVTVTVRLDE TTRRALINDLLETSASPGESEILRAVEVTIVVHDDIIPWRYPAKRELQFGEWQRNDILAG IFEPATIDIDLAILLTKAREHSVALVGPAAEELFDPVPEQDLFEALNETLTLWNSPPDWA GDERNVVLTLSRIWYSAVTGKIAPKDVAADWAMERLPAQYQPVILEARQAYLGQEEDRLA SRADQLEEFVHYVKGEITKVVGK >gi|296493183|gb|ADTK01000318.1| GENE 2 1127 - 2170 586 347 aa, chain - ## HITS:1 COG:mlr5983 KEGG:ns NR:ns ## COG: mlr5983 COG3547 # Protein_GI_number: 13474998 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Mesorhizobium loti # 4 330 6 327 342 199 40.0 5e-51 MKRIAVDLAKSVYQVAESVRSGQVVQRKRLNREAFRRYIQEQAEPVEWVMEACGTAHYWG RVAQALGHQVRLLHPRYVRPYRRRNKTDRNDCDAMLEATRCTDIHPVPVKSHDQQQLQQL HGLRETWKKSRTQRINLLRGLLREAGIEAPASTAAFIRAASERVDQPELAPLSRLLHIVL AEINLYEQCMAECEQQLKRWHADDDIVRKLDEVSGIGLLTASALTAAVGRPERFASGRQL SAWLGMTPREFSSGERRKLGHISRQGNVYVRTLLIHGSRAALLAAQRCQARTPEKLTQLQ RWAVATAARIGHNKAAVALANKLVRICWAVWCHERRFNGNWQSTKPA Prediction of 
potential genes in microbial genomes Time: Mon May 16 16:01:10 2011 Seq name: gi|296493182|gb|ADTK01000319.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont951.4, whole genome shotgun sequence Length of sequence - 6076 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 63 - 122 3.3 1 1 Op 1 . + CDS 161 - 508 100 ## COG2076 Membrane transporters of cations and cationic drugs 2 1 Op 2 . + CDS 502 - 1341 184 ## COG0294 Dihydropteroate synthase and related enzymes 3 2 Tu 1 . + CDS 1469 - 1702 60 ## AB57_0267 hypothetical protein + Term 1737 - 1792 4.9 4 3 Op 1 1/0.000 - CDS 1755 - 3293 271 ## COG2801 Transposase and inactivated derivatives 5 3 Op 2 9/0.000 - CDS 3375 - 4157 440 ## COG1484 DNA replication protein 6 3 Op 3 . - CDS 4147 - 5418 218 ## COG4584 Transposase and inactivated derivatives 7 4 Tu 1 . + CDS 5765 - 6013 138 ## Dtpsy_2129 putative transcriptional regulator MerR Predicted protein(s) >gi|296493182|gb|ADTK01000319.1| GENE 1 161 - 508 100 115 aa, chain + ## HITS:1 COG:BMEI1045 KEGG:ns NR:ns ## COG: BMEI1045 COG2076 # Protein_GI_number: 17987328 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane transporters of cations and cationic drugs # Organism: Brucella melitensis # 1 100 1 100 110 101 61.0 4e-22 MKGWLFLVIAIVGEVIATSALKSSEGFTKLAPSAVVIIGYGIAFYFLSLVLKSIPVGVAY AVWSGLGVVIITAIAWLLHGQKLDAWGFVGMGLIIAAFLLARSPSWKSLRRPTPW >gi|296493182|gb|ADTK01000319.1| GENE 2 502 - 1341 184 279 aa, chain + ## HITS:1 COG:RSc1527 KEGG:ns NR:ns ## COG: RSc1527 COG0294 # Protein_GI_number: 17546246 # Func_class: H Coenzyme transport and metabolism # Function: Dihydropteroate synthase and related enzymes # Organism: Ralstonia solanacearum # 4 248 26 266 291 129 37.0 9e-30 MVTVFGILNLTEDSFFDESRRLDPAGAVTAAIEMLRVGSDVVDVGPAASHPDARPVSPAD EIRRIAPLLDALSDQMHRVSIDSFQPETQRYALKRGVGYLNDIQGFPDPALYPDIAEADC 
RLVVMHSAQRDGIATRTGHLRPEDALDEIVRFFEARVSALRRSGVAADRLILDPGMGFFL SPAPETSLHVLSNLQKLKSALGLPLLVSVSRKSFLGATVGLPVKDLGPASLAAELHAIGN GADYVRTHAPGDLRSAITFSETLAKFRSRDARDRGLDHA >gi|296493182|gb|ADTK01000319.1| GENE 3 1469 - 1702 60 77 aa, chain + ## HITS:1 COG:no KEGG:AB57_0267 NR:ns ## KEGG: AB57_0267 # Name: not_defined # Def: hypothetical protein # Organism: A.baumannii_AB0057 # Pathway: not_defined # 1 76 1 76 166 150 100.0 2e-35 MDSEEPPNVRVACSGDIDEVVRLMHDAAAWMSAKGTPAWDVARIDRTFAETFVLRSELLV ASCSDGIVGCCTLSAEC >gi|296493182|gb|ADTK01000319.1| GENE 4 1755 - 3293 271 512 aa, chain - ## HITS:1 COG:AGc3822 KEGG:ns NR:ns ## COG: AGc3822 COG2801 # Protein_GI_number: 15889390 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 510 1 510 512 684 65.0 0 MYSYADRIRAVELYIKLGLRVRATIRQLGYPTKNALKGWYQHYLKHQDLPASQVPRAPKY SLQQRQVAVAHYLAHDRCIAATMRVLGYPGRGTLTAWVRQDCPEASQSILGRSWPVTKPD ALMYEGVVQLCTRQSSAQEIADKLGVCRGTLYNWKNQLLGPCAPASMKHSPKRSPVLDAA ALRCQVESLRQDVRRLKIERELLKQAHEILKNGADIDLHRLANKDKAVLVEALHGQYELP ELLCHVGLARSSYFYHRARLKLADKYLDVRRSITDIFDSNYRCYGYRRVQASLLKECTGI SEKVVRRLMKQEGLIVVKPKRRRYNSYLGEIGAAPQNLINRDFHAAAPNEKWLTDITEFQ IPSGKVYLSPVIDCFDGLVVSWSIGTHPNANLVNTMLDAAIDAVQGSESRPMIHSDRGAH YRWPGWLSRVHDAKLVRSMSRKGCSPDNAACEGFFGRLKMELFYPGNWQSTTIEQFIEAV DTYIRWYNEKRIKVSLGRLSPVEYRQKLGLTT >gi|296493182|gb|ADTK01000319.1| GENE 5 3375 - 4157 440 260 aa, chain - ## HITS:1 COG:mll6090 KEGG:ns NR:ns ## COG: mll6090 COG1484 # Protein_GI_number: 13475085 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Mesorhizobium loti # 48 190 1 141 162 209 70.0 6e-54 MQHEGHVRILKSLKLFGMAHAIEELGNQNSPAFNQALPMLDSLIKAEVAEREVRSVNYQL RVAKFPVYRDLVGFDFSQSLVNEATVKQLHRCDFMEQAQNVVLIGGPGTGKTHLATAIGT QAVMHLNRRVRFFSTVDLVNALEQEKSSGRQGQIANRLLYADLVILDELGYLPFSQTGGA LLFHLLSKLYEKTSVILTTNLSFSEWSRVFGDEKMTTALLDRLTHHCHILETGNESYRFK HSSTQNKQEEKQTRKLKIET >gi|296493182|gb|ADTK01000319.1| GENE 6 
4147 - 5418 218 423 aa, chain - ## HITS:1 COG:AGl49 KEGG:ns NR:ns ## COG: AGl49 COG4584 # Protein_GI_number: 15890128 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 9 392 91 464 498 147 28.0 3e-35 MYRDLVALGFTGSYDRVCAFARQWKDSEQFKAQTSGKGCFIPLRFACGEAFQFDWSEDFA RIAGKQVKLQIAQFKLAHSRAFVLRAYYQQKHEMLFDAHWHAFQIFGGIPKRGIYDNMKT AVDSVGRGKERRVNQRFTAMVSHYLFDAQFCNPASGWEKGQIEKNVQDSRQRLWQGAPDF QSLADLNVWLEHRCKALWSELRHPELDQTVQEAFADEQGELMALPNAFDAFVEQTKRVTS TCLVHHEGNRYSVPASYANRAISLRIYADKLVMAAEGQHIAEHPRLFGSGHARRGHTQYD WHHYLSVLQKKPGALRNGAPFAELPPAFKKLQSILLQRPGGDRDMVEILALVLHHDEGAV LSAVELALECGKPSKEHVLNLLGRLTEEPPPKPIPIPKGLRLTLEPQANVNRYDSLRRAH DAA >gi|296493182|gb|ADTK01000319.1| GENE 7 5765 - 6013 138 82 aa, chain + ## HITS:1 COG:no KEGG:Dtpsy_2129 NR:ns ## KEGG: Dtpsy_2129 # Name: not_defined # Def: putative transcriptional regulator MerR # Organism: Diaphorobacter # Pathway: not_defined # 1 82 63 144 144 147 93.0 2e-34 MQLNMDEIAELLRLDDGTHCEEASSLAEHKLKDVREKMADLARMETVLSELVCACHARKG NVSCPLIASLQGEAGLARSAMP Prediction of potential genes in microbial genomes Time: Mon May 16 16:01:17 2011 Seq name: gi|296493181|gb|ADTK01000320.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont973.1, whole genome shotgun sequence Length of sequence - 9145 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 7, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 15 - 59 3.2 1 1 Op 1 11/0.000 - CDS 70 - 597 516 ## COG1763 Molybdopterin-guanine dinucleotide biosynthesis protein 2 1 Op 2 . 
- CDS 579 - 1163 435 ## COG0746 Molybdopterin-guanine dinucleotide biosynthesis protein A - Prom 1188 - 1247 2.9 + Prom 935 - 994 1.9 3 2 Tu 1 5/0.500 + CDS 1233 - 1502 395 ## COG3084 Uncharacterized protein conserved in bacteria + Prom 1511 - 1570 1.6 4 3 Op 1 5/0.500 + CDS 1627 - 2565 702 ## COG2334 Putative homoserine kinase type II (protein kinase fold) 5 3 Op 2 1/1.000 + CDS 2582 - 3208 753 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 3230 - 3259 2.1 + Prom 3210 - 3269 11.4 6 4 Tu 1 . + CDS 3363 - 4793 1061 ## COG5339 Uncharacterized protein conserved in bacteria + Term 4799 - 4842 1.5 - Term 4787 - 4830 9.1 7 5 Tu 1 . - CDS 4834 - 5577 551 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase - Prom 5791 - 5850 4.5 - Term 5760 - 5788 -1.0 8 6 Tu 1 . - CDS 5886 - 6128 57 ## EcE24377A_4381 hypothetical protein + Prom 5841 - 5900 4.5 9 7 Tu 1 . + CDS 6130 - 8916 3302 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains Predicted protein(s) >gi|296493181|gb|ADTK01000320.1| GENE 1 70 - 597 516 175 aa, chain - ## HITS:1 COG:mobB KEGG:ns NR:ns ## COG: mobB COG1763 # Protein_GI_number: 16131697 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin-guanine dinucleotide biosynthesis protein # Organism: Escherichia coli K12 # 6 175 1 170 170 333 100.0 7e-92 MAGKTMIPLLAFAAWSGTGKTTLLKKLIPALCARGIRPGLIKHTHHDMDVDKPGKDSYEL RKAGAAQTIVASQQRWALMTETPDEEELDLQFLASRMDTSKLDLILVEGFKHEEIAKIVL FRDGAGHRPEELVIDRHVIAVASDVPLNLDVALLDINDVEGLADFVVEWMQKQNG >gi|296493181|gb|ADTK01000320.1| GENE 2 579 - 1163 435 194 aa, chain - ## HITS:1 COG:mobA KEGG:ns NR:ns ## COG: mobA COG0746 # Protein_GI_number: 16131698 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin-guanine dinucleotide biosynthesis protein A # Organism: Escherichia coli K12 # 1 194 1 194 194 401 100.0 1e-112 MNLMTTITGVVLAGGKARRMGGVDKGLLELNGKPLWQHVADALMTQLSHVVVNANRHQEI YQASGLKVIEDSLADYPGPLAGMLSVMQQEAGEWFLFCPCDTPYIPPDLAARLNHQRKDA 
PVVWVHDGERDHPTIALVNRAIEPLLLEYLQAGERRVMVFMRLAGGHAVDFSDHKDAFVN VNTPEELARWQEKR >gi|296493181|gb|ADTK01000320.1| GENE 3 1233 - 1502 395 89 aa, chain + ## HITS:1 COG:ECs4781 KEGG:ns NR:ns ## COG: ECs4781 COG3084 # Protein_GI_number: 15834035 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 89 1 89 89 158 100.0 3e-39 MKCKRLNEVIELLQPAWQKEPDLNLLQFLQKLAKESGFDGELADLTDDILIYHLKMRDSA KDAVIPGLQKDYEEDFKTALLRARGVIKE >gi|296493181|gb|ADTK01000320.1| GENE 4 1627 - 2565 702 312 aa, chain + ## HITS:1 COG:yihE KEGG:ns NR:ns ## COG: yihE COG2334 # Protein_GI_number: 16131700 # Func_class: R General function prediction only # Function: Putative homoserine kinase type II (protein kinase fold) # Organism: Escherichia coli K12 # 1 312 17 328 328 635 99.0 0 MDALFEHGIRVDSGLTPLNSYENRVYQFQDEDRRRFVVKFYRPERWTADQILEEHQFALQ LVNDEVPVAAPVAFNGQTLLNHQGFYFAVFPSVGGRQFEADNIDQMEAVGRYLGRMHQTG RKQLFIHRPTIGLNEYLIEPRKLFEDATLMPSGLKAAFLKATDELIAAVTAHWREDFTVL RLHGDCHAGNILWRDGPMFVDLDDARNGPAIQDLWMLLNGDKAEQRMQLETIIEAYEEFS EFDTAEIGLIEPLRAMRLVYYLAWLMRRWADPAFPKNFPWLTGEDYWLRQTATFIEQAKV LQEPPLQLTPMY >gi|296493181|gb|ADTK01000320.1| GENE 5 2582 - 3208 753 208 aa, chain + ## HITS:1 COG:ECs4783 KEGG:ns NR:ns ## COG: ECs4783 COG0526 # Protein_GI_number: 15834037 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Escherichia coli O157:H7 # 1 208 1 208 208 410 100.0 1e-115 MKKIWLALAGLVLAFSASAAQYEDGKQYTTLEKPVAGAPQVLEFFSFFCPHCYQFEEVLH ISDNVKKKLPEGVKMTKYHVNFMGGDLGKDLTQAWAVAMALGVEDKVTVPLFEGVQKTQT IRSASDIRDVFINAGIKGEEYDAAWNSFVVKSLVAQQEKAAADVQLRGVPAMFVNGKYQL NPQGMDTSNMDVFVQQYADTVKYLSEKK >gi|296493181|gb|ADTK01000320.1| GENE 6 3363 - 4793 1061 476 aa, chain + ## HITS:1 COG:yihF KEGG:ns NR:ns ## COG: yihF COG5339 # Protein_GI_number: 16131702 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in 
bacteria # Organism: Escherichia coli K12 # 1 476 15 490 490 912 98.0 0 MIHKSATGVIVALAVIWGGGTWYTGTQIQPGVEKFIKDFNDAKKKGEHAYDMTLSYKNFD KGFFNSRFQMQMTFDNGAPDLNIKPGQKVVFDVDVEHGPLPITMLMHGNVIPALAAAKVN LVNNELTQPLFIAAKNKSPVEATLRFAFGGSFSTTLDVAPAEYGKFSFGEGQFTFNGDGS SLSNLDIEGKVEDIVLQLSPMNKVTAKSFTIDSLARLEEKKFPVGESESKFNQINIINHG EDVAQIDAFVAKTMLDRVKDKDYINVNLTYELDKLTKGNQQLGSGEWSLIAESIDPSAVR QFIIQYNIAMQKQLAAHPELANDEVALQEVNAALFKEYLPLLQKSEPTIKQPVKWKNALG ELNANLDISIADPAKSSSSTNKDIKSLNFDVKLPLNVATETAKQLNLSEGMDAEKAQKRA DKQISGMMTLGQMFQLITIDNNTASLQLRYTPGKVVFNGQEMSEEEFMSRAGRFVH >gi|296493181|gb|ADTK01000320.1| GENE 7 4834 - 5577 551 247 aa, chain - ## HITS:1 COG:ECs4785 KEGG:ns NR:ns ## COG: ECs4785 COG0204 # Protein_GI_number: 15834039 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Escherichia coli O157:H7 # 1 247 64 310 310 518 100.0 1e-147 MYCWCEGLAVLLHLNPHLQWEVHGLEGLSKKNWYLLICNHRSWADIVVLCVLFRKHIPMN KYFLKQQLAWVPFLGLACWALDMPFMKRYSRAYLLRHPERRGKDVETTRRSCEKFRLHPT TIVNFVEGSRFTQEKHQQTHSTFQNLLPPKAAGIAMALNVLGKQFDKLLNVTLCYPDNNR QPFFDMLSGKLTRIVVHVDLQPIADELHGDYINDKSFKRHFQQWLNSLWQEKDRLLTSLM SSQRQDK >gi|296493181|gb|ADTK01000320.1| GENE 8 5886 - 6128 57 80 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_4381 NR:ns ## KEGG: EcE24377A_4381 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 80 1 80 80 144 100.0 1e-33 MSVPVYQIMGKGCHRQDENMSFSPLFVKIFCRSSGSLANRHVDNFVHKLFTSLLIYYHAF YSMMFVFVFHAVKNKAIPNN >gi|296493181|gb|ADTK01000320.1| GENE 9 6130 - 8916 3302 928 aa, chain + ## HITS:1 COG:polA_2 KEGG:ns NR:ns ## COG: polA_2 COG0749 # Protein_GI_number: 16131704 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Escherichia coli K12 # 289 928 1 640 640 1232 100.0 0 MVQIPQNPLILVDGSSYLYRAYHAFPPLTNSAGEPTGAMYGVLNMLRSLIMQYKPTHAAV VFDAKGKTFRDELFEHYKSHRPPMPNDLRAQIEPLHAMVKAMGLPLLAVSGVEADDVIGT 
LAREAEKAGRPVLISTGDKDMAQLVTPNITLINTMTNTILGPEEVVNKYGVPPELIIDFL ALMGDSSDNIPGVPGVGEKTAQALLQGLGGLDTLYAEPEKIAGLSFRGAKTMAAKLEQNK EVAYLSYQLATIKTDVELELTCEQLEVQQPAAEELLGLFKKYEFKRWTADVEAGKWLQAK GAKPAAKPQETSVADEAPEVTATVISYDNYVTILDEETLKAWIAKLEKAPVFAFDTETDS LDNISANLVGLSFAIEPGVAAYIPVAHDYLDAPDQISRERALELLKPLLEDEKALKVGQN LKYDRGILANYGIELRGIAFDTMLESYILNSVAGRHDMDSLAERWLKHKTITFEEIAGKG KNQLTFNQIALEEAGRYAAEDADVTLQLHLKMWPDLQKHKGPLNVFENIEMPLVPVLSRI ERNGVKIDPKVLHNHSEELTLRLAELEKKAHEIAGEEFNLSSTKQLQTILFEKQGIKPLK KTPGGAPSTSEEVLEELALDYPLPKVILEYRGLAKLKSTYTDKLPLMINPKTGRVHTSYH QAVTATGRLSSTDPNLQNIPVRNEEGRRIRQAFIAPEDYVIVSADYSQIELRIMAHLSRD KGLLTAFAEGKDIHRATAAEVFGLPLETVTSEQRRSAKAINFGLIYGMSAFGLARQLNIP RKEAQKYMDLYFERYPGVLEYMERTRAQAKEQGYVETLDGRRLYLPDIKSSNGARRAAAE RAAINAPMQGTAADIIKRAMIAVDAWLQAEQPRVRMIMQVHDELVFEVHKDDVDAVAKQI HQLMENCTRLDVPLLVEVGSGENWDQAH Prediction of potential genes in microbial genomes Time: Mon May 16 16:01:20 2011 Seq name: gi|296493180|gb|ADTK01000321.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont973.2, whole genome shotgun sequence Length of sequence - 1314 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) Predicted protein(s) >gi|296493180|gb|ADTK01000321.1| GENE 1 147 - 743 686 198 aa, chain - ## HITS:1 COG:ECs4787 KEGG:ns NR:ns ## COG: ECs4787 COG0218 # Protein_GI_number: 15834041 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Escherichia coli O157:H7 # 1 198 13 210 210 374 99.0 1e-104 MSAPDIRHLPSDTGIEVAFAGRSNAGKSSALNTLTNQKSLARTSKTPGRTQLINLFEVAD GKRLVDLPGYGYAEVPEEMKRKWQRALGEYLEKRQSLQGLVVLMDIRHPLKDLDQQMIEW AVDSNIAVLVLLTKADKLASGARKAQLNMVREAVLAFNGDVQVETFSSLKKQGVDKLRQK LDTWFSEMQPVEETQDGE Prediction of potential genes in microbial genomes Time: Mon May 16 16:01:30 2011 Seq name: gi|296493179|gb|ADTK01000322.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont973.3, whole genome shotgun sequence Length of sequence - 31468 bp Number 
of predicted genes - 29, with homology - 28 Number of transcription units - 14, operones - 8 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 3.9 1 1 Tu 1 2/1.000 + CDS 101 - 610 613 ## COG3078 Uncharacterized protein conserved in bacteria + Prom 639 - 698 2.1 2 2 Tu 1 . + CDS 799 - 2172 1612 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases - Term 2227 - 2264 2.2 3 3 Tu 1 . - CDS 2362 - 2472 81 ## - Prom 2493 - 2552 2.9 4 4 Op 1 14/0.000 - CDS 2584 - 3993 1463 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 5 4 Op 2 5/0.000 - CDS 4005 - 5054 1076 ## COG3852 Signal transduction histidine kinase, nitrogen specific - Prom 5085 - 5144 5.2 - Term 5298 - 5342 0.2 6 4 Op 3 . - CDS 5373 - 6782 1596 ## COG0174 Glutamine synthetase + Prom 6861 - 6920 5.8 7 5 Tu 1 2/1.000 + CDS 7155 - 8978 2139 ## COG1217 Predicted membrane GTPase involved in stress response + Term 9002 - 9031 2.1 + Prom 9054 - 9113 7.8 8 6 Op 1 . + CDS 9195 - 9905 459 ## COG2188 Transcriptional regulators 9 6 Op 2 . + CDS 9913 - 10893 753 ## B21_03707 hypothetical protein + Prom 10898 - 10957 2.2 10 7 Tu 1 . + CDS 10995 - 12260 1081 ## COG0477 Permeases of the major facilitator superfamily + Term 12312 - 12354 3.5 11 8 Op 1 . - CDS 12351 - 13043 556 ## ECO103_4296 outer membrane porin L 12 8 Op 2 3/0.500 - CDS 13111 - 14514 1050 ## COG2211 Na+/melibiose symporter and related transporters 13 8 Op 3 3/0.500 - CDS 14557 - 15942 1465 ## COG2211 Na+/melibiose symporter and related transporters 14 8 Op 4 . 
- CDS 15988 - 18024 2016 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases - Prom 18109 - 18168 6.5 - Term 18079 - 18126 -0.7 15 9 Tu 1 2/1.000 - CDS 18223 - 19125 482 ## COG2017 Galactose mutarotase and related enzymes - Term 19132 - 19173 7.4 16 10 Op 1 3/0.500 - CDS 19263 - 20504 1332 ## COG2942 N-acyl-D-glucosamine 2-epimerase 17 10 Op 2 4/0.500 - CDS 20521 - 21399 941 ## COG3684 Tagatose-1,6-bisphosphate aldolase 18 10 Op 3 . - CDS 21423 - 22319 950 ## COG2084 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases + Prom 22403 - 22462 4.1 19 11 Op 1 3/0.500 + CDS 22487 - 23383 564 ## COG0524 Sugar kinases, ribokinase family 20 11 Op 2 1/1.000 + CDS 23417 - 24202 443 ## COG1349 Transcriptional regulators of sugar metabolism + Prom 24218 - 24277 2.1 21 12 Op 1 2/1.000 + CDS 24301 - 24900 764 ## COG1011 Predicted hydrolase (HAD superfamily) 22 12 Op 2 7/0.000 + CDS 24894 - 25766 667 ## COG1295 Predicted membrane protein 23 12 Op 3 5/0.000 + CDS 25763 - 26200 464 ## COG1490 D-Tyr-tRNAtyr deacylase 24 12 Op 4 . + CDS 26245 - 27186 1090 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Term 27193 - 27226 2.3 + Prom 27809 - 27868 8.5 25 13 Op 1 . + CDS 28043 - 28261 146 ## SFV_3606 hypothetical protein + Prom 28292 - 28351 4.8 26 13 Op 2 . + CDS 28479 - 28721 168 ## SSON_4059 hypothetical protein + Term 28845 - 28883 4.4 27 14 Op 1 9/0.000 - CDS 29051 - 29980 944 ## COG3058 Uncharacterized protein involved in formate dehydrogenase formation 28 14 Op 2 12/0.000 - CDS 29977 - 30612 557 ## COG2864 Cytochrome b subunit of formate dehydrogenase 29 14 Op 3 . 
- CDS 30609 - 31424 753 ## COG0437 Fe-S-cluster-containing hydrogenase components 1 Predicted protein(s) >gi|296493179|gb|ADTK01000322.1| GENE 1 101 - 610 613 169 aa, chain + ## HITS:1 COG:yihI KEGG:ns NR:ns ## COG: yihI COG3078 # Protein_GI_number: 16131706 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 169 1 169 169 211 99.0 6e-55 MKPSSSNSRSKGHAKARRKTREELDQEARDRKRQKKRRGHAPGSRAAGGNTTSGSKGQNA PKDPRIGSKTPIPLGVIEKVTKQHKPKSEKPMLSPQAELELLETDERLDALLERLEAGET LSAEEQSWVDAKLDRIDELMQKLGLSYDDDEEEEEDEKQEDMMRLLRGN >gi|296493179|gb|ADTK01000322.1| GENE 2 799 - 2172 1612 457 aa, chain + ## HITS:1 COG:ECs4789 KEGG:ns NR:ns ## COG: ECs4789 COG0635 # Protein_GI_number: 15834043 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Escherichia coli O157:H7 # 1 457 3 459 459 940 100.0 0 MSVQQIDWDLALIQKYNYSGPRYTSYPTALEFSEDFGEQAFLQAVARYPERPLSLYVHIP FCHKLCYFCGCNKIVTRQQHKADQYLDALEQEIVHRAPLFAGRHVSQLHWGGGTPTYLNK AQISRLMKLLRENFQFNADAEISIEVDPREIELDVLDHLRAEGFNRLSMGVQDFNKEVQR LVNREQDEEFIFALLNHAREIGFTSTNIDLIYGLPKQTPESFAFTLKRVAELNPDRLSVF NYAHLPTIFAAQRKIKDADLPSPQQKLDILQETIAFLTQSGYQFIGMDHFARPDDELAVA QREGVLHRNFQGYTTQGDTDLLGMGVSAISMIGDCYAQNQKELKQYYQQVDEQGNALWRG IALTRDDCIRRDVIKSLICNFRLDYAPIEKQWDLHFADYFAEDLKLLAPLAKDGLVDVDE KGIQVTAKGRLLIRNICMCFDTYLRQKARMQQFSRVI >gi|296493179|gb|ADTK01000322.1| GENE 3 2362 - 2472 81 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLESIINLVSSGAVDSHTPQTAVAAVLCAAMIGLFS >gi|296493179|gb|ADTK01000322.1| GENE 4 2584 - 3993 1463 469 aa, chain - ## HITS:1 COG:ECs4790 KEGG:ns NR:ns ## COG: ECs4790 COG2204 # Protein_GI_number: 15834044 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli O157:H7 # 1 469 1 469 469 914 99.0 0 MQRGIVWVVDDDSSIRWVLERALAGAGLTCTTFENGAEVLEALASKTPDVLLSDIRMPGM 
DGLALLKQIKQRHPMLPVIIMTAHSDLDAAVSAYQQGAFDYLPKPFDIDEAVALVERAIS HYQEQQQPRNIQLNGPTTDIIGEAPAMQDVFRIIGRLSRSSISVLINGESGTGKELVAHA LHRHSPRAKAPFIALNMAAIPKDLIESELFGHEKGAFTGANTIRQGRFEQADGGTLFLDE IGDMPLDVQTRLLRVLADGQFYRVGGYAPVKVDVRIIAATHQNLEQRVQEGKFREDLFHR LNVIRVHLPPLRERREDIPRLARHFLQVAARELGVEAKLLHPETEAALTRLAWPGNVRQL ENTCRWLTVMAAGQEVLIQDLPGELFESTVAESTSQMQPDSWATLLAQWADRALRSGHQN LLSEAQPELERTLLTTALRHTQGHKQEAARLLGWGRNTLTRKLKELGME >gi|296493179|gb|ADTK01000322.1| GENE 5 4005 - 5054 1076 349 aa, chain - ## HITS:1 COG:ECs4791 KEGG:ns NR:ns ## COG: ECs4791 COG3852 # Protein_GI_number: 15834045 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase, nitrogen specific # Organism: Escherichia coli O157:H7 # 1 349 1 349 349 659 100.0 0 MATGTQPDAGQILNSLINSILLIDDNLAIHYANPAAQQLLAQSSRKLFGTPLPELLSYFS LNIELMQESLEAGQGFTDNEVTLVIDGRSHILSVTAQRMPDGMILLEMAPMDNQRRLSQE QLQHAQQVAARDLVRGLAHEIKNPLGGLRGAAQLLSKALPDPSLLEYTKVIIEQADRLRN LVDRLLGPQLPGTRVTESIHKVAERVVTLVSMELPDNVRLIRDYDPSLPELAHDPDQIEQ VLLNIVRNALQALGPEGGEIILRTRTAFQLTLHGERYRLAARIDVEDNGPGIPPHLQDTL FYPMVSGREGGTGLGLSIARNLIDQHSGKIEFTSWPGHTEFSVYLPIRK >gi|296493179|gb|ADTK01000322.1| GENE 6 5373 - 6782 1596 469 aa, chain - ## HITS:1 COG:ECs4792 KEGG:ns NR:ns ## COG: ECs4792 COG0174 # Protein_GI_number: 15834046 # Func_class: E Amino acid transport and metabolism # Function: Glutamine synthetase # Organism: Escherichia coli O157:H7 # 1 469 1 469 469 961 100.0 0 MSAEHVLTMLNEHEVKFVDLRFTDTKGKEQHVTIPAHQVNAEFFEEGKMFDGSSIGGWKG INESDMVLMPDASTAVIDPFFADSTLIIRCDILEPGTLQGYDRDPRSIAKRAEDYLRSTG IADTVLFGPEPEFFLFDDIRFGSSISGSHVAIDDIEGAWNSSTQYEGGNKGHRPAVKGGY FPVPPVDSAQDIRSEMCLVMEQMGLVVEAHHHEVATAGQNEVATRFNTMTKKADEIQIYK YVVHNVAHRFGKTATFMPKPMFGDNGSGMHCHMSLSKNGVNLFAGDKYAGLSEQALYYIG GVIKHAKAINALANPTTNSYKRLVPGYEAPVMLAYSARNRSASIRIPVVSSPKARRIEVR FPDPAANPYLCFAALLMAGLDGIKNKIHPGEAMDKNLYDLPPEEAKEIPQVAGSLEEALN ELDLDREFLKAGGVFTDEAIDAYIALRREEDDRVRMTPHPVEFELYYSV >gi|296493179|gb|ADTK01000322.1| GENE 7 7155 - 8978 2139 607 aa, chain + ## HITS:1 
COG:ECs4793 KEGG:ns NR:ns ## COG: ECs4793 COG1217 # Protein_GI_number: 15834047 # Func_class: T Signal transduction mechanisms # Function: Predicted membrane GTPase involved in stress response # Organism: Escherichia coli O157:H7 # 1 607 1 607 607 1205 100.0 0 MIEKLRNIAIIAHVDHGKTTLVDKLLQQSGTFDSRAETQERVMDSNDLEKERGITILAKN TAIKWNDYRINIVDTPGHADFGGEVERVMSMVDSVLLVVDAFDGPMPQTRFVTKKAFAYG LKPIVVINKVDRPGARPDWVVDQVFDLFVNLDATDEQLDFPIVYASALNGIAGLDHEDMA EDMTPLYQAIVDHVPAPDVDLDGPFQMQISQLDYNSYVGVIGIGRIKRGKVKPNQQVTII DSEGKTRNAKVGKVLGHLGLERIETDLAEAGDIVAITGLGELNISDTVCDTQNVEALPAL SVDEPTVSMFFCVNTSPFCGKEGKFVTSRQILDRLNKELVHNVALRVEETEDADAFRVSG RGELHLSVLIENMRREGFELAVSRPKVIFREIDGRKQEPYENVTLDVEEQHQGSVMQALG ERKGDLKNMNPDGKGRVRLDYVIPSRGLIGFRSEFMTMTSGTGLLYSTFSHYDDVRPGEV GQRQNGVLISNGQGKAVAFALFGLQDRGKLFLGHGAEVYEGQIIGIHSRSNDLTVNCLTG KKLTNMRASGTDEAVVLVPPIRMTLEQALEFIDDDELVEVTPTSIRIRKRHLTENDRRRA NRAPKDD >gi|296493179|gb|ADTK01000322.1| GENE 8 9195 - 9905 459 236 aa, chain + ## HITS:1 COG:ECs4794 KEGG:ns NR:ns ## COG: ECs4794 COG2188 # Protein_GI_number: 15834048 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 236 1 236 236 474 99.0 1e-134 MAENQSTVENAKEKLDRWLKDGITTPGGKLPSERELGELLGIKRMTLRQALLNLEAESKI FRKDRKGWFVTQPRFNYSPELSASFQRAAIEQGREPSWGFTEKNRTSDIPKTLAPLIAVT PSTELYRITGWGALEGHKVFYHETYINPEVAPGFIEQLENHSFSAVWEKCYQKETVVKKL IFKPVRMPGDISKYLGGSAGMPAILIEKHRADQQGNIVQIDIEYWRFEAVDLIINL >gi|296493179|gb|ADTK01000322.1| GENE 9 9913 - 10893 753 326 aa, chain + ## HITS:1 COG:no KEGG:B21_03707 NR:ns ## KEGG: B21_03707 # Name: yihM # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 326 1 326 326 629 99.0 1e-179 MVTINNARKILQRVDTLPLYLHAYAFHLNMRLERVLPADLLDIASENNLRGVKIHVLDGE RFSLGNMDDKELSAFGDKARRLNLDIHIETSASDKASIDEAVAIALKTGASSVRFYPRYE GNLRDVLSIIANDIAYVRETYQDSGLTFTIEQHEDLKSHELVLLVKESEMESLSLLFDFA NMINANEHPIDALKTMAPHITQVHIKDALIVKEQGGLGHKACISGQGDMPFKALLTHLIC LGDDEPQVTAYGLEEEVDYYAPAFRFEDEDDNPWIPYRQMSETPLPENHLLDARLRKEKE 
DAINQINHVRNVLQQIKQEANHLLNH >gi|296493179|gb|ADTK01000322.1| GENE 10 10995 - 12260 1081 421 aa, chain + ## HITS:1 COG:ECs4796 KEGG:ns NR:ns ## COG: ECs4796 COG0477 # Protein_GI_number: 15834050 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 421 1 421 421 776 99.0 0 MLTKKKWALFSLLTLCGGTIYKLPSLKDAFYIPMQEYFHLTNGQIGNAMSVNSFVTTVGF FLSIYFADKLPRRYTMSFSLIATGLLGVYLTTMPGYWGILFVWALFGVTCDMMNWPVLLK SVSRLGNSEQQGRLFGFFETGRGIVDTVVAFSALAVFTWFGSGLLGFKAGIWFYSLIVIA VGIIIFFVLNDKEEAPSVEVKKEDGASQNTSMTSVLRDKTIWLIAFNVFFVYAVYCGLTF FIPFLKNIYLLPVALVGAYGIINQYCLKMIGGPIGGMISDKILKSPSKYLCYTFIISTAA LVLLIMLPHESMPVYLGMACTLGFGAIVFTQRAVFFAPIGEAKIAENKTGAAMALGSFIG YAPAMFCFSLYGYILDLNPGIIGYKIVFGIMACFAFCGAVVSVMLVKRISQRKKEMLAAE A >gi|296493179|gb|ADTK01000322.1| GENE 11 12351 - 13043 556 230 aa, chain - ## HITS:1 COG:no KEGG:ECO103_4296 NR:ns ## KEGG: ECO103_4296 # Name: ompL # Def: outer membrane porin L # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 230 1 230 230 443 100.0 1e-123 MKKINAIILLSSLTSASVFAGAYVENREAYNLASDQGEVMLRVGYNFDMGAGIMLTNTYT FQREDELKHGYNEIEGWYPLFKPTDKLTIQPGGLINDKSIGSGGAVYLDVNYKFVPWFNL TVRNRYNHNNYSSTDLSGELDNNDTYEIGTYWNFKITDKFSYTFEPHYFIRVNDFNSSNG KDHHWEITNTFRYRINEHWLPYFELRWLDRNVEPYHREQNQIRIGTKYFF >gi|296493179|gb|ADTK01000322.1| GENE 12 13111 - 14514 1050 467 aa, chain - ## HITS:1 COG:yihO KEGG:ns NR:ns ## COG: yihO COG2211 # Protein_GI_number: 16131716 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli K12 # 1 463 3 465 487 889 99.0 0 MSDHNPLTLKLNLREKIAYGMGDVGSNLMLCIGTLYLLKFYTDELGMPAYYGGIIFLVAK FFTAFTDMLTGFLLDSRKNIGPKGKFRPFILYAAVPAALIATLQFIATTFCLPVKTTIAT ALFMMFGLSYSLMNCSYGAMIPAITKNPNERAQLAAYRQGGATIGLLICTVAFIPLQSLF SDSTVGYACAALMFSIGGFIFMMLCYRGVKEHYVDTAPTGHKASILKSFCAIFRNPPLLV 
LCIANLCTLAAFNIKLAIQVYYTQYVLNDINLLSWMGFFSMGCILVGVLLVPVTVKCFGK KQVYLAGMVLWAVGDILNYFWGSNSFTFVMFSCVAFFGTAFVNSLNWALVPDTVDYGEWK TGIRAEGSVYTGYTFFRKISAALAGFLPGIMLTQIGYVPNIAQSDATLQGLRQLIFIWPC ALAIIAALTMGFFYTLNEKRFALIIEEINQRKNKEMATEEKTASVTL >gi|296493179|gb|ADTK01000322.1| GENE 13 14557 - 15942 1465 461 aa, chain - ## HITS:1 COG:yihP KEGG:ns NR:ns ## COG: yihP COG2211 # Protein_GI_number: 16131717 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli K12 # 1 461 8 468 468 878 99.0 0 MSHITTEDPATLRLPFKEKLSYGIGDLASNILLDIGTLYLLKFYTDVLGLPGTYGGIIFL ISKFFTAFTDMGTGIMLDSRRKIGPKGKFRPFILYASFPVTLLAIANFVGTPFDVTGKTV MATILFMLYGLFFSMMNCSYGAMVPAITKNPNERASLAAWRQGGATLGLLLCTVGFVPVM NLIEGNQQLGYIFAATLFSLFGLLFMWICYSGVKERYVETQPTNPAQKPGLLQSFRAIAG NRPLFILCIANLCTLGAFNVKLAIQVYYTQYVLNDPILLSYMGFFSMGCIFIGVFLMPGA VRRFGKKKVYIGGLLIWVLGDLLNYFFGGGSVSFVAFSCLAFFGSAFVNSLNWALVSDTV EYGEWRTGVRSEGTVYTGFTFFRKVSQALAGFFPGWMLTQIGYVPNVAQADHTIEGLRQL IFIYPSALAVVTIVAMGCFYSLNEKMYVRIVEEIEARKRTA >gi|296493179|gb|ADTK01000322.1| GENE 14 15988 - 18024 2016 678 aa, chain - ## HITS:1 COG:yihQ KEGG:ns NR:ns ## COG: yihQ COG1501 # Protein_GI_number: 16131718 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Escherichia coli K12 # 1 678 1 678 678 1417 99.0 0 MDTPRPQLLDFQFHQNNDSFTLHFQQRLILTHSKDNPCLWIGSGIADIDMFRGNFSIKDK LQEKIALTDAIVSQSPDGWLIHFSRGSDISATLNISADDQGRLLLELQNDNLNHNRIWLR LAAQPEDHIYGCGEQFSYFDLRGKPFPLWTSEQGVGRNKQTYVTWQADCKENAGGDYYWT FFPQPTFVSTQKYYCHVDNSCYMNFDFSAPEYHELALWEDKATLRFECADTYISLLEKLT ALLGRQPELPDWIYDGVTLGIQGGTEVCQKKLDTMRNAGVKVNGIWAQDWSGIRMTSFGK RVMWNWKWNSENYPQLDSRIKQWNQEGVQFLAYINPYVASDKDLCEEAAQHGYLAKDASG GDYLVEFGEFYGGVVDLTNPEAYAWFKEVIKKNMIELGCGGWMADFGEYLPTDTYLHNGI SAEIMHNAWPALWAKCNYEALEETGKLGEILFFMRAGSTGSQKYSTMMWAGDQNVDWSLD DGLASVVPAALSLAMTGHGLHHSDIGGYTTLFEMKRSKELLLRWCDFSAFTPMMRTHEGN RPGDNWQFDGDAETIAHFARMTSVFTTLKPYLKEAVALNAKSGLPVMRPLFLHYEDDAHT 
YTLKYQYLLGRDILVAPVHEEGRSDWTLYLPEDNWVHAWTGEAFRGGEVTVNAPIGKPPV FYRADSEWAALFASLKSI >gi|296493179|gb|ADTK01000322.1| GENE 15 18223 - 19125 482 300 aa, chain - ## HITS:1 COG:yihR KEGG:ns NR:ns ## COG: yihR COG2017 # Protein_GI_number: 16131719 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Escherichia coli K12 # 1 300 9 308 308 595 97.0 1e-170 MQITNMHCSGQTVSLAAGDYHATIVTVGAGLAELTFQGCHLVIPHKPEEMPLAHLGKVLI PWPNRIANGCYRYQGQEYQLPINEHSSKAAIHGLLAWRDWQISELTATSVTLTAFLPPSY GYPFMLASQVVYSLNAHTGLSVEIASQNIGTVAAPYGVGIHTYLTCNLTSVDEYLFQLPA NQVYAVDEHVNPTTLHHVDELDLNFTQAKTIAATKIDHTFKTANDLWEITITHPQQALSV SLCCDQPWVQIYSGEKLQRQGLAVEPMSCPPNAFNSGIDLLLLEPGKPHRLFFNIYGQRK >gi|296493179|gb|ADTK01000322.1| GENE 16 19263 - 20504 1332 413 aa, chain - ## HITS:1 COG:yihS KEGG:ns NR:ns ## COG: yihS COG2942 # Protein_GI_number: 16131720 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Escherichia coli K12 # 1 413 6 418 418 836 100.0 0 MKWFNTLSHNRWLEQETDRIFDFGKNSVVPTGFGWLGNKGQIKEEMGTHLWITARMLHVY SVAAAMGRPGAYSLVDHGIKAMNGALRDKKYGGWYACVNDEGVVDASKQGYQHFFALLGA ASAVTTGHPEARKLLDYTIEIIEKYFWSEEEQMCLESWDEAFSKTEEYRGGNANMHAVEA FLIVYDVTHDKKWLDRAIRVASVIIHDVARNNHYRVNEHFDTQWNPLPDYNKDNPAHRFR AFGGTPGHWIEWGRLMLHIHAALEARCEQPPAWLLEDAKGLFNATVRDAWAPDGADGIVY TVDWEGKPVVRERVRWPIVEAMGTAYALYTVTGDRQYETWYQTWWEYCIKYLMDYENGSW WQELDADNKVTTKVWDGKQDIYHLLHCLVIPRIPLAPGMAPAVAAGLLDINAK >gi|296493179|gb|ADTK01000322.1| GENE 17 20521 - 21399 941 292 aa, chain - ## HITS:1 COG:yihT KEGG:ns NR:ns ## COG: yihT COG3684 # Protein_GI_number: 16131721 # Func_class: G Carbohydrate transport and metabolism # Function: Tagatose-1,6-bisphosphate aldolase # Organism: Escherichia coli K12 # 1 292 1 292 292 556 100.0 1e-158 MNKYTINDITRASGGFAMLAVDQREAMRMMFAAAGAPAPVADSVLTDFKVNAAKALSPYA SAILVDQQFCYRQVVEQNAIAKSCAMIVAADEFIPGNGIPVDSVVIDRKINPLQIKQDGG KALKLLVLWRSDEDAQQRLDMVKEFNELCHSHGLVSIIEPVVRPPRRGDKFDREQAIIDA 
AKELGDSGADLYKVEMPLYGKGPQQELLCASQRLNDHINMPWVILSSGVDEKLFPRAVRV AMTAGASGFLAGRAVWASVVGLPDNELMLRDVCAPKLQQLGDIVDEMMAKRR >gi|296493179|gb|ADTK01000322.1| GENE 18 21423 - 22319 950 298 aa, chain - ## HITS:1 COG:yihU KEGG:ns NR:ns ## COG: yihU COG2084 # Protein_GI_number: 16131722 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases # Organism: Escherichia coli K12 # 1 298 1 298 298 542 99.0 1e-154 MAAIAFIGLGQMGSPMASNLLQQGHQLRVFDVNAEAVRHLVDKGATPAANPAQAAKDAEF IITMLPNGDLVRNVLFGENGVCEGLSTDALVIDMSTIHPLQTDKLIADMQAKGFSMMDVP VGRTSANAITGTLLLLAGGTAEQVERATPILMAMGSELINSGGPGMGIRVKLINNYMSIA LNALSAEAAVLCEALNLPFDVAVKVMSGTAAGKGHFTTSWPNKVLSGDLSPAFMIDLAHK DLGIALDVANQLHVPMPLGAASREVYSQARAAGRGRQDWSAILEQVRVSAGMTAKVKM >gi|296493179|gb|ADTK01000322.1| GENE 19 22487 - 23383 564 298 aa, chain + ## HITS:1 COG:yihV KEGG:ns NR:ns ## COG: yihV COG0524 # Protein_GI_number: 16131723 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli K12 # 1 298 3 300 300 571 99.0 1e-163 MIRVACVGITVMDRIYYVKGLPTESGKYVARNYTEVGGGPAATAAVAAARLGAQVDFIGR VGDDDTGNSLLAELESWGVNTRYTKRYNQAKSSQSAIMVDTKGERIIINYPSPDLLPDAE WLEEIDFSQWDVVLADVRWHDGAKKAFTLARQAGVMTVLDGDITPQDISELVALSDHAAF SEPGLARLTGVKEMASALKQAQTLTNGHVYVTQGSAGCDWLENGGRQHQPAFKVDVVDTT GAGDVFHGALAVALATSGDLAESVRFASGVAALKCTRPGGRAGIPDCDQTRSFLSLFV >gi|296493179|gb|ADTK01000322.1| GENE 20 23417 - 24202 443 261 aa, chain + ## HITS:1 COG:ECs4807 KEGG:ns NR:ns ## COG: ECs4807 COG1349 # Protein_GI_number: 15834061 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli O157:H7 # 1 261 9 269 269 486 100.0 1e-137 MSLTELTGNPRHDQLLMLIAERGYMNIDELANLLDVSTQTVRRDIRKLSEQGLITRHHGG AGRASSVVNTAFEQREVSQTEEKKAIAEAVADYIPDGSTIFITIGTTVEHVARALLNHNH LRIITNSLRVAHILYHNPRFEVMVPGGTLRSHNSGIIGPSAASFVADFRADYLVTSVGAI 
ESDGALMEFDVNEANVVKTMMAHARNILLVADHTKYHASAAVEIGNVAQVTALFTDELPP AALKSRLQDSQIEIILPQEDA >gi|296493179|gb|ADTK01000322.1| GENE 21 24301 - 24900 764 199 aa, chain + ## HITS:1 COG:yihX KEGG:ns NR:ns ## COG: yihX COG1011 # Protein_GI_number: 16131725 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Escherichia coli K12 # 1 199 8 206 206 414 100.0 1e-116 MLYIFDLGNVIVDIDFNRVLGAWSDLTRIPLASLKKSFHMGEAFHQHERGEISDEAFAEA LCHEMALPLSYEQFSHGWQAVFVALRPEVIAIMHKLREQGHRVVVLSNTNRLHTTFWPEE YPEIRDAADHIYLSQDLGMRKPEARIYQHVLQAEGFSPSDTVFFDDNADNIEGANQLGIT SILVKDKTTIPDYFAKVLC >gi|296493179|gb|ADTK01000322.1| GENE 22 24894 - 25766 667 290 aa, chain + ## HITS:1 COG:ECs4809 KEGG:ns NR:ns ## COG: ECs4809 COG1295 # Protein_GI_number: 15834063 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 290 1 290 290 528 100.0 1e-150 MLKTIQDKARHRTRPLWAWLKLLWQRIDEDNMTTLAGNLAYVSLLSLVPLVAVVFALFAA FPMFSDVSIQLRHFIFANFLPATGDVIQRYIEQFVANSNKMTAVGACGLIVTALLLMYSI DSALNTIWRSKRARPKIYSFAVYWMILTLGPLLAGASLAISSYLLSLRWASDLNTVIDNV LRIFPLLLSWISFWLLYSIVPTIRVPNRDAIVGAFVAALLFEAGKKGFALYITMFPSYQL IYGVLAVIPILFVWVYWTWCIVLLGAEITVTLGEYRKLKQAAEQEEDDEP >gi|296493179|gb|ADTK01000322.1| GENE 23 25763 - 26200 464 145 aa, chain + ## HITS:1 COG:ECs4810 KEGG:ns NR:ns ## COG: ECs4810 COG1490 # Protein_GI_number: 15834064 # Func_class: J Translation, ribosomal structure and biogenesis # Function: D-Tyr-tRNAtyr deacylase # Organism: Escherichia coli O157:H7 # 1 145 1 145 145 276 100.0 8e-75 MIALIQRVTRASVTVEGEVTGEIGAGLLVLLGVEKDDDEQKANRLCERVLGYRIFSDAEG KMNLNVQQAGGSVLVVSQFTLAADTERGMRPSFSKGASPDRAEALYDYFVERCRQQEMNT QTGRFAADMQVSLVNDGPVTFWLQV >gi|296493179|gb|ADTK01000322.1| GENE 24 26245 - 27186 1090 313 aa, chain + ## HITS:1 COG:ECs4811 KEGG:ns NR:ns ## COG: ECs4811 COG0454 # Protein_GI_number: 15834065 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related 
acetyltransferases # Organism: Escherichia coli O157:H7 # 1 313 17 329 329 625 99.0 1e-179 MYHLRVPQTEEELERYYQFRWEMLRKPLHQPKGSERDAWDAMAHHQMVVDEQGNLVAVGR LYINADNEASIRFMAVHPDVQDKGLGTLMAMTLESVARQEGVKRVTCSAREDAVEFFAKL GFINQGEITTPTTTPIRHFLMIKPVATLDDILHRGDWCAQLQQAWYEHIPLSEKMGVRIQ QYTGQKFITTMPETGNQNPHHTLFAGSLFSLATLTGWGLIWLMLRERHLGGTIILADAHI RYSKPISGKPHAVADLGALSGDLDRLARGRKARVQMQVEIFGDETPGAVFEGTYIVLPAK PFGPYEEGGNEEE >gi|296493179|gb|ADTK01000322.1| GENE 25 28043 - 28261 146 72 aa, chain + ## HITS:1 COG:no KEGG:SFV_3606 NR:ns ## KEGG: SFV_3606 # Name: yiiE # Def: hypothetical protein # Organism: S.flexneri_8401 # Pathway: not_defined # 1 72 49 120 120 133 100.0 2e-30 MAMNTVFLHLSEEAIKRLNKLRGWRKVSRSAILREAVEQYLERQQFPVRKAKGGRQKGEV VGVDDQCKEHKE >gi|296493179|gb|ADTK01000322.1| GENE 26 28479 - 28721 168 80 aa, chain + ## HITS:1 COG:no KEGG:SSON_4059 NR:ns ## KEGG: SSON_4059 # Name: yiiF # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 80 1 80 80 143 100.0 2e-33 MNSLAGIDMGRILLDLSNEVIKQLDDLEVQRNLPRADLLREAVDQYLINQSQTARTSVPG IWQGCEEDGVEYQRKLREEW >gi|296493179|gb|ADTK01000322.1| GENE 27 29051 - 29980 944 309 aa, chain - ## HITS:1 COG:ECs4817 KEGG:ns NR:ns ## COG: ECs4817 COG3058 # Protein_GI_number: 15834071 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Uncharacterized protein involved in formate dehydrogenase formation # Organism: Escherichia coli O157:H7 # 1 309 1 309 309 617 99.0 1e-177 MSIRIIPQDELGSSEKRTADMIPPLLFPRLKNLYNRRAERLRELAENNPLGDYLRFAALI AHAQEVVLYDHPLEMDLTARIKEASAQGKPPLDIHVLPRDKHWQKLLMALIAELKPEMSG PALAVIENLEKASTQELEDMASALFASDFSSVSSDKAPFIWAALSLYWAQMANLIPGKAR AEYGEQRQYCPVCGSMPVSSMVQIGTTQGLRYLHCNLCETEWHVVRVKCSNCEQSGKLHY WSLDDEQAAIKAESCDDCGTYLKILYQEKDPKIEAVADDLASLVLDARMEQEGYARSSIN PFLFPGEGE >gi|296493179|gb|ADTK01000322.1| GENE 28 29977 - 30612 557 211 aa, chain - ## HITS:1 COG:ECs4818 KEGG:ns NR:ns ## COG: ECs4818 COG2864 # Protein_GI_number: 15834072 # Func_class: C Energy production and conversion # 
Function: Cytochrome b subunit of formate dehydrogenase # Organism: Escherichia coli O157:H7 # 1 211 1 211 211 390 100.0 1e-108 MKRRDTIVRYTAPERINHWITAFCFILAAVSGLGFLFPSFNWLMQIMGTPQLARILHPFV GVVMFASFIIMFFRYWHHNLINRDDIFWAKNIRKIVVNEEVGDTGRYNFGQKCVFWAAII FLVLLLVSGVIIWRPYFAPAFSIPVIRFALMLHSFAAVALIVVIMVHIYAALWVKGTITA MVEGWVTSAWAKKHHPRWYREVRKTTEKKAE >gi|296493179|gb|ADTK01000322.1| GENE 29 30609 - 31424 753 271 aa, chain - ## HITS:1 COG:fdoH KEGG:ns NR:ns ## COG: fdoH COG0437 # Protein_GI_number: 16131733 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli K12 # 1 271 30 300 300 569 99.0 1e-162 MAKLIDVTTCIGCKACQVACSEWNDIRDTVGNNVGVYDNPNDLSAKSWTVMRFSEVEQND KLEWLIRKDGCMHCSDPGCLKACPAEGAIIQYANGIVDFQSEQCIGCGYCIAGCPFDIPR LNPEDNRVYKCTLCVDRVVVGQEPACVKTCPTGAIHFGTKESMKTLASERVAELKTRGYD NAGLYDPAGVGGTHVMYVLHHADKPNLYHGLPENPEISETVKFWKGIWKPLAAVGFAATF AASIFHYVGVGPNRADEEENNLHEEKDEERK Prediction of potential genes in microbial genomes Time: Mon May 16 16:01:48 2011 Seq name: gi|296493178|gb|ADTK01000323.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont973.4, whole genome shotgun sequence Length of sequence - 5413 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/1.000 - CDS 86 - 2500 2750 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing 2 1 Op 2 . - CDS 2549 - 3136 499 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing - Prom 3353 - 3412 4.2 + Prom 3126 - 3185 4.4 3 2 Tu 1 . + CDS 3330 - 4163 619 ## COG1526 Uncharacterized protein required for formate dehydrogenase activity + Term 4170 - 4218 4.1 + Prom 4174 - 4233 7.4 4 3 Tu 1 . + CDS 4316 - 4840 551 ## B21_03730 hypothetical protein + Prom 4887 - 4946 3.3 5 4 Tu 1 . 
+ CDS 5039 - 5371 290 ## B21_03730 hypothetical protein Predicted protein(s) >gi|296493178|gb|ADTK01000323.1| GENE 1 86 - 2500 2750 804 aa, chain - ## HITS:1 COG:fdoG KEGG:ns NR:ns ## COG: fdoG COG0243 # Protein_GI_number: 16131734 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 804 213 1016 1016 1684 100.0 0 MTNHWVDIKNANLVVVMGGNAAEAHPVGFRWAMEAKIHNGAKLIVIDPRFTRTAAVADYY APIRSGTDIAFLSGVLLYLLNNEKFNREYTEAYTNASLIVREDYGFEDGLFTGYDAEKRK YDKSSWTYELDENGFAKRDTTLQHPRCVWNLLKQHVSRYTPDVVENICGTPKDAFLKVCE YIAETSAHDKTASFLYALGWTQHSVGAQNIRTMAMIQLLLGNMGMAGGGVNALRGHSNIQ GLTDLGLLSQSLPGYMTLPSEKQTDLQTYLTANTPKPLLEGQVNYWGNYPKFFVSMMKAF FGDKATAENSWGFDWLPKWDKGYDVLQYFEMMKEGKVNGYICQGFNPVASFPNKNKVIGC LSKLKFLVTIDPLNTETSNFWQNHGELNEVDSSKIQTEVFRLPSTCFAEENGSIVNSGRW LQWHWKGADAPGIALTDGEILSGIFLRLRKMYAEQGGANPDQVLNMTWNYAIPHEPSSEE VAMESNGKALADITDPATGAVIVKKGQQLSSFAQLRDDGTTSCGCWIFAGSWTPEGNQMA RRDNADPSGLGNTLGWAWAWPLNRRILYNRASADPQGNPWDPKRQLLKWDGTKWTGWDIP DYSAAPPGSGVGPFIMQQEGMGRLFALDKMAEGPFPEHYEPFETPLGTNPLHPNVISNPA ARIFKDDAEALGKADKFPYVGTTYRLTEHFHYWTKHALLNAILQPEQFVEIGESLANKLG IAQGDTVKVSSNRGYIKAKAVVTKRIRTLKANGKDIDTIGIPIHWGYEGVAKKGFIANTL TPFVGDANTQTPEFKSFLVNVEKV >gi|296493178|gb|ADTK01000323.1| GENE 2 2549 - 3136 499 195 aa, chain - ## HITS:1 COG:fdoG KEGG:ns NR:ns ## COG: fdoG COG0243 # Protein_GI_number: 16131734 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 195 1 195 1016 412 100.0 1e-115 MQVSRRQFFKICAGGMAGTTAAALGFAPSVALAETRQYKLLRTRETRNTCTYCSVGCGLL MYSLGDGAKNAKASIFHIEGDPDHPVNRGALCPKGAGLVDFIHSESRLKFPEYRAPGSDK WQQISWEEAFDRIAKLMKEDRDANYIAQNAEGVTVNRWLSTGMLCASASSNETGYLTQKF SRALGMLAVDNQARV >gi|296493178|gb|ADTK01000323.1| GENE 3 3330 - 4163 619 277 aa, chain + ## HITS:1 COG:fdhD KEGG:ns NR:ns ## COG: fdhD COG1526 # Protein_GI_number: 16131735 # Func_class: C Energy production and conversion # 
Function: Uncharacterized protein required for formate dehydrogenase activity # Organism: Escherichia coli K12 # 1 277 1 277 277 569 99.0 1e-162 MKKTQRKEIENVTNITGVRQIELWRRDDLQHPRLDEVAEEVPVALVYNGISHVVMMASPK DLEYFALGFSLSEGIIESPRDIFGMDVVPSCNGLEVQIELSSRRFMGLKERRRALAGRTG CGVCGVEQLNDIGKPVQPLPFTQTFDLNKLDGALRHLNDFQPVGQLTGCTHAAAWMLPSG ELVGGHEDVGRHVALDKLLGRRSQEGESWQQGAVLVSSRASYEMVQKSAMCGVEILFAVS AATTLAVEVAERCNLTLVGFCKPGRATVYTHPQRLSN >gi|296493178|gb|ADTK01000323.1| GENE 4 4316 - 4840 551 174 aa, chain + ## HITS:1 COG:no KEGG:B21_03730 NR:ns ## KEGG: B21_03730 # Name: yiiG # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 174 1 174 351 310 99.0 2e-83 MKRNLLSSAIIVAIMSLGLTGCDDKKAETETLPPANSQPAAPAPEAKPTEAPVAKAEAKP ETPAQPVVDEQAVFDEKMDVYIKCYNKLQIPVQRSLARYADWLKDFKQGPTGEERTVYGI YGISESNLAECEKGVKSAVALTPALQPIDGVAVGYIDAAVALGNTINEMDKYYT >gi|296493178|gb|ADTK01000323.1| GENE 5 5039 - 5371 290 110 aa, chain + ## HITS:1 COG:no KEGG:B21_03730 NR:ns ## KEGG: B21_03730 # Name: yiiG # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 110 242 351 351 203 100.0 2e-51 MISAKQINNLISQDKFDAEAAMKKVSELETLVAQAKEADKGGMNFSFINSAGQYQLEAKK YVRRIRDKVPYSDWDKEQLQDANSSWMVEDSFPRALREYNEMVDDYNSLR Prediction of potential genes in microbial genomes Time: Mon May 16 16:02:02 2011 Seq name: gi|296493177|gb|ADTK01000324.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont973.5, whole genome shotgun sequence Length of sequence - 27095 bp Number of predicted genes - 29, with homology - 28 Number of transcription units - 20, operones - 5 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.714 - CDS 5 - 1753 1062 ## COG3711 Transcriptional antiterminator 2 1 Op 2 2/0.714 - CDS 1753 - 2823 1102 ## COG1363 Cellulase M and related proteins 3 1 Op 3 7/0.143 - CDS 2813 - 4264 1500 ## COG1299 Phosphotransferase system, fructose-specific IIC component 4 1 Op 4 1/0.857 - CDS 4275 - 4721 351 ## COG1762 Phosphotransferase system 
mannitol/fructose-specific IIA domain (Ntr-type) - Prom 4912 - 4971 5.1 - Term 4978 - 5013 7.4 5 2 Op 1 4/0.286 - CDS 5022 - 5336 359 ## COG3254 Uncharacterized conserved protein 6 2 Op 2 5/0.143 - CDS 5346 - 6170 961 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 7 2 Op 3 6/0.143 - CDS 6253 - 7512 1387 ## COG4806 L-rhamnose isomerase 8 2 Op 4 . - CDS 7509 - 8978 1253 ## COG1070 Sugar (pentulose and hexulose) kinases - Prom 9002 - 9061 2.4 + Prom 9150 - 9209 2.6 9 3 Tu 1 . + CDS 9266 - 10102 514 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 9937 - 9992 1.1 10 4 Tu 1 . - CDS 10050 - 10184 81 ## - Prom 10266 - 10325 2.2 11 5 Tu 1 . + CDS 10176 - 11024 582 ## COG2207 AraC-type DNA-binding domain-containing proteins 12 6 Tu 1 . - CDS 11021 - 12055 1321 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 12227 - 12286 6.4 + Prom 12230 - 12289 7.2 13 7 Tu 1 . + CDS 12340 - 12960 573 ## PROTEIN SUPPORTED gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent + Term 13015 - 13051 6.4 + Prom 13030 - 13089 7.3 14 8 Tu 1 . + CDS 13214 - 14203 921 ## ECSE_4198 2-keto-3-deoxygluconate permease + Term 14222 - 14266 8.0 + Prom 14257 - 14316 3.7 15 9 Tu 1 . + CDS 14352 - 15026 571 ## COG2258 Uncharacterized protein conserved in bacteria + Term 15055 - 15118 0.8 - Term 14988 - 15023 -0.7 16 10 Op 1 40/0.000 - CDS 15132 - 16505 1458 ## COG0642 Signal transduction histidine kinase 17 10 Op 2 . - CDS 16502 - 17200 884 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 17257 - 17316 2.1 + Prom 17264 - 17323 2.0 18 11 Tu 1 . + CDS 17350 - 17850 268 ## COG3678 P pilus assembly/Cpx signaling pathway, periplasmic inhibitor/zinc-resistance associated protein + Term 17891 - 17921 4.3 + Prom 17870 - 17929 5.2 19 12 Tu 1 . 
+ CDS 17999 - 18901 592 ## COG0053 Predicted Co/Zn/Cd cation transporters + Term 18949 - 18996 1.3 + Prom 18943 - 19002 4.2 20 13 Tu 1 . + CDS 19082 - 20044 1205 ## COG0205 6-phosphofructokinase + Term 20230 - 20258 0.6 + Prom 20246 - 20305 6.6 21 14 Tu 1 . + CDS 20363 - 21352 1234 ## COG1613 ABC-type sulfate transport system, periplasmic component + Term 21362 - 21424 1.8 + Prom 21376 - 21435 4.8 22 15 Tu 1 . + CDS 21459 - 22214 560 ## COG2134 CDP-diacylglycerol pyrophosphatase - Term 22225 - 22259 5.2 23 16 Tu 1 . - CDS 22269 - 23036 857 ## COG0149 Triosephosphate isomerase - Term 23062 - 23092 2.1 24 17 Tu 1 . - CDS 23144 - 23743 599 ## B21_03754 hypothetical protein - Prom 23767 - 23826 5.8 + Prom 23761 - 23820 2.5 25 18 Tu 1 3/0.714 + CDS 23844 - 24284 478 ## COG3152 Predicted membrane protein + Prom 24401 - 24460 3.3 26 19 Op 1 3/0.714 + CDS 24496 - 24795 343 ## COG3691 Uncharacterized protein conserved in bacteria 27 19 Op 2 . + CDS 24822 - 25250 417 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins + Term 25281 - 25333 4.6 28 20 Op 1 4/0.286 - CDS 25255 - 26001 920 ## COG1018 Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 - Term 26042 - 26077 5.1 29 20 Op 2 . 
- CDS 26098 - 26991 1270 ## COG1494 Fructose-1,6-bisphosphatase/sedoheptulose 1,7-bisphosphatase and related proteins - Prom 27035 - 27094 25.4 Predicted protein(s) >gi|296493177|gb|ADTK01000324.1| GENE 1 5 - 1753 1062 582 aa, chain - ## HITS:1 COG:ZfrvR_1 KEGG:ns NR:ns ## COG: ZfrvR_1 COG3711 # Protein_GI_number: 15804486 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Escherichia coli O157:H7 EDL933 # 1 463 1 463 463 901 98.0 0 MLNERQLKIVDLLEQQPRTPGELAQQTGVSGRTILRDIDYLNFTLNGKARIFASGSAGYQ LEIFERRSFFQLLQKHDNDDRLLALLLLNTFTPRAQLASALNLPETWVAERLPRLKQRYE RTCCLASRPGLGHFIDETEEKRVILLANLLRKDPFLIPLAGITRDNLQHLSTACDNQHRW PLMQGDYLSSLILAIYALRNQLTDEWPQYPGDEIKQIVEHSGLFLGDNSVRTLTGLIEKQ HQQAQVISADHVLGLLQRVPGIASLNIIDTQLVENITGHLLRCLAAPVWIAEHRQSSMNN LKAAWPAAFDMSLHFITLLREQLDIPLFDSDLIGLYFACALERHQNERQPIILLSDQNAI ATINQLAIERDVLHCRVIIARSLSELVAIREEIEPLLIINNSHYLLDDAVNNCITVKNII TAAGIEQIKHFLATAFIRQQPERFFSAPGSFHYSNVRGESWQHITRQICAQLVAQHHITA DEAQRIIAREGEGENLIVNRLAIPHCWSEQERRFRGFFITLAQPVEVNNEVINHVLIACA AADARHELKIFSYLASVLCQHPAEVIAGLTGYEAFMELLHKG >gi|296493177|gb|ADTK01000324.1| GENE 2 1753 - 2823 1102 356 aa, chain - ## HITS:1 COG:frvX KEGG:ns NR:ns ## COG: frvX COG1363 # Protein_GI_number: 16131738 # Func_class: G Carbohydrate transport and metabolism # Function: Cellulase M and related proteins # Organism: Escherichia coli K12 # 1 356 1 356 356 722 97.0 0 MNIELLQQLCEASAVSGDEQEVRDILINTLEPCVNEITFDGLGSFVARKGNKGPKVAVVG HMDEVGFMVTHIDESGFLRFTTIGGWWNQSMLNHRVTIRTHKGVKIPGVIGSVAPHALTE KQRQQPLSFDEMFIDIGANSRDEVEKRGVAMGDFISPEANFACWGEDKVVGKALDNRIGC AMMAELLQTVNNPEITLYGVGSVEEEVGLRGAQTSAEHIKPDVVIVLDTAVAGDVPGIDN IKYPLKLGQGPGLMLFDKRYFPNQKLVAAFKNCATQNDLPLQFSTMKTGATDGGRYNVMG GGRPVVALCLPTRYLHANSGMISKADYDALLTLIRGFLTTLTAEKVNAFSQFRQVD >gi|296493177|gb|ADTK01000324.1| GENE 3 2813 - 4264 1500 483 aa, chain - ## HITS:1 COG:frvB_2 KEGG:ns NR:ns ## COG: frvB_2 COG1299 # Protein_GI_number: 16131739 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, 
fructose-specific IIC component # Organism: Escherichia coli K12 # 129 483 1 355 355 624 99.0 1e-179 MESSLRIVAITNCPAGIAHTYMVAEALEQKARSLGHTIKVETQGSSGVENRLSSEEIAAA DYVILATGRGLSGDDRARFAGKKVYEIAISQALKNIDQIFSELPTNSQLFAADSGVKLGK QEVQSGSVMSHLMAGVSAALPFVIGGGILVALANMLVQFGLPYTDMSKGAPSFTLVVESI GYLGFTFMIPIMGAYIASSIADKPAFAPAFLVCYLANDKALLGTQSGAGFLGAVVLGLAI GYFVFWFRKVRLGKALQPLLGSMLIPFVTLLVFGVLTYYVIGPVMSDLMGGLLHFLNTIP PSMKFAAAFLVGAMLAFDMGGPINKTAWFFCFSLLEKHIYDWYAIVGVVALMPPVAAGLA TFIAPKLFTRQEKEAASSAIVVGATVATEPAIPYALAAPLPMITANTLAGGITGVLVIAF GIKRLAPGLGIFDPLIGLMSPVGSFYLVLAIGLALNISFIIVLKGLWLRRKAKAAQQELV HEH >gi|296493177|gb|ADTK01000324.1| GENE 4 4275 - 4721 351 148 aa, chain - ## HITS:1 COG:frvA KEGG:ns NR:ns ## COG: frvA COG1762 # Protein_GI_number: 16131740 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Escherichia coli K12 # 1 148 1 148 148 286 97.0 1e-77 MAALTASCIDLNIQGNGAYSVLKQLATIALQNGFITDSHQFLQTLLLREKMHSTGFGSGV AVPHGKSACVKQPFVLFARKAQAIDWKASDGEDVNCWICLGVPQSGEEDQVKIIGTLCRK IIHQDFIHQLQQGDADQVLALLNQTLSS >gi|296493177|gb|ADTK01000324.1| GENE 5 5022 - 5336 359 104 aa, chain - ## HITS:1 COG:yiiL KEGG:ns NR:ns ## COG: yiiL COG3254 # Protein_GI_number: 16131741 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 104 1 104 104 198 99.0 2e-51 MIRKAFVMQVNPDAHEEYQRRHNPIWPELEAVLKSHGAHNYAIYLDKARNLLFATVEIES EERWNAVASTDVCQRWWKYMTDVMPANPDNSPVSSELQEVFYLP >gi|296493177|gb|ADTK01000324.1| GENE 6 5346 - 6170 961 274 aa, chain - ## HITS:1 COG:ECs4829 KEGG:ns NR:ns ## COG: ECs4829 COG0235 # Protein_GI_number: 15834083 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli O157:H7 # 1 274 1 274 274 567 98.0 1e-161 MQNITQSWFVQGMIKATTDAWLKGWDERNGGNLTLRLDDADIALYHDNFHPQPRYIPLSQ 
PMPLLANTPFIVTGSGKFFRNVQLDPAANLGVVKVDSDGAGYHILWGLTNEAVPTSELPA HFLSHCERIKATNGKDRVIMHCHATNLIALTYILENDTAVFTRQLWEGSTECLVVFPDGV GILPWMVPGTDEIGQATAQEMQKHSLVLWSFHGVFGSGPTLDETFGLIDTAEKSAQVLVK VYSMGGMKQTISREELIALGKRFGVTPLASALAL >gi|296493177|gb|ADTK01000324.1| GENE 7 6253 - 7512 1387 419 aa, chain - ## HITS:1 COG:ECs4830 KEGG:ns NR:ns ## COG: ECs4830 COG4806 # Protein_GI_number: 15834084 # Func_class: G Carbohydrate transport and metabolism # Function: L-rhamnose isomerase # Organism: Escherichia coli O157:H7 # 1 419 1 419 419 849 98.0 0 MTTQLEQAWELAKQRFAAVGIDVEEALRQLDRLPVSMHCWQGDDVSGFENPEGSLTGGIQ ATGNYPGKARNASELRADLEQAMRLIPGPKRLNLHAIYLESDTPVSRDQIKPEHFKNWVE WAKANQLGLDFNPSCFSHPLSADGFTLSHPDDSIRQFWIDHCKASRRVSAYFGEQLGTPS VMNIWIPDGMKDITVDRLAPRQRLLAALDEVISEKLDPAHHIDAVESKLFGIGAESYTVG SNEFYMGYATSRQTALCLDAGHFHPTEVISDKISAAMLYVPQLLLHVSRPVRWDSDHVVL LDDETQAIASEIVRHDLFDRVHIGLDFFDASINRIAAWVIGTRNMKKALLRALLEPTAEL RKLEAAGDYTARLALLEEQKSLPWQAVWEMYCQRHDTPAGSEWLESVRAYEKAILSQRG >gi|296493177|gb|ADTK01000324.1| GENE 8 7509 - 8978 1253 489 aa, chain - ## HITS:1 COG:ECs4831 KEGG:ns NR:ns ## COG: ECs4831 COG1070 # Protein_GI_number: 15834085 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Escherichia coli O157:H7 # 1 489 1 489 489 1010 98.0 0 MTFRNCVAVDLGASSGRVMLARYERECRSLTLREIHRFNNGLHSQNGYVTWDVDSLESAI RLGLNKVCEEGIRIDSIGIDTWGVDFVLLDQQGQRVGLPVAYRDSRTNGLMAQAQQQLGK RDIYQRSGIQFLPFNTLYQLRALTEQQPELIPHIAHALLMPDYFSYRLTGKMNWEYTNAT TTQLVNINSDDWDESLLAWSGANKAWFGRPTHPGNVIGHWICPQGNEIPVVAVASHDTAS AVIASPLNGSRAAYLSSGTWSLMGFESQTPFTNDTALAANITNEGGAEGRYRVLKNIMGL WLLQRVLKERKINDLPALIVATQSLPACRFIINPNDDRFINPEAMCSEIQAACRETAQPI PESDAELARCIFDSLALLYADVLHELAQLRGEDFSQLHIVGGGCQNTLLNQLCADACGIR VIAGPVEASTLGNIGIQLMTLDELNNVDDFRQVVSTTANLTTFTPNPDSEIAHYVARIHS TRKTKELCA >gi|296493177|gb|ADTK01000324.1| GENE 9 9266 - 10102 514 278 aa, chain + ## HITS:1 COG:rhaS KEGG:ns NR:ns ## COG: rhaS COG2207 # Protein_GI_number: 16131745 # Func_class: K Transcription # Function: 
AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 278 1 278 278 532 99.0 1e-151 MTVLHSVDFFPSGNASVAIEPRLPQADFPEHHHDFHEIVIVEHGTGIHVFNGQPYTITGG TVCFVRDHDRHLYEHTDNLCLTNVLYRSPDRFQFLAGLNQLLPQELDGQYPSHWRVNHSV LQQVRQLVAQMEQQEGENDLPSTASREILFMQLLLLLRKSSLQENLENSASRLNLLLAWL EDHFADEVNWDAVAEQFSLSLRTLHRQLKQQTGLTPQRYLNRLRLMKARHLLRHSEASVT DIAYRCGFSDSNHFSTLFRREFNWSPRDIRQGRDGFLQ >gi|296493177|gb|ADTK01000324.1| GENE 10 10050 - 10184 81 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRHRGYLGQRTKLIIRNMAYKYVEKIRVIAENHPVPDEYHAVTS >gi|296493177|gb|ADTK01000324.1| GENE 11 10176 - 11024 582 282 aa, chain + ## HITS:1 COG:ECs4833 KEGG:ns NR:ns ## COG: ECs4833 COG2207 # Protein_GI_number: 15834087 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 282 31 312 312 572 98.0 1e-163 MAHQLKLLKDDFFASDQQAVAVADRYPQDVFAEHTHDFCELVIVWRGNGLHVLNDRPYRI TRGDLFYIHADDKHSYASVNDLVLQNIIYCPERLKLNLDWQGAIPGFSASAGQPHWRLGS VGMAQARQVIGQLEHESSQHVSFANEMAELLFGQLVMLLNRHRYTSDSLPPTSSETLLDK LITRLAASLKSPFALDKFCDEASCSERVLRQQFRQQTGMTINQYLRQVRVCHAQYLLQHS RLLISDISTECGFEDSNYFSVVFTRETGMTPSQWRHLNSQKD >gi|296493177|gb|ADTK01000324.1| GENE 12 11021 - 12055 1321 344 aa, chain - ## HITS:1 COG:rhaT KEGG:ns NR:ns ## COG: rhaT COG0697 # Protein_GI_number: 16131747 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli K12 # 1 344 1 344 344 601 98.0 1e-172 MSNAITMGIFWHLIGAASAACFYAPFKKVKKWSWETMWSVGGIVSWIILPWAISALLLPN FWAYYSSFSLSTLLPVFLFGAMWGIGNINYGLTMRYLGMSMGIGIAIGITLIVGTLMTPI INGNFDVLINTEGGRMTLLGVLVALIGVGIVTRAGQLKERKMGIKAEEFNLKKGLVLAVM CGIFSAGMSFAMNAAKPMHEAAAALGVDPLYVALPSYVIIMGGGAIINLGFCFIRLAKVK DLSLKADFSLAKPLITHNVLLSALGGLMWYLQFFFYAWGHARIPAQYDYISWMLHMSFYV LCGGIVGLVLKEWNNAGRRPVTVLSLGCVVIIVAANIVGIGMAN >gi|296493177|gb|ADTK01000324.1| GENE 13 12340 - 12960 
573 206 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent [Streptococcus pneumoniae TIGR4] # 1 206 1 201 201 225 54 3e-58 MSYTLPSLPYAYDALEPHFDKQTMEIHHTKHHQTYVNNANAALESLPEFANLPVEELITK LDQLPADKKTVLRNNAGGHANHSLFWKGLKKGTTLQGDLKAAIERDFGSVDNFKAEFEKA AASRFGSGWAWLVLKGDKLAVVSTANQDSPLMGEAISGASGFPILGLDVWEHAYYLKFQN RRPDYIKEFWNVVNWDEAAARFAAKK >gi|296493177|gb|ADTK01000324.1| GENE 14 13214 - 14203 921 329 aa, chain + ## HITS:1 COG:no KEGG:ECSE_4198 NR:ns ## KEGG: ECSE_4198 # Name: not_defined # Def: 2-keto-3-deoxygluconate permease # Organism: E.coli_SE11 # Pathway: not_defined # 1 329 13 341 341 501 100.0 1e-140 MEMQIKRSIEKIPGGMMLVPLFLGALCHTFSPGAGKYFGSFTNGMITGTVPILAVWFFCM GASIKLSATGTVLRKSGTLVVTKIAVAWVVAAIASRIIPEHGVEVGFFAGLSTLALVAAM DMTNGGLYASIMQQYGTKEEAGAFVLMSLESGPLMTMIILGTAGIASFEPHVFVGAVLPF LVGFALGNLDPELREFFSKAVQTLIPFFAFALGNTIDLTVIAQTGLLGILLGVAVIIVTG IPLIIADKLIGGGDGTAGIAASSSAGAAVATPVLIAEMVPAFKPMAPAATSLVATAVIVT SILVPIITSIWSRKVKARAAKIEILGTVK >gi|296493177|gb|ADTK01000324.1| GENE 15 14352 - 15026 571 224 aa, chain + ## HITS:1 COG:yiiM KEGG:ns NR:ns ## COG: yiiM COG2258 # Protein_GI_number: 16131750 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 224 11 234 234 460 99.0 1e-129 MRYPVDVYTGKIQAYPEGKPSAIAKIQVDGELMLTELGLEGDEQAEKKVHGGPDRALCHY PREHYLYWVREFPEQAELFVAPAFGENLSTDGLTESNVHMGDIFRWGEALIQVSQPRSPC YKLNYHFDISDIAQLMQNTGKVGWLYSVIAPGKVSADAPLELVSRVSDVTVQEAAAIAWH MPFDDDQYHRLLSAAGLSKSWTRTMQKRRLSGKIEDFSRRLWGK >gi|296493177|gb|ADTK01000324.1| GENE 16 15132 - 16505 1458 457 aa, chain - ## HITS:1 COG:ECs4837 KEGG:ns NR:ns ## COG: ECs4837 COG0642 # Protein_GI_number: 15834091 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli O157:H7 # 1 457 1 457 457 816 100.0 0 MIGSLTARIFAIFWLTLALVLMLVLMLPKLDSRQMTELLDSEQRQGLMIEQHVEAELAND PPNDLMWWRRLFRAIDKWAPPGQRLLLVTTEGRVIGAERSEMQIIRNFIGQADNADHPQK 
KKYGRVELVGPFSVRDGEDNYQLYLIRPASSSQSDFINLLFDRPLLLLIVTMLVSTPLLL WLAWSLAKPARKLKNAADEVAQGNLRQHPELEAGPQEFLAAGASFNQMVTALERMMTSQQ RLLSDISHELRTPLTRLQLGTALLRRRSGESKELERIETEAQRLDSMINDLLVMSRNQQK NALVSETIKANQLWSEVLDNAAFEAEQMGKSLTVNFPPGPWPLYGNPNALESALENIVRN ALRYSHTKIEVGFAVDKDGITITVDDDGPGVSPEDREQIFRPFYRTDEARDRESGGTGLG LAIVETAIQQHRGWVKAEDSPLGGLRLVIWLPLYKRS >gi|296493177|gb|ADTK01000324.1| GENE 17 16502 - 17200 884 232 aa, chain - ## HITS:1 COG:ECs4838 KEGG:ns NR:ns ## COG: ECs4838 COG0745 # Protein_GI_number: 15834092 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 232 1 232 232 410 100.0 1e-114 MNKILLVDDDRELTSLLKELLEMEGFNVIVAHDGEQALDLLDDSIDLLLLDVMMPKKNGI DTLKALRQTHQTPVIMLTARGSELDRVLGLELGADDYLPKPFNDRELVARIRAILRRSHW SEQQQNNDNGSPTLEVDALVLNPGRQEASFDGQTLELTGTEFTLLYLLAQHLGQVVSREH LSQEVLGKRLTPFDRAIDMHISNLRRKLPDRKDGHPWFKTLRGRGYLMVSAS >gi|296493177|gb|ADTK01000324.1| GENE 18 17350 - 17850 268 166 aa, chain + ## HITS:1 COG:ECs4839 KEGG:ns NR:ns ## COG: ECs4839 COG3678 # Protein_GI_number: 15834093 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; N Cell motility; T Signal transduction mechanisms; P Inorganic ion transport and metabolism # Function: P pilus assembly/Cpx signaling pathway, periplasmic inhibitor/zinc-resistance associated protein # Organism: Escherichia coli O157:H7 # 1 151 2 152 167 211 100.0 5e-55 MRIVTAAVMASTLAVSSLSHAAEVGSGDNWHPGEELTQRSTQSHMFDGISLTEHQRQQMR DLMQQARHEQPPVNVSELETMHRLVTAENFDENAVRAQAEKMANEQIARQVEMAKVRNQM YRLLTPEQQAVLNEKHQQRMEQLRDVTQWQKSSSLKLLSSSNSRSQ >gi|296493177|gb|ADTK01000324.1| GENE 19 17999 - 18901 592 300 aa, chain + ## HITS:1 COG:ECs4840 KEGG:ns NR:ns ## COG: ECs4840 COG0053 # Protein_GI_number: 15834094 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Escherichia coli O157:H7 # 1 300 1 300 
300 558 100.0 1e-159 MNQSYGRLVSRAAIAATAMASLLLLIKIFAWWYTGSVSILAALVDSLVDIGASLTNLLVV RYSLQPADDNHSFGHGKAESLAALAQSMFISGSALFLFLTGIQHLISPTPMTDPGVGVIV TIVALICTIILVSFQRWVVRRTQSQAVRADMLHYQSDVMMNGAILLALGLSWYGWHRADA LFALGIGIYILYSALRMGYEAVQSLLDRALPDEERQEIIDIVTSWPGVSGAHDLRTRQSG PTRFIQIHLEMEDSLPLVQAHMVADQVEQAILRRFPGSDVIIHQDPCSVVPREGKRSMLS >gi|296493177|gb|ADTK01000324.1| GENE 20 19082 - 20044 1205 320 aa, chain + ## HITS:1 COG:ECs4841 KEGG:ns NR:ns ## COG: ECs4841 COG0205 # Protein_GI_number: 15834095 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Escherichia coli O157:H7 # 1 320 1 320 320 634 100.0 0 MIKKIGVLTSGGDAPGMNAAIRGVVRSALTEGLEVMGIYDGYLGLYEDRMVQLDRYSVSD MINRGGTFLGSARFPEFRDENIRAVAIENLKKRGIDALVVIGGDGSYMGAMRLTEMGFPC IGLPGTIDNDIKGTDYTIGFFTALSTVVEAIDRLRDTSSSHQRISVVEVMGRYCGDLTLA AAIAGGCEFVVVPEVEFSREDLVNEIKAGIAKGKKHAIVAITEHMCDVDELAHFIEKETG RETRATVLGHIQRGGSPVPYDRILASRMGAYAIDLLLAGYGGRCVGIQNEQLVHHDIIDA IENMKRPFKGDWLDCAKKLY >gi|296493177|gb|ADTK01000324.1| GENE 21 20363 - 21352 1234 329 aa, chain + ## HITS:1 COG:sbp KEGG:ns NR:ns ## COG: sbp COG1613 # Protein_GI_number: 16131755 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type sulfate transport system, periplasmic component # Organism: Escherichia coli K12 # 1 329 1 329 329 647 99.0 0 MNKWGVGLTFLLAATNVMAKDIQLLNVSYDPTRELYEQYNKAFSAHWKQQTGDNVVIRQS HGGSGKQATSVINGIEADVVTLALAYDVDAIAERGRIDKEWIKRLPDNSAPYTSTIVFLV RKGNPKQIHDWNDLIKPGVSVITPNPKSSGGARWNYLAAWGYALHHNNNDQAKAQDFVRA LYKNVEVLDSGARGSTNTFVERGIGDVLIAWENEALLAANELGKDKFEIVTPSESILAEP TVSVVDKVVEKKGTKEVAEAYLKYLYSPEGQEIAAKNYYRPRDAEVAKKYENAFPKLKLF TIDEEFGGWTKAQKEHFANGGTFDQISKR >gi|296493177|gb|ADTK01000324.1| GENE 22 21459 - 22214 560 251 aa, chain + ## HITS:1 COG:ECs4843 KEGG:ns NR:ns ## COG: ECs4843 COG2134 # Protein_GI_number: 15834097 # Func_class: I Lipid transport and metabolism # Function: CDP-diacylglycerol pyrophosphatase # Organism: Escherichia coli O157:H7 # 1 251 1 251 251 513 99.0 1e-145 
MKKAGLLFLVMIVIAVVAAGIGYWKLTGEESDTLRKIVLEECLPNQQQNQNPSPCAEVKP NAGYVVLKDLNGPLQYLLMPTYRINGTESPLLTDPSTPNFFWLAWQARDFMSKKYGQPVP DRAVSLAINSRTGRTQNHFHIHISCIRPDVRKQLDNNLANISSRWLPLPGGLRGHEYLAR RVTESELVQRSPFMMLAEEVPEAREHMGSYGLAMVRQSDNSFVLLATQRNLLTLNRASAE EIQDHQCEILR >gi|296493177|gb|ADTK01000324.1| GENE 23 22269 - 23036 857 255 aa, chain - ## HITS:1 COG:ECs4844 KEGG:ns NR:ns ## COG: ECs4844 COG0149 # Protein_GI_number: 15834098 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Escherichia coli O157:H7 # 1 255 1 255 255 454 100.0 1e-128 MRHPLVMGNWKLNGSRHMVHELVSNLRKELAGVAGCAVAIAPPEMYIDMAKREAEGSHIM LGAQNVDLNLSGAFTGETSAAMLKDIGAQYIIIGHSERRTYHKESDELIAKKFAVLKEQG LTPVLCIGETEAENEAGKTEEVCARQIDAVLKTQGAAAFEGAVIAYEPVWAIGTGKSATP AQAQAVHKFIRDHIAKVDANIAEQVIIQYGGSVNASNAAELFAQPDIDGALVGGASLKAD AFAVIVKAAEAAKQA >gi|296493177|gb|ADTK01000324.1| GENE 24 23144 - 23743 599 199 aa, chain - ## HITS:1 COG:no KEGG:B21_03754 NR:ns ## KEGG: B21_03754 # Name: yiiQ # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 199 1 199 199 349 100.0 3e-95 MKPGCTLFFLLCSALTVTTTAHAQTPDTATTAPYLLAGAPTFDLSISQFREDFNSQNPSL PLNEFRAIDSSPDKANLTRAASKINENLYASTALERGTLKIKSIQMTWLPIQGPEQKAAK AKAQEYMAAVIRTLTPLMTKTQSQKKLQSLLTAGKNKRYYTETEGALRYVVADNGEKGLT FAVEPIKLALSESLEGLNK >gi|296493177|gb|ADTK01000324.1| GENE 25 23844 - 24284 478 146 aa, chain + ## HITS:1 COG:ECs4846 KEGG:ns NR:ns ## COG: ECs4846 COG3152 # Protein_GI_number: 15834100 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 146 1 146 146 260 100.0 6e-70 MTIQQWLFSFKGRIGRRDFWIWIGLWFAGMLVLFSLAGKNLLDIQTAAFCLVCLLWPTAA VTVKRLHDRGRSGAWAFLMIVAWMLLAGNWAILPGVWQWAVGRFVPTLILVMMLIDLGAF VGTQGENKYGKDTQDVKYKADNKSSN >gi|296493177|gb|ADTK01000324.1| GENE 26 24496 - 24795 343 99 aa, chain + ## HITS:1 COG:ECs4847 KEGG:ns NR:ns ## COG: ECs4847 COG3691 # Protein_GI_number: 15834101 # Func_class: S Function unknown # Function: Uncharacterized protein 
conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 99 1 99 99 160 100.0 6e-40 MKDVVDKCSTKGCAIDIGTVIDNDNCTSKFSRFFATREEAESFMTKLKELAAAASSADEG ASVAYKIKDLEGQVELDAAFTFSCQAEMIIFELSLRSLA >gi|296493177|gb|ADTK01000324.1| GENE 27 24822 - 25250 417 142 aa, chain + ## HITS:1 COG:yiiT KEGG:ns NR:ns ## COG: yiiT COG0589 # Protein_GI_number: 16131761 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Escherichia coli K12 # 1 142 1 142 142 280 100.0 4e-76 MAYKHIGVAISGNEEDALLVNKALELARHNDAHLTLIHIDDGLSELYPGIYFPATEDILQ LLKNKSDNKLYKLTKNIQWPKTKLRIERGEMPETLLEIMQKEQCDLLVCGHHHSFINRLM PAYRGMINKMSADLLIVPFIDK >gi|296493177|gb|ADTK01000324.1| GENE 28 25255 - 26001 920 248 aa, chain - ## HITS:1 COG:ECs4849 KEGG:ns NR:ns ## COG: ECs4849 COG1018 # Protein_GI_number: 15834103 # Func_class: C Energy production and conversion # Function: Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 # Organism: Escherichia coli O157:H7 # 1 248 1 248 248 503 99.0 1e-142 MADWVTGKVTKVQNWTDALFSLTVHAPVLPFTAGQFTKLGLEIDGERVQRAYSYVNSPDN PDLEFYLVTVPDGKLSPRLAALKPGDEVQVVSEAAGFFVLDEVPDCETLWMLATGTAIGP YLSILQLGKDLDRFKNLVLVHAARYAADLSYLPLMQELEKRYEGKLRIQTVVSRETAAGS LTGRIPALIESGELESAIGLPMNKETSHVMLCGNPQMVRDTQQLLKETRQMTKHLRRRPG HMTAEHYW >gi|296493177|gb|ADTK01000324.1| GENE 29 26098 - 26991 1270 297 aa, chain - ## HITS:1 COG:glpX KEGG:ns NR:ns ## COG: glpX COG1494 # Protein_GI_number: 16131763 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1,6-bisphosphatase/sedoheptulose 1,7-bisphosphatase and related proteins # Organism: Escherichia coli K12 # 1 297 40 336 336 548 100.0 1e-156 MRIMLNQVNIDGTIVIGEGEIDEAPMLYIGEKVGTGRGDAVDIAVDPIEGTRMTAMGQAN ALAVLAVGDKGCFLNAPDMYMEKLIVGPGAKGTIDLNLPLADNLRNVAAALGKPLSELTV TILAKPRHDAVIAEMQQLGVRVFAIPDGDVAASILTCMPDSEVDVLYGIGGAPEGVVSAA VIRALDGDMNGRLLARHDVKGDNEENRRIGEQELARCKAMGIEAGKVLRLGDMARSDNVI FSATGITKGDLLEGISRKGNIATTETLLIRGKSRTIRRIQSIHYLDRKDPEMQVHIL Prediction of 
potential genes in microbial genomes Time: Mon May 16 16:02:20 2011 Seq name: gi|296493176|gb|ADTK01000325.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont973.6, whole genome shotgun sequence Length of sequence - 11815 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 7, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 18/0.000 - CDS 96 - 1604 1834 ## COG0554 Glycerol kinase 2 1 Op 2 . - CDS 1627 - 2472 734 ## COG0580 Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) - Prom 2578 - 2637 5.3 3 2 Tu 1 . + CDS 2904 - 3143 498 ## COG3074 Uncharacterized protein conserved in bacteria + Term 3205 - 3233 1.0 - Term 3189 - 3224 5.1 4 3 Op 1 7/0.333 - CDS 3228 - 3713 552 ## COG0684 Demethylmenaquinone methyltransferase - Prom 3739 - 3798 5.3 5 3 Op 2 6/0.333 - CDS 3806 - 4732 1040 ## COG1575 1,4-dihydroxy-2-naphthoate octaprenyltransferase - Term 4737 - 4770 2.9 6 4 Op 1 24/0.000 - CDS 4799 - 6130 1200 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 7 4 Op 2 7/0.333 - CDS 6140 - 6670 588 ## COG5405 ATP-dependent protease HslVU (ClpYQ), peptidase subunit - Prom 6690 - 6749 1.7 - Term 6694 - 6725 4.1 8 4 Op 3 6/0.333 - CDS 6763 - 7617 615 ## COG3087 Cell division protein - Prom 7752 - 7811 2.3 9 5 Tu 1 . - CDS 7814 - 8839 929 ## COG1609 Transcriptional regulators - Prom 8928 - 8987 4.5 10 6 Tu 1 . - CDS 8995 - 11193 2190 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase - Prom 11218 - 11277 2.8 + Prom 11276 - 11335 5.2 11 7 Tu 1 . 
+ CDS 11396 - 11608 384 ## PROTEIN SUPPORTED gi|15804527|ref|NP_290567.1| 50S ribosomal protein L31 + Term 11612 - 11675 5.0 Predicted protein(s) >gi|296493176|gb|ADTK01000325.1| GENE 1 96 - 1604 1834 502 aa, chain - ## HITS:1 COG:ECs4851 KEGG:ns NR:ns ## COG: ECs4851 COG0554 # Protein_GI_number: 15834105 # Func_class: C Energy production and conversion # Function: Glycerol kinase # Organism: Escherichia coli O157:H7 # 1 502 1 502 502 1023 100.0 0 MTEKKYIVALDQGTTSSRAVVMDHDANIISVSQREFEQIYPKPGWVEHDPMEIWATQSST LVEVLAKADISSDQIAAIGITNQRETTIVWEKETGKPIYNAIVWQCRRTAEICEHLKRDG LEDYIRSNTGLVIDPYFSGTKVKWILDHVEGSRERARRGELLFGTVDTWLIWKMTQGRVH VTDYTNASRTMLFNIHTLDWDDKMLEVLDIPREMLPEVRRSSEVYGQTNIGGKGGTRIPI SGIAGDQQAALFGQLCVKEGMAKNTYGTGCFMLMNTGEKAVKSENGLLTTIACGPTGEVN YALEGAVFMAGASIQWLRDEMKLINDAYDSEYFATKVQNTNGVYVVPAFTGLGAPYWDPY ARGAIFGLTRGVNANHIIRATLESIAYQTRDVLEAMQADSGIRLHALRVDGGAVANNFLM QFQSDILGTRVERPEVREVTALGAAYLAGLAVGFWQNLDELQEKAVIEREFRPGIETTER NYRYAGWKKAVKRAMAWEEHDE >gi|296493176|gb|ADTK01000325.1| GENE 2 1627 - 2472 734 281 aa, chain - ## HITS:1 COG:ECs4852 KEGG:ns NR:ns ## COG: ECs4852 COG0580 # Protein_GI_number: 15834106 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) # Organism: Escherichia coli O157:H7 # 1 281 1 281 281 533 100.0 1e-151 MSQTSTLKGQCIAEFLGTGLLIFFGVGCVAALKVAGASFGQWEISVIWGLGVAMAIYLTA GVSGAHLNPAVTIALWLFACFDKRKVIPFIVSQVAGAFCAAALVYGLYYNLFFDFEQTHH IVRGSVESVDLAGTFSTYPNPHINFVQAFAVEMVITAILMGLILALTDDGNGVPRGPLAP LLIGLLIAVIGASMGPLTGFAMNPARDFGPKVFAWLAGWGNVAFTGGRDIPYFLVPLFGP IVGAIVGAFAYRKLIGRHLPCDICVVEEKETTTPSEQKASL >gi|296493176|gb|ADTK01000325.1| GENE 3 2904 - 3143 498 79 aa, chain + ## HITS:1 COG:ECs4853 KEGG:ns NR:ns ## COG: ECs4853 COG3074 # Protein_GI_number: 15834107 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 79 3 81 81 85 100.0 2e-17 
MSLEVFEKLEAKVQQAIDTITLLQMEIEELKEKNNSLSQEVQNAQHQREELERENNHLKE QQNGWQERLQALLGRMEEV >gi|296493176|gb|ADTK01000325.1| GENE 4 3228 - 3713 552 161 aa, chain - ## HITS:1 COG:ECs4856 KEGG:ns NR:ns ## COG: ECs4856 COG0684 # Protein_GI_number: 15834110 # Func_class: H Coenzyme transport and metabolism # Function: Demethylmenaquinone methyltransferase # Organism: Escherichia coli O157:H7 # 1 161 1 161 161 299 100.0 1e-81 MKYDTSELCDIYQEDVNVVEPLFSNFGGRASFGGQIITVKCFEDNGLLYDLLEQNGRGRV LVVDGGGSVRRALVDAELARLAVQNEWEGLVIYGAVRQVDDLEELDIGIQAMAAIPVGAA GEGIGESDVRVNFGGVTFFSGDHLYADNTGIILSEDPLDIE >gi|296493176|gb|ADTK01000325.1| GENE 5 3806 - 4732 1040 308 aa, chain - ## HITS:1 COG:menA KEGG:ns NR:ns ## COG: menA COG1575 # Protein_GI_number: 16131768 # Func_class: H Coenzyme transport and metabolism # Function: 1,4-dihydroxy-2-naphthoate octaprenyltransferase # Organism: Escherichia coli K12 # 1 308 1 308 308 572 100.0 1e-163 MTEQQISRTQAWLESLRPKTLPLAFAAIIVGTALAWWQGHFDPLVALLALITAGLLQILS NLANDYGDAVKGSDKPDRIGPLRGMQKGVITQQEMKRALIITVVLICLSGLALVAVACHT LADFVGFLILGGLSIIAAITYTVGNRPYGYIGLGDISVLVFFGWLSVMGSWYLQAHTLIP ALILPATACGLLATAVLNINNLRDINSDRENGKNTLVVRLGEVNARRYHACLLMGSLVCL ALFNLFSLHSLWGWLFLLAAPLLVKQARYVMREMDPVAMRPMLERTVKGALLTNLLFVLG IFLSQWAA >gi|296493176|gb|ADTK01000325.1| GENE 6 4799 - 6130 1200 443 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 4 443 8 466 466 466 51 1e-131 MSEMTPREIVSELDKHIIGQDNAKRSVAIALRNRWRRMQLNEELRHEVTPKNILMIGPTG VGKTEIARRLAKLANAPFIKVEATKFTEVGYVGKEVDSIIRDLTDAAVKMVRVQAIEKNR YRAEELAEERILDVLIPPAKNNWGQTEQQQEPSAARQAFRKKLREGQLDDKEIEIDLAAA PMGVEIMAPPGMEEMTSQLQSMFQNLGGQKQKARKLKIKDAMKLLIEEEAAKLVNPEELK QDAIDAVEQHGIVFIDEIDKICKRGESSGPDVSREGVQRDLLPLVEGCTVSTKHGMVKTD HILFIASGAFQIAKPSDLIPELQGRLPIRVELQALTTSDFERILTEPNASITVQYKALMA TEGVNIEFTDSGIKRIAEAAWQVNESTENIGARRLHTVLERLMEEISYDASDLSGQNITI DADYVSKHLDALVADEDLSRFIL >gi|296493176|gb|ADTK01000325.1| GENE 7 6140 - 6670 588 176 aa, chain - ## HITS:1 COG:ECs4859 
KEGG:ns NR:ns ## COG: ECs4859 COG5405 # Protein_GI_number: 15834113 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent protease HslVU (ClpYQ), peptidase subunit # Organism: Escherichia coli O157:H7 # 1 176 1 176 176 330 99.0 7e-91 MTTIVSVRRNGHVVIAGDGQATLGNTVMKGNVKKVRRLYNDKVIAGFAGGTADAFTLFEL FERKLKMHQGHLVKAAVELAKDWRTDRMLRKLEALLAVADETASLIITGNGDVVQPENDL IAIGSGGPYAQAAARALLENTELSAREIAEKALDIAGDICIYTNHFHTIEELSYKA >gi|296493176|gb|ADTK01000325.1| GENE 8 6763 - 7617 615 284 aa, chain - ## HITS:1 COG:ECs4860 KEGG:ns NR:ns ## COG: ECs4860 COG3087 # Protein_GI_number: 15834114 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Escherichia coli O157:H7 # 1 284 36 319 319 389 99.0 1e-108 MVAIAAAVLVTFIGGLYFITHHKKEVSETLQSQKVTGNGLPPKPEERWRYIKELESRQPG VRAPTEPSAGGEVKTPEQLTPEQRQLLEQMQADMRQQPTQLVEVPWNEQTPEQRQQTLQR QRQAQQLAEQQRLAQQSRTTEQSWQQQTRTSQAAPVQAQPRQSKPASTQQPYQDLLQTPA HTTAQSKPQQAAPVTRAADAPKPTAEKKDERRWMVQCGSFRGAEQAETVRAQLAFEGFDS KITTNNGWNRVVIGPVKGKENADSTLNRLKMAGHTNCIRLAAGG >gi|296493176|gb|ADTK01000325.1| GENE 9 7814 - 8839 929 341 aa, chain - ## HITS:1 COG:ECs4861 KEGG:ns NR:ns ## COG: ECs4861 COG1609 # Protein_GI_number: 15834115 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 341 1 341 341 687 100.0 0 MKAKKQETAATMKDVALKAKVSTATVSRALMNPDKVSQATRNRVEKAAREVGYLPQPMGR NVKRNESRTILVIVPDICDPFFSEIIRGIEVTAANHGYLVLIGDCAHQNQQEKTFIDLII TKQIDGMLLLGSRLPFDASIEEQRNLPPMVMANEFAPELELPTVHIDNLTAAFDAVNYLY EQGHKRIGCIAGPEEMPLCHYRLQGYVQALRRCGIMVDPQYIARGDFTFEAGSKAMQQLL DLPQPPTAVFCHSDVMALGALSQAKRQGLKVPEDLSIIGFDNIDLTQFCDPPLTTIAQPR YEIGREAMLLLLDQMQGQHVGSGSRLMDCELIIRGSTRALP >gi|296493176|gb|ADTK01000325.1| GENE 10 8995 - 11193 2190 732 aa, chain - ## HITS:1 COG:priA KEGG:ns NR:ns ## COG: priA COG1198 # Protein_GI_number: 16131773 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication 
factor Y) - superfamily II helicase # Organism: Escherichia coli K12 # 1 732 1 732 732 1469 99.0 0 MPVAHVALPVPLPRTFDYLLPEGMTVKAGCRVRVPFGKQQERIGIVVSVSDASELPLNEL KAVVEVLDSEPVFTHSVWRLLLWAADYYHHPIGDVLFHALPILLRQGRPAANAPMWYWFA TEQGQAVDLNSLKRSPKQQQALAALRQGKIWRDQVATLEFNDAALQALRKKGLCDLASET PEFSDWRTNYAVSGERLRLNTEQATAVGAIHSAADTFSAWLLAGVTGSGKTEVYLSVLEN VLAQGKQALVMVPEIGLTPQTIARFRERFNAPVEVLHSGLNDSERLSAWLKAKNGEAAIV IGTRSALFTPFKNLGVIVIDEEHDSSYKQQEGWRYHARDLAVYRAHSEQIPIILGSATPA LETLCNVQQKKYRLLRLTRRAGNARPAIQHVLDLKGQKVQVGLAPALITRMRQHLQADNQ VILFLNRRGFAPALLCHDCGWIAECPRCDHYYTLHQAQQHLRCHHCDSQRPVPRQCPSCG STHLVPVGLGTEQLEQTLAPMFPGVPISRIDRDTTSRKGALEQQLAEVHRGGARILIGTQ MLAKGHHFPDVTLVALLDVDGALFSADFRSAERFAQLYTQVAGRAGRAGKQGEVVLQTHH PEHPLLQTLLYKGYDAFAEQALAERRMMQLPPWTSHVIVRAEDHNNQHAPLFLQQLRNLI LSSPLADDKLWVLGPVPALAPKRGGRWRWQILLQHPSRVRLQHIISGTLALINTIPDSRK VKWVLDVDPIEG >gi|296493176|gb|ADTK01000325.1| GENE 11 11396 - 11608 384 70 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15804527|ref|NP_290567.1| 50S ribosomal protein L31 [Escherichia coli O157:H7 EDL933] # 1 70 1 70 70 152 100 1e-36 MKKDIHPKYEEITASCSCGNVMKIRSTVGHDLNLDVCSKCHPFFTGKQRDVATGGRVDRF NKRFNIPGSK Prediction of potential genes in microbial genomes Time: Mon May 16 16:02:22 2011 Seq name: gi|296493175|gb|ADTK01000326.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont975.1, whole genome shotgun sequence Length of sequence - 7220 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 6, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 246 - 293 6.7 1 1 Op 1 . - CDS 396 - 614 101 ## ECS88_3296 hypothetical protein - Term 630 - 670 3.0 2 1 Op 2 . - CDS 702 - 1442 207 ## ECS88_3297 hypothetical protein - Prom 1492 - 1551 2.5 + Prom 2132 - 2191 5.1 3 2 Op 1 . + CDS 2218 - 2568 81 ## ABC2002 hypothetical protein 4 2 Op 2 . + CDS 2561 - 2743 138 ## YE2445 transposase 5 3 Tu 1 . + CDS 3160 - 3441 68 ## KPK_A0020 transposase, ISL3 family + Term 3673 - 3706 -0.5 6 4 Tu 1 . 
+ CDS 4797 - 5267 95 ## SeHA_C4687 hypothetical protein + Term 5412 - 5449 -0.9 - Term 5171 - 5205 -0.5 7 5 Tu 1 . - CDS 5264 - 6223 192 ## ECS88_3303 hypothetical protein - Prom 6360 - 6419 3.8 8 6 Tu 1 . + CDS 6329 - 7213 485 ## COG3596 Predicted GTPase Predicted protein(s) >gi|296493175|gb|ADTK01000326.1| GENE 1 396 - 614 101 72 aa, chain - ## HITS:1 COG:no KEGG:ECS88_3296 NR:ns ## KEGG: ECS88_3296 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 72 1 72 72 143 98.0 2e-33 MYLLPFSCSYFTLRFHYLLLEVPIYFGAAHLRPFTEEDGGEPYDPGCWRGKVQDWDGAGL VVEDESEFLTGQ >gi|296493175|gb|ADTK01000326.1| GENE 2 702 - 1442 207 246 aa, chain - ## HITS:1 COG:no KEGG:ECS88_3297 NR:ns ## KEGG: ECS88_3297 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 246 1 246 246 473 95.0 1e-132 MNQPIHNAYWLSRFDSLLDSALAQHRAVSFIRVDLRFPEYMPATIMDPDPDSAVISRFFE SLKAKIQAYQRKKRRANKRVHATSLRYFWCREFGKMYGRKHYHVILLLNKDTWCSLGDFS EPSSLATMIQEAWCSALHLEPWQGDGLVHFSRWTPSRKPASSDARPSSDDTPLSGGCSDT RKASDKKPGEAAVLWIKRGDVEALQKAWNRASYLVKYETKLHNGSGQRNYGCSRGPGRLL DGRRSL >gi|296493175|gb|ADTK01000326.1| GENE 3 2218 - 2568 81 116 aa, chain + ## HITS:1 COG:no KEGG:ABC2002 NR:ns ## KEGG: ABC2002 # Name: not_defined # Def: hypothetical protein # Organism: B.clausii # Pathway: not_defined # 6 109 5 110 125 78 35.0 7e-14 MNHAAVVIRDYTSAFPDPISIKKASTAVISHCDLEYRCWVWITLPSGKAGWAPQQIFSPL SANEVICLEDYTAHELSVRASEKITVIKSLNGWYWALRHSGESGWVPEECVNILDV >gi|296493175|gb|ADTK01000326.1| GENE 4 2561 - 2743 138 60 aa, chain + ## HITS:1 COG:no KEGG:YE2445 NR:ns ## KEGG: YE2445 # Name: not_defined # Def: transposase # Organism: Y.enterocolitica # Pathway: not_defined # 16 60 10 61 404 65 69.0 5e-10 MSDFQSRLLLWTKNRILNLYGPWQVASLSLDENAGSVAVTVGIAEITLLSWPVHDHRHRK >gi|296493175|gb|ADTK01000326.1| GENE 5 3160 - 3441 68 93 aa, chain + ## HITS:1 COG:no KEGG:KPK_A0020 NR:ns ## KEGG: KPK_A0020 # Name: not_defined # Def: transposase, ISL3 family # Organism: K.pneumoniae_342 # 
Pathway: not_defined # 1 89 249 337 406 130 75.0 1e-29 MDKNRQNEHPRLPVKSRRQTKGTRFRWQYSKKWMTESRQEKMTWLREQMQQTNQGWRLKE LAKDIRDRPLSEERRIDCLQWISLVTNIETGCA >gi|296493175|gb|ADTK01000326.1| GENE 6 4797 - 5267 95 156 aa, chain + ## HITS:1 COG:no KEGG:SeHA_C4687 NR:ns ## KEGG: SeHA_C4687 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 10 156 110 262 262 164 57.0 8e-40 MVRIQSEACAKQFIDVIYKNWKEIRNKNRAKWLKNNSEQIEWAWEYVTKRIPDLHVTGFC PDNTEEKHIALLVLLNILNSEDKQSQLVSVCAEEDFCKKMHDAFRKRFIDGKKDKRVQIN VKISPSAKGALDRLTRERKTTQQAILEQLILNRRLD >gi|296493175|gb|ADTK01000326.1| GENE 7 5264 - 6223 192 319 aa, chain - ## HITS:1 COG:no KEGG:ECS88_3303 NR:ns ## KEGG: ECS88_3303 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 14 319 8 317 317 423 74.0 1e-117 MTAQQSDALREIANKARVTTILQSKAWKDTQRILQRSGLVCRERSEPFDPEKHFDCYTVR YLYLLNIIALELKPDTRIKVEVGQWYRITGKHLSLNVPPFMLIPRHIRRKVDGFRQSGKT TKKPAQQFTGSLYKALSGDIDSAKLDAWIAGLPLSALEGAKEISSMGFNPWVVAKSVRRS ASPTFELFYQEYQSLVLKSVFKTEPGFEQLLTRLSFIKQSILRKQQYRVKSELLEILGNV LFFTLLYNQGCPGEFVNALVEKEATYLKASPGEEKVKAYQKMINYIEEFCNKMTEKYLIS AAKRHYQKKKITRIGSGES >gi|296493175|gb|ADTK01000326.1| GENE 8 6329 - 7213 485 294 aa, chain + ## HITS:1 COG:ECs1395 KEGG:ns NR:ns ## COG: ECs1395 COG3596 # Protein_GI_number: 15830649 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Escherichia coli O157:H7 # 12 294 10 289 290 185 36.0 7e-47 MSFSHSSLSAQVKSYLTFLPEEIRQKILEHLHGVIHYEPVIGIMGKSGTGKSSLCNAIFQ SRICATHPLNGCPRQAHRLTLQLGERRMTLVDLPGIGETPQHDQEYRALYRQLLPELDLI IWILRADERAYAADIAMHQFLLNEGADPSRFLFVLSHADRVFPAEEWNDTEKCPSRHQEL SLATVTARVATLFPSSFPVLSVAAPVGWNLPAFVSLMIHALPPQAISAVYSHIRGEKRST RDKNHAQQAFGDAIGQSFDAAVARFSFPVWMLHLLRKARDRIIHLLSALWERLF Prediction of potential genes in microbial genomes Time: Mon May 16 16:02:40 2011 Seq name: gi|296493174|gb|ADTK01000327.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1003.1, whole genome shotgun sequence Length of 
sequence - 668 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 162 - 323 65 ## 2 2 Tu 1 . - CDS 337 - 666 105 ## COG2801 Transposase and inactivated derivatives Predicted protein(s) >gi|296493174|gb|ADTK01000327.1| GENE 1 162 - 323 65 53 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVQVNVIMVAVVGRGVYWMAGEPCKTASGARVYVTVRSLADALSSFSEKVFLK >gi|296493174|gb|ADTK01000327.1| GENE 2 337 - 666 105 109 aa, chain - ## HITS:1 COG:yi22 KEGG:ns NR:ns ## COG: yi22 COG2801 # Protein_GI_number: 16132094 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 1 109 193 301 301 228 99.0 2e-60 VEWLTDNGSCYRANETRQFARMLGLEPKNTAVRSPESNGIAESFVKTIKRDYISVMPKPD GLTAAKNLAEAFEHYNEWHPHSALGYRSPREYLRQRACNGLSDNRCLEI Prediction of potential genes in microbial genomes Time: Mon May 16 16:02:45 2011 Seq name: gi|296493173|gb|ADTK01000328.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1011.1, whole genome shotgun sequence Length of sequence - 805 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 805 400 ## G2583_1970 exonuclease family protein Predicted protein(s) >gi|296493173|gb|ADTK01000328.1| GENE 1 1 - 805 400 268 aa, chain - ## HITS:1 COG:no KEGG:G2583_1970 NR:ns ## KEGG: G2583_1970 # Name: ydfE # Def: exonuclease family protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 268 184 451 823 508 98.0 1e-142 EWPEIQAEMRIWKKRREGERKETGKYTSVVDLARARANQQHTENSTGKINPVIAAIHREY KQTWKTLDDELAYALWPGDVDAGNIDGSIHRWAKNEVIDNDREDWKRISASMRKQPDALR YDRQTIFGLVRERPIDIHKDPVALNKYITEYLTTKGVFEDEGTNQSATDTLSSPVPETDA VETAIPDNEKTECKVEVEPSVEREGPFYFLFTDKDGEKYGRANKLSGLDKALSAGATEIT KEEYFARKNGTYSGSQQNTGASDTTAQP Prediction of potential genes in microbial genomes Time: Mon May 16 16:02:49 2011 Seq name: gi|296493172|gb|ADTK01000329.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1020.1, whole genome shotgun sequence Length of sequence - 1351 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 106 - 165 3.3 1 1 Tu 1 . 
+ CDS 386 - 1349 618 ## COG3468 Type V secretory pathway, adhesin AidA Predicted protein(s) >gi|296493172|gb|ADTK01000329.1| GENE 1 386 - 1349 618 321 aa, chain + ## HITS:1 COG:flu KEGG:ns NR:ns ## COG: flu COG3468 # Protein_GI_number: 16129941 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 1 321 53 373 1091 470 93.0 1e-132 MKRHLNTCYRLVWNHITGAFVVASGLARARGKRGGVAVALSLAAVTSLPVLAADIVVHPG ETVNGGTLANHDNQIVFGTTNGMTISTGLEYGPDNEANTGGQWIQNGGTANNTTVTGGGL QRVNAGGSVSDTVISAGGGQSLQGQAVNTTLNGGEQWVHEDGIATGTVINEKGWQAIKSG AVATDTVVNTGAEGGPDAENGDTGQTVYGDAVRTTINKNGRQIVAAEGTANTTVVYAGGD QTVHGHALDTTLNGGYQYVHNGGTASGTVVNSDGWQIVKNGGVAGNTTVNQKGRLQVDAG GTATNVTLKQGGALVTSTAAT Prediction of potential genes in microbial genomes Time: Mon May 16 16:02:50 2011 Seq name: gi|296493171|gb|ADTK01000330.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1025.1, whole genome shotgun sequence Length of sequence - 1398 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 969 670 ## COG3468 Type V secretory pathway, adhesin AidA Predicted protein(s) >gi|296493171|gb|ADTK01000330.1| GENE 1 3 - 969 670 322 aa, chain - ## HITS:1 COG:Z1211 KEGG:ns NR:ns ## COG: Z1211 COG3468 # Protein_GI_number: 15800732 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli O157:H7 EDL933 # 1 322 57 378 1005 499 97.0 1e-141 MKRHLNTSYRLVWNHITGTLVVASELARSRGKGAGVAVALSLAAVTSVPALAADSIVQAG ETVNGGTLENHDNQIVLGTANGMTISTGLELGPDSEENTGGQWIQNGGIAGNTTVTTNGR QVVLEGGTASDTVIRDGGGQSLNGLAVNTTLNNRGEQWVHEGGVATGTIINRDGYQSVKS GGLATGTIINTGAEGGPDSDNSYTGQKVQGTAESTTINKNGRQIILFSGIARDTLIYAGG DQSVHGRALNTTLNGGYQYVHKDGLALNTVINEGGWQVVKAGGAVGNTTINQNGELRVHA GGEATAVTQNTGGALVTSTAAT Prediction of potential genes in microbial genomes Time: Mon May 16 16:02:51 2011 Seq name: gi|296493170|gb|ADTK01000331.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1046.1, whole genome shotgun sequence Length of sequence - 668 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 13 - 582 581 ## COG3468 Type V secretory pathway, adhesin AidA Predicted protein(s) >gi|296493170|gb|ADTK01000331.1| GENE 1 13 - 582 581 189 aa, chain - ## HITS:1 COG:Z1211 KEGG:ns NR:ns ## COG: Z1211 COG3468 # Protein_GI_number: 15800732 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli O157:H7 EDL933 # 1 189 817 1005 1005 335 95.0 3e-92 MKASSDNNDFRARGWGWLGSLETGLPFSITDNLMLEPQLQYTWQGLSLDDGKDNAGYVKF GHGSAQHVRAGFRLGSHNDMNFGKGTSSRDTLRGSAKHSVRELPVNWWVQPSVIRTFSSR GDMSMGTAAAGSNMTFSPSQNGTSLDLQAGLEARVRENITLGVQAGYVHSVSGSSAEGYN GQATLNVTF Prediction of potential genes in microbial genomes Time: Mon May 16 16:02:51 2011 Seq name: gi|296493169|gb|ADTK01000332.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1046.2, whole genome shotgun sequence Length of sequence - 301 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 247 86 ## COG3468 Type V secretory pathway, adhesin AidA Predicted protein(s) >gi|296493169|gb|ADTK01000332.1| GENE 1 1 - 247 86 82 aa, chain - ## HITS:1 COG:flu KEGG:ns NR:ns ## COG: flu COG3468 # Protein_GI_number: 16129941 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 1 82 835 916 1091 131 92.0 3e-31 MRTEVAGMSVTAGVYGAAGHSSVDVKDDDGSRAGTVRDDAGSLGGYLNLIHNASGLWADI VALGTRHSMKASTDNNDFRARG Prediction of potential genes in microbial genomes Time: Mon May 16 16:02:53 2011 Seq name: gi|296493168|gb|ADTK01000333.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1085.1, whole genome shotgun sequence Length of sequence - 2124 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 269 - 1255 327 ## EcE24377A_1149 hypothetical protein 2 1 Op 2 . - CDS 1252 - 1743 369 ## COG0331 (acyl-carrier-protein) S-malonyltransferase - Prom 1776 - 1835 2.6 3 2 Tu 1 . 
+ CDS 1846 - 2122 275 ## COG2963 Transposase and inactivated derivatives Predicted protein(s) >gi|296493168|gb|ADTK01000333.1| GENE 1 269 - 1255 327 328 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_1149 NR:ns ## KEGG: EcE24377A_1149 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 328 1 328 328 554 99.0 1e-156 MIPDYLTFIRFQDKRNLIYIYAIGLILIGFYWKNAGFTFPSEDIGVVSGILALVLYNFIF DLKAYWAYKCVTKNIDFSWFKKKQNHKIELFLTQPLVAGFLSLIMLSAMSWGLYQLLPSL YALFLISLLGPLVIFLLFRMIRTSYVKQVAISVAKKVKYKSLTRYVLLSVCISTVVNLLT ISPLRNSDSFVTEGQWLTFESIIALLILCGVVLAINLFFLRFSKRYAFLGRLFLQEIDLF FSSENALSTFFAKPLWLRLFILLVIEMMWITLVSVLATLVEWRIWFEAYFLLCYVPCLIY YFFHCRCLWHNDFMMACDMYFRWGHFNK >gi|296493168|gb|ADTK01000333.1| GENE 2 1252 - 1743 369 163 aa, chain - ## HITS:1 COG:RC1116 KEGG:ns NR:ns ## COG: RC1116 COG0331 # Protein_GI_number: 15893039 # Func_class: I Lipid transport and metabolism # Function: (acyl-carrier-protein) S-malonyltransferase # Organism: Rickettsia conorii # 11 138 161 287 314 65 32.0 4e-11 MITESGIALDISCDNTPRQQVIGGTQASLNEFATLLMAAGYEPVKLGVSGAWHTRLMEDG VQAMRDYLAGLDIASPEHQVLMNVTAKSEVAPSIIKENLSLHLTHTVKWTESLDTFLNMP TPVAFLEISNKPYLGNMLNDFAGVDQQRVMHCRKAFSDAKVFK >gi|296493168|gb|ADTK01000333.1| GENE 3 1846 - 2122 275 92 aa, chain + ## HITS:1 COG:b0298 KEGG:ns NR:ns ## COG: b0298 COG2963 # Protein_GI_number: 16128283 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 1 92 4 95 102 103 100.0 7e-23 MTKTVSTSKKPRKQHSPEFRSEALKLAERIGVTAAARELSLYESQLYNWRSKQQNQQTSS ERELEMSTEIARLKRQLAERDEELAILQKAAT Prediction of potential genes in microbial genomes Time: Mon May 16 16:03:08 2011 Seq name: gi|296493167|gb|ADTK01000334.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.1, whole genome shotgun sequence Length of sequence - 38073 bp Number of predicted genes - 32, with homology - 31 Number of transcription units - 21, operones - 5 average op.length - 3.2 N Tu/Op Conserved S 
Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 59 - 442 274 ## UTI89_C4570 hypothetical protein 2 2 Tu 1 . - CDS 505 - 948 308 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Prom 968 - 1027 3.5 + Prom 924 - 983 4.8 3 3 Tu 1 . + CDS 1105 - 2034 942 ## COG1897 Homoserine trans-succinylase + Term 2080 - 2108 1.0 + Prom 2120 - 2179 7.7 4 4 Op 1 7/0.000 + CDS 2303 - 3904 1592 ## COG2225 Malate synthase 5 4 Op 2 5/0.143 + CDS 3934 - 5238 1450 ## COG2224 Isocitrate lyase + Term 5461 - 5504 2.1 + Prom 5256 - 5315 2.5 6 5 Tu 1 . + CDS 5519 - 7255 1254 ## COG4579 Isocitrate dehydrogenase kinase/phosphatase 7 6 Tu 1 . - CDS 7224 - 7964 169 ## COG0666 FOG: Ankyrin repeat - Prom 7988 - 8047 1.6 8 7 Tu 1 . - CDS 9115 - 9939 734 ## COG1414 Transcriptional regulator - Prom 10099 - 10158 3.7 + Prom 9911 - 9970 4.7 9 8 Tu 1 . + CDS 10139 - 13822 4204 ## COG1410 Methionine synthase I, cobalamin-binding domain + Term 13855 - 13891 7.1 + Prom 13840 - 13899 4.3 10 9 Tu 1 . + CDS 14042 - 15673 1811 ## COG1283 Na+/phosphate symporter + Term 15737 - 15769 3.2 - Term 15725 - 15757 3.2 11 10 Tu 1 . - CDS 15764 - 16453 840 ## COG3340 Peptidase E - Prom 16611 - 16670 3.1 12 11 Op 1 1/0.857 - CDS 16836 - 18065 1251 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 13 11 Op 2 13/0.000 - CDS 18117 - 18941 799 ## COG3716 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID 14 11 Op 3 13/0.000 - CDS 18952 - 19749 802 ## COG3715 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC 15 11 Op 4 9/0.000 - CDS 19815 - 20309 504 ## COG3444 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB 16 11 Op 5 2/0.714 - CDS 20309 - 20716 387 ## COG2893 Phosphotransferase system, mannose/fructose-specific component IIA 17 11 Op 6 2/0.714 - CDS 20726 - 21532 194 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 18 11 Op 7 . 
- CDS 21602 - 22549 1015 ## COG2390 Transcriptional regulator, contains sigma factor-related N-terminal domain - Prom 22763 - 22822 5.8 + Prom 22745 - 22804 5.2 19 12 Op 1 1/0.857 + CDS 22897 - 23076 171 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases 20 12 Op 2 . + CDS 23095 - 23769 701 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases 21 13 Tu 1 . - CDS 23770 - 24042 370 ## ECH74115_5501 hypothetical protein - Prom 24164 - 24223 5.9 22 14 Op 1 . + CDS 24572 - 25699 1045 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs 23 14 Op 2 . + CDS 25705 - 26496 684 ## COG3541 Predicted nucleotidyltransferase - Term 26488 - 26518 -1.0 24 15 Tu 1 . - CDS 26532 - 27458 414 ## ECIAI1_4248 putative zeta toxin; poison-antidote element - Prom 27483 - 27542 5.5 25 16 Tu 1 . - CDS 27603 - 28952 1330 ## COG0527 Aspartokinases + Prom 29383 - 29442 5.4 26 17 Tu 1 . + CDS 29477 - 31126 1989 ## COG0166 Glucose-6-phosphate isomerase + Term 31144 - 31180 7.3 + Prom 31277 - 31336 7.0 27 18 Tu 1 . + CDS 31481 - 31723 190 ## + Term 31795 - 31822 0.1 + Prom 31742 - 31801 2.6 28 19 Op 1 . + CDS 31837 - 32475 637 ## ECO103_4775 putative lipoprotein 29 19 Op 2 . + CDS 32472 - 33209 605 ## ECSE_4317 hypothetical protein 30 19 Op 3 . + CDS 33209 - 35305 2192 ## G2583_4854 hypothetical protein + Term 35313 - 35359 6.0 + Prom 35710 - 35769 2.5 31 20 Tu 1 . + CDS 35900 - 36310 552 ## COG3223 Predicted membrane protein + Term 36349 - 36393 1.9 - Term 36304 - 36343 7.3 32 21 Tu 1 . 
- CDS 36354 - 37829 1481 ## COG0477 Permeases of the major facilitator superfamily - Prom 37952 - 38011 5.3 Predicted protein(s) >gi|296493167|gb|ADTK01000334.1| GENE 1 59 - 442 274 127 aa, chain + ## HITS:1 COG:no KEGG:UTI89_C4570 NR:ns ## KEGG: UTI89_C4570 # Name: yjaA # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 127 13 139 139 219 96.0 2e-56 MSVLYIQISRNKITVRDLESKREVSGDAAFSNQRLLIANFFVAEKVLHDLVLQLHPRSTW HSFLPAKRMDIVVSALEMNEGGLSQVEERILHEVVAGATLMKYRQFHIHAQSVVLSDSAV MAMLKQK >gi|296493167|gb|ADTK01000334.1| GENE 2 505 - 948 308 147 aa, chain - ## HITS:1 COG:yjaB KEGG:ns NR:ns ## COG: yjaB COG0454 # Protein_GI_number: 16131838 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Escherichia coli K12 # 1 147 1 147 147 294 96.0 4e-80 MVISIRRSRHEEGEGLVAIWCRSVDATHDFLSAEYRAELEELVRSFLPEAPLWVAVNERE QPVGFMLLSGQHMDALFIDPDVRGCGVGRMLVEHALSMAPELTTNVNEQNEQAVGFYKKV GFKVTGRSEVDDLGKPYPLLNLAYVGA >gi|296493167|gb|ADTK01000334.1| GENE 3 1105 - 2034 942 309 aa, chain + ## HITS:1 COG:ECs4931 KEGG:ns NR:ns ## COG: ECs4931 COG1897 # Protein_GI_number: 15834185 # Func_class: E Amino acid transport and metabolism # Function: Homoserine trans-succinylase # Organism: Escherichia coli O157:H7 # 1 309 1 309 309 623 99.0 1e-178 MPFRVPDELPAVNFLREENVFVMTTSRASGQEIRPLKVLILNLMPKKIETENQFLRLLSN SPLQVDIQLLRIDSRESRNTPAEHLNNFYCNFEDIQEQNFDGLIVTGAPLGLVEFNDVAY WPQIKQVLEWSKDHVTSTLFVCWAVQAALNILYGIPKQTRTDKLSGVYEHHILHPHALLT RGFDDSFLAPHSRYADFPAALIRDYTDLEILAETEEGDAYLFASKDKRIAFVTGHPEYDA QTLAQEYFRDVEAGLGPEVPYNYFPHNDPQNTPRASWRSHGNLLFTNWLNYYVYQITPYD LRHMNPTLD >gi|296493167|gb|ADTK01000334.1| GENE 4 2303 - 3904 1592 533 aa, chain + ## HITS:1 COG:aceB KEGG:ns NR:ns ## COG: aceB COG2225 # Protein_GI_number: 16131840 # Func_class: C Energy production and conversion # Function: Malate synthase # Organism: Escherichia coli K12 # 1 533 1 533 533 1100 98.0 0 
MTEQATTTDELAFIRPYGEQEKQILTAEAVEFLTELVTHFTPQRNKLLAARIQQQQDIDN GTLPDFISEIASIRDMDWKIRGIPADLQDRRVEITGPVERKMVINALNANVKVFMADFED SLAPDWNKVIDGQINLRDAVNGTISYTNEAGKIYQLKPNPAVLICRVRGLHLPEKHVTWR GEAIPGSLFDFALYFFHNYQALLAKGSGPYFYLPKTQSWQEAAWWSEVFSYAEDRFNLPR GTIKATLLIETLPAVFQMDEILHALRDHIVGLNCGRWDYIFSYIKTLKNYPDRVLPDRQA VTMDKPFLNAYSRLLIKTCHKRGAFAMGGMAAFIPSKDEERNNQVLNKVKADKALEANNG HDGTWIAHPGLADTAMAVFNDILGSRKNQLEVMREQDAPITADQLLAPCDGERTEEGMRA NIRVAVQYIEAWISGNGCVPIYGLMEDAATAEISRTSIWQWIHHQKTLSNGKPVTKALFR QMLGEEMKVIASELGEERFSQGRFDDAARLMEQITTSDELIDFLTLPGYRLLA >gi|296493167|gb|ADTK01000334.1| GENE 5 3934 - 5238 1450 434 aa, chain + ## HITS:1 COG:aceA KEGG:ns NR:ns ## COG: aceA COG2224 # Protein_GI_number: 16131841 # Func_class: C Energy production and conversion # Function: Isocitrate lyase # Organism: Escherichia coli K12 # 1 434 1 434 434 866 100.0 0 MKTRTQQIEELQKEWTQPRWEGITRPYSAEDVVKLRGSVNPECTLAQLGAAKMWRLLHGE SKKGYINSLGALTGGQALQQAKAGIEAVYLSGWQVAADANLAASMYPDQSLYPANSVPAV VERINNTFRRADQIQWSAGIEPGDPRYVDYFLPIVADAEAGFGGVLNAFELMKAMIEAGA AAVHFEDQLASVKKCGHMGGKVLVPTQEAIQKLVAARLAADVTGVPTLLVARTDADAADL ITSDCDPYDSEFITGERTSEGFFRTHAGIEQAISRGLAYAPYADLVWCETSTPDLELARR FAQAIHAKYPGKLLAYNCSPSFNWQKNLDDKTIASFQQQLSDMGYKFQFITLAGIHSMWF NMFDLANAYAQGEGMKHYVEKVQQPEFAAAKDGYTFVSHQQEVGTGYFDKVTTIIQGGTS SVTALTGSTEESQF >gi|296493167|gb|ADTK01000334.1| GENE 6 5519 - 7255 1254 578 aa, chain + ## HITS:1 COG:aceK KEGG:ns NR:ns ## COG: aceK COG4579 # Protein_GI_number: 16131842 # Func_class: T Signal transduction mechanisms # Function: Isocitrate dehydrogenase kinase/phosphatase # Organism: Escherichia coli K12 # 1 578 1 578 578 1189 99.0 0 MPRGLELLIAQTILQGFDAQYGRFLEVTSGAQQRFEQADWHAVQQAMKNRIHLYDHHVGL VVEQLRCITNGQSTDAAFLLRVKEHYTRLLPDYPRFEIAESFFNSVYCRLFDHRSLTPER LFIFSSQPERRFRTIPRPLAKDFHPDHGWESLLMRVISDLPLRLRWQNKSRDIHYIIRHL TETLGTDNLAESHLQVANELFYRNKAAWLVGKLITPSGTLPFLLPIHQTDDGELFIDTCL TTTAEASIVFGFARSYFMVYAPLPAALVEWLREILPGKTTAELYMAIGCQKHAKTESYRE YLVYLQGCNEQFIEAPGIRGMVMLVFTLPGFDRVFKVIKDKFAPQKEMSAAHVRACYQLV 
KEHDRVGRMADTQEFENFVLEKRHISPALMELLLQEAAEKITDLGEQIVIRHLYIERRMV PLNIWLEQVEGQQLRDAIEEYGNAIRQLAAANIFPGDMLFKNFGVTRHGRVVFYDYDEIC YMTEVNFRDIPPPRYPEDELASEPWYSVSPGDVFPEEFRHWLCADPRIGPLFEEMHADLF RADYWRALQNRIREGHVEDVYAYRRRQRFSVRYGEMLF >gi|296493167|gb|ADTK01000334.1| GENE 7 7224 - 7964 169 246 aa, chain - ## HITS:1 COG:arp KEGG:ns NR:ns ## COG: arp COG0666 # Protein_GI_number: 16131843 # Func_class: R General function prediction only # Function: FOG: Ankyrin repeat # Organism: Escherichia coli K12 # 94 246 576 728 728 286 94.0 2e-77 MSESKEDIKHYSLMDFMNVDYSLLKWSNDHIINQSVAIIPALPKEQLLMLKGSVDEITPP LSPATMNLLMAIGQNHQLTQLMIQLQKMPELHRTEMLTAYNSGHMNVINTIFNALPTLFN TFKFDKKNMKPLLLANNSNEYPGLFSAIQHKQQNVVETVYLALSNHARLFGFTAEDIMDF WQHKAPQKYSAFELAFELGHRVIAELILNTLNKMAESFGFTDNPRYIAEKNYMEALLKKA SPHTVR >gi|296493167|gb|ADTK01000334.1| GENE 8 9115 - 9939 734 274 aa, chain - ## HITS:1 COG:ECs4936 KEGG:ns NR:ns ## COG: ECs4936 COG1414 # Protein_GI_number: 15834190 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 274 14 287 287 526 100.0 1e-149 MVAPIPAKRGRKPAVATAPATGQVQSLTRGLKLLEWIAESNGSVALTELAQQAGLPNSTT HRLLTTMQQQGFVRQVGELGHWAIGAHAFMVGSSFLQSRNLLAIVHPILRNLMEESGETV NMAVLDQSDHEAIIIDQVQCTHLMRMSAPIGGKLPMHASGAGKAFLAQLSEEQVTKLLHR KGLHAYTHATLVSPVHLKEDLAQTRKRGYSFDDEEHALGLRCLAACIFDEHREPFAAISI SGPISRITDDRVTEFGAMVIKAAKEVTLAYGGMR >gi|296493167|gb|ADTK01000334.1| GENE 9 10139 - 13822 4204 1227 aa, chain + ## HITS:1 COG:metH_2 KEGG:ns NR:ns ## COG: metH_2 COG1410 # Protein_GI_number: 16131845 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I, cobalamin-binding domain # Organism: Escherichia coli K12 # 327 1227 1 901 901 1833 99.0 0 MSSKVEQLRAQLNERILVLDGGMGTMIQSYRLNEADFRGERFADWPCDLKGNNDLLVLSK PEVIAAIHNAYFEAGADIIETNTFNSTTIAMADYQMESLSAEINFAAAKLARACADEWTA RTPEKPRYVAGVLGPTNRTASISPDVNDPAFRNITFDQLVAAYRESTKALVEGGADLILI ETVFDTLNAKAAVFAVKTEFEALGVELPIMISGTITDASGRTLSGQTTEAFYNSLRHAEA 
LTFGLNCALGPDELRQYVQELSRIAECYVTAHPNAGLPNAFGEYDLDADTMAKQIREWAQ AGFLNIVGGCCGTTPQHIAAMSRAVEGLAPRKLPEIPVACRLSGLEPLNIGEDSLFVNVG ERTNVTGSAKFKRLIKEEKYSEALDVARQQVENGAQIIDINMDEGMLDAEAAMVRFLNLI AGEPDIARVPIMIDSSKWDVIEKGLKCIQGKGIVNSISMKEGVDAFIHHAKLLRRYGAAV VVMAFDEQGQADTRARKIEICRRAYKILTEEVGFPPEDIIFDPNIFAVATGIEEHNNYAQ DFIGACEDIKRELPHALISGGVSNVSFSFRGNDPVREAIHAVFLYYAIRNGMDMGIVNAG QLAIYDDLPAELRDAVEDVILNRRDDGTERLLELAEKYRGSKTDDTANAQQAEWRSWEVN KRLEYSLVKGITEFIEQDTEEARQQATRPIEVIEGPLMDGMNVVGDLFGEGKMFLPQVVK SARVMKQAVAYLEPFIEASKEQGKTNGKMVIATVKGDVHDIGKNIVGVVLQCNNYEIVDL GVMVPAEKILRTAKEVNADLIGLSGLITPSLDEMVNVAKEMERQGFTIPLLIGGATTSKA HTAVKIEQNYSGPTVYVQNASRTVGVVAALLSDTQRDDFVARTRKEYETVRIQHGRKKPR TPPVTLEAARDNDFAFDWQAYTPPVAHRLGVQEVEASIETLRNYIDWTPFFMTWSLAGKY PRILEDEVVGVEAQRLFKDANDMLDKLSAEKTLNPRGVVGLFPANRVGDDIEIYRDETRT HVINVSHHLRQQTEKTGFANYCLADFVAPKLSGKADYIGAFAVTGGLEEDALADAFEAQH DDYNKIMVKALADRLAEAFAEYLHERVRKVYWGYAPNENLSNEELIRENYQGIRPAPGYP ACPEHTEKATIWELLEVEKHTGMKLTESFAMWPGASVSGWYFSHPDSKYYAVAQIQRDQV EDYARRKGMSVSDVERWLAPNLGYDAD >gi|296493167|gb|ADTK01000334.1| GENE 10 14042 - 15673 1811 543 aa, chain + ## HITS:1 COG:yjbB KEGG:ns NR:ns ## COG: yjbB COG1283 # Protein_GI_number: 16131846 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Escherichia coli K12 # 1 543 1 543 543 945 100.0 0 MLTLLHLLSAVALLVWGTHIVRTGVMRVFGARLRTVLSRSVEKKPLAFCAGIGVTALVQS SNATTMLVTSFVAQDLVALAPALVIVLGADVGTALMARILTFDLSWLSPLLIFIGVIFFL GRKQSRAGQLGRVGIGLGLILLALELIVQAVTPITQANGVQVIFASLTGDILLDALIGAM FAIISYSSLAAVLLTATLTAAGIISFPVALCLVIGANLGSGLLAMLNNSAANAAARRVAL GSLLFKLVGSLIILPFVHLLAETMGKLSLPKAELVIYFHVFYNLVRCLVMLPFVDPMARF CKTIIRDEPELDTQLRPKHLDVSALDTPTLALANAARETLRIGDAMEQMMEGLNKVMHGE PRQEKELRKLADDINVLYTAIKLYLARMPKEELAEEESRRWAEIIEMSLNLEQASDIVER MGSEIADKSLAARRAFSLDGLKELDALYEQLLSNLKLAMSVFFSGDVTSARRLRRSKHRF RILNRRYSHAHVDRLHQQNVQSIETSSLHLGLLGDMQRLNSLFCSVAYSVLEQPDEDEGR DEY >gi|296493167|gb|ADTK01000334.1| GENE 11 15764 - 16453 840 229 aa, chain - ## HITS:1 COG:ECs4939 KEGG:ns NR:ns ## COG: ECs4939 
COG3340 # Protein_GI_number: 15834193 # Func_class: E Amino acid transport and metabolism # Function: Peptidase E # Organism: Escherichia coli O157:H7 # 1 229 1 229 229 452 100.0 1e-127 MELLLLSNSTLPGKAWLEHALPLIAEQLQGRRSAVFIPFAGVTQTWDDYTAKTAAVLAPL GVSVTGIHSVVDPVAAIENAEIVIVGGGNTFQLLKQCRERGLLAPITDVVKRGALYIGWS AGANLACPTIRTTNDMPIVDPQGFDALNLFPLQINPHFTNALPEGHKGETREQRIRELLV VAPELTIIGLPEGNWITVSKGHATLGGPNTTYVFKAGEEAVPLEAGHRF >gi|296493167|gb|ADTK01000334.1| GENE 12 16836 - 18065 1251 409 aa, chain - ## HITS:1 COG:Z5613 KEGG:ns NR:ns ## COG: Z5613 COG1063 # Protein_GI_number: 15804608 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli O157:H7 EDL933 # 1 409 5 413 413 850 99.0 0 MKTTALRLYGKRDLRLETFDLPEMQEDEILATVVTDSLCLSSWKEANLGENHKKVPDDVA TNPIIIGHEFCGDILAVGKKWQHKFQPGQRYVIQANLQLPDRPDCPGYSFPWVGGEATHV VIPNEVMEQDCLLAYDGETYFEGSLVEPLSCVIGAFNANYHLQEGSYNHTMGIRPQGRML ILGGTGPMGLLAIDYALHGPVNPSLLVITDTDNDKLSYARKHYPSEPQTLIHYLNAADAA FDTLMALSGGHGFDDIFVFVPNEGLVTLASSLLATDGCLNFFAGPQDKHFSAPINFYDVH YAFTHYVGTSGGNTDDMRAAVKLIEEKKVQAAKVVTHILGLNAAGETTLELPAVGGGKKL VYTGKSLPLTSLTQIQDQALAAILARHQGIWSGEAEQYLLTHAEAISHD >gi|296493167|gb|ADTK01000334.1| GENE 13 18117 - 18941 799 274 aa, chain - ## HITS:1 COG:Z5614 KEGG:ns NR:ns ## COG: Z5614 COG3716 # Protein_GI_number: 15804609 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID # Organism: Escherichia coli O157:H7 EDL933 # 1 274 1 274 274 501 99.0 1e-142 MEQRKITRSDLVSMFLRSNLQQASFNFERIHGLGFCYDMIPAIKRLYPLKEDQVAALRRH LVFFNTTPAVCGPVIGVTAAMEEARANGAEIDDGTINGIKVGLMGPLAGVGDPLVWGTLR PITAALGASLALSGNILGPLLFFFIFNAVRLAMKWYGLQLGFRKGVNIVSDMGGNVLQKL TEGASILGLFVMGVLVTKWTSINVPLVVSQTHAADGSTVTMTVQNILDQLCPGLLALGLT LLMVRLLNKKINPVWLIFALFGLGIIGNALGFLS >gi|296493167|gb|ADTK01000334.1| GENE 14 18952 - 19749 802 265 aa, chain - ## HITS:1 
COG:ECs5000 KEGG:ns NR:ns ## COG: ECs5000 COG3715 # Protein_GI_number: 15834254 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC # Organism: Escherichia coli O157:H7 # 1 265 1 265 265 427 98.0 1e-119 MEISTLQIIAIFLFSCIAGMGSVLDEFQTHRPLIACTVIGLILGDLKTGIILGGTLELIA LGWMNVGAAQSPDSALASIISAILVIVGQQSIATGIAIALPVAAAGQVLTVFARTITVVF QHAADKAAEEARFRTLDILHVSALGVQALRVAIPALIVSLFVSADMVSNMLSAIPEFVTR GLQIAGGFIVVVGYAMVLRMMGVKYLMPFFFLGFLAGGYLDLSLLAFGGVGVIMALVYIQ LNPQWRKAEPQPQATTSTALDQLDD >gi|296493167|gb|ADTK01000334.1| GENE 15 19815 - 20309 504 164 aa, chain - ## HITS:1 COG:ECs5001 KEGG:ns NR:ns ## COG: ECs5001 COG3444 # Protein_GI_number: 15834255 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB # Organism: Escherichia coli O157:H7 # 1 164 1 164 164 307 98.0 7e-84 MNITLARIDDRLIHGQVTTVWSKVANAQRIIICNDEVYNDEVRRTLLRQAAPPGMKVNVV NIEKAVAVYHNPQYQDETVFYLFTRPQDALAMVRQGVKIDTLNIGGMAWRPGKKQLTKAV SLDDDDINAFHELNNLGVILDLRVVASDPSINIIDKINEQLIAN >gi|296493167|gb|ADTK01000334.1| GENE 16 20309 - 20716 387 135 aa, chain - ## HITS:1 COG:ECs5002 KEGG:ns NR:ns ## COG: ECs5002 COG2893 # Protein_GI_number: 15834256 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose-specific component IIA # Organism: Escherichia coli O157:H7 # 1 135 1 135 135 255 98.0 2e-68 MVNAIFCAHGKLACAMLESVQMVYGNANVEAVEFVPGENAGDIVAKLEKLVSIHNQDEWL IAVDLQCGSPWNAAAMLAMRNPRLRVISGLSLPLALELVDNQDSMNVDELCEHLTQIAKQ TCVVWKQLATAEEDF >gi|296493167|gb|ADTK01000334.1| GENE 17 20726 - 21532 194 268 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 7 265 1 240 242 79 29 3e-14 MQTWLNLQDKIIIVTGGASGIGLAIVEELLAQGANVQMVDIHGGDGQYEGHKGYQFWPTD ISSAKEVNHTVAEIIQRFGRIDGLVNNAGVNFPRLLVDEKAPAGQYELNEAAFEKMVNIN 
QKGVFLMSQAVARQMVKQHDGVIVNVSSESGLEGSEGQSCYAATKAALNSFTRSWSKELG KHGIRVVGIAPGILEKTGLRTPEYEEALAWTRNITVEQLREGYTKNAIPIGRAGRLAEVA DFVCYLLSKRASYITGVTTNIAGGKTRG >gi|296493167|gb|ADTK01000334.1| GENE 18 21602 - 22549 1015 315 aa, chain - ## HITS:1 COG:ECs5004 KEGG:ns NR:ns ## COG: ECs5004 COG2390 # Protein_GI_number: 15834258 # Func_class: K Transcription # Function: Transcriptional regulator, contains sigma factor-related N-terminal domain # Organism: Escherichia coli O157:H7 # 1 315 1 315 315 627 99.0 1e-180 MENSDDIRLIVKIAQLYYEQDMTQAQIARELRIYRTTISRLLKRGRDQGIVTIAINYDYN ENLWLEQQLKQKFGLKDVVVVSGNDEDEETQLAMMGLHGAQLLDRLLEPGDIVGFSWGRA VSALVENLPQAGQSRQLICVPIIGGPSGKLESRYHVNTLTYSAAAKLKGESHLADFPALL DNPLIRNGIMQSQHFKTISAYWDNLDVALVGIGSPAIRDGANWHAFYGGEESDDLNARQV AGDICSRFFDIHGAMVETNMSEKTLSIEMNKLKQARYSIGIAMSEEKYSGIIGALRGKYI NCLVTNSSTAELLLK >gi|296493167|gb|ADTK01000334.1| GENE 19 22897 - 23076 171 59 aa, chain + ## HITS:1 COG:yjbC KEGG:ns NR:ns ## COG: yjbC COG1187 # Protein_GI_number: 16131848 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Escherichia coli K12 # 1 59 1 59 290 117 98.0 5e-27 MLPDSSVRLNKYISESGICSRREADRYIEQGNVFLNGKRATIGDQVKPGDIVKVNGQLI >gi|296493167|gb|ADTK01000334.1| GENE 20 23095 - 23769 701 224 aa, chain + ## HITS:1 COG:ECs5005 KEGG:ns NR:ns ## COG: ECs5005 COG1187 # Protein_GI_number: 15834259 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Escherichia coli O157:H7 # 1 224 67 290 290 390 99.0 1e-109 MVLIALNKPVGIVSTTEDGERDNIVDFVNHSKRVFPIGRLDKDSQGLIFLTNHGDLVNKI LRAGNDHEKEYLVTVDKPITDEFIRGMGAGVPILGTVTKKCKVKKEAPFVFRITLVQGLN RQIRRMCEHFGYEVKKLERTRIMNVSLSGIPLGEWRDLTDDELIDLFKLIENSSSEVKPK AKAKPKTAGIKRPVVKMEKTAEKGGRPASNGKRFTSPGRKKKGR >gi|296493167|gb|ADTK01000334.1| GENE 21 23770 - 24042 370 90 aa, chain - ## HITS:1 COG:no 
KEGG:ECH74115_5501 NR:ns ## KEGG: ECH74115_5501 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 90 1 90 90 116 100.0 2e-25 MALPRITQKEMTEREQRELKTLLDRARIAHGRVLTNSETNSIKKEYIDKLMVEREAEAKK ARQLKKKQAYKPDPEASFSWSANTSTRGRR >gi|296493167|gb|ADTK01000334.1| GENE 22 24572 - 25699 1045 375 aa, chain + ## HITS:1 COG:PA4582 KEGG:ns NR:ns ## COG: PA4582 COG0330 # Protein_GI_number: 15599778 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Pseudomonas aeruginosa # 1 374 3 379 381 384 56.0 1e-106 MIKKISVRKDQLALLSRNGDYYKVLHAGEHLLPWLNTPEVLLITLDGSEVPDVLADYLRR FQPDWVEKYCLVADLSEIEAGALYMDGILQEILPPSTRRLYWRVEDDLTLVRMNTQQVQV QTEVMNAVLQPRRKGAVKGRDAILTVQVPAWHVGVLKIDGETQALLPPGLTAYWKINHLV DAEVVDTRLQVLEVSGQEILTKDKVNLRINLAANWRYSDVLLAFSQLTKPIDHLYRELQF ALREAVGTRTLDELLEDKQVIDDVVSEQVKSRMLPFGMEIASLGVKDIVLPGDMKNILAQ LVEAEKSAQANVIRRREETAATRSLLNTAKVMENNPVALRLKELETLERVAERIDNISVF GGLDQVLHGLVNIKG >gi|296493167|gb|ADTK01000334.1| GENE 23 25705 - 26496 684 263 aa, chain + ## HITS:1 COG:RSc0225 KEGG:ns NR:ns ## COG: RSc0225 COG3541 # Protein_GI_number: 17544944 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferase # Organism: Ralstonia solanacearum # 6 260 21 274 281 287 59.0 1e-77 MPLNGVSAAMRERVSQQLKEIERRYDVKVLYACESGSRGWGFASPDSDYDVRFLYVHPLE WYLRVESPRDVIELPIDDELDVSGWEWRKALGLLKGANPTLIEWLDSPVVYQQDEETITA FKAMVPMWFSPLRARWHYYSMAQKNFPGYLQGDEVRLKKYFYVLRPLLAVRWVEAGKGVP PMRFAELLAGSELDAPLRAEIDELLERKQRAGEAEYGPRRPLLHAFIRAELARGEIPPLL PDSREGDVKELDSLMYQTVMRRA >gi|296493167|gb|ADTK01000334.1| GENE 24 26532 - 27458 414 308 aa, chain - ## HITS:1 COG:no KEGG:ECIAI1_4248 NR:ns ## KEGG: ECIAI1_4248 # Name: not_defined # Def: putative zeta toxin; poison-antidote element # Organism: E.coli_IAI1 # Pathway: not_defined # 1 308 3 310 310 592 99.0 1e-168 MATLMEKDALLNGASQCIAFLSNIIDNCSVSSHQDSGDALKRLVSYRDYLYSTPAELVDF 
TQGKILLQQVRTQYQHEFNNTTHSENKASFDSIWQRLTNHEATPQQHPIGFVLGGQPGAG KSSLIELAKRETKDNIMIINGDDFRFLHPDFNHIYQNYGDDFVTHTAKFSGETVERAIER AIVSKLNIVVEGTFRNAATPLQTLKKLKDAGYQTEVMIKTTSAALSWESTNERYNKDKEA GNIARKVDKNHHDIVTGLLAENARKVFASNLADKFAVYSREKMIFSSQAATNDDIATLIQ NEISGNTQ >gi|296493167|gb|ADTK01000334.1| GENE 25 27603 - 28952 1330 449 aa, chain - ## HITS:1 COG:ECs5007 KEGG:ns NR:ns ## COG: ECs5007 COG0527 # Protein_GI_number: 15834261 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Escherichia coli O157:H7 # 1 449 1 449 449 785 99.0 0 MSEIVVSKFGGTSVADFDAMNRSADIVLSDANVRLVVLSASAGITNLLVALAEGLEPGER FEKLDAIRNIQFAILERLRYPNVIREEIERLLENITVLAEAAALATSPALTDELVSHGEL MSTLLFVEILRERDVQAQWFDVRKVMRTNDRFGRAEPDVAALAELAAVQLLPRLNEGLVI TQGFIGSENKGRTTTLGRGGSDYTAALLAEALHASRVDIWTDVPGIYTTDPRVVSAAKRI DEIAFAEAAEMATFGAKVLHPATLLPAVRSDIPVFVGSSKDPRAGGTLVCNKTENPPLFR ALALRRNQTLLTLHSLNMLHSRGFLAEVFGILARHNISVDLITTSEVSVALTLDTTGSTS TGDTLLTQSLLMELSALCRVEVEEGLALVALIGNDLSKACGVGKEVFGVLEPFNIRMICY GASSHNLCFLVPGEDAEQVVQKLHFNLFE >gi|296493167|gb|ADTK01000334.1| GENE 26 29477 - 31126 1989 549 aa, chain + ## HITS:1 COG:ECs5008 KEGG:ns NR:ns ## COG: ECs5008 COG0166 # Protein_GI_number: 15834262 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate isomerase # Organism: Escherichia coli O157:H7 # 1 548 1 548 549 1132 100.0 0 MKNINPTQTAAWQALQKHFDEMKDVTIADLFAKDGDRFSKFSATFDDQMLVDYSKNRITE ETLAKLQDLAKECDLAGAIKSMFSGEKINRTENRAVLHVALRNRSNTPILVDGKDVMPEV NAVLEKMKTFSEAIISGEWKGYTGKAITDVVNIGIGGSDLGPYMVTEALRPYKNHLNMHF VSNVDGTHIAEVLKKVNPETTLFLVASKTFTTQETMTNAHSARDWFLKAAGDEKHVAKHF AALSTNAKAVGEFGIDTANMFEFWDWVGGRYSLWSAIGLSIVLSIGFDNFVELLSGAHAM DKHFSTTPAEKNLPVLLALIGIWYNNFFGAETEAILPYDQYMHRFAAYFQQGNMESNGKY VDRNGNVVDYQTGPIIWGEPGTNGQHAFYQLIHQGTKMVPCDFIAPAITHNPLSDHHQKL LSNFFAQTEALAFGKSREVVEQEYRDQGKDPATLDYVVPFKVFEGNRPTNSILLREITPF SLGALIALYEHKIFTQGVILNIFTFDQWGVELGKQLANRILPELKDDKEISSHDSSTNGL INRYKAWRD >gi|296493167|gb|ADTK01000334.1| GENE 27 31481 - 31723 190 80 aa, chain + ## HITS:0 
COG:no KEGG:no NR:no MKKVLYGIFAISALAATSAWAAPVQVGEAAGSAATSVSAGSSSATSVSTVSSAVGVALAA TGGGDGSNTGTTTTTTTSTQ >gi|296493167|gb|ADTK01000334.1| GENE 28 31837 - 32475 637 212 aa, chain + ## HITS:1 COG:no KEGG:ECO103_4775 NR:ns ## KEGG: ECO103_4775 # Name: yjbF # Def: putative lipoprotein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 212 1 212 212 414 100.0 1e-114 MKRPALILICLLLQACSATTKELGNSLWDSLFGTPGVQLTDDDIQNMPYASQYMQLNGGP QLFVVLAFAEDGQQKWVTQDQATLVTQHGRLVKTLLGGDNLIEVNNLAADPLIKPAQIVD GATWTRTMGWTEYQQVRYATARSVFKWDGTDTVKVGSDETPVRVLDEEVSTDQARWHNRY WIDSEGQIRQSEQYLGADYFPVKTTLIKAAKQ >gi|296493167|gb|ADTK01000334.1| GENE 29 32472 - 33209 605 245 aa, chain + ## HITS:1 COG:no KEGG:ECSE_4317 NR:ns ## KEGG: ECSE_4317 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 245 1 245 245 430 99.0 1e-119 MIKQTIVALILSVGASSVFAAGTVKVFSNGSSEAKTLTGAEYLIDLVGQPRLANSWWPGA VISEELATAAALRQQQALLTRLAELAADSSADDAAAINALRQQIQALKVTGRQKINLDPD IVRVAERGNPPLQGNYTLWVGPPPSTVTLFGLISRPGKQPFTPGRDVASYLSDQSLLSGA DRSYAWVVYPDGRTQKAPVAYWNKRHVEPMPGSIIYVGLADSVWSETPDALNADILQTLT QRIPQ >gi|296493167|gb|ADTK01000334.1| GENE 30 33209 - 35305 2192 698 aa, chain + ## HITS:1 COG:no KEGG:G2583_4854 NR:ns ## KEGG: G2583_4854 # Name: yjbH # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 698 1 698 698 1413 99.0 0 MKKRHLLSLLALGISTACYGETYPAPIGPSQSDFGGVGLLQTPTARMAREGELSLNYRDN DQYRYYSASVQLFPWLETTLRYTDVRTRQYSSVEAFSGDQTYKDKAFDLKLRLWEESYWL PQVAVGARDIGGTGLFDAEYLVASKAWGPFDFTLGLGWGYLGTSGNVKNPLCSASDKYCY RDNSYKQAGSIDGSQMFHGPASLFGGVEYQTPWQPLRLKLEYEGNNYQQDFAGKLEQKSK FNVGAIYRVTDWADVNLSYERGNTFMFGVTLRTNFNDLRPSYNDNARPQYQPQPQDAILQ HSVVANQLTLLKYNAGLADPQIQAKGDTLYVTGEQVKYRDSREGIIRANQIVMNDLPDGI KTIRITENRLNMPQVTTETDVASLKNHLAGEPLGHETKLAQKRVEPVVPKSTEQGWYIDK SRFDFHIDPVLNQSVGGPENFYMYQLGVMGTADLWLTDHLLTTGSLFANLANNYDKFNYT NPPQDSHLPRVRTHVREYVQNDVYVNNLQANYFQHLGNGFYGQVYGGYLETMFGGAGAEV LYRPLDSNWAFGLDANYVKQRDWRSAKDMMKFTDYSVKTGHLTAYWTPSFAQDVLVKASV 
GQYLAGDKGGTLEIAKRFDSGVVVGGYATITNVSKEEYGEGDFTKGVYVSVPLDLFSSGP TRSRAAIGWTPLTRDGGQQLGRKFQLYDMTSDRSVNFR >gi|296493167|gb|ADTK01000334.1| GENE 31 35900 - 36310 552 136 aa, chain + ## HITS:1 COG:yjbA KEGG:ns NR:ns ## COG: yjbA COG3223 # Protein_GI_number: 16131856 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 136 1 136 136 199 100.0 2e-51 MTSLSRPRVEFISTILQTVLNLGLLCLGLILVVFLGKETVHLADVLFAPEQTSKYELVEG LVVYFLYFEFIALIVKYFQSGFHFPLRYFVYIGITAIVRLIIVDHKSPLDVLIYSAAILL LVITLWLCNSKRLKRE >gi|296493167|gb|ADTK01000334.1| GENE 32 36354 - 37829 1481 491 aa, chain - ## HITS:1 COG:ECs5014 KEGG:ns NR:ns ## COG: ECs5014 COG0477 # Protein_GI_number: 15834268 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 491 1 491 491 936 99.0 0 MNTQYNSSYIFSITLVATLGGLLFGYDTAVISGTVESLNTVFVAPQNLSESAANSLLGFC VASALIGCIIGGALGGYCSSRFGRRDSLKIAAVLFFISGVGSAWPELGFTSINPDNTVPV YLAGYVPEFVIYRIIGGIGVGLASMLSPMYIAELAPAHIRGKLVSFNQFAIIFGQLLVYC VNYFIARSGDASWLNTDGWRYMFASECIPALLFLMLLYTVPESPRWLMSRGKQEQAEGIL RKIMGNTLATQAVQEIKHSLDHGRKTGGRLLMFGVGVIVIGVMLSILQQFVGINVVLYYA PEVFKTLGASTDIALLQTIIVGVINLTFTVLAIMTVDKFGRKPLQIIGALGMAIGMFSLG TAFYTQAPGIVALLSMLFYVAAFAMSWGPVCWVLLSEIFPNAIRGKALAIAVAAQWLANY FVSWTFPMMDKNSWLVAHFHNGFSYWIYGCMGVLAALFMWKFVPETKGKTLEELEALWEP ETKKTQQTATL Prediction of potential genes in microbial genomes Time: Mon May 16 16:04:05 2011 Seq name: gi|296493166|gb|ADTK01000335.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.2, whole genome shotgun sequence Length of sequence - 86802 bp Number of predicted genes - 75, with homology - 75 Number of transcription units - 42, operones - 15 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 20/0.000 - CDS 83 - 973 1087 ## COG3833 ABC-type maltose transport systems, 
permease component 2 1 Op 2 19/0.000 - CDS 988 - 2532 1595 ## COG1175 ABC-type sugar transport systems, permease components - Prom 2608 - 2667 5.0 - Term 2576 - 2621 2.7 3 1 Op 3 . - CDS 2686 - 3876 1455 ## COG2182 Maltose-binding periplasmic proteins/domains + Prom 4134 - 4193 2.0 4 2 Op 1 5/0.231 + CDS 4241 - 5356 1196 ## COG3839 ABC-type sugar transport systems, ATPase components 5 2 Op 2 . + CDS 5431 - 6768 1584 ## COG4580 Maltoporin (phage lambda and maltose receptor) + Term 6954 - 7004 1.3 + Prom 6867 - 6926 4.2 6 3 Tu 1 . + CDS 7011 - 7931 761 ## ECUMN_4571 maltose regulon periplasmic protein + Term 7964 - 7995 3.2 + Prom 8823 - 8882 4.9 7 4 Tu 1 . + CDS 9003 - 9740 348 ## COG1357 Uncharacterized low-complexity proteins + Prom 9880 - 9939 4.2 8 5 Op 1 6/0.077 + CDS 9963 - 10460 564 ## COG3161 4-hydroxybenzoate synthetase (chorismate lyase) 9 5 Op 2 . + CDS 10473 - 11345 973 ## COG0382 4-hydroxybenzoate polyprenyltransferase and related prenyltransferases + Term 11358 - 11387 2.1 - Term 11343 - 11379 5.8 10 6 Tu 1 . - CDS 11500 - 13983 2589 ## COG2937 Glycerol-3-phosphate O-acyltransferase - Prom 14094 - 14153 4.7 + Prom 13835 - 13894 4.0 11 7 Tu 1 . + CDS 14094 - 14462 441 ## COG0818 Diacylglycerol kinase + Term 14475 - 14512 -1.0 + Prom 14485 - 14544 8.2 12 8 Op 1 1/0.923 + CDS 14572 - 15180 563 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) + Term 15192 - 15231 8.9 13 8 Op 2 . + CDS 15253 - 16578 1335 ## COG0534 Na+-driven multidrug efflux pump 14 9 Tu 1 . + CDS 16694 - 16903 355 ## COG3237 Uncharacterized protein conserved in bacteria 15 10 Tu 1 . - CDS 16945 - 17460 492 ## COG0735 Fe2+/Zn2+ uptake regulation proteins - Prom 17691 - 17750 4.5 + Prom 17568 - 17627 6.9 16 11 Op 1 . + CDS 17778 - 18062 148 ## EcE24377A_4600 hypothetical protein 17 11 Op 2 . + CDS 18053 - 18793 311 ## SFV_4165 hypothetical protein + Term 18835 - 18868 3.1 + Prom 18876 - 18935 6.1 18 12 Tu 1 . 
+ CDS 19174 - 20193 1086 ## COG0042 tRNA-dihydrouridine synthase + Term 20225 - 20260 5.0 + Prom 20241 - 20300 4.4 19 13 Tu 1 . + CDS 20327 - 20569 417 ## ECSE_4343 phage shock protein G + Term 20702 - 20743 5.4 - Term 20693 - 20726 4.1 20 14 Tu 1 . - CDS 20735 - 21718 829 ## COG0604 NADPH:quinone reductase and related Zn-dependent oxidoreductases - Prom 21791 - 21850 1.7 + Prom 21687 - 21746 3.2 21 15 Op 1 9/0.077 + CDS 21801 - 23216 1513 ## COG0305 Replicative DNA helicase 22 15 Op 2 5/0.231 + CDS 23269 - 24348 911 ## COG0787 Alanine racemase + Prom 24361 - 24420 3.7 23 16 Tu 1 . + CDS 24601 - 25794 1212 ## COG1448 Aspartate/tyrosine/aromatic aminotransferase + Term 25897 - 25935 -0.5 - Term 25792 - 25827 6.5 24 17 Tu 1 . - CDS 26015 - 26557 298 ## EC55989_4548 hypothetical protein + Prom 26699 - 26758 7.2 25 18 Tu 1 . + CDS 26920 - 27633 570 ## COG3700 Acid phosphatase (class B) + Prom 27661 - 27720 3.3 26 19 Op 1 4/0.538 + CDS 27744 - 28160 353 ## COG0432 Uncharacterized conserved protein 27 19 Op 2 . + CDS 28164 - 28520 460 ## COG2315 Uncharacterized protein conserved in bacteria - Term 28517 - 28547 3.4 28 20 Tu 1 . - CDS 28555 - 31377 3043 ## COG0178 Excinuclease ATPase subunit - Prom 31438 - 31497 5.0 + Prom 31545 - 31604 5.6 29 21 Tu 1 . + CDS 31631 - 32167 716 ## COG0629 Single-stranded DNA-binding protein + Term 32207 - 32238 4.1 + Prom 32206 - 32265 5.3 30 22 Tu 1 . + CDS 32476 - 33573 257 ## COG0582 Integrase 31 23 Op 1 4/0.538 + CDS 34349 - 35122 451 ## COG0582 Integrase 32 23 Op 2 . + CDS 35115 - 37148 344 ## COG4688 Uncharacterized protein conserved in bacteria 33 23 Op 3 . + CDS 37141 - 37560 62 ## Tcr_0331 hypothetical protein 34 23 Op 4 . + CDS 37635 - 42398 318 ## Gmet_2092 hypothetical protein + Term 42442 - 42474 4.0 35 24 Tu 1 . - CDS 42936 - 43757 209 ## Glov_0061 hypothetical protein - Term 44121 - 44155 5.2 36 25 Tu 1 . - CDS 44162 - 44443 283 ## ECSE_4354 hypothetical protein + Prom 44746 - 44805 3.7 37 26 Tu 1 . 
+ CDS 44873 - 46459 1123 ## COG4943 Predicted signal transduction protein containing sensor and EAL domains - Term 46415 - 46456 7.7 38 27 Tu 1 . - CDS 46462 - 46785 309 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 46826 - 46885 5.8 + Prom 46788 - 46847 5.6 39 28 Tu 1 . + CDS 46871 - 47335 494 ## COG0789 Predicted transcriptional regulators + Term 47337 - 47363 -0.7 + Prom 47694 - 47753 4.7 40 29 Tu 1 . + CDS 47880 - 49229 1557 ## COG2252 Permeases + Term 49245 - 49288 9.2 + Prom 49269 - 49328 3.8 41 30 Tu 1 . + CDS 49381 - 51030 1681 ## COG0025 NhaP-type Na+/H+ and K+/H+ antiporters - Term 51059 - 51088 1.4 42 31 Tu 1 . - CDS 51184 - 51927 379 ## COG1357 Uncharacterized low-complexity proteins - Prom 52150 - 52209 5.4 - Term 52607 - 52644 3.1 43 32 Op 1 10/0.000 - CDS 52654 - 54303 2154 ## COG4147 Predicted symporter 44 32 Op 2 5/0.231 - CDS 54300 - 54614 358 ## COG3162 Predicted membrane protein - Prom 54666 - 54725 4.2 - Term 54732 - 54776 4.1 45 33 Tu 1 . - CDS 54815 - 56773 2225 ## COG0365 Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases - Prom 56999 - 57058 7.0 46 34 Op 1 . + CDS 57165 - 58601 1486 ## COG3303 Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit 47 34 Op 2 . + CDS 58646 - 59212 355 ## G2583_4896 NrfB, formate-dependent nitrite reductase 48 34 Op 3 8/0.077 + CDS 59209 - 59880 434 ## COG0437 Fe-S-cluster-containing hydrogenase components 1 49 34 Op 4 5/0.231 + CDS 59877 - 60833 1231 ## COG3301 Formate-dependent nitrite reductase, membrane component + Term 60834 - 60870 -0.9 50 34 Op 5 7/0.077 + CDS 60916 - 62571 1234 ## COG1138 Cytochrome c biogenesis factor 51 34 Op 6 9/0.077 + CDS 62564 - 62947 353 ## COG3088 Uncharacterized protein involved in biosynthesis of c-type cytochromes 52 34 Op 7 4/0.538 + CDS 62944 - 63540 683 ## COG4235 Cytochrome c biogenesis factor + Prom 63575 - 63634 7.1 53 35 Tu 1 . 
+ CDS 63882 - 65195 1679 ## COG1301 Na+/H+-dicarboxylate symporters - Term 65191 - 65226 3.3 54 36 Tu 1 4/0.538 - CDS 65273 - 65962 589 ## COG0790 FOG: TPR repeat, SEL1 subfamily - Prom 65986 - 66045 5.1 - Term 66010 - 66041 4.1 55 37 Op 1 5/0.231 - CDS 66056 - 67735 1800 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing 56 37 Op 2 3/0.692 - CDS 67784 - 68203 321 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing - Prom 68265 - 68324 4.2 - Term 68357 - 68394 6.2 57 38 Op 1 2/0.846 - CDS 68401 - 69867 1546 ## COG1538 Outer membrane protein 58 38 Op 2 . - CDS 69864 - 70346 488 ## COG1289 Predicted membrane protein 59 38 Op 3 6/0.077 - CDS 70413 - 71915 1154 ## COG1289 Predicted membrane protein 60 38 Op 4 . - CDS 71915 - 72946 1013 ## COG1566 Multidrug resistance efflux pump 61 38 Op 5 . - CDS 72965 - 73240 159 ## SSON_4264 formate dehydrogenase H - Prom 73363 - 73422 13.3 - Term 73399 - 73440 10.3 62 39 Tu 1 . - CDS 73449 - 75434 2083 ## COG2015 Alkyl sulfatase and related hydrolases - Prom 75637 - 75696 5.9 63 40 Op 1 2/0.846 - CDS 75707 - 76636 507 ## COG1940 Transcriptional regulator/sugar kinase 64 40 Op 2 2/0.846 - CDS 76620 - 77315 672 ## COG0036 Pentose-5-phosphate-3-epimerase 65 40 Op 3 21/0.000 - CDS 77326 - 78306 1055 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 66 40 Op 4 16/0.000 - CDS 78285 - 79817 175 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 - Prom 79838 - 79897 2.0 67 40 Op 5 . - CDS 79944 - 80879 856 ## COG1879 ABC-type sugar transport system, periplasmic component 68 40 Op 6 . - CDS 80938 - 81828 538 ## COG1737 Transcriptional regulators - Prom 81922 - 81981 10.4 + Prom 81910 - 81969 8.5 69 41 Op 1 . + CDS 82187 - 82636 320 ## COG0698 Ribose 5-phosphate isomerase RpiB 70 41 Op 2 . 
+ CDS 82705 - 83034 217 ## B21_03923 hypothetical protein 71 42 Op 1 3/0.692 - CDS 83181 - 83939 577 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I 72 42 Op 2 3/0.692 - CDS 83941 - 84375 593 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 73 42 Op 3 7/0.077 - CDS 84362 - 84919 431 ## COG3709 Uncharacterized component of phosphonate metabolism 74 42 Op 4 6/0.077 - CDS 84919 - 86055 1085 ## COG3454 Metal-dependent hydrolase involved in phosphonate metabolism 75 42 Op 5 . - CDS 86052 - 86774 221 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 Predicted protein(s) >gi|296493166|gb|ADTK01000335.1| GENE 1 83 - 973 1087 296 aa, chain - ## HITS:1 COG:ECs5015 KEGG:ns NR:ns ## COG: ECs5015 COG3833 # Protein_GI_number: 15834269 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type maltose transport systems, permease component # Organism: Escherichia coli O157:H7 # 1 296 1 296 296 516 99.0 1e-146 MAMVQPKSQKARLFITHLLLLLFIAAIMFPLLMVVAISLRQGNFATGSLIPEQISWDHWK LALGFSVEQADGRITPPPFPVLLWLWNSVKVAGISAIGIVALSTTCAYAFARMRFPGKAT LLKGMLIFQMFPAVLSLVALYALFDRLGEYIPFIGLNTHGGVIFAYLGGIALHVWTIKGY FETIDSSLEEAAALDGATPWQAFRLVLLPLSVPILAVVFILSFIAAITEVPVASLLLRDV NSYTLAVGMQQYLNPQNSLWGDFAAAAVMSALPITIVFLLAQRWLVNGLTAGGVKG >gi|296493166|gb|ADTK01000335.1| GENE 2 988 - 2532 1595 514 aa, chain - ## HITS:1 COG:malF KEGG:ns NR:ns ## COG: malF COG1175 # Protein_GI_number: 16131859 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Escherichia coli K12 # 1 514 1 514 514 978 100.0 0 MDVIKKKHWWQSDALKWSVLGLLGLLVGYLVVLMYAQGEYLFAITTLILSSAGLYIFANR KAYAWRYVYPGMAGMGLFVLFPLVCTIAIAFTNYSSTNQLTFERAQEVLLDRSWQAGKTY NFGLYPAGDEWQLALSDGETGKNYLSDAFKFGGEQKLQLKETTAQPEGERANLRVITQNR QALSDITAILPDGNKVMMSSLRQFSGTQPLYTLDGDGTLTNNQSGVKYRPNNQIGFYQSI TADGNWGDEKLSPGYTVTTGWKNFTRVFTDEGIQKPFLAIFVWTVVFSLITVFLTVAVGM 
VLACLVQWEALRGKAVYRVLLILPYAVPSFISILIFKGLFNQSFGEINMMLSALFGVKPA WFSDPTTARTMLIIVNTWLGYPYMMILCMGLLKAIPDDLYEASAMDGAGPFQNFFKITLP LLIKPLTPLMIASFAFNFNNFVLIQLLTNGGPDRLGTTTPAGYTDLLVNYTYRIAFEGGG GQDFGLAAAIATLIFLLVGALAIVNLKATRMKFD >gi|296493166|gb|ADTK01000335.1| GENE 3 2686 - 3876 1455 396 aa, chain - ## HITS:1 COG:ECs5017 KEGG:ns NR:ns ## COG: ECs5017 COG2182 # Protein_GI_number: 15834271 # Func_class: G Carbohydrate transport and metabolism # Function: Maltose-binding periplasmic proteins/domains # Organism: Escherichia coli O157:H7 # 1 396 1 396 396 764 100.0 0 MKIKTGARILALSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIK VTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTW DAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEP YFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAE AAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSAGINAASPNKE LAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIP QMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITK >gi|296493166|gb|ADTK01000335.1| GENE 4 4241 - 5356 1196 371 aa, chain + ## HITS:1 COG:ECs5018 KEGG:ns NR:ns ## COG: ECs5018 COG3839 # Protein_GI_number: 15834272 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Escherichia coli O157:H7 # 1 371 1 371 371 729 100.0 0 MASVQLQNVTKAWGEVVVSKDINLDIHEGEFVVFVGPSGCGKSTLLRMIAGLETITSGDL FIGEKRMNDTPPAERGVGMVFQSYALYPHLSVAENMSFGLKLAGAKKEVINQRVNQVAEV LQLAHLLDRKPKALSGGQRQRVAIGRTLVAEPSVFLLDEPLSNLDAALRVQMRIEISRLH KRLGRTMIYVTHDQVEAMTLADKIVVLDAGRVAQVGKPLELYHYPADRFVAGFIGSPKMN FLPVKVTATAIDQVQVELPMPNRQQVWLPVESRDVQVGANMSLGIRPEHLLPSDIADVIL EGEVQVVEQLGNETQIHIQIPSIRQNLVYRQNDVVLVEEGATFAIGLPPERCHLFREDGT ACRRLHKEPGV >gi|296493166|gb|ADTK01000335.1| GENE 5 5431 - 6768 1584 445 aa, chain + ## HITS:1 COG:ECs5019 KEGG:ns NR:ns ## COG: ECs5019 COG4580 # Protein_GI_number: 15834273 # Func_class: G Carbohydrate transport and metabolism # Function: Maltoporin (phage lambda and maltose receptor) # Organism: Escherichia coli 
O157:H7 # 1 445 2 446 446 853 99.0 0 MITLRKLPLAVAVAAGVMSAQAMAVDFHGYARSGIGWTGSGGEQQCFQTTGAQSKYRLGN ECETYAELKLGQEVWKEGDKSFYFDTNVAYSVAQQNDWEATDPAFREANVQGKNLIEWLP GSTIWAGKRFYQRHDVHMIDFYYWDISGPGAGLENIDVGFGKLSLAATRSSEAGGSSSFA SNNIYDYTNETANDVFDVRLAQMEINPGGTLELGVDYGRANLRDNYRLVDGASKDGWLFT AEHTQSVLKGFNKFVVQYATDSMTSQGKGLSQGSGVAFDNEKFAYNINNNGHMLRILDHG AISMGDNWDMMYVGMYQDINWDNDNGTKWWTVGIRPMYKWTPIMSTVMEIGYDNVESQRT GDKNNQYKITLAQQWQAGDSIWSRPAIRVFATYAKWDEKWGYDYNGDSKVNPNYGKAVPA DFNGGSFGRGDSDEWTFGAQMEIWW >gi|296493166|gb|ADTK01000335.1| GENE 6 7011 - 7931 761 306 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_4571 NR:ns ## KEGG: ECUMN_4571 # Name: malM # Def: maltose regulon periplasmic protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 306 1 306 306 496 100.0 1e-139 MKMNKSLIALCLSAGLLASAPGISLADVNYVPQNTSDAPAIPSAALQQLTWTPVDQSKTQ TTQLATGGQQLNVPGISGPVAAYSVPANIGELTLTLTSEVNKQTSVFAPNVLILDQNMTP SAFFPSSYFTYQEPGVMSADRLEGVMRLTPALGQQKLYVLVFTTEKDLQQTTQLLDPAKA YAKGVGNSIPDIPDPVARHTTDGLLKLKVKTNSSSSVLVGPLFGSSAPAPVTVGNTAAPA VAAPAPAPVKKSEPMLNDTESYFNTAIKNAVAKGDVDKALKLLDEAERLGSTSARSTFIS SVKGKG >gi|296493166|gb|ADTK01000335.1| GENE 7 9003 - 9740 348 245 aa, chain + ## HITS:1 COG:yjbI_1 KEGG:ns NR:ns ## COG: yjbI_1 COG1357 # Protein_GI_number: 16131864 # Func_class: S Function unknown # Function: Uncharacterized low-complexity proteins # Organism: Escherichia coli K12 # 1 51 198 248 248 99 98.0 6e-21 MIDTLPDNAMILKSVLAVKLVMQLKILNIVNKNFIENMKKTFSHCPYIKDPIIRSYIHSG EDNKFDDFMRQHRFSKVDFDTQQMIHFINRFNMNKGLIDKNNNFFIQLIDQALRSTDDMI KANAWYLYKEWIRSDDVSPIFIETEEKLRTFNTNKLTRNDNIFILFSSVDDGPVMVVSSQ RLHDMLNPTKDTNWNSTYIYKSRHEMLPVNLTQETLFSSKSQDKHALFPIFTASWRAHRI MNKGV >gi|296493166|gb|ADTK01000335.1| GENE 8 9963 - 10460 564 165 aa, chain + ## HITS:1 COG:ubiC KEGG:ns NR:ns ## COG: ubiC COG3161 # Protein_GI_number: 16131865 # Func_class: H Coenzyme transport and metabolism # Function: 4-hydroxybenzoate synthetase (chorismate lyase) # Organism: Escherichia coli K12 # 1 165 38 202 202 307 98.0 4e-84 
MSHPALTQLRALRYFKEIPALEPQLLDWLLLEDSMTKRFEQQGKTVSVTMIREGFVEQNE IPEELPLLPKESRYWLREILLCADGEPWLAGRTVVPVSTLSGPELALQKLGKTPLGRYLF TSSTLTRDFIEIGRDAGLWGRRSRLRLSGKPLLLTELFLPASPLY >gi|296493166|gb|ADTK01000335.1| GENE 9 10473 - 11345 973 290 aa, chain + ## HITS:1 COG:ECs5023 KEGG:ns NR:ns ## COG: ECs5023 COG0382 # Protein_GI_number: 15834277 # Func_class: H Coenzyme transport and metabolism # Function: 4-hydroxybenzoate polyprenyltransferase and related prenyltransferases # Organism: Escherichia coli O157:H7 # 1 290 1 290 290 512 99.0 1e-145 MEWSLTQNKLLAFHRLMRTDKPIGALLLLWPTLWALWVATPGVPQLWILAVFVAGVWLMR AAGCVVNDYADRKFDGHVKRTANRPLPSGAVTEKEARALFVVLVLISFLLVLTLNTMTIL LSIAALALAWVYPFMKRYTHLPQVVLGAAFGWSIPMAFAAVSESVPLSCWLMFLANILWA VAYDTQYAMVDRDDDVKIGIKSTAILFGQYDKLIIGILQIGVLALMAIIGELNGLGWGYY WSIVVAGALFVYQQKLIANREREACFKAFMNNNYVGLVLFLGLAMSYWHF >gi|296493166|gb|ADTK01000335.1| GENE 10 11500 - 13983 2589 827 aa, chain - ## HITS:1 COG:plsB KEGG:ns NR:ns ## COG: plsB COG2937 # Protein_GI_number: 16131867 # Func_class: I Lipid transport and metabolism # Function: Glycerol-3-phosphate O-acyltransferase # Organism: Escherichia coli K12 # 1 827 1 827 827 1661 100.0 0 MTFCYPCRAFALLTRGFTSFMSGWPRIYYKLLNLPLSILVKSKSIPADPAPELGLDTSRP IMYVLPYNSKADLLTLRAQCLAHDLPDPLEPLEIDGTLLPRYVFIHGGPRVFTYYTPKEE SIKLFHDYLDLHRSNPNLDVQMVPVSVMFGRAPGREKGEVNPPLRMLNGVQKFFAVLWLG RDSFVRFSPSVSLRRMADEHGTDKTIAQKLARVARMHFARQRLAAVGPRLPARQDLFNKL LASRAIAKAVEDEARSKKISHEKAQQNAIALMEEIAANFSYEMIRLTDRILGFTWNRLYQ GINVHNAERVRQLAHDGHELVYVPCHRSHMDYLLLSYVLYHQGLVPPHIAAGINLNFWPA GPIFRRLGAFFIRRTFKGNKLYSTVFREYLGELFSRGYSVEYFVEGGRSRTGRLLDPKTG TLSMTIQAMLRGGTRPITLIPIYIGYEHVMEVGTYAKELRGATKEKESLPQMLRGLSKLR NLGQGYVNFGEPMPLMTYLNQHVPDWRESIDPIEAVRPAWLTPTVNNIAADLMVRINNAG AANAMNLCCTALLASRQRSLTREQLTEQLNCYLDLMRNVPYSTDSTVPSASASELIDHAL QMNKFEVEKDTIGDIIILPREQAVLMTYYRNNIAHMLVLPSLMAAIVTQHRHISRDVLME HVNVLYPMLKAELFLRWDRDELPDVIDALANEMQRQGLITLQDDELHINPAHSRTLQLLA AGARETLQRYAITFWLLSANPSINRGTLEKESRTVAQRLSVLHGINAPEFFDKAVFSSLV LTLRDEGYISDSGDAEPAETMKVYQLLAELITSDVRLTIESATQGEG 
>gi|296493166|gb|ADTK01000335.1| GENE 11 14094 - 14462 441 122 aa, chain + ## HITS:1 COG:ECs5025 KEGG:ns NR:ns ## COG: ECs5025 COG0818 # Protein_GI_number: 15834279 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Diacylglycerol kinase # Organism: Escherichia coli O157:H7 # 1 122 1 122 122 195 100.0 2e-50 MANNTTGFTRIIKAAGYSWKGLRAAWINEAAFRQEGVAVLLAVVIACWLDVDAITRVLLI SSVMLVMIVEILNSAIEAVVDRIGSEYHELSGRAKDMGSAAVLIAIIVAVITWCILLWSH FG >gi|296493166|gb|ADTK01000335.1| GENE 12 14572 - 15180 563 202 aa, chain + ## HITS:1 COG:ECs5026 KEGG:ns NR:ns ## COG: ECs5026 COG1974 # Protein_GI_number: 15834280 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Escherichia coli O157:H7 # 1 202 1 202 202 380 100.0 1e-106 MKALTARQQEVFDLIRDHISQTGMPPTRAEIAQRLGFRSPNAAEEHLKALARKGVIEIVS GASRGIRLLQEEEEGLPLVGRVAAGEPLLAQQHIEGHYQVDPSLFKPNADFLLRVSGMSM KDIGIMDGDLLAVHKTQDVRNGQVVVARIDDEVTVKRLKKQGNKVELLPENSEFKPIVVD LRQQSFTIEGLAVGVIRNGDWL >gi|296493166|gb|ADTK01000335.1| GENE 13 15253 - 16578 1335 441 aa, chain + ## HITS:1 COG:dinF KEGG:ns NR:ns ## COG: dinF COG0534 # Protein_GI_number: 16131870 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Escherichia coli K12 # 1 441 19 459 459 676 99.0 0 MAFLTSSDKALWHLALPMIFSNITVPLLGLVDTAVIGHLDSPVYLGGVAVGATATSFLFM LLLFLRMSTTGLTAQAYGAKNPQALARALVQPLLLALGAGALIALLRTPIIDLALHIVGG SEAVLEQARRFLEIRWLSAPASLANLVLLGWLLGVQYARAPVILLVVGNILNIVLDVWLV MGLHMNVQGAALATVIAEYATLLIGLLMVRKILKLRGISGEMLKTAWRGNFRRLLALNRD IMLRSLLLQLCFGAITVLGARLGSDIIAVNAVLMTLLTFTAYALDGFAYAVEAHSGQAYG ARDGSQLLDVWRAACRQSGIVALLFSVVYLLAGEHIIALLTSLTQIQQLADRYLIWQVIL PLVGVWCYLLDGMFIGATRAAEMRNSMAVAAAGFALTLLTLPWLGNHGLWLALTVFLALR GLSLAAIWRRHWRNGTWFAAT >gi|296493166|gb|ADTK01000335.1| GENE 14 16694 - 16903 355 69 aa, chain + ## HITS:1 COG:ECs5028 KEGG:ns NR:ns ## COG: ECs5028 COG3237 # Protein_GI_number: 15834282 # Func_class: S Function unknown # 
Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 69 1 69 69 95 100.0 2e-20 MNKDEAGGNWKQFKGKVKEQWGKLTDDDMTIIEGKRDQLVGKIQERYGYQKDQAEKEVVD WETRNEYRW >gi|296493166|gb|ADTK01000335.1| GENE 15 16945 - 17460 492 171 aa, chain - ## HITS:1 COG:zur KEGG:ns NR:ns ## COG: zur COG0735 # Protein_GI_number: 16131872 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Escherichia coli K12 # 1 171 21 191 191 325 100.0 3e-89 MEKTTTQELLAQAEKICAQRNVRLTPQRLEVLRLMSLQDGAISAYDLLDLLREAEPQAKP PTVYRALDFLLEQGFVHKVESTNSYVLCHLFDQPTHTSAMFICDRCGAVKEECAEGVEDI MHTLAAKMGFALRHNVIEAHGLCAACVEVEACRHPEQCQHDHSVQVKKKPR >gi|296493166|gb|ADTK01000335.1| GENE 16 17778 - 18062 148 94 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_4600 NR:ns ## KEGG: EcE24377A_4600 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 94 1 94 340 171 97.0 7e-42 MLKIIPGATGYFNKTLNSNQFDNKDANKDKLDNIDATKDKLDNRGTIKGKLNNIYGKSID YSALRHRDIIIAKIDLFIQRITHNLWHARKKMCF >gi|296493166|gb|ADTK01000335.1| GENE 17 18053 - 18793 311 246 aa, chain + ## HITS:1 COG:no KEGG:SFV_4165 NR:ns ## KEGG: SFV_4165 # Name: yjbM # Def: hypothetical protein # Organism: S.flexneri_8401 # Pathway: not_defined # 1 246 1 246 246 436 96.0 1e-121 MFLIEQITNLKVWVNKYIDDCTDEDLNDRDFIASVVDRAIFHFAINSICNPGDNKDATPI ERCTFDVETKNGLPSTVQLFYEESKDNEPLANIHFQAIGSGFLTFVNACQEHDDNSLKLF ASLLISLSYSSAYTDLSETVYINENNESYLKAQFEKLSQRDMKKYLGEMKRLADGGEMNF DGYLDKMSHLVNEGTLDPDILSKMRDAAPQLISFAKSFDPTSKEEIKILTDTSKLIYDLF GVKSEK >gi|296493166|gb|ADTK01000335.1| GENE 18 19174 - 20193 1086 339 aa, chain + ## HITS:1 COG:ECs5031 KEGG:ns NR:ns ## COG: ECs5031 COG0042 # Protein_GI_number: 15834285 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-dihydrouridine synthase # Organism: Escherichia coli O157:H7 # 1 339 7 345 345 720 100.0 0 MQKINQTSAMPEKTDVHWSGRFSVAPMLDWTDRHCRYFLRLLSRNTLLYTEMVTTGAIIH 
GKGDYLAYSEEEHPVALQLGGSDPAALAQCAKLAEARGYDEINLNVGCPSDRVQNGMFGA CLMGNAQLVADCVKAMRDVVSIPVTVKTRIGIDDQDSYEFLCDFINTVSGKGECEMFIIH ARKAWLSGLSPKENREIPPLDYPRVYQLKRDFPHLTMSINGGIKSLEEAKAHLQHMDGVM VGREAYQNPGILAAVDREIFGSSDTDADPVAVVRAMYPYIERELSQGTYLGHITRHMLGL FQGIPGARQWRRYLSENAHKAGADINVLEHALKLVADKR >gi|296493166|gb|ADTK01000335.1| GENE 19 20327 - 20569 417 80 aa, chain + ## HITS:1 COG:no KEGG:ECSE_4343 NR:ns ## KEGG: ECSE_4343 # Name: pspG # Def: phage shock protein G # Organism: E.coli_SE11 # Pathway: not_defined # 1 80 1 80 80 97 98.0 1e-19 MLELLFVFGFFVMLMVTGVSLLGIIAALVVATAIMFLGGMLALMIKLLPWLLLAIAVGWV IKAIKAPKVPKYQRYDRWRY >gi|296493166|gb|ADTK01000335.1| GENE 20 20735 - 21718 829 327 aa, chain - ## HITS:1 COG:ECs5033 KEGG:ns NR:ns ## COG: ECs5033 COG0604 # Protein_GI_number: 15834287 # Func_class: C Energy production and conversion; R General function prediction only # Function: NADPH:quinone reductase and related Zn-dependent oxidoreductases # Organism: Escherichia coli O157:H7 # 1 327 1 327 327 630 100.0 1e-180 MATRIEFHKHGGPEVLQAVEFTPADPAENEIQVENKAIGINFIDTYIRSGLYPPPSLPSG LGTEAAGIVSKVGSGVKHIKAGDRVVYAQSALGAYSSVHNINADKAAILPAAISFEQAAA SFLKGLTVYYLLRKTYEIKPDEQFLFHAAAGGVGLIACQWAKALGAKLIGTVGTAQKAQS ALKAGAWQVINYREENLVERLKEITGGKKVRVVYDSVGRDTWERSLDCLQRRGLMVSFGN SSGAVTGVNLGILNQKGSLYVTRPSLQGYITTREELTEASNELFSLIASGVIKVDVAEQQ KYPLKDAQRAHEILESRATQGSSLLIP >gi|296493166|gb|ADTK01000335.1| GENE 21 21801 - 23216 1513 471 aa, chain + ## HITS:1 COG:dnaB KEGG:ns NR:ns ## COG: dnaB COG0305 # Protein_GI_number: 16131878 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Escherichia coli K12 # 1 471 1 471 471 892 100.0 0 MAGNKPFNKQQAEPRERDPQVAGLKVPPHSIEAEQSVLGGLMLDNERWDDVAERVVADDF YTRPHRHIFTEMARLQESGSPIDLITLAESLERQGQLDSVGGFAYLAELSKNTPSAANIS AYADIVRERAVVREMISVANEIAEAGFDPQGRTSEDLLDLAESRVFKIAESRANKDEGPK NIADVLDATVARIEQLFQQPHDGVTGVNTGYDDLNKKTAGLQPSDLIIVAARPSMGKTTF AMNLVENAAMLQDKPVLIFSLEMPSEQIMMRSLASLSRVDQTKIRTGQLDDEDWARISGT 
MGILLEKRNIYIDDSSGLTPTEVRSRARRIAREHGGIGLIMIDYLQLMRVPALSDNRTLE IAEISRSLKALAKELNVPVVALSQLNRSLEQRADKRPVNSDLRESGSIEQDADLIMFIYR DEVYHENSDLKGIAEIIIGKQRNGPIGTVRLTFNGQWSRFDNYAGPQYDDE >gi|296493166|gb|ADTK01000335.1| GENE 22 23269 - 24348 911 359 aa, chain + ## HITS:1 COG:alr KEGG:ns NR:ns ## COG: alr COG0787 # Protein_GI_number: 16131879 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Escherichia coli K12 # 1 359 1 359 359 726 100.0 0 MQAATVVINRRALRHNLQRLRELAPASKMVAVVKANAYGHGLLETARTLPDADAFGVARL EEALRLRAGGITKPVLLLEGFFDARDLPTISAQHFHTAVHNEEQLAALEEASLDEPVTVW MKLDTGMHRLGVRPEQAEAFYHRLTQCKNVRQPVNIVSHFARADEPKCGATEKQLAIFNT FCEGKPGQRSIAASGGILLWPQSHFDWVRPGIILYGVSPLEDRSTGADFGCQPVMSLTSS LIAVREHKAGEPVGYGGTWVSERDTRLGVVAMGYGDGYPRAAPSGTPVLVNGREVPIVGR VAMDMICVDLGPQAQDKAGDPVILWGEGLPVERIAEMTKVSAYELITRLTSRVAMKYVD >gi|296493166|gb|ADTK01000335.1| GENE 23 24601 - 25794 1212 397 aa, chain + ## HITS:1 COG:tyrB KEGG:ns NR:ns ## COG: tyrB COG1448 # Protein_GI_number: 16131880 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Escherichia coli K12 # 1 397 1 397 397 802 99.0 0 MFQKVDAYAGDPILTLMERFKEDPRSDKVNLSIGLYYNEDGIIPQLQAVAEAEARLNAQP HGASLYLPMEGLNSYRHAIAPLLFGADHPVLQQQRVATIQTLGGSGALKVGADFLKRYFP ESGVWVSDPTWENHVAIFAGAGFEVSTYPWYDEATNGVRFNDLLATLKTLPARSIVLLHP CCHNPTGADLTNDQWDAVIEILKARELIPFLDIAYQGFGAGMEEDAYAIRAIASAGLPAL VSNSFSKIFSLYGERVGGLSVMCEDAEAAGRVLGQLKATVRRNYSSPPNFGAQVVAAVLN DEALKASWLAEVEEMRTRILAMRQELVKVLSTEMPERNFDYLLNQRGMFSYTGLSAAQVD RLREEFGVYLIASGRMCVAGLNTANVQRVAKAFAAVM >gi|296493166|gb|ADTK01000335.1| GENE 24 26015 - 26557 298 180 aa, chain - ## HITS:1 COG:no KEGG:EC55989_4548 NR:ns ## KEGG: EC55989_4548 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 180 1 180 180 336 100.0 2e-91 MVTNFITPDGDDDMNISYVNSNKTTSLPVELDALNNKDISYAKDFSYAKDFFLYIETQLK IAKDFCRPGEEVSSSIASKVFHAFIDLVNKIRGKKDFMYICTLCCFAEEVKGDYSHYRTF 
LFDIGNQYKVKLTQSGKKELSLTLEFNDTIIESQKVTGNKAKHILEDIEKFYRNKPDTYY >gi|296493166|gb|ADTK01000335.1| GENE 25 26920 - 27633 570 237 aa, chain + ## HITS:1 COG:aphA KEGG:ns NR:ns ## COG: aphA COG3700 # Protein_GI_number: 16131881 # Func_class: R General function prediction only # Function: Acid phosphatase (class B) # Organism: Escherichia coli K12 # 1 237 1 237 237 471 99.0 1e-133 MRKITQAISAVCLLFALNSSAVALASSPSPLNPGTNVARLAEQAPIHWVSVAQIENSLAG RPPMAVGFDIDDTVLFSSPGFWRGKKTFSPESEDYLKNPVFWEKMNNGWDEFSIPKEVAR QLIDMHVRRGDAIFFVTGRSPTKTETVSKTLADNFHIPVTNMNPVIFAGDKPGQNTKSQW LQDKNIRIFYGDSDNDITAARDVGARGIRILRASNSTYKPLPQAGAFGEEVIVNSEY >gi|296493166|gb|ADTK01000335.1| GENE 26 27744 - 28160 353 138 aa, chain + ## HITS:1 COG:ECs5038 KEGG:ns NR:ns ## COG: ECs5038 COG0432 # Protein_GI_number: 15834292 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 138 1 138 138 272 98.0 2e-73 MWYQKTLTHSAKSRGFHLVTDEILNQLADMPRVNIGLLHLLLQHTSASLTLNENCDPTVR HDMERFFLRTVPDNGNYEHDYEGADDMPSHIKSSMLGTSLVLPLHKGRIQTGTWQGIWLG EHRIHGGSRRIIATLQGE >gi|296493166|gb|ADTK01000335.1| GENE 27 28164 - 28520 460 118 aa, chain + ## HITS:1 COG:ECs5039 KEGG:ns NR:ns ## COG: ECs5039 COG2315 # Protein_GI_number: 15834293 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 118 1 118 118 216 100.0 5e-57 MTISELLQYCMAKPGAEQSVHNDWKATQIKVEDVLFAMVKEVENRPAVSLKTSPELAELL RQQHSDVRPSRHLNKAHWSTVYLDGSLPDSQIYYLVDASYQQAVNLLPEEKRKLLVQL >gi|296493166|gb|ADTK01000335.1| GENE 28 28555 - 31377 3043 940 aa, chain - ## HITS:1 COG:uvrA KEGG:ns NR:ns ## COG: uvrA COG0178 # Protein_GI_number: 16131884 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Escherichia coli K12 # 1 940 1 940 940 1900 100.0 0 MDKIEVRGARTHNLKNINLVIPRDKLIVVTGLSGSGKSSLAFDTLYAEGQRRYVESLSAY ARQFLSLMEKPDVDHIEGLSPAISIEQKSTSHNPRSTVGTITEIHDYLRLLFARVGEPRC 
PDHDVPLAAQTVSQMVDNVLSQPEGKRLMLLAPIIKERKGEHTKTLENLASQGYIRARID GEVCDLSDPPKLELQKKHTIEVVVDRFKVRDDLTQRLAESFETALELSGGTAVVADMDDP KAEELLFSANFACPICGYSMRELEPRLFSFNNPAGACPTCDGLGVQQYFDPDRVIQNPEL SLAGGAIRGWDRRNFYYFQMLKSLADHYKFDVEAPWGSLSANVHKVVLYGSGKENIEFKY MNDRGDTSIRRHPFEGVLHNMERRYKETESSAVREELAKFISNRPCASCEGTRLRREARH VYVENTPLPAISDMSIGHAMEFFNNLKLAGQRAKIAEKILKEIGDRLKFLVNVGLNYLTL SRSAETLSGGEAQRIRLASQIGAGLVGVMYVLDEPSIGLHQRDNERLLGTLIHLRDLGNT VIVVEHDEDAIRAADHVIDIGPGAGVHGGEVVAEGPLEAIMAVPESLTGQYMSGKRKIEV PKKRVPANPEKVLKLTGARGNNLKDVTLTLPVGLFTCITGVSGSGKSTLINDTLFPIAQR QLNGATIAEPAPYRDIQGLEHFDKVIDIDQSPIGRTPRSNPATYTGVFTPVRELFAGVPE SRARGYTPGRFSFNVRGGRCEACQGDGVIKVEMHFLPDIYVPCDQCKGKRYNRETLEIKY KGKTIHEVLDMTIEEAREFFDAVPALARKLQTLMDVGLTYIRLGQSATTLSGGEAQRVKL ARELSKRGTGQTLYILDEPTTGLHFADIQQLLDVLHKLRDQGNTIVVIEHNLDVIKTADW IVDLGPEGGSGGGEILVSGTPETVAECEASHTARFLKPML >gi|296493166|gb|ADTK01000335.1| GENE 29 31631 - 32167 716 178 aa, chain + ## HITS:1 COG:ECs5041 KEGG:ns NR:ns ## COG: ECs5041 COG0629 # Protein_GI_number: 15834295 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Escherichia coli O157:H7 # 1 178 1 178 178 270 100.0 1e-72 MASRGVNKVILVGNLGQDPEVRYMPNGGAVANITLATSESWRDKATGEMKEQTEWHRVVL FGKLAEVASEYLRKGSQVYIEGQLRTRKWTDQSGQDRYTTEVVVNVGGTMQMLGGRQGGG APAGGNIGGGQPQGGWGQPQQPQGGNQFSGGAQSRPQQSAPAAPSNEPPMDFDDDIPF >gi|296493166|gb|ADTK01000335.1| GENE 30 32476 - 33573 257 365 aa, chain + ## HITS:1 COG:ECs3512 KEGG:ns NR:ns ## COG: ECs3512 COG0582 # Protein_GI_number: 15832766 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Escherichia coli O157:H7 # 1 340 49 382 401 204 33.0 2e-52 MLAAASNLVVLLRFLDNRGIDLEQRFLTKNFFKPYELDDLRDFAQRKQGKKLLVASSTPW LEDASSDTVDNGTLHSRLTTYAKYLGWYAMHILKTAEPGVVEQINEMLQHIKIRRPSKKR RNSEWQDRSLNDVQLDTLFEVIKPGSDLNPFSIDVQRRNRLMILLLFYLGIRGGELLNIR IQDIDFSTNRIRIFRRADELADSRTNQPHAKTRDRLLPLAESLVQELHSYIIQDRRKVRN AKKNDYLFVTYKLGPTVGNPISKGGYHKIFSVVRAVSPQLYAATGHSFRHTWNRKFSERM 
DAMNEQVSEERQEQLRSYLMGWRDGSGTAATYNKRFIQQKGFEAALALQEGNGTRLPKDM KNDDE >gi|296493166|gb|ADTK01000335.1| GENE 31 34349 - 35122 451 257 aa, chain + ## HITS:1 COG:PM1948 KEGG:ns NR:ns ## COG: PM1948 COG0582 # Protein_GI_number: 15603813 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Pasteurella multocida # 15 241 269 496 506 154 36.0 1e-37 MVENRLGFELQEDDRQQIPLFPDLDVLATVQSPYEFRQLFATDKLHIPAAKITNTLQYII EKSDIRSERTGELLHINARRFRYTTGTRAAKEGFGELVIAELLDHTDTQNAGVYIKNIPE HVKKLDEAVGFQLAPYAQAFVGVLVDSERDAHRGNDPASRIRTEIGHGVGTCGEHGFCGA NVPIPCYTCMHFQPWLDGPHEDVYQGLLNERERVKEITGDIQIAAILDRSIIAVADVIMR CAKRREELSPQGAIANG >gi|296493166|gb|ADTK01000335.1| GENE 32 35115 - 37148 344 677 aa, chain + ## HITS:1 COG:ECs3510 KEGG:ns NR:ns ## COG: ECs3510 COG4688 # Protein_GI_number: 15832764 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 607 1 538 543 338 34.0 2e-92 MAEIFQFKPKATLTAAQNLGVFISKCRVQLIVFGSDLNWEAPVWPNITVFAKLGIITRTP MQGEVQDPAFIDFAKAYFRYQQGHHPTGTKNESKALRAIEAALIQVNGNANIDGLSISVL DEASELARQHYSDGAAYHCGREIERLAKFVVENRLVSSAVQNWVNPIRRAEDKNKTGREA KKNREKKLPSDIALNALAEIFANDPMDERDIFTTSVFAMLMSAPSRISEVLALPADCEVF ETDSEDIERYGWRFFAGKGYEGGIKWIPTVMVGVAKTAIARTKMLSENARQLAKWIEIHP DKFYRHANCPDVADDEPLTMEQSCMALGLACKSKKQCRSSLFNRGLAQEDGSHTLKSLWR HAMARLPDGFPWFDKDKGIKYSNALFALNVNQFHGNRRCIPIELYKPTNNFFNNDLTPRE TLNGKHASIFDRYNYHAENGERVKLTSHQARHLLNTIAQRGGLSNLEIAKWSGRADMKQN RAYNHMTEYELVGMSERLDPSKGLFGPAGDVAKHLPVTIQEFNTLDHAAVHVTEYGYCVH DYTMGPCEKFRDCVNCTEQVCIKGEDTEILDRIKKRLVKSERMFCIADAAVERGEMGADR WYQYHKKTVTRLRELVAILENPDIENGAQIKLRGNDFSQLRRVVAKTSIAAIKQKGKESE EAVMLDDLTTLLGGGLG >gi|296493166|gb|ADTK01000335.1| GENE 33 37141 - 37560 62 139 aa, chain + ## HITS:1 COG:no KEGG:Tcr_0331 NR:ns ## KEGG: Tcr_0331 # Name: not_defined # Def: hypothetical protein # Organism: T.crunogena # Pathway: not_defined # 1 135 1 135 136 133 51.0 2e-30 MAKHLDNRDINAIVNLIRGWQEPKLSWSVICEAAEPLVGKLPSRQSLNAHGAIVTAYQTK 
KEALKGRGPKNPRPASLNIAAARIVNLESEIEELKEENRRYKQQFVIWQYNAYKHGMTGH QLNAQLTKIDRERSDGERR >gi|296493166|gb|ADTK01000335.1| GENE 34 37635 - 42398 318 1587 aa, chain + ## HITS:1 COG:no KEGG:Gmet_2092 NR:ns ## KEGG: Gmet_2092 # Name: not_defined # Def: hypothetical protein # Organism: G.metallireducens # Pathway: not_defined # 520 1587 5 1071 1071 1337 60.0 0 MVKPNWDNFKAKFNENPQDNFEWFCYLLFCQEFKIPAGIFRYKNQSGIETNPITKDNELI GWQSKFYDTKLSDNKADLIEMIGKSKKAYPGLSKIIFYTNQEWGQGRKSHEPEDDKNADN YFETVGNSNDPKIKIEVDQKAYESGIEIVWRVASFFESPFVIVENEKIAKHFFSLNESIF DLLEEKRKHTENVLYEIQTNIEFKDRSIEIDRRHCIELLHENLVQKKIVIVSGEGGVGKT AVIKKIYEAEKQCTPFYVFKASEFKKDSINELFGAHGLADFSNAHQDELRKVIVVDSAEK LLELTNIDPFKEFLTVLIKDKWQVIFTTRNNYLADLNYAFIDIYKITPGNLVIKNLERDE LIELSDNNGFSLPQDVRLLELIKNPFYLSEYLRFYTGESINYASFKEKLWNNIVVKNKPS REQCFLATAFQRASEGQFFVSPTCDTGILDALVKDGIVGYEAAGYFITHDIYEEWALEKK ISVDYIRKANNNEFFEKIGESLPVRRSFRNWISERLLLDDQSIKPFIAEIVCGEGISKFW KDELWVAVLLSDNSGIFFNYFKRYLLSSDQNLLKRLTFLLRLACKDVDYDLLKQLGVSNS DLLSIKYVLTKPKGTGWQSVIQFIHENLDEIGFRNINFILPVIQEWNQRNKVGETTRLSS LIALKYYQWTIDEDVYLSGRDNEKNILHTILHGAAMIKPEMEEILVKVLKNKWNEHGTPY FDLMTLILTDLDSHPVWVSLPEYVLQLADLFWYRPLKEKGERYHHMDIEDEFGLFRSHHD YYPESPYQTPIYWLLQSQFKKTIDFILDFTNKTTICFAHSRFAKNEIEEVDVFIEEGKFI KQYICNRLWCSYRGTQVSTYLLSSIHMALEKIFLENFKNADSKVLESWLLFLLRNTKSAS ISAVVTSIVLAFPEKTFNVAKVLFQTKDFFHFDMNRMVLDRTHKSSLISLRDGFGGADYR NSLHEEDRIKACDDVHRNTHLENLALHYQIFRSENVTEKDAIERQQVLWDIFDKYYNQLP DEAQETEADKTWRLCLARMDRRKMNITTKEKDEGIEISFNPEIDPKLKQYSEEAIKKNSE HMKYMTLKLWASYKREKDERYKNYGMYEDNPQIAIQETKEIINKLNEEEDEDFRLLNSNI PADVCSILLLENFNQLNNEEREYCKDIVLAYSKLPLKEGYNYQVLDGTTSAISALPVIYH NYPVERETIKTILLLTLFNDHSIGMASGRYSVFPSMVIHKLWLYYFDDMQSLLFGFLILK PKYVILSRKIIHESYRQADYDIKKININKVFLNNYKHCISNVIDNKISIDDLGGIDKVDL HILNTAFQLIPVDTVNIEHKQLVSLIVKRFSTSLLSSVREDRVDYALRQSFLERFAYFTL HAPVSDIPDYIKPFLEGFNGSEPISELFKKFILVEDRLNTYTKFWKVWDLFFDKVVTLCK DGDRYWYVDKIIKSYLFAESPWKENSNGWHTFKDNNSQFFCDVSRTMGHCPSTVYSLAKS LNNIASCYLNQGITWLSGILSVNKKLWENKLENDTVYYLECLVRRYINTERERIRRTKQL KEEVLVILDFLVEKGSVVGYMSRENIL >gi|296493166|gb|ADTK01000335.1| 
GENE 35 42936 - 43757 209 273 aa, chain - ## HITS:1 COG:no KEGG:Glov_0061 NR:ns ## KEGG: Glov_0061 # Name: not_defined # Def: hypothetical protein # Organism: G.lovleyi # Pathway: not_defined # 15 273 4 265 265 99 29.0 1e-19 MSAIENQETVIQAVNSLHPTTLFHFTKNEEAFYSILTELYFKPFLARERIVGVKGRRNFA VPMVSFCDIKLSQIKDHSEKYGEFGFGLTKEWAEKNDLHPVLYMNQGSELFSKYNKRIRK LKDALIPLWSKRHIPDNKERKYFESLKEEYADLYNLLRYMKNYKGKLERTNEKPIENYIY ADEKEWRYVPHPFVGDLWPSLNLERVVEPNQKAVLSKKFSEFGISFSFDDIKYILVPDDS HISKLITCLMSIRNYDPYIISKVLTMDKVKQDF >gi|296493166|gb|ADTK01000335.1| GENE 36 44162 - 44443 283 93 aa, chain - ## HITS:1 COG:no KEGG:ECSE_4354 NR:ns ## KEGG: ECSE_4354 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 93 34 126 126 166 100.0 3e-40 MATLTTGVVLLRWQLLSAVMMFLASTLNIRFRRSDYVGLAVISSGLGVVSACWFAMGLLG ITMADITAIWHNIESVMIEEMNQTPPQWPMILT >gi|296493166|gb|ADTK01000335.1| GENE 37 44873 - 46459 1123 528 aa, chain + ## HITS:1 COG:yjcC KEGG:ns NR:ns ## COG: yjcC COG4943 # Protein_GI_number: 16131887 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing sensor and EAL domains # Organism: Escherichia coli K12 # 1 528 1 528 528 1046 99.0 0 MSHRARHQLLALPGIIFLVLFPIILSLWIAFLWAKSEVNNQLRTFAQLALDKSELVIRQA DLVSDAAERYQGQVCTPAHQKRMLNIIRGYLYINELIYARDNHFLCSSLIAPVNGYTIAP ADYKREPNVSIYYYRDTPFFSGYKMTYMQRGNYVAVINPLFWSEVMSDDPTLQWGVYDTV TKTFFSLSKEASAATFSPLIHLKDLTVQRNGYLYATVYSTKRPIAAIVATSYQRLITHFY NHLIFALPAGILGSLVLLLLWLRIRQNYLSPKRKLQRALEKHQLCLYYQPIIDIKTEKCI GAEALLRWLGEQGQIMNPAEFIPLAEKEGMIEQITDYVIDNVFRDLGAYLATHADRYVSI NLSASDFHTSRLIARINQKTEQYAVRPQQIKFEVTEHAFLDVEKMTPIILAFRQAGYEVA IDDFGIGYSNLHNLKSLNVDILKIDKSFVETLTTHKTSHLIAEHIIELAHSLGLKTIAEG VETEEQVNWLRKRGVRYCQGWFFAKAMPPQVFMQWMEQLPARELTRGQ >gi|296493166|gb|ADTK01000335.1| GENE 38 46462 - 46785 309 107 aa, chain - ## HITS:1 COG:ECs5044 KEGG:ns NR:ns ## COG: ECs5044 COG2207 # Protein_GI_number: 15834298 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing 
proteins # Organism: Escherichia coli O157:H7 # 1 107 1 107 107 194 100.0 3e-50 MSHQKIIQDLIAWIDEHIDQPLNIDVVAKKSGYSKWYLQRMFRTVTHQTLGDYIRQRRLL LAAVELRTTERPIFDIAMDLGYVSQQTFSRVFRRQFDRTPSDYRHRL >gi|296493166|gb|ADTK01000335.1| GENE 39 46871 - 47335 494 154 aa, chain + ## HITS:1 COG:ECs5045 KEGG:ns NR:ns ## COG: ECs5045 COG0789 # Protein_GI_number: 15834299 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 154 1 154 154 300 99.0 5e-82 MEKKLPRIKALLTPGEVAKRSGVAVSALHFYESKGLITSIRNSGNQRRYKRDVLRYVAII KIAQRIGIPLATIGEAFGVLPEGHTLSAKEWKQLSSQWREELDRRIHTLVALRDELDGCI GCGCLSRSDCPLRNPGDRLGEEGTGTRLLEDEQN >gi|296493166|gb|ADTK01000335.1| GENE 40 47880 - 49229 1557 449 aa, chain + ## HITS:1 COG:ECs5046 KEGG:ns NR:ns ## COG: ECs5046 COG2252 # Protein_GI_number: 15834300 # Func_class: R General function prediction only # Function: Permeases # Organism: Escherichia coli O157:H7 # 1 449 1 449 449 688 99.0 0 MSTPSARTGGSLDAWFKISQRGSTVRQEVVAGLTTFLAMVYSVIVVPGMLGKAGFPPAAV FVATCLVAGLGSIVMGLWANLPLAIGCAISLTAFTAFSLVLGQHISVPVALGAVFLMGVL FTVIPATGIRSWILRNLPHGVAHGTGIGIGLFLLLIAANGVGLVIKNPLDGLPVALGDFA TFPVIMSLVGLAVIIGLEKLKVPGGILLTIIGISIVGLIFDPNVHFSGVFAMPSLSDENG NSLIGSLDIMGALNPVVLPSVLALVMTAVFDATGTIRAVAGQANLLDKDGQIIDGGKALT TDSMSSVFSGLVGAAPAAVYIESAAGTAAGGKTGLTAITVGVLFLLILFLSPLSYLVPGY ATAPALMYVGLLMLSNVAKIDFADFVDAMAGLVTAVFIVLTCNIVTGIMIGFATLVIGRL VSGEWRKLNIGTVVIAVALVTFYAGGWAI >gi|296493166|gb|ADTK01000335.1| GENE 41 49381 - 51030 1681 549 aa, chain + ## HITS:1 COG:ECs5047 KEGG:ns NR:ns ## COG: ECs5047 COG0025 # Protein_GI_number: 15834301 # Func_class: P Inorganic ion transport and metabolism # Function: NhaP-type Na+/H+ and K+/H+ antiporters # Organism: Escherichia coli O157:H7 # 1 549 1 549 549 979 100.0 0 MEIFFTILIMTLVVSLSGVVTRVMPFQIPLPLMQIAIGALLAWPTFGLHVEFDPELFLVL FIPPLLFADGWKTPTREFLEHGREIFGLALALVVVTVVGIGFLIYWVVPGIPLIPAFALA AVLSPTDAVALSGIVGEGRIPKKIMGILQGEALMNDASGLVSLKFAVAVAMGTMIFTVGG 
ATVEFMKVAIGGILAGFVVSWLYGRSLRFLSRWGGDEPATQIVLLFLLPFASYLIAEHIG VSGILAAVAAGMTITRSGVMRRAPLAMRLRANSTWAMLEFVFNGMVFLLLGLQLPGILET SLMAAEIDPNVEIWMLFTDIILIYAALMLVRFGWLWTMKKFSNRFLKKKPMEFGSWTTRE ILIASFAGVRGAITLAGVLSIPLLLPDGNVFPARYELVFLAAGVILFSLFVGVVMLPILL QHIEVADHSQQLKEERIARAATAEVAIVAIQKMEERLAADTEENIDNQLLTEVSSRVIGN LRRRADGRNDVESSVQEENLERRFRLAALRSERAELYHLRATREISNETLQKLLHDLDLL EALLIEENQ >gi|296493166|gb|ADTK01000335.1| GENE 42 51184 - 51927 379 247 aa, chain - ## HITS:1 COG:yjcF KEGG:ns NR:ns ## COG: yjcF COG1357 # Protein_GI_number: 16131892 # Func_class: S Function unknown # Function: Uncharacterized low-complexity proteins # Organism: Escherichia coli K12 # 1 247 184 430 430 404 89.0 1e-112 MYKTNFYYAIMEKILFDNCILDDSNFAQIKMADGTLNACSAMHVQFYNAAMNRANIKNTF LDYSNFYIAYMAEVNLYKVIAPYVNLFKADLSFSKLDLINFEHADLSRVNLNKAILQSIN LIDSKLFCTWLTNTFLEMVICTGSNMANVNFNNANLSNCHFNCSILTKACMFNTRLYRVN FDEASVQGMGISILRGEENIPIDSDTLVTLQKFFEEDCTSHTGMSQTEDNINAVAMKITA DIMQHAD >gi|296493166|gb|ADTK01000335.1| GENE 43 52654 - 54303 2154 549 aa, chain - ## HITS:1 COG:yjcG KEGG:ns NR:ns ## COG: yjcG COG4147 # Protein_GI_number: 16131893 # Func_class: R General function prediction only # Function: Predicted symporter # Organism: Escherichia coli K12 # 1 549 1 549 549 960 99.0 0 MKRVLTALAATLPFAANAADAISGAVERQPTNWQAIIMFLIFVVFTLGITYWASKRVRSR SDYYTAGGNITGFQNGLAIAGDYMSAASFLGISALVFTSGYDGLIYSLGFLVGWPIILFL IAERLRNLGRYTFADVASYRLKQGPIRILSACGSLVVVALYLIAQMVGAGKLIELLFGLN YHIAVVLVGVLMMMYVLFGGMLATTWVQIIKAVLLLFGASFMAFMVMKHVGFSFNNLFSE AMAVHPKGVDIMKPGGLVKDPISALSLGLGLMFGTAGLPHILMRFFTVSDAREARKSVFY ATGFMGYFYILTFIIGFGAIMLVGANPEYKDAAGHLIGGNNMAAVHLANAVGGNLFLGFI SAVAFATILAVVAGLTLAGASAVSHDLYANVFKKGATEREELRVSKITVLILGVIAIILG VLFENQNIAFMVGLAFAIAASCNFPIILLSMYWSKLTTRGAMMGGWLGLITAVVLMILGP TIWVQILGHEKAIFPYEYPALFSISVAFLGIWFFSATDNSAEGARERELFRAQFIRSQTG FGVEQGRAH >gi|296493166|gb|ADTK01000335.1| GENE 44 54300 - 54614 358 104 aa, chain - ## HITS:1 COG:yjcH KEGG:ns NR:ns ## COG: yjcH COG3162 # Protein_GI_number: 16131894 # Func_class: S Function unknown # 
Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 104 1 104 104 199 100.0 1e-51 MNGTIYQRIEDNAHFRELVEKRQRFATILSIIMLAVYIGFILLIAFAPGWLGTPLNPNTS VTRGIPIGVGVIVISFVLTGIYIWRANGEFDRLNNEVLHEVQAS >gi|296493166|gb|ADTK01000335.1| GENE 45 54815 - 56773 2225 652 aa, chain - ## HITS:1 COG:acs KEGG:ns NR:ns ## COG: acs COG0365 # Protein_GI_number: 16131895 # Func_class: I Lipid transport and metabolism # Function: Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases # Organism: Escherichia coli K12 # 1 652 1 652 652 1354 100.0 0 MSQIHKHTIPANIADRCLINPQQYEAMYQQSINVPDTFWGEQGKILDWIKPYQKVKNTSF APGNVSIKWYEDGTLNLAANCLDRHLQENGDRTAIIWEGDDASQSKHISYKELHRDVCRF ANTLLELGIKKGDVVAIYMPMVPEAAVAMLACARIGAVHSVIFGGFSPEAVAGRIIDSNS RLVITSDEGVRAGRSIPLKKNVDDALKNPNVTSVEHVVVLKRTGGKIDWQEGRDLWWHDL VEQASDQHQAEEMNAEDPLFILYTSGSTGKPKGVLHTTGGYLVYAALTFKYVFDYHPGDI YWCTADVGWVTGHSYLLYGPLACGATTLMFEGVPNWPTPARMAQVVDKHQVNILYTAPTA IRALMAEGDKAIEGTDRSSLRILGSVGEPINPEAWEWYWKKIGNEKCPVVDTWWQTETGG FMITPLPGATELKAGSATRPFFGVQPALVDNEGNPLEGATEGSLVITDSWPGQARTLFGD HERFEQTYFSTFKNMYFSGDGARRDEDGYYWITGRVDDVLNVSGHRLGTAEIESALVAHP KIAEAAVVGIPHNIKGQAIYAYVTLNHGEEPSPELYAEVRNWVRKEIGPLATPDVLHWTD SLPKTRSGKIMRRILRKIAAGDTSNLGDTSTLADPGVVEKLLEEKQAIAMPS >gi|296493166|gb|ADTK01000335.1| GENE 46 57165 - 58601 1486 478 aa, chain + ## HITS:1 COG:ECs5052 KEGG:ns NR:ns ## COG: ECs5052 COG3303 # Protein_GI_number: 15834306 # Func_class: P Inorganic ion transport and metabolism # Function: Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit # Organism: Escherichia coli O157:H7 # 1 478 1 478 478 970 99.0 0 MTRIKINARRIFSLLIPFFFFTSVHAEQTAAPPKPVTVEAKNETFAPQHPDQYLSWKATS EQSERVDALAEDPRLVILWAGYPFSRDYNKPRGHAFAVTDVRETLRTGAPKNAEDGPLPM ACWSCKSPDVARLIQKDGEDGYFHGKWARGGPEIVNNLGCADCHNTASPEFAKGKPELTL SRPYAARAMEAIGKPFEKAGRFDQQSMVCGQCHVEYYFDGKNKAVKFPWDDGMKVENMEQ YYDKIAFSDWTNSLSKTPMLKAQHPEYETWTAGIHGKNNVTCIDCHMPKVQNAEGKLYTD HKIGNPFDNFAQTCANCHTQDKAALQKVVAERKQSINDLKIKVEDQLVHAHFEAKAALDA 
GATEAEMKPIQDDIRHAQWRWDLAIASHGIHMHAPEEGLRMLGTAMDKAADARTKLARLL ATKGITHEIQIPDISTKEKAQQAIGLNMEQIKAEKQDFIKTVIPQWEEQARKNGLLSQ >gi|296493166|gb|ADTK01000335.1| GENE 47 58646 - 59212 355 188 aa, chain + ## HITS:1 COG:no KEGG:G2583_4896 NR:ns ## KEGG: G2583_4896 # Name: nrfB # Def: NrfB, formate-dependent nitrite reductase # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 188 3 190 190 362 100.0 5e-99 MSVLRSLLTAGVLASGLLWSLNGITATPAAQASDDRYEVTQQRNPDAACLDCHKPDTEGM HGKHASVINPNNKLPVTCTNCHGQPSPQHREGVKDVMRFNEPMYKVGEQNSVCMSCHLPE QLQKAFWPHDVHVTKVACASCHSLHPQQDTMQTLSDKGRIKICVDCHSDQRTNPNFNPAS VPLLKEQP >gi|296493166|gb|ADTK01000335.1| GENE 48 59209 - 59880 434 223 aa, chain + ## HITS:1 COG:ECs5054 KEGG:ns NR:ns ## COG: ECs5054 COG0437 # Protein_GI_number: 15834308 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli O157:H7 # 1 223 1 223 223 446 100.0 1e-125 MTWSRRQFLTGVGVLAAVSGTAGRVVAKTLNINGVRYGMVHDESLCIGCTACMDACREVN KVPEGVSRLTIIRSEPQGEFPDVKYRFFRKSCQHCDHAPCVDVCPTGASFRDAASGIVDV NPDLCVGCQYCIAACPYRVRFIHPVTKTADKCDFCRKTNLQAGKLPACVEACPTKALTFG NLDDPNSEISQLLRQKPTYRYKLALGTKPKLYRVPFKYGEVSQ >gi|296493166|gb|ADTK01000335.1| GENE 49 59877 - 60833 1231 318 aa, chain + ## HITS:1 COG:nrfD KEGG:ns NR:ns ## COG: nrfD COG3301 # Protein_GI_number: 16131899 # Func_class: P Inorganic ion transport and metabolism # Function: Formate-dependent nitrite reductase, membrane component # Organism: Escherichia coli K12 # 1 318 1 318 318 523 99.0 1e-148 MTQTSAFHFESLVWDWPIAIYLFLIGISAGLVTLAVLLRRFYPQAGGADSTLLRTTLIVG PGAVILGLLILVFHLTRPWTFWKLMFHYSFTSVMSMGVMLFQLYMVVLVLWLAKIFEHDL LALQQRWLPKLGIVQKVLSLLTPVHRGLETLMLVLAVLLGAYTGFLLSALKSYPFLNNPI LPVLFLFSGISSGAAVALIAMAIRQRSNPHSTEAQFVHRMEIPVVWGEIFLLVAFFVGLA LGDDGKVRALVAALGGGFWTWWFWLGVAGLGLIVPMLLKPWVNRSSGIPTVLAACGASLV GVLMLRFFILYAGQLTVA >gi|296493166|gb|ADTK01000335.1| GENE 50 60916 - 62571 1234 551 aa, chain + ## HITS:1 COG:nrfE KEGG:ns NR:ns ## COG: nrfE COG1138 # Protein_GI_number: 16131900 
# Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c biogenesis factor # Organism: Escherichia coli K12 # 1 551 2 552 552 916 98.0 0 MTPLTAFAGVRLRWPAMMRLTCIGILAQFALLLLAFGVLTYCFLISDFSVIYVAQHSYSL LSWELKLAAVWGGHEGSLLLWVLLLSAWSALFAWHYRQQTDPLFPLTLAVLSLMLAALLL FVVLWSDPFVRIFPPAIEGRDLNPMLQHPGLIFHPPLLYLGYGGLMVAASVALASLLRGE FDGACARICWRWALPGWSALTAGIILGSWWAYCELGWGGWWFWDPVENASLLPWLSATAL LHSLSLTRQREIFRHWSLLLAIVTLMLSLLGTLIVRSGILVSVHAFALDNVRAVPLFSLF ALISLASLALYGWRARDGGAVVRFSGLSREMLILATLLLFCAVLLIVLVGTLYPMIYGLL GWGRLSVGAPYFNRATLPFGLLMLVVIVLATFVSGKRVQLPALVAHAGVLLFAAGIVVSS VSRQEISLNLQPGQQVTLAGYTFRFERLDLQAKGNYTSEKAIVALFDHQQRIGELMPERR FYEARRQQMMEPSIRWNGIHDWYAVMGEKTGADRYAFRLYVQSGVRWIWGGGLLMIAGAL LSGWRGKKRDE >gi|296493166|gb|ADTK01000335.1| GENE 51 62564 - 62947 353 127 aa, chain + ## HITS:1 COG:nrfF KEGG:ns NR:ns ## COG: nrfF COG3088 # Protein_GI_number: 16131901 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Uncharacterized protein involved in biosynthesis of c-type cytochromes # Organism: Escherichia coli K12 # 1 127 1 127 127 184 99.0 3e-47 MNKGLLTLLLLFTCFARAQVVDTWQFANPQQQQQALNIASQLRCPQCQNQNLLESNAPVA VSMRHQVYSMVAEGKNEVEIIGWMTERYGDFVRYNPPLTGQTLVLWALPVVLLLLMALIL WRVRAKR >gi|296493166|gb|ADTK01000335.1| GENE 52 62944 - 63540 683 198 aa, chain + ## HITS:1 COG:nrfG KEGG:ns NR:ns ## COG: nrfG COG4235 # Protein_GI_number: 16131902 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c biogenesis factor # Organism: Escherichia coli K12 # 1 198 1 198 198 325 98.0 3e-89 MKQPKVPVKMLTTLTILMVFLCVGSYLLSPKWQAVRAEYQRQRDPLHQFASQQTPEAQLQ ALQDKIRANPQNSEQWALLGEYYLWQNDYSNSLLAYRQALQLRGENAELYAALATVLYYQ ASQHMTAQTRAMIDKALALDSNEITALMLLASDAFMQANYAQAIELWQKVMDLNSPRINR TQLVESINMAKLLQRRSD >gi|296493166|gb|ADTK01000335.1| GENE 53 63882 - 65195 1679 437 aa, chain + ## HITS:1 COG:ECs5059 KEGG:ns NR:ns ## COG: ECs5059 COG1301 # Protein_GI_number: 15834313 # Func_class: C Energy 
production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Escherichia coli O157:H7 # 1 437 1 437 437 747 99.0 0 MKSIKFSLAWQILFAMVLGILLGSYLHYHSDSRDWLVVNLLSPAGDIFIHLIKMIVVPIV ISTLVVGIAGVGDAKQLGRIGAKTIIYFEVITTVAIILGITLANVFQPGAGVDMSQLATV DISKYQSTTEAVQSSSHGIMGTILSLVPTNIVASMAKGEMLPIIFFSVLFGLGLSSLPAT HREPLVTVFRSISETMFKVTHMVMRYAPVGVFALIAVTVANFGFSSLWPLAKLVLLVHFA ILFFALVVLGIVARLCGLSVWILIRILKDELILAYSTASSESVLPRIIEKMEAYGAPASI TSFVVPTGYSFNLDGSTLYQSIAAIFIAQLYGIDLSIWQEIILVLTLMVTSKGIAGVPGV SFVVLLATLGSVGIPLEGLAFIAGVDRILDMARTALNVVGNALAVLVIAKWEHKFDRKKA LAYEREVLGKFDKTADQ >gi|296493166|gb|ADTK01000335.1| GENE 54 65273 - 65962 589 229 aa, chain - ## HITS:1 COG:ECs5060 KEGG:ns NR:ns ## COG: ECs5060 COG0790 # Protein_GI_number: 15834314 # Func_class: R General function prediction only # Function: FOG: TPR repeat, SEL1 subfamily # Organism: Escherichia coli O157:H7 # 1 229 1 229 229 433 100.0 1e-121 MKKIIALMLFLTFFAHANDSEPGSQYLKAAEAGDRRAQYFLADSWFSSGDLSKAEYWAQK AADSGDADACALLAQIKITNPVSLDYPQAKVLAEKAAQAGSKEGEVTLAHILVNTQAGKP DYPKAISLLENASEDLENDSAVDAQMLLGLIYANGVGIKADDDKATWYFKRSSAISRTGY SEYWAGMMFLNGEEGFIEKNKQKALHWLNLSCMEGFDTGCEEFEKLTNG >gi|296493166|gb|ADTK01000335.1| GENE 55 66056 - 67735 1800 559 aa, chain - ## HITS:1 COG:fdhF KEGG:ns NR:ns ## COG: fdhF COG0243 # Protein_GI_number: 16131905 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 559 157 715 715 1172 100.0 0 MSNAINEIDNTDLVFVFGYNPADSHPIVANHVINAKRNGAKIIVCDPRKIETARIADMHI ALKNGSNIALLNAMGHVIIEENLYDKAFVASRTEGFEEYRKIVEGYTPESVEDITGVSAS EIRQAARMYAQAKSAAILWGMGVTQFYQGVETVRSLTSLAMLTGNLGKPHAGVNPVRGQN NVQGACDMGALPDTYPGYQYVKDPANREKFAKAWGVESLPAHTGYRISELPHRAAHGEVR AAYIMGEDPLQTDAELSAVRKAFEDLELVIVQDIFMTKTASAADVILPSTSWGEHEGVFT AADRGFQRFFKAVEPKWDLKTDWQIISEIATRMGYPMHYNNTQEIWDELRHLCPDFYGAT YEKMGELGFIQWPCRDTSDADQGTSYLFKEKFDTPNGLAQFFTCDWVAPIDKLTDEYPMV LSTVREVGHYSCRSMTGNCAALAALADEPGYAQINTEDAKRLGIEDEALVWVHSRKGKII 
TRAQVSDRPNKGAIYMTYQWWIGACNELVTENLSPITKTPEYKYCAVRVEPIADQRAAEQ YVIDEYNKLKTRLREAALA >gi|296493166|gb|ADTK01000335.1| GENE 56 67784 - 68203 321 139 aa, chain - ## HITS:1 COG:ECs5061 KEGG:ns NR:ns ## COG: ECs5061 COG0243 # Protein_GI_number: 15834315 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli O157:H7 # 1 139 1 139 715 296 100.0 6e-81 MKKVVTVCPYCASGCKINLVVDNGKIVRAEAAQGKTNQGTLCLKGYYGWDFINDTQILTP RLKTPMIRRQRGGKLEPVSWDEALNYVAERLSAIKEKYGPDAIQTTGSSRGTGNETNYVM QKFARAVIGTNNVDCCARV >gi|296493166|gb|ADTK01000335.1| GENE 57 68401 - 69867 1546 488 aa, chain - ## HITS:1 COG:ECs5062 KEGG:ns NR:ns ## COG: ECs5062 COG1538 # Protein_GI_number: 15834316 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Escherichia coli O157:H7 # 1 488 1 488 488 895 99.0 0 MINRQLSRLLLCSILGSTTLISGCALVRKDSAPHQQLKPEQIKLADDIHLASSGWPQAQW WKQLNDPQLDALIQRTLSGSHTLAEAKLREEKAQSQADLLDAGSQLQVAALGMLNRQRVS ANGFLSPYAMDAPALGMDGPYYTEATVGLFAGLDLDLWGVHRSAVAAAIGAHNAALAETA AVELSLTTGVAQLYYSMQASYQMLDLLEQTRDVIDYAVKAHQSKVAHGLEAQVPFHGARA QILAVDKQIAAVKGQITETRESLRALIGAGASDMPEIKPVALPRVQTGIPATLSYELLAR RPDLQAMRWYVQASLDQVDSARALFYPSFDIKAFFGLDAIHLDTLFKKTSRQFNFIPGLK LPLFDGGRLNANLEGTRAASNMMIERYNQSVLNAVRDVAVNGTRLQTLNDEREMQAERVE ATRFTQRAAEAAYQRGLTSRLQATEARLPVLAEEMSLLMLDSRRVIQSIQLMKSLGGGYQ AAPVVEKK >gi|296493166|gb|ADTK01000335.1| GENE 58 69864 - 70346 488 160 aa, chain - ## HITS:1 COG:yjcQ KEGG:ns NR:ns ## COG: yjcQ COG1289 # Protein_GI_number: 16131907 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 160 477 636 636 310 98.0 6e-85 MRIPRQQEVTALRTYLQIRIGLHAAFNACEEMCQRVALERQLDSEERALLIERSQTVIRQ GRDILHAWDATWNSAQALDNALQPDRAAQFADALEKYAAGLATALSRSPQITLEETPASQ AILPTLLKQEQHVCQLFARLPDWTAPALTPATEQAQGATQ >gi|296493166|gb|ADTK01000335.1| GENE 59 70413 - 71915 1154 500 aa, 
chain - ## HITS:1 COG:ECs5063 KEGG:ns NR:ns ## COG: ECs5063 COG1289 # Protein_GI_number: 15834317 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 48 500 1 453 636 868 98.0 0 MSALNSLPLPVVRLLAFFHEELSERRPGRVPQIVQLWVGCLLVILISMTFEIPFVALSLA VLFYGIQSNAFYTKFVAILFVVATVLEIGSLFLIYKWSYGEPLIRLIIAGPILMGCMFLM RTHRLGLVFFAVAIVAIYGQTFPAMLDYPEVVVRLTLWCIVVGLYPTLLMTLIGVLWFPS RAITQMHQALNDRLDDAISHLTDSLAPLPETRIEREALALQKLNVFCLADDANWRTQSAW WQSCVATVTYIYSTLNRYDPTSFADSQAIIEFRQKLASEINKLQHAVAEGQCWQSDWRIT ESEAMAARECNLENICQTLLQLGQMDPNTPPTPAAKPPSMVADAFANPDYMRYAVKTLLA CLICYTFYSGVDWEGIHTCMLTCVIVANPNVGSSYQKMVLRFGGAFCGAILALLFTLLVM PWLDNIVELLFVLAPIFLLGAWIATSSERSSYIGTQMVVTFALATLENVFGPVYDLVEIR DRAMGIIIGTVVSAVIYTFV >gi|296493166|gb|ADTK01000335.1| GENE 60 71915 - 72946 1013 343 aa, chain - ## HITS:1 COG:yjcR KEGG:ns NR:ns ## COG: yjcR COG1566 # Protein_GI_number: 16131908 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Escherichia coli K12 # 1 343 1 343 343 583 99.0 1e-166 MESTPKKAPRSKFPALLVVALALVALVFVIWRVDSAPSTNDAYASADTIDVVPEVSGRIV ELAVTDNQAVKQGDLLFRIDPRPYEANLAKAEASLAALDKQIMLTQRSVDAQQFGADSIN ATVEKARAAAKQATDTLRRTEPLLKEGFVSAEDVDRARTAQRAAEADLNAVLLQAQSAAS AVSGVDALVAQRAAVEADIALTKLHLEMATVRAPFDGRVISLKTSVGQFASAMRPIFTLI DTRHWYVIANFRETDLKNIRSGTPATIRLMSDSGKTFEGKVDSIGYGVLPDDGGLVLGGL PKVSRSINWVRVAQRFPVKIMVDKPDPEMFRIGASAVANLEPQ >gi|296493166|gb|ADTK01000335.1| GENE 61 72965 - 73240 159 91 aa, chain - ## HITS:1 COG:no KEGG:SSON_4264 NR:ns ## KEGG: SSON_4264 # Name: not_defined # Def: formate dehydrogenase H # Organism: S.sonnei # Pathway: not_defined # 1 91 17 107 107 140 100.0 2e-32 MPTVLSRMAMQLKKTAWIIPVFMVSGCSLSPAIPVIGAYYPSWFFCAIASLILMLITRRV IQRANINLAFVGIIYTALFALYAMLFWLAFF >gi|296493166|gb|ADTK01000335.1| GENE 62 73449 - 75434 2083 661 aa, chain - ## HITS:1 COG:yjcS KEGG:ns NR:ns ## COG: yjcS COG2015 # Protein_GI_number: 16131909 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # 
Function: Alkyl sulfatase and related hydrolases # Organism: Escherichia coli K12 # 1 661 5 665 665 1289 96.0 0 MNNSRLFRLSRIIIALTAASGMMVNTAYATDEAKAATQYTQQVNQNYAKSLPFSDRQDFD DAQRGFIAPLLDEGILRDANGKPYYRGEDYKFDINAPAPETVNPSLWRQSQLNGISGLFK VTDRMYQVRGQDISNITFIEGDTGIIVIDPLVTPPSAKAALDLYFQNRPQKPIVAVIYTH SHTDHYGGVKGIISEADVKSGKVQVIAPAGFMDEAISENVLAGNIMSRRALYSYGLLLAH NPQGNIGNGLGVTLASGYPSIIAPNKTITKTGEKMIIDGLEFDFLMTPGSEAPAEMHFYI PALKALCTAENATHTLHNFYTLRGAKTRDTSKWTEYLNETLDMWGNDAEVLFMPHTWPVW GNKHINDYIGKYRDTIKYIHDQTLHLANQGYTMNEIGDMIKLPPALANNWASRGYYGSVS HNARAVYNFYLGYYDGNPANLHPYGQVEMGKRYVQALGGSARVINLAQEANKQGDYRWSA ELLKQVIAANPGDQVAKNLQANNFEQLGYQAESATWRGFYLTGAKELREGVHKFSHGTTG SPDTIRGMSVEMLFDFMSVRLDSAKAAGKNISLNFNMSNGDNLNLTLNDSVLNYRKTLQS QADASFYISREDLHAVLTGQAKMADLVKAKKAKIIGNGAKLEEIIACLDNFDLWVNIVTP N >gi|296493166|gb|ADTK01000335.1| GENE 63 75707 - 76636 507 309 aa, chain - ## HITS:1 COG:alsK KEGG:ns NR:ns ## COG: alsK COG1940 # Protein_GI_number: 16131910 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Escherichia coli K12 # 1 309 1 309 309 638 98.0 0 MQKQHNVVAGVDMGATHIRFCLRTAEGETLHCEKKRTAEVIAPGLVSGIGEKIDEQLRRF NARCHGLVMGFPALVSKDKRTIISTPNLPLTAADLYDLADKLENTLNCPVEFSRDVNLQL SWDVVENRLTQQLVLAAYLGTGMGFAVWMNGAPWTGAHGVAGELGHIPLGDMTQHCACGN PGCLETNCSGMALRRWYEQQPRNYPLSDLFVHAENAPFVQSLLENAARAIATSINLFDPD AVILGGGVMDMPAFPRETLIAMTQKYLRRPLPHQVVRFIAASSSDFNGAQGAAILAHQHF LPQSCAKAP >gi|296493166|gb|ADTK01000335.1| GENE 64 76620 - 77315 672 231 aa, chain - ## HITS:1 COG:alsE KEGG:ns NR:ns ## COG: alsE COG0036 # Protein_GI_number: 16131911 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Escherichia coli K12 # 1 231 1 231 231 465 98.0 1e-131 MKISPSLMCMDLLKFKEQIEFIDSHADYFHIDIMDGHFVPNLTLSPFFVSQVKKLATKPL DCHLMVTRPQDYIAQLARAGADFITLHPETINGQAFRLIDEIRCHDMKVGLILNPETPVE AMKYYIHKADKITVMTVDPGFARQPFIPEMLDKLAELKAWREREGLEYEIEVDGSCNQAT 
YEKLMAAGADVFIVGTSGLFNHAENIDEAWRIMTTQILAAKSEVQPHAKTA >gi|296493166|gb|ADTK01000335.1| GENE 65 77326 - 78306 1055 326 aa, chain - ## HITS:1 COG:alsC KEGG:ns NR:ns ## COG: alsC COG1172 # Protein_GI_number: 16131912 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Escherichia coli K12 # 1 326 1 326 326 504 100.0 1e-143 MGFTTRVKSEASEKKPFNFALFWDKYGTFFILAIIVAIFGSLSPEYFLTTNNITQIFVQS SVTVLIGMGEFFAILVAGIDLSVGAILALSGMVTAKLMLAGVDPFLAAMIGGVLVGGALG AINGCLVNWTGLHPFIITLGTNAIFRGITLVISDANSVYGFSFDFVNFFAASVIGIPVPV IFSLIVALILWFLTTRMRLGRNIYALGGNKNSAFYSGIDVKFHILVVFIISGVCAGLAGV VSTARLGAAEPLAGMGFETYAIASAIIGGTSFFGGKGRIFSVVIGGLIIGTINNGLNILQ VQTYYQLVVMGGLIIAAVALDRLISK >gi|296493166|gb|ADTK01000335.1| GENE 66 78285 - 79817 175 510 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 279 494 25 222 309 72 25 1e-11 MATPYISMAGIGKSFGPVHALKSVNLTVYPGEIHALLGENGAGKSTLMKVLSGIHEPTKG TITINNISYNKLDHKLAAQLGIGIIYQELSVIDELTVLENLYIGRHLTKKICGVNIIDWR EMRVRAAMMLLRVGLKVDLDEKVANLSISHKQMLEIAKTLMLDAKVIIMDEPTSSLTNKE VDYLFLIMNQLRKEGTAIVYISHKLAEIRRICDRYTVMKDGSSVCSGMVSDVSNDDIVRL MVGRELQNRFNAMKENVSNLVHDTVFEVRNVTSRDRKKVRDISFSVCRGEILGFAGLVGS GRTELMNCLFGVDKRAGGEIRLNGKDISPRSPLDAVKKGMAYITESRRDNGFFPNFSIAQ NMAISRSLKDGGYKGAMGLFHEVDEQRTAENQRELLALKCHSVNQNITELSGGNQQKVLI SKWLCCCPEVIIFDEPTRGIDVGAKAEIYKVMRQLADDGKVILMVSSELPEIITVCDRIA VFCEGRLTQILTNRDDMSEEEIMAWALPQE >gi|296493166|gb|ADTK01000335.1| GENE 67 79944 - 80879 856 311 aa, chain - ## HITS:1 COG:alsB KEGG:ns NR:ns ## COG: alsB COG1879 # Protein_GI_number: 16131914 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli K12 # 1 311 1 311 311 559 99.0 1e-159 MNKYLKYFSGTLVGLMLSTSAFAAAEYAVVLKTLSNPFWVDMKKGIEDEAKTLGVSVDIF ASPSEGDFQSQLQLFEDLSNKNYKGIAFAPLSSVNLVMPVARAWKKGIYLVNLDEKIDMD 
NLKKAGGNVEAFVTTDNVAVGAKGASFIIDKLGAEGGEVAIIEGKAGNASGEARRNGATE AFKKASQIKLVASQPADWDRIKALDVATNVLQRNPNIKAIYCANDTMAMGVAQAVANAGK TGKILVVGTDGIPEARKMVEAGQMTATVAQNPADIGATGLKLMVNAEKSGKVIPLDKAPE FKLVDSILVTQ >gi|296493166|gb|ADTK01000335.1| GENE 68 80938 - 81828 538 296 aa, chain - ## HITS:1 COG:rpiR KEGG:ns NR:ns ## COG: rpiR COG1737 # Protein_GI_number: 16131915 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 296 12 307 307 588 100.0 1e-168 MSQSEFDSALPNGIGLAPYLRMKQEGMTENESRIVEWLLKPGNLSCAPAIKDVAEALAVS EAMIVKVSKLLGFSGFRNLRSALEDYFSQSEQVLPSELAFDEAPQDVVNKVFNITLRTIM EGQSIVNVDEIHRAARFFYQARQRDLYGAGGSNAICADVQHKFLRIGVRCQAYPDAHIMM MSASLLQEGDVVLVVTHSGRTSDVKAAVELAKKNGAKIICITHSYHSPIAKLADYIICSP APETPLLGRNASARILQLTLLDAFFVSVAQLNIEQANINMQKTGAIVDFFSPGALK >gi|296493166|gb|ADTK01000335.1| GENE 69 82187 - 82636 320 149 aa, chain + ## HITS:1 COG:rpiB KEGG:ns NR:ns ## COG: rpiB COG0698 # Protein_GI_number: 16131916 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Escherichia coli K12 # 1 149 1 149 149 287 100.0 4e-78 MKKIAFGCDHVGFILKHEIVAHLVERGVEVIDKGTWSSERTDYPHYASQVALAVAGGEVD GGILICGTGVGISIAANKFAGIRAVVCSEPYSAQLSRQHNDTNVLAFGSRVVGLELAKMI VDAWLGAQYEGGRHQQRVEAITAIEQRRN >gi|296493166|gb|ADTK01000335.1| GENE 70 82705 - 83034 217 109 aa, chain + ## HITS:1 COG:no KEGG:B21_03923 NR:ns ## KEGG: B21_03923 # Name: yjdP # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 109 1 109 109 79 99.0 5e-14 MKRFPLFLLFTLLTLSTVPAQADIIDDTIGNIQQAINDAYNPDRGRDYEDSRDDGWQREV SDDRRRQYDDRRRQFEDRRRQLDDRQRQLDQERRQLEDEERRMEDEYGR >gi|296493166|gb|ADTK01000335.1| GENE 71 83181 - 83939 577 252 aa, chain - ## HITS:1 COG:ECs5075 KEGG:ns NR:ns ## COG: ECs5075 COG1235 # Protein_GI_number: 15834329 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Escherichia coli O157:H7 # 1 252 1 252 252 509 95.0 1e-144 
MSLTLTLTGTGGAQGVPAWGCECAACARARRSPQYRRQPCSGVVKFNDAITLIDAGRHDL TDRWSPGSFQQFLLTHYHMDHVQGLFPLRWGVGDVIPVYGPPDEQGCDDLFKHPGLLDFS HTVEPFVVFDLQGLQVTPLPLNHSKLTFGYLLETAHSRVAWLSDTAGLPEKTLKFLLNNH PQVMVIDCSHPPRADAPRNHCDLNTVLALNQVIRSPQVILTHISHQFDAWLMENALPSGF EVGFDGMEIGVA >gi|296493166|gb|ADTK01000335.1| GENE 72 83941 - 84375 593 144 aa, chain - ## HITS:1 COG:phnO KEGG:ns NR:ns ## COG: phnO COG0454 # Protein_GI_number: 16131919 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Escherichia coli K12 # 1 144 1 144 144 283 99.0 8e-77 MPACELRPATQYDTDAVYALICELKQAEFDHHAFRVGFNANLRDPNMRYHLALLDGEVVG MIGLHLQFHLHHVNWIGEIQELVVMPQARGLNVGSKLLAWAEEEARQAGAEMTELSTNVK RHNAHRFYLREGYEQSHFRFTKAL >gi|296493166|gb|ADTK01000335.1| GENE 73 84362 - 84919 431 185 aa, chain - ## HITS:1 COG:phnN KEGG:ns NR:ns ## COG: phnN COG3709 # Protein_GI_number: 16131920 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized component of phosphonate metabolism # Organism: Escherichia coli K12 # 1 185 1 185 185 356 98.0 2e-98 MMGKLIWLMGPSGSGKDSLLAELRLREQTQLLVAHRYITRDASAGSENHIALSEREFFTR AGQNLLALSWHANGLYYGVGIEIDLWLHAGFDVVVNGSRAHLPQARARYQSALLPVCLQV SPEILRQRLENRGRENASEINARLARAARYTPQDCHTLNNDGSLRQSVDTLLTLIHQKEK HHACL >gi|296493166|gb|ADTK01000335.1| GENE 74 84919 - 86055 1085 378 aa, chain - ## HITS:1 COG:ECs5078 KEGG:ns NR:ns ## COG: ECs5078 COG3454 # Protein_GI_number: 15834332 # Func_class: P Inorganic ion transport and metabolism # Function: Metal-dependent hydrolase involved in phosphonate metabolism # Organism: Escherichia coli O157:H7 # 1 378 1 378 378 732 98.0 0 MIINNVKLVLENEVVHGSLEVQDGEIRAFAESQSRLPEAMDGEGGWLLPGLIELHTDNLD KFFTPRPKVDWPAHSAMSSHDALMVASGITTVLDAVAIGDVRDGGDRLENLEKMINAIEE TQKRGVNRAEHRLHLRCELPHHTTLPLFEKLVQREPVTLVSLMDHSPGQRQFANREKYRE YYQGKYSLTDAQMQQYEEEQLALAARWSQPNRESIAALCRARQIALASHDDATHAHVAES HQLGSVIAEFPTTFEAAEASRKHGMNVLMGAPNIVRGGSHSGNVAASELAQLGLLDILSS 
DYYPASLLDAAFRVADDESNRFTLPQAVRLVTKNPAQALNLQDRGVIGEGKRADLVLAHR QGNHIHIDHVWRQGKRVF >gi|296493166|gb|ADTK01000335.1| GENE 75 86052 - 86774 221 240 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 15 233 1 209 245 89 29 4e-17 MRAPLPNMRTTRRKMINVQNVSKTFILHQQNGVRLPVLQRASLTVNAGECVVLHGHSGSG KSTLLRSLYANYLPDEGQIQIKHGDEWVDLVTAPARKVVEIRKTTVGWVSQFLRVIPRIS ALEVVMQPLLDTGVPREACAAKAARLLTRLNVPERLWHLAPSTFSGGEQQRVNIARGFIV DYPILLLDEPTASLDAKNSAAVVELIHEAKARGAAIVGIFHDEAVRNDVADRLHPMGASS Prediction of potential genes in microbial genomes Time: Mon May 16 16:05:02 2011 Seq name: gi|296493165|gb|ADTK01000336.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.3, whole genome shotgun sequence Length of sequence - 28915 bp Number of predicted genes - 26, with homology - 25 Number of transcription units - 14, operones - 4 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 2 1 Op 2 8/0.000 - CDS 808 - 1653 858 ## COG3627 Uncharacterized enzyme of phosphonate metabolism 3 1 Op 3 8/0.000 - CDS 1646 - 2710 1186 ## COG3626 Uncharacterized enzyme of phosphonate metabolism 4 1 Op 4 9/0.000 - CDS 2710 - 3294 616 ## COG3625 Uncharacterized enzyme of phosphonate metabolism 5 1 Op 5 5/0.333 - CDS 3291 - 3743 483 ## COG3624 Uncharacterized enzyme of phosphonate metabolism 6 1 Op 6 3/1.000 - CDS 3744 - 4469 628 ## COG2188 Transcriptional regulators 7 1 Op 7 8/0.000 - CDS 4490 - 5269 874 ## COG3639 ABC-type phosphate/phosphonate transport system, permease component 8 1 Op 8 15/0.000 - CDS 5375 - 6391 1031 ## COG3221 ABC-type phosphate/phosphonate transport system, periplasmic component 9 1 Op 9 3/1.000 - CDS 6416 - 7204 261 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 - Prom 7257 - 7316 4.5 10 2 Tu 1 . - CDS 7337 - 7780 438 ## COG2764 Uncharacterized protein conserved in bacteria - Prom 7805 - 7864 2.2 - Term 7834 - 7878 7.5 11 3 Tu 1 . 
- CDS 7940 - 8275 429 ## COG2824 Uncharacterized Zn-ribbon-containing protein involved in phosphonate metabolism + Prom 8531 - 8590 8.2 12 4 Op 1 . + CDS 8678 - 10906 1784 ## COG0699 Predicted GTPases (dynamin-related) 13 4 Op 2 . + CDS 10903 - 11781 458 ## EC55989_4600 hypothetical protein + Prom 11856 - 11915 3.2 14 5 Tu 1 . + CDS 12045 - 13547 1663 ## COG0477 Permeases of the major facilitator superfamily + Prom 13578 - 13637 1.6 15 6 Tu 1 . + CDS 13659 - 13748 70 ## 16 7 Op 1 40/0.000 - CDS 13724 - 14815 915 ## COG0642 Signal transduction histidine kinase 17 7 Op 2 3/1.000 - CDS 14825 - 15493 906 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 18 7 Op 3 3/1.000 - CDS 15490 - 17133 1363 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase - Term 17168 - 17215 8.2 19 8 Op 1 . - CDS 17237 - 18574 1413 ## COG0531 Amino acid transporters - Prom 18615 - 18674 3.9 20 8 Op 2 . - CDS 18711 - 19472 462 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 19549 - 19608 6.1 - Term 19648 - 19679 1.1 21 9 Tu 1 . - CDS 19797 - 22064 2005 ## COG1982 Arginine/lysine/ornithine decarboxylases - Prom 22186 - 22245 7.2 22 10 Tu 1 . - CDS 22263 - 23171 814 ## COG2207 AraC-type DNA-binding domain-containing proteins 23 11 Tu 1 4/0.333 + CDS 23457 - 24809 1229 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases 24 12 Tu 1 . + CDS 24924 - 26333 1019 ## COG2211 Na+/melibiose symporter and related transporters + Term 26356 - 26397 6.1 25 13 Tu 1 . - CDS 26472 - 27101 629 ## COG3647 Predicted membrane protein - Prom 27124 - 27183 2.5 26 14 Tu 1 . - CDS 27224 - 28870 507 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase Predicted protein(s) >gi|296493165|gb|ADTK01000336.1| GENE 1 53 - 811 360 252 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 4 249 277 523 563 143 35 1e-33 MNQPLLSVNNLTHLYAPGKGFSDVSFDLWPGEVLGIVGESGSGKTTLLKSISARLTPQQG EIHYESRSLYAMSEADRRRLLRTEWGVVHQHPLDGLRRQVSAGGNIGERLMATGARHYGD IRATAQKWLEEVEIPANRIDDLPTTFSGGMQQRLQIARNLVTHPKLVFMDEPTGGLDVSV QARLLDLLRGLVVELNLAVVIVTHDLGVARLLADRLLVMKQGQVVESGLTDRVLDDPHHP YTQLLVSSVLQN >gi|296493165|gb|ADTK01000336.1| GENE 2 808 - 1653 858 281 aa, chain - ## HITS:1 COG:ECs5081 KEGG:ns NR:ns ## COG: ECs5081 COG3627 # Protein_GI_number: 15834335 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized enzyme of phosphonate metabolism # Organism: Escherichia coli O157:H7 # 1 281 1 281 281 573 99.0 1e-163 MANLSGYNFAYLDEQTKRMIRRAILKAVAIPGYQVPFGGREMPMPYGWGTGGIQLTASVI GESDVLKVIDQGADDTTNAVSIRNFFKRVTGVNTTERTDDATLIQTRHRIPETPLTEDQI IIFQVPIPEPLRFIEPRETETRTMHALEEYGVMQVKLYEDIARFGHIATTYAYPVKVNGR YVMDPSPIPKFDNPKMDMMPALQLFGAGREKRIYAVPPFTRVESLDFDDHPFTVQQWDEP CAICGSTHSYLDEVVLDDAGNRMFVCSDTDYCRQQSEAKSQ >gi|296493165|gb|ADTK01000336.1| GENE 3 1646 - 2710 1186 354 aa, chain - ## HITS:1 COG:ECs5082 KEGG:ns NR:ns ## COG: ECs5082 COG3626 # Protein_GI_number: 15834336 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized enzyme of phosphonate metabolism # Organism: Escherichia coli O157:H7 # 1 354 1 354 354 678 98.0 0 MYVAVKGGEKAIDAAHALQESRRRGDTDLPELSIAQIEQQLNLAVDRVMTEGGIADRELA ALALKQASGDNVEAIFLLRAYRTTLTKLAVSEPLDTTEMRLERRISAVYKDIPGGQLLGP TYDYTHRLLDFTLLANGEAPTLTTADSEQQPSPHVFSLLARQGLAKFEEDSGAQPDDITR TPPVYPCSRSSRLQQLMRGDEGYLLALAYSTQRGYGRNHPFAGEIRSGYIDVSIVPEELG FAVNVGELLMTECEMVNGFIDPPDEPPHFTRGYGLVFGMSERKAMAMALVDRALQAPEYG EHATGPAQDEEFVLAHADNVEAAGFVSHLKLPHYVDFQAELELLKRLQQEKNHG >gi|296493165|gb|ADTK01000336.1| GENE 4 2710 - 3294 616 194 aa, chain - ## HITS:1 COG:phnH KEGG:ns NR:ns ## COG: phnH COG3625 # Protein_GI_number: 16131926 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized enzyme of phosphonate metabolism # Organism: Escherichia coli K12 # 1 194 1 194 194 349 98.0 1e-96 
MTLETAFMLPVQDAQHSFRRLLKAMSEPGVIVALHQLKRGWQPLNIAPTSVLLTLADNDT PVWLATPLSNDIVNQSLRFHTNAPLVSQPEQATFAVTDEAISSEQLNALSTGTAVAPEAG ATLILQVASLSGGRMLRLTGAGIAEERMIAPQLPECILHELTERPHPFPLGIDLILTCGE RLLAIPRTTHVEVC >gi|296493165|gb|ADTK01000336.1| GENE 5 3291 - 3743 483 150 aa, chain - ## HITS:1 COG:ECs5084 KEGG:ns NR:ns ## COG: ECs5084 COG3624 # Protein_GI_number: 15834338 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized enzyme of phosphonate metabolism # Organism: Escherichia coli O157:H7 # 1 150 1 150 150 258 97.0 3e-69 MHADTATRQHWMSVLAHSQPAELAARLKALNITADYEVIRAAETGLVQIQARMGGTGERF FAGDATLTRAAVRLTDGTLGYSWVLGRDKQHAERCALIDALMQQSRYFQNLSETLIAPLD ADRMARIAARQAEVNASRVDFFTMVRGDNA >gi|296493165|gb|ADTK01000336.1| GENE 6 3744 - 4469 628 241 aa, chain - ## HITS:1 COG:phnF KEGG:ns NR:ns ## COG: phnF COG2188 # Protein_GI_number: 16131928 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 241 1 241 241 468 100.0 1e-132 MHLSTHPTSYPTRYQEIAAKLEQELRQHYRCGDYLPAEQQLAARFEVNRHTLRRAIDQLV EKGWVQRRQGVGVLVLMRPFDYPLNAQARFSQNLLDQGSHPTSEKLLSVLRPASGHVADA LGITEGENVIHLRTLRRVNGVALCLIDHYFADLTLWPTLQRFDSGSLHDFLREQTGIALR RSQTRISARRAQAKECQRLEIPNMSPLLCVRTLNHRDGESSPAEYSVSLTRADMIEFTME H >gi|296493165|gb|ADTK01000336.1| GENE 7 4490 - 5269 874 259 aa, chain - ## HITS:1 COG:ECs5086 KEGG:ns NR:ns ## COG: ECs5086 COG3639 # Protein_GI_number: 15834340 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate/phosphonate transport system, permease component # Organism: Escherichia coli O157:H7 # 1 259 18 276 276 464 99.0 1e-130 MQTITIAPPKRSWFSLLSWAVVLAVLVVSWQGAEMAPLTLIKDGGNMATFAADFFPPDFS QWQDYLTEMAVTLQIAVWGTALAVVLSIPFGLMSAENLVPWWVYQPVRRLMDACRAINEM VFAMLFVVAVGLGPFAGVLALFIHTTGVLSKLLSEAVEAIEPGPVEGIRATGANKLEEIL YGVLPQVMPLLISYSLYRFESNVRSATVVGMVGAGGIGVTLWEAIRGFQFQQTCALMVLI IVTVSLLDFLSQRLRKHFI >gi|296493165|gb|ADTK01000336.1| GENE 8 5375 - 6391 1031 338 aa, chain - ## HITS:1 COG:phnD KEGG:ns NR:ns ## COG: phnD COG3221 # 
Protein_GI_number: 16131931 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate/phosphonate transport system, periplasmic component # Organism: Escherichia coli K12 # 1 338 1 338 338 642 100.0 0 MNAKIIASLAFTSMFSLSTLLSPAHAEEQEKALNFGIISTESQQNLKPQWTPFLQDMEKK LGVKVNAFFAPDYAGIIQGMRFNKVDIAWYGNLSAMEAVDRANGQVFAQTVAADGSPGYW SVLIVNKDSPINNLNDLLAKRKDLTFGNGDPNSTSGFLVPGYYVFAKNNISASDFKRTVN AGHETNALAVANKQVDVATNNTENLDKLKTSAPEKLKELKVIWKSPLIPGDPIVWRKNLS ETTKDKIYDFFMNYGKTPEEKAVLERLGWAPFRASSDLQLVPIRQLALFKEMQGVKSNKG LNEQDKLAKTTEIQAQLDDLDRLNNALSAMSSVSKAVQ >gi|296493165|gb|ADTK01000336.1| GENE 9 6416 - 7204 261 262 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 231 1 214 305 105 31 4e-22 MQTIIRVEKLAKTFNQHQALHAVDLNIHHGEMVALLGPSGSGKSTLLRHLSGLITGDKSA GSHIELLGRTVQREGRLARDIRKSRANTGYIFQQFNLVNRLSVLENVLIGALGSTPFWRT CFSWFTGEQKQRALQALTRVGMVHFAHQRVSTLSGGQQQRVAIARALMQQAKVILADEPI ASLDPESARIVMDTLRDINQNDGITVVVTLHQVDYALRYCERIVALRQGHVFYDGSSQQF DNERFDHLYRSINRIEENAKAA >gi|296493165|gb|ADTK01000336.1| GENE 10 7337 - 7780 438 147 aa, chain - ## HITS:1 COG:ECs5089 KEGG:ns NR:ns ## COG: ECs5089 COG2764 # Protein_GI_number: 15834343 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 147 1 147 147 296 99.0 1e-80 MPLSPYLSFAGNCADAIAYYQRTLGAELLYKISFGEMPKSAQDSAENCPSGMQFPDTAIA HANVRIAGSDIMMSDAIPSGKASYSGFTLVLDSQQVEEGKRWFDNLAANGKIEMAWQETF WAHGFGKVTDKFGVPWMINVVKQQPTQ >gi|296493165|gb|ADTK01000336.1| GENE 11 7940 - 8275 429 111 aa, chain - ## HITS:1 COG:ECs5090 KEGG:ns NR:ns ## COG: ECs5090 COG2824 # Protein_GI_number: 15834344 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized Zn-ribbon-containing protein involved in phosphonate metabolism # Organism: Escherichia coli O157:H7 # 1 111 1 111 111 216 100.0 8e-57 MSLPHCPKCNSEYTYEDNGMYICPECAYEWNDAEPAQESDELIVKDANGNLLADGDSVTI 
IKDLKVKGSSSMLKIGTKVKNIRLVEGDHNIDCKIDGFGPMKLKSEFVKKN >gi|296493165|gb|ADTK01000336.1| GENE 12 8678 - 10906 1784 742 aa, chain + ## HITS:1 COG:ZyjdA_1 KEGG:ns NR:ns ## COG: ZyjdA_1 COG0699 # Protein_GI_number: 15804701 # Func_class: R General function prediction only # Function: Predicted GTPases (dynamin-related) # Organism: Escherichia coli O157:H7 EDL933 # 1 275 1 275 275 510 99.0 1e-144 MYTQTLYELSQEAERLLQLSRQQLQLLEKMPLSVPGDDAPQLALPWSQPNIAERHAMLNN ELRKISRLEMVLAIVGTMKAGKSTTINAIVGTEVLPNRNRPMTALPTLIRHTPGQKEPVL HFSHVAPIDRLIQKLQQRLRDCDIKHLTDVLEIDKDMRALMQRIENGVAFEKYYLGAQPI FHCLKSLNDLVRLAKALDVDFPFSAYAAIEHIPVIEVEFVHLAGLESYPGQLTLLDTPGP NEAGQPHLQKMLNQQLARASAVLAVLDYTQLKSISDEEVREAILAVGQSVPLYVLVNKFD QQDRNSDDADQVRALISGTLMKGCITPQQIFPVSSMWGYLANRARHELANNGKLPAPEQQ RWVEDFAHAALGRRWRHDDLADLEHIRHAADQLWEDSLFAQPIQALLHAAYANASLYALR SAAHKLLNYAQQARGYLDFRAHGLNVACEQLRQNIHQVEESLQLLQLNQAQVSGEVKHEI ELALTSANHFLRQQQDALNAQLAALFQDDSEPLSEIRTCCETLLQTAQNTISRDFTLRFA ELESTLCRVLTDVIRPIEQQVKMELSESGFRPGFHFPVFHGVVPHFNTRQLFSEVISRQE ATDEQSTRLGVVRETFSRWLNQPDWGRGNEKSPTETVDYSVLQRALSAEVDLYCQQMAKV LAEQVDESVTAGMNTFFAEFASCLTELQTRLRESLALRQQNESVVRLMQQQLQQTVMTHG WIYTDAQLLRDDIQTLFTAERY >gi|296493165|gb|ADTK01000336.1| GENE 13 10903 - 11781 458 292 aa, chain + ## HITS:1 COG:no KEGG:EC55989_4600 NR:ns ## KEGG: EC55989_4600 # Name: yjcZ # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 292 1 292 292 561 100.0 1e-158 MTKTLLDGPGRVLESVYPRFLVDLAQGDDARLPQAHQQQFRERLMQELLARVQLQTWTNG GMLNAPLSLRLTLVEKLASMLDPGHLALTQIAQHLALLQKMDHRQHSAFPELPQQIAALY EWFSARCRWKEKALTQRGLLVQAGEQSEQIFTRWRAGAYNAWSLPGRCFIVLEELRWGAF GDACRLGSPQAVALLLGDLRVKATQHLAESINAAPTTRHYYHQWFASSTVPTGGDHADFL SWLGKWTTADKQPVCWSVTQRWQTVALGMPRLCSAQRLAGAMVEEIFSVNLA >gi|296493165|gb|ADTK01000336.1| GENE 14 12045 - 13547 1663 500 aa, chain + ## HITS:1 COG:ECs5093 KEGG:ns NR:ns ## COG: ECs5093 COG0477 # Protein_GI_number: 15834347 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and 
metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 500 1 500 500 973 99.0 0 MLKRKKVKPITLRDVTIIDDGKLRKAITAASLGNAMEWFDFGVYGFVAYALGKVFFPGAD PSVQMVAALATFSVPFLIRPLGGLFFGMLGDKYGRQKILAITIVIMSISTFCIGLIPSYD TIGIWAPILLLICKMAQGFSVGGEYTGASIFVAEYSPDRKRGFMGSWLDFGSIAGFVLGA GVVVLISTIVGEANFLDWGWRIPFFIALPLGIIGLYLRHALEETPAFQQHVDKLEQGDRE GLQDGPKVSFKEIATKHWRSLLTCIGLVIATNVTYYMLLTYMPSYLSHNLHYSEDHGVLI IIAIMIGMLFVQPVMGLLSDRFGRRPFVLLGSVALFVLAIPAFILINSNVIGLIFAGLLM LAVILNCFTGVMASTLPAMFPTHIRYSALAAAFNISVLVAGLTPTLAAWLVESSQNLMMP AYYLMVVAVIGLITGVTMKETANRPLKGATPAASDIQEAKEILVEHYDNIEQKIDNIDHE IADLQAKRTRLVQQHPRIDE >gi|296493165|gb|ADTK01000336.1| GENE 15 13659 - 13748 70 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKNRVYESLTTVFSVLVVSSFLYIWFATF >gi|296493165|gb|ADTK01000336.1| GENE 16 13724 - 14815 915 363 aa, chain - ## HITS:1 COG:ECs5094 KEGG:ns NR:ns ## COG: ECs5094 COG0642 # Protein_GI_number: 15834348 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli O157:H7 # 1 363 1 363 363 677 99.0 0 MHFLRRPISLRQRLILTIGAILLVFELISVFWLWHESTEQIQLFEQALRDNRNNDRHIMR EIREAVASLIVPGVFMVSLTLFICYQAVRRITRPLAELQKELEARTADNLTPIAIHSATL EIEAVVSALNDLVSRLTSTLDNERLFTADVAHELRTPLAGVRLHLELLAKTHHIDVAPLV ARLDQMMESVSQLLQLARAGQSFSSGNYQHVKLLEDVILPSYDELSTMLDQRQQTLLLPE SAADITVQGDATLLRMLLRNLVENAHRYSPQGSNIMIKLQEDGGAVMAVEDEGPGIDESK CGELSKAFVRMDSRYGGIGLGLSIVSRITQLHHGQFFLQNRQETSGTRAWVRLKKDQNVA NQI >gi|296493165|gb|ADTK01000336.1| GENE 17 14825 - 15493 906 222 aa, chain - ## HITS:1 COG:ECs5095 KEGG:ns NR:ns ## COG: ECs5095 COG0745 # Protein_GI_number: 15834349 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 222 1 222 222 432 100.0 1e-121 MKILIVEDDTLLLQGLILAAQTEGYACDGVTTARMAEQSLEAGHYSLVVLDLGLPDEDGL 
HFLARIRQKKYTLPVLILTARDTLTDKIAGLDVGADDYLVKPFALEELHARIRALLRRHN NQGESELIVGNLTLNMGRRQVWMGGEELILTPKEYALLSRLMLKAGSPVHREILYNDIYN WDNEPSTNTLEVHIHNLRDKVGKARIRTVRGFGYMLVANEEN >gi|296493165|gb|ADTK01000336.1| GENE 18 15490 - 17133 1363 547 aa, chain - ## HITS:1 COG:ZyjdB KEGG:ns NR:ns ## COG: ZyjdB COG2194 # Protein_GI_number: 15804706 # Func_class: R General function prediction only # Function: Predicted membrane-associated, metal-dependent hydrolase # Organism: Escherichia coli O157:H7 EDL933 # 1 547 11 557 557 1080 99.0 0 MLKRLLKRPSLNLLAWLLLAAFYISICLNIAFFKQVLQALPLDSLHNVLVFLSMPVVAFS VINIVLTLSSFLWLNRPLACLFILVGAAAQYFIMTYGIVIDRSMIANIIDTTPAESYALM TPQMLLTLGFSGVLAALIACWIKIKPATSRLRSVLFRGANILVSVLLILLVAALFYKDYA SLFRNNKELVKSLSPSNSIVASWSWYSHQRLANLPLVRIGEDAHRNPLMQNEKRKNLTIL IVGETSRAENFSLNGYPRETNPRLAKDNVVYFPNTASCGTATAVSVPCMFSDMPREHYKE ELAQHQEGVLDIIQRAGINVLWNDNDGGCKGACDRVPHQNVTALNLPGQCINGECYDEVL FHGLEEYINNLQGDGVIVLHTIGSHGPTYYNRYPPQFRKFTPTCDTNEIQTCSKEQLVNT YDNTLVYVDYIVDKAINLLKEHQDKFTTSLVYLSDHGESLGENGIYLHGLPYAIAPDSQK QVPMLLWLSEDYQKRYQVDQNCLQKQAQTQHYSQDNLFSTLLGLTGVETKYYQAADDILQ TCRRVSE >gi|296493165|gb|ADTK01000336.1| GENE 19 17237 - 18574 1413 445 aa, chain - ## HITS:1 COG:ECs5097 KEGG:ns NR:ns ## COG: ECs5097 COG0531 # Protein_GI_number: 15834351 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli O157:H7 # 1 445 1 445 445 780 100.0 0 MSSDADAHKVGLIPVTLMVSGNIMGSGVFLLPANLASTGGIAIYGWLVTIIGALGLSMVY AKMSFLDPSPGGSYAYARRCFGPFLGYQTNVLYWLACWIGNIAMVVIGVGYLSYFFPILK DPLVLTITCVVVLWIFVLLNIVGPKMITRVQAVATVLALIPIVGIAVFGWFWFRGETYMA AWNVSGLGTFGAIQSTLNVTLWSFIGVESASVAAGVVKNPKRNVPIATIGGVLIAAVCYV LSTTAIMGMIPNAALRVSASPFGDAARMALGDTAGAIVSFCAAAGCLGSLGGWTLLAGQT AKAAADDGLFPPIFARVNKAGTPVAGLIIVGILMTIFQLSSISPNATKEFGLVSSVSVIF TLVPYLYTCAALLLLGHGHFGKARPAYLAVTTIAFLYCIWAVVGSGAKEVMWSFVTLMVI TAMYALNYNRLHKNPYPLDAPISKD >gi|296493165|gb|ADTK01000336.1| GENE 20 18711 - 19472 462 253 aa, chain - ## HITS:1 COG:ECs5098 KEGG:ns NR:ns ## COG: ECs5098 COG2207 # 
Protein_GI_number: 15834352 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 253 1 253 253 493 100.0 1e-139 MRICSDQPCIVLLTEKDVWIRVNGKEPISLKANHMALLNCENNIIDVSSLNNTLVAHISH DIIKDYLRFLNKDLSQIPVWQRSATPILTLPCLTPDVFRVAAQHSMMPAETESEKERTRA LLFTVLSRFLDSKKFLSLMMYMLRNCVSDSVYQIIESDIHKDWNLSMVASCLCLSPSLLK KKLKSENTSYSQIITTCRMRYAVNELMMDGKNISQVSQSCGYNSTSYFISVFKDFYGMTP LHYVSQHRERTVA >gi|296493165|gb|ADTK01000336.1| GENE 21 19797 - 22064 2005 755 aa, chain - ## HITS:1 COG:adiA KEGG:ns NR:ns ## COG: adiA COG1982 # Protein_GI_number: 16131943 # Func_class: E Amino acid transport and metabolism # Function: Arginine/lysine/ornithine decarboxylases # Organism: Escherichia coli K12 # 1 755 2 756 756 1594 99.0 0 MKVLIVESEFLHQDTWVGNAVERLADALSQQNVTVIKSTSFDDGFAILSSNEAIDCLMFS YQMEHSDEHQNVRQLIGKLHERQQNVPVFLLGDREKALAAMDRDLLELVDEFAWILEDTA DFIAGRAVAAMTRSRQQLLPPLFSALMKYSDIHEYSWAAPGHQGGVGFTKTPAGRFYHDY YGENLFRTDMGIERTSLGSLLDHTGAFGESEKYAARVFGADRSWSVVVGTSGSNRTIMQA CMTDNDVVVVDRNCHKSIEQGLMLTGAKPVYMVPSRNRYGIIGPIYPQEMQPETLQKKIS ESPLTKDKAGQKPSYCVVTNCTYDGVCYNAKEAQDLLEKTSDRLHFDEAWYGYARFNPIY ADHYAMRGEPGDHNGPTVFATHSTHKLLNALSQASYIHVREGRGAINFSRFNQAYMMHAT TSPLYAICASNDVAVSMMDGNSGLSLTQEVIDEAVDFRQAMARLYKEFTADGSWFFKPWN KEVVTDPQTGKTYDFADAPTKLLTTVQDCWVMHPGESWHGFKDIPDNWSMLDPIKVSILA PGMGEDGELEETGVPAALVTAWLGRHGIVPTRTTDFQIMFLFSMGVTRGKWGTLVNTLCS FKRHYDANTPLAQVMPELVEQYPDTYANMGIHDLGDTMFAWLKENNPGARLNEAYSGLPM AEITPREAYNAIVDNNVELVSIENLPGRIAANSVIPYPPGIPMLLSGENFGDKNSPQVSY LRSLQSWDHHFPGFEHETEGTEIIDGIYHVMCVKA >gi|296493165|gb|ADTK01000336.1| GENE 22 22263 - 23171 814 302 aa, chain - ## HITS:1 COG:melR KEGG:ns NR:ns ## COG: melR COG2207 # Protein_GI_number: 16131944 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 302 1 302 302 610 100.0 1e-174 MNTDTFMCSSDEKQTRSPLSLYSEYQRMEIEFRAPHIMPTSHWHGQVEVNVPFDGDVEYL INNEKVNINQGHITLFWACTPHQLTDTGTCQSMAIFNLPMHLFLSWPLDKDLINHVTHGM 
VIKSLATQQLSPFEVRRWQQELNSPNEQIRQLAIDEIGLMLKRFSLSGWEPILVNKTSRT HKNSVSRHAQFYVSQMLGFIAENYDQALTINDVAEHVKLNANYAMGIFQRVMQLTMKQYI TAMRINHVRALLSDTDKSILDIALTAGFRSSSRFYSTFGKYVGMSPQQYRKLSQQRRQTF PG >gi|296493165|gb|ADTK01000336.1| GENE 23 23457 - 24809 1229 450 aa, chain + ## HITS:1 COG:ECs5101 KEGG:ns NR:ns ## COG: ECs5101 COG1486 # Protein_GI_number: 15834355 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Escherichia coli O157:H7 # 1 450 2 451 451 955 99.0 0 MTAPKITFIGAGSTIFVKNILGDVFHREALKTAHIALMDIDPTRLEESHIVVRKLMDSAG ASGKITCHTQQKEALQDADFVVVAFQIGGYEPCTVTDFEVCKRHGLEQTIADTLGPGGIM RALRTIPHLWQICEDMTEVCPDATMLNYVNPMAMNTWAMYARYPHIKQVGLCHSVQGTAE ELARDLNIDPATLRYRCAGINHMAFYLELERKTADGSYVNLYPELLAAYEAGQAPKPNIH GNTRCQNIVRYEMFKKLGYFVTESSEHFAEYTPWFIKPGREDLIERYKVPLDEYPKRCVE QLANWHKELEEYKNASRIDIKPSREYASTIMNAIWTGEPSVIYGNVRNDGLIDNLPQGCC VEVACLVDANGIQPTKVGTLPSHLAALMQTNINVQTLLTEAILTENRDRVYHAAMMDPHT AAVLGIDEIYALVDDLIAAHGDWLPGWLHR >gi|296493165|gb|ADTK01000336.1| GENE 24 24924 - 26333 1019 469 aa, chain + ## HITS:1 COG:ECs5102 KEGG:ns NR:ns ## COG: ECs5102 COG2211 # Protein_GI_number: 15834356 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli O157:H7 # 1 469 1 469 469 854 99.0 0 MTTKLSYGFGAFGKDFAIGIVYMYLMYYYTDVVGLSVGLVGTLFLVARIWDAINDPIMGW IVNATRSRWGKFKPWILIGTLANSVILFLLFSAHLFEGTTQIVFVCVTYILWGMTYTIMD IPFWSLVPTITLDKREREQLVPYPRFFASLAGFVTAGGTLPFVNYVGGGDRGFGFQMFTL VLIAFFIVSTIITLRNVHEVFSSDNQPSAEGSHLTLKAIVALIYKNDQLSCLLGMALAYN VASNIITGFAIYYFSYVIGDADLFPYYLSYAGAANLVTLVFFPRLVKSLSRRILWAGASI LPVLSCGVLLLMALMSYHNVVLIVIAGILLNVGTALFWVLQVIMVADTVDYGEYKLHVRC ESIAYSVQTMVVKGGSAFAAFFIAVVLGMIGYVPNVEQSTQALLGMQFIMIALPTLFFMV TLILYFRFYRLNGDTLRRIQIHLLDKYRKVPPEPVHADIPVGAVSDVKA >gi|296493165|gb|ADTK01000336.1| GENE 25 26472 - 27101 629 209 aa, chain - ## HITS:1 COG:ECs5103 KEGG:ns NR:ns ## COG: ECs5103 COG3647 # 
Protein_GI_number: 15834357 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 209 1 209 209 396 100.0 1e-110 MTRTLKPLILNTGALALTLILIYTGISAHDKLTWLMEVTPVIIVVPLLLATAKRYPLTPL LYTLIFFHAIILMVGGQYTYAKVPIGFEVQEWLGLSRNPYDKLGHFFQGLVPALVAREIL VRGMYVRGRKMVAFLVCCVALAISAMYELIEWWAALAMGQGADDFLGTQGDQWDTQSDMF CALLGALTTVIFLARFHCRQLRRFGLITG >gi|296493165|gb|ADTK01000336.1| GENE 26 27224 - 28870 507 548 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 69 531 31 482 508 199 32 1e-50 MSNKPFIYQAPFPMGKDNTEYYLLTSDYVSVADFDGETILKVEPEALTLLAQQAFHDASF MLRPAHQKQVAAILHDPEASENDKYVALQFLRNSEIAAKGVLPTCQDTGTAIIVGKKGQR VWTGGGDEEALSKGVYNTYIEDNLRYSQNAALDMYKEVNTGTNLPAQIDLYAVDGDEYKF LCVAKGGGSANKTYLYQETKALLTPGKLKNFLVEKMRTLGTAACPPYHIAFVIGGTSAET NLKTVKLASAHYYDELPTEGNEHGQAFRDVQLEQELLEEAQKLGLGAQFGGKYFAHDIRV IRLPRHGASCPVGMGVSCSADRNIKAKINREGIWIEKLEHNPGQYIPQELRQAGEGEAVK VDLNRPMKEILAQLSQYPVSTRLSLTGTIIVGRDIAHAKLKELIDAGKELPQYIKDHPIY YAGPAKTPAGYPSGSLGPTTAGRMDSYVDLLQSHGGSMIMLAKGNRSQQVTDACHKHGGF YLGSIGGPAAVLAQQSIKHLECVAYPELGMEAIWKIEVEDFPAFILVDDKGNDFFQQIVN KQCANCTK Prediction of potential genes in microbial genomes Time: Mon May 16 16:05:18 2011 Seq name: gi|296493164|gb|ADTK01000337.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.4, whole genome shotgun sequence Length of sequence - 25938 bp Number of predicted genes - 24, with homology - 24 Number of transcription units - 16, operones - 6 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 12 - 1352 1458 ## COG2704 Anaerobic C4-dicarboxylate transporter 2 2 Op 1 9/0.000 - CDS 1923 - 2642 278 ## PROTEIN SUPPORTED gi|149011191|ref|ZP_01832496.1| 30S ribosomal protein S9 3 2 Op 2 . - CDS 2639 - 4270 1353 ## COG3290 Signal transduction histidine kinase regulating citrate/malate metabolism - Prom 4294 - 4353 2.9 4 3 Op 1 . 
+ CDS 4262 - 4396 63 ## EcSMS35_4593 hypothetical protein 5 3 Op 2 5/0.333 + CDS 4451 - 4681 227 ## COG3592 Uncharacterized conserved protein 6 3 Op 3 . + CDS 4693 - 4965 339 ## COG2388 Predicted acetyltransferase 7 4 Tu 1 . - CDS 4895 - 5086 116 ## gi|293407852|ref|ZP_06651692.1| predicted protein - Prom 5182 - 5241 3.6 + Prom 5044 - 5103 1.7 8 5 Op 1 . + CDS 5192 - 5488 199 ## ECUMN_4661 hypothetical protein 9 5 Op 2 . + CDS 5516 - 5689 179 ## ECH74115_5644 hypothetical protein + Term 5768 - 5803 6.7 - Term 5756 - 5791 6.7 10 6 Tu 1 . - CDS 5808 - 7325 1665 ## COG1190 Lysyl-tRNA synthetase (class II) - Prom 7408 - 7467 5.5 11 7 Tu 1 4/0.333 - CDS 7562 - 9019 1224 ## COG3104 Dipeptide/tripeptide permease - Term 9027 - 9070 6.3 12 8 Op 1 10/0.000 - CDS 9078 - 11225 2237 ## COG1982 Arginine/lysine/ornithine decarboxylases - Term 11234 - 11289 3.0 13 8 Op 2 4/0.333 - CDS 11305 - 12639 1442 ## COG0531 Amino acid transporters - Term 12897 - 12930 3.4 14 9 Tu 1 . - CDS 13005 - 14543 700 ## COG3710 DNA-binding winged-HTH domains - Prom 14569 - 14628 3.8 - TRNA 15160 - 15235 84.1 # Phe GAA 0 0 - Term 15116 - 15147 4.1 15 10 Tu 1 . - CDS 15341 - 16048 562 ## COG1811 Uncharacterized membrane protein, possible Na+ channel or pump + Prom 16268 - 16327 3.8 16 11 Tu 1 . + CDS 16446 - 18581 1768 ## COG1982 Arginine/lysine/ornithine decarboxylases + Term 18594 - 18627 -0.4 - Term 18576 - 18621 12.1 17 12 Tu 1 . - CDS 18631 - 19887 1478 ## COG0477 Permeases of the major facilitator superfamily - Prom 19993 - 20052 6.2 18 13 Op 1 6/0.333 - CDS 20089 - 21171 959 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) 19 13 Op 2 11/0.000 - CDS 21233 - 21508 378 ## COG2924 Fe-S cluster protector protein 20 13 Op 3 . 
- CDS 21536 - 22588 1083 ## PROTEIN SUPPORTED gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 - Prom 22698 - 22757 2.5 + Prom 22657 - 22716 3.6 21 14 Op 1 4/0.333 + CDS 22749 - 23468 777 ## COG0220 Predicted S-adenosylmethionine-dependent methyltransferase 22 14 Op 2 . + CDS 23468 - 23794 450 ## COG3171 Uncharacterized protein conserved in bacteria + Term 23937 - 23967 3.0 23 15 Tu 1 . + CDS 23981 - 24697 933 ## ECO111_3709 hypothetical protein + Term 24707 - 24753 5.1 + Prom 24733 - 24792 6.0 24 16 Tu 1 . + CDS 24873 - 25919 1240 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D Predicted protein(s) >gi|296493164|gb|ADTK01000337.1| GENE 1 12 - 1352 1458 446 aa, chain - ## HITS:1 COG:ECs5105 KEGG:ns NR:ns ## COG: ECs5105 COG2704 # Protein_GI_number: 15834359 # Func_class: R General function prediction only # Function: Anaerobic C4-dicarboxylate transporter # Organism: Escherichia coli O157:H7 # 1 446 1 446 446 743 100.0 0 MLFTIQLIIILICLFYGARKGGIALGLLGGIGLVILVFVFHLQPGKPPVDVMLVIIAVVA ASATLQASGGLDVMLQIAEKLLRRNPKYVSIVAPFVTCTLTILCGTGHVVYTILPIIYDV AIKNNIRPERPMAASSIGAQMGIIASPVSVAVVSLVAMLGNVTFDGRHLEFLDLLAITIP STLIGILAIGIFSWFRGKDLDKDEEFQKFISVPENREYVYGDTATLLDKKLPKSNWLAMW IFLGAIAVVALLGADSDLRPSFGGKPLSMVLVIQMFMLLTGALIIILTKTNPASISKNEV FRSGMIAIVAVYGIAWMAETMFGAHMSEIQGVLGEMVKEYPWAYAIVLLLVSKFVNSQAA ALAAIVPVALAIGVDPAYIVASAPACYGYYILPTYPSDLAAIQFDRSGTTHIGRFVINHS FILPGLIGVSVSCVFGWIFAAMYGFL >gi|296493164|gb|ADTK01000337.1| GENE 2 1923 - 2642 278 239 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149011191|ref|ZP_01832496.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP19-BS75] # 2 225 1 222 226 111 36 4e-24 MINVLIIDDDAMVAELNRRYVAQIPGFQCCGTASTLEKAKEIIFNSDTPIDLILLDIYMQ KENGLDLLPVLHNARCKSDVIVISSAADAATIKDSLHYGVVDYLIKPFQASRFEEALTGW RQKKMALEKHQYYDQAELDQLIHGSSSNEQDPRRLPKGLTPQTLRTLCQWIDAHQDYEFS TDELANEVNISRVSCRKYLIWLVNCHILFTSIHYGVTGRPVYRYRIQAEHYSLLKQYCQ >gi|296493164|gb|ADTK01000337.1| GENE 3 2639 - 4270 1353 543 aa, chain - ## HITS:1 COG:ECs5107 KEGG:ns 
NR:ns ## COG: ECs5107 COG3290 # Protein_GI_number: 15834361 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase regulating citrate/malate metabolism # Organism: Escherichia coli O157:H7 # 1 543 1 543 543 1070 99.0 0 MRHSLPYRMLRKRPMKLSTTVILMVSAVLFSVLLVVHLIYFSQISDMTRDGLANKALAVA RTLADSPEIRQGLQKKPQESGIQAIAEAVRKRNDLLFIVVTDMQSLRYSHPEAQRIGQPF KGDDILKALNGKENVAINRGFLAQALRVFTPIYDENHKQIGVVAIGLELSRVTQQINDSR WSIIWSVLFGMLVGLIGTCILVKVLKKILFGLEPYEISTLFEQRQAMLQSIKEGVVAVDD RGEVTLINDAAQELLNYRKSQDDEKLSTLSHSWSQVVDVSEVLRDGTPRRDEEITIKDRL LLINTVPVRSNGVIIGAISTFRDKTEVRKLMQRLDGLVNYADALRERSHEFMNKLHVILG LLHLKSYKQLEDYILKTANNYQEEIGSLLGKIKSPVIAGFLISKINRATDLGHTLILNSE SQLPDSGSEDQVATLITTLGNLIENALEALGPEPGGEISVTLHYRHGWLHCEVNDDGPGI APDKIDHIFDKGVSTKGSERGVGLALVKQQVENLGGSIAVESEPGIFTQFFVQIPWDGER SNR >gi|296493164|gb|ADTK01000337.1| GENE 4 4262 - 4396 63 44 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_4593 NR:ns ## KEGG: EcSMS35_4593 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 44 1 44 44 79 100.0 3e-14 MSHQLPCVTNFLSIISDEAGNSKGVRMIGYIGEETLATETASAV >gi|296493164|gb|ADTK01000337.1| GENE 5 4451 - 4681 227 76 aa, chain + ## HITS:1 COG:ECs5108 KEGG:ns NR:ns ## COG: ECs5108 COG3592 # Protein_GI_number: 15834362 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 76 1 76 76 158 97.0 2e-39 MDQALLDEGYRCYTGEKIDVYFNTAICQHSGNCVRGNGKLFNLKRKPWIMPDEVDVVTVV KVIDTCPSGALKYRHK >gi|296493164|gb|ADTK01000337.1| GENE 6 4693 - 4965 339 90 aa, chain + ## HITS:1 COG:ECs5109 KEGG:ns NR:ns ## COG: ECs5109 COG2388 # Protein_GI_number: 15834363 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Escherichia coli O157:H7 # 1 90 1 90 90 163 100.0 8e-41 MEIREGHNKFYINDEQGKQIAEIVFVPTGENLAIIEHTDVDESLKGQGIGKQLVAKVVEK MRREKRKIIPLCPFAKHEFDKTREYDDIRS >gi|296493164|gb|ADTK01000337.1| GENE 7 4895 - 5086 116 63 aa, chain - ## HITS:1 COG:no 
KEGG:no NR:gi|293407852|ref|ZP_06651692.1| ## NR: gi|293407852|ref|ZP_06651692.1| predicted protein [Escherichia coli B354] # 1 63 1 63 63 106 87.0 5e-22 MTWDKKTRSLPILCASPYPSDALLHHTAESMKNIVNLYSPINCEYHHTPAFYQIRVSQMD IMG >gi|296493164|gb|ADTK01000337.1| GENE 8 5192 - 5488 199 98 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_4661 NR:ns ## KEGG: ECUMN_4661 # Name: yjdK # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 98 1 98 98 172 98.0 5e-42 MEGKNKFNTYVVSFNYPSSYSSVFLRLRSLMYDMNFSSIVADEYGIPRQLNENSFAITTS LAASEIEDLIRLKCLDLPDIDFDLNIMTVDDYFRQFYK >gi|296493164|gb|ADTK01000337.1| GENE 9 5516 - 5689 179 57 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_5644 NR:ns ## KEGG: ECH74115_5644 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 57 1 57 57 68 100.0 6e-11 MALFSKILIFYVIGVNISFVIIWFISHEKTHIRLLSAFLVGITWPMSLPVALLFSLF >gi|296493164|gb|ADTK01000337.1| GENE 10 5808 - 7325 1665 505 aa, chain - ## HITS:1 COG:ECs5111 KEGG:ns NR:ns ## COG: ECs5111 COG1190 # Protein_GI_number: 15834365 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA synthetase (class II) # Organism: Escherichia coli O157:H7 # 1 505 1 505 505 1000 100.0 0 MSEQETRGANEAIDFNDELRNRREKLAALRQQGVAFPNDFRRDHTSDQLHEEFDAKDNQE LESLNIEVSVAGRMMTRRIMGKASFVTLQDVGGRIQLYVARDSLPEGVYNDQFKKWDLGD IIGARGTLFKTQTGELSIHCTELRLLTKALRPLPDKFHGLQDQEVRYRQRYLDLIANDKS RQTFVVRSKILAAIRQFMVARGFMEVETPMMQVIPGGASARPFITHHNALDLDMYLRIAP ELYLKRLVVGGFERVFEINRNFRNEGISVRHNPEFTMMELYMAYADYHDLIELTESLFRT LAQEVLGTTKVTYGEHVFDFGKPFEKLTMREAIKKYRPETDMADLDNFDAAKALAESIGI TVEKSWGLGRIVTEIFDEVAEAHLIQPTFITEYPAEVSPLARRNDVNPEITDRFEFFIGG REIGNGFSELNDAEDQAERFQEQVNAKAAGDDEAMFYDEDYVTALEYGLPPTAGLGIGID RMIMLFTNSHTIRDVILFPAMRPQK >gi|296493164|gb|ADTK01000337.1| GENE 11 7562 - 9019 1224 485 aa, chain - ## HITS:1 COG:yjdL KEGG:ns NR:ns ## COG: yjdL COG3104 # Protein_GI_number: 16131956 # Func_class: E Amino acid transport and metabolism # Function: Dipeptide/tripeptide 
permease # Organism: Escherichia coli K12 # 1 485 1 485 485 888 99.0 0 MKTPSQPRAIYYIVAIQIWEYFSFYGMRALLILYLTHQLGFDDNHAISLFSAYASLVYVT PILGGWLADRLLGNRTAVIAGALLMTLGHVVLGIDTNSTFSLYLALAIIICGYGLFKSNI SCLLGELYDENDHRRDGGFSLLYAAGNIGSIAAPIACGLAAQWYGWHVGFALAGGGMFIG LLIFLSGHRHFQSTRSMDKKALTSVKFALPVWSWLVVMLCLAPVFFTLLLENDWSGYLLA IVCLIAAQIIARMMVKFPEHRRALWQIVLLMFVGTLFWVLAQQGGSTISLFIDRFVNRQA FNIEVPTALFQSVNAIAVMLAGVVLAWLASPESRGNSTLRVWLKFAFGLLLMACGFMLLA FDARHAAADGQASMGVMISGLALMGFAELFIDPVAIAQITRLKMSGVLTGIYMLATGAVA NWLAGVVAQQTTESQISGMAIAAYQRFFSQMGEWTLACVAIIVVLAFATRFLFSTPTNMI QESND >gi|296493164|gb|ADTK01000337.1| GENE 12 9078 - 11225 2237 715 aa, chain - ## HITS:1 COG:ECs5113 KEGG:ns NR:ns ## COG: ECs5113 COG1982 # Protein_GI_number: 15834367 # Func_class: E Amino acid transport and metabolism # Function: Arginine/lysine/ornithine decarboxylases # Organism: Escherichia coli O157:H7 # 1 715 1 715 715 1494 100.0 0 MNVIAILNHMGVYFKEEPIRELHRALERLNFQIVYPNDRDDLLKLIENNARLCGVIFDWD KYNLELCEEISKMNENLPLYAFANTYSTLDVSLNDLRLQISFFEYALGAAEDIANKIKQT TDEYINTILPPLTKALFKYVREGKYTFCTPGHMGGTAFQKSPVGSLFYDFFGPNTMKSDI SISVSELGSLLDHSGPHKEAEQYIARVFNADRSYMVTNGTSTANKIVGMYSAPAGSTILI DRNCHKSLTHLMMMSDVTPIYFRPTRNAYGILGGIPQSEFQHATIAKRVKETPNATWPVH AVITNSTYDGLLYNTDFIKKTLDVKSIHFDSAWVPYTNFSPIYEGKCGMSGGRVEGKVIY ETQSTHKLLAAFSQASMIHVKGDVNEETFNEAYMMHTTTSPHYGIVASTETAAAMMKGNA GKRLINGSIERAIKFRKEIKRLRTESDGWFFDVWQPDHIDTTECWPLRSDSTWHGFKNID NEHMYLDPIKVTLLTPGMEKDGTMSDFGIPASIVAKYLDEHGIVVEKTGPYNLLFLFSIG IDKTKALSLLRALTDFKRAFDLNLRVKNMLPSLYREDPEFYENMRIQELAQNIHKLIVHH NLPDLMYRAFEVLPTMVMTPYAAFQKELHGMTEEVYLDEMVGRINANMILPYPPGVPLVM PGEMITEESRPVLEFLQMLCEIGAHYPGFETDIHGAYRQADGRYTVKVLKEESKK >gi|296493164|gb|ADTK01000337.1| GENE 13 11305 - 12639 1442 444 aa, chain - ## HITS:1 COG:ECs5114 KEGG:ns NR:ns ## COG: ECs5114 COG0531 # Protein_GI_number: 15834368 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli O157:H7 # 1 444 1 444 444 761 100.0 0 
MSSAKKIGLFACTGVVAGNMMGSGIALLPANLASIGGIAIWGWIISIIGAMSLAYVYARL ATKNPQQGGPIAYAGEISPAFGFQTGVLYYHANWIGNLAIGITAVSYLSTFFPVLNDPVP AGIACIAIVWVFTFVNMLGGTWVSRLTTIGLVLVLIPVVMTAIVGWHWFDAATYAANWNT ADTTDGHAIIKSILLCLWAFVGVESAAVSTGMVKNPKRTVPLATMLGTGLAGIVYIAATQ VLSGMYPSSVMAASGAPFAISASTILGNWAAPLVSAFTAFACLTSLGSWMMLVGQAGVRA ANDGNFPKVYGEVDSNGIPKKGLLLAAVKMTALMILITLMNSAGGKASDLFGELTGIAVL LTMLPYFYSCVDLIRFEGVNIRNFVSLICSVLGCVFCFIALMGASSFELAGTFIVSLIIL MFYARKMHERQSHSMDNHTASNAH >gi|296493164|gb|ADTK01000337.1| GENE 14 13005 - 14543 700 512 aa, chain - ## HITS:1 COG:cadC_1 KEGG:ns NR:ns ## COG: cadC_1 COG3710 # Protein_GI_number: 16131959 # Func_class: K Transcription # Function: DNA-binding winged-HTH domains # Organism: Escherichia coli K12 # 1 180 1 180 180 327 99.0 4e-89 MQQPVVRVGEWLVTPSINQISRNGRQLTLEPRLIDLLVFFAQHSGEVLSRDELIDNVWKR SIVTNHVVTQSISELRKSLKDNDEDSPVYIATVPKRGYKLMVPVIWYSEEEGEEIMLSSP PPIPEAVPATDSPSHSLNIQNTATPPEQSPVKSKRFTTFWVWFFFLLSLGICVALVAFST LDTRLPMSKSRILLNPRDIDINMVNKSCNSWSSPYQLSYAIGVGDLVATSLNTFSTFMVH DKINYNIDEPSSSGKTLSIAFVNQRQYRAQQCFMSIKLVDNADGSTMLDKRYVITNGNQL AIQNDLLESLSKALNQPWPQRMQETLQQILPHRGALLTNFYQAHDYLLHGDDKSLNRASE LLGEIVQSSPEFTYARAEKALVDIVRHSQHPLDEKQLAALNTEIDNIVTLPELNNLSIIY QIKAVSALVKGKTDESYQAINTGIDLEMSWLNYVLLGKVYEMKGMNREAADAYLTAFNLR PGANTLYWIENGIFQTSVPYVVPYLDKFLASE >gi|296493164|gb|ADTK01000337.1| GENE 15 15341 - 16048 562 235 aa, chain - ## HITS:1 COG:ECs3842 KEGG:ns NR:ns ## COG: ECs3842 COG1811 # Protein_GI_number: 15833096 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, possible Na+ channel or pump # Organism: Escherichia coli O157:H7 # 1 235 1 235 235 332 98.0 3e-91 MVIGPFINASAVLLGGVLGALLSQRLPERIRVSMTSIFGLASLGIGILLVVKCANLPAMV LATLLGALIGEICLLEKGVNTAVTKAQNLFRHSRKKPAHESFIQNYVAIIVLFCASGTGI FGAMNEGMTGDPSILIAKSFLDFFTAMIFACSLGIAVSVISIPLLIIQLTLAWAAALILP LTTPSMMADFSAVGGLLLLATGLRICGIKMFPVVNMLPALLLAMPLSAAWTAWFA >gi|296493164|gb|ADTK01000337.1| GENE 16 16446 - 18581 1768 711 aa, chain + ## HITS:1 COG:ECs3841 KEGG:ns NR:ns ## COG: 
ECs3841 COG1982 # Protein_GI_number: 15833095 # Func_class: E Amino acid transport and metabolism # Function: Arginine/lysine/ornithine decarboxylases # Organism: Escherichia coli O157:H7 # 1 711 21 731 731 1486 98.0 0 MKSMNIAASSDLVSRLSSHRRVVALGDTDFTDVAAVVITAADSRSGILALLKRTGFHLPV FLYSEHAVELPAGVTAVINGNEQQWLELESAACQYEENLLPPFYDTLTQYVEMGNSTFAC PGHQHGAFFKKHPAGRHFYDFFGENVFRADMCNADVKLGDLLIHEGSAKDAQKFAAKVFH ADKTYFVLNGTSAANKVVTNALLTRGDLVLFDRNNHKSNHHGALIQAGATPVYLEASRNP FGFIGGIDAHCFNEEYLRQQIRDVAPEKAELPRPFRLAIIQLGTYDGTVYNARQVIDTVG HLCDYILFDSAWVGYEQFIPMMADSSPLLLELNENDPGIFVTQSVHKQQAGFSQTSQIHK KDNHIRGQARFCPHKRLNNAFMLHASTSPFYPLFAALDVNAKIHEGESGRRLWAECVELG IESRKAILARCKLFRPFIPPVVDGKLWQDYPTSVLASDRRFFSFEPGAKWHGFEGYAADQ YFVDPCKLLLTTPGINAETGEYSDFGVPATILAHYLRENGIVPEKCDLNSILFLLTPAES HEKLAQLVAMLAQFEQHIEDDSPLAEVLPSVYNKYPVRYRDYTMRQLCQEMHDLYVSFDV KDLQKAMFRQQSFPSVVMNPQDAHSAYIRGEVELVRIRDAEGRIAAEGALPYPPGVLCVV PGEVWGGAVQRYFLALEEGVNLLPGFSPELQGVYSETDADGMKRLYGYVLK >gi|296493164|gb|ADTK01000337.1| GENE 17 18631 - 19887 1478 418 aa, chain - ## HITS:1 COG:ECs3840 KEGG:ns NR:ns ## COG: ECs3840 COG0477 # Protein_GI_number: 15833094 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 418 17 434 434 725 100.0 0 MNLKLQLKILSFLQFCLWGSWLTTLGSYMFVTLKFDGASIGAVYSSLGIAAVFMPALLGI VADKWLSAKWVYAICHTIGAITLFMAAQVTTPEAMFLVILINSFAYMPTLGLINTISYYR LQNAGMDIVTDFPPIRIWGTIGFIMAMWVVSLSGFELSHMQLYIGAALSAILVLFTLTLP HIPVAKQQANQSWTTLLGLDAFALFKNKRMAIFFIFSMLLGAELQITNMFGNTFLHSFDK DPMFASSFIVQHASIIMSISQISETLFILTIPFFLSRYGIKNVMMISIVAWILRFALFAY GDPTPFGTVLLVLSMIVYGCAFDFFNISGSVFVEKEVSPAIRASAQGMFLMMTNGFGCIL GGIVSGKVVEMYTQNGITDWQTVWLIFAGYSVVLAFAFMAMFKYKHVRVPTGTQTVSH >gi|296493164|gb|ADTK01000337.1| GENE 18 20089 - 21171 959 360 aa, chain - ## HITS:1 COG:mltC KEGG:ns NR:ns ## COG: mltC COG0741 # Protein_GI_number: 16130864 # 
Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Escherichia coli K12 # 1 360 1 360 360 690 100.0 0 MMKKYLALALIAPLLISCSTTKKGDTYNEAWVKDTNGFDILMGQFAHNIENIWGFKEVVI AGPKDYVKYTDQYQTRSHINFDDGTITIETIAGTEPAAHLRRAIIKTLLMGDDPSSVDLY SDVDDITISKEPFLYGQVVDNTGQPIRWEGRASNFADYLLKNRLKSRSNGLRIIYSVTIN MVPNHLDKRAHKYLGMVRQASRKYGVDESLILAIMQTESSFNPYAVSRSDALGLMQVVQH TAGKDVFRSQGKSGTPSRSFLFDPASNIDTGTAYLAMLNNVYLGGIDNPTSRRYAVITAY NGGAGSVLRVFSNDKIQAANIINTMTPGDVYQTLTTRHPSAESRRYLYKVNTAQKSYRRR >gi|296493164|gb|ADTK01000337.1| GENE 19 21233 - 21508 378 91 aa, chain - ## HITS:1 COG:STM3111 KEGG:ns NR:ns ## COG: STM3111 COG2924 # Protein_GI_number: 16766412 # Func_class: C Energy production and conversion; O Posttranslational modification, protein turnover, chaperones # Function: Fe-S cluster protector protein # Organism: Salmonella typhimurium LT2 # 1 91 1 91 91 161 94.0 3e-40 MSRTIFCTFLQREAEGQDFQLYPGELGKRIYNEISKEAWAQWQHKQTMLINEKKLNMMNA EHRKLLEQEMVNFLFEGKEVHIEGYTPEDKK >gi|296493164|gb|ADTK01000337.1| GENE 20 21536 - 22588 1083 350 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 [Haemophilus influenzae 7P49H1] # 6 343 11 366 378 421 59 1e-117 MQASQFSAQVLDWYDKYGRKTLPWQIDKTPYKVWLSEVMLQQTQVATVIPYFERFMARFP TVTDLANAPLDEVLHLWTGLGYYARARNLHKAAQQVATLHGGKFPETFEEVAALPGVGRS TAGAILSLSLGKHFPILDGNVKRVLARCYAVSGWPGKKEVENKLWSLSEQVTPAVGVERF NQAMMDLGAMICTRSKPKCSLCPLQNGCIAAANNSWSLYPGKKPKQTLPERTGYFLLLQH EDEVLLAQRPPSGLWGGLYCFPQFADEESLRQWLAQRQIAADNLTQLTAFRHTFSHFHLD IVPMWLPVSSFTGCMDEGNALWYNLAQPPSVGLAAPVERLLQQLRTGAPV >gi|296493164|gb|ADTK01000337.1| GENE 21 22749 - 23468 777 239 aa, chain + ## HITS:1 COG:yggH KEGG:ns NR:ns ## COG: yggH COG0220 # Protein_GI_number: 16130861 # Func_class: R General function prediction only # Function: Predicted S-adenosylmethionine-dependent methyltransferase # Organism: Escherichia coli K12 # 1 239 1 239 239 491 
100.0 1e-139 MKNDVISPEFDENGRPLRRIRSFVRRQGRLTKGQEHALENYWPVMGVEFSEDMLDFPALF GREAPVTLEIGFGMGASLVAMAKDRPEQDFLGIEVHSPGVGACLASAHEEGLSNLRVMCH DAVEVLHKMIPDNSLRMVQLFFPDPWHKARHNKRRIVQVPFAELVKSKLQLGGVFHMATD WEPYAEHMLEVMSSIDGYKNLSESNDYVPRPASRPVTKFEQRGHRLGHGVWDLMFERVK >gi|296493164|gb|ADTK01000337.1| GENE 22 23468 - 23794 450 108 aa, chain + ## HITS:1 COG:yggL KEGG:ns NR:ns ## COG: yggL COG3171 # Protein_GI_number: 16130860 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 108 11 118 118 202 100.0 1e-52 MAKNRSRRLRKKMHIDEFQELGFSVAWRFPEGTSEEQIDKTVDDFINEVIEPNKLAFDGS GYLAWEGLICMQEIGKCTEEHQAIVRKWLEERKLDEVRTSELFDVWWD >gi|296493164|gb|ADTK01000337.1| GENE 23 23981 - 24697 933 238 aa, chain + ## HITS:1 COG:no KEGG:ECO111_3709 NR:ns ## KEGG: ECO111_3709 # Name: yggN # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 238 2 239 239 429 100.0 1e-119 MRKMLLAAALSVTAMTAHADYQCSVTPRDDVIVSPQTVQVKGENGNLVITPDGNVMYNGK QYSLNAAQREQAKDYQAELRSTLPWIDEGAKSRVEKARIALDKIIVQEMGESSKMRSRLT KLDAQLKEQMNRIIETRSDGLTFHYKAIDQVRAEGQQLVNQAMGGILQDSINEMGAKAVL KSGGNPLQNVLGSLGGLQSSIQTEWKKQEKDFQQFGKDVCSRVVTLEDSRKALVGNLK >gi|296493164|gb|ADTK01000337.1| GENE 24 24873 - 25919 1240 348 aa, chain + ## HITS:1 COG:ECs3833 KEGG:ns NR:ns ## COG: ECs3833 COG0252 # Protein_GI_number: 15833087 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Escherichia coli O157:H7 # 1 348 1 348 348 613 99.0 1e-175 MEFFKKTALAALVMGFSGAALALPNITILATGGTIAGGGDSATKSNYTAGKVGVENLVNA VPQLKDIANVKGEQVVNIGSQDMNDNVWLTLAKKINTDCDKTDGFVITHGTDTMEETAYF LDLTVKCDKPVVMVGAMRPSTSMSADGPFNLYNAVVTAADKASANRGVLVVMNDTVLDGR DVTKTNTTDVATFKSVNYGPLGYIHNGKIDYQRTPARKHTSDTPFDVSKLNELPKVGIVY NYANASDLPAKALVDAGYDGIVSAGVGNGNLYKSVFDTLATAAKNGTAVVRSSRVPTGAT TQDAEVDDAKYGFVASGTLNPQKARVLLQLALTQTKDPQQIQQIFNQY Prediction of potential genes in microbial 
genomes Time: Mon May 16 16:05:35 2011 Seq name: gi|296493163|gb|ADTK01000338.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.5, whole genome shotgun sequence Length of sequence - 8546 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 5, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 14 - 73 4.2 1 1 Tu 1 . + CDS 102 - 1109 865 ## ECO103_3539 hypothetical protein - Term 1158 - 1202 3.4 2 2 Op 1 13/0.000 - CDS 1264 - 2400 1048 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases 3 2 Op 2 2/1.000 - CDS 2393 - 2986 1005 ## PROTEIN SUPPORTED gi|157155704|ref|YP_001464307.1| putative deoxyribonucleotide triphosphate pyrophosphatase 4 2 Op 3 6/0.500 - CDS 2994 - 3284 300 ## COG1872 Uncharacterized conserved protein 5 2 Op 4 5/0.500 - CDS 3281 - 3847 770 ## COG0762 Predicted integral membrane protein 6 2 Op 5 . - CDS 3865 - 4569 646 ## COG0325 Predicted enzyme with a TIM-barrel fold - Prom 4614 - 4673 3.0 + Prom 4279 - 4338 3.5 7 3 Tu 1 . + CDS 4587 - 5567 744 ## COG2805 Tfp pilus assembly protein, pilus retraction ATPase PilT + Term 5773 - 5806 2.2 - Term 5606 - 5649 3.2 8 4 Op 1 8/0.500 - CDS 5730 - 6146 439 ## COG0816 Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) 9 4 Op 2 2/1.000 - CDS 6146 - 6709 628 ## COG1678 Putative transcriptional regulator - Prom 6746 - 6805 3.2 - Term 6727 - 6783 3.4 10 5 Op 1 2/1.000 - CDS 6818 - 7768 347 ## PROTEIN SUPPORTED gi|212636859|ref|YP_002313384.1| Glutathione synthase; Ribosomal protein S6 modification enzyme 11 5 Op 2 . 
- CDS 7781 - 8512 504 ## COG1385 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|296493163|gb|ADTK01000338.1| GENE 1 102 - 1109 865 335 aa, chain + ## HITS:1 COG:no KEGG:ECO103_3539 NR:ns ## KEGG: ECO103_3539 # Name: yggM # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 335 1 335 335 599 99.0 1e-170 MKKQWIVGTALLMLMTGNAWADGEPPTENILKDQFKKQYHGILKLDAITLKNLDAKGNQA TWSAEGDVSSSDDLYTWVGQLADYELLEQTWTKDKPVKFSAMLTSKGTPASGWSVNFYSF QAAASDRGRVVDDIKTNNKYLIVNSEDFNYRFSQLESALNTQKNSIPALEKEVKALDKQM VAAQKAADAYWGKDANGKQMTREEAFKKIHQQRDEFNKQNDSEAFEVKYDKEVYQPAIAA CHKQSEECYEVPIQQKRDFDINEQRRQTFLQSQKLSRKLQDDWVTLEKGQYPLTMKVSEI NSKKVAILMKIDDINQANERWKKDTEQLRRNGVIK >gi|296493163|gb|ADTK01000338.1| GENE 2 1264 - 2400 1048 378 aa, chain - ## HITS:1 COG:yggW KEGG:ns NR:ns ## COG: yggW COG0635 # Protein_GI_number: 16130856 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Escherichia coli K12 # 1 378 1 378 378 786 98.0 0 MVKLPPLSLYIHIPWCVQKCPYCDFNSHALKGEVPHDDYVQHLLCDLDNDVAYAQGREVK TIFIGGGTPSLLSGPAMQTLLDGVRARLPLAADAEITMEANPGTVEADRFVDYQRAGVNR ISIGVQSFSEEKLKRLGRIHGPQEAKRAAKLASGLGLRSFNLDLMHGLPDQSLEEALGDL RQAIELNPPHLSWYQLTIEPNTLFGSRPPVLPDDDALWDIFEQGHQLLTAAGYQQYETSA YAKPSYQCQHNLNYWRFGDYIGIGCGAHGKVTFPDGRILRTTKTRHPRGFMQGRYLESQR DVEAADKPFEFFMNRFRLLEAAPRAEFSAYTGLCEDVIRPQLDEAIAQGYLTECADYWQI TEHGKLFLNSLLELFLAE >gi|296493163|gb|ADTK01000338.1| GENE 3 2393 - 2986 1005 197 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157155704|ref|YP_001464307.1| putative deoxyribonucleotide triphosphate pyrophosphatase [Escherichia coli E24377A] # 1 197 1 197 197 391 100 1e-109 MQKVVLATGNAGKVRELASLLSDFGLDIVAQTDLGVDSAEETGLTFIENAILKARHAAKV TGLPAIADDSGLAVDVLGGAPGIYSARYSGEDATDLKNLQKLLETLKDVPDDQRQARFHC VLVYLRHAEDPTPLVCHGSWPGVITREPAGTGGFGYDPIFFVPSEGKTAAELTREEKSAI SHRGQALKLLLDALRNG >gi|296493163|gb|ADTK01000338.1| GENE 4 2994 - 3284 300 96 aa, chain - ## HITS:1 COG:ECs3829 KEGG:ns NR:ns ## COG: 
ECs3829 COG1872 # Protein_GI_number: 15833083 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 96 5 100 100 174 98.0 5e-44 MSAVTVNDDGLVLRLYIQPKASRDSIVGLHGDEVKVAITAPPVDGQANSHLVKFLGKQFR VAKSQVVIEKGELGRHKQIKIINPQQIPPEIAALIN >gi|296493163|gb|ADTK01000338.1| GENE 5 3281 - 3847 770 188 aa, chain - ## HITS:1 COG:ECs3828 KEGG:ns NR:ns ## COG: ECs3828 COG0762 # Protein_GI_number: 15833082 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Escherichia coli O157:H7 # 1 188 1 188 188 272 100.0 2e-73 MNTLTFLLSTVIELYTMVLLLRIWMQWAHCDFYNPFSQFVVKVTQPIIGPLRRVIPAMGP IDSASLLVAYILSFIKAIVLFKVVTFLPIIWIAGLLILLKTIGLLIFWVLLVMAIMSWVS QGRSPIEYVLIQLADPLLRPIRRLLPAMGGIDFSPMILVLLLYVINMGVAEVLQATGNML LPGLWMAL >gi|296493163|gb|ADTK01000338.1| GENE 6 3865 - 4569 646 234 aa, chain - ## HITS:1 COG:ECs3826 KEGG:ns NR:ns ## COG: ECs3826 COG0325 # Protein_GI_number: 15833081 # Func_class: R General function prediction only # Function: Predicted enzyme with a TIM-barrel fold # Organism: Escherichia coli O157:H7 # 1 234 1 234 234 456 100.0 1e-128 MNDIAHNLAQVRDKISAAATRCGRSPEEITLLAVSKTKPASAIAEAIDAGQRQFGENYVQ EGVDKIRHFQELGVTGLEWHFIGPLQSNKSRLVAEHFDWCHTIDRLRIATRLNDQRPAEL PPLNVLIQINISDENSKSGIQLAELDELAAAVAELPRLRLRGLMAIPAPESEYVRQFEVA RQMAVAFAGLKTRYPHIDTLSLGMSDDMEAAIAAGSTMVRIGTAIFGARDYSKK >gi|296493163|gb|ADTK01000338.1| GENE 7 4587 - 5567 744 326 aa, chain + ## HITS:1 COG:ECs3827 KEGG:ns NR:ns ## COG: ECs3827 COG2805 # Protein_GI_number: 15833080 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Tfp pilus assembly protein, pilus retraction ATPase PilT # Organism: Escherichia coli O157:H7 # 1 326 16 341 341 612 97.0 1e-175 MNMEEIVALSVKHNVSDLHLCSAWPARWRIRGRMEAAPFDAPDVEELLREWLDDDQRAIL LENGQLDFAVSLAENQRLRGSAFAQRQGISLALRLLPSHCPQLEQLGAPPVLPELLKSEN GLILVTGATGSGKSTTLAAMVGYLNQHADAHILTLEDPVEYLYASQRCLIQQREIGLHCM 
TFASGLRAALREDPDVILLGELRDSETIRLALTAAETGHLVLATLHTRGAAQAVERLVDS FPAQEKDPVRNQLAGSLRAVLSQKLEVDKQEGCVALFELLINTPAVGNLIREGKTHQLPH VIQTGQQVGMITFQQSFQQRVGEGRL >gi|296493163|gb|ADTK01000338.1| GENE 8 5730 - 6146 439 138 aa, chain - ## HITS:1 COG:yqgF KEGG:ns NR:ns ## COG: yqgF COG0816 # Protein_GI_number: 16130850 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) # Organism: Escherichia coli K12 # 1 138 1 138 138 276 100.0 7e-75 MSGTLLAFDFGTKSIGVAVGQRITGTARPLPAIKAQDGTPDWNIIERLLKEWQPDEIIVG LPLNMDGTEQPLTARARKFANRIHGRFGVEVKLHDERLSTVEARSGLFEQGGYRALNKGK VDSASAVIILESYFEQGY >gi|296493163|gb|ADTK01000338.1| GENE 9 6146 - 6709 628 187 aa, chain - ## HITS:1 COG:ECs3824 KEGG:ns NR:ns ## COG: ECs3824 COG1678 # Protein_GI_number: 15833078 # Func_class: K Transcription # Function: Putative transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 187 25 211 211 374 100.0 1e-104 MNLQHHFLIAMPALQDPIFRRSVVYICEHNTNGAMGIIVNKPLENLKIEGILEKLKITPE PRDESIRLDKPVMLGGPLAEDRGFILHTPPSNFASSIRISDNTVMTTSRDVLETLGTDKQ PSDVLVALGYASWEKGQLEQEILDNAWLTAPADLNILFKTPIADRWREAAKLIGVDILTM PGVAGHA >gi|296493163|gb|ADTK01000338.1| GENE 10 6818 - 7768 347 316 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|212636859|ref|YP_002313384.1| Glutathione synthase; Ribosomal protein S6 modification enzyme [Shewanella piezotolerans WP3] # 7 310 6 319 345 138 31 2e-32 MIKLGIVMDPIANINIKKDSSFAMLLEAQRRGYELHYMEMGDLYLINGEARAHTRTLNVK QNYEEWFSFVGEQDLPLADLDVILMRKDPPFDTEFIYATYILERAEEKGTLIVNKPQSLR DCNEKLFTAWFSDLTPETLVTRNKAQLKAFWEKHSDIILKPLDGMGGASIFRVKEGDPNL GVIAETLTEHGTRYCMAQNYLPAIKDGDKRVLVVDGEPVPYCLARIPQGGETRGNLAAGG RGEPRPLTESDWKIARQIGPTLKEKGLIFVGLDIIGDRLTEINVTSPTCIREIEAEFPVS ITGMLMDAIEARLQQQ >gi|296493163|gb|ADTK01000338.1| GENE 11 7781 - 8512 504 243 aa, chain - ## HITS:1 COG:ECs3822 KEGG:ns NR:ns ## COG: ECs3822 COG1385 # Protein_GI_number: 15833076 # Func_class: S Function unknown # Function: 
Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 243 10 252 252 485 99.0 1e-137 MRIPRIYHPEPLTSHSHIALCEDAANHIGRVLRMGPGQALQLFDGSNQVFDAEITSASKK SVEVKVLEGKIDDRESPLHIHLGQVMSRGEKMEFTIQKSIELGVSLITPLFSERCGVKLD SERLNKKLQQWQKIAIAACEQCGRNRVPEIRPAMDLEAWCAEQDEGLKLNLHPRASNSIN TLPLPVERVRLLIGPEGGLSADEIAMTARYQFTDILLGPRVLRTETTALTAITALQVRFG DLG Prediction of potential genes in microbial genomes Time: Mon May 16 16:05:41 2011 Seq name: gi|296493162|gb|ADTK01000339.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.6, whole genome shotgun sequence Length of sequence - 1420 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 - CDS 65 - 772 513 ## COG2356 Endonuclease I - Prom 794 - 853 4.1 2 1 Op 2 . - CDS 867 - 1322 418 ## COG3091 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|296493162|gb|ADTK01000339.1| GENE 1 65 - 772 513 235 aa, chain - ## HITS:1 COG:endA KEGG:ns NR:ns ## COG: endA COG2356 # Protein_GI_number: 16130846 # Func_class: L Replication, recombination and repair # Function: Endonuclease I # Organism: Escherichia coli K12 # 1 235 1 235 235 446 100.0 1e-125 MYRYLSIAAVVLSAAFSGPALAEGINSFSQAKAAAVKVHADAPGTFYCGCKINWQGKKGV VDLQSCGYQVRKNENRASRVEWEHVVPAWQFGHQRQCWQDGGRKNCAKDPVYRKMESDMH NLQPSVGEVNGDRGNFMYSQWNGGEGQYGQCAMKVDFKEKAAEPPARARGAIARTYFYMR DQYNLTLSRQQTQLFNAWNKMYPVTDWECERDERIAKVQGNHNPYVQRACQARKS >gi|296493162|gb|ADTK01000339.1| GENE 2 867 - 1322 418 151 aa, chain - ## HITS:1 COG:ECs3820 KEGG:ns NR:ns ## COG: ECs3820 COG3091 # Protein_GI_number: 15833074 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 151 15 165 165 289 99.0 2e-78 MRRLREKLAQANLKLGRNYPEPKLSYTQRGTSAGTAWLESYEIRLNPVLLLENSEAFIEE VVPHELAHLLVWKHFGRVAPHGKEWKWMMESVLGVPARRTHQFELQSVRRNTFPYRCKCQ EHQLTVRRHNRVVRGEAVYRCVHCGEQLVAK Prediction of 
potential genes in microbial genomes Time: Mon May 16 16:05:56 2011 Seq name: gi|296493161|gb|ADTK01000340.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.7, whole genome shotgun sequence Length of sequence - 47682 bp Number of predicted genes - 42, with homology - 42 Number of transcription units - 21, operones - 10 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 4/0.111 - CDS 35 - 1429 1657 ## COG0477 Permeases of the major facilitator superfamily - Prom 1530 - 1589 9.3 - Term 1810 - 1842 3.0 2 2 Tu 1 . - CDS 1865 - 3019 1332 ## COG0192 S-adenosylmethionine synthetase + Prom 3734 - 3793 4.0 3 3 Tu 1 . + CDS 3814 - 5790 2263 ## COG1166 Arginine decarboxylase (spermidine biosynthesis) + Term 5808 - 5852 11.0 + Prom 5851 - 5910 6.1 4 4 Tu 1 . + CDS 5936 - 6667 536 ## EcE24377A_3280 putative lipoprotein + Term 6735 - 6779 7.4 + Prom 6724 - 6783 3.0 5 5 Tu 1 . + CDS 6803 - 7723 1251 ## COG0010 Arginase/agmatinase/formimionoglutamate hydrolase, arginase family - Term 7813 - 7853 -0.5 6 6 Tu 1 . - CDS 7929 - 8687 836 ## COG0501 Zn-dependent protease with chaperone function - Prom 8709 - 8768 5.0 + Prom 8827 - 8886 3.8 7 7 Tu 1 2/0.778 + CDS 8965 - 10956 2374 ## COG0021 Transketolase + Term 10967 - 11015 5.0 + Prom 11039 - 11098 6.3 8 8 Op 1 3/0.333 + CDS 11270 - 11713 351 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 9 8 Op 2 2/0.778 + CDS 11741 - 13129 1484 ## COG2213 Phosphotransferase system, mannitol-specific IIBC component 10 8 Op 3 2/0.778 + CDS 13144 - 14421 1212 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 11 8 Op 4 3/0.333 + CDS 14418 - 15383 905 ## COG1494 Fructose-1,6-bisphosphatase/sedoheptulose 1,7-bisphosphatase and related proteins 12 8 Op 5 3/0.333 + CDS 15405 - 15914 399 ## COG3722 Transcriptional regulator 13 8 Op 6 . 
+ CDS 15911 - 16624 425 ## COG1072 Panthothenate kinase 14 9 Op 1 3/0.333 - CDS 16596 - 17273 207 ## COG1118 ABC-type sulfate/molybdate transport systems, ATPase component 15 9 Op 2 34/0.000 - CDS 17267 - 17944 454 ## COG1122 ABC-type cobalt transport system, ATPase component 16 9 Op 3 . - CDS 17932 - 18480 395 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters - Prom 18574 - 18633 2.0 17 10 Op 1 . - CDS 18640 - 19218 675 ## EC55989_3217 putative ABC-type transport system 18 10 Op 2 . - CDS 19241 - 19672 374 ## COG1661 Predicted DNA-binding protein with PD1-like DNA-binding motif - Prom 19894 - 19953 6.1 + Prom 19853 - 19912 3.3 19 11 Op 1 26/0.000 + CDS 20044 - 21063 827 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase 20 11 Op 2 9/0.000 + CDS 21113 - 22276 1339 ## COG0126 3-phosphoglycerate kinase + Term 22286 - 22316 3.0 + Prom 22291 - 22350 6.7 21 11 Op 3 4/0.111 + CDS 22491 - 23570 1118 ## COG0191 Fructose/tagatose bisphosphate aldolase + Term 23595 - 23623 2.1 + Prom 23785 - 23844 3.9 22 12 Tu 1 5/0.111 + CDS 23928 - 24788 918 ## COG0668 Small-conductance mechanosensitive channel + Term 24797 - 24836 5.0 + Prom 24840 - 24899 5.7 23 13 Op 1 5/0.111 + CDS 24927 - 25562 600 ## COG1279 Lysine efflux permease 24 13 Op 2 2/0.778 + CDS 25655 - 26395 795 ## COG2968 Uncharacterized conserved protein + Term 26418 - 26451 -0.3 + Prom 26421 - 26480 6.2 25 14 Tu 1 . + CDS 26562 - 27458 679 ## COG0583 Transcriptional regulator 26 15 Op 1 . - CDS 27455 - 28933 1199 ## COG0427 Acetyl-CoA hydrolase 27 15 Op 2 . - CDS 28957 - 29742 947 ## COG1024 Enoyl-CoA hydratase/carnithine racemase 28 15 Op 3 6/0.000 - CDS 29753 - 30748 870 ## COG1703 Putative periplasmic protein kinase ArgK and related GTPases of G3E family 29 15 Op 4 3/0.333 - CDS 30741 - 32885 2263 ## COG1884 Methylmalonyl-CoA mutase, N-terminal domain/subunit - Prom 33003 - 33062 4.0 - Term 33042 - 33075 2.7 30 16 Tu 1 . 
- CDS 33089 - 33982 837 ## COG0583 Transcriptional regulator - Prom 34012 - 34071 1.5 + Prom 34022 - 34081 3.5 31 17 Tu 1 . + CDS 34124 - 34285 109 ## COG0583 Transcriptional regulator + Prom 34323 - 34382 4.5 32 18 Op 1 8/0.000 + CDS 34410 - 35069 827 ## COG0120 Ribose 5-phosphate isomerase + Prom 35220 - 35279 4.4 33 18 Op 2 . + CDS 35325 - 36557 1313 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases + Term 36579 - 36615 7.1 - Term 36738 - 36769 3.2 34 19 Op 1 8/0.000 - CDS 36946 - 37494 393 ## COG0212 5-formyltetrahydrofolate cyclo-ligase 35 19 Op 2 . - CDS 37794 - 38123 366 ## COG3027 Uncharacterized protein conserved in bacteria - Prom 38182 - 38241 3.1 + Prom 38203 - 38262 1.5 36 20 Op 1 5/0.111 + CDS 38291 - 38869 646 ## COG3079 Uncharacterized protein conserved in bacteria 37 20 Op 2 8/0.000 + CDS 38895 - 40220 1348 ## COG0006 Xaa-Pro aminopeptidase 38 20 Op 3 8/0.000 + CDS 40217 - 41395 1029 ## COG0654 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases 39 20 Op 4 5/0.111 + CDS 41418 - 42620 1331 ## COG0654 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases + Term 42664 - 42699 4.1 + Prom 42817 - 42876 7.6 40 21 Op 1 18/0.000 + CDS 43068 - 44162 1328 ## COG0404 Glycine cleavage system T protein (aminomethyltransferase) 41 21 Op 2 12/0.000 + CDS 44186 - 44575 537 ## COG0509 Glycine cleavage system H protein (lipoate-binding) + Term 44605 - 44648 2.5 42 21 Op 3 . 
+ CDS 44693 - 47566 3310 ## COG1003 Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain + Term 47592 - 47617 -0.5 Predicted protein(s) >gi|296493161|gb|ADTK01000340.1| GENE 1 35 - 1429 1657 464 aa, chain - ## HITS:1 COG:galP KEGG:ns NR:ns ## COG: galP COG0477 # Protein_GI_number: 16130844 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 464 1 464 464 852 100.0 0 MPDAKKQGRSNKAMTFFVCFLAALAGLLFGLDIGVIAGALPFIADEFQITSHTQEWVVSS MMFGAAVGAVGSGWLSFKLGRKKSLMIGAILFVAGSLFSAAAPNVEVLILSRVLLGLAVG VASYTAPLYLSEIAPEKIRGSMISMYQLMITIGILGAYLSDTAFSYTGAWRWMLGVIIIP AILLLIGVFFLPDSPRWFAAKRRFVDAERVLLRLRDTSAEAKRELDEIRESLQVKQSGWA LFKENSNFRRAVFLGVLLQVMQQFTGMNVIMYYAPKIFELAGYTNTTEQMWGTVIVGLTN VLATFIAIGLVDRWGRKPTLTLGFLVMAAGMGVLGTMMHIGIHSPSAQYFAIAMLLMFIV GFAMSAGPLIWVLCSEIQPLKGRDFGITCSTATNWIANMIVGATFLTMLNTLGNANTFWV YAALNVLFILLTLWLVPETKHVSLEHIERNLMKGRKLREIGAHD >gi|296493161|gb|ADTK01000340.1| GENE 2 1865 - 3019 1332 384 aa, chain - ## HITS:1 COG:ECs3818 KEGG:ns NR:ns ## COG: ECs3818 COG0192 # Protein_GI_number: 15833072 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine synthetase # Organism: Escherichia coli O157:H7 # 1 384 1 384 384 781 100.0 0 MAKHLFTSESVSEGHPDKIADQISDAVLDAILEQDPKARVACETYVKTGMVLVGGEITTS AWVDIEEITRNTVREIGYVHSDMGFDANSCAVLSAIGKQSPDINQGVDRADPLEQGAGDQ GLMFGYATNETDVLMPAPITYAHRLVQRQAEVRKNGTLPWLRPDAKSQVTFQYDDGKIVG IDAVVLSTQHSEEIDQKSLQEAVMEEIIKPILPAEWLTSATKFFINPTGRFVIGGPMGDC GLTGRKIIVDTYGGMARHGGGAFSGKDPSKVDRSAAYAARYVAKNIVAAGLADRCEIQVS YAIGVAEPTSIMVETFGTEKVPSEQLTLLVREFFDLRPYGLIQMLDLLHPIYKETAAYGH FGREHFPWEKTDKAQLLRDAAGLK >gi|296493161|gb|ADTK01000340.1| GENE 3 3814 - 5790 2263 658 aa, chain + ## HITS:1 COG:ECs3814 KEGG:ns NR:ns ## COG: ECs3814 COG1166 # Protein_GI_number: 15833068 # Func_class: E Amino acid transport and 
metabolism # Function: Arginine decarboxylase (spermidine biosynthesis) # Organism: Escherichia coli O157:H7 # 1 658 1 658 658 1363 100.0 0 MSDDMSMGLPSSAGEHGVLRSMQEVAMSSQEASKMLRTYNIAWWGNNYYDVNELGHISVC PDPDVPEARVDLAQLVKTREAQGQRLPALFCFPQILQHRLRSINAAFKRARESYGYNGDY FLVYPIKVNQHRRVIESLIHSGEPLGLEAGSKAELMAVLAHAGMTRSVIVCNGYKDREYI RLALIGEKMGHKVYLVIEKMSEIAIVLDEAERLNVVPRLGVRARLASQGSGKWQSSGGEK SKFGLAATQVLQLVETLREAGRLDSLQLLHFHLGSQMANIRDIATGVRESARFYVELHKL GVNIQCFDVGGGLGVDYEGTRSQSDCSVNYGLNEYANNIIWAIGDACEENGLPHPTVITE SGRAVTAHHTVLVSNIIGVERNEYTVPTAPAEDAPRALQSMWETWQEMHEPGTRRSLREW LHDSQMDLHDIHIGYSSGTFSLQERAWAEQLYLSMCHEVQKQLDPQNRAHRPIIDELQER MADKMYVNFSLFQSMPDAWGIDQLFPVLPLEGLDQVPERRAVLLDITCDSDGAIDHYIDG DGIATTMPMPEYDPENPPMLGFFMVGAYQEILGNMHNLFGDTEAVDVFVFPDGSVEVELS DEGDTVADMLQYVQLDPKTLLTQFRDQVKKTDLDAELQQQFLEEFEAGLYGYTYLEDE >gi|296493161|gb|ADTK01000340.1| GENE 4 5936 - 6667 536 243 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_3280 NR:ns ## KEGG: EcE24377A_3280 # Name: not_defined # Def: putative lipoprotein # Organism: E.coli_E24377A # Pathway: not_defined # 1 243 1 243 243 461 99.0 1e-129 MKKWKVRSALVALIVLLAGCSSNAQYNSSASGNVGTAWGGDVHSTVQGVSAERAWRDPAE MIVISYSTNVPSGYDRVYSIRINELEYAIRDGNFNSLPITRVYDSSNNEPRYIVHARVGM NYQLYVRNYSRNTNYEIVATVDGMDVLNGKQGSLNNNGYIVNAGDSLAIKGFRKDKHTEA AFQFANVADSYAANSAQGDVRNTGVIGFAAFELQGPAQNALPPCSGQAFPADNNGYAPPP CRK >gi|296493161|gb|ADTK01000340.1| GENE 5 6803 - 7723 1251 306 aa, chain + ## HITS:1 COG:ECs3812 KEGG:ns NR:ns ## COG: ECs3812 COG0010 # Protein_GI_number: 15833066 # Func_class: E Amino acid transport and metabolism # Function: Arginase/agmatinase/formimionoglutamate hydrolase, arginase family # Organism: Escherichia coli O157:H7 # 1 306 1 306 306 616 100.0 1e-176 MSTLGHQYDNSLVSNAFGFLRLPMNFQPYDSDADWVITGVPFDMATSGRAGGRHGPAAIR QVSTNLAWEHNRFPWNFDMRERLNVVDCGDLVYAFGDAREMSEKLQAHAEKLLAAGKRML SFGGDHFVTLPLLRAHAKHFGKMALVHFDAHTDTYANGCEFDHGTMFYTAPKEGLIDPNH SVQIGIRTEFDKDNGFTVLDACQVNDRSVDDVIAQVKQIVGDMPVYLTFDIDCLDPAFAP GTGTPVIGGLTSDRAIKLVRGLKDLNIVGMDVVEVAPAYDQSEITALAAATLALEMLYIQ 
AAKKGE >gi|296493161|gb|ADTK01000340.1| GENE 6 7929 - 8687 836 252 aa, chain - ## HITS:1 COG:ECs3811 KEGG:ns NR:ns ## COG: ECs3811 COG0501 # Protein_GI_number: 15833065 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Escherichia coli O157:H7 # 1 252 43 294 294 462 99.0 1e-130 MKIRALLVAMSVATVLTGCQNMDSNGLLSSGAEAFQAYSLSDAQVKALSDQACQEMDSKA TIAPANSEYAKRLTTIANALGNNINGQPVNYKVYMAKDVNAFAMANGCIRVYSGLMDMMT DNEVEAVIGHEMGHVALGHVKKGMQVALGTNAVRVAAASAGGIVGSLSQSQLGDLGEKLV NSQFSQRQEAEADDYSYDLLRQRGISPAGLATSFEKLAKLEEGRQSSMFDDHPASAERAQ HIRDRMSADGVK >gi|296493161|gb|ADTK01000340.1| GENE 7 8965 - 10956 2374 663 aa, chain + ## HITS:1 COG:ECs3810 KEGG:ns NR:ns ## COG: ECs3810 COG0021 # Protein_GI_number: 15833064 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase # Organism: Escherichia coli O157:H7 # 1 663 1 663 663 1331 99.0 0 MSSRKELANAIRALSMDAVQKAKSGHPGAPMGMADIAEVLWRDFLKHNPQNPSWADRDRF VLSNGHGSMLIYSLLHLTGYDLPMEELKNFRQLHSKTPGHPEVGYTAGVETTTGPLGQGI ANAVGMAIAEKTLAAQFNRPGHDIVDHYTYAFMGDGCMMEGISHEVCSLAGTLKLGKLIA FYDDNGISIDGHVEGWFTDDTAMRFEAYGWHVIRDIDGHDAASIKRAVEEARAVTDKPSL LMCKTIIGFGSPNKAGTHDSHGAPLGDAEIALTREQLGWKYAPFEIPSEIYAQWDAKEAG QAKESAWNEKFAAYAKAYPQEAAEFTRRMKGEMPSDFDAKAKEFIAKLQANPAKIASRKA SQNAIEAFGPLLPEFLGGSADLAPSNLTLWSGSKAINEDAAGNYIHYGVREFGMTAIANG ISLHGGFLPYTSTFLMFVEYARNAVRMAALMKQRQVMVYTHDSIGLGEDGPTHQPVEQVA SLRVTPNMSTWRPCDQVESAVAWKYGVERQDGPTALILSRQNLAQQERTEEQLANIARGG YVLKDCAGQPELIFIATGSEVELAVAAYEKLTAEGVKARVVSMPSTDAFDKQDAAYRESV LPKAVTARVAVEAGIADYWYKYVGLNGAIVGMTTFGESAPAELLFEEFGFTVDNVVAKAK ELL >gi|296493161|gb|ADTK01000340.1| GENE 8 11270 - 11713 351 147 aa, chain + ## HITS:1 COG:ECs3809 KEGG:ns NR:ns ## COG: ECs3809 COG1762 # Protein_GI_number: 15833063 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Escherichia coli O157:H7 # 1 
147 1 147 147 252 100.0 1e-67 MRLSDYFPESSISVIHSAKDWQEAIDFSMVSLLDKNYISENYIQAIKDSTINNGPYYILA PGVAMPHARPECGALKTGMSLTLLEQGVYFPGNDEPIKLLIGLSAADADSHIGAIQALSE LLCEEEILEQLLTASSEKQLADIISRG >gi|296493161|gb|ADTK01000340.1| GENE 9 11741 - 13129 1484 462 aa, chain + ## HITS:1 COG:ECs3808 KEGG:ns NR:ns ## COG: ECs3808 COG2213 # Protein_GI_number: 15833062 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannitol-specific IIBC component # Organism: Escherichia coli O157:H7 # 1 462 1 462 462 856 100.0 0 MENKSARAKVQAFGGFLTAMVIPNIGAFIAWGFITALFIPTGWLPNEHFAKIVGPMITYL LPVMIGSTGGHLVGGKRGAVMGGIGTIGVIVGAEIPMFLGSMIMGPLGGLVIKYVDKALE KRIPAGFEMVINNFSLGIAGMLLCLLGFEVIGPAVLIANTFVKECIEALVHAGYLPLLSV INEPAKVLFLNNAIDQGVYYPLGMQQASVNGKSIFFMVASNPGPGLGLLLAFTLFGKGMS KRSAPGAMIIHFLGGIHELYFPYVLMKPLTIIAMIAGGMSGTWMFNLLDGGLVAGPSPGS IFAYLALTPKGSFLATIAGVTVGTLVSFAITSLILKMEKTVETESEDEFAQSANAVKAMK QEGAFSLSRVKRIAFVCDAGMGSSAMGATTFRKRLEKAGLAIEVKHYAIENVPADADIVV THASLEGRVKRVTDKPLILINNYIGDPKLDTLFNQLTAEHKH >gi|296493161|gb|ADTK01000340.1| GENE 10 13144 - 14421 1212 425 aa, chain + ## HITS:1 COG:ECs3807 KEGG:ns NR:ns ## COG: ECs3807 COG1063 # Protein_GI_number: 15833061 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli O157:H7 # 1 425 1 425 425 859 98.0 0 MKTKVAAIYGKRDVRLREFELPEITDNELLVSVISDSVCLSTWKAALLGSEHKRVPDDLE NHPVITGHECAGVIVEVGKNLTGKYKKGQRFVLQPAMGLPSGYSAGYSYEYFGGNATYMI IPEIAINLGCVLPYHGSYFAAASLAEPMCCIIGAYHANYHTTQYVYEHRMGVKPGGNIAL LACAGPMGIGAIDYAINGGIQPSRVVVVDIDDKRLAQVQKLLPVDLAASKGIELVYVNTK GMSDPVQTLRALTGDVGFDDIFVYAAVPAVVEMADELLAEDGCLNFFAGPTDKNFKVPFN FYNVHYNSTHVVGTSGGSTDDMKEAIALSATGQLQPSFMVTHIGGLDAVPDTVLNLPDIP GGKKLIYNGVTMPLTAIADFAEKGKTDPLFKELARLVEETHGIWNEQAEKYLLAQFGVDI GEAAQ >gi|296493161|gb|ADTK01000340.1| GENE 11 14418 - 15383 905 321 aa, chain + ## HITS:1 COG:yggF KEGG:ns NR:ns ## COG: yggF COG1494 # Protein_GI_number: 16130831 # 
Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1,6-bisphosphatase/sedoheptulose 1,7-bisphosphatase and related proteins # Organism: Escherichia coli K12 # 1 321 1 321 321 626 99.0 1e-179 MMSLAWPLFRVTEQAALAAWPQTGCGDKNKIDGLAVTAMRQALNDVAFRGRVVIGEGEID HAPMLWIGEEVGKGDGPEVDIAVDPIEGTRMVAMGQSNALAVMAFAPRDSLLHAPDMYMK KLVVNRLAAGAIDLSLPLADNLRNVARALGKPLDKLRMVTLDKPRLSAAIEEATQLGVKV FALPDGDVAASVLTCWQDNPYDVMYTIGGAPEGVISACAVKALGGDMQAELIDFCQAKGD YTENRQIAEQERKRCKAMGVDVNRVYSLDELVRGNDILFSATGVTGGELVNGIQQTANGV RTQTLLIGGADQTCNIIDSLH >gi|296493161|gb|ADTK01000340.1| GENE 12 15405 - 15914 399 169 aa, chain + ## HITS:1 COG:ECs3805 KEGG:ns NR:ns ## COG: ECs3805 COG3722 # Protein_GI_number: 15833059 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 169 1 169 169 291 97.0 4e-79 MATLTEDDVLEQLDAQDNLFSFMKTAHSILLQGIRQFLPSLFVDNDEEIVEYAVKPLLAQ SGPLDDIDVALRLIYALGKMDKWLYADITHFSQYWHYLNEQDETPGFADDMTWDFISNVN SITCNATLYDALKAMKFADFAVWSEARFSGMVKTALTLAVTTTLKELTP >gi|296493161|gb|ADTK01000340.1| GENE 13 15911 - 16624 425 237 aa, chain + ## HITS:1 COG:ECs3804 KEGG:ns NR:ns ## COG: ECs3804 COG1072 # Protein_GI_number: 15833058 # Func_class: H Coenzyme transport and metabolism # Function: Panthothenate kinase # Organism: Escherichia coli O157:H7 # 1 237 1 237 237 472 96.0 1e-133 MKIELTVNGLNVQAQYHDEEIERVHKPLLRMLAALQTVSPQRRTVVFLCAPPGTGKSTLT TFWEYLAQQDPELPAIQTLPMDGFHHYNSWLDAHQLRPFKGAPETFDVAKLAENLCQVVE GDCTWPQYDRQKHDPVEDALHVTAPLVIVEGNWLLLDDEKWCQLAQFCDFSIFINAPATA LRERLVGRKLAGGVSLADAEAFYDRTDGPNVRRVLEESLPANLTLMMTATGEYRLVD >gi|296493161|gb|ADTK01000340.1| GENE 14 16596 - 17273 207 225 aa, chain - ## HITS:1 COG:ECs3803 KEGG:ns NR:ns ## COG: ECs3803 COG1118 # Protein_GI_number: 15833057 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type sulfate/molybdate transport systems, ATPase component # Organism: Escherichia coli O157:H7 # 1 225 1 225 225 445 97.0 1e-125 
MLTLNKISYRWPGAATDCLCDISLQLKQGEWLALTGDNGAGKSTLLRVMAGLLTPTAGTV MLQQQAMKNLKNRQRAAKVGVLFQEAENQLFHSTVADEIAFGLKLQKCPADEITQRTHAA LQCCQLADTANSHPLDLHSAQRRMVAVACLEALSPPLLLLDEPSRDFDENWLSVFESWLE KCGQRGTSVVAISHDAAFTRRHFSRVVRLEDGLIRNVNPPDDIHP >gi|296493161|gb|ADTK01000340.1| GENE 15 17267 - 17944 454 225 aa, chain - ## HITS:1 COG:ECs3802 KEGG:ns NR:ns ## COG: ECs3802 COG1122 # Protein_GI_number: 15833056 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, ATPase component # Organism: Escherichia coli O157:H7 # 1 225 1 225 225 451 100.0 1e-127 MVTLEQFRYCPTHSTHPPFCYDFHYVKPGMVAIFGDNGSGKSTLAQLMAGWYPDFLPGEI TGTGTLLGTPIGHLPLNEQSATIQLVQQSPYLQLSGCTFSVEEEVAFGPENLCLAEAEIM ARIDAALTLTECQPLRHRHPATLSGGETQRVVIACAIAMQPKLLILDEAFSRLTPQASEM LLQRLQHWAFERGSLIILFERHRTPFLNYCQQAWQLQNGALQPLC >gi|296493161|gb|ADTK01000340.1| GENE 16 17932 - 18480 395 182 aa, chain - ## HITS:1 COG:ECs3801 KEGG:ns NR:ns ## COG: ECs3801 COG0619 # Protein_GI_number: 15833055 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Escherichia coli O157:H7 # 1 182 54 235 235 323 98.0 2e-88 MFSLGAGLWLVHGGWLTEWLSGTPRSPERWSHAITLWLRILAIVSTSQLWMQYVPVQCFI RALFASRLPPGVAYLFAGPLLVVEQLKRQLAIIHEAQRARGVPLDEGWYQRLRAMPALII PLTHNALNDLAVRGAALDMRAFRIHNRRTTLWAPADSTLQRVARYTMILLMLTEFGAWIW LR >gi|296493161|gb|ADTK01000340.1| GENE 17 18640 - 19218 675 192 aa, chain - ## HITS:1 COG:no KEGG:EC55989_3217 NR:ns ## KEGG: EC55989_3217 # Name: not_defined # Def: putative ABC-type transport system # Organism: E.coli_55989 # Pathway: not_defined # 1 192 34 225 225 272 100.0 5e-72 MARSHFSSQALVLIVISIAINMIGGQLASMVKLPIFLDSIGTLISAVLLGPVIGMLTGLL TNLLWGLLTDPIAAAFAPVAMVIGLVAGWLARAGWFRTLPKVVVSGVIITLAVTVVAVPL RTALFGGVTGSGADLFVAWMHSMGQNLVESVAITVIGANLVDKILTAVIVWLLLRQLPIR TTRHFPAMAAVR >gi|296493161|gb|ADTK01000340.1| GENE 18 19241 - 19672 374 143 aa, chain - ## HITS:1 COG:ECs3799 KEGG:ns NR:ns ## COG: ECs3799 COG1661 
# Protein_GI_number: 15833053 # Func_class: R General function prediction only # Function: Predicted DNA-binding protein with PD1-like DNA-binding motif # Organism: Escherichia coli O157:H7 # 1 143 1 143 143 275 99.0 2e-74 MMTLPYPYSSSARFYALRLLPGQEVFSQLHAFAQQQQLHAAWIAGCTGSLTDVALRYAGQ EGTTLLNGTFEVISLNGTLEQSGEHLHLCVSDPHGTMLGGHMMPGCTVRTTLELVIGSLE ELAFSRQPCALSGYDELHISPVK >gi|296493161|gb|ADTK01000340.1| GENE 19 20044 - 21063 827 339 aa, chain + ## HITS:1 COG:ECs3798 KEGG:ns NR:ns ## COG: ECs3798 COG0057 # Protein_GI_number: 15833052 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Escherichia coli O157:H7 # 1 339 1 339 339 679 100.0 0 MTVRVAINGFGRIGRNVVRALYESGRRAEITVVAINELADAAGMAHLLKYDTSHGRFAWE VRQERDQLFVGDDAIRVLHERSLQSLPWRELGVDVVLDCTGVYGSREHGEAHIAAGAKKV LFSHPGSNDLDATVVYGVNQDQLRAEHRIVSNASCTTNCIIPVIKLLDDAYGIESGTVTT IHSAMHDQQVIDAYHPDLRRTRAASQSIIPVDTKLAAGITRFFPQFNDRFEAIAVRVPTI NVTAIDLSVTVKKPVKANEVNLLLQKAAQGAFHGIVDYTELPLVSVDFNHDPHSAIVDGT QTRVSGAHLIKTLVWCDNEWGFANRMLDTTLAMATVAFR >gi|296493161|gb|ADTK01000340.1| GENE 20 21113 - 22276 1339 387 aa, chain + ## HITS:1 COG:pgk KEGG:ns NR:ns ## COG: pgk COG0126 # Protein_GI_number: 16130827 # Func_class: G Carbohydrate transport and metabolism # Function: 3-phosphoglycerate kinase # Organism: Escherichia coli K12 # 1 387 1 387 387 717 100.0 0 MSVIKMTDLDLAGKRVFIRADLNVPVKDGKVTSDARIRASLPTIELALKQGAKVMVTSHL GRPTEGEYNEEFSLLPVVNYLKDKLSNPVRLVKDYLDGVDVAEGELVVLENVRFNKGEKK DDETLSKKYAALCDVFVMDAFGTAHRAQASTHGIGKFADVACAGPLLAAELDALGKALKE PARPMVAIVGGSKVSTKLTVLDSLSKIADQLIVGGGIANTFIAAQGHDVGKSLYEADLVD EAKRLLTTCNIPVPSDVRVATEFSETAPATLKSVNDVKADEQILDIGDASAQELAEILKN AKTILWNGPVGVFEFPNFRKGTEIVANAIADSEAFSIAGGGDTLAAIDLFGIADKISYIS TGGGAFLEFVEGKVLPAVAMLEERAKK >gi|296493161|gb|ADTK01000340.1| GENE 21 22491 - 23570 1118 359 aa, chain + ## HITS:1 COG:Zfba KEGG:ns NR:ns ## COG: Zfba COG0191 # Protein_GI_number: 15803459 # Func_class: G Carbohydrate transport and 
metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Escherichia coli O157:H7 EDL933 # 1 359 26 384 384 742 100.0 0 MSKIFDFVKPGVITGDDVQKVFQVAKENNFALPAVNCVGTDSINAVLETAAKVKAPVIVQ FSNGGASFIAGKGVKSDVPQGAAILGAISGAHHVHQMAEHYGVPVILHTDHCAKKLLPWI DGLLDAGEKHFAATGKPLFSSHMIDLSEESLQENIEICSKYLERMSKIGMTLEIELGCTG GEEDGVDNSHMDASALYTQPEDVDYAYTELSKISPRFTIAASFGNVHGVYKPGNVVLTPT ILRDSQEYVSKKHNLPHNSLNFVFHGGSGSTAQEIKDSVSYGVVKMNIDTDTQWATWEGV LNYYKANEAYLQGQLGNPKGEDQPNKKYYDPRVWLRAGQTSMIARLEKAFQELNAIDVL >gi|296493161|gb|ADTK01000340.1| GENE 22 23928 - 24788 918 286 aa, chain + ## HITS:1 COG:ECs3795 KEGG:ns NR:ns ## COG: ECs3795 COG0668 # Protein_GI_number: 15833049 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Escherichia coli O157:H7 # 1 286 1 286 286 498 100.0 1e-141 MEDLNVVDSINGAGSWLVANQALLLSYAVNIVAALAIIIVGLIIARMISNAVNRLMISRK IDATVADFLSALVRYGIIAFTLIAALGRVGVQTASVIAVLGAAGLAVGLALQGSLSNLAA GVLLVMFRPFRAGEYVDLGGVAGTVLSVQIFSTTMRTADGKIIVIPNGKIIAGNIINFSR EPVRRNEFIIGVAYDSDIDQVKQILTNIIQSEDRILKDREMTVRLNELGASSINFVVRVW SNSGDLQNVYWDVLERIKREFDAAGISFPYPQMDVNFKRVKEDKAA >gi|296493161|gb|ADTK01000340.1| GENE 23 24927 - 25562 600 211 aa, chain + ## HITS:1 COG:ECs3794 KEGG:ns NR:ns ## COG: ECs3794 COG1279 # Protein_GI_number: 15833048 # Func_class: R General function prediction only # Function: Lysine efflux permease # Organism: Escherichia coli O157:H7 # 1 211 1 211 211 367 100.0 1e-102 MFSYYFQGLALGAAMILPLGPQNAFVMNQGIRRQYHIMIALLCAISDLVLICAGIFGGSA LLMQSPWLLALVTWGGVVFLLWYGFGAFKTAMSSNIELASAEVLKQGRWKIIATMLAVTW LNPHVYLDTFVVLGSLGGQLDVEPKRWFALGTISASFLWFFGLAILAAWLAPRLRTAKSQ RIINLVVGCVMWFIALQLARDGIAHAQALFS >gi|296493161|gb|ADTK01000340.1| GENE 24 25655 - 26395 795 246 aa, chain + ## HITS:1 COG:ECs3793 KEGG:ns NR:ns ## COG: ECs3793 COG2968 # Protein_GI_number: 15833047 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 246 1 246 246 410 100.0 1e-114 
MKFKVIALAALMGISGMAAQANELPDGPHIVTSGTASVDAVPDIATLAIEVNVAAKDAAT AKKQADERVAQYISFLELNQIAKKDISSANLRTQPDYDYQDGKSILKGYRAVRTVEVTLR QLDKLNSLLDGALKAGLNEIRSVSLGVAQPDAYKDKARKAAIDNAIHQAQELANGFHRKL GPVYSVRYHVSNYQPSPMVRMMKADAAPVSAQETYEQAAIQFDDQVDVVFQLEPVDQQPA KTPAAQ >gi|296493161|gb|ADTK01000340.1| GENE 25 26562 - 27458 679 298 aa, chain + ## HITS:1 COG:ygfI KEGG:ns NR:ns ## COG: ygfI COG0583 # Protein_GI_number: 16130822 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 298 6 303 303 594 99.0 1e-170 MDIFISKKMRNFILLAQTNNIARAAEKIHMTASPFGKSIAALEEQIGYTLFTRKDNNISL NKAGQELYQKLFPVYQRLSAIDNEIHNSGRRSREIVIGIDNTYPTIIFDQLISLGDKYEG VTAQPVEFSENGVIDNLFDRQLDFIISPQHVSARVQELENLTISELPPLRLGFLVSRRYE GRQEQELLQELPWLQMRFQNRANFEAMIDANMRPCGINPTIIYRPYSFMAKISAVERGHF LTVIPHFAWRLVNPATLKYFDAPHKPMYMQEYLYSIRNHRYTATMLQHIAEDRDGTSH >gi|296493161|gb|ADTK01000340.1| GENE 26 27455 - 28933 1199 492 aa, chain - ## HITS:1 COG:ygfH KEGG:ns NR:ns ## COG: ygfH COG0427 # Protein_GI_number: 16130821 # Func_class: C Energy production and conversion # Function: Acetyl-CoA hydrolase # Organism: Escherichia coli K12 # 1 492 1 492 492 986 99.0 0 METQWTRMTANEAAEIIQHNDMVAFSGFTPAGSPKALPTAIARRANEQHEAKKPYQIRLL TGASISAAADDVLSDADAVSWRAPYQTSSGLRKKINQGAVSFVDLHLSEVAQMVNYGFFG DIDVAVIEASALAPDGRVWLTSGIGNAPTWLLRAKKVIIELNHYHDPRVAELADIVIPGA PPRRNSVSIFHAMDRVGTRYVQIDPKKIVAVVETNLPDAGNMLDKQNPMCQQIADNVVTF LLQEMAHGRIPPEFLPLQSGVGNINNAVMARLGENPEIPPFMMYSEVLQESVVHLLETGK ISGASASSLTISADSLRKIYDNMDYFASRIVLRPQEISNNPEIIRRLGVIALNVGLEFDI YGHANSTHVAGVDLMNGIGGSGDFERNAYLSIFMAPSIAKEGKISTVVPMCSHVDHSEHS VKVIITEQGIADLRGLSPLQRARTIIDNCAHPMYQDYLHRYLENAPGGHIHHDLSHVFDL HRNLIATGSMLG >gi|296493161|gb|ADTK01000340.1| GENE 27 28957 - 29742 947 261 aa, chain - ## HITS:1 COG:ECs3789 KEGG:ns NR:ns ## COG: ECs3789 COG1024 # Protein_GI_number: 15833043 # Func_class: I Lipid transport and metabolism # Function: Enoyl-CoA hydratase/carnithine racemase # Organism: Escherichia coli O157:H7 # 1 261 15 275 275 525 100.0 1e-149 
MSYQYVNVVTINKVAVIEFNYGRKLNALSKVFIDDLMQALSDLNRPEIRCIILRAPSGSK VFSAGHDIHELPSGGRDPLSYDDPLRQITRMIQKFPKPIISMVEGSVWGGAFEMIMSSDL IIAASTSTFSMTPVNLGVPYNLVGIHNLTRDAGFHIVKELIFTASPITAQRALAVGILNH VVEVEELEDFTLQMAHHISEKAPLAIAVIKEELRVLGEAHTMNSDEFERIQGMRRAVYDS EDYQEGMNAFLEKRKPNFVGH >gi|296493161|gb|ADTK01000340.1| GENE 28 29753 - 30748 870 331 aa, chain - ## HITS:1 COG:argK KEGG:ns NR:ns ## COG: argK COG1703 # Protein_GI_number: 16130819 # Func_class: E Amino acid transport and metabolism # Function: Putative periplasmic protein kinase ArgK and related GTPases of G3E family # Organism: Escherichia coli K12 # 1 331 1 331 331 654 99.0 0 MINEATLAESIRRLRQGERATLAQAMTLVESRHPRHQALSTQLLDTIMPYCGNTLRLGVT GTPGAGKSTFLEAFGMLLIREGLKVAVIAVDPSSPVTGGSILGDKTRMNDLARAEAAFIR PVPSSGHLGGASQRARELMLLCEAAGYDVVIVETVGVGQSETEVARMVDCFISLQIAGGG DDLQGIKKGLMEVADLLVINKDDGDNHTNVAIARHMYESALHILRRKYDEWQPRVLTCSA LEKRGIDEIWHAIINFKTALTASGRLQQVRQQQSVEWLRKQTEEEVLNHLFANEDFDRYY RQTLLAVKNNTLSPRTGLRQLSEFIQTQYFD >gi|296493161|gb|ADTK01000340.1| GENE 29 30741 - 32885 2263 714 aa, chain - ## HITS:1 COG:Zsbm_1 KEGG:ns NR:ns ## COG: Zsbm_1 COG1884 # Protein_GI_number: 15803451 # Func_class: I Lipid transport and metabolism # Function: Methylmalonyl-CoA mutase, N-terminal domain/subunit # Organism: Escherichia coli O157:H7 EDL933 # 1 585 1 585 585 1128 99.0 0 MSNEQEWQQLANKELSRREKTVDSLVQQTAEGIAIKPLYTEADLDNLEVTGTLPGLPPYV RGPRATMYTAQPWTIRQYAGFSTAKESNAFYRRNLAAGQKGLSVAFDLATHRGYDSDNPR VAGDVGKAGVAIDTVEDMKVLFDQIPLDKMSVSMTMNGAVLPVLAFYIVAAEEQGVTPDK LTGTIQNDILKEYLCRNTYIYPPKPSMRIIADIIAWCSGNMPRFNTISISGYHMGEAGAN CVQQVAFTLADGIEYIKAAISAGLKIDDFAPRLSFFFGIGMDLFMNVAMLRAARYLWSEE VSGFGAQDPKSLALRTHCQTSGWSLTEQDPYNNVIRTTIEALAATLGGTQSLHTNAFDEA LGLPTDFSARIARNTQIIIQEESELCRTVDPLAGSYYIESLTDQIVKQARAIIQQIDEAG GMAKAIEAGLPKRMIEEASAREQSLIDQGKRVIVGVNKYKLDHEDETDVLEIDNVMVRNE QIASLERIRATRDDAAVTAALNALTHAAQHNENLLAAAVNAARVRATLGEISDALEAAFD RYLVPSQCVTGVIAQSYHQSEKSASEFDAIVAQTEQFLADNGRRPRILIAKMGQDGHDRG AKVIASAYSDLGFDVDLSPMFSTPEEIARLAVENDVHVVGASSLAAGHKTLIPELVEALK 
KWGREDICVVAGGVIPPQDYAFLQERGVAAIYGPGTPMLDSVRDVLNLISQHHD >gi|296493161|gb|ADTK01000340.1| GENE 30 33089 - 33982 837 297 aa, chain - ## HITS:1 COG:ECs3786 KEGG:ns NR:ns ## COG: ECs3786 COG0583 # Protein_GI_number: 15833040 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 297 1 297 297 577 100.0 1e-165 MKRPDYRTLQALDAVIRERGFERAAQKLCITQSAVSQRIKQLENMFGQPLLVRTVPPRPT EQGQKLLALLRQVELLEEEWLGDEQTGSTPLLLSLAVNADSLATWLLPALAPVLADSPIR LNLQVEDETRTQERLRRGEVVGAVSIQHQALPSCLVDKLGALDYLFVSSKPFAEKYFPNG VTRSALLKAPVVAFDHLDDMHQAFLQQNFDLPPGSVPCHIVNSSEAFVQLARQGTTCCMI PHLQIEKELASGELIDLTPGLFQRRMLYWHRFAPESRMMRKVTDALLDYGHKVLRQD >gi|296493161|gb|ADTK01000340.1| GENE 31 34124 - 34285 109 53 aa, chain + ## HITS:1 COG:yqfE KEGG:ns NR:ns ## COG: yqfE COG0583 # Protein_GI_number: 16130816 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 53 1 53 76 95 98.0 1e-20 MHFAQRVRALVVLNGVALLPQFACKQGLANGELVRLFAPWSGIPRLLYALFAG >gi|296493161|gb|ADTK01000340.1| GENE 32 34410 - 35069 827 219 aa, chain + ## HITS:1 COG:ECs3785 KEGG:ns NR:ns ## COG: ECs3785 COG0120 # Protein_GI_number: 15833039 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase # Organism: Escherichia coli O157:H7 # 1 219 1 219 219 397 100.0 1e-111 MTQDELKKAVGWAALQYVQPGTIVGVGTGSTAAHFIDALGTMKGQIEGAVSSSDASTEKL KSLGIHVFDLNEVDSLGIYVDGADEINGHMQMIKGGGAALTREKIIASVAEKFICIADAS KQVDILGKFPLPVEVIPMARSAVARQLVKLGGRPEYRQGVVTDNGNVILDVHGMEILDPI AMENAINAIPGVVTVGLFANRGADVALIGTPDGVKTIVK >gi|296493161|gb|ADTK01000340.1| GENE 33 35325 - 36557 1313 410 aa, chain + ## HITS:1 COG:ECs3784 KEGG:ns NR:ns ## COG: ECs3784 COG0111 # Protein_GI_number: 15833038 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Escherichia coli O157:H7 # 1 410 1 410 410 802 100.0 0 
MAKVSLEKDKIKFLLVEGVHQKALESLRAAGYTNIEFHKGALDDEQLKESIRDAHFIGLR SRTHLTEDVINAAEKLVAIGCFCIGTNQVDLDAAAKRGIPVFNAPFSNTRSVAELVIGEL LLLLRGVPEANAKAHRGVWNKLAAGSFEARGKKLGIIGYGHIGTQLGILAESLGMYVYFY DIENKLPLGNATQVQHLSDLLNMSDVVSLHVPENPSTKNMMGAKEISLMKPGSLLINASR GTVVDIPALCDALASKHLAGAAIDVFPTEPATNSDPFTSPLCEFDNVLLTPHIGGSTQEA QENIGLEVAGKLIKYSDNGSTLSAVNFPEVSLPLHGGRRLMHIHENRPGVLTALNKIFAE QGVNIAAQYLQTSAQMGYVVIDIEADEDVAEKALQAMKAIPGTIRARLLY >gi|296493161|gb|ADTK01000340.1| GENE 34 36946 - 37494 393 182 aa, chain - ## HITS:1 COG:ECs3782 KEGG:ns NR:ns ## COG: ECs3782 COG0212 # Protein_GI_number: 15833036 # Func_class: H Coenzyme transport and metabolism # Function: 5-formyltetrahydrofolate cyclo-ligase # Organism: Escherichia coli O157:H7 # 1 182 1 182 182 359 100.0 1e-99 MIRQRRRALTPEQQQEMGQQAATRMMTYPPVVMAHTVAVFLSFDGELDTQPLIEQLWRAG KRVYLPVLHPFSAGNLLFLNYHPQSELVMNRLKIHEPKLDVRDVLPLSRLDVLITPLVAF DEYGQRLGMGGGFYDRTLQNWQHYKTQPVGYAHDCQLVEKLPVEEWDIPLPAVVTPSKVW EW >gi|296493161|gb|ADTK01000340.1| GENE 35 37794 - 38123 366 109 aa, chain - ## HITS:1 COG:ECs3781 KEGG:ns NR:ns ## COG: ECs3781 COG3027 # Protein_GI_number: 15833035 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 109 1 109 109 163 100.0 6e-41 MSAQPVDIQIFGRSLRVNCPPDQRDALNQAADDLNQRLQDLKERTRVTNTEQLVFIAALN ISYELAQEKAKTRDYAASMEQRIRMLQQTIEQALLEQGRITEKTNQNFE >gi|296493161|gb|ADTK01000340.1| GENE 36 38291 - 38869 646 192 aa, chain + ## HITS:1 COG:ECs3780 KEGG:ns NR:ns ## COG: ECs3780 COG3079 # Protein_GI_number: 15833034 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 192 3 194 194 359 100.0 1e-99 MSIQNEMPGYNEMNQYLNQQGTGLTPAEMHGLISGMICGGNDDSSWLPLLHDLTNEGMAF GHELAQALRKMHSATSDALQDDGFLFQLYLPDGDDVSVFDRADALAGWVNHFLLGLGVTQ PKLDKVTGETGEAIDDLRNIAQLGYDEDEDQEELEMSLEEIIEYVRVAALLCHDTFTHPQ PTAPEVQKPTLH >gi|296493161|gb|ADTK01000340.1| GENE 37 38895 - 40220 1348 441 aa, chain + ## HITS:1 COG:ECs3779 
KEGG:ns NR:ns ## COG: ECs3779 COG0006 # Protein_GI_number: 15833033 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Escherichia coli O157:H7 # 1 441 1 441 441 896 99.0 0 MSEISRQEFQRRRQALVEQMQPGSAALIFAAPEVTRSADSEYPYRQNSDFWYFTGFNEPE AVLVLIKSDDTHNHSVLFNRVRDLTAEIWFGRRLGQDAAPEKLGVDRALAFSEINQQLYQ LLNGLDVVYHAQGEYAYADEIVNSALEKLRKGSRQNLTAPATMIDWRPVVHEMRLFKSPE EIAVLRRAGEITAMAHTRAMEKCRPGMFEYHLEGEIHHEFNRHGARYPSYNTIVGSGENG CILHYTENESELRDGDLVLIDAGCEYKGYAGDITRTFPVNGKFTQAQREIYDIVLESLET SLRLYRPGTSILEVTGEVVRIMVSGLVKLGILKGDVDELIAQNAHRPFFMHGLSHWLGLD VHDVGVYGQDRSRILEPGMVLTVEPGLYIAPDADVPEQYRGIGIRIEDDIVITETGNENL TASVVKKPEEIEALMAAARKQ >gi|296493161|gb|ADTK01000340.1| GENE 38 40217 - 41395 1029 392 aa, chain + ## HITS:1 COG:ubiH KEGG:ns NR:ns ## COG: ubiH COG0654 # Protein_GI_number: 16130809 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases # Organism: Escherichia coli K12 # 1 392 1 392 392 741 98.0 0 MSVIIVGGGMAGATLALAISRLSHGALPVHLIEATAPESHAHPGFDGRAIALAAGTCQQL ARIGVWQSLADYATAITTVHVSDRGHAGFVTLAAEDYQLAVLGQVVELHNVGQRLFALLR KAPGVTLHCPDRVANVARTQSHVEVTLESGETLTGRVLVAADGTHSALATACGVDWQQEP YEQLAVIANVATSVAHEGRAFERFTQHGPLAMLPMSDGRCSLVWCHPLERREEVLSWSDE KFCRELQSAFGWRLGKITHAGKRSAYPLALTRAARAITHRTVLVGNAAQTLHPIAGQGFN LGMRDVMSLAETLTQAHERGEDIGDYGILCRYQQRRQSDREATIGVTDSLVHLFANRWAP LVVGRNIGLMTMELFTPARDVLAQRTLGWVAR >gi|296493161|gb|ADTK01000340.1| GENE 39 41418 - 42620 1331 400 aa, chain + ## HITS:1 COG:ECs3777 KEGG:ns NR:ns ## COG: ECs3777 COG0654 # Protein_GI_number: 15833031 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases # Organism: Escherichia coli O157:H7 # 1 400 1 400 400 803 98.0 0 MQSVDVAIVGGGMVGLAVACGLQGSGLRVAVLEQRVPEPLAADAPPQLRVSAINAASEKL 
LTRLGVWQDILSRRASCYHGMEVWDKDSFGHISFDDQSMGYSHLGHIVENSVIHYALWNK AQQSSDTTLLAPAELQQVAWGENETFLTLKDGSMLTARLVIGADGANSWLRNKADIPLTF WDYQHHALVATIRTEEPHDAVARQVFHGEGILAFLPLSDPHLCSIVWSLSPEEAQRMQQA GEDEFNRALNIAFDNRLGLCKVESARQVFPLTGRYARQFAAHRLALVGDAAHTIHPLAGQ GVNLGFMDAAELVAELKRLHRQGKDIGQYIYLRRYERSRKHSAAMMLAGMQGFRDLFSGT NPAKKLLRDIGLKLADTLPGVKPQLIRQAMGLNDLPEWLR >gi|296493161|gb|ADTK01000340.1| GENE 40 43068 - 44162 1328 364 aa, chain + ## HITS:1 COG:gcvT KEGG:ns NR:ns ## COG: gcvT COG0404 # Protein_GI_number: 16130807 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system T protein (aminomethyltransferase) # Organism: Escherichia coli K12 # 1 364 1 364 364 746 100.0 0 MAQQTPLYEQHTLCGARMVDFHGWMMPLHYGSQIDEHHAVRTDAGMFDVSHMTIVDLRGS RTREFLRYLLANDVAKLTKSGKALYSGMLNASGGVIDDLIVYYFTEDFFRLVVNSATREK DLSWITQHAEPFGIEITVRDDLSMIAVQGPNAQAKAATLFNDAQRQAVEGMKPFFGVQAG DLFIATTGYTGEAGYEIALPNEKAADFWRALVEAGVKPCGLGARDTLRLEAGMNLYGQEM DETISPLAANMGWTIAWEPADRDFIGREALEVQREHGTEKLVGLVMTEKGVLRNELPVRF TDAQGNQHEGIITSGTFSPTLGYSIALARVPEGIGETAIVQIRNREMPVKVTKPVFVRNG KAVA >gi|296493161|gb|ADTK01000340.1| GENE 41 44186 - 44575 537 129 aa, chain + ## HITS:1 COG:ECs3775 KEGG:ns NR:ns ## COG: ECs3775 COG0509 # Protein_GI_number: 15833029 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system H protein (lipoate-binding) # Organism: Escherichia coli O157:H7 # 1 129 1 129 129 217 100.0 4e-57 MSNVPAELKYSKEHEWLRKEADGTYTVGITEHAQELLGDMVFVDLPEVGATVSAGDDCAV AESVKAASDIYAPVSGEIVAVNDALSDSPELVNSEPYAGGWIFKIKASDESELESLLDAT AYEALLEDE >gi|296493161|gb|ADTK01000340.1| GENE 42 44693 - 47566 3310 957 aa, chain + ## HITS:1 COG:ECs3774_2 KEGG:ns NR:ns ## COG: ECs3774_2 COG1003 # Protein_GI_number: 15833028 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain # Organism: Escherichia coli O157:H7 # 451 957 1 507 507 1028 99.0 0 MTQTLSQLENSGAFIERHIGPDAAQQQEMLNAVGAQSLNALTGQIVPKDIQLATPPQVGA 
PATEYAALAELKAIASRNKRFTSYIGMGYTAVQLPPVILRNMLENPGWYTAYTPYQPEVS QGRLEALLNFQQVTLDLTGLDMASASLLDEATAAAEAMAMAKRVSKLKNANRFFVASDVH PQTLDVVRTRAETFGFEVIVDDAQKVLDHQDVFGVLLQQVGTTGEIHDYTALISELKSRK IVVSVAADIMALVLLTAPGKQGADIVFGSAQRFGVPMGYGGPHAAFFAAKDEYKRSMPGR IIGVSKDAAGNTALRMAMQTREQHIRREKANSNICTSQVLLANIASLYAVYHGPVGLKRI ANRIHRLTDILAAGLQQKGLKLRHAHYFDTLCVEVADKAGVLARAEAAEINLRSDILNAV GITLDETTTRENVMQLFSVLLGDNHGLEIDTLDKDVAHDSRSIQPAMLRDDEILTHPVFN RYHSETEMMRYMHSLERKDLALNQAMIPLGSCTMKLNAAAEMIPITWPEFAELHPFCPPE QAEGYQQMIAQLADWLVKLTGYDAVCMQPNSGAQGEYAGLLAIRHYHESRNEGHRDICLI PASAHGTNPASAHMAGMQVVVVACDKNGNIDLTDLRAKAEQAGDNLSCIMVTYPSTHGVY EETIREVCEVVHQFGGQVYLDGANMNAQVGITSPGFIGADVSHLNLHKTFCIPHGGGGPG MGPIGVKAHLAPFVPGHSVVQIEGMLTRQGAVSAAPFGSASILPISWMYIRMMGAEGLKK ASQVAILNANYIASRLQDAFPVLYTGRDGRVAHECILDIRPLKEETGISELDIAKRLIDY GFHAPTMSFPVAGTLMVEPTESESKVELDRFIDAMLAIRAEIDQVKAGVWPLEDNPLVNA PHIQNELVAEWAHPYSREVAVFPAGVADKYWPTVKRLDDVYGDRNLFCSCVPISEYQ Prediction of potential genes in microbial genomes Time: Mon May 16 16:06:07 2011 Seq name: gi|296493160|gb|ADTK01000341.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.8, whole genome shotgun sequence Length of sequence - 3601 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 149 - 892 246 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 2 2 Tu 1 . - CDS 949 - 2388 1658 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase - Prom 2412 - 2471 3.3 + Prom 2134 - 2193 3.2 3 3 Tu 1 . + CDS 2427 - 2738 346 ## COG3097 Uncharacterized protein conserved in bacteria + Prom 2768 - 2827 3.5 4 4 Tu 1 . 
+ CDS 2902 - 3561 650 ## COG1272 Predicted membrane protein, hemolysin III homolog Predicted protein(s) >gi|296493160|gb|ADTK01000341.1| GENE 1 149 - 892 246 247 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 5 247 7 240 242 99 31 4e-21 MAIALVTGGSRGIGRATALLLAKEEYTVAVNYQQNLHAAQEVVNLITQAGGKAFVLQADI SDENQVIAMFTAIDQHDEPLAALVNNAGILFTQCTVENLTAERINRVLSTNVTGYFLCCR EAVKRMALKNGGSGGAIVNVSSVASRLGSPGEYVDYAASKGAIDTLTTGLSLEVAAQGIR VNCVRPGFIYTEMHASGGEPGRVDRVKSNIPMQRGGQAEEVAQAIVWLLSDKASYVTGSF IDLAGGK >gi|296493160|gb|ADTK01000341.1| GENE 2 949 - 2388 1658 479 aa, chain - ## HITS:1 COG:bglA KEGG:ns NR:ns ## COG: bglA COG2723 # Protein_GI_number: 16130803 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Escherichia coli K12 # 1 479 1 479 479 1010 99.0 0 MIVKKLTLPKDFLWGGAVAAHQVEGGWNKGGKGPSICDVLTGGAHGVPREITKEVLPGKY YPNHEAVDFYGHYKEDIKLFAEMGFKCFRTSIAWTRIFPKGDEAQPNEEGLKFYDDMFDE LLKYNIEPVITLSHFEMPLHLVQQYGSWTNRKVVDFFVRFAEVVFERYKHKVKYWMTFNE INNQRNWRAPLFGYCCSGVVYTEHENPEETMYQVLHHQFVASALAVKAARRINPEMKVGC MLAMVPLYPYCCNPDDVMFAQESMRERYVFTDVQLRGYYPSYVLNEWERRGFNIKMEDGD LDVLREGTCDYLGFSYYMTNAVKAEGGTGDAISGFEGSVPNPYVKASDWGWQIDPVGLRY ALCELYERYQRPLFIVENGFGAYDKVEEDGSINDDYRIDYLRAHIEEMKKAVTYDGVDLM GYTPWGCIDCVSFTTGQYSKRYGFIYVNKHDDGTGDMSRSRKKSFNWYKEVIASNGEKL >gi|296493160|gb|ADTK01000341.1| GENE 3 2427 - 2738 346 103 aa, chain + ## HITS:1 COG:ECs3772 KEGG:ns NR:ns ## COG: ECs3772 COG3097 # Protein_GI_number: 15833026 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 103 1 103 103 185 99.0 2e-47 MQPNDITFFQRFQDDILAGRKTITIRDESESHFKTGDVLRVGRFEDDGYFCTIEVTATST VTLDTLTEKHAEQENMTLTELIKVIADIYPGQTQFYVIEFKCL >gi|296493160|gb|ADTK01000341.1| GENE 4 2902 - 3561 650 219 aa, chain + ## HITS:1 COG:ECs3771 KEGG:ns NR:ns ## COG: ECs3771 COG1272 # Protein_GI_number: 
15833025 # Func_class: R General function prediction only # Function: Predicted membrane protein, hemolysin III homolog # Organism: Escherichia coli O157:H7 # 1 219 1 219 219 357 100.0 1e-98 MVQKPLIKQGYSLAEEIANSVSHGIGLVFGIVGLVLLLVQAVDLNASATAITSYSLYGGS MILLFLASTLYHAIPHQRAKMWLKKFDHCAIYLLIAGTYTPFLLVGLDSPLARGLMIVIW SLALLGILFKLTIAHRFKILSLVTYLAMGWLSLVVIYEMAVKLAAGSVTLLAVGGVVYSL GVIFYVCKRIPYNHAIWHGFVLGGSVCHFLAIYLYIGQA Prediction of potential genes in microbial genomes Time: Mon May 16 16:06:28 2011 Seq name: gi|296493159|gb|ADTK01000342.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.9, whole genome shotgun sequence Length of sequence - 70425 bp Number of predicted genes - 72, with homology - 70 Number of transcription units - 37, operones - 19 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 58 - 1038 1100 ## COG0354 Predicted aminomethyltransferase related to GcvT - Prom 1071 - 1130 3.1 + Prom 989 - 1048 3.7 2 2 Op 1 . + CDS 1281 - 1547 307 ## COG2938 Uncharacterized conserved protein 3 2 Op 2 . + CDS 1528 - 1935 254 ## SSON_3049 hypothetical protein + Term 1966 - 2016 2.6 - Term 1926 - 1968 1.2 4 3 Tu 1 . - CDS 1975 - 2496 718 ## COG0716 Flavodoxins - Prom 2528 - 2587 3.4 + Prom 2525 - 2584 4.9 5 4 Op 1 8/0.000 + CDS 2608 - 3504 800 ## COG4974 Site-specific recombinase XerD 6 4 Op 2 7/0.083 + CDS 3529 - 4239 740 ## COG1651 Protein-disulfide isomerase 7 4 Op 3 5/0.167 + CDS 4245 - 5978 1729 ## COG0608 Single-stranded DNA-specific exonuclease + Term 6002 - 6036 2.6 + Prom 6043 - 6102 1.7 8 5 Op 1 8/0.000 + CDS 6286 - 7167 946 ## COG1186 Protein chain release factor B 9 5 Op 2 . + CDS 7177 - 8694 2035 ## COG1190 Lysyl-tRNA synthetase (class II) + Term 8715 - 8745 2.1 10 6 Tu 1 . - CDS 8737 - 9285 401 ## COG1443 Isopentenyldiphosphate isomerase - Prom 9310 - 9369 3.6 - Term 9358 - 9386 -0.3 11 7 Op 1 . - CDS 9408 - 9533 78 ## 12 7 Op 2 . 
- CDS 9535 - 10983 375 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 + Prom 11281 - 11340 6.0 13 8 Op 1 3/0.417 + CDS 11404 - 13338 1036 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases 14 8 Op 2 . + CDS 13338 - 13826 268 ## COG1142 Fe-S-cluster-containing hydrogenase components 2 + Term 13833 - 13869 6.7 - Term 13819 - 13856 6.1 15 9 Op 1 2/0.833 - CDS 13862 - 15229 1178 ## COG2252 Permeases 16 9 Op 2 4/0.250 - CDS 15265 - 16581 1412 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases 17 9 Op 3 2/0.833 - CDS 16599 - 17999 292 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 - Term 18127 - 18154 1.5 18 10 Op 1 12/0.000 - CDS 18164 - 21034 2821 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 19 10 Op 2 2/0.833 - CDS 21031 - 21810 838 ## COG1319 Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs 20 10 Op 3 1/0.917 - CDS 21861 - 23189 1376 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases 21 10 Op 4 3/0.417 - CDS 23192 - 26290 2679 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases - Prom 26393 - 26452 10.7 22 11 Tu 1 . - CDS 26612 - 27190 218 ## COG2068 Uncharacterized MobA-related protein - Prom 27292 - 27351 6.1 23 12 Op 1 . + CDS 27461 - 28063 239 ## SBO_3108 hypothetical protein 24 12 Op 2 . + CDS 28111 - 29736 1321 ## COG1975 Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family - Term 29733 - 29770 7.2 25 13 Op 1 . 
- CDS 29777 - 30709 1203 ## COG0549 Carbamate kinase 26 13 Op 2 4/0.250 - CDS 30757 - 32142 986 ## COG0044 Dihydroorotase and related cyclic amidohydrolases - Term 32145 - 32177 5.0 27 14 Op 1 4/0.250 - CDS 32195 - 33406 1536 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 28 14 Op 2 2/0.833 - CDS 33464 - 34660 1143 ## COG1171 Threonine dehydratase - Term 34678 - 34711 5.2 29 14 Op 3 . - CDS 34718 - 35905 1450 ## COG0078 Ornithine carbamoyltransferase 30 15 Tu 1 . + CDS 36384 - 38162 1005 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains + Term 38170 - 38212 11.1 31 16 Op 1 15/0.000 - CDS 38202 - 38681 362 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 32 16 Op 2 12/0.000 - CDS 38678 - 39556 553 ## COG1319 Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs 33 16 Op 3 . - CDS 39567 - 41864 1745 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs - Prom 41988 - 42047 3.6 34 17 Tu 1 . + CDS 42279 - 43034 315 ## COG0739 Membrane proteins related to metalloendopeptidases + Term 43131 - 43197 30.0 + TRNA 43113 - 43186 78.8 # Gly CCC 0 0 + Prom 43115 - 43174 80.2 35 18 Op 1 . + CDS 43420 - 45030 554 ## ECIAI39_3279 hypothetical protein 36 18 Op 2 . + CDS 45097 - 45276 95 ## ECH74115_4151 hypothetical protein + Prom 45514 - 45573 7.4 37 19 Op 1 . + CDS 45622 - 45753 93 ## COG2207 AraC-type DNA-binding domain-containing proteins 38 19 Op 2 4/0.250 + CDS 45801 - 46283 148 ## COG4789 Type III secretory pathway, component EscV 39 19 Op 3 . + CDS 46300 - 47085 47 ## COG1157 Flagellar biosynthesis/type III secretory pathway ATPase 40 19 Op 4 . + CDS 47098 - 47607 153 ## COG1157 Flagellar biosynthesis/type III secretory pathway ATPase 41 19 Op 5 . + CDS 47597 - 48013 113 ## ECH74115_4145 EivI 42 19 Op 6 . + CDS 48031 - 48144 104 ## + Prom 48402 - 48461 1.8 43 20 Tu 1 . 
+ CDS 48557 - 48904 199 ## ECUMN_3199 type III secretion protein + Prom 49419 - 49478 7.8 44 21 Op 1 7/0.083 + CDS 49510 - 49851 333 ## COG1886 Flagellar motor switch/type III secretory pathway protein 45 21 Op 2 7/0.083 + CDS 49841 - 50506 102 ## COG4790 Type III secretory pathway, component EscR 46 21 Op 3 8/0.000 + CDS 50516 - 50776 159 ## COG4794 Type III secretory pathway, component EscS + Prom 51191 - 51250 5.9 47 21 Op 4 2/0.833 + CDS 51309 - 51545 166 ## COG4791 Type III secretory pathway, component EscT 48 21 Op 5 . + CDS 51554 - 52132 154 ## COG1377 Flagellar biosynthesis pathway, component FlhB 49 21 Op 6 . + CDS 52093 - 52296 70 ## COG1377 Flagellar biosynthesis pathway, component FlhB 50 21 Op 7 . + CDS 52323 - 52685 230 ## COG1377 Flagellar biosynthesis pathway, component FlhB + Term 52713 - 52755 7.1 51 22 Tu 1 . - CDS 53042 - 53542 130 ## COG2771 DNA-binding HTH domain-containing proteins - Prom 53570 - 53629 6.7 + Prom 53640 - 53699 11.4 52 23 Op 1 . + CDS 53800 - 54981 316 ## EC55989_3140 putative type III secretion EprH protein 53 23 Op 2 . + CDS 54995 - 55150 72 ## ECO103_3423 type III secretion protein EprI + Prom 55185 - 55244 3.5 54 24 Op 1 . + CDS 55327 - 55542 98 ## ECUMN_3189 conserved hypothetical protein, putative type III secretion apparatus protein 55 24 Op 2 . + CDS 55583 - 56317 32 ## COG4669 Type III secretory pathway, lipoprotein EscJ 56 24 Op 3 . + CDS 56333 - 56914 -40 ## ECO103_3419 hypothetical protein + Prom 56937 - 56996 4.2 57 25 Tu 1 . + CDS 57131 - 57562 154 ## EC55989_3135 hypothetical protein + Prom 57652 - 57711 8.8 58 26 Op 1 . + CDS 57783 - 57938 115 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 59 26 Op 2 . + CDS 57972 - 58415 154 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 60 27 Tu 1 . 
- CDS 58460 - 58696 94 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) - Prom 58757 - 58816 7.0 + Prom 58614 - 58673 7.4 61 28 Tu 1 . + CDS 58734 - 58952 122 ## ECSE_3112 hypothetical protein + Term 58970 - 59007 3.2 - Term 58956 - 58995 4.4 62 29 Tu 1 . - CDS 59019 - 59237 150 ## JW5456 hypothetical protein - Prom 59293 - 59352 4.7 63 30 Tu 1 . - CDS 59405 - 60781 185 ## COG0457 FOG: TPR repeat - Prom 60908 - 60967 4.8 - Term 61043 - 61091 11.3 64 31 Tu 1 . - CDS 61116 - 61607 264 ## COG0457 FOG: TPR repeat - Prom 61846 - 61905 5.7 + Prom 62258 - 62317 4.2 65 32 Tu 1 . + CDS 62516 - 62941 138 ## ECUMN_3178 hypothetical protein 66 33 Op 1 . - CDS 63090 - 63572 85 ## JW5455 hypothetical protein 67 33 Op 2 2/0.833 - CDS 63565 - 63900 171 ## COG3710 DNA-binding winged-HTH domains - Prom 63922 - 63981 3.4 - Term 64636 - 64673 3.5 68 34 Tu 1 . - CDS 64708 - 65226 140 ## COG2771 DNA-binding HTH domain-containing proteins - Prom 65466 - 65525 7.6 - Term 65739 - 65793 15.1 69 35 Tu 1 . - CDS 65800 - 67029 537 ## COG0814 Amino acid permeases - Prom 67064 - 67123 4.3 + Prom 67178 - 67237 6.0 70 36 Tu 1 . + CDS 67284 - 68465 968 ## COG0183 Acetyl-CoA acetyltransferase + Prom 68591 - 68650 5.8 71 37 Op 1 9/0.000 + CDS 68860 - 69588 697 ## COG3717 5-keto 4-deoxyuronate isomerase 72 37 Op 2 . 
+ CDS 69618 - 70379 749 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) Predicted protein(s) >gi|296493159|gb|ADTK01000342.1| GENE 1 58 - 1038 1100 326 aa, chain - ## HITS:1 COG:ygfZ KEGG:ns NR:ns ## COG: ygfZ COG0354 # Protein_GI_number: 16130800 # Func_class: R General function prediction only # Function: Predicted aminomethyltransferase related to GcvT # Organism: Escherichia coli K12 # 1 326 1 326 326 627 100.0 1e-179 MAFTPFPPRQPTASARLPLTLMTLDDWALATITGADSEKYMQGQVTADVSQMAEDQHLLA AHCDAKGKMWSNLRLFRDGDGFAWIERRSVREPQLTELKKYAVFSKVTIAPDDERVLLGV AGFQARAALANLFSELPSKEKQVVKEGATTLLWFEHPAERFLIVTDEATANMLTDKLRGE AELNNSQQWLALNIEAGFPVIDAANSGQFIPQATNLQALGGISFKKGCYTGQEMVARAKF RGANKRALWLLAGSASRLPEAGEDLELKMGENWRRTGTVLAAVKLEDGQVVVQVVMNNDM EPDSIFRVRDDANTLHIEPLPYSLEE >gi|296493159|gb|ADTK01000342.1| GENE 2 1281 - 1547 307 88 aa, chain + ## HITS:1 COG:ECs3769 KEGG:ns NR:ns ## COG: ECs3769 COG2938 # Protein_GI_number: 15833023 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 88 1 88 88 171 100.0 4e-43 MDINNKARIHWACRRGMRELDISIMPFFEHEYDSLSDDEKRIFIRLLECDDPDLFNWLMN HGKPADAELEMMVRLIQTRNRERGPVAI >gi|296493159|gb|ADTK01000342.1| GENE 3 1528 - 1935 254 135 aa, chain + ## HITS:1 COG:no KEGG:SSON_3049 NR:ns ## KEGG: SSON_3049 # Name: not_defined # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 135 1 135 135 212 100.0 3e-54 MVLWQSDLRVSWRAQWLSLLIHGLVAAVILLMPWPLSYTPLWMVLLSLVVFDCVRSQRRI NARQGEIRLLMDGRLRWQGQEWSIVKAPWMIKSGMMLRLRSDGGKRQHLWLAADSMDEAE WRDLRRILLQQETQR >gi|296493159|gb|ADTK01000342.1| GENE 4 1975 - 2496 718 173 aa, chain - ## HITS:1 COG:ECs3767 KEGG:ns NR:ns ## COG: ECs3767 COG0716 # Protein_GI_number: 15833021 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Escherichia coli O157:H7 # 1 173 1 173 173 342 100.0 2e-94 MNMGLFYGSSTCYTEMAAEKIRDIIGPELVTLHNLKDDSPKLMEQYDVLILGIPTWDFGE 
IQEDWEAVWDQLDDLNLEGKIVALYGLGDQLGYGEWFLDALGMLHDKLSTKGVKFVGYWP TEGYEFTSPKPVIADGQLFVGLALDETNQYDLSDERIQSWCEQILNEMAEHYA >gi|296493159|gb|ADTK01000342.1| GENE 5 2608 - 3504 800 298 aa, chain + ## HITS:1 COG:xerD KEGG:ns NR:ns ## COG: xerD COG4974 # Protein_GI_number: 16130796 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Escherichia coli K12 # 1 298 1 298 298 548 100.0 1e-156 MKQDLARIEQFLDALWLEKNLAENTLNAYRRDLSMMVEWLHHRGLTLATAQSDDLQALLA ERLEGGYKATSSARLLSAVRRLFQYLYREKFREDDPSAHLASPKLPQRLPKDLSEAQVER LLQAPLIDQPLELRDKAMLEVLYATGLRVSELVGLTMSDISLRQGVVRVIGKGNKERLVP LGEEAVYWLETYLEHGRPWLLNGVSIDVLFPSQRAQQMTRQTFWHRIKHYAVLAGIDSEK LSPHVLRHAFATHLLNHGADLRVVQMLLGHSDLSTTQIYTHVATERLRQLHQQHHPRA >gi|296493159|gb|ADTK01000342.1| GENE 6 3529 - 4239 740 236 aa, chain + ## HITS:1 COG:ECs3765 KEGG:ns NR:ns ## COG: ECs3765 COG1651 # Protein_GI_number: 15833019 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Protein-disulfide isomerase # Organism: Escherichia coli O157:H7 # 1 236 1 236 236 468 100.0 1e-132 MKKGFMLFTLLAAFSGFAQADDAAIQQTLAKMGIKSSDIQPAPVAGMKTVLTNSGVLYIT DDGKHIIQGPMYDVSGTAPVNVTNKMLLKQLNALEKEMIVYKAPQEKHVITVFTDITCGY CHKLHEQMADYNALGITVRYLAFPRQGLDSDAEKEMKAIWCAKDKNKAFDDVMAGKSVAP ASCDVDIADHYALGVQLGVSGTPAVVLSNGTLVPGYQPPKEMKEFLDEHQKMTSGK >gi|296493159|gb|ADTK01000342.1| GENE 7 4245 - 5978 1729 577 aa, chain + ## HITS:1 COG:ECs3764 KEGG:ns NR:ns ## COG: ECs3764 COG0608 # Protein_GI_number: 15833018 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-specific exonuclease # Organism: Escherichia coli O157:H7 # 1 577 1 577 577 1151 99.0 0 MKQQIQLRRREVDETADLPAELPPLLRRLYASRGVRSAQELERSVKGMLPWQQLSGVEKA VEILYNAFREGTRIIVVGDFDADGATSTALSVLAMRSLGCSNIDYLVPNRFEDGYGLSPE VVDQAHARGAQLIVTVDNGISSHAGVEHARSLGIPVIVTDHHLPGETLPAAEAIINPNLR DCNFPSKSLAGVGVAFYLMLALRTFLRDQGWFDERGIAIPNLAELLDLVALGTVADVVPL DANNRILTWQGMSRIRAGKCRPGIKALLEVANRDAQKLAASDLGFALGPRLNAAGRLDDM 
SVGVALLLCDNIGEARVLANELDALNQTRKEIEQGMQVEALTLCEKLERSRDTLPGGLAM YHPEWHQGVVGILASRIKERFHRPVIAFAPAGDGTLKGSGRSIQGLHMRDALERLDTLYP GMILKFGGHAMAAGLSLEEDKFELFQQRFGELVTEWLDPSLLQGEVVSDGPLSPAEMTME VAQLLRDAGPWGQMFPEPLFDGHFRLLQQRLVGERHLKVMVEPVGGGPLLDGIAFNVDTA LWPDNGVREVQLAYKLDINEFRGNRSLQIIIDNIWPI >gi|296493159|gb|ADTK01000342.1| GENE 8 6286 - 7167 946 293 aa, chain + ## HITS:1 COG:ECs3763 KEGG:ns NR:ns ## COG: ECs3763 COG1186 # Protein_GI_number: 15833017 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Escherichia coli O157:H7 # 1 293 73 365 365 542 100.0 1e-154 MKQGLEDVSGLLELAVEADDEETFNEAVAELDALEEKLAQLEFRRMFSGEYDSADCYLDI QAGSGGTEAQDWASMLERMYLRWAESRGFKTEIIEESEGEVAGIKSVTIKISGDYAYGWL RTETGVHRLVRKSPFDSGGRRHTSFSSAFVYPEVDDDIDIEINPADLRIDVYRASGAGGQ HVNRTESAVRITHIPTGIVTQCQNDRSQHKNKDQAMKQMKAKLYELEMQKKNAEKQAMED NKSDIGWGSQIRSYVLDDSRIKDLRTGVETRNTQAVLDGSLDQFIEASLKAGL >gi|296493159|gb|ADTK01000342.1| GENE 9 7177 - 8694 2035 505 aa, chain + ## HITS:1 COG:lysS KEGG:ns NR:ns ## COG: lysS COG1190 # Protein_GI_number: 16130792 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA synthetase (class II) # Organism: Escherichia coli K12 # 1 505 1 505 505 1003 99.0 0 MSEQHAQGADAVVDLNNELKTRREKLANLREQGIAFPNDFRRDHTSDQLHAEFDGKENEE LEALNIEVAVAGRMMTRRIMGKASFVTLQDVGGRIQLYVARDDLPEGVYNEQFKKWDLGD ILGAKGKLFKTKTGELSIHCTELRLLTKALRPLPDKFHGLQDQEARYRQRYLDLISNDES RNTFKVRSQILSGIRQFMVNRGFMEVETPMMQVIPGGAAARPFITHHNALDLDMYLRIAP ELYLKRLVVGGFERVFEINRNFRNEGISVRHNPEFTMMELYMAYADYKDLIELTESLFRT LAQDILGKTEVTYGDVTLDFGKPFEKLTMREAIKKYRPETDMADLDNFDSAKAIAESIGI HVEKSWGLGRIVTEIFEEVAEAHLIQPTFITEYPAEVSPLARRNDINPEITDRFEFFIGG REIGNGFSELNDAEDQAQRFLDQVAAKDAGDDEAMFYDEDYVTALEHGLPPTAGLGIGID RMVMLFTNSHTIRDVILFPAMRPVK >gi|296493159|gb|ADTK01000342.1| GENE 10 8737 - 9285 401 182 aa, chain - ## HITS:1 COG:idi KEGG:ns NR:ns ## COG: idi COG1443 # Protein_GI_number: 16130791 # Func_class: I Lipid transport and metabolism # Function: 
Isopentenyldiphosphate isomerase # Organism: Escherichia coli K12 # 1 182 1 182 182 369 98.0 1e-102 MQTEHVILLNAQGVPTGTLEKYAAHTADTLLHLAFSSWLFNAKGQLLVTRRALSKKAWPG VWTNSVCGHPQLGESSEDAVIRRCRYELGVEITPPESIYPDFRYRATDPSGIVENEVCPV FAARTTSALQINDDEVMDYQWCDLADVLHGIDATPWAFSPWMVMQATNREARIRLSAFTQ LK >gi|296493159|gb|ADTK01000342.1| GENE 11 9408 - 9533 78 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNFLMRAIFSLLLLFTLSIPVISDCVAMAIESRFKYMMLLF >gi|296493159|gb|ADTK01000342.1| GENE 12 9535 - 10983 375 482 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 29 440 15 416 447 149 27 5e-35 MSAIDSQLPSSSGQDRPTDEVDRILSPGKLIILGLQHVLVMYAGAVAVPLMIGDRLGLSK DAIAMLISSDLFCCGIVTLLQCIGIGRFMGIRLPVIMSVTFAAVTPMIAIGMNPDIGLLG IFGATIAAGFITTLLAPLIGRLMPLFPPLVTGVVITSIGLSIIQVGIDWAAGGKGNPQYG NPVYLGISFAVLIFILLITRYAKGFMSNVAVLLGIVFGFLLSWMMNEVNLSGLHDASWFA IVTPMSFGMPIFDPVSILTMTAVLIIVFIESMGMFLALGEIVGRKLSSHDIIRGLRVDGV GTMIGGTFNSFPHTSFSQNVGLVSVTRVHSRWVCISSGIILILFGMVPKMAVLVASIPQF VLGGAGLVMFGMVLATGIRILSRCNYTTNRYNLYIVAISLGVGMTPTLSHDFFSKLPAVL QPLLHSGIMLATLSAVVLNVFFNGYQHHADLVKESVSDKDLKVRTVRMWLLMRKLKKNEH GE >gi|296493159|gb|ADTK01000342.1| GENE 13 11404 - 13338 1036 644 aa, chain + ## HITS:1 COG:ECs3759_2 KEGG:ns NR:ns ## COG: ECs3759_2 COG0493 # Protein_GI_number: 15833013 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Escherichia coli O157:H7 # 158 644 1 487 487 994 98.0 0 MKGMQMNKFIAAEAAECIGCHACEIACAVAHNQENWPLSHSDFRPRIHVVGKGQAANPVA CHHCNNAPCVTACPVNALTFQSDSVQLDEQKCIGCKRCAIACPFGVVEMVDTIAQKCDLC NQRSSGTQACIEVCPTQALRLMDDKGLQQIKVARQRKTAAGKASSVAQPSRSAALLPVNS RKGADKISASERKNHFGEIYCGLDPLQATYESDRCVYCAEKANCNWHCPLHNAIPDYIRL VQEGKIIEAAELCHQTSSLPEICGRVCPQDRLCEGACTLKDHSGAVTIGNLERYITDTAL AMGWRPDVSKVVPRSEKVAVIGAGPAGLGCADILARAGVQVDVFDRHPEIGGMLTFGIPP FKLDKTVLSQRREIFTAMGIDFHLNCEIGRDITFSNLTSEYDAVFIGVGTYGMMRADLPH 
EDAPGVIQALPFLTAHTRQLMGLPESEEYPLTDVEGKRVVVLGGGDTTMDCLRTSIRLNA ASVTCAYRRDEVSMPGSRKEVVNAREEGVEFQFNVQPQYIACDEDGRLTAVGLIRTAMGE PGPDGRRRPRPVAGSEFELPADVLIMAFGFQAHAMPWLQGSGIKLDKWGLIQTGDVGYLP TQTHLKKVFAGGDAVHGADLVVTAMAAGRQAARDMLTLFDTKAS >gi|296493159|gb|ADTK01000342.1| GENE 14 13338 - 13826 268 162 aa, chain + ## HITS:1 COG:ygfS KEGG:ns NR:ns ## COG: ygfS COG1142 # Protein_GI_number: 16130788 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 2 # Organism: Escherichia coli K12 # 1 162 2 163 163 267 100.0 5e-72 MKSLIIVNPADCIGCRTCEVACVVAHPSEQELNADVFLPRLKVQRLDSISAPVMCHQCEN APCVGACPVGALTMGEQVVQTNSARCIGCQSCVSACPFGMITIQSLPGDTRQQIVKCDLC EQREEGPACVESCPTQALQLLTERELRRVRQQRIVASGENPL >gi|296493159|gb|ADTK01000342.1| GENE 15 13862 - 15229 1178 455 aa, chain - ## HITS:1 COG:ECs3757 KEGG:ns NR:ns ## COG: ECs3757 COG2252 # Protein_GI_number: 15833011 # Func_class: R General function prediction only # Function: Permeases # Organism: Escherichia coli O157:H7 # 1 455 1 455 455 711 100.0 0 MSGDILQTPDAPKPQGALDNYFKITARGSTVRQEVLAGLTTFLAMVYSVIVVPGMLGKAG FPPAAVFVATCLVAGFGSLLMGLWANLPMAIGCAISLTAFTAFSLVLGQQISVPVALGAV FLMGVIFTAISVTGVRTWILRNLPMGIAHGTGIGIGLFLLLIAANGVGMVIKNPIEGLPV ALGAFTSFPVMMSLLGLAVIFGLEKCRVPGGILLVIIAISIIGLIFDPAVKYHGLVAMPS LTGEDGKSLIFSLDIMGALQPTVLPSVLALVMTAVFDATGTIRAVAGQANLLDKDNQIIN GGKALTSDSVSSIFSGLVGAAPAAVYIESAAGTAAGGKTGLTATVVGALFLLILFLSPLS FLIPGYATAPALMYVGLLMLSNVSKLDFNDFIDAMAGLVCAVFIVLTCNIVTGIMLGFVT LVVGRVFAREWQKLNIGTVIITAALVAFYAGGWAI >gi|296493159|gb|ADTK01000342.1| GENE 16 15265 - 16581 1412 438 aa, chain - ## HITS:1 COG:ygfP KEGG:ns NR:ns ## COG: ygfP COG0402 # Protein_GI_number: 16130785 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Escherichia coli K12 # 1 438 2 439 439 919 99.0 0 MSGEHTLKAVRGSFIDVTRTVDNPEEIASALRFIEDGLLLIKQGKVEWFGEWENGKHQIP DTIRVRDYRGKLIVPGFVDTHIHYPQSEMVGAYGEQLLEWLNKHTFPTERRYEDLEYARE 
MSAFFIKQLLRNGTTTALVFGTVHPQSVDALFEAASHINMRMIAGKVMMDRNAPDYLLDT AESSYHQSKELIERWHKNGRLLYAITPRFAPTSSPEQMAMAQRLKEEYPDTWVHTHLCEN KDEIAWVKSLYPDHDGYLDVYHQYGLTGKNCVFAHCVHLEEKEWDRLSETKSSIAFCPTS NLYLGSGLFNLKKAWQKKVKVGMGTDIGAGTTFNMLQTLNEAYKVLQLQGYRLSAYEAFY LATLGGAKSLGLDDLIGNFLPGKEADFVVMEPTATPLQQLRYDNSVSLVDKLFVMMTLGD DRSIYRTYVDGRLVYERN >gi|296493159|gb|ADTK01000342.1| GENE 17 16599 - 17999 292 466 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 45 453 27 425 447 117 25 2e-25 MSDINHAGSDLIFELEDRPPFHQALVGAITHLLAIFVPMVTPALIVGAALQLSAETTAYL VSMAMIASGIGTWLQVNRYGIVGSGLLSIQSVNFSFVTVMIALGSSMKSDGFHEELIMSS LLGVSFVGAFLVVGSSFILPYLRRVITPTVSGIVVLMIGLSLIKVGIIDFGGGFAAKSSG TFGNYEHLGVGLLVLIVVIGFNCCRSPLLRMGGIAIGLCVGYIASLCLGMVDFSSMRNLP LITIPHPFKYGFSFSFHQFLVVGTIYLLSVLEAVGDITATAMVSRRPIQGEEYQSRLKGG VLADGLVSVIASAVGSLPLTTFAQNNGVIQMTGVASRYVGRTIAVMLVILGLFPMIGGFF TTIPSAVLGGAMTLMFSMIAIAGIRIIITNGLKRRETLIVATSLGLGLGVSYDPEIFKIL PASIYVLVENPICAGGLTAILLNIILPGGYRQENVLPGITSAEEMD >gi|296493159|gb|ADTK01000342.1| GENE 18 18164 - 21034 2821 956 aa, chain - ## HITS:1 COG:ECs3754_2 KEGG:ns NR:ns ## COG: ECs3754_2 COG1529 # Protein_GI_number: 15833008 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Escherichia coli O157:H7 # 160 956 1 797 797 1638 99.0 0 MIIHFTLNGAPQELTVNPGENVQKLLFNMGMHSVRNSDDGFGFAGSDAIIFNGNIVNASL LIAAQLEKADIRTAESLGKWNELSLVQQAMVDVGVVQSGYNDPAAALIITDLLDRIAAPT REEIDDALSGLFSRDAGWQQYYQVIELAVARKNNPQATIDIAPTFRDDLEVIGKHYPKTD AAKMVQAKPCYVEDRVTADACVIKMLRSPHAHALITHLDVSKAEALPGVVHVITHLNCPD IYYTPGGQSAPEPSPLDRRMFGKKMRHVGDRVAAVVAESEEIALEALKLIDVEYEVLKPV MSIDEAMAEDAPVVHDEPVVYVAGAPDTLEDDNSHAAQRGEHMIINFPIGSRPRKNIAAS IHGHIGDMDKGFADADVIIERTYNSTQAQQCPTETHICFTRMDGDRLVIHASTQVPWHLR RQVARLVGMKQHKVHVIKERVGGGFGSKQDILLEEVCAWATCVTGRPVLFRYTREEEFIA NTSRHVAKVTVKLGAKRDGRLTAVKMDFRANTGPYGNHSLTVPCNGPALSLPLYPCDNVD 
FQVTTYYSNICPNGAYQGYGAPKGNFAITMALAELAEQLQIDQLEIIERNRVHEGQELKI LGAIGEGKAPTSVPSAASCALEEILRQGREMIQWSSPKPQNGDWHIGRGVAIIMQKSGIP DIDQANCMIKLESDGTFIVHSGGADIGTGLDTVVTKLAAEVLHCPPQDVHVISGDTDHAL FDKGAYASSGTCFSGNAARLAAENLREKILFHGAQMLGEAVADVQLATPGVVRGKKGEVS FGDIAHKGETGTGFGSLVGTGSYITPDFAFPYGANFAEVAVNTRTGEIRLDKFYALLDCG TPVNPELALGQIYGATLRAIGHSMSEEIIYDAEGHPLTRDLRSYGAPKIGDIPRDFRAVL VPSDDKVGPFGAKSISEIGVNGAAPAIATAIHDACGIWLREWHFTPEKILTALEKI >gi|296493159|gb|ADTK01000342.1| GENE 19 21031 - 21810 838 259 aa, chain - ## HITS:1 COG:ECs3753 KEGG:ns NR:ns ## COG: ECs3753 COG1319 # Protein_GI_number: 15833007 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs # Organism: Escherichia coli O157:H7 # 1 259 1 259 259 500 99.0 1e-141 MIEQFFRPDSVEQALELKRRYQDEAVWFAGGSKLNATPTRTDKKIAISLQDLELDWVDWD NGALRIGAMSRLQPLRDARFIPAALREALGFVYSRHVRNQSTIGGEIAARQEESVLLPVL LALDAELVFGNGETLSIEDYLASPCDRLLTEIIIKDPYRTCATRKISRSQAGLTVVTAAV AMTDHDGMRIALDGVASKALRLHDVEKQNLEGNALEQAVANAIFPQEDLRGSVAYKRYIT GVLVADLYADCQQAGEEAV >gi|296493159|gb|ADTK01000342.1| GENE 20 21861 - 23189 1376 442 aa, chain - ## HITS:1 COG:Z4218 KEGG:ns NR:ns ## COG: Z4218 COG0402 # Protein_GI_number: 15803416 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Escherichia coli O157:H7 EDL933 # 1 442 23 464 464 902 98.0 0 MLILKNVTAVQLHPAKVQEAVDIAIENDVIVAIGDALTQRYPDASYKEMHGRIVMPGIVC SHNHFYSGLSRGIMANIAPCPDFISTLKNLWWRLDRALDEESLYYSGLICSLEAIKSGCT SVIDHHASPAYIGGSLSTLRDAFLKVGLRAMTCFETTDRNNGIKELQEGVEENIRFARQI DEAKKAATEPYLVEAHIGAHAPFTVPDAGLEMLREAVKATGRGLHIHAAEDLYDVSYSHH WYGKDLLARLAQFDLIDSKTLVAHGLYLSKDDIALLNQRDAFLVHNARSNMNNHVGYNHH LSDIRNLALGTDGIGSDMFEEMKFAFFKHRDAGGPLWPDSFAKALTNGNELMSRNFAAKF GLLEAGYKADLTICDYNSPTPLLADNIAGHIAFGMGSGSVHSVMVNGVMVYEDRQFNFDC DSIYAQARKAAASMWRRMDTLA >gi|296493159|gb|ADTK01000342.1| GENE 21 23192 - 26290 2679 1032 aa, chain - ## HITS:1 
COG:ygfK_2 KEGG:ns NR:ns ## COG: ygfK_2 COG0493 # Protein_GI_number: 16130780 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Escherichia coli K12 # 451 1032 1 582 582 1226 99.0 0 MGDIMRPIPFEELLTRIFDEYQQQRSIFGIPEQQFYSPVKGKTVSVFGETCATPVGPAAG PHTQLAQNIVTSWLTGGRFIELKTVQILDRLELEKPCIDAEDECFNTEWSTEFTLLKAWD EYLKAWFALHLLEAMFQPSDSGKSFIFNMSVGYNLEGIKQPPMQQFIDNMMDASDHPKFA QYRDTLNKLLQDDAFLARHGLQEKRESLQALPARIPTSMVHGVTLSTMHGCPPHEIEAIC RYMLEEKGLNTFVKLNPTLLGYARVREILDVCGFGYIGLKEESFDHDLKLTQALEMLERL MALAKEKSLGFGVKLTNTLGTINNKGALPGEEMYMSGRALFPLSINVAAVLSRAFDGKLP ISYSGGASQLTIRDIFDTGIRPITMATDLLKPGGYLRLSACMRELEGSDAWGLDHVDVER LNRLAADALTMEYTQKHWKPEERIEVAEDLPLTDCYVAPCVTACAIKQDIPEYIRLLGEH RYADALELIYQRNALPAITGHICDHQCQYNCTRLDYDSALNIRELKKVALEKGWDEYKQR WHKPAGSGSRHPVAVIGAGPAGLAAGYFLARAGHPVTLFEREANAGGVVKNIIPQFRIPA ELIQHDIDFVAAHGVKFEYGCSPDLTVEQLKNQGFHYVLIATGTDKNSGVKLAGDNQNVW KSLPFLREYNKGTALKLGKHVVVVGAGNTAMDCARAALRVPGVKKATIVYRRSLQEMPAW REEYEEALHDGVEFRFLNNPERFDADGTLTLRVMSLGEPDEKGRRRPVETNETVTLHVDS LITAIGEQQDTEALNAMGVPLDKNGWPDVDHNGETRLTDVFMIGDVQRGPSSIVAAVGTA RRATDAILSRENIRSHQNDKYWNNVNPAEIYQRKGDISITLVNSDDRDAFVAQEAARCLE CNYVCSKCVDVCPNRANVSIAVPGFQNRFQTLHLDAYCNECGNCAQFCPWNGKPYKDKIT VFSLSQDFDNSSNPGFLVEDCRVRVRLNNQSWVLNIDSEGQFNNVPPELNDMCRIISHVH QHHHYLLGRVEV >gi|296493159|gb|ADTK01000342.1| GENE 22 26612 - 27190 218 192 aa, chain - ## HITS:1 COG:ECs3750 KEGG:ns NR:ns ## COG: ECs3750 COG2068 # Protein_GI_number: 15833004 # Func_class: R General function prediction only # Function: Uncharacterized MobA-related protein # Organism: Escherichia coli O157:H7 # 1 192 1 192 192 396 100.0 1e-110 MSAIDCIITAAGLSSRMGQWKMMLPWQQGTILDTSIKNALQFCSRIILVTGYRGNELHER YANQSNITIIHNPDYAQGLLTSVKAAVPAVQTEHCFLTHGDMPTLTIDIFRKIWSLRNDG AILPLHNGIPGHPILVSKPCLMQAIKRPNITNMRQALLMGAHYSVEVENAEIILDIDTPD DFITAQKRYTKI >gi|296493159|gb|ADTK01000342.1| GENE 23 27461 - 28063 239 200 aa, chain + ## HITS:1 COG:no 
KEGG:SBO_3108 NR:ns ## KEGG: SBO_3108 # Name: not_defined # Def: hypothetical protein # Organism: S.boydii # Pathway: not_defined # 1 200 36 235 235 416 100.0 1e-115 MFMPTSHWPVVFCRDPAMLPHASLTSPISFCFHSWKANQGKVQGFTPEAIDALVQRPECD VILIEADGSRAMPLKAPDEHEPCIPKSSCCVIAVMGGHILGAKVSTENVHRWSQFADITG LTPDATLQLSDLVALVRHPQGAFKNVPQGCRRVWFINRFSQCENAIAQSELLQPLQQHDV EAIWLGDIQEHPAIARRFVN >gi|296493159|gb|ADTK01000342.1| GENE 24 28111 - 29736 1321 541 aa, chain + ## HITS:1 COG:yqeB KEGG:ns NR:ns ## COG: yqeB COG1975 # Protein_GI_number: 16130777 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family # Organism: Escherichia coli K12 # 1 541 1 541 541 1045 99.0 0 MNIFTEAAKLEEQNCPFAMAQIVDSRGSTPRHSAQMLVRADGSIVGTIGGGMVERKVIEE SLQALQERKPRLFHGRMARNGADAVGSDCGGAMSVFISVHGMRPRLVLIGAGHVNRAIAQ SAALLGFDIAVADIYRESLNPELFPPSTTLLHAESFGAAVEALDIRPDNFVLIATNNQDR EALDKLIEQPIAWLGLLASRRKVQLFLRQLREKGVAEEHIARLHAPVGYNIGAETPQEIA ISVLAEILQVKNNAPGGLMMKPSHPSGHQLVVIRGAGDIASGVALRLYHAGFKVIMLEVE KPTVIRCTVAFAQAVFDGEMTVEGVTARLATSSAEAMKLTERGFIPVMVDPACSLLDELK PLCVVDAILAKQNLGTRADMAPVTIALGPGFIAGKDCHAVIETNRGHWLGQVIYSGCAQE NTGVPGNIMGHTTRRVIRAPAAGIMRSNVKLGDLVKEGDVIAWIGEHEIKAPLTGMVRGL LNDGLAVVGGFKIGDIDPRGETADFTSVSDKARAIGGGVLEALMMLMHQGVKATKEVLEV A >gi|296493159|gb|ADTK01000342.1| GENE 25 29777 - 30709 1203 310 aa, chain - ## HITS:1 COG:yqeA KEGG:ns NR:ns ## COG: yqeA COG0549 # Protein_GI_number: 16130776 # Func_class: E Amino acid transport and metabolism # Function: Carbamate kinase # Organism: Escherichia coli K12 # 1 310 1 310 310 564 100.0 1e-161 MSKKIVLALGGNALGDDLAGQMKAVKITSQAIVDLIAQGHEVIVTHGNGPQVGMINQAFE AAAKTEAHSPMLPMSVCVALSQGYIGYDLQNALREELLSRGINKPVATLVTQVEVDANDP AFLNPTKPIGSFFTEQEAEQLTKQGYTLKEDAGRGYRRVVASPKPVDIIEKETVKALVDA GQVVITVGGGGIPVIREGNHLRGASAVIDKDWASARLAEMIDADMLIILTAVEKVAINFG KENEQWLDRLSLSDAERFIEEGHFAKGSMLPKVEAAASFARSRAGREALITVLSKAKEGI EGKTGTVICQ >gi|296493159|gb|ADTK01000342.1| GENE 26 30757 - 32142 986 461 aa, chain - 
## HITS:1 COG:ECs3746 KEGG:ns NR:ns ## COG: ECs3746 COG0044 # Protein_GI_number: 15833000 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Escherichia coli O157:H7 # 1 461 5 465 465 983 99.0 0 MRVLIKNGTVVNADGQAKQDLLIESGIVRQLGNNISPQLPYEEIDATGCYVFPGGVDVHT HFNIDVGIARSCDDFFTGTRAAACGGTTTIIDHMGFGPNGCRLRHQLEVYRGYAAHKAVI DYSFHGVIQHINHAILDEIPMMVEEGLSSFKLYLTYQYKLNDDEVLQALRRLHESGALTT VHPENDAAIASKRAEFIAAGLTAPRYHALSRPLECEAEAIARMINLAQIAGNAPLYIVHL SNGLGLDYLRLARANHQPVWVETCPQYLLLDERSYDTEDGMKFILSPPLRNIREQDKLWC GISDGAIDVVATDHCTFSMAQRLQISKGDFSRCPNGLPGVENRMQLLFSSGVMTGRITPE RFVELTSAMPARLFGLWPQKGLLAPGSDGDVVIIDPRQSQQIQHRHLHDNADYSPWEGFT CQGAIVRTLSRGETIFCDGTFTGKAGRGRFLRRKPFVPPVL >gi|296493159|gb|ADTK01000342.1| GENE 27 32195 - 33406 1536 403 aa, chain - ## HITS:1 COG:ECs3745 KEGG:ns NR:ns ## COG: ECs3745 COG0624 # Protein_GI_number: 15832999 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Escherichia coli O157:H7 # 1 403 1 403 403 848 100.0 0 MAKNIPFKLILEKAKDYQADMTRFLRDMVAIPSESCDEKRVVHRIKEEMEKVGFDKVEID PMGNVLGYIGHGPRLVAMDAHIDTVGIGNIKNWDFDPYEGMETDELIGGRGTSDQEGGMA SMVYAGKIIKDLGLEDEYTLLVTGTVQEEDCDGLCWQYIIEQSGIRPEFVVSTEPTDCQV YRGQRGRMEIRIDVQGVSCHGSAPERGDNAIFKMGPILGELQELSQRLGYDEFLGKGTLT VSEIFFTSPSRCAVADSCAVSIDRRLTWGETWEGALDEIRALPAVQKANAVVSMYNYDRP SWTGLVYPTECYFPTWKVEEDHFTVKALVNAYEGLFGKAPVVDKWTFSTNGVSIMGRHGI PVIGFGPGKEPEAHAPNEKTWKSHLVTCAAMYAAIPLSWLATE >gi|296493159|gb|ADTK01000342.1| GENE 28 33464 - 34660 1143 398 aa, chain - ## HITS:1 COG:ECs3744 KEGG:ns NR:ns ## COG: ECs3744 COG1171 # Protein_GI_number: 15832998 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Escherichia coli O157:H7 # 1 398 1 398 398 827 100.0 0 MSVFSLKIDIADNKFFNGETSPLFSQSQAKLARQFHQKIAGYRPTPLCALDDLANLFGVK KILVKDESKRFGLNAFKMLGGAYAIAQLLCEKYHLDIETLSFEHLKNAIGEKMTFATTTD 
GNHGRGVAWAAQQLGQNAVIYMPKGSAQERVDAILNLGAECIVTDMNYDDTVRLTMQHAQ QHGWEVVQDTAWEGYTKIPTWIMQGYATLADEAVEQMREMGVTPTHVLLQAGVGAMAGGV LGYLVDVYSPQNLHSIIVEPDKADCIYRSGVKGDIVNVGGDMATIMAGLACGEPNPLGWE ILRNCATQFISCQDSVAALGMRVLGNPYGNDPRIISGESGAVGLGVLAAVHYHPQRQSLM EKLALNKDAVVLVISTEGDTDVKHYREVVWEGKHAVAP >gi|296493159|gb|ADTK01000342.1| GENE 29 34718 - 35905 1450 395 aa, chain - ## HITS:1 COG:ECs3743 KEGG:ns NR:ns ## COG: ECs3743 COG0078 # Protein_GI_number: 15832997 # Func_class: E Amino acid transport and metabolism # Function: Ornithine carbamoyltransferase # Organism: Escherichia coli O157:H7 # 1 395 2 396 396 805 99.0 0 MKTVNELIKDINSLTSHLHEKDFLLTWEQTPDELKQVLDVAAALKALRAENISTKVFNSG LGISVFRDNSTRTRFSYASALNLLGLAQQDLDEGKSQIAHGETVRETANMISFCADAIGI RDDMYLGAGNAYMREVGAALDDGYKQGVLPQRPALVNLQCDIDHPTQSMADLAWLREHFG SLENLKGKKIAMTWAYSPSYGKPLSVPQGIIGLMTRFGMDVTLAHPEGYDLIPDVVEVAK NNAKASGGSFRQVTSMEEAFKDADIVYPKSWAPYKVMEERTELLRANDHEGLKALEKQCL AQNAQHKDWHCTEEMMELTRDGEALYMHCLPADISGVSCKEGEVTEGVFEKYRIATYKEA SWKPYIIAAMILSRKYAKPGALLEQLLKEAQERVK >gi|296493159|gb|ADTK01000342.1| GENE 30 36384 - 38162 1005 592 aa, chain + ## HITS:1 COG:ygeV KEGG:ns NR:ns ## COG: ygeV COG3829 # Protein_GI_number: 16130771 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 1 592 1 592 592 1178 99.0 0 MELATTQSVLMQIQPTIQRFARMLASVLQLEVEIVDENLCRVAGTGAYGKFLGRQLSGNS RLLRHVLETKTEKVVTQSRFDPLCEGCDSKENCREKAFLGTPVILQDRCVGVISLIAVTH EQQEHISDNLREFSDYVRHISTIFVSKLLEDQGPGDNISKIFATMIDNMDQGVLVVDDES RVQFVNQTALKTLGVVQNNIIGKPIRFRPLTFESNFTHGHMQHIVSWDDKSELIIGQLHN IQGRQLFLMAFHQSHTSFSVANAPDEPHIEQLVGECRVMRQLKRLISRIAPSPSSVMVVG ESGTGKEVVARAIHKLSGRRNKPFIAINCAAIPEQLLESELFGYVKGAFTGASANGKTGL IQAANTGTLFLDEIGDMPLMLQAKLLRAIEAREILPIGASSPIQVDIRIISATNQNLAQF IAEGKFREDLFYRLNVIPITLPPLRERQEDIELLVHYFLHLHTRRLGSVYPGIAPDVVEI LRKHRWPGNLRELSNLMEYLVNVVPSGEVIDSTLLPPNLLNNGTTEQSDVTEVTEAHLSL DDAGGTALEEMEKQMIREALSRHNSKKQVADELGIGIATLYRKIKKYELLNT 
>gi|296493159|gb|ADTK01000342.1| GENE 31 38202 - 38681 362 159 aa, chain - ## HITS:1 COG:ygeU KEGG:ns NR:ns ## COG: ygeU COG2080 # Protein_GI_number: 16130770 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Escherichia coli K12 # 1 159 1 159 159 295 100.0 2e-80 MNHSETITIECTINGMPFQLHAAPGTPLSELLREQGLLSVKQGCCVGECGACTVLVDGTA IDSCLYLAAWAEGKEIRTLEGEAKGGKLSHVQQAYAKSGAVQCGFCTPGLIMATTAMLAK PREKPLTITEIRRGLAGNLCRCTGYQMIVNTVLDCEKTK >gi|296493159|gb|ADTK01000342.1| GENE 32 38678 - 39556 553 292 aa, chain - ## HITS:1 COG:ECs3740 KEGG:ns NR:ns ## COG: ECs3740 COG1319 # Protein_GI_number: 15832994 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs # Organism: Escherichia coli O157:H7 # 1 292 1 292 292 570 98.0 1e-162 MFDFASYHRAATLADAINLLVDNPQAKLLAGGTDVLIQLHHHNDRYRHIVDIHNLAELRG ITLAEDGSLRIGSATTFTQLIEDPVTQRHLPALCAAASSIAGPQIRNVATYGGNICNGAT SADSATPTLIYDAKLEIHSPRGVRFVPINGFHTGPGKVSLEHDEILVAFHFPPQPKEHVG SAHFKYAMRDAMDISTIGCAAHCRLDNGNFSELRLAFGVAAPTPIRCQHAEQTAQNAPLN LQTLEAISESVLQDVAPRSSWRASKEFRLHLIQTMTKKVISEAVAAAGGKLQ >gi|296493159|gb|ADTK01000342.1| GENE 33 39567 - 41864 1745 765 aa, chain - ## HITS:1 COG:ygeS KEGG:ns NR:ns ## COG: ygeS COG1529 # Protein_GI_number: 16130768 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Escherichia coli K12 # 14 765 1 752 752 1530 100.0 0 MEAREATATGESCMRVDAIAKVTGRARYTDDYVMAGMCYAKYVRSPIAHGYAVSINDEQA RSLPGVLAIFTWEDVPDIPFATAGHAWTLDENKRDTADRALLTRHVRHHGDAVAIVVARD ELTAEKAAQLVSIEWQELPVITTPEAALAEDAAPIHNGGNLLKQSTMSTGNVQQTIDAAD YQVQGHYQTPVIQHCHMESVTSLAWMEDDSRITIVSSTQIPHIVRRVVGQALDIPWSCVR VIKPFVGGGFGNKQDVLEEPMAAFLTSKLGGIPVKVSLSREECFLATRTRHAFTIDGQMG VNRDGTLKGYSLDVLSNTGAYASHGHSIASAGGNKVAYLYPRCAYAYSSKTCYTNLPSAG AMRGYGAPQVVFAVESMLDDAATALGIDPVEIRLRNAAREGDANPLTGKRIYSAGLPECL 
EKGRKIFEWEKRRAECQNQQGNLRRGVGVACFSYTSNTWPVGVEIAGARLLMNQDGTINV QSGATEIGQGADTVFSQMVAETVGVPVSDVRVISTQDTDVTPFDPGAFASRQSYVAAPAL RSAALLLKEKIIAHAAVMLHQSAMNLTLIKGHIVLVERPEEPLMSLKDLAMDAFYHPERG GQLSAESSIKTTTNPPAFGCTFVDLTVDIALCKVTINRILNVHDSGHILNPLLAEGQVHG GMGMGIGWALFEEMIIDAKSGVVRNPNLLDYKMPTMPDLPQLESAFVEINEPQSAYGHKS LGEPPIIPVAAAIRNAVKMATGVAINTLPLTPKRLYEEFHLAGLI >gi|296493159|gb|ADTK01000342.1| GENE 34 42279 - 43034 315 251 aa, chain + ## HITS:1 COG:ygeR KEGG:ns NR:ns ## COG: ygeR COG0739 # Protein_GI_number: 16130767 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Escherichia coli K12 # 1 251 9 259 259 449 99.0 1e-126 MSAGRLNKKSLGIVMLLSVGLLLAGCSGSKSSDTGTYSGSVYTVKRGDTLYRISRTTGTS VKELARLNGISPPYTIEVGQKLKLGGAKSNSITRKSTAKSTTKTASVTPSSAVPKSSWPP VGQRCWLWPTTGKVIMPYSTADGGNKGIDISAPRGTPIYAAGAGKVVYVGNQLRGYGNLI MIKHSEDYITAYAHNDTMLVNNGQSVKAGQKIATMGSTDAASVRLHFQIRYRATAIDPLR YLPPQGSKPKC >gi|296493159|gb|ADTK01000342.1| GENE 35 43420 - 45030 554 536 aa, chain + ## HITS:1 COG:no KEGG:ECIAI39_3279 NR:ns ## KEGG: ECIAI39_3279 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_IAI39 # Pathway: not_defined # 1 536 15 550 550 1039 98.0 0 MKGKSALTLLLAGIFSCGTCQATGAEVTSESVFNILNSTGAATDKSYLSLNPDKYPNYRL LIHSAKLKNEIKSHYTKDEIQGLLTLTENTRKLTLTEKPWGTFILASTFEDDKTAAETHY DAVWLRDSLWGYMALVSDQGNSVAAKKVLLTLWDYMSTPDQIKRMQDVISNPKRLDGVPG QMNAVHIRFDSNSPVMADVQEEGKPQLWNHKQNDALGLYLDLLIQAIDTGTINAEDWQKG DRLKSVALLIAYLDKANFYVMEDSGAWEEDARLNTSSVALVTSGLERLSNLLSKKDSVFV SDLLREAKANELDEPLSTTRLNHLIDKGYERITLQLDLGGESPGYLEKDKHYREADAALL NVIYPANLAKINTRRKEQVLKIVKKLAGPYGIKRYEKDNYQSANFWFNDIKTDTDQNSHV KREKSFIPSTEAEWFFDSWYAKSAAIVYKESRKEEYLNDSVQFMNRSLAQITGENMIGAN GRSVPEMALPESYNYIHKSGTLHEAPNPIIPLNWSKASMTLMLKEMSSLINDEGNK >gi|296493159|gb|ADTK01000342.1| GENE 36 45097 - 45276 95 59 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_4151 NR:ns ## KEGG: ECH74115_4151 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: 
not_defined # 1 59 1 59 59 74 96.0 9e-13 MINKLLLAYLIGLVVTSTFIFIFSEEKVTYRLFATIITGLTWPLSLIPSIISLMIRKSD >gi|296493159|gb|ADTK01000342.1| GENE 37 45622 - 45753 93 43 aa, chain + ## HITS:1 COG:ECs3734 KEGG:ns NR:ns ## COG: ECs3734 COG2207 # Protein_GI_number: 15832988 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 41 1 41 249 82 92.0 2e-16 MIEEGLLLPMNVLLQGQKLTLLDPEMWFLRGNENKDVTIFLIN >gi|296493159|gb|ADTK01000342.1| GENE 38 45801 - 46283 148 160 aa, chain + ## HITS:1 COG:ECs3731 KEGG:ns NR:ns ## COG: ECs3731 COG4789 # Protein_GI_number: 15832985 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type III secretory pathway, component EscV # Organism: Escherichia coli O157:H7 # 1 160 527 686 686 297 98.0 5e-81 MQRISEVIQRLILERISVRNMRLVMEALALWSPREKDIITLVEHVRGALGRYICHKFSYS GEIKAIVISPEIEDRIRDGVRPTAGGTFLNLDASEAEMILDNFKLALSGINIPIKDIILL GSVDIRRFIKKLIESSYRDLEVLSYGELTENVPVNVLKTI >gi|296493159|gb|ADTK01000342.1| GENE 39 46300 - 47085 47 261 aa, chain + ## HITS:1 COG:ECs3730 KEGG:ns NR:ns ## COG: ECs3730 COG1157 # Protein_GI_number: 15832984 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis/type III secretory pathway ATPase # Organism: Escherichia coli O157:H7 # 1 261 5 265 439 506 96.0 1e-143 MEHAGMKRLKLLNKYSYLHSINGSLIEAELDDVSVGEVCEIYASWQANERIARAQVVGFR NGKTLLNLIGSSVGLTRTAVLKPTGEQLTIQISDAFLCSVLNASGQIMERFVPNPPGDRG NLRLIDELPPSYQERRVINTPLETRIRVIDGVLTCGIGQRVGIFASAGCGKTVLMHMLVN NTEADVFVIGLIGERGREDTECAESMKQSVNAAKCVLVYATSDFSSVDRCNAALMATTVA EYFRDRGKRVVLFIDSMTRYA >gi|296493159|gb|ADTK01000342.1| GENE 40 47098 - 47607 153 169 aa, chain + ## HITS:1 COG:ECs3730 KEGG:ns NR:ns ## COG: ECs3730 COG1157 # Protein_GI_number: 15832984 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis/type III secretory pathway ATPase # 
Organism: Escherichia coli O157:H7 # 1 169 271 439 439 335 97.0 3e-92 MKLAAGEPPARGGYPASVFDSLPRLLERPGPTLKGSITAFYTVLLEGEDESDPLGDEIRS ILDGHIYLSRKLAGQGHYPAINVLKSVSRVFGQVTDEKHRDNAARVRKILTTLEDLQVFI DLGEYRAGQNAENDFAMNARPKLTNWLKQSVNEKMPMSETLKELERIVK >gi|296493159|gb|ADTK01000342.1| GENE 41 47597 - 48013 113 138 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_4145 NR:ns ## KEGG: ECH74115_4145 # Name: not_defined # Def: EivI # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 127 4 130 155 155 95.0 6e-37 MLSKVNRLIRRTAQSLAACEASLQKLNAEKEKLAEKERLYDMQLKNLKSLLDKKELLGEV VFRQDIFYSLRKVAVIQQQIAEINLEKQKIAERRKILNKEIVQQQAQRKHWWLKGEKYVR LKTRIKKTFKSDASSRRA >gi|296493159|gb|ADTK01000342.1| GENE 42 48031 - 48144 104 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDEVKKIEHGKILNTDTPDNSASNLDSFLKKNKKIKK >gi|296493159|gb|ADTK01000342.1| GENE 43 48557 - 48904 199 115 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_3199 NR:ns ## KEGG: ECUMN_3199 # Name: eivJ # Def: type III secretion protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 115 177 291 291 150 91.0 1e-35 MFTQHIEKQRNVNTLNQNDINNSVNNANVHENELTYQFQRWGQNHTVRILESSEGIRLKP SDTLVSNRLREAQHNDVTAQRWVLTEQDERQGQRHQPYEEQENESKFENDQKDES >gi|296493159|gb|ADTK01000342.1| GENE 44 49510 - 49851 333 113 aa, chain + ## HITS:1 COG:ECs3726 KEGG:ns NR:ns ## COG: ECs3726 COG1886 # Protein_GI_number: 15832980 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar motor switch/type III secretory pathway protein # Organism: Escherichia coli O157:H7 # 1 113 216 328 328 214 96.0 3e-56 MADHFEYEEDFETDDFDIKKNESEIYDENDDQMINSFEDLPVKIEFVLGKKIMNLYEIDE LCAKRIISLLPESEKNIEIRVNGALTGYGELVEVDDKLGVEIHSWLSGHNNVK >gi|296493159|gb|ADTK01000342.1| GENE 45 49841 - 50506 102 221 aa, chain + ## HITS:1 COG:ECs3725 KEGG:ns NR:ns ## COG: ECs3725 COG4790 # Protein_GI_number: 15832979 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type III secretory pathway, component EscR # Organism: 
Escherichia coli O157:H7 # 1 221 1 221 221 342 98.0 2e-94 MSNSISLIAILTLFTLLPFIIASGTYFIKFSIVFVIVRNALGLQQVPSNMTLNGVALLLS MFVMMPVGTEIYYNSQNENLSFNNVASVVNFVETGMSGYKSYLIKYSEPELVSFFEKIQK VNSSEDNEEIIDDDNISIFSLLPAYALSEIKSAFIIGFYIYLPFVVVDLVISSVLLTLGM MMMSPVIISTPIKLILFVAMDGWTMLSKGLILQYFDLSINP >gi|296493159|gb|ADTK01000342.1| GENE 46 50516 - 50776 159 86 aa, chain + ## HITS:1 COG:ECs3724 KEGG:ns NR:ns ## COG: ECs3724 COG4794 # Protein_GI_number: 15832978 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type III secretory pathway, component EscS # Organism: Escherichia coli O157:H7 # 1 86 1 86 86 147 98.0 6e-36 MNDIVFAGNRALYLILVMSAGPIAVATFVGLLVGLFQTVTQLQEQTLPFGVKLLCVSICF FLMSGWYGEKLYSFGIEMLNLAFARG >gi|296493159|gb|ADTK01000342.1| GENE 47 51309 - 51545 166 78 aa, chain + ## HITS:1 COG:ECs3722 KEGG:ns NR:ns ## COG: ECs3722 COG4791 # Protein_GI_number: 15832976 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type III secretory pathway, component EscT # Organism: Escherichia coli O157:H7 # 1 78 1 78 78 116 100.0 8e-27 MTHTIVYASPVIAVMLGGEAVLGLLARYASQLNAFAISLTVKSALAFLILIIYFGPILAE RVMPLSFFPEQLQLYIEK >gi|296493159|gb|ADTK01000342.1| GENE 48 51554 - 52132 154 192 aa, chain + ## HITS:1 COG:ECs3721 KEGG:ns NR:ns ## COG: ECs3721 COG1377 # Protein_GI_number: 15832975 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FlhB # Organism: Escherichia coli O157:H7 # 1 144 1 144 373 280 96.0 1e-75 MANKTEKPTQKKLQDASKKGQILKSRDLTISVIMLVGTLYLGYVFDVHHIMSILEYIFDH NAKPDIWDYFKAMGVGWLKTIIPFLLVCMFTTILVSWFQSKMQLATEAVKFKFDSLNPVN GLKRIFGLKTVKEFVKAILYIILFFFFLHWRSKYFEVIINHCFLKLLMEISYLYYQIGER CYSFSYCIVSAV >gi|296493159|gb|ADTK01000342.1| GENE 49 52093 - 52296 70 67 aa, chain + ## HITS:1 COG:ECs3721 KEGG:ns NR:ns ## COG: ECs3721 COG1377 # Protein_GI_number: 15832975 # Func_class: N Cell motility; U Intracellular trafficking, secretion, 
and vesicular transport # Function: Flagellar biosynthesis pathway, component FlhB # Organism: Escherichia coli O157:H7 # 1 59 177 235 373 112 98.0 1e-25 MLFLLILYCLGSMIIVLIFDFIAEYFLFMKDMKMDKQEVKREYKKQEGNPEIKSKRRERI RKFFLSN >gi|296493159|gb|ADTK01000342.1| GENE 50 52323 - 52685 230 120 aa, chain + ## HITS:1 COG:ECs3721 KEGG:ns NR:ns ## COG: ECs3721 COG1377 # Protein_GI_number: 15832975 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FlhB # Organism: Escherichia coli O157:H7 # 1 120 254 373 373 227 94.0 4e-60 MIANPTHIAIGIYFKPHLSPIPLISVRETNEVALAVRKYAKEIGIPIITDKKLARKIYAT HRRYDYVSFENIDEILRLLLWLEDVENAGQPVPDEQLSSEDKFIEGEEKEIENKDDNLKN >gi|296493159|gb|ADTK01000342.1| GENE 51 53042 - 53542 130 166 aa, chain - ## HITS:1 COG:ECs3720 KEGG:ns NR:ns ## COG: ECs3720 COG2771 # Protein_GI_number: 15832974 # Func_class: K Transcription # Function: DNA-binding HTH domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 166 1 166 166 302 96.0 2e-82 MQVFSSDVYFTVGTNALLASQNEYYSDLVALVDLGHSFVVIDEHQHRNLNPNTEPVNILL SNNFIRINKNITLSDLTHFLISNLHTQNVYSTQEPLTHDEIDILRLCVSYSLKQIAIIKG IDYKTISYHKIRALNKLNIKGTVELFIALCEWDKHYFKLQSCVRES >gi|296493159|gb|ADTK01000342.1| GENE 52 53800 - 54981 316 393 aa, chain + ## HITS:1 COG:no KEGG:EC55989_3140 NR:ns ## KEGG: EC55989_3140 # Name: eprH # Def: putative type III secretion EprH protein # Organism: E.coli_55989 # Pathway: not_defined # 1 393 4 396 396 738 99.0 0 MENNDKFLSQELLESYAIRLLSGPLNGCEYEILNGRLLVIIGNDVSLGRSDAFSELPENT IVVPYGELTGSFEIIITTDPDLVVTFRELTAQEPEDRTLTLNQQIEVLGLKFAVKEKNEV WQYSLPGIIENNIISTKQHFFSSKLFKYVMLFFLFAIIFIAFYIVNASNDPQLRHIDKIL VNKNRNYEILYGRDHVIYINTNILDEAVWVKQALEKNQPGKPVRVINPDDESIRIFSWLA DNFPDLQYFKLQLLDASNLRLTVSKQRNAITQQLIDNLIKGLLQTMPYASNISIAVLDDN VLESQAIETLSAIGLSYEKYKTANNVYFNIIGTLSDSELNKINNYVDEYYKQWGKQYVRF NVNLKNQDTNNSSFSYRDNRFEKSQGSNWTFQE >gi|296493159|gb|ADTK01000342.1| GENE 53 54995 - 55150 72 51 aa, chain + ## HITS:1 COG:no 
KEGG:ECO103_3423 NR:ns ## KEGG: ECO103_3423 # Name: not_defined # Def: type III secretion protein EprI # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 51 1 51 51 89 100.0 4e-17 MADWNGYIMDISNQFDQGVDDLNQQVEKALEDLANQSLRHEISCRISKCIS >gi|296493159|gb|ADTK01000342.1| GENE 54 55327 - 55542 98 71 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_3189 NR:ns ## KEGG: ECUMN_3189 # Name: not_defined # Def: conserved hypothetical protein, putative type III secretion apparatus protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 69 25 93 110 120 97.0 2e-26 MIDLNDRVLNLDNPDDKMISAFANYAVQTENWQQNALQALRSDKEGLTPEKLLVLQDHVL NYNVEVSLVEH >gi|296493159|gb|ADTK01000342.1| GENE 55 55583 - 56317 32 244 aa, chain + ## HITS:1 COG:ECs3716 KEGG:ns NR:ns ## COG: ECs3716 COG4669 # Protein_GI_number: 15832970 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type III secretory pathway, lipoprotein EscJ # Organism: Escherichia coli O157:H7 # 1 244 1 244 244 450 98.0 1e-126 MKYISLLLFILLLCGCKQQELLNHLDQQQANDVLAVLQRHNINAEKKDQGKTGFSIFVEP TDFASAVDWLKIYNLPGKPDIQISQMFPADALVSSPRAEKARLYSAIEQRLEQSLKIMDG IISSRVHVSYDVDNGDSGKTALPIHISVLAVYEKDINPEIKINDIKRFIVNSSASVQYEN ISVVLSKRRDIIEQAPTYEISEPVFAYDKAMPVSILLALISVATCWLLWKYRAILTNLVR LKIK >gi|296493159|gb|ADTK01000342.1| GENE 56 56333 - 56914 -40 193 aa, chain + ## HITS:1 COG:no KEGG:ECO103_3419 NR:ns ## KEGG: ECO103_3419 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 193 1 193 193 379 100.0 1e-104 MNLALRKIIYDPISYIHPQRVSLNNTPINNPVLRSITNEMIVLQYNLSVEHFNLNSSLIY YINNWNLFPLFCLFSGYHFYRERFAERGFFYKVPAVLRDYLSAIPVKINEKARYKPGIAS YHNIITCGFSTLSPYIRQQPLAMQQRFNLLFPDFVDHIQLPLPLASTLLERITFYAKKNR DELDKISCKWCCD >gi|296493159|gb|ADTK01000342.1| GENE 57 57131 - 57562 154 143 aa, chain + ## HITS:1 COG:no KEGG:EC55989_3135 NR:ns ## KEGG: EC55989_3135 # Name: ygeM # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 143 1 143 143 271 100.0 5e-72 
MINDLKSILLKSSEEVDVFIKIFESWRKKLPAISGPVNLYIPTRFKDKYLEVESYFVDKS IWNVHITFHDDKRFVFFTDQFIAEFSPQEFVDNCEQYLINNHCFSPDKVNEICEQARHYL VEKMFETHSLDMNNSVLASPEDL >gi|296493159|gb|ADTK01000342.1| GENE 58 57783 - 57938 115 51 aa, chain + ## HITS:1 COG:ECs3712 KEGG:ns NR:ns ## COG: ECs3712 COG2197 # Protein_GI_number: 15832966 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 51 1 51 210 104 96.0 3e-23 MGKIKIVVSDQQPFMIDGIIGFLGHYPDLYEVVGGYKDLKKSIAECNKSTA >gi|296493159|gb|ADTK01000342.1| GENE 59 57972 - 58415 154 147 aa, chain + ## HITS:1 COG:ECs3712 KEGG:ns NR:ns ## COG: ECs3712 COG2197 # Protein_GI_number: 15832966 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 147 64 210 210 295 99.0 2e-80 MGSELVKWVKSHKIDAHIITFVAKMPYIDSIKLLEAGAKGCVWKTSHPAKLNRAIDSISN GYTYFDSVHMDCEKISSRYSSDNQLTNRESEILQLIADGKTNKEIANFLQLSRKTVETHR LNIMKKLDVHSGIELIKTALRMGVCTI >gi|296493159|gb|ADTK01000342.1| GENE 60 58460 - 58696 94 78 aa, chain - ## HITS:1 COG:pbl KEGG:ns NR:ns ## COG: pbl COG0741 # Protein_GI_number: 16130758 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Escherichia coli K12 # 1 78 61 138 138 167 100.0 4e-42 MHIPLLKKRGIIKDERDLLDNPCLNIKIGTEILYNHFSRCGVTWQCLGTYNAGFAMDNQK KRQQYAPKYILYIPGLMN >gi|296493159|gb|ADTK01000342.1| GENE 61 58734 - 58952 122 72 aa, chain + ## HITS:1 COG:no KEGG:ECSE_3112 NR:ns ## KEGG: ECSE_3112 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 72 1 72 72 97 100.0 1e-19 MLLFLSLFLPMLSFFRLLSTRAIALIKFGSILNIVPASLEQELLAKDIFVINIQNVKMVK NLTFGERIFFHV >gi|296493159|gb|ADTK01000342.1| GENE 62 59019 - 
59237 150 72 aa, chain - ## HITS:1 COG:no KEGG:JW5456 NR:ns ## KEGG: JW5456 # Name: ygeI # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 72 1 72 72 120 100.0 2e-26 MTNPIGINNLSQSSNIANATGDEVVSLDKHINTSATDTDQIQAFIVSTWMAPFQNDMYSE DNPISPYYKIEW >gi|296493159|gb|ADTK01000342.1| GENE 63 59405 - 60781 185 458 aa, chain - ## HITS:1 COG:ygeH_2 KEGG:ns NR:ns ## COG: ygeH_2 COG0457 # Protein_GI_number: 16130756 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Escherichia coli K12 # 114 458 1 345 345 676 100.0 0 MDLENKFSYHFLEGLMLTEDGILTQGNEQVYIPQKELGVLIVLLESAGHVVLKDMIIESV WKNIIVSDESLTRCIYSLRCIFEKIGYDRCIETIYRKGYRFSGQVFKTKINEDNTSDYSI AIFPFTTSLNTLDPLILNQELVQIISNKKIDGLYTYPMAATNFCNDHISQNSFLSRFKPD YFVTGRINQNNAVNTLYIELIDAKNLFLIASNHLPVDELHNTSQFIIDNILQTVHKPERS VRLAKQDQGYKNHYLSDEMLAGKKELYDFTPESIYRAMTIFDRLQNKSDIQTLKTECYCL LAECHMSLALHGKSELELAAQKALELLDYVSDITTVDGKILAIMGLITGLSGQAKVSHIL FEQAKIHSTDIASLYYYRALVHFHNEKIEEARICIDKSLQLEPRRRKAVVIKECVDMYVP NPLKNNIKLYYKETESESHRVIIDNILKLKQLTRICMR >gi|296493159|gb|ADTK01000342.1| GENE 64 61116 - 61607 264 163 aa, chain - ## HITS:1 COG:ygeG KEGG:ns NR:ns ## COG: ygeG COG0457 # Protein_GI_number: 16130755 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Escherichia coli K12 # 1 163 1 163 163 291 98.0 4e-79 MSTETIEIFNNSDEWANQLKHALSKGENLALLHGLTPDILDRIYAYAFDYHEKGNITDAE IYYKFLCIYAFENHEYLKDFASVCQPKKKYQQEYDLYKLSYNYSPYDDYSVIYRMGQCQI GAKNIDNAMQCFYHIINNCEDDSVKSKAQAYIELLNDNSEDNG >gi|296493159|gb|ADTK01000342.1| GENE 65 62516 - 62941 138 141 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_3178 NR:ns ## KEGG: ECUMN_3178 # Name: yqeK # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 141 1 141 141 222 97.0 3e-57 MDIEFSQIHEMVYMHDIVNSDSKKKPRIPLKKFLNAENVLTQTTSWTLKSRYVNVNSVNK VNVKSKVKNSYISRSVNDEFSLTDDEINSFKETLVLSSIDSLSKLVLNNPLSVLFTSTVR RNNNRAKMNVEFDSWICTRCC >gi|296493159|gb|ADTK01000342.1| GENE 66 63090 - 63572 85 160 aa, chain - ## HITS:1 COG:no 
KEGG:JW5455 NR:ns ## KEGG: JW5455 # Name: yqeJ # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 160 1 160 160 315 100.0 2e-85 MIDYKKNLLFILVFISGFILFTVYSYTAEKMIYNETCTANWVIFNDQGRANLTIDFMYNK KNKTGTVALSGTWQQGNRESKSIRRNIEYTWIENYDTAHLTSKKVNKFEIMDQVDDDRLA QLIPDFYVFPEKSVSYNILKQGKHAFILSIGNRAIMHCAR >gi|296493159|gb|ADTK01000342.1| GENE 67 63565 - 63900 171 111 aa, chain - ## HITS:1 COG:yqeI KEGG:ns NR:ns ## COG: yqeI COG3710 # Protein_GI_number: 16130751 # Func_class: K Transcription # Function: DNA-binding winged-HTH domains # Organism: Escherichia coli K12 # 1 111 159 269 269 237 98.0 4e-63 MLSAFVIGAYSAYWLWNNNQPKPFFKDYKTVAEINGCHFNVTDDTIDGLKEFDKYKTRIL DSGINCKKHPWLYFPLAKSSPGMIVMACNKNYNQHEVADCLTLSYREVNRD >gi|296493159|gb|ADTK01000342.1| GENE 68 64708 - 65226 140 172 aa, chain - ## HITS:1 COG:yqeH KEGG:ns NR:ns ## COG: yqeH COG2771 # Protein_GI_number: 16130750 # Func_class: K Transcription # Function: DNA-binding HTH domain-containing proteins # Organism: Escherichia coli K12 # 1 172 59 230 230 342 98.0 1e-94 MWPEESSYFNRGVVEGILTKNHNARLSGYIFVDFSVSFLRLFLEKDWIDYLASTDMGIVL VSDRNMQSLANYWRKHNSAISAVIYNDDGLDVANEKIRQLFIGRYLSFTRGNTLTQMEFT IMGYMVSGYNPYQIAEVLDMDIRSIYAYKQRIEKRMGDKINELFIRSHSVQH >gi|296493159|gb|ADTK01000342.1| GENE 69 65800 - 67029 537 409 aa, chain - ## HITS:1 COG:ECs3702 KEGG:ns NR:ns ## COG: ECs3702 COG0814 # Protein_GI_number: 15832956 # Func_class: E Amino acid transport and metabolism # Function: Amino acid permeases # Organism: Escherichia coli O157:H7 # 1 409 1 409 409 707 100.0 0 MSNIWSKEETLWSFALYGTAVGAGTLFLPIQLGSAGAVVLFITALVAWPLTYWPHKALCQ FILSSKTSAGEGITGAVTHYYGKKIGNLITTLYFIAFFVVVLIYAVAITNSLTEQLAKHM VIDLRIRMLVSLGVVLILNLIFLMGRHATIRVMGFLVFPLIAYFLFLSIYLVGSWQPDLL TTQVEFNQNTLHQIWISIPVMVFAFSHTPIISTFAIDRREKYGEHAMDKCKKIMKVAYLI ICISVLFFVFSCLLSIPPSYIEAAKEEGVTILSALSMLPNAPAWLSISGIIVAVVAMSKS FLGTYFGVIEGATEVVKTTLQQVGVKKSRAFNRALSIMLVSLITFIVCCINPNAISMIYA ISGPLIAMILFIMPTLSTYLIPALKPWRSIGNLITLIVGILCVSVMFFS >gi|296493159|gb|ADTK01000342.1| GENE 70 67284 - 
68465 968 393 aa, chain + ## HITS:1 COG:yqeF KEGG:ns NR:ns ## COG: yqeF COG0183 # Protein_GI_number: 16130748 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA acetyltransferase # Organism: Escherichia coli K12 # 1 393 2 394 394 718 99.0 0 MKDVVIVGALRTPIGCFRGALAGHSAVELGSLVVKALIERTGVPAYAVDEVILGQVLTAG AGQNPARQSAIKGGLPNSVSAITINDVCGSGLKALHLATQAIQCGEADIVIAGGQENMSR APHVLTDSRTGAQLGNSQLVDSLVHDGLWDAFNDYHIGVTAENLAREYGISRQLQDAYAL SSQQKARAAIDAGRFKDEIVPVMTQSNGQTLVVDTDEQPRTDASAEGLARLNPSFDSLGS VTAGNASSINDGAAAVMMMSEAKARALNLPVLARIRAFASVGVDPALMGIAPVYATRRCL ERVGWQLADVDLIEANEAFAAQALSVGKMLEWDERRVNVNGGAIALGHPIGASGCRILVS LVHEMVKRNARKGLATLCIGGGQGVALTIERDE >gi|296493159|gb|ADTK01000342.1| GENE 71 68860 - 69588 697 242 aa, chain + ## HITS:1 COG:kduI KEGG:ns NR:ns ## COG: kduI COG3717 # Protein_GI_number: 16130747 # Func_class: G Carbohydrate transport and metabolism # Function: 5-keto 4-deoxyuronate isomerase # Organism: Escherichia coli K12 # 1 242 37 278 278 512 100.0 1e-145 MVYSHIDRIIVGGIMPITKTVSVGGEVGKQLGVSYFLERRELGVINIGGAGTITVDGQCY EIGHRDALYVGKGAKEVVFASIDTGTPAKFYYNCAPAHTTYPTKKVTPDEVSPVTLGDNL TSNRRTINKYFVPDVLETCQLSMGLTELAPGNLWNTMPCHTHERRMEVYFYFNMDDDACV FHMMGQPQETRHIVMHNEQAVISPSWSIHSGVGTKAYTFIWGMVGENQVFDDMDHVAVKD LR >gi|296493159|gb|ADTK01000342.1| GENE 72 69618 - 70379 749 253 aa, chain + ## HITS:1 COG:ECs3699 KEGG:ns NR:ns ## COG: ECs3699 COG1028 # Protein_GI_number: 15832953 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Escherichia coli O157:H7 # 1 253 1 253 253 492 99.0 1e-139 MILSAFSLEGKVAVVTGCDTGLGQGMALGLAQAGCDIVGINIVEPTETIEQVTALGRRFL SLTADLRKIDGIPGLLDRAVAEFGHIDILVNNAGLIRREDALEFSEKDWDDVMNLNIKSV FFMSQAAAKHFIAQGNGGKIINIASMLSFQGGIRVPSYTASKSGVMGVTRLMANEWAKHN INVNAIAPGYMATNNTQQLRADEQRSAEILDRIPAGRWGLPSDLMGPVVFLASSASDYVN GYTIAVDGGWLAR Prediction of 
potential genes in microbial genomes Time: Mon May 16 16:07:18 2011 Seq name: gi|296493158|gb|ADTK01000343.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.10, whole genome shotgun sequence Length of sequence - 1720 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 107 - 166 7.8 1 1 Tu 1 . + CDS 348 - 1706 1280 ## COG0477 Permeases of the major facilitator superfamily Predicted protein(s) >gi|296493158|gb|ADTK01000343.1| GENE 1 348 - 1706 1280 452 aa, chain + ## HITS:1 COG:ECs3698 KEGG:ns NR:ns ## COG: ECs3698 COG0477 # Protein_GI_number: 15832952 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 452 21 472 472 817 100.0 0 MNMFVSVAAAVAGLLFGLDIGVIAGALPFITDHFVLTSRLQEWVVSSMMLGAAIGALFNG WLSFRLGRKYSLMAGAILFVLGSIGSAFATSVEMLIAARVVLGIAVGIASYTAPLYLSEM ASENVRGKMISMYQLMVTLGIVLAFLSDTAFSYSGNWRAMLGVLALPAVLLIILVVFLPN SPRWLAEKGRHIEAEEVLRMLRDTSEKAREELNEIRESLKLKQGGWALFKINRNVRRAVF LGMLLQAMQQFTGMNIIMYYAPRIFKMAGFTTTEQQMIATLVVGLTFMFATFIAVFTVDK AGRKPALKIGFSVMALGTLVLGYCLMQFDNGTASSGLSWLSVGMTMMCIAGYAMSAAPVV WILCSEIQPLKCRDFGITCSTTTNWVSNMIIGATFLTLLDSIGAAGTFWLYTALNIAFVG ITFWLIPETKNVTLEHIERKLMAGEKLRNIGV Prediction of potential genes in microbial genomes Time: Mon May 16 16:07:37 2011 Seq name: gi|296493157|gb|ADTK01000344.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.11, whole genome shotgun sequence Length of sequence - 62562 bp Number of predicted genes - 52, with homology - 52 Number of transcription units - 30, operones - 15 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 55 - 747 744 ## COG1794 Aspartate racemase + Term 964 - 1012 -1.0 2 2 Tu 1 . 
- CDS 734 - 1669 723 ## COG0583 Transcriptional regulator - Prom 1753 - 1812 6.7 + Prom 1699 - 1758 9.1 3 3 Tu 1 . + CDS 1791 - 3053 1213 ## COG0019 Diaminopimelate decarboxylase 4 4 Tu 1 . - CDS 3060 - 4091 960 ## COG1609 Transcriptional regulators + Prom 4582 - 4641 3.2 5 5 Op 1 5/0.364 + CDS 4677 - 6836 1943 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II 6 5 Op 2 . + CDS 6829 - 8022 1322 ## COG0477 Permeases of the major facilitator superfamily + Term 8026 - 8063 8.0 - Term 8014 - 8051 8.0 7 6 Tu 1 . - CDS 8054 - 9094 1088 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) - Prom 9120 - 9179 7.3 - Term 9160 - 9197 5.1 8 7 Tu 1 . - CDS 9202 - 9420 267 ## G2583_3488 hypothetical protein - Prom 9453 - 9512 3.5 9 8 Op 1 5/0.364 - CDS 9558 - 10271 795 ## COG0861 Membrane protein TerC, possibly involved in tellurium resistance 10 8 Op 2 . - CDS 10340 - 11029 647 ## COG3066 DNA mismatch repair protein - Prom 11075 - 11134 4.8 + Prom 11068 - 11127 3.5 11 9 Tu 1 . + CDS 11215 - 11361 67 ## ECO103_3390 hypothetical protein + Prom 11415 - 11474 5.8 12 10 Op 1 7/0.000 + CDS 11714 - 12244 261 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 13 10 Op 2 5/0.364 + CDS 12257 - 14503 2099 ## COG3605 Signal transduction protein containing GAF and PtsI domains 14 11 Op 1 11/0.000 + CDS 14654 - 15529 1090 ## COG0682 Prolipoprotein diacylglyceryltransferase 15 11 Op 2 4/0.727 + CDS 15536 - 16330 835 ## COG0207 Thymidylate synthase + Term 16349 - 16386 5.6 + Prom 16346 - 16405 3.2 16 12 Op 1 12/0.000 + CDS 16515 - 16985 245 ## COG2165 Type II secretory pathway, pseudopilin PulG 17 12 Op 2 . + CDS 16976 - 17539 594 ## COG4795 Type II secretory pathway, component PulJ 18 12 Op 3 . 
+ CDS 17536 - 17943 272 ## ECO103_3383 hypothetical protein 19 12 Op 4 5/0.364 + CDS 17928 - 18251 98 ## COG4967 Tfp pilus assembly protein PilV 20 12 Op 5 5/0.364 + CDS 18264 - 21632 2916 ## COG1330 Exonuclease V gamma subunit + Prom 21711 - 21770 3.0 21 13 Op 1 5/0.364 + CDS 21808 - 24696 2872 ## COG1025 Secreted/periplasmic Zn-dependent peptidases, insulinase-like 22 13 Op 2 13/0.000 + CDS 24689 - 28231 2948 ## COG1074 ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) 23 13 Op 3 . + CDS 28231 - 30057 1334 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member + Term 30064 - 30120 5.6 - Term 30047 - 30114 6.0 24 14 Tu 1 . - CDS 30119 - 31450 1277 ## COG0548 Acetylglutamate kinase - Prom 31555 - 31614 5.6 + Prom 31473 - 31532 7.1 25 15 Tu 1 . + CDS 31682 - 32935 1228 ## COG0860 N-acetylmuramoyl-L-alanine amidase - TRNA 33010 - 33086 86.1 # Met CAT 0 0 - TRNA 33120 - 33196 86.1 # Met CAT 0 0 + Prom 33220 - 33279 4.4 26 16 Op 1 8/0.000 + CDS 33405 - 34502 1125 ## COG2821 Membrane-bound lytic murein transglycosylase + Term 34537 - 34577 4.1 + Prom 34609 - 34668 7.0 27 16 Op 2 . + CDS 34741 - 35547 856 ## COG1179 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 + Term 35555 - 35600 12.8 - Term 35540 - 35588 11.2 28 17 Op 1 7/0.000 - CDS 35598 - 36041 404 ## COG2166 SufE protein probably involved in Fe-S center assembly 29 17 Op 2 . - CDS 36041 - 37246 1030 ## COG0520 Selenocysteine lyase - Prom 37374 - 37433 5.2 + Prom 37359 - 37418 4.3 30 18 Tu 1 . + CDS 37438 - 37665 329 ## ECIAI1_2919 hypothetical protein + Prom 37885 - 37944 6.5 31 19 Op 1 6/0.182 + CDS 38016 - 38933 551 ## COG0583 Transcriptional regulator 32 19 Op 2 5/0.364 + CDS 38952 - 39347 457 ## COG2363 Uncharacterized small membrane protein 33 19 Op 3 . 
+ CDS 39340 - 40440 1111 ## COG2933 Predicted SAM-dependent methyltransferase - Term 40432 - 40472 1.2 34 20 Tu 1 4/0.727 - CDS 40484 - 41215 534 ## COG1349 Transcriptional regulators of sugar metabolism - Term 41228 - 41259 3.5 35 21 Op 1 5/0.364 - CDS 41273 - 41695 452 ## COG4154 Fucose dissimilation pathway protein FucU 36 21 Op 2 4/0.727 - CDS 41697 - 43115 978 ## COG1070 Sugar (pentulose and hexulose) kinases - Term 43122 - 43157 5.4 37 22 Op 1 3/0.909 - CDS 43224 - 44999 1732 ## COG2407 L-fucose isomerase and related proteins 38 22 Op 2 . - CDS 45032 - 46348 805 ## COG0738 Fucose permease - Prom 46464 - 46523 6.7 + Prom 46774 - 46833 2.4 39 23 Op 1 5/0.364 + CDS 46895 - 47542 668 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 40 23 Op 2 . + CDS 47570 - 48718 1225 ## COG1454 Alcohol dehydrogenase, class IV + Term 48850 - 48887 1.0 - Term 48724 - 48766 8.2 41 24 Tu 1 . - CDS 48773 - 49528 515 ## COG0258 5'-3' exonuclease (including N-terminal domain of PolI) - Prom 49549 - 49608 2.7 42 25 Op 1 10/0.000 - CDS 49640 - 51007 1351 ## COG1760 L-serine deaminase - Term 51025 - 51056 3.9 43 25 Op 2 4/0.727 - CDS 51065 - 52354 1414 ## COG0814 Amino acid permeases - Prom 52550 - 52609 6.8 - Term 52865 - 52900 4.0 44 26 Tu 1 6/0.182 - CDS 52911 - 54275 1340 ## COG1611 Predicted Rossmann fold nucleotide-binding protein - Prom 54311 - 54370 1.7 45 27 Tu 1 . - CDS 54387 - 55235 626 ## COG0780 Enzyme related to GTP cyclohydrolase I - Prom 55327 - 55386 3.6 + Prom 55210 - 55269 3.5 46 28 Tu 1 . 
+ CDS 55303 - 55848 484 ## ECO103_3336 SecY interacting protein Syd + Term 55931 - 55985 0.5 + Prom 56386 - 56445 6.7 47 29 Op 1 7/0.000 + CDS 56470 - 56799 313 ## COG3098 Uncharacterized protein conserved in bacteria 48 29 Op 2 5/0.364 + CDS 56799 - 57581 193 ## PROTEIN SUPPORTED gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family 49 29 Op 3 4/0.727 + CDS 57599 - 58048 581 ## COG0716 Flavodoxins + Term 58071 - 58105 -0.1 + Prom 58288 - 58347 5.7 50 30 Op 1 7/0.000 + CDS 58483 - 59835 1456 ## COG0477 Permeases of the major facilitator superfamily 51 30 Op 2 5/0.364 + CDS 59837 - 61177 1405 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 52 30 Op 3 . + CDS 61198 - 62538 1606 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily Predicted protein(s) >gi|296493157|gb|ADTK01000344.1| GENE 1 55 - 747 744 230 aa, chain + ## HITS:1 COG:ECs3697 KEGG:ns NR:ns ## COG: ECs3697 COG1794 # Protein_GI_number: 15832951 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Aspartate racemase # Organism: Escherichia coli O157:H7 # 1 230 1 230 230 456 99.0 1e-128 MKTIGLLGGMSWESTIPYYRLINEGIKQRLGGLHSAQVLLHSVDFHEIEECQRRGEWDKT GDILAEAALGLQRAGAEGIVLCTNTMHKVADAIESRCTLPFLHIADATGRAITGAGMTRV ALLGTRYTMEQDFYRGRLTEQFSINCLIPEADERAKINQIIFEELCLGQFTEVSRAYYAQ VIARLAEQGAQGVIFGCTEIGLLVPEERSVLPVFDTAAIHAEDAVAFMLS >gi|296493157|gb|ADTK01000344.1| GENE 2 734 - 1669 723 311 aa, chain - ## HITS:1 COG:lysR KEGG:ns NR:ns ## COG: lysR COG0583 # Protein_GI_number: 16130743 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 311 1 311 311 575 99.0 1e-164 MAAVNLRHIEIFHAVMTAGSLTEAAHLLHTSQPTVSRELARFEKVIGLKLFERVRGRLHP TVQGLRLFEEVQRSWYGLDRIVSAAESLREFRQGELSIACLPVFSQSFLPQLLQPFLARY PDVSLNIVPQESPLLEEWLSAQRHDLGLTETLHTPAGTERTELLSLDEVCVLPPGHPLAV KKVLTPDDFQGENYISLSRTDSYRQLLDQLFTEHQVKRRMIVETHSAASVCAMVRAGVGV SVVNPLTALDYAASGLVVRRFSIAVPFTVSLIRPLHRPSSALVQAFSEHLQAGLPKLVTS 
LDAILSSATTA >gi|296493157|gb|ADTK01000344.1| GENE 3 1791 - 3053 1213 420 aa, chain + ## HITS:1 COG:lysA KEGG:ns NR:ns ## COG: lysA COG0019 # Protein_GI_number: 16130742 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Escherichia coli K12 # 1 411 1 411 420 833 100.0 0 MPHSLFSTDTDLTAENLLRLPAEFGCPVWVYDAQIIRRQIAALKQFDVVRFAQKACSNIH ILRLMREQGVKVDSVSLGEIERALAAGYNPQTHPDDIVFTADVIDQATLERVSELQIPVN AGSVDMLDQLGQVSPGHRVWLRVNPGFGHGHSQKTNTGGENSKHGIWYTDLPAALDVIQR HHLQLVGIHMHIGSGVDYAHLEQVCGAMVRQVIEFGQDLQAISAGGGLSVPYQQGEEAVD TEHYYGLWNAAREQIARHLGHPVKLEIEPGRFLVAQSGVLITQVRSVKQMGSRHFVLVDA GFNDLMRPAMYGSYHHISALAADGRSLEHAPTVETVVAGPLCESGDVFTQQEGGNVETRA LPEVKAGDYLVLHDTGAYGASMSSNYNSRPLLPEVLFDNGQARLIRRRQTIEELLALELL >gi|296493157|gb|ADTK01000344.1| GENE 4 3060 - 4091 960 343 aa, chain - ## HITS:1 COG:galR KEGG:ns NR:ns ## COG: galR COG1609 # Protein_GI_number: 16130741 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 343 1 343 343 654 99.0 0 MATIKDVARLAGVSVATVSRVINNSPKASEASRLAVHNAMESLSYHPNANARALAQQTTE TVGLIVGDVSDPFFGAMVKAVEQVAYHTSNFLLIGNGYHNEQKERQAIEQLIRHRCAALV VHAKMIPDADLASLMKQMPGMVLINRILPGFENRCIALDDRYGAWLATRHLIQQGHTRIG YLCSNHSISDAEDRLQGYYDALAESGIAANDRLVTFGEPDESGGEQAMTELLGRGRNFTA VACYNDSMAAGAMGVLNDNGIDVPGEISLIGFDDVLVSRYVRPRLTTVRYPIVTMATQAA ELALALADNRPLPEITNVFSPTLVRRHSVSTPSLEASHHATSD >gi|296493157|gb|ADTK01000344.1| GENE 5 4677 - 6836 1943 719 aa, chain + ## HITS:1 COG:ECs3693_2 KEGG:ns NR:ns ## COG: ECs3693_2 COG0318 # Protein_GI_number: 15832947 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Escherichia coli O157:H7 # 196 719 1 524 524 1063 99.0 0 MLFSFFRNLCRVLYRVRVTGDTQALKGERVLITPNHVSFIDGILLGLFLPVRPVFAVYTS ISQQWYMRWLKSFIDFVPLDPTQPMAIKHLVRLVEQGRPVVIFPEGRITTTGSLMKIYDG AGFVAAKSGATVIPVRIEGAELTHFSRLKGLVKRRLFPQITLHILPPTQVEMPDAPRARD 
RRKIAGEMLHQIMMEARMAVRPRETLYESLLSAMYRFGAGKKCVEDVNFTPDSYRKLLTK TLFVGRILEKYSVEGERIGLMLPNAGISAAVIFGAIARRRIPAMMNYTAGVKGLTSAITA AEIKTIFTSRQFLDKGKLWHLPEQLTQVRWVYLEDLKADVTTADKVWIFAHLLMPRLAQV KQQPEEEALILFTSGSEGHPKGVVHSHKSILANVEQIKTIADFTTNDRFMSALPLFHSFG LTVGLFTPLLTGAEVFLYPSPLHYRIVPELVYDRSCTVLFGTSTFLGHYARFANPYDFYR LRYVVAGAEKLQESTKQLWQDKFGLRILEGYGVTECAPVVSINVPMAAKPGTVGRILPGM DARLLSVPGIEEGGRLQLKGPNIMNGYLRVEKPGVLEVPTAENVRGEMERGWYDTGDIVR FDELGFVQIQGRAKRFAKIAGEMVSLEMVEQLALGVSPDKVHATAIKSDASKGEALVLFT TDNELTRDKLQQYAREHGVPELAVPRDIRYLKQMPLLGSGKPDFVTLKSWVDEAEQHDE >gi|296493157|gb|ADTK01000344.1| GENE 6 6829 - 8022 1322 397 aa, chain + ## HITS:1 COG:ygeD KEGG:ns NR:ns ## COG: ygeD COG0477 # Protein_GI_number: 16130739 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 397 1 397 397 642 100.0 0 MSESVHTNTSLWSKGMKAVIVAQFLSAFGDNALLFATLALLKAQFYPEWSQPILQMVFVG AYILFAPFVGQVADSFAKGRVMMFANGLKLLGAASICFGINPFLGYTLVGVGAAAYSPAK YGILGELTTGSKLVKANGLMEASTIAAILLGSVAGGVLADWHVLVALAACALAYGGAVVA NIYIPKLAAARPGQSWNLINMTRSFLNACTSLWRNGETRFSLVGTSLFWGAGVTLRFLLV LWVPVALGITDNATPTYLNAMVAIGIVVGAGAAAKLVTLETVSRCMPAGILIGVVVLIFS LQHELLPAYALLMLIGVMGGFFVVPLNALLQERGKKSVGAGNAIAVQNLGENSAMLLMLG IYSLAVMIGIPVVPIGIGFGALFALAITALWIWQRRH >gi|296493157|gb|ADTK01000344.1| GENE 7 8054 - 9094 1088 346 aa, chain - ## HITS:1 COG:tas KEGG:ns NR:ns ## COG: tas COG0667 # Protein_GI_number: 16130738 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Escherichia coli K12 # 1 346 1 346 346 699 100.0 0 MQYHRIPHSSLEVSTLGLGTMTFGEQNSEADAHAQLDYAVAQGINLIDVAEMYPVPPRPE TQGLTETYVGNWLAKHGSREKLIIASKVSGPSRNNDKGIRPDQALDRKNIREALHDSLKR LQTDYLDLYQVHWPQRPTNCFGKLGYSWTDSAPAVSLLDTLDALAEYQRAGKIRYIGVSN ETAFGVMRYLHLADKHDLPRIVTIQNPYSLLNRSFEVGLAEVSQYEGVELLAYSCLGFGT 
LTGKYLNGAKPAGARNTLFSRFTRYSGEQTQKAVAAYVDIARRHGLDPAQMALAFVRRQP FVASTLLGATTMDQLKTNIESLHLELSEDVLAEIEAVHQVYTYPAP >gi|296493157|gb|ADTK01000344.1| GENE 8 9202 - 9420 267 72 aa, chain - ## HITS:1 COG:no KEGG:G2583_3488 NR:ns ## KEGG: G2583_3488 # Name: ygdR # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 72 1 72 72 132 100.0 4e-30 MKKWAVIISAVGLAFAVSGCSSDYVMATKDGRMILTDGKPEIDDDTGLVSYHDQQGNAMQ INRDDVSQIIER >gi|296493157|gb|ADTK01000344.1| GENE 9 9558 - 10271 795 237 aa, chain - ## HITS:1 COG:ECs3689 KEGG:ns NR:ns ## COG: ECs3689 COG0861 # Protein_GI_number: 15832943 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane protein TerC, possibly involved in tellurium resistance # Organism: Escherichia coli O157:H7 # 1 237 1 237 237 381 100.0 1e-106 MLFAWITDPNAWLALGTLTLLEIVLGIDNIIFLSLVVAKLPTAQRAHARRLGLAGAMVMR LALLASIAWVTRLTNPLFTIFSQEISARDLILLLGGLFLIWKASKEIHESIEGEEEGLKT RVSSFLGAIVQIMLLDIIFSLDSVITAVGLSDHLFIMMAAVVIAVGVMMFAARSIGDFVE RHPSVKMLALSFLILVGFTLILESFDIHVPKGYIYFAMFFSIAVESLNLIRNKKNPL >gi|296493157|gb|ADTK01000344.1| GENE 10 10340 - 11029 647 229 aa, chain - ## HITS:1 COG:ECs3688 KEGG:ns NR:ns ## COG: ECs3688 COG3066 # Protein_GI_number: 15832942 # Func_class: L Replication, recombination and repair # Function: DNA mismatch repair protein # Organism: Escherichia coli O157:H7 # 1 229 1 229 229 432 100.0 1e-121 MSQPRPLLSPPETEEQLLAQAQQLSGYTLGELAALAGLVTPENLKRDKGWIGVLLEIWLG ASAGSKPEQDFAALGVELKTIPVDSLGRPLETTFVCVAPLTGNSGVTWETSHVRHKLKRV LWIPVEGERSIPLAQRRVGSPLLWSPNEEEDRQLREDWEELMDMIVLGQVERITARHGEY LQIRPKAANAKALTEAIGARGERILTLPRGFYLKKNFTSALLARHFLIQ >gi|296493157|gb|ADTK01000344.1| GENE 11 11215 - 11361 67 48 aa, chain + ## HITS:1 COG:no KEGG:ECO103_3390 NR:ns ## KEGG: ECO103_3390 # Name: ygdT # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 48 1 48 48 95 100.0 5e-19 MLSTESWDNCEKPPLLFPFTALTCDETPVFSGSVLNLVAHSVDKYGIG >gi|296493157|gb|ADTK01000344.1| GENE 12 11714 - 12244 261 176 aa, chain + ## HITS:1 
COG:ECs3687 KEGG:ns NR:ns ## COG: ECs3687 COG0494 # Protein_GI_number: 15832941 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli O157:H7 # 1 176 1 176 176 340 100.0 1e-93 MIDDDGYRPNVGIVICNRQGQVMWARRFGQHSWQFPQGGINPGESAEQAMYRELFEEVGL SRKDVRILASTRNWLRYKLPKRLVRWDTKPVCIGQKQKWFLLQLVSGDAEINMQTSSTPE FDGWRWVSYWYPVRQVVSFKRDVYRRVMKEFASVVMSLQENTPKPQNASAYRRKRG >gi|296493157|gb|ADTK01000344.1| GENE 13 12257 - 14503 2099 748 aa, chain + ## HITS:1 COG:ECs3686 KEGG:ns NR:ns ## COG: ECs3686 COG3605 # Protein_GI_number: 15832940 # Func_class: T Signal transduction mechanisms # Function: Signal transduction protein containing GAF and PtsI domains # Organism: Escherichia coli O157:H7 # 1 748 1 748 748 1434 99.0 0 MLTRLREIVEKVASAPRLNEALNILVTDICLAMDTEVCSVYLADHDRRCYYLMATRGLKK PRGRTVTLAFDEGIVGLVGRLAEPINLADAQKHPSFKYIPSVKEERFRAFLGVPIIQRRQ LLGVLVVQQRELRQYDESEESFLVTLATQMAAILSQSQLTALFGQYRQTRIRALPAAPGV AIAEGWQDATLPLMEQVYQASTLDPALERERLTGALEEAANEFRRYSKRFAAGAQKETAA IFDLYSHLLSDTRLRRELFAEVDKGSVAEWAVKTVIEKFAEQFAALSDNYLKERAGDLRA LGQRLLFHLDDANQGPNAWPERFILVADELSATTLAELPQDRLVGVVVRDGAANSHAAIM VRALGIPTVMGADIQPSVLHRRTLIVDGYRGELLVDPEPVLLQEYQRLISEEIELSRLAE DDVNLPAQLKSGERIKVMLNAGLSPEHEEKLGSRIDGIGLYRTEIPFMLQSGFPSEEEQV AQYQGMLQMFNDKPVTLRTLDVGADKQLPYMPISEENPCLGWRGIRITLDQPEIFLIQVR AMLRANAATGNLNILLPMVTSLDEVDEARRLIERAGREVEEMIGYEIPKPRIGIMLEVPS MVFMLPHLAKRVDFISVGTNDLTQYILAVDRNNTRVANIYDSLHPAMLRALAMIAREAEI HGIDLRLCGEMAGDPMCVAILIGLGYRHLSMNGRSVARVKYLLRRIDYADAENLAQRSLE AQLATEVRHQVAAFMERRGMGGLIRGGL >gi|296493157|gb|ADTK01000344.1| GENE 14 14654 - 15529 1090 291 aa, chain + ## HITS:1 COG:ECs3685 KEGG:ns NR:ns ## COG: ECs3685 COG0682 # Protein_GI_number: 15832939 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Escherichia coli O157:H7 # 1 291 1 291 291 548 100.0 1e-156 
MTSSYLHFPEFDPVIFSIGPVALHWYGLMYLVGFIFAMWLATRRANRPGSGWTKNEVENL LYAGFLGVFLGGRIGYVLFYNFPQFMADPLYLFRVWDGGMSFHGGLIGVIVVMIIFARRT KRSFFQVSDFIAPLIPFGLGAGRLGNFINGELWGRVDPNFPFAMLFPGSRTEDILLLQTN PQWQSIFDTYGVLPRHPSQLYELLLEGVVLFIILNLYIRKPRPMGAVSGLFLIGYGAFRI IVEFFRQPDAQFTGAWVQYISMGQILSIPMIVAGVIMMVWAYRRSPQQHVS >gi|296493157|gb|ADTK01000344.1| GENE 15 15536 - 16330 835 264 aa, chain + ## HITS:1 COG:ECs3684 KEGG:ns NR:ns ## COG: ECs3684 COG0207 # Protein_GI_number: 15832938 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate synthase # Organism: Escherichia coli O157:H7 # 1 264 1 264 264 567 100.0 1e-162 MKQYLELMQKVLDEGTQKNDRTGTGTLSIFGHQMRFNLQDGFPLVTTKRCHLRSIIHELL WFLQGDTNIAYLHENNVTIWDEWADENGDLGPVYGKQWRAWPTPDGRHIDQITTVLNQLK NDPDSRRIIVSAWNVGELDKMALAPCHAFFQFYVADGKLSCQLYQRSCDVFLGLPFNIAS YALLVHMMAQQCDLEVGDFVWTGGDTHLYSNHMDQTHLQLSREPRPLPKLIIKRKPESIF DYRFEDFEIEGYDPHPGIKAPVAI >gi|296493157|gb|ADTK01000344.1| GENE 16 16515 - 16985 245 156 aa, chain + ## HITS:1 COG:ppdA KEGG:ns NR:ns ## COG: ppdA COG2165 # Protein_GI_number: 16130730 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, pseudopilin PulG # Organism: Escherichia coli K12 # 1 156 1 156 156 293 100.0 8e-80 MKTQRGYTLIETLVAMLILVMLSASGLYGWQYWQQSQRLWQTASQARDYLLYLREDANWH NRDHSISVIREGTLWCLVSSAAGANTCHGSSPLVFVPRWPEVEMSDLTPSLAFFGLRNTA WAGHIRFKNSTGEWWLVVSPWGRLRLCQQGETEGCL >gi|296493157|gb|ADTK01000344.1| GENE 17 16976 - 17539 594 187 aa, chain + ## HITS:1 COG:ppdB KEGG:ns NR:ns ## COG: ppdB COG4795 # Protein_GI_number: 16130729 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulJ # Organism: Escherichia coli K12 # 1 187 1 187 187 376 98.0 1e-104 MPVKEQGFSLLEVLIAMAISSVLLLGAARFLPALQRESLTSTRKLALEDEIWLRVFTVAK HLQRAGYCHGSCTGEGLEIVGQGDCIIVQWDANSNGIWDREPVKESDQIGFRLKEHVLET LRGATSCEGKGWDKVTNPDAIIIDTFQVVRQDVSGFSPVLTVNMRAASKSEPQTVVDASY SVTGFNL 
>gi|296493157|gb|ADTK01000344.1| GENE 18 17536 - 17943 272 135 aa, chain + ## HITS:1 COG:no KEGG:ECO103_3383 NR:ns ## KEGG: ECO103_3383 # Name: ygdB # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 135 1 135 135 223 100.0 2e-57 MNREKGVSSLALVLMLLVLGSLLLQGMSQQDRSFASRVSMESQSLRRQAIVQSALAWGKM HSWQTQPAVQCSQYAGTDAQVCLRLLADNEALLIAGYEGVSLWRTGEVIDGNIVFSPRGW SDFCPLKERALCQLP >gi|296493157|gb|ADTK01000344.1| GENE 19 17928 - 18251 98 107 aa, chain + ## HITS:1 COG:ppdC KEGG:ns NR:ns ## COG: ppdC COG4967 # Protein_GI_number: 16130727 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Tfp pilus assembly protein PilV # Organism: Escherichia coli K12 # 1 107 1 107 107 179 98.0 2e-45 MSASLKNQQGFSLPEVMVAMVLMVMIVTALSGIQRTLMNSLASRNQYQQLWRHGWQQTQL RAISPPANWQVNRMQTSQAGCVSISVTLVSPGGREGEMTRLHCPNRQ >gi|296493157|gb|ADTK01000344.1| GENE 20 18264 - 21632 2916 1122 aa, chain + ## HITS:1 COG:recC KEGG:ns NR:ns ## COG: recC COG1330 # Protein_GI_number: 16130726 # Func_class: L Replication, recombination and repair # Function: Exonuclease V gamma subunit # Organism: Escherichia coli K12 # 1 1122 1 1122 1122 2224 99.0 0 MLRVYHSNRLDVLEALMEFIVERERLDDPFEPEMILVQSTGMAQWLQMTLSQKFGIAANI DFPLPASFIWDMFVRVLPEIPKESAFNKQSMSWKLMTLLPQLLEREDFTLLRHYLTDDSD KRKLFQLSSKAADLFDQYLVYRPDWLAQWETGHLVEGLGEAQAWQAPLWKALVEYTHQLG QPRWHRANLYQRFIETLESATTCPPGLPSRVFICGISALPPVYLQALQALGKHIEIHLLF TNPCRYYWGDIKDPAYLAKLLTRQRRHSFEDRELPLFRDSENAGQLFNSDGEQDVGNPLL ASWGKLGRDYIYLLSDLESSQELDAFVDVTPDNLLHNIQSDILELENRAVAGVNIEEFSR SDNKRPLDPLDSSITFHVCHSPQREVEVLHDRLLAMLEEDPTLTPRDIIVMVADIDSYSP FIQAVFGSAPADRYLPYAISDRRARQSHPVLEAFISLLSLPDSRFVSEDVLALLDVPVLA ARFDITEEGLRYLRQWVNESGIRWGIDDDNVRELELPATGQHTWRFGLTRMLLGYAMESA QGEWQSVLPYDESSGLIAELVGHLASLLMQLNIWRRGLAQERPLEEWLPVCRDMLNAFFL PDAETEAAMTLIEQQWQAIIAEGLGAQYGDAVPLSLLRDELAQRLDQERISQRFLAGPVN ICTLMPMRSIPFKVVCLLGMNDGVYPRQLAPLGFDLMSQKPKRGDRSRRDDDRYLFLEAL ISAQQKLYISYIGRSIQDNSERFPSVLVQELIDYIGQSHYLPGDEALNCDESEARVKSHL 
TCLHTRMPFDPQNYQPGERQSYAREWLPAASQAGKAHSEFVQPLPFTLPETVPLETLQRF WAHPVRAFFQMRLQVNFRTEDSEIPDTEPFILEGLSRYQINQQLLNALVEQDDAERLFRR FRAAGDLPYGAFGEIFWETQCQEMQQLADRVIACRQPGQSMEIDLACNGVQITGWLPQVQ PDGLLRWRPSLLSVAQGMQLWLEHLVYCASGGNGESRLFLRKDGEWRFPPLAAEQALHYL SQLIEGYREGMSAPLLVLPESGGAWLKTCYDAQNDAMLDDDSTLQKARTKFLQAYEGNMM VRGEGDDIWYQRLWRQLTPETMEAIVEQSQRFLLPLFRFNQS >gi|296493157|gb|ADTK01000344.1| GENE 21 21808 - 24696 2872 962 aa, chain + ## HITS:1 COG:ECs3678 KEGG:ns NR:ns ## COG: ECs3678 COG1025 # Protein_GI_number: 15832932 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Secreted/periplasmic Zn-dependent peptidases, insulinase-like # Organism: Escherichia coli O157:H7 # 1 962 1 962 962 1854 99.0 0 MPRSTWFKALLLFVALWAPLSQAETGWQPIQETIRKSDKDNRQYQAIRLDNGMVVLLVSD PQAVKSLSALVVPVGSLEDPEAYQGLAHYLEHMSLMGSKKYPQADSLAEYLKMHGGSHNA STAPYRTAFYLEVENDALPGAVDRLADAIAEPLLDKKYAERERNAVNAELTMARTRDGMR MAQVSAETINPAHPGSKFSGGNLETLSDKPGNPVQQALKDFHEKYYSANLMKAVIYSNKP LPELAKMAADTFGRVPNKESKKPEITVPVVTDAQKGIIIHYVPALPRKVLRVEFRIDNNS AKFRSKTDELITYLIGNRSPGTLSDWLQKQGLVEGISANSDPIVNGNSGVLAISASLTDK GLANRDQVVAAIFSYLNLLREKGIDKQYFDELANVLDIDFRYPSITRDMDYVEWLADTMI RVPVEHTLDAVNIADRYDAKAVKERLAMMTPQNARIWYISPKEPHNKTAYFVDAPYQVDK ISEQTFADWQQKAANIALSLPELNPYIPDDFSLIKSEKKYDHPELIVDESNLRVVYAPSR YFSSEPKADVSLILRNPKAMDSARNQVMFALNDYLAGLALDQLSNQASVGGISFSTNANN GLMVNANGYTQRLPQLFQALLEGYFSYTATEDQLEQAKSWYNQMMDSAEKGKAFEQAIMP AQMLSQVPYFSRDERRKILPSITLKEVLAYRDALKSGARPEFMVIGNMTEAQATTLARDV QKQLGADGSEWCRNKDVVVDKKQSVIFEKAGNSTDSALAAVFVPTGYDEYTSSAYSSLLG QIVQPWFYNQLRTEEQLGYAVFAFPMSVGRQWGMGFLLQSNDKQPSFLWERYKAFFPTAE AKLRAMKPDEFAQIQQAVITQMLQAPQTLGEEASKLSKDFDRGNMRFDSRDKIVAQIKLL TPQKLADFFHQAVVEPQGMAILSQISGSQNGKAEYVHPEGWKVWENVSALQQTMPLMSEK NE >gi|296493157|gb|ADTK01000344.1| GENE 22 24689 - 28231 2948 1180 aa, chain + ## HITS:1 COG:ECs3677 KEGG:ns NR:ns ## COG: ECs3677 COG1074 # Protein_GI_number: 15832931 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V) beta subunit 
(contains helicase and exonuclease domains) # Organism: Escherichia coli O157:H7 # 1 1180 1 1180 1180 2270 98.0 0 MSDVAETLDPLRLPLQGERLIEASAGTGKTFTIAALYLRLLLGLGGSAAFPRPLTVEELL VVTFTEAATAELRGRIRSNIHELRIACLRETTDNPLYERLLEEIDDKAQAAQWLLLAERQ MDEAAVFTIHGFCQRMLNLNAFESGMLFEQQLIEDESLLRYQACADFWRRHCYPLPREIA LVVFETWKGPQALLRDINRYLQGEAPVIKAPPPDDETLASRHAQIVARIDTVKQQWRDAV GELDALIESSGIDRRKFNRSNQAKWIEKISAWAEEETNSYQLPESLEKFSQRFLEDRTKA GGETPRHPLFEAIEQLLAEPLSIRDLVITRALAEIRETVAREKRRRGELGFDDMLSRLDS ALRSESGEVLAAAIRTRFPVAMIDEFQDTDPQQYRIFRRIWHHQPETALLLIGDPKQAIY AFRGADIFTYMKARSEVHAHYTLDTNWRSAPGMVNSVNKLFSQTDDAFMFREIPFIPVKS AGKNQALRFVFKGETQPAMKMWLMEGESCGVGDYQSTMAQVCAAQIRDWLQAGQRGEALL MNGDDARPVRASDISVLVRSRQEAAQVRDALTLLEIPSVYLSNRDSVFETLEAQEMLWLL QAVMTPERENTLRSALATSMMGLNALDIETLNNDEHAWDVVVEEFDGYRQIWRKRGVMPM LRALMSARNIAENLLATAGGERRLTDILHISELLQEAGTQLESEHALVRWLSQHILEPDS NASSQQMRLESDKHLVQIVTIHKSKGLEYPLVWLPFITNFRVQDQAFYHDRHSFEAVLDL NAAPESVDLAEVERLAEDLRLLYVALTRSVWHCSLGVAPLVRRRGDKKGDTDVHQSALGR LLQKGEPQDAAGLRTCIEALCDDDIAWQTAQTGDNQPWQVNDALTAELNARTLQRLPGDN WRVTSYSGLQQRGHGIAQDLMPRLDVDAAGVVSVVEEPTLTPHQFPRGASPGTFLHSLFE DLDFTQPVDPNWVQEKLELGGFESQWEPVLTEWITAVLQAPLNETGVSLSQLSDRDKQVE MEFYLPISEPLIASQLDALIRQFDPLSAGCPPLEFMQVRGMLKGFIDLVFRHEGRYYLLD YKSNWLGEDSSAYTQQAMAAAMQAHRYDLQYQLYTLALHRYLRHRIADYDYDRHFGGVIY LFLRGVDKEHPQQGIYTTRPNAGLIALMDEMFAGMTLEEA >gi|296493157|gb|ADTK01000344.1| GENE 23 28231 - 30057 1334 608 aa, chain + ## HITS:1 COG:recD KEGG:ns NR:ns ## COG: recD COG0507 # Protein_GI_number: 16130723 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Escherichia coli K12 # 1 608 1 608 608 1161 99.0 0 MKLQKQLLEAVEHKQLRPLDVQFALTVAGDEHPAVTLAAALLSHDAGEGHVCLPLSRLEN NEASHPLLATCVSEIGELQNWEECLLASQAVSRGDEPTPMILCGDRLYLNRMWCNERTVA RFFNEVNHAIEVDEALLAQTLDKLFPVSDEINWQKVAAAVALTRRISVISGGPGTGKTTT VAKLLAALIQMADGERCRIRLAAPTGKAAARLTESLGKALRQLPLTDEQKKRIPEDASTL HRLLGAQPGSQRLRHHAGNPLHLDVLVVDEASMIDLPMMSRLIDALPDHARVIFLGDRDQ 
LASVEAGAVLGDICAYANAGFTAERAGQLSRLTGTHVPAGTGTEAASLRDSLCLLQKSYR FGSDSGIGQLAAAINRGDKTAVKTVFQQDFTDIEKRLLQSGEDYIAMLEEALAGYGRYLD LLQARAEPDLIIQAFNEYQLLCALREGPFGVAGLNERIEQFMQQKRKIHRHPHSRWYEGR PVMIARNDSALGLFNGDIGIALDRGQGTRVWFAMPDGNIKSVQPSRLPEHETTWAMTVHK SQGSEFDHAALILPSQRTPVVTRELVYTAVTRARRRLSLYADERILSAAIATRTERRSGL AALFSSRE >gi|296493157|gb|ADTK01000344.1| GENE 24 30119 - 31450 1277 443 aa, chain - ## HITS:1 COG:ECs3675_1 KEGG:ns NR:ns ## COG: ECs3675_1 COG0548 # Protein_GI_number: 15832929 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate kinase # Organism: Escherichia coli O157:H7 # 1 291 1 291 291 581 100.0 1e-166 MVKERKTELVEGFRHSVPYINTHRGKTFVIMLGGEAIEHENFSSIVNDIGLLHSLGIRLV VVYGARPQIDANLAAHHHEPLYHKNIRVTDAKTLELVKQAAGTLQLDITARLSMSLNNTP LQGAHINVVSGNFIIAQPLGVDDGVDYCHSGRIRRIDEDAIHRQLDSGAIVLMGPVAVSV TGESFNLTSEEIATQLAIKLKAEKMIGFCSSQGVTNDDGDIVSELFPNEAQARVEAQEEK GDYNSGTVRFLRGAVKACRSGVRRCHLISYQEDGALLQELFSRDGIGTQIVMESAEQIRR ATINDIGGILELIRPLEQQGILVRRSREQLEMEIDKFTIIQRDNTTIACAALYPFPEEKI GEMACVAVHPDYRSSSRGEVLLERITAQAKQSGLSKLFVLTTRSIHWFQERGFTPVDIDL LPESKKQLYNYQRKSKVLMADLG >gi|296493157|gb|ADTK01000344.1| GENE 25 31682 - 32935 1228 417 aa, chain + ## HITS:1 COG:ECs3674 KEGG:ns NR:ns ## COG: ECs3674 COG0860 # Protein_GI_number: 15832928 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Escherichia coli O157:H7 # 1 417 31 447 447 783 100.0 0 MSGSNTAISRRRLLQGAGAMWLLSVSQVSLAAVSQVVAVRVWPASSYTRVTVESNRQLKY KQFALSNPERVVVDIEDVNLNSVLKGMAAQIRADDPFIKSARVGQFDPQTVRMVFELKQN VKPQLFALAPVAGFKERLVMDLYPANAQDMQDPLLALLEDYNKGDLEKQVPPAQSGPQPG KAGRDRPIVIMLDPGHGGEDSGAVGKYKTREKDVVLQIARRLRSLIEKEGNMKVYMTRNE DIFIPLQVRVAKAQKQRADLFVSIHADAFTSRQPSGSSVFALSTKGATSTAAKYLAQTQN ASDLIGGVSKSGDRYVDHTMFDMVQSLTIADSLKFGKAVLNKLGKINKLHKNQVEQAGFA VLKAPDIPSILVETAFISNVEEERKLKTATFQQEVAESILAGIKAYFADGATLARRG >gi|296493157|gb|ADTK01000344.1| GENE 26 33405 - 34502 1125 365 aa, chain + ## HITS:1 COG:ECs3673 KEGG:ns NR:ns ## COG: ECs3673 COG2821 # 
Protein_GI_number: 15832927 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-bound lytic murein transglycosylase # Organism: Escherichia coli O157:H7 # 1 365 1 365 365 740 100.0 0 MKGRWVKYLLMGTVVAMLAACSSKPTDRGQQYKDGKFTQPFSLVNQPDAVGAPINAGDFA EQINHIRNSSPRLYGNQSNVYNAVQEWLRAGGDTRNMRQFGIDAWQMEGADNYGNVQFTG YYTPVIQARHTRQGEFQYPIYRMPPKRGRLPSRAEIYAGALSDKYILAYSNSLMDNFIMD VQGSGYIDFGDGSPLNFFSYAGKNGHAYRSIGKVLIDRGEVKKEDMSMQAIRHWGETHSE AEVRELLEQNPSFVFFKPQSFAPVKGASAVPLVGRASVASDRSIIPPGTTLLAEVPLLDN NGKFNGQYELRLMVALDVGGAIKGQHFDIYQGIGPEAGHRAGWYNHYGRVWVLKTAPGAG NVFSG >gi|296493157|gb|ADTK01000344.1| GENE 27 34741 - 35547 856 268 aa, chain + ## HITS:1 COG:ygdL KEGG:ns NR:ns ## COG: ygdL COG1179 # Protein_GI_number: 16130719 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 # Organism: Escherichia coli K12 # 1 268 1 268 268 511 99.0 1e-145 MSVVISDAWRQRFGGTARLYGEKALQLFADAHICVVGIGGVGSWAAEALARTGIGAITLI DMDDVCVTNTNRQIHALRDNVGLAKAKVMAERIRQINPECRVTVVDDFVTPDNVAQYMSA GYSYVIDAIDSVRPKAALIAYCRRNKIPLVTTGGAGGQIDPTQIQVTDLAKTIQDPLAAK LRERLKSDFGVVKNSKGKLGVDCVFSTEALVYPQSDGTVCAMKATAEGPKRMDCASGFGA ATMVTATFGFVAVSHALKKMMAKAARQG >gi|296493157|gb|ADTK01000344.1| GENE 28 35598 - 36041 404 147 aa, chain - ## HITS:1 COG:ECs3671 KEGG:ns NR:ns ## COG: ECs3671 COG2166 # Protein_GI_number: 15832925 # Func_class: R General function prediction only # Function: SufE protein probably involved in Fe-S center assembly # Organism: Escherichia coli O157:H7 # 1 147 1 147 147 274 98.0 4e-74 MTNPQFAGHPFGTTVTAETLRNTFAPLSQWEDKYRQLIMLGKQLPALPDELKAQAKEIAG CENRVWLGYTVAENGKMHFFGDSEGRIVRGLLAVLLTAVEGKTAAELQAQSPLALFDELG LRGQLSASRSQGLNALSEAIIAATKQV >gi|296493157|gb|ADTK01000344.1| GENE 29 36041 - 37246 1030 401 aa, chain - ## HITS:1 COG:csdA KEGG:ns NR:ns ## COG: csdA COG0520 # Protein_GI_number: 16130717 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: 
Escherichia coli K12 # 1 401 1 401 401 790 99.0 0 MNVFNPAQFRAQFPALQDAGVYLDSAATALKPEAVVEATRQFYSLSAGNVHRSQFAEAQR LTARYEAAREKVAQLLNAPDDKTIVWTRGTTESINMVAQCYARPRLQPGDEIIVSVAEHH ANLVPWLMVAQQTGAKVVKLPLNAQRLPDVDLLPELITPRSRILALGQMSNVTGGCPDLA RAITFAHSAGMVVMVDGAQGAVHFPADVQQLDIDFYAFSGHKLYGPTGIGVLYGKSELLE AMSPWLGGGKMIHEVSFDGFTTQSAPWKLEAGTPNVAGVIGLSAALEWLADYDINQAESW SRSLATLAEDALAKRPGFRSFRCQDSSLLAFDFAGVHHSDMVTLLAEYGIALRAGQHCAQ PLLAELGVTGTLRASFAPYNTKSDVDALVNAVDRALELLVD >gi|296493157|gb|ADTK01000344.1| GENE 30 37438 - 37665 329 75 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_2919 NR:ns ## KEGG: ECIAI1_2919 # Name: ygdI # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 75 1 75 75 129 100.0 3e-29 MKKTAAIISACMLTFALSACSGSNYVMHTNDGRTIVSDGKPQTDNDTGMISYKDANGNKQ QINRTDVKEMVELDQ >gi|296493157|gb|ADTK01000344.1| GENE 31 38016 - 38933 551 305 aa, chain + ## HITS:1 COG:ECs3668 KEGG:ns NR:ns ## COG: ECs3668 COG0583 # Protein_GI_number: 15832922 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 305 1 305 305 598 99.0 1e-171 MSKRLPPLNALRVFDAAARHLSFTRAAEELFVTQAAVSHQIKSLEDFLGLKLFRRRNRSL LLTEEGQSYFLDIKEIFSQLTEATRKLQARSAKGALTVSLLPSFAIHWLVPRLSSFNSAY PGIDVRIQAVDRQEDKLADDVDVAIFYGRGNWPGLRVEKLYAEYLLPVCSPLLLTGEKPL KTPEDLAKHTLLHDASRRDWQTYTRQLGLNHINVQQGPIFSHSAMVLQAAIHGQGVALAN NVMAQSEIEAGRLVCPFNDVLVSKNAFYLVCHDSQAELGKIAAFRQRILAKAAAEQEKFR FRYEQ >gi|296493157|gb|ADTK01000344.1| GENE 32 38952 - 39347 457 131 aa, chain + ## HITS:1 COG:ECs3667 KEGG:ns NR:ns ## COG: ECs3667 COG2363 # Protein_GI_number: 15832921 # Func_class: S Function unknown # Function: Uncharacterized small membrane protein # Organism: Escherichia coli O157:H7 # 1 131 1 131 131 204 100.0 3e-53 MTSRFMLIFAAISGFIFVALGAFGAHVLSKTMGAVEMGWIQTGLEYQAFHTLAILGLAVA MQRRISIWFYWSSVFLALGTVLFSGSLYCLALSHLRLWAFVTPVGGVSFLAGWALMLVGA IRLKRKGVSHE >gi|296493157|gb|ADTK01000344.1| GENE 33 39340 - 40440 1111 366 aa, chain + ## HITS:1 COG:ECs3666 KEGG:ns NR:ns ## COG: ECs3666 COG2933 # 
Protein_GI_number: 15832920 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferase # Organism: Escherichia coli O157:H7 # 1 366 1 366 366 780 100.0 0 MNKVVLLCRPGFEKECAAEITDKAGQREIFGFARVKENAGYVIYECYQPDDGDKLIRELP FSSLIFARQWFVVGELLQHLPPEDRITPIVGMLQGVVEKGGELRVEVADTNESKELLKFC RKFTVPLRAALRDAGVLANYETPKRPVVHVFFIAPGCCYTGYSYSNNNSPFYMGIPRLKF PADAPSRSTLKLEEAFHVFIPADEWDERLANGMWAVDLGACPGGWTYQLVKRNMWVYSVD NGPMAQSLMDTGQVTWLREDGFKFRPTRSNISWMVCDMVEKPAKVAALMAQWLVNGWCRE TIFNLKLPMKKRYEEVSHNLAYIQAQLDEHGINAQIQARQLYHDREEVTVHVRRIWAAVG GRRDER >gi|296493157|gb|ADTK01000344.1| GENE 34 40484 - 41215 534 243 aa, chain - ## HITS:1 COG:fucR KEGG:ns NR:ns ## COG: fucR COG1349 # Protein_GI_number: 16130712 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli K12 # 1 243 1 243 243 471 100.0 1e-133 MKAARQQAIVDLLLNHTSLTTEALSEQLKVSKETIRRDLNELQTQGKILRNHGRAKYIHR QNQDSGDPFHIRLKSHYAHKADIAREALAWIEEGMVIALDASSTCWYLARQLPDINIQVF TNSHPICHELGKRERIQLISSGGTLERKYGCYVNPSLISQLKSLEIDLFIFSCEGIDSSG ALWDSNAINADYKSMLLKRAAQSLLLIDKSKFNRSGEARIGHLDEVTHIISDERQVATSL VTA >gi|296493157|gb|ADTK01000344.1| GENE 35 41273 - 41695 452 140 aa, chain - ## HITS:1 COG:ECs3664 KEGG:ns NR:ns ## COG: ECs3664 COG4154 # Protein_GI_number: 15832918 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose dissimilation pathway protein FucU # Organism: Escherichia coli O157:H7 # 1 140 1 140 140 268 100.0 2e-72 MLKTISPLISPELLKVLAEMGHGDEIIFSDAHFPAHSMGPQVIRADGLLVSDLLQAIIPL FELDSYAPPLVMMAAVEGDTLDPEVERRYRNALSLQAPCPDIIRINRFAFYERAQKAFAI VITGERAKYGNILLKKGVTP >gi|296493157|gb|ADTK01000344.1| GENE 36 41697 - 43115 978 472 aa, chain - ## HITS:1 COG:fucK KEGG:ns NR:ns ## COG: fucK COG1070 # Protein_GI_number: 16130710 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Escherichia coli K12 # 1 472 11 482 482 958 99.0 0 
MKQEVILVLDCGATNVRAIAVNRQGKIVARASTPNASDIAMENNTWHQWSLDAILQRFAD CCRQINSELTECHIRGIAVTTFGVDGALVDKQGNLLYPIISWKCPRTAAVMDNIEQLISA QRLQAISGVGAFSFNTLYKLVWLKENHPQLLERAHAWLFISSLINHRLTGEFTTDITMAG TSQMLDIQQRDFSPQILQATGIPRRLFPRLVEAGEQIGTLQNSAAAMLGLPVGIPVISAG HDTQFALFGAGADQNEPVLSSGTWEILMVRSAQVDTSLLSQYAGSTCELDSQAGLYNPGM QWLASGVLEWVRKLFWTAETPWQMLIEEARLIAPGADGVKMQCDLLSCQNAGWQGVTLNT TRGHFYRAALEGLTAQLQRNLQMLEKIGHFKASELLLVGGGSRNTLWNQIKANMLDIPVK VLDDAETTVAGAALFGWYGVGEFNSPEEARAQIHYQYRYFYPQTEPEFIEEV >gi|296493157|gb|ADTK01000344.1| GENE 37 43224 - 44999 1732 591 aa, chain - ## HITS:1 COG:fucI KEGG:ns NR:ns ## COG: fucI COG2407 # Protein_GI_number: 16130709 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Escherichia coli K12 # 1 591 1 591 591 1222 100.0 0 MKKISLPKIGIRPVIDGRRMGVRESLEEQTMNMAKATAALLTEKLRHACGAAVECVISDT CIAGMAEAAACEEKFSSQNVGLTITVTPCWCYGSETIDMDPTRPKAIWGFNGTERPGAVY LAAALAAHSQKGIPAFSIYGHDVQDADDTSIPADVEEKLLRFARAGLAVASMKGKSYLSL GGVSMGIAGSIVDHNFFESWLGMKVQAVDMTELRRRIDQKIYDEAELEMALAWADKNFRY GEDENNKQYQRNAEQSRAVLRESLLMAMCIRDMMQGNSKLADIGRVEESLGYNAIAAGFQ GQRHWTDQYPNGDTAEAILNSSFDWNGVREPFVVATENDSLNGVAMLMGHQLTGTAQVFA DVRTYWSPEAIERVTGHKLDGLAEHGIIHLINSGSAALDGSCKQRDSEGNPTMKPHWEIS QQEADACLAATEWCPAIHEYFRGGGYSSRFLTEGGVPFTMTRVNIIKGLGPVLQIAEGWS VELPKDVHDILNKRTNSTWPTTWFAPRLTGKGPFTDVYSVMANWGANHGVLTIGHVGADF ITLASMLRIPVCMHNVEETKVYRPSAWAAHGMDIEGQDYRACQNYGPLYKR >gi|296493157|gb|ADTK01000344.1| GENE 38 45032 - 46348 805 438 aa, chain - ## HITS:1 COG:fucP KEGG:ns NR:ns ## COG: fucP COG0738 # Protein_GI_number: 16130708 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Escherichia coli K12 # 1 438 1 438 438 808 100.0 0 MGNTSIQTQSYRAVDKDAGQSRSYIIPFALLCSLFFLWAVANNLNDILLPQFQQAFTLTN FQAGLIQSAFYFGYFIIPIPAGILMKKLSYKAGIITGLFLYALGAALFWPAAEIMNYTLF LVGLFIIAAGLGCLETAANPFVTVLGPESSGHFRLNLAQTFNSFGAIIAVVFGQSLILSN VPHQSQDVLDKMSPEQLSAYKHSLVLSVQTPYMIIVAIVLLVALLIMLTKFPALQSDNHS 
DAKQGSFSASLSRLARIRHWRWAVLAQFCYVGAQTACWSYLIRYAVEEIPGMTAGFAANY LTGTMVCFFIGRFTGTWLISRFAPHKVLAAYALIAMALCLISAFAGGHVGLIALTLCSAF MSIQYPTIFSLGIKNLGQDTKYGSSFIVMTIIGGGIVTPVMGFVSDAAGNIPTAELIPAL CFAVIFIFARFRSQTATN >gi|296493157|gb|ADTK01000344.1| GENE 39 46895 - 47542 668 215 aa, chain + ## HITS:1 COG:ECs3660 KEGG:ns NR:ns ## COG: ECs3660 COG0235 # Protein_GI_number: 15832914 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli O157:H7 # 1 215 1 215 215 432 100.0 1e-121 MERNKLARQIIDTCLEMTRLGLNQGTAGNVSVRYQDGMLITPTGIPYEKLTESHIVFIDG NGKHEEGKLPSSEWRFHMAAYQSRPDANAVVHNHAVHCTAVSILNRPIPAIHYMIAAAGG NSIPCAPYATFGTRELSEHVALALKNRKATLLQHHGLIACEVNLEKALWLAHEVEVLAQL YLTTLAITDPVPVLSDEEIAVVLEKFKTYGLRIEE >gi|296493157|gb|ADTK01000344.1| GENE 40 47570 - 48718 1225 382 aa, chain + ## HITS:1 COG:ECs3659 KEGG:ns NR:ns ## COG: ECs3659 COG1454 # Protein_GI_number: 15832913 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Escherichia coli O157:H7 # 1 382 2 383 383 745 100.0 0 MANRMILNETAWFGRGAVGALTDEVKRRGYQKALIVTDKTLVQCGVVAKVTDKMDAAGLA WAIYDGVVPNPTITVVKEGLGVFQNSGADYLIAIGGGSPQDTCKAIGIISNNPEFADVRS LEGLSPTNKPSVPILAIPTTAGTAAEVTINYVITDEEKRRKFVCVDPHDIPQVAFIDADM MDGMPPALKAATGVDALTHAIEGYITRGAWALTDALHIKAIEIIAGALRGSVAGDKDAGE EMALGQYVAGMGFSNVGLGLVHGMAHPLGAFYNTPHGVANAILLPHVMRYNADFTGEKYR DIARVMGVKVEGMSLEEARNAAVEAVFALNRDVGIPPHLRDVGVRKEDIPALAQAALDDV CTGGNPREATLEDIVELYHTAW >gi|296493157|gb|ADTK01000344.1| GENE 41 48773 - 49528 515 251 aa, chain - ## HITS:1 COG:exo KEGG:ns NR:ns ## COG: exo COG0258 # Protein_GI_number: 16130705 # Func_class: L Replication, recombination and repair # Function: 5'-3' exonuclease (including N-terminal domain of PolI) # Organism: Escherichia coli K12 # 1 251 31 281 281 519 99.0 1e-147 MAVHLLIVDALNLIRRIHAVQGSPCVETCQHALDQLIMHSQPTHAVAVFDDENRSSGWRH QRLPDYKAGRPPMPEELHDEMPALRAAFEQRGVPCWSTSGNEADDLAATLAVKVTQAGHQ 
ATIVSTDKGYCQLLSPTLRIRDYFQKRWLDAPFIDKEFGVQPQQLPDYWGLAGISSSKVP GVAGIGPKSATQLLVEFQSLEGIYENLDAVAEKWRKKLETHKEMAFLCRDIARLQTDLHI DGNLQQLRLVR >gi|296493157|gb|ADTK01000344.1| GENE 42 49640 - 51007 1351 455 aa, chain - ## HITS:1 COG:sdaB KEGG:ns NR:ns ## COG: sdaB COG1760 # Protein_GI_number: 16130704 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Escherichia coli K12 # 1 455 1 455 455 908 99.0 0 MISVFDIFKIGIGPSSSHTVGPMKAGKQFTDDLIARNLLKDVTRVVVDVYGSLSLTGKGH HTDIAIIMGLAGNLPDTVDIDSIPGFIQDVNTHGRLMLANGQHEVEFPVDQCMNFHADNL SLHENGMRITALAGDKVVYSQTYYSIGGGFIVDEEHFGQQNSAPVEVPYPYSSAADLQKH CQETGLSLSGLMMKNELALHSKEELEQHLANVWEVMRGGIERGISTEGVLPGKLRVPRRA AALRRMLVSQDKTTTDPMAVVDWINMFALAVNEENAAGGRVVTAPTNGACGIIPAVLAYY DKFIREVNANSLARYLLVASAIGSLYKMNASISGAEVGCQGEVGVACSMAAAGLAELLGA SPAQVCIAAEIAMEHNLGLTCDPVAGQVQVPCIERNAIAAVKAVNAARMALRRTSEPRVC LDKVIETMYETGKDMNAKYRETSRGGLAMKIVACD >gi|296493157|gb|ADTK01000344.1| GENE 43 51065 - 52354 1414 429 aa, chain - ## HITS:1 COG:ECs3656 KEGG:ns NR:ns ## COG: ECs3656 COG0814 # Protein_GI_number: 15832910 # Func_class: E Amino acid transport and metabolism # Function: Amino acid permeases # Organism: Escherichia coli O157:H7 # 1 429 1 429 429 731 99.0 0 METTQTSTIASKDSRSAWRKTDTMWMLGLYGTAIGAGVLFLPINAGVGGMIPLIIMAILA FPMTFFAHRGLTRFVLSGKNPGENITEVVEEHFGIGAGKLITLLYFFAIYPILLVYSVAI TNTVESFMSHQLGMTPPPRAILSLILIVGMMTIVRFGEQMIVKAMSILVFPFVGVLMLLA LYLIPQWNGAALETLSLDTASATGNGLWMTLWLAIPVMVFSFNHSPIISSFAVAKREEYG DMAEQKCSKILAFAHIMMVLTVMFFVFSCVLSLTPADLAAAKEQNISILSYLANHFNAPV IAWMAPIIAIIAITKSFLGHYLGAREGFNGMVIKSLRGKGKSIEINKLNRITALFMLVTT WIVATLNPSILGMIETLGGPIIAMILFLMPMYAIQKVPAMRKYSGHISNVFVVVMGLIAI SAIFYSLFS >gi|296493157|gb|ADTK01000344.1| GENE 44 52911 - 54275 1340 454 aa, chain - ## HITS:1 COG:ECs3655 KEGG:ns NR:ns ## COG: ECs3655 COG1611 # Protein_GI_number: 15832909 # Func_class: R General function prediction only # Function: Predicted Rossmann fold nucleotide-binding protein # Organism: Escherichia coli O157:H7 # 1 454 1 454 454 931 100.0 
0 MITHISPLGSMDMLSQLEVDMLKRTASSDLYQLFRNCSLAVLNSGSLTDNSKELLSRFEN FDINVLRRERGVKLELINPPEEAFVDGRIIRALQANLFAVLRDILFVYGQIHNTVRFPNL NLDNSVHITNLVFSILRNARALHVGEAPNMVVCWGGHSINENEYLYARRVGNQLGLRELN ICTGCGPGAMEAPMKGAAVGHAQQRYKDSRFIGMTEPSIIAAEPPNPLVNELIIMPDIEK RLEAFVRIAHGIIIFPGGVGTAEELLYLLGILMNPANKDQVLPLILTGPKESADYFRVLD EFVVHTLGENARRHYRIIIDDAAEVARQMKKSMPLVKENRRDTGDAYSFNWSMRIAPDLQ MPFEPSHENMANLKLYPDQPVEVLAADLRRAFSGIVAGNVKEVGIRAIEEFGPYKINGDK EIMRRMDDLLQGFVAQHRMKLPGSAYIPCYEICT >gi|296493157|gb|ADTK01000344.1| GENE 45 54387 - 55235 626 282 aa, chain - ## HITS:1 COG:yqcD_2 KEGG:ns NR:ns ## COG: yqcD_2 COG0780 # Protein_GI_number: 16130701 # Func_class: R General function prediction only # Function: Enzyme related to GTP cyclohydrolase I # Organism: Escherichia coli K12 # 138 282 1 145 145 307 99.0 2e-83 MSSYANHQALAGLTLGKSTDYRDTYDASLLQGVPRSLNRDPLGLKADNLPFHGTDIWTLY ELSWLNAKGLPQVAVGHVELDYTSANLIESKSFKLYLNSFNQTRFNNWDEVRQTLERDLS TCAQGKVSVALYRLDELEGQPIGHFNGTCIDDQDITIDNYEFTTDYLENATCGEKVVEET LVSHLLKSNCLITHQPDWGSIQIQYRGRQIDREKLLRYLVSFRHHNEFHEQCVERIFNDL LRFCQPEKLSVYARYTRRGGLDINPWRSNSDFVPSTTRLVRQ >gi|296493157|gb|ADTK01000344.1| GENE 46 55303 - 55848 484 181 aa, chain + ## HITS:1 COG:no KEGG:ECO103_3336 NR:ns ## KEGG: ECO103_3336 # Name: syd # Def: SecY interacting protein Syd # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 181 1 181 181 367 100.0 1e-101 MDDLTAQALKDFTARYCDAWHEEHKSWPLSEELYGVPSPCIISTTEDAVYWQPQPFTGEQ NVNAVERAFDIVIQPTIHTFYTTQFAGDMHAQFGDIKLTLLQTWSEDDFRRVQENLIGHL VTQKRLKLPPTLFIATLEEELEVISVCNLSGEVCKETLGTRKRTHLASNLAEFLNQLKPL L >gi|296493157|gb|ADTK01000344.1| GENE 47 56470 - 56799 313 109 aa, chain + ## HITS:1 COG:ECs3652 KEGG:ns NR:ns ## COG: ECs3652 COG3098 # Protein_GI_number: 15832906 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 109 1 109 109 177 100.0 4e-45 MTTHDRVRLQLQALEALLREHQHWRNDEPQPHQFNSTQPFFMDTMEPLEWLQWVLIPRMH DLLNNNQPLPGAFAVAPYYEMALATDHPQRALILAELEKLDALFADDAS 
>gi|296493157|gb|ADTK01000344.1| GENE 48 56799 - 57581 193 260 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family [Lactobacillus jensenii 269-3] # 3 219 82 276 287 79 27 6e-14 MLEILYQDEWLVAVNKPSGWLVHRSWLDRDEKVVVMQTVRDQIGQHVFTAHRLDRPTSGV LLMGLSSEAGRLLAQQFEQHQIQKRYHAIVRGWLMEEAVLDYPLVEELDKIADKFAREDK GSQPAVTHYRGLATVEMPVATGRYPTTRYGLVELEPKTGRKHQLRRHLAHLRHPIIGDSK HGDLRQNRSGAEHFGLQRLMLHASQLSLTHPFTGEPLTIHAGLDDTWMQALSQFGWRGLL PENERVEFSAPSGQDGEISS >gi|296493157|gb|ADTK01000344.1| GENE 49 57599 - 58048 581 149 aa, chain + ## HITS:1 COG:ECs3650 KEGG:ns NR:ns ## COG: ECs3650 COG0716 # Protein_GI_number: 15832904 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Escherichia coli O157:H7 # 1 149 1 149 149 291 99.0 2e-79 MAEIGIFVGTMYGNSLLVAEEAEAILTAQGHKATVFEDPELSDWLPYQDKYVLVVTSTTG QGDLPDSIVPLFQGIKDSLGFQPNLRYGVIALGDSSYVNFCNGGKQFDALLQEQSAQRVG EILLIDASENPEPETESNPWVEQWGTLLS >gi|296493157|gb|ADTK01000344.1| GENE 50 58483 - 59835 1456 450 aa, chain + ## HITS:1 COG:ygcZ KEGG:ns NR:ns ## COG: ygcZ COG0477 # Protein_GI_number: 16130696 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 450 1 450 450 817 99.0 0 MSSLSQAASSVEKRTNARYWIVVMLFIVTSFNYGDRATLSIAGSEMAKDIGLDPVGMGYV FSAFSWAYVIGQIPGGWLLDRFGSKRVYFWSIFIWSMFTLLQGFVDIFSGFGIIVALFTL RFLVGLAEAPSFPGNSRIVAAWFPAQERGTAVSIFNSAQYFATVIFAPIMGWLTHEVGWS HVFFFMGGLGIVISFIWLKVIHEPNQHPGVNQKELEYIAAGGALINMDQQNTKVKVPFSV KWGQIKQLLGSRMMIGVYIGQYCINALTYFFITWFPVYLVQARGMSILKAGFVASVPAVC GFIGGVLGGIISDWLMRRTGSLNIARKTPIVMGMLLSMVMVFCNYVNVEWMIIGFMALAF FGKGIGALGWAVMADTAPKEISGLSGGLFNMFGNISGIVTPIAIGYIVGTTGSFNGALIY VGVHALIAVLSYLVLVGDIKRIELKPVAGQ >gi|296493157|gb|ADTK01000344.1| GENE 51 59837 - 61177 1405 446 aa, chain + ## HITS:1 COG:ygcY KEGG:ns NR:ns ## COG: ygcY COG4948 # 
Protein_GI_number: 16130695 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Escherichia coli K12 # 1 446 1 446 446 899 98.0 0 MTTQSSPVITDMKVIPVAGHDSMLLNIGGAHNAYFTRNIVVLTDNAGHTGIGEAPGGEVI YQTLVDAIPMVLGQEVARLNKVVQQVHKGNQAADFDTFGKGAWTFELRVNAVAALEAALL DLLGKALNVPVCELLGPGKQRETITVLGYLFYIGDRTKTDLPYLENTPGNHEWYQLRHQK AMNSEAVVRLAEASQDRYGFKDFKLKGGVLPGEQEIDTVRALKKSFPDARITVDPNGAWL LDEAISLCKGLNDVLTYAEDPCGAEQGFSGREVMAEFRRATGLPVATNMIATNWREMGHA VMLNAVDIPLADPHFWTLSGAVRVAQLCDDWGLTWGCHSNNHFDISLAMFTHVGAAAPGN PTAIDTHWIWQEGDCRLTQNPLEIKNGKIAVPDAPGLGVELDWEQVQKAHEAYKRLPGGA RNDAGPMQYLIPGWTFDRKRPVFGRH >gi|296493157|gb|ADTK01000344.1| GENE 52 61198 - 62538 1606 446 aa, chain + ## HITS:1 COG:ECs3647 KEGG:ns NR:ns ## COG: ECs3647 COG4948 # Protein_GI_number: 15832901 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Escherichia coli O157:H7 # 1 446 1 446 446 930 99.0 0 MSSQFTTPVVTEMQVIPVAGHDSMLMNLSGAHAPFFTRNIVIIKDNSGHTGVGEIPGGEK IRKTLEDAIPLVVGKTLGEYKNVLTLVRNTFADRDAGGRGLQTFDLRTTIHVVTGIEAAM LDLLGQHLGVNVASLLGDGQQRSEVEMLGYLFFVGNRKATPLPYQSQPDDSCDWYRLRHE EAMTPDAVVRLAEAAYEKYGFNDFKLKGGVLAGEEEAESIVALAKRFPQARITLDPNGAW SLNEAIKIGKYLKGSLAYAEDPCGAEQGFSGREVMAEFRRATGLPTATNMIATDWRQMGH TLSLQSVDIPLADPHFWTMQGSVRVAQMCHEFGLTWGSHSNNHFDISLAMFTHVAAAAPG KITAIDTHWIWQEGNQRLTKEPFEIKGGLVQVPEKPGLGVEIDMDQVMKAHELYQKHGLG ARDDAMGMQYLIPGWTFDNKRPCMVR Prediction of potential genes in microbial genomes Time: Mon May 16 16:07:53 2011 Seq name: gi|296493156|gb|ADTK01000345.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.12, whole genome shotgun sequence Length of sequence - 11452 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 4, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) - 
Term 7 - 50 3.1 1 1 Tu 1 . - CDS 243 - 2999 2503 ## COG0642 Signal transduction histidine kinase - Prom 3057 - 3116 3.6 + Prom 2787 - 2846 1.6 2 2 Op 1 8/0.000 + CDS 3056 - 4357 940 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase 3 2 Op 2 3/1.000 + CDS 4405 - 6639 2352 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases + Term 6652 - 6685 1.1 4 3 Op 1 8/0.000 + CDS 6717 - 6965 269 ## COG2336 Growth regulator 5 3 Op 2 3/1.000 + CDS 6965 - 7300 217 ## COG2337 Growth inhibitor 6 3 Op 3 5/1.000 + CDS 7371 - 8162 1044 ## COG1694 Predicted pyrophosphatase + Term 8176 - 8229 1.1 + Prom 8282 - 8341 5.3 7 4 Op 1 8/0.000 + CDS 8390 - 10027 1601 ## COG0504 CTP synthase (UTP-ammonia lyase) + Term 10046 - 10083 9.1 8 4 Op 2 . + CDS 10115 - 11413 1469 ## COG0148 Enolase Predicted protein(s) >gi|296493156|gb|ADTK01000345.1| GENE 1 243 - 2999 2503 918 aa, chain - ## HITS:1 COG:ECs3646_1 KEGG:ns NR:ns ## COG: ECs3646_1 COG0642 # Protein_GI_number: 15832900 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli O157:H7 # 1 531 1 531 531 1033 100.0 0 MTNYSLRARMMILILAPTVLIGLLLSIFFVVHRYNDLQRQLEDAGASIIEPLAVSTEYGM SLQNRESIGQLISVLHRRHSDIVRAISVYDENNRLFVTSNFHLDPSSMQLGSNVPFPRQL TVTRDGDIMILRTPIISESYSPDESPSSDAKNSQNMLGYIALELDLKSVRLQQYKEIFIS SVMMLFCIGIALIFGWRLMRDVTGPIRNMVNTVDRIRRGQLDSRVEGFMLGELDMLKNGI NSMAMSLAAYHEEMQHNIDQATSDLRETLEQMEIQNVELDLAKKRAQEAARIKSEFLANM SHELRTPLNGVIGFTRLTLKTELTPTQRDHLNTIERSANNLLAIINDVLDFSKLEAGKLI LESIPFPLRSTLDEVVTLLAHSSHDKGLELTLNIKSDVPDNVIGDPLRLQQIITNLVGNA IKFTENGNIDILVEKRALSNTKVQIEVQIRDTGIGIPERDQSRLFQAFRQADASISRRHG GTGLGLVITQKLVNEMGGDISFHSQPNRGSTFWFHINLDLNPNIIIEGPSTQCLAGKRLA YVEPNSAAAQCTLDILSETPLEVVYSPTFSALPPAHYDMMLLGIAVTFREPLTMQHERLA KAVSMTDFLMLALPCHAQVNAEKLKQDGIGACLLKPLTPTRLLPALTEFCHHKQNTLLPV TDESKLAMTVMAVDDNPANLKLIGALLEDMVQHVELCDSGHQAVERAKQMPFDLILMDIQ MPDMDGIRACELIHQLPHQQQTPVIAVTAHAMAGQKEKLLGAGMSDYLAKPIEEERLHNL 
LLRYKPGSGISSRVVTPEVNEIVVNPNATLDWQLALRQAAGKTDLARDMLQMLLDFLPEV RNKVEEQLVGENPEGLVDLIHKLHGSCGYSGVPRMKNLCQLIEQQLRSGTKEEDLEPELL ELLDEMDNVAREASKILG >gi|296493156|gb|ADTK01000345.1| GENE 2 3056 - 4357 940 433 aa, chain + ## HITS:1 COG:ygcA KEGG:ns NR:ns ## COG: ygcA COG2265 # Protein_GI_number: 16130692 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Escherichia coli K12 # 1 433 1 433 433 877 100.0 0 MAQFYSAKRRTTTRQIITVSVNDLDSFGQGVARHNGKTLFIPGLLPQENAEVTVTEDKKQ YARAKVVRRLSDSPERETPRCPHFGVCGGCQQQHASVDLQQRSKSAALARLMKHDVSEVI ADVPWGYRRRARLSLNYLPKTQQLQMGFRKAGSSDIVDVKQCPILAPQLEALLPKVRACL GSLQAMRHLGHVELVQATSGTLMILRHTAPLSSADREKLERFSHSEGLDLYLAPDSEILE TVSGEMPWYDSNGLRLTFSPRDFIQVNAGVNQKMVARALEWLDVQPEDRVLDLFCGMGNF TLPLATQAASVVGVEGVPALVEKGQQNARLNGLQNVTFYHENLEEDVTKQPWAKNGFDKV LLDPARAGAAGVMQQIIKLEPIRIVYVSCNPATLARDSEALLKAGYTIARLAMLDMFPHT GHLESMVLFSRVK >gi|296493156|gb|ADTK01000345.1| GENE 3 4405 - 6639 2352 744 aa, chain + ## HITS:1 COG:ECs3644 KEGG:ns NR:ns ## COG: ECs3644 COG0317 # Protein_GI_number: 15832898 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Escherichia coli O157:H7 # 1 744 1 744 744 1509 100.0 0 MVAVRSAHINKAGEFDPEKWIASLGITSQKSCECLAETWAYCLQQTQGHPDASLLLWRGV EMVEILSTLSMDIDTLRAALLFPLADANVVSEDVLRESVGKSVVNLIHGVRDMAAIRQLK ATHTDSVSSEQVDNVRRMLLAMVDDFRCVVIKLAERIAHLREVKDAPEDERVLAAKECTN IYAPLANRLGIGQLKWELEDYCFRYLHPTEYKRIAKLLHERRLDREHYIEEFVGHLRAEM KAEGVKAEVYGRPKHIYSIWRKMQKKNLAFDELFDVRAVRIVAERLQDCYAALGIVHTHY RHLPDEFDDYVANPKPNGYQSIHTVVLGPGGKTVEIQIRTKQMHEDAELGVAAHWKYKEG AAAGGARSGHEDRIAWLRKLIAWQEEMADSGEMLDEVRSQVFDDRVYVFTPKGDVVDLPA GSTPLDFAYHIHSDVGHRCIGAKIGGRIVPFTYQLQMGDQIEIITQKQPNPSRDWLNPNL GYVTTSRGRSKIHAWFRKQDRDKNILAGRQILDDELEHLGISLKEAEKHLLPRYNFNDVD ELLAAIGGGDIRLNQMVNFLQSQFNKPSAEEQDAAALKQLQQKSYTPQNRSKDNGRVVVE GVGNLMHHIARCCQPIPGDEIVGFITQGRGISVHRADCEQLAELRSHAPERIVDAVWGES 
YSAGYSLVVRVVANDRSGLLRDITTILANEKVNVLGVASRSDTKQQLATIDMTIEIYNLQ VLGRVLGKLNQVPDVIDARRLHGS >gi|296493156|gb|ADTK01000345.1| GENE 4 6717 - 6965 269 82 aa, chain + ## HITS:1 COG:ECs3643 KEGG:ns NR:ns ## COG: ECs3643 COG2336 # Protein_GI_number: 15832897 # Func_class: T Signal transduction mechanisms # Function: Growth regulator # Organism: Escherichia coli O157:H7 # 1 82 1 82 82 148 100.0 2e-36 MIHSSVKRWGNSPAVRIPATLMQALNLNIDDEVKIDLVDGKLIIEPVRKEPVFTLAELVN DITPENLHENIDWGEPKDKEVW >gi|296493156|gb|ADTK01000345.1| GENE 5 6965 - 7300 217 111 aa, chain + ## HITS:1 COG:ECs3642 KEGG:ns NR:ns ## COG: ECs3642 COG2337 # Protein_GI_number: 15832896 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Escherichia coli O157:H7 # 1 111 1 111 111 230 100.0 4e-61 MVSRYVPDMGDLIWVDFDPTKGSEQAGHRPAVVLSPFMYNNKTGMCLCVPCTTQSKGYPF EVVLSGQERDGVALADQVKSIAWRARGATKKGTVAPEELQLIKAKINVLIG >gi|296493156|gb|ADTK01000345.1| GENE 6 7371 - 8162 1044 263 aa, chain + ## HITS:1 COG:ECs3641 KEGG:ns NR:ns ## COG: ECs3641 COG1694 # Protein_GI_number: 15832895 # Func_class: R General function prediction only # Function: Predicted pyrophosphatase # Organism: Escherichia coli O157:H7 # 1 263 1 263 263 478 100.0 1e-135 MNQIDRLLTIMQRLRDPENGCPWDKEQTFATIAPYTLEETYEVLDAIAREDFDDLRGELG DLLFQVVFYAQMAQEEGRFDFNDICAAISDKLERRHPHVFADSSAENSSEVLARWEQIKT EERAQKAQHSALDDIPRSLPALMRAQKIQKRCANVGFDWTTLGPVVDKVYEEIDEVMYEA RQAVVDQAKLEEEMGDLLFATVNLARHLGTKAEIALQKANEKFERRFREVERIVAARGLE MTGVDLETMEEVWQQVKRQEIDL >gi|296493156|gb|ADTK01000345.1| GENE 7 8390 - 10027 1601 545 aa, chain + ## HITS:1 COG:ECs3640 KEGG:ns NR:ns ## COG: ECs3640 COG0504 # Protein_GI_number: 15832894 # Func_class: F Nucleotide transport and metabolism # Function: CTP synthase (UTP-ammonia lyase) # Organism: Escherichia coli O157:H7 # 1 545 1 545 545 1101 100.0 0 MTTNYIFVTGGVVSSLGKGIAAASLAAILEARGLNVTIMKLDPYINVDPGTMSPIQHGEV FVTEDGAETDLDLGHYERFIRTKMSRRNNFTTGRIYSDVLRKERRGDYLGATVQVIPHIT NAIKERVLEGGEGHDVVLVEIGGTVGDIESLPFLEAIRQMAVEIGREHTLFMHLTLVPYM 
AASGEVKTKPTQHSVKELLSIGIQPDILICRSDRAVPANERAKIALFCNVPEKAVISLKD VDSIYKIPGLLKSQGLDDYICKRFSLNCPEANLSEWEQVIFEEANPVSEVTIGMVGKYIE LPDAYKSVIEALKHGGLKNRVSVNIKLIDSQDVETRGVEILKGLDAILVPGGFGYRGVEG MITTARFARENNIPYLGICLGMQVALIDYARHVANMENANSTEFVPDCKYPVVALITEWR DENGNVEVRSEKSDLGGTMRLGAQQCQLVDDSLVRQLYNAPTIVERHRHRYEVNNMLLKQ IEDAGLRVAGRSGDDQLVEIIEVPNHPWFVACQFHPEFTSTPRDGHPLFAGFVKAASEFQ KRQAK >gi|296493156|gb|ADTK01000345.1| GENE 8 10115 - 11413 1469 432 aa, chain + ## HITS:1 COG:ECs3639 KEGG:ns NR:ns ## COG: ECs3639 COG0148 # Protein_GI_number: 15832893 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Escherichia coli O157:H7 # 1 432 1 432 432 760 100.0 0 MSKIVKIIGREIIDSRGNPTVEAEVHLEGGFVGMAAAPSGASTGSREALELRDGDKSRFL GKGVTKAVAAVNGPIAQALIGKDAKDQAGIDKIMIDLDGTENKSKFGANAILAVSLANAK AAAAAKGMPLYEHIAELNGTPGKYSMPVPMMNIINGGEHADNNVDIQEFMIQPVGAKTVK EAIRMGSEVFHHLAKVLKAKGMNTAVGDEGGYAPNLGSNAEALAVIAEAVKAAGYELGKD ITLAMDCAASEFYKDGKYVLAGEGNKAFTSEEFTHFLEELTKQYPIVSIEDGLDESDWDG FAYQTKVLGDKIQLVGDDLFVTNTKILKEGIEKGIANSILIKFNQIGSLTETLAAIKMAK DAGYTAVISHRSGETEDATIADLAVGTAAGQIKTGSMSRSDRVAKYNQLIRIEEALGEKA PYNGRKEIKGQA Prediction of potential genes in microbial genomes Time: Mon May 16 16:08:00 2011 Seq name: gi|296493155|gb|ADTK01000346.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.13, whole genome shotgun sequence Length of sequence - 19342 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 8, operones - 6 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 56 - 928 131 ## COG1512 Beta-propeller domains of methanol dehydrogenase type 2 1 Op 2 . - CDS 942 - 1082 183 ## EcE24377A_3081 hypothetical protein - Prom 1143 - 1202 4.2 + Prom 937 - 996 5.9 3 2 Tu 1 . + CDS 1221 - 1892 224 ## PROTEIN SUPPORTED gi|157803532|ref|YP_001492081.1| 50S ribosomal protein L35 + Term 1911 - 1963 5.1 4 3 Op 1 6/0.250 - CDS 3081 - 4559 1480 ## COG1070 Sugar (pentulose and hexulose) kinases 5 3 Op 2 . 
- CDS 4586 - 5863 1254 ## COG0477 Permeases of the major facilitator superfamily + Prom 6094 - 6153 4.5 6 4 Op 1 8/0.250 + CDS 6182 - 6967 179 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 7 4 Op 2 5/0.500 + CDS 7037 - 8491 1443 ## COG0277 FAD/FMN-containing dehydrogenases + Term 8499 - 8541 8.0 8 5 Op 1 3/0.500 + CDS 8585 - 9922 1070 ## COG0477 Permeases of the major facilitator superfamily 9 5 Op 2 29/0.000 + CDS 9900 - 10679 702 ## COG2086 Electron transfer flavoprotein, beta subunit 10 5 Op 3 . + CDS 10676 - 11536 708 ## COG2025 Electron transfer flavoprotein, alpha subunit + Term 11747 - 11779 0.3 - Term 11639 - 11675 -0.8 11 6 Op 1 2/0.750 - CDS 11684 - 12259 513 ## COG1954 Glycerol-3-phosphate responsive antiterminator (mRNA-binding) 12 6 Op 2 12/0.000 - CDS 12276 - 12536 216 ## COG2440 Ferredoxin-like protein 13 6 Op 3 1/1.000 - CDS 12527 - 13798 647 ## COG0644 Dehydrogenases (flavoproteins) 14 6 Op 4 . - CDS 13876 - 14238 365 ## COG0720 6-pyruvoyl-tetrahydropterin synthase - Prom 14331 - 14390 3.4 + Prom 14335 - 14394 6.2 15 7 Op 1 11/0.000 + CDS 14557 - 16356 1812 ## COG0369 Sulfite reductase, alpha subunit (flavoprotein) 16 7 Op 2 11/0.000 + CDS 16356 - 18068 1872 ## COG0155 Sulfite reductase, beta subunit (hemoprotein) 17 7 Op 3 . + CDS 18142 - 18876 894 ## COG0175 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes + Term 18897 - 18934 5.3 + Prom 18906 - 18965 3.9 18 8 Tu 1 . 
+ CDS 19141 - 19293 168 ## G2583_3410 small toxic membrane polypeptide Predicted protein(s) >gi|296493155|gb|ADTK01000346.1| GENE 1 56 - 928 131 290 aa, chain - ## HITS:1 COG:ygcG KEGG:ns NR:ns ## COG: ygcG COG1512 # Protein_GI_number: 16130685 # Func_class: R General function prediction only # Function: Beta-propeller domains of methanol dehydrogenase type # Organism: Escherichia coli K12 # 1 262 24 285 313 486 98.0 1e-137 MRYFILMFTFVCSFVAAQPTIVPQLQQQVTDLTSSLNSQEKKELTHKLESIFNNTQVQIA VLIVPTTKDETIEQYATRVFDNWRLGDAKRNDGILIIVAWSDRTVRIKVGYGLEEKVTDA LAGDIIRSNMIPAFKQQKLAQGLELAINALNNQLTSQHQYPTNPSESESASSSDHYYFAI FWVFAVMFFPFWFFHQGSNFCRACKSGVCISAIYLLDLFLFSDKIFSIAVFSFFFTFTIF MVFTCLCVLQKRASGRSYHSDNSGSAGGSDSGGFSGGGGSSGGGGASGRW >gi|296493155|gb|ADTK01000346.1| GENE 2 942 - 1082 183 46 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_3081 NR:ns ## KEGG: EcE24377A_3081 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 46 1 46 46 65 100.0 6e-10 MSEENKENGFNHVKTFTKIIFIFSVLVFNDNESKITDAAVNLFIQI >gi|296493155|gb|ADTK01000346.1| GENE 3 1221 - 1892 224 223 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157803532|ref|YP_001492081.1| 50S ribosomal protein L35 [Rickettsia canadensis str. 
McKiel] # 5 222 20 224 225 90 31 6e-18 MQYPINEMFQTLQGEGYFTGVPAIFIRLQGCPVGCAWCDTKHTWEKLEDREVSLFSILAK TKESDKWGAASSEDLLAVISRQGYTARHVVITGGEPCIHDLLPLTDLLEKNGFSCQIETS GTHEVRCTPNTWVTVSPKLNMRGGYEVLSQALERANEIKHPVGRVRDIEALDELLATLTD DKPRVIALQPISQKDDATRLCIETCIARNWRLSMQTHKYLNIA >gi|296493155|gb|ADTK01000346.1| GENE 4 3081 - 4559 1480 492 aa, chain - ## HITS:1 COG:ygcE KEGG:ns NR:ns ## COG: ygcE COG1070 # Protein_GI_number: 16130683 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Escherichia coli K12 # 1 492 1 492 492 1024 98.0 0 MSKKYIIGIDGGSQSTKVVMYDLEGNVVCEGKGLLQPMHTPDADTAEHPDDDLWASLCFA GHDLMSQFAGNKEDIVGIGLGSIRCCRALLKADGTPAAPLISWQDARVTRPYEHTNPDVA YVTSFSGYLTHRLTGEFKDNIANYFGQWPVDYKSWAWSEDAAVMDKFNIPRHMLFDVQMP GTVLGHITPQAALATHFPAGLPVVCTTSDKPVEALGAGLLDDETAVISLGTYIALMMNGK ALPKDPVAYWPIMSSIPQTLLYEGYGIRKGMWTVSWLRDMLGESLIQDAKAQDLSPEDLL NKKASCVPPGCNGLMTVLDWLTNPWEPYKRGIMIGFDSSMDYAWIYRSILESVALTLKNN YDNMCNEMNYFAKHVIITGGGSNSDLFMQIFADVFNLPARRNAINGCASLGAAINTAVGL GLYPDYATAVDKMVRVKDIFMPVESNAKRYDAMNKGIFKDLTKHTDVILKKSYEVMHGEL GNADSIQSWSNA >gi|296493155|gb|ADTK01000346.1| GENE 5 4586 - 5863 1254 425 aa, chain - ## HITS:1 COG:yqcE KEGG:ns NR:ns ## COG: yqcE COG0477 # Protein_GI_number: 16130682 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 425 1 425 425 749 99.0 0 MQHNSYRRWITLAIISFSGGVSFDLAYLRYIYQIPMAKFMGFSNTEIGLIMSTFGIAAII LYAPSGVIADKFSHRKMITSAMIITGLLGLLMATYPPLWVMLCIQVAFAITTILMLWSVS IKAASLLGDHSEQGKIMGWMEGLRGVGVMSLAVFTMWVFSRFAPDDSTSLKTVIIIYSVV YILLGILCWFFVSDNNNLRSANNEEKQSFQLSDILAVLRISTTWYCSMVIFGVFTIYAIL SYSTNYLTEMYGMSLVAASYMGIVINKIFRALCGPLGGIITTYSKVKSPTRVIQILSIIG LLALTALLVTNSNPQSVAMGIGLILLLGFTCYASRGLYWACPGEARTPSYIMGTTVGICS VIGFLPDVFVYPIIGHWQDTLPAAEAYRNMWLMGMAALGMVIVFTFLLFQKIRTADSAPA MASSK 
>gi|296493155|gb|ADTK01000346.1| GENE 6 6182 - 6967 179 261 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 16 256 1 238 242 73 26 1e-12 MSIESLNAFSMDFFSLKGKTAIVTGGNSGLGQAFAMALAKAGANIFIPSFVKDNGETKEM IEKQGVEVDFMQVDITAEGAPQKIIAASCERFGTVDILVNNAGICKLNKVLDFGRADWDP MIDVNLTAAFELSYEAAKIMIPQKSGKIINICSLFSYLGGQWSPAYSATKHALAGFTKAY CDELGQYNIQVNGIAPGYYATDITLATRSNPETNQRVLDHIPANRWGDTQDLMGAAVFLA SPASNYVNGHLLVVDGGYLVR >gi|296493155|gb|ADTK01000346.1| GENE 7 7037 - 8491 1443 484 aa, chain + ## HITS:1 COG:ECs3629 KEGG:ns NR:ns ## COG: ECs3629 COG0277 # Protein_GI_number: 15832883 # Func_class: C Energy production and conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Escherichia coli O157:H7 # 1 484 1 484 484 1011 99.0 0 MSLSRAAIVDQLKEIVGADRVITDETVLKKNSIDRFRKFPDIHGIYTLPIPAAVVKLGST EQVSRVLNFMNAHKINGVPRTGASATEGGLETVVENSVVLDGSAMNQIINIDIENMQATA QCGVPLEVLENALREKGYTTGHSPQSKPLAQMGGLVATRSIGQFSTLYGAIEDMVVGLEA VLADGTVTRIKNVPRRAAGPDIRHIIIGNEGALCYITEVTVKIFKFTPENNLFYGYILED MKTGFNILREVMVEGYRPSIARLYDAEDGTQHFTHFADGKCVLIFMAEGNPRIAKATGEG IAEIVARYPQCQRVDSKLIETWFNNLNWGPDKVAAERVQILKTGNMGFTTEVSGCWSCIH EIYESVINRIRTEFPHADDITMLGGHSSHSYQNGTNMYFVYDYNVVDCKPEEEIDKYHNP LNKIICEETIRLGGSMVHHHGIGKHRVHWSKLEHGSAWALLEGLKKQFDPNGIMNTGTIY PIEK >gi|296493155|gb|ADTK01000346.1| GENE 8 8585 - 9922 1070 445 aa, chain + ## HITS:1 COG:ygcS KEGG:ns NR:ns ## COG: ygcS COG0477 # Protein_GI_number: 16130678 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 445 25 469 469 745 100.0 0 MNTSPVRMDDLPLNRFHCRIAALTFGAHLTDGYVLGVIGYAIIQLTPAMQLTPFMAGMIG GSALLGLFLGSLVLGWISDHIGRQKIFTFSFLLITLASFLQFFATTPEHLIGLRILIGIG LGGDYSVGHTLLAEFSPRRHRGILLGAFSVVWTVGYVLASIAGHHFISENPEAWRWLLAS 
AALPALLITLLRWGTPESPRWLLRQGRFAEAHAIVHRYFGPHVLLGDEVVTATHKHIKTL FSSRYWRRTAFNSVFFVCLVIPWFVIYTWLPTIAQTIGLEDALTASLMLNALLIVGALLG LVLTHLLAHRKFLLGSFLLLAATLVVMACLPSGSSLTLLLFVLFSTTISAVSNLVGILPA ESFPTDIRSLGVGFATAMSRLGAAVSTGLLPWVLAQWGMQVTLLLLATVLLVGFVVTWLW APETKALPLVAAGNVGGANEHSVSV >gi|296493155|gb|ADTK01000346.1| GENE 9 9900 - 10679 702 259 aa, chain + ## HITS:1 COG:ygcR KEGG:ns NR:ns ## COG: ygcR COG2086 # Protein_GI_number: 16130677 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: Escherichia coli K12 # 1 259 3 261 261 465 99.0 1e-131 MNILLAFKAEPDAGMLAEKEWQAAAQGKSGPDISLLRSLLGADEQAAAALLLAQRKNGTP MSLTALSMGDERALHWLRYLMALGFEEAVLLETAADLRFAPEFVARHIAEWQHQNPLDLI ITGCQSSEGQNGQTPFLLAEMLGWPCFTQVERFTLDALFITLEQRTEHGLRCCRVRLPAV IAVRQCGEVALPVPGMRQRMAAGKAEIIRKTVAAEMPAMQCLQLARAEQRRGATLIDGQT VAEKAQKLWRDYLRQRMQP >gi|296493155|gb|ADTK01000346.1| GENE 10 10676 - 11536 708 286 aa, chain + ## HITS:1 COG:ygcQ KEGG:ns NR:ns ## COG: ygcQ COG2025 # Protein_GI_number: 16130676 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Escherichia coli K12 # 1 286 12 297 297 535 99.0 1e-152 MNIAIVTINQENAAIASWLAAQDFSGCTLAHWQIEPQPVVAEQVLDALVEQWQRTPADVV LFPPGTFGDELSTRLAWRLHGASICQVTSLDIPTVSVRKSHWGNALTATLQTEKRPLCLS LARQAGAAKNATLPSGMQQLIIVPGALPDWLVSTEDLKNVTRDPLAEARRVLVVGQGGEA DNQEIAMLAEKLGAEVGYSRARVMNGGVDAEKVIGISGHLLAPEVCIVVGASGAAALMAG VRNSKFVVAINHDASAAVFSQADVGVVDDWKVVLEALVTNIHADCQ >gi|296493155|gb|ADTK01000346.1| GENE 11 11684 - 12259 513 191 aa, chain - ## HITS:1 COG:ygcP KEGG:ns NR:ns ## COG: ygcP COG1954 # Protein_GI_number: 16130675 # Func_class: K Transcription # Function: Glycerol-3-phosphate responsive antiterminator (mRNA-binding) # Organism: Escherichia coli K12 # 1 191 1 191 191 377 100.0 1e-105 MPLLHLLRQNPVIAAVKDNASLQLAIDSECQFISVLYGNICTISNIVKKIKNAGKYAFIH VDLLEGASNKEVVIQFLKLVTEADGIISTKASMLKAARAEGFFCIHRLFIVDSISFHNID 
KQVAQSNPDCIEILPGCMPKVLGWVTEKIRQPLIAGGLVCDEEDARNAINAGVVALSTTN TGVWTLAKKLL >gi|296493155|gb|ADTK01000346.1| GENE 12 12276 - 12536 216 86 aa, chain - ## HITS:1 COG:ygcO KEGG:ns NR:ns ## COG: ygcO COG2440 # Protein_GI_number: 16130674 # Func_class: C Energy production and conversion # Function: Ferredoxin-like protein # Organism: Escherichia coli K12 # 1 86 13 98 98 172 97.0 1e-43 MSVARNLWRVADAPHIVPADSVERQTAERLISACPAGLFSLTPEGDLRIDYRSCLECGTC RLLCDESTLQQWRYPPSGFGITYRFG >gi|296493155|gb|ADTK01000346.1| GENE 13 12527 - 13798 647 423 aa, chain - ## HITS:1 COG:ygcN KEGG:ns NR:ns ## COG: ygcN COG0644 # Protein_GI_number: 16130673 # Func_class: C Energy production and conversion # Function: Dehydrogenases (flavoproteins) # Organism: Escherichia coli K12 # 1 423 11 433 433 826 99.0 0 MEDDCDIIIIGAGIAGTACALRCARAGLSVLLLERAEIPGSKNLSGGRLYTHALAELLPQ FHLTAPLERCITHESLSLLTPDGATTFSSLQPGGESWSVLRARFDPWLVAEAEKEGVECI PGATVDALYEENGRVCGVICGDDILRARYVVLAEGANSVLAERHGLVTRPAGEAMALGIK EVLSLETSAIEERFHLENNEGAALLFSGGICDDLPGGAFLYTNQQTLSLGIVCPLSSLTQ SRVPASELLTRFKAHPAVRPLIKNTESLEYGAHLVPEGGLHSMPVQYAGNGWLLVGDALR SCVNTGISVRGMDMALTGAQAAAQTLISACQHREPQNLFPLYHHNVERSLLWDVLQRYQH VPALLQRPGWYRTWPALMQDISRDLWDQGDKPVPPLRQLFWHHLRRHGLWHLAGDVIRSL RCL >gi|296493155|gb|ADTK01000346.1| GENE 14 13876 - 14238 365 120 aa, chain - ## HITS:1 COG:ECs3620 KEGG:ns NR:ns ## COG: ECs3620 COG0720 # Protein_GI_number: 15832874 # Func_class: H Coenzyme transport and metabolism # Function: 6-pyruvoyl-tetrahydropterin synthase # Organism: Escherichia coli O157:H7 # 1 120 2 121 121 251 100.0 2e-67 MSTTLFKDFTFEAAHRLPHVPEGHKCGRLHGHSFMVRLEITGEVDPHTGWIIDFAELKAA FKPTYERLDHHYLNDIPGLENPTSEVLAKWIWDQVKPVVPLLSAVMVKETCTAGCIYRGE >gi|296493155|gb|ADTK01000346.1| GENE 15 14557 - 16356 1812 599 aa, chain + ## HITS:1 COG:cysJ KEGG:ns NR:ns ## COG: cysJ COG0369 # Protein_GI_number: 16130671 # Func_class: P Inorganic ion transport and metabolism # Function: Sulfite reductase, alpha subunit (flavoprotein) # Organism: Escherichia coli K12 # 1 599 1 599 599 
1157 99.0 0 MTTQVPPSALLPLNPEQLVRLQAATTDLTPTQLAWVSGYFWGVLNQQPAALAATPAPAAE MPGITIISASQTGNARRVAEALRDDLLAAKLNVKLVNAGDYKFKQIASEKLLIVVTSTQG EGEPPEEAVALHKFLFSKKAPKLENTAFAVFSLGDSSYEFFCQSGKDFDSKLAELGGERL LDRVDADVEYQAAASEWRARVVDALKSRAPVAAPSQSVATGAVNEIHTSPYSKDAPLVAS LSVNQKITGRNSEKDVRHIEIDLGDSGLRYQPGDALGVWYQNDPALVKELVELLWLKGDE PVTVEGKTLPLNEALQWHFELTVNTANIVENYATLTRSETLLPLVGDKAKLQHYAATTPI VDMVRFSPAQLDAEALINLLRPLTPRLYSIASSQAEVENEVHVTVGVVRYDVEGRARAGG ASSFLADRVEEEGEVRVFIEHNDNFRLPANPETPVIMIGPGTGIAPFRAFMQQRAADEAP GKNWLFFGNPHFTEDFLYQVEWQRYVKDGVLTRIDLAWSRDQKEKVYVQDKLREQGAELW RWINDGAHIYVCGDANRMAKDVEQALLEVIAEFGGMDTEAADEFLSELRVERRYQRDVY >gi|296493155|gb|ADTK01000346.1| GENE 16 16356 - 18068 1872 570 aa, chain + ## HITS:1 COG:ECs3618 KEGG:ns NR:ns ## COG: ECs3618 COG0155 # Protein_GI_number: 15832872 # Func_class: P Inorganic ion transport and metabolism # Function: Sulfite reductase, beta subunit (hemoprotein) # Organism: Escherichia coli O157:H7 # 1 570 1 570 570 1186 100.0 0 MSEKHPGPLVVEGKLTDAERMKLESNYLRGTIAEDLNDGLTGGFKGDNFLLIRFHGMYQQ DDRDIRAERAEQKLEPRHAMLLRCRLPGGVITTKQWQAIDKFAGENTIYGSIRLTNRQTF QFHGILKKNVKPVHQMLHSVGLDALATANDMNRNVLCTSNPYESQLHAEAYEWAKKISEH LLPRTRAYAEIWLDQEKVATTDEEPILGQTYLPRKFKTTVVIPPQNDIDLHANDMNFVAI AENGKLVGFNLLVGGGLSIEHGNKKTYARTASEFGYLPLEHTLAVAEAVVTTQRDWGNRT DRKNAKTKYTLERVGVETFKAEVERRAGIKFEPIRPYEFTGRGDRIGWVKGIDDNWHLTL FIENGRILDYPGRPLKTGLLEIAKIHKGDFRITANQNLIIAGVPESEKAKIEKIAKESGL MNAVTPQRENSMACVSFPTCPLAMAEAERFLPSFIDNIDNLMAKHGVSDEHIVMRVTGCP NGCGRAMLAEVGLVGKAPGRYNLHLGGNRIGTRIPRMYKENITEPEILASLDELIGRWAK EREAGEGFGDFTVRAGIIRPVLDPARDLWD >gi|296493155|gb|ADTK01000346.1| GENE 17 18142 - 18876 894 244 aa, chain + ## HITS:1 COG:cysH KEGG:ns NR:ns ## COG: cysH COG0175 # Protein_GI_number: 16130669 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes # Organism: Escherichia coli K12 # 1 244 1 244 244 489 99.0 1e-138 
MSKLDLNALNELPKVDRILALAETNAELEKLDAEGRVAWALDNLPGEYVLSSSFGIQAAV SLHLVNQIHPDIPVILTDTGYLFPETYRFIDELTDKLKLNLKVYRATESAAWQEARYGKL WEQGVEGIEKYNDINKVEPMNRALKELNAQTWFAGLRREQSGSRANLPVLAIQRGVFKVL PIIDWDNRTIYQYLQKHGLKYHPLWDEGYLSVGDTHTTRKWEPGMLEEETRFFGLKRECG LHEG >gi|296493155|gb|ADTK01000346.1| GENE 18 19141 - 19293 168 50 aa, chain + ## HITS:1 COG:no KEGG:G2583_3410 NR:ns ## KEGG: G2583_3410 # Name: small # Def: small toxic membrane polypeptide # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 50 46 95 95 95 100.0 8e-19 MLTKYALVAIIVLCCTVLGFTLMVGDSLCELSIRERGMEFKAVLAYESKK Prediction of potential genes in microbial genomes Time: Mon May 16 16:08:05 2011 Seq name: gi|296493154|gb|ADTK01000347.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.14, whole genome shotgun sequence Length of sequence - 717 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 78 - 269 225 ## SeHA_A0049 hypothetical protein 2 1 Op 2 . 
- CDS 266 - 688 509 ## EcE24377A_E0028 putative DNA methylase Predicted protein(s) >gi|296493154|gb|ADTK01000347.1| GENE 1 78 - 269 225 63 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0049 NR:ns ## KEGG: SeHA_A0049 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 63 1 63 63 117 100.0 2e-25 MNISTETREILRNYKAVINARRREMGQKPLTTAQIVDEICDFVVNQQAVFLGGHYILQGS RNR >gi|296493154|gb|ADTK01000347.1| GENE 2 266 - 688 509 140 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_E0028 NR:ns ## KEGG: EcE24377A_E0028 # Name: not_defined # Def: putative DNA methylase # Organism: E.coli_E24377A # Pathway: not_defined # 1 140 122 261 261 258 97.0 3e-68 MYCTVKEIIREVLDTDVPDSECVFAVVLTRGDVRHIAQDWNLSDDELETVMQRLDDAFEH GADVSVVHDVVRELMEEKRVSRQVTVPAVMLEKVVALAGSEMKRLYAVGSENGGDGDAFV REEREAMDVVLQALDGEHMS Prediction of potential genes in microbial genomes Time: Mon May 16 16:08:10 2011 Seq name: gi|296493153|gb|ADTK01000348.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.15, whole genome shotgun sequence Length of sequence - 3024 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 28 - 453 454 ## SeHA_A0047 putative antirestriction protein - Term 755 - 783 2.3 2 2 Op 1 . - CDS 868 - 1008 59 ## APECO1_O1CoBM122 hypothetical protein 3 2 Op 2 . - CDS 1031 - 2143 315 ## COG3385 FOG: Transposase and inactivated derivatives 4 2 Op 3 . 
- CDS 2220 - 2990 501 ## SeHA_A0046 hypothetical protein Predicted protein(s) >gi|296493153|gb|ADTK01000348.1| GENE 1 28 - 453 454 141 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0047 NR:ns ## KEGG: SeHA_A0047 # Name: not_defined # Def: putative antirestriction protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 141 1 141 141 298 100.0 3e-80 MQYAKPVTLNVEECDRLFFLPYLFGQDFLYVEASVYALAKKMMPEYEGGFWHFIRLPDGG GYMMPDGDRFHIVNGANWFDRTVSADACGIILTSLVINRQLWLYHDSGDAGLTQLYRMRD AQLWRHIEFHPECNAIYAALD >gi|296493153|gb|ADTK01000348.1| GENE 2 868 - 1008 59 46 aa, chain - ## HITS:1 COG:no KEGG:APECO1_O1CoBM122 NR:ns ## KEGG: APECO1_O1CoBM122 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 46 17 62 62 72 100.0 3e-12 MVIWSLQVAIRGTVSLTAYKTQLKNARHRLNEAPRRRILQMVQPLS >gi|296493153|gb|ADTK01000348.1| GENE 3 1031 - 2143 315 370 aa, chain - ## HITS:1 COG:yi81 KEGG:ns NR:ns ## COG: yi81 COG3385 # Protein_GI_number: 16130326 # Func_class: L Replication, recombination and repair # Function: FOG: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 1 370 3 372 372 728 100.0 0 MNYSHDNWSAILAHIGKPEELDTSARNAGALTRRREIRDAATLLRLGLAYGPGGMSLREV TAWAQLHDVATLSDVALLKRLRNAADWFGILAAQTLAVRAAVTGCTSGKRLRLVDGTAIS APGGGSAEWRLHMGYDPHTCQFTDFELTDSRDAERLDRFAQTADEIRIADRGFGSRPECI RSLAFGEADYIVRVHWRGLRWLTAEGMRFDMMGFLRGLDCGKNGETTVMIGNSGNKKAGA PFPARLIAVSLPPEKALISKTRLLSENRRKGRVVQAETLEAAGHVLLLTSLPEDEYSAEQ VADCYRLRWQIELAFKRLKSLLHLDALRAKEPELAKAWIFANLLAAFLIDDIIQPSLDFP PRSAGSEKKN >gi|296493153|gb|ADTK01000348.1| GENE 4 2220 - 2990 501 256 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0046 NR:ns ## KEGG: SeHA_A0046 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 256 18 273 273 484 100.0 1e-135 MNKTLNALVCRHARNLLLAQGWPEETDVDQRNPNYPGWISIYVRLDAPRLATLLINRHGG VLPPLLASAIQRLTGTGAELVLSGSQWQSLPVLPADGTQVSFPYAGEWLTEDEIRAVLDA VHDAVRSICYQVAEDARRIRAALTTTGQTLLTRQTRRFRLVVKESDHPCWLDEDDENLPV 
VLDAIVNRGARFSSVEMYLVSDCIEHILSCGLACDVLRIPDEPPRRWFDRGVLREVVREA RTEIRSMADALAKIRK Prediction of potential genes in microbial genomes Time: Mon May 16 16:08:18 2011 Seq name: gi|296493152|gb|ADTK01000349.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.16, whole genome shotgun sequence Length of sequence - 264 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 42 - 248 257 ## SeHA_A0045 hypothetical protein Predicted protein(s) >gi|296493152|gb|ADTK01000349.1| GENE 1 42 - 248 257 68 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0045 NR:ns ## KEGG: SeHA_A0045 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 68 77 144 144 119 95.0 5e-26 MSRLRENRQVTVPAELLASLIQTAEQALWKREWAARDNGLAVPECVTRRQAVINQARTLL KNNTHENN Prediction of potential genes in microbial genomes Time: Mon May 16 16:08:22 2011 Seq name: gi|296493151|gb|ADTK01000350.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.17, whole genome shotgun sequence Length of sequence - 9139 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 1, operones - 1 average op.length - 8.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 74 - 133 8.8 1 1 Op 1 . + CDS 211 - 2910 1396 ## COG1203 Predicted helicases 2 1 Op 2 . + CDS 3008 - 4570 1172 ## ECUMN_3087 hypothetical protein 3 1 Op 3 . + CDS 4567 - 5103 357 ## EC55989_3032 hypothetical protein 4 1 Op 4 . + CDS 5115 - 6170 1145 ## ECIAI1_2861 hypothetical protein 5 1 Op 5 . + CDS 6181 - 6927 495 ## SSON_2906 putative cytoplasmic protein 6 1 Op 6 . + CDS 6909 - 7559 360 ## ECO103_3298 hypothetical protein 7 1 Op 7 . + CDS 7556 - 8479 514 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair 8 1 Op 8 . 
+ CDS 8476 - 8682 184 ## EC55989_3027 hypothetical protein + Term 8698 - 8756 1.4 Predicted protein(s) >gi|296493151|gb|ADTK01000350.1| GENE 1 211 - 2910 1396 899 aa, chain + ## HITS:1 COG:ZygcB KEGG:ns NR:ns ## COG: ZygcB COG1203 # Protein_GI_number: 15803278 # Func_class: R General function prediction only # Function: Predicted helicases # Organism: Escherichia coli O157:H7 EDL933 # 1 898 1 898 899 1762 95.0 0 MRKYPLSLLKDKNIVTFFDFWGKTRRGEKEGGDDYHLLCWHSLDVAAMGYLMVKRNCFGL ADYFRQLGISDKEQAAQFFAWLLCWHDIGKFARSFQQLYLPPELKIQEGARKNYEKISHS TLGYWLWNHYLSECQELLPSSSLSPRKLRRVIEMWMPVTTGHHGRPPDRMDELDNFLPED KAAARDFLLAIKALFPRIEIPAFWDDDEGIELIKHLSWYISATVVLADWTGSSTRFFPRV AQAMDIKHYWQKALVQAQNALTVFPPKAETAPFTGINTLFPFIENPTPLQQKVLDLDISQ PGPQLFILEDVTGAGKTEAALILAHRLIAAGKAQGLFFGLPTMATANAMYDRLVKTWLAF YSPESRPSLVLAHSARTLMDRFNESLWSGDLVGSEEPDEQTFSQGCAAWFADSNKKALLA EIGVGTLDQAMMAVMPFKHNNLRLLGLSNKILLADEIHACDAYMSCILEGLIERQARGGN SVILLSATLSQQQRDKLVAAFARGAEGQQEAPLLGKDDYPWLTHVTKTDVHSHRVATRKE VERSVSVGWLHSEQECIARIESAVSQGKCIAWIRNSVDDAIKVYRQLLARGVIPASSLSL FHSRFAFSDRQRIETETLARFGKEDGSQRAGKVLICTQVLEQSVDCDLDEMISDLAPIDL LIQRAGRLQRHIRDINGQLKRDGKDERSPPELLILAPVWDDSPGDEWFGSAMRNSAFVYP DHGRIWLTQRVLREQGAIQMPHSARLLIESVYGEDVVMPEGFARSEQEQVGKYYCDRARA KKFVLNFRPGYAANINDYLPEKLSTRLAEESVSLWLATCIDGVVKPYATGAHAWEMSVVR VRRSWWKKHRDEFSLLEGDAFRQWCIEQRQDPEMANVILVTDDESCGYSAMEGLTGKVG >gi|296493151|gb|ADTK01000350.1| GENE 2 3008 - 4570 1172 520 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_3087 NR:ns ## KEGG: ECUMN_3087 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 520 3 522 522 1045 97.0 0 MNSFSLLTTPWLPVRFKDGTTGKLAPVDLADENVVDIAAPRADLQGAAWQFLLGLLQTSF APKDHRRWDDIWEDGLEVEKLREALLSLEHAFQFGPDSPSFMQDFEALTGDKVQVASLLP EIPGAQTTKFNKDHFIKRGVTEYLCPHCSALALFSLQLNAPSGGKGYRTGLRGGGPMTTL IELQEYQGNQQTPLWRKMWLNVMPQDEADLPLPKKFDDLVFPWLGPTRTSELAGAVVTDD QVNKLQAYWGMPRRIRIDFNTTTVGNCDICGEQNDALLSLMTTKNYGANYAMWQHPLTPY RVPLKEGGEFYSVKPQPGGLIWRDWLGLIETGKSENNTELPALVVKLFNASSLKQAKVGL 
WGFGYDFDNMKARCWYEHHFPLLLNKKEGQIPKLRLAAQTASRILSLLRSALKEAWFSDP KGARGDFSFVDIDFWNKTQHRFLRFVRQIEEGQDADELLSKWNKEIWLFARQDFDERVFT NPYEPVNLERIMTARKKYFTTSAEKQSAKAAREKKQEAAE >gi|296493151|gb|ADTK01000350.1| GENE 3 4567 - 5103 357 178 aa, chain + ## HITS:1 COG:no KEGG:EC55989_3032 NR:ns ## KEGG: EC55989_3032 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 178 1 178 178 320 100.0 1e-86 MSIVKEEHKATLRKWHEELQEKRGERASLRRSTTVNDVCLTDGFRLFLKNRQIKWQDEPE WRITALALIAAVSANVKAIDERQPFAAQLAAVMSKGRFTRLSAVKTPDELLRQLRRAVRL LNGSVNLDSLAEGVFRWCQESDDLLNHHRRQQRPTEFIRIRWALEYYQAGDADNEQNQ >gi|296493151|gb|ADTK01000350.1| GENE 4 5115 - 6170 1145 351 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_2861 NR:ns ## KEGG: ECIAI1_2861 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 351 1 351 351 661 98.0 0 MTTFIQLHLLTAYPAANLNRDDTGAPKTVVLGGATRLRVSSQSLKRAWRISALFEQALAG HIGIRSGRIAREAATILIEKGIEEKKAIEWAAKIADYLGKAKNDKKPKDPLTNAETEQLV HISPAEFDAVKALAHQLAEEKRAPKEEDLALLRKDRMAVDIAMFGRMLANKPEFNVEAAC QVAHAFGVSETIVEDDFFTAVDDLRQASEDAGAGHLGETGFGSALFYTYICIDKDLLVEN LGGDEALANQTIRAFTEAALKVSPTGKQNSFASRAYASWAMAEKGTEQPRSLAAAFYEPI NGTRQLEVAVQRITTLRENMNTVYEQKTECASFDVMNKQGSMKDVLDFICA >gi|296493151|gb|ADTK01000350.1| GENE 5 6181 - 6927 495 248 aa, chain + ## HITS:1 COG:no KEGG:SSON_2906 NR:ns ## KEGG: SSON_2906 # Name: not_defined # Def: putative cytoplasmic protein # Organism: S.sonnei # Pathway: not_defined # 1 248 1 248 248 476 98.0 1e-133 MSQYLIFQLHGPMASWGVDAPGEVRHTHELPSRSALLGLLAAGVGIRRDDTERLNAFNRH YSLVVCASRNPRWARDYHTIQMPKEVRKARYFSRREELSDPDLLSAIISRRDYYTDAWWM VAVATTADAPYSLEQLQDGLRHPVFPLYLGRKSHPLALPLAPLLLEGNACDALCNAYQQY QDHFHKLKVSLPKLQDECWWEGEHDGLVASKILRRRDVPLNRQQWLFGERTINQGPWLSK EEPCTSQE >gi|296493151|gb|ADTK01000350.1| GENE 6 6909 - 7559 360 216 aa, chain + ## HITS:1 COG:no KEGG:ECO103_3298 NR:ns ## KEGG: ECO103_3298 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 216 1 216 216 417 98.0 
1e-115 MYLSRITLHTGQLSPAQLLHLVDRGEYVMHQWLWDLFPGGKERQFLYRREELQGAFRFFV LSQERPAESDTFTIECRSFAPELRTGQQLCFNLRANPTICKSGKRHDLLMEAKRQVRGQA EGSDVWLHQQQAALDWLAAQGERSGFTLLDTSVDAYRQQQLRRENSRQLIQFSSVDYTGM LTVTDPGLFLQRLSQGYGKSRAFGCGLMLIKPGAEA >gi|296493151|gb|ADTK01000350.1| GENE 7 7556 - 8479 514 307 aa, chain + ## HITS:1 COG:ECs3609 KEGG:ns NR:ns ## COG: ECs3609 COG1518 # Protein_GI_number: 15832863 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Escherichia coli O157:H7 # 1 307 1 307 307 599 99.0 1e-171 MTFVPLSPIPLKDRTSMIFLQYGQIDVLDGAFVLIDKTGIRTHIPVGSVACIMLEPGTRV SHAAVHLAATVGTLLVWVGEAGVRVYSSGQPGGARADKLLYQAKLALTEDLRLKVVRKMY ELRFREPPPARRSVEQLRGIEGSRVRQTYALLAKQYGVKWNGRKYDPKDWEKGDVVNRCI SAATSCLYGISEAAVLAAGYAPAIGFIHSGKPLSFVYDIADIIKFDSVVPKAFEIAARQP AEPDKEVRLACRDIFRSTKLTGKLIPLIEEVLAAGEIEPPQPAPDMLPPAIPEPETLGDS GHRGRGG >gi|296493151|gb|ADTK01000350.1| GENE 8 8476 - 8682 184 68 aa, chain + ## HITS:1 COG:no KEGG:EC55989_3027 NR:ns ## KEGG: EC55989_3027 # Name: ygbF # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 68 1 68 97 128 100.0 5e-29 MSMVVVVTENVPPRLRGRLAIWLLEVRAGVYVGDTSKRIREMIWQQITQLAGCGNVVMAW ATNTESGF Prediction of potential genes in microbial genomes Time: Mon May 16 16:09:00 2011 Seq name: gi|296493150|gb|ADTK01000351.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.18, whole genome shotgun sequence Length of sequence - 48678 bp Number of predicted genes - 50, with homology - 50 Number of transcription units - 19, operones - 10 average op.length - 4.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 296 - 328 -0.3 1 1 Tu 1 . 
- CDS 354 - 1391 656 ## COG2234 Predicted aminopeptidases - Prom 1425 - 1484 3.9 + Prom 1406 - 1465 4.2 2 2 Op 1 18/0.000 + CDS 1643 - 2551 1035 ## COG0175 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes 3 2 Op 2 7/0.000 + CDS 2553 - 3980 1524 ## COG2895 GTPases - Sulfate adenylate transferase subunit 1 4 2 Op 3 . + CDS 3980 - 4585 566 ## COG0529 Adenylylsulfate kinase and related kinases 5 2 Op 4 . + CDS 4635 - 4958 487 ## EcSMS35_2875 hypothetical protein + Prom 4967 - 5026 3.4 6 3 Op 1 11/0.000 + CDS 5152 - 5463 205 ## COG2919 Septum formation initiator 7 3 Op 2 19/0.000 + CDS 5482 - 6192 315 ## PROTEIN SUPPORTED gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 8 3 Op 3 8/0.000 + CDS 6192 - 6671 759 ## COG0245 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase 9 3 Op 4 8/0.000 + CDS 6668 - 7717 971 ## COG0585 Uncharacterized conserved protein 10 3 Op 5 13/0.000 + CDS 7698 - 8459 655 ## COG0496 Predicted acid phosphatase 11 3 Op 6 11/0.000 + CDS 8453 - 9079 503 ## COG2518 Protein-L-isoaspartate carboxylmethyltransferase + Term 9131 - 9189 -0.7 + Prom 9130 - 9189 2.1 12 3 Op 7 8/0.000 + CDS 9219 - 10358 624 ## COG0739 Membrane proteins related to metalloendopeptidases 13 3 Op 8 . + CDS 10421 - 11413 1191 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) + Term 11432 - 11467 6.5 - Term 11463 - 11494 3.4 14 4 Op 1 1/0.875 - CDS 11507 - 12871 1231 ## COG2610 H+/gluconate symporter and related permeases 15 4 Op 2 4/0.500 - CDS 12960 - 13736 672 ## COG3622 Hydroxypyruvate isomerase 16 4 Op 3 6/0.000 - CDS 13741 - 14379 484 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 17 4 Op 4 5/0.000 - CDS 14376 - 15638 1040 ## COG3395 Uncharacterized protein conserved in bacteria 18 4 Op 5 . 
- CDS 15635 - 16543 914 ## COG2084 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases - Prom 16625 - 16684 4.8 + Prom 16529 - 16588 5.8 19 5 Tu 1 . + CDS 16739 - 17506 531 ## COG1349 Transcriptional regulators of sugar metabolism + Term 17528 - 17570 4.5 - Term 17496 - 17544 0.0 20 6 Tu 1 . - CDS 17557 - 18213 413 ## COG0639 Diadenosine tetraphosphatase and related serine/threonine protein phosphatases - Prom 18235 - 18294 8.0 21 7 Op 1 . - CDS 18319 - 20880 3142 ## COG0249 Mismatch repair ATPase (MutS family) 22 7 Op 2 . - CDS 20877 - 21110 98 ## gi|300922251|ref|ZP_07138377.1| conserved domain protein - Prom 21196 - 21255 6.0 + Prom 20880 - 20939 5.3 23 8 Tu 1 . + CDS 21167 - 21520 334 ## ECO103_3270 hypothetical protein + Term 21523 - 21560 7.8 24 9 Op 1 . - CDS 21557 - 22183 768 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains 25 9 Op 2 3/0.500 - CDS 22226 - 23620 1371 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains - Term 23641 - 23669 3.0 26 10 Op 1 4/0.500 - CDS 23709 - 24719 1114 ## COG0309 Hydrogenase maturation factor 27 10 Op 2 13/0.000 - CDS 24716 - 25837 1059 ## COG0409 Hydrogenase maturation factor 28 10 Op 3 8/0.000 - CDS 25837 - 26109 348 ## COG0298 Hydrogenase maturation factor 29 10 Op 4 11/0.000 - CDS 26100 - 26972 818 ## COG0378 Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase 30 10 Op 5 . - CDS 26976 - 27326 158 ## COG0375 Zn finger protein HypA/HybF (possibly regulating hydrogenase expression) - Prom 27442 - 27501 3.2 + Prom 27313 - 27372 3.2 31 11 Tu 1 . 
+ CDS 27538 - 27999 433 ## ECO103_3263 formate hydrogenlyase regulatory protein HycA + Term 28048 - 28098 10.1 32 12 Op 1 4/0.500 + CDS 28124 - 28735 317 ## COG1142 Fe-S-cluster-containing hydrogenase components 2 33 12 Op 2 10/0.000 + CDS 28732 - 30558 1971 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit 34 12 Op 3 7/0.000 + CDS 30561 - 31484 1390 ## COG0650 Formate hydrogenlyase subunit 4 35 12 Op 4 5/0.000 + CDS 31502 - 33211 2123 ## COG3261 Ni,Fe-hydrogenase III large subunit 36 12 Op 5 6/0.000 + CDS 33221 - 33763 470 ## COG1143 Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) 37 12 Op 6 . + CDS 33763 - 34530 839 ## COG3260 Ni,Fe-hydrogenase III small subunit 38 12 Op 7 . + CDS 34527 - 34937 499 ## B21_02533 hypothetical protein 39 12 Op 8 . + CDS 34963 - 35400 585 ## COG0680 Ni,Fe-hydrogenase maturation factor 40 12 Op 9 . + CDS 35425 - 36186 595 ## SSON_2863 putative periplasmic or exported protein + Prom 36236 - 36295 4.2 41 13 Op 1 6/0.000 + CDS 36360 - 36671 139 ## COG4680 Uncharacterized protein conserved in bacteria 42 13 Op 2 . + CDS 36668 - 37087 381 ## COG5499 Predicted transcription regulator containing HTH domain + Term 37114 - 37151 3.5 - Term 36935 - 36996 3.4 43 14 Op 1 8/0.000 - CDS 37201 - 38625 1505 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 44 14 Op 2 . - CDS 38634 - 40091 1735 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific - Prom 40341 - 40400 2.8 + Prom 40181 - 40240 5.9 45 15 Tu 1 . + CDS 40351 - 41361 918 ## COG1609 Transcriptional regulators + Term 41376 - 41415 3.0 + Prom 41413 - 41472 3.6 46 16 Tu 1 . + CDS 41510 - 42037 410 ## COG1142 Fe-S-cluster-containing hydrogenase components 2 + Prom 42040 - 42099 1.9 47 17 Tu 1 . 
+ CDS 42190 - 44442 1605 ## COG0068 Hydrogenase maturation factor + Term 44517 - 44561 4.1 48 18 Op 1 5/0.000 - CDS 44570 - 45703 1025 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 49 18 Op 2 . - CDS 45700 - 47139 1376 ## COG0426 Uncharacterized flavoproteins - Prom 47217 - 47276 4.3 + Prom 47176 - 47235 9.7 50 19 Tu 1 . + CDS 47326 - 48676 1056 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains Predicted protein(s) >gi|296493150|gb|ADTK01000351.1| GENE 1 354 - 1391 656 345 aa, chain - ## HITS:1 COG:iap KEGG:ns NR:ns ## COG: iap COG2234 # Protein_GI_number: 16130660 # Func_class: R General function prediction only # Function: Predicted aminopeptidases # Organism: Escherichia coli K12 # 1 345 1 345 345 696 99.0 0 MFSALRHRTAALALGVCFILPVHASSPKPGDFANTQARHIATFFPGRMTGTPAEMLSADY IRQQFQQMGYRSDIRTFNSRYIYTARDNRKSWHNVTGSTVIAAHEGKAPQQIIIMAHLDT YAPLSDADADANLGGLTLQGMDDNAAGLGVMLELAERLKNTPTEYGIRFVATSGEEEGKL GAENLLKRMSDTEKKNTLLVINLDNLIVGDKLYFNSGVKTPEAVRKLTRDRALAIARSHG IAATTNPGLNKNYPKGTGCCNDAEIFDKAGIAVLSVEATNWNLGNKDGYQQRAKTAAFPA GNSWHDVRLDNQQHIDKALPGRIERRCRDVMRIMLPLVKELAKAS >gi|296493150|gb|ADTK01000351.1| GENE 2 1643 - 2551 1035 302 aa, chain + ## HITS:1 COG:cysD KEGG:ns NR:ns ## COG: cysD COG0175 # Protein_GI_number: 16130659 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes # Organism: Escherichia coli K12 # 1 302 1 302 302 626 100.0 1e-179 MDQIRLTHLRQLEAESIHIIREVAAEFSNPVMLYSIGKDSSVMLHLARKAFYPGTLPFPL LHVDTGWKFREMYEFRDRTAKAYGCELLVHKNPEGVAMGINPFVHGSAKHTDIMKTEGLK QALNKYGFDAAFGGARRDEEKSRAKERIYSFRDRFHRWDPKNQRPELWHNYNGQINKGES IRVFPLSNWTEQDIWQYIWLENIDIVPLYLAAERPVLERDGMLMMIDDNRIDLQPGEVIK KRMVRFRTLGCWPLTGAVESNAQTLPEIIEEMLVSTTSERQGRVIDRDQAGSMELKKRQG YF >gi|296493150|gb|ADTK01000351.1| GENE 3 2553 - 3980 1524 475 aa, chain + ## HITS:1 COG:cysN KEGG:ns NR:ns ## COG: cysN COG2895 
# Protein_GI_number: 16130658 # Func_class: P Inorganic ion transport and metabolism # Function: GTPases - Sulfate adenylate transferase subunit 1 # Organism: Escherichia coli K12 # 1 475 1 475 475 943 99.0 0 MNTALAQQIANEGGVEAWMIAQQHKSLLRFLTCGSVDDGKSTLIGRLLHDTRQIYEDQLS SLHNDSKRHGTQGEKLDLALLVDGLQAEREQGITIDVAYRYFSTEKRKFIIADTPGHEQY TRNMATGASTCELAILLIDARKGVLDQTRRHSFISTLLGIKHLVVAINKMDLVDYSEKTF TRIREDYLTFAGQLPGNLDIRFVPLSALEGDNVASQSESMAWYSGPTLLEVLETVEIQRV VDAQPMRFPVQYVNRPNLDFRGYAGTLASGRVEVGQRVKVLPSGVESNVARIVTFDGDRE EAFAGEAITLVLTDEIDISRGDLLLAADEALPAVQSASVDVVWMAEQPLSPGQSYDIKIA GKKTRARVDGIRYQVDINNLTQREVENLPLNGIGLVDLTFDEPLVLDRYQQNPVTGGLIF IDRLSNVTVGAGMVHEPVSQATAAPSEFSAFELELNALVRRHFPHWGARDLLGDK >gi|296493150|gb|ADTK01000351.1| GENE 4 3980 - 4585 566 201 aa, chain + ## HITS:1 COG:ECs3604 KEGG:ns NR:ns ## COG: ECs3604 COG0529 # Protein_GI_number: 15832858 # Func_class: P Inorganic ion transport and metabolism # Function: Adenylylsulfate kinase and related kinases # Organism: Escherichia coli O157:H7 # 1 201 1 201 201 397 100.0 1e-111 MALHDENVVWHSHPVTVQQRELHHGHRGVVLWFTGLSGSGKSTVAGALEEALHKLGVSTY LLDGDNVRHGLCSDLGFSDADRKENIRRVGEVANLMVEAGLVVLTAFISPHRAERQMVRE RVGEGRFIEVFVDTPLAICEARDPKGLYKKARAGELRNFTGIDSVYEAPESAEIHLNGEQ LVTNLVQQLLDLLRQNDIIRS >gi|296493150|gb|ADTK01000351.1| GENE 5 4635 - 4958 487 107 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_2875 NR:ns ## KEGG: EcSMS35_2875 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 107 1 107 107 176 100.0 2e-43 MRNSHNITLTNNDSLTEDEETTWSLPGAVVGFISWLFALAMPMLIYGSNTLFFFIYTWPF FLALMPVAVVVGIALHSLMDGKLRYSIVFTLVTVGIMFGALFMWLLG >gi|296493150|gb|ADTK01000351.1| GENE 6 5152 - 5463 205 103 aa, chain + ## HITS:1 COG:ECs3602 KEGG:ns NR:ns ## COG: ECs3602 COG2919 # Protein_GI_number: 15832856 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation initiator # Organism: Escherichia coli O157:H7 # 16 103 16 103 103 164 100.0 4e-41 
MGKLTLLLLAILVWLQYSLWFGKNGIHDYTRVNDDVAAQQATNAKLKARNDQLFAEIDDL NGGQEALEERARNELSMTRPGETFYRLVPDASKRAQSAGQNNR >gi|296493150|gb|ADTK01000351.1| GENE 7 5482 - 6192 315 236 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 [Bacillus selenitireducens MLS10] # 10 227 6 223 234 125 37 4e-28 MATTHLDVCAVVPAAGFGRRMQTECPKQYLSIGNQTILEHSVHALLAHPRVKRVVIAISP GDSRFAQLPLANHPQITVVDGGDERADSVLAGLKAAGDAQWVLVHDAARPCLHQDDLARL LALSETSRTGGILAAPVRDTMKRAEPGKNAIAHTVDRNGLWHALTPQFFPRELLHDCLTR ALNEGATITDEASALEYCGFHPQLVEGRADNIKVTRPEDLALAEFYLTRTIHQENT >gi|296493150|gb|ADTK01000351.1| GENE 8 6192 - 6671 759 159 aa, chain + ## HITS:1 COG:ECs3600 KEGG:ns NR:ns ## COG: ECs3600 COG0245 # Protein_GI_number: 15832854 # Func_class: I Lipid transport and metabolism # Function: 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase # Organism: Escherichia coli O157:H7 # 1 159 1 159 159 288 100.0 3e-78 MRIGHGFDVHAFGGEGPIIIGGVRIPYEKGLLAHSDGDVALHALTDALLGAAALGDIGKL FPDTDPAFKGADSRELLREAWRRIQAKGYTLGNVDVTIIAQAPKMLPHIPQMRVFIAEDL GCHMDDVNVKATTTEKLGFTGRGEGIACEAVALLIKATK >gi|296493150|gb|ADTK01000351.1| GENE 9 6668 - 7717 971 349 aa, chain + ## HITS:1 COG:ygbO KEGG:ns NR:ns ## COG: ygbO COG0585 # Protein_GI_number: 16130652 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 349 1 349 349 696 100.0 0 MIEFDNLTYLHGKPQGTGLLKANPEDFVVVEDLGFEPDGEGEHILVRILKNGCNTRFVAD ALAKFLKIHAREVSFAGQKDKHAVTEQWLCARVPGKEMPDLSAFQLEGCQVLEYARHKRK LRLGALKGNAFTLVLREVSNRDDVEQRLIDICVKGVPNYFGAQRFGIGGSNLQGAQRWAQ TNTPVRDRNKRSFWLSAARSALFNQIVAERLKKADVNQVVDGDALQLAGRGSWFVATTEE LAELQRRVNDKELMITAALPGSGEWGTQREALAFEQAAVAAETELQALLVREKVEAARRA MLLYPQQLSWNWWDDVTVEIRFWLPAGSFATSVVRELINTTGDYAHIAE >gi|296493150|gb|ADTK01000351.1| GENE 10 7698 - 8459 655 253 aa, chain + ## HITS:1 COG:ECs3598 KEGG:ns NR:ns ## COG: ECs3598 COG0496 # Protein_GI_number: 15832852 # Func_class: R General function prediction only # Function: Predicted acid phosphatase # Organism: Escherichia 
coli O157:H7 # 1 253 1 253 253 505 100.0 1e-143 MRILLSNDDGVHAPGIQTLAKALREFADVQVVAPDRNRSGASNSLTLESSLRTFTFENGD IAVQMGTPTDCVYLGVNALMRPRPDIVVSGINAGPNLGDDVIYSGTVAAAMEGRHLGFPA LAVSLDGHKHYDTAAAVTCSILRALCKEPLRTGRILNINVPDLPLDQIKGIRVTRCGTRH PADQVIPQQDPRGNTLYWIGPPGGKCDAGPGTDFAAVDEGYVSITPLHVDLTAHSAQDVV SDWLNSVGVGTQW >gi|296493150|gb|ADTK01000351.1| GENE 11 8453 - 9079 503 208 aa, chain + ## HITS:1 COG:ECs3597 KEGG:ns NR:ns ## COG: ECs3597 COG2518 # Protein_GI_number: 15832851 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Protein-L-isoaspartate carboxylmethyltransferase # Organism: Escherichia coli O157:H7 # 1 208 1 208 208 397 100.0 1e-111 MVSRRVQALLDQLRAQGIQDEQVLNALAAVPREKFVDEAFEQKAWDNIALPIGQGQTISQ PYMVARMTELLELTPQSRVLEIGTGSGYQTAILAHLVQHVCSVERIKGLQWQARRRLKNL DLHNVSTRHGDGWQGWQARAPFDAIIVTAAPPEIPTALMTQLDEGGILVLPVGEEHQYLK RVRRRGGEFIIDTVEAVRFVPLVKGELA >gi|296493150|gb|ADTK01000351.1| GENE 12 9219 - 10358 624 379 aa, chain + ## HITS:1 COG:nlpD KEGG:ns NR:ns ## COG: nlpD COG0739 # Protein_GI_number: 16130649 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Escherichia coli K12 # 1 379 1 379 379 553 100.0 1e-157 MSAGSPKFTVRRIAALSLVSLWLAGCSDTSNPPAPVSSVNGNAPANTNSGMLITPPPKMG TTSTAQQPQIQPVQQPQIQATQQPQIQPVQPVAQQPVQMENGRIVYNRQYGNIPKGSYSG STYTVKKGDTLFYIAWITGNDFRDLAQRNNIQAPYALNVGQTLQVGNASGTPITGGNAIT QADAAEQGVVIKPAQNSTVAVASQPTITYSESSGEQSANKMLPNNKPTATTVTAPVTVPT ASTTEPTVSSTSTSTPISTWRWPTEGKVIETFGASEGGNKGIDIAGSKGQAIIATADGRV VYAGNALRGYGNLIIIKHNDDYLSAYAHNDTMLVREQQEVKAGQKIATMGSTGTSSTRLH FEIRYKGKSVNPLRYLPQR >gi|296493150|gb|ADTK01000351.1| GENE 13 10421 - 11413 1191 330 aa, chain + ## HITS:1 COG:ECs3595 KEGG:ns NR:ns ## COG: ECs3595 COG0568 # Protein_GI_number: 15832849 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Escherichia coli O157:H7 # 1 330 1 330 330 576 100.0 1e-164 
MSQNTLKVHDLNEDAEFDENGVEVFDEKALVEEEPSDNDLAEEELLSQGATQRVLDATQL YLGEIGYSPLLTAEEEVYFARRALRGDVASRRRMIESNLRLVVKIARRYGNRGLALLDLI EEGNLGLIRAVEKFDPERGFRFSTYATWWIRQTIERAIMNQTRTIRLPIHIVKELNVYLR TARELSHKLDHEPSAEEIAEQLDKPVDDVSRMLRLNERITSVDTPLGGDSEKALLDILAD EKENGPEDTTQDDDMKQSIVKWLFELNAKQREVLARRFGLLGYEAATLEDVGREIGLTRE RVRQIQVEGLRRLREILQTQGLNIEALFRE >gi|296493150|gb|ADTK01000351.1| GENE 14 11507 - 12871 1231 454 aa, chain - ## HITS:1 COG:ygbN KEGG:ns NR:ns ## COG: ygbN COG2610 # Protein_GI_number: 16130647 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Escherichia coli K12 # 1 454 1 454 454 685 99.0 0 MSTITLLCIALTGVIMLLLLVIKAKVQPFVALLLVSLLVALAAGIPAGEVGKVMIAGMGG VLGSVTIIIGLGAMLGRMIEHSGGAESLANYFSRKLGDKRTIAALTLAAFFLGIPVFFDV GFIILAPIIYGFAKVAKISPLKFGLPVAGIMLTVHVAVPPHPGPVAAAGLLHADIGWLTI IGIAISIPVGIVGYFAAKIINKHQYAMSVEVLEQMQLAPASEEGATKLSDKINPPGVALV TSLIVIPIAIIMAGTVSATLMPPSHPLLGTLQLIGSPMVALMIALVLAFWLLALRRGWSL QHTSDIMGSALPTAAVVILVTGAGGVFGKVLVESGVGKALANMLQMIDLPLLPAAFIISL ALRASQGSATVAILTTGGLLSEAVMGLNPIQCVLVTLAACFGGLGASHINDSGFWIVTKY LGLSVADGLKTWTVLTTILGFTGFLITWCVWAVI >gi|296493150|gb|ADTK01000351.1| GENE 15 12960 - 13736 672 258 aa, chain - ## HITS:1 COG:ygbM KEGG:ns NR:ns ## COG: ygbM COG3622 # Protein_GI_number: 16130646 # Func_class: G Carbohydrate transport and metabolism # Function: Hydroxypyruvate isomerase # Organism: Escherichia coli K12 # 1 258 1 258 258 522 98.0 1e-148 MPRFAANLSMMFTEVPFIERFAAARKAGFDAVEFLFPYDYSTLQIQKQLEQNHLTLALFN TAPGDINAGEWGLSALPGREHEAHADIDLALEYALALNCEQVHVMAGVVPAGEDAERYRA VFIDNLRYAADRFALHGKRILVEALSPDVKPHYLFSSQYQALAIVEEVARDNVFIQLDTF HAQKVDGNLTHLIRDYAGKYAHVQIAGLPDRHEPDDGEINYPWLFRLFDEVGYQGWIGCE YKPRGLTEEGLGWFDAWR >gi|296493150|gb|ADTK01000351.1| GENE 16 13741 - 14379 484 212 aa, chain - ## HITS:1 COG:ygbL KEGG:ns NR:ns ## COG: ygbL COG0235 # Protein_GI_number: 16130645 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 
4-epimerase and related epimerases and aldolases # Organism: Escherichia coli K12 # 1 212 1 212 212 432 100.0 1e-121 MSDFAKVEQSLREEMTRIASSFFQRGYATGSAGNLSLLLPDGNLLATPTGSCLGNLDPQR LSKVAADGEWLSGDKPSKEVLFHLALYRNNPRCKAVVHLHSTWSTALSCLQGLDSSNVIR PFTPYVVMRMGNVPLVPYYRPGDKRIAQDLAELAADNQAFLLANHGPVVCGESLQEAANN MEELEETAKLIFILGDRPIRYLTAGEIAELRS >gi|296493150|gb|ADTK01000351.1| GENE 17 14376 - 15638 1040 420 aa, chain - ## HITS:1 COG:ygbK KEGG:ns NR:ns ## COG: ygbK COG3395 # Protein_GI_number: 16130644 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 385 1 385 388 734 98.0 0 MIKIGVIADDFTGATDIASFLVENGLPTVQINGVPTGKMPEAIDALVISLKTRSCPVVEA TQQSLAALSWLQQQGCKQIYFKYCSTFDSTAKGNIGPVTDALMDALDTPFTVFSPALPVN GRTVYQGYLFVMNQLLAESGMRHHPVNPMTDSYLPRLVEAQSTGRCGVVSAHVFEQGVDA VRQELARLQQEGYRYAVLDALTEHHLEIQGEALRDAPLVTGGSGLAIGLARQWAQENGNQ AREAGHPLAGRGVVLSGSCSQMTNRQVAHYRQIAPAREVDVARCLSTETLAAYAHELAEW VLGQESVLAPLVFATASTDALAAIQQQYGAQKASQAVETLFSQLAARLAAEGVTRFIVAG GETSGVVTQSLGIKGFHIGPTISPGVPWVNALDKPVSLALKSGNFGDEAFFSRAQREFLS >gi|296493150|gb|ADTK01000351.1| GENE 18 15635 - 16543 914 302 aa, chain - ## HITS:1 COG:ygbJ KEGG:ns NR:ns ## COG: ygbJ COG2084 # Protein_GI_number: 16130643 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases # Organism: Escherichia coli K12 # 1 302 1 302 302 493 99.0 1e-139 MKTGSEFHVGIVGLGSMGMGAALSCVRAGLSTWGADLNSNACATLKEAGACGVSDNAATF AEKLDALLVLVVNAAQVKQVLFGETGVAQHLKPGTAVMVSSTIASADAQEIATALAGFDL EMLDAPVSGGAVKAANGEMTVMASGSDIAFERLAPVLEAVAGKVYRIGAEPGLGSTVKII HQLLAGVHIAAGAEAMALAARAGIPLDVMYDVVTNAAGNSWMFENRMRHVVDGDYTPHSA VDIFVKDLGLVADTAKALHFPLPLASTALNMFTSASNAGYGKEDDSAVIKIFSGITLPGA KS >gi|296493150|gb|ADTK01000351.1| GENE 19 16739 - 17506 531 255 aa, chain + ## HITS:1 COG:ygbI KEGG:ns NR:ns ## COG: ygbI COG1349 # Protein_GI_number: 16130642 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: 
Transcriptional regulators of sugar metabolism # Organism: Escherichia coli K12 # 1 255 11 265 265 459 98.0 1e-129 MIPVERRQIILEMVAEKGIVSIAELTDRMNVSHMTIRRDLQKLEQQGAVVLVSGGVQSPG RVAHEPSHQVKTALAMTQKAAIGKLAASLVQPGSCIYLDAGTTTLAIAQHLIHMESLTVV TNDFVIANYLLDNSNCTIIHTGGAVCRGNRSCVGEAAATMLRSLMIDQAFISASSWSVRG ISTPAEDKVTVKRAIASASRQRVLVCDATKYGQVATWLALPLSEFDQIITDDGLPESASR ALAKQDLSLLVAKNE >gi|296493150|gb|ADTK01000351.1| GENE 20 17557 - 18213 413 218 aa, chain - ## HITS:1 COG:ECs3590 KEGG:ns NR:ns ## COG: ECs3590 COG0639 # Protein_GI_number: 15832844 # Func_class: T Signal transduction mechanisms # Function: Diadenosine tetraphosphatase and related serine/threonine protein phosphatases # Organism: Escherichia coli O157:H7 # 1 218 1 218 218 431 95.0 1e-121 MPSTRYQKINAHHYRHIWVVGDIHGEYQLLQSRLHQLSFYPETDLLISTGDNIDRGPKSL NVLRLLNQPWFTSVKGNHEAMALDAFETGDGNMWLASGGDWFFDLNDSEQQEAIDLLLKF HHLPHIIEITNDNIKYVIAHADYPGSEYLFGKEIAESELLWPVDRVQKSLNGELQQINGA DYFIFGHMMFDNIQTFANQIYIDTGTPKSGRLSFYKIK >gi|296493150|gb|ADTK01000351.1| GENE 21 18319 - 20880 3142 853 aa, chain - ## HITS:1 COG:ECs3589 KEGG:ns NR:ns ## COG: ECs3589 COG0249 # Protein_GI_number: 15832843 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Escherichia coli O157:H7 # 1 853 1 853 853 1610 99.0 0 MSAIENFDAHTPMMQQYLRLKAQHPEILLFYRMGDFYELFYDDAKRASQLLDISLTKRGA SAGEPIPMAGIPYHAVENYLAKLVNQGESVAICEQIGDPATSKGPVERKVVRIVTPGTIS DEALLQERQDNLLAAIWQDSKGFGYATLDISSGRFRLSEPADRETMAAELQRTNPAELLY AEDFAEMSLIEGRRGLRRRPLWEFEIDTARQQLNLQFGTRDLVGFGVENAPRGLCAAGCL LQYAKDTQRTTLPHIRSITMEREQDSIIMDAATRRNLEITQNLAGGAENTLASVLDCPVT PMGSRMLKRWLHMPVRDTRVLLERQQTIGALQDFTAELQPVLRQVGDLERILARLALRTA RPRDLARMRHAFQQLPELRAQLETVDSAPVQALREKMGEFAELRDLLERAIIDTPPVLVR DGGVIASGYNEELDEWRALADGATDYLERLEVRERERTGLDTLKVGFNAVHGYYIQISRG QSHLAPINYMRRQTLKNAERYIIPELKEYEDKVLTSKGKALALEKQLYEELFDLLLPHLE ALQQSASALAELDVLVNLAERAYTLNYTCPTFIDKPGIRITEGRHPVVEQVLNEPFIANP LNLSPQRRMLIITGPNMGGKSTYMRQTALIALMAYIGSYVPAQKVEIGPIDRIFTRVGAA 
DDLASGRSTFMVEMTETANILHNATEYSLVLMDEIGRGTSTYDGLSLAWACAENLANKIK ALTLFATHYFELTQLPEKMEGVANVHLDALEHGDTIAFMHSVQDGAASKSYGLAVAALAG VPKEVIKRARQKLRELESISPNAAATQVDGTQMSLLSVPEETSPAVEALENLDPDSLTPR QALEWIYRLKSLV >gi|296493150|gb|ADTK01000351.1| GENE 22 20877 - 21110 98 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|300922251|ref|ZP_07138377.1| ## NR: gi|300922251|ref|ZP_07138377.1| conserved domain protein [Escherichia coli MS 182-1] # 1 77 1 77 77 142 98.0 7e-33 MALKEEVYSTDAILRTCFISITQKLQNSIFPGTSIKNSPFASSPEMINSGIMCALCDYNE NKNHHTPFNIREPDITP >gi|296493150|gb|ADTK01000351.1| GENE 23 21167 - 21520 334 117 aa, chain + ## HITS:1 COG:no KEGG:ECO103_3270 NR:ns ## KEGG: ECO103_3270 # Name: ygbA # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 117 1 117 117 200 100.0 1e-50 MSGKRISREKLTIKKMIDLYQAKCPQASAEPEHYEALFVYAQKRLDKCVFGEEKPACKQC PVHCYQPAKREEMKQIMRWAGPRMLWRHPILTVRHLIDDKRPVPELPEKYRPKKTRE >gi|296493150|gb|ADTK01000351.1| GENE 24 21557 - 22183 768 208 aa, chain - ## HITS:1 COG:ECs3587 KEGG:ns NR:ns ## COG: ECs3587 COG3604 # Protein_GI_number: 15832841 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Escherichia coli O157:H7 # 1 208 485 692 692 399 99.0 1e-111 MPLELQPKLLRVLQEQEFERLGSNKIIQTDVRLIAATNRDLKKMVADREFRSDLYYRLNV FPIHLPPLRERPEDIPLLAKAFTFKIARRLGRNIDSIPAETLRTLSNMEWPGNVRELENV IERAVLLTRGNVLQLSLPDIVLPEPETPPAATVVAQEGEDEYQLIVRVLKETNGVVAGPK GAAQRLGLKRTTLLSRMKRLGIDKSALI >gi|296493150|gb|ADTK01000351.1| GENE 25 22226 - 23620 1371 464 aa, chain - ## HITS:1 COG:ECs3587 KEGG:ns NR:ns ## COG: ECs3587 COG3604 # Protein_GI_number: 15832841 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Escherichia coli O157:H7 # 1 464 6 469 692 939 99.0 0 MSDLGQQGLFDITRTLLQQPDLASLCEALSQLVKRSALADNAAIVLWQAQTQRTSYYASR 
EKDTPIKYEDETVLAHGPVRSILSRPDTLHCSYEEFCETWPQLAAGGLYPKFGHYCLMPL AAEGHIFGGCEFIRYDDRPWSEKEFNRLQTFTQIVSVVTEQIQSRVVNNVDYELLCRERD NFRILVAITNAVLSRLDMDELVSEVAKEIHYYFDIDDISIVLRSHRKNKLNIYSTHYLDK QHPAHEQSEVDEAGTLTERVFKSKEMLLINLHERDDLAPYERMLFDTWGNQIQTLCLLPL MSGDTMLGVLKLAQCEEKVFTTTNLNLLRQIAERVAIAVDNALAYQEIHRLKERLVDENL ALTEQLNNVDSEFGEIIGRSEAIYSVLKQVEMVAQSDSTVLILGETGTGKELIARAIHNL SGRNNRRMVKMNCAAMPAGLLESDLFGHERGAFTGASAQRIGRF >gi|296493150|gb|ADTK01000351.1| GENE 26 23709 - 24719 1114 336 aa, chain - ## HITS:1 COG:hypE KEGG:ns NR:ns ## COG: hypE COG0309 # Protein_GI_number: 16130637 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Escherichia coli K12 # 15 336 1 322 322 609 99.0 1e-174 MNNIQLAHGSGGQAMQQLINSLFMEAFANPWLAEQEDQARLDLAQLVAEGDRLAFSTDSY VIDPLFFPGGNIGKLAICGTANDVAVSGAIPRYLSCGFILEEGLPMETLKAVVTSMAETA RTAGIAIVTGDTKVVQRGAVDKLFINTAGMGAIPANIHWGAQTLTAGDVLLVSGTLGDHG ATILNLREQLGLDGELVSDCAVLTPLIQTLRDIPGVKALRDATRGGVNAVVHEFAAACGC GIELSEAALPVKPAVRGVCELLGLDALNFANEGKLVIAVERNAAEQVLAALHSHPLGKDA ALIGEVVERKGVRLAGLYGVKRTLDLPHAEPLPRIC >gi|296493150|gb|ADTK01000351.1| GENE 27 24716 - 25837 1059 373 aa, chain - ## HITS:1 COG:hypD KEGG:ns NR:ns ## COG: hypD COG0409 # Protein_GI_number: 16130636 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Escherichia coli K12 # 1 373 1 373 373 775 99.0 0 MRFVDEYRAPEQVMQLIEHLRERASHLSYTAERPLRIMEVCGGHTHAIFKFGLDQLLPEN VEFIHGPGCPVCVLPMGRIDTCVEIASHPEVIFCTFGDAMRVPGKQGSLLQAKARGADVR IVYSPMDALKLAQENPTRKVVFFGLGFETTMPTTAITLQQAKARDVQNFYFFCQHITLIP TLRSLLEQPDNGIDAFLAPGHVSMVIGTNAYNFIASDFHRPLVVAGFEPLDLLQGVVMLV EQKIAAHSKVENQYRRVVPDAGNLLAQQAIADVFCVNGDSEWRGLGVIESSGVHLTPDYQ RFDAEAHFRPAPQQVCDDPRARCGEVLTGKCKPHQCPLFGNTCNPQTAFGALMVSSEGAC AAWYQYRQQENEA >gi|296493150|gb|ADTK01000351.1| GENE 28 25837 - 26109 348 90 aa, chain - ## HITS:1 COG:ECs3584 KEGG:ns NR:ns ## COG: ECs3584 COG0298 # Protein_GI_number: 15832838 # Func_class: 
O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Escherichia coli O157:H7 # 1 90 1 90 90 174 100.0 3e-44 MCIGVPGQIRTIDGNQAKVDVCGIQRDVDLTLVGSCDENGQPRVGQWVLVHVGFAMSVIN EAEARDTLDALQNMFDVEPDVGALLYGEEK >gi|296493150|gb|ADTK01000351.1| GENE 29 26100 - 26972 818 290 aa, chain - ## HITS:1 COG:hypB KEGG:ns NR:ns ## COG: hypB COG0378 # Protein_GI_number: 16130634 # Func_class: O Posttranslational modification, protein turnover, chaperones; K Transcription # Function: Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase # Organism: Escherichia coli K12 # 1 290 1 290 290 595 99.0 1e-170 MCTTCGCGEGNLYIEGDEHNPHSAFRSAPFAPAARPKMKITGIKAPEFTPSQTEEGDLHY GHGEAGTHAPGMSQRRMLEVEIDVLDKNNRLAERNRARFAARKQLVLNLVSSPGSGKTTL LTETLMRLKDSVPCAVIEGDQQTVNDAARIRATGTPAIQVNTGKGCHLDAQMIAEAAPRL PLDDNGILFIENVGNLVCPASFDLGEKHKVAVLSVTEGEDKPLKYPHMFAAASLMLLNKV DLLPYLNFDVEKCIACAREVNPEIEIILISATSGEGMDQWLNWLETQRCA >gi|296493150|gb|ADTK01000351.1| GENE 30 26976 - 27326 158 116 aa, chain - ## HITS:1 COG:ECs3582 KEGG:ns NR:ns ## COG: ECs3582 COG0375 # Protein_GI_number: 15832836 # Func_class: R General function prediction only # Function: Zn finger protein HypA/HybF (possibly regulating hydrogenase expression) # Organism: Escherichia coli O157:H7 # 1 116 1 116 116 202 100.0 1e-52 MHEITLCQRALELIEQQAAKHGAKRVTGVWLKIGAFSCVETSSLAFCFDLVCRGSVAEGC KLHLEEQEAECWCETCQQYVTLLTQRVRRCPQCHGDMLQIVADDGLQIRRIEIDQE >gi|296493150|gb|ADTK01000351.1| GENE 31 27538 - 27999 433 153 aa, chain + ## HITS:1 COG:no KEGG:ECO103_3263 NR:ns ## KEGG: ECO103_3263 # Name: hycA # Def: formate hydrogenlyase regulatory protein HycA # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 153 1 153 153 309 100.0 2e-83 MTIWEISEKADYIAQRHRRLQDQWHIYCNSLVQGITLSKARLHHAMSCAPDKELCFVLFE HFRIYVTLADGFNSHTIEYYVETKDGEDKQRIAQAQLSIDGMIDGKVNIRDREQVLEHYL EKIAGVYDSLYTAIENNVPVNLSQLVKGQSPAA >gi|296493150|gb|ADTK01000351.1| GENE 32 28124 - 28735 317 203 aa, chain + ## 
HITS:1 COG:hycB KEGG:ns NR:ns ## COG: hycB COG1142 # Protein_GI_number: 16130631 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 2 # Organism: Escherichia coli K12 # 1 203 1 203 203 360 100.0 1e-99 MNRFVIADSTLCIGCHTCEAACSETHRQHGLQSMPRLRVMLNEKESAPQLCHHCEDAPCA VVCPVNAITRVDGAVQLNESLCVSCKLCGIACPFGAIEFSGSRPLDIPANANTPKAPPAP PAPARVSTLLDWVPGIRAIAVKCDLCSFDEQGPACVRMCPTKALHLVDNTDIARVSKRKR ELTFNTDFGDLTLFQQAQSGEAK >gi|296493150|gb|ADTK01000351.1| GENE 33 28732 - 30558 1971 608 aa, chain + ## HITS:1 COG:hycC KEGG:ns NR:ns ## COG: hycC COG0651 # Protein_GI_number: 16130630 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit # Organism: Escherichia coli K12 # 1 608 1 608 608 994 99.0 0 MSAISLINSGVAWFVAAAVLAFLFSFQKALSGWIAGIGGAVGSLYTAAAGFTVLTGAVGV SGALSLVSYDVQISPLNAIWLITLGLCGLFVSLYNIDWHRHAQVKCNGLQINMLMAAAVC AVIASNLGMFVVMAEIMALCAVFLTSNSKEGKLWFALGRLGTLLLAIACWLLWQRYGTLD LRLLDMRMQQLPLGSDIWLLGVIGFGLLAGIIPLHGWVPQAHANASAPAAALFSTVVMKI GLLGILTLSLLGGNAPLWWGIALLVLGMITAFVGGLYALMEHNIQRLLAYHTLENIGIIL LGLGAGVTGIALEQPALIALGLVGGLYHLLNHSLFKSVLFLGAGSVWFRTGHRDIEKLGG IGKKMPVISIAMLVGLMAMAALPPLNGFAGEWVIYQSFFKLSNSGAFVARLLGPLLAVGL AITGALAVMCMAKVYGVTFLGAPRTKEAENATCAPLLMSVSVVALAICCVIGGVAAPWLL PMLSAAVPLPLEPANTTVSQPMITLLLIACPLLPFIIMAICKGDRLPSRSRGAAWVCGYD HEKSMVITAHGFAMPVKQALAPVLKLRKWLNPVSLVPGWQCEGSALLFRRMALVELAVLV VIIVSRGA >gi|296493150|gb|ADTK01000351.1| GENE 34 30561 - 31484 1390 307 aa, chain + ## HITS:1 COG:hycD KEGG:ns NR:ns ## COG: hycD COG0650 # Protein_GI_number: 16130629 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 4 # Organism: Escherichia coli K12 # 1 307 1 307 307 534 99.0 1e-152 MSVLYPLIQALVLFAVAPLLSGITRVARARLHNRRGPGVLQEYRDIFKLLGRQSVGPDAS GWVFRLTPYVMVGVMLTIATALPVVTVGSPLPQLGDLITLLYLFAIARFFFAISGLDTGS PFTAIGASREAMLGVLVEPMLLLGLWVAAQVAGSTNISNITDTVYHWPLSQSIPLVLALC 
ACAFATFIEMGKLPFDLAEAEQELQEGPLSEYSGSGFGVMKWGISLKQLVVLQMFVGVFI PWGQMETFTAGGLLLALVIAIVKLVVGVLVIALFENSMARLRLDITPRITWAGFGFAFLA FVSLLAA >gi|296493150|gb|ADTK01000351.1| GENE 35 31502 - 33211 2123 569 aa, chain + ## HITS:1 COG:ECs3577_2 KEGG:ns NR:ns ## COG: ECs3577_2 COG3261 # Protein_GI_number: 15832831 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase III large subunit # Organism: Escherichia coli O157:H7 # 158 569 1 412 412 872 100.0 0 MSEEKLGQHYLAALNEAFPGVVLDHAWQTKDQLTVTVKVNYLPEVVEFLYYKQGGWLSVL FGNDERKLNGHYAVYYVLSMEKGTKCWVTVRVEVDANKPEYPSVTPRVPAAVWGEREVRD MYGLIPVGLPDERRLVLPDDWPDELYPLRKDSMDYRQRPAPTTDAETYEFINELGDKKNN VVPIGPLHVTSDEPGHFRLFVDGENIIDADYRLFYVHRGMEKLAETRMGYNEVTFLSDRV CGICGFAHSTAYTTSVENAMGIQVPERAQMIRAILLEVERLHSHLLNLGLACHFTGFDSG FMQFFRVRETSMKMAEILTGARKTYGLNLIGGIRRDLLKDDMIQTRQLAQQMRREVQELV DVLLSTPNMEQRTVGIGRLDPEIARDFSNVGPMVRASGHARDTRADHPFVGYGLLPMEVH SEQGCDVISRLKVRINEVYTALNMIDYGLDNLPGGPLMVEGFTYIPHRFALGFAEAPRGD DIHWSMTGDNQKLYRWRCRAATYANWPTLRYMLRGNTVSDAPLIIGSLDPCYSCTDRMTV VDVRKKKSKVVPYKELERYSIERKNSPLK >gi|296493150|gb|ADTK01000351.1| GENE 36 33221 - 33763 470 180 aa, chain + ## HITS:1 COG:ECs3576 KEGG:ns NR:ns ## COG: ECs3576 COG1143 # Protein_GI_number: 15832830 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) # Organism: Escherichia coli O157:H7 # 1 180 1 180 180 343 98.0 8e-95 MFTFIKKVIKTGTATSSYPLEPIAVDKNFRGKPEQNPQQCIGCAACVNACPSNALTVETD LATGELAWEFNLGRCIFCGRCEEVCPTAAIKLSQEYELAVWKKEDFLQQSRFVLCSCRVC NRPFAVQKEIDYAIALLKHNGDSRAENHRESFETCPECKRQKCLVPSDRIELTRHMKEAI >gi|296493150|gb|ADTK01000351.1| GENE 37 33763 - 34530 839 255 aa, chain + ## HITS:1 COG:ECs3575 KEGG:ns NR:ns ## COG: ECs3575 COG3260 # Protein_GI_number: 15832829 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase III small subunit # Organism: Escherichia coli O157:H7 # 1 255 1 255 255 529 99.0 1e-150 
MSNLLGPRDANGIPVPMTVDESIASMKASLLKKIKRSAYVYRVDCGGCNGCEIEIFATLS PLFDAERFGIKVVPSPRHADILLFTGAVTRAMRSPALRAWQSAPDPKICISYGACGNSGG IFHDLYCVWGGTDKIVPVDVYIPGCPPTPAATLYGFAMALGLLEQKIHARGPGEQDEQPA EILHGDMVQPLRVKVDREARRLAGYRYGRQIADDFLTQLGQGEEQVARWLEAENDPRLNE IVSHLNHVVEEARIR >gi|296493150|gb|ADTK01000351.1| GENE 38 34527 - 34937 499 136 aa, chain + ## HITS:1 COG:no KEGG:B21_02533 NR:ns ## KEGG: B21_02533 # Name: hycH # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 136 1 136 136 270 100.0 1e-71 MSEKVVFSQLSRKFIDENDATPAEAQQVVYYSLAIGHHLGVIDCLEAALTCPWDEYLAWI ATLEAGSEARRKMEGVPKYGEIVIDINHVPMLANAFDKARAAQTSQQQEWSTMLLSMLHD IHQENAIYLMVRRLRD >gi|296493150|gb|ADTK01000351.1| GENE 39 34963 - 35400 585 145 aa, chain + ## HITS:1 COG:ECs3573 KEGG:ns NR:ns ## COG: ECs3573 COG0680 # Protein_GI_number: 15832827 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase maturation factor # Organism: Escherichia coli O157:H7 # 1 145 12 156 156 283 100.0 7e-77 MMGDDGAGPLLAEKCAAAPKGNWVVIDGGSAPENDIVAIRELRPTRLLIVDATDMGLNPG EIRIIDPDDIAEMFMMTTHNMPLNYLIDQLKEDIGEVIFLGIQPDIVGFYYPMTQPIKDA VETVYQRLEGWEGNGGFAQLAVEEE >gi|296493150|gb|ADTK01000351.1| GENE 40 35425 - 36186 595 253 aa, chain + ## HITS:1 COG:no KEGG:SSON_2863 NR:ns ## KEGG: SSON_2863 # Name: not_defined # Def: putative periplasmic or exported protein # Organism: S.sonnei # Pathway: not_defined # 1 253 1 253 257 520 98.0 1e-146 MNIRTGMCAFLLSLVLPAQATSFTEYLPMSDSEYAQKRALKPLLTMPYDAEQTWHFRKVG VAGVTLEKMPNDDSEWQLNGKDRAGKSWSVPVGMLANIAGKGQLYRADLDRNGIQDLVIW LPSTGNGLAPYAHLILMTFTREGRPCVFEPRGFYTASKTGVDDLLDLQGNGHTQLLDMQF NSGYWITSLYQVKDAKWQRVHGWFGKLSYPALTRFTYTPNRKLVLKPIAGRDPQTEDLAQ TQRCLIKGDVLEG >gi|296493150|gb|ADTK01000351.1| GENE 41 36360 - 36671 139 103 aa, chain + ## HITS:1 COG:STM4031 KEGG:ns NR:ns ## COG: STM4031 COG4680 # Protein_GI_number: 16767296 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Salmonella typhimurium LT2 # 1 103 1 103 103 120 57.0 5e-28 
MHIISKAPFEESARKYPNDALALQALYRVIKETDFSTPEEMRTAFPNLDNFRYRNKWYVL DVGGNNLRVIAYINFVNKRFFVKHITNHAEYDKLTRYYRENKE >gi|296493150|gb|ADTK01000351.1| GENE 42 36668 - 37087 381 139 aa, chain + ## HITS:1 COG:ygjM KEGG:ns NR:ns ## COG: ygjM COG5499 # Protein_GI_number: 16130977 # Func_class: K Transcription # Function: Predicted transcription regulator containing HTH domain # Organism: Escherichia coli K12 # 6 139 5 137 138 124 46.0 4e-29 MTANAARAVKATRELVNAVPFLGGSDSEDDYREALELVEYLIEEDDTNPLIDFLASRIAE YENNNEKFAEFDKAVAAMPVGVALLRTLIDQHNLTYADLKNEIGSKSLVSQILSGQRSLT ISHIKALSARFGVKPEWFL >gi|296493150|gb|ADTK01000351.1| GENE 43 37201 - 38625 1505 474 aa, chain - ## HITS:1 COG:ascB KEGG:ns NR:ns ## COG: ascB COG2723 # Protein_GI_number: 16130623 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Escherichia coli K12 # 1 474 1 474 474 1002 100.0 0 MSVFPESFLWGGALAANQSEGAFREGDKGLTTVDMIPHGEHRMAVKLGLEKRFQLRDDEF YPSHEATDFYHRYKEDIALMAEMGFKVFRTSIAWSRLFPQGDEITPNQQGIAFYRSVFEE CKKYGIEPLVTLCHFDVPMHLVTEYGSWRNRKLVEFFSRYARTCFEAFDGLVKYWLTFNE INIMLHSPFSGAGLVFEEGENQDQVKYQAAHHQLVASALATKIAHEVNPQNQVGCMLAGG NFYPYSCKPEDVWAALEKDRENLFFIDVQARGTYPAYSARVFREKGVTINKAPGDDEILK NTVDFVSFSYYASRCASAEMNANNSSAANVVKSLRNPYLQVSDWGWGIDPLGLRITMNMM YDRYQKPLFLVENGLGAKDEFAANGEINDDYRISYLREHIRAMGEAIADGIPLMGYTTWG CIDLVSASTGEMSKRYGFVFVDRDDAGNGTLTRTRKKSFWWYKKVIASNGEDLE >gi|296493150|gb|ADTK01000351.1| GENE 44 38634 - 40091 1735 485 aa, chain - ## HITS:1 COG:ECs3571_2 KEGG:ns NR:ns ## COG: ECs3571_2 COG1263 # Protein_GI_number: 15832825 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Escherichia coli O157:H7 # 91 485 2 396 396 649 98.0 0 MAKNYAALARSVIAALGGVDNISAVTHCMTRLRFVIKDDQLIDSPTLKTIPGVLGVVRSD NQCQVIIGNTVSQAFQEVVSLLPGDMQPAQPVGKPKLTLRRIGAGILDALIGTMSPLIPA IIGGSMVKLLAMILEMSGVLTKGSPTLTILNVIGDGAFFFLPLMVAASAAIKFKTNMSLA 
IAIAGVLVHPSFIELMAKAAQGEHVEFALIPVTAVKYTYTVIPALVMTWCLSYIERWVDR ITPAVTKNFLKPMLIVLIAAPLAILLIGPIGIWIGSAISALVYTIHGYLGWLSVAIMGAL WPLLVMTGMHRVFTPTIIQTIAETGKEGMVMPSEIGANLSLGGSSLAVAWKTKNPELRQT ALAAAASAIMAGISEPALYGVAIRLKRPLIASLISGFICGAVAGMAGLASHSMAAPGLFT SVQFFDPANPMSIVWVFAVMALAVVLSFILTLLLGFEDIPVEEAAAQARKYQSVQPTVAK EVSLN >gi|296493150|gb|ADTK01000351.1| GENE 45 40351 - 41361 918 336 aa, chain + ## HITS:1 COG:ascG KEGG:ns NR:ns ## COG: ascG COG1609 # Protein_GI_number: 16130621 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 336 2 337 337 660 99.0 0 MTTMLEVAKRAGVSKATVSRVLSGNGYVSQETKDRVFQAVEESGYRPNLLARNLSAKSTQ TLGLVVTNTLYHGIYFSELLFHAARMAEEKGRQLLLADGKHSAEEERQAIQYLLDLRCDA IMIYPRFLSVDEIDDIIDAHSQPIMVLNRRLRKNSSHSVWCDHKQTSFNAVAELINAGHQ EIAFLTGSMDSPTSIERLAGYKDALAQHGIALNEKLIANGKWTPASGAEGVEMLLERGAK FSALVASNDDMAIGAMKALHERGVAVPEQVSVIGFDDIAIAPYTVPALSSVKIPVTEMIQ EIIGRLIFMLDGGDFSPPKTFSGKLIRRDSLIALSR >gi|296493150|gb|ADTK01000351.1| GENE 46 41510 - 42037 410 175 aa, chain + ## HITS:1 COG:ECs3569 KEGG:ns NR:ns ## COG: ECs3569 COG1142 # Protein_GI_number: 15832823 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 2 # Organism: Escherichia coli O157:H7 # 1 175 1 175 175 302 100.0 2e-82 MNRFIIADASKCIGCRTCEVACVVSHQENQDCASLTPETFLPRIHVIKGVNISTATVCRQ CEDAPCANVCPNGAISRDKGFVHVMQERCIGCKTCVVACPYGAMEVVVRPVIRNSGAGLN VRADKAEANKCDLCNHREDGPACMAACPTHALICVDRNKLEQLSAEKRRRTALMF >gi|296493150|gb|ADTK01000351.1| GENE 47 42190 - 44442 1605 750 aa, chain + ## HITS:1 COG:hypF KEGG:ns NR:ns ## COG: hypF COG0068 # Protein_GI_number: 16130619 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Escherichia coli K12 # 1 750 1 750 750 1432 98.0 0 MAKNTSCGVQLRIRGKVQGVGFRPFVWQLAQQLNLHGDVCNDGDGVEVRLLEDPETFLVQ LHQHCPPLARIDSVEREPFIWSQLPTEFTIRQSAGGAMNTQIVPDAATCPACLAEMNTPG ERRYRYPFINCTHCGPRFTIIRAMPYDRPFTVMAEFPLCPACDKEYRDPLDRRFHAQPVA 
CPECGPHLEWVSHGEHAEQEAALQAAIAQLKMGNIVAIKGIGGFHLACDARNSNAVATLR ARKHRPAKPLAVMLPVADGLPDAARQLLTTPAAPIVLVDKKYVPELCDDIAPGLNEVGVM LPANPLQHLLLQELQCPLVMTSGNLSGKPPAISNEQALADLQGIADGFLIHNRDIVQRMD DSVVRESGEMLRRSRGYVPDALALPPGFKNVPPVLCLGADLKNTFCLVRGEQAVLSQHLG DLSDDGIQMQWREALRLMQNIYDFTPQYVVHDAHPGYVSSQWAHEMNLPTQTVLHHHAHA AACLAEHQWPLDGGDVIALTLDGIGMGENGALWGGECLRVNYRECEHLGGLPAVALAGGD LAAKQPWRNLLAQCLRFVPEWQNYPETASVQQQNWSVLARAIERGINAPLASSCGRFFDA VAAALGCAPATLSYEGEAACALEALAASCHGVTHPVTMPRVDNQLDLATFWQQWLNWQAP VNQRAWAFHDALAQGFAALMREQATMRGITTLVFSGGVIHNRLLRARLAHYLADFTLLFP QSLPAGDGGLSLGQGVIAAARWLAGEVQNG >gi|296493150|gb|ADTK01000351.1| GENE 48 44570 - 45703 1025 377 aa, chain - ## HITS:1 COG:ygbD KEGG:ns NR:ns ## COG: ygbD COG0446 # Protein_GI_number: 16130618 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 377 1 377 377 729 100.0 0 MSNGIVIIGSGFAARQLVKNIRKQDATIPLTLIAADSMDEYNKPDLSHVISQGQRADDLT RQTAGEFAEQFNLHLFPQTWVTDIDAEARVVKSQNNQWQYDKLVLATGASAFVPPVPGRE LMLTLNSQQEYRACETQLRDARRVLIVGGGLIGSELAMDFCRAGKAVTLIDNAASILASL MPPEVSSRLQHRLTEMGVHLLLKSQLQGLEKTDSGIQATLDRQRNIEVDAVIAATGLRPE TALARRAGLTINRGVCVDSYLQTSNTDIYALGDCAEINGQVLPFLQPIQLSAMVLAKNLL GNNTPLKLPAMLVKIKTPELPLHLAGETQRQDLRWQINTERQGMVARGVDDADQLRAFVV SEDRMKEAFGLLKTLPM >gi|296493150|gb|ADTK01000351.1| GENE 49 45700 - 47139 1376 479 aa, chain - ## HITS:1 COG:ygaK_1 KEGG:ns NR:ns ## COG: ygaK_1 COG0426 # Protein_GI_number: 16130617 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Escherichia coli K12 # 1 394 1 394 394 830 100.0 0 MSIVVKNNIHWVGQRDWEVRDFHGTEYKTLRGSSYNSYLIREEKNVLIDTVDHKFSREFV QNLRNEIDLADIDYIVINHAEEDHAGALTELMAQIPDTPIYCTANAIDSINGHHHHPEWN FNVVKTGDTLDIGNGKQLIFVETPMLHWPDSMMTYLTGDAVLFSNDAFGQHYCDEHLFND EVDQTELFEQCQRYYANILTPFSRLVTPKITEILGFNLPVDMIATSHGVVWRDNPTQIVE LYLKWAADYQEDRITIFYDTMSNNTRMMADAIAQGIAETDPRVAVKIFNVARSDKNEILT NVFRSKGVLVGTSTMNNVMMPKIAGLVEEMTGLRFRNKRASAFGSHGWSGGAVDRLSTRL 
QDAGFEMSLSLKAKWRPDQDALKLCREHGREIARQWALAPLPQSTVNTVVKEETSATTTA DLGPRMQCSVCQWIYDPAKGEPMQDVAPGTPWSEVPDNFLCPECSLGKDVFEELASEAK >gi|296493150|gb|ADTK01000351.1| GENE 50 47326 - 48676 1056 450 aa, chain + ## HITS:1 COG:ECs3565 KEGG:ns NR:ns ## COG: ECs3565 COG3604 # Protein_GI_number: 15832819 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Escherichia coli O157:H7 # 1 448 26 473 529 819 97.0 0 MSFSVDVLANIAIELQRGIGHQDRFQRLITTLRQVLECDASALLRYDSRQFIPLAIDGLA KDVLGRRFALEGHPRLEAIARAGDVVRFPADSELPDPYDGLIPGQESLKVHACVGLPLFA GQNLIGALTLDGMQPDQFDVFSDEELRLIAALAAGALSNALLIEQLESQNMMPGDATPFE AVKQTQMIGLSPGMTQLKKEIEIVAASDLNVLISGETGTGKELVAKAIHEASPRAVNPLV YLNCAALPESVAESELFGHVKGAFTGAISNRSGKFEMADNGTLFLDEIGELSLALQAKLL RVLQYGDIQRVGDDRSLRVDVRVLAATNRDLREEVLAGRFRADLFHRLSVFPLSVPPLRE RGDDVILLAGYFCEQCRLRLGLSRVVLSAGARNLLQHYRFPGNVRELEHAIHRAVVLSRA TRNGDEVILEAQHFAFPEVTLPPPPPAGGG Prediction of potential genes in microbial genomes Time: Mon May 16 16:09:30 2011 Seq name: gi|296493149|gb|ADTK01000352.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.19, whole genome shotgun sequence Length of sequence - 31599 bp Number of predicted genes - 34, with homology - 34 Number of transcription units - 10, operones - 6 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 18 - 188 139 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains 2 2 Op 1 . 
- CDS 185 - 1150 1168 ## COG0794 Predicted sugar phosphate isomerase involved in capsule formation 3 2 Op 2 4/0.000 - CDS 1143 - 1916 696 ## COG1349 Transcriptional regulators of sugar metabolism 4 2 Op 3 5/0.000 - CDS 1983 - 2342 298 ## COG4578 Glucitol operon activator - Prom 2379 - 2438 2.3 5 2 Op 4 5/0.000 - CDS 2447 - 3226 878 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 6 2 Op 5 6/0.000 - CDS 3230 - 3601 239 ## COG3731 Phosphotransferase system sorbitol-specific component IIA 7 2 Op 6 6/0.000 - CDS 3612 - 4571 1026 ## COG3732 Phosphotransferase system sorbitol-specific component IIBC 8 2 Op 7 . - CDS 4568 - 5131 659 ## COG3730 Phosphotransferase system sorbitol-specific component IIC - Prom 5255 - 5314 6.1 + Prom 5194 - 5253 4.8 9 3 Op 1 4/0.000 + CDS 5387 - 6472 1002 ## COG2951 Membrane-bound lytic murein transglycosylase B + Prom 6490 - 6549 2.6 10 3 Op 2 12/0.000 + CDS 6617 - 7114 301 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase 11 3 Op 3 14/0.000 + CDS 7194 - 8255 1505 ## COG0468 RecA/RadA recombinase + Term 8441 - 8485 4.2 12 3 Op 4 . + CDS 8498 - 8998 569 ## COG2137 Uncharacterized protein conserved in bacteria + Term 9055 - 9108 12.0 13 4 Tu 1 . 
- CDS 9039 - 9221 134 ## ECSP_3645 hypothetical protein - Prom 9377 - 9436 1.9 14 5 Op 1 8/0.000 + CDS 9126 - 11756 2899 ## COG0013 Alanyl-tRNA synthetase + Prom 11804 - 11863 4.1 15 5 Op 2 5/0.000 + CDS 11991 - 12176 204 ## PROTEIN SUPPORTED gi|167855109|ref|ZP_02477881.1| 30S ribosomal protein S1 + TRNA 12492 - 12584 72.5 # Ser GCT 0 0 + TRNA 12588 - 12664 87.2 # Arg ACG 0 0 + TRNA 12862 - 12938 87.2 # Arg ACG 0 0 + TRNA 13002 - 13078 87.2 # Arg ACG 0 0 + TRNA 13277 - 13353 87.2 # Arg ACG 0 0 16 5 Op 3 6/0.000 + CDS 13634 - 14200 536 ## COG0637 Predicted phosphatase/phosphohexomutase + Prom 14203 - 14262 3.6 17 5 Op 4 5/0.000 + CDS 14341 - 14625 267 ## COG1238 Predicted membrane protein 18 5 Op 5 6/0.000 + CDS 14698 - 16254 1771 ## COG2918 Gamma-glutamylcysteine synthetase + Term 16268 - 16299 2.5 + Prom 16290 - 16349 2.5 19 5 Op 6 . + CDS 16404 - 16919 592 ## COG1854 LuxS protein involved in autoinducer AI2 synthesis + Term 16953 - 16984 3.4 - Term 16941 - 16972 3.4 20 6 Op 1 19/0.000 - CDS 16983 - 18521 1595 ## COG0477 Permeases of the major facilitator superfamily 21 6 Op 2 7/0.000 - CDS 18538 - 19710 1156 ## COG1566 Multidrug resistance efflux pump - Prom 19774 - 19833 1.9 - Term 19754 - 19789 5.0 22 6 Op 3 . - CDS 19837 - 20367 531 ## COG1846 Transcriptional regulators - Prom 20391 - 20450 5.2 23 7 Op 1 . 
- CDS 20458 - 20793 395 ## EcolC_1024 hypothetical protein 24 7 Op 2 4/0.000 - CDS 20783 - 21520 593 ## COG1296 Predicted branched-chain amino acid permease (azaleucine resistance) - Prom 21540 - 21599 8.4 25 7 Op 3 3/0.333 - CDS 21644 - 22828 1106 ## COG0477 Permeases of the major facilitator superfamily - Prom 22860 - 22919 4.4 - Term 22968 - 23009 3.3 26 8 Op 1 14/0.000 - CDS 23120 - 24112 1100 ## COG2113 ABC-type proline/glycine betaine transport systems, periplasmic components 27 8 Op 2 16/0.000 - CDS 24170 - 25234 1222 ## COG4176 ABC-type proline/glycine betaine transport system, permease component 28 8 Op 3 5/0.000 - CDS 25227 - 26429 1157 ## COG4175 ABC-type proline/glycine betaine transport system, ATPase component - Prom 26458 - 26517 5.4 - Term 26633 - 26669 2.4 29 8 Op 4 24/0.000 - CDS 26784 - 27743 917 ## COG0208 Ribonucleotide reductase, beta subunit 30 8 Op 5 18/0.000 - CDS 27753 - 29897 2060 ## COG0209 Ribonucleotide reductase, alpha subunit 31 8 Op 6 11/0.000 - CDS 29870 - 30280 248 ## COG1780 Protein involved in ribonucleotide reduction 32 8 Op 7 . - CDS 30277 - 30522 274 ## COG0695 Glutaredoxin and related proteins - Prom 30589 - 30648 3.6 - Term 30705 - 30738 4.1 33 9 Tu 1 . - CDS 30770 - 31111 275 ## COG4575 Uncharacterized conserved protein + Prom 31104 - 31163 3.4 34 10 Tu 1 . 
+ CDS 31251 - 31595 296 ## G2583_3317 hypothetical protein Predicted protein(s) >gi|296493149|gb|ADTK01000352.1| GENE 1 18 - 188 139 56 aa, chain + ## HITS:1 COG:ECs3565 KEGG:ns NR:ns ## COG: ECs3565 COG3604 # Protein_GI_number: 15832819 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Escherichia coli O157:H7 # 1 56 474 529 529 113 98.0 8e-26 MPVVKQNLREATEAFQRETIRQALAQNHHNWAACARMLETDVANLHRLAKRLGLKD >gi|296493149|gb|ADTK01000352.1| GENE 2 185 - 1150 1168 321 aa, chain - ## HITS:1 COG:gutQ_1 KEGG:ns NR:ns ## COG: gutQ_1 COG0794 # Protein_GI_number: 16130615 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar phosphate isomerase involved in capsule formation # Organism: Escherichia coli K12 # 14 205 1 192 192 363 100.0 1e-100 MSEALLNAGRQTLMLELQEASRLPERLGDDFVRAANIILHCEGKVVVSGIGKSGHIGKKI AATLASTGTPAFFVHPAEALHGDLGMIESRDVMLFISYSGGAKELDLIIPRLEDKSIALL AMTGKPTSPLGLAAKAVLDISVEREACPMHLAPTSSTVNTLMMGDALAMAVMQARGFNEE DFARSHPAGALGARLLNKVHHLMRRDDAIPQVALTASVMDAMLELSRTGLGLVAVCDAQQ QVQGVFTDGDLRRWLVGGGALTTPVNEAMTTGGTTLQAQSRAIDAKEILMKRKITAAPVV DENGKLTGAINLQDFYQAGII >gi|296493149|gb|ADTK01000352.1| GENE 3 1143 - 1916 696 257 aa, chain - ## HITS:1 COG:srlR KEGG:ns NR:ns ## COG: srlR COG1349 # Protein_GI_number: 16130614 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli K12 # 1 257 1 257 257 504 100.0 1e-143 MKPRQRQAAILEYLQKQGKCSVEELAQYFDTTGTTIRKDLVILEHAGTVIRTYGGVVLNK EESDPPIDHKTLINTHKKELIAEAAVSFIHDGDSIILDAGSTVLQMVPLLSRFNNITVMT NSLHIVNALSELDNEQTILMPGGTFRKKSASFHGQLAENAFEHFTFDKLFMGTDGIDLNA GVTTFNEVYTVSKAMCNAAREVILMADSSKFGRKSPNVVCSLESVDKLITDAGIDPAFRQ ALEEKGIDVIITGESNE >gi|296493149|gb|ADTK01000352.1| GENE 4 1983 - 2342 298 119 aa, chain - ## HITS:1 COG:gutM KEGG:ns NR:ns ## COG: gutM COG4578 # Protein_GI_number: 16130613 # Func_class: 
K Transcription # Function: Glucitol operon activator # Organism: Escherichia coli K12 # 1 119 1 119 119 231 100.0 2e-61 MVSALITVAVIAWCAQLALGGWQISRFNRAFDTLCQQGRVGVGRSSGRFKPRVVVAIALD DQQRIVDTLFMKGLTVFARPQKIPAITGMHAGDLQPDVIFPHDPLSQNALSLALKLKRG >gi|296493149|gb|ADTK01000352.1| GENE 5 2447 - 3226 878 259 aa, chain - ## HITS:1 COG:srlD KEGG:ns NR:ns ## COG: srlD COG1028 # Protein_GI_number: 16130612 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Escherichia coli K12 # 1 259 1 259 259 491 99.0 1e-139 MNQVAVVIGGGQTLGAFLCHGLAAEGYRVAVVDIQSDKAANVAQEINAEYGEGMAYGFGA DATSEQSVLALSRGVDEIFGRVDLLVYSAGIAKAAFISDFQLGDFDRSLQVNLVGYFLCA REFSRLMIRDGIQGRIIQINSKSGKVGSKHNSGYSAAKFGGVGLTQSLALDLAEYGITVH SLMLGNLLKSPMFQSLLPQYATKLGIKPDQVEQYYIDKVPLKRGCDYQDVLNMLLFYASP KASYCTGQSINVTGGQVMF >gi|296493149|gb|ADTK01000352.1| GENE 6 3230 - 3601 239 123 aa, chain - ## HITS:1 COG:srlB KEGG:ns NR:ns ## COG: srlB COG3731 # Protein_GI_number: 16130611 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system sorbitol-specific component IIA # Organism: Escherichia coli K12 # 1 123 1 123 123 246 99.0 9e-66 MTVIYQTTITRIGASATDALSDQMLITFREGAPADLEEYCFIHCHGELKGALHPGLQFSL GQHRYPVTAVGSVAEDNLRELGHVTLRFDGLNEAEFPGTVHVAGPVPDDIAPGSVLKFES VKE >gi|296493149|gb|ADTK01000352.1| GENE 7 3612 - 4571 1026 319 aa, chain - ## HITS:1 COG:srlE KEGG:ns NR:ns ## COG: srlE COG3732 # Protein_GI_number: 16130610 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system sorbitol-specific component IIBC # Organism: Escherichia coli K12 # 1 319 1 319 319 586 98.0 1e-167 MTRIRIEKGTGGWGGPLELEATPGKKIVYITAGTRPAIVDKLAQLTGWQAIDGFKEGEPA EAEIGVAVIDCGGTLRCGIYPKRRIPTINIHSTGKSGPLAQYIVEDIYVSGVKEENITVV GDATPQPSSVGRDYDTSKKITEQSDGLLAKVGMGMGSAVAVLFQSGRDTIDTVLKTILPF 
MAFVSALIGIIMASGLGDWIAHGLAPLASHPLGLVMLALICSFPLLSPFLGPGAVIAQVI GVLIGVQIGLGNIPPHLALPALFAINAQAACDFIPVGLSLAEARQDTVRVGVPSVLVSRF LTGAPTVLIAWFVSGFIYQ >gi|296493149|gb|ADTK01000352.1| GENE 8 4568 - 5131 659 187 aa, chain - ## HITS:1 COG:srlA KEGG:ns NR:ns ## COG: srlA COG3730 # Protein_GI_number: 16130609 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system sorbitol-specific component IIC # Organism: Escherichia coli K12 # 1 187 1 187 187 364 98.0 1e-101 MIETITHGAEWFIGLFQKGGEVFTGMVTGILPLLISLLVIMNALINFIGQHRIERFAQRC AGNPVSRYLLLPCIGTFVFCNPMTLSLGRFMPEKYKPSYYAAASYSCHSMNGLFPHINPG ELFVYLGIASGLTTLNLPLGPLAVSYLLVGLVTNFFRGWVTDLTTAVFEKKMGIQLEQKV HLAGATS >gi|296493149|gb|ADTK01000352.1| GENE 9 5387 - 6472 1002 361 aa, chain + ## HITS:1 COG:mltB KEGG:ns NR:ns ## COG: mltB COG2951 # Protein_GI_number: 16130608 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-bound lytic murein transglycosylase B # Organism: Escherichia coli K12 # 1 361 1 361 361 715 99.0 0 MFKRRYVTLLPLFVLLAACSSKPKPTETETTTGTPSGGFLLEPQHNVMQMGGDFANNPNA QQFIDKMVNKHGFDRQQLQEILSQAKRLDSVLRLMDNQAPTTSVKPPSGPNGAWLRYRKK FITPDNVQNGVVFWNQYEDALNRAWQVYGVPPEIIVGIIGVETRWGRVMGKTRILDALAT LSFNYPRRAEYFSGELETFLLMARDEQDDPLNLKGSFAGAMGYGQFMPSSYKQYAVDFSG DGHINLWDPVDAIGSVANYFKAHGWVKGDQVAVMANGQAPGLPNGFKTRYSISQLATAGL TPQQPLGNHQQASLLRLDVGTGYQYWYGLPNFYTITRYNHSTHYAMAVWQLGQAVALARV Q >gi|296493149|gb|ADTK01000352.1| GENE 10 6617 - 7114 301 165 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 8 157 748 898 904 120 46 1e-26 MTDSELMQLSEQVGQALKARGATVTTAESCTGGWVAKVITDIAGSSAWFERGFVTYSNEA KAQMIGVREETLAQHGAVSEPVVVEMAIGALKAARADYAVSISGIAGPDGGSEEKPVGTV WFAFATARGEGITRRECFSGDRDAVRRQATAYALQTLWQQFLQNT >gi|296493149|gb|ADTK01000352.1| GENE 11 7194 - 8255 1505 353 aa, chain + ## HITS:1 COG:ECs3556 KEGG:ns NR:ns ## COG: ECs3556 COG0468 # Protein_GI_number: 15832810 # Func_class: L Replication, 
recombination and repair # Function: RecA/RadA recombinase # Organism: Escherichia coli O157:H7 # 1 353 1 353 353 666 100.0 0 MAIDENKQKALAAALGQIEKQFGKGSIMRLGEDRSMDVETISTGSLSLDIALGAGGLPMG RIVEIYGPESSGKTTLTLQVIAAAQREGKTCAFIDAEHALDPIYARKLGVDIDNLLCSQP DTGEQALEICDALARSGAVDVIVVDSVAALTPKAEIEGEIGDSHMGLAARMMSQAMRKLA GNLKQSNTLLIFINQIRMKIGVMFGNPETTTGGNALKFYASVRLDIRRIGAVKEGENVVG SETRVKVVKNKIAAPFKQAEFQILYGEGINFYGELVDLGVKEKLIEKAGAWYSYKGEKIG QGKANATAWLKDNPETAKEIEKKVRELLLSNPNSTPDFSVDDSEGVAETNEDF >gi|296493149|gb|ADTK01000352.1| GENE 12 8498 - 8998 569 166 aa, chain + ## HITS:1 COG:ECs3555 KEGG:ns NR:ns ## COG: ECs3555 COG2137 # Protein_GI_number: 15832809 # Func_class: R General function prediction only # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 166 1 166 166 311 99.0 3e-85 MTESTSRRPAYARLLDRAVRILAVRDHSEQELRRKLVAPIMGKNGPEEIDATAEDYERVI AWCHEHGYLDDSRFVARFIASRSRKGYGPARIRQELNQKGISREATEKAMRECDIDWCAL ARDQATRKYGEPLPTVFSEKVKIQRFLLYRGYLMEDIQDIWRNFAD >gi|296493149|gb|ADTK01000352.1| GENE 13 9039 - 9221 134 60 aa, chain - ## HITS:1 COG:no KEGG:ECSP_3645 NR:ns ## KEGG: ECSP_3645 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 60 74 133 133 117 100.0 1e-25 MGYQGAAGNYLMSLTMEKVEKRLTDLSGALAHNYPEIKLTKYRHQLQRVLTAGLVTEKWE >gi|296493149|gb|ADTK01000352.1| GENE 14 9126 - 11756 2899 876 aa, chain + ## HITS:1 COG:ECs3554 KEGG:ns NR:ns ## COG: ECs3554 COG0013 # Protein_GI_number: 15832808 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Escherichia coli O157:H7 # 1 876 1 876 876 1719 99.0 0 MSKSTAEIRQAFLDFFHSKGHQVVASSSLVPHNDPTLLFTNAGMNQFKDVFLGLDKRNYS RATTSQRCVRAGGKHNDLENVGYTARHHTFFEMLGNFSFGDYFKHDAIQFAWELLTSEKW FALPKERLWVTVYESDDEAYEIWEKEVGIPRERIIRIGDNKGAPYASDNFWQMGDTGPCG PCTEIFYDHGDHIWGGPPGSPEEDGDRYIEIWNIVFMQFNRQADGTMEPLPKPSVDTGMG LERIAAVLQHVNSNYDIDLFRTLIQAVAKVTGATDLSNKSLRVIADHIRSCAFLIADGVM 
PSNENRGYVLRRIIRRAVRHGNMLGAKETFFYKLVGPLIDVMGSAGEDLKRQQAQVEQVL KTEEEQFARTLERGLALLDEELAKLSGDTLDGETAFRLYDTYGFPVDLTADVCRERNIKV DEAGFEAAMEEQRRRAREASGFGADYNAMIRVDSASEFKGYDHLELNGKVTALFVDGKAV DAINAGQEAVVVLDQTPFYAESGGQVGDKGELKGANFSFAVEDTQKYGQAIGHIGKLAAG SLKVGDAVQADVDEARRARIRLNHSATHLMHAALRQVLGTHVSQKGSLVNDKVLRFDFSH NEAMKPEEIRAVEDLVNAQIRRNLPIETNIMDLEAAKAKGAMALFGEKYDERVRVLSMGD FSTELCGGTHASRTGDIGLFRIISESGTAAGVRRIEAVTGEGAITTVHADSDRLSEVAHL LKGDSNNLADKVRSVLERTRQLEKELQQLKEQAAAQESANLSSKAIDVNGVKLLVSELSG VEPKMLRTMVDDLKNQLGSTIIVLATVAEGKVSLIAGVSKDVTDRVKAGELIGMVAQQVG GKGGGRPDMAQAGGTDAAALPAALASVKGWVSAKLQ >gi|296493149|gb|ADTK01000352.1| GENE 15 11991 - 12176 204 61 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855109|ref|ZP_02477881.1| 30S ribosomal protein S1 [Haemophilus parasuis 29755] # 1 60 1 60 61 83 61 2e-15 MLILTRRVGETLMIGDEVTVTVLGVKGNQVRIGVNAPKEVSVHREEIYQRIQAEKSQQSS Y >gi|296493149|gb|ADTK01000352.1| GENE 16 13634 - 14200 536 188 aa, chain + ## HITS:1 COG:ECs3552 KEGG:ns NR:ns ## COG: ECs3552 COG0637 # Protein_GI_number: 15832806 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Escherichia coli O157:H7 # 1 188 1 188 188 372 100.0 1e-103 MYERYAGLIFDMDGTILDTEPTHRKAWREVLGHYGLQYDVQAMIALNGSPTWRIAQAIIE LNQADLDPHALAREKTEAVRSMLLDSVEPLPLVEVVKSWHGRRPMAVGTGSESAIAEALL AHLGLRRYFDAVVAADHVKHHKPAPDTFLLCAQRMGVQPTQCVVFEDADFGIQAARAAGM DAVDVRLL >gi|296493149|gb|ADTK01000352.1| GENE 17 14341 - 14625 267 94 aa, chain + ## HITS:1 COG:yqaA KEGG:ns NR:ns ## COG: yqaA COG1238 # Protein_GI_number: 16130601 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 94 49 142 142 163 100.0 7e-41 MGNSLGGLTNVILGRFFPLRKTSRWQEKATGWLKRYGAVTLLLSWMPVVGDLLCLLAGWM RISWGPVIFFLCLGKALRYVAVAAATVQGMMWWH >gi|296493149|gb|ADTK01000352.1| GENE 18 14698 - 16254 1771 518 aa, chain + ## HITS:1 COG:gshA KEGG:ns NR:ns ## COG: gshA COG2918 # Protein_GI_number: 16130600 # Func_class: H Coenzyme transport and metabolism # 
Function: Gamma-glutamylcysteine synthetase # Organism: Escherichia coli K12 # 1 518 1 518 518 1079 99.0 0 MIPDVSQALAWLEKHPQALKGIQRGLERETLRVNADGTLATTGHPEALGSALTHKWITTD FAEALLEFITPVDGDIEHMLTFMRDLHRYTARNMGDERMWPLSMPCYIAEGQDIELAQYG TSNTGRFKTLYREGLKNRYGALMQTISGVHYNFSLPMAFWQAKCGDISGADAKEKISAGY FRVIRNYYRFGWVIPYLFGASPAICSSFLQGKPTSLPFEKTECGMYYLPYATSLRLSDLG YTNKSQSNLGITFNDLYEYVAGLKQAIKTPSEEYAKIGIEKDGKRLQINSNVLQIENELY APIRPKRVTRSGESPSDALLRGGIEYIEVRSLDINPFSPIGVDEQQVRFLDLFMVWCALA DAPEMSSSELACTRVNWNRVILEGRKPGLTLGIGCETAQFPLLQVGKDLFRDLKRVAQTL DSINGGEAYQKVCDELVACFDNPDLTFSARILRSMIDTGIGGTGKAFAEAYRNLLREEPL EILREEDFVAEREASERRQQEMETADTEPFAVWLEKHA >gi|296493149|gb|ADTK01000352.1| GENE 19 16404 - 16919 592 171 aa, chain + ## HITS:1 COG:luxS KEGG:ns NR:ns ## COG: luxS COG1854 # Protein_GI_number: 16130599 # Func_class: T Signal transduction mechanisms # Function: LuxS protein involved in autoinducer AI2 synthesis # Organism: Escherichia coli K12 # 1 171 1 171 171 349 99.0 2e-96 MPLLDSFTVDHTRMEAPAVRVAKTMNTPHGDAITVFDLRFCVPNKEVMPERGIHTLEHLF AGFMRNHLNGKGVEIIDISPMGCRTGFYMSLIGTPDEQRVADAWKAAMEDVLKVQDQNQI PELNVYQCGTYQMHSLQEAQDIARSILERDVRINSNEELALPKEKLQELHI >gi|296493149|gb|ADTK01000352.1| GENE 20 16983 - 18521 1595 512 aa, chain - ## HITS:1 COG:STM2815 KEGG:ns NR:ns ## COG: STM2815 COG0477 # Protein_GI_number: 16766126 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium LT2 # 1 502 1 502 512 907 95.0 0 MQQQKPLEGAQLVIMTIALSLATFMQVLDSTIANVAIPTIAGNLGSSLSQGTWVITSFGV ANAISIPLTGWLAKRVGEVKLFLWSTIAFAIASWACGVSSSLNMLIFFRVIQGIVAGPLI PLSQSLLLNNYPPAKRSIALALWSMTVIVAPICGPILGGYISDNYHWGWIFFINVPIGVA VVLMTLQTLRGRETRTERRRIDAVGLALLVIGIGSLQIMLDRGKELDWFSSQEIIILTVV AVVAICFLIVWELTDDNPIVDLSLFKSRNFTIGCLCISLAYMLYFGAIVLLPQLLQEVYG YTATWAGLASAPVGIIPVILSPIIGRFAHKLDMRRLVTFSFIMYAVCFYWRAYTFEPGMD 
FGASAWPQFIQGFAVACFFMPLTTITLSGLPPERLAAASSLSNFTRTLAGSIGTSITTTM WTNRESMHHAQLTESVNPFNPNAQAMYSQLEGLGMTQQQASGWIAQQITNQGLIISANEI FWMSAGIFLVLLGLVWFAKPPFGAGGGGGGAH >gi|296493149|gb|ADTK01000352.1| GENE 21 18538 - 19710 1156 390 aa, chain - ## HITS:1 COG:emrA KEGG:ns NR:ns ## COG: emrA COG1566 # Protein_GI_number: 16130597 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Escherichia coli K12 # 1 390 1 390 390 689 99.0 0 MSANAETQTPQQPVKKSGKRKRLLLLLTLLFIIIAVAIGIYWFLVLRHFEETDDAYVAGN QIQIMSQVSGSVTKVWADNTDFVKEGDVLVTLDPTDARQAFEKAKTALASSVRQTHQLMI NSKQLQANIEVQKIALAKAQSDYNRRVPLGNANLIGREELQHARDAVTSAQAQLDVAIQQ YNANQAMILGTKLEDQPAVQQAATEVRNAWLALERTRIVSPMTGYVSRRAVQPGAQISPT TPLMAVVPATNMWVDANFKETQIANMRIGQPVTITTDIYGDDVKYTGKVVGLDMGTGSAF SLLPAQNATGNWIKVVQRLPVRIELDQKQLEQYPLRIGLSTLVSVNTTNRDGQVLANKVR STPVAVSTAREISLAPVNKLIDDIVKANAG >gi|296493149|gb|ADTK01000352.1| GENE 22 19837 - 20367 531 176 aa, chain - ## HITS:1 COG:ECs3546 KEGG:ns NR:ns ## COG: ECs3546 COG1846 # Protein_GI_number: 15832800 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 176 1 176 176 306 100.0 2e-83 MDSSFTPIEQMLKFRASRHEDFPYQEILLTRLCMHMQSKLLENRNKMLKAQGINETLFMA LITLESQENHSIQPSELSCALGSSRTNATRIADELEKRGWIERRESDNDRRCLHLQLTEK GHEFLREVLPPQHNCLHQLWSALSTTEKDQLEQITRKLLSRLDQMEQDGVVLEAMS >gi|296493149|gb|ADTK01000352.1| GENE 23 20458 - 20793 395 111 aa, chain - ## HITS:1 COG:no KEGG:EcolC_1024 NR:ns ## KEGG: EcolC_1024 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 111 1 111 111 174 97.0 1e-42 MSYEVLLLGLLVGAANYCFRYLPLRLRVGNARPTKRGAVGILLDTIGIASICALLVVSTA PEVMHDTRRFVPTLVGFAVLGASFYKTRSIIIPTLLSALTYGLVWKVMAII >gi|296493149|gb|ADTK01000352.1| GENE 24 20783 - 21520 593 245 aa, chain - ## HITS:1 COG:ygaZ KEGG:ns NR:ns ## COG: ygaZ COG1296 # Protein_GI_number: 16130594 # Func_class: E Amino acid transport and metabolism # Function: Predicted branched-chain amino acid permease (azaleucine 
resistance) # Organism: Escherichia coli K12 # 1 245 1 245 245 432 99.0 1e-121 MESPTPQPAPGSATFMEGCKDSLPIVISYIPVAFAFGLNATRLGFSPLESVFFSCIIYAG ASQFVITAMLAAGSSLWVAALTVMAMDVRHVLYGPSLRSRIIQRLQKSKTALWAFGLTDE VFAAATAKLVRNNRRWSENWMIGIAFSSWSSWVFGTVIGAFSGSGLLQGYPAVEAALGFM LPALFMSFLLASFQRKQSLCVTAALVGALAGVTLFSIPVAILAGIVCGCLTALIQAFWQG APDEL >gi|296493149|gb|ADTK01000352.1| GENE 25 21644 - 22828 1106 394 aa, chain - ## HITS:1 COG:ECs3543 KEGG:ns NR:ns ## COG: ECs3543 COG0477 # Protein_GI_number: 15832797 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 394 1 394 394 621 99.0 1e-178 MTKPNHELSPALIVLMSIATGLAVASNYYAQPLLDTIARNFSLSASSAGFIVTAAQLGYA AGLLFLVPLGDMFERRRLIVSMTLLAAGGMLITASSQSLAMMILGTALTGLFSVVAQILV PLAATLASPDKRGKVVGTIMSGLLLGILLARTVAGLLANLGGWRTVFWVASVLMALMALA LWRGLPQMKSETHLNYPQLLGSVFSMFISDKILRTRALLGCLTFANFSILWTSMAFLLAA PPFNYSDGVIGLFGLAGAAGALGARPAGGFADKGKSHHTTTFGLLLLLLSWLAIWFGHTS VLALIIGILVLDLTVQGVHITNQTVIYRIHPDARNRLTAGYMTSYFIGGAAGSLISASAW QHGGWAGVCLAGATIALVNLLVWWRGFHRQEAAN >gi|296493149|gb|ADTK01000352.1| GENE 26 23120 - 24112 1100 330 aa, chain - ## HITS:1 COG:ECs3542 KEGG:ns NR:ns ## COG: ECs3542 COG2113 # Protein_GI_number: 15832796 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport systems, periplasmic components # Organism: Escherichia coli O157:H7 # 1 330 1 330 330 646 99.0 0 MRHSVLFATAFATLISTQTFAADLPGKGITVNPVQSTITEETFQTLLVSRALEKLGYTVN KPSEVDYNVGYTSLASGDATSPAVNWTPLHDNMYEAAGGDKKFYREGVFVNGAAQGYLID KKTADQYKITNIAQLKDPKIAKLFDTNGDGKADLTGCNPGWGCEGAINHQLAAYELTHTV THNQGNYAAMMADTISRYKEGKPVFYYTWTPYWVSNELKPGKDVVWLQVPFSALPGDKNA DTKLPNGANYGFPVSTMHIVANKAWAEKNPAAAKLFAIMQLPVADINAQNAIMHDGKASE GDIQGHVDGWIKAHQQQFDGWVNEALAAQK >gi|296493149|gb|ADTK01000352.1| GENE 27 24170 - 25234 1222 354 aa, chain - ## HITS:1 
COG:ECs3541 KEGG:ns NR:ns ## COG: ECs3541 COG4176 # Protein_GI_number: 15832795 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport system, permease component # Organism: Escherichia coli O157:H7 # 1 354 1 354 354 578 99.0 1e-165 MADQNNPWDTTPAADSAAQSADAGGTPATAPTDGGGADWLTSTPAPNVEHFNILDPFHKT LIPLDSWVTEGIDWVVTHFRPVFQGVRVPVDYILNGFQQLLLGMPAPVAIIVFALIAWQI SGVGMGVATLVSLIAIGAIGAWSQAMVTLALVLTALLFCIVIGLPLGIWLARSPRAAKII RPLLDAMQTTPAFVYLVPIVMLFGIGNVPGVVVTIIFALPPIIRLTILGINQVPADLIEA SRSFGASPRQMLFKVQLPLAMPTIMAGVNQTLMLALSMVVIASMIAVGGLGQMVLRGIGR LDMGLATVGGVGIVILAIILDRLTQAVGRDSRSRGNRRWYTTGPVGLLTRPFIK >gi|296493149|gb|ADTK01000352.1| GENE 28 25227 - 26429 1157 400 aa, chain - ## HITS:1 COG:proV KEGG:ns NR:ns ## COG: proV COG4175 # Protein_GI_number: 16130591 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport system, ATPase component # Organism: Escherichia coli K12 # 1 400 1 400 400 764 100.0 0 MAIKLEIKNLYKIFGEHPQRAFKYIEQGLSKEQILEKTGLSLGVKDASLAIEEGEIFVIM GLSGSGKSTMVRLLNRLIEPTRGQVLIDGVDIAKISDAELREVRRKKIAMVFQSFALMPH MTVLDNTAFGMELAGINAEERREKALDALRQVGLENYAHSYPDELSGGMRQRVGLARALA INPDILLMDEAFSALDPLIRTEMQDELVKLQAKHQRTIVFISHDLDEAMRIGDRIAIMQN GEVVQVGTPDEILNNPANDYVRTFFRGVDISQVFSAKDIARRTPNGLIRKTPGFGPRSAL KLLQDEDREYGYVIERGNKFVGAVSIDSLKTALTQQQGLDAALIDAPLAVDAQTPLSELL SHVGQAPCAVPVVDEDQQYVGIISKGMLLRALDREGVNNG >gi|296493149|gb|ADTK01000352.1| GENE 29 26784 - 27743 917 319 aa, chain - ## HITS:1 COG:nrdF KEGG:ns NR:ns ## COG: nrdF COG0208 # Protein_GI_number: 16130590 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, beta subunit # Organism: Escherichia coli K12 # 1 319 1 319 319 637 100.0 0 MKLSRISAINWNKISDDKDLEVWNRLTSNFWLPEKVPLSNDIPAWQTLTVVEQQLTMRVF TGLTLLDTLQNVIGAPSLMPDALTPHEEAVLSNISFMEAVHARSYSSIFSTLCQTKDVDA AYAWSEENAPLQRKAQIIQQHYRGDDPLKKKIASVFLESFLFYSGFWLPMYFSSRGKLTN TADLIRLIIRDEAVHGYYIGYKYQKNMEKISLGQREELKSFAFDLLLELYDNELQYTDEL 
YAETPWADDVKAFLCYNANKALMNLGYEPLFPAEMAEVNPAILAALSPNADENHDFFSGS GSSYVMGKAVETEDEDWNF >gi|296493149|gb|ADTK01000352.1| GENE 30 27753 - 29897 2060 714 aa, chain - ## HITS:1 COG:ECs3538 KEGG:ns NR:ns ## COG: ECs3538 COG0209 # Protein_GI_number: 15832792 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Escherichia coli O157:H7 # 1 714 1 714 714 1447 98.0 0 MATTTAECLTQETMDYHALNAMLNLYDSAGRIQFDKDRQAVDAFIATHVRPNSVTFSSQQ QRLNWLVNEGYYDESVLNRYSRDFVITLFAHAHTSGFRFQTFLGAWKFYTSYTLKTFDGK RYLEDFADRVTMVALTLAQGDETLALQLTDEMLSGRFQPATPTFLNCGKQQRGELVSCFL LRIEDNMESIGRAVNSALQLSKRGGGVAFLLSNLREAGAPIKRIENQSSGVIPVMKMLED AFSYANQLGARQGAGAVYLHAHHPDILRFLDTKRENADEKIRIKTLSLGVVIPDITFHLA KENAQMALFSPYDVERVYGKPFADIAISEHYDELVADERIRKKYLNARDFFQRLAEIQFE SGYPYIMYEDTVNRANPIAGRINMSNLCSEILQVNCASEYDENLDYARTGHDISCNLGSL NIAHTMDSPDFARTVETAVRGLTAVSDMSHIRSVPSIEAGNAASHAIGLGQMNLHGYLAR EGIAYGSPEALDFTNLYFYTITWHALRTSMLLARERGETFAGFKQSRYASGEYFSQYLQG NWQPKTAKVGELFARSGITLPTREMWAQLRDDVMRYGIYNQNLQAVPPTGSISYINHATS SIHPIVAKVEIRKEGKTGRVYYPAPFMTNENLALYQDAYKIGAEKIIDTYAEATRHVDQG LSLTLFFPDTATTRDINKAQIYAWRKGIKTLYYIRLRQMALEGTEIEGCVSCAL >gi|296493149|gb|ADTK01000352.1| GENE 31 29870 - 30280 248 136 aa, chain - ## HITS:1 COG:ECs3537 KEGG:ns NR:ns ## COG: ECs3537 COG1780 # Protein_GI_number: 15832791 # Func_class: F Nucleotide transport and metabolism # Function: Protein involved in ribonucleotide reduction # Organism: Escherichia coli O157:H7 # 1 136 1 136 136 268 100.0 1e-72 MSQLVYFSSSSENTQRFIERLGLPAVRIPLNERERIQVDEPYILIVPSYGGGGTAGAVPR QVIRFLNDEHNRALLRGVIASGNRNFGEAYGRAGDVIARKCGVPWLYRFELMGTQSDIEN VRKGVTEFWQRQPQNA >gi|296493149|gb|ADTK01000352.1| GENE 32 30277 - 30522 274 81 aa, chain - ## HITS:1 COG:ECs3536 KEGG:ns NR:ns ## COG: ECs3536 COG0695 # Protein_GI_number: 15832790 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutaredoxin and related proteins # Organism: Escherichia coli O157:H7 # 1 81 1 81 81 164 100.0 4e-41 
MRITIYTRNDCVQCHATKRAMENRGFDFEMINVDRVPEAAEALRAQGFRQLPVVIAGDLS WSGFRPDMINRLHPAPHAASA >gi|296493149|gb|ADTK01000352.1| GENE 33 30770 - 31111 275 113 aa, chain - ## HITS:1 COG:ECs3533 KEGG:ns NR:ns ## COG: ECs3533 COG4575 # Protein_GI_number: 15832787 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 113 1 113 113 200 99.0 5e-52 MGEHMFNRPNRNDVDDGVQDIQNDVNQLADSLESVLKSWGSDAKGEAEAARSKAQALLKE TRARMHGRTRVQQAARDAVGCADSFVRERPWCSVGTAAAVGIFIGALLSMRKS >gi|296493149|gb|ADTK01000352.1| GENE 34 31251 - 31595 296 114 aa, chain + ## HITS:1 COG:no KEGG:G2583_3317 NR:ns ## KEGG: G2583_3317 # Name: ygaC # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 114 1 114 114 221 100.0 8e-57 MYLRPDEVARVLEKVGFTVDVVTQKAYGYRRGENYVYVNREARMGRTALVIHPTLKERSS TLAEPASDIKTCDHYQQFPLYLAGERHEHYGIPHGFSSRVALERYLNGLFGEAS Prediction of potential genes in microbial genomes Time: Mon May 16 16:09:43 2011 Seq name: gi|296493148|gb|ADTK01000353.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1096.20, whole genome shotgun sequence Length of sequence - 20716 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 10, operones - 5 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 17 - 466 355 ## ECSP_3616 predicted inner membrane protein - Prom 581 - 640 6.2 + Prom 927 - 986 5.8 2 2 Tu 1 . + CDS 1134 - 1538 456 ## COG2916 DNA-binding protein H-NS - Term 1415 - 1448 1.0 3 3 Op 1 7/0.000 - CDS 1585 - 2109 367 ## COG0607 Rhodanese-related sulfurtransferase 4 3 Op 2 . - CDS 2119 - 2418 323 ## COG0640 Predicted transcriptional regulators - Prom 2442 - 2501 5.6 + Prom 2485 - 2544 4.8 5 4 Op 1 4/0.667 + CDS 2601 - 2759 208 ## COG0401 Uncharacterized homolog of Blt101 + Term 2775 - 2806 3.9 + Prom 2761 - 2820 2.2 6 4 Op 2 . 
+ CDS 2843 - 3292 464 ## COG1652 Uncharacterized protein containing LysM domain - Term 3230 - 3266 4.0 7 5 Op 1 1/1.000 - CDS 3293 - 3955 749 ## COG1802 Transcriptional regulators 8 5 Op 2 4/0.667 - CDS 3976 - 5376 1412 ## COG1113 Gamma-aminobutyrate permease and related permeases - Prom 5543 - 5602 2.0 - Term 5513 - 5556 3.0 9 6 Op 1 12/0.000 - CDS 5614 - 6894 1350 ## COG0160 4-aminobutyrate aminotransferase and related aminotransferases 10 6 Op 2 3/0.667 - CDS 6908 - 8356 1727 ## COG1012 NAD-dependent aldehyde dehydrogenases 11 6 Op 3 . - CDS 8379 - 9647 1014 ## COG0579 Predicted dehydrogenase 12 6 Op 4 . - CDS 9667 - 10644 1011 ## SFV_2844 hypothetical protein - Prom 10866 - 10925 4.1 13 7 Tu 1 . - CDS 11391 - 12881 467 ## COG0366 Glycosidases - Prom 12987 - 13046 4.1 + TRNA 13792 - 13867 94.0 # Met CAT 0 0 + Prom 14422 - 14481 7.7 14 8 Tu 1 . + CDS 14619 - 15992 497 ## ECSP_3600 hypothetical protein 15 9 Tu 1 . + CDS 16378 - 16566 85 ## ECIAI1_2748 putative DNA invertase fragment, putative PinH + Term 16732 - 16774 1.0 + Prom 16601 - 16660 3.0 16 10 Op 1 . + CDS 16905 - 19805 1320 ## COG3468 Type V secretory pathway, adhesin AidA 17 10 Op 2 . 
+ CDS 19834 - 20160 222 ## COG3468 Type V secretory pathway, adhesin AidA + Term 20175 - 20203 1.0 Predicted protein(s) >gi|296493148|gb|ADTK01000353.1| GENE 1 17 - 466 355 149 aa, chain - ## HITS:1 COG:no KEGG:ECSP_3616 NR:ns ## KEGG: ECSP_3616 # Name: ygaW # Def: predicted inner membrane protein # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 149 1 149 149 269 100.0 2e-71 MFSPQSRLRHAVADTFAMVVYCSVVNMCIEVFLSGMSFEQSFYSRLVAIPVNILIAWPYG MYRDLFMRAARKVSPSGWIKNLADILAYVTFQSPVYVAILLVVGADWHQIMAAVSSNIVV SMLMGAVYGYFLDYCRRLFKVSRYQQVKA >gi|296493148|gb|ADTK01000353.1| GENE 2 1134 - 1538 456 134 aa, chain + ## HITS:1 COG:ECs3530 KEGG:ns NR:ns ## COG: ECs3530 COG2916 # Protein_GI_number: 15832784 # Func_class: R General function prediction only # Function: DNA-binding protein H-NS # Organism: Escherichia coli O157:H7 # 1 134 1 134 134 213 100.0 1e-55 MSVMLQSLNNIRTLRAMAREFSIDVLEEMLEKFRVVTKERREEEEQQQRELAERQEKIST WLELMKADGINPEELLGNSSAAAPRAGKKRQPRPAKYKFTDVNGETKTWTGQGRTPKPIA QALAEGKSLDDFLI >gi|296493148|gb|ADTK01000353.1| GENE 3 1585 - 2109 367 174 aa, chain - ## HITS:1 COG:ygaP KEGG:ns NR:ns ## COG: ygaP COG0607 # Protein_GI_number: 16130582 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Escherichia coli K12 # 1 174 1 174 174 339 98.0 1e-93 MALTTISPHDAQELIARGAKLIDIRDADEYLREHIPEADLAPLSVLEQSGLPAKLRHEQI IFHCQAGKRTSNNADKLAAIAAPAEIFLLEDGIDGWKRAGLPVAVNKSQPFPLMRQVQIA AGGLILIGVVLGYTVNSGFFLLSGFVGAGLLFAGISGFCGMARLLDKMPWNQRA >gi|296493148|gb|ADTK01000353.1| GENE 4 2119 - 2418 323 99 aa, chain - ## HITS:1 COG:ygaV KEGG:ns NR:ns ## COG: ygaV COG0640 # Protein_GI_number: 16130581 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 99 1 99 99 162 100.0 2e-40 MTELAQLQASAEQAAALLKAMSHPKRLLILCMLSGSPGTSAGELTRITGLSASATSQHLA RMRDEGLIDSQRDAQRILYSIKNEAVNAIIATLKNVYCP >gi|296493148|gb|ADTK01000353.1| GENE 5 2601 - 2759 208 52 aa, chain + ## HITS:1 COG:ECs3527 KEGG:ns 
NR:ns ## COG: ECs3527 COG0401 # Protein_GI_number: 15832781 # Func_class: S Function unknown # Function: Uncharacterized homolog of Blt101 # Organism: Escherichia coli O157:H7 # 1 52 1 52 52 77 100.0 6e-15 MGFWRIVITIILPPLGVLLGKGFGWAFIINILLTLLGYIPGLIHAFWVQTRD >gi|296493148|gb|ADTK01000353.1| GENE 6 2843 - 3292 464 149 aa, chain + ## HITS:1 COG:ygaU KEGG:ns NR:ns ## COG: ygaU COG1652 # Protein_GI_number: 16130579 # Func_class: S Function unknown # Function: Uncharacterized protein containing LysM domain # Organism: Escherichia coli K12 # 1 149 1 149 149 267 100.0 4e-72 MGLFNFVKDAGEKLWDAVTGQHDKDDQAKKVQEHLNKTGIPDADKVNIQIADGKATVTGD GLSQEAKEKILVAVGNISGIASVDDQVKTATPATASQFYTVKSGDTLSAISKQVYGNANL YNKIFEANKPMLKSPDKIYPGQVLRIPEE >gi|296493148|gb|ADTK01000353.1| GENE 7 3293 - 3955 749 220 aa, chain - ## HITS:1 COG:ECs3525 KEGG:ns NR:ns ## COG: ECs3525 COG1802 # Protein_GI_number: 15832779 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 220 7 226 226 411 100.0 1e-115 MTITSLDGYRWLKNDIIRGNFQPDEKLRMSLLTSRYALGVGPLREALSQLVAERLVTVVN QKGYRVASMSEQELLDIFDARANMEAMLVSLAIARGGDEWEADVLAKAHLLSKLEACDAS EKMLDEWDLRHQAFHTAIVAGCGSHYLLQMRERLFDLAARYRFIWLRRTVLSVEMLEDKH DQHQTLTAAVLARDTARASELMRQHLLTPIPIIQQAMAGN >gi|296493148|gb|ADTK01000353.1| GENE 8 3976 - 5376 1412 466 aa, chain - ## HITS:1 COG:gabP KEGG:ns NR:ns ## COG: gabP COG1113 # Protein_GI_number: 16130577 # Func_class: E Amino acid transport and metabolism # Function: Gamma-aminobutyrate permease and related permeases # Organism: Escherichia coli K12 # 1 466 1 466 466 824 100.0 0 MGQSSQPHELGGGLKSRHVTMLSIAGVIGASLFVGSSVAIAEAGPAVLLAYLFAGLLVVM IMRMLAEMAVATPDTGSFSTYADKAIGRWAGYTIGWLYWWFWVLVIPLEANIAAMILHSW VPGIPIWLFSLVITLALTGSNLLSVKNYGEFEFWLALCKVIAILAFIFLGAVAISGFYPY AEVSGISRLWDSGGFMPNGFGAVLSAMLITMFSFMGAEIVTIAAAESDTPEKHIVRATNS VIWRISIFYLCSIFVVVALIPWNMPGLKAVGSYRSVLELLNIPHAKLIMDCVILLSVTSC LNSALYTASRMLYSLSRRGDAPAVMGKINRSKTPYVAVLLSTGAAFLTVVVNYYAPAKVF 
KFLIDSSGAIALLVYLVIAVSQLRMRKILRAEGSEIRLRMWLYPWLTWLVIGFITFVLVV MLFRPAQQLEVISTGLLAIGIICTVPIMARWKKLVLWQKTPVHNTR >gi|296493148|gb|ADTK01000353.1| GENE 9 5614 - 6894 1350 426 aa, chain - ## HITS:1 COG:gabT KEGG:ns NR:ns ## COG: gabT COG0160 # Protein_GI_number: 16130576 # Func_class: E Amino acid transport and metabolism # Function: 4-aminobutyrate aminotransferase and related aminotransferases # Organism: Escherichia coli K12 # 1 426 1 426 426 848 99.0 0 MSSNKELMQRRSQAIPRGVGQIHPIFADRAENCRVWDVEGREYLDFAGGIAVLNTGHLHP KVVAAVEAQLKKLSHTCFQVLAYEPYLELCEIMNQKVPGNFAKKTLLVTTGSEAVENAVK IARAATKRSGTIAFSGAYHGRTHYTLALTGKVNPYSAGMGLMPGHVYRALYPCPLHGISE DDAIASIHRIFKNDAAPEDIAAIVIEPVQGEGGFYAASPAFMQRLRALCDEHGIMLIADE VQSGAGRTGTLFAMEQMGVAPDLTTFAKSIAGGFPLAGVTGRAEVMDAVAPGGLGGTYAG NPIACVAALEVLKVFEQENLLQKANDLGQKLKDGLLAIAEKHPEIGDVRGLGAMIAIELF EDGDHNKPDAKLTAEIVARARDKGLILLSCGPYYNVLRILVPLTIEDAQIRQGLEIISQC FDEAKQ >gi|296493148|gb|ADTK01000353.1| GENE 10 6908 - 8356 1727 482 aa, chain - ## HITS:1 COG:ECs3522 KEGG:ns NR:ns ## COG: ECs3522 COG1012 # Protein_GI_number: 15832776 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Escherichia coli O157:H7 # 1 482 1 482 482 950 99.0 0 MKLNDSNLFRQQALINGEWLDANNGEVIDVTNPANGDKLGSVPKMGADETRAAIDAANRA LPAWRALTAKERANILRNWFNLMMEHQDDLARLMTLEQGKPLAEAKGEISYAASFIEWFA EEGKRIYGDTIPGHQADKRLIVIKQPIGVTAAITPWNFPAAMITRKAGPALAAGCTMVLK PASQTPFSALALAELAIRAGIPAGVFNVVTGSAGAVGNELTSNPLVRKLSFTGSTEIGRQ LMEQCAKDIKKVSLELGGNAPFIVFDDADLDKAVEGALSSKFRNAGQTCVCANRLYVQDG VYDRFAEKLQQAVSKLHIGDGLENGVTIGPLIDEKAVAKVEEHIADALEKGARVVCGGKA HERGGNFFQPTILVDVPANAKVSKEETFGPLAPLFRFKDEADVIAQANDTEFGLAAYFYA RDLSRVFRVGEALEYGIVGINTGIISNEVAPFGGIKASGLGREGSKYGIEDYLEIKYMCI GL >gi|296493148|gb|ADTK01000353.1| GENE 11 8379 - 9647 1014 422 aa, chain - ## HITS:1 COG:ygaF KEGG:ns NR:ns ## COG: ygaF COG0579 # Protein_GI_number: 16130574 # Func_class: R General function prediction only # Function: Predicted dehydrogenase # Organism: Escherichia coli K12 # 1 422 23 444 
444 846 98.0 0 MYDFVIIGGGIIGMSTAMQLIDVYPDARIALLEKESGPACHQTGHNSGVIHAGVYYTPGS LKAQFCLAGNRATKAFCDQNGIRYDNCGKMLVATSELEMERMRALWERTAANGIEREWLN AMELREREPNITGLGGIFVPSSGIVSYRNVTAAMAKIFQARGGEIIYNAEVSGLSEHKNG VVIRTRQGGEYEASTLISCSGLMADRLVKMLGLEPGFIICPFRGEYFRLAPEHNQIVNHL IYPIPDPAMPFLGVHLTRMIDGSVTVGPNAVLAFKREGYRKRDFSFSDTLEILGSSGIRR VLQNHLRSGLGEMKNSLCKSGYLRLVQKYCPRLSLSDLQPWPAGVRAQAVSPDGKLIDDF LFVTTPRTIHTCNAPSPAATSAIPIGAHIVSKVQTLLASQSNPGRTLRAARSVDALHAAF NQ >gi|296493148|gb|ADTK01000353.1| GENE 12 9667 - 10644 1011 325 aa, chain - ## HITS:1 COG:no KEGG:SFV_2844 NR:ns ## KEGG: SFV_2844 # Name: not_defined # Def: hypothetical protein # Organism: S.flexneri_8401 # Pathway: not_defined # 1 325 36 360 360 639 99.0 0 MNALTAVQNNAVDSDQDYSGFTLIPSAQSPRLLELTFTEQTTKQFLEQVAEWPVQALEYK SFLRFRVGKILDDLCANQLQPLLLKTLLNRAEGALLINAVGIDDVAQADEMVKLATAVAH LIGRSNFDAMSGQYYARFVVKNVDNSDSYLRQPHRVMELHNDGTYVEEITDYVLMMKIDE QNMQGGNSLLLHLDDWEHLDHYFRHPLARRPMRFAAPPSKNVSKDVFHPVFDVDQQGRPV MRYIDQFVQPKDFEEGVWLSELSDAIETSKGILSVPVPVGKFLLINNLFWLHGRDRFTPH PDLRRELMRQRGYFAYATHHYQTHQ >gi|296493148|gb|ADTK01000353.1| GENE 13 11391 - 12881 467 496 aa, chain - ## HITS:1 COG:ECs3518 KEGG:ns NR:ns ## COG: ECs3518 COG0366 # Protein_GI_number: 15832772 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Escherichia coli O157:H7 # 1 295 118 412 412 596 97.0 1e-170 MIERGDNSGVHQYGPAEHFTHIISDKPSPKDEYIAYAINIPDYELAADVYNINVTSPSEQ QETFKILINPEHLRQTLERKSLTAVQKSQCEIITPKKPGEAILHAFNATYQQIRENMSEF ARCHYGYIQIPPVTTFRADGPETPEEEKGYWFHAYQPEDLCTIHNPMGDLQDFIALVKDA KKFGIDIIPDYTFNFMGIGGSGKNDLDYPSADIRAKISKDIEGGIPGYWQGQVLIPFIKD PVTKERKQIHPEDIHLTAKDFEASKDNISKDEWENLHALKEKRLNGMPKTTPKSDQVIML QNQYVREMRKYGVRGLRYDAAKHSKHEQIERSITPPLKNYNERLHNTNLFNPKYHKKAVM NYMEYLVICQLDEQQMSSLLYERDDLSAIDFSLLMKTIKAFSFGGDLQTLASKPGSTISS IPSERRILININHDFPNNGNLFNDFLFNHQQDEQLAMAYIAALPFSRPLVYWDGQVLKST TEIKNYDGSTRVGGEA >gi|296493148|gb|ADTK01000353.1| GENE 14 14619 - 15992 497 457 aa, chain + ## HITS:1 COG:no KEGG:ECSP_3600 NR:ns ## KEGG: ECSP_3600 # Name: 
not_defined # Def: hypothetical protein # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 449 1 449 449 718 83.0 0 MLVSKSNGFNASAVLGSGSYNENKSSKHMELLAHSILKLICKEAASETYRGALETLQKMM SECIYQEGNAFVIMGAGEQLKRIKYEVGENNLKVFNVHFNNNHELVSSGEPDVICLSKQV WENLLIKLKLENNENVFSETKKLSNKNNADQFFECAKRNEQNLFDNIRKSDFHVGLLKPS STRSVILETPPNVCMESRNSYENKIDEISSLSESKEHPIDIQEKKDAFVNEFKGILFDKN GRSSEFLLNFYECCYEFLPRAQPQDKIESYNSALQAFSIFCSSTLIHNNIGFDFKLFPEV KLCGENLETVFKYKNGDDVREIAKINITLQKEEDGLYNLGGLDFKGCFFSGQNFSNYDIQ YVNWGTSLFDLDTPCIFNAPAYNKSNEKSLKPVSENGLSGVLTDRNNKIKLITGVAPFDD ILFMDDDFDDSSSEDDPVENSPVVTSPVVSSSKSSFQ >gi|296493148|gb|ADTK01000353.1| GENE 15 16378 - 16566 85 62 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_2748 NR:ns ## KEGG: ECIAI1_2748 # Name: not_defined # Def: putative DNA invertase fragment, putative PinH # Organism: E.coli_IAI1 # Pathway: not_defined # 1 62 1 62 85 121 100.0 1e-26 MDARVVCKPNRLGRSMWHLVVLLEELCKRGINFRALAQSIFAQQWGDECCKSKRICDLKV IV >gi|296493148|gb|ADTK01000353.1| GENE 16 16905 - 19805 1320 966 aa, chain + ## HITS:1 COG:ypjA KEGG:ns NR:ns ## COG: ypjA COG3468 # Protein_GI_number: 16130562 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 1 928 44 973 1569 1315 97.0 0 MNRTSPHYCRRSVLSLLISALIYAPPGMAAFTPDVIGVVNDETVDGNQKVDERGTTNNTH IINHGQQNVHGGVSNGSLIESGGYQDIGSHNNFVGQANNTTINGGRQSIHDGGISTGTTI ESGNQDVYKGGISNGTTIKGGASRVEGGSANGILIDGGSQIVKVQGHADGTTINKSGSQD VVQGSLATNTTINGGRQYVEQSTVETTTIKNGGEQRVYESRALDTTIEGGTQSLNSKSTA KITQIYSGGTQIVDNTSTSDVIEVYSGGVLDVSGGTATNVTQHDGAILKTNTNGTTVSGT NSEGAFSIHNKVEDNVLLENGGHLEVSHSANKTIIKDKGTMSVLTNAKADATRIDNGGVM DVAGNATNTIINGGTQNLNNYGIATGTNINSGTQNIKSGGKADTTIISSGSQQVVEKDGT AIGSNISAGGSLIVYTGGIAHGVNQETGSALVANTGAGTDIEGYNKLSHFTITGGEANYV VLENTGELTVVAKTSAKNTTVDAGGKLIVQKEAKTDTTRLNNGGVLEVQDGGEAKHVEQQ SGGALIASTTSGTLIEGTNSYGDAFYIRNSEAKNVVLENAGSLTVVTGSRAVDTIINANG 
KMDVYGKDVGTVLNSAGTQTIYASATSDKANIKGGKQTVYGLATEANIESGEQIVDGGST DKTHINGGTQTVQNYGKAINTDIVSGLQQIMANGIAEGSIINGCSQVVNEGGLAENSVLN DGGTLDVREKGSATGIQQSSQGALVATTRATRVTGTRADGVAFSIEQGAANNILLANGGV LTVESDTSSDKTQVNTGGREIVKTKATATGTTLTGGEQIVEGVANETTINDGGIQTVSAN GEAIKTKINEGGTLTVNDNGKATDIVQNSGAALQTSTANGIEISGTHQYGTFSISGNLAT NMLLENGGNLLVLAGTEARDSTVGSGGAANGSYRSNGLGGHIETGMRFTDGNWNLTPYAS LTGVHR >gi|296493148|gb|ADTK01000353.1| GENE 17 19834 - 20160 222 108 aa, chain + ## HITS:1 COG:ECs3515 KEGG:ns NR:ns ## COG: ECs3515 COG3468 # Protein_GI_number: 15832769 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli O157:H7 # 1 108 1464 1571 1571 200 96.0 7e-52 MESKSVDTRSLYRELGATLSYNMRLGNGMEVEPWLKAAVRKEFVDDNRVKVNSDGNFVND LSGRRGIYQAGIKASFSSTLSGHLGVGYSHGAGVESPWNGVAGVNWSF Prediction of potential genes in microbial genomes Time: Mon May 16 16:09:58 2011 Seq name: gi|296493147|gb|ADTK01000354.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1112.1, whole genome shotgun sequence Length of sequence - 729 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 416 - 728 191 ## EcHS_A1637 L-shaped tail fiber protein Predicted protein(s) >gi|296493147|gb|ADTK01000354.1| GENE 1 416 - 728 191 104 aa, chain + ## HITS:1 COG:no KEGG:EcHS_A1637 NR:ns ## KEGG: EcHS_A1637 # Name: not_defined # Def: L-shaped tail fiber protein # Organism: E.coli_HS # Pathway: not_defined # 1 104 766 869 1258 192 99.0 3e-48 MNPKYLGIDTNGDLAFGESPDQKQNSKLITQAKLDKGLTIGGQLAFKGTTAFSAVATFSA GIAGAIEPENIDGQTVNLNNLTIIKSDAGAVKYYICPSSAGGAN Prediction of potential genes in microbial genomes Time: Mon May 16 16:10:01 2011 Seq name: gi|296493146|gb|ADTK01000355.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1115.1, whole genome shotgun sequence Length of sequence - 95 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 16:10:01 2011 Seq name: gi|296493145|gb|ADTK01000356.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1115.2, whole genome shotgun sequence Length of sequence - 168 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 16:10:06 2011 Seq name: gi|296493144|gb|ADTK01000357.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1157.1, whole genome shotgun sequence Length of sequence - 16062 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 7, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 161 - 192 1.6 1 1 Op 1 12/0.000 - CDS 212 - 889 941 ## COG2181 Nitrate reductase gamma subunit 2 1 Op 2 12/0.000 - CDS 889 - 1599 886 ## COG2180 Nitrate reductase delta subunit 3 1 Op 3 13/0.000 - CDS 1596 - 3134 1803 ## COG1140 Nitrate reductase beta subunit 4 1 Op 4 10/0.000 - CDS 3131 - 6874 4155 ## COG5013 Nitrate reductase alpha subunit - Prom 7037 - 7096 5.1 - Term 7213 - 7243 3.4 5 1 Op 5 . - CDS 7267 - 8658 1136 ## COG2223 Nitrate/nitrite transporter - Prom 8684 - 8743 1.8 6 1 Op 6 . 
- CDS 8753 - 8971 94 ## SSON_1954 hypothetical protein - Prom 9209 - 9268 3.1 + Prom 8736 - 8795 7.5 7 2 Op 1 8/0.000 + CDS 8997 - 10793 1549 ## COG3850 Signal transduction histidine kinase, nitrate/nitrite-specific 8 2 Op 2 . + CDS 10786 - 11436 788 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain - Term 11295 - 11334 0.8 9 3 Tu 1 . - CDS 11437 - 12831 819 ## ECH74115_1703 hypothetical protein - Prom 12855 - 12914 3.3 + Prom 12923 - 12982 2.7 10 4 Tu 1 . + CDS 13016 - 13369 366 ## COG1553 Uncharacterized conserved protein involved in intracellular sulfur reduction + Term 13372 - 13417 2.8 11 5 Tu 1 2/1.000 - CDS 13413 - 14087 515 ## COG3703 Uncharacterized protein involved in cation transport - Prom 14171 - 14230 2.9 12 6 Tu 1 . - CDS 14266 - 14496 270 ## COG4572 Putative cation transport regulator - Prom 14716 - 14775 5.0 + Prom 14660 - 14719 4.8 13 7 Tu 1 . + CDS 14766 - 15866 1068 ## COG0387 Ca2+/H+ antiporter + Term 15870 - 15924 7.8 Predicted protein(s) >gi|296493144|gb|ADTK01000357.1| GENE 1 212 - 889 941 225 aa, chain - ## HITS:1 COG:ECs1732 KEGG:ns NR:ns ## COG: ECs1732 COG2181 # Protein_GI_number: 15830986 # Func_class: C Energy production and conversion # Function: Nitrate reductase gamma subunit # Organism: Escherichia coli O157:H7 # 1 225 1 225 225 422 100.0 1e-118 MQFLNMFFFDIYPYIAGAVFLIGSWLRYDYGQYTWRAASSQMLDRKGMNLASNLFHIGIL GIFVGHFFGMLTPHWMYEAWLPIEVKQKMAMFAGGASGVLCLIGGVLLLKRRLFSPRVRA TTTGADILILSLLVIQCALGLLTIPFSAQHMDGSEMMKLVGWAQSVVTFHGGASQHLDGV AFIFRLHLVLGMTLFLLFPFSRLVHIWSVPVEYLTRKYQLVRARH >gi|296493144|gb|ADTK01000357.1| GENE 2 889 - 1599 886 236 aa, chain - ## HITS:1 COG:ECs1731 KEGG:ns NR:ns ## COG: ECs1731 COG2180 # Protein_GI_number: 15830985 # Func_class: C Energy production and conversion # Function: Nitrate reductase delta subunit # Organism: Escherichia coli O157:H7 # 1 236 1 236 236 426 99.0 1e-119 MIELVIVSRLLEYPDAALWQHQQEMFEAIAASKNLPKEDAHALGIFLRDLTAMDPLDAQA 
QYSELFDRGRATSLLLFEHVHGESRDRGQAMVDLLAQYEQHGLQLNSRELPDHLPLYLES LAQLPQSEAVEGLKDIAPILALLSARLQQRESRYAVLFDLLLKLANTAIDSDKVAEKIAD EARDDTPQALDAVWEEEQVKFFADKGCGDSAITAHQRRFAGAVAPQYLNITTGGQH >gi|296493144|gb|ADTK01000357.1| GENE 3 1596 - 3134 1803 512 aa, chain - ## HITS:1 COG:narH KEGG:ns NR:ns ## COG: narH COG1140 # Protein_GI_number: 16129188 # Func_class: C Energy production and conversion # Function: Nitrate reductase beta subunit # Organism: Escherichia coli K12 # 1 512 1 512 512 1093 99.0 0 MKIRSQVGMVLNLDKCIGCHTCSVTCKNVWTSREGVEYAWFNNVETKPGQGFPTDWENQE KYKGGWIRKINGKLQPRMGNRAMLLGKIFANPHLPGIDDYYEPFDFDYQNLHTAPEGSKS QPIARPRSLITGERMAKIEKGPNWEDDLGGEFDKLAKDKNFDNIQKAMYSQFENTFMMYL PRLCEHCLNPACVATCPSGAIYKREEDGIVLIDQDKCRGWRMCITGCPYKKIYFNWKSGK SEKCIFCYPRIEAGQPTVCSETCVGRIRYLGVLLYDADAIERAASTENEKDLYQRQLDVF LDPNDPKVIEQAIKDGIPLSVIEAAQQSPVYKMAMEWKLALPLHPEYRTLPMVWYVPPLS PIQSAADAGELGSNGILPDVESLRIPVQYLANLLTVGDTKPVLRALKRMLAMRHYKRAET VDGKVDTRALEEVGLTEAQAQEMYRYLAIANYEDRFVVPSSHRELAREAFPEKNGCGFTF GDGCHGSDTKFNLFNSRRIDAIDVTSKTEPHP >gi|296493144|gb|ADTK01000357.1| GENE 4 3131 - 6874 4155 1247 aa, chain - ## HITS:1 COG:ECs1729 KEGG:ns NR:ns ## COG: ECs1729 COG5013 # Protein_GI_number: 15830983 # Func_class: C Energy production and conversion # Function: Nitrate reductase alpha subunit # Organism: Escherichia coli O157:H7 # 1 1247 1 1247 1247 2599 99.0 0 MSKFLDRFRYFKQKGETFADGHGQLLNTNRDWEDGYRQRWQHDKIVRSTHGVNCTGSCSW KIYVKNGLVTWETQQTDYPRTRPDLPNHEPRGCPRGASYSWYLYSANRLKYPMMRKRLMK MWREAKALHSDPVEAWASIIEDADKAKSFKQARGRGGFVRSSWQEVNELIAASNVYTIKN YGPDRVAGFSPIPAMSMVSYASGARYLSLIGGTCLSFYDWYCDLPPASPQTWGEQTDVPE SADWYNSSYIIAWGSNVPQTRTPDAHFFTEVRYKGTKTVAVTPDYAEIAKLCDLWLAPKQ GTDAAMALAMGHVMLREFHLDNPSQYFTDYVRRYTDMPMLVMLEERDGYYAAGRMLRAAD LVDALGQENNPEWKTVAFNTNGEMVAPNGSIGFRWGEKGKWNLEQRDGKTGEETELQLSL LGSQDEIAEVGFPYFGGDGTEHFNKVELENVLLHKLPVKRLQLADGSTALVTTVYDLTLA NYGLERGLNDVNCATSYDDVKAYTPAWAEQITGVSRSQIIRIAREFADNADKTHGRSMII VGAGLNHWYHLDMNYRGLINMLIFCGCVGQSGGGWAHYVGQEKLRPQTGWQPLAFALDWQ 
RPARHMNSTSYFYNHSSQWRYETVTAEELLSPMADKSRYTGHLIDFNVRAERMGWLPSAP QLGTNPLTIAAEAEKAGMNPVDYTVKSLKEGSIRFAAEQPENGKNHPRNLFIWRSNLLGS SGKGHEFMLKYLLGTEHGIQGKDLGQQGGVKPEEVDWQDNGLEGKLDLVVTLDFRLSSTC LYSDIILPTATWYEKDDMNTSDMHPFIHPLSAAVDPAWEAKSDWEIYKAIAKKFSEVCVG HLGKETDIVTLPIQHDSAAELAQPLDVKDWKKGECDLIPGKTAPHIMVVERDYPATYERF TSIGPLMEKIGNGGKGIAWNTQSEMDLLRKLNYTKAEGPAKGQPMLNTAIDAAEMILTLA PETNGQVAVKAWAALSEFTGRDHTHLALNKEDEKIRFRDIQAQPRKIISSPTWSGLEDEH VSYNAGYTNVHELIPWRTLSGRQQLYQDHQWMRDFGESLLVYRPPIDTRSVKEVIGQKSN GNPEKALNFLTPHQKWGIHSTYSDNLLMLTLGRGGPVVWLSEADAKDLGIADNDWIEVFN SNGALTARAVVSQRVPAGMTMMYHAQERIVNLPGSEITQQRGGIHNSVTRITPKPTHMIG GYAHLAYGFNYYGTVGSNRDEFVVVRKMKNIDWLDGEGNDQVQESVK >gi|296493144|gb|ADTK01000357.1| GENE 5 7267 - 8658 1136 463 aa, chain - ## HITS:1 COG:narK KEGG:ns NR:ns ## COG: narK COG2223 # Protein_GI_number: 16129186 # Func_class: P Inorganic ion transport and metabolism # Function: Nitrate/nitrite transporter # Organism: Escherichia coli K12 # 1 463 1 463 463 801 100.0 0 MSHSSAPERATGAVITDWRPEDPAFWQQRGQRIASRNLWISVPCLLLAFCVWMLFSAVAV NLPKVGFNFTTDQLFMLTALPSVSGALLRVPYSFMVPIFGGRRWTAFSTGILIIPCVWLG FAVQDTSTPYSVFIIISLLCGFAGANFASSMANISFFFPKQKQGGALGLNGGLGNMGVSV MQLVAPLVVSLSIFAVFGSQGVKQPDGTELYLANASWIWVPFLAIFTIAAWFGMNDLATS KASIKEQLPVLKRGHLWIMSLLYLATFGSFIGFSAGFAMLSKTQFPDVQILQYAFFGPFI GALARSAGGALSDRLGGTRVTLVNFILMAIFSGLLFLTLPTDGQGGSFMAFFAVFLALFL TAGLGSGSTFQMISVIFRKLTMDRVKAEGGSDERAMREAATDTAAALGFISAIGAIGGFF IPKAFGSSLALTGSPVGAMKVFLIFYIACVVITWAVYGRHSKK >gi|296493144|gb|ADTK01000357.1| GENE 6 8753 - 8971 94 72 aa, chain - ## HITS:1 COG:no KEGG:SSON_1954 NR:ns ## KEGG: SSON_1954 # Name: not_defined # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 72 1 72 72 139 98.0 3e-32 MSNNLNECDDTIWNGSILGYWLKYTHTRKELLLICRVVSRFTSVRVGILQHREKSHNFYE ITVLTMGNDKYQ >gi|296493144|gb|ADTK01000357.1| GENE 7 8997 - 10793 1549 598 aa, chain + ## HITS:1 COG:ECs1727 KEGG:ns NR:ns ## COG: ECs1727 COG3850 # Protein_GI_number: 15830981 # Func_class: T Signal transduction mechanisms # Function: Signal 
transduction histidine kinase, nitrate/nitrite-specific # Organism: Escherichia coli O157:H7 # 1 598 1 598 598 1133 100.0 0 MLKRCLSPLTLVNQVALIVLLSTAIGLAGMAVSGWLVQGVQGSAHAINKAGSLRMQSYRL LAAVPLSEKDKPLIKEMEQTAFSAELTRAAERDGQLAQLQGLQDYWRNELIPALMRAQNR ETVSADVSQFVAGLDQLVSGFDRTTEMRIETVVLVHRVMAVFMALLLVFTIIWLRARLLQ PWRQLLAMASAVSHRDFTQRANISGRNEMAMLGTALNNMSAELAESYAVLEQRVQEKTAG LEHKNQILSFLWQANRRLHSRAPLCERLSPVLNGLQNLTLLRDIELRVYDTDDEENHQEF TCQPDMTCDDKGCQLCPRGVLPVGDRGTTLKWRLADSHTQYGILLATLPQGRHLSHDQQQ LVDTLVEQLTATLALDRHQERQQQLIVMEERATIARELHDSIAQSLSCMKMQVSCLQMQG DALPESSRELLSQIRNELNASWAQLRELLTTFRLQLTEPGLRPALEASCEEYSAKFGFPV KLDYQLPPRLVPSHQAIHLLQIAREALSNALKHSQASEVVVTVAQNDNQVKLTVQDNGCG VPENAIRSNHYGMIIMRDRAQSLRGDCRVRRRESGGTEVVVTFIPEKTFTDVQGDTHE >gi|296493144|gb|ADTK01000357.1| GENE 8 10786 - 11436 788 216 aa, chain + ## HITS:1 COG:ECs1726 KEGG:ns NR:ns ## COG: ECs1726 COG2197 # Protein_GI_number: 15830980 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 216 1 216 216 346 100.0 2e-95 MSNQEPATILLIDDHPMLRTGVKQLISMAPDITVVGEASNGEQGIELAESLDPDLILLDL NMPGMNGLETLDKLREKSLSGRIVVFSVSNHEEDVVTALKRGADGYLLKDMEPEDLLKAL HQAAAGEMVLSEALTPVLAASLRANRATTERDVNQLTPRERDILKLIAQGLPNKMIARRL DITESTVKVHVKHMLKKMKLKSRVEAAVWVHQERIF >gi|296493144|gb|ADTK01000357.1| GENE 9 11437 - 12831 819 464 aa, chain - ## HITS:1 COG:no KEGG:ECH74115_1703 NR:ns ## KEGG: ECH74115_1703 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 464 13 476 476 897 99.0 0 MSRFVPRIIPFYLLLLVAGGTANAQSTFEQKAANPFDNNNDGLPDLGMAPENHDGEKHFA EIVKDFGETSMNDNGLDTGEQAKAFALGKVRDALSQQVNQHVESWLSPWGNASIDVKVDN EGHFTGSRGSWFVPLQDNDRYLTWSQLGLTQQDDGLVSNVGVGQRWARGNWLVGYNTFYD NLLDENLQRAGFGAEAWGEYLRLSANFYQPFAAWHEQTATQEQRMARGYDLTARMRMPFY QHLNTSVSVEQYFGERVDLFNSGTGYHNPVALSLGLNYTPVPLVTVTAQHKQGESGENQN NLGLNLNYRFGVPLKKQLSAGEVAESQSLRGSRYDNPQRNNLPTLEYRQRKTLTVFLATP 
PWDLKPGETVPLKLQIRSRYGIRQLIWQGDTQILSLTPGAQANSAEGWTLIMPDWQNGEG ASNHWRLSVVVEDNQGQRVSSNEITLTLVEPFDALSNDELRWEP >gi|296493144|gb|ADTK01000357.1| GENE 10 13016 - 13369 366 117 aa, chain + ## HITS:1 COG:ECs1724 KEGG:ns NR:ns ## COG: ECs1724 COG1553 # Protein_GI_number: 15830978 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized conserved protein involved in intracellular sulfur reduction # Organism: Escherichia coli O157:H7 # 1 117 1 117 117 224 100.0 2e-59 MQKIVIVANGAPYGSESLFNSLRLAIALREQESNLDLRLFLMSDAVTAGLRGQKPGEGYN IQQMLEILTAQNVPVKLCKTCTDGRGISTLPLIDGVEIGTLVELAQWTLSADKVLTF >gi|296493144|gb|ADTK01000357.1| GENE 11 13413 - 14087 515 224 aa, chain - ## HITS:1 COG:chaC KEGG:ns NR:ns ## COG: chaC COG3703 # Protein_GI_number: 16129181 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in cation transport # Organism: Escherichia coli K12 # 1 224 15 238 238 454 99.0 1e-128 MNADCKTAFGAIEESLLWSAEQRAASLAATLACRPDEGPVWIFGYGSLMWNPALEFTESC TGTLVGWHRAFCLRLTAGRGTAHQPGRMLALKEGGRTTGVAYRLPEETLEQELTLLWKRE MITGCYLPTWCQLDLDDGCTVNAIVFIMDPRHPEYESDTRAQVIAPLIAAASGPLGTNAQ YLFSLEQELIKLGMQDDGLNDLLVSVKKLLAENYPDGVLRPGFA >gi|296493144|gb|ADTK01000357.1| GENE 12 14266 - 14496 270 76 aa, chain - ## HITS:1 COG:ECs1722 KEGG:ns NR:ns ## COG: ECs1722 COG4572 # Protein_GI_number: 15830976 # Func_class: R General function prediction only # Function: Putative cation transport regulator # Organism: Escherichia coli O157:H7 # 1 76 1 76 76 98 98.0 2e-21 MPYKTKSDLPESVKHVLPSHAQDIYKEAFNSAWDQYKDKDDRRDDASREETAHKVAWAAV KHEYAKGDDDKWHKKS >gi|296493144|gb|ADTK01000357.1| GENE 13 14766 - 15866 1068 366 aa, chain + ## HITS:1 COG:STM1771 KEGG:ns NR:ns ## COG: STM1771 COG0387 # Protein_GI_number: 16765112 # Func_class: P Inorganic ion transport and metabolism # Function: Ca2+/H+ antiporter # Organism: Salmonella typhimurium LT2 # 1 366 1 366 366 576 93.0 1e-164 MSNAQEAVKTRHKETSLIFPVLALVVLFLWGSSQTLPVVIAINLLALIGILSSAFSVVRH 
ADVLAHRLGEPYGSLILSLSVVILEVSLISALRATGDAAPTLMRDTLYSIIMIVTGGLVG FSLLLGGRKFATQYMNLFGIKQYLIALFPLAIIVLVFPMALPAANFSTGQALLVALISAA MYGVFLLIQTKTHQSLFVYEHEDDSDDDDPHHGKPSAHSSVWHAIWLIIHLIAVIAVTKM NASPLETLLDSMNAPVAFTGFLVALLILSPEGLGALKAVLNNQVQRAMNLFFGSVLATIS LTVPVVTLIAFMTGNELQFALGAPEMVVMVASLVLCHISFSTGRTNVLNGAAHLALFAAY LMTIFA Prediction of potential genes in microbial genomes Time: Mon May 16 16:10:14 2011 Seq name: gi|296493143|gb|ADTK01000358.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1157.2, whole genome shotgun sequence Length of sequence - 351 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 16:10:29 2011 Seq name: gi|296493142|gb|ADTK01000359.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1157.3, whole genome shotgun sequence Length of sequence - 45517 bp Number of predicted genes - 46, with homology - 45 Number of transcription units - 28, operones - 10 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 303 - 339 4.1 2 2 Op 1 8/0.000 - CDS 361 - 1215 1042 ## COG2877 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase 3 2 Op 2 6/0.200 - CDS 1251 - 2009 698 ## COG2912 Uncharacterized conserved protein 4 2 Op 3 6/0.200 - CDS 2064 - 2456 345 ## COG3094 Uncharacterized protein conserved in bacteria 5 2 Op 4 32/0.000 - CDS 2453 - 3286 483 ## PROTEIN SUPPORTED gi|225874212|ref|YP_002755671.1| ribosomal protein L11 methyltransferase 6 2 Op 5 9/0.000 - CDS 3286 - 4368 1196 ## COG0216 Protein chain release factor A 7 2 Op 6 . 
- CDS 4410 - 5666 1240 ## COG0373 Glutamyl-tRNA reductase - Prom 5705 - 5764 4.0 + Prom 5680 - 5739 6.0 8 3 Op 1 13/0.000 + CDS 5880 - 6503 565 ## COG3017 Outer membrane lipoprotein involved in outer membrane biogenesis 9 3 Op 2 13/0.000 + CDS 6503 - 7354 536 ## COG1947 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase + Term 7374 - 7400 0.3 10 3 Op 3 3/0.600 + CDS 7439 - 8452 1047 ## COG0462 Phosphoribosylpyrophosphate synthetase + Term 8465 - 8504 4.1 11 4 Tu 1 . + CDS 8604 - 10256 1701 ## COG0659 Sulfate permease and related transporters (MFS superfamily) 12 5 Tu 1 . - CDS 10311 - 10589 272 ## LF82_2742 uncharacterized protein YchH - Prom 10642 - 10701 3.7 + Prom 10778 - 10837 4.2 13 6 Op 1 14/0.000 + CDS 10867 - 11451 511 ## COG0193 Peptidyl-tRNA hydrolase + Term 11463 - 11495 0.7 + Prom 11476 - 11535 2.4 14 6 Op 2 . + CDS 11568 - 12659 1167 ## COG0012 Predicted GTPase, probable translation factor + Term 12720 - 12749 2.1 - Term 12691 - 12750 10.3 15 7 Tu 1 . - CDS 12955 - 13158 82 ## ECH74115_1683 hypothetical protein - Prom 13327 - 13386 5.6 16 8 Tu 1 . + CDS 13503 - 16388 1621 ## COG3468 Type V secretory pathway, adhesin AidA 17 9 Tu 1 . - CDS 16488 - 18407 1521 ## COG3284 Transcriptional activator of acetoin/glycerol metabolism - Prom 18440 - 18499 6.2 + Prom 18482 - 18541 4.9 18 10 Op 1 10/0.000 + CDS 18635 - 19705 1020 ## COG2376 Dihydroxyacetone kinase 19 10 Op 2 2/0.700 + CDS 19716 - 20348 606 ## COG2376 Dihydroxyacetone kinase 20 10 Op 3 1/0.800 + CDS 20359 - 21777 1190 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) + Term 21787 - 21827 8.1 + Prom 21951 - 22010 5.5 21 11 Tu 1 . + CDS 22097 - 23794 1575 ## COG1626 Neutral trehalase + Term 23814 - 23850 4.7 22 12 Op 1 . - CDS 23791 - 24018 142 ## 23 12 Op 2 . - CDS 23873 - 24169 105 ## ECO111_1525 hypothetical protein - Prom 24302 - 24361 4.7 - Term 24437 - 24478 6.2 24 13 Tu 1 . 
- CDS 24491 - 24745 280 ## COG2261 Predicted membrane protein - Prom 24860 - 24919 3.9 + Prom 24847 - 24906 2.6 25 14 Tu 1 . + CDS 24946 - 25680 628 ## COG5581 Predicted glycosyltransferase - Term 25519 - 25562 1.7 26 15 Tu 1 . - CDS 25682 - 26293 576 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) - Prom 26346 - 26405 1.7 + Prom 26306 - 26365 2.6 27 16 Op 1 1/0.800 + CDS 26393 - 27307 817 ## COG1619 Uncharacterized proteins, homologs of microcin C7 resistance protein MccF + Term 27342 - 27375 -0.4 + Prom 27321 - 27380 1.7 28 16 Op 2 . + CDS 27402 - 29138 1699 ## COG3263 NhaP-type Na+/H+ and K+/H+ antiporters with a unique C-terminal domain 29 17 Tu 1 . - CDS 29272 - 29421 62 ## gi|188493462|ref|ZP_03000732.1| hypothetical protein Ec53638_4549 30 18 Op 1 7/0.100 - CDS 29524 - 30594 876 ## COG0787 Alanine racemase 31 18 Op 2 . - CDS 30604 - 31902 1358 ## COG0665 Glycine/D-amino acid oxidases (deaminating) - Prom 31948 - 32007 5.8 + Prom 31928 - 31987 4.4 32 19 Tu 1 . + CDS 32232 - 33764 1462 ## COG2719 Uncharacterized conserved protein + Term 33778 - 33806 1.0 - Term 33762 - 33798 2.3 33 20 Tu 1 . - CDS 33816 - 34535 705 ## COG2186 Transcriptional regulators - Prom 34568 - 34627 4.6 + Prom 34650 - 34709 5.5 34 21 Tu 1 . + CDS 34757 - 36298 1669 ## COG3067 Na+/H+ antiporter + Term 36309 - 36340 4.1 + Prom 36305 - 36364 4.2 35 22 Tu 1 . + CDS 36444 - 36974 588 ## COG1495 Disulfide bond formation protein DsbB + Term 36983 - 37013 3.0 36 23 Op 1 4/0.500 - CDS 37020 - 38288 561 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair 37 23 Op 2 . - CDS 38288 - 38707 299 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) + Prom 38999 - 39058 4.8 38 24 Tu 1 . 
+ CDS 39078 - 39989 737 ## ECH74115_1668 hemolysin E 39 25 Op 1 1/0.800 - CDS 40196 - 40642 458 ## COG2983 Uncharacterized conserved protein - Prom 40666 - 40725 3.9 40 25 Op 2 2/0.700 - CDS 40734 - 41393 653 ## COG0179 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) 41 25 Op 3 . - CDS 41465 - 41758 327 ## COG3100 Uncharacterized protein conserved in bacteria - Prom 41950 - 42009 6.6 + Prom 41907 - 41966 4.5 42 26 Tu 1 . + CDS 42000 - 42401 298 ## ECB_01153 hypothetical protein - Term 42393 - 42433 2.1 43 27 Tu 1 . - CDS 42521 - 42883 75 ## JW1166 hypothetical protein - Prom 43107 - 43166 4.6 + Prom 43202 - 43261 4.5 44 28 Op 1 22/0.000 + CDS 43409 - 44104 500 ## COG0850 Septum formation inhibitor 45 28 Op 2 22/0.000 + CDS 44128 - 44940 842 ## COG2894 Septum formation inhibitor-activating ATPase 46 28 Op 3 . + CDS 44944 - 45210 451 ## COG0851 Septum formation topological specificity factor + Term 45236 - 45273 8.0 Predicted protein(s) >gi|296493142|gb|ADTK01000359.1| GENE 1 79 - 213 129 44 aa, chain + ## HITS:1 COG:no KEGG:SBO_1851 NR:ns ## KEGG: SBO_1851 # Name: not_defined # Def: hypothetical protein # Organism: S.boydii # Pathway: not_defined # 1 44 48 91 91 85 97.0 7e-16 MLSLQQGGYMTLAQFAMTFWHDLAAPILAGIITAAIVGWWRNRK >gi|296493142|gb|ADTK01000359.1| GENE 2 361 - 1215 1042 284 aa, chain - ## HITS:1 COG:ECs1720 KEGG:ns NR:ns ## COG: ECs1720 COG2877 # Protein_GI_number: 15830974 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase # Organism: Escherichia coli O157:H7 # 1 284 1 284 284 567 99.0 1e-162 MKQKVVSIGDINVANDLPFVLFGGMNVLESRDLAMRICEHYVTVTQKLGIPYVFKASFDK ANRSSIHSYRGPGLEEGMKIFQELKQTFGVKIITDVHEPSQAQPVADVVDVIQLPAFLAR QTDLVEAMAKTGAVINVKKPQFVSPGQMGNIVDKFKEGGNEKVILCDRGANFGYDNLVVD MLGFSIMKKVSGNSPVIFDVTHALQCRDPFGAASGGRRAQVAELARAGMAVGLAGLFIEA HPDPEHAKCDGPSALPLAKLEPFLKQMKAIDDLVKGFEELDTSK >gi|296493142|gb|ADTK01000359.1| GENE 3 1251 - 2009 698 252 aa, 
chain - ## HITS:1 COG:ECs1719 KEGG:ns NR:ns ## COG: ECs1719 COG2912 # Protein_GI_number: 15830973 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 252 18 269 269 486 100.0 1e-137 MILACEAIRRDFPSQDVYDELERLVSLAKEEISQLLPLEEQLEKLIALFYGDWGFKASRG VYRLSDALWLDQVLKNRQGSAVSLGAVLLWVANRLDLPLLPVIFPTQLILRIECPDGEIW LINPFNGESLSEHMLDVWLKGNISPSAELFYEDLDEADNIEVIRKLLDTLKASLMEENQM ELALRTSEALLQFNPEDPYEIRDRGLIYAQLDCEHVALNDLSYFVEQCPEDPISEMIRAQ INNIAHKHIVLH >gi|296493142|gb|ADTK01000359.1| GENE 4 2064 - 2456 345 130 aa, chain - ## HITS:1 COG:ychQ KEGG:ns NR:ns ## COG: ychQ COG3094 # Protein_GI_number: 16129176 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 130 1 130 130 207 99.0 4e-54 MTSFSTLLSVHLISIALSVGLLTLRFWLRYQKHPQAFARWTRIVPPVVDTMLLLSGIALM AKAHILPFSGQAQWLTEKLFGVIIYIVLGFIALDYRRMHSQQARIIAFPLALVVLYIIIK LATTKVPLLG >gi|296493142|gb|ADTK01000359.1| GENE 5 2453 - 3286 483 277 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225874212|ref|YP_002755671.1| ribosomal protein L11 methyltransferase [Acidobacterium capsulatum ATCC 51196] # 7 274 23 289 294 190 40 1e-47 MEYQHWLREAISQLQASESPRRDAEILLEHVTGKGRTFILAFGETQLTDEQCQQLDALLT RRRDGEPIAHLTGVREFWSLPLFVSPATLIPRPDTECLVEQALARLPEQPCRILDLGTGT GAIALALARERPDCEITAVDRMPDAVALAQRNAQHLAIKNIHILQSDWFSALAGQQFAMI VSNPPYIDEQDPHLQQGDVRFEPLTALVAADSGMADIVHIIEQSRNALVSGGFLLLEHGW QQGEAVRQAFILAGYHDVETCRDYGDNERVTLGRYYQ >gi|296493142|gb|ADTK01000359.1| GENE 6 3286 - 4368 1196 360 aa, chain - ## HITS:1 COG:ECs1716 KEGG:ns NR:ns ## COG: ECs1716 COG0216 # Protein_GI_number: 15830970 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor A # Organism: Escherichia coli O157:H7 # 1 360 1 360 360 632 100.0 0 MKPSIVAKLEALHERHEEVQALLGDAQTIADQERFRALSREYAQLSDVSRCFTDWQQVQE DIETAQMMLDDPEMREMAQDELREAKEKSEQLEQQLQVLLLPKDPDDERNAFLEVRAGTG GDEAALFAGDLFRMYSRYAEARRWRVEIMSASEGEHGGYKEIIAKISGDGVYGRLKFESG 
GHRVQRVPATESQGRIHTSACTVAVMPELPDAELPDINPADLRIDTFRSSGAGGQHVNTT DSAIRITHLPTGIVVECQDERSQHKNKAKALSVLGARIHAAEMAKRQQAEASTRRNLLGS GDRSDRNRTYNFPQGRVTDHRINLTLYRLDEVMEGKLDMLIEPIIQEHQADQLAALSEQE >gi|296493142|gb|ADTK01000359.1| GENE 7 4410 - 5666 1240 418 aa, chain - ## HITS:1 COG:ECs1715 KEGG:ns NR:ns ## COG: ECs1715 COG0373 # Protein_GI_number: 15830969 # Func_class: H Coenzyme transport and metabolism # Function: Glutamyl-tRNA reductase # Organism: Escherichia coli O157:H7 # 1 418 1 418 418 732 99.0 0 MTLLALGINHKTAPVSLRERVSFSPDKLDQALDSLLAQPMVQGGVVLSTCNRTELYLSVE EQDNLQEALIRWLCDYHNLNEEDLRKSLYWHQDNDAVSHLMRVASGLDSLVLGEPQILGQ VKKAFADSQKGHMKASELERMFQKSFSVAKRVRTETDIGASAVSVAFAACTLARQIFESL STVTVLLVGAGETIELVARHLREHKVQKMIIANRTRERAQILADEVGAEVIALSDIDERL READIIISSTASPLPIIGKGMVERALKSRRNQPMLLVDIAVPRDVEPEVGKLANAYLYSV DDLQSIISHNLAQRKAAAVEAETIVAQETSEFMAWLRAQSASETIREYRGQAEQVRDELT AKALAALEQGGDAQAIMQDLAWKLTNRLIHAPTKSLQQAARDGDNERLNILRDSLGLE >gi|296493142|gb|ADTK01000359.1| GENE 8 5880 - 6503 565 207 aa, chain + ## HITS:1 COG:ECs1714 KEGG:ns NR:ns ## COG: ECs1714 COG3017 # Protein_GI_number: 15830968 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane lipoprotein involved in outer membrane biogenesis # Organism: Escherichia coli O157:H7 # 1 207 1 207 207 403 99.0 1e-112 MPLPDFRLIRLLPLAALVLTACSVTTPKGPGKSPDSPQWRQHQQDVRNLNQYQTRGAFAY ISDQQKVYARFFWQQTGQDRYRLLLTNPLGSTELELNAQPGNVQLVDNKGQRYTADDAEE MIGKLTGMPIPLNSLRQWILGLPGDATDYKLDDQYRLNEITYSQNGKNWKVVYGGYDTKT QPAMPANMELTDGGQRIKLKMDNWIVK >gi|296493142|gb|ADTK01000359.1| GENE 9 6503 - 7354 536 283 aa, chain + ## HITS:1 COG:ECs1713 KEGG:ns NR:ns ## COG: ECs1713 COG1947 # Protein_GI_number: 15830967 # Func_class: I Lipid transport and metabolism # Function: 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase # Organism: Escherichia coli O157:H7 # 1 283 1 283 283 560 99.0 1e-159 MRTQWPSPAKLNLFLYITGQRADGYHTLQTLFQFLDYGDTISIELRDDGDIRLLTPVEGV EHEDNLIVRAARLLMKTAADSGRLPTGSGADISIDKRLPMGGGLGGGSSNAATVLVALNH 
LWQCGLSMDELAEMGLTLGADVPVFVRGHAAFAEGVGEILTPVDPPEKWYLVAHPGVSIP TPVIFKDPELPRNTPKRSIETLLKCEFSNDCEVIARKRFREVDAVLSWLLEYAPSRLTGT GACVFAEFDTESEARQVLEQAPEWLNGFVAKGVNLSPLHRAML >gi|296493142|gb|ADTK01000359.1| GENE 10 7439 - 8452 1047 337 aa, chain + ## HITS:1 COG:ECs1712 KEGG:ns NR:ns ## COG: ECs1712 COG0462 # Protein_GI_number: 15830966 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Escherichia coli O157:H7 # 23 337 1 315 315 595 99.0 1e-170 MPGPHSFRQILSTNGRMPEVLLVPDMKLFAGNATPELAQRIANRLYTSLGDAAVGRFSDG EVSVQINENVRGGDIFIIQSTCAPTNDNLMELVVMVDALRRASAGRITAVIPYFGYARQD RRVRSARVPITAKVVADFLSSVGVDRVLTVDLHAEQIQGFFDVPVDNVFGSPILLEDMLQ LNLDNPIVVSPDIGGVVRARAIAKLLNDTDMAIIDKRRPRANVSQVMHIIGDVAGRDCVL VDDMIDTGGTLCKAAEALKERGAKRVFAYATHPIFSGNAANNLRNSVIDEVVVCDTIPLS DEIKSLPNVRTLTLSGMLAEAIRRISNEESISAMFEH >gi|296493142|gb|ADTK01000359.1| GENE 11 8604 - 10256 1701 550 aa, chain + ## HITS:1 COG:ECs1711 KEGG:ns NR:ns ## COG: ECs1711 COG0659 # Protein_GI_number: 15830965 # Func_class: P Inorganic ion transport and metabolism # Function: Sulfate permease and related transporters (MFS superfamily) # Organism: Escherichia coli O157:H7 # 1 550 1 550 550 919 100.0 0 MPFRALIDACWKEKYTAARFTRDLIAGITVGIIAIPLAMALAIGSGVAPQYGLYTAAVAG IVIALTGGSRFSVSGPTAAFVVILYPVSQQFGLAGLLVATLLSGIFLILMGLARFGRLIE YIPVSVTLGFTSGIGITIGTMQIKDFLGLQMAHVPEHYLQKVGALFMALPTINVGDAAIG IVTLGILVFWPRLGIRLPGHLPALLAGCAVMGIVNLLGGHVATIGSQFHYVLADGSQGNG IPQLLPQLVLPWDLPNSEFTLTWDSIRTLLPAAFSMAMLGAIESLLCAVVLDGMTGTKHK ANSELVGQGLGNIIAPFFGGITATAAIARSAANVRAGATSPISAVIHSILVILALLVLAP LLSWLPLSAMAALLLMVAWNMSEAHKVVDLLRHAPKDDIIVMLLCMSLTVLFDMVIAISV GIVLASLLFMRRIARMTRLAPVVVDVPDDVLVLRVIGPLFFAAAEGLFTDLESRLEGKRI VILKWDAVPVLDAGGLDAFQRFVKRLPEGCELRVCNVEFQPLRTMARAGIQPIPGRLAFF PNRRAAMADL >gi|296493142|gb|ADTK01000359.1| GENE 12 10311 - 10589 272 92 aa, chain - ## HITS:1 COG:no KEGG:LF82_2742 NR:ns ## KEGG: LF82_2742 # Name: ychH # Def: uncharacterized protein YchH # Organism: 
E.coli_LF82 # Pathway: not_defined # 1 92 1 92 92 152 100.0 3e-36 MKRKNASLLGNVLMGLGLVVMVVGVGYSILNQLPQFNMPQYFAHGAVLSIFVGAILWLAG ARVGGHEQVCDRYWWVRHYDKRCRRSDNRRHS >gi|296493142|gb|ADTK01000359.1| GENE 13 10867 - 11451 511 194 aa, chain + ## HITS:1 COG:ECs1709 KEGG:ns NR:ns ## COG: ECs1709 COG0193 # Protein_GI_number: 15830963 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptidyl-tRNA hydrolase # Organism: Escherichia coli O157:H7 # 1 194 1 194 194 367 99.0 1e-102 MTIKLIVGLANPGAEYAATRHNAGAWFVDLLAERLRAPLREEAKFFGYTSRVTLGGEGVR LLVPTTFMNLSGKAVAAMASFFRINPDEILVAHDELDLPPGVAKFKLGGGHGGHNGLKDI ISKLGNNPNFHRLRIGIGHPGDKNKVVGFVLGKPPVSEQKLIDEAIDEAARCTEMWFTDG LTKATNRLHAFKAQ >gi|296493142|gb|ADTK01000359.1| GENE 14 11568 - 12659 1167 363 aa, chain + ## HITS:1 COG:ECs1708 KEGG:ns NR:ns ## COG: ECs1708 COG0012 # Protein_GI_number: 15830962 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted GTPase, probable translation factor # Organism: Escherichia coli O157:H7 # 1 363 1 363 363 719 99.0 0 MGFKCGIVGLPNVGKSTLFNALTKAGIEAANFPFCTIEPNTGVVPMPDPRLDQLAEIVKP QRTLPTTMEFVDIAGLVKGASKGEGLGNQFLTNIRETEAIGHVVRCFENDNIIHVSGKVN PADDIEVINTELALADLDTCERAIHRVQKKAKGGDKDAKAELAVLEKCLPQLENAGMLRA LDLSAEEKAAIRYLSFLTLKPTMYIANVNEDGFENNPYLDQVREIAAKEGSVVLPVCAAV EADIAELDDEERDEFMQELGLEEPGLNRVIRAGYKLLNLQTYFTAGVKEVRAWTIPVGAT APQAAGKIHTDFEKGFIRAQTISFEDFITYKGEQGAKEAGKMRAEGKDYIVKDGDVMNFL FNV >gi|296493142|gb|ADTK01000359.1| GENE 15 12955 - 13158 82 67 aa, chain - ## HITS:1 COG:no KEGG:ECH74115_1683 NR:ns ## KEGG: ECH74115_1683 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 67 1 67 67 94 92.0 1e-18 MYLHNKNNMRHFCITNKKKQKTSIFFFEQWIFWHIVIIVLLVIIFILPGSHVITLFGDFI CGILPFR >gi|296493142|gb|ADTK01000359.1| GENE 16 13503 - 16388 1621 961 aa, chain + ## HITS:1 COG:ycgV_2 KEGG:ns NR:ns ## COG: ycgV_2 COG3468 # Protein_GI_number: 16129165 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular 
trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 543 961 1 413 413 659 96.0 0 MGIKQHNGNTKADRLAELKIRSPSIQLIKFGAIGLNAIIFSPLLIAADTGSQYGTNITIN DGDRITGDTADPSGNLYGVMTPAGNTPGNINLGNDVTVNVNDASGYAKGIIIQGKNSSLT ANRLTVDVVGQTSAIGINLIGDYTHADLGTGSTIKSNDDGIIIGHSSTLTATQFTIENSN GIGLTINDYGTSVDLGSGSKIKTDGSTGVYIGGLNGNNANGAARFTATDLTIDVQGYSAM GINVQKNSVVDLGTNSSIKTSGDNAHGLWSFGQVSANALTVDVTGAAANGVEVRGGTTTI GADSHISSAQGGGLVASGSDATINFSGTAAQRNSIFSGGSYGASAQTATAVINMQNTDIT VDRNGSLALGLWALSGGRITGDSLAITGAAGARGIYAMTNSQIDLTSDLVIDMSTPDQMA IATQHDDGYAASRINASGRMLINGSVLSKGGLINLDMHSGSVWTGSSLSDNVNGGKLDVA MNNSVWNVTNNSNLDTLALSHSTVDFASHASTAGTFTTLNVENLSGNSTFIMRADVVGEG NGVNNKGDLLNISGSSAGNHVLAIRNQGSEATTGNEVLTVVKTTDGAASFSASSQVELGG YLYDVRKNGTNWELYASGTVPEPTPNPEPTPAPAQPPIVNPDPTPEPDPTPTPTPTPKPT TTADAGGNYLNVGYLLNYVENRTLMQRMGDLRNQSKDGNIWLRSYGGSLDSFASGKLSGF DMGYSGIQFGGDKRLSDVMPLYVGLYIGSTHASPDYSGGDGTARSDYMGMYASYMAHNGF YSDLVVKASRQKNSFHVRDSQNNGVNANGTANGLSISLEAGQRFNLTPTGYGFYIEPQTQ LTYSHQNEMAMKASNGLNIHLNHYESLLGRASMILGYDITAGNSQLNMYVKTGAIREFSG DTEYLLNNSREKYSFKGNGWNNGVGVSAQYNKQHTFYLEADYTQGNLFDQKQVNGGYHFS F >gi|296493142|gb|ADTK01000359.1| GENE 17 16488 - 18407 1521 639 aa, chain - ## HITS:1 COG:ycgU KEGG:ns NR:ns ## COG: ycgU COG3284 # Protein_GI_number: 16129164 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; K Transcription # Function: Transcriptional activator of acetoin/glycerol metabolism # Organism: Escherichia coli K12 # 1 639 4 642 642 1267 99.0 0 MSGAFNNDGRGISPLIATSWERCNKLMKRETWNVPHQAQGVTFASIYRRKKAMLTLGQAA LEDAWEYMAPRECALFILDETACILSRNGDPQTLQQLSALGFNDGTYCAEGIIGTCALSL AAISGQAVKTMADQHFKQALWNWAFCATPLFDSKGRLTGTIALACPVEQTTAADLPLTLA IAREVGNLLLTDSLLAETNRHLNQLNALLESMDDGVISWDEQGNLQFINAQAARVLRLDA TASQGRAITELLTLPAVLQQAIKQAHPLKHVEATFESQHQFIDAVITLKPIIETQGTSFI LLLHPVEQMRQLMTSQLGKVSHTFAHMPQDDPQTRRLIHFGRQAARSSFPVLLCGEEGVG KALLSQAIHNESERAAGPYIAVNCELYGDAALAEEFIGGDRTDNENGRLSRLELAHGGTL 
FLEKIEYLAVELQSALLQVIKQGVITRLDARRLIPIDVKVIATTTADLAMLVEQNRFSRQ LYYALHAFEITIPPLRMRRGSIPALVNNKLRSLEKRFSTRLKIDDDALARLVSCAWPGND FELYSVIENLALSSDNGRIRVSDLPEHLFTEQATDDVSATRLSTSLSFAEVEKEAIINAA QVTGGRIQEMSALLGIGRTTLWRKMKQHGIDAGQFKRRV >gi|296493142|gb|ADTK01000359.1| GENE 18 18635 - 19705 1020 356 aa, chain + ## HITS:1 COG:ECs1705 KEGG:ns NR:ns ## COG: ECs1705 COG2376 # Protein_GI_number: 15830959 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Escherichia coli O157:H7 # 1 356 11 366 366 723 99.0 0 MKKLINDVQDVLDEQLAGLAKAHPSLTLHQEPVYVTRADAPVAGKVALLSGGGSGHEPMH CGYIGQGMLSGACPGEIFTSPTPDKIFECAMQVDGGEGVLLIIKNYTGDILNFETATELL HDSGVKVTTVVIDDDVAVKDSLYTAGRRGVANTVLIEKLVGAAAERGDSLDACAELGRKL NNQGHSIGIALGACTVPAAGKPSFTLADNEMEFGVGIHGEPGIDRRPFSSLDQTVDEMFD TLLENGSYHRTLRFWDYQQGSWQEEPQTKQQLQSGDRVIALVNNLGATPLSELYGVYNRL TTRCQQAGLTIERNLIGAYCTSLDMTGFSITLLKVDDETLALWDAPVHTPALNWGK >gi|296493142|gb|ADTK01000359.1| GENE 19 19716 - 20348 606 210 aa, chain + ## HITS:1 COG:ECs1704 KEGG:ns NR:ns ## COG: ECs1704 COG2376 # Protein_GI_number: 15830958 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Escherichia coli O157:H7 # 1 210 1 210 210 395 98.0 1e-110 MSLSRTQIVNWLTRCGDIFNTESGYLTGLDREIGDADHGLNMNRGFSKVVEKLPAIADKD IGFILKNTGMTLLSSVGGASGPLFGTFFIRAAQATQARQSLTLEELYQMFRDGADGVISR GKAEPGDKTMCDVWVPVVESLRQSSEQNLSVPAALEAASSIAESAAQSTITMQARKGRAS YLGERSIGHQDPGATSVMFMMQMLALAAKE >gi|296493142|gb|ADTK01000359.1| GENE 20 20359 - 21777 1190 472 aa, chain + ## HITS:1 COG:ycgC_3 KEGG:ns NR:ns ## COG: ycgC_3 COG1080 # Protein_GI_number: 16129161 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Escherichia coli K12 # 258 472 1 215 215 410 99.0 1e-114 MVNLVIVSHSSRLGEGVGELARQMLMSDSCKIAIAAGIDDPQNPIGTDAVKVMEAIESVA DADHVLVMMDMGSALLSAETALELLAPEIAAKVRLCAAPLVEGTLAATVSAASGADIDKV 
IFDAMHALEAKREQLGLLSSDTEISDTCPAYDEEARSLAVVIKNRNGLHVRPASRLVYTL STFNADMLLEKNGKCVTPESINQIALLQVRYNDTLRLIAKGPEAEEALIAFRQLAEDNFG ETEEVAPPTLRPVPPVSGKAFYYQPVLCTVQAKSTLTVEEEQDRLRQAIDFTLLDLMTLT AKAEASGLDDIAAIFSGHHTLLDDPELLAAASELLQHEHCTAEYAWQQVLKELSQQYQQL DDEYLQARYIDVDDLLHRTLVHLTQTKEELPQFNSPTILLAENIYPSTVLQLDPAVVKGI CLSAGSPVSHSALIARELEIGWICQQGEKLYAIQPEEKLTLDVKTQRFNRQG >gi|296493142|gb|ADTK01000359.1| GENE 21 22097 - 23794 1575 565 aa, chain + ## HITS:1 COG:treA KEGG:ns NR:ns ## COG: treA COG1626 # Protein_GI_number: 16129160 # Func_class: G Carbohydrate transport and metabolism # Function: Neutral trehalase # Organism: Escherichia coli K12 # 1 565 1 565 565 1099 99.0 0 MKSPAPSRPQKMALIPACIFLCFAALSVQAEETPVTPQPPDILLGPLFNDVQNAKLFPDQ KTFADAVPNSDPLMILADYRMQQNQSGFDLRHFVNVNFTLPKEGEKYVPPEGQSLREHID GLWPVLTRSTENTEKWDSLLPLPEPYVVPGGRFREVYYWDSYFTMLGLAESGHWDKVADM VANFAHEIDTYGHIPNGNRSYYLSRSQPPFFALMVELLAQHEGDAALKQYLPQMQKEYAY WMDGVENLQAGQQEKRVVKLQDGTLLNRYWDDRDTPRPESWVEDIATAKSNPNRPATEIY RDLRSAAASGWDFSSRWMDNPHQLNTLRTTSIVPVDLNSLMFKMEKILARASKAAGDNAM ANQYETLANARQKGIEKYMWNDQQGWYADYDLKSYKVRNQLTAAALFPLYVNAAAKDRAN KMATATKTHLLQPGGLNTTSVKSGQQWDAPNGWAPLQWVATEGLQNYGQKEVAMDISWHF LTNVQHTYDREKKLVEKYDVSTTGTGGGGGEYPLQDGFGWTNGVTLKMLDLICPKEQPCD NVPATRPTVKSATTQPSTKEAQPTP >gi|296493142|gb|ADTK01000359.1| GENE 22 23791 - 24018 142 75 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLYIGNVGVNHRVMKVKHLPVEMTLSSATTYGKHLRNTSRQDLKNRLCRKHHSYGVICQQ RHPEKPNDVVGVSAG >gi|296493142|gb|ADTK01000359.1| GENE 23 23873 - 24169 105 98 aa, chain - ## HITS:1 COG:no KEGG:ECO111_1525 NR:ns ## KEGG: ECO111_1525 # Name: ycgY # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 98 49 146 146 205 100.0 4e-52 MLQLREEEWSEFFFWLLNSLECLDYVIINLTPESKKTLMSEHRNNIQVAIDALYRQRRCK SPGDESETLTRRNDAIFGNHVWQTFAQYFPPGLEKPSV >gi|296493142|gb|ADTK01000359.1| GENE 24 24491 - 24745 280 84 aa, chain - ## HITS:1 COG:ymgE KEGG:ns NR:ns ## COG: ymgE COG2261 # Protein_GI_number: 16129158 # Func_class: S Function unknown # Function: Predicted 
membrane protein # Organism: Escherichia coli K12 # 1 84 1 84 84 105 98.0 3e-23 MGIIAWIIFGLIAGIIAKLIMPGRDGGGFFLTCILGIVGAVVGGWLATMFGIGGSISGFN LHSFLVAVVGAILVLGIFRLLRRE >gi|296493142|gb|ADTK01000359.1| GENE 25 24946 - 25680 628 244 aa, chain + ## HITS:1 COG:ycgR KEGG:ns NR:ns ## COG: ycgR COG5581 # Protein_GI_number: 16129157 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted glycosyltransferase # Organism: Escherichia coli K12 # 1 244 1 244 244 481 99.0 1e-136 MSHYHEQFLKQNPLAVLGVLRDLHKAAIPLRLSWNGGQLISKILAITPDKLVLDFGSQAE DNIAVLKAQHITITAETQGAKVEFTVEQLQQSEYLQLPAFITVPPPTLWFVQRRRYFRIS APLHPPYFCQTKLADNSTLRFRLYDLSLGGMGALLETAKPAELQEGMRFAQIEVNMGQWG VFHFDAQLISISERKVIDGKNETITTPRLSFRFLNVSPTVERQLQRIIFSLEREAREKAD KVRD >gi|296493142|gb|ADTK01000359.1| GENE 26 25682 - 26293 576 203 aa, chain - ## HITS:1 COG:ECs1687 KEGG:ns NR:ns ## COG: ECs1687 COG0741 # Protein_GI_number: 15830942 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Escherichia coli O157:H7 # 1 203 39 241 241 405 99.0 1e-113 MKLRWFAFLIVLLAGCSSKHDYTNPPWNAKVPVQRAMQWMPISQKAGAAWGVDPQLITAI IAIESGGNPNAVSKSNAIGLMQLKASTSGRDVYRRMGWSGEPTTSELKNPERNISMGAAY LNILETGPLAGIEDPKVLQYALVVSYANGAGALLRTFSSDRKKAISKINDLDADEFLDHV ARNHPAPQAPRYIYKLEQALDAM >gi|296493142|gb|ADTK01000359.1| GENE 27 26393 - 27307 817 304 aa, chain + ## HITS:1 COG:ldcA KEGG:ns NR:ns ## COG: ldcA COG1619 # Protein_GI_number: 16129155 # Func_class: V Defense mechanisms # Function: Uncharacterized proteins, homologs of microcin C7 resistance protein MccF # Organism: Escherichia coli K12 # 1 304 1 304 304 616 98.0 1e-176 MSLFHLIAPSGYCIKQHAALRGIHRLTDEGHQVNNVEVIARRCERFAGTETERLEDLNSL ARLTTPNTIVLAVRGGYGASRLLADIDWQALVARQQHDPLLICGHSDFTAIQCGLLAQGN VITFSGPMLVANFGADELNAFTEHHFWLALRNKTFTIEWQGEGPTCQTEGTLWGGNLAML ISLIGTPWMPKIENGILVLEDINEHPFRVERMLLQLYHAGILPRQKAIILGSFSGSTPND 
YDAGYNLESVYAFLRSRLSIPLITGLDFGHEQRTVTLPLGAHAILNNTREGTQLTISGHP VLKM >gi|296493142|gb|ADTK01000359.1| GENE 28 27402 - 29138 1699 578 aa, chain + ## HITS:1 COG:STM1801 KEGG:ns NR:ns ## COG: STM1801 COG3263 # Protein_GI_number: 16765142 # Func_class: P Inorganic ion transport and metabolism # Function: NhaP-type Na+/H+ and K+/H+ antiporters with a unique C-terminal domain # Organism: Salmonella typhimurium LT2 # 1 577 1 577 577 962 92.0 0 MDATTIISLFILGSILVTSSILLSSFSSRLGIPILVIFLAIGMLAGVDGVGGIPFDNYPF AYMVSNLALAIILLDGGMRTQASSFRVALGPALSLATLGVLITSGLTGMMAAWLFNLDLI EGLLIGAIVGSTDAAAVFSLLGGKGLNERVGSTLEIESGSNDPMAVFLTITLIAMIQQHE SSVSWMFVVDILQQFGLGIVIGLGGGYLLLQMINRIALPAGLYPLLALSGGILIFALTTA LEGSGILAVYLCGFLLGNRPIRNRYGILQNFDGLAWLAQIAMFLVLGLLVNPSDLLPIAI PALILSAWMIFFARPLSVFAGLLPFRGFNLRERVFISWVGLRGAVPIILAVFPMMAGLEN ARLFFNVAFFVVLVSLLLQGTSLSWAAKKAKVVVPPVGRPVSRVGLDIHPENPWEQFVYQ LSADKWCVGAALRDLHMPKETRIAALFRDNQLLHPTGSTRLREGDVLCVIGRERDLPALG KLFSQSPPVALDQRFFGDFILEASAKYADVALIYGLEDGREYRDKQQTLGEIVQQLLGAA PVVGDQVEFAGMIWTVAEKEDNEVLKIGVRVAEEEAES >gi|296493142|gb|ADTK01000359.1| GENE 29 29272 - 29421 62 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|188493462|ref|ZP_03000732.1| ## NR: gi|188493462|ref|ZP_03000732.1| hypothetical protein Ec53638_4549 [Escherichia coli 53638] # 8 49 1 42 42 76 100.0 4e-13 MIVSIKRMDTISPEPELTAQAFYIKRLRFIVDSNSTSDKSRFTLDGACT >gi|296493142|gb|ADTK01000359.1| GENE 30 29524 - 30594 876 356 aa, chain - ## HITS:1 COG:ZdadX KEGG:ns NR:ns ## COG: ZdadX COG0787 # Protein_GI_number: 15801412 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Escherichia coli O157:H7 EDL933 # 1 356 1 356 356 712 99.0 0 MTRPIQASLDLQALKQNLSIVRQAAPHARVWSVVKANAYGHGIERIWSALGATDGFALLN LEEAITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAPLDI YLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTLMSHFAEAEHPDGISSAMARIEQAA EGLECRRSLSNSAATLWHPEAHFDWVRPGIILYGASPSGQWRDIANTGLRPVMTLSSEII GVQTLKAGERVGYGGRYTARDEQRIGIVAAGYADGYPRHAPTGTPVLVDGVRTMTVGTVS 
MDMLAVDLTPCPQAGIGTPVELWGKEIKIDDVAAAAGTVGYELMCALALRVPVVTV >gi|296493142|gb|ADTK01000359.1| GENE 31 30604 - 31902 1358 432 aa, chain - ## HITS:1 COG:ECs1684 KEGG:ns NR:ns ## COG: ECs1684 COG0665 # Protein_GI_number: 15830938 # Func_class: E Amino acid transport and metabolism # Function: Glycine/D-amino acid oxidases (deaminating) # Organism: Escherichia coli O157:H7 # 1 432 1 432 432 893 100.0 0 MRVVILGSGVVGVASAWYLNQAGHEVTVIDREPGAALETSAANAGQISPGYAAPWAAPGV PLKAIKWMFQRHAPLAVRLDGTQFQLKWMWQMLRNCDTSHYMENKGRMVRLAEYSRDCLK ALRAETNIQYEGRQGGTLQLFRTEQQYENATRDIAVLEDAGVPYQLLESSRLAEVEPALA EVAHKLTGGLQLPNDETGDCQLFTQNLARMAEQAGVKFRFNTPVDQLLCDGEQIYGVKCG DEVIKADAYVMAFGSYSTAMLKGIVDIPVYPLKGYSLTIPIAQEDGAPVSTILDETYKIA ITRFDNRIRVGGMAEIVGFNTELLQPRRETLEMVVRDLYPRGGHVEQATFWTGLRPMTPD GTPVVGRTRFKNLWLNTGHGTLGWTMACGSGQLLSDLLSGRTPAIPYEDLSVARYSRGFT PSRPGHLHGAHS >gi|296493142|gb|ADTK01000359.1| GENE 32 32232 - 33764 1462 510 aa, chain + ## HITS:1 COG:ycgB KEGG:ns NR:ns ## COG: ycgB COG2719 # Protein_GI_number: 16129151 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 510 1 510 510 1038 100.0 0 MATIDSMNKDTTRLSDGPDWTFDLLDVYLAEIDRVAKLYRLDTYPHQIEVITSEQMMDAY SSVGMPINYPHWSFGKKFIETERLYKHGQQGLAYEIVINSNPCIAYLMEENTITMQALVM AHACYGHNSFFKNNYLFRSWTDASSIVDYLIFARKYITECEERYGVDEVERLLDSCHALM NYGVDRYKRPQKISLQEEKARQKSREEYLQSQVNMLWRTLPKREEEKTVAEARRYPSEPQ ENLLYFMEKNAPLLESWQREILRIVRKVSQYFYPQKQTQVMNEGWATFWHYTILNHLYDE GKVTERFMLEFLHSHTNVVFQPPYNSPWYSGINPYALGFAMFQDIKRICQSPTEEDKYWF PDIAGSDWLETLHFAMRDFKDESFISQFLSPKVMRDFRFFTVLDDDRHNYLEISAIHNEE GYREIRNRLSSQYNLSNLEPNIQIWNVDLRGDRSLTLRYIPHNRAPLDRGRKEVLKHVHR LWGFDVMLEQQNEDGSIELLERCPPRMGNL >gi|296493142|gb|ADTK01000359.1| GENE 33 33816 - 34535 705 239 aa, chain - ## HITS:1 COG:ECs1682 KEGG:ns NR:ns ## COG: ECs1682 COG2186 # Protein_GI_number: 15830936 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 239 1 239 239 492 100.0 1e-139 
MVIKAQSPAGFAEEYIIESIWNNRFPPGTILPAERELSELIGVTRTTLREVLQRLARDGW LTIQHGKPTKVNNFWETSGLNILETLARLDHESVPQLIDNLLSVRTNISTIFIRTAFRQH PDKAQEVLATANEVADHADAFAELDYNIFRGLAFASGNPIYGLILNGMKGLYTRIGRHYF ANPEARSLALGFYHKLSALCSEGAHDQVYETVRRYGHESGEIWHRMQKNLPGDLAIQGR >gi|296493142|gb|ADTK01000359.1| GENE 34 34757 - 36298 1669 513 aa, chain + ## HITS:1 COG:ECs1681 KEGG:ns NR:ns ## COG: ECs1681 COG3067 # Protein_GI_number: 15830935 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter # Organism: Escherichia coli O157:H7 # 1 513 1 513 513 897 100.0 0 MEISWGRALWRNFLGQSPDWYKLALIIFLIVNPLIFLISPFVAGWLLVAEFIFTLAMALK CYPLLPGGLLAIEAVFIGMTSAEHVREEVAANLEVLLLLMFMVAGIYFMKQLLLFIFTRL LLSIRSKMLLSLSFCVAAAFLSAFLDALTVVAVVISVAVGFYGIYHRVASSRTEDTDLQD DSHIDKHYKVVLEQFRGFLRSLMMHAGVGTALGGVMTMVGEPQNLIIAKAAGWHFGDFFL RMSPVTVPVLICGLLTCLLVEKLRWFGYGETLPEKVREVLQQFDDQSRHQRTRQDKIRLI VQAIIGVWLVTALALHLAEVGLIGLSVIILATSLTGVTDEHAIGKAFTESLPFTALLTVF FSVVAVIIDQQLFSPIIQFVLQASEHAQLSLFYIFNGLLSSISDNVFVGTIYINEAKAAM ESGAITLKQYELLAVAINTGTNLPSVATPNGQAAFLFLLTSALAPLIRLSYGRMVWMALP YTLVLTLVGLLCVEFTLAPVTEWFMQMGWIATL >gi|296493142|gb|ADTK01000359.1| GENE 35 36444 - 36974 588 176 aa, chain + ## HITS:1 COG:ECs1680 KEGG:ns NR:ns ## COG: ECs1680 COG1495 # Protein_GI_number: 15830934 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Disulfide bond formation protein DsbB # Organism: Escherichia coli O157:H7 # 1 176 1 176 176 330 100.0 1e-90 MLRFLNQCSQGRGAWLLMAFTALALELTALWFQHVMLLKPCVLCIYERCALFGVLGAALI GAIAPKTPLRYVAMVIWLYSAFRGVQLTYEHTMLQLYPSPFATCDFMVRFPEWLPLDKWV PQVFVASGDCAERQWDFLGLEMPQWLLGIFIAYLIVAVLVVISQPFKAKKRDLFGR >gi|296493142|gb|ADTK01000359.1| GENE 36 37020 - 38288 561 422 aa, chain - ## HITS:1 COG:ECs1679 KEGG:ns NR:ns ## COG: ECs1679 COG0389 # Protein_GI_number: 15830933 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Escherichia coli O157:H7 # 1 422 1 422 422 862 99.0 0 
MFALCDVNAFYASCETVFRPDLWGKPVVVLSNNDGCVIARNAEAKALGVKMGDPWFKQKD LFRRCGVVCFSSNYELYADMSNRVMSTLEELSPRVEIYSIDEAFCDLTGVRNCRDLTDFG REIRATVLQRTHLTVGVGIAQTKTLAKLANHAAKKWQRQTGGVVDLSNLERQRKLMSALS VDEVWGIGRRNSKKLDAMGIKTVLDLADTDIRFIRKHFNVVLERTVRELRGEPCLQLEEF APTKQEIICSRSFGERITDYTSMRQAICSYAARAAEKLRSEHQYCRFISTFIKTSPFALN EPYYGNSASVKLLTPTQDSRDIINAATRSLDAIWQAGHRYQKAGVMLGDFFSQGVAQLNL FDDNAPRPGSEQLMAVMDTLNAKEGRGTLYFAGQGIQQQWQMKRAMLSPRYTTRSSDLLR VK >gi|296493142|gb|ADTK01000359.1| GENE 37 38288 - 38707 299 139 aa, chain - ## HITS:1 COG:ECs1678 KEGG:ns NR:ns ## COG: ECs1678 COG1974 # Protein_GI_number: 15830932 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Escherichia coli O157:H7 # 1 139 1 139 139 257 100.0 5e-69 MLFIKPADLREIVTFPLFSDLVQCGFPSPAADYVEQRIDLNQLLIQHPSATYFVKASGDS MIDGGISDGDLLIVDSAITASHGDIVIAAVDGEFTVKKLQLRPTVQLIPMNSAYSPITIS SEDTLDVFGVVIHVVKAMR >gi|296493142|gb|ADTK01000359.1| GENE 38 39078 - 39989 737 303 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_1668 NR:ns ## KEGG: ECH74115_1668 # Name: hlyE # Def: hemolysin E # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 303 49 351 351 512 99.0 1e-144 MTEIVADKTVEVVKNAIETADGALDLYNKYLDQVIPWQTFDETIKELSRFKQEYSQAASV LVGDIKTLLMDSQDKYFEATQTVYEWCGVATQLLAAYILLFDEYNEKKASAQKDILIKVL DDGITKLNEAQKSLLVSSQSFNNASGKLLALDSQLTNDFSEKSSYFQSQVDKIRKEAYAG AAAGVVAGPFGLIISYSIAAGVVEGKLIPELKNKLKSVQSFFTTLSNTVKQANKDIDAAK LKLTTEIAAIGEIKTETETTRFYVDYDDLMLSLLKEAAKKMINTCNEYQKRHGKKTLFEV PEV >gi|296493142|gb|ADTK01000359.1| GENE 39 40196 - 40642 458 148 aa, chain - ## HITS:1 COG:ECs1676 KEGG:ns NR:ns ## COG: ECs1676 COG2983 # Protein_GI_number: 15830930 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 148 11 158 158 300 99.0 5e-82 MSDVPFWQSKTLDEMSDAEWESLCDGCGQCCLHKLMDEDTDEIYFTNVACRQLNIKTCQC RNYERRFEFEPDCIKLTRDNLPTFEWLPMTCAYRLLAEGKDLPAWHPLLTGSKAAMHGER ISVRHIAVKESEVIDWQDHILNKPDWAQ 
>gi|296493142|gb|ADTK01000359.1| GENE 40 40734 - 41393 653 219 aa, chain - ## HITS:1 COG:ECs1675 KEGG:ns NR:ns ## COG: ECs1675 COG0179 # Protein_GI_number: 15830929 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) # Organism: Escherichia coli O157:H7 # 1 219 1 219 219 447 99.0 1e-126 MYQHHNWQGALLDYPVSKVVCVGSNYAKHIKEMGSAVPEEPVLFIKPETALCDLRQPLAI PSDFGSVHHEVELAVLIGATLRQATEEHVRKAIAGYGVALDLTLRDVQGKMKKAGQPWEK AKAFDNSCPLSGFIPAAEFTGDPQNTTLGLSVNGEQRQQGTTADMIHKIIPLIAYMSKFF TLKAGDVVLTGTPDGVGPLQSGDELTVTFDGHSLTTRVL >gi|296493142|gb|ADTK01000359.1| GENE 41 41465 - 41758 327 97 aa, chain - ## HITS:1 COG:ECs1674 KEGG:ns NR:ns ## COG: ECs1674 COG3100 # Protein_GI_number: 15830928 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 97 12 108 108 176 100.0 1e-44 MFCVIYRSSKRDQTYLYVEKKDDFSRVPEELMKGFGQPQLAMILPLDGRKKLVNADIEKV KQALTEQGYYLQLPPPPEDLLKQHLSVMGQKTDDTNK >gi|296493142|gb|ADTK01000359.1| GENE 42 42000 - 42401 298 133 aa, chain + ## HITS:1 COG:no KEGG:ECB_01153 NR:ns ## KEGG: ECB_01153 # Name: ycgK # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 133 1 133 133 236 99.0 2e-61 MKIKSISKAVLLLALLTSTSFAAGKNVNVEFRKGHSSAQYSGEIKGYDYDTYTFYAKKGQ KVHVSISNEGADTYLFGPGIDDSVDLSRYSPELDSHGQYSLPASGKYELRVLQTRNDARK NKTKKYNVDIQIK >gi|296493142|gb|ADTK01000359.1| GENE 43 42521 - 42883 75 120 aa, chain - ## HITS:1 COG:no KEGG:JW1166 NR:ns ## KEGG: JW1166 # Name: ycgJ # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 120 3 122 122 240 100.0 9e-63 MMRIFYIGLSGVGMMFSSMASGNDAGGLQSPACGVVCDPYICVNSDGISPELTRKYLGEK AAENLQSLQGYDPSEFTFANGVFCDVKEKLCRDDRYFGVDGKRSGKINQTTTKMLFMCRE >gi|296493142|gb|ADTK01000359.1| GENE 44 43409 - 44104 500 231 aa, chain + ## HITS:1 COG:minC KEGG:ns NR:ns ## COG: minC COG0850 # Protein_GI_number: 16129139 # Func_class: D 
Cell cycle control, cell division, chromosome partitioning # Function: Septum formation inhibitor # Organism: Escherichia coli K12 # 1 231 1 231 231 440 99.0 1e-123 MSNTPIELKGSSFTLSVVHLHEAEPKVIHQALEDKIAQAPAFLKHAPVVLNVSALEDPVN WSAMHKAVSATGLRVIGVSGCKDAQLKAEIEKMGLPILTEGKEKAPRPAPAPQAPAQNTT PVTKTRLIDTPVRSGQRIYAPQCDLIVTSHVSAGAELIADGNIHVYGMMRGRALAGASGD RETQIFCTNLMAELVSIAGEYWLSDQIPAEFYGKAARLQLVENALTVQPLN >gi|296493142|gb|ADTK01000359.1| GENE 45 44128 - 44940 842 270 aa, chain + ## HITS:1 COG:ECs1669 KEGG:ns NR:ns ## COG: ECs1669 COG2894 # Protein_GI_number: 15830923 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation inhibitor-activating ATPase # Organism: Escherichia coli O157:H7 # 1 270 1 270 270 506 100.0 1e-143 MARIIVVTSGKGGVGKTTSSAAIATGLAQKGKKTVVIDFDIGLRNLDLIMGCERRVVYDF VNVIQGDATLNQALIKDKRTENLYILPASQTRDKDALTREGVAKVLDDLKAMDFEFIVCD SPAGIETGALMALYFADEAIITTNPEVSSVRDSDRILGILASKSRRAENGEEPIKEHLLL TRYNPGRVSRGDMLSMEDVLEILRIKLVGVIPEDQSVLRASNQGEPVILDINADAGKAYA DTVERLLGEERPFRFIEEEKKGFLKRLFGG >gi|296493142|gb|ADTK01000359.1| GENE 46 44944 - 45210 451 88 aa, chain + ## HITS:1 COG:ECs1668 KEGG:ns NR:ns ## COG: ECs1668 COG0851 # Protein_GI_number: 15830922 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation topological specificity factor # Organism: Escherichia coli O157:H7 # 1 88 1 88 88 148 100.0 2e-36 MALLDFFLSRKKNTANIAKERLQIIVAERRRSDAEPHYLPQLRKDILEVICKYVQIDPEM VTVQLEQKDGDISILELNVTLPEAEELK Prediction of potential genes in microbial genomes Time: Mon May 16 16:10:55 2011 Seq name: gi|296493141|gb|ADTK01000360.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1157.4, whole genome shotgun sequence Length of sequence - 1486 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 455 - 514 6.9 1 1 Op 1 . 
+ CDS 606 - 779 75 ## ECDH10B_1224 hypothetical protein 2 1 Op 2 . + CDS 781 - 1125 359 ## EcSMS35_1978 hypothetical protein 3 1 Op 3 . + CDS 1135 - 1464 457 ## ECO111_1498 hypothetical protein Predicted protein(s) >gi|296493141|gb|ADTK01000360.1| GENE 1 606 - 779 75 57 aa, chain + ## HITS:1 COG:no KEGG:ECDH10B_1224 NR:ns ## KEGG: ECDH10B_1224 # Name: ymgI, ECK4403 # Def: hypothetical protein # Organism: E.coli_DH10B # Pathway: not_defined # 1 57 1 57 57 74 100.0 1e-12 MSYSSFKIILIHVKNIIPIITATLILNYLNNSERSLVKQILIEDEIIVCATYLIPDI >gi|296493141|gb|ADTK01000360.1| GENE 2 781 - 1125 359 114 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_1978 NR:ns ## KEGG: EcSMS35_1978 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 114 1 114 114 110 100.0 1e-23 MKKKILAFGLISALFCSTPAMADMNRTTKGALLGAGVGLLTGNGVNGVLKGAAVGAGVGA VTEKGRDGKNARKGAKVGAAVGAVTGVLTGNGLEGAIKGAVIGGTGGAILGKMK >gi|296493141|gb|ADTK01000360.1| GENE 3 1135 - 1464 457 109 aa, chain + ## HITS:1 COG:no KEGG:ECO111_1498 NR:ns ## KEGG: ECO111_1498 # Name: ymgD # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 109 1 109 109 195 100.0 3e-49 MKKFALLAGLFVFAPMTWAQDYNIKNGLPSETYITCAEANEMAKTDSAQVAEIVAVMGNA SVASRDLKIEQSPELSAKVVEKLNQVCAKDPQMLLITAIDDTMRAIGKK Prediction of potential genes in microbial genomes Time: Mon May 16 16:11:08 2011 Seq name: gi|296493140|gb|ADTK01000361.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1157.5, whole genome shotgun sequence Length of sequence - 23674 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 14, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 25 - 2703 1228 ## COG3468 Type V secretory pathway, adhesin AidA - Prom 2756 - 2815 6.0 2 2 Tu 1 . - CDS 3103 - 3321 141 ## c1611 hypothetical protein - Prom 3350 - 3409 6.7 3 3 Tu 1 . - CDS 3453 - 4976 672 ## COG2200 FOG: EAL domain - Prom 5214 - 5273 5.0 - Term 5162 - 5199 4.8 4 4 Tu 1 . 
- CDS 5308 - 5484 264 ## B21_01150 hypothetical protein - Prom 5581 - 5640 4.9 5 5 Tu 1 . - CDS 5669 - 5935 211 ## B21_01149 hypothetical protein - Prom 5955 - 6014 3.7 6 6 Tu 1 . - CDS 6279 - 6515 249 ## ECUMN_1449 hypothetical protein - Prom 6639 - 6698 6.6 + Prom 6598 - 6657 10.6 7 7 Tu 1 1/0.750 + CDS 6830 - 8041 571 ## COG2200 FOG: EAL domain + Term 8160 - 8207 4.6 + Prom 8120 - 8179 7.3 8 8 Tu 1 1/0.750 + CDS 8246 - 8977 411 ## COG0789 Predicted transcriptional regulators + Prom 9009 - 9068 2.6 9 9 Tu 1 . + CDS 9198 - 9602 146 ## COG5562 Phage envelope protein + Term 9820 - 9850 1.8 - Term 10663 - 10702 4.1 10 10 Tu 1 . - CDS 10724 - 11974 1542 ## COG0538 Isocitrate dehydrogenases - Prom 12089 - 12148 4.4 + Prom 12095 - 12154 6.5 11 11 Op 1 3/0.500 + CDS 12176 - 12799 508 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases 12 11 Op 2 6/0.000 + CDS 12809 - 13270 439 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 13 11 Op 3 4/0.500 + CDS 13324 - 14430 1258 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain 14 11 Op 4 9/0.000 + CDS 14466 - 15107 708 ## COG2915 Uncharacterized protein involved in purine metabolism 15 11 Op 5 4/0.500 + CDS 15111 - 16481 1633 ## COG0015 Adenylosuccinate lyase + Term 16491 - 16532 8.3 + Prom 16532 - 16591 7.4 16 12 Op 1 40/0.000 + CDS 16650 - 17321 827 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 17 12 Op 2 2/0.750 + CDS 17321 - 18781 1241 ## COG0642 Signal transduction histidine kinase 18 12 Op 3 . + CDS 18857 - 19978 1225 ## COG2850 Uncharacterized conserved protein + Term 19994 - 20031 9.1 19 13 Tu 1 . 
- CDS 20027 - 21253 1486 ## COG2195 Di- and tripeptidases - Prom 21360 - 21419 5.9 + Prom 21377 - 21436 4.0 20 14 Op 1 30/0.000 + CDS 21521 - 22639 1179 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 21 14 Op 2 . + CDS 22623 - 23486 1001 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I + Term 23528 - 23559 1.0 Predicted protein(s) >gi|296493140|gb|ADTK01000361.1| GENE 1 25 - 2703 1228 892 aa, chain - ## HITS:1 COG:b1170 KEGG:ns NR:ns ## COG: b1170 COG3468 # Protein_GI_number: 16129133 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 550 868 5 323 338 590 95.0 1e-168 MKLKKLPGFSLGLIALAVGNAYAEQFIQHEEHKHAVAIKDGEVKNGEVKGGEVKYIDNAL VITTGNGSYGVSVAGVGSNLTINNGVIVTTGGLYPSDNSSLSTALTASAIVSEDGGAVTL RGNNTIVTTGDYSVGLLSQVDGNLNTDTIIRVNPDGSATPSFSDGDNTFIVTAGNHAVGV LACASPGSARACVSSLDEESTADTENNENSAKAILDMAKGEITTHGTESYAAYANGTVVK AGDKLDYTNASVTLTDVDITTHGDNAHAIAARQGTVSFNQGEIYTTGPDAAIAKIYNGGT VTLKNTSAVAHQGAGVVLESSINGQEATVDILSGSSLRSANEILYNKNETSNVTITDSEV SSAADVFINNIKGHLTVDATDSKITGSANLSTDDNTHTYLSLSDNSTWDIKADSTVSNLT VDNSTVYISRADGRDFEPTRLTITENYVGNNGVLHLRTELGDDNSATDKVVINGNTSGTT RVKVTNAGGSGAYTLNGIEIISVEGESNGEFIKDSRIFAGAYEYSLTRGNTEATNKNWYL TNFQATSGGETNSGGSSAPIVAPTPVLRPEAGSYVANLAAANTLFVMRLNDRAGETRYID PVTEQERSSRLWLRQIGGHNAWRDSNGQLRTTSHRYVSQLGADLLTGGFTDSDSWRLGVM AGYARDYNSTHSSVSDYRSKGSVRGYSAGLYATWFADDISKKGAYIDAWAQYSWFKNSVK GDELAYESYSAKGATVSLEAGYGFALNKSFGLEAAKYTWIFQPQAQAIWMGVDHNAHTEA NGSRIENDANNNIQTRLGFRTFIRTQEKNSGPHGDDFEPFVEMNWLHNSKDFAVSMNGVK VEQDGARNLGEIKLGVNGNLNPAASVWGNVGVQLGDNGYNDTAVMVGLKYKF >gi|296493140|gb|ADTK01000361.1| GENE 2 3103 - 3321 141 72 aa, chain - ## HITS:1 COG:no KEGG:c1611 NR:ns ## KEGG: c1611 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_CFT073 # Pathway: not_defined # 1 72 13 84 84 110 98.0 2e-23 
MNNSNNLDYFTLYIIFSIAFMLITLLVILIAKPSTGLGEVLVTINLLNALVWLAINLVNR LRERLVSHRDQQ >gi|296493140|gb|ADTK01000361.1| GENE 3 3453 - 4976 672 507 aa, chain - ## HITS:1 COG:ycgG_2 KEGG:ns NR:ns ## COG: ycgG_2 COG2200 # Protein_GI_number: 16129131 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 237 507 1 271 271 543 98.0 1e-154 MRNTLIPILVAICLFITGVAILNIQLWYSAKAEYLAGARYAANNINHILEEASQATQTAV NIAGKECDLEEQYQLGTEAALKPHLRTIIILKQGIVWCTSLPGNRVLLSRIPVFPDSNLL LAPAIDTVNRLPILLYQSQFADTRILVTISDQHIRGALNVPLKGVRYVLRVADDIIGPTG DVMTLNGHYPYTEKVHSTKYHFTIIFNSPPLFSFYRLIDKGFGILIFILLIACAAAFLLD RYFNKSATPEEILRRAIINGEIVPFYQPVVNGLEGTLRGVEVLARWKQPHGGYISPAAFI PLAEKSGLIVPLTQSLMNQVARQMNAIASKLPEGFHIGINFSASHIISPTFVDECLNYRD SFTRRDLNLVLEVTEREPLNVDESLVQRLNILHENGFVIALDDFGTGYSGLSYLHDLHID YIKIDHSFVGRVNADPESTRILDCVLDLARKLSISIVAEGVETKEQLDYLNQNNITFQQG YYFYKPVTYIDLVKIILSKPKVKVVVE >gi|296493140|gb|ADTK01000361.1| GENE 4 5308 - 5484 264 58 aa, chain - ## HITS:1 COG:no KEGG:B21_01150 NR:ns ## KEGG: B21_01150 # Name: ymgC # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 58 25 82 82 96 100.0 3e-19 MTHGYVDSHIIDQALRLRLKDETSVILSDLYLQILQYIEMHKTTLTDIIINDRESVLS >gi|296493140|gb|ADTK01000361.1| GENE 5 5669 - 5935 211 88 aa, chain - ## HITS:1 COG:no KEGG:B21_01149 NR:ns ## KEGG: B21_01149 # Name: ymgB # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 88 1 88 88 135 100.0 4e-31 MLEDTTIHNAITDKALASYFRSSGNLLEEESAVLGQAVTNLMLSGDNVNNKNIILSLIHS LETTSDILKADVIRKTLEIVLRYTADDM >gi|296493140|gb|ADTK01000361.1| GENE 6 6279 - 6515 249 78 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_1449 NR:ns ## KEGG: ECUMN_1449 # Name: ycgZ # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 78 1 78 78 143 100.0 2e-33 MHQNSVTLDSAGAITRYFAKANLPTQQETLGEIVTEILKDGRNLSRKSLCAKLLCRLEQA TGEEEQKHYNALIGLLFE >gi|296493140|gb|ADTK01000361.1| GENE 7 6830 - 8041 571 403 aa, chain + ## HITS:1 COG:ycgF_2 KEGG:ns NR:ns ## COG: ycgF_2 COG2200 # 
Protein_GI_number: 16129126 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 161 403 1 243 243 468 97.0 1e-132 MLTTLIYRSHIRDDEPVKKIEEMVSIANRRNMQCDVTGILLFNGSHFFQLLEGPEEQVKM IYRAICQDPRHYNIVELLCDYAPARRFGKAGMELFDLRLHERDDVLQAVFDKGTSKFQLT YDDRALQFFRTFVLATEQSTYFEIPAEDSWLFIADGSDKELDFYTLSPNINDHFAFHPIV DPLSRRIIAFEAIVQKNEDSPSAIAVGQRKDGEIYKADLKSKALAFAMAHALELGDKMIS INLLPMTLVNEPDAVSFLLNEIKANALVPEQIIVEFTESEVISRFDEFAEAIKSLKAAGI SVAIDHFGAGFAGLLLLSRFQPDRIKISQELITNVHKSGPRQAIIQAIIKCCTSLEIQVS AMGVATPEEWMWLESAGIEMFQGDLFAKAKLNGIPSIAWPEKK >gi|296493140|gb|ADTK01000361.1| GENE 8 8246 - 8977 411 243 aa, chain + ## HITS:1 COG:ycgE KEGG:ns NR:ns ## COG: ycgE COG0789 # Protein_GI_number: 16129125 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 243 1 243 243 474 99.0 1e-134 MAYYSIGDVAERCGINPVTLRAWQRRYGLLKPQRSEGGHRLFDEEDIQRIEEIKRWISNG VPVGKVKALLETTSQDTEDDWSRLQEEMMSILRMANPAKLRARIISLGREYPVDQLINHV YLPVRQRLVLDHNTSRIMSSMFDGALIEYAAASLFEMRRKPGKEAILMAWNVEERARLWL EAWRLSLSGWHISVLADPIESPRPELFPTQTLIVWTGMAPTRRQNELLQHWGEQGYKVIF HAP >gi|296493140|gb|ADTK01000361.1| GENE 9 9198 - 9602 146 134 aa, chain + ## HITS:1 COG:ycgX KEGG:ns NR:ns ## COG: ycgX COG5562 # Protein_GI_number: 16129124 # Func_class: R General function prediction only # Function: Phage envelope protein # Organism: Escherichia coli K12 # 1 134 1 134 134 265 100.0 1e-71 MDQVVVFQKMFEQVRKEQNFSWFYSELKHHRIAHYIYYLATDNIRIITHDDTVLLLRGTR NLLKVSTTKNPAKIKEAALLHICGKSTFREYCSTLAGAGVFRWVTDVNHNKRSYYAIDNT LLYIEDVENNKPLI >gi|296493140|gb|ADTK01000361.1| GENE 10 10724 - 11974 1542 416 aa, chain - ## HITS:1 COG:ECs1608 KEGG:ns NR:ns ## COG: ECs1608 COG0538 # Protein_GI_number: 15830862 # Func_class: C Energy production and conversion # Function: Isocitrate dehydrogenases # Organism: Escherichia coli O157:H7 # 1 416 1 416 416 842 99.0 0 MESKVVVPAQGKKITLQNGKLNVPENPIIPYIEGDGIGVDVTPAMLKVVDAAVEKAYKGE 
RKISWMEIYTGEKSTQVYGQDVWLPAETLDLIREYRVAIKGPLTTPVGGGIRSLNVALRQ ELDLYICLRPVRYYQGTPSPVKHPELTDMVIFRENSEDIYAGIEWKADSADAEKVIKFLR EEMGVKKIRFPEHCGIGIKPCSEEGTKRLVRAAIEYAIANDRDSVTLVHKGNIMKFTEGA FKDWGYQLAREEFGGELIDGGPWLKVKNPNTGKEIVIKDVIADAFLQQILLRPAEYDVIA CMNLNGDYISDALAAQVGGIGIAPGANIGDECALFEATHGTAPKYAGQDKVNPGSIILSA EMMLRHMGWTEAADLIVKGMEGAINAKTVTYDFERLMEGAKLLKCSEFGDAIIKNM >gi|296493140|gb|ADTK01000361.1| GENE 11 12176 - 12799 508 207 aa, chain + ## HITS:1 COG:ymfC KEGG:ns NR:ns ## COG: ymfC COG1187 # Protein_GI_number: 16129098 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Escherichia coli K12 # 1 207 1 207 207 400 100.0 1e-112 MQKTSFRNHQVKRFSSQRSTRRKPENQPTRVILFNKPYDVLPQFTDEAGRKTLKEFIPVQ GVYAAGRLDRDSEGLLVLTNNGALQARLTQPGKRTGKIYYVQVEGIPTQDALEALRNGVT LNDGPTLPAGAELVDEPAWLWPRNPPIRERKSIPTSWLKITLYEGRNRQVRRMTAHVGFP TLRLIRYAMGDYSLDNLANGEWREVTD >gi|296493140|gb|ADTK01000361.1| GENE 12 12809 - 13270 439 153 aa, chain + ## HITS:1 COG:ECs1606 KEGG:ns NR:ns ## COG: ECs1606 COG0494 # Protein_GI_number: 15830860 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli O157:H7 # 1 153 1 153 153 314 100.0 3e-86 MFKPHVTVACVVHAEGKFLVVEETINGKALWNQPAGHLEADETLVEAAARELWEETGISA QPQHFIRMHQWIAPDKTPFLRFLFAIELEQICPTQPHDSDIDCCRWVSAEEILQASNLRS PLVAESIRCYQSGQRYPLEMIGDFNWPFTKGVI >gi|296493140|gb|ADTK01000361.1| GENE 13 13324 - 14430 1258 368 aa, chain + ## HITS:1 COG:trmU KEGG:ns NR:ns ## COG: trmU COG0482 # Protein_GI_number: 16129096 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Escherichia coli K12 # 1 368 16 383 383 748 100.0 0 MSETAKKVIVGMSGGVDSSVSAWLLQQQGYQVEGLFMKNWEEDDGEEYCTAAADLADAQA 
VCDKLGIELHTVNFAAEYWDNVFELFLAEYKAGRTPNPDILCNKEIKFKAFLEFAAEDLG ADYIATGHYVRRADVDGKSRLLRGLDSNKDQSYFLYTLSHEQIAQSLFPVGELEKPQVRK IAEDLGLVTAKKKDSTGICFIGERKFREFLGRYLPAQPGKIITVDGDEIGEHQGLMYHTL GQRKGLGIGGTKEGTEEPWYVVDKDVENNILVVAQGHEHPRLMSVGLIAQQLHWVDREPF TGTMRCTVKTRYRQTDIPCTVKALDDDRIEVIFDEPVAAVTPGQSAVFYNGEVCLGGGII EQRLPLPV >gi|296493140|gb|ADTK01000361.1| GENE 14 14466 - 15107 708 213 aa, chain + ## HITS:1 COG:ycfC KEGG:ns NR:ns ## COG: ycfC COG2915 # Protein_GI_number: 16129095 # Func_class: R General function prediction only # Function: Uncharacterized protein involved in purine metabolism # Organism: Escherichia coli K12 # 1 213 1 213 213 380 100.0 1e-105 MAKNYYDITLALAGICQSARLVQQLAHQGHCDADALHVSLNSIIDMNPSSTLAVFGGSEA NLRVGLETLLGVLNASSRQGLNAELTRYTLSLMVLERKLSSAKGALDTLGNRINGLQRQL EHFDLQSETLMSAMAAIYVDVISPLGPRIQVTGSPAVLQSPQVQAKVRATLLAGIRAAVL WHQVGGGRLQLMFSRNRLTTQAKQILAHLTPEL >gi|296493140|gb|ADTK01000361.1| GENE 15 15111 - 16481 1633 456 aa, chain + ## HITS:1 COG:purB KEGG:ns NR:ns ## COG: purB COG0015 # Protein_GI_number: 16129094 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate lyase # Organism: Escherichia coli K12 # 1 456 1 456 456 931 100.0 0 MELSSLTAVSPVDGRYGDKVSALRGIFSEYGLLKFRVQVEVRWLQKLAAHAAIKEVPAFA ADAIGYLDAIVASFSEEDAARIKTIERTTNHDVKAVEYFLKEKVAEIPELHAVSEFIHFA CTSEDINNLSHALMLKTARDEVILPYWRQLIDGIKDLAVQYRDIPLLSRTHGQPATPSTI GKEMANVAYRMERQYRQLNQVEILGKINGAVGNYNAHIAAYPEVDWHQFSEEFVTSLGIQ WNPYTTQIEPHDYIAELFDCVARFNTILIDFDRDVWGYIALNHFKQKTIAGEIGSSTMPH KVNPIDFENSEGNLGLSNAVLQHLASKLPVSRWQRDLTDSTVLRNLGVGIGYALIAYQST LKGVSKLEVNRDHLLDELDHNWEVLAEPIQTVMRRYGIEKPYEKLKELTRGKRVDAEGMK QFIDGLALPEEEKARLKAMTPANYIGRAITMVDELK >gi|296493140|gb|ADTK01000361.1| GENE 16 16650 - 17321 827 223 aa, chain + ## HITS:1 COG:phoP KEGG:ns NR:ns ## COG: phoP COG0745 # Protein_GI_number: 16129093 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: 
Escherichia coli K12 # 1 223 1 223 223 417 100.0 1e-117 MRVLVVEDNALLRHHLKVQIQDAGHQVDDAEDAKEADYYLNEHIPDIAIVDLGLPDEDGL SLIRRWRSNDVSLPILVLTARESWQDKVEVLSAGADDYVTKPFHIEEVMARMQALMRRNS GLASQVISLPPFQVDLSRRELSINDEVIKLTAFEYTIMETLIRNNGKVVSKDSLMLQLYP DAELRESHTIDVLMGRLRKKIQAQYPQEVITTVRGQGYLFELR >gi|296493140|gb|ADTK01000361.1| GENE 17 17321 - 18781 1241 486 aa, chain + ## HITS:1 COG:phoQ KEGG:ns NR:ns ## COG: phoQ COG0642 # Protein_GI_number: 16129092 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 486 1 486 486 956 100.0 0 MKKLLRLFFPLSLRVRFLLATAAVVLVLSLAYGMVALIGYSVSFDKTTFRLLRGESNLFY TLAKWENNKLHVELPENIDKQSPTMTLIYDENGQLLWAQRDVPWLMKMIQPDWLKSNGFH EIEADVNDTSLLLSGDHSIQQQLQEVREDDDDAEMTHSVAVNVYPATSRMPKLTIVVVDT IPVELKSSYMVWSWFIYVLSANLLLVIPLLWVAAWWSLRPIEALAKEVRELEEHNRELLN PATTRELTSLVRNLNRLLKSERERYDKYRTTLTDLTHSLKTPLAVLQSTLRSLRSEKMSV SDAEPVMLEQISRISQQIGYYLHRASMRGGTLLSRELHPVAPLLDNLTSALNKVYQRKGV NISLDISPEISFVGEQNDFVEVMGNVLDNACKYCLEFVEISARQTDEHLYIVVEDDGPGI PLSKREVIFDRGQRVDTLRPGQGVGLAVAREITEQYEGKIVAGESMLGGARMEVIFGRQH SAPKDE >gi|296493140|gb|ADTK01000361.1| GENE 18 18857 - 19978 1225 373 aa, chain + ## HITS:1 COG:ECs1573 KEGG:ns NR:ns ## COG: ECs1573 COG2850 # Protein_GI_number: 15830827 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 373 4 376 376 749 99.0 0 MEYQLTLNWPDFLERHWQKRPVVLKRGFNNFIDPISPDELAGLAMESEVDSRLVSHQDGK WQVSHGPFESYDHLGETNWSLLVQAVNHWHEPTAALMRPFRELPDWRIDDLMISFSVPGG GVGPHLDQYDVFIIQGTGRRRWRVGEKLQMKQHCPHPDLLQVDPFEAIIDEELEPGDILY IPPGFPHEGYALENAMNYSVGFRAPNTRELISGFADYVLQRELGGNYYSDPDVPPRAHPA DVLPQEMDKLREMMLELINQPEHFKQWFGEFISQSRHELDIAPPEPPYQPDEIYDALKQG DVLVRLGGLRVLRIGDDVYANGEKIDSPHRPALDALASNIALTAENFGDALEDPSFLAML AALVNSGYWFFEG >gi|296493140|gb|ADTK01000361.1| GENE 19 20027 - 21253 1486 408 aa, chain - ## HITS:1 COG:ECs1572 KEGG:ns NR:ns ## COG: ECs1572 COG2195 # Protein_GI_number: 15830826 # Func_class: E Amino acid transport and 
metabolism # Function: Di- and tripeptidases # Organism: Escherichia coli O157:H7 # 1 408 1 408 408 832 99.0 0 MDKLLERFLNYVSLDTQSKAGVRQVPSTEGQWKLLHLLKEQLEEMGLINVTLSEKGTLMA TLPANVPGDIPAIGFISHVDTSPDCSGKNVNPQIVESYRGGDIALGIGDEVLSPVMFPVL HQLLGQTLITTDGKTLLGADDKAGIAEIMTALAVLQQKNIPHGDIRVAFTPDEEVGKGAK HFDVDAFDARWAYTVDGGGVGELEFENFNAASVNIKIVGNNVHPGTAKGVMVNALSLAAR IHAEVPADESPEMTEGYEGFYHLASMKGTVDRADMHYIIRDFDRKQFEARKRKMMEIAKK VGKGLHPDCYIELVIEDSYYNMREKVVEHPHILDIAQQAMRDCDIEPELKPIRGGTDGAQ LSFMGLPCPNLFTGGYNYHGKHEFVTLEGMEKAVQVIVRIAELTAQRK >gi|296493140|gb|ADTK01000361.1| GENE 20 21521 - 22639 1179 372 aa, chain + ## HITS:1 COG:ECs1571 KEGG:ns NR:ns ## COG: ECs1571 COG3842 # Protein_GI_number: 15830825 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Escherichia coli O157:H7 # 1 372 7 378 378 730 99.0 0 MNKQPSSLSPLVQLAGIRKCFDGKEVIPQLDLTINNGEFLTLLGPSGCGKTTVLRLIAGL ETVDSGRIMLDNEDITHVPAENRYVNTVFQSYALFPHMTVFENVAFGLRMQKTPAAEITP RVMEALRMVQLETFAQRKPHQLSGGQQQRVAIARAVVNKPRLLLLDESLSALDYKLRKQM QNELKALQRKLGITFVFVTHDQEEALTMSDRIVVMREGRIEQDGTPREIYEEPKNLFVAG FIGEINMFNATVIERLDEQRVRANVEGRECNIYVNFAVEPGQKLHVLLRPEDLRVEEIND DNHAEGLIGYVRERNYKGMTLESVVELENGKMVMVSEFFNEDDPDFDHSLDQKMAINWVE SWEVVLADEEHK >gi|296493140|gb|ADTK01000361.1| GENE 21 22623 - 23486 1001 287 aa, chain + ## HITS:1 COG:ECs1570 KEGG:ns NR:ns ## COG: ECs1570 COG1176 # Protein_GI_number: 15830824 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Escherichia coli O157:H7 # 1 287 1 287 287 473 98.0 1e-133 MKNTSKFQNVVIVTIVGWLVLFVFLPNLMIIGTSFLTRDDASFVKMVFTLDNYTRLLDPL YFEVLLHSLNMALIATLACLVLGYPFAWFLAKLPHKVRPLLLFLLIVPFWTNSLIRIYGL KIFLSTKGYLNEFLLWLGVIDTPIRIMFTPSAVIIGLVYILLPFMVMPLYSSIEKLDKPL LEAARDLGASKLQTFIRIIIPLTMPGIIAGCLLVMLPAMGLFYVSDLMGGAKNLLIGNVI KVQFLNIRDWPFGAATSITLTIVMGLMLLVYWRASRLLNKKVSELDD Prediction of potential genes in microbial genomes Time: Mon May 16 
16:11:16 2011 Seq name: gi|296493139|gb|ADTK01000362.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1157.6, whole genome shotgun sequence Length of sequence - 105 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 16:11:21 2011 Seq name: gi|296493138|gb|ADTK01000363.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1165.1, whole genome shotgun sequence Length of sequence - 14533 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 7, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 181 71 ## gi|300918363|ref|ZP_07134963.1| conserved domain protein + Term 197 - 244 -0.9 2 2 Tu 1 . - CDS 191 - 427 87 ## gi|300906961|ref|ZP_07124631.1| hypothetical protein HMPREF9536_04916 - Prom 462 - 521 2.0 + Prom 289 - 348 6.6 3 3 Op 1 5/0.000 + CDS 466 - 975 321 ## COG3272 Uncharacterized conserved protein 4 3 Op 2 . + CDS 972 - 2390 940 ## COG0415 Deoxyribodipyrimidine photolyase - Term 2372 - 2416 10.8 5 4 Tu 1 . - CDS 2432 - 3913 1793 ## COG3104 Dipeptide/tripeptide permease - Prom 4026 - 4085 3.0 + Prom 4007 - 4066 3.0 6 5 Op 1 4/0.667 + CDS 4184 - 4927 827 ## COG0327 Uncharacterized conserved protein 7 5 Op 2 21/0.000 + CDS 4950 - 5606 634 ## COG2049 Allophanate hydrolase subunit 1 8 5 Op 3 8/0.000 + CDS 5600 - 6532 916 ## COG1984 Allophanate hydrolase subunit 2 9 5 Op 4 3/1.000 + CDS 6522 - 7256 658 ## COG1540 Uncharacterized proteins, homologs of lactam utilization protein B 10 5 Op 5 . + CDS 7292 - 8083 574 ## COG0266 Formamidopyrimidine-DNA glycosylase - Term 7847 - 7892 1.7 11 6 Tu 1 . - CDS 8080 - 9126 912 ## COG3180 Putative ammonia monooxygenase - Prom 9185 - 9244 5.2 - Term 9230 - 9269 5.2 12 7 Op 1 . 
- CDS 9278 - 10339 784 ## SSON_0667 hypothetical protein 13 7 Op 2 10/0.000 - CDS 10336 - 11067 551 ## COG3121 P pilus assembly protein, chaperone PapD 14 7 Op 3 6/0.000 - CDS 11082 - 13541 1624 ## COG3188 P pilus assembly protein, porin PapC - Prom 13588 - 13647 3.1 - Term 13552 - 13581 3.5 15 7 Op 4 . - CDS 13673 - 14158 473 ## COG3539 P pilus assembly protein, pilin FimA Predicted protein(s) >gi|296493138|gb|ADTK01000363.1| GENE 1 2 - 181 71 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|300918363|ref|ZP_07134963.1| ## NR: gi|300918363|ref|ZP_07134963.1| conserved domain protein [Escherichia coli MS 115-1] # 1 59 10 68 68 115 98.0 1e-24 IWRQSLRGAGFRNLALTCSLIFMADFGESRYFKPAKIAMNKNNLIIRTSDVEIAFKITK >gi|296493138|gb|ADTK01000363.1| GENE 2 191 - 427 87 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|300906961|ref|ZP_07124631.1| ## NR: gi|300906961|ref|ZP_07124631.1| hypothetical protein HMPREF9536_04916 [Escherichia coli MS 84-1] # 1 78 1 78 78 125 100.0 8e-28 MVNEFYGAEFLRGSILLVKILSTSSERELFYYFPSCIHFLYFLFLENNSYQSLITSLLIV FQTNNRTPFSLFKKHKHC >gi|296493138|gb|ADTK01000363.1| GENE 3 466 - 975 321 169 aa, chain + ## HITS:1 COG:ybgA KEGG:ns NR:ns ## COG: ybgA COG3272 # Protein_GI_number: 16128682 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 169 1 169 169 315 100.0 2e-86 MNLQRFDDSTLIRIFALHELHRLKEHGLTRGALLDYHSRYKLVFLAHSQPEYRKLGPFVA DIHQWQNLDDYYNQYRQRVVVLLSHPANPRDHTNVLMHVQGYFRPHIDSTERQQLAALID SYRRGEQPLLAPLMRIKHYMALYPDAWLSGQRYFELWPRVINLRHSGVL >gi|296493138|gb|ADTK01000363.1| GENE 4 972 - 2390 940 472 aa, chain + ## HITS:1 COG:phrB KEGG:ns NR:ns ## COG: phrB COG0415 # Protein_GI_number: 16128683 # Func_class: L Replication, recombination and repair # Function: Deoxyribodipyrimidine photolyase # Organism: Escherichia coli K12 # 1 471 1 471 472 947 97.0 0 MTTHLVWFRQDLRLHDNLALAAACRNSSARVLALYIATPRQWATHNMSPRQAELINAQLN GLQIALAEKGIPLLFREVDDFVASVEIVKQVCAENSVTHLFYNYQYEVNERARDVEVERA 
LRNVVCEGFDDSVILPPGAVMTGNHEMYKVFTPFKNAWLKRLREGMPECVAAPKVRSSGS IEPSPSITLNYPRQSFDTAHFPVEEKAAIAQLRQFCQNGAGEYEQQRDFPAVEGTSRLSA SLATGGLSPRQCLHRLLAEQPQALDGGAGSVWLSELIWREFYRHLMTYYPSLCKHCPFIA WTDRVQWQSNPAHLQAWQEGKTGYPIVDAAMRQLNSTGWMHNRLRMISASFLVKDLLIDW REGERYFMSQLIDGDLAANNGGWQWAASTGTDAAPYFRIFNPTTQGEKFDREGEFIRRWL PELRDVPGKAVHEPWKWAQKAGVMLDYPQPIVDHKEARLRTLAAYEEARKGA >gi|296493138|gb|ADTK01000363.1| GENE 5 2432 - 3913 1793 493 aa, chain - ## HITS:1 COG:ybgH KEGG:ns NR:ns ## COG: ybgH COG3104 # Protein_GI_number: 16128684 # Func_class: E Amino acid transport and metabolism # Function: Dipeptide/tripeptide permease # Organism: Escherichia coli K12 # 1 493 1 493 493 919 99.0 0 MNKHASQPRAIYYVVALQIWEYFSFYGMRALLILYLTNQLKYNDTHAYELFSAYCSLVYV TPILGGFLADKVLGNRMAVMLGALLMAIGHVVLGASEIHPSFLYLSLAIIVCGYGLFKSN VSCLLGELYEPTDPRRDGGFSLMYAAGNVGSIIAPIACGYAQEEYSWAMGFGLAAVGMIA GLVIFLCGNRHFTHTRGVNKKVLRATNFLLPNWGWLLVLLVATPALITVLFWKEWSVYAL IVATIIGLGVLAKIYRKAENQKQRKELGLIVTLTFFSMLFWAFAQQGGSSISLYIDRFVN RDMFGYTVPTAMFQSINAFAVMLCGVFLAWVVKESVAGNRTVRIWGKFALGLGLMSAGFC ILTLSARWSAMYGHSSLPLMVLGLAVMGFAELFIDPVAMSQITRIEIPGVTGVLTGIYML LSGAIANYLAGVIADQTSQASFDASGAINYSINAYIEVFDQITWGALACVGLVQMIWLYQ ALKFRNRALALES >gi|296493138|gb|ADTK01000363.1| GENE 6 4184 - 4927 827 247 aa, chain + ## HITS:1 COG:ECs0735 KEGG:ns NR:ns ## COG: ECs0735 COG0327 # Protein_GI_number: 15829989 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 247 1 247 247 499 100.0 1e-141 MKNTELEQLINEKLNSAAISDYAPNGLQVEGKETVQKIVTGVTASQALLDEAVRLGADAV IVHHGYFWKGESPVIRGMKRNRLKTLLANDINLYGWHLPLDAHPELGNNAQLAALLGITV MGEIEPLVPWGELTMPVPGLELASWIEARLGRKPLWCGDTGPEVVQRVAWCTGGGQSFID SAARFGVDAFITGEVSEQTIHSAREQGLHFYAAGHHATERGGIRALSEWLNENTDLDVTF IDIPNPA >gi|296493138|gb|ADTK01000363.1| GENE 7 4950 - 5606 634 218 aa, chain + ## HITS:1 COG:ECs0736 KEGG:ns NR:ns ## COG: ECs0736 COG2049 # Protein_GI_number: 15829990 # Func_class: E Amino acid transport and metabolism # Function: Allophanate hydrolase subunit 1 # 
Organism: Escherichia coli O157:H7 # 1 218 1 218 218 417 100.0 1e-117 MQRARCYLIGETAVVLELEPPVTLASQKRIWRLAQRLVDMPNVVEAIPGMNNITVILRNP ESLALDAIERLQRWWEESEALEPESRFIEIPVVYGGAGGPDLAVVAAHCGLSEKQVVELH SSVEYVVWFLGFQPGFPYLGSLPEQLHTPRRAEPRLLVPAGSVGIGGPQTGVYPLATPGG WQLIGHTSLSLFDPARDEPILLRPGDSVRFVPQKEGVC >gi|296493138|gb|ADTK01000363.1| GENE 8 5600 - 6532 916 310 aa, chain + ## HITS:1 COG:ECs0737 KEGG:ns NR:ns ## COG: ECs0737 COG1984 # Protein_GI_number: 15829991 # Func_class: E Amino acid transport and metabolism # Function: Allophanate hydrolase subunit 2 # Organism: Escherichia coli O157:H7 # 1 310 1 310 310 626 98.0 1e-179 MLKIIRAGMYTTVQDGGRHGFRQSGISHCGALGMPALRIANLLVGNDANAPALEITLGQL TVEFETDGWFALTGAGCEARLDDNAVWTGWRLPMKAGQRLTLKRPQHGMRSYLAVAGGID VPPVMGSCSTDLNVGIGGLEGRLLKDGDRLPIGKSKRDFMEAQGVKQLLWGNRIRALPGP EYHEFDRASQDAFWRSPWQLSPQSNRMGYRLQGQILKRTTERELLSHGLLPGVVQVPHNG QPIVLMNDAQTTGGYPRIACIIEADMYHLAQIPLGQPIHFVQCSLEEALKARQDQQRYFE QLAWRLHNEN >gi|296493138|gb|ADTK01000363.1| GENE 9 6522 - 7256 658 244 aa, chain + ## HITS:1 COG:ybgL KEGG:ns NR:ns ## COG: ybgL COG1540 # Protein_GI_number: 16128688 # Func_class: R General function prediction only # Function: Uncharacterized proteins, homologs of lactam utilization protein B # Organism: Escherichia coli K12 # 1 244 1 244 244 434 97.0 1e-122 MKIDLNADLGEGCASDAELLTLVSSANIACGFHAGDAQTMHACVREAIKNGVAIGAHPSF PDRENFGRSAMQLPPETVYAQTLYQIGALATIARAQGGVMRHVKPHGMLYNQAAKEAQLA DAIARAVYACDPALILVGLAGSELIRAGKQYGLTTREEVFADRGYQADGSLVPRSQRGAL IENEQQALAQTLEMVQHGRVKSITGEWATVAAQTVCLHGDGEHALAFARRLRATFAKKGI VVAA >gi|296493138|gb|ADTK01000363.1| GENE 10 7292 - 8083 574 263 aa, chain + ## HITS:1 COG:ECs0739 KEGG:ns NR:ns ## COG: ECs0739 COG0266 # Protein_GI_number: 15829993 # Func_class: L Replication, recombination and repair # Function: Formamidopyrimidine-DNA glycosylase # Organism: Escherichia coli O157:H7 # 1 263 1 263 263 544 99.0 1e-155 MPEGPEIRRAADNLEAAIKGKPLTDVWFAFPQLKTYQSQLIGQHVTHVETRGKALLTHFS NDLTLYSHNQLYGVWRVVDTGEEPQTTRVLRVKLQTADKTILLYSASDIEMLTPEQLTTH 
PFLQRVGPDVLDPNLTPEVVKERLLSPRFRNRQFAGLLLDQAFLAGLGNYLRVEILWQVG LTGNHKAKDLNAAQLDALAHALLEIPRFSYATRGQVDENKHHGALFRFKVFHRDGELCER CGGIIEKTTLSSRPFYWCPGCQH >gi|296493138|gb|ADTK01000363.1| GENE 11 8080 - 9126 912 348 aa, chain - ## HITS:1 COG:abrB KEGG:ns NR:ns ## COG: abrB COG3180 # Protein_GI_number: 16128690 # Func_class: R General function prediction only # Function: Putative ammonia monooxygenase # Organism: Escherichia coli K12 # 1 348 16 363 363 526 98.0 1e-149 MPVLQWGMLCVLSLLLSIGFLAVHLPAALLLGPMIAGIIFSMRGITLQLPRSAFLAAQAI LGCMIAQNLTGSILTTLAANWPIVLAILLVTLLSSAIVGWLLVRYSSLPGNTGAWGSSPG GAAAMVAMAQDYGADIRLVAFMQYLRVLFVAGAAVLVTRMMLGDNAEAVNQQIVWFPPVS INLLLTILLAVVAGTAGCMLRLPSGTMLIPMLAGAVLQSGQLITIELPEWLLAMAYMAIG WRIGLGFDKQILLRALRPLPQILLSIFALLAICAGMAWGLTRFMHIDFMTAYLATSPGGL DTVAVIAAGSNADMALIMAMQTLRLFSILLTGPAIARFISTYAPKRSA >gi|296493138|gb|ADTK01000363.1| GENE 12 9278 - 10339 784 353 aa, chain - ## HITS:1 COG:no KEGG:SSON_0667 NR:ns ## KEGG: SSON_0667 # Name: ybgO # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 353 11 363 363 726 100.0 0 MSAGKGVLLVICLLFLPLKSALALNCYFGSSGGSVEKSEAIQPFAVPGNAKLGDKIWESD DIKIPVYCDNNTNGNFESEHVYAWVNPYPGVQDRYYQLGVTYNGVDYDANQGKSRIDTNQ CIDSKNIDIYTPEQIIAMGWQNKICSGDPANIHMSRTFLARMRLYVKIREMPPHDYQSTL SDYIVVQFDGAGSVNEDPTAQNLKYHITGLENIRVLDCSVNFSISPETQVIDFGKFNLLD IRRHTMSKTFSIKTTKSQNDQCTDGFKVSSSFYTEETLVEEDKALLIGNGLKLRLLDENA SPYTFNKYAEYADFTSDMLVYEKTYTAELSSIAGTPIEAGPFDTVVLFKINYN >gi|296493138|gb|ADTK01000363.1| GENE 13 10336 - 11067 551 243 aa, chain - ## HITS:1 COG:ybgP KEGG:ns NR:ns ## COG: ybgP COG3121 # Protein_GI_number: 16128692 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli K12 # 1 242 1 242 242 446 90.0 1e-125 MTFMKGLPLLLLVASLCSHAALQPDRTRIVFNANDKATSLRVDNRSDKLPYLAYSWLENE KGEKSDDLLVALPPIQRLEPKATTQVRIVKQASTTKLPGDRETLFFYNMREIPPAPEKNS DHAVLQVAIQSRIKVFWRPAALRKKAGEKVELQLQVSQQGNQLTLKNPTAYYLTIAYLGR 
NEKGVLPGFKTVMVAPFSTVNTNTGNYSGSQFYLGYMDDYGALRMTTLNCSGQCHLQAVE AKK >gi|296493138|gb|ADTK01000363.1| GENE 14 11082 - 13541 1624 819 aa, chain - ## HITS:1 COG:ybgQ KEGG:ns NR:ns ## COG: ybgQ COG3188 # Protein_GI_number: 16128693 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli K12 # 1 819 1 818 818 1518 95.0 0 MDTVNIYRLSFVSCLVVAMPCALAVEFNLNVLDKSMRDRIDISLLKEKGVIAPGEYFVSV AVNNNQISNGQKINWHKNDDKTIPCINDLLVDKFGLKPEVRQSLPLINQCVDFSSRPEML FNFDQANQQLNITIPQAWLAWHSENWTPPSTWKEGVAGVLMDYNLFASSYRPQDGSSSTN LNAYGTTGINAGAWRLRSDYQLNQTDSDDNHEQSGGISRTYLFRPLPQLGSKLTLGETDF SSNIFDGFSYTGAALASDERMLPWELRGYAPQISGIAQTNATVTISQSGRVIYQKKVPPG PFIIDDLNQSVQGTLDVKVTEEDGRVNNFQVSAASTPFLTRQGQVRYKLAAGQPRPSMSH QTENETFFSNEVSWGMLSNTSLYGGLLLSGDDYHSAAMGIGQNMLWLGALSFDVTWASSH FDTQQDERGLSYRFNYSKQVDATNSTISLAAYRFSDRHFHSYANYLDHKYNDSDAQDEKQ TISLSVGQPITPLNLNLYANLLHQTWWNADASTTANITAGFNVDIGDWRDISISTSFNTT HYEDKDRDNQIYLSISLPFGNGGRVGYDMQNSSHSTTHRMSWNDTLDERNSWGMSAGLQS DRPDNGAQVSGNYQHLSSAGEWDISGTYAANDYSSVSSSWSGSFTATQYGAAFHRRSSTN EPRLMVSTDGVADIPVQGNLDYTNHFGIAVVPLISSYQPSTVAVNMNDLPDGVTVAENVI KETWIEGAIGYKSLASRSGKDVNVIIRNASGQFPPLGADIRQDDSGISVGMVGEEGHAWL SGVAENQKFTVVWGDSQHCSLHLPEHMEDTANRLILPCH >gi|296493138|gb|ADTK01000363.1| GENE 15 13673 - 14158 473 161 aa, chain - ## HITS:1 COG:ybgD KEGG:ns NR:ns ## COG: ybgD COG3539 # Protein_GI_number: 16128694 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 161 1 161 188 242 78.0 3e-64 MFKGQKTLAALAVSLLFTAPVYAADEGSGEIHFKGEVIEAPCEIHQDDIDKEVELGQVTT SHINQSHHSDAVAVDLRLVNCDLENSSNGSGGKISKVAVTFDSSAKTTGADPILNNTSTG EATGVGVRLMNKDQSNIVLGTATPDIDLAPTSSEQTLNFFA Prediction of potential genes in microbial genomes Time: Mon May 16 16:11:56 2011 Seq name: gi|296493137|gb|ADTK01000364.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1165.2, whole genome shotgun 
sequence Length of sequence - 62903 bp Number of predicted genes - 64, with homology - 64 Number of transcription units - 28, operones - 10 average op.length - 4.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 1078 - 1137 2.9 2 2 Tu 1 . + CDS 1356 - 1541 92 ## ECSE_0779 hypothetical protein + Prom 1796 - 1855 5.5 3 3 Op 1 24/0.000 + CDS 2062 - 2466 280 ## COG2009 Succinate dehydrogenase/fumarate reductase, cytochrome b subunit 4 3 Op 2 22/0.000 + CDS 2460 - 2807 472 ## COG2142 Succinate dehydrogenase, hydrophobic anchor subunit 5 3 Op 3 36/0.000 + CDS 2807 - 4573 2097 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 6 3 Op 4 5/0.333 + CDS 4589 - 5305 608 ## COG0479 Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit + Prom 5378 - 5437 4.5 7 4 Op 1 21/0.000 + CDS 5606 - 8407 2790 ## COG0567 2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, and related enzymes 8 4 Op 2 6/0.000 + CDS 8422 - 9639 1595 ## COG0508 Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes 9 4 Op 3 39/0.000 + CDS 9733 - 10899 1402 ## COG0045 Succinyl-CoA synthetase, beta subunit 10 4 Op 4 . + CDS 10899 - 11768 986 ## COG0074 Succinyl-CoA synthetase, alpha subunit + Term 11844 - 11873 2.1 - Term 11761 - 11810 1.2 11 5 Tu 1 . - CDS 11872 - 12594 712 ## COG2188 Transcriptional regulators - Prom 12645 - 12704 6.8 + Prom 12660 - 12719 4.5 12 6 Op 1 2/0.556 + CDS 12763 - 14679 1806 ## COG1299 Phosphotransferase system, fructose-specific IIC component 13 6 Op 2 . + CDS 14697 - 17330 1837 ## COG0383 Alpha-mannosidase 14 6 Op 3 . 
+ CDS 17355 - 17582 66 ## SSON_0684 hypothetical protein + Prom 17990 - 18049 7.7 15 7 Op 1 31/0.000 + CDS 18177 - 19745 1718 ## COG1271 Cytochrome bd-type quinol oxidase, subunit 1 16 7 Op 2 3/0.556 + CDS 19761 - 20900 1392 ## COG1294 Cytochrome bd-type quinol oxidase, subunit 2 17 7 Op 3 2/0.556 + CDS 20915 - 21028 109 ## COG4890 Predicted outer membrane lipoprotein 18 7 Op 4 7/0.000 + CDS 21028 - 21321 191 ## COG3790 Predicted membrane protein + Term 21343 - 21378 4.3 + Prom 21371 - 21430 3.2 19 7 Op 5 13/0.000 + CDS 21585 - 21875 225 ## COG0824 Predicted thioesterase 20 7 Op 6 30/0.000 + CDS 21881 - 22564 712 ## COG0811 Biopolymer transport proteins 21 7 Op 7 8/0.000 + CDS 22568 - 22996 429 ## COG0848 Biopolymer transport protein 22 7 Op 8 8/0.000 + CDS 23061 - 24326 1200 ## COG3064 Membrane protein involved in colicin uptake + Prom 24348 - 24407 4.0 23 7 Op 9 20/0.000 + CDS 24459 - 25751 1300 ## COG0823 Periplasmic component of the Tol biopolymer transport system 24 7 Op 10 13/0.000 + CDS 25786 - 26307 657 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins 25 7 Op 11 6/0.000 + CDS 26317 - 27108 589 ## COG1729 Uncharacterized protein conserved in bacteria + TRNA 27273 - 27348 99.5 # Lys TTT 0 0 + TRNA 27484 - 27559 94.3 # Val TAC 0 0 + TRNA 27563 - 27638 99.5 # Lys TTT 0 0 + TRNA 27788 - 27863 94.3 # Val TAC 0 0 + TRNA 27867 - 27942 99.5 # Lys TTT 0 0 + TRNA 27976 - 28051 99.5 # Lys TTT 0 0 26 7 Op 12 5/0.333 + CDS 28301 - 29344 610 ## COG0379 Quinolinate synthase 27 7 Op 13 . + CDS 29397 - 30101 579 ## COG3201 Nicotinamide mononucleotide transporter - Term 30053 - 30089 4.7 28 8 Tu 1 . - CDS 30098 - 31039 753 ## COG1230 Co/Zn/Cd efflux system component - Prom 31063 - 31122 6.0 - Term 31098 - 31131 2.1 29 9 Tu 1 . - CDS 31153 - 31533 313 ## B21_00695 hypothetical protein + Prom 31749 - 31808 6.1 30 10 Tu 1 . 
+ CDS 31849 - 32901 1109 ## COG0722 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase + Term 33014 - 33046 2.8 - Term 32998 - 33038 7.1 31 11 Tu 1 4/0.444 - CDS 33059 - 33811 1067 ## COG0588 Phosphoglycerate mutase 1 - Prom 33846 - 33905 4.1 - Term 33973 - 34011 6.1 32 12 Op 1 6/0.000 - CDS 34016 - 35056 1149 ## COG2017 Galactose mutarotase and related enzymes 33 12 Op 2 8/0.000 - CDS 35050 - 36198 1141 ## COG0153 Galactokinase 34 12 Op 3 3/0.556 - CDS 36202 - 37248 774 ## COG1085 Galactose-1-phosphate uridylyltransferase 35 12 Op 4 . - CDS 37258 - 38274 1055 ## COG1087 UDP-glucose 4-epimerase - Prom 38396 - 38455 7.5 36 13 Op 1 4/0.444 - CDS 38535 - 40007 177 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 37 13 Op 2 . - CDS 40075 - 40863 797 ## COG2005 N-terminal domain of molybdenum-binding protein - Prom 40985 - 41044 3.2 + Prom 40911 - 40970 6.1 38 14 Tu 1 . + CDS 40992 - 41141 131 ## ECSE_0815 hypothetical protein + Term 41148 - 41183 4.5 + Prom 41157 - 41216 4.3 39 15 Op 1 23/0.000 + CDS 41308 - 42081 855 ## COG0725 ABC-type molybdate transport system, periplasmic component 40 15 Op 2 13/0.000 + CDS 42081 - 42770 670 ## COG4149 ABC-type molybdate transport system, permease component 41 15 Op 3 . + CDS 42773 - 43831 1204 ## COG4148 ABC-type molybdate transport system, ATPase component 42 16 Tu 1 . - CDS 43832 - 44650 958 ## COG0561 Predicted hydrolases of the HAD superfamily - Prom 44755 - 44814 6.0 43 17 Tu 1 . + CDS 44805 - 45800 923 ## COG2706 3-carboxymuconate cyclase + Term 45813 - 45848 6.4 - Term 45679 - 45725 5.0 44 18 Tu 1 . - CDS 45841 - 46674 552 ## COG0583 Transcriptional regulator - Prom 46838 - 46897 6.7 + Prom 46731 - 46790 4.8 45 19 Op 1 3/0.556 + CDS 46978 - 48030 725 ## COG2828 Uncharacterized protein conserved in bacteria 46 19 Op 2 . + CDS 48106 - 49539 1653 ## COG0471 Di- and tricarboxylate transporters + Prom 49627 - 49686 4.7 47 20 Tu 1 . 
+ CDS 49722 - 51983 2156 ## COG1048 Aconitase A + Term 52000 - 52054 7.3 48 21 Tu 1 . - CDS 52217 - 53500 1320 ## COG4677 Pectin methylesterase - Prom 53560 - 53619 6.8 49 22 Tu 1 . - CDS 53635 - 54405 210 ## COG0582 Integrase - Term 54952 - 54978 -1.0 50 23 Tu 1 . - CDS 54987 - 55610 248 ## COG3561 Phage anti-repressor protein - Prom 55826 - 55885 2.1 51 24 Op 1 . - CDS 55966 - 56250 208 ## ECB_00728 hypothetical protein 52 24 Op 2 . - CDS 56250 - 56816 405 ## ECED1_2132 hypothetical protein 53 24 Op 3 . - CDS 56821 - 57357 312 ## ECO26_3387 hypothetical protein 54 24 Op 4 . - CDS 57344 - 57598 231 ## gi|168260993|ref|ZP_02682966.1| hypothetical protein SeH_A0844 55 24 Op 5 . - CDS 57595 - 57828 128 ## gi|300907031|ref|ZP_07124700.1| conserved domain protein 56 24 Op 6 . - CDS 57825 - 58265 250 ## E2348C_0984 hypothetical protein 57 25 Op 1 . - CDS 58437 - 58733 179 ## EFER_2084 anti-RecBCD protein 2 58 25 Op 2 . - CDS 58757 - 59140 281 ## EC55989_2646 hypothetical protein 59 25 Op 3 . - CDS 59140 - 59745 340 ## EFER_2082 DNA single-strand annealing protein; essential recombination function protein Erf 60 25 Op 4 . - CDS 59756 - 59926 110 ## ECS88_2547 hypothetical protein 61 25 Op 5 . - CDS 60002 - 60154 74 ## SeSA_A0612 hypothetical protein - Prom 60181 - 60240 3.2 62 26 Tu 1 . - CDS 60295 - 61263 732 ## EFER_2078 conserved hypothetical protein from phage + Prom 61706 - 61765 2.4 63 27 Tu 1 . + CDS 61864 - 62055 101 ## EFER_2076 hypothetical protein + Term 62197 - 62226 -0.2 64 28 Tu 1 . 
- CDS 62077 - 62409 254 ## SFV_0269 putative bacteriophage protein Predicted protein(s) >gi|296493137|gb|ADTK01000364.1| GENE 1 85 - 1368 1406 427 aa, chain - ## HITS:1 COG:gltA KEGG:ns NR:ns ## COG: gltA COG0372 # Protein_GI_number: 16128695 # Func_class: C Energy production and conversion # Function: Citrate synthase # Organism: Escherichia coli K12 # 1 427 1 427 427 901 100.0 0 MADTKAKLTLNGDTAVELDVLKGTLGQDVIDIRTLGSKGVFTFDPGFTSTASCESKITFI DGDEGILLHRGFPIDQLATDSNYLEVCYILLNGEKPTQEQYDEFKTTVTRHTMIHEQITR LFHAFRRDSHPMAVMCGITGALAAFYHDSLDVNNPRHREIAAFRLLSKMPTMAAMCYKYS IGQPFVYPRNDLSYAGNFLNMMFSTPCEPYEVNPILERAMDRILILHADHEQNASTSTVR TAGSSGANPFACIAAGIASLWGPAHGGANEAALKMLEEISSVKHIPEFVRRAKDKNDSFR LMGFGHRVYKNYDPRATVMRETCHEVLKELGTKDDLLEVAMELENIALNDPYFIEKKLYP NVDFYSGIILKAMGIPSSMFTVIFAMARTVGWIAHWSEMHSDGMKIARPRQLYTGYEKRD FKSDIKR >gi|296493137|gb|ADTK01000364.1| GENE 2 1356 - 1541 92 61 aa, chain + ## HITS:1 COG:no KEGG:ECSE_0779 NR:ns ## KEGG: ECSE_0779 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 61 3 63 63 120 98.0 1e-26 MYQPFKVSLAPYYVRLPELKFAFAHQPGFTRLLFGSPLCERGENLGTELWALAGKGSIDD E >gi|296493137|gb|ADTK01000364.1| GENE 3 2062 - 2466 280 134 aa, chain + ## HITS:1 COG:ECs0746 KEGG:ns NR:ns ## COG: ECs0746 COG2009 # Protein_GI_number: 15830000 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, cytochrome b subunit # Organism: Escherichia coli O157:H7 # 6 134 1 129 129 220 100.0 4e-58 MWALFMIRNVKKQRPVNLDLQTIRFPITAIASILHRVSGVITFVAVGILLWLLGTSLSSP EGFEQASAIMGSFFVKFIMWGILTALAYHVVVGIRHMMMDFGYLEETFEAGKRSAKISFV ITVVLSLLAGVLVW >gi|296493137|gb|ADTK01000364.1| GENE 4 2460 - 2807 472 115 aa, chain + ## HITS:1 COG:ECs0747 KEGG:ns NR:ns ## COG: ECs0747 COG2142 # Protein_GI_number: 15830001 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase, hydrophobic anchor subunit # Organism: Escherichia coli O157:H7 # 1 115 1 115 115 148 99.0 3e-36 
MVSNASALGRNGVHDFILVRATAIVLTLYIIYMVGFFATSGELTYEVWIGFFASAFTKVF TLLALFSILIHAWIGMWQVLTDYVKPLALRLMLQLVIVVALVVYVIYGFVVVWGV >gi|296493137|gb|ADTK01000364.1| GENE 5 2807 - 4573 2097 588 aa, chain + ## HITS:1 COG:ECs0748 KEGG:ns NR:ns ## COG: ECs0748 COG1053 # Protein_GI_number: 15830002 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Escherichia coli O157:H7 # 1 588 1 588 588 1153 100.0 0 MKLPVREFDAVVIGAGGAGMRAALQISQSGQTCALLSKVFPTRSHTVSAQGGITVALGNT HEDNWEWHMYDTVKGSDYIGDQDAIEYMCKTGPEAILELEHMGLPFSRLDDGRIYQRPFG GQSKNFGGEQAARTAAAADRTGHALLHTLYQQNLKNHTTIFSEWYALDLVKNQDGAVVGC TALCIETGEVVYFKARATVLATGGAGRIYQSTTNAHINTGDGVGMAIRAGVPVQDMEMWQ FHPTGIAGAGVLVTEGCRGEGGYLLNKHGERFMERYAPNAKDLAGRDVVARSIMIEIREG RGCDGPWGPHAKLKLDHLGKEVLESRLPGILELSRTFAHVDPVKEPIPVIPTCHYMMGGI PTKVTGQALTVNEKGEDVVVPGLFAVGEIACVSVHGANRLGGNSLLDLVVFGRAAGLHLQ ESIAEQGALRDASESDVEASLDRLNRWNNNRNGEDPVAIRKALQECMQHNFSVFREGDAM AKGLEQLKVIRERLKNARLDDTSSEFNTQRVECLELDNLMETAYATAVSANFRTESRGAH SRFDFPDRDDENWLCHSLYLPESESMTRRSVNMEPKLRPAFPPKIRTY >gi|296493137|gb|ADTK01000364.1| GENE 6 4589 - 5305 608 238 aa, chain + ## HITS:1 COG:ECs0749 KEGG:ns NR:ns ## COG: ECs0749 COG0479 # Protein_GI_number: 15830003 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit # Organism: Escherichia coli O157:H7 # 1 238 1 238 238 495 100.0 1e-140 MRLEFSIYRYNPDVDDAPRMQDYTLEAEEGRDMMLLDALIQLKEKDPSLSFRRSCREGVC GSDGLNMNGKNGLACITPISALNQPGKKIVIRPLPGLPVIRDLVVDMGQFYAQYEKIKPY LLNNGQNPPAREHLQMPEQREKLDGLYECILCACCSTSCPSFWWNPDKFIGPAGLLAAYR FLIDSRDTETDSRLDGLSDAFSVFRCHSIMNCVSVCPKGLNPTRAIGHIKSMLLQRNA >gi|296493137|gb|ADTK01000364.1| GENE 7 5606 - 8407 2790 933 aa, chain + ## HITS:1 COG:ECs0751 KEGG:ns NR:ns ## COG: ECs0751 COG0567 # Protein_GI_number: 15830005 # Func_class: C Energy production and conversion # Function: 2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, and related enzymes # Organism: 
Escherichia coli O157:H7 # 1 933 1 933 933 1944 100.0 0 MQNSALKAWLDSSYLSGANQSWIEQLYEDFLTDPDSVDANWRSTFQQLPGTGVKPDQFHS QTREYFRRLAKDASRYSSTISDPDTNVKQVKVLQLINAYRFRGHQHANLDPLGLWQQDKV ADLDPSFHDLTEADFQETFNVGSFASGKETMKLGELLEALKQTYCGPIGAEYMHITSTEE KRWIQQRIESGRATFNSEEKKRFLSELTAAEGLERYLGAKFPGAKRFSLEGGDALIPMLK EMIRHAGNSGTREVVLGMAHRGRLNVLVNVLGKKPQDLFDEFAGKHKEHLGTGDVKYHMG FSSDFQTDGGLVHLALAFNPSHLEIVSPVVIGSVRARLDRLDEPSSNKVLPITIHGDAAV TGQGVVQETLNMSKARGYEVGGTVRIVINNQVGFTTSNPLDARSTPYCTDIGKMVQAPIF HVNADDPEAVAFVTRLALDFRNTFKRDVFIDLVCYRRHGHNEADEPSATQPLMYQKIKKH PTPRKIYADKLEQEKVATLEDATEMVNLYRDALDAGDCVVAEWRPMNMHSFTWSPYLNHE WDEEYPNKVEMKRLQELAKRISTVPEAVEMQSRVAKIYGDRQAMAAGEKLFDWGGAENLA YATLVDEGIPVRLSGEDSGRGTFFHRHAVIHNQSNGSTYTPLQHIHNGQGAFRVWDSVLS EEAVLAFEYGYATAEPRTLTIWEAQFGDFANGAQVVIDQFISSGEQKWGRMCGLVMLLPH GYEGQGPEHSSARLERYLQLCAEQNMQVCVPSTPAQVYHMLRRQALRGMRRPLVVMSPKS LLRHPLAVSSLEELANGTFLPAIGEIDELDPKGVKRVVMCSGKVYYDLLEQRRKNNQHDV AIVRIEQLYPFPHKAMQEVLQQFAHVKDFVWCQEEPLNQGAWYCSQHHFREVIPFGASLR YAGRPASASPAVGYMSVHQKQQQDLVNDALNVE >gi|296493137|gb|ADTK01000364.1| GENE 8 8422 - 9639 1595 405 aa, chain + ## HITS:1 COG:STM0737 KEGG:ns NR:ns ## COG: STM0737 COG0508 # Protein_GI_number: 16764107 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes # Organism: Salmonella typhimurium LT2 # 1 405 1 402 402 717 96.0 0 MSSVDILVPDLPESVADATVATWHKKPGDAVVRDEVLVEIETDKVVLEVPASADGILDAV LEDEGTTVTSRQILGRLREGNSAGKETSAKSEEKASTPAQRQQASLEEQNNDALSPAIRR LLAEHNLDASAIKGTGVGGRLTREDVEKHLAKAPAKESAPAAAAPAAQPALAARSEKRVP MTRLRKRVAERLLEAKNSTAMLTTFNEVNMKPIMDLRKQYGEAFEKRHGIRLGFMSFYVK AVVEALKRYPEVNASIDGDDVVYHNYFDVSMAVSTPRGLVTPVLRDVDTLGMADIEKKIK ELAVKGRDGKLTVEDLTGGNFTITNGGVFGSLMSTPIINPPQSAILGMHAIKDRPMAVNG QVEILPMMYLALSYDHRLIDGRESVGFLVTIKELLEDPTRLLLDV >gi|296493137|gb|ADTK01000364.1| GENE 9 9733 - 10899 1402 388 aa, chain + ## HITS:1 COG:ECs0753 KEGG:ns NR:ns ## COG: ECs0753 COG0045 # Protein_GI_number: 15830007 # 
Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, beta subunit # Organism: Escherichia coli O157:H7 # 1 388 1 388 388 728 100.0 0 MNLHEYQAKQLFARYGLPAPVGYACTTPREAEEAASKIGAGPWVVKCQVHAGGRGKAGGV KVVNSKEDIRAFAENWLGKRLVTYQTDANGQPVNQILVEAATDIAKELYLGAVVDRSSRR VVFMASTEGGVEIEKVAEETPHLIHKVALDPLTGPMPYQGRELAFKLGLEGKLVQQFTKI FMGLATIFLERDLALIEINPLVITKQGDLICLDGKLGADGNALFRQPDLREMRDQSQEDP REAQAAQWELNYVALDGNIGCMVNGAGLAMGTMDIVKLHGGEPANFLDVGGGATKERVTE AFKIILSDDKVKAVLVNIFGGIVRCDLIADGIIGAVAEVGVNVPVVVRLEGNNAELGAKK LADSGLNIIAAKGLTDAAQQVVAAVEGK >gi|296493137|gb|ADTK01000364.1| GENE 10 10899 - 11768 986 289 aa, chain + ## HITS:1 COG:ECs0754 KEGG:ns NR:ns ## COG: ECs0754 COG0074 # Protein_GI_number: 15830008 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, alpha subunit # Organism: Escherichia coli O157:H7 # 1 289 1 289 289 508 100.0 1e-144 MSILIDKNTKVICQGFTGSQGTFHSEQAIAYGTKMVGGVTPGKGGTTHLGLPVFNTVREA VAATGATASVIYVPAPFCKDSILEAIDAGIKLIITITEGIPTLDMLTVKVKLDEAGVRMI GPNCPGVITPGECKIGIQPGHIHKPGKVGIVSRSGTLTYEAVKQTTDYGFGQSTCVGIGG DPIPGSNFIDILEMFEKDPQTEAIVMIGEIGGSAEEEAAAYIKEHVTKPVVGYIAGVTAP KGKRMGHAGAIIAGGKGTADEKFAALEAAGVKTVRSLADIGEALKTVLK >gi|296493137|gb|ADTK01000364.1| GENE 11 11872 - 12594 712 240 aa, chain - ## HITS:1 COG:farR KEGG:ns NR:ns ## COG: farR COG2188 # Protein_GI_number: 16128705 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 240 1 240 240 440 99.0 1e-123 MGHKPLYRQIADRIREQIARGELKPGDALPTESALQTEFGVSRVTVRQALRQLVEQQILE SIQGSGTYVKEERVNYDIFQLTSFDEKLSDRHVDTHSEVLIFEVIPADDFLQQQLQITAQ DRVWHVKRVRYRKQKPMALEETWMPLALFPDLTWQVMENSKYHFIEEVKKMVIDRSEQEI IPLMPTEEMSRLLNISQTKPILEKVSRGYLVDGRVFEYSRNAFNTDDYKFTLIAQRKSSR >gi|296493137|gb|ADTK01000364.1| GENE 12 12763 - 14679 1806 638 aa, chain + ## HITS:1 COG:hrsA_3 KEGG:ns NR:ns ## COG: hrsA_3 COG1299 # Protein_GI_number: 16128706 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, 
fructose-specific IIC component # Organism: Escherichia coli K12 # 279 638 1 360 360 639 99.0 0 MNLTTLTHRDALCLNARFTSREEAIHALTQRLAALGKISSTEQFLEEVYRRESLGPTALG EGLAVPHGKTAAVKEAAFAVATLSEPLQWEGVDGPEAVDLVVLLAIPPNEAGTTHMQLLT ALTTRLADDEIRARIQSATTPDELLSALDDKGGTQPSASFSNAPTIVCVTACPAGIAHTY MAAEYLEKAGRKLGVNVYVEKQGANGIEGRLTADQLNSATACIFAAEVAIKESERFNGIP ALSVPVAEPIRHAEALIQQALTLKRSDETRTVQQDTQPVKSVKTELKQALLSGISFAVPL IVAGGTVLAVAVLLSQIFGLQDLFNEENSWLWMYRKLGGGLLGILMVPVLAAYTAYSLAD KPALAPGFAAGLAANMIGSGFLGAVVGGLIAGYLMRWVKNHLRLSSKFNGFLTFYLYPVL GTLGAGSLMLFVVGEPVAWINNSLTAWLNGLSGSNALLLGAILGFMCSFDLGGPVNKAAY AFCLGAMANGVYGPYAIFASVKMVSAFTVTASTMLAPRLFKEFEIETGKSTWLLGLAGIT EGAIPMAIEDPLRVIGSFVLGSMVTGAIVGAMNIGLSTPGAGIFSLFLLHDNGAGGVMAA IGWFGAALVGAAISTAILLIWRRHAVKHGNYLTDGVMP >gi|296493137|gb|ADTK01000364.1| GENE 13 14697 - 17330 1837 877 aa, chain + ## HITS:1 COG:ybgG KEGG:ns NR:ns ## COG: ybgG COG0383 # Protein_GI_number: 16128707 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Escherichia coli K12 # 1 877 1 877 877 1795 98.0 0 MKAVSRVHITPHMHWDREWYFTTEESRILLVNNMEEILCRLEQDNEYKYYVLDGQTAILE DYFAVKPENKDRVKKQVEAGKLIIGPWYTQTDTTIVSAESIVRNLMYGMRDCLAFGEPMK IGYLPDSFGMSGQLPHIYNGFGITRTMFWRGCSERHGTDKTEFLWQSSDGSEVTAQVLPL GYAIGKYLPADENGLRKRLDSYFDVLEKASVTKEILLPNGHDQMPLQQNIFEVMEKLREI YPQRKFVMSRFEEVFEQIEAQRESLATLKGEFIDGKYMRVHRTIGSTRMDIKVAHARIEN KIVNLLEPLATLAWTLGFEYHHGLLEKMWKEILKNHAHDSIGCCCSDKVHREIVARFELA EDMADNLLRFYMRKIADNMPQSDADKLVLFNLMPWPREEVINTTVRLRASQFNLRDDRGQ PVPYFIRHAREIDPGLIDRQIVHYGNYDPFMEFDIQINQIVPSMGYRTLYIEANQPGNVI AAKSDAEGILENAFWQIALNEDGTLQLVDKDSGVRYDRVLQIEESSDDGDEYDYSPAKEE WVMTSATAKPQCEITHEAWQSRAVIRYDMAVPLNLSERSVRQSTGRVGVEMVVTLSHNSR RIDVDINLDNQADDHRLRVLIPTPFNTDSVLADTQFGSLTRPVNDSAMNNWQQEGWKEAP VPVWNMLNYVALQEGRNGMAVFSEGLREFEVIGEEKKTFAITLLRGVGLLGKEDLLLRPG RPSGIKMPVPDSQLRGLLSCRLSLLSYTGTPTAAGVAQQARAWLTPVQCYNKIPWDAMKL NKAGFNVPESYSLLKMPPVGCLISALKKAEDRQEVILRLFNPAESATCDATVAFSREVIS CSETMMDEHITTEENQGSNLSGPFLPGQSRTFSYRLA >gi|296493137|gb|ADTK01000364.1| GENE 14 17355 - 17582 
66 75 aa, chain + ## HITS:1 COG:no KEGG:SSON_0684 NR:ns ## KEGG: SSON_0684 # Name: not_defined # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 75 1 75 75 144 100.0 7e-34 MRGQIVDKGRFVHAGYGMNALFGLQKQANSIYCRDDVDTGKRSASGNFAFIFTLKPRICG FIFNKIITLGELIII >gi|296493137|gb|ADTK01000364.1| GENE 15 18177 - 19745 1718 522 aa, chain + ## HITS:1 COG:ECs0768 KEGG:ns NR:ns ## COG: ECs0768 COG1271 # Protein_GI_number: 15830022 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 1 # Organism: Escherichia coli O157:H7 # 1 522 2 523 523 1010 99.0 0 MLDIVELSRLQFALTAMYHFLFVPLTLGMAFLLAIMETVYVLSGKQIYKDMTKFWGKLFG INFALGVATGLTMEFQFGTNWSYYSHYVGDIFGAPLAIEGLMAFFLESTFVGLFFFGWDR LGKVQHMCVTWLVALGSNLSALWILVANGWMQNPIASDFNFETMRMEMVSFSELVLNPVA QVKFVHTVASGYVTGAMFILGVSAWYMLKGRDFAFAKRSFAIAASFGMAAVLSVIVLGDE SGYEMGDVQKTKLAAIEAEWETQPAPAAFTLFGIPDQEEETNKFAIQIPYALGIIATRSV DTPVIGLKELMVQHEERIRNGMKAYSLLEQLRSGSTDQAVRDQFNSMKKDLGYGLLLKRY TPNVADATEAQIQQATKDSIPRVAPLYFAFRIMVACGFLLLAIIALSFWSVIRNRIGEKK WLLRAALYGIPLPWIAVEAGWFVAEYGRQPWAIGEVLPTAVANSSLTAGDLIFSMVLICG LYTLFLVAELFLMFKFARLGPSSLKTGRYHFEQSSTTTQPAR >gi|296493137|gb|ADTK01000364.1| GENE 16 19761 - 20900 1392 379 aa, chain + ## HITS:1 COG:ECs0769 KEGG:ns NR:ns ## COG: ECs0769 COG1294 # Protein_GI_number: 15830023 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 2 # Organism: Escherichia coli O157:H7 # 1 379 1 379 379 699 100.0 0 MIDYEVLRFIWWLLVGVLLIGFAVTDGFDMGVGMLTRFLGRNDTERRIMINSIAPHWDGN QVWLITAGGALFAAWPMVYAAAFSGFYVAMILVLASLFFRPVGFDYRSKIEETRWRNMWD WGIFIGSFVPPLVIGVAFGNLLQGVPFNVDEYLRLYYTGNFFQLLNPFGLLAGVVSVGMI ITQGATYLQMRTVGELHLRTRATAQVAALVTLVCFALAGVWVMYGIDGYVVKSTMDHYAA SNPLNKEVVREAGAWLVNFNNTPILWAIPALGVVLPLLTILTARMDKAAWAFVFSSLTLA CIILTAGIAMFPFVMPSSTMMNASLTMWDATSSQLTLNVMTWVAVVLVPIILLYTAWCYW KMFGRITKEDIERNTHSLY >gi|296493137|gb|ADTK01000364.1| GENE 17 20915 - 21028 109 37 aa, chain + ## HITS:1 COG:STM0742 KEGG:ns NR:ns ## COG: STM0742 
COG4890 # Protein_GI_number: 16764112 # Func_class: S Function unknown # Function: Predicted outer membrane lipoprotein # Organism: Salmonella typhimurium LT2 # 1 36 1 36 37 57 86.0 5e-09 MWYFAWILGTLLACSFGVITALALEHVESGKAGQEDI >gi|296493137|gb|ADTK01000364.1| GENE 18 21028 - 21321 191 97 aa, chain + ## HITS:1 COG:ECs0770 KEGG:ns NR:ns ## COG: ECs0770 COG3790 # Protein_GI_number: 15830024 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 97 1 97 97 169 100.0 1e-42 MSKIIATLYAVMDKRPLRALSFVMALLLAGCMFWDPSRFAAKTSELEIWHGLLLMWAVCA GVIHGVGFRPQKVLWQGIFCPLLADIVLIVGLIFFFF >gi|296493137|gb|ADTK01000364.1| GENE 19 21585 - 21875 225 96 aa, chain + ## HITS:1 COG:ECs0771 KEGG:ns NR:ns ## COG: ECs0771 COG0824 # Protein_GI_number: 15830025 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Escherichia coli O157:H7 # 1 96 39 134 134 177 100.0 3e-45 MLRHHHFSQQALMAERVAFVVRKMTVEYYAPARLDDMLEIQTEITSMRGTSLVFTQRIVN AENTLLNEAEVLVVCVDPLKMKPRALPKSIVAEFKQ >gi|296493137|gb|ADTK01000364.1| GENE 20 21881 - 22564 712 227 aa, chain + ## HITS:1 COG:ECs0772 KEGG:ns NR:ns ## COG: ECs0772 COG0811 # Protein_GI_number: 15830026 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Escherichia coli O157:H7 # 1 227 4 230 230 412 100.0 1e-115 MNILDLFLKASLLVKLIMLILIGFSIASWAIIIQRTRILNAAAREAEAFEDKFWSGIELS RLYQESQGKRDNLTGSEQIFYSGFKEFVRLHRANSHAPEAVVEGASRAMRISMNRELENL ETHIPFLGTVGSISPYIGLFGTVWGIMHAFIALGAVKQATLQMVAPGIAEALIATAIGLF AAIPAVMAYNRLNQRVNKLELNYDNFMEEFTAILHRQAFTVSESNKG >gi|296493137|gb|ADTK01000364.1| GENE 21 22568 - 22996 429 142 aa, chain + ## HITS:1 COG:ECs0773 KEGG:ns NR:ns ## COG: ECs0773 COG0848 # Protein_GI_number: 15830027 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport protein # Organism: Escherichia coli O157:H7 # 11 142 11 142 142 207 100.0 4e-54 
MARARGRGRRDLKSEINIVPLLDVLLVLLLIFMATAPIITQSVEVDLPDATESQAVSSND NPPVIVEVSGIGQYTVVVEKDRLERLPPEQVVAEVSSRFKANPKTVFLIGGAKDVPYDEI IKALNLLHSAGVKSVGLMTQPI >gi|296493137|gb|ADTK01000364.1| GENE 22 23061 - 24326 1200 421 aa, chain + ## HITS:1 COG:tolA KEGG:ns NR:ns ## COG: tolA COG3064 # Protein_GI_number: 16128714 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane protein involved in colicin uptake # Organism: Escherichia coli K12 # 1 421 1 421 421 286 100.0 4e-77 MSKATEQNDKLKRAIIISAVLHVILFAALIWSSFDENIEASAGGGGGSSIDAVMVDSGAV VEQYKRMQSQESSAKRSDEQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQA EEAAKQAELKQKQAEEAAAKAAADAKAKAEADAKAAEEAAKKAAADAKKKAEAEAAKAAA EAQKKAEAAAAALKKKAEAAEAAAAEARKKAATEAAEKAKAEAEKKAAAEKAAADKKAAA EKAAADKKAAEKAAAEKAAADKKAAAEKAAADKKAAAAKAAAEKAAAAKAAAEADDIFGE LSSGKNAPKTGGGAKGNNASPAGSGNTKNNGASGADINNYAGQIKSAIESKFYDASSYAG KTCTLRIKLAPDGMLLDIKPEGGDPALCQAALAAAKLAKIPKPPSQAVYEVFKNAPLDFK P >gi|296493137|gb|ADTK01000364.1| GENE 23 24459 - 25751 1300 430 aa, chain + ## HITS:1 COG:ECs0775 KEGG:ns NR:ns ## COG: ECs0775 COG0823 # Protein_GI_number: 15830029 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic component of the Tol biopolymer transport system # Organism: Escherichia coli O157:H7 # 1 430 1 430 430 820 100.0 0 MKQALRVAFGFLILWASVLHAEVRIVIDSGVDSGRPIGVVPFQWAGPGAAPEDIGGIVAA DLRNSGKFNPLDRARLPQQPGSAQEVQPAAWSALGIDAVVVGQVTPNPDGSYNVAYQLVD TGGAPGTVLAQNSYKVNKQWLRYAGHTASDEVFEKLTGIKGAFRTRIAYVVQTNGGQFPY ELRVSDYDGYNQFVVHRSPQPLMSPAWSPDGSKLAYVTFESGRSALVIQTLANGAVRQVA SFPRHNGAPAFSPDGSKLAFALSKTGSLNLYVMDLASGQIRQVTDGRSNNTEPTWFPDSQ NLAFTSDQAGRPQVYKVNINGGAPQRITWEGSQNQDADVSSDGKFMVMVSSNGGQQHIAK QDLATGGVQVLSSTFLDETPSLAPNGTMVIYSSSQGMGSVLNLVSTDGRFKARLPATDGQ VKFPAWSPYL >gi|296493137|gb|ADTK01000364.1| GENE 24 25786 - 26307 657 173 aa, chain + ## HITS:1 COG:ECs0776 KEGG:ns NR:ns ## COG: ECs0776 COG2885 # Protein_GI_number: 15830030 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and 
related peptidoglycan-associated (lipo)proteins # Organism: Escherichia coli O157:H7 # 1 173 1 173 173 296 100.0 2e-80 MQLNKVLKGLMIALPVMAIAACSSNKNASNDGSEGMLGAGTGMDANGGNGNMSSEEQARL QMQQLQQNNIVYFDLDKYDIRSDFAQMLDAHANFLRSNPSYKVTVEGHADERGTPEYNIS LGERRANAVKMYLQGKGVSADQISIVSYGKEKPAVLGHDEAAYSKNRRAVLVY >gi|296493137|gb|ADTK01000364.1| GENE 25 26317 - 27108 589 263 aa, chain + ## HITS:1 COG:ECs0777 KEGG:ns NR:ns ## COG: ECs0777 COG1729 # Protein_GI_number: 15830031 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 263 1 263 263 393 99.0 1e-109 MSSNFRHQLLSLSLLVGIAAPWAAFAQAPISSVGSGSVEDRVTQLERISNAHSQLLTQLQ QQLSDNQSDIDSLRGQIQENQYQLNQVVERQKQILLQIDSLSSGGAAAQSTSSDQSGAAA STTPTADAGTANAGAPVKSGDANTDYNAAIALVQDKSRQDDAMVAFQNFIKNYPDSTYLP NANYWLGQLNYNKGKKDDAAYYFASVVKNYPKSPKAADAMFKVGVIMQDKGDTAKAKAVY QQVISKYPGTDGAKQAQKRLNAM >gi|296493137|gb|ADTK01000364.1| GENE 26 28301 - 29344 610 347 aa, chain + ## HITS:1 COG:nadA KEGG:ns NR:ns ## COG: nadA COG0379 # Protein_GI_number: 16128718 # Func_class: H Coenzyme transport and metabolism # Function: Quinolinate synthase # Organism: Escherichia coli K12 # 1 347 1 347 347 688 98.0 0 MSVMFDPDTAIYPFPPKPTPLSIDEKAYYREKIKRLLKERNAVMVAHYYTDPEIQQLAEE TGGCISDSLEMARFGAKHPASTLLVAGVRFMGETAKILSPEKTILMPTLQAECSLDLGCP VEEFNAFCDAHPDRTVVVYANTSAAVKARADWVVTSSISVELIDHLDSLGEKIIWAPDKH LGRYVQKQTGGDILCWQGACIVHDEFKTQALTRLQEEYPDAAILVHPESPQAIVDMADAG GSPRQLIAAAKALPHQRLIVATDRGIFYKMQQAVPDKELLEAPTAGEGATCRSCAHCPWM AMNGLQSIAEALEQEGSNHEVHVDERLRERALVPLNRMLDFAATLRG >gi|296493137|gb|ADTK01000364.1| GENE 27 29397 - 30101 579 234 aa, chain + ## HITS:1 COG:pnuC KEGG:ns NR:ns ## COG: pnuC COG3201 # Protein_GI_number: 16128719 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinamide mononucleotide transporter # Organism: Escherichia coli K12 # 1 234 6 239 239 391 99.0 1e-109 MQNILVHIPIGAGGYDLSWIEAVGTIAGLLCIGLASLEKISNYFFGLINVTLFGIIFFQI QLYASLLLQVFFFAANIYGWYAWSRQTSQNEAELKIRWLPLPKALSWLAVCVVSIGLMTV 
FINPVFAFLTRVAVMIMQALGLQVVMPELQPDAFPFWDSCMMVLSIVAMILMTRKYVENW LLWVIINVISVVIFALQGVYAMSLEYIILTFIALNGSRMWINSARERGSRALSH >gi|296493137|gb|ADTK01000364.1| GENE 28 30098 - 31039 753 313 aa, chain - ## HITS:1 COG:ZybgR KEGG:ns NR:ns ## COG: ZybgR COG1230 # Protein_GI_number: 15800461 # Func_class: P Inorganic ion transport and metabolism # Function: Co/Zn/Cd efflux system component # Organism: Escherichia coli O157:H7 EDL933 # 4 307 2 305 311 529 99.0 1e-150 MAHSHSHTSSHLPEDNNARRLLYAFGVTAGFMLVEVVGGFLSGSLALLADAGHMLTDTAA LLFALLAVQFSRRPPTIRHTFGWLRLTTLAAFVNAIALVVITILIVWEAIERFRTPRPVE GGMMMAIAVAGLLANILSFWLLHHGSEEKNLNVRAAALHVLGDLLGSVGAIIAALIIIWT GWTPADPILSILVSLLVLRSAWRLLKDSVNELLEGAPVSLDIAELKRRMCREIPEVRNVH HVHVWMVGEKPVMTLHVQVIPPHDHDALLDQIQHYLMDHYQIEHATIQMEYQPCHGPDCH LNEGVSGHSHHHH >gi|296493137|gb|ADTK01000364.1| GENE 29 31153 - 31533 313 126 aa, chain - ## HITS:1 COG:no KEGG:B21_00695 NR:ns ## KEGG: B21_00695 # Name: ybgS # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 126 1 126 126 146 100.0 2e-34 MKMTKLATLFLTATLSLASGAALAADSGAQTNNGQANAAADAGQVAPDARENVAPNNVDN NGVNTGSGGTMLHSDGSSMNNDGMTKDEEHKNTMCKDGRCPDINKKVQTGDGINNDVDTK TDGTTQ >gi|296493137|gb|ADTK01000364.1| GENE 30 31849 - 32901 1109 350 aa, chain + ## HITS:1 COG:ECs0782 KEGG:ns NR:ns ## COG: ECs0782 COG0722 # Protein_GI_number: 15830036 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Escherichia coli O157:H7 # 1 350 1 350 350 716 99.0 0 MNYQNDDLRIKEIKELLPPVALLEKFPATENAANTVAHARKAIHKILKGNDDRLLVVIGP CSIHDPVAAKEYATRLLALREELKDELEIVMRVYFEKPRTTVGWKGLINDPHMDNSFQIN DGLRIARKLLLDINDSGLPAAGEFLDMITPQYLADLMSWGAIGARTTESQVHRELASGLS CPVGFKNGTDGTIKVAIDAINAAGAPHCFLSVTKWGHSAIVNTSGNGDCHIILRGGKEPN YSAKHVAEVKEGLNKAGLPAQVMIDFSHANSSKQFKKQMDVCADVCQQIAGGEKAIIGVM VESHLVEGNQSLDSGEPLAYGKSITDACIGWEDTDALLRQLANAVKARRG >gi|296493137|gb|ADTK01000364.1| GENE 31 33059 - 33811 1067 250 aa, chain - ## HITS:1 COG:ECs0783 KEGG:ns NR:ns ## COG: ECs0783 
COG0588 # Protein_GI_number: 15830037 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglycerate mutase 1 # Organism: Escherichia coli O157:H7 # 1 250 1 250 250 470 100.0 1e-133 MAVTKLVLVRHGESQWNKENRFTGWYDVDLSEKGVSEAKAAGKLLKEEGYSFDFAYTSVL KRAIHTLWNVLDELDQAWLPVEKSWKLNERHYGALQGLNKAETAEKYGDEQVKQWRRGFA VTPPELTKDDERYPGHDPRYAKLSEKELPLTESLALTIDRVIPYWNETILPRMKSGERVI IAAHGNSLRALVKYLDNMSEEEILELNIPTGVPLVYEFDENFKPLKRYYLGNADEIAAKA AAVANQGKAK >gi|296493137|gb|ADTK01000364.1| GENE 32 34016 - 35056 1149 346 aa, chain - ## HITS:1 COG:galM KEGG:ns NR:ns ## COG: galM COG2017 # Protein_GI_number: 16128724 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Escherichia coli K12 # 1 346 1 346 346 711 100.0 0 MLNETPALAPDGQPYRLLTLRNNAGMVVTLMDWGATLLSARIPLSDGSVREALLGCASPE CYQDQAAFLGASIGRYANRIANSRYTFDGETVTLSPSQGVNQLHGGPEGFDKRRWQIVNQ NDRQVLFALSSDDGDQGFPGNLGATVQYRLTDDNRISITYRATVDKPCPVNMTNHVYFNL DGEQSDVRNHKLQILADEYLPVDEGGIPHDGLKSVAGTSFDFRSAKIIASEFLADDDQRK VKGYDHAFLLQAKGDGKKVAAHVWSADEKLQLKVYTTAPALQFYSGNFLGGTPSRGTEPY ADWQGLALESEFLPDSPNHPEWPQPDCFLRPGEEYSSLTEYQFIAE >gi|296493137|gb|ADTK01000364.1| GENE 33 35050 - 36198 1141 382 aa, chain - ## HITS:1 COG:ECs0785 KEGG:ns NR:ns ## COG: ECs0785 COG0153 # Protein_GI_number: 15830039 # Func_class: G Carbohydrate transport and metabolism # Function: Galactokinase # Organism: Escherichia coli O157:H7 # 1 382 1 382 382 771 99.0 0 MSLKEKTQSLFANAFGYPATHTIQAPGRVNLIGEHTDYNDGFVLPCAIDYQTVISCAPCD DRKVRVMAADYENQLDEFSLDAPIVAHENYQWANYVRGVVKHLQLRNNSFGGVDMVISGN VPQGAGLSSSASLEVAVGTVLQQLYHLPLDGAQIALNGQEAENQFVGCNCGIMDQLISAL GKKDHALLIDCRSLGTKAVSMPKGVAVVIINSNFKRTLVGSEYNTRREQCETGARFFQQP ALRDVTIEEFNAVAHELDPIVAKRVRHILTENARTVEAASALEQGDLKRMGELMAESHAS MRDDFEITVPQIDTLVEIVKAVIGDKGGVRMTGGGFGGCIVALIPEELVPAVQQAVAEQY EAKTGIKETFYVCKPSQGAGQC >gi|296493137|gb|ADTK01000364.1| GENE 34 36202 - 37248 774 348 aa, chain - ## HITS:1 COG:ECs0786 KEGG:ns NR:ns ## COG: ECs0786 COG1085 # Protein_GI_number: 
15830040 # Func_class: C Energy production and conversion # Function: Galactose-1-phosphate uridylyltransferase # Organism: Escherichia coli O157:H7 # 1 348 1 348 348 706 99.0 0 MTQFNPVDHPHRRYNPLTGQWILVSPHRAKRPWQGAQETPAKQVLPAHDPDCFLCAGNVR VTGDKNPDYTGTYVFTNDFAALMSDTPDAPESNDPLMRCQSARGTSRVICFSPDHSKTLP ELSVAALTEIVKTWQEQTAELGKTYPWVQVFENKGAAMGCSNPHPHGQIWANSFLPNEAE REDRLQKEYFAEQKSPMLVDYVQRELADGSRTVVETEHWLAVVPYWAAWPFETLLLPKAH VLRITDLTDAQRSDLALALKKLTSRYDNLFQCSFPYSMGWHGAPFNGEENQHWQLHAHFY PPLLRSATVRKFMVGYEMLAETQRDLTAEQAAERLRAVSDIHFRESGV >gi|296493137|gb|ADTK01000364.1| GENE 35 37258 - 38274 1055 338 aa, chain - ## HITS:1 COG:galE KEGG:ns NR:ns ## COG: galE COG1087 # Protein_GI_number: 16128727 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Escherichia coli K12 # 1 338 1 338 338 706 100.0 0 MRVLVTGGSGYIGSHTCVQLLQNGHDVIILDNLCNSKRSVLPVIERLGGKHPTFVEGDIR NEALMTEILHDHAIDTVIHFAGLKAVGESVQKPLEYYDNNVNGTLRLISAMRAANVKNFI FSSSATVYGDQPKIPYVESFPTGTPQSPYGKSKLMVEQILTDLQKAQPDWSIALLRYFNP VGAHPSGDMGEDPQGIPNNLMPYIAQVAVGRRDSLAIFGNDYPTEDGTGVRDYIHVMDLA DGHVVAMEKLANKPGVHIYNLGAGVGNSVLDVVNAFSKACGKPVNYHFAPRREGDLPAYW ADASKADRELNWRVTRTLDEMAQDTWHWQSRHPQGYPD >gi|296493137|gb|ADTK01000364.1| GENE 36 38535 - 40007 177 490 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 276 465 21 199 311 72 28 4e-12 MSSLQILQGAFRLSDTKTLQLPQLTLNAGDSWAFVGSNGSGKSALARALAGELPLLKGER QSQFSHITRLSFEQLQKLVSDEWQRNNTDMLGPGEDDTGRTTAEIIQDEVKDAPRCMQLA QQFGITALLDRRFKYLSTGETRKTLLCQALMSEPDLLILDEPFDGLDVASRQQLAERLAS LHQSGITLVLVLNRFDEIPEFVQFAGVLADCTLAETGAKEELLQQALVAQLAHSEQLEGV QLPEPDEPSARHALPANEPRIVLNNGVVSYNDRPILNNLSWQVNPGEHWQIVGPNGAGKS TLLSLVTGDHPQGYSNDLTLFGRRRGSGETIWDIKKHIGYVSSSLHLDYRVSTTVRNVIL SGYFDSIGIYQAVSDRQQKLVQQWLDILGIDKRTADAPFHSLSWGQQRLALIVRALVKHP TLLILDEPLQGLDPLNRQLIRRFVDVLISEGETQLLFVSHHAEDAPACITHRLEFVPDGG LYRYVLTKIY >gi|296493137|gb|ADTK01000364.1| GENE 37 40075 - 40863 797 262 aa, chain - 
## HITS:1 COG:ECs0789 KEGG:ns NR:ns ## COG: ECs0789 COG2005 # Protein_GI_number: 15830043 # Func_class: R General function prediction only # Function: N-terminal domain of molybdenum-binding protein # Organism: Escherichia coli O157:H7 # 1 262 1 262 262 449 99.0 1e-126 MQAEILLTLKLQQKLFADPRRISLLKHIALSGSISQGAKDAGISYKSAWDAINEMNQLSE HILVERATGGKGGGGAVLTRYGQRLIQLYDLLAQIQQKAFDVLSDDDALPLNSLLAAISR FSLQTSARNQWFGTITARDHDDVQQHVDVLLADGKTRLKVAITAQSGARLGLDEGKEVLI LLKAPWVGITQDEAVAQNADNQLPGIISHIERGAEQCEVIMSLPDGQTLCATVPVNEATS LQQGQNVTAYFNADSVIIATLC >gi|296493137|gb|ADTK01000364.1| GENE 38 40992 - 41141 131 49 aa, chain + ## HITS:1 COG:no KEGG:ECSE_0815 NR:ns ## KEGG: ECSE_0815 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 49 3 51 51 77 100.0 1e-13 MLELLKSLVFAVIMVPVVMAIILGLIYGLGEVFNIFSGVGKKDQPGQNH >gi|296493137|gb|ADTK01000364.1| GENE 39 41308 - 42081 855 257 aa, chain + ## HITS:1 COG:ECs0791 KEGG:ns NR:ns ## COG: ECs0791 COG0725 # Protein_GI_number: 15830045 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 1 257 1 257 257 463 99.0 1e-130 MARKWLNLFAGAALSFAVAGNALADEGKITVFAAASLTNAMQDIATQYKKEKGVDVVSSF ASSSTLARQIEAGAPADLFISADQKWMDYAVDKKAIDTATRQTLLGNSLVVVAPKASEQK DFTIDSKTNWTSLLNGGRLAVGDPEHVPAGIYAKEALQKLGAWDTLSPKLAPAEDVRGAL ALVERNEAPLGIVYGSDAVASKGVKVVAIFPEDSHKKVEYPVAVVEGHNNATVKAFYDYL KGPQAAEIFKRYGFTTK >gi|296493137|gb|ADTK01000364.1| GENE 40 42081 - 42770 670 229 aa, chain + ## HITS:1 COG:ECs0792 KEGG:ns NR:ns ## COG: ECs0792 COG4149 # Protein_GI_number: 15830046 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, permease component # Organism: Escherichia coli O157:H7 # 1 229 1 229 229 370 100.0 1e-102 MILTDPEWQAVLLSLKVSSLAVLFSLPFGIFFAWLLVRCTFPGKALLDSVLHLPLVLPPV VVGYLLLVSMGRRGFIGERLYDWFGITFAFSWRGAVLAAAVMSFPLMVRAIRLALEGVDV 
KLEQAARTLGAGRWRVFFTITLPLTLPGIIVGTVLAFARSLGEFGATITFVSNIPGETRT IPSAMYTLIQTPGGESGAARLCIISIALAMISLLISEWLARISRERAGR >gi|296493137|gb|ADTK01000364.1| GENE 41 42773 - 43831 1204 352 aa, chain + ## HITS:1 COG:modC KEGG:ns NR:ns ## COG: modC COG4148 # Protein_GI_number: 16128733 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, ATPase component # Organism: Escherichia coli K12 # 1 352 1 352 352 669 100.0 0 MLELNFSQTLGNHCLTINETLPANGITAIFGVSGAGKTSLINAISGLTRPQKGRIVLNGR VLNDAEKGICLTPEKRRVGYVFQDARLFPHYKVRGNLRYGMSKSMVDQFDKLVALLGIEP LLDRLPGSLSGGEKQRVAIGRALLTAPELLLLDEPLASLDIPRKRELLPYLQRLTREINI PMLYVSHSLDEILHLADRVMVLENGQVKAFGALEEVWGSSVMNPWLPKEQQSSILKVTVL EHHPHYAMTALALGDQHLWVNKLDEPLQAALRIRIQASDVSLVLQPPQQTSIRNVLRAKV VNSYDDNGQVEVELEVGGKTLWARISPWARDELAIKPGLWLYAQIKSVSITA >gi|296493137|gb|ADTK01000364.1| GENE 42 43832 - 44650 958 272 aa, chain - ## HITS:1 COG:ECs0794 KEGG:ns NR:ns ## COG: ECs0794 COG0561 # Protein_GI_number: 15830048 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Escherichia coli O157:H7 # 1 272 1 272 272 554 99.0 1e-158 MTTRVIALDLDGTLLTPKKTLLPSSIEALARAREAGYQLIIVTGRHHVAIHPFYQALALD TPAICCNGTYLYDYHAKTVLEADPMPVNKALQLIEMLNEHHIHGLMYVDDAMVYEHPTGH VIRTSNWAQTLPPEQRPTFTQVASLAETAQQVNAVWKFALTHDDLPQLQHFGKHVEHELG LECEWSWHDQVDIARGGNSKGKRLTKWVEAQGWSMENVVAFGDNFNDISMLEAAGTGVAM GNADDAVKARANIVIGDNTTDSIAQFIYSHLI >gi|296493137|gb|ADTK01000364.1| GENE 43 44805 - 45800 923 331 aa, chain + ## HITS:1 COG:ybhE KEGG:ns NR:ns ## COG: ybhE COG2706 # Protein_GI_number: 16128735 # Func_class: G Carbohydrate transport and metabolism # Function: 3-carboxymuconate cyclase # Organism: Escherichia coli K12 # 1 331 1 331 331 676 99.0 0 MKQTVYIASPESQQIHVWNLNHEGALTLTQVVDVPGQVQPMVVSPDKRYLYVGVRPEFRV LAYSIAPDDGALTFAAESALPGSPTHISTDHQGQFVFVGSYNAGNVSVTRLEDGLPVGVV DVVEGLDGCHSANISPDNRTLWVPALKQDRICLFTVSDDGHLVAQDPAEVTTVEGAGPRH MVFHPNEQYAYCVNELNSSVDVWELKDPHGNIECVQTLDMMPENFSDTRWAADIHITPDG 
RHLYACDRTASLITVFSVSEDGSVLSKEGFQPTETQPRGFNVDHSGKYLIAAGQKSHHIS VYEIVGEQGLLHEKGRYAVGQGPMWVVVNAH >gi|296493137|gb|ADTK01000364.1| GENE 44 45841 - 46674 552 277 aa, chain - ## HITS:1 COG:ybhD KEGG:ns NR:ns ## COG: ybhD COG0583 # Protein_GI_number: 16128736 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 277 62 338 338 575 99.0 1e-164 MEEDLHIQLFERTTRKVTLTKAGKRLLPEARELIKKFDETLFNIRDMNAYHRGMVTLACI PTAVFYFLPLAIGKFNELYPNIKVRILEQGTNNCMESVLCNESDFGINMNNVTNSSIDFT PLVNEPFVLACRRDHPLAKKQLVEWQELVGYKMIGVRSSSGNRLLIEQQLADKPWKLDWF YEVRHLSTSLGLVEAGLGISALPGLAMPHAPYSSIIGIPLVEPVIRRTLGIIRRKDAVLS PAAERFFALLINLWTDDKDNLWTNIVERQRHALQEIG >gi|296493137|gb|ADTK01000364.1| GENE 45 46978 - 48030 725 350 aa, chain + ## HITS:1 COG:ybhH KEGG:ns NR:ns ## COG: ybhH COG2828 # Protein_GI_number: 16128737 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 350 1 350 350 678 100.0 0 MKKIPCVMMRGGTSRGAFLLAEHLPEDQTQRDKILMAIMGSGNDLEIDGIGGGNPLTSKV AIISRSSDPRADVDYLFAQVIVHEQRVDTTPNCGNMLSGVGAFAIENGLIAATSPVTRVR IRNVNTGTFIEADVQTPNGVVEYEGSARIDGVPGTAAPVALTFLNAAGTKTGKVFPTDNQ IDYFDDVPVTCIDMAMPVVIIPAEYLGKTGYELPAELDADKALLARIESIRLQAGKAMGL GDVSNMVIPKPVLISPAQKGGAINVRYFMPHSCHRALAITGAIAISSSCALEGTVTRQIV PSVGYGNINIEHPSGALDVHLSNEGQDATTLRASVIRTTRKIFSGEVYLP >gi|296493137|gb|ADTK01000364.1| GENE 46 48106 - 49539 1653 477 aa, chain + ## HITS:1 COG:ybhI KEGG:ns NR:ns ## COG: ybhI COG0471 # Protein_GI_number: 16128738 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Escherichia coli K12 # 1 477 1 477 477 798 100.0 0 MNKKSLWKLILILAIPCIIGFMPAPAGLSELAWVLFGIYLAAIVGLVIKPFPEPVVLLIA VAASMVVVGNLSDGAFKTTAVLSGYSSGTTWLVFSAFTLSAAFVTTGLGKRIAYLLIGKI GNTTLGLGYVTVFLDLVLAPATPSNTARAGGIVLPIINSVAVALGSEPEKSPRRVGHYLM MSIYMVTKTTSYMFFTAMAGNILALKMINDILHLQISWGGWALAAGLPGIIMLLVTPLVI YTMYPPEIKKVDNKTIAKAGLAELGPMKIREKMLLGVFVLALLGWIFSKSLGVDESTVAI 
VVMATMLLLGIVTWEDVVKNKGGWNTLIWYGGIIGLSSLLSKVKFFEWLAEVFKNNLAFD GHGNVAFFVIIFLSIIVRYFFASGSAYIVAMLPVFAMLANVSGAPLMLTALALLFSNSYG GMVTHYGGAAGPVIFGVGYNDIKSWWLVGAVLTILTFLVHITLGVWWWNMLIGWNML >gi|296493137|gb|ADTK01000364.1| GENE 47 49722 - 51983 2156 753 aa, chain + ## HITS:1 COG:ybhJ KEGG:ns NR:ns ## COG: ybhJ COG1048 # Protein_GI_number: 16128739 # Func_class: C Energy production and conversion # Function: Aconitase A # Organism: Escherichia coli K12 # 1 753 9 761 761 1550 99.0 0 MIKLSEKGVFLASNNEIIAEEHFTGEIKKEEAKKGTIAWSILSSHNTSGNMDKLKIKFDS LASHDITFVGIVQTAKASGMERFPLPYVLTNCHNSLCAVGGTINGDDHVFGLSAAQRYGG IFVPPHIAVIHQYMREMMAGGGKMILGSDSHTRYGALGTMAVGEGGGELVKQLLNDTWDI DYPGVVAVHLTGKPAPYVGPQDVALAIIGAVFKNGYVKNKVMEFVGPGVSALSTDFRNSV DVMTTETTCLSSVWQTDEEVHNWLALHGRGQDYCQLNPQPMAYYDGCISVDLSAIKPMIA LPFHPSNVYKIDTLNQNLTDILREIEIESERVAHGKAKLSLLDKVENGRLKVQQGIIAGC SGGNYENVIAAANALRGQSCGNDTFSLAVYPSSQPVFMDLAKKGVVADLIGAGAIIRTAF CGPCFGAGDTPINNGLSIRHTTRNFPNREGSKPANGQMSAVALMDARSIAATAANGGYLT SASELDCWDNVPEYAFDVTPYKNRVYQGFVKGATQQPLIYGPNIKDWPELGALTDNIVLK VCSKILDEVTTTDELIPSGETSSYRSNPIGLAEFTLSRRDPGYVGRSKATAELENQRLAG NVSELTEVFARIKQIAGQEHIDPLQTEIGSMVYAVKPGDGSAREQAASCQRVIGGLANIA EEYATKRYRSNVINWGMLPLQMAEVPTFEVGDYIYIPGIKAALDNPGTTFKGYVIHEDAP VTEITLYMESLTAEEREIIKAGSLINFNKNRQM >gi|296493137|gb|ADTK01000364.1| GENE 48 52217 - 53500 1320 427 aa, chain - ## HITS:1 COG:ybhC KEGG:ns NR:ns ## COG: ybhC COG4677 # Protein_GI_number: 16128740 # Func_class: G Carbohydrate transport and metabolism # Function: Pectin methylesterase # Organism: Escherichia coli K12 # 1 427 1 427 427 823 100.0 0 MNTFSVSRLALALAFGVTLTACSSTPPDQRPSDQTAPGTSSRPILSAKEAQNFDAQHYFA SLTPGAAAWNPSPITLPAQPDFVVGPAGTQGVTHTTIQAAVDAAIIKRTNKRQYIAVMPG EYQGTVYVPAAPGGITLYGTGEKPIDVKIGLSLDGGMSPADWRHDVNPRGKYMPGKPAWY MYDSCQSKRSDSIGVLCSAVFWSQNNGLQLQNLTIENTLGDSVDAGNHPAVALRTDGDQV QINNVNILGRQNTFFVTNSGVQNRLETNRQPRTLVTNSYIEGDVDIVSGRGAVVFDNTEF RVVNSRTQQEAYVFAPATLSNIYYGFLAVNSRFNAFGDGVAQLGRSLDVDANTNGQVVIR DSAINEGFNTAKPWADAVISNRPFAGNTGSVDDNDEIQRNLNDTNYNRMWEYNNRGVGSK VVAEAKK 
>gi|296493137|gb|ADTK01000364.1| GENE 49 53635 - 54405 210 256 aa, chain - ## HITS:1 COG:Z0946 KEGG:ns NR:ns ## COG: Z0946 COG0582 # Protein_GI_number: 15800482 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Escherichia coli O157:H7 EDL933 # 1 256 32 287 287 485 99.0 1e-137 MSKIKAIRRGLPDAPLEDITTKEIAAMLNGYIDEGKAASAKLIRSTLSDAFREAIAEGHI TTNPVAATRAAKSEVRRSRLTADEYLKIYQAAESSPCWLRLAMELAVVTGQRVGDLCEMK WSDIVDGYLYVEQSKTGVKIAIPTTLHVDALGISMKETLDKCKKILGGETIIASTRREPL SSGTVSRYFMRARKASGLSFEGDPPTFHELRSLSARLYEKQISDKFAQHLLGHKSDTMAS QYRDDRGREWDKIEIK >gi|296493137|gb|ADTK01000364.1| GENE 50 54987 - 55610 248 207 aa, chain - ## HITS:1 COG:ECs1251 KEGG:ns NR:ns ## COG: ECs1251 COG3561 # Protein_GI_number: 15830505 # Func_class: K Transcription # Function: Phage anti-repressor protein # Organism: Escherichia coli O157:H7 # 1 207 1 209 209 306 78.0 2e-83 MTSQLIPVFNGTIANETALLCNARDLHAFLGVKKVFAAWITNRISEYKFIENQDYILLSN LGKQTSGRGGHNRKDYHLTLDTAKELAMVERNEKGRQIRRYFIECEKKLRSMQPAQQFTD EEIILLCYMQVQMENAQDICKLLYPIMKELNSSYASKLYDIAFETFYAVTKNRDVLLREA TRLDQTSAVFERARPMLKSLRARQFEF >gi|296493137|gb|ADTK01000364.1| GENE 51 55966 - 56250 208 94 aa, chain - ## HITS:1 COG:no KEGG:ECB_00728 NR:ns ## KEGG: ECB_00728 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 94 19 112 112 174 94.0 1e-42 MDSFAKYTIIDWIVFLQVLLIWFYMAYRSGQWIVSVACSKGWRWWDRKNKKALALDSFYE AFNLNSLQPGSVIVVTTQSGMTIQIHKPKEEGRG >gi|296493137|gb|ADTK01000364.1| GENE 52 56250 - 56816 405 188 aa, chain - ## HITS:1 COG:no KEGG:ECED1_2132 NR:ns ## KEGG: ECED1_2132 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ED1a # Pathway: not_defined # 90 187 121 215 217 172 84.0 6e-42 MAEFTKERIIEELKSSTQNASGMFEISEDTICALMSMLADKPSPAYVPDEIPEPDTARMF FTDAVVAIAKVQGWNACRAAMLQGKSEQPQNAQQNIPENIPGGNSPVIPDGWISCSERMP EMGERQCYVLAADFKNNYPPNIPNTQVGVYGDWFNDGNPTWDDGDGEDLYLKEVTHWMPL PEPPQEMK >gi|296493137|gb|ADTK01000364.1| GENE 53 56821 - 57357 312 178 aa, chain - ## HITS:1 
COG:no KEGG:ECO26_3387 NR:ns ## KEGG: ECO26_3387 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 127 1 127 222 215 88.0 4e-55 MSKIDYQALRAKAEKATCGEWSLEYGDGRFDGDDALIHREAAGYIPICRIEGAHPESGFD EDFQIEQQANAEFIAAANPATVLALLDERERNQQYIKRRDQENEDIALTVGKLRVELEAA EKRISELEAREVVLPSTQDVHPLGPQSAKIFCEFHRNIINRCADEIRKAGVNVSIKGK >gi|296493137|gb|ADTK01000364.1| GENE 54 57344 - 57598 231 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|168260993|ref|ZP_02682966.1| ## NR: gi|168260993|ref|ZP_02682966.1| hypothetical protein SeH_A0844 [Salmonella enterica subsp. enterica serovar Hadar str. RI_05P066] # 1 84 1 84 84 134 91.0 1e-30 MSNSARLQLGFSPLSKTIMLAKMRDVEGGRMRVGNDPGRDVTNEAAQLVWRLVMAEGGEI AWELDDGSRMVLKAEKQEATSEQD >gi|296493137|gb|ADTK01000364.1| GENE 55 57595 - 57828 128 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|300907031|ref|ZP_07124700.1| ## NR: gi|300907031|ref|ZP_07124700.1| conserved domain protein [Escherichia coli MS 84-1] # 1 77 1 77 77 139 100.0 5e-32 MSNQIKPARRIDADGNARDVDNAYEEGIYAAVIGLEVTDNPYSTDDPLNRKLWLKAFAGV KSGKIRSAAQLRKGGNQ >gi|296493137|gb|ADTK01000364.1| GENE 56 57825 - 58265 250 146 aa, chain - ## HITS:1 COG:no KEGG:E2348C_0984 NR:ns ## KEGG: E2348C_0984 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_0127 # Pathway: not_defined # 1 113 1 113 187 181 92.0 7e-45 MKQMSLIEMDGFLKGKCIPSDLKVNETNAEYLVRKFAEAEAKCAALAAENAGLKSGAMDE IKVINRGGQAYCVKDGVQVNPMYARGWNDYRAKSLQSDTPATDAFLAEVRAQGVEMLYAS RAAQWADELLAELNEFATQLREGGAA >gi|296493137|gb|ADTK01000364.1| GENE 57 58437 - 58733 179 98 aa, chain - ## HITS:1 COG:no KEGG:EFER_2084 NR:ns ## KEGG: EFER_2084 # Name: abc # Def: anti-RecBCD protein 2 # Organism: E.fergusonii # Pathway: not_defined # 1 98 8 105 105 177 98.0 8e-44 MPAPLYGADDPRRCSGNSVSEVLDKFRKNYNRIMSLPQETKEEKEFRHCIWLAEKEERER IYQTSIRPFRKATYTHFPEYIDPRLRNYRSRYGAISND >gi|296493137|gb|ADTK01000364.1| GENE 58 58757 - 59140 281 127 aa, chain - ## HITS:1 COG:no KEGG:EC55989_2646 NR:ns ## KEGG: EC55989_2646 # Name: 
not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 127 31 157 157 216 95.0 2e-55 MAHSITVRLNKPAREFQAGENIGFNIRAGVQYYDRQTKKKEWTNYSAVVFAKPGAQADYY RSVLVEGGIVEITGENIRVDVYQGQNGQSITLELLNAKIGFATSGNNQQQQSSNHQNHPE YDDSIPF >gi|296493137|gb|ADTK01000364.1| GENE 59 59140 - 59745 340 201 aa, chain - ## HITS:1 COG:no KEGG:EFER_2082 NR:ns ## KEGG: EFER_2082 # Name: erf # Def: DNA single-strand annealing protein; essential recombination function protein Erf # Organism: E.fergusonii # Pathway: not_defined # 1 201 11 211 211 358 98.0 6e-98 MSKEFYARLAAIQENLNAPKNQYNSFGKYKYRSCEDILEGVKPLLNGLFLSISDEVVLIG DRYYVKATATITDGENSHTATALAREEESKKGMDSAQVTGATSSYARKYCLNGLFGIDDA KDADTDEHKHQQNAAAKQSKPSPTPEQVLKAFTDAAMQKNTVEELKQAFAKAWKMLEGTP EQQKAQDVYNIRRDELEGAAA >gi|296493137|gb|ADTK01000364.1| GENE 60 59756 - 59926 110 56 aa, chain - ## HITS:1 COG:no KEGG:ECS88_2547 NR:ns ## KEGG: ECS88_2547 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 56 1 56 56 110 100.0 2e-23 MSLATTVKESKLQRRMYTQKALWYRHNGDREGMRVCLNLSRVEVLNQRYFLGPCPF >gi|296493137|gb|ADTK01000364.1| GENE 61 60002 - 60154 74 50 aa, chain - ## HITS:1 COG:no KEGG:SeSA_A0612 NR:ns ## KEGG: SeSA_A0612 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Schwarzengrund # Pathway: not_defined # 1 50 1 50 50 79 100.0 5e-14 MRNEIAINHQMLRAAQNKAVIARFIGDSKMWLEANKAMKSAINLPWYRRK >gi|296493137|gb|ADTK01000364.1| GENE 62 60295 - 61263 732 322 aa, chain - ## HITS:1 COG:no KEGG:EFER_2078 NR:ns ## KEGG: EFER_2078 # Name: not_defined # Def: conserved hypothetical protein from phage # Organism: E.fergusonii # Pathway: not_defined # 1 322 18 339 339 271 97.0 2e-71 MSEVTDLVVIEKSNAMTVFQSADQIEEILQKVEREVMSFVPDITTAKGRKEIASLAYKVA QTKTYLDGLGKDLVAELKEIPKLIDANRKTVRDRLDELKAKARQPLTDYEEEQARIKAEE EAKAAAEALAKQIESDHEIAILMDREFDRQREEARLKAEQEKREHEERLKREAEEKARAE AEAKAKAEIQAAARREAEAKAAAERAERERIEAEQRAQREAKEAAERAEREKQAAIEAEH RKAQEEAERIRREAEAKEQARIAEEKRIKEEEERRAKDKAHRKEVNNKILADLIKVGAPE 
DVAKNIITAIVKGEVFATKITY >gi|296493137|gb|ADTK01000364.1| GENE 63 61864 - 62055 101 63 aa, chain + ## HITS:1 COG:no KEGG:EFER_2076 NR:ns ## KEGG: EFER_2076 # Name: not_defined # Def: hypothetical protein # Organism: E.fergusonii # Pathway: not_defined # 1 63 43 105 105 109 93.0 4e-23 MILSYLTSNKWFAAKVNTLKLPVIYPTKFRIYLSSYTNHRHSILLCRILATGMNIYLLCK VGV >gi|296493137|gb|ADTK01000364.1| GENE 64 62077 - 62409 254 110 aa, chain - ## HITS:1 COG:no KEGG:SFV_0269 NR:ns ## KEGG: SFV_0269 # Name: not_defined # Def: putative bacteriophage protein # Organism: S.flexneri_8401 # Pathway: not_defined # 1 106 1 105 107 157 93.0 1e-37 MDAQTRRRERRAEKQAQWKAANPLLVGVSAKPVNRPILSLNRKPKSRVESALNPIYLTVL AEYHEQIESNLQRIERKNQRTWYSKPRSEMGVTCSGRQKQRGKSIPAYYD Prediction of potential genes in microbial genomes Time: Mon May 16 16:12:49 2011 Seq name: gi|296493136|gb|ADTK01000365.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1165.3, whole genome shotgun sequence Length of sequence - 3628 bp Number of predicted genes - 8, with homology - 7 Number of transcription units - 6, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 24 - 494 304 ## gi|300825478|ref|ZP_07105545.1| conserved hypothetical protein - Prom 594 - 653 2.1 - Term 496 - 531 -0.3 2 2 Tu 1 . - CDS 662 - 928 111 ## gi|300825479|ref|ZP_07105546.1| conserved domain protein + Prom 885 - 944 6.2 3 3 Tu 1 . + CDS 1140 - 1340 78 ## - Term 1116 - 1158 3.1 4 4 Tu 1 . - CDS 1354 - 2067 425 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) - Prom 2114 - 2173 3.7 + Prom 1981 - 2040 4.0 5 5 Tu 1 . + CDS 2168 - 2368 123 ## ECH74115_0294 hypothetical protein + Term 2426 - 2472 -0.5 + Prom 2402 - 2461 1.5 6 6 Op 1 . + CDS 2487 - 2780 153 ## ECS88_2541 regulatory protein CII 7 6 Op 2 . + CDS 2803 - 3075 99 ## ECS88_2540 hypothetical protein 8 6 Op 3 . 
+ CDS 3138 - 3627 287 ## EFER_2068 hypothetical protein Predicted protein(s) >gi|296493136|gb|ADTK01000365.1| GENE 1 24 - 494 304 156 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|300825478|ref|ZP_07105545.1| ## NR: gi|300825478|ref|ZP_07105545.1| conserved hypothetical protein [Escherichia coli MS 119-7] # 1 156 1 156 156 245 100.0 8e-64 MDETGLRPSKEDVAMKSDTLEVSVSGMSREELDAKLSQNKSEVESIAAEMRRESADFKTY YTQQFSSIERGIAEIKGEIGGLKTGLTTTQWAMAVGLTLVTVILSGVMLASSWIISGNDK SPSVTSPAPIIIQVPTQQPLTSAPTNQSSSQQAPKQ >gi|296493136|gb|ADTK01000365.1| GENE 2 662 - 928 111 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|300825479|ref|ZP_07105546.1| ## NR: gi|300825479|ref|ZP_07105546.1| conserved domain protein [Escherichia coli MS 119-7] # 1 88 48 135 135 162 100.0 6e-39 MSDNTIKIIPQHMTATSVLITPDRAETIITFYRHEFEHHMQSDEQGKNSFQVKVELTPNM SVSMSPDQAVALVKSLQVALRDNGLWKD >gi|296493136|gb|ADTK01000365.1| GENE 3 1140 - 1340 78 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MALTTHDAINTPPTMKLAWFFLVAPRVNRTTERAKKGIAIMLNVFMLTSTYFDSPSPLIR RPMCLL >gi|296493136|gb|ADTK01000365.1| GENE 4 1354 - 2067 425 237 aa, chain - ## HITS:1 COG:ECs0274 KEGG:ns NR:ns ## COG: ECs0274 COG1974 # Protein_GI_number: 15829528 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Escherichia coli O157:H7 # 1 237 1 237 237 459 98.0 1e-129 MSAKKKPLTQEQLEDARRLKAIYEKKKNELGLSQESVADKMGMGQSGVGALFNGINALNA YNAALLAKILNVSVEEFSPSIAREIYEMYEAVSMQPSLRSEYEYPVFSHVQAGMFSPELR TFTKGDAERWVSTTKKASDSAFWLEVEGNSMTAPTGYKPSFPDGMLILVDPEQAVEPGDF CIARLGGDEFTFKKLIRDSGQVFLQPLNPQYPMIPCNESCSVVGKVIASQWPEETFG >gi|296493136|gb|ADTK01000365.1| GENE 5 2168 - 2368 123 66 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_0294 NR:ns ## KEGG: ECH74115_0294 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 66 1 66 66 117 96.0 2e-25 MEQRITLKDYAMRFGQTKTAKDLGVYQSAISKAIHEGRKIFLTINADGSVYAEEVKPFPS NKKTTA 
>gi|296493136|gb|ADTK01000365.1| GENE 6 2487 - 2780 153 97 aa, chain + ## HITS:1 COG:no KEGG:ECS88_2541 NR:ns ## KEGG: ECS88_2541 # Name: cII # Def: regulatory protein CII # Organism: E.coli_S88 # Pathway: not_defined # 1 97 30 126 126 169 100.0 3e-41 MVRANKRNEALRIESALLNKIAMLGTEKTAEAVGVDKSQISRWKRDWIPKFSMLLAVLEW GVVDDDMARLARQVASILTNKKRPAATERSDQIQMEF >gi|296493136|gb|ADTK01000365.1| GENE 7 2803 - 3075 99 90 aa, chain + ## HITS:1 COG:no KEGG:ECS88_2540 NR:ns ## KEGG: ECS88_2540 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 90 92 181 181 172 100.0 4e-42 MRNKGFNPPDTHKEAKRLRFLRSIDERTQISFVKVARTELLKAEARALLPSLPKEEGYTF IPNAFLEKLLKEDISVSQFNDVLKVFRQGR >gi|296493136|gb|ADTK01000365.1| GENE 8 3138 - 3627 287 163 aa, chain + ## HITS:1 COG:no KEGG:EFER_2068 NR:ns ## KEGG: EFER_2068 # Name: not_defined # Def: hypothetical protein # Organism: E.fergusonii # Pathway: not_defined # 1 163 21 183 315 291 96.0 5e-78 MENQKTGYIPLYRSILKQSWAKDVYLRTLWENLLLNAARKPYKANFKGHEWHLQPGQLVV TAADLGLQLCDRHGKPASRDQVERMLQVFVKEGMITIDGEKQKGRVITITNYHEYAQKMD DSPAHEAAQTTAHEAAHDEASNGAAFRVHAAHESAHEAAQTTA Prediction of potential genes in microbial genomes Time: Mon May 16 16:13:16 2011 Seq name: gi|296493135|gb|ADTK01000366.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1165.4, whole genome shotgun sequence Length of sequence - 12094 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 5, operones - 5 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 312 96 ## SeD_A0624 DNA replication protein GP18 2 1 Op 2 . + CDS 309 - 1685 627 ## COG0305 Replicative DNA helicase + Term 1712 - 1746 4.1 3 2 Op 1 . + CDS 1759 - 2199 158 ## ECUMN_1402 recombination protein NinB from phage origin 4 2 Op 2 . + CDS 2196 - 2723 166 ## ECUMN_1403 putative DNA N-6-adenine-methyltransferase from phage origin + Prom 2970 - 3029 2.5 5 3 Op 1 . + CDS 3260 - 3469 75 ## ECs2978 hypothetical protein 6 3 Op 2 . 
+ CDS 3462 - 4067 143 ## ECO103_5195 putative recombination endonuclease 7 4 Op 1 . + CDS 4248 - 4913 488 ## COG0639 Diadenosine tetraphosphatase and related serine/threonine protein phosphatases 8 4 Op 2 . + CDS 4910 - 5533 241 ## LF82_p120 antiterminator of prophage CP-933Y + Term 5630 - 5684 9.1 + Prom 5626 - 5685 5.2 9 5 Op 1 . + CDS 5902 - 6219 236 ## ECS88_2525 lysis protein S and lysis inhibitor (holin protein) 10 5 Op 2 . + CDS 6203 - 6679 418 ## COG4678 Muramidase (phage lambda lysozyme) 11 5 Op 3 . + CDS 6676 - 7143 278 ## UTI89_C5113 putative endopeptidase protein Rz of prophage CP-933K (EC:3.4.-.-) 12 5 Op 4 . + CDS 7201 - 7977 383 ## S0700 putative bacteriophage protein 13 5 Op 5 1/0.000 + CDS 7928 - 9328 838 ## COG1783 Phage terminase large subunit + Prom 9474 - 9533 2.9 14 5 Op 6 2/0.000 + CDS 9566 - 11017 968 ## COG3567 Uncharacterized protein conserved in bacteria 15 5 Op 7 . + CDS 10869 - 11621 316 ## COG2369 Uncharacterized protein, homolog of phage Mu protein gp30 16 5 Op 8 . 
+ CDS 11660 - 12082 119 ## gi|300907071|ref|ZP_07124738.1| hypothetical protein HMPREF9536_05029 Predicted protein(s) >gi|296493135|gb|ADTK01000366.1| GENE 1 1 - 312 96 103 aa, chain + ## HITS:1 COG:no KEGG:SeD_A0624 NR:ns ## KEGG: SeD_A0624 # Name: not_defined # Def: DNA replication protein GP18 # Organism: S.enterica_Dublin # Pathway: not_defined # 1 103 175 277 277 213 98.0 2e-54 RRKAERIDYESFLNAYNTEVGDRLPHAVAVNEKRKRRLKKIIPQLKTPNVDGFRAYVRAF VHQAKPFYFGDNDTGWTADFDYLLREDSLTGVREGKFADRGIA >gi|296493135|gb|ADTK01000366.1| GENE 2 309 - 1685 627 458 aa, chain + ## HITS:1 COG:PM0412 KEGG:ns NR:ns ## COG: PM0412 COG0305 # Protein_GI_number: 15602277 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Pasteurella multocida # 4 431 24 455 467 222 36.0 1e-57 MKQDIEASVIGGLLIGGLTPTASDVLATLEPEAFSIPLYRKAFEVIRKQARNRNLIDALM VAEACGEEHFTSILMTSKNCPSAANLKGYAGMVADNYHRRLVLEIMDEMREPIQSGTIDT SSQAMDELVKRLSAIRKPRDEVKPVRLGEIITDYTDTLDRRLRNGEESDTLKTGIEELDA ITGGMNAEDLVIIAARPGMGKTELALKIAEGVASRVIPGSDVRRGVLIFSMEMSALQIAE RSIANAGRMSVSVLRNPASMDDEGWARVANGMSQLADLDVWVVDASRLSVEEIRSIAERH KQENPNLSLIMADYLGLIEKPKADRNDLAIAHISGSLKALAKDLKTPVISLSQLSRDVEK RPNKRPTNADLRDSGSIEQDADSIIMLYREAVYDENSSAAPFAEIIVTKNRFGSLGTVYQ RFCNGHFVACDQDEARQICTASNAPAARGRRYAQGADV >gi|296493135|gb|ADTK01000366.1| GENE 3 1759 - 2199 158 146 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_1402 NR:ns ## KEGG: ECUMN_1402 # Name: not_defined # Def: recombination protein NinB from phage origin # Organism: E.coli_UMN026 # Pathway: not_defined # 1 146 7 152 152 277 99.0 1e-73 MKKLTFEIRSPAHQQNAIHAVQQILPDPTKPIVVTIQERNRSLDQNRKLWACLGDVSRQV KWHGRWLDAESWKCVFTAALKQQDVVPNLAGNGFVVIGQSTSRMRVGEFAELLELIQAFG TERGVKWSDEARLALEWKARWGDRAA >gi|296493135|gb|ADTK01000366.1| GENE 4 2196 - 2723 166 175 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_1403 NR:ns ## KEGG: ECUMN_1403 # Name: not_defined # Def: putative DNA N-6-adenine-methyltransferase from phage origin # Organism: E.coli_UMN026 # Pathway: not_defined # 1 173 21 193 195 347 
100.0 9e-95 MTIKSNTPAHDKDCWQTPLWLFDALDIEFGFWLDSAASDKNALCAHWLTEADDALNSEWV SHGAIWNNPPYSNIRPWVEKAAEQCIQQRQTVVMLVPEDMSVGWFSKALESVDEVRIITD GRINFIEPSTGLEKKGNSKGSMLLIWRPFISPRRMFTTVSKAALMAIGQGVRRYE >gi|296493135|gb|ADTK01000366.1| GENE 5 3260 - 3469 75 69 aa, chain + ## HITS:1 COG:no KEGG:ECs2978 NR:ns ## KEGG: ECs2978 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157J # Pathway: not_defined # 1 69 1 69 69 110 97.0 1e-23 MLTSDFLMVKAMLSPSQSLQYQKESVERALTCANCGQKLHVLEVHVCEHCCAELMSDPNS SMHEEEDYG >gi|296493135|gb|ADTK01000366.1| GENE 6 3462 - 4067 143 201 aa, chain + ## HITS:1 COG:no KEGG:ECO103_5195 NR:ns ## KEGG: ECO103_5195 # Name: not_defined # Def: putative recombination endonuclease # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 201 1 201 201 323 94.0 2e-87 MAKPARRKCKICKEWFHPAFSNQWWCCPEHGTQLALKLRSKQRKKAEKAAEKKRRREEQK QKDKLKIRKLALKPRSYWIKQAQQAVNAFIRERDRDLPCISCGTLTSAQWDAGHYRTTAA APQLRFDERNIHKQCVVCNQHKSGNLVPYRVELINRIGQEAVDEIESNHNRHRWTVEECK AIKAEYQQKLKDLRNSRSEAA >gi|296493135|gb|ADTK01000366.1| GENE 7 4248 - 4913 488 221 aa, chain + ## HITS:1 COG:ECs0813 KEGG:ns NR:ns ## COG: ECs0813 COG0639 # Protein_GI_number: 15830067 # Func_class: T Signal transduction mechanisms # Function: Diadenosine tetraphosphatase and related serine/threonine protein phosphatases # Organism: Escherichia coli O157:H7 # 1 221 1 221 221 452 95.0 1e-127 MRYYERIDGSKYRNIWVVGDLHGCYTNLMSKLDTIGFDNKKDLLISVGDLVDRGAENVEC LELITFPWFRAVRGNHEQMMIDGLSERGNVNHWLLNGGGWVFNLDYDKEILAKALAHKAD ELPLIIELVSKGKKYVICHADYPCDEYEFGKPVDHQQVIWNRERIGNSQDGIVKEIKGAD TFIFGHTPAVKPLKFANQMYIDTGAVFCGNLTLIQVQGEGA >gi|296493135|gb|ADTK01000366.1| GENE 8 4910 - 5533 241 207 aa, chain + ## HITS:1 COG:no KEGG:LF82_p120 NR:ns ## KEGG: LF82_p120 # Name: not_defined # Def: antiterminator of prophage CP-933Y # Organism: E.coli_LF82 # Pathway: not_defined # 1 207 1 207 207 402 100.0 1e-111 MRLESVAKFHSPKSPMMSDSPRATASDSLSGTDVMAAMGMAQSQAGFGMAAFCGKHELSQ NDKQKAINYLMQFAHKVSGKYRGVAKLEGNTKAKVLQVLATFAYADYCRSAATPGARCRD 
CHGTGRAVDIAKTEQWGRVVEKECGRCKGVGYSRMPASAAYRAVTMLIPNLTQPTWSRTV KPLYDALVVQCHKEESIADNILNAVTR >gi|296493135|gb|ADTK01000366.1| GENE 9 5902 - 6219 236 105 aa, chain + ## HITS:1 COG:no KEGG:ECS88_2525 NR:ns ## KEGG: ECS88_2525 # Name: not_defined # Def: lysis protein S and lysis inhibitor (holin protein) # Organism: E.coli_S88 # Pathway: not_defined # 1 105 3 107 107 188 100.0 5e-47 MPEKHDLLAAILAAKEQGIGAILAFAMAYLRGRYNGGAFTKTVIDATMCAIIAWFIRDLL DFAGLSSNLAYITSVFIGYIGTDSIGSLIKRFAAKKAGVEDGGNQ >gi|296493135|gb|ADTK01000366.1| GENE 10 6203 - 6679 418 158 aa, chain + ## HITS:1 COG:ECs1622 KEGG:ns NR:ns ## COG: ECs1622 COG4678 # Protein_GI_number: 15830876 # Func_class: G Carbohydrate transport and metabolism # Function: Muramidase (phage lambda lysozyme) # Organism: Escherichia coli O157:H7 # 1 158 1 158 158 303 94.0 1e-82 MVEINNQRKAFLDMLAWSEGTDNGRQKTRNHGYDVIVGGELFTDYSDHPRKLVTLNPKLK STAAGRYQLLSRWWDAYRKQLGLKDFSPKSQDAVALQQIKERGALPMIDRGDIRQAIDRC SNIWASLPGAGYGQFEHKADSLIAKFKEVGGTVREIEV >gi|296493135|gb|ADTK01000366.1| GENE 11 6676 - 7143 278 155 aa, chain + ## HITS:1 COG:no KEGG:UTI89_C5113 NR:ns ## KEGG: UTI89_C5113 # Name: not_defined # Def: putative endopeptidase protein Rz of prophage CP-933K (EC:3.4.-.-) # Organism: E.coli_UTI89 # Pathway: Lysine degradation [PATH:eci00310]; Biotin metabolism [PATH:eci00780]; Metabolic pathways [PATH:eci01100] # 1 155 1 155 155 243 87.0 2e-63 MSRVTAIISALVICIIVCLSWAVNHYRDNAIAYKEQRDKKVSELKQATTTITDMQQRQRD ADALDAKYTKELADAKAENDALQRKLDNGGRVLVKGKCPVSAATQTTGAASMGNDATVEL SAVAGRNVLGIRSGILSDQTALRALQEYITTQCLK >gi|296493135|gb|ADTK01000366.1| GENE 12 7201 - 7977 383 258 aa, chain + ## HITS:1 COG:no KEGG:S0700 NR:ns ## KEGG: S0700 # Name: not_defined # Def: putative bacteriophage protein # Organism: S.flexneri_2457T # Pathway: not_defined # 1 256 1 252 254 400 89.0 1e-110 MAKPDWEAIESAYRAGVLSLRDIGDKYGVTEGAIRKRAKKFDWVRKASTQVRKNGTQSGT QKSKVRTSEKPASAGRTQKSTQPKAEPPPDTKPIRGVRTDPPTNPFQPGNQQALKHGGYA RRLLLKDEVIEDAKALTLEDELFRLRANNLVAAENIGRWLTKLEDAEGDQERKVLMENIS 
AAEKAMMRNTVRIESIVGTLATVGKIFADTDYRKAATDKVSLEADRLRRDAGIDDGNGER DLNDFYSDIQTDAESGPA >gi|296493135|gb|ADTK01000366.1| GENE 13 7928 - 9328 838 466 aa, chain + ## HITS:1 COG:HI1410 KEGG:ns NR:ns ## COG: HI1410 COG1783 # Protein_GI_number: 16273317 # Func_class: R General function prediction only # Function: Phage terminase large subunit # Organism: Haemophilus influenzae # 59 453 3 386 394 332 45.0 1e-90 MTSTLTSKPTLNPVLRSFWTTQARNKVLYGGRSSSKSWDAAGIAIFLSNKYSLRFCCARQ IQNKIEESVYTLLKIQIDRFGLRHRFRILNNKIINRVTGSEFVFYGLWRNIEEIKSLEGI SVLWLEEAHALTEYQWKILEPTIRKEGSECWFIFNPGLVTDFVWRNFVVDPPEDTLIRKI NYDENPFLSDTMLKVIEAAKRRDPDGFKHVYEGVPESDDDAAIIKLSWIEAAVDAHKVLN FEPSGRKRIGFDVADSGADKCANVYRHGSVVYWADEWKAKEDELLKSCQRTYQAALERDA DIVYDSIGVGASAGAKFAEINEDRKRENMNASRINYQRFNAGAGVNEPDYEYIGIPNKDF FANLKAQAWWLVADRFRNTFNAVKNGEQYPVDELISIDSSCPLLEKLKLELTTPHRDFDK NGRVMVESKKDLAKRDVPSPNVADAFIMAFAPTDTAMDIWEALGNS >gi|296493135|gb|ADTK01000366.1| GENE 14 9566 - 11017 968 483 aa, chain + ## HITS:1 COG:XF1571 KEGG:ns NR:ns ## COG: XF1571 COG3567 # Protein_GI_number: 15838172 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 12 459 29 459 467 194 31.0 3e-49 MAKKTGRVATADSYDNFVARVGMQQPNQHAASTYRANYTSRNRLLIEWAYRSSWIIGAAV DSKADDMTKKGVRITSEIDPKRRGILESRFDELQLWDCINETLKWSRLYGGAVALILIEG QAPLTPLVLDKVGKGSFKGLAVLDRWMINPQLTRRIKALGPNLGKPEFYDIVTTAQGLPA WTVHHSRLIRMDGVKLPYQQKITENEWGMSIVERIFDRLTSYDSTSVGAAQLAYKAHLRT AKIKKLREIIAMGGKPFEALVKQMELVRQYQTNEGMSLFDSEDEFETHSYSFAGLSDLLS EFKEDIAGAVGIPLVRLFRQSPKGFSTGDADLANYYDDVGTLQERDLRPHIRLLFDVLHR SEFGEPLPQDFTFEFNPLWQMSDTDRSTVATNTTTALATAVRDLGMSPAAALTDLRELSD VTGIGASISDEDIQNAAKQWQETESETSPPPPIGGPVSEKPTGDSRPDKSNRHGFLRWFT GKR >gi|296493135|gb|ADTK01000366.1| GENE 15 10869 - 11621 316 250 aa, chain + ## HITS:1 COG:XF1574 KEGG:ns NR:ns ## COG: XF1574 COG2369 # Protein_GI_number: 15838175 # Func_class: S Function unknown # Function: Uncharacterized protein, homolog of phage Mu protein gp30 # Organism: Xylella fastidiosa 9a5c # 3 
246 9 274 281 62 23.0 1e-09 MRRNSGRRLNLKPALRRRSEVQYQKSLLAIVDQINQIVTGSYDGSQASAESIAKSLVDYS GVIDDWAEMVGRKMFAQVEREEWNQWRSVSEEISAGLRDVISNTPVGMVAQDTVYRQIRY MKSLPLEAAGRVREIQERAIKAVIHGERPDQLYEMIMQSGDVAASRARMIARTEIGRATT ALTQARALSVGSEGYWWRIKGAGTRPSHRGMKDKFVRWDNPPTLDGMTGHAGCLPNCDCW PEVQIPEFRK >gi|296493135|gb|ADTK01000366.1| GENE 16 11660 - 12082 119 140 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|300907071|ref|ZP_07124738.1| ## NR: gi|300907071|ref|ZP_07124738.1| hypothetical protein HMPREF9536_05029 [Escherichia coli MS 84-1] # 1 140 1 140 140 258 100.0 9e-68 MKQNIHKLNGLEFTHERDFINGQWVYSWYFRPLEQSEWCPFSLPTGKTRKSDIENFLKNC EEATKFYLEWLRNASDVEGAERYLLSAKQAWERISSPDWGGRGSNPNKDARRVQQARETF ESAKVKLEKAKILRERLNSN Prediction of potential genes in microbial genomes Time: Mon May 16 16:14:00 2011 Seq name: gi|296493134|gb|ADTK01000367.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1165.5, whole genome shotgun sequence Length of sequence - 44606 bp Number of predicted genes - 51, with homology - 50 Number of transcription units - 13, operones - 8 average op.length - 5.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 34 - 1236 614 ## COG3566 Uncharacterized protein conserved in bacteria 2 1 Op 2 . + CDS 1240 - 1734 375 ## G2583_3117 putative bacteriophage protein 3 1 Op 3 . + CDS 1746 - 2687 931 ## COG4834 Uncharacterized protein conserved in bacteria 4 1 Op 4 . + CDS 2727 - 3008 261 ## gi|300907076|ref|ZP_07124742.1| conserved domain protein 5 1 Op 5 . + CDS 2977 - 3396 245 ## SNSL254_A1161 putative bacteriophage protein 6 1 Op 6 . + CDS 3393 - 3899 363 ## SPC_1043 putative bacteriophage protein 7 1 Op 7 . + CDS 3899 - 4285 384 ## E2348C_2504 hypothetical protein 8 1 Op 8 . + CDS 4278 - 4820 283 ## Ent638_0805 putative bacteriophage protein 9 1 Op 9 . + CDS 4824 - 5969 1148 ## ECO103_3104 hypothetical protein 10 1 Op 10 . + CDS 5979 - 6422 377 ## SC2598 putative bacteriophage protein 11 1 Op 11 . + CDS 6426 - 6875 476 ## SPAB_02224 hypothetical protein 12 1 Op 12 . 
+ CDS 6917 - 7069 212 ## gi|300907084|ref|ZP_07124750.1| hypothetical protein HMPREF9536_05041 13 1 Op 13 . + CDS 7059 - 8975 1210 ## ECO103_3101 hypothetical protein 14 1 Op 14 . + CDS 8975 - 9550 290 ## STY1062 putative bacteriophage protein 15 1 Op 15 . + CDS 9551 - 9853 223 ## t1878 putative bacteriophage protein 16 1 Op 16 . + CDS 9856 - 10923 509 ## ECO103_3098 hypothetical protein 17 1 Op 17 . + CDS 10920 - 11252 62 ## ECIAI1_2629 putative secreted protein from phage origin + Term 11343 - 11384 -0.8 + Prom 11595 - 11654 4.1 18 2 Op 1 . + CDS 11688 - 12419 409 ## t1873 putative bacteriophage protein 19 2 Op 2 . + CDS 12419 - 12772 336 ## SNSL254_A1175 putative bacteriophage protein 20 2 Op 3 . + CDS 12772 - 13968 661 ## COG3299 Uncharacterized homolog of phage Mu protein gp47 21 2 Op 4 . + CDS 13965 - 14738 384 ## ECO103_3091 putative bacteriophage protein 22 2 Op 5 . + CDS 14738 - 15619 318 ## Ctu_34110 hypothetical protein 23 2 Op 6 . + CDS 15619 - 15789 148 ## gi|300907096|ref|ZP_07124762.1| conserved domain protein + Term 15839 - 15864 -0.5 24 2 Op 7 . + CDS 15871 - 18282 359 ## Kvar_1820 hypothetical protein 25 2 Op 8 . + CDS 18358 - 19389 457 ## Ent638_1051 hypothetical protein 26 3 Op 1 4/0.500 - CDS 19795 - 20271 466 ## COG1881 Phospholipid-binding protein 27 3 Op 2 . - CDS 20330 - 21562 1256 ## COG0161 Adenosylmethionine-8-amino-7-oxononanoate aminotransferase - Prom 21653 - 21712 3.9 + Prom 21603 - 21662 3.2 28 4 Op 1 12/0.000 + CDS 21706 - 22746 1173 ## COG0502 Biotin synthase and related enzymes 29 4 Op 2 6/0.500 + CDS 22743 - 23897 974 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes 30 4 Op 3 9/0.000 + CDS 23884 - 24639 533 ## COG0500 SAM-dependent methyltransferases 31 4 Op 4 3/1.000 + CDS 24632 - 25309 476 ## COG0132 Dethiobiotin synthetase + Term 25466 - 25508 1.1 + Prom 25778 - 25837 6.9 32 5 Tu 1 . 
+ CDS 25888 - 27909 2228 ## COG0556 Helicase subunit of the DNA excision repair complex + Term 27947 - 27985 3.0 - Term 27935 - 27971 2.4 33 6 Tu 1 . - CDS 28101 - 29009 856 ## COG0391 Uncharacterized conserved protein - Prom 29210 - 29269 3.1 + Prom 29197 - 29256 4.2 34 7 Op 1 6/0.500 + CDS 29406 - 30395 732 ## COG2896 Molybdenum cofactor biosynthesis enzyme 35 7 Op 2 11/0.000 + CDS 30417 - 30929 602 ## COG0521 Molybdopterin biosynthesis enzymes 36 7 Op 3 11/0.000 + CDS 30932 - 31417 638 ## COG0315 Molybdenum cofactor biosynthesis enzyme 37 7 Op 4 21/0.000 + CDS 31410 - 31655 320 ## COG1977 Molybdopterin converting factor, small subunit 38 7 Op 5 5/0.500 + CDS 31657 - 32109 617 ## COG0314 Molybdopterin converting factor, large subunit + Prom 32161 - 32220 2.0 39 8 Tu 1 5/0.500 + CDS 32246 - 32950 751 ## COG0670 Integral membrane protein, interacts with FtsH + Term 32970 - 33000 1.0 + Prom 32984 - 33043 4.0 40 9 Tu 1 . + CDS 33155 - 33868 238 ## COG0670 Integral membrane protein, interacts with FtsH 41 10 Op 1 5/0.500 - CDS 33904 - 34860 1044 ## COG0392 Predicted integral membrane protein 42 10 Op 2 8/0.000 - CDS 34860 - 36101 1306 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes 43 10 Op 3 . - CDS 36098 - 36859 568 ## COG3568 Metal-dependent hydrolase - Prom 36966 - 37025 1.5 + Prom 36842 - 36901 3.0 44 11 Tu 1 . + CDS 36992 - 37402 445 ## G2583_1019 inner membrane protein YbhQ 45 12 Op 1 22/0.000 - CDS 37364 - 38470 1220 ## COG0842 ABC-type multidrug transport system, permease component 46 12 Op 2 45/0.000 - CDS 38481 - 39614 1384 ## COG0842 ABC-type multidrug transport system, permease component 47 12 Op 3 10/0.000 - CDS 39607 - 41343 343 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 48 12 Op 4 15/0.000 - CDS 41336 - 42331 1245 ## COG0845 Membrane-fusion protein 49 12 Op 5 . 
- CDS 42334 - 43005 701 ## COG1309 Transcriptional regulator - Prom 43034 - 43093 3.6 + Prom 42751 - 42810 1.8 50 13 Op 1 . + CDS 43044 - 43265 60 ## 51 13 Op 2 . + CDS 43234 - 44598 1473 ## COG0513 Superfamily II DNA and RNA helicases Predicted protein(s) >gi|296493134|gb|ADTK01000367.1| GENE 1 34 - 1236 614 400 aa, chain + ## HITS:1 COG:XF1575 KEGG:ns NR:ns ## COG: XF1575 COG3566 # Protein_GI_number: 15838176 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 29 181 27 176 397 73 35.0 6e-13 MKYFFKTRLGNTRFQLADGSVLFKDVPIARTGELEYDATERPELVPNDRGKVIVRRTPEE VFSERAMASFEGMAVTIGHPRDFDGQIIFVTPDNWRQLAHGHIQNVRRGTDDKTDLLLAD VIVKTPEALQAIDDGDDEVSCGYDADYEQISPGLAKQSAITANHLALVPNGRAGFRCAIG DSMPSTTKNWFTRLLKARKTGDAAEMASLIDNPPDDVTGDNDVSTSMTPGGVVINLAPQN PLPGPALPGTGDAEEEIPAWGKALIEAVAKLTPAATAPGTGDAEDEEEKKEEEGKVTGDA AYRADLIQPGIQLPEKAKPTAFKRQVLASADQSLVRSIVGDADISKLKKATVDMAFTAVS ELAKNRNTKTVDSLQTQTATTVKTIAGMNQAAQEFWSKRG >gi|296493134|gb|ADTK01000367.1| GENE 2 1240 - 1734 375 164 aa, chain + ## HITS:1 COG:no KEGG:G2583_3117 NR:ns ## KEGG: G2583_3117 # Name: not_defined # Def: putative bacteriophage protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 164 1 168 168 171 58.0 1e-41 MGNTFLYRMPAGIAGAISRPQDLTVEPQLLDSSNLFPAYGLGGKISSGKFVPIAASDTAS VLVGIYVRPYPTASQPDKVQQVGSGKNFTGDCLVRGYVTVNIGADASSVALHGPVYMRVA TPSASSPLGAFLAAADGSNTVQITNAYFNGPGDTSGNIELAFNI >gi|296493134|gb|ADTK01000367.1| GENE 3 1746 - 2687 931 313 aa, chain + ## HITS:1 COG:XF1577 KEGG:ns NR:ns ## COG: XF1577 COG4834 # Protein_GI_number: 15838178 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 15 313 35 327 327 79 25.0 9e-15 MPMTFDQATVDGTGAFLVHELERLDQTLNLPLVNFTWSRDIQLREDVSIADEISSFTNTT FAAAGTPNANGKNWLSKAATAMAGLNVDIAKTGFPLTLWGMELGWTVPELQAAAQVGRPI DTQKYDGMQLKWNMDTDEQVYIGDSGLNVKGLLNLTQVTPTNAAKTWATSTADEIRASIN AGLSAAWANSAYSMVPTDMLIPPEQFSLLASTIVSSAGNQSLLTYLETNTIAYHQNGRPL 
NIRPVKWAKGRGVSNSDRMMFYTNDKKYVRFPMVPLMSVPIQYRGLYQLVTYYGKLGAVE PVYPETLAYVDGI >gi|296493134|gb|ADTK01000367.1| GENE 4 2727 - 3008 261 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|300907076|ref|ZP_07124742.1| ## NR: gi|300907076|ref|ZP_07124742.1| conserved domain protein [Escherichia coli MS 84-1] # 1 93 1 93 93 165 100.0 9e-40 MKKIYVLSPFNFNDGKEQKHFPVGFHDVDDTVADHWFVKAHCSPDGEAPAVAEDPRIAEL EAKIAEKDARIAELEAQLPETTNNGKKSKSADA >gi|296493134|gb|ADTK01000367.1| GENE 5 2977 - 3396 245 139 aa, chain + ## HITS:1 COG:no KEGG:SNSL254_A1161 NR:ns ## KEGG: SNSL254_A1161 # Name: not_defined # Def: putative bacteriophage protein # Organism: S.enterica_Newport # Pathway: not_defined # 1 135 1 135 135 165 60.0 5e-40 MARNQSLPTPEQFRATFPQFADDTKYPTPMIQARLNLADAMLSESRFGVDIFPYIVGLYV AHYMYLYAADMRGVAVGTAGGINSGIQTAKSVDKVSASYDASATLDPNAGFWNNSRYGSE FWEYLMMFGAGAVQLGTPE >gi|296493134|gb|ADTK01000367.1| GENE 6 3393 - 3899 363 168 aa, chain + ## HITS:1 COG:no KEGG:SPC_1043 NR:ns ## KEGG: SPC_1043 # Name: not_defined # Def: putative bacteriophage protein # Organism: S.enterica_Paratyphi_C # Pathway: not_defined # 2 168 4 180 184 84 38.0 2e-15 MKSGLTIREDNYSSVLDALKQLSGTDVLVGIPAGPPRDDAPLSNAELGYLQSTGATVEID GETVTLPPRPFLDMGIEDSRDKTTERLKLAAQSALEGKADVASMHLEAAGQIARDASKAV IEAGDRLTPLSEKTIKKRREMKPPIPGDKPLRARGFLFRAIQYVVRKE >gi|296493134|gb|ADTK01000367.1| GENE 7 3899 - 4285 384 128 aa, chain + ## HITS:1 COG:no KEGG:E2348C_2504 NR:ns ## KEGG: E2348C_2504 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_0127 # Pathway: not_defined # 1 128 1 128 129 179 74.0 2e-44 MPFLDVTDVLLDPDFVDLSLVCYRQVQTVDEDNFPTNTVQAIPFSGVVTVDRSLEAKRMA AGQNINGAILIVTQFRLTQGQPGLDADIVTYRGRDYRVTFVDPYTAYGAGFVQAHCELLE FDGGTPIE >gi|296493134|gb|ADTK01000367.1| GENE 8 4278 - 4820 283 180 aa, chain + ## HITS:1 COG:no KEGG:Ent638_0805 NR:ns ## KEGG: Ent638_0805 # Name: not_defined # Def: putative bacteriophage protein # Organism: Enterobacter_638 # Pathway: not_defined # 9 180 4 176 176 218 60.0 8e-56 
MSNDSTTAGYLTPVGDSPPYDEDLERLISRWIRGVTGLAATLVYPRWTDPQKQIPKNGTT WCAFGITGIQEDFNPAYVQGEENTEQWSHETVSLILCFYGPQGLAMATRFRDGLLVSQNN DELNRSGLTFLQHGRILNLPELINNQWVRRYDISVDLRRKIIRQYGIQSLVDAPVQFFGD >gi|296493134|gb|ADTK01000367.1| GENE 9 4824 - 5969 1148 381 aa, chain + ## HITS:1 COG:no KEGG:ECO103_3104 NR:ns ## KEGG: ECO103_3104 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 381 1 381 381 559 74.0 1e-158 MAQGLPVSNVVNVDVIMSPVAATGRNFGALLILGTSTVIPVTERIRQYSAIEDIGDDFGA DSPEYEAATIFFSQSPKPTLVYIGRWAKTLAEGEAGAVETLLQAVNACLQYTNWYGLAIA DSADLVEADVISVAAAIEASSLSRILAVTTADVNVLVSGNTDNIGYKLKAAGYSRTFWQY SSSSKYAAISAFGRAFTVNFTGNNTTITLKFKTEPGVTHETLTTAQASAIDAINGNVYVY YANDTAIIQQGVMANGDFFDERHGLDWLQNYVQTNLYNLLYTSTTKIPQTDAGVTRLMTN VEASLDQAVNNGLVAPGVWNGGPIGQIESGDTLTKGYYVHADSVENQAQSDREARKSPVI QAAIKLAGAIHYADVQINVVR >gi|296493134|gb|ADTK01000367.1| GENE 10 5979 - 6422 377 147 aa, chain + ## HITS:1 COG:no KEGG:SC2598 NR:ns ## KEGG: SC2598 # Name: not_defined # Def: putative bacteriophage protein # Organism: S.enterica_Choleraesuis # Pathway: not_defined # 1 147 1 147 147 219 82.0 2e-56 MSGTYSFIDVSASLTGPTGSIDLGYGSANSEEGITVAMTEAKNTMTVGADGEVMHSLHAG KSGTITVTLLKTSPVNKKLSLMYNAQSLSSATWGNNVIVIRNKVSGDTTTARSCAFQKQP DHANAKVGNTVSWVFDCGKIDQLLGEF >gi|296493134|gb|ADTK01000367.1| GENE 11 6426 - 6875 476 149 aa, chain + ## HITS:1 COG:no KEGG:SPAB_02224 NR:ns ## KEGG: SPAB_02224 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Paratyphi_B # Pathway: not_defined # 1 149 1 139 141 178 62.0 6e-44 MEFEIKGVKYRTAKLDVFQQLKVSRKLLPVLAGLVSDFGTLKSMMVRDSEGKLVFGEKRA FDALDIVLPKIADTLAALPEEDVNAVIHPCLGVVMRQHEKGWVKIFDQGALMFDDIDLFT MLQLVARVVADSLGNFLKELPGSGTPTQP >gi|296493134|gb|ADTK01000367.1| GENE 12 6917 - 7069 212 50 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|300907084|ref|ZP_07124750.1| ## NR: gi|300907084|ref|ZP_07124750.1| hypothetical protein HMPREF9536_05041 [Escherichia coli MS 84-1] # 1 50 9 58 58 86 100.0 4e-16 
MRPVDAGLIPYTALKDGSVDLADIARMNDWLDLKADNENRIAKWREANER >gi|296493134|gb|ADTK01000367.1| GENE 13 7059 - 8975 1210 638 aa, chain + ## HITS:1 COG:no KEGG:ECO103_3101 NR:ns ## KEGG: ECO103_3101 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 638 1 662 662 789 65.0 0 MNAETLKDFLISLGFKVDEAGARKFDAVVAGTTLKAIELGVKVEAAALSVVAFTAKIASG LDDLYWASQRTGATVEGIKQIGYAVSQVGGSVDGARGSLENLARFMRNNPGAEGFLNRLG VQTRDASGNMRDMATIFTGVGQRLSSMPYYRANQYAQMLGLDENTLMAMRRGIGQFSGEY TAMAKAIGYNADVAAVSSNKFMTSLRSFGLMAGMARDKIGSSLADGLAGSLDRLRRQILE NFPKIEGAITGTVKGILWAGEMVGRVIYRLIQLGQSISDWWDSLDKQSQQLIEPIGALTA AWWMLNRAMLASPITWVLGLAAAIALLWEDYQTWKEGGKSLIDWGKWKPEVDAALKMVGD LKQTVLDLGKALAKLLNIDPKSWSLKWDFSNFITQMGEFSKMLSMIGDLLNAIKDGRWSD AASIGRALLKQGSNQPDALPGVSDSANSAADWIKDKTGFDPRSIGRFFRSEGNTLADRNN NPGNIRPVGGGGFRTFGSALEGWEAMKNQLMRYFTGKTTGRRLQTIMDIVSTWAPAADNN DPAKYARDVAGWMGVSPTAALNLSDPNTMAMLMQSMARKEGYSNWNSPLAHQAAGAQVNQ QNTYNIYGGNAQEIGQEVSRRQLDANARVLRNNQTGAG >gi|296493134|gb|ADTK01000367.1| GENE 14 8975 - 9550 290 191 aa, chain + ## HITS:1 COG:no KEGG:STY1062 NR:ns ## KEGG: STY1062 # Name: not_defined # Def: putative bacteriophage protein # Organism: S.typhi # Pathway: not_defined # 1 189 1 189 195 239 69.0 3e-62 MDILSTLFQQQSRRIGLIVPSVVISEKHDDSLEITEHPVEVGAAISDHAFRRPSEVVMQV GFAGGGSLLDFVDTSSLGLSVGIGPKETYQELLNLQSSRVPLDVVTGKRIYTNMLIRALE VTTDRTSENILSAVLTLREVIITSTTTTHVAPKSNMKLGANTSAVQNSGVKTPVQKNESI LSRLSGFVAGG >gi|296493134|gb|ADTK01000367.1| GENE 15 9551 - 9853 223 100 aa, chain + ## HITS:1 COG:no KEGG:t1878 NR:ns ## KEGG: t1878 # Name: not_defined # Def: putative bacteriophage protein # Organism: S.typhi_Ty2 # Pathway: not_defined # 1 100 1 100 100 117 53.0 2e-25 MTISEIPLSPENQRFSISVAGRSLQMAVTWRAAFWCLDIMDSSGADLIKGIPLITGADLL AQYRYLGLGFSLYVGCDNQSSENPTEADLGIYSHLYAVAE >gi|296493134|gb|ADTK01000367.1| GENE 16 9856 - 10923 509 355 aa, chain + ## HITS:1 COG:no KEGG:ECO103_3098 NR:ns ## KEGG: ECO103_3098 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: 
not_defined # 1 353 1 353 354 544 77.0 1e-153 MSQNWMRHFELQLVDSKGNATDFGSFKSTFTIDWFNLSSETRVGTFKIYNLSADTVNRIV GEEFSRIRVIAGYDGIAADVPASQVGVARTVNPDEVGQMDGRNYGLIFDGEIRYTITGKD NPVDSFVLIQAADSDRAFATSITAQTLAAGYTVSDVNAVLMKDFNANGATEGNTPAMPAT VFPRGRVLFGMTRHLMDNVAEQCKADWMFVDGKREMVAKNEVVHEAIKLNSATGLVGMPQ QTIGSGVNVRCLINPNIRVNGLIELNQASVFRTALGNSDIAMTQGRITDQNNNGNITIEG TTAQPASIATDGVYIVRGIMYTGDTRGQAWYMDMMCEARGAMDLKTQSALERGAG >gi|296493134|gb|ADTK01000367.1| GENE 17 10920 - 11252 62 110 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_2629 NR:ns ## KEGG: ECIAI1_2629 # Name: not_defined # Def: putative secreted protein from phage origin # Organism: E.coli_IAI1 # Pathway: not_defined # 1 110 1 115 117 115 55.0 5e-25 MKKLLILIALFSAPALSAIQCGGYKLTINDSEGLVRINGELVTSQKVKYLGKKGDESNAK WDMGIMPSRDGNNYGFQFIKRDGKSWLNVQLLQNSMDAPKLIGSYPCKTV >gi|296493134|gb|ADTK01000367.1| GENE 18 11688 - 12419 409 243 aa, chain + ## HITS:1 COG:no KEGG:t1873 NR:ns ## KEGG: t1873 # Name: not_defined # Def: putative bacteriophage protein # Organism: S.typhi_Ty2 # Pathway: not_defined # 1 242 1 250 251 301 66.0 1e-80 MGVSSQTRSGALAEVLASERKTLSEQMRVALPGIIQSFDPESLTAVVQPAIRYIERDNDG NKSTKDYPLLVDVPVVFPRGGGCTLTFPVSEGDECLVIFADRCIDFWWQSGGVQEPVDGR MHDLSDAFCIVGPQSQAKKIGGISTTGAQLRTDDGSAFIEVAAGGDITATTAGSATINAP EIVLNGNVTINGNLSQGMGERGGTATMHGPVTVTNDVTAGGKSLMTHMHGGVEHGNDSTG GPE >gi|296493134|gb|ADTK01000367.1| GENE 19 12419 - 12772 336 117 aa, chain + ## HITS:1 COG:no KEGG:SNSL254_A1175 NR:ns ## KEGG: SNSL254_A1175 # Name: not_defined # Def: putative bacteriophage protein # Organism: S.enterica_Newport # Pathway: not_defined # 1 117 1 117 117 177 81.0 1e-43 MRYRREDDDGDYTFGQGDDTWLVNSPEAVAQAIKTRFLLWYGQWFLDTTEGTPWIQSVLG KQKPDTYNLAIRKRILETQGVSSITAFNTTVDGTTRRVTFTATVETLYGTTTVTSEA >gi|296493134|gb|ADTK01000367.1| GENE 20 12772 - 13968 661 398 aa, chain + ## HITS:1 COG:XF1704 KEGG:ns NR:ns ## COG: XF1704 COG3299 # Protein_GI_number: 15838305 # Func_class: S Function unknown # Function: Uncharacterized homolog of phage Mu protein gp47 # Organism: Xylella 
fastidiosa 9a5c # 14 397 10 386 387 130 29.0 5e-30 MSLDLDTLGLSATVTAEGISAPDYQTVLDTITGYFQQIYGSDAYLDPDSKDGQMVALVAL AIHDANNTAISVYRSFSPSTALDDALTSNVKINGIARRAATNSTVDELIEGEAGTLITNG SVKDANGIIWNLPAQVTIGIDGTVIATATCSVAGAVAAPAGSVNKINTPTRGWVSVTNPQ AATVGVAAETNAELRVRQSQSVALPSLTPFEAVDGAIANISGVTRHKLYENDTDTTDANG LPPHSIAAIVEGGDATVIANSIRGVKGQGVTPYGSTVIVVPDKYGNPHSVGFSRPVDVPI YVKITIEPLTGYTSQVGEEIKAAVSAYINSLAIGASVLLSRVYSPANLGVVSGGNARYYD ITELLIGTSAGGVAAANVDIAFDQSASCAVSNINLVVS >gi|296493134|gb|ADTK01000367.1| GENE 21 13965 - 14738 384 257 aa, chain + ## HITS:1 COG:no KEGG:ECO103_3091 NR:ns ## KEGG: ECO103_3091 # Name: not_defined # Def: putative bacteriophage protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 254 1 223 226 270 54.0 5e-71 MSRYTDRITNYHAGKPKFFAHVDLSTRPLSDVSDAMSRLIPDFDIDTAIGVQLDVVGEWV GRSRRVATPVTGIYFSWDTERVGWDRGVWQGPYDPNDGFIDLSDEIYRLMLKVKVAINNW DGQNDSLPSILDAALAGSGIRMAIVDNQDMSISIWILGDPSVALSEIDRLILDSAVNKGP FIALPADYVPSRYDINPIDQVNSELWWAIRNGYMTVKAAGVRVREIETVSDGYQFFGFDI ENDYIAGFDRGSWGEQF >gi|296493134|gb|ADTK01000367.1| GENE 22 14738 - 15619 318 293 aa, chain + ## HITS:1 COG:no KEGG:Ctu_34110 NR:ns ## KEGG: Ctu_34110 # Name: not_defined # Def: hypothetical protein # Organism: C.turicensis # Pathway: not_defined # 1 105 1 105 227 98 54.0 3e-19 MATNDFKPFATGSGANVLSQADYDALSARTTGFLSGKASSAQVNKALRQASTIAAVVAQF ISDNSGDDTLDNGNLPTLLASLESALLKSSPGRLQNIVIFTANGTYTPSPGTKHVKVIVT GGGGGGGGCQGTSGSESVSGGGGGAGGTAIGYFAVTESSYAVTVGAGGSAGVGAVQGGTG GTSIINGISGLGGDGGQKSGITTLAGGKGGVSIGGSVNLPGGYGTDGQNGSLIIPGNGGS SYWGGGGRGGARGGVAGDCYGSGGGGAYDAAMSGNSYNGGQGKAGIVYIEEYS >gi|296493134|gb|ADTK01000367.1| GENE 23 15619 - 15789 148 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|300907096|ref|ZP_07124762.1| ## NR: gi|300907096|ref|ZP_07124762.1| conserved domain protein [Escherichia coli MS 84-1] # 1 56 1 56 56 90 100.0 4e-17 MVSKYAVLKEGVVENIIVADDNYSPDDFEVVKYSDETFCQPDMLYNKEDGLFYDDK >gi|296493134|gb|ADTK01000367.1| GENE 24 15871 - 18282 359 803 aa, chain + ## HITS:1 COG:no KEGG:Kvar_1820 NR:ns ## KEGG: Kvar_1820 # 
Name: not_defined # Def: hypothetical protein # Organism: K.variicola # Pathway: not_defined # 6 262 10 226 702 126 38.0 5e-27 MSEYDTGNPVPSASMPDAWDNMQSIDKFVNSNDETITTRTGEQLDTLHGVNVKASNQRNE LQAEFEASQEERDIAFEETRQNLIPLSRQYMTLAAAQADIANIPTGSATYYRSPDDSALA IEAINNAGTLTATGRKIPAYSALRRGNILFDAFNEYSAVDPKFGAWDWYRGATVTFSTTD ANIPLPTPVAQYSGVWSADKYYDLARLPVRVGDQLTFSVLAWFQDAGAKFHIFWMSASGT IISSKSVTGLASGINVPVLTDIVPAGASYVRIRVENTTTGSFKVGAYASAVGSIQPEFIR AAPDKTYMSAIVSSGIAGLTSRVDALQGAISVGYPYAATWQVGKFVNPNTGEITDNAALN CTIIPHSDGDGWLVTALVTGSATALAVYMNAAGTVLGVEGRGTTTPQQYTNYRLNVPSGT TQIGITGRNSAPMSVKKLAVVETATVLASIDSLDSRVQKIEDSLVYDFVQQDVPITSGAY INRSTGAVVTNSAFDCSIFDYTTGDRWKVTARVNGTSVSLAVYFNSVGTVIGTEGDGTSA SVDYIDYELTPPTGTARIGITTRVALPIVAKKYVIVPGGSTVSPWSGKIIDVMGDSNVAY NKWQPLVASELGCSFLNHGVGGSKIAKPDSSPSQISMCDDARINALDTTATAWICGPWGT NDWAQNIPIGTISDTVNTTVYGALVIIAQKLRARAPTKPIFWVTPFNGDYETGRTSSWVD GETNQYGRVSDYSAAIRAVALRYGFPLIDLNADCGWTKFNSSYFLLTEGNTNPSHIHLND SAGPARISELVIDRLTALANLAN >gi|296493134|gb|ADTK01000367.1| GENE 25 18358 - 19389 457 343 aa, chain + ## HITS:1 COG:no KEGG:Ent638_1051 NR:ns ## KEGG: Ent638_1051 # Name: not_defined # Def: hypothetical protein # Organism: Enterobacter_638 # Pathway: not_defined # 12 137 12 135 232 105 58.0 3e-21 MALKLLANNNAKSVLAAGISASATVITVGTGAGALFPSPVSGQSYFKLTITDAATKTISE IMHVTSVSGDVMTVIRGQEGTTARVWSTNDIVANLMTAGSLLSFLQISNNLSEIKDAGED AINEARYNLGISDSSGFVGRSLGAPKAFYANGTYTRSPLARYAKVTLTGGGGGGGGCQAS NNTETFSGAGGGAGATIIVWVDLSAASSYAITVGKGGKGGVGAVSGADGGATSFASLFSA PGGKGGVKSGVSNTAGGAGGTAATGDIRINGGTGSDGQTGSSLLTGNGGASYFGGGGRAG SQAGIAGAAPGSGGGGAYDLGFTGTAFTGGDGATGMAIVEEFA >gi|296493134|gb|ADTK01000367.1| GENE 26 19795 - 20271 466 158 aa, chain - ## HITS:1 COG:ybhB KEGG:ns NR:ns ## COG: ybhB COG1881 # Protein_GI_number: 16128741 # Func_class: R General function prediction only # Function: Phospholipid-binding protein # Organism: Escherichia coli K12 # 1 158 1 158 158 314 100.0 4e-86 MKLISNDLRDGDKLPHRHVFNGMGYDGDNISPHLAWDDVPAGTKSFVVTCYDPDAPTGSG 
WWHWVVVNLPADTRVLPQGFGSGLVAMPDGVLQTRTDFGKTGYDGAAPPKGETHRYIFTV HALDIERIDVDEGASGAMVGFNVHFHSLASASITAMFS >gi|296493134|gb|ADTK01000367.1| GENE 27 20330 - 21562 1256 410 aa, chain - ## HITS:1 COG:bioA KEGG:ns NR:ns ## COG: bioA COG0161 # Protein_GI_number: 16128742 # Func_class: H Coenzyme transport and metabolism # Function: Adenosylmethionine-8-amino-7-oxononanoate aminotransferase # Organism: Escherichia coli K12 # 1 410 20 429 429 817 98.0 0 MTSPLPVYPVVSAEGCELILSDGKRLVDGMSSWWAAIHGYNHPQLNAAMKSQIDAMSHVM FGGITHAPAIELCRKLVAMTPQPLECVFLADSGSVAVEVAMKMALQYWQAKGEARQRFLT FRNGYHGDTFGAMSVCDPDNSMHSLWKGYLPENLFAPAPQSRMDGEWDERDMVGFARLMA AHRHEIAAVIIEPIVQGAGGMRMYHPEWLKRIRKMCDREGILLIADEIATGFGRTGKLFA CEYAEIAPDILCLGKALTGGTMTLSATLTTREVAETISNGEAGCFMHGPTFMGNPLACAA ANASLAIIESGEWQQQVAAIEVQLREQLAPARDAEMVADVRVLGAIGVVETTRPVNMAAL QKFFVEQGVWIRPFGKLIYLMPPYIILPQQLQRLTAAVNRAVQDETFFCQ >gi|296493134|gb|ADTK01000367.1| GENE 28 21706 - 22746 1173 346 aa, chain + ## HITS:1 COG:bioB KEGG:ns NR:ns ## COG: bioB COG0502 # Protein_GI_number: 16128743 # Func_class: H Coenzyme transport and metabolism # Function: Biotin synthase and related enzymes # Organism: Escherichia coli K12 # 1 346 1 346 346 694 100.0 0 MAHRPRWTLSQVTELFEKPLLDLLFEAQQVHRQHFDPRQVQVSTLLSIKTGACPEDCKYC PQSSRYKTGLEAERLMEVEQVLESARKAKAAGSTRFCMGAAWKNPHERDMPYLEQMVQGV KAMGLEACMTLGTLSESQAQRLANAGLDYYNHNLDTSPEFYGNIITTRTYQERLDTLEKV RDAGIKVCSGGIVGLGETVKDRAGLLLQLANLPTPPESVPINMLVKVKGTPLADNDDVDA FDFIRTIAVARIMMPTSYVRLSAGREQMNEQTQAMCFMAGANSIFYGCKLLTTPNPEEDK DLQLFRKLGLNPQQTAVLAGDNEQQQRLEQALMTPDTDEYYNAAAL >gi|296493134|gb|ADTK01000367.1| GENE 29 22743 - 23897 974 384 aa, chain + ## HITS:1 COG:bioF KEGG:ns NR:ns ## COG: bioF COG0156 # Protein_GI_number: 16128744 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Escherichia coli K12 # 1 384 1 384 384 679 100.0 0 MSWQEKINAALDARRAADALRRRYPVAQGAGRWLVADDRQYLNFSSNDYLGLSHHPQIIR AWQQGAEQFGIGSGGSGHVSGYSVVHQALEEELAEWLGYSRALLFISGFAANQAVIAAMM 
AKEDRIAADRLSHASLLEAASLSPSQLRRFAHNDVTHLARLLASPCPGQQMVVTEGVFSM DGDSAPLAEIQQVTQQHNGWLMVDDAHGTGVIGEQGRGSCWLQKVKPELLVVTFGKGFGV SGAAVLCSSTVADYLLQFARHLIYSTSMPPAQAQALRASLAVIRSDEGDARREKLAALIT RFRAGVQDLPFTLADSCSAIQPLIVGDNSRALQLAEKLRQQGCWVTAIRPPTVPAGTARL RLTLTAAHEMQDIDRLLEVLHGNG >gi|296493134|gb|ADTK01000367.1| GENE 30 23884 - 24639 533 251 aa, chain + ## HITS:1 COG:bioC KEGG:ns NR:ns ## COG: bioC COG0500 # Protein_GI_number: 16128745 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli K12 # 1 251 1 251 251 460 100.0 1e-130 MATVNKQAIAAAFGRAAAHYEQHADLQRQSADALLAMLPQRKYTHVLDAGCGPGWMSRHW RERHAQVTALDLSPPMLVQARQKDAADHYLAGDIESLPLATATFDLAWSNLAVQWCGNLS TALRELYRVVRPKGVVAFTTLVQGSLPELHQAWQAVDERPHANRFLPPDEIEQSLNGVHY QHHIQPITLWFDDALSAMRSLKGIGATHLHEGRDPRILTRSQLQRLQLAWPQQQGRYPLT YHLFLGVIARE >gi|296493134|gb|ADTK01000367.1| GENE 31 24632 - 25309 476 225 aa, chain + ## HITS:1 COG:bioD KEGG:ns NR:ns ## COG: bioD COG0132 # Protein_GI_number: 16128746 # Func_class: H Coenzyme transport and metabolism # Function: Dethiobiotin synthetase # Organism: Escherichia coli K12 # 1 225 1 225 225 428 98.0 1e-120 MSKRYFVTGTDTEVGKTVASCALLQAAKAVGYRTAGYKPVASGSEKTPEGLRNSDALALQ HNSSLQLDYATVNPYTFAEPTSPHIISAQEGRPIESLVMSAGLRALEQQADWVLVEGAGG WFTPLSDTFTFADWVTQEQLPVILVVGVKLGCINHAMLTAQAIQHAGLTLAGWVANDVTP PGKRHAEYMTTLTRMIPAPLLGEIPWLAENPENAATGKYINLALL >gi|296493134|gb|ADTK01000367.1| GENE 32 25888 - 27909 2228 673 aa, chain + ## HITS:1 COG:ECs0857 KEGG:ns NR:ns ## COG: ECs0857 COG0556 # Protein_GI_number: 15830111 # Func_class: L Replication, recombination and repair # Function: Helicase subunit of the DNA excision repair complex # Organism: Escherichia coli O157:H7 # 1 673 1 673 673 1265 100.0 0 MSKPFKLNSAFKPSGDQPEAIRRLEEGLEDGLAHQTLLGVTGSGKTFTIANVIADLQRPT MVLAPNKTLAAQLYGEMKEFFPENAVEYFVSYYDYYQPEAYVPSSDTFIEKDASVNEHIE QMRLSATKAMLERRDVVVVASVSAIYGLGDPDLYLKMMLHLTVGMIIDQRAILRRLAELQ 
YARNDQAFQRGTFRVRGEVIDIFPAESDDIALRVELFDEEVERLSLFDPLTGQIVSTIPR FTIYPKTHYVTPRERIVQAMEEIKEELAARRKVLLENNKLLEEQRLTQRTQFDLEMMNEL GYCSGIENYSRFLSGRGPGEPPPTLFDYLPADGLLVVDESHVTIPQIGGMYRGDRARKET LVEYGFRLPSALDNRPLKFEEFEALAPQTIYVSATPGNYELEKSGGDVVDQVVRPTGLLD PIIEVRPVATQVDDLLSEIRQRAAINERVLVTTLTKRMAEDLTEYLEEHGERVRYLHSDI DTVERMEIIRDLRLGEFDVLVGINLLREGLDMPEVSLVAILDADKEGFLRSERSLIQTIG RAARNVNGKAILYGDKITPSMAKAIGETERRREKQQKYNEEHGITPQGLNKKVVDILALG QNIAKTKAKGRGKSRPIVEPDNVPMDMSPKALQQKIHELEGLMMQHAQNLEFEEAAQIRD QLHQLRELFIAAS >gi|296493134|gb|ADTK01000367.1| GENE 33 28101 - 29009 856 302 aa, chain - ## HITS:1 COG:ybhK KEGG:ns NR:ns ## COG: ybhK COG0391 # Protein_GI_number: 16128748 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 302 1 302 302 548 100.0 1e-156 MRNRTLADLDRVVALGGGHGLGRVLSSLSSLGSRLTGIVTTTDNGGSTGRIRRSEGGIAW GDMRNCLNQLITEPSVASAMFEYRFGGNGELSGHNLGNLMLKALDHLSVRPLEAINLIRN LLKVDTHLIPMSEHPVDLMAIDDQGHEVYGEVNIDQLTTPIQELLLTPNVPATREAVHAI NEADLIIIGPGSFYTSLMPILLLKEIAQALRRTPAPMVYIGNLGRELSLPAANLKLESKL AIMEQYVGKKVIDAVIVGPKVDVSAVKERIVIQEVLEASDIPYRHDRQLLHNALEKALQA LG >gi|296493134|gb|ADTK01000367.1| GENE 34 29406 - 30395 732 329 aa, chain + ## HITS:1 COG:ECs0859 KEGG:ns NR:ns ## COG: ECs0859 COG2896 # Protein_GI_number: 15830113 # Func_class: H Coenzyme transport and metabolism # Function: Molybdenum cofactor biosynthesis enzyme # Organism: Escherichia coli O157:H7 # 1 329 1 329 329 682 100.0 0 MASQLTDAFARKFYYLRLSITDVCNFRCTYCLPDGYKPSGVTNKGFLTVDEIRRVTRAFA SLGTEKVRLTGGEPSLRRDFTDIIAAVRENDAIRQIAVTTNGYRLERDVANWRDAGLTGI NVSVDSLDARQFHAITGQDKFNQVMAGIDAAFEAGFEKVKVNTVLMRDVNHHQLDTFLNW IQHRPIQLRFIELMETGEGSELFRKHHISGQVLRDELLRRGWIHQLRQRSDGPAQVFCHP DYAGEIGLIMPYEKDFCATCNRLRVSSIGKLHLCLFGEGGVNLRDLLEDDTQQQALEARI SAALREKKQTHFLHQNNTGITQNLSYIGG >gi|296493134|gb|ADTK01000367.1| GENE 35 30417 - 30929 602 170 aa, chain + ## HITS:1 COG:ECs0860 KEGG:ns NR:ns ## COG: ECs0860 COG0521 # Protein_GI_number: 15830114 # Func_class: H Coenzyme transport and metabolism 
# Function: Molybdopterin biosynthesis enzymes # Organism: Escherichia coli O157:H7 # 1 170 1 170 170 340 98.0 1e-93 MSQVSTEFIPTRIAILTVSNRRGEEDDTSGHYLRDSAQEVGHHVVDKAIVKENRYAIRAQ VSAWIASDDVQVVLITGGTGLTEGDQAPEALLPLFDREIEGFGEVFRMLSFEEIGTSTLQ SRAVAGVANKTLIFAMPGSTKACRTAWENIIAPQLDARTRPCNFHPHLKK >gi|296493134|gb|ADTK01000367.1| GENE 36 30932 - 31417 638 161 aa, chain + ## HITS:1 COG:ECs0861 KEGG:ns NR:ns ## COG: ECs0861 COG0315 # Protein_GI_number: 15830115 # Func_class: H Coenzyme transport and metabolism # Function: Molybdenum cofactor biosynthesis enzyme # Organism: Escherichia coli O157:H7 # 1 161 1 161 161 296 100.0 9e-81 MSQLTHINAAGEAHMVDVSAKAETVREARAEAFVTMRSETLAMIIDGRHHKGDVFATARI AGIQAAKRTWDLIPLCHPLMLSKVEVNLQAEPEHNRVRIETLCRLTGKTGVEMEALTAAS VAALTIYDMCKAVQKDMVIGPVRLLAKSGGKSGDFKVEADD >gi|296493134|gb|ADTK01000367.1| GENE 37 31410 - 31655 320 81 aa, chain + ## HITS:1 COG:moaD KEGG:ns NR:ns ## COG: moaD COG1977 # Protein_GI_number: 16128752 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin converting factor, small subunit # Organism: Escherichia coli K12 # 1 81 1 81 81 149 100.0 1e-36 MIKVLFFAQVRELVGTDATEVAADFPTVEALRQHMAAQSDRWALALEDGKLLAAVNQTLV SFDHPLTDGDEVAFFPPVTGG >gi|296493134|gb|ADTK01000367.1| GENE 38 31657 - 32109 617 150 aa, chain + ## HITS:1 COG:moaE KEGG:ns NR:ns ## COG: moaE COG0314 # Protein_GI_number: 16128753 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin converting factor, large subunit # Organism: Escherichia coli K12 # 1 150 1 150 150 288 99.0 2e-78 MAETKIVVGPQPFSVGEEYPWLAERDEDGAVVTFTGKVRNHNLGDSVKALTLEHYPGMTE KALAEIVDEARNRWPLGRVTVIHRIGELWPGDEIVFVGVTSAHRSSAFEAGQFIMDYLKT RAPFWKREATPEGDRWVEARESDQQAAKRW >gi|296493134|gb|ADTK01000367.1| GENE 39 32246 - 32950 751 234 aa, chain + ## HITS:1 COG:ybhL KEGG:ns NR:ns ## COG: ybhL COG0670 # Protein_GI_number: 16128754 # Func_class: R General function prediction only # Function: Integral membrane protein, interacts with FtsH # Organism: Escherichia coli K12 # 1 
234 1 234 234 368 100.0 1e-102 MDRFPRSDSIVQPRAGLQTYMAQVYGWMTVGLLLTAFVAWYAANSAAVMELLFTNRVFLI GLIIAQLALVIVLSAMIQKLSAGVTTMLFMLYSALTGLTLSSIFIVYTAASIASTFVVTA GMFGAMSLYGYTTKRDLSGFGNMLFMALIGIVLASLVNFWLKSEALMWAVTYIGVIVFVG LTAYDTQKLKNMGEQIDTRDTSNLRKYSILGALTLYLDFINLFLMLLRIFGNRR >gi|296493134|gb|ADTK01000367.1| GENE 40 33155 - 33868 238 237 aa, chain + ## HITS:1 COG:ybhM KEGG:ns NR:ns ## COG: ybhM COG0670 # Protein_GI_number: 16128755 # Func_class: R General function prediction only # Function: Integral membrane protein, interacts with FtsH # Organism: Escherichia coli K12 # 1 237 1 237 237 402 94.0 1e-112 MESYSQNSNKLDFQHETRILNGIWLITALGLVATAGLAWGAKYLEITATKYDSPTMYVAI GLLLVCMYGLSKDINKINAVIAGVIYLFLLSLVAIVVASLAPVPAIIVVFSTAGAMFLIS MLAGLLFNVDPGSHRFIIMMTLTGLALVIIMNAALMSERPVWVISCLMIVLWSGIISHGR NKLLELAGKCHSEELWSPVRCAVTGALTLYYYFIGFFGILAAIAITLVWQRHTRFFH >gi|296493134|gb|ADTK01000367.1| GENE 41 33904 - 34860 1044 318 aa, chain - ## HITS:1 COG:ybhN KEGG:ns NR:ns ## COG: ybhN COG0392 # Protein_GI_number: 16128756 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Escherichia coli K12 # 1 318 1 318 318 529 99.0 1e-150 MSKSHPRWRLAKKILTWLFFIAVIVLLVVYAKKVDWEEVWKVIRDYNRVALLSAVGLVVV SYLIYGCYDLLARFYCGHKLAKRQVMLVSFICYAFNLTLSTWVGGIGMRYRLYSRLGLPG STITRIFSLSITTNWLGYILLAGIIFTAGVVELPDHWYVDQTTLRILGIGLLMIIAVYLW FCAFAKHRHMTIKGQKLVLPSWKFALAQMLISSVNWMVMGAIIWLLLGQSVNYFFVLGVL LVSSIAGVIVHIPAGIGVLEAVFIALLAGEHTSKGSIIAALLAYRVLYYFIPLLLALICY LLLESQAKKLRAKNEAAM >gi|296493134|gb|ADTK01000367.1| GENE 42 34860 - 36101 1306 413 aa, chain - ## HITS:1 COG:ECs0867 KEGG:ns NR:ns ## COG: ECs0867 COG1502 # Protein_GI_number: 15830121 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Escherichia coli O157:H7 # 1 413 1 413 413 833 99.0 0 MKCSWREGNKIQLLENGEQYYPAVFKAIGEAQERIILETFIWFEDDVGKQLHAALLAAAQ RGVKAEVLLDGYGSPDLSDEFVNELTAAGVVFRYYDPRPRLFGMRTNVFRRMHRKIVVID 
ARIAFIGGLNYSAEHMSSYGPEAKQDYAVRLEGPIVEDILQFELENLPGQSAARRWWRRH HKAEENRQPGEAQVLLVWRDNEEHRDDIERHYLKMLTQARREVIIANAYFFPGYRFLHAL RKAARRGVRIKLIIQGEPDMPIVRVGARLLYNYLVKGGVQVFEYRRRPLHGKVALMDDHW ATVGSSNLDPLSLSLNLEANVIIHDRNFNQTLRDNLNGIIAADCQQVDETMLPKRTWWNL TKSVLAFHFLRHFPALVGWLPAHTPRLAQVDPPAQPTMETQDRVETENTGVKP >gi|296493134|gb|ADTK01000367.1| GENE 43 36098 - 36859 568 253 aa, chain - ## HITS:1 COG:ECs0868 KEGG:ns NR:ns ## COG: ECs0868 COG3568 # Protein_GI_number: 15830122 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Escherichia coli O157:H7 # 1 253 1 253 253 520 100.0 1e-147 MPDQTQQFSFKVLTINIHKGFTAFNRRFILPELRDAVRTVSADIVCLQEVMGAHEVHPLH VENWPDTSHYEFLADTMWSDFAYGRNAVYPEGHHGNAVLSRYPIEHYENRDVSVDGAEKR GVLYCRIVPPMTGKAIHVMCVHLGLREAHRQAQLAMLAEWVNELPDGEPVLVAGDFNDWR QKANHPLKVQAGLDEIFTRAHGRPARTFPVQFPLLRLDRIYVKNASASAPTALPLRTWRH LSDHAPLSAEIHL >gi|296493134|gb|ADTK01000367.1| GENE 44 36992 - 37402 445 136 aa, chain + ## HITS:1 COG:no KEGG:G2583_1019 NR:ns ## KEGG: G2583_1019 # Name: ybhQ # Def: inner membrane protein YbhQ # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 136 1 136 136 215 100.0 4e-55 MKWQQRVRVATGLSCWQIMLHLLVVALLVVGWMSKTLVHVGVGLCALYCVTVVMMLVFQR HPEQRWREVADVLEELTTTWYFGAALIVLWLLSRVLENNFLLAIAGLAILAGPAVVSLLA KDKKLHHLTSKHRVRR >gi|296493134|gb|ADTK01000367.1| GENE 45 37364 - 38470 1220 368 aa, chain - ## HITS:1 COG:ECs0870 KEGG:ns NR:ns ## COG: ECs0870 COG0842 # Protein_GI_number: 15830124 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Escherichia coli O157:H7 # 1 368 1 368 368 686 100.0 0 MFHRLWTLIRKELQSLLREPQTRAILILPVLIQVILFPFAATLEVTNATIAIYDEDNGEH SVELTQRFARASAFTHVLLLKSPQEIRPTIDTQKALLLVRFPADFSRKLDTFQTAPLQLI LDGRNSNSAQIAANYLQQIVKNYQQELLEGKPKPNNSELVVRNWYNPNLDYKWFVVPSLI AMITTIGVMIVTSLSVAREREQGTLDQLLVSPLTTWQIFIGKAVPALIVATFQATIVLAI GIWAYQIPFAGSLALFYFTMVIYGLSLVGFGLLISSLCSTQQQAFIGVFVFMMPAILLSG YVSPVENMPVWLQNLTWINPIRHFTDITKQIYLKDASLDIVWNSLWPLLVITATTGSAAY AMFRRKVM 
>gi|296493134|gb|ADTK01000367.1| GENE 46 38481 - 39614 1384 377 aa, chain - ## HITS:1 COG:ECs0871 KEGG:ns NR:ns ## COG: ECs0871 COG0842 # Protein_GI_number: 15830125 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Escherichia coli O157:H7 # 1 377 1 377 377 679 99.0 0 MSNPILSWRRVRALCVKETRQIVRDPSSWLIAVVIPLLLLFIFGYGINLDSSKLRVGILL EQSSEAALDFTHTMTGSPYIDATISDNRQELIAKMQAGKIRGLVVIPVDFAEQMERANAT APIQVITDGSEPNTANFVQGYVEGIWQIWQMQRAEDNGQTFEPLIDVQTRYWFNPAAISQ HFIIPGAVTIIMTVIGAILTSLVVAREWERGTMEALLSTEITRTELLLCKLIPYYFLGML AMLLCMLVSVFILGVPYRGSLLILFFISSLFLLSTLGMGLLISTITRNQFNAAQVALNAA FLPSIMLSGFIFQIDSMPAVIRAVTYIIPARYFVSTLQSLFLAGNIPVVLVVNVLFLIAS AVMFIGLTWLKTKRRLD >gi|296493134|gb|ADTK01000367.1| GENE 47 39607 - 41343 343 578 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 330 549 12 233 318 136 34 2e-31 MNDAVITLNGLEKRFPGMDKPAVAPLDCTIHAGYVTGLVGPDGAGKTTLMRMLAGLLKPD SGNATVIGFDPIKNDGALHAVLGYMPQKFGLYEDLTVMENLNLYADLRSVTGEARKQTFA RLLEFTSLGPFTGRLAGKLSGGMKQKLGLACTLVGEPKVLLLDEPGVGVDPISRRELWQM VHELAGEGMLILWSTSYLDEAEQCRDVLLMNEGELLYQGEPKALTQTMAGRSFLMTSPHE GNRKLLQRALKLPQVSDGMIQGKSVRLILKKEATPDDIRHADGMPEININETTPRFEDAF IDLLGGAGTSESPLGAILHTVDGTPGETVIEAKELTKKFGDFAATDHVNFAVKRGEIFGL LGPNGAGKSTTFKMMCGLLVPTSGQALVLGMDLKESSGKARQHLGYMAQKFSLYGNLTVE QNLRFFSGVYGLRGRAQNEKISRMSEAFGLKSIASHATDELPLGFKQRLALACSLMHEPD ILFLDEPTSGVDPLTRREFWLHINSMVEKGVTVMVTTHFMDEAEYCDRIGLVYRGKLIAS GTPDDLKAQSANDEQPDPTMEQAFIQLIHDWDKEHSNE >gi|296493134|gb|ADTK01000367.1| GENE 48 41336 - 42331 1245 331 aa, chain - ## HITS:1 COG:Z1015 KEGG:ns NR:ns ## COG: Z1015 COG0845 # Protein_GI_number: 15800546 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli O157:H7 EDL933 # 1 331 2 332 332 550 98.0 1e-156 MKKPVVIGLAVVVLAAVVAGGYWWYQSRQDNGLTLYGNVDIRTVNLSFRVGGRVESLAVD EGDAIKAGQVLGELDHKPYEIALMQAKAGVSVAQAQYDLILAGYRDEEIAQAAAAVKQAQ 
AAYDYAQNFYNRQQGLWKSRTISANDLENARSSRDQAQATLKSAQDKLRQYRSGNREQDI AQAKASLEQAQAQLAQAELNLQDSTLIAPSDGTLLTRAVEPGTVLNEGGTVFTVSLTRPV WVRAYVDERNLDQAQPGRKVLLYTDGRPDKPYHGQIGFVSPTAEFTPKTVETPDLRTDLV YRLRIVVTDADDALRQGMPVTVQFGNEAGHE >gi|296493134|gb|ADTK01000367.1| GENE 49 42334 - 43005 701 223 aa, chain - ## HITS:1 COG:ECs0874 KEGG:ns NR:ns ## COG: ECs0874 COG1309 # Protein_GI_number: 15830128 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 223 5 227 227 434 99.0 1e-122 MNNPAMTIKGEQAKKQLIAAALAQFGEYGMNATTREIAAQAGQNIAAITYYFGSKEDLYL ACAQWIADFIGEQFRPHAEEAERLFAQPQPDRAAIRELILRACRNMIKLLTQDDTVNLSK FISREQLSPTAAYHLVHEQVISPLHSHLTRLIAAWTGSDANDTRMILHTHALIGEILAFR LGKETILLRTGWTAFDEEKTELINQTVTCHIDLILQGLSQRSL >gi|296493134|gb|ADTK01000367.1| GENE 50 43044 - 43265 60 73 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTPYGSCPVWLRILATPVTGIIYALRHKNCYTPLPHDIVVFVIFLFSISLKTTPVTVGAV RSSYVFRFFGFKP >gi|296493134|gb|ADTK01000367.1| GENE 51 43234 - 44598 1473 454 aa, chain + ## HITS:1 COG:rhlE KEGG:ns NR:ns ## COG: rhlE COG0513 # Protein_GI_number: 16128765 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Escherichia coli K12 # 1 442 1 442 454 778 100.0 0 MSFDSLGLSPDILRAVAEQGYREPTPIQQQAIPAVLEGRDLMASAQTGTGKTAGFTLPLL QHLITRQPHAKGRRPVRALILTPTRELAAQIGENVRDYSKYLNIRSLVVFGGVSINPQMM KLRGGVDVLVATPGRLLDLEHQNAVKLDQVEILVLDEADRMLDMGFIHDIRRVLTKLPAK RQNLLFSATFSDDIKALAEKLLHNPLEIEVARRNTASDQVTQHVHFVDKKRKRELLSHMI GKGNWQQVLVFTRTKHGANHLAEQLNKDGIRSAAIHGNKSQGARTRALADFKSGDIRVLV ATDIAARGLDIEELPHVVNYELPNVPEDYVHRIGRTGRAAATGEALSLVCVDEHKLLRDI EKLLKKEIPRIAIPGYEPDPSIKAEPIQNGRQQRGGGGRGQGGGRGQQQPRRGEGGAKSA SAKPAEKPSRRLGDAKPAGEQQRRRRPRKPAAAQ Prediction of potential genes in microbial genomes Time: Mon May 16 16:15:49 2011 Seq name: gi|296493133|gb|ADTK01000368.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1165.6, whole genome shotgun sequence Length of 
sequence - 25284 bp Number of predicted genes - 22, with homology - 22 Number of transcription units - 16, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 177 - 626 415 ## COG3236 Uncharacterized protein conserved in bacteria - Prom 747 - 806 2.6 + Prom 562 - 621 2.4 2 2 Op 1 4/0.571 + CDS 779 - 2929 2030 ## COG1199 Rad3-related DNA helicases 3 2 Op 2 3/0.714 + CDS 2957 - 3919 922 ## COG0547 Anthranilate phosphoribosyltransferase + Term 3932 - 3965 3.0 + Prom 3974 - 4033 3.6 4 3 Tu 1 . + CDS 4060 - 5145 1184 ## COG2055 Malate/L-lactate dehydrogenases + Term 5310 - 5368 9.1 - Term 5140 - 5173 5.2 5 4 Tu 1 . - CDS 5374 - 5634 301 ## EC55989_0846 hypothetical protein - Prom 5801 - 5860 2.3 6 5 Op 1 1/1.000 - CDS 5899 - 6165 292 ## COG1734 DnaK suppressor protein 7 5 Op 2 2/0.857 - CDS 6239 - 6916 701 ## COG3128 Uncharacterized iron-regulated protein - Term 6923 - 6948 -0.5 8 5 Op 3 . - CDS 6958 - 9240 1993 ## COG4774 Outer membrane receptor for monomeric catechols - Prom 9357 - 9416 3.7 - Term 9457 - 9490 2.2 9 6 Tu 1 . - CDS 9505 - 9765 247 ## ECIAI1_0844 hypothetical protein - Prom 9867 - 9926 8.7 + Prom 9875 - 9934 5.3 10 7 Tu 1 . + CDS 10041 - 10967 788 ## COG3129 Predicted SAM-dependent methyltransferase 11 8 Tu 1 . - CDS 10964 - 13189 1995 ## COG0668 Small-conductance mechanosensitive channel - Prom 13217 - 13276 5.6 12 9 Op 1 34/0.000 - CDS 13450 - 14172 595 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 13 9 Op 2 31/0.000 - CDS 14169 - 14828 915 ## COG0765 ABC-type amino acid transport system, permease component 14 9 Op 3 4/0.571 - CDS 14967 - 15713 992 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain - Prom 15760 - 15819 4.1 - Term 16038 - 16081 9.5 15 10 Tu 1 . - CDS 16117 - 16620 606 ## COG0783 DNA-binding ferritin-like protein (oxidative damage protectant) - Prom 16660 - 16719 4.8 - Term 16868 - 16910 10.1 16 11 Tu 1 . 
- CDS 16919 - 17806 891 ## COG5006 Predicted permease, DMT superfamily + Prom 18074 - 18133 4.6 17 12 Tu 1 . + CDS 18159 - 18674 530 ## COG3637 Opacity protein and related surface antigens + Term 18678 - 18740 10.7 - Term 18685 - 18712 -0.9 18 13 Tu 1 . - CDS 18723 - 20306 1406 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase - Prom 20422 - 20481 2.2 + Prom 20787 - 20846 4.1 19 14 Op 1 4/0.571 + CDS 20892 - 21359 518 ## COG1321 Mn-dependent transcriptional regulator 20 14 Op 2 . + CDS 21356 - 22474 928 ## COG0471 Di- and tricarboxylate transporters + Term 22493 - 22525 3.1 - Term 22476 - 22517 3.0 21 15 Tu 1 . - CDS 22533 - 23453 1066 ## COG1376 Uncharacterized protein conserved in bacteria - Prom 23509 - 23568 4.9 + Prom 23494 - 23553 4.1 22 16 Tu 1 . + CDS 23672 - 25264 1855 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains Predicted protein(s) >gi|296493133|gb|ADTK01000368.1| GENE 1 177 - 626 415 149 aa, chain - ## HITS:1 COG:ybiA KEGG:ns NR:ns ## COG: ybiA COG3236 # Protein_GI_number: 16128766 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 149 12 160 160 288 99.0 3e-78 MQDTIINFYSTSDDYGDFSNFAAWPIKVDGKTWPTSEHYFQAQKFLDEKYREEIRRVSSP MVAARMGRDRSKPLRKNWESVKEQVMRKALRAKFEQHAELRALLLATAPAKLVEHTENDA YWGDGGNGKGKNRLGYLLMELREQLAIEK >gi|296493133|gb|ADTK01000368.1| GENE 2 779 - 2929 2030 716 aa, chain + ## HITS:1 COG:dinG KEGG:ns NR:ns ## COG: dinG COG1199 # Protein_GI_number: 16128767 # Func_class: K Transcription; L Replication, recombination and repair # Function: Rad3-related DNA helicases # Organism: Escherichia coli K12 # 1 716 1 716 716 1434 99.0 0 MALTAALKAQIAAWYKALQEQIPDFIPRAPQRQMIADVAKTLAGEEGRHLAIEAPTGVGK TLSYLIPGIAIAREEQKTLVVSTANVALQDQIYSKDLPLLKKIIPDLKFTAAFGRGRYVC PRNLTALASTEPTQQDLLAFLDDELTPNNQEEQKRCAKLKGDLDTYKWDGLRDHTDIAID DDLWRRLSTDKASCLNRNCYYYRECPFFVARREIQEAEVVVANHALVMAAMESEAVLPDP 
KNLLLVLDEGHHLPDVARDALEMSAEITAPWYRLQLDLFTKLVATCMEQFRPKTIPPLAI PERLKAHCEELYELIASLNNILNLYMPAGQEAEHRFAMGELPDEVLEICQRLAKLTEMLR GLAELFLNDLSEKTGSHDIVRLHRLILQMNRALGMFEAQSKLWRLASLAQSSGAPVTKWA TREEREGQLHLWFHCVGIRVSDQLERLLWRSIPHIIVTSATLRSLNSFSRLQEMSGLKEK AGDRFVALDSPFNHCEQGKIVIPRMRVEPSIDNEEQHIAEMAAFFRKQVESKKHLGMLVL FASGRAMQRFLDYVTDLRLMLLVQGDQPRYRLVELHRKRVANGERSVLVGLQSFAEGLDL KGDLLSQVHIHKIAFPPIDSPVVITEGEWLKSLNRYPFEVQSLPSASFNLIQQVGRLIRS HGCWGEVVIYDKRLLTKNYGKRLLDALPVFPIEQPEVPEGIVKKKEKTKSPRRRRR >gi|296493133|gb|ADTK01000368.1| GENE 3 2957 - 3919 922 320 aa, chain + ## HITS:1 COG:ybiB KEGG:ns NR:ns ## COG: ybiB COG0547 # Protein_GI_number: 16128768 # Func_class: E Amino acid transport and metabolism # Function: Anthranilate phosphoribosyltransferase # Organism: Escherichia coli K12 # 1 320 1 320 320 644 100.0 0 MDYRKIIKEIGRGKNHARDLDRDTARGLYAHMLNGEVPDLELGGVLIALRIKGEGEAEML GFYEAMQNHTIKLTPPAGKPMPIVIPSYNGARKQANLTPLLAILLHKLGFPVVVHGVSED PTRVLTETIFELMGITPTLHGGQAQAKLDEHQPVFMPVGAFCPPLEKQLAMRWRMGVRNS AHTLAKLATPFAEGEALRLSSVSHPEYIGRVAKFFSDIGGRALLMHGTEGEVYANPQRCP QINLIDREGMRVLYEKQDTAGSELLPQAKDPETTAQWIERCLAGSEPIPESLKIQMACCL VATGEAATISDGLARVNQAF >gi|296493133|gb|ADTK01000368.1| GENE 4 4060 - 5145 1184 361 aa, chain + ## HITS:1 COG:ybiC KEGG:ns NR:ns ## COG: ybiC COG2055 # Protein_GI_number: 16128769 # Func_class: C Energy production and conversion # Function: Malate/L-lactate dehydrogenases # Organism: Escherichia coli K12 # 1 361 1 361 361 729 100.0 0 MESGHRFDAQTLHSFIQAVFRQMGSEEQEAKLVADHLIAANLAGHDSHGIGMIPSYVRSW SQGHLQINHHAKTVKEAGAAVTLDGDRAFGQVAAHEAMALGIEKAHQHGIAAVALHNSHH IGRIGYWAEQCAAAGFVSIHFVSVVGIPMVAPFHGRDSRFGTNPFCVVFPRKDNFPLLLD YATSAIAFGKTRVAWHKGVPVPPGCLIDVNGVPTTNPAVMQESPLGSLLTFAEHKGYALA AMCEILGGALSGGKTTHQETLQTSPDAILNCMTTIIINPELFGAPDCNAQTEAFAEWVKA SPHDDDKPILLPGEWEVNTRRERQKQGIPLDAGSWQAICDAARQIGMPEETLQAFCQQLA S >gi|296493133|gb|ADTK01000368.1| GENE 5 5374 - 5634 301 86 aa, chain - ## HITS:1 COG:no KEGG:EC55989_0846 NR:ns ## KEGG: EC55989_0846 # Name: ybiJ # Def: hypothetical protein # Organism: 
E.coli_55989 # Pathway: not_defined # 1 86 1 86 86 82 100.0 3e-15 MKTINTVVAAMALSTLSFGVFAAEPVTASQAQNMNKIGVVSADGASTLDALEAKLAEKAA AAGASGYSITSATNNNKLSGTAVIYK >gi|296493133|gb|ADTK01000368.1| GENE 6 5899 - 6165 292 88 aa, chain - ## HITS:1 COG:ybiI KEGG:ns NR:ns ## COG: ybiI COG1734 # Protein_GI_number: 16128771 # Func_class: T Signal transduction mechanisms # Function: DnaK suppressor protein # Organism: Escherichia coli K12 # 1 88 1 88 88 158 100.0 3e-39 MASGWANDDAVNEQINSTIEDAIARARGEIPRGESLDECEECGAPIPQARREAIPGVRLC IHCQQEKDLQKPAYTGYNRRGSKDSQLR >gi|296493133|gb|ADTK01000368.1| GENE 7 6239 - 6916 701 225 aa, chain - ## HITS:1 COG:ybiX KEGG:ns NR:ns ## COG: ybiX COG3128 # Protein_GI_number: 16128772 # Func_class: S Function unknown # Function: Uncharacterized iron-regulated protein # Organism: Escherichia coli K12 # 1 225 13 237 237 459 100.0 1e-129 MMYHIPGVLSPQDVARFREQLEQAEWVDGRVTTGAQGAQVKNNQQVDTRSTLYAALQNEV LNAVNQHALFFAAALPRTLSTPLFNRYQNNETYGFHVDGAVRSHPQNGWMRTDLSATLFL SDPQSYDGGELVVNDTFGQHRVKLPAGDLVLYPSSSLHCVTPVTRGVRVASFMWIQSMIR DDKKRAMLFELDNNIQSLKSRYGESEEILSLLNLYHNLLREWSEI >gi|296493133|gb|ADTK01000368.1| GENE 8 6958 - 9240 1993 760 aa, chain - ## HITS:1 COG:fiu KEGG:ns NR:ns ## COG: fiu COG4774 # Protein_GI_number: 16128773 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for monomeric catechols # Organism: Escherichia coli K12 # 1 760 1 760 760 1396 100.0 0 MENNRNFPARQFHSLTFFAGLCIGITPVAQALAAEGQTNADDTLVVEASTPSLYAPQQSA DPKFSRPVADTTRTMTVISEQVIKDQGATNLTDALKNVPGVGAFFAGENGNSTTGDAIYM RGADTSNSIYIDGIRDIGSVSRDTFNTEQVEVIKGPSGTDYGRSAPTGSINMISKQPRND SGIDASASIGSAWFRRGTLDVNQVIGDTTAVRLNVMGEKTHDAGRDKVKNERYGVAPSVA FGLGTANRLYLNYLHVTQHNTPDGGIPTIGLPGYSAPSAGTAALNHSGKVDTHNFYGTDS DYDDSTTDTATMRFEHDINDNTTIRNTTRWSRVKQDYLMTAIMGGASNITQPTSDVNSWT WSRTANTKDVSNKILTNQTNLTSTFYTGSIGHDVSTGVEFTRETQTNYGVNPVTLPAVNI YHPDSSIHPGGLTRNGANANGQTDTFAIYAFDTLQITRDFELNGGIRLDNYHTEYDSATA CGGSGRGAITCPTGVAKGSPVTTVDTAKSGNLMNWKAGALYHLTENGNVYINYAVSQQPP 
GGNNFALAQSGSGNSANRTDFKPQKANTSEIGTKWQVLDKRLLLTAALFRTDIENEVEQN DDGTYSQYGKKRVEGYEISVAGNITPAWQVIGGYTQQKATIKNGKDVAQDGSSSLPYTPE HAFTLWSQYQATDDISVGAGARYIGSMHKGSDGAVGTPAFTEGYWVADAKLGYRVNRNLD FQLNVYNLFDTDYVASINKSGYRYHPGEPRTFLLTANMHF >gi|296493133|gb|ADTK01000368.1| GENE 9 9505 - 9765 247 86 aa, chain - ## HITS:1 COG:no KEGG:ECIAI1_0844 NR:ns ## KEGG: ECIAI1_0844 # Name: ybiM # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 86 49 134 134 163 100.0 1e-39 MKKCLTLLIATVLSGISLTAYAAQPMSNLDSGQLRPAGTVSATGASNLSDLEDKLAEKAR EQGAKGYVINSAGGNDQMFGTATIYK >gi|296493133|gb|ADTK01000368.1| GENE 10 10041 - 10967 788 308 aa, chain + ## HITS:1 COG:ybiN KEGG:ns NR:ns ## COG: ybiN COG3129 # Protein_GI_number: 16128775 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferase # Organism: Escherichia coli K12 # 1 308 28 335 335 638 99.0 0 MSAQKPGLHPRNRHHSCYDLATLCQVNPELRQFLTLTPAGEQSVDFANPLAVKALNKALL AHFYAVANWDIPDGFLCPPVPGRADYIHHLADLLAEASGTIPANASILDIGVGANCIYPL IGVHEYGWRFTGSETSSQALSSAQAIISSNPGLNRAIRLRRQKESGAIFNGIIHKNEQYD ATLCNPPFHDSAAAARAGSERKRRNLGLNKDDALNFGGQQQELWCEGGEVTFIKKMIEES KGFAKQVMWFTSLVSRGENLPPLYRALTDVGAVKVVKKEMAQGQKQSRFIAWTFMNDEQR RRFVNRQR >gi|296493133|gb|ADTK01000368.1| GENE 11 10964 - 13189 1995 741 aa, chain - ## HITS:1 COG:ECs0886_2 KEGG:ns NR:ns ## COG: ECs0886_2 COG0668 # Protein_GI_number: 15830140 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Escherichia coli O157:H7 # 431 741 1 311 311 582 100.0 1e-166 MRWILFILFCLLGAPAHAVSIPGVTTTTTTDSTTEPAPEPDIEQKKAAYGALADVLDNDT SRKELIDQLRTVAATPPAEPVPKIVPPTLVEEQTVLQKVTEVSRHYGEALSARFGQLYRN ITGSPHKPFNPQTFSNALTHFSMLAVLVFGFYWLIRLCALPLYRKMGQWARQKNRERSNW LQLPAMIIGAFIIDLLLLALTLFVGQVLSDNLNAGSRTIAFQQSLFLNAFALIEFFKAVL RLIFCPNVAELRPFTIQDESARYWSRRLSWLSSLIGYGLIVAVPIISNQVNVQIGALANV IIMLCMTVWALYLIFRNKKEITQHLLNFAEHSLAFFSLFIRAFALVWHWLASAYFIVLFF FSLFDPGNSLKFMMGATVRSLAIIGIAAFVSGMFSRWLAKTITLSPHTQRNYPELQKRLN 
GWLSAALKTARILTVCVAVMLLLSAWGLFDFWNWLQNGAGQKTVDILIRIALILFFSAVG WTVLASLIENRLASDIHGRPLPSARTRTLLTLFRNALAVIISTITIMIVLSEIGVNIAPL LAGAGALGLAISFGSQTLVKDIITGVFIQFENGMNTGDLVTIGPLTGTVERMSIRSVGVR QDTGAYHIIPWSSITTFANFVRGIGSVVANYDVDRHEDADKANQALKDAVAELMENEEIR GLIIGEPNFAGIVGLSNTAFTLRVSFTTLPLKQWTVRFALDSQVKKHFDLAGVRAPVQTY QVLPAPGATPAEPLPPGEPTL >gi|296493133|gb|ADTK01000368.1| GENE 12 13450 - 14172 595 240 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 239 1 242 245 233 47 7e-61 MIEFKNVSKHFGPTQVLHNIDLNIAQGEVVVIIGPSGSGKSTLLRCINKLEEITSGDLIV DGLKVNDPKVDERLIRQEAGMVFQQFYLFPHLTALENVMFGPLRVRGANKEEAEKLAREL LAKVGLAERAHHYPSELSGGQQQRVAIARALAVKPKMMLFDEPTSALDPELRHEVLKVMQ DLAEEGMTMVIVTHEIGFAEKVASRLIFIDKGRIAEDGNPQVLIKNPPSQRLQEFLQHVS >gi|296493133|gb|ADTK01000368.1| GENE 13 14169 - 14828 915 219 aa, chain - ## HITS:1 COG:ECs0888 KEGG:ns NR:ns ## COG: ECs0888 COG0765 # Protein_GI_number: 15830142 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Escherichia coli O157:H7 # 1 219 1 219 219 373 100.0 1e-103 MQFDWSAIWPAIPLLIEGAKMTLWISVLGLAGGLVIGLLAGFARTFGGWIANHVALVFIE VIRGTPIVVQVMFIYFALPMAFNDLRIDPFTAAVVTIMINSGAYIAEITRGAVLSIHKGF REAGLALGLSRWETIRYVILPLALRRMLPPLGNQWIISIKDTSLFIVIGVAELTRQGQEI IAGNFRALEIWSAVAVFYLIITLVLSFILRRLERRMKIL >gi|296493133|gb|ADTK01000368.1| GENE 14 14967 - 15713 992 248 aa, chain - ## HITS:1 COG:ECs0889 KEGG:ns NR:ns ## COG: ECs0889 COG0834 # Protein_GI_number: 15830143 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Escherichia coli O157:H7 # 1 248 1 248 248 462 100.0 1e-130 MKSVLKVSLAALTLAFAVSSHAADKKLVVATDTAFVPFEFKQGDKYVGFDVDLWAAIAKE LKLDYELKPMDFSGIIPALQTKNVDLALAGITITDERKKAIDFSDGYYKSGLLVMVKANN NDVKSVKDLDGKVVAVKSGTGSVDYAKANIKTKDLRQFPNIDNAYMELGTNRADAVLHDT 
PNILYFIKTAGNGQFKAVGDSLEAQQYGIAFPKGSDELRDKVNGALKTLRENGTYNEIYK KWFGTEPK >gi|296493133|gb|ADTK01000368.1| GENE 15 16117 - 16620 606 167 aa, chain - ## HITS:1 COG:ECs0890 KEGG:ns NR:ns ## COG: ECs0890 COG0783 # Protein_GI_number: 15830144 # Func_class: P Inorganic ion transport and metabolism # Function: DNA-binding ferritin-like protein (oxidative damage protectant) # Organism: Escherichia coli O157:H7 # 1 167 1 167 167 305 100.0 3e-83 MSTAKLVKSKATNLLYTRNDVSDSEKKATVELLNRQVIQFIDLSLITKQAHWNMRGANFI AVHEMLDGFRTALIDHLDTMAERAVQLGGVALGTTQVINSKTPLKSYPLDIHNVQDHLKE LADRYAIVANDVRKAIGEAKDDDTADILTAASRDLDKFLWFIESNIE >gi|296493133|gb|ADTK01000368.1| GENE 16 16919 - 17806 891 295 aa, chain - ## HITS:1 COG:ECs0891 KEGG:ns NR:ns ## COG: ECs0891 COG5006 # Protein_GI_number: 15830145 # Func_class: R General function prediction only # Function: Predicted permease, DMT superfamily # Organism: Escherichia coli O157:H7 # 1 295 1 295 295 481 100.0 1e-135 MPGSLRKMPVWLPIVILLVAMASIQGGASLAKSLFPLVGAPGVTALRLALGTLILIAFFK PWRLRFAKEQRLPLLFYGVSLGGMNYLFYLSIQTVPLGIAVALEFTGPLAVALFSSRRPV DFVWVVLAVLGLWFLLPLGQDVSHVDLTGCALALGAGACWAIYILSGQRAGAEHGPATVA IGSLIAALIFVPIGALQAGEALWHWSVIPLGLAVAILSTALPYSLEMIALTRLPTRTFGT LMSMEPALAAVSGMIFLGETLTPIQLLALGAIIAASMGSTLTVRKESKIKELDIN >gi|296493133|gb|ADTK01000368.1| GENE 17 18159 - 18674 530 171 aa, chain + ## HITS:1 COG:ECs0892 KEGG:ns NR:ns ## COG: ECs0892 COG3637 # Protein_GI_number: 15830146 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Opacity protein and related surface antigens # Organism: Escherichia coli O157:H7 # 1 171 1 171 171 301 100.0 4e-82 MKKIACLSALAAVLAFTAGTSVAATSTVTGGYAQSDAQGQMNKMGGFNLKYRYEEDNSPL GVIGSFTYTEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGVGYGKFQTTEYPT YKHDTSDYGFSYGAGLQFNPMENVALDFSYEQSRIRSVDVGTWIAGVGYRF >gi|296493133|gb|ADTK01000368.1| GENE 18 18723 - 20306 1406 527 aa, chain - ## HITS:1 COG:ybiP KEGG:ns NR:ns ## COG: ybiP COG2194 # Protein_GI_number: 16128783 # Func_class: R General function prediction only # Function: Predicted 
membrane-associated, metal-dependent hydrolase # Organism: Escherichia coli K12 # 1 527 1 527 527 1073 100.0 0 MNLTLKESLVTRSRVFSPWTAFYFLQSLLINLGLGYPFSLLYTAAFTAILLLLWRTLPRV QKVLVGVSSLVAACYFPFAQAYGAPNFNTLLALHSTNMEESTEILTIFPWYSYLVGLFIF ALGVIAIRRKKENEKARWNTFDSLCLVFSVATFFVAPVQNLAWGGVFKLKDTGYPVFRFA KDVIVNNNEVIEEQERMAKLSGMKDTWTVTAVKPKYQTYVVVIGESARRDALGAFGGHWD NTPFASSVNGLIFADYIAASGSTQKSLGLTLNRVVDGKPQFQDNFVTLANRAGFQTWWFS NQGQIGEYDTAIASIAKRADEVYFLKEGNFEADKNTKDEALLDMTAQVLAQEHSQPQLIV LHLMGSHPQACDRTQGKYETFVQSKETSCYLYTMTQTDDLLRKLYDQLRNSGSSFSLVYF SDHGLAFKERGKDVQYLAHDDKYQQNFQVPFMVISSDDKAHRVIKARRSANDFLGFFSQW TGIKAKEINIKYPFISEKKAGPIYITNFQLQKVDYNHLGTDIFDPKP >gi|296493133|gb|ADTK01000368.1| GENE 19 20892 - 21359 518 155 aa, chain + ## HITS:1 COG:ybiQ KEGG:ns NR:ns ## COG: ybiQ COG1321 # Protein_GI_number: 16128785 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Escherichia coli K12 # 1 155 1 155 155 268 100.0 2e-72 MSRRAGTPTAKKVTQLVNVEEHVEGFRQVREAHRRELIDDYVELISDLIREVGEARQVDM AARLGVSQPTVAKMLKRLATMGLIEMIPWRGVFLTAEGEKLAQESRERHQIVENFLLVLG VSPEIARRDAEGMEHHVSEETLDAFRLFTQKHGAK >gi|296493133|gb|ADTK01000368.1| GENE 20 21356 - 22474 928 372 aa, chain + ## HITS:1 COG:ybiR KEGG:ns NR:ns ## COG: ybiR COG0471 # Protein_GI_number: 16128786 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Escherichia coli K12 # 1 372 1 372 372 617 99.0 1e-176 MSLPFLRTLQGDRFFQLLILVGIGLSFFVPFAPKSWPAAIDWHTIITLSGLMLLTKGVEL SGYFDVLGRKMVRRFATERRLAMFMVLAAALLSTFLTNDVALFIVVPLTITLKRLCEIPV NRLIIFEALAVNAGSLLTPIGNPQNILIWGRSGLSFAGFIAQMAPLAGAMMLTLLLLCWC CFPGKALQYHTGVQTPEWKPRLVWSCLGLYIVFLTALEFKQELWGLVIVAAGFALLARRV VLSVDWTLLLVFMAMFIDVHLLTQLPALQGVLGNVSHLSEPGLWLTAIGLSQVISNVPST ILLLNYVPPSLLLVWAVNVGGFGLLPGSLANLIALRMANDRRIWWRFHLYSIPMLLWAAL VGYVLLVILPAN >gi|296493133|gb|ADTK01000368.1| GENE 21 22533 - 23453 1066 306 aa, chain - ## HITS:1 COG:ECs0896 KEGG:ns NR:ns ## COG: ECs0896 COG1376 # Protein_GI_number: 15830150 # Func_class: 
S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 306 1 306 306 599 100.0 1e-171 MNMKLKTLFAAAFAVVGFCSTASAVTYPLPTDGSRLVGQNQVITIPEGNTQPLEYFAAEY QMGLSNMMEANPGVDTFLPKGGTVLNIPQQLILPDTVHEGIVINSAEMRLYYYPKGTNTV IVLPIGIGQLGKDTPINWTTKVERKKAGPTWTPTAKMHAEYRAAGEPLPAVVPAGPDNPM GLYALYIGRLYAIHGTNANFGIGLRVSHGCVRLRNEDIKFLFEKVPVGTRVQFIDEPVKA TTEPDGSRYIEVHNPLSTTEAQFEGQEIVPITLTKSVQTVTGQPDVDQVVLDEAIKNRSG MPVRLN >gi|296493133|gb|ADTK01000368.1| GENE 22 23672 - 25264 1855 530 aa, chain + ## HITS:1 COG:ECs0897 KEGG:ns NR:ns ## COG: ECs0897 COG0488 # Protein_GI_number: 15830151 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Escherichia coli O157:H7 # 1 530 1 530 530 1051 99.0 0 MLVSSNVTMQFGSKPLFENISVKFGGGNRYGLIGANGSGKSTFMKILGGDLEPTLGNVSL DPNERIGKLRQDQFAFEEFTVLDTVIMGHKELWEVKQERDRIYALPEMSEEDGYKVADLE VKYGEMDGYSAEARAGELLLGVGIPVEQHYGPMSEVAPGWKLRVLLAQALFADPDILLLD EPTNNLDIDTIRWLEQVLNERDSTMIIISHDRHFLNMVCTHMADLDYGELRVYPGNYDEY MTAATQARERLLADNAKKKAQIAELQSFVSRFSANASKSRQATSRARQIDKIKLEEVKAS SRQNPFIRFEQDKKLFRNALEVEGLTKGFDNGPLFKNLNLLLEVGEKLAVLGTNGVGKST LLKTLVGDLQPDSGTVKWSENARIGYYAQDHEYEFENDLTVFEWMSQWKQEGDDEQAVRS ILGRLLFSQDDIKKPAKVLSGGEKGRMLFGKLMMQKPNILIMDEPTNHLDMESIESLNME LELYQGTLIFVSHDREFVSSLATRILEITPERVIDFSGNYEDYLRSKGIE Prediction of potential genes in microbial genomes Time: Mon May 16 16:15:55 2011 Seq name: gi|296493132|gb|ADTK01000369.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1165.7, whole genome shotgun sequence Length of sequence - 1578 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 64 - 101 0.1 1 1 Tu 1 . 
- CDS 178 - 1443 1053 ## B21_00805 hypothetical protein - Prom 1487 - 1546 3.9 Predicted protein(s) >gi|296493132|gb|ADTK01000369.1| GENE 1 178 - 1443 1053 421 aa, chain - ## HITS:1 COG:no KEGG:B21_00805 NR:ns ## KEGG: B21_00805 # Name: ybiU # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 421 1 421 421 872 100.0 0 MASTFTSDTLPADHKAAIRQMKHALRAQLGDVQQIFNQLSDDIATRVAEINALKAQGDAV WPVLSYADIKAGHVTAEQREQIKRRGCAVIKGHFPREQALGWDQSMLDYLDRNRFDEVYK GPGDNFFGTLSASRPEIYPIYWSQAQMQARQSEEMANAQSFLNRLWTFESDGKQWFNPDV SVIYPDRIRRRPPGTTSKGLGAHTDSGALERWLLPAYQRVFANVFNGNLAQYDPWHAAHR TEVEEYTVDNTTKCSVFRTFQGWTALSDMLPGQGLLHVVPIPEAMAYVLLRPLLDDVPED ELCGVAPGRVLPVSEQWHPLLIEALTSIPKLEAGDSVWWHCDVIHSVAPVENQQGWGNVM YIPAAPMCEKNLAYAHKVKAALEKGASPGDFPREDYETNWEGRFTLADLNIHGKRALGMD V Prediction of potential genes in microbial genomes Time: Mon May 16 16:16:16 2011 Seq name: gi|296493131|gb|ADTK01000370.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1165.8, whole genome shotgun sequence Length of sequence - 52986 bp Number of predicted genes - 61, with homology - 60 Number of transcription units - 25, operones - 13 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 31 - 846 868 ## COG0561 Predicted hydrolases of the HAD superfamily - Prom 904 - 963 3.8 2 2 Op 1 11/0.000 - CDS 992 - 3424 2738 ## COG1882 Pyruvate-formate lyase 3 2 Op 2 . - CDS 3430 - 4329 989 ## COG1180 Pyruvate-formate lyase-activating enzyme - Prom 4414 - 4473 3.6 + Prom 4310 - 4369 3.0 4 3 Tu 1 . + CDS 4454 - 5122 836 ## COG0176 Transaldolase + Term 5155 - 5187 3.8 5 4 Op 1 9/0.000 - CDS 5198 - 5947 789 ## COG0476 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 6 4 Op 2 . 
- CDS 5947 - 7182 1195 ## COG0303 Molybdopterin biosynthesis enzyme - Prom 7319 - 7378 2.9 + Prom 7259 - 7318 7.5 7 5 Op 1 3/0.364 + CDS 7386 - 8351 1047 ## COG1446 Asparaginase 8 5 Op 2 11/0.000 + CDS 8338 - 10209 845 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 9 5 Op 3 38/0.000 + CDS 10229 - 11767 1657 ## COG0747 ABC-type dipeptide transport system, periplasmic component 10 5 Op 4 49/0.000 + CDS 11785 - 12705 995 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 11 5 Op 5 3/0.364 + CDS 12750 - 13619 796 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components + Term 13694 - 13728 -1.0 + Prom 13658 - 13717 5.6 12 6 Op 1 6/0.000 + CDS 13797 - 16145 1656 ## COG2200 FOG: EAL domain 13 6 Op 2 . + CDS 16153 - 17481 960 ## COG2199 FOG: GGDEF domain + Term 17492 - 17535 7.7 14 7 Tu 1 . - CDS 17528 - 18853 1844 ## PROTEIN SUPPORTED gi|183179613|ref|ZP_02957824.1| conserved hypothetical protein - Prom 18888 - 18947 4.9 + Prom 18942 - 19001 4.5 15 8 Tu 1 . + CDS 19066 - 19449 267 ## EcE24377A_0907 biofilm formation regulatory protein BssR + Term 19454 - 19493 5.0 + Prom 19480 - 19539 4.4 16 9 Tu 1 . + CDS 19704 - 20675 909 ## COG2133 Glucose/sorbosone dehydrogenases 17 10 Tu 1 . - CDS 20672 - 21298 658 ## COG0625 Glutathione S-transferase - Prom 21486 - 21545 3.8 + Prom 21445 - 21504 5.1 18 11 Tu 1 . + CDS 21524 - 22747 1215 ## COG1686 D-alanyl-D-alanine carboxypeptidase + Term 22759 - 22795 6.3 19 12 Op 1 4/0.273 - CDS 22794 - 23552 660 ## COG1349 Transcriptional regulators of sugar metabolism 20 12 Op 2 . - CDS 23610 - 24206 572 ## COG0671 Membrane-associated phospholipid phosphatase + Prom 24329 - 24388 8.4 21 13 Tu 1 . + CDS 24491 - 25723 1117 ## COG0477 Permeases of the major facilitator superfamily + Term 25729 - 25767 2.2 - Term 25714 - 25758 5.2 22 14 Op 1 . 
- CDS 25764 - 26042 223 ## SbBS512_E2502 hypothetical protein 23 14 Op 2 2/0.455 - CDS 26134 - 26949 899 ## COG0561 Predicted hydrolases of the HAD superfamily 24 14 Op 3 . - CDS 26949 - 28076 1117 ## COG0477 Permeases of the major facilitator superfamily - Prom 28161 - 28220 3.6 + Prom 28120 - 28179 2.0 25 15 Tu 1 . + CDS 28241 - 28777 496 ## COG3226 Uncharacterized protein conserved in bacteria + Term 28865 - 28923 1.2 26 16 Op 1 12/0.000 - CDS 28879 - 29769 453 ## COG0582 Integrase 27 16 Op 2 . - CDS 29782 - 29934 84 ## COG0582 Integrase - Prom 29955 - 30014 6.7 28 17 Op 1 . - CDS 30026 - 30577 62 ## DMR_16080 hypothetical protein 29 17 Op 2 . - CDS 30574 - 31773 331 ## COG4637 Predicted ATPase 30 17 Op 3 . - CDS 31782 - 32675 55 ## COG2932 Predicted transcriptional regulator + Prom 32664 - 32723 2.8 31 18 Op 1 . + CDS 32800 - 33021 191 ## Dd586_0731 hypothetical protein 32 18 Op 2 . + CDS 33054 - 33563 458 ## c0936 hypothetical protein 33 18 Op 3 . + CDS 33571 - 33771 98 ## ECB_00820 hypothetical protein 34 18 Op 4 . + CDS 33735 - 34076 246 ## E2348C_0805 hypothetical protein 35 18 Op 5 . + CDS 34144 - 34377 358 ## t3407 hypothetical protein 36 18 Op 6 1/0.636 + CDS 34377 - 34604 211 ## COG1734 DnaK suppressor protein 37 18 Op 7 . + CDS 34601 - 35458 593 ## COG0338 Site-specific DNA methylase 38 18 Op 8 . + CDS 35455 - 37869 1412 ## c0942 putative replication protein for prophage + Term 37873 - 37903 -0.5 + Prom 37913 - 37972 4.8 39 19 Op 1 . + CDS 38022 - 38210 115 ## ECS88_2839 hypothetical protein 40 19 Op 2 . + CDS 38221 - 38454 315 ## ECS88_2838 putative DinI-like damage-inducible protein + Term 38525 - 38568 6.7 41 20 Tu 1 . - CDS 38647 - 38982 331 ## B21_00837 hypothetical protein - Prom 39002 - 39061 3.1 + Prom 39728 - 39787 1.7 42 21 Tu 1 . + CDS 39815 - 39970 95 ## + Prom 39994 - 40053 5.9 43 22 Tu 1 . 
+ CDS 40091 - 41050 255 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake + Term 41054 - 41080 0.3 - Term 41040 - 41070 2.7 44 23 Op 1 . - CDS 41075 - 42100 611 ## COG5518 Bacteriophage capsid portal protein 45 23 Op 2 . - CDS 42100 - 43866 1375 ## COG5484 Uncharacterized conserved protein - Prom 43982 - 44041 1.7 46 24 Op 1 . + CDS 44009 - 44842 900 ## EcHS_A0925 phage capsid scaffolding protein 47 24 Op 2 . + CDS 44859 - 45917 1241 ## SeD_A3048 phage major capsid protein, P2 family 48 24 Op 3 . + CDS 45921 - 46571 692 ## ECED1_3086 terminase, endonuclease subunit (GpM) 49 24 Op 4 . + CDS 46667 - 47131 437 ## ECED1_3085 head completion/stabilization protein (GpL) 50 24 Op 5 . + CDS 47131 - 47334 187 ## COG5004 P2-like prophage tail protein X 51 24 Op 6 . + CDS 47338 - 47553 258 ## ECS88_2829 putative secretory protein 52 24 Op 7 . + CDS 47534 - 48046 285 ## COG3772 Phage-related lysozyme (muraminidase) 53 24 Op 8 . + CDS 48048 - 48425 346 ## ECIAI39_1807 hypothetical protein 54 24 Op 9 . + CDS 48422 - 48850 218 ## ECS88_2826 putative regulatory protein 55 24 Op 10 . + CDS 48946 - 49377 396 ## SeD_A3040 phage tail completion protein R 56 25 Op 1 . + CDS 49499 - 49816 243 ## t3428 putative phage tail protein 57 25 Op 2 4/0.273 + CDS 49885 - 50463 490 ## COG4540 Phage P2 baseplate assembly protein gpV 58 25 Op 3 4/0.273 + CDS 50460 - 50819 375 ## COG3628 Phage baseplate assembly protein W 59 25 Op 4 4/0.273 + CDS 50806 - 51714 919 ## COG3948 Phage-related baseplate assembly protein 60 25 Op 5 3/0.364 + CDS 51707 - 52312 479 ## COG4385 Bacteriophage P2-related tail formation protein 61 25 Op 6 . 
+ CDS 52309 - 52984 608 ## COG5301 Phage-related tail fibre protein Predicted protein(s) >gi|296493131|gb|ADTK01000370.1| GENE 1 31 - 846 868 271 aa, chain - ## HITS:1 COG:ybiV KEGG:ns NR:ns ## COG: ybiV COG0561 # Protein_GI_number: 16128790 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Escherichia coli K12 # 1 271 1 271 271 539 98.0 1e-153 MSVKVIVTDMDGTFLNDAKTYNQPHFMAQYQELKKRGIKFVVASGNQYYQLISFFPELKD EISFVAENGALVYEHGKQLFHGELTRHESRIVIGELLKDKQLNFVACGLQSAYVSENAPE AFVALMAKHYHRLKAVKDYQEIDDVLFKFSLNLPDEQIPLVIDKLHIALDGIMKPVTSGF GFIDLIIPGLHKANGISRLLKRWDLSPQNVVAIGDSGNDAEMLKMARYSFAMGNAAENIK QIARYATDDNNHEGALNVIQAVLDNTSPFNS >gi|296493131|gb|ADTK01000370.1| GENE 2 992 - 3424 2738 810 aa, chain - ## HITS:1 COG:ybiW KEGG:ns NR:ns ## COG: ybiW COG1882 # Protein_GI_number: 16128791 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli K12 # 1 810 1 810 810 1694 99.0 0 MTTLKLDTLSDRIKAHKNALVHIVKPPVCTERAQHYTEMYQQHLDKPIPVRRALALAHHL ANRTIWIKHDELIIGNQASEVRAAPIFPEYTVSWIEKEIDDLADRPGAGFAVSEENKRVL HEVCPWWRGQTVQDRCYGMFTDEQKGLLATGIIKAEGNMTSGDAHLAVNFPLLLEKGLDG LREKVAERRSRINLTVLEDLHGEQFLKAIDIVLVAVSEHIERFAALAREMAATETRESRR DELLTMAENCDLIAHQPPQTFWQALQLCYFIQLILQIESNGHSVSFGRMDQYLYPYYRRD VELNQTLDREHAIEMLHSCWLKLLEVNKIRSGSHSKASAGSPLYQNVTIGGQNLVDGQPM DAVNPLSYAILESCGRLRSTQPNLSVRYHAGMSNDFLDACVQVIRCGFGMPAFNNDEIVI PEFIKLGIEPQDAYDYAAIGCIETAVGGKWGYRCTGMSFINFARVMLAALEGGRDATSGK VFLPQEKALSAGNFNNFDEVMDAWDTQIRYYTRKSIEIEYVVDTMLEENVHDILCSALVD DCIERAKSIKQGGAKYDWVSGLQVGIANLGNSLAAVKKLVFEQGAIGQQQLAAALADDFD GLTHEQLRQRLINGAPKYGNDDDTVDTLLARAYQTYIDELKQYHNPRYGRGPVGGNYYAG TSSISANVPFGAQTMATPDGRKAHTPLAEGASPASGTDHLGPTAVIGSVGKLPTAAILGG VLLNQKLNPATLENESDKQKLMILLRTFFEVHKGWHIQYNIVSRETLLEAKKHPDQYRDL VVRVAGYSAFFTALSPDAQDDIIARTEHML >gi|296493131|gb|ADTK01000370.1| GENE 3 3430 - 4329 989 299 aa, chain - ## HITS:1 COG:ybiY KEGG:ns NR:ns ## COG: ybiY COG1180 # Protein_GI_number: 16128792 # Func_class: O 
Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Escherichia coli K12 # 1 299 10 308 308 600 100.0 1e-171 MIFNIQRYSTHDGPGIRTVVFLKGCSLGCRWCQNPESRARTQDLLYDARLCLEGCELCAK AAPEVIERALNGLLIHREKLTPEHLTALTDCCPTQALTVCGEVKSVEEIMTTVLRDKPFY DRSGGGLTLSGGEPFMQPEMAMALLQASHEAGIHTAVETCLHVPWKYIAPSLPYIDLFLA DLKHVADAPFKQWTDGNAARVLDNLKKLAAAGKKIIIRVPLIQGFNADETSVKAITDFAA DELHVGEIHFLPYHTLGINKYHLLNLPYDAPEKPLDAPELLDFAQQYACQKGLTATLRG >gi|296493131|gb|ADTK01000370.1| GENE 4 4454 - 5122 836 222 aa, chain + ## HITS:1 COG:mipB KEGG:ns NR:ns ## COG: mipB COG0176 # Protein_GI_number: 16128793 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Escherichia coli K12 # 1 222 23 244 244 395 98.0 1e-110 MVMELYLDTSDVVAVKALSRIFPLAGVTTNPSIIAAGKKPLDVVLPQLHEAMGGQGRLFA QVMATTAEGMVNDARKLRSIIADIVVKVPVTAEGLAAIKMLKAEGIPTLGTAVYGAAQGL LSALAGAEYVAPYVNRIDAQGGSGIQTVTDLHQLLKMHAPQAKVLAASFKTPRQALDCLL AGCESITLPLDVAQQMISYPAVEAAVAKFEQDWLGAFGRTSI >gi|296493131|gb|ADTK01000370.1| GENE 5 5198 - 5947 789 249 aa, chain - ## HITS:1 COG:moeB KEGG:ns NR:ns ## COG: moeB COG0476 # Protein_GI_number: 16128794 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 # Organism: Escherichia coli K12 # 1 249 1 249 249 485 97.0 1e-137 MAELSDQEMLRYNRQIILRGFDFDGQEALKDSRVLVVGLGGLGCAASQYLASAGVGNLTL LDFDTVSLSNLQRQTLHSDATVGQPKVESARDAMTRINPHIAITPVNALLDDAELAAMIA KHDLVLDCTDNVAVRNQLNAGCFAAKVPLVSGAAIRMEGQITVFTYQDGEPCYRCLSRLF GENALTCVEAGVMAPLIGVIGSLQAMEAIKLLAGYGKPASGKIVMYDAMTCQFREMKLMR NPGCEVCGQ >gi|296493131|gb|ADTK01000370.1| GENE 6 5947 - 7182 1195 411 aa, chain - ## HITS:1 COG:moeA KEGG:ns NR:ns ## COG: moeA COG0303 # Protein_GI_number: 16128795 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin biosynthesis enzyme # Organism: Escherichia coli K12 # 1 411 1 411 411 790 99.0 0 
MEFTTGLMSLDTALNEMLSRVTPLTAQETLPLVQCFGRILASDVVSPLDVPGFDNSAMDG YAVRLADIASRQPLPVAGKSFAGQPYHGEWPAGTCIRIMTGAPVPEGCEAVVMQEQTEQT DNGVRFTAEVRSGQNIRRRGEDISAGAVVFPAGTRLTTAELPVIASLGIAEVPVIRKVRV ALFSTGDELQLPGQPLGDGQIYDTNRLAVHLMLEQLGCEVINLGIIRDDPHALRAAFIEA DSQADVVISSGGVSVGEADYTKTILEELGEIAFWKLAIKPGKPFAFGKLSNSWFCGLPGN PVSATLTFYQLVQPLLAKLSGNTASGLPARQRVRTASRLKKTPGRLDFQRGVLQRNADGE LEVTTTGHQGSHIFSSFSLGNCFIVLERDRGNVDVGEWVEVEPFNALFGGL >gi|296493131|gb|ADTK01000370.1| GENE 7 7386 - 8351 1047 321 aa, chain + ## HITS:1 COG:ybiK KEGG:ns NR:ns ## COG: ybiK COG1446 # Protein_GI_number: 16128796 # Func_class: E Amino acid transport and metabolism # Function: Asparaginase # Organism: Escherichia coli K12 # 1 321 1 321 321 581 99.0 1e-166 MGKAVIAIHGGAGAISRAQMSLQQELRYIEALSAIVETGQKMLEAGESALDVVTEAVRLL EECPLFNAGIGAVFTRDETHELDACVMDGNTLKAGAVAGVSHLRNPVLAARLVMEQSPHV MMIGEGAENFAFARGMERVSPEIFSTPLRYEQLLAARKEGATVLDHSGAPLDEKQKMGTV GAVALDLDGNLAAATSTGGMTNKLPGRVGDSPLVGAGCYANNASVAVSCTGTGEVFIRAL AAYDIAALMDYGGLSLAEACERVVMEKLPALGGSGGLIAIDHEGNVALPFNTEGMYRAWG YAGDTPTTGIYREKGDTVATQ >gi|296493131|gb|ADTK01000370.1| GENE 8 8338 - 10209 845 623 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 12 572 8 528 563 330 35 1e-89 MPHSDELDAGNVLAVENLNIAFMQDQQKIAAVRNLSFSLQRGETLAIVGESGSGKSVTAL ALMRLLEQAGGLVQCDKMLLQRRSREVIELSEQSAAQMRHVRGADMAMIFQEPMTSLNPV FTVGEQIAESIRLHQNASREEAMVEAKRMLDQVRIPEAQTILSRYPHQLSGGMRQRVMIA MALSCRPAVLIADEPTTALDVTIQAQILQLIKVLQKEMSMGVIFITHDMGVVAEIADRVL VMYQGEAVETGTVEQIFHAPQHPYTRALLAAVPQLGAMKGLDYPRRFPLISLEHPAKQAP PIEQKTVVDGEPVLRVRNLVTRFPLRSGLLNRVTREVHAVEKVSFDLWPGETLSLVGESG SGKSTTGRALLRLVESQGGEIIFNGQRIDTLSPGKLQALRRDIQFIFQDPYASLDPRQTI GDSIIEPLRVHGLLPGKEAAARVAWLLERVGLLPEHAWRYPHEFSGGQRQRICIARALAL NPKVIIADEAVSALDVSIRGQIINLLLDLQRDFGIAYLFISHDMAVVERISHRVAVMYLG QIVEIGPRRAVFENPQHPYTRKLLAAVPVAEPSRQRPQRVLLSDDLPSNIHLRGEEVAAV SLQCVGPGHYVAQPQSEYAFMRR >gi|296493131|gb|ADTK01000370.1| GENE 9 10229 - 11767 1657 512 aa, chain + ## HITS:1 COG:ECs0909 KEGG:ns NR:ns ## COG: ECs0909 COG0747 # Protein_GI_number: 15830163 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 1 512 1 512 512 992 99.0 0 MARAVHRSGLVALGIATALMASCAFAAKDVVVAVGSNFTTLDPYDANDTLSQAVAKSFYQ GLFGLDKEMKLKNVLAESYTVSDDGITYTVKLREGIKFQDGTDFNAAAVKANLDRASDPA NHLKRYNLYKNIAKTEAIDPTTVKITLKQPFSAFINILAHPATAMISPAALEKYGKEIGF HPVGTGPYELDTWNQTDFVKVKKFAGYWQPGLPKLDSITWRPVADNNTRAAMLQTGEAQF AFPIPYEQATLLEKNKNIELMASPSIMQRYISMNVTQKPFDNPKVREALNYAINRPALVK VAFAGYATPATGVVPPSIAYAQSYKPWPYDPVKARELLKEAGYPNGFSTTLWSSHNHSTA QKVLQFTQQQLAQVGIKVQVTAMDAGQRAAEVEGKGQKESGVRMFYTGWSASTGEADWAL SPLFASQNWPPTLFNTAFYSNKQVDDFLAQALKTNDPAEKTRLYKAAQDIIWQESPWIPL VVEKLVSAHSKNLTGFWIMPDTGFSFEDADLQ >gi|296493131|gb|ADTK01000370.1| GENE 10 11785 - 12705 995 306 aa, chain + ## HITS:1 COG:yliC KEGG:ns NR:ns ## COG: yliC COG0601 # Protein_GI_number: 16128799 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Escherichia coli K12 # 1 306 1 306 306 563 100.0 1e-160 
MLNYVIKRLLGLIPTLFIVSVLVFLFVHMLPGDPARLIAGPEADAQVIELVRQQLGLDQP LYHQFWHYISNAVQGDFGLSMVSRRPVADEIASRFMPTLWLTITSMVWAVIFGMAAGIIA AVWRNRWPDRLSMTIAVSGISFPAFALGMLLIQVFSVELGWLPTVGADSWQHYILPSLTL GAAVAAVMARFTRASFVDVLSEDYMRTARAKGVSETWVVLKHGLRNAMIPVVTMMGLQFG FLLGGSIVVEKVFNWPGLGRLLVDSVEMRDYPVIQAEILLFSLEFILINLVVDVLYAAIN PAIRYK >gi|296493131|gb|ADTK01000370.1| GENE 11 12750 - 13619 796 289 aa, chain + ## HITS:1 COG:ECs0911 KEGG:ns NR:ns ## COG: ECs0911 COG1173 # Protein_GI_number: 15830165 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Escherichia coli O157:H7 # 1 289 15 303 303 536 100.0 1e-152 MPLVKPDQVRTPWHEFWRRFRRQHMAMTAALFVILLIVVAIFARWIAPYDAENYFDYDNL NNGPSLQHWFGVDSLGRDIFSRVLVGAQISLAAGVFAVFIGAAIGTLLGLLAGYYEGWWD RLIMRICDVLFAFPGILLAIAVVAVLGSGIANVIIAVAIFSIPAFARLVRGNTLVLKQQT FIESARSIGASDMTILLRHILPGTVSSIVVFFTMRIGTSIISAASLSFLGLGAQPPTPEW GAMLNEARADMVIAPHVAVFPALAIFLTVLAFNLLGDGLRDALDPKIKG >gi|296493131|gb|ADTK01000370.1| GENE 12 13797 - 16145 1656 782 aa, chain + ## HITS:1 COG:yliE_2 KEGG:ns NR:ns ## COG: yliE_2 COG2200 # Protein_GI_number: 16128801 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 526 782 1 257 257 531 100.0 1e-150 MLSLYEKIKIRLIILFLLAALSFIGLFFIINYQLVSERAVKRADSRFELIQKNVGYFFKD IERSALTLKDSLYLLKNTEEIQRAVILKMEMMPFLDSVGLVLDDNKYYLFSRRANDKIVV YHQEQVNGPLVDESGRVIFADFNPSKRPWSVASDDSNNSWNPAYNCFDRPGKKCISFTLR INGKDHDLLAVDKIHVDLNWRYLNEYLDQISANDEVLFLKQGHEIIAKNQLAREKLIIYN SEGNYNIIDSVDTEYIAKTSAVPNNALFEIYFYYPGGNLLNASDKLFYLPFAFIIIVLLV VYFMTTRVFRRQFSEMTELVNTLAFLPDSKDQIEALKIREGDAKEIISIKNSIAEMKDAE IERSNKLLSLISYDQESGFIKNMAIIESNNNQYLAVGIIKLCGLEAVEAVFGVDERNKIV RKLCQRIAEKYAQCCDIVTFNADLYLLLCRENVQTFTRKIAMVNDFDSSFGYRNLRIHKS AICEPLQGENAWSYAEKLKLAISSIRDHMFSEFIFCDDAKLNEIEENIWIARNIRHAMEI GELFLVYQPIVDINTRAILGAEALCRWVSAERGIISPLKFITIAEDIGFINELGYQIIKT 
AMGEFRHFSQRASLKDDFLLHINVSPWQLNEPHFHERFTTIMKENGLKANSLCVEITETV IERINEHFYLNIEQLRKQGVRISIDDFGTGLSNLKRFYEINPDSIKVDSQFTGDIFGTAG KIVRIIFDLARYNRIPVIAEGVESEDVARELIKLGCVQAQGYLYQKPMPFSAWDKSGKLV KE >gi|296493131|gb|ADTK01000370.1| GENE 13 16153 - 17481 960 442 aa, chain + ## HITS:1 COG:Z1058_2 KEGG:ns NR:ns ## COG: Z1058_2 COG2199 # Protein_GI_number: 15800584 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli O157:H7 EDL933 # 261 442 1 182 182 367 99.0 1e-101 MSRINKFVLTVSLLIFIMISAVACGIYTQMVKERVYSLKQSVIDTAFAVANIAEYRRSVA IDLINTLNPTEEQLLVGLRTAYADSVSPSYLYDVGPYLISSDECIQVKEFEKNYCADIMQ VVKYRHVKNTGFISFDGKTFVYYLYPVTHNRSLIFLLGLERFSLLSKSLAMDSENLMFSL FKNGKPVTGDEYYAKNAIFTVSEAMEHFAYLPTGLYVFAYKKDVYLRVCTLIIFFAALVA VISGTSCLYLVRRVINRGIVEKEAIINNHFERVLDGGLFFSAADVKKLYSMYNSAFLDDL TKAMGRKSFDEDLKALPEKGGYLCLFDVDKFKNINDTFGHLLGDEVLMKVVKILKSQIPV DKGKIYRFGGDEFAVIYTGGTLEELLSILKEIVHFQVGSINLSTSIGVAHSNECTTVERL KMLADERLYKSKKNGRAQISWQ >gi|296493131|gb|ADTK01000370.1| GENE 14 17528 - 18853 1844 441 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|183179613|ref|ZP_02957824.1| conserved hypothetical protein [Vibrio cholerae MZO-3] # 9 439 34 464 470 714 77 0.0 MSKVTPQPKIGFVSLGCPKNLVDSERILTELRTEGYDVVPSYDDADMVIVNTCGFIDSAV QESLEAIGEALNENGKVIVTGCLGAKEDQIREVHPKVLEITGPHSYEQVLEHVHHYVPKP KHNPFLSLVPEQGVKLTPRHYAYLKISEGCNHRCTFCIIPSMRGDLVSRPIGEVLSEAKR LVDAGVKEILVISQDTSAYGVDVKHRTGFHNGEPVKTSMVSLCEQLSKLGIWTRLHYVYP YPHVDDVIPLMAEGKILPYLDIPLQHASPRILKLMKRPGSVDRQLARIKQWREICPELTL RSTFIVGFPGETEEDFQMLLDFLKEARLDRVGCFKYSPVEGADANALPDQVPEEVKEERW NRFMQLQQQISAERLQEKVGREILVIIDEVDEEGAIGRSMADAPEIDGAVYLNGETNVKP GDILRVKVEHADEYDLWGSRV >gi|296493131|gb|ADTK01000370.1| GENE 15 19066 - 19449 267 127 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_0907 NR:ns ## KEGG: EcE24377A_0907 # Name: bssR # Def: biofilm formation regulatory protein BssR # Organism: E.coli_E24377A # Pathway: not_defined # 1 127 12 138 138 226 100.0 2e-58 MFVDRQRIDLLNRLIDARVDLAAYVQLRKAKGYMSVSESNHLRDNFFKLNRELHDKSLRL 
NLHLDQEEWSALHHAEEALATAAVCLMSGHHDCPTVITVNADKLENCLMSLTLSIQSLQK HAMLEKA >gi|296493131|gb|ADTK01000370.1| GENE 16 19704 - 20675 909 323 aa, chain + ## HITS:1 COG:ECs0917 KEGG:ns NR:ns ## COG: ECs0917 COG2133 # Protein_GI_number: 15830171 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose/sorbosone dehydrogenases # Organism: Escherichia coli O157:H7 # 1 323 49 371 371 649 99.0 0 MLITLRGGELRHWQAGKGLSAPLSGVPDVWAHGQGGLLDVVLAPDFAQSRRIWLSYSEVG DDGKAGTAVGYGRLSDDLSKVTDFRTVFRQMPKLSTGNHFGGRLVFDGKGYLFIALGENN QRPTAQDLDKLQGKLVRLTDQGEIPDDNPFIKESGVRAEIWSYGIRNPQGMAMNPWSNAL WLNEHGPRGGDEINIPQKGKNYGWPLATWGINYSGFKIPEAKGEIVAGTEQPVFYWKDSP AVSGMAFYNSDKFPQWQQKLFIGALKDKDVIVMSVNGDKVTEDGRILTDRGQRIRDVRTG PDGYLYVLTDESSGELLKVSPRN >gi|296493131|gb|ADTK01000370.1| GENE 17 20672 - 21298 658 208 aa, chain - ## HITS:1 COG:ECs0918 KEGG:ns NR:ns ## COG: ECs0918 COG0625 # Protein_GI_number: 15830172 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione S-transferase # Organism: Escherichia coli O157:H7 # 1 208 3 210 210 400 100.0 1e-112 MITLWGRNNSTNVKKVLLTLEELELPYEQILAGREFGINHDADFLAMNPNGLVPLLRDDE SDLILWESNAIVRYLAAQYGQKRLWIDSPARRAEAEKWMDWANQTLSNAHRGILMGLVRT PPEERDQAAIDASCKECDALFALLDAELAKVKWFSGDEFGVGDIAIAPFIYNLFNVGLTW TPRPNLQRWYQQLTERPAVRKVVMIPVS >gi|296493131|gb|ADTK01000370.1| GENE 18 21524 - 22747 1215 407 aa, chain + ## HITS:1 COG:ECs0919 KEGG:ns NR:ns ## COG: ECs0919 COG1686 # Protein_GI_number: 15830173 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Escherichia coli O157:H7 # 8 407 1 400 400 795 100.0 0 MDTRVAFMTQYSSLLRGLAAGSAFLFLFAPTAFAAEQTVEAPSVDARAWILMDYASGKVL AEGNADEKLDPASLTKIMTSYVVGQALKADKIKLTDMVTVGKDAWATGNPALRGSSVMFL KPGDQVSVADLNKGVIIQSGNDACIALADYVAGSQESFIGLMNGYAKKLGLTNTTFQTVH GLDAPGQFSTARDMALLGKALIHDVPEEYAIHKEKEFTFNKIRQPNRNRLLWSSNLNVDG MKTGTTAGAGYNLVASATQGDMRLISVVLGAKTDRIRFNESEKLLTWGFRFFETVTPIKP DATFVTQRVWFGDKSEVNLGAGEAGSVTIPRGQLKNLKASYTLTEPQLTAPLKKGQVVGT 
IDFQLNGKSIEQRPLIVMENVEEGGFFGRMWDFVMMKFHQWFGSWFS >gi|296493131|gb|ADTK01000370.1| GENE 19 22794 - 23552 660 252 aa, chain - ## HITS:1 COG:ECs0920 KEGG:ns NR:ns ## COG: ECs0920 COG1349 # Protein_GI_number: 15830174 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli O157:H7 # 1 252 1 252 252 527 99.0 1e-150 METRREERIGQLLQELKRSDKLHLKDAAALLGVSEMTIRRDLNNHSAPVVLLGGYIVLEP RSASHYLLSDQKSRLVEEKRRAAKLAATLVEPDQTLFFDCGTTTPWIIEAIDNEIPFTAV CYSLNTFLALKEKPHCRAFLCGGEFHASNAIFKPIDFQQTLNNFCPDIAFYSAAGVHVSK GATCFNLEELPVKHWAMSMAQKHVLVVDHSKFGKVRPVRMGDLKRFDIVVSDCCPEDEYV KYAQTQRIKLMY >gi|296493131|gb|ADTK01000370.1| GENE 20 23610 - 24206 572 198 aa, chain - ## HITS:1 COG:ECs0921 KEGG:ns NR:ns ## COG: ECs0921 COG0671 # Protein_GI_number: 15830175 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Escherichia coli O157:H7 # 1 198 1 198 198 354 98.0 6e-98 MLENLNLSLFSLINATPDSAPWMISLAIFIAKDLITVVPLLAAVLWLWGFTAQRQLVIKI AIALAVSLFVSWTMGHLFPHDRPFVENIGYNFLHHAADDSFPSDHGTVIFTFALAFLCWH RLWSGSLLMVLAVVIAWSRVYLGVHWPLDMLGGLLAGMIGCLSAQIIWQAMGHKLYQRLQ SWYRFCFALPIRKGWVRD >gi|296493131|gb|ADTK01000370.1| GENE 21 24491 - 25723 1117 410 aa, chain + ## HITS:1 COG:ECs0922 KEGG:ns NR:ns ## COG: ECs0922 COG0477 # Protein_GI_number: 15830176 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 410 1 410 410 720 99.0 0 MQNKLASGARLGRQALLFPLCLVLYEFSTYIGNDMIQPGMLAVVEQYQAGIDWVPTSMTA YLAGGMFLQWLLGPLSDRIGRRPVMLAGVVWFIITCLAILLAQNIEQFTLLRFLQGISLC FIGAVGYAAIQESFEEAVCIKITALMANVALIAPLLGPLVGAAWIHVLPWEGMFVLFAAL AAISFFGLQRAMPETATRIGEKLSLKELGRDYKLVLKNGRFVAGALALGFVSLPLLAWIA QSPIIIITGEQLSSYEYGLLQVPIFGALIAGNLLLARLTSRRTVRSLIIMGGWPIMIGLL 
VAAAATVISSHAYLWMTAGLSIYAFGIGLANAGLVRLTLFASDMSKGTVSAAMGMLQMLI FTVGIEISKHAWLNGGNGQFNLFNLVNGILWLSLMVIFLKDKQMGNSHEG >gi|296493131|gb|ADTK01000370.1| GENE 22 25764 - 26042 223 92 aa, chain - ## HITS:1 COG:no KEGG:SbBS512_E2502 NR:ns ## KEGG: SbBS512_E2502 # Name: not_defined # Def: hypothetical protein # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 1 92 1 92 92 127 100.0 1e-28 MKNCLLLGALLMGFTGVAMAQSVTVDVPSGYKVVVVPDSVSVPQAVSVATVPQTVYVAPA PAPAYRPHPYARHLASVGEGMVIEHQIDDHHH >gi|296493131|gb|ADTK01000370.1| GENE 23 26134 - 26949 899 271 aa, chain - ## HITS:1 COG:ybjI KEGG:ns NR:ns ## COG: ybjI COG0561 # Protein_GI_number: 16128812 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Escherichia coli K12 # 10 271 1 262 262 518 99.0 1e-147 MSIKLIAVDMDGTFLSDQKTYNRERFMAQYQQMKAQGIRFVVASGNQYYQLISFFPEIAN EIAFVAENGGWVVSEGKDVFNGELSKDAFTTVVEHLLTRPEVEIIACGKNSAYTLKKYDD AMKTVAEMYYHRLEYVDNFDNLEDIFFKFGLNLSDELIPQVQKALHEAIGDIMVPVHTGN GSIDLIIPGVHKANGLRQLQKLWGIDDSEVVVFGDGGNDIEMLRQAGFSFAMENAGSAVV AAAKYRAGSNNREGVLDVIDKVLKHEAPFDQ >gi|296493131|gb|ADTK01000370.1| GENE 24 26949 - 28076 1117 375 aa, chain - ## HITS:1 COG:ybjJ KEGG:ns NR:ns ## COG: ybjJ COG0477 # Protein_GI_number: 16128813 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 375 28 402 402 610 99.0 1e-174 MASWATRTPAIRDILSVSIAEMGGVLFGLSIGSMSGILCSAWLVKRFGTRNVILVTMSCA LIGMMILSLALWLTSPLLFAVGLGVFGASFGSAEVAINVEGAAVEREMNKTVLPMMHGFY SLGTLAGAGVGMALTAFGVPATVHISLAALVGIAPIYIAIQAIPDGTGKNAADGTQHGEK GIPFYRDIQLLLIGVVVLAMAFAEGSANDWLPLLMVDGHGFSPTSGSLIYAGFTLGMTVG RFTGGWFIDRYSRVAVVRASALMGALGIGLIIFVDSAWVAGVSVVLWGLGASLGFPLTIS AASDTGPDAPTRVSVVATTGYLAFLVGPPLLGYLGEHYGLRSAMLVVLALVILAAIVAKA VAKPDTKTQTAMENS >gi|296493131|gb|ADTK01000370.1| GENE 25 28241 - 28777 496 178 aa, chain 
+ ## HITS:1 COG:ECs0926 KEGG:ns NR:ns ## COG: ECs0926 COG3226 # Protein_GI_number: 15830180 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 178 1 178 178 340 100.0 7e-94 MRRANDPQRREKIIQATLEAVKLYGIHAVTHRKIATLAGVPLGSMTYYFSGIDELLLEAF SSFTEIMSRQYQAFFSDVSDAQGACQAITDMIYSSQVATPDNMELMYQLYALASRKPLLK TVMQNWMQRSQQTLEQWFEPGTARALDAFIEGMTLHFVTDRKPLSREEILRMVERVAG >gi|296493131|gb|ADTK01000370.1| GENE 26 28879 - 29769 453 296 aa, chain - ## HITS:1 COG:STM2739 KEGG:ns NR:ns ## COG: STM2739 COG0582 # Protein_GI_number: 16766051 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Salmonella typhimurium LT2 # 2 281 57 337 341 284 50.0 1e-76 MDRRTLKDVVELWFKLHGKSLTAGQHVYDKLLLMVDALGNPLATDLTSKMFAHYRDKRLT GEIYFSEKWKKGASPVTINLEQSYLSSVFSELSRLGEWSYPNPLENMRKFTIAEKEMAWL THEQIVELLADCKRQDPILALVVKICLSTGARWREAVNLTRSQVTKYRITFVRTKGKKNR SIPISKELYEEIMALDGFNFFTDCYFQFLSVMEKTSIVLPRGQLTHVLRHTFAAHFMMSG GNILALQKILGHHDIKMTMRYAHLAPDHLETALRFNPLATLPSGDKVAAAVGITPS >gi|296493131|gb|ADTK01000370.1| GENE 27 29782 - 29934 84 50 aa, chain - ## HITS:1 COG:STM2739 KEGG:ns NR:ns ## COG: STM2739 COG0582 # Protein_GI_number: 16766051 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Salmonella typhimurium LT2 # 1 50 1 50 341 82 72.0 2e-16 MAVRKLTTGKWLCECYPAGRSGRRVRKQFATKGEALAFERHTMEESEAKP >gi|296493131|gb|ADTK01000370.1| GENE 28 30026 - 30577 62 183 aa, chain - ## HITS:1 COG:no KEGG:DMR_16080 NR:ns ## KEGG: DMR_16080 # Name: not_defined # Def: hypothetical protein # Organism: D.magneticus # Pathway: not_defined # 1 183 1 183 185 196 51.0 3e-49 MTTYAFFLEELSAAEMLKGLLPRLLPADADTRYIVFEGKQDLEKNLEKKLKNWMTPDTIF VVIRDKDSGDGPTIKSRLSDICSKAGKTDVLIRIACHELESWYLGDLLSVEEALGIPRLS RSQNNSKYRNPDRLGNAAQELVSLTKYAYQKVSGSRAIGKCLRITGNLSHSYNIFISGLQ NRI >gi|296493131|gb|ADTK01000370.1| GENE 29 30574 - 31773 331 399 aa, chain - ## HITS:1 COG:alr0507 KEGG:ns NR:ns ## COG: alr0507 COG4637 # 
Protein_GI_number: 17228003 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Nostoc sp. PCC 7120 # 1 398 1 394 394 531 64.0 1e-150 MLIEAIKLKNFKSFQDLEMNNIPKFCVIVGANGVGKTTLFDVFGFLKDCLTFNVRSAVQK RGGFEELLSRGVDVTDRTIEIEVKFRIEISGYERLVTYILKLKEDSRKKVYVEREILRYK RGSFGSPYHFLDFSRGHGTAIVNEDDFSKQDEELNRETQQLDSDDVLAIKGLGQFSKFKA ASAFRQLIENWHVSDFHINAARGSKDAIGYEDHLSATGDNLQLVARNIHDNYPEIFSKII DSMKHRVPGVSDVKPISTQDGRLLLGFQDGSFADPFIDRYVSDGTLKMFAYLVQLHDPEP HPLLCVEEPENQLYPKLLVELAEEFRAYTQRGGQVFVSTHSPDFLNAVNIDEVFWLTKTN GYTRAFRASDDPQLVAYMNDGDKMGYLWKQGFFPGVDPQ >gi|296493131|gb|ADTK01000370.1| GENE 30 31782 - 32675 55 297 aa, chain - ## HITS:1 COG:STM2738 KEGG:ns NR:ns ## COG: STM2738 COG2932 # Protein_GI_number: 16766050 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Salmonella typhimurium LT2 # 110 293 9 210 210 157 40.0 2e-38 MNFKNGGQAVITRMLEAYGFKTRQALCEQFNVSASTMGTRWMRDVFPADWVIQCAIETGA SIEWLSFGKGVPFPQNTEASVMSKATAPKVQYVPQINNQENPIGNLLNLDSGGRDAIDRL MKAYGFKTRQELADHLNVSKSTMANRYLRDTFPADWIIKCSLETGNSLLWLATGQGSKHS SLTTLVKELPKFHLNAGKMVECGSYIFDTSFLPANLSAPIVIQDGLMTYICDQNITDVLD GHWLINIDETYSIRHITRLPKKVIKVSSSINSFECGFSDINFVARVYLFISSINKSN >gi|296493131|gb|ADTK01000370.1| GENE 31 32800 - 33021 191 73 aa, chain + ## HITS:1 COG:no KEGG:Dd586_0731 NR:ns ## KEGG: Dd586_0731 # Name: not_defined # Def: hypothetical protein # Organism: D.dadantii_Ech586 # Pathway: not_defined # 1 73 1 73 73 113 89.0 2e-24 MTPHISITLAVPSVSIEKYSELTGLSIDTINDMLADGRLIRHRLRKDKKREKVMINIAAM TVDALSECNLSIN >gi|296493131|gb|ADTK01000370.1| GENE 32 33054 - 33563 458 169 aa, chain + ## HITS:1 COG:no KEGG:c0936 NR:ns ## KEGG: c0936 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_CFT073 # Pathway: not_defined # 1 169 19 187 187 323 98.0 2e-87 MFDYQVSKHPHFDEACRTFALRHNLVQLADRAGMNVQILRNKLNPAQPHLLTAPDIWLLT DLTEDSTLVDGFLAQIHCLPCVPINEVAKEKLPHYVMSATAEIGRVAAGAVSGDVKTSAG RRDAISSINSVTRLMALAAVSLQARLQANPAMASAVDTVTGLGASFGLL >gi|296493131|gb|ADTK01000370.1| GENE 
33 33571 - 33771 98 66 aa, chain + ## HITS:1 COG:no KEGG:ECB_00820 NR:ns ## KEGG: ECB_00820 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 66 14 79 79 127 96.0 1e-28 MLTKEPSFASLLVKQSPAMHYGHGWIIGKDGKRWHPCRSQDELLAELSTKKRGNKWLLKA LRRLFH >gi|296493131|gb|ADTK01000370.1| GENE 34 33735 - 34076 246 113 aa, chain + ## HITS:1 COG:no KEGG:E2348C_0805 NR:ns ## KEGG: E2348C_0805 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_0127 # Pathway: not_defined # 1 113 1 113 113 201 100.0 7e-51 MAIEGAAATVPLSPGERLNGLNHIAELRAKVFGLNIESELERFIKDMRDPRDINNEQNKR ALAAIFFMAKIPAERHSISINELTTDEKRELIKAMNHFRAVVSLFPRRLTMPN >gi|296493131|gb|ADTK01000370.1| GENE 35 34144 - 34377 358 77 aa, chain + ## HITS:1 COG:no KEGG:t3407 NR:ns ## KEGG: t3407 # Name: not_defined # Def: hypothetical protein # Organism: S.typhi_Ty2 # Pathway: not_defined # 1 77 1 77 77 105 96.0 4e-22 MRNIETLKTKTGPDDAGLNILLTEARLEERRARAEAMAARLDSLACHITSRQLNHVEAAE LLRVTAEAIQNEAQEIH >gi|296493131|gb|ADTK01000370.1| GENE 36 34377 - 34604 211 75 aa, chain + ## HITS:1 COG:STM2731 KEGG:ns NR:ns ## COG: STM2731 COG1734 # Protein_GI_number: 16766043 # Func_class: T Signal transduction mechanisms # Function: DnaK suppressor protein # Organism: Salmonella typhimurium LT2 # 1 75 1 75 75 123 89.0 1e-28 MADAMDLVQQRVEEERQRHIRAARAKTPGVSRVLCIECEAPIPPARRRVIPGVQLCITCQ EIAELKGKHYNGGAV >gi|296493131|gb|ADTK01000370.1| GENE 37 34601 - 35458 593 285 aa, chain + ## HITS:1 COG:STM2730 KEGG:ns NR:ns ## COG: STM2730 COG0338 # Protein_GI_number: 16766042 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Salmonella typhimurium LT2 # 1 284 1 282 285 431 74.0 1e-120 MSTILKWAGNKTAIMSELKKHLPAGPRLVEPFAGSCAVMMATDYPSYLVADINPDLINLY KKVSADCEAFISRARVLFENANREVAYYNIKQEFNYSTEITDFMKAVYFLYLNRHGYRGL CRYNKSGHFNIPYGNYKNPYFPEKEIRAFAEKAQRATFICASFDETLAMLQVGDVVYCDP PYDGTFSGYHTDGFTEDDQYHLASVLEHRSSEGHPVIVSNSDTSLIRSLYRNFTHHYIKA 
KRSIGVAAGDSKSATEIIAVSGARCWVGFDPSRGVDSSAVYEVRV >gi|296493131|gb|ADTK01000370.1| GENE 38 35455 - 37869 1412 804 aa, chain + ## HITS:1 COG:no KEGG:c0942 NR:ns ## KEGG: c0942 # Name: not_defined # Def: putative replication protein for prophage # Organism: E.coli_CFT073 # Pathway: not_defined # 1 804 1 804 804 1563 97.0 0 MSHDDMSNSSGFNEAAASFSWNGPKKAINPYLDPAEFAPESALSNLITLYAADNEQEQLR REALSEQVWERYFFNESRDPVQREMEQDKLISRAKLAHEQQLFNPDMVILADVSAQPTHI SKPLMQRIEYFSSLGRPKAYSRYLRETIKPCLERLDCVRDSQLSASFRFMASHQGLEGLL ILPEMSQDQVKRLSTLVAAHMSMCLDAACGDLYATDDVKPEEIRKTWEKVAAETLRLDVI PPAFEQLRRKRNRRKPVPYELIPGSLARMLCADWWYRKLWKMRCEWREEQLRAVCLVSKK ASPYVSYEAVTHKREQRRKSLEFFRSHELVNEDGDTLDMEDVVNASSSNPAHRRNEMMAC VKGLELIAEMRGDCAVFYTITCPSRFHSTLNNGRPNPTWTNATVRQSSDYLVGMFAAFRK AMHKAGLRWYGVRVAEPHHDGTVHWHLMCFMRKKDRRAITALLRKFAIREDREELGNNTG PRFKSELINPRKGTPTSYIAKYISKNIDGRGLAGEISKETGKSLRDNAEYVNAWASLHRV QQFRFFGIPGRQAYRELRLLAGQAARQQGDKKAGAPVLDNPRLDAILAAADAGCFATYIM KQGGVLVPRKYHLIRTAYEINEEPTAYGDHGIRIYGIWSPIVQGKICTHAVKWKMVRKAV DVQEAAADQGACAPWTRGNNCPLAENLNQQGKDKSADGDSRTDITRMNDKELHDYLHSMS KKERRELAARLRQVKPKRRKDYKQRITDHQRQQLVYELKSRGFDGSEKEVDLLLRGGSIP SGAGLRIFYRNQRLKEDDKWRNLY >gi|296493131|gb|ADTK01000370.1| GENE 39 38022 - 38210 115 62 aa, chain + ## HITS:1 COG:no KEGG:ECS88_2839 NR:ns ## KEGG: ECS88_2839 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 62 35 96 96 118 98.0 6e-26 MQDYFLESLKLQRIDFFLKLVAASECSDEEKGLALQWVSELTDELMAKIRSHEYNRSMDV IS >gi|296493131|gb|ADTK01000370.1| GENE 40 38221 - 38454 315 77 aa, chain + ## HITS:1 COG:no KEGG:ECS88_2838 NR:ns ## KEGG: ECS88_2838 # Name: not_defined # Def: putative DinI-like damage-inducible protein # Organism: E.coli_S88 # Pathway: not_defined # 1 77 25 101 101 132 100.0 3e-30 MRIEIMIDKEQKISQSTLDALESELYRNLRPLYPKTVIRIRKGSSNGVELTGLQLDEERK QVMKIMQKVWEDDSWLH >gi|296493131|gb|ADTK01000370.1| GENE 41 38647 - 38982 331 111 aa, chain - ## HITS:1 COG:no KEGG:B21_00837 NR:ns ## KEGG: B21_00837 # Name: ybl33 # Def: hypothetical protein # 
Organism: E.coli_BL21 # Pathway: not_defined # 1 111 1 111 111 197 99.0 1e-49 MNNIPPIPQLGIYVSKIDPTLRITVTDVDIVDGEDDSPDDELFYLVHWIEGEDESDMTAM GFELDPLEWQAFVESEQLVFERDPYMDSIPENSNLAKIRDFLMKTKQNDHS >gi|296493131|gb|ADTK01000370.1| GENE 42 39815 - 39970 95 51 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFTVGALLIAVWKEHALFLAHIVKVEIAQLLVTCLVSELGVVTSWGEMCYY >gi|296493131|gb|ADTK01000370.1| GENE 43 40091 - 41050 255 319 aa, chain + ## HITS:1 COG:BMEII0667 KEGG:ns NR:ns ## COG: BMEII0667 COG0758 # Protein_GI_number: 17989012 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Brucella melitensis # 65 277 73 278 393 142 43.0 6e-34 MHSEELKNTLGLAIQIGKLASDQSVLKFFELIDFDVIRDIRDLTEMVNSHGFLKDKISET DFLIANAELENHHVNGVELIPYGSEFYPLSLAFTPNPPSILYIKGDKSILKELPGVAIVG SRDTSPAGEEITRRITNQIVSAGYIVVSGLAIGTDANAHKATLQAKGKTIAVLAHGLEEA KPKQNSRLAQEILDKGGAWISEYPMGRPAQKQSFVQRNRIQVGLSAGSILIEAALNSGTI TQAEFANKAKRPVFAVVPHLPNNPLNLNCEGTVDLVKSNMARALKTKRDYDDVIMIINES REYLLELKWPGKQSTLDLI >gi|296493131|gb|ADTK01000370.1| GENE 44 41075 - 42100 611 341 aa, chain - ## HITS:1 COG:STM2723 KEGG:ns NR:ns ## COG: STM2723 COG5518 # Protein_GI_number: 16766036 # Func_class: R General function prediction only # Function: Bacteriophage capsid portal protein # Organism: Salmonella typhimurium LT2 # 22 339 24 341 346 611 89.0 1e-175 MGKSKKNRVAAMNQIQHKSQSSAEAFSFGDPVPVLDRRELLDYVECVQTDRWYEPPVSFD GLARTFRAAVHHSSPIAVKCNILTSTYIPHPLLSQQAFSRFVQDYLVFGNAYLEKRTNRF GEVIALEPALAKYTRRGLDLDTYWFVQYGMTTQPYQFTKGSIFHLMEPDINQEIYGLPGY LSAIPSALLNESATLFRRKYYINGSHAGFIMYMTDAAQNQEDVNNLRNAMKSAKGPGNFR NLFMYSPNGKKDGLQIIPLSEVAAKDEFLNIKNVSRDDMMAAHRVPPQMMGIMPNNVGGF GDVEKASKVFVKNELLPLQKRMKEFNHWSGEEIIKFERYQI >gi|296493131|gb|ADTK01000370.1| GENE 45 42100 - 43866 1375 588 aa, chain - ## HITS:1 COG:RSc1939 KEGG:ns NR:ns ## COG: RSc1939 COG5484 # Protein_GI_number: 17546658 # Func_class: S Function unknown # 
Function: Uncharacterized conserved protein # Organism: Ralstonia solanacearum # 3 491 2 496 506 687 67.0 0 MNTTLTPADLDPRRQAMLLYFQGYRVARIAEMLGEKVATVHSWKKRDKWGDYGPLDQMQL TTAARYCQLIMKEHKEGKDFKEIDLLARQSERHARIGKFNNGGNEADLNPNVANRNKGPR RQPEKNVFTDEQIEKLEEIFHSSMFNYQRHWWEAGKTNRIRNLLKSRQIGATFYFAREAL IDALLTGRNQIFLSASKAQAHVFKQYIIDFAKEVEVELKGDPMVLPNGATLYFLGTNART AQSYHGNLYLDEYFWIPKFQELRKVASGMAIHKKWRQTYFSTPSSLTHSAYPFWSGALFN RGRNKADKVDIDLSHSNLAPGLLCADGQYRQIVTVEDAVRGGCNLFDLDQLRMEYSPDEY QNLLMCEFVDDLASVFPLSELQACMVDSWEVWSDFHALALRPFGWREVWIGYDPAKGTQN GDSAGCVVVAPPAVPGGKFRILERHQWRGMDFRAQADAIKKLTEQYNVTYIGIDSTGVGH GVYENVKAFFPAVREFVYNPNVKNALVLKAYDIISHRRLEFDAGHTDIAQSFMAIRRATT ASGNRPTYEASRSEEASHADLAWATMHALFNEPLQGESANTSNIVEIF >gi|296493131|gb|ADTK01000370.1| GENE 46 44009 - 44842 900 277 aa, chain + ## HITS:1 COG:no KEGG:EcHS_A0925 NR:ns ## KEGG: EcHS_A0925 # Name: not_defined # Def: phage capsid scaffolding protein # Organism: E.coli_HS # Pathway: not_defined # 1 277 1 277 277 477 98.0 1e-133 MTVKVKRFRIGVEGATTDGREIQREWLEQMAASYNPTVYTALINLEHIKSYLPDSTFNRY GKVTALFAEEITEGPLAGKMALYADVEPTESLVELVKKGQKLFTSMEVSPKFADTGKAYL VGLAATDDPASLGTEMLTFSASAAHNPLANRKQNPANLFTAAEETVIELEEIQDDKPSLF ARVTALFTKKEQSDDARFSDVHKAVELVATEQQNLSARTEKSLSEQEERLSELETALQEQ QTAFNELVNKLSHEDSRQDYRQRATGGNAPADTLTNC >gi|296493131|gb|ADTK01000370.1| GENE 47 44859 - 45917 1241 352 aa, chain + ## HITS:1 COG:no KEGG:SeD_A3048 NR:ns ## KEGG: SeD_A3048 # Name: not_defined # Def: phage major capsid protein, P2 family # Organism: S.enterica_Dublin # Pathway: not_defined # 1 344 1 344 352 667 99.0 0 MKKNTRFAFKAYLQQLARLNGVAVEELSSKFTVEPSVQQTLEDQIQQSAAFLTLINVTPV TEQSGQLLGLGVGSTIAGTTDTTAKEREPVDPTLMVDVEYKCEQTNFDTVLTYAKLDLWA KFQDFQVRIRDAIVKRQALDRIMIGFNGVKRAKTSNRSENPLLQDVNKGWLQKIREDAPD HVMGSTTTGGETTPGAVKVGKGGEYANLDAVVMDAVNELIDVVYQDDDDLVVICGRELLS DKYFPLVNKEQENSEKLAADMIISQKRMGGLQAVRAPFFPPNALLITRLDNLSIYWQEDT RRRSVIDNPKRDRIENFESVNEAYVVEDYRCAALVENIQIGDFSAAAAEAGA >gi|296493131|gb|ADTK01000370.1| GENE 48 45921 - 46571 692 216 aa, chain + ## HITS:1 COG:no 
KEGG:ECED1_3086 NR:ns ## KEGG: ECED1_3086 # Name: M # Def: terminase, endonuclease subunit (GpM) # Organism: E.coli_ED1a # Pathway: not_defined # 1 216 1 216 216 377 98.0 1e-103 MSLSPARQHRLRVQAEQAAREGGSVRHASGYDLMLLQLAEDRRRLKGVQSTVKKAEIKVE LLPKYAAWAEGVLAAGGAQQDDVLMYVMLWRIDAGDYAGALEIGRHALRHGWVMPLGNRN VQTVLAEEMADAAQSAMLAATGFDADPLLQTLEMTDGLDMPDQSRARLHKAIGAVLSESN PASALNHLTHALQLDPRCGVKKDKQQLERRLRNDSR >gi|296493131|gb|ADTK01000370.1| GENE 49 46667 - 47131 437 154 aa, chain + ## HITS:1 COG:no KEGG:ECED1_3085 NR:ns ## KEGG: ECED1_3085 # Name: L # Def: head completion/stabilization protein (GpL) # Organism: E.coli_ED1a # Pathway: not_defined # 1 154 22 175 175 295 100.0 3e-79 MKFVAPEQAPEQAEIIRNTPFWPDVDLSEFRSVMRTDGTVTQPRLKQVALSAISEVNAEL YEFRRRQQMLGYASLAEVPAEQLDGKSERIQHYFNAVYCWARAMLNERYQDYDATASGVK RGEELAEASGDLWRDARWAISRVQDAPHCTVELI >gi|296493131|gb|ADTK01000370.1| GENE 50 47131 - 47334 187 67 aa, chain + ## HITS:1 COG:STM2717 KEGG:ns NR:ns ## COG: STM2717 COG5004 # Protein_GI_number: 16766030 # Func_class: R General function prediction only # Function: P2-like prophage tail protein X # Organism: Salmonella typhimurium LT2 # 1 67 1 67 67 124 92.0 4e-29 MKVRAHQYDTVDALCWRHYGRTQGVTEQVLKANPGLAEYGPFLPHGLQVELPDIPTTTTV QTVQLWD >gi|296493131|gb|ADTK01000370.1| GENE 51 47338 - 47553 258 71 aa, chain + ## HITS:1 COG:no KEGG:ECS88_2829 NR:ns ## KEGG: ECS88_2829 # Name: not_defined # Def: putative secretory protein # Organism: E.coli_S88 # Pathway: not_defined # 1 71 1 71 71 133 100.0 2e-30 MTLERISAFITYCIAVVLAWLGDLSIKDASTLGGLMIGVLMLAINWYYKHKAYQLLRDGQ ISREDYESINR >gi|296493131|gb|ADTK01000370.1| GENE 52 47534 - 48046 285 170 aa, chain + ## HITS:1 COG:STM2715 KEGG:ns NR:ns ## COG: STM2715 COG3772 # Protein_GI_number: 16766028 # Func_class: R General function prediction only # Function: Phage-related lysozyme (muraminidase) # Organism: Salmonella typhimurium LT2 # 1 168 1 168 169 313 91.0 9e-86 MNPSIVKRCLVGAVLAIAATLPGFQQLHTSVEGLKLIADYEGCRLQPYQCSAGVWTDGIG 
NTSGVIPGKTITERQAAEGLISNVLRVERALERCVKQQPPLKVYDATVSFAFNVGTGNAC SSTLVKLLNQRRWADACRQLPRWVYVKGVFNQGLDNRRAREMAWCLQGAN >gi|296493131|gb|ADTK01000370.1| GENE 53 48048 - 48425 346 125 aa, chain + ## HITS:1 COG:no KEGG:ECIAI39_1807 NR:ns ## KEGG: ECIAI39_1807 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_IAI39 # Pathway: not_defined # 1 125 25 149 149 171 91.0 7e-42 MKKKVISGLFLMLWIDLLIAVMVYPQGIFPVLAASGVWVACLLTWAVIPVSLAALIKNGP LWQELRASLLKTITRKENVFISWMMRLLIVVSLAWTGWAITLVFFLLTVIAFWMTRNQIA QQVSA >gi|296493131|gb|ADTK01000370.1| GENE 54 48422 - 48850 218 142 aa, chain + ## HITS:1 COG:no KEGG:ECS88_2826 NR:ns ## KEGG: ECS88_2826 # Name: not_defined # Def: putative regulatory protein # Organism: E.coli_S88 # Pathway: not_defined # 1 142 1 142 142 187 96.0 8e-47 MNRLLLVVLALLLAALGWQTWRLADASQTISTQADELQSKSQALAKSNSQLISLSILTET NNREQARLYAEAEQTSALLRQRQHRIEELKRENEDLRRWADTTLPADIIRLRERPALTGG TAYRQWLSASDAVSAGSGNAAH >gi|296493131|gb|ADTK01000370.1| GENE 55 48946 - 49377 396 143 aa, chain + ## HITS:1 COG:no KEGG:SeD_A3040 NR:ns ## KEGG: SeD_A3040 # Name: gpr # Def: phage tail completion protein R # Organism: S.enterica_Dublin # Pathway: not_defined # 1 143 1 143 143 277 100.0 1e-73 MNKPQSLRHALNKAVPYVRNNPDKLHLFVDNGSLVATGASSMSWEYRYTLNAVIEDFSGD QNLLMAPVLLWLRDNQPDAINNPALREKLFTFEVDILRNDVCDISLNLQLTERVLVSTDG SVSSVEAVAEPDEPEEMWTVKRG >gi|296493131|gb|ADTK01000370.1| GENE 56 49499 - 49816 243 105 aa, chain + ## HITS:1 COG:no KEGG:t3428 NR:ns ## KEGG: t3428 # Name: not_defined # Def: putative phage tail protein # Organism: S.typhi_Ty2 # Pathway: not_defined # 1 105 44 148 148 173 99.0 2e-42 MQRNPDGSSYEPRRVTARSKKGRIKRQMFAKLRTTKYLKTAASADSASVQFEGKVQRIAR VHHYGLRDRVSSKGPEVRYAERRLLGVNDDVEAMTRDMILQWLAG >gi|296493131|gb|ADTK01000370.1| GENE 57 49885 - 50463 490 192 aa, chain + ## HITS:1 COG:STM2710 KEGG:ns NR:ns ## COG: STM2710 COG4540 # Protein_GI_number: 16766023 # Func_class: R General function prediction only # Function: Phage P2 baseplate assembly protein gpV # Organism: Salmonella 
typhimurium LT2 # 1 192 1 192 192 316 84.0 1e-86 MNAQLTEIMRLITNLIRTGVVTEVDRANWLCRVKTGDLETNWINWLTLRAGKSRTWWKPS VGEQVVLFSLGGNLETAFALPAVYSNQFPPPSGSEDGNVTEYPDGGWFEYEPATGRWYVR GIKSMVIEAADNITMKTSEFVLEADRTRINSEVVINGGVTQGGGAMSSNGIVVDAHQHTG VLKGGDTTGGPV >gi|296493131|gb|ADTK01000370.1| GENE 58 50460 - 50819 375 119 aa, chain + ## HITS:1 COG:STM2709 KEGG:ns NR:ns ## COG: STM2709 COG3628 # Protein_GI_number: 16766022 # Func_class: R General function prediction only # Function: Phage baseplate assembly protein W # Organism: Salmonella typhimurium LT2 # 1 118 1 118 119 186 83.0 7e-48 MTLYIGMNNTSGKAITDIDHLRQSVRDILLTPQGSRIARREYGSLLSTLIDQPQNPALRL QVMSAVYVALSRWEPRLTLDSITIKSNFDGSMVVGLTGRRNNGVPVSLSVSTGAENGSD >gi|296493131|gb|ADTK01000370.1| GENE 59 50806 - 51714 919 302 aa, chain + ## HITS:1 COG:STM2708 KEGG:ns NR:ns ## COG: STM2708 COG3948 # Protein_GI_number: 16766021 # Func_class: R General function prediction only # Function: Phage-related baseplate assembly protein # Organism: Salmonella typhimurium LT2 # 1 302 1 302 302 464 90.0 1e-131 MAVIDLSQLPAPQIVDVPDFETLLAERKAEFVALHPKDEQEAVIRTLELESEPVTKLLQE NAYRELLLRQRINEAAQAVMVAYAMGGDLDQLAANYNVKRLTVTPADNDAVPPVAAVMES DEALRLRVPAAFEGLSVAGPTAAYEFHARSADGRVADASATSPAPAEVVLTVLSREGDGT AEKDLLDVVEKALNSENVRPVADRLTVRSAEIIPYRVEATIFLYPGPEAEPVMAAAKASL QKYIASQTRLGRDIRRSAIFAALHVEGVQRVELASPLADVVLNKTQAASCTQWSVTNGGT DE >gi|296493131|gb|ADTK01000370.1| GENE 60 51707 - 52312 479 201 aa, chain + ## HITS:1 COG:STM2707 KEGG:ns NR:ns ## COG: STM2707 COG4385 # Protein_GI_number: 16766020 # Func_class: R General function prediction only # Function: Bacteriophage P2-related tail formation protein # Organism: Salmonella typhimurium LT2 # 1 201 1 201 201 391 92.0 1e-109 MNSLLPPGSTSLERRLAQTCSGISDLQVPLRDLWNPATCPVSFLPYLAWAFSVDRWDEGW TESVKRQVVKDAFYIHQHKGTTSAVRRVVEPFGFLIRIIEWWQTGETPGTFRLDIGVQDQ GITEDTYLELERLISDAKPCSRHMIGMSINLQTSGPYWLGAASYLGEEITIFPYINETII SGGTAHEGGAVHVIDTMRVNP >gi|296493131|gb|ADTK01000370.1| GENE 61 52309 - 52984 608 225 aa, chain + ## HITS:1 COG:STM2706 
KEGG:ns NR:ns ## COG: STM2706 COG5301 # Protein_GI_number: 16766019 # Func_class: R General function prediction only # Function: Phage-related tail fibre protein # Organism: Salmonella typhimurium LT2 # 1 158 1 158 524 263 93.0 2e-70 MSTKFYTLLTDIGAAKLASAAALGVPLKITHMAVGDGGGTLPTPDAKQTALVNEKRRAAL NMLYIDPQNSSQIIAEQVIPENEGGWWIREVGLFDESGALIAVGNCPESYKPQLAEGSGR TQTVRMVLITSSTDNITLKIDPAVVLATRKYVDDKISEHEQSRRHPDASLTAKGFTQLSS ATNSESEILAATPKAVKAAYDLAAGKASASHTHPWNQITDVPAAS Prediction of potential genes in microbial genomes Time: Mon May 16 16:17:18 2011 Seq name: gi|296493130|gb|ADTK01000371.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1237.1, whole genome shotgun sequence Length of sequence - 5429 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 10/0.000 - CDS 3 - 1719 1199 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC 2 1 Op 2 12/0.000 - CDS 1712 - 2290 502 ## COG2878 Predicted NADH:ubiquinone oxidoreductase, subunit RnfB 3 1 Op 3 . - CDS 2290 - 2871 568 ## COG4657 Predicted NADH:ubiquinone oxidoreductase, subunit RnfA 4 1 Op 4 . - CDS 2948 - 3388 326 ## SSON_1532 hypothetical protein - Prom 3412 - 3471 4.1 5 2 Tu 1 . - CDS 3474 - 3689 235 ## G2583_2020 OriC-binding nucleoid-associated protein - Prom 3825 - 3884 3.7 - Term 3900 - 3947 6.9 6 3 Tu 1 . - CDS 3962 - 4117 100 ## EC55989_1792 beta-lactam resistance membrane protein + Prom 4217 - 4276 4.5 7 4 Tu 1 . 
+ CDS 4330 - 5370 950 ## COG0673 Predicted dehydrogenases and related proteins Predicted protein(s) >gi|296493130|gb|ADTK01000371.1| GENE 1 3 - 1719 1199 572 aa, chain - ## HITS:1 COG:ydgN KEGG:ns NR:ns ## COG: ydgN COG4656 # Protein_GI_number: 16129587 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Escherichia coli K12 # 1 572 1 572 740 1048 100.0 0 MLKLFSAFRKNKIWDFNGGIHPPEMKTQSNGTPLRQVPLAQRFVIPLKQHIGAEGELCVS VGDKVLRGQPLTRGRGKMLPVHAPTSGTVTAIAPHSTAHPSALAELSVIIDADGEDCWIP RDGWADYRTRSREELIERIHQFGVAGLGGAGFPTGVKLQGGGDKIETLIINAAECEPYIT ADDRLMQDCAAQVVEGIRILAHILQPREILIGIEDNKPQAISMLRAVLADSNDISLRVIP TKYPSGGAKQLTYILTGKQVPHGGRSSDIGVLMQNVGTAYAVKRAVIDGEPITERVVTLT GEAIARPGNVWARLGTPVRHLLNDAGFCPSADQMVIMGGPLMGFTLPWLDVPVVKITNCL LAPSANELGEPQEEQSCIRCSACADACPADLLPQQLYWFSKGQQHDKATTHNIADCIECG ACAWVCPSNIPLVQYFRQEKAEIAAIRQEEKRAAEAKARFEARQARLEREKAARLERHKS AAVQPAAKDKDAIAAALARVKEKQAQATQPIVIKAGERPDNSAIIAAREARKAQARAKQA ELQQTNDAATVADPRKTAVEAAIARAKARKLE >gi|296493130|gb|ADTK01000371.1| GENE 2 1712 - 2290 502 192 aa, chain - ## HITS:1 COG:ECs2337 KEGG:ns NR:ns ## COG: ECs2337 COG2878 # Protein_GI_number: 15831591 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfB # Organism: Escherichia coli O157:H7 # 1 192 1 192 192 366 100.0 1e-101 MNAIWIAVAAVSLLGLAFGAILGYASRRFAVEDDPVVEKIDEILPQSQCGQCGYPGCRPY AEAISCNGEKINRCAPGGEAVMLKIAELLNVEPQPLDGEAQELTPARMVAVIDENNCIGC TKCIQACPVDAIVGATRAMHTVMSDLCTGCNLCVDPCPTHCISLQPVAETPDSWKWDLNT IPVRIIPVEHHA >gi|296493130|gb|ADTK01000371.1| GENE 3 2290 - 2871 568 193 aa, chain - ## HITS:1 COG:ECs2336 KEGG:ns NR:ns ## COG: ECs2336 COG4657 # Protein_GI_number: 15831590 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfA # Organism: Escherichia coli O157:H7 # 1 193 1 193 193 285 100.0 3e-77 MTDYLLLFVGTVLVNNFVLVKFLGLCPFMGVSKKLETAMGMGLATTFVMTLASICAWLID 
TWILIPLNLIYLRTLAFILVIAVVVQFTEMVVRKTSPVLYRLLGIFLPLITTNCAVLGVA LLNINLGHNFLQSALYGFSAAVGFSLVMVLFAAIRERLAVADVPAPFRGNAIALITAGLM SLAFMGFSGLVKL >gi|296493130|gb|ADTK01000371.1| GENE 4 2948 - 3388 326 146 aa, chain - ## HITS:1 COG:no KEGG:SSON_1532 NR:ns ## KEGG: SSON_1532 # Name: not_defined # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 146 9 154 154 204 100.0 1e-51 MTTTTPQRIGGWLLGPLAWLLVALLSTTLALLLYTAALSSPQTFQTLGGQALTTQILWGV SFITAIAMWYYTLWLTIAFFKRRRCVPKHYIIWLLISVLLAVKAFAFSPVEDGIAVRQLL FTLLATALIVPYFKRSSRVKATFVNP >gi|296493130|gb|ADTK01000371.1| GENE 5 3474 - 3689 235 71 aa, chain - ## HITS:1 COG:no KEGG:G2583_2020 NR:ns ## KEGG: G2583_2020 # Name: cnu # Def: OriC-binding nucleoid-associated protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 71 1 71 71 133 100.0 3e-30 MTVQDYLLKFRKISSLESLEKLYDHLNYTLTDDQELINMYRAADHRRAELVSGGRLFDLG QVPKSVWHYVQ >gi|296493130|gb|ADTK01000371.1| GENE 6 3962 - 4117 100 51 aa, chain - ## HITS:1 COG:no KEGG:EC55989_1792 NR:ns ## KEGG: EC55989_1792 # Name: blr # Def: beta-lactam resistance membrane protein # Organism: E.coli_55989 # Pathway: not_defined # 1 51 16 66 66 92 100.0 6e-18 MDQSREMWAVMNRLIELTGWIVLVVSVILLGVASHIDNYQPPEQSASVQHK >gi|296493130|gb|ADTK01000371.1| GENE 7 4330 - 5370 950 346 aa, chain + ## HITS:1 COG:ECs2332 KEGG:ns NR:ns ## COG: ECs2332 COG0673 # Protein_GI_number: 15831586 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Escherichia coli O157:H7 # 1 346 14 359 359 684 100.0 0 MSDNIRVGLIGYGYASKTFHAPLIAGTPGLELAVISSSDETKVKADWPTVTVVSEPKHLF NDPNIDLIVIPTPNDTHFPLAKAALEAGKHVVVDKPFTVTLSQARELDALAKSLGRVLSV FHNRRWDSDFLTLKGLLAEGVLGEVAYFESHFDRFRPQVRDRWREQGGPGSGIWYDLAPH LLDQAITLFGLPVSMTVDLAQLRPGAQSTDYFHAILSYPQRRVILHGTMLAAAESARYIV HGSRGSYVKYGLDPQEERLKNGERLPQEDWGYDMRDGVLTRVEGEERVEETLLTVPGNYP AYYAAIRDALNGDGENPVPASQAIQVMELIELGIESAKHRATLCLA Prediction of potential genes in microbial genomes Time: Mon May 16 16:17:28 2011 Seq name: 
gi|296493129|gb|ADTK01000372.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1237.2, whole genome shotgun sequence Length of sequence - 9438 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 5, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 22 - 1023 1116 ## COG1816 Adenosine deaminase - Prom 1053 - 1112 6.5 - Term 1074 - 1124 7.1 2 2 Op 1 3/1.000 - CDS 1127 - 2299 1015 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities 3 2 Op 2 . - CDS 2309 - 3901 1534 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific - Prom 3996 - 4055 5.3 + Prom 3967 - 4026 7.4 4 3 Tu 1 . + CDS 4076 - 5104 789 ## COG1609 Transcriptional regulators + Prom 5137 - 5196 2.0 5 4 Op 1 11/0.000 + CDS 5216 - 5983 244 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 + Prom 6131 - 6190 4.9 6 4 Op 2 2/1.000 + CDS 6215 - 6802 588 ## COG1309 Transcriptional regulator + Term 6814 - 6853 5.0 + Prom 7028 - 7087 3.6 7 5 Op 1 1/1.000 + CDS 7191 - 9002 1719 ## COG3250 Beta-galactosidase/beta-glucuronidase 8 5 Op 2 . 
+ CDS 8999 - 9436 282 ## PROTEIN SUPPORTED gi|90020673|ref|YP_526500.1| ribosomal protein L9 Predicted protein(s) >gi|296493129|gb|ADTK01000372.1| GENE 1 22 - 1023 1116 333 aa, chain - ## HITS:1 COG:add KEGG:ns NR:ns ## COG: add COG1816 # Protein_GI_number: 16129581 # Func_class: F Nucleotide transport and metabolism # Function: Adenosine deaminase # Organism: Escherichia coli K12 # 1 333 1 333 333 652 99.0 0 MIDTTLPLTDIHRHLDGNIRPQTILELGRQYNISLPAQSLETLIPHVQVIANEPDLVSFL AKLDWGVKVLASLDACRRVAFENIEDAARHGLHYVELRFSPGYMAMAHQLPVAGVVEAVI DGVREGCRTFGVQAKLIGIMSRTFGEAACQQELEAFLAHRDQITALDLAGDELGFPGSLF LSHFNRARDAGWHITVHAGEAAGPESIWQAIRELGAERIGHGVKAIEDRALMDFLAEQQI GIESCLTSNIQTSTVAELAAHPLKTFLEHGIRASINTDDPGVQGVDIIHEYTVAAPAAGL SREQIRQAQINGLEMAFLSAEEKRALREKVAAK >gi|296493129|gb|ADTK01000372.1| GENE 2 1127 - 2299 1015 390 aa, chain - ## HITS:1 COG:malY KEGG:ns NR:ns ## COG: malY COG1168 # Protein_GI_number: 16129580 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Escherichia coli K12 # 1 390 1 390 390 821 99.0 0 MFDFSKVVDRHGTWCTQWDYVADRFGTADLLPFTISDMDFATAPCIIEALNQRLMHGVFG YSRWKNDEFLAAIAHWFSTQHYTAIDTQTVVYGPSVIYMVSELIRQWSETGEGVVIHTPA YDAFYKAIEGNQRTVMPVALEKQADGWFCDMGKLEAVLAKPECKIMLLCSPQNPTGKVWT CDELEIMADLCERHGVRVISDEIHMDMVWGEQPHIPWSNVARGDWALLTSGSKSFNIPAL TGAYGIIENSSSRDAYLSALKGRDGLSSPSVLALTAHIAAYQQGAPWLDALRIYLKDNLT YIADKMNAAFPELNWQIPQSTYLAWLDLRPLNIDDNALQKALIEQEKVAIMPGYTYGEEG RGFVRLNAGCPRSKLEKGVAGLINAIRAVR >gi|296493129|gb|ADTK01000372.1| GENE 3 2309 - 3901 1534 530 aa, chain - ## HITS:1 COG:malX_1 KEGG:ns NR:ns ## COG: malX_1 COG1263 # Protein_GI_number: 16129579 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Escherichia coli K12 # 1 450 1 450 450 846 99.0 0 MTAKTAPKVTLWEFFQQLGKTFMLPVALLSFCGIMLGIGSSLSSHDVITLIPVLGNPVLQ 
AIFTWMSKIGSFAFSFLPVMFCIAIPLGLARENKGVAAFAGFVGYAVMNLAVNFWLTNKG ILPTTDAAVLKANNIQSILGIQSIDTGILGAVIAGIIVWMLHERFHNIRLPDALAFFGGT RFVPIISSLVMGLVGLVIPLVWPIFAMGISGLGHMINSAGDFGPMLFGTGERLLLPFGLH HILVALIRFTDAGGTQEVCGQTVSGALTIFQAQLNCPTTHGFSESATRFLSQGKMPAFLG GLPGAALAMYHCARPENRHKIKGLLISGLIACVVGGTTEPLEFLFLFVAPVLYVIHALLT GLGFTVMSVLGVTIGNTDGNIIDFVVFGILHGLSTKWYMVPVVAAIWFVVYYVIFRFAIT RFNLKTPGRDSEVASSIEKAVAGAPGKSGYNVPAILEALGGADNIVSLDNCITRLRLSVK DMSLVNVQALKDNRAIGVVQLNQHNLQVVIGPQVQSVKDEMAGLMHTVQA >gi|296493129|gb|ADTK01000372.1| GENE 4 4076 - 5104 789 342 aa, chain + ## HITS:1 COG:ECs2328 KEGG:ns NR:ns ## COG: ECs2328 COG1609 # Protein_GI_number: 15831582 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 342 1 342 342 644 100.0 0 MATAKKITIHDVALAAGVSVSTVSLVLSGKGRISTATGERVNAAIEELGFVRNRQASALR GGQSGVIGLIVRDLSAPFYAELTAGLTEALEAQGRMVFLLHGGKDGEQLAQRFSLLLNQG VDGVVIAGAAGSSDDLRRMAEEKAIPVIFASRASYLDDVDTVRPDNMQAAQLLTEHLIRN GHQRIAWLGGQSSSLTRAERVGGYCATLLKFGLPFHSDWVLECTSSQKQAAEAITALLRH NPTISAVVCYNETIAMGAWFGLLKAGRQSGESGVDRYFEQQVSLAAFTDATPTTLDDIPV TWASTPARELGTTLADRMMQKITHEETHSRNLIIPARLIAAK >gi|296493129|gb|ADTK01000372.1| GENE 5 5216 - 5983 244 255 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 12 248 4 238 242 98 29 2e-20 MFNSDNLRLDGKCAIITGAGAGIGKEIAITFATAGASVVVSDINADAANHVVDEIQQLGG QAFACRCDITSEQELSALADFAISKLGKVDILVNNAGGGGPKPFDMPMADFRRAYELNVF SFFHLSQLVAPEMEKNGGGVILTITSMAAENKNINMTSYASSKAAASHLVRNMAFDLGEK NIRVNGIAPGAILTDALKSVITPEIEQKMLQHTPIRRLGQPQDIANAALFLCSPAASWVS GQILTVSGGGVQELN >gi|296493129|gb|ADTK01000372.1| GENE 6 6215 - 6802 588 195 aa, chain + ## HITS:1 COG:ECs2326 KEGG:ns NR:ns ## COG: ECs2326 COG1309 # Protein_GI_number: 15831580 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 195 2 196 196 359 100.0 2e-99 MDNMQTEAQPTRTRILNAAREIFSENGFHSASMKAICKSCAISPGTLYHHFISKEALIQA 
IILQDQERALARFREPIEGIHFVDYMVESIVSLTHEAFGQRALVVEIMAEGMRNPQVAAM LKNKHMTITEFVAQRMRDAQQKGEISPDINTAMTSRLLLDLTYGVLADIEAEDLAREASF AQGLRAMIGGILTAS >gi|296493129|gb|ADTK01000372.1| GENE 7 7191 - 9002 1719 603 aa, chain + ## HITS:1 COG:uidA KEGG:ns NR:ns ## COG: uidA COG3250 # Protein_GI_number: 16129575 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Escherichia coli K12 # 1 603 1 603 603 1268 99.0 0 MLRPVETPTREIKKLDGLWAFSLDRENCGIDQRWWESALQESRAIAVPGSFNDQFADADI RNYVGNVWYQREVFIPKGWAGQRIVLRFDAVTHYGKVWVNNQEVMEHQGGYTPFEADVTP YVIAGKSVRITVCVNNELNWQTIPPGMVITDENGKKKQSYFHDFFNYAGIHRSVMLYTTP NTWVDDITVVTHVAQDCNHASVDWQVVANGDVSVELRDADQQVVATGQGTSGTLQVVNPH LWQPGEGYLYELCVTAKSQTECDIYPLRVGIRSVAVKGEQFLINHKPFYFTGFGRHEDAD LRGKGFDNVLMVHDHALMDWIGANSYRTSHYPYAEEMLDWADEHGIVVIDETAAVGFNLS LGIGFEAANKPKELYSEEAVNGETQQAHLQAIKELIARDKNHPSVVMWSIANEPDTRPQG AREYFAPLAEATRKLDPTRPITCVNVMFCDAHTDTISDLFDVLCLNRYYGWYVQSGDLET AEKVLEKELLAWQEKLHQPIIITEYGVDTLAGLHSMYTDMWSEEYQCAWLDMYHRVFDRV SAVVGEQVWNFADFATSQGILRVGGNKKGIFTRDRKPKSAAFLLQKRWTGMNFGEKPQQG GKQ >gi|296493129|gb|ADTK01000372.1| GENE 8 8999 - 9436 282 146 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020673|ref|YP_526500.1| ribosomal protein L9 [Saccharophagus degradans 2-40] # 3 138 7 141 522 113 41 6e-25 MNQQLSWRTIVGYSLGDVANNFAFAMGALFLLSYYTDVAGVGAAAAGTMLLLVRVFDAFA DVFAGRVVDSVNTRWGKFRPFLLFGTAPLMIFSVLVFWVPTDWSHGSKVVYAYLTYMGLG LCYSLVNIPYGSLATAMTHASCLSMS Prediction of potential genes in microbial genomes Time: Mon May 16 16:17:28 2011 Seq name: gi|296493128|gb|ADTK01000373.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1237.3, whole genome shotgun sequence Length of sequence - 402 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 401 293 ## COG2211 Na+/melibiose symporter and related transporters Predicted protein(s) >gi|296493128|gb|ADTK01000373.1| GENE 1 2 - 401 293 133 aa, chain + ## HITS:1 COG:ECs2323 KEGG:ns NR:ns ## COG: ECs2323 COG2211 # Protein_GI_number: 15831577 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli O157:H7 # 1 133 149 281 457 238 99.0 2e-63 GARGIAASLTFVCLAFLIGPSIKNSSPEEMVSVYHFWTIVLAIAGMVLYFICFKSTRENV VRIVAQPSLKISLQTLKRNRPLFMLCIGALCVLISTFAVSASSLFYVRYVLNDTGLFTVL VLVQNLVGTVASA Prediction of potential genes in microbial genomes Time: Mon May 16 16:17:31 2011 Seq name: gi|296493127|gb|ADTK01000374.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1237.4, whole genome shotgun sequence Length of sequence - 6546 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 14 - 529 476 ## COG2211 Na+/melibiose symporter and related transporters 2 1 Op 2 . + CDS 586 - 1833 972 ## JW1607 predicted outer membrane porin protein + Term 1841 - 1876 4.0 - Term 1828 - 1864 5.0 3 2 Op 1 4/0.000 - CDS 1878 - 3386 1746 ## COG5339 Uncharacterized protein conserved in bacteria - Prom 3426 - 3485 1.9 - Term 3396 - 3429 4.1 4 2 Op 2 . - CDS 3487 - 4662 1261 ## COG1482 Phosphomannose isomerase - Prom 4705 - 4764 4.3 + Prom 4732 - 4791 5.2 5 3 Tu 1 . 
+ CDS 4861 - 6507 486 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase + Term 6514 - 6546 4.9 Predicted protein(s) >gi|296493127|gb|ADTK01000374.1| GENE 1 14 - 529 476 171 aa, chain + ## HITS:1 COG:ECs2323 KEGG:ns NR:ns ## COG: ECs2323 COG2211 # Protein_GI_number: 15831577 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli O157:H7 # 1 171 287 457 457 324 99.0 5e-89 MVARIGKKNTFLIGALLGTCGYLLFFWVSVWSLPVALVALAIASIGQGVTMTVMWALEAD TVEYGEYLTGVRIEGLTYSLFSFTRKCGQAIGGSIPAFILGLSGYIANQVQTPEVIMGIR TSIALVPCGFMLLAFVIIWFYPLTDKKFKEIVVEIDNRKKMQQQLISDITN >gi|296493127|gb|ADTK01000374.1| GENE 2 586 - 1833 972 415 aa, chain + ## HITS:1 COG:no KEGG:JW1607 NR:ns ## KEGG: JW1607 # Name: uidC # Def: predicted outer membrane porin protein # Organism: E.coli_J # Pathway: not_defined # 1 415 7 421 421 810 99.0 0 MAVICLTAASGLTSAYAAQLADDEAGLRIRLKNELRRADKPSAGAGRDIYAWVQGGLLDF NSGYYSNIIGVEGGAYYVYKLGARADMSTRWYLDGDKSFGFALGAVKIKPSENSLLKLGR FGTDYSYGSLPYRIPLMAGSSQRTLPTVSEGALGYWALTPNIDLWGMWRSRVFLWTDSTT GIRDEGVYNSQTGKYDKHRARSFLAASWHDDTSRYSLGASVQKDVSNQIQSILEKSIPLD PNYTLKGELLGFYAQLEGLSRNTSQPNETALVSGQLTWNAPWGSVFGSGGYLRHAMNGAV VDTDIGYPFSLSLDRNREGMQSWQLGVNYRLTPQFTLTFAPIVTRGYESSKRDVRIEGTG ILGGMNYRVSEGPLQGMNFFLAADKGREKRDGSALGDRLNYWDVKMSIQYDFMLK >gi|296493127|gb|ADTK01000374.1| GENE 3 1878 - 3386 1746 502 aa, chain - ## HITS:1 COG:ydgA KEGG:ns NR:ns ## COG: ydgA COG5339 # Protein_GI_number: 16129572 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 502 1 502 502 917 99.0 0 MNKSLVAVGVIVALGVVWTGGAWYTGKKIETHLEDMVAQANAQLKLTAPESNLEVSYQNY HRGVFSSQLQLLVKPIAGKENPWIKSGQSVIFNESVDHGPFPLAQLKKLNLIPSMASIQT TLVNNEVSKPLFDMAKGETPFEINSRIGYSGDSSSDISLNPLNYEQKDEKVAFSGGEFQL NADRDGKAISLSGEAQSGRIDAVNEYNQKVQLTFNNLKTDGSSTLASFGERVGNQKLSLE KMTISVEGKELALLEGMEISGKSDLVNDGKTINSQLDYSLNSLKVQNQDLGSGKLTLKVG 
QIDGEAWHQFSQQYNAQTQALLAQPEIANNPELYQEKVTEAFFSALPLMLKGDPVITIAP LSWKNSQGESALNLSLFLKDPATTKEAPQTLAQEVDRSVKSLDAKLTIPVDMATEFMTQV AKLEGYQEDQAKKLAKQQVEGASAMGQMFRLTTLQDNTITTSLQYANGQITLNGQKMPLE DFVGMFAMPALNVPAVPAIPQQ >gi|296493127|gb|ADTK01000374.1| GENE 4 3487 - 4662 1261 391 aa, chain - ## HITS:1 COG:manA KEGG:ns NR:ns ## COG: manA COG1482 # Protein_GI_number: 16129571 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Escherichia coli K12 # 1 391 1 391 391 741 98.0 0 MQKLINSVQNYAWGSKTALTELYGMENPSSQPMAELWMGAHPKSSSRVQNAAGDIVSLRD VIESDKSTLLGEAVAKRFGELPFLFKVLCAAQPLSIQVHPNKHNSEIGFAKENAAGIPMD AAERNYKDPNHKPELVFALTPFLAMNAFREFSEIVSLLQPVAGAHPAIAHFLQQPNAERL SELFASLLNMQGEEKSRALAILKSALDSQQGEPWQTIRLISEFYPEDSGLFSPLLLNVVK LNPGEAMFLFAETPHAYLQGVALEVMANSDNVLRAGLTPKYIDIPELVANVKFEAKPANQ LLTQPVKQGAELDFSIPVDDFAFSLHDLSDKETTISQQSAAILFCVEGDATLCQGSQQLQ LKPGESAFIAANESPVTVKGHGRLARVYNKL >gi|296493127|gb|ADTK01000374.1| GENE 5 4861 - 6507 486 548 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 76 531 38 482 508 191 33 1e-48 MSNKPFHYQAPFPLKKDDTEYYLLTSEHVSVSEFEGQEILKVAPEALTLLARQAFHDASF MLRPAHQQQVADILRDPEASENDKYVALQFLRNSDIAAKGVLPTCQDTGTAIIVGKKGQR VWTGGGDEAALARGVYNTYIEDNLRYSQNAPLDMYKEVNTGTNLPAQIDLYAVDGDEYKF LCIAKGGGSANKTYLYQETKALLTPGKLKNYLVEKMRTLGTAACPPYHIAFVIGGTSAET NLKTVKLASAKYYDELPTEGNEHGQAFRDVELEKELLIEAQNLGLGAQFGGKYFAHDIRV IRLPRHGASCPVGMGVSCSADRNIKAKINRQGIWIEKLEHNPGKYIPEELRKAGEGEAVR VDLNRPMKEILAQLSQYPVSTRLSLNGTIIVGRDIAHAKLKERMDNGEGLPQYIKDHPIY YAGPAKTPEGYASGSLGPTTAGRMDSYVDQLQAQGGSMIMLAKGNRSQQVTDACKKHGGF YLGSIGGPAAVLAQGSIKSLECVEYPELGMEAIWKIEVEDFPAFILVDDKGNDFFQQIQL TQCTRCVK Prediction of potential genes in microbial genomes Time: Mon May 16 16:17:39 2011 Seq name: gi|296493126|gb|ADTK01000375.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1237.5, whole genome shotgun sequence Length of sequence - 7068 bp Number of predicted genes - 7, with homology - 7 Number of transcription 
units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 22 - 81 2.3 1 1 Tu 1 . + CDS 129 - 1532 1537 ## COG0114 Fumarase + Term 1542 - 1599 1.5 2 2 Op 1 . - CDS 1529 - 2458 673 ## EcSMS35_1588 DNA replication terminus site-binding protein 3 2 Op 2 40/0.000 - CDS 2534 - 3835 1218 ## COG0642 Signal transduction histidine kinase 4 2 Op 3 . - CDS 3839 - 4558 532 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 4639 - 4698 5.6 + Prom 4598 - 4657 5.1 5 3 Tu 1 . + CDS 4687 - 5022 347 ## COG3136 Uncharacterized membrane protein required for alginate biosynthesis 6 4 Op 1 7/0.000 - CDS 5019 - 5741 697 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 7 4 Op 2 . - CDS 5778 - 6992 1199 ## COG0531 Amino acid transporters Predicted protein(s) >gi|296493126|gb|ADTK01000375.1| GENE 1 129 - 1532 1537 467 aa, chain + ## HITS:1 COG:ECs2317 KEGG:ns NR:ns ## COG: ECs2317 COG0114 # Protein_GI_number: 15831571 # Func_class: C Energy production and conversion # Function: Fumarase # Organism: Escherichia coli O157:H7 # 1 467 1 467 467 900 100.0 0 MNTVRSEKDSMGAIDVPADKLWGAQTQRSLEHFRISTEKMPTSLIHALALTKRAAAKVNE DLGLLSEEKASAIRQAADEVLAGQHDDEFPLAIWQTGSGTQSNMNMNEVLANRASELLGG VRGMERKVHPNDDVNKSQSSNDVFPTAMHVAALLALRKQLIPQLKTLTQTLSEKSRAFAD IVKIGRTHLQDATPLTLGQEISGWVAMLEHNLKHIEYSLPHVAELALGGTAVGTGLNTHP EYARRVADELAVITCAPFVTAPNKFEALATCDALVQAHGALKGLAASLMKIANDVRWLAS GPRCGIGEISIPENEPGSSIMPGKVNPTQCEALTMLCCQVMGNDVAINMGGASGNFELNV FRPMVIHNFLQSVRLLADGMESFNKHCAVGIEPNRERINQLLNESLMLVTALNTHIGYDK AAEIAKKAHKEGLTLKAAALALGYLSEAEFDSWVRPEQMVGSMKAGR >gi|296493126|gb|ADTK01000375.1| GENE 2 1529 - 2458 673 309 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_1588 NR:ns ## KEGG: EcSMS35_1588 # Name: tus # Def: DNA replication terminus site-binding protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 309 5 313 313 587 100.0 1e-166 
MARYDLVDRLNTTFRQMEQELAAFAAHLEQHKLLVARVFSLPEVKKEDEHNPLNRIEVKQ HLGNDAQSLALRHFRHLFIQQQSENRSSKAAVRLPGVLCYQVDNLSQAALVSHIQHINKL KTTFEHIVTVESELPTAARFEWVHRHLPGLITLNAYRTLTVLHDPATLRFGWANKHIIKN LHRDEVLAQLEKSLKSPRSVAPWTREEWQRKLEREYQDIAALPQNAKLKIKRPVKVQPIA RVWYKGDQKQVQHACPTPLIALINRDNGAGVPDVGELLNYDADNVQHRYKPQAQPLRLII PRLHLYVAD >gi|296493126|gb|ADTK01000375.1| GENE 3 2534 - 3835 1218 433 aa, chain - ## HITS:1 COG:rstB KEGG:ns NR:ns ## COG: rstB COG0642 # Protein_GI_number: 16129567 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 433 1 433 433 874 99.0 0 MKKLFIQFYLLLFVCFLVMSLLVGLVYKFTAERAGKQSLDDLMNSSLYLMRSELREIPPH DWGKTLKEMDLNLSFDLRVEPLSKYHLDDISMHRLRGGEIVALDDQYTFLQRIPRSHYVL AVGPVPYLYYLHQMRLLDIALIAFIAISLAFPVFIWMRPHWQDMLKLEAAAQRFGDGHLN ERIHFDEGSSFERLGIAFNQMADNINALIASKKQLIDGIAHELRTPLVRLRYRLEMSDNL SAAESQALNRDISQLEALIEELLTYARLDRPQNELHLSEPDLPLWLSTHLADIQAVTPDK TVRIKTLVQGHYAALDMRLMERVLDNLLNNALRYCHSTVETSLLLSGNRATLIVEDDGPG IAPENREHIFEPFVRLDPSRDRSTGGCGLGLAIVHSIALAMGGTVNCDTSELGGARFSFS WPLWHNIPQFTSA >gi|296493126|gb|ADTK01000375.1| GENE 4 3839 - 4558 532 239 aa, chain - ## HITS:1 COG:rstA KEGG:ns NR:ns ## COG: rstA COG0745 # Protein_GI_number: 16129566 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli K12 # 1 239 4 242 242 447 100.0 1e-125 MNTIVFVEDDAEVGSLIAAYLAKHDMQVTVEPRGDQAEETILRENPDLVLLDIMLPGKDG MTICRDLRAKWSGPIVLLTSLDSDMNHILALEMGACDYILKTTPPAVLLARLRLHLRQNE QATLTKGLQETSLTPYKALHFGTLTIDPINRVVTLANTEISLSTADFELLWELATHAGQI MDRDALLKNLRGVSYDGLDRSVDVAISRLRKKLLDNAAEPYRIKTVRNKGYLFAPHAWE >gi|296493126|gb|ADTK01000375.1| GENE 5 4687 - 5022 347 111 aa, chain + ## HITS:1 COG:ECs2313 KEGG:ns NR:ns ## COG: ECs2313 COG3136 # Protein_GI_number: 15831567 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein required for alginate 
biosynthesis # Organism: Escherichia coli O157:H7 # 1 111 1 111 111 165 100.0 2e-41 MGLVIKAALGALVVLLIGVLAKTKNYYIAGLIPLFPTFALIAHYIVASERGIEALRATII FSMWSIIPYFVYLVSLWYFTGMMRLPAAFVGSVACWGISAWVLIICWIKLH >gi|296493126|gb|ADTK01000375.1| GENE 6 5019 - 5741 697 240 aa, chain - ## HITS:1 COG:ECs2312 KEGG:ns NR:ns ## COG: ECs2312 COG1028 # Protein_GI_number: 15831566 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Escherichia coli O157:H7 # 1 240 1 240 240 489 99.0 1e-138 MGKAQPLPILITGGGRRIGLALAWHFINQKQPVIVSYRTHYPAIDGLINAGAQCIQADFS TNDGVMAFADEVLKSTHGLRAILHNASAWMAEKPGAPLADVLACMMQIHVNTPYLLNHAL ERLLRGHGHAASDIIHFTDYVVERGSDKHVAYAASKAALDNMTRSFARKLAPEVKVNSIA PSLILFNEHDDAEYRQQALNKSLMKTAPGEKEVIDLVDYLLTSCFVTGRSFPLDGGRHLR >gi|296493126|gb|ADTK01000375.1| GENE 7 5778 - 6992 1199 404 aa, chain - ## HITS:1 COG:ydgI KEGG:ns NR:ns ## COG: ydgI COG0531 # Protein_GI_number: 16129563 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 404 57 460 460 702 100.0 0 MLILTRIRPELDGGIFTYAREGFGELIGFCSAWGYWLCAVIANVSYLVIVFSALSFFTDT PELRLFGDGNTWQSIVGASALLWIVHFLILRGVQTAASINLVATLAKLLPLGLFVVLAMM MFKLDTFKLDFTGLALGVPVWEQVKNTMLITLWVFIGVEGAVVVSARARNKRDVGKATLL AVLSALGVYLLVTLLSLGVVARPELAEIRNPSMAGLMVEMMGPWGEIIIAAGLIVSVCGA YLSWTIMAAEVPFLAATHKAFPRIFARQNAQAAPSASLWLTNICVQICLVLIWLTGSDYN TLLTIASEMILVPYFLVGAFLLKIATRPLHKAVGVGACIYGLWLLYASGPMHLLLSVVLY APGLLVFLYARKTHTHDNVLNRQEMVLIGMLLIASVPATWMLVG Prediction of potential genes in microbial genomes Time: Mon May 16 16:17:45 2011 Seq name: gi|296493125|gb|ADTK01000376.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1237.6, whole genome shotgun sequence Length of sequence - 4709 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 
2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 239 - 267 1.3 1 1 Tu 1 . - CDS 288 - 1232 1057 ## EC55989_1769 hypothetical protein - Prom 1351 - 1410 5.0 + Prom 1611 - 1670 4.5 2 2 Op 1 17/0.000 + CDS 1756 - 3288 1542 ## COG3288 NAD/NADP transhydrogenase alpha subunit 3 2 Op 2 . + CDS 3299 - 4687 1424 ## COG1282 NAD/NADP transhydrogenase beta subunit Predicted protein(s) >gi|296493125|gb|ADTK01000376.1| GENE 1 288 - 1232 1057 314 aa, chain - ## HITS:1 COG:no KEGG:EC55989_1769 NR:ns ## KEGG: EC55989_1769 # Name: ydgH # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 314 1 314 314 520 100.0 1e-146 MKLKNTLLASALLSATAFSVNAATELTPEQAAAVKPFDRVVVTGRFNAIGEAVKAVSRRA DKEGAASFYVVDTSDFGNSGNWRVVADLYKADAEKAEETSNRVINGVVELPKDQAVLIEP FDTVTVQGFYRSQPEVNDAITKAAKAKGAYSFYIVRQIDANQGGNQRITAFIYKKDAKKR IVQSPDVIPADSEAGRAALAAGGEAAKKVEIPGVATTASPSSEVGRFFETQSSKGGRYTV TLPDGTKVEELNKATAAMMVPFDSIKFSGNYGNMTEVSYQVAKRAAKKGAKYYHITRQWQ ERGNNLTVSADLYK >gi|296493125|gb|ADTK01000376.1| GENE 2 1756 - 3288 1542 510 aa, chain + ## HITS:1 COG:ECs2309 KEGG:ns NR:ns ## COG: ECs2309 COG3288 # Protein_GI_number: 15831563 # Func_class: C Energy production and conversion # Function: NAD/NADP transhydrogenase alpha subunit # Organism: Escherichia coli O157:H7 # 1 510 1 510 510 974 99.0 0 MRIGIPRERLTNETRVAATPKTVEQLLKLGFTVAVESGAGQLASFDDKAFVQAGAEIVEG NSVWQSEIILKVNAPLDDEIALLNPGTTLVSFIWPAQNPELMQKLAERNVTVMAMDSVPR ISRAQSLDALSSMANIAGYRAIVEAAHEFGRFFTGQITAAGKVPPAKVMVIGAGVAGLAA IGAANSLGAIVRAFDTRPEVKEQVQSMGAEFLELDFKEEAGSGDGYAKVMSDAFIKAEME LFAAQAKEVDIIVTTALIPGKPAPKLITREMVDSMKAGSVIVDLAAQNGGNCEYTVPGEI FTTENGVKVIGYTDLPGRLPTQSSQLYGTNLVNLLKLLCKEKDGNITVDFDDVVIRGVTV IRAGEITWPAPPIQVSAQPQAAQKAAPEVKTEEKCACSPWRKYALMALAIILFGWLASVA PKEFLGHFTVFALACVVGYYVVWNVSHALHTPLMSVTNAISGIIVVGALLQIGQGGWVSF LSFIAVLIASINIFGGFTVTQRMLKMFRKN >gi|296493125|gb|ADTK01000376.1| GENE 3 3299 - 4687 1424 462 aa, chain + ## HITS:1 COG:ECs2308 KEGG:ns NR:ns ## COG: ECs2308 COG1282 # Protein_GI_number: 15831562 # Func_class: C Energy 
production and conversion # Function: NAD/NADP transhydrogenase beta subunit # Organism: Escherichia coli O157:H7 # 1 462 1 462 462 850 100.0 0 MSGGLVTAAYIVAAILFIFSLAGLSKHETSRQGNNFGIAGMAIALIATIFGPDTGNVGWI LLAMVIGGAIGIRLAKKVEMTEMPELVAILHSFVGLAAVLVGFNSYLHHDAGMAPILVNI HLTEVFLGIFIGAVTFTGSVVAFGKLCGKISSKPLMLPNRHKMNLAALVVSFLLLIVFVR TDSVGLQVLALLIMTAIALVFGWHLVASIGGADMPVVVSMLNSYSGWAAAAAGFMLSNDL LIVTGALVGSSGAILSYIMCKAMNRSFISVIAGGFGTDGSSTGDDQEVGEHREITAEETA ELLKNSHSVIITPGYGMAVAQAQYPVAEITEKLRARGINVRFGIHPVAGRLPGHMNVLLA EAKVPYDIVLEMDEINDDFADTDTVLVIGANDTVNPAAQDDPKSPIAGMPVLEVWKAQNV IVFKRSMNTGYAGVQNPLFFKENTHMLFGDAKASVDAILKAL Prediction of potential genes in microbial genomes Time: Mon May 16 16:17:50 2011 Seq name: gi|296493124|gb|ADTK01000377.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1237.7, whole genome shotgun sequence Length of sequence - 2145 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 9 - 1043 1097 ## COG0628 Predicted permease - Prom 1091 - 1150 4.7 + Prom 1192 - 1251 4.5 2 2 Op 1 12/0.000 + CDS 1455 - 1820 423 ## COG2076 Membrane transporters of cations and cationic drugs 3 2 Op 2 . 
+ CDS 1807 - 2136 484 ## COG2076 Membrane transporters of cations and cationic drugs Predicted protein(s) >gi|296493124|gb|ADTK01000377.1| GENE 1 9 - 1043 1097 344 aa, chain - ## HITS:1 COG:ydgG KEGG:ns NR:ns ## COG: ydgG COG0628 # Protein_GI_number: 16129559 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Escherichia coli K12 # 1 344 1 344 344 493 100.0 1e-139 MAKPIITLNGLKIVIMLGMLVIILCGIRFAAEIIVPFILALFIAVILNPLVQHMVRWRVP RVLAVSILMTIIVMAMVLLLAYLGSALNELTRTLPQYRNSIMTPLQALEPLLQRVGIDVS VDQLAHYIDPNAAMTLLTNLLTQLSNAMSSIFLLLLTVLFMLLEVPQLPGKFQQMMARPV EGMAAIQRAIDSVSHYLVLKTAISIITGLVAWAMLAALDVRFAFVWGLLAFALNYIPNIG SVLAAIPPIAQVLVFNGFYEALLVLAGYLLINLVFGNILEPRIMGRGLGLSTLVVFLSLI FWGWLLGPVGMLLSVPLTIIVKIALEQTAGGQSIAVLLSDLNKE >gi|296493124|gb|ADTK01000377.1| GENE 2 1455 - 1820 423 121 aa, chain + ## HITS:1 COG:ECs2306 KEGG:ns NR:ns ## COG: ECs2306 COG2076 # Protein_GI_number: 15831560 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane transporters of cations and cationic drugs # Organism: Escherichia coli O157:H7 # 1 121 1 121 121 180 99.0 5e-46 MYIYWILLGLAIATEITGTLSMKWASVSEGNGGFILMLVMISLSYIFLSFAVKKIALGVA YALWEGIGILFITLFSVLLFDESLSLMKIVGLTTLVAGIVLIKSGTRKARKPELEVNHGA V >gi|296493124|gb|ADTK01000377.1| GENE 3 1807 - 2136 484 109 aa, chain + ## HITS:1 COG:ECs2305 KEGG:ns NR:ns ## COG: ECs2305 COG2076 # Protein_GI_number: 15831559 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane transporters of cations and cationic drugs # Organism: Escherichia coli O157:H7 # 1 109 1 109 109 144 100.0 5e-35 MAQFEWVHAAWLALAIVLEIVANVFLKFSDGFRRKIFGLLSLAAVLAAFSALSQAVKGID LSVAYALWGGFGIAATLAAGWILFGQRLNRKGWIGLVLLLAGMIMVKLA Prediction of potential genes in microbial genomes Time: Mon May 16 16:17:57 2011 Seq name: gi|296493123|gb|ADTK01000378.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1237.8, whole genome shotgun sequence Length of sequence - 21782 bp Number of predicted genes - 21, with homology - 20 Number of 
transcription units - 13, operones - 5 average op.length - 2.6
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 36 - 857 512 ## COG3591 V8-like Glu-specific endopeptidase
- Prom 901 - 960 3.4
2 2 Tu 1 . - CDS 1133 - 1441 337 ##
- Prom 1578 - 1637 6.5
- Term 1817 - 1855 0.5
3 3 Tu 1 . - CDS 1865 - 3118 899 ## COG0477 Permeases of the major facilitator superfamily
- Prom 3186 - 3245 5.3
+ Prom 3142 - 3201 4.7
4 4 Op 1 5/0.333 + CDS 3225 - 4118 204 ## PROTEIN SUPPORTED gi|149913192|ref|ZP_01901726.1| 50S ribosomal protein L35
+ Prom 4156 - 4215 1.6
5 4 Op 2 3/0.333 + CDS 4253 - 5473 274 ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33
+ Term 5515 - 5579 6.5
+ Prom 5519 - 5578 4.7
6 5 Tu 1 . + CDS 5598 - 6293 797 ## COG0132 Dethiobiotin synthetase
+ Term 6337 - 6375 7.6
7 6 Tu 1 . - CDS 6246 - 7538 1113 ## COG0038 Chloride channel protein EriC
- Prom 7628 - 7687 4.2
8 7 Op 1 4/0.333 - CDS 7697 - 8311 547 ## COG3381 Uncharacterized component of anaerobic dehydrogenases
- Term 8316 - 8346 3.0
9 7 Op 2 9/0.333 - CDS 8354 - 9208 603 ## COG3302 DMSO reductase anchor subunit
10 7 Op 3 16/0.000 - CDS 9210 - 9827 368 ## COG0437 Fe-S-cluster-containing hydrogenase components 1
11 7 Op 4 5/0.333 - CDS 9838 - 12261 1745 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing
12 7 Op 5 . - CDS 12322 - 14748 1989 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing
- Prom 14781 - 14840 4.3
- Term 14895 - 14930 3.7
13 8 Tu 1 . - CDS 14947 - 15252 225 ## ECSE_1707 hypothetical protein
- Prom 15495 - 15554 2.8
+ Prom 15256 - 15315 3.4
14 9 Tu 1 . + CDS 15447 - 16070 521 ## ECBD_2061 hypothetical protein
+ Term 16081 - 16120 -0.7
15 10 Op 1 . - CDS 16073 - 16633 484 ## PROTEIN SUPPORTED gi|116490772|ref|YP_810316.1| acetyltransferase
16 10 Op 2 . - CDS 16668 - 17009 311 ## B21_01542 hypothetical protein
- Prom 17109 - 17168 4.6
+ Prom 16961 - 17020 4.6
17 11 Tu 1 .
+ CDS 17144 - 17470 272 ## COG1742 Uncharacterized conserved protein + Prom 17479 - 17538 4.0 18 12 Op 1 2/0.333 + CDS 17676 - 18890 1301 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 19 12 Op 2 . + CDS 18902 - 19921 955 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases + Term 20136 - 20176 -0.0 20 13 Op 1 . - CDS 20109 - 21389 408 ## COG0582 Integrase 21 13 Op 2 . - CDS 21424 - 21675 247 ## ECED1_1745 putative excisionase - Prom 21707 - 21766 1.6 Predicted protein(s) >gi|296493123|gb|ADTK01000378.1| GENE 1 36 - 857 512 273 aa, chain - ## HITS:1 COG:Z2592 KEGG:ns NR:ns ## COG: Z2592 COG3591 # Protein_GI_number: 15802012 # Func_class: E Amino acid transport and metabolism # Function: V8-like Glu-specific endopeptidase # Organism: Escherichia coli O157:H7 EDL933 # 1 273 1 273 273 516 99.0 1e-146 MRTTIAVVLGAISLTSAFVFADKPDVAKSANDEVSTLFFGHDDRVPVNDTTQSPWDAVGQ LETASGNLCTATLIAPNLALTAGHCLLTPPKGKADKAVALRFVSNKGLWRYDIHDIEGRV DPTLGKRLKADGDGWIVPPAAAPWDFGLIVLRNPPSGITPLPLFEGDKAALTAALKSAGR KVTQAGYPEDHLDTLYSHQNCEVTGWAQTSVMSHQCDTLPGDSGSPLMLHTDDGWQLIGV QSSAPAAKDRWRADNRAISVTGFRDKLDQLSQK >gi|296493123|gb|ADTK01000378.1| GENE 2 1133 - 1441 337 102 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKVLALVVAAAMGLSSAAFAAETATTPAPTATTTKAAPAKTTHHKKQHKAAPAQKAQAA KKHHKNTKAEQKAPEQKAQAAKKHAGKHSHQQPAKPAAQPAA >gi|296493123|gb|ADTK01000378.1| GENE 3 1865 - 3118 899 417 aa, chain - ## HITS:1 COG:ynfM KEGG:ns NR:ns ## COG: ynfM COG0477 # Protein_GI_number: 16129554 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 417 1 417 417 729 100.0 0 MSRTTTVDGAPASDTDKQSISQPNQFIKRGTPQFMRVTLALFSAGLATFALLYCVQPILP VLSQEFGLTPANSSISLSISTAMLAIGLLFTGPLSDAIGRKPVMVTALLLASICTLLSTM MTSWHGILIMRALIGLSLSGVAAVGMTYLSEEIHPSFVAFSMGLYISGNSIGGMSGRLIS 
GVFTDFFNWRIALAAIGCFALASALMFWKILPESRHFRPTSLRPKTLFINFRLHWRDRGL PLLFAEGFLLMGSFVTLFNYIGYRLMLSPWHVSQAVVGLLSLAYLTGTWSSPKAGTMTTR YGRGPVMLFSTGVMLFGLLMTLFSSLWLIFAGMLLFSAGFFAAHSVASSWIGPRAKRAKG QASSLYLFSYYLGSSIAGTLGGVFWHNYGWNGVGAFIALMLVIALLVGTRLHRRLHA >gi|296493123|gb|ADTK01000378.1| GENE 4 3225 - 4118 204 297 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149913192|ref|ZP_01901726.1| 50S ribosomal protein L35 [Roseobacter sp. AzwK-3b] # 3 245 1 239 305 83 26 1e-15 MNIELRHLRYFVAVAEELHFGRAAARLNISQPPLSQQIQALEQQIGARLLARTNRSVLLT AAGKQFLADSRQILSMVDDAAARAERLHQGEAGELRIGFTSSAPFIRAVSDTLSLFRRDY PDVHLQTREMNTREQIAPLIEGTLDMGLLRNTALPETLEHAVIVHEPLMAMIPHDHPLAN NPNVTLAELAKEPFVFFDPHVGTGLYDDILGLMRRYHLTPVITQEVGEAMTIIGLVSAGL GVSILPASFKRVQLNEMRWVPIAEEDAVSEMWLVWPKHHEQSPAARNFRIHLLNALR >gi|296493123|gb|ADTK01000378.1| GENE 5 4253 - 5473 274 406 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 146 394 63 317 323 110 28 1e-23 MVAENQPGHIDQIKQTNAGAVYRLIDQLGPVSRIDLSRLAQLAPASITKIVREMLEAHLV QELEIKEAGNRGRPAVGLVVETEAWHYLSLRISRGEIFLALRDLSSKLVVEESQELALKD DSPLLDRIISHIDQFFIRHQKKLERLTSIAITLPGIIDTENGIVHRMPFYEDVKEMPLGE ALEQHTGVPVYIQHDISAWTMAEALFGASRGARDVIQVVIDHNVGAGVITDGHLLHAGSS SLVEIGHTQVDPYGKRCYCGNHGCLETIASVDSILELAQLRLNQSMSSMLHGQPLTVDSL CQAALRGDLLAKDIITGVGAHVGRILAIMVNLFNPQKILIGSPLSKAADILFPVISDSIR QQALPAYSQHISVESTQFSNQGTMAGAALVKDAMYNGSLLIRLLQG >gi|296493123|gb|ADTK01000378.1| GENE 6 5598 - 6293 797 231 aa, chain + ## HITS:1 COG:ECs2299 KEGG:ns NR:ns ## COG: ECs2299 COG0132 # Protein_GI_number: 15831553 # Func_class: H Coenzyme transport and metabolism # Function: Dethiobiotin synthetase # Organism: Escherichia coli O157:H7 # 1 231 5 235 235 444 100.0 1e-125 MLKRFFITGTDTSVGKTVVSRALLQALASQGKTVAGYKPVAKGSKETPEGLRNKDALVLQ SVSTIELPYEAVNPIALSEEESSVAHSCPINYTLISNGLANLTEKVDHVVVEGTGGWRSL MNDLRPLSEWVVQEQLPVLMVVGIQEGCINHALLTAQAIANDGLPLIGWVANRINPGLAH YAEIIDVLGKKLPAPLIGELPYLPRAEQRELGQYIRLAMLRSVLAVDRVTV >gi|296493123|gb|ADTK01000378.1| GENE 7 6246 - 7538 
1113 430 aa, chain - ## HITS:1 COG:ynfJ KEGG:ns NR:ns ## COG: ynfJ COG0038 # Protein_GI_number: 16129550 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Escherichia coli K12 # 1 430 9 438 438 707 99.0 0 MHHLHIYPDLRTMFRRLLIATVVGILAAFAVAGFHHAMLLLEWLFLNNDSGSLVNAATNL SPWRRLLTPALGGLAAGLLLMGWQKFTQQRPHAPTDYMEALQTDGQFDYAASLVKSLASL LVVTSGSAIGREGAMILLAALAASCFAQRFTPRQEWKLWIACGAAAGMAAAYRAPLAGSL FIAEVLFGTMMLASLGPVIISAVVALLVSNLINHSDALLYNVQLSVTVQARDYALIISTG VLAGLCGPLLLTLMNACHRGFVGLKLAPPWQLALGGLIVGLLSLFTPAVWGNGYSTVQSF LTAPPLLMIIAGIFLCKLCAVLASSGSGAPGGVFTPTLFIGLAIGMLYGRSLGLWFPDGE EITLLLGLTGMATLLAATTHAPIMSTLMICEMTGEYQLLPGLLIACVIASVISRTLHRDS IYRQHTAQHS >gi|296493123|gb|ADTK01000378.1| GENE 8 7697 - 8311 547 204 aa, chain - ## HITS:1 COG:ECs2297 KEGG:ns NR:ns ## COG: ECs2297 COG3381 # Protein_GI_number: 15831551 # Func_class: R General function prediction only # Function: Uncharacterized component of anaerobic dehydrogenases # Organism: Escherichia coli O157:H7 # 1 204 4 207 207 371 100.0 1e-103 MTHFSQQDNFSVAARVLGALFYYAPESAEAAPLVAVLTSDGWETQWPLPEASLAPLVTAF QTQCEETHAQAWQRLFVGPWALPSPPWGSVWLDRESVLFGDSTLALRQWMREKGIQFEMK QNEPEDHFGSLLLMAAWLAENGRQTECEELLAWHLFPWSTRFLDVFIEKAEHPFYRALGE LARLTLAQWQSQLLIPVAVKPLFR >gi|296493123|gb|ADTK01000378.1| GENE 9 8354 - 9208 603 284 aa, chain - ## HITS:1 COG:ynfH KEGG:ns NR:ns ## COG: ynfH COG3302 # Protein_GI_number: 16129548 # Func_class: R General function prediction only # Function: DMSO reductase anchor subunit # Organism: Escherichia coli K12 # 1 284 1 284 284 447 98.0 1e-126 MGNGWHEWPLVIFTVLGQCVVGALIVSGIGWFAAKNDADRQRIVRGMFFLWLLMGIGFIA SVMHLGSPLRAFNSLNRIGASGLSNEIAAGSIFFAVGGLWWLVAVIGKMPQALGKLWLLV SMALGVIFVWMMTCVYQIDTVPTWHNGYTTLAFFLTVLLSGPILAAAILRAARVTFNTTP FAIISVLALIACAGVIVLQGLSLACIHSSVQQASALVPDYASLQVWRVVLLCAGLGCWVC PLIRRREPHVAGLILGLILILGGEMIGRVLFYGLHMTVGMAIAG >gi|296493123|gb|ADTK01000378.1| GENE 10 9210 - 9827 368 205 aa, chain - ## HITS:1 COG:ECs2295 KEGG:ns NR:ns ## COG: ECs2295 COG0437 # 
Protein_GI_number: 15831549 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli O157:H7 # 1 205 1 205 205 416 100.0 1e-116 MTTQYGFFIDSSRCTGCKTCELACKDFKDLGPEVSFRRIYEYAGGDWQEDNGVWHQNVFA YYLSISCNHCDDPACTKVCPSGAMHKREDGFVVVDEDVCIGCRYCHMACPYGAPQYNAEK GHMTKCDGCYSRVAEGKQPICVESCPLRALEFGPIEELRQKHGTLAAVAPLPRAHFTKPN IVIKPNANSRPTGDTTGYLANPEEV >gi|296493123|gb|ADTK01000378.1| GENE 11 9838 - 12261 1745 807 aa, chain - ## HITS:1 COG:ynfF KEGG:ns NR:ns ## COG: ynfF COG0243 # Protein_GI_number: 16129546 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 807 2 808 808 1700 99.0 0 MKIHTTEALMKAEISRRSLMKTSALGSLALASSAFTLPFSQMVRAAEAPVEEKAVWSSCT VNCGSRCLLRLHVKDDTVYWVESDTTGDDVYGNHQVRACLRGRSIRRRMNHPDRLKYPMK RVGKRGEGKFERISWDEALDTISDNLRRILKDYGNEAVHVLYGTGVDGGNITNSNVPYRL MNSCGGFLSRYGSYSTAQISAAMSYMFGANDGNSPDDIANTKLVVMFGNNPAETRMSGGG VTYYVEQARERSNARMIVIDPRYNDTAAGREDEWLPIRPGTDGALACAIAWVLITENMVD QPFLDKYCVGYDEKTLPANAPRNAHYKAYILGEGPDGIAKTPEWAAKITSIPAEKIIQLA REIGSAKPAYICQGWGPQRHSNGEQTSRAIAMLSVLTGNVGINGGNSGVREGSWDLGVEW FPMLENPVKTQISVFTWTDAIDHGKEMTATRDGVRGKEKLDVPIKFLWCYASNTLINQHG DINHTHEVLQDDSKCEMIVGIDHFMTASAKYCDILLPDLMPTEQEDLISHESAGNMGYVI LAQPATSAKFERKPIYWMLSEVAKRLGPDVYQTFTEGRSQHEWIKYLHAKTKERNPEMPD YEEMKTTGIFKKKCPEEHYVAFRAFREDPQANPLKTPSGKIEIYSERLAKIADTWELKKD EIIHPLPAYTPGFDGWDDPLRKTYPLQLTGFHYKARTHSSYGNIDVLQQACPQEVWINPI DAQARGIRHGDTVRVFNNNGEMLIAAKVTPRILPGVTAIGQGAWLKADMFGDRVDHGGSI NILTSHRPSPLAKGNPSHSNLVQIEKV >gi|296493123|gb|ADTK01000378.1| GENE 12 12322 - 14748 1989 808 aa, chain - ## HITS:1 COG:ynfE KEGG:ns NR:ns ## COG: ynfE COG0243 # Protein_GI_number: 16129545 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 808 1 808 808 1659 98.0 0 MSKNDRMVGISRRTLVKSTAIGSLALAAGGFSLPFTLRSAAATVQQASEKVIWGACSVNC 
GSRCALRLHVKDNEVTWVETDNTGSDEYGNHQVRACLRGRSIRRRINHPDRLNYPMKRVG TRGEGKFERISWDEALDTIASSLKKTVEQYGNEAVYIQYSSGIVGGNMTRSSPSASAVKR LMNCYGGSLNQYGSYSTAQISCAMPYTYGSNDGNSTTDIENSKLVVMFGNNPAETRMSGG GITYLLEKAREKSNAKMIVIDPRYTDTAAGREDEWLPIRPGTDAALVAGIAWVLINENLV DQPFLDKYCVGYDEKTLPADAPKNGHYKAYILGEGDDNTAKTPQWASQITGIPVDRIIKL AREIGTAKPAYICQGWGPQRQANGELTARAIAMLPILTGNVGISGGNSGARESTYTITIE RLPVLDNPVKTSISCFSWTDAIDHGPQMTAIRDGVRGKDKLDVPIKFIWNYAGNTLVNQH SDINKTHEILQDESKCEMIVVIENFMTSSAKYADILLPDLMTVEQEDIIPNDYAGNMGYL IFLQPVTSEKFERKPIYWILSEVAKRLGPDVYQKFTEGRTQEQWLQHLYAKMLAKDPALP SYDELKKMGIYKRKDPNGHFVAYKAFRDDPEANPLKTPSGKIEIYSSKLAEIARTWELEK DEVISPLPVYASTFEGWDSPERSTFPLQLFGFHYKSRTHSTYGNIDVLKAACRQEVWINP IDAQKRGIANGDMVRVFNHRGEVRLPAKVTPRILPGVSAMGQGAWHEANISGDKIDHGGC VNTLTTLRPSPLAKGNPQHTNLVEIEKI >gi|296493123|gb|ADTK01000378.1| GENE 13 14947 - 15252 225 101 aa, chain - ## HITS:1 COG:no KEGG:ECSE_1707 NR:ns ## KEGG: ECSE_1707 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 101 2 102 102 157 100.0 8e-38 MKLSTCCAALLLALASPVVLAAPGSCERIQSDISQRIINNGVPESSFTLSIVPNDQVDQP DSQVVGHCANDTHKILYTRTTSGNVSAPAQSTQDGAPAEPQ >gi|296493123|gb|ADTK01000378.1| GENE 14 15447 - 16070 521 207 aa, chain + ## HITS:1 COG:no KEGG:ECBD_2061 NR:ns ## KEGG: ECBD_2061 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_BL21_DE3 # Pathway: not_defined # 1 207 42 248 248 406 100.0 1e-112 MASFSNEFDFDPLRGPVKDFTQTLMDEQGEVTKRVSGTLSEEGCFDSLELLDLENNTVVA LVLDANYYRDAETLEKRVRLQGKCQLAELPSAGVSWETDDNGFVIKASSKQMQMEYRYDD QGYPLGKTTKSNDKTLSVSATPSTDPIKKLDYTAVTLLNNQRVGNVKQSCEYDSHANPVD CQLIIVDEGVKPAVERVYTIKNTIDYY >gi|296493123|gb|ADTK01000378.1| GENE 15 16073 - 16633 484 186 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116490772|ref|YP_810316.1| acetyltransferase [Oenococcus oeni PSU-1] # 7 168 3 164 167 191 54 5e-48 MPSAHSVKLRPLEREDLRYVHQLDNNASVMRYWFEEPYEAFVELSDLYDKHIHDQSERRF VVECDGEKAGLVELVEINHVHRRAEFQIIISPEYQGKGLATRAAKLAMDYGFTVLNLYKL YLIVDKENEKAIHIYRKLGFSVEGELMHEFFINGQYRNAIRMCIFQHQYLAEHKTPGQTL LKPTAQ 
>gi|296493123|gb|ADTK01000378.1| GENE 16 16668 - 17009 311 113 aa, chain - ## HITS:1 COG:no KEGG:B21_01542 NR:ns ## KEGG: B21_01542 # Name: ynfB # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 113 1 113 113 204 100.0 8e-52 MKITLSKRIGLLAILLPCALALSTTVHAETNKLVIESGDSAQSRQHAAMEKEQWNDTRNL RQKVNKRTEKEWDKADAAFDNRDKCEQSANINAYWEPNTLRCLDRRTGRVITP >gi|296493123|gb|ADTK01000378.1| GENE 17 17144 - 17470 272 108 aa, chain + ## HITS:1 COG:ynfA KEGG:ns NR:ns ## COG: ynfA COG1742 # Protein_GI_number: 16129540 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 108 1 108 108 164 100.0 3e-41 MIKTTLLFFATALCEIIGCFLPWLWLKRNASIWLLLPAGISLALFVWLLTLHPAASGRVY AAYGGVYVCTALMWLRVVDGVKLTLYDWTGALIALCGMLIIVAGWGRT >gi|296493123|gb|ADTK01000378.1| GENE 18 17676 - 18890 1301 404 aa, chain + ## HITS:1 COG:rspA KEGG:ns NR:ns ## COG: rspA COG4948 # Protein_GI_number: 16129539 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Escherichia coli K12 # 1 404 1 404 404 868 100.0 0 MKIVKAEVFVTCPGRNFVTLKITTEDGITGLGDATLNGRELSVASYLQDHLCPQLIGRDA HRIEDIWQFFYKGAYWRRGPVTMSAISAVDMALWDIKAKAANMPLYQLLGGASREGVMVY CHTTGHSIDEALDDYARHQELGFKAIRVQCGIPGMKTTYGMSKGKGLAYEPATKGQWPEE QLWSTEKYLDFMPKLFDAVRNKFGFNEHLLHDMHHRLTPIEAARFGKSIEDYRMFWMEDP TPAENQECFRLIRQHTVTPIAVGEVFNSIWDCKQLIEEQLIDYIRTTLTHAGGITGMRRI ADFASLYQVRTGSHGPSDLSPVCMAAALHFDLWVPNFGVQEYMGYSEQMLEVFPHNWTFD NGYMHPGDKPGLGIEFDEKLAAKYPYEPAYLPVARLEDGTLWNW >gi|296493123|gb|ADTK01000378.1| GENE 19 18902 - 19921 955 339 aa, chain + ## HITS:1 COG:rspB KEGG:ns NR:ns ## COG: rspB COG1063 # Protein_GI_number: 16129538 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 339 1 339 339 678 98.0 0 
MKSILIEKPNQLSIIEREIPTPSAGEVRVKVKLAGICGSDSHIYRGHNPFAKYPRVIGHE FFGVIDAVGEGVESARVGERVAVDPVVSCGHCYPCSIGKPNVCTTLAVLGVHADGGFSEY AVVPAKNAWKIPEAVADQYAVMIEPFTIAANVTGHGQPTENDTVLVYGAGPIGLTIVQVL KGVYNVKNVIVADRIDERLEKAKESGADWAINNSQTPLGEIFAEKGIKPTLIIDAACHPS ILKEAVTLASPAARIVLMGFSSEPSEVIQQGITGKELSIFSSRLNANKFPVVIDWLSKGL IKPEKLITHTFDFQHVADAISLFEQDQKHCCKVLLTFSE >gi|296493123|gb|ADTK01000378.1| GENE 20 20109 - 21389 408 426 aa, chain - ## HITS:1 COG:intQ KEGG:ns NR:ns ## COG: intQ COG0582 # Protein_GI_number: 16129537 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Escherichia coli K12 # 31 426 2 398 398 786 96.0 0 MKYPTGVENHGGKLRIWFVYKGVRVRENLGVPDTAKNRRVAGELRASVCYAIKTGVFDYA KQFPSSRNLEKFGEARQDLTIKELAEKFLALKETEVAKTSLNTYRAVIKNILSIIGEKNL ASSINKEKLLEVRKELLTGYQIPKSNYIVTQPGRSAVTVNNYMTNLNAVFQFGVDNGYLA DNPFKGISPLKESRTIPDPLSREEFIRLIDACRNQQAKNLWCVSVYTGVRPGELCALGWE DIDLKNGTMMIRRNLAKDRFTVPKTQAGTNRVIHLIKPAIDALRSQMTLTRLSKEHIIDV HLREYGRTEKQKCTFVFQPEVSARVKNYGDHFTVDSIRQMWDAAIKRAGLRHRKSYQSRH TYACWSLTAGANPAFIANQMGHADAQMVFKVYGKWMSENNNAQVALLNTQLSEFAPTMPH NEAMKN >gi|296493123|gb|ADTK01000378.1| GENE 21 21424 - 21675 247 83 aa, chain - ## HITS:1 COG:no KEGG:ECED1_1745 NR:ns ## KEGG: ECED1_1745 # Name: not_defined # Def: putative excisionase # Organism: E.coli_ED1a # Pathway: not_defined # 1 82 25 106 107 161 100.0 6e-39 MSEVIMIVSPGKWVSEEQLIALKGIKKGTLKKAREKSFMEGREYKHVAHDGMPWDNSPCF YNLEEIDRWIERQASARPRRHLT Prediction of potential genes in microbial genomes Time: Mon May 16 16:18:13 2011 Seq name: gi|296493122|gb|ADTK01000379.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1249.1, whole genome shotgun sequence Length of sequence - 691 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 689 538 ## COG5281 Phage-related minor tail protein Predicted protein(s) >gi|296493122|gb|ADTK01000379.1| GENE 1 2 - 689 538 229 aa, chain + ## HITS:1 COG:Z2140 KEGG:ns NR:ns ## COG: Z2140 COG5281 # Protein_GI_number: 15801577 # Func_class: S Function unknown # Function: Phage-related minor tail protein # Organism: Escherichia coli O157:H7 EDL933 # 1 229 179 407 859 317 80.0 8e-87 LTFNQTSESLTALVNAGVRGGEQFEAISQSVARFSSASGVEVDKVAEAFGKLTTDPTSGL TAMARQFHNVTAEQIAYVAQLQRSGDEAGALQAANEAAPKGFDDQTRRLKENMGTLETWA DRTARAFKSMWDSVLDIGRPDTAQGMLEKAEKAFDEADKKWQWYQSRSHRRGKTSAFLAN LRGAWEDRANAQLGLSAATLQADLEKAREMAAKDWAESEASRLKYTEEA Prediction of potential genes in microbial genomes Time: Mon May 16 16:18:14 2011 Seq name: gi|296493121|gb|ADTK01000380.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1253.1, whole genome shotgun sequence Length of sequence - 351 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 299 380 ## COG0076 Glutamate decarboxylase and related PLP-dependent proteins Predicted protein(s) >gi|296493121|gb|ADTK01000380.1| GENE 1 3 - 299 380 98 aa, chain + ## HITS:1 COG:ECs2098 KEGG:ns NR:ns ## COG: ECs2098 COG0076 # Protein_GI_number: 15831352 # Func_class: E Amino acid transport and metabolism # Function: Glutamate decarboxylase and related PLP-dependent proteins # Organism: Escherichia coli O157:H7 # 1 98 369 466 466 213 100.0 1e-55 GRPDEGIPAVCFKLKDGEDPGYTLYDLSERLRLRGWQVPAFTLGGEATDIVVMRIMCRRG FEMDFAELLLEDYKASLKYLSDHPKLQGIAQQNSFKHT Prediction of potential genes in microbial genomes Time: Mon May 16 16:18:14 2011 Seq name: gi|296493120|gb|ADTK01000381.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1253.2, whole genome shotgun sequence Length of sequence - 571 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 16:18:27 2011 Seq name: gi|296493119|gb|ADTK01000382.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1253.3, whole genome shotgun sequence Length of sequence - 37643 bp Number of predicted genes - 32, with homology - 32 Number of transcription units - 15, operones - 7 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 112 - 1647 1516 ## COG0531 Amino acid transporters + Term 1673 - 1707 5.2 2 1 Op 2 . 
+ CDS 1943 - 3097 824 ## COG1649 Uncharacterized protein conserved in bacteria + Term 3115 - 3163 13.1 + Prom 3261 - 3320 5.6 3 2 Op 1 5/0.000 + CDS 3474 - 4856 1205 ## COG2199 FOG: GGDEF domain 4 2 Op 2 1/1.000 + CDS 4881 - 7280 1965 ## COG2202 FOG: PAS/PAC domain + Term 7294 - 7319 -0.5 + Prom 7306 - 7365 1.7 5 3 Op 1 3/0.500 + CDS 7538 - 8119 428 ## COG2173 D-alanyl-D-alanine dipeptidase 6 3 Op 2 38/0.000 + CDS 8133 - 9683 1616 ## COG0747 ABC-type dipeptide transport system, periplasmic component 7 3 Op 3 49/0.000 + CDS 9685 - 10707 312 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 8 3 Op 4 44/0.000 + CDS 10707 - 11600 800 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 9 3 Op 5 17/0.000 + CDS 11597 - 12583 475 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 10 3 Op 6 . + CDS 12576 - 13502 617 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 - Term 13503 - 13542 6.5 11 4 Tu 1 . - CDS 13558 - 13989 535 ## COG1764 Predicted redox protein, regulator of disulfide bond formation - Prom 14201 - 14260 6.3 + Prom 14158 - 14217 4.0 12 5 Op 1 . + CDS 14334 - 14549 276 ## B21_01451 hypothetical protein 13 5 Op 2 . + CDS 14651 - 14788 230 ## PROTEIN SUPPORTED gi|15801657|ref|NP_287675.1| 30S ribosomal subunit S22 + Term 14811 - 14862 8.4 + Prom 14803 - 14862 4.3 14 6 Tu 1 . + CDS 14945 - 16642 1851 ## COG0281 Malic enzyme + Term 16649 - 16684 6.5 + Prom 16656 - 16715 3.0 15 7 Tu 1 . + CDS 16776 - 17786 1007 ## COG1064 Zn-dependent alcohol dehydrogenases 16 8 Tu 1 . 
+ CDS 17932 - 18216 291 ## COG3093 Plasmid maintenance system antidote protein - Term 18560 - 18603 9.3 17 9 Op 1 12/0.000 - CDS 18623 - 19276 873 ## COG2864 Cytochrome b subunit of formate dehydrogenase 18 9 Op 2 16/0.000 - CDS 19269 - 20153 943 ## COG0437 Fe-S-cluster-containing hydrogenase components 1 19 9 Op 3 5/0.000 - CDS 20166 - 22577 2645 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing 20 9 Op 4 . - CDS 22626 - 23213 552 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing - Prom 23399 - 23458 6.2 + Prom 23214 - 23273 4.2 21 10 Tu 1 . + CDS 23502 - 24326 643 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Prom 24487 - 24546 5.1 22 11 Op 1 2/0.833 + CDS 24585 - 24875 383 ## COG3203 Outer membrane protein (porin) 23 11 Op 2 2/0.833 + CDS 24935 - 25855 251 ## COG4886 Leucine-rich repeat (LRR) protein + Term 25990 - 26036 -0.6 24 12 Op 1 10/0.000 + CDS 26553 - 27941 963 ## COG2223 Nitrate/nitrite transporter 25 12 Op 2 13/0.000 + CDS 28023 - 31763 3816 ## COG5013 Nitrate reductase alpha subunit 26 12 Op 3 12/0.000 + CDS 31760 - 33304 1488 ## COG1140 Nitrate reductase beta subunit 27 12 Op 4 12/0.000 + CDS 33304 - 33999 834 ## COG2180 Nitrate reductase delta subunit 28 12 Op 5 4/0.167 + CDS 33996 - 34676 509 ## COG2181 Nitrate reductase gamma subunit 29 12 Op 6 . + CDS 34755 - 35648 811 ## COG0384 Predicted epimerase, PhzC/PhzF homolog 30 13 Tu 1 . - CDS 35744 - 36589 709 ## COG2162 Arylamine N-acetyltransferase - Prom 36703 - 36762 3.3 + Prom 36652 - 36711 2.8 31 14 Tu 1 . + CDS 36762 - 37331 425 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family + Term 37495 - 37521 -1.0 32 15 Tu 1 . 
- CDS 37335 - 37562 201 ## COG1942 Uncharacterized protein, 4-oxalocrotonate tautomerase homolog - Prom 37583 - 37642 2.6 Predicted protein(s) >gi|296493119|gb|ADTK01000382.1| GENE 1 112 - 1647 1516 511 aa, chain + ## HITS:1 COG:xasA KEGG:ns NR:ns ## COG: xasA COG0531 # Protein_GI_number: 16129451 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 511 1 511 511 895 100.0 0 MATSVQTGKAKQLTLLGFFAITASMVMAVYEYPTFATSGFSLVFFLLLGGILWFIPVGLC AAEMATVDGWEEGGVFAWVSNTLGPRWGFAAISFGYLQIAIGFIPMLYFVLGALSYILKW PALNEDPITKTIAALIILWALALTQFGGTKYTARIAKVGFFAGILLPAFILIALAAIYLH SGAPVAIEMDSKTFFPDFSKVGTLVVFVAFILSYMGVEASATHVNEMSNPGRDYPLAMLL LMVAAICLSSVGGLSIAMVIPGNEINLSAGVMQTFTVLMSHVAPEIEWTVRVISALLLLG VLAEIASWIVGPSRGMYVTAQKNLLPAAFAKMNKNGVPVTLVISQLVITSIALIILTNTG GGNNMSFLIALALTVVIYLCAYFMLFIGYIVLVLKHPDLKRTFNIPGGKGVKLVVAIVGL LTSIMAFIVSFLPPDNIQGDSTDMYVELLVVSFLVVLALPFILYAVHDRKGKANTGVTLE PINSQNAPKGHFFLHPRARSPHYIVMNDKKH >gi|296493119|gb|ADTK01000382.1| GENE 2 1943 - 3097 824 384 aa, chain + ## HITS:1 COG:ECs2096 KEGG:ns NR:ns ## COG: ECs2096 COG1649 # Protein_GI_number: 15831350 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 384 56 439 439 762 100.0 0 MRGIWLATVSRLDWPPVSSVNISNPTSRARVQQQAMIDKLDHLQRLGINTVFFQVKPDGT ALWPSKILPWSDLMTGKIGENPGYDPLQFMLDEAHKRGMKVHAWFNPYRVSVNTKPGTIR ELNSTLSQQPASVYVQHRDWIRTSGDRFVLDPGIPEVQDWITSIVAEVVSRYPVDGVQFD DYFYTESPGSRLNDNETYRKYGGAFASKADWRRNNTQQLIAKVSHTIKSIKPGVEFGVSP AGVWRNRSHDPLGSDTRGAAAYDESYADTRRWVEQGLLDYIAPQIYWPFSRSAARYDVLA KWWADVVKPTRTRLYIGIAFYKVGEPSKIEPDWMINGGVPELKKQLDLNDAVPEISGTIL FREDYLNKPQTQQAVSYLQSRWGS >gi|296493119|gb|ADTK01000382.1| GENE 3 3474 - 4856 1205 460 aa, chain + ## HITS:1 COG:ECs2095_2 KEGG:ns NR:ns ## COG: ECs2095_2 COG2199 # Protein_GI_number: 15831349 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli O157:H7 # 269 460 1 192 192 365 99.0 1e-100 
MEMYFKRMKDEWTGLVEQADPLIRAKAAEIAVAHAHYLSIEFYRIVRIDPHAEEFLSNEQ VERQLKSAMERWIINVLSAQVDDVERLIQIQHTVAEVHARIGIPVEIVEMGFRVLKKILY PVIFSSDYSAAEKLQVYHFSINSIDIAMEVMTRAFTFSDSSASKEDENYRIFSLLENAEE EKERQIASILSWEIDIIYKILLDSDLGSSLPLSQADFGLWFNHKGRHYFSGIAEVGHISR LIQDFDGIFNQTMRNTRNLNNRSLRVKFLLQIRNTVSQIITLLRELFEEVSRHEVGMDVL TKLLNRRFLPTIFKREIAHANRTGTPLSVLIIDVDKFKEINDTWGHNTGDEILRKVSQAF YDNVRSSDYVFRYGGDEFIIVLTEASENETLRTAERIRSRVEKTKLKAANGEDIALSLSI GAAMFNGHPDYERLIQIADEALYITKRRGRNRVELWKASL >gi|296493119|gb|ADTK01000382.1| GENE 4 4881 - 7280 1965 799 aa, chain + ## HITS:1 COG:ECs2094_1 KEGG:ns NR:ns ## COG: ECs2094_1 COG2202 # Protein_GI_number: 15831348 # Func_class: T Signal transduction mechanisms # Function: FOG: PAS/PAC domain # Organism: Escherichia coli O157:H7 # 1 342 1 342 342 703 99.0 0 MKLTDAENAADGIFFPALEQNMMGAVLINENDEVMFFNPAAEKLWGYKREEVIGNNIDML IPRDLRPAHPEYIRHNREGGKARVEGMSRELQLEKKDGSKIWTRFALSKVSAEGKVYYLA LVRDASVEMAQKEQTRQLIIAVDHLDRPVIVLDPERHIVQCNRAFTEMFGYCISEASGMQ PDTLLNIPEFPADNRIRLQQLLWKTARDQDEFLLLTRTGEKIWIKASISPVYDVLAHLQN LVMTFSDITEERQIRQLEGNILAAMCSSPPFHEMGEIICRNIESVLNESHVSLFALRNGM PIHWASSSHGAEVQNAQSWSATIRQRDGAPAGILQIKTSSGAETSAFIERVADISQHMAA LALEQEKSRQHIEQLIQFDPMTGLPNRNNLHNYLDDLVDKAVSPVVYLIGVDHIQDVIDS LGYAWADQALLEVVNRFREKLKPDQYLCRIEGTQFVLVSLENDVSNITQIADELRNVVSK PIMIDDKPFPLTLSIGISYDVGKNRDYLLSTAHNAMDYIRKNGGNGWQFFSPAMNEMVKE RLVLGAALKETISNNQLKLVYQPQIFAETGELYGIEALARWHDPQHGHVPPSRFIPLAEE IGEIENIGRWVIAEACRQLAEWRSQNIHIPALSVNLSALHFRSNQLPNQVSDAMQAWGID GHQLTVEITESMMMEHDTEIFKRIQILRDMGVGLSVDDFGTGFSGLSRLVSLPVTEIKID KSFVDRCLTEKRILALLEAITSIGQSLNLTVVAEGVETKEQFEMLRKIHCRVIQGYFFSR PLPAEEIPGWMSSVLPLKI >gi|296493119|gb|ADTK01000382.1| GENE 5 7538 - 8119 428 193 aa, chain + ## HITS:1 COG:ECs2092 KEGG:ns NR:ns ## COG: ECs2092 COG2173 # Protein_GI_number: 15831346 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine dipeptidase # Organism: Escherichia coli O157:H7 # 1 193 1 193 193 390 98.0 1e-109 MSDTTELVDLAVIFPDLEIELKYACADNITGKAIYQQARCLLHKDAITALAKSISIAQLS 
GLQLVIYDAYRPQQAQAMLWQVCPDPQYVVDVTVGSNHSRGTAIDLTLRDEHGNILDMGA GFDEMHDRSHAYHPSVPPAAQRNRLLLIAIMTGGGFVGISSEWWHFELPQAASYPLLADQ FTCFISPGTQHVS >gi|296493119|gb|ADTK01000382.1| GENE 6 8133 - 9683 1616 516 aa, chain + ## HITS:1 COG:ECs2091 KEGG:ns NR:ns ## COG: ECs2091 COG0747 # Protein_GI_number: 15831345 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 1 516 1 516 516 1005 99.0 0 MKRSISFRPTLLALVLATTFPVAHAAVPKDMLVIGKAADPQTLDPAVTIDNNDWTVTYPS YQRLVQYKTDGDKGSTDVEGDLASSWKASDDQKEWTFTLKDNAKFADGTSVTAEAVKLSF ERLLKIGQGPAEAFPKDLKIDAPDEHTVKFTLSQPFAPFLYTLANDGASIINPAVLKEHA ADDARGFLAQNTAGSGPFMLKSWQKGQQLVLVPNPHYSGNKPNFKRVSVKIIGESASRRL QLSRGDIDIADALPVDQLNALKQENKVNVAEYPSLRVTYLYLNNSKAPLNQADLRRAISW STDYQGMVNGILSGNGKQMRGPIPEGMWGYDATAMQYNHDETKAKAEWDKVTSKPTSLTF LYSDNDPNWEPIALATQSSLNKLGINVKLEKLANATMRDRVGKGDYDIAIGNWSPDFADP YMFMNYWFESDKKGLPGNRSFYENSEVDKLLRNALATTDQTQRTRDYQQAQKIVIDDAAY VYLFQKNYQLAMNKEVKGFVFNPMLEQVFNINTMSK >gi|296493119|gb|ADTK01000382.1| GENE 7 9685 - 10707 312 340 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 70 332 46 310 320 124 29 6e-28 MTFWSILRQRCWGLVLVVAGVCVITFIISHLIPGDPARLLAGDRASDAIVENIRQHLGLD QPLYVQFYRYVSDLFHGDLGTSIRTGRPVLEELRIFFSATLELAFCALLLALLIGIPLGI LSAVWRNRWLDHLVRIMAITGISTPAFWLGLGVIVLFYGHLQILPGGGRLDDWLDPPTHV TGFYLLDALLEGNGEVFFNALQHLILPALTLAFVHLGIVARQIRSAMLEQLSEDYIRTAR ASGLSGWYIVLCYALPNALIPSITVLGLALGDLLYGAVLTETVFAWPGMGAWVVTSIQAL DFPAVMGFAVVVSFAYVLVNLVVDLLYLWIDPRIGRGGGE >gi|296493119|gb|ADTK01000382.1| GENE 8 10707 - 11600 800 297 aa, chain + ## HITS:1 COG:ECs2089 KEGG:ns NR:ns ## COG: ECs2089 COG1173 # Protein_GI_number: 15831343 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Escherichia coli O157:H7 # 1 297 2 298 298 541 100.0 
1e-154 MLSEETSAVRPQKQTRFNGAKLVWMLKGSPLTVTGAVIIVLMLLMMIFSPWLATHDPNAI DLTARLLPPSAAHWFGTDEVGRDLFSRVLVGSQQSILAGLVVVAIAGMIGSLLGCLSGVL GGRADAIIMRIMDIMLSIPSLVLTMALAAALGPSLFNAMLAIAIVRIPFYVRLARGQALV VRQYTYVQAAKTFGASRWHLINWHILRNSLPPLIVQASLDIGSAILMAATLGFIGLGAQQ PSAEWGAMVANGRNYVLDQWWYCAFPGAAILLTAVGFNLFGDGIRDLLDPKAGGKQS >gi|296493119|gb|ADTK01000382.1| GENE 9 11597 - 12583 475 328 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 2 316 8 324 329 187 36 8e-47 MTQPVLDIQQLHLSFPGFHGDVHALNNVSLQINRGEIVGLVGESGSGKSVTAMLIMRLLP TGSYCVHRGQISLLGEDVLNAREKQLRQWRGARVAMIFQEPMTALNPTRRIGLQMMDVIR HHQPISRREARAKAIDLLEEMQIPDAVEVMSRYPFELSGGMRQRVMIALAFSCEPQLIIA DEPTTALDVTVQLQVLRLLKHKARASGTAVLFISHDMAVVSQLCDSVYVMYAGSVIESGV TADVIHHPRHPYTIGLLQCAPEHGVPRQLLPAIPGTVPNLTHLPDGCAFRDRCYAAGAQC ENVPALTACGDNNQRCACWYPQQEVISV >gi|296493119|gb|ADTK01000382.1| GENE 10 12576 - 13502 617 308 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 2 291 8 310 329 242 40 3e-63 MSDTLLTLRDVHINFPARKNWLGKTTEHVHAINGIDLQIRRGETLGIVGESGCGKSTLAQ LLMGMLQPSHGQYIRSGSQRIMQMVFQDPLSSLNPRLPVWRVITEPLWIAKRSSEQQRRA LAEELAVQVGIRPEYLDRLPHAFSGGQRQRIAIARALSSQPDVIVLDEPTSALDISVQAQ ILNLLVTLQENHGLTYVLISHNVSVIRHMSDRVAVMYLGQIVELGDAQQVLTAPAHPYTR LLLDSLPAIDKPLEEEWALRKTDLPGNRTLPQGCFFYERCPLATHGCEVRQSLAIREDGR ELRCWRAL >gi|296493119|gb|ADTK01000382.1| GENE 11 13558 - 13989 535 143 aa, chain - ## HITS:1 COG:osmC KEGG:ns NR:ns ## COG: osmC COG1764 # Protein_GI_number: 16129441 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted redox protein, regulator of disulfide bond formation # Organism: Escherichia coli K12 # 1 143 1 143 143 270 100.0 5e-73 MTIHKKGQAHWEGDIKRGKGTVSTESGVLNQQPYGFNTRFEGEKGTNPEELIGAAHAACF SMALSLMLGEAGFTPTSIDTTADVSLDKVDAGFAITKIALKSEVAVPGIDASTFDGIIQK AKAGCPVSQVLKAEITLDYQLKS >gi|296493119|gb|ADTK01000382.1| GENE 12 14334 - 14549 276 71 aa, chain + ## 
HITS:1 COG:no KEGG:B21_01451 NR:ns ## KEGG: B21_01451 # Name: bdm # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 71 1 71 71 129 100.0 4e-29 MFTYYQAENSTAEPALVNAIEQGLRAQHGVVTEDDILMELTKWVEASDNDILSDIYQQTI NYVVSGQHPTL >gi|296493119|gb|ADTK01000382.1| GENE 13 14651 - 14788 230 45 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15801657|ref|NP_287675.1| 30S ribosomal subunit S22 [Escherichia coli O157:H7 EDL933] # 1 45 1 45 45 93 100 2e-18 MKSNRQARHILGLDHKISNQRKIVTEGDKSSVVNNPTGRKRPAEK >gi|296493119|gb|ADTK01000382.1| GENE 14 14945 - 16642 1851 565 aa, chain + ## HITS:1 COG:sfcA KEGG:ns NR:ns ## COG: sfcA COG0281 # Protein_GI_number: 16129438 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Escherichia coli K12 # 1 565 10 574 574 1140 99.0 0 MEPKTKKQRSLYIPYAGPVLLEFPLLNKGSAFSMEERRNFNLLGLLPEVVETIEEQAERA WIQYQGFKTEIDKHIYLRNIQDTNETLFYRLVNNHLDEMMPVIYTPTVGAACERFSEIYR RSRGVFISYQNRHNMDDILQNVPNHNIKVIVVTDGERILGLGDQGIGGMGIPIGKLSLYT ACGGISPAYTLPVVLDVGTNNQQLLNDPLYMGWRNPRITDDEYYEFVDEFIQAVKQRWPD VLLQFEDFAQKNAMPLLNRYRNEICSFNDDIQGTAAVTVGTLIAASRAAGGQLSEKKIVF LGAGSAGCGIAEMIIAQTQREGLSEEAARQKVFMVDRFGLLTDKMPNLLPFQTKLVQKRE NLSDWDTDSDVLSLLDVVRNVKPDILIGVSGQTGLFTEEIIREMHKHCPRPIVMPLSNPT SRVEATPQDIIAWTEGNALVATGSPFNPVVWKDKIYPIAQCNNAFIFPGIGLGVIASGAS RITDEMLMSASETLAQYSPLVLNGEGLVLPELKDIQKVSRAIAFAVGKMAQQQGVAVKTS AEALQQAIDDNFWQAEYRDYRRTSI >gi|296493119|gb|ADTK01000382.1| GENE 15 16776 - 17786 1007 336 aa, chain + ## HITS:1 COG:ZadhP KEGG:ns NR:ns ## COG: ZadhP COG1064 # Protein_GI_number: 15801659 # Func_class: R General function prediction only # Function: Zn-dependent alcohol dehydrogenases # Organism: Escherichia coli O157:H7 EDL933 # 1 335 11 345 346 600 99.0 1e-171 MKAAVVTKDHHVDVTDKTLRSLKHGEALLKMECCGVCHTDLHVKNGDFGDKTGVILGHEG IGVVAEVGPGVTSLKPGDRASVAWFYEGCGHCEYCNSGNETLCRSVKNAGYSVDGGMAEE CIVVADYAVKVPDGLDSAAASSITCAGVTTYKAVKLSKIRPGQWIAIYGLGGLGNLALQY AKNVFNAKVIAIDVNDEQLKLATEMGADLAINSRTEDAAKIVQEKTGGAHAAVVTAVAKA 
AFNSAVDAVRAGGRVVAVGLPPESMSLDIPRLVLDGIEVVGSLVCTRQDLTEAFQFAAEG KVVPKVALRPLADINTIFTEMEEGKIRGRMVIDFRH >gi|296493119|gb|ADTK01000382.1| GENE 16 17932 - 18216 291 94 aa, chain + ## HITS:1 COG:yddM KEGG:ns NR:ns ## COG: yddM COG3093 # Protein_GI_number: 16129436 # Func_class: R General function prediction only # Function: Plasmid maintenance system antidote protein # Organism: Escherichia coli K12 # 1 94 27 120 120 169 100.0 1e-42 MKMANHPRPGDIIQESLDELNVSLREFARAMEIAPSTASRLLTGKAALTPEMAIKLSVVI GSSPQMWLNLQNAWSLAEAEKTVDVSRLRRLVTQ >gi|296493119|gb|ADTK01000382.1| GENE 17 18623 - 19276 873 217 aa, chain - ## HITS:1 COG:STM1568 KEGG:ns NR:ns ## COG: STM1568 COG2864 # Protein_GI_number: 16764912 # Func_class: C Energy production and conversion # Function: Cytochrome b subunit of formate dehydrogenase # Organism: Salmonella typhimurium LT2 # 1 217 1 217 218 408 98.0 1e-114 MSKSKMIVRTKFIDRACHWTVVICFFLVALSGISFFFPTLQWLTQTFGTPQMGRILHPFF GIAIFVALMFMFVRFVHHNIPDKKDIPWLLNIVEVLKGNEHKVADVGKYNAGQKMMFWSI MSMIFVLLVTGVIIWRPYFAQYFPMQVVRYSLLIHAAAGIILIHAILIHMYMAFWVKGSI KGMIEGKVSRRWAKKHHPRWYREIEKAEAKKESEEGI >gi|296493119|gb|ADTK01000382.1| GENE 18 19269 - 20153 943 294 aa, chain - ## HITS:1 COG:fdnH KEGG:ns NR:ns ## COG: fdnH COG0437 # Protein_GI_number: 16129434 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli K12 # 1 294 1 294 294 612 100.0 1e-175 MAMETQDIIKRSATNSITPPSQVRDYKAEVAKLIDVSTCIGCKACQVACSEWNDIRDEVG HCVGVYDNPADLSAKSWTVMRFSETEQNGKLEWLIRKDGCMHCEDPGCLKACPSAGAIIQ YANGIVDFQSENCIGCGYCIAGCPFNIPRLNKEDNRVYKCTLCVDRVSVGQEPACVKTCP TGAIHFGTKKEMLELAEQRVAKLKARGYEHAGVYNPEGVGGTHVMYVLHHADQPELYHGL PKDPKIDTSVSLWKGALKPLAAAGFIATFAGLIFHYIGIGPNKEVDDDEEDHHE >gi|296493119|gb|ADTK01000382.1| GENE 19 20166 - 22577 2645 803 aa, chain - ## HITS:1 COG:fdnG KEGG:ns NR:ns ## COG: fdnG COG0243 # Protein_GI_number: 16129433 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically 
selenocysteine-containing # Organism: Escherichia coli K12 # 1 803 213 1015 1015 1677 99.0 0 MTNHWVDIKNANVVMVMGGNAAEAHPVGFRWAMEAKNNNDATLIVVDPRFTRTASVADIY APIRSGTDITFLSGVLRYLIENNKINAEYVKHYTNASLLVRDDFAFEDGLFSGYDAEKRQ YDKSSWNYQFDENGYAKRDETLTHPRCVWNLLKEHVSRYTPDVVENICGTPKADFLKVCE VLASTSAPDRTTTFLYALGWTQHTVGAQNIRTMAMIQLLLGNMGMAGGGVNALRGHSNIQ GLTDLGLLSTSLPGYLTLPSEKQVDLQSYLEANTPKATLADQVNYWSNYPKFFVSLMKSF YGDAAQKENNWGYDWLPKWDQTYDVIKYFNMMDEGKVTGYFCQGFNPVASFPDKNKVVSC LSKLKYMVVIDPLVTETSTFWQNHGESNDVDPASIQTEVFRLPSTCFAEEDGSIANSGRW LQWHWKGQDAPGEARNDGEILAGIYHHLRELYQSEGGKGVEPLMKMSWNYKQPHEPQSDE VAKENNGYALEDLYDANGVLIAKKGQLLSSFAHLRDDGTTASSCWIYTGSWTEQGNQMAN RDNSDPSGLGNTLGWAWAWPLNRRVLYNRASADINGKPWDPKRMLIQWNGSKWTGNDIPD FGNAAPGTPTGPFIMQPEGMGRLFAINKMAEGPFPEHYEPIETPLGTNPLHPNVVSNPVV RLYEQDALRMGKKEQFPYVGTTYRLTEHFHTWTKHALLNAIAQPEQFVEISETLAAAKGI NNGDRVTVSSKRGFIRAVAVVTRRLKPLNVNGQQVETVGIPIHWGFEGVARKGYIANTLT PNVGDANSQTPEYKAFLVNIEKA >gi|296493119|gb|ADTK01000382.1| GENE 20 22626 - 23213 552 195 aa, chain - ## HITS:1 COG:fdnG KEGG:ns NR:ns ## COG: fdnG COG0243 # Protein_GI_number: 16129433 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 195 1 195 1015 412 100.0 1e-115 MDVSRRQFFKICAGGMAGTTVAALGFAPKQALAQARNYKLLRAKEIRNTCTYCSVGCGLL MYSLGDGAKNAREAIYHIEGDPDHPVSRGALCPKGAGLLDYVNSENRLRYPEYRAPGSDK WQRISWEEAFSRIAKLMKADRDANFIEKNEQGVTVNRWLSTGMLCASGASNETGMLTQKF ARSLGMLAVDNQARV >gi|296493119|gb|ADTK01000382.1| GENE 21 23502 - 24326 643 274 aa, chain + ## HITS:1 COG:ECs2077 KEGG:ns NR:ns ## COG: ECs2077 COG0697 # Protein_GI_number: 15831331 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli O157:H7 # 1 274 20 293 293 475 99.0 1e-134 MVGLIRGVSEGLGPVGGAAAIYSLSGLLLIFTVGFPRIRQIPKGYLLAGSLLFVSYEICL 
ALSLGYAATRHQAIEVGMVNYLWPSLTILFAILFNGQKTNWLIVPGLLLALVGVCWVLGG DNGLHYDEIINNITTSPLSYFLAFIGAFIWAAYCTVTNKYARGFNGITVFVLLTGACLWI YYFLTPQPEMVFSTPVMIKLISAAFTLGFAYAAWNVGILHGNVTIMAVGSYFTPVLSSAL AAVLLSAPLSFSFWQGALMVCGGSLLCWLATRRG >gi|296493119|gb|ADTK01000382.1| GENE 22 24585 - 24875 383 96 aa, chain + ## HITS:1 COG:yddL KEGG:ns NR:ns ## COG: yddL COG3203 # Protein_GI_number: 16129431 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein (porin) # Organism: Escherichia coli K12 # 1 96 1 96 96 173 98.0 8e-44 MKLKIVAVVVTGLLAANVAHAAEVYNKDGNKLDLYGKVTALRYFTDDKRDDGDKTYARLG FKGETQINDQMIGFGHWEYDFKGYNDEANCSRGKNL >gi|296493119|gb|ADTK01000382.1| GENE 23 24935 - 25855 251 306 aa, chain + ## HITS:1 COG:yddK_2 KEGG:ns NR:ns ## COG: yddK_2 COG4886 # Protein_GI_number: 16129430 # Func_class: S Function unknown # Function: Leucine-rich repeat (LRR) protein # Organism: Escherichia coli K12 # 81 306 1 226 226 330 99.0 1e-90 MKTITLNDNHIAHLNAKNTTKLEYLNLSNNNLLPTNDIDQLISSKYLWHVLVNGINNDPL AQMQYWTAVRNIIDDTNEVTIDLSGLNLTTQPPGLQNFTSINLDNNQLTHFDATNYDRLV KLSLNSNALESINFPQGRNVSITHISMNNNALRNIDIDRLSSVTYFSAAHNQLEFVQLES CEWLQYLNLSHNQLTDIVAGNKNELLLLDLSHNKLTSLHNDLFPNLNTLLINNNLLSEIK IFYSNFCNVQTLNAANNQLKYINLDFLTYLPSIKSLRLDNNKITHIDTNNTSDIGTLFPI IKQSKT >gi|296493119|gb|ADTK01000382.1| GENE 24 26553 - 27941 963 462 aa, chain + ## HITS:1 COG:narU KEGG:ns NR:ns ## COG: narU COG2223 # Protein_GI_number: 16129428 # Func_class: P Inorganic ion transport and metabolism # Function: Nitrate/nitrite transporter # Organism: Escherichia coli K12 # 1 462 1 462 462 778 99.0 0 MALQNEKNSRYLLRDWKPENPAFWENKGKHIARRNLWISVSCLLLAFCVWMLFSAVTVNL NKIGFNFTTDQLFLLTALPSVSGALLRVPYSFMVPIFGGRRWTVFSTAILIIPCVWLGIA VQNPNTPFGIFIVIALLCGFAGANFASSMGNISFFFPKAKQGSALGINGGLGNLGVSVMQ LVAPLVIFVPVFAFLGVNGVPQADGSVMSLANAAWIWVPLLAIATIAAWSGMNDIASSRA SIADQLPVLQRLHLWLLSLLYLATFGSFIGFSAGFAMLAKTQFPDVNILRLAFFGPFIGA IARSVGGAISDKFGGVRVTLINFIFMAIFSALLFLTLPGTGSGNFIAFYAVFMGLFLTAG LGSGSTFQMIAVIFRQITIYRVKMKGGSDEQAQKEAVTETAAALGFISAIGAVGGFFIPQ 
AFGMSLNMTGSPVGAMKVFLIFYIVCVLLTWLVYGRRKFSQK >gi|296493119|gb|ADTK01000382.1| GENE 25 28023 - 31763 3816 1246 aa, chain + ## HITS:1 COG:narZ KEGG:ns NR:ns ## COG: narZ COG5013 # Protein_GI_number: 16129427 # Func_class: C Energy production and conversion # Function: Nitrate reductase alpha subunit # Organism: Escherichia coli K12 # 1 1246 1 1246 1246 2613 100.0 0 MSKLLDRFRYFKQKGETFADGHGQVMHSNRDWEDSYRQRWQFDKIVRSTHGVNCTGSCSW KIYVKNGLVTWEIQQTDYPRTRPDLPNHEPRGCPRGASYSWYLYSANRLKYPLIRKRLIE LWREALKQHSDPVLAWASIMNDPQKCLSYKQVRGRGGFIRSNWQELNQLIAAANVWTIKT YGPDRVAGFSPIPAMSMVSYAAGTRYLSLLGGTCLSFYDWYCDLPPASPMTWGEQTDVPE SADWYNSSYIIAWGSNVPQTRTPDAHFFTEVRYKGTKTIAITPDYSEVAKLCDQWLAPKQ GTDSALAMAMGHVILKEFHLDNPSDYFINYCRRYSDMPMLVMLEPRDDGSYVPGRMIRAS DLVDGLGESNNPQWKTVAVNTAGELVVPNGSIGFRWGEKGKWNLESIAAGTETELSLTLL GQHDAVAGVAFPYFGGIENPHFRSVKHNPVLVRQLPVKNLTLVDGNTCPVVSVYDLVLAN YGLDRGLEDENSAKDYAEIKPYTPAWGEQITGVPRQYIETIAREFADTAHKTHGRSMIIL GAGVNHWYHMDMNYRGMINMLIFCGCVGQSGGGWAHYVGQEKLRPQTGWLPLAFALDWNR PPRQMNSTSFFYNHSSQWRYEKVSAQELLSPLADASKYSGHLIDFNVRAERMGWLPSAPQ LGRNPLGIKAEADKAGLSPTEFTAQALKSGDLRMACEQPDSSSNHPRNLFVWRSNLLGSS GKGHEYMQKYLLGTESGIQGEELGASDGIKPEEVEWQTAAIEGKLDLLVTLDFRMSSTCL FSDIVLPTATWYEKDDMNTSDMHPFIHPLSAAVDPAWESRSDWEIYKGIAKAFSQVCVGH LGKETDVVLQPLLHDSPAELSQPCEVLDWRKGECDLIPGKTAPNIVAVERDYPATYERFT SLGPLMDKLGNGGKGISWNTQDEIDFLGKLNYTKRDGPAQGRPLIDTAIDASEVILALAP ETNGHVAVKAWQALGEITGREHTHLALHKEDEKIRFRDIQAQPRKIISSPTWSGLESDHV SYNAGYTNVHELIPWRTLSGRQQLYQDHPWMRAFGESLVAYRPPIDTRSVSEMRQIPPNG FPEKALNFLTPHQKWGIHSTYSENLLMLTLSRGGPIVWISETDARELTIVDNDWVEVFNA NGALTARAVVSQRVPPGMTMMYHAQERIMNIPGSEVTGMRGGIHNSVTRVCPKPTHMIGG YAQLAWGFNYYGTVGSNRDEFIMIRKMKNVNWLDDEGRDQVQEAKK >gi|296493119|gb|ADTK01000382.1| GENE 26 31760 - 33304 1488 514 aa, chain + ## HITS:1 COG:ECs2070 KEGG:ns NR:ns ## COG: ECs2070 COG1140 # Protein_GI_number: 15831324 # Func_class: C Energy production and conversion # Function: Nitrate reductase beta subunit # Organism: Escherichia coli O157:H7 # 1 514 1 514 514 1097 99.0 0 
MKIRSQVGMVLNLDKCIGCHTCSVTCKNVWTGREGMEYAWFNNVETKPGIGYPKNWEDQE EWQGGWVRDVNGKIRPRLGSKMGVITKIFANPVVPQIDDYYEPFTFDYEHLHSAPEGKHI PTARPRSLIDGKRMDKVIWGPNWEELLGGEFEKRARDRNFEAMQKEMYGQFENTFMMYLP RLCEHCLNPSCVATCPSGAIYKREEDGIVLIDQDKCRGWRLCISGCPYKKIYFNWKSGKS EKCIFCYPRIESGQPTVCSETCVGRIRYLGVLLYDADRIEEAASTEREVDLYERQCEVFL DPHDPSVIEEALKQGIPQNVIEAAQRSPVYKMAMDWKLALPLHPEYRTLPMVWYVPPLSP IQSYADAGGLPKSEGVLPAIESLRIPVQYLANMLSAGDTGPVLRALKRMMAMRHYMRSQT VEGVTDTRAIDEVGLSVAQVEEMYRYLAIANYEDRFVIPTSHREMAGDAFAERNGCGFTF GDGCHGSDSKFNLFNSSRIDAINITEVRDKAEGE >gi|296493119|gb|ADTK01000382.1| GENE 27 33304 - 33999 834 231 aa, chain + ## HITS:1 COG:narW KEGG:ns NR:ns ## COG: narW COG2180 # Protein_GI_number: 16129425 # Func_class: C Energy production and conversion # Function: Nitrate reductase delta subunit # Organism: Escherichia coli K12 # 1 231 1 231 231 443 100.0 1e-124 MQILKVIGLLMEYPDELLWECKEDALALIRRDAPMLTDFTHNLLNAPLLDKQAEWCEVFD RGRTTSLLLFEHVHAESRDRGQAMVDLLAEYEKVGLQLDCRELPDYLPLYLEYLSVLPDD QAKEGLLNVAPILALLGGRLKQREAPWYALFDALLQLAGSSLSSDSVTKQVNSEERDDTR QALDAVWEEEQVKFIEDNATACDSSPLNQYQRRFSQDVAPQYVDISAGGGK >gi|296493119|gb|ADTK01000382.1| GENE 28 33996 - 34676 509 226 aa, chain + ## HITS:1 COG:ECs2068 KEGG:ns NR:ns ## COG: ECs2068 COG2181 # Protein_GI_number: 15831322 # Func_class: C Energy production and conversion # Function: Nitrate reductase gamma subunit # Organism: Escherichia coli O157:H7 # 1 226 1 226 226 420 100.0 1e-117 MIQYLNVFFYDIYPYICATVFFLGSWLRYDYGQYTWRASSSQMLDKRGMVIWSNLFHIGI LGIFFGHLFGMLTPHWMYAWFLPVAAKQLMAMVLGGICGVLTLIGGAGLLWRRLTNQRVR ATSTTPDIIIMSILLIQCLLGLSTIPFSAQYPDGSEMMKLVGWAQSIVTFRGGSSEMLNG VAFVFRLHLVLGMTIFLLFPFTRLVHVWSAPFEYFTRRYQIVRSRR >gi|296493119|gb|ADTK01000382.1| GENE 29 34755 - 35648 811 297 aa, chain + ## HITS:1 COG:yddE KEGG:ns NR:ns ## COG: yddE COG0384 # Protein_GI_number: 16129423 # Func_class: R General function prediction only # Function: Predicted epimerase, PhzC/PhzF homolog # Organism: Escherichia coli K12 # 1 297 1 297 297 594 97.0 1e-170 
MKPQVYHVDAFTSQPFRGNSAGVVFPADNLSEAQMQLIARELGHSETAFLLHSDDSDVRI RYFTPTVEVPICGHATVAAHYVRAKVLGLGNCTIWQTSLAGKHRVTIEKHNDDYRISLEQ GTPGFELPLEGEVRAAIINALHLTEDDILPGLPIQVATTGHSKVMIPLKPEVDIDALSPD LNALTAISKQIGCNGFFPFQIRPGKNETDGRMFSPAIGIVEDPVTGNANGPMGAWLVHHN VLPHDGKVLRVKGHQGRALGRDGVIEVTVTIRDNQPEKVTISGAAVILFHAEWAIKL >gi|296493119|gb|ADTK01000382.1| GENE 30 35744 - 36589 709 281 aa, chain - ## HITS:1 COG:nhoA KEGG:ns NR:ns ## COG: nhoA COG2162 # Protein_GI_number: 16129422 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Arylamine N-acetyltransferase # Organism: Escherichia coli K12 # 1 281 1 281 281 553 98.0 1e-157 MTPILNHYFARINWSGAAAVNIDTLRALHLKHNCTIPFENLDVLLPREIQLDDQSLEEKL VIARRGGYCFEQNGVFERVLRELEFNVRSLLGRVVLSNPPALPPRTHRLLLVELEEEKWI ADVGFGGQTLTEPIRLVSDLVQTTPHGDYRLLQEGDDWVLQFNHHQHWQSMYRFDLCEQQ QSDYVMGNFWSAHWPQSHFRHHLLMCRHLPDGGKLTLTNFHFTHYENGHAVEQRNLPDVA SLYAVMQEQFGLGVDDAKHGFTVDELALVMAAFDTHPEAGK >gi|296493119|gb|ADTK01000382.1| GENE 31 36762 - 37331 425 189 aa, chain + ## HITS:1 COG:yddH KEGG:ns NR:ns ## COG: yddH COG1853 # Protein_GI_number: 16129421 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Escherichia coli K12 # 1 189 17 205 205 383 96.0 1e-106 MSRFIPIELHHASRLLNHGPTVMITSFDEQSQRRNIMAAAWSMPVEFEPPRVAIVVDKST WTRELIERNGKFGIVIPGVAATNWTWAVGSVSGREEDKFNCYGIPVVKGPVLGLPLVEEK CLAWMECRLLPATSAQQKYDTLFGEVVSAAADARVFVEGRWQFDDDKLNTLHHLGAGTFV TSGKRVTAG >gi|296493119|gb|ADTK01000382.1| GENE 32 37335 - 37562 201 75 aa, chain - ## HITS:1 COG:ZydcE KEGG:ns NR:ns ## COG: ZydcE COG1942 # Protein_GI_number: 15801678 # Func_class: R General function prediction only # Function: Uncharacterized protein, 4-oxalocrotonate tautomerase homolog # Organism: Escherichia coli O157:H7 EDL933 # 1 75 14 88 88 133 100.0 7e-32 MPHIDIKCFPRELDEQQKAALAADITDVIIRHLNSKDSSISIALQQIQPESWQAIWDAEI APQMEALIKKPGYSM Prediction of potential genes in microbial genomes Time: 
Mon May 16 16:18:39 2011 Seq name: gi|296493118|gb|ADTK01000383.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1253.4, whole genome shotgun sequence Length of sequence - 23961 bp Number of predicted genes - 23, with homology - 23 Number of transcription units - 17, operones - 5 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 38 - 97 2.4 1 1 Tu 1 . + CDS 120 - 752 111 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 937 - 975 4.0 + Prom 897 - 956 7.7 2 2 Tu 1 . + CDS 1120 - 1848 161 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 1920 - 1947 -0.1 3 3 Tu 1 . + CDS 1993 - 2274 132 ## gi|293412945|ref|ZP_06655613.1| predicted protein - Term 2154 - 2199 10.6 4 4 Op 1 27/0.000 - CDS 2211 - 5324 3170 ## COG0841 Cation/multidrug efflux pump 5 4 Op 2 2/0.667 - CDS 5349 - 6506 1011 ## COG0845 Membrane-fusion protein - Prom 6616 - 6675 2.8 6 5 Tu 1 . - CDS 6845 - 7372 401 ## COG2771 DNA-binding HTH domain-containing proteins - Term 8108 - 8153 12.6 7 6 Tu 1 . - CDS 8171 - 8743 496 ## COG3247 Uncharacterized conserved protein - Prom 8874 - 8933 5.5 + Prom 8851 - 8910 5.9 8 7 Op 1 . + CDS 8998 - 9330 464 ## EC55989_3953 acid-resistance protein + Term 9377 - 9419 0.4 + Prom 9343 - 9402 2.4 9 7 Op 2 . + CDS 9434 - 9772 344 ## SSON_3577 acid-resistance protein + Term 9790 - 9832 7.2 10 8 Tu 1 . + CDS 9836 - 10483 595 ## COG1285 Uncharacterized membrane protein - Term 10364 - 10409 0.5 11 9 Tu 1 . - CDS 10525 - 11028 357 ## COG2771 DNA-binding HTH domain-containing proteins - Prom 11120 - 11179 6.1 - Term 11158 - 11199 9.3 12 10 Tu 1 . - CDS 11211 - 11777 591 ## COG3065 Starvation-inducible outer membrane lipoprotein - Prom 11964 - 12023 9.1 13 11 Tu 1 . 
- CDS 12402 - 13247 98 ## Z4907 hypothetical protein 14 12 Op 1 2/0.667 - CDS 13872 - 14297 555 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family 15 12 Op 2 7/0.000 - CDS 14310 - 15599 1473 ## COG1055 Na+/H+ antiporter NhaD and related arsenite permeases 16 12 Op 3 . - CDS 15653 - 16006 224 ## COG0640 Predicted transcriptional regulators - Term 16383 - 16411 -1.0 17 13 Tu 1 . - CDS 16545 - 16718 131 ## EC55989_3940 hypothetical protein + Prom 16493 - 16552 4.7 18 14 Tu 1 . + CDS 16680 - 16829 109 ## EcE24377A_3983 hypothetical protein + Term 16848 - 16885 1.3 - Term 16836 - 16874 7.0 19 15 Op 1 7/0.000 - CDS 16883 - 18235 502 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 20 15 Op 2 . - CDS 18307 - 19149 894 ## COG2961 Protein involved in catabolism of external DNA - Prom 19208 - 19267 5.6 + Prom 19237 - 19296 3.1 21 16 Op 1 5/0.333 + CDS 19352 - 21394 2616 ## COG0339 Zn-dependent oligopeptidases 22 16 Op 2 . + CDS 21402 - 22154 735 ## COG0500 SAM-dependent methyltransferases + Term 22162 - 22199 9.4 - Term 22145 - 22191 12.5 23 17 Tu 1 . 
- CDS 22203 - 23672 1537 ## COG3104 Dipeptide/tripeptide permease - Prom 23793 - 23852 6.2 Predicted protein(s) >gi|296493118|gb|ADTK01000383.1| GENE 1 120 - 752 111 210 aa, chain + ## HITS:1 COG:ECs4396 KEGG:ns NR:ns ## COG: ECs4396 COG2207 # Protein_GI_number: 15833650 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 210 65 274 274 408 98.0 1e-114 MLKEEALNLHAHKKISSLLVHHCSRDIPVFQEVAQLSQNKNLRYAEMLRKRALIFALLSV FLEDEHFIPLLLNVLQPNMRTRVCTVINNNIAHEWTLARIASELLMSPSLLKKKLREEET SYSQLLTECRMQRALQLIVIHGFSIKRVAVSCGYHSVSYFIYVFRNYYGMTPTEYQERSA QGLPNRDSAASIVAQGNFYGTDRSAEGIRL >gi|296493118|gb|ADTK01000383.1| GENE 2 1120 - 1848 161 242 aa, chain + ## HITS:1 COG:ECs4395 KEGG:ns NR:ns ## COG: ECs4395 COG2207 # Protein_GI_number: 15833649 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 242 1 242 242 469 99.0 1e-132 MTHVCSVILIRRSFDIYHEQHKISLHNESIVLLEKNLADDFAFCSPDTRRLDIDELTVCH YLQNIRQLPRNLGLHSKDRLLINQSPPMPLVTAIFDSFNESGVNSPILSNMLYLSCLSMF SHKKELIPLLFNSISTVSGKVERLISFDIAKRWYLRDIAERMYTSESLIKKKLQDENTCF SKILLASRMSMARRLLELRQIPLHTIAEKCGYSSTSYFINTFRQYYGVTPHQFAQHSPGT FS >gi|296493118|gb|ADTK01000383.1| GENE 3 1993 - 2274 132 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293412945|ref|ZP_06655613.1| ## NR: gi|293412945|ref|ZP_06655613.1| predicted protein [Escherichia coli B354] # 1 84 1 84 93 143 100.0 3e-33 MFGIIKLTIHTITGMWVSIVLFKLMTNGWSGFYFQCCVLSLVFLTVSWLLSGEWLAGKSK AEPSRSTLLSFTRYAFLKRAKRCSTTTKKTGTK >gi|296493118|gb|ADTK01000383.1| GENE 4 2211 - 5324 3170 1037 aa, chain - ## HITS:1 COG:ECs4394 KEGG:ns NR:ns ## COG: ECs4394 COG0841 # Protein_GI_number: 15833648 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Escherichia coli O157:H7 # 1 1037 1 1037 1037 1895 99.0 0 MANYFIDRPVFAWVLAIIMMLAGGLAIMNLPVAQYPQIAPPTITVSATYPGADAQTVEDS VTQVIEQNMNGLDGLMYMSSTSDAAGNASITLTFETGTSPDIAQVQVQNKLQLAMPSLPE 
AVQQQGISVDKSSSNILMVAAFISDNGSLNQYDIADYVASNIKDPLSRTAGVGSVQLFGS EYAMRIWLDPQKLNKYNLVPSDVISQIKVQNNQISGGQLGGMPQAADQQLNASIIVQTRL QTPEEFGKILLKVQQDGSQVLLRDVARVELGAEDYSTVARYNGKPAAGIAIKLATGANAL DTSRAVKEELNRLSAYFPASLKTVYPYDTTPFIEISIQEVFKTLVEAIILVFLVMYLFLQ NFRATIIPTIAVPVVILGTFAILSAVGFTINTLTMFGMVLAIGLLVDDAIVVVENVERVI AEDKLPPKEATHKSMGQIQRALVGIAVVLSAVFMPMAFMSGATGEIYRQFSITLISSMLL SVFVAMSLTPALCATILKAAPEGGHKPNALFARFNTLFEKSTQHYTDSTRSLLRCTGRYM VVYLLICAGMAVLFLRTPTSFLPEEDQGVFMTTAQLPSGATMVNTTKVLQQVTDYYLIKE KDNVQSVFTVGGFGFSGQGQNNGLAFISLKPWSERVGEENSVTAIIQRAMIALSSINKAV VFPFNLPAVAELGTASGFDMELLDNGNLGHEKLTQARNELLSLAAQSPDQVTGVRPNGLE DTPMFKVNVNAAKAEAMGVALSDINQTISTAFGSSYVNDFLNQGRVKKVYVQAGTPFRML PDNINQWYVRNASGTMAPLSAYSSTEWTYGSPRLERYNGIPSMEILGEAAAGKSTGDAMK FMADLVAKLPAGVGYSWTGLSYQEALSSNQAPALYAISLVVVFLALAALYESWSIPFSVM LVVPLGVVGALLATDLRGLSNDVYFQVGLLTTIGLSAKNAILIVEFAVEMMQKEGKTPIE AIIEAARMRLRPILMTSLAFILGVLPLVISHGAGSGAQNAVGTGVMGGMFAATVLAIYFV PVFFVVVEHLFARFKKA >gi|296493118|gb|ADTK01000383.1| GENE 5 5349 - 6506 1011 385 aa, chain - ## HITS:1 COG:yhiU KEGG:ns NR:ns ## COG: yhiU COG0845 # Protein_GI_number: 16131385 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli K12 # 1 385 1 385 385 687 100.0 0 MNRRRKLLIPLLFCGAMLTACDDKSAENAAAMTPEVGVVTLSPGSVNVLSELPGRTVPYE VAEIRPQVGGIIIKRNFIEGDKVNQGDSLYQIDPAPLQAELNSAKGSLAKALSTASNARI TFNRQASLLKTNYVSRQDYDTARTQLNEAEANVTVAKAAVEQATINLQYANVTSPITGVS GKSSVTVGALVTANQADSLVTVQRLDPIYVDLTQSVQDFLRMKEEVASGQIKQVQGSTPV QLNLENGKRYSQTGTLKFSDPTVDETTGSVTLRAIFPNPNGDLLPGMYVTALVDEGSRQN VLLVPQEGVTHNAQGKATALILDKDDVVQLREIEASKAIGDQWVVTSGLQAGDRVIVSGL QRIRPGIKARAISSSQENASTESKQ >gi|296493118|gb|ADTK01000383.1| GENE 6 6845 - 7372 401 175 aa, chain - ## HITS:1 COG:ECs4392 KEGG:ns NR:ns ## COG: ECs4392 COG2771 # Protein_GI_number: 15833646 # Func_class: K Transcription # Function: DNA-binding HTH domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 175 1 175 175 302 100.0 2e-82 
MIFLMTKDSFLLQGFWQLKDNHEMIKINSLSEIKKVGNKPFKVIIDTYHNHILDEEAIKF LEKLDAERIIVLAPYHISKLKAKAPIYFVSRKESIKNLLEITYGKHLPHKNSQLCFSHNQ FKIMQLILKNKNESNITSTLNISQQTLKIQKFNIMYKLKLRRMSDIVTLGITSYF >gi|296493118|gb|ADTK01000383.1| GENE 7 8171 - 8743 496 190 aa, chain - ## HITS:1 COG:ECs4391 KEGG:ns NR:ns ## COG: ECs4391 COG3247 # Protein_GI_number: 15833645 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 190 1 190 190 294 99.0 8e-80 MLYIDKATILKFDLEMLKKHRRAIQFIAVLLFIVGLLCISFPFVSGDILSTVVGALLICS GNALIVGLFSNRSHNFWPVLSGFLVAVAYLLIGYFFIRAPELGIFAIAAFIAGLFCVAGV IRLMSWYRQRSMKGSWLQLVIGVLDIVIAWIFLGATPMVSVTLVSTLVGIELIFSAASLF SFASLFVKQQ >gi|296493118|gb|ADTK01000383.1| GENE 8 8998 - 9330 464 110 aa, chain + ## HITS:1 COG:no KEGG:EC55989_3953 NR:ns ## KEGG: EC55989_3953 # Name: hdeA # Def: acid-resistance protein # Organism: E.coli_55989 # Pathway: not_defined # 19 110 19 110 110 157 100.0 7e-38 MKKVLGVILGGLLLLPVVSNAADAQKAADNKKPVNSWTCEDFLAVDESFQPTAVGFAEAL NNKDKPEDAVLDVQGIATVTPAIVQACTQDKQANFKDKVKGEWDKIKKDM >gi|296493118|gb|ADTK01000383.1| GENE 9 9434 - 9772 344 112 aa, chain + ## HITS:1 COG:no KEGG:SSON_3577 NR:ns ## KEGG: SSON_3577 # Name: hdeB # Def: acid-resistance protein # Organism: S.sonnei # Pathway: not_defined # 1 112 1 112 112 216 100.0 1e-55 MGYKMNISSLRKAFIFMGAVAALSLVNAQSALAANESAKDMTCQEFIDLNPKAMTPVAWW MLHEETVYKGGDTVTLNETDLTQIPKVIEYCKKNPQKNLYTFKNQASNDLPN >gi|296493118|gb|ADTK01000383.1| GENE 10 9836 - 10483 595 215 aa, chain + ## HITS:1 COG:ECs4388 KEGG:ns NR:ns ## COG: ECs4388 COG1285 # Protein_GI_number: 15833642 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Escherichia coli O157:H7 # 1 215 1 215 215 381 100.0 1e-106 MTAEFIIRLILAAIACGAIGMERQMRGKGAGLRTHVLIGMGSALFMIVSKYGFADVLSLD HVGLDPSRIAAQVVTGVGFIGAGNILVRNQNIVGLTTAADIWVTAAIGMVIGSGMYELGI YGSVMTLLVLEVFHQLTFRLMNKNYHLQLTLVNGNTVSMLDWFKQQKIKTDLVSLQENED HEVVAIDIQLHATTSIEDLLRLLKGMAGVKGVSIS >gi|296493118|gb|ADTK01000383.1| GENE 11 
10525 - 11028 357 167 aa, chain - ## HITS:1 COG:ECs4378 KEGG:ns NR:ns ## COG: ECs4378 COG2771 # Protein_GI_number: 15833632 # Func_class: K Transcription # Function: DNA-binding HTH domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 167 10 176 176 296 100.0 9e-81 MFFTAMKNILSKGNVVHIQNEEEIDVMLHQNAFVIIDTLMNNVFHSNFLTQIERLKPVHV IIFSPFNIKRCLGKVPVTFVPRTITIIDFVALINGSYCSVPEANVSLSRKQHQVLSCIAN QMTTEDILEKLKISLKTFYCHKHNIMMILNLKRINELVRHQHIDYLV >gi|296493118|gb|ADTK01000383.1| GENE 12 11211 - 11777 591 188 aa, chain - ## HITS:1 COG:ECs4377 KEGG:ns NR:ns ## COG: ECs4377 COG3065 # Protein_GI_number: 15833631 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Starvation-inducible outer membrane lipoprotein # Organism: Escherichia coli O157:H7 # 1 188 12 199 199 381 100.0 1e-106 MNMTKGALILSLSFLLAACSSIPQNIKGNNQPDIQKSFVAVHNQPGLYVGQQARFGGKVI NVINGKTDTLLEIAVLPLDSYAKPDIEANYQGRLLARQSGFLDPVNYRNHFVTILGTIQG EQPGFINKVPYNFLEVNMQGIQVWHLREVVNTTYNLWDYGYGAFWPEPGWGAPYYTNAVS QVTPELVK >gi|296493118|gb|ADTK01000383.1| GENE 13 12402 - 13247 98 281 aa, chain - ## HITS:1 COG:no KEGG:Z4907 NR:ns ## KEGG: Z4907 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157 # Pathway: not_defined # 1 278 1 278 371 485 92.0 1e-136 MSIDFTPGMINTYHGDIYNRTTDTDNVKTPDTPMWPCDNREEQQPINSTFSGEGYNPEQY DLARQHLPQIYACRTNTKYTDADYSKVVAQLVSLITNIESISSTPLTRQAQEILNQINNI RYEKNKSAECRIIVVANPKPDKAIITKISVEEGIPVRFSVQTMFSDTNFIAEQRADLPTN IKDIQSLYQKMTKLYIEHSENKNRMKVFAGTNFIDFNMTGQNLSGFVLTLSRFYFEDLLN INFTDANLGDTIFYIKNTLPPNYIKMDNILTNKSKVYFQHY >gi|296493118|gb|ADTK01000383.1| GENE 14 13872 - 14297 555 141 aa, chain - ## HITS:1 COG:arsC KEGG:ns NR:ns ## COG: arsC COG1393 # Protein_GI_number: 16131375 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Escherichia coli K12 # 1 141 1 141 141 269 98.0 9e-73 MSNITIYHNPACGTSRNTLEMIRNSGTEPTIIHYLETPPTRDELVKLIADMGITVRALLR 
KNVEPYEELGLAEDKFTDERLIDFMLQHPILINRPIVVTPLGTRLCRPSEVVLEILPDAQ KGAFSKEDGEKVVDEAGKRLK >gi|296493118|gb|ADTK01000383.1| GENE 15 14310 - 15599 1473 429 aa, chain - ## HITS:1 COG:arsB KEGG:ns NR:ns ## COG: arsB COG1055 # Protein_GI_number: 16131374 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter NhaD and related arsenite permeases # Organism: Escherichia coli K12 # 1 429 8 436 436 679 99.0 0 MLLAGAIFVLTIVLVIWQPKGLGIGWSATLGAVLALVTGVVHPGDIPVVWNIVWNATAAF IAVIIISLLLDESGFFEWAALHVSRWGNGRGRLLFTWIVLLGAAVAALFANDGAALILTP IVIAMLLALGFSKGTTLAFVMAAGFIADTASLPLIVSNLVNIVSADFFGLGFREYASVMV PVDIAAIVATLVMLHLYFRKDIPQNYDMALLKSPAEAIKDPATFKTGWVVLLLLLVGFFV LEPLGIPVSAIAAVGALILFVVAKRGHAINTGKVLRGAPWQIVIFSLGMYLVVYGLRNAG LTEYLSGVLNVLADNGLWAATLGTGFLTAFLSSIMNNMPTVLVGALSIDGSTASGVIKEA MVYANVIGCDLGPKITPIGSLATLLWLHVLSQKNMTISWGYYFRTGIIMTLPVLFVTLVA LALRLSFTL >gi|296493118|gb|ADTK01000383.1| GENE 16 15653 - 16006 224 117 aa, chain - ## HITS:1 COG:arsR KEGG:ns NR:ns ## COG: arsR COG0640 # Protein_GI_number: 16131373 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 117 1 117 117 225 100.0 2e-59 MSFLLPIQLFKILADETRLGIVLLLSELGELCVCDLCTALDQSQPKISRHLALLRESGLL LDRKQGKWVHYRLSPHIPAWAAKIIDEAWRCEQEKVQAIVRNLARQNCSGDSKNICS >gi|296493118|gb|ADTK01000383.1| GENE 17 16545 - 16718 131 57 aa, chain - ## HITS:1 COG:no KEGG:EC55989_3940 NR:ns ## KEGG: EC55989_3940 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 57 3 59 59 97 100.0 1e-19 MHPLTHPLPVTAHALLLDNGILTPARASVNGTTRTSDQDFESVYAHYQSENASELTG >gi|296493118|gb|ADTK01000383.1| GENE 18 16680 - 16829 109 49 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_3983 NR:ns ## KEGG: EcE24377A_3983 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 49 1 49 49 68 97.0 8e-11 MRRDRQWMSKRMHSHSIAWRKRVIDKAIIVLGALIALLELIRFLLQLLN >gi|296493118|gb|ADTK01000383.1| GENE 19 16883 - 18235 502 450 aa, chain 
- ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 5 444 4 442 458 197 31 4e-50 MTKHYDYIAIGGGSGGIASINRAAMYGQKCALIEAKELGGTCVNVGCVPKKVMWHAAQIR EAIHMYGPDYGFDTTINKFNWETLIASRTAYIDRIHTSYENVLGKNNVDVIKGFARFVDA KTLEVNGETITADHILIATGGRPSHPDIPGVEYGIDSDGFFALPALPERVAVVGAGYIAV ELAGVINGLGAKTHLFVRKHAPLRSFDPMISETLVEVMNAEGPQLHTNAIPKAVVKNADG SLTLELEDGRSETVDCLIWAIGREPANDNINLEAAGVKTNEKGYIVVDKYQNTNIEGIYA VGDNTGAVELTPVAVAAGRRLSERLFNNKPDEHLDYSNIPTVVFSHPPIGTVGLTEPQAR EQYGDDQVKVYKSSFTAMYTAVTTHRQPCRMKLVCVGSEEKIVGIHGIGFGMDEMLQGFA VALKMGATKKDFDNTVAIHPTAAEEFVTMR >gi|296493118|gb|ADTK01000383.1| GENE 20 18307 - 19149 894 280 aa, chain - ## HITS:1 COG:ECs4371 KEGG:ns NR:ns ## COG: ECs4371 COG2961 # Protein_GI_number: 15833625 # Func_class: R General function prediction only # Function: Protein involved in catabolism of external DNA # Organism: Escherichia coli O157:H7 # 1 280 1 280 280 555 99.0 1e-158 MLSYRHSFHAGNHADVLKHTVQSLIIESLKEKDKPFLYLDTHAGAGRYQLGSEHAERTGE YLEGIARIWQQDDLPAELEAYINVVKHFNRSGQLRYYPGSPLIARQLLREQDSLQLTELH PSDYPLLRSEFQKDSRARVEKADGFQQLKAKLPPVSRRGLILIDPPYEMKTDYQAVVSGI AEGYKRFATGTYALWYPVVLRQQIKRMIHDLEATGIRKILQIELAVLPDSDRRGMTASGM IVINPPWKLEQQMNNVLPWLHSKLVPAGTGHATVSWIVPE >gi|296493118|gb|ADTK01000383.1| GENE 21 19352 - 21394 2616 680 aa, chain + ## HITS:1 COG:ECs4370 KEGG:ns NR:ns ## COG: ECs4370 COG0339 # Protein_GI_number: 15833624 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent oligopeptidases # Organism: Escherichia coli O157:H7 # 1 680 1 680 680 1366 99.0 0 MTNPLLTPFELPPFSKILPEHVVPAVTKALNDCRENVERVVAQGAPYTWENLCQPLAEVD DVLGRIFSPVSHLNSVKNSPELREAYEQTLPLLSEYSTWVGQHEGLYKAYRDLRDGDHYA TLNTAQKKAVDNALRDFELSGIGLPIEKQQRYGEIATRLSELGNQYSNNVLDATMGWTKL VTDEAELAGMPESALAAAKAQAEAKELEGYLLTLDIPSYLPVMTYCDNQALREEMYRAYS TRASDQGPNAGKWDNSKVMEEILALRHELAQLLGFENYAFKSLATKMAENPQQVLDFLTD LAKRARPQGEKELAQLRAFAKAEFGVDELQPWDIAYYSEKQKQHLYSISDEQLRPYFPEN KAVNGLFEVVKRIYGITAKERKDVDVWHPDVRFFELYDENNELRGSFYLDLYARENKRGG 
AWMDDCVGQMRKADGSLQKPVAYLTCNFNRPVNGKPALFTHDEVITLFHEFGHGLHHMLT RIETAGVSGISGVPWDAVELPSQFMENWCWEPEALAFISGHYETGEPLPKELLDNMLAAK NYQAALFILRQLEFGLFDFRLHAEFRPDQGAKILETLAEIKKLVAVVPSPSWGRFPHAFS HIFAGGYAAGYYSYLWADVLAADAFSRFEEEGIFNRETGQSFLDNILSRGGSEEPMDLFK RFRGREPQLDAMLEHYGIKG >gi|296493118|gb|ADTK01000383.1| GENE 22 21402 - 22154 735 250 aa, chain + ## HITS:1 COG:ECs4369 KEGG:ns NR:ns ## COG: ECs4369 COG0500 # Protein_GI_number: 15833623 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli O157:H7 # 1 250 1 250 250 482 99.0 1e-136 MKICLIDETGAGDGALSVLAARWGLEHDEDNLMALVLTPEHLELRKRDEPKLGGIFVDFV GGAMAHRRKFGGGRGEAVAKAVGIKGDYLPDVVDATAGLGRDAFVLASVGCRVRMLERNP VVAALLDDGLARGYADAEIGGWLQERLQLIHASSLTALTDITPRPQVVYLDPMFPHKQKS ALVKKEMRVFQSLVGPDLDADGLLEPARLLATKRVVVKRPDYAPPLANVATPNAVVTKGH RFDIYAGTPV >gi|296493118|gb|ADTK01000383.1| GENE 23 22203 - 23672 1537 489 aa, chain - ## HITS:1 COG:yhiP KEGG:ns NR:ns ## COG: yhiP COG3104 # Protein_GI_number: 16131368 # Func_class: E Amino acid transport and metabolism # Function: Dipeptide/tripeptide permease # Organism: Escherichia coli K12 # 1 489 1 489 489 865 99.0 0 MNTTTPMGMLQQPRPFFMIFFVELWERFGYYGVQGVLAVFFVKQLGFSQEQAFVTFGAFA ALVYGLISIGGYVGDHLLGTKRTIVLGALVLAIGYFMTGMSLLKPDLIFIALGTIAVGNG LFKANPASLLSKCYPPKDPRLDGAFTLFYMSINIGSLIALSLAPVIADRFGYSVTYNLCG AGLIIALLVYIACRGMVKDIGSEPDFKPMSFSKLLYVLLGSVVMIFVCAWLMHNVEVANL VLIVLSIVVTIIFFRQAFKLDKTGRNKMFVAFVLMLEAVVFYILYAQMPTSLNFFAINNV HHEILGFSINPVSFQALNPFWVVLASPILAGIYTHLGNKGKDLSMPMKFTLGMFMCSLGF LTAAAAGMWFADAQGLTSPWFIVLVYLFQSLGELFISALGLAMIAALVPQHLMGFILGMW FLTQAAAFLLGGYVATFTAVPDNITDPLETLPVYTNVFGKIGLVTLGVAVVMLLMVPWLK RMIATPESH Prediction of potential genes in microbial genomes Time: Mon May 16 16:19:09 2011 Seq name: gi|296493117|gb|ADTK01000384.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1253.5, whole genome shotgun sequence Length of sequence - 38839 bp Number of predicted genes - 38, 
with homology - 38 Number of transcription units - 24, operones - 8 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 35 - 469 576 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins - Prom 666 - 725 5.2 + Prom 643 - 702 2.2 2 2 Tu 1 . + CDS 914 - 1195 238 ## G2583_4220 universal stress protein B - Term 1212 - 1249 6.9 3 3 Tu 1 . - CDS 1266 - 2765 1580 ## COG0306 Phosphate/sulphate permeases - Prom 2875 - 2934 6.9 + Prom 2850 - 2909 8.2 4 4 Tu 1 . + CDS 2997 - 4199 1007 ## COG2081 Predicted flavoproteins + Term 4229 - 4274 -0.9 - Term 4452 - 4493 3.0 5 5 Tu 1 . - CDS 4514 - 5521 406 ## G2583_4217 inner membrane protein YhiM - Prom 5740 - 5799 7.9 + Prom 5728 - 5787 10.7 6 6 Tu 1 . + CDS 5950 - 7539 328 ## ECIAI1_3634 hypothetical protein + Term 7606 - 7648 2.6 + Prom 7664 - 7723 6.0 7 7 Tu 1 . + CDS 7801 - 9423 121 ## ECIAI1_3633 hypothetical protein + Prom 9522 - 9581 3.6 8 8 Op 1 10/0.000 + CDS 9789 - 10856 1068 ## COG0845 Membrane-fusion protein 9 8 Op 2 45/0.000 + CDS 10853 - 13588 2695 ## COG1131 ABC-type multidrug transport system, ATPase component 10 8 Op 3 . + CDS 13588 - 14712 1154 ## COG0842 ABC-type multidrug transport system, permease component + Term 14925 - 14965 1.3 + Prom 14821 - 14880 3.0 11 9 Tu 1 . 
+ CDS 15043 - 15402 454 ## COG4226 Uncharacterized protein encoded in hypervariable junctions of pilus gene clusters + Term 15454 - 15503 6.3 12 10 Op 1 3/0.667 - CDS 15522 - 15923 506 ## COG0864 Predicted transcriptional regulators containing the CopG/Arc/MetJ DNA-binding domain and a metal-binding domain 13 10 Op 2 17/0.000 - CDS 15929 - 16735 405 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 14 10 Op 3 44/0.000 - CDS 16732 - 17496 357 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 15 10 Op 4 49/0.000 - CDS 17496 - 18329 1034 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 16 10 Op 5 38/0.000 - CDS 18326 - 19270 254 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 17 10 Op 6 . - CDS 19270 - 20844 1775 ## COG0747 ABC-type dipeptide transport system, periplasmic component - Prom 20874 - 20933 4.7 18 11 Tu 1 . + CDS 20869 - 21045 110 ## UTI89_C3992 hypothetical protein + Term 21135 - 21171 2.0 - Term 20869 - 20909 3.0 19 12 Op 1 1/0.833 - CDS 20955 - 21542 502 ## COG2091 Phosphopantetheinyl transferase 20 12 Op 2 . - CDS 21597 - 22646 907 ## COG0628 Predicted permease - Prom 22667 - 22726 2.9 + Prom 22693 - 22752 2.2 21 13 Tu 1 . + CDS 22778 - 23995 1180 ## COG0477 Permeases of the major facilitator superfamily + Term 24042 - 24086 5.1 - Term 23938 - 23985 7.6 22 14 Op 1 . - CDS 23999 - 24556 713 ## ECO103_4192 hypothetical protein 23 14 Op 2 . - CDS 24629 - 25294 656 ## COG1738 Uncharacterized conserved protein - Prom 25356 - 25415 4.2 + Prom 25432 - 25491 6.5 24 15 Tu 1 . + CDS 25515 - 25760 312 ## COG0425 Predicted redox protein, regulator of disulfide bond formation + Term 25825 - 25874 7.2 25 16 Op 1 5/0.500 - CDS 25862 - 28060 2304 ## COG2217 Cation transport ATPase - Term 28092 - 28123 3.4 26 16 Op 2 . - CDS 28134 - 28760 668 ## COG3714 Predicted membrane protein - Prom 28873 - 28932 3.0 + Prom 28818 - 28877 3.4 27 17 Tu 1 . 
+ CDS 28901 - 29260 371 ## B21_03269 hypothetical protein + Term 29341 - 29373 -1.0 28 18 Op 1 6/0.167 - CDS 29263 - 29532 347 ## COG3776 Predicted membrane protein 29 18 Op 2 . - CDS 29522 - 30118 190 ## PROTEIN SUPPORTED gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 - Prom 30248 - 30307 3.2 + Prom 30137 - 30196 5.1 30 19 Op 1 9/0.000 + CDS 30268 - 31761 748 ## PROTEIN SUPPORTED gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 31 19 Op 2 28/0.000 + CDS 31764 - 32432 348 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 32 19 Op 3 7/0.167 + CDS 32425 - 33483 974 ## COG2177 Cell division protein + Term 33527 - 33566 2.1 + Prom 33572 - 33631 6.2 33 20 Tu 1 . + CDS 33728 - 34582 1037 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) + Term 34599 - 34640 7.2 + Prom 34659 - 34718 5.4 34 21 Tu 1 . + CDS 34854 - 35957 1349 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component + Term 35978 - 36015 3.1 + Prom 36005 - 36064 2.4 35 22 Op 1 9/0.000 + CDS 36107 - 36373 303 ## COG4453 Uncharacterized protein conserved in bacteria 36 22 Op 2 . + CDS 36380 - 36907 415 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Term 36785 - 36823 -0.8 37 23 Tu 1 . - CDS 36904 - 37287 377 ## SSON_3697 hypothetical protein - Prom 37326 - 37385 5.4 + Prom 37632 - 37691 2.1 38 24 Tu 1 . 
+ CDS 37711 - 38820 1240 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component Predicted protein(s) >gi|296493117|gb|ADTK01000384.1| GENE 1 35 - 469 576 144 aa, chain - ## HITS:1 COG:ECs4367 KEGG:ns NR:ns ## COG: ECs4367 COG0589 # Protein_GI_number: 15833621 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Escherichia coli O157:H7 # 1 144 1 144 144 271 100.0 3e-73 MAYKHILIAVDLSPESKVLVEKAVSMARPYNAKVSLIHVDVNYSDLYTGLIDVNLGDMQK RISEETHHALTELSTNAGYPITETLSGSGDLGQVLVDAIKKYDMDLVVCGHHQDFWSKLM SSARQLINTVHVDMLIVPLRDEEE >gi|296493117|gb|ADTK01000384.1| GENE 2 914 - 1195 238 93 aa, chain + ## HITS:1 COG:no KEGG:G2583_4220 NR:ns ## KEGG: G2583_4220 # Name: uspB # Def: universal stress protein B # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 93 19 111 111 184 100.0 1e-45 MARYFSSLRALLVVLRNCDPLLYQYVDGGGFFTSHGQPNKQVRLVWYIYAQRYRDHHDDE FIRRCERVRRQFILTSALCGLVVVSLIALMIWH >gi|296493117|gb|ADTK01000384.1| GENE 3 1266 - 2765 1580 499 aa, chain - ## HITS:1 COG:ECs4365 KEGG:ns NR:ns ## COG: ECs4365 COG0306 # Protein_GI_number: 15833619 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate/sulphate permeases # Organism: Escherichia coli O157:H7 # 1 499 1 499 499 893 100.0 0 MLHLFAGLDLHTGLLLLLALAFVLFYEAINGFHDTANAVATVIYTRAMRSQLAVVMAAVF NFLGVLLGGLSVAYAIVHMLPTDLLLNMGSSHGLAMVFSMLLAAIIWNLGTWYFGLPASS SHTLIGAIIGIGLTNALMTGTSVVDALNIPKVLSIFGSLIVSPIVGLVFAGGLIFLLRRY WSGTKKRARIHLTPAEREKKDGKKKPPFWTRIALILSAIGVAFSHGANDGQKGIGLVMLV LIGVAPAGFVVNMNATGYEITRTRDAINNVEAYFEQHPALLKQATGADQLVPAPEAGATQ PAEFHCHPSNTINALNRLKGMLTTDVESYDKLSLDQRSQMRRIMLCVSDTIDKVVKMPGV SADDQRLLKKLKSDMLSTIEYAPVWIIMAVALALGIGTMIGWRRVATTIGEKIGKKGMTY AQGMSAQMTAAVSIGLASYTGMPVSTTHVLSSSVAGTMVVDGGGLQRKTVTSILMAWVFT LPAAVLLSGGLYWLSLQFL >gi|296493117|gb|ADTK01000384.1| GENE 4 2997 - 4199 1007 400 aa, chain + ## HITS:1 COG:yhiN KEGG:ns NR:ns ## COG: yhiN COG2081 # Protein_GI_number: 16131364 # Func_class: R 
General function prediction only # Function: Predicted flavoproteins # Organism: Escherichia coli K12 # 1 400 1 400 400 807 99.0 0 MERFDAIIIGAGAAGMFCSALAGQAGRRVLLIDNGKKPGRKILMSGGGRCNFTNLYVEPG AYLSQNPHFCKSALARFTQWDFIDLVNKHGIAWHEKTLGQLFCDDSAQQIVDMLVDECEK GNVTFRLRSEVLSVAKDETGFTLELNGMTVGCEKLVIATGGLSMPGLGASPFGYKIAEQF GLNVLPTRAGLVPFTLHKPLLEELQVLAGVAVPSVITAENGIVFRENLLFTHRGLSGPAV LQISSYWQPGEFVSINLLPDVDLETFLNEQRNAHPNQSLKNTLAVHLPKRLVERLQQLGQ IPDVSLKQLNVRDQQALISTLTDWRVQPNGTEGYRTAEVTLGGVDTNELSSRTMEARKVP GLYFIGEVMDVTGWLGGYNFQWAWSSAWACAQDLIAAKSS >gi|296493117|gb|ADTK01000384.1| GENE 5 4514 - 5521 406 335 aa, chain - ## HITS:1 COG:no KEGG:G2583_4217 NR:ns ## KEGG: G2583_4217 # Name: yhiM # Def: inner membrane protein YhiM # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 335 49 383 383 583 100.0 1e-165 MGLICIALGGFVLESSGQSEYFVAGHVLISLAAICLALFTTAFIIISQLTRGVNTFYNTL FPIIGYAGSIITMIWGWALLAGNDVMADEFVAGHVIFGVGMIAACVSTVAASSGHFLLIP KNAAGSKSDGTPVQAYSSLIGNCLIAVPVLLTLLGFIWSITLLRSADITPHYVAGHVLLG LTAICACLIGLVATIVHQTRNTFSTKEHWLWCYWVIFLGSITVLQGIYVLVSSDASARLA PGIILICLGMICYSIFSKVWLLALVWRRTCSLANRIPMIPVFTCLFCLFLASFLAEMAQT DMGYFIPSRVLVGLGAVCFTLFSIVSILEAGSAKK >gi|296493117|gb|ADTK01000384.1| GENE 6 5950 - 7539 328 529 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_3634 NR:ns ## KEGG: ECIAI1_3634 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 529 1 529 529 1061 100.0 0 MKVFHNIFKHFSSNHQDKHSDKVNSHQHHGKVDKTHRAKIVEFDKLDNDSQIDNDFGLHI IYFLQHGHWKVNDRSHQMEKVWFYNSEPSIDIQEYNRFADNTTDTFIFTIIPDNNHVLKL SSPITVTVECKGGYYFINSSGDKSDIIYKVDGLSIIARNFFTLLSGNFKPDWRWDVSKET FTKEKFDSYVKPVFSKIDFYKQCGVINPQNANTAYFGDTDGRVGAVLYALLVSGHIGIRE KGWSLLCELLKHEEMASSAYKHKNNKVLYDLLNTRDMILNELHQHVFLKDDAITPCIFLG DHTGDRFSTIFGDKYILTLLNSMRNMEGNKDSRINKNVVVLAGNHEINFNGNYTARLANH KLSAGDTYNLIKTLDVCNYDSERQVLTSHHGIIRDEEKKCYCLGALQVPFNQMKNPTDPE ELANIFNKKHKEHMDDRFIHLIRSNAIASTPVYDNYFNNTTAFRPKPEDIFKCGQTLKKT KQKYGHYGLGVDQHQKIDNYTMGLNSWKIAPNERGDKKGVPGLSCFQPH >gi|296493117|gb|ADTK01000384.1| GENE 7 7801 - 9423 121 540 
aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_3633 NR:ns ## KEGG: ECIAI1_3633 # Name: yhiJ # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 540 1 540 540 1102 100.0 0 MKIGTVAGTNDSTTTIATNDMVQEHVTNFTKELFGYIANGIGDDISSIARTMLGEVVEKI DDWQIERFQQSIRDDKISFTIQTNHSEKYSMLSDMRAHILRRDNNYQFIVTINSKNYGCY LDNTDINWCSIVYLLNNMTVNDDANDVAVTESYKPVWDWKISQFNVSDIKFETMINPQFA DRTYFSNCSPVDPTSTRPTYFGDTDGSVGAVLYALFATGHLRIMAEGENFLSQLLNIEDE VLNVLLRENFNEQLDTNVNTIISILNRRDIVLESLQPYLVINKDAVTPCTFLGDQTGDRF SNICGDQFIIDLLKRIMSINENVHVLAGNHETNCNGNYMQNFTRMKPLDEDTYAGIKDYP VCFYDPKYKIMANHHGITFDEQRKRYIIGPITVSIDEMTNALDPVELAAIINKKHHAIIN GKKFKTSRAISCRSFNRYFSVSTDYRPKLEALLACSQMLGINQVVAHNGNGGRERIGETG TVLGLNARDSKHAGRMFSMHNCQINPGAGPEITTPWKSYQHEKNRNGLMPLIRRRTMLQL >gi|296493117|gb|ADTK01000384.1| GENE 8 9789 - 10856 1068 355 aa, chain + ## HITS:1 COG:yhiI KEGG:ns NR:ns ## COG: yhiI COG0845 # Protein_GI_number: 16131359 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli K12 # 1 355 1 355 355 567 99.0 1e-161 MDKSKRHLAWWVVGLLAVAAIVAWWLLRPAGVPEGFAVSNGRIEATEVDIASKIAGRIDT ILVKEGQFVREGEVLAKMDTRVLQEQRLEAIAQIKEAQSAVAAAQALLEQRQSETRAAQS LVNQRQAELDSVAKRHTRSRSLAHRGAISAQQLDDDRAAAESARAALESAKAQVSASKAA IEAARTNIIQAQTRVEAAQATERRIAADIDDSELKAPRDGRVQYRVAEPGEVLAAGGRVL NMVDLSDVYMTFFLPTEQAGTLKLGGEARLILDAAPDLRIPATISFVASVAQFTPKTVET SDERLKLMFRVKARIPPELLQQHLEYVKTGLPGVAWVRVNEELPWPDDLVVRLPQ >gi|296493117|gb|ADTK01000384.1| GENE 9 10853 - 13588 2695 911 aa, chain + ## HITS:1 COG:ZyhiH_1 KEGG:ns NR:ns ## COG: ZyhiH_1 COG1131 # Protein_GI_number: 15804022 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Escherichia coli O157:H7 EDL933 # 18 549 1 532 532 1085 99.0 0 MTHLELVPVPPVAQLAGVSQHYGKTVALNNITLDIPARCMVGLIGPDGVGKSSLLSLISG ARVIEQGNVMVLGGDMRDPKHRRDVCPRIAWMPQGLGKNLYHTLSVYENVDFFARLFGHD KAEREVRINELLTSTGLAPFRDRPAGKLSGGMKQKLGLCCALIHDPELLILDEPTTGVDP 
LSRAQFWDLIDSIRQRQSNMSVLVATAYMEEAERFDWLVAMNAGEVLATGSAEELRQQTQ SATLEEAFINLLPQAQRQAHQAVVIPPYQPENAEIAIEARDLTMRFGSFVAVDHVNFRIP RGEIFGFLGSNGCGKSTTMKMLTGLLPASEGEAWLFGQPVDPKDIDTRRRVGYMSQAFSL YNELTVRQNLELHARLFHIPEAEIPARVAEMSERFKLNDVEDVLPESLPLGIRQRLSLAV AVIHRPEMLILDEPTSGVDPVARDMFWQLMVDLSRQDKVTIFISTHFMNEAERCDRISLM HAGKVLASGTPQELVEKRGAASLEEAFIAYLQEAAGQSNEAEAPPVIHDTTHAPRQGFSL RRLFSYSRREALELRRDPVRSTLALMGTVILMLIMGYGISMDVENLRFAVLDRDQTVSSQ AWTLNLSGSRYFIEQPPLTSYDELDRRMRAGDITVAIEIPPNFGRDIARGTPVELGVWID GAMPSRAETVKGYVQAMHQSWLQDVASRQSTPASQSGLMNIETRYRYNPDVKSLPAIVPA VIPLLLMMIPSMLSALSVVREKELGSIINLYVTPTTRSEFLLGKQLPYIVLGMLNFFLLC GLSVFVFGVPHKGSFLTLTLAVLLYIIIATGMGLLISTFMKSQIAAIFGTAIITLIPATQ FSGMIDPVASLEGPGRWIGEVYPTSHFLTIARGTFSKALDLTDLWQLFIPLLIAIPLVMG LSILLLKKQEG >gi|296493117|gb|ADTK01000384.1| GENE 10 13588 - 14712 1154 374 aa, chain + ## HITS:1 COG:yhhJ KEGG:ns NR:ns ## COG: yhhJ COG0842 # Protein_GI_number: 16131357 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Escherichia coli K12 # 1 374 2 375 375 667 99.0 0 MRHLRNIFNLGIKELRSLLGDKAMLTLIVFSFTVSVYSSATVTPGSLNLAPIAIADMDQS QLSNRIVNSFYRPWFLPPEMITADEMDAGLDAGRYTFAINIPPNFQRDVLAGRQPDIQVN VDATRMSQAFTGNGYIQNIINGEVNSFVARYRDNSEPLVSLETRMRFNPNLDPAWFGGVM AIINNITMLAIVLTGSALIREREHGTVEHLLVMPITPFEIMMAKVWSMGLVVLVVSGLSL VLMVKGVLGVPIEGSIPLFMLGVALSLFATTSIGIFMGTIARSMPQLGLLVILVLLPLQM LSGGSTPRESMPQMVQDIMLTMPTTHFVSLAQAILYRGAGFEIVWPQFLTLMAIGGAFFT IALLRFRKTIGTMA >gi|296493117|gb|ADTK01000384.1| GENE 11 15043 - 15402 454 119 aa, chain + ## HITS:1 COG:ECs4356 KEGG:ns NR:ns ## COG: ECs4356 COG4226 # Protein_GI_number: 15833610 # Func_class: S Function unknown # Function: Uncharacterized protein encoded in hypervariable junctions of pilus gene clusters # Organism: Escherichia coli O157:H7 # 1 119 1 119 119 232 100.0 1e-61 MIKLKTPNSMEIAGQPAVITYVPELNAFRGKFLGLSGYCDFVSDSIQGLQKEGELSLREY LEDCKAAGIEPYARTEKIKTFTLRYPESLSERLNNAAAQQQVSVNTYIIETLNERLNHL >gi|296493117|gb|ADTK01000384.1| GENE 12 15522 - 15923 506 133 
aa, chain - ## HITS:1 COG:ECs4348 KEGG:ns NR:ns ## COG: ECs4348 COG0864 # Protein_GI_number: 15833602 # Func_class: K Transcription # Function: Predicted transcriptional regulators containing the CopG/Arc/MetJ DNA-binding domain and a metal-binding domain # Organism: Escherichia coli O157:H7 # 1 133 1 133 133 205 100.0 2e-53 MQRVTITLDDDLLETLDSLSQRRGYNNRSEAIRDILRSALAQEATQQHGTQGFAVLSYVY EHEKRDLASRIVSTQHHHHDLSVATLHVHINHDDCLEIAVLKGDMGDVQHFADDVIAQRG VRHGHLQCLPKED >gi|296493117|gb|ADTK01000384.1| GENE 13 15929 - 16735 405 268 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 27 241 294 509 563 160 40 8e-75 MTLLNVCGLSHHYAHGGFSGKHQHQAVLNNVSLTLKSGETVALLGRSGCGKSTLARLLVG LESPSQGNISWRGEPLAKLNRAQRKAFRRDIQMVFQDSISAVNPRKTVREILREPMRHLL SLKKSEQLARASEMLKAVDLDDSVLDKRPPQLSGGQLQRVCLARALAVEPKLLILDEAVS NLDLVLQAGVIRLLKKLQQQFGTACLFITHDLRLVERFCQRVMVMDNGQIVETQVVGDKL TFSSDAGRVLQNAVLPAFPVRRRTTEKV >gi|296493117|gb|ADTK01000384.1| GENE 14 16732 - 17496 357 254 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 5 240 9 258 563 142 34 8e-75 MPQQIELRNIALQAAQPLVHGVSLTLQRGRVLALVGGSGSGKSLTCAATLGILPAGVRQT AGEILADGKPVSPCALRGIKIATIMQNPRSAFNPLHTMHTHARETCLALGKPADDATLTA AIEAVGLENAARVLKLYPFEMSGGMLQRMMIAMAVLCESPFIIADEPTTDLDVVAQARIL DLLESIMQKQAPGMLLVTHDMGVVARLADDVAVMSQGKIVEQGDVETLFNAPKHTVTRSL VSAHLALYGMELAS >gi|296493117|gb|ADTK01000384.1| GENE 15 17496 - 18329 1034 277 aa, chain - ## HITS:1 COG:ECs4345 KEGG:ns NR:ns ## COG: ECs4345 COG1173 # Protein_GI_number: 15833599 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Escherichia coli O157:H7 # 1 277 1 277 277 461 100.0 1e-130 MNFFLSSRWSVRLALIIIALLALIALTSQWWLPYDPQAIDLPSRLLSPDAQHWLGTDHLG RDIFSRLMAATRVSLGSVMACLLLVLTLGLVIGGSAGLIGGRVDQATMRVADMFMTFPTS ILSFFMVGVLGTGLTNVIIAIALSHWAWYARMVRSLVISLRQREFVLASRLSGAGHVRVF VDHLAGAVIPSLLVLATLDIGHMMLHVAGMSFLGLGVTAPTAEWGVMINDARQYIWTQPL QMFWPGLALFISVMAFNLVGDALRDHLDPHLVTEHAH >gi|296493117|gb|ADTK01000384.1| GENE 16 18326 - 19270 254 314 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 68 304 43 310 320 102 26 4e-21 MLRYVLRRFLLLIPMVLAASVIIFLMLRLGTGDPALDYLRLSNLPPTPEMLASTRTMLGL DQPLYVQYGTWLWKALHLDFGISFASQRPVLDDMLNFLPATLELAGAALVLILLTSVPLG IWAARHRDRLPDFAVRFIAFLGVSMPNFWLAFLLVMAFSVYLQWLPAMGYGGWQHIILPA VSIAFMSLAINARLLRASMLDVAGQRHVTWARLRGLNDKQTERRHILRNASLPMITAVGM HIGELIGGTMIIENIFAWPGVGRYAVSAIFNRDYPVIQCFTLMMVVVFVVCNLIVDLLNA ALDPRIRRHEGAHA >gi|296493117|gb|ADTK01000384.1| GENE 17 19270 - 20844 1775 524 aa, chain - ## HITS:1 COG:nikA KEGG:ns NR:ns ## COG: nikA COG0747 # Protein_GI_number: 16131348 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Escherichia coli K12 # 1 524 1 524 524 1041 99.0 0 MLSTLRRTLFALLACASFIVHAAAPDEITTAWPVNVGPLNPHLYTPNQMFAQSMVYEPLV 
KYQADGSVIPWLAKSWTHSEDGKTWTFTLRDDVKFSNGEPFDAEAAAENFRAVLDNRQRH AWLELANQIVDVKALSKTELQITLKSAYYPFLQELALPRPFRFIAPSQFKNHETMNGIKA PIGTGPWILQESKLNQYDVFVRNENYWGEKPAIKKITFNVIPDPTTRAVAFETGDIDLLY GNEGLLPLDTFARFSQNPAYHTQLSQPIETVMLALNTAKAPTNELAVREALNYAVNKKSL IDNALYGTQQVADTLFAPSVPYANLGLKPRQYDPQKAKELLEKAGWTLPAGKDIREKNGQ PLRIELSFIGTDALSKSMAEIIQADMRQIGADVSLIGEEESSIYARQRDGRFGMIFHRTW GAPYDPHAFLSSMRVPSHADFQAQQGLADKPLIDKEIGEVLATHDETQRQALYRDILTRL HDEAVYLPISYISMMVVSKPELGNIPYAPIATEIPFEQIKPVKP >gi|296493117|gb|ADTK01000384.1| GENE 18 20869 - 21045 110 58 aa, chain + ## HITS:1 COG:no KEGG:UTI89_C3992 NR:ns ## KEGG: UTI89_C3992 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 58 1 58 58 102 100.0 3e-21 MAEISMTILSIRHTDYLFWINRWAVGWADQLTESIHCTLSAVSVKGVGVQIARLKFSS >gi|296493117|gb|ADTK01000384.1| GENE 19 20955 - 21542 502 195 aa, chain - ## HITS:1 COG:ECs4342 KEGG:ns NR:ns ## COG: ECs4342 COG2091 # Protein_GI_number: 15833596 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantetheinyl transferase # Organism: Escherichia coli O157:H7 # 1 195 1 195 195 386 100.0 1e-107 MYRIVLGKVSTLSAAPLPPGLREQAPQGPRRERWLAGRALLSHTLSPLPEIIYGEQGKPA FAPETPLWFNLSHSGDDIALLLSDEGEVGCDIEVIRPRANWRWLANAVFSLGEHAEMDAV HPDQQLEMFWRIWTRKEAIVKQRGGSAWQIVSVDSTYHSSLSVSHCQLENLSLAICTPTP FTLTADSVQWIDSVN >gi|296493117|gb|ADTK01000384.1| GENE 20 21597 - 22646 907 349 aa, chain - ## HITS:1 COG:yhhT KEGG:ns NR:ns ## COG: yhhT COG0628 # Protein_GI_number: 16131346 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Escherichia coli K12 # 1 349 28 376 376 581 99.0 1e-166 METPQPDKTGMHILLKLASLVVILAGIHAAADIIVQLLLALFFAIVLNPLVTWFIRRGVQ RPIAITIVVVVMLIALTALVGVLAASFNEFISMLPKFNKELTRKLFKLQEMLPFLNLHMS PERMLQRMDSEKVVTFTTALMTGLSGAMASVLLLVMTVVFMLFEVRHVPYKMRFALNNPQ IHIAGLHRALKGVSHYLALKTLLSLWTGVIVWLGLELMGVQFALMWAVLAFLLNYVPNIG AVISAVPPMIQVLLFNGIYECILVGALFLVVHMVIGNILEPRMMGHRLGMSTMVVFLSLL IWGWLLGPVGMLLSVPLTSVCKIWMETTKGGSKLAILLGPGRPKSRLPG 
>gi|296493117|gb|ADTK01000384.1| GENE 21 22778 - 23995 1180 405 aa, chain + ## HITS:1 COG:yhhS KEGG:ns NR:ns ## COG: yhhS COG0477 # Protein_GI_number: 16131345 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 405 15 419 419 667 99.0 0 MPEPVAEPALNGLRLNLRIVSIVMFNFASYLTIGLPLAVLPGYVHDVMGFSAFWAGLVIS LQYFATLLSRPHAGRYADLLGPKKIVVFGLCGCFLSGLGYLTAGLTASLPVISLLLLCLG RVILGIGQSFAGTGSTLWGVGVVGSLHIGRVISWNGIVTYGAMAMGAPLGVVFYHWGGLQ ALALIIMGVALVAILLAIPRPTVKASKGKPLPFRAVLGSVWLYGMALALASAGFGVIATF ITLFYDAKGWDGAAFALTLFSCAFVGTRLLFPNGINRIGGLNVAMICFSVEIIGLLLVGV ATMPWMAKIGVLLAGAGFSLVFPALGVVAVKAVPQQNQGAALATYTVFMDLSLGVTGPLA GLVMSWAGVPVIYLAAAGLVAIALLLTWRLKKRPPEHVPEAASSS >gi|296493117|gb|ADTK01000384.1| GENE 22 23999 - 24556 713 185 aa, chain - ## HITS:1 COG:no KEGG:ECO103_4192 NR:ns ## KEGG: ECO103_4192 # Name: dcrB # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 185 15 199 199 324 100.0 1e-87 MRNLVKYVGIGLLVMGLAACDDKDTNATAQGSVAESNATGNPVNLLDGKLSFSLPADMTD QSGKLGTQANNMHVWSDATGQKAVIVIMGDDPKEDLAVLAKRLEDQQRSRDPQLQVVTNK AIELKGHKMQQLDSIISAKGQTAYSSVILGNVGNQLLTMQITLPADDQQKAQTTAENIIN TLVIQ >gi|296493117|gb|ADTK01000384.1| GENE 23 24629 - 25294 656 221 aa, chain - ## HITS:1 COG:ECs4320 KEGG:ns NR:ns ## COG: ECs4320 COG1738 # Protein_GI_number: 15833574 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 221 1 221 221 373 100.0 1e-103 MNVFSQTQRYKALFWLSLFHLLVITSSNYLVQLPVSIFGFHTTWGAFSFPFIFLATDLTV RIFGAPLARRIIFAVMIPALLISYVISSLFYMGSWQGFGALAHFNLFVARIATASFMAYA LGQILDVHVFNRLRQSRRWWLAPTASTLFGNVSDTLAFFFIAFWRSPDAFMAEHWMEIAL VDYCFKVLISIVFFLPMYGVLLNMLLKRLADKSEINALQAS >gi|296493117|gb|ADTK01000384.1| GENE 24 25515 - 25760 312 81 aa, chain + ## HITS:1 COG:ECs4319 KEGG:ns NR:ns ## COG: ECs4319 COG0425 # 
Protein_GI_number: 15833573 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted redox protein, regulator of disulfide bond formation # Organism: Escherichia coli O157:H7 # 1 81 1 81 81 159 100.0 1e-39 MTDLFSSPDHTLDALGLRCPEPVMMVRKTVRNMQPGETLLIIADDPATTRDIPGFCTFME HELVAKETDGLPYRYLIRKGG >gi|296493117|gb|ADTK01000384.1| GENE 25 25862 - 28060 2304 732 aa, chain - ## HITS:1 COG:zntA KEGG:ns NR:ns ## COG: zntA COG2217 # Protein_GI_number: 16131341 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Escherichia coli K12 # 1 732 1 732 732 1256 99.0 0 MSTPDNHGKKAPQFAAFKPLTTVQNANDCCCDGACSSSPTLSENVSGTRYSWKVSGMDCA ACARKVENAVRQLAGVNQVQVLFATEKLVVDADNDIRAQVESAVQKAGYSLRDEQAADEP QASRLKENLPLITLIVMMAISWGLEQFNHPFGQLAFIATTLVGLYPIARQALRLIKSGSY FAIETLMSVAAIGALFIGATAEAAMVLLLFLIGERLEGWAASRARQGVSALMALKPETAT RLRNGEREEVAINSLRPGDVIEVAAGGRLPADGKLLSPFASFDESALTGESIPVERATGD KVPAGATSVDRLVTLEVLSEPGASAIDRILKLIEEAEERRAPIERFIDRFSRIYTPAIMA VALLVTLVPPLLFAASWQEWIYKGLTLLLIGCPCALVISTPAAITSGLAAAARRGALIKG GAALEQLGRVTQVAFDKTGTLTVGKPRVTAIHPATGISESELLTLAAAVEQGATHPLAQA IVREAQVAELAIPTAESQRALVGSGIEAQVNGERVLICAAGKHPADAFAGLINELESAGQ TVVLVVRNDDVLGIIALQDTLRADAATAISELNALGVKGVILTGDNPRAAAAIAGELGLE FKAGLLPEDKVKAVTKLNQHAPLAMVGDGINDAPAMKAAAIGIAMGSGTDVALETADAAL THNHLRGLVQMIELARATHANIRQNITIALGLKGIFLVTTLLGMTGLWLAVLADTGATVL VTANALRLLRRR >gi|296493117|gb|ADTK01000384.1| GENE 26 28134 - 28760 668 208 aa, chain - ## HITS:1 COG:STM3575 KEGG:ns NR:ns ## COG: STM3575 COG3714 # Protein_GI_number: 16766861 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Salmonella typhimurium LT2 # 1 208 1 208 208 302 91.0 3e-82 MLWSFIAVCLSAWLSVDASYRGPTWQRWVFKPLTLLLLLLLAWQAPMFDAISYLVLAGLC ASLLGDALTLLPRQRLMYAIGAFFLSHLLYTIYFASQMTLSFFWPLPLVLLVLGALLLAI IWTRLEEYRWPICTFIGMTLVMVWLAGELWFFRPTAPALSAFVGASLLFISNFVWLGSHY RRRFRADNAIAAACYFAGHFLIVRSLYL >gi|296493117|gb|ADTK01000384.1| GENE 27 28901 - 29260 371 119 aa, chain + ## HITS:1 
COG:no KEGG:B21_03269 NR:ns ## KEGG: B21_03269 # Name: yhhM # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 119 1 119 119 218 99.0 7e-56 MSKPPLFFIVIIGLIVVAASFRFMQQRREKADNDMAPLQQKLVVVSNKREKPINDRRSRQ QEVTPAGTSMRYEASFKPQSGGMEQTFRLDAQQYHALTVGDKGTLSYKGTRFVSFVGEQ >gi|296493117|gb|ADTK01000384.1| GENE 28 29263 - 29532 347 89 aa, chain - ## HITS:1 COG:yhhL KEGG:ns NR:ns ## COG: yhhL COG3776 # Protein_GI_number: 16131338 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 89 1 89 89 146 98.0 8e-36 MLINIGRLLMLCVWGFLILNLVHPFPRPLNIFVNVALIFTVLMHGMQLALLKSTLPKDGP QMTTAEKVRIFLFGVFELLVWQKKFKVKK >gi|296493117|gb|ADTK01000384.1| GENE 29 29522 - 30118 190 198 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 [Bacillus selenitireducens MLS10] # 12 194 13 199 199 77 27 9e-14 MKKPNHSGSGQIRIIGGQWRGRKLPVPDSPGLRPTTDRVRETLFNWLAPVIVDAQCLDCF AGSGALGLEALSRYAAGATLIEMDRAVSQQLIKNLATLKAGNARVVNSNAMSFLAQKGTP HNIVFVDPPFRRGLLEETINLLEDNGWLADEALIYVESEVENGLPTVPANWSLHREKVAG QVAYRLYQREAQGESDAD >gi|296493117|gb|ADTK01000384.1| GENE 30 30268 - 31761 748 497 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 [Bacillus selenitireducens MLS10] # 198 494 21 321 336 292 48 2e-78 MAKEKKRGFFSWLGFGQKEQTPEKETEVQNEQPVVEEIVQAQEPVKASEQAVEEQPQAHT EAEAETFAADVVEVTEQVAESEKAQPEAEVVAQPEPVVEETPEPVAIEREELPLPEDVNA EEVSPEEWQAEAETVEIVEAAEEEAAKEEITDEELEAQALAAEAAEEAVMVVPPVEEQPV EEIAQEQEKPTKEGFFARLKRSLLKTKENLGSGFISLFRGKKIDDDLFEELEEQLLIADV GVETTRKIITNLTEGASRKQLRDAEALYGLLKEEMGEILAKVDEPLNVEGKTPFVILMVG VNGVGKTTTIGKLARQFEQQGKSVMLAAGDTFRAAAVEQLQVWGQRNNIPVIAQHTGADS ASVIFDAIQAAKARNIDVLIADTAGRLQNKSHLMEELKKIVRVMKKLDVEAPHEVMLTID ASTGQNAVSQAKLFHEAVGLTGITLTKLDGTAKGGVIFSVADQFGIPIRYIGVGERIEDL RPFKADDFIEALFARED >gi|296493117|gb|ADTK01000384.1| GENE 31 31764 - 32432 348 222 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general 
stress protein Ctc) [Campylobacter concisus 13826] # 1 216 4 220 223 138 36 4e-32 MIRFEHVSKAYLGGRQALQGVTFHMQPGEMAFLTGHSGAGKSTLLKLICGIERPSAGKIW FSGHDITRLKNREVPFLRRQIGMIFQDHHLLMDRTVYDNVAIPLIIAGASGDDIRRRVSA ALDKVGLLDKAKNFPIQLSGGEQQRVGIARAVVNKPAVLLADEPTGNLDDALSEGILRLF EEFNRVGVTVLMATHDINLISRRSYRMLTLSDGHLHGGVGHE >gi|296493117|gb|ADTK01000384.1| GENE 32 32425 - 33483 974 352 aa, chain + ## HITS:1 COG:ftsX KEGG:ns NR:ns ## COG: ftsX COG2177 # Protein_GI_number: 16131334 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Escherichia coli K12 # 1 352 1 352 352 687 100.0 0 MNKRDAINHIRQFGGRLDRFRKSVGGSGDGGRNAPKRAKSSPKPVNRKTNVFNEQVRYAF HGALQDLKSKPFATFLTVMVIAISLTLPSVCYMVYKNVNQAATQYYPSPQITVYLQKTLD DDAAAGVVAQLQAEQGVEKVNYLSREDALGEFRNWSGFGGALDMLEENPLPAVAVVIPKL DFQGTESLNTLRDRITQINGIDEVRMDDSWFARLAALTGLVGRVSAMIGVLMVAAVFLVI GNSVRLSIFARRDSINVQKLIGATDGFILRPFLYGGALLGFSGALLSLILSEILVLRLSS AVAEVAQVFGTKFDINGLSFDECLLLLLVCSMIGWVAAWLATVQHLRHFTPE >gi|296493117|gb|ADTK01000384.1| GENE 33 33728 - 34582 1037 284 aa, chain + ## HITS:1 COG:ECs4310 KEGG:ns NR:ns ## COG: ECs4310 COG0568 # Protein_GI_number: 15833564 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Escherichia coli O157:H7 # 1 284 1 284 284 514 100.0 1e-146 MTDKMQSLALAPVGNLDSYIRAANAWPMLSADEERALAEKLHYHGDLEAAKTLILSHLRF VVHIARNYAGYGLPQADLIQEGNIGLMKAVRRFNPEVGVRLVSFAVHWIKAEIHEYVLRN WRIVKVATTKAQRKLFFNLRKTKQRLGWFNQDEVEMVARELGVTSKDVREMESRMAAQDM TFDLSSDDDSDSQPMAPVLYLQDKSSNFADGIEDDNWEEQAANRLTDAMQGLDERSQDII RARWLDEDNKSTLQELADRYGVSAERVRQLEKNAMKKLRAAIEA >gi|296493117|gb|ADTK01000384.1| GENE 34 34854 - 35957 1349 367 aa, chain + ## HITS:1 COG:ECs4309 KEGG:ns NR:ns ## COG: ECs4309 COG0683 # Protein_GI_number: 15833563 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Escherichia coli O157:H7 # 1 367 20 386 386 705 99.0 0 
MNMKGKALLAGCIALAFSNMALAEDIKVAVVGAMSGPVAQYGDQEFTGAEQAVADINAKG GIKGNKLQIVKYDDACDPKQAVAVANKVVNDGIKYVIGHLCSSSTQPASDIYEDEGILMI TPAATAPELTARGYQLILRTTGLDSDQGPTAAKYILEKVKPQRIAIVHDKQQYGEGLARA VQDGLKKGNANVVFFDGITAGEKDFSTLVARLKKENIDFVYYGGYHPEMGQILRQARAAG LKTQFMGPEGVANVSLSNIAGESAEGLLVTKPKNYDQVPANKPIVDAIKAKKQDPSGAFV WTTYAALQSLQAGLNQSDDPAEIAKYLKANSVDTVMGPLTWDEKGDLKGFEFGVFDWHAN GTATDAK >gi|296493117|gb|ADTK01000384.1| GENE 35 36107 - 36373 303 88 aa, chain + ## HITS:1 COG:ECs4308 KEGG:ns NR:ns ## COG: ECs4308 COG4453 # Protein_GI_number: 15833562 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 88 1 88 88 134 100.0 6e-32 MSAVKKQRIDLRLTDDDKSMIEEAAAISNQSVSQFMLNSASQRAAEVIEQHRRVILNEES WTRVMDALSNPPSPGEKLKRAAKRLQGM >gi|296493117|gb|ADTK01000384.1| GENE 36 36380 - 36907 415 175 aa, chain + ## HITS:1 COG:ECs4307 KEGG:ns NR:ns ## COG: ECs4307 COG0454 # Protein_GI_number: 15833561 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Escherichia coli O157:H7 # 1 175 1 175 175 367 98.0 1e-102 MDDLTIEILTDDADYDLQRFDCGEEALNLFLTTHLVRQHRNKILRAYILCRNTPERQVLG YYTLCGSCFERAALPSKSKQKKIPYKNIPSVTLGRLAIDRSLQGQGWGATLVAHAMKVVW SASLAVGIHGLFVEALNKKAHTFYQSLGFIPLVGENENALFFPTKSIELLFTQSD >gi|296493117|gb|ADTK01000384.1| GENE 37 36904 - 37287 377 127 aa, chain - ## HITS:1 COG:no KEGG:SSON_3697 NR:ns ## KEGG: SSON_3697 # Name: yhhK # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 127 1 127 127 248 99.0 7e-65 MKLTIIRLENFSDQDRIDLQKIWPEYSPSSLQVDDNHRIYAARFNERLLAAVRVTLSGTE GALDSLCVREVTRRRGVGQYLLEEVLRNNPGVSCWWMADAGVEDRGVMTAFMQALGFTAQ QGGWEKR >gi|296493117|gb|ADTK01000384.1| GENE 38 37711 - 38820 1240 369 aa, chain + ## HITS:1 COG:livK KEGG:ns NR:ns ## COG: livK COG0683 # Protein_GI_number: 16131330 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, 
periplasmic component # Organism: Escherichia coli K12 # 1 369 1 369 369 707 100.0 0 MKRNAKTIIAGMIALAISHTAMADDIKVAVVGAMSGPIAQWGDMEFNGARQAIKDINAKG GIKGDKLVGVEYDDACDPKQAVAVANKIVNDGIKYVIGHLCSSSTQPASDIYEDEGILMI SPGATNPELTQRGYQHIMRTAGLDSSQGPTAAKYILETVKPQRIAIIHDKQQYGEGLARS VQDGLKAANANVVFFDGITAGEKDFSALIARLKKENIDFVYYGGYYPEMGQMLRQARSVG LKTQFMGPEGVGNASLSNIAGDAAEGMLVTMPKRYDQDPANQGIVDALKADKKDPSGPYV WITYAAVQSLATALERTGSDEPLALVKDLKANGANTVIGPLNWDEKGDLKGFDFGVFQWH ADGSSTAAK Prediction of potential genes in microbial genomes Time: Mon May 16 16:19:55 2011 Seq name: gi|296493116|gb|ADTK01000385.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1253.6, whole genome shotgun sequence Length of sequence - 56267 bp Number of predicted genes - 47, with homology - 47 Number of transcription units - 25, operones - 10 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 24/0.000 + CDS 2 - 895 1288 ## COG0559 Branched-chain amino acid ABC-type transport system, permease components 2 1 Op 2 19/0.000 + CDS 892 - 2169 1362 ## COG4177 ABC-type branched-chain amino acid transport system, permease component 3 1 Op 3 18/0.000 + CDS 2166 - 2933 258 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 4 1 Op 4 . + CDS 2935 - 3648 268 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Term 3722 - 3764 -0.8 + Prom 3909 - 3968 7.3 5 2 Op 1 35/0.000 + CDS 4044 - 5360 1666 ## COG1653 ABC-type sugar transport system, periplasmic component 6 2 Op 2 38/0.000 + CDS 5458 - 6345 1044 ## COG1175 ABC-type sugar transport systems, permease components 7 2 Op 3 21/0.000 + CDS 6342 - 7187 851 ## COG0395 ABC-type sugar transport system, permease component 8 2 Op 4 4/0.455 + CDS 7189 - 8259 1391 ## COG3839 ABC-type sugar transport systems, ATPase components 9 2 Op 5 . + CDS 8256 - 8999 770 ## COG0584 Glycerophosphoryl diester phosphodiesterase + Term 9021 - 9064 8.1 - Term 8895 - 8938 7.0 10 3 Tu 1 . 
- CDS 8986 - 9426 333 ## S4297 hypothetical protein - Prom 9449 - 9508 3.7 + Prom 9400 - 9459 2.2 11 4 Tu 1 . + CDS 9546 - 11288 1849 ## COG0405 Gamma-glutamyltransferase - Term 11280 - 11323 7.4 12 5 Tu 1 . - CDS 11349 - 11837 351 ## PROTEIN SUPPORTED gi|229877854|ref|ZP_04497362.1| acetyltransferase, ribosomal protein N-acetylase - Prom 11867 - 11926 3.2 13 6 Tu 1 . - CDS 12071 - 12268 175 ## ECS88_3838 hypothetical protein + Prom 11900 - 11959 4.4 14 7 Tu 1 5/0.182 + CDS 12170 - 13207 1110 ## COG0673 Predicted dehydrogenases and related proteins + Term 13214 - 13266 3.6 + Prom 13220 - 13279 4.5 15 8 Tu 1 4/0.455 + CDS 13330 - 14025 906 ## COG1741 Pirin-related protein + Term 14046 - 14075 0.4 + Prom 14118 - 14177 4.4 16 9 Tu 1 . + CDS 14249 - 15244 1050 ## COG1609 Transcriptional regulators + Prom 15273 - 15332 2.0 17 10 Op 1 4/0.455 + CDS 15422 - 15910 486 ## COG3265 Gluconate kinase + Term 15939 - 15976 3.9 18 10 Op 2 . + CDS 15977 - 17254 1444 ## COG2610 H+/gluconate symporter and related permeases + Term 17264 - 17299 5.2 19 11 Tu 1 . - CDS 17311 - 17904 599 ## COG2095 Multiple antibiotic transporter - Prom 18067 - 18126 3.4 + Prom 17872 - 17931 3.7 20 12 Tu 1 . + CDS 18097 - 19200 1076 ## COG0136 Aspartate-semialdehyde dehydrogenase + Term 19219 - 19248 1.1 + Prom 19259 - 19318 3.1 21 13 Op 1 9/0.000 + CDS 19473 - 21659 2589 ## COG0296 1,4-alpha-glucan branching enzyme 22 13 Op 2 7/0.000 + CDS 21656 - 23629 1484 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases 23 13 Op 3 17/0.000 + CDS 23647 - 24942 1206 ## COG0448 ADP-glucose pyrophosphorylase 24 13 Op 4 10/0.000 + CDS 24942 - 26375 1276 ## COG0297 Glycogen synthase 25 13 Op 5 2/0.727 + CDS 26394 - 28841 2752 ## COG0058 Glucan phosphorylase + Term 28857 - 28892 4.9 + Prom 28887 - 28946 2.3 26 14 Op 1 . + CDS 28969 - 30474 939 ## COG0226 ABC-type phosphate transport system, periplasmic component 27 14 Op 2 . 
+ CDS 30489 - 31199 317 ## ECIAI1_3572 hypothetical protein 28 14 Op 3 . + CDS 31202 - 31969 248 ## EcE24377A_3904 hypothetical protein 29 14 Op 4 . + CDS 31972 - 32577 86 ## ECIAI1_3570 hypothetical protein - Term 32587 - 32623 7.1 30 15 Tu 1 . - CDS 32631 - 34136 1726 ## COG0578 Glycerol-3-phosphate dehydrogenase - Prom 34179 - 34238 5.7 + Prom 34165 - 34224 4.6 31 16 Op 1 4/0.455 + CDS 34326 - 34652 499 ## COG0607 Rhodanese-related sulfurtransferase 32 16 Op 2 6/0.000 + CDS 34697 - 35527 845 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) 33 16 Op 3 . + CDS 35544 - 36302 888 ## COG1349 Transcriptional regulators of sugar metabolism + Term 36467 - 36501 -1.0 34 17 Tu 1 . - CDS 36284 - 37882 1330 ## COG4650 Sigma54-dependent transcription regulator containing an AAA-type ATPase domain and a DNA-binding domain - Prom 37979 - 38038 4.4 + Prom 37949 - 38008 5.2 35 18 Op 1 3/0.636 + CDS 38071 - 39297 1318 ## COG1690 Uncharacterized conserved protein 36 18 Op 2 . + CDS 39301 - 40317 806 ## COG0430 RNA 3'-terminal phosphate cyclase - Term 40315 - 40356 -1.0 37 19 Tu 1 . - CDS 40360 - 43065 2378 ## COG2909 ATP-dependent transcriptional regulator - Prom 43090 - 43149 6.1 + Prom 43498 - 43557 2.7 38 20 Op 1 7/0.000 + CDS 43677 - 46070 2655 ## COG0058 Glucan phosphorylase 39 20 Op 2 . + CDS 46080 - 48164 2001 ## COG1640 4-alpha-glucanotransferase + Term 48173 - 48215 9.1 40 21 Tu 1 . - CDS 48209 - 49525 1691 ## COG2610 H+/gluconate symporter and related permeases - Prom 49577 - 49636 4.0 - Term 49786 - 49821 1.6 41 22 Op 1 5/0.182 - CDS 49886 - 50461 700 ## COG0694 Thioredoxin-like proteins and domains 42 22 Op 2 . - CDS 50520 - 51203 251 ## COG1040 Predicted amidophosphoribosyltransferases + Prom 51151 - 51210 2.9 43 23 Tu 1 . + CDS 51241 - 52011 613 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 44 24 Tu 1 . 
- CDS 52040 - 52918 741 ## COG5464 Uncharacterized conserved protein - Prom 52998 - 53057 3.0 - Term 53057 - 53094 6.1 45 25 Op 1 . - CDS 53121 - 53357 224 ## SDY_3666 hypothetical protein 46 25 Op 2 22/0.000 - CDS 53357 - 55678 2819 ## COG0370 Fe2+ transport system protein B 47 25 Op 3 . - CDS 55695 - 55922 224 ## COG1918 Fe2+ transport system protein A - Prom 56136 - 56195 4.8 Predicted protein(s) >gi|296493116|gb|ADTK01000385.1| GENE 1 2 - 895 1288 297 aa, chain + ## HITS:1 COG:ECs4304 KEGG:ns NR:ns ## COG: ECs4304 COG0559 # Protein_GI_number: 15833558 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid ABC-type transport system, permease components # Organism: Escherichia coli O157:H7 # 1 297 12 308 308 468 99.0 1e-132 MFNGVTLGSTYALIAIGYTMVYGIIGMINFAHGEVYMIGSYVSFMIIAALMMMGIDTGWL LVAAGFVGAIVIASAYGWSIERVAYRPVRNSKRLIALISAIGMSIFLQNYVSLTEGSRDV ALPSLFNGQWVVGHSENFSASITTMQAVIWIVTFLAMLALTVFIRYSRMGRACRACAEDL KMASLLGINTDRVIALTFVIGAAMAAVAGVLLGQFYGVINPYIGFMAGMKAFTAAVLGGI GSIPGAMIGGLILGIAEALSSAYLSTEYKDVVSFALLILVLLVMPTGILGRPEVEKV >gi|296493116|gb|ADTK01000385.1| GENE 2 892 - 2169 1362 425 aa, chain + ## HITS:1 COG:livM KEGG:ns NR:ns ## COG: livM COG4177 # Protein_GI_number: 16131328 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport system, permease component # Organism: Escherichia coli K12 # 1 425 1 425 425 718 99.0 0 MKPMHIAMALLSAAMFFVLAGVFMGVQLELDGTKLVVDTASDIRWQWVFIGTAVVFFFQL LRPAFQKGLKSVSGPKFILPAIDGSTVKQKLFLVALLVLAVAWPFMVSRGTVDIATLTMI YIILGLGLNVVVGLSGLLVLGYGGFYAIGAYTFALLNHYYGLGFWTCLPIAGLMAAAAGF LLGFPVLRLRGDYLAIVTLGFGEIVRILLLNNTEITGGPNGISQIPKPTLFGLEFSRTAR EGGWDTFSNFFGLKYDPSDRVIFLYLVALLLVVLSLFVINRLLRMPLGRAWEALREDEIA CRSLGLSPRRIKLTAFTISAAFAGFAGTLFAARQGFVSPESFTFAESAFVLAIVVLGGMG SQFAVILAAILLVVSRELMRDFNEYSMLMLGGLMVLMMIWRPQGLLPMTRPQLKLKNGAA KGEQA >gi|296493116|gb|ADTK01000385.1| GENE 3 2166 - 2933 258 255 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S 
ribosomal protein L34 [Vibrio campbellii AND4] # 5 245 1 230 245 103 25 2e-21 MSQPLLSVNGLMMRFGGLLAVNNVNLELYPQEIVSLIGPNGAGKTTVFNCLTGFYKPTGG TILLRDQHLEGLPGQQIARMGVVRTFQHVRLFREMTVIENLLVAQHQQLKTGLFSGLLKT PSFRRAQSEALDRAATWLERIGLLEHANRQASNLAYGDQRRLEIARCMVTQPEILMLDEP AAGLNPKETKELDELIAELRNHHNTTILLIEHDMKLVMGISDRIYVVNQGTPLANGTPEQ IRNNPDVIRAYLGEA >gi|296493116|gb|ADTK01000385.1| GENE 4 2935 - 3648 268 237 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 5 216 1 218 245 107 29 1e-22 MEKVMLSFDKVSAHYGKIQALHEVSLHINQGEIVTLIGANGAGKTTLLGTLCGDPRATSG RIVFDDKDITDWQTAKIMREAVAIVPEGRRVFSRMTVEENLAMGGFFAERDQFQERIKWV YELFPRLHERRVQRAGTMSGGEQQMLAIGRALMSNPRLLLLDEPSLGLAPIIIQQIFDTI EQLREQGMTIFLVEQNANQALKLADRGYVLENGHVVLSDTGDALLANEAVRSAYLGG >gi|296493116|gb|ADTK01000385.1| GENE 5 4044 - 5360 1666 438 aa, chain + ## HITS:1 COG:ECs4299 KEGG:ns NR:ns ## COG: ECs4299 COG1653 # Protein_GI_number: 15833553 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 1 438 1 438 438 843 99.0 0 MKPLHYTASALALGLALMGNAQAVTTIPFWHSMEGELGKEVDSLAQRFNAENPDYKIVPT YKGNYEQNLSAGIAAFRTGNAPAILQVYEVGTATMMASKAIKPVYDVFKEAGIQFDESQF VPTVSGYYSDSKTGHLLSQPFNSSTPVLYYNKDAFKKAGLDPEQPPKTWQDLADYAAKLK ASGMKCGYASGWQGWIQLENFSAWNGLPFASKNNGFDGTDAVLEFNKPEQVKHIAMLEEM NKKGDFSYVGRKDESTEKFYNGDCAMTTASSGSLANIREYAKFNYGVGMMPYDADAKDAP QNAIIGGASLWVMQGKNKETYTGVAKFLDFLAKPENAAEWHQKTGYLPITKAAYDLTREQ GFYEKNPGADTATRQMLNKPPLPFTKGLRLGNMPQIRVIVDEELESVWTGKKTPQQALDT AVERGNQLLRRFEKSTKS >gi|296493116|gb|ADTK01000385.1| GENE 6 5458 - 6345 1044 295 aa, chain + ## HITS:1 COG:ugpA KEGG:ns NR:ns ## COG: ugpA COG1175 # Protein_GI_number: 16131324 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Escherichia coli K12 # 1 295 1 295 295 484 99.0 1e-136 MSSSRPVFRSRWLPYLLVAPQLIITVIFFIWPAGEALWYSLQSVDPFGFSSQFVGLDNFV 
ALFHDSYYLDSFWTTIKFSTFVTVSGLLVSLFFAALVEYIVRGSRFYQTLMLLPYAVAPA VAAVLWIFLFNPGRGLITHFLAEFGYDWNHAQNSGQAMFLVVFASVWKQISYNFLFFYAA LQSIPRSLIEAAAIDGAGPIRRFFKIALPLIAPVSFFLLVVNLVYAFFDTFPVIDAATSG GPVQATTTLIYKIYREGFTGLDLASSAAQSVVLMFLVIVLTVVQFRYVESKVRYQ >gi|296493116|gb|ADTK01000385.1| GENE 7 6342 - 7187 851 281 aa, chain + ## HITS:1 COG:ECs4297 KEGG:ns NR:ns ## COG: ECs4297 COG0395 # Protein_GI_number: 15833551 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Escherichia coli O157:H7 # 1 281 1 281 281 483 100.0 1e-136 MIENRPWLTIFSHTMLILGIAVILFPLYVAFVAATLDKQAVYAAPMTLIPGTHLLENIHN IWVNGVGTNSAPFWRMLLNSFVMAFSITLGKITVSMLSAFAIVWFRFPLRNLFFWMIFIT LMLPVEVRIFPTVEVIANLKMLDSYAGLTLPLMASATATFLFRQFFMTLPDELVEAARID GASPMRFFCDIVFPLSKTNLAALFVITFIYGWNQYLWPLLIITDVDLGTTVAGIKGMIAT GEGTTEWNSVMAAMLLTLIPPVVIVLVMQRAFVRGLVDSEK >gi|296493116|gb|ADTK01000385.1| GENE 8 7189 - 8259 1391 356 aa, chain + ## HITS:1 COG:ugpC KEGG:ns NR:ns ## COG: ugpC COG3839 # Protein_GI_number: 16131322 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Escherichia coli K12 # 1 356 14 369 369 696 99.0 0 MAGLKLQAVTKSWDGKTQVIKPLTLDVADGEFIVMVGPSGCGKSTLLRMVAGLERVTEGD IWINDQRVTEMEPKDRGIAMVFQNYALYPHMSVEENMAWGLKIRGMGKQQIAERVKEAAR ILELDGLLKRRPRELSGGQRQRVAMGRAIVRDPAVFLFDEPLSNLDAKLRVQMRLELQQL HRRLKTTSLYVTHDQVEAMTLAQRVMVMNGGVAEQIGTPVEVYEKPASLFVASFIGSPAM NLLTGRVNNEGTHFELDGGIELPLNGGYRQYAGRKMTLGIRPEHIALSSQAEGGVPLVMD TLEILGADNLAHGRWGEQKLVVRLAHQERPTAGSTLWLHLAENQLHLFDGETGQRV >gi|296493116|gb|ADTK01000385.1| GENE 9 8256 - 8999 770 247 aa, chain + ## HITS:1 COG:ECs4295 KEGG:ns NR:ns ## COG: ECs4295 COG0584 # Protein_GI_number: 15833549 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Escherichia coli O157:H7 # 1 247 1 247 247 496 97.0 1e-140 MSNWPYPRIVAHRGGGKLAPENTLAAIDVGAKYGHKMIEFDAKLSKDGEIFLLHDDNLER 
TSNGWGVAGELNWQDLLRVDAGSWYSKAFKGEPLPLLSQVAERCREHGMMANIEIKPTTG TGPLTGKMVALAARQLWAGMTPPLLSSFEIDALEAAQLAAPELPRGLLLDEWRDDWRELT ARLGCVSIHLNHKLLDKVRVMQLKDAGLRILVYTVNKPQRAAELLRWGVDCICTDAIDVI GPNFTAQ >gi|296493116|gb|ADTK01000385.1| GENE 10 8986 - 9426 333 146 aa, chain - ## HITS:1 COG:no KEGG:S4297 NR:ns ## KEGG: S4297 # Name: yhhA # Def: hypothetical protein # Organism: S.flexneri_2457T # Pathway: not_defined # 1 146 1 146 146 157 100.0 1e-37 MKRLLILTALLPFVGFAQPINTLNNPNQPGYQIPSQQRMQTQMQTQQIQQKGMLNQQLKT QTQLQQQHLENQINNNSQRVLQSQPGERNPARQQMLPNTNGGMLNSNRNPDSSLNQQHML PERRNGDMLNQPSTPQPDIPLKTIGP >gi|296493116|gb|ADTK01000385.1| GENE 11 9546 - 11288 1849 580 aa, chain + ## HITS:1 COG:ECs4293 KEGG:ns NR:ns ## COG: ECs4293 COG0405 # Protein_GI_number: 15833547 # Func_class: E Amino acid transport and metabolism # Function: Gamma-glutamyltransferase # Organism: Escherichia coli O157:H7 # 1 580 1 581 581 1100 98.0 0 MIKPTFLRRVAIAALLSGSCFSAAAAPPAPPVSYGVEEDVFHPVRAKQGMVASVDATATQ VGVDILKEGGNAVDAAVAVGYALAVTHPQAGNLGGGGFMLIRSKNGNTTAIDFREMAPAK ATRDMFLDDQGNPDSKKSLTSHLASGTPGTVAGFSLALDKYGTMPLNKVVQPAFKLARDG FIVNDALADDLKTYGSEVLPNHENSKAIFWKEGEPLKKGDTLVQANLAKSLEMIAENGPD EFYKGTIAEQIAQEMQKNGGLITKEDLAAYKAVERTPISGDYRGYQVYSMPPPSSGGIHI VQILNILENFDMKKYGFGSADAMQIMAEAEKYAYADRSEYLGDPDFVKVPWQALTNKAYA KSIADQIDINKAKPSSEIRPGKLAPYESNQTTHYSVVDKDGNAVAVTYTLNTTFGTGIVA GESGILLNNQMDDFSAKPGVPNVYGLVGGDANAVGPNKRPLSSMSPTIVVKDGKTWLVTG SPGGSRIITTVLQMVVNSIDYGMNVAEATNAPRFHHQWLPDELRVEKGFSPDTLKLLEAK GQKVALKEAMGSTQSIMVGPDGELYGASDPRSVDDLTAGY >gi|296493116|gb|ADTK01000385.1| GENE 12 11349 - 11837 351 162 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229877854|ref|ZP_04497362.1| acetyltransferase, ribosomal protein N-acetylase [Sphaerobacter thermophilus DSM 20745] # 1 162 1 163 179 139 46 3e-32 MSEIVIRHAETRDYEAIRQIHAQPEVYCNTLQVPHPSDHMWQERLADRPGIKQLVACIDG DVVGHLTIDVQQRPRRSHVADFGICVDSRWKNRGVASALMREMIEMCDNWLRVDRIELTV FVDNAPAIKVYKKFGFEIEGTGKKYALRNGEYVDAYYMARMK >gi|296493116|gb|ADTK01000385.1| GENE 13 12071 - 12268 175 65 aa, chain - ## 
HITS:1 COG:no KEGG:ECS88_3838 NR:ns ## KEGG: ECS88_3838 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 62 20 81 86 123 98.0 2e-27 MRDMPAILAVKYIRQMVTGGAFAEANKGAVDDHDFVLFKVVIYTLAQSGRGSYWSAHNEH KHSRG >gi|296493116|gb|ADTK01000385.1| GENE 14 12170 - 13207 1110 345 aa, chain + ## HITS:1 COG:yhhX KEGG:ns NR:ns ## COG: yhhX COG0673 # Protein_GI_number: 16131312 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Escherichia coli K12 # 1 345 1 345 345 701 99.0 0 MVINCAFIGFGKSTTRYHLPYVLNRKDSWHVAHIFRRHAKPEEQAPIYSHIHLTSDLDEV LNDPDVKLVVVCTHADSHFEYAKRALEAGKNVLVEKPFTPTLAQAKELFALAKSKGLTVT PYQNRRFDSCFLTAKKAIESGKLGEIVEVESHFDYYRPVAETKPGLPQDGAFYGLGVHTM DQIISLFGRPDHVAYDIRSLRNKANPDDTFEAQLFYGDLKAIVKTSHLVKIDYPKFIVHG KKGSFIKYGIDQQETSLKANIMPGEPGFAADDSVGVLEYVNDEGVTVREEMKPEMGDYGR VYDALYQTITHGAPNYVKESEVLTNLEILERGFEQASPSTVTLAK >gi|296493116|gb|ADTK01000385.1| GENE 15 13330 - 14025 906 231 aa, chain + ## HITS:1 COG:yhhW KEGG:ns NR:ns ## COG: yhhW COG1741 # Protein_GI_number: 16131311 # Func_class: R General function prediction only # Function: Pirin-related protein # Organism: Escherichia coli K12 # 1 231 1 231 231 468 99.0 1e-132 MIYFRKANERGHANHGWLDSWHTFSFANYYDPNFMGFSALRVINDDVIEAGQGFGTHPHK DMEILTYVLEGTVEHQDSMGNKEQVPAGEFQIMSAGTGIRHSEYNPNSTERLHLYQIWIM PEENGITPRYEQRRFDAVQGKQLVLSPDARDGSLKVHQDMELYRWALLKDEQSVHQIAAE RRVWIQVVKGNVTINGVKASTSDGLAIWDEQAISIHADSDSEVLLFDLPPV >gi|296493116|gb|ADTK01000385.1| GENE 16 14249 - 15244 1050 331 aa, chain + ## HITS:1 COG:ECs4287 KEGG:ns NR:ns ## COG: ECs4287 COG1609 # Protein_GI_number: 15833541 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 331 1 331 331 652 100.0 0 MKKKRPVLQDVADRVGVTKMTVSRFLRNPEQVSVALRGKIAAALDELGYIPNRAPDILSN ATSRAIGVLLPSLTNQVFAEVLRGIESVTDAHGYQTMLAHYGYKPEMEQERLESMLSWNI DGLILTERTHTPRTLKMIEVAGIPVVELMDSQSPCLDIAVGFDNFEAARQMTTAIIARGH 
RHIAYLGARLDERTIIKQKGYEQAMLDAGLVPYSVMVEQSSSYSSGIELIRQARREYPQL DGVFCTNDDLAVGAAFECQRLGLKVPDDMAIAGFHGHDIGQVMEPRLASVLTPRERMGSI GAERLLARIRGESVTPKMLDLGFTLSPGGSI >gi|296493116|gb|ADTK01000385.1| GENE 17 15422 - 15910 486 162 aa, chain + ## HITS:1 COG:ECs4286 KEGG:ns NR:ns ## COG: ECs4286 COG3265 # Protein_GI_number: 15833540 # Func_class: G Carbohydrate transport and metabolism # Function: Gluconate kinase # Organism: Escherichia coli O157:H7 # 1 162 1 162 162 325 100.0 3e-89 MGVSGSGKSAVASEVAHQLHAAFLDGDFLHPRRNIEKMASGEPLNDDDRKPWLQALNDAA FAMQRTNKVSLIVCSALKKHYRDLLREGNPNLSFIYLKGDFDVIESRLKARKGHFFKTQM LVTQFETLQEPGADETDVLVVDIDQPLEGVVASTIEVIKKGK >gi|296493116|gb|ADTK01000385.1| GENE 18 15977 - 17254 1444 425 aa, chain + ## HITS:1 COG:STM3541 KEGG:ns NR:ns ## COG: STM3541 COG2610 # Protein_GI_number: 16766827 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Salmonella typhimurium LT2 # 1 425 22 446 446 673 97.0 0 MKARMHAFLALMVVSMGAGLFSGMPLDKIAATMEKGMGGTLGFLAVVVALGAMFGKILHE TGAVDQIAVKMLKSFGHSRAHFAIGLAGLVCALPLFFEVAIVLLISVAFSMARHTGTNLV KLVIPLFAGVAAAAAFLVPGPAPMLLASQMNADFGWMILIGLCAAIPGMIIAGPLWGNFI SRYVELHIPDDISEPHLGEGKMPSFGFSLSLILLPLVLVGLKTIAARFVPEGSTAYEWFE FIGHPFTAILVACLVAIYGLAMRQGMPKDKVMEICGHALQPAGIILLVIGAGGVFKQVLV DSGVGPALGEALTGMGLPIAITCFVLAAAVRIIQGSATVACLTAVGLVMPVIEQLNYSGA QMAALSICIAGGSIVVSHVNDAGFWLFGKFTGATEAETLKTWTMMETILGTVGAIVGMIA FQLLS >gi|296493116|gb|ADTK01000385.1| GENE 19 17311 - 17904 599 197 aa, chain - ## HITS:1 COG:ECs4279 KEGG:ns NR:ns ## COG: ECs4279 COG2095 # Protein_GI_number: 15833533 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Escherichia coli O157:H7 # 1 197 1 197 197 294 100.0 9e-80 MNEIISAAVLLILIMDPLGNLPIFMSVLKHTEPKRRRAIMVRELLIALLVMLVFLFAGEK ILAFLSLRAETVSISGGIILFLIAIKMIFPSASGNSSGLPAGEEPFIVPLAIPLVAGPTI LATLMLLSHQYPNQMGHLVIALLLAWGGTFVILLQSSLFLRLLGEKGVNALERLMGLILV 
MMATQMFLDGIRMWMKG >gi|296493116|gb|ADTK01000385.1| GENE 20 18097 - 19200 1076 367 aa, chain + ## HITS:1 COG:ECs4278 KEGG:ns NR:ns ## COG: ECs4278 COG0136 # Protein_GI_number: 15833532 # Func_class: E Amino acid transport and metabolism # Function: Aspartate-semialdehyde dehydrogenase # Organism: Escherichia coli O157:H7 # 1 367 1 367 367 746 100.0 0 MKNVGFIGWRGMVGSVLMQRMVEERDFDAIRPVFFSTSQLGQAAPSFGGTTGTLQDAFDL EALKALDIIVTCQGGDYTNEIYPKLRESGWQGYWIDAASSLRMKDDAIIILDPVNQDVIT DGLNNGIRTFVGGNCTVSLMLMSLGGLFANDLVDWVSVATYQAASGGGARHMRELLTQMG HLYGHVADELATPSSAILDIERKVTTLTRSGELPVDNFGVPLAGSLIPWIDKQLDNGQSR EEWKGQAETNKILNTSSVIPVDGLCVRVGALRCHSQAFTIKLKKDVSIPTVEELLAAHNP WAKVVPNDREITMRELTPAAVTGTLTTPVGRLRKLNMGPEFLSAFTVGDQLLWGAAEPLR RMLRQLA >gi|296493116|gb|ADTK01000385.1| GENE 21 19473 - 21659 2589 728 aa, chain + ## HITS:1 COG:glgB KEGG:ns NR:ns ## COG: glgB COG0296 # Protein_GI_number: 16131306 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Escherichia coli K12 # 1 728 1 728 728 1507 100.0 0 MSDRIDRDVINALIAGHFADPFSVLGMHKTTAGLEVRALLPDATDVWVIEPKTGRKLAKL ECLDSRGFFSGVIPRRKNFFRYQLAVVWHGQQNLIDDPYRFGPLIQEMDAWLLSEGTHLR PYETLGAHADTMDGVTGTRFSVWAPNARRVSVVGQFNYWDGRRHPMRLRKESGIWELFIP GAHNGQLYKYEMIDANGNLRLKSDPYAFEAQMRPETASLICGLPEKVVQTEERKKANQFD APISIYEVHLGSWRRHTDNNFWLSYRELADQLVPYAKWMGFTHLELLPINEHPFDGSWGY QPTGLYAPTRRFGTRDDFRYFIDAAHAAGLNVILDWVPGHFPTDDFALAEFDGTNLYEHS DPREGYHQDWNTLIYNYGRREVSNFLVGNALYWIERFGIDALRVDAVASMIYRDYSRKEG EWIPNEFGGRENLEAIEFLRNTNRILGEQVSGAVTMAEESTDFPGVSRPQDMGGLGFWYK WNLGWMHDTLDYMKLDPVYRQYHHDKLTFGILYNYTENFVLPLSHDEVVHGKKSILDRMP GDAWQKFANLRAYYGWMWAFPGKKLLFMGNEFAQGREWNHDASLDWHLLEGGDNWHHGVQ RLVRDLNLTYRHHKAMHELDFDPYGFEWLVVDDKERSVLIFVRRDKEGNEIIVASNFTPV PRHDYRFGINQPGKWREILNTDSMHYHGSNAGNGGTVHSDEIASHGRQHSLSLTLPPLAT IWLVREAE >gi|296493116|gb|ADTK01000385.1| GENE 22 21656 - 23629 1484 657 aa, chain + ## HITS:1 COG:glgX KEGG:ns NR:ns ## COG: glgX COG1523 # Protein_GI_number: 16131305 # Func_class: G Carbohydrate transport and 
metabolism # Function: Type II secretory pathway, pullulanase PulA and related glycosidases # Organism: Escherichia coli K12 # 1 657 1 657 657 1377 99.0 0 MTQLAIGKPAPLGAHYDGQGVNFTLFSAHAERVELCVFDANGQEHRYDLPGHSGDIWHGY LPDARPGLRYGYRVHGPWQPAEGHRFNPAKLLIDPCARQIEGEFKDNPLLHAGHNEPDYR DNAAIAPKCVVVVDHYDWEDDAPPRTPWGSTIIYEAHVKGLTYLHPEIPVEIRGTYKALG HPVMINYLKQLGITALELLPVAQFASEPRLQRMGLSNYWGYNPVAMFALHPAYACSPETA LDEFRDAIKALHKAGIEVILDIVLNHSAELDLDGPLFSLRGIDNRSYYWIREDGDYHNWT GCGNTLNLSHPAVVDYASACLRYWVETCHVDGFRFDLAAVMGRTPEFRQDAPLFTAIQNC PVLSQVKLIAEPWDIAPGGYQVGNFPPLFAEWNDHFRDAARRFWLHYDLLLGAFAGRFAA SSDVFKRNGRLPSAAINLVTAHDGFTLRDCVCFNHKHNEANGEENRDGTNNNYSNNHGKE GLGGSLDLVERRRDSIHALLTTLLLSQGTPMLLAGDEHGHSQHGNNNAYCQDNQLTWLDW SQASSGLTAFTAALIHLRKRIPALVENRWWEEGDGNVRWLNRYAQPLSTDEWQNGPKQLQ ILLSDRFLIAINATLEVTEIVLPAGEWHAIPPFAGEDNPVITAVWQGPAHGLCVFQR >gi|296493116|gb|ADTK01000385.1| GENE 23 23647 - 24942 1206 431 aa, chain + ## HITS:1 COG:ECs4275 KEGG:ns NR:ns ## COG: ECs4275 COG0448 # Protein_GI_number: 15833529 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Escherichia coli O157:H7 # 1 431 1 431 431 906 99.0 0 MVSLEKNDHLMLARQLPLKSVALILAGGRGTRLKDLTNKRAKPAVHFGGKFRIIDFALSN CINSGIRRMGVITQYQSHTLVQHIQRGWSFFNEEMNEFVDLLPAQQRMKGENWYRGTADA VTQNLDIIRRYKAEYVVILAGDHIYKQDYSRMLIDHVEKGARCTVACMPVPIEEASAFGV MAVDENDKNIEFVEKPANPPSMPNDPSKSLASMGIYVFDADYLYELLEEDDRDENSSHDF GKDLIPKITEAGLAYAHPFPLSCVQSDPDAEPYWRDVGTLEAYWKANLDLASVVPELDMY DRNWPIRTYNESLPPAKFVQDRSGSHGMTLNSLVSGGCVISGSVVVQSVLFSRVRVNSFC NIDSAVLLPEVWVGRSCRLRRCVIDRACVIPEGMVIGENAEEDARRFYRSEEGIVLVTRE MLRKLGHKQER >gi|296493116|gb|ADTK01000385.1| GENE 24 24942 - 26375 1276 477 aa, chain + ## HITS:1 COG:ECs4274 KEGG:ns NR:ns ## COG: ECs4274 COG0297 # Protein_GI_number: 15833528 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen synthase # Organism: Escherichia coli O157:H7 # 1 477 1 477 477 976 100.0 0 MQVLHVCSEMFPLLKTGGLADVIGALPAAQIADGVDARVLLPAFPDIRRGVTDAQVVSRR 
DTFAGHITLLFGHYNGVGIYLIDAPHLYDRPGSPYHDTNLFAYTDNVLRFALLGWVGAEM ASGLDPFWRPDVVHAHDWHAGLAPAYLAARGRPAKSVFTVHNLAYQGMFYAHHMNDIQLP WSFFNIHGLEFNGQISFLKAGLYYADHITAVSPTYAREITEPQFAYGMEGLLQQRHREGR LSGVLNGVDEKIWSPETDLLLASRYTRDTLEDKAENKRQLQIAMGLKVDDKVPLFAVVSR LTSQKGLDLVLEALPGLLEQGGQLALLGAGDPVLQEGFLAAAAEYPGQVGVQIGYHEAFS HRIMGGADVILVPSRFEPCGLTQLYGLKYGTLPLVRRTGGLADTVSDCSLENLADGVASG FVFEDSNAWSLLRAIRRAFVLWSRPSLWRFVQRQAMAMDFSWQVAAKSYRELYYRLK >gi|296493116|gb|ADTK01000385.1| GENE 25 26394 - 28841 2752 815 aa, chain + ## HITS:1 COG:glgP KEGG:ns NR:ns ## COG: glgP COG0058 # Protein_GI_number: 16131302 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Escherichia coli K12 # 1 815 1 815 815 1660 100.0 0 MNAPFTYSSPTLSVEALKHSIAYKLMFTIGKDPVVANKHEWLNATLFAVRDRLVERWLRS NRAQLSQETRQVYYLSMEFLIGRTLSNAMLSLGIYEDVQGALEAMGLNLEELIDEENDPG LGNGGLGRLAACFLDSLATLGLPGRGYGIRYDYGMFKQNIVNGSQKESPDYWLEYGNPWE FKRHNTRYKVRFGGRIQQEGKKTRWIETEEILGVAYDQIIPGYDTDATNTLRLWSAQASS EINLGKFNQGDYFAAVEDKNHSENVSRVLYPDDSTYSGRELRLRQEYFLVSSTIQDILSR HYQLHKTYDNLADKIAIHLNDTHPVLSIPEMMRLLIDEHQFSWDDAFEVCCQVFSYTNHT LMSEALETWPVDMLGKILPRHLQIIFEINDYFLKTLQEQYPNDTDLLGRASIIDESNGRR VRMAWLAVVVSHKVNGVSELHSNLMVQSLFADFAKIFPGRFTNVTNGVTPRRWLAVANPS LSAVLDEHLGRNWRTDLSLLNELQQHCDFPMVNHAVHQAKLENKKRLAEYIAQQLNVVVN PKALFDVQIKRIHEYKRQLMNVLHVITRYNRIKADPDAKWVPRVNIFGGKAASAYYMAKH IIHLINDVAKVINNDPQIGDKLKVVFIPNYSVSLAQLIIPAADLSEQISLAGTEASGTSN MKFALNGALTIGTLDGANVEMLDHVGADNIFIFGNTAEEVEELRRQGYKPREYYEKDEEL HQVLTQIGSGVFSPEDPGRYRDLVDSLINFGDHYQVLADYRSYVDCQDKVDELYELQEEW TAKAMLNIANMGYFSSDRTIKEYADHIWHIDPVRL >gi|296493116|gb|ADTK01000385.1| GENE 26 28969 - 30474 939 501 aa, chain + ## HITS:1 COG:ECs4272 KEGG:ns NR:ns ## COG: ECs4272 COG0226 # Protein_GI_number: 15833526 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 1 500 5 504 505 909 89.0 0 MQNRKWILTSLVMTFFGIPILAQFLAAVIAMLGVGLAGIIEVCNIFITPTIYLLLNIFML 
ALGALMLFFSGRVWADDSAPEKREIAVWRQCLFLVPALLTLGVWIIALHLADYQFRQMGA GWLADLMLPWLGVLLASLVGGEYWWLVIIPVGAHISFSLGYGWPTRYPLTGTSGLRCRNS LLFILLMLGFVAGYQAYLYKQLNPGVGVRENIDTWAWRPDKLNNQLTPLRGKPQIQFTQN WPRLDGATAAYPIYASAFYALSVLPEDFHEWEYLANSRTPEAYNKIVKGNADIIFVAQPS GGQKKRAEESGVTLIYTPFAREAFVFIVNVDNPVNSLTEQQVRDIFSGAITNWRTVGGND QEIQTWQRPEDSGSQTVMQSQVMKNVRMISPQETEVASMMEGMIKVVAEYRNTNNAIGYT FRYYATQMNADKNIKLLAINGIAPTAENIRNGKYPYIVDAFMVTRENMTSETQKLVEWFL TPQGQSLVEDVGYVPMYKTLH >gi|296493116|gb|ADTK01000385.1| GENE 27 30489 - 31199 317 236 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_3572 NR:ns ## KEGG: ECIAI1_3572 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 236 1 236 236 492 100.0 1e-138 MELVNTLFASLVGTDPFTGVDITIANCKSAYWDEGIVQQLINQALDEGEKFVGADGLEGL LRYNVTLNIGLTSSNVWPGFSLDTATISRLCACGADFGFDPYISDVPDVQCDLNTTNDLT VQFTAMLNPDERVIIAKRPLKKCESWIEDIYIFQVFKDAWKFHNDNSLRGFRDKQAELKL YARYYTVENCAEESCRDCNSCIRPSFSLSRSAIIRLNVANARFVYQPFTRDQRARG >gi|296493116|gb|ADTK01000385.1| GENE 28 31202 - 31969 248 255 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_3904 NR:ns ## KEGG: EcE24377A_3904 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 255 1 255 255 511 100.0 1e-143 MQPSPAMQALIEQIYHIFRRYPVPQKFVVCCEYCLSQQEQKVLRSTSLRAISYSLINAWN SSPGPDPQNSDEVRYFLPRLLEFVAQGQFDNIHEVFSLRRINLASKENWREDEWKILQRF ACQYMTDWVSGDEAVELQYMLEMFFRADIALAPLLDAINSVPGFWSTVSLACLLNRYCED YIRDNQDDIDNVITTQINAWAFNNQSILKERARQAIENPPKQPEQGTQHQVWVDDWIIDE CLCAMYDASSESPGK >gi|296493116|gb|ADTK01000385.1| GENE 29 31972 - 32577 86 201 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_3570 NR:ns ## KEGG: ECIAI1_3570 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 201 1 201 201 350 100.0 2e-95 MIKFRLYIPPVILGFVIVPLLVWPTVIVLAVLIFTLTFLAEIIFSFPLLVVRISLQELQL ELMVEYALFFSVMAGIGWQFSRRTPPELKNRLHCWLVFSPVYFWLILSNFILYISPEKSA LLENIRNFFLTFVWLPLNFSPFWPQPWTDFVGPISAQLGFALGYYCHWCRKNRNQRKKWG DWVTCLSLAILAQGPLFNYLQ >gi|296493116|gb|ADTK01000385.1| GENE 30 
32631 - 34136 1726 501 aa, chain - ## HITS:1 COG:glpD KEGG:ns NR:ns ## COG: glpD COG0578 # Protein_GI_number: 16131300 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Escherichia coli K12 # 1 501 1 501 501 982 99.0 0 METKDLIVIGGGINGAGIAADAAGRGLSVLMLEAQDLACATSSASSKLIHGGLRYLEHYE FRLVSEALAEREVLLKMAPHIAFPMRFRLPHRPHLRPSWMIRIGLFMYDHLGKRTSLPGS TGLRFGANSVLKPEIKRGFEYSDCWVDDARLVLANAQMVVRKGGEVLTRTRATSARRENG LWIVEAEDIDTGKKYSWQARGLVNATGPWVKQFFDEGMHLPSPYGIRLIKGSHIVVPRVH TQKQAYILQNEDKRIVFVIPWMDEFSIIGTTDVEYKGDPKAVKIEESEINYLLKVYNTHF KKQLSRDDIVWTYSGVRPLCDDESDSPQAITRDYTLDIHDENGKAPLLSVFGGKLTTYRK LAEHALEKLTPYYQGIGPAWTKESVLPGGAIEGDRDDYAARLRRRYPFLTESLARHYART YGSNSELLLGNAGTVSDLGEDFGHEFYEAELKYLVDHEWVRRADDALWRRTKQSMWLNAD QQSRVSQWLVEYTQQKLSLAS >gi|296493116|gb|ADTK01000385.1| GENE 31 34326 - 34652 499 108 aa, chain + ## HITS:1 COG:glpE KEGG:ns NR:ns ## COG: glpE COG0607 # Protein_GI_number: 16131299 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Escherichia coli K12 # 1 108 1 108 108 210 99.0 6e-55 MDQFECINVAVAHQKLQEKEAVLVDIRDPQSFAMGHAVQAFHLTNDTLGAFMRDNDFDTP VMVMCYHGNSSKGAAQYLLQQGYDVVYSIDGGFEAWQRQFPAEVAYGA >gi|296493116|gb|ADTK01000385.1| GENE 32 34697 - 35527 845 276 aa, chain + ## HITS:1 COG:glpG KEGG:ns NR:ns ## COG: glpG COG0705 # Protein_GI_number: 16131298 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Escherichia coli K12 # 1 276 1 276 276 517 98.0 1e-147 MLMITSFANPRVAQAFVDYMATQGVILTIQQHNQSDVWLADESQAERVRAELARFLENPA DPRYLAASWQAGHTGSGLHYRRYPFFAALRERAGPVTWVVMIACVVVFIAMQILGDQEVM LWLAWPFDPTLKFEFWRYFTHALMHFSLMHILFNLLWWWYLGGAVEKRLGSGKLIVITLI SALLSGYVQQKFSGPWFGGLSGVVYALMGYVWLRGERDPQSGIYLQRGLIIFALIWIVAG WFDLFGMSMANGAHIAGLAVGLAMAFVDSLNARKRK >gi|296493116|gb|ADTK01000385.1| GENE 33 35544 - 36302 888 252 aa, chain + ## HITS:1 COG:glpR KEGG:ns NR:ns ## COG: glpR COG1349 # 
Protein_GI_number: 16131297 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli K12 # 1 252 1 252 252 491 100.0 1e-139 MKQTQRHNGIIELVKQQGYVSTEELVEHFSVSPQTIRRDLNELAEQNLILRHHGGAALPS SSVNTPWHDRKATQTEEKERIARKVAEQIPNGSTLFIDIGTTPEAVAHALLNHSNLRIVT NNLNVANTLMVKEDFRIILAGGELRSRDGGIIGEATLDFISQFRLDFGILGISGIDSDGS LLEFDYHEVRTKRAIIENSRHVMLVVDHSKFGRNAMVNMGSISMVDAVYTDAPPPVSVMQ VLTDHHIQLELC >gi|296493116|gb|ADTK01000385.1| GENE 34 36284 - 37882 1330 532 aa, chain - ## HITS:1 COG:rtcR KEGG:ns NR:ns ## COG: rtcR COG4650 # Protein_GI_number: 16131296 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Sigma54-dependent transcription regulator containing an AAA-type ATPase domain and a DNA-binding domain # Organism: Escherichia coli K12 # 1 532 1 532 532 1043 99.0 0 MRKTVAFGFVGTVLDYAGRGSQRWSKWRPTLCLCQQESLVIDRLELLHDARSRSLFETLK RDIASVSPETEVVSVEIELHNPWDFEEVYACLHDFARGYEFQPEKEDYLIHITTGTHVAQ ICWFLLAEARYLPARLIQSSPPRKKEQPRGAGEVTIIDLDLSRYNAIASRFAEERQQTLD FLKSGIATRNPHFNRMIEQIEKVAIKSRAPILLNGPTGAGKSFLARRIFELKQARHQFSG AFVEVNCATLRGDTAMSTLFGHVKGAFTGARESREGLLRSANGGMLFLDEIGELGADEQA MLLKAIEEKTFYPFGSDRQVSSDFQLIAGTVRDLRQLVAEGKFREDLYARINLWTFTLPG LRQRQEDIEPNLDYEVERHATLTGDSVRFNTEARRAWLAFATSPQATWRGNFRELSASVT RMATFATSGRITLDVVEDEINRLRYNWQESRPSALTALLGAEAENIDLFDRMQLEHVIAL CRQAKSLSAAGRLLFDVSRQGKASVNDADRLRKYLARFGLTWEAVQDQHSSS >gi|296493116|gb|ADTK01000385.1| GENE 35 38071 - 39297 1318 408 aa, chain + ## HITS:1 COG:rtcB KEGG:ns NR:ns ## COG: rtcB COG1690 # Protein_GI_number: 16131295 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 408 1 408 408 845 99.0 0 MNYELLTTENAPVKMWTKGVPVEADARQQLINTAKMPFIFKHIAVMPDVHLGKGSTIGSV IPTKGAIIPAAVGVDIGCGMNALRTALTAEDLPENLAELRQAIETAVPHGRTTGRCKRDK GAWENPPVNVDAKWAELEAGYQWLTQKYPRFLNTNNYKHLGTLGTGNHFIEICLDESDQV WIMLHSGSRGIGNAIGTYFIDLAQKEMQDQLETLPSRDLAYFMEGTEYFDDYLKAVAWAQ 
LFASLNRDAMMENVVTALQSVTQKTVRQQQTLAMEEINCHHNYVQKEQHFGEEIYVTRKG AVSARAGQYGIIPGSMGAKSFIVRGLGNEESFCSCSHGAGRVMSRTKAKKLFSVEDQIRA TAHVECRKDAEVIDEIPMAYKDIDAVMAAQSDLVEVIYTLRQVVCVKG >gi|296493116|gb|ADTK01000385.1| GENE 36 39301 - 40317 806 338 aa, chain + ## HITS:1 COG:yhgK+J KEGG:ns NR:ns ## COG: yhgK+J COG0430 # Protein_GI_number: 16132255 # Func_class: A RNA processing and modification # Function: RNA 3'-terminal phosphate cyclase # Organism: Escherichia coli K12 # 1 338 2 339 339 614 97.0 1e-176 MKRMIALDGAQGEGGGQILRSALSLSMITGQPFTITSIRAGRAKPGLLRQHLTAVKAAAE ICRATVEGAELGSQRLVFRPGTVRGGEYRFAIGSAGSCTLVLQTVLPALWFADGPSRVEV SGGTDNPSAPPADFIRRVLEPLLAKIGIHQQTTLLRHGFYPAGGGVVATEVSPVASFNTL QLGERGNIVQMRGEVLLAGVPRHVAEREIATLAGSFSLHEQNIHNLPRDQGPGNTVSLEV ESENITERFFVVGEKRVSAEVVAAQLVKEVKRYLASPAAVGEYLADQLVLPMALAGAGEF KVAHPSCHLLTNIAVVERFLPVRFGLIETDGVTRVSIE >gi|296493116|gb|ADTK01000385.1| GENE 37 40360 - 43065 2378 901 aa, chain - ## HITS:1 COG:ECs4260 KEGG:ns NR:ns ## COG: ECs4260 COG2909 # Protein_GI_number: 15833514 # Func_class: K Transcription # Function: ATP-dependent transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 901 1 901 901 1679 100.0 0 MLIPSKLSRPVRLDHTVVRERLLAKLSGANNFRLALITSPAGYGKTTLISQWAAGKNDIG WYSLDEGDNQQERFASYLIAAVQQATNGHCAICETMAQKRQYASLTSLFAQLFIELAEWH SPLYLVIDDYHLITNPVIHESMRFFIRHQPENLTLVVLSRNLPQLGIANLRVRDQLLEIG SQQLAFTHQEAKQFFDCRLSSPIEAAESSRICDDVSGWATALQLIALSARQNTHSAHKSA RRLAGINASHLSDYLVDEVLDNVDLATRHFLLKSAILRSMNDALITRVTGEENGQMRLEE IERQGLFLQRMDDTGEWFCYHPLFGNFLRQRCQWELAAELPEIHRAAAESWMAQGFPSEA IHHALAAGDALMLRDILLNHAWSLFNHSELSLLEESLKALPWDSLLENPQLVLLQAWLMQ SQHRYGEVNTLLARAEHEIKDIREGTMHAEFNALRAQVAINDGNPDEAERLAKLALEELP PGWFYSRIVATSVLGEVLHCKGELTRSLALMQQTEQMARQHDVWHYALWSLIQQSEILFA QGFLQTAWETQEKAFQLINEQHLEQLPMHEFLVRIRAQLLWAWARLDEAEASARSGIEVL SSYQPQQQLQCLAMLIQCSLARGDLDNARSQLNRLENLLGNGKYHSDWISNANKVRVIYW QMTGDKAAAANWLRHTAKPEFANNHFLQGQWRNIARAQILLGEFEPAEIVLEELNENARS LRLMSDLNRNLLLLNQLYWQAGRKSDAQRVLLDALKLANRTGFISHFVIEGEAMAQQLRQ 
LIQLNTLPELEQHRAQRILREINQHHRHKFAHFDENFVERLLNHPEVPELIRTSPLTQRE WQVLGLIYSGYSNEQIAGELEVAATTIKTHIRNLYQKLGVAHRQAAVQHAQKLLKMMGYG V >gi|296493116|gb|ADTK01000385.1| GENE 38 43677 - 46070 2655 797 aa, chain + ## HITS:1 COG:ECs4259 KEGG:ns NR:ns ## COG: ECs4259 COG0058 # Protein_GI_number: 15833513 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Escherichia coli O157:H7 # 1 797 1 797 797 1644 99.0 0 MSQPIFNDKQFQEALSRQWQRYGLNSAAEMTPRQWWLAVSEALAEMLRAQPFAKPVANQR HVNYISMEFLIGRLTGNNLLNLGWYQDVQDSLKAYDINLTDLLEEEIDPALGNGGLGRLA ACFLDSMATVGQSATGYGLNYQYGLFRQSFVDGKQVEAPDDWHRSNYPWFRHNEALDVQV GIGGKVTKDGRWEPEFTITGQAWDLPVVGYRNGVAQPLRLWQATHAHPFDLTKFNDGDFL RAEQQGINAEKLTKVLYPNDNHTAGKKLRLMQQYFQCACSVADILRRHHLAGRKLHELAD YEVIQLNDTHPTIAIPELLRVLIDEHQMSWDDAWAITSKTFAYTNHTLMPEALERWDVKL VKGLLPRHMQIINEINTRFKTLVEKTWPGDEKVWAKLAVVHDKQVHMANLCVVGGFAVNG VAALHSDLVVKDLFPEYHQLWPNKFHNVTNGITPRRWIKQCNPALAALLDKSLQKEWAND LDQLINLEKFADDAKFRQQYREIKQANKVRLAEFVKVRTGIEINPQAIFDIQIKRLHEYK RQHLNLLHILALYKEIRENPQADRVPRVFLFGAKAAPGYYLAKNIIFAINKVADVINNDP LVGDKLKVVFLPDYCVSAAEKLIPAADISEQISTAGKEASGTGNMKLALNGALTVGTLDG ANVEIAEKVGEENIFIFGHTVEQVKAILAKGYDPVKWRKKDKVLDAVLKELESGKYSDGD KHAFDQMLHSIGKQGGDPYLVMADFAAYVEAQKQVDVLYRDQEAWTRAAILNTARCGMFS SDRSIRDYQARIWQAKR >gi|296493116|gb|ADTK01000385.1| GENE 39 46080 - 48164 2001 694 aa, chain + ## HITS:1 COG:malQ KEGG:ns NR:ns ## COG: malQ COG1640 # Protein_GI_number: 16131292 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Escherichia coli K12 # 1 684 1 684 694 1409 99.0 0 MESKRLDNAALAAGISPNYINAHGKPQSISAETKRRLLDAMHQRTATKVAVTPVPNVMVY TSGKKMPMVVEGSGEYSWLLTTEEGTQYKGHVTGGKAFNLPTKLPEGYHTLTLTQDDQRA HCRVIVAPKRCYEPQALLNKQKLWGACVQLYTLRSEKNWGIGDFGDLKAMLVDVAKRGGS FIGLNPIHALYPANPESASPYSPSSRRWLNVIYIDVNAVEDFHLSEEAQAWWQLPTTQQT LQQARDADWVDYSTVTALKMTALRMAWKGFAQRDDEQMTAFRQFVAEQGDSLFWQAAFDA LHAQQVKEDEMRWGWPAWPEMYQNVDSPEVRQFCEEHRNDVDFYLWLQWLAYSQFAACWE ISQGYEMPIGLYRDLAVGVAEGGAETWCDRELYCLKASVGAPPDILGPLGQNWGLPPMDP 
HIITARAYEPFIELLRANMQNCGALRIDHVMSMLRLWWIPYGETADQGAYVHYPVDDLLS ILALESKRHRCMVIGEDLGTVPVEIVGKLRSSGVYSYKVLYFENDHEKTFRAPKAYPEQS MAVAATHDLPTLRGYWESGDLTLGKTLGLYPDEVVLRGLYQDRELAKQGLLDALHKYGCL PKRAGHKASLMSMTPTLNRGLQRYIADSNSALLGLQPEDWLDMAEPVNIPGTSYQYKNWR RKLSATLETMFADDGVNKLLKDLDRRRRAAAKKK >gi|296493116|gb|ADTK01000385.1| GENE 40 48209 - 49525 1691 438 aa, chain - ## HITS:1 COG:ECs4257 KEGG:ns NR:ns ## COG: ECs4257 COG2610 # Protein_GI_number: 15833511 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Escherichia coli O157:H7 # 1 438 1 438 438 700 100.0 0 MPLVIVAIGVILLLLLMIRFKMNGFIALVLVALAVGLMQGMPLDKVIGSIKAGVGGTLGS LALIMGFGAMLGKMLADCGGAQRIATTLIAKFGKKHIQWAVVLTGFTVGFALFYEVGFVL MLPLVFTIAASANIPLLYVGVPMAAALSVTHGFLPPHPGPTAIATIFNADMGKTLLYGTI LAIPTVILAGPVYARVLKGIDKPIPEGLYSAKTFSEEEMPSFGVSVWTSLVPVVLMAMRA IAEMILPKGHAFLPVAEFLGDPVMATLIAVLIAMFTFGLNRGRSMDQINDTLVSSIKIIA MMLLIIGGGGAFKQVLVDSGVDKYIASMMHETNISPLLMAWSIAAVLRIALGSATVAAIT AGGIAAPLIATTGVSPELMVIAVGSGSVIFSHVNDPGFWLFKEYFNLTIGETIKSWSMLE TIISVCGLVGCLLLNMVI >gi|296493116|gb|ADTK01000385.1| GENE 41 49886 - 50461 700 191 aa, chain - ## HITS:1 COG:yhgI_2 KEGG:ns NR:ns ## COG: yhgI_2 COG0694 # Protein_GI_number: 16131290 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Thioredoxin-like proteins and domains # Organism: Escherichia coli K12 # 97 191 1 95 95 197 100.0 8e-51 MIRISDAAQAHFAKLLANQEEGTQIRVFVINPGTPNAECGVSYCPPDAVEATDTALKFDL LTAYVDELSAPYLEDAEIDFVTDQLGSQLTLKAPNAKMRKVADDAPLMERVEYMLQSQIN PQLAGHGGRVSLMEITEDGYAILQFGGGCNGCSMVDVTLKEGIEKQLLNEFPELKGVRDL TEHQRGEHSYY >gi|296493116|gb|ADTK01000385.1| GENE 42 50520 - 51203 251 227 aa, chain - ## HITS:1 COG:yhgH KEGG:ns NR:ns ## COG: yhgH COG1040 # Protein_GI_number: 16131289 # Func_class: R General function prediction only # Function: Predicted amidophosphoribosyltransferases # Organism: Escherichia coli K12 # 1 227 17 243 243 400 98.0 1e-112 
MLTVPGLCWLCRMPLALGHWGICSVCSRAARTDKTLCPQCGLPATHSHLPCGRCLQKPPP WQRLVTVADYAPPLSPLIHQLKFSRRSEIASALSRLLLLEVLHARRTTGLQLPDRIISVP LWQRRHWRRGFNQSDLLCQPLSRWLHCQWDSEAVTRTRATATQHFLSARLRKRNLKNAFR LELPVQGRHMVIVDDVVTTGSTVAEIAQLLLRNGAATVQVWCLCRTL >gi|296493116|gb|ADTK01000385.1| GENE 43 51241 - 52011 613 256 aa, chain + ## HITS:1 COG:bioH KEGG:ns NR:ns ## COG: bioH COG0596 # Protein_GI_number: 16131288 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Escherichia coli K12 # 1 256 1 256 256 514 99.0 1e-146 MNNIWWQTKGQGNVHLVLLHGWGLNAEVWRCIDEELSSHFTLHLVDLPGFGRSRGFGALS LADMAEAVLQQAPDKAIWLGWSLGGLVASQIALTHPERVQALVTVASSPCFSARDEWPGI KPDVLAGFQQQLSDDFQRTVERFLALQTMGTETARQDARALKKTVLALPMPEVDVLNGGL EILKTVDLRQPLQNVPMPFLRLYGYLDGLVPRKVVPMLDKLWPHSESYIFAKAAHAPFIS HPVEFCHLLVALKQRV >gi|296493116|gb|ADTK01000385.1| GENE 44 52040 - 52918 741 292 aa, chain - ## HITS:1 COG:yhgA KEGG:ns NR:ns ## COG: yhgA COG5464 # Protein_GI_number: 16131287 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 292 1 292 292 581 98.0 1e-166 MSKKQSSTPHDALFKLFLRQPDTARDFLAFHLPAPIHALCDMKTLKLESSSFIDDDLRES YSDVLWSVKTEQGPGYIYCLIEHQSTSNKLIAFRMMRYAIAAMQNHLDAGYKTLPMVVLL LFYHGIESPYPYSLCWLDCFADPKLARQLYASAFPLIDVTVMPDDEIMQHRRMALLELIQ KHIRQRDLMGLVEQMACLLSSGYANDRQIKGLFNYILQTGDAVRFNDFIDGVAERSPKHK ESLMTIAERLRQEGEQSKALHIAKIMLESGVPLTDIMRFTGVSEEELAAASQ >gi|296493116|gb|ADTK01000385.1| GENE 45 53121 - 53357 224 78 aa, chain - ## HITS:1 COG:no KEGG:SDY_3666 NR:ns ## KEGG: SDY_3666 # Name: yhgG # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 78 1 78 78 136 100.0 3e-31 MASLIQVRDLLALRGRMEAAQISQTLNTPQPMINAMLQQLESMGKAVRIQEEPDGCLSGS CKSCPEGKACLREWWALR >gi|296493116|gb|ADTK01000385.1| GENE 46 53357 - 55678 2819 773 aa, chain - ## HITS:1 COG:ZfeoB KEGG:ns NR:ns ## COG: ZfeoB COG0370 # Protein_GI_number: 15803913 # Func_class: P Inorganic ion transport and 
metabolism # Function: Fe2+ transport system protein B # Organism: Escherichia coli O157:H7 EDL933 # 1 773 1 773 773 1497 99.0 0 MKKLTIGLIGNPNSGKTTLFNQLTGARQRVGNWAGVTVERKEGQFSTTDHQVTLVDLPGT YSLTTISSQTSLDEQIACHYILSGDADLLINVVDASNLERNLYLTLQLLELGIPCIVALN MLDIAEKQNIRIEIDALSARLGCLVIPLVSTRGRGIEALKLAIDRYKANENVELVHYAQP LLNEADSLAKVMPSDIPLKQRRWLGLQMLEGDIYSRAYAGEASQHLDAALARLRNEMDDP ALHIADARYQCIAAICDVVSNTLTAEPSRFTTAVDKIVLNRFLGLPIFLFVMYLMFLLAI NIGGALQPLFDVGSVALFVHGIQWIGYTLHFPDWLTIFLAQGLGGGINTVLPLVPQIGMM YLFLSFLEDSGYMARAAFVMDRLMQALGLPGKSFVPLIVGFGCNVPSVMGARTLDAPRER LMTIMMAPFMSCGARLAIFAVFAAAFFGQNGALAVFSLYMLGIVMAVLTGLMLKYTIMRG EATPFVMELPVYHVPHVKSLIIQTWQRLKGFVLRAGKVIIIVSIFLSAFNSFSLSGKIVD NINDSALASVSRVITPVFKPIGVHEDNWQATVGLFTGAMAKEVVVGTLNTLYTAENIQDE EFNPAEFNLGEELFSAIDETWQSLKDTFSLSVLMNPIEASKGDGEMGTGAMGVMDQKFGS AAAAYSYLIFVLLYVPCISVMGAIARESSRGWMGFSILWGLNIAYSLATLFYQVASYSQH PTYSLVCILAVILFNIVVIGLLRRARSRVDIELLATRKSVSSCCAASTTGDCH >gi|296493116|gb|ADTK01000385.1| GENE 47 55695 - 55922 224 75 aa, chain - ## HITS:1 COG:ECs4250 KEGG:ns NR:ns ## COG: ECs4250 COG1918 # Protein_GI_number: 15833504 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein A # Organism: Escherichia coli O157:H7 # 1 75 1 75 75 146 98.0 1e-35 MQYTPDTAWKITGFSHEISPAYRQKLLSLGMLPGSSFNVVRVAPLGDPIHIETRRVSLVL RKKDLALLEVEAVSC Prediction of potential genes in microbial genomes Time: Mon May 16 16:20:14 2011 Seq name: gi|296493115|gb|ADTK01000386.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1253.7, whole genome shotgun sequence Length of sequence - 2471 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 91 - 2412 1644 ## PROTEIN SUPPORTED gi|51894064|ref|YP_076755.1| ribosomal protein S1-like protein Predicted protein(s) >gi|296493115|gb|ADTK01000386.1| GENE 1 91 - 2412 1644 773 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|51894064|ref|YP_076755.1| ribosomal protein S1-like protein [Symbiobacterium thermophilum IAM 14863] # 1 743 1 743 764 637 48 0.0 MMNDSFCRIIAGEIQARPEQVDAAVRLLDEGNTVPFIARYRKEITGGLDDTQLRNLETRL SYLRELEERRQAILKSISEQGKLTDDLAKAINATLSKTELEDLYLPYKPKRRTRGQIAIE AGLEPLADLLWSDPSHTPEVAAAQYVDADKGVADTKAALDGARYILMERFAEDAALLAKV RDYLWKNAHLVSTVVSGKEEEGAKFRDYFDHHEPLSTVPSHRALAMFRGRNEGVLQLSLN ADPQFDEPPKESYCEQIIMDHLGLRLNNAPADSWRKGVVSWTWRIKVLMHLETELMGTVR ERAEDEAINVFARNLHDLLMAAPAGLRATMGLDPGLRTGVKVAVVDATGKLVATDTIYPH TGQAAKAAMTVAALCEKHNVELVAIGNGTASRETERFYLDVQKQFPKVTAQKVIVSEAGA SVYSASELAAQEFPDLDVSLRGAVSIARRLQDPLAELVKIDPKSIGVGQYQHDVSQTQLA RKLDAVVEDCVNAVGVDLNTASVPLLTRVAGLTRMMAQNIVAWRDENGQFQNRQQLLKVS RLGPKAFEQCAGFLRINHGDNPLDASTVHPEAYPVVERILAATQQALKDLMGNSSELRNL KASDFTDEKFGVPTVTDIIKELEKPGRDPRPEFKTAQFADGVETMNDLQPGMILEGAVTN VTNFGAFVDIGVHQDGLVHISSLSNKFVEDPHTVVKAGDIVKVKVLEVDLQRKRIALTMR LDEQPGETNARRGGGNERPQNNRPAAKPRGREAQPAGNSAMMDALAAAMGKKR Prediction of potential genes in microbial genomes Time: Mon May 16 16:20:21 2011 Seq name: gi|296493114|gb|ADTK01000387.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1253.8, whole genome shotgun sequence Length of sequence - 21560 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 8, operones - 3 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 451 589 ## COG0782 Transcription elongation factor - Prom 533 - 592 4.9 + Prom 515 - 574 11.2 2 2 Op 1 40/0.000 + CDS 679 - 1398 894 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 3 2 Op 2 . + CDS 1395 - 2747 1329 ## COG0642 Signal transduction histidine kinase + Term 2749 - 2793 7.1 - Term 2741 - 2774 4.4 4 3 Tu 1 . 
- CDS 2823 - 4445 1965 ## COG1866 Phosphoenolpyruvate carboxykinase (ATP) - Prom 4601 - 4660 4.0 + Prom 4543 - 4602 5.7 5 4 Tu 1 . + CDS 4824 - 6548 1513 ## SbBS512_E3780 hypothetical protein - Term 6639 - 6674 7.4 6 5 Op 1 3/0.333 - CDS 6683 - 7561 1016 ## COG1281 Disulfide bond chaperones of the HSP33 family 7 5 Op 2 4/0.333 - CDS 7586 - 7987 427 ## COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) 8 5 Op 3 . - CDS 7998 - 8666 918 ## COG1011 Predicted hydrolase (HAD superfamily) - Term 8680 - 8726 10.1 9 5 Op 4 . - CDS 8731 - 10866 1812 ## EcE24377A_3870 hypothetical protein - Prom 11085 - 11144 5.3 + Prom 11044 - 11103 7.0 10 6 Tu 1 . + CDS 11186 - 11746 623 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes + Term 11760 - 11796 0.8 - Term 11855 - 11892 1.0 11 7 Tu 1 . - CDS 11911 - 14463 2571 ## COG5009 Membrane carboxypeptidase/penicillin-binding protein - Prom 14487 - 14546 3.4 + Prom 14451 - 14510 2.0 12 8 Op 1 . + CDS 14583 - 15362 680 ## ECO111_4204 putative pilus assembly protein 13 8 Op 2 . + CDS 15362 - 15901 286 ## COG3166 Tfp pilus assembly protein PilN 14 8 Op 3 . + CDS 15885 - 16325 174 ## SFV_3398 hypothetical protein 15 8 Op 4 . + CDS 16315 - 16719 244 ## SDY_3687 hypothetical protein 16 8 Op 5 8/0.000 + CDS 16631 - 17869 1143 ## COG4796 Type II secretory pathway, component HofQ + Term 17949 - 17982 0.6 + Prom 18108 - 18167 3.9 17 8 Op 6 20/0.000 + CDS 18270 - 18791 544 ## COG0703 Shikimate kinase 18 8 Op 7 7/0.000 + CDS 18848 - 19936 876 ## COG0337 3-dehydroquinate synthetase + Term 19952 - 19982 2.7 19 8 Op 8 . 
+ CDS 20028 - 21314 1045 ## COG3266 Uncharacterized protein conserved in bacteria + Term 21355 - 21393 -0.9 Predicted protein(s) >gi|296493114|gb|ADTK01000387.1| GENE 1 1 - 451 589 150 aa, chain - ## HITS:1 COG:ECs4248 KEGG:ns NR:ns ## COG: ECs4248 COG0782 # Protein_GI_number: 15833502 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Escherichia coli O157:H7 # 1 150 13 162 170 287 99.0 4e-78 MKTPLVTREGYEKLKQELNYLWREERPEVTKKVTWAASLGDRSENADYQYNKKRLREIDR RVRYLTKCMENLKIVDYSPQQEGKVFFGAWVEIENDDGVTHRFRIVGYDEIFDRKDYISI DSPMARALLKKEVGDLAVVNTPAGEASWYV >gi|296493114|gb|ADTK01000387.1| GENE 2 679 - 1398 894 239 aa, chain + ## HITS:1 COG:ECs4247 KEGG:ns NR:ns ## COG: ECs4247 COG0745 # Protein_GI_number: 15833501 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 239 1 239 239 447 100.0 1e-126 MQENYKILVVDDDMRLRALLERYLTEQGFQVRSVANAEQMDRLLTRESFHLMVLDLMLPG EDGLSICRRLRSQSNPMPIIMVTAKGEEVDRIVGLEIGADDYIPKPFNPRELLARIRAVL RRQANELPGAPSQEEAVIAFGKFKLNLGTREMFREDEPMPLTSGEFAVLKALVSHPREPL SRDKLMNLARGREYSAMERSIDVQISRLRRMVEEDPAHPRYIQTVWGLGYVFVPDGSKA >gi|296493114|gb|ADTK01000387.1| GENE 3 1395 - 2747 1329 450 aa, chain + ## HITS:1 COG:envZ KEGG:ns NR:ns ## COG: envZ COG0642 # Protein_GI_number: 16131281 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 450 1 450 450 899 100.0 0 MRRLRFSPRSSFARTLLLIVTLLFASLVTTYLVVLNFAILPSLQQFNKVLAYEVRMLMTD KLQLEDGTQLVVPPAFRREIYRELGISLYSNEAAEEAGLRWAQHYEFLSHQMAQQLGGPT EVRVEVNKSSPVVWLKTWLSPNIWVRVPLTEIHQGDFSPLFRYTLAIMLLAIGGAWLFIR IQNRPLVDLEHAALQVGKGIIPPPLREYGASEVRSVTRAFNHMAAGVKQLADDRTLLMAG VSHDLRTPLTRIRLATEMMSEQDGYLAESINKDIEECNAIIEQFIDYLRTGQEMPMEMAD LNAVLGEVIAAESGYEREIETALYPGSIEVKMHPLSIKRAVANMVVNAARYGNGWIKVSS GTEPNRAWFQVEDDGPGIAPEQRKHLFQPFVRGDSARTISGTGLGLAIVQRIVDNHNGML 
ELGTSERGGLSIRAWLPVPVTRAQGTTKEG >gi|296493114|gb|ADTK01000387.1| GENE 4 2823 - 4445 1965 540 aa, chain - ## HITS:1 COG:pckA KEGG:ns NR:ns ## COG: pckA COG1866 # Protein_GI_number: 16131280 # Func_class: C Energy production and conversion # Function: Phosphoenolpyruvate carboxykinase (ATP) # Organism: Escherichia coli K12 # 1 540 1 540 540 1119 100.0 0 MRVNNGLTPQELEAYGISDVHDIVYNPSYDLLYQEELDPSLTGYERGVLTNLGAVAVDTG IFTGRSPKDKYIVRDDTTRDTFWWADKGKGKNDNKPLSPETWQHLKGLVTRQLSGKRLFV VDAFCGANPDTRLSVRFITEVAWQAHFVKNMFIRPSDEELAGFKPDFIVMNGAKCTNPQW KEQGLNSENFVAFNLTERMQLIGGTWYGGEMKKGMFSMMNYLLPLKGIASMHCSANVGEK GDVAVFFGLSGTGKTTLSTDPKRRLIGDDEHGWDDDGVFNFEGGCYAKTIKLSKEAEPEI YNAIRRDALLENVTVREDGTIDFDDGSKTENTRVSYPIYHIDNIVKPVSKAGHATKVIFL TADAFGVLPPVSRLTADQTQYHFLSGFTAKLAGTERGITEPTPTFSACFGAAFLSLHPTQ YAEVLVKRMQAAGAQAYLVNTGWNGTGKRISIKDTRAIIDAILNGSLDNAETFTLPMFNL AIPTELPGVDTKILDPRNTYASPEQWQEKAETLAKLFIDNFDKYTDTPAGAALVAAGPKL >gi|296493114|gb|ADTK01000387.1| GENE 5 4824 - 6548 1513 574 aa, chain + ## HITS:1 COG:no KEGG:SbBS512_E3780 NR:ns ## KEGG: SbBS512_E3780 # Name: not_defined # Def: hypothetical protein # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 1 574 1 574 574 998 99.0 0 MDNVELSPATRWGMIATGLLQGLVCYLLIAWLAGKNHSWIVYGVPATVAFSSVLLFSVIS FKQKRLWGWLALVFIATTGMSGWLKWQTDGMNPWRAEKAIWDFGCYLLLMAMLLLPWIQQ SLRIRNGSSRYSYFYQSVWHNVLILLVIFLANGLTWLVLLLWGELFKLVGIKFFNTLFFA TDWFMYLTLGLVTALAVILARTQSRLIDSIQKLFTLIATGLLPLVSLLTLMFIITLPFTG LSAISRHISAAGLLLTLAFLQLILMAIVRDPQKASLPWTGPLRCLIKTALLVAPLYVFVA AWALWLRVAQYGWTVDRLQGALAVLVLLVWSLGYFVSIVWRNGQNPLVLQGKVNLAVSLL VLVILVLLNSPVLDSMRISVNSHMARYQSGKNTPDQVTIYMLEQSGRYGRAALESLKSDA GFMKDPKRARDLLMALDGEQHLQEQVSEKVLAENVLIAPGSVKPDATFWSALIQDRYNMM TCIEKDACVLVEQDLNSDGQAERILFAFNDDRVIVYGFDSDRKEWDALDMSLLPNEITKE KLLTAAKDGKLGTRPKAWRDLTVDGETLEINLSK >gi|296493114|gb|ADTK01000387.1| GENE 6 6683 - 7561 1016 292 aa, chain - ## HITS:1 COG:ZyrfI KEGG:ns NR:ns ## COG: ZyrfI COG1281 # Protein_GI_number: 15803904 # Func_class: O Posttranslational modification, protein turnover, chaperones # 
Function: Disulfide bond chaperones of the HSP33 family # Organism: Escherichia coli O157:H7 EDL933 # 1 292 3 294 294 585 100.0 1e-167 MPQHDQLHRYLFENFAVRGELVTVSETLQQILENHDYPQPVKNVLAELLVATSLLTATLK FDGDITVQLQGDGPMNLAVINGNNNQQMRGVARVQGEIPENADLKTLVGNGYVVITITPS EGERYQGVVGLEGDTLAACLEDYFMRSEQLPTRLFIRTGDVDGKPAAGGMLLQVMPAQNA QQDDFDHLATLTETIKTEELLTLPANEVLWRLYHEEEVTVYDPQDVEFKCTCSRERCADA LKTLPDEEVDSILAEDGEIDMHCDYCGNHYLFNAMDIAEIRNNASPADPQVH >gi|296493114|gb|ADTK01000387.1| GENE 7 7586 - 7987 427 133 aa, chain - ## HITS:1 COG:ECs4242 KEGG:ns NR:ns ## COG: ECs4242 COG1188 # Protein_GI_number: 15833496 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) # Organism: Escherichia coli O157:H7 # 1 133 1 133 133 213 99.0 6e-56 MKEKPAVEVRLDKWLWAARFYKTRALAREMIEGGKVHYNGQRSKPSKIVELNATLTLRQG NDERTVIVKAITEQRRPASEAALLYEETAESVEKREKMAMARKLNALTMPHPDRRPDKKE RRDLLRFKHGDSE >gi|296493114|gb|ADTK01000387.1| GENE 8 7998 - 8666 918 222 aa, chain - ## HITS:1 COG:ECs4241 KEGG:ns NR:ns ## COG: ECs4241 COG1011 # Protein_GI_number: 15833495 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Escherichia coli O157:H7 # 1 222 16 237 237 459 100.0 1e-129 MHINIAWQDVDTVLLDMDGTLLDLAFDNYFWQKLVPETWGAKNGVTPQEAMEYMRQQYHD VQHTLNWYCLDYWSEQLGLDICAMTTEMGPRAVLREDTIPFLEALKASGKQRILLTNAHP HNLAVKLEHTGLDAHLDLLLSTHTFGYPKEDQRLWHAVAEATGLKAERTLFIDDSEAILD AAAQFGIRYCLGVTNPDSGIAEKQYQRHPSLNDYRRLIPSLM >gi|296493114|gb|ADTK01000387.1| GENE 9 8731 - 10866 1812 711 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_3870 NR:ns ## KEGG: EcE24377A_3870 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 711 1 711 711 1382 99.0 0 MSTIVIFLAALLACSLLAGWLIKVRSRRRQLPWTNAFADAQTRKLTPEERSAVENYLESL TQVLQVPGPTGASAAPISLALNAESNNVMMLTHAITRYGISTDDPNKWRYYLDSVEVHLP PFWEQYINDENTVELIYTDSLPLVISLNGHTLQEYMQETRGYALQPVPSTQASIRGEESE 
QIELLNIRKETHEEYALSRPRGLREALLIVASFLMFFFCLITPDVFVPWLAGGALLLLGA GLWGLFAPPAKSSLREIHCLRGTPRRWGLFGENDQEQINNISLGIIDLVYPAHWQPYIAQ DLGQQTNIDIYLDRHVVRQGRYLSLHDEVKNFPLQHWLRSTIIAAGSLLVLFMLLFWIPL DMPLKFTLSWMKGAQTIEATSVKQLADAGVRVGDTLRISGTGMCNIRTSGTWSAKTNSPF LPFDCSQIIWNDARSLPLPESELVNKATALTEAVNRQLHPKPEDESRVSASLRSAIQKSG MVLLDDFGDIVLKTADLCSAKDDCVRLKNALVNLGNSKDWDALVKRANAGKLDGVNVLLR PVSAESLDNLVATSTAPFITHETARAAQSLNSPAPGGFLIVSDEGSDFVDQPWPSASLYD YPPQEQWNAFQKLAQMLMHTPFNAEGIVTKIFTDANGTQHIGLHPIPDRSGLWRYLSTTL LLLTMLGSAIYNGVQAWRRYQRHRTRMMEIQAYYESCLNPQLITPSESLIE >gi|296493114|gb|ADTK01000387.1| GENE 10 11186 - 11746 623 186 aa, chain + ## HITS:1 COG:yrfE KEGG:ns NR:ns ## COG: yrfE COG0494 # Protein_GI_number: 16131274 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli K12 # 1 186 1 186 186 360 100.0 1e-99 MSKSLQKPTILNVETVARSRLFTVESVDLEFSNGVRRVYERMRPTNREAVMIVPIVDDHL ILIREYAVGTESYELGFSKGLIDPGESVYEAANRELKEEVGFGANDLTFLKKLSMAPSYF SSKMNIVVAQDLYPESLEGDEPEPLPQVRWPLAHMMDLLEDPDFNEARNVSALFLVREWL KGQGRV >gi|296493114|gb|ADTK01000387.1| GENE 11 11911 - 14463 2571 850 aa, chain - ## HITS:1 COG:ECs4238 KEGG:ns NR:ns ## COG: ECs4238 COG5009 # Protein_GI_number: 15833492 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase/penicillin-binding protein # Organism: Escherichia coli O157:H7 # 1 850 9 858 858 1712 99.0 0 MKFVKYFLILAVCCILLGAGSIYGLYRYIEPQLPDVATLKDVRLQIPMQIYSADGELIAQ YGEKRRIPVTLDQIPPEMVKAFIATEDSRFYEHHGVDPVGIFRAASVALFSGHASQGAST ITQQLARNFFLSPERTLMRKIKEVFLAIRIEQLLTKDEILELYLNKIYLGYRAYGVGAAA QVYFGKTVDQLTLNEMAVIAGLPKAPSTFNPLYSMDRAVARRNVVLSRMLDEGYITQQQF DQTRTEAINANYHAPEIAFSAPYLSEMVRQEMYNRYGESAYEDGYRIYTTITRKVQQAAQ QAVRNNVLDYDMRHGYRGPANVLWKVGESAWDNNKITDTLKALPTYGPLLPAAVTSANPQ EATAMLADGSTVALSMDGVRWARPYRSDTQQGPTPRKVTDVLQTGQQIWVRQVGDAWWLA QVPEVNSALVSINPQNGAVMALVGGFDFNQSKFNRATQALRQVGSNIKPFLYTAAMDKGL 
TLASMLNDVPISRWDAGAGSDWQPKNSPPQYAGPIRLRQGLGQSKNVVMVRAMRAMGVDY AAEYLQRFGFPAQNIVHTESLALGSASFTPMQVARGYAVMANGGFLVDPWFISKIENDQG GVIFEAKPKVACPECDIPVIYGDTQKSNVLENNDVEDVAISREQQNVSVPMPQLEQANQA LVAKTGAQEYAPHVINTPLAFLIKSALNTNIFGEPGWQGTGWRAGRDLQRRDIGGKTGTT NSSKDAWFSGYGPGVVTSVWIGFDDHRRNLGHTTASGAIKDQISGYEGGAKSAQPAWDAY MKAVLEGVPEQPLTPPPGIVTVNIDRSTGQLANGGNSREEYFIEGTQPTQQAVHEVGTTI IDNGEAQELF >gi|296493114|gb|ADTK01000387.1| GENE 12 14583 - 15362 680 259 aa, chain + ## HITS:1 COG:no KEGG:ECO111_4204 NR:ns ## KEGG: ECO111_4204 # Name: hofM # Def: putative pilus assembly protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 259 1 259 259 490 99.0 1e-137 MAFKIWQIGLHLQQQEAVAVAIVRGAKECFLQRWWRLPLENDIIKDGRIVDAQQLAKTLL PWSRELPQRHHIMLAFPASRTLQRSFPRPSMSLGEREQTAWLSGTMARELDMDPDSLRFD YSEDSLSPAYNVTAAQSKELATLLTLAERLRVHVSAITPDASALQQFLPFLPSHQQCLAW RDNEQWLWATRYRWGRKLAVGMTSAKELAAALSVDPDSVAICGEGGFDPWEAVSVRQPPL PPPGGDFAIALGLALGKAY >gi|296493114|gb|ADTK01000387.1| GENE 13 15362 - 15901 286 179 aa, chain + ## HITS:1 COG:yrfC KEGG:ns NR:ns ## COG: yrfC COG3166 # Protein_GI_number: 16131271 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Tfp pilus assembly protein PilN # Organism: Escherichia coli K12 # 1 179 1 179 179 244 95.0 5e-65 MNLPINFLPWRQQRRTAFLRFWLLMFVAPLLLAVGITLILRLTSSAEARVDAVLLQGEQQ LAHSLQITKPRLLERQQIREQRLQRQRQRQFTRDWQSALEALAALLPEHAWLTTISWQQG TLEIKGLTTSITALNALETSLRQDASFHLNQRGATQQDAQGRWQFEYQLTRKVSDEHVL >gi|296493114|gb|ADTK01000387.1| GENE 14 15885 - 16325 174 146 aa, chain + ## HITS:1 COG:no KEGG:SFV_3398 NR:ns ## KEGG: SFV_3398 # Name: yrfB # Def: hypothetical protein # Organism: S.flexneri_8401 # Pathway: not_defined # 1 146 1 146 146 266 100.0 2e-70 MNMFFDWWFATSPRLRQFCWAFWLLMLVTLIFLSSTHHEERDALIRLRASHHQQWAALYR LVDTTPFSEEKTLPFSPLDFQLSGAQLVSWHPSAQGGELALKTLWEAVPSAFTRLAERNV SVSRFSLSVEGDDLLFTLQLETPHEG >gi|296493114|gb|ADTK01000387.1| GENE 15 16315 - 16719 244 134 aa, chain + ## HITS:1 COG:no KEGG:SDY_3687 NR:ns ## KEGG: SDY_3687 # 
Name: yrfA # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 134 14 147 147 253 100.0 1e-66 MRVKRWLLAGIALCLLTGMRDPFKPPEDLCRISELSQWRYQGMVGRGERIIGVIKDGQKK WRRVQQNDVLENGWTILQLTPDVLTLGTGTNCEPPQWLWQRQGDTNEAMDSRTTVDADTR RTGGKAAKSDADGG >gi|296493114|gb|ADTK01000387.1| GENE 16 16631 - 17869 1143 412 aa, chain + ## HITS:1 COG:hofQ KEGG:ns NR:ns ## COG: hofQ COG4796 # Protein_GI_number: 16131268 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component HofQ # Organism: Escherichia coli K12 # 1 412 1 412 412 736 99.0 0 MKQWIAALLLMLIPGVQAAKPQKVTLMVDDVPVAQVLQALAEQEKLNLVVSPDVSGTVSL HLTDVPWKQALQTVVKSAGLITRQEDNILSVHSIAWQNNNIARQEAEQARAQANLPLENR SITLQYADAGELAKAGEKLLSAKGSMTVDKRTNRLLLRDNKTALSALEQWVAQMDLPVGQ VELSAHIVTINEKSLRELGVKWTLADAQHAGGVGQVTTLGSDLSVATATTHVGFNIGRIN GRLLDLELSALEQKQQLDIIASPRLLASHLQPASIKQGSEIPYQVSSGESGATSVEFKEA VLGMEVTPTVLQKGRIRLKLHISQNVPGQVLQQADGEVLAIDKQEIETQVEVKSGETLAL GGIFTRKNKSGQDSVPLLGDIPWFGQLFRHDGKEDERRELVVFITPRLVSSE >gi|296493114|gb|ADTK01000387.1| GENE 17 18270 - 18791 544 173 aa, chain + ## HITS:1 COG:ECs4232 KEGG:ns NR:ns ## COG: ECs4232 COG0703 # Protein_GI_number: 15833486 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Escherichia coli O157:H7 # 1 173 68 240 240 321 100.0 4e-88 MAEKRNIFLVGPMGAGKSTIGRQLAQQLNMEFYDSDQEIEKRTGADVGWVFDLEGEEGFR DREEKVINELTEKQGIVLATGGGSVKSRETRNRLSARGVVVYLETTIEKQLARTQRDKKR PLLHVETPPREVLEALANERNPLYEEIADVTIRTDDQSAKVVANQIIHMLESN >gi|296493114|gb|ADTK01000387.1| GENE 18 18848 - 19936 876 362 aa, chain + ## HITS:1 COG:ECs4231 KEGG:ns NR:ns ## COG: ECs4231 COG0337 # Protein_GI_number: 15833485 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate synthetase # Organism: Escherichia coli O157:H7 # 1 362 1 362 362 692 99.0 0 MERIVVTLGERSYPITIASGLFNEPASFLPLKSGEQVMLVTNETLAPLYLDKVRGVLEQA GVNVDSVILPDGEQYKSLAVLDTVFTALLQKPHGRDTTLVALGGGVVGDLTGFAAASYQR 
GVRFIQVPTTLLSQVDSSVGGKTAVNHPLGKNMIGAFYQPASVVVDLDCLKTLPPRELAS GLAEVIKYGIILDGAFFNWLEENLDALLRLDGPAMAYCIRRCCELKAEVVAADERETGLR ALLNLGHTFGHAIEAEMGYGNWLHGEAVAAGMVMAARTSERLGQFSSAETQRIITLLTRA GLPVNGPREMSAQAYLPHMLRDKKVLAGEMRLILPLAIGKSEVRSGVSHELVLNAIADCQ SA >gi|296493114|gb|ADTK01000387.1| GENE 19 20028 - 21314 1045 428 aa, chain + ## HITS:1 COG:ECs4230 KEGG:ns NR:ns ## COG: ECs4230 COG3266 # Protein_GI_number: 15833484 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 428 1 428 428 484 99.0 1e-136 MDEFKPEDELKPDPSDRRTGRSRQSSERSERTERGEPQINFDDIELDDTDDRRPTRAQKE RNEEPEIEEEIDESEDETVDEERVERRPRKRKKAASKPASRQYMMMGVGILVLLLLIIGI GSALKAPSTTSSDQTASGEKSIDLAGNATDQANGVQPAPGTTSAENTQQDVSLPPISSTP TQGQTPAATDGQQRVEVQGDLNNALTQPQNQQQLNNVAVNSTLPTEPATVAPVRNGNASR ETAKTQTAERPATTRPARQQAVIEPKKPQATVKTEPKPVAQTPKRTEPAAPVASTKAPAA TSTPAPKETATTAPVQTASPAQTTATPAAGGKTAGNVGSLKSAPSSHYTLQLSSSSNYDN LNGWAKKENLKNYVVYETTRNGQPWYVLVSGVYASKEEAKKAVSTLPADVQAKNPWAKPL RQVQADLK Prediction of potential genes in microbial genomes Time: Mon May 16 16:20:47 2011 Seq name: gi|296493113|gb|ADTK01000388.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1253.9, whole genome shotgun sequence Length of sequence - 3451 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 + CDS 76 - 711 510 ## COG0338 Site-specific DNA methylase 2 1 Op 2 9/0.000 + CDS 729 - 1406 815 ## COG0036 Pentose-5-phosphate-3-epimerase 3 1 Op 3 6/0.000 + CDS 1399 - 2157 942 ## COG0546 Predicted phosphatases 4 1 Op 4 . 
+ CDS 2150 - 3154 1157 ## COG0180 Tryptophanyl-tRNA synthetase + Term 3280 - 3313 3.8 Predicted protein(s) >gi|296493113|gb|ADTK01000388.1| GENE 1 76 - 711 510 211 aa, chain + ## HITS:1 COG:ECs4229 KEGG:ns NR:ns ## COG: ECs4229 COG0338 # Protein_GI_number: 15833483 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Escherichia coli O157:H7 # 1 211 68 278 278 422 99.0 1e-118 MRTDEYVQAARELFVPETNCAEVYYQFREEFNKSQDPFRRAVLFLYLNRYGYNGLCRYNL RGEFNVPFGRYKKPYFPEAELYHFAEKAQNAFFYCESYADSMARADDASVVYCDPPYAPL SATANFTAYHTNSFTLEQQAHLAEIAEGLVDRHIPVLISNHDTMLTREWYQRAKLHVVKV RRSISSNGGTRKKVDELLALYKPGVVSPAKK >gi|296493113|gb|ADTK01000388.1| GENE 2 729 - 1406 815 225 aa, chain + ## HITS:1 COG:ECs4228 KEGG:ns NR:ns ## COG: ECs4228 COG0036 # Protein_GI_number: 15833482 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Escherichia coli O157:H7 # 1 225 1 225 225 441 100.0 1e-124 MKQYLIAPSILSADFARLGEDTAKALAAGADVVHFDVMDNHYVPNLTIGPMVLKSLRNYG ITAPIDVHLMVKPVDRIVPDFAAAGASIITFHPEASEHVDRTLQLIKENGCKAGLVFNPA TPLSYLDYVMDKLDVILLMSVNPGFGGQSFIPQTLDKLREVRRRIDESGFDIRLEVDGGV KVNNIGEIAAAGADMFVAGSAIFDQPDYKKVIDEMRSELAKVSHE >gi|296493113|gb|ADTK01000388.1| GENE 3 1399 - 2157 942 252 aa, chain + ## HITS:1 COG:gph KEGG:ns NR:ns ## COG: gph COG0546 # Protein_GI_number: 16131263 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Escherichia coli K12 # 1 252 1 252 252 491 99.0 1e-139 MNKFEDIRGVAFDLDGTLVDSAPGLAAAVDMALYALELPIAGEERVITWIGNGADVLMER ALTWARQERATQRKTMGKPPVDDDIPAEEQVRILRKLFDRYYSEVAEEGTFLFPHVADTL GALQAKGLPLGLVTNKPTPFVAPLLEALDIAKYFSVVIGGDDVQNKKPHPDPLLLVAERM GIAPQQMLFVGDSRNDIQAAKAAGCPSVGLTYGYNYGEAIDLSQPDVIYQSINDLLPALG LPHSENQESKND >gi|296493113|gb|ADTK01000388.1| GENE 4 2150 - 3154 1157 334 aa, chain + ## HITS:1 COG:trpS KEGG:ns NR:ns ## COG: trpS COG0180 # Protein_GI_number: 16131262 # Func_class: J Translation, ribosomal structure and biogenesis # 
Function: Tryptophanyl-tRNA synthetase # Organism: Escherichia coli K12 # 1 334 1 334 334 683 99.0 0 MTKPIVFSGAQPSGELTIGNYMGALRQWVNMQDDYHCIYCIVDQHAITVRQDAQKLRKAT LDTLALYLACGIDPEKSTIFVQSHVPEHAQLGWALNCYTYFGELSRMTQFKDKSARYAEN INAGLFDYPVLMAADILLYQTNLVPVGEDQKQHLELSRDIAQRFNALYGEIFKVPEPFIP KSGARVMSLLEPTKKMSKSDDNRNNVIGLLEDPKSVVKKIKRAVTDSDEPPVVRYDVQNK AGVSNLLDILSAVTGQSIPELEKQFEGKMYGHLKGEVADAVSGMLTELQERYHRFRNDEA FLQQVMKDGAEKASVHASRTLKAVYEAIGFVAKP Prediction of potential genes in microbial genomes Time: Mon May 16 16:20:50 2011 Seq name: gi|296493112|gb|ADTK01000389.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1253.10, whole genome shotgun sequence Length of sequence - 7439 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 2, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 48 - 869 650 ## EC55989_3788 hypothetical protein 2 1 Op 2 . + CDS 886 - 1248 364 ## SSON_3513 hypothetical protein + Prom 1253 - 1312 1.5 3 2 Op 1 2/0.000 + CDS 1332 - 2495 1256 ## COG3457 Predicted amino acid racemase 4 2 Op 2 1/0.000 + CDS 2495 - 3721 1242 ## COG1015 Phosphopentomutase 5 2 Op 3 . + CDS 3718 - 4596 936 ## COG1735 Predicted metal-dependent hydrolase with the TIM-barrel fold 6 2 Op 4 . + CDS 4607 - 4960 527 ## SBO_3365 hypothetical protein 7 2 Op 5 . + CDS 4972 - 6276 1424 ## ECO103_4095 hypothetical protein 8 2 Op 6 . 
+ CDS 6288 - 7373 948 ## JW3339 conserved hypothetical protein Predicted protein(s) >gi|296493112|gb|ADTK01000389.1| GENE 1 48 - 869 650 273 aa, chain + ## HITS:1 COG:no KEGG:EC55989_3788 NR:ns ## KEGG: EC55989_3788 # Name: yhfZ # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 273 29 301 301 548 99.0 1e-154 MKTIDELATECRSSVGLTQAALKTLESSGAIRIERRGRNGSYLVEMDNKALLSHVDINNV VCAMPLPYTRLYEGLASGLKAQFDGIPFYYAHMRGADIRVECLLNGVYDMAVVSRLAAES YLTQKGLCLALELGPHTYVGEHQLICRKGESANVKRVGLDNRSADQKIMTDVFFGGSDVE RVDLSYHESLQRIVKGDVDAVIWNVVAENELTMLGLEATPLTDDPRFLQATEAVVLTRVD DYPMQQLLRAVVDKHALLAHQQRVVSGEQEPSY >gi|296493112|gb|ADTK01000389.1| GENE 2 886 - 1248 364 120 aa, chain + ## HITS:1 COG:no KEGG:SSON_3513 NR:ns ## KEGG: SSON_3513 # Name: yhfY # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 120 15 134 134 220 100.0 1e-56 METRLNLLCEAGVIDKDICKGMMQVVNVLETECHLPVRSEQGTMAMTHMASALMRSRRGE EIEPLDNELLAELAQSSHWQAVVQLHQVLLKEFALEVNPCEEGYLLANLYGLWMAANEEV >gi|296493112|gb|ADTK01000389.1| GENE 3 1332 - 2495 1256 387 aa, chain + ## HITS:1 COG:yhfX KEGG:ns NR:ns ## COG: yhfX COG3457 # Protein_GI_number: 16131259 # Func_class: E Amino acid transport and metabolism # Function: Predicted amino acid racemase # Organism: Escherichia coli K12 # 1 387 1 387 387 763 95.0 0 MFVEALKRQNPALISAALSLWQQGKIAPDSWVIDVDQVLENGKRLIETARLYGIELYLMT KQFGRNPWLAEKLLALGYSGIVAVDYKEARVMRRAGLSVAHQGHLVQIPCHQVADAVEQG TDVITVFTLDKAREVSAAAVKAGRVQFVLLKVYSDDDFLYPGQESGFVQHSLHEVVAEIK KLPGLHLAGLTHFPCLLWDEAAGKVLPTPNLHTLIQARDQLAKSGIALEQLNAPSATSCT SLPLLAEYGVTHAEPGHALTGTIPANQQGDQPERIAMLWLSEISHHFRGDSYCYGGGYYR RGHAQHALVFTPENQKITETNLKTVDDSSIDYTLPLAGEYPVSSAVVLCFRTQIFVTRSD VVLVSGIHRGEPKIVGRYDSLGNSLGA >gi|296493112|gb|ADTK01000389.1| GENE 4 2495 - 3721 1242 408 aa, chain + ## HITS:1 COG:yhfW KEGG:ns NR:ns ## COG: yhfW COG1015 # Protein_GI_number: 16131258 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphopentomutase # Organism: Escherichia coli K12 # 1 408 1 408 408 812 98.0 0 
MARFVVLVIDSFGVGAMKDVTLVRPQDAGANTCGHILSQLPHLQLPTLEKLGLINALGYA PGDMQPSDSATWGVAELQHEGGDTFMGHQEILGTRPLPPLRMPFRDVIGRVEQALVSAGW QVERRGDDLQFLWVNQAVAIGDNLEADLGQVYNITANLSVISFDDAIKIGRIVREQVQVG RVITFGGLLTDSQRILDAAESKEGRFIGINAPRSGAYDNGFQVVHMGYGVDEKVQVPQKL YEAGVPTVLVGKVADIVSNPYGVSWQNLVDSQRIMDITLDEFNTHPTAFICINIQETDLA GHAEDVARYAERLQVVDRNLARLVEAMQPDDCLVVMADHGNDPTIGHSHHTREVVPVLVY QQGLVHTQLGVRTTLSDVGATVCEFFRAPPPQNGRSFLSSLRFAGDTL >gi|296493112|gb|ADTK01000389.1| GENE 5 3718 - 4596 936 292 aa, chain + ## HITS:1 COG:php KEGG:ns NR:ns ## COG: php COG1735 # Protein_GI_number: 16131257 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Escherichia coli K12 # 1 292 1 292 292 588 98.0 1e-168 MSIDPTGYTLAHEHLHIDLSGFKNNVDCRLDQYAFICQEMNDLMTRGVRNVIEMTNRYMG RNAQFMLDVMRETGINVVACTGYYQDAFFPEHVATRSVQELAQEMVDEIEQGIDGTDLKA GIIAEIGSSEGKITPLEEKVFIAAALAHNQTGRPISTHTSFSTMGLEQLALLQAHGVDLS RVTVGHCDLKDNLDNILKMIDLGAYVQFDTIGKNSYYPDEKRIAMLHALRDRGLLNRVML SMDITRRSHLKANGGYGYDYLLTTFIPQLRQSGFSQADVDVMLRENPSQFFQ >gi|296493112|gb|ADTK01000389.1| GENE 6 4607 - 4960 527 117 aa, chain + ## HITS:1 COG:no KEGG:SBO_3365 NR:ns ## KEGG: SBO_3365 # Name: yhfU # Def: hypothetical protein # Organism: S.boydii # Pathway: not_defined # 1 117 14 130 130 211 99.0 6e-54 MKKIGVAGLQREQIKKNIEATAPGSFEVFIHNDMEAAMKVKSGQLDYYIGACNTGAGAAL SIAIAVIGYNKSCTIAKPGIKAKDEHIAKMIAEGKVAFGLSVEHVEHAIPMLINHLK >gi|296493112|gb|ADTK01000389.1| GENE 7 4972 - 6276 1424 434 aa, chain + ## HITS:1 COG:no KEGG:ECO103_4095 NR:ns ## KEGG: ECO103_4095 # Name: yhfT # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 434 1 434 434 729 100.0 0 MDLYIQIIVVACLTGMTSLLAHRSAAVFHDGIRPILPQLIEGYMNRREAGSIAFGLSIGF VASVGISFTLKTGLLNAWLLFLPTDILGVLAINSLMAFGLGAIWGVLILTCLLPVNQLLT ALPVDVLGSLGELSSPVVSAFALFPLVAIFYQFGWKQSLVAAVVVLMTRVVVVRYFPHLN PESIEIFIGMVMLLGIAITHDLRHRDENDIDASGLSVFEERTSRIIKNLPYIAIVGALIA AVASMKIFAGSEVSIFTLEKAYSAGVTPEQSQTLINQAALAEFMRGLGFVPLIATTALAT 
GVYAVAGFTFVYAVGYLSPNPMVAAVLGAVVISAEVLLLRSIGKWLGRYPSVRNASDNIR NAMNMLMEVALLVGSIFAAIKMAGYTGFSIAVAIYFLNESLGRPVQKMAAPVVAVMITGI LLNVLYWLGLFVPA >gi|296493112|gb|ADTK01000389.1| GENE 8 6288 - 7373 948 361 aa, chain + ## HITS:1 COG:no KEGG:JW3339 NR:ns ## KEGG: JW3339 # Name: yhfS # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 361 1 361 361 659 99.0 0 MKTFPLQSLTIIEAQQKQFALVDSICRHFPGSEFLTGGDLGLTPGLNQPRVTQRVEQVLA DAFHAQAAALVQGAGTGAIRAGLAALLKPGQRLLVHDAPVYPTTRVIIEQMGLTLITVDF NDLSALKQVVDAQQPDAALVQHTRQQPQDSYVLADVLATLRAAGVPALTDDNYAVMKVAR IGCECGANVSTFSCFKLFGPEGVGAVVGDADVINRIRATLYSGGSQIQGAQALEVLRGLV FAPVMHAVQAGVSERLLALLNGGAVPEVKSAVIANAQSKVLIVEFHQPIAARVLEEAQKR GALPYPVGAESKYEIPPLFYRLSGTFRQANPQSEHCAIRINPNRSGEETVLRILRESIAS I Prediction of potential genes in microbial genomes Time: Mon May 16 16:21:13 2011 Seq name: gi|296493111|gb|ADTK01000390.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1253.11, whole genome shotgun sequence Length of sequence - 12991 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 7, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 1/1.000 - CDS 31 - 762 641 ## COG2188 Transcriptional regulators - Prom 788 - 847 3.6 - Term 814 - 855 9.5 2 2 Op 1 5/1.000 - CDS 862 - 1647 871 ## COG0524 Sugar kinases, ribokinase family 3 2 Op 2 2/1.000 - CDS 1644 - 2474 923 ## COG1082 Sugar phosphate isomerases/epimerases 4 2 Op 3 2/1.000 - CDS 2524 - 3546 1204 ## COG2222 Predicted phosphosugar isomerases 5 2 Op 4 . - CDS 3567 - 4904 1449 ## COG0531 Amino acid transporters - Prom 4977 - 5036 4.8 - Term 5122 - 5157 5.5 6 3 Tu 1 . 
- CDS 5199 - 5366 246 ## LF82_3285 uncharacterized protein YhfL - Prom 5400 - 5459 7.6 - Term 5574 - 5603 2.1 7 4 Op 1 3/1.000 - CDS 5613 - 6986 1375 ## COG0007 Uroporphyrinogen-III methylase 8 4 Op 2 3/1.000 - CDS 7005 - 7811 1006 ## COG2116 Formate/nitrite family of transporters - Prom 7834 - 7893 12.2 - Term 7899 - 7929 3.4 9 5 Op 1 14/0.000 - CDS 7939 - 8265 466 ## COG2146 Ferredoxin subunits of nitrite reductase and ring-hydroxylating dioxygenases 10 5 Op 2 4/1.000 - CDS 8262 - 10805 2959 ## COG1251 NAD(P)H-nitrite reductase - Prom 10898 - 10957 5.0 - Term 11020 - 11058 7.0 11 6 Tu 1 . - CDS 11067 - 12248 1267 ## COG0477 Permeases of the major facilitator superfamily - Prom 12355 - 12414 7.3 + Prom 12381 - 12440 5.8 12 7 Tu 1 . + CDS 12519 - 12990 390 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family Predicted protein(s) >gi|296493111|gb|ADTK01000390.1| GENE 1 31 - 762 641 243 aa, chain - ## HITS:1 COG:ECs4225 KEGG:ns NR:ns ## COG: ECs4225 COG2188 # Protein_GI_number: 15833479 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 243 23 265 265 501 99.0 1e-142 MSATDRYSHQLLYATVRQRLLDDIAQGVYQAGQQIPTENELCTQYNVSRITIRKAISDLV ADGVLIRWQGKGTFVQSQKVENALLTVSGFTDFGVSQGKATKEKVIEQERVSAAPFCEKL NIPGNSEVFHLCRVMYLDKEPLFIDSSWIPLSRYPDFDEIYVEGSSTYQLFQERFDTRVV SDKKTIDIFAATRPQAKWLKCELGEPLFRISKIAFDQNDKPVHVSELFCRANRITLTIDN KRH >gi|296493111|gb|ADTK01000390.1| GENE 2 862 - 1647 871 261 aa, chain - ## HITS:1 COG:yhfQ KEGG:ns NR:ns ## COG: yhfQ COG0524 # Protein_GI_number: 16131252 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli K12 # 1 261 1 261 261 519 99.0 1e-147 MKTLATIGDNCVDIYPQLNKAFSGGNAVNVAVYCTRYGIQPGCITWVGDDDYGTKLKQDL ARMGVDISHVHTKHGVTAQTQVELHDNDRVFGDYTEGVMADFALSEEDYAWLAQYDIVHA AIWGHAEDAFPQLHAAGKLTAFDFSDKWDSPLWQTLVPHLDFAFASAPQEDETLRLKMKA IVARGAGTVIVTLGENGSIAWDGAQFWRQAPEPVTVIDTMGAGDSFIAGFLCGWSAGMTL 
PQAMAQGTACAAKTIQYHGAW >gi|296493111|gb|ADTK01000390.1| GENE 3 1644 - 2474 923 276 aa, chain - ## HITS:1 COG:ECs4223 KEGG:ns NR:ns ## COG: ECs4223 COG1082 # Protein_GI_number: 15833477 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Escherichia coli O157:H7 # 1 275 1 275 275 563 99.0 1e-161 MKTGMFTCGHQRLPIEHAFRDASELGYDGIEIWGGRPHAFAPDLKAGGIKQIKALAQTYQ MPIIGYTPEINGYPYNMMLGDEHMRRESLDMIKLAMDMAKEMNAGYTLISAAHAGYLTPP NVIWGRLAENLSELCEYAENIGMDLILEPLTPYESNVVCNANDVLHALALVPSPRLFSMV DICAPYVQAEPVMSYFDKLGDKLRHLHIVDSDGASDTHYIPGEGKMPLRELMRDIIDRGY EGYCTVELVTMYMNEPRLYARQALERFRALLPEDER >gi|296493111|gb|ADTK01000390.1| GENE 4 2524 - 3546 1204 340 aa, chain - ## HITS:1 COG:yhfN KEGG:ns NR:ns ## COG: yhfN COG2222 # Protein_GI_number: 16131249 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted phosphosugar isomerases # Organism: Escherichia coli K12 # 1 340 8 347 347 725 100.0 0 MLDIDKSTVDFLVTENMVQEVEKVLSHDVPLVHAIVEEMVKRDIDRIYFVACGSPLNAAQ TAKHLADRFSDLQVYAISGWEFCDNTPYRLDDRCAVIGVSDYGKTEEVIKALELGRACGA LTAAFTKRADSPITSAAEFSIDYQADCIWEIHLLLCYSVVLEMITRLAPNAEIGKIKNDL KQLPNALGHLVRTWEEKGRQLGELASQWPMIYTVAAGPLRPLGYKEGIVTLMEFTWTHGC VIESGEFRHGPLEIVEPGVPFLFLLGNDESRHTTERAINFVKQRTDNVIVIDYAEISQGL HPWLAPFLMFVPMEWLCYYLSIYKDHNPDERRYYGGLVEY >gi|296493111|gb|ADTK01000390.1| GENE 5 3567 - 4904 1449 445 aa, chain - ## HITS:1 COG:yhfM KEGG:ns NR:ns ## COG: yhfM COG0531 # Protein_GI_number: 16131248 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 445 18 462 462 781 99.0 0 MGSQELQRKLGFWAVLAIAVGTTVGSGIFVSVGEVAKAAGTPWLTVLAFVIGGLIVIPQM CVYAELSTAYPENGADYVYLKNAGSRPLAFLSGWASFWANDAPSLSIMALAIVSNLGFLT PIDPLLGKFIAAGLIIAFMLLHLRSVEGGATFQTLITIAKIIPFTIVIGLGIFWFKAENF AAPTTTAIGATGSFMALLAGISATSWSYTGMASICYMTGEIKNPGKTMPRALIGSCLLVL VLYTLLALVISGLMPFDKLANSETPISDALTWIPALGSTAGIFVAITAMIVILGSLSSCV MYQPRLEYAMAKDNLFFKCFGHVHPKYNTPDVSIILQGALGIFFIFVSDLTSLLGYFTLV 
MCFKNTLTFGSIIWCRKRDDYKPLWRTPAFGLMTTLAIASSLILVASTFVWAPIPGLICA VIVIATGLPAYAFWAKRSRQLNALS >gi|296493111|gb|ADTK01000390.1| GENE 6 5199 - 5366 246 55 aa, chain - ## HITS:1 COG:no KEGG:LF82_3285 NR:ns ## KEGG: LF82_3285 # Name: yhfL # Def: uncharacterized protein YhfL # Organism: E.coli_LF82 # Pathway: not_defined # 1 55 1 55 55 107 100.0 1e-22 MNKFIKVALVGAVLATLTACTGHIENRDKNCSYDYLLHPAISISKIIGGCGPTAQ >gi|296493111|gb|ADTK01000390.1| GENE 7 5613 - 6986 1375 457 aa, chain - ## HITS:1 COG:cysG_2 KEGG:ns NR:ns ## COG: cysG_2 COG0007 # Protein_GI_number: 16131246 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III methylase # Organism: Escherichia coli K12 # 211 457 1 247 247 503 100.0 1e-142 MDHLPIFCQLRDRDCLIVGGGDVAERKARLLLDAGARLTVNALAFIPQFTAWADAGMLTL VEGPFDESLLDTCWLAIAATDDDALNQRVSEAAEARRIFCNVVDAPKAASFIMPSIIDRS PLMVAVSSGGTSPVLARLLREKLESLLPLHLGQVAKYAGQLRGRVKQQFATMGERRRFWE KLFVNDRLAQSLANNDQKAITETTEQLINEPLDHRGEVVLVGAGPGDAGLLTLKGLQQIQ QADVVVYDRLVSDDIMNLVRRDADRVFVGKRAGYHCVPQEEINQILLREAQKGKRVVRLK GGDPFIFGRGGEELETLCNAGIPFSVVPGITAASGCSAYSGIPLTHRDYAQSVRLITGHL KTGGELDWENLAAEKQTLVFYMGLNQAATIQQKLIEHGMPGEMPVAIVENGTAVTQRVID GTLTQLGELAQQMNSPSLIIIGRVVGLRDKLNWFSNH >gi|296493111|gb|ADTK01000390.1| GENE 8 7005 - 7811 1006 268 aa, chain - ## HITS:1 COG:nirCm KEGG:ns NR:ns ## COG: nirCm COG2116 # Protein_GI_number: 16132233 # Func_class: P Inorganic ion transport and metabolism # Function: Formate/nitrite family of transporters # Organism: Escherichia coli K12 # 1 268 1 268 268 468 100.0 1e-132 MFTDTINKCAANAARIARLSANNPLGFWVSSAMAGAYVGLGIILIFTLGNLLDPSVRPLV MGATFGIALTLVIIAGSELFTGHTMFLTFGVKAGSISHGQMWAILPQTWLGNLVGSVFVA MLYSWGGGSLLPVDTSIVHSVALAKTTAPAMVLFFKGALCNWLVCLAIWMALRTEGAAKF IAIWWCLLAFIASGYEHSIANMTLFALSWFGNHSEAYTLAGIGHNLLWVTLGNTLSGAVF MGLGYWYATPKANRPVADKFNQTETAAG >gi|296493111|gb|ADTK01000390.1| GENE 9 7939 - 8265 466 108 aa, chain - ## HITS:1 COG:ECs4217 KEGG:ns NR:ns ## COG: ECs4217 COG2146 # Protein_GI_number: 15833471 # Func_class: P Inorganic ion transport and 
metabolism; R General function prediction only # Function: Ferredoxin subunits of nitrite reductase and ring-hydroxylating dioxygenases # Organism: Escherichia coli O157:H7 # 1 108 1 108 108 223 100.0 8e-59 MSQWKDICKIDDILPETGVCALLGDEQVAIFRPYHSDQVFAISNIDPFFESSVLSRGLIA EHQGELWVASPLKKQRFRLSDGLCMEDEQFSVKHYEARVKDGVVQLRG >gi|296493111|gb|ADTK01000390.1| GENE 10 8262 - 10805 2959 847 aa, chain - ## HITS:1 COG:nirB KEGG:ns NR:ns ## COG: nirB COG1251 # Protein_GI_number: 16131244 # Func_class: C Energy production and conversion # Function: NAD(P)H-nitrite reductase # Organism: Escherichia coli K12 # 1 847 1 847 847 1739 99.0 0 MSKVRLAIIGNGMVGHRFIEDLLDKSDAANFDITVFCEEPRIAYDRVHLSSYFSHHTAEE LSLVREGFYEKHGIKVLVGERAITINRQEKVIHSSAGRTVFYDKLIMATGSYPWIPPIKG SDTQDCFVYRTIEDLNAIESCARRSKRGAVVGGGLLGLEAAGALKNLGIETHVIEFAPML MAEQLDQMGGEQLRRKIESMGVRVHTSKNTLEIVQEGVEARKTMRFADGSELEVDFIVFS TGIRPRDKLATQCGLDVAPRGGIVINDSCQTSDPDIYAIGECASWNNRVFGLVAPGYKMA QVAVDHILGSENAFEGADLSAKLKLLGVDVGGIGDAHGRTPGARSYVYLDESKEIYKRLI VSEDNKTLLGAVLVGDTSDYGNLLQLVLNAIELPENPDSLILPAHSGSGKPSIGVDKLPD SAQICSCFDVTKGDLIAAINKGCHTVAALKAETKAGTGCGGCIPLVTQVLNAELAKQGIE VNNNLCEHFAYSRQELFHLIRVEGIKTFEELLAKHGKGYGCEVCKPTVGSLLASCWNEYI LKPEHTPLQDSNDNFLANIQKDGTYSVIPRSPGGEITPEGLMAVGRIAREFNLYTKITGS QRLAMFGAQKDDLPEIWRQLIEAGFETGHAYAKALRMAKTCVGSTWCRYGVGDSVGLGVE LENRYKGIRTPHKMKFGVSGCTRECSEAQGKDVGIIATEKGWNLYVCGNGGMKPRHADLL AADIDRETLIKYLDRFMMFYIRTADKLTRTAPWLENLEGGIDYLKAVIIDDKLGLNAHLE EEMARLREAVVCEWTETVNTPSAQTRFKHFINSDKRDPNVQMVPEREQHRPATPYERIPV TLVEDNA >gi|296493111|gb|ADTK01000390.1| GENE 11 11067 - 12248 1267 393 aa, chain - ## HITS:1 COG:yhfC KEGG:ns NR:ns ## COG: yhfC COG0477 # Protein_GI_number: 16131243 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 393 1 393 393 684 100.0 0 
MTNSNRIKLTWISFLSYALTGALVIVTGMVMGNIADYFNLPVSSMSNTFTFLNAGILISI FLNAWLMEIVPLKTQLRFGFLLMVLAVAGLMFSHSLALFSAAMFILGVVSGITMSIGTFL VTQMYEGRQRGSRLLFTDSFFSMAGMIFPMIAAFLLARSIEWYWVYACIGLVYVAIFILT FGCEFPALGKHAPKTDAPVEKEKWGIGVLFLSVAALCYILGQLGFISWVPEYAKGLGMSL NDAGTLVSNFWMSYMVGMWAFSFILRFFDLQRILTVLAGLAAILMYVFNTGTPAHMAWSI LALGFFSSAIYTTIITLGSQQTKVPSPKLVNFVLTCGTIGTMLTFVVTGPIVEHSGPQAA LLTANGLYAVVFVMCFLLGFVSRHRQHNTLTSH >gi|296493111|gb|ADTK01000390.1| GENE 12 12519 - 12990 390 157 aa, chain + ## HITS:1 COG:ECs4214 KEGG:ns NR:ns ## COG: ECs4214 COG0652 # Protein_GI_number: 15833468 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Escherichia coli O157:H7 # 1 157 1 157 190 300 100.0 9e-82 MFKSTLAAMAAVFALSALSPAAMAAKGDPHVLLTTSAGNIELELDKQKAPVSVQNFVDYV NSGFYNNTTFHRVIPGFMIQGGGFTEQMQQKKPNPPIKNEADNGLRNTRGTIAMARTADK DSATSQFFINVADNAFLDHGQRDFGYAVFGKVVKGMD Prediction of potential genes in microbial genomes Time: Mon May 16 16:21:20 2011 Seq name: gi|296493110|gb|ADTK01000391.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1253.12, whole genome shotgun sequence Length of sequence - 15065 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 9, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 24 - 191 136 ## SDY_3524 hypothetical protein 2 1 Op 2 4/0.833 + CDS 181 - 783 755 ## COG2184 Protein involved in cell division 3 1 Op 3 5/0.500 + CDS 815 - 1378 498 ## COG0512 Anthranilate/para-aminobenzoate synthases component II + Prom 1380 - 1439 5.6 4 1 Op 4 . + CDS 1464 - 2684 1298 ## COG4992 Ornithine/acetylornithine aminotransferase + Term 2709 - 2750 3.2 5 2 Op 1 4/0.833 - CDS 2751 - 4841 1695 ## COG1289 Predicted membrane protein - Term 4843 - 4880 8.2 6 2 Op 2 . 
- CDS 4892 - 5524 833 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Prom 5674 - 5733 1.8 7 3 Tu 1 . + CDS 5826 - 6230 457 ## COG1765 Predicted redox protein, regulator of disulfide bond formation + Term 6241 - 6296 2.7 - Term 6178 - 6209 -0.8 8 4 Tu 1 6/0.333 - CDS 6285 - 7154 846 ## COG3954 Phosphoribulokinase 9 5 Op 1 5/0.500 - CDS 7208 - 7426 333 ## COG3089 Uncharacterized protein conserved in bacteria 10 5 Op 2 3/1.000 - CDS 7420 - 8382 849 ## COG0429 Predicted hydrolase of the alpha/beta-hydrolase fold 11 5 Op 3 . - CDS 8442 - 10355 2353 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 10459 - 10518 3.4 + Prom 10345 - 10404 4.7 12 6 Op 1 7/0.000 + CDS 10486 - 11037 442 ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B) 13 6 Op 2 3/1.000 + CDS 11037 - 12842 1056 ## PROTEIN SUPPORTED gi|229845962|ref|ZP_04466074.1| 30S ribosomal protein S2 14 6 Op 3 4/0.833 + CDS 12852 - 13052 235 ## COG3529 Predicted nucleic-acid-binding protein containing a Zn-ribbon domain + Term 13090 - 13131 0.7 + Prom 13059 - 13118 3.9 15 7 Tu 1 . + CDS 13147 - 13737 729 ## COG1047 FKBP-type peptidyl-prolyl cis-trans isomerases 2 + Term 13759 - 13787 2.1 - Term 13734 - 13782 10.2 16 8 Tu 1 . - CDS 13786 - 14004 310 ## COG2900 Uncharacterized protein conserved in bacteria - Prom 14112 - 14171 3.3 + Prom 14074 - 14133 3.4 17 9 Tu 1 . 
+ CDS 14225 - 15037 938 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 Predicted protein(s) >gi|296493110|gb|ADTK01000391.1| GENE 1 24 - 191 136 55 aa, chain + ## HITS:1 COG:no KEGG:SDY_3524 NR:ns ## KEGG: SDY_3524 # Name: yhfG # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 55 1 55 55 72 100.0 4e-12 MKKLTDKQKSRLWELQRNRNFQASRRLEGVEMPLVTLTAAEALARLEELRSHYER >gi|296493110|gb|ADTK01000391.1| GENE 2 181 - 783 755 200 aa, chain + ## HITS:1 COG:fic KEGG:ns NR:ns ## COG: fic COG2184 # Protein_GI_number: 16131240 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Protein involved in cell division # Organism: Escherichia coli K12 # 1 200 1 200 200 398 100.0 1e-111 MSDKFGEGRDPYLYPGLDIMRNRLNIRQQQRLEQAAYEMTALRAATIELGPLVRGLPHLR TIHRQLYQDIFDWAGQLREVDIYQGDTPFCHFAYIEKEGNALMQDLEEEGYLVGLEKAKF VERLAHYYCEINVLHPFRVGSGLAQRIFFEQLAIHAGYQLSWQGIEKEAWNQANQSGAMG DLTALQMIFSKVVSEAGESE >gi|296493110|gb|ADTK01000391.1| GENE 3 815 - 1378 498 187 aa, chain + ## HITS:1 COG:pabA KEGG:ns NR:ns ## COG: pabA COG0512 # Protein_GI_number: 16131239 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component II # Organism: Escherichia coli K12 # 1 187 1 187 187 397 99.0 1e-111 MILLIDNYDSFTWNLYQYFCELGADVLVKRNDALTLADIDALKPQKIVISPGPCTPDEAG ISLDVIRHYAGRLPILGVCLGHQAMAQAFGGKVVRAAKVMHGKTSPITHNGEGVFRGLAN PLTVTRYHSLVVEPGSLPACFDVTAWSETREIMGIRHRQWDLEGVQFHPESILSEQGHQL LANFLHR >gi|296493110|gb|ADTK01000391.1| GENE 4 1464 - 2684 1298 406 aa, chain + ## HITS:1 COG:argD KEGG:ns NR:ns ## COG: argD COG4992 # Protein_GI_number: 16131238 # Func_class: E Amino acid transport and metabolism # Function: Ornithine/acetylornithine aminotransferase # Organism: Escherichia coli K12 # 1 406 1 406 406 822 100.0 0 MAIEQTAITRATFDEVILPIYAPAEFIPVKGQGSRIWDQQGKEYVDFAGGIAVTALGHCH PALVNALKTQGETLWHISNVFTNEPALRLGRKLIEATFAERVVFMNSGTEANETAFKLAR 
HYACVRHSPFKTKIIAFHNAFHGRSLFTVSVGGQPKYSDGFGPKPADIIHVPFNDLHAVK AVMDDHTCAVVVEPIQGEGGVTAATPEFLQGLRELCDQHQALLVFDEVQCGMGRTGDLFA YMHYGVTPDILTSAKALGGGFPISAMLTTAEIASAFHPGSHGSTYGGNPLACAVAGAAFD IINTPEVLEGIQAKRQRFVDHLQKIDQQYDVFSDIRGMGLLIGAELKPQYKGRARDFLYA GAEAGVMVLNAGPDVMRFAPSLVVEDADIDEGMQRFAHAVAKVVGA >gi|296493110|gb|ADTK01000391.1| GENE 5 2751 - 4841 1695 696 aa, chain - ## HITS:1 COG:ECs4209 KEGG:ns NR:ns ## COG: ECs4209 COG1289 # Protein_GI_number: 15833463 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 696 1 696 696 1330 99.0 0 MWRRLIYHPDINYALRQTLVLCLPVAVGLMLGELRFGLLFSLVPACCNIAGLDTPHKRFF KRLIIGASLFAACSLLTQLLLAKDVPLPFLLTGLTLVLGVTAELGPLHAKLLPASLLAAI FTLSLAGYMPVWEPLLIYALGTLWYGLFNWFWFWIWREQPLRESLSLLYRELADYCEAKY SLLTQHTDPEKALPPLLVRQQKAVDLITQCYQQMHMLSAQNNTDYKRMLRIFQEALDLQE HISVSLHQPEEVQKLVERSHAEEVIRWNAQTVAARLRVLADDILYHRLPTRFTMEKQIGA LEKIARQHPDNPVGQFCYWHFSRIARVLRTQKPLYARDLLADKQRRMPLLPALKSYLSLK SPALRNAGRLSVMLSVASLMGTALHLPKSYWILMTVLLVTQNGYGATRLRIVNRSVGTVV GLIIAGVALHFKIPEGYTLTLMLITTLASYLILRKNYGWATVGFTITAVYTLQLLWLNGE QYILPRLIDTIIGCLIAFGGTVWLWPQWQSGLLRKNAHDALEAYQEAIRLILSEDPQPTP LAWQRMRVNQAHNTLYNSLNQAMQEPAFNSHYLADMKLWVTHSQFIVEHINAMTTLAREH RALPPELAQEYLQSCEIAIQRCQQRLEYDEPGSSGDANIMDAPEMQPHEGAAGTLEQHLQ RVIGHLNTMHTISSMAWRQRPHHGIWLSRKLRDSKA >gi|296493110|gb|ADTK01000391.1| GENE 6 4892 - 5524 833 210 aa, chain - ## HITS:1 COG:ECs4208 KEGG:ns NR:ns ## COG: ECs4208 COG0664 # Protein_GI_number: 15833462 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Escherichia coli O157:H7 # 1 210 1 210 210 418 100.0 1e-117 MVLGKPQTDPTLEWFLSHCHIHKYPSKSTLIHQGEKAETLYYIVKGSVAVLIKDEEGKEM ILSYLNQGDFIGELGLFEEGQERSAWVRAKTACEVAEISYKKFRQLIQVNPDILMRLSAQ MARRLQVTSEKVGNLAFLDVTGRIAQTLLNLAKQPDAMTHPDGMQIKITRQEIGQIVGCS RETVGRILKMLEDQNLISAHGKTIVVYGTR >gi|296493110|gb|ADTK01000391.1| GENE 7 5826 - 6230 457 134 aa, chain + ## HITS:1 
COG:ECs4207 KEGG:ns NR:ns ## COG: ECs4207 COG1765 # Protein_GI_number: 15833461 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted redox protein, regulator of disulfide bond formation # Organism: Escherichia coli O157:H7 # 1 134 1 134 134 257 100.0 3e-69 MQARVKWVEGLTFLGESASGHQILMDGNSGDKAPSPMEMVLMAAGGCSAIDVVSILQKGR QDVVDCEVKLTSERREEAPRLFTHINLHFIVTGRDLKDAAVARAVDLSAEKYCSVALMLE KAVNITHSYEVVAA >gi|296493110|gb|ADTK01000391.1| GENE 8 6285 - 7154 846 289 aa, chain - ## HITS:1 COG:prkB KEGG:ns NR:ns ## COG: prkB COG3954 # Protein_GI_number: 16131234 # Func_class: C Energy production and conversion # Function: Phosphoribulokinase # Organism: Escherichia coli K12 # 1 289 1 289 289 595 100.0 1e-170 MSAKHPVIAVTGSSGAGTTTTSLAFRKIFAQLNLHAAEVEGDSFHRYTRPEMDMAIRKAR DAGRHISYFGPEANDFGLLEQTFIEYGQSGKGKSRKYLHTYDEAVPWNQVPGTFTPWQPL PEPTDVLFYEGLHGGVVTPQHNVAQHVDLLVGVVPIVNLEWIQKLIRDTSERGHSREAVM DSVVRSMEDYINYITPQFSRTHLNFQRVPTVDTSNPFAAKGIPSLDESFVVIHFRNLEGI DFPWLLAMLQGSFISHINTLVVPGGKMGLAMELIMLPLVQRLMEGKKIE >gi|296493110|gb|ADTK01000391.1| GENE 9 7208 - 7426 333 72 aa, chain - ## HITS:1 COG:ECs4205 KEGG:ns NR:ns ## COG: ECs4205 COG3089 # Protein_GI_number: 15833459 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 72 1 72 72 130 100.0 4e-31 MLIPWQDLSPETLENLIESFVLREGTDYGEHERTLEQKVADVKRQLQCGEAVLVWSELHE TVNIMPRSQFRE >gi|296493110|gb|ADTK01000391.1| GENE 10 7420 - 8382 849 320 aa, chain - ## HITS:1 COG:ECs4204 KEGG:ns NR:ns ## COG: ECs4204 COG0429 # Protein_GI_number: 15833458 # Func_class: R General function prediction only # Function: Predicted hydrolase of the alpha/beta-hydrolase fold # Organism: Escherichia coli O157:H7 # 1 320 21 340 340 672 99.0 0 MRGFSNCHLQTMLPRLFRRKVKFTPYWQRLELPDGDFVDLAWSEAPAQARHKPRLVVFHG LEGSLNSPYAHGLVEAAQKRGWLGVVMHFRGCSGEPNRMHRIYHSGETEDASWFLRWLQR EFGHAPTAAVGYSLGGNMLACLLAKEGNDLPVDAAVIVSAPFMLEACSYHMEKGFSRVYQ 
RYLLNLLKANAARKLAAYPGTLPINLAQLKSVRRIREFDDLITARIHGYADAIDYYRQCS AMPMLNRIAKPTLIIHAKDDPFMDHQVIPKPESLPSQVEYQLTEHGGHVGFIGGTLLHPQ MWLESRIPDWLTTYLEAKSC >gi|296493110|gb|ADTK01000391.1| GENE 11 8442 - 10355 2353 637 aa, chain - ## HITS:1 COG:ECs4203 KEGG:ns NR:ns ## COG: ECs4203 COG0488 # Protein_GI_number: 15833457 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Escherichia coli O157:H7 # 1 637 1 637 637 1207 99.0 0 MIVFSSLQIRRGVRVLLDNATATINPGQKVGLVGKNGCGKSTLLALLKNEISADGGSYTF PGSWQLAWVNQETPALPQAALEYVIDGDREYRQLEAQLHDANERNDGHAIATIHGKLDAI DAWSIRSRAASLLHGLGFSNEQLERPVSDFSGGWRMRLNLAQALICRSDLLLLDEPTNHL DLDAVIWLEKWLKSYQGTLILISHDRDFLDPIVDKIIHIEQQSMFEYTGNYSSFEVQRAT RLAQQQAMYESQQERVAHLQSYIDRFRAKATKAKQAQSRIKMLERMELIAPAHVDNPFRF SFRAPESLPNPLLKMEKVSAGYGDRIILDSIKLNLVPGSRIGLLGRNGAGKSTLIKLLAG ELAPVSGEIGLAKGIKLGYFAQHQLEYLRADESPIQHLARLAPQELEQKLRDYLGGFGFQ GDKVTEETRRFSGGEKARLVLALIVWQRPNLLLLDEPTNHLDLDMRQALTEALIEFEGAL VVVSHDRHLLRSTTDDLYLVHDRKVEPFDGDLEDYQQWLSDVQKQENQADEAPKENANSA QARKDQKRREAELRAQTQPLRKEIARLEKEMEKLNAQLAQAEEKLGDSELYDQSRKAELT ACLQQQASAKSGLEECEMAWLEAQEQLEQMLLEGQSN >gi|296493110|gb|ADTK01000391.1| GENE 12 10486 - 11037 442 183 aa, chain + ## HITS:1 COG:ECs4202 KEGG:ns NR:ns ## COG: ECs4202 COG2249 # Protein_GI_number: 15833456 # Func_class: R General function prediction only # Function: Putative NADPH-quinone reductase (modulator of drug activity B) # Organism: Escherichia coli O157:H7 # 1 183 2 184 184 365 100.0 1e-101 MSQPAKVLLLYAHPESQDSVANRVLLKPATQLSNVTVHDLYAHYPDFFIDIPREQALLRE HEVIVFQHPLYTYSCPALLKEWLDRVLSRGFASGPGGNQLAGKYWRSVITTGEPESAYRY DALNRYPMSDVLRPFELAAGMCRMHWLSPIIIYWARRQSAQELASHARAYGDWLANPLSP GGR >gi|296493110|gb|ADTK01000391.1| GENE 13 11037 - 12842 1056 601 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229845962|ref|ZP_04466074.1| 30S ribosomal protein S2 [Haemophilus influenzae 7P49H1] # 7 590 9 612 618 411 36 1e-114 MEGSDFLLAGVLFLFAAVAAVPLASRLGIGAVLGYLLAGIAIGPWGLGFISDVDEILHFS 
ELGVVFLMFIIGLELNPSKLWQLRRSIFGVGAAQVLLSAALLAGLLMLTDFAWQAAVVGG IGLAMSSTAMALQLMREKGMNRSESGQLGFSVLLFQDLAVIPALALVPLLAGAADEHFDW MKIGMKVLAFVGMLIGGRYLLRPVFRFIAASGVREVFTAATLLLVLGSALFMDALGLSMA LGTFIAGVLLAESEYRHELETAIDPFKGLLLGLFFISVGMSLNLGVLYTHLLWVVISVVV LVAVKILVLYLLARLYGVRSSERMQFAGVLSQGGEFAFVLFSTASSQRLFQGDQMALLLV TVTLSMMTTPLLMKLVDKWLSRQFNGPEEEDEKPWVNDDKPQVIVVGFGRFGQVIGRLLM ANKMRITVLERDISAVNLMRKYGYKVYYGDATQVDLLRSAGAEAAESIVITCNEPEDTMK LVEICQQHFPHLHILARARGRVEAHELLQAGVTQFSRETFSSALELGRKTLVTLGMHPHQ AQRAQLHFRRLDMRMLRELIPMHADTVQISRAREARRELEEIFQREMQQERRQLDGWDEF E >gi|296493110|gb|ADTK01000391.1| GENE 14 12852 - 13052 235 66 aa, chain + ## HITS:1 COG:Z4708 KEGG:ns NR:ns ## COG: Z4708 COG3529 # Protein_GI_number: 15803863 # Func_class: R General function prediction only # Function: Predicted nucleic-acid-binding protein containing a Zn-ribbon domain # Organism: Escherichia coli O157:H7 EDL933 # 1 66 1 66 66 120 100.0 6e-28 MAIRKRFIAGAKCPACQAQDSMAMWRENNIDIVECVKCGHQMREADKEARDHVRKDEQVI GIFHPD >gi|296493110|gb|ADTK01000391.1| GENE 15 13147 - 13737 729 196 aa, chain + ## HITS:1 COG:ECs4200 KEGG:ns NR:ns ## COG: ECs4200 COG1047 # Protein_GI_number: 15833454 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 2 # Organism: Escherichia coli O157:H7 # 1 148 1 148 196 271 100.0 7e-73 MKVAKDLVVSLAYQVRTEDGVLVDESPVSAPLDYLHGHGSLISGLETALEGHEVGDKFDV AVGANDAYGQYDENLVQRVPKDVFMGVDELQVGMRFLAETDQGPVPVEITAVEDDHVVVD GNHMLAGQNLKFNVEVVAIREATEEELAHGHVHGAHDHHHDHDHDGCCGGHGHDHGHEHG GEGCCGGKGNGGCGCH >gi|296493110|gb|ADTK01000391.1| GENE 16 13786 - 14004 310 72 aa, chain - ## HITS:1 COG:ECs4199 KEGG:ns NR:ns ## COG: ECs4199 COG2900 # Protein_GI_number: 15833453 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 72 1 72 72 99 98.0 1e-21 MQDLSLETRLAELESRLAFQEITIEELNVTVTAHEMEMAKLRDHLRLLTEKLKASQPSNI ASQAEETPPPHY >gi|296493110|gb|ADTK01000391.1| GENE 17 14225 
- 15037 938 270 aa, chain + ## HITS:1 COG:ECs4198 KEGG:ns NR:ns ## COG: ECs4198 COG0545 # Protein_GI_number: 15833452 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Escherichia coli O157:H7 # 1 259 1 259 270 418 100.0 1e-117 MKSLFKVTLLATTMAVALHAPITFAAEAAKPATTADSKAAFKNDDQKSAYALGASLGRYM ENSLKEQEKLGIKLDKDQLIAGVQDAFADKSKLSDQEIEQTLQAFEARVKSSAQAKMEKD AADNEAKGKEYREKFAKEKGVKTSSTGLVYQVVEAGKGEAPKDSDTVVVNYKGTLIDGKE FDNSYTRGEPLSFRLDGVIPGWTEGLKNIKKGGKIKLVIPPELAYGKAGVPGIPPNSTLV FDVELLDVKPAPKADAKPEADAKAADSAKK Prediction of potential genes in microbial genomes Time: Mon May 16 16:21:31 2011 Seq name: gi|296493109|gb|ADTK01000392.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1253.13, whole genome shotgun sequence Length of sequence - 10184 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 3, operones - 3 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 14 - 73 3.2 1 1 Op 1 6/0.000 + CDS 147 - 869 755 ## COG2964 Uncharacterized protein conserved in bacteria 2 1 Op 2 13/0.000 + CDS 869 - 1255 426 ## COG1553 Uncharacterized conserved protein involved in intracellular sulfur reduction 3 1 Op 3 10/0.000 + CDS 1255 - 1614 274 ## COG2923 Uncharacterized protein involved in the oxidation of intracellular sulfur 4 1 Op 4 7/0.000 + CDS 1622 - 1909 292 ## COG2168 Uncharacterized conserved protein involved in oxidation of intracellular sulfur 5 1 Op 5 56/0.000 + CDS 1975 - 2409 744 ## PROTEIN SUPPORTED gi|226956878|ref|YP_002807671.1| 30S ribosomal subunit protein S12 + Prom 2418 - 2477 2.1 6 1 Op 6 51/0.000 + CDS 2506 - 2976 773 ## PROTEIN SUPPORTED gi|15803854|ref|NP_289888.1| 30S ribosomal protein S7 + Term 3002 - 3037 3.1 + Prom 2990 - 3049 1.6 7 1 Op 7 30/0.000 + CDS 3073 - 5187 2256 ## COG0480 Translation elongation factors (GTPases) 8 1 Op 8 14/0.000 + CDS 5258 - 6442 1634 ## PROTEIN SUPPORTED 
gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 + Term 6468 - 6515 8.9 + Prom 6533 - 6592 6.0 9 2 Op 1 46/0.000 + CDS 6672 - 7055 451 ## COG0690 Preprotein translocase subunit SecE 10 2 Op 2 45/0.000 + CDS 7057 - 7602 636 ## COG0250 Transcription antiterminator + Term 7636 - 7688 10.2 + Prom 7610 - 7669 6.1 11 3 Op 1 55/0.000 + CDS 7761 - 8189 705 ## PROTEIN SUPPORTED gi|15804573|ref|NP_290614.1| 50S ribosomal protein L11 12 3 Op 2 43/0.000 + CDS 8193 - 8897 1148 ## PROTEIN SUPPORTED gi|15804574|ref|NP_290615.1| 50S ribosomal protein L1 13 3 Op 3 47/0.000 + CDS 9310 - 9807 798 ## PROTEIN SUPPORTED gi|15804575|ref|NP_290616.1| 50S ribosomal protein L10 14 3 Op 4 . + CDS 9874 - 10182 480 ## PROTEIN SUPPORTED gi|74314480|ref|YP_312899.1| 50S ribosomal protein L7/L12 Predicted protein(s) >gi|296493109|gb|ADTK01000392.1| GENE 1 147 - 869 755 240 aa, chain + ## HITS:1 COG:yheO KEGG:ns NR:ns ## COG: yheO COG2964 # Protein_GI_number: 16131225 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 240 5 244 244 469 100.0 1e-132 MSRSLLTNETSELDLLDQRPFDQTDFDILKSYEAVVDGLAMLIGSHCEIVLHSLQDLKCS AIRIANGEHTGRKIGSPITDLALRMLHDMTGADSSVSKCYFTRAKSGVLMKSLTIAIRNR EQRVIGLLCINMNLDVPFSQIMSTFVPPETPDVGSSVNFASSVEDLVTQTLEFTIEEVNA DRNVSNNAKNRQIVLNLYEKGIFDIKDAINQVADRLNISKHTVYLYIRQFKSGDFQGQDK >gi|296493109|gb|ADTK01000392.1| GENE 2 869 - 1255 426 128 aa, chain + ## HITS:1 COG:ECs4196 KEGG:ns NR:ns ## COG: ECs4196 COG1553 # Protein_GI_number: 15833450 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized conserved protein involved in intracellular sulfur reduction # Organism: Escherichia coli O157:H7 # 1 128 1 128 128 225 95.0 2e-59 MRFAIVVTGPAYGTQQASSAFQFAQALIVEGHELSSVFFYREGVYNANQLTSPASDEFDL VRGWQQLNAQHGVALNICVAAALRRGIVDETEAGRLGLASSNLQPGFTLSGLGALAEASL TCDRVVQF >gi|296493109|gb|ADTK01000392.1| GENE 3 1255 - 1614 274 119 aa, chain + ## HITS:1 COG:yheM KEGG:ns NR:ns ## COG: yheM COG2923 # 
Protein_GI_number: 16131223 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in the oxidation of intracellular sulfur # Organism: Escherichia coli K12 # 1 119 1 119 119 223 99.0 5e-59 MKRIAFVFSTAPHGTAAGREGLDALLATSALTDDLAVFFIADGVFQLLSGQKPDAVLARD YIATFKLLGLYDIEQCWVCAASLRERGLDPQTPFVVEATPLEADALRRELANYDVILRF >gi|296493109|gb|ADTK01000392.1| GENE 4 1622 - 1909 292 95 aa, chain + ## HITS:1 COG:ECs4194 KEGG:ns NR:ns ## COG: ECs4194 COG2168 # Protein_GI_number: 15833448 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized conserved protein involved in oxidation of intracellular sulfur # Organism: Escherichia coli O157:H7 # 1 95 1 95 95 182 98.0 2e-46 MLHTLHRSPWLTDFAALLRLLSEGDELLLLQDGVTAAVDGNRYLESLRNAPIKVYALNED LIARGLTGQISNDIIPIDYTDFVRLTVKHSSQMAW >gi|296493109|gb|ADTK01000392.1| GENE 5 1975 - 2409 744 144 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|226956878|ref|YP_002807671.1| 30S ribosomal subunit protein S12 [Escherichia sp. 
1_1_43] # 1 144 1 144 144 291 99 2e-78 MCEDVLLRVYEAKAKTRSYLMATVNQLVRKPRARKVAKSNVPALEACPQKRGVCTRVYTT TPKKPNSALRKVCRVRLTNGFEVTSYIGGEGHNLQEHSVILIRGGRVKDLPGVRYHTVRG ALDCSGVKDRKQARSKYGVKRPKA >gi|296493109|gb|ADTK01000392.1| GENE 6 2506 - 2976 773 156 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803854|ref|NP_289888.1| 30S ribosomal protein S7 [Escherichia coli O157:H7 EDL933] # 1 156 1 156 156 302 99 8e-82 MPRRRVIGQRKILPDPKFGSELLAKFVNILMVDGKKSTAESIVYSALETLAQRSGKSELE AFEVALENVRPTVEVKSRRVGGSTYQVPVEVRPVRRNTLAMRWIVEAARKRGDKSMALRL ANELSDAAENKGTAVKKREDVHRMAEANKAFAHYRW >gi|296493109|gb|ADTK01000392.1| GENE 7 3073 - 5187 2256 704 aa, chain + ## HITS:1 COG:ECs4191 KEGG:ns NR:ns ## COG: ECs4191 COG0480 # Protein_GI_number: 15833445 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Escherichia coli O157:H7 # 1 704 1 704 704 1402 100.0 0 MARTTPIARYRNIGISAHIDAGKTTTTERILFYTGVNHKIGEVHDGAATMDWMEQEQERG ITITSAATTAFWSGMAKQYEPHRINIIDTPGHVDFTIEVERSMRVLDGAVMVYCAVGGVQ PQSETVWRQANKYKVPRIAFVNKMDRMGANFLKVVNQIKTRLGANPVPLQLAIGAEEHFT GVVDLVKMKAINWNDADQGVTFEYEDIPADMVELANEWHQNLIESAAEASEELMEKYLGG EELTEAEIKGALRQRVLNNEIILVTCGSAFKNKGVQAMLDAVIDYLPSPVDVPAINGILD DGKDTPAERHASDDEPFSALAFKIATDPFVGNLTFFRVYSGVVNSGDTVLNSVKAARERF GRIVQMHANKREEIKEVRAGDIAAAIGLKDVTTGDTLCDPDAPIILERMEFPEPVISIAV EPKTKADQEKMGLALGRLAKEDPSFRVWTDEESNQTIIAGMGELHLDIIVDRMKREFNVE ANVGKPQVAYRETIRQKVTDVEGKHAKQSGGRGQYGHVVIDMYPLEPGSNPKGYEFINDI KGGVIPGEYIPAVDKGIQEQLKAGPLAGYPVVDMGIRLHFGSYHDVDSSELAFKLAASIA FKEGFKKAKPVLLEPIMKVEVETPEENTGDVIGDLSRRRGMLKGQESEVTGVKIHAEVPL SEMFGYATQLRSLTKGRASYTMEFLKYDEAPSNVAQAVIEARGK >gi|296493109|gb|ADTK01000392.1| GENE 8 5258 - 6442 1634 394 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 393 1 406 407 634 76 0.0 MSKEKFERTKPHVNVGTIGHVDHGKTTLTAAITTVLAKTYGGAARAFDQIDNAPEEKARG ITINTSHVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMDGAILVVAATDGPMPQTREHI 
LLGRQVGVPYIIVFLNKCDMVDDEELLELVEMEVRELLSQYDFPGDDTPIVRGSALKALE GDAEWEAKILELAGFLDSYIPEPERAIDKPFLLPIEDVFSISGRGTVVTGRVERGIIKVG EEVEIVGIKETQKSTCTGVEMFRKLLDEGRAGENVGVLLRGIKREEIERGQVLAKPGTIK PHTKFESEVYILSKDEGGRHTPFFKGYRPQFYFRTTDVTGTIELPEGVEMVMPGDNIKMV VTLIHPIAMDDGLRFAIREGGRTVGAGVVAKVLS >gi|296493109|gb|ADTK01000392.1| GENE 9 6672 - 7055 451 127 aa, chain + ## HITS:1 COG:STM4147 KEGG:ns NR:ns ## COG: STM4147 COG0690 # Protein_GI_number: 16767401 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecE # Organism: Salmonella typhimurium LT2 # 1 127 1 127 127 184 96.0 4e-47 MSANTEAQGSGRGLEAMKWVVVVALLLVAIVGNYLYRDIMLPLRALAVVILIAAAGGVAL LTTKGKATVAFAREARTEVRKVIWPTRQETLHTTLIVAAVTAVMSLILWGLDGILVRLVS FITGLRF >gi|296493109|gb|ADTK01000392.1| GENE 10 7057 - 7602 636 181 aa, chain + ## HITS:1 COG:ECs4905 KEGG:ns NR:ns ## COG: ECs4905 COG0250 # Protein_GI_number: 15834159 # Func_class: K Transcription # Function: Transcription antiterminator # Organism: Escherichia coli O157:H7 # 1 181 1 181 181 348 100.0 3e-96 MSEAPKKRWYVVQAFSGFEGRVATSLREHIKLHNMEDLFGEVMVPTEEVVEIRGGQRRKS ERKFFPGYVLVQMVMNDASWHLVRSVPRVMGFIGGTSDRPAPISDKEVDAIMNRLQQVGD KPRPKTLFEPGEMVRVNDGPFADFNGVVEEVDYEKSRLKVSVSIFGRATPVELDFSQVEK A >gi|296493109|gb|ADTK01000392.1| GENE 11 7761 - 8189 705 142 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15804573|ref|NP_290614.1| 50S ribosomal protein L11 [Escherichia coli O157:H7 EDL933] # 1 142 1 142 142 276 100 6e-74 MAKKVQAYVKLQVAAGMANPSPPVGPALGQQGVNIMEFCKAFNAKTDSIEKGLPIPVVIT VYADRSFTFVTKTPPAAVLLKKAAGIKSGSGKPNKDKVGKISRAQLQEIAQTKAADMTGA DIEAMTRSIEGTARSMGLVVED >gi|296493109|gb|ADTK01000392.1| GENE 12 8193 - 8897 1148 234 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15804574|ref|NP_290615.1| 50S ribosomal protein L1 [Escherichia coli O157:H7 EDL933] # 1 234 1 234 234 446 100 1e-125 MAKLTKRMRVIREKVDATKQYDINEAIALLKELATAKFVESVDVAVNLGIDARKSDQNVR GATVLPHGTGRSVRVAVFTQGANAEAAKAAGAELVGMEDLADQIKKGEMNFDVVIASPDA 
MRVVGQLGQVLGPRGLMPNPKVGTVTPNVAEAVKNAKAGQVRYRNDKNGIIHTTIGKVDF DADKLKENLEALLVALKKAKPTQAKGVYIKKVSISTTMGAGVAVDQAGLSASVN >gi|296493109|gb|ADTK01000392.1| GENE 13 9310 - 9807 798 165 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15804575|ref|NP_290616.1| 50S ribosomal protein L10 [Escherichia coli O157:H7 EDL933] # 1 165 1 165 165 311 100 1e-84 MALNLQDKQAIVAEVSEVAKGALSAVVADSRGVTVDKMTELRKAGREAGVYMRVVRNTLL RRAVEGTPFECLKDAFVGPTLIAYSMEHPGAAARLFKEFAKANAKFEVKAAAFEGELIPA SQIDRLATLPTYEEAIARLMATMKEASAGKLVRTLAAVRDAKEAA >gi|296493109|gb|ADTK01000392.1| GENE 14 9874 - 10182 480 103 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|74314480|ref|YP_312899.1| 50S ribosomal protein L7/L12 [Shigella sonnei Ss046] # 1 103 1 103 121 189 100 7e-48 MSITKDQIIEAVAAMSVMDVVELISAMEEKFGVSAAAAVAVAAGPVEAAEEKTEFDVILK AAGANKVAVIKAVRGATGLGLKEAKDLVESAPAALKEGVSKDD Prediction of potential genes in microbial genomes Time: Mon May 16 16:21:33 2011 Seq name: gi|296493108|gb|ADTK01000393.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1253.14, whole genome shotgun sequence Length of sequence - 8618 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 49 - 108 2.6 1 1 Op 1 58/0.000 + CDS 265 - 4293 4095 ## PROTEIN SUPPORTED gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 + Term 4313 - 4352 8.9 2 1 Op 2 . 
+ CDS 4370 - 8593 4663 ## COG0086 DNA-directed RNA polymerase, beta' subunit/160 kD subunit Predicted protein(s) >gi|296493108|gb|ADTK01000393.1| GENE 1 265 - 4293 4095 1342 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 [alpha proteobacterium BAL199] # 2 1342 6 1390 1392 1582 58 0.0 MVYSYTEKKRIRKDFGKRPQVLDVPYLLSIQLDSFQKFIEQDPEGQYGLEAAFRSVFPIQ SYSGNSELQYVSYRLGEPVFDVQECQIRGVTYSAPLRVKLRLVIYEREAPEGTVKDIKEQ EVYMGEIPLMTDNGTFVINGTERVIVSQLHRSPGVFFDSDKGKTHSSGKVLYNARIIPYR GSWLDFEFDPKDNLFVRIDRRRKLPATIILRALNYTTEQILDLFFEKVIFEIRDNKLQME LVPERLRGETASFDIEANGKVYVEKGRRITARHIRQLEKDDVKLIEVPVEYIAGKVVAKD YIDESTGELICAANMELSLDLLAKLSQSGHKRIETLFTNDLDHGPYISETLRVDPTNDRL SALVEIYRMMRPGEPPTREAAESLFENLFFSEDRYDLSAVGRMKFNRSLLREEIEGSGIL SKDDIIDVMKKLIDIRNGKGEVDDIDHLGNRRIRSVGEMAENQFRVGLVRVERAVKERLS LGDLDTLMPQDMINAKPISAAVKEFFGSSQLSQFMDQNNPLSEITHKRRISALGPGGLTR ERAGFEVRDVHPTHYGRVCPIETPEGPNIGLINSLSVYAQTNEYGFLETPYRKVTDGVVT DEIHYLSAIEEGNYVIAQANSNLDEEGHFVEDLVTCRSKGESSLFSRDQVDYMDVSTQQV VSVGASLIPFLEHDDANRALMGANMQRQAVPTLRADKPLVGTGMERAVAVDSGVTAVAKR GGVVQYVDASRIVIKVNEDEMYPGEAGIDIYNLTKYTRSNQNTCINQMPCVSLGEPVERG DVLADGPSTDLGELALGQNMRVAFMPWNGYNFEDSILVSERVVQEDRFTTIHIQELACVS RDTKLGPEEITADIPNVGEAALSKLDESGIVYIGAEVTGGDILVGKVTPKGETQLTPEEK LLRAIFGEKASDVKDSSLRVPNGVSGTVIDVQVFTRDGVEKDKRALEIEEMQLKQAKKDL SEELQILEAGLFSRIRAVLVAGGVEAEKLDKLPRDRWLELGLTDEEKQNQLEQLAEQYDE LKHEFEKKLEAKRRKITQGDDLAPGVLKIVKVYLAVKRRIQPGDKMAGRHGNKGVISKIN PIEDMPYDENGTPVDIVLNPLGVPSRMNIGQILETHLGMAAKGIGDKINAMLKQQQEVAK LREFIQRAYDLGADVRQKVDLSTFSDEEVMRLAENLRKGMPIATPVFDGAKEAEIKELLK LGDLPTSGQIRLYDGRTGEQFERPVTVGYMYMLKLNHLVDDKMHARSTGSYSLVTQQPLG GKAQFGGQRFGEMEVWALEAYGAAYTLQEMLTVKSDDVNGRTKMYKNIVDGNHQMEPGMP ESFNVLLKEIRSLGINIELEDE >gi|296493108|gb|ADTK01000393.1| GENE 2 4370 - 8593 4663 1407 aa, chain + ## HITS:1 COG:ECs4911 KEGG:ns NR:ns ## COG: ECs4911 COG0086 # Protein_GI_number: 15834165 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta' subunit/160 kD subunit # Organism: Escherichia coli 
O157:H7 # 1 1407 1 1407 1407 2776 100.0 0 MKDLLKFLKAQTKTEEFDAIKIALASPDMIRSWSFGEVKKPETINYRTFKPERDGLFCAR IFGPVKDYECLCGKYKRLKHRGVICEKCGVEVTQTKVRRERMGHIELASPTAHIWFLKSL PSRIGLLLDMPLRDIERVLYFESYVVIEGGMTNLERQQILTEEQYLDALEEFGDEFDAKM GAEAIQALLKSMDLEQECEQLREELNETNSETKRKKLTKRIKLLEAFVQSGNKPEWMILT VLPVLPPDLRPLVPLDGGRFATSDLNDLYRRVINRNNRLKRLLDLAAPDIIVRNEKRMLQ EAVDALLDNGRRGRAITGSNKRPLKSLADMIKGKQGRFRQNLLGKRVDYSGRSVITVGPY LRLHQCGLPKKMALELFKPFIYGKLELRGLATTIKAAKKMVEREEAVVWDILDEVIREHP VLLNRAPTLHRLGIQAFEPVLIEGKAIQLHPLVCAAYNADFDGDQMAVHVPLTLEAQLEA RALMMSTNNILSPANGEPIIVPSQDVVLGLYYMTRDCVNAKGEGMVLTGPKEAERLYRSG LASLHARVKVRITEYEKDANGELVAKTSLKDTTVGRAILWMIVPKGLPYSIVNQALGKKA ISKMLNTCYRILGLKPTVIFADQIMYTGFAYAARSGASVGIDDMVIPEKKHEIISEAEAE VAEIQEQFQSGLVTAGERYNKVIDIWAAANDRVSKAMMDNLQTETVINRDGQEEKQVSFN SIYMMADSGARGSAAQIRQLAGMRGLMAKPDGSIIETPITANFREGLNVLQYFISTHGAR KGLADTALKTANSGYLTRRLVDVAQDLVVTEDDCGTHEGIMMTPVIEGGDVKEPLRDRVL GRVTAEDVLKPGTADILVPRNTLLHEQWCDLLEENSVDAVKVRSVVSCDTDFGVCAHCYG RDLARGHIINKGEAIGVIAAQSIGEPGTQLTMRTFHIGGAASRAAAESSIQVKNKGSIKL SNVKSVVNSSGKLVITSRNTELKLIDEFGRTKESYKVPYGAVLAKGDGEQVAGGETVANW DPHTMPVITEVSGFVRFTDMIDGQTITRQTDELTGLSSLVVLDSAERTAGGKDLRPALKI VDAQGNDVLIPGTDMPAQYFLPGKAIVQLEDGVQISSGDTLARIPQESGGTKDITGGLPR VADLFEARRPKEPAILAEISGIVSFGKETKGKRRLVITPVDGSDPYEEMIPKWRQLNVFE GERVERGDVISDGPEAPHDILRLRGVHAVTRYIVNEVQDVYRLQGVKINDKHIEVIVRQM LRKATIVNAGSSDFLEGEQVEYSRVKIANRELEANGKVGATYSRDLLGITKASLATESFI SAASFQETTRVLTEAAVAGKRDELRGLKENVIVGRLIPAGTGYAYHQDRMRRRAAGEAPA APQVTAEDASASLAELLNAGLGGSDNE Prediction of potential genes in microbial genomes Time: Mon May 16 16:21:40 2011 Seq name: gi|296493107|gb|ADTK01000394.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1253.15, whole genome shotgun sequence Length of sequence - 18257 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 8, operones - 4 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 10 - 69 4.2 1 1 Tu 1 . 
+ CDS 192 - 731 267 ## B21_03818 hypothetical protein + Term 866 - 906 5.6 - Term 984 - 1028 4.3 2 2 Op 1 5/0.333 - CDS 1141 - 2274 1094 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes 3 2 Op 2 16/0.000 - CDS 2271 - 3041 738 ## COG2022 Uncharacterized enzyme of thiazole biosynthesis 4 2 Op 3 5/0.333 - CDS 3043 - 3243 249 ## COG2104 Sulfur transfer protein involved in thiamine biosynthesis 5 2 Op 4 3/0.667 - CDS 3227 - 3982 720 ## COG0476 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 6 2 Op 5 8/0.000 - CDS 3975 - 4610 675 ## COG0352 Thiamine monophosphate synthase 7 2 Op 6 4/0.667 - CDS 4610 - 6505 1975 ## COG0422 Thiamine biosynthesis protein ThiC - Prom 6675 - 6734 3.5 - Term 6675 - 6703 -0.7 8 3 Tu 1 . - CDS 6738 - 7214 474 ## COG3160 Regulator of sigma D - Prom 7268 - 7327 3.2 + Prom 7031 - 7090 2.8 9 4 Op 1 5/0.333 + CDS 7309 - 8082 567 ## COG2816 NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding 10 4 Op 2 4/0.667 + CDS 8122 - 9186 1209 ## COG0407 Uroporphyrinogen-III decarboxylase 11 4 Op 3 4/0.667 + CDS 9196 - 9867 631 ## COG1515 Deoxyinosine 3'endonuclease (endonuclease V) 12 4 Op 4 6/0.000 + CDS 9910 - 10500 594 ## COG3068 Uncharacterized protein conserved in bacteria + Prom 10523 - 10582 1.8 13 4 Op 5 . + CDS 10687 - 10959 404 ## COG0776 Bacterial nucleoid DNA-binding protein + Term 10987 - 11020 3.8 14 5 Tu 1 . + CDS 11032 - 11667 472 ## c4958 hypothetical protein + Term 11897 - 11931 -0.7 15 6 Tu 1 . - CDS 11669 - 12094 376 ## COG3678 P pilus assembly/Cpx signaling pathway, periplasmic inhibitor/zinc-resistance associated protein - Prom 12172 - 12231 2.0 + Prom 12123 - 12182 3.5 16 7 Op 1 13/0.000 + CDS 12341 - 13708 1092 ## COG0642 Signal transduction histidine kinase 17 7 Op 2 . 
+ CDS 13705 - 15030 1353 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains + Term 15206 - 15239 -1.0 - Term 14942 - 14981 1.1 18 8 Op 1 17/0.000 - CDS 15027 - 16316 1578 ## COG0151 Phosphoribosylamine-glycine ligase 19 8 Op 2 . - CDS 16328 - 17917 1761 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) - Prom 18012 - 18071 3.6 Predicted protein(s) >gi|296493107|gb|ADTK01000394.1| GENE 1 192 - 731 267 179 aa, chain + ## HITS:1 COG:no KEGG:B21_03818 NR:ns ## KEGG: B21_03818 # Name: htrC # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 179 1 179 179 347 100.0 1e-94 MKQEVEKWRPFGHPDGDIRDLSFLDAHQAVYVQHHEGKEPLEYRFWVTYSLHCFTKDYEH QTNEEKQSLMYHAPKESRPFCQHRYNLARTHLKRTILALPESNVIHAGYGSYAVIEVDLD GGDKAFYFVAFRAFREKKKLRLHVTSAYPISEKQKGKSVKFFTIAYNLLRNKQLPQPSK >gi|296493107|gb|ADTK01000394.1| GENE 2 1141 - 2274 1094 377 aa, chain - ## HITS:1 COG:ECs4913 KEGG:ns NR:ns ## COG: ECs4913 COG1060 # Protein_GI_number: 15834167 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Escherichia coli O157:H7 # 1 377 1 377 377 762 98.0 0 MKTFSDRWRQLDWDDIRLRINGKTAVDVERALNASQFTRDDMMALLSPAASGYLEQLAQR AQRLTRQRFGNTVSFYVPLYLSNLCANDCTYCGFSMSNRIKRKTLDEADIARESAAIREM GFEHLLLVTGEHQAKVGMDYFRRHLPALREQFSSLQMEVQPLAETEYAELKQLGLDGVMV YQETYHEATYARHHLKGKKQDFFWRLETPDRLGRAGIDKIGLGALIGLSDNWRVDCYMVA EHLLWLQQHYWQSRYSVSFPRLRPCTGGIEPASIMDERQLVQTICAFRLLAPEIELSLST RESPWFRDRVIPLAINNVSAFSKTQPGGYADNHPELEQFSPHDDRRPEAVAAALTAQGLQ PVWKDWDSYLGRASQRL >gi|296493107|gb|ADTK01000394.1| GENE 3 2271 - 3041 738 256 aa, chain - ## HITS:1 COG:thiG KEGG:ns NR:ns ## COG: thiG COG2022 # Protein_GI_number: 16131821 # Func_class: H Coenzyme transport and metabolism # Function: Uncharacterized enzyme of thiazole biosynthesis # Organism: Escherichia coli K12 # 1 256 26 281 281 466 98.0 1e-131 
MLRIADKTFDSHLFTGTGKFASSQLMVEAIRASGSQLVTLAMKRVDLRQHNDAILEPLIA AGVTLLPNTSGAKTAEEAIFAAHLAREALGTNWLKLEIHPDARWLLPDPIETLKAAEMLV QQGFVVLPYCGADPVLCKRLEEVGCAAVMPLGAPIGSNQGLETCAMLEIIIQQATVPVVV DAGIGVPSHAAQALEMGADAVLVNTAIAVADDPVNMAKAFRLAVEAGLLARQSGPGSRSH FAHATSPLTGFLEASA >gi|296493107|gb|ADTK01000394.1| GENE 4 3043 - 3243 249 66 aa, chain - ## HITS:1 COG:thiS KEGG:ns NR:ns ## COG: thiS COG2104 # Protein_GI_number: 16132237 # Func_class: H Coenzyme transport and metabolism # Function: Sulfur transfer protein involved in thiamine biosynthesis # Organism: Escherichia coli K12 # 1 66 1 66 66 92 98.0 3e-19 MQILFNDQPMQCAAGQTVHELLEQLDQRQAGAALAINQQIVPREQWAQHIVQDGDQILLF QVIAGG >gi|296493107|gb|ADTK01000394.1| GENE 5 3227 - 3982 720 251 aa, chain - ## HITS:1 COG:thiF KEGG:ns NR:ns ## COG: thiF COG0476 # Protein_GI_number: 16131822 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 # Organism: Escherichia coli K12 # 7 251 1 245 245 444 100.0 1e-125 MNDRDFMRYSRQILLDDIALDGQQKLLDSQVLIIGLGGLGTPAALYLAGAGVGTLVLADD DDVHLSNLQRQILFTTEDIDRPKSQVSQQRLTQLNPDIQLTALQQRLTGEALKDAVARAD VVLDCTDNMATRQEINAACVALNTPLITASAVGFGGQLMVLTPPWEQGCYRCLWPDNQEP ERNCRTAGVVGPVVGVMGTLQALEAIKLLSGIETPAGELRLFDGKSSQWRSLALRRASGC PVCGGSNADPV >gi|296493107|gb|ADTK01000394.1| GENE 6 3975 - 4610 675 211 aa, chain - ## HITS:1 COG:ECs4916 KEGG:ns NR:ns ## COG: ECs4916 COG0352 # Protein_GI_number: 15834170 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate synthase # Organism: Escherichia coli O157:H7 # 1 211 1 211 211 395 99.0 1e-110 MYQPDFPPVPFRLGLYPVVDSVQWIERLLDAGVRTLQLRIKDRRDEEVEADVVAAIALGR RYNARLFINDYWRLAIKHQAYGVHLGQEDLQATDLNAIRAAGLRLGVSTHDDMEIDVALA ARPSYIALGHVFPTQTKQMPSAPQGLEQLARHVERLADYPTVAIGGISLARAPAVIATGV GSIAVVSAITQAADWRLATAQLLEIAGVGDE >gi|296493107|gb|ADTK01000394.1| GENE 7 4610 - 6505 1975 631 aa, chain - ## HITS:1 COG:thiC KEGG:ns NR:ns ## COG: thiC COG0422 # Protein_GI_number: 16131824 
# Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis protein ThiC # Organism: Escherichia coli K12 # 1 631 1 631 631 1313 100.0 0 MSATKLTRREQRARAQHFIDTLEGTAFPNSKRIYITGTHPGVRVPMREIQLSPTLIGGSK EQPQYEENEAIPVYDTSGPYGDPQIAINVQQGLAKLRQPWIDARGDTEELTVRSSDYTKA RLADDGLDELRFSGVLTPKRAKAGRRVTQLHYARQGIITPEMEFIAIRENMGRERIRSEV LRHQHPGMSFGAHLPENITAEFVRDEVAAGRAIIPANINHPESEPMIIGRNFLVKVNANI GNSAVTSSIEEEVEKLVWSTRWGADTVMDLSTGRYIHETREWILRNSPVPIGTVPIYQAL EKVNGIAEDLTWEAFRDTLLEQAEQGVDYFTIHAGVLLRYVPMTAKRLTGIVSRGGSIMA KWCLSHHQENFLYQHFREICEICAAYDVSLSLGDGLRPGSIQDANDEAQFAELHTLGELT KIAWEYDVQVMIEGPGHVPMQMIRRNMTEELEHCHEAPFYTLGPLTTDIAPGYDHFTSGI GAAMIGWFGCAMLCYVTPKEHLGLPNKEDVKQGLITYKIAAHAADLAKGHPGAQIRDNAM SKARFEFRWEDQFNLALDPFTARAYHDETLPQESGKVAHFCSMCGPKFCSMKISQEVRDY AATQTIEMGMADMSENFRARGGEIYLRKEEA >gi|296493107|gb|ADTK01000394.1| GENE 8 6738 - 7214 474 158 aa, chain - ## HITS:1 COG:ECs4918 KEGG:ns NR:ns ## COG: ECs4918 COG3160 # Protein_GI_number: 15834172 # Func_class: K Transcription # Function: Regulator of sigma D # Organism: Escherichia coli O157:H7 # 1 158 1 158 158 305 100.0 3e-83 MLNQLDNLTERVRGSNKLVDRWLHVRKHLLVAYYNLVGIKPGKESYMRLNEKALDDFCQS LVDYLSAGHFSIYERILHKLEGNGQLARAAKIWPQLEANTQQIMDYYDSSLETAIDHDNY LEFQQVLSDIGEALEARFVLEDKLILLVLDAARVKHPA >gi|296493107|gb|ADTK01000394.1| GENE 9 7309 - 8082 567 257 aa, chain + ## HITS:1 COG:yjaD KEGG:ns NR:ns ## COG: yjaD COG2816 # Protein_GI_number: 16131826 # Func_class: L Replication, recombination and repair # Function: NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding # Organism: Escherichia coli K12 # 1 257 1 257 257 531 99.0 1e-151 MDRIIEKLDHGWWVVSHEQKLWLPKGELPYGEAANFDLVGQRALQIGEWQGEPVWLVQQQ RRHDMGSVRQVIDLDVGLFQLAGRGVQLAEFYRSHKYCGYCGHEMYPSKTEWAMLCSHCR ERYYPQIAPCIIVAIRRDDSILLAQHTRHRNGVHTVLAGFVEVGETLEQAVAREVMEESG IKVKNLRYVTSQPWPFPQSLMTAFMAEYDSGDIVIDPKELLEANWYRYDDLPLLPPPGTV ARRLIEDTVAMCRAEYE >gi|296493107|gb|ADTK01000394.1| GENE 10 8122 - 9186 1209 354 aa, chain + ## HITS:1 COG:hemE KEGG:ns NR:ns 
## COG: hemE COG0407 # Protein_GI_number: 16131827 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III decarboxylase # Organism: Escherichia coli K12 # 1 354 1 354 354 735 99.0 0 MTELKNDRYLRALLRQPVDVTPVWMMRQAGRYLPEYKATRAQAGDFMSLCKNAELACEVT LQPLRRYPLDAAILFSDILTVPDAMGLGLYFEAGEGPRFTSPVTCKADVDKLPIPDPEDE LGYVMNAVRTIRRELKGEVPLIGFSGSPWTLATYMVEGGSSKAFTVIKKMMYADPQALHA LLDKLAKSVTLYLNAQIKAGAQAVMIFDTWGGVLTGRDYQQFSLYYMHKIVDGLVRENDG RRVPVTLFTKGGGQWLEAMAETGCDALGLDWTTDIADARRRVGNKVALQGNMDPSMLYAP PARIEEEVATILAGFGHGEGHVFNLGHGIHQDVPPEHAGVFVEAVHRLSEQYHR >gi|296493107|gb|ADTK01000394.1| GENE 11 9196 - 9867 631 223 aa, chain + ## HITS:1 COG:nfi KEGG:ns NR:ns ## COG: nfi COG1515 # Protein_GI_number: 16131828 # Func_class: L Replication, recombination and repair # Function: Deoxyinosine 3'endonuclease (endonuclease V) # Organism: Escherichia coli K12 # 1 223 3 225 225 442 100.0 1e-124 MDLASLRAQQIELASSVIREDRLDKDPPDLIAGADVGFEQGGEVTRAAMVLLKYPSLELV EYKVARIATTMPYIPGFLSFREYPALLAAWEMLSQKPDLVFVDGHGISHPRRLGVASHFG LLVDVPTIGVAKKRLCGKFEPLSSEPGALAPLMDKGEQLAWVWRSKARCNPLFIATGHRV SVDSALAWVQRCMKGYRLPEPTRWADAVASERPAFVRYTANQP >gi|296493107|gb|ADTK01000394.1| GENE 12 9910 - 10500 594 196 aa, chain + ## HITS:1 COG:yjaG KEGG:ns NR:ns ## COG: yjaG COG3068 # Protein_GI_number: 16131829 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 196 1 196 196 384 100.0 1e-107 MLQNPIHLRLERLESWQHVTFMACLCERMYPNYAMFCQQTGFGDGQIYRRILDLIWETLT VKDAKVNFDSQLEKFEEAIPSADDFDLYGVYPAIDACVALSELVHSRLSGETLEHAVEVS KTSITTVAMLEMTQAGREMSDEELKENPAVEQEWDIQWEIFRLLAECEERDIELIKGLRA DLREAGESNIGIIFQQ >gi|296493107|gb|ADTK01000394.1| GENE 13 10687 - 10959 404 90 aa, chain + ## HITS:1 COG:STM4170 KEGG:ns NR:ns ## COG: STM4170 COG0776 # Protein_GI_number: 16767424 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Salmonella typhimurium LT2 # 1 90 1 90 90 135 98.0 1e-32 
MNKTQLIDVIAEKAELSKTQAKAALESTLAAITESLKEGDAVQLVGFGTFKVNHRAERTG RNPQTGKEIKIAAANVPAFVSGKALKDAVK >gi|296493107|gb|ADTK01000394.1| GENE 14 11032 - 11667 472 211 aa, chain + ## HITS:1 COG:no KEGG:c4958 NR:ns ## KEGG: c4958 # Name: yjaH # Def: hypothetical protein # Organism: E.coli_CFT073 # Pathway: not_defined # 1 211 23 233 233 418 100.0 1e-116 MLAGALFLTACSHNSSLPPFTASGFAEDQGAVRIWRKDSGDNVHLLAVFSPWRSGDTTTR EYRWQGDNLTLININVYSKPPVNIRARFDDRGDLSFMQRESDGEKQQLSNDQIDLYRYRA DQIRQISDALRQGRVVLRQGRWHAMEQTVTTCEGQTIKPDLDSQAIAHIERRQSRSSVDV SVAWLEAPEGSQLLLVANSDFCRWQPNEKTF >gi|296493107|gb|ADTK01000394.1| GENE 15 11669 - 12094 376 141 aa, chain - ## HITS:1 COG:ECs4925 KEGG:ns NR:ns ## COG: ECs4925 COG3678 # Protein_GI_number: 15834179 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; N Cell motility; T Signal transduction mechanisms; P Inorganic ion transport and metabolism # Function: P pilus assembly/Cpx signaling pathway, periplasmic inhibitor/zinc-resistance associated protein # Organism: Escherichia coli O157:H7 # 1 141 48 188 188 219 100.0 1e-57 MKRNTKIALVMMALSAMAMGSTSAFAHGGHGMWQQNAAPLTSEQQTAWQKIHNDFYAQSS ALQQQLVTKRYEYNALLAANPPDSSKINAVAKEMENLRQSLDELRVKRDIAMAEAGIPRG AGMGMGYGGCGGGGHMGMGHW >gi|296493107|gb|ADTK01000394.1| GENE 16 12341 - 13708 1092 455 aa, chain + ## HITS:1 COG:ECs4926 KEGG:ns NR:ns ## COG: ECs4926 COG0642 # Protein_GI_number: 15834180 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli O157:H7 # 1 455 4 458 458 859 98.0 0 MQRSKDSLAKWLSAILPVVIVGLVGLFAVTVIRDYGRETAAARQTLLEKGSVLIRALESG SRVGMGMRMHHAQQQALLEEMAGQPGVRWFAVTDEQGTIVMHSNSGMVGKQLYSPQEMQQ LHPGDEEAWRRIDSADGEPVLEIYRQFQPMFAAGMHRMRHMQQYAATPQAIFIAFDASNI VSAEDREQRNTLIILFALATVLLASVLSFFWYRRYLRSRQLLQDEMKRKEKLVALGHLAA GVAHEIRNPLSSIKGLAKYFAERAPAGGEAHQLAQVMAKEADRLNRVVSELLELVKPTHL ALQAVDLNTLINHSLQLVSQDANSREIQLRFTANDTLPEIQADPDRLTQVLLNLYLNAIQ AIGQHGVISVTVSESGAGVKISVTDSGKGIAADQLDAIFTPYFTTKAEGTGLGLAVVHNI 
VEQHGGTIQVASQEGKGSTFTLWLPVNITRKDPQG >gi|296493107|gb|ADTK01000394.1| GENE 17 13705 - 15030 1353 441 aa, chain + ## HITS:1 COG:hydG KEGG:ns NR:ns ## COG: hydG COG2204 # Protein_GI_number: 16131834 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 1 441 1 441 441 835 99.0 0 MTHDNIDILVVDDDISHCTILQALLRGWGYNVALANSGRQALEQVREQVFDLVLCDVRMA EMDGIATLKEIKALNPAIPVLIMTAYSSVETAVEALKTGALDYLIKPLDFDNLQATLEKA LAHTHSIDAETPAVTASQFGMVGKSPAMQHLLSEIALVAPSEATVLIHGDSGTGKELVAR AIHASSARSEKPLVTLNCAALNESLLESELFGHEKGAFTGADKRREGRFVEADGGTLFLD EIGDISPMMQVRLLRAIQEREVQRVGSNQTISVDVRLIAATHRDLAAEVNAGRFRQDLYY RLNVVAIEVPSLRQRREDIPLLAGHFLQRFAERNRKAVKGFTPQAMDLLIHYDWPGNIRE LENAVERAVVLLTGEYISERELPLAIASTPIPLGQSQDIQPLVEVEKEVILAALEKTGGN KTEAARQLGITRKTLLAKLSR >gi|296493107|gb|ADTK01000394.1| GENE 18 15027 - 16316 1578 429 aa, chain - ## HITS:1 COG:ECs4928 KEGG:ns NR:ns ## COG: ECs4928 COG0151 # Protein_GI_number: 15834182 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylamine-glycine ligase # Organism: Escherichia coli O157:H7 # 1 429 1 429 429 847 98.0 0 MKVLVIGNGGREHALAWKAAQSPLVETVFVAPGNAGTALEPALQNVAIGVTDIPALLDFA QNEKIDLTIVGPEAPLVKGVVDTFRAAGLKIFGPTAGAAQLEGSKAFTKDFLARHKIPTA EYQNFTEVEPALAYLREKGAPIVIKADGLAAGKGVIVAMTLEEAEAAVHDMLAGNAFGDA GHRIVIEEFLDGEEASFIVMVDGEHVLPMATSQDHKRVGDKDTGPNTGGMGAYSPAPVVT DEVHQRTMERIIWPTVKGMAAEGNTYTGFLYAGLMIDKQGNPKVIEFNCRFGDPETQPIM LRMKSDLVELCLAACEGKLDEKTSEWDERASLGVVMAAGGYPGDYRTGDVIHGLPLEEVE DGKVFHAGTKLADDEQVVTSGGRVLCVTALGHTVAEAQKRAYALMTDIHWDDCFCRKDIG WRAIEREQN >gi|296493107|gb|ADTK01000394.1| GENE 19 16328 - 17917 1761 529 aa, chain - ## HITS:1 COG:purH KEGG:ns NR:ns ## COG: purH COG0138 # Protein_GI_number: 16131836 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Escherichia coli K12 # 1 529 1 529 529 1033 100.0 0 
MQQRRPVRRALLSVSDKAGIVEFAQALSARGVELLSTGGTARLLAEKGLPVTEVSDYTGF PEMMDGRVKTLHPKVHGGILGRRGQDDAIMEEHQIQPIDMVVVNLYPFAQTVAREGCSLE DAVENIDIGGPTMVRSAAKNHKDVAIVVKSSDYDAIIKEMDDNEGSLTLATRFDLAIKAF EHTAAYDSMIANYFGSMVPAYHGESKEAAGRFPRTLNLNFIKKLDMRYGENSHQQAAFYI EENVKEASVATATQVQGKALSYNNIADTDAALECVKEFAEPACVIVKHANPCGVAIGNSI LDAYDRAYKTDPTSAFGGIIAFNRELDAETAQAIISRQFVEVIIAPSASEEALKITAAKQ NVRVLTCGQWGERVPGLDFKRVNGGLLVQDRDLGMVGAEELRVVTKRQPSEQELRDALFC WKVAKFVKSNAIVYAKNNMTIGIGAGQMSRVYSAKIAGIKAADEGLEVKGSSMASDAFFP FRDGIDAAAAAGVTCVIQPGGSIRDDEVIAAADEHGIAMLFTDMRHFRH Prediction of potential genes in microbial genomes Time: Mon May 16 16:21:47 2011 Seq name: gi|296493106|gb|ADTK01000395.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1267.1, whole genome shotgun sequence Length of sequence - 666 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 115 - 312 141 ## EC55989_3988 toxic polypeptide, small + Term 363 - 401 7.0 Predicted protein(s) >gi|296493106|gb|ADTK01000395.1| GENE 1 115 - 312 141 65 aa, chain + ## HITS:1 COG:no KEGG:EC55989_3988 NR:ns ## KEGG: EC55989_3988 # Name: ldrD # Def: toxic polypeptide, small # Organism: E.coli_55989 # Pathway: not_defined # 1 65 16 80 80 129 100.0 5e-29 MHLTTPRGPFLHPNNTRIAAKRCHYNTGGYMTLAELGMAFWHDLAAPVIAGILASMIVNW LNKRK Prediction of potential genes in microbial genomes Time: Mon May 16 16:21:58 2011 Seq name: gi|296493105|gb|ADTK01000396.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1267.2, whole genome shotgun sequence Length of sequence - 32604 bp Number of predicted genes - 23, with homology - 22 Number of transcription units - 12, operones - 5 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 1645 1153 ## ECH74115_4904 hypothetical protein 2 1 Op 2 . - CDS 1642 - 1833 205 ## ECO103_4265 hypothetical protein 3 1 Op 3 . 
- CDS 1830 - 3401 859 ## B21_03337 hypothetical protein - Prom 3457 - 3516 4.8 + Prom 3435 - 3494 4.3 4 2 Op 1 . + CDS 3674 - 3862 217 ## SSON_3856 hypothetical protein 5 2 Op 2 1/1.000 + CDS 3874 - 4626 716 ## COG1192 ATPases involved in chromosome partitioning 6 2 Op 3 . + CDS 4623 - 7241 2484 ## COG1215 Glycosyltransferases, probably involved in cell wall biogenesis 7 2 Op 4 . + CDS 7252 - 9591 2346 ## B21_03333 hypothetical protein 8 2 Op 5 4/0.333 + CDS 9598 - 10704 1142 ## COG3405 Endoglucanase Y 9 2 Op 6 5/0.000 + CDS 10686 - 14159 3164 ## COG0457 FOG: TPR repeat 10 2 Op 7 3/0.833 + CDS 14280 - 16229 1940 ## COG2200 FOG: EAL domain + Term 16254 - 16307 4.3 + Prom 16264 - 16323 3.6 11 3 Op 1 5/0.000 + CDS 16412 - 17698 1500 ## COG1301 Na+/H+-dicarboxylate symporters + Term 17711 - 17752 6.1 + Prom 17882 - 17941 4.0 12 3 Op 2 . + CDS 17961 - 19415 1567 ## COG0612 Predicted Zn-dependent peptidases + Term 19416 - 19463 1.7 - Term 19404 - 19449 -0.4 13 4 Tu 1 . - CDS 19511 - 20440 1001 ## COG0524 Sugar kinases, ribokinase family - Prom 20665 - 20724 2.6 + Prom 20475 - 20534 2.8 14 5 Op 1 4/0.333 + CDS 20672 - 21439 649 ## COG2200 FOG: EAL domain 15 5 Op 2 . + CDS 21509 - 23569 1717 ## COG2982 Uncharacterized protein involved in outer membrane biogenesis + Term 23627 - 23663 4.0 16 6 Tu 1 4/0.333 - CDS 23803 - 25125 1686 ## COG0477 Permeases of the major facilitator superfamily - Prom 25273 - 25332 5.4 - Term 25485 - 25533 11.2 17 7 Op 1 3/0.833 - CDS 25536 - 26549 1071 ## COG1295 Predicted membrane protein 18 7 Op 2 . - CDS 26598 - 27479 724 ## COG0583 Transcriptional regulator - Prom 27553 - 27612 1.8 + Prom 27900 - 27959 4.3 19 8 Tu 1 . + CDS 28017 - 28619 422 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain + Term 28629 - 28678 8.4 - Term 28617 - 28666 8.4 20 9 Tu 1 . - CDS 28670 - 30319 1679 ## COG1626 Neutral trehalase - Prom 30431 - 30490 6.6 + Prom 30645 - 30704 4.1 21 10 Tu 1 . 
+ CDS 30724 - 32121 1195 ## COG1858 Cytochrome c peroxidase + Term 32129 - 32170 8.8 22 11 Tu 1 . - CDS 32253 - 32399 75 ## - Prom 32471 - 32530 5.6 + Prom 32160 - 32219 6.0 23 12 Tu 1 . + CDS 32332 - 32602 218 ## COG0076 Glutamate decarboxylase and related PLP-dependent proteins Predicted protein(s) >gi|296493105|gb|ADTK01000396.1| GENE 1 1 - 1645 1153 548 aa, chain - ## HITS:1 COG:no KEGG:ECH74115_4904 NR:ns ## KEGG: ECH74115_4904 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 548 1 548 559 1064 99.0 0 MTQFTQNTAMPSSLWQYWRGLSGWNFYFLVKFGLLWAGYLNFHPLLNLVFAAFLLMPIPR YSLHRLRHWIALPIGFALFWHDTWLPGPESIMSQGSQVAGFSTDYLIDLVTRFINWQMIG AIFVLLVAWLFLSQWIRITVFVVAILLWLNVLTLAGPSFSLWPAGQPTTTVTTTGGNAAA TVAATGGAPVVGDMPAQTAPPTTANLNAWLNNFYNAEAKRKSTFPSSLPADAQPFELLVI NICSLSWSDIEAAGLMSHPLWSHFDIEFKNFNSATSYSGPAAIRLLRASCGQTSHTNLYQ PANNDCYLFDNLSKLGFTQHLMMGHNGQFGGFLKEVRENGGMQTELMDQTNLPVILLGFD GSPVYDDTAVLNRWLDVTEKDKNSRSATFYNTLPLHDGNHYPGVSKTADYKARAQKFFDE LDAFFTELEKSGRKVMVVVVPEHGGALKGDRMQVSGLRDIPSPSITDVPVEVKFFGMKAP HQGAPIVIDQPSSFLAISDLVVRVLDGKIFTEDNVDWKKLTSGLPQTAPVSENSNAVVIQ YQDKPYVR >gi|296493105|gb|ADTK01000396.1| GENE 2 1642 - 1833 205 63 aa, chain - ## HITS:1 COG:no KEGG:ECO103_4265 NR:ns ## KEGG: ECO103_4265 # Name: bcsF # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 63 1 63 63 104 100.0 9e-22 MMTISDIIEIIVVCALIFFPLGYLARHSLRRIRDTLRLFFAKPRYVKPAGTLRRTEKARA TKK >gi|296493105|gb|ADTK01000396.1| GENE 3 1830 - 3401 859 523 aa, chain - ## HITS:1 COG:no KEGG:B21_03337 NR:ns ## KEGG: B21_03337 # Name: bcsE # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 523 1 523 523 1065 100.0 0 MRDIVDPVFSIGISSLWDELRHMPAGGVWWFNVDRHEDAISLANQTIASQAETAHVAVIS MDSDPAKIFQLDDSQGPEKIKLFSMLNHEKGLYYLTRDLQCSIDPHNYLFILVCANNAWQ NIPAERLRSWLDKMNKWSRLNHCSLLVINPGNNNDKQFSLLLEEYRSLFGLASLRFQGDQ HLLDIAFWCNEKGVSARQQLSVQQQNGIWTLVQSEEAEIQPRSDEKRILSNVAVLEGAPP LSEHWQLFNNNEVLFNEARTAQAATVVFSLQQNAQIEPLARSIHTLRRQRGSAMKILVRE 
NTASLRATDERLLLACGANMVIPWNAPLSRCLTMIESVQGQKFSRYVPEDITTLLSMTQP LKLRGFQKWDVFCNAVNNMMNNPLLPAHGKGVLVALRPVPGIRVEQALTLCRPNRTGDIM TIGGNRLVLFLSFCRINDLDTALNHIFPLPTGDIFSNRMVWFEDDQISAELVQMRLLAPE QWGMPLPLTQSSKPVINAEHDGRHWRRIPEPMRLLDDAVERSS >gi|296493105|gb|ADTK01000396.1| GENE 4 3674 - 3862 217 62 aa, chain + ## HITS:1 COG:no KEGG:SSON_3856 NR:ns ## KEGG: SSON_3856 # Name: yhjR # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 62 1 62 62 108 100.0 7e-23 MNNNEPDTLPDPAIGYIFQNDIVALKQAFSLPDIDYADISQREQLAAALKRWPLLAEFAQ QK >gi|296493105|gb|ADTK01000396.1| GENE 5 3874 - 4626 716 250 aa, chain + ## HITS:1 COG:yhjQ KEGG:ns NR:ns ## COG: yhjQ COG1192 # Protein_GI_number: 16131406 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Escherichia coli K12 # 9 250 1 242 242 477 99.0 1e-134 MAVLGLQGVRGGVGTTTITAALAWSLQMLGENVLVVDACPDNLLRLSFNVDFTHRQGWAR AMLDGQDWRDAGLRYTSQLDLLPFGQLSIEEQENPQHWQTRLSDICSGLQQLKASGRYQW ILIDLPRDASQITHQLLSLCDHSLAIVNVDANCHIRLHQQALPDGAHILINDFRIGSQVQ DDIYQLWLQSQRRLLPMLIHRDEAMAECLAAKQPVGEYRSDALAAEEILTLANWCLLNYS GLKTPVGSAS >gi|296493105|gb|ADTK01000396.1| GENE 6 4623 - 7241 2484 872 aa, chain + ## HITS:1 COG:yhjO KEGG:ns NR:ns ## COG: yhjO COG1215 # Protein_GI_number: 16131405 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases, probably involved in cell wall biogenesis # Organism: Escherichia coli K12 # 1 872 17 888 888 1809 99.0 0 MSILTRWLLIPPVNARLIGRYRDYRRHGASAFSATLGCFWMILAWIFIPLEHPRWQRIRA EHKNLYPHINASRPRPLDPVRYLIQTCWLLIGTSRKETPKPRRRAFSGLQNIRGRYHQWM NELPERVSHKTQHLDEKKELGHLSAGARRLILGIIVTFSLILALICVTQPFNPLAQFIFL MLLWGVALIVRRMPGRFSALMLIVLSLTVSCRYIWWRYTSTLNWDDPVSLVCGLILLFAE TYAWIVLVLGYFQVVWPLNRQPVPLPKDMSLWPSVDIFVPTYNEDLNVVKNTIYASLGID WPKDKLNIWILDDGGREEFRQFAQNVGVKYIARTTHEHAKAGNINNALKYAKGEFVSIFD CDHVPTRSFLQMTMGWFLKEKQLAMMQTPHHFFSPDPFERNLGRFRKTPNEGTLFYGLVQ DGNDMWDATFFCGSCAVIRRKPLDEIGGIAVETVTEDAHTSLRLHRRGYTSAYMRIPQAA 
GLATESLSAHIGQRIRWARGMVQIFRLDNPLTGKGLKFAQRLCYVNAMFHFLSGIPRLIF LTAPLAFLLLHAYIIYAPALMIALFVLPHMIHASLTNSKIQGKYRHSFWSEIYETVLAWY IAPPTLVALINPHKGKFNVTAKGGLVEEEYVDWVISRPYIFLVLLNLVGVAVGIWRYFYG PPTEMLTVVVSMVWVFYNLIVLGGAVAVSVESKQVRRSHRVEMTMPAAIAREDGHLFSCT VQDFSDGGLGIKINGQAQILEGQKVNLLLKRGQQEYVFPTQVARVMGNEVGLKLMPLTTQ QHIDFVQCTFARADTWALWQDSYPEDKPLESLLDILKLGFRGYRHLAEFAPSSVKGIFRV LTSLVSWVVSFIPRRPERSETAQPSDQALAQQ >gi|296493105|gb|ADTK01000396.1| GENE 7 7252 - 9591 2346 779 aa, chain + ## HITS:1 COG:no KEGG:B21_03333 NR:ns ## KEGG: B21_03333 # Name: bcsB # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 779 1 779 779 1516 100.0 0 MKRKLFWICAVAMGMSAFPSFMTQATPATQPLINAEPAVAAQTEQNPQVGQVMPGVQGAD APVVAQNGPSRDVKLTFAQIAPPPGSMVLRGINPNGSIEFGMRSDEVVTKAMLNLEYTPS PSLLPVQSQLKVYLNDELMGVLPVTKEQLGKKTLAQMPINPLFITDFNRVRLEFVGHYQD VCENPASTTLWLDVGRSSGLDLTYQTLNVKNDLSHFPVPFFDPRDNRTNTLPMVFAGAPD VGLQQASAIVASWFGSRSGWRGQNFPVLYNQLPDRNAIVFATNDKRPDFLRDHPAVKAPV IEMINHPQNPYVKLLVVFGRDDKDLLQAAKGIAQGNILFRGESVVVNEVKPLLPRKPYDA PNWVRTDRPVTFGELKTYEEQLQSSGLEPAAINVSLNLPPDLYLMRSTGIDMDINYRYTM PPVKDSSRMDISLNNQFLQSFNLSSKQEANRLLLRIPVLQGLLDGKTDVSIPALKLGATN QLRFDFEYMNPMPGGSVDNCITFQPVQNHVVIGDDSTIDFSKYYHFIPMPDLRAFANAGF PFSRMADLSQTITVMPKAPNEAQMETLLNTVGFIGAQTGFPAINLTVTDDGSTIQGKDAD IMIIGGIPDKLKDDKQIDLLVQATESWVKTPMRQTPFPGIVPDESDRAAETRSTLTSSGA MAAVIGFQSPYNDQRSVIALLADSPRGYEMLNDAVNDSGKRATMFGSVAVIRESGINSLR VGDVYYVGHLPWFERLWYALANHPILLAVLAAISVILLAWVLWRLLRIISRRRLNPDNE >gi|296493105|gb|ADTK01000396.1| GENE 8 9598 - 10704 1142 368 aa, chain + ## HITS:1 COG:ECs4411 KEGG:ns NR:ns ## COG: ECs4411 COG3405 # Protein_GI_number: 15833665 # Func_class: G Carbohydrate transport and metabolism # Function: Endoglucanase Y # Organism: Escherichia coli O157:H7 # 1 368 1 368 368 724 99.0 0 MNVLRSGIVTMLLLAAFSVQAACTWPAWEQFKKDYISQEGRVIDPSDARKITTSEGQSYG MFFALAANDRVAFDNILDWTQNNLAQGSLKERLPAWLWGKKENSKWEVLDSNSASDGDVW MAWSLLEAGRLWKEQRYTDIGSALLKRIAREEVVTVPGLGSMLLPGKVGFAEDNSWRFNP SYLPPTLAQYFTRFGAPWTTLRETNQRLLLETAPKGFSPDWVRYEKDKGWQLKAEKTLIS 
SYDAIRVYMWVGMMPDSDPQKARMLNRFKPMATFTEKNGYPPEKVDVATGKAQGKGPVGF SAAMLPFLQNRDAQAVQRQRVADNFPGSDAYYNYVLTLFGQGWDQHRFRFSTKGELLPDW GQECANSH >gi|296493105|gb|ADTK01000396.1| GENE 9 10686 - 14159 3164 1157 aa, chain + ## HITS:1 COG:ZyhjLm KEGG:ns NR:ns ## COG: ZyhjLm COG0457 # Protein_GI_number: 15804991 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Escherichia coli O157:H7 EDL933 # 1 1157 1 1154 1154 2122 99.0 0 MRKFTLNIFTLSLGLAVMPMVEAAPTAQQQLLEQVRLGEATHREDLVQQSLYRLELIDPN NPDVVAARFRSLLRQGDIDGAQKQLDRLSQLAPSSNAYKSSRTTMLLSTPDGRQALQQAR LQATTGHAEEAVASYNKLFNGAPPEGDIAVEYWSTVAKIPARRGEAINQLKRINADAPGN TGLQNNLALLLFSSDRRDEGFAVLEQMAKSNAGREGASKIWYGQIKDMPVSDASVSALKK YLSIFSDGDSVAAAQSQLAEQQKQLADPAFRARAQGLAAVDSGMAGKAIPELQQAVRANP KDSEALGALGQAYSQKGDRANAVANLEKALALDPHSSNNDKWNSLLKVNRYWLAIQQGDA ALKANNPDRAERLFQQARNVDNTDSYAVLGLGDVAMARKDYPAAERYYQQTLRMDSGNTN AVRGLANIYRQQSPEKAEAFIASLSASQRRSIDDIERSLQNDRLAQQAEALENQGKWAQA AALQRQRLALDPGSVWITYRLSQDLWQAGQRSQADTLMRNLAQQKPNDPEQVYAYGLYLS GHDQDRAALAHINSLPRAQWNSNIQELVNRLQSDQVLETANRLRESGKEAEAEAMLRQQP PSTRIDLTLADWAQQRRDYTAARAAYQNVLTREPTNADAILGLTEVDIAAGDTAAARSQL AKLPATDNASLNTQRRVALAQAQLGDTAAAQQTFNKLIPQAKSQPPSMESAMVLRDGAKF EAQAGDPKQALETYKDAMVASGVTTTRPQDNDTFTRLTRNDEKDDWLKRGVRSDAADLYR QQDLNVTLEHDYWGSSGTGGYSDLKAHTTMLQVDAPYSDGRMFFRSDFVNMNVGSFSTNA DGKWDDNWGTCTLQDCSGNRSQSDSGASVAVGWRNDVWSWDIGTTPMGFNVVDVVGGISY SDDIGPLGYTVNAHRRPISSSLLAFGGQKDSPSNTGKKWGGVRADGVGLSLSYDKGEANG VWASLSGDQLTGKNVEDNWRVRWMTGYYYKVINQNNRRVTIGLNNMIWHYDKDLSGYSLG QGGYYSPQEYLSFAIPVMWRERTENWSWELGASGSWSHSRTKTMPRYPLMNLIPTDWQEE AARQSNDGGSSQGFGYTARALLERRVTSNWFVGTAIDIQQAKDYAPSHFLLYVRYSAAGW QGDMDLPPQPLIPYADW >gi|296493105|gb|ADTK01000396.1| GENE 10 14280 - 16229 1940 649 aa, chain + ## HITS:1 COG:ECs4409_3 KEGG:ns NR:ns ## COG: ECs4409_3 COG2200 # Protein_GI_number: 15833663 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli O157:H7 # 387 649 1 263 263 509 99.0 1e-144 
MVAAVVLVFVFIFCTVLLFHLVQQNRYNTATQLESIARSVREPLSSAILKGDIPEAEAIL ASIKPAGVVSRADVVLPNQFQALRKSFIPERPVPVMVTRLFELPVQISLGVYSLERPANP QPIAYLVLQADSFRMYKFVMSTLSTLVTIYLLLSLILTVAISWCINRLILHPLRNIAREL NAIPAQELVGHQLALPRLHQDDEIGMLVRSYNLNQQLLQRHYEEQNENAMRFPVSDLPNK ALLMEMLEQVVARKQTTALMIITCETLRDTAGVLKEAQREILLLTLVEKLKSVLSPRMIL AQISGYDFAVIANGVQEPWHAITLGQQVLTIMSERLPIERIQLRPHCSIGVAMFYGDLTA EQLYSRAISAAFTAHHKGKNQIQFFDPQQMEAAQKRLTEESDILNALENHQFAIWLQPQV EMTSGKLVSAEVLLRIQQPDGSWDLPDGLIDRIECCGLMVTVGHWVLEESCRLLAAWQER GIMLPLSVNLSALQLMHPNMVADMLELLTRYRIQPGTLILEVTESRRIDDPHAAVAILRP LRNAGVRVALDDFGMGYAGLRQLQHMKSLPIDVLKIDKMFVEGLPEDSSMIAAIIMLAQS LNLQMIAEGVETEAQRDWLAKAGVGIAQGFLFARPLPIEIFEESYLEEK >gi|296493105|gb|ADTK01000396.1| GENE 11 16412 - 17698 1500 428 aa, chain + ## HITS:1 COG:dctA KEGG:ns NR:ns ## COG: dctA COG1301 # Protein_GI_number: 16131400 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Escherichia coli K12 # 1 428 1 428 428 733 100.0 0 MKTSLFKSLYFQVLTAIAIGILLGHFYPEIGEQMKPLGDGFVKLIKMIIAPVIFCTVVTG IAGMESMKAVGRTGAVALLYFEIVSTIALIIGLIIVNVVQPGAGMNVDPATLDAKAVAVY ADQAKDQGIVAFIMDVIPASVIGAFASGNILQVLLFAVLFGFALHRLGSKGQLIFNVIES FSQVIFGIINMIMRLAPIGAFGAMAFTIGKYGVGTLVQLGQLIICFYITCILFVVLVLGS IAKATGFSIFKFIRYIREELLIVLGTSSSESALPRMLDKMEKLGCRKSVVGLVIPTGYSF NLDGTSIYLTMAAVFIAQATNSQMDIVHQITLLIVLLLSSKGAAGVTGSGFIVLAATLSA VGHLPVAGLALILGIDRFMSEARALTNLVGNGVATIVVAKWVKELDHKKLDDVLNNRAPD GKTHELSS >gi|296493105|gb|ADTK01000396.1| GENE 12 17961 - 19415 1567 484 aa, chain + ## HITS:1 COG:ECs4407 KEGG:ns NR:ns ## COG: ECs4407 COG0612 # Protein_GI_number: 15833661 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Escherichia coli O157:H7 # 1 484 15 498 498 897 99.0 0 MMATAGYVQADALQPDPAWQQGTLSNGLQWQVLTTPQRPSDRVEIRLLVNTGSLAESTQQ SGYSHAIPRIALTQSGGLDAAQARSLWQQGIDPKRPMPPVIVSYDTTLFNLSLPNNRNDL LKEALSYLANATGKLTITPETINHALQSQDMVATWPADTKEGWWRYRLKGSTLLGHDPAD PLKQPVEAEKIKDFYQKWYTPDAMTLLVVGNVDARSVVDQINKTFGELKGKRETPAPVPT 
LSPLRAEAVSIMTDAVRQDRLSIMWDTPWQPIRESAALLRYWRADLAREALFWHVQQALS ASNSKDIGLGFDCRVLYLRAQCAINIESPNDKLNSNLNLVARELAKVRDKGLPEEEFNAL VAQKKLELQKLFAAYARADTDILMGQRMRSLQNQVIDIAPEQYQKLRQDFLNSLTVEMLN QDLRQQLSNDMALILLQPKGEPEFNMKALQAAWDQIMAPSTAAATTSVATDDVHPEVTDI PPAQ >gi|296493105|gb|ADTK01000396.1| GENE 13 19511 - 20440 1001 309 aa, chain - ## HITS:1 COG:ECs4406 KEGG:ns NR:ns ## COG: ECs4406 COG0524 # Protein_GI_number: 15833660 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli O157:H7 # 1 309 74 382 382 626 99.0 1e-179 MSKKIAVIGECMIELSEKGADVKRGFGGDTLNTSVYIARQVDPAALTVHYVTALGTDSFS QQMLDAWHGENVDTSLTQRMENRLPGLYYIETDSTGERTFYYWRNEAAAKFWLESEQSAA ICEELANFDYLYLSGISLAILSPTSREKLLSLLRECRANGGKVIFDNNYRPRLWASKEET QQVYQQMLECTDIAFLTLDDEDALWGQQPVEDVIARTHNAGVKEVVVKRGADSCLVSIAG EGLVDVPAVKLPKEKVIDTTAAGDSFSAGYLAVRLTGGSAENAAKRGHLTASTVIQYRGA IIPREAMPA >gi|296493105|gb|ADTK01000396.1| GENE 14 20672 - 21439 649 255 aa, chain + ## HITS:1 COG:yhjH KEGG:ns NR:ns ## COG: yhjH COG2200 # Protein_GI_number: 16131397 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 1 255 2 256 256 506 99.0 1e-143 MIRQVIQRISNPEASIESLQERRFWLQCERAYTWQPIYQTCGRLMAVELLTVVAHPLNPS QRLPPDRYFTEITVSHRMEVVKEQIDLLAQKADFFIEHGLLASVNIDGPTLIALRQQPKI LRQIERLPWLRFELVEHIRLPKDSTFASMCEFGPLWLDDFGTGMANFSALSEVRYDYIKI ARELFVMLRQSPEGRTLFSQLLHLMNRYCRGVIVEGVETPEEWRDVQNSPAFAAQGWFLS RPAPIETLNTAVLAL >gi|296493105|gb|ADTK01000396.1| GENE 15 21509 - 23569 1717 686 aa, chain + ## HITS:1 COG:yhjG KEGG:ns NR:ns ## COG: yhjG COG2982 # Protein_GI_number: 16131396 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein involved in outer membrane biogenesis # Organism: Escherichia coli K12 # 1 686 6 691 691 1327 100.0 0 MSKAGKITAAISGAFLLLIVVAIILIATFDWNRLKPTINQKVSAELNRPFAIRGDLGVVW ERQKQETGWRSWVPWPHVHAEDIILGNPPDIPEVTMVHLPRVEATLAPLALLTKTVWLPW IKLEKPDARLIRLSEKNNNWTFNLANDDNKDANAKPSAWSFRLDNILFDQGRIAIDDKVS 
KADLEIFVDPLGKPLPFSEVTGSKGKADKEKVGDYVFGLKAQGRYNGEPLTGTGKIGGML ALRGEGTPFPVQADFRSGNTRVAFDGVVNDPMKMGGVDLRLKFSGDSLGDLYELTGVLLP DTPPFETDGRLVAKIDTEKSSVFDYRGFNGRIGDSDIHGSLVYTTGKPRPKLEGDVESRQ LRLADLGPLIGVDSGKGAEKSKRSEQKKGEKSVQPAGKVLPYDRFETDKWDVMDADVRFK GRRIEHGSSLPISDLSTHIILKNADLRLQPLKFGMAGGSIAANIHLEGDKKPMQGRADIQ ARRLKLKELMPDVELMQKTLGEMNGDAELRGSGNSVAALLGNSNGNLKLLMNDGLVSRNL MEIVGLNVGNYIVGAIFGDDEVRVNCAAANLNIANGVARPQIFAFDTENALINVTGTASF ASEQLDLTIDPESKGIRIITLRSPLYVRGTFKNPQAGVKAGPLIARGAVAAALATLVTPA AALLALISPSEGEANQCRTILSQMKK >gi|296493105|gb|ADTK01000396.1| GENE 16 23803 - 25125 1686 440 aa, chain - ## HITS:1 COG:yhjE KEGG:ns NR:ns ## COG: yhjE COG0477 # Protein_GI_number: 16131395 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 440 1 440 440 682 99.0 0 MQATATTLDHEQEYTPINSRNKVLVASLIGTAIEFFDFYIYATAAVIVFPHIFFPQGDPT AATLQSLATFAIAFVARPIGSAVFGHFGDRVGRKATLVASLLTMGISTVVIGLLPGYATI GIFAPLLLALARFGQGLGLGGEWGGAALLATENAPPRKRALYGSFPQLGAPIGFFFANGT FLLLSWLLTDEQFMSWGWRVPFIFSAVLVIIGLYVRVSLHESPVFEKVAKAKKQVKIPLG TLLTKHVRVTVLGTFIMLATYTLFYIMTVYSMTFSTAAAPVGLGLPRNEVLWMLMMAVIG FGVMVPVAGLLADAFGRRKSMVIITTLIILFALFAFNPLLGSGNPILVFAFLLLGLSLMG LTFGPMGALLPELFPTEVRYTGASFSYNVASILGASVAPYIAAWLQANYGLGAVGLYLAA MAGLTLIALLLTHETRHQSL >gi|296493105|gb|ADTK01000396.1| GENE 17 25536 - 26549 1071 337 aa, chain - ## HITS:1 COG:ECs4402 KEGG:ns NR:ns ## COG: ECs4402 COG1295 # Protein_GI_number: 15833656 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 337 1 337 337 663 100.0 0 MTQENEIKRPTQDLEHEPIKQLDNSEKGGKVSQALETVTTTAEKVQRQPVIAHLIRATER FNDRLGNQFGAAITYFSFLSMIPILMVSFAAGGFVLASHPMLLQDIFDKILQNISDPTLA ATLKNTINTAVQQRTTVGLVGLAVALYSGINWMGNLREAIRAQSRDVWERSPQDQEKFWV KYLRDFISLIGLLIALIVTLSITSVAGSAQQMIISALHLNSIEWLKPTWRLIGLAISIFA 
NYLLFFWIFWRLPRHRPRKKALIRGTFLAAIGFEVIKIVMTYTLPSLMKSPSGAAFGSVL GLMAFFYFFARLTLFCAAWIATAEYKDDPRMPGKTQP >gi|296493105|gb|ADTK01000396.1| GENE 18 26598 - 27479 724 293 aa, chain - ## HITS:1 COG:yhjC KEGG:ns NR:ns ## COG: yhjC COG0583 # Protein_GI_number: 16131393 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 293 31 323 323 585 99.0 1e-167 MQLFIKVAELESFSRAADFFALPKGSVSRQIQALEHQLGTQLLQRTTRRVKLTPEGMTYY QRAKDVLSNLSELDGLFQQDATSISGKLRIDIPPGIAKSLLLPRLSEFLYLHPGIELELS SHDRPVDILHDGFDCVIRTGALPEDGVIARPLGKLTMVNCASPHYLTRFGYPQSPDDLTS HAIVRYTPHLGVHPLGFEVASVNGVQWFKSGGMLTVNSSENYLAAGLAGLGIIQIPRIAV REALRAGRLIEVLPGYRAEPLSLSLVYPQRRELSRRVNLFMQWLAGVMKEHLD >gi|296493105|gb|ADTK01000396.1| GENE 19 28017 - 28619 422 200 aa, chain + ## HITS:1 COG:yhjB KEGG:ns NR:ns ## COG: yhjB COG2197 # Protein_GI_number: 16131392 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli K12 # 1 200 1 200 200 394 100.0 1e-110 MQIVMFDRQSIFIHGMKISLQQRIPGVSIQGASQADELWQKLESYPEALVMLDGDQDGEF CYWLLQKTVVQFPEVKVLITATDCNKRWLQEVIHFNVLAIVPRDSTVETFALAVNSAAMG MMFLPGDWRTTPEKDIKDLKSLSARQREILTMLAAGESNKEIGRALNISTGTVKAHLESL YRRLEVKNRTQAAMMLNISS >gi|296493105|gb|ADTK01000396.1| GENE 20 28670 - 30319 1679 549 aa, chain - ## HITS:1 COG:ECs4399 KEGG:ns NR:ns ## COG: ECs4399 COG1626 # Protein_GI_number: 15833653 # Func_class: G Carbohydrate transport and metabolism # Function: Neutral trehalase # Organism: Escherichia coli O157:H7 # 1 549 1 549 549 1112 99.0 0 MLNQKIQNPNPDELMIEVDLCYELDPYELKLDEMIEAEPEPEMIEGLPASDALTPADRYL ELFEHVQSAKIFPDSKTFPDCAPKMDPLDILIRYRKVRRHRDFDLRKFVENHFWLPEVYS SEYVSDPQNSLKEHIDQLWPVLTREPQDHIPWSSLLALPQSYIVPGGRFSETYYWDSYFT MLGLAESGREDLLKCMADNFAWMIENYGHIPNGNRTYYLSRSQPPVFALMVELFEEDGVR GARRYLDHLKMEYAFWMDGAESLIPNQAYRHVVRMPDGSLLNRYWDDRDTPRDESWLEDV ETAKHSGRPPNEVYRDLRAGAASGWDYSSRWLRDTGRLASIRTTQFIPIDLNAFLFKLES 
AIANISALKGEKETEALFRQKASARRDAVNRYLWDDENGIYRDYDWRREQLALFSAAAIV PLYVGMANHEQADRLANAVRSRLLTPGGILASEYETGEQWDKPNGWAPLQWMAIQGFKMY GDDLLGDEIARSWLKTVNQFYLEQHKMIEKYHIADGVPREGGGGEYPLQDGFGWTNGVVR RLIGLYGEP >gi|296493105|gb|ADTK01000396.1| GENE 21 30724 - 32121 1195 465 aa, chain + ## HITS:1 COG:yhjA KEGG:ns NR:ns ## COG: yhjA COG1858 # Protein_GI_number: 16131390 # Func_class: P Inorganic ion transport and metabolism # Function: Cytochrome c peroxidase # Organism: Escherichia coli K12 # 1 465 1 465 465 968 99.0 0 MKMVSRITAIGLAGVAICYLGLSGYVWYHDNKRSKQADVQASAVSENNKVLGFLREKGCD YCHTPSAELPAYYYIPGAKQLMDYDIKLGYKSFNLEAVRAALLADKPVSQSDLNKIEWVM QYETMPPTRYTALHWAGKVSDEERAEILAWIAKQRAEYYASNDTAPEHRNEPVQPIPQKL PTDAQKVALGFALYHDPRLSADSTISCAHCHALNAGGVDGRKTSIGVGGAVGPINAPTVF NSVFNVEQFWDGRAATLQDQAGGPPLNPIEMASKSWDEIIAKLEKDPQLKAQFLEVYPQG FSGENITDAIAEFEKTLITPDSPFDKWLRGDENALTAQQKKGYQLFKDNKCATCHGGIIL GGRSFEPLGLKKDFNFGEITAADIGRMNVTKEERDKLRQKVPGLRNVALTAPYFHRGDVP TLDGAVKLMLRYQVGKELPQEDVDDIVAFLHSLNGVYTPYMQDKQ >gi|296493105|gb|ADTK01000396.1| GENE 22 32253 - 32399 75 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAFAPKRESSSSERKSVNSFWSISNSLNLFEGNKKVGFIRNGSKALQA >gi|296493105|gb|ADTK01000396.1| GENE 23 32332 - 32602 218 90 aa, chain + ## HITS:1 COG:ECs4397 KEGG:ns NR:ns ## COG: ECs4397 COG0076 # Protein_GI_number: 15833651 # Func_class: E Amino acid transport and metabolism # Function: Glutamate decarboxylase and related PLP-dependent proteins # Organism: Escherichia coli O157:H7 # 1 90 1 90 466 192 100.0 8e-50 MDQKLLTDFRSELLDSRFGAKAISTIAESKRFPLHEMRDDVAFQIINDELYLDGNARQNL ATFCQTWDDENVHKLMDLSINKNWIDKEEY Prediction of potential genes in microbial genomes Time: Mon May 16 16:22:32 2011 Seq name: gi|296493104|gb|ADTK01000397.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1273.1, whole genome shotgun sequence Length of sequence - 15594 bp Number of predicted genes - 17, with homology - 16 Number of transcription units - 7, operones - 3 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 211 - 825 326 ## SeHA_A0112 typeIV prepilin 2 1 Op 2 24/0.000 - CDS 843 - 1940 499 ## COG1459 Type II secretory pathway, component PulF 3 1 Op 3 . - CDS 1953 - 3506 653 ## COG2804 Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB 4 1 Op 4 . - CDS 3517 - 3969 219 ## SeHA_A0115 type IV pilus biogenesis protein PilP 5 1 Op 5 . - CDS 3956 - 5251 272 ## SeHA_A0116 PilO protein 6 1 Op 6 . - CDS 5244 - 6926 588 ## COG1450 Type II secretory pathway, component PulD 7 1 Op 7 . - CDS 6940 - 7377 260 ## SeHA_A0118 PilM protein 8 1 Op 8 . - CDS 7377 - 8444 191 ## SeHA_A0119 PilL protein - Prom 8675 - 8734 5.3 9 2 Tu 1 . - CDS 9068 - 9364 128 ## SeHA_A0122 PilJ protein - Prom 9396 - 9455 5.5 10 3 Op 1 . - CDS 9540 - 9794 119 ## SeHA_A0125 hypothetical protein 11 3 Op 2 . - CDS 9697 - 10020 68 ## 12 3 Op 3 . - CDS 9956 - 10507 205 ## SeHA_A0126 hypothetical protein 13 4 Tu 1 . - CDS 10680 - 11363 340 ## SeHA_A0127 hypothetical protein - Prom 11482 - 11541 2.2 - Term 11578 - 11607 3.5 14 5 Tu 1 . - CDS 11617 - 12150 -91 ## ECSE_P1-0123 TraB protein - Prom 12174 - 12233 1.8 - Term 12554 - 12587 4.1 15 6 Op 1 . - CDS 12592 - 12879 85 ## SeHA_A0131 TraA 16 6 Op 2 . - CDS 12797 - 13015 143 ## gi|83404893|ref|YP_424908.1| hypothetical protein pCoo129 - Prom 13154 - 13213 6.3 + Prom 13905 - 13964 3.0 17 7 Tu 1 . 
+ CDS 14033 - 15064 880 ## SeHA_A0002 replication initiation protein Predicted protein(s) >gi|296493104|gb|ADTK01000397.1| GENE 1 211 - 825 326 204 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0112 NR:ns ## KEGG: SeHA_A0112 # Name: not_defined # Def: typeIV prepilin # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 204 1 204 204 294 98.0 1e-78 MLVENINTTLTGNNKKNEPHDKGWGILEQGTIALVVLFVIVVVLGSLYALRTRTNVATET ANIQTIITSAQSLLKGSDGYTFTSSAKMTGALIQMGAVPSGMTVQGDKTSGTATLYNAWG GAVTVAPASTSGFNNGFTVTYDKVPQDACIQIATRISKTGLTNGITLNSTAHSDGKVTTE EASTQCKADNGSTGTNKLIFTING >gi|296493104|gb|ADTK01000397.1| GENE 2 843 - 1940 499 365 aa, chain - ## HITS:1 COG:VC0836 KEGG:ns NR:ns ## COG: VC0836 COG1459 # Protein_GI_number: 15640853 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Vibrio cholerae # 17 275 2 255 340 62 24.0 9e-10 MREMNFSQRLRRFIVRKTFSAPYRVQFYEALRFLLENKQPLKTALEQMRDAWTDFGRKWH PFAELATDCIESLRENSGENSLEYTLSLWVPQEEAAVISAGIRSGSIVDALQFATTLTDA KEQIHQAIWQMAIYPVGLLIMMTGTLYVLNTELIPELSKISSPDSWSGALGFLYGLSVFV DNYGAICAVLFAVITGLISWSLPNWKSPDSVRTFADKIMPWSIYQDIQGATFLLNMAALL KAKMTTLNSLNILQEFASPWLSTRLDSIIYRVRQGDHLGLALRQCGYQFPSREAANFLSL LQGDGATELISNYGQRWLSQTLQRVKKRANVIRLIMLIFLVMSLMLLVFAIMDIQSISDN SMGNF >gi|296493104|gb|ADTK01000397.1| GENE 3 1953 - 3506 653 517 aa, chain - ## HITS:1 COG:VC0835 KEGG:ns NR:ns ## COG: VC0835 COG2804 # Protein_GI_number: 15640852 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB # Organism: Vibrio cholerae # 81 472 82 443 503 172 30.0 2e-42 MDKYNLKDALLFISRGDTHEILIETNQRTRPDVQSNLQELLKLYPDINPKVVSLSELQEA GQDEKNRGPKERIIHLKDLADVSESEKKVLSYFETARKLGASDIHFLISESIFKVRMRIF GELQTVDEDQPALGYSLCATAILSMADVTETSFFPQREQDARLSPQLMRKIGIFGARYSH RPTGDGLIAVMRLIPDDGDKVPTFKQLGFIPEQIRLLNIMLRRPEGKIVLSGPTGSGKST 
TLRSACRVYLDDNQGRHLLTIEDPLEGQILGATQTPIICDKSDEDAVKLAWSRAISSAMR LDPDAIMEGEMRDLISMMSTTYAAQTGHIVLTTLHTNSALGIPERMITMGMNADLICDAQ LLIGMISQRLVPTLCPSCRIPWEKRAPELSDDERDYLERHCNKDSLCSTDNIWFRNHHGC SECNHDVIINGRKRGEIGKGLTGRTVIAEVIEPDNRLFQILKTRGKVAARKYWLENMKGI SRVEHLLRRINEGLVDPLEADRIIPLDEDERLSIDDV >gi|296493104|gb|ADTK01000397.1| GENE 4 3517 - 3969 219 150 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0115 NR:ns ## KEGG: SeHA_A0115 # Name: pilP # Def: type IV pilus biogenesis protein PilP # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 150 1 150 150 215 100.0 4e-55 MRPGKLLIIPSIFFFSGFSFATTQPLVTIGELEAQQNRNILLQAKVQGAQLQKQLEESDV TSSSETVSGFSGVTASLPSVSEQPTSKNRKELPIIMEINGKDKRLNAVLRMADGRQTSVT TGSQLPGTSVTVKSISLSGVTLSDGTTLTF >gi|296493104|gb|ADTK01000397.1| GENE 5 3956 - 5251 272 431 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0116 NR:ns ## KEGG: SeHA_A0116 # Name: not_defined # Def: PilO protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 431 1 431 431 827 100.0 0 MADEDINPVVLLADPKVNHRVWAACLKWSPVVKKQRVPSHQKHKPHVKSRRLTSLKVTVG SRSSRGKISRITGTGILARPERNHYFSLALAFCSWVRNGYGVFRYSDKELLFLASINGQP AVMADLSGNDADVAQKVSLFLTMNEEPPEKWQVVSPLEHPDNWESIITRLSSADLRRCKL TVGNRSKFTLPAVLFLVAAGAGTVFWMTQPEPDVVPTPEEIAARARLQFKKPDPPPELPH PWASQPVISDFLKACADLRKPSPVALEGWKLTGGTCTPETFTLIYERQPGGTIEGFLARS KEIFNVIPDFNLKDGARLASVTRPLPSLPRRDEAVPTPSEQLMRVFTWFQKKQLTPAINE IAIPEPLPGNDGEPAPVQKWKEYQFSLSTPVNPDELFPLFQDTGVRLSNIHFELNGGTFS YSSEGHIYASR >gi|296493104|gb|ADTK01000397.1| GENE 6 5244 - 6926 588 560 aa, chain - ## HITS:1 COG:Cj1474c KEGG:ns NR:ns ## COG: Cj1474c COG1450 # Protein_GI_number: 15792789 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulD # Organism: Campylobacter jejuni # 215 552 110 453 472 63 22.0 1e-09 MKKSHQRSMKLAVLPCMIAVALSISGCTFSEINKMQKKAQEDSAHAREKVSALSARKSQA LTWLDNQWINPVPVAQVSREKKQTAPACYITQARKGEITLQELGQRITAVCGIPVIITPD AANSTLEGGATRQMTGTLPAPDENGRLPLSSLGSTTMTTSTQPLTLNNLMWQGDINGLLD 
LMASRSGLYWRMDNGRIVFYLTETRTYPLHMLNTKTSSSSSVSSGSTSTMGATGGQDNSA SGDATSSQSTTVGQEYDLYEDIRKTIEAMLTPEKGRYWLSASSSTLTVTDTPAVQEAVAR YVDEQNSIMNRQVALNVQVLSVSNTRNEQFGLDWNLVYKSLHSAGATLNNASGDFTGATS AGVSILDTATGNAAKFSGSSLLIKALSEQGDVSVVTSQESTVTNLTPVPIQMADQTVYVA QSATTTTTDVGATTTLTPGMITTGFNMTLLPLIQKTGNLQLQMNFNLSDPPTIRSFTSKD GNSYIEMPYTKLRSLSQKVNLKEGQSLVVTGFDQNNTTTSKAGTFTPANPLFGGSQTGKN ERSTLVIIITPTFPSGGNNG >gi|296493104|gb|ADTK01000397.1| GENE 7 6940 - 7377 260 145 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0118 NR:ns ## KEGG: SeHA_A0118 # Name: not_defined # Def: PilM protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 145 1 145 145 258 100.0 4e-68 MGWLVMAVGLTILIITGSYQNQKMSETTNAQQYASASVWASQILMIANRINDIRYVSGQQ DGVISSDKLALPVTPDSRIKHQLQQGRLWVWMPEQPGLVETLRSKSRGSALIGIFQNGQL TWLSGTATGLTPPAGITAGSVVYVN >gi|296493104|gb|ADTK01000397.1| GENE 8 7377 - 8444 191 355 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0119 NR:ns ## KEGG: SeHA_A0119 # Name: not_defined # Def: PilL protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 355 1 355 355 590 100.0 1e-167 MKKNCLTLILPSLLSGCAGISGSNNVAPSDTPALVFVDGQISESAGVLALTQQRLSPPKI TPPPRVSVPETVKTAQPVSESRPLPSPTISQNFSRLIITGKPGSAPVTLSPTARNRTVSQ WIKTLLPEGWKFRQENAATPKLNTKLVSWSANDQWTRSLNRLLEEQHLWGHLDWNNKTLT VTTSAAAPAQDTATTISPAAMVTTSTYPESPTTANSQNKPRNPFRGNSVSSSTPAATPTP TGSTVKSIPLMTGTPVKPVSQGKEWRAPAGTTLRENIIKWAEETKCESMASTHWMVIWPL SVTDYRLDAPLMFRGSFESAMEQVFELYRTAQKPLYAQASRMQCIITVSDTRDGR >gi|296493104|gb|ADTK01000397.1| GENE 9 9068 - 9364 128 98 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0122 NR:ns ## KEGG: SeHA_A0122 # Name: not_defined # Def: PilJ protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 56 98 1 43 43 80 100.0 2e-14 MFTMFFLVKLTIAIVIVCGILSPILASPHNKLSTLFWKIPNAFWIFLICIYFAIAMAVLI ESSEQTRIRRQQYFQQYEVPNETHKNVVIENSIKSQLI >gi|296493104|gb|ADTK01000397.1| GENE 10 9540 - 9794 119 84 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0125 NR:ns ## KEGG: SeHA_A0125 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # 
Pathway: not_defined # 1 84 1 84 84 160 100.0 1e-38 MPQQHPGRLQILVVDAHCKRRLFSTKTPTDPDELARRFCTPDNCLVVVLRDNRFLFRLER APGSHCRWHKGSSSRHQHLQDWLS >gi|296493104|gb|ADTK01000397.1| GENE 11 9697 - 10020 68 107 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDSPETELAKSASFSSHLTCVNSQHNPFPVFAACTALPAVSDILSVTTRLCRFFATGTCS YDGTAASFPVPEGVFDATATSRTTANTGRRRTLQTQAVFNENTDRPR >gi|296493104|gb|ADTK01000397.1| GENE 12 9956 - 10507 205 183 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0126 NR:ns ## KEGG: SeHA_A0126 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 183 1 183 183 335 100.0 5e-91 MKKQPLLWMTIAAALLSGCTGHREHHPQVTFTVDTVRTDTTASNPICLAEKYGSLLNERH PDLFAGAQGSISRSAFFNDSLTLTPETLHGMIQNTPCVESLVTSVGPDTFTADVSPGVRM QVSPDIHGAPGNITLRLSTTDQKRKQTVEQSISLRPEQSLVVNGFTGDGAGEKRVVLITP HVR >gi|296493104|gb|ADTK01000397.1| GENE 13 10680 - 11363 340 227 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0127 NR:ns ## KEGG: SeHA_A0127 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 227 1 227 227 450 100.0 1e-125 MTTTEHPVRQLLWCALNALKTAQENHNITSETGLRHYLLEWLSGASRHPEFRSLPGEIMA LKALTEKDRNIPIIGTLNTLFLSSATVGECPLFRFRAALGKLQKAGWRTYVCPWPERVFN ESIERACSGKRHLLQLSRTEECFLPTGDMRAPVTLQLITPLHRKTMQLLETAELIFSDEG FQVVRGYEHQFGIERRETLFHTLFIGLPSLPEDVWGATQEQTTITLH >gi|296493104|gb|ADTK01000397.1| GENE 14 11617 - 12150 -91 177 aa, chain - ## HITS:1 COG:no KEGG:ECSE_P1-0123 NR:ns ## KEGG: ECSE_P1-0123 # Name: not_defined # Def: TraB protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 177 1 177 177 362 100.0 2e-99 MNIEHLNNRNWYLAQYNTAGKNRESLFSWLNEQNVVPWTPLITRKIRRADSRCCYRERIF AIFPGYFFILANFDIQPVSALRRHSAFIDFVKFGGEIKPVNKDIVDGLMKIYPDPVLNPG AREELNAASSIWLTKAQYQYLLRMENTLQPESRISLLLELVSNAEHHGFMERLVNIP >gi|296493104|gb|ADTK01000397.1| GENE 15 12592 - 12879 85 95 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0131 NR:ns ## KEGG: SeHA_A0131 # Name: not_defined # Def: TraA # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 95 1 95 95 178 100.0 
6e-44 MSPQLIQKLPAITLLEGMFPELSTNQLKVCVFYAMGVPYDAIAQNCRLSPETVRTYLKRS LKNLNLEGYDALRSAVLMRTFVFMISNTAKENEKM >gi|296493104|gb|ADTK01000397.1| GENE 16 12797 - 13015 143 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|83404893|ref|YP_424908.1| ## NR: gi|83404893|ref|YP_424908.1| hypothetical protein pCoo129 [Escherichia coli] # 1 72 16 87 87 141 100.0 2e-32 MVAQAGQALAWPVSDDAGIPTPVWAITNERRNSVIAFLVVIGDLCYVTSTDTETPGHHSV GRHVSRAINKPA >gi|296493104|gb|ADTK01000397.1| GENE 17 14033 - 15064 880 343 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0002 NR:ns ## KEGG: SeHA_A0002 # Name: repZ # Def: replication initiation protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 343 16 358 358 598 99.0 1e-170 MAGLKNTPYNAVHWSQLAPEEQIRFWEDYEAGRATTFLVEPERKRTKRRRGEHSTKPKCE NPSWYRPERYKALSGQLGHAYNRLVKKDPVTGEQSLRMHMSLHPFYVQKRTYAGRKYAFR PEKQRLLDAVWPVLVSFSDAGTHTVGMSVSRLAREISPKDSKGKVIPELEVTVSRLSRLL AEQVRFGVLGVSDETLWDRETRQRLPRYVWITPAGWQMLGVDMVKLHEQQQKRLRESEIR QQLIREGVLREDEDISVHAARKRWYLQRSQDALKHRRAKAAASKRARRLKKLPADQQIHE MAEYLRKRLPPDEAYFCSDDHLKRMAIRELRQLELTLAAPPPH Prediction of potential genes in microbial genomes Time: Mon May 16 16:23:19 2011 Seq name: gi|296493103|gb|ADTK01000398.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1273.2, whole genome shotgun sequence Length of sequence - 253 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 16:23:20 2011 Seq name: gi|296493102|gb|ADTK01000399.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1273.3, whole genome shotgun sequence Length of sequence - 3211 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 185 - 244 2.3 1 1 Tu 1 . + CDS 277 - 930 479 ## p1ECUMN_0068 conserved hypothetical protein; putative membrane protein; putative CAAX amino terminal protease + Prom 940 - 999 5.3 2 2 Op 1 8/0.000 + CDS 1023 - 1280 295 ## COG2336 Growth regulator 3 2 Op 2 . 
+ CDS 1282 - 1614 333 ## COG2337 Growth inhibitor + Prom 1793 - 1852 2.5 4 3 Tu 1 . + CDS 1899 - 3209 229 ## COG1479 Uncharacterized conserved protein Predicted protein(s) >gi|296493102|gb|ADTK01000399.1| GENE 1 277 - 930 479 217 aa, chain + ## HITS:1 COG:no KEGG:p1ECUMN_0068 NR:ns ## KEGG: p1ECUMN_0068 # Name: not_defined # Def: conserved hypothetical protein; putative membrane protein; putative CAAX amino terminal protease # Organism: E.coli_UMN026 # Pathway: not_defined # 1 217 1 217 217 362 100.0 5e-99 MIQTRNQYLQFMLVMLAAWGISWGARFVMEQAVLLYGSGKNYLFFSHGTVLMYLLCVFLV YRRWIAPLPVVGQLRNVGVPWLVGAMAVVYVGVFLLGKALALPAEPFMTKLFADKSIPDV ILTLLTIFILAPLNEETLFRGIMLNVFRSRYCWTMWLGALITSLLFVAAHSQYQNLLTLA ELFLVGLITSVARIRSGGLLLPVLLHMEATTLGLLFG >gi|296493102|gb|ADTK01000399.1| GENE 2 1023 - 1280 295 85 aa, chain + ## HITS:1 COG:AGc1711 KEGG:ns NR:ns ## COG: AGc1711 COG2336 # Protein_GI_number: 15888279 # Func_class: T Signal transduction mechanisms # Function: Growth regulator # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 3 85 19 103 103 64 38.0 4e-11 MHTTRLKRVGGSVMLTVPPALLNALSLGTDNEVGMVIDNGRLIVEPYRRPQYSLAELLAQ CDPNAEISAEEREWLDAPATGQEEI >gi|296493102|gb|ADTK01000399.1| GENE 3 1282 - 1614 333 110 aa, chain + ## HITS:1 COG:ECs5203 KEGG:ns NR:ns ## COG: ECs5203 COG2337 # Protein_GI_number: 15834457 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Escherichia coli O157:H7 # 2 109 8 115 116 85 44.0 2e-17 MERGEIWLVSLDPTAGHEQQGTRPVLIVTPAAFNRVTRLPVVVPVTSGGNFARTAGFAVS LDGVGIRTTGVVRCDQPRTIDMKARGGKRLERVPETIMNEVLGRLSTILT >gi|296493102|gb|ADTK01000399.1| GENE 4 1899 - 3209 229 436 aa, chain + ## HITS:1 COG:alr1266 KEGG:ns NR:ns ## COG: alr1266 COG1479 # Protein_GI_number: 17228761 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. 
PCC 7120 # 12 428 9 438 599 179 29.0 7e-45 MANENLDIDSKTAEELYEWYLDDKLIINRRYQRKLVWSLEEKTALISSMTQQYPIPLLLF VSIDNKREILDGMQRLEALMSFIEQRFSLNGKYFNLDAIALTKQLKDSGKLQQKEPVLSR ESSTLIARYRFAISEYSSTEDHIDEVFRRINSNGKILSKQELRSAGCVSNFSELVRKIST IIRGDTTHSDIMGLNKIHNISICNDGLDYGINIDNHFYIRNHIISRPSIRDSDDEELVAN ILGYIFLDDKPTSGSTSLDTFYGEGSTSHAIHTRTQLENYIQTNGADKIVNNYLFVYEMI QKLFDANNLNFRSHILGNASSSQECPRYYQAVFLALYELIINENMQLDDEQKFIAQLGDS VQRSMVQTEGGRWAASARQKSVEDLCALIRRYFKESENKFINHAWQTLIRTLLNNSRTEQ PNYDFKQGGDAANLLI Prediction of potential genes in microbial genomes Time: Mon May 16 16:23:24 2011 Seq name: gi|296493101|gb|ADTK01000400.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1279.1, whole genome shotgun sequence Length of sequence - 303 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 16:23:24 2011 Seq name: gi|296493100|gb|ADTK01000401.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1279.2, whole genome shotgun sequence Length of sequence - 171 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 16:23:25 2011 Seq name: gi|296493099|gb|ADTK01000402.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1279.3, whole genome shotgun sequence Length of sequence - 1201 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 504 - 1031 573 ## COG0629 Single-stranded DNA-binding protein Predicted protein(s) >gi|296493099|gb|ADTK01000402.1| GENE 1 504 - 1031 573 175 aa, chain + ## HITS:1 COG:BU545 KEGG:ns NR:ns ## COG: BU545 COG0629 # Protein_GI_number: 15617138 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Buchnera sp. 
APS # 1 175 1 171 171 172 50.0 2e-43 MSARGINKVILVGRLGNDPEVRYIPNGGAVANLQVATSESWRDKQTGEMREQTEWHRVVL FGKLAEVAGEYLRKGAQVYIEGQLRTRSWDDNGITRYITEILVKTTGTMQMLGSAPQQNA QAQPKPQQNGQPQSADATKKGGAKTKGRGRKAAQPEPQPQTPEGEDYGFSDDIPF Prediction of potential genes in microbial genomes Time: Mon May 16 16:23:29 2011 Seq name: gi|296493098|gb|ADTK01000403.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1348.1, whole genome shotgun sequence Length of sequence - 10346 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 4, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 + CDS 2 - 643 125 ## COG3209 Rhs family protein 2 1 Op 2 2/0.000 + CDS 664 - 1506 245 ## COG1413 FOG: HEAT repeat 3 1 Op 3 . + CDS 1554 - 2492 295 ## COG3209 Rhs family protein 4 1 Op 4 . + CDS 2504 - 2965 100 ## B21_03402 hypothetical protein + Prom 3493 - 3552 5.6 5 2 Tu 1 . + CDS 3672 - 4109 90 ## ECO103_4586 hypothetical protein - Term 4501 - 4535 -0.8 6 3 Op 1 . - CDS 4569 - 5705 984 ## COG1566 Multidrug resistance efflux pump 7 3 Op 2 . - CDS 5708 - 6070 292 ## EcSMS35_3931 hypothetical protein - Prom 6311 - 6370 5.2 8 4 Op 1 11/0.000 + CDS 6607 - 8520 2378 ## COG2213 Phosphotransferase system, mannitol-specific IIBC component + Term 8527 - 8567 6.2 + Prom 8556 - 8615 2.6 9 4 Op 2 7/0.000 + CDS 8750 - 9898 1505 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases 10 4 Op 3 . 
+ CDS 9898 - 10345 381 ## COG3722 Transcriptional regulator Predicted protein(s) >gi|296493098|gb|ADTK01000403.1| GENE 1 2 - 643 125 213 aa, chain + ## HITS:1 COG:rhsA KEGG:ns NR:ns ## COG: rhsA COG3209 # Protein_GI_number: 16131464 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli K12 # 2 213 1166 1377 1377 446 99.0 1e-125 AWCAEYDEWGNQLNEENPHQLQQLIRLPGQQYDEESGLYYNRHRYYDPLQGRYITQDPIG LKGGWNFYQYPLNPVTNTDPLGLEVFPRPFPLPIPWPKSPAQQQADDNAAKALTKWWNDT ASQRIFDSLILNNPGLALDITMIASRGNVADTGITDRVNDIINDRFWSDGKKPDRCDVLQ ELIDCGDISAKDAKSTQKAWNCRHSRQSNDKKR >gi|296493098|gb|ADTK01000403.1| GENE 2 664 - 1506 245 280 aa, chain + ## HITS:1 COG:ZyibA KEGG:ns NR:ns ## COG: ZyibA COG1413 # Protein_GI_number: 15804135 # Func_class: C Energy production and conversion # Function: FOG: HEAT repeat # Organism: Escherichia coli O157:H7 EDL933 # 1 280 1 280 280 514 100.0 1e-146 MSNTYQKRKASKEYGLYNQCKKLNDDELFRLLDDHNSLKRISSARVLQLRGGQDAVRLAI EFCSDKNYIRRDIGAFILGQIKICKKCEDNVFNILNNMALNDKSACVRATAIESTAQRCK KNPIYSPKIVEQSQITAFDKSTNVRRATAFAISVINDKATIPLLINLLKDPNGDVRNWAA FAININKYDNSDIRDCFVEMLQDKNEEVRIEAIIGLSYRKDKRVLSVLCDELKKNTVYDD IIEAAGELGDKTLLPVLDTMLYKFDDNEIITSAIDKLKRS >gi|296493098|gb|ADTK01000403.1| GENE 3 1554 - 2492 295 312 aa, chain + ## HITS:1 COG:ECs4470 KEGG:ns NR:ns ## COG: ECs4470 COG3209 # Protein_GI_number: 15833724 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli O157:H7 # 1 312 1098 1409 1409 639 97.0 0 MLDRLESEILADRVSEESRRWLASCGLTVEQMQNQMDPVYTPARKIHLYHCDHRGLPLAL ISTEGATAWCAEYDEWGNLLSDENPHHLQQLIRLPGQQYDEESGLYYNRHRYYDPLQGRY ITQDPIGLKGGWNFYQYPLNPVINVDPQGLVDINLYPESDLIHSVADEINIPGVFTIGGH GTPTSIESATRSIMTAKDLAYLIKFDGNYKDGMTVWLFSCNTGKGQNSFASQLAKELHTN VIGPDTLWTWWGRGTNGKLKMDTVLTAPTNLNSNKDLMAITTKDLGNWITYGPSGHPISN MQGTPEKPSDIR >gi|296493098|gb|ADTK01000403.1| GENE 4 2504 - 2965 100 153 aa, chain + ## HITS:1 COG:no KEGG:B21_03402 NR:ns ## KEGG: B21_03402 # Name: yibG # Def: 
hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 153 1 153 153 280 99.0 9e-75 MKACLLLFFYFSLIGQLHGADVKIKENESVMGSTAMTYDLSEEKLMKLKYKSQHGDSEAS FRLYQYYCFTKNNIDKQLRFLERSASQGNVTAQFNYGIFLSDTNPTLSEYYNLNRAIYWM EFAVNNGNIDAKSKLQELKKLKRMDRRKNKENP >gi|296493098|gb|ADTK01000403.1| GENE 5 3672 - 4109 90 145 aa, chain + ## HITS:1 COG:no KEGG:ECO103_4586 NR:ns ## KEGG: ECO103_4586 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 145 1 145 145 227 100.0 9e-59 MADKAILWALISASTKEGRKACSLSYFACKAAEAELGLAYMAANDNKEFLTSLSNIMRYK IDAGLSESYTCYLLSKGKIIRPYLKNLNPLQLAADCIETVNKIKDKNKKIIDINSVNICS DDKNIKLRVNSTIMAIDDSIKCIDE >gi|296493098|gb|ADTK01000403.1| GENE 6 4569 - 5705 984 378 aa, chain - ## HITS:1 COG:ECs4473 KEGG:ns NR:ns ## COG: ECs4473 COG1566 # Protein_GI_number: 15833727 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Escherichia coli O157:H7 # 1 378 1 378 378 723 100.0 0 MDLLIVLTYVALAWAVFKIFRIPVNQWTLATAALGGVFLVSGLILLMNYNHPYTFTAQKA VIAIPITPQVTGIVTEVTDKNNQLIQKGEVLFKLDPVRYQARVDRLQADLMTATHNIKTL RAQLTEAQANTTQVSAERDRLFKNYQRYLKGSQAAVNPFSERDIDDARQNFLAQDALVKG SVAEQAQIQSQLDSMVNGEQSQIVSLRAQLTEAKYNLEQTVIRAPSNGYVTQVLIRPGTY AAALPLRPVMVFIPEQKRQIVAQFRQNSLLRLKPGDDAEVVFNALPGQVFHGKLTSILPV VPGGSYQAQGVLQSLTVVPGTDGVLGTIELDPNDDIDALPDGIYAQVAVYSDHFSHVSVM RKVLLRMTSWMHYLYLDH >gi|296493098|gb|ADTK01000403.1| GENE 7 5708 - 6070 292 120 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_3931 NR:ns ## KEGG: EcSMS35_3931 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 120 35 154 154 238 100.0 6e-62 MFLNYFALGVLIFVFLVIFYGIIAIHDIPYLIAKKRNHPHADAIHTAGWVSLFTLHVIWP FLWIWATLYQPERGWGMQSHVASQEKATDPEIAALSDRISRLEHQLAAEKKTDYSTFPEI >gi|296493098|gb|ADTK01000403.1| GENE 8 6607 - 8520 2378 637 aa, chain + ## HITS:1 COG:ECs4475_1 KEGG:ns NR:ns ## COG: ECs4475_1 COG2213 # Protein_GI_number: 15833729 # Func_class: G Carbohydrate transport and metabolism # Function: 
Phosphotransferase system, mannitol-specific IIBC component # Organism: Escherichia coli O157:H7 # 1 492 1 492 492 923 99.0 0 MSSDIKIKVQSFGRFLSNMVMPNIGAFIAWGIITALFIPTGWLPNETLAKLVGPMITYLL PLLIGYTGGKLVGGERGGVVGAITTMGVIVGADMPMFLGSMIAGPLGGWCIKHFDRWVDG KIKSGFEMLVNNFSAGIIGMILAILAFLGIGPIVEALSKMLAAGVNFMVVHDMLPLASIF VEPAKILFLNNAINHGIFSPLGIQQSHELGKSIFFLIEANPGPGMGVLLAYMFFGRGSAK QSAGGAAIIHFLGGIHEIYFPYVLMNPRLILAVILGGMTGVFTLTILGGGLVSPASPGSI LAVLAMTPKGAYFANIAGVCAAMAVSFVVSAILLKTSKVKEEDDIEAATRRMQDMKAESK GTSPLSAGDVTNDLSHVRKIIVACDAGMGSSAMGAGVLRKKIQDAGLSQISVTNSAINNL PPDVDLVITHRDLTERAMRQVPQAQHISLTNFLDSGLYTSLTERLVAAQRHTENEVKVKD SLKDSFDDSSANLFKLGAENIFLGRKAATKEEAIRFAGEQLVKGGYVEPEYVQAMLDREK LTPTYLGESIAVPHGTVEAKDRVLKTGVVFCQYPEGVRFGEEEDDIARLVIGIAARNNEH IQVITSLTNALDDESVIERLAHTTSVDEVLELLAGRK >gi|296493098|gb|ADTK01000403.1| GENE 9 8750 - 9898 1505 382 aa, chain + ## HITS:1 COG:ECs4476 KEGG:ns NR:ns ## COG: ECs4476 COG0246 # Protein_GI_number: 15833730 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Escherichia coli O157:H7 # 1 382 1 382 382 731 99.0 0 MKALHFGAGNIGRGFIGKLLADAGIQLTFADVNQVVLDALNARHSYQVHVVGETEQVDTV SGVNAVSSIGDDVVDLIAQVDLVTTAVGPVVLERIAPAIAKGLVKRKEQGNESPLNIIAC ENMVRGTTQLKGHVMNALPEDAKAWVEEHVGFVDSAVDRIVPPSASATNDPLEVTVETFS EWIVDKTQFKGTLPNIPGMELTDNLMAFVERKLFTLNTGHAITAYLGKLAGHQTIRDAIL DEKIRAVVKGAMEESGAVLIKRYGFDADKHAAYIQKILGRFENPYLKDDVERVGRQPLRK LSAGDRLIKPLLGTLEYSLPHKNLIQGIAGAMHFRSEDDPQAQELAALIADKGPQAALAQ ISGLDANSEVVSEAVTAYKAMQ >gi|296493098|gb|ADTK01000403.1| GENE 10 9898 - 10345 381 149 aa, chain + ## HITS:1 COG:mtlR KEGG:ns NR:ns ## COG: mtlR COG3722 # Protein_GI_number: 16131472 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 149 1 149 195 285 100.0 2e-77 MVDQAQDTLRPNNRLSDMQATMEQTQAFENRVLERLNAGKTVRSFLITAVELLTEAVNLL VLQVFRKDDYAVKYAVEPLLDGDGPLGDLSVRLKLIYGLGVINRQEYEDAELLMALREEL NHDGNEYAFTDDEILGPFGELHCVAALPP Prediction of potential genes in microbial genomes 
Time: Mon May 16 16:23:56 2011 Seq name: gi|296493097|gb|ADTK01000404.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1348.2, whole genome shotgun sequence Length of sequence - 65354 bp Number of predicted genes - 59, with homology - 59 Number of transcription units - 30, operones - 15 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 166 - 375 200 ## JW3576 hypothetical protein - Prom 535 - 594 3.5 + Prom 450 - 509 4.8 2 2 Tu 1 . + CDS 660 - 1022 497 ## ECSP_4596 hypothetical protein + Prom 1319 - 1378 9.1 3 3 Op 1 . + CDS 1565 - 2248 279 ## ECO26_4996 hypothetical protein 4 3 Op 2 3/0.583 + CDS 2292 - 7163 3512 ## COG5295 Autotransporter adhesin + Term 7260 - 7297 7.1 5 4 Op 1 4/0.417 + CDS 7531 - 9186 1707 ## COG1620 L-lactate permease 6 4 Op 2 5/0.333 + CDS 9186 - 9962 764 ## COG2186 Transcriptional regulators 7 4 Op 3 4/0.417 + CDS 9959 - 11149 1520 ## COG1304 L-lactate dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases 8 5 Tu 1 . + CDS 11335 - 11808 406 ## COG0219 Predicted rRNA methylase (SpoU class) 9 6 Tu 1 6/0.250 - CDS 11861 - 12682 848 ## COG1045 Serine acetyltransferase - Term 12695 - 12736 6.7 10 7 Op 1 7/0.000 - CDS 12762 - 13781 1075 ## COG0240 Glycerol-3-phosphate dehydrogenase 11 7 Op 2 9/0.000 - CDS 13781 - 14248 490 ## COG1952 Preprotein translocase subunit SecB 12 7 Op 3 7/0.000 - CDS 14311 - 14562 323 ## COG0695 Glutaredoxin and related proteins - Prom 14634 - 14693 2.4 13 7 Op 4 . - CDS 14704 - 15135 425 ## COG0607 Rhodanese-related sulfurtransferase - Prom 15175 - 15234 5.1 + Prom 15290 - 15349 4.2 14 8 Op 1 4/0.417 + CDS 15380 - 16924 1683 ## COG0696 Phosphoglyceromutase 15 8 Op 2 3/0.583 + CDS 16958 - 18217 1347 ## COG4942 Membrane-bound metallopeptidase 16 8 Op 3 . + CDS 18221 - 19180 748 ## COG2861 Uncharacterized protein conserved in bacteria + Term 19182 - 19223 -0.5 17 9 Tu 1 . 
- CDS 19167 - 20201 469 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 20371 - 20430 6.2 - Term 20400 - 20430 3.0 18 10 Op 1 9/0.000 - CDS 20440 - 21465 1013 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 19 10 Op 2 . - CDS 21475 - 22671 1632 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes - Prom 22819 - 22878 3.0 + Prom 22608 - 22667 3.7 20 11 Op 1 6/0.250 + CDS 22885 - 23817 1095 ## COG0451 Nucleoside-diphosphate-sugar epimerases 21 11 Op 2 11/0.000 + CDS 23827 - 24873 920 ## COG0859 ADP-heptose:LPS heptosyltransferase 22 11 Op 3 . + CDS 24877 - 25848 647 ## COG0859 ADP-heptose:LPS heptosyltransferase + Term 25874 - 25922 11.6 - Term 25860 - 25909 11.8 23 12 Tu 1 . - CDS 25918 - 26394 141 ## COG3307 Lipid A core - O-antigen ligase and related enzymes - Prom 26472 - 26531 6.9 - Term 27120 - 27165 -0.8 24 13 Tu 1 2/0.833 - CDS 27217 - 28200 117 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 28223 - 28282 7.7 25 14 Op 1 . - CDS 28285 - 29310 666 ## COG1442 Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases 26 14 Op 2 . - CDS 29336 - 30028 484 ## SDY_4057 lipopolysaccharide core biosynthesis protein 27 14 Op 3 5/0.333 - CDS 30038 - 31033 366 ## COG1442 Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases 28 14 Op 4 . - CDS 31050 - 32066 337 ## COG1442 Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases 29 14 Op 5 . - CDS 32082 - 32879 512 ## EcE24377A_4131 lipopolysaccharide core biosynthesis protein RfaP 30 14 Op 6 5/0.333 - CDS 32872 - 33996 720 ## COG0438 Glycosyltransferase 31 14 Op 7 . - CDS 33993 - 35015 617 ## COG0859 ADP-heptose:LPS heptosyltransferase + Prom 35322 - 35381 6.4 32 15 Op 1 6/0.250 + CDS 35464 - 36741 1184 ## COG1519 3-deoxy-D-manno-octulosonic-acid transferase 33 15 Op 2 . 
+ CDS 36749 - 37228 393 ## PROTEIN SUPPORTED gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 + Term 37229 - 37278 4.2 34 16 Tu 1 5/0.333 - CDS 37267 - 38076 766 ## COG0266 Formamidopyrimidine-DNA glycosylase - Prom 38099 - 38158 2.2 - Term 38115 - 38156 9.7 35 17 Op 1 16/0.000 - CDS 38174 - 38341 280 ## PROTEIN SUPPORTED gi|15804177|ref|NP_290216.1| 50S ribosomal protein L33 36 17 Op 2 7/0.000 - CDS 38362 - 38598 403 ## PROTEIN SUPPORTED gi|15804178|ref|NP_290217.1| 50S ribosomal protein L28 - Prom 38738 - 38797 3.4 37 17 Op 3 . - CDS 38815 - 39483 461 ## COG2003 DNA repair proteins - Prom 39679 - 39738 2.4 + Prom 39565 - 39624 2.3 38 18 Op 1 5/0.333 + CDS 39655 - 40875 1149 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase 39 18 Op 2 4/0.417 + CDS 40853 - 41311 553 ## COG0756 dUTPase + Term 41382 - 41417 -0.5 40 19 Tu 1 . + CDS 41418 - 42014 686 ## COG1309 Transcriptional regulator + Term 42029 - 42066 7.1 41 20 Tu 1 . - CDS 42060 - 43019 540 ## COG3344 Retron-type reverse transcriptase - Prom 43244 - 43303 1.6 42 21 Op 1 6/0.250 - CDS 43317 - 43958 793 ## COG0461 Orotate phosphoribosyltransferase - Term 43987 - 44016 2.1 43 21 Op 2 . - CDS 44024 - 44740 846 ## COG0689 RNase PH - Prom 44770 - 44829 4.8 + Prom 44780 - 44839 5.5 44 22 Tu 1 . + CDS 44867 - 45730 1127 ## COG1561 Uncharacterized stress-induced protein + Prom 45832 - 45891 3.9 45 23 Tu 1 . + CDS 45951 - 46553 496 ## ECIAI1_3816 DNA-damage-inducible protein D + Prom 46618 - 46677 5.0 46 24 Tu 1 . + CDS 46844 - 47461 835 ## COG2860 Predicted membrane protein + Term 47481 - 47518 5.3 47 25 Op 1 . - CDS 47458 - 48975 1067 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II) 48 25 Op 2 . 
- CDS 49017 - 49259 64 ## ECH74115_5019 hypothetical protein - Prom 49339 - 49398 2.0 + Prom 49299 - 49358 4.6 49 26 Op 1 25/0.000 + CDS 49399 - 50022 509 ## COG0194 Guanylate kinase 50 26 Op 2 18/0.000 + CDS 50077 - 50352 487 ## COG1758 DNA-directed RNA polymerase, subunit K/omega 51 26 Op 3 5/0.333 + CDS 50371 - 52479 2156 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases 52 26 Op 4 4/0.417 + CDS 52486 - 53175 629 ## COG0566 rRNA methylases 53 26 Op 5 . + CDS 53181 - 55262 2093 ## COG1200 RecG-like helicase - Term 54967 - 54998 1.6 54 27 Op 1 . - CDS 55247 - 56113 565 ## B21_03462 hypothetical protein 55 27 Op 2 . - CDS 56116 - 57321 1487 ## COG0786 Na+/glutamate symporter - Prom 57453 - 57512 7.0 + Prom 57518 - 57577 4.8 56 28 Tu 1 . + CDS 57601 - 58992 1592 ## COG2233 Xanthine/uracil permeases + Prom 59028 - 59087 1.8 57 29 Tu 1 . + CDS 59113 - 60822 1484 ## EcE24377A_4159 AsmA family protein + Term 61021 - 61060 -0.9 58 30 Op 1 3/0.583 - CDS 60875 - 63193 2012 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases 59 30 Op 2 . 
- CDS 63203 - 64585 1189 ## COG2211 Na+/melibiose symporter and related transporters + TRNA 64878 - 64968 75.2 # SeC(p) TCA 0 0 Predicted protein(s) >gi|296493097|gb|ADTK01000404.1| GENE 1 166 - 375 200 69 aa, chain - ## HITS:1 COG:no KEGG:JW3576 NR:ns ## KEGG: JW3576 # Name: yibT # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 69 1 69 69 121 100.0 6e-27 MGKLGENVPLLIDKAVDFMASSQAFREYLKKLPPRNAIPSGIPDESVPLYLQRLEYYRRL YRPKQVEGQ >gi|296493097|gb|ADTK01000404.1| GENE 2 660 - 1022 497 120 aa, chain + ## HITS:1 COG:no KEGG:ECSP_4596 NR:ns ## KEGG: ECSP_4596 # Name: yibL # Def: hypothetical protein # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 120 1 120 120 167 100.0 1e-40 MKEVEKNEIKRLSDRLDAIRHQQADLSLVEAADKYAELEKEKATLEAEIARLREVHSQKL SKEAQKLMKMPFQRAITKKEQADMGKLKKSVRGLVVVHPMTALGREMGLQEMTGFSKTAF >gi|296493097|gb|ADTK01000404.1| GENE 3 1565 - 2248 279 227 aa, chain + ## HITS:1 COG:no KEGG:ECO26_4996 NR:ns ## KEGG: ECO26_4996 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 227 1 227 242 415 100.0 1e-115 MNLKKIFFSAVTVSVLCALTGCDYIEEGKPESSLLKQQEEHNNKIDLLEKQQAQLKSQLE TIQKQQTGIISSTKTLTHVIKSVKDQQNTFIFTEFNPAKTKYFILNNGSVALAGRVLSID ATENGSVIHISLVNLLSTPISNIGFNATWGGEKPVDAKEFARWQQLLFNTSMKSTLKLLP GQWQDINLTLKGVSPNNLGYLKLAINMENIQFDNLPSAENRQKRSKK >gi|296493097|gb|ADTK01000404.1| GENE 4 2292 - 7163 3512 1623 aa, chain + ## HITS:1 COG:ECs4480 KEGG:ns NR:ns ## COG: ECs4480 COG5295 # Protein_GI_number: 15833734 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; W Extracellular structures # Function: Autotransporter adhesin # Organism: Escherichia coli O157:H7 # 1 1623 1 1588 1588 1983 97.0 0 MNKIFKVIWNPATGNYTVTSETAKSRGKKSGRSKLLISALVAGGMLSSFGALANAGNDNG QGVDYGSGSAGDGWVAIGKGAKANTFMNTSGSSTAVGYDAIAEGQYSSAIGSKTHAIGGA SMAFGVSAISEGDRSIALGASSYSLGQYSMALGRYSKALGKLSIAMGDSSKAEGANAIAL GNATKATEIMSIVLGDTANASKAYSMALGASSVASEENAIALGRSSVASGTDSLAFGRQS 
LASAANAIAIGAETEAAENATAIGNNAKAKGTNSMAMGFGSLADKVNTIALGNGSQALAD NAIAIGQGNKADGVDAIALGNGSQSRGLNTIALGTASNATGDKSLALGSNSSANGINSVA LGADSIADLDNTVSVGNSSLKRKIVNVKNGAIKSDSYDAINGSQLYAISDSVAKRLGGGA AVDVDDGTVTAPTYNLKNGSKNNVGAALAVLDENTLQWDQTKGKYSAAHGTSSPTASVIT DVADGTISASSKDAVNGSQLKATNDDVEANTANIATNTSNIATNTANIATNTTNITNLTD SVGDLQADALLWNETKKAFSAAHGQDTTSKITNVKDADLTADSTDAVNGSQLKTTNDAVA TNTTNIANNTSNIATNTTNISNLTETVTNLGEDALKWDKDNGVFTAAHGTETTSKITNVK DGDLTTGSTDAVNGSQLKTTNDAVATNTTNIANNTSNIATNTTNISNLTETVTNLGEDAL KWDKDNGVFTAAHGNNTASKITNILDGTVTATSSDAINGSQLYDLSSNIATYFGGNASVN TDGVFTGPTYKIGETNYYNVGDALAAINSSFSTSLGDALLWDATAGKFSAKHGTNGDASV ITDVADGEISDSSSDAVNGSQLHGVSSYVVDALGGGAEVNADGTITAPTYTIANADYDNV GDALNAIDTTLDDALLWDADAGENGAFSAAHGKDKTASVITNVANGAISAASSDAINGSQ LYTTNKYIADALGGDAEVNADGTITAPTYTIANAEYNNVGDALDALDDNALLWDETANGG AGAYNASPDGKASIITNVANGSISEDSTDAVNGSQLNATNMMIEQNTQIINQLAGNTDAT YIQENGAGINYVRTNDDGLAFNDASAQGVGATAIGYNSVAKGDSSVAIGQGSYSDVDTGI ALGSSSVSSRVIAKGSRDTSITENGVVIGYDTTDGELLGALSIGDDGKYRQIINVADGSE AHDAVTVRQLQNAIGAVATTPSKYFHANSTEEDSLAVGTDSLAMGAKTIVNGDKGIGIGY GAYVDANALNGIAIGSNAQVIHVNSIAIGNGSTTTRGAQTNYTAYNMDAPQNSVGEFSVG SADGQRQITNVAAGSADTDAVNVGQLKVTDAQVSQNTQSITNLDNRVTNLDSRVTNIENG IGDIVTTGSTKYFKTNTDGVDASAQGKDSVAIGSGSIAAADNSVALGTGSVAEEENTISV GSSTNQRRITNVAAGVNATDAVNVSQLKSSEAGGVRYDTKADGSIDYSNITLGGGNGGTT RISNVSAGVNNNDAVNYAQLKQSVQETKQYTDQRMVEMDNKLSKTESKLSGGIASAMAMT GLPQAYTPGASMASIGGGTYNGESAVALGVSMVSANGRWVYKLQGSTNSQGEYSAALGAG IQW >gi|296493097|gb|ADTK01000404.1| GENE 5 7531 - 9186 1707 551 aa, chain + ## HITS:1 COG:ECs4481 KEGG:ns NR:ns ## COG: ECs4481 COG1620 # Protein_GI_number: 15833735 # Func_class: C Energy production and conversion # Function: L-lactate permease # Organism: Escherichia coli O157:H7 # 1 551 1 551 551 929 100.0 0 MNLWQQNYDPAGNIWLSSLIASLPILFFFFALIKLKLKGYVAASWTVAIALAVALLFYKM PVANALASVVYGFFYGLWPIAWIIIAAVFVYKISVKTGQFDIIRSSILSITPDQRLQMLI VGFCFGAFLEGAAGFGAPVAITAALLVGLGFKPLYAAGLCLIVNTAPVAFGAMGIPILVA GQVTGIDSFEIGQMVGRQLPFMTIIVLFWIMAIMDGWRGIKETWPAVVVAGGSFAIAQYL 
SSNFIGPELPDIISSLVSLLCLTLFLKRWQPVRVFRFGDLGASQVDMTLAHTGYTAGQVL RAWTPFLFLTATVTLWSIPPFKALFASGGALYEWVINIPVPYLDKLVARMPPVVSEATAY AAVFKFDWFSATGTAILFAALLSIVWLKMKPSDAISTFGSTLKELALPIYSIGMVLAFAF ISNYSGLSSTLALALAHTGHAFTFFSPFLGWLGVFLTGSDTSSNALFAALQATAAQQIGV SDLLLVAANTTGGVTGKMISPQSIAIACAAVGLVGKESDLFRFTVKHSLIFTCMVGVITT LQAYVLTWMIP >gi|296493097|gb|ADTK01000404.1| GENE 6 9186 - 9962 764 258 aa, chain + ## HITS:1 COG:lldR KEGG:ns NR:ns ## COG: lldR COG2186 # Protein_GI_number: 16131475 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 258 1 258 258 489 100.0 1e-138 MIVLPRRLSDEVADRVRALIDEKNLEAGMKLPAERQLAMQLGVSRNSLREALAKLVSEGV LLSRRGGGTFIRWRHDTWSEQNIVQPLKTLMADDPDYSFDILEARYAIEASTAWHAAMRA TPGDKEKIQLCFEATLSEDPDIASQADVRFHLAIAEASHNIVLLQTMRGFFDVLQSSVKH SRQRMYLVPPVFSQLTEQHQAVIDAIFAGDADGARKAMMAHLSFVHTTMKRFDEDQARHA RITRLPGEHNEHSREKNA >gi|296493097|gb|ADTK01000404.1| GENE 7 9959 - 11149 1520 396 aa, chain + ## HITS:1 COG:ECs4483 KEGG:ns NR:ns ## COG: ECs4483 COG1304 # Protein_GI_number: 15833737 # Func_class: C Energy production and conversion # Function: L-lactate dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases # Organism: Escherichia coli O157:H7 # 1 396 1 396 396 783 99.0 0 MIISAASDYRAAAQRILPPFLFHYMDGGAYSEYTLRRNVEDLSEVALRQRILKNMSDLSL ETTLFNEKLSMPVALGPVGLCGMYARRGEVQAAKAADAHGIPFTLSTVSVCPIEEVAPAI KRPMWFQLYVLRDRGFMRNALERAKAAGCSTLVFTVDMPTPGARYRDAHSGMSGPNAAMR RYLQAVTHPQWAWDVGLNGRPHDLGNISAYLGKPTGLEDYIGWLGNNFDPSISWKDLEWI RDFWDGPMVIKGILDPEDARDAVRFGADGIVVSNHGGRQLDGVLSSARALPAIADAVKGD IAILADSGIRNGLDVVRMIALGADTVLLGRAFLYALATAGQAGVANLLNLIEKEMKVAMT LTGAKSISEITQDSLVQGLGKELPTALAPMAKGNAA >gi|296493097|gb|ADTK01000404.1| GENE 8 11335 - 11808 406 157 aa, chain + ## HITS:1 COG:yibK KEGG:ns NR:ns ## COG: yibK COG0219 # Protein_GI_number: 16131477 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase (SpoU class) # Organism: Escherichia coli K12 # 1 157 1 157 157 322 99.0 2e-88 
MLNIVLYEPEIPPNTGNIIRLCANTGFRLHIIEPMGFAWDDKRLRRAGLDYHEFTAVTRH HDYRAFLEAENPQRLFALTTKGTPAHSAVSYQDGDYLMFGPETRGLPASILDALPAEQKI RIPMVPDSRSMNLSNAVSVVVYEAWRQLGYPGALLRD >gi|296493097|gb|ADTK01000404.1| GENE 9 11861 - 12682 848 273 aa, chain - ## HITS:1 COG:ECs4485 KEGG:ns NR:ns ## COG: ECs4485 COG1045 # Protein_GI_number: 15833739 # Func_class: E Amino acid transport and metabolism # Function: Serine acetyltransferase # Organism: Escherichia coli O157:H7 # 1 273 1 273 273 551 100.0 1e-157 MSCEELEIVWNNIKAEARTLADCEPMLASFYHATLLKHENLGSALSYMLANKLSSPIMPA IAIREVVEEAYAADPEMIASAACDIQAVRTRDPAVDKYSTPLLYLKGFHALQAYRIGHWL WNQGRRALAIFLQNQVSVTFQVDIHPAAKIGRGIMLDHATGIVVGETAVIENDVSILQSV TLGGTGKSGGDRHPKIREGVMIGAGAKILGNIEVGRGAKIGAGSVVLQPVPPHTTAAGVP ARIVGKPDSDKPSMDMDQHFNGINHTFEYGDGI >gi|296493097|gb|ADTK01000404.1| GENE 10 12762 - 13781 1075 339 aa, chain - ## HITS:1 COG:ECs4486 KEGG:ns NR:ns ## COG: ECs4486 COG0240 # Protein_GI_number: 15833740 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Escherichia coli O157:H7 # 1 339 1 339 339 664 100.0 0 MNQRNASMTVIGAGSYGTALAITLARNGHEVVLWGHDPEHIATLERDRCNAAFLPDVPFP DTLHLESDLATALAASRNILVVVPSHVFGEVLRQIKPLMRPDARLVWATKGLEAETGRLL QDVAREALGDQIPLAVISGPTFAKELAAGLPTAISLASTDQTFADDLQQLLHCGKSFRVY SNPDFIGVQLGGAVKNVIAIGAGMSDGIGFGANARTALITRGLAEMSRLGAALGADPATF MGMAGLGDLVLTCTDNQSRNRRFGMMLGQGMDVQSAQEKIGQVVEGYRNTKEVRELAHRF GVEMPITEEIYQVLYCGKNAREAALTLLGRARKDERSSH >gi|296493097|gb|ADTK01000404.1| GENE 11 13781 - 14248 490 155 aa, chain - ## HITS:1 COG:ECs4487 KEGG:ns NR:ns ## COG: ECs4487 COG1952 # Protein_GI_number: 15833741 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecB # Organism: Escherichia coli O157:H7 # 1 155 1 155 155 309 100.0 1e-84 MSEQNNTEMTFQIQRIYTKDISFEAPNAPHVFQKDWQPEVKLDLDTASSQLADDVYEVVL RVTVTASLGEETAFLCEVQQGGIFSIAGIEGTQMAHCLGAYCPNILFPYARECITSMVSR GTFPQLNLAPVNFDALFMNYLQQQAGEGTEEHQDA >gi|296493097|gb|ADTK01000404.1| GENE 12 
14311 - 14562 323 83 aa, chain - ## HITS:1 COG:ECs4488 KEGG:ns NR:ns ## COG: ECs4488 COG0695 # Protein_GI_number: 15833742 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutaredoxin and related proteins # Organism: Escherichia coli O157:H7 # 1 83 1 83 83 173 100.0 9e-44 MANVEIYTKETCPYCHRAKALLSSKGVSFQELPIDGNAAKREEMIKRSGRTTVPQIFIDA QHIGGCDDLYALDARGGLDPLLK >gi|296493097|gb|ADTK01000404.1| GENE 13 14704 - 15135 425 143 aa, chain - ## HITS:1 COG:ECs4489 KEGG:ns NR:ns ## COG: ECs4489 COG0607 # Protein_GI_number: 15833743 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Escherichia coli O157:H7 # 1 143 1 143 143 276 100.0 6e-75 MQEIMQFVGRHPILSIAWIALLVAVLVTTFKSLTSKVKVITRGEATRLINKEDAVVVDLR QRDDFRKGHIAGSINLLPSEIKANNVGELEKHKDKPVIVVDGSGMQCQEPANALTKAGFA QVFVLKEGVAGWAGENLPLVRGK >gi|296493097|gb|ADTK01000404.1| GENE 14 15380 - 16924 1683 514 aa, chain + ## HITS:1 COG:ECs4490 KEGG:ns NR:ns ## COG: ECs4490 COG0696 # Protein_GI_number: 15833744 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglyceromutase # Organism: Escherichia coli O157:H7 # 1 514 1 514 514 1008 100.0 0 MSVSKKPMVLVILDGYGYREEQQDNAIFSAKTPVMDALWANRPHTLIDASGLEVGLPDRQ MGNSEVGHVNLGAGRIVYQDLTRLDVEIKDRAFFANPVLTGAVDKAKNAGKAVHIMGLLS AGGVHSHEDHIMAMVELAAERGAEKIYLHAFLDGRDTPPRSAESSLKKFEEKFAALGKGR VASIIGRYYAMDRDNRWDRVEKAYDLLTLAQGEFQADTAVAGLQAAYARDENDEFVKATV IRAEGQPDAAMEDGDALIFMNFRADRAREITRAFVNADFDGFARKKVVNVDFVMLTEYAA DIKTAVAYPPASLVNTFGEWMAKNDKTQLRISETEKYAHVTFFFNGGVEESFKGEDRILI NSPKVATYDLQPEMSSAELTEKLVAAIKSGKYDTIICNYPNGDMVGHTGVMEAAVKAVEA LDHCVEEVAKAVESVGGQLLITADHGNAEQMRDPATGQAHTAHTNLPVPLIYVGDKNVKA VAGGKLSDIAPTMLSLMGMEIPQEMTGKPLFIVE >gi|296493097|gb|ADTK01000404.1| GENE 15 16958 - 18217 1347 419 aa, chain + ## HITS:1 COG:ECs4491 KEGG:ns NR:ns ## COG: ECs4491 COG4942 # Protein_GI_number: 15833745 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Membrane-bound 
metallopeptidase # Organism: Escherichia coli O157:H7 # 1 419 9 427 427 602 100.0 1e-172 MTRAVKPRRFAIRPIIYASVLSAGVLLCAFSAHADERDQLKSIQADIAAKERAVRQKQQQ RASLLAQLKKQEEAISEATRKLRETQNTLNQLNKQIDEMNASIAKLEQQKAAQERSLAAQ LDAAFRQGEHTGIQLILSGEESQRGQRLQAYFGYLNQARQETIAQLKQTREEVAMQRAEL EEKQSEQQTLLYEQRAQQAKLTQALNERKKTLAGLESSIQQGQQQLSELRANESRLRNSI ARAEAAAKARAEREAREAQAVRDRQKEATRKGTTYKPTESEKSLMSRTGGLGAPRGQAFW PVRGPTLHRYGEQLQGELRWKGMVIGASEGTEVKAIADGRVILADWLQGYGLVVVVEHGK GDMSLYGYNQSALVSVGSQVRAGQPIALVGSSGGQGRPSLYFEIRRQGQAVNPQPWLGR >gi|296493097|gb|ADTK01000404.1| GENE 16 18221 - 19180 748 319 aa, chain + ## HITS:1 COG:ECs4492 KEGG:ns NR:ns ## COG: ECs4492 COG2861 # Protein_GI_number: 15833746 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 43 319 1 277 277 529 99.0 1e-150 MFPFRRNVLAFAALLALSSPVLAGKLAIVIDDFGYRPHNENQVLAMPSAISVAVLPDSPH AREMATKAHNSGHEVLIHLPMAPLSKQPLEKNTLRPEMSSDEIERIIRSAVNNVPYAVGI NNHMGSKMTSNLFGMQKVMQALERYNLYFLDSVTIGNTQAMRAAQGTGVKVIKRKVFLDD SQNEADIRVQFNRAIDLARRNGSTIAIGHPHPSTVRVLQQMVYNLPPDITLVKASSLLNE PQVDTSTPPKNAVPDAPRNPFRGVKLCKPKKPIEPVYANRFFEVLSESISQSTLIVYFQH QWQGWGKQPEAAKFNASAN >gi|296493097|gb|ADTK01000404.1| GENE 17 19167 - 20201 469 344 aa, chain - ## HITS:1 COG:yibD KEGG:ns NR:ns ## COG: yibD COG0463 # Protein_GI_number: 16131486 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Escherichia coli K12 # 1 344 1 344 344 681 99.0 0 MMNSTNKLSVIIPLYNAGDDFRTCMESLITQTWTALEIIIINDGSTDNSVEIAKHYAENY PHVRLLHQANAGASVARNRGIEVATGKYVAFVDADDEVYPTMYETLMTMALEDDLDVAQC NADWCFRETGETWQSIPSDRLRSTGVLTGPDWLRMGLSSRRWTHVVWMGVYRRDVIVKNN IKFIAGLHHQDIVWTTEFMFNALRARYTEQSLYKYYLHNTSVSRLHRQGNKNLNYQRHYI KITRLLEKLNRNYADKITIYPEFHQQITYEALRVCHAVRKEPDILTRQRMIAEIFTSGMY KRLITNVRSVKVGYQALLWSFRLWQWRDKTRSHHRITRSAFNLR >gi|296493097|gb|ADTK01000404.1| GENE 18 20440 - 21465 1013 341 aa, chain - ## HITS:1 COG:tdh KEGG:ns NR:ns ## COG: tdh COG1063 # 
Protein_GI_number: 16131487 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 341 1 341 341 707 99.0 0 MKALSKLKAEEGIWMTDVPVPELGHNDLLIKIRKTAICGTDVHIYNWDEWSQKTIPVPMV VGHEYVGEVVGIGQEVKGFKIGDRVSGEGHITCGHCRNCRGGRTHLCRNTIGVGVNRPGC FAEYLVIPAFNAFKIPDNISDDLASIFDPFGNAVHTALSFDLVGEDVLVSGAGPIGIMAA AVAKHVGARNVVITDVNEYRLELARKMGITRAVNVAKENLNDVIAELGMTEGFDVGLEMS GAPPAFRTMLDTMNHGGRIAMLGIPPSDMSIDWTKVIFKGLFIKGIYGREMFETWYKMAA LIQSGLDLSPIITHRFSIDDFQKGFDAMRSGQSGKVILSWD >gi|296493097|gb|ADTK01000404.1| GENE 19 21475 - 22671 1632 398 aa, chain - ## HITS:1 COG:ECs4495 KEGG:ns NR:ns ## COG: ECs4495 COG0156 # Protein_GI_number: 15833749 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Escherichia coli O157:H7 # 1 398 1 398 398 775 100.0 0 MRGEFYQQLTNDLETARAEGLFKEERIITSAQQADITVADGSHVINFCANNYLGLANHPD LIAAAKAGMDSHGFGMASVRFICGTQDSHKELEQKLAAFLGMEDAILYSSCFDANGGLFE TLLGAEDAIISDALNHASIIDGVRLCKAKRYRYANNDMQELEARLKEAREAGARHVLIAT DGVFSMDGVIANLKGVCDLADKYDALVMVDDSHAVGFVGENGRGSHEYCDVMGRVDIITG TLGKALGGASGGYTAARKEVVEWLRQRSRPYLFSNSLAPAIVAASIKVLEMVEAGSELRD RLWANARQFREQMSAAGFTLAGADHAIIPVMLGDAVVAQKFARELQKEGIYVTGFFYPVV PKGQARIRTQMSAAHTPEQITRAVEAFTRIGKQLGVIA >gi|296493097|gb|ADTK01000404.1| GENE 20 22885 - 23817 1095 310 aa, chain + ## HITS:1 COG:ECs4497 KEGG:ns NR:ns ## COG: ECs4497 COG0451 # Protein_GI_number: 15833751 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Escherichia coli O157:H7 # 1 310 1 310 310 618 99.0 1e-177 MIIVTGGAGFIGSNIVKALNDKGITDILVVDNLKDGTKFVNLVDLDIADYMDKEDFLIQI MAGEEFGDVEAIFHEGACSSTTEWDGKYMMDNNYQYSKELLHYCLEREIPFLYASSAATY GGRTSDFIESREYEKPLNVYGYSKFLFDEYVRQILPEANSQIVGFRYFNVYGPREGHKGS MASVAFHLNTQLNKGESPKLFEGSENFKRDFVYVGDVADVNLWFLENGVSGIFNLGTGRA 
ESFQAVADATLAYHKKGQIEYIPFPDKLKGRYQAFTQADLTNLRAAGYDKPFKTVAEGVM EYMAWLNRDA >gi|296493097|gb|ADTK01000404.1| GENE 21 23827 - 24873 920 348 aa, chain + ## HITS:1 COG:rfaF KEGG:ns NR:ns ## COG: rfaF COG0859 # Protein_GI_number: 16131491 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Escherichia coli K12 # 1 348 1 348 348 710 99.0 0 MKILVIGPSWVGDMMMSQSLYRTLQARYPQAIIDVMAPAWCRPLLSRMPEVNEAIPMPLG HGALEIGERRKLGHSLREKRYDRAYVLPNSFKSALVPFFAGIPHRTGWRGEMRYGLLNDV RVLDKEAWPLMVERYVALAYDKGIMRTAQDLPQPLLWPQLQVSEGEKSYTCNQFSLSSER PMIGFCPGAEFGPAKRWPHYHYAELAKQLIDEGYQVVLFGSAKDHEAGNEILAALNTEQQ AWCRNLAGETQLDQAVILIAACKAIVTNDSGLMHVAAALNRPLVALFGPSSPDFTPPLSH KARVIRLITGYHKVRKGDAAEGYHQSLIDITPQRVLEELNALLLQEEA >gi|296493097|gb|ADTK01000404.1| GENE 22 24877 - 25848 647 323 aa, chain + ## HITS:1 COG:rfaC KEGG:ns NR:ns ## COG: rfaC COG0859 # Protein_GI_number: 16131492 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Escherichia coli K12 # 1 312 1 312 319 605 97.0 1e-173 MRVLIVKTSSMGDVLHTLPALTDAQQAIPGIKFDWVVEEGFAQIPSWHAAVERVIPVAIR RWRKAWFSAPIKAERKAFREALQAENYDAVIDAQGLVKSAALVTRLAHGVKHGLDWQTAR EPLASLFYNRKHHIAKQQHAVERTRELFAKSLGYSKPQTQGDYAIAQHFLTNLPTDAGEY AVFLHATTRDDKHWPEEHWRELIGLLADSGIRIKLPWGAPHEEERAKRLAEGFAYVEVLP KMSLEGVARVLAGAKFVVSVDTGLSHLTAALDRPNITVYGPTDPGLIGGYGKNQMVCRAP GNELSQLTANAVKRFIEENAAMI >gi|296493097|gb|ADTK01000404.1| GENE 23 25918 - 26394 141 158 aa, chain - ## HITS:1 COG:rfaL KEGG:ns NR:ns ## COG: rfaL COG3307 # Protein_GI_number: 16131493 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipid A core - O-antigen ligase and related enzymes # Organism: Escherichia coli K12 # 2 143 265 408 419 79 38.0 2e-15 MRMNDLNNDLVNYSHDNTRTSVGARLAMYEVGLKTYSPIGQSLEKRAEKIHELEEKEPRL SGALPYVDSHLHNDLIDTLSTRGIPGVVLTILAFSAILIYALRTAKEPYILILLFSLLVV GLSDVILFSKPVPTAVFITIILLCAYFKAQSDQCLLEK >gi|296493097|gb|ADTK01000404.1| GENE 24 27217 - 28200 117 327 aa, chain - ## HITS:1 
COG:BS_yveT KEGG:ns NR:ns ## COG: BS_yveT COG0463 # Protein_GI_number: 16080481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 6 208 3 214 344 118 32.0 2e-26 MSNDYPLVSIIIPTYNSSDYITETLTKLEKQTYPNFEIVIVNDGSKDNTSNVLREYGLTH SRLIIINKENGGVSSARNTGIRKAQGQFICFMDDDDEIDPNYLLKMHSRQHETGGDAIYC GLYGHHIKNGVTYSPINTEFNEGSLLFDFFYKKVRFHIGCLFIRKQLLEENNLFFDEDLR LGEDLDFIYRLLITCDMYAVPYYMYKHNYRENSLMNSCRTISHYRHESFAHEKIYSSVMQ LYKGNRKEEIHTLLSQNRAYHKTRYLWNVLLNGDFKQLNQLVESNEKELKDCNLPGKRDK RRAKILASKNYIIWRMVRLVNRKKNKR >gi|296493097|gb|ADTK01000404.1| GENE 25 28285 - 29310 666 341 aa, chain - ## HITS:1 COG:rfaJ KEGG:ns NR:ns ## COG: rfaJ COG1442 # Protein_GI_number: 16131497 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases # Organism: Escherichia coli K12 # 19 338 23 338 338 220 38.0 3e-57 MDLLAESITEVAVSGEIANTDRVLNIAYGIDRNFLFGAAVSMQSVVMHNPDLAVKFHLFT DYIDEDYLQRVNAFTSKNANVEVIIYKVSNAFIDIFPSLKQWSYATFFRLVAFQYLSETI ENLLYIDADVICKGSLAGLLDINFDEDKFAAVIKDVPFMQEKPAKRLAIEGLPGNYFNAG VVYLQLEAWAKNDFMNKAIAMLASDPQHTKYKCLDQDILNILFFGHCIFISGDYDCFYGI DYELKNKSDEDYKKTITDDTKLIHYVGVTKPWNDWTNYPCQKYFNEAYQASCWNDVAFIP ATNEKQYQVKYQHAKKNGDTFNAFIYFIKFKLNKYKRKLFG >gi|296493097|gb|ADTK01000404.1| GENE 26 29336 - 30028 484 230 aa, chain - ## HITS:1 COG:no KEGG:SDY_4057 NR:ns ## KEGG: SDY_4057 # Name: rfaY # Def: lipopolysaccharide core biosynthesis protein # Organism: S.dysenteriae # Pathway: Lipopolysaccharide biosynthesis [PATH:sdy00540]; Metabolic pathways [PATH:sdy01100] # 1 230 1 230 230 423 100.0 1e-117 MITSIRYRGFSFYYKDNDNKYKEIFDEILAYNFKTVKVLRNIDDTKVSLIDTKYGRYVFK VFAPKTKRNERFLKSFVKGDYYQNLIVETDRVRSAGLTFPNDFYFLAERKIFNYASVFIM LIEYVEGVELNDMPIIPENVKAEIKASMEKLHALNMLSGDPHRGNFIVSKDGVRIIDLSG KSCTAERKARDRLAMERHLGIANEIKDYGYYSVIYRTKLRKFIKKLKGKA >gi|296493097|gb|ADTK01000404.1| GENE 27 30038 - 31033 366 331 aa, chain - ## HITS:1 COG:STM3717 KEGG:ns 
NR:ns ## COG: STM3717 COG1442 # Protein_GI_number: 16767002 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases # Organism: Salmonella typhimurium LT2 # 13 331 18 332 336 275 46.0 7e-74 MNEFIKERFSYLADNKKENAPELNVSYGIDKNFLYGAGVSISSVLINNSDINFVFHVFTD YVDDDYLKSFNETAKQFNTSIIVYLIDPKYFADLPTSQFWSYATYFRVLSFEYLSESIST LLYLDADVVCKGSLKPLTEIIFKDEFAAVIPDNDSTQEACAKRLNIPEMNGRYFNAGVIY VNLKKWHEANLTPYLLKLLRGETKYGSLKYLDQDALNIAFNMNNIYLGKDFDTIYTLKNE LHDRSHRKFQQTITDKTVLIHYTGITKPWHSWAGYPSASYFNIAREQSPWKKYPLKEART VAEMQKQYKHLFAHGEYIKGITSLIKYKLKK >gi|296493097|gb|ADTK01000404.1| GENE 28 31050 - 32066 337 338 aa, chain - ## HITS:1 COG:STM3718 KEGG:ns NR:ns ## COG: STM3718 COG1442 # Protein_GI_number: 16767003 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases # Organism: Salmonella typhimurium LT2 # 1 338 1 337 337 357 52.0 1e-98 MSAHYFNPQEMITKTIIFDERPVASVASSFHVAYGIDKNFLFGCGVSITSVLLHNNDVSF VFHVFIDDIPEADIQRLAQLAKSYRTCIQIHLVNCERLKALPTTKNWSIAMYFRFVIADY FIDQQDKVLYLDADIACQGNLKPLITMDLANNIAAVVTERDANWWSLRGQSLQCNELEKG YFNSGVLLINTLAWAQESVSAKAMSMLADKAVVSRLTYMDQDILNLILSGKVKFIDAKYN TQFSLNYELKKSFVCPINDETVLIHYVGPTKPWHYWAGYPSARPFIKAKEASPWKNEPLM RPVNSNYARYCAKHNFKQNKPINGIMNYIYYFYLKIIK >gi|296493097|gb|ADTK01000404.1| GENE 29 32082 - 32879 512 265 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_4131 NR:ns ## KEGG: EcE24377A_4131 # Name: rfaP # Def: lipopolysaccharide core biosynthesis protein RfaP # Organism: E.coli_E24377A # Pathway: Lipopolysaccharide biosynthesis [PATH:ecw00540]; Metabolic pathways [PATH:ecw01100] # 1 265 1 265 265 530 100.0 1e-149 MVELKEPFATLWRGKDPFEEVKTLQGEVFRELETRRTLRFEMAGKSYFLKWHRGTTLKEI IKNLLSLRMPVLGADREWNAIHRLRDVGVDTMYGVAFGEKGMNPLTRTSFIITEDLTPTI SLEDYCADWATNPPDVRVKRMLIKRVATMVRDMHAAGINHRDCYICHFLLHLPFSGKEEE LKISVIDLHRAQLRTRVPRRWRDKDLIGLYFSSMNIGLTQRDIWRFMKVYFAAPLKDILK QEQGLLSQAEAKATKIRERTIRKSL >gi|296493097|gb|ADTK01000404.1| GENE 30 
32872 - 33996 720 374 aa, chain - ## HITS:1 COG:ECs4506 KEGG:ns NR:ns ## COG: ECs4506 COG0438 # Protein_GI_number: 15833760 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Escherichia coli O157:H7 # 1 374 1 374 374 764 99.0 0 MIVAFCLYKYFPFGGLQRDFMRIAQTVAARGHHVRVYTQSWEGECPDVFELIKVPVKSHT NHGRNAEYFAWVQKHLREHPVDKVVGFNKMPGLDVYYAADVCYAEKVAQEKGFFYRLTSR YRHYAAFERATFEQGKPTQLLMLTDKQIADFQKHYQTEAERFHILPPGIYPDRKYSQQPA NSREIFRKKNGITEQQYLLLQVGSDFTRKGVDRSIEALASLPDSLRHNTLLYVVGQDKPR KFEALAEKRGVRSNVHFFSGRNDVSELMAAADLLLHPAYQEAAGIVLLEAITAGLPVLTT AVCGYAHYIVDANCGEAIAEPFRQETLNEILRKALTQSSLRQAWAENARHYADTQDLYSL PEKAADIITGGLDG >gi|296493097|gb|ADTK01000404.1| GENE 31 33993 - 35015 617 340 aa, chain - ## HITS:1 COG:ECs4507 KEGG:ns NR:ns ## COG: ECs4507 COG0859 # Protein_GI_number: 15833761 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Escherichia coli O157:H7 # 1 340 13 352 352 699 100.0 0 MRFHGDMLLTTPVISSLKKNYPDAKIDVLLYQDTIPILSENPEINALYGIKNKKAKASEK IANFFHLIKVLRANKYDLIVNLTDQWMVAILVRLLNARVKISQDYHHRQSAFWRKSFTHL VPLQGGNVVESNLSVLTPLGLDSLVKQTTMSYPPASWKRMRRELDHAGVGQNYVVIQPTA RQIFKCWDNAKFSAVIDALHARGYEVVLTSGPDKDDLACVNEIAQGCQTPPVTALAGKVT FPELGALIDHAQLFIGVDSAPAHIAAAVNTPLISLFGATDHIFWRPWSNNMIQFWAGDYR EMPTRDQRDRNEMYLSVIPAADVIAAVDKLLPSSTTGTSL >gi|296493097|gb|ADTK01000404.1| GENE 32 35464 - 36741 1184 425 aa, chain + ## HITS:1 COG:ECs4508 KEGG:ns NR:ns ## COG: ECs4508 COG1519 # Protein_GI_number: 15833762 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic-acid transferase # Organism: Escherichia coli O157:H7 # 1 425 1 425 425 831 100.0 0 MLELLYTALLYLIQPLIWIRLWVRGRKAPAYRKRWGERYGFYRHPLKPGGIMLHSVSVGE TLAAIPLVRALRHRYPDLPITVTTMTPTGSERVQSAFGKDVQHVYLPYDLPDALNRFLNK VDPKLVLIMETELWPNLIAALHKRKIPLVIANARLSARSAAGYAKLGKFVRRLLRRITLI AAQNEEDGARFVALGAKNNQVTVTGSLKFDISVTPQLAAKAVTLRRQWAPHRPVWIATST HEGEESVVIAAHQALLQQFPNLLLILVPRHPERFPDAINLVRQAGLSYITRSSGEVPSTS 
TQVVVGDTMGELMLLYGIADLAFVGGSLVERGGHNPLEAAAHAIPVLMGPHTFNFKDICA RLEQASGLITVTDATTLAKEVSSLLTDADYRSFYGRHAVEVLYQNQGALQRLLQLLEPYL PPKTH >gi|296493097|gb|ADTK01000404.1| GENE 33 36749 - 37228 393 159 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 [Bacillus selenitireducens MLS10] # 2 159 3 160 164 155 46 4e-37 MQKRAIYPGTFDPITNGHIDIVTRATQMFDHVILAIAASPSKKPMFTLEERVALAQQATA HLGNVEVVGFSDLMANFARNQHATVLIRGLRAVADFEYEMQLAHMNRHLMPELESVFLMP SKEWSFISSSLVKEVARHQGDVTHFLPENVHQALMAKLA >gi|296493097|gb|ADTK01000404.1| GENE 34 37267 - 38076 766 269 aa, chain - ## HITS:1 COG:ECs4510 KEGG:ns NR:ns ## COG: ECs4510 COG0266 # Protein_GI_number: 15833764 # Func_class: L Replication, recombination and repair # Function: Formamidopyrimidine-DNA glycosylase # Organism: Escherichia coli O157:H7 # 1 269 1 269 269 556 100.0 1e-158 MPELPEVETSRRGIEPHLVGATILHAVVRNGRLRWPVSEEIYRLSDQPVLSVQRRAKYLL LELPEGWIIIHLGMSGSLRILPEELPPEKHDHVDLVMSNGKVLRYTDPRRFGAWLWTKEL EGHNVLAHLGPEPLSDDFNGEYLHQKCAKKKTAIKPWLMDNKLVVGVGNIYASESLFAAG IHPDRLASSLSLAECELLARVIKAVLLRSIEQGGTTLKDFLQSDGKPGYFAQELQVYGRK GEPCRVCGTPIVATKHAQRATFYCRQCQK >gi|296493097|gb|ADTK01000404.1| GENE 35 38174 - 38341 280 55 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15804177|ref|NP_290216.1| 50S ribosomal protein L33 [Escherichia coli O157:H7 EDL933] # 1 55 1 55 55 112 100 5e-24 MAKGIREKIKLVSSAGTGHFYTTTKNKRTKPEKLELKKFDPVVRQHVIYKEAKIK >gi|296493097|gb|ADTK01000404.1| GENE 36 38362 - 38598 403 78 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15804178|ref|NP_290217.1| 50S ribosomal protein L28 [Escherichia coli O157:H7 EDL933] # 1 78 1 78 78 159 100 3e-38 MSRVCQVTGKRPVTGNNRSHALNATKRRFLPNLHSHRFWVESEKRFVTLRVSAKGMRVID KKGIDTVLAELRARGEKY >gi|296493097|gb|ADTK01000404.1| GENE 37 38815 - 39483 461 222 aa, chain - ## HITS:1 COG:ECs4513 KEGG:ns NR:ns ## COG: ECs4513 COG2003 # Protein_GI_number: 15833767 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Escherichia coli O157:H7 # 1 222 3 
224 224 432 99.0 1e-121 MKNNAQLLMPREKMLKFGISALTDVELLALFLRTGTRGKDVLTLAKEMLENFGSLYGLLT SEYEQFSGVHGIGVAKFAQLKGIAELARRYYNVRMREESPLLSPEMTREFLQSQLTGEER EIFMVIFLDSQHRVITHSRLFSGTLNHVEVHPREIIREAIKINASALILAHNHPSGCAEP SKADKLITERIIKSCQFMDLRVLDHIVIGRGEYVSFAERGWI >gi|296493097|gb|ADTK01000404.1| GENE 38 39655 - 40875 1149 406 aa, chain + ## HITS:1 COG:ECs4514 KEGG:ns NR:ns ## COG: ECs4514 COG0452 # Protein_GI_number: 15833768 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Escherichia coli O157:H7 # 1 406 25 430 430 763 99.0 0 MSLAGKKIVLGVSGGIAAYKTPELVRRLRDRGADVRVAMTEAAKAFITPLSLQAVSGYPV SDSLLDPAAEAAMGHIELGKWADLVILAPATADLIARVAAGMANDLVSTICLATPARVAV LPAMNQQMYRAAATQHNLEVLASRGLLIWGPDSGSQACGDIGPGRMLDPLTIVDMAVAHF SPVNDLKHLNIMITAGPTREPLDPVRYISNHSSGKMGFAIAAAAARRGANVTLVSGPVSL PTPPFVKRVDVMTALEMEAAVNASVQQQNIFIGCAAVADYRAATVAPEKIKKQATQGDEF TIKMVKNPDIVAGVAALKDHRPYVVGFAAETNNVEEYARQKRIRKNLDLICANDVSQPTQ GFNSDNNALHLFWQDGDKVLPLERKELLGQLLLDEIVTRYDEKNRR >gi|296493097|gb|ADTK01000404.1| GENE 39 40853 - 41311 553 152 aa, chain + ## HITS:1 COG:ECs4515 KEGG:ns NR:ns ## COG: ECs4515 COG0756 # Protein_GI_number: 15833769 # Func_class: F Nucleotide transport and metabolism # Function: dUTPase # Organism: Escherichia coli O157:H7 # 2 152 1 151 151 298 100.0 2e-81 MMKKIDVKILDPRVGKEFPLPTYATSGSAGLDLRACLDDAVELAPGDTTLVPTGLAIHIA DPSLAAMMLPRSGLGHKHGIVLGNLVGLIDSDYQGQLMISVWNRGQDSFTIQPGERIAQM IFVPVVQAEFNLVEDFDATDRGEGGFGHSGRQ >gi|296493097|gb|ADTK01000404.1| GENE 40 41418 - 42014 686 198 aa, chain + ## HITS:1 COG:ECs4516 KEGG:ns NR:ns ## COG: ECs4516 COG1309 # Protein_GI_number: 15833770 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 198 15 212 212 340 100.0 7e-94 MAEKQTAKRNRREEILQSLALMLESSDGSQRITTAKLAASVGVSEAALYRHFPSKTRMFD SLIEFIEDSLITRINLILKDEKDTTARLRLIVLLLLGFGERNPGLTRILTGHALMFEQDR LQGRINQLFERIEAQLRQVLREKRMREGEGYTTDETLLASQILAFCEGMLSRFVRSEFKY RPTDDFDARWPLIAAQLQ 
>gi|296493097|gb|ADTK01000404.1| GENE 41 42060 - 43019 540 319 aa, chain - ## HITS:1 COG:FN0161 KEGG:ns NR:ns ## COG: FN0161 COG3344 # Protein_GI_number: 19703506 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Fusobacterium nucleatum # 25 313 56 339 349 181 35.0 2e-45 MDATRTTLLALDLFGSPGWSADKEIQRLHALSNHAGRHYRRIILSKRHGGQRLVLAPDYL LKTVQRNILKNVLSQFPLSPFATAYRPGCPIVSNAQPHCQQPQILKLDIENFFDSISWLQ VWRVFRQAQLPRNVVTMLTWICCYNDALPQGAPTSPAISNLVMRRFDERIGEWCQARGIT YTRYCDDMTFSGHFNARQVKNKVCGLLAELGLSLNKRKGCLIAACKRQQVTGIVVNHKPQ LAREARRALRQEVHLCQKYGVISHLSHRGELDPSGDLHAQATAYLYALQGRINWLLQINP EDEAFQQARESVKRMLVAW >gi|296493097|gb|ADTK01000404.1| GENE 42 43317 - 43958 793 213 aa, chain - ## HITS:1 COG:pyrE KEGG:ns NR:ns ## COG: pyrE COG0461 # Protein_GI_number: 16131513 # Func_class: F Nucleotide transport and metabolism # Function: Orotate phosphoribosyltransferase # Organism: Escherichia coli K12 # 1 213 1 213 213 421 100.0 1e-118 MKPYQRQFIEFALSKQVLKFGEFTLKSGRKSPYFFNAGLFNTGRDLALLGRFYAEALVDS GIEFDLLFGPAYKGIPIATTTAVALAEHHDLDLPYCFNRKEAKDHGEGGNLVGSALQGRV MLVDDVITAGTAIRESMEIIQANGATLAGVLISLDRQERGRGEISAIQEVERDYNCKVIS IITLKDLIAYLEEKPEMAEHLAAVKAYREEFGV >gi|296493097|gb|ADTK01000404.1| GENE 43 44024 - 44740 846 238 aa, chain - ## HITS:1 COG:ECs4518 KEGG:ns NR:ns ## COG: ECs4518 COG0689 # Protein_GI_number: 15833772 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNase PH # Organism: Escherichia coli O157:H7 # 1 238 1 238 238 416 99.0 1e-116 MRPAGRSNNQVRPVTLTRNYTKHAEGSVLVEFGDTKVLCTASIEEGVPRFLKGQGQGWIT AEYGMLPRSTHTRNAREAAKGKQGGRTMEIQRLIARALRAAVDLKALGEFTITLDCDVLQ ADGGTRTASITGACVALADALQKLVENGKLKTNPMKGMVAAVSVGIVNGEAICDLEYVED SAAETDMNVVMTEDGRIIEVQGTAEGEPFTHEELLTLLALARGGIESIVATQKAALAN >gi|296493097|gb|ADTK01000404.1| GENE 44 44867 - 45730 1127 287 aa, chain + ## HITS:1 COG:ECs4519 KEGG:ns NR:ns ## COG: ECs4519 COG1561 # Protein_GI_number: 15833773 # Func_class: S Function unknown # Function: Uncharacterized 
stress-induced protein # Organism: Escherichia coli O157:H7 # 1 287 1 287 287 456 100.0 1e-128 MIRSMTAYARREIKGEWGSATWEMRSVNQRYLETYFRLPEQFRSLEPVVRERIRSRLTRG KVECTLRYEPDVSAQGELILNEKLAKQLVTAANWVKMQSDEGEINPVDILRWPGVMAAQE QDLDAIAAEILAALDGTLDDFIVARETEGQALKALIEQRLEGVTAEVVKVRAHMPEILQW QRERLVAKLEDAQVQLENNRLEQELVLLAQRIDVAEELDRLEAHVKETYNILKKKEAVGR RLDFMMQEFNRESNTLASKSINAEVTNSAIELKVLIEQMREQIQNIE >gi|296493097|gb|ADTK01000404.1| GENE 45 45951 - 46553 496 200 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_3816 NR:ns ## KEGG: ECIAI1_3816 # Name: dinD # Def: DNA-damage-inducible protein D # Organism: E.coli_IAI1 # Pathway: not_defined # 1 200 5 204 204 377 100.0 1e-103 MNEHHQPFEEIKLINANGAEQWSARQLGKLLGYSEYRHFIPVLTRAKEACENSGHTIDDH FEEILDMVKIGSNAKRALKDIVLSRYACYLVVQNGDPAKPVIAAGQTYFAIQTRRQELAD DEAFKQLREDEKRLFLRNELKEHNKQLVEAAQQANTTHFDVGSKVRQTIQELGGTMPEEL PTPQVSIKQLENSVKITEKK >gi|296493097|gb|ADTK01000404.1| GENE 46 46844 - 47461 835 205 aa, chain + ## HITS:1 COG:yicG KEGG:ns NR:ns ## COG: yicG COG2860 # Protein_GI_number: 16131517 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 205 19 223 223 326 100.0 2e-89 MLLHILYLVGITAEAMTGALAAGRRRMDTFGVIIIATATAIGGGSVRDILLGHYPLGWVK HPEYVIIVATAAVLTTIVAPVMPYLRKVFLVLDALGLVVFSIIGAQVALDMGHGPIIAVV AAVTTGVFGGVLRDMFCKRIPLVFQKELYAGVSFASAVLYIALQHYVSNHDVVIISTLVF GFFARLLALRLKLGLPVFYYSHEGH >gi|296493097|gb|ADTK01000404.1| GENE 47 47458 - 48975 1067 505 aa, chain - ## HITS:1 COG:yicF KEGG:ns NR:ns ## COG: yicF COG0272 # Protein_GI_number: 16131518 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Escherichia coli K12 # 1 505 58 562 562 959 99.0 0 MEDGVYDQLSARLTQWQRCFGSEPRDVMMPPLNGAVMHPVAHTGVRKMVDKNALSLWMRE RSDLWVQPKVDGVAVTLVYRDGKLNKAISRGNGLKGEDWTQKVSLISAVPQTVSGPLANS TLQGEIFLQREGHIQQQMGGINARAKVAGLMMRQDDSDTLNSLGVFVWAWPDGPQLMSDR LKELATAGFTLTQTYTRAVKNADEVARVRNEWWKAELPFVTDGVVVRGAKEPESRHWLPG 
QAEWLVAWKYQPVAQVAEVKAIQFAVGKSGKISVVASLAPVMLDDKKVQRVNIGSVRRWQ EWDIAPGDQILVSLAGQGIPRIDDVVWRGAERTKPTPPENRFNSLTCYFASDVCQEQFIS RLVWLGSKQVLGLDGIGEAGWRALHQTHRFEHIFSWLLLTPEQLQNTPGIAKSKSAQLWH QFNLARKQPFTRWVMAMGIPLTRAALNASDERSWSQLLFSTEQFWQQLPGTGSGRARQVI EWKENAQIKKLGSWLAAQQITGFEP >gi|296493097|gb|ADTK01000404.1| GENE 48 49017 - 49259 64 80 aa, chain - ## HITS:1 COG:no KEGG:ECH74115_5019 NR:ns ## KEGG: ECH74115_5019 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 75 1 75 85 130 86.0 1e-29 MIYCNKDDSQEASSQTPFLPLGFAFTTLYICQPDYSGGEDESMDGDINKYLVLAIICVGG LSGPGRQPEHRKKFPACNSK >gi|296493097|gb|ADTK01000404.1| GENE 49 49399 - 50022 509 207 aa, chain + ## HITS:1 COG:gmk KEGG:ns NR:ns ## COG: gmk COG0194 # Protein_GI_number: 16131519 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Escherichia coli K12 # 1 207 1 207 207 398 100.0 1e-111 MAQGTLYIVSAPSGAGKSSLIQALLKTQPLYDTQVSVSHTTRQPRPGEVHGEHYFFVNHD EFKEMISRDAFLEHAEVFGNYYGTSREAIEQVLATGVDVFLDIDWQGAQQIRQKMPHARS IFILPPSKIELDRRLRGRGQDSEEVIAKRMAQAVAEMSHYAEYDYLIVNDDFDTALTDLK TIIRAERLRMSRQKQRHDALISKLLAD >gi|296493097|gb|ADTK01000404.1| GENE 50 50077 - 50352 487 91 aa, chain + ## HITS:1 COG:ECs4524 KEGG:ns NR:ns ## COG: ECs4524 COG1758 # Protein_GI_number: 15833778 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, subunit K/omega # Organism: Escherichia coli O157:H7 # 1 91 1 91 91 112 100.0 2e-25 MARVTVQDAVEKIGNRFDLVLVAARRARQMQVGGKDPLVPEENDKTTVIALREIEEGLIN NQILDVRERQEQQEQEAAELQAVTAIAEGRR >gi|296493097|gb|ADTK01000404.1| GENE 51 50371 - 52479 2156 702 aa, chain + ## HITS:1 COG:ECs4525 KEGG:ns NR:ns ## COG: ECs4525 COG0317 # Protein_GI_number: 15833779 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Escherichia coli O157:H7 # 1 702 1 702 702 1414 100.0 0 MYLFESLNQLIQTYLPEDQIKRLRQAYLVARDAHEGQTRSSGEPYITHPVAVACILAEMK 
LDYETLMAALLHDVIEDTPATYQDMEQLFGKSVAELVEGVSKLDKLKFRDKKEAQAENFR KMIMAMVQDIRVILIKLADRTHNMRTLGSLRPDKRRRIARETLEIYSPLAHRLGIHHIKT ELEELGFEALYPNRYRVIKEVVKAARGNRKEMIQKILSEIEGRLQEAGIPCRVSGREKHL YSIYCKMVLKEQRFHSIMDIYAFRVIVNDSDTCYRVLGQMHSLYKPRPGRVKDYIAIPKA NGYQSLHTSMIGPHGVPVEVQIRTEDMDQMAEMGVAAHWAYKEHGETSTTAQIRAQRWMQ SLLELQQSAGSSFEFIESVKSDLFPDEIYVFTPEGRIVELPAGATPVDFAYAVHTDIGHA CVGARVDRQPYPLSQPLTSGQTVEIITAPGARPNAAWLNFVVSSKARAKIRQLLKNLKRD DSVSLGRRLLNHALGGSRKLNEIPQENIQRELDRMKLATLDDLLAEIGLGNAMSVVVAKN LQHGDASIPPATQSHGHLPIKGADGVLITFAKCCRPIPGDPIIAHVSPGKGLVIHHESCR NIRGYQKEPEKFMAVEWDKETAQEFITEIKVEMFNHQGALANLTAAINTTTSNIQSLNTE EKDGRVYSAFIRLTARDRVHLANIMRKIRVMPDVIKVTRNRN >gi|296493097|gb|ADTK01000404.1| GENE 52 52486 - 53175 629 229 aa, chain + ## HITS:1 COG:ECs4526 KEGG:ns NR:ns ## COG: ECs4526 COG0566 # Protein_GI_number: 15833780 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Escherichia coli O157:H7 # 1 229 1 229 229 447 100.0 1e-126 MNPTRYARICEMLARRQPDLTVCMEQVHKPHNVSAIIRTADAVGVHEVHAVWPGSRMRTM ASAAAGSNSWVQVKTHRTIGDAVAHLKGQGMQILATHLSDNAVDFREIDYTRPTCILMGQ EKTGITQEALALADQDIIIPMIGMVQSLNVSVASALILYEAQRQRQNAGMYLRENSMLPE AEQQRLLFEGGYPVLAKVAKRKGLPYPHVNQQGEIEADADWWATMQAAG >gi|296493097|gb|ADTK01000404.1| GENE 53 53181 - 55262 2093 693 aa, chain + ## HITS:1 COG:ZrecG KEGG:ns NR:ns ## COG: ZrecG COG1200 # Protein_GI_number: 15804193 # Func_class: L Replication, recombination and repair; K Transcription # Function: RecG-like helicase # Organism: Escherichia coli O157:H7 EDL933 # 1 693 12 704 704 1305 99.0 0 MKGRLLDAVPLSSLTGVGAALSNKLAKINLHTVQDLLLHLPLRYEDRTHLYPIGELLPGV YATVEGEVLNCNISFGGRRMMTCQISDGSGILTMRFFNFSAAMKNSLATGRRVLAYGEAK RGKYGAEMIHPEYRVQGDLSTPELQETLTPVYPTTEGVKQATLRKLTDQALDLLDTCAIE ELLPPELSQGMMTLPEALRTLHRPPPTLQLSDLETGQHPAQRRLILEELLAHNLSMLALR AGAQRFHAQPLSANDALKNKLLAALPFKPTGAQARVVAEIEHDMALDVPMMRLVQGDVGS GKTLVAALAALRAIAHGKQVALMAPTELLAEQHANNFRNWFAPLGIEVGWLAGKQKGKAR LAQQEAIASGQVQMIVGTHAIFQEQVQFNGLALVIIDEQHRFGVHQRLALWEKGQQQGFH 
PHQLIMTATPIPRTLAMTAYADLDTSVIDELPPGRTPVTTVAIPDTRRTDIIDRVRHACM TEGRQAYWVCTLIEESELLEAQAAEATWEELKLALPELNVGLVHGRMKPAEKQAVMASFK QGELQLLVATTVIEVGVDVPNASLMIIENPERLGLAQLHQLRGRVGRGAVASHCVLLYKT PLSKTAQIRLQVLRDSNDGFVIAQKDLEIRGPGELLGTRQTGNAEFKVADLLRDQAMIPE VQRLARHIHERYPQQAKALIERWMPETERYSNA >gi|296493097|gb|ADTK01000404.1| GENE 54 55247 - 56113 565 288 aa, chain - ## HITS:1 COG:no KEGG:B21_03462 NR:ns ## KEGG: B21_03462 # Name: ybl159 # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 288 1 288 288 576 100.0 1e-163 MKKRRKEPETLREHCRHIFGDEPPVLCVWETEFDYADAELKALAAKEWQQISEWDLSAYY VLNLVYNEPMQIELFRYLFPLCLAQWHETVLAGGYGDHFEESLMKALCRPYLWQEMMNAS QRQQVRQFLLDTALQRMDNERGFNNVLCWLAVFNTLGGAAPLIRSLWSRWWALDTPGKAV CAIQYAAHLIYPIEANPLWSQEWIGWGHPLGHKDGWSSDNRAFLRQMLTPEMIVAGVQAA AEILRGEPEGAMAARIAQDAYEAMDILTIQIEDLLRDLSCDESGHALE >gi|296493097|gb|ADTK01000404.1| GENE 55 56116 - 57321 1487 401 aa, chain - ## HITS:1 COG:ECs4529 KEGG:ns NR:ns ## COG: ECs4529 COG0786 # Protein_GI_number: 15833783 # Func_class: E Amino acid transport and metabolism # Function: Na+/glutamate symporter # Organism: Escherichia coli O157:H7 # 1 401 1 401 401 619 100.0 1e-177 MFHLDTLATLVAATLTLLLGRKLVHSVSFLKKYTIPEPVAGGLLVALALLVLKKSMGWEV NFDMSLRDPLMLAFFATIGLNANIASLRAGGRVVGIFLIVVVGLLVMQNAIGIGMASLLG LDPLMGLLAGSITLSGGHGTGAAWSKLFIERYGFTNATEVAMACATFGLVLGGLIGGPVA RYLVKHSTTPNGIPDDQEVPTAFEKPDVGRMITSLVLIETIALIAICLTVGKIVAQLLAG TAFELPTFVCVLFVGVILSNGLSMMGFYRVFERAVSVLGNVSLSLFLAMALMGLKLWELA SLALPMLAILVVQTIFMALYAIFVTWRMMGKNYDAAVLAAGHCGFGLGATPTAIANMQAI TERFGPSHMAFLVVPMVGAFFIDIVNALVIKLYLMLPIFAG >gi|296493097|gb|ADTK01000404.1| GENE 56 57601 - 58992 1592 463 aa, chain + ## HITS:1 COG:ECs4530 KEGG:ns NR:ns ## COG: ECs4530 COG2233 # Protein_GI_number: 15833784 # Func_class: F Nucleotide transport and metabolism # Function: Xanthine/uracil permeases # Organism: Escherichia coli O157:H7 # 1 463 1 463 463 784 99.0 0 MSVSTLESENAQPVAQTQNSELIYRLEDRPPLPQTLFAACQHLLAMFVAVITPALLICQA 
LGLPAQDTQHIISMSLFASGVASIIQIKAWGPVGSGLLSIQGTSFNFVAPLIMGGTALKT GGADVPTMMAALFGTLMLASCTEMVLSRVLHLARRIITPLVSGVVVMIIGLSLIQVGLTS IGGGYAAMSDNTFGAPKNLLLAGVVLALIILLNRQRNPYLRVASLVIAMAAGYALAWFMG MLPESNEPMTQELIMVPTPLYYGLGIEWSLLLPLMLVFMITSLETIGDITATSDVSEQPV SGPLYMKRLKGGVLANGLNSFVSAVFNTFPNSCFGQNNGVIQLTGVASRYVGFVVALMLI VLGLFPAVSGFVQHIPEPVLGGATLVMFGTIAASGVRIVSREPLNRRAILIIALSLAVGL GVSQQPLILQFAPEWLKNLLSSGIAAGGITAIVLNLIFPPEKQ >gi|296493097|gb|ADTK01000404.1| GENE 57 59113 - 60822 1484 569 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_4159 NR:ns ## KEGG: EcE24377A_4159 # Name: not_defined # Def: AsmA family protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 569 1 569 569 1076 99.0 0 MKFIGKLLLYILIALLVVIAGLYFLLQTRWGAEHISAWVSENSDYHLAFGAMDHRFSAPS HIVLENVTFGRDGQPATLVAKSVDIALSSRQLTEPRHVDTILLENGTLNLTDQTAPLPFK ADRLQLRDMAFNSPNSEWKLSAQRVNGGVVPWSPEAGKVLGTKAQIQFSAGSLSLNDVPA TNVLIEGSIDNDRVTLTNLGADIARGTLTGNAQRNADGSWQVENLRMADIRLQSEKSLTD FFAPLRSVPSLQIGRLEVIDARLQGPDWAVTDLDLSLRNMTFSKDDWQTQEGKLSMNASE FIYGSLHLFDPIINAEFSPQGVALRQFTSRWEGGMVRTSGNWLRDGKTLILDDAAIAGLE YTLPKNWQQLWMETTPGWLNSLQLKRFSASRNLIIDIDPDFPWQLTTLDGYGANLTLVTD HKWGVWSGSANLNAAAATFNRVDVRRPSLALTANSSTVNISELSAFTEKGILEATASVSQ TPQRQTHISLNGRGVPVNILQQWGWPKLPLTGDGNIQLTASGDIQANVPLKPTVSGQLHA VNAAKQQVTQTMNAGIVSSGEVTSTEPVR >gi|296493097|gb|ADTK01000404.1| GENE 58 60875 - 63193 2012 772 aa, chain - ## HITS:1 COG:yicI KEGG:ns NR:ns ## COG: yicI COG1501 # Protein_GI_number: 16131527 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Escherichia coli K12 # 1 772 1 772 772 1640 99.0 0 MKISDGNWLIQPGLNLIHPLQVFEVEQQDNEMVVYAAPRDVRERTWQLDTPLFTLRFFSP QEGIVGVRIEHFQGALNNGPHYPLNILQDVKVTIENTERYAEFKSGNLSARVSKGEFWSL DFLRNGERITGSQVKNNGYVQDTNNQRNYMFERLDLGVGETVYGLGERFTALVRNGQTVE TWNRDGGTSTEQAYKNIPFYMTNRGYGVLVNHPQCVSFEVGSEKVSKVQFSVESEYLEYF VIDGPTPKAVLDRYTRFTGRPALPPAWSFGLWLTTSFTTNYDEATVNSFIDGMAERNLPL HVFHFDCFWMKAFQWCDFEWDPLTFPDPEGMIRRLKAKGLKICVWINPYIGQKSPVFKEL 
QEKGYLLKRPDGSLWQWDKWQPGLAIYDFTNPDACKWYADKLKGLVAMGVDCFKTDFGER IPTDVQWFDGSDPQKMHNHYAYIYNELVWNVLKDTVGEEEAVLFARSASVGAQKFPVHWG GDCYANYESMAESLRGGLSIGLSGFGFWSHDIGGFENTAPAHVYKRWCAFGLLSSHSRLH GSKSYRVPWAYDDESCDVVRFFTQLKCRMMPYLYREAARANARGTLMMRAMMMEFPDDPA CDYLDRQYMLGDNVMVAPVFTEAGDVQFYLPEGRWTHLWHNDELDGSRWHKQQHGFLSLP VYVRDNTLLALGNNDQRPDYVWHEGTAFHLFNLQDGHEAVCEVPATDGSVIFTLKAARTG NTITVTGAGEAKNWTLCLRNVVKVNGLQDGSQAESEQGLVVKPQGNALTITL >gi|296493097|gb|ADTK01000404.1| GENE 59 63203 - 64585 1189 460 aa, chain - ## HITS:1 COG:yicJ KEGG:ns NR:ns ## COG: yicJ COG2211 # Protein_GI_number: 16131528 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli K12 # 1 460 20 479 479 877 100.0 0 MKSEVLSVKEKIGYGMGDAASHIIFDNVMLYMMFFYTDIFGIPAGFVGTMFLVARALDAI SDPCMGLLADRTRSRWGKFRPWVLFGALPFGIVCVLAYSTPDLSMNGKMIYAAITYTLLT LLYTVVNIPYCALGGVITNDPTQRISLQSWRFVLATAGGMLSTVLMMPLVNLIGGDNKPL GFQGGIAVLSVVAFMMLAFCFFTTKERVEAPPTTTSMREDLRDIWQNDQWRIVGLLTIFN ILAVCVRGGAMMYYVTWILGTPEVFVAFLTTYCVGNLIGSALAKPLTDWKCKVTIFWWTN ALLAVISLAMFFVPMQASITMFVFIFVIGVLHQLVTPIQWVMMSDTVDYGEWCNGKRLTG ISFAGTLFVLKLGLAFGGALIGWMLAYGGYDAAEKAQNSATISIIIALFTIVPAICYLLS AIIAKRYYSLTTHNLKTVMEQLAQGKRRCQQQFTSQEVQN Prediction of potential genes in microbial genomes Time: Mon May 16 16:24:30 2011 Seq name: gi|296493096|gb|ADTK01000405.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1369.1, whole genome shotgun sequence Length of sequence - 2577 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 2577 1783 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member Predicted protein(s) >gi|296493096|gb|ADTK01000405.1| GENE 1 3 - 2577 1783 858 aa, chain + ## HITS:1 COG:PSLT108 KEGG:ns NR:ns ## COG: PSLT108 COG0507 # Protein_GI_number: 17233470 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Salmonella typhimurium LT2 # 10 858 889 1734 1752 1345 87.0 0 TDLLHGNTQEAKSFAAEGTSFTELGGEINAQIKRGDLLYVDVAKGYGTGLLVSRASYEAE KSILRHILEGKEAVTPLMERVPGELMEKLTSGQRAATRMILETSDRFTVVQGYAGVGKTT QFRAVMSAVNMLPESERPRVVGLGPTHRAIGEMRSAGVDAQTLASFLHDTQLLQRSGETP NFSNTLFLLDESSMVGNTDMARAYALIAAGGGRAVASGDTDQLQAIAPGQPFRLQQTRSA ADVVIMKEIVRQTPELREAVYSLINRDVERALSGLESVKPSQVPRQEGAWAPEHSVTEFS HSQEAKLAEAQQKAMLKGEAFPDIPMTLYEAIVRDYTGRTPEAREQTLIVTHLNEDRRVL NSMIHDAREKAGELGKEQVMVPVLNTANIRDGELRRLSTWENNPDALALVDSVYHRIAGI SKDDGLITLEDAEGNTRLISPREAVAEGVTLYTPDTIRVGTGDRMRFTKSDRERGYVANS VWTVTAVSGDSVTLSDGQQTRVIRPGQERAEQHIDLAYAITAHGAQGASETFAIALEGTE GNRKLMAGFESAYVALSRMKQHVQVYTDNRQGWTDAINNAVQKGTAHDVLEPKSDREVMN AERLFSTARELRDVVAGRAVLRQAGLAGGDSPARFIAPGRKYPQPYVALPAFDRNGKSAG IWLNPLTTDDGNGLRGFSGEGRVKGSGDAQFVALQGSRNGESLLADNMQDGVRIARDNPD SGVVVRIAGEGRPWNPRTITGGRVWGDIPDNSVQPGAGNGEPVTAEVLAQRQAEEAIRRE TERRADEIVRKMAENKPDLPDGKTEQAVRDIAGLERDRSAISEREAALPESVLREPQRVR EAVREVARENLLQERLQQ Prediction of potential genes in microbial genomes Time: Mon May 16 16:24:31 2011 Seq name: gi|296493095|gb|ADTK01000406.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1369.2, whole genome shotgun sequence Length of sequence - 2419 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 68 - 814 621 ## APECO1_O1CoBM59 conjugal transfer pilus acetylation protein TraX 2 1 Op 2 . + CDS 873 - 1733 488 ## COG1073 Hydrolases of the alpha/beta superfamily 3 2 Tu 1 . 
+ CDS 1836 - 2396 511 ## APECO1_O1CoBM61 conjugal transfer fertility inhibition protein FinO Predicted protein(s) >gi|296493095|gb|ADTK01000406.1| GENE 1 68 - 814 621 248 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM59 NR:ns ## KEGG: APECO1_O1CoBM59 # Name: traX # Def: conjugal transfer pilus acetylation protein TraX # Organism: E.coli_APEC # Pathway: not_defined # 13 248 13 248 248 424 100.0 1e-117 MTTDNTNTTRNDSLAARTDTWLQSLLVWSPGQRDIIKTVALVLMVLDHANRILHLDQSWM FLVGRGAFPLFALVWGLNLSRHTHIRQEAINRLWGWAVIAQFAYYLAGFPWYEGNILFAF AVAAQVLTWCETRTWWRSAETMLLLAMWLPFSGTSYGIAGLLMLAVSHRLYRAEDRMERL ALVACLLAVIPALNLATSDAAAVAGLVMTVLTVGLVSCTGKSLPRFWYGDFFPTFYACHL TVLGVLAV >gi|296493095|gb|ADTK01000406.1| GENE 2 873 - 1733 488 286 aa, chain + ## HITS:1 COG:PA3829 KEGG:ns NR:ns ## COG: PA3829 COG1073 # Protein_GI_number: 15599024 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Pseudomonas aeruginosa # 10 286 11 295 307 105 25.0 1e-22 MKITDHKLPEGIALTFRVPEGNIKHPLILLCHGFCGIRNVLLPSFANAFTEAGFATITFD YRGFGESEGERGRLVPAMQTEDIISVINWAEKQVCIDNQRIGLWGTSLGGCHVFNAAAQD KRVKSIVSQLAFADGEVLVTGEMNELEKASFLSTLNKMAEKKKNTGKEMFVGVTRVLSDN ESKVFFEKVKGQYPEMDIKIPFLTVMETLQYKPAESAAKVQCPVLVVIAGQDSVNPPEQG RALYDAVASGTKELYEEADACHYDIYKGAFFERVAAVQTQWFKKHL >gi|296493095|gb|ADTK01000406.1| GENE 3 1836 - 2396 511 186 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM61 NR:ns ## KEGG: APECO1_O1CoBM61 # Name: finO # Def: conjugal transfer fertility inhibition protein FinO # Organism: E.coli_APEC # Pathway: not_defined # 1 186 1 186 186 309 100.0 3e-83 MTEQKRPVLTLKRKTEGETPTRSRKTIINVTTPPKWKVKKQKLAEKAAREAELAAKKAQA RQALSIYLTLPSLDEAVNTLKPWWPGLFDGNTPRLLACGIRDVLLEDVAQRNIPLSHKKL RRALKAITRSESYLCAMKAGACRYDTEGYVTEHISQEEEAYAAGRLEKIRRQNRTKAELQ AVLDGK Prediction of potential genes in microbial genomes Time: Mon May 16 16:24:38 2011 Seq name: gi|296493094|gb|ADTK01000407.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1369.3, whole genome shotgun sequence Length of sequence - 1203 bp 
Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 57 - 116 2.5 1 1 Tu 1 . + CDS 139 - 351 105 ## APECO1_O1CoBM62 hypothetical protein + Term 428 - 461 -0.9 2 2 Tu 1 . - CDS 767 - 976 119 ## gi|222104846|ref|YP_002539335.1| hypothetical protein MM1_0059 + Prom 827 - 886 3.7 3 3 Tu 1 . + CDS 1032 - 1181 92 ## APECO1_O1CoBM63 hypothetical protein Predicted protein(s) >gi|296493094|gb|ADTK01000407.1| GENE 1 139 - 351 105 70 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM62 NR:ns ## KEGG: APECO1_O1CoBM62 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 70 1 70 70 90 100.0 2e-17 MNGFRNSSRNGQVWRYQRAGGRAVILEVSGRWMEAAEAWRRAACIAPRTDWQQFARKRAE HCHRRCRGRV >gi|296493094|gb|ADTK01000407.1| GENE 2 767 - 976 119 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|222104846|ref|YP_002539335.1| ## NR: gi|222104846|ref|YP_002539335.1| hypothetical protein MM1_0059 [Escherichia coli] # 1 69 1 69 69 128 100.0 1e-28 MQTSRVNGLTSGVFAFLVPASCLNQKGSNTRRDNTTFPVADYYPHLPVFVTKTPGAGENL LFPVSGGDN >gi|296493094|gb|ADTK01000407.1| GENE 3 1032 - 1181 92 49 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM63 NR:ns ## KEGG: APECO1_O1CoBM63 # Name: srnB # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 49 55 103 103 91 100.0 8e-18 MTKYALIGLLAVCATVLCFSLIFRERLCELNIHRGNTVVQVTLAYEARK Prediction of potential genes in microbial genomes Time: Mon May 16 16:24:47 2011 Seq name: gi|296493093|gb|ADTK01000408.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1369.4, whole genome shotgun sequence Length of sequence - 2449 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 100 - 159 2.8 1 1 Tu 1 . + CDS 292 - 549 203 ## APECO1_O1CoBM64 replication protein 2 2 Op 1 . 
+ CDS 1046 - 1330 83 ## APECO1_O1CoBM66 hypothetical protein 3 2 Op 2 . + CDS 1323 - 2120 683 ## APECO1_O1CoBM67 replication protein Predicted protein(s) >gi|296493093|gb|ADTK01000408.1| GENE 1 292 - 549 203 85 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM64 NR:ns ## KEGG: APECO1_O1CoBM64 # Name: repA2 # Def: replication protein # Organism: E.coli_APEC # Pathway: not_defined # 1 85 1 85 85 105 100.0 4e-22 MSQTENAVTSSSGTKRTYRKGNPVPARERQRASLARRSNTHKAFHAVIQARLKDRLSELA DEEGITQAQMLEKLIESELKRRATS >gi|296493093|gb|ADTK01000408.1| GENE 2 1046 - 1330 83 94 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM66 NR:ns ## KEGG: APECO1_O1CoBM66 # Name: repA # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 94 71 164 164 202 100.0 3e-51 MVSFYVSLVQVMCAHYIPDTNKVNLSVKELSVCSGGSYTRVWRALKTLDNDLHLIAFDGG TIWFRPDMFETLRVGPDELVTARRRGNSVGGGHG >gi|296493093|gb|ADTK01000408.1| GENE 3 1323 - 2120 683 265 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM67 NR:ns ## KEGG: APECO1_O1CoBM67 # Name: repA1 # Def: replication protein # Organism: E.coli_APEC # Pathway: not_defined # 1 265 1 285 285 455 92.0 1e-127 MADLLQKYYSQVKNPNPVFTPREGAGTLKFCEKLMEKAVGFTSRFDFAIHVAHARSKGLR RRMPPVLRRRAIDALLQGLCFHYDPLANRVQCSITRASRALTFLAELGLITYQTEYDPLI GCYIPTDITFTPALFAALDVSEDAVVAARRSRVEWENRQRKKQGLDTLGMDELIAKAWRF VRERFRSYQTELKSRGIKRARARRDANRERQDIVTLVKRQLTREISEGRFTANREAVKRE VERRVKERMILSRNRNYSRLATASP Prediction of potential genes in microbial genomes Time: Mon May 16 16:24:57 2011 Seq name: gi|296493092|gb|ADTK01000409.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1430.1, whole genome shotgun sequence Length of sequence - 7034 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 35 - 280 245 ## COG3209 Rhs family protein - Term 298 - 348 5.4 2 2 Op 1 . 
- CDS 461 - 682 216 ## EC55989_0557 hypothetical protein 3 2 Op 2 5/0.000 - CDS 709 - 5571 3146 ## COG3209 Rhs family protein 4 2 Op 3 5/0.000 - CDS 5591 - 6052 349 ## COG5435 Uncharacterized conserved protein 5 2 Op 4 . - CDS 6080 - 7030 613 ## COG3501 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|296493092|gb|ADTK01000409.1| GENE 1 35 - 280 245 81 aa, chain - ## HITS:1 COG:Z0705 KEGG:ns NR:ns ## COG: Z0705 COG3209 # Protein_GI_number: 15800280 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli O157:H7 EDL933 # 9 68 476 535 1645 121 91.0 4e-28 MLANIETSNPYHAQYILRRRDDYLVLFDRDALSCRFFYDAFPGMRLRHQVTGDTDDDRLA HSPADRMYTGKDHQKYTFCTK >gi|296493092|gb|ADTK01000409.1| GENE 2 461 - 682 216 73 aa, chain - ## HITS:1 COG:no KEGG:EC55989_0557 NR:ns ## KEGG: EC55989_0557 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 73 11 83 83 139 100.0 3e-32 MALDLARRELELREIPYIKNSLHANYSYKSISIGSKQGWLISAKLKVPETFEPDMIFIEI SDPEGFINIPDVL >gi|296493092|gb|ADTK01000409.1| GENE 3 709 - 5571 3146 1620 aa, chain - ## HITS:1 COG:ECs0605 KEGG:ns NR:ns ## COG: ECs0605 COG3209 # Protein_GI_number: 15829859 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli O157:H7 # 1 1519 1 1522 1616 2821 92.0 0 MSEGPGGPQGATAGGTLAMRMLSQQAMVASQMKRAANDKAIAQMLASKKSGPPAARLGDE IQHKSFLGALAGAVLGAIVTIAEGCLIMAACATGPYALVLVPALMYASYKASDYVEEKQN QLESWINSFCDTDGAINTGSENVKINGELAARAAVTLPPPPPPGAIPEVPQGEPSWGDIA TDLLESAAEKAVPLAKAWGNAVITLTDSNAGFMDRVSAGASLLFPAGPVLMEFATMVGGR GEIKKEVDFPEAGEDTALCDKENKPPRIAQGSSNVFINNQPAARKGDKLECSAVIVGGSP DVFIGGEQVTYLDIQPEFPPWQRMILGGITIASYLLPPAGLLGKLKNLARLGKLGNLLGK SGKLLGAKLGALLGKTGKSLKSIANKVIRWVTDPVDPVTGAYCDERTDFTLGQTLPLSFT RFHSSVLPLHGLTGVGWSDSWSEYAWVREQGNRVDIISQGATLRFAFDGDSDTTVNPYHA QYILRRRDDYLELFDRDALSSRFFYDAFPGMRLRHPVTDDTSDDRLAHSPNDRMYMLGGM SDTASNRITFERDSQYRITGVSHTDGIRLKLTYHASGYLKAIHRTDNGIQTLATYEQDAR 
GRLTEADARLDYHLFYEYDAADRIIRWSDNDQTWSRFTYDEQGRCVNVTGAEGYYNATLD YGDGCTTVTDGKGTHRYYYDPDGNILREEAPDGSTTTYEWDEFHHLLARHSPAGRVEKFE YNAALGQLSRYTAADGAEWLYRYDERGLLSNITDPAGQTWTQQCDERGLPVSLVSPQGEE TRLAYTAQGLLSGIFRQDERRLGIEYDHHNRPETLTDVMGREHHTEYSGHDLPVKMRGPG GQSVRLQWQQHHKLSGIERAGTGAEGFRYDRHGNLLAYTDGNGVVWTMEYGPFDLPVART DGEGHRWQYRYDKDTLQLTEVINPQGESYRYILDNCGRVTEERDWGGVVWRYRYDADGLC TARVNGLEETILYSRDAAGRLAEVITPEGKTQYAYDKSGRLTGIFSPDGISQRTGYDERG RVNVTTQGRRAIEYHYPDEHTVIRCILPPEDERDRHPDESLLKTTYRYNAAGELAEVILP GDETLTFSRDEAGREVLRHSNRGFACEQGWNAAGQPVSQRAGFFPEEATWGGLVPSLVRE YRYDSAGNVSAVTSREDYGRETRREYRLDRNGQVMAVTASGTGLGYGEGDESYGYDSCGY LKAQSAGRHRISEETDQYAGGHRLKQAGNTQYDYDAAGRMVSRTRHRDGYRPETERFRWD SRDQLTGYCSAQGELWEYRHDASGRRTEKRCDRKKIRFTYLWDGDSIAEIREYRDDKLYS VRHLVFNGFELISQQFSRVRQPHPSVAPQWVTRTNHAVSDLTGRPLMLFNSEGKTVWRPG QTSLWGLALSLPADTGYPDPRGELDPEAAPGLLYAGQWQDAESGLCYNRFRYYEPETGMY LVSDPLGLLGGEQTYRYVPNPCGWVDPLGLAASSKISSLMDYIGDGRRVSGHTGFLDGVR LSRSQINNIAKEMEKLGIKVIRKADKYLPPNARAAFDYGLRNIYLRKNATLYEVYHEVIH AKQFAKIGREAYEAPGRLSREEHVLNEILKSKNLFNEAEIAHAIKYVEGLREKFMMGLIN >gi|296493092|gb|ADTK01000409.1| GENE 4 5591 - 6052 349 153 aa, chain - ## HITS:1 COG:ECs0606 KEGG:ns NR:ns ## COG: ECs0606 COG5435 # Protein_GI_number: 15829860 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 153 1 153 153 288 98.0 4e-78 MQFTFNEGHIQLPSQWQDQSMQVLVSTDNSGINLVITREAVPQGTLTPELYQETLALYQG KLDGYTEHACREITLAEAPAWLLDYSWNGPEDEGNQGRISQIAVFQRRGDTLLTFTFSTS LSLKNSQKTMLLEVIKSFTPLPPENDIQKDQPR >gi|296493092|gb|ADTK01000409.1| GENE 5 6080 - 7030 613 316 aa, chain - ## HITS:1 COG:ECs0607 KEGG:ns NR:ns ## COG: ECs0607 COG3501 # Protein_GI_number: 15829861 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 316 318 633 633 617 95.0 1e-177 MVASELHGEQPQAVPGRSGSGTTLNNHFAVIPADRTWRPQPLLKPLVDGPQSAVVTGPAG EEIFCDEHGRVRVKFNWDRYNPSNQESSCWIRVAQAWAGTGFGNLAIPRVGQEVIVDFLN 
GDPDQPIIMGRTYHQENRTPGSLPGTKTQMTIRSKTYKGSGFNELKFDDATGKEQVYIHA QKNMDTEVLNDRTTTVKHDHRETVKNDQTVTIQEGNRLLTVEKGHKITGVLKGSLSEDVF QDRSTIAGSVHVDAVNNGGEGDGIQAYTAIKEILLAVEESKIALTPDGIQLQVGESTVIR LSKDGITIVGGSVFIN Prediction of potential genes in microbial genomes Time: Mon May 16 16:24:59 2011 Seq name: gi|296493091|gb|ADTK01000410.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1430.2, whole genome shotgun sequence Length of sequence - 251 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 250 69 ## COG3501 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|296493091|gb|ADTK01000410.1| GENE 1 1 - 250 69 83 aa, chain - ## HITS:1 COG:ECs0236 KEGG:ns NR:ns ## COG: ECs0236 COG3501 # Protein_GI_number: 15829490 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 83 269 351 713 174 98.0 3e-44 GQNFARWQMDGWRNNAEVARGTSRSPEIWPGRRIVLTGHPQANLNREWQVVASDLHGEQP QAVPGRRGSGTTLDNHFAVIPAD Prediction of potential genes in microbial genomes Time: Mon May 16 16:25:00 2011 Seq name: gi|296493090|gb|ADTK01000411.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1430.3, whole genome shotgun sequence Length of sequence - 1563 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.000 - CDS 2 - 869 523 ## COG3209 Rhs family protein 2 1 Op 2 . 
- CDS 945 - 1550 363 ## COG3501 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|296493090|gb|ADTK01000411.1| GENE 1 2 - 869 523 289 aa, chain - ## HITS:1 COG:Z0268 KEGG:ns NR:ns ## COG: Z0268 COG3209 # Protein_GI_number: 15799917 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli O157:H7 EDL933 # 1 289 1 289 1404 568 97.0 1e-162 MGGKPAARQGDMTRKGLDIVQGSAGVLIGAPTGVACSVCPGGITYANPVNPLLGAKVLPG ETDLALPGPLPFILSRAYSSYRTRTPAPVGVFGPGWKAPFDIRLQIRDEGLILNDSGGRS IHFEPLFPGEISYSRSESLWLARGGVAAQHSSQPLSALWQVLPEDVRLSPHVYLATNSLQ GPWWILSWPERVPGADEVLPPEPPAYRVLTGVVDGFGRTLAFHRAAKGDVAGAVTGVTDG AGRRFHLVLTTQAQRAEVFRKQRATSLSSPAGPRSASSSLVFPDTLPAG >gi|296493090|gb|ADTK01000411.1| GENE 2 945 - 1550 363 201 aa, chain - ## HITS:1 COG:ECs0236 KEGG:ns NR:ns ## COG: ECs0236 COG3501 # Protein_GI_number: 15829490 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 201 513 713 713 399 99.0 1e-111 MINNHAEKIGNNQAITVTNNQIQNIGVNQIQTVGVNQVETVGSNQIIKVGSNQVEKVGII RALTVGVAYQTTVGGIMNTSVALLQSSQVGLHKSLMVGMGYSVNVGNNVTFSVGKTMKEN TGQTAVYSAGEHLELCCGKARLVLTKDGSIFLNGTHIHLEGESDVNGDAPVINWNCGATQ PVPDAPVPKDLPPGMPDMRQF Prediction of potential genes in microbial genomes Time: Mon May 16 16:25:04 2011 Seq name: gi|296493089|gb|ADTK01000412.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1430.4, whole genome shotgun sequence Length of sequence - 12208 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 4, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1499 1107 ## COG3501 Uncharacterized protein conserved in bacteria - Prom 1626 - 1685 6.0 - Term 2184 - 2219 -0.5 2 2 Op 1 40/0.000 - CDS 2236 - 3684 1060 ## COG0642 Signal transduction histidine kinase 3 2 Op 2 . 
- CDS 3674 - 4357 686 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 4496 - 4555 2.7 + Prom 4340 - 4399 3.9 4 3 Op 1 3/1.000 + CDS 4514 - 5893 478 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 5 3 Op 2 3/1.000 + CDS 5920 - 6252 494 ## COG5569 Uncharacterized conserved protein 6 3 Op 3 11/0.000 + CDS 6268 - 7491 1181 ## COG0845 Membrane-fusion protein 7 3 Op 4 3/1.000 + CDS 7503 - 10646 3345 ## COG3696 Putative silver efflux pump + Prom 10659 - 10718 3.5 8 4 Tu 1 . + CDS 10748 - 12124 1589 ## COG1113 Gamma-aminobutyrate permease and related permeases + Term 12162 - 12200 5.0 Predicted protein(s) >gi|296493089|gb|ADTK01000412.1| GENE 1 2 - 1499 1107 499 aa, chain - ## HITS:1 COG:ECs0607 KEGG:ns NR:ns ## COG: ECs0607 COG3501 # Protein_GI_number: 15829861 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 499 1 499 633 995 97.0 0 MSTGLRFTLEVDGLPPDAFAVVSFHLNQSLSSLFSLDLSLVSQQFLSLEFAQVLDKMAYL TIWQGDDVQRRVKGMVTWFELGENDKNQMLYSMKVHPPLWRAGLRQNFRIFQNEDIKSIL GTMLQENGVTEWSPLFSEPHPSREFCVQYGETDYDFLCRMAAEEGIFFYEEHAYKSTDQS LVLCDTVRHLPESFETPWNPNTRTEVSTLCISQFRYSAQIRPSSVVTKDYTFKRPGWAGR FEQEGQHQDYQRTQYEVYDYPGRFKGAHGQNFARWQMDGWRNNAETARGMSRSPEIWPGR RIVLTGHPQANLNREWQVVASELHGEQPQAVPGRQGAGTALENHFAVIPADRTWRPQPLL KPLVDGPQSAVVTGPAGEEIFCDEHGRVRVKFNWDRYNPADQDSSCWIRVAQAWAGTGFG HLAIPRVGQEVIVDFLNGDPDQPIIMGRTYHQENRTPGSLPGTKTQMTIRTKTYMGSGFN ELKFDDAAGREQVYIHAQK >gi|296493089|gb|ADTK01000412.1| GENE 2 2236 - 3684 1060 482 aa, chain - ## HITS:1 COG:ECs0608 KEGG:ns NR:ns ## COG: ECs0608 COG0642 # Protein_GI_number: 15829862 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli O157:H7 # 1 481 1 481 482 919 97.0 0 MVSKPFQRPFSLATRLTFFISLATIAAFFAFAWIMIHSVKVHFAEQDINDLKEISATLER VLNHPDETQARRLMTLEDIVSGYSNVLISLADSHGKTVYHSPGAPDIREFTRDAIPDKDA 
RGGEVYLLSGPTIMMPGHGHGHMEHSNWRMINLSVGPLVDGKPIYTLYIALSIDFHLHYI NDLMNKLIMTASVISILIVFIVLLAVHKGHAPIRSVSRQIQNITSKDLDVRLDPQTVPIE LEQLVLSFNHMIERIEDVFTRQSNFSADIAHEIRTPITNLITQTEIALSQSRSQKELEDV LYSNLEELTRMAKMVSDMLFLAQADNNQLIPEKKMLNLADEVGKVFDFFEALAEDRGVEL RFVGDECQVAGDPLMLRRALSNLLSNALRYTPPGEAIVVRCQTVDHQVQVTVENPGTPIA PEHLPRLFDRFYRVDPSRQRKGEGSGIGLAIVKSIVVAHKGTVAVTSDARGTRFVIMLPE RE >gi|296493089|gb|ADTK01000412.1| GENE 3 3674 - 4357 686 227 aa, chain - ## HITS:1 COG:ECs0609 KEGG:ns NR:ns ## COG: ECs0609 COG0745 # Protein_GI_number: 15829863 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 227 1 227 227 432 100.0 1e-121 MKLLIVEDEKKTGEYLTKGLTEAGFVVDLADNGLNGYHLAMTGDYDLIILDIMLPDVNGW DIVRMLRSANKGMPILLLTALGTIEHRVKGLELGADDYLVKPFAFAELLARVRTLLRRGA AVIIESQFQVADLMVDLVSRKVTRSGTRITLTSKEFTLLEFFLRHQGEVLPRSLIASQVW DMNFDSDTNAIDVAVKRLRGKIDNDFEPKLIQTVRGVGYMLEVPDGQ >gi|296493089|gb|ADTK01000412.1| GENE 4 4514 - 5893 478 459 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 12 459 11 457 460 188 28 1e-47 MSPCKLLPFCVALALTGCSLAPDYQRPAMPVPQQFSLSQNGLVNAADNYQNAGWRTFFVD NQVKTLISEALVNNRDLRMATLKVQEARAQYRLTDADRYPQLNGEGSGSWSGNLKGNTAT TREFSTGLNASFDLDFFGRLKNMSEAERQNYLATEEAQRAVHILLVSNVAQSYFNQQLAY AQLQIAEETLRNYQQSYAFVEKQLLTGSSNVLALEQARGVIESTRSDIAKRQGELAQANN ALQLLLGSYGKLPQAQTVNSDSLQSVKLPAGLSSQILLQRPDIMEAEHALMAANANIGAA RAAFFPSISLTSGISTASSDLSSLFNASSGMWNFIPKIEIPIFNAGRNQANLDIAEIRQQ QSVVNYEQKIQNAFKEVADALALRQSLNDQISAQQRYLASLQITLQRARALYQHGAVSYL EVLDAERSLFATRQTLLDLNYARQVNEISLYTALGGGWQ >gi|296493089|gb|ADTK01000412.1| GENE 5 5920 - 6252 494 110 aa, chain + ## HITS:1 COG:ylcC KEGG:ns NR:ns ## COG: ylcC COG5569 # Protein_GI_number: 16128556 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 110 1 110 110 195 98.0 
2e-50 MKKALQVAMFSLFTVIGFNAQANEHHHETMSEVQPQVISATGVGKGIDLESKKITIHHDP IAAVNWPEMTMRFTITPQTKMSEIKTGDKVAFNFVQQGNLSLLQDIKVSQ >gi|296493089|gb|ADTK01000412.1| GENE 6 6268 - 7491 1181 407 aa, chain + ## HITS:1 COG:ECs0612 KEGG:ns NR:ns ## COG: ECs0612 COG0845 # Protein_GI_number: 15829866 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli O157:H7 # 1 407 1 407 407 771 99.0 0 MKKIALIIGSMIAGGIISAAGFTWFAKAEPPAEKTSTAERKILFWYDPMYPNTRFDKPGK SPFMDMDLVPKYADEESSASGVRIDPTQTQNLGVKTATVTRGPLTFAQSFPANVSYNEYQ YAIVQARAAGFIDKVYPLTVGDKVQKGTPLLDLTIPDWVEAQSEYLLLRETGGTATQTEG ILERLRLAGMPEADIRRLIATQKIQTRFTLKAPIDGVITAFDLRAGMNIAKDNVVAKIQG MDPVWVTAAIPESIAWLVKDASQFTLTVPARPDKTLTIRKWTLLPGVDAATRTLQLRLEV DNADEALKPGMNAWLQLNTASEPMLLIPSQALIDTGNEQRVITVDADGRFVPKRVAVFQA SQGVTALHSGLAEGEKVVSSGLFLIDSEANISGALERMRSESATHAH >gi|296493089|gb|ADTK01000412.1| GENE 7 7503 - 10646 3345 1047 aa, chain + ## HITS:1 COG:ybdE KEGG:ns NR:ns ## COG: ybdE COG3696 # Protein_GI_number: 16128558 # Func_class: P Inorganic ion transport and metabolism # Function: Putative silver efflux pump # Organism: Escherichia coli K12 # 1 1047 1 1047 1047 2032 100.0 0 MIEWIIRRSVANRFLVLMGALFLSIWGTWTIINTPVDALPDLSDVQVIIKTSYPGQAPQI VENQVTYPLTTTMLSVPGAKTVRGFSQFGDSYVYVIFEDGTDPYWARSRVLEYLNQVQGK LPAGVSAELGPDATGVGWIYEYALVDRSGKHDLADLRSLQDWFLKYELKTIPDVAEVASV GGVVKEYQVVIDPQRLAQYGISLAEVKSALDASNQEAGGSSIELAEAEYMVRASGYLQTL DDFNHIVLKASENGVPVYLRDVAKVQIGPEMRRGIAELNGEGEVAGGVVILRSGKNAREV IAAVKDKLETLKSSLPEGVEIVTTYDRSQLIDRAIDNLSGKLLEEFIVVAVVCALFLWHV RSALVAIISLPLGLCIAFIVMHFQGLNANIMSLGGIAIAVGAMVDAAIVMIENAHKRLEE WQHQHPDATLDNKTRWQVITDASVEVGPALFISLLIITLSFIPIFTLEGQEGRLFGPLAF TKTYAMAGAALLAIVVIPILMGYWIRGKIPPESSNPLNRFLIRVYHPLLLKVLHWPKTTL LVAALSVLTVLWPLNKVGGEFLPQINEGDLLYMPSTLPGISAAEAASMLQKTDKLIMSVP EVARVFGKTGKAETATDSAPLEMVETTIQLKPQEQWRPGMTMDKIIEELDNTVRLPGLAN LWVPPIRNRIDMLSTGIKSPIGIKVSGTVLADIDAMAEQIEEVARTVPGVASALAERLEG GRYINVEINREKAARYGMTVADVQLFVTSAVGGAMVGETVEGIARYPINLRYPQSWRDSP 
QALRQLPILTPMKQQITLADVADIKVSTGPSMLKTENARPTSWIYIDARDRDMVSVVHDL QKAIAEKVQLKPGTSVAFSGQFELLERANHKLKLMVPMTLMIIFVLLYLAFRRVGEALLI ISSVPFALVGGIWLLWWMGFHLSVATGTGFIALAGVAAEFGVVMLMYLRHAIEAVPSLNN PQTFSEQKLDEALYHGAVLRVRPKAMTVAVIIAGLLPILWGTGAGSEVMSRIAAPMIGGM ITAPLLSLFIIPAAYKLMWLHRHRVRK >gi|296493089|gb|ADTK01000412.1| GENE 8 10748 - 12124 1589 458 aa, chain + ## HITS:1 COG:pheP KEGG:ns NR:ns ## COG: pheP COG1113 # Protein_GI_number: 16128559 # Func_class: E Amino acid transport and metabolism # Function: Gamma-aminobutyrate permease and related permeases # Organism: Escherichia coli K12 # 1 458 1 458 458 837 99.0 0 MKNASTVSEDTASNQEPTLHRGLHNRHIQLIALGGAIGTGLFLGIGPAIQMAGPAVLLGY GVAGIIAFLIMRQLGEMVVEEPVSGSFAHFAYKYWGPFAGFLSGWNYWVMFVLVGMAELT AAGIYMQYWFPDVPTWIWTAAFFIIINAVNLVNVRLYGETEFWFALIKVLAIIGMIGFGL WLLFSGHGGEKASIDNLWRYGGFFATGWNGLILSLAVIMFSFGGLELIGITAAEARDPEK SIPKAVNQVVYRILLFYIGSLVVLLALYPWVEVKSNSSPFVMIFHNLDSNVVASALNFVI LVASLSVYNSGVYSNSRMLFGLSVQGNAPKFLTRVSRRGVPINSLMLSGAITSLVVLINY LLPQKAFGLLMALVVATLLLNWIMICLAHLRFRAAMRRQGRETQFKALLYPFGNYLCIAF LGMILLLMCTMDDMRLSAILLPVWIVFLFVAFKTLRRK Prediction of potential genes in microbial genomes Time: Mon May 16 16:25:05 2011 Seq name: gi|296493088|gb|ADTK01000413.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1430.5, whole genome shotgun sequence Length of sequence - 1373 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 26 - 1273 1288 ## COG0668 Small-conductance mechanosensitive channel - Prom 1294 - 1353 4.2 Predicted protein(s) >gi|296493088|gb|ADTK01000413.1| GENE 1 26 - 1273 1288 415 aa, chain - ## HITS:1 COG:ybdG KEGG:ns NR:ns ## COG: ybdG COG0668 # Protein_GI_number: 16128560 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Escherichia coli K12 # 1 415 1 415 415 800 99.0 0 MQDLISQVEDLAGIEIDHTTSMVMIFGIIFLTAVVVHIILHWVVLRTFEKRAIASSRLWL QIITQNKLFHRLAFTLQGIIVNIQAVFWLQKGTEAADILTTCAQLWIMMYALLSVFSLLD VILNLAQKFPAASQLPLKGIFQGIKLIGAILVGILMISLLIGQSPAILISGLGAMAAVLM LVFKDPILGLVAGIQLSANDMLKLGDWLEMPKYGADGAVIDIGLTTVKVRNWDNTITTIP TWSLVSDSFKNWSGMSASGGRRIKRSISIDVTSIRFLDEDEMQRLNKAHLLKPYLTSRHQ EINEWNRQQGSTESILNLRRMTNIGTFRAYLNEYLRNHPRIRKDMTLMVRQLAPGDNGLP LEIYAFTNTVVWLEYESIQADIFDHIFAIVEEFGLRLHQSPTGNDIRSLAGAFKQ Prediction of potential genes in microbial genomes Time: Mon May 16 16:25:17 2011 Seq name: gi|296493087|gb|ADTK01000414.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1430.6, whole genome shotgun sequence Length of sequence - 38878 bp Number of predicted genes - 36, with homology - 36 Number of transcription units - 22, operones - 7 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.636 - CDS 13 - 666 797 ## COG0778 Nitroreductase - Prom 695 - 754 3.5 2 1 Op 2 . - CDS 760 - 1128 304 ## COG2315 Uncharacterized protein conserved in bacteria - Term 1150 - 1184 -1.0 3 1 Op 3 . - CDS 1193 - 1441 292 ## B21_00530 hypothetical protein 4 1 Op 4 . - CDS 1507 - 2625 918 ## COG2170 Uncharacterized conserved protein - Prom 2723 - 2782 2.5 + Prom 2872 - 2931 4.0 5 2 Tu 1 . + CDS 3078 - 3230 155 ## EcHS_A0629 Hok/Gef family protein + Term 3231 - 3275 3.2 6 3 Tu 1 . - CDS 3352 - 3981 413 ## COG2977 Phosphopantetheinyl transferase component of siderophore synthetase - Prom 4029 - 4088 6.4 - Term 3998 - 4034 2.4 7 4 Tu 1 . 
- CDS 4147 - 6387 2640 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins - Prom 6563 - 6622 4.5 + Prom 6524 - 6583 4.8 8 5 Op 1 2/0.818 + CDS 6630 - 7832 815 ## COG2382 Enterochelin esterase and related enzymes 9 5 Op 2 2/0.818 + CDS 7835 - 8053 261 ## COG3251 Uncharacterized protein conserved in bacteria 10 5 Op 3 4/0.636 + CDS 8050 - 11931 3745 ## COG1020 Non-ribosomal peptide synthetase modules and related proteins + Prom 11933 - 11992 8.5 11 6 Tu 1 . + CDS 12147 - 13280 952 ## COG3765 Chain length determinant protein 12 7 Op 1 7/0.000 - CDS 13277 - 14092 197 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 13 7 Op 2 8/0.000 - CDS 14089 - 15081 938 ## COG4779 ABC-type enterobactin transport system, permease component 14 7 Op 3 . - CDS 15078 - 16094 1293 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component - Prom 16147 - 16206 4.5 + Prom 16091 - 16150 3.9 15 8 Tu 1 . + CDS 16193 - 17443 1186 ## COG0477 Permeases of the major facilitator superfamily + Term 17688 - 17752 1.7 16 9 Tu 1 . - CDS 17447 - 18403 959 ## COG4592 ABC-type Fe2+-enterobactin transport system, periplasmic component - Prom 18423 - 18482 2.2 + Prom 18664 - 18723 4.9 17 10 Op 1 6/0.000 + CDS 18778 - 19953 1160 ## COG1169 Isochorismate synthase 18 10 Op 2 5/0.091 + CDS 19963 - 21573 1644 ## COG1021 Peptide arylation enzymes 19 10 Op 3 5/0.091 + CDS 21587 - 22444 1128 ## COG1535 Isochorismate hydrolase 20 10 Op 4 5/0.091 + CDS 22444 - 23190 926 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 21 10 Op 5 4/0.636 + CDS 23193 - 23606 371 ## COG2050 Uncharacterized protein, possibly involved in aromatic compounds catabolism + Term 23648 - 23708 0.3 + Prom 23690 - 23749 3.6 22 11 Op 1 9/0.000 + CDS 23787 - 25892 2527 ## COG1966 Carbon starvation protein, predicted membrane protein + Term 26011 - 26053 3.5 + Prom 25952 - 26011 3.9 23 11 Op 2 . 
+ CDS 26074 - 26271 250 ## COG2879 Uncharacterized small protein - Term 26156 - 26196 -0.4 24 12 Tu 1 . - CDS 26281 - 27369 1181 ## COG0371 Glycerol dehydrogenase and related enzymes - Prom 27415 - 27474 3.6 + Prom 27389 - 27448 5.2 25 13 Tu 1 . + CDS 27478 - 28638 902 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase + Term 28666 - 28712 5.7 26 14 Op 1 8/0.000 - CDS 28639 - 29256 526 ## COG1475 Predicted transcriptional regulators 27 14 Op 2 4/0.636 - CDS 29241 - 30461 737 ## COG3969 Predicted phosphoadenosine phosphosulfate sulfotransferase - Prom 30506 - 30565 6.4 28 15 Tu 1 . - CDS 30608 - 31510 399 ## COG0583 Transcriptional regulator - Prom 31578 - 31637 7.6 29 16 Tu 1 . - CDS 31719 - 32465 744 ## COG1651 Protein-disulfide isomerase + Prom 32713 - 32772 3.9 30 17 Op 1 11/0.000 + CDS 32836 - 33399 565 ## COG0450 Peroxiredoxin + Term 33463 - 33494 4.1 31 17 Op 2 . + CDS 33570 - 35135 411 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 + Term 35166 - 35201 5.5 - Term 35195 - 35236 7.1 32 18 Tu 1 . - CDS 35256 - 35684 507 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins - Prom 35720 - 35779 3.3 33 19 Tu 1 . + CDS 35580 - 35786 106 ## EcE24377A_0629 hypothetical protein + Prom 35821 - 35880 2.0 34 20 Tu 1 . + CDS 35905 - 37143 864 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases - Term 37310 - 37337 0.1 35 21 Tu 1 . - CDS 37374 - 37784 361 ## COG0782 Transcription elongation factor - Prom 37871 - 37930 3.7 - Term 37831 - 37863 3.0 36 22 Tu 1 . 
- CDS 38014 - 38838 428 ## COG3719 Ribonuclease I Predicted protein(s) >gi|296493087|gb|ADTK01000414.1| GENE 1 13 - 666 797 217 aa, chain - ## HITS:1 COG:ECs0616 KEGG:ns NR:ns ## COG: ECs0616 COG0778 # Protein_GI_number: 15829870 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Escherichia coli O157:H7 # 1 217 1 217 217 419 99.0 1e-117 MDIISVALKRHSTKAFDASKKLTPEQAEQIKTLLQYSPSSTNSQPWHFIVASTEEGKARV AKSAADNYVFNERKILDASHVVVFCAKTAMDDAWLKLVVDQEDADGRFATPEAKAANDKG RKFFADMHRKDLHDDAEWMAKQVYLNVGNFLLGVAALGLDAVPIEGFDAAILDAEFGLKE KGYTSLVVVPVGHHSVEDFNATLPKSRLPQNITLTEV >gi|296493087|gb|ADTK01000414.1| GENE 2 760 - 1128 304 122 aa, chain - ## HITS:1 COG:ECs0617 KEGG:ns NR:ns ## COG: ECs0617 COG2315 # Protein_GI_number: 15829871 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 122 1 122 122 248 100.0 2e-66 MDKQSLHETAKRLALELPFVELCWPFGPEFDVFKIGGKIFMLSSELRGVPFINLKSDPQK SLLNQQIYPSIKPGYHMNKKHWISVYPGEEISEALLRDLINDSWNLVVDGLAKRDQKRVR PG >gi|296493087|gb|ADTK01000414.1| GENE 3 1193 - 1441 292 82 aa, chain - ## HITS:1 COG:no KEGG:B21_00530 NR:ns ## KEGG: B21_00530 # Name: ybdJ # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 82 1 82 82 103 100.0 1e-21 MKHPLETLTTAAGILLMAFLSCLLLPAPALGLTLAQKLVSTFHLMDLSQLYTLLFCLWFL VLGAIEYFVLRFIWRRWFSLAD >gi|296493087|gb|ADTK01000414.1| GENE 4 1507 - 2625 918 372 aa, chain - ## HITS:1 COG:ybdK KEGG:ns NR:ns ## COG: ybdK COG2170 # Protein_GI_number: 16128564 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 372 1 372 372 789 100.0 0 MPLPDFHVSEPFTLGIELEMQVVNPPGYDLSQDSSMLIDAVKNKITAGEVKHDITESMLE LATDVCRDINQAAGQFSAMQKVVLQAATDHHLEICGGGTHPFQKWQRQEVCDNERYQRTL ENFGYLIQQATVFGQHVHVGCASGDDAIYLLHGLSRFVPHFIALSAASPYMQGTDTRFAS SRPNIFSAFPDNGPMPWVSNWQQFEALFRCLSYTTMIDSIKDLHWDIRPSPHFGTVEVRV MDTPLTLSHAVNMAGLIQATAHWLLTERPFKHQEKDYLLYKFNRFQACRYGLEGVITDPH 
TGDRRPLTEDTLRLLEKIAPSAHKIGASSAIEALHRQVVSGLNEAQLMRDFVADGGSLIG LVKKHCEIWAGD >gi|296493087|gb|ADTK01000414.1| GENE 5 3078 - 3230 155 50 aa, chain + ## HITS:1 COG:no KEGG:EcHS_A0629 NR:ns ## KEGG: EcHS_A0629 # Name: not_defined # Def: Hok/Gef family protein # Organism: E.coli_HS # Pathway: not_defined # 1 50 34 83 83 92 100.0 5e-18 MLTKYALAAVIVLCLTVLGFTLLVGDSLCEFTVKERNIEFKAVLAYEPKK >gi|296493087|gb|ADTK01000414.1| GENE 6 3352 - 3981 413 209 aa, chain - ## HITS:1 COG:entD KEGG:ns NR:ns ## COG: entD COG2977 # Protein_GI_number: 16128566 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Phosphopantetheinyl transferase component of siderophore synthetase # Organism: Escherichia coli K12 # 1 209 1 209 209 428 100.0 1e-120 MVDMKTTHTSLPFAGHTLHFVEFDPANFCEQDLLWLPHYAQLQHAGRKRKTEHLAGRIAA VYALREYGYKCVPAIGELRQPVWPAEVYGSISHCGTTALAVVSRQPIGIDIEEIFSVQTA RELTDNIITPAEHERLADCGLAFSLALTLAFSAKESAFKASEIQTDAGFLDYQIISWNKQ QVIIHRENEMFAVHWQIKEKIVITLCQHD >gi|296493087|gb|ADTK01000414.1| GENE 7 4147 - 6387 2640 746 aa, chain - ## HITS:1 COG:fepA KEGG:ns NR:ns ## COG: fepA COG4771 # Protein_GI_number: 16128567 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Escherichia coli K12 # 1 746 1 746 746 1442 99.0 0 MNKKIHSLALLVNLGIYGVAQAQEPTDTPVSHDDTIVVTAAEQNLQAPGVSTITADEIRK NPVARDVSEIIRTMPGVNLTGNSTSGQRGNNRQIDIRGMGPENTLILIDGKPVSSRNSVR QGWRGERDTRGDTSWVPPEMIERIEVLRGPAAARYGNGAAGGVVNIITKKGSGEWHGSWD AYFNAPEHKEEGATKRTNFSLTGPLGDEFSFRLYGNLDKTQADAWDINQGHQSARAGTYA TTLPAGREGVINKDINGVVRWDFAPLQSLELEAGYSRQGNLYAGDTQNTNSDAYTRSKYG DETNRLYRQNYSLTWNGGWDNGVTTSNWVQYEHTRNSRIPEGLAGGTEGKFNEKATQDFV DIDLDDVMLHSEVNLPLDFLVNQTLTLGTEWNQQRMKDLSSNTQALTGTNTGGAIDGVSA TDRSPYSKAEIFSLFAENNMELTDSTIVTPGLRFDHHSIVGNNWSPALNISQGLGDDFTL KMGIARAYKAPSLYQTNPNYILYSKGQGCYASAGGCYLQGNDDLKAETSINKEIGLEFKR DGWLAGVTWFRNDYRNKIEAGYVAVGQNAVGTDLYQWDNVPKAVVEGLEGSLNVPVSETV MWTNNITYMLKSENKTTGDRLSIIPEYTLNSTLSWQAREDLSMQTTFTWYGKQQPKKYNY 
KGQPAVGPETKEISPYSIVGLSATWDVTKNVSLTGGVDNLFDKRLWRAGNAQTTGDLAGA NYIAGAGAYTYNEPGRTWYMSVNTHF >gi|296493087|gb|ADTK01000414.1| GENE 8 6630 - 7832 815 400 aa, chain + ## HITS:1 COG:ECs0624 KEGG:ns NR:ns ## COG: ECs0624 COG2382 # Protein_GI_number: 15829878 # Func_class: P Inorganic ion transport and metabolism # Function: Enterochelin esterase and related enzymes # Organism: Escherichia coli O157:H7 # 27 400 1 374 374 719 98.0 0 MTALKVGSESWWQSKHGPEWQRLNDEMFEVTFWWRDPQGSEEYSPIKRVWIYITGVTDHH QNSQPQSMQRIAGTDVWQWTTQLNANWRGSYCFIPTERDDIFSAPSPDRLELREGWRKLL PQAIADPLNPQSWKGGRGHAVSALEMPQAPLQPGWDCPQAPEIAAKEIIWKSERLKNSRR VWIFTTGDVTAEERPLAVLLDGEFWAQSMPVWPVLTSLTHRQQLPPAVYVLIDAIDTTHR AHELPCNADFWLAVQQELLPLVKAIAPFSDRGDRTVVAGQSFGGLSALYAGLHWPERFGC VLSQSGSYWWPHRGGQQEGVLLEKLKAGEVSAEGLRIVLEAGIREPMIMRANQALYAQLH PIKESIFWRQVDGGHDALCWRGGLMQGLIDLWQPLFHDRS >gi|296493087|gb|ADTK01000414.1| GENE 9 7835 - 8053 261 72 aa, chain + ## HITS:1 COG:Z0726 KEGG:ns NR:ns ## COG: Z0726 COG3251 # Protein_GI_number: 15800300 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 EDL933 # 1 72 1 72 72 116 94.0 1e-26 MAFSNPFDDPQGAFYILRNAQGQFSLWPQQCALPAGWDIVCQPQSQASCQQWLEAHWRTL TPTNFTQLQEAQ >gi|296493087|gb|ADTK01000414.1| GENE 10 8050 - 11931 3745 1293 aa, chain + ## HITS:1 COG:entF_1 KEGG:ns NR:ns ## COG: entF_1 COG1020 # Protein_GI_number: 16128569 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Non-ribosomal peptide synthetase modules and related proteins # Organism: Escherichia coli K12 # 1 1060 1 1060 1060 1991 99.0 0 MSQHLPLVAAQPGIWMAEKLSELPSAWSVAHYVELTGEVDSPLLARAVVAGLAQADTLRM RFTEDNGEVWQWVDDALTFELPEIIDLRTNIDPHGTAQALMQADLQQDLRVDSGKPLVFH QLIQVADNRWYWYQRYHHLLVDGFSFPAITRQIANIYCTWLRGEPTPASPFTPFADVVEE YQQYRESEAWQRDAAFWAEQRRQLPPPASLSPAPLPGRSASADILRLKLEFTDGEFRQLA MQLSGVQRTDLALALAALWLGRLCNRMDYAAGFIFMRRLGSAALTATGPVLNVLPLGIHI AAQETLPELATRLAAQLKKMRRHQRYDAEQIVRDSGRAAGDEPLFGPVLNIKVFDYQLDI 
PDVQAQTHTLATGPVNDLELALFPDVHGDLSIEILANKQRYDEPTLIQHAERLKMLIAQF AADPALLCGDVDIMLPGEYAQLAQINATQVEIPETTLSALVAEQAAKTPDAPALADARYL FSYREMREQVVALANLLRERGVKPGDSVAVALPRSVFLTLALHAIVEAGAAWLPLDTGYP DDRLKMMLEDARPSLLITTDDQLPRFSDVPNLTSLCYNAPLTPQGSAPLQLSQPHHTAYI IFTSGSTGRPKGVMVGQTAIVNRLLWMQNHYPLTGEDVVAQKTPCSFDVSVWEFFWPFIA GAKLVMAEPEAHRDPLAMQQFFAEYGVTTTHFVPSMLAAFVASLTPQTARQNCATLKQVF CSGEALPADLCREWQQLTGAPLHNLYGPTEAAVDVSWYPAFGEELAQVRGSSVPIGYPVW NTGLRILDAMMHPVPPGVAGDLYLTGIQLAQGYLGRPDLTASRFIADPFAPGERMYRTGD VARWLDNGAVEYLGRSDDQLKIRGQRIELGEIDRVMQALPDVEQAVTHACVINQAAATGG DARQLVGYLVSQSGLPLDTSALQAQLRETLPPHMVPVVLLQLPQLPLSANGKLDRKALPL PELKTQASGRAPKAGSETIIAAAFASLLGCDVQDADADFFALGGHSLLAMKLAAQLSRQF ARQVTPGQVMVASTVAKLATIIDGEEDSSRRMGFETILPLREGNGPTLFCFHPASGFAWQ FSVLSRYLDPQWSIIGIQSPRPHGPMQTATNLDEVCEAHLATLLEQQPHGPYYLLGYSLG GTLAQGIAARLRARGEQVAFLGLLDTWPPETQNWQEKEANGLDPEVLAEINREREAFLAA QQGSTSTELFTTIEGNYADAVRLLTTAHSVPFDGKATLFVAERTLQEGMSPERAWSPWIA ELDIYRQDCAHVDIISPGAFVKIGPIIRATLNR >gi|296493087|gb|ADTK01000414.1| GENE 11 12147 - 13280 952 377 aa, chain + ## HITS:1 COG:fepE KEGG:ns NR:ns ## COG: fepE COG3765 # Protein_GI_number: 16128570 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Chain length determinant protein # Organism: Escherichia coli K12 # 1 377 1 377 377 717 98.0 0 MSSLNIKQGSDAHFPEYPLASPSNNEIDLLNLIAVLWRAKKTVMAVIFAFACAGLLISFI LPQKWTSAAVVTPPEPVQWQELEKTFTKLRVLDLDIKIDRTEAFNLFIKKFQSVSLLEEY LRSSPYVMDQLKEAKIDELDLHRAIVALSEKMKAVDDNASKKKDEPSLYTSWTLSFTAPT SEEAQTVLSGYIDYISTLVVKESLENVRNKLEIKTQFEKEKLAQDRIKTKNQLDANIQRL NYSLDIANAAGIKRPVYSNGQAVKDDPDFSISLGADGIERKLEIEKAVTDVAELNGELRN RQYLVEQLTKAHVNDVNFTPFKYQLSPSLPVKKDGPGKAIIVILSALIGGMVACGGVLLR YAMASRKQDAMMADHLV >gi|296493087|gb|ADTK01000414.1| GENE 12 13277 - 14092 197 271 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 8 232 9 228 309 80 25 1e-14 MTESVARLRGEQLTLGYGKYTVAENLTVEIPDGHFTAIIGPNGCGKSTLLRTLSRLMTPA HGHVWLDGEHIQHYASKEVARRIGLLAQNATTPGDITVQELVARGRYPHQPLFTRWRKED 
EEAVTKAMQATGITHLADQSVDTLSGGQRQRAWIAMVLAQETAIMLLDEPTTWLDISHQI DLLELLSELNREKGYTLAAVLHDLNQACRYASHLIALREGKIVAQGAPKEIVTAELIERI YGLRCMIIDDPVAGTPLVVPLGRTAPSTANS >gi|296493087|gb|ADTK01000414.1| GENE 13 14089 - 15081 938 330 aa, chain - ## HITS:1 COG:fepG KEGG:ns NR:ns ## COG: fepG COG4779 # Protein_GI_number: 16128572 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type enterobactin transport system, permease component # Organism: Escherichia coli K12 # 1 330 1 330 330 464 98.0 1e-130 MIYVSRRLIITCLLLVSACVVAGIWGLRSGAVTLETSQVFAALMGDAPRSMTMVVTEWRL PRVLMALLIGAALGVSGAIFQSLMRNPLGSPDVMGFNTGAWSGVLVAMVLFGQDLTAIAL AAMVGGIITSLLVWLLAWRNGIDTFRLIIIGIGVRAMLVAFNTWLLLKASLETALTAGLW NAGSLNGLTWAKTSPSAPIIILMLIAAALLVRRMRLLEMGDDTACALGVSVERSRLLMML VAVVLTAAATALAGPISFIALVAPHIARRISGTARWRLTQAALCGALLLLAADLCAQQLF MPYQLPVGVVTVSLGGIYLIVLLIQESRKK >gi|296493087|gb|ADTK01000414.1| GENE 14 15078 - 16094 1293 338 aa, chain - ## HITS:1 COG:ECs0629 KEGG:ns NR:ns ## COG: ECs0629 COG0609 # Protein_GI_number: 15829883 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Escherichia coli O157:H7 # 5 338 1 334 334 448 99.0 1e-126 MEVRMSGSVAVTRAIAVPGLLLLLIIATALSLLIGAKSLPASVVLEALSGTCQSADCTIV LDARLPRTLAGLLAGGALGLAGALMQTLTRNPLADPGLLGVNAGASFAIVLGAALFGYSS AQEQLAMAFAGALVASLIVAFTGSQGGGQLSPVRLTLAGVALAAVLEGVTSGIALLNPDV YDQLRFWQAGSLDIRNLHTLKVVLIPVLIAGATALLLSRALNSLSLGSDTATALGSRVAR TQLIGLLAITVLCGSATAVVGPIAFIGLMMPHMARWLVGADHRWSLPVTLLATPALLLFA DIIGRVIVPGELRVSVVSAFIGAPVLIFLVRRKTRGGA >gi|296493087|gb|ADTK01000414.1| GENE 15 16193 - 17443 1186 416 aa, chain + ## HITS:1 COG:ybdA KEGG:ns NR:ns ## COG: ybdA COG0477 # Protein_GI_number: 16128574 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 416 1 416 416 574 
99.0 1e-163 MNKQSWLLNLSLLKTHPAFRAVFLARFISIVSLGLLGVAVPVQIQMMTHSTWQVGLSVTL TGGAMFVGLMVGGVLADRYERKKVILLARGTCGIGFIGLCLNALLPEPSLLAIYLLGLWD GFFASLGVTALLAATPALVGRENLMQAGAITMLTVRLGSVISPMIGGLLLATGGVAWNYG LAAAGTFITLLPLLSLPALPPPPQPREHPLKSLLAGFRFLLASPLVGGIALLGGLLTMAS AVRVLYPALADNWQMSAAQIGFLYAAIPLGAAIGALTSGKLAHSARPGLLMLLSTLGSFL AIGLFGLMPMWILGVVCLALFGWLSAVSSLLQYTMLQTQTPEAMLGRINGLWTAQNVTGD AIGAALLGGLGAMMTPVASASASGFGLLIIGVLLLLVLVELRRFRQTPPQVTASDS >gi|296493087|gb|ADTK01000414.1| GENE 16 17447 - 18403 959 318 aa, chain - ## HITS:1 COG:ECs0631 KEGG:ns NR:ns ## COG: ECs0631 COG4592 # Protein_GI_number: 15829885 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe2+-enterobactin transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 1 318 1 318 318 570 98.0 1e-162 MRLAPLYRNALLLTGLLLSGIAAVQAADWPRQITDSRGTHTLESQPQRIVSTSVTLTGSL LAIDAPVIASGATTPNNRVADDQGFLRQWSKVAKERKLQRLYIGEPSAEAVAAQMPDLIL ISATGGDSALALYDQLSTIAPTLIINYDDKSWQSLLTQLGEITGHEKQAAERIAQFDKQL AAAKEQIKLPPQPVTAIVYTAAAHSANLWTPESAQGQMLEQLGFTLAKLPAGLNASQSQG KRHDIIQLGGENLAAGLNGESLFLFAGDQKDADAIYANPLLAHLPAVQNKQVYALGTETF RLDYYSATQVLERLKALF >gi|296493087|gb|ADTK01000414.1| GENE 17 18778 - 19953 1160 391 aa, chain + ## HITS:1 COG:ECs0632 KEGG:ns NR:ns ## COG: ECs0632 COG1169 # Protein_GI_number: 15829886 # Func_class: H Coenzyme transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Isochorismate synthase # Organism: Escherichia coli O157:H7 # 1 391 1 391 391 756 99.0 0 MDTSLAEEVQQTMATLAPNRFFFMSPYRSFTTSGCFARFDEPAVNGDSPDSPFQQKLAAL FADAKAQGIKNPVMVGAIPFDPRQPSSLYIPESWQSFSRQEKQTSARRFTRSQSLNVVER QAIPEQTTFEQMVARAAALTATPQVDKVVLSRLIDITTDAAIDSGVLLERLIAQNPVSYN FHVPLADGGVLLGASPELLLRKDGERFSSIPLAGSARRQPDEVLDREAGNRLLASEKDRH EHELVTQAMKEVLRERSSELHVPSSPQLITTPTLWHLATPFEGKANSQENALTLACLLHP TPALSGFPHQAATQVIAELEPFDRELFGGIVGWCDSEGNGEWVVTIRCAKLRENQVRLFA GAGIVPASSPLGEWRETGVKLSTMLNVFGLH >gi|296493087|gb|ADTK01000414.1| GENE 18 19963 - 21573 1644 536 aa, chain + ## HITS:1 
COG:entE KEGG:ns NR:ns ## COG: entE COG1021 # Protein_GI_number: 16128577 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Peptide arylation enzymes # Organism: Escherichia coli K12 # 1 536 1 536 536 1089 99.0 0 MSIPFTRWPEEFARRYREKGYWQDLPLTDILTRHAASDSIAVIDGERQLSYRELNQAADN LACSLRRQGIKPGETALVQLGNVAELYITFFALLKLGVAPVLALFSHQRSELNAYASQIE PALLIADRQHALFSGDDFLNTFVTEHSSIRVVQLHNDSGEHNLQDAINHPAEDFTATPSP ADEVAYFQLSGGTTGTPKLIPRTHNDYYYSVRRSVEICQFTQQTRYLCAIPAAHNYAMSS PGSLGVFLAGGTVVLAADPSATLCFPLIEKHQVNVTALVPPAVSLWLQALTEGESRAQLA SLKLLQVGGARLSATLAARIPAEIGCQLQQVFGMAEGLVNYTRLDDSAEKIIHTQGYPMC PDDEVWVADAEGNPLPQGEVGRLMTRGPYTFRGYYKSPQHNASAFDANGFYCSGDLISID PEGYITVQGREKDQINRGGEKIAAEEIENLLLRHPAVIYAALVSMEDELMGEKSCAYLVV KEPLRAVQVRRFLREQGIAEFKLPDRVECVDSLPLTAVGKVDKKQLRQWLASRASA >gi|296493087|gb|ADTK01000414.1| GENE 19 21587 - 22444 1128 285 aa, chain + ## HITS:1 COG:entB_1 KEGG:ns NR:ns ## COG: entB_1 COG1535 # Protein_GI_number: 16128578 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Isochorismate hydrolase # Organism: Escherichia coli K12 # 1 215 1 215 215 437 99.0 1e-122 MAIPKLQAYALPESHDIPQNKVDWAFEPQRAALLIHDMQDYFVSFWGENCPMMEQVIANI AALRDYCKQHNIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVDRLTPDADDTV LVKWRYSAFHRSPLEQMLKESGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD FSRDEHLMSLKYVAGRSGRVVMTEELLPAPVPASKAALREVILPLLDESDEPFDDDNLID YGLDSVRMMALAARWRKVHGDIDFVMLAKNPTIDAWWKLLSREVK >gi|296493087|gb|ADTK01000414.1| GENE 20 22444 - 23190 926 248 aa, chain + ## HITS:1 COG:ECs0635 KEGG:ns NR:ns ## COG: ECs0635 COG1028 # Protein_GI_number: 15829889 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Escherichia coli O157:H7 # 1 248 1 248 248 450 98.0 1e-126 MDFSGKNVWVTGAGKGIGYATALAFVEAGAKVTGFDQAFTQEQYPFATEVMDVADAGQVA 
QVCQRLLAETDRLDVLINAAGILRMGATDQLSKEDWQQTFAVNVGGAFNLFQQTMNQFRR QRGGAIVTVASDAAHTPRIGMSAYGASKAALKSLALSVGLELAGSGVRCNVVSPGSTDTD MQRTLWVSDDAEEQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDLASHITLQDIVV DGGSTLGA >gi|296493087|gb|ADTK01000414.1| GENE 21 23193 - 23606 371 137 aa, chain + ## HITS:1 COG:ybdB KEGG:ns NR:ns ## COG: ybdB COG2050 # Protein_GI_number: 16128580 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein, possibly involved in aromatic compounds catabolism # Organism: Escherichia coli K12 # 1 137 1 137 137 275 100.0 1e-74 MIWKRHLTLDELNATSDNTMVAHLGIVYTRLGDDVLEAEMPVDTRTHQPFGLLHGGASAA LAETLGSMAGFMMTRDGQCVVGTELNATHHRPVSEGKVRGVCQPLHLGRQNQSWEIVVFD EQGRRCCTCRLGTAVLG >gi|296493087|gb|ADTK01000414.1| GENE 22 23787 - 25892 2527 701 aa, chain + ## HITS:1 COG:cstA KEGG:ns NR:ns ## COG: cstA COG1966 # Protein_GI_number: 16128581 # Func_class: T Signal transduction mechanisms # Function: Carbon starvation protein, predicted membrane protein # Organism: Escherichia coli K12 # 1 701 1 701 701 1283 99.0 0 MNKSGKYLVWTVLSVMGAFALGYIALNRGEQINALWIVVASVCIYLIAYRFYGLYIAKNV LAVDPTRMTPAVRHNDGLDYVPTDKKVLFGHHFAAIAGAGPLVGPVLAAQMGYLPGMIWL LAGVVLAGAVQDFMVLFVSTRRDGRSLGELVKEEMGPTAGVIALVACFMIMVIILAVLAM IVVKALTHSPWGTYTVAFTIPLAIFMGIYLRYLRPGRIGEVSVIGLVFLIFAIISGGWVA ESPTWAPYFDFTGVQLTWMLVGYGFVAAVLPVWLLLAPRDYLSTFLKIGTIVGLAVGILI MRPTLTMPALTKFVDGTGPVWTGNLFPFLFITIACGAVSGFHALISSGTTPKMLANEGQA CFIGYGGMLMESFVAIMALVSACIIDPGVYFAMNSPMAVLAPAGTADVVASAAQVVSSWG FSITPDTLNQIASEVGEQSIISRAGGAPTLAVGMAYILHGALGGMMDVAFWYHFAILFEA LFILTAVDAGTRAARFMLQDLLGVVSPGLKRTDSLPANLLATALCVLAWGYFLHQGVVDP LGGINTLWPLFGIANQMLAGMALMLCAVVLFKMKRQRYAWVALVPTAWLLICTLTAGWQK AFSPDAKVGFLAIANKFQAMIDSGNIPSQYTESQLAQLVFNNRLDAGLTIFFMVVVVVLA LFSIKTALAALKDPKPTAKETPYEPMPENVEEIVAQAKGAH >gi|296493087|gb|ADTK01000414.1| GENE 23 26074 - 26271 250 65 aa, chain + ## HITS:1 COG:ybdD KEGG:ns NR:ns ## COG: ybdD COG2879 # Protein_GI_number: 16132239 # Func_class: S Function unknown # Function: Uncharacterized small 
protein # Organism: Escherichia coli K12 # 1 65 1 65 65 129 100.0 1e-30 MFDSLAKAGKYLGQAAKLMIGMPDYDNYVEHMRVNHPDQTPMTYEEFFRERQDARYGGKG GARCC >gi|296493087|gb|ADTK01000414.1| GENE 24 26281 - 27369 1181 362 aa, chain - ## HITS:1 COG:ybdH KEGG:ns NR:ns ## COG: ybdH COG0371 # Protein_GI_number: 16128582 # Func_class: C Energy production and conversion # Function: Glycerol dehydrogenase and related enzymes # Organism: Escherichia coli K12 # 1 362 1 362 362 693 99.0 0 MPHNPIRVVVGPANYFSHPGSFNHLHDFFTDEQLSRAVWIYGKRAIAAAQTKLPPAFGLP GAKHILFRGHCSESDVQQLAAESGDDRSVVIGVGGGALLDTAKALARRLGLPFVAAPTIA ATCAAWTPLSVWYNDAGQALHYEIFDDANFMVLVEPEIILNAPQQYLLAGIGDTLAKWYE AVVLAPQPETLPLTVRLGINNAQAIRDVLLNSSEQALADQQNQQLTQSFCDVVDAIIAGG GMVGGLGDRFTRVAAAHAVHNGLTVLPQTEKFLHGTKVAYGILVQSALLGQDDVLAQLTG AYQRFHLPTTLAELEVDINNQAEIDKVIAHTLRPVESIHYLPVTLTPDTLRAAFEKVESF KA >gi|296493087|gb|ADTK01000414.1| GENE 25 27478 - 28638 902 386 aa, chain + ## HITS:1 COG:ECs0639 KEGG:ns NR:ns ## COG: ECs0639 COG0436 # Protein_GI_number: 15829893 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Escherichia coli O157:H7 # 1 386 1 386 386 768 99.0 0 MTNNPLIPQSKLPQLGTTIFTQMSALAQQHQAINLSQGFPDFDGPRYLQERLAYHVAQGA NQYAPMTGVQALREAIAQKTERLYGYQPDADSDITVTAGATEALYAAITALVRNGDEVIC FDPSYDSYSPAIALSGGIVKRMALQPPHFRVDWQEFAALLSERTRLVILNTPHNPSATVW QQADFAALWQAIAGHEIFVISDEVYEHINFSQQGHASVLAHPQLRERAVAVSSFGKTYHM TGWKVGYCVAPAPISAEIRKVHQYLTFSVNTPAQLALADMLRAEPEHYLALPDFYRQKRD ILVNALNESRLEILPCEGTYFLLVDYSAVSTLDDVEFCQWLTQEHGVAAIPLSVFCADPF PHKLIRLCFAKKESTLLAAAERLRQL >gi|296493087|gb|ADTK01000414.1| GENE 26 28639 - 29256 526 205 aa, chain - ## HITS:1 COG:ECs0640 KEGG:ns NR:ns ## COG: ECs0640 COG1475 # Protein_GI_number: 15829894 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 205 5 209 209 409 100.0 1e-114 MQQRLTQDLTQFLASLPEDDRINAINEIRMAIHQVSPFREEPVDCVLWVKNSQLMPNDYN 
PNNVAPPEKKLLKKSIEIDGFTQPIVVTHTDKNALEIVDGFHRHEIGKGSSSLKLRLKGY LPVTCLEGTRNQRIAATIRHNRARGRHQITAMSEIVRELSQLGWDDNKIGKELGMDSDEV LRLKQINGLQELFADRQYSRAWTVK >gi|296493087|gb|ADTK01000414.1| GENE 27 29241 - 30461 737 406 aa, chain - ## HITS:1 COG:ECs0641 KEGG:ns NR:ns ## COG: ECs0641 COG3969 # Protein_GI_number: 15829895 # Func_class: R General function prediction only # Function: Predicted phosphoadenosine phosphosulfate sulfotransferase # Organism: Escherichia coli O157:H7 # 1 406 1 406 406 838 100.0 0 MSIYKIPLPLNILEAARERITWTLNTLPRVCVSFSGGKDSGLMLHLTAELARQMGKKICV LFIDWEAQFSCTINYVQSLRELYTDVIEEFYWVALPLTTQNSLSQYQPEWQCWEPDVEWV RQPPQDAITDPDFFCFYQPGMTFEQFVREFAEWFSQKRPAAMMIGIRADESYNRFVAIAS LNKQRFADDKPWTTAAPGGHSWYIYPIYDWKVADIWTWYANHQSLCNPLYNLMYQAGVPL RHMRICEPFGPEQRQGLWLYHVIEPDRWAAMCARVSGVKSGGIYAGHDNHFYGHRKILKP EHLDWQEYALLLLNSMPEKTAEHYRNKIAIYLHWYQKKGIEVPQTQQGDIGAKDIPSWRR ICKVLLNNDYWCRALSFSPTKAKNYQRYNERIKGKRQEWGILCNND >gi|296493087|gb|ADTK01000414.1| GENE 28 30608 - 31510 399 300 aa, chain - ## HITS:1 COG:ybdO KEGG:ns NR:ns ## COG: ybdO COG0583 # Protein_GI_number: 16128586 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 300 1 300 300 562 99.0 1e-160 MANLYDLKKFDLNLLVIFECIYQHLSISKAAESLYITPSAVSQSLQRLRAQFNDPLFIRS GKGIAPTTTGLNLHHHLEKNLKGLEQTINIVNKSELKKNFIIYGPQLISCSNNSMLIRCL RQDASVEIECHDILMSAENAEELLVHRKADLVITQMPVISRSVICMPLHTIRNTLICSNK HPRITDNSTYEQIMAEEFTQLISKSAGVDDIQMEIDERFMNRKISFRGSSLLTIINSIAV TDLLGIVPYELYNSYRDFLNLKEIKLEHPLPSIKLYISYNKSSLNNLVFSRFIDRLNESF >gi|296493087|gb|ADTK01000414.1| GENE 29 31719 - 32465 744 248 aa, chain - ## HITS:1 COG:dsbG KEGG:ns NR:ns ## COG: dsbG COG1651 # Protein_GI_number: 16128587 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Protein-disulfide isomerase # Organism: Escherichia coli K12 # 12 248 32 268 268 474 100.0 1e-134 MLKKILLLALLPAIAFAEELPAPVKAIEKQGITIIKTFDAPGGMKGYLGKYQDMGVTIYL TPDGKHAISGYMYNEKGENLSNTLIEKEIYAPAGREMWQRMEQSHWLLDGKKDAPVIVYV 
FADPFCPYCKQFWQQARPWVDSGKVQLRTLLVGVIKPESPATAAAILASKDPAKTWQQYE ASGGKLKLNVPANVSTEQMKVLSDNEKLMDDLGANVTPAIYYMSKENTLQQAVGLPDQKT LNIIMGNK >gi|296493087|gb|ADTK01000414.1| GENE 30 32836 - 33399 565 187 aa, chain + ## HITS:1 COG:ECs0644 KEGG:ns NR:ns ## COG: ECs0644 COG0450 # Protein_GI_number: 15829898 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Escherichia coli O157:H7 # 1 187 1 187 187 381 100.0 1e-106 MSLINTKIKPFKNQAFKNGEFIEITEKDTEGRWSVFFFYPADFTFVCPTELGDVADHYEE LQKLGVDVYAVSTDTHFTHKAWHSSSETIAKIKYAMIGDPTGALTRNFDNMREDEGLADR ATFVVDPQGIIQAIEVTAEGIGRDASDLLRKIKAAQYVASHPGEVCPAKWKEGEATLAPS LDLVGKI >gi|296493087|gb|ADTK01000414.1| GENE 31 33570 - 35135 411 521 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 213 515 2 303 306 162 31 2e-39 MLDTNMKTQLKAYLEKLTKPVELIATLDDSAKSAEIKELLAEIAELSDKVTFKEDNSLPV RKPSFLITNPGSNQGPRFAGSPLGHEFTSLVLALLWTGGHPSKEAQSLLEQIRHIDGDFE FETYYSLSCHNCPDVVQALNLMSVLNPRIKHTAIDGGTFQNEITDRNVMGVPAVFVNGKE FGQGRMTLTEIVAKIDTGAEKRAAEELNKRDAYDVLIVGSGPAGAAAAIYSARKGIRTGL MGERFGGQILDTVDIENYISVPKTEGQKLAGALKVHVDEYDVDVIDSQSASKLIPAAVEG GLHQIETASGAVLKARSIIVATGAKWRNMNVPGEDQYRTKGVTYCPHCDGPLFKGKRVAV IGGGNSGVEAAIDLAGIVEHVTLLEFAPEMKADQVLQDKLRSLKNVDIILNAQTTEVKGD GSKVVGLEYRDRVSGDIHNIELAGIFVQIGLLPNTNWLEGAVERNRMGEIIIDAKCETNV KGVFAAGDCTTVPYKQIIIATGEGAKASLSAFDYLIRTKTA >gi|296493087|gb|ADTK01000414.1| GENE 32 35256 - 35684 507 142 aa, chain - ## HITS:1 COG:ECs0646 KEGG:ns NR:ns ## COG: ECs0646 COG0589 # Protein_GI_number: 15829900 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Escherichia coli O157:H7 # 1 142 1 142 142 252 100.0 1e-67 MYKTIIMPVDVFEMELSDKAVRHAEFLAQDDGVIHLLHVLPGSASLSLHRFAADVRRFEE HLQHEAEERLQTMVSHFTIDPSRIKQHVRFGSVRDEVNELAEELGADVVVIGSRNPSIST HLLGSNASSVIRHANLPVLVVR >gi|296493087|gb|ADTK01000414.1| GENE 33 35580 - 35786 106 68 aa, 
chain + ## HITS:1 COG:no KEGG:EcE24377A_0629 NR:ns ## KEGG: EcE24377A_0629 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 13 68 1 56 56 93 98.0 2e-18 MNNSVILGEEFSVANSFVAQFHFKYINWHNDCLIHNPFSLLIMNKSFAMIIIFIPDICLV LFPYERFL >gi|296493087|gb|ADTK01000414.1| GENE 34 35905 - 37143 864 412 aa, chain + ## HITS:1 COG:ECs0647 KEGG:ns NR:ns ## COG: ECs0647 COG1063 # Protein_GI_number: 15829901 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli O157:H7 # 1 412 1 412 412 833 99.0 0 MKALTYHGPHHVQVENVPDPGIEQADDIILRITATAICGSDLHLYRGKIPQVKHGDIFGH EFMGEVVETGKDVKNLQKGDRVVIPFVIACGDCFFCRMQQYAACENTNAGKGAALNKKQI PAPAALFGYSHLYGGVPGGQAEYVRVPKGNVGPFKVPPLLSDDKALFLSDILPTAWQAAK NAQIQQGSSVAVYGAGPVGLLTIACARLLGAEQIFVVDHHPYRLRFAADRYGAIPINFDE DSDPAQSIIEQTAGHRGVDAVIDAVGFEAKGSTTETVLTNLKLEGSSGKALRQCIAAVRR GGIVSVPGVYAGFIHGFLFGDAFDKGLTFKMGQTHVHAWLGELLPLIEKGLLKPEEIVTH YMPFEEAARGYEIFEKREEECRKVILVPGAQSAEAAQKAVSGLVNAMPGGTI >gi|296493087|gb|ADTK01000414.1| GENE 35 37374 - 37784 361 136 aa, chain - ## HITS:1 COG:ECs0649 KEGG:ns NR:ns ## COG: ECs0649 COG0782 # Protein_GI_number: 15829903 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Escherichia coli O157:H7 # 1 136 1 136 136 259 100.0 7e-70 MSRPTIIINDLDAERIDILLEQPAYAGLPIADALNAELDRAQMCSPEEMPHDVVTMNSRV KFRNLSDGEVRVRTLVYPAKMTDSNTQLSVMAPVGAALLGLRVGDSIHWELPGGVATHLE VLELEYQPEAAGDYLL >gi|296493087|gb|ADTK01000414.1| GENE 36 38014 - 38838 428 274 aa, chain - ## HITS:1 COG:rna KEGG:ns NR:ns ## COG: rna COG3719 # Protein_GI_number: 16128594 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonuclease I # Organism: Escherichia coli K12 # 7 274 1 268 268 549 99.0 1e-156 MSSTPIMKAFWRNAALLAVSLLPFSSANALALQAKQYGDFDRYVLALSWQTGFCQSQHDR NRNERDECRLQTETTNKADFLTVHGLWPGLPKSVAARGVDERRWMRFGCATRPIPNLPEA 
HASRMCSSPETGLSLETAAKLSEVMPGAGGRSCLERYEYAKHGACFGFDPDAYFGTMVRL NQEIKESEAGKFLADNYGKTVSRRDFDAAFAKSWGKENVKAVKLTCQGNPAYLTEIQISI KADAINAPLSANSFLPQPHPGNCGKTFVIDKVGY Prediction of potential genes in microbial genomes Time: Mon May 16 16:25:33 2011 Seq name: gi|296493086|gb|ADTK01000415.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1430.7, whole genome shotgun sequence Length of sequence - 26519 bp Number of predicted genes - 28, with homology - 28 Number of transcription units - 10, operones - 4 average op.length - 5.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 2 - 43 6.1 1 1 Op 1 5/0.400 - CDS 76 - 1539 1595 ## COG0471 Di- and tricarboxylate transporters 2 1 Op 2 5/0.400 - CDS 1590 - 2468 683 ## COG1767 Triphosphoribosyl-dephospho-CoA synthetase 3 1 Op 3 7/0.000 - CDS 2443 - 2994 392 ## COG3697 Phosphoribosyl-dephospho-CoA transferase (holo-ACP synthetase) 4 1 Op 4 6/0.000 - CDS 2998 - 4530 1648 ## COG3051 Citrate lyase, alpha subunit 5 1 Op 5 6/0.000 - CDS 4541 - 5449 1120 ## COG2301 Citrate lyase beta subunit 6 1 Op 6 5/0.400 - CDS 5446 - 5742 319 ## COG3052 Citrate lyase, gamma subunit 7 1 Op 7 . - CDS 5757 - 6815 843 ## COG3053 Citrate lyase synthetase - Prom 6972 - 7031 7.4 + Prom 7084 - 7143 5.3 8 2 Op 1 9/0.000 + CDS 7195 - 8853 1470 ## COG3290 Signal transduction histidine kinase regulating citrate/malate metabolism 9 2 Op 2 . + CDS 8822 - 9502 263 ## PROTEIN SUPPORTED gi|149011191|ref|ZP_01832496.1| 30S ribosomal protein S9 + Term 9508 - 9552 5.1 - Term 9500 - 9534 5.2 10 3 Tu 1 . - CDS 9543 - 10928 1382 ## COG3069 C4-dicarboxylate transporter - Prom 10952 - 11011 6.9 + Prom 11225 - 11284 8.9 11 4 Op 1 . + CDS 11517 - 12077 457 ## B21_00581 hypothetical protein + Term 12210 - 12238 -0.0 + Prom 12079 - 12138 6.5 12 4 Op 2 . + CDS 12252 - 12461 338 ## COG1278 Cold shock proteins + Term 12485 - 12515 1.0 - Term 12473 - 12501 0.6 13 5 Tu 1 . 
- CDS 12515 - 12898 192 ## COG0239 Integral membrane protein possibly involved in chromosome condensation - Prom 12957 - 13016 5.2 + Prom 12854 - 12913 2.7 14 6 Tu 1 . + CDS 12991 - 13779 759 ## COG0388 Predicted amidohydrolase + Prom 13802 - 13861 3.7 15 7 Tu 1 . + CDS 13902 - 14111 178 ## PROTEIN SUPPORTED gi|90022866|ref|YP_528693.1| ribosomal protein L25 + Term 14158 - 14187 2.1 - Term 14146 - 14175 2.1 16 8 Tu 1 . - CDS 14212 - 15177 948 ## COG0320 Lipoate synthase - Prom 15263 - 15322 4.7 17 9 Tu 1 . - CDS 15386 - 16339 650 ## COG0583 Transcriptional regulator - Prom 16506 - 16565 5.4 - Term 16551 - 16584 -0.3 18 10 Op 1 6/0.000 - CDS 16598 - 17239 499 ## COG0321 Lipoate-protein ligase B - Prom 17272 - 17331 3.9 19 10 Op 2 9/0.000 - CDS 17340 - 17603 340 ## COG2921 Uncharacterized conserved protein - Prom 17651 - 17710 2.3 - Term 17624 - 17655 4.8 20 10 Op 3 12/0.000 - CDS 17714 - 18925 1346 ## COG1686 D-alanyl-D-alanine carboxypeptidase - Prom 18963 - 19022 5.0 - Term 18951 - 18985 1.3 21 10 Op 4 8/0.000 - CDS 19065 - 20153 845 ## COG0797 Lipoproteins 22 10 Op 5 19/0.000 - CDS 20164 - 21276 1126 ## COG0772 Bacterial cell division membrane protein 23 10 Op 6 9/0.000 - CDS 21279 - 23180 2070 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 24 10 Op 7 14/0.000 - CDS 23211 - 23678 527 ## COG1576 Uncharacterized conserved protein 25 10 Op 8 6/0.000 - CDS 23682 - 23999 297 ## COG0799 Uncharacterized homolog of plant Iojap protein - Prom 24035 - 24094 4.7 26 10 Op 9 3/0.800 - CDS 24259 - 24870 433 ## COG0406 Fructose-2,6-bisphosphatase 27 10 Op 10 7/0.000 - CDS 24894 - 25535 508 ## COG1057 Nicotinic acid mononucleotide adenylyltransferase 28 10 Op 11 . 
- CDS 25537 - 26517 729 ## COG1466 DNA polymerase III, delta subunit Predicted protein(s) >gi|296493086|gb|ADTK01000415.1| GENE 1 76 - 1539 1595 487 aa, chain - ## HITS:1 COG:ECs0651 KEGG:ns NR:ns ## COG: ECs0651 COG0471 # Protein_GI_number: 15829905 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Escherichia coli O157:H7 # 1 487 1 487 487 867 100.0 0 MSLAKDNIWKLLAPLVVMGVMFLIPVPDGMPPQAWHYFAVFVAMIVGMILEPIPATAISF IAVTICVIGSNYLLFDAKELADPAFNAQKQALKWGLAGFSSTTVWLVFGAFIFALGYEVS GLGRRIALFLVKFMGKRTLTLGYAIVIIDILLAPFTPSNTARTGGTVFPVIKNLPPLFKS FPNDPSARRIGGYLMWMMVISTSLSSSMFVTGAAPNVLGLEFVSKIAGIQISWLQWFLCF LPVGVILLIIAPWLSYVLYKPEITHSEEVATWAGDELKTMGALTRREWTLIGLVLLSLGL WVFGSEVINATAVGLLAVSLMLALHVVPWKDITRYNSAWNTLVNLATLVVMANGLTRSGF IDWFAGTMSTHLEGFSPNATVIVLVLVFYFAHYLFASLSAHTATMLPVILAVGKGIPGVP MEQLCILLVLSIGIMGCLTPYATGPGVIIYGCGYVKSKDYWRLGAIFGVIYISMLLLVGW PILAMWN >gi|296493086|gb|ADTK01000415.1| GENE 2 1590 - 2468 683 292 aa, chain - ## HITS:1 COG:citG KEGG:ns NR:ns ## COG: citG COG1767 # Protein_GI_number: 16128596 # Func_class: H Coenzyme transport and metabolism # Function: Triphosphoribosyl-dephospho-CoA synthetase # Organism: Escherichia coli K12 # 1 292 1 292 292 554 99.0 1e-158 MSMPATSTKTTKLATSLIDEYALLCWRAMLTEVNLSPKPGLVDRINCGAHKDMALEDFRR SALAIQGWLPRFIEFGACSAEMAPEAVLHGLRPIGMACEGDMFRATAGVNTHKGSIFSLG LLCAAIGRLLQLNQPVTPTTVCSTAASFCRGLTDRELRTNNSQLTAGQRLYQQLGLTGAR GEAEAGYPLVINHALPHYLTLLDQGLDPELALLDTLLLLMAINGDTNVASRGGEGGLRWL QREAQTLLQKGGIRTPADLDYLRQFDRECIERNLSPGGSADLLILTWFLAQI >gi|296493086|gb|ADTK01000415.1| GENE 3 2443 - 2994 392 183 aa, chain - ## HITS:1 COG:ECs0653 KEGG:ns NR:ns ## COG: ECs0653 COG3697 # Protein_GI_number: 15829907 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Phosphoribosyl-dephospho-CoA transferase (holo-ACP synthetase) # Organism: Escherichia coli O157:H7 # 1 183 1 183 183 350 100.0 1e-96 MHLLPELASHHAVSIPELLVSRDERQARQHVWLKRHPVPLVSFTVVAPGPIKDSEVTRRI 
FNHGVTALRALAAKQGWQIQEQAALVSASGPEGMLSIAAPARDLKLATIELEHSHPLGRL WDIDVLTPEGEILSRRDYSLPPRRCLLCEQSAAVCARGKTHQLTDLLNRMEALLNDVDAC NVN >gi|296493086|gb|ADTK01000415.1| GENE 4 2998 - 4530 1648 510 aa, chain - ## HITS:1 COG:citF KEGG:ns NR:ns ## COG: citF COG3051 # Protein_GI_number: 16128598 # Func_class: C Energy production and conversion # Function: Citrate lyase, alpha subunit # Organism: Escherichia coli K12 # 1 510 1 510 510 1008 99.0 0 MTQKIEQSQRQERVAAWNRRAECDLAAFQNSPKQTYQAEKARDRKLCANLEEAIRRSGLQ DGMTVSFHHAFRGGDLTVNMVMDVIAKMGFKNLTLASSSLSDCHAPLVEHIRQGVVTRIY TSGLRGPLAEEISRGLLAEPVQIHSHGGRVHLVQSGELNIDVAFLGVPSCDEFGNANGYT GKACCGSLGYAMVDADNAKQVVMLTEELLPYPHNPASIEQDQVDLIVKVDRVGDAAKIGA GATRMTTNPRELLIARSAADVIVNSGYFKEGFSMQTGTGGASLAVTRFLEDKMRSRDIRA DFALGGITATMVDLHEKGLIRKLLDVQSFDSHAAQSLARNPNHIEISANQYANWGSKGAS VDRLDVVVLSALEIDTQFNVNVLTGSDGVLRGASGGHCDTAIASALSIIVAPLVRGRIPT LVDNVLTCITPGSSVDILVTDHGIAVNPARPELAERLQEAGIKVVSIEWLRERARLLTGE PQPIEFTDRVVAVVRYRDGSVIDVVHQVKE >gi|296493086|gb|ADTK01000415.1| GENE 5 4541 - 5449 1120 302 aa, chain - ## HITS:1 COG:citE KEGG:ns NR:ns ## COG: citE COG2301 # Protein_GI_number: 16128599 # Func_class: G Carbohydrate transport and metabolism # Function: Citrate lyase beta subunit # Organism: Escherichia coli K12 # 1 302 6 307 307 542 100.0 1e-154 MISASLQQRKTRTRRSMLFVPGANAAMVSNSFIYPADALMFDLEDSVALREKDTARRMVY HALQHPLYRDIETIVRVNALDSEWGVNDLEAVVRGGADVVRLPKTDTAQDVLDIEKEILR IEKACGREPGSTGLLAAIESPLGITRAVEIAHASERLIGIALGAEDYVRNLRTERSPEGT ELLFARCSILQAARSAGIQAFDTVYSDANNEAGFLQEAAHIKQLGFDGKSLINPRQIDLL HNLYAPTQKEVDHARRVVEAAEAAAREGLGVVSLNGKMVDGPVIDRARLVLSRAELSGIR EE >gi|296493086|gb|ADTK01000415.1| GENE 6 5446 - 5742 319 98 aa, chain - ## HITS:1 COG:ECs0656 KEGG:ns NR:ns ## COG: ECs0656 COG3052 # Protein_GI_number: 15829910 # Func_class: C Energy production and conversion # Function: Citrate lyase, gamma subunit # Organism: Escherichia coli O157:H7 # 1 98 1 98 98 182 100.0 1e-46 MKINQPAVAGTLESGDVMIRIAPLDTQDIDLQINSSVEKQFGDAIRTTILDVLARYNVRG VQLNVDDKGALDCILRARLEALLARASGIPALPWEDCQ 
>gi|296493086|gb|ADTK01000415.1| GENE 7 5757 - 6815 843 352 aa, chain - ## HITS:1 COG:citC KEGG:ns NR:ns ## COG: citC COG3053 # Protein_GI_number: 16128601 # Func_class: C Energy production and conversion # Function: Citrate lyase synthetase # Organism: Escherichia coli K12 # 1 352 30 381 381 724 100.0 0 MFGNDIFTRVKRSENKKMAEIAQFLHENDLSVDTTVEVFITVTRDEKLIACGGIAGNIIK CVAISESVRGEGLALTLATELINLAYERHSTHLFIYTKTEYEALFRQCGFSTLTSVPGVM VLMENSATRLKRYAESLKKFRHPGNKIGCIVMNANPFTNGHRYLIQQAAAQCDWLHLFLV KEDSSRFPYEDRLDLVLKGTADIPRLTVHRGSEYIISRATFPCYFIKEQSVINHCYTEID LKIFRQYLAPALGVTHRFVGTEPFCRVTAQYNQDMRYWLETPTISAPPIELVEIERLRYQ EMPISASRVRQLLAKNDLTAIAPLVPAVTLHYLQNLLEHSRQDAAARQKTPA >gi|296493086|gb|ADTK01000415.1| GENE 8 7195 - 8853 1470 552 aa, chain + ## HITS:1 COG:citA KEGG:ns NR:ns ## COG: citA COG3290 # Protein_GI_number: 16128602 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase regulating citrate/malate metabolism # Organism: Escherichia coli K12 # 1 552 1 552 552 1109 99.0 0 MLQLNENKQFAFFQRLAFPLRIFLLILVFSIFVIAALAQYFTASFEDYLTLHVRDMAMNQ AKIIASNDSVISAVKTRDYKRLATIANKLQRDTDFDYVVIGDRHSIRLYHPNPEKIGYPM QFTKQGALEKGESYFITGKGSMGMAMRAKTPIFDDDGKVIGVVSIGYLVSKIDSWRAEFL LPMAGVFVVLLGILMLLSWFLAAHIRRQMMGMEPKQIARVVRQQEALFSSVYEGLIAVDP HGYITAINRNARKMLGLSSPGRQWLGKPIAEVVRPADFFTEQIDEKRQDVVANFNGLSVI ANREAIRSGDDLLGAIISFRSKDEISTLNAQLTQIKQYVESLRTLRHEHLNWMSTLNGLL QMKEYDRVLAMVQGESQAQQQLIDSLREAFADRQVAGLLFGKVQRARELGLKMIIVPGSQ LSQLPPGLDSTEFAAIVGNLLDNAFEASLRSDEGNKIVELFLSDEGDDVVIEVADQGCGV PESLRDKIFEQGVSTRADEPGEHGIGLYLIASYVTRCGGVITLEDNDPCGTLFSIYIPKV KPNDSSINPIDR >gi|296493086|gb|ADTK01000415.1| GENE 9 8822 - 9502 263 226 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149011191|ref|ZP_01832496.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP19-BS75] # 5 223 1 221 226 105 28 2e-22 MTAPLTLLIVEDETPLAEMHAEYIRHIPGFSQILLAGNLAQARMMIERFKPGLILLDNYL PDGRGINLLHELVQAHYPGDVVFTTAASDMETVSEAVRCGVFDYLIKPIAYERLGQTLTR FRQRKHMLESIDSASQKQIDEMFNAYARGEPKDELPTGIDPLTLNAVRKLFKEPGVQHTA 
ETVAQALTISRTTARRYLEYCASRHLIIAEIVHGKVGRPQRIYHSG >gi|296493086|gb|ADTK01000415.1| GENE 10 9543 - 10928 1382 461 aa, chain - ## HITS:1 COG:ECs0660 KEGG:ns NR:ns ## COG: ECs0660 COG3069 # Protein_GI_number: 15829914 # Func_class: C Energy production and conversion # Function: C4-dicarboxylate transporter # Organism: Escherichia coli O157:H7 # 1 461 1 461 461 633 100.0 0 MLTFIELLIGVVVIVGVARYIIKGYSATGVLFVGGLLLLIISAIMGHKVLPSSQASTGYS ATDIVEYVKILLMSRGGDLGMMIMMLCGFAAYMTHIGANDMVVKLASKPLQYINSPYLLM IAAYFVACLMSLAVSSATGLGVLLMATLFPVMVNVGISRGAAAAICASPAAIILAPTSGD VVLAAQASEMSLIDFAFKTTLPISIAAIIGMAIAHFFWQRYLDKKEHISHEMLDVSEITT TAPAFYAILPFTPIIGVLIFDGKWGPQLHIITILVICMLIASILEFLRSFNTQKVFSGLE VAYRGMADAFANVVMLLVAAGVFAQGLSTIGFIQSLISIATSFGSASIILMLVLVILTML AAVTTGSGNAPFYAFVEMIPKLAHSSGINPAYLTIPMLQASNLGRTLSPVSGVVVAVAGM AKISPFEVVKRTSVPVLVGLVIVIVATELMVPGTAAAVTGK >gi|296493086|gb|ADTK01000415.1| GENE 11 11517 - 12077 457 186 aa, chain + ## HITS:1 COG:no KEGG:B21_00581 NR:ns ## KEGG: B21_00581 # Name: crcA # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 186 1 186 186 351 100.0 6e-96 MNVSKYVAIFSFVFIQLISVGKVFANADEWMTTFRENIAQTWQQPEHYDLYIPAITWHAR FAYDKEKTDRYNERPWGGGFGLSRWDEKGNWHGLYAMAFKDSWNKWEPIAGYGWESTWRP LADENFHLGLGFTAGVTARDNWNYIPLPVLLPLASVGYGPVTFQMTYIPGTYNNGNVYFA WMRFQF >gi|296493086|gb|ADTK01000415.1| GENE 12 12252 - 12461 338 69 aa, chain + ## HITS:1 COG:ECs0662 KEGG:ns NR:ns ## COG: ECs0662 COG1278 # Protein_GI_number: 15829916 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Escherichia coli O157:H7 # 1 69 1 69 69 124 100.0 4e-29 MSKIKGNVKWFNESKGFGFITPEDGSKDVFVHFSAIQTNGFKTLAEGQRVEFEITNGAKG PSAANVIAL >gi|296493086|gb|ADTK01000415.1| GENE 13 12515 - 12898 192 127 aa, chain - ## HITS:1 COG:crcB KEGG:ns NR:ns ## COG: crcB COG0239 # Protein_GI_number: 16128607 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Integral membrane protein possibly involved in chromosome condensation # Organism: Escherichia coli K12 # 
1 127 1 127 127 202 99.0 2e-52 MLQLLLAVFIGGGTGSVARWLLSMRFNPLHQAIPLGTLTANLIGAFIIGMGFAWFSRMTN IDPVWKVLITTGFCGGLTTFSTFSAEVVFLLQEGRFGWALLNVFVNLLGSFAMTALAFWL FSASTAH >gi|296493086|gb|ADTK01000415.1| GENE 14 12991 - 13779 759 262 aa, chain + ## HITS:1 COG:ECs0664 KEGG:ns NR:ns ## COG: ECs0664 COG0388 # Protein_GI_number: 15829918 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Escherichia coli O157:H7 # 1 262 1 262 262 501 100.0 1e-142 MLVAAGQFAVTSVWEKNAEICASLMAQAAENDVSLFVLPEALLARDDHDADLSVKSAQLL EGEFLGRLRRESKRNMMTTILTIHVPSTPGRAWNMLVALQAGNIVARYAKLHLYDAFAIQ ESRRVDAGNEIAPLLEVEGMKVGLMTCYDLRFPELALAQALQGAEILVLPAAWVRGPLKE HHWSTLLAARALDTTCYMVAAGECGNKNIGQSRIIDPFGVTIAAASEMPALIMAEVTPER VRQVRAQLPVLNNRRFAPPQLL >gi|296493086|gb|ADTK01000415.1| GENE 15 13902 - 14111 178 69 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90022866|ref|YP_528693.1| ribosomal protein L25 [Saccharophagus degradans 2-40] # 1 59 1 59 83 73 57 2e-12 VSMGEISITKLLVVAALVVLLFGTKKLRTLGGDLGAAIKGFKKAMNDDDAAAKKGADVDL QAEKLSHKE >gi|296493086|gb|ADTK01000415.1| GENE 16 14212 - 15177 948 321 aa, chain - ## HITS:1 COG:ECs0666 KEGG:ns NR:ns ## COG: ECs0666 COG0320 # Protein_GI_number: 15829920 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate synthase # Organism: Escherichia coli O157:H7 # 1 321 1 321 321 657 100.0 0 MSKPIVMERGVKYRDADKMALIPVKNVATEREALLRKPEWMKIKLPADSTRIQGIKAAMR KNGLHSVCEEASCPNLAECFNHGTATFMILGAICTRRCPFCDVAHGRPVAPDANEPVKLA QTIADMALRYVVITSVDRDDLRDGGAQHFADCITAIREKSPQIKIETLVPDFRGRMDRAL DILTATPPDVFNHNLENVPRIYRQVRPGADYNWSLKLLERFKEAHPEIPTKSGLMVGLGE TNEEIIEVMRDLRRHGVTMLTLGQYLQPSRHHLPVQRYVSPDEFDEMKAEALAMGFTHAA CGPFVRSSYHADLQAKGMEVK >gi|296493086|gb|ADTK01000415.1| GENE 17 15386 - 16339 650 317 aa, chain - ## HITS:1 COG:ybeF KEGG:ns NR:ns ## COG: ybeF COG0583 # Protein_GI_number: 16128612 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 52 317 1 266 266 525 99.0 1e-149 
MDSNNQIEPCLSRKSSEGKPQIFTTLRNIDLNLLTIFEAVYVHKGIVNAAKVLNLTPSAI SQSIQKLRVIFPDPLFIRKGQGVTPTAFAMHLHEYISQGLESILGALDIEGSYDKQRTIT IATTPSVGALVLPVIYRAIKTHYPQLLLRNPPISDAENQLSQFQTDLIIDNMFCTNRTVQ HHVLFTDNMVLICREGNPLLSLEDDRETIDNAAHVLLLPEEQNFSGLRQRVQEMFPDRQI NFTSYNILTIAALVANSDMLAIIPSRFYNLFSRCWPLEKLPFPSLNEEQIDFSIHYNKFS LRDPILHGVIDVIRNAF >gi|296493086|gb|ADTK01000415.1| GENE 18 16598 - 17239 499 213 aa, chain - ## HITS:1 COG:ECs0668 KEGG:ns NR:ns ## COG: ECs0668 COG0321 # Protein_GI_number: 15829922 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate-protein ligase B # Organism: Escherichia coli O157:H7 # 23 213 1 191 191 401 100.0 1e-112 MYQDKILVRQLGLQPYEPISQAMHEFTDTRDDSTLDEIWLVEHYPVFTQGQAGKAEHILM PGDIPVIQSDRGGQVTYHGPGQQVMYVLLNLKRRKLGVRELVTLLEQTVVNTLAELGIEA HPRADAPGVYVGEKKICSLGLRIRRGCSFHGLALNVNMDLSPFLRINPCGYAGMEMAKIS QWKPEATTNNIAPRLLENILALLNNPDFEYITA >gi|296493086|gb|ADTK01000415.1| GENE 19 17340 - 17603 340 87 aa, chain - ## HITS:1 COG:ECs0669 KEGG:ns NR:ns ## COG: ECs0669 COG2921 # Protein_GI_number: 15829923 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 87 1 87 87 157 100.0 5e-39 MKTKLNELLEFPTPFTYKVMGQALPELVDQVVEVVQRHAPGDYTPTVKPSSKGNYHSVSI TINATHIEQVETLYEELGKIDIVRMVL >gi|296493086|gb|ADTK01000415.1| GENE 20 17714 - 18925 1346 403 aa, chain - ## HITS:1 COG:ZdacA KEGG:ns NR:ns ## COG: ZdacA COG1686 # Protein_GI_number: 15800346 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Escherichia coli O157:H7 EDL933 # 1 403 1 403 403 825 100.0 0 MNTIFSARIMKRLALTTALCTAFISAAHADDLNIKTMIPGVPQIDAESYILIDYNSGKVL AEQNADVRRDPASLTKMMTSYVIGQAMKAGKFKETDLVTIGNDAWATGNPVFKGSSLMFL KPGMQVPVSQLIRGINLQSGNDACVAMADFAAGSQDAFVGLMNSYVNALGLKNTHFQTVH GLDADGQYSSARDMALIGQALIRDVPNEYSIYKEKEFTFNGIRQLNRNGLLWDNSLNVDG IKTGHTDKAGYNLVASATEGQMRLISAVMGGRTFKGREAESKKLLTWGFRFFETVNPLKV GKEFASEPVWFGDSDRASLGVDKDVYLTIPRGRMKDLKASYVLNSSELHAPLQKNQVVGT 
INFQLDGKTIEQRPLVVLQEIPEGNFFGKIIDYIKLMFHHWFG >gi|296493086|gb|ADTK01000415.1| GENE 21 19065 - 20153 845 362 aa, chain - ## HITS:1 COG:rlpA KEGG:ns NR:ns ## COG: rlpA COG0797 # Protein_GI_number: 16128616 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipoproteins # Organism: Escherichia coli K12 # 1 362 1 362 362 566 99.0 1e-161 MRKQWLGICIAAGMLAACTSDDGQQQTVSVPQPAVCNGPIVEISGADPRFEPLNATANQD YQRDGKSYKIVQDPSRFSQAGLAAIYDAEPGSNLTASGEAFDPTQLTAAHPTLPIPSYAR ITNLANGRMIVVRINDRGPYGNDRVISLSRAAADRLNTSNNTKVRIDPIIVAQDGSLSGP GMACTTVAKQTYALPAPPDLSGGAGASSVSGPQGDILPVSNSTLKSEDPTGAPVTSSGFL GAPTTLAPGVLESSEPTPAPQPVVTAPSTTPATSPAMVTPQAASQSASGNFMVQVGAVSD QARAQQYQQQLGQKFGVPGRVTQNGAVWRIQLGPFASKAEASTLQQRLQTEAQLQSFITT AQ >gi|296493086|gb|ADTK01000415.1| GENE 22 20164 - 21276 1126 370 aa, chain - ## HITS:1 COG:ECs0672 KEGG:ns NR:ns ## COG: ECs0672 COG0772 # Protein_GI_number: 15829926 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Escherichia coli O157:H7 # 1 370 1 370 370 618 100.0 1e-177 MTDNPNKKTFWDKVHLDPTMLLILLALLVYSALVIWSASGQDIGMMERKIGQIAMGLVIM VVMAQIPPRVYEGWAPYLYIICIILLVAVDAFGAISKGAQRWLDLGIVRFQPSEIAKIAV PLMVARFINRDVCPPSLKNTGIALVLIFMPTLLVAAQPDLGTSILVALSGLFVLFLSGLS WRLIGVAVVLVAAFIPILWFFLMHDYQRQRVMMLLDPESDPLGAGYHIIQSKIAIGSGGL RGKGWLHGTQSQLEFLPERHTDFIFAVLAEELGLVGILILLALYILLIMRGLWIAARAQT TFGRVMAGGLMLILFVYVFVNIGMVSGILPVVGVPLPLVSYGGSALIVLMAGFGIVMSIH THRKMLSKSV >gi|296493086|gb|ADTK01000415.1| GENE 23 21279 - 23180 2070 633 aa, chain - ## HITS:1 COG:ECs0673 KEGG:ns NR:ns ## COG: ECs0673 COG0768 # Protein_GI_number: 15829927 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Escherichia coli O157:H7 # 1 633 1 633 633 1295 100.0 0 MKLQNSFRDYTAESALFVRRALVAFLGILLLTGVLIANLYNLQIVRFTDYQTRSNENRIK LVPIAPSRGIIYDRNGIPLALNRTIYQIEMMPEKVDNVQQTLDALRSVVDLTDDDIAAFR 
KERARSHRFTSIPVKTNLTEVQVARFAVNQYRFPGVEVKGYKRRYYPYGSALTHVIGYVS KINDKDVERLNNDGKLANYAATHDIGKLGIERYYEDVLHGQTGYEEVEVNNRGRVIRQLK EVPPQAGHDIYLTLDLKLQQYIETLLAGSRAAVVVTDPRTGGVLALVSTPSYDPNLFVDG ISSKDYSALLNDPNTPLVNRATQGVYPPASTVKPYVAVSALSAGVITRNTTLFDPGWWQL PGSEKRYRDWKKWGHGRLNVTRSLEESADTFFYQVAYDMGIDRLSEWMGKFGYGHYTGID LAEERSGNMPTREWKQKRFKKPWYQGDTIPVGIGQGYWTATPIQMSKALMILINDGIVKV PHLLMSTAEDGKQVPWVQPHEPPVGDIHSGYWELAKDGMYGVANRPNGTAHKYFASAPYK IAAKSGTAQVFGLKANETYNAHKIAERLRDHKLMTAFAPYNNPQVAVAMILENGGAGPAV GTLMRQILDHIMLGDNNTDLPAENPAVAAAEDH >gi|296493086|gb|ADTK01000415.1| GENE 24 23211 - 23678 527 155 aa, chain - ## HITS:1 COG:ECs0674 KEGG:ns NR:ns ## COG: ECs0674 COG1576 # Protein_GI_number: 15829928 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 155 1 155 155 304 100.0 4e-83 MKLQLVAVGTKMPDWVQTGFTEYLRRFPKDMPFELIEIPAGKRGKNADIKRILDKEGEQM LAAAGKNRIVTLDIPGKPWDTPQLAAELERWKLDGRDVSLLIGGPEGLSPACKAAAEQSW SLSALTLPHPLVRVLVAESLYRAWSITTNHPYHRE >gi|296493086|gb|ADTK01000415.1| GENE 25 23682 - 23999 297 105 aa, chain - ## HITS:1 COG:STM0642 KEGG:ns NR:ns ## COG: STM0642 COG0799 # Protein_GI_number: 16764019 # Func_class: S Function unknown # Function: Uncharacterized homolog of plant Iojap protein # Organism: Salmonella typhimurium LT2 # 1 105 1 105 105 187 97.0 3e-48 MQGKALQDFVIDKIDDLKGQDIIALDVQGKSSITDCMIICTGTSSRHVMSIADHVVQESR AAGLLPLGVEGENSADWIVVDLGDVIVHVMQEESRRLYELEKLWS >gi|296493086|gb|ADTK01000415.1| GENE 26 24259 - 24870 433 203 aa, chain - ## HITS:1 COG:phpB KEGG:ns NR:ns ## COG: phpB COG0406 # Protein_GI_number: 16128621 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Escherichia coli K12 # 1 203 1 203 203 406 98.0 1e-113 MRLWLIRHGETQANVDGLYSGHAPTPLTARGIEQAQNLHTLLHGVSFDLVLCGELERAQH TARLVLSDRQLPVQIIPELNEMFFGDWEMRHHRDLMQEDAENYSAWCNDWQHAIPTNGEG FQAFSQRVERFIARLSEFQHYQNILVVSHQGVLSLLIARLIGMPAEAMWHFRVDQGCWST IDINQKFATLRVLNSRAIGVENA >gi|296493086|gb|ADTK01000415.1| GENE 27 
24894 - 25535 508 213 aa, chain - ## HITS:1 COG:nadD KEGG:ns NR:ns ## COG: nadD COG1057 # Protein_GI_number: 16128622 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid mononucleotide adenylyltransferase # Organism: Escherichia coli K12 # 1 213 1 213 213 415 100.0 1e-116 MKSLQALFGGTFDPVHYGHLKPVETLANLIGLTRVTIIPNNVPPHRPQPEANSVQRKHML ELAIADKPLFTLDERELKRNAPSYTAQTLKEWRQEQGPDVPLAFIIGQDSLLTFPTWYEY ETILDNAHLIVCRRPGYPLEMAQPQYQQWLEDHLTHNPEDLHLQPAGKIYLAETPWFNIS ATIIRERLQNGESCEDLLPEPVLTYINQQGLYR >gi|296493086|gb|ADTK01000415.1| GENE 28 25537 - 26517 729 326 aa, chain - ## HITS:1 COG:holA KEGG:ns NR:ns ## COG: holA COG1466 # Protein_GI_number: 16128623 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, delta subunit # Organism: Escherichia coli K12 # 2 326 19 343 343 574 99.0 1e-164 GAAYLLLGNDPLLLQESQDAVRQVAAAQGFEEHHTFSIDPNTDWNAIFSLCQAMSLFASR QTLLLLLPENGPNAAINEQLLTLTGLLHDDLLLIVRGNKLSKAQENAAWFTALANRSVQV TCQTPEQAQLPRWVAARAKQLNLELDDAANQVLCYCYEGNLLALAQALERLSLLWPDGKL TLPRVEQAVNDAAHFTPFHWVDALLMGKSKRALHILQQLRLEGSEPVILLRTLQRELLLL VNLKRQSAHTPLRALFDKHRVWQNRRGMMGEALNRLSQPQLRQAVQLLTRTELTLKQDYG QSVWAELEGLSLLLCHKPLADVFIDG Prediction of potential genes in microbial genomes Time: Mon May 16 16:25:46 2011 Seq name: gi|296493085|gb|ADTK01000416.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1430.8, whole genome shotgun sequence Length of sequence - 26351 bp Number of predicted genes - 22, with homology - 22 Number of transcription units - 11, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 9/0.000 - CDS 56 - 637 557 ## COG2980 Rare lipoprotein B 2 1 Op 2 . - CDS 652 - 3375 2954 ## COG0495 Leucyl-tRNA synthetase - Prom 3428 - 3487 3.9 + Prom 3328 - 3387 3.1 3 2 Tu 1 . + CDS 3469 - 3951 498 ## APECO1_1413 hypothetical protein + Term 4104 - 4136 3.1 - Term 3973 - 4008 4.3 4 3 Tu 1 . 
- CDS 4021 - 4998 758 ## COG0790 FOG: TPR repeat, SEL1 subfamily - Prom 5088 - 5147 4.9 + Prom 5030 - 5089 7.4 5 4 Op 1 . + CDS 5162 - 5869 336 ## JW0640 hypothetical protein 6 4 Op 2 . + CDS 5866 - 7293 602 ## EcSMS35_0667 DnaJ domain-containing protein 7 5 Tu 1 . - CDS 7303 - 7842 336 ## COG0790 FOG: TPR repeat, SEL1 subfamily - Prom 7891 - 7950 9.9 + Prom 7816 - 7875 10.2 8 6 Op 1 . + CDS 7959 - 8666 486 ## EC55989_0640 hypothetical protein 9 6 Op 2 . + CDS 8663 - 10114 907 ## ECO111_0679 Hsc56 co-chaperone of HscC + Term 10129 - 10185 7.4 - Term 10020 - 10075 0.2 10 7 Op 1 3/0.333 - CDS 10291 - 11844 1327 ## COG0443 Molecular chaperone - Prom 11864 - 11923 2.3 - Term 11866 - 11916 8.0 11 7 Op 2 4/0.333 - CDS 11928 - 12863 969 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase 12 8 Op 1 34/0.000 - CDS 12981 - 13706 552 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 13 8 Op 2 17/0.000 - CDS 13706 - 14380 742 ## COG0765 ABC-type amino acid transport system, permease component 14 8 Op 3 31/0.000 - CDS 14380 - 15120 805 ## COG0765 ABC-type amino acid transport system, permease component - Prom 15165 - 15224 2.0 - Term 15128 - 15167 4.4 15 8 Op 4 . - CDS 15290 - 16198 1166 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain - Prom 16271 - 16330 4.0 + Prom 16501 - 16560 5.4 16 9 Tu 1 . + CDS 16653 - 18530 747 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) - Term 18523 - 18557 5.1 17 10 Op 1 9/0.000 - CDS 18608 - 20146 1848 ## COG0815 Apolipoprotein N-acyltransferase 18 10 Op 2 11/0.000 - CDS 20171 - 21049 946 ## COG4535 Putative Mg2+ and Co2+ transporter CorC - Term 21073 - 21108 2.4 19 10 Op 3 17/0.000 - CDS 21139 - 21606 717 ## COG0319 Predicted metal-dependent hydrolase 20 10 Op 4 10/0.000 - CDS 21603 - 22643 1364 ## COG1702 Phosphate starvation-inducible protein PhoH, predicted ATPase - Prom 22694 - 22753 2.2 - Term 22757 - 22789 6.1 21 10 Op 5 . 
- CDS 22796 - 24220 438 ## PROTEIN SUPPORTED gi|227372256|ref|ZP_03855738.1| SSU ribosomal protein S12P methylthiotransferase - Prom 24332 - 24391 2.1 + Prom 24256 - 24315 3.1 22 11 Tu 1 . + CDS 24366 - 25541 1272 ## COG0654 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases + Term 25669 - 25697 1.0 - TRNA 25695 - 25769 75.8 # Gln CTG 0 0 - TRNA 25807 - 25881 75.8 # Gln CTG 0 0 - TRNA 25930 - 26006 96.2 # Met CAT 0 0 - TRNA 26022 - 26096 74.3 # Gln TTG 0 0 - TRNA 26131 - 26205 74.3 # Gln TTG 0 0 - TRNA 26229 - 26313 78.3 # Leu TAG 0 0 Predicted protein(s) >gi|296493085|gb|ADTK01000416.1| GENE 1 56 - 637 557 193 aa, chain - ## HITS:1 COG:rlpB KEGG:ns NR:ns ## COG: rlpB COG2980 # Protein_GI_number: 16128624 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rare lipoprotein B # Organism: Escherichia coli K12 # 1 193 1 193 193 337 100.0 7e-93 MRYLATLLLSLAVLITAGCGWHLRDTTQVPSTMKVMILDSGDPNGPLSRAVRNQLRLNGV ELLDKETTRKDVPSLRLGKVSIAKDTASVFRNGQTAEYQMIMTVNATVLIPGRDIYPISA KVFRSFFDNPQMALAKDNEQDMIVKEMYDRAAEQLIRKLPSIRAADIRSDEEQTSTTTDT PATPARVSTTLGN >gi|296493085|gb|ADTK01000416.1| GENE 2 652 - 3375 2954 907 aa, chain - ## HITS:1 COG:ECs0680 KEGG:ns NR:ns ## COG: ECs0680 COG0495 # Protein_GI_number: 15829934 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Escherichia coli O157:H7 # 48 907 1 860 860 1791 99.0 0 MNNPGIISTSSARKAVLTRTFGLCYADLKNHLNATFVAVLKTGPLAAMQEQYRPEEIESK VQLHWDEKRTFEVTEDESKEKYYCLSMLPYPSGRLHMGHVRNYTIGDVIARYQRMLGKNV LQPIGWDAFGLPAEGAAVKNNTAPAPWTYDNIAYMKNQLKMLGFGYDWSRELATCTPEYY RWEQKFFTELYKKGLVYKKTSAVNWCPNDQTVLANEQVIDGCCWRCDTKVERKEIPQWFI KITAYADELLNDLDKLDHWPDTVKTMQRNWIGRSEGVEITFNVNDYDNTLTVYTTRPDTF MGCTYLAVAAGHPLAQKAAENNPELAAFIDECRNTKVAEAEMATMEKKGVDTGFKAVHPL TGEEIPVWAANFVLMEYGTGAVMAVPGHDQRDYEFASKYGLNIKPVILAADGSEPDLSQQ ALTEKGVLFNSGEFNGLDHEAAFNAIADKLTAMGVGERKVNYRLRDWGVSRQRYWGAPIP MVTLEDGTVMPTPDDQLPVILPEDVVMDGITSPIKADPEWAKTTVNGMPALRETDTFDTF 
MESSWYYARYTCPEYKEGMLDSEAANYWLPVDIYIGGIEHAIMHLLYFRFFHKLMRDAGM VNSDEPAKQLLCQGMVLADAFYYVGENGERNWVSPVDAIVERDEKGRIVKAKDAAGHELV YTGMSKMSKSKNNGIDPQVMVERYGADTVRLFMMFASPADMTLEWQESGVEGANRFLKRV WKLVYEHTAKGDVAALNVDALTEDQKALRRDVHKTIAKVTDDIGRRQTFNTAIAAIMELM NKLAKAPTDGEQDRALMQEALLAVVRMLNPFTPHICFTLWQELKGEGDIDNAPWPVADEK AMVEDSTLVVVQVNGKVRAKITVPVDATEEQVRERAGQEHLVAKYLDGVTVRKVIYVPGK LLNLVVG >gi|296493085|gb|ADTK01000416.1| GENE 3 3469 - 3951 498 160 aa, chain + ## HITS:1 COG:no KEGG:APECO1_1413 NR:ns ## KEGG: APECO1_1413 # Name: ybeL # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 160 1 160 160 305 100.0 2e-82 MNKVAQYYRELVASLSERLRNGERDIDALVEQARERVIKTGELTRTEVDELTRAVRRDLE EFAMSYEESLKEESDSVFMRVIKESLWQELADITDKTQLEWREVFQDLNHHGVYHSGEVV GLGNLVCEKCHFHLPIYTPEVLTLCPKCGHDQFQRRPFEP >gi|296493085|gb|ADTK01000416.1| GENE 4 4021 - 4998 758 325 aa, chain - ## HITS:1 COG:ybeQ KEGG:ns NR:ns ## COG: ybeQ COG0790 # Protein_GI_number: 16128627 # Func_class: R General function prediction only # Function: FOG: TPR repeat, SEL1 subfamily # Organism: Escherichia coli K12 # 1 325 3 327 327 592 98.0 1e-169 MIFTSTCCDNLSIDEIIERAEKGDCEAQYIVGFYYNRDSAIDSPDDEKAFYWLKLAAEQG HCEAQYSLGQKYTEDKSRHKDNEHAIFWLKKAALQGHTFASNALGWILDRGEDPNYKEAV VWYQIAAESGMSYAQNNLGWMYRNGNGVAKDYALAFFWYKQAALQGHSDAQNNLADLYED GKGVAQNETLAAFWYLKSAQQGNRHAQFQIAWDYNAGEGVDQDYKQAMYWYLKAAAQGSV GAYVNIGYMYKHGQGVEKDYQAAFEWFTKAAECNDATAWYNLAIMYHYGEGRPVDLRQAL DLYRKVQSSGTRDVSQEIRETEDLL >gi|296493085|gb|ADTK01000416.1| GENE 5 5162 - 5869 336 235 aa, chain + ## HITS:1 COG:no KEGG:JW0640 NR:ns ## KEGG: JW0640 # Name: ybeR # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 235 1 235 235 471 100.0 1e-132 MDMESQKILFALSTPMEIRNECCLPSHSSPKMYLGTCFFDLSSSWGIDDRDDLLRTIHRM IDNGHAARLAGFYHRWFRYSPCEWRDYLAELNEQGQAYAQFVASTAECCGEGGIKAWDYV RMGFLSRMGVLNNWLSEEESLWIQSRIHLRALRYYSNWRQYFAGYTFGRQYWQSPEDDHL PLLREFLARKEYDDSGNDMFYQLFASDDAYYPTLSWQPLAYYSACPETLKDMSDL >gi|296493085|gb|ADTK01000416.1| GENE 6 5866 - 7293 602 475 aa, chain + ## 
HITS:1 COG:no KEGG:EcSMS35_0667 NR:ns ## KEGG: EcSMS35_0667 # Name: not_defined # Def: DnaJ domain-containing protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 475 1 475 475 902 96.0 0 MKNCWKILDIEETTDVDIIRRAYLALLPSFHPETDPQGFKQLRQAYEEALRIAQSPAKSV WQPEEYEVAEHEILLAFRALLASDSERFLPSAWQRFIQQLNYCSMEEIDELRWSLCTIAM NTAHLSFECVVLLAERLRWLQEENVGEIDEEELESFLYAIAKGNVFNFQTILHLPVAVQN DTIDFYQMFARIWSSHPEWLTLYLAQHRAVIIPDDAKLHRNLLRWYSASRLGIPELLDYA QSWREAEPDNEDAPYYEYAQRVYCGEGESLLAELCDYWREYPSTQADALMLQWCRQHRVD YYPLLVMMIEARDLVNDQGKPLLYVPGDSARTRFHLYEILSDEKLSALGRSLVEMVLHKG RKPRISLTRDTEHPLWPLYLVAKQLVQACQPTEESLMPIVSRLDAENRCPLEALIIRRLL IQAANFTEKQTVEPEPQPQPMPVDDGGPGCLGIIKIIFYIFIFAGLIGKILHLFG >gi|296493085|gb|ADTK01000416.1| GENE 7 7303 - 7842 336 179 aa, chain - ## HITS:1 COG:ECs0685 KEGG:ns NR:ns ## COG: ECs0685 COG0790 # Protein_GI_number: 15829939 # Func_class: R General function prediction only # Function: FOG: TPR repeat, SEL1 subfamily # Organism: Escherichia coli O157:H7 # 1 179 6 184 184 341 98.0 3e-94 MYIFAIFIVAAITCISQPKKTTLRDKAMVNYAFDYLRSPGSLPFTTAATELSAIHGHSTS QYRLGEFYLHGSDGKPLDYTQARYWYEQSAEQENPRAQSKLGWIYLKGLGVKPDTRKAIL WYKEAAEQGYAHAQYTLGLIYRNGSGINVNHYESQKWLKLAAKQHYKNAERLLAGLPAH >gi|296493085|gb|ADTK01000416.1| GENE 8 7959 - 8666 486 235 aa, chain + ## HITS:1 COG:no KEGG:EC55989_0640 NR:ns ## KEGG: EC55989_0640 # Name: ybeU # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 235 1 235 235 478 100.0 1e-134 MNKEEQYLLFALSAPMEILNQGCKPAHDSPKMYTGIKEFDLSSSWGINNRDDLIQTIYQM TDDGHANDLAGLYLTWHRSSPEEWKALIAGGSERGLIYTQFVAQTAMCCGEGGIKAWDYV RMGFLSRVGVLNNWLTEEESLWLQSRVYVRAHHYYHSWMHYFSAYSLGRLYWQSSQCEDN TSLREALTLYKYDSAGSRMFEELAAGSDRFYATLPWQPLTVQPECPVTLKDVSDL >gi|296493085|gb|ADTK01000416.1| GENE 9 8663 - 10114 907 483 aa, chain + ## HITS:1 COG:no KEGG:ECO111_0679 NR:ns ## KEGG: ECO111_0679 # Name: ybeV # Def: Hsc56 co-chaperone of HscC # Organism: E.coli_O111_H- # Pathway: not_defined # 1 483 1 483 483 945 99.0 0 
MKTCWQILEIESTTQIDIIRQAYLARLPLCHPETDPQGFKALRQAYEEALRLAVNPVEEA DDEEKDAAAEHEILRAFRTLLDSESDRFQPSAWQKFIQQLNTWNMEDVDQLRWPLCAIAI EARYLSLNCASLLAERLNWHSFNDSEGMDEEEREAFLEAIQAGDCFDFLSLLEYPVALQN QTVEYYFALECCCRYHPDYVTAFLAMEGPWFIPDDAKLHRKLLRWYSSVQTGMAELIPVA KQWQMEEPESEDARYYLCAQRLYCGEGESLLADLCAYRESYPSTQADNLLLQWSKRHCPD YFALLVMVIEARSMVDAQGQPLKYVPGESARTRLLWAEILHSGKLSPLGQSFIESLFFKR KAWAWWKSRVGSETEEDSPLLDLYRVAEQVVLEAFPKQEMLARLNTRLEGGDAHPLEAII TRMLLTKVKLEPEDEDVDEPTPENHEEKNDEGEKPQSITSIIKISLTVLVIGYALGKIAM LFS >gi|296493085|gb|ADTK01000416.1| GENE 10 10291 - 11844 1327 517 aa, chain - ## HITS:1 COG:ybeW KEGG:ns NR:ns ## COG: ybeW COG0443 # Protein_GI_number: 16128633 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Escherichia coli K12 # 1 517 1 517 556 1024 99.0 0 MDNAELAIGIDLGTTNSLIAVWKDGAAQLIPNKFGEYLTPSIISMDENNHILVGKPAVSR RTSHPDKTAALFKRAMGSNTNWRLGSDTFNAPELSSLVLRSLKEDAEEFLQRPIKDVVIS VPAYFSDEQRKHTRLAAELAGLNAVRLINEPTAAAMAYGLHTQQNTRSLVFDLGGGTFDV TVLEYATPVIEVHASAGDNFLGGEDFTHMLVDDVLKRADVTRTTLNESELAALYACVEAA KCSNQSPLHIRWQYQDEARECEFYENELEDLWLPLLNRLRVPIEQALRDARLKPSQIDSL VLVGGASQMPLVQRIAVRLFGKLPYQSYDPSTIVALGAAIQAACRLRSEDIEEVILTDIC PYSLGVEVNRQGISGIFSPIIERNTTVPVSRVETYSTMHPEQDSITVNVYQGENHKVKNN ILVESFDVPLKKTGAYQSIDIRFSYDINGLLEVDVLLEDGSVKSRVINHSPVTLSAQQIE ESRTRLSALKIYPRDMLINRTFKAKLEELWARALGDE >gi|296493085|gb|ADTK01000416.1| GENE 11 11928 - 12863 969 311 aa, chain - ## HITS:1 COG:ECs0690 KEGG:ns NR:ns ## COG: ECs0690 COG1957 # Protein_GI_number: 15829944 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Escherichia coli O157:H7 # 1 311 1 311 311 622 99.0 1e-178 MALPILLDCDPGHDDAIAIVLALASPELDVKAITSSAGNQTPEKTLRNVLRMLTLLNRTD IPVAGGAVKPLMRELIIADNVHGESGLDGPALPEPAFAPQNCTAVELMAKTLRESAEPVT IVSTGPQTNVALLLNSHPELHSKIARIVIMGGAMGLGNWTPAAEFNIYVDPEAAEIVFQS GIPVVMAGLDVTHKAQIHVEDTERFRAIGNPVSTIVAELLDFFLEYHKDEKWGFVGAPLH DPCTIAWLLKPELFTTVERWVGVETQGKYTQGMTVVDYYYLTGNKPNATVMVDVDRQGFV 
DLLADRLKFYA >gi|296493085|gb|ADTK01000416.1| GENE 12 12981 - 13706 552 241 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 239 1 242 245 217 45 7e-56 MITLKNVSKWYGHFQVLTDCSTEVKKGEVVVVCGPSGSGKSTLIKTVNGLEPVQQGEITV DGIVVNDKKTDLAKLRSRVGMVFQHFELFPHLSIIENLTLAQVKVLKRDKAPAREKALKL LERVGLSAHANKFPAQLSGGQQQRVAIARALCMDPIAMLFDEPTSALDPEMINEVLDVMV ELANEGMTMMVVTHEMGFARKVANRVIFMDEGKIVEDSPKDAFFDDPKSDRAKDFLAKIL H >gi|296493085|gb|ADTK01000416.1| GENE 13 13706 - 14380 742 224 aa, chain - ## HITS:1 COG:ECs0692 KEGG:ns NR:ns ## COG: ECs0692 COG0765 # Protein_GI_number: 15829946 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Escherichia coli O157:H7 # 1 224 1 224 224 370 100.0 1e-102 MYEFDWSSIVPSLPYLLDGLVITLKITVTAVVIGILWGTMLAVMRLSSFAPVAWFAKAYV NVFRSIPLVMVLLWFYLIVPGFLQNVLGLSPKNDIRLISAMVAFSMFEAAYYSEIIRAGI QSISRGQSSAALALGMTHWQSMKLIILPQAFRAMVPLLLTQGIVLFQDTSLVYVLSLADF FRTASTIGERDGTQVEMILFAGFVYFVISLSASLLVSYLKRRTA >gi|296493085|gb|ADTK01000416.1| GENE 14 14380 - 15120 805 246 aa, chain - ## HITS:1 COG:gltJ KEGG:ns NR:ns ## COG: gltJ COG0765 # Protein_GI_number: 16128637 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Escherichia coli K12 # 1 246 1 246 246 465 100.0 1e-131 MSIDWNWGIFLQQAPFGNTTYLGWIWSGFQVTIALSICAWIIAFLVGSFFGILRTVPNRF LSGLGTLYVELFRNVPLIVQFFTWYLVIPELLPEKIGMWFKAELDPNIQFFLSSMLCLGL FTAARVCEQVRAAIQSLPRGQKNAALAMGLTLPQAYRYVLLPNAYRVIVPPMTSEMMNLV KNSAIASTIGLVDMAAQAGKLLDYSAHAWESFTAITLAYVLINAFIMLVMTLVERKVRLP GNMGGK >gi|296493085|gb|ADTK01000416.1| GENE 15 15290 - 16198 1166 302 aa, chain - ## HITS:1 COG:STM0665 KEGG:ns NR:ns ## COG: STM0665 COG0834 # Protein_GI_number: 16764042 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic 
component/domain # Organism: Salmonella typhimurium LT2 # 1 302 7 308 308 547 93.0 1e-156 MQLRKPATAILALALSAGLAQADDAAPAAGSTLDKIAKNGVIVVGHRESSVPFSYYDNQQ KVVGYSQDYSNAIVEAVKKKLNKPDLQVKLIPITSQNRIPLLQNGTFDFECGSTTNNVER QKQAAFSDTIFVVGTRLLTKKGGDIKDFADLKDKAVVVTSGTTSEVLLNKLNEEQKMNMR IISAKDHGDSFRTLESGRAVAFMMDDALLAGERAKAKKPDNWDIVGKPQSQEAYGCMLRK DDPQFKKLMDDTIAQVQTSGEAEKWFDKWFKNPIPPKNLNMNFELSDEMKALFKEPNDKA LN >gi|296493085|gb|ADTK01000416.1| GENE 16 16653 - 18530 747 625 aa, chain + ## HITS:1 COG:RSp1525 KEGG:ns NR:ns ## COG: RSp1525 COG0705 # Protein_GI_number: 17549744 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Ralstonia solanacearum # 45 204 237 394 569 95 38.0 3e-19 MSASSVKPLNVQLPAITLILFALCIGIFCYLAQWMSYEEVDQSALIHLGANVAPLTLSGE PWRLLSSIFLHSSVSHLLMNMFAFLVVGGVAEQILGKWRLLITWLFSGVFGGLISACYAL RESEQIVISVGASGAILGIAGAAIATQFASGAGTYHKNQRRVFSLLGMVALTLLYGARQT GIDNACHIGGLIAGGALGWLSARLSGQNRLVTEGGIIVAGSLLLTGAIWLAQQQMDESVL QVRQSLREAFYPQEIEQERRQKKQQLAEERNALRETLSAPVSREQASGDLLAEIADIHDM AISRDGNMLYAAIENTNSIVVFDLGQKKILHTFTAPIAKEKSVKHCGGCKDQGVRSLALS PDEKLIYATSFEANALSVINVATGEIIQSITTGAHPDSLILSRDGTKAWVMNRTSNSVSA IDLVTYQHVADIPLEKYDGTGTSNKPGAWVMALSPDEKILLIPGMVRGDIVRINTITHQK ESYPASDARGTISAMRFRPENGDVIFADSQGISRISVGDQQASIMTQWCSRSVYSVEGIS PDGQYLALVSYGLQGYVILLNINAGQIIGVYPASYVNHLRFSADGRKIFVMAKNGLIQLD RTRSLDPQAIIRHPQYGNVACIPEP >gi|296493085|gb|ADTK01000416.1| GENE 17 18608 - 20146 1848 512 aa, chain - ## HITS:1 COG:lnt KEGG:ns NR:ns ## COG: lnt COG0815 # Protein_GI_number: 16128640 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Apolipoprotein N-acyltransferase # Organism: Escherichia coli K12 # 1 512 1 512 512 984 98.0 0 MAFASLIERQRIRLLLALLFGACGTLAFSPYDVWPAAIISLMGLQALTFNRRPLQSAAIG FCWGFGLFGSGINWVYVSIATFGGMPGPVNIFLVVLLAAYLSLYTGLFAGVLSRLWPKTT WLRVAIAAPSLWQVTEFLRGWVLTGFPWLQFGYSQIDGPLKGLAPIMGVEAINFLLMMVS GLLALALVKRNWRPLVVAVVLFALPFPLRYIQWFTPQPEKTIQVSMVQGDIPQSLKWDGD 
QLLNTLKIYYNATAPLMGKSSLIIWPESAITDLEINQQPFLKALDGELRDKGSSLVTGIV DARLNKQNRYDTYNTIITLGKGAPYSYESADRYNKNHLVPFGEFVPLESILRPLAPFFDL PMSSFSRGPYIQPPLSVNGIELTAAICYEIILGEQVRDNFRPDTDYLLTISNDAWFGKSI GPWQHFQMARMRALELARPLLRSTNNGITAVIGPQGEIQAMIPQFTREVLTTNVTPTTGL TPYARTGNWPLWVLTALFGFAAVLMSLRARKR >gi|296493085|gb|ADTK01000416.1| GENE 18 20171 - 21049 946 292 aa, chain - ## HITS:1 COG:ECs0696 KEGG:ns NR:ns ## COG: ECs0696 COG4535 # Protein_GI_number: 15829950 # Func_class: P Inorganic ion transport and metabolism # Function: Putative Mg2+ and Co2+ transporter CorC # Organism: Escherichia coli O157:H7 # 1 292 1 292 292 508 100.0 1e-144 MSDDNSHSSDTISNKKGFFSLLLSQLFHGEPKNRDELLALIRDSGQNDLIDEDTRDMLEG VMDIADQRVRDIMIPRSQMITLKRNQTLDECLDVIIESAHSRFPVISEDKDHIEGILMAK DLLPFMRSDAEAFSMDKVLRQAVVVPESKRVDRMLKEFRSQRYHMAIVIDEFGGVSGLVT IEDILELIVGEIEDEYDEEDDIDFRQLSRHTWTVRALASIEDFNEAFGTHFSDEEVDTIG GLVMQAFGHLPARGETIDIDGYQFKVAMADSRRIIQVHVKIPDDSPQPKLDE >gi|296493085|gb|ADTK01000416.1| GENE 19 21139 - 21606 717 155 aa, chain - ## HITS:1 COG:ybeY KEGG:ns NR:ns ## COG: ybeY COG0319 # Protein_GI_number: 16128642 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Escherichia coli K12 # 1 155 1 155 155 272 99.0 2e-73 MSQVILDLQLACEDNSGLPEESQFQTWLNAVIPQFQKESEVTIRVVDTAESHSLNLTYRG KDKPTNVLSFPFEVPPGMEMSLLGDLVICRQVVEKEAQEQGKPLEAHWAHMVVHGSLHLL GYDHIEDDEAEEMEALETEIMLALGYEDPYIAEKE >gi|296493085|gb|ADTK01000416.1| GENE 20 21603 - 22643 1364 346 aa, chain - ## HITS:1 COG:ECs0698 KEGG:ns NR:ns ## COG: ECs0698 COG1702 # Protein_GI_number: 15829952 # Func_class: T Signal transduction mechanisms # Function: Phosphate starvation-inducible protein PhoH, predicted ATPase # Organism: Escherichia coli O157:H7 # 1 346 14 359 359 644 99.0 0 MNIDTREITLEPADNARLLSLCGPFDDNIKQLERRLGIEINRRDNHFKLTGRPICVTAAA DILRSLYVDTAPMRGQIQDIEPEQIHLAIKEARVLEQSAESVPEYGKAVNIKTKRGVIKP RTPNQAQYIANILDHDITFGVGPAGTGKTYLAVAAAVDALERQEIRRILLTRPAVEAGEK LGFLPGDLSQKVDPYLRPLYDALFEMLGFEKVEKLIERNVIEVAPLAYMRGRTLNDAFII 
LDESQNTTIEQMKMFLTRIGFNSKAVITGDVTQIDLPRNTKSGLRHAIEVLADVEEISFN FFHSEDVVRHPVVARIVNAYEAWEEAEQKRKAALAAERKREEQEQK >gi|296493085|gb|ADTK01000416.1| GENE 21 22796 - 24220 438 474 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227372256|ref|ZP_03855738.1| SSU ribosomal protein S12P methylthiotransferase [Veillonella parvula DSM 2008] # 1 390 1 389 449 173 29 1e-42 MTKKLHIKTWGCQMNEYDSSKMADLLDATHGYQLTDVAEEADVLLLNTCSIREKAQEKVF HQLGRWKLLKEKNPDLIIGVGGCVASQEGEHIRQRAHYVDIIFGPQTLHRLPEMINSVRG DRSPVVDISFPEIEKFDRLPEPRAEGPTAFVSIMEGCNKYCTYCVVPYTRGEEVSRPSDD ILFEIAQLAAQGVREVNLLGQNVNAWRGENYDGTTGSFADLLRLVAAIDGIDRIRFTTSH PIEFTDDIIEVYRDTPELVSFLHLPVQSGSDRILNLMGRTHTALEYKAIIRKLRAARPDI QISSDFIVGFPGETTEDFEKTMKLIADVNFDMSYSFIFSARPGTPAADMVDDVPEEEKKQ RLYILQERINQQAMAWSRRMLGTTQRILVEGTSRKSIMELSGRTENNRVVNFEGTPDMIG KFVDVEITDVYPNSLRGKVVRTEDEMGLRVAETPESVIARTRKENDLGVGYYQP >gi|296493085|gb|ADTK01000416.1| GENE 22 24366 - 25541 1272 391 aa, chain + ## HITS:1 COG:ubiF KEGG:ns NR:ns ## COG: ubiF COG0654 # Protein_GI_number: 16128645 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases # Organism: Escherichia coli K12 # 1 391 1 391 391 771 98.0 0 MTNQPTEIAIVGVGMVGGALALGLAQHGFTVTVIEHAEPAPFVADSQPDVRISAISAASV SLLKGLGVWDAVQAMRCHPYRRLETWEWETAHVVFDAAELKLPLLGYMVENTVLQQALWQ ALEAHPKVTLRVPGSLIALHRHDDLQELELKGGETIRAKLVIGADGANSQVRQMAGIGVH AWQYAQSCMLISVQCENDPGDSTWQQFTPDGPRAFLPLFDNWASLVWYDSPARIRQLQNM NMAQLQAEIAKHFLSRLGYVTPLAAGAFPLTRRHALQYVQPGLALVGDAAHTIHPLAGQG VNLGYRDVDALIDVLVNARSYGEAWASYPILKRYQMRRMADNFIMQSGMDLFYAGFSNNL PPLRFMRNLGLMAAERAGVLKRQALKYALGL Prediction of potential genes in microbial genomes Time: Mon May 16 16:26:08 2011 Seq name: gi|296493084|gb|ADTK01000417.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1430.9, whole genome shotgun sequence Length of sequence - 385 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 16:26:12 2011 Seq name: 
gi|296493083|gb|ADTK01000418.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1430.10, whole genome shotgun sequence
Length of sequence - 10442 bp
Number of predicted genes - 7, with homology - 7
Number of transcription units - 4, operones - 1 average op.length - 4.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 30 - 1694 1876 ## COG0367 Asparagine synthase (glutamine-hydrolyzing)
    - Prom 1785 - 1844 8.9
2 2 Op 1 5/1.000 - CDS 2091 - 2843 843 ## COG0647 Predicted sugar phosphatases of the HAD superfamily
3 2 Op 2 7/0.000 - CDS 2891 - 4111 261 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase
4 2 Op 3 12/0.000 - CDS 4120 - 5268 1256 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase
    - Term 5280 - 5317 2.1
5 2 Op 4 . - CDS 5328 - 6128 939 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase
    - Prom 6266 - 6325 5.5
6 3 Tu 1 . + CDS 6461 - 8407 2006 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific
    + Term 8437 - 8495 2.2
    + Prom 8533 - 8592 7.7
7 4 Tu 1 .
+ CDS 8675 - 10339 2067 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases Predicted protein(s) >gi|296493083|gb|ADTK01000418.1| GENE 1 30 - 1694 1876 554 aa, chain - ## HITS:1 COG:asnB KEGG:ns NR:ns ## COG: asnB COG0367 # Protein_GI_number: 16128650 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthase (glutamine-hydrolyzing) # Organism: Escherichia coli K12 # 1 554 1 554 554 1147 99.0 0 MCSIFGVFDIKTDAVELRKKALELSRLMRHRGPDWSGIYASDNAILAHERLSIVDVNAGA QPLYNQQKTHVLAVNGEIYNHQALRAEYGDRYQFQTGSDCEVILALYQEKGPEFLDDLQG MFAFALYDSEKDAYLIGRDHLGIIPLYMGYDEHGQLYVASEMKALVPVCRTIKEFPAGSY LWSQDGEIRSYYHRDWFDYDAVKDNVTDKNELRQALEDSVKSHLMSDVPYGVLLSGGLDS SIISAITKKYAARRVEDQERSEAWWPQLHSFAVGLPGSPDLKAAQEVANHLGTVHHEIHF TVQEGLDAIRDVIYHIETYDVTTIRASTPMYLMSRKIKAMGIKMVLSGEGSDEVFGGYLY FHKAPNAKELHEETVRKLLALHMYDCARANKAMSAWGVEARVPFLDKKFLDVAMRINPQD KMCGNGKMEKHILRECFEAYLPASVAWRQKEQFSDGVGYSWIDTLKEVAAQQVTDQQLET ARFRFPYNTPTSKEAYLYREIFEELFPLPSAAECVPGGPSVACSSAKAIEWDEAFKKMDD PSGRAVGVHQSAYK >gi|296493083|gb|ADTK01000418.1| GENE 2 2091 - 2843 843 250 aa, chain - ## HITS:1 COG:ECs0705 KEGG:ns NR:ns ## COG: ECs0705 COG0647 # Protein_GI_number: 15829959 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar phosphatases of the HAD superfamily # Organism: Escherichia coli O157:H7 # 1 250 1 250 250 517 100.0 1e-147 MTIKNVICDIDGVLMHDNVAVPGAAEFLHGIMDKGLPLVLLTNYPSQTGQDLANRFATAG VDVPDSVFYTSAMATADFLRRQEGKKAYVVGEGALIHELYKAGFTITDVNPDFVIVGETR SYNWDMMHKAAYFVANGARFIATNPDTHGRGFYPACGALCAGIEKISGRKPFYVGKPSPW IIRAALNKMQAHSEETVIVGDNLRTDILAGFQAGLETILVLSGVSSLDDIDSMPFRPSWI YPSVAEIDVI >gi|296493083|gb|ADTK01000418.1| GENE 3 2891 - 4111 261 406 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 129 390 47 317 319 105 27 2e-22 MTPGGQAQIGNVDLVKQLNSAAVYRLIDQYGPISRIQIAEQSQLAPASVTKITRQLIERG LIKEVDQQASTGGRRAISIVTETRNFHAIGVRLGRHDATITLFDLSSKVLAEEHYPLPER TQQTLEHALLNAIAQFIDSYQRKLRELIAISVILPGLVDPDSGKIHYMPHIQVENWGLVE 
ALEERFKVTCFVGHDIRSLALAEHYFGASQDCEDSILVRVHRGTGAGIISNGRIFIGRNG NVGEIGHIQVEPLGERCHCGNFGCLETIAANAAIEQRVLNLLKQGYQSRVPLDDCTIKTI CKAANKGDSLASEVIEYVGRHLGKTIAIAINLFNPQKIVIAGEITEADKVLLPAIESCIN TQALKAFRTNLPVVRSELDHRSAIGAFALVKRAMLNGILLQHLLEN >gi|296493083|gb|ADTK01000418.1| GENE 4 4120 - 5268 1256 382 aa, chain - ## HITS:1 COG:ECs0707 KEGG:ns NR:ns ## COG: ECs0707 COG1820 # Protein_GI_number: 15829961 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Escherichia coli O157:H7 # 1 382 1 382 382 769 100.0 0 MYALTQGRIFTGHEFLDDHAVVIADGLIKSVCPVAELPPEIEQRSLNGAILSPGFIDVQL NGCGGVQFNDTAEAVSVETLEIMQKANEKSGCTNYLPTLITTSDELMKQGVRVMREYLAK HPNQALGLHLEGPWLNLVKKGTHNPNFVRKPDAALVDFLCENADVITKVTLAPEMVPAEV ISKLANAGIVVSAGHSNATLKEAKAGFRAGITFATHLYNAMPYITGREPGLAGAILDEAD IYCGIIADGLHVDYANIRNAKRLKGDKLCLVTDATAPAGANIEQFIFAGKTIYYRNGLCV DENGTLSGSSLTMIEGVRNLVEHCGIALDEVLRMATLYPARAIGVEKRLGTLAAGKVANL TAFTPDFKITKTIVNGNEVVTQ >gi|296493083|gb|ADTK01000418.1| GENE 5 5328 - 6128 939 266 aa, chain - ## HITS:1 COG:ECs0708 KEGG:ns NR:ns ## COG: ECs0708 COG0363 # Protein_GI_number: 15829962 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Escherichia coli O157:H7 # 1 266 1 266 266 560 100.0 1e-159 MRLIPLTTAEQVGKWAARHIVNRINAFKPTADRPFVLGLPTGGTPMTTYKALVEMHKAGQ VSFKHVVTFNMDEYVGLPKEHPESYYSFMHRNFFDHVDIPAENINLLNGNAPDIDAECRQ YEEKIRSYGKIHLFMGGVGNDGHIAFNEPASSLASRTRIKTLTHDTRVANSRFFDNDVNQ VPKYALTVGVGTLLDAEEVMILVLGSQKALALQAAVEGCVNHMWTISCLQLHPKAIMVCD EPSTMELKVKTLRYFNELEAENIKGL >gi|296493083|gb|ADTK01000418.1| GENE 6 6461 - 8407 2006 648 aa, chain + ## HITS:1 COG:ECs0709_1 KEGG:ns NR:ns ## COG: ECs0709_1 COG1263 # Protein_GI_number: 15829963 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Escherichia coli O157:H7 # 1 389 1 389 390 688 100.0 0 
MNILGFFQRLGRALQLPIAVLPVAALLLRFGQPDLLNVAFIAQAGGAIFDNLALIFAIGV ASSWSKDSAGAAALAGAVGYFVLTKAMVTINPEINMGVLAGIITGLVGGAAYNRWSDIKL PDFLSFFGGKRFVPIATGFFCLVLAAIFGYVWPPVQHAIHAGGEWIVSAGALGSGIFGFI NRLLIPTGLHQVLNTIAWFQIGEFTNAAGTVFHGDINRFYAGDGTAGMFMSGFFPIMMFG LPGAALAMYFAAPKERRPMVGGMLLSVAVTAFLTGVTEPLEFLFMFLAPLLYLLHALLTG ISLFVATLLGIHAGFSFSAGAIDYALMYNLPAASQNVWMLLVMGVVFFAIYFVVFSLVIR MFNLKTPGREDKEDEIVTEEANSNTEEGLTQLATNYIAAVGGTDNLKAIDACITRLRLTV ADSARVNDAMCKRLGASGVVKLNKQTIQVIVGAKAESIGDAMKKVVARGPVAAASAEATP ATAAPVAKPQAVPNAVSIAELVSPITGDVVALDQVPDEAFASKAVGDGVAVKPTDKIVVS PAAGTIVKIFNTNHAFCLETEKGAEIVVHMGIDTVALEGKGFKRLVEEGAQVRAGQPILE MDLDYLNANARSMISPVVCSNIDDFSGLIIKAQGHVVAGQTPLYEIKK >gi|296493083|gb|ADTK01000418.1| GENE 7 8675 - 10339 2067 554 aa, chain + ## HITS:1 COG:glnS KEGG:ns NR:ns ## COG: glnS COG0008 # Protein_GI_number: 16128656 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Escherichia coli K12 # 1 554 1 554 554 1169 100.0 0 MSEAEARPTNFIRQIIDEDLASGKHTTVHTRFPPEPNGYLHIGHAKSICLNFGIAQDYKG QCNLRFDDTNPVKEDIEYVESIKNDVEWLGFHWSGNVRYSSDYFDQLHAYAIELINKGLA YVDELTPEQIREYRGTLTQPGKNSPYRDRSVEENLALFEKMRAGGFEEGKACLRAKIDMA SPFIVMRDPVLYRIKFAEHHQTGNKWCIYPMYDFTHCISDALEGITHSLCTLEFQDNRRL YDWVLDNITIPVHPRQYEFSRLNLEYTVMSKRKLNLLVTDKHVEGWDDPRMPTISGLRRR GYTAASIREFCKRIGVTKQDNTIEMASLESCIREDLNENAPRAMAVIDPVKLVIENYQGE GEMVTMPNHPNKPEMGSRQVPFSGEIWIDRADFREEANKQYKRLVLGKEVRLRNAYVIKA ERVEKDAEGNITTIFCTYDADTLSKDPADGRKVKGVIHWVSAAHALPVEIRLYDRLFSVP NPGAADDFLSVINPESLVIKQGFAEPSLKDAVAGKAFQFEREGYFCLDSRHSTAEKPVFN RTVGLRDTWAKVGE Prediction of potential genes in microbial genomes Time: Mon May 16 16:26:18 2011 Seq name: gi|296493082|gb|ADTK01000419.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1430.11, whole genome shotgun sequence Length of sequence - 19828 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 8, operones - 5 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
+ CDS 428 - 1834 1540 ## B21_00629 hypothetical protein
2 1 Op 2 . + CDS 1884 - 2210 445 ## EcE24377A_0710 putative lipoprotein
    + Term 2211 - 2242 0.1
    - Term 2245 - 2274 2.1
3 2 Tu 1 6/0.500 - CDS 2295 - 2741 434 ## COG0735 Fe2+/Zn2+ uptake regulation proteins
    - Prom 2812 - 2871 7.0
    - Term 2985 - 3024 9.1
4 3 Tu 1 . - CDS 3030 - 3560 707 ## COG0716 Flavodoxins
    - Prom 3617 - 3676 2.9
5 4 Op 1 . - CDS 3700 - 4062 394 ## LF82_2603 uncharacterized protein YbfE
6 4 Op 2 . - CDS 4133 - 4897 739 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily)
    - Prom 4930 - 4989 2.4
    + Prom 4886 - 4945 4.4
7 5 Op 1 6/0.500 + CDS 5082 - 5627 409 ## COG3057 Negative regulator of replication initiationR
8 5 Op 2 . + CDS 5653 - 7293 1960 ## COG0033 Phosphoglucomutase
    + Term 7304 - 7350 9.5
    - Term 7294 - 7339 -0.4
9 6 Op 1 10/0.000 - CDS 7349 - 8668 1393 ## COG0531 Amino acid transporters
10 6 Op 2 2/1.000 - CDS 8665 - 10863 2090 ## COG1982 Arginine/lysine/ornithine decarboxylases
    - Prom 11061 - 11120 5.8
11 7 Op 1 16/0.000 - CDS 11508 - 12110 688 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
12 7 Op 2 15/0.000 - CDS 12182 - 14866 2328 ## COG2205 Osmosensitive K+ channel histidine kinase
13 7 Op 3 18/0.000 - CDS 14859 - 15431 495 ## COG2156 K+-transporting ATPase, c chain
14 7 Op 4 20/0.000 - CDS 15440 - 17488 2425 ## COG2216 High-affinity K+ transport system, ATPase chain B
15 7 Op 5 . - CDS 17511 - 19184 1833 ## COG2060 K+-transporting ATPase, A chain
16 7 Op 6 . - CDS 19184 - 19387 85 ## EcE24377A_0725 K+-transporting ATPase, F subunit (EC:3.6.3.12)
    - Prom 19488 - 19547 5.2
    + Prom 19356 - 19415 9.8
17 8 Tu 1 .
+ CDS 19586 - 19792 252 ## ECO103_0695 hypothetical protein Predicted protein(s) >gi|296493082|gb|ADTK01000419.1| GENE 1 428 - 1834 1540 468 aa, chain + ## HITS:1 COG:no KEGG:B21_00629 NR:ns ## KEGG: B21_00629 # Name: ybfM # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 468 1 468 468 880 100.0 0 MRTFSGKRSTLALAIAGVTAMSGFMAMPEARAEGFIDDSTLTGGIYYWQRERDRKDVTDG DKYKTNLSHSTWNANLDFQSGYAADMFGLDIAAFTAIEMAENGDSSHPNEIAFSKSNKAY DEDWSGDKSGISLYKAAAKFKYGPVWARAGYIQPTGQTLLAPHWSFMPGTYQGAEAGANF DYGDAGALSFSYMWTNEYKAPWHLEMDEFYQNDKTTKVDYLHSLGAKYDFKNNFVLEAAF GQAEGYIDQYFAKASYKFDIAGSPLTTSYQFYGTRDKVDDRSVNDLYDGTAWLQALTFGY RAADVVDLRLEGTWVKADGQQGYFLQRMTPTYASSNGRLDIWWDNRSDFNANGEKAVFFG AMYDLKNWNLPGFAIGASYVYAWDAKPATWQSNPDAYYDKNRTIEESAYSLDAVYTIQDG RAKGTMFKLHFTEYDNHSDIPSWGGGYGNIFQDERDVKFMVIAPFTIF >gi|296493082|gb|ADTK01000419.1| GENE 2 1884 - 2210 445 108 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_0710 NR:ns ## KEGG: EcE24377A_0710 # Name: not_defined # Def: putative lipoprotein # Organism: E.coli_E24377A # Pathway: not_defined # 1 108 1 108 108 177 99.0 1e-43 MKKLILIAMMASGLVACAQSTAPQEDSRLKEAYSACINTAQGSPEKIEACQSVLNVLKKE KQHQHFADQESVRVLDYQQCLRATQTGNDQAVKADCDKVWQEIRSNNK >gi|296493082|gb|ADTK01000419.1| GENE 3 2295 - 2741 434 148 aa, chain - ## HITS:1 COG:ECs0714 KEGG:ns NR:ns ## COG: ECs0714 COG0735 # Protein_GI_number: 15829968 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Escherichia coli O157:H7 # 1 148 1 148 148 279 100.0 1e-75 MTDNNTALKKAGLKVTLPRLKILEVLQEPDNHHVSAEDLYKRLIDMGEEIGLATVYRVLN QFDDAGIVTRHNFEGGKSVFELTQQHHHDHLICLDCGKVIEFSDDSIEARQREIAAKHGI RLTNHSLYLYGHCAEGDCREDEHAHEGK >gi|296493082|gb|ADTK01000419.1| GENE 4 3030 - 3560 707 176 aa, chain - ## HITS:1 COG:ECs0715 KEGG:ns NR:ns ## COG: ECs0715 COG0716 # Protein_GI_number: 15829969 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Escherichia coli O157:H7 # 1 176 1 176 176 348 100.0 3e-96 
MAITGIFFGSDTGNTENIAKMIQKQLGKDVADVHDIAKSSKEDLEAYDILLLGIPTWYYG EAQCDWDDFFPTLEEIDFNGKLVALFGCGDQEDYAEYFCDALGTIRDIIEPRGATIVGHW PTAGYHFEASKGLADDDHFVGLAIDEDRQPELTAERVEKWVKQISEELHLDEILNA >gi|296493082|gb|ADTK01000419.1| GENE 5 3700 - 4062 394 120 aa, chain - ## HITS:1 COG:no KEGG:LF82_2603 NR:ns ## KEGG: LF82_2603 # Name: ybfE # Def: uncharacterized protein YbfE # Organism: E.coli_LF82 # Pathway: not_defined # 1 120 1 120 120 190 99.0 1e-47 MYYGALSIRAEAWLIVSPEVTKIMAKEQTDRTTLDLFAHERRPGRPKTNPLSRDEQLRIN KRNQLKRDKVRGLKRVELKLNAEAVEALNELAESRNMSRSELIEEMLMQQLAALRSQGIV >gi|296493082|gb|ADTK01000419.1| GENE 6 4133 - 4897 739 254 aa, chain - ## HITS:1 COG:ybfF KEGG:ns NR:ns ## COG: ybfF COG0596 # Protein_GI_number: 16128662 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Escherichia coli K12 # 1 254 1 254 254 494 98.0 1e-140 MKLNIRAQTAQNQHNNSPIVLVHGLFGSLDNLGVLARDLVNDHNIIQVDMRNHGLSPREP VMNYPAMAQDLVDTLDALQIDKATFIGHSMGGKAVMALTALAPDRIDKLVAIDIAPVDYH VRRHDEIFAAINAVSESDAQTRQQAAAIMRQHLNEEGVIQFLLKSFVDGEWRFNVPVLWD QYPHIVGWEKIPAWDHPALFIPGGNSPYVSEQYRDDLLAQFPQARAHVIAGAGHWVHAEK PDAVLRAIRRYLND >gi|296493082|gb|ADTK01000419.1| GENE 7 5082 - 5627 409 181 aa, chain + ## HITS:1 COG:seqA KEGG:ns NR:ns ## COG: seqA COG3057 # Protein_GI_number: 16128663 # Func_class: L Replication, recombination and repair # Function: Negative regulator of replication initiationR # Organism: Escherichia coli K12 # 1 181 1 181 181 344 100.0 5e-95 MKTIEVDDELYSYIASHTKHIGESASDILRRMLKFSAASQPAAPVTKEVRVASPAIVEAK PVKTIKDKVRAMRELLLSDEYAEQKRAVNRFMLLLSTLYSLDAQAFAEATESLHGRTRVY FAADEQTLLKNGNQTKPKHVPGTPYWVITNTNTGRKCSMIEHIMQSMQFPAELIEKVCGT I >gi|296493082|gb|ADTK01000419.1| GENE 8 5653 - 7293 1960 546 aa, chain + ## HITS:1 COG:pgm KEGG:ns NR:ns ## COG: pgm COG0033 # Protein_GI_number: 16128664 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglucomutase # Organism: Escherichia coli K12 # 1 546 1 546 546 1080 99.0 
0 MAIHNRAGQPAQQSDLINVAQLTAQYYVLKPEAGNAEHAVKFGTSGHRGSAARHSFNEPH ILAIAQAIAEERAKNGITGPCYVGKDTHALSEPAFISVLEVLAANGVDVIVQENNGFTPT PAISNAILVHNKKGGPLADGIVITPSHNPPEDGGIKYNPPNGGPADTNVTKVVEDRANAL MADGLKGVKRISLDEAMASGHVKEQDLVQPFVEGLADIVDMAAIQKAGLTLGVDPLGGSG IEYWKRIGEYYNLNLTIVNDQVDQTFRFMHLDKDGAIRMDCSSECAMAGLLALRDKFDLA FANDPDYDRHGIVTPAGLMNPNHYLAVAINYLFQHRPQWGKDVAVGKTLVSSAMIDRVVN DLGRKLVEVPVGFKWFVDGLFDGSFGFGGEESAGASFLRFDGTPWSTDKDGIIMCLLAAE ITAVTGKNPQEHYNELAKRFGAPSYNRLQAAATSAQKAALSKLSPEMVSASTLAGDPITA RLTAAPGNGASIGGLKVMTDNGWFAARPSGTEDAYKIYCESFLGEEHRKQIEKEAVEIVS EVLKNA >gi|296493082|gb|ADTK01000419.1| GENE 9 7349 - 8668 1393 439 aa, chain - ## HITS:1 COG:potE KEGG:ns NR:ns ## COG: potE COG0531 # Protein_GI_number: 16128668 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 439 1 439 439 770 100.0 0 MSQAKSNKMGVVQLTILTMVNMMGSGIIMLPTKLAEVGTISIISWLVTAVGSMALAWAFA KCGMFSRKSGGMGGYAEYAFGKSGNFMANYTYGVSLLIANVAIAISAVGYGTELLGASLS PVQIGLATIGVLWICTVANFGGARITGQISSITVWGVIIPVVGLCIIGWFWFSPTLYVDS WNPHHAPFFSAVGSSIAMTLWAFLGLESACANTDVVENPERNVPIAVLGGTLGAAVIYIV STNVIAGIVPNMELANSTAPFGLAFAQMFTPEVGKVIMALMVMSCCGSLLGWQFTIAQVF KSSSDEGYFPKIFSRVTKVDAPVQGMLTIVIIQSGLALMTISPSLNSQFNVLVNLAVVTN IIPYILSMAALVIIQKVANVPPSKAKVANFVAFVGAMYSFYALYSSGEEAMLYGSIVTFL GWTLYGLVSPRFELKNKHG >gi|296493082|gb|ADTK01000419.1| GENE 10 8665 - 10863 2090 732 aa, chain - ## HITS:1 COG:ECs0721 KEGG:ns NR:ns ## COG: ECs0721 COG1982 # Protein_GI_number: 15829975 # Func_class: E Amino acid transport and metabolism # Function: Arginine/lysine/ornithine decarboxylases # Organism: Escherichia coli O157:H7 # 1 732 1 732 732 1537 98.0 0 MSELKIAVSRSCPDCFSTHRACVNIDESNYIDVAAIILSVSDVERGKLDEIDATGYDIPV FIATENEERVPAEYLPRISGVFEHCESRKEFYGRQLETAASHYETQLRPPFFRALVDYVN QGNSAFDCPGHQGGEFFRRHPAGNQFVEYFGEMLFRSDLCNADVAMGDLLIHEGAPCIAQ QHAAKVFNADKTYFVLNGTSSSNKVVLNALLTPGDLVLFDRNNHKSNHHGALLQAGATPV YLETARNPYGFIGGIDAHCFEESYLRELITEVAPQRAKETRPFRLAVIQLGTYDGTIYNA 
RQVVDKIGHLCDYILFDSAWVGYEQFIPMMADCSPLLLDLNENDPGILVTQSVHKQQAGF SQTSQIHKKDSHIKGQQRYVPHKRMNNAFMMHASTSPFYPLFAALDINAKMHEGVSGRNM WMDCVVNGINARKLILDNCQHIRPFVPELVDGKPWQSYETAQIAVDLRFFQFVPGEHWHS FEGYAENQYFVDPCKLLLTTPGIDARNGEYEAFGVPATILANFLRENGVVPEKCDLNSIL FLLTPAEDMAKLQQLVALLVRFEKLLESDAPLAEVLPSIYKQHEERYAGYTLRQLCQEMH DLYARHNVKQLQKEMFRKEHFPRVSMNPQEANYAYLRGEVELVRLPDAEGRIAAEGALPY PPGVLCVVPGEIWGGAVLRYFSALEEGINLLPGFAPELQGVYIEEHDGRKQVWCYVIKPR DAQSTLLKGEKL >gi|296493082|gb|ADTK01000419.1| GENE 11 11508 - 12110 688 200 aa, chain - ## HITS:1 COG:ECs0722 KEGG:ns NR:ns ## COG: ECs0722 COG0745 # Protein_GI_number: 15829976 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 199 26 224 225 385 99.0 1e-107 MRVYEAETLQRGLLEAATRKPDLIILDLGLPDGDGIEFIRDLRQWSAVPVIVLSARSEES DKIAALDAGADDYLSKPFGIGELQARLRVALRRHSATAAPDPLVKFSDVTVDLAARVIHR GEEEVHLTPIEFRLLAVLLNNAGKVLTQRQLLNQVWGPNAVEHSHYLRIYMGHLRQKLEQ DPARPRHFITETGIGYRFMP >gi|296493082|gb|ADTK01000419.1| GENE 12 12182 - 14866 2328 894 aa, chain - ## HITS:1 COG:ECs0723 KEGG:ns NR:ns ## COG: ECs0723 COG2205 # Protein_GI_number: 15829977 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Escherichia coli O157:H7 # 1 894 1 894 894 1761 99.0 0 MNNEPLRPDPDRLLEQTAAPHRGKLKVFFGACAGVGKTWAMLAEAQRLRTQGLDIVVGVV ETHGRKDTAAMLEELAVLPPKRLAYRGRHISEFDLDAALARRPALILMDELAHSNAPGSR HPKRWQDIEELLEAGIDVFTTVNVQHLESLNDVVSGVTGIQVRETVPDPFFDAADDVVLV DLPPDDLRQRLKEGKVYIAGQAERAIEHFFRKGNLIALRELALRRTADRVDEQMRAWRGH PGEEKVWHTRDAILLCIGHNTGSEKLVRAAARLASRLGSVWHAVYVETPALHRLPEKKRR AILSALRLAQELGAETATLSDPAEEKAVVRYAREHNLGKIILGRPASRRWWRRETFADRL ARIAPDLDQVLVALDEPPARTINNAPDSRSFKDKWRVQIQGCVVAAALCAVITLIAMQWL MAFDAANLVMLYLLGVVVVALFYGRWPSVVATVINVVSFDLFFIAPRGTLAVSDVQYLLT FAVMLTVGLVIGNLTAGVRYQARVARYREQRTRHLYEMSKALAVGRSPQDIAATSEQFIA 
STFHARSQVLLPDDNGKLQPLTHPQGMTPWDDAIAQWSFDKGLPAGAGTDTLPGVPYQIL PLKSGEKTYGLVVVEPGNLRQLMIPEQQRLLETFTLLVANALERLTLTASEEQARMASER EQIRNALLAALSHDLRTPLTVLFGQAEILTLDLASEGSPHARQASEIRQHVLNTTRLVNN LLDMARIQSGGFNLKKEWLTLEEVVGSALQMLEPGLSSPINLSLPEPLTLIHVDGPLFER VLINLLENAVKYAGAQAEIGIDAHVEGENLQLDVWDNGPGLPPGQEQTIFDKFARGNKES AVPGVGLGLAICRAIVDVHGGTITAFNRPEGGACFRVTLPQQTAPELEDFHEDM >gi|296493082|gb|ADTK01000419.1| GENE 13 14859 - 15431 495 190 aa, chain - ## HITS:1 COG:kdpC KEGG:ns NR:ns ## COG: kdpC COG2156 # Protein_GI_number: 16128672 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, c chain # Organism: Escherichia coli K12 # 1 190 1 190 190 337 99.0 7e-93 MSGLRPALSTFIFLLLITGGVYPLLTTVLGQWWFPWQANGSLIREGDTVRGSALIGQNFT GNGYFHGRPSATAEMPYNPQASGGSNLAVSNPELDKLIAARVAALRAANPDASASVPVEL VTASASGLDNNITPQAAAWQIPRVAKARNLSVEQLTQLIAKYSQQPLVKYIGQPVINIVE LNLALDKLDE >gi|296493082|gb|ADTK01000419.1| GENE 14 15440 - 17488 2425 682 aa, chain - ## HITS:1 COG:ECs0725 KEGG:ns NR:ns ## COG: ECs0725 COG2216 # Protein_GI_number: 15829979 # Func_class: P Inorganic ion transport and metabolism # Function: High-affinity K+ transport system, ATPase chain B # Organism: Escherichia coli O157:H7 # 1 682 1 682 682 1246 99.0 0 MSRKQLALFEPTLVVQALKEAVKKLNPQAQWRNPVMFIVWIGSLLTTCISIAMASGAMPG NALFSAAISGWLWVTVLFANFAEALAEGRSKAQANSLKGVKKTAFARKLREPKYGAAADK VPADQLRKGDIVLVEAGDIIPCDGEVIEGGASVDESAITGESAPVIRESGGDFASVTGGT RILSDWLVIECSVNPGETFLDRMIAMVEGAQRRKTPNEIALTILLIALTIVFLLATATLW PFSAWGGNAVSVTVLVALLVCLIPTTIGGLLSAIGVAGMSRMLGANVIATSGRAVEAAGD VDVLLLDKTGTITLGNRQASEFIPAQGVDEKTLADAAQLASLADETPEGRSIVILAKQRF NLRERDVQSLHATFVPFTAQSRMSGINIDNRMIRKGSVDAIRRHVEANGGHFPTDVDQKV DQVARQGATPLVVVEGSRVLGVIALKDIVKGGIKERFAQLRKMGIKTVMITGDNRLTAAA IAAEAGVDDFLAEATPEAKLALIRQYQAEGRLVAMTGDGTNDAPALAQADVAVAMNSGTQ AAKEAGNMVDLDSNPTKLIEVVHIGKQMLMTRGSLTTFSIANDVAKYFAIIPAAFAATYP QLNALNIMCLHSPDSAILSAVIFNALIIVFLIPLALKGVSYKPLTASAMLRRNLWIYGLG GLLVPFIGIKVIDLLLTVCGLV >gi|296493082|gb|ADTK01000419.1| GENE 15 17511 - 19184 1833 557 aa, chain - 
## HITS:1 COG:kdpA KEGG:ns NR:ns ## COG: kdpA COG2060 # Protein_GI_number: 16128674 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, A chain # Organism: Escherichia coli K12 # 1 557 1 557 557 979 99.0 0 MAAQGFLLIATFLLVLMVLARPLGSGLARLINDIPLPGTTGVERVLFRALGVSDREMNWK QYLCAILGLNMLGLAVLFFMLLGQHYLPLNPQQLPGLSWDLALNTAVSFVTNTNWQSYSG ETTLSYFSQMAGLTVQNFLSAASGIAVIFALIRAFTRQSMSTLGNAWVDLLRITLWVLVP VALLIALFFIQQGALQNFQPYQAVNTVEGAQQLLPMGPVASQEAIKMLGTNGGGFFNANS SHPFENPTALTNFVQMLAIFLIPTALCFAFGEVTGDRRQGRMLLWAMSVIFVICVGVVMW AEVQGNPHLLALGADSSINMEGKESRFGVLVSSLFAVVTTAASCGAVIAMHDSFTALGGM VPMWLMQIGEVVFGGVGSGLYGMMLFVLLAVFIAGLMIGRTPEYLGKKIDVREMKLTALA ILVTPTLVLMGAALAMMTDAGRSAMLNPGPHGFSEVLYAVSSAANNNGSAFAGLSANSPF WNCLLAFCMFVGRFGVIIPVMAIAGSLVSKKSQPASSGTLPTHGPLFVGLLIGTVLLVGA LTFIPALALGPVAEYLS >gi|296493082|gb|ADTK01000419.1| GENE 16 19184 - 19387 85 67 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_0725 NR:ns ## KEGG: EcE24377A_0725 # Name: kdpF # Def: K+-transporting ATPase, F subunit (EC:3.6.3.12) # Organism: E.coli_E24377A # Pathway: Two-component system [PATH:ecw02020] # 1 67 1 67 67 93 100.0 2e-18 MAFAIFILFLHPARRFLRNLCSQNSTLPVSLLGHWRCTVSAGVITGVLLVFLLLGYLVYA LINAEAF >gi|296493082|gb|ADTK01000419.1| GENE 17 19586 - 19792 252 68 aa, chain + ## HITS:1 COG:no KEGG:ECO103_0695 NR:ns ## KEGG: ECO103_0695 # Name: ybfA # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 68 1 68 68 117 100.0 1e-25 MELYREYPAWLIFLRRTYAVAAGVLALPFMLFWKDRARFYSYLHRVWSKTSDKPVWMDQA EKATGDFY Prediction of potential genes in microbial genomes Time: Mon May 16 16:26:33 2011 Seq name: gi|296493081|gb|ADTK01000420.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1430.12, whole genome shotgun sequence Length of sequence - 170 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 16:26:37 2011 Seq name: gi|296493080|gb|ADTK01000421.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1457.1, whole genome shotgun sequence Length 
of sequence - 12509 bp
Number of predicted genes - 18, with homology - 18
Number of transcription units - 4, operones - 2 average op.length - 8.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Op 1 . - CDS 26 - 487 412 ## E2348_P1_058 conjugal transfer mating pair stabilization protein TraN
2 1 Op 2 . - CDS 484 - 1122 469 ## APECO1_O1CoBM43 conjugal transfer pilus assembly protein TrbC
3 1 Op 3 . - CDS 1131 - 2123 751 ## APECO1_O1CoBM42 conjugal transfer pilus assembly protein TraU
4 1 Op 4 . - CDS 2120 - 2752 518 ## pECS88_0079 conjugal transfer pilus assembly protein TraW
5 1 Op 5 . - CDS 2749 - 3135 296 ## APECO1_O1CoBM40 conjugal transfer protein TrbI
6 1 Op 6 . - CDS 3132 - 5759 2495 ## COG3451 Type IV secretory pathway, VirB4 components
    - Term 5803 - 5852 1.0
7 2 Tu 1 . - CDS 5919 - 6140 133 ## COG1734 DnaK suppressor protein
    - Prom 6198 - 6257 3.1
    - Term 6224 - 6252 2.1
8 3 Op 1 . - CDS 6275 - 6790 292 ## E2348_P1_065 conjugal transfer protein TraV
9 3 Op 2 . - CDS 6787 - 7038 202 ## pECS88_0073 conjugal transfer protein TrbG
10 3 Op 3 . - CDS 7050 - 7247 161 ## E2348_P1_067 conjugal transfer protein TrbD
11 3 Op 4 . - CDS 7234 - 7824 278 ## p1ECUMN_0106 conjugal transfer protein TraP
12 3 Op 5 . - CDS 7814 - 9241 1252 ## EcSMS35_A0035 conjugal transfer pilus assembly protein TraB
13 3 Op 6 . - CDS 9241 - 9969 532 ## p1ECUMN_0108 conjugal transfer protein TraK
14 3 Op 7 . - CDS 9956 - 10522 522 ## EcSMS35_A0037 conjugal transfer pilus assembly protein TraE
15 3 Op 8 . - CDS 10544 - 10855 363 ## EcSMS35_A0038 conjugal transfer pilus assembly protein TraL
16 3 Op 9 . - CDS 10870 - 11235 344 ## pECS88_0066 conjugal transfer pilin subunit TraA
17 3 Op 10 . - CDS 11268 - 11663 203 ## ECO111_p3-60 conjugal transfer protein TraY
    - Prom 11701 - 11760 3.6
18 4 Tu 1 .
- CDS 11762 - 12451 190 ## pECS88_0064 conjugal transfer transcriptional regulator TraJ Predicted protein(s) >gi|296493080|gb|ADTK01000421.1| GENE 1 26 - 487 412 153 aa, chain - ## HITS:1 COG:no KEGG:E2348_P1_058 NR:ns ## KEGG: E2348_P1_058 # Name: traN # Def: conjugal transfer mating pair stabilization protein TraN # Organism: E.coli_0127 # Pathway: not_defined # 1 153 1 153 602 301 99.0 4e-81 MKRILPLILALVAGMAQADSNSDYRAGSDFAHQIKGQGSSSIQGFKPQESIPGYNANPDE TKYYGGVTAGGDGGLKNDGTTEWATGETGKTITESFMNKPKDILSPDAPFIQTGRDVVNR ADSIVGNTGQQCSAQEISRSEYTNYTCERDLQV >gi|296493080|gb|ADTK01000421.1| GENE 2 484 - 1122 469 212 aa, chain - ## HITS:1 COG:no KEGG:APECO1_O1CoBM43 NR:ns ## KEGG: APECO1_O1CoBM43 # Name: trbC # Def: conjugal transfer pilus assembly protein TrbC # Organism: E.coli_APEC # Pathway: not_defined # 1 212 2 213 213 409 99.0 1e-113 MKLSMKSLAALLMMLNGAVMASENVNTPENRQFLKQQENLSRQLREKPDHQLKAWAEKQV LETPLQRSDNHFLDELVRKQQASQDGKPRQGALYFVSFSIPEEGLKRMLGETRHYGIPAT LRGMVNNDLKTTAEAVLSLVKDGATDGVQIDPTLFSQYGIRTVPALVVFCSQGYDIIRGN LRVGQALEKVAATGDCRQVAHDLLAGKGDSGK >gi|296493080|gb|ADTK01000421.1| GENE 3 1131 - 2123 751 330 aa, chain - ## HITS:1 COG:no KEGG:APECO1_O1CoBM42 NR:ns ## KEGG: APECO1_O1CoBM42 # Name: traU # Def: conjugal transfer pilus assembly protein TraU # Organism: E.coli_APEC # Pathway: not_defined # 1 330 1 330 330 657 99.0 0 MKRRLWLLMLFLFAGHVPAASADSACEGRFVNPITDICWSCIFPLSLGSIKVSQGKVPDT ANPSMPIQICPAPPPLFRRIGLAIGYWEPMALTDVTRSPGCMVNLGFSLPAFGKTAQGTA KKDEKQVNGAFYHVHWYKYPLTYWLNIITSLGCLEGGDLDIAYLSEIDPTWTDSSLTTIL NPEAVIFANPIAQGACAADAIASAFNMPLDVLFWCAGSQGSMYPFNGWVSNESSPLQSSL LVSERMAFKLHRQGMIMETIGKNNAVCNEYPPPILPKERWRYQMVNMYPDSGQCHPFGRS VTRWETGKNPPNTKKNFGYLMWRKRNCVFL >gi|296493080|gb|ADTK01000421.1| GENE 4 2120 - 2752 518 210 aa, chain - ## HITS:1 COG:no KEGG:pECS88_0079 NR:ns ## KEGG: pECS88_0079 # Name: traW # Def: conjugal transfer pilus assembly protein TraW # Organism: E.coli_S88 # Pathway: not_defined # 1 210 1 210 210 416 100.0 1e-115 
MRCRGLIALLIWGQSVAAADLGTWGDLWPVKEPDMLTVIMQRLTALEQSGEMGRKMDAFK ERVIRNSLRPPAVPGIGRTEKYGSRLFDPSVRLAADIRDNEGRVFARQGEVMNPLQYVPF NQTLYFINGDDPAQVAWMKRQTPPTLESKIILVQGSIPEMQKSLDSRVYFDQNGVLCQRL GIDQVPARVSAVPGDRFLKVEFIPAEEGRK >gi|296493080|gb|ADTK01000421.1| GENE 5 2749 - 3135 296 128 aa, chain - ## HITS:1 COG:no KEGG:APECO1_O1CoBM40 NR:ns ## KEGG: APECO1_O1CoBM40 # Name: trbI # Def: conjugal transfer protein TrbI # Organism: E.coli_APEC # Pathway: not_defined # 1 128 1 128 128 214 96.0 6e-55 MSSTQKPADVMAERRSHWWWTVPGCLAMVLLNAAVSYGIVRLNAPVTVAFNMKQTVDAFF DRASQKQLSEAQSKALSARFNTALEASLQAWQQKHHAVILVSPAVVQGAPDITREIQQDI ARRMRAEP >gi|296493080|gb|ADTK01000421.1| GENE 6 3132 - 5759 2495 875 aa, chain - ## HITS:1 COG:PSLT088_2 KEGG:ns NR:ns ## COG: PSLT088_2 COG3451 # Protein_GI_number: 17233453 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Salmonella typhimurium LT2 # 290 872 1 583 593 1141 94.0 0 MNNPLEAVTQAVNSLVTALKLPDESAKANEVLGEMSFPQFSRLLPYRDYNQESGLFMNDT TMGFMLEAIPINGANESIVEALDHMLRTKLPRGIPLCIHLMSSQLVGDRIEYGLREFSWS GEQAERFNAITRAYYMKAAATQFPLPEGVNLPLTLRHYRVFISYCSPSKKKSRADILEME NLVKIIRASLQGASITTQTVDAQAFIDIVGEMINHNPDSLYPKRRQLDPYSDLNYQCVED SFDLKVRADYLTLGLRENGRNSTARILNFHLARNPEIAFLWNVADNYSNLLNPELSISCP FILTLTLVVEDQVKTHSEANLKYMDLEKKSKTSYAKWFPSVEKEAKEWGELRQRLGSGQS SVVSYFLNITAFCKDNNETALEVEQDILNSFRKNGFELISPRFNHMRNFLTCLPFMAGKG LFKQLKEAGVVQRAESFNVANLMPLVADNPLTPAGLLAPTYRNQLAFIDIFFRGMNNTNY NMAVCGTSGAGKTGLIQPLIRSVLDSGGFAVVFDMGDGYKSLCENMGGVYLDGETLRFNP FANITDIDQSAERVRDQLSVMASPNGNLDEVHEGLLLQAVRASWLAKENRARIDDVVDFL KNASDSEQYAESPTIRSRLDEMIVLLDQYTANGTYGQYFNSDEPSLRDDAKMVVLELGGL EDRPSLLVAVMFSLIIYIENRMYRTPRNLKKLNVIDEGWRLLDFKNHKVGEFIEKGYRTA RRHTGAYITITQNIVDFDSDKASSAARAAWGNSSYKIILKQSAKEFAKYNQLYPDQFLPL QRDMIGKFGAAKDQWFSSFLLQVENHSSWHRLFVDPLSRAMYSSDGPDFEFVQQKRKEGL SIHEAVWQLAWKKSGPEMASLEAWLEEHEKYRSVA >gi|296493080|gb|ADTK01000421.1| GENE 7 5919 - 6140 133 73 aa, chain - ## HITS:1 COG:PSLT085 KEGG:ns NR:ns ## COG: 
PSLT085 COG1734 # Protein_GI_number: 17233451 # Func_class: T Signal transduction mechanisms # Function: DnaK suppressor protein # Organism: Salmonella typhimurium LT2 # 1 73 1 73 73 125 84.0 1e-29 MSDEADEAYSVTEQLTMTGINRIRQKINAHGIPVYLCEACGNPIPEARRKIFPGVTLCVE CQAYQERQRKHYA >gi|296493080|gb|ADTK01000421.1| GENE 8 6275 - 6790 292 171 aa, chain - ## HITS:1 COG:no KEGG:E2348_P1_065 NR:ns ## KEGG: E2348_P1_065 # Name: traV # Def: conjugal transfer protein TraV # Organism: E.coli_0127 # Pathway: not_defined # 1 171 1 171 171 272 100.0 4e-72 MKQISFFIPLLGTLLLSGCAGTSTEFECNATTSDTCMTMEQANEKAKKLEQSSEAKPVAA SLPRLAEGNFRTMPVQTVTATTPSGSRPAVTAHPEQKLLAPRPLFTAAREVKTVVPVSSV TPVTPPRPLRTGEQTAALWIAPYIDNQDVYHQPSSVFFVIKPSAWGKPRIN >gi|296493080|gb|ADTK01000421.1| GENE 9 6787 - 7038 202 83 aa, chain - ## HITS:1 COG:no KEGG:pECS88_0073 NR:ns ## KEGG: pECS88_0073 # Name: trbG # Def: conjugal transfer protein TrbG # Organism: E.coli_S88 # Pathway: not_defined # 1 83 1 83 83 157 100.0 1e-37 MNKLVSDGSVKKINYPVLYESGITPPLCEVSAPEPDAGGKRIVAYVYKSSRSTVFENPDI VKTCTVRDLKKDFVNCDEKGEGQ >gi|296493080|gb|ADTK01000421.1| GENE 10 7050 - 7247 161 65 aa, chain - ## HITS:1 COG:no KEGG:E2348_P1_067 NR:ns ## KEGG: E2348_P1_067 # Name: trbD # Def: conjugal transfer protein TrbD # Organism: E.coli_0127 # Pathway: not_defined # 1 61 1 61 106 116 98.0 3e-25 MNMRNINVITALSVPGKTVSDDFMHAVLSNCATRIVLPAPEKFGSESLPDNFNMTAVGVM KNSEI >gi|296493080|gb|ADTK01000421.1| GENE 11 7234 - 7824 278 196 aa, chain - ## HITS:1 COG:no KEGG:p1ECUMN_0106 NR:ns ## KEGG: p1ECUMN_0106 # Name: traP # Def: conjugal transfer protein TraP # Organism: E.coli_UMN026 # Pathway: not_defined # 1 196 1 196 196 380 98.0 1e-104 MANNTSSRQAGHAARYVVARVLRGLFWCLKYTVILPLATMALMALFVLWKDNTTPGKLLV KEINFVRQTAPAGQFPVSECWFSSSDSSGRSEIQDICHYRAADAADYVRETDRSLMQLVT ALWATLALMYVSLAAITGKYPVRPGKMKCIRVVTADEHLKEVYTEDASLPGKIRKCPVYL PDDRTNRNNGDKNEHA >gi|296493080|gb|ADTK01000421.1| GENE 12 7814 - 9241 1252 475 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_A0035 NR:ns ## KEGG: EcSMS35_A0035 
# Name: traB # Def: conjugal transfer pilus assembly protein TraB # Organism: E.coli_SECEC # Pathway: not_defined # 1 475 1 475 475 841 100.0 0 MASINTIVKRKQYLWLGIVVVGTASAIGGALYLSDVDMSGNGETVAEQEPVPDMTGVVDT TFDDKVRQHATTEMQVTAAQMQKQYEEIRRELDVLNKQRGDDQRRIEKLGQDNAALAEQV KALGANPVTATGEPVPQMPASPPGPEGEPQPGNTPVSFPPQGSVAVPPPTAFYPGNGVTP PPQVTYQSVPVPNRIQRKVFTRNEGKQGPSLPYIPSGSFAKAMLIEGADANASVTGNEST VPMQLRITGLVEMPNSKTYDATGCFVGLEAWGDVSSERAIVRTRNISCLKDGKTIDMPIK GHVSFRGKNGIKGEVVMRNGKILGWAWGAGFVDGIGQGMERASQPAVGLGATAAYGAGDV LKMGIGGGASKAAQTLSDYYIKRAEQYHPVIPIGAGNEVTVVFQDGFQLKTVEEMALERT QSRAEEDNPESPVPVPPSAESHLNGFNTDQMLKQLGNLNPQQFMSGSQGGGNDGK >gi|296493080|gb|ADTK01000421.1| GENE 13 9241 - 9969 532 242 aa, chain - ## HITS:1 COG:no KEGG:p1ECUMN_0108 NR:ns ## KEGG: p1ECUMN_0108 # Name: traK # Def: conjugal transfer protein TraK # Organism: E.coli_UMN026 # Pathway: not_defined # 1 242 1 242 242 451 100.0 1e-125 MRKNNTAIIFGSLFFSCSVMAANGTLAPTVVPMVNGGQASIAISNTSPNLFTVPGDRIIA VNSLDGALTNNEQTASGGVVVATVNKKPFTFILETERGLNLSIQAVPREGAGRTIQLVSD LRGTGEEAGAWETSTPYESLLVTISQAVRGGKLPAGWYQVPVTKETLQAPAGLSSVADAV WTGNHLKMVRFAVENKTLSALNIRESDFWQPGTRAVMFSQPASQLLAGARMDVYVIRDGE GN >gi|296493080|gb|ADTK01000421.1| GENE 14 9956 - 10522 522 188 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_A0037 NR:ns ## KEGG: EcSMS35_A0037 # Name: traE # Def: conjugal transfer pilus assembly protein TraE # Organism: E.coli_SECEC # Pathway: not_defined # 1 188 1 188 188 363 100.0 2e-99 MEHGARLSTSRVMAIAFIFMSVLIVLSLSVNVIQGVNNYRLQNEQRTAVTPMAFNAPFAV SQNSADASYLQQMALSFIALRLNVSSETVDASHQALLQYIRPGAQNQMKVILAEEAKRIK NDNVNSAFFQTSVRVWPQYGRVEIRGVLKTWIGDSKPFTDIKHYILILKRENGVTWLDNF GETDDEKK >gi|296493080|gb|ADTK01000421.1| GENE 15 10544 - 10855 363 103 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_A0038 NR:ns ## KEGG: EcSMS35_A0038 # Name: traL # Def: conjugal transfer pilus assembly protein TraL # Organism: E.coli_SECEC # Pathway: not_defined # 1 103 1 103 103 207 100.0 1e-52 MSGDENKLKKYRFPETLTNQSRWFGLPLDELIPAAICIGWGITTSKYLFGIGAAVLVYFG 
IKKLKKGRGSSWLRDLIYWYMPTALLRGIFHNVPDSCFRQWIK >gi|296493080|gb|ADTK01000421.1| GENE 16 10870 - 11235 344 121 aa, chain - ## HITS:1 COG:no KEGG:pECS88_0066 NR:ns ## KEGG: pECS88_0066 # Name: traA # Def: conjugal transfer pilin subunit TraA # Organism: E.coli_S88 # Pathway: not_defined # 1 121 1 121 121 171 100.0 8e-42 MNAVLSVQGASAPVKKKSFFSKFTRLNMLRLARAVIPAAVLMMFFPQLAMAAGSSGQDLM ASGNTTVKATFGKDSSVVKWVVLAEVLVGAVMYMMTKNVKFLAGFAIISVFIAVGMAVVG L >gi|296493080|gb|ADTK01000421.1| GENE 17 11268 - 11663 203 131 aa, chain - ## HITS:1 COG:no KEGG:ECO111_p3-60 NR:ns ## KEGG: ECO111_p3-60 # Name: not_defined # Def: conjugal transfer protein TraY # Organism: E.coli_O111_H- # Pathway: not_defined # 1 131 1 131 131 252 100.0 2e-66 MKRFGTRSATGKMVKLKLPVDVESLLIEASNRSGRSRSFEAVIRLKDHLHRYPKFNRAGN IYGKSLVKYLTMRLDDETNQLLIAAKNRSGWCKTDEAADRVIDHLIKFPDFYNSEIFREA DKEEDITFNTL >gi|296493080|gb|ADTK01000421.1| GENE 18 11762 - 12451 190 229 aa, chain - ## HITS:1 COG:no KEGG:pECS88_0064 NR:ns ## KEGG: pECS88_0064 # Name: traJ # Def: conjugal transfer transcriptional regulator TraJ # Organism: E.coli_S88 # Pathway: not_defined # 1 229 28 256 256 426 100.0 1e-118 MYPMDRIQQKHARQIDLLENLTAVIQDYPNPACIRDETGKFIFCNTLFHESFLTQDQSAE KWLLSQRDFCELISVTEMEAYRNEHTHLNLVEDVFIQNRFWTISVQSFLNGHRNIILWQF YDAAHVRHKDSYNQKTIVSDDIRNIIRRMSDDSSVSSYVNDVFYLYSTGISHNAIARILN ISISTSKKHASLICDYFSVSNKDELIILLYNKKFIYYLYEKAMCIINTR Prediction of potential genes in microbial genomes Time: Mon May 16 16:27:24 2011 Seq name: gi|296493079|gb|ADTK01000422.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1457.2, whole genome shotgun sequence Length of sequence - 689 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 48 - 81 -0.8 1 1 Tu 1 . 
- CDS 162 - 545 332 ## pECS88_0063 conjugal transfer protein TRAM - Prom 597 - 656 4.9 Predicted protein(s) >gi|296493079|gb|ADTK01000422.1| GENE 1 162 - 545 332 127 aa, chain - ## HITS:1 COG:no KEGG:pECS88_0063 NR:ns ## KEGG: pECS88_0063 # Name: traM # Def: conjugal transfer protein TRAM # Organism: E.coli_S88 # Pathway: not_defined # 1 127 1 127 127 216 100.0 2e-55 MAKVNLYISNDAYEKINAIIEKRRQEGAREKDVSFSATASMLLELGLRVHEAQMERKESA FNQTEFNKLLLECVVKTQSSVAKILGIESLSPHVSGNPKFEYANMVEDIREKVSSEMERF FPKNDDE Prediction of potential genes in microbial genomes Time: Mon May 16 16:27:27 2011 Seq name: gi|296493078|gb|ADTK01000423.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1473.1, whole genome shotgun sequence Length of sequence - 670 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 59 - 559 499 ## COG4734 Antirestriction protein Predicted protein(s) >gi|296493078|gb|ADTK01000423.1| GENE 1 59 - 559 499 166 aa, chain + ## HITS:1 COG:YPMT1.61c KEGG:ns NR:ns ## COG: YPMT1.61c COG4734 # Protein_GI_number: 16082851 # Func_class: R General function prediction only # Function: Antirestriction protein # Organism: Yersinia pestis # 2 164 4 166 168 164 53.0 5e-41 MSVVAPAVYVGTWHKYNCGSIAGRWFDLATFDDERDFFAACRSLHQDEADPELMFQDYEG FPGNMASECHINWAYVEGFRQARDEGCEEAYRLWVDDTGETDFDTFRDAWWGEADSEEAF AVEFASDTGLLADVPETVALYFDYEAYARDLFLDSFTFIDGHVFRR Prediction of potential genes in microbial genomes Time: Mon May 16 16:27:27 2011 Seq name: gi|296493077|gb|ADTK01000424.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1473.2, whole genome shotgun sequence Length of sequence - 131 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 16:27:27 2011 Seq name: gi|296493076|gb|ADTK01000425.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1491.1, whole genome shotgun sequence Length of sequence - 349 bp Number of predicted genes - 0 
Prediction of potential genes in microbial genomes Time: Mon May 16 16:27:28 2011 Seq name: gi|296493075|gb|ADTK01000426.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1491.2, whole genome shotgun sequence Length of sequence - 1890 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 296 110 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) + Term 428 - 466 -1.0 - Term 536 - 589 17.4 2 2 Op 1 . - CDS 593 - 1414 551 ## APECO1_O1CoBM24 hypothetical protein - Prom 1434 - 1493 6.2 3 2 Op 2 . - CDS 1532 - 1819 312 ## EcSMS35_A0046 hypothetical protein Predicted protein(s) >gi|296493075|gb|ADTK01000426.1| GENE 1 3 - 296 110 97 aa, chain + ## HITS:1 COG:PSLT072 KEGG:ns NR:ns ## COG: PSLT072 COG0741 # Protein_GI_number: 17233503 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Salmonella typhimurium LT2 # 1 83 73 155 156 128 72.0 3e-30 NELARYGIKPEHLTTDPCMNIYTGAYYLAIAFKKWGVTWEAVGAYNAGFRKTERQNQRRL AYASDVYRIYTGIKSSKGIRLPATKKSLPEINSVQNN >gi|296493075|gb|ADTK01000426.1| GENE 2 593 - 1414 551 273 aa, chain - ## HITS:1 COG:no KEGG:APECO1_O1CoBM24 NR:ns ## KEGG: APECO1_O1CoBM24 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 273 39 311 311 557 99.0 1e-157 MRLASRFGRYNSIRRERPLTDDELMQFVPSVFSGDKHESRSERYTYIPTINIINKLRDEG FQPFFACQSRIRDLGRREYSKHMLRLRREGHINGQEVPEIILLNSHDGSSSYQMIPGIFR FVCTNGLVCGNNFGEIRVPHKGDIVGQVIEGAYEVLGVFDKVTDNMEAMKEIHLNSDEQH LFGRAALMVRYEDENKTPVTPEQIITPRRREDKQNDLWTTWQRVQENMIKGGLSGRSASG KNTRTRAITGIDGDIRINKALWVIAEQFRKWKS >gi|296493075|gb|ADTK01000426.1| GENE 3 1532 - 1819 312 95 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_A0046 NR:ns ## KEGG: EcSMS35_A0046 # Name: not_defined # Def: 
hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 95 1 95 95 182 100.0 4e-45 MSTRNIHVNTASYTLLVAGKKKNTGEEWDVLEFSSLTELKKYRKSHPEKMAFSYSYALSQ GVDKQFRHINIAEADHFKQFLRQIKRAGLDIRAIC Prediction of potential genes in microbial genomes Time: Mon May 16 16:27:34 2011 Seq name: gi|296493074|gb|ADTK01000427.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1491.3, whole genome shotgun sequence Length of sequence - 700 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 327 339 ## COG0629 Single-stranded DNA-binding protein Predicted protein(s) >gi|296493074|gb|ADTK01000427.1| GENE 1 3 - 327 339 108 aa, chain - ## HITS:1 COG:STM4256 KEGG:ns NR:ns ## COG: STM4256 COG0629 # Protein_GI_number: 16767506 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Salmonella typhimurium LT2 # 1 107 1 108 176 168 78.0 3e-42 MAVRGINKVILVGRLGKDPEVRYIPNGGAVANLQVATSETWRDKQTGEMREQTEWHRVVL FGKLAEVAGEYLRKGAQVYIEGQLRTRSWEDNGITRYVTEILVKTTGT Prediction of potential genes in microbial genomes Time: Mon May 16 16:27:36 2011 Seq name: gi|296493073|gb|ADTK01000428.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1547.1, whole genome shotgun sequence Length of sequence - 2674 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 59 - 1093 267 ## PROTEIN SUPPORTED gi|148987750|ref|ZP_01819213.1| ribose-phosphate pyrophosphokinase 2 2 Tu 1 . + CDS 1974 - 2180 130 ## COG2801 Transposase and inactivated derivatives - Term 2074 - 2121 1.1 3 3 Tu 1 . 
- CDS 2173 - 2655 281 ## COG3436 Transposase and inactivated derivatives Predicted protein(s) >gi|296493073|gb|ADTK01000428.1| GENE 1 59 - 1093 267 344 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148987750|ref|ZP_01819213.1| ribose-phosphate pyrophosphokinase [Streptococcus pneumoniae SP6-BS73] # 19 339 11 317 317 107 31 1e-23 MLRDTGGIKPHERKRAVAHLTLSEREEIRAGLSAKMSIRAIATALNRSPSTISREVQRNR GRRYYKAVDANNRANRMAKRPKPCLLDQNLPLRKLVLEKLEMKWSPEQISGWLRRTKPRQ KTLRISPETIYKTLYFRSREALHHLNIQHLRRSHSLRHGRRHTRKGERGTINIVNGTPIH ERSRNIDNRRSLGHWEGDLVSGTKNSHIATLVDRKSRYTIILRLRGKDSVSVNQALTDKF LSLPSELRKSLTWDRGMELARHLEFTVSTGVKVYFCDPQSPWQRGTNENTNGLIRQYFPK KTCLAQYTQHELDLVAAQLNNRPRKTLKFKTPKEIIERGVALTD >gi|296493073|gb|ADTK01000428.1| GENE 2 1974 - 2180 130 68 aa, chain + ## HITS:1 COG:Z5089 KEGG:ns NR:ns ## COG: Z5089 COG2801 # Protein_GI_number: 15804202 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 EDL933 # 1 68 158 225 225 142 100.0 1e-34 MERFFRSLKTEWVPTDGYTGKDVARQQISSYILNYYNSVRPHHYNGGLTPEESENRYHFY CKTVASIT >gi|296493073|gb|ADTK01000428.1| GENE 3 2173 - 2655 281 160 aa, chain - ## HITS:1 COG:ECs3848 KEGG:ns NR:ns ## COG: ECs3848 COG3436 # Protein_GI_number: 15833102 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 153 170 322 482 235 71.0 2e-62 MQAPVSSKPIARSYAGAGLLAHVVTGKYADHLPLYRQSEIYRRQGVEVSRATLGRWTSAV AELLEPLYDVLRQYVLMPGKVHADDIPVPVQEPGSGKTRTARLWVYVRDDRNAGSQMPPA VWFAYSPDRKGIHPQNHLAGYSGVLQADAYGGYRVFVVVK Prediction of potential genes in microbial genomes Time: Mon May 16 16:27:37 2011 Seq name: gi|296493072|gb|ADTK01000429.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1550.1, whole genome shotgun sequence Length of sequence - 2792 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 
1 Tu 1 . - CDS 3 - 2781 625 ## COG1511 Predicted membrane protein Predicted protein(s) >gi|296493072|gb|ADTK01000429.1| GENE 1 3 - 2781 625 926 aa, chain - ## HITS:1 COG:DR0075 KEGG:ns NR:ns ## COG: DR0075 COG1511 # Protein_GI_number: 15805116 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Deinococcus radiodurans # 3 244 870 1112 1467 61 27.0 8e-09 MVEEVAHNASAVAQNTAAAKKSASDAGTSACEAATHATDAAGSARAASTSAGQAASSAQS ASSSAGTASTKATEASKSAAAAESSKSAAATSAGAAKTSETNAAASQKSAATSASTATTK ASEAATSARDAAASKEAAKSSETNASSSASSAASSATAAGNSAKAAKTSETNADNSAQAA ADSQTASANSATAAKKSETNAKNSEAAAKSSETNAKASETNAKSSETNAAKSAADALNYR NQAQLIVGDNIGLGSAPRDCPDISGNPSGYIGFMRIMSNAKGFPSIASGESSLTGFISQV DGTPAYTGVFQGWATRSLYTYRWNPTIGPQWTRHARKNEVDRLDQWSSETWLYNHDKSMR LGLTGSTWGCYSDTQKKWIPLDVSHGGTGAATIDGARTNLGLGRSNSPQLNSLFLDRYSD STNTYTSSGILHTRLLATDSTVRLGADMYVETLSNEPGQLTIRFTYDGSTGASKYLNLNS EGNLIVDSAILKSTVEKPLQIRSANPAIRFNETDRPANTPTYTLIANAGDWFIQKRDYDD AGSVSNAIAYNFANDRIDVQNLKASGLITANSGIATLTGHDWNSQHTDNVDKFRPIAGST NGPAGSMVLGGIHVQFSKNYAVQFGGRNSGFWGRTVENGTTQEWKKLLTVDDLNSSTDLA VRSLTTSNPLKSGGGRIDVLGSTSDYSKMDCFVRGFDSTGNSLAWALGSSVGVSKMLSLK NFFSGAEILLNGNDGAVQLKTGAVNGATAQALTINRNEVNSTVDLTLTKQSGTGNRFVLQ NSGNAELPFSVRVWGSSTRQNVFEVGTSAAYLFYAQKTSAGQLFDVNGAINCTTLNQSSD RDLKDDILVISDATKAIRKMNGYTYTLRENGMPYAGVIAQEVMEAIPEAVGSFTHYGEEL QGPTVDGNELREETRYLNVDYAAVTG Prediction of potential genes in microbial genomes Time: Mon May 16 16:27:38 2011 Seq name: gi|296493071|gb|ADTK01000430.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1554.1, whole genome shotgun sequence Length of sequence - 3156 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 213 - 272 4.5 1 1 Op 1 25/0.000 + CDS 323 - 1480 654 ## COG1192 ATPases involved in chromosome partitioning 2 1 Op 2 . 
+ CDS 1480 - 2451 225 ## COG1475 Predicted transcriptional regulators + Term 2538 - 2577 3.0 Predicted protein(s) >gi|296493071|gb|ADTK01000430.1| GENE 1 323 - 1480 654 385 aa, chain + ## HITS:1 COG:YPCD1.13c KEGG:ns NR:ns ## COG: YPCD1.13c COG1192 # Protein_GI_number: 16082703 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Yersinia pestis # 3 383 6 386 388 577 69.0 1e-164 METLNQCINAGHEMTKAIAIAQFNDDSPEARKITRRWRIGEAADLVGVSSQAIRDAEKAG RLPHPDMEIRGRVEQRVGYTIEQINHMRDVFGTRLRRAEDVFPPVIGVAAHKGGVYKTSV SVHLAQDLALKGLRVLLVEGNDPQGTASMYHGWVPDLHIHAEDTLLPFYLGEKDDVTYAI KPTCWPGLDIIPSCLALHRIETELMGKFDEGKLPTDPHLMLRLAIETVAHDYDVIVIDSA PNLGIGTINVVCAADVLIVPTPAELFDYTSALQFFDMLRDLLKNVDLKGFEPDVRILLTK YSNSNGSQSPWMEEQIRDAWGSMVLKNVVRETDEVGKGQIRMRTVFEQAIDQRSSTGAWR NALSIWEPVCNEIFDRLIKPRWEIR >gi|296493071|gb|ADTK01000430.1| GENE 2 1480 - 2451 225 323 aa, chain + ## HITS:1 COG:YPCD1.12c KEGG:ns NR:ns ## COG: YPCD1.12c COG1475 # Protein_GI_number: 16082702 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Yersinia pestis # 1 320 1 316 320 266 48.0 3e-71 MKRAPVIPKHTIKTQPLEDTPSSAPAAPMVDSLIARVGAMARGNAITLPVCGRDVKFTLE VLRGDSVEKTSRVWSGNERDQELLTEDALDDLIPSFLLTGQQTPAFGRRVSGVIEIADGS RRRKAAALTESDYRVLVGELDDEQMAALSRLGNDYRPTSAYERGQRYASRLQNEFAGNIS ALADAENISRKIITRCINTAKLPKSVVALFSHPGELSARSGDALQRAFTDKEELLKQQAS NLHEQKKAGVIFEAEEVITLLTSVLKTSSASRTSLSSRHQFAPGATVLYKGDKMVLNLDR SRIPTECIERIEAILKELEKAAL Prediction of potential genes in microbial genomes Time: Mon May 16 16:27:38 2011 Seq name: gi|296493070|gb|ADTK01000431.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1664.1, whole genome shotgun sequence Length of sequence - 136 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 16:27:39 2011 Seq name: gi|296493069|gb|ADTK01000432.1| Escherichia coli MS 84-1 E_coli84-1-1.0_Cont1664.2, whole genome shotgun sequence Length of sequence - 240 bp 
Number of predicted genes - 0