Prediction of potential genes in microbial genomes Time: Sun May 15 23:01:35 2011 Seq name: gi|296494737|gb|ADTN01000001.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont4.1, whole genome shotgun sequence Length of sequence - 1383 bp Number of predicted genes - 3, with homology - 1 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 42 - 191 80 ## + Term 315 - 352 1.2 2 2 Tu 1 . - CDS 188 - 412 91 ## - Prom 643 - 702 7.4 + Prom 597 - 656 9.4 3 3 Tu 1 . + CDS 848 - 1360 -126 ## gi|301643304|ref|ZP_07243367.1| conserved hypothetical protein Predicted protein(s) >gi|296494737|gb|ADTN01000001.1| GENE 1 42 - 191 80 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHNLQLLSLLNDGEINSIARSIEKWTYNKFSISLFKNRQSYLGSLGGKA >gi|296494737|gb|ADTN01000001.1| GENE 2 188 - 412 91 74 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNQHNNNSQEKSHYIEIIFISHLKSCFSLKNNDIVEADNDNLFEISLFERPLNFNLMISS ETTSHILLGLPPPR >gi|296494737|gb|ADTN01000001.1| GENE 3 848 - 1360 -126 170 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301643304|ref|ZP_07243367.1| ## NR: gi|301643304|ref|ZP_07243367.1| conserved hypothetical protein [Escherichia coli MS 146-1] # 1 170 168 337 337 252 100.0 5e-66 MITIGIVFFIAISWFYYFTYVYSGARADTLAKQFFYLSGFFFFGSLLSINEVFFSKIRWI SLISILSYFSFKGEWFAFIIEMIAFPSVVIFLCTGFFKEIKLNFIGDLSYGIYLYHFPII QLLQHLKLFDYNAYMGLLITSLATFSLAYLSWHLLENRFLKRSVSEPVIG Prediction of potential genes in microbial genomes Time: Sun May 15 23:01:57 2011 Seq name: gi|296494736|gb|ADTN01000002.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont6.1, whole genome shotgun sequence Length of sequence - 15828 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 9, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 7 - 66 2.6 1 1 Op 1 10/0.000 + CDS 108 - 401 152 ## COG3121 P pilus assembly protein, chaperone PapD 2 1 Op 2 . 
+ CDS 417 - 2897 1709 ## COG3188 P pilus assembly protein, porin PapC + Prom 2932 - 2991 4.5 3 2 Tu 1 . + CDS 3108 - 3947 393 ## EC55989_2363 putative exported fimbrial-like adhesin protein + Term 4004 - 4034 4.3 4 3 Tu 1 1/1.000 - CDS 4029 - 4367 306 ## COG5455 Predicted integral membrane protein - Prom 4415 - 4474 3.6 - Term 4473 - 4503 1.1 5 4 Tu 1 . - CDS 4586 - 5410 573 ## COG2215 ABC-type uncharacterized transport system, permease component - Prom 5437 - 5496 3.9 + Prom 5448 - 5507 5.0 6 5 Tu 1 1/1.000 + CDS 5531 - 5803 372 ## COG1937 Uncharacterized protein conserved in bacteria + Prom 5817 - 5876 3.0 7 6 Op 1 12/0.000 + CDS 6107 - 6814 465 ## COG2145 Hydroxyethylthiazole kinase, sugar kinase family 8 6 Op 2 2/0.667 + CDS 6811 - 7611 721 ## COG0351 Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase 9 6 Op 3 2/0.667 + CDS 7676 - 8494 660 ## COG3757 Lyzozyme M1 (1,4-beta-N-acetylmuramidase) 10 6 Op 4 . + CDS 8546 - 9292 591 ## COG2188 Transcriptional regulators - Term 9173 - 9229 11.1 11 7 Op 1 5/0.667 - CDS 9266 - 10231 883 ## COG0524 Sugar kinases, ribokinase family 12 7 Op 2 4/0.667 - CDS 10228 - 11232 1005 ## COG1397 ADP-ribosylglycohydrolase 13 7 Op 3 . - CDS 11229 - 12506 1398 ## COG0477 Permeases of the major facilitator superfamily - Prom 12554 - 12613 3.0 + Prom 12557 - 12616 4.0 14 8 Tu 1 . + CDS 12763 - 13815 1122 ## COG1830 DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes + Term 13978 - 14013 4.0 + Prom 14018 - 14077 4.5 15 9 Op 1 . + CDS 14122 - 14976 681 ## COG0191 Fructose/tagatose bisphosphate aldolase 16 9 Op 2 . 
+ CDS 15005 - 15823 602 ## COG4573 Predicted tagatose 6-phosphate kinase Predicted protein(s) >gi|296494736|gb|ADTN01000002.1| GENE 1 108 - 401 152 97 aa, chain + ## HITS:1 COG:yehC KEGG:ns NR:ns ## COG: yehC COG3121 # Protein_GI_number: 16130048 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli K12 # 1 97 143 239 239 177 97.0 4e-45 MQNRIKLFYRPAGIAPVNKATFKKLLVNRSGNGLVIKNDSANWVTISDVKANNVKVNYET IMIAPLESQSVNIKSNNANNWYLTIIDDHGNYISDKI >gi|296494736|gb|ADTN01000002.1| GENE 2 417 - 2897 1709 826 aa, chain + ## HITS:1 COG:yehB KEGG:ns NR:ns ## COG: yehB COG3188 # Protein_GI_number: 16130047 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli K12 # 1 826 1 826 826 1599 98.0 0 MLRMTPLASAIVALLLGIEAYAAEETFDTHFMIGGMKDQQVANIRLDDNQPLPGQYDIDI YVNKQWRGKYEIIVKDNPQETCLSREVIKRLGINSDNFASGKQCLTFEQLVQGGSYSWDI GVFRLDFSVPQAWVEELESGYVPPENWERGINAFYTSYYVSQYYSDYKASGNNKSTYVRF NSGLNLLEWQLHSDASFSKTNNNPGVWKSNTLYLERGFAQLLGTLRVGDMYTSSDIFDSV RFSGVRLFRDMQMLPNSKQNFTPRVQGIAQSNALVTIEQNGFVVYQKEVPPGPFAITDLQ LAGGGADLDVSVKEADGSVTTYLVPYAAVPNMLQPGVSNYDFAAGRSHIEGASKQSDFVQ AGYQYGFNNLLTLYGGSMVANNYYAFTLGTGWNTRIGAISVDATKSHSKQDNGDVFDGQS YQIAYNKFVSQTSTRFGLAAWRYSSRDYRTFNDHVWANNKDNYRRDENDIYDIADYYQND FGRKNSFSANMSQSLPEGWGSVSLSTLWRDYWGRSGSSKDYQLSYSNNWRRISYTLAASQ AYDENHHEEKRFNIFISIPFDWGDDVTTPRRQIYMSNSTTFDDQGFASNNTGLSGTVGSR DQFNYGVNLSYQHQGNETTAGANLTWNAPVATVNGSYSQSSTYRQAGASVSGGIVAWSGS VNLANRLSETFAVMNAPGIKDAYVNGQKYRTTNRNGVVIYDGMTPYQENHLMLDVSQSDS EAELRGNRKIAAPYRGAVVLVNFDTDQRKPWFIKALRADGQPLMFGYEVNDIHGHNIGVV GQGSQLFIRTNEVPPSVNVAIDKQQGLSCTITFGKEIDESRNYICQ >gi|296494736|gb|ADTN01000002.1| GENE 3 3108 - 3947 393 279 aa, chain + ## HITS:1 COG:no KEGG:EC55989_2363 NR:ns ## KEGG: EC55989_2363 # Name: yehA # Def: putative exported fimbrial-like adhesin protein # Organism: E.coli_55989 # 
Pathway: not_defined # 1 279 66 344 344 453 98.0 1e-126 MSPSDIIVGLYNDTIKLNLHFEWTNKNNITLSNNQTSFTSGYSVTVTPAASNAKVNVSAG GGGSVMINGVATLSSASSSTRGSAAVQFLLCLLGGKSWDACVNSYRNALAQNAGVYSFNL TLSYNPITTTCKPDDLLITLDSIPVSQLPATGNKATINSKKEDIILRCKNLLGQQNQTSR KMQVYLSSSDLLTNSNIILKGAEDNGVGFILESNGSPVTLLNITNSSKGYTNLKEIAAKS KLTDTTVSIPITASYYVYDTNKIKSGALEATALINVKYD >gi|296494736|gb|ADTN01000002.1| GENE 4 4029 - 4367 306 112 aa, chain - ## HITS:1 COG:yohN KEGG:ns NR:ns ## COG: yohN COG5455 # Protein_GI_number: 16130045 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Escherichia coli K12 # 1 112 61 172 172 209 100.0 1e-54 MTIKNKMLLGALLLVTSAAWAAPATAGSTNTSGISKYELSSFIADFKHFKPGDTVPEMYR TDEYNIKQWQLRNLPAPDAGTHWTYMGGAYVLISDTDGKIIKAYDGEIFYHR >gi|296494736|gb|ADTN01000002.1| GENE 5 4586 - 5410 573 274 aa, chain - ## HITS:1 COG:ZyohM KEGG:ns NR:ns ## COG: ZyohM COG2215 # Protein_GI_number: 15802585 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Escherichia coli O157:H7 EDL933 # 1 274 1 283 283 448 91.0 1e-126 MTEFTTLLQQGNAWFFIPSAILLGALHGLEPGHSKTMMAAFIIAIKGTIKQAVMLGLAAT ISHTAVVWLIAFGGMVISKRFTAQSAEPWLQLISAVIIISTAFWMFWRTWRGERNWLENM HGHDYEHHHHDHEHHHDHGHHHHHEHGEYQDAHARAHANDIKRRFDGREVTNWQILLFGL TGGLIPCPAAITVLLICIQLKALTLGATLVVSFSIGLALTLVTVGVGAAISVQQVAKRWS GFNTLAKRAPYFSSLLIGLVGVYMGVHGFMGIMR >gi|296494736|gb|ADTN01000002.1| GENE 6 5531 - 5803 372 90 aa, chain + ## HITS:1 COG:ECs2911 KEGG:ns NR:ns ## COG: ECs2911 COG1937 # Protein_GI_number: 15832165 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 90 1 90 90 146 100.0 1e-35 MSHTIRDKQKLKARASKIQGQVVALKKMLDEPHECAAVLQQIAAIRGAVNGLMREVIKGH LTEHIVHQGDELKREEDLDVVLKVLDSYIK >gi|296494736|gb|ADTN01000002.1| GENE 7 6107 - 6814 465 235 aa, chain + ## HITS:1 COG:ECs2907 KEGG:ns NR:ns ## COG: ECs2907 COG2145 # Protein_GI_number: 15832161 # Func_class: H Coenzyme 
transport and metabolism # Function: Hydroxyethylthiazole kinase, sugar kinase family # Organism: Escherichia coli O157:H7 # 1 235 28 262 262 402 94.0 1e-112 MTNDVVQTFTANTLLALGASPAMVIETEEASQFAAIASALLINVGTLTQLRAQSMCAAVE QAKSSQTPWTLDPVAVGALDYRRRFCLELLSHKPTAIRGNASEIMALAGIANGGRGVDTT DAAANAIPAAQTLARETGAIVVVTGEMDYVTDGHRIIGIHGGDPLMTKVVGTGCALSAVV AACCALPGDMLENVASACHWMKQAGERAVARSEGPGSFVPHFLDALWQLTQEVQA >gi|296494736|gb|ADTN01000002.1| GENE 8 6811 - 7611 721 266 aa, chain + ## HITS:1 COG:thiD KEGG:ns NR:ns ## COG: thiD COG0351 # Protein_GI_number: 16130041 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase # Organism: Escherichia coli K12 # 1 266 1 266 266 516 100.0 1e-146 MKRINALTIAGTDPSGGAGIQADLKTFSALGAYGCSVITALVAQNTRGVQSVYRIEPDFV AAQLDSVFSDVRIDTTKIGMLAETDIVEAVAERLQRYQIQNVVLDTVMLAKSGDPLLSPS AVATLRSRLLPQVSLITPNLPEAAALLDAPHARTEQEMLEQGRSLLAMGCGAVLMKGGHL DDEQSPDWLFTREGEQRFTAPRIMTKNTHGTGCTLSAALAALRPRHTNWADTVQEAKSWL SSALAQADTLEVGHGIGPVHHFHAWW >gi|296494736|gb|ADTN01000002.1| GENE 9 7676 - 8494 660 272 aa, chain + ## HITS:1 COG:yegX KEGG:ns NR:ns ## COG: yegX COG3757 # Protein_GI_number: 16130040 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lyzozyme M1 (1,4-beta-N-acetylmuramidase) # Organism: Escherichia coli K12 # 1 272 4 275 275 552 98.0 1e-157 MQLRITSRKKLTALLCALGLISIVAIYPRQTVNFFYSTAVQITDYIHFYGYRPVKSFAIR IPASYTIHGIDVSRWQERIDWQRVAKMRDNGIRLQFAFIKATEGEKLVDPYFSRNWQLSR ENGLLRGAYHYFSPSVAAPVQARLFLQTVDFSQGDFPAVLDVEERGKLSAKELRKRVSQW LKMVEKSTGKKPIIYSGAVFYHTNLAGYFNEYPWWVAHYYQRRPDNDGMAWRFWQHSDRG QVDGINGPVDFNVFNGTVEELQAFVDGIKETP >gi|296494736|gb|ADTN01000002.1| GENE 10 8546 - 9292 591 248 aa, chain + ## HITS:1 COG:ECs2904 KEGG:ns NR:ns ## COG: ECs2904 COG2188 # Protein_GI_number: 15832158 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 248 1 248 248 476 99.0 1e-134 MEQAHTQLIAQLNERILAADNTPLYIKFAETVKNAVRSGVLEHGNILPGERDLSQLTGVS 
RITVRKAMQALEEEGVVTRSRGYGTQINNIFEYSLKEARGFSQQVVLRGKKPDTLWVNKR VVKCPEEVAEQLAVEAGSDVFLLKRIRYVDEEAVSIEESWVPAHLIHDVDAIGISLYDYF RSQHIYPQRTRSRVSARMPDAEFQSHIQLDSKIPVLVIKQVALDQQQRPIEYSISHCRSD LYVFVCEE >gi|296494736|gb|ADTN01000002.1| GENE 11 9266 - 10231 883 321 aa, chain - ## HITS:1 COG:ECs2903 KEGG:ns NR:ns ## COG: ECs2903 COG0524 # Protein_GI_number: 15832157 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli O157:H7 # 1 321 1 321 321 624 97.0 1e-179 MSGARLHTLLPELTTRQSVMVVGAAVIDVIADAYALPWRGCDIELKQQSVNVGGCALNIA VALKRLGIEAGNALPLGQGVWAEIIRNRMAKEGLISLIDNAEGDNGWCLALVEPDGERTF MSFSGVENQWNRQWLARLTVAPGNLLYFSGYQLASPCGELLVEWLEELQDVTPFIDFGPR IGDIPDALLARIMACRPLVSLNRQEAEIAAERFALSAEITTLGEQWQEKFAAPLIVRLDK EGAWYFSNDASGCIPAFPTQVVDTIGAGDSHAGGVLAGLASGLPLADAVLLGNAVASWVV GHRGGDCAPTREELLLAHKNV >gi|296494736|gb|ADTN01000002.1| GENE 12 10228 - 11232 1005 334 aa, chain - ## HITS:1 COG:yegU KEGG:ns NR:ns ## COG: yegU COG1397 # Protein_GI_number: 16130037 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ADP-ribosylglycohydrolase # Organism: Escherichia coli K12 # 1 334 1 334 334 639 99.0 0 MKTERILGALYGQALGDAMGMPSELWPRSRVKAHFGWIDRFLPGPKENNAACYFNRAEFT DDTSMALCLADALLERKGKIDPDLIGRNILDWALRFDAFNKNVLGPTSKIALNAIRDGKP VAELENNGVTNGAAMRVSPLGCLLPARDVDSFIDDVALASSPTHKSDLAVAGAVVIAWAI SRAIDGESWSAIVDSLPSIARHAQQKRTTTFSASLAARLEIALKIVRNADGTESASEQLY QVVGAGTSTIESVPCAIALVELAQTDPNRCAVLCANLGGDTDTIGAMATAICGALHGINA IDPALKAELDAVNQLDFNRYATALAKYRQQREAV >gi|296494736|gb|ADTN01000002.1| GENE 13 11229 - 12506 1398 425 aa, chain - ## HITS:1 COG:yegT KEGG:ns NR:ns ## COG: yegT COG0477 # Protein_GI_number: 16130036 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 425 1 425 425 765 100.0 0 
MKTTAKLSFMMFVEWFIWGAWFVPLWLWLSKSGFSAGEIGWSYACTAIAAILSPILVGSI TDRFFSAQKVLAVLMFAGALLMYFAAQQTTFAGFFPLLLAYSLTYMPTIALTNSIAFANV PDVERDFPRIRVMGTIGWIASGLACGFLPQILGYADISPTNIPLLITAGSSALLGVFAFF LPDTPPKSTGKMDIKVMLGLDALILLRDKNFLVFFFCSFLFAMPLAFYYIFANGYLTEVG MKNATGWMTLGQFSEIFFMLALPFFTKRFGIKKVLLLGLVTAAIRYGFFIYGSADEYFTY ALLFLGILLHGVSYDFYYVTAYIYVDKKAPVHMRTAAQGLITLCCQGFGSLLGYRLGGVM MEKMFAYQEPVNGLTFNWSGMWTFGAVMIAIIAVLFMIFFRESDNEITAIKVDDRDIALT QGEVK >gi|296494736|gb|ADTN01000002.1| GENE 14 12763 - 13815 1122 350 aa, chain + ## HITS:1 COG:ECs2900 KEGG:ns NR:ns ## COG: ECs2900 COG1830 # Protein_GI_number: 15832154 # Func_class: G Carbohydrate transport and metabolism # Function: DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes # Organism: Escherichia coli O157:H7 # 1 350 25 374 374 694 99.0 0 MTDIAQLLGKDADNLLQHRCMTIPSDQLYLPGHDYVDRVMIDNNRPPAVLRNMQTLYNTG RLAGTGYLSILPVDQGVEHSAGASFAANPLYFDPKNIVELAIEAGCNCVASTYGVLASVS RRYAHRIPFLVKLNHNETLSYPNTYDQTLYASVEQAFNMGAVAVGATIYFGSEESRRQIE EISAAFERAHELGMVTVLWAYLRNSAFKKDGVDYHVSADLTGQANHLAVTIGADIVKQKM AENNGGYKAINYGYTDDRVYSKLTSENPIDLVRYQLANCYMGRAGLINSGGAAGGETDLS DAVRTAVINKRAGGMGLILGRKAFKKSMADGVKLINAVQDVYLDSKITIA >gi|296494736|gb|ADTN01000002.1| GENE 15 14122 - 14976 681 284 aa, chain + ## HITS:1 COG:ECs2899 KEGG:ns NR:ns ## COG: ECs2899 COG0191 # Protein_GI_number: 15832153 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Escherichia coli O157:H7 # 1 284 3 286 286 552 98.0 1e-157 MYVVSTKQMLNNAQRGGYAVPAFNIHNLETMQVVVETAANLHAPVIIAGTPGTFTYAGTE NLLALVSAMAKQYHHPLAIHLDHHTKFDDIAQKVRSGVRSVMIDASHLPFAQNISRVKEV VDFCHRFDVSVEAELGQLGGQEDDVQVNEADAFYTNPAQAREFAEATGIDSLAVAIGTAH GMYASAPALDFSRLENIRQWVNLPQVLHGASGLSTKDIQQTIKMGICKINVATELKNAFS QALKNYLTEHPEATDPRDYLQSAKSAMRDVVSKVIADCGCEGRA >gi|296494736|gb|ADTN01000002.1| GENE 16 15005 - 15823 602 272 aa, chain + ## HITS:1 COG:gatZ KEGG:ns NR:ns ## COG: gatZ COG4573 # Protein_GI_number: 16130033 # Func_class: G Carbohydrate transport and metabolism # 
Function: Predicted tagatose 6-phosphate kinase # Organism: Escherichia coli K12 # 1 266 1 266 420 541 100.0 1e-154 MKTLIARHKAGEHIGICSVCSAHPLVIEAALAFDRNSTRKVLIEATSNQVNQFGGYTGMT PADFREFVFTIADKVGFARERIILGGDHLGPNCWQQENADAAMEKSVELVKEYVRAGFSK IHLDASMSCAGDPIPLAPETVAERAAVLCFAAESVATDCQREQLSYVIGTEVPVPGGEAS AIQSVHITHVEDAANTLRTHQKAFIARGLTEALTRVIAIVVQPGVEFDHSNIIHYQPQEA QPLAQWIENTRMVYEAHSTDYQTRTAVMTPTY Prediction of potential genes in microbial genomes Time: Sun May 15 23:02:07 2011 Seq name: gi|296494735|gb|ADTN01000003.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont7.1, whole genome shotgun sequence Length of sequence - 17034 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 6, operones - 5 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.333 - CDS 773 - 2251 1468 ## COG1070 Sugar (pentulose and hexulose) kinases 2 1 Op 2 . - CDS 2278 - 3405 1279 ## COG0477 Permeases of the major facilitator superfamily - Prom 3456 - 3515 3.4 + Prom 3657 - 3716 6.6 3 2 Op 1 8/0.333 + CDS 3874 - 4659 180 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 4 2 Op 2 5/0.667 + CDS 4729 - 6183 1438 ## COG0277 FAD/FMN-containing dehydrogenases + Term 6191 - 6233 8.0 5 3 Op 1 3/0.667 + CDS 6277 - 7614 1146 ## COG0477 Permeases of the major facilitator superfamily 6 3 Op 2 29/0.000 + CDS 7592 - 8371 760 ## COG2086 Electron transfer flavoprotein, beta subunit 7 3 Op 3 . + CDS 8368 - 9228 721 ## COG2025 Electron transfer flavoprotein, alpha subunit + Term 9439 - 9471 0.3 - Term 9331 - 9367 -0.8 8 4 Op 1 2/0.667 - CDS 9376 - 9951 559 ## COG1954 Glycerol-3-phosphate responsive antiterminator (mRNA-binding) 9 4 Op 2 12/0.000 - CDS 9968 - 10228 236 ## COG2440 Ferredoxin-like protein 10 4 Op 3 1/1.000 - CDS 10219 - 11490 711 ## COG0644 Dehydrogenases (flavoproteins) 11 4 Op 4 . 
- CDS 11568 - 11930 397 ## COG0720 6-pyruvoyl-tetrahydropterin synthase - Prom 12023 - 12082 3.4 + Prom 12027 - 12086 6.2 12 5 Op 1 11/0.000 + CDS 12249 - 14048 1824 ## COG0369 Sulfite reductase, alpha subunit (flavoprotein) 13 5 Op 2 11/0.000 + CDS 14048 - 15760 1873 ## COG0155 Sulfite reductase, beta subunit (hemoprotein) 14 5 Op 3 . + CDS 15834 - 16568 869 ## COG0175 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes + Term 16589 - 16626 5.3 + Prom 16598 - 16657 3.9 15 6 Tu 1 . + CDS 16833 - 16985 154 ## G2583_3410 small toxic membrane polypeptide Predicted protein(s) >gi|296494735|gb|ADTN01000003.1| GENE 1 773 - 2251 1468 492 aa, chain - ## HITS:1 COG:ygcE KEGG:ns NR:ns ## COG: ygcE COG1070 # Protein_GI_number: 16130683 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Escherichia coli K12 # 1 492 1 492 492 1033 100.0 0 MSKKYIIGIDGGSQSTKVVMYDLEGNVVCEGKGLLQPMHTPDADTAEHPDDDLWASLCFA GHDLMSQFAGNKEDIVGIGLGSIRCCRALLKADGTPAAPLISWQDARVTRPYEHTNPDVA YVTSFSGYLTHRLTGEFKDNIANYFGQWPVDYKSWAWSEDAAVMDKFNIPRHMLFDVQMP GTVLGHITPQAALATHFPAGLPVVCTTSDKPVEALGAGLLDDETAVISLGTYIALMMNGK ALPKDPVAYWPIMSSIPQTLLYEGYGIRKGMWTVSWLRDMLGESLIQDARAQDLSPEDLL NKKASCVPPGCNGLMTVLDWLTNPWEPYKRGIMIGFDSSMDYAWIYRSILESVALTLKNN YDNMCNEMNHFAKHVIITGGGSNSDLFMQIFADVFNLPARRNAINGCASLGAAINTAVGL GLYPDYATAVDNMVRVKDIFIPIESNAKRYDAMNKGIFKDLTKHTDVILKKSYEVMHGEL GNVDSIQSWSNA >gi|296494735|gb|ADTN01000003.1| GENE 2 2278 - 3405 1279 375 aa, chain - ## HITS:1 COG:yqcE KEGG:ns NR:ns ## COG: yqcE COG0477 # Protein_GI_number: 16130682 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 375 51 425 425 656 100.0 0 MSTFGIAAIILYAPSGVIADKFSHRKMITSAMIITGLLGLLMATYPPLWVMLCIQIAFAI 
TTILMLWSVSIKAASLLGDHSEQGKIMGWMEGLRGVGVMSLAVFTMWVFSRFAPDDSTSL KTVIIIYSVVYILLGILCWFFVSDNNNLRSANNEEKQSFQLSDILAVLRISTTWYCSMVI FGVFTIYAILSYSTNYLTEMYGMSLVAASYMGIVINKIFRALCGPLGGIITTYSKVKSPT RVIQILSVLGLLTLTALLVTNSNPQSVAMGIGLILLLGFTCYASRGLYWACPGEARTPSY IMGTTVGICSVIGFLPDVFVYPIIGHWQDTLPAAEAYRNMWLMGMAALGMVIVFTFLLFQ KIRTADSAPAMASSK >gi|296494735|gb|ADTN01000003.1| GENE 3 3874 - 4659 180 261 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 16 256 1 238 242 73 26 7e-13 MSIESLNAFSMDFFSLKGKTAIVTGGNSGLGQAFAMALAKAGANIFIPSFVKDNGETKEM IEKQGVEVDFMQVDITAEGAPQKIIAACCERFGTVDILVNNAGICKLNKVLDFGRADWDP MIDVNLTAAFELSYEAAKIMIPQKSGKIINICSLFSYLGGQWSPAYSATKHALAGFTKAY CDELGQYNIQVNGIAPGYYATDITLATRSNPETNQRVLDHIPANRWGDTQDLMGAAVFLA SPASNYVNGHLLVVDGGYLVR >gi|296494735|gb|ADTN01000003.1| GENE 4 4729 - 6183 1438 484 aa, chain + ## HITS:1 COG:ECs3629 KEGG:ns NR:ns ## COG: ECs3629 COG0277 # Protein_GI_number: 15832883 # Func_class: C Energy production and conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Escherichia coli O157:H7 # 1 484 1 484 484 1009 99.0 0 MSLSRAAIVDQLKEIVGADRVITDETVLKKNSIDRFRKFPDIHGIYTLPIPAAVVKLGST EQVSRVLNFMNAHKINGVPRTGASATEGGLETVVENSVVLDGSAMNQIINIDIENMQATA QCGVPLEVLENALREKGYTTGHSPQSKPLAQMGGLVATRSIGQFSTLYGAIEDMVVGLEA VLADGTVTRIKNVPRRAAGPDIRHIIIGNEGALCYITEVTVKIFKFTPENNLFYGYILED MKTGFNILREIMVEGYRPSIARLYDAEDGTQHFTHFADGKCVLIFMAEGNPRIAKVTGEG IAEIVARYPQCQRVDSKLIETWFNNLNWGPDKVAAERVQILKTGNMGFTTEVSGCWSCIH EIYESVINRIRTEFPHADDITMLGGHSSHSYQNGTNMYFVYDYNVVDCKPEEEIDKYHNP LNKIICEETIRLGGSMVHHHGIGKHRVHWSKLEHGSAWALLEGLKKQFDPNGIMNTGTIY PIEK >gi|296494735|gb|ADTN01000003.1| GENE 5 6277 - 7614 1146 445 aa, chain + ## HITS:1 COG:ygcS KEGG:ns NR:ns ## COG: ygcS COG0477 # Protein_GI_number: 16130678 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major 
facilitator superfamily # Organism: Escherichia coli K12 # 1 445 25 469 469 745 100.0 0 MNTSPVRMDDLPLNRFHCRIAALTFGAHLTDGYVLGVIGYAIIQLTPAMQLTPFMAGMIG GSALLGLFLGSLVLGWISDHIGRQKIFTFSFLLITLASFLQFFATTPEHLIGLRILIGIG LGGDYSVGHTLLAEFSPRRHRGILLGAFSVVWTVGYVLASIAGHHFISENPEAWRWLLAS AALPALLITLLRWGTPESPRWLLRQGRFAEAHAIVHRYFGPHVLLGDEVVTATHKHIKTL FSSRYWRRTAFNSVFFVCLVIPWFVIYTWLPTIAQTIGLEDALTASLMLNALLIVGALLG LVLTHLLAHRKFLLGSFLLLAATLVVMACLPSGSSLTLLLFVLFSTTISAVSNLVGILPA ESFPTDIRSLGVGFATAMSRLGAAVSTGLLPWVLAQWGMQVTLLLLATVLLVGFVVTWLW APETKALPLVAAGNVGGANEHSVSV >gi|296494735|gb|ADTN01000003.1| GENE 6 7592 - 8371 760 259 aa, chain + ## HITS:1 COG:ygcR KEGG:ns NR:ns ## COG: ygcR COG2086 # Protein_GI_number: 16130677 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: Escherichia coli K12 # 1 259 3 261 261 466 100.0 1e-131 MNILLAFKAEPDAGMLAEKEWQAAAQGKSGPDISLLRSLLGADEQAAAALLLAQRKNGTP MSLTALSMGDERALHWLRYLMALGFEEAVLLETAADLRFAPEFVARHIAEWQHQNPLDLI ITGCQSSEGQNGQTPFLLAEMLGWPCFTQVERFTLDALFITLEQRTEHGLRCCRVRLPAV IAVRQCGEVALPVPGMRQRMAAGKAEIIRKTVAAEMPAMQCLQLARAEQRRGATLIDGQT VAEKAQKLWQDYLRQRMQP >gi|296494735|gb|ADTN01000003.1| GENE 7 8368 - 9228 721 286 aa, chain + ## HITS:1 COG:ygcQ KEGG:ns NR:ns ## COG: ygcQ COG2025 # Protein_GI_number: 16130676 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Escherichia coli K12 # 1 286 12 297 297 538 100.0 1e-153 MNIAIVTINQENAAIASWLAAQDFSGCTLAHWQIEPQPVVAEQVLDALVEQWQRTPADVV LFPPGTFGDELSTRLAWRLHGASICQVTSLDIPTVSVRKSHWGNALTATLQTEKRPLCLS LARQAGAAKNATLPSGMQQLNIVPGALPDWLVSTEDLKNVTRDPLAEARRVLVVGQGGEA DNQEIAMLAEKLGAEVGYSRARVMNGGVDAEKVIGISGHLLAPEVCIVVGASGAAALMAG VRNSKFVVAINHDASAAVFSQADVGVVDDWKVVLEALVTNIHADCQ >gi|296494735|gb|ADTN01000003.1| GENE 8 9376 - 9951 559 191 aa, chain - ## HITS:1 COG:ygcP KEGG:ns NR:ns ## COG: ygcP COG1954 # Protein_GI_number: 16130675 # Func_class: K Transcription # Function: Glycerol-3-phosphate responsive 
antiterminator (mRNA-binding) # Organism: Escherichia coli K12 # 1 191 1 191 191 377 100.0 1e-105 MPLLHLLRQNPVIAAVKDNASLQLAIDSECQFISVLYGNICTISNIVKKIKNAGKYAFIH VDLLEGASNKEVVIQFLKLVTEADGIISTKASMLKAARAEGFFCIHRLFIVDSISFHNID KQVAQSNPDCIEILPGCMPKVLGWVTEKIRQPLIAGGLVCDEEDARNAINAGVVALSTTN TGVWTLAKKLL >gi|296494735|gb|ADTN01000003.1| GENE 9 9968 - 10228 236 86 aa, chain - ## HITS:1 COG:ygcO KEGG:ns NR:ns ## COG: ygcO COG2440 # Protein_GI_number: 16130674 # Func_class: C Energy production and conversion # Function: Ferredoxin-like protein # Organism: Escherichia coli K12 # 1 86 13 98 98 177 100.0 6e-45 MSVARNLWRVADAPHIVPADSVERQTAERLINACPAGLFSLTPEGNLRIDYRSCLECGTC RLLCDESTLQQWRYPPSGFGITYRFG >gi|296494735|gb|ADTN01000003.1| GENE 10 10219 - 11490 711 423 aa, chain - ## HITS:1 COG:ygcN KEGG:ns NR:ns ## COG: ygcN COG0644 # Protein_GI_number: 16130673 # Func_class: C Energy production and conversion # Function: Dehydrogenases (flavoproteins) # Organism: Escherichia coli K12 # 1 423 11 433 433 831 99.0 0 MEDDCDIIIIGAGIAGTACALRCARAGLSVLLLERAEIPGSKNLSGGRLYTHALAELLPQ FHLTAPLERRITHESLSLLTPDGVTTFSSLQPGGESWSVLRARFDPWLVAEAEKEGVECI PGATVDALYEENGRVCGVICGDDILRARYVVLAEGANSVLAERHGLVARPAGEAMALGIK EVLSLETSAIEERFHLENNEGAALLFSGRICDDLPGGAFLYTNQQTLSLGIVCPLSSLTQ SRVPASELLTRFKAHPAVRPLIKNTESLEYGTHLVPEGGLHSMPVQYAGNGWLLVGDALR SCVNTGISVRGMDMALTGAQAAAQTLISACQHREPQNLFPLYHHNVERSLLWDVLQRYQH VPALLQRPGWYRTWPALMQDISRDLWDQGDKPVPPLRQLFWHHLRRHGLWHLAGDVIRSL RCL >gi|296494735|gb|ADTN01000003.1| GENE 11 11568 - 11930 397 120 aa, chain - ## HITS:1 COG:ECs3620 KEGG:ns NR:ns ## COG: ECs3620 COG0720 # Protein_GI_number: 15832874 # Func_class: H Coenzyme transport and metabolism # Function: 6-pyruvoyl-tetrahydropterin synthase # Organism: Escherichia coli O157:H7 # 1 120 2 121 121 251 100.0 2e-67 MSTTLFKDFTFEAAHRLPHVPEGHKCGRLHGHSFMVRLEITGEVDPHTGWIIDFAELKAA FKPTYERLDHHYLNDIPGLENPTSEVLAKWIWDQVKPVVPLLSAVMVKETCTAGCIYRGE >gi|296494735|gb|ADTN01000003.1| GENE 12 12249 - 14048 1824 599 aa, chain + ## HITS:1 COG:cysJ KEGG:ns NR:ns 
## COG: cysJ COG0369 # Protein_GI_number: 16130671 # Func_class: P Inorganic ion transport and metabolism # Function: Sulfite reductase, alpha subunit (flavoprotein) # Organism: Escherichia coli K12 # 1 599 1 599 599 1155 99.0 0 MTTQVPPSALLPLNPEQLARLQAATTDLTPTQLAWVSGYFWGVLNQQPAALAATPAPAAE MPGITIISASQTGNARRVAEALRDDLLAAKLNVKLVNAGDYKFKQIASEKLLIVVTSTQG EGEPPEEAVALHKFLFSKKAPKLENTAFAVFSLGDSSYEFFCQSGKDFDSKLAELGGERL LDRVDADVEYQAAASEWRARVVDALKSRAPVAAPSQSVATGAVNEIHTSPYSKDAPLVAS LSVNQKITGRNSEKDVRHIEIDLGDSGLRYQPGDALGVWYQNDPALVKELVELLWLKGDE PVTVEGKTLPLNEALQWHFELTVNTANIVENYATLTRSETLLPLVGDKAKLQHYAATTPI VDMVRFSPAQLDAEALINLLRPLTPRLYSIASSQAEVENEVHVTVGVVRYDVEGRARAGG ASSFLADRVEEEGEVRVFIEHNDNFRLPANPETPVIMIGPGTGIAPFRAFMQQRAADEAP GKNWLFFGNPHFTEDFLYQVEWQRYVKEGVLTRIDLAWSRDQKEKVYVQDKLREQGAELW RWINDGAHIYVCGDANRMAKDVEQALLEVIAEFGGMDTEAADEFLSELRVERRYQRDVY >gi|296494735|gb|ADTN01000003.1| GENE 13 14048 - 15760 1873 570 aa, chain + ## HITS:1 COG:ECs3618 KEGG:ns NR:ns ## COG: ECs3618 COG0155 # Protein_GI_number: 15832872 # Func_class: P Inorganic ion transport and metabolism # Function: Sulfite reductase, beta subunit (hemoprotein) # Organism: Escherichia coli O157:H7 # 1 570 1 570 570 1183 99.0 0 MSEKHPGPLVVEGKLTDAERMKLESNYLRGTIAEDLNDGLTGGFKGDNFLLIRFHGMYQQ DDRDIRAERAEQKLEPRHAMLLRCRLPGGVITTKQWQAIDKFAGENTIYGSIRLTNRQTF QFHGILKKNVKPVHQMLHSVGLDALATANDMNRNVLCTSNPYESQLHAEAYEWAKKISEH LLPRTRAYAEIWLDQEKVATTDEEPILGQTYLPRKFKTTVVIPPQNDIDLHANDMNFVAI AENGKLVGFNLLVGGGLSIEHGNKKTYARTASEFGYLPLEHTLAVAEAVVTTQRDWGNRT DRKNAKTKYTLERVGVETFKAEVERRAGIKFEPIRPYEFTGRGDRIGWVKGIDDNWHLTL FIENGRILDYPARPLKTGLLEIAKIHKGDFRITANQNLIIAGVPESEKAKIEKIAKESGL MNAVTPQRENSMACVSFPTCPLAMAEAERFLPSFIDNIDNLMAKHGVSDEHIVMRVTGCP NGCGRAMLAEVGLVGKAPGRYNLHLGGNRIGTRIPRMYKENITEPEILASLDELIGRWAK EREAGEGFGDFTVRAGIIRPVLDPARDLWD >gi|296494735|gb|ADTN01000003.1| GENE 14 15834 - 16568 869 244 aa, chain + ## HITS:1 COG:cysH KEGG:ns NR:ns ## COG: cysH COG0175 # Protein_GI_number: 16130669 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport 
and metabolism # Function: 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes # Organism: Escherichia coli K12 # 1 244 1 244 244 491 99.0 1e-139 MSKLDLNALNELPKVDRILALAETNAQLEKLDAEGRVAWALDNLPGEYVLSSSFGIQAAV SLHLVNQIRPDIPVILTDTGYLFPETYRFIDELTDKLKLNLKVYRATESAAWQEARYGKL WEQGVEGIEKYNDINKVEPMNRALKELNAQTWFAGLRREQSGSRANLPVLAIQRGVFKVL PIIDWDNRTIYQYLQKHGLKYHPLWDEGYLSVGDTHTTRKWEPGMSEEETRFFGLKRECG LHEG >gi|296494735|gb|ADTN01000003.1| GENE 15 16833 - 16985 154 50 aa, chain + ## HITS:1 COG:no KEGG:G2583_3410 NR:ns ## KEGG: G2583_3410 # Name: small # Def: small toxic membrane polypeptide # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 50 46 95 95 91 98.0 1e-17 MLTKYALVAIIVLCFTVLGFTLMVGDSLCELSIRERGMEFKAVLAYESKK Prediction of potential genes in microbial genomes Time: Sun May 15 23:02:10 2011 Seq name: gi|296494734|gb|ADTN01000004.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont8.1, whole genome shotgun sequence Length of sequence - 1206 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + LSU_RRNA 1 - 1206 99.0 # CP000800 [D:227600..230523] # 23S ribosomal RNA # Escherichia coli E24377A # Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Escherichia. Prediction of potential genes in microbial genomes Time: Sun May 15 23:02:11 2011 Seq name: gi|296494733|gb|ADTN01000005.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont9.1, whole genome shotgun sequence Length of sequence - 3274 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 92 - 140 13.1 1 1 Op 1 3/0.000 - CDS 158 - 1312 868 ## COG1649 Uncharacterized protein conserved in bacteria - Prom 1515 - 1574 1.9 - Term 1548 - 1582 5.2 2 1 Op 2 . 
- CDS 1608 - 3143 1625 ## COG0531 Amino acid transporters Predicted protein(s) >gi|296494733|gb|ADTN01000005.1| GENE 1 158 - 1312 868 384 aa, chain - ## HITS:1 COG:ECs2096 KEGG:ns NR:ns ## COG: ECs2096 COG1649 # Protein_GI_number: 15831350 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 384 56 439 439 759 99.0 0 MRGIWLATVSRLDWPPVSSVNISNPTSRARVQQQAMIDKLDHLQRLGINTVFFQVKPDGT ALWPSKILPWSDLMTGKIGENPGYDPLQFMLDEAHKRGMKVHAWFNPYRVSVNTKPSTIR ELNSTLSQQPASVYVQHRDWIRTSGDRFVLDPGIPEVQDWITSIVAEVVSRYPVDGVQFD DYFYTESPGSRLNDNETYRKYGGAFASKADWRRNNTQQLIAKVSHTIKSIKPGVEFGVSP AGVWRNRSHDPLGSDTRGAAAYDESYADTRRWVEQGLLDYIAPQIYWPFSRSAARYDVLA KWWADVVKPTRTRLYIGIAFYKVGEPSKIEPDWMINGGVPELKKQLDLNDAVPEISGTIL FREDYLNKPQTQQAVSYLQSRWGS >gi|296494733|gb|ADTN01000005.1| GENE 2 1608 - 3143 1625 511 aa, chain - ## HITS:1 COG:xasA KEGG:ns NR:ns ## COG: xasA COG0531 # Protein_GI_number: 16129451 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 511 1 511 511 895 100.0 0 MATSVQTGKAKQLTLLGFFAITASMVMAVYEYPTFATSGFSLVFFLLLGGILWFIPVGLC AAEMATVDGWEEGGVFAWVSNTLGPRWGFAAISFGYLQIAIGFIPMLYFVLGALSYILKW PALNEDPITKTIAALIILWALALTQFGGTKYTARIAKVGFFAGILLPAFILIALAAIYLH SGAPVAIEMDSKTFFPDFSKVGTLVVFVAFILSYMGVEASATHVNEMSNPGRDYPLAMLL LMVAAICLSSVGGLSIAMVIPGNEINLSAGVMQTFTVLMSHVAPEIEWTVRVISALLLLG VLAEIASWIVGPSRGMYVTAQKNLLPAAFAKMNKNGVPVTLVISQLVITSIALIILTNTG GGNNMSFLIALALTVVIYLCAYFMLFIGYIVLVLKHPDLKRTFNIPGGKGVKLVVAIVGL LTSIMAFIVSFLPPDNIQGDSTDMYVELLVVSFLVVLALPFILYAVHDRKGKANTGVTLE PINSQNAPKGHFFLHPRARSPHYIVMNDKKH Prediction of potential genes in microbial genomes Time: Sun May 15 23:02:12 2011 Seq name: gi|296494732|gb|ADTN01000006.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont12.1, whole genome shotgun sequence Length of sequence - 1721 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 
+ LSU_RRNA 1 - 1695 99.0 # CP000946 [D:468076..471061] # 23S ribosomal RNA # Escherichia coli ATCC 8739 # Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Escherichia. Prediction of potential genes in microbial genomes Time: Sun May 15 23:02:14 2011 Seq name: gi|296494731|gb|ADTN01000007.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont13.1, whole genome shotgun sequence Length of sequence - 5912 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 428 - 463 -0.7 1 1 Op 1 2/0.000 - CDS 649 - 1629 254 ## COG2801 Transposase and inactivated derivatives - Term 1648 - 1689 2.5 2 1 Op 2 . - CDS 1694 - 2800 682 ## COG3055 Uncharacterized protein conserved in bacteria 3 1 Op 3 . - CDS 2820 - 3536 416 ## G2583_5112 hypothetical protein - Prom 3648 - 3707 6.1 4 2 Tu 1 . + CDS 5031 - 5910 443 ## COG3039 Transposase and inactivated derivatives, IS5 family Predicted protein(s) >gi|296494731|gb|ADTN01000007.1| GENE 1 649 - 1629 254 326 aa, chain - ## HITS:1 COG:yjhS KEGG:ns NR:ns ## COG: yjhS COG2801 # Protein_GI_number: 16132130 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 1 326 1 326 326 677 100.0 0 MNAIISPDYYYVLTVAGQSNAMAYGEGLPLPDREDAPHPRIKQLARFAHTHPGGPPCHFN DIIPLTHCPHDVQDMQGYHHPLATNHQTQYGTVGQALHIARKLLPFIPDNAGILIVPCCR GGSAFTAGSEGTYSERHGASHDACRWGTDTPLYQDLVSRTRAALAKNPQNKFLGACWMQG EFDLMTSDYASHPQHFNHMVEAFRRDLKQYHSQLNNITDAPWFCGDTTWYWKENFPHSYE AIYGNYQNNVLANIIFVDFQQQGERGLTNAPDEDPDDLSTGYYGSAYRSPENWTTALRSS HFSTAARRGIISDRFVEAILQFWRER >gi|296494731|gb|ADTN01000007.1| GENE 2 1694 - 2800 682 368 aa, chain - ## HITS:1 COG:yjhT KEGG:ns NR:ns ## COG: yjhT COG3055 # Protein_GI_number: 16132131 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 368 37 404 404 706 
100.0 0 MNKTITALAIMMASFAANASVLPETPVPFKSGTGAIDNDTVYIGLGSAGTAWYKLDTQAK DKKWTALAAFPGGPRDQATSAFIDGNLYVFGGIGKNSEGLTQVFNDVHKYNPKTNSWVKL MSHAPMGMAGHVTFVHNGKAYVTGGVNQNIFNGYFEDLNEAGKDSTAIDKINAHYFDKKA EDYFFNKFLLSFDPSTQQWSYAGESPWYGTAGAAVVNKGDKTWLINGEAKPGLRTDAVFE LDFTGNNLKWNKLAPVSSPDGVAGGFAGISNDSLIFAGGAGFKGSRENYQNGKNYAHEGL KKSYSTDIHLWHNGKWDKSGELSQGRAYGVSLPWNNSLLIIGGETAGGKAVTDSVLITVK DNKVTVQN >gi|296494731|gb|ADTN01000007.1| GENE 3 2820 - 3536 416 238 aa, chain - ## HITS:1 COG:no KEGG:G2583_5112 NR:ns ## KEGG: G2583_5112 # Name: nanC # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 238 4 241 241 454 100.0 1e-126 MKKAKILSGVLLLCFSSPLISQAATLDVRGGYRSGSHAYETRLKVSEGWQNGWWASMESN TWNTIHDNKKENAALNDVQVEVNYAIKLDDQWTVRPGMLTHFSSNGTRYGPYVKLSWDAT KDLNFGIRYRYDWKAYRQQDLSGDMSRDNVHRWDGYVTYHINSDFTFAWQTTLYSKQNDY RYANHKKWATENAFVLQYHMTPDITPYIEYDYLDRQGVYNGRDNLSENSYRIGVSFKL >gi|296494731|gb|ADTN01000007.1| GENE 4 5031 - 5910 443 293 aa, chain + ## HITS:1 COG:yi52_g6 KEGG:ns NR:ns ## COG: yi52_g6 COG3039 # Protein_GI_number: 16129935 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives, IS5 family # Organism: Escherichia coli K12 # 1 293 13 305 338 578 99.0 1e-165 MSHQLTFADSEFSTKRRQTRKEIFLSRMEQILPWQNMVEVIEPFYPKAGNGRRPYPLETM LRIHCMQHWYNLSDGAMEDALYEIASMRLFARLSLDSALPDRTTIMNFRHLLEQHQLARQ LFKTINRWLAEAGVMMTQGTLVDATIIEAPSSTKNKEQQRDPEMHQTKKGNQWHFGMKAH IGVDAKSGLTHSLVTTAANEHDLNQLGNLLHGEEQFVSADAGYQGAPQREELAEVDVDWL IAERPGKVRTLKQHPRKNKTAINIESMKASIRAKVEHPFRIIKRQFGFVKARY Prediction of potential genes in microbial genomes Time: Sun May 15 23:02:18 2011 Seq name: gi|296494730|gb|ADTN01000008.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont15.1, whole genome shotgun sequence Length of sequence - 1413 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 39 - 1079 259 ## COG3547 Transposase and inactivated derivatives - Prom 1274 - 1333 3.6 Predicted protein(s) >gi|296494730|gb|ADTN01000008.1| GENE 1 39 - 1079 259 346 aa, chain - ## HITS:1 COG:Cgl1010 KEGG:ns NR:ns ## COG: Cgl1010 COG3547 # Protein_GI_number: 19552260 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Corynebacterium glutamicum # 14 331 7 327 402 190 38.0 4e-48 MTESSDYESVQVFIGVDVGKDTHHAVAINRSGKRLFDKALPNDENKLRSLISDLKQHGQI LLVVDQPATIGALPVAVARSEGVLVGYLPGLAMRRIADLHAGEAKTDARDAAIIAEAART LPHALRTLKLADEQIAELSMLCGFDDDLAAQTTQASNRIRGLLTQIHPALERVLGPRLDH PAVLDLLQRYPSPEKLASLGEKKLAAQLCKLAPRLGKRLAADIAQALAEQTVVVPGTNAA AVVLPRLALQLITLRKQRDEVALAVEQRVLAHPLYPVLTSMPGVGVRTAARLLTEVACRA FASAAHLAAYAGLAPVTRRSGSSIRGEHPSRRDFLYPAGVITCLTT Prediction of potential genes in microbial genomes Time: Sun May 15 23:02:22 2011 Seq name: gi|296494729|gb|ADTN01000009.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont18.1, whole genome shotgun sequence Length of sequence - 8753 bp Number of predicted genes - 9, with homology - 8 Number of transcription units - 3, operones - 1 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 60 - 128 61 ## - Prom 156 - 215 7.3 + Prom 178 - 237 6.8 2 2 Op 1 6/0.000 + CDS 265 - 786 402 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 3 2 Op 2 2/0.000 + CDS 783 - 1736 670 ## COG3712 Fe2+-dicitrate sensor, membrane component 4 2 Op 3 . + CDS 1823 - 4147 2342 ## COG4772 Outer membrane receptor for Fe3+-dicitrate + Term 4157 - 4184 1.5 5 2 Op 4 3/0.000 + CDS 4192 - 5094 878 ## COG4594 ABC-type Fe3+-citrate transport system, periplasmic component 6 2 Op 5 20/0.000 + CDS 5112 - 6089 722 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 7 2 Op 6 35/0.000 + CDS 6086 - 7042 845 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 8 2 Op 7 . 
+ CDS 7043 - 7810 228 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Term 8144 - 8173 2.5 9 3 Tu 1 . - CDS 8367 - 8624 183 ## EC55989_4958 conserved hypothetical protein; KpLE2 phage-like element Predicted protein(s) >gi|296494729|gb|ADTN01000009.1| GENE 1 60 - 128 61 22 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGWSLCFWHVSVRMTHDVVCYY >gi|296494729|gb|ADTN01000009.1| GENE 2 265 - 786 402 173 aa, chain + ## HITS:1 COG:fecI KEGG:ns NR:ns ## COG: fecI COG1595 # Protein_GI_number: 16132114 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Escherichia coli K12 # 1 173 1 173 173 326 100.0 1e-89 MSDRATTTASLTFESLYGTHHGWLKSWLTRKLQSAFDADDIAQDTFLRVMVSETLSTIRD PRSFLCTIAKRVMVDLFRRNALEKAYLEMLALMPEGGAPSPEERESQLETLQLLDSMLDG LNGKTREAFLLSQLDGLTYSEIAHKLGVSISSVKKYVAKAVEHCLLFRLEYGL >gi|296494729|gb|ADTN01000009.1| GENE 3 783 - 1736 670 317 aa, chain + ## HITS:1 COG:fecR KEGG:ns NR:ns ## COG: fecR COG3712 # Protein_GI_number: 16132113 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Escherichia coli K12 # 1 317 1 317 317 567 100.0 1e-161 MNPLLTDSRRQALRSASHWYAVLSGERVSPQQEARWQQWYEQDQDNQWAWQQVENLRNQL GGVPGDVASRALHDTRLTRRHVMKGLLLLLGAGGGWQLWQSETGEGLRADYRTAKGTVSR QQLEDGSLLTLNTQSAADVRFDAHQRTVRLWYGEIAITTAKDALQRPFRVLTRQGQLTAL GTEFTVRQQDNFTQLDVQQHAVEVLLASAPAQKRIVNAGESLQFSASEFGAVKPLDDEST SWTKDILSFSDKPLGEVIATLTRYRNGVLRCDPAVAGLRLSGTFPLKNTDAILNVIAQTL PVKIQSITRYWINISPL >gi|296494729|gb|ADTN01000009.1| GENE 4 1823 - 4147 2342 774 aa, chain + ## HITS:1 COG:fecA KEGG:ns NR:ns ## COG: fecA COG4772 # Protein_GI_number: 16132112 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for Fe3+-dicitrate # Organism: Escherichia coli K12 # 1 774 1 774 774 1541 100.0 0 MTPLRVFRKTTPLVNTIRLSLLPLAGLSFSAFAAQVNIAPGSLDKALNQYAAHSGFTLSV 
DASLTRGKQSNGLHGDYDVESGLQQLLDGSGLQVKPLGNNSWTLEPAPAPKEDALTVVGD WLGDARENDVFEHAGARDVIRREDFAKTGATTMREVLNRIPGVSAPENNGTGSHDLAMNF GIRGLNPRLASRSTVLMDGIPVPFAPYGQPQLSLAPVSLGNMDAIDVVRGGGAVRYGPQS VGGVVNFVTRAIPQDFGIEAGVEGQLSPTSSQNNPKETHNLMVGGTADNGFGTALLYSGT RGSDWREHSATRIDDLMLKSKYAPDEVHTFNSLLQYYDGEADMPGGLSRADYDADRWQST RPYDRFWGRRKLASLGYQFQPDSQHKFNIQGFYTQTLRSGYLEQGKRITLSPRNYWVRGI EPRYSQIFMIGPSAHEVGVGYRYLNESTHEMRYYTATSSGQLPSGSSPYDRDTRSGTEAH AWYLDDKIDIGNWTITPGMRFEHIESYQNNAITGTHEEVSYNAPLPALNVLYHLTDSWNL YANTEGSFGTVQYSQIGKAVQSGNVEPEKARTWELGTRYDDGALTAEMGLFLINFNNQYD SNQTNDTVTARGKTRHTGLETQARYDLGTLTPTLDNVSIYASYAYVNAEIREKGDTYGNL VPFSPKHKGTLGVDYKPGNWTFNLNSDFQSSQFADNANTVKESADGSTGRIPGFMLWGAR VAYDFGPQMADLNLAFGVKNIFDQDYFIRSYDDNNKGIYAGQPRTLYMQGSLKF >gi|296494729|gb|ADTN01000009.1| GENE 5 4192 - 5094 878 300 aa, chain + ## HITS:1 COG:fecB KEGG:ns NR:ns ## COG: fecB COG4594 # Protein_GI_number: 16132111 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-citrate transport system, periplasmic component # Organism: Escherichia coli K12 # 1 300 3 302 302 552 100.0 1e-157 MLAFIRFLFAGLLLVISHAFAATVQDEHGTFTLEKTPQRIVVLELSFADALAAVDVIPIG IADDNDAKRILPEVRAHLKPWQSVGTRAQPSLEAIAALKPDLIIADSSRHAGVYIALQQI APVLLLKSRNETYAENLQSAAIIGEMVGKKREMQARLEQHKERMAQWASQLPKGTRVAFG TSREQQFNLHTQETWTGSVLASLGLNVPAAMAGASMPSIGLEQLLAVNPAWLLVAHYREE SIVKRWQQDPLWQMLTAAQKQQVASVDSNTWARMRGIFAAERIAADTVKIFHHQPLTVVK >gi|296494729|gb|ADTN01000009.1| GENE 6 5112 - 6089 722 325 aa, chain + ## HITS:1 COG:fecC KEGG:ns NR:ns ## COG: fecC COG0609 # Protein_GI_number: 16132110 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Escherichia coli K12 # 1 325 8 332 332 496 99.0 1e-140 MLLWGLPVAALIIIFWLSLFCYSAIPVSGADATRALLPGHTPTLPEALVQNLRLPRSLVA VLIGASLALAGTLLQTLTHNPMASPSLLGINSGAALAMALTSALSPTPIAGYSLSFIAAC GGGVSWLLVMTAGGGFRHTHDRNKLILAGIALSAFCMGLTRITLLLAEDHAYGIFYWLAG GVSHARWQDVWQLLPVVVTAVPVVLLLANQLNLLNLSDSTAHTLGVNLTRLRLVINMLVL 
LLVGACVSVAGPVAFIGLLVPHLARFWAGFDQRNVLPVSMLLGATLMLLADVLARALAFP GDLPAGAVLALIGSPCFVWLVRRRG >gi|296494729|gb|ADTN01000009.1| GENE 7 6086 - 7042 845 318 aa, chain + ## HITS:1 COG:fecD KEGG:ns NR:ns ## COG: fecD COG0609 # Protein_GI_number: 16132109 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Escherichia coli K12 # 1 318 1 318 318 495 100.0 1e-140 MKIALVIFITLALAGCALLSLHMGVIPVPWRALLTDWQAGHEHYYVLMEYRLPRLLLALF VGAALAVAGVLIQGIVRNPLASPDILGVNHAASLASVGALLLMPSLPVMVLPLLAFAGGM AGLILLKMLAKTHQPMKLALTGVALSACWASLTDYLMLSRPQDVNNALLWLTGSLWGRDW SFVKIAIPLMILFLPLSLSFCRDLDLLALGDARATTLGVSVPHTRFWALLLAVAMTSTGV AACGPISFIGLVVPHMMRSITGGRHRRLLPVSALTGALLLVVADLLARIIHPPLELPVGV LTAIIGAPWFVWLLVRMR >gi|296494729|gb|ADTN01000009.1| GENE 8 7043 - 7810 228 255 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 225 2 226 245 92 25 1e-18 MTLRTENLTVSYGTDKVLNDVSLSLPTGKITALIGPNGCGKSTLLNCFSRLLMPQSGTVF LGDNPINMLSSRQLARRLSLLPQHHLTPEGITVQELVSYGRNPWLSLWGRLSAEDNARVN VAMNQTRINHLAVRRLTELSGGQRQRAFLAMVLAQNTPVVLLDEPTTYLDINHQVDLMRL MGELRTQGKTVVAVLHDLNQASRYCDQLVVMANGHVMAQGTPEEVMTPGLLRTVFSVEAE IHPEPVSGRPMCLMR >gi|296494729|gb|ADTN01000009.1| GENE 9 8367 - 8624 183 85 aa, chain - ## HITS:1 COG:no KEGG:EC55989_4958 NR:ns ## KEGG: EC55989_4958 # Name: yjhV # Def: conserved hypothetical protein; KpLE2 phage-like element # Organism: E.coli_55989 # Pathway: not_defined # 1 85 53 137 137 184 100.0 1e-45 MTLVNDTGFDPVFSGSIAESWRQQPCTPSYCCDWEAATMLRAFPLAKKGEGRARLPSLYA SFGKLGETPTHEDIIDNNRSINWPV Prediction of potential genes in microbial genomes Time: Sun May 15 23:02:28 2011 Seq name: gi|296494728|gb|ADTN01000010.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont19.1, whole genome shotgun sequence Length of sequence - 2798 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start 
End Score pairs(N/Pv) - Term 324 - 377 10.0 1 1 Op 1 . - CDS 577 - 999 213 ## KPN_pKPN7p10266 Excl1 protein 2 1 Op 2 . - CDS 999 - 1190 205 ## KPN_pKPN6p09258 RNA one modulator-like protein 3 1 Op 3 . - CDS 1207 - 1308 75 ## - Prom 1548 - 1607 5.8 - Term 1606 - 1646 1.3 4 2 Tu 1 . - CDS 1669 - 1818 151 ## SNSL254_B0005 hypothetical protein - Prom 1889 - 1948 2.7 Predicted protein(s) >gi|296494728|gb|ADTN01000010.1| GENE 1 577 - 999 213 140 aa, chain - ## HITS:1 COG:no KEGG:KPN_pKPN7p10266 NR:ns ## KEGG: KPN_pKPN7p10266 # Name: excl1 # Def: Excl1 protein # Organism: K.pneumoniae # Pathway: not_defined # 1 140 1 133 133 87 41.0 1e-16 MARVSISEAARLVKVSRPTIYKMINSGKLSYTSVVKHGKSIKVIDTSELIRVFGSLDGVI DTVKYDVKSDAESTDVNSVVLHDLQYRIALLEAENDGLKGAVKARDEHIDSLRQAMQLLE HKHEPSSPPHSSWWKFWKKS >gi|296494728|gb|ADTN01000010.1| GENE 2 999 - 1190 205 63 aa, chain - ## HITS:1 COG:no KEGG:KPN_pKPN6p09258 NR:ns ## KEGG: KPN_pKPN6p09258 # Name: rom # Def: RNA one modulator-like protein # Organism: K.pneumoniae # Pathway: not_defined # 1 63 1 63 63 73 100.0 2e-12 MTKQQKTALNMAKFIQNQSLLLLEKLNELDLDAEADLCEKLHDDAEHLFRTLSSRLDALQ DGN >gi|296494728|gb|ADTN01000010.1| GENE 3 1207 - 1308 75 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTTDKLQNNYKITSDPFSEPFVHPTPLPVYIEH >gi|296494728|gb|ADTN01000010.1| GENE 4 1669 - 1818 151 49 aa, chain - ## HITS:1 COG:no KEGG:SNSL254_B0005 NR:ns ## KEGG: SNSL254_B0005 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Newport # Pathway: not_defined # 1 49 1 49 49 77 93.0 1e-13 MPSEHDEQRDAADTQLPGEYDLFDLMAKVTEENLHAPVDSGPPVGREAL Prediction of potential genes in microbial genomes Time: Sun May 15 23:02:38 2011 Seq name: gi|296494727|gb|ADTN01000011.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont26.1, whole genome shotgun sequence Length of sequence - 896 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 297 - 356 2.3 1 1 Tu 1 
. + CDS 384 - 890 363 ## ETA_32360 replication initiation protein A Predicted protein(s) >gi|296494727|gb|ADTN01000011.1| GENE 1 384 - 890 363 168 aa, chain + ## HITS:1 COG:no KEGG:ETA_32360 NR:ns ## KEGG: ETA_32360 # Name: not_defined # Def: replication initiation protein A # Organism: E.tasmaniensis # Pathway: not_defined # 1 162 49 210 291 318 92.0 5e-86 MFWLGLDVDRVGAAIDWSDRNAPAPTLTITNPENGHAHLLYALETSICTAPDGRMKPLKY AAAVENALRKKLEADAGYSGLICKNPNHGHWKIAVWHPELYTLDWLADSLDLNVTNDKEI VADYGLGRNCTLFDKTRKWAYRAIRQGWPEYEQWRQACYERAVMTPTY Prediction of potential genes in microbial genomes Time: Sun May 15 23:02:41 2011 Seq name: gi|296494726|gb|ADTN01000012.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont33.1, whole genome shotgun sequence Length of sequence - 1343 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 11 - 151 78 ## APECO1_O1CoBM122 hypothetical protein 2 1 Op 2 . 
- CDS 174 - 1286 178 ## COG3385 FOG: Transposase and inactivated derivatives Predicted protein(s) >gi|296494726|gb|ADTN01000012.1| GENE 1 11 - 151 78 46 aa, chain - ## HITS:1 COG:no KEGG:APECO1_O1CoBM122 NR:ns ## KEGG: APECO1_O1CoBM122 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 46 17 62 62 72 100.0 3e-12 MVIWSLQVAIRGTVSLTAYKTQLKNARHRLNEAPRRRILQMVQPLS >gi|296494726|gb|ADTN01000012.1| GENE 2 174 - 1286 178 370 aa, chain - ## HITS:1 COG:yi81 KEGG:ns NR:ns ## COG: yi81 COG3385 # Protein_GI_number: 16130326 # Func_class: L Replication, recombination and repair # Function: FOG: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 1 370 3 372 372 722 99.0 0 MNYSHDNWSAILAHIGKPEELDTSARNAGALTRRREIRDAATLLRLGLASAPGGMSLREV TAWAQLHDVATLSDVALLKRLRNAADWFGILAAQTLAVRAAVTGCTSGKRLRLVDGTAIS APGGGSAEWRLHMGYDPHTCQFTDFELTDSRDAERLDRFAQTADEIRIADRGFGSRPECI RSLAFGEADYIVRVHWRGLRWLTAEGMRFDMMGFLRGLDCGKNGETTVMIGNSGNKKAGA PFPARLIAVSLPPEKALISKTRLLSENRRKGRVVQAETLEAAGHVLLLTSLPEDEYSAEQ VADCYRLRWQIELAFKRLKSLLHLDALRAKEPELAKAWIFANLLAAFLIDDIIQPSLDFP PRSAGSEKKN Prediction of potential genes in microbial genomes Time: Sun May 15 23:02:45 2011 Seq name: gi|296494725|gb|ADTN01000013.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont35.1, whole genome shotgun sequence Length of sequence - 579 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - SSU_RRNA 1 - 579 100.0 # GQ048523 [D:1..1362] # 16S ribosomal RNA # uncultured bacterium # Bacteria; environmental samples. 
Prediction of potential genes in microbial genomes Time: Sun May 15 23:02:52 2011 Seq name: gi|296494724|gb|ADTN01000014.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont39.1, whole genome shotgun sequence Length of sequence - 21532 bp Number of predicted genes - 19, with homology - 18 Number of transcription units - 14, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 14 - 298 344 ## COG0582 Integrase - Prom 437 - 496 1.8 + Prom 134 - 193 4.5 2 2 Tu 1 . + CDS 365 - 439 59 ## + Term 601 - 628 0.1 - TRNA 631 - 715 79.1 # Leu CAA 0 0 3 3 Tu 1 . + CDS 911 - 1930 1190 ## COG1064 Zn-dependent alcohol dehydrogenases + Term 1940 - 1976 3.0 4 4 Tu 1 . - CDS 1934 - 2497 409 ## COG3265 Gluconate kinase - Prom 2523 - 2582 3.7 + Prom 2562 - 2621 12.9 5 5 Op 1 10/0.125 + CDS 2714 - 3745 856 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 6 5 Op 2 5/0.250 + CDS 3769 - 4533 803 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) + Term 4548 - 4578 3.0 7 6 Op 1 2/0.750 + CDS 4595 - 5914 1294 ## COG2610 H+/gluconate symporter and related permeases 8 6 Op 2 1/1.000 + CDS 5981 - 6979 796 ## COG1609 Transcriptional regulators 9 6 Op 3 . + CDS 7057 - 8559 1427 ## COG0433 Predicted ATPase + Term 8683 - 8718 6.0 - Term 8671 - 8706 6.0 10 7 Op 1 22/0.000 - CDS 8720 - 9802 1461 ## COG0795 Predicted permeases 11 7 Op 2 . - CDS 9802 - 10881 976 ## COG0795 Predicted permeases - Prom 10930 - 10989 3.8 + Prom 10951 - 11010 5.4 12 8 Tu 1 12/0.125 + CDS 11169 - 12680 1673 ## COG0260 Leucyl aminopeptidase + Term 12771 - 12807 -0.7 + Prom 12755 - 12814 2.7 13 9 Op 1 5/0.250 + CDS 12936 - 13379 480 ## COG2927 DNA polymerase III, chi subunit 14 9 Op 2 . + CDS 13379 - 16234 3336 ## COG0525 Valyl-tRNA synthetase + Term 16256 - 16293 2.4 15 10 Tu 1 . - CDS 16290 - 17486 403 ## COG4269 Predicted membrane protein - Prom 17528 - 17587 7.3 + Prom 17470 - 17529 10.4 16 11 Tu 1 . 
+ CDS 17679 - 18182 546 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Term 18188 - 18225 2.4 - Term 18176 - 18213 1.6 17 12 Tu 1 . - CDS 18228 - 18644 737 ## COG3076 Uncharacterized protein conserved in bacteria - Prom 18743 - 18802 4.2 + Prom 18713 - 18772 7.8 18 13 Tu 1 . + CDS 18806 - 19810 1111 ## COG0078 Ornithine carbamoyltransferase + Term 19827 - 19871 2.9 - Term 19815 - 19857 6.7 19 14 Tu 1 . - CDS 19866 - 21461 633 ## ECO26_5422 hypothetical protein Predicted protein(s) >gi|296494724|gb|ADTN01000014.1| GENE 1 14 - 298 344 94 aa, chain - ## HITS:1 COG:intB KEGG:ns NR:ns ## COG: intB COG0582 # Protein_GI_number: 16132092 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Escherichia coli K12 # 1 90 25 114 396 178 100.0 2e-45 MLALGVYPEITLADARVRRDEARKLLANGVDPGDKKKNDKVEQSKARTFKEVAIEWHGTN KKWSEDHAHRVLKSLEDNLFAALGERNIAEGFVE >gi|296494724|gb|ADTN01000014.1| GENE 2 365 - 439 59 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHTATISQLILLIRWLGCSDFDIC >gi|296494724|gb|ADTN01000014.1| GENE 3 911 - 1930 1190 339 aa, chain + ## HITS:1 COG:yjgB KEGG:ns NR:ns ## COG: yjgB COG1064 # Protein_GI_number: 16132091 # Func_class: R General function prediction only # Function: Zn-dependent alcohol dehydrogenases # Organism: Escherichia coli K12 # 1 339 15 353 353 680 100.0 0 MSMIKSYAAKEAGGELEVYEYDPGELRPQDVEVQVDYCGICHSDLSMIDNEWGFSQYPLV AGHEVIGRVVALGSAAQDKGLQVGQRVGIGWTARSCGHCDACISGNQINCEQGAVPTIMN RGGFAEKLRADWQWVIPLPENIDIESAGPLLCGGITVFKPLLMHHITATSRVGVIGIGGL GHIAIKLLHAMGCEVTAFSSNPAKEQEVLAMGADKVVNSRDPQALKALAGQFDLIINTVN VSLDWQPYFEALTYGGNFHTVGAVLTPLSVPAFTLIAGDRSVSGSATGTPYELRKLMRFA ARSKVAPTTELFPMSKINDAIQHVRDGKARYRVVLKADF >gi|296494724|gb|ADTN01000014.1| GENE 4 1934 - 2497 409 187 aa, chain - ## HITS:1 COG:idnK KEGG:ns NR:ns ## COG: idnK COG3265 # Protein_GI_number: 16132090 # Func_class: G Carbohydrate transport and metabolism # Function: Gluconate kinase # Organism: Escherichia coli K12 # 1 187 1 187 187 380 100.0 1e-106 
MAGESFILMGVSGSGKTLIGSKVAALLSAKFIDGDDLHPAKNIDKMSQGIPLSDEDRLPW LERLNDASYSLYKKNETGFIVCSSLKKQYRDILRKGSPHVHFLWLDGDYETILARMQRRA GHFMPVALLKSQFEALERPQADEQDIVRIDINHDIANVTEQCRQAVLAIRQNRICAKEGS ASDQRCE >gi|296494724|gb|ADTN01000014.1| GENE 5 2714 - 3745 856 343 aa, chain + ## HITS:1 COG:idnD KEGG:ns NR:ns ## COG: idnD COG1063 # Protein_GI_number: 16132089 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 343 1 343 343 712 100.0 0 MQVKTQSCVVAGKKTVAVTEQTIDWNNNGTLVQITRGGICGSDLHYYQEGKVGNFMIKAP MVLGHEVIGKVIHSDSSELHEGQTVAINPSKPCGHCKYCIEHNENQCTDMRFFGSAMYFP HVDGGFTRYKMVETSQCVPYPAKADEKVMAFAEPLAVAIHAAHQAGELQGKRVFISGVGP IGCLIVSAVKTLGAAEIVCADVSPRSLSLGKEMGADVLVNPQNDDMDHWKAEKGYFDVSF EVSGHPSSVNTCLEVTRARGVMVQVGMGGAMAEFPMMTLIGKEISLRGSFRFTSEFNTAV SWLANGVINPLPLLSAEYPFTDLEEALRFAGDKTQAAKVQLVF >gi|296494724|gb|ADTN01000014.1| GENE 6 3769 - 4533 803 254 aa, chain + ## HITS:1 COG:idnO KEGG:ns NR:ns ## COG: idnO COG1028 # Protein_GI_number: 16132088 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Escherichia coli K12 # 1 254 1 254 254 510 100.0 1e-144 MNDLFSLAGKNILITGSAQGIGFLLATGLGKYGAQIIINDITAERAELAVEKLHQEGIQA VAAPFNVTHKHEIDAAVEHIEKDIGPIDVLVNNAGIQRRHPFTEFPEQEWNDVIAVNQTA VFLVSQAVTRHMVERKAGKVINICSMQSELGRDTITPYAASKGAVKMLTRGMCVELARHN IQVNGIAPGYFKTEMTKALVEDEAFTAWLCKRTPAARWGDPQELIGAAVFLSSKASDFVN GHLLFVDGGMLVAV >gi|296494724|gb|ADTN01000014.1| GENE 7 4595 - 5914 1294 439 aa, chain + ## HITS:1 COG:idnT KEGG:ns NR:ns ## COG: idnT COG2610 # Protein_GI_number: 16132087 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Escherichia coli K12 # 
1 439 1 439 439 703 100.0 0 MPLIIIAAGVALLLILMIGFKVNGFIALVLVAAVVGFAEGMDAQAVLHSIQNGIGSTLGG LAMILGFGAMLGKLISDTGAAQRIATTLIATFGKKRVQWALVITGLVVGLAMFFEVGFVL LLPLVFTIVASSGLPLLYVGVPMVAALSVTHCFLPPHPGPTAIATIFEANLGTTLLYGFI ITIPTVIVAGPLFSKLLTRFEKAPPEGLFNPHLFSEEEMPSFWNSIFAAVIPVILMAIAA VCEITLPKTNTVRLFFEFVGNPAVALFIAIVIAIFTLGRRNGRTIEQIMDIIGDSIGAIA MIVFIIAGGGAFKQVLVDSGVGHYISHLMTGTTLSPLLMCWTVAALLRIALGSATVAAIT TAGVVLPIINVTHADPALMVLATGAGSVIASHVNDPGFWLFKGYFNLTVGETLRTWTVME TLISIMGLLGVLAINAVLH >gi|296494724|gb|ADTN01000014.1| GENE 8 5981 - 6979 796 332 aa, chain + ## HITS:1 COG:idnR KEGG:ns NR:ns ## COG: idnR COG1609 # Protein_GI_number: 16132086 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 332 1 332 332 660 100.0 0 MRNHRISLQDIATLAGVTKMTVSRYIRSPKKVAKETGERIAKIMEEINYIPNRAPGMLLN AQSYTLGILIPSFQNQLFADILAGIESVTSEHNYQTLIANYNYDRDSEEESVINLLSYNI DGIILSEKYHTIRTVKFLRSATIPVVELMDVQGERLDMEVGFDNRQAAFDMVCTMLEKRV RHKILYLGSKDDTRDEQRYQGYCDAMMLHNLSPLRMNPRAISSIHLGMQLMRDALSANPD LDGVFCTNDDIAMGALLLCRERNLAVPEQISIAGFHGLEIGRQMIPSLASVITPRFDIGR MAAQMLLSKIKNNDHNHNTVDLGYQIYHGNTL >gi|296494724|gb|ADTN01000014.1| GENE 9 7057 - 8559 1427 500 aa, chain + ## HITS:1 COG:yjgR KEGG:ns NR:ns ## COG: yjgR COG0433 # Protein_GI_number: 16132085 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Escherichia coli K12 # 1 500 1 500 500 959 100.0 0 MSEPLLIARTPDTELFLLPGMANRHGLITGATGTGKTVTLQKLAESLSEIGVPVFMADVK GDLTGVAQAGTVSEKLLARLKNIGVNDWQPHANPVVVWDIFGEKGHPVRATVSDLGPLLL ARLLNLNDVQSGVLNIIFRIADDQGLLLLDFKDLRAITQYIGDNAKSFQNQYGNISSASV GAIQRGLLSLEQQGAAHFFGEPMLDIKDWMRTDANGKGVINILSAEKLYQMPKLYAASLL WMLSELYEQLPEAGDLEKPKLVFFFDEAHLLFNDAPQVLLDKIEQVIRLIRSKGVGVWFV SQNPSDIPDNVLGQLGNRVQHALRAFTPKDQKAVKAAAQTMRANPAFDTEKAIQELGTGE ALISFLDAKGSPSVVERAMVIAPCSRMGPVTEDERNGLINHSPVYGKYEDEVDRESAYEM LQKGFQASTEQQNNPPAKGKEVAVDDGILGGLKDILFGTTGPRGGKKDGVVQTMAKSAAR QVTNQIVRGMLGSLLGGRRR >gi|296494724|gb|ADTN01000014.1| GENE 10 8720 - 9802 1461 360 aa, chain - ## HITS:1 
COG:ECs5239 KEGG:ns NR:ns ## COG: ECs5239 COG0795 # Protein_GI_number: 15834493 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Escherichia coli O157:H7 # 1 360 2 361 361 659 100.0 0 MQPFGVLDRYIGKTIFTTIMMTLFMLVSLSGIIKFVDQLKKAGQGSYDALGAGMYTLLSV PKDVQIFFPMAALLGALLGLGMLAQRSELVVMQASGFTRMQVALSVMKTAIPLVLLTMAI GEWVAPQGEQMARNYRAQAMYGGSLLSTQQGLWAKDGNNFVYIERVKGDEELGGISIYAF NENRRLQSVRYAATAKFDPEHKVWRLSQVDESDLTNPKQITGSQTVSGTWKTNLTPDKLG VVALDPDALSISGLHNYVKYLKSSGQDAGRYQLNMWSKIFQPLSVAVMMLMALSFIFGPL RSVPMGVRVVTGISFGFVFYVLDQIFGPLTLVYGIPPIIGALLPSASFFLISLWLLMRKS >gi|296494724|gb|ADTN01000014.1| GENE 11 9802 - 10881 976 359 aa, chain - ## HITS:1 COG:ECs5238 KEGG:ns NR:ns ## COG: ECs5238 COG0795 # Protein_GI_number: 15834492 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Escherichia coli O157:H7 # 1 359 8 366 366 621 99.0 1e-178 MRETLKSQLAILFILLLIFFCQKLVRILGAAVDGDIPANLVLSLLGLGVPEMAQLILPLS LFLGLLMTLGKLYTESEITVMHACGLSKAVLVKAAMILAVFTAIVAAVNVMWAGPWSSRH QDEVLAEAKANPGMAALAQGQFQQATNGSSVLFIESVDGSDFKDVFLAQIRPKGNARPSV VVADSGHLTQLRDGSQVVTLNQGTRFEGTALLRDFRITDFQDYQAIIGHQAVALDPNDTD QMDMRTLWNTDTDRARAELNWRITLVFTVFMMALMVVPLSVVNPRQGRVLSMLPAMLLYL LFFLIQTSLKSNGGKGKLDPTLWMWTVNLIYLALAIVLNLWDTVPVRRLRASFSRKGAV >gi|296494724|gb|ADTN01000014.1| GENE 12 11169 - 12680 1673 503 aa, chain + ## HITS:1 COG:ECs5237 KEGG:ns NR:ns ## COG: ECs5237 COG0260 # Protein_GI_number: 15834491 # Func_class: E Amino acid transport and metabolism # Function: Leucyl aminopeptidase # Organism: Escherichia coli O157:H7 # 1 503 1 503 503 1002 100.0 0 MEFSVKSGSPEKQRSACIVVGVFEPRRLSPIAEQLDKISDGYISALLRRGELEGKPGQTL LLHHVPNVLSERILLIGCGKERELDERQYKQVIQKTINTLNDTGSMEAVCFLTELHVKGR NNYWKVRQAVETAKETLYSFDQLKTNKSEPRRPLRKMVFNVPTRRELTSGERAIQHGLAI AAGIKAAKDLGNMPPNICNAAYLASQARQLADSYSKNVITRVIGEQQMKELGMHSYLAVG QGSQNESLMSVIEYKGNASEDARPIVLVGKGLTFDSGGISIKPSEGMDEMKYDMCGAAAV YGVMRMVAELQLPINVIGVLAGCENMPGGRAYRPGDVLTTMSGQTVEVLNTDAEGRLVLC 
DVLTYVERFEPEAVIDVATLTGACVIALGHHITGLMANHNPLAHELIAASEQSGDRAWRL PLGDEYQEQLESNFADMANIGGRPGGAITAGCFLSRFTRKYNWAHLDIAGTAWRSGKAKG ATGRPVALLAQFLLNRAGFNGEE >gi|296494724|gb|ADTN01000014.1| GENE 13 12936 - 13379 480 147 aa, chain + ## HITS:1 COG:ECs5236 KEGG:ns NR:ns ## COG: ECs5236 COG2927 # Protein_GI_number: 15834490 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, chi subunit # Organism: Escherichia coli O157:H7 # 1 147 1 147 147 290 100.0 9e-79 MKNATFYLLDNDTTVDGLSAVEQLVCEIAAERWRSGKRVLIACEDEKQAYRLDEALWARP AESFVPHNLAGEGPRGGAPVEIAWPQKRSSSPRDILISLRTSFADFATAFTEVVDFVPYE DSLKQLARERYKAYRVAGFNLNTATWK >gi|296494724|gb|ADTN01000014.1| GENE 14 13379 - 16234 3336 951 aa, chain + ## HITS:1 COG:valS KEGG:ns NR:ns ## COG: valS COG0525 # Protein_GI_number: 16132080 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Valyl-tRNA synthetase # Organism: Escherichia coli K12 # 1 951 1 951 951 1936 99.0 0 MEKTYNPQDIEQPLYEHWEKQGYFKPNGDESQESFCIMIPPPNVTGSLHMGHAFQQTIMD TMIRYQRMQGKNTLWQVGTDHAGIATQMVVERKIAAEEGKTRHDYGREAFIDKIWEWKAE SGGTITRQMRRLGNSVDWERERFTMDEGLSNAVKEVFVRLYKEDLIYRGKRLVNWDPKLR TAISDLEVENRESKGSMWHIRYPLADGAKTADGKDYLVVATTRPETLLGDTGVAVNPEDP RYKDLIGKYVILPLVNRRIPIVGDEHADMEKGTGCVKITPAHDFNDYEVGKRHALPMINI LTFDGDIRESAQVFDTKGNESDVYSSEIPAEFQKLERFAARKAVVAAVDALGLLEEIKPH DLTVPYGDRGGVVIEPMLTDQWYVRADVLAKPAVEAVENGDIQFVPKQYENMYFSWMRDI QDWCISRQLWWGHRIPAWYDEAGNVYVGRNEDEVRKENNLGADVALRQDEDVLDTWFSSA LWTFSTLGWPENTDALRQFHPTSVMVSGFDIIFFWIARMIMMTMHFIKDENGKPQVPFHT VYMTGLIRDDEGQKMSKSKGNVIDPLDMVDGISLPELLEKRTGNMMQPQLADKIRKRTEK QFPNGIEPHGTDALRFTLAALASTGRDINWDMKRLEGYRNFCNKLWNASRFVLMNTEGQD CGFNGGEMTLSLADRWILAEFNQTIKAYREALDSFRFDIAAGILYEFTWNQFCDWYLELT KPVMNGGTEAELRGTRHTLVTVLEGLLRLAHPIIPFITETIWQRVKVLCGITADTIMLQP FPQYDASQVDEAALADTEWLKQAIVAVRNIRAEMNIAPGKPLELLLRGCSADAERRVNEN RGFLQTLARLESITVLPADNKGPVSVTKIIDGAELLIPMAGLINKEDELARLAKEVAKIE GEISRIENKLANEGFVARAPEAVIAKEREKLEGYAEAKAKLIEQQAVIAAL >gi|296494724|gb|ADTN01000014.1| GENE 15 16290 - 17486 403 398 aa, chain - ## 
HITS:1 COG:ECs5234 KEGG:ns NR:ns ## COG: ECs5234 COG4269 # Protein_GI_number: 15834488 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 398 1 398 398 649 97.0 0 MAQVINEMDVPSHSFVFHGTGERYFLICVVNVLLTIITLGIYLPWALMKCKRYLYANMEV NGQRFSYGITGGNVFVSCLVFVFFYFAILMTVSADMPLVGCVLTLSLLVLLIFMAAKGLR YQALMTSLNGVRFSFNCSLKGFWWVTFFLPILMAIGMGAVFFISTKMLHANSSSSVIISV VLMAIVGIVSIGIFNGTLYSLVMSFLWSNTSFGIHRFKVKLDTTYCIKYAILAFLALLPF LAVAGYIIFDQILNAYDSSVYANDDIENLQQFMEMQRKMIIAQLIYYFGIAVSTSYLTVS LRNHFMSNLSLNDGRIRFRSTLTYHGMLYRMCALVVISGITGGLAYPLLKIWMIDWQAKN TYLLGDLDDLPLINKEEQPDKGFLARISRGIMPSLPFL >gi|296494724|gb|ADTN01000014.1| GENE 16 17679 - 18182 546 167 aa, chain + ## HITS:1 COG:STM4473 KEGG:ns NR:ns ## COG: STM4473 COG0454 # Protein_GI_number: 16767718 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Salmonella typhimurium LT2 # 1 167 1 167 167 263 80.0 1e-70 MNNIAPQSPVMRRLTQQDNPAIARVIRQVSAEYGLTADKGYTVADPNLDELYQVYSQPGH AYWVVEFDGEVVGGGGIAPLAGSESDICELQKMYFLPAIRGKGLAKKLALMAMEQAREMG FKRCYLETTAFLKEAIALYEHLGFEHIDYALGCTGHVDCEVRMLREL >gi|296494724|gb|ADTN01000014.1| GENE 17 18228 - 18644 737 138 aa, chain - ## HITS:1 COG:ECs5232 KEGG:ns NR:ns ## COG: ECs5232 COG3076 # Protein_GI_number: 15834486 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 138 1 138 138 187 100.0 5e-48 MANPEQLEEQREETRLIIEELLEDGSDPDALYTIEHHLSADDLETLEKAAVEAFKLGYEV TDPEELEVEDGDIVICCDILSECALNADLIDAQVEQLMTLAEKFDVEYDGWGTYFEDPNG EDGDDEDFVDEDDDGVRH >gi|296494724|gb|ADTN01000014.1| GENE 18 18806 - 19810 1111 334 aa, chain + ## HITS:1 COG:ECs5231 KEGG:ns NR:ns ## COG: ECs5231 COG0078 # Protein_GI_number: 15834485 # Func_class: E Amino acid transport and metabolism # Function: Ornithine carbamoyltransferase # Organism: Escherichia coli O157:H7 # 1 334 1 334 334 673 99.0 0 
MSGFYHKHFLKLLDFTPAELNSLLQLAAKLKADKKSGKEEAKLTGKNIALIFEKDSTRTR CSFEVAAYDQGARVTYLGPSGSQIGHKESIKDTARVLGRMYDGIQYRGYGQEIVETLAEY AGVPVWNGLTNEFHPTQLLADLLTMQEHLPGKAFNEMTLVYAGDARNNMGNSMLEAAALT GLDLRLVAPQACWPEAALVTECRALAQQNGGNITLTEDVAKGVEGADFIYTDVWVSMGEA KEKWAERIALLRDYQVNSKMMQLTGNPEVKFLHCLPAFHDDQTTLGKKMAEEFGLHGGME VTDEVFESAASIVFDQAENRMHTIKAVMVATLSK >gi|296494724|gb|ADTN01000014.1| GENE 19 19866 - 21461 633 531 aa, chain - ## HITS:1 COG:no KEGG:ECO26_5422 NR:ns ## KEGG: ECO26_5422 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 502 1 502 503 850 88.0 0 MSKISDLNYSQHITLADNFKQKSEVLNTWRVGMNNFARNAEGQDNTRNILDPKTFLEFLV KIFTLGYVDFSNRSNEAGRNMMAHIESSSYIKNNDGSEIMKFVMNNPEGERADLSKVEIE ITLSTITTMGTRQGHTAIIFQQPDGSTNRYEGKSFERKDESSLHLITNKVLACYQREANK EVALLLNNHQLNWKDAIDYNELLEKPLASTLEKIKKEHLLLMPHVCDDTIPYLLGEGGIL EEINGLKTLHDTKIDSGTEGNDEINNIKINLSNILIDSLDDAKINLEKVIDSMLETFFKL PYINDVKILEWCFNQSMQYSDDTTKIKHARSVIDHIDFNCDQSKIAETLFFNLNKEPYKN SPELQELIWEKLVVYVNDFNLSNQEKSRLILRLFDDVKLVFDKVPLSILVNDIFLKDFFM KQPDFAKWYFYQLLKSYEGEQMYLNELGYVYGDEEKTKEIVNKLPGYVVKIFEEKIDNEL KIRTRMMEILRDEQINIYEYINEKQLEKLNPQGNLRIAIEKFGWNNKPITA Prediction of potential genes in microbial genomes Time: Sun May 15 23:03:07 2011 Seq name: gi|296494723|gb|ADTN01000015.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont63.1, whole genome shotgun sequence Length of sequence - 19876 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 7, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 640 - 2073 979 ## KPN_02513 hypothetical protein 2 2 Op 1 6/0.000 + CDS 2219 - 3355 435 ## COG1596 Periplasmic protein involved in polysaccharide export + Term 3395 - 3434 5.2 + Prom 3457 - 3516 1.9 3 2 Op 2 3/0.000 + CDS 3601 - 3792 68 ## COG0394 Protein-tyrosine-phosphatase 4 2 Op 3 2/0.000 + CDS 3808 - 5973 490 ## COG3206 Uncharacterized protein involved in exopolysaccharide biosynthesis + Term 5979 - 6030 10.4 5 3 Op 1 . 
+ CDS 6760 - 7500 373 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 6 3 Op 2 11/0.000 + CDS 7532 - 8443 85 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Term 8467 - 8527 -0.1 + Prom 8468 - 8527 8.5 7 3 Op 3 8/0.000 + CDS 8630 - 9505 -65 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 8 3 Op 4 . + CDS 9517 - 10353 100 ## COG1216 Predicted glycosyltransferases + Prom 10662 - 10721 5.1 9 4 Op 1 . + CDS 10971 - 11525 70 ## gi|301643411|ref|ZP_07243460.1| conserved hypothetical protein 10 4 Op 2 . + CDS 11553 - 12794 256 ## gi|301643412|ref|ZP_07243461.1| conserved hypothetical protein + Term 12920 - 12960 2.2 + Prom 12989 - 13048 5.7 11 5 Op 1 7/0.000 + CDS 13068 - 14090 201 ## COG2327 Uncharacterized conserved protein + Term 14129 - 14169 0.1 + Prom 14811 - 14870 3.4 12 5 Op 2 11/0.000 + CDS 14903 - 15313 88 ## COG0438 Glycosyltransferase + Prom 15772 - 15831 9.2 13 5 Op 3 . + CDS 15856 - 16635 64 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid + Prom 16690 - 16749 5.7 14 6 Tu 1 . + CDS 16799 - 18205 1625 ## COG0362 6-phosphogluconate dehydrogenase + Term 18234 - 18274 8.5 - Term 18971 - 19011 -0.6 15 7 Tu 1 . 
- CDS 19224 - 19490 223 ## COG1662 Transposase and inactivated derivatives, IS1 family - Prom 19636 - 19695 3.2 Predicted protein(s) >gi|296494723|gb|ADTN01000015.1| GENE 1 640 - 2073 979 477 aa, chain + ## HITS:1 COG:no KEGG:KPN_02513 NR:ns ## KEGG: KPN_02513 # Name: wzi # Def: hypothetical protein # Organism: K.pneumoniae # Pathway: not_defined # 1 477 1 477 477 909 97.0 0 MIKIARIAVTLGLLSSLGAQAYAAGLVVNDNDLRNDLAWLSDRGVIHLSLSTWPLSQEEI ARAIKKAKPSYSSEQVVLGRINRRLSALKADFRVAGYTSTDQPGTPQGFAQTQPADNSLG LAFNNSGEWWDVHLQGNVEGGERISNGSRFNANGAYGAVKFWNQWLSFGQIPQWWGPGYE GSLIRGDAMRPMTGFLMQRAEQAAPETWWLRWIGPWQYQISASQMNQYTAVPHAKIIGGR FTFSPFQSLELGASRIMQWGGEGRPESFSSFWDGFTGHDNTGTDNEPGNQLAGFDFKFKL EPTLGWPVSFYGQMIGEDESGYLPSANMYLGGVEGHHGWGKDAVNWYIEAHDTRSNMSRT NYSYNHHIYKDGYYQQGYPLGDAMGGDGQLVAGKVELITEDNQRWSTRLVYAKVNPEDQS INKAFPHADTLKGVQLGWSGDVYQSVRLNTSLWYTNANNSDSDDVGASAGIEIPFSL >gi|296494723|gb|ADTN01000015.1| GENE 2 2219 - 3355 435 378 aa, chain + ## HITS:1 COG:ECs1139 KEGG:ns NR:ns ## COG: ECs1139 COG1596 # Protein_GI_number: 15830393 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Escherichia coli O157:H7 # 1 378 1 378 379 595 74.0 1e-170 MKKKLVRFSALALAIGFLSGCTIIPGQGLNSLRKNVVELPDSDYDLDKLVNVYPMTPGLI DQLRPETVLARPNPQLDNLLRSYEYRIGVGDVLMVTVWDHPELTTPAGQYRSASDTGNWV NSDGTIFYPYIGKVQVAGKTLSQVRQDIASRLTTYIESPQVDVSIAAFRSQKVYVTGEVT KSGQQPITNIPLTVMDAINAAGGLATDADWRNVVLTHNGKDTKISLYALMQKGDLTQNHL LYPGDILFVPRNDDLKVFVMGEVGKQSTLKMDRSGMTLAEALGTSEGMAQATSDATGVFV IRQLKGDKKGKIANIYQLNAQDASAMVLGTEFQLQPYDIVYVTTAPLVRWNRVISQLVPT ITGVHDMTETARYIKSWP >gi|296494723|gb|ADTN01000015.1| GENE 3 3601 - 3792 68 63 aa, chain + ## HITS:1 COG:ECs2866 KEGG:ns NR:ns ## COG: ECs2866 COG0394 # Protein_GI_number: 15832120 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Escherichia coli O157:H7 # 1 63 82 144 147 86 60.0 1e-17 MEKYHIEQIGRVAPEARGKTMLFGHWLNQRDIPDPYRKSNEAFLSVYKLIEQAGGLWAQK LGA 
>gi|296494723|gb|ADTN01000015.1| GENE 4 3808 - 5973 490 721 aa, chain + ## HITS:1 COG:ZyccC_1 KEGG:ns NR:ns ## COG: ZyccC_1 COG3206 # Protein_GI_number: 15800902 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein involved in exopolysaccharide biosynthesis # Organism: Escherichia coli O157:H7 EDL933 # 10 479 12 480 480 554 61.0 1e-157 MSSVINKSTSKDDEGLDLGRLLGEVIDHRKLIVSVTSLFTLLALLYAIFATPIYQADALI QVEQKQANAILSNLSQMLPNSQPQSAPEITLLGSRMILGKTVDDLNLQIRAKQDYFPVFG RGLARLLGEKPNNISISRLYIKNSEGDEVPEIKLTVLDERNYKLDVGDLVLKGKTGSLLE KDGIALLVDKIEATPGTTFSIKYVSRLKAISDLQEALEVADQGKNTGMLGISLTGDNPVL IEKIVDSISNNYLAQNIARQAAQDAKSLDFLNEQLPQVRSDLDLAEDKLNKYRQQNDSVD LSLEAKSVLDQIVNVDNQLNELTFRESEISQLYTKEHPTYKALLEKRKTLQDEKAKLNKR VANMPGTQQEILRLSRDVESGRAVYMQLLNRQQELSISKSSAIGNVRIIDNAVTEIKPVK PKKILIVLIGIVFGGIVSIGLVLLRVFLRKGIESPEQLEEVGCNVYASIPVAEAYTKITE QSKKWSRKENKINQGFLAVDNPADLAIEAIRGLRTSLHFAMMESRNNVLMISGASPNAGK TFVSSNLSAVIAQTGKKVIFIDTDMRKGNTHKLFNVSNDNGLSDILSGKISIEKSIKKIS SAGFDYISRGMAPPNPAELLMHKRFAELINWASENYDIVVLDTPPILAVTDPAVIGHYAG TTLLVARFELNTVKEIEVSIKRFENTGIQVKGCILNGVVKKASSYYGYGYSHYGYSYKDN N >gi|296494723|gb|ADTN01000015.1| GENE 5 6760 - 7500 373 246 aa, chain + ## HITS:1 COG:HI0872 KEGG:ns NR:ns ## COG: HI0872 COG2148 # Protein_GI_number: 16272813 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Haemophilus influenzae # 4 246 229 471 471 376 71.0 1e-104 MIKGYRFVSVIPTLRGMPLDSTDMSFIFSHEVMIFRVQQNLAKLSSRIIKRLFDILGSIF IISILSPLLLYISVKVKKDGGPSIYGHERVGKGGKSFKCLKFRSMVINSKEVLDELLASD PEAKREWDATFKLKNDPRVTKIGGFLRRTSLDELPQLFNVLKGEMSLVGPRPIITAELER YNEEVDYYLLSKPGMTGLWQVSGRSDVDYETRVYLDAWYVKNWSMWNDIAILFKTVGVVL KKDGAY >gi|296494723|gb|ADTN01000015.1| GENE 6 7532 - 8443 85 303 aa, chain + ## HITS:1 COG:STM2085 KEGG:ns NR:ns ## COG: STM2085 COG0463 # Protein_GI_number: 16765415 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall 
biogenesis # Organism: Salmonella typhimurium LT2 # 1 302 1 303 314 337 54.0 1e-92 MKCFIAIPTYNGGQIWHSAVNSIKKNAPANTLVHVIDSGSKDDTVSIAKEAKFDVISIAS NDFNHGATRNMAIKKYVKEYDIVVFLTQDAIPEPNFINTIINVFSDENIACAYGRQLPHE DANPLARHARGFSYPDKSHICDASDIPRMGIKTVFMSNSFSAYRLSVFEELNGFPSNTIL CEDMYFTAKAVLAGYKVAYVSEARVRHSHNYTPKEEFKRYFDIGVFHEMEPWIREATGGA GGEGKKFIISEFKFLINNAPLWIPIAFINNFMKILGYKLGLRFKILPKKFVKSLSMHKRY WDK >gi|296494723|gb|ADTN01000015.1| GENE 7 8630 - 9505 -65 291 aa, chain + ## HITS:1 COG:ECs2864 KEGG:ns NR:ns ## COG: ECs2864 COG0463 # Protein_GI_number: 15832118 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Escherichia coli O157:H7 # 6 278 2 273 279 123 33.0 4e-28 MNNTDKLNKKVSVYISTHNRLERLKRAIQSVLNQEYSNLELLVCDDASNDGTKEYMEELC SKDNRVKYFRNDVNKGACATRNLGIRHATGFFITGLDDDDEFTFDRIKLFVENWDDRYSF LSCNFYDCYKIKHKKIHYKKNLKPIILTYKDILFKNLASNQVFTLTERLRSINGFDVNVK RLQDWDTWLRLSAKYGEFMVLPYATYLMHHDHLPGEQRVSKAYPFSRALHELKMRNNQLY DKEEQSFVDYLVALESNKGQIVESLKWALKRKEPKYIAKYLIQYVNKNTNK >gi|296494723|gb|ADTN01000015.1| GENE 8 9517 - 10353 100 278 aa, chain + ## HITS:1 COG:AF0321 KEGG:ns NR:ns ## COG: AF0321 COG1216 # Protein_GI_number: 11497935 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Archaeoglobus fulgidus # 6 246 5 231 324 66 28.0 7e-11 MNCLVIIVTHNSQKHIQWCIDGLESSQSLLKIKVIDSGSTNISYLDNISSKYRLEIIKEN NIGFVKGNNLALQSDENFDWVLLLNPDARIEGDILDELLKIATDVNNSNVGIFSVPLVRF SIDEMQDMQVYDSLGIDATRYGRWFDIGSGEKTKVLTKELIDVPAVCGAFMLIRTKALEL CLDKSAKKGFESSYYMYKEDIELCLRMKKNDWRVVLVNSLQAFHCRGWNKPRKEISHWAK YHSAKNDIDIALRYRRQFLPYALCKYLWVNVFERFGLM >gi|296494723|gb|ADTN01000015.1| GENE 9 10971 - 11525 70 184 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301643411|ref|ZP_07243460.1| ## NR: gi|301643411|ref|ZP_07243460.1| conserved hypothetical protein [Escherichia coli MS 146-1] # 1 184 203 386 386 307 100.0 2e-82 MIVYAARFALVMFLCAIFIKVDIVGLLYEHVSSLGFLFSGVNLSSKADAYLAGGVEGANF 
YDFNIPRRIYNILTTIIPLGVFVIYTINKKNMEYFFSDSQKVCILALFSLLSIYGVLLIS MTWFTRNFYWNTFFACAFYPLFLEFIQQKIPKKLTKYCIALSVFLLLLSCITMWRTPSIF ISYP >gi|296494723|gb|ADTN01000015.1| GENE 10 11553 - 12794 256 413 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301643412|ref|ZP_07243461.1| ## NR: gi|301643412|ref|ZP_07243461.1| conserved hypothetical protein [Escherichia coli MS 146-1] # 43 413 1 371 371 684 100.0 0 MKKELLYIVSMLTIPLSVSAISEELKFNYLGKPVSSSNGYSPMYLQGIYASRDTFGMRQS AVVASFKANSKNPFSFPVLGINSLYGLVDYKDRDSVALYADNTAPDYNGWEILSADKITP TSILSNSIDPTLIKPGMIIETKTNPKWTTYVISVTKGKIITSGWVNMQTRHMGYPKAQTL LINPLTKIWAANFNITIPENSRAVKGVVQENGIVNNKINNGIINGIDTVILPYSKYGGSS AYVARSANTGLNQQWENGFLSLGNKFGFVSLSKEVNHVNVSFLDASNADIGVKFLGSNKK HSIEWSDGQKVTASISPSGEVEKINYKTKVIKSNDELTDSFSQYIFMTKNDITVQLPESK LLTDGYTLKLLSIGDGKCNITFTSDVKIIGEAEYKNFNWHKEIVFLDHKWYLI >gi|296494723|gb|ADTN01000015.1| GENE 11 13068 - 14090 201 340 aa, chain + ## HITS:1 COG:wcaK KEGG:ns NR:ns ## COG: wcaK COG2327 # Protein_GI_number: 16129985 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 3 338 87 423 426 343 47.0 3e-94 MPKILMAHYKNEGVFKYKKLPQHIIEFTNKVRNYDAVIQVGGSFFVDLYGSAQFEHIFCS LLAKKPIYLVGHSVGPFKNSEFKRIAMFAFKYCNELILRENVSKQLLVSDNFQMDNVTDG VDTAFLVDNNVTSDNDYIIDHWNEIISHNKTVALTVRKLAPFDKRLGVTQEEYELAIAGI IEHILSLGYQVVIFSTCTGIESYNNDDRIVALSVKNKVKNTESVHVVMDEINDIQLGILL NKCSFTIGTRLHSAIISMNFGTPAIAINYEHKSKGIMNSLNVDNLSIDVKNLFDNVIIER VNYLHNNLDEVKNKIEQGVSKVKNDGRLMLENIINKIGVK >gi|296494723|gb|ADTN01000015.1| GENE 12 14903 - 15313 88 136 aa, chain + ## HITS:1 COG:wcaL KEGG:ns NR:ns ## COG: wcaL COG0438 # Protein_GI_number: 16129984 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Escherichia coli K12 # 1 133 271 404 406 124 47.0 6e-29 MIIDLDLKSKVDIIGFQPQSIVKKMLSNSDVFLLPSVTASNGDMEGIPVALMEAMATGLP VISSYHSGIPELIQNDYSGWLCQEGNSQEIANCIIKLIEDKVEINDIEKNARNHIELNFN QDNEYNKMATMLEHIR >gi|296494723|gb|ADTN01000015.1| GENE 13 15856 - 16635 64 259 aa, chain + ## HITS:1 
COG:BS_tuaB KEGG:ns NR:ns ## COG: BS_tuaB COG2244 # Protein_GI_number: 16080613 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Bacillus subtilis # 9 230 194 414 483 101 27.0 2e-21 MLMSLLGEKDWRPLFYFDKGLCKDVFKYGVYQLGSQIINQLRTQLDVIIIGKVLGSDSLG LYSLAKDLIMQPLKLITPVINRLALPRFAEAQNNSDILKSVYLKGTAVIVLFSALSFIFI YIFSPAIITLLYGSERMTILSLLPFMLLFGMLRPMGGLTGAIAQANGRTNVEFYWNVIAG IMVVCVSLTILIYSSLWYVSLVLSLSQILITFMVFPFFVKPIVNVKFSSYILNWLPVTIT FTIFVYLIYNFNLFVFPFW >gi|296494723|gb|ADTN01000015.1| GENE 14 16799 - 18205 1625 468 aa, chain + ## HITS:1 COG:gnd KEGG:ns NR:ns ## COG: gnd COG0362 # Protein_GI_number: 16129970 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconate dehydrogenase # Organism: Escherichia coli K12 # 1 468 1 468 468 892 95.0 0 MSKQQIGVVGMAVMGRNLALNIESRGYTVSVFNRSREKTEEVIAENPGKKLVPYYTVQEF VESLETPRRILLMVKAGAGTDSAIDSLKPYLDKGDIIIDGGNTFFQDTIRRNRELSAEGF NFIGTGVSGGEEGALKGPSIMPGGQKEAYELVAPILKQIAAVAEDGEPCVTYIGADGAGH YVKMVHNGIEYGDMQLIAEAYALLKGGLALSNEELAQTFTEWNEGELSSYLIDITKDIFT KKDEEGKYLVDVILDEAANKGTGKWTSQSSLDLGEPLSLITESVFARYISSLKDQRVAAS KVLSGPQAQPAGDKAEFIEKVRRALYLGKIVSYAQGFSQLRAASDEYNWDLNYGEIAKIF RAGCIIRAQFLQKITDAYTQNAGIANLLLAPYFKQIADDYQQALRDVVAYAVQNGIPVPT FSAAIAYYDSYRSAVLPANLIQAQRDYFGAHTYKRTDKEGVFHTEWLD >gi|296494723|gb|ADTN01000015.1| GENE 15 19224 - 19490 223 88 aa, chain - ## HITS:1 COG:insB_g3 KEGG:ns NR:ns ## COG: insB_g3 COG1662 # Protein_GI_number: 16128015 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives, IS1 family # Organism: Escherichia coli K12 # 1 88 80 167 167 171 100.0 3e-43 MATLGRLMSLLSPFDVVIWMTDGWPLYESRLKGKLHVISKRYTQRIERHNLNLRQHLARL GRKSLSFSKSVELHDKVIGHYLNIKHYQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:04:01 2011 Seq name: gi|296494722|gb|ADTN01000016.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont64.1, whole genome shotgun sequence Length of sequence 
- 68651 bp Number of predicted genes - 59, with homology - 57 Number of transcription units - 30, operones - 11 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 11/0.000 - CDS 15 - 755 776 ## COG1180 Pyruvate-formate lyase-activating enzyme - Prom 838 - 897 2.9 - Term 873 - 904 3.2 2 1 Op 2 7/0.111 - CDS 947 - 3229 2578 ## COG1882 Pyruvate-formate lyase 3 1 Op 3 3/0.778 - CDS 3284 - 4141 682 ## COG2116 Formate/nitrite family of transporters - Term 4509 - 4537 0.7 4 2 Tu 1 . - CDS 4547 - 6307 1943 ## COG1944 Uncharacterized conserved protein - Prom 6421 - 6480 2.9 + Prom 6292 - 6351 1.9 5 3 Tu 1 . + CDS 6437 - 7129 647 ## COG2323 Predicted membrane protein + Prom 7212 - 7271 3.3 6 4 Op 1 6/0.222 + CDS 7364 - 8416 1004 ## COG1932 Phosphoserine aminotransferase + Term 8436 - 8469 4.1 7 4 Op 2 3/0.778 + CDS 8487 - 9770 1372 ## COG0128 5-enolpyruvylshikimate-3-phosphate synthase + Term 9814 - 9845 2.0 + Prom 9773 - 9832 3.2 8 5 Tu 1 . + CDS 9930 - 10703 889 ## COG0501 Zn-dependent protease with chaperone function + Term 10818 - 10851 2.3 + Prom 10779 - 10838 5.0 9 6 Op 1 21/0.000 + CDS 10876 - 11559 267 ## PROTEIN SUPPORTED gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 + Term 11580 - 11616 1.3 10 6 Op 2 16/0.000 + CDS 11670 - 13343 2807 ## PROTEIN SUPPORTED gi|15800772|ref|NP_286786.1| 30S ribosomal protein S1 11 6 Op 3 . + CDS 13285 - 13383 150 ## PROTEIN SUPPORTED gi|238903721|ref|ZP_04649193.1| ribosomal protein S1 + Term 13429 - 13458 3.5 12 7 Tu 1 . 
+ CDS 13503 - 13787 167 ## PROTEIN SUPPORTED gi|148826039|ref|YP_001290792.1| 50S ribosomal protein L35 + Term 13809 - 13854 9.2 + Prom 13855 - 13914 3.6 13 8 Op 1 5/0.556 + CDS 13995 - 16259 859 ## COG0658 Predicted membrane metal-binding protein 14 8 Op 2 9/0.000 + CDS 16296 - 18044 242 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 15 8 Op 3 4/0.667 + CDS 18041 - 19027 653 ## COG1663 Tetraacyldisaccharide-1-P 4'-kinase 16 8 Op 4 4/0.667 + CDS 19064 - 20296 787 ## COG3214 Uncharacterized protein conserved in bacteria 17 8 Op 5 11/0.000 + CDS 20348 - 20530 198 ## COG2835 Uncharacterized conserved protein 18 8 Op 6 . + CDS 20527 - 21273 813 ## COG1212 CMP-2-keto-3-deoxyoctulosonic acid synthetase + Term 21305 - 21346 3.4 + Prom 21331 - 21390 4.8 19 9 Tu 1 . + CDS 21427 - 22320 751 ## JW0902 conserved hypothetical protein + Term 22527 - 22583 -0.8 - Term 22038 - 22075 -0.9 20 10 Tu 1 . - CDS 22297 - 23076 692 ## COG1434 Uncharacterized conserved protein + Prom 23020 - 23079 5.2 21 11 Op 1 6/0.222 + CDS 23212 - 23997 703 ## COG0500 SAM-dependent methyltransferases 22 11 Op 2 7/0.111 + CDS 23994 - 25316 1569 ## COG3006 Uncharacterized protein involved in chromosome partitioning 23 11 Op 3 8/0.111 + CDS 25297 - 26001 802 ## COG3095 Uncharacterized protein involved in chromosome partitioning 24 11 Op 4 4/0.667 + CDS 26001 - 30461 5755 ## COG3096 Uncharacterized protein involved in chromosome partitioning + Term 30468 - 30498 3.0 + Prom 30547 - 30606 6.5 25 12 Op 1 . + CDS 30722 - 32569 1395 ## COG2989 Uncharacterized protein conserved in bacteria 26 12 Op 2 . + CDS 32610 - 32687 68 ## 27 12 Op 3 7/0.111 + CDS 32750 - 33298 414 ## COG3108 Uncharacterized protein conserved in bacteria 28 12 Op 4 . + CDS 33325 - 33972 603 ## COG0491 Zn-dependent hydrolases, including glyoxylases + Term 34109 - 34146 2.2 - Term 33973 - 34004 1.1 29 13 Tu 1 . 
- CDS 34194 - 35384 1520 ## COG1448 Aspartate/tyrosine/aromatic aminotransferase - Prom 35419 - 35478 5.0 - Term 35504 - 35544 4.2 30 14 Tu 1 . - CDS 35569 - 36657 1386 ## COG3203 Outer membrane protein (porin) - Prom 36811 - 36870 6.8 - Term 37213 - 37244 4.1 31 15 Tu 1 . - CDS 37259 - 38659 1751 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases - Prom 38718 - 38777 6.4 32 16 Tu 1 . + CDS 38553 - 38822 83 ## - Term 38790 - 38819 1.2 33 17 Tu 1 . - CDS 38828 - 40078 1285 ## COG1488 Nicotinic acid phosphoribosyltransferase - Prom 40273 - 40332 3.5 + Prom 40130 - 40189 4.3 34 18 Tu 1 . + CDS 40296 - 42908 3055 ## COG0308 Aminopeptidase N + Term 42913 - 42961 5.7 - Term 42902 - 42945 12.2 35 19 Op 1 24/0.000 - CDS 42951 - 43718 218 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 36 19 Op 2 8/0.111 - CDS 43715 - 44506 815 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 37 19 Op 3 5/0.556 - CDS 44517 - 45662 1235 ## COG2141 Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases 38 19 Op 4 4/0.667 - CDS 45659 - 46618 1001 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components 39 19 Op 5 . 
- CDS 46611 - 47186 521 ## COG0431 Predicted flavoprotein - Prom 47336 - 47395 7.7 40 20 Tu 1 7/0.111 + CDS 47542 - 48081 288 ## COG3539 P pilus assembly protein, pilin FimA + Term 48113 - 48146 6.1 + Prom 48085 - 48144 4.0 41 21 Op 1 10/0.000 + CDS 48164 - 48865 305 ## COG3121 P pilus assembly protein, chaperone PapD + Term 48936 - 48976 -0.9 + Prom 48874 - 48933 2.3 42 21 Op 2 6/0.222 + CDS 49037 - 51490 1666 ## COG3188 P pilus assembly protein, porin PapC 43 21 Op 3 4/0.667 + CDS 51508 - 52551 382 ## COG3539 P pilus assembly protein, pilin FimA 44 21 Op 4 4/0.667 + CDS 52563 - 53105 241 ## COG3539 P pilus assembly protein, pilin FimA 45 21 Op 5 7/0.111 + CDS 53113 - 53628 207 ## COG3539 P pilus assembly protein, pilin FimA 46 21 Op 6 . + CDS 53621 - 54331 166 ## COG3121 P pilus assembly protein, chaperone PapD 47 22 Tu 1 . - CDS 54289 - 54498 153 ## gi|213615837|ref|ZP_03371663.1| hypothetical protein SentesTyp_15729 + Prom 54351 - 54410 4.0 48 23 Tu 1 . + CDS 54442 - 55452 941 ## COG0167 Dihydroorotate dehydrogenase + Term 55474 - 55529 12.9 + Prom 55467 - 55526 6.7 49 24 Tu 1 . + CDS 55626 - 56168 492 ## SSON_0950 hypothetical protein + Term 56412 - 56473 0.5 50 25 Tu 1 . - CDS 56165 - 57232 689 ## COG3217 Uncharacterized Fe-S protein - Prom 57340 - 57399 3.4 + Prom 57286 - 57345 5.8 51 26 Op 1 6/0.222 + CDS 57518 - 59626 178 ## PROTEIN SUPPORTED gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative 52 26 Op 2 5/0.556 + CDS 59638 - 61545 2412 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Prom 61552 - 61611 1.6 53 27 Op 1 11/0.000 + CDS 61675 - 62928 634 ## COG2995 Uncharacterized paraquat-inducible protein A 54 27 Op 2 8/0.111 + CDS 62933 - 64573 1539 ## COG3008 Paraquat-inducible protein B 55 27 Op 3 3/0.778 + CDS 64570 - 65133 653 ## COG3009 Uncharacterized protein conserved in bacteria + Term 65156 - 65196 4.1 + Prom 65253 - 65312 3.3 56 28 Tu 1 . 
+ CDS 65389 - 65556 86 ## COG3130 Ribosome modulation factor + Term 65574 - 65611 9.1 - Term 65563 - 65598 4.5 57 29 Op 1 8/0.111 - CDS 65626 - 66144 738 ## COG0764 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases 58 29 Op 2 . - CDS 66213 - 67973 1614 ## COG1067 Predicted ATP-dependent protease - Prom 68001 - 68060 3.7 + Prom 68074 - 68133 4.0 59 30 Tu 1 . + CDS 68159 - 68611 419 ## COG3120 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|296494722|gb|ADTN01000016.1| GENE 1 15 - 755 776 246 aa, chain - ## HITS:1 COG:ECs0985 KEGG:ns NR:ns ## COG: ECs0985 COG1180 # Protein_GI_number: 15830239 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Escherichia coli O157:H7 # 1 246 1 246 246 519 100.0 1e-147 MSVIGRIHSFESCGTVDGPGIRFITFFQGCLMRCLYCHNRDTWDTHGGKEVTVEDLMKEV VTYRHFMNASGGGVTASGGEAILQAEFVRDWFRACKKEGIHTCLDTNGFVRRYDPVIDEL LEVTDLVMLDLKQMNDEIHQNLVGVSNHRTLEFAKYLANKNVKVWIRYVVVPGWSDDDDS AHRLGEFTRDMGNVEKIELLPYHELGKHKWVAMGEEYKLDGVKPPKKETMERVKGILEQY GHKVMF >gi|296494722|gb|ADTN01000016.1| GENE 2 947 - 3229 2578 760 aa, chain - ## HITS:1 COG:pflB KEGG:ns NR:ns ## COG: pflB COG1882 # Protein_GI_number: 16128870 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli K12 # 1 760 1 760 760 1565 100.0 0 MSELNEKLATAWEGFTKGDWQNEVNVRDFIQKNYTPYEGDESFLAGATEATTTLWDKVME GVKLENRTHAPVDFDTAVASTITSHDAGYINKQLEKIVGLQTEAPLKRALIPFGGIKMIE GSCKAYNRELDPMIKKIFTEYRKTHNQGVFDVYTPDILRCRKSGVLTGLPDAYGRGRIIG DYRRVALYGIDYLMKDKLAQFTSLQADLENGVNLEQTIRLREEIAEQHRALGQMKEMAAK YGYDISGPATNAQEAIQWTYFGYLAAVKSQNGAAMSFGRTSTFLDVYIERDLKAGKITEQ EAQEMVDHLVMKLRMVRFLRTPEYDELFSGDPIWATESIGGMGLDGRTLVTKNSFRFLNT LYTMGPSPEPNMTILWSEKLPLNFKKFAAKVSIDTSSLQYENDDLMRPDFNNDDYAIACC VSPMIVGKQMQFFGARANLAKTMLYAINGGVDEKLKMQVGPKSEPIKGDVLNYDEVMERM DHFMDWLAKQYITALNIIHYMHDKYSYEASLMALHDRDVIRTMACGIAGLSVAADSLSAI 
KYAKVKPIRDEDGLAIDFEIEGEYPQFGNNDPRVDDLAVDLVERFMKKIQKLHTYRDAIP TQSVLTITSNVVYGKKTGNTPDGRRAGAPFGPGANPMHGRDQKGAVASLTSVAKLPFAYA KDGISYTFSIVPNALGKDDEVRKTNLAGLMDGYFHHEASIEGGQHLNVNVMNREMLLDAM ENPEKYPQLTIRVSGYAVRFNSLTKEQQQDVITRTFTQSM >gi|296494722|gb|ADTN01000016.1| GENE 3 3284 - 4141 682 285 aa, chain - ## HITS:1 COG:ECs0987 KEGG:ns NR:ns ## COG: ECs0987 COG2116 # Protein_GI_number: 15830241 # Func_class: P Inorganic ion transport and metabolism # Function: Formate/nitrite family of transporters # Organism: Escherichia coli O157:H7 # 1 285 1 285 285 535 100.0 1e-152 MKADNPFDLLLPAAMAKVAEEAGVYKATKHPLKTFYLAITAGVFISIAFVFYITATTGTG TMPFGMAKLVGGICFSLGLILCVVCGADLFTSTVLIVVAKASGRITWGQLAKNWLNVYFG NLVGALLFVLLMWLSGEYMTANGQWGLNVLQTADHKVHHTFIEAVCLGILANLMVCLAVW MSYSGRSLMDKAFIMVLPVAMFVASGFEHSIANMFMIPMGIVIRDFASPEFWTAVGSAPE NFSHLTVMNFITDNLIPVTIGNIIGGGLLVGLTYWVIYLRENDHH >gi|296494722|gb|ADTN01000016.1| GENE 4 4547 - 6307 1943 586 aa, chain - ## HITS:1 COG:ycaO KEGG:ns NR:ns ## COG: ycaO COG1944 # Protein_GI_number: 16128872 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 586 4 589 589 1187 100.0 0 MTQTFIPGKDAALEDSIARFQQKLSDLGFQIEEASWLNPVPNVWSVHIRDKECALCFTNG KGATKKAALASALGEYFERLSTNYFFADFWLGETIANGPFVHYPNEKWFPLTENDDVPEG LLDDRLRAFYDPENELTGSMLIDLQSGNEDRGICGLPFTRQSDNQTVYIPMNIIGNLYVS NGMSAGNTRNEARVQGLSEVFERYVKNRIIAESISLPEIPADVLARYPAVVEAIETLEAE GFPIFAYDGSLGGQYPVICVVLFNPANGTCFASFGAHPDFGVALERTVTELLQGRGLKDL DVFTPPTFDDEEVAEHTNLETHFIDSSGLISWDLFKQDADYPFVDWNFSGTTEEEFATLM AIFNKEDKEVYIADYEHLGVYACRIIVPGMSDIYPAEDLWLANNSMGSHLRETILSLPGS EWEKEDYLNLIEQLDEEGFDDFTRVRELLGLATGSDNGWYTLRIGELKAMLALAGGDLEQ ALVWTEWTMEFNSSVFSPERANYYRCLQTLLLLAQEEDRQPLQYLNAFVRMYGADAVEAA SAAMSGEAAFYGLQPVDSDLHAFAAHQSLLKAYEKLQRAKAAFWAK >gi|296494722|gb|ADTN01000016.1| GENE 5 6437 - 7129 647 230 aa, chain + ## HITS:1 COG:ycaP KEGG:ns NR:ns ## COG: ycaP COG2323 # Protein_GI_number: 16128873 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: 
Escherichia coli K12 # 1 230 1 230 230 463 100.0 1e-130 MKAFDLHRMAFDKVPFDFLGEVALRSLYTFVLVFLFLKMTGRRGVRQMSLFEVLIILTLG SAAGDVAFYDDVPMVPVLIVFITLALLYRLVMWLMAHSEKLEDLLEGKPVVIIEDGELAW SKLNNSNMTEFEFFMELRLRGVEQLGQVRLAILETNGQISVYFFEDDKVKPGLLILPSDC TQRYKVVPESADYACIRCSEIIHMKAGEKQLCPRCANPEWTKASRAKRVT >gi|296494722|gb|ADTN01000016.1| GENE 6 7364 - 8416 1004 350 aa, chain + ## HITS:1 COG:serC KEGG:ns NR:ns ## COG: serC COG1932 # Protein_GI_number: 16128874 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoserine aminotransferase # Organism: Escherichia coli K12 # 1 350 13 362 362 726 100.0 0 MLPAEVLKQAQQELRDWNGLGTSVMEVSHRGKEFIQVAEEAEKDFRDLLNVPSNYKVLFC HGGGRGQFAAVPLNILGDKTTADYVDAGYWAASAIKEAKKYCTPNVFDAKVTVDGLRAVK PMREWQLSDNAAYMHYCPNETIDGIAIDETPDFGADVVVAADFSSTILSRPIDVSRYGVI YAGAQKNIGPAGLTIVIVREDLLGKANIACPSILDYSILNDNGSMFNTPPTFAWYLSGLV FKWLKANGGVAEMDKINQQKAELLYGVIDNSDFYRNDVAKANRSRMNVPFQLADSALDKL FLEESFAAGLHALKGHRVVGGMRASIYNAMPLEGVKALTDFMVEFERRHG >gi|296494722|gb|ADTN01000016.1| GENE 7 8487 - 9770 1372 427 aa, chain + ## HITS:1 COG:ECs0991 KEGG:ns NR:ns ## COG: ECs0991 COG0128 # Protein_GI_number: 15830245 # Func_class: E Amino acid transport and metabolism # Function: 5-enolpyruvylshikimate-3-phosphate synthase # Organism: Escherichia coli O157:H7 # 1 427 1 427 427 846 100.0 0 MESLTLQPIARVDGTINLPGSKSVSNRALLLAALAHGKTVLTNLLDSDDVRHMLNALTAL GVSYTLSADRTRCEIIGNGGPLHAEGALELFLGNAGTAMRPLAAALCLGSNDIVLTGEPR MKERPIGHLVDALRLGGAKITYLEQENYPPLRLQGGFTGGNVDVDGSVSSQFLTALLMTA PLAPEDTVIRIKGDLVSKPYIDITLNLMKTFGVEIENQHYQQFVVKGGQSYQSPGTYLVE GDASSASYFLAAAAIKGGTVKVTGIGRNSMQGDIRFADVLEKMGATICWGDDYISCTRGE LNAIDMDMNHIPDAAMTIATAALFAKGTTTLRNIYNWRVKETDRLFAMATELRKVGAEVE EGHDYIRITPPEKLNFAEIATYNDHRMAMCFSLVALSDTPVTILDPKCTAKTFPDYFEQL ARISQAA >gi|296494722|gb|ADTN01000016.1| GENE 8 9930 - 10703 889 257 aa, chain + ## HITS:1 COG:ECs0992 KEGG:ns NR:ns ## COG: ECs0992 COG0501 # Protein_GI_number: 15830246 # Func_class: O Posttranslational modification, protein 
turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Escherichia coli O157:H7 # 1 257 6 262 262 450 99.0 1e-126 MIHMKNTKLLLAIATSAALLTGCQNTHGIDTNMAISSGLNAYKAATLSDADAKAIANQGC AEMDSGNQVASKSSKYGKRLAKIAKALGNNINGTPVNYKVYMTSDVNAWAMANGCVRVYS GLMDMMNDNEIEGVLGHELGHVALGHSLAEMKASYAIVAARDAISATSGVASQLSRSQLG DIAEGAINAKYSRDKESEADDFSFDLLKKRGISTQGLVGSFEKLASLDGGRTQSMFDSHP PSTERAQHIRDRIASGK >gi|296494722|gb|ADTN01000016.1| GENE 9 10876 - 11559 267 227 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 [Treponema pallidum subsp. pallidum str. Nichols] # 7 223 36 286 863 107 30 2e-22 MTAIAPVITIDGPSGAGKGTLCKAMAEALQWHLLDSGAIYRVLALAALHHHVDVASEDAL VPLASHLDVRFVSTNGNLEVILEGEDVSGEIRTQEVANAASQVAAFPRVREALLRRQRAF RELPGLIADGRDMGTVVFPDAPVKIFLDASSEERAHRRMLQLQEKGFSVNFERLLAEIKE RDDRDRNRAVAPLVPAADALVLDSTTLSIEQVIEKALQYARQKLALA >gi|296494722|gb|ADTN01000016.1| GENE 10 11670 - 13343 2807 557 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15800772|ref|NP_286786.1| 30S ribosomal protein S1 [Escherichia coli O157:H7 EDL933] # 1 557 1 557 557 1085 100 0.0 MTESFAQLFEESLKEIETRPGSIVRGVVVAIDKDVVLVDAGLKSESAIPAEQFKNAQGEL EIQVGDEVDVALDAVEDGFGETLLSREKAKRHEAWITLEKAYEDAETVTGVINGKVKGGF TVELNGIRAFLPGSLVDVRPVRDTLHLEGKELEFKVIKLDQKRNNVVVSRRAVIESENSA ERDQLLENLQEGMEVKGIVKNLTDYGAFVDLGGVDGLLHITDMAWKRVKHPSEIVNVGDE ITVKVLKFDRERTRVSLGLKQLGEDPWVAIAKRYPEGTKLTGRVTNLTDYGCFVEIEEGV EGLVHVSEMDWTNKNIHPSKVVNVGDVVEVMVLDIDEERRRISLGLKQCKANPWQQFAET HNKGDRVEGKIKSITDFGIFIGLDGGIDGLVHLSDISWNVAGEEAVREYKKGDEIAAVVL QVDAERERISLGVKQLAEDPFNNWVALNKKGAIVTGKVTAVDAKGATVELADGVEGYLRA SEASRDRVEDATLVLSVGDEVEAKFTGVDRKNRAISLSVRAKDEADEKDAIATVNKQEDA NFSNNAMAEAFKAAKGE >gi|296494722|gb|ADTN01000016.1| GENE 11 13285 - 13383 150 32 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238903721|ref|ZP_04649193.1| ribosomal protein S1 [Escherichia coli BL21(DE3)] # 4 32 542 570 570 62 100 0.0 MQTSPTTQWLKLSKQLKASNSLTLRDFYSEVC >gi|296494722|gb|ADTN01000016.1| GENE 12 13503 - 13787 
167 94 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148826039|ref|YP_001290792.1| 50S ribosomal protein L35 [Haemophilus influenzae PittEE] # 1 89 4 91 96 68 38 7e-11 MTKSELIERLATQQSHIPAKTVEDAVKEMLEHMASTLAQGERIEIRGFGSFSLHYRAPRT GRNPKTGDKVELEGKYVPHFKPGKELRDRANIYG >gi|296494722|gb|ADTN01000016.1| GENE 13 13995 - 16259 859 754 aa, chain + ## HITS:1 COG:ycaI_1 KEGG:ns NR:ns ## COG: ycaI_1 COG0658 # Protein_GI_number: 16128880 # Func_class: R General function prediction only # Function: Predicted membrane metal-binding protein # Organism: Escherichia coli K12 # 1 428 27 454 454 785 100.0 0 MKITTVGVCIISGIFPLLILPQLPGTLTLAFLTLFACVLAFIPVKTVRYIALTLLFFVWG ILSAKQILWAGETLTGATQDAIVEITATDGMTTHYGQITHLQGRRIFPASGLVMYGEYLP QAVCAGQQWSMKLKVRAVHGQLNDGGFDSQRYAIAQHQPLTGRFLQASVIEPNCSLRAQY LASLQTTLQPYPWNAVILGLGMGERLSVPKEIKNIMRDTGTAHLMAISGLHIAFAALLAA GLIRSGQIFLPGRWIHWQIPLIGGICCAAFYAWLTGMQPPALRTMVALATWGMLKLSGRQ WSGWDVWICCLAAILLMDPVAILSQSLWLSAAAVAALIFWYQWFPCPEWQLPPVLRAVVS LIHLQLGITLLLMPVQIVIFHGISLTSFIANLLAIPLVTFITVPLILAAMVVHLSGPLIL EQGLWFLADRSLALLFWGLKSLPEGWINIAECWQWLSFSPWFLLVVWRLNAWRTLPAMCV AGGLLMCWPLWQKPRPDEWQLYMLDVGQGLAMVIARNGKAILYDTGLAWPEGDSGQQLII PWLHWHNLEPEGVILSHEHLDHRGGLDSILHIWPMLWIRSPLNWEHHQPCVRGEAWQWQG LRFSAHWPLQGSNDKGNNHSCVVKVDDGTNSILLTGDIEAPAEQKMLSRYWQQVQATLLQ VPHHGSNTSSSLPLIQRVNGKVALASASRYNAWRLPSNKVKHRYQLQGYQWIDTPHQGQT TVNFSAQGWRISSLREQILPRWYHQWFGVPVDNG >gi|296494722|gb|ADTN01000016.1| GENE 14 16296 - 18044 242 582 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 359 568 17 229 245 97 32 1e-19 MHNDKDLSTWQTFRRLWPTIAPFKAGLIVAGVALILNAASDTFMLSLLKPLLDDGFGKTD RSVLVWMPLVVIGLMILRGITSYVSSYCISWVSGKVVMTMRRRLFGHMMGMPVSFFDKQS TGTLLSRITYDSEQVASSSSGALITVVREGASIIGLFIMMFYYSWQLSIILIVLAPIVSI AIRVVSKRFRNISKNMQNTMGQVTTSAEQMLKGHKEVLIFGGQEVETKRFDKVSNRMRLQ GMKMVSASSISDPIIQLIASLALAFVLYAASFPSVMDSLTAGTITVVFSSMIALMRPLKS LTNVNAQFQRGMAACQTLFTILDSEQEKDEGKRVIERATGDVEFRNVTFTYPGRDVPALR 
NINLKIPAGKTVALVGRSGSGKSTIASLITRFYDIDEGEILMDGHDLREYTLASLRNQVA LVSQNVHLFNDTVANNIAYARTEQYSREQIEEAARMAYAMDFINKMDNGLDTVIGENGVL LSGGQRQRIAIARALLRDSPILILDEATSALDTESERAIQAALDELQKNRTSLVIAHRLS TIEKADEIVVVEDGVIVERGTHNDLLEHRGVYAQLHKMQFGQ >gi|296494722|gb|ADTN01000016.1| GENE 15 18041 - 19027 653 328 aa, chain + ## HITS:1 COG:lpxK KEGG:ns NR:ns ## COG: lpxK COG1663 # Protein_GI_number: 16128882 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Tetraacyldisaccharide-1-P 4'-kinase # Organism: Escherichia coli K12 # 1 328 1 328 328 649 100.0 0 MIEKIWSGESPLWRLLLPLSWLYGLVSGAIRLCYKLKLKRAWRAPVPVVVVGNLTAGGNG KTPVVVWLVEQLQQRGIRVGVVSRGYGGKAESYPLLLSADTTTAQAGDEPVLIYQRTDAP VAVSPVRSDAVKAILAQHPDVQIIVTDDGLQHYRLARDVEIVVIDGVRRFGNGWWLPAGP MRERAGRLKSVDAVIVNGGVPRSGEIPMHLLPGQAVNLRTGTRCDVAQLEHVVAMAGIGH PPRFFATLKMCGVQPEKCVPLADHQSLNHADVSALVSAGQTLVMTEKDAVKCRAFAEENW WYLPVDAQLSGDEPAKLLTQLTLLASGN >gi|296494722|gb|ADTN01000016.1| GENE 16 19064 - 20296 787 410 aa, chain + ## HITS:1 COG:ycaQ KEGG:ns NR:ns ## COG: ycaQ COG3214 # Protein_GI_number: 16128883 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 410 1 410 410 820 99.0 0 MSLPHLSLADARNLHLAAQGLLNKPRRRASLEDIPATISRMSLLQIDTINIVARSPYLVL FSRLGNYPAQWLDESLARGELMEYWAHEACFMPRSDFRLIRHRMLAPEKMGWKYKDAWMQ EHEAEIAQLIQHIHDKGPVRSADFEHPRKGASGWWEWKPHKRHLEGLFTAGKVMVIERRN FQRVYDLTHRVMPDWDDERDLVSQTEAEIIMLDNSARSLGIFREQWLADYYRLKRPALAA WRKARAEQQQIIAVHVEKLGNLWLHDDLLPLLERALAGKLTATHSAVLSPFDPVVWDRKR AEQLFDFSYRLECYTPAPKRQYGYFVLPLLHRGQLVGRMDAKMHRQTGILEVISLWLQEG IKPTTTLQKGLRQAITDFANWQQATRVTLGCCPQGLFTDCRTGWEIDPVA >gi|296494722|gb|ADTN01000016.1| GENE 17 20348 - 20530 198 60 aa, chain + ## HITS:1 COG:ECs1000 KEGG:ns NR:ns ## COG: ECs1000 COG2835 # Protein_GI_number: 15830254 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 60 1 60 60 119 100.0 2e-27 MDHRLLEIIACPVCNGKLWYNQEKQELICKLDNLAFPLRDGIPVLLETEARVLTADESKS 
>gi|296494722|gb|ADTN01000016.1| GENE 18 20527 - 21273 813 248 aa, chain + ## HITS:1 COG:kdsB KEGG:ns NR:ns ## COG: kdsB COG1212 # Protein_GI_number: 16128885 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: CMP-2-keto-3-deoxyoctulosonic acid synthetase # Organism: Escherichia coli K12 # 1 248 1 248 248 488 100.0 1e-138 MSFVVIIPARYASTRLPGKPLVDINGKPMIVHVLERARESGAERIIVATDHEDVARAVEA AGGEVCMTRADHQSGTERLAEVVEKCAFSDDTVIVNVQGDEPMIPATIIRQVADNLAQRQ VGMATLAVPIHNAEEAFNPNAVKVVLDAEGYALYFSRATIPWDRDRFAEGLETVGDNFLR HLGIYGYRAGFIRRYVNWQPSPLEHIEMLEQLRVLWYGEKIHVAVAQEVPGTGVDTPEDL ERVRAEMR >gi|296494722|gb|ADTN01000016.1| GENE 19 21427 - 22320 751 297 aa, chain + ## HITS:1 COG:no KEGG:JW0902 NR:ns ## KEGG: JW0902 # Name: ycbJ # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 297 1 297 297 588 100.0 1e-166 MEQLRAELSHLLGEKLSRIECVNEKADTALWALYDSQGNPMPLMARSFSTPGKARQLAWK TTMLARSGTVRMPTIYGVMTHEEHPGPDVLLLERMRGVSVEAPARTPERWEQLKDQIVEA LLAWHRQDSRGCVGAVDNTQENFWPSWYRQHVEVLWTTLNQFNNTGLTMQDKRILFRTRE CLPALFEGFNDNCVLIHGNFCLRSMLKDSRSDQLLAMVGPGLMLWAPREYELFRLMDNSL AEDLLWSYLQRAPVAESFIWRRWLYVLWDEVAQLVNTGRFSRRNFDLASKSLLPWLA >gi|296494722|gb|ADTN01000016.1| GENE 20 22297 - 23076 692 259 aa, chain - ## HITS:1 COG:ECs1003 KEGG:ns NR:ns ## COG: ECs1003 COG1434 # Protein_GI_number: 15830257 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 259 1 259 259 452 100.0 1e-127 MLFTLKKVIGNMLLPLPLMLLIIGAGLALLWFSRFQKTGKIFISIGWLALLLLSLQPVAD RLLRPIESTYPTWNNSQKVDYIVVLGGGYTWNPQWAPSSNLINNSLPRLNEGIRLWRENP GSKLIFTGGVAKTNTVSTAEVGARVAQSLGVPREQIITLDLPKDTEEEAAAVKQAIGDAP FLLVTSASHLPRAMIFFQQEGLNPLPAPANQLAIDSPLNPWERAIPSPVWLMHSDRVGYE TLGRIWQWLKGSSGEPRQE >gi|296494722|gb|ADTN01000016.1| GENE 21 23212 - 23997 703 261 aa, chain + ## HITS:1 COG:smtA KEGG:ns NR:ns ## COG: smtA COG0500 # Protein_GI_number: 16128888 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction 
only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli K12 # 1 261 1 261 261 539 100.0 1e-153 MQDRNFDDIAEKFSRNIYGTTKGQLRQAILWQDLDRVLAEMGPQKLRVLDAGGGEGQTAI KMAERGHQVILCDLSAQMIDRAKQAAEAKGVSDNMQFIHCAAQDVASHLETPVDLILFHA VLEWVADPRSVLQTLWSVLRPGGVLSLMFYNAHGLLMHNMVAGNFDYVQAGMPKKKKRTL SPDYPRDPAQVYLWLEEAGWQIMGKTGVRVFHDYLREKHQQRDCYEALLELETRYCRQEP YITLGRYIHVTARKPQSKDKV >gi|296494722|gb|ADTN01000016.1| GENE 22 23994 - 25316 1569 440 aa, chain + ## HITS:1 COG:mukF KEGG:ns NR:ns ## COG: mukF COG3006 # Protein_GI_number: 16128889 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Uncharacterized protein involved in chromosome partitioning # Organism: Escherichia coli K12 # 1 440 1 440 440 832 100.0 0 MSEFSQTVPELVAWARKNDFSISLPVDRLSFLLAVATLNGERLDGEMSEGELVDAFRHVS DAFEQTSETIGVRANNAINDMVRQRLLNRFTSEQAEGNAIYRLTPLGIGITDYYIRQREF STLRLSMQLSIVAGELKRAADAAEEGGDEFHWHRNVYAPLKYSVAEIFDSIDLTQRLMDE QQQQVKDDIAQLLNKDWRAAISSCELLLSETSGTLRELQDTLEAAGDKLQANLLRIQDAT MTHDDLHFVDRLVFDLQSKLDRIISWGQQSIDLWIGYDRHVHKFIRTAIDMDKNRVFAQR LRQSVQTYFDEPWALTYANADRLLDMRDEEMALRDEEVTGELPEDLEYEEFNEIREQLAA IIEEQLAVYKTRQVPLDLGLVVREYLSQYPRARHFDVARIVIDQAVRLGVAQADFTGLPA KWQPINDYGAKVQAHVIDKY >gi|296494722|gb|ADTN01000016.1| GENE 23 25297 - 26001 802 234 aa, chain + ## HITS:1 COG:ECs1006 KEGG:ns NR:ns ## COG: ECs1006 COG3095 # Protein_GI_number: 15830260 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Uncharacterized protein involved in chromosome partitioning # Organism: Escherichia coli O157:H7 # 1 234 1 234 234 444 99.0 1e-124 MSSTNIEQVMPVKLAQALANPLFPALDSALRSGRHIGLDELDNHAFLMDFQEYLEEFYAR YNVELIRAPEGFFYLRPRSTTLIPRSVLSELDMMVGKILCYLYLSPERLANEGIFTQQEL YDELLTLADEAKLLKLVNNRSTGSDVDRQKLQEKVRSSLNRLRRLGMVWFMGHDSSKFRI TESVFRFGADVRAGDDPREAQRRLIRDGEAMPIENHLQLNDETEENQPDSGEEE >gi|296494722|gb|ADTN01000016.1| GENE 24 26001 - 30461 5755 1486 aa, chain + ## HITS:1 COG:mukB KEGG:ns NR:ns ## COG: mukB COG3096 # Protein_GI_number: 16128891 # Func_class: D 
Cell cycle control, cell division, chromosome partitioning # Function: Uncharacterized protein involved in chromosome partitioning # Organism: Escherichia coli K12 # 1 1486 1 1486 1486 2526 99.0 0 MIERGKFRSLTLINWNGFFARTFDLDELVTTLSGGNGAGKSTTMAAFVTALIPDLTLLHF RNTTEAGATSGSRDKGLHGKLKAGVCYSMLDTINSRHQRVVVGVRLQQVAGRDRKVDIKP FAIQGLPMSVQPTQLVTETLNERQARVLPLNELKDKLEAMEGVQFKQFNSITDYHSLMFD LGIIARRLRSASDRSKFYRLIEASLYGGISSAITRSLRDYLLPENSGVRKAFQDMEAALR ENRMTLEAIRVTQSDRDLFKHLISEATNYVAADYMRHANERRVHLDKALEFRRELHTSRQ QLAAEQYKHVDMARELAEHNGAEGDLEADYQAASDHLNLVQTALRQQEKIERYEADLDEL QIRLEEQNEVVAEAIERQEENEARAEAAELEVDELKSQLADYQQALDVQQTRAIQYNQAI AALNRAKELCHLPDLTADCAAEWLETFQAKELEATEKMLSLEQKMSMAQTAHSQFEQAYQ LVVAINGPLARNEAWDVARELLREGVDQRHLAEQVQPLRMRLSELEQRLREQQEAERLLA DFCKRQGKNFDIDELEALHQELEARIASLSDSVSNAREERMALRQEQEQLQSRIQSLMQR APVWLAAQNSLNQLSEQCGEEFTSSQDVTEYLQQLLEREREAIVERDEVGARKNAVDEEI ERLSQPGGSEDQRLNALAERFGGVLLSEIYDDVSLEDAPYFSALYGPSRHAIVVPDLSQV TEHLEGLTDCPEDLYLIEGDPQSFDDSVFSVDELEKAVVVKIADRQWRYSRFPEVPLFGR AARESRIESLHAEREVLSERFATLSFDVQKTQRLHQAFSRFIGSHLAVAFESDPEAEIRQ LNSRRVELERALSNHENDNQQQRIQFEQAKEGVTALNRILPRLNLLADDSLADRVDEIRE RLDEAQEAARFVQQFGNQLAKLEPIVSVLQSDPEQFEQLKEDYAYSQQMQRDARQQAFAL TEVVQRRAHFSYSDSAEMLSGNSDLNEKLRERLEQAEAERTRAREALRGHAAQLSQYNQV LASLKSSYDTKKELLNDLQRELQDIGVRADSGAEERARIRRDELHAQLSNNRSRRNQLEK ALTFCEAEMDNLTRKLRKLERDYFEMREQVVTAKAGWCAVMRMVKDNGVERRLHRRELAY LSADDLRSMSDKALGALRLAVADNEHLRDVLRMSEDPKRPERKIQFFVAVYQHLRERIRQ DIIRTDDPVEAIEQMEIELSRLTEELTSREQKLAISSRSVANIIRKTIQREQNRIRMLNQ GLQNVSFGQVNSVRLNVNVRETHAMLLDVLSEQHEQHQDLFNSNRLTFSEALAKLYQRLN PQIDMGQRTPQTIGEELLDYRNYLEMEVEVNRGSDGWLRAESGALSTGEAIGTGMSILVM VVQSWEDESRRLRGKDISPCRLLFLDEAARLDARSIATLFELCERLQMQLIIAAPENISP EKGTTYKLVRKVFQNTEHVHVVGLRGFAPQLPETLPGTDEAPSQAS >gi|296494722|gb|ADTN01000016.1| GENE 25 30722 - 32569 1395 615 aa, chain + ## HITS:1 COG:ycbB KEGG:ns NR:ns ## COG: ycbB COG2989 # Protein_GI_number: 16128892 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 615 
1 615 615 1194 99.0 0 MLLNMMCGRRLSAISLCLAVTFAPLFNAQADEPEVIPGDSPVAVSEQGEALPQAQATAIM AGIQPLPEGAAEKARTQIESQLPAGYKPVYLNQLQLLYAARDMQPMWENRDAVKAFQQQL AEVAIAGFQPQFNKWVELLTDPGVNGMARDVVLSDAMMGYLHFIANIPVKGTRWLYSSKP YALATPPLSVINQWQLALDKGQLPTFVAGLAPQHPQYAAMHESLLALLCDTKPWPQLTGK ATLRPGQWSNDVPALREILQRTGMLDGGPKITLPGDDTPTDAVVSPSAVTVETAETKPMD KQTTSRSKPAPAVRAAYDNELVEAVKRFQAWQGLGADGAIGPATRDWLNVTPAQRAGVLA LNIQRLRLLPTELSTGIMVNIPAYSLVYYQNGNQVLDSRVIVGRPDRKTPMMSSALNNVV VNPPWNVPPTLARKDILPKVRNDPGYLESHGYTVMRGWNSREAIDPWQVDWSTITASNLP FRFQQAPGPRNSLGRYKFNMPSSEAIYLHDTPNHNLFKRDTRALSSGCVRVNKASDLANM LLQDAGWNDKRISDALKQGDTRYVNIRQSIPVNLYYLTAFVGADGRTQYRTDIYNYDLPA RSSSQIVSKAEQLIR >gi|296494722|gb|ADTN01000016.1| GENE 26 32610 - 32687 68 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIGGDSLQPPSLLGLSHLDVCFTGG >gi|296494722|gb|ADTN01000016.1| GENE 27 32750 - 33298 414 182 aa, chain + ## HITS:1 COG:ECs1009 KEGG:ns NR:ns ## COG: ECs1009 COG3108 # Protein_GI_number: 15830263 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 182 1 182 182 361 100.0 1e-100 MDKFDANRRKLLALGGVALGAAILPTPAFATLSTPRPRILTLNNLHTGESIKAEFFDGRG YIQEELAKLNHFFRDYRANKIKSIDPGLFDQLYRLQGLLGTRKPVQLISGYRSIDTNNEL RARSRGVAKKSYHTKGQAMDFHIEGIALSNIRKAALSMRAGGVGYYPRSNFVHIDTGPAR HW >gi|296494722|gb|ADTN01000016.1| GENE 28 33325 - 33972 603 215 aa, chain + ## HITS:1 COG:ycbL KEGG:ns NR:ns ## COG: ycbL COG0491 # Protein_GI_number: 16128894 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Escherichia coli K12 # 1 215 1 215 215 448 100.0 1e-126 MNYRIIPVTAFSQNCSLIWCEQTRLAALVDPGGDAEKIKQEVDDSGLTLMQILLTHGHLD HVGAAAELAQHYGVPVFGPEKEDEFWLQGLPAQSRMFGLEECQPLTPDRWLNEGDTISIG NVTLQVLHCPGHTPGHVVFFDDRAKLLISGDVIFKGGVGRSDFPRGDHNQLISSIKDKLL PLGDDVIFIPGHGPLSTLGYERLHNPFLQDEMPVW >gi|296494722|gb|ADTN01000016.1| GENE 29 34194 - 35384 1520 396 aa, chain - ## HITS:1 COG:aspC KEGG:ns NR:ns ## COG: aspC COG1448 # Protein_GI_number: 
16128895 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Escherichia coli K12 # 1 396 1 396 396 812 100.0 0 MFENITAAPADPILGLADLFRADERPGKINLGIGVYKDETGKTPVLTSVKKAEQYLLENE TTKNYLGIDGIPEFGRCTQELLFGKGSALINDKRARTAQTPGGTGALRVAADFLAKNTSV KRVWVSNPSWPNHKSVFNSAGLEVREYAYYDAENHTLDFDALINSLNEAQAGDVVLFHGC CHNPTGIDPTLEQWQTLAQLSVEKGWLPLFDFAYQGFARGLEEDAEGLRAFAAMHKELIV ASSYSKNFGLYNERVGACTLVAADSETVDRAFSQMKAAIRANYSNPPAHGASVVATILSN DALRAIWEQELTDMRQRIQRMRQLFVNTLQEKGANRDFSFIIKQNGMFSFSGLTKEQVLR LREEFGVYAVASGRVNVAGMTPDNMAPLCEAIVAVL >gi|296494722|gb|ADTN01000016.1| GENE 30 35569 - 36657 1386 362 aa, chain - ## HITS:1 COG:ompF KEGG:ns NR:ns ## COG: ompF COG3203 # Protein_GI_number: 16128896 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein (porin) # Organism: Escherichia coli K12 # 1 362 1 362 362 628 100.0 1e-180 MMKRNILAVIVPALLVAGTANAAEIYNKDGNKVDLYGKAVGLHYFSKGNGENSYGGNGDM TYARLGFKGETQINSDLTGYGQWEYNFQGNNSEGADAQTGNKTRLAFAGLKYADVGSFDY GRNYGVVYDALGYTDMLPEFGGDTAYSDDFFVGRVGGVATYRNSNFFGLVDGLNFAVQYL GKNERDTARRSNGDGVGGSISYEYEGFGIVGAYGAADRTNLQEAQPLGNGKKAEQWATGL KYDANNIYLAANYGETRNATPITNKFTNTSGFANKTQDVLLVAQYQFDFGLRPSIAYTKS KAKDVEGIGDVDLVNYFEVGATYYFNKNMSTYVDYIINQIDSDNKLGVGSDDTVAVGIVY QF >gi|296494722|gb|ADTN01000016.1| GENE 31 37259 - 38659 1751 466 aa, chain - ## HITS:1 COG:asnS KEGG:ns NR:ns ## COG: asnS COG0017 # Protein_GI_number: 16128897 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Escherichia coli K12 # 1 466 1 466 466 959 100.0 0 MSVVPVADVLQGRVAVDSEVTVRGWVRTRRDSKAGISFLAVYDGSCFDPVQAVINNSLPN YNEDVLRLTTGCSVIVTGKVVASPGQGQQFEIQASKVEVAGWVEDPDTYPMAAKRHSIEY LREVAHLRPRTNLIGAVARVRHTLAQALHRFFNEQGFFWVSTPLITASDTEGAGEMFRVS TLDLENLPRNDQGKVDFDKDFFGKESFLTVSGQLNGETYACALSKIYTFGPTFRAENSNT SRHLAEFWMLEPEVAFANLNDIAGLAEAMLKYVFKAVLEERADDMKFFAERVDKDAVSRL ERFIEADFAQVDYTDAVTILENCGRKFENPVYWGVDLSSEHERYLAEEHFKAPVVVKNYP 
KDIKAFYMRLNEDGKTVAAMDVLAPGIGEIIGGSQREERLDVLDERMLEMGLNKEDYWWY RDLRRYGTVPHSGFGLGFERLIAYVTGVQNVRDVIPFPRTPRNASF >gi|296494722|gb|ADTN01000016.1| GENE 32 38553 - 38822 83 89 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPAFESRRVRTHPRTVTSLSTATRPWSTSATGTTLIIFSLLIVGKNKHLSTRKWGDTYVT WHLQSDKQNSQMQRKISELKVKKGSRLAP >gi|296494722|gb|ADTN01000016.1| GENE 33 38828 - 40078 1285 416 aa, chain - ## HITS:1 COG:pncB KEGG:ns NR:ns ## COG: pncB COG1488 # Protein_GI_number: 16128898 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid phosphoribosyltransferase # Organism: Escherichia coli K12 # 17 416 1 400 400 823 100.0 0 MFIVLNAPVRGRYCAPMTQFASPVLHSLLDTDAYKLHMQQAVFHHYYDVHVAAEFRCRGD DLLGIYADAIREQVQAMQHLRLQDDEYQWLSALPFFKADYLNWLREFRFNPEQVTVSNDN GKLDIRLSGPWREVILWEVPLLAVISEMVHRYRSPQADVAQALDTLESKLVDFSALTAGL DMSRFHLMDFGTRRRFSREVQETIVKRLQQESWFVGTSNYDLARRLSLTPMGTQAHEWFQ AHQQISPDLANSQRAALAAWLEEYPDQLGIALTDCITMDAFLRDFGVEFASRYQGLRHDS GDPVEWGEKAIAHYEKLGIDPQSKTLVFSDNLDLRKAVELYRHFSSRVQLSFGIGTRLTC DIPQVKPLNIVIKLVECNGKPVAKLSDSPGKTICHDKAFVRALRKAFDLPHIKKAS >gi|296494722|gb|ADTN01000016.1| GENE 34 40296 - 42908 3055 870 aa, chain + ## HITS:1 COG:pepN KEGG:ns NR:ns ## COG: pepN COG0308 # Protein_GI_number: 16128899 # Func_class: E Amino acid transport and metabolism # Function: Aminopeptidase N # Organism: Escherichia coli K12 # 1 870 1 870 870 1759 100.0 0 MTQQPQAKYRHDYRAPDYQITDIDLTFDLDAQKTVVTAVSQAVRHGASDAPLRLNGEDLK LVSVHINDEPWTAWKEEEGALVISNLPERFTLKIINEISPAANTALEGLYQSGDALCTQC EAEGFRHITYYLDRPDVLARFTTKIIADKIKYPFLLSNGNRVAQGELENGRHWVQWQDPF PKPCYLFALVAGDFDVLRDTFTTRSGREVALELYVDRGNLDRAPWAMTSLKNSMKWDEER FGLEYDLDIYMIVAVDFFNMGAMENKGLNIFNSKYVLARTDTATDKDYLDIERVIGHEYF HNWTGNRVTCRDWFQLSLKEGLTVFRDQEFSSDLGSRAVNRINNVRTMRGLQFAEDASPM AHPIRPDMVIEMNNFYTLTVYEKGAEVIRMIHTLLGEENFQKGMQLYFERHDGSAATCDD FVQAMEDASNVDLSHFRRWYSQSGTPIVTVKDDYNPETEQYTLTISQRTPATPDQAEKQP LHIPFAIELYDNEGKVIPLQKGGHPVNSVLNVTQAEQTFVFDNVYFQPVPALLCEFSAPV KLEYKWSDQQLTFLMRHARNDFSRWDAAQSLLATYIKLNVARHQQGQPLSLPVHVADAFR 
AVLLDEKIDPALAAEILTLPSVNEMAELFDIIDPIAIAEVREALTRTLATELADELLAIY NANYQSEYRVEHEDIAKRTLRNACLRFLAFGETHLADVLVSKQFHEANNMTDALAALSAA VAAQLPCRDALMQEYDDKWHQNGLVMDKWFILQATSPAANVLETVRGLLQHRSFTMSNPN RIRSLIGAFAGSNPAAFHAEDGSGYLFLVEMLTDLNSRNPQVASRLIEPLIRLKRYDAKR QEKMRAALEQLKGLENLSGDLYEKITKALA >gi|296494722|gb|ADTN01000016.1| GENE 35 42951 - 43718 218 255 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 12 255 9 284 309 88 26 9e-17 MNTARLNQGTPLLLNAVSKHYAENIVLNQLDLHIPAGQFVAVVGRSGGGKSTLLRLLAGL ETPTAGDVLAGTTPLAEIQEDTRMMFQDARLLPWKSVIDNVGLGLKGQWRDAARRALAAV GLENRAGEWPAALSGGQKQRVALARALIHRPGLLLLDEPLGALDALTRLEMQDLIVSLWQ EHGFTVLLVTHDVSEAVAMADRVLLIEEGKIGLDLTVDIPRPRRLGSVRLAELEAEVLQR VMQRGESETRLRKQG >gi|296494722|gb|ADTN01000016.1| GENE 36 43715 - 44506 815 263 aa, chain - ## HITS:1 COG:ssuC KEGG:ns NR:ns ## COG: ssuC COG0600 # Protein_GI_number: 16128901 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Escherichia coli K12 # 1 263 16 278 278 422 100.0 1e-118 MATPVKKWLLRVAPWFLPVGIVAVWQLASSVGWLSTRILPSPEGVVTAFWTLSASGELWQ HLAISSWRALIGFSIGGSLGLILGLISGLSRWGERLLDTSIQMLRNVPHLALIPLVILWF GIDESAKIFLVALGTLFPIYINTWHGIRNIDRGLVEMARSYGLSGIPLFIHVILPGALPS IMVGVRFALGLMWLTLIVAETISANSGIGYLAMNAREFLQTDVVVVAIILYALLGKLADV SAQLLERLWLRWNPAYHLKEATV >gi|296494722|gb|ADTN01000016.1| GENE 37 44517 - 45662 1235 381 aa, chain - ## HITS:1 COG:ycbN KEGG:ns NR:ns ## COG: ycbN COG2141 # Protein_GI_number: 16128902 # Func_class: C Energy production and conversion # Function: Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases # Organism: Escherichia coli K12 # 1 381 1 381 381 757 100.0 0 MSLNMFWFLPTHGDGHYLGTEEGSRPVDHGYLQQIAQAADRLGYTGVLIPTGRSCEDAWL VAASMIPVTQRLKFLVALRPSVTSPTVAARQAATLDRLSNGRALFNLVTGSDPQELAGDG VFLDHSERYEASAEFTQVWRRLLQRETVDFNGKHIHVRGAKLLFPAIQQPYPPLYFGGSS 
DVAQELAAEQVDLYLTWGEPPELVKEKIEQVRAKAAAHGRKIRFGIRLHVIVRETNDEAW QAAERLISHLDDETIAKAQAAFARTDSVGQQRMAALHNGKRDNLEISPNLWAGVGLVRGG AGTALVGDGPTVAARINEYAALGIDSFVLSGYPHLEEAYRVGELLFPLLDVAIPEIPQPQ PLNPQGEAVANDFIPRKVAQS >gi|296494722|gb|ADTN01000016.1| GENE 38 45659 - 46618 1001 319 aa, chain - ## HITS:1 COG:ssuA KEGG:ns NR:ns ## COG: ssuA COG0715 # Protein_GI_number: 16128903 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Escherichia coli K12 # 1 319 15 333 333 608 100.0 1e-174 MRNIIKLALAGLLSVSTFAVAAESSPEALRIGYQKGSIGMVLAKSHQLLEKRYPESKISW VEFPAGPQMLEALNVGSIDLGSTGDIPPIFAQAAGADLVYVGVEPPKPKAEVILVAENSP IKTVADLKGHKVAFQKGSSSHNLLLRALRQAGLKFTDIQPTYLTPADARAAFQQGNVDAW AIWDPYYSAALLQGGVRVLKDGTDLNQTGSFYLAARPYAEKNGAFIQGVLATFSEADALT RSQREQSIALLAKTMGLPAPVIASYLDHRPPTTIKPVNAEVAALQQQTADLFYENRLVPK KVDIRQRIWQPTQLEGKQL >gi|296494722|gb|ADTN01000016.1| GENE 39 46611 - 47186 521 191 aa, chain - ## HITS:1 COG:ZycbP KEGG:ns NR:ns ## COG: ZycbP COG0431 # Protein_GI_number: 15800798 # Func_class: R General function prediction only # Function: Predicted flavoprotein # Organism: Escherichia coli O157:H7 EDL933 # 1 191 1 191 191 353 97.0 8e-98 MRVITLAGSPRFPSRSSSLLEYAREKLNGLDVEVYHWNLQNFAPEDLLYARFDSPALKTF TEQLQQADGLIVATPVYKAAYSGALKTLLDLLPERALQGKVVLPLATGGTVAHLLAVDYA LKPVLSALKAQEILHGVFADDSQVIDYHHRPQFTPNLQTRLDTALETFWQALHRRDVQVP DLLSLRGNAHA >gi|296494722|gb|ADTN01000016.1| GENE 40 47542 - 48081 288 179 aa, chain + ## HITS:1 COG:ycbQ KEGG:ns NR:ns ## COG: ycbQ COG3539 # Protein_GI_number: 16128905 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 179 4 182 182 251 100.0 4e-67 MKKSVLTAFITVVCATSSVMAADDNAITDGSVTFNGKVIAPACTLVAATKDSVVTLPDVS ATKLQTNGQVSGVQIDVPIELKDCDTTVTKNATFTFNGTADTTQITAFANQASSDAATNV ALQMYMNDGTTAITPDTETGNILLQDGDQTLTFKVDYIATGKATSGNVNAVTNFHINYY 
>gi|296494722|gb|ADTN01000016.1| GENE 41 48164 - 48865 305 233 aa, chain + ## HITS:1 COG:ycbR KEGG:ns NR:ns ## COG: ycbR COG3121 # Protein_GI_number: 16128906 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli K12 # 1 233 1 233 233 447 100.0 1e-126 MKTCITKGIVTVSLTAILLSCSSAWAAGKGGIGLAATRLVYSEGEEQISLGVRNTSPDVP YLIQSWVMTPDNKKSADFIITPPLFVLNPANENLLRIMYIGAPLAKDRETLFFTSVRAVP STTKRKEGNTLKIATQSVIKLFWRPKGLAYPLGEAPAKLRCTSSADMVTVSNPTPYFITL TDLKIGGKVVKNQMISPFDKYQFSLPKGAKNSSVTYRTINDYGAETPQLNCKS >gi|296494722|gb|ADTN01000016.1| GENE 42 49037 - 51490 1666 817 aa, chain + ## HITS:1 COG:ycbS KEGG:ns NR:ns ## COG: ycbS COG3188 # Protein_GI_number: 16128907 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli K12 # 1 817 50 866 866 1581 99.0 0 MVADLSRFEKGQKITPGVYRVDIVLNQTIVDTRNVNFVEITPEKGIAACLTTESLDAMGV NTDAFPAFKQLDKQACVPLAEIIPDASVTFNVNKLRLEISVPQIAIKSNARGYVPPERWD EGINALLLGYSFSGANSIHSSADSDSGDSYFLNLNSGVNLGPWRLRNNSTWSRSSGQTAE WKNLSSYLQRAVIPLKGELTVGDDYTAGDFFDSVSFRGVQLASDDNMLPDSLKGFAPVVR GIAKSNAQITIKQNGYTIYQTYVSPGAFEISDLYSTSSSGDLLVEIKEADGSVNSYSVPF SSVPLLQRQGRIKYAVTLAKYRTNSNEQQESKFAQATLQWGGPWGTTWYGGGQYAEYYRA AMFGLGFNLGDFGAISFDATQAKSTLADQSEHKGQSYRFLYAKTLNHLGTNFQLMGYRYS TSGFYTLSDTMYKHMDGYEFNDGDDEDTPMWSRYYNLFYTKRGKLQVNISQQLGEYGSFY LSGSQQTYWHTDQQDRLLQFGYNTQIKDLSLGISWNYSKSRGQPDADQVFALNFSLPLNL LLPRSNDSYTRKKNYAWMTSNTSIDNEGHTTQNLGLTETLLDDGNLSYSVQQGYNSEGKT ANGSASMDYKGAFADARVGYNYSDNGSQQQLNYALSGSLVAHSQGITLGQSLGETNVLIA APGAENTRVANSTGLKTDWRGYTVVPYAISYRENRIALDAASLKRNVDLENAVVNVVPTK GALVLAEFNAHAGARVLMKTSKQGIPLRFGAIATLDGVQANSGIIDDDGSLYMAGLPAKG TISVRWGEAPDQICHINYELTEQQINSAITRMDAICR >gi|296494722|gb|ADTN01000016.1| GENE 43 51508 - 52551 382 347 aa, chain + ## HITS:1 COG:ycbT KEGG:ns NR:ns ## COG: ycbT COG3539 # Protein_GI_number: 16128908 # Func_class: N Cell 
motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 347 10 356 356 693 99.0 0 MSLLRLFFAAVLMLWCAQTAAYSGQCHTTQGNPYIGVNFGVKTLEEEANTAGVVKDKFYQ WNESNDYYVSCDCDKDNVRSGRWAFAADSPLVYLGDNWYKINDYLAAKVLLQVKGSSPTA VPFENVGTGGDTRWHICDPGGQRLGGQGASGNSGSFSLKILQPFVGSVVIPPMALARLYE CYNIPAGDSCTTTGTPVLVYYLSGTINSLGSCSVNAGETIEVDLGDVFAANFRVVGHKPL GARTAELAIPVRCNTGNAGLVNVNLSLTATTDPSYPQAIKTSRPGVGVVVTDSQNNIISP AGGTLPLSIPDDADSIARMNVYPVSTTGVPPETGRFEATATVRINFD >gi|296494722|gb|ADTN01000016.1| GENE 44 52563 - 53105 241 180 aa, chain + ## HITS:1 COG:ycbU KEGG:ns NR:ns ## COG: ycbU COG3539 # Protein_GI_number: 16128909 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 180 1 180 180 334 100.0 6e-92 MKKKTIFQCVILFFSILNIHVGMAGPEQVSMHIYGNVVDQGCDVATKSALQNIHIGDFNI SDFQAANTVSTAADLNIDITGCAAGITGADVLFSGEADTLAPTLLKLTDTGGSGGMATGI AVQILDAQSQQEIPLNQVQPLTPLKAGDNTLKYQLRYKSTKAGATGGNATAVLYFDLVYQ >gi|296494722|gb|ADTN01000016.1| GENE 45 53113 - 53628 207 171 aa, chain + ## HITS:1 COG:ycbV KEGG:ns NR:ns ## COG: ycbV COG3539 # Protein_GI_number: 16128910 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 171 17 187 187 345 100.0 2e-95 MLKRIIWILFLLGLTWGCELFAHDGTVNISGSFRRNTCVLAQDSKQINVQLGDVSLTRFS HGNYGPEKSFIINLQDCGTDVSTVDVTFSGTPDGVQSEMLSIESGTDAASGLAIAILDDA KILIPLNQASKDYSLHSGKVPLTFYAQLRPVNSDVQSGKVNASATFVLHYD >gi|296494722|gb|ADTN01000016.1| GENE 46 53621 - 54331 166 236 aa, chain + ## HITS:1 COG:ycbF KEGG:ns NR:ns ## COG: ycbF COG3121 # Protein_GI_number: 16128911 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli K12 # 1 236 10 245 245 426 
100.0 1e-119 MTNTWNRLALLIFAVLSLLVAGELQAGVVVGGTRFIFPADRESISILLTNTSQESWLINS KINRPTRWAGGEASTVPAPLLAAPPLILLKPGTTGTLRLLRTESDILPVDRETLFELSIA SVPSGKVENQSVKVAMRSVFKLFWRPEGLPGDPLEAYQQLRWTRNSQGVQLTNPTPYYIN LIQVSVNGKALSNVGVVPPKSQRQTSWCQAIAPCHVAWRAINDYGGLSAKKEQNLP >gi|296494722|gb|ADTN01000016.1| GENE 47 54289 - 54498 153 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|213615837|ref|ZP_03371663.1| ## NR: gi|213615837|ref|ZP_03371663.1| hypothetical protein SentesTyp_15729 [Salmonella enterica subsp. enterica serovar Typhi str. E98-2068] # 1 44 1 44 45 81 88.0 2e-14 MSALWIELEKGFTNEGVVHELSWIPGVQTGGVLCASSDQKGIDLRQKKRKRFPNLFQGRF CSFFALNPP >gi|296494722|gb|ADTN01000016.1| GENE 48 54442 - 55452 941 336 aa, chain + ## HITS:1 COG:ECs1029 KEGG:ns NR:ns ## COG: ECs1029 COG0167 # Protein_GI_number: 15830283 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Escherichia coli O157:H7 # 1 336 1 336 336 677 99.0 0 MYYPFVRKALFQLDPERAHEFTFQQLRRITGTPFEALVRQKVPAKPVNCMGLTFKNPLGL AAGLDKDGECIDALGAMGFGSIEIGTVTPRPQPGNDKPSLFRLVDAEGLINRMGFNNLGV DNLVENVKKAHYDGVLGINIGKNKDTPVEQGKDDYLICMEKIYAYAGYIAINISSPNTPG LRTLQYGEALDDLLTAIKNKQNDLQAMHHKYVPIAVKIAPDLSEEELIQVADSLVRHNID GVIATNTTLDRSLVQGMKNCDQTGGLSGRPLQLKSTEIIRRLSLELNGRLPIIGVGGIDS VIAAREKIAAGASLVQIYSGFIFKGPPLIKEIVTHI >gi|296494722|gb|ADTN01000016.1| GENE 49 55626 - 56168 492 180 aa, chain + ## HITS:1 COG:no KEGG:SSON_0950 NR:ns ## KEGG: SSON_0950 # Name: ycbW # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 180 13 192 192 374 100.0 1e-102 MRIKPDDNWRWYYDEEHDRMMLDLANGMLFRSRFARKMLTPDAFSPAGFCVDDAALYFSF EEKCRDFNLSKEQKAELVLNALVAIRYLKPQMPKSWHFVSHGEMWVPMPGDAACVWLSDT HEQVNLLVVESGENAALCLLAQPCVVIAGRAMQLGDAIKIMNDRLKPQVNVDSFSLEQAV >gi|296494722|gb|ADTN01000016.1| GENE 50 56165 - 57232 689 355 aa, chain - ## HITS:1 COG:ycbX_1 KEGG:ns NR:ns ## COG: ycbX_1 COG3217 # Protein_GI_number: 16128914 # Func_class: R General function prediction only # Function: Uncharacterized Fe-S protein # Organism: 
Escherichia coli K12 # 1 259 15 273 273 543 100.0 1e-154 MRGIGLTHALADVSGLAFDRIFMITEPDGTFITARQFPQMVRFTPSPVHDGLHLTAPDGS SAYVRFADFATQDAPTEVWGTHFTARIAPDAINKWLSGFFSREVQLRWVGPQMTRRVKRH NTVPLSFADGYPYLLANEASLRDLQQRCPASVKMEQFRPNLVVSGASAWEEDRWKVIRIG DVVFDVVKPCSRCIFTTVSPEKGQKHPAGEPLKTLQSFRTAQDNGDVDFGQNLIARNSGV IRVGDEVEILATAPAKIYGAAAADDTANITQQPDANVDIDWQGQAFRGNNQQVLLEQLEN QGIRIPYSCRAGICGSCRVQLLEGEVTPLKKSAMGDDGTILCCSCVPKTALKLAR >gi|296494722|gb|ADTN01000016.1| GENE 51 57518 - 59626 178 702 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative [Thermococcus barophilus MP] # 487 667 174 328 396 73 29 4e-12 MNSLFASTARGLEELLKTELENLGAVECQVVQGGVHFKGDTRLVYQSLMWSRLASRIMLP LGECKVYSDLDLYLGVQAINWTEMFNPGATFAVHFSGLNDTIRNSQYGAMKVKDAIVDAF TRKNLPRPNVDRDAPDIRVNVWLHKETASIALDLSGDGLHLRGYRDRAGIAPIKETLAAA IVMRSGWQPGTPLLDPMCGSGTLLIEAAMLATDRAPGLHRGRWGFSGWAQHDEAIWQEVK AEAQTRARKGLAEYSSHFYGSDSDARVIQRARTNARLAGIGELITFEVKDVAQLTNPLPK GPYGTVLSNPPYGERLDSEPALIALHSLLGRIMKNQFGGWNLSLFSASPDLLSCLQLRAD KQYKAKNGPLDCVQKNYHVAESTPDSKPAMVAEDYTNRLRKNLKKFEKWARQEGIECYRL YDADLPEYNVAVDRYADWVVVQEYAPPKTIDAHKARQRLFDIIAATISVLGIAPNKLVLK TRERQKGKNQYQKLGEKGEFLEVTEYNAHLWVNLTDYLDTGLFLDHRIARRMLGQMSKGK DFLNLFSYTGSATVHAGLGGARSTTTVDMSRTYLEWAERNLRLNGLTGRAHRLIQADCLA WLREANEQFDLIFIDPPTFSNSKRMEDAFDVQRDHLALMKDLKRLLRAGGTIMFSNNKRG FRMDLDGLAKLGLKAQEITQKTLSQDFARNRQIHNCWLITAA >gi|296494722|gb|ADTN01000016.1| GENE 52 59638 - 61545 2412 635 aa, chain + ## HITS:1 COG:ECs1033 KEGG:ns NR:ns ## COG: ECs1033 COG0488 # Protein_GI_number: 15830287 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Escherichia coli O157:H7 # 1 635 1 635 635 1186 99.0 0 MSLISMHGAWLSFSDAPLLDNAELHIEDNERVCLVGRNGAGKSTLMKILNREQGLDDGRI IYEQDLIVARLQQDPPRNVEGSVYDFVAEGIEEQAEYLKRYHDISRLVMNDPSEKNLNEL AKVQEQLDHHNLWQLENRINEVLAQLGLDPNVALSSLSGGWLRKAALGRALVSNPRVLLL DEPTNHLDIETIDWLEGFLKTFNGTIIFISHDRSFIRNMATRIVDLDRGKLVTYPGNYDQ 
YLLEKEEALRVEELQNAEFDRKLAQEEVWIRQGIKARRTRNEGRVRALKAMRRERGERRE VMGTAKMQVEEASRSGKIVFEMEDVCYQVNGKQLVKDFSAQVLRGDKIALIGPNGCGKTT LLKLMLGQLQADSGRIHVGTKLEVAYFDQHRAELDPDKTVMDNLAEGKQEVMVNGKPRHV LGYLQDFLFHPKRAMTPVRALSGGERNRLLLARLFLKPSNLLILDEPTNDLDVETLELLE ELIDSYQGTVLLVSHDRQFVDNTVTECWIFEGGGKIGRYVGGYHDARGQQEQYVALKQPA VKKTEEAAAAKAETVKRSSSKLSYKLQRELEQLPQLLEDLEAKLEALQTQVADASFFSQP HEQTQKVLADMAAAEQELEQAFERWEYLEALKNGG >gi|296494722|gb|ADTN01000016.1| GENE 53 61675 - 62928 634 417 aa, chain + ## HITS:1 COG:pqiA KEGG:ns NR:ns ## COG: pqiA COG2995 # Protein_GI_number: 16128917 # Func_class: S Function unknown # Function: Uncharacterized paraquat-inducible protein A # Organism: Escherichia coli K12 # 1 417 1 417 417 790 100.0 0 MCEHHHAAKHILCSQCDMLVALPRLEHGQKAACPRCGTTLTVAWDAPRQRPTAYALAALF MLLLSNLFPFVNMNVAGVTSEITLLEIPGVLFSEDYASLGTFFLLFVQLVPAFCLITILL LVNRAELPVRLKEQLARVLFQLKTWGMAEIFLAGVLVSFVKLMAYGSIGVGSSFLPWCLF CVLQLRAFQCVDRRWLWDDIAPMPELRQPLKPGVTGIRQGLRSCSCCTAILPADEPVCPR CSTKGYVRRRNSLQWTLALLVTSIMLYLPANILPIMVTDLLGSKMPSTILAGVILLWSEG SYPVAAVIFLASIMVPTLKMIAIAWLCWDAKGHGKRDSERMHLIYEVVEFVGRWSMIDVF VIAVLSALVRMGGLMSIYPAMGALMFALVVIMTMFSAMTFDPRLSWDRQPESEHEES >gi|296494722|gb|ADTN01000016.1| GENE 54 62933 - 64573 1539 546 aa, chain + ## HITS:1 COG:pqiB KEGG:ns NR:ns ## COG: pqiB COG3008 # Protein_GI_number: 16128918 # Func_class: R General function prediction only # Function: Paraquat-inducible protein B # Organism: Escherichia coli K12 # 1 546 1 546 546 1082 100.0 0 MESNNGEAKIQKVKNWSPVWIFPIVTALIGAWVLFYHYSHQGPEVTLITANAEGIEGGKT TIKSRSVDVGVVESATLADDLTHVEIKARLNSGMEKLLHKDTVFWVVKPQIGREGISGLG TLLSGVYIELQPGAKGSKMDKYDLLDSPPLAPPDAKGIRVILDSKKAGQLSPGDPVLFRG YRVGSVETSTFDTQKRNISYQLFINAPYDRLVTNNVRFWKDSGIAVDLTSAGMRVEMGSL TTLLSGGVSFDVPEGLDLGQPVAPKTAFVLYDDQKSIQDSLYTDHIDYLMFFKDSVRGLQ PGAPVEFRGIRLGTVSKVPFFAPNMRQTFNDDYRIPVLIRIEPERLKMQLGENADVVEHL GELLKRGLRGSLKTGNLVTGALYVDLDFYPNTPAITGIREFNGYQIIPTVSGGLAQIQQR LMEALDKINKLPLNPMIEQATSTLSESQRTMKNLQTTLDSMNKILASQSMQQLPTDMQST 
LRELNRSMQGFQPGSAAYNKMVADMQRLDQVLRELQPVLKTLNEKSNALVFEAKDKKDPE PKRAKQ >gi|296494722|gb|ADTN01000016.1| GENE 55 64570 - 65133 653 187 aa, chain + ## HITS:1 COG:ymbA KEGG:ns NR:ns ## COG: ymbA COG3009 # Protein_GI_number: 16128919 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 6 187 1 182 182 338 99.0 3e-93 MKKWLVTIAALWLAGCSSGEINKNYYQLPVVQSGTQSTASQGNRLLWVEQVTVPDYLAGN GVVYQTSDVKYVIANNNLWASPLDQQLRNTLVANLSTQLPGWVVASQPLGSAQDTLNVTV TEFNGRYDGKVIVSGEWLLNHQGQLIKRPFRLEGVQTQDGYDEMVKVLAGVWSQEAASIA QEIKRLP >gi|296494722|gb|ADTN01000016.1| GENE 56 65389 - 65556 86 55 aa, chain + ## HITS:1 COG:ECs1037 KEGG:ns NR:ns ## COG: ECs1037 COG3130 # Protein_GI_number: 15830291 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome modulation factor # Organism: Escherichia coli O157:H7 # 1 55 1 55 55 80 100.0 5e-16 MKRQKRDRLERAHQRGYQAGIAGRSKEMCPYQTLNQRSQWLGGWREAMADRVVMA >gi|296494722|gb|ADTN01000016.1| GENE 57 65626 - 66144 738 172 aa, chain - ## HITS:1 COG:ECs1038 KEGG:ns NR:ns ## COG: ECs1038 COG0764 # Protein_GI_number: 15830292 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases # Organism: Escherichia coli O157:H7 # 1 172 1 172 172 344 100.0 4e-95 MVDKRESYTKEDLLASGRGELFGAKGPQLPAPNMLMMDRVVKMTETGGNFDKGYVEAELD INPDLWFFGCHFIGDPVMPGCLGLDAMWQLVGFYLGWLGGEGKGRALGVGEVKFTGQVLP TAKKVTYRIHFKRIVNRRLIMGLADGEVLVDGRLIYTASDLKVGLFQDTSAF >gi|296494722|gb|ADTN01000016.1| GENE 58 66213 - 67973 1614 586 aa, chain - ## HITS:1 COG:ycbZ KEGG:ns NR:ns ## COG: ycbZ COG1067 # Protein_GI_number: 16128922 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent protease # Organism: Escherichia coli K12 # 1 586 1 586 586 1167 100.0 0 MTITKLAWRDLVPDTDSYQEIFAQPHLIDENDPLFSDTQPRLQFALEQLLHTRASSSFML AKAPEESEYLNLIANAARTLQSDAGQLVGGHYEVSGHSIRLRHAVSADDNFATLTQVVAA 
DWVEAEQLFGCLRQFNGDITLQPGLVHQANGGILIISLRTLLAQPLLWMRLKNIVNRERF DWVAFDESRPLPVSVPSMPLKLKVILVGERESLADFQEMEPELSEQAIYSEFEDTLQIVD AESVTQWCRWVTFTARHNHLPAPGADAWPILIREAARYTGEQETLPLSPQWILRQCKEVA SLCDGDTFSGEQLNLMLQQREWREGFLAERMQDEILQEQILIETEGERIGQINALSVIEF PGHPRAFGEPSRISCVVHIGDGEFTDIERKAELGGNIHAKGMMIMQAFLMSELQLEQQIP FSASLTFEQSYSEVDGDSASMAELCALISALADVPVNQSIAITGSVDQFGRAQPVGGLNE KIEGFFAICQQRELTGKQGVIIPTANVRHLSLHSELVKAVEEGKFTIWAVDDVTDALPLL LNLVWDGEGQTTLMQTIQERIAQASQQEGRHRFPWPLRWLNWFIPN >gi|296494722|gb|ADTN01000016.1| GENE 59 68159 - 68611 419 150 aa, chain + ## HITS:1 COG:ECs1040 KEGG:ns NR:ns ## COG: ECs1040 COG3120 # Protein_GI_number: 15830294 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 150 1 150 150 253 100.0 9e-68 MKYQQLENLESGWKWKYLVKKHREGELITRYIEASAAQEAVDVLLSLENEPVLVNGWIDK HMNPELVNRMKQTIRARRKRHFNAEHQHTRKKSIDLEFIVWQRLAGLAQRRGKTLSETIV QLIEDAENKEKYANKMSSLKQDLQALLGKE Prediction of potential genes in microbial genomes Time: Sun May 15 23:04:28 2011 Seq name: gi|296494721|gb|ADTN01000017.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont64.2, whole genome shotgun sequence Length of sequence - 12643 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 10, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 46 - 1086 1225 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins - Prom 1239 - 1298 12.2 - Term 1329 - 1372 3.1 2 2 Tu 1 . - CDS 1443 - 1952 314 ## COG5404 SOS-response cell division inhibitor, blocks FtsZ ring formation - Prom 2139 - 2198 5.8 + Prom 2057 - 2116 4.5 3 3 Tu 1 . + CDS 2171 - 2800 544 ## COG3070 Regulator of competence-specific genes - Term 2699 - 2747 3.9 4 4 Op 1 5/0.286 - CDS 2763 - 4925 1865 ## COG1289 Predicted membrane protein 5 4 Op 2 . - CDS 4935 - 5381 527 ## COG3304 Predicted membrane protein - Prom 5414 - 5473 3.6 + Prom 5378 - 5437 4.7 6 5 Tu 1 . 
+ CDS 5486 - 7558 1797 ## COG0210 Superfamily I DNA and RNA helicases + Term 7643 - 7681 1.2 - Term 7549 - 7582 2.1 7 6 Op 1 3/0.857 - CDS 7590 - 8048 408 ## COG1803 Methylglyoxal synthase - Prom 8070 - 8129 2.9 - Term 8108 - 8134 -1.0 8 6 Op 2 . - CDS 8144 - 8806 609 ## COG3110 Uncharacterized protein conserved in bacteria 9 6 Op 3 . - CDS 8846 - 9004 100 ## EcSMS35_2154 hypothetical protein + Prom 8800 - 8859 3.7 10 7 Tu 1 . + CDS 8979 - 9392 599 ## COG1832 Predicted CoA-binding protein + Term 9397 - 9436 9.0 - Term 9383 - 9424 10.2 11 8 Op 1 3/0.857 - CDS 9437 - 9754 406 ## COG3785 Uncharacterized conserved protein 12 8 Op 2 . - CDS 9812 - 11002 580 ## PROTEIN SUPPORTED gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative - Prom 11072 - 11131 2.0 + Prom 10954 - 11013 2.4 13 9 Tu 1 . + CDS 11097 - 11375 359 ## COG1254 Acylphosphatases + Term 11436 - 11479 1.6 14 10 Op 1 2/1.000 - CDS 11372 - 11701 580 ## COG2920 Dissimilatory sulfite reductase (desulfoviridin), gamma subunit - Prom 11725 - 11784 4.8 - Term 11733 - 11772 3.1 15 10 Op 2 . 
- CDS 11792 - 12451 872 ## COG0670 Integral membrane protein, interacts with FtsH - Prom 12579 - 12638 6.8 Predicted protein(s) >gi|296494721|gb|ADTN01000017.1| GENE 1 46 - 1086 1225 346 aa, chain - ## HITS:1 COG:ECs1041 KEGG:ns NR:ns ## COG: ECs1041 COG2885 # Protein_GI_number: 15830295 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Escherichia coli O157:H7 # 1 346 1 346 346 658 100.0 0 MKKTAIAIAVALAGFATVAQAAPKDNTWYTGAKLGWSQYHDTGFINNNGPTHENQLGAGA FGGYQVNPYVGFEMGYDWLGRMPYKGSVENGAYKAQGVQLTAKLGYPITDDLDIYTRLGG MVWRADTKSNVYGKNHDTGVSPVFAGGVEYAITPEIATRLEYQWTNNIGDAHTIGTRPDN GMLSLGVSYRFGQGEAAPVVAPAPAPAPEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQ LYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYLISKGIPADKISARGM GESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEVKGIKDVVTQPQA >gi|296494721|gb|ADTN01000017.1| GENE 2 1443 - 1952 314 169 aa, chain - ## HITS:1 COG:ECs1042 KEGG:ns NR:ns ## COG: ECs1042 COG5404 # Protein_GI_number: 15830296 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: SOS-response cell division inhibitor, blocks FtsZ ring formation # Organism: Escherichia coli O157:H7 # 1 169 1 169 169 269 100.0 2e-72 MYTSGYAHRSSSFSSAASKIARVSTENTTAGLISEVVYREDQPMMTQLLLLPLLQQLGQQ SRWQLWLTPQQKLSREWVQASGLPLTKVMQISQLSPCHTVESMVRALRTGNYSVVIGWLA DDLTEEEHAELVDAANEGNAMGFIMRPVSASSHATRQLSGLKIHSNLYH >gi|296494721|gb|ADTN01000017.1| GENE 3 2171 - 2800 544 209 aa, chain + ## HITS:1 COG:yccR KEGG:ns NR:ns ## COG: yccR COG3070 # Protein_GI_number: 16128926 # Func_class: K Transcription # Function: Regulator of competence-specific genes # Organism: Escherichia coli K12 # 1 209 1 209 209 407 100.0 1e-114 MKSLSYKRIYKSQEYLATLGTIEYRSLFGSYSLTVDDTVFAMVSDGELYLRACEQSAQYC VKHPPVWLTYKKCGRSVTLNYYRVDESLWRNQLKLVRLSKYSLDAALKEKSTRNTRERLK DLPNMSFHLEAILGEVGIKDVRALRILGAKMCWLRLRQQNSLVTEKILFMLEGAIIGIHE AALPVARRQELAEWADSLTPKQEFPAELE >gi|296494721|gb|ADTN01000017.1| GENE 4 2763 - 4925 1865 720 aa, chain - 
## HITS:1 COG:yccS KEGG:ns NR:ns ## COG: yccS COG1289 # Protein_GI_number: 16128927 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 720 1 720 720 1428 100.0 0 MAFMLSPLLKRYTWNSAWLYYARIFIALCGTTAFPWWLGDVKLTIPLTLGMVAAALTDLD DRLAGRLRNLIITLFCFFIASASVELLFPWPWLFAIGLTLSTSGFILLGGLGQRYATIAF GALLIAIYTMLGTSLYEHWYQQPMYLLAGAVWYNVLTLIGHLLFPVRPLQDNLARCYEQL ARYLELKSRMFDPDIEDQSQAPLYDLALANGLLMATLNQTKLSLLTRLRGDRGQRGTRRT LHYYFVAQDIHERASSSHIQYQTLREHFRHSDVLFRFQRLMSMQGQACQQLSRCILLRQP YQHDPHFERAFTHIDAALERMRDNGAPADLLKTLGFLLNNLRAIDAQLATIESEQAQALP HNNDENELADDSPHGLSDIWLRLSRHFTPESALFRHAVRMSLVLCFGYAIIQITGMHHGY WILLTSLFVCQPNYNATRHRLKLRIIGTLVGIAIGIPVLWFVPSLEGQLVLLVITGVLFF AFRNVQYAHATMFITLLVLLCFNLLGEGFEVALPRVIDTLIGCAIAWAAVSYIWPDWQFR NLPRMLERATEANCRYLDAILEQYHQGRDNRLAYRIARRDAHNRDAELASVVSNMSSEPN VTPQIREAAFRLLCLNHTFTSYISALGAHREQLTNPEILAFLDDAVCYVDDALHHQPADE ERVNEALASLKQRMQQLEPRADSKEPLVVQQVGLLIALLPEIGRLQRQITQVPQETPVSA >gi|296494721|gb|ADTN01000017.1| GENE 5 4935 - 5381 527 148 aa, chain - ## HITS:1 COG:yccF KEGG:ns NR:ns ## COG: yccF COG3304 # Protein_GI_number: 16128928 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 134 1 134 148 239 100.0 2e-63 MRTVLNILNFVLGGFATTLGWLLATLVSIVLIFTLPLTRSCWEITKLSLVPYGNEAIHVD ELNPAGKNVLLNTGGTVLNIFWLIFFGWWLCLMHIATGIAQCISIIGIPVGIANFKIAAI ALWPVGRRVVSVETAQAAREANARRRFE >gi|296494721|gb|ADTN01000017.1| GENE 6 5486 - 7558 1797 690 aa, chain + ## HITS:1 COG:helD KEGG:ns NR:ns ## COG: helD COG0210 # Protein_GI_number: 16128929 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Escherichia coli K12 # 7 690 1 684 684 1340 99.0 0 MWIYRVMELKATTLGKRLAQHPYDRAVILNAGIKVSGDRHEYLIPFNQLLAIHCKRGLVW GELEFVLPDEKVVRLHGTEWGETQRFYHHLDAHWRRWSGEMSEIASGVLRQQLDLIATRT GENKWLTREQTSGVQQQIRQALSALPLPVNRLEEFDNCREAWRKCQAWLKDIESARLQHN QAYTEAMLTEYADFFRQVESSPLNPAQARAVVNGEHSLLVLAGAGSGKTSVLVARAGWLL 
ARGEASPEQILLLAFGRKAAEEMDERIRERLHTEDITARTFHALALHIIQQGSKKVPIVS KLENDTAARHELFIAEWRKQCSEKKAQAKGWRQWLTEEMQWSVPEGNFWDDEKLQRRLAS RLDRWVSLMRMHGGAQAEMIASAPEEIRDLFSKRIKLMAPLLKAWKGALKAENAVDFSGL IHQAIVILEKGRFISPWKHILVDEFQDISPQRAALLAALRKQNSQTTLFAVGDDWQAIYR FSGAQMSLTTAFHENFGEGERCDLDTTYRFNSRIGEVANRFIQQNPGQLKKPLNSLINGD KKAVTLLDESQLDALLDKLSGYAKPEERILILARYHHMRPASLEKAATRWPKLQIDFMTI HASKGQQADYVIIVGLQEGSDGFPAAARESIMEEALLPPVEDFPDAEERRLMYVALTRAR HRVWALFNKENPSPFVEILKNLDVPVARKP >gi|296494721|gb|ADTN01000017.1| GENE 7 7590 - 8048 408 152 aa, chain - ## HITS:1 COG:ECs1047 KEGG:ns NR:ns ## COG: ECs1047 COG1803 # Protein_GI_number: 15830301 # Func_class: G Carbohydrate transport and metabolism # Function: Methylglyoxal synthase # Organism: Escherichia coli O157:H7 # 1 152 4 155 155 307 100.0 6e-84 MELTTRTLPARKHIALVAHDHCKQMLMSWVERHQPLLEQHVLYATGTTGNLISRATGMNV NAMLSGPMGGDQQVGALISEGKIDVLIFFWDPLNAVPHDPDVKALLRLATVWNIPVATNV ATADFIIQSPHFNDAVDILIPDYQRYLADRLK >gi|296494721|gb|ADTN01000017.1| GENE 8 8144 - 8806 609 220 aa, chain - ## HITS:1 COG:ECs1048 KEGG:ns NR:ns ## COG: ECs1048 COG3110 # Protein_GI_number: 15830302 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 220 1 220 220 409 100.0 1e-114 MKTGIVTTLIALCLPVSVFATTLRLSTDVDLLVLDGKKVSSSLLRGADSIELDNGPHQLV FRVEKTIHLSNSEERLYISPPLVVSFNTQLINQVNFRLPRLENEREANHFDAAPRLELLD GDATPIPVKLDILAITSTAKTIDYEVEVERYNKSAKRASLPQFATMMADDSTLLSGVSEL DAIPPQSQVLTEQRLKYWFKLADPQTRNTFLQWAEKQPSS >gi|296494721|gb|ADTN01000017.1| GENE 9 8846 - 9004 100 52 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_2154 NR:ns ## KEGG: EcSMS35_2154 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 52 5 56 56 79 100.0 3e-14 MPAISVSFIISSRLFNKVYDKADMPTMQLIRMNESKSGKYVYMLVSMIKEKF >gi|296494721|gb|ADTN01000017.1| GENE 10 8979 - 9392 599 137 aa, chain + ## HITS:1 COG:yccU KEGG:ns NR:ns ## COG: yccU COG1832 # Protein_GI_number: 16128932 # Func_class: R General function prediction 
only # Function: Predicted CoA-binding protein # Organism: Escherichia coli K12 # 1 137 28 164 164 269 100.0 1e-72 MKETDIAGILTSTHTIALVGASDKPDRPSYRVMKYLLDQGYHVIPVSPKVAGKTLLGQKG YGTLADVPEKVDMVDVFRNSEAAWGVAQEAIAIGAKTLWMQLGVINEQAAVLARDAGLNV VMDRCPAIEIPRLGLAK >gi|296494721|gb|ADTN01000017.1| GENE 11 9437 - 9754 406 105 aa, chain - ## HITS:1 COG:ECs1050 KEGG:ns NR:ns ## COG: ECs1050 COG3785 # Protein_GI_number: 15830304 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 105 18 122 122 200 100.0 6e-52 MIASKFGIGQQVRHSLLGYLGVVVDIDPVYSLSEPSPDELAVNDELRAAPWYHVVMEDDN GLPVHTYLAEAQLSSELQDEHPEQPSMDELAQTIRKQLQAPRLRN >gi|296494721|gb|ADTN01000017.1| GENE 12 9812 - 11002 580 396 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative [Thermococcus barophilus MP] # 22 389 21 386 396 228 32 2e-59 MSVRLVLAKGREKSLLRRHPWVFSGAVARMEGKASLGETIDIVDHQGKWLARGAYSPASQ IRARVWTFDPSESIDIAFFSRRLQQAQKWRDWLAQKDGLDSYRLIAGESDGLPGITIDRF GNFLVLQLLSAGAEYQRAALISALQTLYPECSIYDRSDVAVRKKEGMELTQGPVTGELPP ALLPIEEHGMKLLVDIQHGHKTGYYLDQRDSRLATRRYVENKRVLNCFSYTGGFAVSALM GGCSQVVSVDTSQEALDIARQNVELNKLDLSKAEFVRDDVFKLLRTYRDRGEKFDVIVMD PPKFVENKSQLMGACRGYKDINMLAIQLLNEGGILLTFSCSGLMTSDLFQKIIADAAIDA GRDVQFIEQFRQAADHPVIATYPEGLYLKGFACRVM >gi|296494721|gb|ADTN01000017.1| GENE 13 11097 - 11375 359 92 aa, chain + ## HITS:1 COG:ECs1052 KEGG:ns NR:ns ## COG: ECs1052 COG1254 # Protein_GI_number: 15830306 # Func_class: C Energy production and conversion # Function: Acylphosphatases # Organism: Escherichia coli O157:H7 # 1 92 1 92 92 184 100.0 4e-47 MSKVCIIAWVYGRVQGVGFRYTTQYEAKRLGLTGYAKNLDDGSVEVVACGEEGQVEKLMQ WLKSGGPRSARVERVLSEPHHPSGELTDFRIR >gi|296494721|gb|ADTN01000017.1| GENE 14 11372 - 11701 580 109 aa, chain - ## HITS:1 COG:ECs1053 KEGG:ns NR:ns ## COG: ECs1053 COG2920 # Protein_GI_number: 15830307 # Func_class: P Inorganic ion transport and metabolism # Function: Dissimilatory sulfite reductase 
(desulfoviridin), gamma subunit # Organism: Escherichia coli O157:H7 # 1 109 20 128 128 219 100.0 1e-57 MLIFEGKEIETDTEGYLKESSQWSEPLAVVIAENEGISLSPEHWEVVRFVRDFYLEFNTS PAIRMLVKAMANKFGEEKGNSRYLYRLFPKGPAKQATKIAGLPKPVKCI >gi|296494721|gb|ADTN01000017.1| GENE 15 11792 - 12451 872 219 aa, chain - ## HITS:1 COG:yccA KEGG:ns NR:ns ## COG: yccA COG0670 # Protein_GI_number: 16128937 # Func_class: R General function prediction only # Function: Integral membrane protein, interacts with FtsH # Organism: Escherichia coli K12 # 1 219 1 219 219 330 100.0 8e-91 MDRIVSSSHDRTSLLSTHKVLRNTYFLLSLTLAFSAITATASTVLMLPSPGLILTLVGMY GLMFLTYKTANKPTGIISAFAFTGFLGYILGPILNTYLSAGMGDVIAMALGGTALVFFCC SAYVLTTRKDMSFLGGMLMAGIVVVLIGMVANIFLQLPALHLAISAVFILISSGAILFET SNIIHGGETNYIRATVSLYVSLYNIFVSLLSILGFASRD Prediction of potential genes in microbial genomes Time: Sun May 15 23:04:36 2011 Seq name: gi|296494720|gb|ADTN01000018.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont64.3, whole genome shotgun sequence Length of sequence - 18365 bp Number of predicted genes - 17, with homology - 15 Number of transcription units - 4, operones - 3 average op.length - 5.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 16 - 103 76.9 # Ser TGA 0 0 + Prom 310 - 369 2.7 1 1 Op 1 11/0.000 + CDS 530 - 1648 897 ## COG1740 Ni,Fe-hydrogenase I small subunit 2 1 Op 2 8/0.000 + CDS 1645 - 3438 1880 ## COG0374 Ni,Fe-hydrogenase I large subunit 3 1 Op 3 7/0.000 + CDS 3457 - 4164 726 ## COG1969 Ni,Fe-hydrogenase I cytochrome b subunit 4 1 Op 4 . + CDS 4161 - 4748 812 ## COG0680 Ni,Fe-hydrogenase maturation factor 5 1 Op 5 . + CDS 4745 - 5143 532 ## LF82_1053 hydrogenase-1 operon protein HyaE 6 1 Op 6 . + CDS 5140 - 5997 625 ## B21_00987 hypothetical protein + Prom 6039 - 6098 2.2 7 2 Op 1 31/0.000 + CDS 6131 - 7675 1726 ## COG1271 Cytochrome bd-type quinol oxidase, subunit 1 8 2 Op 2 . + CDS 7687 - 8823 1173 ## COG1294 Cytochrome bd-type quinol oxidase, subunit 2 9 2 Op 3 . + CDS 8836 - 8928 123 ## 10 2 Op 4 . 
+ CDS 9008 - 10306 851 ## EcolC_2616 phosphoanhydride phosphorylase - Term 10306 - 10341 6.0 11 3 Op 1 3/0.000 - CDS 10421 - 12601 2213 ## COG3206 Uncharacterized protein involved in exopolysaccharide biosynthesis 12 3 Op 2 6/0.000 - CDS 12621 - 13067 482 ## COG0394 Protein-tyrosine-phosphatase 13 3 Op 3 . - CDS 13055 - 14194 847 ## COG1596 Periplasmic protein involved in polysaccharide export 14 3 Op 4 . - CDS 14240 - 16336 1902 ## JW0967 conserved hypothetical protein 15 3 Op 5 . - CDS 16336 - 17082 471 ## JW0968 conserved hypothetical protein 16 3 Op 6 . - CDS 17079 - 17657 356 ## B21_00996 hypothetical protein - Prom 17700 - 17759 7.8 - Term 17756 - 17792 4.1 17 4 Tu 1 . - CDS 17830 - 18135 199 ## - Prom 18191 - 18250 3.5 Predicted protein(s) >gi|296494720|gb|ADTN01000018.1| GENE 1 530 - 1648 897 372 aa, chain + ## HITS:1 COG:hyaA KEGG:ns NR:ns ## COG: hyaA COG1740 # Protein_GI_number: 16128938 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase I small subunit # Organism: Escherichia coli K12 # 1 372 1 372 372 738 100.0 0 MNNEETFYQAMRRQGVTRRSFLKYCSLAATSLGLGAGMAPKIAWALENKPRIPVVWIHGL ECTCCTESFIRSAHPLAKDVILSLISLDYDDTLMAAAGTQAEEVFEDIITQYNGKYILAV EGNPPLGEQGMFCISSGRPFIEKLKRAAAGASAIIAWGTCASWGCVQAARPNPTQATPID KVITDKPIIKVPGCPPIPDVMSAIITYMVTFDRLPDVDRMGRPLMFYGQRIHDKCYRRAH FDAGEFVQSWDDDAARKGYCLYKMGCKGPTTYNACSSTRWNDGVSFPIQSGHGCLGCAEN GFWDRGSFYSRVVDIPQMGTHSTADTVGLTALGVVAAAVGVHAVASAVDQRRRHNQQPTE TEHQPGNEDKQA >gi|296494720|gb|ADTN01000018.1| GENE 2 1645 - 3438 1880 597 aa, chain + ## HITS:1 COG:hyaB KEGG:ns NR:ns ## COG: hyaB COG0374 # Protein_GI_number: 16128939 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase I large subunit # Organism: Escherichia coli K12 # 1 597 1 597 597 1276 100.0 0 MSTQYETQGYTINNAGRRLVVDPITRIEGHMRCEVNINDQNVITNAVSCGTMFRGLEIIL QGRDPRDAWAFVERICGVCTGVHALASVYAIEDAIGIKVPDNANIIRNIMLATLWCHDHL VHFYQLAGMDWIDVLDALKADPRKTSELAQSLSSWPKSSPGYFFDVQNRLKKFVEGGQLG 
IFRNGYWGHPQYKLPPEANLMGFAHYLEALDFQREIVKIHAVFGGKNPHPNWIVGGMPCA INIDESGAVGAVNMERLNLVQSIITRTADFINNVMIPDALAIGQFNKPWSEIGTGLSDKC VLSYGAFPDIANDFGEKSLLMPGGAVINGDFNNVLPVDLVDPQQVQEFVDHAWYRYPNDQ VGRHPFDGITDPWYNPGDVKGSDTNIQQLNEQERYSWIKAPRWRGNAMEVGPLARTLIAY HKGDAATVESVDRMMSALNLPLSGIQSTLGRILCRAHEAQWAAGKLQYFFDKLMTNLKNG NLATASTEKWEPATWPTECRGVGFTEAPRGALGHWAAIRDGKIDLYQCVVPTTWNASPRD PKGQIGAYEAALMNTKMAIPEQPLEILRTLHSFDPCLACSTHVLGDDGSELISVQVR >gi|296494720|gb|ADTN01000018.1| GENE 3 3457 - 4164 726 235 aa, chain + ## HITS:1 COG:ECs1130 KEGG:ns NR:ns ## COG: ECs1130 COG1969 # Protein_GI_number: 15830384 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase I cytochrome b subunit # Organism: Escherichia coli O157:H7 # 1 235 1 235 235 433 100.0 1e-121 MQQKSDNVVSHYVFEAPVRIWHWLTVLCMAVLMVTGYFIGKPLPSVSGEATYLFYMGYIR LIHFSAGMVFTVVLLMRIYWAFVGNRYSRELFIVPVWRKSWWQGVWYEIRWYLFLAKRPS ADIGHNPIAQAAMFGYFLMSVFMIITGFALYSEHSQYAIFAPFRYVVEFFYWTGGNSMDI HSWHRLGMWLIGAFVIGHVYMALREDIMSDDTVISTMVNGYRSHKFGKISNKERS >gi|296494720|gb|ADTN01000018.1| GENE 4 4161 - 4748 812 195 aa, chain + ## HITS:1 COG:hyaD KEGG:ns NR:ns ## COG: hyaD COG0680 # Protein_GI_number: 16128941 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase maturation factor # Organism: Escherichia coli K12 # 1 195 1 195 195 360 100.0 1e-99 MSEQRVVVMGLGNLLWADEGFGVRVAERLYAHYHWPEYVEIVDGGTQGLNLLGYVESASH LLILDAIDYGLEPGTLRTYAGERIPAYLSAKKMSLHQNSFSEVLALADIRGHLPAHIALV GLQPAMLDDYGGSLSELAREQLPAAEQAALAQLAAWGIVPQPANESRCLNYDCLSMENYE GVRLRQYRMTQEEQG >gi|296494720|gb|ADTN01000018.1| GENE 5 4745 - 5143 532 132 aa, chain + ## HITS:1 COG:no KEGG:LF82_1053 NR:ns ## KEGG: LF82_1053 # Name: hyaE # Def: hydrogenase-1 operon protein HyaE # Organism: E.coli_LF82 # Pathway: not_defined # 1 132 1 132 132 260 99.0 1e-68 MSNDTPFDALWQRMLARGWTPVSESRLDDWLTQAPDGVVLLSSDPKRTPEVSDNPVMIGE LLREFPDYTWQVAIADLEQSEAIGDRFGVFRFPATLVFTGGNYRGVLNGIHPWAELINLM RGLVEPQQERAS >gi|296494720|gb|ADTN01000018.1| GENE 6 5140 - 5997 625 285 aa, chain + ## HITS:1 
COG:no KEGG:B21_00987 NR:ns ## KEGG: B21_00987 # Name: hyaF # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 285 1 285 285 556 100.0 1e-157 MSETFFHLLGPGTQPNDDSFSMNPLPITCQVNDEPSMAALEQCAHSPQVIALLNELQHQL SERQPPLGEVLAVDLLNLNADDRHFINTLLGEGEVSVRIQQADDSESEIQEAIFCGLWRV RRRRGEKLLEDKLEAGCAPLALWQAATQNLLPTDSLLPPPIDGLMNGLPLAHELLAHVRN PDAQPHSINLTQLPISEADRLFLSRLCGPGNIQIRTIGYGESYINATGLRHVWHLRCTDT LKGPLLESYEICPIPEVVLAAPEDLVDSAQRLSEVCQWLAEAAPT >gi|296494720|gb|ADTN01000018.1| GENE 7 6131 - 7675 1726 514 aa, chain + ## HITS:1 COG:appC KEGG:ns NR:ns ## COG: appC COG1271 # Protein_GI_number: 16128944 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 1 # Organism: Escherichia coli K12 # 1 514 1 514 514 964 99.0 0 MWDVIDLSRWQFALTALYHFLFVPLTLGLIFLLAIMETIYVVTGKTIYRDMTRFWGKLFG INFALGVATGLTMEFQFGTNWSFYSNYVGDIFGAPLAMEALMAFFLESTFVGLFFFGWQR LNKYQHLLVTWLVAFGSNLSALWILNANGWMQYPTGAHFDIDTLRMEMTSFSELVFNPVS QVKFVHTVMAGYVTGAMFIMAISAWYLLRGRERDVALRSFAIGSVFGTLAIIGTLQLGDS SAYEVAQVQPVKLAAMEGEWQTEPAPAPFHVVAWPEQDQERNAFALKIPALLGILATHSL DKPVPGLKNLMAETYPRLQRGRMAWLLMQEISQGNREPHVLQAFRGLEGDLGYGMLLSRY APDMNHVTAAQYQAAMRGAIPQVAPVFWSFRIMVGCGSLLLLVMLIALVQTLRGKIDQHR WVLKMALWSLPLPWIAIEAGWFMTEFGRQPWAIQDILPTYSAHSALTTGQLAFSLIMIVG LYTLFLIAEVYLMQKYARLGPSAMQSEQPTQQQG >gi|296494720|gb|ADTN01000018.1| GENE 8 7687 - 8823 1173 378 aa, chain + ## HITS:1 COG:appB KEGG:ns NR:ns ## COG: appB COG1294 # Protein_GI_number: 16128945 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 2 # Organism: Escherichia coli K12 # 1 378 1 378 378 666 100.0 0 MFDYETLRFIWWLLIGVILVVFMISDGFDMGIGCLLPLVARNDDERRIVINSVGAHWEGN QVWLILAGGALFAAWPRVYAAAFSGFYVAMILVLCSLFFRPLAFDYRGKIADARWRKMWD AGLVIGSLVPPVVFGIAFGNLLLGVPFAFTPQLRVEYLGSFWQLLTPFPLLCGLLSLGMV ILQGGVWLQLKTVGVIHLRSQLATKRAALLVMLCFLLAGYWLWVGIDGFVLLAQDANGPS NPLMKLVAVLPGAWMNNFVESPVLWIFPLLGFFCPLLTVMAIYRGRPGWGFLMASLMQFG VIFTAGITLFPFVMPSSVSPISSLTLWDSTSSQLTLSIMLVIVLIFLPIVLLYTLWSYYK 
MWGRMTTETLRRNENELY >gi|296494720|gb|ADTN01000018.1| GENE 9 8836 - 8928 123 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MWYLLWFVGILLMCSLSTLVLVWLDPRLKS >gi|296494720|gb|ADTN01000018.1| GENE 10 9008 - 10306 851 432 aa, chain + ## HITS:1 COG:no KEGG:EcolC_2616 NR:ns ## KEGG: EcolC_2616 # Name: not_defined # Def: phosphoanhydride phosphorylase # Organism: E.coli_ATCC8739 # Pathway: gamma-Hexachlorocyclohexane degradation [PATH:ecl00361]; Inositol phosphate metabolism [PATH:ecl00562]; Riboflavin metabolism [PATH:ecl00740] # 1 432 11 442 442 844 99.0 0 MKAILIPFLSLLIPLTPQSAFAQSEPELKLESVVIVSRHGVRAPTKATQLMQDVTPDAWP TWPVKLGWLTPRGGELIAYLGHYQRQRLVADGLLAKKGCPQSGQVAIIADVDERTRKTGE AFAAGLAPDCAITVHTQADTSSPDPLFNPLKTGVCQLDNANVTDAILSRAGGSIADFTGH RQTAFRELERVLNFPQSNLCLKREKQDESCSLTQALPSELKVSADNVSLTGAVSLASMLT EIFLLQQAQGMPEPGWGRITDSHQWNTLLSLHNAQFYLLQRTPEVARSRATPLLDLIKTA LTPHPPQKQAYGVTLPTSVLFIAGHDTNLANLGGALELNWTLPGQPDNTPPGGELVFERW RRLSDNSQWIQVSLVFQTLQQMRDKTPLSLNTPPGEVKLTLAGCEERNAQGMCSLAGFTQ IVNEARIPACSL >gi|296494720|gb|ADTN01000018.1| GENE 11 10421 - 12601 2213 726 aa, chain - ## HITS:1 COG:yccC_1 KEGG:ns NR:ns ## COG: yccC_1 COG3206 # Protein_GI_number: 16128947 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein involved in exopolysaccharide biosynthesis # Organism: Escherichia coli K12 # 1 480 1 480 480 892 100.0 0 MTTKNMNTPPGSTQENEIDLLRLVGELWDHRKFIISVTALFTLIAVAYSLLSTPIYQADT LVQVEQKQGNAILSGLSDMIPNSSPESAPEIQLLQSRMILGKTIAELNLRDIVEQKYFPI VGRGWARLTKEKPGELAISWMHIPQLNGQDQQLTLTVGENGHYTLEGEEFTVNGMVGQRL EKDGVALTIADIKAKPGTQFVLSQRTELEAINALQETFTVSERSKESGMLELTMTGDDPQ LITRILNSIANNYLQQNIARQAAQDSQSLEFLQRQLPEVRSELDQAEEKLNVYRQQRDSV DLNLEAKAVLEQIVNVDNQLNELTFREAEISQLYKKDHPTYRALLEKRQTLEQERKRLNK RVSAMPSTQQEVLRLSRDVEAGRAVYLQLLNRQQELSISKSSAIGNVRIIDPAVTQPQPV KPKKALNVVLGFILGLFISVGAVLARAMLRRGVEAPEQLEEHGISVYATIPMSEWLDKRT RLRKKNLFSNQQRHRTKNIPFLAVDNPADSAVEAVRALRTSLHFAMMETENNILMITGAT PDSGKTFVSSTLAAVIAQSDQKVLFIDADLRRGYSHNLFTVSNEHGLSEYLAGKDELNKV 
IQHFGKGGFDVITRGQVPPNPSELLMRDRMRQLLEWANDHYDLVIVDTPPMLAVSDAAVV GRSVGTSLLVARFGLNTAKEVSLSMQRLEQAGVNIKGAILNGVIKRASTAYSYGYNYYGY SYSEKE >gi|296494720|gb|ADTN01000018.1| GENE 12 12621 - 13067 482 148 aa, chain - ## HITS:1 COG:ECs1138 KEGG:ns NR:ns ## COG: ECs1138 COG0394 # Protein_GI_number: 15830392 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Escherichia coli O157:H7 # 1 148 5 152 152 276 100.0 1e-74 MAQLKFNSILVVCTGNICRSPIGERLLRKRLPGVKVKSAGVHGLVKHPADATAADVAANH GVSLEGHAGRKLTAEMARNYDLILAMESEHIAQVTAIAPEVRGKTMLFGQWLEQKEIPDP YRKSQDAFEHVYGMLERASQEWAKRLSR >gi|296494720|gb|ADTN01000018.1| GENE 13 13055 - 14194 847 379 aa, chain - ## HITS:1 COG:ECs1139 KEGG:ns NR:ns ## COG: ECs1139 COG1596 # Protein_GI_number: 15830393 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Escherichia coli O157:H7 # 1 379 1 379 379 744 100.0 0 MKKNIFKFSVLTLAVLSLTACTLVPGQNLSTSNKDVIELPDNQYDLDKMVNIYPVTPGLI DQLRAKPIMSQANPELEQQIANYEYRIGIGDVLMVTVWDHPELTTPAGQYRSASDTGNWV NADGAIFYPYIGRLKVAGKTLTQVRNEITARLDSVIESPQVDVSVAAFRSQKAYVTGEVS KSGQQPITNIPLTIMDAINAAGGLTADADWRNVVLTQNGVKTKVNLYALMQRGDLRQNKL LHPGDILFIPRNDDLKVFVMGEVGKQSTLKMDRSGMTLAEALGNAEGMNQDVADATGIFV IRATQNKQNGKIANIYQLNAKDASAMILGTEFQLEPYDIVYVTTAPLARWNRVISLLVPT ISGVHDLTETSRWIQTWPN >gi|296494720|gb|ADTN01000018.1| GENE 14 14240 - 16336 1902 698 aa, chain - ## HITS:1 COG:no KEGG:JW0967 NR:ns ## KEGG: JW0967 # Name: ymcA # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 698 1 698 698 1422 100.0 0 MKKNSYLLSCLAIAVSSACHAEVLTYPDPLGSSQSDFGGTGLLQMPNARIAPEGEFSVNY RDNDQYRFYSTSVALFPWLEGTIRYTDVRTRKYSQWEDFSGDQSYKDKSFDFKLRLWEEG YWLPQVAFGKRDIAGTGLFDGEYLVASKQAGPFDFTLGMAWGYAGNAGNITNPFCRVSDK YCHRAESHDAGDISFSDIFRGPASIFGGIEYQTPWNPLRLKLEYDGNNYQNDFAGKLPQA SHFNVGAVYRAASWADLNLSYERGNTLMFGFTLRTNFNDLRPALRDTPKPAYQPAPESEG LQYTTVANQLTALKYNAGFDAPEIQLRDKTLYMSGQQYKYRDSREAVDRANRILVNNLPQ 
GVEKISVTQKREHMAMVTTETDVASLRKQLAGTAPGQSEPLQQQRVEAEDLSAFGRGYRI REDRFSYSFNPTLSQSLGGPEDFYMFQLGLMSSARYWFTDHLLLDGGIFTNIYNNYDKFK SSLLPADSTLPRVRTHIRDYVRNDVYLNNLQANYFADLGNGFYGQVYGGYLETMYAGVGS ELLYRPLDACWALGVDVNYVKQRDWDNMMRFTDYSTPTGFVTAYWNPPTLNGVLMKLSVG QYLAKDKGATIDVAKRFDSGVAVGVWAAISNVSKDDYGEGGFSKGFYISIPFDLMTIGPN RNRAVVSWTPLTRDGGQMLSRKYQLYPMTAEREVPVGQ >gi|296494720|gb|ADTN01000018.1| GENE 15 16336 - 17082 471 248 aa, chain - ## HITS:1 COG:no KEGG:JW0968 NR:ns ## KEGG: JW0968 # Name: ymcB # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 248 1 248 248 472 100.0 1e-132 MNKLQSYFIASVLYVMTPHAFAQGTVTIYLPGEQQTLSVGPVENVVQLVTQPQLRDRLWW PGALLTDSAAKAKALKDYQHVMAQLASWEAEADDDVAATIKSVRQQLLNLNITGRLPVKL DPDFVRVDENSNPPLVGDYTLYTVQRPVTITLLGAVSGAGQLPWQAGRSVTDYLQDHPRL AGADKNNVMVITPEGETVVAPVALWNKRHVEPPPGSQLWLGFSAHVLPEKYADLNDQIVS VLTQRVPD >gi|296494720|gb|ADTN01000018.1| GENE 16 17079 - 17657 356 192 aa, chain - ## HITS:1 COG:no KEGG:B21_00996 NR:ns ## KEGG: B21_00996 # Name: ymcC # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 192 23 214 214 375 100.0 1e-103 MVDTFRASLFDNQDITVADQQIQALPYSTMYLRLNEGQRIFVVLGYIEQEQSKWLSQDNA MLVTHNGRLLKTVKLNNNLLEVTNSGQDPLRNALAIKDGSRWTRDILWSEDNHFRSATLS STFSFAGLETLNIAGRNVLCNVWQEEVTSTRPEKQWQNTFWVDSATGQVRQSRQMLGAGV IPVEMTFLKPAP >gi|296494720|gb|ADTN01000018.1| GENE 17 17830 - 18135 199 101 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKHKLSAILMAFMLTTPAAFAAPEATNGTEATTGTTGTTTTTTGATTTATTTGGVAAGAV GTATVVGVATAVGVATLAVVAANDSGDGGSHNTSTTTSTTR Prediction of potential genes in microbial genomes Time: Sun May 15 23:05:15 2011 Seq name: gi|296494719|gb|ADTN01000019.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont65.1, whole genome shotgun sequence Length of sequence - 886 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 260 - 460 63 ## ECO103_3574 hypothetical protein - Prom 494 - 553 4.0 2 2 Tu 1 . + CDS 660 - 885 198 ## UTI89_C2250 hypothetical protein Predicted protein(s) >gi|296494719|gb|ADTN01000019.1| GENE 1 260 - 460 63 66 aa, chain - ## HITS:1 COG:no KEGG:ECO103_3574 NR:ns ## KEGG: ECO103_3574 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 63 1 63 64 86 85.0 4e-16 MVIRKKKCRDCGNAITHNTVCCPYCGSVDPFGYYRNTDRIVTILLALIIVVLLTTVSVSV YILCSW >gi|296494719|gb|ADTN01000019.1| GENE 2 660 - 885 198 75 aa, chain + ## HITS:1 COG:no KEGG:UTI89_C2250 NR:ns ## KEGG: UTI89_C2250 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 75 13 87 201 151 100.0 7e-36 MTYKYNPFWQQRIRETVRHALNVHPRLTALRVDLRFPDVPAATDAAVISRFINALKARID AYQKRKHREGKRVHP Prediction of potential genes in microbial genomes Time: Sun May 15 23:05:23 2011 Seq name: gi|296494718|gb|ADTN01000020.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont67.1, whole genome shotgun sequence Length of sequence - 15011 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 5, operones - 2 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 7 - 66 3.6 1 1 Tu 1 . + CDS 172 - 549 234 ## EcHS_A2224 hypothetical protein - Term 704 - 743 5.1 2 2 Tu 1 . - CDS 880 - 2241 1473 ## COG0826 Collagenase and related proteases 3 3 Op 1 . - CDS 2344 - 2640 219 ## EC55989_2339 conserved hypothetical protein, putative plasmid stabilisation system protein 4 3 Op 2 . 
- CDS 2642 - 2893 289 ## COG3609 Predicted transcriptional regulators containing the CopG/Arc/MetJ DNA-binding domain - Prom 3077 - 3136 3.1 - Term 3101 - 3139 3.3 5 4 Op 1 2/0.000 - CDS 3147 - 3479 384 ## COG3422 Uncharacterized conserved protein 6 4 Op 2 40/0.000 - CDS 3670 - 4392 859 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 7 4 Op 3 10/0.000 - CDS 4389 - 5792 1290 ## COG0642 Signal transduction histidine kinase 8 4 Op 4 5/0.000 - CDS 5789 - 7204 1419 ## COG0477 Permeases of the major facilitator superfamily 9 4 Op 5 10/0.000 - CDS 7205 - 10282 3064 ## COG0841 Cation/multidrug efflux pump 10 4 Op 6 27/0.000 - CDS 10283 - 13405 3423 ## COG0841 Cation/multidrug efflux pump 11 4 Op 7 . - CDS 13405 - 14652 1437 ## COG0845 Membrane-fusion protein - Prom 14724 - 14783 3.5 12 5 Tu 1 . - CDS 14822 - 14950 97 ## ECSP_2830 hypothetical protein Predicted protein(s) >gi|296494718|gb|ADTN01000020.1| GENE 1 172 - 549 234 125 aa, chain + ## HITS:1 COG:no KEGG:EcHS_A2224 NR:ns ## KEGG: EcHS_A2224 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_HS # Pathway: not_defined # 1 125 1 125 125 249 98.0 3e-65 MPLLYLNTRECRWYLMGEGEMKKIAAISLISIFLISGCAVHNDETSIGKFGLAYKSNIQR KLDNQYYTEAEASLARGRISGAENIVKNDATHFCVTQGKKMQIVELKTEGVGLHGVARLT FKCGE >gi|296494718|gb|ADTN01000020.1| GENE 2 880 - 2241 1473 453 aa, chain - ## HITS:1 COG:yegQ KEGG:ns NR:ns ## COG: yegQ COG0826 # Protein_GI_number: 16130021 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Escherichia coli K12 # 1 453 1 453 453 929 100.0 0 MFKPELLSPAGTLKNMRYAFAYGADAVYAGQPRYSLRVRNNEFNHENLQLGINEAHALGK KFYVVVNIAPHNAKLKTFIRDLKPVVEMGPDALIMSDPGLIMLVREHFPEMPIHLSVQAN AVNWATVKFWQQMGLTRVILSRELSLEEIEEIRNQVPDMEIEIFVHGALCMAYSGRCLLS GYINKRDPNQGTCTNACRWEYNVQEGKEDDVGNIVHKYEPIPVQNVEPTLGIGAPTDKVF MIEEAQRPGEYMTAFEDEHGTYIMNSKDLRAIAHVERLTKMGVHSLKIEGRTKSFYYCAR 
TAQVYRKAIDDAAAGKPFDTSLLETLEGLAHRGYTEGFLRRHTHDDYQNYEYGYSVSDRQ QFVGEFTGERKGDLAAVAVKNKFSVGDSLELMTPQGNINFTLEHMENAKGEAMPIAPGDG YTVWLPVPQDLELNYALLMRNFSGETTRNPHGK >gi|296494718|gb|ADTN01000020.1| GENE 3 2344 - 2640 219 98 aa, chain - ## HITS:1 COG:no KEGG:EC55989_2339 NR:ns ## KEGG: EC55989_2339 # Name: not_defined # Def: conserved hypothetical protein, putative plasmid stabilisation system protein # Organism: E.coli_55989 # Pathway: not_defined # 1 98 1 98 98 195 100.0 4e-49 MRIIKLMPKANEDLEGIWYYSYHHFGEPQADRYVEHLSDVLQILSNNNIGTPRPELGEGI FVLPFERHVIYFLQSPGEIIVIRILNQNQDATRHLHWS >gi|296494718|gb|ADTN01000020.1| GENE 4 2642 - 2893 289 83 aa, chain - ## HITS:1 COG:STM2955 KEGG:ns NR:ns ## COG: STM2955 COG3609 # Protein_GI_number: 16766261 # Func_class: K Transcription # Function: Predicted transcriptional regulators containing the CopG/Arc/MetJ DNA-binding domain # Organism: Salmonella typhimurium LT2 # 1 83 24 106 118 128 90.0 2e-30 MARTMTVDLGDELREFIESLIESGDYRTQSEVIRESLRLLREKQAESRLQALRDMLAEGL SSGEAQPWEKDAFLRKVKAGIRK >gi|296494718|gb|ADTN01000020.1| GENE 5 3147 - 3479 384 110 aa, chain - ## HITS:1 COG:yegP KEGG:ns NR:ns ## COG: yegP COG3422 # Protein_GI_number: 16130020 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 110 14 123 123 169 99.0 9e-43 MAGWFELSKSSDNQFRFVLKAGNGETILTSELYTSKASAEKGIASVRSNSPQEERYEKKT ASNGKFYFNLKAANHQIIGSSQMYATAQSRETGIASVKANGTSQTVKDNT >gi|296494718|gb|ADTN01000020.1| GENE 6 3670 - 4392 859 240 aa, chain - ## HITS:1 COG:baeR KEGG:ns NR:ns ## COG: baeR COG0745 # Protein_GI_number: 16130019 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli K12 # 1 240 1 240 240 462 99.0 1e-130 MTELPIDENTPRILIVEDEPKLGQLLIDYLRAASYAPTLISHGDQVLPYVRQTPPDLILL DLMLPGTDGLTLCREIRRFSDIPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVAR 
VKTILRRCKPQRELQQQDAESPLIIDEGRFQASWRGKMLDLTPAEFRLLKTLSHEPGKVF SREHLLNHLYDDYRVVTDRTIDSHIKNLRRKLESLDAEQSFIRAVYGVGYRWEADACRIV >gi|296494718|gb|ADTN01000020.1| GENE 7 4389 - 5792 1290 467 aa, chain - ## HITS:1 COG:baeS KEGG:ns NR:ns ## COG: baeS COG0642 # Protein_GI_number: 16130018 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 467 1 467 467 910 100.0 0 MKFWRPGITGKLFLAIFATCIVLLISMHWAVRISFERGFIDYIKHGNEQRLQLLSDALGE QYAQHGNWRFLRNNDRFVFQILRSFEHDNSEDKPGPGMPPHGWRTQFWVVDQNNKVLVGP RAPIPPDGTRRPILVNGAEVGAVIASPVERLTRNTDINFDKQQRQTSWLIVALATLLAAL ATFLLARGLLAPVKRLVDGTHKLAAGDFTTRVTPTSEDELGKLAQDFNQLASTLEKNQQM RRDFMADISHELRTPLAVLRGELEAIQDGVRKFTPETVASLQAEVGTLTKLVDDLHQLSM SDEGALAYQKAPVDLIPLLEVAGGAFRERFASRGLKLQFSLPDSITVFGDRDRLMQLFNN LLENSLRYTDSGGSLQISAGQRDKTVRLTFADSAPGVSDDQLQKLFERFYRTEGSRNRAS GGSGLGLAICLNIVEAHNGRIIAAHSPFGGVSITVELPLERDLQREV >gi|296494718|gb|ADTN01000020.1| GENE 8 5789 - 7204 1419 471 aa, chain - ## HITS:1 COG:yegB KEGG:ns NR:ns ## COG: yegB COG0477 # Protein_GI_number: 16130017 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 471 1 471 471 749 99.0 0 MTDLPDSTRWQLWIVAFGFFMQSLDTTIVNTALPSMAQSLGESPLHMHMVIVSYVLTVAV MLPASGWLADKVGVRNIFFTAIVLFTLGSLFCALSGTLNELLLARALQGVGGAMMVPVGR LTVMKIVPREQYMAAMTFVTLPGQVGPLLGPALGGLLVEYASWHWIFLINIPVGIIGAIA TLLLMPNYTMQTRRFDLSGFLLLAVGMAVLTLALDGSKGTGLSPLTIAGLVAVGVVALVL YLLHARNNHRALFSLKLFRTRTFSLGLAGSFAGRIGSGMLPFMTPVFLQIGLGFSPFHAG LMMIPMVLGSMGMKRIVVQVVNRFGYRRVLVATTLGLSLVTLLFMTTALLGWYYVLPFVL FLQGMVNSTRFSSMNTLTLKDLPDNLASSGNSLLSMIMQLSMSIGVTIAGLLLGLFGSQH VSVDSGTTQTVFMYTWLSMALIIALPAFIFARVPNDTHQNVAISRRKRSAQ >gi|296494718|gb|ADTN01000020.1| GENE 9 7205 - 10282 3064 1025 aa, chain - ## HITS:1 COG:yegO KEGG:ns NR:ns ## COG: yegO COG0841 # Protein_GI_number: 16130016 
# Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Escherichia coli K12 # 1 1025 1 1025 1025 1867 99.0 0 MKFFALFIYRPVATILLSVAITLCGILGFRMLPVAPLPQVDFPVIMVSASLPGASPETMA SSVATPLERSLGRIAGVSEMTSSSSLGSTRIILQFDFDRDINGAARDVQAAINAAQSLLP SGMPSRPTYRKANPSDAPIMILTLTSDTYSQGELYDFASTQLAPTISQIDGVGDVDVGGS SLPAVRVGLNPQALFNQGVSLDDVRTAVSNANVRKPQGALEDGTHRWQIQTNDELKTAAE YQPLIIHYNNGGAVRLGDVATVTDSVQDVRNAGMTNAKPAILLMIRKLPEANIIQTVDSI RAKLPELQETIPAAIDLQIAQDRSPTIRASLEEVEQTLIISVALVILVVFLFLRSGRATI IPAVSVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHLEAGMKP LQAALQGTREVGFTVLSMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIGISLLVSLT LTPMMCGWMLKASKPREQKRLRGFGRMLVALQQGYGKSLKWVLNHTRLVGVVLLGTIALN IWLYISIPKTFFPEQDTGVLMGGIQADQSISFQAMRGKLQDFMKIIRDDPAVDNVTGFTG GSRVNSGMMFITLKPRDERSETAQQIIDRLRVKLAKEPGANLFLMAVQDIRVGGRQSNAS YQYTLLSDDLAALREWKPKIRKKLATLPELADVNSDQQDNGAEMNLVYDRDTMARLGIDV QAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRYTQDISALEKMFVINNEGKAIPLSY FAKWQPTNAPLSVNHQGLSAASTISFNLPTGKSLSDASAAIDRAMTQLGVPSTVRGSFAG TAQVFQETMNSQVILIIAAIATVYIVLGILYESYVHPLTILSTLPSAGVGALLALELFNA PFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRHGNLTPQEAIFQACLLRFRPIMMTTLA ALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQLLTLYTTPVVYLFFDRLRLRFSRKPK QTVTE >gi|296494718|gb|ADTN01000020.1| GENE 10 10283 - 13405 3423 1040 aa, chain - ## HITS:1 COG:yegN KEGG:ns NR:ns ## COG: yegN COG0841 # Protein_GI_number: 16130015 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Escherichia coli K12 # 1 1040 1 1040 1040 1808 99.0 0 MQVLPPSSTGGPSRLFIMRPVATTLLMVAILLAGIIGYRALPVSALPEVDYPTIQVVTLY PGASPDVMTSAVTAPLERQFGQMSGLKQMSSQSSGGASVITLQFQLTLPLDVAEQEVQAA INAATNLLPSDLPNPPVYSKVNPADPPIMTLAVTSTAMPMTQVEDMVETRVAQKISQISG VGLVTLSGGQRPAVRVKLNAQAIAALGLTSETVRTAITGANVNSAKGSLDGPSRAVTLSA NDQMQSAEEYRQLIIAYQNGAPIRLGDVATVEQGAENSWLGAWANKEQAIVMNVQRQPGA NIISTADSIRQMLPQLTESLPKSVKVTVLSDRTTNIRASVDDTQFELMMAIALVVMIIYL FLRNIPATIIPGVAVPLSLIGTFAVMVFLDFSINNLTLMALTIATGFVVDDAIVVIENIS RYIEKGEKPLAAALKGAGEIGFTIISLTFSLIAVLIPLLFMGDIVGRLFREFAITLAVAI 
LISAVVSLTLTPMMCARMLSQESLRKQNRFSRASEKMFDRIIAAYGRGLAKVLNHPWLTL SVALSTLLLSVLLWVFIPKGFFPVQDNGIIQGTLQAPQSSSFANMAQRQRQVADVILQDP AVQSLTSFVGVDGTNPSLNSARLQINLKPLDERDDRVQKVIARLQTAVDKVPGVDLFLQP TQDLTIDTQVSRTQYQFTLQATSLDALSTWVPQLMEKLQQLPQLSDVSSDWQDKGLVAYV NVDRDSASRLGISMADVDNALYNAFGQRLISTIYTQANQYRVVLEHNTENTPGLAALDTI RLTSSDGGVVPLSSIAKIEQRFAPLSINHLDQFPVTTISFNVPDNYSLGDAVQAIMDTEK TLNLPVDITTQFQGSTLAFQSALGSTVWLIVAAVVAMYIVLGILYESFIHPITILSTLPT AGVGALLALLIAGSELDVIAIIGIILLIGIVKKNAIMMIDFALTAEREQGMSPREAIYQA CLLRFRPILMTTLAALLGALPLMLSTGVGAELRRPLGIGMVGGLIVSQVLTLFTTPVIYL LFDRLALWTKSRFARHEEEA >gi|296494718|gb|ADTN01000020.1| GENE 11 13405 - 14652 1437 415 aa, chain - ## HITS:1 COG:ECs2882 KEGG:ns NR:ns ## COG: ECs2882 COG0845 # Protein_GI_number: 15832136 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli O157:H7 # 1 415 50 464 464 718 99.0 0 MKGSYKSRWVIVIVVVIAAIAAFWFWQGRNDSQSAAPGATKQAQQSPAGGRRGMRSGPLA PVQAATAVEQAVPRYLTGLGTIIAANTVTVRSRVDGQLMALHFQEGQQVKAGDLLAEIDP SQFKVALAQTQGQLAKDKATLANARRDLARYQQLAKTNLVSRQELDAQQALVSETEGTIK ADEASVASAQLQLDWSRITAPVDGRVGLKQVDVGNQISSGDTTGIVVITQTHPIDLVFTL PESDIATVVQAQKAGKPLVVEAWDRTNSKKLSEGTLLSLDNQIDATTGTIKVKARFNNQD DALFPNQFVNARMLVDTEQNAVVIPTAALQMGNEGHFVWVLNSENKVSKHLVTPGIQDSQ KVVIRAGISAGDRVVTDGIDRLTEGAKVEVVEAQSATTPEEKATSREYAKKGARS >gi|296494718|gb|ADTN01000020.1| GENE 12 14822 - 14950 97 42 aa, chain - ## HITS:1 COG:no KEGG:ECSP_2830 NR:ns ## KEGG: ECSP_2830 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 42 7 48 48 63 92.0 2e-09 MMSIFIMTLSLFKAPSSGGAFPFQRPAEIVGLPPFALNAVYR Prediction of potential genes in microbial genomes Time: Sun May 15 23:05:32 2011 Seq name: gi|296494717|gb|ADTN01000021.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont68.1, whole genome shotgun sequence Length of sequence - 6824 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 1, operones - 1 average op.length - 8.0 N Tu/Op Conserved S 
Start End Score pairs(N/Pv) 1 1 Op 1 3/0.000 + CDS 496 - 1410 1078 ## COG1344 Flagellin and related hook-associated proteins + Term 1504 - 1540 1.0 + Prom 1444 - 1503 8.2 2 1 Op 2 15/0.000 + CDS 1701 - 3017 1346 ## COG1345 Flagellar capping protein 3 1 Op 3 . + CDS 3040 - 3432 408 ## COG1516 Flagellin-specific chaperone FliS 4 1 Op 4 . + CDS 3437 - 3748 169 ## ECUMN_0288 Lateral flagellar chaperone protein (FliT-like) 5 1 Op 5 . + CDS 3745 - 4806 862 ## COG3144 Flagellar hook-length control protein 6 1 Op 6 . + CDS 4814 - 5281 454 ## ECO111_0291 putative lateral flagellar biogenesis protein 7 1 Op 7 1/0.000 + CDS 5301 - 6017 968 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit 8 1 Op 8 . + CDS 6030 - 6822 906 ## COG1291 Flagellar motor component Predicted protein(s) >gi|296494717|gb|ADTN01000021.1| GENE 1 496 - 1410 1078 304 aa, chain + ## HITS:1 COG:VC2188 KEGG:ns NR:ns ## COG: VC2188 COG1344 # Protein_GI_number: 15642187 # Func_class: N Cell motility # Function: Flagellin and related hook-associated proteins # Organism: Vibrio cholerae # 2 196 3 183 379 105 39.0 8e-23 MLSINTNNASMAAVNAISKSSSSLSTSMERLATGNRINSSADDAAGKQIANRLTAQSSGM DVALSNINDATAMLQTADSMFDEMSDVLGRMKDLSTQAANGTYSDDDLQAMQDEYDELGQ QMSDMLQNTTYGGTNLFGVSGTSNTGTDGLFQSAVTFQVGAESSDTMTVNISSQLNTLVT DLSAISNSFSADQADTTGTAGVSGGTELTASGSANQMITSISTAMDDVSQIQSKLGASIN RLNDTANNLTSMQDNTEVAIGNIMDTDYATEASNMTKQQVLMQTGITMLKQSNSMSSMVS SLLQ >gi|296494717|gb|ADTN01000021.1| GENE 2 1701 - 3017 1346 438 aa, chain + ## HITS:1 COG:YPO0740 KEGG:ns NR:ns ## COG: YPO0740 COG1345 # Protein_GI_number: 16121058 # Func_class: N Cell motility # Function: Flagellar capping protein # Organism: Yersinia pestis # 2 437 4 418 425 132 26.0 1e-30 MINPRTIAQEIAYADVATQAANLQEKQTELDAESSGLDSLSSALSDFQSAVDALNSDTDG PVTFAATSNNDSATVSANSQAQAGSYSFFVEQLAQGQQTTFSMGDDTFSATGTFEMTMGD STMDIDLAAADQNGDGDGFIDASELVNAINDSDDNPGVSAALVKTDGTTTIMLTSDSTGA QSAFSVSVTGHDASNDSASAPVATDVSSAQDAIIHLGSATGPEITNSSNTFDDVIPGVTM 
TFTEVSDSDSDLTTFNISEDSSASQEKVQTFVDAYNTLIDTVDSLTTHGDDSTSAGVFAG DAGLNSLANQLDDIAHANYNGVSIVDYGITLDSHGHLQIDSDQFNDEMAKNPDGLTSIFV GDNSMVAQMDDLINTYTDSSNGIITLRQQNIDDQMSKIQDEGDQLTDTYNANYDRYLEEY TNTLVEVYTMKASMAAFA >gi|296494717|gb|ADTN01000021.1| GENE 3 3040 - 3432 408 130 aa, chain + ## HITS:1 COG:YPO0741 KEGG:ns NR:ns ## COG: YPO0741 COG1516 # Protein_GI_number: 16121059 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport; O Posttranslational modification, protein turnover, chaperones # Function: Flagellin-specific chaperone FliS # Organism: Yersinia pestis # 11 125 11 124 126 128 59.0 3e-30 MYQVQDGYSQYKEIDLAARTAAASPLELVLVLFSGLMDELERAKSHIEGRRFEKKAQSIN KCIDILNALTSSLEFETGGELVVNLSRLYDHCVYRLYEASGELSAEKIDEVMLILSNLRE GWEGLSGKLG >gi|296494717|gb|ADTN01000021.1| GENE 4 3437 - 3748 169 103 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_0288 NR:ns ## KEGG: ECUMN_0288 # Name: lafD # Def: Lateral flagellar chaperone protein (FliT-like) # Organism: E.coli_UMN026 # Pathway: not_defined # 1 103 1 103 103 144 97.0 1e-33 MEKQRRQLFALVEAMNEALDKQRWRRLPALHQQLMRQFHDYAAAETDAAQFCAVKAHLYG AFNALIKRREQRAEVLKARMEQHQRHQEGVLAYSIVNLLSEKS >gi|296494717|gb|ADTN01000021.1| GENE 5 3745 - 4806 862 353 aa, chain + ## HITS:1 COG:YPO0743 KEGG:ns NR:ns ## COG: YPO0743 COG3144 # Protein_GI_number: 16121061 # Func_class: N Cell motility # Function: Flagellar hook-length control protein # Organism: Yersinia pestis # 202 351 230 390 393 72 36.0 1e-12 MNPALLATLGTLAETASLKADILPPVSGENAPAFTLPKMAVAAVAERVHSAKTSQQQATR PQENDPVAMQALMALLLPQPAAPHQHTPQPRNVATSPVIQQLTKAVVQNAPQRQTQQQEL TPLPPQLQELISQLPQAKPEQQARLATYASEDLHAIAPTQPRVSTQPARPKPELTRVTAR PQVERKTEKVPDSEPVIARAVLQVKTPELLSEHQEIVAKPATLSMDELGEKLTTLLKDQI HFQLNKQQQISTIRLDPPSLGKLEIAVQLDNGKLMVHIGANQSEVCRALQQFSDDLRQHL TAQNFMEVNVQVSSEGQSQQQQQSGHQQEEVSAALQLDDAPQFQQNESVLIKV >gi|296494717|gb|ADTN01000021.1| GENE 6 4814 - 5281 454 155 aa, chain + ## HITS:1 COG:no KEGG:ECO111_0291 NR:ns ## KEGG: ECO111_0291 # Name: lafF # Def: putative lateral flagellar 
biogenesis protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 155 1 155 155 270 100.0 1e-71 MKKIVVAGVVSAVLALVVGAGAGWGVWHHYAGKGKPTAAAQTETVETLDESKSVFVTLPE TIVTLHDNNGADHYLSAELVMVVASDKEAEKIKHQEPLYQSIAVECLTEMKFEDLRGMKI SAIRKLISDALKKDLQRRKMSAPYKDLLVKKVVFQ >gi|296494717|gb|ADTN01000021.1| GENE 7 5301 - 6017 968 238 aa, chain + ## HITS:1 COG:YPO0745 KEGG:ns NR:ns ## COG: YPO0745 COG1191 # Protein_GI_number: 16121063 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Yersinia pestis # 7 234 2 230 231 224 53.0 1e-58 MLQDVFESEEFAAAPVLTPQQESHYLQAYLPLVRKVVRQLAPQCNCIIDRQDMEQTALMG LLNAIRRYGLPDEGFAGYAVHRIRGAILDELRALDWRPRQLRQKYHQMKDLIRETRKKLG HEPEWSELAVEGISHEDYLEYQQLEGAETLASLDELLNGEGPGILLAGRELEDQFVTQQM LQQALSQLSEKEQLILSMYYQHEMNLKEIALVLGLTEARICQLNKQIAQKIRDFVYPN >gi|296494717|gb|ADTN01000021.1| GENE 8 6030 - 6822 906 264 aa, chain + ## HITS:1 COG:YPO0746 KEGG:ns NR:ns ## COG: YPO0746 COG1291 # Protein_GI_number: 16121064 # Func_class: N Cell motility # Function: Flagellar motor component # Organism: Yersinia pestis # 1 261 1 261 288 301 66.0 9e-82 MQKILGLLVVLGCVFGGYFEAGGQLVAIWQPAEMIIILGAGFGAMIIGNPKHVLKEIAHQ IKGVISKKQLGPEFQRQLLMCLYELLEMVQNGGLRMLDQHIEQPEESTIFQKYPLVLTQK RLVTFIADNFRLMAMGKIDAHELEGILDQELDTAEESLLTPSRSLQRTAEAMPGFGICAA VLGIIITMQSIDGSIALIGLKVAAALVGTFLGVFICYCLMDPLANAMEQQARAEHSLLEC VRTVLVAQAGGKPTLLAGGGGGGG Prediction of potential genes in microbial genomes Time: Sun May 15 23:05:47 2011 Seq name: gi|296494716|gb|ADTN01000022.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont68.2, whole genome shotgun sequence Length of sequence - 37710 bp Number of predicted genes - 35, with homology - 34 Number of transcription units - 17, operones - 9 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.571 + CDS 145 - 1068 814 ## COG1360 Flagellar motor protein 2 1 Op 2 3/0.571 + CDS 1139 - 2194 1038 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair 3 1 Op 3 . 
+ CDS 2246 - 2539 217 ## COG2161 Antitoxin of toxin-antitoxin stability system 4 1 Op 4 . + CDS 2542 - 2940 282 ## JW0223 predicted toxin of the YafO-YafN toxin-antitoxin system 5 1 Op 5 2/0.571 + CDS 3028 - 3402 162 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Prom 3522 - 3581 9.0 6 2 Op 1 5/0.143 + CDS 3708 - 3974 103 ## COG1690 Uncharacterized conserved protein 7 2 Op 2 . + CDS 3943 - 4443 253 ## COG1186 Protein chain release factor B + Term 4685 - 4728 2.4 - Term 4444 - 4487 8.3 8 3 Tu 1 . - CDS 4500 - 5957 1814 ## COG2195 Di- and tripeptidases - Prom 6049 - 6108 3.6 + Prom 6099 - 6158 3.2 9 4 Tu 1 6/0.143 + CDS 6218 - 6676 629 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins + Term 6683 - 6721 9.1 + Prom 6683 - 6742 3.4 10 5 Op 1 . + CDS 6768 - 8012 1186 ## COG1073 Hydrolases of the alpha/beta superfamily 11 5 Op 2 . + CDS 8070 - 8471 582 ## SSON_0282 DNA-binding transcriptional regulator Crl + Term 8480 - 8514 5.2 - Term 8468 - 8502 6.0 12 6 Tu 1 . - CDS 8510 - 9565 1108 ## COG3203 Outer membrane protein (porin) - Prom 9649 - 9708 5.8 + Prom 9642 - 9701 8.4 13 7 Op 1 22/0.000 + CDS 9853 - 10956 1154 ## COG0263 Glutamate 5-kinase 14 7 Op 2 . + CDS 10968 - 12221 1482 ## COG0014 Gamma-glutamyl phosphate reductase + TRNA 12336 - 12411 91.5 # Thr CGT 0 0 15 8 Op 1 2/0.571 - CDS 12538 - 12948 237 ## COG0583 Transcriptional regulator 16 8 Op 2 6/0.143 - CDS 12927 - 13883 578 ## COG1975 Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family 17 8 Op 3 12/0.000 - CDS 13893 - 16091 1872 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 18 8 Op 4 15/0.000 - CDS 16088 - 17044 717 ## COG1319 Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs 19 8 Op 5 . - CDS 17041 - 17730 418 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs + Prom 18034 - 18093 5.4 20 9 Tu 1 . 
+ CDS 18148 - 18762 225 ## COG3477 Predicted periplasmic/secreted protein + Term 18782 - 18811 -0.9 - Term 19364 - 19404 3.2 21 10 Op 1 . - CDS 19652 - 20362 467 ## JW5030 conserved hypothetical protein 22 10 Op 2 . - CDS 20331 - 21950 897 ## JW0284 predicted receptor 23 10 Op 3 . - CDS 21964 - 24489 1686 ## B21_00253 hypothetical protein 24 10 Op 4 . - CDS 24515 - 25183 439 ## APECO1_1705 hypothetical protein - Term 25192 - 25224 3.0 25 11 Op 1 . - CDS 25241 - 25828 642 ## JW0287 conserved hypothetical protein 26 11 Op 2 . - CDS 25903 - 26445 157 ## COG2771 DNA-binding HTH domain-containing proteins - Prom 26589 - 26648 6.1 - Term 27468 - 27505 3.7 27 12 Op 1 6/0.143 - CDS 27531 - 27674 241 ## PROTEIN SUPPORTED gi|26246304|ref|NP_752343.1| 50S ribosomal protein L36 28 12 Op 2 . - CDS 27671 - 27937 460 ## PROTEIN SUPPORTED gi|110640565|ref|YP_668293.1| 50S ribosomal protein L31 type B - Prom 27961 - 28020 5.2 29 13 Tu 1 . - CDS 28298 - 28399 150 ## - Prom 28500 - 28559 4.7 + Prom 29249 - 29308 7.2 30 14 Tu 1 . + CDS 29514 - 33770 2972 ## EcolC_3322 Ig domain-containing protein + Term 33809 - 33843 4.6 - Term 33834 - 33881 13.9 31 15 Tu 1 . - CDS 33910 - 34761 382 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 34898 - 34957 3.0 + Prom 34899 - 34958 4.5 32 16 Tu 1 . + CDS 35088 - 35192 111 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase 33 17 Op 1 . - CDS 35351 - 35944 649 ## COG3059 Predicted membrane protein 34 17 Op 2 . - CDS 35956 - 36204 83 ## G2583_0402 hypothetical protein 35 17 Op 3 . 
- CDS 36301 - 37626 385 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 - Prom 37648 - 37707 1.7 Predicted protein(s) >gi|296494716|gb|ADTN01000022.1| GENE 1 145 - 1068 814 307 aa, chain + ## HITS:1 COG:mbhA KEGG:ns NR:ns ## COG: mbhA COG1360 # Protein_GI_number: 16128216 # Func_class: N Cell motility # Function: Flagellar motor protein # Organism: Escherichia coli K12 # 53 307 7 261 261 484 100.0 1e-137 MRKTGNRRDRGAKTTIVRRQIKKNHAGHHGGAWKVAFADFTLAMMALFMTLWIVNSVSKS ERESIIAALHGQSIFNGGGLSPLNKISPSHPPKPATVAVPEETEKKARDVNEKTALLKKK SATELGELATSINTIARDAHMEANLEMEIVPQGLRVLIKDDQNRNMFERGSAKIMPFFKT LLVELAPVFDSLDNKIIITGHTDAMAYKNNIYNNWNLSGDRALSARRVLEEAGMPEDKVM QVSAMADQMLLDSKNPQSAGNRRIEIMVLTKSASDTLYQYFGQHGDKVVQPLVQKLDKQQ VLSQRTR >gi|296494716|gb|ADTN01000022.1| GENE 2 1139 - 2194 1038 351 aa, chain + ## HITS:1 COG:dinP KEGG:ns NR:ns ## COG: dinP COG0389 # Protein_GI_number: 16128217 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Escherichia coli K12 # 1 351 1 351 351 714 100.0 0 MRKIIHVDMDCFFAAVEMRDNPALRDIPIAIGGSRERRGVISTANYPARKFGVRSAMPTG MALKLCPHLTLLPGRFDAYKEASNHIREIFSRYTSRIEPLSLDEAYLDVTDSVHCHGSAT LIAQEIRQTIFNELQLTASAGVAPVKFLAKIASDMNKPNGQFVITPAEVPAFLQTLPLAK IPGVGKVSAAKLEAMGLRTCGDVQKCDLVMLLKRFGKFGRILWERSQGIDERDVNSERLR KSVGVERTMAEDIHHWSECEAIIERLYPELERRLAKVKPDLLIARQGVKLKFDDFQQTTQ EHVWPRLNKADLIATARKTWDERRGGRGVRLVGLHVTLLDPQMERQLVLGL >gi|296494716|gb|ADTN01000022.1| GENE 3 2246 - 2539 217 97 aa, chain + ## HITS:1 COG:yafN KEGG:ns NR:ns ## COG: yafN COG2161 # Protein_GI_number: 16128218 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Antitoxin of toxin-antitoxin stability system # Organism: Escherichia coli K12 # 1 97 1 97 97 166 100.0 7e-42 MHRILAEKSVNITELRKNPAKYFIDQPVAVLSNNRPAGYLLSASAFEALMDMLAEQEEKK PIKARFRPSAARLEEITRRAEQYLNDMTDDDFNDFKE >gi|296494716|gb|ADTN01000022.1| GENE 4 2542 - 2940 282 132 aa, chain + ## HITS:1 COG:no 
KEGG:JW0223 NR:ns ## KEGG: JW0223 # Name: yafO # Def: predicted toxin of the YafO-YafN toxin-antitoxin system # Organism: E.coli_J # Pathway: not_defined # 1 132 1 132 132 266 100.0 2e-70 MRVFKTKLIRLQLTAEELDALTADFISYKRDGVLPDIFGRDALYDDSFTWPLIKFERVAH IHLANENNPFPPQLRQFSRTNDEAHLVYCQGAFDEQAWLLIAILKPEPHKLARDNNQMHK IGKMAEAFRMRF >gi|296494716|gb|ADTN01000022.1| GENE 5 3028 - 3402 162 124 aa, chain + ## HITS:1 COG:yafP KEGG:ns NR:ns ## COG: yafP COG0454 # Protein_GI_number: 16128220 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Escherichia coli K12 # 1 124 27 150 150 243 100.0 9e-65 MTASQHYSPQQISAWAQIDESRWKEKLAKSQVWVAIINAQPVGFISRIEHYIDMLFVDPE YTRRGVASALLKPLIKSESELTVDASITAKPFFERYGFQTVKQQRVECRGAWFTNFYMRY KPQH >gi|296494716|gb|ADTN01000022.1| GENE 6 3708 - 3974 103 88 aa, chain + ## HITS:1 COG:ykfJ KEGG:ns NR:ns ## COG: ykfJ COG1690 # Protein_GI_number: 16128221 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 88 1 88 88 177 100.0 5e-45 MEWYMGKYIRPLSDAVFTIASDDLWIESLAIQQLHTTANLPNMQRVVGMPDLHPGRGYPI GAAFFSVGRFYPARRRGNGAGNRNGPLL >gi|296494716|gb|ADTN01000022.1| GENE 7 3943 - 4443 253 166 aa, chain + ## HITS:1 COG:prfH KEGG:ns NR:ns ## COG: prfH COG1186 # Protein_GI_number: 16128222 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Escherichia coli K12 # 1 166 1 166 166 315 100.0 3e-86 MLETETGRYSDTLRSALVSLDGDNAWALSESWCGTIQWICPSPYRPHHGRKNWFLGIGRF TADEQEQSDAIRYETLRSSGPGGQHVNKTDSAVRATHLASGISVKVQSERSQHANKRLAR LLIAWKLEQQQQENSAALKSQRRMFHHQIERGNPRRTFTGMAFIEG >gi|296494716|gb|ADTN01000022.1| GENE 8 4500 - 5957 1814 485 aa, chain - ## HITS:1 COG:pepD KEGG:ns NR:ns ## COG: pepD COG2195 # Protein_GI_number: 16128223 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Escherichia coli K12 # 1 485 1 485 485 988 
100.0 0 MSELSQLSPQPLWDIFAKICSIPHPSYHEEQLAEYIVGWAKEKGFHVERDQVGNILIRKP ATAGMENRKPVVLQAHLDMVPQKNNDTVHDFTKDPIQPYIDGEWVKARGTTLGADNGIGM ASALAVLADENVVHGPLEVLLTMTEEAGMDGAFGLQGNWLQADILINTDSEEEGEIYMGC AGGIDFTSNLHLDREAVPAGFETFKLTLKGLKGGHSGGEIHVGLGNANKLLVRFLAGHAE ELDLRLIDFNGGTLRNAIPREAFATIAVAADKVDVLKSLVNTYQEILKNELAEKEKNLAL LLDSVANDKAALIAKSRDTFIRLLNATPNGVIRNSDVAKGVVETSLNVGVVTMTDNNVEI HCLIRSLIDSGKDYVVSMLDSLGKLAGAKTEAKGAYPGWQPDANSPVMHLVRETYQRLFN KTPNIQIIHAGLECGLFKKPYPEMDMVSIGPTITGPHSPDEQVHIESVGHYWTLLTELLK EIPAK >gi|296494716|gb|ADTN01000022.1| GENE 9 6218 - 6676 629 152 aa, chain + ## HITS:1 COG:ECs0265 KEGG:ns NR:ns ## COG: ECs0265 COG0503 # Protein_GI_number: 15829519 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Escherichia coli O157:H7 # 1 152 1 152 152 306 100.0 8e-84 MSEKYIVTWDMLQIHARKLASRLMPSEQWKGIIAVSRGGLVPGALLARELGIRHVDTVCI SSYDHDNQRELKVLKRAEGDGEGFIVIDDLVDTGGTAVAIREMYPKAHFVTIFAKPAGRP LVDDYVVDIPQDTWIEQPWDMGVVFVPPISGR >gi|296494716|gb|ADTN01000022.1| GENE 10 6768 - 8012 1186 414 aa, chain + ## HITS:1 COG:yafA KEGG:ns NR:ns ## COG: yafA COG1073 # Protein_GI_number: 16128225 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Escherichia coli K12 # 1 414 1 414 414 870 99.0 0 MTQANLSETLFKPRFKHPETSTLVRRFNHGAQPPVQSALDGKTIPHWYRMINRLMWIWRG IDPREILDVQARIVMSDAERTDDDLYDTVIGYRGGNWIYEWATQAMVWQQKACAEDDPQL SGRHWLHAATLYNIAAYPHLKGDDLAEQAQALSNRAYEEAAQRLPGTMRQMEFTVPGGAP ITGFLHMPKGDGPFPTVLMCGGLDAMQTDYYSLYERYFAPRGIAMLTIDMPSVGFSSKWK ITQDSSLLHQHVLKALPNVPWVDHTRVAAFGFRFGANVAVRLAYLESPRLKAVACLGPVV HTLLSDFKCQQQVPEMYLDVLASRLGMHDASDEALRVELNRYSLKVQGLLGRRCPTPMLS GYWKNDPFSPEEDSRLITSSSADGKLLEIPFNPVYRNFDKGLQEITDWIEKRLC >gi|296494716|gb|ADTN01000022.1| GENE 11 8070 - 8471 582 133 aa, chain + ## HITS:1 COG:no KEGG:SSON_0282 NR:ns ## KEGG: SSON_0282 # Name: crl # Def: DNA-binding transcriptional regulator Crl # Organism: S.sonnei # Pathway: 
not_defined # 1 133 1 133 133 266 100.0 2e-70 MTLPSGHPKSRLIKKFTALGPYIREGKCEDNRFFFDCLAVCVNVKPAPEVREFWGWWMEL EAQESRFTYSYQFGLFDKAGDWKSVPVKDTEVVERLEHTLREFHEKLRELLTTLNLKLEP ADDFRDEPVKLTA >gi|296494716|gb|ADTN01000022.1| GENE 12 8510 - 9565 1108 351 aa, chain - ## HITS:1 COG:phoE KEGG:ns NR:ns ## COG: phoE COG3203 # Protein_GI_number: 16128227 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein (porin) # Organism: Escherichia coli K12 # 1 351 1 351 351 625 100.0 1e-179 MKKSTLALVVMGIVASASVQAAEIYNKDGNKLDVYGKVKAMHYMSDNASKDGDQSYIRFG FKGETQINDQLTGYGRWEAEFAGNKAESDTAQQKTRLAFAGLKYKDLGSFDYGRNLGALY DVEAWTDMFPEFGGDSSAQTDNFMTKRASGLATYRNTDFFGVIDGLNLTLQYQGKNENRD VKKQNGDGFGTSLTYDFGGSDFAISGAYTNSDRTNEQNLQSRGTGKRAEAWATGLKYDAN NIYLATFYSETRKMTPITGGFANKTQNFEAVAQYQFDFGLRPSLGYVLSKGKDIEGIGDE DLVNYIDVGATYYFNKNMSAFVDYKINQLDSDNKLNINNDDIVAVGMTYQF >gi|296494716|gb|ADTN01000022.1| GENE 13 9853 - 10956 1154 367 aa, chain + ## HITS:1 COG:ECs0269 KEGG:ns NR:ns ## COG: ECs0269 COG0263 # Protein_GI_number: 15829523 # Func_class: E Amino acid transport and metabolism # Function: Glutamate 5-kinase # Organism: Escherichia coli O157:H7 # 1 367 1 367 367 692 100.0 0 MSDSQTLVVKLGTSVLTGGSRRLNRAHIVELVRQCAQLHAAGHRIVIVTSGAIAAGREHL GYPELPATIASKQLLAAVGQSRLIQLWEQLFSIYGIHVGQMLLTRADMEDRERFLNARDT LRALLDNNIVPVINENDAVATAEIKVGDNDNLSALAAILAGADKLLLLTDQKGLYTADPR SNPQAELIKDVYGIDDALRAIAGDSVSGLGTGGMSTKLQAADVACRAGIDTIIAAGSKPG VIGDVMEGISVGTLFHAQATPLENRKRWIFGAPPAGEITVDEGATAAILERGSSLLPKGI KSVTGNFSRGEVIRICNLEGRDIAHGVSRYNSDALRRIAGHHSQEIDAILGYEYGPVAVH RDDMITR >gi|296494716|gb|ADTN01000022.1| GENE 14 10968 - 12221 1482 417 aa, chain + ## HITS:1 COG:ECs0270 KEGG:ns NR:ns ## COG: ECs0270 COG0014 # Protein_GI_number: 15829524 # Func_class: E Amino acid transport and metabolism # Function: Gamma-glutamyl phosphate reductase # Organism: Escherichia coli O157:H7 # 1 417 1 417 417 775 98.0 0 MLEQMGIAAKQASYKLAQLSSREKNRVLEKIADELEAQSEIILNANAQDVADARANGLSE 
AMLDRLALTPARLKGIADDVRQVCNLADPVGQVIDGGVLDSGLRLERRRVPLGVIGVIYE ARPNVTVDVASLCLKTGNAVILRGGKETCRTNAATVAVIQDALKSCGLPAGAVQAIDNPD RALVSEMLRMDKYIDMLIPRGGAGLHKLCREQSTIPVITGGIGVCHIYVDESVEIAEALK VIVNAKTQRPSTCNTVETLLVNKNIADSFLPALSKQMAESGVTLHADAAALAQLQAGPAK VVAVKAEEYDDEFLSLDLNVKIVSDLDDAIAHIREHGTQHSDAILTRDMRNAQRFVNEVD SSAVYVNASTRFTDGGQFGLGAEVAVSTQKLHARGPMGLEALTTYKWIGIGDYTIRA >gi|296494716|gb|ADTN01000022.1| GENE 15 12538 - 12948 237 136 aa, chain - ## HITS:1 COG:yagP KEGG:ns NR:ns ## COG: yagP COG0583 # Protein_GI_number: 16128267 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 136 1 136 136 251 99.0 2e-67 MPAPVVLILAAGRGERFLASGGNTHKCIGWRQSPEVAPYRWPFEENGRTFDLAIEPQITT NDLRLMLRLALAGGGITIATQETFRPYIESGKLVSLLDDFLPQFPGFYLYFPQRRNIAPK LRALIDYVKEWRQQLA >gi|296494716|gb|ADTN01000022.1| GENE 16 12927 - 13883 578 318 aa, chain - ## HITS:1 COG:yagQ KEGG:ns NR:ns ## COG: yagQ COG1975 # Protein_GI_number: 16128268 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family # Organism: Escherichia coli K12 # 1 318 1 318 318 613 100.0 1e-175 MSYPLFDKDEHWHKPEQAFLTDDHRTILRFAVEALMSGKGAVLVTLVEIRGGAARPLGAQ MVVREDGRYCGFVSGGCVEAAAAFEALEMMGSGRDREIRYGEGSPWFDIVLPCGGGITLT LHKLRSAQPLLAVLNRLEQRKPVGLRYDPQAQSLVCLPTQTRTGWNLNGFEVGFRPCVRL MIYGRSLEAQATASLAAATGYDSHIFDLFPASASAQIDTDTAVILLCHDLNRELPVLQAA REAKPFYLGALGSYRTHTLRLQKLHELGWSREETTQIRAPVGIFPKARDAHTLALSVLAE VASVRLHQEEDSCLPPSS >gi|296494716|gb|ADTN01000022.1| GENE 17 13893 - 16091 1872 732 aa, chain - ## HITS:1 COG:yagR KEGG:ns NR:ns ## COG: yagR COG1529 # Protein_GI_number: 16128269 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Escherichia coli K12 # 1 732 1 732 732 1404 100.0 0 MKFDKPAGENPIDQLKVVGRPHDRIDGPLKTTGTARYAYEWHEEAPNAAYGYIVGSAIAK GRLTALDTDAAQKAPGVLAVITASNAGALGKGDKNTARLLGGPTIEHYHQAIALVVAETF 
EQARAAASLVQAHYRRNKGAYSLADEKQAVNQPPEDTPDKNVGDFDGAFTSAAVKIDATY TTPDQSHMAMEPHASMAVWDGNKLTLWTSNQMIDWCRTDLAKTLKVPVENVRIISPYIGG GFGGKLFLRSDALLAALAARAVKRPVKVMLPRPSIPNNTTHRPATLQHLRIGADQSGKIT AISHESWSGNLPGGTPETAVQQSELLYAGANRHTGLRLATLDLPEGNAMRAPGEAPGLMA LEIAIDELAEKAGIDPVEFRILNDTQVDPADPTRCFSRRQLIECLRTGADKFGWKQRNAT PGQVRDGEWLVGHGVAAGFRNNLLEKSGARVHLEQNGTVTVETDMTDIGTGSYTILAQTA AEMLGVPLEQVAVHLGDSSFPVSAGSGGQWGANTSTSGVYAACMKLREMIASAVGFDPEQ SQFADGKITNGTRSATLHEATAGGRLTAEESIEFGTLSKEYQQSTFAGHFVEVGVHSATG EVRVRRMLAVCAAGRILNPKTARSQVIGAMTMGMGAALMEELAVDDRLGYFVNHDMAGYE VPVHADIPKQEVIFLDDTDPISSPMKAKGVGELGLCGVSAAIANAVYNATGIRVRDYPIT LDKLLDKLPDVV >gi|296494716|gb|ADTN01000022.1| GENE 18 16088 - 17044 717 318 aa, chain - ## HITS:1 COG:yagS KEGG:ns NR:ns ## COG: yagS COG1319 # Protein_GI_number: 16128270 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs # Organism: Escherichia coli K12 # 1 318 1 318 318 599 100.0 1e-171 MKAFTYERVNTPAEAALSAQRVPGAKFIAGGTNLLDLMKLEIETPTHLIDVNGLGLDKIE VTDAGGLRIGALVRNTDLAAHERVRRDYAVLSRALLAGASGQLRNQATTAGNLLQRTRCP YFYDTNQPCNKRLPGSGCAALEGFSRQHAVVGVSEACIATHPSDMAVAMRLLDAVVETIT PEGKTRSITLADFYHPPGKTPHIETALLPGELIVAVTLPPPLGGKHIYRKVRDRASYAFA LVSVAAIIQPDGSGRVALGGVAHKPWRIEAADAQLSQGAQAVYDTLFASAHPTAENTFKL LLAKRTLASVLAEARAQA >gi|296494716|gb|ADTN01000022.1| GENE 19 17041 - 17730 418 229 aa, chain - ## HITS:1 COG:yagT KEGG:ns NR:ns ## COG: yagT COG2080 # Protein_GI_number: 16128271 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Escherichia coli K12 # 1 229 1 229 229 431 100.0 1e-121 MSNQGEYPEDNRVGKHEPHDLSLTRRDLIKVSAATAATAVVYPHSTLAASVPAATPAPEI MPLTLKVNGKTEQLEVDTRTTLLDTLRENLHLIGTKKGCDHGQCGACTVLVNGRRLNACL TLAVMHQGAEITTIEGLGSPDNLHPMQAAFIKHDGFQCGYCTSGQICSSVAVLKEIQDGI PSHVTVDLVSAPETTADEIRERMSGNICRCGAYANILAAIEDAAGEIKS >gi|296494716|gb|ADTN01000022.1| GENE 20 18148 - 18762 225 204 aa, chain + ## HITS:1 
COG:ECs0317 KEGG:ns NR:ns ## COG: ECs0317 COG3477 # Protein_GI_number: 15829571 # Func_class: S Function unknown # Function: Predicted periplasmic/secreted protein # Organism: Escherichia coli O157:H7 # 1 204 1 204 204 408 100.0 1e-114 MNIFEQTPPNRRRYGLAAFIGLIAGVVSAFVKWGAEVPLPPRSPVDMFNAACGPESLIRA AGQIDCSRNFLNPPYIFLRDWLGLTDPNAAVYTFAGHVFNWVGVTHIIFSIVFAVGYCVV AEVFPKIKLWQGLLAGALAQLFVHMISFPLMGLTPPLFDLPWYENVSEIFGHLVWFWSIE IIRRDLRNRITHEPDPEIPLGSNR >gi|296494716|gb|ADTN01000022.1| GENE 21 19652 - 20362 467 236 aa, chain - ## HITS:1 COG:no KEGG:JW5030 NR:ns ## KEGG: JW5030 # Name: yagV # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 236 1 236 236 472 99.0 1e-132 MFRRRGVTLTKALLTAVCMLAAPLTQAISVGNLTFSLPSETDFVSKRVVNNNKSARIYRI AISAIDSPGSSELRTRPVDVELLFAPRQLALQAGESEYFKFYYHGPRDNRERYYRVSFRE VPTRNHTRRSPTGGVVSTEPVVVMDTILVVRPRQVQFKWSFDKVTGTVSNTGNTWFKLLI KPGCDSTEEEGDAWYLRPGDVVHQPELRQPGNHYLVYNDKFIKISDSCPAKPPSAD >gi|296494716|gb|ADTN01000022.1| GENE 22 20331 - 21950 897 539 aa, chain - ## HITS:1 COG:no KEGG:JW0284 NR:ns ## KEGG: JW0284 # Name: yagW # Def: predicted receptor # Organism: E.coli_J # Pathway: not_defined # 1 539 9 547 547 1039 100.0 0 MIIFALIWPVTALRAAVSKTTWADAPAREFVFVENNSDDNFFVTPGGALDPRLTGANRWT GLKYTGSGTIYQQSLGYIDNGYNTGLYTNWKFDMWLENSPVSSPLTGLRCINWYAGCNMT TSLILPQTTDASGFYGATVTSGGAKWMHGMLSDAFYQYMQQMPVGSSFTMTINACQTSVN YDASSGARCKDQASGNWYVRNVTHTKAANLRLINTHSLAEVFINSDGVPTLGEGNADCRT QTIGSRSGLSCKMVNYTLQTNGLSNTSIHIFPAIANSSLASAVGAYDMQFSLNGSSWKPV SNTAYYYTFNEMKSADSIYVFFSSNFFKQMVNLGISDINTKDLFNFRFQNTTSPESGWYE FSTSNTLIIKPRDFSISIISDEYTQTPSREGYVGSGESALDFGYIVTTSGKTAADEVLIK VTGPAQVIGGRSYCVFSSDDGKAKVPFPATLSFITRNGATKTYDAGCDDSWRDMTDALWL TTPWTDISGEVGQMDKTTVKFSIPMDNAISLRTVDDNGWFGEVSASGEIHVQATWRNIN >gi|296494716|gb|ADTN01000022.1| GENE 23 21964 - 24489 1686 841 aa, chain - ## HITS:1 COG:no KEGG:B21_00253 NR:ns ## KEGG: B21_00253 # Name: yagX # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 841 1 841 841 1545 99.0 0 
MPLRRFSPGLKAQFAFGMVFLFVQPDASAADISAQQIGGVIIPQAFSQALQDGMSVPLYI HLAGSQGRQDDQRIGSAFIWLDDGQLRIRKIQLEESEDNASVSEQTRQQLMALANAPFNE ALTIPLTDNAQLDLSLRQLLLQLVVKREALGTVLRSRSEDIGQSSVNTLSSNLSYNLGVY NNQLRNGGSNTSSYLSLNNVTALREHHVVLDGSLYGIGSGQQDSELYKAMYERDFAGHRF AGGMLDTWNLQSLGPMTAISAGKIYGLSWGNQASSTIFDSSQSATPVIAFLPAAGEVHLT RDGRLLSVQNFTMGNHEVDTRGLPYGIYDVEVEVIVNGRVISKRTQRVNKLFSRGRGVGA PLAWQVWGGSFHMDRWSENGKKTRPAKESWLAGASTSGSLSTLSWAATGYGYDNQAVGET RLTLPLGGAINVNLQNMLASDSSWSSIGSISATLPGGFSSLWVNQEKTRIGNQLRRSDAD NRAIGGTLNLNSLWSKLGTFSISYNDDRRYNSHYYTADYYQNVYSGTFGSLGLRAGIQRY NNGDSNANTGKYIALDLSLPLGNWFSAGMTHQNGYTMANLSARKQFDEGTIRTVGANLSR AISGDTGDDKTLSGGAYAQFDARYASGTLNVNSAADGYVNTNLTANGSVGWQGKNIAASG RTDGNAGVIFNTGLEDDGQISAKINGRIFPLNGKRNYLPLSPYGRYEVELQNSKNSLDSY DIVSGRKSRLTLYPGNVAVIEPEVKQMVTVSGRIRAEDGTLLANARINNHIGRTRTDENG EFVMDVDKKYPTIDFRYSGNKTCEVALELNQARGAVWVGDVVCSGLSSWAAVTQTGEENE S >gi|296494716|gb|ADTN01000022.1| GENE 24 24515 - 25183 439 222 aa, chain - ## HITS:1 COG:no KEGG:APECO1_1705 NR:ns ## KEGG: APECO1_1705 # Name: yagY # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 222 17 238 238 418 99.0 1e-116 MKKHLLPLALLFSGISPAQALDVGDISSFMNSDSSTLSKTIKNSTDSGRLINIRLERLSS PLDDGQVISMDKPDELLLTPASLLLPAQASEVIRFFYKGPADEKERYYRIVWFDQALSDA QRDNANRSAVATASARIGTILVVAPRQANYHFQYANGSLTNTGNATLRILAYGPCLKAAN GKECKENYYLMPGKSRRFTRVDTADNKGRVALWQGDKFIPVK >gi|296494716|gb|ADTN01000022.1| GENE 25 25241 - 25828 642 195 aa, chain - ## HITS:1 COG:no KEGG:JW0287 NR:ns ## KEGG: JW0287 # Name: yagZ # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 195 1 195 195 290 100.0 2e-77 MKKKVLAIALVTVFTGMGVAQAADVTAQAVATWSATAKKDTTSKLVVTPLGSLAFQYAEG IKGFNSQKGLFDVAIEGDSTATAFKLTSRLITNTLTQLDTSGSTLNVGVDYNGAAVEKTG DTVMIDTANGVLGGNLSPLANGYNASNRTTAQDGFTFSIISGTTNGTTAVTDYSTLPEGI WSGDVSVQFDATWTS >gi|296494716|gb|ADTN01000022.1| GENE 26 25903 - 26445 157 180 aa, chain - ## HITS:1 COG:ykgK KEGG:ns NR:ns ## COG: ykgK COG2771 # Protein_GI_number: 16128279 # Func_class: K Transcription # Function: 
DNA-binding HTH domain-containing proteins # Organism: Escherichia coli K12 # 1 180 17 196 196 333 100.0 1e-91 MECQNRSDKYIWSPHDAYFYKGLSELIVDIDRLIYLSLEKIRKDFVFINLSTDSLSEFIN RDNEWLSAVKGKQVVLIAARKSEALANYWYYNSNIRGVVYAGLSRDIRKELVYVINGRFL RKDIKKDKITDREMEIIRMTAQGMQPKSIARIENCSVKTVYTHRRNAEAKLYSKIYKLVQ >gi|296494716|gb|ADTN01000022.1| GENE 27 27531 - 27674 241 47 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|26246304|ref|NP_752343.1| 50S ribosomal protein L36 [Escherichia coli CFT073] # 1 47 1 47 47 97 97 1e-19 MMKVLNSLRTAKERHPDCQIVKRKGRLYVICKSNPRFKAVQGRKKKR >gi|296494716|gb|ADTN01000022.1| GENE 28 27671 - 27937 460 88 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|110640565|ref|YP_668293.1| 50S ribosomal protein L31 type B [Escherichia coli 536] # 1 88 1 88 88 181 100 4e-45 MMKPNIHPEYRTVVFHDTSVDEYFKIGSTIKTDREIELDGVTYPYVTIDVSSKSHPFYTG KLRTVASEGNVARFTQRFGRFVSTKKGA >gi|296494716|gb|ADTN01000022.1| GENE 29 28298 - 28399 150 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKENKVQQISHKLINIVVFVAIVEYAYLFLHFY >gi|296494716|gb|ADTN01000022.1| GENE 30 29514 - 33770 2972 1418 aa, chain + ## HITS:1 COG:no KEGG:EcolC_3322 NR:ns ## KEGG: EcolC_3322 # Name: not_defined # Def: Ig domain-containing protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 1418 1 1418 1418 2503 99.0 0 MSHYKTGHKQPRFRYSVLARCVAWANISVQVLFPLAVTFTPVMAARAQHAVQPRLSMGNT TVTADNNVEKNVASFAANAGTFLSSQPDSDATRNFITGMATAKANQEIQEWLGKYGTARV KLNVDKDFSLKDSSLEMLYPIYDTPTNMLFTQGAIHRTDDRTQSNIGFGWRHFSGNDWMA GVNTFIDHDLSRSHTRIGVGAEYWRDYLKLSANGYIRASGWKKSPDIEDYQERPANGWDI RAEGYLPAWPQLGASLMYEQYYGDEVGLFGKDKRQKDPHAISAEVTYTPVPLLTLSAGHK QGKSGENDTRFGLEVNYRIGEPLAKQLDTDSIRERRVLAGSRYDLVERNNNIVLEYRKSE VIRIALPERIEGKGGQTLSLGLVVSKATHGLKNVQWEAPSLLAEGGKITGQGSQWQVTLP AYRPGKDNYYAISAVAYDNKGNASKRVQTEVVITGAGMSADRTALTLDGQSRIQMLANGN EQRPLVLSLRDAEGQPVTGMKDQIKTELAFKPAGNIVTRSLKATKSQAKPTLGEFTETEA GVYQSVFTTGTQSGEATITVSVDGMSKTVTAELRATMMDVANSTLSANEPSGDVVADGQQ AYTLTLTAVDSEGNPVTGEASRLRFVPQDTNGVTVGAISEIKPGVYSATVSSTRAGNVVV 
RAFSEQYQLGTLQQTLKFVAGPLDAAHSSITLNPDKPVVGGTVTAIWTAKDAYDNPVTSL TPEAPSLAGAAAVGSTASGWTNNGDGTWTAQITLGSTAGELEVMPKLNGQDAAANAAKVT VVADALSSNQSKVSVAEDHVKAGESTTVTLIAKDAHGNTISGLSLSASLTGTASEGATVS SGTEKGDCSYVATLTTGGKTGELRVMPLFNGQPAATEAAQLTVIAGEMSSANSTLVADNK APTVKMTTELTFTVKDAYGNPVTGLKPDAPVFSGAASTGSERPSAGNWTEKGNGVYVATL TLGSAAGQLSVMPRVNGQNAVAQPLVLNVAGDASKAEIRDMTVKVNNQLANGQSANQITL TVVDSYGNPLQGQEVTLTLPQGVTSKTGNTVTTNAAGKVDIELMSTVAGEHSITASVNNA QKTVTVKFKADFSTGQATLEVDGSTPKVANDNDAFTLTATVKDQYGNLLPGAVVVFNLPR GVKPLADGNIMVNADKEGKAELKVVSVTAGTYEITASAGNDQPSNAQSVTFVADKTTATI SSIEVIGNRAVADGKTKQTYKVTVTDANNNLLKDSDVTLTASSENLVLDPKGTAKTNEQG QAVFTGSTTIAATYTLTAKVEQANGQVSTKTAESKFVADDKNAVLAASPERVDSLVADGK TTATMTVTLMAGVNPVGGSMWVDIEAPEGVTEKDYQFLPSKADHFSGGKITRTFSTSKPG VYTFTFNALTYGGYEMTPVKVTINAVAAETENGEEEMP >gi|296494716|gb|ADTN01000022.1| GENE 31 33910 - 34761 382 283 aa, chain - ## HITS:1 COG:ECs0337 KEGG:ns NR:ns ## COG: ECs0337 COG2207 # Protein_GI_number: 15829591 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 283 12 296 296 550 93.0 1e-156 MIRQKILQQLLEWIECNLEHPISIEDIAQKSGYSRRNIQLLFRNFMHVPLGEYIRKRRLC RAAILVRLTAKSMLDIALSLHFDSQQSFSREFKKLFGCSPREYRHRDYWDLANIFPSFLI RQQQKTECRLINFPETPIFGNSFKYDIEVSNKSPDEEVKLRRHHLARCMKNFKTDIYFVS TFEPSTKSVDLLTVETFAGTVCEYADMPKEWTTTRGLYASFRYEGNWENYPDWVRNIYLI ELPARGLARVNGSDIERFYYNEDFVEKDGNDVVCEIFIPVRPV >gi|296494716|gb|ADTN01000022.1| GENE 32 35088 - 35192 111 34 aa, chain + ## HITS:1 COG:ECs0338 KEGG:ns NR:ns ## COG: ECs0338 COG0656 # Protein_GI_number: 15829592 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Escherichia coli O157:H7 # 1 34 256 289 289 77 100.0 8e-15 MAQISSLDLGYVGESVKHFNPEFVRGCLAVKIHD >gi|296494716|gb|ADTN01000022.1| GENE 33 35351 - 35944 649 197 aa, chain - ## HITS:1 COG:ykgB KEGG:ns NR:ns ## COG: ykgB COG3059 # Protein_GI_number: 16128286 # Func_class: S Function unknown # Function: Predicted membrane 
protein # Organism: Escherichia coli K12 # 1 197 4 200 200 387 100.0 1e-108 MEKYLHLLSRGDKIGLTLIRLSIAIVFMWIGLLKFVPYEADSITPFVANSPLMSFFYEHP EDYKQYLTHEGEYKPEARAWQTANNTYGFSNGLGVVEVIIALLVLANPVNRWLGLLGGLM AFTTPLVTLSFLITTPEAWVPALGDAHHGFPYLSGAGRLVLKDTLMLAGAVMIMADSARE ILKQRSNESSSTLKTEY >gi|296494716|gb|ADTN01000022.1| GENE 34 35956 - 36204 83 82 aa, chain - ## HITS:1 COG:no KEGG:G2583_0402 NR:ns ## KEGG: G2583_0402 # Name: ykgI # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 82 2 83 83 121 97.0 8e-27 MVFYMFKKSVLFATLLSGVMAFSTNADDKIILKHISVSSVSASPTVLEDTIADIARKYNA SSWKVTSMRIDNNSTATAVLYK >gi|296494716|gb|ADTN01000022.1| GENE 35 36301 - 37626 385 441 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 1 433 1 444 458 152 26 2e-36 MNKYQAVIIGFGKAGKTLAVTLAKAGWRVALIEQSNAMYGGTCINIGCIPTKTLVHDAQQ HTDFVRAIQRKNEVVNFLRNKNFHNLADMPNIDVIDGQAEFINNHSLRVHRPEGNLEIHG EKIFINTGAQTVVPPIPGITTTPGVYDSTGLLNLKELPGHLGILGGGYIGVEFASMFANF GSKVTILEAASLFLPREDRDIADNIATILRDQGVDIILNAHVERISHHENQVQVHSEHAQ LAVDALLIASGRQPATASLHPENAGIAVNERGAIVVDKRLHTTADNIWAMGDVTGGLQFT YISLDDYRIVRDELLGEGKRSTDDRKNVPYSVFMTPPLSRVGMTEEQARESGADIQVVTL PVAAIPRARVMNDTRGVLKAIVDNKTQRMLGASLLCVDSHEMINIVKMVMDAGLPYSILR DQIFTHPSMSESLNVLFSLVK Prediction of potential genes in microbial genomes Time: Sun May 15 23:07:01 2011 Seq name: gi|296494715|gb|ADTN01000023.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont69.1, whole genome shotgun sequence Length of sequence - 62832 bp Number of predicted genes - 70, with homology - 70 Number of transcription units - 39, operones - 14 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 483 118 ## B21_02868 hypothetical protein 2 2 Op 1 2/0.812 - CDS 1057 - 1428 230 ## COG3121 P pilus assembly protein, chaperone PapD 3 2 Op 2 10/0.000 - CDS 1428 - 1805 327 ## COG3121 P pilus assembly protein, chaperone PapD 4 2 Op 3 . 
- CDS 1821 - 4190 1027 ## COG3188 P pilus assembly protein, porin PapC 5 2 Op 4 . - CDS 4177 - 4335 90 ## ECBD_0695 fimbrial biogenesis outer membrane usher protein 6 2 Op 5 . - CDS 4386 - 4937 502 ## COG3539 P pilus assembly protein, pilin FimA - Prom 5029 - 5088 9.0 - Term 5155 - 5205 7.3 7 3 Tu 1 . - CDS 5221 - 5511 333 ## COG2960 Uncharacterized protein conserved in bacteria - Prom 5539 - 5598 3.7 8 4 Tu 1 . + CDS 5885 - 6538 780 ## COG0108 3,4-dihydroxy-2-butanone 4-phosphate synthase + Term 6570 - 6611 5.0 + Prom 6554 - 6613 5.2 9 5 Tu 1 . + CDS 6800 - 6970 222 ## ECUMN_3528 hypothetical protein + Term 6988 - 7030 12.0 - Term 6887 - 6918 -0.8 10 6 Tu 1 . - CDS 7028 - 7801 937 ## COG0428 Predicted divalent heavy-metal cations transporter - Prom 7898 - 7957 3.5 + Prom 7794 - 7853 5.5 11 7 Tu 1 . + CDS 7944 - 8732 848 ## COG3384 Uncharacterized conserved protein - Term 8723 - 8762 9.1 12 8 Op 1 5/0.062 - CDS 8770 - 9930 1128 ## COG0754 Glutathionylspermidine synthase 13 8 Op 2 . - CDS 9936 - 10607 426 ## COG5463 Predicted integral membrane protein - Prom 10635 - 10694 2.5 14 9 Tu 1 . - CDS 10755 - 12236 1638 ## COG1538 Outer membrane protein - Prom 12351 - 12410 6.6 + Prom 12232 - 12291 4.7 15 10 Op 1 1/1.000 + CDS 12441 - 12605 114 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 16 10 Op 2 8/0.000 + CDS 12642 - 13070 402 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 17 10 Op 3 7/0.000 + CDS 13071 - 13493 166 ## COG3151 Uncharacterized protein conserved in bacteria 18 10 Op 4 7/0.000 + CDS 13518 - 14345 725 ## COG1409 Predicted phosphohydrolases 19 10 Op 5 7/0.000 + CDS 14345 - 14926 492 ## COG3150 Predicted esterase 20 10 Op 6 . + CDS 14955 - 16847 2036 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit - Term 16847 - 16883 4.2 21 11 Op 1 4/0.188 - CDS 16895 - 17209 402 ## COG1359 Uncharacterized conserved protein 22 11 Op 2 . 
- CDS 17240 - 17821 754 ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B) + Prom 18056 - 18115 6.3 23 12 Tu 1 . + CDS 18242 - 18472 84 ## B21_02849 hypothetical protein + Term 18621 - 18648 -0.8 - Term 18466 - 18509 7.2 24 13 Op 1 40/0.000 - CDS 18518 - 19867 1168 ## COG0642 Signal transduction histidine kinase 25 13 Op 2 . - CDS 19864 - 20496 787 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 20587 - 20646 4.6 + Prom 20544 - 20603 8.5 26 14 Op 1 3/0.438 + CDS 20675 - 21067 501 ## COG3111 Uncharacterized conserved protein 27 14 Op 2 . + CDS 21120 - 21602 446 ## COG3449 DNA gyrase inhibitor + Term 21644 - 21673 -0.9 + Prom 21711 - 21770 4.1 28 15 Op 1 . + CDS 21807 - 22103 260 ## B21_02844 hypothetical protein 29 15 Op 2 1/1.000 + CDS 22105 - 22500 256 ## COG1396 Predicted transcriptional regulators + Term 22507 - 22543 3.3 + Prom 22546 - 22605 2.7 30 16 Tu 1 4/0.188 + CDS 22633 - 24240 1570 ## COG4166 ABC-type oligopeptide transport system, periplasmic component + Prom 24270 - 24329 2.4 31 17 Tu 1 5/0.062 + CDS 24378 - 26636 2602 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit + Term 26782 - 26810 -0.9 + Prom 26732 - 26791 2.3 32 18 Op 1 7/0.000 + CDS 26870 - 27607 585 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase + Term 27626 - 27660 2.8 33 18 Op 2 2/0.812 + CDS 27682 - 29094 1228 ## COG2132 Putative multicopper oxidases + Term 29095 - 29142 9.1 + Prom 29117 - 29176 6.3 34 19 Tu 1 . + CDS 29205 - 31424 2176 ## COG1032 Fe-S oxidoreductase + Term 31431 - 31471 -0.9 - Term 31417 - 31458 8.0 35 20 Op 1 . - CDS 31467 - 31724 311 ## COG4238 Murein lipoprotein 36 20 Op 2 . - CDS 31775 - 32701 654 ## B21_02836 hypothetical protein - Prom 32805 - 32864 2.9 - Term 32849 - 32887 7.2 37 21 Tu 1 1/1.000 - CDS 32901 - 33728 886 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase - Term 33737 - 33772 7.4 38 22 Tu 1 . 
- CDS 33833 - 34996 1615 ## COG1979 Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family - Prom 35159 - 35218 5.3 39 23 Tu 1 . + CDS 35190 - 36089 860 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 36081 - 36118 7.0 40 24 Tu 1 5/0.062 - CDS 36129 - 36788 624 ## COG0586 Uncharacterized membrane-associated protein - Prom 36867 - 36926 3.9 - Term 36864 - 36907 -0.4 41 25 Tu 1 . - CDS 36928 - 38115 1102 ## COG0626 Cystathionine beta-lyases/cystathionine gamma-synthases - Prom 38267 - 38326 5.0 + Prom 38102 - 38161 2.7 42 26 Op 1 30/0.000 + CDS 38382 - 39101 842 ## COG0811 Biopolymer transport proteins 43 26 Op 2 . + CDS 39108 - 39533 545 ## COG0848 Biopolymer transport protein + Term 39771 - 39813 8.5 - Term 39759 - 39799 8.0 44 27 Tu 1 . - CDS 39805 - 40689 983 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) - Prom 40849 - 40908 4.1 + Prom 40796 - 40855 3.2 45 28 Tu 1 . + CDS 40880 - 41374 669 ## COG2862 Predicted membrane protein + Term 41382 - 41414 5.3 46 29 Tu 1 . - CDS 41414 - 42454 1026 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) - Prom 42595 - 42654 3.3 + Prom 42437 - 42496 1.8 47 30 Op 1 1/1.000 + CDS 42611 - 42754 212 ## COG0412 Dienelactone hydrolase and related enzymes 48 30 Op 2 1/1.000 + CDS 42732 - 43085 301 ## COG0412 Dienelactone hydrolase and related enzymes 49 30 Op 3 . + CDS 43085 - 43495 323 ## COG0412 Dienelactone hydrolase and related enzymes + Prom 43522 - 43581 3.5 50 31 Tu 1 . 
+ CDS 43614 - 43901 321 ## ECIAI1_3148 hypothetical protein + Prom 43929 - 43988 3.5 51 32 Op 1 4/0.188 + CDS 44090 - 45208 1157 ## COG1740 Ni,Fe-hydrogenase I small subunit 52 32 Op 2 6/0.000 + CDS 45211 - 46197 1025 ## COG0437 Fe-S-cluster-containing hydrogenase components 1 53 32 Op 3 4/0.188 + CDS 46187 - 47365 1603 ## COG5557 Polysulphide reductase 54 32 Op 4 5/0.062 + CDS 47362 - 49065 1950 ## COG0374 Ni,Fe-hydrogenase I large subunit 55 32 Op 5 . + CDS 49065 - 49559 612 ## COG0680 Ni,Fe-hydrogenase maturation factor 56 32 Op 6 . + CDS 49552 - 50040 545 ## SSON_3137 hydrogenase 2-specific chaperone 57 32 Op 7 4/0.188 + CDS 50033 - 50374 272 ## COG0375 Zn finger protein HypA/HybF (possibly regulating hydrogenase expression) 58 32 Op 8 . + CDS 50387 - 50635 379 ## COG0298 Hydrogenase maturation factor - Term 50574 - 50612 2.1 59 33 Tu 1 . - CDS 50758 - 51624 1025 ## COG0625 Glutathione S-transferase - Prom 51686 - 51745 2.7 + Prom 51575 - 51634 3.1 60 34 Tu 1 . + CDS 51829 - 53688 2146 ## COG0754 Glutathionylspermidine synthase - Term 53494 - 53535 1.8 61 35 Tu 1 . - CDS 53780 - 54043 68 ## EcSMS35_3273 hypothetical protein - Prom 54119 - 54178 1.8 + Prom 53716 - 53775 5.3 62 36 Tu 1 . + CDS 53980 - 55479 1473 ## COG0306 Phosphate/sulphate permeases + Term 55486 - 55532 8.1 63 37 Tu 1 . - CDS 55528 - 56220 514 ## B21_02811 hypothetical protein - Prom 56244 - 56303 5.7 + Prom 56221 - 56280 5.4 64 38 Op 1 . + CDS 56409 - 57107 325 ## B21_02810 hypothetical protein 65 38 Op 2 . + CDS 57139 - 57897 622 ## B21_02809 hypothetical protein + Prom 57907 - 57966 2.0 66 39 Op 1 . + CDS 58093 - 59292 935 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 67 39 Op 2 . + CDS 59292 - 60116 616 ## B21_02807 hypothetical protein 68 39 Op 3 . + CDS 60128 - 60688 504 ## COG3054 Predicted transcriptional regulator 69 39 Op 4 22/0.000 + CDS 60719 - 61789 963 ## COG0795 Predicted permeases 70 39 Op 5 . 
+ CDS 61786 - 62830 870 ## COG0795 Predicted permeases Predicted protein(s) >gi|296494715|gb|ADTN01000023.1| GENE 1 3 - 483 118 160 aa, chain - ## HITS:1 COG:no KEGG:B21_02868 NR:ns ## KEGG: B21_02868 # Name: yqiI # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 154 192 345 354 275 98.0 5e-73 MSGAGAIEVDKNLTFRIRGLNNIHVLDCFVNVDLEPADGVVDFGKINSRTIKNTSVSETF SVVMTKDPGAACTEQFNILGSFFTTDILSDYSHLDIGNGLLLKIFHNDGTATEFNRFSQF ASFSSSSAPSVTAPFRAELSANPAETVVEGPFREGANKQV >gi|296494715|gb|ADTN01000023.1| GENE 2 1057 - 1428 230 123 aa, chain - ## HITS:1 COG:yqiH KEGG:ns NR:ns ## COG: yqiH COG3121 # Protein_GI_number: 16130943 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli K12 # 1 123 130 252 252 254 100.0 2e-68 MQVTMQHALKLFWRPKAIELEDDGVMTYEKVEIIRRNDGSIRFNNKMPYHVTLGYIGTNG VTMLPQTQSLMVTPFSYANTQFKNVPSTFQVGYINDFGGLSFYEINCPVVNNICNISVAN RDQ >gi|296494715|gb|ADTN01000023.1| GENE 3 1428 - 1805 327 125 aa, chain - ## HITS:1 COG:yqiH KEGG:ns NR:ns ## COG: yqiH COG3121 # Protein_GI_number: 16130943 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli K12 # 1 124 4 127 252 244 100.0 3e-65 MRYLNTKNIIAAGVLLSCMSSIAWGAIIPDRTRIIMNESDKGEALKLTNQSKNLPYLAQT WIEDTKGNKSRDFIVTVPPMVRLNPSEQIQIRMITQEKIAQLPKDRETLFYFNVREIPPK TDKKM >gi|296494715|gb|ADTN01000023.1| GENE 4 1821 - 4190 1027 789 aa, chain - ## HITS:1 COG:yqiG KEGG:ns NR:ns ## COG: yqiG COG3188 # Protein_GI_number: 16130942 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli K12 # 5 789 37 821 821 1521 99.0 0 MKTFNLSQFETDGQLPVGKYSLSTLINNKRTPIHLDLQWVLIDNQTAVCVTPEQLTLLGF TDEFIEKTQQNLIDGCYPIEKEKQITTYLDKGKMQLSISAPQAWLKYKDANWTPPELWNH 
GIAGAFLDYNLYASHYAPHQGDNSQNISSYGQAGVNLGAWRLRTDYQYDQSFNNGKSQAT NLDFPRIYLFRPIPAMNAKLTIGQYDTESSIFDSFHFSGISLKSDENMLPPDLRGYAPQI TGVAQTNAKVTVSQNNHIIYQENVPPGPFAITNLFNTLQGQLDVKVEEEDGRVTQWQVAS NSIPYLTRKGQIRYTTAMGKPTSVGGDSLQQPFFWTGEFSWGWLNNVSLYGGSVLTNRDY QSLAAGVGFNLNSLGSLSFDVTRSDAQLHNQDKETGYSYRANYSKRFESTGSQLTFAGYR FSDKNFVTMNEYINDTNHYTNYQNEKESYIVTFNQYLESLRLNTYVSLARNTYWDASSNV NYSLSLSRDFDIGPLKNVSTSLTFSRINWEEDNQDQLYLNISIPWGTSRTLSYGMQRNQD NEISHTASWYDSSDRNNSWSVSASGDNDEFKDMKASLRASYQHNTENGRLYLSGTSQRDS YYSLNASWNGSFTATRHGAAFHDYSGSADSRFMIDADGTEDIPLNNKRAVTNRYGIGVIP SVSSYITTSLSVDTRNLPENVDIENSVITTTLTEGAIGYAKLDTRKGYQIIGVIRLADGS HPPLGISVKDETSHKELGLVADGGFVYLNGIQDDNKLALRWGDKSCFIQPPNSSNLTTGT AILPCISQN >gi|296494715|gb|ADTN01000023.1| GENE 5 4177 - 4335 90 52 aa, chain - ## HITS:1 COG:no KEGG:ECBD_0695 NR:ns ## KEGG: ECBD_0695 # Name: not_defined # Def: fimbrial biogenesis outer membrane usher protein # Organism: E.coli_BL21_DE3 # Pathway: not_defined # 1 52 1 52 837 109 100.0 3e-23 MDQMYKKLKLTTISELIKNIYCSLSVIIIGCASAYAVEFNKDLIEAEDRENV >gi|296494715|gb|ADTN01000023.1| GENE 6 4386 - 4937 502 183 aa, chain - ## HITS:1 COG:ygiL KEGG:ns NR:ns ## COG: ygiL COG3539 # Protein_GI_number: 16130939 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 183 1 183 183 338 100.0 3e-93 MSAFKKSLLVAGVAMILSNNVFADEGHGIVKFKGEVISAPCSIKPGDEDLTVNLGEVADT VLKSDQKSLAEPFTIHLQDCMLSQGGTTYSKAKVTFTTANTMTGQSDLLKNTKETEIGGA TGVGVRILDSQSGEVTLGTPVVITFNNTNSYQELNFKARMESPSKDATPGNVYAQADYKI AYE >gi|296494715|gb|ADTN01000023.1| GENE 7 5221 - 5511 333 96 aa, chain - ## HITS:1 COG:yqiC KEGG:ns NR:ns ## COG: yqiC COG2960 # Protein_GI_number: 16130938 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 96 21 116 116 135 100.0 1e-32 MIDPKKIEQIARQVHESMPKGIREFGEDVEKKIRQTLQAQLTRLDLVSREEFDVQTQVLL RTREKLALLEQRISELENRSTEIKKQPDPETLPPTL 
>gi|296494715|gb|ADTN01000023.1| GENE 8 5885 - 6538 780 217 aa, chain + ## HITS:1 COG:ECs3929 KEGG:ns NR:ns ## COG: ECs3929 COG0108 # Protein_GI_number: 15833183 # Func_class: H Coenzyme transport and metabolism # Function: 3,4-dihydroxy-2-butanone 4-phosphate synthase # Organism: Escherichia coli O157:H7 # 1 217 1 217 217 412 100.0 1e-115 MNQTLLSSFGTPFERVENALAALREGRGVMVLDDEDRENEGDMIFPAETMTVEQMALTIR HGSGIVCLCITEDRRKQLDLPMMVENNTSAYGTGFTVTIEAAEGVTTGVSAADRITTVRA AIADGAKPSDLNRPGHVFPLRAQAGGVLTRGGHTEATIDLMTLAGFKPAGVLCELTNDDG TMARAPECIEFANKHNMALVTIEDLVAYRQAHERKAS >gi|296494715|gb|ADTN01000023.1| GENE 9 6800 - 6970 222 56 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_3528 NR:ns ## KEGG: ECUMN_3528 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 56 9 64 64 102 100.0 5e-21 MFIAWYWIVLIALVVVGYFLHLKRYCRAFRQDRDALLEARNKYLNSTREETAEKVE >gi|296494715|gb|ADTN01000023.1| GENE 10 7028 - 7801 937 257 aa, chain - ## HITS:1 COG:ECs3928 KEGG:ns NR:ns ## COG: ECs3928 COG0428 # Protein_GI_number: 15833182 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted divalent heavy-metal cations transporter # Organism: Escherichia coli O157:H7 # 1 257 1 257 257 392 100.0 1e-109 MSVPLILTILAGAATFIGAFLGVLGQKPSNRLLAFSLGFAAGIMLLISLMEMLPAALAAE GMSPVLGYGMFIFGLLGYFGLDRMLPHAHPQDLMQKSVQPLPKSIKRTAILLTLGISLHN FPEGIATFVTASSNLELGFGIALAVALHNIPEGLAVAGPVYAATGSKRTAILWAGISGLA EILGGVLAWLILGSMISPVVMAAIMAAVAGIMVALSVDELMPLAKEIDPNNNPSYGVLCG MSVMGFSLVLLQTAGIG >gi|296494715|gb|ADTN01000023.1| GENE 11 7944 - 8732 848 262 aa, chain + ## HITS:1 COG:ygiD KEGG:ns NR:ns ## COG: ygiD COG3384 # Protein_GI_number: 16130935 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 262 10 271 271 511 100.0 1e-145 MSSTRMPALFLGHGSPMNVLEDNLYTRSWQKLGMTLPRPQAIVVVSAHWFTRGTGVTAME TPPTIHDFGGFPQALYDTHYPAPGSPALAQRLVELLAPIPVTLDKEAWGFDHGSWGVLIK MYPDADIPMVQLSIDSSKPAAWHFEMGRKLAALRDEGIMLVASGNVVHNLRTVKWHGDSS 
PYPWATSFNEYVKANLTWQGPVEQHPLVNYLDHEGGTLSNPTPEHYLPLLYVLGAWDGQE PITIPVEGIEMGSLSMLSVQIG >gi|296494715|gb|ADTN01000023.1| GENE 12 8770 - 9930 1128 386 aa, chain - ## HITS:1 COG:ECs3926 KEGG:ns NR:ns ## COG: ECs3926 COG0754 # Protein_GI_number: 15833180 # Func_class: E Amino acid transport and metabolism # Function: Glutathionylspermidine synthase # Organism: Escherichia coli O157:H7 # 1 386 1 386 386 786 99.0 0 MERVSITERPDWREKAHEYGFNFHTMYGEPYWSEDAYYKLTLAQVEKLEEVTAELHQMCL KVVEKVIASDELMTKFRIPKHTWSFVRQSWLTHQPSLYSRLDLAWDGTGEPKLLENNADT PTSLYEAAFFQWIWLEDQLNAGNLPEGSDQFNSLQEKLIDRFVELREQYGFQLLHLTCCR DTVEDRGTIQYLQDCATEAEIATEFLYIDDIGLGEKGQFTDLQDQVISNLFKLYPWEFML REMFSTKLEDAGVRWLEPAWKSIISNKALLPLLWEMFPNHPNLLPAYFAEDDHPQMEKYV VKPIFSREGANVSIIENGKTIEAAEGPYGEEGMIVQQFHPLPKFGDSYMLIGSWLVNDQP AGIGIREDRALITQDMSRFYPHIFVE >gi|296494715|gb|ADTN01000023.1| GENE 13 9936 - 10607 426 223 aa, chain - ## HITS:1 COG:ygiB KEGG:ns NR:ns ## COG: ygiB COG5463 # Protein_GI_number: 16130933 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Escherichia coli K12 # 1 223 12 234 234 360 100.0 1e-99 MKRTKSIRHASFRKNWSARHLTPVALAVATVFMLAGCEKSDETVSLYQNADDCSAANPGK SAECTTAYNNALKEAERTAPKYATREDCVAEFGEGQCQQAPAQAGMAPENQAQAQQSSGS FWMPLMAGYMMGRLMGGGAGFAQQPLFSSKNPASPAYGKYTDATGKNYGAAQPGRTMTVP KTAMAPKPATTTTVTRGGFGESVAKQSTMQRSATGTSSRSMGG >gi|296494715|gb|ADTN01000023.1| GENE 14 10755 - 12236 1638 493 aa, chain - ## HITS:1 COG:tolC KEGG:ns NR:ns ## COG: tolC COG1538 # Protein_GI_number: 16130931 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Escherichia coli K12 # 1 493 3 495 495 801 100.0 0 MKKLLPILIGLSLSGFSSLSQAENLMQVYQQARLSNPELRKSAADRDAAFEKINEARSPL LPQLGLGADYTYSNGYRDANGINSNATSASLQLTQSIFDMSKWRALTLQEKAAGIQDVTY QTDQQTLILNTATAYFNVLNAIDVLSYTQAQKEAIYRQLDQTTQRFNVGLVAITDVQNAR AQYDTVLANEVTARNNLDNAVEQLRQITGNYYPELAALNVENFKTDKPQPVNALLKEAEK 
RNLSLLQARLSQDLAREQIRQAQDGHLPTLDLTASTGISDTSYSGSKTRGAAGTQYDDSN MGQNKVGLSFSLPIYQGGMVNSQVKQAQYNFVGASEQLESAHRSVVQTVRSSFNNINASI SSINAYKQAVVSAQSSLDAMEAGYSVGTRTIVDVLDATTTLYNAKQELANARYNYLINQL NIKSALGTLNEQDLLALNNALSKPVSTNPENVAPQTPEQNAIADGYAPDSPAPVVQQTSA RTTTSNGHNPFRN >gi|296494715|gb|ADTN01000023.1| GENE 15 12441 - 12605 114 54 aa, chain + ## HITS:1 COG:ECs3922 KEGG:ns NR:ns ## COG: ECs3922 COG0494 # Protein_GI_number: 15833176 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli O157:H7 # 1 54 1 54 209 117 98.0 6e-27 MLKPDNLPVTFGKNDVEIIARETLYRGFFSLDLYRFRHRLFNGQMSHEVRRGIF >gi|296494715|gb|ADTN01000023.1| GENE 16 12642 - 13070 402 142 aa, chain + ## HITS:1 COG:ECs3922 KEGG:ns NR:ns ## COG: ECs3922 COG0494 # Protein_GI_number: 15833176 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli O157:H7 # 1 142 68 209 209 262 99.0 1e-70 MRDEVVLIEQIRIAAYDTSETPWLLEMVAGMIEEGESVEDVARREAIEEAGLIVKRTKPV LSFLASPGGTSERSSIMVGEVDATTASGIHGLADENEDIRVHVVSREQAYQWVEEGKIDN AASVIALQWLQLHHQALKNEWA >gi|296494715|gb|ADTN01000023.1| GENE 17 13071 - 13493 166 140 aa, chain + ## HITS:1 COG:ECs3921 KEGG:ns NR:ns ## COG: ECs3921 COG3151 # Protein_GI_number: 15833175 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 140 1 140 140 274 100.0 4e-74 MKRYTPDFPEMMRLCEMNFSQLRRLLPRNDAPGETVSYQVANAQYRLTIVESTRYTTLVT IEQTAPAISYWSLPSMTVRLYHDAMVAEVCSSQQIFRFKARYDYPNKKLHQRDEKHQINQ FLADWLRYCLAHGAMAIPVY >gi|296494715|gb|ADTN01000023.1| GENE 18 13518 - 14345 725 275 aa, chain + ## HITS:1 COG:ECs3920 KEGG:ns NR:ns ## COG: ECs3920 COG1409 # Protein_GI_number: 15833174 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # 
Organism: Escherichia coli O157:H7 # 1 275 1 275 275 528 100.0 1e-150 MESLLTLPLAGEARVRILQITDTHLFAQKHEALLGVNTWESYQAVLEAIRPHQHEFDLIV ATGDLAQDQSSAAYQHFAEGIASFRAPCVWLPGNHDFQPAMYSALQDAGISPAKRVFIGE QWQILLLDSQVFGVPHGELSEFQLEWLERKLADAPERHTLLLLHHHPLPAGCSWLDQHSL RNAGELDTVLAKFPHVKYLLCGHIHQELDLDWNGRRLLATPSTCVQFKPHCSNFTLDTIA PGWRTLELHADGTLTTEVHRLADTRFQPDTASEGY >gi|296494715|gb|ADTN01000023.1| GENE 19 14345 - 14926 492 193 aa, chain + ## HITS:1 COG:ECs3919 KEGG:ns NR:ns ## COG: ECs3919 COG3150 # Protein_GI_number: 15833173 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Escherichia coli O157:H7 # 1 193 1 193 193 390 100.0 1e-109 MSTLLYLHGFNSSPRSAKASLLKNWLAEHHPDVEMIIPQLPPYPSDAAELLESIVLEHGG DSLGIVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESR HIYDLKVMQIDPLEAPDLIWLLQQTGDEVLDYRQAVAYYASCRQTVIEGGNHAFTGFEDY FNPIVDFLGLHHL >gi|296494715|gb|ADTN01000023.1| GENE 20 14955 - 16847 2036 630 aa, chain + ## HITS:1 COG:parE KEGG:ns NR:ns ## COG: parE COG0187 # Protein_GI_number: 16130926 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Escherichia coli K12 # 1 630 1 630 630 1300 100.0 0 MTQTYNADAIEVLTGLEPVRRRPGMYTDTTRPNHLGQEVIDNSVDEALAGHAKRVDVILH ADQSLEVIDDGRGMPVDIHPEEGVPAVELILCRLHAGGKFSNKNYQFSGGLHGVGISVVN ALSKRVEVNVRRDGQVYNIAFENGEKVQDLQVVGTCGKRNTGTSVHFWPDETFFDSPRFS VSRLTHVLKAKAVLCPGVEITFKDEINNTEQRWCYQDGLNDYLAEAVNGLPTLPEKPFIG NFAGDTEAVDWALLWLPEGGELLTESYVNLIPTMQGGTHVNGLRQGLLDAMREFCEYRNI LPRGVKLSAEDIWDRCAYVLSVKMQDPQFAGQTKERLSSRQCAAFVSGVVKDAFILWLNQ NVQAAELLAEMAISSAQRRMRAAKKVVRKKLTSGPALPGKLADCTAQDLNRTELFLVEGD SAGGSAKQARDREYQAIMPLKGKILNTWEVSSDEVLASQEVHDISVAIGIDPDSDDLSQL RYGKICILADADSDGLHIATLLCALFVKHFRALVKHGHVYVALPPLYRIDLGKEVYYALT EEEKEGVLEQLKRKKGKPNVQRFKGLGEMNPMQLRETTLDPNTRRLVQLTIDDEDDQRTD AMMDMLLAKKRSEDRRNWLQEKGDMAEIEV >gi|296494715|gb|ADTN01000023.1| GENE 21 16895 - 17209 402 104 aa, chain - ## HITS:1 COG:ECs3911 KEGG:ns NR:ns ## COG: ECs3911 
COG1359 # Protein_GI_number: 15833165 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 104 1 104 104 199 100.0 9e-52 MLTVIAEIRTRPGQHHRQAVLDQFAKIVPTVLKEEGCHGYAPMVDCAAGVSFQSMAPDSI VMIEQWESIAHLEAHLQTPHMKAYSEAVKGDVLEMNIRILQPGI >gi|296494715|gb|ADTN01000023.1| GENE 22 17240 - 17821 754 193 aa, chain - ## HITS:1 COG:ECs3910 KEGG:ns NR:ns ## COG: ECs3910 COG2249 # Protein_GI_number: 15833164 # Func_class: R General function prediction only # Function: Putative NADPH-quinone reductase (modulator of drug activity B) # Organism: Escherichia coli O157:H7 # 1 193 1 193 193 400 100.0 1e-112 MSNILIINGAKKFAHSNGQLNDTLTEVADGTLRDLGHDVRIVRADSDYDVKAEVQNFLWA DVVIWQMPGWWMGAPWTVKKYIDDVFTEGHGTLYASDGRTRKDPSKKYGSGGLVQGKKYM LSLTWNAPMEAFTEKDQFFHGVGVDGVYLPFHKANQFLGMEPLPTFIANDVIKMPDVPRY TEEYRKHLVEIFG >gi|296494715|gb|ADTN01000023.1| GENE 23 18242 - 18472 84 76 aa, chain + ## HITS:1 COG:no KEGG:B21_02849 NR:ns ## KEGG: B21_02849 # Name: ygiZ # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 76 35 110 110 112 100.0 6e-24 MLESGGNICSIPSVSGEDRILQAMIAAFFLLTPLIILILRKLFMREMFEFWVYVFSLGIC LVCGWWLFWGRFIFCY >gi|296494715|gb|ADTN01000023.1| GENE 24 18518 - 19867 1168 449 aa, chain - ## HITS:1 COG:ygiY KEGG:ns NR:ns ## COG: ygiY COG0642 # Protein_GI_number: 16130922 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 449 1 449 449 855 100.0 0 MKFTQRLSLRVRLTLIFLILASVTWLLSSFVAWKQTTDNVDELFDTQLMLFAKRLSTLDL NEINAADRMAQTPNRLKHGHVDDDALTFAIFTHDGRMVLNDGDNGEDIPYSYQREGFADG QLVGEDDPWRFVWMTSPDGKYRIVVGQEWEYREDMALAIVAGQLIPWLVALPIMLIIMMV LLGRELAPLNKLALALRMRDPDSEKPLNATGVPSEVRPLVESLNQLFARTHAMMVRERRF TSDAAHELRSPLTALKVQTEVAQLSDDDPQARKKALLQLHSGIDRATRLVDQLLTLSRLD SLDNLQDVAEIPLEDLLQSSVMDIYHTAQQAKIDVRLTLNAHSIKRTGQPLLLSLLVRNL LDNAVRYSPQGSVVDVTLNADNFIVRDNGPGVTPEALARIGERFYRPPGQTATGSGLGLS IVQRIAKLHGMNVEFGNAEQGGFEAKVSW >gi|296494715|gb|ADTN01000023.1| GENE 25 
19864 - 20496 787 210 aa, chain - ## HITS:1 COG:ECs3907 KEGG:ns NR:ns ## COG: ECs3907 COG0745 # Protein_GI_number: 15833161 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 210 10 219 219 407 99.0 1e-113 MLIGDGIKTGLSKMGFSVDWFTQGRQGKEALYSAPYDAVILDLTLPGMDGRDILREWREK GQREPVLILTARDALAERVEGLRLGADDYLCKPFALIEVAARLEALMRRTNGQASNELRH GNVMLDPGKRIATLAGEPLTLKPKEFALLELLMRNAGRVLPRKLIEEKLYTWDEEVTSNA VEVHVHHLRRKLGSDFIRTVHGIGYTLGEK >gi|296494715|gb|ADTN01000023.1| GENE 26 20675 - 21067 501 130 aa, chain + ## HITS:1 COG:ECs3906 KEGG:ns NR:ns ## COG: ECs3906 COG3111 # Protein_GI_number: 15833160 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 130 1 130 130 243 100.0 5e-65 MKKFAAVIAVMALCSAPVMAAEQGGFSGPSATQSQAGGFQGPNGSVTTVESAKSLRDDTW VTLRGNIVERISDDLYVFKDASGTINVDIDHKRWNGVTVTPKDTVEIQGEVDKDWNSVEI DVKQIRKVNP >gi|296494715|gb|ADTN01000023.1| GENE 27 21120 - 21602 446 160 aa, chain + ## HITS:1 COG:ygiV KEGG:ns NR:ns ## COG: ygiV COG3449 # Protein_GI_number: 16130919 # Func_class: L Replication, recombination and repair # Function: DNA gyrase inhibitor # Organism: Escherichia coli K12 # 1 160 1 160 160 331 98.0 3e-91 MTNLTLDVNIIDFPSIPVAMLPHRCSPELLNYSVAKFIMWRKETGLSPVNQSQTFGVAWD DPATTAPEAFRFDICGSVSEPIPDNRYGVSNGELTGGRYAVARHVGELDDISHTVWGIIR HWLPASGEKMRKAPILFHYTNLAEGVTEQRLETDVYVPLA >gi|296494715|gb|ADTN01000023.1| GENE 28 21807 - 22103 260 98 aa, chain + ## HITS:1 COG:no KEGG:B21_02844 NR:ns ## KEGG: B21_02844 # Name: ygiU # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 98 1 98 98 193 100.0 2e-48 MEKRTPHTRLSQVKKLVNAGQVRTTRSALLNADELGLDFDGMCNVIIGLSESDFYKSMTT YSDHTIWQDVYRPRLVTGQVYLKITVIHDVLIVSFKEK >gi|296494715|gb|ADTN01000023.1| GENE 29 22105 - 22500 256 131 aa, chain + ## HITS:1 COG:ygiT KEGG:ns NR:ns ## COG: ygiT COG1396 # 
Protein_GI_number: 16130917 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 131 1 131 131 263 100.0 6e-71 MKCPVCHQGEMVSGIKDIPYTFRGRKTVLKGIHGLYCVHCEESIMNKEESDAFMAQVKAF RASVNAETVAPEFIVKVRKKLSLTQKEASEIFGGGVNAFSRYEKGNAQPHPSTIKLLRVL DKHPELLNEIR >gi|296494715|gb|ADTN01000023.1| GENE 30 22633 - 24240 1570 535 aa, chain + ## HITS:1 COG:ygiS KEGG:ns NR:ns ## COG: ygiS COG4166 # Protein_GI_number: 16130916 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Escherichia coli K12 # 1 535 1 535 535 1060 100.0 0 MYTRNLLWLVSLVSAAPLYAADVPANTPLAPQQVFRYNNHSDPGTLDPQKVEENTAAQIV LDLFEGLVWMDGEGQVQPAQAERWEILDGGKRYIFHLRSGLQWSDGQPLTAEDFVLGWQR AVDPKTASPFAGYLAQAHINNAAAIVAGKADVTSLGVKATDDRTLEVTLEQPVPWFTTML AWPTLFPVPHHVIAKHGDSWSKPENMVYNGAFVLDQWVVNEKITARKNPKYRDAQHTVLQ QVEYLALDNSVTGYNRYRAGEVDLTWVPAQQIPAIEKSLPGELRIIPRLNSEYYNFNLEK PPFNDVRVRRALYLTVDRQLIAQKVLGLRTPATTLTPPEVKGFSATTFDELQKPMSERVA MAKALLKQAGYDASHPLRFELFYNKYDLHEKTAIALSSEWKKWLGAQVTLRTMEWKTYLD ARRAGDFMLSRQSWDATYNDASSFLNTLKSDSEENVGHWKNAQYDALLNQATQITDATKR NALYQQAEVIINQQAPLIPIYYQPLIKLLKPYVGGFPLHNPQDYVYSKELYIKAH >gi|296494715|gb|ADTN01000023.1| GENE 31 24378 - 26636 2602 752 aa, chain + ## HITS:1 COG:ECs3903 KEGG:ns NR:ns ## COG: ECs3903 COG0188 # Protein_GI_number: 15833157 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Escherichia coli O157:H7 # 1 752 1 752 752 1459 99.0 0 MSDMAERLALHEFTENAYLNYSMYVIMDRALPFIGDGLKPVQRRIVYAMSELGLNASAKF KKSARTVGDVLGKYHPHGDSACYEAMVLMAQPFSYRYPLVDGQGNWGAPDDPKSFAAMRY TESRLSKYSELLLSELGQGTADWVPNFDGTLQEPKMLPARLPNILLNGTTGIAVGMATDI PPHNLREVAQAAIALIDQPKTTLDQLLDIVQGPDYPTEAEIITSRAEIRKIYENGRGSVR MRAVWKKEDGAVVISALPHQVSGARVLEQIAAQMRNKKLPMVDDLRDESDHENPTRLVIV PRSNRVDMAQVMNHLFATTDLEKSYRINLNMIGLDGRPAVKNLLEILSEWLVFRRDTVRR 
RLNYRLEKVLKRLHILEGLLVAFLNIDEVIEIIRNEDEPKPALMSRFGLTETQAEAILEL KLRHLAKLEEMKIRGEQSELEKERDQLQGILASERKMNNLLKKELQADAQAYGDDRRSPL QEREEAKAMSEHDMLPSEPVTIVLSQMGWVRSAKGHDIDAPGLNYKAGDSFKAAVKGKSN QPVVFVDSTGRSYAIDPITLPSARGQGEPLTGKLTLPPGATVDHMLMESDDQKLLMASDA GYGFVCTFNDLVARNRAGKALITLPENAHVMPPVVIEDASDMLLAITQAGRMLMFPVSDL PQLSKGKGNKIINIPSAEAARGEDGLAQLYVLPPQSTLTIHVGKRKIKLRPEELQKVTGE RGRRGTLMRGLQRIDRVEIDSPRRASSGDSEE >gi|296494715|gb|ADTN01000023.1| GENE 32 26870 - 27607 585 245 aa, chain + ## HITS:1 COG:plsC KEGG:ns NR:ns ## COG: plsC COG0204 # Protein_GI_number: 16130914 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Escherichia coli K12 # 1 245 1 245 245 516 100.0 1e-146 MLYIFRLIITVIYSILVCVFGSIYCLFSPRNPKHVATFGHMFGRLAPLFGLKVECRKPTD AESYGNAIYIANHQNNYDMVTASNIVQPPTVTVGKKSLLWIPFFGQLYWLTGNLLIDRNN RTKAHGTIAEVVNHFKKRRISIWMFPEGTRSRGRGLLPFKTGAFHAAIAAGVPIIPVCVS TTSNKINLNRLHNGLVIVEMLPPIDVSQYGKDQVRELAAHCRSIMEQKIAELDKEVAERE AAGKV >gi|296494715|gb|ADTN01000023.1| GENE 33 27682 - 29094 1228 470 aa, chain + ## HITS:1 COG:sufI KEGG:ns NR:ns ## COG: sufI COG2132 # Protein_GI_number: 16130913 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Putative multicopper oxidases # Organism: Escherichia coli K12 # 1 470 1 470 470 932 100.0 0 MSLSRRQFIQASGIALCAGAVPLKASAAGQQQPLPVPPLLESRRGQPLFMTVQRAHWSFT PGTRASVWGINGRYLGPTIRVWKGDDVKLIYSNRLTENVSMTVAGLQVPGPLMGGPARMM SPNADWAPVLPIRQNAATLWYHANTPNRTAQQVYNGLAGMWLVEDEVSKSLPIPNHYGVD DFPVIIQDKRLDNFGTPEYNEPGSGGFVGDTLLVNGVQSPYVEVSRGWVRLRLLNASNSR RYQLQMNDGRPLHVISGDQGFLPAPVSVKQLSLAPGERREILVDMSNGDEVSITCGEAAS IVDRIRGFFEPSSILVSTLVLTLRPTGLLPLVTDSLPMRLLPTEIMAGSPIRSRDISLGD DPGINGQLWDVNRIDVTAQQGTWERWTVRADEPQAFHIEGVMFQIRNVNGAMPFPEDRGW KDTVWVDGQVELLVYFGQPSWAHFPFYFNSQTLEMADRGSIGQLLVNPVP >gi|296494715|gb|ADTN01000023.1| GENE 34 29205 - 31424 2176 739 aa, chain + ## HITS:1 COG:ECs3900 KEGG:ns NR:ns ## COG: ECs3900 COG1032 # Protein_GI_number: 15833154 # Func_class: C Energy 
production and conversion # Function: Fe-S oxidoreductase # Organism: Escherichia coli O157:H7 # 1 739 1 739 739 1516 99.0 0 MSSISLIQPDRDLFSWPQYWAACFGPAPFLPMSREEMDQLGWDSCDIILVTGDAYVDHPS FGMAICGRMLEAQGFRVGIIAQPDWSSKDDFMRLGKPNLFFGVTAGNMDSMINRYTADRR LRHDDAYTPDNVAGKRPDRATLVYTQRCKEAWKDVPVILGGIEASLRRTAHYDYWSDTVR RSVLVDSKADMLMFGNGERPLVEVAHRLAMGEPISEIRDVRNTAIIVKEALPGWSGVDST RLDTPGKIDPIPHPYGEDLPCADNKPVAPKKQEAKAVTVQPPRPKPWEKTYVLLPSFEKV KGDKVLYAHASRILHHETNPGCARALMQKHGDRYVWINPPAIPLSTEEMDSVFALPYKRV PHPAYGNARIPAYEMIRFSVNIMRGCFGGCSFCSITEHEGRIIQSRSEDSIINEIEAIRD TVPGFTGVISDLGGPTANMYMLRCKSPRAEQTCRRLSCVYPDICPHMDTNHEPTINLYRR ARDLKGIKKILIASGVRYDIAVEDPRYIKELATHHVGGYLKIAPEHTEEGPLSKMMKPGM GSYDRFKELFDTYSKQAGKEQYLIPYFISAHPGTRDEDMVNLALWLKKHRFRLDQVQNFY PSPLANSTTMYYTGKNPLAKIGYKSEDVFVPKGDKQRRLHKALLRYHDPANWPLIRQALE AMGKKHLIGSRRDCLVPAPTIEEMREARRQNRNTRPALTKHTPMATQRQTPATAKKASST QSRPVNAGAKKRPKAAVGR >gi|296494715|gb|ADTN01000023.1| GENE 35 31467 - 31724 311 85 aa, chain - ## HITS:1 COG:yqhH KEGG:ns NR:ns ## COG: yqhH COG4238 # Protein_GI_number: 16130912 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Murein lipoprotein # Organism: Escherichia coli K12 # 1 85 1 85 85 149 100.0 2e-36 MKTIFTVGAVVLATCLLSGCVNEQKVNQLASNVQTLNAKIARLEQDMKALRPQIYAAKSE ANRANTRLDAQDYFDCLRCLRMYAE >gi|296494715|gb|ADTN01000023.1| GENE 36 31775 - 32701 654 308 aa, chain - ## HITS:1 COG:no KEGG:B21_02836 NR:ns ## KEGG: B21_02836 # Name: yqhG # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 308 1 308 308 605 99.0 1e-172 MKIILLFLAAPASFTVHAQPPSQTVEQTVRHIYQNYKSDATAPYFGETGERAITSARIQQ ALTLNDNLTLPGNIGWLDYDPVCDCQDFGDLVLESVAITQTDADHADAVVRFRIFKDDKE KTTQTLKMVAENGRWVIDDIVSNHGSVLQAVNSENEKTLAALASLQKEQPEAFVAELFEH IADYSWPWTWVVSDSYRQAVNAFYKTTFKTANNPDEDMQIERQFIYDNPICFGEESLFSR VDEIRVLEKTADSARIHVRFTLTNGNNEEQELVLQRREGKWEIADFIRPNSGSLLKQIEA KTAARLKQ >gi|296494715|gb|ADTN01000023.1| GENE 37 32901 - 33728 886 275 aa, chain - ## HITS:1 COG:STM3165 KEGG:ns NR:ns ## COG: STM3165 COG0656 # Protein_GI_number: 
16766465 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Salmonella typhimurium LT2 # 1 275 1 275 275 497 92.0 1e-141 MANPTVIKLQDGNVMPQLGLGVWQASNEEVITAIQKALEVGYRSIDTAAAYKNEEGVGKA LKNASVNREELFITTKLWNDDHKRPREALLDSLKKLQLDYIDLYLMHWPVPAIDHYVEAW KGMIELQKEGLIKSIGVCNFQIHHLQRLIDETGVTPVINQIELHPLMQQRQLHAWNATHK IQTESWSPLAQGGKGVFDQKVIRDLADKYGKTPAQIVIRWHLDSGLVVIPKSVTPSRIAE NFDVWDFRLDKDELGEIAKLDQGKRLGPDPDQFGG >gi|296494715|gb|ADTN01000023.1| GENE 38 33833 - 34996 1615 387 aa, chain - ## HITS:1 COG:yqhD KEGG:ns NR:ns ## COG: yqhD COG1979 # Protein_GI_number: 16130909 # Func_class: C Energy production and conversion # Function: Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family # Organism: Escherichia coli K12 # 1 387 1 387 387 782 100.0 0 MNNFNLHTPTRILFGKGAIAGLREQIPHDARVLITYGGGSVKKTGVLDQVLDALKGMDVL EFGGIEPNPAYETLMNAVKLVREQKVTFLLAVGGGSVLDGTKFIAAAANYPENIDPWHIL QTGGKEIKSAIPMGCVLTLPATGSESNAGAVISRKTTGDKQAFHSAHVQPVFAVLDPVYT YTLPPRQVANGVVDAFVHTVEQYVTKPVDAKIQDRFAEGILLTLIEDGPKALKEPENYDV RANVMWAATQALNGLIGAGVPQDWATHMLGHELTAMHGLDHAQTLAIVLPALWNEKRDTK RAKLLQYAERVWNITEGSDDERIDAAIAATRNFFEQLGVPTHLSDYGLDGSSIPALLKKL EEHGMTQLGENHDITLDVSRRIYEAAR >gi|296494715|gb|ADTN01000023.1| GENE 39 35190 - 36089 860 299 aa, chain + ## HITS:1 COG:yqhC KEGG:ns NR:ns ## COG: yqhC COG2207 # Protein_GI_number: 16130908 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 299 77 375 375 597 99.0 1e-171 MKREEICRLLADKVNKLKNKENSLSELLPDVRLLYGETPFARTPVMYEPGIIILFSGHKI GYINERVFRYDANEYLLLTVPLPFECETYATSEVPLAGLRLNVDILQLQELLMDIGEDEH FQPSMAASGINSATLSEEILCAAERLLDVMERPLDARILGKQIIREILYYVLTGPCGGAL LALVSRQTHFSLISRVLKRIENKYTENLSVEQLAAEANMSVSAFHHNFKSVTSTSPLQYL KNYRLHKARMMIIHDGMKASAAAMRVGYESASQFSREFKRYFGVTPGEDAARMRAMQGN >gi|296494715|gb|ADTN01000023.1| GENE 40 36129 - 36788 624 219 aa, chain - ## HITS:1 COG:ECs3893 KEGG:ns NR:ns ## COG: ECs3893 COG0586 # 
Protein_GI_number: 15833147 # Func_class: S Function unknown # Function: Uncharacterized membrane-associated protein # Organism: Escherichia coli O157:H7 # 1 219 1 219 219 384 100.0 1e-106 MAVIQDIIAALWQHDFAALADPHIVSVVYFVMFATLFLENGLLPASFLPGDSLLILAGAL IAQGVMDFLPTIAILTAAASLGCWLSYIQGRWLGNTKTVKGWLAQLPAKYHQRATCMFDR HGLLALLAGRFLAFVRTLLPTMAGISGLPNRRFQFFNWLSGLLWVSVVTSFGYALSMIPF VKRHEDQVMTFLMILPIALLTAGLLGTLFVVIKKKYCNA >gi|296494715|gb|ADTN01000023.1| GENE 41 36928 - 38115 1102 395 aa, chain - ## HITS:1 COG:metC KEGG:ns NR:ns ## COG: metC COG0626 # Protein_GI_number: 16130906 # Func_class: E Amino acid transport and metabolism # Function: Cystathionine beta-lyases/cystathionine gamma-synthases # Organism: Escherichia coli K12 # 1 395 1 395 395 820 100.0 0 MADKKLDTQLVNAGRSKKYTLGAVNSVIQRASSLVFDSVEAKKHATRNRANGELFYGRRG TLTHFSLQQAMCELEGGAGCVLFPCGAAAVANSILAFIEQGDHVLMTNTAYEPSQDFCSK ILSKLGVTTSWFDPLIGADIVKHLQPNTKIVFLESPGSITMEVHDVPAIVAAVRSVVPDA IIMIDNTWAAGVLFKALDFGIDVSIQAATKYLVGHSDAMIGTAVCNARCWEQLRENAYLM GQMVDADTAYITSRGLRTLGVRLRQHHESSLKVAEWLAEHPQVARVNHPALPGSKGHEFW KRDFTGSSGLFSFVLKKKLNNEELANYLDNFSLFSMAYSWGGYESLILANQPEHIAAIRP QGEIDFSGTLIRLHIGLEDVDDLIADLDAGFARIV >gi|296494715|gb|ADTN01000023.1| GENE 42 38382 - 39101 842 239 aa, chain + ## HITS:1 COG:ECs3890 KEGG:ns NR:ns ## COG: ECs3890 COG0811 # Protein_GI_number: 15833144 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Escherichia coli O157:H7 # 1 239 6 244 244 439 99.0 1e-123 MQTDLSVWGMYQHADIVVKCVMIGLILASVVTWAIFFSKSVEFFNQKRRLKREQQLLAEA RSLNQANDIAADFGSKSLSLHLLNEAQNELELSEGSDDNEGIKERTSFRLERRVAAVGRQ MGRGNGYLATIGAISPFVGLFGTVWGIMNSFIGIAQTQTTNLAVVAPGIAEALLATAIGL VAAIPAVVIYNVFARQIGGFKAMLGDVAAQVLLLQSRDLDLEASAAVHPVRVAQKLRAG >gi|296494715|gb|ADTN01000023.1| GENE 43 39108 - 39533 545 141 aa, chain + ## HITS:1 COG:ECs3889 KEGG:ns NR:ns ## COG: ECs3889 COG0848 # Protein_GI_number: 15833143 # Func_class: U Intracellular trafficking, secretion, and vesicular transport 
# Function: Biopolymer transport protein # Organism: Escherichia coli O157:H7 # 1 141 1 141 141 252 100.0 2e-67 MAMHLNENLDDNGEMHDINVTPFIDVMLVLLIIFMVAAPLATVDVKVNLPASTSTPQPRP EKPVYLSVKADNSMFIGNDPVTDETMITALNALTEGKKDTTIFFRADKTVDYETLMKVMD TLHQAGYLKIGLVGEETAKAK >gi|296494715|gb|ADTN01000023.1| GENE 44 39805 - 40689 983 294 aa, chain - ## HITS:1 COG:ECs3887 KEGG:ns NR:ns ## COG: ECs3887 COG1028 # Protein_GI_number: 15833141 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Escherichia coli O157:H7 # 1 294 1 294 294 557 100.0 1e-159 MSHLKDPTTQYYTGEYPKQKQPTPGIQAKMTPVPDCGEKTYVGSGRLKDRKALVTGGDSG IGRAAAIAYAREGADVAISYLPVEEEDAQDVKKIIEECGRKAVLLPGDLSDEKFARSLVH EAHKALGGLDIMALVAGKQVAIPDIADLTSEQFQKTFAINVFALFWLTQEAIPLLPKGAS IITTSSIQAYQPSPHLLDYAATKAAILNYSRGLAKQVAEKGIRVNIVAPGPIWTALQISG GQTQDKIPQFGQQTPMKRAGQPAELAPVYVYLASQESSYVTAEVHGVCGGEHLG >gi|296494715|gb|ADTN01000023.1| GENE 45 40880 - 41374 669 164 aa, chain + ## HITS:1 COG:ECs3886 KEGG:ns NR:ns ## COG: ECs3886 COG2862 # Protein_GI_number: 15833140 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 164 1 164 164 271 100.0 5e-73 MERFLENAMYASRWLLAPVYFGLSLALVALALKFFQEIIHVLPNIFSMAESDLILVLLSL VDMTLVGGLLVMVMFSGYENFVSQLDISENKEKLNWLGKMDATSLKNKVAASIVAISSIH LLRVFMDAKNVPDNKLMWYVIIHLTFVLSAFVMGYLDRLTRHNH >gi|296494715|gb|ADTN01000023.1| GENE 46 41414 - 42454 1026 346 aa, chain - ## HITS:1 COG:yghZ KEGG:ns NR:ns ## COG: yghZ COG0667 # Protein_GI_number: 16130899 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Escherichia coli K12 # 1 346 1 346 346 694 100.0 0 MVWLANPERYGQMQYRYCGKSGLRLPALSLGLWHNFGHVNALESQRAILRKAFDLGITHF DLANNYGPPPGSAEENFGRLLREDFAAYRDELIISTKAGYDMWPGPYGSGGSRKYLLASL 
DQSLKRMGLEYVDIFYSHRVDENTPMEETASALAHAVQSGKALYVGISSYSPERTQKMVE LLREWKIPLLIHQPSYNLLNRWVDKSGLLDTLQNNGVGCIAFTPLAQGLLTGKYLNGIPQ DSRMHREGNKVRGLTPKMLTEANLNSLRLLNEMAQQRGQSMAQMALSWLLKDDRVTSVLI GASRAEQLEENVQALNNLTFSTKELAQIDQHIADGELNLWQASSDK >gi|296494715|gb|ADTN01000023.1| GENE 47 42611 - 42754 212 47 aa, chain + ## HITS:1 COG:yghY+X KEGG:ns NR:ns ## COG: yghY+X COG0412 # Protein_GI_number: 16132250 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Dienelactone hydrolase and related enzymes # Organism: Escherichia coli K12 # 1 36 14 49 307 78 100.0 3e-15 MPRLTAKDFPQELLDYYDYYAHGKISKREFLNLAAKCGRRDDGISVV >gi|296494715|gb|ADTN01000023.1| GENE 48 42732 - 43085 301 117 aa, chain + ## HITS:1 COG:yghY+X KEGG:ns NR:ns ## COG: yghY+X COG0412 # Protein_GI_number: 16132250 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Dienelactone hydrolase and related enzymes # Organism: Escherichia coli K12 # 1 116 54 169 307 231 100.0 3e-61 MTALALFDLLKPNYALATQVEFTDPEIVAEYITYPSPNGHGEVRGYLVKPAKMSGKTPAV VVVHENRGLNPYIEDVARRVAKAGYIALAPDGLSSVGGYPGNDDKGRELQQQVDPTN >gi|296494715|gb|ADTN01000023.1| GENE 49 43085 - 43495 323 136 aa, chain + ## HITS:1 COG:yghY+X KEGG:ns NR:ns ## COG: yghY+X COG0412 # Protein_GI_number: 16132250 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Dienelactone hydrolase and related enzymes # Organism: Escherichia coli K12 # 1 136 172 307 307 279 100.0 1e-75 MNDFFAAIEFMQRYPQATGKVGITGFCYGGGVSNAAAVAYPELACAVPFYGRQAPTADVA KIEAPLLLHFAELDTRINEGWPAYEAALKANNKVYEAYIYPGVNHGFHNDSTPRYDKSAA DLAWQRTLKWFDKYLS >gi|296494715|gb|ADTN01000023.1| GENE 50 43614 - 43901 321 95 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_3148 NR:ns ## KEGG: ECIAI1_3148 # Name: yghW # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 95 1 95 95 184 100.0 8e-46 MNNHFGKGLMAGLKATHADSAVNVTKFCADYKRGFVLGYSHRMYEKTGDRQLSAWEAGIL TRRYGLDKEMVMDFFRENNSCSTLRFFMAGYRLEN 
>gi|296494715|gb|ADTN01000023.1| GENE 51 44090 - 45208 1157 372 aa, chain + ## HITS:1 COG:ECs3882 KEGG:ns NR:ns ## COG: ECs3882 COG1740 # Protein_GI_number: 15833136 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase I small subunit # Organism: Escherichia coli O157:H7 # 1 372 1 372 372 703 100.0 0 MTGDNTLIHSHGINRRDFMKLCAALAATMGLSSKAAAEMAESVTNPQRPPVIWIGAQECT GCTESLLRATHPTVENLVLETISLEYHEVLSAAFGHQVEENKHNALEKYKGQYVLVVDGS IPLKDNGIYCMVAGEPIVDHIRKAAEGAAAIIAIGSCSAWGGVAAAGVNPTGAVSLQEVL PGKTVINIPGCPPNPHNFLATVAHIITYGKPPKLDDKNRPTFAYGRLIHEHCERRPHFDA GRFAKEFGDEGHREGWCLYHLGCKGPETYGNCSTLQFCDVGGVWPVAIGHPCYGCNEEGI GFHKGIHQLANVENQTPRSQKPDVNAKEGGNVSAGAIGLLGGVVGLVAGVSVMAVRELGR QQKKDNADSRGE >gi|296494715|gb|ADTN01000023.1| GENE 52 45211 - 46197 1025 328 aa, chain + ## HITS:1 COG:ECs3881 KEGG:ns NR:ns ## COG: ECs3881 COG0437 # Protein_GI_number: 15833135 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli O157:H7 # 1 328 1 328 328 674 100.0 0 MNRRNFIKAASCGALLTGALPSVSHAAAENRPPIPGSLGMLYDSTLCVGCQACVTKCQDI NFPERNPQGEQTWSNNDKLSPYTNNIIQVWTSGTGVNKDQEENGYAYIKKQCMHCVDPNC VSVCPVSALKKDPKTGIVHYDKDVCTGCRYCMVACPYNVPKYDYNNPFGALHKCELCNQK GVERLDKGGLPGCVEVCPAGAVIFGTREELMAEAKKRLALKPGSEYHYPRQTLKSGDTYL HTVPKYYPHLYGEKEGGGTQVLVLTGVPYENLDLPKLDDLSTGARSENIQHTLYKGMMLP LAVLAGLTVLVRRNTKNDHHDGGDDHES >gi|296494715|gb|ADTN01000023.1| GENE 53 46187 - 47365 1603 392 aa, chain + ## HITS:1 COG:hybB KEGG:ns NR:ns ## COG: hybB COG5557 # Protein_GI_number: 16130895 # Func_class: C Energy production and conversion # Function: Polysulphide reductase # Organism: Escherichia coli K12 # 1 392 1 392 392 719 100.0 0 MSHDPQPLGGKIISKPVMIFGPLIVICMLLIVKRLVFGLGSVSDLNGGFPWGVWIAFDLL IGTGFACGGWALAWAVYVFNRGQYHPLVRPALLASLFGYSLGGLSITIDVGRYWNLPYFY IPGHFNVNSVLFETAVCMTIYIGVMALEFAPALFERLGWKVSLQRLNKVMFFIIALGALL PTMHQSSMGSLMISAGYKVHPLWQSYEMLPLFSLLTAFIMGFSIVIFEGSLVQAGLRGNG PDEKSLFVKLTNTISVLLAIFIVLRFGELIYRDKLSLAFAGDFYSVMFWIEVLLMLFPLV 
VLRVAKLRNDSRMLFLSALSALLGCATWRLTYSLVAFNPGGGYAYFPTWEELLISIGFVA IEICAYIVLIRLLPILPPLKQNDHNRHEASKA >gi|296494715|gb|ADTN01000023.1| GENE 54 47362 - 49065 1950 567 aa, chain + ## HITS:1 COG:ECs3879 KEGG:ns NR:ns ## COG: ECs3879 COG0374 # Protein_GI_number: 15833133 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase I large subunit # Organism: Escherichia coli O157:H7 # 1 567 1 567 567 1185 100.0 0 MSQRITIDPVTRIEGHLRIDCEIENGVVSKAWASGTMWRGMEEIVKNRDPRDAWMIVQRI CGVCTTTHALSSVRAAESALNIDVPVNAQYIRNIILAAHTTHDHIVHFYQLSALDWVDIT SALQADPTKASEMLKGVSTWHLNSPEEFTKVQNKIKDLVASGQLGIFANGYWGHPAMKLP PEVNLIAVAHYLQALECQRDANRVVALLGGKTPHIQNLAVGGVANPINLDGLGVLNLERL MYIKSFIDKLSDFVEQVYKVDTAVIAAFYPEWLTRGKGAVNYLSVPEFPTDSKNGSFLFP GGYIENADLSSYRPITSHSDEYLIKGIQESAKHSWYKDEAPQAPWEGTTIPAYDGWSDDG KYSWVKSPTFYGKTVEVGPLANMLVKLAAGRESTQNKLNEIVAIYQKLTGNTLEVAQLHS TLGRIIGRTVHCCELQDILQNQYSALITNIGKGDHTTFVKPNIPATGEFKGVGFLEAPRG MLSHWMVIKDGIISNYQAVVPSTWNSGPRNFNDDVGPYEQSLVGTPVADPNKPLEVVRTI HSFDPCMACAVHVVDADGNEVVSVKVL >gi|296494715|gb|ADTN01000023.1| GENE 55 49065 - 49559 612 164 aa, chain + ## HITS:1 COG:hybD KEGG:ns NR:ns ## COG: hybD COG0680 # Protein_GI_number: 16130893 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase maturation factor # Organism: Escherichia coli K12 # 1 164 1 164 164 291 100.0 4e-79 MRILVLGVGNILLTDEAIGVRIVEALEQRYILPDYVEILDGGTAGMELLGDMANRDHLII ADAIVSKKNAPGTMMILRDEEVPALFTNKISPHQLGLADVLSALRFTGEFPKKLTLVGVI PESLEPHIGLTPTVEAMIEPALEQVLAALRESGVEAIPREAIHD >gi|296494715|gb|ADTN01000023.1| GENE 56 49552 - 50040 545 162 aa, chain + ## HITS:1 COG:no KEGG:SSON_3137 NR:ns ## KEGG: SSON_3137 # Name: hybE # Def: hydrogenase 2-specific chaperone # Organism: S.sonnei # Pathway: not_defined # 1 162 1 162 162 327 100.0 7e-89 MTEEIAGFQTSPKAQVQAAFEEIARRSMHDLSFLHPSMPVYVSDFTLFEGQWTGCVITPW MLSAVIFPGPDQLWPLRKVSEKIGLQLPYGTMTFTVGELDGVSQYLSCSLMSPLSHSMSI EEGQRLTDDCARMILSLPVTNPDVPHAGRRALLFGRRSGENA >gi|296494715|gb|ADTN01000023.1| GENE 57 50033 - 50374 272 113 aa, chain + ## 
HITS:1 COG:hybF KEGG:ns NR:ns ## COG: hybF COG0375 # Protein_GI_number: 16130891 # Func_class: R General function prediction only # Function: Zn finger protein HypA/HybF (possibly regulating hydrogenase expression) # Organism: Escherichia coli K12 # 1 113 1 113 113 200 100.0 5e-52 MHELSLCQSAVEIIQRQAEQHDVKRVTAVWLEIGALSCVEESAVRFSFEIVCHGTVAQGC DLHIVYKPAQAWCWDCSQVVEIHQHDAQCPLCHGERLRVDTGDSLIVKSIEVE >gi|296494715|gb|ADTN01000023.1| GENE 58 50387 - 50635 379 82 aa, chain + ## HITS:1 COG:ECs3875 KEGG:ns NR:ns ## COG: ECs3875 COG0298 # Protein_GI_number: 15833129 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Escherichia coli O157:H7 # 1 82 1 82 82 152 100.0 1e-37 MCIGVPGQVLAVGEDIHQLAQVEVCGIKRDVNIALICEGNPADLLGQWVLVHVGFAMSII DEDEAKATLDALRQMDYDITSA >gi|296494715|gb|ADTN01000023.1| GENE 59 50758 - 51624 1025 288 aa, chain - ## HITS:1 COG:ECs3874 KEGG:ns NR:ns ## COG: ECs3874 COG0625 # Protein_GI_number: 15833128 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione S-transferase # Organism: Escherichia coli O157:H7 # 1 288 17 304 304 602 100.0 1e-172 MTDNTYQPAKVWTWDKSAGGAFANINRPVSGPTHEKTLPVGKHPLQLYSLGTPNGQKVTI MLEELLALGVTGAEYDAWLIRIGDGDQFSSGFVEVNPNSKIPALRDHTHNPPIRVFESGS ILLYLAEKFGYFLPQDLAKRTETMNWLFWLQGAAPFLGGGFGHFYHYAPVKIEYAINRFT MEAKRLLDVLDKQLAQHKFVAGDEYTIADMAIWPWFGNVVLGGVYDAAEFLDAGSYKHVQ RWAKEVGERPAVKRGRIVNRTNGPLNEQLHERHDASDFETNTEDKRQG >gi|296494715|gb|ADTN01000023.1| GENE 60 51829 - 53688 2146 619 aa, chain + ## HITS:1 COG:gsp_2 KEGG:ns NR:ns ## COG: gsp_2 COG0754 # Protein_GI_number: 16130888 # Func_class: E Amino acid transport and metabolism # Function: Glutathionylspermidine synthase # Organism: Escherichia coli K12 # 233 619 1 387 387 825 100.0 0 MSKGTTSQDAPFGTLLGYAPGGVAIYSSDYSSLDPQEYEDDAVFRSYIDDEYMGHKWQCV EFARRFLFLNYGVVFTDVGMAWEIFSLRFLREVVNDNILPLQAFPNGSPRAPVAGALLIW DKGGEFKDTGHVAIITQLHGNKVRIAEQNVIHSPLPQGQQWTRELEMVVENGCYTLKDTF 
DDTTILGWMIQTEDTEYSLPQPEIAGELLKISGARLENKGQFDGKWLDEKDPLQNAYVQA NGQVINQDPYHYYTITESAEQELIKATNELHLMYLHATDKVLKDDNLLALFDIPKILWPR LRLSWQRRRHHMITGRMDFCMDERGLKVYEYNADSASCHTEAGLILERWAEQGYKGNGFN PAEGLINELAGAWKHSRARPFVHIMQDKDIEENYHAQFMEQALHQAGFETRILRGLDELG WDAAGQLIDGEGRLVNCVWKTWAWETAFDQIREVSDREFAAVPIRTGHPQNEVRLIDVLL RPEVLVFEPLWTVIPGNKAILPILWSLFPHHRYLLDTDFTVNDELVKTGYAVKPIAGRCG SNIDLVSHHEEVLDKTSGKFAEQKNIYQQLWCLPKVDGKYIQVCTFTVGGNYGGTCLRGD ESLVIKKESDIEPLIVVKK >gi|296494715|gb|ADTN01000023.1| GENE 61 53780 - 54043 68 87 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_3273 NR:ns ## KEGG: EcSMS35_3273 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 42 87 1 46 46 75 93.0 8e-13 MPEQAITKALCIYQGQQINLAYIRLRHFEFTNGRIISDFNGMGKVKYHFLTIKRVLFLVY QTISIHSKPLEIIKTQVFHWLIYSHMK >gi|296494715|gb|ADTN01000023.1| GENE 62 53980 - 55479 1473 499 aa, chain + ## HITS:1 COG:pitB KEGG:ns NR:ns ## COG: pitB COG0306 # Protein_GI_number: 16130887 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate/sulphate permeases # Organism: Escherichia coli K12 # 1 499 1 499 499 892 100.0 0 MLNLFVGLDIYTGLLLLLALAFVLFYEAINGFHDTANAVAAVIYTRAMQPQLAVVMAAFF NFFGVLLGGLSVAYAIVHMLPTDLLLNMGSTHGLAMVFSMLLAAIIWNLGTWFFGLPASS SHTLIGAIIGIGLTNALLTGSSVMDALNLREVTKIFSSLIVSPIVGLVIAGGLIFLLRRY WSGTKKRDRIHRIPEDRKKKKGKRKPPFWTRIALIVSAAGVAFSHGANDGQKGIGLVMLV LVGIAPAGFVVNMNASGYEITRTRDAVTNFEHYLQQHPELPQKLIAMEPPLPAASTDGTQ VTEFHCHPANTFDAIARVKTMLPGNMESYEPLSVSQRSQLRRIMLCISDTSAKLAKLPGV SKEDQNLLKKLRSDMLSTIEYAPVWIIMAVALALGIGTMIGWRRVAMTIGEKIGKRGMTY AQGMAAQMTAAVSIGLASYIGMPVSTTHVLSSAVAGTMVVDGGGLQRKTVTSILMAWVFT LPAAIFLSGGLYWIALQLI >gi|296494715|gb|ADTN01000023.1| GENE 63 55528 - 56220 514 230 aa, chain - ## HITS:1 COG:no KEGG:B21_02811 NR:ns ## KEGG: B21_02811 # Name: yghT # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 230 1 230 230 462 100.0 1e-129 MQSITPPLIAVIGSDGSGKSTVCEHLITVVEKYGAAERVHLGKQAGNVGRAVTKLPLMGK 
SLHKTIERNQVKTAKKLPGPVPALVITAFVARRLLRFRHMLACRRRGLIVLTDRYPQDQI PGAYDGTVFPPNVEGGRFVSWLASQERKAFHWMASHKPDLVIKLNVDLEVACARKPDHKR ESLARKIAITPQLTFGGAQLVDIDANQPLEQVLVDAEKAITDFMTARGYH >gi|296494715|gb|ADTN01000023.1| GENE 64 56409 - 57107 325 232 aa, chain + ## HITS:1 COG:no KEGG:B21_02810 NR:ns ## KEGG: B21_02810 # Name: yghS # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 232 6 237 237 470 99.0 1e-131 MSIINSTPVRVIAIVGCDGSGKSTLTASLVNELAARMPTEHIYLGQSSGRIGEWISQLPV IGAPFGRYLRSKAAHVHEKPSTPPGNITALVIYLLSCWRAYKFRKMLCKSQQGFLLITDR YPQVEVPGFRFDGPQLAKTTGGNGWIKMLRQRELKLYQWMASYLPVLLIRLGIDEQTAFA RKPDHQLAALQEKIAVTPQLTFNGAKILELDGRHPADKILQASLRAIHAALS >gi|296494715|gb|ADTN01000023.1| GENE 65 57139 - 57897 622 252 aa, chain + ## HITS:1 COG:no KEGG:B21_02809 NR:ns ## KEGG: B21_02809 # Name: yghR # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 252 1 252 252 486 100.0 1e-136 MDALQTQTVNSTTAPQPNYIPGLIAVVGCDGTGKSTLTTDLVKSLQQHWQTERRYLGLLS GEDGDKIKRLPLVGVWLERRLAAKSSKTQSMKTKSPALWAAVIMYCFSLRRMANLRKVQR LAQSGVLVVSDRFPQAEISGFYYDGPGIGVERATGKISMFLAQRERRLYQQMAQYRPELI IRLGIDIETAISRKPDHDYAELQDKIGVMSKIGYNGTKILEIDSRAPYSEVLEQAQKAVS LVAIVSDRRSLT >gi|296494715|gb|ADTN01000023.1| GENE 66 58093 - 59292 935 399 aa, chain + ## HITS:1 COG:yghQ KEGG:ns NR:ns ## COG: yghQ COG2244 # Protein_GI_number: 16130883 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Escherichia coli K12 # 1 305 21 325 325 572 100.0 1e-163 MFGVLVIVQSYAKSISDFIKFQTWQLVVQYGTPALTNNNPQQFRNVVSFSFSLDIVSGAV AIVGGIALLPFLSHSLGLDDQSFWLAALYCTLIPSMASSTPTGILRAVDRFDLIAVQQAT KPFLRAAGSVVAWYFDFGFAGFVIAWYVSNLVGGTMYWWFAARELRRRNIHNAFKLNLFE SARYIKGAWSFVWSTNIAHSIWSARNSCSTVLVGIVLGPAAAGLFKIAMTFFDAAGTPAG LLGKSFYPEVMRLDPRTTRPWLLGVKSGLLAGGIGILVALAVLIVGKPLISLVFGVKYLE AYDLIQVMLGAIVISMLGFPQESLLLMAGKQRAFLVAQTIASIGYIVLLFMFCHLFGVLG AAFAYFGGQCLDVALSLIPTLKAFFQRHSLLYNAAGEKS >gi|296494715|gb|ADTN01000023.1| GENE 67 59292 - 60116 616 
274 aa, chain + ## HITS:1 COG:no KEGG:B21_02807 NR:ns ## KEGG: B21_02807 # Name: ybl138 # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 274 1 274 274 535 100.0 1e-151 MMRKYFPLEASERLFVAIEEDDVVDAQVSLPPTIALSCTTEIIHDNYALCLQFWLNGVDR QELLRLVRKQAKGDELTADERKQFKYMRARYKHLRFAQRLYLKKHQAGFLFGKTTVFLGR FQDGFRNGKKNIVSYYGNLLRIYLSSPVWSLVNYSLRHSQLESVSSFIAYRQKQMHTLKE IIAKPRLTGREFHDVRKIISQQVSYYDTLRSLDPENKEALQISRFLAAINGLMGDKHDDM VADDMENRQSYDAPVALDSDIRQRLELLISRFPL >gi|296494715|gb|ADTN01000023.1| GENE 68 60128 - 60688 504 186 aa, chain + ## HITS:1 COG:ytfJ KEGG:ns NR:ns ## COG: ytfJ COG3054 # Protein_GI_number: 16132038 # Func_class: R General function prediction only # Function: Predicted transcriptional regulator # Organism: Escherichia coli K12 # 1 184 1 183 184 205 54.0 4e-53 MSSRLIIALIIMLLAPGVQAHNFVTGKTVTPVYIQEGGELLLNSDDEIHYQKWNSTQLAG KVRIIQYIAGRKSAKKKNSLLIKAVEAANFPQDRFQPTTIVNTDDAIFGTGYFVVGKIEK NKRRYPWAQFVIDGNGQGRVAWRLPEQSSTILVLNKAGQIQWAKDGSLTPEEVDHVIALA QKLINE >gi|296494715|gb|ADTN01000023.1| GENE 69 60719 - 61789 963 356 aa, chain + ## HITS:1 COG:PA3828 KEGG:ns NR:ns ## COG: PA3828 COG0795 # Protein_GI_number: 15599023 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Pseudomonas aeruginosa # 3 343 2 351 372 124 29.0 2e-28 MKLVEHYIMRGTRRLVLIIVGFLIFIFASYSAQRYLTEAANGTLALDVVLDIVFYKVLIA LEMLLPVGLYVSVGVTLGQMYTDSEITAISAAGGSPGRLYKAVLYLAIPLSIFVTLLSMY GRPWAYAQIYQLEQQSQSELDVRQLRAKKFNTNDNGRMILSQTVDQDNNRLTDALIYTST ANRTRIFRARSVDVVDPSPEKPTVMLHNGTAYLIDHQGRDDNEQIYRNLQLHLNPLDQSP NVKRKAKSVTELARSVFPADHAELQWRQSRGLTALLMALLAISLSRVKPRQGRFSTLLPL TLLFIAIFYGGDVCRTLVANGAIPLIPGLWLVPGLMLMGLLMLVARDFSLLQKFSR >gi|296494715|gb|ADTN01000023.1| GENE 70 61786 - 62830 870 348 aa, chain + ## HITS:1 COG:VC2499 KEGG:ns NR:ns ## COG: VC2499 COG0795 # Protein_GI_number: 15642495 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Vibrio cholerae # 33 348 34 348 356 145 30.0 8e-35 
MNVFSRYLIRHLFLGFAAAAGLLLPLFTTFNLINELDDVSPGGYRWTQAVLVVLMTLPRT LVELSPFIALLGGIVGLGQLSKNSELTAIRSMGFSIFRIALVALVAGILWTVSLGAIDEW VASPLQQQALQIKSTATALGEDDDITGNMLWARRGNEFVTVKSLNEQGQPVGVEIFHYRD DLSLESYIYARSATIEDDKTWILHGVNHKKWLNGKETLETLDNLAWQSAFTSMDLEELSM PGNTFSVRQLNHYIHYLQETGQPSSEYRLALWEKLGQPILTLAMILLAVPFTFSAPRSPG MGSRLAVGVIVGLLTWISYQIMVNLGLLFALSAPVTALGLPIAFVLVA Prediction of potential genes in microbial genomes Time: Sun May 15 23:07:44 2011 Seq name: gi|296494714|gb|ADTN01000024.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont69.2, whole genome shotgun sequence Length of sequence - 18738 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 6, operones - 3 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 10 - 1182 1237 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes 2 1 Op 2 . - CDS 1182 - 1430 288 ## S3224 hypothetical protein 3 1 Op 3 2/1.000 - CDS 1462 - 2376 828 ## COG0702 Predicted nucleoside-diphosphate-sugar epimerases 4 1 Op 4 . - CDS 2373 - 4061 1293 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II - Prom 4241 - 4300 7.6 + Prom 4209 - 4268 8.2 5 2 Tu 1 . + CDS 4472 - 5614 1056 ## c3711 hypothetical protein 6 3 Tu 1 . - CDS 5621 - 6385 568 ## COG2186 Transcriptional regulators - Prom 6464 - 6523 8.0 + Prom 6428 - 6487 2.3 7 4 Op 1 9/0.000 + CDS 6636 - 8135 1615 ## COG0277 FAD/FMN-containing dehydrogenases 8 4 Op 2 15/0.000 + CDS 8135 - 9187 964 ## COG0277 FAD/FMN-containing dehydrogenases 9 4 Op 3 1/1.000 + CDS 9198 - 10421 1160 ## COG0247 Fe-S oxidoreductase 10 4 Op 4 1/1.000 + CDS 10426 - 10830 624 ## COG3193 Uncharacterized protein, possibly involved in utilization of glycolate and propanediol 11 4 Op 5 1/1.000 + CDS 10852 - 13023 2445 ## COG2225 Malate synthase + Term 13051 - 13096 -0.9 + Prom 13235 - 13294 4.4 12 5 Tu 1 . + CDS 13378 - 15060 1673 ## COG1620 L-lactate permease + Term 15070 - 15108 7.1 13 6 Op 1 . 
+ CDS 15545 - 16336 409 ## EcSMS35_3251 hypothetical protein 14 6 Op 2 4/0.000 + CDS 16378 - 16890 662 ## COG3156 Type II secretory pathway, component PulK 15 6 Op 3 2/1.000 + CDS 16887 - 18065 723 ## COG3297 Type II secretory pathway, component PulL 16 6 Op 4 . + CDS 18067 - 18603 430 ## COG3149 Type II secretory pathway, component PulM Predicted protein(s) >gi|296494714|gb|ADTN01000024.1| GENE 1 10 - 1182 1237 390 aa, chain - ## HITS:1 COG:CC1162 KEGG:ns NR:ns ## COG: CC1162 COG0156 # Protein_GI_number: 16125414 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Caulobacter vibrioides # 1 388 1 389 404 401 53.0 1e-111 MGLYDKYARLAGERLQFSDNGLTPFGTCIDEVYSATEGRIGNKKVILAGTNNYLGLTFNH DAISEGQAALAAQGTGTTGSRMANGSYAPHLALEKEIAEFFNRPTAIVFSTGYTANLGVI SALADHNAVVLLDADSHASIYDACSLGGAEIIRFRHNDAKDLERRMVRLGERAKEAIIIV EGIYSMLGDVAPLAEIVDIKRRLGGYLIVDEAHSFGVLGATGRGLAEAVGVEDDVDIIVG TFSKSLASIGGFAVGSEAMEVLRYGSRPYIFTASPSPSCIATVRSSLRTIATQPELRQKL MDNANHLYDGLQKLGYELSSHISPVVPVIIGSKEDGLRIWRELISLGVYVNLILPPAAPA GITLLRCSVNAAHSREQIDAIIQAFATLKQ >gi|296494714|gb|ADTN01000024.1| GENE 2 1182 - 1430 288 82 aa, chain - ## HITS:1 COG:no KEGG:S3224 NR:ns ## KEGG: S3224 # Name: not_defined # Def: hypothetical protein # Organism: S.flexneri_2457T # Pathway: not_defined # 1 82 1 82 82 124 100.0 1e-27 MVNREIVMDYILSCLQDLVENGVEIKPDSDLVNDLGLESIKVMDLLMMLEDRFDISIPIN ILLDVKTPAQLMETLLPWLENK >gi|296494714|gb|ADTN01000024.1| GENE 3 1462 - 2376 828 304 aa, chain - ## HITS:1 COG:CC1164 KEGG:ns NR:ns ## COG: CC1164 COG0702 # Protein_GI_number: 16125416 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate-sugar epimerases # Organism: Caulobacter vibrioides # 5 287 9 288 317 120 33.0 4e-27 MNQTVAVTGATGFIGKYIIDNLLARGFHVRALTRTARAHVNDNLTWVRGSLEDTHSLSEL VAGASVVVHCAGQVRGHKEEIFTRCNVDGSLRLMQAAKESGFCQRFLFISSLAARHPELS 
WYAKSKYVAEQRLAAMADEITLGVFRPTAVYGPGDKELKPLFDWMLRGLLPRLGAPDTQL SFLHVTDFAQAVGQWLSAETIQTQTYELCDGVAGGYDWQRVQQLAADARCGSVRMVGIPL PVLTGLADISTALSRLAGKEPMLTRSKIRELTHADWSASNNRISEDINWFPGISLEHALR NGLF >gi|296494714|gb|ADTN01000024.1| GENE 4 2373 - 4061 1293 562 aa, chain - ## HITS:1 COG:CC1165 KEGG:ns NR:ns ## COG: CC1165 COG0318 # Protein_GI_number: 16125417 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Caulobacter vibrioides # 1 547 11 560 567 444 43.0 1e-124 MRYADFPTLVDALDYAALSSAGMNFYDRRCQLEDQLEYQTLKARAEAGAKRLLSLNLKKG DRVALIAETSSEFVEAFFACQYAGLVAVPLAIPMGVGQRDSWSAKLQGLLASCQPAAIIT GDEWLPLVNAATHDNPELHVLSHAWFKALPEADVALQRPVPNDIAYLQYTSGSTRFPRGV IITHREVMANLRAISHDGIKLRPGDRCVSWLPFYHDMGLVGFLLTPVATQLSVDYLRTQD FAMRPLQWLKLISKNRGTVSVAPPFGYELCQRRVNEKDLAELDLSCWRVAGIGAEPISAE QLHQFAECFRQVNFDNKTFMPCYGLAENALAVSFSDEASGVVVNEVDRDILEYQGKAVAP GAETRAVSTFVNCGKALPEHGIEIRNEAGMPVAERVVGHICISGPSLMSGYFGDQVSQDE IAATGWLDTGDLGYLLDGYLYVTGRIKDLIIIRGRNIWPQDIEYIAEQEPEIHSGDAIAF VTAQEKIILQIQCRISDEERRGQLIHALAARIQSEFGVTAAIDLLPPHSIPRTSSGKPAR AEAKKRYQKAYAASLNVQESLA >gi|296494714|gb|ADTN01000024.1| GENE 5 4472 - 5614 1056 380 aa, chain + ## HITS:1 COG:no KEGG:c3711 NR:ns ## KEGG: c3711 # Name: yghO # Def: hypothetical protein # Organism: E.coli_CFT073 # Pathway: not_defined # 1 380 23 402 402 800 100.0 0 MECDLLMIKIEKVINKNDLKAFIAFPSSLYPDDPNWIPPLFIERNEHLSAKNPGTDHIIW QAWVAKKAGQIVGRITAQIDTLHRERYGKDTGHFGMIDAIDDPQVFAALFGAAEAWLKSQ GASKISGPFSLNINQESGLLIEGFDTPPCAMMPHGKPWYAAHIEQLGYHKGIDLLAWWMQ RTDLTFSPALKKLMDQVRKKVTIRCINRQRFAEEMQILREIFNSGWQHNWGFVPFTEHEF ATMGDQLKYLVPDDMIYIAEIDSAPCAFIVGLPNINEAIADLNGSLFPFGWAKLLWRLKV SGVRTARVPLMGVRDEYQFSRIGPVIALLLIEALRDPFARRKIDALEMSWILETNTGMNN MLERIGAEPYKRYRLYEKQI >gi|296494714|gb|ADTN01000024.1| GENE 6 5621 - 6385 568 254 aa, chain - ## HITS:1 COG:glcC KEGG:ns NR:ns ## COG: glcC COG2186 # Protein_GI_number: 16130880 # Func_class: K Transcription # Function: 
Transcriptional regulators # Organism: Escherichia coli K12 # 1 254 1 254 254 472 100.0 1e-133 MKDERRPICEVVAESIERLIIDGVLKVGQPLPSERRLCEKLGFSRSALREGLTVLRGRGI IETAQGRDSRVARLNRVQDTSPLIHLFSTQPRTLYDLLDVRALLEGESARLAATLGTQAD FVVITRCYEKMLAASENNKEISLIEHAQLDHAFHLAICQASHNQVLVFTLQSLTDLMFNS VFASVNNLYHRPQQKKQIDRQHARIYNAVLQRLPHVAQRAARDHVRTVKKNLHDIELEGH HLIRSAVPLEMNLS >gi|296494714|gb|ADTN01000024.1| GENE 7 6636 - 8135 1615 499 aa, chain + ## HITS:1 COG:glcD KEGG:ns NR:ns ## COG: glcD COG0277 # Protein_GI_number: 16130879 # Func_class: C Energy production and conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Escherichia coli K12 # 1 499 1 499 499 1002 100.0 0 MSILYEERLDGALPDVDRTSVLMALREHVPGLEILHTDEEIIPYECDGLSAYRTRPLLVV LPKQMEQVTAILAVCHRLRVPVVTRGAGTGLSGGALPLEKGVLLVMARFKEILDINPVGR RARVQPGVRNLAISQAVAPHNLYYAPDPSSQIACSIGGNVAENAGGVHCLKYGLTVHNLL KIEVQTLDGEALTLGSDALDSPGFDLLALFTGSEGMLGVTTEVTVKLLPKPPVARVLLAS FDSVEKAGLAVGDIIANGIIPGGLEMMDNLSIRAAEDFIHAGYPVDAEAILLCELDGVES DVQEDCERVNDILLKAGATDVRLAQDEAERVRFWAGRKNAFPAVGRISPDYYCMDGTIPR RALPGVLEGIARLSQQYDLRVANVFHAGDGNMHPLILFDANEPGEFARAEELGGKILELC VEVGGSISGEHGIGREKINQMCAQFNSDEITTFHAVKAAFDPDGLLNPGKNIPTLHRCAE FGAMHVHHGHLPFPELERF >gi|296494714|gb|ADTN01000024.1| GENE 8 8135 - 9187 964 350 aa, chain + ## HITS:1 COG:glcF_1 KEGG:ns NR:ns ## COG: glcF_1 COG0277 # Protein_GI_number: 16130878 # Func_class: C Energy production and conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Escherichia coli K12 # 1 318 1 318 369 643 100.0 0 MLRECDYSQALLEQVNQAISDKTPLVIQGSNSKAFLGRPVTGQTLDVRCHRGIVNYDPTE LVITARVGTPLVTIEAALESAGQMLPCEPPHYGEEATWGGMVACGLAGPRRPWSGSVRDF VLGTRIITGAGKHLRFGGEVMKNVAGYDLSRLMVGSYGCLGVLTEISMKVLPRPRASLSL RREISLQEAMSEIAEWQLQPLPISGLCYFDNALWIRLEGGEGSVKAARELLGGEEVAGQF WQQLREQQLPFFSLPGTLWRISLPSDAPMMDLPGEQLIDWGGALRWLKSTAEDNQIHRIA RNAGGHATRFSAGDGGFAPLSAPLFRYHQQLKQQLDPCGVFNPGRMYAEL >gi|296494714|gb|ADTN01000024.1| GENE 9 9198 - 10421 1160 407 aa, chain + ## HITS:1 COG:glcF_2 KEGG:ns NR:ns ## COG: glcF_2 COG0247 # 
Protein_GI_number: 16130878 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Escherichia coli K12 # 16 407 1 392 392 778 100.0 0 MQTQLTEEMRQNARALEADSILRACVHCGFCTATCPTYQLLGDELDGPRGRIYLIKQVLE GNEVTLKTQEHLDRCLTCRNCETTCPSGVRYHNLLDIGRDIVEQKVKRPLPERILREGLR QVVPRPAVFRALTQVGLVLRPFLPEQVRAKLPAETVKAKPRPPLRHKRRVLMLEGCAQPT LSPNTNAATARVLDRLGISVMPANEAGCCGAVDYHLNAQEKGLARARNNIDAWWPAIEAG AEAILQTASGCGAFVKEYGQMLKNDALYADKARQVSELAVDLVELLREEPLEKLAIRGDK KLAFHCPCTLQHAQKLNGEVEKVLLRLGFTLTDVPDSHLCCGSAGTYALTHPDLARQLRD NKMNALESGKPEMIVTANIGCQTHLASAGRTSVRHWIEIVEQALEKE >gi|296494714|gb|ADTN01000024.1| GENE 10 10426 - 10830 624 134 aa, chain + ## HITS:1 COG:glcG KEGG:ns NR:ns ## COG: glcG COG3193 # Protein_GI_number: 16130877 # Func_class: R General function prediction only # Function: Uncharacterized protein, possibly involved in utilization of glycolate and propanediol # Organism: Escherichia coli K12 # 1 118 1 118 134 198 100.0 2e-51 MKTKVILSQQMASAIIAAGQEEAQKNNWSVSIAVADDGGHLLALSRMDDCAPIAAYISQE KARTAALGRRETKGYEEMVNNGRTAFVTAPLLTSLEGGVPVVVDGQIIGAVGVSGLTGAQ DAQVAKAAAAVLAK >gi|296494714|gb|ADTN01000024.1| GENE 11 10852 - 13023 2445 723 aa, chain + ## HITS:1 COG:glcB KEGG:ns NR:ns ## COG: glcB COG2225 # Protein_GI_number: 16130876 # Func_class: C Energy production and conversion # Function: Malate synthase # Organism: Escherichia coli K12 # 1 723 1 723 723 1480 100.0 0 MSQTITQSRLRIDANFKRFVDEEVLPGTGLDAAAFWRNFDEIVHDLAPENRQLLAERDRI QAALDEWHRSNPGPVKDKAAYKSFLRELGYLVPQPERVTVETTGIDSEITSQAGPQLVVP AMNARYALNAANARWGSLYDALYGSDIIPQEGAMVSGYDPQRGEQVIAWVRRFLDESLPL ENGSYQDVVAFKVVDKQLRIQLKNGKETTLRTPAQFVGYRGDAAAPTCILLKNNGLHIEL QIDANGRIGKDDPAHINDVIVEAAISTILDCEDSVAAVDAEDKILLYRNLLGLMQGTLQE KMEKNGRQIVRKLNDDRHYTAADGSEISLHGRSLLFIRNVGHLMTIPVIWDSEGNEIPEG ILDGVMTGAIALYDLKVQKNSRTGSVYIVKPKMHGPQEVAFANKLFTRIETMLGMAPNTL KMGIMDEERRTSLNLRSCIAQARNRVAFINTGFLDRTGDEMHSVMEAGPMLRKNQMKSTP WIKAYERNNVLSGLFCGLRGKAQIGKGMWAMPDLMADMYSQKGDQLRAGANTAWVPSPTA 
ATLHALHYHQTNVQSVQANIAQTEFNAEFEPLLDDLLTIPVAENANWSAQEIQQELDNNV QGILGYVVRWVEQGIGCSKVPDIHNVALMEDRATLRISSQHIANWLRHGILTKEQVQASL ENMAKVVDQQNAGDPAYRPMAGNFANSCAFKAASDLIFLGVKQPNGYTEPLLHAWRLREK ESH >gi|296494714|gb|ADTN01000024.1| GENE 12 13378 - 15060 1673 560 aa, chain + ## HITS:1 COG:yghK KEGG:ns NR:ns ## COG: yghK COG1620 # Protein_GI_number: 16130875 # Func_class: C Energy production and conversion # Function: L-lactate permease # Organism: Escherichia coli K12 # 1 560 1 560 560 913 99.0 0 MVTWTQMYMPMGGLGLSALVALIPIIFFFVALAVLRLKGHVAGAITLILSILIAIFAFKM PIDMAFAAAGYGFIYGLWPIAWIIVAAVFLYKLTVASGQFDIIRSSVISITDDQRLQVLL IGFSFGALLEGAAGFGAPVAITGALLVGLGFKPLYAAGLCLIANTAPVAFGALGVPILVA GQVTGIDPFHIGAMAGRQLPFLSVLVPFWLVAMMDGWKGVKETWPAVLVAGGSFAVTQFF TSNYIGPELPDITSALVSIVSLALFLKVWRPKNTETAISMGQSAGAMVVNKPSSGGPVPS EYSLGQIIRAWSPFLILTVLVTIWTMKPFKALFAPGGAFYSLVINFQIPHLHQQVLKAAP IVAQPTPMDAVFKFDPLSAGGTAIFIAAIISIFILGVGIKKGIGVFAETLISLKWPILSI GMVLAFAFVTNYSGMSTTLALVLAGTGVMFPFFSPFLGWLGVFLTGSDTSSNALFGSLQS TTAQQINVSDTLLVAANTSGGVTGKMISPQSIAVACAATGMVGRESELFRYTVKHSLIFA SVIGIITLLQAYVFTGMLVS >gi|296494714|gb|ADTN01000024.1| GENE 13 15545 - 16336 409 263 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_3251 NR:ns ## KEGG: EcSMS35_3251 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 259 1 259 1518 378 100.0 1e-103 MNKKFKYKKSLLAAILSATLLAGCDGGGSGSSSDTPPVDSGTGSLPEVKPDPTPNPEPTP EPTPDPEPTPEPIPDPEPTPEPEPEPVPTKTGYLTLGGSQRVTGATCNGESSDGFTFKPG EDVTCVAGNTTIATFNTQSEAARSLRAVEKVSFSLEDAQELAGSDDKKSNAVSLVTSSNS CPANTEQVCLTFSSVIESKRFDSLYKQIDLAPEEFKKLVNEEVENNAATDKAPSTHTSPV VPVTTPGTKPDLNASFVSAIGPN >gi|296494714|gb|ADTN01000024.1| GENE 14 16378 - 16890 662 170 aa, chain + ## HITS:1 COG:VC2726 KEGG:ns NR:ns ## COG: VC2726 COG3156 # Protein_GI_number: 15642720 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulK # Organism: Vibrio cholerae # 1 150 158 307 336 110 42.0 1e-24 
MQTRLGREDSEYLARSVPFYAANQPLADISEMRVVQGMDAGLYQKLKPLVCALPMTRQQI NINTLDVTQSVILEALFDPWLSPVQARALLQQRPAKGWEDVDQFLAQPLLADVDERTKKQ LKTVLSVDSNYFWLRSDITVNEIELTMNSLIVRMGPQHFSVLWHQTGESE >gi|296494714|gb|ADTN01000024.1| GENE 15 16887 - 18065 723 392 aa, chain + ## HITS:1 COG:yghE KEGG:ns NR:ns ## COG: yghE COG3297 # Protein_GI_number: 16130869 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulL # Organism: Escherichia coli K12 # 107 392 1 286 286 540 99.0 1e-153 MSSILEIFFPLCASDTVRWQRRTPDVEHGIWPDVADEQLQQWLQTDAIRLYIPGEWISVW QVELPDVARKQIPTILPALLEEELNQDIDELHFAPLNIDQQLATVAVIHQQHMRNIAQWL QENGITRATVAPDWMSIPCGFMACDAQRVICRIDECRGWSAGLALAPVMFRAQLNEQDLP LSLTVVGIAPEKLSAWAGADAERLTVTALPAITTYGEPEGNLLTGPWQPRVSYRKQWARW RVMILPILLILVALAVERGVTLWSVSEQVAQSRTQAEEQFLTLFPEQKRIVNLRSQVTMA LKKYRPQADDTRLLAELSAIASTLKSASLSDIEMRGFTFDQKRQILHLQLRAANFASFDK LRSALATDYVVQQDALQKEGDAVSGGVTLRRK >gi|296494714|gb|ADTN01000024.1| GENE 16 18067 - 18603 430 178 aa, chain + ## HITS:1 COG:yghD KEGG:ns NR:ns ## COG: yghD COG3149 # Protein_GI_number: 16130868 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulM # Organism: Escherichia coli K12 # 1 178 1 178 178 338 100.0 3e-93 MLRDKFIHYFQQWRERQLSRGEHWLAQHLAGRSPREKGMLLAAVVFLFSVGYYVLIWQPL SERIEQQETILQQLVAMNTRLKNAAPDIIAARKSATTTPAQVSRVISDSASAHSVVIRRI ADRGENIQVWIEPVVFNDLLKWLNALDEKYALRVTQIDVSAAEKPGMVNVQRLEFGRG Prediction of potential genes in microbial genomes Time: Sun May 15 23:08:18 2011 Seq name: gi|296494713|gb|ADTN01000025.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont69.3, whole genome shotgun sequence Length of sequence - 64668 bp Number of predicted genes - 61, with homology - 60 Number of transcription units - 34, operones - 13 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 305 - 380 84.1 # Phe GAA 0 0 - Term 261 - 292 4.1 1 1 Tu 1 . 
- CDS 486 - 1193 553 ## COG1811 Uncharacterized membrane protein, possible Na+ channel or pump + Prom 1413 - 1472 3.8 2 2 Tu 1 . + CDS 1591 - 3726 1958 ## COG1982 Arginine/lysine/ornithine decarboxylases + Term 3739 - 3772 -0.4 - Term 3721 - 3766 12.1 3 3 Tu 1 . - CDS 3776 - 5032 1609 ## COG0477 Permeases of the major facilitator superfamily - Prom 5138 - 5197 6.2 4 4 Op 1 6/0.182 - CDS 5234 - 6316 1012 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) 5 4 Op 2 11/0.000 - CDS 6378 - 6653 336 ## COG2924 Fe-S cluster protector protein 6 4 Op 3 . - CDS 6681 - 7733 1082 ## PROTEIN SUPPORTED gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 - Prom 7843 - 7902 2.6 + Prom 7802 - 7861 3.6 7 5 Op 1 4/0.273 + CDS 7894 - 8613 763 ## COG0220 Predicted S-adenosylmethionine-dependent methyltransferase 8 5 Op 2 . + CDS 8613 - 8939 463 ## COG3171 Uncharacterized protein conserved in bacteria + Term 9082 - 9112 3.0 9 6 Tu 1 . + CDS 9126 - 9842 926 ## ECO111_3709 hypothetical protein + Term 9852 - 9898 5.1 + Prom 9878 - 9937 6.0 10 7 Tu 1 . + CDS 10018 - 11064 1327 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D + Term 11072 - 11104 5.4 + Prom 11093 - 11152 4.9 11 8 Tu 1 . + CDS 11181 - 12188 829 ## JW2923 conserved hypothetical protein - Term 12237 - 12281 3.4 12 9 Op 1 13/0.000 - CDS 12343 - 13479 1112 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases 13 9 Op 2 2/0.909 - CDS 13472 - 14065 993 ## PROTEIN SUPPORTED gi|26249375|ref|NP_755415.1| putative deoxyribonucleotide triphosphate pyrophosphatase 14 9 Op 3 6/0.182 - CDS 14073 - 14363 302 ## COG1872 Uncharacterized conserved protein 15 9 Op 4 5/0.273 - CDS 14360 - 14926 824 ## COG0762 Predicted integral membrane protein 16 9 Op 5 . - CDS 14944 - 15648 645 ## COG0325 Predicted enzyme with a TIM-barrel fold - Prom 15693 - 15752 3.0 + Prom 15358 - 15417 3.5 17 10 Tu 1 . 
+ CDS 15666 - 16646 736 ## COG2805 Tfp pilus assembly protein, pilus retraction ATPase PilT + Term 16875 - 16904 0.5 - Term 16685 - 16728 3.2 18 11 Op 1 8/0.091 - CDS 16830 - 17246 475 ## COG0816 Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) 19 11 Op 2 2/0.909 - CDS 17246 - 17809 606 ## COG1678 Putative transcriptional regulator - Prom 17846 - 17905 3.2 - Term 17827 - 17883 3.4 20 12 Op 1 2/0.909 - CDS 17918 - 18868 347 ## PROTEIN SUPPORTED gi|212636859|ref|YP_002313384.1| Glutathione synthase; Ribosomal protein S6 modification enzyme 21 12 Op 2 3/0.364 - CDS 18881 - 19612 579 ## COG1385 Uncharacterized protein conserved in bacteria 22 12 Op 3 6/0.182 - CDS 19692 - 20399 561 ## COG2356 Endonuclease I - Prom 20421 - 20480 4.1 23 13 Tu 1 4/0.273 - CDS 20494 - 20949 452 ## COG3091 Uncharacterized protein conserved in bacteria - Prom 20993 - 21052 2.8 24 14 Tu 1 . - CDS 21068 - 22462 1761 ## COG0477 Permeases of the major facilitator superfamily - Prom 22563 - 22622 8.8 - Term 22831 - 22863 3.0 25 15 Tu 1 . - CDS 22886 - 24040 1410 ## COG0192 S-adenosylmethionine synthetase + Prom 24278 - 24337 6.2 26 16 Op 1 . + CDS 24458 - 24601 62 ## gi|188495354|ref|ZP_03002624.1| hypothetical protein Ec53638_2979 + Term 24617 - 24648 1.1 + Prom 24755 - 24814 4.0 27 16 Op 2 5/0.273 + CDS 24835 - 26811 2385 ## COG1166 Arginine decarboxylase (spermidine biosynthesis) + Term 26881 - 26925 7.4 + Prom 26870 - 26929 3.0 28 17 Tu 1 . + CDS 26949 - 27869 1301 ## COG0010 Arginase/agmatinase/formimionoglutamate hydrolase, arginase family - Term 27959 - 27999 -0.5 29 18 Tu 1 . - CDS 28075 - 28833 774 ## COG0501 Zn-dependent protease with chaperone function - Prom 28855 - 28914 5.0 + Prom 28973 - 29032 3.8 30 19 Tu 1 . 
+ CDS 29111 - 31102 2482 ## COG0021 Transketolase + Term 31113 - 31161 5.0 + Prom 31185 - 31244 6.3 31 20 Op 1 3/0.364 + CDS 31416 - 31859 353 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 32 20 Op 2 2/0.909 + CDS 31887 - 33275 1724 ## COG2213 Phosphotransferase system, mannitol-specific IIBC component 33 20 Op 3 2/0.909 + CDS 33290 - 34567 1303 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 34 20 Op 4 3/0.364 + CDS 34564 - 35529 828 ## COG1494 Fructose-1,6-bisphosphatase/sedoheptulose 1,7-bisphosphatase and related proteins 35 20 Op 5 3/0.364 + CDS 35551 - 36060 380 ## COG3722 Transcriptional regulator 36 20 Op 6 1/1.000 + CDS 36057 - 36770 384 ## COG1072 Panthothenate kinase + Term 36999 - 37036 1.5 + Prom 36864 - 36923 2.4 37 21 Op 1 26/0.000 + CDS 37055 - 38074 813 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase 38 21 Op 2 9/0.091 + CDS 38124 - 39287 1488 ## COG0126 3-phosphoglycerate kinase + Term 39297 - 39327 3.0 + Prom 39302 - 39361 6.0 39 22 Tu 1 4/0.273 + CDS 39502 - 40581 1189 ## COG0191 Fructose/tagatose bisphosphate aldolase + Term 40606 - 40634 2.1 + Prom 40796 - 40855 3.9 40 23 Tu 1 . + CDS 40939 - 41799 995 ## COG0668 Small-conductance mechanosensitive channel + Term 41808 - 41847 5.0 + Prom 41851 - 41910 5.7 41 24 Op 1 5/0.273 + CDS 41938 - 42573 625 ## COG1279 Lysine efflux permease 42 24 Op 2 2/0.909 + CDS 42666 - 43406 768 ## COG2968 Uncharacterized conserved protein + Term 43429 - 43462 -0.3 + Prom 43432 - 43491 6.2 43 25 Tu 1 . + CDS 43573 - 44469 730 ## COG0583 Transcriptional regulator 44 26 Op 1 . - CDS 44466 - 45944 1094 ## COG0427 Acetyl-CoA hydrolase 45 26 Op 2 . 
- CDS 45968 - 46753 909 ## COG1024 Enoyl-CoA hydratase/carnithine racemase 46 26 Op 3 6/0.182 - CDS 46764 - 47759 889 ## COG1703 Putative periplasmic protein kinase ArgK and related GTPases of G3E family 47 26 Op 4 3/0.364 - CDS 47752 - 49896 2389 ## COG1884 Methylmalonyl-CoA mutase, N-terminal domain/subunit - Prom 50014 - 50073 4.0 - Term 50053 - 50086 2.7 48 27 Tu 1 . - CDS 50100 - 50993 842 ## COG0583 Transcriptional regulator - Prom 51023 - 51082 1.5 + Prom 51033 - 51092 3.4 49 28 Op 1 . + CDS 51135 - 51365 204 ## COG0583 Transcriptional regulator 50 28 Op 2 8/0.091 + CDS 51421 - 52080 911 ## COG0120 Ribose 5-phosphate isomerase + Prom 52231 - 52290 4.4 51 29 Tu 1 . + CDS 52336 - 53568 1400 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases + Term 53590 - 53626 7.1 - Term 53749 - 53780 3.2 52 30 Tu 1 . - CDS 53957 - 54559 407 ## COG0212 5-formyltetrahydrofolate cyclo-ligase 53 31 Tu 1 . - CDS 54805 - 55134 394 ## COG3027 Uncharacterized protein conserved in bacteria - Prom 55193 - 55252 3.1 + Prom 55214 - 55273 1.5 54 32 Op 1 5/0.273 + CDS 55302 - 55880 682 ## COG3079 Uncharacterized protein conserved in bacteria 55 32 Op 2 8/0.091 + CDS 55906 - 57231 1405 ## COG0006 Xaa-Pro aminopeptidase 56 32 Op 3 8/0.091 + CDS 57228 - 58406 1030 ## COG0654 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases 57 32 Op 4 . + CDS 58429 - 59631 1342 ## COG0654 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases + Term 59675 - 59710 4.1 58 33 Tu 1 . - CDS 59705 - 59788 59 ## - Prom 59889 - 59948 4.3 59 34 Op 1 18/0.000 + CDS 60079 - 61173 1402 ## COG0404 Glycine cleavage system T protein (aminomethyltransferase) 60 34 Op 2 12/0.000 + CDS 61197 - 61586 569 ## COG0509 Glycine cleavage system H protein (lipoate-binding) + Term 61616 - 61660 2.3 + Prom 61624 - 61683 1.8 61 34 Op 3 . 
+ CDS 61705 - 64578 3491 ## COG1003 Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain + Term 64604 - 64629 -0.5 Predicted protein(s) >gi|296494713|gb|ADTN01000025.1| GENE 1 486 - 1193 553 235 aa, chain - ## HITS:1 COG:ECs3842 KEGG:ns NR:ns ## COG: ECs3842 COG1811 # Protein_GI_number: 15833096 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, possible Na+ channel or pump # Organism: Escherichia coli O157:H7 # 1 235 1 235 235 333 99.0 2e-91 MVIGPFINASAVLLGGVLGALLSQRLPERIRVSMTSIFGLASLGIGILLVVKCANLPAMV LATLLGALIGEICLLEKGVNTAVAKAQNLFRHSRKKPAHESFIQNYVAIIVLFCASGTGI FGAMNEGMTGDPSILIAKSFLDFFTAMIFACSLGIAVSVISIPLLIIQLTLAWAAALILP LTTPSMMADFSAVGGLLLLATGLRICGIKMFPVVNMLPALLLAMPLSAAWTAWFA >gi|296494713|gb|ADTN01000025.1| GENE 2 1591 - 3726 1958 711 aa, chain + ## HITS:1 COG:speC KEGG:ns NR:ns ## COG: speC COG1982 # Protein_GI_number: 16130866 # Func_class: E Amino acid transport and metabolism # Function: Arginine/lysine/ornithine decarboxylases # Organism: Escherichia coli K12 # 1 711 21 731 731 1494 100.0 0 MKSMNIAASSELVSRLSSHRRVVALGDTDFTDVAAVVITAADSRSGILALLKRTGFHLPV FLYSEHAVELPAGVTAVINGNEQQWLELESAACQYEENLLPPFYDTLTQYVEMGNSTFAC PGHQHGAFFKKHPAGRHFYDFFGENVFRADMCNADVKLGDLLIHEGSAKDAQKFAAKVFH ADKTYFVLNGTSAANKVVTNALLTRGDLVLFDRNNHKSNHHGALIQAGATPVYLEASRNP FGFIGGIDAHCFNEEYLRQQIRDVAPEKADLPRPYRLAIIQLGTYDGTVYNARQVIDTVG HLCDYILFDSAWVGYEQFIPMMADSSPLLLELNENDPGIFVTQSVHKQQAGFSQTSQIHK KDNHIRGQARFCPHKRLNNAFMLHASTSPFYPLFAALDVNAKIHEGESGRRLWAECVEIG IEARKAILARCKLFRPFIPPVVDGKLWQDYPTSVLASDRRFFSFEPGAKWHGFEGYAADQ YFVDPCKLLLTTPGIDAETGEYSDFGVPATILAHYLRENGIVPEKCDLNSILFLLTPAES HEKLAQLVAMLAQFEQHIEDDSPLVEVLPSVYNKYPVRYRDYTLRQLCQEMHDLYVSFDV KDLQKAMFRQQSFPSVVMNPQDAHSAYIRGDVELVRIRDAEGRIAAEGALPYPPGVLCVV PGEVWGGAVQRYFLALEEGVNLLPGFSPELQGVYSETDADGVKRLYGYVLK >gi|296494713|gb|ADTN01000025.1| GENE 3 3776 - 5032 1609 418 aa, chain - ## HITS:1 COG:ECs3840 KEGG:ns NR:ns ## COG: ECs3840 COG0477 # Protein_GI_number: 15833094 # Func_class: G Carbohydrate 
transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 418 17 434 434 725 100.0 0 MNLKLQLKILSFLQFCLWGSWLTTLGSYMFVTLKFDGASIGAVYSSLGIAAVFMPALLGI VADKWLSAKWVYAICHTIGAITLFMAAQVTTPEAMFLVILINSFAYMPTLGLINTISYYR LQNAGMDIVTDFPPIRIWGTIGFIMAMWVVSLSGFELSHMQLYIGAALSAILVLFTLTLP HIPVAKQQANQSWTTLLGLDAFALFKNKRMAIFFIFSMLLGAELQITNMFGNTFLHSFDK DPMFASSFIVQHASIIMSISQISETLFILTIPFFLSRYGIKNVMMISIVAWILRFALFAY GDPTPFGTVLLVLSMIVYGCAFDFFNISGSVFVEKEVSPAIRASAQGMFLMMTNGFGCIL GGIVSGKVVEMYTQNGITDWQTVWLIFAGYSVVLAFAFMAMFKYKHVRVPTGTQTVSH >gi|296494713|gb|ADTN01000025.1| GENE 4 5234 - 6316 1012 360 aa, chain - ## HITS:1 COG:mltC KEGG:ns NR:ns ## COG: mltC COG0741 # Protein_GI_number: 16130864 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Escherichia coli K12 # 1 360 1 360 360 690 100.0 0 MMKKYLALALIAPLLISCSTTKKGDTYNEAWVKDTNGFDILMGQFAHNIENIWGFKEVVI AGPKDYVKYTDQYQTRSHINFDDGTITIETIAGTEPAAHLRRAIIKTLLMGDDPSSVDLY SDVDDITISKEPFLYGQVVDNTGQPIRWEGRASNFADYLLKNRLKSRSNGLRIIYSVTIN MVPNHLDKRAHKYLGMVRQASRKYGVDESLILAIMQTESSFNPYAVSRSDALGLMQVVQH TAGKDVFRSQGKSGTPSRSFLFDPASNIDTGTAYLAMLNNVYLGGIDNPTSRRYAVITAY NGGAGSVLRVFSNDKIQAANIINTMTPGDVYQTLTTRHPSAESRRYLYKVNTAQKSYRRR >gi|296494713|gb|ADTN01000025.1| GENE 5 6378 - 6653 336 91 aa, chain - ## HITS:1 COG:STM3111 KEGG:ns NR:ns ## COG: STM3111 COG2924 # Protein_GI_number: 16766412 # Func_class: C Energy production and conversion; O Posttranslational modification, protein turnover, chaperones # Function: Fe-S cluster protector protein # Organism: Salmonella typhimurium LT2 # 1 91 1 91 91 161 94.0 3e-40 MSRTIFCTFLQREAEGQDFQLYPGELGKRIYNEISKEAWAQWQHKQTMLINEKKLNMMNA EHRKLLEQEMVNFLFEGKEVHIEGYTPEDKK >gi|296494713|gb|ADTN01000025.1| GENE 6 6681 - 7733 1082 350 aa, 
chain - ## PROTEIN SUPPORTED ## NR: gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 [Haemophilus influenzae 7P49H1] # 6 343 11 366 378 421 59 1e-117 MQASQFSAQVLDWYDKYGRKTLPWQIDKTPYKVWLSEVMLQQTQVATVIPYFERFMARFP TVTDLANAPLDEVLHLWTGLGYYARARNLHKAAQQVATLHGGKFPETFEEVAALPGVGRS TAGAILSLSLGKHFPILDGNVKRVLARCYAVSGWPGKKEVENKLWSLSEQVTPAVGVERF NQAMMDLGAMICTRSKPKCSLCPLQNGCIAAANNSWALYPGKKPKQTLPERTGYFLLLQH EDEVLLAQRPPSGLWGGLYCFPQFADEESLRQWLAQRQIAADNLTQLTAFRHTFSHFHLD IVPMWLPVSSFTGCMDEGNALWYNLAQPPSVGLAAPVERLLQQLRTGAPV >gi|296494713|gb|ADTN01000025.1| GENE 7 7894 - 8613 763 239 aa, chain + ## HITS:1 COG:yggH KEGG:ns NR:ns ## COG: yggH COG0220 # Protein_GI_number: 16130861 # Func_class: R General function prediction only # Function: Predicted S-adenosylmethionine-dependent methyltransferase # Organism: Escherichia coli K12 # 1 239 1 239 239 491 100.0 1e-139 MKNDVISPEFDENGRPLRRIRSFVRRQGRLTKGQEHALENYWPVMGVEFSEDMLDFPALF GREAPVTLEIGFGMGASLVAMAKDRPEQDFLGIEVHSPGVGACLASAHEEGLSNLRVMCH DAVEVLHKMIPDNSLRMVQLFFPDPWHKARHNKRRIVQVPFAELVKSKLQLGGVFHMATD WEPYAEHMLEVMSSIDGYKNLSESNDYVPRPASRPVTKFEQRGHRLGHGVWDLMFERVK >gi|296494713|gb|ADTN01000025.1| GENE 8 8613 - 8939 463 108 aa, chain + ## HITS:1 COG:yggL KEGG:ns NR:ns ## COG: yggL COG3171 # Protein_GI_number: 16130860 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 108 11 118 118 202 100.0 1e-52 MAKNRSRRLRKKMHIDEFQELGFSVAWRFPEGTSEEQIDKTVDDFINEVIEPNKLAFDGS GYLAWEGLICMQEIGKCTEEHQAIVRKWLEERKLDEVRTSELFDVWWD >gi|296494713|gb|ADTN01000025.1| GENE 9 9126 - 9842 926 238 aa, chain + ## HITS:1 COG:no KEGG:ECO111_3709 NR:ns ## KEGG: ECO111_3709 # Name: yggN # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 238 2 239 239 429 100.0 1e-119 MRKMLLAAALSVTAMTAHADYQCSVTPRDDVIVSPQTVQVKGENGNLVITPDGNVMYNGK QYSLNAAQREQAKDYQAELRSTLPWIDEGAKSRVEKARIALDKIIVQEMGESSKMRSRLT KLDAQLKEQMNRIIETRSDGLTFHYKAIDQVRAEGQQLVNQAMGGILQDSINEMGAKAVL 
KSGGNPLQNVLGSLGGLQSSIQTEWKKQEKDFQQFGKDVCSRVVTLEDSRKALVGNLK >gi|296494713|gb|ADTN01000025.1| GENE 10 10018 - 11064 1327 348 aa, chain + ## HITS:1 COG:ansB KEGG:ns NR:ns ## COG: ansB COG0252 # Protein_GI_number: 16130858 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Escherichia coli K12 # 1 348 1 348 348 612 99.0 1e-175 MEFFKKTALAALVMGFSGAALALPNITILATGGTIAGGGDSATKSNYTAGKVGVENLVNA VPQLKDIANVKGEQVVNIGSQDMNDNVWLTLAKKINTDCDKTDGFVITHGTDTMEETAYF LDLTVKCDKPVVMVGAMRPSTSMSADGPFNLYNAVVTAADKASANRGVLVVMNDTVLDGR DVTKTNTTDVATFKSVNYGPLGYIHNGKIDYQRTPARKHTSDTPFDVSKLNELPKVGIVY NYANASDLPAKALVDAGYDGIVSAGVGNGNLYKSVFDTLATAAKTGTAVVRSSRVPTGAT TQDAEVDDAKYGFVASGTLNPQKARVLLQLALTQTKDPQQIQQIFNQY >gi|296494713|gb|ADTN01000025.1| GENE 11 11181 - 12188 829 335 aa, chain + ## HITS:1 COG:no KEGG:JW2923 NR:ns ## KEGG: JW2923 # Name: yggM # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 335 1 335 335 602 100.0 1e-171 MKKQWIVGTALLMLMTGNAWADGEPPTENILKDQFKKQYHGILKLDAITLKNLDAKGNQA TWSAEGDVSSSDDLYTWVGQLADYELLEQTWTKDKPVKFSAMLTSKGTPASGWSVNFYSF QAAASDRGRVVDDIKTNNKYLIVNSEDFNYRFSQLESALNTQKNSIPALEKEVKALDKQM VAAQKAADAYWGKDANGKQMTREDAFKKIHQQRDEFNKQNDSEAFAVKYDKEVYQPAIAA CHKQSEECYEVPIQQKRDFDINEQRRQTFLQSQKLSRKLQDDWVTLEKGQYPLTMKVSEI NSKKVAILMKIDDINQANERWKKDTEQLRRNGVIK >gi|296494713|gb|ADTN01000025.1| GENE 12 12343 - 13479 1112 378 aa, chain - ## HITS:1 COG:yggW KEGG:ns NR:ns ## COG: yggW COG0635 # Protein_GI_number: 16130856 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Escherichia coli K12 # 1 378 1 378 378 799 100.0 0 MVKLPPLSLYIHIPWCVQKCPYCDFNSHALKGEVPHDDYVQHLLNDLDNDVAYAQGREVK TIFIGGGTPSLLSGPAMQTLLDGVRARLPLAADAEITMEANPGTVEADRFVDYQRAGVNR ISIGVQSFSEEKLKRLGRIHGPQEAKRAAKLASGLGLRSFNLDLMHGLPDQSLEEALGDL 
RQAIELNPPHLSWYQLTIEPNTLFGSRPPVLPDDDALWDIFEQGHQLLTAAGYQQYETSA YAKPGYQCQHNLNYWRFGDYIGIGCGAHGKVTFPDGRILRTTKTRHPRGFMQGRYLESQR DVEATDKPFEFFMNRFRLLEAAPRVEFIAYTGLCEDVIRPQLDEAIAQGYLTECADYWQI TEHGKLFLNSLLELFLAE >gi|296494713|gb|ADTN01000025.1| GENE 13 13472 - 14065 993 197 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|26249375|ref|NP_755415.1| putative deoxyribonucleotide triphosphate pyrophosphatase [Escherichia coli CFT073] # 1 197 1 197 197 387 98 1e-106 MQKVVLATGNVGKVRELASLLSDFGLDIVAQTDLGVDSAEETGLTFIENAILKARHAAKV TALPAIADDSGLAVDVLGGAPGIYSARYSGEDATDQKNLQKLLETMKDVPDDQRQARFHC VLVYLRHAEDPTPLVCHGSWPGVITREPAGTGGFGYDPIFFVPSEGKTAAELTREEKSAI SHRGQALKLLLDALRNG >gi|296494713|gb|ADTN01000025.1| GENE 14 14073 - 14363 302 96 aa, chain - ## HITS:1 COG:yggU KEGG:ns NR:ns ## COG: yggU COG1872 # Protein_GI_number: 16130854 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 96 5 100 100 175 100.0 2e-44 MNAVTVNDDGLVLRLYIQPKASRDSIVGLHGDEVKVAITAPPVDGQANSHLVKFLGKQFR VAKSQVVIEKGELGRHKQIKIINPQQIPPEIAALIN >gi|296494713|gb|ADTN01000025.1| GENE 15 14360 - 14926 824 188 aa, chain - ## HITS:1 COG:ECs3828 KEGG:ns NR:ns ## COG: ECs3828 COG0762 # Protein_GI_number: 15833082 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Escherichia coli O157:H7 # 1 188 1 188 188 272 100.0 2e-73 MNTLTFLLSTVIELYTMVLLLRIWMQWAHCDFYNPFSQFVVKVTQPIIGPLRRVIPAMGP IDSASLLVAYILSFIKAIVLFKVVTFLPIIWIAGLLILLKTIGLLIFWVLLVMAIMSWVS QGRSPIEYVLIQLADPLLRPIRRLLPAMGGIDFSPMILVLLLYVINMGVAEVLQATGNML LPGLWMAL >gi|296494713|gb|ADTN01000025.1| GENE 16 14944 - 15648 645 234 aa, chain - ## HITS:1 COG:ECs3826 KEGG:ns NR:ns ## COG: ECs3826 COG0325 # Protein_GI_number: 15833081 # Func_class: R General function prediction only # Function: Predicted enzyme with a TIM-barrel fold # Organism: Escherichia coli O157:H7 # 1 234 1 234 234 456 100.0 1e-128 MNDIAHNLAQVRDKISAAATRCGRSPEEITLLAVSKTKPASAIAEAIDAGQRQFGENYVQ 
EGVDKIRHFQELGVTGLEWHFIGPLQSNKSRLVAEHFDWCHTIDRLRIATRLNDQRPAEL PPLNVLIQINISDENSKSGIQLAELDELAAAVAELPRLRLRGLMAIPAPESEYVRQFEVA RQMAVAFAGLKTRYPHIDTLSLGMSDDMEAAIAAGSTMVRIGTAIFGARDYSKK >gi|296494713|gb|ADTN01000025.1| GENE 17 15666 - 16646 736 326 aa, chain + ## HITS:1 COG:yggR KEGG:ns NR:ns ## COG: yggR COG2805 # Protein_GI_number: 16130851 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Tfp pilus assembly protein, pilus retraction ATPase PilT # Organism: Escherichia coli K12 # 1 326 16 341 341 625 100.0 1e-179 MNMEEIVALSVKHNVSDLHLCSAWPARWRIRGRMEAAPFDTPDVEELLREWLDDDQRAIL LENGQLDFAVSLAENQRLRGSAFAQRHGISLALRLLPSHCPQLEQLGAPTVLPELLKSEN GLILVTGATGSGKSTTLAAMVGYLNQHADAHILTLEDPVEYLYASQRCLIQQREIGLHCM TFASGLRAALREDPDVILLGELRDSETIRLALTAAETGHLVLATLHTRGAAQAVERLVDS FPAQEKDPVRNQLAGSLRAVLSQKLEVDKQEGRVALFELLINTPAVGNLIREGKTHQLPH VIQTGQQVGMITFQQSYQHRVGEGRL >gi|296494713|gb|ADTN01000025.1| GENE 18 16830 - 17246 475 138 aa, chain - ## HITS:1 COG:yqgF KEGG:ns NR:ns ## COG: yqgF COG0816 # Protein_GI_number: 16130850 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. 
subtilis) # Organism: Escherichia coli K12 # 1 138 1 138 138 276 100.0 7e-75 MSGTLLAFDFGTKSIGVAVGQRITGTARPLPAIKAQDGTPDWNIIERLLKEWQPDEIIVG LPLNMDGTEQPLTARARKFANRIHGRFGVEVKLHDERLSTVEARSGLFEQGGYRALNKGK VDSASAVIILESYFEQGY >gi|296494713|gb|ADTN01000025.1| GENE 19 17246 - 17809 606 187 aa, chain - ## HITS:1 COG:ECs3824 KEGG:ns NR:ns ## COG: ECs3824 COG1678 # Protein_GI_number: 15833078 # Func_class: K Transcription # Function: Putative transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 187 25 211 211 374 100.0 1e-104 MNLQHHFLIAMPALQDPIFRRSVVYICEHNTNGAMGIIVNKPLENLKIEGILEKLKITPE PRDESIRLDKPVMLGGPLAEDRGFILHTPPSNFASSIRISDNTVMTTSRDVLETLGTDKQ PSDVLVALGYASWEKGQLEQEILDNAWLTAPADLNILFKTPIADRWREAAKLIGVDILTM PGVAGHA >gi|296494713|gb|ADTN01000025.1| GENE 20 17918 - 18868 347 316 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|212636859|ref|YP_002313384.1| Glutathione synthase; Ribosomal protein S6 modification enzyme [Shewanella piezotolerans WP3] # 7 310 6 319 345 138 31 9e-32 MIKLGIVMDPIANINIKKDSSFAMLLEAQRRGYELHYMEMGDLYLINGEARAHTRTLNVK QNYEEWFSFVGEQDLPLADLDVILMRKDPPFDTEFIYATYILERAEEKGTLIVNKPQSLR DCNEKLFTAWFSDLTPETLVTRNKAQLKAFWEKHSDIILKPLDGMGGASIFRVKEGDPNL GVIAETLTEHGTRYCMAQNYLPAIKDGDKRVLVVDGEPVPYCLARIPQGGETRGNLAAGG RGEPRPLTESDWKIARQIGPTLKEKGLIFVGLDIIGDRLTEINVTSPTCIREIEAEFPVS ITGMLMDAIEARLQQQ >gi|296494713|gb|ADTN01000025.1| GENE 21 18881 - 19612 579 243 aa, chain - ## HITS:1 COG:ECs3822 KEGG:ns NR:ns ## COG: ECs3822 COG1385 # Protein_GI_number: 15833076 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 243 10 252 252 486 100.0 1e-137 MRIPRIYHPEPLTSHSHIALCEDAANHIGRVLRMGPGQALQLFDGSNQVFDAEITSASKK SVEVKVLEGQIDDRESPLHIHLGQVMSRGEKMEFTIQKSIELGVSLITPLFSERCGVKLD SERLNKKLQQWQKIAIAACEQCGRNRVPEIRPAMDLEAWCAEQDEGLKLNLHPRASNSIN TLPLPVERVRLLIGPEGGLSADEIAMTARYQFTDILLGPRVLRTETTALTAITALQVRFG DLG >gi|296494713|gb|ADTN01000025.1| GENE 22 19692 - 20399 561 235 aa, chain - ## HITS:1 COG:endA KEGG:ns NR:ns ## COG: 
endA COG2356 # Protein_GI_number: 16130846 # Func_class: L Replication, recombination and repair # Function: Endonuclease I # Organism: Escherichia coli K12 # 1 235 1 235 235 446 100.0 1e-125 MYRYLSIAAVVLSAAFSGPALAEGINSFSQAKAAAVKVHADAPGTFYCGCKINWQGKKGV VDLQSCGYQVRKNENRASRVEWEHVVPAWQFGHQRQCWQDGGRKNCAKDPVYRKMESDMH NLQPSVGEVNGDRGNFMYSQWNGGEGQYGQCAMKVDFKEKAAEPPARARGAIARTYFYMR DQYNLTLSRQQTQLFNAWNKMYPVTDWECERDERIAKVQGNHNPYVQRACQARKS >gi|296494713|gb|ADTN01000025.1| GENE 23 20494 - 20949 452 151 aa, chain - ## HITS:1 COG:sprT KEGG:ns NR:ns ## COG: sprT COG3091 # Protein_GI_number: 16130845 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 151 15 165 165 290 100.0 5e-79 MRRLREKLAQANLKLGRNYPEPKLSYTQRGTSAGTAWLESYEIRLNPVLLLENSEAFIEE VVPHELAHLLVWKHFGRVAPHGKEWKWMMENVLGVPARRTHQFELQSVRRNTFPYRCKCQ EHQLTVRRHNRVVRGEAVYRCVHCGEQLVAK >gi|296494713|gb|ADTN01000025.1| GENE 24 21068 - 22462 1761 464 aa, chain - ## HITS:1 COG:galP KEGG:ns NR:ns ## COG: galP COG0477 # Protein_GI_number: 16130844 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 464 1 464 464 850 99.0 0 MPDAKKQGRSNKAMTFFVCFLAALAGLLFGLDIGVIAGALPFIADEFQITSHTQEWVVSS MMFGAAVGAVGSGWLSFKLGRKKSLMIGAILFVACSLFSAAAPNVEVLILSRVLLGLAVG VASYTAPLYLSEIAPEKIRGSMISMYQLMITIGILGAYLSDTAFSYTGAWRWMLGVIIIP AILLLIGVFFLPDSPRWFAAKRRFVDAERVLLRLRDTSAEAKRELDEIRESLQVKQSGWA LFKENSNFRRAVFLGVLLQVMQQFTGMNVIMYYAPKIFELAGYTNTTEQMWGTVIVGLTN VLATFIAIGLVDRWGRKPTLTLGFLVMAAGMGVLGTMMHIGIHSPSAQYFAIAMLLMFIV GFAMSAGPLIWVLCSEIQPLKGRDFGITCSTATNWIANMIVGATFLTMLNTLGNANTFWV YAALNVLFILLTLWLVPETKHVSLEHIERNLMKGRKLREIGAHD >gi|296494713|gb|ADTN01000025.1| GENE 25 22886 - 24040 1410 384 aa, chain - ## HITS:1 COG:ECs3818 KEGG:ns NR:ns ## COG: ECs3818 COG0192 # Protein_GI_number: 15833072 # 
Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine synthetase # Organism: Escherichia coli O157:H7 # 1 384 1 384 384 781 100.0 0 MAKHLFTSESVSEGHPDKIADQISDAVLDAILEQDPKARVACETYVKTGMVLVGGEITTS AWVDIEEITRNTVREIGYVHSDMGFDANSCAVLSAIGKQSPDINQGVDRADPLEQGAGDQ GLMFGYATNETDVLMPAPITYAHRLVQRQAEVRKNGTLPWLRPDAKSQVTFQYDDGKIVG IDAVVLSTQHSEEIDQKSLQEAVMEEIIKPILPAEWLTSATKFFINPTGRFVIGGPMGDC GLTGRKIIVDTYGGMARHGGGAFSGKDPSKVDRSAAYAARYVAKNIVAAGLADRCEIQVS YAIGVAEPTSIMVETFGTEKVPSEQLTLLVREFFDLRPYGLIQMLDLLHPIYKETAAYGH FGREHFPWEKTDKAQLLRDAAGLK >gi|296494713|gb|ADTN01000025.1| GENE 26 24458 - 24601 62 47 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|188495354|ref|ZP_03002624.1| ## NR: gi|188495354|ref|ZP_03002624.1| hypothetical protein Ec53638_2979 [Escherichia coli 53638] # 1 47 1 47 47 79 100.0 6e-14 MNRCLLLNLSHRSGEDSFPALCISALHTCRCYTHLGASQDSRAGYSY >gi|296494713|gb|ADTN01000025.1| GENE 27 24835 - 26811 2385 658 aa, chain + ## HITS:1 COG:speA KEGG:ns NR:ns ## COG: speA COG1166 # Protein_GI_number: 16130839 # Func_class: E Amino acid transport and metabolism # Function: Arginine decarboxylase (spermidine biosynthesis) # Organism: Escherichia coli K12 # 1 658 1 658 658 1362 100.0 0 MSDDMSMGLPSSAGEHGVLRSMQEVAMSSQEASKMLRTYNIAWWGNNYYDVNELGHISVC PDPDVPEARVDLAQLVKTREAQGQRLPALFCFPQILQHRLRSINAAFKRARESYGYNGDY FLVYPIKVNQHRRVIESLIHSGEPLGLEAGSKAELMAVLAHAGMTRSVIVCNGYKDREYI RLALIGEKMGHKVYLVIEKMSEIAIVLDEAERLNVVPRLGVRARLASQGSGKWQSSGGEK SKFGLAATQVLQLVETLREAGRLDSLQLLHFHLGSQMANIRDIATGVRESARFYVELHKL GVNIQCFDVGGGLGVDYEGTRSQSDCSVNYGLNEYANNIIWAIGDACEENGLPHPTVITE SGRAVTAHHTVLVSNIIGVERNEYTVPTAPAEDAPRALQSMWETWQEMHEPGTRRSLREW LHDSQMDLHDIHIGYSSGIFSLQERAWAEQLYLSMCHEVQKQLDPQNRAHRPIIDELQER MADKMYVNFSLFQSMPDAWGIDQLFPVLPLEGLDQVPERRAVLLDITCDSDGAIDHYIDG DGIATTMPMPEYDPENPPMLGFFMVGAYQEILGNMHNLFGDTEAVDVFVFPDGSVEVELS DEGDTVADMLQYVQLDPKTLLTQFRDQVKKTDLDAELQQQFLEEFEAGLYGYTYLEDE >gi|296494713|gb|ADTN01000025.1| GENE 28 26949 - 27869 1301 306 aa, chain + ## HITS:1 COG:ECs3812 KEGG:ns NR:ns ## COG: ECs3812 
COG0010 # Protein_GI_number: 15833066 # Func_class: E Amino acid transport and metabolism # Function: Arginase/agmatinase/formimionoglutamate hydrolase, arginase family # Organism: Escherichia coli O157:H7 # 1 306 1 306 306 616 100.0 1e-176 MSTLGHQYDNSLVSNAFGFLRLPMNFQPYDSDADWVITGVPFDMATSGRAGGRHGPAAIR QVSTNLAWEHNRFPWNFDMRERLNVVDCGDLVYAFGDAREMSEKLQAHAEKLLAAGKRML SFGGDHFVTLPLLRAHAKHFGKMALVHFDAHTDTYANGCEFDHGTMFYTAPKEGLIDPNH SVQIGIRTEFDKDNGFTVLDACQVNDRSVDDVIAQVKQIVGDMPVYLTFDIDCLDPAFAP GTGTPVIGGLTSDRAIKLVRGLKDLNIVGMDVVEVAPAYDQSEITALAAATLALEMLYIQ AAKKGE >gi|296494713|gb|ADTN01000025.1| GENE 29 28075 - 28833 774 252 aa, chain - ## HITS:1 COG:yggG KEGG:ns NR:ns ## COG: yggG COG0501 # Protein_GI_number: 16130837 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Escherichia coli K12 # 1 252 43 294 294 465 100.0 1e-131 MKIRALLVAMSVATVLTGCQNMDSNGLLSSGAEAFQAYSLSDAQVKTLSDQACQEMDSKA TIAPANSEYAKRLTTIANALGNNINGQPVNYKVYMAKDVNAFAMANGCIRVYSGLMDMMT DNEVEAVIGHEMGHVALGHVKKGMQVALGTNAVRVAAASAGGIVGSLSQSQLGNLGEKLV NSQFSQRQEAEADDYSYDLLRQRGISPAGLATSFEKLAKLEEGRQSSMFDDHPASAERAQ HIRDRMSADGIK >gi|296494713|gb|ADTN01000025.1| GENE 30 29111 - 31102 2482 663 aa, chain + ## HITS:1 COG:ECs3810 KEGG:ns NR:ns ## COG: ECs3810 COG0021 # Protein_GI_number: 15833064 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase # Organism: Escherichia coli O157:H7 # 1 663 1 663 663 1331 99.0 0 MSSRKELANAIRALSMDAVQKAKSGHPGAPMGMADIAEVLWRDFLKHNPQNPSWADRDRF VLSNGHGSMLIYSLLHLTGYDLPMEELKNFRQLHSKTPGHPEVGYTAGVETTTGPLGQGI ANAVGMAIAEKTLAAQFNRPGHDIVDHYTYAFMGDGCMMEGISHEVCSLAGTLKLGKLIA FYDDNGISIDGHVEGWFTDDTAMRFEAYGWHVIRDIDGHDAASIKRAVEEARAVTDKPSL LMCKTIIGFGSPNKAGTHDSHGAPLGDAEIALTREQLGWKYAPFEIPSEIYAQWDAKEAG QAKESAWNEKFAAYAKAYPQEAAEFTRRMKGEMPSDFDAKAKEFIAKLQANPAKIASRKA SQNAIEAFGPLLPEFLGGSADLAPSNLTLWSGSKAINEDAAGNYIHYGVREFGMTAIANG ISLHGGFLPYTSTFLMFVEYARNAVRMAALMKQRQVMVYTHDSIGLGEDGPTHQPVEQVA 
SLRVTPNMSTWRPCDQVESAVAWKYGVERQDGPTALILSRQNLAQQERTEEQLANIARGG YVLKDCAGQPELIFIATGSEVELAVAAYEKLTAEGVKARVVSMPSTDAFDKQDAAYRESV LPKAVTARVAVEAGIADYWYKYVGLNGAIVGMTTFGESAPAELLFEEFGFTVDNVVAKAK ELL >gi|296494713|gb|ADTN01000025.1| GENE 31 31416 - 31859 353 147 aa, chain + ## HITS:1 COG:ECs3809 KEGG:ns NR:ns ## COG: ECs3809 COG1762 # Protein_GI_number: 15833063 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Escherichia coli O157:H7 # 1 147 1 147 147 252 100.0 1e-67 MRLSDYFPESSISVIHSAKDWQEAIDFSMVSLLDKNYISENYIQAIKDSTINNGPYYILA PGVAMPHARPECGALKTGMSLTLLEQGVYFPGNDEPIKLLIGLSAADADSHIGAIQALSE LLCEEEILEQLLTASSEKQLADIISRG >gi|296494713|gb|ADTN01000025.1| GENE 32 31887 - 33275 1724 462 aa, chain + ## HITS:1 COG:ECs3808 KEGG:ns NR:ns ## COG: ECs3808 COG2213 # Protein_GI_number: 15833062 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannitol-specific IIBC component # Organism: Escherichia coli O157:H7 # 1 462 1 462 462 856 100.0 0 MENKSARAKVQAFGGFLTAMVIPNIGAFIAWGFITALFIPTGWLPNEHFAKIVGPMITYL LPVMIGSTGGHLVGGKRGAVMGGIGTIGVIVGAEIPMFLGSMIMGPLGGLVIKYVDKALE KRIPAGFEMVINNFSLGIAGMLLCLLGFEVIGPAVLIANTFVKECIEALVHAGYLPLLSV INEPAKVLFLNNAIDQGVYYPLGMQQASVNGKSIFFMVASNPGPGLGLLLAFTLFGKGMS KRSAPGAMIIHFLGGIHELYFPYVLMKPLTIIAMIAGGMSGTWMFNLLDGGLVAGPSPGS IFAYLALTPKGSFLATIAGVTVGTLVSFAITSLILKMEKTVETESEDEFAQSANAVKAMK QEGAFSLSRVKRIAFVCDAGMGSSAMGATTFRKRLEKAGLAIEVKHYAIENVPADADIVV THASLEGRVKRVTDKPLILINNYIGDPKLDTLFNQLTAEHKH >gi|296494713|gb|ADTN01000025.1| GENE 33 33290 - 34567 1303 425 aa, chain + ## HITS:1 COG:ECs3807 KEGG:ns NR:ns ## COG: ECs3807 COG1063 # Protein_GI_number: 15833061 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli O157:H7 # 1 425 1 425 425 862 99.0 0 
MKTKVAAIYGKRDVRLRVFELPEITDNELLVSVISDSVCLSTWKAALLGSEHKRVPDDLE NHPVITGHECAGVIVEVGKNLTGKYKKGQRFVLQPAMGLPSGYSAGYSYEYFGGNATYMI IPEIAINLGCVLPYHGSYFAAASLAEPMCCIIGAYHANYHTTQYVYEHRMGVKPGGNIAL LACAGPMGIGAIDYAINGGIQPSRVVVVDIDDKRLAQVQKLLPVELAASKGIELVYVNTK GMSDPVQMLRALTGDAGFDDIFVYAAVPAVVEMADELLAEDGCLNFFAGPTDKNFKVPFN FYNVHYNSTHVVGTSGGSTDDMKEAIALSATGQLQPSFMVTHIGGLDAVPETVLNLPDIP GGKKLIYNGVTMPLTAIADFAEKGKTDPLFKELARLVEETHGIWNEQAEKYLLAQFGVDI GEAAQ >gi|296494713|gb|ADTN01000025.1| GENE 34 34564 - 35529 828 321 aa, chain + ## HITS:1 COG:yggF KEGG:ns NR:ns ## COG: yggF COG1494 # Protein_GI_number: 16130831 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1,6-bisphosphatase/sedoheptulose 1,7-bisphosphatase and related proteins # Organism: Escherichia coli K12 # 1 321 1 321 321 630 100.0 1e-180 MMSLAWPLFRVTEQAALAAWPQTGCGDKNKIDGLAVTAMRQALNDVAFRGRVVIGEGEID HAPMLWIGEEVGKGDGPEVDIAVDPIEGTRMVAMGQSNALAVMAFAPRDSLLHAPDMYMK KLVVNRLAAGAIDLSLPLTDNLRNVAKALGKPLDKLRMVTLDKPRLSAAIEEATQLGVKV FALPDGDVAASVLTCWQDNPYDVMYTIGGAPEGVISACAVKALGGDMQAELIDFCQAKGD YTENRQIAEQERKRCKAMGVDVNRVYSLDELVRGNDILFSATGVTGGELVNGIQQTANGV RTQTLLIGGADQTCNIIDSLH >gi|296494713|gb|ADTN01000025.1| GENE 35 35551 - 36060 380 169 aa, chain + ## HITS:1 COG:yggD KEGG:ns NR:ns ## COG: yggD COG3722 # Protein_GI_number: 16130830 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 169 1 169 169 298 100.0 4e-81 MATLTEDDVLEQLDAQDNLFSFMKTAHTILLQGIRQFLPSLFVDNDEEIVEYAVKPLLAQ SGPLDDIDVALRLIYALGKMDKWLYADITHFSQFWHYLNEQDETPGFADDMTWDFISNVN SITRNAMLYDALKAMKFADFSVWSEARFSGMVKTALTLAVTTTLKELTP >gi|296494713|gb|ADTN01000025.1| GENE 36 36057 - 36770 384 237 aa, chain + ## HITS:1 COG:yggC KEGG:ns NR:ns ## COG: yggC COG1072 # Protein_GI_number: 16130829 # Func_class: H Coenzyme transport and metabolism # Function: Panthothenate kinase # Organism: Escherichia coli K12 # 1 237 1 237 237 489 100.0 1e-138 MKIELTVNGLKIQAQYQNEEIENVHKPLLHMLAALQTVNPQRRTVVFLCAPPGTGKSTLT 
TFWEYLAQQDPELPAIQTLPMDGFHHYNSWLDAHQLRPFKGAPETFDVAKLTENLRQVVE GDCTWPQYDRQKHDPVEDALHVTAPLVIVEGNWLLLDDEKWLELASFCDFSIFIHAPAQI LRERLISRKIAGGLTRQVAEAFYARTDGPNVERVLMNSRQANLIVEMTEEGRYHFTS >gi|296494713|gb|ADTN01000025.1| GENE 37 37055 - 38074 813 339 aa, chain + ## HITS:1 COG:ECs3798 KEGG:ns NR:ns ## COG: ECs3798 COG0057 # Protein_GI_number: 15833052 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Escherichia coli O157:H7 # 1 339 1 339 339 677 99.0 0 MTVRVAINGFGRIGRNVVRALYESGRRAEITVVAINELADAAGMAHLLKYDTSHGRFAWE VRQERDQLFVGDDAIRVLHERSLQSLPWRELGVDVVLDCTGVYGSREHGEAHIAAGAKKV LFSHPGSNDLDATVVYGANQDQLRAEHRIVSNASCTTNCIIPVIKLLDDAYGIESGTVTT IHSAMHDQQVIDAYHPDLRRTRAASQSIIPVDTKLAAGITRFFPQFNDRFEAIAVRVPTI NVTAIDLSVTVKKPVKANEVNLLLQKAAQGAFHGIVDYTELPLVSVDFNHDPHSAIVDGT QTRVSGAHLIKTLVWCDNEWGFANRMLDTTLAMATVAFR >gi|296494713|gb|ADTN01000025.1| GENE 38 38124 - 39287 1488 387 aa, chain + ## HITS:1 COG:pgk KEGG:ns NR:ns ## COG: pgk COG0126 # Protein_GI_number: 16130827 # Func_class: G Carbohydrate transport and metabolism # Function: 3-phosphoglycerate kinase # Organism: Escherichia coli K12 # 1 387 1 387 387 717 100.0 0 MSVIKMTDLDLAGKRVFIRADLNVPVKDGKVTSDARIRASLPTIELALKQGAKVMVTSHL GRPTEGEYNEEFSLLPVVNYLKDKLSNPVRLVKDYLDGVDVAEGELVVLENVRFNKGEKK DDETLSKKYAALCDVFVMDAFGTAHRAQASTHGIGKFADVACAGPLLAAELDALGKALKE PARPMVAIVGGSKVSTKLTVLDSLSKIADQLIVGGGIANTFIAAQGHDVGKSLYEADLVD EAKRLLTTCNIPVPSDVRVATEFSETAPATLKSVNDVKADEQILDIGDASAQELAEILKN AKTILWNGPVGVFEFPNFRKGTEIVANAIADSEAFSIAGGGDTLAAIDLFGIADKISYIS TGGGAFLEFVEGKVLPAVAMLEERAKK >gi|296494713|gb|ADTN01000025.1| GENE 39 39502 - 40581 1189 359 aa, chain + ## HITS:1 COG:Zfba KEGG:ns NR:ns ## COG: Zfba COG0191 # Protein_GI_number: 15803459 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Escherichia coli O157:H7 EDL933 # 1 359 26 384 384 742 100.0 0 
MSKIFDFVKPGVITGDDVQKVFQVAKENNFALPAVNCVGTDSINAVLETAAKVKAPVIVQ FSNGGASFIAGKGVKSDVPQGAAILGAISGAHHVHQMAEHYGVPVILHTDHCAKKLLPWI DGLLDAGEKHFAATGKPLFSSHMIDLSEESLQENIEICSKYLERMSKIGMTLEIELGCTG GEEDGVDNSHMDASALYTQPEDVDYAYTELSKISPRFTIAASFGNVHGVYKPGNVVLTPT ILRDSQEYVSKKHNLPHNSLNFVFHGGSGSTAQEIKDSVSYGVVKMNIDTDTQWATWEGV LNYYKANEAYLQGQLGNPKGEDQPNKKYYDPRVWLRAGQTSMIARLEKAFQELNAIDVL >gi|296494713|gb|ADTN01000025.1| GENE 40 40939 - 41799 995 286 aa, chain + ## HITS:1 COG:ECs3795 KEGG:ns NR:ns ## COG: ECs3795 COG0668 # Protein_GI_number: 15833049 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Escherichia coli O157:H7 # 1 286 1 286 286 498 100.0 1e-141 MEDLNVVDSINGAGSWLVANQALLLSYAVNIVAALAIIIVGLIIARMISNAVNRLMISRK IDATVADFLSALVRYGIIAFTLIAALGRVGVQTASVIAVLGAAGLAVGLALQGSLSNLAA GVLLVMFRPFRAGEYVDLGGVAGTVLSVQIFSTTMRTADGKIIVIPNGKIIAGNIINFSR EPVRRNEFIIGVAYDSDIDQVKQILTNIIQSEDRILKDREMTVRLNELGASSINFVVRVW SNSGDLQNVYWDVLERIKREFDAAGISFPYPQMDVNFKRVKEDKAA >gi|296494713|gb|ADTN01000025.1| GENE 41 41938 - 42573 625 211 aa, chain + ## HITS:1 COG:ECs3794 KEGG:ns NR:ns ## COG: ECs3794 COG1279 # Protein_GI_number: 15833048 # Func_class: R General function prediction only # Function: Lysine efflux permease # Organism: Escherichia coli O157:H7 # 1 211 1 211 211 362 98.0 1e-100 MFSYYFQGLALGAAMILPLGPQNAFVMNQGIRRQYHIMIALLCAISDLVLICAGIFGGSA LLMQSPWLLALVTWGGVAFLLWYGFGAFKTAMSSNIELASAEVMKQGRWKIIATMLAVTW LNPHVYLDTFVVLGSLGGQLDVEPKRWFALGTISASFLWFFGLALLAAWLAPRLRTAKAQ RIINLVVGCVMWFIALQLARDGIAHAQALFS >gi|296494713|gb|ADTN01000025.1| GENE 42 42666 - 43406 768 246 aa, chain + ## HITS:1 COG:ECs3793 KEGG:ns NR:ns ## COG: ECs3793 COG2968 # Protein_GI_number: 15833047 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 246 1 246 246 410 100.0 1e-114 MKFKVIALAALMGISGMAAQANELPDGPHIVTSGTASVDAVPDIATLAIEVNVAAKDAAT AKKQADERVAQYISFLELNQIAKKDISSANLRTQPDYDYQDGKSILKGYRAVRTVEVTLR 
QLDKLNSLLDGALKAGLNEIRSVSLGVAQPDAYKDKARKAAIDNAIHQAQELANGFHRKL GPVYSVRYHVSNYQPSPMVRMMKADAAPVSAQETYEQAAIQFDDQVDVVFQLEPVDQQPA KTPAAQ >gi|296494713|gb|ADTN01000025.1| GENE 43 43573 - 44469 730 298 aa, chain + ## HITS:1 COG:ygfI KEGG:ns NR:ns ## COG: ygfI COG0583 # Protein_GI_number: 16130822 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 298 6 303 303 596 100.0 1e-170 MDIFISKKMRNFILLAQTNNIARAAEKIHMTASPFGKSIAALEEQIGYTLFTRKDNNISL NKAGQELYQKLFPVYQRLSAIDNEIHNSGRRSREIVIGIDNTYPTIIFDQLISLGDKYEG VTAQPVEFSENGVIDNLFDRQLDFIISPQHVSARVQELENLTISELPPLRLGFLVSRRYE ERQEQELLQELPWLQMRFQNRANFEAMIDANMRPCGINPTIIYRPYSFMAKISAVERGHF LTVIPHFAWRLVNPATLKYFDAPHRPMYMQEYLYSIRNHRYTATMLQHIAEDRDGTSH >gi|296494713|gb|ADTN01000025.1| GENE 44 44466 - 45944 1094 492 aa, chain - ## HITS:1 COG:ygfH KEGG:ns NR:ns ## COG: ygfH COG0427 # Protein_GI_number: 16130821 # Func_class: C Energy production and conversion # Function: Acetyl-CoA hydrolase # Organism: Escherichia coli K12 # 1 492 1 492 492 991 100.0 0 METQWTRMTANEAAEIIQHNDMVAFSGFTPAGSPKALPTAIARRANEQHEAKKPYQIRLL TGASISAAADDVLSDADAVSWRAPYQTSSGLRKKINQGAVSFVDLHLSEVAQMVNYGFFG DIDVAVIEASALAPDGRVWLTSGIGNAPTWLLRAKKVIIELNHYHDPRVAELADIVIPGA PPRRNSVSIFHAMDRVGTRYVQIDPKKIVAVVETNLPDAGNMLDKQNPMCQQIADNVVTF LLQEMAHGRIPPEFLPLQSGVGNINNAVMARLGENPVIPPFMMYSEVLQESVVHLLETGK ISGASASSLTISADSLRKIYDNMDYFASRIVLRPQEISNNPEIIRRLGVIALNVGLEFDI YGHANSTHVAGVDLMNGIGGSGDFERNAYLSIFMAPSIAKEGKISTVVPMCSHVDHSEHS VKVIITEQGIADLRGLSPLQRARTIIDNCAHPMYRDYLHRYLENAPGGHIHHDLSHVFDL HRNLIATGSMLG >gi|296494713|gb|ADTN01000025.1| GENE 45 45968 - 46753 909 261 aa, chain - ## HITS:1 COG:ECs3789 KEGG:ns NR:ns ## COG: ECs3789 COG1024 # Protein_GI_number: 15833043 # Func_class: I Lipid transport and metabolism # Function: Enoyl-CoA hydratase/carnithine racemase # Organism: Escherichia coli O157:H7 # 1 261 15 275 275 525 100.0 1e-149 MSYQYVNVVTINKVAVIEFNYGRKLNALSKVFIDDLMQALSDLNRPEIRCIILRAPSGSK VFSAGHDIHELPSGGRDPLSYDDPLRQITRMIQKFPKPIISMVEGSVWGGAFEMIMSSDL 
IIAASTSTFSMTPVNLGVPYNLVGIHNLTRDAGFHIVKELIFTASPITAQRALAVGILNH VVEVEELEDFTLQMAHHISEKAPLAIAVIKEELRVLGEAHTMNSDEFERIQGMRRAVYDS EDYQEGMNAFLEKRKPNFVGH >gi|296494713|gb|ADTN01000025.1| GENE 46 46764 - 47759 889 331 aa, chain - ## HITS:1 COG:argK KEGG:ns NR:ns ## COG: argK COG1703 # Protein_GI_number: 16130819 # Func_class: E Amino acid transport and metabolism # Function: Putative periplasmic protein kinase ArgK and related GTPases of G3E family # Organism: Escherichia coli K12 # 1 331 1 331 331 658 100.0 0 MINEATLAESIRRLRQGERATLAQAMTLVESRHPRHQALSTQLLDAIMPYCGNTLRLGVT GTPGAGKSTFLEAFGMLLIREGLKVAVIAVDPSSPVTGGSILGDKTRMNDLARAEAAFIR PVPSSGHLGGASQRARELMLLCEAAGYDVVIVETVGVGQSETEVARMVDCFISLQIAGGG DDLQGIKKGLMEVADLIVINKDDGDNHTNVAIARHMYESALHILRRKYDEWQPRVLTCSA LEKRGIDEIWHAIIDFKTALTASGRLQQVRQQQSVEWLRKQTEEEVLNHLFANEDFDRYY RQTLLAVKNNTLSPRTGLRQLSEFIQTQYFD >gi|296494713|gb|ADTN01000025.1| GENE 47 47752 - 49896 2389 714 aa, chain - ## HITS:1 COG:sbm_1 KEGG:ns NR:ns ## COG: sbm_1 COG1884 # Protein_GI_number: 16130818 # Func_class: I Lipid transport and metabolism # Function: Methylmalonyl-CoA mutase, N-terminal domain/subunit # Organism: Escherichia coli K12 # 1 585 1 585 585 1138 100.0 0 MSNVQEWQQLANKELSRREKTVDSLVHQTAEGIAIKPLYTEADLDNLEVTGTLPGLPPYV RGPRATMYTAQPWTIRQYAGFSTAKESNAFYRRNLAAGQKGLSVAFDLATHRGYDSDNPR VAGDVGKAGVAIDTVEDMKVLFDQIPLDKMSVSMTMNGAVLPVLAFYIVAAEEQGVTPDK LTGTIQNDILKEYLCRNTYIYPPKPSMRIIADIIAWCSGNMPRFNTISISGYHMGEAGAN CVQQVAFTLADGIEYIKAAISAGLKIDDFAPRLSFFFGIGMDLFMNVAMLRAARYLWSEA VSGFGAQDPKSLALRTHCQTSGWSLTEQDPYNNVIRTTIEALAATLGGTQSLHTNAFDEA LGLPTDFSARIARNTQIIIQEESELCRTVDPLAGSYYIESLTDQIVKQARAIIQQIDEAG GMAKAIEAGLPKRMIEEASAREQSLIDQGKRVIVGVNKYKLDHEDETDVLEIDNVMVRNE QIASLERIRATRDDAAVTAALNALTHAAQHNENLLAAAVNAARVRATLGEISDALEVAFD RYLVPSQCVTGVIAQSYHQSEKSASEFDAIVAQTEQFLADNGRRPRILIAKMGQDGHDRG AKVIASAYSDLGFDVDLSPMFSTPEEIARLAVENDVHVVGASSLAAGHKTLIPELVEALK KWGREDICVVAGGVIPPQDYAFLQERGVAAIYGPGTPMLDSVRDVLNLISQHHD >gi|296494713|gb|ADTN01000025.1| GENE 48 50100 - 50993 842 297 aa, chain - ## HITS:1 
COG:ECs3786 KEGG:ns NR:ns ## COG: ECs3786 COG0583 # Protein_GI_number: 15833040 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 297 1 297 297 577 100.0 1e-165 MKRPDYRTLQALDAVIRERGFERAAQKLCITQSAVSQRIKQLENMFGQPLLVRTVPPRPT EQGQKLLALLRQVELLEEEWLGDEQTGSTPLLLSLAVNADSLATWLLPALAPVLADSPIR LNLQVEDETRTQERLRRGEVVGAVSIQHQALPSCLVDKLGALDYLFVSSKPFAEKYFPNG VTRSALLKAPVVAFDHLDDMHQAFLQQNFDLPPGSVPCHIVNSSEAFVQLARQGTTCCMI PHLQIEKELASGELIDLTPGLFQRRMLYWHRFAPESRMMRKVTDALLDYGHKVLRQD >gi|296494713|gb|ADTN01000025.1| GENE 49 51135 - 51365 204 76 aa, chain + ## HITS:1 COG:yqfE KEGG:ns NR:ns ## COG: yqfE COG0583 # Protein_GI_number: 16130816 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 76 1 76 76 142 100.0 1e-34 MHFAQRVRALVVLNGVALLPQFACKQGLANGELVRLFAPWSGIPRPLYALFAGRKGMPAI ARYFMDELTTRLANGV >gi|296494713|gb|ADTN01000025.1| GENE 50 51421 - 52080 911 219 aa, chain + ## HITS:1 COG:ECs3785 KEGG:ns NR:ns ## COG: ECs3785 COG0120 # Protein_GI_number: 15833039 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase # Organism: Escherichia coli O157:H7 # 1 219 1 219 219 397 100.0 1e-111 MTQDELKKAVGWAALQYVQPGTIVGVGTGSTAAHFIDALGTMKGQIEGAVSSSDASTEKL KSLGIHVFDLNEVDSLGIYVDGADEINGHMQMIKGGGAALTREKIIASVAEKFICIADAS KQVDILGKFPLPVEVIPMARSAVARQLVKLGGRPEYRQGVVTDNGNVILDVHGMEILDPI AMENAINAIPGVVTVGLFANRGADVALIGTPDGVKTIVK >gi|296494713|gb|ADTN01000025.1| GENE 51 52336 - 53568 1400 410 aa, chain + ## HITS:1 COG:ECs3784 KEGG:ns NR:ns ## COG: ECs3784 COG0111 # Protein_GI_number: 15833038 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Escherichia coli O157:H7 # 1 410 1 410 410 802 100.0 0 MAKVSLEKDKIKFLLVEGVHQKALESLRAAGYTNIEFHKGALDDEQLKESIRDAHFIGLR SRTHLTEDVINAAEKLVAIGCFCIGTNQVDLDAAAKRGIPVFNAPFSNTRSVAELVIGEL 
LLLLRGVPEANAKAHRGVWNKLAAGSFEARGKKLGIIGYGHIGTQLGILAESLGMYVYFY DIENKLPLGNATQVQHLSDLLNMSDVVSLHVPENPSTKNMMGAKEISLMKPGSLLINASR GTVVDIPALCDALASKHLAGAAIDVFPTEPATNSDPFTSPLCEFDNVLLTPHIGGSTQEA QENIGLEVAGKLIKYSDNGSTLSAVNFPEVSLPLHGGRRLMHIHENRPGVLTALNKIFAE QGVNIAAQYLQTSAQMGYVVIDIEADEDVAEKALQAMKAIPGTIRARLLY >gi|296494713|gb|ADTN01000025.1| GENE 52 53957 - 54559 407 200 aa, chain - ## HITS:1 COG:ECs3782 KEGG:ns NR:ns ## COG: ECs3782 COG0212 # Protein_GI_number: 15833036 # Func_class: H Coenzyme transport and metabolism # Function: 5-formyltetrahydrofolate cyclo-ligase # Organism: Escherichia coli O157:H7 # 19 200 1 182 182 356 100.0 1e-98 MTQLPELPLTLSRQEIRKMIRQRRRALTPEQQQEMGQQAATRMMTYPPVVMAHTVAVFLS FDGELDTQPLIEQLWRAGKRVYLPVLHPFSAGNLLFLNYHPQSELVMNRLKIHEPKLDVR DVLPLSRLDVLITPLVAFDEYGQRLGMGGGFYDRTLQNWQHYKTQPVGYAHDCQLVEKLP VEEWDIPLPAVVTPSKVWEW >gi|296494713|gb|ADTN01000025.1| GENE 53 54805 - 55134 394 109 aa, chain - ## HITS:1 COG:ECs3781 KEGG:ns NR:ns ## COG: ECs3781 COG3027 # Protein_GI_number: 15833035 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 109 1 109 109 163 100.0 6e-41 MSAQPVDIQIFGRSLRVNCPPDQRDALNQAADDLNQRLQDLKERTRVTNTEQLVFIAALN ISYELAQEKAKTRDYAASMEQRIRMLQQTIEQALLEQGRITEKTNQNFE >gi|296494713|gb|ADTN01000025.1| GENE 54 55302 - 55880 682 192 aa, chain + ## HITS:1 COG:ECs3780 KEGG:ns NR:ns ## COG: ECs3780 COG3079 # Protein_GI_number: 15833034 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 192 3 194 194 359 100.0 1e-99 MSIQNEMPGYNEMNQYLNQQGTGLTPAEMHGLISGMICGGNDDSSWLPLLHDLTNEGMAF GHELAQALRKMHSATSDALQDDGFLFQLYLPDGDDVSVFDRADALAGWVNHFLLGLGVTQ PKLDKVTGETGEAIDDLRNIAQLGYDEDEDQEELEMSLEEIIEYVRVAALLCHDTFTHPQ PTAPEVQKPTLH >gi|296494713|gb|ADTN01000025.1| GENE 55 55906 - 57231 1405 441 aa, chain + ## HITS:1 COG:pepP KEGG:ns NR:ns ## COG: pepP COG0006 # Protein_GI_number: 16130810 # Func_class: E Amino acid transport and 
metabolism # Function: Xaa-Pro aminopeptidase # Organism: Escherichia coli K12 # 1 441 1 441 441 902 100.0 0 MSEISRQEFQRRRQALVEQMQPGSAALIFAAPEVTRSADSEYPYRQNSDFWYFTGFNEPE AVLVLIKSDDTHNHSVLFNRVRDLTAEIWFGRRLGQDAAPEKLGVDRALAFSEINQQLYQ LLNGLDVVYHAQGEYAYADVIVNSALEKLRKGSRQNLTAPATMIDWRPVVHEMRLFKSPE EIAVLRRAGEITAMAHTRAMEKCRPGMFEYHLEGEIHHEFNRHGARYPSYNTIVGSGENG CILHYTENECEMRDGDLVLIDAGCEYKGYAGDITRTFPVNGKFTQAQREIYDIVLESLET SLRLYRPGTSILEVTGEVVRIMVSGLVKLGILKGDVDELIAQNAHRPFFMHGLSHWLGLD VHDVGVYGQDRSRILEPGMVLTVEPGLYIAPDAEVPEQYRGIGIRIEDDIVITETGNENL TASVVKKPEEIEALMVAARKQ >gi|296494713|gb|ADTN01000025.1| GENE 56 57228 - 58406 1030 392 aa, chain + ## HITS:1 COG:ubiH KEGG:ns NR:ns ## COG: ubiH COG0654 # Protein_GI_number: 16130809 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases # Organism: Escherichia coli K12 # 1 392 1 392 392 753 100.0 0 MSVIIVGGGMAGATLALAISRLSHGALPVHLIEATAPESHAHPGFDGRAIALAAGTCQQL ARIGVWQSLADCATAITTVHVSDRGHAGFVTLAAEDYQLAALGQVVELHNVGQRLFALLR KAPGVTLHCPDRVANVARTQSHVEVTLESGETLTGRVLVAADGTHSALATACGVDWQQEP YEQLAVIANVATSVAHEGRAFERFTQHGPLAMLPMSDGRCSLVWCHPLERREEVLSWSDE KFCRELQSAFGWRLGKITHAGKRSAYPLALTHAARSITHRTVLVGNAAQTLHPIAGQGFN LGMRDVMSLAETLTQAQERGEDMGDYGVLCRYQQRRQSDREATIGVTDSLVHLFANRWAP LVVGRNIGLMTMELFTPARDVLAQRTLGWVAR >gi|296494713|gb|ADTN01000025.1| GENE 57 58429 - 59631 1342 400 aa, chain + ## HITS:1 COG:visC KEGG:ns NR:ns ## COG: visC COG0654 # Protein_GI_number: 16130808 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases # Organism: Escherichia coli K12 # 1 400 1 400 400 809 100.0 0 MQSVDVAIVGGGMVGLAVACGLQGSGLRVAVLEQRVQEPLAANAPPQLRVSAINAASEKL LTRLGVWQDILSRRASCYHGMEVWDKDSFGHISFDDQSMGYSHLGHIVENSVIHYALWNK AHQSSDITLLAPAELQQVAWGENETFLTLKDGSMLTARLVIGADGANSWLRNKADIPLTF 
WDYQHHALVATIRTEEPHDAVARQVFHGEGILAFLPLSDPHLCSIVWSLSPEEAQRMQQA SEDEFNRALNIAFDNRLGLCKVESARQVFPLTGRYARQFASHRLALVGDAAHTIHPLAGQ GVNLGFMDAAELIAELKRLHRQGKDIGQYIYLRRYERSRKHSAALMLAGMQGFRDLFSGT NPAKKLLRDIGLKLADTLPGVKPQLIRQAMGLNDLPEWLR >gi|296494713|gb|ADTN01000025.1| GENE 58 59705 - 59788 59 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSKNTKSKNNGIRKYNAKTEVKLVYFK >gi|296494713|gb|ADTN01000025.1| GENE 59 60079 - 61173 1402 364 aa, chain + ## HITS:1 COG:gcvT KEGG:ns NR:ns ## COG: gcvT COG0404 # Protein_GI_number: 16130807 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system T protein (aminomethyltransferase) # Organism: Escherichia coli K12 # 1 364 1 364 364 746 100.0 0 MAQQTPLYEQHTLCGARMVDFHGWMMPLHYGSQIDEHHAVRTDAGMFDVSHMTIVDLRGS RTREFLRYLLANDVAKLTKSGKALYSGMLNASGGVIDDLIVYYFTEDFFRLVVNSATREK DLSWITQHAEPFGIEITVRDDLSMIAVQGPNAQAKAATLFNDAQRQAVEGMKPFFGVQAG DLFIATTGYTGEAGYEIALPNEKAADFWRALVEAGVKPCGLGARDTLRLEAGMNLYGQEM DETISPLAANMGWTIAWEPADRDFIGREALEVQREHGTEKLVGLVMTEKGVLRNELPVRF TDAQGNQHEGIITSGTFSPTLGYSIALARVPEGIGETAIVQIRNREMPVKVTKPVFVRNG KAVA >gi|296494713|gb|ADTN01000025.1| GENE 60 61197 - 61586 569 129 aa, chain + ## HITS:1 COG:ECs3775 KEGG:ns NR:ns ## COG: ECs3775 COG0509 # Protein_GI_number: 15833029 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system H protein (lipoate-binding) # Organism: Escherichia coli O157:H7 # 1 129 1 129 129 217 100.0 4e-57 MSNVPAELKYSKEHEWLRKEADGTYTVGITEHAQELLGDMVFVDLPEVGATVSAGDDCAV AESVKAASDIYAPVSGEIVAVNDALSDSPELVNSEPYAGGWIFKIKASDESELESLLDAT AYEALLEDE >gi|296494713|gb|ADTN01000025.1| GENE 61 61705 - 64578 3491 957 aa, chain + ## HITS:1 COG:ECs3774_2 KEGG:ns NR:ns ## COG: ECs3774_2 COG1003 # Protein_GI_number: 15833028 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain # Organism: Escherichia coli O157:H7 # 451 957 1 507 507 1030 100.0 0 MTQTLSQLENSGAFIERHIGPDAAQQQEMLNAVGAQSLNALTGQIVPKDIQLATPPQVGA 
PATEYAALAELKAIASRNKRFTSYIGMGYTAVQLPPVILRNMLENPGWYTAYTPYQPEVS QGRLEALLNFQQVTLDLTGLDMASASLLDEATAAAEAMAMAKRVSKLKNANRFFVASDVH PQTLDVVRTRAETFGFEVIVDDAQKVLDHQDVFGVLLQQVGTTGEIHDYTALISELKSRK IVVSVAADIMALVLLTAPGKQGADIVFGSAQRFGVPMGYGGPHAAFFAAKDEYKRSMPGR IIGVSKDAAGNTALRMAMQTREQHIRREKANSNICTSQVLLANIASLYAVYHGPVGLKRI ANRIHRLTDILAAGLQQKGLKLRHAHYFDTLCVEVADKAGVLTRAEAAEINLRSDILNAV GITLDETTTRENVMQLFNVLLGDNHGLDIDTLDKDVAHDSRSIQPAMLRDDEILTHPVFN RYHSETEMMRYMHSLERKDLALNQAMIPLGSCTMKLNAAAEMIPITWPEFAELHPFCPPE QAEGYQQMIAQLADWLVKLTGYDAVCMQPNSGAQGEYAGLLAIRHYHESRNEGHRDICLI PASAHGTNPASAHMAGMQVVVVACDKNGNIDLTDLRAKAEQAGDNLSCIMVTYPSTHGVY EETIREVCEVVHQFGGQVYLDGANMNAQVGITSPGFIGADVSHLNLHKTFCIPHGGGGPG MGPIGVKAHLAPFVPGHSVVQIEGMLTRQGAVSAAPFGSASILPISWMYIRMMGAEGLKK ASQVAILNANYIASRLQDAFPVLYTGRDGRVAHECILDIRPLKEETGISELDIAKRLIDY GFHAPTMSFPVAGTLMVEPTESESKVELDRFIDAMLAIRAEIDQVKAGVWPLEDNPLVNA PHIQSELVAEWAHPYSREVAVFPAGVADKYWPTVKRLDDVYGDRNLFCSCVPISEYQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:08:47 2011 Seq name: gi|296494712|gb|ADTN01000026.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont69.4, whole genome shotgun sequence Length of sequence - 35991 bp Number of predicted genes - 31, with homology - 29 Number of transcription units - 18, operones - 9 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 100 - 843 253 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 2 2 Tu 1 . - CDS 900 - 2333 1660 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase - Prom 2363 - 2422 3.3 + Prom 2085 - 2144 3.3 3 3 Tu 1 . + CDS 2378 - 2689 402 ## COG3097 Uncharacterized protein conserved in bacteria + Prom 2719 - 2778 3.5 4 4 Tu 1 . + CDS 2853 - 3512 684 ## COG1272 Predicted membrane protein, hemolysin III homolog + Term 3668 - 3716 3.1 5 5 Tu 1 . - CDS 3708 - 4688 1125 ## COG0354 Predicted aminomethyltransferase related to GcvT - Prom 4721 - 4780 3.1 + Prom 4639 - 4698 3.7 6 6 Op 1 . 
+ CDS 4931 - 5197 319 ## COG2938 Uncharacterized conserved protein 7 6 Op 2 . + CDS 5178 - 5585 270 ## SSON_3049 hypothetical protein + Term 5616 - 5666 2.6 - Term 5576 - 5618 1.2 8 7 Tu 1 . - CDS 5625 - 6146 751 ## COG0716 Flavodoxins - Prom 6178 - 6237 3.4 + Prom 6175 - 6234 4.9 9 8 Op 1 8/0.000 + CDS 6258 - 7154 814 ## COG4974 Site-specific recombinase XerD 10 8 Op 2 7/0.111 + CDS 7194 - 7889 748 ## COG1651 Protein-disulfide isomerase 11 8 Op 3 . + CDS 7895 - 9628 1869 ## COG0608 Single-stranded DNA-specific exonuclease + Term 9652 - 9686 2.6 12 9 Tu 1 . + CDS 9719 - 9796 83 ## 13 10 Op 1 8/0.000 + CDS 9936 - 10817 995 ## COG1186 Protein chain release factor B 14 10 Op 2 . + CDS 10827 - 12344 2050 ## COG1190 Lysyl-tRNA synthetase (class II) + Term 12365 - 12395 2.1 15 11 Tu 1 . - CDS 12388 - 12936 458 ## COG1443 Isopentenyldiphosphate isomerase - Prom 12961 - 13020 3.6 - Term 13009 - 13037 -0.3 16 12 Op 1 . - CDS 13059 - 13184 110 ## 17 12 Op 2 . - CDS 13186 - 14634 375 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 + Prom 14932 - 14991 6.0 18 13 Op 1 3/0.667 + CDS 15055 - 16989 1064 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases 19 13 Op 2 . 
+ CDS 16989 - 17477 251 ## COG1142 Fe-S-cluster-containing hydrogenase components 2 + Term 17484 - 17520 6.7 - Term 17470 - 17507 6.1 20 14 Op 1 2/0.889 - CDS 17513 - 18880 1213 ## COG2252 Permeases 21 14 Op 2 4/0.556 - CDS 18916 - 20232 1422 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases 22 14 Op 3 2/0.889 - CDS 20250 - 21650 292 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 - Term 21778 - 21805 1.5 23 15 Op 1 12/0.000 - CDS 21815 - 24685 2930 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 24 15 Op 2 2/0.889 - CDS 24682 - 25461 833 ## COG1319 Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs 25 15 Op 3 1/1.000 - CDS 25512 - 26840 1374 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases 26 15 Op 4 3/0.667 - CDS 26843 - 29941 2691 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases - Prom 30044 - 30103 10.7 27 16 Tu 1 . - CDS 30263 - 30841 146 ## COG2068 Uncharacterized MobA-related protein - Prom 30943 - 31002 6.3 28 17 Op 1 . + CDS 31112 - 31714 229 ## ECIAI1_2996 hypothetical protein 29 17 Op 2 . + CDS 31762 - 33387 1401 ## COG1975 Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family + Term 33481 - 33518 -0.1 - Term 33559 - 33607 2.7 30 18 Op 1 . - CDS 33608 - 34540 1230 ## COG0549 Carbamate kinase 31 18 Op 2 . 
- CDS 34588 - 35973 1072 ## COG0044 Dihydroorotase and related cyclic amidohydrolases Predicted protein(s) >gi|296494712|gb|ADTN01000026.1| GENE 1 100 - 843 253 247 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 5 247 7 240 242 102 31 4e-21 MAIALVTGGSRGIGRATALLLAQEGYTVAVNYQQNLHAAQEVMNLITQAGGKAFVLQADI SDENQVVAMFTAIDQHDEPLAALVNNAGILFTQCTVENLTAERINRVLSTNVTGYFLCCR EAVKRMALKNGGSGGAIVNVSSVASRLGSPGEYVDYAASKGAIDTLTTGLSLEVAAQGIR VNCVRPGFIYTEMHASGGEPGRVDRVKSNIPMQRGGQAEEVAQAIVWLLSDKASYVTGSF IDLAGGK >gi|296494712|gb|ADTN01000026.1| GENE 2 900 - 2333 1660 477 aa, chain - ## HITS:1 COG:bglA KEGG:ns NR:ns ## COG: bglA COG2723 # Protein_GI_number: 16130803 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Escherichia coli K12 # 1 477 3 479 479 1006 99.0 0 MKKLTLPKDFLWGGAVAAHQVEGGWNKGGKGPSICDVLTGGAHGVPREITKEVLPGKYYP NHEAVDFYGHYKEDIKLFAEMGFKCFRTSIAWTRIFPKGDEAQPNEEGLKFYDDMFDELL KYNIEPVITLSHFEMPLHLVQQYGSWTNRKVVDFFVRFAEVVFERYKHKVKYWMTFNEIN NQRNWRAPLFGYCCSGVVYTEHENPEETMYQVLHHQFVASALAVKAARRINPEMKVGCML AMVPLYPYSCNPDDVMFAQESMRERYVFTDVQLRGYYPSYVLNEWERRGFNIKMEDGDLD VLREGTCDYLGFSYYMTNAVKAEGGTGDAISGFEGSVPNPYVKASDWGWQIDPVGLRYAL CELYERYQRPLFIVENGFGAYDKVEEDGSINDDYRIDYLRAHIEEMKKAVTYDGVDLMGY TPWGCIDCVSFTTGQYSKRYGFIYVNKHDDGTGDMSRSRKKSFNWYKEVIASNGEKL >gi|296494712|gb|ADTN01000026.1| GENE 3 2378 - 2689 402 103 aa, chain + ## HITS:1 COG:ECs3772 KEGG:ns NR:ns ## COG: ECs3772 COG3097 # Protein_GI_number: 15833026 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 103 1 103 103 187 100.0 4e-48 MQPNDITFFQRFQDDILAGRKTITIRDESESHFKTGDVLRVGRFEDDGYFCTIEVTATST VTLDTLTEKHAEQENMTLTELKKVIADIYPGQTQFYVIEFKCL >gi|296494712|gb|ADTN01000026.1| GENE 4 2853 - 3512 684 219 aa, chain + ## HITS:1 COG:ECs3771 KEGG:ns NR:ns ## COG: ECs3771 COG1272 # Protein_GI_number: 
15833025 # Func_class: R General function prediction only # Function: Predicted membrane protein, hemolysin III homolog # Organism: Escherichia coli O157:H7 # 1 219 1 219 219 357 100.0 1e-98 MVQKPLIKQGYSLAEEIANSVSHGIGLVFGIVGLVLLLVQAVDLNASATAITSYSLYGGS MILLFLASTLYHAIPHQRAKMWLKKFDHCAIYLLIAGTYTPFLLVGLDSPLARGLMIVIW SLALLGILFKLTIAHRFKILSLVTYLAMGWLSLVVIYEMAVKLAAGSVTLLAVGGVVYSL GVIFYVCKRIPYNHAIWHGFVLGGSVCHFLAIYLYIGQA >gi|296494712|gb|ADTN01000026.1| GENE 5 3708 - 4688 1125 326 aa, chain - ## HITS:1 COG:ygfZ KEGG:ns NR:ns ## COG: ygfZ COG0354 # Protein_GI_number: 16130800 # Func_class: R General function prediction only # Function: Predicted aminomethyltransferase related to GcvT # Organism: Escherichia coli K12 # 1 326 1 326 326 627 100.0 1e-179 MAFTPFPPRQPTASARLPLTLMTLDDWALATITGADSEKYMQGQVTADVSQMAEDQHLLA AHCDAKGKMWSNLRLFRDGDGFAWIERRSVREPQLTELKKYAVFSKVTIAPDDERVLLGV AGFQARAALANLFSELPSKEKQVVKEGATTLLWFEHPAERFLIVTDEATANMLTDKLRGE AELNNSQQWLALNIEAGFPVIDAANSGQFIPQATNLQALGGISFKKGCYTGQEMVARAKF RGANKRALWLLAGSASRLPEAGEDLELKMGENWRRTGTVLAAVKLEDGQVVVQVVMNNDM EPDSIFRVRDDANTLHIEPLPYSLEE >gi|296494712|gb|ADTN01000026.1| GENE 6 4931 - 5197 319 88 aa, chain + ## HITS:1 COG:ECs3769 KEGG:ns NR:ns ## COG: ECs3769 COG2938 # Protein_GI_number: 15833023 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 88 1 88 88 171 100.0 4e-43 MDINNKARIHWACRRGMRELDISIMPFFEHEYDSLSDDEKRIFIRLLECDDPDLFNWLMN HGKPADAELEMMVRLIQTRNRERGPVAI >gi|296494712|gb|ADTN01000026.1| GENE 7 5178 - 5585 270 135 aa, chain + ## HITS:1 COG:no KEGG:SSON_3049 NR:ns ## KEGG: SSON_3049 # Name: not_defined # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 135 1 135 135 212 100.0 3e-54 MVLWQSDLRVSWRAQWLSLLIHGLVAAVILLMPWPLSYTPLWMVLLSLVVFDCVRSQRRI NARQGEIRLLMDGRLRWQGQEWSIVKAPWMIKSGMMLRLRSDGGKRQHLWLAADSMDEAE WRDLRRILLQQETQR >gi|296494712|gb|ADTN01000026.1| GENE 8 5625 - 6146 751 173 aa, chain - ## HITS:1 COG:ECs3767 KEGG:ns NR:ns ## COG: ECs3767 
COG0716 # Protein_GI_number: 15833021 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Escherichia coli O157:H7 # 1 173 1 173 173 342 100.0 2e-94 MNMGLFYGSSTCYTEMAAEKIRDIIGPELVTLHNLKDDSPKLMEQYDVLILGIPTWDFGE IQEDWEAVWDQLDDLNLEGKIVALYGLGDQLGYGEWFLDALGMLHDKLSTKGVKFVGYWP TEGYEFTSPKPVIADGQLFVGLALDETNQYDLSDERIQSWCEQILNEMAEHYA >gi|296494712|gb|ADTN01000026.1| GENE 9 6258 - 7154 814 298 aa, chain + ## HITS:1 COG:xerD KEGG:ns NR:ns ## COG: xerD COG4974 # Protein_GI_number: 16130796 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Escherichia coli K12 # 1 298 1 298 298 548 100.0 1e-156 MKQDLARIEQFLDALWLEKNLAENTLNAYRRDLSMMVEWLHHRGLTLATAQSDDLQALLA ERLEGGYKATSSARLLSAVRRLFQYLYREKFREDDPSAHLASPKLPQRLPKDLSEAQVER LLQAPLIDQPLELRDKAMLEVLYATGLRVSELVGLTMSDISLRQGVVRVIGKGNKERLVP LGEEAVYWLETYLEHGRPWLLNGVSIDVLFPSQRAQQMTRQTFWHRIKHYAVLAGIDSEK LSPHVLRHAFATHLLNHGADLRVVQMLLGHSDLSTTQIYTHVATERLRQLHQQHHPRA >gi|296494712|gb|ADTN01000026.1| GENE 10 7194 - 7889 748 231 aa, chain + ## HITS:1 COG:ECs3765 KEGG:ns NR:ns ## COG: ECs3765 COG1651 # Protein_GI_number: 15833019 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Protein-disulfide isomerase # Organism: Escherichia coli O157:H7 # 1 231 6 236 236 459 100.0 1e-129 MLFTLLAAFSGFAQADDAAIQQTLAKMGIKSSDIQPAPVAGMKTVLTNSGVLYITDDGKH IIQGPMYDVSGTAPVNVTNKMLLKQLNALEKEMIVYKAPQEKHVITVFTDITCGYCHKLH EQMADYNALGITVRYLAFPRQGLDSDAEKEMKAIWCAKDKNKAFDDVMAGKSVAPASCDV DIADHYALGVQLGVSGTPAVVLSNGTLVPGYQPPKEMKEFLDEHQKMTSGK >gi|296494712|gb|ADTN01000026.1| GENE 11 7895 - 9628 1869 577 aa, chain + ## HITS:1 COG:recJ KEGG:ns NR:ns ## COG: recJ COG0608 # Protein_GI_number: 16130794 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-specific exonuclease # Organism: Escherichia coli K12 # 1 577 1 577 577 1157 100.0 0 MKQQIQLRRREVDETADLPAELPPLLRRLYASRGVRSAQELERSVKGMLPWQQLSGVEKA 
VEILYNAFREGTRIIVVGDFDADGATSTALSVLAMRSLGCSNIDYLVPNRFEDGYGLSPE VVDQAHARGAQLIVTVDNGISSHAGVEHARSLGIPVIVTDHHLPGDTLPAAEAIINPNLR DCNFPSKSLAGVGVAFYLMLALRTFLRDQGWFDERNIAIPNLAELLDLVALGTVADVVPL DANNRILTWQGMSRIRAGKCRPGIKALLEVANRDAQKLAASDLGFALGPRLNAAGRLDDM SVGVALLLCDNIGEARVLANELDALNQTRKEIEQGMQIEALTLCEKLERSRDTLPGGLAM YHPEWHQGVVGILASRIKERFHRPVIAFAPAGDGTLKGSGRSIQGLHMRDALERLDTLYP GMMLKFGGHAMAAGLSLEEDKFKLFQQRFGELVTEWLDPSLLQGEVVSDGPLSPAEMTME VAQLLRDAGPWGQMFPEPLFDGHFRLLQQRLVGERHLKVMVEPVGGGPLLDGIAFNVDTA LWPDNGVREVQLAYKLDINEFRGNRSLQIIIDNIWPI >gi|296494712|gb|ADTN01000026.1| GENE 12 9719 - 9796 83 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFEINPVNNRIQDLTERSDVLRGYL >gi|296494712|gb|ADTN01000026.1| GENE 13 9936 - 10817 995 293 aa, chain + ## HITS:1 COG:ECs3763 KEGG:ns NR:ns ## COG: ECs3763 COG1186 # Protein_GI_number: 15833017 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Escherichia coli O157:H7 # 1 293 73 365 365 542 100.0 1e-154 MKQGLEDVSGLLELAVEADDEETFNEAVAELDALEEKLAQLEFRRMFSGEYDSADCYLDI QAGSGGTEAQDWASMLERMYLRWAESRGFKTEIIEESEGEVAGIKSVTIKISGDYAYGWL RTETGVHRLVRKSPFDSGGRRHTSFSSAFVYPEVDDDIDIEINPADLRIDVYRASGAGGQ HVNRTESAVRITHIPTGIVTQCQNDRSQHKNKDQAMKQMKAKLYELEMQKKNAEKQAMED NKSDIGWGSQIRSYVLDDSRIKDLRTGVETRNTQAVLDGSLDQFIEASLKAGL >gi|296494712|gb|ADTN01000026.1| GENE 14 10827 - 12344 2050 505 aa, chain + ## HITS:1 COG:lysS KEGG:ns NR:ns ## COG: lysS COG1190 # Protein_GI_number: 16130792 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA synthetase (class II) # Organism: Escherichia coli K12 # 1 505 1 505 505 1004 100.0 0 MSEQHAQGADAVVDLNNELKTRREKLANLREQGIAFPNDFRRDHTSDQLHAEFDGKENEE LEALNIEVAVAGRMMTRRIMGKASFVTLQDVGGRIQLYVARDDLPEGVYNEQFKKWDLGD ILGAKGKLFKTKTGELSIHCTELRLLTKALRPLPDKFHGLQDQEARYRQRYLDLISNDES RNTFKVRSQILSGIRQFMVNRGFMEVETPMMQVIPGGAAARPFITHHNALDLDMYLRIAP ELYLKRLVVGGFERVFEINRNFRNEGISVRHNPEFTMMELYMAYADYKDLIELTESLFRT LAQDILGKTEVTYGDVTLDFGKPFEKLTMREAIKKYRPETDMADLDNFDSAKAIAESIGI 
HVEKSWGLGRIVTEIFEEVAEAHLIQPTFITEYPAEVSPLARRNDVNPEITDRFEFFIGG REIGNGFSELNDAEDQAQRFLDQVAAKDAGDDEAMFYDEDYVTALEHGLPPTAGLGIGID RMVMLFTNSHTIRDVILFPAMRPVK >gi|296494712|gb|ADTN01000026.1| GENE 15 12388 - 12936 458 182 aa, chain - ## HITS:1 COG:idi KEGG:ns NR:ns ## COG: idi COG1443 # Protein_GI_number: 16130791 # Func_class: I Lipid transport and metabolism # Function: Isopentenyldiphosphate isomerase # Organism: Escherichia coli K12 # 1 182 1 182 182 377 100.0 1e-105 MQTEHVILLNAQGVPTGTLEKYAAHTADTRLHLAFSSWLFNAKGQLLVTRRALSKKAWPG VWTNSVCGHPQLGESNEDAVIRRCRYELGVEITPPESIYPDFRYRATDPSGIVENEVCPV FAARTTSALQINDDEVMDYQWCDLADVLHGIDATPWAFSPWMVMQATNREARKRLSAFTQ LK >gi|296494712|gb|ADTN01000026.1| GENE 16 13059 - 13184 110 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNFLMRAIFSLLLLFTLSIPVISDCVAMAIESRFKYMMLLF >gi|296494712|gb|ADTN01000026.1| GENE 17 13186 - 14634 375 482 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 29 440 15 416 447 149 27 3e-35 MSAIDSQLPSSSGQDRPTDEVDRILSPGKLIILGLQHVLVMYAGAVAVPLMIGDRLGLSK EAIAMLISSDLFCCGIVTLLQCIGIGRFMGIRLPVIMSVTFAAVTPMIAIGMNPDIGLLG IFGATIAAGFITTLLAPLIGRLMPLFPPLVTGVVITSIGLSIIQVGIDWAAGGKGNPQYG NPVYLGISFAVLIFILLITRYAKGFMSNVAVLLGIVFGFLLSWMMNEVNLSGLHDASWFA IVTPMSFGMPIFDPVSILTMTAVLIIVFIESMGMFLALGEIVGRKLSSHDIIRGLRVDGV GTMIGGTFNSFPHTSFSQNVGLVSVTRVHSRWVCISSGIILILFGMVPKMAVLVASIPQF VLGGAGLVMFGMVLATGIRILSRCNYTTNRYNLYIVAISLGVGMTPTLSHDFFSKLPAVL QPLLHSGIMLATLSAVVLNVFFNGYQHHADLVKESVSDKDLKVRTVRMWLLMRKLKKNEH GE >gi|296494712|gb|ADTN01000026.1| GENE 18 15055 - 16989 1064 644 aa, chain + ## HITS:1 COG:ECs3759_2 KEGG:ns NR:ns ## COG: ECs3759_2 COG0493 # Protein_GI_number: 15833013 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Escherichia coli O157:H7 # 158 644 1 487 487 1000 99.0 0 MKGMQMNKFIAAEAAECIGCHACEIACAVAHNQENWPLSHSDFRPRIHVVGKGQAANPVA 
CHHCNNAPCVTACPVNALTFQSDSVQLDEQKCIGCKRCAIACPFGVVEMVDTIAQKCDLC NQRSSGTQACIEVCPTQALRLMDDKGLQQIKVARQRKTAAGKASSDAQPSRSAALLPVNS RKGADKISASERKTHFGEIYCGLDPQQATYESDRCVYCAEKANCNWHCPLHNAIPDYIRL VQEGKIIEAAELCHQTSSLPEICGRVCPQDRLCEGACTLKDHSGAVSIGNLERYITDTAL AMGWRPDVSKVVPRSEKVAVIGAGPAGLGCADILARAGVQVDVFDRHPEIGGMLTFGIPP FKLDKTVLSQRREIFTAMGIDFHLNCEIGRDITFSDLTSEYDAVFIGVGTYGMMRADLPH EDAPGVIQALPFLTAHTRQLMGLPESEEYPLTDVEGKRVVVLGGGDTTMDCLRTSIRLNA ASVTCAYRRDEVSMPGSRKEVVNAREEGVEFQFNVQPQYIACDEDGRLTAVGLIRTAMGE PGPDGRRRPRPVAGSEFELPADVLIMAFGFQAHAMPWLQGSGIKLDKWGLIQTGDVGYLP TQTHLKKVFAGGDAVHGADLVVTAMAAGRQAARDMLTLFDTKAS >gi|296494712|gb|ADTN01000026.1| GENE 19 16989 - 17477 251 162 aa, chain + ## HITS:1 COG:ygfS KEGG:ns NR:ns ## COG: ygfS COG1142 # Protein_GI_number: 16130788 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 2 # Organism: Escherichia coli K12 # 1 162 2 163 163 267 100.0 5e-72 MKSLIIVNPADCIGCRTCEVACVVAHPSEQELNADVFLPRLKVQRLDSISAPVMCHQCEN APCVGACPVGALTMGEQVVQTNSARCIGCQSCVSACPFGMITIQSLPGDTRQQIVKCDLC EQREEGPACVESCPTQALQLLTERELRRVRQQRIVASGENPL >gi|296494712|gb|ADTN01000026.1| GENE 20 17513 - 18880 1213 455 aa, chain - ## HITS:1 COG:ECs3757 KEGG:ns NR:ns ## COG: ECs3757 COG2252 # Protein_GI_number: 15833011 # Func_class: R General function prediction only # Function: Permeases # Organism: Escherichia coli O157:H7 # 1 455 1 455 455 709 99.0 0 MSGDILQTPDAPKPQGALDNYFKITARGSTVRQEVLAGLTTFLAMVYSVIVVPGMLGKAG FPPAAVFVATCLVAGFGSLLMGLWANLPMAIGCAISLTAFTAFSLVLGQQISVPVALGAV FLMGVIFTAISVTGVRTWILRNLPMGIAHGTGIGIGLFLLLIAANGVGMVIKNPIEGLPV ALGAFTSFPVMMSLLGLAVIFGLEKCRVPGGILLGVILISIIGLIFDPAVKYHGLVAMPS LTGEDGKSLIFSLDIMGALQPTVLPSVLALVMTAVFDATGTIRAVAGQANLLDKDNQIIN GGKALTSDSVSSIFSGLVGAAPAAVYIESAAGTAAGGKTGLTATVVGALFLLILFLSPLS FLIPGYATAPALMYVGLLMLSNVSKLDFNDFIDAMAGLVCAVFIVLTCNIVTGIMLGFVT LVVGRVFAREWQKLNIGTVIITAALVAFYAGGWAI >gi|296494712|gb|ADTN01000026.1| GENE 21 18916 - 20232 1422 438 aa, chain - ## HITS:1 COG:ygfP KEGG:ns NR:ns ## COG: ygfP COG0402 # 
Protein_GI_number: 16130785 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Escherichia coli K12 # 1 438 2 439 439 919 100.0 0 MSGEHTLKAVRGSFIDVTRTIDNPEEIASALRFIEDGLLLIKQGKVEWFGEWENGKHQIP DTIRVRDYRGKLIVPGFVDTHIHYPQSEMVGAYGEQLLEWLNKHTFPTERRYEDLEYARE MSAFFIKQLLRNGTTTALVFGTVHPQSVDALFEAASHINMRMIAGKVMMDRNAPDYLLDT AESSYHQSKELIERWHKNGRLLYAITPRFAPTSSPEQMAMAQRLKEEYPDTWVHTHLCEN KDEIAWVKSLYPDHDGYLDVYHQYGLTGKNCVFAHCVHLEEKEWDRLSETKSSIAFCPTS NLYLGSGLFNLKKAWQKKVKVGMGTDIGAGTTFNMLQTLNEAYKVLQLQGYRLSAYEAFY LATLGGAKSLGLDDLIGNFLPGKEADFVVMEPTATPLQQLRYDNSVSLVDKLFVMMTLGD DRSIYRTYVDGRLVYERN >gi|296494712|gb|ADTN01000026.1| GENE 22 20250 - 21650 292 466 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 45 453 27 425 447 117 25 1e-25 MSDINHAGSDLIFELEDRPPFHQALVGAITHLLAIFVPMVTPALIVGAALQLSAETTAYL VSMAMIASGIGTWLQVNRYGIVGSGLLSIQSVNFSFVTVMIALGSSMKSDGFHEELIMSS LLGVSFVGAFLVVGSSFILPYLRRVITPTVSGIVVLMIGLSLIKVGIIDFGGGFAAKSSG TFGNYEHLGVGLLVLIVVIGFNCCRSPLLRMGGIAIGLCVGYIASLCLGMVDFSSMRNLP LITIPHPFKYGFSFSFHQFLVVGTIYLLSVLEAVGDITATAMVSRRPIQGEEYQSRLKGG VLADGLVSVIASAVGSLPLTTFAQNNGVIQMTGVASRYVGRTIAVMLVILGLFPMIGGFF TTIPSAVLGGAMTLMFSMIAIAGIRIIITNGLKRRETLIVATSLGLGLGVSYDPEIFKIL PASIYVLVENPICAGGLTAILLNIILPGGYRQENVLPGITSAEEMD >gi|296494712|gb|ADTN01000026.1| GENE 23 21815 - 24685 2930 956 aa, chain - ## HITS:1 COG:ECs3754_2 KEGG:ns NR:ns ## COG: ECs3754_2 COG1529 # Protein_GI_number: 15833008 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Escherichia coli O157:H7 # 160 956 1 797 797 1642 99.0 0 MIIHFTLNGAPQELTVNPGENVQKLLFNMGMHSVRNSDDGFGFAGSDAIIFNGNIVNASL LIAAQLEKADIRTAESLGKWNELSLVQQAMVDVGVVQSGYNDPAAALIITDLLDRIAAPT REEIDDALSGLFSRDAGWQQYYQVIELAVARKNNPQATIDIAPTFRDDLEVIGKHYPKTD 
AAKMVQAKPCYVEDRVTADACVIKMLRSPHAHALITHLDVSKAEALPGVVHVITHLNCPD IYYTPGGQSAPEPSPLDRRMFGKKMRHVGDRVAAVVAESEEIALEALKLIDVEYEVLKPV MSIDEAMAEDAPVVHDEPVVYVAGAPDTLEDDNSHAAQRGEHMIINFPIGSRPRKNIAAS IHGHIGDMDKGFADADVIIERTYNSTQAQQCPTETHICFTRMDGDRLVIHASTQVPWHLR RQVARLVGMKQHKVHVIKERVGGGFGSKQDILLEEVCAWATCVTGRPVLFRYTREEEFIA NTSRHVAKVTVKLGAKKDGRLTAVKMDFRANTGPYGNHSLTVPCNGPALSLPLYPCDNVD FQVTTYYSNICPNGAYQGYGAPKGNFAITMALAELAEQLQIDQLEIIERNRVHEGQELKI LGAIGEGKAPTSVPSAASCALEEILRQGREMIQWSSPKPQNGDWHIGRGVAIIMQKSGIP DIDQANCMIKLESDGTFIVHSGGADIGTGLDTVVTKLAAEVLHCPPQDVHVISGDTDHAL FDKGAYASSGTCFSGNAARLAAENLREKILFHGAQMLGEPVADVQLATPGVVRGKKGEVS FGDIAHKGETGTGFGSLVGTGSYITPDFAFPYGANFAEVAVNTRTGEIRLDKFYALLDCG TPVNPELALGQIYGATLRAIGHSMSEEIIYDAEGHPLTRDLRSYGAPKIGDIPRDFRAVL VPSDDKVGPFGAKSISEIGVNGAAPAIATAIHDACGIWLREWHFTPEKILTALEKI >gi|296494712|gb|ADTN01000026.1| GENE 24 24682 - 25461 833 259 aa, chain - ## HITS:1 COG:ECs3753 KEGG:ns NR:ns ## COG: ECs3753 COG1319 # Protein_GI_number: 15833007 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs # Organism: Escherichia coli O157:H7 # 1 259 1 259 259 504 100.0 1e-143 MIEQFFRPDSVEQALELKRRYQDEAVWFAGGSKLNATPTRTDKKIAISLQDLELDWVDWD NGALRIGAMSRLQPLRDARFIPAALREALGFVYSRHVRNQSTIGGEIAARQEESVLLPVL LALDAELVFGNGETLSIEDYLACPCDRLLTEIIIKDPYRTCATRKISRSQAGLTVVTAAV AMTDHDGMRIALDGVASKALRLHDVEKQNLEGNALEQAVANAIFPQEDLRGSVAYKRYIT GVLVADLYADCQQAGEEAV >gi|296494712|gb|ADTN01000026.1| GENE 25 25512 - 26840 1374 442 aa, chain - ## HITS:1 COG:ssnA KEGG:ns NR:ns ## COG: ssnA COG0402 # Protein_GI_number: 16130781 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Escherichia coli K12 # 1 442 23 464 464 916 100.0 0 MLILKNVTAVQLHPAKVQEGVDIAIENDVIVAIGDALTQRYPDASFKEMHGRIVMPGIVC SHNHFYSGLSRGIMANIAPCPDFISTLKNLWWRLDRALDEESLYYSGLICSLEAIKSGCT SVIDHHASPAYIGGSLSTLRDAFLKVGLRAMTCFETTDRNNGIKELQEGVEENIRFARLI 
DEAKKATSEPYLVEAHIGAHAPFTVPDAGLEMLREAVKATGRGLHIHAAEDLYDVSYSHH WYGKDLLARLAQFDLIDSKTLVAHGLYLSKDDITLLNQRDAFLVHNARSNMNNHVGYNHH LSDIRNLALGTDGIGSDMFEEMKFAFFKHRDAGGPLWPDSFAKALTNGNELMSRNFGAKF GLLEAGYKADLTICDYNSPTPLLADNIAGHIAFGMGSGSVHSVMVNGVMVYEDRQFNFDC DSIYAQARKAAASMWRRMDALA >gi|296494712|gb|ADTN01000026.1| GENE 26 26843 - 29941 2691 1032 aa, chain - ## HITS:1 COG:ygfK_2 KEGG:ns NR:ns ## COG: ygfK_2 COG0493 # Protein_GI_number: 16130780 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Escherichia coli K12 # 451 1032 1 582 582 1235 100.0 0 MGDIMRPIPFEELLTRIFDEYQQQRSIFGIPEQQFYSPVKGKTVSVFGETCATPVGPAAG PHTQLAQNIVTSWLTGGRFIELKTVQILDRLELEKPCIDAEDECFNTEWSTEFTLLKAWD EYLKAWFALHLLEAMFQPSDSGKSFIFNMSVGYNLEGIKQPPMQQFIDNMMDASDHPKFA QYRDTLNKLLQDDAFLARHGLQEKRESLQALPARIPTSMVHGVTLSTMHGCPPHEIEAIC RYMLEEKGLNTFVKLNPTLLGYARVREILDVCGFGYIGLKEESFDHDLKLTQALEMLERL MALAKEKSLGFGVKLTNTLGTINNKGALPGEEMYMSGRALFPLSINVAAVLSRAFDGKLP ISYSGGASQLTIRDIFDTGIRPITMATDLLKPGGYLRLSACMRELEGSDAWGLDHVDVER LNRLAADALTMEYTQKHWKPEERIEVAEDLPLTDCYVAPCVTACAIKQDIPEYIRLLGEH RYADALELIYQRNALPAITGHICDHQCQYNCTRLDYDSALNIRELKKVALEKGWDEYKQR WHKPAGSGSRHPVAVIGAGPAGLAAGYFLARAGHPVTLFEREANAGGVVKNIIPQFRIPA ELIQHDIDFVAAHGVKFEYGCSPDLTIEQLKNQGFHYVLIATGTDKNSGVKLAGDNQNVW KSLPFLREYNKGTALKLGKHVVVVGAGNTAMDCARAALRVPGVEKATIVYRRSLQEMPAW REEYEEALHDGVEFRFLNNPERFDADGTLTLRVMSLGEPDEKGRRRPVETNETVTLLVDS LITAIGEQQDTEALNAMGVPLDKNGWPDVDHNGETRLTDVFMIGDVQRGPSSIVAAVGTA RRATDAILSRENIRSHQNDKYWNNVNPAEIYQRKGDISITLVNSDDRDAFVAQEAARCLE CNYVCSKCVDVCPNRANVSIAVPGFQNRFQTLHLDAYCNECGNCAQFCPWNGKPYKDKIT VFSLAQDFDNSSNPGFLVEDCRVRVRLNNQSWVLNIDSKGQFNNVPPELNDMCRIISHVH QHHHYLLGRVEV >gi|296494712|gb|ADTN01000026.1| GENE 27 30263 - 30841 146 192 aa, chain - ## HITS:1 COG:ygfJ KEGG:ns NR:ns ## COG: ygfJ COG2068 # Protein_GI_number: 16130779 # Func_class: R General function prediction only # Function: Uncharacterized MobA-related protein # Organism: 
Escherichia coli K12 # 1 192 1 192 192 397 100.0 1e-111 MSAIDCIITAAGLSSRMGQWKMMLPWEQGTILDTSIKNALQFCSRIILVTGYRGNELHER YANQSNITIIHNPDYAQGLLTSVKAAVPAVQTEHCFLTHGDMPTLTIDIFRKIWSLRNDG AILPLHNGIPGHPILVSKPCLMQAIQRPNVTNMRQALLMGDHYSVEIENAEIILDIDTPD DFITAKERYTEI >gi|296494712|gb|ADTN01000026.1| GENE 28 31112 - 31714 229 200 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_2996 NR:ns ## KEGG: ECIAI1_2996 # Name: yqeC # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 200 57 256 256 419 100.0 1e-116 MFMPTSHWPVVFCRDPAMLPHASLTSPISFCFHSWKANQGKVQGFTPEAIDALVQRPECD VILIEADGSRGMPLKAPDEHEPCIPKSSCCVIAVMGGHTLGAKVSTENVHRWSQFADITG LTPDATLQLSDLVALVRHPQGAFKNVPQGCRRVWFINRFSQCENAIAQSELLQPLQQHDV EAIWLGDIQEHPAIARRFVN >gi|296494712|gb|ADTN01000026.1| GENE 29 31762 - 33387 1401 541 aa, chain + ## HITS:1 COG:yqeB KEGG:ns NR:ns ## COG: yqeB COG1975 # Protein_GI_number: 16130777 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family # Organism: Escherichia coli K12 # 1 541 1 541 541 1048 100.0 0 MNIFTEAAKLEEQNCPFAMAQIVDSRGSTPRHSAQMLVRADGSIVGTIGGGMVERKVIEE SLQALQERKPRLFHGRMARNGADAVGSDCGGAMSVFISVHGMRPRLVLIGAGHVNRAIAQ SAALLGFDIAVADIYRESLNPELFPPSTTLLHAESFGAAVEALDIRPDNFVLIATNNQDR EALDKLIEQPIAWLGLLASRRKVQLFLRQLREKGVAEEHIARLHAPVGYNIGAETPQEIA ISVLAEILQVKNNAPGGLMMKPSHPSGHQLVVIRGAGDIASGVALRLYHAGFKVIMLEVE KPTVIRCTVAFAQAVFDGEMTVEGVTARLATSSAEAMKLTERGFIPVMVDPACSLLDELK PLCVVDAILAKQNLGTRADMAPVTIALGPGFTAGKDCHAVIETNRGHWLGQVIYSGCAQE NTGVPGNIMGHTTRRVIRAPAAGIMRSNVKLGDLVKEGDVIAWIGEHEIKAPLTGMVRGL LNDGLAVVGGFKIGDIDPRGETADFTSVSDKARAIGGGVLEALMMLMHQGVKATKEVLEV A >gi|296494712|gb|ADTN01000026.1| GENE 30 33608 - 34540 1230 310 aa, chain - ## HITS:1 COG:yqeA KEGG:ns NR:ns ## COG: yqeA COG0549 # Protein_GI_number: 16130776 # Func_class: E Amino acid transport and metabolism # Function: Carbamate kinase # Organism: Escherichia coli K12 # 1 310 1 310 310 564 100.0 1e-161 
MSKKIVLALGGNALGDDLAGQMKAVKITSQAIVDLIAQGHEVIVTHGNGPQVGMINQAFE AAAKTEAHSPMLPMSVCVALSQGYIGYDLQNALREELLSRGINKPVATLVTQVEVDANDP AFLNPTKPIGSFFTEQEAEQLTKQGYTLKEDAGRGYRRVVASPKPVDIIEKETVKALVDA GQVVITVGGGGIPVIREGNHLRGASAVIDKDWASARLAEMIDADMLIILTAVEKVAINFG KENEQWLDRLSLSDAERFIEEGHFAKGSMLPKVEAAASFARSRAGREALITVLSKAKEGI EGKTGTVICQ >gi|296494712|gb|ADTN01000026.1| GENE 31 34588 - 35973 1072 461 aa, chain - ## HITS:1 COG:ECs3746 KEGG:ns NR:ns ## COG: ECs3746 COG0044 # Protein_GI_number: 15833000 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Escherichia coli O157:H7 # 1 461 5 465 465 984 100.0 0 MRVLIKNGTVVNADGQAKQDLLIESGIVRQLGNNISPQLPYEEIDATGCYVFPGGVDVHT HFNIDVGIARSCDDFFTGTRAAACGGTTTIIDHMGFGPNGCRLRHQLEVYRGYAAHKAVI DYSFHGVIQHINHAILDEIPMMVEEGLSSFKLYLTYQYKLNDDEVLQALRRLHESGALTT VHPENDAAIASKRAEFIAAGLTAPRYHALSRPLECEAEAIARMINLAQIAGNAPLYIVHL SNGLGLDYLRLARANHQPVWVETCPQYLLLDERSYDTEDGMKFILSPPLRNVREQDKLWC GISDGAIDVVATDHCTFSMAQRLQISKGDFSRCPNGLPGVENRMQLLFSSGVMTGRITPE RFVELTSAMPARLFGLWPQKGLLAPGSDGDVVIIDPRQSQQIQHRHLHDNADYSPWEGFT CQGAIVRTLSRGETIFCDGTFTGKAGRGRFLRRKPFVPPVL Prediction of potential genes in microbial genomes Time: Sun May 15 23:09:09 2011 Seq name: gi|296494711|gb|ADTN01000027.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont69.5, whole genome shotgun sequence Length of sequence - 28848 bp Number of predicted genes - 30, with homology - 29 Number of transcription units - 16, operones - 7 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 - CDS 38 - 1249 1677 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 2 1 Op 2 2/0.750 - CDS 1307 - 2503 1193 ## COG1171 Threonine dehydratase - Term 2521 - 2554 5.2 3 1 Op 3 . - CDS 2561 - 3748 1513 ## COG0078 Ornithine carbamoyltransferase 4 2 Tu 1 . 
+ CDS 4227 - 6005 1009 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains + Term 6013 - 6055 11.1 5 3 Op 1 15/0.000 - CDS 6045 - 6524 265 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 6 3 Op 2 12/0.000 - CDS 6521 - 7399 629 ## COG1319 Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs 7 3 Op 3 . - CDS 7410 - 9707 1813 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs - Prom 9778 - 9837 3.5 + Prom 10037 - 10096 4.1 8 4 Tu 1 . + CDS 10122 - 10877 311 ## COG0739 Membrane proteins related to metalloendopeptidases + Term 10977 - 11044 31.8 + TRNA 10956 - 11029 78.8 # Gly CCC 0 0 9 5 Op 1 . + CDS 11310 - 12800 770 ## COG0582 Integrase 10 5 Op 2 . + CDS 12793 - 13431 206 ## SbBS512_E3301 hypothetical protein + Prom 14413 - 14472 3.0 11 6 Op 1 . + CDS 14492 - 14983 341 ## STY3191 hypothetical protein 12 6 Op 2 . + CDS 14986 - 15216 131 ## ECA2310 hypothetical protein 13 7 Op 1 . + CDS 15340 - 15750 461 ## ECA2311 hypothetical protein 14 7 Op 2 . + CDS 15839 - 16966 1124 ## KPK_0788 hypothetical protein + Term 16970 - 17017 7.2 + Prom 16968 - 17027 2.1 15 8 Tu 1 . + CDS 17047 - 17112 84 ## + Prom 17435 - 17494 6.6 16 9 Tu 1 . + CDS 17555 - 18349 300 ## EcHS_A3023 hypothetical protein + Term 18570 - 18596 -0.6 + Prom 18680 - 18739 5.4 17 10 Tu 1 . + CDS 18980 - 19165 139 ## ECIAI1_2983 hypothetical protein + Prom 19812 - 19871 7.8 18 11 Op 1 7/0.000 + CDS 19903 - 20244 316 ## COG1886 Flagellar motor switch/type III secretory pathway protein + Prom 20270 - 20329 6.0 19 11 Op 2 7/0.000 + CDS 20519 - 20899 108 ## COG4790 Type III secretory pathway, component EscR 20 11 Op 3 8/0.000 + CDS 20909 - 21169 160 ## COG4794 Type III secretory pathway, component EscS + Prom 21584 - 21643 5.9 21 11 Op 4 2/0.750 + CDS 21702 - 21938 127 ## COG4791 Type III secretory pathway, component EscT 22 11 Op 5 . 
+ CDS 21947 - 22678 298 ## COG1377 Flagellar biosynthesis pathway, component FlhB 23 11 Op 6 . + CDS 22705 - 23067 209 ## COG1377 Flagellar biosynthesis pathway, component FlhB + Term 23095 - 23137 7.1 24 12 Tu 1 . - CDS 23424 - 23924 169 ## COG2771 DNA-binding HTH domain-containing proteins - Prom 24028 - 24087 5.9 + Prom 24019 - 24078 9.7 25 13 Op 1 . + CDS 24181 - 25362 264 ## ECO103_3424 type III secretion protein EprH 26 13 Op 2 . + CDS 25376 - 25615 262 ## ECUMN_3190 type III secretion system needle protein 27 13 Op 3 . + CDS 25635 - 25967 227 ## ECH74115_4132 type III secretion apparatus protein EprJ + Prom 26936 - 26995 4.9 28 14 Tu 1 . + CDS 27236 - 27943 169 ## ECUMN_3186 hypothetical protein - Term 27677 - 27712 0.4 29 15 Tu 1 . - CDS 27865 - 28038 61 ## Z4177 hypothetical protein - Prom 28114 - 28173 6.7 + Prom 28008 - 28067 10.2 30 16 Tu 1 . + CDS 28165 - 28320 129 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain Predicted protein(s) >gi|296494711|gb|ADTN01000027.1| GENE 1 38 - 1249 1677 403 aa, chain - ## HITS:1 COG:ECs3745 KEGG:ns NR:ns ## COG: ECs3745 COG0624 # Protein_GI_number: 15832999 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Escherichia coli O157:H7 # 1 403 1 403 403 848 100.0 0 MAKNIPFKLILEKAKDYQADMTRFLRDMVAIPSESCDEKRVVHRIKEEMEKVGFDKVEID PMGNVLGYIGHGPRLVAMDAHIDTVGIGNIKNWDFDPYEGMETDELIGGRGTSDQEGGMA SMVYAGKIIKDLGLEDEYTLLVTGTVQEEDCDGLCWQYIIEQSGIRPEFVVSTEPTDCQV YRGQRGRMEIRIDVQGVSCHGSAPERGDNAIFKMGPILGELQELSQRLGYDEFLGKGTLT VSEIFFTSPSRCAVADSCAVSIDRRLTWGETWEGALDEIRALPAVQKANAVVSMYNYDRP SWTGLVYPTECYFPTWKVEEDHFTVKALVNAYEGLFGKAPVVDKWTFSTNGVSIMGRHGI PVIGFGPGKEPEAHAPNEKTWKSHLVTCAAMYAAIPLSWLATE >gi|296494711|gb|ADTN01000027.1| GENE 2 1307 - 2503 1193 398 aa, chain - ## HITS:1 COG:ECs3744 KEGG:ns NR:ns ## COG: ECs3744 COG1171 # Protein_GI_number: 15832998 # Func_class: E Amino acid 
transport and metabolism # Function: Threonine dehydratase # Organism: Escherichia coli O157:H7 # 1 398 1 398 398 827 100.0 0 MSVFSLKIDIADNKFFNGETSPLFSQSQAKLARQFHQKIAGYRPTPLCALDDLANLFGVK KILVKDESKRFGLNAFKMLGGAYAIAQLLCEKYHLDIETLSFEHLKNAIGEKMTFATTTD GNHGRGVAWAAQQLGQNAVIYMPKGSAQERVDAILNLGAECIVTDMNYDDTVRLTMQHAQ QHGWEVVQDTAWEGYTKIPTWIMQGYATLADEAVEQMREMGVTPTHVLLQAGVGAMAGGV LGYLVDVYSPQNLHSIIVEPDKADCIYRSGVKGDIVNVGGDMATIMAGLACGEPNPLGWE ILRNCATQFISCQDSVAALGMRVLGNPYGNDPRIISGESGAVGLGVLAAVHYHPQRQSLM EKLALNKDAVVLVISTEGDTDVKHYREVVWEGKHAVAP >gi|296494711|gb|ADTN01000027.1| GENE 3 2561 - 3748 1513 395 aa, chain - ## HITS:1 COG:ECs3743 KEGG:ns NR:ns ## COG: ECs3743 COG0078 # Protein_GI_number: 15832997 # Func_class: E Amino acid transport and metabolism # Function: Ornithine carbamoyltransferase # Organism: Escherichia coli O157:H7 # 1 395 2 396 396 805 99.0 0 MKTVNELIKDINSLTSHLHEKDFLLTWEQTPDELKQVLDVAAALKALRAENISTKVFNSG LGISVFRDNSTRTRFSYASALNLLGLAQQDLDEGKSQIAHGETVRETANMISFCADAIGI RDDMYLGAGNAYMREVGAALDDGYKQGVLPQRPALVNLQCDIDHPTQSMADLAWLREHFG SLENLKGKKIAMTWAYSPSYGKPLSVPQGIIGLMTRFGMDVTLAHPEGYDLIPDVVEVAK NNAKASGGSFRQVTSMEEAFKDADIVYPKSWAPYKVMEERTELLRANDHEGLKALEKQCL AQNAQHKDWHCTEEMMELTRDGEALYMHCLPADISGVSCKEGEVTEGVFEKYRIATYKEA SWKPYIIAAMILSRKYAKPGALLEQLLKEAQERVK >gi|296494711|gb|ADTN01000027.1| GENE 4 4227 - 6005 1009 592 aa, chain + ## HITS:1 COG:ygeV KEGG:ns NR:ns ## COG: ygeV COG3829 # Protein_GI_number: 16130771 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 1 592 1 592 592 1181 100.0 0 MELATTQSVLMQIQPTIQRFARMLASVLQLEVEIVDENLCRVAGTGAYGKFLGRQLSGNS RLLRHVLETKTEKVVTQSRFDPLCEGCDSKENCREKAFLGTPVILQDRCVGVISLIAVTH EQQEHISDNLREFSDYVRHISTIFVSKLLEDQGPGDNISKIFATMIDNMDQGVLVVDDEN RVQFVNQTALKTLGVVQNNIIGKPIRFRPLTFESNFTHGHMQHIVSWDDKSELIIGQLHN IQGRQLFLMAFHQSHTSFSVANAPDEPHIEQLVGECRVMRQLKRLISRIAPSPSSVMVVG 
ESGTGKEVVARAIHKLSGRRNKPFIAINCAAIPEQLLESELFGYVKGAFTGASANGKTGL IQAANTGTLFLDEIGDMPLMLQAKLLRAIEAREILPIGASSPIQVDIRIISATNQNLAQF IAEGKFREDLFYRLNVIPITLPPLRERQEDIELLVHYFLHLHTRRLGSVYPGIAPDVVEI LRKHRWPGNLRELSNLMEYLVNVVPSGEVIDSTLLPPNLLNNGTTEQSDVTEVSEAHLSL DDAGGTALEEMEKQMIREALSRHNSKKQVADELGIGIATLYRKIKKYELLNT >gi|296494711|gb|ADTN01000027.1| GENE 5 6045 - 6524 265 159 aa, chain - ## HITS:1 COG:ygeU KEGG:ns NR:ns ## COG: ygeU COG2080 # Protein_GI_number: 16130770 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Escherichia coli K12 # 1 159 1 159 159 295 100.0 2e-80 MNHSETITIECTINGMPFQLHAAPGTPLSELLREQGLLSVKQGCCVGECGACTVLVDGTA IDSCLYLAAWAEGKEIRTLEGEAKGGKLSHVQQAYAKSGAVQCGFCTPGLIMATTAMLAK PREKPLTITEIRRGLAGNLCRCTGYQMIVNTVLDCEKTK >gi|296494711|gb|ADTN01000027.1| GENE 6 6521 - 7399 629 292 aa, chain - ## HITS:1 COG:ygeT KEGG:ns NR:ns ## COG: ygeT COG1319 # Protein_GI_number: 16130769 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs # Organism: Escherichia coli K12 # 1 292 1 292 292 572 100.0 1e-163 MFDFASYHRAATLADAINLLADNPQAKLLAGGTDVLIQLHHHNDRYRHIVDIHNLAELRG ITLAEDGSLRIGSATTFTQLIEDPITQRHLPALCAAATSIAGPQIRNVATYGGNICNGAT SADSATPTLIYDAKLEIHSPRGVRFVPINGFHTGPGKVSLEHDEILVAFHFPPQPKEHAG SAHFKYAMRDAMDISTIGCAAHCRLDNGNFSELRLAFGVAAPTPIRCQHAEQTAQNAPLN LQTLEAISESVLQDVAPRSSWRASKEFRLHLIQTMTKKVISEAVAAAGGKLQ >gi|296494711|gb|ADTN01000027.1| GENE 7 7410 - 9707 1813 765 aa, chain - ## HITS:1 COG:ygeS KEGG:ns NR:ns ## COG: ygeS COG1529 # Protein_GI_number: 16130768 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Escherichia coli K12 # 14 765 1 752 752 1530 100.0 0 MEAREATATGESCMRVDAIAKVTGRARYTDDYVMAGMCYAKYVRSPIAHGYAVSINDEQA RSLPGVLAIFTWEDVPDIPFATAGHAWTLDENKRDTADRALLTRHVRHHGDAVAIVVARD 
ELTAEKAAQLVSIEWQELPVITTPEAALAEDAAPIHNGGNLLKQSTMSTGNVQQTIDAAD YQVQGHYQTPVIQHCHMESVTSLAWMEDDSRITIVSSTQIPHIVRRVVGQALDIPWSCVR VIKPFVGGGFGNKQDVLEEPMAAFLTSKLGGIPVKVSLSREECFLATRTRHAFTIDGQMG VNRDGTLKGYSLDVLSNTGAYASHGHSIASAGGNKVAYLYPRCAYAYSSKTCYTNLPSAG AMRGYGAPQVVFAVESMLDDAATALGIDPVEIRLRNAAREGDANPLTGKRIYSAGLPECL EKGRKIFEWEKRRAECQNQQGNLRRGVGVACFSYTSNTWPVGVEIAGARLLMNQDGTINV QSGATEIGQGADTVFSQMVAETVGVPVSDVRVISTQDTDVTPFDPGAFASRQSYVAAPAL RSAALLLKEKIIAHAAVMLHQSAMNLTLIKGHIVLVERPEEPLMSLKDLAMDAFYHPERG GQLSAESSIKTTTNPPAFGCTFVDLTVDIALCKVTINRILNVHDSGHILNPLLAEGQVHG GMGMGIGWALFEEMIIDAKSGVVRNPNLLDYKMPTMPDLPQLESAFVEINEPQSAYGHKS LGEPPIIPVAAAIRNAVKMATGVAINTLPLTPKRLYEEFHLAGLI >gi|296494711|gb|ADTN01000027.1| GENE 8 10122 - 10877 311 251 aa, chain + ## HITS:1 COG:ygeR KEGG:ns NR:ns ## COG: ygeR COG0739 # Protein_GI_number: 16130767 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Escherichia coli K12 # 1 251 9 259 259 448 99.0 1e-126 MSAGRLNKKSLGIVMLLSVGLLLAGCSGSKSSDTGTYSGSVYTVKRGDTLYRISRTTGTS VKELARLNGISPPYTIEVGQKLKLGGAKSSSITRKSTAKSTTKTASVTPSSAVPKSSWPP VGQRCWLWPTTGKVIMPYSTADGGNKGIDISAPRGTPIYAAGAGKVVYVGNQLRGYGNLI MIKHSEDYITAYAHNDTMLVNNGQSVKAGQKIATMGSTDAASVRLHFQIRYRATAIDPLR YLPPQGSKPKC >gi|296494711|gb|ADTN01000027.1| GENE 9 11310 - 12800 770 496 aa, chain + ## HITS:1 COG:Z5878_2 KEGG:ns NR:ns ## COG: Z5878_2 COG0582 # Protein_GI_number: 15804857 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Escherichia coli O157:H7 EDL933 # 310 488 5 173 205 87 35.0 6e-17 MAHSRHIKAALKAIHADNPTASYEDMRKHLRDIAEWELSTGRSDLFEPDMRDTYRDQYGE LGENLTDALASEPLSIDQHRYINEALKVLKACMKRIEAGDSQSLIDYVDRFDGIDRQDDQ ASVSLSVSAPEVKPVITPSVTVASLFEQYEAENAQNWKPATLRENQSSHAALIEIFDYLG LGADANTITRADVLRVRDVLQQLPKNRKQRFKDAPLVDLLGREEKTDCLDVVTINNKYLI KMAAVFKWAVRNDLIKKNMTEGLELKVPQRKASEARNAFSTEQVGQLLVAAKSYSQKTSG KPYHYYVTALAAITGARLNEVAQLQVKDVRVTEAGTVYIHINEDDSSLPGKSIKNAHSDR CVPLVDGAYGFVLADFMDLLEARRNANASCASNGNDAMVFDGLRLMKNGYGEQVSKWFNR 
TLLPKVLADRDGLAFHSFRHTVATQLKQHGVELAYAQAIMGHSSGSITYDRYAKEVEVDR LVNVLADVYKETVVNG >gi|296494711|gb|ADTN01000027.1| GENE 10 12793 - 13431 206 212 aa, chain + ## HITS:1 COG:no KEGG:SbBS512_E3301 NR:ns ## KEGG: SbBS512_E3301 # Name: not_defined # Def: hypothetical protein # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 8 210 1 203 212 364 83.0 1e-99 MADYHPAVFEYQAFNCPYCGVYARQFWRTLYGNGQNLGVKDTPFRMSSCSHCGDSAYWCQ EKMIVPAAGNVELPNPDMPEDCKADYMEARSIINLSPKGAAALLRLCLQKLMRHLGEPGN NINADIKSLVEKGLPVRIQQAADICRIVGNQAVHPGEISLDDDPQLTHGLFKLLNIIVDD RITRPKEIEAMFQSMPEGPRQGIEKRDGKASG >gi|296494711|gb|ADTN01000027.1| GENE 11 14492 - 14983 341 163 aa, chain + ## HITS:1 COG:no KEGG:STY3191 NR:ns ## KEGG: STY3191 # Name: not_defined # Def: hypothetical protein # Organism: S.typhi # Pathway: not_defined # 5 162 56 215 217 136 48.0 3e-31 MSNQQLSANQIAYLNRAMSDGNGLIEIGELARGLGKRPDNVKRKLERIFPEDHLLNLRKR YKASIGKGASREIETYMLDYKTAGALAMSYDGMLGIEVLTILEDSLSTIQAMTIQAAKDN SAGVLKAAAGFRERYRERLVFRPGASENEDRSVALKRLGRKGL >gi|296494711|gb|ADTN01000027.1| GENE 12 14986 - 15216 131 76 aa, chain + ## HITS:1 COG:no KEGG:ECA2310 NR:ns ## KEGG: ECA2310 # Name: not_defined # Def: hypothetical protein # Organism: E.carotovora # Pathway: not_defined # 17 75 10 67 68 75 66.0 4e-13 MKTLPPDLPPTYSVDVKIDPRTPEGRKAMRLLDVPTAILVAALGLPPKHTRPDMYYSKGA LCLMATAEGLTPMDFK >gi|296494711|gb|ADTN01000027.1| GENE 13 15340 - 15750 461 136 aa, chain + ## HITS:1 COG:no KEGG:ECA2311 NR:ns ## KEGG: ECA2311 # Name: not_defined # Def: hypothetical protein # Organism: E.carotovora # Pathway: not_defined # 1 136 1 136 136 155 63.0 5e-37 MSISYRKLDINLSADKETVLVFGQEMSTKYFTEIVVTSMLNGLPALSAGAHAILTSLHAA GLNANDYGAYSRAWAESNAEARREAERQRIENEKDRQRIAAMYATPEEIAKEAAERKERK ADLERRFSRKGAAFGL >gi|296494711|gb|ADTN01000027.1| GENE 14 15839 - 16966 1124 375 aa, chain + ## HITS:1 COG:no KEGG:KPK_0788 NR:ns ## KEGG: KPK_0788 # Name: not_defined # Def: hypothetical protein # Organism: K.pneumoniae_342 # Pathway: not_defined # 3 375 7 379 379 635 88.0 0 
MPEVKRIDSNNSKIKHTDELRAHLLDKEGNKIPHSVFGTDTTRTEIISQADGTGETMNVG FTDRSGRPQAFGNDVTDYVQALESALDETLNDGDFIDGLNTEVFPCEQRPFATLPTPINR ETLTDRTVRFLSNGVYNPAKNAPTTKVKGGQTLMVNPVIADKDAKETGILCSAITHVSDF AMMGGWTGAGLMATAADMVDQYLRNMTSKVVDVLLDAPDLQSTKVGALSGKASAQADDIL DAVAVNLPFYLGSSLTDYTLLVPEKYEAILNRAAQKAGMTEISELVGTAVAPYSGDDRGV IILPKKYAMLSFRSTRSGDLVNVLVTRDANRAGYDVELVGALDVMATGTVKVKAGAFDTE AAASYPLVHVLKFED >gi|296494711|gb|ADTN01000027.1| GENE 15 17047 - 17112 84 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSNLMLLYLVLLLVYYVVEEF >gi|296494711|gb|ADTN01000027.1| GENE 16 17555 - 18349 300 264 aa, chain + ## HITS:1 COG:no KEGG:EcHS_A3023 NR:ns ## KEGG: EcHS_A3023 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_HS # Pathway: not_defined # 1 264 15 278 528 522 99.0 1e-147 MKGKSALTLLLAGIFSCGTCQATGAEVTSESVFNILNSTGAATDKSYLSLNPDKYPNYRL LIHSAKLQNEIKSHYTKDEIQGLLTLTENTRKLTLTEKPWGTFILASTFEDDKTAAETHY DAVWLRDSLWGYMALVSDQGNSVAAKKVLLTLWDYMSTPDQIKRMQDVISNPKRLDGIPV QMNAVHIRFDSNSPVMADVQEEGKPQLWNHKQNDALGLYLDLLIQAINTGTINAEDWQKG DRLKSVALLIAYLDKANFYVMEDS >gi|296494711|gb|ADTN01000027.1| GENE 17 18980 - 19165 139 61 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_2983 NR:ns ## KEGG: ECIAI1_2983 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 61 490 550 550 124 98.0 1e-27 MISANGRSVPEMALPESYNYIHKSGTLHEAPSPIIPLNWSKASMTLMLKEMSNLINDEGI K >gi|296494711|gb|ADTN01000027.1| GENE 18 19903 - 20244 316 113 aa, chain + ## HITS:1 COG:ECs3726 KEGG:ns NR:ns ## COG: ECs3726 COG1886 # Protein_GI_number: 15832980 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar motor switch/type III secretory pathway protein # Organism: Escherichia coli O157:H7 # 1 113 216 328 328 211 95.0 3e-55 MADHFEYEEDFETDDFDIKKNESEIYDENDDQMINSFEDLPVKIEFVLGKKIMNLYEIDE LCAKRIISLLSESEKNIEIRVNGALTGYGELVEVDDKLGVEIHSWLSGHNNVK >gi|296494711|gb|ADTN01000027.1| GENE 19 20519 - 20899 108 126 aa, chain + ## HITS:1 COG:ECs3725 KEGG:ns NR:ns ## 
COG: ECs3725 COG4790 # Protein_GI_number: 15832979 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type III secretory pathway, component EscR # Organism: Escherichia coli O157:H7 # 1 126 96 221 221 206 98.0 9e-54 MSGYKSYLIKYSEPELVNFFEKIQKVNSSEDNEEIFDDDNISIFSLLPAYALSEIKSAFI IGFYIYLPFVVVDLVISSVLLTLGMMMMSPVTISTPIKLILFVAMDGWTMLSKGLILQYF DLSINP >gi|296494711|gb|ADTN01000027.1| GENE 20 20909 - 21169 160 86 aa, chain + ## HITS:1 COG:ECs3724 KEGG:ns NR:ns ## COG: ECs3724 COG4794 # Protein_GI_number: 15832978 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type III secretory pathway, component EscS # Organism: Escherichia coli O157:H7 # 1 86 1 86 86 149 100.0 2e-36 MDDIVFAGNRALYLILVMSAGPIAVATFVGLLVGLFQTVTQLQEQTLPFGVKLLCVSICF FLMSGWYGEKLYSFGIEMLNLAFARG >gi|296494711|gb|ADTN01000027.1| GENE 21 21702 - 21938 127 78 aa, chain + ## HITS:1 COG:ECs3722 KEGG:ns NR:ns ## COG: ECs3722 COG4791 # Protein_GI_number: 15832976 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type III secretory pathway, component EscT # Organism: Escherichia coli O157:H7 # 1 78 1 78 78 116 100.0 8e-27 MTHTIVYASPVIAVMLGGEAVLGLLARYASQLNAFAISLTVKSALAFLILIIYFGPILAE RVMPLSFFPEQLQLYIEK >gi|296494711|gb|ADTN01000027.1| GENE 22 21947 - 22678 298 243 aa, chain + ## HITS:1 COG:ECs3721 KEGG:ns NR:ns ## COG: ECs3721 COG1377 # Protein_GI_number: 15832975 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FlhB # Organism: Escherichia coli O157:H7 # 1 235 1 235 373 437 95.0 1e-122 MANKTEKPTQKKLQDASKKGQILKSRDLTISVIMLVGTLYLGYVFDVHHIMSILEYILDH NAKPDIWDYFKAMGVGWLKTNIPFLLVCMFTTILVSWFQSKKQLVTEAVKFKLDSLNPVN GLKRIFGLKTVKESVKAILYIIFFALAIKVFWSNHKSLLFKTLDGDIISLLSGWGEMLFL LILYCLGSMIIVLIFDFIAEYFLFMKDMKMDKQEVKREYKEQEGNPEIKSKRRERIRKFF LSN >gi|296494711|gb|ADTN01000027.1| GENE 23 22705 - 23067 209 120 aa, chain + ## 
HITS:1 COG:ECs3721 KEGG:ns NR:ns ## COG: ECs3721 COG1377 # Protein_GI_number: 15832975 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FlhB # Organism: Escherichia coli O157:H7 # 1 120 254 373 373 227 94.0 4e-60 MIANPTHIAIGIYFKPHLSPIPLISVRETNEVALAVRKYAKEIGIPIITDKKLARKIYAT HRRYDYVSFENIDEILRLLLWLEDVENAGQPVPDEQLSSEDKFIEGEEKEIENKDDNLKN >gi|296494711|gb|ADTN01000027.1| GENE 24 23424 - 23924 169 166 aa, chain - ## HITS:1 COG:ECs3720 KEGG:ns NR:ns ## COG: ECs3720 COG2771 # Protein_GI_number: 15832974 # Func_class: K Transcription # Function: DNA-binding HTH domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 166 1 166 166 296 95.0 2e-80 MQVFSSDVYFTVVTNALLASQKEYYSDLVALVDLGHSFVVIDEHQHRNLKPNTEPVNILL SNNFIRINKNIRLSDLTHFLISNLHTQNVYSTQEALTHDDIDILRLCVSYSLKQVAIIKG IDYKTVSYHKIRALNKLNIKGTVELFIALCEWDKHYFKLQSCVRES >gi|296494711|gb|ADTN01000027.1| GENE 25 24181 - 25362 264 393 aa, chain + ## HITS:1 COG:no KEGG:ECO103_3424 NR:ns ## KEGG: ECO103_3424 # Name: not_defined # Def: type III secretion protein EprH # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 393 1 393 393 739 99.0 0 MENNDKFLSQDLLESYAIRLLSGPLNGCEYEILNGRLLVIIGNDVSLGRSDAFSELPENT IVVPYGELTGSFEIIITTDPDLVVTFRELTAQEPEDRTLTLNQQIEVLGLKFAVKEKNEV WQYSLPGIIENNIISTKQHFFSSKLFKYVMLFFLFAIIFIAFYIVNASNDPQLRHIDKIL VNKNRNYEILYGRDHVIYINTNILDEAVWVKQALEKNQPGKPVRVINPDDESIRIFSWLA DNFPDLQYFKLQLLDASNLRLTVSKQRNAITQQLIDNLIKGLLQTMPYASNISIAVLDDN VLESQAIETLSAIGLSYEKYKTANNVYFNIIGTLSDSELNKINNYVDEYYKQWGKQYVRF NVNLKNQDTNNSSFSYGDNRFEKSQGSKWTFQE >gi|296494711|gb|ADTN01000027.1| GENE 26 25376 - 25615 262 79 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_3190 NR:ns ## KEGG: ECUMN_3190 # Name: prgI # Def: type III secretion system needle protein # Organism: E.coli_UMN026 # Pathway: Bacterial secretion system [PATH:eum03070] # 1 79 1 79 79 135 100.0 6e-31 MADWNGYIMDISKQFDQGVDDLNQQVEKALEDLATNPSDPKFLAEYQSALAEYTLYRNAQ SNVVKAYKDLDSAIIQNFR 
>gi|296494711|gb|ADTN01000027.1| GENE 27 25635 - 25967 227 110 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_4132 NR:ns ## KEGG: ECH74115_4132 # Name: eprJ # Def: type III secretion apparatus protein EprJ # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 110 1 110 110 187 100.0 7e-47 MSVSNMPPIDRAEQSTAHEIQQAKVIDLNDRVLNLDNPDDKMISAFANYAVQTENWQQNA LQALTSDKKGLTPEKLLVLQDHVLNYNVEVSLVGTLARKIVAAVETLTRS >gi|296494711|gb|ADTN01000027.1| GENE 28 27236 - 27943 169 235 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_3186 NR:ns ## KEGG: ECUMN_3186 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 235 22 256 256 398 89.0 1e-110 MRKKIEMSLIKSPANGVVIKRKISDGLKEIVSLKEKILLETTAKIQSIEEKREEKFIQGY YDGYTKGIIDEMDNFIPLISLLCSELEKKRINMINDLKSILLKPSEEVDVFIKIFESWVT KLPSISGPVNLHIPTSFKDKSLEVESYFVDKSIWNVHITFHDDKRFVFFTDQFIAEFSPQ EFVDNCEQYLINNHCFSPDKVNEICEQARHYLVEKMFETHSLDMNNSVLASPEDL >gi|296494711|gb|ADTN01000027.1| GENE 29 27865 - 28038 61 57 aa, chain - ## HITS:1 COG:no KEGG:Z4177 NR:ns ## KEGG: Z4177 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157 # Pathway: not_defined # 14 57 1 44 44 62 79.0 6e-09 MRKKIPQALKNHEMIKILQLDYIQIIKLEPGSYKSSGLAKTELFISKECVSNIFSTK >gi|296494711|gb|ADTN01000027.1| GENE 30 28165 - 28320 129 51 aa, chain + ## HITS:1 COG:ECs3712 KEGG:ns NR:ns ## COG: ECs3712 COG2197 # Protein_GI_number: 15832966 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 51 1 51 210 105 98.0 1e-23 MGKIKIVVSDQQPFMIDGIIGFLGHYPDLYEVVGGYKDLKKAIAECNKSTA Prediction of potential genes in microbial genomes Time: Sun May 15 23:09:49 2011 Seq name: gi|296494710|gb|ADTN01000028.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont81.1, whole genome shotgun sequence Length of sequence - 548 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N 
Tu/Op Conserved S Start End Score pairs(N/Pv) + SSU_RRNA 1 - 509 100.0 # CP000026 [D:293322..294867] # 16S ribosomal RNA # Salmonella enterica subsp. enterica serovar Paratyphi A str. ATCC # 9150 Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Salmonella. Prediction of potential genes in microbial genomes Time: Sun May 15 23:09:54 2011 Seq name: gi|296494709|gb|ADTN01000029.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont87.1, whole genome shotgun sequence Length of sequence - 18106 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 6, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 583 153 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain + TRNA 826 - 902 89.4 # Arg TCT 0 0 2 2 Tu 1 . - CDS 948 - 1709 418 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 1841 - 1881 8.1 3 3 Op 1 . - CDS 1892 - 2782 643 ## JW0556 hypothetical protein 4 3 Op 2 . - CDS 2783 - 5755 2457 ## COG0457 FOG: TPR repeat 5 3 Op 3 . - CDS 5742 - 7979 1896 ## LF82_1481 bacteriophage N4 adsorption protein B - Prom 8045 - 8104 5.9 6 4 Op 1 40/0.000 - CDS 8129 - 9571 1035 ## COG0642 Signal transduction histidine kinase 7 4 Op 2 . - CDS 9561 - 10244 800 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 10383 - 10442 2.7 + Prom 10262 - 10321 3.9 8 5 Op 1 3/1.000 + CDS 10401 - 11780 478 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 9 5 Op 2 3/1.000 + CDS 11807 - 12139 541 ## COG5569 Uncharacterized conserved protein 10 5 Op 3 11/0.000 + CDS 12155 - 13378 1251 ## COG0845 Membrane-fusion protein 11 5 Op 4 3/1.000 + CDS 13390 - 16533 3569 ## COG3696 Putative silver efflux pump + Prom 16546 - 16605 3.5 12 6 Tu 1 . 
+ CDS 16635 - 18011 1650 ## COG1113 Gamma-aminobutyrate permease and related permeases + Term 18058 - 18096 -0.8 Predicted protein(s) >gi|296494709|gb|ADTN01000029.1| GENE 1 1 - 583 153 194 aa, chain - ## HITS:1 COG:ZfimZ KEGG:ns NR:ns ## COG: ZfimZ COG2197 # Protein_GI_number: 15800272 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 EDL933 # 1 186 22 207 231 351 100.0 5e-97 MKPTSVIIMDTHPIIRMSIEVLLQKNSELQIVLKTDDYRITIDYLRTRPVDLIIMDIDLP GTDGFTFLKRIKQIQSTVKVLFLSSKSECFYAGRAIQAGANGFVSKCNDQNDIFHAVQMI LSGYTFFPSETLNYIKSNKCSTNSSTVTVLSNREVTILRYLVSGLSNKEIADKLLLSNKT VSAHKSKEGANKQV >gi|296494709|gb|ADTN01000029.1| GENE 2 948 - 1709 418 253 aa, chain - ## HITS:1 COG:envY KEGG:ns NR:ns ## COG: envY COG2207 # Protein_GI_number: 16128549 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 253 1 253 253 454 100.0 1e-128 MQLSSSEPCVVILTEKEVEVSVNNHATFTLPKNYLAAFACNNNVIELSTLNHVLITHINR NIINDYLLFLNKNLTCVKPWSRLATPVIACHSRTPEVFRLAANHSKQQPSRPCEAELTRA LLFTVLSNFLEQSRFIALLMYILRSSVRDSVCRIIQSDIQHYWNLRIVASSLCLSPSLLK KKLKNENTSYSQIVTECRMRYAVQMLLMDNKNITQVAQLCGYSSTSYFISVFKAFYGLTP LNYLAKQRQKVMW >gi|296494709|gb|ADTN01000029.1| GENE 3 1892 - 2782 643 296 aa, chain - ## HITS:1 COG:no KEGG:JW0556 NR:ns ## KEGG: JW0556 # Name: ybcH # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 296 1 296 296 580 100.0 1e-164 MRKFIFVLLTLLLVSPFSFAMKGIIWQPQNRDSQVTDTQWQGLMSQLRLQGFDTLVLQWT RYGDAFTQPEQRTLLFKRAAAAQQAGLKLIVGLNADPEFFMHQKQSSAALESYLNRLLAA DLQQARLWSAAPGITPDGWYISAEIDDLNWRSEAARQPLLTWLNNAQRLISDVSAKPVYI SSFFAGNMSPDGYRQLLEHVKATGVNVWVQDGSGVDKLTAEQRERYLQASADCQSPAPAS GVVYELFVAGKGKTFTAKPKPDAEIASLLAKRSSCGKDTLYFSLRYLPVAHGILEY >gi|296494709|gb|ADTN01000029.1| GENE 4 2783 - 5755 2457 990 aa, chain - ## HITS:1 COG:nfrA KEGG:ns NR:ns ## COG: nfrA COG0457 # Protein_GI_number: 
16128551 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Escherichia coli K12 # 1 990 1 990 990 1860 99.0 0 MKENNLNRVIGWSGLLLTSLLSTSALADNIGTSAEELGLSDYRHFVIYPRLDKALKAQKN NDEATAIREFEYIHQQVPDNIPLTLYLAEAYRHFGHDDRARLLLEDQLKRHPGDARLERS LAAIPVEVKSVTTVEELLAQQKACDAAPTLRCRSEVGQNALRLAQLPVARAQLNDATFAA SPEGKTLRTDLLQRAIYLKQWSQADTLYNEARQQNTLSAAERRQWFDVLLAGQLDDRILA LQSQGIFTDPQSYITYATALAYRGEKARLQHYLIENKPLFTTDAQEKSWLYLLSKYSANP VQALANYTVQFADNRQYVVGATLPVLLKEGQYDAAQKLLATLPANEMLEERYAVSVATRN KAEALRLARLLYQQEPANLTRLDQLTWQLMQNEQSREAADLLLQRYPFQGDARVSQTLMA RLASLLESHPYLATPAKVAILSKPLPLAEQRQWQSQLPGIADNCPAIVRLLGDMSPSYDA AAWNRLAKCYRDTLPGVALYAWLQAEQRQPSAWQHRAVAYQAYQVEDYATALAAWQKISL HDMSNEDLLAAANTAQAAGNGAARDRWLQQAEKRGLGSNALYWWLHAQRYIPGQPELALN DLTRSINIAPSANAYVARATIYRQRHNVPAAVSDLRAALELEPNNSNTQAALGYALWDSG DIAQSREMLEPAHKGLPDDPALIRQLAYVNQRLDDMPATQHYARLVIDDIDNQALITPLT PEQNQQRFNFRRLHEEVGRRWTFSFDSSIGLRSGAMSTANNNVGGAAPGKSYRSYGQLEA EYRIGLNMLLEGDLLSVYSRVFADTGENGVMMPVKNPMSGTGLRWKPLRDQIFFIAVEQQ LPLNGQNGASDTMLRASASFFNGGKYSDEWHPNGSGWFAQNLYLDAAQYIRQDIQAWTAD YRVSWHQKVANGQTIEPYAHVQDNGYRDKGTQGAQLGGVGVRWNIWTGETHYDAWPHKVS LGVEYQHTFKAINQRNGERNNAFLTIGVHW >gi|296494709|gb|ADTN01000029.1| GENE 5 5742 - 7979 1896 745 aa, chain - ## HITS:1 COG:no KEGG:LF82_1481 NR:ns ## KEGG: LF82_1481 # Name: nfrB # Def: bacteriophage N4 adsorption protein B # Organism: E.coli_LF82 # Pathway: not_defined # 1 745 1 745 745 1511 99.0 0 MDWLLDVFATWLYGLKVIAITLAVIMFISGLDDFFIDVVYWVRRIKRKLSVYRRYPRMSY RELYKPDEKPLAIMVPAWNETGVIGNMAELAATTLDYENYHIFVGTYPNAPDTQRDVDEV CARFPNVHKVVCARPGPTSKADCLNNVLDAITQFERSANFAFAGFILHDAEDVISPMELR LFNYLVERKDLIQIPVYPFEREWTHFTSMTYIDEFSELHGKDVPVREALAGQVPSAGVGT CFSRRAVTALLADGDGIAFDVQSLTEDYDIGFRLKEKGMTEIFVRFPVVDEAKEREQRKF LQHARTSNMICVREYFPDTFSTAVRQKSRWIIGIVFQGFKTHKWTSSLTLNYFLWRDRKG AISNFVSFLAMLVMIQLLLLLAYESLWPDAWHFLSIFSGSAWLMTLLWLNFGLMVNRIVQ RVIFVTGYYGLTQGLLSVLRLFWGNLINFMANWRALKQVLQHGDPRRVAWDKTTHDFPSV TGDTRSLRPLGQILLENQVITEEQLDTALRNRVEGLRLGGSMLMQGLISAEQLAQALAEQ 
NGVAWESIDAWQIPSSLIAEMPASVALHYAVLPLRLENDELIVGSEDGIDPVSLAALTRK VGRKVRYVIVLRGQIVTGLRHWYARRRGHDPRAMLYNAVQHQWLTEQQAGEIWRQYVPHQ FLFAEILTTLGHINRSAINVLLLRHERSSLPLGKFLVTEGVISQETLDRVLTIQRELQVS MQSLLLKAGLNTEQVAQLESENEGE >gi|296494709|gb|ADTN01000029.1| GENE 6 8129 - 9571 1035 480 aa, chain - ## HITS:1 COG:ybcZ KEGG:ns NR:ns ## COG: ybcZ COG0642 # Protein_GI_number: 16128553 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 480 1 480 480 932 99.0 0 MVSKPFQRPFSLATRLTFFISLATIAAFFAFAWIMIHSVKVHFAEQDINDLKEISATLER VLNHPDETQARRLMTLEDIVSGYSNVLISLADSQGKTVYHSPGAPDIREFTRDAIPDKDA RGGEVYLLSGPTIMMPGHGHGHMEHSNWRMINLPVGPLVDGKPIYTLYIALSIDFHLHYI NDLMNKLIMTASVISILIVFIVLLAVHKGHAPIRSVSRQIQNITSKDLDVRLDPQTVPIE LEQLVLSFNHMIERIEDVFTRQSNFSADIAHEIRTPITNLITQTEIALSQSRSQKELEDV LYSNLEELTRMAKMVSDMLFLAQADNNQLIPEKKMLNLADEVGKVFDFFEALAEDRGVEL RFVGDKCQVAGDPLMLRRALSNLLSNALRYTPTGETIVVRCQTVDHLVQVIVENPGTPIA PEHLPRLFDRFYRVDPSRQRKGEGSGIGLAIVKSIVVAHKGTVAVTSDARGTRFVITLPA >gi|296494709|gb|ADTN01000029.1| GENE 7 9561 - 10244 800 227 aa, chain - ## HITS:1 COG:ECs0609 KEGG:ns NR:ns ## COG: ECs0609 COG0745 # Protein_GI_number: 15829863 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 227 1 227 227 432 100.0 1e-121 MKLLIVEDEKKTGEYLTKGLTEAGFVVDLADNGLNGYHLAMTGDYDLIILDIMLPDVNGW DIVRMLRSANKGMPILLLTALGTIEHRVKGLELGADDYLVKPFAFAELLARVRTLLRRGA AVIIESQFQVADLMVDLVSRKVTRSGTRITLTSKEFTLLEFFLRHQGEVLPRSLIASQVW DMNFDSDTNAIDVAVKRLRGKIDNDFEPKLIQTVRGVGYMLEVPDGQ >gi|296494709|gb|ADTN01000029.1| GENE 8 10401 - 11780 478 459 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 12 459 11 457 460 188 28 2e-47 MSPCKLLPFCVALALTGCSLAPDYQRPAMPVPQQFSLSQNGLVNAADNYQNAGWRTFFVD 
NQVKTLISEALVNNRDLRMATLKVQEARAQYRLTDADRYPQLNGEGSGSWSGNLKGNTAT TREFSTGLNASLDLDFFGRLKNMSEAERQNYLATEEAQRAVHILLVSNVAQSYFNQQLAY AQLQIAEETLRNYQQSYAFVEKQLLTGSSNVLALEQARGVIESTRSDIAKRQGELAQANN ALQLLLGSYGKLPQAQTVNSDSLQSVKLPAGLSSQILLQRPDIMEAEHALMAANANIGAA RAAFFPSISLTSGISTASSDLSSLFNASSGMWNFIPKIEIPIFNAGRNQANLDIAEIRQQ QSVVNYEQKIQNAFKEVADALALRQSLNDQISAQQRYLASLQITLQRARALYQHGAVSYL EVLDAERSLFATRQTLLDLNYARQVNEISLYTALGGGWQ >gi|296494709|gb|ADTN01000029.1| GENE 9 11807 - 12139 541 110 aa, chain + ## HITS:1 COG:ylcC KEGG:ns NR:ns ## COG: ylcC COG5569 # Protein_GI_number: 16128556 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 110 1 110 110 196 98.0 1e-50 MKKALQVAMFSLFTVIGFNAQANEHHHETMSEVQPQVISATGVVKGVDLESKKITIHHDP IAAVNWPEMTMRFTITPQTKMSEIKTGDKVAFNFVQQGNLSLLQDIKVSQ >gi|296494709|gb|ADTN01000029.1| GENE 10 12155 - 13378 1251 407 aa, chain + ## HITS:1 COG:ECs0612 KEGG:ns NR:ns ## COG: ECs0612 COG0845 # Protein_GI_number: 15829866 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli O157:H7 # 1 407 1 407 407 771 99.0 0 MKKIALIIGSMIAGGIISAAGFNWFAKAEPPAEKTSTAERKVLFWYDPMYPNTRFDKPGK SPFMDMDLVPKYADEESSASGVRIDPTQTQNLGVKTATVTRGPLTFAQSFPANVSYNEYQ YAIVQARAAGFIDKVYPLTVGDKVQKGTPLLDLTIPDWVEAQSEYLLLRETGGTATQTEG ILERLRLAGMPEADIRRLIATQKIQTRFTLKAPIDGVITAFDLRAGMNIAKDNVVAKIQG MDPVWVTAAIPESIAWLVKDASQFTLTVPARPDKTLTIRKWTLLPGVDAATRTLQLRLEV DNADEALKPGMNAWLQLNTASEPMLLIPSQALIDTGNEQRVITVDADGRFVPKRVAVFQA SQGVTALRSGLAEGEKVVSSGLFLIDSEANISGALERMRSESATHAH >gi|296494709|gb|ADTN01000029.1| GENE 11 13390 - 16533 3569 1047 aa, chain + ## HITS:1 COG:ybdE KEGG:ns NR:ns ## COG: ybdE COG3696 # Protein_GI_number: 16128558 # Func_class: P Inorganic ion transport and metabolism # Function: Putative silver efflux pump # Organism: Escherichia coli K12 # 1 1047 1 1047 1047 2027 99.0 0 MIEWIIRRSVANRFLVLMGALFLSIWGTWTIINTPVDALPDLSDVQVSIKTSYPGQAPQI 
VENQVTYPLTTTMLSVPGAKTVRGFSQFGDSYVYVIFEDGTDPYWARSRVLEYLNQVQGK LPAGVSAELGPDATGVGWIYEYALVDRSGKHDLADLRSLQDWFLKYELKTIPDVAEVASV GGVVKEYQVVIDPQRLAQYGISLAEVKSALDASNQEAGGSSIELAEAEYMVRASGYLQTL DDFNHIVLKASENGVPVYLRDVAKVQIGPEMRRGIAELNGEGEVAGGVVILRSGKNAREV IAAVKDKLETLKSSLPEGVEIVTTYDRSQLIDRAIDNLSGKLLEEFIVVAVVCALFLWHV RSALVAIISLPLGLCIAFIVMHFQGLNANIMSLGGIAIAVGAMVDAAIVMIENAHKRLEE WQHQHPDATLDNKTRWQVITDASVEVGPALFISLLIITLSFIPIFTLEGQEGRLFGPLAF TKTYAMAGAALLAIVVIPILMGYWIRGKIPPESSNPLNRFLIRVYHPLLLKVLHWPKTTL LVAALSVLTVLWPLNKVGGEFLPQINEGDLLYMPSTLPGISAAEAASMLQKTDKLIMSVP EVARVFGKTGKAETATDSAPLEMVETTIQLKPQEQWRPGMTMDKIIEELDNTVRLPGLAN LWVPPIRNRIDMLSTGIKSPIGIKVSGTVLADIDAMAEQIEEVARTVPGVASALAERLEG GRYINVEINREKAARYGMTVADVQLFVTSAVGGAMVGETVEGIARYPINLRYPQSWRDSP QALRQLPILTPMKQQITLADVADVKVSTGPSMLKTENARPTSWIYIDARDRDMVSVVHDL QKAIAEKVQLKPGTSVAFSGQFELLERANHKLKLMVPMTLMIIFVLLYLAFRRVGEALLI ISSVPFALVGGIWLLWWMGFHLSVATGTGFIALAGVAAEFGVVMLMYLRHAIEAEPSLNN PQTFSEQKLDEALYHGAVLRVRPKAMTVAVIIAGLLPILWGTGAGSEVMSRIAAPMIGGM ITAPLLSLFIIPAAYKLMWLHRHRVRK >gi|296494709|gb|ADTN01000029.1| GENE 12 16635 - 18011 1650 458 aa, chain + ## HITS:1 COG:pheP KEGG:ns NR:ns ## COG: pheP COG1113 # Protein_GI_number: 16128559 # Func_class: E Amino acid transport and metabolism # Function: Gamma-aminobutyrate permease and related permeases # Organism: Escherichia coli K12 # 1 458 1 458 458 839 100.0 0 MKNASTVSEDTASNQEPTLHRGLHNRHIQLIALGGAIGTGLFLGIGPAIQMAGPAVLLGY GVAGIIAFLIMRQLGEMVVEEPVSGSFAHFAYKYWGPFAGFLSGWNYWVMFVLVGMAELT AAGIYMQYWFPDVPTWIWAAAFFIIINAVNLVNVRLYGETEFWFALIKVLAIIGMIGFGL WLLFSGHGGEKASIDNLWRYGGFFATGWNGLILSLAVIMFSFGGLELIGITAAEARDPEK SIPKAVNQVVYRILLFYIGSLVVLLALYPWVEVKSNSSPFVMIFHNLDSNVVASALNFVI LVASLSVYNSGVYSNSRMLFGLSVQGNAPKFLTRVSRRGVPINSLMLSGAITSLVVLINY LLPQKAFGLLMALVVATLLLNWIMICLAHLRFRAAMRRQGRETQFKALLYPFGNYLCIAF LGMILLLMCTMDDMRLSAILLPVWIVFLFMAFKTLRRK Prediction of potential genes in microbial genomes Time: Sun May 15 23:10:07 2011 Seq name: gi|296494708|gb|ADTN01000030.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont87.2, whole genome 
shotgun sequence Length of sequence - 1353 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 25 - 1272 1299 ## COG0668 Small-conductance mechanosensitive channel - Prom 1293 - 1352 4.2 Predicted protein(s) >gi|296494708|gb|ADTN01000030.1| GENE 1 25 - 1272 1299 415 aa, chain - ## HITS:1 COG:ybdG KEGG:ns NR:ns ## COG: ybdG COG0668 # Protein_GI_number: 16128560 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Escherichia coli K12 # 1 415 1 415 415 801 100.0 0 MQDLISQVEDLAGIEIDHTTSMVMIFGIIFLTAVVVHIILHWVVLRTFEKRAIASSRLWL QIITQNKLFHRLAFTLQGIIVNIQAVFWLQKGTEAADILTTCAQLWIMMYALLSVFSLLD VILNLAQKFPAASQLPLKGIFQGIKLIGAILVGILMISLLIGQSPAILISGLGAMAAVLM LVFKDPILGLVAGIQLSANDMLKLGDWLEMPKYGADGAVIDIGLTTVKVRNWDNTITTIP TWSLVSDSFKNWSGMSASGGRRIKRSISIDVTSIRFLDEDEMQRLNKAHLLKPYLTSRHQ EINEWNRQQGSTESVLNLRRMTNIGTFRAYLNEYLRNHPRIRKDMTLMVRQLAPGDNGLP LEIYAFTNTVVWLEYESIQADIFDHIFAIVEEFGLRLHQSPTGNDIRSLAGAFKQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:10:08 2011 Seq name: gi|296494707|gb|ADTN01000031.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont87.3, whole genome shotgun sequence Length of sequence - 3281 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 - CDS 17 - 670 819 ## COG0778 Nitroreductase - Prom 699 - 758 3.5 2 1 Op 2 . - CDS 764 - 1132 292 ## COG2315 Uncharacterized protein conserved in bacteria - Term 1154 - 1188 -1.0 3 1 Op 3 . - CDS 1197 - 1445 351 ## S0495 hypothetical protein 4 1 Op 4 . - CDS 1511 - 2629 948 ## COG2170 Uncharacterized conserved protein - Prom 2727 - 2786 2.5 + Prom 2876 - 2935 4.0 5 2 Tu 1 . 
+ CDS 3082 - 3234 167 ## EcHS_A0629 Hok/Gef family protein Predicted protein(s) >gi|296494707|gb|ADTN01000031.1| GENE 1 17 - 670 819 217 aa, chain - ## HITS:1 COG:nfnB KEGG:ns NR:ns ## COG: nfnB COG0778 # Protein_GI_number: 16128561 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Escherichia coli K12 # 1 217 1 217 217 425 100.0 1e-119 MDIISVALKRHSTKAFDASKKLTPEQAEQIKTLLQYSPSSTNSQPWHFIVASTEEGKARV AKSAAGNYVFNERKMLDASHVVVFCAKTAMDDVWLKLVVDQEDADGRFATPEAKAANDKG RKFFADMHRKDLHDDAEWMAKQVYLNVGNFLLGVAALGLDAVPIEGFDAAILDAEFGLKE KGYTSLVVVPVGHHSVEDFNATLPKSRLPQNITLTEV >gi|296494707|gb|ADTN01000031.1| GENE 2 764 - 1132 292 122 aa, chain - ## HITS:1 COG:ECs0617 KEGG:ns NR:ns ## COG: ECs0617 COG2315 # Protein_GI_number: 15829871 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 122 1 122 122 248 100.0 2e-66 MDKQSLHETAKRLALELPFVELCWPFGPEFDVFKIGGKIFMLSSELRGVPFINLKSDPQK SLLNQQIYPSIKPGYHMNKKHWISVYPGEEISEALLRDLINDSWNLVVDGLAKRDQKRVR PG >gi|296494707|gb|ADTN01000031.1| GENE 3 1197 - 1445 351 82 aa, chain - ## HITS:1 COG:no KEGG:S0495 NR:ns ## KEGG: S0495 # Name: ybdJ # Def: hypothetical protein # Organism: S.flexneri_2457T # Pathway: not_defined # 1 82 1 82 82 100 98.0 2e-20 MKHPLETLTTAAGILLMAFLSCLLLPAPALGLALAQKLVTMFHLMDLSQLYTLLFCLWFL VLGAIEYFVLRFIWRRWFSLAD >gi|296494707|gb|ADTN01000031.1| GENE 4 1511 - 2629 948 372 aa, chain - ## HITS:1 COG:ybdK KEGG:ns NR:ns ## COG: ybdK COG2170 # Protein_GI_number: 16128564 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 372 1 372 372 789 100.0 0 MPLPDFHVSEPFTLGIELEMQVVNPPGYDLSQDSSMLIDAVKNKITAGEVKHDITESMLE LATDVCRDINQAAGQFSAMQKVVLQAATDHHLEICGGGTHPFQKWQRQEVCDNERYQRTL ENFGYLIQQATVFGQHVHVGCASGDDAIYLLHGLSRFVPHFIALSAASPYMQGTDTRFAS SRPNIFSAFPDNGPMPWVSNWQQFEALFRCLSYTTMIDSIKDLHWDIRPSPHFGTVEVRV MDTPLTLSHAVNMAGLIQATAHWLLTERPFKHQEKDYLLYKFNRFQACRYGLEGVITDPH 
TGDRRPLTEDTLRLLEKIAPSAHKIGASSAIEALHRQVVSGLNEAQLMRDFVADGGSLIG LVKKHCEIWAGD >gi|296494707|gb|ADTN01000031.1| GENE 5 3082 - 3234 167 50 aa, chain + ## HITS:1 COG:no KEGG:EcHS_A0629 NR:ns ## KEGG: EcHS_A0629 # Name: not_defined # Def: Hok/Gef family protein # Organism: E.coli_HS # Pathway: not_defined # 1 50 34 83 83 92 100.0 5e-18 MLTKYALAAVIVLCLTVLGFTLLVGDSLCEFTVKERNIEFKAVLAYEPKK Prediction of potential genes in microbial genomes Time: Sun May 15 23:10:13 2011 Seq name: gi|296494706|gb|ADTN01000032.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont88.1, whole genome shotgun sequence Length of sequence - 3023 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 324 - 2990 2256 ## JW3300 periplasmic endochitinase Predicted protein(s) >gi|296494706|gb|ADTN01000032.1| GENE 1 324 - 2990 2256 888 aa, chain + ## HITS:1 COG:no KEGG:JW3300 NR:ns ## KEGG: JW3300 # Name: chiA # Def: periplasmic endochitinase # Organism: E.coli_J # Pathway: Amino sugar and nucleotide sugar metabolism [PATH:ecj00520] # 1 888 10 897 897 1605 99.0 0 MIGMGLVCSALPALAMEAWNNQQGGNKYQVIFDGKIYENAWWVSSTNCPGKAKANDATNP WRLKRTATAAEISQFGNTLSCEKSGSSSSSNSNTPASNTPANGGSATPAQGTVPSNSSVV AWNKQQGGQTWYVVFNGAVYKNAWWVASSNCPGDAKSNDASNPWRYVRAATATEISETSN PQSCTSAPQPSPDVKPAPDVKPAPDVQPAPADKSNDNYAVVAWKGQEGSSTWYVIYNGGI YKNAWWVGAANCPGDAKENDASNPWRYVRAATATEISQYGNPGSCSVKPDNNGGAVTPVD PTPETPVTPTPDNSEPSTPADSVNDYSLQAWSGQEGSEIYHVIFNGNVYKNAWWVGSKDC PRGTSAENSNNPWRLERTATAAELSQYGNPTTCEIDNGGVIVADGFQASKAYSADSIVDY NDAHYKTSVDQDAWGFVPGGDNPWKKYEPAKAWSASTVYVKGDRVVVDGQAYEALFWTQS DNPALVANQNATGSNSRPWKPLGKAQSYSNEELNNAPQFNPETLYASDTLIRFNGVNYIS QSKVQKVSPSDINPWRVFVDWTGTKERVGTPKKAWPKHVYAPYVDFTLNTIPDLAALAKN HNVNHFTLAFVVSKDANTCLPTWGTAYGMQNYAQYSKIKALREAGGDVMLSIGGANNAPL AASCKNVDDLMQHYYDIVDNLNLKVLDFDIEGTWVADQASIERRNLAVKKVQDKWKSEGK DIAIWYTLPILPTGLTPEGMNVLSDAKAKGVELAGVNVMTMDYGNAICQSANTEGQNIHG 
KCATSAIANLHSQLKGLHPNKSDAEIDAMMGTTPMVGVNDVQGEVFYLSDARLVMQDAQK
RNLGMVGIWSIARDLPGGTNLSPEFHGLTKEQAPKYAFSEIFAPFTKQ

Prediction of potential genes in microbial genomes
Time: Sun May 15 23:12:21 2011
Seq name: gi|296494705|gb|ADTN01000033.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont88.2, whole genome shotgun sequence
Length of sequence - 27568 bp
Number of predicted genes - 45, with homology - 43
Number of transcription units - 8, operones - 6 average op.length - 7.2

   N Tu/Op      Conserved S       Start      End  Score
                pairs(N/Pv)
                          + Prom     72 -    131    4.0
   1   1 Op  1  9/0.000   + CDS     152 -    346    193  ## COG2906 Bacterioferritin-associated ferredoxin
   2   1 Op  2  .         + CDS     418 -    894    650  ## COG2193 Bacterioferritin (cytochrome b1)
                          - Term    837 -    872    2.6
   3   2 Op  1  .         - CDS     923 -   1600    454  ## COG1989 Type II secretory pathway, prepilin signal peptidase PulO and related peptidases
   4   2 Op  2  .         - CDS    1600 -   2061    253  ## EcHS_A3528 general secretion pathway protein M
   5   2 Op  3  4/1.000   - CDS    2058 -   3221    423  ## COG3297 Type II secretory pathway, component PulL
   6   2 Op  4  7/0.000   - CDS    3236 -   4219    527  ## COG3156 Type II secretory pathway, component PulK
   7   2 Op  5  12/0.000  - CDS    4212 -   4799    379  ## COG4795 Type II secretory pathway, component PulJ
   8   2 Op  6  12/0.000  - CDS    4792 -   5169    221  ## COG2165 Type II secretory pathway, pseudopilin PulG
   9   2 Op  7  12/0.000  - CDS    5166 -   5675    322  ## COG2165 Type II secretory pathway, pseudopilin PulG
  10   2 Op  8  10/0.000  - CDS    5683 -   6120    512  ## COG2165 Type II secretory pathway, pseudopilin PulG
  11   2 Op  9  24/0.000  - CDS    6130 -   7326    915  ## COG1459 Type II secretory pathway, component PulF
  12   2 Op 10  6/0.000   - CDS    7323 -   8804    995  ## COG2804 Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB
  13   2 Op 11  .         - CDS    8814 -  10766   1588  ## COG1450 Type II secretory pathway, component PulD
                          - Prom  10831 -  10890    2.5
  14   3 Tu  1  .         + CDS   11449 -  11556     60  ##
                          + Term  11598 -  11634    0.5
                          + Prom  11660 -  11719   11.4
  15   4 Op  1  .         + CDS   11745 -  13214    365  ## JW3285 general secretory pathway component, cryptic
  16   4 Op  2  .         + CDS   13216 -  13635    312  ## B21_03124 hypothetical protein
  17   5 Tu  1  .         - CDS   13604 -  13810    116  ##
                          - Prom  13843 -  13902    2.2
                          + Prom  13643 -  13702    4.3
  18   6 Op  1  40/0.000  + CDS   13873 -  14184    513  ## PROTEIN SUPPORTED gi|15803848|ref|NP_289882.1| 30S ribosomal protein S10
  19   6 Op  2  58/0.000  + CDS   14217 -  14846   1056  ## PROTEIN SUPPORTED gi|15803847|ref|NP_289881.1| 50S ribosomal protein L3
  20   6 Op  3  61/0.000  + CDS   14857 -  15462    992  ## PROTEIN SUPPORTED gi|15803846|ref|NP_289880.1| 50S ribosomal protein L4
  21   6 Op  4  61/0.000  + CDS   15459 -  15761    490  ## PROTEIN SUPPORTED gi|15803845|ref|NP_289879.1| 50S ribosomal protein L23
  22   6 Op  5  60/0.000  + CDS   15779 -  16600   1438  ## PROTEIN SUPPORTED gi|15803844|ref|NP_289878.1| 50S ribosomal protein L2
  23   6 Op  6  59/0.000  + CDS   16617 -  16895    484  ## PROTEIN SUPPORTED gi|15803843|ref|NP_289877.1| 30S ribosomal protein S19
  24   6 Op  7  61/0.000  + CDS   16910 -  17242    537  ## PROTEIN SUPPORTED gi|15803842|ref|NP_289876.1| 50S ribosomal protein L22
  25   6 Op  8  50/0.000  + CDS   17260 -  17961   1183  ## PROTEIN SUPPORTED gi|15833433|ref|NP_312206.1| 30S ribosomal protein S3
  26   6 Op  9  50/0.000  + CDS   17974 -  18384    694  ## PROTEIN SUPPORTED gi|15803840|ref|NP_289874.1| 50S ribosomal protein L16
  27   6 Op 10  50/0.000  + CDS   18384 -  18575    301  ## PROTEIN SUPPORTED gi|15803839|ref|NP_289873.1| 50S ribosomal protein L29
  28   6 Op 11  50/0.000  + CDS   18575 -  18829    431  ## PROTEIN SUPPORTED gi|15803838|ref|NP_289872.1| 30S ribosomal protein S17
                          + Term  18851 -  18894   10.3
                          + Prom  18863 -  18922    7.1
  29   7 Op  1  57/0.000  + CDS   18994 -  19365    617  ## PROTEIN SUPPORTED gi|15803837|ref|NP_289871.1| 50S ribosomal protein L14
  30   7 Op  2  48/0.000  + CDS   19376 -  19690    516  ## PROTEIN SUPPORTED gi|15803836|ref|NP_289870.1| 50S ribosomal protein L24
  31   7 Op  3  50/0.000  + CDS   19705 -  20244    911  ## PROTEIN SUPPORTED gi|15803835|ref|NP_289869.1| 50S ribosomal protein L5
  32   7 Op  4  50/0.000  + CDS   20259 -  20564    513  ## PROTEIN SUPPORTED gi|15803834|ref|NP_289868.1| 30S ribosomal protein S14
  33   7 Op  5  55/0.000  + CDS   20598 -  20990    641  ## PROTEIN SUPPORTED gi|15803833|ref|NP_289867.1| 30S ribosomal protein S8
  34   7 Op  6  46/0.000  + CDS   21003 -  21536    902  ## PROTEIN SUPPORTED gi|15803832|ref|NP_289866.1| 50S ribosomal protein L6
  35   7 Op  7  56/0.000  + CDS   21546 -  21899    572  ## PROTEIN SUPPORTED gi|15803831|ref|NP_289865.1| 50S ribosomal protein L18
  36   7 Op  8  50/0.000  + CDS   21914 -  22417    834  ## PROTEIN SUPPORTED gi|15803830|ref|NP_289864.1| 30S ribosomal protein S5
  37   7 Op  9  48/0.000  + CDS   22421 -  22600    291  ## PROTEIN SUPPORTED gi|15803829|ref|NP_289863.1| 50S ribosomal protein L30
  38   7 Op 10  53/0.000  + CDS   22604 -  23038    715  ## PROTEIN SUPPORTED gi|16131180|ref|NP_417760.1| 50S ribosomal subunit protein L15
  39   7 Op 11  .         + CDS   23046 -  24377   1258  ## PROTEIN SUPPORTED gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11
  40   7 Op 12  .         + CDS   24409 -  24525    198  ## PROTEIN SUPPORTED gi|15803826|ref|NP_289860.1| 50S ribosomal protein L36
                          + Term  24555 -  24596    8.2
                          + Prom  24567 -  24626    3.9
  41   8 Op  1  48/0.000  + CDS   24672 -  25028    586  ## PROTEIN SUPPORTED gi|15803825|ref|NP_289859.1| 30S ribosomal protein S13
  42   8 Op  2  36/0.000  + CDS   25045 -  25434    669  ## PROTEIN SUPPORTED gi|15803824|ref|NP_289858.1| 30S ribosomal protein S11
  43   8 Op  3  26/0.000  + CDS   25468 -  26088   1038  ## PROTEIN SUPPORTED gi|15803823|ref|NP_289857.1| 30S ribosomal protein S4
  44   8 Op  4  50/0.000  + CDS   26114 -  27103   1050  ## COG0202 DNA-directed RNA polymerase, alpha subunit/40 kD subunit
  45   8 Op  5  .         + CDS   27144 -  27527    636  ## PROTEIN SUPPORTED gi|15803821|ref|NP_289855.1| 50S ribosomal protein L17

Predicted protein(s)
>gi|296494705|gb|ADTN01000033.1| GENE 1 152 - 346 193 64 aa, chain + ## HITS:1 COG:bfd KEGG:ns NR:ns ## COG: bfd COG2906 # Protein_GI_number: 16131216 # Func_class: P Inorganic ion transport and metabolism # Function: Bacterioferritin-associated ferredoxin # Organism: Escherichia coli K12 # 1 64 1 64 64 119 100.0 1e-27
MYVCLCNGISDKKIRQAVRQFSPHSFQQLKKFIPVGNQCGKCVRAAREVMEDELMQLPEF
KESA
>gi|296494705|gb|ADTN01000033.1| GENE 2 418 - 894 650 158 aa, chain + ## HITS:1 COG:bfr KEGG:ns NR:ns ## COG: bfr COG2193 # Protein_GI_number: 16131215 # Func_class: P Inorganic ion transport and metabolism # Function: Bacterioferritin (cytochrome b1) # Organism: Escherichia coli K12 # 1 158 1 158 158 278 100.0 3e-75
MKGDTKVINYLNKLLGNELVAINQYFLHARMFKNWGLKRLNDVEYHESIDEMKHADRYIE
RILFLEGLPNLQDLGKLNIGEDVEEMLRSDLALELDGAKNLREAIGYADSVHDYVSRDMM
IEILRDEEGHIDWLETELDLIQKMGLQNYLQAQIREEG
>gi|296494705|gb|ADTN01000033.1| GENE 3 923 - 1600 454 225 aa, chain - ## HITS:1 COG:hofD KEGG:ns NR:ns ## COG: hofD COG1989 # Protein_GI_number: 16131214 # Func_class: N Cell motility; O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, prepilin signal peptidase PulO and related peptidases # Organism: Escherichia coli K12 # 1 225 1 225 225 361 99.0 1e-100
MTMLLPLFILVGFIAGYFVNAIAYHLSPLEDKTALTFRQVLVHFRQKKYAWHDTVPLILC
VAAAIACALAPFTPIVTGALFLYFCFVLTLSVIDFRTQLLPDKLTLPLLWLGLVFNAQYG
LIDLHDAVYGAVAGYGVLWCVYWGVWLVCHKEGLGYGDFKLLAAAGAWCGWQTLPMILLI
ASLGGIGYAIVSQLLQRRTITTIAFGPWLALGSMINLGYLAWISY
>gi|296494705|gb|ADTN01000033.1| GENE 4 1600 - 2061 253 153 aa, chain - ## HITS:1 COG:no KEGG:EcHS_A3528 NR:ns ## KEGG: EcHS_A3528 # Name: not_defined # Def: general secretion pathway protein M # Organism: E.coli_HS # Pathway: Bacterial secretion system [PATH:ecx03070] # 1 153 9 161 161 291 100.0 5e-78
MIKSWWAEKSTSEKQIVAALAVLSLGVFCWLGVIKPIDTYIAEHQSHAQKIKKDIKWMQD QASTHGLLGHPALTQPIKNILLEEAKRENLAITLENGPDNTLTIHPVTAPLENVSRWLTT AQVTYGIVIEDLQFTLAGNEEITLRHLSFREQQ >gi|296494705|gb|ADTN01000033.1| GENE 5 2058 - 3221 423 387 aa, chain - ## HITS:1 COG:gspL KEGG:ns NR:ns ## COG: gspL COG3297 # Protein_GI_number: 16131212 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulL # Organism: Escherichia coli K12 # 1 387 2 388 388 716 100.0 0 MPESLMVIRSSSTLRKHWEWMTFSADSVSSVHTLTDDLPLESLADQPGAGNVHLLIPPEG LLYRSLTLPNAKYKLTAQTLQWLAEETLPDNTQDWHWTVVDKQNESVEVIGIQSEKLSRY LERLHTAGLNVTRVLPDGCYLPWEVDSWTLVNQQTSWLIRSAAHAFNELDEHWLQHLAAQ FPPENMLCYGVVPHGVAAANPLIQHPEIPSLSLYSADIAFQRYDMLHGIFRKQKTVSKSG KWLARLAVSCLVLAILSFVGSRSIALWHTLKIEDQLQQQQQETWQRYFPQIKRTHNFRFY FKQQLAQQYPEAVPLLYHLQTLLLEHPELQLMEANYSQKQKSLTLKMSAKSEANIDRFCE LTQSWLPMEKTEKDPVSGVWTVRNSGK >gi|296494705|gb|ADTN01000033.1| GENE 6 3236 - 4219 527 327 aa, chain - ## HITS:1 COG:gspK KEGG:ns NR:ns ## COG: gspK COG3156 # Protein_GI_number: 16131211 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulK # Organism: Escherichia coli K12 # 1 327 1 327 327 587 100.0 1e-167 MNNEQRGVALLIVLMLLALMAALAADMTLSFHSQLQRTRQVNHHLQRQYDIELAEKLALA SLTQDVKDNDRQTTLQQYWAQPQQLQLEDGNTVKWQLRDAQHCFNLNALAKISDDPLASP DFPAQVFSALLINAGIDRGNTDEIVQSIADYIDVDDSPRFHGAEDSFYQSQTPPRHSANQ MLFLTGELRQIKGITENIYQRLIPYVCVLPTTELSINLNMLTENDIPLFRALFLNNITDA DARVLLQKRPREGWLTTDAFLYWAQQDFSGVKPLVAQVKRHLFPYSRYFTLSTESISDEQ SQGWQSHIFFNRKQQSAQIYRRTLQLY >gi|296494705|gb|ADTN01000033.1| GENE 7 4212 - 4799 379 195 aa, chain - ## HITS:1 COG:gspJ KEGG:ns NR:ns ## COG: gspJ COG4795 # Protein_GI_number: 16131210 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulJ # Organism: Escherichia coli K12 # 1 195 1 195 195 325 100.0 4e-89 
MINRQQGFTLLEVMAALAIFSMLSVLAFMIFSQASELHQRSQKEIQQFNQLQRTITILDN DLLQLVARRNRSTDKIMVLGEEAIFTTQSRDPLAPLSEAQTLLTVHWYLRNHTLYRAVRT SVDGRKDQPAQAMLEHVESFLLESNSGESQELPLSVTLHLQTQQYGGLQRRFALPEQLAR EESPAQTQAGNNNHE >gi|296494705|gb|ADTN01000033.1| GENE 8 4792 - 5169 221 125 aa, chain - ## HITS:1 COG:gspI KEGG:ns NR:ns ## COG: gspI COG2165 # Protein_GI_number: 16131209 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, pseudopilin PulG # Organism: Escherichia coli K12 # 1 125 14 138 138 205 100.0 2e-53 MNKQSGMTLLEVLLAMSIFTAVALTLMSSMQGQRNAIERMRNETLALWIADNQLQSQDSF GEENTSSSGKELINGEEWNWRSDIHSSKDGTLLERTITVTLPSGQTTSLTRYQSIDNKSG QAQDD >gi|296494705|gb|ADTN01000033.1| GENE 9 5166 - 5675 322 169 aa, chain - ## HITS:1 COG:hofH KEGG:ns NR:ns ## COG: hofH COG2165 # Protein_GI_number: 16131208 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, pseudopilin PulG # Organism: Escherichia coli K12 # 1 169 1 169 169 320 100.0 9e-88 MNQQRGFTLLEMMLVLALVAITASVVLFTYGREDVASTRARETAARFTAALELAIDRATL SGQPVGIHFSDSAWRIMVPGKTPSAWRWVPLQEDAADESQNDWDEELSIHLQPFKPDDSN QPQVVILADGQITPFSLLMANAGTGEPLLTLVCSGSWPLDQTLARDTRP >gi|296494705|gb|ADTN01000033.1| GENE 10 5683 - 6120 512 145 aa, chain - ## HITS:1 COG:hofG KEGG:ns NR:ns ## COG: hofG COG2165 # Protein_GI_number: 16131207 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, pseudopilin PulG # Organism: Escherichia coli K12 # 1 145 1 145 145 259 100.0 1e-69 MRATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYK LDNHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDL LSAGPDGEMGTEDDITNWGLSKKKK >gi|296494705|gb|ADTN01000033.1| GENE 11 6130 - 7326 915 398 aa, chain - ## HITS:1 COG:hofF KEGG:ns NR:ns ## COG: hofF COG1459 # Protein_GI_number: 16131206 # Func_class: N Cell motility; U Intracellular trafficking, secretion, 
and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Escherichia coli K12 # 1 398 1 398 398 717 100.0 0 MNYRYRAMTQDGQKLQGIIDANDERQARLRLREEGLFLLDIRPQKSSGVKTRRPRISHSE LTLFTRQLATLSAAALPLEESLAVIGQQSSNKRLGDVLNQVRSAILEGHPLSDALQHFPT LFDSLYRTLVKAGEKSGLLAPVLEKLADYNENRQKIRSKLIQSLIYPCMLTTVAIGVVII LLTAVVPKITEQFVHMKQQLPLSTRILLGLSDTLQRTGPTLLATVFIVAVGFWLWLKRGN NRHRFHAMLLRVALIGPLICAINSARYLRTLSILQSSGVPLLDGMNLSTESLNNLEIRQR LANAAENVRQGNSIHLSLEQTAIFPPMMLYMVASGEKSGQLGTLMVRAADNQETLQQNRI ALTLSIFEPALIITMALIVLFIVVSVLQPLLQLNSMIN >gi|296494705|gb|ADTN01000033.1| GENE 12 7323 - 8804 995 493 aa, chain - ## HITS:1 COG:gspE KEGG:ns NR:ns ## COG: gspE COG2804 # Protein_GI_number: 16131205 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB # Organism: Escherichia coli K12 # 1 493 1 493 493 933 100.0 0 MRIHSPYPASWALAQRIGYLYSEGEIIYLADTPFERLLDIQRQVGQCQTMTSLSQADFEA RLEAVFHQNTGESQQIAQDIDQSVDLLSLSEEMPANEDLLNEDSAAPVIRLINAILSEAI KETASDIHIETYEKTMSIRFRIDGVLRTILQPNKKLAALLISRIKVMARLDIAEKRIPQD GRISLRIGRRNIDVRVSTLPSIYGERAVLRLLDKNSLQLSLNNLGMTAADKQDLENLIQL PHGIILVTGPTGSGKSTTLYAILSALNTPGRNILTVEDPVEYELEGIGQTQVNTRVDMSF ARGLRAILRQDPDVVMVGEIRDTETAQIAVQASLTGHLVLSTLHTNSASGAVTRLRDMGV ESFLLSSSLAGIIAQRLVRRLCPQCRQFTPVSPQQAQMFKYHQLAVTTIGTPVGCPHCHQ SGYQGRMAIHEMMVVTPELRAAIHENVDEQALERLVRQQHKALIKNGLQKVISGDTSWDE VMRVASATLESEA >gi|296494705|gb|ADTN01000033.1| GENE 13 8814 - 10766 1588 650 aa, chain - ## HITS:1 COG:gspD KEGG:ns NR:ns ## COG: gspD COG1450 # Protein_GI_number: 16131204 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulD # Organism: Escherichia coli K12 # 1 650 5 654 654 1224 100.0 0 MKGLNKITCCLLAALLMPCAGHAENEQYGANFNNADIRQFVEIVGQHLGKTILIDPSVQG TISVRSNDTFSQQEYYQFFLSILDLYGYSVITLDNGFLKVVRSANVKTSPGMIADSSRPG 
VGDELVTRIVPLENVPARDLAPLLRQMMDAGSVGNVVHYEPSNVLILTGRASTINKLIEV IKRVDVIGTEKQQIIHLEYASAEDLAEILNQLISESHGKSQMPALLSAKIVADKRTNSLI ISGPEKARQRITSLLKSLDVEESEEGNTRVYYLKYAKATNLVEVLTGVSEKLKDEKGNAR KPSSSGAMDNVAITADEQTNSLVITADQSVQEKLATVIARLDIRRAQVLVEAIIVEVQDG NGLNLGVQWANKNVGAQQFTNTGLPIFNAAQGVADYKKNGGITSANPAWDMFSAYNGMAA GFFNGDWGVLLTALASNNKNDILATPSIVTLDNKLASFNVGQDVPVLSGSQTTSGDNVFN TVERKTVGTKLKVTPQVNEGDAVLLEIEQEVSSVDSSSNSTLGPTFNTRTIQNAVLVKTG ETVVLGGLLDDFSKEQVSKVPLLGDIPLVGQLFRYTSTERAKRNLMVFIRPTIIRDDDVY RSLSKEKYTRYRQEQQQRIDGKSKALVGSEDLPVLDENTFNSHAPAPSSR >gi|296494705|gb|ADTN01000033.1| GENE 14 11449 - 11556 60 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEIIHIKNNRPYAYKYNQYIDGSIFIMIRQMEGET >gi|296494705|gb|ADTN01000033.1| GENE 15 11745 - 13214 365 489 aa, chain + ## HITS:1 COG:no KEGG:JW3285 NR:ns ## KEGG: JW3285 # Name: gspA # Def: general secretory pathway component, cryptic # Organism: E.coli_J # Pathway: not_defined # 1 489 1 489 489 949 100.0 0 MSTRREVILSWLCEKRQTWRLCYLLGEAGSGKTWLAQQLQKDKHRRVITLSLVVSWQGKA AWIVTDDNAAEQGCRDSAWTRDEMAGQLLHALHRTDSRCPLIIIENAHLNHRRILDDLQR AISLIPDGQFLLIGRPDRKVERDFKKQGIELVSIGRLTEHELKASILEGQNIDQPDLLLT ARVLKRIALLCRGDRRKLALAGETIRLLQQAEQTSVFTAKQWRMIYRILGDNRPRKMQLA VVMSGTIIALTCGWLLLSSFTATLPVPAWLIPVTPVVKQDMTKDIAHVVMRDSEALSVLY GVWGYEVPADSAWCDQAVRAGLACKSGNASLQTLVDQNLPWIASLKVGDKKLPVVVVRVG EASVDVLVGQQTWTLTHKWFESVWTGDYLLLWKMSPEGESTITRDSSEEEILWLETMLNR ALHISTEPSAEWRPLLVEKIKQFQKSHHLKTDGVVGFSTLVHLWQVAGESAYLYRDEANI SPETTVKGK >gi|296494705|gb|ADTN01000033.1| GENE 16 13216 - 13635 312 139 aa, chain + ## HITS:1 COG:no KEGG:B21_03124 NR:ns ## KEGG: B21_03124 # Name: pioO # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 139 1 139 139 263 99.0 2e-69 MFEFYIAAREQKETGHPGIFSRQKHSTIIYVICLLLICLWFAGMVLVGGYARQLWVLWIV KAEVTVEAETPAFKQSTQHYFFKKQPLPVVESVEEEDDPGVAVENAPSSSEDEENTVEES EEKAGLRERVKNALNELER >gi|296494705|gb|ADTN01000033.1| GENE 17 13604 - 13810 116 68 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MGECNRSYVAPRLGALLGSQIRLTEVQIEPAVNYDKPAHYTYLTTERKRIASKVNSLRVI SQVRSVHF >gi|296494705|gb|ADTN01000033.1| GENE 18 13873 - 14184 513 103 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803848|ref|NP_289882.1| 30S ribosomal protein S10 [Escherichia coli O157:H7 EDL933] # 1 103 1 103 103 202 100 2e-51 MQNQRIRIRLKAFDHRLIDQATAEIVETAKRTGAQVRGPIPLPTRKERFTVLISPHVNKD ARDQYEIRTHLRLVDIVEPTEKTVDALMRLDLAAGVDVQISLG >gi|296494705|gb|ADTN01000033.1| GENE 19 14217 - 14846 1056 209 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803847|ref|NP_289881.1| 50S ribosomal protein L3 [Escherichia coli O157:H7 EDL933] # 1 209 1 209 209 411 100 1e-114 MIGLVGKKVGMTRIFTEDGVSIPVTVIEVEANRVTQVKDLANDGYRAIQVTTGAKKANRV TKPEAGHFAKAGVEAGRGLWEFRLAEGEEFTVGQSISVELFADVKKVDVTGTSKGKGFAG TVKRWNFRTQDATHGNSLSHRVPGSIGQNQTPGKVFKGKKMAGQMGNERVTVQSLDVVRV DAERNLLLVKGAVPGATGSDLIVKPAVKA >gi|296494705|gb|ADTN01000033.1| GENE 20 14857 - 15462 992 201 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803846|ref|NP_289880.1| 50S ribosomal protein L4 [Escherichia coli O157:H7 EDL933] # 1 201 1 201 201 386 100 1e-107 MELVLKDAQSALTVSETTFGRDFNEALVHQVVVAYAAGARQGTRAQKTRAEVTGSGKKPW RQKGTGRARSGSIKSPIWRSGGVTFAARPQDHSQKVNKKMYRGALKSILSELVRQDRLIV VEKFSVEAPKTKLLAQKLKDMALEDVLIITGELDENLFLAARNLHKVDVRDATGIDPVSL IAFDKVVMTADAVKQVEEMLA >gi|296494705|gb|ADTN01000033.1| GENE 21 15459 - 15761 490 100 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803845|ref|NP_289879.1| 50S ribosomal protein L23 [Escherichia coli O157:H7 EDL933] # 1 100 1 100 100 193 100 1e-48 MIREERLLKVLRAPHVSEKASTAMEKSNTIVLKVAKDATKAEIKAAVQKLFEVEVEVVNT LVVKGKVKRHGQRIGRRSDWKKAYVTLKEGQNLDFVGGAE >gi|296494705|gb|ADTN01000033.1| GENE 22 15779 - 16600 1438 273 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803844|ref|NP_289878.1| 50S ribosomal protein L2 [Escherichia coli O157:H7 EDL933] # 1 273 1 273 273 558 100 1e-158 MAVVKCKPTSPGRRHVVKVVNPELHKGKPFAPLLEKNSKSGGRNNNGRITTRHIGGGHKQ AYRIVDFKRNKDGIPAVVERLEYDPNRSANIALVLYKDGERRYILAPKGLKAGDQIQSGV DAAIKPGNTLPMRNIPVGSTVHNVEMKPGKGGQLARSAGTYVQIVARDGAYVTLRLRSGE 
MRKVEADCRATLGEVGNAEHMLRVLGKAGAARWRGVRPTVRGTAMNPVDHPHGGGEGRNF GKHPVTPWGVQTKGKKTRSNKRTDKFIVRRRSK >gi|296494705|gb|ADTN01000033.1| GENE 23 16617 - 16895 484 92 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803843|ref|NP_289877.1| 30S ribosomal protein S19 [Escherichia coli O157:H7 EDL933] # 1 92 1 92 92 191 100 6e-48 MPRSLKKGPFIDLHLLKKVEKAVESGDKKPLRTWSRRSTIFPNMIGLTIAVHNGRQHVPV FVTDEMVGHKLGEFAPTRTYRGHAADKKAKKK >gi|296494705|gb|ADTN01000033.1| GENE 24 16910 - 17242 537 110 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803842|ref|NP_289876.1| 50S ribosomal protein L22 [Escherichia coli O157:H7 EDL933] # 1 110 1 110 110 211 100 4e-54 METIAKHRHARSSAQKVRLVADLIRGKKVSQALDILTYTNKKAAVLVKKVLESAIANAEH NDGADIDDLKVTKIFVDEGPSMKRIMPRAKGRADRILKRTSHITVVVSDR >gi|296494705|gb|ADTN01000033.1| GENE 25 17260 - 17961 1183 233 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15833433|ref|NP_312206.1| 30S ribosomal protein S3 [Escherichia coli O157:H7 str. Sakai] # 1 233 1 233 233 460 100 1e-129 MGQKVHPNGIRLGIVKPWNSTWFANTKEFADNLDSDFKVRQYLTKELAKASVSRIVIERP AKSIRVTIHTARPGIVIGKKGEDVEKLRKVVADIAGVPAQINIAEVRKPELDAKLVADSI TSQLERRVMFRRAMKRAVQNAMRLGAKGIKVEVSGRLGGAEIARTEWYREGRVPLHTLRA DIDYNTSEAHTTYGVIGVKVWIFKGEILGGMAAVEQPEKPAAQPKKQQRKGRK >gi|296494705|gb|ADTN01000033.1| GENE 26 17974 - 18384 694 136 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803840|ref|NP_289874.1| 50S ribosomal protein L16 [Escherichia coli O157:H7 EDL933] # 1 136 1 136 136 271 100 3e-72 MLQPKRTKFRKMHKGRNRGLAQGTDVSFGSFGLKAVGRGRLTARQIEAARRAMTRAVKRQ GKIWIRVFPDKPITEKPLAVRMGKGKGNVEYWVALIQPGKVLYEMDGVPEELAREAFKLA AAKLPIKTTFVTKTVM >gi|296494705|gb|ADTN01000033.1| GENE 27 18384 - 18575 301 63 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803839|ref|NP_289873.1| 50S ribosomal protein L29 [Escherichia coli O157:H7 EDL933] # 1 63 1 63 63 120 100 1e-26 MKAKELREKSVEELNTELLNLLREQFNLRMQAASGQLQQSHLLKQVRRDVARVKTLLNEK AGA >gi|296494705|gb|ADTN01000033.1| GENE 28 18575 - 18829 431 84 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803838|ref|NP_289872.1| 30S 
ribosomal protein S17 [Escherichia coli O157:H7 EDL933] # 1 84 1 84 84 170 100 8e-42 MTDKIRTLQGRVVSDKMEKSIVVAIERFVKHPIYGKFIKRTTKLHVHDENNECGIGDVVE IRECRPLSKTKSWTLVRVVEKAVL >gi|296494705|gb|ADTN01000033.1| GENE 29 18994 - 19365 617 123 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803837|ref|NP_289871.1| 50S ribosomal protein L14 [Escherichia coli O157:H7 EDL933] # 1 123 1 123 123 242 100 2e-63 MIQEQTMLNVADNSGARRVMCIKVLGGSHRRYAGVGDIIKITIKEAIPRGKVKKGDVLKA VVVRTKKGVRRPDGSVIRFDGNACVLLNNNSEQPIGTRIFGPVTRELRSEKFMKIISLAP EVL >gi|296494705|gb|ADTN01000033.1| GENE 30 19376 - 19690 516 104 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803836|ref|NP_289870.1| 50S ribosomal protein L24 [Escherichia coli O157:H7 EDL933] # 1 104 1 104 104 203 100 1e-51 MAAKIRRDDEVIVLTGKDKGKRGKVKNVLSSGKVIVEGINLVKKHQKPVPALNQPGGIVE KEAAIQVSNVAIFNAATGKADRVGFRFEDGKKVRFFKSNSETIK >gi|296494705|gb|ADTN01000033.1| GENE 31 19705 - 20244 911 179 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803835|ref|NP_289869.1| 50S ribosomal protein L5 [Escherichia coli O157:H7 EDL933] # 1 179 1 179 179 355 100 2e-97 MAKLHDYYKDEVVKKLMTEFNYNSVMQVPRVEKITLNMGVGEAIADKKLLDNAAADLAAI SGQKPLITKARKSVAGFKIRQGYPIGCKVTLRGERMWEFFERLITIAVPRIRDFRGLSAK SFDGRGNYSMGVREQIIFPEIDYDKVDRVRGLDITITTTAKSDEEGRALLAAFDFPFRK >gi|296494705|gb|ADTN01000033.1| GENE 32 20259 - 20564 513 101 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803834|ref|NP_289868.1| 30S ribosomal protein S14 [Escherichia coli O157:H7 EDL933] # 1 101 1 101 101 202 100 2e-51 MAKQSMKAREVKRVALADKYFAKRAELKAIISDVNASDEDRWNAVLKLQTLPRDSSPSRQ RNRCRQTGRPHGFLRKFGLSRIKVREAAMRGEIPGLKKASW >gi|296494705|gb|ADTN01000033.1| GENE 33 20598 - 20990 641 130 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803833|ref|NP_289867.1| 30S ribosomal protein S8 [Escherichia coli O157:H7 EDL933] # 1 130 1 130 130 251 100 4e-66 MSMQDPIADMLTRIRNGQAANKAAVTMPSSKLKVAIANVLKEEGFIEDFKVEGDTKPELE LTLKYFQGKAVVESIQRVSRPGLRIYKRKDELPKVMAGLGIAVVSTSKGVMTDRAARQAG LGGEIICYVA >gi|296494705|gb|ADTN01000033.1| GENE 34 21003 - 21536 902 
177 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803832|ref|NP_289866.1| 50S ribosomal protein L6 [Escherichia coli O157:H7 EDL933] # 1 177 1 177 177 352 100 2e-96 MSRVAKAPVVVPAGVDVKINGQVITIKGKNGELTRTLNDAVEVKHADNTLTFGPRDGYAD GWAQAGTARALLNSMVIGVTEGFTKKLQLVGVGYRAAVKGNVINLSLGFSHPVDHQLPAG ITAECPTQTEIVLKGADKQVIGQVAADLRAYRRPEPYKGKGVRYADEVVRTKEAKKK >gi|296494705|gb|ADTN01000033.1| GENE 35 21546 - 21899 572 117 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803831|ref|NP_289865.1| 50S ribosomal protein L18 [Escherichia coli O157:H7 EDL933] # 1 117 1 117 117 224 100 4e-58 MDKKSARIRRATRARRKLQELGATRLVVHRTPRHIYAQVIAPNGSEVLVAASTVEKAIAE QLKYTGNKDAAAAVGKAVAERALEKGIKDVSFDRSGFQYHGRVQALADAAREAGLQF >gi|296494705|gb|ADTN01000033.1| GENE 36 21914 - 22417 834 167 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803830|ref|NP_289864.1| 30S ribosomal protein S5 [Escherichia coli O157:H7 EDL933] # 1 167 1 167 167 325 100 1e-88 MAHIEKQAGELQEKLIAVNRVSKTVKGGRIFSFTALTVVGDGNGRVGFGYGKAREVPAAI QKAMEKARRNMINVALNNGTLQHPVKGVHTGSRVFMQPASEGTGIIAGGAMRAVLEVAGV HNVLAKAYGSTNPINVVRATIDGLENMNSPEMVAAKRGKSVEEILGK >gi|296494705|gb|ADTN01000033.1| GENE 37 22421 - 22600 291 59 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803829|ref|NP_289863.1| 50S ribosomal protein L30 [Escherichia coli O157:H7 EDL933] # 1 59 1 59 59 116 100 1e-25 MAKTIKITQTRSAIGRLPKHKATLLGLGLRRIGHTVEREDTPAIRGMINAVSFMVKVEE >gi|296494705|gb|ADTN01000033.1| GENE 38 22604 - 23038 715 144 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|16131180|ref|NP_417760.1| 50S ribosomal subunit protein L15 [Escherichia coli str. K-12 substr. 
MG1655] # 1 144 1 144 144 280 100 9e-75 MRLNTLSPAEGSKKAGKRLGRGIGSGLGKTGGRGHKGQKSRSGGGVRRGFEGGQMPLYRR LPKFGFTSRKAAITAEIRLSDLAKVEGGVVDLNTLKAANIIGIQIEFAKVILAGEVTTPV TVRGLRVTKGARAAIEAAGGKIEE >gi|296494705|gb|ADTN01000033.1| GENE 39 23046 - 24377 1258 443 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 [alpha proteobacterium BAL199] # 12 440 16 444 447 489 56 1e-137 MAKQPGLDFQSAKGGLGELKRRLLFVIGALIVFRIGSFIPIPGIDAAVLAKLLEQQRGTI IEMFNMFSGGALSRASIFALGIMPYISASIIIQLLTVVHPTLAEIKKEGESGRRKISQYT RYGTLVLAIFQSIGIATGLPNMPGMQGLVINPGFAFYFTAVVSLVTGTMFLMWLGEQITE RGIGNGISIIIFAGIVAGLPPAIAHTIEQARQGDLHFLVLLLVAVLVFAVTFFVVFVERG QRRIVVNYAKRQQGRRVYAAQSTHLPLKVNMAGVIPAIFASSIILFPATIASWFGGGTGW NWLTTISLYLQPGQPLYVLLYASAIIFFCFFYTALVFNPRETADNLKKSGAFVPGIRPGE QTAKYIDKVMTRLTLVGALYITFICLIPEFMRDAMKVPFYFGGTSLLIVVVVIMDFMAQV QTLMMSSQYESALKKANLKGYGR >gi|296494705|gb|ADTN01000033.1| GENE 40 24409 - 24525 198 38 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803826|ref|NP_289860.1| 50S ribosomal protein L36 [Escherichia coli O157:H7 EDL933] # 1 38 1 38 38 80 100 8e-15 MKVRASVKKLCRNCKIVKRDGVIRVICSAEPKHKQRQG >gi|296494705|gb|ADTN01000033.1| GENE 41 24672 - 25028 586 118 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803825|ref|NP_289859.1| 30S ribosomal protein S13 [Escherichia coli O157:H7 EDL933] # 1 118 1 118 118 230 99 9e-60 MARIAGINIPDHKHAVIALTSIYGVGKTRSKAILAAAGIAEDVKISELSEGQIDTLRDEV AKFVVEGDLRREISMSIKRLMDLGCYRGLRHRRGLPVRGQRTKTNARTRKGPRKPIKK >gi|296494705|gb|ADTN01000033.1| GENE 42 25045 - 25434 669 129 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803824|ref|NP_289858.1| 30S ribosomal protein S11 [Escherichia coli O157:H7 EDL933] # 1 129 1 129 129 262 100 2e-69 MAKAPIRARKRVRKQVSDGVAHIHASFNNTIVTITDRQGNALGWATAGGSGFRGSRKSTP FAAQVAAERCADAVKEYGIKNLEVMVKGPGPGRESTIRALNAAGFRITNITDVTPIPHNG CRPPKKRRV >gi|296494705|gb|ADTN01000033.1| GENE 43 25468 - 26088 1038 206 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803823|ref|NP_289857.1| 30S ribosomal protein S4 [Escherichia coli 
O157:H7 EDL933] # 1 206 1 206 206 404 100 1e-112
MARYLGPKLKLSRREGTDLFLKSGVRAIDTKCKIEQAPGQHGARKPRLSDYGVQLREKQK
VRRIYGVLERQFRNYYKEAARLKGNTGENLLALLEGRLDNVVYRMGFGATRAEARQLVSH
KAIMVNGRVVNIASYQVSPNDVVSIREKAKKQSRVKAALELAEQREKPTWLEVDAGKMEG
TFKRKPERSDLSADINEHLIVELYSK
>gi|296494705|gb|ADTN01000033.1| GENE 44 26114 - 27103 1050 329 aa, chain + ## HITS:1 COG:ECs4160 KEGG:ns NR:ns ## COG: ECs4160 COG0202 # Protein_GI_number: 15833414 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, alpha subunit/40 kD subunit # Organism: Escherichia coli O157:H7 # 1 329 1 329 329 613 100.0 1e-175
MQGSVTEFLKPRLVDIEQVSSTHAKVTLEPLERGFGHTLGNALRRILLSSMPGCAVTEVE
IDGVLHEYSTKEGVQEDILEILLNLKGLAVRVQGKDEVILTLNKSGIGPVTAADITHDGD
VEIVKPQHVICHLTDENASISMRIKVQRGRGYVPASTRIHSEEDERPIGRLLVDACYSPV
ERIAYNVEAARVEQRTDLDKLVIEMETNGTIDPEEAIRRAATILAEQLEAFVDLRDVRQP
EVKEEKPEFDPILLRPVDDLELTVRSANCLKAEAIHYIGDLVQRTEVELLKTPNLGKKSL
TEIKDVLASRGLSLGMRLENWPPASIADE
>gi|296494705|gb|ADTN01000033.1| GENE 45 27144 - 27527 636 127 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803821|ref|NP_289855.1| 50S ribosomal protein L17 [Escherichia coli O157:H7 EDL933] # 1 127 1 127 127 249 100 1e-65
MRHRKSGRQLNRNSSHRQAMFRNMAGSLVRHEIIKTTLPKAKELRRVVEPLITLAKTDSV
ANRRLAFARTRDNEIVAKLFNELGPRFASRAGGYTRILKCGFRAGDNAPMAYIELVDRSE
KAEAAAE

Prediction of potential genes in microbial genomes
Time: Sun May 15 23:12:48 2011
Seq name: gi|296494704|gb|ADTN01000034.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont88.3, whole genome shotgun sequence
Length of sequence - 10624 bp
Number of predicted genes - 14, with homology - 14
Number of transcription units - 4, operones - 3 average op.length - 4.3

   N Tu/Op      Conserved S       Start      End  Score
                pairs(N/Pv)
   1   1 Op  1  .         + CDS      80 -    448    263  ## SSON_3434 hypothetical protein
   2   1 Op  2  .         + CDS     459 -    884    348  ## COG0789 Predicted transcriptional regulators
                          + Term   1095 -   1147   11.1
   3   2 Op  1  7/0.000   - CDS    1155 -   1565    468  ## COG1970 Large-conductance mechanosensitive channel
                          - Prom   1599 -   1658    6.9
                          - Term   1654 -   1693    8.6
   4   2 Op  2  8/0.000   - CDS    1695 -   3071   1149  ## COG0569 K+ transport systems, NAD-binding component
   5   2 Op  3  20/0.000  - CDS    3093 -   4355   1120  ## COG0144 tRNA and rRNA cytosine-C5-methylases
                          - Term   4383 -   4411    2.1
   6   2 Op  4  26/0.000  - CDS    4428 -   5375    886  ## COG0223 Methionyl-tRNA formyltransferase
   7   2 Op  5  .         - CDS    5390 -   5899    556  ## COG0242 N-formylmethionyl-tRNA deformylase
                          - Prom   5938 -   5997    3.5
                          + Prom   5833 -   5892    2.4
   8   3 Op  1  8/0.000   + CDS    6029 -   7153    294  ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake
   9   3 Op  2  7/0.000   + CDS    7125 -   7598    628  ## COG2922 Uncharacterized protein conserved in bacteria
  10   3 Op  3  6/0.000   + CDS    7627 -   8169    169  ## COG0551 Zn-finger domain associated with topoisomerase type I
  11   3 Op  4  8/0.000   + CDS    8174 -   8746    358  ## COG0009 Putative translation factor (SUA5)
  12   3 Op  5  .         + CDS    8751 -   9569    363  ## COG0169 Shikimate 5-dehydrogenase
  13   3 Op  6  .         + CDS    9566 -   9823    172  ## B21_03082 hypothetical protein
  14   4 Tu  1  .         - CDS    9799 -  10353    460  ## COG0663 Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily
                          - Prom  10476 -  10535    4.4

Predicted protein(s)
>gi|296494704|gb|ADTN01000034.1| GENE 1 80 - 448 263 122 aa, chain + ## HITS:1 COG:no KEGG:SSON_3434 NR:ns ## KEGG: SSON_3434 # Name: yhdN # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 122 1 122 122 224 100.0 1e-57
MWLLDQWAERHIAEAQAKGEFDNLAGSGEPLILDDDSHVPPELRAGYRLLKNAGCLPPEL
EQRREAIQLLDILKGIRHDDPQYQEVSRRLSLLELKLRQAGLSTDFLRGDYADKLLDKIN
DN
>gi|296494704|gb|ADTN01000034.1| GENE 2 459 - 884 348 141 aa, chain + ## HITS:1 COG:ECs4157 KEGG:ns NR:ns ## COG: ECs4157 COG0789 # Protein_GI_number: 15833411 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 141 1 141 141 247 100.0 4e-66
MYRIGELAKMAEVTPDTIRYYEKQQMMEHEVRTEGGFRLYTESDLQRLKFIRHARQLGFS
LESIRELLSIRIDPEHHTCQESKGIVQERLQEVEARIAELQSMQRSLQRLNDACCGTAHS
SVYCSILEALEQGASGVKSGC
>gi|296494704|gb|ADTN01000034.1| GENE 3 1155 - 1565 468 136 aa, chain - ## HITS:1 COG:ECs4156 KEGG:ns NR:ns ## COG: ECs4156 COG1970 # Protein_GI_number: 15833410 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Large-conductance mechanosensitive channel # Organism: Escherichia coli O157:H7 # 1 136 1 136 136 243 100.0 5e-65
MSIIKEFREFAMRGNVVDLAVGVIIGAAFGKIVSSLVADIIMPPLGLLIGGIDFKQFAVT
LRDAQGDIPAVVMHYGVFIQNVFDFLIVAFAIFMAIKLINKLNRKKEEPAAAPAPTKEEV
LLTEIRDLLKEQNNRS
>gi|296494704|gb|ADTN01000034.1| GENE 4 1695 - 3071 1149 458 aa, chain - ## HITS:1 COG:ECs4155 KEGG:ns NR:ns ## COG: ECs4155 COG0569 # Protein_GI_number: 15833409 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Escherichia coli O157:H7 # 1 458 1 458 458 871 100.0 0
MKIIILGAGQVGGTLAENLVGENNDITVVDTNGERLRTLQDKFDLRVVQGHGSHPRVLRE
AGADDADMLVAVTSSDETNMVACQVAYSLFNTPNRIARIRSPDYVRDADKLFHSDAVPID
HLIAPEQLVIDNIYRLIEYPGALQVVNFAEGKVSLAVVKAYYGGPLIGNALSTMREHMPH
IDTRVAAIFRHDRPIRPQGSTIVEAGDEVFFIAASQHIRAVMSELQRLEKPYKRIMLVGG GNIGAGLARRLEKDYSVKLIERNQQRAAELAEKLQNTIVFFGDASDQELLAEEHIDQVDL FIAVTNDDEANIMSAMLAKRMGAKKVMVLIQRRAYVDLVQGSVIDIAISPQQATISALLS HVRKADIVGVSSLRRGVAEAIEAVAHGDESTSRVVGRVIDEIKLPPGTIIGAVVRGNDVM IANDNLRIEQGDHVIMFLTDKKFITDVERLFQPSPFFL >gi|296494704|gb|ADTN01000034.1| GENE 5 3093 - 4355 1120 420 aa, chain - ## HITS:1 COG:sun KEGG:ns NR:ns ## COG: sun COG0144 # Protein_GI_number: 16131168 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Escherichia coli K12 # 1 420 10 429 429 841 100.0 0 MAAQAVEQVVEQGQSLSNILPPLQQKVSDKDKALLQELCFGVLRTLSQLDWLINKLMARP MTGKQRTVHYLIMVGLYQLLYTRIPPHAALAETVEGAIAIKRPQLKGLINGVLRQFQRQQ EELLAEFNASDARYLHPSWLLKRLQKAYPEQWQSIVEANNQRPPMWLRINRTHHSRDSWL ALLDEAGMKGFPHADYPDAVRLETPAPVHALPGFEDGWVTVQDASAQGCMTWLAPQNGEH ILDLCAAPGGKTTHILEVAPEAQVVAVDIDEQRLSRVYDNLKRLGMKATVKQGDGRYPSQ WCGEQQFDRILLDAPCSATGVIRRHPDIKWLRRDRDIPELAQLQSEILDAIWPHLKTGGT LVYATCSVLPEENSLQIKAFLQRTADAELCETGTPEQPGKQNLPGAEEGDGFFYAKLIKK >gi|296494704|gb|ADTN01000034.1| GENE 6 4428 - 5375 886 315 aa, chain - ## HITS:1 COG:fmt KEGG:ns NR:ns ## COG: fmt COG0223 # Protein_GI_number: 16131167 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA formyltransferase # Organism: Escherichia coli K12 # 1 315 1 315 315 612 100.0 1e-175 MSESLRIIFAGTPDFAARHLDALLSSGHNVVGVFTQPDRPAGRGKKLMPSPVKVLAEEKG LPVFQPVSLRPQENQQLVAELQADVMVVVAYGLILPKAVLEMPRLGCINVHGSLLPRWRG AAPIQRSLWAGDAETGVTIMQMDVGLDTGDMLYKLSCPITAEDTSGTLYDKLAELGPQGL ITTLKQLADGTAKPEVQDETLVTYAEKLSKEEARIDWSLSAAQLERCIRAFNPWPMSWLE IEGQPVKVWKASVIDTATNAAPGTILEANKQGIQVATGDGILNLLSLQPAGKKAMSAQDL LNSRREWFVPGNRLV >gi|296494704|gb|ADTN01000034.1| GENE 7 5390 - 5899 556 169 aa, chain - ## HITS:1 COG:ECs4152 KEGG:ns NR:ns ## COG: ECs4152 COG0242 # Protein_GI_number: 15833406 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Escherichia coli 
O157:H7 # 1 169 1 169 169 291 100.0 3e-79 MSVLQVLHIPDERLRKVAKPVEEVNAEIQRIVDDMFETMYAEEGIGLAATQVDIHQRIIV IDVSENRDERLVLINPELLEKSGETGIEEGCLSIPEQRALVPRAEKVKIRALDRDGKPFE LEADGLLAICIQHEMDHLVGKLFMDYLSPLKQQRIRQKVEKLDRLKARA >gi|296494704|gb|ADTN01000034.1| GENE 8 6029 - 7153 294 374 aa, chain + ## HITS:1 COG:smf KEGG:ns NR:ns ## COG: smf COG0758 # Protein_GI_number: 16132235 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Escherichia coli K12 # 1 374 1 374 374 746 99.0 0 MVDTEIWLRLMSISSLYGDDMVRIAHWVAKQSHIDAVVLQQTGLTLRQAQRFLSFPRKSI ESSLCWLEQPNHHLIPADSEFYPPQLLATTDYPGALFVEGELHALHSFQLAVVGSRAHSW YGERWGRLFCETLATRGVTITSGLARGIDGVAHKAALQVNGVSIAVLGNGLNTIHPRRHA RLAASLLEQGGALVSEFPLDVPPLAYNFSRRNRIISGLSKGVLVVEAALRSGSLVTARCA LEQGREVFALPGPIGNPGSEGPHWLIKQGAILVTEPEEILENLQFGLHWLPDAPENSFYS PDQQDVALPFPELLANVGDEVTPVDVVAERAGQPVPEVVTQLLELELAGWIAAVPGGYVR LRRACHVRRTNVFV >gi|296494704|gb|ADTN01000034.1| GENE 9 7125 - 7598 628 157 aa, chain + ## HITS:1 COG:smg KEGG:ns NR:ns ## COG: smg COG2922 # Protein_GI_number: 16131165 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 157 1 157 157 270 100.0 5e-73 MFDVLMYLFETYIHTEAELRVDQDKLEQDLTDAGFEREDIYNALLWLEKLADYQEGLAEP MQLASDPLSMRIYTPEECERLDASCRGFLLFLEQIQVLNLETREMVIERVLALDNAEFEL DDLKWVILMVLFNIPGCENAYQQMEELLFEVNEGMLH >gi|296494704|gb|ADTN01000034.1| GENE 10 7627 - 8169 169 180 aa, chain + ## HITS:1 COG:ZyrdD KEGG:ns NR:ns ## COG: ZyrdD COG0551 # Protein_GI_number: 15803811 # Func_class: L Replication, recombination and repair # Function: Zn-finger domain associated with topoisomerase type I # Organism: Escherichia coli O157:H7 EDL933 # 9 180 1 172 172 328 99.0 3e-90 MAKSALFTVRNNESCPKCGAELVIRSGKHGPFLGCSQYPACDYVRPLKSSADGHIVKVLE GQVCPACGANLVLRQGRFGMFIGCINYPECEHTELIDKPDETAITCPQCRTGHLVQRRSR 
YGKTFHSCDRYPECQFAINFKPIAGECPECHYPLLIEKKTAQGVKHFCASKQCGKPVSAE >gi|296494704|gb|ADTN01000034.1| GENE 11 8174 - 8746 358 190 aa, chain + ## HITS:1 COG:yrdC KEGG:ns NR:ns ## COG: yrdC COG0009 # Protein_GI_number: 16131163 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Escherichia coli K12 # 1 190 1 190 190 374 100.0 1e-104 MNNNLQRDAIAAAIDVLNEERVIAYPTEAVFGVGCDPDSETAVMRLLELKQRPVDKGLIL IAANYEQLKPYIDDTMLTDVQRETIFSRWPGPVTFVFPAPATTPRWLTGRFDSLAVRVTD HPLVVALCQAYGKPLVSTSANLSGLPPCRTVDEVRAQFGAAFPVVPGETGGRLNPSEIRD ALTGELFRQG >gi|296494704|gb|ADTN01000034.1| GENE 12 8751 - 9569 363 272 aa, chain + ## HITS:1 COG:aroE KEGG:ns NR:ns ## COG: aroE COG0169 # Protein_GI_number: 16131162 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Escherichia coli K12 # 1 272 1 272 272 551 100.0 1e-157 METYAVFGNPIAHSKSPFIHQQFAQQLNIEHPYGRVLAPINDFINTLNAFFSAGGKGANV TVPFKEEAFARADELTERAALAGAVNTLMRLEDGRLLGDNTDGVGLLSDLERLSFIRPGL RILLIGAGGASRGVLLPLLSLDCAVTITNRTVSRAEELAKLFAHTGSIQALSMDELEGHE FDLIINATSSGISGDIPAIPSSLIHPGIYCYDMFYQKGKTPFLAWCEQRGSKRNADGLGM LVAQAAHAFLLWHGVLPDVEPVIKQLQEELSA >gi|296494704|gb|ADTN01000034.1| GENE 13 9566 - 9823 172 85 aa, chain + ## HITS:1 COG:no KEGG:B21_03082 NR:ns ## KEGG: B21_03082 # Name: yrdB # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 85 1 85 85 155 100.0 4e-37 MNQAIQFPDREEWDENKKCVCFPALVNGMQLTCAISGESLAYRFTGDTPEQWLASFRQHR WDLEEEAENLIQEQSEDDQGWVWLP >gi|296494704|gb|ADTN01000034.1| GENE 14 9799 - 10353 460 184 aa, chain - ## HITS:1 COG:ECs4145 KEGG:ns NR:ns ## COG: ECs4145 COG0663 # Protein_GI_number: 15833399 # Func_class: R General function prediction only # Function: Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily # Organism: Escherichia coli O157:H7 # 1 184 73 256 256 385 100.0 1e-107 MSDVLRPYRDLFPQIGQRVMIDDSSVVIGDVRLADDVGIWPLVVIRGDVHYVQIGARTNI 
QDGSMLHVTHKSSYNPDGNPLTIGEDVTVGHKVMLHGCTIGNRVLVGMGSILLDGAIVED
DVMIGAGSLVPQNKRLESGYLYLGSPVKQIRPLSDEEKAGLRYSANNYVKWKDEYLDQGN
QTQP

Prediction of potential genes in microbial genomes
Time: Sun May 15 23:12:55 2011
Seq name: gi|296494703|gb|ADTN01000035.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont90.1, whole genome shotgun sequence
Length of sequence - 6327 bp
Number of predicted genes - 6, with homology - 6
Number of transcription units - 3, operones - 1 average op.length - 4.0

   N Tu/Op      Conserved S       Start      End  Score
                pairs(N/Pv)
   1   1 Tu  1  .         - CDS      31 -   1197   1669  ## COG1004 Predicted UDP-glucose 6-dehydrogenase
                          - Prom   1259 -   1318    2.2
                          - Term   1319 -   1356    6.2
   2   2 Op  1  9/0.000   - CDS    1378 -   1932    561  ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes
   3   2 Op  2  11/0.000  - CDS    1947 -   2837   1045  ## COG1091 dTDP-4-dehydrorhamnose reductase
   4   2 Op  3  16/0.000  - CDS    2869 -   3738   1054  ## COG1209 dTDP-glucose pyrophosphorylase
   5   2 Op  4  .         - CDS    3765 -   4829   1136  ## COG1088 dTDP-D-glucose 4,6-dehydratase
                          - Prom   4865 -   4924    5.1
                          - Term   4857 -   4920   -0.8
   6   3 Tu  1  .         - CDS    5004 -   5771    -23  ## MADE_02452 putative lipopolysaccharide modification acyltransferase
                          - Prom   5856 -   5915    5.2

Predicted protein(s)
>gi|296494703|gb|ADTN01000035.1| GENE 1 31 - 1197 1669 388 aa, chain - ## HITS:1 COG:ECs2829 KEGG:ns NR:ns ## COG: ECs2829 COG1004 # Protein_GI_number: 15832083 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Escherichia coli O157:H7 # 1 388 1 388 388 647 83.0 0
MKITISGTGYVGLSNGVLIAQNHEVVALDIVQAKVDMLNQKISPIVDKEIQEYLAEKPLN
FRATTDKHDAYRNADYVIIATPTDYDPKTNYFNTSTVEAVIRDVTEINPNAVMIIKSTIP
VGFTRDIKERLGIDNVIFSPEFLREGRALYDNLHPSRIVIGERSARAERFADLLKEGAIK
QDIPTLFTDSTEAEAIKLFANTYLALRVAYFNELDSYAESQGLNSKQIIEGVCLDPRIGN
HYNNPSFGYGGYCLPKDTKQLLANYESVPNNIIAAIVDANRTRKDFIADSILARKPKVVG
VYRLIMKSGSDNFRASSIQGIMKRIKAKGIPVIIYEPVMQEDEFFNSRVVRDLDAFKQEA
DVIISNRMAEELADVADKVYTRDLFGND
>gi|296494703|gb|ADTN01000035.1| GENE 2 1378 - 1932 561 184 aa, chain - ## HITS:1 COG:STM2094 KEGG:ns NR:ns ## COG: STM2094 COG1898 # Protein_GI_number: 16765424 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Salmonella typhimurium LT2 # 1 158 2 159 183 247 70.0 8e-66
MNIIKTDIPEVLIFEPRVFGDARGFFFESFSSKVFNEAVGRQVDFVQDNHSQSQKGVLRG
LHYQLAPHAQGKLVRCVEGEVFDVAVDIRRSSPTFGKWVGAVLSAENKRQLWIPEGFAHG
FMALSDTVQFVYKATNYYAPQSERSIIWNDPEIGIDWPALSDCALSLSEKDLQAHTLATA
EVYK
>gi|296494703|gb|ADTN01000035.1| GENE 3 1947 - 2837 1045 296 aa, chain - ## HITS:1 COG:STM2096 KEGG:ns NR:ns ## COG: STM2096 COG1091 # Protein_GI_number: 16765426 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Salmonella typhimurium LT2 # 1 294 1 294 299 369 62.0 1e-102
MKILLIGKNGQVGWELQRSLSTLGDVVAVDYFDKELCGDLTNLEGIAQTVRTVRPDVVVN
AAAHTAVDKAESERELSDLLNDKGVAVLAAESAKLGALMVHYSTDYVFDGAGSHYRREDE
ATGPLNVYGETKRAGELALEQGNPRHLIFRTSWVYATRGANFAKTMLRLAGEKETLSIID
DQHGAPTGAELLADCTATAIRETLRNPALAGTYHLVASGETSWCDYARYVFEVARAHGAE
LAVQEVKGIPTTAYPTPAKRPLNSRLSNEKFQQAFGVTLPDWRQGVARVVTEVLGK >gi|296494703|gb|ADTN01000035.1| GENE 4 2869 - 3738 1054 289 aa, chain - ## HITS:1 COG:YPO3861 KEGG:ns NR:ns ## COG: YPO3861 COG1209 # Protein_GI_number: 16123996 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Yersinia pestis # 1 289 1 289 293 465 75.0 1e-131 MKGIVLAGGSGTRLYPITQGVSKQLLPIYDKPMIFYPVSVLMLAGIRDILIISTPDDMPS FQRLLGDGSQFGVNFSYAVQPSPDGLAQAFIIGEKFIGNDACALVLGDNIYFGQSFGKKL EAAAAKTSGATVFGYQVLDPERFGVVEFDENFKALSIEEKPLKPKSDWAVTGLYFYDNNV VEMAKDVKPSERGELEITTLNQMYLEQGDLHVELLGRGFAWLDTGTHDSLMDASQFIHTI EKRQGMKVACLEEIGYRNQWLSAEGVAAQAERLKKTEYGAYLKRLLNER >gi|296494703|gb|ADTN01000035.1| GENE 5 3765 - 4829 1136 354 aa, chain - ## HITS:1 COG:YPO3862 KEGG:ns NR:ns ## COG: YPO3862 COG1088 # Protein_GI_number: 16123997 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Yersinia pestis # 2 354 3 355 355 596 79.0 1e-170 MKILVTGGAGFIGSAVVRHIIENTRDEVRVVDCLTYAGNLESLEPVAGSERYSFSQTDIT DAAAVAAQFSEFRPDIVMHLAAESHVDRSIDGPAAFIQTNVIGTFTLLEAARHYWSELGE AQKQAFRFHHISTDEVYGDLHGTDDLFTEETPYAPSSPYSASKAGSDHLVRAWNRTYGLP VVVTNCSNNYGPYHFPEKLIPLTILNALAGKPLPVYGNGEQIRDWLYVEDHARALYKVAT EGKSGETYNIGGHNERKNIDVVRTICAILDKVVAQKPGNITHFADLITFVTDRPGHDLRY AIDAAKIQRDLGWVPQETFESGIEKTVHWYLNNQTWWQRVLDGSYAGERLGLNN >gi|296494703|gb|ADTN01000035.1| GENE 6 5004 - 5771 -23 255 aa, chain - ## HITS:1 COG:no KEGG:MADE_02452 NR:ns ## KEGG: MADE_02452 # Name: not_defined # Def: putative lipopolysaccharide modification acyltransferase # Organism: A.macleodii # Pathway: not_defined # 23 248 418 639 639 112 32.0 2e-23 MDRFHTYSSSDEGLAQFGNAKRICFLTSGFNNAANFSEDKCLVNNPSNNKPRLLLIGDSH AAEFYQAAKEYYPNHLVMQVTSSGCMPFIEAKGEKRCVDLMRGFYQKYINKYKFDIVLIS ANWVDGRKYYTDRQMVENIKHSVSILNAHSKHVFVLGQTKVFDTPFYRIALSGDPAKILR HELIEPKYFNDYLVNIFHEGDVNYIDLYDSECVNGNCRYISSDNYPLMFDNNHFTLPWSR LIMKNISIATPGVEK Prediction of potential genes in microbial genomes Time: Sun May 15 23:13:03 2011 
Seq name: gi|296494702|gb|ADTN01000036.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont96.1, whole genome shotgun sequence Length of sequence - 10354 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 4, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 37 - 96 1.9 1 1 Tu 1 . + CDS 175 - 2043 1692 ## COG3158 K+ transporter + Term 2049 - 2089 5.6 + Prom 2056 - 2115 4.1 2 2 Op 1 9/1.000 + CDS 2210 - 2629 392 ## COG1869 ABC-type ribose transport system, auxiliary component 3 2 Op 2 21/0.000 + CDS 2637 - 4142 161 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 4 2 Op 3 16/0.000 + CDS 4147 - 5112 1302 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 5 2 Op 4 8/1.000 + CDS 5137 - 6027 1140 ## COG1879 ABC-type sugar transport system, periplasmic component 6 3 Op 1 13/0.000 + CDS 6138 - 7082 759 ## COG0524 Sugar kinases, ribokinase family 7 3 Op 2 . + CDS 7086 - 8078 940 ## COG1609 Transcriptional regulators 8 4 Op 1 5/1.000 - CDS 8044 - 9471 888 ## COG0477 Permeases of the major facilitator superfamily 9 4 Op 2 . 
- CDS 9494 - 10186 531 ## COG2186 Transcriptional regulators - Prom 10213 - 10272 6.7 Predicted protein(s) >gi|296494702|gb|ADTN01000036.1| GENE 1 175 - 2043 1692 622 aa, chain + ## HITS:1 COG:ECs4689 KEGG:ns NR:ns ## COG: ECs4689 COG3158 # Protein_GI_number: 15833943 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transporter # Organism: Escherichia coli O157:H7 # 1 622 1 622 622 1177 99.0 0 MSTDNKQSLPAITLAAIGVVYGDIGTSPLYTLRECLSGQFGFGVERDAVFGFLSLIFWLL IFVVSIKYLTFVMRADNAGEGGILTLMSLAGRNTSARTTSMLVIMGLIGGSFFYGEVVIT PAISVMSAIEGLEIVAPQLDTWIVPLSIIVLTLLFMIQKHGTAMVGKLFAPIMLTWFLIL AGLGLRSIIANPEVLHALNPMWAVHFFLEYKTVSFIALGAVVLSITGVEALYADMGHFGK FPIRLAWFTVVLPSLTLNYFGQGALLLKNPEAIKNPFFLLAPDWALIPLLIIAALATVIA SQAVISGVFSLTRQAVRLGYLSPMRIIHTSEMESGQIYIPFVNWMLYVAVVIVIVSFEHS SNLAAAYGIAVTGTMVLTSILSTTVARQNWHWNKYFVALILIAFLCVDIPLFTANLDKLL SGGWLPLSLGTVMFIVMTTWKSERFRLLRRMHEHGNSLEAMIASLEKSPPVRVPGTAVYM SRAINVIPFALMHNLKHNKVLHERVILLTLRTEDAPYVHNVRRVQIEQLSPTFWRVVASY GWRETPNVEEVFHRCGLEGLSCRMMETSFFMSHESLILGKRPWYLRLRGKLYLLLQRNAL RAPDQFEIPPNRVIELGTQVEI >gi|296494702|gb|ADTN01000036.1| GENE 2 2210 - 2629 392 139 aa, chain + ## HITS:1 COG:rbsD KEGG:ns NR:ns ## COG: rbsD COG1869 # Protein_GI_number: 16131616 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type ribose transport system, auxiliary component # Organism: Escherichia coli K12 # 1 139 13 151 151 256 100.0 8e-69 MKKGTVLNSDISSVISRLGHTDTLVVCDAGLPIPKSTTRIDMALTQGVPSFMQVLGVVTN EMQVEAAIIAEEIKHHNPQLHETLLTHLEQLQKHQGNTIEIRYTTHEQFKQQTAESQAVI RSGECSPYANIILCAGVTF >gi|296494702|gb|ADTN01000036.1| GENE 3 2637 - 4142 161 501 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 268 487 28 247 563 66 26 7e-11 MEALLQLKGIDKAFPGVKALSGAALNVYPGRVMALVGENGAGKSTMMKVLTGIYTRDAGT LLWLGKETTFTGPKSSQEAGIGIIHQELNLIPQLTIAENIFLGREFVNRFGKIDWKTMYA EADKLLAKLNLRFKSDKLVGDLSIGDQQMVEIAKVLSFESKVIIMDEPTDALTDTETESL FRVIRELKSQGRGIVYISHRMKEIFEICDDVTVFRDGQFIAEREVASLTEDSLIEMMVGR KLEDQYPHLDKAPGDIRLKVDNLCGPGVNDVSFTLRKGEILGVSGLMGAGRTELMKVLYG ALPRTSGYVTLDGHEVVTRSPQDGLANGIVYISEDRKRDGLVLGMSVKENMSLTALRYFS RAGGSLKHADEQQAVSDFIRLFNVKTPSMEQAIGLLSGGNQQKVAIARGLMTRPKVLILD EPTRGVDVGAKKEIYQLINQFKADGLSIILVSSEMPEVLGMSDRIIVMHEGHLSGEFTRE QATQEVLMAAAVGKLNRVNQE >gi|296494702|gb|ADTN01000036.1| GENE 4 4147 - 5112 1302 321 aa, chain + ## HITS:1 COG:ECs4692 KEGG:ns NR:ns ## COG: ECs4692 COG1172 # Protein_GI_number: 15833946 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Escherichia coli O157:H7 # 1 321 1 321 321 481 100.0 1e-135 MTTQTVSGRRYFTKAWLMEQKSLIALLVLIAIVSTLSPNFFTINNLFNILQQTSVNAIMA VGMTLVILTSGIDLSVGSLLALTGAVAASIVGIEVNALVAVAAALALGAAIGAVTGVIVA KGRVQAFIATLVMMLLLRGVTMVYTNGSPVNTGFTENADLFGWFGIGRPLGVPTPVWIMG IVFLAAWYMLHHTRLGRYIYALGGNEAATRLSGINVNKIKIIVYSLCGLLASLAGIIEVA RLSSAQPTAGTGYELDAIAAVVLGGTSLAGGKGRIVGTLIGALILGFLNNGLNLLGVSSY YQMIVKAVVILLAVLVDNKKQ >gi|296494702|gb|ADTN01000036.1| GENE 5 5137 - 6027 1140 296 aa, chain + ## HITS:1 COG:ECs4693 KEGG:ns NR:ns ## COG: ECs4693 COG1879 # Protein_GI_number: 15833947 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 1 296 1 296 296 493 100.0 1e-139 MNMKKLATLVSAVALSATVSANAMAKDTIALVVSTLNNPFFVSLKDGAQKEADKLGYNLV VLDSQNNPAKELANVQDLTVRGTKILLINPTDSDAVGNAVKMANQANIPVITLDRQATKG EVVSHIASDNVLGGKIAGDYIAKKAGEGAKVIELQGIAGTSAARERGEGFQQAVAAHKFN VLASQPADFDRTKGLNVMQNLLTAHPDVQAVFAQNDEMALGALRALQTAGKSDVMVVGFD GTPDGEKAVNDGKLAATIAQLPDQIGAKGVETADKVLKGEKVQAKYPVDLKLVVKQ >gi|296494702|gb|ADTN01000036.1| GENE 6 6138 - 7082 759 314 aa, chain + ## 
HITS:1 COG:ECs4694 KEGG:ns NR:ns ## COG: ECs4694 COG0524 # Protein_GI_number: 15833948 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli O157:H7 # 6 314 1 309 309 529 100.0 1e-150 MDIPNMQNAGSLVVLGSINADHILNLQSFPTPGETVTGNHYQVAFGGKGANQAVAAGRSG ANIAFIACTGDDSIGESVRQQLATDNIDITPVSVIKGESTGVALIFVNGEGENVIGIHAG ANAALSPALVEAQRERIANASALLMQLESPLESVMAAAKIAHQNKTIVALNPAPARELPD ELLALVDIITPNETEAEKLTGIRVENDEDAAKAAQVLHEKGIRTVLITLGSRGVWASVNG EGQRVPGFRVQAVDTIAAGDTFNGALITALLEEKPLPEAIRFAHAAAAIAVTRKGAQPSV PWREEIDAFLDRQR >gi|296494702|gb|ADTN01000036.1| GENE 7 7086 - 8078 940 330 aa, chain + ## HITS:1 COG:ECs4695 KEGG:ns NR:ns ## COG: ECs4695 COG1609 # Protein_GI_number: 15833949 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 330 1 330 330 661 100.0 0 MATMKDVARLAGVSTSTVSHVINKDRFVSEAITAKVEAAIKELNYAPSALARSLKLNQTH TIGMLITASTNPFYSELVRGVERSCFERGYSLVLCNTEGDEQRMNRNLETLMQKRVDGLL LLCTETHQPSREIMQRYPTVPTVMMDWAPFDGDSDLIQDNSLLGGDLATQYLIDKGHTRI ACITGPLDKTPARLRLEGYRAAMKRAGLNIPDGYEVTGDFEFNGGFDAMRQLLSHPLRPQ AVFTGNDAMAVGVYQALYQAELQVPQDIAVIGYDDIELASFMTPPLTTIHQPKDELGELA IDVLIHRITQPTLQQQRLQLTPILMERGSA >gi|296494702|gb|ADTN01000036.1| GENE 8 8044 - 9471 888 475 aa, chain - ## HITS:1 COG:yieO KEGG:ns NR:ns ## COG: yieO COG0477 # Protein_GI_number: 16131622 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 475 1 475 475 831 100.0 0 MSDKKKRSMAGLPWIAAMAFFMQALDATILNTALPAIAHSLNRSPLAMQSAIISYTLTVA MLIPVSGWLADRFGTRRIFTLAVSLFTLGSLACALSNSLPQLVVFRVIQGIGGAMMMPVA RLALLRAYPRNELLPVLNFVAMPGLVGPILGPVLGGVLVTWATWHWIFLINIPIGIAGLL YARKHMPNFTTARRRFDITGFLLFGLSLVLFSSGIELFGEKIVASWIALTVIVTSIGLLL LYILHARRTPNPLISLDLFKTRTFSIGIVGNIATRLGTGCVPFLMPLMLQVGFGYQAFIA 
GCMMAPTALGSIIAKSMVTQVLRRLGYRHTLVGITVIIGLMIAQFSLQSPAMAIWMLILP LFILGMAMSTQFTAMNTITLADLTDDNASSGNSVLAVTQQLSISLGVAVSAAVLRVYEGM EGTTTVEQFHYTFITMGIITVASAAMFMLLKTTDGNNLIKRQRKSKPNRVPSESE >gi|296494702|gb|ADTN01000036.1| GENE 9 9494 - 10186 531 230 aa, chain - ## HITS:1 COG:ZyieP KEGG:ns NR:ns ## COG: ZyieP COG2186 # Protein_GI_number: 15804355 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 EDL933 # 1 230 1 230 230 461 100.0 1e-130 MPLSAQQLAAQKNLSYVLAEKLAQRILKGEYEPGTILPGEIELGEQFGVSRTAVREAVKT LTAKGMVLPRPRIGTRVMPQSNWNFLDQELLTWWMTEENFHQVIDHFLVMRICLEPQACL LAATVGTAEQKAHLNTLMAEMAALKENFRRERWIEVDMAWHEHIYEMSANPFLTSFASLF HSVYHTYFTSITSDTVIKLDLHQAIVDAIIQSDGDAAFKACQALLRSPDK Prediction of potential genes in microbial genomes Time: Sun May 15 23:13:11 2011 Seq name: gi|296494701|gb|ADTN01000037.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont101.1, whole genome shotgun sequence Length of sequence - 23849 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 10, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 3 - 77 75.8 # Gln CTG 0 0 + TRNA 115 - 189 75.8 # Gln CTG 0 0 - Term 187 - 215 1.0 1 1 Tu 1 . 
- CDS 343 - 1518 1129 ## COG0654 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases - Prom 1569 - 1628 3.1 + Prom 1568 - 1627 2.5 2 2 Op 1 10/0.000 + CDS 1664 - 3088 438 ## PROTEIN SUPPORTED gi|227372256|ref|ZP_03855738.1| SSU ribosomal protein S12P methylthiotransferase + Term 3095 - 3127 4.5 + Prom 3131 - 3190 2.1 3 2 Op 2 17/0.000 + CDS 3241 - 4281 1263 ## COG1702 Phosphate starvation-inducible protein PhoH, predicted ATPase 4 2 Op 3 11/0.000 + CDS 4278 - 4745 726 ## COG0319 Predicted metal-dependent hydrolase + Term 4776 - 4811 2.4 5 2 Op 4 9/0.000 + CDS 4835 - 5713 1019 ## COG4535 Putative Mg2+ and Co2+ transporter CorC 6 2 Op 5 4/1.000 + CDS 5738 - 7276 1969 ## COG0815 Apolipoprotein N-acyltransferase 7 3 Op 1 31/0.000 + CDS 7673 - 8581 1187 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain + Term 8704 - 8743 9.1 + Prom 8647 - 8706 2.0 8 3 Op 2 17/0.000 + CDS 8751 - 9491 825 ## COG0765 ABC-type amino acid transport system, permease component 9 3 Op 3 34/0.000 + CDS 9491 - 10165 736 ## COG0765 ABC-type amino acid transport system, permease component 10 3 Op 4 4/1.000 + CDS 10165 - 10890 552 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 11 4 Op 1 3/1.000 + CDS 11008 - 11943 1091 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase + Term 11955 - 12005 8.0 + Prom 11948 - 12007 2.3 12 4 Op 2 . + CDS 12027 - 13697 1269 ## COG0443 Molecular chaperone + Term 13796 - 13851 0.2 - Term 13689 - 13738 9.0 13 5 Op 1 . - CDS 13757 - 15208 929 ## JW0644 Hsc56 co-chaperone of HscC 14 5 Op 2 . - CDS 15205 - 15912 474 ## JW0643 predicted tRNA ligase - Prom 15996 - 16055 10.2 + Prom 15921 - 15980 9.9 15 6 Tu 1 . + CDS 16029 - 16568 342 ## COG0790 FOG: TPR repeat, SEL1 subfamily 16 7 Op 1 . - CDS 16578 - 18005 792 ## JW0641 predicted chaperone 17 7 Op 2 . 
- CDS 18002 - 18709 291 ## JW0640 hypothetical protein - Prom 18782 - 18841 7.4 + Prom 18724 - 18783 4.9 18 8 Tu 1 . + CDS 18873 - 19850 785 ## COG0790 FOG: TPR repeat, SEL1 subfamily + Term 19863 - 19898 4.3 - Term 19735 - 19767 3.1 19 9 Tu 1 . - CDS 19920 - 20402 473 ## APECO1_1413 hypothetical protein - Prom 20484 - 20543 3.1 + Prom 20384 - 20443 3.9 20 10 Op 1 9/0.000 + CDS 20496 - 23219 3241 ## COG0495 Leucyl-tRNA synthetase 21 10 Op 2 . + CDS 23330 - 23815 518 ## COG2980 Rare lipoprotein B Predicted protein(s) >gi|296494701|gb|ADTN01000037.1| GENE 1 343 - 1518 1129 391 aa, chain - ## HITS:1 COG:ubiF KEGG:ns NR:ns ## COG: ubiF COG0654 # Protein_GI_number: 16128645 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases # Organism: Escherichia coli K12 # 1 391 1 391 391 780 100.0 0 MTNQPTEIAIVGGGMVGGALALGLAQHGFAVTVIEHAEPAPFVADSQPDVRISAISAASV SLLKGLGVWDAVQAMRCHPYRRLETWEWETAHVVFDAAELKLPLLGYMVENTVLQQALWQ ALEAHPKVTLRVPGSLIALHRHDDLQELELKGGEVIRAKLVIGADGANSQVRQMAGIGVH AWQYAQSCMLISVQCENDPGDSTWQQFTPDGPRAFLPLFDNWASLVWYDSPARIRQLQNM NMAQLQAEIAKHFPSRLGYVTPLAAGAFPLTRRHALQYVQPGLALVGDAAHTIHPLAGQG VNLGYRDVDALIDVLVNARSYGEAWASYPVLKRYQMRRMADNFIMQSGMDLFYAGFSNNL PPLRFMRNLGLMAAERAGVLKRQALKYALGL >gi|296494701|gb|ADTN01000037.1| GENE 2 1664 - 3088 438 474 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227372256|ref|ZP_03855738.1| SSU ribosomal protein S12P methylthiotransferase [Veillonella parvula DSM 2008] # 1 390 1 389 449 173 29 1e-42 MTKKLHIKTWGCQMNEYDSSKMADLLDATHGYQLTDVAEEADVLLLNTCSIREKAQEKVF HQLGRWKLLKEKNPDLIIGVGGCVASQEGEHIRQRAHYVDIIFGPQTLHRLPEMINSVRG DRSPVVDISFPEIEKFDRLPEPRAEGPTAFVSIMEGCNKYCTYCVVPYTRGEEVSRPSDD ILFEIAQLAAQGVREVNLLGQNVNAWRGENYDGTTGSFADLLRLVAAIDGIDRIRFTTSH PIEFTDDIIEVYRDTPELVSFLHLPVQSGSDRILNLMGRTHTALEYKAIIRKLRAARPDI QISSDFIVGFPGETTEDFEKTMKLIADVNFDMSYSFIFSARPGTPAADMVDDVPEEEKKQ RLYILQERINQQAMAWSRRMLGTTQRILVEGTSRKSIMELSGRTENNRVVNFEGTPDMIG 
KFVDVEITDVYPNSLRGKVVRTEDEMGLRVAETPESVIARTRKENDLGVGYYQP >gi|296494701|gb|ADTN01000037.1| GENE 3 3241 - 4281 1263 346 aa, chain + ## HITS:1 COG:ECs0698 KEGG:ns NR:ns ## COG: ECs0698 COG1702 # Protein_GI_number: 15829952 # Func_class: T Signal transduction mechanisms # Function: Phosphate starvation-inducible protein PhoH, predicted ATPase # Organism: Escherichia coli O157:H7 # 1 346 14 359 359 644 99.0 0 MNIDTREITLEPADNARLLSLCGPFDDNIKQLERRLGIEINRRDNHFKLTGRPICVTAAA DILRSLYVDTAPMRGQIQDIEPEQIHLAIKEARVLEQSAESVPEYGKAVNIKTKRGVIKP RTPNQAQYIANILDHDITFGVGPAGTGKTYLAVAAAVDALERQEIRRILLTRPAVEAGEK LGFLPGDLSQKVDPYLRPLYDALFEMLGFEKVEKLIERNVIEVAPLAYMRGRTLNDAFII LDESQNTTIEQMKMFLTRIGFNSKAVITGDVTQIDLPRNTKSGLRHAIEVLADVEEISFN FFHSEDVVRHPVVARIVNAYEAWEEAEQKRKAALAAERKREEQEQK >gi|296494701|gb|ADTN01000037.1| GENE 4 4278 - 4745 726 155 aa, chain + ## HITS:1 COG:ybeY KEGG:ns NR:ns ## COG: ybeY COG0319 # Protein_GI_number: 16128642 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Escherichia coli K12 # 1 155 1 155 155 271 100.0 2e-73 MSQVILDLQLACEDNSGLPEESQFQTWLNAVIPQFQEESEVTIRVVDTAESHSLNLTYRG KDKPTNVLSFPFEVPPGMEMSLLGDLVICRQVVEKEAQEQGKPLEAHWAHMVVHGSLHLL GYDHIEDDEAEEMEALETEIMLALGYEDPYIAEKE >gi|296494701|gb|ADTN01000037.1| GENE 5 4835 - 5713 1019 292 aa, chain + ## HITS:1 COG:ECs0696 KEGG:ns NR:ns ## COG: ECs0696 COG4535 # Protein_GI_number: 15829950 # Func_class: P Inorganic ion transport and metabolism # Function: Putative Mg2+ and Co2+ transporter CorC # Organism: Escherichia coli O157:H7 # 1 292 1 292 292 508 100.0 1e-144 MSDDNSHSSDTISNKKGFFSLLLSQLFHGEPKNRDELLALIRDSGQNDLIDEDTRDMLEG VMDIADQRVRDIMIPRSQMITLKRNQTLDECLDVIIESAHSRFPVISEDKDHIEGILMAK DLLPFMRSDAEAFSMDKVLRQAVVVPESKRVDRMLKEFRSQRYHMAIVIDEFGGVSGLVT IEDILELIVGEIEDEYDEEDDIDFRQLSRHTWTVRALASIEDFNEAFGTHFSDEEVDTIG GLVMQAFGHLPARGETIDIDGYQFKVAMADSRRIIQVHVKIPDDSPQPKLDE >gi|296494701|gb|ADTN01000037.1| GENE 6 5738 - 7276 1969 512 aa, chain + ## HITS:1 COG:lnt KEGG:ns NR:ns ## COG: lnt COG0815 
# Protein_GI_number: 16128640 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Apolipoprotein N-acyltransferase # Organism: Escherichia coli K12 # 1 512 1 512 512 996 100.0 0 MAFASLIERQRIRLLLALLFGACGTLAFSPYDVWPAAIISLMGLQALTFNRRPLQSAAIG FCWGFGLFGSGINWVYVSIATFGGMPGPVNIFLVVLLAAYLSLYTGLFAGVLSRLWPKTT WLRVAIAAPALWQVTEFLRGWVLTGFPWLQFGYSQIDGPLKGLAPIMGVEAINFLLMMVS GLLALALVKRNWRPLVVAVVLFALPFPLRYIQWFTPQPEKTIQVSMVQGDIPQSLKWDEG QLLNTLKIYYNATAPLMGKSSLIIWPESAITDLEINQQPFLKALDGELRDKGSSLVTGIV DARLNKQNRYDTYNTIITLGKGAPYSYESADRYNKNHLVPFGEFVPLESILRPLAPFFDL PMSSFSRGPYIQPPLSANGIELTAAICYEIILGEQVRDNFRPDTDYLLTISNDAWFGKSI GPWQHFQMARMRALELARPLLRSTNNGITAVIGPQGEIQAMIPQFTREVLTTNVTPTTGL TPYARTGNWPLWVLTALFGFAAVLMSLRQRRK >gi|296494701|gb|ADTN01000037.1| GENE 7 7673 - 8581 1187 302 aa, chain + ## HITS:1 COG:STM0665 KEGG:ns NR:ns ## COG: STM0665 COG0834 # Protein_GI_number: 16764042 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Salmonella typhimurium LT2 # 1 302 7 308 308 547 93.0 1e-156 MQLRKPATAILALALSAGLAQADDAAPAAGSTLDKIAKNGVIVVGHRESSVPFSYYDNQQ KVVGYSQDYSNAIVEAVKKKLNKPDLQVKLIPITSQNRIPLLQNGTFDFECGSTTNNVER QKQAAFSDTIFVVGTRLLTKKGGDIKDFANLKDKAVVVTSGTTSEVLLNKLNEEQKMNMR IISAKDHGDSFRTLESGRAVAFMMDDALLAGERAKAKKPDNWEIVGKPQSQEAYGCMLRK DDPQFKKLMDDTIAQVQTSGEAEKWFDKWFKNPIPPKNLNMNFELSDEMKALFKEPNDKA LN >gi|296494701|gb|ADTN01000037.1| GENE 8 8751 - 9491 825 246 aa, chain + ## HITS:1 COG:gltJ KEGG:ns NR:ns ## COG: gltJ COG0765 # Protein_GI_number: 16128637 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Escherichia coli K12 # 1 246 1 246 246 465 100.0 1e-131 MSIDWNWGIFLQQAPFGNTTYLGWIWSGFQVTIALSICAWIIAFLVGSFFGILRTVPNRF LSGLGTLYVELFRNVPLIVQFFTWYLVIPELLPEKIGMWFKAELDPNIQFFLSSMLCLGL FTAARVCEQVRAAIQSLPRGQKNAALAMGLTLPQAYRYVLLPNAYRVIVPPMTSEMMNLV 
KNSAIASTIGLVDMAAQAGKLLDYSAHAWESFTAITLAYVLINAFIMLVMTLVERKVRLP GNMGGK >gi|296494701|gb|ADTN01000037.1| GENE 9 9491 - 10165 736 224 aa, chain + ## HITS:1 COG:ECs0692 KEGG:ns NR:ns ## COG: ECs0692 COG0765 # Protein_GI_number: 15829946 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Escherichia coli O157:H7 # 1 224 1 224 224 370 100.0 1e-102 MYEFDWSSIVPSLPYLLDGLVITLKITVTAVVIGILWGTMLAVMRLSSFAPVAWFAKAYV NVFRSIPLVMVLLWFYLIVPGFLQNVLGLSPKNDIRLISAMVAFSMFEAAYYSEIIRAGI QSISRGQSSAALALGMTHWQSMKLIILPQAFRAMVPLLLTQGIVLFQDTSLVYVLSLADF FRTASTIGERDGTQVEMILFAGFVYFVISLSASLLVSYLKRRTA >gi|296494701|gb|ADTN01000037.1| GENE 10 10165 - 10890 552 241 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 239 1 242 245 217 45 7e-56 MITLKNVSKWYGHFQVLTDCSTEVKKGEVVVVCGPSGSGKSTLIKTVNGLEPVQQGEITV DGIVVNDKKTDLAKLRSRVGMVFQHFELFPHLSIIENLTLAQVKVLKRDKAPAREKALKL LERVGLSAHANKFPAQLSGGQQQRVAIARALCMDPIAMLFDEPTSALDPEMINEVLDVMV ELANEGMTMMVVTHEMGFARKVANRVIFMDEGKIVEDSPKDAFFDDPKSDRAKDFLAKIL H >gi|296494701|gb|ADTN01000037.1| GENE 11 11008 - 11943 1091 311 aa, chain + ## HITS:1 COG:ybeK KEGG:ns NR:ns ## COG: ybeK COG1957 # Protein_GI_number: 16128634 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Escherichia coli K12 # 1 311 1 311 311 625 99.0 1e-179 MALPILLDCDPGHDDAIAIVLALASPELDVKAITSSAGNQTPEKTLRNVLRMLTLLNRTD IPVAGGAVKPLMRELIIADNVHGESGLDGPALPEPTFAPQNCTAVELMAKTLRESAEPVT IVSTGPQTNVALLLNSHPELHSKIARIVIMGGAMGLGNWTPAAEFNIYVDPEAAEIVFQS GIPVVMAGLDVTHKAQIHVEDTERFRAIGNPVSTIVAELLDFFLEYHKDEKWGFVGAPLH DPCTIAWLLKPELFTSVERWIGVETQGKYTQGMTVVDYYYLTGNKPNATVMVDVDRQGFV DLLADRLKFYA >gi|296494701|gb|ADTN01000037.1| GENE 12 12027 - 13697 1269 556 aa, chain + ## HITS:1 COG:ybeW KEGG:ns NR:ns ## COG: ybeW COG0443 # Protein_GI_number: 16128633 # Func_class: O Posttranslational modification, protein turnover, 
chaperones # Function: Molecular chaperone # Organism: Escherichia coli K12 # 1 556 1 556 556 1102 100.0 0 MDNAELAIGIDLGTTNSLIAVWKDGAAQLIPNKFGEYLTPSIISMDENNHILVGKPAVSR RTSHPDKTAALFKRAMGSNTNWRLGSDTFNAPELSSLVLRSLKEDAEEFLQRPIKDVVIS VPAYFSDEQRKHTRLAAELAGLNAVRLINEPTAAAMAYGLHTQQNTRSLVFDLGGGTFDV TVLEYATPVIEVHASAGDNFLGGEDFTHMLVDEVLKRADVARTTLNESELAALYACVEAA KCSNQSPLHIRWQYQEETRECEFYENELEDLWLPLLNRLRVPIEQALRDARLKPSQIDSL VLVGGASQMPLVQRIAVRLFGKLPYQSYDPSTIVALGAAIQAACRLRSEDIEEVILTDIC PYSLGVEVNRQGVSGIFSPIIERNTTVPVSRVETYSTMHPEQDSITVNVYQGENHKVKNN ILVESFDVPLKKTGAYQSIDIRFSYDINGLLEVDVLLEDGSVKSRVINHSPVTLSAQQIE ESRTRLSALKIYPRDMLINRTFKAKLEELWARALGDEREEIGRVITDFDAALQSNDMARV DEVRRRASDYLAIEIP >gi|296494701|gb|ADTN01000037.1| GENE 13 13757 - 15208 929 483 aa, chain - ## HITS:1 COG:no KEGG:JW0644 NR:ns ## KEGG: JW0644 # Name: djlC # Def: Hsc56 co-chaperone of HscC # Organism: E.coli_J # Pathway: not_defined # 1 483 1 483 483 947 100.0 0 MKTCWQILEIESTTQIDIIRQAYLARLPLCHPETDPQGFKALRQAYEEALRLAVNPVEEA DDEEKDAAAEHEILRAFRTLLDSESDRFQPSAWQKFIQQLNTWNMEDVDQLRWPLCAIAI EARYLSLNCASLLAERLNWHSFNDSEGMDEEEREAFLEAIQAGDCFDFLSLLEYPIALQN QTVEYYFALERCCRYHPDYVTAFLAMEGPWLIPDDAKLHRKLLRWYSSVQTGMAELIPVA QQWQTEEPESEDARYYLCAQRLYCGEGESLLADLCAYWESYPSTQADNLLLQWSKRHCPD YFALLVMVIEARSMVDAQGQPLKYVPGESARTRLLWAEILHSGKLSPLGQSFIESLFFKR KAWAWWKSRVGSETEQDSPFLDLYRVAEQVVLEAFPKQEMLARLNTRLEGGDAHPLEAIV TRMLLTKVKLEPEDEDVDEPTPENHEEKNDEGEKPQSITSIIKISLTVLVIGYALGKIAM LFS >gi|296494701|gb|ADTN01000037.1| GENE 14 15205 - 15912 474 235 aa, chain - ## HITS:1 COG:no KEGG:JW0643 NR:ns ## KEGG: JW0643 # Name: ybeU # Def: predicted tRNA ligase # Organism: E.coli_J # Pathway: not_defined # 1 235 1 235 235 473 100.0 1e-132 MNKEEQYLLFALSAPMEILNQGCKPAHDSPKMYTGIKEFELSSSWGINNRDDLIQTIYQM TDDGHANDLAGLYLTWHRSSPEEWKALIAGGSERGLIYTQFVAQTAMCCGEGGIKAWDYV RMGFLSRVGVLNKWLTEEESLWLQSRVYVRAHHYYHSWMHYFSAYSLGRLYWQSSQCEDN TSLREALTLYKYDSAGSRMFEELAAGSDRFYATLPWQPLTVQSECPVTLKDVSDL >gi|296494701|gb|ADTN01000037.1| GENE 15 16029 - 16568 342 179 aa, chain + ## HITS:1 COG:ECs0685 
KEGG:ns NR:ns ## COG: ECs0685 COG0790 # Protein_GI_number: 15829939 # Func_class: R General function prediction only # Function: FOG: TPR repeat, SEL1 subfamily # Organism: Escherichia coli O157:H7 # 1 179 6 184 184 343 99.0 1e-94 MYIFAIFIVAAITCISQPKKTTLRDKAMVNYAFDYLSSPGSLPFTTAATELSAIHGHSTS QYRLGEFYLHGSDGKPLDYTQARYWYEQSAEQENPRAQSKLGWIYLKGLGVKPDTRKAIL WYKEAAEQGYAHAQYTLGLIYRNGSGINVNHYESQKWLKLAAKQHYKNAERLLAGLPAH >gi|296494701|gb|ADTN01000037.1| GENE 16 16578 - 18005 792 475 aa, chain - ## HITS:1 COG:no KEGG:JW0641 NR:ns ## KEGG: JW0641 # Name: djlB # Def: predicted chaperone # Organism: E.coli_J # Pathway: not_defined # 1 475 1 475 475 907 100.0 0 MKNCWKILDIEETTDVDIIRRAYLALLPSFHPETDPQGFKQLRQAYEEALRIAQSPAKSV WQPEEYEVAEHEILLAFRALLASDSERFLPSAWQRFIQQLNYCSMEEIDELRWSLCTIAM NTAHLSFECVVLLAERLRWLQEENTGEIDEEELESFLYAIAKGNVFNFQTILHLPVAVQN DTIDFYQMFARIWSSHPQWLTLYLAQHRAVIIPDDAKLHRNLLRWYSAGRLDIPELLDYA QSWRETEPDNEDAPYYEYAQRVYCGEGESLLAELCDYWREYPSTQADALMLQWCRQHRVD YYPLLVMMIEARDLVNDQGKPLLYVPGDSARTRFHLYEILSDEKLSALGRSLVEMVLHKG RKPRISLTRDTEHTLWPLYLVAKQLVQACQPTEESLMPIVSRLDAENRCPLEALIIRRLL IQAANFTEKQTVEPEPQPQPMPVDDGGPGCLGIIKIIFYIFIFAGLIGKILHLFG >gi|296494701|gb|ADTN01000037.1| GENE 17 18002 - 18709 291 235 aa, chain - ## HITS:1 COG:no KEGG:JW0640 NR:ns ## KEGG: JW0640 # Name: ybeR # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 235 1 235 235 471 100.0 1e-132 MDMESQKILFALSTPMEIRNECCLPSHSSPKMYLGTCFFDLSSSWGIDDRDDLLRTIHRM IDNGHAARLAGFYHRWFRYSPCEWRDYLAELNEQGQAYAQFVASTAECCGEGGIKAWDYV RMGFLSRMGVLNNWLSEEESLWIQSRIHLRALRYYSNWRQYFAGYTFGRQYWQSPEDDHL PLLREFLARKEYDDSGNDMFYQLFASDDAYYPTLSWQPLAYYSACPETLKDMSDL >gi|296494701|gb|ADTN01000037.1| GENE 18 18873 - 19850 785 325 aa, chain + ## HITS:1 COG:ybeQ KEGG:ns NR:ns ## COG: ybeQ COG0790 # Protein_GI_number: 16128627 # Func_class: R General function prediction only # Function: FOG: TPR repeat, SEL1 subfamily # Organism: Escherichia coli K12 # 1 325 3 327 327 597 100.0 1e-170 
MIFTSSCCDNLSIDEIIERAEKGDCEAQYIVGFYYNRDSAIDSPDDEKAFYWLKLAAEQG HCEAQYSLGQKYTEDKSRHKDNEQAIFWLKKAALQGHTFASNALGWTLDRGEAPNYKEAV VWYQIAAESGMSYAQNNLGWMYRNGNGVAKDYALAFFWYKQAALQGHSDAQNNLADLYED GKGVAQNKTLAAFWYLKSAQQGNRHAQFQIAWDYNAGEGVDQDYKQAMYWYLKAAAQGSV GAYVNIGYMYKHGQGVEKDYQAAFEWFTKAAECNDATAWYNLAIMYHYGEGRPVDLRQAL DLYRKVQSSGTRDVSQEIRETEDLL >gi|296494701|gb|ADTN01000037.1| GENE 19 19920 - 20402 473 160 aa, chain - ## HITS:1 COG:no KEGG:APECO1_1413 NR:ns ## KEGG: APECO1_1413 # Name: ybeL # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 160 1 160 160 305 100.0 2e-82 MNKVAQYYRELVASLSERLRNGERDIDALVEQARERVIKTGELTRTEVDELTRAVRRDLE EFAMSYEESLKEESDSVFMRVIKESLWQELADITDKTQLEWREVFQDLNHHGVYHSGEVV GLGNLVCEKCHFHLPIYTPEVLTLCPKCGHDQFQRRPFEP >gi|296494701|gb|ADTN01000037.1| GENE 20 20496 - 23219 3241 907 aa, chain + ## HITS:1 COG:leuS KEGG:ns NR:ns ## COG: leuS COG0495 # Protein_GI_number: 16128625 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Escherichia coli K12 # 48 907 1 860 860 1795 100.0 0 MNNPGIISTSSARKAVLTRTFGLCYADLKNHLNATFVAVLKTGPLAAMQEQYRPEEIESK VQLHWDEKRTFEVTEDESKEKYYCLSMLPYPSGRLHMGHVRNYTIGDVIARYQRMLGKNV LQPIGWDAFGLPAEGAAVKNNTAPAPWTYDNIAYMKNQLKMLGFGYDWSRELATCTPEYY RWEQKFFTELYKKGLVYKKTSAVNWCPNDQTVLANEQVIDGCCWRCDTKVERKEIPQWFI KITAYADELLNDLDKLDHWPDTVKTMQRNWIGRSEGVEITFNVNDYDNTLTVYTTRPDTF MGCTYLAVAAGHPLAQKAAENNPELAAFIDECRNTKVAEAEMATMEKKGVDTGFKAVHPL TGEEIPVWAANFVLMEYGTGAVMAVPGHDQRDYEFASKYGLNIKPVILAADGSEPDLSQQ ALTEKGVLFNSGEFNGLDHEAAFNAIADKLTAMGVGERKVNYRLRDWGVSRQRYWGAPIP MVTLEDGTVMPTPDDQLPVILPEDVVMDGITSPIKADPEWAKTTVNGMPALRETDTFDTF MESSWYYARYTCPQYKEGMLDSEAANYWLPVDIYIGGIEHAIMHLLYFRFFHKLMRDAGM VNSDEPAKQLLCQGMVLADAFYYVGENGERNWVSPVDAIVERDEKGRIVKAKDAAGHELV YTGMSKMSKSKNNGIDPQVMVERYGADTVRLFMMFASPADMTLEWQESGVEGANRFLKRV WKLVYEHTAKGDVAALNVDALTENQKALRRDVHKTIAKVTDDIGRRQTFNTAIAAIMELM NKLAKAPTDGEQDRALMQEALLAVVRMLNPFTPHICFTLWQELKGEGDIDNAPWPVADEK AMVEDSTLVVVQVNGKVRAKITVPVDATEEQVRERAGQEHLVAKYLDGVTVRKVIYVPGK LLNLVVG 
>gi|296494701|gb|ADTN01000037.1| GENE 21 23330 - 23815 518 161 aa, chain + ## HITS:1 COG:rlpB KEGG:ns NR:ns ## COG: rlpB COG2980 # Protein_GI_number: 16128624 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rare lipoprotein B # Organism: Escherichia coli K12 # 1 161 33 193 193 273 100.0 8e-74 MKVMILDSGDPNGPLSRAVRNQLRLNGVELLDKETTRKDVPSLRLGKVSIAKDTASVFRN GQTAEYQMIMTVNATVLIPGRDIYPISAKVFRSFFDNPQMALAKDNEQDMIVKEMYDRAA EQLIRKLPSIRAADIRSDEEQTSTTTDTPATPARVSTTLGN Prediction of potential genes in microbial genomes Time: Sun May 15 23:13:36 2011 Seq name: gi|296494700|gb|ADTN01000038.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont101.2, whole genome shotgun sequence Length of sequence - 8881 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 1, operones - 1 average op.length - 9.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 + CDS 2 - 997 793 ## COG1466 DNA polymerase III, delta subunit 2 1 Op 2 3/0.000 + CDS 999 - 1640 497 ## COG1057 Nicotinic acid mononucleotide adenylyltransferase 3 1 Op 3 6/0.000 + CDS 1664 - 2275 381 ## COG0406 Fructose-2,6-bisphosphatase + Prom 2440 - 2499 4.7 4 1 Op 4 14/0.000 + CDS 2535 - 2852 314 ## COG0799 Uncharacterized homolog of plant Iojap protein 5 1 Op 5 9/0.000 + CDS 2856 - 3323 513 ## COG1576 Uncharacterized conserved protein 6 1 Op 6 19/0.000 + CDS 3354 - 5255 2009 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 7 1 Op 7 8/0.000 + CDS 5258 - 6370 1106 ## COG0772 Bacterial cell division membrane protein 8 1 Op 8 12/0.000 + CDS 6381 - 7469 942 ## COG0797 Lipoproteins 9 1 Op 9 . 
+ CDS 7536 - 8819 1378 ## COG1686 D-alanyl-D-alanine carboxypeptidase Predicted protein(s) >gi|296494700|gb|ADTN01000038.1| GENE 1 2 - 997 793 331 aa, chain + ## HITS:1 COG:holA KEGG:ns NR:ns ## COG: holA COG1466 # Protein_GI_number: 16128623 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, delta subunit # Organism: Escherichia coli K12 # 1 331 13 343 343 585 100.0 1e-167 LNEGLRAAYLLLGNDPLLLQESQDAVRQVAAAQGFEEHHTFSIDPNTDWNAIFSLCQAMS LFASRQTLLLLLPENGPNAAINEQLLTLTGLLHDDLLLIVRGNKLSKAQENAAWFTALAN RSVQVTCQTPEQAQLPRWVAARAKQLNLELDDAANQVLCYCYEGNLLALAQALERLSLLW PDGKLTLPRVEQAVNDAAHFTPFHWVDALLMGKSKRALHILQQLRLEGSEPVILLRTLQR ELLLLVNLKRQSAHTPLRALFDKHRVWQNRRGMMGEALNRLSQTQLRQAVQLLTRTELTL KQDYGQSVWAELEGLSLLLCHKPLADVFIDG >gi|296494700|gb|ADTN01000038.1| GENE 2 999 - 1640 497 213 aa, chain + ## HITS:1 COG:nadD KEGG:ns NR:ns ## COG: nadD COG1057 # Protein_GI_number: 16128622 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid mononucleotide adenylyltransferase # Organism: Escherichia coli K12 # 1 213 1 213 213 415 100.0 1e-116 MKSLQALFGGTFDPVHYGHLKPVETLANLIGLTRVTIIPNNVPPHRPQPEANSVQRKHML ELAIADKPLFTLDERELKRNAPSYTAQTLKEWRQEQGPDVPLAFIIGQDSLLTFPTWYEY ETILDNAHLIVCRRPGYPLEMAQPQYQQWLEDHLTHNPEDLHLQPAGKIYLAETPWFNIS ATIIRERLQNGESCEDLLPEPVLTYINQQGLYR >gi|296494700|gb|ADTN01000038.1| GENE 3 1664 - 2275 381 203 aa, chain + ## HITS:1 COG:phpB KEGG:ns NR:ns ## COG: phpB COG0406 # Protein_GI_number: 16128621 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Escherichia coli K12 # 1 203 1 203 203 409 100.0 1e-114 MRLWLIRHGETQANIDGLYSGHAPTPLTARGIEQAQNLHTLLHGVSFDLVLCSELERAQH TARLVLSDRQLPVQIIPELNEMFFGDWEMRHHRDLMQEDAENYSAWCNDWQHAIPTNGEG FQAFSQRVERFIARLSEFQHYQNILVVSHQGVLSLLIARLIGMPAEAMWHFRVDQGCWSA IDINQKFATLRVLNSRAIGVENA >gi|296494700|gb|ADTN01000038.1| GENE 4 2535 - 2852 314 105 aa, chain + ## HITS:1 COG:STM0642 KEGG:ns NR:ns ## COG: STM0642 COG0799 # Protein_GI_number: 16764019 # 
Func_class: S Function unknown # Function: Uncharacterized homolog of plant Iojap protein # Organism: Salmonella typhimurium LT2 # 1 105 1 105 105 187 97.0 3e-48 MQGKALQDFVIDKIDDLKGQDIIALDVQGKSSITDCMIICTGTSSRHVMSIADHVVQESR AAGLLPLGVEGENSADWIVVDLGDVIVHVMQEESRRLYELEKLWS >gi|296494700|gb|ADTN01000038.1| GENE 5 2856 - 3323 513 155 aa, chain + ## HITS:1 COG:ECs0674 KEGG:ns NR:ns ## COG: ECs0674 COG1576 # Protein_GI_number: 15829928 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 155 1 155 155 304 100.0 4e-83 MKLQLVAVGTKMPDWVQTGFTEYLRRFPKDMPFELIEIPAGKRGKNADIKRILDKEGEQM LAAAGKNRIVTLDIPGKPWDTPQLAAELERWKLDGRDVSLLIGGPEGLSPACKAAAEQSW SLSALTLPHPLVRVLVAESLYRAWSITTNHPYHRE >gi|296494700|gb|ADTN01000038.1| GENE 6 3354 - 5255 2009 633 aa, chain + ## HITS:1 COG:ECs0673 KEGG:ns NR:ns ## COG: ECs0673 COG0768 # Protein_GI_number: 15829927 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Escherichia coli O157:H7 # 1 633 1 633 633 1295 100.0 0 MKLQNSFRDYTAESALFVRRALVAFLGILLLTGVLIANLYNLQIVRFTDYQTRSNENRIK LVPIAPSRGIIYDRNGIPLALNRTIYQIEMMPEKVDNVQQTLDALRSVVDLTDDDIAAFR KERARSHRFTSIPVKTNLTEVQVARFAVNQYRFPGVEVKGYKRRYYPYGSALTHVIGYVS KINDKDVERLNNDGKLANYAATHDIGKLGIERYYEDVLHGQTGYEEVEVNNRGRVIRQLK EVPPQAGHDIYLTLDLKLQQYIETLLAGSRAAVVVTDPRTGGVLALVSTPSYDPNLFVDG ISSKDYSALLNDPNTPLVNRATQGVYPPASTVKPYVAVSALSAGVITRNTTLFDPGWWQL PGSEKRYRDWKKWGHGRLNVTRSLEESADTFFYQVAYDMGIDRLSEWMGKFGYGHYTGID LAEERSGNMPTREWKQKRFKKPWYQGDTIPVGIGQGYWTATPIQMSKALMILINDGIVKV PHLLMSTAEDGKQVPWVQPHEPPVGDIHSGYWELAKDGMYGVANRPNGTAHKYFASAPYK IAAKSGTAQVFGLKANETYNAHKIAERLRDHKLMTAFAPYNNPQVAVAMILENGGAGPAV GTLMRQILDHIMLGDNNTDLPAENPAVAAAEDH >gi|296494700|gb|ADTN01000038.1| GENE 7 5258 - 6370 1106 370 aa, chain + ## HITS:1 COG:ECs0672 KEGG:ns NR:ns ## COG: ECs0672 COG0772 # Protein_GI_number: 15829926 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial 
cell division membrane protein # Organism: Escherichia coli O157:H7 # 1 370 1 370 370 618 100.0 1e-177 MTDNPNKKTFWDKVHLDPTMLLILLALLVYSALVIWSASGQDIGMMERKIGQIAMGLVIM VVMAQIPPRVYEGWAPYLYIICIILLVAVDAFGAISKGAQRWLDLGIVRFQPSEIAKIAV PLMVARFINRDVCPPSLKNTGIALVLIFMPTLLVAAQPDLGTSILVALSGLFVLFLSGLS WRLIGVAVVLVAAFIPILWFFLMHDYQRQRVMMLLDPESDPLGAGYHIIQSKIAIGSGGL RGKGWLHGTQSQLEFLPERHTDFIFAVLAEELGLVGILILLALYILLIMRGLWIAARAQT TFGRVMAGGLMLILFVYVFVNIGMVSGILPVVGVPLPLVSYGGSALIVLMAGFGIVMSIH THRKMLSKSV >gi|296494700|gb|ADTN01000038.1| GENE 8 6381 - 7469 942 362 aa, chain + ## HITS:1 COG:rlpA KEGG:ns NR:ns ## COG: rlpA COG0797 # Protein_GI_number: 16128616 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipoproteins # Organism: Escherichia coli K12 # 1 362 1 362 362 573 99.0 1e-163 MRKQWLGICIAAGMLAACTSDDGQQQTVSVPQPAVCNGPIVEISGADPRFEPLNATANQD YQRDGKSYKIVQDPSRFSQAGLAAIYDAEPGSNLTASGEAFDPTQLTAAHPTLPIPSYAR ITNLANGRMIVVRINDRGPYGNDRVISLSRAAADRLNTSNNTKVRIDPIIVAQDGSLSGP GMACTTVAKQTYALPAPPDLSGGAGTSSVSGPQGDILPVSNSTLKSEDPTGAPVTSSGFL GAPTTLAPGVLEGSEPTPAPQPVVTAPSTTPATSPAMVTPQAVSQSASGNFMVQVGAVSD QARAQQYQQQLGQKFGVPGRVTQNGAVWRIQLGPFVSKAEASTLQQRLQTEAQLQSFITT AQ >gi|296494700|gb|ADTN01000038.1| GENE 9 7536 - 8819 1378 427 aa, chain + ## HITS:1 COG:ZdacA KEGG:ns NR:ns ## COG: ZdacA COG1686 # Protein_GI_number: 15800346 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Escherichia coli O157:H7 EDL933 # 25 427 1 403 403 825 99.0 0 MPAGSAFAIVGHFFNSITDVVVQTMNTIFSARIMKRLALTTALCTAFISAAHADDLNIKT MIPGVPQIDAESYILIDYNSGKVLAEQNADVRRDPASLTKMMTSYVIGQAMKAGKFKETD LVTIGNDAWATGNPVFKGSSLMFLKPGMQVPVSQLIRGINLQSGNDACVAMADFAAGSQD AFVGLMNSYVNALGLKNTHFQTVHGLDADGQYSSARDMALIGQALIRDVPNEYSIYKEKE FTFNGIRQLNRNGLLWDNSLNVDGIKTGHTDKAGYNLVASATEGQMRLISAVMGGRTFKG REAESKKLLTWGFRFFETVNPLKVGKEFASEPVWFGDSDRASLGVDKDVYLTIPRGRMKD LKASYVLNSSELHAPLQKNQIVGTINFQLDGKTIEQRPLVVLQEIPEGNFFGKIIDYIKL MFHHWFG Prediction of potential genes in microbial genomes Time: Sun May 15 23:13:38 
2011 Seq name: gi|296494699|gb|ADTN01000039.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont101.3, whole genome shotgun sequence Length of sequence - 3475 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 + CDS 38 - 301 343 ## COG2921 Uncharacterized conserved protein + Prom 310 - 369 3.9 2 1 Op 2 . + CDS 402 - 1043 534 ## COG0321 Lipoate-protein ligase B + Term 1057 - 1090 -0.3 + Prom 1076 - 1135 5.5 3 2 Op 1 2/0.000 + CDS 1302 - 2255 577 ## COG0583 Transcriptional regulator + Prom 2319 - 2378 4.7 4 2 Op 2 . + CDS 2464 - 3429 970 ## COG0320 Lipoate synthase Predicted protein(s) >gi|296494699|gb|ADTN01000039.1| GENE 1 38 - 301 343 87 aa, chain + ## HITS:1 COG:ECs0669 KEGG:ns NR:ns ## COG: ECs0669 COG2921 # Protein_GI_number: 15829923 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 87 1 87 87 157 100.0 5e-39 MKTKLNELLEFPTPFTYKVMGQALPELVDQVVEVVQRHAPGDYTPTVKPSSKGNYHSVSI TINATHIEQVETLYEELGKIDIVRMVL >gi|296494699|gb|ADTN01000039.1| GENE 2 402 - 1043 534 213 aa, chain + ## HITS:1 COG:ECs0668 KEGG:ns NR:ns ## COG: ECs0668 COG0321 # Protein_GI_number: 15829922 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate-protein ligase B # Organism: Escherichia coli O157:H7 # 23 213 1 191 191 401 100.0 1e-112 MYQDKILVRQLGLQPYEPISQAMHEFTDTRDDSTLDEIWLVEHYPVFTQGQAGKAEHILM PGDIPVIQSDRGGQVTYHGPGQQVMYVLLNLKRRKLGVRELVTLLEQTVVNTLAELGIEA HPRADAPGVYVGEKKICSLGLRIRRGCSFHGLALNVNMDLSPFLRINPCGYAGMEMAKIS QWKPEATTNNIAPRLLENILALLNNPDFEYITA >gi|296494699|gb|ADTN01000039.1| GENE 3 1302 - 2255 577 317 aa, chain + ## HITS:1 COG:ybeF KEGG:ns NR:ns ## COG: ybeF COG0583 # Protein_GI_number: 16128612 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 52 317 1 266 266 525 99.0 1e-149 
MDSNNQIEPCLSRKSSEGKPQIFTTLRNIDLNLLTIFEAVYVHKGIVNAAKVLNLTPSAI SQSIQKLRVIFPDPLFIRKGQGVTPTAFAMHLHEYISQGLESILGALDIEGSYDKQRTIT IATTPSVGALVLPVIYRAIKTHYPQLLLRNPPISDAENQLSQFQTDLIIDNMFCTNRTVQ HHVLFTDNMVLICREGNPLLSLEDDRETIDNAAHVLLLPEEQNFSGLRQRVQEMFPDRQI NFTSYNILTIAALVANSDMLAIIPSRFYNLFSRCWPLEKLPFPSLNEEQIDFSIHYNKFS LRDPILHGVIDVIRNAF >gi|296494699|gb|ADTN01000039.1| GENE 4 2464 - 3429 970 321 aa, chain + ## HITS:1 COG:ECs0666 KEGG:ns NR:ns ## COG: ECs0666 COG0320 # Protein_GI_number: 15829920 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate synthase # Organism: Escherichia coli O157:H7 # 1 321 1 321 321 657 100.0 0 MSKPIVMERGVKYRDADKMALIPVKNVATEREALLRKPEWMKIKLPADSTRIQGIKAAMR KNGLHSVCEEASCPNLAECFNHGTATFMILGAICTRRCPFCDVAHGRPVAPDANEPVKLA QTIADMALRYVVITSVDRDDLRDGGAQHFADCITAIREKSPQIKIETLVPDFRGRMDRAL DILTATPPDVFNHNLENVPRIYRQVRPGADYNWSLKLLERFKEAHPEIPTKSGLMVGLGE TNEEIIEVMRDLRRHGVTMLTLGQYLQPSRHHLPVQRYVSPDEFDEMKAEALAMGFTHAA CGPFVRSSYHADLQAKGMEVK Prediction of potential genes in microbial genomes Time: Sun May 15 23:13:43 2011 Seq name: gi|296494698|gb|ADTN01000040.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont101.4, whole genome shotgun sequence Length of sequence - 14170 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 7, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 63 - 272 178 ## PROTEIN SUPPORTED gi|90022866|ref|YP_528693.1| ribosomal protein L25 - Prom 313 - 372 3.7 2 2 Op 1 . - CDS 395 - 958 571 ## COG0388 Predicted amidohydrolase 3 2 Op 2 . - CDS 955 - 1182 283 ## COG0388 Predicted amidohydrolase - Prom 1260 - 1319 2.7 + Prom 1157 - 1216 5.2 4 3 Tu 1 . + CDS 1275 - 1658 200 ## COG0239 Integral membrane protein possibly involved in chromosome condensation + Term 1672 - 1700 0.6 - Term 1658 - 1688 1.0 5 4 Op 1 . - CDS 1712 - 1921 328 ## COG1278 Cold shock proteins - Prom 2035 - 2094 8.3 - Term 1935 - 1963 -0.0 6 4 Op 2 . 
- CDS 2096 - 2656 432 ## B21_00581 hypothetical protein
- Prom 2889 - 2948 8.9
+ Prom 3161 - 3220 6.5
7 5 Tu 1 . + CDS 3244 - 4629 1440 ## COG3069 C4-dicarboxylate transporter
+ Term 4638 - 4672 5.2
- Term 4620 - 4664 5.1
8 6 Op 1 9/0.000 - CDS 4670 - 5350 263 ## PROTEIN SUPPORTED gi|149011191|ref|ZP_01832496.1| 30S ribosomal protein S9
9 6 Op 2 . - CDS 5319 - 6977 1487 ## COG3290 Signal transduction histidine kinase regulating citrate/malate metabolism
- Prom 7029 - 7088 5.3
+ Prom 7093 - 7152 9.5
10 7 Op 1 5/0.250 + CDS 7356 - 8414 834 ## COG3053 Citrate lyase synthetase
11 7 Op 2 6/0.000 + CDS 8480 - 8725 302 ## COG3052 Citrate lyase, gamma subunit
12 7 Op 3 6/0.000 + CDS 8722 - 9630 1142 ## COG2301 Citrate lyase beta subunit
13 7 Op 4 7/0.000 + CDS 9641 - 11173 1700 ## COG3051 Citrate lyase, alpha subunit
14 7 Op 5 5/0.250 + CDS 11177 - 11728 303 ## COG3697 Phosphoribosyl-dephospho-CoA transferase (holo-ACP synthetase)
15 7 Op 6 5/0.250 + CDS 11709 - 12581 580 ## COG1767 Triphosphoribosyl-dephospho-CoA synthetase
16 7 Op 7 .
+ CDS 12632 - 14095 1672 ## COG0471 Di- and tricarboxylate transporters Predicted protein(s) >gi|296494698|gb|ADTN01000040.1| GENE 1 63 - 272 178 69 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90022866|ref|YP_528693.1| ribosomal protein L25 [Saccharophagus degradans 2-40] # 1 59 1 59 83 73 57 1e-12 VSMGEISITKLLVVAALVVLLFGTKKLRTLGGDLGAAIKGFKKAMNDDDAAAKKGADVDL QAEKLSHKE >gi|296494698|gb|ADTN01000040.1| GENE 2 395 - 958 571 187 aa, chain - ## HITS:1 COG:ECs0664 KEGG:ns NR:ns ## COG: ECs0664 COG0388 # Protein_GI_number: 15829918 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Escherichia coli O157:H7 # 1 187 76 262 262 363 100.0 1e-101 MMTTILTIHVPSTPGRAWNMLVALQAGNIVARYAKLHLYDAFAIQESRRVDAGNEIAPLL EVEGMKVGLMTCYDLRFPELALAQALQGAEILVLPAAWVRGPLKEHHWSTLLAARALDTT CYMVAAGECGNKNIGQSRIIDPFGVTIAAASEMPALIMAEVTPERVRQVRAQLPVLNNRR FAPPQLL >gi|296494698|gb|ADTN01000040.1| GENE 3 955 - 1182 283 75 aa, chain - ## HITS:1 COG:ECs0664 KEGG:ns NR:ns ## COG: ECs0664 COG0388 # Protein_GI_number: 15829918 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Escherichia coli O157:H7 # 1 66 1 66 262 126 100.0 8e-30 MLVAAGQFAVTSVWEKNAEICASLMAQAAENDVSLFVLPEALLARDDHDADLSVKSAQLL EGEFLGLYGEKVNVT >gi|296494698|gb|ADTN01000040.1| GENE 4 1275 - 1658 200 127 aa, chain + ## HITS:1 COG:crcB KEGG:ns NR:ns ## COG: crcB COG0239 # Protein_GI_number: 16128607 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Integral membrane protein possibly involved in chromosome condensation # Organism: Escherichia coli K12 # 1 127 1 127 127 202 100.0 8e-53 MLQLLLAVFIGGGTGSVARWLLSMRFNPLHQAIPLGTLTANLIGAFIIGIGFAWFSRMTN IDPVWKVLITTGFCGGLTTFSTFSAEVVFLLQEGRFGWALLNVFVNLLGSFAMTALAFWL FSASTAH >gi|296494698|gb|ADTN01000040.1| GENE 5 1712 - 1921 328 69 aa, chain - ## HITS:1 COG:ECs0662 KEGG:ns NR:ns ## COG: ECs0662 COG1278 # Protein_GI_number: 15829916 # Func_class: K Transcription # Function: Cold shock 
proteins # Organism: Escherichia coli O157:H7 # 1 69 1 69 69 124 100.0 4e-29 MSKIKGNVKWFNESKGFGFITPEDGSKDVFVHFSAIQTNGFKTLAEGQRVEFEITNGAKG PSAANVIAL >gi|296494698|gb|ADTN01000040.1| GENE 6 2096 - 2656 432 186 aa, chain - ## HITS:1 COG:no KEGG:B21_00581 NR:ns ## KEGG: B21_00581 # Name: crcA # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 186 1 186 186 351 100.0 6e-96 MNVSKYVAIFSFVFIQLISVGKVFANADEWMTTFRENIAQTWQQPEHYDLYIPAITWHAR FAYDKEKTDRYNERPWGGGFGLSRWDEKGNWHGLYAMAFKDSWNKWEPIAGYGWESTWRP LADENFHLGLGFTAGVTARDNWNYIPLPVLLPLASVGYGPVTFQMTYIPGTYNNGNVYFA WMRFQF >gi|296494698|gb|ADTN01000040.1| GENE 7 3244 - 4629 1440 461 aa, chain + ## HITS:1 COG:ECs0660 KEGG:ns NR:ns ## COG: ECs0660 COG3069 # Protein_GI_number: 15829914 # Func_class: C Energy production and conversion # Function: C4-dicarboxylate transporter # Organism: Escherichia coli O157:H7 # 1 461 1 461 461 633 100.0 0 MLTFIELLIGVVVIVGVARYIIKGYSATGVLFVGGLLLLIISAIMGHKVLPSSQASTGYS ATDIVEYVKILLMSRGGDLGMMIMMLCGFAAYMTHIGANDMVVKLASKPLQYINSPYLLM IAAYFVACLMSLAVSSATGLGVLLMATLFPVMVNVGISRGAAAAICASPAAIILAPTSGD VVLAAQASEMSLIDFAFKTTLPISIAAIIGMAIAHFFWQRYLDKKEHISHEMLDVSEITT TAPAFYAILPFTPIIGVLIFDGKWGPQLHIITILVICMLIASILEFLRSFNTQKVFSGLE VAYRGMADAFANVVMLLVAAGVFAQGLSTIGFIQSLISIATSFGSASIILMLVLVILTML AAVTTGSGNAPFYAFVEMIPKLAHSSGINPAYLTIPMLQASNLGRTLSPVSGVVVAVAGM AKISPFEVVKRTSVPVLVGLVIVIVATELMVPGTAAAVTGK >gi|296494698|gb|ADTN01000040.1| GENE 8 4670 - 5350 263 226 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149011191|ref|ZP_01832496.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP19-BS75] # 5 223 1 221 226 105 28 1e-22 MTAPLTLLIVEDETPLAEMHAEYIRHIPGFSQILLAGNLAQARMMIERFKPGLILLDNYL PDGRGINLLHELVQAHYPGDVVFTTAASDMETVSEAVRCGVFDYLIKPIAYERLGQTLTR FRQRKHMLESIDSASQKQIDEMFNAYARGEPKDELPTGIDPLTLNAVRKLFKEPGVQHTA ETVAQALTISRTTARRYLEYCASRHLIIAEIVHGKVGRPQRIYHSG >gi|296494698|gb|ADTN01000040.1| GENE 9 5319 - 6977 1487 552 aa, chain - ## HITS:1 COG:citA KEGG:ns NR:ns ## COG: citA COG3290 # Protein_GI_number: 16128602 # 
Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase regulating citrate/malate metabolism # Organism: Escherichia coli K12 # 1 552 1 552 552 1111 100.0 0 MLQLNENKQFAFFQRLAFPLRIFLLILVFSIFVIAALAQYFTASFEDYLTLHVRDMAMNQ AKIIASNDSVISAVKTRDYKRLATIANKLQRDTDFDYVVIGDRHSIRLYHPNPEKIGYPM QFTKQGALEKGESYFITGKGSMGMAMRAKTPIFDDDGKVIGVVSIGYLVSKIDSWRAEFL LPMAGVFVVLLGILMLLSWFLAAHIRRQMMGMEPKQIARVVRQQEALFSSVYEGLIAVDP HGYITAINRNARKMLGLSSPGRQWLGKPIVEVVRPADFFTEQIDEKRQDVVANFNGLSVI ANREAIRSGDDLLGAIISFRSKDEISTLNAQLTQIKQYVESLRTLRHEHLNWMSTLNGLL QMKEYDRVLAMVQGESQAQQQLIDSLREAFADRQVAGLLFGKVQRARELGLKMIIVPGSQ LSQLPPGLDSTEFAAIVGNLLDNAFEASLRSDEGNKIVELFLSDEGDDVVIEVADQGCGV PESLRDKIFEQGVSTRADEPGEHGIGLYLIASYVTRCGGVITLEDNDPCGTLFSIYIPKV KPNDSSINPIDR >gi|296494698|gb|ADTN01000040.1| GENE 10 7356 - 8414 834 352 aa, chain + ## HITS:1 COG:citC KEGG:ns NR:ns ## COG: citC COG3053 # Protein_GI_number: 16128601 # Func_class: C Energy production and conversion # Function: Citrate lyase synthetase # Organism: Escherichia coli K12 # 1 352 30 381 381 724 100.0 0 MFGNDIFTRVKRSENKKMAEIAQFLHENDLSVDTTVEVFITVTRDEKLIACGGIAGNIIK CVAISESVRGEGLALTLATELINLAYERHSTHLFIYTKTEYEALFRQCGFSTLTSVPGVM VLMENSATRLKRYAESLKKFRHPGNKIGCIVMNANPFTNGHRYLIQQAAAQCDWLHLFLV KEDSSRFPYEDRLDLVLKGTADIPRLTVHRGSEYIISRATFPCYFIKEQSVINHCYTEID LKIFRQYLAPALGVTHRFVGTEPFCRVTAQYNQDMRYWLETPTISAPPIELVEIERLRYQ EMPISASRVRQLLAKNDLTAIAPLVPAVTLHYLQNLLEHSRQDAAARQKTPA >gi|296494698|gb|ADTN01000040.1| GENE 11 8480 - 8725 302 81 aa, chain + ## HITS:1 COG:ECs0656 KEGG:ns NR:ns ## COG: ECs0656 COG3052 # Protein_GI_number: 15829910 # Func_class: C Energy production and conversion # Function: Citrate lyase, gamma subunit # Organism: Escherichia coli O157:H7 # 1 81 18 98 98 151 100.0 3e-37 MIRIAPLDTQDIDLQINSSVEKQFGDAIRTTILDVLARYNVRGVQLNVDDKGALDCILRA RLEALLARASGIPALPWEDCQ >gi|296494698|gb|ADTN01000040.1| GENE 12 8722 - 9630 1142 302 aa, chain + ## HITS:1 COG:citE KEGG:ns NR:ns ## COG: citE COG2301 # Protein_GI_number: 16128599 # Func_class: G 
Carbohydrate transport and metabolism # Function: Citrate lyase beta subunit # Organism: Escherichia coli K12 # 1 302 6 307 307 542 100.0 1e-154 MISASLQQRKTRTRRSMLFVPGANAAMVSNSFIYPADALMFDLEDSVALREKDTARRMVY HALQHPLYRDIETIVRVNALDSEWGVNDLEAVVRGGADVVRLPKTDTAQDVLDIEKEILR IEKACGREPGSTGLLAAIESPLGITRAVEIAHASERLIGIALGAEDYVRNLRTERSPEGT ELLFARCSILQAARSAGIQAFDTVYSDANNEAGFLQEAAHIKQLGFDGKSLINPRQIDLL HNLYAPTQKEVDHARRVVEAAEAAAREGLGVVSLNGKMVDGPVIDRARLVLSRAELSGIR EE >gi|296494698|gb|ADTN01000040.1| GENE 13 9641 - 11173 1700 510 aa, chain + ## HITS:1 COG:citF KEGG:ns NR:ns ## COG: citF COG3051 # Protein_GI_number: 16128598 # Func_class: C Energy production and conversion # Function: Citrate lyase, alpha subunit # Organism: Escherichia coli K12 # 1 510 1 510 510 1008 99.0 0 MTQKIEQSQRQERVAAWNRRAECDLAAFQNSPKQTYQAEKARDRKLCANLEEAIRRSGLQ DGMTVSFHHAFRGGDLTVNMVMDVIAKMGFKNLTLASSSLSDCHAPLVEHIRQGVVTRIY TSGLRGPLAEEISRGLLAEPVQIHSHGGRVHLVQSGELNIDVAFLGVPSCDEFGNANGYT GKACCGSLGYAMVDADNAKQVVMLTEELLPYPHNPASIEQDQVDLIVKVDRVGDAAKIGA GATRMTTNPRELLIARSAADVIVNSGYFKEGFSMQTGTGGASLAVTRFLEDKMRSRDIRA DFALGGITATMVDLHEKGLIRKLLDVQSFDSHAAQSLARNPNHIEISANQYANWGSKGAS VDRLDVVVLSALEIDTQFNVNVLTGSDGVLRGASGGHCDTAIASALSIIVAPLVRGRIPT LVDNVLTCITPGSSVDILVTDHGIAVNPARPELAERLQEAGIKVVSIEWLRERARLLTGE PQPIEFTDRVVAVVRYRDGSVIDVVHQVKE >gi|296494698|gb|ADTN01000040.1| GENE 14 11177 - 11728 303 183 aa, chain + ## HITS:1 COG:ECs0653 KEGG:ns NR:ns ## COG: ECs0653 COG3697 # Protein_GI_number: 15829907 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Phosphoribosyl-dephospho-CoA transferase (holo-ACP synthetase) # Organism: Escherichia coli O157:H7 # 1 183 1 183 183 350 100.0 1e-96 MHLLPELASHHAVSIPELLVSRDERQARQHVWLKRHPVPLVSFTVVAPGPIKDSEVTRRI FNHGVTALRALAAKQGWQIQEQAALVSASGPEGMLSIAAPARDLKLATIELEHSHPLGRL WDIDVLTPEGEILSRRDYSLPPRRCLLCEQSAAVCARGKTHQLTDLLNRMEALLNDVDAC NVN >gi|296494698|gb|ADTN01000040.1| GENE 15 11709 - 12581 580 290 aa, chain + ## HITS:1 COG:citG KEGG:ns NR:ns ## COG: citG COG1767 # 
Protein_GI_number: 16128596 # Func_class: H Coenzyme transport and metabolism # Function: Triphosphoribosyl-dephospho-CoA synthetase # Organism: Escherichia coli K12 # 1 290 3 292 292 557 100.0 1e-158 MPATSTKTTKLATSLIDEYALLGWRAMLTEVNLSPKPGLVDRINCGAHKDMALEDFHRSA LAIQGWLPRFIEFGACSAEMAPEAVLHGLRPIGMACEGDMFRATAGVNTHKGSIFSLGLL CAAIGRLLQLNQPVTPTTVCSTAASFCRGLTDRELRTNNSQLTAGQRLYQQLGLTGARGE AEAGYPLVINHALPHYLTLLDQGLDPELALLDTLLLLMAINGDTNVASRGGEGGLRWLQR EAQTLLQKGGIRTPADLDYLRQFDRECIERNLSPGGSADLLILTWFLAQI >gi|296494698|gb|ADTN01000040.1| GENE 16 12632 - 14095 1672 487 aa, chain + ## HITS:1 COG:ECs0651 KEGG:ns NR:ns ## COG: ECs0651 COG0471 # Protein_GI_number: 15829905 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Escherichia coli O157:H7 # 1 487 1 487 487 867 100.0 0 MSLAKDNIWKLLAPLVVMGVMFLIPVPDGMPPQAWHYFAVFVAMIVGMILEPIPATAISF IAVTICVIGSNYLLFDAKELADPAFNAQKQALKWGLAGFSSTTVWLVFGAFIFALGYEVS GLGRRIALFLVKFMGKRTLTLGYAIVIIDILLAPFTPSNTARTGGTVFPVIKNLPPLFKS FPNDPSARRIGGYLMWMMVISTSLSSSMFVTGAAPNVLGLEFVSKIAGIQISWLQWFLCF LPVGVILLIIAPWLSYVLYKPEITHSEEVATWAGDELKTMGALTRREWTLIGLVLLSLGL WVFGSEVINATAVGLLAVSLMLALHVVPWKDITRYNSAWNTLVNLATLVVMANGLTRSGF IDWFAGTMSTHLEGFSPNATVIVLVLVFYFAHYLFASLSAHTATMLPVILAVGKGIPGVP MEQLCILLVLSIGIMGCLTPYATGPGVIIYGCGYVKSKDYWRLGAIFGVIYISMLLLVGW PILAMWN Prediction of potential genes in microbial genomes Time: Sun May 15 23:13:56 2011 Seq name: gi|296494697|gb|ADTN01000041.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont101.5, whole genome shotgun sequence Length of sequence - 26015 bp Number of predicted genes - 28, with homology - 27 Number of transcription units - 20, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 51 - 875 454 ## COG3719 Ribonuclease I + Term 1025 - 1059 5.0 + Prom 959 - 1018 3.7 2 2 Tu 1 . + CDS 1105 - 1515 389 ## COG0782 Transcription elongation factor + Term 1552 - 1579 0.1 + Prom 1573 - 1632 3.2 3 3 Tu 1 . 
+ CDS 1653 - 1742 71 ##
4 4 Tu 1 . - CDS 1746 - 2984 910 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases
- Prom 3009 - 3068 2.0
5 5 Tu 1 . - CDS 3103 - 3309 109 ## EcE24377A_0629 hypothetical protein
+ Prom 3110 - 3169 3.4
6 6 Tu 1 . + CDS 3205 - 3633 537 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins
+ Term 3653 - 3694 6.2
- Term 3688 - 3723 5.5
7 7 Tu 1 . - CDS 3754 - 5319 411 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9
+ Prom 5107 - 5166 2.2
8 8 Tu 1 . + CDS 5318 - 5434 58 ## gi|188495373|ref|ZP_03002643.1| hypothetical protein Ec53638_0709
- Term 5395 - 5426 4.1
9 9 Tu 1 . - CDS 5564 - 6127 626 ## COG0450 Peroxiredoxin
- Prom 6191 - 6250 3.9
10 10 Tu 1 . + CDS 6499 - 7245 692 ## COG1651 Protein-disulfide isomerase
+ Prom 7374 - 7433 6.2
11 11 Tu 1 . + CDS 7454 - 8356 416 ## COG0583 Transcriptional regulator
+ Prom 8399 - 8458 6.4
12 12 Op 1 8/0.000 + CDS 8503 - 9723 677 ## COG3969 Predicted phosphoadenosine phosphosulfate sulfotransferase
13 12 Op 2 . + CDS 9708 - 10325 535 ## COG1475 Predicted transcriptional regulators
+ Term 10340 - 10379 0.6
- Term 10251 - 10298 5.0
14 13 Tu 1 . - CDS 10326 - 11486 989 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase
- Prom 11516 - 11575 5.2
+ Prom 11490 - 11549 3.6
15 14 Tu 1 .
+ CDS 11595 - 12683 1158 ## COG0371 Glycerol dehydrogenase and related enzymes
+ Term 12768 - 12808 -0.4
16 15 Op 1 9/0.000 - CDS 12693 - 12890 259 ## COG2879 Uncharacterized small protein
- Prom 12953 - 13012 3.9
- Term 12911 - 12953 3.5
17 15 Op 2 4/0.667 - CDS 13073 - 15178 2812 ## COG1966 Carbon starvation protein, predicted membrane protein
- Prom 15234 - 15293 7.0
- Term 15257 - 15317 0.3
18 16 Op 1 5/0.111 - CDS 15359 - 15772 336 ## COG2050 Uncharacterized protein, possibly involved in aromatic compounds catabolism
19 16 Op 2 5/0.111 - CDS 15775 - 16521 185 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17
20 16 Op 3 5/0.111 - CDS 16521 - 17378 1253 ## COG1535 Isochorismate hydrolase
21 16 Op 4 6/0.000 - CDS 17392 - 19002 1645 ## COG1021 Peptide arylation enzymes
22 16 Op 5 . - CDS 19012 - 20187 1238 ## COG1169 Isochorismate synthase
- Prom 20242 - 20301 6.3
+ Prom 20483 - 20542 2.2
23 17 Tu 1 . + CDS 20562 - 21518 1072 ## COG4592 ABC-type Fe2+-enterobactin transport system, periplasmic component
24 18 Tu 1 . - CDS 21522 - 22772 1195 ## COG0477 Permeases of the major facilitator superfamily
- Prom 22815 - 22874 3.9
+ Prom 22759 - 22818 4.5
25 19 Op 1 8/0.000 + CDS 22871 - 23887 1290 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component
26 19 Op 2 7/0.000 + CDS 23884 - 24876 1033 ## COG4779 ABC-type enterobactin transport system, permease component
27 19 Op 3 . + CDS 24873 - 25688 197 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16
28 20 Tu 1 .
- CDS 25685 - 26014 283 ## COG3765 Chain length determinant protein Predicted protein(s) >gi|296494697|gb|ADTN01000041.1| GENE 1 51 - 875 454 274 aa, chain + ## HITS:1 COG:rna KEGG:ns NR:ns ## COG: rna COG3719 # Protein_GI_number: 16128594 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonuclease I # Organism: Escherichia coli K12 # 7 274 1 268 268 551 100.0 1e-157 MSSTPIMKAFWRNAALLAVSLLPFSSANALALQAKQYGDFDRYVLALSWQTGFCQSQHDR NRNERDECRLQTETTNKADFLTVHGLWPGLPKSVAARGVDERRWMRFGCATRPIPNLPEA RASRMCSSPETGLSLETAAKLSEVMPGAGGRSCLERYEYAKHGACFGFDPDAYFGTMVRL NQEIKESEAGKFLADNYGKTVSRRDFDAAFAKSWGKENVKAVKLTCQGNPAYLTEIQISI KADAINAPLSANSFLPQPHPGNCGKTFVIDKAGY >gi|296494697|gb|ADTN01000041.1| GENE 2 1105 - 1515 389 136 aa, chain + ## HITS:1 COG:ECs0649 KEGG:ns NR:ns ## COG: ECs0649 COG0782 # Protein_GI_number: 15829903 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Escherichia coli O157:H7 # 1 136 1 136 136 259 100.0 7e-70 MSRPTIIINDLDAERIDILLEQPAYAGLPIADALNAELDRAQMCSPEEMPHDVVTMNSRV KFRNLSDGEVRVRTLVYPAKMTDSNTQLSVMAPVGAALLGLRVGDSIHWELPGGVATHLE VLELEYQPEAAGDYLL >gi|296494697|gb|ADTN01000041.1| GENE 3 1653 - 1742 71 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAEAFYILIGFLIMAAIIVMAVLYLENHS >gi|296494697|gb|ADTN01000041.1| GENE 4 1746 - 2984 910 412 aa, chain - ## HITS:1 COG:ybdR KEGG:ns NR:ns ## COG: ybdR COG1063 # Protein_GI_number: 16128591 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 412 1 412 412 832 100.0 0 MKALTYHGPHHVQVENVPDPGVEQADDIILRITATAICGSDLHLYRGKIPQVKHGDIFGH EFMGEVVETGKDVKNLQKGDRVVIPFVIACGDCFFCRLQQYAACENTNAGKGAALNKKQI PAPAALFGYSHLYGGVPGGQAEYVRVPKGNVGPFKVPPLLSDDKALFLSDILPTAWQAAK NAQIQQGSSVAVYGAGPVGLLTIACARLLGAEQIFVVDHHPYRLHFAADRYGAIPINFDE DSDPAQSIIEQTAGHRGVDAVIDAVGFEAKGSTTETVLTNLKLEGSSGKALRQCIAAVRR 
GGIVSVPGVYAGFIHGFLFGDAFDKGLSFKMGQTHVHAWLGELLPLIEKGLLKPEEIVTH YMPFEEAARGYEIFEKREEECRKVILVPGAQSAEAAQKAVSGLVNAMPGGTI >gi|296494697|gb|ADTN01000041.1| GENE 5 3103 - 3309 109 68 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_0629 NR:ns ## KEGG: EcE24377A_0629 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 13 68 1 56 56 89 96.0 3e-17 MNNSVILGEEFSVANSFVAQFHFKYINWHNDCLIHNPFSLLIMNKSFAMIIIFIPDICLV LFPYELFL >gi|296494697|gb|ADTN01000041.1| GENE 6 3205 - 3633 537 142 aa, chain + ## HITS:1 COG:ybdQ KEGG:ns NR:ns ## COG: ybdQ COG0589 # Protein_GI_number: 16128590 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Escherichia coli K12 # 1 142 1 142 142 253 100.0 8e-68 MYKTIIMPVDVFEMELSDKAVRHAEFLAQDDGVIHLLHVLPGSASLSLHRFAADVRRFEE HLQHEAQERLQTMVSHFTIDPSRIKQHVRFGSVRDEVNELAEELGADVVVIGSRNPSIST HLLGSNASSVIRHANLPVLVVR >gi|296494697|gb|ADTN01000041.1| GENE 7 3754 - 5319 411 521 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 213 515 2 303 306 162 31 2e-39 MLDTNMKTQLKAYLEKLTKPVELIATLDDSAKSAEIKELLAEIAELSDKVTFKEDNSLPV RKPSFLITNPGSNQGPRFAGSPLGHEFTSLVLALLWTGGHPSKEAQSLLEQIRHIDGDFE FETYYSLSCHNCPDVVQALNLMSVLNPRIKHTAIDGGTFQNEITDRNVMGVPAVFVNGKE FGQGRMTLTEIVAKIDTGAEKRAAEELNKRDAYDVLIVGSGPAGAAAAIYSARKGIRTGL MGERFGGQILDTVDIENYISVPKTEGQKLAGALKVHVDEYDVDVIDSQSASKLIPAAVEG GLHQIETASGAVLKARSIIVATGAKWRNMNVPGEDQYRTKGVTYCPHCDGPLFKGKRVAV IGGGNSGVEAAIDLAGIVEHVTLLEFAPEMKADQVLQDKLRSLKNVDIILNAQTTEVKGD GSKVVGLEYRDRVSGDIHNIELAGIFVQIGLLPNTNWLEGAVERNRMGEIIIDAKCETNV KGVFAAGDCTTVPYKQIIIATGEGAKASLSAFDYLIRTKTA >gi|296494697|gb|ADTN01000041.1| GENE 8 5318 - 5434 58 38 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|188495373|ref|ZP_03002643.1| ## NR: gi|188495373|ref|ZP_03002643.1| hypothetical protein Ec53638_0709 [Escherichia coli 53638] # 1 38 1 38 38 62 100.0 7e-09 MFISWALNIIMQAASAATWMQLASWCGEENRVPQHPKN 
>gi|296494697|gb|ADTN01000041.1| GENE 9 5564 - 6127 626 187 aa, chain - ## HITS:1 COG:ECs0644 KEGG:ns NR:ns ## COG: ECs0644 COG0450 # Protein_GI_number: 15829898 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Escherichia coli O157:H7 # 1 187 1 187 187 381 100.0 1e-106 MSLINTKIKPFKNQAFKNGEFIEITEKDTEGRWSVFFFYPADFTFVCPTELGDVADHYEE LQKLGVDVYAVSTDTHFTHKAWHSSSETIAKIKYAMIGDPTGALTRNFDNMREDEGLADR ATFVVDPQGIIQAIEVTAEGIGRDASDLLRKIKAAQYVASHPGEVCPAKWKEGEATLAPS LDLVGKI >gi|296494697|gb|ADTN01000041.1| GENE 10 6499 - 7245 692 248 aa, chain + ## HITS:1 COG:dsbG KEGG:ns NR:ns ## COG: dsbG COG1651 # Protein_GI_number: 16128587 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Protein-disulfide isomerase # Organism: Escherichia coli K12 # 12 248 32 268 268 474 100.0 1e-134 MLKKILLLALLPAIAFAEELPAPVKAIEKQGITIIKTFDAPGGMKGYLGKYQDMGVTIYL TPDGKHAISGYMYNEKGENLSNTLIEKEIYAPAGREMWQRMEQSHWLLDGKKDAPVIVYV FADPFCPYCKQFWQQARPWVDSGKVQLRTLLVGVIKPESPATAAAILASKDPAKTWQQYE ASGGKLKLNVPANVSTEQMKVLSDNEKLMDDLGANVTPAIYYMSKENTLQQAVGLPDQKT LNIIMGNK >gi|296494697|gb|ADTN01000041.1| GENE 11 7454 - 8356 416 300 aa, chain + ## HITS:1 COG:ybdO KEGG:ns NR:ns ## COG: ybdO COG0583 # Protein_GI_number: 16128586 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 300 1 300 300 563 99.0 1e-160 MANLYDLKKFDLNLLVIFECIYQHLSISKAAESLYITPSAVSQSLQRLRAQFNDPLFIRS GKGIAPTTTGLNLHHHLEKNLRGLEQTINIVNKSELKKNFIIYGPQLISCSNNSMLIRCL RQDSSVEIECHDILMSAENAEELLVHRKADLVITQMPVISRSVICMPLHTIRNTLICSNR HPRITDNSTYEQIMAEEFTQLISKSAGVDDIQMEIDERFMNRKISFRGSSLLTIINSIAV TDLLGIVPYELYNFYRDFLNLKEIKLEHPLPSIKLYISYNKSSLNNLVFSRFIDRLNESF >gi|296494697|gb|ADTN01000041.1| GENE 12 8503 - 9723 677 406 aa, chain + ## HITS:1 COG:ybdN KEGG:ns NR:ns ## COG: ybdN COG3969 # Protein_GI_number: 16128585 # Func_class: R General function prediction only # Function: Predicted phosphoadenosine phosphosulfate sulfotransferase 
# Organism: Escherichia coli K12 # 1 406 1 406 406 834 99.0 0 MSIYKIPLPLNILEAARERITWTLNTLPRVCVSFSGGKDSGLMLHLTAELARQMGKKICV LFIDWEAQFSCTINYVQSLRELYTDVIEEFYWVALPLTTQNSLSQYQPEWQCWEPDVEWV RQPPQDAITDPDFFSFYQPGMTFEQFVREFAEWFSQKRPAAMMIGIRADESYNRFVAIAS LNKQRFADDKPWTTAAPGGHSWYIYPIYDWKVADIWTWYANHQSLCNPLYNLMYQAGVPL RHMRICEPFGPEQRQGLWLYHVIEPDRWAAMCARVSGVKSGGIYAGHDNHFYGHRKILKP EHLDWQEYALLLLNSMPEKTAEHYRNKIAIYLHWYQKKGIEVPQTQQGDIGAKDIPSWRR ICKVLLNNDYWCRALSFSPTKSKNYQRYNERIKGKRQEWGILCNND >gi|296494697|gb|ADTN01000041.1| GENE 13 9708 - 10325 535 205 aa, chain + ## HITS:1 COG:ybdM KEGG:ns NR:ns ## COG: ybdM COG1475 # Protein_GI_number: 16128584 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 205 5 209 209 408 100.0 1e-114 MQQRLTQDLTQFLASLPEDDRIKAINEIRMAIHQVSPFREEPVDCVLWVKNSQLMPNDYN PNNVAPPEKKLLQKSIEIDGFTQPIVVTHTDKNAMEIVDGFHRHEIGKGSSSLKLRLKGY LPVTCLEGTRNQRIAATIRHNRARGRHQITAMSEIVRELSQLGWDDNKIGKELGMDSDEV LRLKQINGLQELFADRQYSRAWTVK >gi|296494697|gb|ADTN01000041.1| GENE 14 10326 - 11486 989 386 aa, chain - ## HITS:1 COG:ybdL KEGG:ns NR:ns ## COG: ybdL COG0436 # Protein_GI_number: 16128583 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Escherichia coli K12 # 1 386 1 386 386 770 100.0 0 MTNNPLIPQSKLPQLGTTIFTQMSALAQQHQAINLSQGFPDFDGPRYLQERLAHHVAQGA NQYAPMTGVQALREAIAQKTERLYGYQPDADSDITVTAGATEALYAAITALVRNGDEVIC FDPSYDSYAPAIALSGGIVKRMALQPPHFRVDWQEFAALLSERTRLVILNTPHNPSATVW QQADFAALWQAIAGHEIFVISDEVYEHINFSQQGHASVLAHPQLRERAVAVSSFGKTYHM TGWKVGYCVAPAPISAEIRKVHQYLTFSVNTPAQLALADMLRAEPEHYLALPDFYRQKRD ILVNALNESRLEILPCEGTYFLLVDYSAVSTLDDVEFCQWLTQEHGVAAIPLSVFCADPF PHKLIRLCFAKKESTLLAAAERLRQL >gi|296494697|gb|ADTN01000041.1| GENE 15 11595 - 12683 1158 362 aa, chain + ## HITS:1 COG:ybdH KEGG:ns NR:ns ## COG: ybdH COG0371 # Protein_GI_number: 16128582 # Func_class: C Energy production and conversion # Function: Glycerol dehydrogenase and related enzymes # Organism: 
Escherichia coli K12 # 1 362 1 362 362 698 100.0 0 MPHNPIRVVVGPANYFSHPGSFNHLHDFFTDEQLSRAVWIYGKRAIAAAQTKLPPAFGLP GAKHILFRGHCSESDVQQLAAESGDDRSVVIGVGGGALLDTAKALARRLGLPFVAVPTIA ATCAAWTPLSVWYNDAGQALHYEIFDDANFMVLVEPEIILNAPQQYLLAGIGDTLAKWYE AVVLAPQPETLPLTVRLGINNAQAIRDVLLNSSEQALSDQQNQQLTQSFCDVVDAIIAGG GMVGGLGDRFTRVAAAHAVHNGLTVLPQTEKFLHGTKVAYGILVQSALLGQDDVLAQLTG AYQRFHLPTTLAELEVDINNQAEIDKVIAHTLRPVESIHYLPVTLTPDTLRAAFKKVESF KA >gi|296494697|gb|ADTN01000041.1| GENE 16 12693 - 12890 259 65 aa, chain - ## HITS:1 COG:ybdD KEGG:ns NR:ns ## COG: ybdD COG2879 # Protein_GI_number: 16132239 # Func_class: S Function unknown # Function: Uncharacterized small protein # Organism: Escherichia coli K12 # 1 65 1 65 65 129 100.0 1e-30 MFDSLAKAGKYLGQAAKLMIGMPDYDNYVEHMRVNHPDQTPMTYEEFFRERQDARYGGKG GARCC >gi|296494697|gb|ADTN01000041.1| GENE 17 13073 - 15178 2812 701 aa, chain - ## HITS:1 COG:cstA KEGG:ns NR:ns ## COG: cstA COG1966 # Protein_GI_number: 16128581 # Func_class: T Signal transduction mechanisms # Function: Carbon starvation protein, predicted membrane protein # Organism: Escherichia coli K12 # 1 701 1 701 701 1283 100.0 0 MNKSGKYLVWTVLSVMGAFALGYIALNRGEQINALWIVVASVCIYLIAYRFYGLYIAKNV LAVDPTRMTPAVRHNDGLDYVPTDKKVLFGHHFAAIAGAGPLVGPVLAAQMGYLPGMIWL LAGVVLAGAVQDFMVLFVSTRRDGRSLGELVKEEMGPTAGVIALVACFMIMVIILAVLAM IVVKALTHSPWGTYTVAFTIPLALFMGIYLRYLRPGRIGEVSVIGLVFLIFAIISGGWVA ESPTWAPYFDFTGVQLTWMLVGYGFVAAVLPVWLLLAPRDYLSTFLKIGTIVGLAVGILI MRPTLTMPALTKFVDGTGPVWTGNLFPFLFITIACGAVSGFHALISSGTTPKMLANEGQA CFIGYGGMLMESFVAIMALVSACIIDPGVYFAMNSPMAVLAPAGTADVVASAAQVVSSWG FSITPDTLNQIASEVGEQSIISRAGGAPTLAVGMAYILHGALGGMMDVAFWYHFAILFEA LFILTAVDAGTRAARFMLQDLLGVVSPGLKRTDSLPANLLATALCVLAWGYFLHQGVVDP LGGINTLWPLFGIANQMLAGMALMLCAVVLFKMKRQRYAWVALVPTAWLLICTLTAGWQK AFSPDAKVGFLAIANKFQAMIDSGNIPSQYTESQLAQLVFNNRLDAGLTIFFMVVVVVLA LFSIKTALAALKDPKPTAKETPYEPMPENVEEIVAQAKGAH >gi|296494697|gb|ADTN01000041.1| GENE 18 15359 - 15772 336 137 aa, chain - ## HITS:1 COG:ybdB KEGG:ns NR:ns ## COG: ybdB COG2050 # Protein_GI_number: 16128580 # 
Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein, possibly involved in aromatic compounds catabolism # Organism: Escherichia coli K12 # 1 137 1 137 137 275 100.0 1e-74 MIWKRHLTLDELNATSDNTMVAHLGIVYTRLGDDVLEAEMPVDTRTHQPFGLLHGGASAA LAETLGSMAGFMMTRDGQCVVGTELNATHHRPVSEGKVRGVCQPLHLGRQNQSWEIVVFD EQGRRCCTCRLGTAVLG >gi|296494697|gb|ADTN01000041.1| GENE 19 15775 - 16521 185 248 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 6 242 4 238 242 75 27 3e-13 MDFSGKNVWVTGAGKGIGYATALAFVEAGAKVTGFDQAFTQEQYPFATEVMDVADAAQVA QVCQRLLAETERLDALVNAAGILRMGATDQLSKEDWQQTFAVNVGGAFNLFQQTMNQFRR QRGGAIVTVASDAAHTPRIGMSAYGASKAALKSLALSVGLELAGSGVRCNVVSPGSTDTD MQRTLWVSDDAEEQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDLASHITLQDIVV DGGSTLGA >gi|296494697|gb|ADTN01000041.1| GENE 20 16521 - 17378 1253 285 aa, chain - ## HITS:1 COG:entB_1 KEGG:ns NR:ns ## COG: entB_1 COG1535 # Protein_GI_number: 16128578 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Isochorismate hydrolase # Organism: Escherichia coli K12 # 1 215 1 215 215 437 100.0 1e-123 MAIPKLQAYALPESHDIPQNKVDWAFEPQRAALLIHDMQDYFVSFWGENCPMMEQVIANI AALRDYCKQHNIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVDRLTPDADDTV LVKWRYSAFHRSPLEQMLKESGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD FSRDEHLMSLKYVAGRSGRVVMTEELLPAPIPASKAALREVILPLLDESDEPFDDDNLID YGLDSVRMMALAARWRKVHGDIDFVMLAKNPTIDAWWKLLSREVK >gi|296494697|gb|ADTN01000041.1| GENE 21 17392 - 19002 1645 536 aa, chain - ## HITS:1 COG:entE KEGG:ns NR:ns ## COG: entE COG1021 # Protein_GI_number: 16128577 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Peptide arylation enzymes # Organism: Escherichia coli K12 # 1 536 1 536 536 1093 100.0 0 MSIPFTRWPEEFARRYREKGYWQDLPLTDILTRHAASDSIAVIDGERQLSYRELNQAADN LACSLRRQGIKPGETALVQLGNVAELYITFFALLKLGVAPVLALFSHQRSELNAYASQIE 
PALLIADRQHALFSGDDFLNTFVTEHSSIRVVQLLNDSGEHNLQDAINHPAEDFTATPSP ADEVAYFQLSGGTTGTPKLIPRTHNDYYYSVRRSVEICQFTQQTRYLCAIPAAHNYAMSS PGSLGVFLAGGTVVLAADPSATLCFPLIEKHQVNVTALVPPAVSLWLQALIEGESRAQLA SLKLLQVGGARLSATLAARIPAEIGCQLQQVFGMAEGLVNYTRLDDSAEKIIHTQGYPMC PDDEVWVADAEGNPLPQGEVGRLMTRGPYTFRGYYKSPQHNASAFDANGFYCSGDLISID PEGYITVQGREKDQINRGGEKIAAEEIENLLLRHPAVIYAALVSMEDELMGEKSCAYLVV KEPLRAVQVRRFLREQGIAEFKLPDRVECVDSLPLTAVGKVDKKQLRQWLASRASA >gi|296494697|gb|ADTN01000041.1| GENE 22 19012 - 20187 1238 391 aa, chain - ## HITS:1 COG:ECs0632 KEGG:ns NR:ns ## COG: ECs0632 COG1169 # Protein_GI_number: 15829886 # Func_class: H Coenzyme transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Isochorismate synthase # Organism: Escherichia coli O157:H7 # 1 391 1 391 391 758 100.0 0 MDTSLAEEVQQTMATLAPNRFFFMSPYRSFTTSGCFARFDEPAVNGDSPDSPFQQKLAAL FADAKAQGIKNPVMVGAIPFDPRQPSSLYIPESWQSFSRQEKQASARRFTRSQSLNVVER QAIPEQTTFEQMVARAAALTATPQVDKVVLSRLIDITTDAAIDSGVLLERLIAQNPVSYN FHVPLADGGVLLGASPELLLRKDGERFSSIPLAGSARRQPDEVLDREAGNRLLASEKDRH EHELVTQAMKEVLRERSSELHVPSSPQLITTPTLWHLATPFEGKANSQENALTLACLLHP TPALSGFPHQAATQVIAELEPFDRELFGGIVGWCDSEGNGEWVVTIRCAKLRENQVRLFA GAGIVPASSPLGEWRETGVKLSTMLNVFGLH >gi|296494697|gb|ADTN01000041.1| GENE 23 20562 - 21518 1072 318 aa, chain + ## HITS:1 COG:ECs0631 KEGG:ns NR:ns ## COG: ECs0631 COG4592 # Protein_GI_number: 15829885 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe2+-enterobactin transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 1 318 1 318 318 572 98.0 1e-163 MRLAPLYRNALLLTGLLLSGIAAVQAADWPRQITDSRGTHTLESQPQRIVSTSVTLTGSL LAIDAPVIASGATTPNNRVADDQGFLRQWSKVAKERKLQRLYIGEPSAEAVAAQMPDLIL ISATGGDSALALYDQLSTIAPTLIINYDDKSWQSLLTQLGEITGHEKQAAERIAQFDKQL AAAKEQIKLPPQPVTAIVYTAAAHSANLWTPESAQGQMLEQLGFTLAKLPAGLNASQSQG KRHDIIQLGGENLAAGLNGESLFLFAGDQKDADAIYANPLLAHLPAVQNKQVYALGTETF RLDYYSAMQVLDRLKALF >gi|296494697|gb|ADTN01000041.1| GENE 24 21522 - 22772 1195 416 aa, chain - ## HITS:1 COG:ybdA 
KEGG:ns NR:ns ## COG: ybdA COG0477 # Protein_GI_number: 16128574 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 416 1 416 416 577 100.0 1e-164 MNKQSWLLNLSLLKTHPAFRAVFLARFISIVSLGLLGVAVPVQIQMMTHSTWQVGLSVTL TGGAMFVGLMVGGVLADRYERKKVILLARGTCGIGFIGLCLNALLPEPSLLAIYLLGLWD GFFASLGVTALLAATPALVGRENLMQAGAITMLTVRLGSVISPMIGGLLLATGGVAWNYG LAAAGTFITLLPLLSLPALPPPPQPREHPLKSLLAGFRFLLASPLVGGIALLGGLLTMAS AVRVLYPALADNWQMSAAQIGFLYAAIPLGAAIGALTSGKLAHSARPGLLMLLSTLGSFL AIGLFGLMPMWILGVVCLALFGWLSAVSSLLQYTMLQTQTPEAMLGRINGLWTAQNVTGD AIGAALLGGLGAMMTPVASASASGFGLLIIGVLLLLVLVELRHFRQTPPQVTASDS >gi|296494697|gb|ADTN01000041.1| GENE 25 22871 - 23887 1290 338 aa, chain + ## HITS:1 COG:fepD KEGG:ns NR:ns ## COG: fepD COG0609 # Protein_GI_number: 16128573 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Escherichia coli K12 # 5 338 1 334 334 452 100.0 1e-127 MEVRMSGSVAVTRAIAVPGLLLLLIIATALSLLIGAKSLPASVVLEAFSGTCQSADCTIV LDARLPRTLAGLLAGGALGLAGALMQTLTRNPLADPGLLGVNAGASFAIVLGAALFGYSS AQEQLAMAFAGALVASLIVAFTGSQGGGQLSPVRLTLAGVALAAVLEGLTSGIALLNPDV YDQLRFWQAGSLDIRNLHTLKVVLIPVLIAGATALLLSRALNSLSLGSDTATALGSRVAR TQLIGLLAITVLCGSATAIVGPIAFIGLMMPHMARWLVGADHRWSLPVTLLATPALLLFA DIIGRVIVPGELRVSVVSAFIGAPVLIFLVRRKTRGGA >gi|296494697|gb|ADTN01000041.1| GENE 26 23884 - 24876 1033 330 aa, chain + ## HITS:1 COG:fepG KEGG:ns NR:ns ## COG: fepG COG4779 # Protein_GI_number: 16128572 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type enterobactin transport system, permease component # Organism: Escherichia coli K12 # 1 330 1 330 330 468 100.0 1e-132 MIYVSRRLLITCLLLVSACVVAGIWGLRSGAVTLETSQVFAALMGDAPRSMTMVVTEWRL PRVLMALLIGAALGVSGAIFQSLMRNPLGSPDVMGFNTGAWSGVLVAMVLFGQDLTAIAL 
SAMVGGIVTSLLVWLLAWRNGIDTFRLIIIGIGVRAMLVAFNTWLLLKASLETALTAGLW NAGSLNGLTWAKTSPSAPIIILMLIAAALLVRRMRLLEMGDDTACALGVSVERSRLLMML VAVVLTAAATALAGPISFIALVAPHIARRISGTARWGLTQAALCGALLLLAADLCAQQLF MPYQLPVGVVTVSLGGIYLIVLLIQESRKK >gi|296494697|gb|ADTN01000041.1| GENE 27 24873 - 25688 197 271 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 8 232 9 228 309 80 25 1e-14 MTESVARLRGEQLTLGYGKYTVAENLTVEIPDGHFTAIIGPNGCGKSTLLRTLSRLMTPA HGHVWLDGEHIQHYASKEVARRIGLLAQNATTPGDITVQELVARGRYPHQPLFTRWRKED EEAVTKAMQATGITHLADQSVDTLSGGQRQRAWIAMVLAQETAIMLLDEPTTWLDISHQI DLLELLSELNREKGYTLAAVLHDLNQACRYASHLIALREGKIVAQGAPKEIVTAELIERI YGLRCMIIDDPVAGTPLVVPLGRTAPSTANS >gi|296494697|gb|ADTN01000041.1| GENE 28 25685 - 26014 283 109 aa, chain - ## HITS:1 COG:fepE KEGG:ns NR:ns ## COG: fepE COG3765 # Protein_GI_number: 16128570 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Chain length determinant protein # Organism: Escherichia coli K12 # 10 109 278 377 377 201 99.0 3e-52 TLSISWSHYLERKLEIEKAVTDVAELNGELRNRQYLVEQLTKAHVNDVNFTPFKYQLSPS LPVKKDGPGKAIIVILSALIGGMVACGGVLLRYAMASRKQDAMMADHLV Prediction of potential genes in microbial genomes Time: Sun May 15 23:14:06 2011 Seq name: gi|296494696|gb|ADTN01000042.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont102.1, whole genome shotgun sequence Length of sequence - 1445 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 514 - 780 152 ## JW3449 hypothetical protein 2 1 Op 2 . 
- CDS 869 - 1360 -70 ## COG3209 Rhs family protein Predicted protein(s) >gi|296494696|gb|ADTN01000042.1| GENE 1 514 - 780 152 88 aa, chain - ## HITS:1 COG:no KEGG:JW3449 NR:ns ## KEGG: JW3449 # Name: yhhH # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 88 40 127 127 172 100.0 4e-42 MRKYESEGKYTVRNLVKNKAIALELAEIYVKNRYGQDAAEEEKPYEITELTTSWVVEGTI HSDQIAGGVFIIEIGKNDGRILNFGHGK >gi|296494696|gb|ADTN01000042.1| GENE 2 869 - 1360 -70 163 aa, chain - ## HITS:1 COG:rhsB KEGG:ns NR:ns ## COG: rhsB COG3209 # Protein_GI_number: 16131354 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli K12 # 1 163 1249 1411 1411 351 99.0 3e-97 MWEDAKSGACTNGLCGTLSAMIGPDKFDSIDSTAYGALNKINSQSICEDKEFAGLICKDN SGRYFSTAPNRGERKGSYPFNSPCPNGTEKVSAYHTHGADSHGEYWDEIFSGKDEKIVKS KDNNIKSFYLGTPSGNFKAIDNHGKEITNRKGLPNVCRVHGNM Prediction of potential genes in microbial genomes Time: Sun May 15 23:14:09 2011 Seq name: gi|296494695|gb|ADTN01000043.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont104.1, whole genome shotgun sequence Length of sequence - 529 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 105 - 528 241 ## COG3209 Rhs family protein Predicted protein(s) >gi|296494695|gb|ADTN01000043.1| GENE 1 105 - 528 241 141 aa, chain + ## HITS:1 COG:ybfO KEGG:ns NR:ns ## COG: ybfO COG3209 # Protein_GI_number: 16128678 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli K12 # 1 141 94 234 477 283 100.0 7e-77 MSLSRKPQVTWYGWDGDRLTTIQNDRTRIQTIYQPGSFTPLIRVETATGELAKTQRRSLA DTLQQSGGEDGGSVVFPPVLVQMLDRLESEILADRVSEESRRWLASCGLTVEQMQNQMDP VYTPARKIHLYHCDHRGLPLA Prediction of potential genes in microbial genomes Time: Sun May 15 23:14:10 2011 Seq name: gi|296494694|gb|ADTN01000044.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont105.1, whole genome shotgun sequence Length of sequence - 1733 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 964 200 ## ECO111_p1-164 conjugative transfer system protein 2 1 Op 2 . 
- CDS 975 - 1697 375 ## ECO111_p1-163 conserved predicted protein Predicted protein(s) >gi|296494694|gb|ADTN01000044.1| GENE 1 2 - 964 200 320 aa, chain - ## HITS:1 COG:no KEGG:ECO111_p1-164 NR:ns ## KEGG: ECO111_p1-164 # Name: not_defined # Def: conjugative transfer system protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 316 1 316 1011 629 93.0 1e-179 MNFRAIFLSMQRVLGIFSRRENDVVELIKQDPSSLSPFAQIVGDQKYTVPDHPNLEVLKF IEYPTRPAGIQTFNEQSILSLYRDKLHSISMMLAISDGDIREDAFTFTNLVLKPLIEYIR WIHLLPASENHHHNGIGGLLSHSLEVAMISLKNANHSELRPIGYQDEEVIRRKVYLYAAF ICGLVHDAGKVYDIDIVSLNLSKTLTWAPSSQSLLDWARENNVVEYEIHWRKRIHNQHNI WSSVFLERILDPVCMSFLDRVKKEHVYAKMVTALNVYNDGNDFLSKCVRTSDYYSTGTDL NVLRDPIMGLRSGDAANLLI >gi|296494694|gb|ADTN01000044.1| GENE 2 975 - 1697 375 240 aa, chain - ## HITS:1 COG:no KEGG:ECO111_p1-163 NR:ns ## KEGG: ECO111_p1-163 # Name: not_defined # Def: conserved predicted protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 240 1 240 240 410 91.0 1e-113 MTDVSINETEQKTPDVNARTSQASDNTRKVVQSKNELYGMATVTIKPGTPDFNRFLTARN RSVIRGFDDVSIAISSLFRTVNAVRHPELVQAIQDWFNELHDENNLMKENLEHHIATIHV DESDPFFSSTEFSPFRFESVQLNFNNQNTMRFYKHIFEMNNLLTQMYKFNSLGQLPVSDY NVMAHNIIRSLNMYIERVKKTLNVSRRVKGAYSPDEFIEKVKQYKSVQAYIAAELSGKRR Prediction of potential genes in microbial genomes Time: Sun May 15 23:14:20 2011 Seq name: gi|296494693|gb|ADTN01000045.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont106.1, whole genome shotgun sequence Length of sequence - 7872 bp Number of predicted genes - 12, with homology - 11 Number of transcription units - 9, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 51 - 275 240 ## COG3615 Uncharacterized protein/domain, possibly involved in tellurite resistance 2 1 Op 2 . + CDS 279 - 461 302 ## ECB_01765 hypothetical protein + Prom 507 - 566 3.6 3 2 Tu 1 . + CDS 609 - 857 324 ## COG2261 Predicted membrane protein + Term 871 - 908 6.3 - Term 854 - 897 1.3 4 3 Tu 1 . 
- CDS 916 - 990 86 ## - Prom 1030 - 1089 5.2 - Term 1032 - 1070 2.0 5 4 Tu 1 . - CDS 1124 - 2149 821 ## COG2199 FOG: GGDEF domain - Prom 2248 - 2307 6.9 + Prom 2248 - 2307 3.9 6 5 Tu 1 . + CDS 2332 - 2586 163 ## COG3042 Putative hemolysin + Term 2614 - 2670 0.6 7 6 Op 1 3/0.500 - CDS 2608 - 2955 430 ## COG3189 Uncharacterized conserved protein 8 6 Op 2 . - CDS 3010 - 4191 496 ## COG2807 Cyanate permease - Prom 4214 - 4273 3.4 + Prom 4208 - 4267 5.8 9 7 Tu 1 . + CDS 4291 - 5109 316 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 5014 - 5047 2.0 10 8 Tu 1 . - CDS 5066 - 5512 630 ## COG2707 Predicted membrane protein - Prom 5612 - 5671 3.1 - Term 5615 - 5662 4.2 11 9 Op 1 2/1.000 - CDS 5787 - 6290 469 ## COG2606 Uncharacterized conserved protein 12 9 Op 2 . - CDS 6333 - 7805 1050 ## COG2199 FOG: GGDEF domain Predicted protein(s) >gi|296494693|gb|ADTN01000045.1| GENE 1 51 - 275 240 74 aa, chain + ## HITS:1 COG:ECs2506 KEGG:ns NR:ns ## COG: ECs2506 COG3615 # Protein_GI_number: 15831760 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein/domain, possibly involved in tellurite resistance # Organism: Escherichia coli O157:H7 # 1 74 46 119 119 151 100.0 2e-37 MHGAVKYLGYADEHSAEPDQVILIEAGQFAVFPPEKWHNIEAMTDDTYFNIDFFVAPEVL MEGAQQRKVIHNGK >gi|296494693|gb|ADTN01000045.1| GENE 2 279 - 461 302 60 aa, chain + ## HITS:1 COG:no KEGG:ECB_01765 NR:ns ## KEGG: ECB_01765 # Name: yoaG # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 60 1 60 60 96 100.0 3e-19 MGKATYTVTVTNNSNGVSVDYETETPMTLLVPEVAAEVIKDLVNTVRSYDTENEHDVCGW >gi|296494693|gb|ADTN01000045.1| GENE 3 609 - 857 324 82 aa, chain + ## HITS:1 COG:YPO1181 KEGG:ns NR:ns ## COG: YPO1181 COG2261 # Protein_GI_number: 16121476 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Yersinia pestis # 1 82 26 107 107 99 82.0 2e-21 MGILSWIIFGLIAGILAKWIMPGKDGGGFFMTILLGIVGAVVGGWISTLFGFGKVDGFNF GSFVVAVIGAIVVLFIYRKIKS 
>gi|296494693|gb|ADTN01000045.1| GENE 4 916 - 990 86 24 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKTTIIMMGVAIIVVLGTELGWW >gi|296494693|gb|ADTN01000045.1| GENE 5 1124 - 2149 821 341 aa, chain - ## HITS:1 COG:yeaP_2 KEGG:ns NR:ns ## COG: yeaP_2 COG2199 # Protein_GI_number: 16129748 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 168 341 1 174 174 351 100.0 1e-96 MSDQIIARVSQSLAKEQSLESLVRQLLEMLEMVTDMESTYLTKVDVEARLQHIMFARNSQ KMYIPENFTVSWDYSLCKRAIDENCFFSDEVPDRWGDCIAARNLGITTFLSTPIHLPDGS FYGTLCAASSEKRQWSERAEQVLQLFAGLIAQYIQKEALVEQLREANAALIAQSYTDSLT GLPNRRAIFENLTTLFSLARHLNHKIMIAFIDLDNFKLINDRFGHNSGDLFLIQVGERLN TLQQNGEVIGRLGGDEFLVVSLNNENADISSLRERIQQQIRGEYHLGDVDLYYPGASLGI VEVDPETTDADSALHAADIAMYQEKKHKQKTPFVAHPALHS >gi|296494693|gb|ADTN01000045.1| GENE 6 2332 - 2586 163 84 aa, chain + ## HITS:1 COG:yoaF KEGG:ns NR:ns ## COG: yoaF COG3042 # Protein_GI_number: 16129747 # Func_class: R General function prediction only # Function: Putative hemolysin # Organism: Escherichia coli K12 # 1 84 1 84 84 154 100.0 3e-38 MKIISFVLPCLLVLAGCSTPSQPEAPKPPQIGMANPASVYCQQKGGTLIPVQTAQGVSNN CKLPGGETIDEWALWRRDHPAGEK >gi|296494693|gb|ADTN01000045.1| GENE 7 2608 - 2955 430 115 aa, chain - ## HITS:1 COG:yeaO KEGG:ns NR:ns ## COG: yeaO COG3189 # Protein_GI_number: 16129746 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 115 8 122 122 222 100.0 2e-58 MNIQCKRVYDPAEQSDGYRILVDRLWPRGIKKTDLALDEWDKEITPSTELRKAFHGEVVD YATFREQYLAELAQHEQEGKRLADIAKKQPLTLLYSAKNTTQNHALVLADWLRSL >gi|296494693|gb|ADTN01000045.1| GENE 8 3010 - 4191 496 393 aa, chain - ## HITS:1 COG:yeaN KEGG:ns NR:ns ## COG: yeaN COG2807 # Protein_GI_number: 16129745 # Func_class: P Inorganic ion transport and metabolism # Function: Cyanate permease # Organism: Escherichia coli K12 # 1 393 1 393 393 641 100.0 0 MTCSTSLSGKNRIVLIAGILMIATTLRVTFTGAAPLLDTIRSAYSLTTAQTGLLTTLPLL 
AFALISPLAAPVARRFGMERSLFAALLLICAGIAIRSLPSPYLLFGGTAVIGGGIALGNV LLPGLIKRDFPHSVARLTGAYSLTMGAAAALGSAMVVPLALNGFGWQGALLMLMCFPLLA LFLWLPQWRSQQHANLSTSRALHTRGIWRSPLAWQVTLFLGINSLVYYVIIGWLPAILIS HGYSEAQAGSLHGLLQLATAAPGLLIPLFLHHVKDQRGIAAFVALMCAVGAVGLCFMPAH AITWTLLFGFGSGATMILGLTFIGLRASSAHQAAALSGMAQSVGYLLAACGPPLMGKIHD ANGNWSVPLMGVAILSLLMAIFGLCAGRDKEIR >gi|296494693|gb|ADTN01000045.1| GENE 9 4291 - 5109 316 272 aa, chain + ## HITS:1 COG:yeaM KEGG:ns NR:ns ## COG: yeaM COG2207 # Protein_GI_number: 16129744 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 272 2 273 273 519 100.0 1e-147 MHRLNLNGYEPDRHHEAAVAFCIHAGTDELTSPVHQHRKGQLILALHGAITCTVENALWM VPPQYAVWIPGGVEHSNQVTANAELCFLFIEPSAVTMPTTCCTLKISPLCRELILTLANR TTTQRAEPMTRRLIQVLFDELPQQPQQQLHLPVSSHPKIRTMVEMMAKGPVEWGALGQWA GFFAMSERNLARLIVKETGLSFRQWRQQLQLIMALQGLVKGDTVQKVAHTLGYDSTTAFI TMFKKGLGQTPGRYIARLTTVSPQSAKPDPRQ >gi|296494693|gb|ADTN01000045.1| GENE 10 5066 - 5512 630 148 aa, chain - ## HITS:1 COG:yeaL KEGG:ns NR:ns ## COG: yeaL COG2707 # Protein_GI_number: 16129743 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 148 1 148 148 188 100.0 3e-48 MFDVTLLILLGLAALGFISHNTTVAVSILVLIIVRVTPLSTFFPWIEKQGLSIGIIILTI GVMAPIASGTLPPSTLIHSFLNWKSLVAIAVGVIVSWLGGRGVTLMGSQPQLVAGLLVGT VLGVALFRGVPVGPLIAAGLVSLIVGKQ >gi|296494693|gb|ADTN01000045.1| GENE 11 5787 - 6290 469 167 aa, chain - ## HITS:1 COG:ECs2496 KEGG:ns NR:ns ## COG: ECs2496 COG2606 # Protein_GI_number: 15831750 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 167 1 167 167 320 100.0 8e-88 MTEMAKGSVTHQRLIALLSQEGADFRVVTHEAVGKCEAVSEIRGTALGQGAKALVCKVKG NGVNQHVLAILAADQQADLSQLASHIGGLRASLASPAEVDELTGCVFGAIPPFSFHPKLK LVADPLLFERFDEIAFNAGMLDKSVILKTADYLRIAQPELVNFRRTA >gi|296494693|gb|ADTN01000045.1| GENE 12 6333 - 7805 1050 490 aa, chain - ## HITS:1 COG:yeaJ_2 KEGG:ns NR:ns ## COG: yeaJ_2 COG2199 # 
Protein_GI_number: 16129740 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 325 490 1 166 166 326 100.0 7e-89 MLRHFIAASVIVLTSSFLIFELVASDRAMSAYLRYIVQKADSSFLYDKYQNQSIAAHVMR ALAAEQSEVSPEQRRAICEAFESANNTHGLNLTAHKYPGLRGTLQTASTDCDTIVEAAAL LPAFDQAVEGNRHQDDYGSGLGMAEEKFHYYLDLNDRYVYFYEPVNVEYFAMNNWSFLQS GSIGIDRKDIEKVFTGRTVLSSIYQDQRTKQNVMSLLTPVYVAGQLKGIVLLDINKNNLR NIFYTHDRPLLWRFLNVTLTDTDSGRDIIINQSEDNLFQYVSYVHDLPGGIRVSLSIDIL YFITSSWKSVLFWILTALILLNMVRMHFRLYQNVSRENISDAMTGLYNRKILTPELEQRL QKLVQSGSSVMFIAIDMDKLKQINDTLGHQEGDLAITLLAQAIKQSIRKSDYAIRLGGDE FCIILVDSTPQIAAQLPERIEKRLQHIAPQKEIGFSSGIYAMKENDTLHDAYKASDERLY VNKQNKNSRS Prediction of potential genes in microbial genomes Time: Sun May 15 23:14:27 2011 Seq name: gi|296494692|gb|ADTN01000046.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont110.1, whole genome shotgun sequence Length of sequence - 5639 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 238 - 278 6.2 1 1 Tu 1 . - CDS 283 - 2856 1845 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 - Prom 2907 - 2966 4.7 2 2 Op 1 11/1.000 - CDS 2986 - 3717 466 ## COG1496 Uncharacterized conserved protein 3 2 Op 2 . - CDS 3714 - 4694 1157 ## COG0564 Pseudouridylate synthases, 23S RNA-specific - Prom 4749 - 4808 4.8 + Prom 4648 - 4707 4.5 4 3 Tu 1 . 
+ CDS 4829 - 5566 847 ## COG4105 DNA uptake lipoprotein Predicted protein(s) >gi|296494692|gb|ADTN01000046.1| GENE 1 283 - 2856 1845 857 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 857 1 811 815 715 46 0.0 MRLDRLTNKFQLALADAQSLALGHDNQFIEPLHLMSALLNQEGGSVSPLLTSAGINAGQL RTDINQALNRLPQVEGTGGDVQPSQDLVRVLNLCDKLAQKRGDNFISSELFVLAALESRG TLADILKAAGATTANITQAIEQMRGGESVNDQGAEDQRQALKKYTIDLTERAEQGKLDPV IGRDEEIRRTIQVLQRRTKNNPVLIGEPGVGKTAIVEGLAQRIINGEVPEGLKGRRVLAL DMGALVAGAKYRGEFEERLKGVLNDLAKQEGNVILFIDELHTMVGAGKADGAMDAGNMLK PALARGELHCVGATTLDEYRQYIEKDAALERRFQKVFVAEPSVEDTIAILRGLKERYELH HHVQITDPAIVAAATLSHRYIADRQLPDKAIDLIDEAASSIRMQIDSKPEELDRLDRRII QLKLEQQALMKESDEASKKRLDMLNEELSDKERQYSELEEEWKAEKASLSGTQTIKAELE QAKIAIEQARRVGDLARMSELQYGKIPELEKQLEAATQLEGKTMRLLRNKVTDAEIAEVL ARWTGIPVSRMMESEREKLLRMEQELHHRVIGQNEAVDAVSNAIRRSRAGLADPNRPIGS FLFLGPTGVGKTELCKALANFMFDSDEAMVRIDMSEFMEKHSVSRLVGAPPGYVGYEEGG YLTEAVRRRPYSVILLDEVEKAHPDVFNILLQVLDDGRLTDGQGRTVDFRNTVVIMTSNL GSDLIQERFGELDYAHMKELVLGVVSHNFRPEFINRIDEVVVFHPLGEQHIASIAQIQLK RLYKRLEERGYEIHISDEALKLLSENGYDPVYGARPLKRAIQQQIENPLAQQILSGELVP GKVIRLEVNEDRIVAVQ >gi|296494692|gb|ADTN01000046.1| GENE 2 2986 - 3717 466 243 aa, chain - ## HITS:1 COG:yfiH KEGG:ns NR:ns ## COG: yfiH COG1496 # Protein_GI_number: 16130514 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 243 1 243 243 508 100.0 1e-144 MSKLIVPQWPQPKGVAACSSTRIGGVSLPPYDSLNLGAHCGDNPDHVEENRKRLFAAGNL PSKPVWLEQVHGKDVLKLTGEPYASKRADASYSNTPGTVCAVMTADCLPVLFCNRAGTEV AAAHAGWRGLCAGVLEETVSCFADNPENILAWLGPAIGPRAFEVGGEVREAFMAVDAKAS AAFIQHGDKYLADIYQLARQRLANVGVEQIFGGDRCTYTENETFFSYRRDKTTGRMASFI WLI >gi|296494692|gb|ADTN01000046.1| GENE 3 3714 - 4694 1157 326 aa, chain - ## HITS:1 COG:sfhB KEGG:ns NR:ns ## COG: sfhB COG0564 # Protein_GI_number: 16130515 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # 
Organism: Escherichia coli K12 # 1 326 1 326 326 625 100.0 1e-179 MAQRVQLTATVSENQLGQRLDQALAEMFPDYSRSRIKEWILDQRVLVNGKVCDKPKEKVL GGEQVAINAEIEEEARFEPQDIPLDIVYEDEDIIIINKPRDLVVHPGAGNPDGTVLNALL HYYPPIADVPRAGIVHRLDKDTTGLMVVAKTVPAQTRLVESLQRREITREYEAVAIGHMT AGGTVDEPISRHPTKRTHMAVHPMGKPAVTHYRIMEHFRVHTRLRLRLETGRTHQIRVHM AHITHPLVGDPVYGGRPRPPKGASEAFISTLRKFDRQALHATMLRLYHPISGIEMEWHAP IPQDMVELIEVMRADFEEHKDEVDWL >gi|296494692|gb|ADTN01000046.1| GENE 4 4829 - 5566 847 245 aa, chain + ## HITS:1 COG:ECs3458 KEGG:ns NR:ns ## COG: ECs3458 COG4105 # Protein_GI_number: 15832712 # Func_class: R General function prediction only # Function: DNA uptake lipoprotein # Organism: Escherichia coli O157:H7 # 1 245 1 245 245 466 100.0 1e-131 MTRMKYLVAAATLSLFLAGCSGSKEEVPDNPPNEIYATAQQKLQDGNWRQAITQLEALDN RYPFGPYSQQVQLDLIYAYYKNADLPLAQAAIDRFIRLNPTHPNIDYVMYMRGLTNMALD DSALQGFFGVDRSDRDPQHARAAFSDFSKLVRGYPNSQYTTDATKRLVFLKDRLAKYEYS VAEYYTERGAWVAVVNRVEGMLRDYPDTQATRDALPLMENAYRQMQMNAQAEKVAKIIAA NSSNT Prediction of potential genes in microbial genomes Time: Sun May 15 23:14:28 2011 Seq name: gi|296494691|gb|ADTN01000047.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont110.2, whole genome shotgun sequence Length of sequence - 1991 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 113 - 172 4.1 1 1 Op 1 2/0.000 + CDS 232 - 573 574 ## PROTEIN SUPPORTED gi|15803120|ref|NP_289151.1| translation inhibitor protein RaiA + Term 599 - 626 1.5 + Prom 599 - 658 3.5 2 1 Op 2 . 
+ CDS 823 - 1983 956 ## COG0077 Prephenate dehydratase Predicted protein(s) >gi|296494691|gb|ADTN01000047.1| GENE 1 232 - 573 574 113 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803120|ref|NP_289151.1| translation inhibitor protein RaiA [Escherichia coli O157:H7 EDL933] # 1 113 1 113 113 225 100 2e-59 MTMNITSKQMEITPAIRQHVADRLAKLEKWQTHLINPHIILSKEPQGFVADATINTPNGV LVASGKHEDMYTAINELINKLERQLNKLQHKGEARRAATSVKDANFVEEVEEE >gi|296494691|gb|ADTN01000047.1| GENE 2 823 - 1983 956 386 aa, chain + ## HITS:1 COG:ECs3462_2 KEGG:ns NR:ns ## COG: ECs3462_2 COG0077 # Protein_GI_number: 15832716 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydratase # Organism: Escherichia coli O157:H7 # 105 386 1 282 282 568 100.0 1e-162 MTSENPLLALREKISALDEKLLALLAERRELAVEVGKAKLLSHRPVRDIDRERDLLERLI TLGKAHHLDAHYITRLFQLIIEDSVLTQQALLQQHLNKINPHSARIAFLGPKGSYSHLAA RQYAARHFEQFIESGCAKFADIFNQVETGQADYAVVPIENTSSGAINDVYDLLQHTSLSI VGEMTLTIDHCLLVSGTTDLSTINTVYSHPQPFQQCSKFLNRYPHWKIEYTESTSAAMEK VAQAKSPHVAALGSEAGGTLYGLQVLERIEANQRQNFTRFVVLARKAINVSDQVPAKTTL LMATGQQAGALVEALLVLRNHNLIMTRLESRPIHGNPWEEMFYLDIQANLESAEMQKALK ELGEITRSMKVLGCYPSENVVPVDPT Prediction of potential genes in microbial genomes Time: Sun May 15 23:14:34 2011 Seq name: gi|296494690|gb|ADTN01000048.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont110.3, whole genome shotgun sequence Length of sequence - 15045 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 9, operones - 6 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.333 - CDS 14 - 1135 1236 ## COG0287 Prephenate dehydrogenase 2 1 Op 2 . - CDS 1146 - 2216 1107 ## COG0722 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase - Prom 2327 - 2386 7.2 + Prom 2324 - 2383 3.0 3 2 Tu 1 . + CDS 2429 - 2791 317 ## B21_02455 hypothetical protein + Prom 2830 - 2889 4.1 4 3 Op 1 . 
+ CDS 2941 - 3459 166 ## ECH74115_3842 hypothetical protein 5 3 Op 2 6/0.333 + CDS 3452 - 4675 780 ## COG2199 FOG: GGDEF domain 6 4 Tu 1 . + CDS 4790 - 5173 270 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins - Term 5178 - 5224 9.6 7 5 Op 1 33/0.000 - CDS 5249 - 5596 573 ## PROTEIN SUPPORTED gi|15803128|ref|NP_289159.1| 50S ribosomal protein L19 8 5 Op 2 30/0.000 - CDS 5638 - 6405 647 ## COG0336 tRNA-(guanine-N1)-methyltransferase 9 5 Op 3 12/0.000 - CDS 6436 - 6984 189 ## PROTEIN SUPPORTED gi|163796730|ref|ZP_02190688.1| 50S ribosomal protein L19 10 5 Op 4 23/0.000 - CDS 7003 - 7311 524 ## PROTEIN SUPPORTED gi|110806723|ref|YP_690243.1| 30S ribosomal protein S16 11 5 Op 5 . - CDS 7500 - 8861 1674 ## COG0541 Signal recognition particle GTPase - Prom 8918 - 8977 5.2 + Prom 8938 - 8997 3.9 12 6 Op 1 4/0.333 + CDS 9028 - 9819 738 ## COG4137 ABC-type uncharacterized transport system, permease component 13 6 Op 2 . + CDS 9885 - 11126 1208 ## COG4536 Putative Mg2+ and Co2+ transporter CorB - Term 11119 - 11158 5.1 14 7 Op 1 . - CDS 11181 - 11774 901 ## COG0576 Molecular chaperone GrpE (heat shock protein) 15 7 Op 2 . - CDS 11771 - 11965 192 ## UTI89_C2948 hypothetical protein - Prom 12149 - 12208 2.6 + Prom 11716 - 11775 4.2 16 8 Op 1 17/0.000 + CDS 11897 - 12775 772 ## COG0061 Predicted sugar kinase + Prom 12777 - 12836 4.7 17 8 Op 2 5/0.333 + CDS 12861 - 14522 1507 ## COG0497 ATPase involved in DNA repair + Prom 14559 - 14618 3.2 18 9 Tu 1 . 
+ CDS 14671 - 15012 339 ## COG2913 Small protein A (tmRNA-binding) Predicted protein(s) >gi|296494690|gb|ADTN01000048.1| GENE 1 14 - 1135 1236 373 aa, chain - ## HITS:1 COG:ECs3463_2 KEGG:ns NR:ns ## COG: ECs3463_2 COG0287 # Protein_GI_number: 15832717 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydrogenase # Organism: Escherichia coli O157:H7 # 100 373 1 274 274 555 99.0 1e-158 MVAELTALRDQIDEVDKALLNLLAKRLELVAEVGEVKSRFGLPIYVPEREASMLASRRAE AEALGVPPDLIEDVLRRVMRESYSSENDKGFKTLCPSLRPVVIVGGGGQMGRLFEKMLTL SGYQVRILEQHDWDRAADIVADAGMVIVSVPIHVTEQVIGKLPPLPKDCILVDLASVKNG PLQAMLVAHDGPVLGLHPMFGPDSGSLAKQVVVWCDGRKPEAYQWFLEQIQVWGARLHRI SAVEHDQNMAFIQALRHFATFAYGLHLAEENVQLEQLLALSSPIYRLELAMVGRLFAQDP QLYADIIMSSERNLALIKRYYKRFGEAIELLEQGDKQAFIDSFRKVEHWFGDYAQRFQSE SRVLLRQANDNRQ >gi|296494690|gb|ADTN01000048.1| GENE 2 1146 - 2216 1107 356 aa, chain - ## HITS:1 COG:aroF KEGG:ns NR:ns ## COG: aroF COG0722 # Protein_GI_number: 16130522 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Escherichia coli K12 # 1 356 1 356 356 709 100.0 0 MQKDALNNVHITDEQVLMTPEQLKAAFPLSLQQEAQIADSRKSISDIIAGRDPRLLVVCG PCSIHDPETALEYARRFKALAAEVSDSLYLVMRVYFEKPRTTVGWKGLINDPHMDGSFDV EAGLQIARKLLLELVNMGLPLATEALDPNSPQYLGDLFSWSAIGARTTESQTHREMASGL SMPVGFKNGTDGSLATAINAMRAAAQPHRFVGINQAGQVALLQTQGNPDGHVILRGGKAP NYSPADVAQCEKEMEQAGLRPSLMVDCSHGNSNKDYRRQPAVAESVVAQIKDGNRSIIGL MIESNIHEGNQSSEQPRSEMKYGVSVTDACISWEMTDALLREIHQDLNGQLTARVA >gi|296494690|gb|ADTN01000048.1| GENE 3 2429 - 2791 317 120 aa, chain + ## HITS:1 COG:no KEGG:B21_02455 NR:ns ## KEGG: B21_02455 # Name: yfiL # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 120 2 121 121 238 100.0 5e-62 MKKFIAPLLALLVSGCQIDPYTHAPTLTSTDWYDVGMEDAISGSAIKDDDAFSDSQADRG LYLKGYAEGQKKTCQTDFTYARGLSGKSFPASCNNVENASQLHEVWQKGADENASTIRLN >gi|296494690|gb|ADTN01000048.1| GENE 4 2941 - 3459 166 172 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_3842 NR:ns ## KEGG: 
ECH74115_3842 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 172 1 172 172 329 100.0 2e-89 MRFSHRLFLLLILLLTGAPILAQEPSDVAKNVRMMVSGIVSYTRWPALSGPPKLCIFSSS RFSTALQENAATSLPYLPVIIHTQQEAMISGCNGFYFGNESPTFQMELTEQYPSKALLLI AEQNTECIIGSAFCLIIHNNDVRFAVNLDALSRSGVKVNPDVLMLARKKNDG >gi|296494690|gb|ADTN01000048.1| GENE 5 3452 - 4675 780 407 aa, chain + ## HITS:1 COG:yfiN_2 KEGG:ns NR:ns ## COG: yfiN_2 COG2199 # Protein_GI_number: 16130525 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 240 407 1 168 168 333 100.0 3e-91 MDNDNSLNKRPTFKRALRNISMTSIFITMMLIWLLLSVTSVLTLKQYAQKNLALTAATMT YSLEAAVVFADGPAATETLAALGQQGQFSTAEVRDKQQNILASWHYTRKDPGDTFSNFIS HWLFPAPIIQPIRHNGETIGEVRLTARDSSISHFIWFSLAVLTGCILLASGIAITLTRHL HNGLVEALKNITDVVHDVRSNRNFSRRVSEERIAEFHRFALDFNSLLDEMEEWQLRLQAK NAQLLRTALHDPLTGLANRAAFRSGINTLMNNSDARKTSALLFLDGDNFKYINDTWGHAT GDRVLIEIAKRLAEFGGLRHKAYRLGGDEFAMVLYDVQSESEVQQICSALTQIFNLPFDL HNGHQTTMTLSIGYAMTIEHASAEKLQELADHNMYQAKHQRAEKLVR >gi|296494690|gb|ADTN01000048.1| GENE 6 4790 - 5173 270 127 aa, chain + ## HITS:1 COG:yfiB KEGG:ns NR:ns ## COG: yfiB COG2885 # Protein_GI_number: 16130526 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Escherichia coli K12 # 1 127 34 160 160 244 100.0 3e-65 MQSYGFTESAGDWSLGLSDAILFAKNDYKLLPESQQQIQTMAAKLASTGLTHARMDGHTD NYGEDSYNEGLSLKRANVVADAWAMGGQIPRSNLTTQGLGKKYPIASNKTAQGRAENRRV AVVITTP >gi|296494690|gb|ADTN01000048.1| GENE 7 5249 - 5596 573 115 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803128|ref|NP_289159.1| 50S ribosomal protein L19 [Escherichia coli O157:H7 EDL933] # 1 115 1 115 115 225 100 2e-58 MSNIIKQLEQEQMKQDVPSFRPGDTVEVKVWVVEGSKKRLQAFEGVVIAIRNRGLHSAFT VRKISNGEGVERVFQTHSPVVDSISVKRRGAVRKAKLYYLRERTGKAARIKERLN >gi|296494690|gb|ADTN01000048.1| GENE 8 5638 - 6405 647 255 aa, chain - ## HITS:1 COG:ECs3470 KEGG:ns NR:ns ## COG: 
ECs3470 COG0336 # Protein_GI_number: 15832724 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-(guanine-N1)-methyltransferase # Organism: Escherichia coli O157:H7 # 1 255 1 255 255 488 100.0 1e-138 MWIGIISLFPEMFRAITDYGVTGRAVKNGLLSIQSWSPRDFTHDRHRTVDDRPYGGGPGM LMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVCGRYEGIDE RVIQTEIDEEWSIGDYVLSGGELPAMTLIDSVSRFIPGVLGHEASATEDSFAEGLLDCPH YTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWLRRPELLENLALTEEQARLLAEF KTEHAQQQHKHDGMA >gi|296494690|gb|ADTN01000048.1| GENE 9 6436 - 6984 189 182 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163796730|ref|ZP_02190688.1| 50S ribosomal protein L19 [alpha proteobacterium BAL199] # 11 177 2 160 179 77 29 5e-14 MSKQLTAQAPVDPIVLGKMGSSYGIRGWLRVFSSTEDAESIFDYQPWFIQKAGQWQQVQL ESWKHHNQDMIIKLKGVDDRDAANLLTNCEIVVDSSQLPQLEEGDYYWKDLMGCQVVTTE GYDLGKVVDMMETGSNDVLVIKANLKDAFGIKERLVPFLDGQVIKKVDLTTRSIEVDWDP GF >gi|296494690|gb|ADTN01000048.1| GENE 10 7003 - 7311 524 102 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|110806723|ref|YP_690243.1| 30S ribosomal protein S16 [Shigella flexneri 5 str. 
8401] # 1 102 1 102 102 206 100 8e-53 MTPDSVPRWGPVVLFTQEDVMVTIRLARHGAKKRPFYQVVVADSRNARNGRFIERVGFFN PIASEKEEGTRLDLDRIAHWVGQGATISDRVAALIKEVNKAA >gi|296494690|gb|ADTN01000048.1| GENE 11 7500 - 8861 1674 453 aa, chain - ## HITS:1 COG:ECs3473 KEGG:ns NR:ns ## COG: ECs3473 COG0541 # Protein_GI_number: 15832727 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal recognition particle GTPase # Organism: Escherichia coli O157:H7 # 1 453 1 453 453 803 100.0 0 MFDNLTDRLSRTLRNISGRGRLTEDNVKDTLREVRMALLEADVALPVVREFINRVKEKAV GHEVNKSLTPGQEFVKIVRNELVAAMGEENQTLNLAAQPPAVVLMAGLQGAGKTTSVGKL GKFLREKHKKKVLVVSADVYRPAAIKQLETLAEQVGVDFFPSDVGQKPVDIVNAALKEAK LKFYDVLLVDTAGRLHVDEAMMDEIKQVHASINPVETLFVVDAMTGQDAANTAKAFNEAL PLTGVVLTKVDGDARGGAALSIRHITGKPIKFLGVGEKTEALEPFHPDRIASRILGMGDV LSLIEDIESKVDRAQAEKLASKLKKGDGFDLNDFLEQLRQMKNMGGMASLMGKLPGMGQI PDNVKSQMDDKVLVRMEAIINSMTMKERAKPEIIKGSRKRRIAAGCGMQVQDVNRLLKQF DDMQRMMKKMKKGGMAKMMRSMKGMMPPGFPGR >gi|296494690|gb|ADTN01000048.1| GENE 12 9028 - 9819 738 263 aa, chain + ## HITS:1 COG:ECs3474 KEGG:ns NR:ns ## COG: ECs3474 COG4137 # Protein_GI_number: 15832728 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Escherichia coli O157:H7 # 1 263 26 288 288 446 100.0 1e-125 MPVFALLALVAYSVSLALIVPGLLQKNGGWRRMAIISAVIALVCHAIALEARILPDGDSG QNLSLLNVGSLVSLMICTVMTIVASRNRGWLLLPIVYAFALINLALATFMPNEYITHLEA TPGMLVHIGLSLFSYATLIIAALYALQLAWIDYQLKNKKLAFNQEMPPLMSIERKMFHIT QIGVVLLTLTLCTGLFYMHNLFSMENIDKAVLSIVAWFVYIVLLWGHYHEGWRGRRVVWF NVAGAVILTLAYFGSRIVQQLIS >gi|296494690|gb|ADTN01000048.1| GENE 13 9885 - 11126 1208 413 aa, chain + ## HITS:1 COG:yfjDm KEGG:ns NR:ns ## COG: yfjDm COG4536 # Protein_GI_number: 16132248 # Func_class: P Inorganic ion transport and metabolism # Function: Putative Mg2+ and Co2+ transporter CorB # Organism: Escherichia coli K12 # 5 413 12 420 420 771 100.0 0 MVVISAYFSGSETGMMTLNRYRLRHMAKQGNRSAKRVEKLLRKPDRLISLVLIGNNLVNI 
LASALGTIVGMRLYGDAGVAIATGVLTFVVLVFAEVLPKTIAALYPEKVAYPSSFLLAPL QILMMPLVWLLNAITRMLMRMMGIKTDIVVSGSLSKEELRTIVHESRSQISRRNQDMLLS VLDLEKMTVDDIMVPRSEIIGIDINDDWKSILRQLSHSPHGRIVLYRDSLDDAISMLRVR EAWRLMSEKKEFTKETMLRAADEIYFVPEGTPLSTQLVKFQRNKKKVGLVVNEYGDIQGL VTVEDILEEIVGDFTTSMSPTLAEEVTPQNDGSVIIDGTANVREINKAFNWHLPEDDART VNGVILEALEEIPVAGTRVRIGEYDIDILDVQDNMIKQVKVFPVKPLRESVAE >gi|296494690|gb|ADTN01000048.1| GENE 14 11181 - 11774 901 197 aa, chain - ## HITS:1 COG:ECs3476 KEGG:ns NR:ns ## COG: ECs3476 COG0576 # Protein_GI_number: 15832730 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone GrpE (heat shock protein) # Organism: Escherichia coli O157:H7 # 1 197 1 197 197 315 100.0 3e-86 MSSKEQKTPEGQAPEEIIMDQHEEIEAVEPEASAEQVDPRDEKIANLEAQLAEAQTRERD GILRVKAEMENLRRRTELDIEKAHKFALEKFINELLPVIDSLDRALEVADKANPDMSAMV EGIELTLKSMLDVVRKFGVEVIAETNVPLDPNVHQAIAMVESDDVAPGNVLGIMQKGYTL NGRTIRAAMVTVAKAKA >gi|296494690|gb|ADTN01000048.1| GENE 15 11771 - 11965 192 64 aa, chain - ## HITS:1 COG:no KEGG:UTI89_C2948 NR:ns ## KEGG: UTI89_C2948 # Name: yfjC # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 64 20 83 83 118 100.0 7e-26 MCCQCSGVPWVSHNANTLEMIIHFSEVLVAKIDDNVSASLETLKLIPIISEVSEMNAKKT RRNS >gi|296494690|gb|ADTN01000048.1| GENE 16 11897 - 12775 772 292 aa, chain + ## HITS:1 COG:yfjB KEGG:ns NR:ns ## COG: yfjB COG0061 # Protein_GI_number: 16130534 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar kinase # Organism: Escherichia coli K12 # 1 292 1 292 292 611 100.0 1e-175 MNNHFKCIGIVGHPRHPTALTTHEMLYRWLCTKGYEVIVEQQIAHELQLKNVKTGTLAEI GQLADLAVVVGGDGNMLGAARTLARYDIKVIGINRGNLGFLTDLDPDNAQQQLADVLEGH YISEKRFLLEAQVCQQDCQKRISTAINEVVLHPGKVAHMIEFEVYIDEIFAFSQRSDGLI ISTPTGSTAYSLSAGGPILTPSLDAITLVPMFPHTLSARPLVINSSSTIRLRFSHRRNDL EISCDSQIALPIQEGEDVLIRRCDYHLNLIHPKDYSYFNTLSTKLGWSKKLF >gi|296494690|gb|ADTN01000048.1| GENE 17 12861 - 14522 1507 553 aa, chain + ## HITS:1 COG:recN KEGG:ns NR:ns ## COG: recN COG0497 # 
Protein_GI_number: 16130535 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Escherichia coli K12 # 1 553 1 553 553 998 99.0 0 MLAQLTISNFAIVRELEIDFHSGMTVITGETGAGKSIAIDALGLCLGGRAEADMVRTGAA RADLCARFSLKDTPAALRWLEENQLEDGHECLLRRVISSDGRSRGFINGTAVPLSQLREL GQLLIQIHGQHAHQLLTKPEHQKFLLDGYANETSLLQEMTARYQLWHQSCRDLAHHQQLS QEHAARAELLQYQLKELNEFNPQPGEFEQIDEEYKRLANSGQLLTTSQNALALMADGEDA NLQSQLYTAKQLVSELIGMDSKLSGVLDMLEEATIQIAEASDELRHYCDRLDLDPNRLFE LEQRISKQISLARKHHVSPEALPQYYQSLLEEQQQLDDQADSQETLALAVTKHHQQALEI ARALHQQRQQYAEELAQLITDSMHALSMPHGQFTIDVKFDEHHLGADGADRIEFRVTTNP GQPMQPIAKVASGGELSRIALAIQVITARKMETPALIFDEVDVGISGPTAAVVGKLLRQL GESTQVMCVTHLPQVAGCGHQHYFVSKETDGAMTETHMQSLNKKARLQELARLLGGSEVT RNTLANAKELLAA >gi|296494690|gb|ADTN01000048.1| GENE 18 14671 - 15012 339 113 aa, chain + ## HITS:1 COG:STM2685 KEGG:ns NR:ns ## COG: STM2685 COG2913 # Protein_GI_number: 16766000 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Small protein A (tmRNA-binding) # Organism: Salmonella typhimurium LT2 # 1 111 1 111 112 198 92.0 2e-51 MRCKTLTAAAAVLLMLTAGCSTLERVVYRPDINQGNYLTANDVSKIRVGMTQQQVAYALG TPLMSDPFGTNTWFYVFRQQPGHEGVTQQTLTLTFNSSGVLTNIDNKPALSGN Prediction of potential genes in microbial genomes Time: Sun May 15 23:14:43 2011 Seq name: gi|296494689|gb|ADTN01000049.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont110.4, whole genome shotgun sequence Length of sequence - 7300 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 6, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 9/1.000 - CDS 15 - 305 295 ## COG2914 Uncharacterized protein conserved in bacteria 2 1 Op 2 . - CDS 295 - 744 383 ## COG2867 Oligoketide cyclase/lipid transport protein - Prom 871 - 930 2.3 + Prom 719 - 778 5.3 3 2 Tu 1 . + CDS 903 - 1385 456 ## COG0691 tmRNA-binding protein + Prom 2081 - 2140 4.5 4 3 Tu 1 . 
+ CDS 2166 - 3407 431 ## PROTEIN SUPPORTED gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 + Term 3413 - 3476 4.5 + Prom 3788 - 3847 5.8 5 4 Tu 1 . + CDS 3940 - 4752 235 ## RPB_1067 hypothetical protein - Term 4778 - 4825 -0.3 6 5 Tu 1 . - CDS 4906 - 6033 323 ## PSPA7_0138 hypothetical protein - Prom 6058 - 6117 5.4 7 6 Tu 1 . - CDS 6702 - 7223 -127 ## PSPTO_0581 DNA circulation protein, putative Predicted protein(s) >gi|296494689|gb|ADTN01000049.1| GENE 1 15 - 305 295 96 aa, chain - ## HITS:1 COG:yfjF KEGG:ns NR:ns ## COG: yfjF COG2914 # Protein_GI_number: 16130537 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 96 7 102 102 163 98.0 7e-41 MPGKIAVEVAYALPEKQYLQRVTLQEGATVEEAIRASGLLELRTDIDLTKNKVGIYSRPA KLSDSVHDGDRVEIYRPLIADPKELRRQRAEKSANK >gi|296494689|gb|ADTN01000049.1| GENE 2 295 - 744 383 149 aa, chain - ## HITS:1 COG:yfjG KEGG:ns NR:ns ## COG: yfjG COG2867 # Protein_GI_number: 16130538 # Func_class: I Lipid transport and metabolism # Function: Oligoketide cyclase/lipid transport protein # Organism: Escherichia coli K12 # 1 149 10 158 158 295 100.0 2e-80 MEIVMPQISRTALVPYSAEQMYQLVNDVQSYPQFLPGCTGSRILESTPGQMTAAVDVSKA GISKTFTTRNQLTSNQSILMNLVDGPFKKLIGGWKFTPLSQEACRIEFHLDFEFTNKLIE LAFGRVFKELAANMVQAFTVRAKEVYSAR >gi|296494689|gb|ADTN01000049.1| GENE 3 903 - 1385 456 160 aa, chain + ## HITS:1 COG:ECs3482 KEGG:ns NR:ns ## COG: ECs3482 COG0691 # Protein_GI_number: 15832736 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: tmRNA-binding protein # Organism: Escherichia coli O157:H7 # 1 160 1 160 160 308 100.0 3e-84 MTKKKAHKPGSATIALNKRARHEYFIEEEFEAGLALQGWEVKSLRAGKANISDSYVLLRD GEAFLFGANITPMAVASTHVVCDPTRTRKLLLNQRELDSLYGRVNREGYTVVALSLYWKN AWCKVKIGVAKGKKQHDKRSDIKEREWQVDKARIMKNAHR >gi|296494689|gb|ADTN01000049.1| GENE 4 2166 - 3407 431 413 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 [Campylobacter 
concisus 13826] # 10 393 7 400 406 170 30 3e-42 MARKTKPLTDTEIKAAKPKDADYQLYDGDGLTLLIKSSGSKLWQFRYYRPLTKQRTKQSF GAYPAVSLSDARKLRAESRVLLAKDIDPQEHQKEQVRNSQEAKTNTFLLVAERWWNVKKA SVTEDYADDIWRSLERDVFPAIGDISVTEIKAHTLVKAVQPVQARGALETVRRLCQRINE VMIYAQNTGLIDAVPSVNIGKAFEKPQKKNMPSIRPDQLPQLMQTMRTANISQSTRCLFM WQLLTITRPAEAAEARWDEIDFEAKEWKIPAARMKMNRDHTVPLSDGALAILAMMKSLSG GREFIFPSRIKPTQPMNSQTVNAALKRAGLGGVLVSHGLRSIASTALNEQGFPPDVIEAA LAHVDKNEVRRAYNRSDYLEQRRPMMQWWANLIKAADSIPTVCNEVAGLRLVG >gi|296494689|gb|ADTN01000049.1| GENE 5 3940 - 4752 235 270 aa, chain + ## HITS:1 COG:no KEGG:RPB_1067 NR:ns ## KEGG: RPB_1067 # Name: not_defined # Def: hypothetical protein # Organism: R.palustris_HaA2 # Pathway: not_defined # 1 263 84 360 370 175 35.0 1e-42 MKQNDIETRMATAKIIDHLSFGVTLIASHERVKQELCNATYSILGAKDSIPIDQLVWTKL SYIFGDYHPYDTSFDAAEELIIQKSFFDHMWDISLVEMMNHINYESWEQFDWQKTAEMLN LANKEHTNELRSYQHAYRIEFDGVLSLFNEQLIQIFKEAFKAGYNNDEINNKKKSKNEKL KQFAQLVRTLHIGASCHAAVRWDQKRQLNGNDLLDFHHAEAALGYCDLFLTEKPLKVQVS QEHLGLRELFSCSVESSASEGLKILNMCKI >gi|296494689|gb|ADTN01000049.1| GENE 6 4906 - 6033 323 375 aa, chain - ## HITS:1 COG:no KEGG:PSPA7_0138 NR:ns ## KEGG: PSPA7_0138 # Name: not_defined # Def: hypothetical protein # Organism: P.aeruginosa_PA7 # Pathway: not_defined # 1 375 195 570 570 443 56.0 1e-123 MALSKTCLVLKFDFTRVRWGNFAGWGNINRYTKDEGSLFYHGGINGEASYCNGVMIIRPQ VTVRGLVKAWKDEGNPSKRKYATFKIYDRKNSCNVETSCGPDSLSNYFQENKLPWEVSPA FFRAEVLHRFKADPEKYTLDERSIACRNAWYLKSYDINEAGQVHVYIGDLAHLPYEEQLY WQAFNEWPKGTISKRAFQTDIVGEWYLSNEPLLALKHKISLMDKNPPTWWKPRGTVLSDA VQYPATDSIKEWGDEILALDQYLVEGFLPKPLSEIAKYEGRIIEPKWGSLRILQEALIAK SATVDEAKMLVQPMQKLHALRTEVRGHASTEKKKKAILEARTDHENLRAHFTSLVGQCEA SFNDILRIFDFKIDD >gi|296494689|gb|ADTN01000049.1| GENE 7 6702 - 7223 -127 173 aa, chain - ## HITS:1 COG:no KEGG:PSPTO_0581 NR:ns ## KEGG: PSPTO_0581 # Name: not_defined # Def: DNA circulation protein, putative # Organism: P.syringae # Pathway: not_defined # 1 173 436 607 607 188 49.0 5e-47 MTSHIESFWTREQMLSSQAPDPLSTTTLITPPTFEQLVNKTCMAWPLAGTLDFGRIGYLQ 
HDLYPSRLFLPTMPGIENSLEVIPHDGRLQVKLKDLVVAELYYWNAGWGPARLGQIQGNC GTALISRGTTYRERNITNTGSLRSFYSWQVRRFHRNGSYGRFSEKITRGVVFV Prediction of potential genes in microbial genomes Time: Sun May 15 23:14:59 2011 Seq name: gi|296494688|gb|ADTN01000050.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont117.1, whole genome shotgun sequence Length of sequence - 13022 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 5, operones - 2 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 69 - 128 4.0 1 1 Tu 1 . + CDS 151 - 384 205 ## COG1942 Uncharacterized protein, 4-oxalocrotonate tautomerase homolog - Term 192 - 218 -1.0 2 2 Tu 1 . - CDS 381 - 842 440 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family + Prom 950 - 1009 3.3 3 3 Tu 1 . + CDS 1123 - 1968 655 ## COG2162 Arylamine N-acetyltransferase 4 4 Op 1 4/0.333 - CDS 2064 - 2957 835 ## COG0384 Predicted epimerase, PhzC/PhzF homolog 5 4 Op 2 12/0.000 - CDS 3036 - 3716 500 ## COG2181 Nitrate reductase gamma subunit 6 4 Op 3 12/0.000 - CDS 3713 - 4408 881 ## COG2180 Nitrate reductase delta subunit 7 4 Op 4 13/0.000 - CDS 4408 - 5952 1507 ## COG1140 Nitrate reductase beta subunit 8 4 Op 5 10/0.000 - CDS 5949 - 9689 3973 ## COG5013 Nitrate reductase alpha subunit 9 4 Op 6 2/1.000 - CDS 9771 - 11159 957 ## COG2223 Nitrate/nitrite transporter 10 5 Op 1 2/1.000 - CDS 11483 - 11818 62 ## COG4886 Leucine-rich repeat (LRR) protein 11 5 Op 2 2/1.000 - CDS 11857 - 12777 101 ## COG4886 Leucine-rich repeat (LRR) protein 12 5 Op 3 . 
- CDS 12837 - 13022 172 ## COG3203 Outer membrane protein (porin) Predicted protein(s) >gi|296494688|gb|ADTN01000050.1| GENE 1 151 - 384 205 77 aa, chain + ## HITS:1 COG:ZydcE KEGG:ns NR:ns ## COG: ZydcE COG1942 # Protein_GI_number: 15801678 # Func_class: R General function prediction only # Function: Uncharacterized protein, 4-oxalocrotonate tautomerase homolog # Organism: Escherichia coli O157:H7 EDL933 # 1 75 14 88 88 134 100.0 3e-32 MPHIDIKCFPRELDEQQKAALAADITDVIIRHLNSKDSSISIALQQIQPESWQAIWDAEI APQMEALIKKPGYSMNA >gi|296494688|gb|ADTN01000050.1| GENE 2 381 - 842 440 153 aa, chain - ## HITS:1 COG:yddH KEGG:ns NR:ns ## COG: yddH COG1853 # Protein_GI_number: 16129421 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Escherichia coli K12 # 1 153 53 205 205 319 100.0 2e-87 MAAAWSMPVEFEPPRVAIVVDKSTWTRELIEHNGKFGIVIPGVAATNWTWAVGSVSGRDE DKFNCYGIPVVRGPVFGLPLVEEKCLAWMECRLLPATSAQEEYDTLFGEVVSAAADARVF VEGRWQFDDDKLNTLHHLGAGTFVTSGKRVTAG >gi|296494688|gb|ADTN01000050.1| GENE 3 1123 - 1968 655 281 aa, chain + ## HITS:1 COG:nhoA KEGG:ns NR:ns ## COG: nhoA COG2162 # Protein_GI_number: 16129422 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Arylamine N-acetyltransferase # Organism: Escherichia coli K12 # 1 281 1 281 281 566 100.0 1e-161 MTPILNHYFARINWSGAAAVNIDTLRALHLKHNCTIPFENLDVLLPREIQLDNQSPEEKL VIARRGGYCFEQNGVFERVLRELGFNVRSLLGRVVLSNPPALPPRTHRLLLVELEEEKWI ADVGFGGQTLTAPIRLVSDLVQTTPHGEYRLLQEGDDWVLQFNHHQHWQSMYRFDLCEQQ QSDYVMGNFWSAHWPQSHFRHHLLMCRHLPDGGKLTLTNFHFTHYENGHAVEQRNLPDVA SLYAVMQEQFGLGVDDAKHGFTVDELALVMAAFDTHPEAGK >gi|296494688|gb|ADTN01000050.1| GENE 4 2064 - 2957 835 297 aa, chain - ## HITS:1 COG:yddE KEGG:ns NR:ns ## COG: yddE COG0384 # Protein_GI_number: 16129423 # Func_class: R General function prediction only # Function: Predicted epimerase, PhzC/PhzF homolog # Organism: Escherichia coli K12 # 1 297 1 297 297 607 100.0 
1e-174 MKPQVYHVDAFTSQPFRGNSAGVVFPADNLSEAQMQLIARELGHSETAFLLHSDDSDVRI RYFTPTVEVPICGHATVAAHYVRAKVLGLGNCTIWQTSLAGKHRVTIEKHNDDYRISLEQ GTPGFEPPLEGETRAAIINALHLTEDDILPGLPIQVATTGHSKVMIPLKPEVDIDALSPD LNALTAISKKIGCNGFFPFQIRPGKNETDGRMFSPAIGIVEDPVTGNANGPMGAWLVHHN VLPHDGNVLRVKGHQGRALGRDGMIEVTVTIRDNQPEKVTISGTAVILFHAEWAIEL >gi|296494688|gb|ADTN01000050.1| GENE 5 3036 - 3716 500 226 aa, chain - ## HITS:1 COG:ECs2068 KEGG:ns NR:ns ## COG: ECs2068 COG2181 # Protein_GI_number: 15831322 # Func_class: C Energy production and conversion # Function: Nitrate reductase gamma subunit # Organism: Escherichia coli O157:H7 # 1 226 1 226 226 420 100.0 1e-117 MIQYLNVFFYDIYPYICATVFFLGSWLRYDYGQYTWRASSSQMLDKRGMVIWSNLFHIGI LGIFFGHLFGMLTPHWMYAWFLPVAAKQLMAMVLGGICGVLTLIGGAGLLWRRLTNQRVR ATSTTPDIIIMSILLIQCLLGLSTIPFSAQYPDGSEMMKLVGWAQSIVTFRGGSSEMLNG VAFVFRLHLVLGMTIFLLFPFTRLVHVWSAPFEYFTRRYQIVRSRR >gi|296494688|gb|ADTN01000050.1| GENE 6 3713 - 4408 881 231 aa, chain - ## HITS:1 COG:narW KEGG:ns NR:ns ## COG: narW COG2180 # Protein_GI_number: 16129425 # Func_class: C Energy production and conversion # Function: Nitrate reductase delta subunit # Organism: Escherichia coli K12 # 1 231 1 231 231 443 100.0 1e-124 MQILKVIGLLMEYPDELLWECKEDALALIRRDAPMLTDFTHNLLNAPLLDKQAEWCEVFD RGRTTSLLLFEHVHAESRDRGQAMVDLLAEYEKVGLQLDCRELPDYLPLYLEYLSVLPDD QAKEGLLNVAPILALLGGRLKQREAPWYALFDALLQLAGSSLSSDSVTKQVNSEERDDTR QALDAVWEEEQVKFIEDNATACDSSPLNQYQRRFSQDVAPQYVDISAGGGK >gi|296494688|gb|ADTN01000050.1| GENE 7 4408 - 5952 1507 514 aa, chain - ## HITS:1 COG:narY KEGG:ns NR:ns ## COG: narY COG1140 # Protein_GI_number: 16129426 # Func_class: C Energy production and conversion # Function: Nitrate reductase beta subunit # Organism: Escherichia coli K12 # 1 514 1 514 514 1097 99.0 0 MKIRSQVGMVLNLDKCIGCHTCSVTCKNVWTGREGMEYAWFNNVETKPGIGYPKNWEDQE EWQGGWVRDVNGKIRPRLGNKMGVITKIFANPVVPQIDDYYEPFTFDYEHLHSAPEGKHI PTARPRSLIDGKRMDKVIWGPNWEELLGGEFEKRARDRNFEAMQKEMYGQFENTFMMYLP RLCEHCLNPSCVATCPSGAIYKREEDGIVLIDQDKCRGWRLCISGCPYKKIYFNWKSGKS 
EKCIFCYPRIESGQPTVCSETCVGRIRYLGVLLYDADRIEEAASTEREVDLYERQCEVFL DPHDPSVIEEALKQGIPQNVIDAAQRSPVYKMAMDWKLALPLHPEYRPLPMVWYVPPLSP IQSYADAGGLPKSEGVLPAIESLRIPVQYLANMLSAGDTGPVLRALKRMMAMRHYMRSQT VEGVTDTRAIDEVGLSVAQVEEMYRYLAIANYEDRFVIPTSHREMAGDAFAERNGCGFTF GDGCHGSDSKFNLFNSSRIDAINITEVRDKAEGE >gi|296494688|gb|ADTN01000050.1| GENE 8 5949 - 9689 3973 1246 aa, chain - ## HITS:1 COG:narZ KEGG:ns NR:ns ## COG: narZ COG5013 # Protein_GI_number: 16129427 # Func_class: C Energy production and conversion # Function: Nitrate reductase alpha subunit # Organism: Escherichia coli K12 # 1 1246 1 1246 1246 2613 100.0 0 MSKLLDRFRYFKQKGETFADGHGQVMHSNRDWEDSYRQRWQFDKIVRSTHGVNCTGSCSW KIYVKNGLVTWEIQQTDYPRTRPDLPNHEPRGCPRGASYSWYLYSANRLKYPLIRKRLIE LWREALKQHSDPVLAWASIMNDPQKCLSYKQVRGRGGFIRSNWQELNQLIAAANVWTIKT YGPDRVAGFSPIPAMSMVSYAAGTRYLSLLGGTCLSFYDWYCDLPPASPMTWGEQTDVPE SADWYNSSYIIAWGSNVPQTRTPDAHFFTEVRYKGTKTIAITPDYSEVAKLCDQWLAPKQ GTDSALAMAMGHVILKEFHLDNPSDYFINYCRRYSDMPMLVMLEPRDDGSYVPGRMIRAS DLVDGLGESNNPQWKTVAVNTAGELVVPNGSIGFRWGEKGKWNLESIAAGTETELSLTLL GQHDAVAGVAFPYFGGIENPHFRSVKHNPVLVRQLPVKNLTLVDGNTCPVVSVYDLVLAN YGLDRGLEDENSAKDYAEIKPYTPAWGEQITGVPRQYIETIAREFADTAHKTHGRSMIIL GAGVNHWYHMDMNYRGMINMLIFCGCVGQSGGGWAHYVGQEKLRPQTGWLPLAFALDWNR PPRQMNSTSFFYNHSSQWRYEKVSAQELLSPLADASKYSGHLIDFNVRAERMGWLPSAPQ LGRNPLGIKAEADKAGLSPTEFTAQALKSGDLRMACEQPDSSSNHPRNLFVWRSNLLGSS GKGHEYMQKYLLGTESGIQGEELGASDGIKPEEVEWQTAAIEGKLDLLVTLDFRMSSTCL FSDIVLPTATWYEKDDMNTSDMHPFIHPLSAAVDPAWESRSDWEIYKGIAKAFSQVCVGH LGKETDVVLQPLLHDSPAELSQPCEVLDWRKGECDLIPGKTAPNIVAVERDYPATYERFT SLGPLMDKLGNGGKGISWNTQDEIDFLGKLNYTKRDGPAQGRPLIDTAIDASEVILALAP ETNGHVAVKAWQALGEITGREHTHLALHKEDEKIRFRDIQAQPRKIISSPTWSGLESDHV SYNAGYTNVHELIPWRTLSGRQQLYQDHPWMRAFGESLVAYRPPIDTRSVSEMRQIPPNG FPEKALNFLTPHQKWGIHSTYSENLLMLTLSRGGPIVWISETDARELTIVDNDWVEVFNA NGALTARAVVSQRVPPGMTMMYHAQERIMNIPGSEVTGMRGGIHNSVTRVCPKPTHMIGG YAQLAWGFNYYGTVGSNRDEFIMIRKMKNVNWLDDEGRDQVQEAKK >gi|296494688|gb|ADTN01000050.1| GENE 9 9771 - 11159 957 462 aa, chain - ## HITS:1 COG:narU KEGG:ns NR:ns ## COG: narU COG2223 # 
Protein_GI_number: 16129428 # Func_class: P Inorganic ion transport and metabolism # Function: Nitrate/nitrite transporter # Organism: Escherichia coli K12 # 1 462 1 462 462 781 100.0 0 MALQNEKNSRYLLRDWKPENPAFWENKGKHIARRNLWISVSCLLLAFCVWMLFSAVTVNL NKIGFNFTTDQLFLLTALPSVSGALLRVPYSFMVPIFGGRRWTVFSTAILIIPCVWLGIA VQNPNTPFGIFIVIALLCGFAGANFASSMGNISFFFPKAKQGSALGINGGLGNLGVSVMQ LVAPLVIFVPVFAFLGVNGVPQADGSVMSLANAAWIWVPLLAIATIAAWSGMNDIASSRA SIADQLPVLQRLHLWLLSLLYLATFGSFIGFSAGFAMLAKTQFPDVNILRLAFFGPFIGA IARSVGGAISDKFGGVRVTLINFIFMAIFSALLFLTLPGTGSGNFIAFYAVFMGLFLTAG LGSGSTFQMIAVIFRQITIYRVKMKGGSDEQAHKEAVTETAAALGFISAIGAVGGFFIPQ AFGMSLNMTGSPVGAMKVFLIFYIVCVLLTWLVYGRRKFSQK >gi|296494688|gb|ADTN01000050.1| GENE 10 11483 - 11818 62 111 aa, chain - ## HITS:1 COG:ECs2073 KEGG:ns NR:ns ## COG: ECs2073 COG4886 # Protein_GI_number: 15831327 # Func_class: S Function unknown # Function: Leucine-rich repeat (LRR) protein # Organism: Escherichia coli O157:H7 # 1 111 307 417 419 202 92.0 1e-52 MQLMLFNLFSPALKLNTGLAILSPGAFEVHSDGIDVDNELFHYPIKKAYTPYNIHTYKTE EVVNQRNIKVKNMTLDEINNTYCNNDYYNQAIREEPIDLLDRSFSSSSWPF >gi|296494688|gb|ADTN01000050.1| GENE 11 11857 - 12777 101 306 aa, chain - ## HITS:1 COG:yddK_2 KEGG:ns NR:ns ## COG: yddK_2 COG4886 # Protein_GI_number: 16129430 # Func_class: S Function unknown # Function: Leucine-rich repeat (LRR) protein # Organism: Escherichia coli K12 # 81 306 1 226 226 328 99.0 6e-90 MKTITLNDNHIAHLNAKNTTKLEYLNLSNNNLLPTNDIDQLISSKHLWHVLVNGINNDPL AQMQYWTAVRNIIDDTNEVTIDLSGLNLTTQPPGLQNFTSINLDNNQLTHFDATNYDRLV KLSLNSNALESINFPQGRNVSITHISMNNNALRNIDIDRLSSVTYFSAAHNQLEFVQLES CEWLQYLNLSHNQLTDIVAGNKNELLLLDLSHNKLTSLHNDLFPNLNTLLINNNLLSEIK IFYSNFCNVQTLNAANNQLKYINLDFLTYLPSIKSLRLDNNKITHIDTNNTSGIGTLFPI IKQSKT >gi|296494688|gb|ADTN01000050.1| GENE 12 12837 - 13022 172 61 aa, chain - ## HITS:1 COG:yddL KEGG:ns NR:ns ## COG: yddL COG3203 # Protein_GI_number: 16129431 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein (porin) # Organism: Escherichia coli K12 # 8 
61 43 96 96 101 94.0 3e-22 ALSISWSHYPDDKRDDGDKTYARLGFKGETQINDQMIGFGHWEYDFKGYNDEANGSRGKN L Prediction of potential genes in microbial genomes Time: Sun May 15 23:15:02 2011 Seq name: gi|296494687|gb|ADTN01000051.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont118.1, whole genome shotgun sequence Length of sequence - 4192 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 284 - 343 2.0 1 1 Op 1 . + CDS 446 - 1090 308 ## COG1052 Lactate dehydrogenase and related dehydrogenases + Prom 1094 - 1153 2.3 2 1 Op 2 2/0.000 + CDS 1184 - 2239 387 ## COG2837 Predicted iron-dependent peroxidase 3 1 Op 3 . + CDS 2241 - 3047 710 ## COG1659 Uncharacterized protein, linocin/CFP29 homolog - Term 3044 - 3084 6.5 4 2 Tu 1 . - CDS 3099 - 3950 576 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) - Prom 4050 - 4109 5.1 Predicted protein(s) >gi|296494687|gb|ADTN01000051.1| GENE 1 446 - 1090 308 214 aa, chain + ## HITS:1 COG:SMa0478 KEGG:ns NR:ns ## COG: SMa0478 COG1052 # Protein_GI_number: 16262704 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Sinorhizobium meliloti # 29 211 201 384 401 218 56.0 9e-57 MELARHLEFTVSTGVKVYFCDPQSPWQRGTNENTNGLIRQYFPKKTCLAQYTQHELDLVA TQLGLTYHPDAESLARTVDIVNLQVPLYPSTEHFFNQKMISEMKRGSYLINTARAQLVDR DAVVNALKSGHLAGYAGDVWFPQPAPASHPWRTMPWNGMTPHMSGTSLSAQARYAAGTLE ILESFLGNSPIREEYLIVDRGQLAGTGAKSYQLN >gi|296494687|gb|ADTN01000051.1| GENE 2 1184 - 2239 387 351 aa, chain + ## HITS:1 COG:MT0820 KEGG:ns NR:ns ## COG: MT0820 COG2837 # Protein_GI_number: 15840211 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted iron-dependent peroxidase # Organism: Mycobacterium tuberculosis CDC1551 # 11 348 9 331 335 335 52.0 7e-92 
MGCPFSQSVSQPVDERLTRAAIFLVVTINPGKAAEVAVRAHCSILSSLIRGVGFRISDGG LSCVMGVSEGGWERLFGDTKPEYLHVFREINGVHHAPSTPGDLLYHIRAARMDLCFELAS RILSDLGNSVSVVDSVQGFRYFDDRDLLGFVDGTENPVAQAAVDATLIGDEDMVFAGGSY VIVQKYLHDLDKWNAIPVEQQEKIIGREKLSDIELRDADKPSYAHNVLTSIEEDGEDVDI LRDNMPFGDPGKGEFGTYFIGYSRKPERIERMLENMFIGNPPGNYDRILDVSRAITGTLF FVPTTSFLDSVEPQSAPGQQGDDVINTLRSTAIKGDIMPGSLNIGSLKKEV >gi|296494687|gb|ADTN01000051.1| GENE 3 2241 - 3047 710 268 aa, chain + ## HITS:1 COG:MT0819 KEGG:ns NR:ns ## COG: MT0819 COG1659 # Protein_GI_number: 15840210 # Func_class: S Function unknown # Function: Uncharacterized protein, linocin/CFP29 homolog # Organism: Mycobacterium tuberculosis CDC1551 # 1 263 1 263 265 292 54.0 5e-79 MNNLHRELAPVSDAAWEQIEEEATRTLKRFLAARRVVDVTDPQGAALSAVGTGHVAYLDG PCAGVSAVKRQVLPVVEFRVPFKLNRQAIDDVERGSQDSDWSPLKEAARKIAAAEDQTIF DGYMAAGIAGIRPQSSNTPLTLPATASDYPTVVAQALDQLRVAGVNGPYHLVLGEKAYTS ITGGNEGGYPVFQHIRRLIDGEIVWAPAIEGGLLLTTRGGDFVMDIGQDISIGYLNHTGT DVELYLQESFTFSALTSEATVTLLPPEE >gi|296494687|gb|ADTN01000051.1| GENE 4 3099 - 3950 576 283 aa, chain - ## HITS:1 COG:ECs4310 KEGG:ns NR:ns ## COG: ECs4310 COG0568 # Protein_GI_number: 15833564 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Escherichia coli O157:H7 # 1 281 1 282 284 411 77.0 1e-115 MTKEMSVLAPVGNLDSYIRAAYACPMLSQEEEQQLASNLQIHDDLDAAKQLILSHLRFVI HIARQYSGYGLPQADLIQEGNIGLMKAVRRFNPDVGVRLVSFAVHWVKSEIHEYVLRNWR IVKVATTKAQRKLFFNLRKTKQRLGWFNQDEVDMVARELGVSVKDVREMESRMSAQDMAY DIYPDDETDTHHPVSPSLYLQDKSSDFAGMIEEDNWDAQATDHLVAAMETLDPRSQDIIR ARWLDDDNKATLHELAERYGISAERVRQLENNAMKKLRSAIAL Prediction of potential genes in microbial genomes Time: Sun May 15 23:15:03 2011 Seq name: gi|296494686|gb|ADTN01000052.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont119.1, whole genome shotgun sequence Length of sequence - 1431 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 
- 61 1.8 1 1 Tu 1 . + CDS 86 - 1414 599 ## COG3385 FOG: Transposase and inactivated derivatives Predicted protein(s) >gi|296494686|gb|ADTN01000052.1| GENE 1 86 - 1414 599 442 aa, chain + ## HITS:1 COG:yi41 KEGG:ns NR:ns ## COG: yi41 COG3385 # Protein_GI_number: 16132099 # Func_class: L Replication, recombination and repair # Function: FOG: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 1 442 1 442 442 883 100.0 0 MHIGQALDLVSRYDSLRNPLTSLGDYLDPELISRCLAESGTVTLRKRRLPLEMMVWCIVG MALERKEPLHQIVNRLDIMLPGNRPFVAPSAVIQARQRLGSEAVRRVFTKTAQLWHNATP HPHWCGLTLLAIDGVFWRTPDTPENDAAFPRQTHAGNPALYPQVKMVCQMELTSHLLTAA AFGTMKNSENELAEQLIEQTGDNTLTLMDKGYYSLGLLNAWSLAGEHRHWMIPLRKGAQY EEIRKLGKGDHLVKLKTSPQARKKWPGLGNEVTARLLTVTRKGKVCHLLTSMTDAMRFPG GEMGDLYSHRWEIELGYREIKQTMQRSRLTLRSKKPELVEQELWGVLLAYNLVRYQMIKM AEHLKGYWPNQLSFSESCGMVMRMLMTLQGASPGRIPELMRDLASMGQLVKLPTRRERAF PRVVKERPWKYPTAPKKSQSVA Prediction of potential genes in microbial genomes Time: Sun May 15 23:15:13 2011 Seq name: gi|296494685|gb|ADTN01000053.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont144.1, whole genome shotgun sequence Length of sequence - 29449 bp Number of predicted genes - 32, with homology - 32 Number of transcription units - 5, operones - 4 average op.length - 7.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 - CDS 33 - 722 617 ## COG0790 FOG: TPR repeat, SEL1 subfamily - Prom 746 - 805 5.1 - Term 770 - 801 4.1 2 1 Op 2 5/0.000 - CDS 816 - 2495 1763 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing 3 1 Op 3 3/0.000 - CDS 2544 - 2963 352 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing - Prom 3025 - 3084 3.0 - Term 3117 - 3154 6.2 4 1 Op 4 2/0.000 - CDS 3161 - 4627 1567 ## COG1538 Outer membrane protein 5 1 Op 5 6/0.000 - CDS 4624 - 6675 1494 ## COG1289 Predicted membrane protein 6 1 Op 6 . - CDS 6675 - 7706 1087 ## COG1566 Multidrug resistance efflux pump 7 1 Op 7 . 
- CDS 7725 - 8000 173 ## EcHS_A4328 hypothetical protein - Prom 8123 - 8182 13.9 - Term 8160 - 8199 8.3 8 2 Tu 1 . - CDS 8209 - 10194 2408 ## COG2015 Alkyl sulfatase and related hydrolases - Prom 10397 - 10456 5.9 9 3 Op 1 2/0.000 - CDS 10467 - 11396 539 ## COG1940 Transcriptional regulator/sugar kinase 10 3 Op 2 2/0.000 - CDS 11380 - 12075 826 ## COG0036 Pentose-5-phosphate-3-epimerase 11 3 Op 3 21/0.000 - CDS 12086 - 13066 1066 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 12 3 Op 4 16/0.000 - CDS 13045 - 14577 175 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 - Prom 14598 - 14657 2.0 13 3 Op 5 . - CDS 14704 - 15639 990 ## COG1879 ABC-type sugar transport system, periplasmic component 14 3 Op 6 . - CDS 15698 - 16588 539 ## COG1737 Transcriptional regulators - Prom 16682 - 16741 10.4 + Prom 16670 - 16729 8.5 15 4 Op 1 . + CDS 16947 - 17396 313 ## COG0698 Ribose 5-phosphate isomerase RpiB 16 4 Op 2 . + CDS 17465 - 17794 207 ## B21_03923 hypothetical protein 17 5 Op 1 3/0.000 - CDS 17941 - 18699 648 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I 18 5 Op 2 3/0.000 - CDS 18701 - 19135 521 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 19 5 Op 3 7/0.000 - CDS 19122 - 19679 499 ## COG3709 Uncharacterized component of phosphonate metabolism 20 5 Op 4 6/0.000 - CDS 19679 - 20815 1150 ## COG3454 Metal-dependent hydrolase involved in phosphonate metabolism 21 5 Op 5 7/0.000 - CDS 20812 - 21492 223 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 21532 - 21591 1.9 22 5 Op 6 7/0.000 - CDS 21603 - 22361 376 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 23 5 Op 7 8/0.000 - CDS 22358 - 23203 904 ## COG3627 Uncharacterized enzyme of phosphonate metabolism 24 5 Op 8 8/0.000 - CDS 23196 - 24260 1261 ## COG3626 Uncharacterized enzyme of phosphonate metabolism 25 5 Op 9 9/0.000 - CDS 24260 
- 24844 677 ## COG3625 Uncharacterized enzyme of phosphonate metabolism 26 5 Op 10 5/0.000 - CDS 24841 - 25293 431 ## COG3624 Uncharacterized enzyme of phosphonate metabolism 27 5 Op 11 3/0.000 - CDS 25294 - 26019 588 ## COG2188 Transcriptional regulators 28 5 Op 12 8/0.000 - CDS 26040 - 26408 427 ## COG3639 ABC-type phosphate/phosphonate transport system, permease component 29 5 Op 13 8/0.000 - CDS 26258 - 26827 128 ## COG3639 ABC-type phosphate/phosphonate transport system, permease component 30 5 Op 14 15/0.000 - CDS 26933 - 27949 1144 ## COG3221 ABC-type phosphate/phosphonate transport system, periplasmic component 31 5 Op 15 3/0.000 - CDS 27974 - 28762 264 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 - Prom 28815 - 28874 4.5 32 5 Op 16 . - CDS 28895 - 29338 511 ## COG2764 Uncharacterized protein conserved in bacteria - Prom 29377 - 29436 1.6 Predicted protein(s) >gi|296494685|gb|ADTN01000053.1| GENE 1 33 - 722 617 229 aa, chain - ## HITS:1 COG:ECs5060 KEGG:ns NR:ns ## COG: ECs5060 COG0790 # Protein_GI_number: 15834314 # Func_class: R General function prediction only # Function: FOG: TPR repeat, SEL1 subfamily # Organism: Escherichia coli O157:H7 # 1 229 1 229 229 433 100.0 1e-121 MKKIIALMLFLTFFAHANDSEPGSQYLKAAEAGDRRAQYFLADSWFSSGDLSKAEYWAQK AADSGDADACALLAQIKITNPVSLDYPQAKVLAEKAAQAGSKEGEVTLAHILVNTQAGKP DYPKAISLLENASEDLENDSAVDAQMLLGLIYANGVGIKADDDKATWYFKRSSAISRTGY SEYWAGMMFLNGEEGFIEKNKQKALHWLNLSCMEGFDTGCEEFEKLTNG >gi|296494685|gb|ADTN01000053.1| GENE 2 816 - 2495 1763 559 aa, chain - ## HITS:1 COG:fdhF KEGG:ns NR:ns ## COG: fdhF COG0243 # Protein_GI_number: 16131905 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 559 157 715 715 1172 100.0 0 MSNAINEIDNTDLVFVFGYNPADSHPIVANHVINAKRNGAKIIVCDPRKIETARIADMHI ALKNGSNIALLNAMGHVIIEENLYDKAFVASRTEGFEEYRKIVEGYTPESVEDITGVSAS EIRQAARMYAQAKSAAILWGMGVTQFYQGVETVRSLTSLAMLTGNLGKPHAGVNPVRGQN 
NVQGACDMGALPDTYPGYQYVKDPANREKFAKAWGVESLPAHTGYRISELPHRAAHGEVR AAYIMGEDPLQTDAELSAVRKAFEDLELVIVQDIFMTKTASAADVILPSTSWGEHEGVFT AADRGFQRFFKAVEPKWDLKTDWQIISEIATRMGYPMHYNNTQEIWDELRHLCPDFYGAT YEKMGELGFIQWPCRDTSDADQGTSYLFKEKFDTPNGLAQFFTCDWVAPIDKLTDEYPMV LSTVREVGHYSCRSMTGNCAALAALADEPGYAQINTEDAKRLGIEDEALVWVHSRKGKII TRAQVSDRPNKGAIYMTYQWWIGACNELVTENLSPITKTPEYKYCAVRVEPIADQRAAEQ YVIDEYNKLKTRLREAALA >gi|296494685|gb|ADTN01000053.1| GENE 3 2544 - 2963 352 139 aa, chain - ## HITS:1 COG:ECs5061 KEGG:ns NR:ns ## COG: ECs5061 COG0243 # Protein_GI_number: 15834315 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli O157:H7 # 1 139 1 139 715 296 100.0 6e-81 MKKVVTVCPYCASGCKINLVVDNGKIVRAEAAQGKTNQGTLCLKGYYGWDFINDTQILTP RLKTPMIRRQRGGKLEPVSWDEALNYVAERLSAIKEKYGPDAIQTTGSSRGTGNETNYVM QKFARAVIGTNNVDCCARV >gi|296494685|gb|ADTN01000053.1| GENE 4 3161 - 4627 1567 488 aa, chain - ## HITS:1 COG:yjcP KEGG:ns NR:ns ## COG: yjcP COG1538 # Protein_GI_number: 16131906 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Escherichia coli K12 # 1 488 1 488 488 890 100.0 0 MINRQLSRLLLCSILGSTTLISGCALVRKDSAPHQQLKPEQIKLADDIHLASSGWPQAQW WKQLNDPQLDALIQRTLSGSHTLAEAKLREEKAQSQADLLDAGSQLQVAALGMLNRQRVS ANGFLSPYSMDAPALGMDGPYYTEATVGLFAGLDLDLWGVHRSAVAAAIGAHNAALAETA AVELSLATGVAQLYYSMQASYQMLDLLEQTHDVIDYAVKAHQSKVAHGLEAQVPFHGARA QILAVDKQIVAVKGQITETRESLRALIGAGASDMPEIRPVALPQVQTGIPATLSYELLAR RPDLQAMRWYVQASLDQVDSARALFYPSFDIKAFFGLDSIHLHTLFKKTSRQFNFIPGLK LPLFDGGRLNANLEGTRAASNMMIERYNQSVLNAVRDVAVNGTRLQTLNDEREMQAERVE ATRFTQRAAEAAYQRGLTSRLQATEARLPVLAEEMSLLMLDSRRVIQSIQLMKSLGGGYQ AGPVVEKK >gi|296494685|gb|ADTN01000053.1| GENE 5 4624 - 6675 1494 683 aa, chain - ## HITS:1 COG:yjcQ KEGG:ns NR:ns ## COG: yjcQ COG1289 # Protein_GI_number: 16131907 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: 
Escherichia coli K12 # 48 683 1 636 636 1232 100.0 0 MSALNSLPLPEVRLLAFFHEELSERRPGRVPQTVQLWVGCLLVILISMTFEIPFVALSLA VLFYGIQSNAFYTKFVAILFVVATVLEIGSLFLIYKWSYGEPLIRLIIAGPILMGCMFLM RTHRLGLVFFAVAIVAIYGQTFPAMLDYPEVVVRLTLWCIVVGLYPTLLMTLIGVLWFPS RAISQMHQALNDRLDDAISHLTDSLAPLPETRIEREALALQKLNVFCLADDANWRTQNAW WQSCVATVTYIYSTLNRYDPTSFADSQAIIEFRQKLASEINKLQHAVAEGQCWQSDWRIS ESEAMAARECNLENICQTLLQLGQMDPNTPPTPAAKPPSMAADAFTNPDYMRYAVKTLLA CLICYTFYSGVDWEGIHTCMLTCVIVANPNVGSSYQKMVLRFGGAFCGAILALLFTLLVM PWLDNIVELLFVLAPIFLLGAWIATSSERSSYIGTQMVVTFALATLENVFGPVYDLVEIR DRALGIIIGTVVSAVIYTFVWPESEARTLPQKLAGTLGMLSKVMRIPRQQEVTALRTYLQ IRIGLHAAFNACEEMCQRVALERQLDSEERALLIERSQTVIRQGRDLLHAWDATWNSAQA LDNALQPDRAGQFADALEKYAAGLATALSRSPQITLEETPASQAILPTLLKQEQHVCQLF ARLPDWTAPALTPATEQAQGATQ >gi|296494685|gb|ADTN01000053.1| GENE 6 6675 - 7706 1087 343 aa, chain - ## HITS:1 COG:yjcR KEGG:ns NR:ns ## COG: yjcR COG1566 # Protein_GI_number: 16131908 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Escherichia coli K12 # 1 343 1 343 343 583 100.0 1e-166 MESTPKKAPRSKFPALLVVALALVALVFVIWRVDSAPSTNDAYASADTIDVVPEVSGRIV ELAVTDNQAVKQGDLLFRIDPRPYEANLAKAEASLAALDKQIMLTQRSVDAQQFGADSVN ATVEKARAAAKQATDTLRRTEPLLKEGFVSAEDVDRARTAQRAAEADLNAVLLQAQSAAS AVSGVDALVAQRAAVEADIALTKLHLEMATVRAPFDGRVISLKTSVGQFASAMRPIFTLI DTRHWYVIANFRETDLKNIRSGTPATIRLMSDSGKTFEGKVDSIGYGVLPDDGGLVLGGL PKVSRSINWVRVAQRFPVKIMVDKPDPEMFRIGASAVANLEPQ >gi|296494685|gb|ADTN01000053.1| GENE 7 7725 - 8000 173 91 aa, chain - ## HITS:1 COG:no KEGG:EcHS_A4328 NR:ns ## KEGG: EcHS_A4328 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_HS # Pathway: not_defined # 1 91 17 107 107 142 100.0 4e-33 MPTVLSRMAMQLKKTAWIIPVFMVSGCSLSPAIPVIGAYYPSWFFCAIASLILTLITRRV IQRANINLAFVGIIYTALFALYAMLFWLAFF >gi|296494685|gb|ADTN01000053.1| GENE 8 8209 - 10194 2408 661 aa, chain - ## HITS:1 COG:yjcS KEGG:ns NR:ns ## COG: yjcS COG2015 # Protein_GI_number: 16131909 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: 
Alkyl sulfatase and related hydrolases # Organism: Escherichia coli K12 # 1 661 5 665 665 1328 100.0 0 MNNSRLFRLSRIVIALTAASGMMVNTANAKEEAKAATQYTQQVNQNYAKSLPFSDRQDFD DAQRGFIAPLLDEGILRDANGKVYYRADDYKFDINAAAPETVNPSLWRQSQINGISGLFK VTDKMYQVRGQDISNITFVEGEKGIIVIDPLVTPPAAKAALDLYFQHRPQKPIVAVIYTH SHTDHYGGVKGIISEADVKSGKVQVIAPAGFMDEAISENVLAGNIMSRRALYSYGLLLPH NAQGNVGNGLGVTLATGDPSIIAPTKTIVRTGEKMIIDGLEFDFLMTPGSEAPAEMHFYI PALKALCTAENATHTLHNFYTLRGAKTRDTSKWTEYLNETLDMWGNDAEVLFMPHTWPVW GNKHINDYIGKYRDTIKYIHDQTLHLANQGYTMNEIGDMIKLPPALANNWASRGYYGSVS HNARAVYNFYLGYYDGNPANLHPYGQVEMGKRYVQALGGSARVINLAQEANKQGDYRWSA ELLKQVIAANPGDQVAKNLQANNFEQLGYQAESATWRGFYLTGAKELREGVHKFSHGTTG SPDTIRGMSVEMLFDFMAVRLDSAKAAGKNISLNFNMSNGDNLNLTLNDSVLNYRKTLQP QADASFYISREDLHAVLTGQAKMADLVKAKKAKIIGNGAKLEEIIACLDNFDLWVNIVTP N >gi|296494685|gb|ADTN01000053.1| GENE 9 10467 - 11396 539 309 aa, chain - ## HITS:1 COG:alsK KEGG:ns NR:ns ## COG: alsK COG1940 # Protein_GI_number: 16131910 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Escherichia coli K12 # 1 309 1 309 309 649 100.0 0 MQKQHNVVAGVDMGATHIRFCLRTAEGETLHCEKKRTAEVIAPGLVSGIGEMIDEQLRRF NARCHGLVMGFPALVSKDKRTIISTPNLPLTAADLYDLADKLENTLNCPVEFSRDVNLQL SWDVVENRLTQQLVLAAYLGTGMGFAVWMNGAPWTGAHGVAGELGHIPLGDMTQHCACGN PGCLETNCSGMALRRWYEQQPRNYPLRDLFVHAENAPFVQSLLENAARAIATSINLFDPD AVILGGGVMDMPAFPRETLVAMTQKYLRRPLPHQVVRFIAASSSDFNGAQGAAILAHQRF LPQFCAKAP >gi|296494685|gb|ADTN01000053.1| GENE 10 11380 - 12075 826 231 aa, chain - ## HITS:1 COG:alsE KEGG:ns NR:ns ## COG: alsE COG0036 # Protein_GI_number: 16131911 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Escherichia coli K12 # 1 231 1 231 231 473 100.0 1e-133 MKISPSLMCMDLLKFKEQIEFIDSHADYFHIDIMDGHFVPNLTLSPFFVSQVKKLATKPL DCHLMVTRPQDYIAQLARAGADFITLHPETINGQAFRLIDEIRRHDMKVGLILNPETPVE AMKYYIHKADKITVMTVDPGFAGQPFIPEMLDKLAELKAWREREGLEYEIEVDGSCNQAT YEKLMAAGADVFIVGTSGLFNHAENIDEAWRIMTAQILAAKSEVQPHAKTA 
>gi|296494685|gb|ADTN01000053.1| GENE 11 12086 - 13066 1066 326 aa, chain - ## HITS:1 COG:alsC KEGG:ns NR:ns ## COG: alsC COG1172 # Protein_GI_number: 16131912 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Escherichia coli K12 # 1 326 1 326 326 504 100.0 1e-143 MGFTTRVKSEASEKKPFNFALFWDKYGTFFILAIIVAIFGSLSPEYFLTTNNITQIFVQS SVTVLIGMGEFFAILVAGIDLSVGAILALSGMVTAKLMLAGVDPFLAAMIGGVLVGGALG AINGCLVNWTGLHPFIITLGTNAIFRGITLVISDANSVYGFSFDFVNFFAASVIGIPVPV IFSLIVALILWFLTTRMRLGRNIYALGGNKNSAFYSGIDVKFHILVVFIISGVCAGLAGV VSTARLGAAEPLAGMGFETYAIASAIIGGTSFFGGKGRIFSVVIGGLIIGTINNGLNILQ VQTYYQLVVMGGLIIAAVALDRLISK >gi|296494685|gb|ADTN01000053.1| GENE 12 13045 - 14577 175 510 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 279 494 25 222 309 72 25 4e-12 MATPYISMAGIGKSFGPVHALKSVNLTVYPGEIHALLGENGAGKSTLMKVLSGIHEPTKG TITINNISYNKLDHKLAAQLGIGIIYQELSVIDELTVLENLYIGRHLTKKICGVNIIDWR EMRVRAAMMLLRVGLKVDLDEKVANLSISHKQMLEIAKTLMLDAKVIIMDEPTSSLTNKE VDYLFLIMNQLRKEGTAIVYISHKLAEIRRICDRYTVMKDGSSVCSGIVSDVSNDDIVRL MVGRELQNRFNAMKENVSNLAHETVFEVRNVTSRDRKKVRDISFSVCRGEILGFAGLVGS GRTELMNCLFGVDKRAGGEIRLNGKDISPRSPLDAVKKGMAYITESRRDNGFFPNFSIAQ NMAISRSLKDGGYKGAMGLFHEVDEQRTAENQRELLALKCHSVNQNITELSGGNQQKVLI SKWLCCCPEVIIFDEPTRGIDVGAKAEIYKVMRQLADDGKVILMVSSELPEIITVCDRIA VFCEGRLTQILTNRDDMSEEEIMAWALPQE >gi|296494685|gb|ADTN01000053.1| GENE 13 14704 - 15639 990 311 aa, chain - ## HITS:1 COG:alsB KEGG:ns NR:ns ## COG: alsB COG1879 # Protein_GI_number: 16131914 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli K12 # 1 311 1 311 311 561 100.0 1e-160 MNKYLKYFSGTLVGLMLSTSAFAAAEYAVVLKTLSNPFWVDMKKGIEDEAKTLGVSVDIF ASPSEGDFQSQLQLFEDLSNKNYKGIAFAPLSSVNLVMPVARAWKKGIYLVNLDEKIDMD NLKKAGGNVEAFVTTDNVAVGAKGASFIIDKLGAEGGEVAIIEGKAGNASGEARRNGATE 
AFKKASQIKLVASQPADWDRIKALDVATNVLQRNPNIKAIYCANDTMAMGVAQAVANAGK TGKVLVVGTDGIPEARKMVEAGQMTATVAQNPADIGATGLKLMVDAEKSGKVIPLDKAPE FKLVDSILVTQ >gi|296494685|gb|ADTN01000053.1| GENE 14 15698 - 16588 539 296 aa, chain - ## HITS:1 COG:rpiR KEGG:ns NR:ns ## COG: rpiR COG1737 # Protein_GI_number: 16131915 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 296 12 307 307 588 100.0 1e-168 MSQSEFDSALPNGIGLAPYLRMKQEGMTENESRIVEWLLKPGNLSCAPAIKDVAEALAVS EAMIVKVSKLLGFSGFRNLRSALEDYFSQSEQVLPSELAFDEAPQDVVNKVFNITLRTIM EGQSIVNVDEIHRAARFFYQARQRDLYGAGGSNAICADVQHKFLRIGVRCQAYPDAHIMM MSASLLQEGDVVLVVTHSGRTSDVKAAVELAKKNGAKIICITHSYHSPIAKLADYIICSP APETPLLGRNASARILQLTLLDAFFVSVAQLNIEQANINMQKTGAIVDFFSPGALK >gi|296494685|gb|ADTN01000053.1| GENE 15 16947 - 17396 313 149 aa, chain + ## HITS:1 COG:rpiB KEGG:ns NR:ns ## COG: rpiB COG0698 # Protein_GI_number: 16131916 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Escherichia coli K12 # 1 149 1 149 149 287 100.0 4e-78 MKKIAFGCDHVGFILKHEIVAHLVERGVEVIDKGTWSSERTDYPHYASQVALAVAGGEVD GGILICGTGVGISIAANKFAGIRAVVCSEPYSAQLSRQHNDTNVLAFGSRVVGLELAKMI VDAWLGAQYEGGRHQQRVEAITAIEQRRN >gi|296494685|gb|ADTN01000053.1| GENE 16 17465 - 17794 207 109 aa, chain + ## HITS:1 COG:no KEGG:B21_03923 NR:ns ## KEGG: B21_03923 # Name: yjdP # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 109 1 109 109 79 99.0 5e-14 MKRFPLFLLFTLLTLSTVPAQADIIDDTIGNIQQAINDAYNPDRGRDYEDSRDDGWQREV SDDRRRQYDDRRRQFEDRRRQLDDRQRQLDQERRQLEDEERRMEDEYGR >gi|296494685|gb|ADTN01000053.1| GENE 17 17941 - 18699 648 252 aa, chain - ## HITS:1 COG:ECs5075 KEGG:ns NR:ns ## COG: ECs5075 COG1235 # Protein_GI_number: 15834329 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Escherichia coli O157:H7 # 1 252 1 252 252 513 96.0 1e-145 MSLTLTLTGTGGAQGVPAWGCECAACARARRSPQYRRQPCSGVVKFNDAITLIDAGLHDL 
ADRWSPGSFQQFLLTHYHMVHVQGLFPLRWGVGDPIPVYGPPDEQGCDDLFKHPGLLDFS HTVEPFVVFDLQGLQVTPLPLNHSKLTFGYLLETAHSRVAWLSDTAGLPEKTLKFLRNNQ PQVMVMDCSHPPRADAPRNHCDLNTVLALNQVIRSPRVILTHISHQFDAWLMENALPSGF EVGFDGMEIGVA >gi|296494685|gb|ADTN01000053.1| GENE 18 18701 - 19135 521 144 aa, chain - ## HITS:1 COG:phnO KEGG:ns NR:ns ## COG: phnO COG0454 # Protein_GI_number: 16131919 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Escherichia coli K12 # 1 144 1 144 144 285 100.0 2e-77 MPACELRPATQYDTDAVYALICELKQAEFDHHAFRVGFNANLRDPNMRYHLALLDGEVVG MIGLHLQFHLHHVNWIGEIQELVVMPQARGLNVGSKLLAWAEEEARQAGAEMTELSTNVK RHDAHRFYLREGYEQSHFRFTKAL >gi|296494685|gb|ADTN01000053.1| GENE 19 19122 - 19679 499 185 aa, chain - ## HITS:1 COG:phnN KEGG:ns NR:ns ## COG: phnN COG3709 # Protein_GI_number: 16131920 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized component of phosphonate metabolism # Organism: Escherichia coli K12 # 1 185 1 185 185 358 100.0 3e-99 MMGKLIWLMGPSGSGKDSLLAELRLREQTQLLVAHRYITRDASAGSENHIALSEQEFFTR AGQNLLALSWHANGLYYGVGVEIDLWLHAGFDVLVNGSRAHLPQARARYQSALLPVCLQV SPEILRQRLENRGRENASEINARLARAARYTPQDCHTLNNDGSLRQSVDTLLTLIHQKEK HHACL >gi|296494685|gb|ADTN01000053.1| GENE 20 19679 - 20815 1150 378 aa, chain - ## HITS:1 COG:phnM KEGG:ns NR:ns ## COG: phnM COG3454 # Protein_GI_number: 16131921 # Func_class: P Inorganic ion transport and metabolism # Function: Metal-dependent hydrolase involved in phosphonate metabolism # Organism: Escherichia coli K12 # 1 378 1 378 378 743 100.0 0 MIINNVKLVLENEVVSGSLEVQNGEIRAFAESQSRLPEAMDGEGGWLLPGLIELHTDNLD KFFTPRPKVDWPAHSAMSSHDALMVASGITTVLDAVAIGDVRDGGDRLENLEKMINAIEE TQKRGVNRAEHRLHLRCELPHHTTLPLFEKLVQREPVTLVSLMDHSPGQRQFANREKYRE YYQGKYSLTDAQMQQYEEEQLALAARWSQPNRESIAALCRARKIALASHDDATHAHVAES HQLGSVIAEFPTTFEAAEASRKHGMNVLMGAPNIVRGGSHSGNVAASELAQLGLLDILSS DYYPASLLDAAFRVADDQSNRFTLPQAVKLVTKNPAQALNLQDRGVIGEGKRADLVLAHR KDNHIHIDHVWRQGKRVF 
>gi|296494685|gb|ADTN01000053.1| GENE 21 20812 - 21492 223 226 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 219 1 209 245 90 29 1e-17 MINVQNVSKTFILHQQNGVRLPVLNRASLTVNAGECVVLHGHSGSGKSTLLRSLYANYLP DEGQIQIKHGDEWVDLVTAPARKVVEIRKTTVGWVSQFLRVIPRISALEVVMQPLLDTGV PREACAAKAARLLTRLNVPERLWHLAPSTFSGGEQQRVNIARGFIVDYPILLLDEPTASL DAKNSAAVVELIREAKTRGAAIVGIFHDEAVRNDVADRLHPMGASS >gi|296494685|gb|ADTN01000053.1| GENE 22 21603 - 22361 376 252 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 19 251 36 267 329 149 35 2e-35 MNQPLLSVNNLTHLYAPGKGFSDVSFDLWPGEVLGIVGESGSGKTTLLKSISARLTPQQG EIHYENRSLYAMSEADRRRLLRTEWGVVHQHPLDGLRRQVSAGGNIGERLMATGARHYGD IRATAQKWLEEVEIPANRIDDLPTTFSGGMQQRLQIARNLVTHPKLVFMDEPTGGLDVSV QARLLDLLRGLVVELNLAVVIVTHDLGVARLLADRLLVMKQGQVVESGLTDRVLDDPHHP YTQLLVSSVLQN >gi|296494685|gb|ADTN01000053.1| GENE 23 22358 - 23203 904 281 aa, chain - ## HITS:1 COG:phnJ KEGG:ns NR:ns ## COG: phnJ COG3627 # Protein_GI_number: 16131924 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized enzyme of phosphonate metabolism # Organism: Escherichia coli K12 # 1 281 1 281 281 575 100.0 1e-164 MANLSGYNFAYLDEQTKRMIRRAILKAVAIPGYQVPFGGREMPMPYGWGTGGIQLTASVI GESDVLKVIDQGADDTTNAVSIRNFFKRVTGVNTTERTDDATVIQTRHRIPETPLTEDQI IIFQVPIPEPLRFIEPRETETRTMHALEEYGVMQVKLYEDIARFGHIATTYAYPVKVNGR YVMDPSPIPKFDNPKMDMMPALQLFGAGREKRIYAVPPFTRVESLDFDDHPFTVQQWDEP CAICGSTHSYLDEVVLDDAGNRMFVCSDTDYCRQQSEAKNQ >gi|296494685|gb|ADTN01000053.1| GENE 24 23196 - 24260 1261 354 aa, chain - ## HITS:1 COG:phnI KEGG:ns NR:ns ## COG: phnI COG3626 # Protein_GI_number: 16131925 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized enzyme of phosphonate metabolism # Organism: Escherichia coli K12 # 1 354 1 354 354 684 100.0 0 MYVAVKGGEKAIDAAHALQESRRRGDTDLPELSVAQIEQQLNLAVDRVMTEGGIADRELA 
ALALKQASGDNVEAIFLLRAYRTTLAKLAVSEPLDTTGMRLERRISAVYKDIPGGQLLGP TYDYTHRLLDFTLLANGEAPTLTTADSEQQPSPHVFSLLARQGLAKFEEDSGAQPDDITR TPPVYPCSRSSRLQQLMRGDEGYLLALAYSTQRGYGRNHPFAGEIRSGYIDVSIVPEELG FAVNVGELLMTECEMVNGFIDPPGEPPHFTRGYGLVFGMSERKAMAMALVDRALQAPEYG EHATGPAQDEEFVLAHADNVEAAGFVSHLKLPHYVDFQAELELLKRLQQEQNHG >gi|296494685|gb|ADTN01000053.1| GENE 25 24260 - 24844 677 194 aa, chain - ## HITS:1 COG:phnH KEGG:ns NR:ns ## COG: phnH COG3625 # Protein_GI_number: 16131926 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized enzyme of phosphonate metabolism # Organism: Escherichia coli K12 # 1 194 1 194 194 355 100.0 2e-98 MTLETAFMLPVQDAQHSFRRLLKAMSEPGVIVALHQLKRGWQPLNIATTSVLLTLADNDT PVWLSTPLNNDIVNQSLRFHTNAPLVSQPEQATFAVTDEAISSEQLNALSTGTAVAPEAG ATLILQVASLSGGRMLRLTGAGIAEERMIAPQLPECILHELTERPHPFPLGIDLILTCGE RLLAIPRTTHVEVC >gi|296494685|gb|ADTN01000053.1| GENE 26 24841 - 25293 431 150 aa, chain - ## HITS:1 COG:phnG KEGG:ns NR:ns ## COG: phnG COG3624 # Protein_GI_number: 16131927 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized enzyme of phosphonate metabolism # Organism: Escherichia coli K12 # 1 150 1 150 150 261 99.0 3e-70 MHADTAPRQHWMSVLAHSQPAELAARLNALNITADYEVIRAAETGLVQIQARMGGTGERF FAGDATLTRAAVRLTDGTLGYSWVQGRDKQHAERCALIDALMQQSRHFQNLSETLIAPLD ADRMARIAARQAEVNASRVDFFTMVRGDNA >gi|296494685|gb|ADTN01000053.1| GENE 27 25294 - 26019 588 241 aa, chain - ## HITS:1 COG:phnF KEGG:ns NR:ns ## COG: phnF COG2188 # Protein_GI_number: 16131928 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 241 1 241 241 468 100.0 1e-132 MHLSTHPTSYPTRYQEIAAKLEQELRQHYRCGDYLPAEQQLAARFEVNRHTLRRAIDQLV EKGWVQRRQGVGVLVLMRPFDYPLNAQARFSQNLLDQGSHPTSEKLLSVLRPASGHVADA LGITEGENVIHLRTLRRVNGVALCLIDHYFADLTLWPTLQRFDSGSLHDFLREQTGIALR RSQTRISARRAQAKECQRLEIPNMSPLLCVRTLNHRDGESSPAEYSVSLTRADMIEFTME H >gi|296494685|gb|ADTN01000053.1| GENE 28 26040 - 26408 427 122 aa, chain - ## HITS:1 COG:ECs5086 KEGG:ns NR:ns ## COG: 
ECs5086 COG3639 # Protein_GI_number: 15834340 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate/phosphonate transport system, permease component # Organism: Escherichia coli O157:H7 # 1 122 155 276 276 229 99.0 6e-61 MLALFIHTTGVLSKLLSEAVEAIEPGPVEGIRATGANKLEEILYGVLPQVMPLLISYSLY RFESNVRSATVVGMVGAGGIGVTLWEAIRGFQFQQTCALMVLIIVTVSLLDFLSQRLRKH FI >gi|296494685|gb|ADTN01000053.1| GENE 29 26258 - 26827 128 189 aa, chain - ## HITS:1 COG:phnE KEGG:ns NR:ns ## COG: phnE COG3639 # Protein_GI_number: 16131930 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate/phosphonate transport system, permease component # Organism: Escherichia coli K12 # 1 189 18 206 206 322 99.0 3e-88 MQTITIAPPKRSWFSLLSWAVVLAVLVVSWQGAEMAPLTLIKDGGNMATFAADFFPPDFS QWQDYLTEMAVTLQIAVWGTALAVVLSIPFGLMSAENLVPWWVYQPVRRLMDACRAINEM VFAMLFVVAVGLGPFAGVLACWRCLSTPPACSPSCFPKRWKRLNPARWKAFAPPVPTSSK RSSTACCHR >gi|296494685|gb|ADTN01000053.1| GENE 30 26933 - 27949 1144 338 aa, chain - ## HITS:1 COG:phnD KEGG:ns NR:ns ## COG: phnD COG3221 # Protein_GI_number: 16131931 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate/phosphonate transport system, periplasmic component # Organism: Escherichia coli K12 # 1 338 1 338 338 642 100.0 0 MNAKIIASLAFTSMFSLSTLLSPAHAEEQEKALNFGIISTESQQNLKPQWTPFLQDMEKK LGVKVNAFFAPDYAGIIQGMRFNKVDIAWYGNLSAMEAVDRANGQVFAQTVAADGSPGYW SVLIVNKDSPINNLNDLLAKRKDLTFGNGDPNSTSGFLVPGYYVFAKNNISASDFKRTVN AGHETNALAVANKQVDVATNNTENLDKLKTSAPEKLKELKVIWKSPLIPGDPIVWRKNLS ETTKDKIYDFFMNYGKTPEEKAVLERLGWAPFRASSDLQLVPIRQLALFKEMQGVKSNKG LNEQDKLAKTTEIQAQLDDLDRLNNALSAMSSVSKAVQ >gi|296494685|gb|ADTN01000053.1| GENE 31 27974 - 28762 264 262 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 231 1 214 305 106 31 2e-22 MQTIIRVEKLAKTFNQHQALHAVDLNIHHGEMVALLGPSGSGKSTLLRHLSGLITGDKSV GSHIELLGRTVQREGRLARDIRKSRAHTGYIFQQFNLVNRLSVLENVLIGALGSTPFWRT 
CFSWFTGEQKQRALQALTRVGMVHFAHQRVSTLSGGQQQRVAIARALMQQAKVILADEPI ASLDPESARIVMDTLRDINQNDGITVVVTLHQVDYALRYCERIVALRQGHVFYDGSSQQF DNERFDHLYRSINRVEENAKAA >gi|296494685|gb|ADTN01000053.1| GENE 32 28895 - 29338 511 147 aa, chain - ## HITS:1 COG:phnB KEGG:ns NR:ns ## COG: phnB COG2764 # Protein_GI_number: 16131933 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 147 1 147 147 295 100.0 1e-80 MPLSPYLSFAGNCSDAIAYYQRTLGAELLYKISFGEMPKSAQDSAENCPSGMQFPDTAIA HANVRIAGSDIMMSDAMPSGKASYSGFTLVLDSQQVEEGKRWFDNLAANGKIEMAWQETF WAHGFGKVTDKFGVPWMINVVKQQPTQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:15:20 2011 Seq name: gi|296494684|gb|ADTN01000054.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont146.1, whole genome shotgun sequence Length of sequence - 4849 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 100 - 134 5.1 1 1 Tu 1 . - CDS 171 - 713 449 ## COG3539 P pilus assembly protein, pilin FimA - Prom 793 - 852 7.3 - Term 925 - 979 14.3 2 2 Op 1 . - CDS 1006 - 1287 160 ## G2583_2648 hypothetical protein - Prom 1370 - 1429 2.8 - Term 1448 - 1511 10.0 3 2 Op 2 . - CDS 1550 - 2659 1126 ## COG0489 ATPases involved in chromosome partitioning - Prom 2687 - 2746 2.7 + Prom 2702 - 2761 4.4 4 3 Tu 1 . 
+ CDS 2791 - 4824 2443 ## COG0143 Methionyl-tRNA synthetase Predicted protein(s) >gi|296494684|gb|ADTN01000054.1| GENE 1 171 - 713 449 180 aa, chain - ## HITS:1 COG:yehD KEGG:ns NR:ns ## COG: yehD COG3539 # Protein_GI_number: 16130049 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 180 1 180 180 315 99.0 4e-86 MKRSIIAAAVFSSFFMSAGVFAADVDTGTLTIKGNIAESPCKFEAGGDSVSINMPTVPTT VFEGKAKYSTYDDAVGVTSSMLKISCPKEVAGVKLSLITNDKITGNDKAIASSNDTVGYY LYLGDNSDVLDVSAPFNIESYKTAEGQYAIPFKAKYLKLTDNSVQSGDVLSSLVMRVAQD >gi|296494684|gb|ADTN01000054.1| GENE 2 1006 - 1287 160 93 aa, chain - ## HITS:1 COG:no KEGG:G2583_2648 NR:ns ## KEGG: G2583_2648 # Name: yehE # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 93 1 93 93 181 100.0 5e-45 MNKYWLSGIIFLAYGLASPAFSSETATLTINGRISPPTCSMAMVNSQPQQHCGQLTYNVD TRHQVSSPVKGVTTEVVVAGSDSKRRIVLNRYD >gi|296494684|gb|ADTN01000054.1| GENE 3 1550 - 2659 1126 369 aa, chain - ## HITS:1 COG:ECs2919 KEGG:ns NR:ns ## COG: ECs2919 COG0489 # Protein_GI_number: 15832173 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Escherichia coli O157:H7 # 1 369 11 379 379 724 99.0 0 MNEQSQAKSPEALRAMVAGTLANFQHPTLKHNLTTLKALHHVAWMDDTLHVELVMPFVWH SAFEELKEQCSAELLRITGVKAIDWKLSHNIATLKRVKNQPGINGVKNIIAVSSGKGGVG KSSTAVNLALALAAEGAKVGILDADIYGPSIPTMLGAENQRPTSPDGTHMAPIMSHGLAT NSIGYLVTDDNAMVWRGPMASKALMQMLQETLWPDLDYLVLDMPPGTGDIQLTLAQNIPV TGAVVVTTPQDIALIDAKKGIVMFEKVEVPVLGIVENMSVHICSNCGHHEPIFGTGGAEK LAEKYHTQLLGQMPLHISLREDLDKGTPTVISRPESEFTAIYRQLADRVAAQLYWQGEVI PGEISFRAV >gi|296494684|gb|ADTN01000054.1| GENE 4 2791 - 4824 2443 677 aa, chain + ## HITS:1 COG:ZmetG_1 KEGG:ns NR:ns ## COG: ZmetG_1 COG0143 # Protein_GI_number: 15802593 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA synthetase # Organism: Escherichia 
coli O157:H7 EDL933 # 1 567 1 567 567 1205 99.0 0 MTQVAKKILVTCALPYANGSIHLGHMLEHIQADVWVRYQRMRGHEVNFICADDAHGTPIM LKAQQLGITPEQMIGEMSQEHQTDFAGFNISYDNYHSTHSEENRQLSELIYSRLKENGFI KNRTISQLYDPEKGMFLPDRFVKGTCPKCKSPDQYGDNCEVCGATYSPTELIEPKSVVSG ATPVMRDSEHFFFDLPSFSEMLQAWTRSGALQEQVANKMQEWFESGLQQWDISRDAPYFG FEIPNAPGKYFYVWLDAPIGYMGSFKNLCDKRGDSVSFDEYWKKDSTAELYHFIGKDIVY FHSLFWPAMLEGSNFRKPTNLFVHGYVTVNGAKMSKSRGTFIKASTWLNHFDADSLRYYY TAKLSSRIDDIDLNLEDFVQRVNADIVNKVVNLASRNAGFINKRFDGVLASELADPQLYK TFTDAAEVIGEAWESREFGKAVREIMALADLANRYVDEQAPWVVAKQEGRDADLQAICSM GINLFRVLMTYLKPVLPKLTERAEAFLNTELTWDGIQQPLLGHKVNPFKALYNRIDMKQV EALVEASKEEVKAAAAPVTGPLADDPIQETITFDDFAKVDLRVALIENAEFVEGSDKLLR LTLDLGGEKRNVFSGIRSAYPDPQALIGRHTIMVANLAPRKMRFGISEGMVMAAGPGGKD IFLLSPDAGAKPGHQVK Prediction of potential genes in microbial genomes Time: Sun May 15 23:15:26 2011 Seq name: gi|296494683|gb|ADTN01000055.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont146.2, whole genome shotgun sequence Length of sequence - 17351 bp Number of predicted genes - 11, with homology - 9 Number of transcription units - 6, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 46 - 105 2.4 1 1 Op 1 . + CDS 142 - 3936 2258 ## COG3831 Uncharacterized conserved protein 2 1 Op 2 . + CDS 3946 - 7578 2314 ## ECO103_2592 hypothetical protein + Term 7586 - 7619 3.7 3 1 Op 3 . + CDS 7639 - 7959 81 ## G2583_2655 hypothetical protein + Term 7963 - 8008 4.3 - Term 7959 - 7989 1.1 4 2 Tu 1 . - CDS 8013 - 8102 121 ## - Prom 8127 - 8186 1.6 - Term 8566 - 8597 1.7 5 3 Tu 1 . - CDS 8621 - 8710 133 ## - Prom 8900 - 8959 6.3 6 4 Tu 1 . + CDS 8972 - 10042 462 ## COG2801 Transposase and inactivated derivatives + Term 10101 - 10145 7.7 + Prom 10110 - 10169 7.7 7 5 Op 1 . + CDS 10244 - 11332 945 ## COG0714 MoxR-like ATPases 8 5 Op 2 . + CDS 11343 - 13622 1859 ## c2650 hypothetical protein 9 5 Op 3 . + CDS 13615 - 14751 893 ## JW5350 conserved hypothetical protein 10 5 Op 4 . 
+ CDS 14748 - 16751 1005 ## COG2801 Transposase and inactivated derivatives + Term 16830 - 16858 1.6 11 6 Tu 1 . + CDS 16876 - 17337 550 ## COG4808 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|296494683|gb|ADTN01000055.1| GENE 1 142 - 3936 2258 1264 aa, chain + ## HITS:1 COG:molR_g1_1 KEGG:ns NR:ns ## COG: molR_g1_1 COG3831 # Protein_GI_number: 16130053 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 112 1 112 112 202 96.0 4e-51 MRHFIYQDEKSHKFWAVEQQDNELHISWGKIGTHGQSQIKSFSDAAAAAKAELKLIAEKV KKGYVEQAKDNSLQPSQTVTGSLKVADLSTIIQEQPSFVAETRAPDKNTDAVLPWLAKDI AVVFPPEVVHTTLSHRRFPGVPVQQADKLTQLRRLACSVSQRDNKTATFDFSACSLEWQN TVAQAISQIDGLKTTQLPSPVMAVLTALEMKCTRYKVREDVMDQIVQEGGLEYATDVIIH LQQIDIEWDYANNVIIILPSGIAPSYLEQYSRFELRLRKHLSLTEESLWQKCAQKLIAAI PHIPEWRQPLIALLLPEKPEIAHEIAQRLLGQKKLPSLEWLKIVATDEHILASLEKYHEP YAIFDDYYCGAIWSATVLQEQGVAALPRFAPYAASDYCADVLRHINHPFALTLLIRVAGH TKRCHDRMTKACAAFPHAAMAALTELLAQKEENSWRIMLMTMLISQPALAEQVIPWLSTP AVAVLKSCQQQLTQPSNHASADLLPAIVVSPPWLSKKKKSPIPVLDLAPLGIEPICYLTE EISNQLLAKYIWYSKHITVSHEESTTNLLARMGFQRRIAGTYIKAPEAVVEAWLNEDYST LLSEFKVFHSPTGHYWQLGILTTLPLEKAVKAWNALTLSPHTDTEYAMLHFGLKGLPGLV NSLARYPQEALPITNYFAASELAPAVARAFNKLKTLRENARSWLLKYPEHALTGLLPAAL GKAGEAQDNARAALRMLTENGHKPLLQEIARRYNQPEVTDAVNALLALDPLDNHPTKIPT LPAFYQPSLWTRPVLKANAQSLPDSALLHLGEMLRFPQEEALYPGLLQVKDVCSADSLAG FAWDLFTAWQTAGAPSKESWAFTALGVLGNDDTARKLTPLIRAWPGESQHKRATVGLDIL AAIGSDIALMQLNGIAQKLKFKALQERAKEKIADIAESRELTVAELEDRLAPDLGLDDNG SLLLDFGPRQFTVSFDETLKPFVRDVSGSRLKDLPKPNKSDDETRANDAVNRYKLLKKDA RTIAAQQVARLESAMCLRRRWSLENFQLFLVEHPLVRHLTRRLIWGVYSAENQLLACFRV AEDNSYSTADDDLFTLPEGDISIGTPHVLEISPTDAAAFGQLFADYELLPPFRQLDRNSY ALTEAERNASELTRWAGRKCPSGRVMGLANKGWIKGEPQDGGWIGWMIKPLGRWSLIMEI DEGFAVGMSPAELSAEQLLSKLWLWEGKAESYGWGSNSTQEAQFSVLDAITASELINDIE ALFE >gi|296494683|gb|ADTN01000055.1| GENE 2 3946 - 7578 2314 1210 aa, chain + ## HITS:1 COG:no KEGG:ECO103_2592 NR:ns ## KEGG: ECO103_2592 # Name: yehI # Def: hypothetical protein # 
Organism: E.coli_O103_H2 # Pathway: not_defined # 1 1210 1 1210 1210 2278 96.0 0 MDKELPWLADNAQLELKYKKGKTPLSHRKWPGEPVPVITESIIQTLGDELLQKAEKKKNI VWRYENFSLEWQSAITQAINLIGEHKPSVPARTMAALACIAQNDNQQLLDEIVQQEGLEY ATEVVIARQFIARCYESDPLVVTLQYQNEDYGYGYRSETYNEFDLRLRKHLSLAEESRWQ RCADKLIAALPGITKVRRPFIALILPEKPEIANELVSLECPRTHFHSKEWLKVVATDPKA VRKLERYWSQDIFSDREASYMSHENHFGYAACAALLREQGLAAVPRLAIYAHKEDCGSLL VQINHPQVIRTLLLVADKNKPSLQRVAKYSKNFPHATLAALAELLALKEPPARPGYPIIE DKKLPAQQKARDEYWRTLLQTLIASQPQLAERVMPWLSTQAQSVVKRYLSASPKPVIDST DNSGLPEMLVSPPWRSKKKMTAPRLDLAPLELTPQVYWQPGEQERLAATESARYFSTESL AERMEQKSGRVVLQELGFGDDVWLFLNYILPGKLDAARNSLIVQWHYYQGRVEEILNGWN SPEAQLAEQALRSGHIEALINIWENDNYSRYRPEKSVWNLYLLAQLPREMALTFWLRINE KKHLFAGEDYFLSILGLDALPGLMLAFSHRPKETFPLILNFGATELALPVARVWRRFAVQ RGLARQWILHWPEHTATALIPLVFTKSSDNSEAALLALRLLYEHGHGELLQTVANRWQRK DVWPALEQLLKQGPMDIYPARIPKAPDFWHPAMWSRPRLITNNQPVTDDALEIIGEMLRF TQGGRFYSGLEQLKTFCQPQTLAAFAWDLFTAWQQAGAPAKDNWAFLALSLFGDESTARG LTTQILAWPQEGKSARAVIGLNILTLMNNDMALIQLHHVSQRAKSSSLRENAAEFLQVVA ENRGLSQEELADRLVPTLGLDDPQALIFDFGPRQFTVRFDENLNPVIFDQQNVRQKSVPR LRADDDQLKAPEALARLKGLKKDATQVSKNLLPRLEAALRTTRRWSLADFHSLFVNHPFT RLVTQRLIWGGYPANEPRRLLNAFRVAAEGEFCNAQDEPIDLPADALIGIAHPLEMTAEM RSEFAQLFADYEIMPPFRQLSRRTVLLTPDESTSNSLTRWEGKSATVGQLMGMRYKGWES GYEDAFVYDLGEYRLVLKFSPGFNHYNVDSKALMSFRSLRVYRDNKSVTFAELDVFDLSE ALSAPDVIFH >gi|296494683|gb|ADTN01000055.1| GENE 3 7639 - 7959 81 106 aa, chain + ## HITS:1 COG:no KEGG:G2583_2655 NR:ns ## KEGG: G2583_2655 # Name: yehK # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 105 1 105 105 188 98.0 5e-47 MIVQKELVAIYDYEVPVPEDPFSFRLEIHKCSELFTGSVYRLERFRLHPTFHQRDREDAD PLINDALIYIRDECIDERKLRGESPETVIAIFNRELQNIFIQQIEK >gi|296494683|gb|ADTN01000055.1| GENE 4 8013 - 8102 121 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDTLTQKLIVLIAVLELLVALLRLIDLLK >gi|296494683|gb|ADTN01000055.1| GENE 5 8621 - 8710 133 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDTLTQKLTVLIAVLELLVALLQLIDLLK >gi|296494683|gb|ADTN01000055.1| GENE 6 8972 - 10042 462 356 
aa, chain + ## HITS:1 COG:VC0817 KEGG:ns NR:ns ## COG: VC0817 COG2801 # Protein_GI_number: 15640835 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Vibrio cholerae # 144 332 94 285 327 64 28.0 4e-10 MLMQHIGVGYFGYYRATAYAMKHSLMPEIAKLRMKALNFWDKHGIRAAADAFDVSTRTLY WWRRLLRTGGPEALIPRSKAPLVRRSRHWHPDVLKEIRRLRTELPNLGKEQIFVRLKPWC EARHFTCPSTSTIGRIIAGAHDKMRMIPVRLSARGKARLIKKRSVKPRRPKQYRPVKTGE LIGMDAIELRMGDLRRYIITMIDEHSDYALALAVPSLNSDITSHFFSKATKLFPVAIRQV VTDNGKEFLGNFDKTLQEASIKHIWTYPYTPKMNATCERFNRTLREQFIEFNELLLFEDL NLFNQRMAEYLVLYNSKRPHKSLELMTPVDYILRESKNCNMWWTHTLNCNGVQPWL >gi|296494683|gb|ADTN01000055.1| GENE 7 10244 - 11332 945 362 aa, chain + ## HITS:1 COG:ECs2927 KEGG:ns NR:ns ## COG: ECs2927 COG0714 # Protein_GI_number: 15832181 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Escherichia coli O157:H7 # 1 362 23 384 384 715 99.0 0 MSPQNNHLQRPPAAVLYADELAKLKQNDNAPCPPGWQLSLPAARAFILGDSAQNISRKVV ISPSAVERMLVTLATGRGLMLVGEPGTAKSLLSELLATAISGDAGLTVQGGASTTEDQIK YGWNYALLINHGPSTEALVPAPLYQGMRDGKIVRFEEITRTPLEVQDCLLGMLSDRVMTV PELTGEASQLYAREGFNIIATANTRDRGVNEMSAALKRRFDFETVFPIMDFAQELELVAS ASARLLAHSGIPHKVPDAVLELLVRTFRDLRANGEKKTSMDTLTAIMSTAEAVNVAHAVG VRAWFLANRAGEPADLVDCIAGTIVKDNEEDRARLRRYFEQRVATHKEAHWQAYYQARHR LP >gi|296494683|gb|ADTN01000055.1| GENE 8 11343 - 13622 1859 759 aa, chain + ## HITS:1 COG:no KEGG:c2650 NR:ns ## KEGG: c2650 # Name: yehM # Def: hypothetical protein # Organism: E.coli_CFT073 # Pathway: not_defined # 1 759 26 784 784 1400 97.0 0 MSEPLIVGIRHHSPACARLVKSLIESQRPRYVLIEGPADFNDRVDELFLAHQLPVAIYSY CQYQDGAAPGRGAWTPFAEFSPEWQALQAARRIQAQTYFIDLPCWAQSEEEDDSPDTQEE SQTLLLRATRMDNSDNLWDHLFEDESQQTALPSALAHYFAQLRGDFPGDALNRQREAFMA RWIAWAVQQNNGDVLVVCGGWHAPALAKMWRECPQDINKPELPSLADAVTGCYLTPYSEK RLDVLAGYLSGMPAPVWQSWCWQWGLQQAGEQLLKTVLTRLRQHHLPASTADMAAAHLHA MALAQLRGHKLPLRTDWLDAIAGSLIKEALNAPLPWSYRGVIHPDTDPILLTLIDTLAGD GFGKLAPSTPQPPLPKDVTCELERTAISLPAELTLNRFTPNGLAQSQVLHRLAILEIPGI 
VRQQGSTLTLAGNGEERWKLTRPLSQHAALIEAACFGATLQEAARHKLEADMLDAGGIGS ITTCLSQAALAGLASFSQQLLEQLTLLIAQENQFAEMGQALEVLYALWRLDEISGMQGAQ ILQTTLCAAIDRTLWLCESNGKPDEKEFHAHLHSWQALCHILRDLHSGVNLPGVSLSAAV ALLERRSQAIHAPALDRGAALGALMRLEHPNASAEAALAMLAQFFPAQSGEALHGLLALA RHQLACQPAFIAGFSSHLNQLSDDDFINALPDLRAAMAWLPPRERGTLAHQVLEHYQLAQ LPVSALQMPLHCPPQAIAHHQQLEQQALASLQNWGVFHV >gi|296494683|gb|ADTN01000055.1| GENE 9 13615 - 14751 893 378 aa, chain + ## HITS:1 COG:no KEGG:JW5350 NR:ns ## KEGG: JW5350 # Name: yehP # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 378 1 378 378 737 99.0 0 MSELNDLLTTRELQRWRLILGEAAETTLCGLDDNARQIDHALEWLYGRDPERLQRGERSG GLGGSNLTTPEWINCIHRLFPQQVIERLESDAVLRYGIEDVVTNLDVLERMQPSESLLRA VLHTKHLMNPEVLAAARRIVRQVVEEIMARLAKEVRQAFSGVRDRRRRSFIPLARNFDFK STLRANLQHWHPQHGKLYIESPRFNSRIKRQSEQWQLVLLVDQSGSMVDSVIHSAVMAAC LWQLPGIRTHLVAFDTSVVDLTADVADPVELLMKVQLGGGTNIASAVEYGRQLIEQPAKS VIILVSDFYEGGSSSLLTHQVKKCVQSGIKVLGLAALDSTATPCYDRDTAQALVNVGAQI AAMTPGELASWLAENLQS >gi|296494683|gb|ADTN01000055.1| GENE 10 14748 - 16751 1005 667 aa, chain + ## HITS:1 COG:ECs2931 KEGG:ns NR:ns ## COG: ECs2931 COG2801 # Protein_GI_number: 15832185 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 667 1 667 745 1274 97.0 0 MNSLRPELLELTPQALTALSNAGFVKRSLKELENGNVPEISHENGALIATFSDGVRTQLA NGQALKEAQCSCGASGMCRHRVMLVLSYQRLCATVQPTEKEEEWDPAIWLEELATLPDAI RKRAQALVAKGITIELFCTPGEIPSARLPMSDVRFYSRSSIRFARCDCIEGTLCEHVVLA VQAFVQAKAQQAELTHLIWQMRSEHVTSSDDLFASEEGKTCRQYVQQLSQALWLGGISQP LIHYEAAFSRAQQAAERCNWRWVSESLRQLRASVDAFHARASHYHAGECLRQLAALNSRL NCAQEMARRDSVGEVPPVPWRTVVGSGIAGEAKLDHLRLVSLGMRCWQDIEHYGLRIWFT DPDTGSILHLSRSWPRSEQENSPAATRRLFSFQAGALAGGQIVSQAAKRSADGELLLATR NRLSSVVPLSPDAWQMLSAPLRQSGIVALREYLRQRPPSCIRPLNQVDNLFILPVAECIS LGWDSSRQTLDAQVISGEGEDNLLTLSLPASASAPYAIERMAALLQQTDDPVCLVSGFVS FVDGQLTLEPQVMMTKTRAWALDAETAPVVASLPSVSVLPVPSTAHQLLMRCQALLIQLL HNGWRYQEQSAIGQAELLANDLTAVGFYRLAHVLGQFRNTESEARVEAMNNCVLLCEQLF PMLQQQG 
>gi|296494683|gb|ADTN01000055.1| GENE 11 16876 - 17337 550 153 aa, chain + ## HITS:1 COG:ECs2934 KEGG:ns NR:ns ## COG: ECs2934 COG4808 # Protein_GI_number: 15832188 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 153 5 157 157 245 100.0 3e-65 MKAFNKLFSLVVASVLVFSLAGCGDKEESKKFSANLNGTEIAITYVYKGDKVLKQSSETK IQFASIGATTKEDAAKTLEPLSAKYKNIAGVEEKLTYTDTYAQENVTIDMEKVDFKALQG ISGINVSAEDAKKGITMAQMELVMKAAGFKEVK Prediction of potential genes in microbial genomes Time: Sun May 15 23:16:24 2011 Seq name: gi|296494682|gb|ADTN01000056.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont146.3, whole genome shotgun sequence Length of sequence - 77455 bp Number of predicted genes - 70, with homology - 67 Number of transcription units - 41, operones - 20 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.684 - CDS 31 - 501 466 ## COG4807 Uncharacterized protein conserved in bacteria 2 1 Op 2 9/0.053 - CDS 548 - 1267 831 ## COG3279 Response regulator of the LytR/AlgR family 3 1 Op 3 . - CDS 1264 - 2949 1136 ## COG3275 Putative regulator of cell autolysis - Prom 3061 - 3120 3.6 + Prom 3025 - 3084 4.6 4 2 Op 1 . + CDS 3171 - 3902 571 ## COG0789 Predicted transcriptional regulators 5 2 Op 2 . + CDS 3962 - 4069 231 ## + Term 4209 - 4248 -0.7 - Term 3762 - 3801 3.2 6 3 Op 1 24/0.000 - CDS 4050 - 4781 844 ## COG1174 ABC-type proline/glycine betaine transport systems, permease component 7 3 Op 2 24/0.000 - CDS 4786 - 5712 1194 ## COG1125 ABC-type proline/glycine betaine transport systems, ATPase components 8 3 Op 3 13/0.053 - CDS 5705 - 6862 1279 ## COG1174 ABC-type proline/glycine betaine transport systems, permease component 9 3 Op 4 4/0.421 - CDS 6869 - 7786 1146 ## COG1732 Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) - Prom 7967 - 8026 5.1 - Term 7985 - 8033 3.0 10 4 Tu 1 . 
- CDS 8038 - 10335 2878 ## COG1472 Beta-glucosidase-related glycosidases - Prom 10481 - 10540 5.9 + Prom 10439 - 10498 6.1 11 5 Tu 1 . + CDS 10531 - 12246 1643 ## COG0277 FAD/FMN-containing dehydrogenases + Term 12312 - 12340 1.3 - Term 12240 - 12276 4.3 12 6 Tu 1 . - CDS 12284 - 13177 1005 ## COG1686 D-alanyl-D-alanine carboxypeptidase - Prom 13287 - 13346 5.1 - Term 13331 - 13376 11.2 13 7 Tu 1 . - CDS 13390 - 13977 595 ## APECO1_4414 hypothetical protein - Prom 14010 - 14069 3.1 + Prom 13987 - 14046 5.2 14 8 Tu 1 . + CDS 14147 - 14725 647 ## COG0586 Uncharacterized membrane-associated protein + Term 14788 - 14814 -1.0 15 9 Op 1 2/0.684 - CDS 14855 - 15616 204 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 16 9 Op 2 . - CDS 15669 - 17105 1164 ## COG1538 Outer membrane protein - Prom 17242 - 17301 5.8 + Prom 17246 - 17305 6.2 17 10 Tu 1 . + CDS 17329 - 17412 113 ## + Term 17527 - 17571 -1.0 18 11 Tu 1 . - CDS 17744 - 18691 933 ## COG0042 tRNA-dihydrouridine synthase - Prom 18713 - 18772 5.0 + Prom 18850 - 18909 3.9 19 12 Op 1 23/0.000 + CDS 18930 - 19328 404 ## COG1380 Putative effector of murein hydrolase LrgA 20 12 Op 2 5/0.158 + CDS 19325 - 20020 696 ## COG1346 Putative effector of murein hydrolase + Prom 20065 - 20124 4.5 21 13 Tu 1 . + CDS 20150 - 21034 1025 ## COG0295 Cytidine deaminase + Term 21102 - 21130 1.6 + Prom 21094 - 21153 3.7 22 14 Op 1 . + CDS 21184 - 21903 745 ## COG2949 Uncharacterized membrane protein 23 14 Op 2 . + CDS 21906 - 22145 250 ## G2583_2688 hypothetical protein + Term 22197 - 22238 1.2 + Prom 22147 - 22206 5.1 24 15 Op 1 4/0.421 + CDS 22339 - 23577 1046 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases 25 15 Op 2 . 
+ CDS 23571 - 24806 1192 ## COG0167 Dihydroorotate dehydrogenase + Term 24910 - 24953 3.9 - Term 24804 - 24849 0.3 26 16 Op 1 10/0.053 - CDS 25049 - 26059 1431 ## COG4211 ABC-type glucose/galactose transport system, permease component 27 16 Op 2 16/0.000 - CDS 26075 - 27595 1602 ## COG1129 ABC-type sugar transport system, ATPase component - Term 27612 - 27647 6.1 28 16 Op 3 6/0.158 - CDS 27656 - 28654 1396 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 28829 - 28888 5.2 - Term 28893 - 28922 -0.2 29 17 Tu 1 . - CDS 28934 - 29974 1025 ## COG1609 Transcriptional regulators - Prom 30017 - 30076 1.9 30 18 Op 1 4/0.421 - CDS 30116 - 31273 1083 ## COG2311 Predicted membrane protein 31 18 Op 2 . - CDS 31290 - 31958 766 ## COG0302 GTP cyclohydrolase I - Prom 32075 - 32134 4.7 + Prom 31913 - 31972 3.0 32 19 Tu 1 . + CDS 32216 - 33052 763 ## COG0627 Predicted esterase + Term 33059 - 33086 1.5 - Term 33045 - 33075 2.1 33 20 Op 1 . - CDS 33084 - 35075 1975 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins 34 20 Op 2 . - CDS 35068 - 35157 57 ## - Prom 35278 - 35337 7.1 - Term 35316 - 35347 4.1 35 21 Op 1 . - CDS 35369 - 36838 1942 ## COG0833 Amino acid transporters - Prom 36866 - 36925 5.3 36 21 Op 2 . - CDS 37043 - 37924 204 ## PROTEIN SUPPORTED gi|149913192|ref|ZP_01901726.1| 50S ribosomal protein L35 - Prom 37944 - 38003 9.1 + Prom 37913 - 37972 9.8 37 22 Op 1 5/0.158 + CDS 38023 - 39072 1235 ## COG2855 Predicted membrane protein 38 22 Op 2 3/0.579 + CDS 39146 - 40003 924 ## COG0648 Endonuclease IV 39 22 Op 3 . + CDS 40006 - 41094 768 ## COG0524 Sugar kinases, ribokinase family 40 23 Op 1 3/0.579 - CDS 41201 - 42451 1424 ## COG1972 Nucleoside permease - Term 42494 - 42537 7.5 41 23 Op 2 . - CDS 42551 - 43492 1118 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase - Prom 43558 - 43617 3.5 + Prom 43335 - 43394 6.1 42 24 Tu 1 . 
+ CDS 43622 - 44320 518 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 43 25 Tu 1 . - CDS 44391 - 45614 1472 ## COG1972 Nucleoside permease - Prom 45639 - 45698 1.7 44 26 Op 1 8/0.053 - CDS 45735 - 46673 1168 ## COG2313 Uncharacterized enzyme involved in pigment biosynthesis 45 26 Op 2 4/0.421 - CDS 46661 - 47602 574 ## COG0524 Sugar kinases, ribokinase family - Prom 47788 - 47847 4.2 - Term 47971 - 48015 11.0 46 27 Op 1 19/0.000 - CDS 48025 - 49716 1991 ## COG1299 Phosphotransferase system, fructose-specific IIC component 47 27 Op 2 11/0.053 - CDS 49733 - 50671 994 ## COG1105 Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) 48 27 Op 3 . - CDS 50671 - 51801 1304 ## COG4668 Mannitol/fructose-specific phosphotransferase system, IIA domain - Prom 51902 - 51961 3.5 + Prom 51961 - 52020 3.7 49 28 Tu 1 . + CDS 52169 - 53350 977 ## COG0477 Permeases of the major facilitator superfamily 50 29 Tu 1 . - CDS 53347 - 53601 170 ## COG0727 Predicted Fe-S-cluster oxidoreductase - Prom 53767 - 53826 3.4 51 30 Tu 1 . + CDS 53756 - 54328 810 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) + Term 54340 - 54388 5.0 + Prom 54461 - 54520 6.3 52 31 Tu 1 . + CDS 54551 - 56017 1231 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases + Term 56028 - 56079 2.2 53 32 Op 1 3/0.579 + CDS 56135 - 57121 777 ## COG0523 Putative GTPases (G3E family) 54 32 Op 2 3/0.579 + CDS 57160 - 57873 420 ## COG0671 Membrane-associated phospholipid phosphatase + Prom 58167 - 58226 5.5 55 33 Tu 1 . 
+ CDS 58285 - 58851 358 ## PROTEIN SUPPORTED gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 + Term 58897 - 58934 6.4 + Prom 58942 - 59001 4.1 56 34 Op 1 2/0.684 + CDS 59032 - 60588 1151 ## COG2200 FOG: EAL domain 57 34 Op 2 11/0.053 + CDS 60664 - 62484 1390 ## COG4166 ABC-type oligopeptide transport system, periplasmic component 58 34 Op 3 11/0.053 + CDS 62485 - 63579 1362 ## COG4174 ABC-type uncharacterized transport system, permease component 59 34 Op 4 11/0.053 + CDS 63579 - 64604 1139 ## COG4239 ABC-type uncharacterized transport system, permease component 60 34 Op 5 . + CDS 64606 - 66195 332 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 - Term 66116 - 66163 3.6 61 35 Tu 1 . - CDS 66199 - 66543 187 ## EC55989_2435 hypothetical protein - Prom 66772 - 66831 5.2 - Term 66831 - 66860 0.5 62 36 Op 1 8/0.053 - CDS 66876 - 68066 1220 ## COG0477 Permeases of the major facilitator superfamily 63 36 Op 2 . - CDS 68094 - 68789 794 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases - Prom 68850 - 68909 3.5 + Prom 68820 - 68879 2.7 64 37 Op 1 . + CDS 68938 - 70698 1785 ## COG1061 DNA or RNA helicases of superfamily II 65 37 Op 2 . + CDS 70727 - 71107 633 ## PROTEIN SUPPORTED gi|26108973|gb|AAN81176.1|AE016763_135 50S ribosomal protein L25 + Term 71130 - 71157 1.5 - Term 71108 - 71156 4.6 66 38 Tu 1 . - CDS 71246 - 72253 1253 ## COG3081 Nucleoid-associated protein - Prom 72319 - 72378 6.9 + Prom 72304 - 72363 4.6 67 39 Op 1 8/0.053 + CDS 72435 - 72662 298 ## COG3082 Uncharacterized protein conserved in bacteria 68 39 Op 2 . + CDS 72682 - 74442 1660 ## COG3083 Predicted hydrolase of alkaline phosphatase superfamily + Term 74604 - 74645 4.0 + TRNA 74517 - 74593 78.0 # Pro GGG 0 0 - Term 74646 - 74681 4.2 69 40 Tu 1 . - CDS 74696 - 76687 1312 ## COG3468 Type V secretory pathway, adhesin AidA 70 41 Tu 1 . 
- CDS 76792 - 77286 151 ## COG3468 Type V secretory pathway, adhesin AidA - Prom 77329 - 77388 2.8 Predicted protein(s) >gi|296494682|gb|ADTN01000056.1| GENE 1 31 - 501 466 156 aa, chain - ## HITS:1 COG:ECs2935 KEGG:ns NR:ns ## COG: ECs2935 COG4807 # Protein_GI_number: 15832189 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 156 1 156 156 300 100.0 7e-82 MLSNDILRSVRYILKANNNDLVRILALGNVEATAEQIAVWLRKEDEEGFQRCPDIVLSSF LNGLIYEKRGKDESAPALEPERRINNNIVLKKLRIAFSLKTDDILAILTEQQFRVSMPEI TAMMRAPDHKNFRECGDQFLRYFLRGLAARQHVKKS >gi|296494682|gb|ADTN01000056.1| GENE 2 548 - 1267 831 239 aa, chain - ## HITS:1 COG:ECs2936 KEGG:ns NR:ns ## COG: ECs2936 COG3279 # Protein_GI_number: 15832190 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Escherichia coli O157:H7 # 1 239 6 244 244 459 100.0 1e-129 MIKVLIVDDEPLARENLRVFLQEQSDIEIVGECSNAVEGIGAVHKLRPDVLFLDIQMPRI SGLEMVGMLDPEHRPYIVFLTAFDEYAIKAFEEHAFDYLLKPIDEARLEKTLARLRQERS KQDVSLLPENQQALKFIPCTGHSRIYLLQMKDVAFVSSRMSGVYVTSHEGKEGFTELTLR TLESRTPLLRCHRQYLVNLAHLQEIRLEDNGQAELILRNGLTVPVSRRYLKSLKEAIGL >gi|296494682|gb|ADTN01000056.1| GENE 3 1264 - 2949 1136 561 aa, chain - ## HITS:1 COG:yehU KEGG:ns NR:ns ## COG: yehU COG3275 # Protein_GI_number: 16130064 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Escherichia coli K12 # 1 561 1 561 561 1078 100.0 0 MYDFNLVLLLLQQMCVFLVIAWLMSKTPLFIPLMQVTVRLPHKFLCYIVFSIFCIMGTWF GLHIDDSIANTRAIGAVMGGLLGGPVVGGLVGLTGGLHRYSMGGMTALSCMISTIVEGLL GGLVHSILIRRGRTDKVFNPITAGAVTFVAEMVQMLIILAIARPYEDAVRLVSNIAAPMM VTNTVGAALFMRILLDKRAMFEKYTSAFSATALKVAASTEGILRQGFNEVNSMKVAQVLY QELDIGAVAITDREKLLAFTGIGDDHHLPGKPISSTYTLKAIETGEVVYADGNEVPYRCS LHPQCKLGSTLVIPLRGENQRVMGTIKLYEAKNRLFSSINRTLGEGIAQLLSAQILAGQY ERQKAMLTQSEIKLLHAQVNPHFLFNALNTIKAVIRRDSEQASQLVQYLSTFFRKNLKRP SEFVTLADEIEHVNAYLQIEKARFQSRLQVNIAIPQELSQQQLPAFTLQPIVENAIKHGT 
SQLLDTGRVAISARREGQHLMLEIEDNAGLYQPVTNASGLGMNLVDKRLRERFGDDYGIS VACEPDSYTRITLRLPWRDEA >gi|296494682|gb|ADTN01000056.1| GENE 4 3171 - 3902 571 243 aa, chain + ## HITS:1 COG:yehV KEGG:ns NR:ns ## COG: yehV COG0789 # Protein_GI_number: 16130065 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 243 1 243 243 463 98.0 1e-130 MALYTIGEVALLCDINPVTLRAWQRRYGLLKPQRTDGGHRLFNDADIDRIREIKRWIDNG VQVSKVKMLLSNENVDVQNGWRDQQETLLTYLQSGNLHSLRTWIKERGQDYPAQTLTTHL FIPLRRRLQCQQPTLQALLAILDGVLINYIAICLASARKKKGKDALVVGWNIHDTTRLWL EGWIASQQGWRIDVLAHSLNQLRPELFEGRTLLVWCGENRTSAQQQQLTSWQEQGYDIFP LGI >gi|296494682|gb|ADTN01000056.1| GENE 5 3962 - 4069 231 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRIAKIGVIALFLFMALGGIGGVMLAGYTFILRAG >gi|296494682|gb|ADTN01000056.1| GENE 6 4050 - 4781 844 243 aa, chain - ## HITS:1 COG:ECs3015 KEGG:ns NR:ns ## COG: ECs3015 COG1174 # Protein_GI_number: 15832269 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport systems, permease component # Organism: Escherichia coli O157:H7 # 1 243 1 243 243 380 100.0 1e-105 MKMLRDPLFWLIALFVALIFWLPYSQPLFAALFPQLPRPVYQQESFAALALAHFWLVGIS SLFAVIIGTGAGIAVTRPWGAEFRPLVETIAAVGQTFPPVAVLAIAVPVIGFGLQPAIIA LILYGVLPVLQATLAGLGAIDASVTEVAKGMGMSRGQRLRKVELPLAAPVILAGVRTSVI INIGTATIASTVGASTLGTPIIIGLSGFNTAYVIQGALLVALAAIIADRLFERLVQALSQ HAK >gi|296494682|gb|ADTN01000056.1| GENE 7 4786 - 5712 1194 308 aa, chain - ## HITS:1 COG:ECs3016 KEGG:ns NR:ns ## COG: ECs3016 COG1125 # Protein_GI_number: 15832270 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport systems, ATPase components # Organism: Escherichia coli O157:H7 # 1 308 1 308 308 580 99.0 1e-166 MIEFSHVSKLFGAQKAVNDLNLNFQEGSFSVLIGTSGSGKSTTLKMINRLVEHDSGEIRF AGEEIRSLPVLELRRRMGYAIQSIGLFPHWSVAQNIATVPQLQKWSRARIDDRIDELMAL LGLESNLRERYPHQLSGGQQQRVGVARALAADPQVLLMDEPFGALDPVTRGALQQEMTRI HRLLGRTIVLVTHDIDEALRLAEHLVLMDHGEVVQQGNPLTMLTRPANDFVRQFFGRSEL 
GVRLLSLRSVADYVRREERAEGEALAEEMTLRDALSLFVARGCEVLPVVNTQGQPSGTLH FQDLLEEA >gi|296494682|gb|ADTN01000056.1| GENE 8 5705 - 6862 1279 385 aa, chain - ## HITS:1 COG:ECs3017 KEGG:ns NR:ns ## COG: ECs3017 COG1174 # Protein_GI_number: 15832271 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport systems, permease component # Organism: Escherichia coli O157:H7 # 1 385 1 385 385 603 99.0 1e-172 MTYLRINPVLALLLLLTAIAAALPFISYAPNRLVSGEGRHLWQLWPQTIWMLVGVGCAWL TACFIPAKKGSIFALILAQFVFVLLVWGTGKAATQLAQNGSALARTSLGSGFWLAAALAL LACSDAIRRISTHPLWRWLLHMQIAIIPLWLLYSGTLNDLSLMKEYANRQDVFDDALAQH LTLLFGAVLPALVIGVPLGIWCYFSTARQGAIFSLLNVIQTVPSVALFGLLIAPLAALVT AFPWLGKLGIAGTGMTPALIALVLYALLPLVRGVVVGLNQIPRDVLESARAMGMSGAQRF LHVQLPLALPVFLRSLRVVMVQTVGMAVIAALIGAGGFGALVFQGLLSSAIDLVLLGVIP VIVLAVLIDALFDLLIALLKVKRND >gi|296494682|gb|ADTN01000056.1| GENE 9 6869 - 7786 1146 305 aa, chain - ## HITS:1 COG:ECs3018 KEGG:ns NR:ns ## COG: ECs3018 COG1732 # Protein_GI_number: 15832272 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic glycine betaine/choline-binding (lipo)protein of an ABC-type transport system (osmoprotectant binding protein) # Organism: Escherichia coli O157:H7 # 1 305 1 305 305 546 98.0 1e-155 MPLSKVWAGSLVLLAAVSLPLHAASPVKVGSKIDTEGALLGNIILQVLESHGVPTVNKVQ LGTTPVVRGAITSGELDIYPEYTGNGAFFFKDENDAAWKNAQQGYEKVKKLDAEQNKLIW LTPAPANNTWTIAVRQDVAEKNKLTSLADLSRYLKEGGTFKLAASAEFIERADALPAFEK AYGFKLGQDQLLSLAGGDTAVTIKAAAQQTSGVNAAMAYGTDGPVAALGLQTLSDPQGVQ PIYAPAPVVRESVLKEYPQMAQWLQPVFASLDAKTLQQLNASIAVEGLDAKKVAADYLKQ KGWAK >gi|296494682|gb|ADTN01000056.1| GENE 10 8038 - 10335 2878 765 aa, chain - ## HITS:1 COG:bglX KEGG:ns NR:ns ## COG: bglX COG1472 # Protein_GI_number: 16130070 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Escherichia coli K12 # 1 765 1 765 765 1510 99.0 0 MKWLCSVGIAVSLALQPALADDLFGNHPLTPEARDAFVTELLKKMTVDEKIGQLRLISVG 
PDNPKEAIREMIKDGQVGAIFNTVTRQDIRAMQDQVMELSRLKIPLFFAYDVLHGQRTVF PISLGLASSFNLDAVKTVGRVSAYEAADDGLNMTWAPMVDVSRDPRWGRASEGFGEDTYL TSIMGKTMVEAMQGKSPADRYSVMTSVKHFAAYGAVEGGKEYNTVDMSPQRLFNDYMPPY KAGLDAGSGAVMVALNSLNGTPATSDSWLLKDVLRDQWGFKGITVSDHGAIKELIKHGTA ADPEDAVRVALKSGINMSMSDEYYSKYLPGLIKSGKVTMAELDDAARHVLNVKYDMGLFN DPYSHLGPKESDPVDTNAESRLHRKEAREVARESLVLLKNRLETLPLKKSATIAVVGPLA DSKRDVMGSWSAAGVADQSVTVLTGIKNAVGENGKVLYAKGANVTSDKGIIDFLNQYEEA VKVDPRSPQEMIDEAVQTAKQSDVVVAVVGEAQGMAHEASSRTDITIPQSQRDLIAALKA TGKPLVLVLMNGRPLALVKEDQQADAILETWFAGTEGGNAIADVLFGDYNPSGKLPMSFP RSVGQIPVYYSHLNTGRPYNADKPNKYTSRYFDEANGALYPFGYGLSYTTFTVSDVKLSA PTMKRDGKVTASVQVTNTGKREGATVVQMYLQDVTASMSRPVKQLKGFEKITLKPGETQT VSFPIDIEALKFWNQQMKYDAEPGKFNVFIGTDSARVKKGEFELL >gi|296494682|gb|ADTN01000056.1| GENE 11 10531 - 12246 1643 571 aa, chain + ## HITS:1 COG:dld KEGG:ns NR:ns ## COG: dld COG0277 # Protein_GI_number: 16130071 # Func_class: C Energy production and conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Escherichia coli K12 # 1 571 1 571 571 1173 100.0 0 MSSMTTTDNKAFLNELARLVGSSHLLTDPAKTARYRKGFRSGQGDALAVVFPGSLLELWR VLKACVTADKIILMQAANTGLTEGSTPNGNDYDRDVVIISTLRLDKLHVLGKGEQVLAYP GTTLYSLEKALKPLGREPHSVIGSSCIGASVIGGICNNSGGSLVQRGPAYTEMSLFARIN EDGKLTLVNHLGIDLGETPEQILSKLDDDRIKDDDVRHDGRHAHDYDYVHRVRDIEADTP ARYNADPDRLFESSGCAGKLAVFAVRLDTFEAEKNQQVFYIGTNQPEVLTEIRRHILANF ENLPVAGEYMHRDIYDIAEKYGKDTFLMIDKLGTDKMPFFFNLKGRTDAMLEKVKFFRPH FTDRAMQKFGHLFPSHLPPRMKNWRDKYEHHLLLKMAGDGVGEAKSWLVDYFKQAEGDFF VCTPEEGSKAFLHRFAAAGAAIRYQAVHSDEVEDILALDIALRRNDTEWYEHLPPEIDSQ LVHKLYYGHFMCYVFHQDYIVKKGVDVHALKEQMLELLQQRGAQYPAEHNVGHLYKAPET LQKFYRENDPTNSMNPGIGKTSKRKNWQEVE >gi|296494682|gb|ADTN01000056.1| GENE 12 12284 - 13177 1005 297 aa, chain - ## HITS:1 COG:pbpG KEGG:ns NR:ns ## COG: pbpG COG1686 # Protein_GI_number: 16130072 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Escherichia coli K12 # 1 297 17 313 313 507 100.0 1e-143 
MLAVPFAPQAVAKTAAATTASQPEIASGSAMIVDLNTNKVIYSNHPDLVRPIASISKLMT AMVVLDARLPLDEKLKVDISQTPEMKGVYSRVRLNSEISRKDMLLLALMSSENRAAASLA HHYPGGYKAFIKAMNAKAKSLGMNNTRFVEPTGLSVHNVSTARDLTKLLIASKQYPLIGQ LSTTREDMATFSNPTYTLPFRNTNHLVYRDNWNIQLTKTGFTNAAGHCLVMRTVINNKPV ALVVMDAFGKYTHFADASRLRTWIETGKVMPVPAAALSYKKQKAAQMAAAGQTAQND >gi|296494682|gb|ADTN01000056.1| GENE 13 13390 - 13977 595 195 aa, chain - ## HITS:1 COG:no KEGG:APECO1_4414 NR:ns ## KEGG: APECO1_4414 # Name: yohC # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 195 9 203 203 338 100.0 9e-92 MSHVWGLFSHPDREMQVINRENETISHHYTHHVLLMAAIPVICAFIGTTQIGWNFGDGTI LKLSWFTGLALAVLFYGVMLAGVAVMGRVIWWMARNYPQRPSLAHCMVFAGYVATPLFLS GLVALYPLVWLCALVGTVALFYTGYLLYLGIPSFLNINKEEGLSFSSSTLAIGVLVLEVL LALTVILWGYGYRLF >gi|296494682|gb|ADTN01000056.1| GENE 14 14147 - 14725 647 192 aa, chain + ## HITS:1 COG:yohD KEGG:ns NR:ns ## COG: yohD COG0586 # Protein_GI_number: 16130074 # Func_class: S Function unknown # Function: Uncharacterized membrane-associated protein # Organism: Escherichia coli K12 # 1 192 13 204 204 333 100.0 9e-92 MDLNTLISQYGYAALVIGSLAEGETVTLLGGVAAHQGLLKFPLVVLSVALGGMIGDQVLY LCGRRFGGKLLRRFSKHQDKIERAQKLIQRHPYLFVIGTRFMYGFRVIGPTLIGASQLPP KIFLPLNILGAFAWALIFTTIGYAGGQVIAPWLHNLDQHLKHWVWLILVVVLVVGVRWWL KRRGKKKPDHQA >gi|296494682|gb|ADTN01000056.1| GENE 15 14855 - 15616 204 253 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 57 241 52 238 242 83 32 4e-15 MAQVAIITASDSGIGKECALLLAQQGFDIGITWHSDEEGAKDTAREVVSHGVRAEIVQLD LGNLPEGALALEKLIQRLGRIDVLVNNAGAMTKAPFLDMAFDEWRKIFTVDVDGAFLCSQ IAARQMVKQGQGGRIINITSVHEHTPLPDASAYTAAKHALGGLTKAMALELVRHKILVNA VAPGAIATPMNGMDDSDVKPDAEPSIPLRRFGATHEIASLVVWLCSEGANYTTGQSLIVD GGFMLANPQFNPE >gi|296494682|gb|ADTN01000056.1| GENE 16 15669 - 17105 1164 478 aa, chain - ## HITS:1 COG:ECs3025 KEGG:ns NR:ns ## COG: ECs3025 COG1538 # Protein_GI_number: 15832279 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular 
trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Escherichia coli O157:H7 # 1 478 28 505 505 876 99.0 0 MNRDSFYPAIACFPLLLMLAGCAPMHETRQALSQQTPAAQVDTALPTALKNGWPDSQWWL EYHDNQLTSLINNALQNAPDMQVAEQRIQLAEAQAKAVATQDGPQIDFSADMERQKMSAE GLMGPFALNDPAAGTTGPWYTNGTFGLTAGWHLDIWGKNRAEVTARLGTVKARAAEREQT RQLLAGSVARLYWEWQTQAALNTVLQQIEKEQNTIIATDRQLYQNGITSSVEGVETDINA SKTRQQLNDVAGKMKIIEARLSALTNNQTKSLKLKPVALPKVASQLPDELGYSLLARRAD LQAAHWYVESSLSTIDAAKAAFYPDINLMAFLQQDALHLSDLFRHSAQQMGVTAGLTLPI FDSGRLNANLDIAKAESNLSIASYNKAVVEAVNDVARAASQVQTLAEKNQHQAQIERDAL RVVGLAQARFNAGIIAGSRVSEARIPALRERANGLLLQGQWLDASIQLTGALGGGYKR >gi|296494682|gb|ADTN01000056.1| GENE 17 17329 - 17412 113 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKIILWAVLIIFLIGLLVVTGVFKMIF >gi|296494682|gb|ADTN01000056.1| GENE 18 17744 - 18691 933 315 aa, chain - ## HITS:1 COG:yohI KEGG:ns NR:ns ## COG: yohI COG0042 # Protein_GI_number: 16130078 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-dihydrouridine synthase # Organism: Escherichia coli K12 # 1 315 1 315 315 644 100.0 0 MRVLLAPMEGVLDSLVRELLTEVNDYDLCITEFVRVVDQLLPVKVFHRICPELQNASRTP SGTLVRVQLLGQFPQWLAENAARAVELGSWGVDLNCGCPSKTVNGSGGGATLLKDPELIY QGAKAMREAVPAHLPVSVKVRLGWDSGEKKFEIADAVQQAGATELVVHGRTKEQGYRAEH IDWQAIGDIRQRLNIPVIANGEIWDWQSAQQCMAISGCDAVMIGRGALNIPNLSRVVKYN EPRMPWPEVVALLQKYTRLEKQGDTGLYHVARIKQWLSYLRKEYDEATELFQHVRVLNNS PDIARAIQAIDIEKL >gi|296494682|gb|ADTN01000056.1| GENE 19 18930 - 19328 404 132 aa, chain + ## HITS:1 COG:yohJ KEGG:ns NR:ns ## COG: yohJ COG1380 # Protein_GI_number: 16130079 # Func_class: R General function prediction only # Function: Putative effector of murein hydrolase LrgA # Organism: Escherichia coli K12 # 1 132 1 132 132 218 100.0 2e-57 MSKTLNIIWQYLRAFVLIYACLYAGIFIASLLPVTIPGSIIGMLILFVLLALQILPAKWV NPGCYVLIRYMALLFVPIGVGVMQYFDLLRAQFGPVVVSCAVSTLVVFLVVSWSSQLVHG ERKVVGQKGSEE >gi|296494682|gb|ADTN01000056.1| GENE 20 19325 - 20020 696 231 aa, chain + ## HITS:1 COG:yohK KEGG:ns NR:ns ## COG: 
yohK COG1346 # Protein_GI_number: 16130080 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative effector of murein hydrolase # Organism: Escherichia coli K12 # 1 231 1 231 231 383 100.0 1e-106 MMANIWWSLPLTLIVFFAARKLAARYKFPLLNPLLVAMVVIIPFLMLTGISYDSYFKGSE VLNDLLQPAVVALAYPLYEQLHQIRARWKSIITICFIGSVVAMVTGTSVALLMGASPEIA ASILPKSVTTPIAMAVGGSIGGIPAISAVCVIFVGILGAVFGHTLLNAMRIRTKAARGLA MGTASHALGTARCAELDYQEGAFSSLALVLCGIITSLIAPFLFPIILAVMG >gi|296494682|gb|ADTN01000056.1| GENE 21 20150 - 21034 1025 294 aa, chain + ## HITS:1 COG:cdd KEGG:ns NR:ns ## COG: cdd COG0295 # Protein_GI_number: 16130081 # Func_class: F Nucleotide transport and metabolism # Function: Cytidine deaminase # Organism: Escherichia coli K12 # 1 294 1 294 294 541 100.0 1e-154 MHPRFQTAFAQLADNLQSALEPILADKYFPALLTGEQVSSLKSATGLDEDALAFALLPLA AACARTPLSNFNVGAIARGVSGTWYFGANMEFIGATMQQTVHAEQSAISHAWLSGEKALA AITVNYTPCGHCRQFMNELNSGLDLRIHLPGREAHALRDYLPDAFGPKDLEIKTLLMDEQ DHGYALTGDALSQAAIAAANRSHMPYSKSPSGVALECKDGRIFSGSYAENAAFNPTLPPL QGALILLNLKGYDYPDIQRAVLAEKADAPLIQWDATSATLKALGCHSIDRVLLA >gi|296494682|gb|ADTN01000056.1| GENE 22 21184 - 21903 745 239 aa, chain + ## HITS:1 COG:ECs3036 KEGG:ns NR:ns ## COG: ECs3036 COG2949 # Protein_GI_number: 15832290 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Escherichia coli O157:H7 # 1 239 1 239 239 447 100.0 1e-126 MLKRVFLSLLVLIGLLLLTVLGLDRWMSWKTAPYIYDELQDLPYRQVGVVLGTAKYYRTG VINQYYRYRIQGAINAYNSGKVNYLLLSGDNALQSYNEPMTMRKDLIAAGVDPSDIVLDY AGFRTLDSIVRTRKVFDTNDFIIITQRFHCERALFIALHMGIQAQCYAVPSPKDMLSVRI REFAARFGALADLYIFKREPRFLGPLVPIPAMHQVPEDAQGYPAVTPEQLLELQKKQGK >gi|296494682|gb|ADTN01000056.1| GENE 23 21906 - 22145 250 79 aa, chain + ## HITS:1 COG:no KEGG:G2583_2688 NR:ns ## KEGG: G2583_2688 # Name: yeiS # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 79 1 79 79 135 100.0 6e-31 MDVQQFFVVAVFFLIPIFCFREAWKGWRAGAIDKRVKNAPEPVYVWRAKNPGLFFAYMVA YIGFGILSIGMIVYLIFYR >gi|296494682|gb|ADTN01000056.1| GENE 24 22339 
- 23577 1046 412 aa, chain + ## HITS:1 COG:yeiT KEGG:ns NR:ns ## COG: yeiT COG0493 # Protein_GI_number: 16130084 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Escherichia coli K12 # 1 412 1 412 412 819 99.0 0 MPQQNYLDELTPAFTSLLAIKEASRCLLCHDAPCSQACPAQTDPGKFIRSIYFRNFKGAA ETIRENNALGAVCARVCPTEKLCQSGCTRAGVDAPIDIGRLQRFVTDFEQQTGMEIYQPG TKTLGKVAIIGAGPAGLQASVTLTNQGYDVTIYEKEAHPGGWLRNGIPQFRLPQSVLDAE IARIEKIGVTIKCNNEVGNTLTLEQLKAENRAVLVTVGLSSGSGLPLFEHSDVEIAVDFL QRARQAQGDISIPQSALIIGGGDVAMDVASTLKVLGCQAVTCVAREELDEFPASEKEFTS ARELGVSIIDGFTPVAVEGNKVTFKHVRLSGELTMAADKIILAVGQHARLDAFAELEPQR NTIKTQNYQTRDPQVFAAGDIVEGDKTVVYAVKTGKEAAEAIHHYLEGACSC >gi|296494682|gb|ADTN01000056.1| GENE 25 23571 - 24806 1192 411 aa, chain + ## HITS:1 COG:yeiA_1 KEGG:ns NR:ns ## COG: yeiA_1 COG0167 # Protein_GI_number: 16130085 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Escherichia coli K12 # 1 326 3 328 328 685 100.0 0 MLTKDLSITFCGVKFPNPFCLSSSPVGNCYEMCAKAYDTGWGGVVFKTIGFFIANEVSPR FDHLVKEDTGFIGFKNMEQIAEHPLEENLAALRRLKEDYPDKVLIASIMGENEQQWEELA RLVQEAGADMIECNFSCPQMTSHAMGSDVGQSPELVEKYCRAVKRGSTLPMLAKMTPNIG DMCEVALAAKRGGADGIAAINTVKSITNIDLNQKIGMPIVNGKSSISGYSGKAVKPIALR FIQQMRTHPELRDFPISGIGGIETWEDAAEFLLLGAATLQVTTGIMQYGYRIVEDMASGL SHYLADQGFDSLQEMVGLANNNIVPAEDLDRSYIVYPRINLDKCVGCGRCYISCYDGGHQ AMEWSEKTRTPHCNTEKCVGCLLCGHVCPVGCIELGEVKFKKGEKEHPVTL >gi|296494682|gb|ADTN01000056.1| GENE 26 25049 - 26059 1431 336 aa, chain - ## HITS:1 COG:mglC KEGG:ns NR:ns ## COG: mglC COG4211 # Protein_GI_number: 16130086 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type glucose/galactose transport system, permease component # Organism: Escherichia coli K12 # 1 336 1 336 336 486 100.0 1e-137 MSALNKKSFLTYLKEGGIYVVLLVLLAIIIFQDPTFLSLLNLSNILTQSSVRIIIALGVA GLIVTQGTDLSAGRQVGLAAVVAATLLQSMDNANKVFPEMATMPIALVILIVCAIGAVIG 
LINGLIIAYLNVTPFITTLGTMIIVYGINSLYYDFVGASPISGFDSGFSTFAQGFVALGS FRLSYITFYALIAVAFVWVLWNKTRFGKNIFAIGGNPEAAKVSGVNVGLNLLMIYALSGV FYAFGGMLEAGRIGSATNNLGFMYELDAIAACVVGGVSFSGGVGTVIGVVTGVIIFTVIN YGLTYIGVNPYWQYIIKGAIIIFAVALDSLKYARKK >gi|296494682|gb|ADTN01000056.1| GENE 27 26075 - 27595 1602 506 aa, chain - ## HITS:1 COG:mglA KEGG:ns NR:ns ## COG: mglA COG1129 # Protein_GI_number: 16130087 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Escherichia coli K12 # 1 506 1 506 506 998 100.0 0 MVSSTTPSSGEYLLEMSGINKSFPGVKALDNVNLKVRPHSIHALMGENGAGKSTLLKCLF GIYQKDSGTILFQGKEIDFHSAKEALENGISMVHQELNLVLQRSVMDNMWLGRYPTKGMF VDQDKMYRETKAIFDELDIDIDPRARVGTLSVSQMQMIEIAKAFSYNAKIVIMDEPTSSL TEKEVNHLFTIIRKLKERGCGIVYISHKMEEIFQLCDEVTVLRDGQWIATEPLAGLTMDK IIAMMVGRSLNQRFPDKENKPGEVILEVRNLTSLRQPSIRDVSFDLHKGEILGIAGLVGA KRTDIVETLFGIREKSAGTITLHGKQINNHNANEAINHGFALVTEERRSTGIYAYLDIGF NSLISNIRNYKNKVGLLDNSRMKSDTQWVIDSMRVKTPGHRTQIGSLSGGNQQKVIIGRW LLTQPEILMLDEPTRGIDVGAKFEIYQLIAELAKKGKGIIIISSEMPELLGITDRILVMS NGLVSGIVDTKTTTQNEILRLASLHL >gi|296494682|gb|ADTN01000056.1| GENE 28 27656 - 28654 1396 332 aa, chain - ## HITS:1 COG:mglB KEGG:ns NR:ns ## COG: mglB COG1879 # Protein_GI_number: 16130088 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli K12 # 1 332 1 332 332 587 100.0 1e-167 MNKKVLTLSAVMASMLFGAAAHAADTRIGVTIYKYDDNFMSVVRKAIEQDAKAAPDVQLL MNDSQNDQSKQNDQIDVLLAKGVKALAINLVDPAAAGTVIEKARGQNVPVVFFNKEPSRK ALDSYDKAYYVGTDSKESGIIQGDLIAKHWAANQGWDLNKDGQIQFVLLKGEPGHPDAEA RTTYVIKELNDKGIKTEQLQLDTAMWDTAQAKDKMDAWLSGPNANKIEVVIANNDAMAMG AVEALKAHNKSSIPVFGVDALPEALALVKSGALAGTVLNDANNQAKATFDLAKNLADGKG AADGTNWKIDNKVVRVPYVGVDKDNLAEFSKK >gi|296494682|gb|ADTN01000056.1| GENE 29 28934 - 29974 1025 346 aa, chain - ## HITS:1 COG:galS KEGG:ns NR:ns ## COG: galS COG1609 # Protein_GI_number: 16130089 # Func_class: K Transcription # Function: Transcriptional regulators # 
Organism: Escherichia coli K12 # 1 346 1 346 346 662 100.0 0 MITIRDVARQAGVSVATVSRVLNNSTLVSADTREAVMKAVSELDYRPNANAQALATQVSD TIGVVVMDVSDAFFGALVKAVDLVAQQHQKYVLIGNSYHEAEKERHAIEVLIRQRCNALI VHSKALSDDELAQFMDNIPGMVLINRVVPGYAHRCVCLDNLSGARMATRMLLNNGHQRIG YLSSSHGIEDDAMRKAGWMSALKEQDIIPPESWIGAGTPDMPGGEAAMVELLGRNLQLTA VFAYNDNMAAGALTALKDNGIAIPLHLSIIGFDDIPIARYTDPQLTTVRYPIASMAKLAT ELALQGAAGNIDPRASHCFMPTLVRRHSVATRQNAAAITNSTNQAM >gi|296494682|gb|ADTN01000056.1| GENE 30 30116 - 31273 1083 385 aa, chain - ## HITS:1 COG:Z3408 KEGG:ns NR:ns ## COG: Z3408 COG2311 # Protein_GI_number: 15802708 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 EDL933 # 1 385 44 428 428 665 98.0 0 MERNVTLDFVRGVAILGILLLNISAFGLPKAAYLNPAWYGAITPRDAWTWAFLDLIGQVK FLTLFALLFGAGLQMLLPRGRRWIQSRLTLLVLLGFIHGLLFWDGDILLAYGLVGLICWR LVRDAPSVKSLFNTGVMLYLVGLGVLLLLGLISDSQTSRAWTPDASAILYEKYWKLHGGV EAISYRADGVGNSLLALGAQYGWQLAGMMLIGAALMRSGWLKGQFSLRHYRRTGFVLVAI GVTINLPAIALQWQLDWAYRWCAFLLQMPRELSAPFQAIGYASLFYGFWPQLSRFKLVLA IACVGRMALTNYLLQTLICTTLFYHLGLFMHFDRLELLAFVIPVWLANILFSVIWLRYFR QGPVEWLWRQLTLRAAGPAISKTSR >gi|296494682|gb|ADTN01000056.1| GENE 31 31290 - 31958 766 222 aa, chain - ## HITS:1 COG:ECs3045 KEGG:ns NR:ns ## COG: ECs3045 COG0302 # Protein_GI_number: 15832299 # Func_class: H Coenzyme transport and metabolism # Function: GTP cyclohydrolase I # Organism: Escherichia coli O157:H7 # 1 222 1 222 222 412 100.0 1e-115 MPSLSKEAALVHEALVARGLETPLRPPVHEMDNETRKSLIAGHMTEIMQLLNLDLADDSL METPHRIAKMYVDEIFSGLDYANFPKITLIENKMKVDEMVTVRDITLTSTCEHHFVTIDG KATVAYIPKDSVIGLSKINRIVQFFAQRPQVQERLTQQILIALQTLLGTNNVAVSIDAVH YCVKARGIRDATSATTTTSLGGLFKSSQNTRHEFLRAVRHHN >gi|296494682|gb|ADTN01000056.1| GENE 32 32216 - 33052 763 278 aa, chain + ## HITS:1 COG:yeiG KEGG:ns NR:ns ## COG: yeiG COG0627 # Protein_GI_number: 16130092 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Escherichia coli K12 # 1 278 1 278 278 556 100.0 1e-158 
MEMLEEHRCFEGWQQRWRHDSSTLNCPMTFSIFLPPPRDHTPPPVLYWLSGLTCNDENFT TKAGAQRVAAELGIVLVMPDTSPRGEKVANDDGYDLGQGAGFYLNATQPPWATHYRMYDY LRDELPALVQSQFNVSDRCAISGHSMGGHGALIMALKNPGKYTSVSAFAPIVNPCSVPWG IKAFSSYLGEDKNAWLEWDSCALMYASNAQDAIPTLIDQGDNDQFLADQLQPAVLAEAAR QKAWPMTLRIQPGYDHSYYFIASFIEDHLRFHAQYLLK >gi|296494682|gb|ADTN01000056.1| GENE 33 33084 - 35075 1975 663 aa, chain - ## HITS:1 COG:cirA KEGG:ns NR:ns ## COG: cirA COG4771 # Protein_GI_number: 16130093 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Escherichia coli K12 # 1 663 1 663 663 1315 100.0 0 MFRLNPFVRVGLCLSAISCAWPVLAVDDDGETMVVTASSVEQNLKDAPASISVITQEDLQ RKPVQNLKDVLKEVPGVQLTNEGDNRKGVSIRGLDSSYTLILVDGKRVNSRNAVFRHNDF DLNWIPVDSIERIEVVRGPMSSLYGSDALGGVVNIITKKIGQKWSGTVTVDTTIQEHRDR GDTYNGQFFTSGPLIDGVLGMKAYGSLAKREKDDPQNSTTTDTGETPRIEGFSSRDGNVE FAWTPNQNHDFTAGYGFDRQDRDSDSLDKNRLERQNYSVSHNGRWDYGTSELKYYGEKVE NKNPGNSSPITSESNTVDGKYTLPLTAINQFLTVGGEWRHDKLSDAVNLTGGTSSKTSAS QYALFVEDEWRIFEPLALTTGVRMDDHETYGEHWSPRAYLVYNATDTVTVKGGWATAFKA PSLLQLSPDWTSNSCRGACKIVGSPDLKPETSESWELGLYYMGEEGWLEGVESSVTVFRN DVKDRISISRTSDVNAAPGYQNFVGFETGANGRRIPVFSYYNVNKARIQGVETELKIPFN DEWKLSINYTYNDGRDVSNGENKPLSDLPFHTANGTLDWKPLALEDWSFYVSGHYTGQKR ADSATAKTPGGYTIWNTGAAWQVTKDVKLRAGVLNLGDKDLSRDDYSYNEDGRRYFMAVD YRF >gi|296494682|gb|ADTN01000056.1| GENE 34 35068 - 35157 57 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLPYARGREEDVSDNPFYFRSYLMEIWNV >gi|296494682|gb|ADTN01000056.1| GENE 35 35369 - 36838 1942 489 aa, chain - ## HITS:1 COG:lysP KEGG:ns NR:ns ## COG: lysP COG0833 # Protein_GI_number: 16130094 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 489 1 489 489 911 100.0 0 MVSETKTTEAPGLRRELKARHLTMIAIGGSIGTGLFVASGATISQAGPGGALLSYMLIGL MVYFLMTSLGELAAYMPVSGSFATYGQNYVEEGFGFALGWNYWYNWAVTIAVDLVAAQLV MSWWFPDTPGWIWSALFLGVIFLLNYISVRGFGEAEYWFSLIKVTTVIVFIIVGVLMIIG IFKGAQPAGWSNWTIGEAPFAGGFAAMIGVAMIVGFSFQGTELIGIAAGESEDPAKNIPR 
AVRQVFWRILLFYVFAILIISLIIPYTDPSLLRNDVKDISVSPFTLVFQHAGLLSAAAVM NAVILTAVLSAGNSGMYASTRMLYTLACDGKAPRIFAKLSRGGVPRNALYATTVIAGLCF LTSMFGNQTVYLWLLNTSGMTGFIAWLGIAISHYRFRRGYVLQGHDINDLPYRSGFFPLG PIFAFILCLIITLGQNYEAFLKDTIDWGGVAATYIGIPLFLIIWFGYKLIKGTHFVRYSE MKFPQNDKK >gi|296494682|gb|ADTN01000056.1| GENE 36 37043 - 37924 204 293 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149913192|ref|ZP_01901726.1| 50S ribosomal protein L35 [Roseobacter sp. AzwK-3b] # 3 241 1 243 305 83 26 4e-15 MHITLRQLEVFAEVLKSGSTTQASVMLALSQSAVSAALTDLEGQLGVQLFDRVGKRLVVN EHGRLLYPRALALLEQAVEIEQLFREDNGAIRIYASSTIGNYILPAVIARYRHDYPQLPI ELSVGNSQDVMQAVLDFRVDIGFIEGPCHSTEIISEPWLEDELVVFAAPTSPLARGPVTL EQLAAAPWILRERGSGTREIVDYLLLSHLPKFEMAMELGNSEAIKHAVRHGLGISCLSRR VIEDQLQAGTLSEVAVPLPRLMRTLWRIHHRQKHLSNALRRFLDYCDPANVPR >gi|296494682|gb|ADTN01000056.1| GENE 37 38023 - 39072 1235 349 aa, chain + ## HITS:1 COG:ECs3050 KEGG:ns NR:ns ## COG: ECs3050 COG2855 # Protein_GI_number: 15832304 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 349 1 349 349 573 100.0 1e-163 MTNITLQKQHRTLWHFIPGLALSAVITGVALWGGSIPAVAGAGFSALTLAILLGMVLGNT IYPHIWKSCDGGVLFAKQYLLRLGIILYGFRLTFSQIADVGISGIIIDVLTLSSTFLLAC FLGQKVFGLDKHTSWLIGAGSSICGAAAVLATEPVVKAEASKVTVAVATVVIFGTVAIFL YPAIYPLMSQWFSPETFGIYIGSTVHEVAQVVAAGHAISPDAENAAVISKMLRVMMLAPF LILLAARVKQLSGANSGEKSKITIPWFAILFIVVAIFNSFHLLPQSVVNMLVTLDTFLLA MAMAALGLTTHVSALKKAGAKPLLMALVLFAWLIVGGGAINYVIQSVIA >gi|296494682|gb|ADTN01000056.1| GENE 38 39146 - 40003 924 285 aa, chain + ## HITS:1 COG:ECs3051 KEGG:ns NR:ns ## COG: ECs3051 COG0648 # Protein_GI_number: 15832305 # Func_class: L Replication, recombination and repair # Function: Endonuclease IV # Organism: Escherichia coli O157:H7 # 1 285 1 285 285 582 100.0 1e-166 MKYIGAHVSAAGGLANAAIRAAEIDATAFALFTKNQRQWRAAPLTTQTIDEFKAACEKYH YTSAQILPHDSYLINLGHPVTEALEKSRDAFIDEMQRCEQLGLSLLNFHPGSHLMQISEE DCLARIAESINIALDKTQGVTAVIENTAGQGSNLGFKFEHLAAIIDGVEDKSRVGVCIDT CHAFAAGYDLRTPAECEKTFADFARTVGFKYLRGMHLNDAKSTFGSRVDRHHSLGEGNIG 
HDAFRWIMQDDRFDGIPLILETINPDIWAEEIAWLKAQQTEKAVA >gi|296494682|gb|ADTN01000056.1| GENE 39 40006 - 41094 768 362 aa, chain + ## HITS:1 COG:yeiI_2 KEGG:ns NR:ns ## COG: yeiI_2 COG0524 # Protein_GI_number: 16130098 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli K12 # 54 362 1 309 309 634 100.0 0 MNNREKEILAILRRNPLIQQNEIADMLQISRSRVAAHIMDLMRKGRIKGKGYILTEQEYC VVVGTINMDIRGMADIRYPQSASHPGTIHCSAGGVGRNIAHNLALLGRDVHLLSVIGDDF YGEMLLEETRRAGVNVSGCVRLHGQSTSTYLAIANRDDQTVLAINDTHLLEQLTPQLLNG SRDLLRHAGVVLADCNLTAEALEWVFTLADEIPVFVDTVSEFKAGKIKHWLAHIHTLKPT LPELEILWGQAITSDADRNTAVNALHQQGVQQLFVYLPDESVYCSEKDGEQFLLTAPAHT TVDSFGADDGFMAGLVYSFLEGYSFRDSARFAVACAAISRASGSLNNPTLSADNALSLVP MV >gi|296494682|gb|ADTN01000056.1| GENE 40 41201 - 42451 1424 416 aa, chain - ## HITS:1 COG:yeiJ KEGG:ns NR:ns ## COG: yeiJ COG1972 # Protein_GI_number: 16130099 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside permease # Organism: Escherichia coli K12 # 1 416 1 416 416 656 100.0 0 MDVMRSVLGMVVLLTIAFLLSVNKKKISLRTVGAALVLQVVIGGIMLWLPPGRWVAEKVA FGVHKVMAYSDAGSAFIFGSLVGPKMDTLFDGAGFIFGFRVLPAIIFVTALVSILYYIGV MGILIRILGGIFQKALNISKIESFVAVTTIFLGQNEIPAIVKPFIDRLNRNELFTAICSG MASIAGSTMIGYAALGVPVEYLLAASLMAIPGGILFARLLSPATESSQVSFNNLSFTETP PKSIIEAAATGAMTGLKIAAGVATVVMAFVAIIALINGIIGGVGGWFGFEHASLESILGY LLAPLAWVMGVDWSDANLAGSLIGQKLAINEFVAYLNFSPYLQTAGTLDAKTVAIISFAL CGFANFGSIGVVVGAFSAVAPHRAPEIAQLGLRALAAATLSNLMSATIAGFFIGLA >gi|296494682|gb|ADTN01000056.1| GENE 41 42551 - 43492 1118 313 aa, chain - ## HITS:1 COG:yeiK KEGG:ns NR:ns ## COG: yeiK COG1957 # Protein_GI_number: 16130100 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Escherichia coli K12 # 1 313 1 313 313 632 100.0 0 MEKRKIILDCDPGHDDAIAIMMAAKHPAIDLLGITIVAGNQTLDKTLINGLNVCQKLEIN VPVYAGMPQPIMRQQIVADNIHGETGLDGPVFEPLTRQAESTHAVKYIIDTLMASDGDIT LVPVGPLSNIAVAMRMQPAILPKIREIVLMGGAYGTGNFTPSAEFNIFADPEAARVVFTS 
GVPLVMMGLDLTNQTVCTPDVIARMERAGGPAGELFSDIMNFTLKTQFENYGLAGGPVHD ATCIGYLINPDGIKTQEMYVEVDVNSGPCYGRTVCDELGVLGKPANTKVGITIDTDWFWG LVEECVRGYIKTH >gi|296494682|gb|ADTN01000056.1| GENE 42 43622 - 44320 518 232 aa, chain + ## HITS:1 COG:ECs3055 KEGG:ns NR:ns ## COG: ECs3055 COG0664 # Protein_GI_number: 15832309 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Escherichia coli O157:H7 # 14 232 1 219 219 446 100.0 1e-125 MKEIHNNDLKQQLMSESAFKDCFLTDVSADTRLFHFLARDYIVQEGQQPSWLFYLTRGRA RLYATLANGRVSLIDFFAAPCFIGEIELIDKDHEPRAVQAIEECWCLALPMKHYRPLLLN DTLFLRKLCVTLSHKNYRNIVSLTQNQSFPLVNRLAAFILLSQEGDLYHEKHTQAAEYLG VSYRHLLYVLAQFIHDGLLIKSKKGYLIKNRKQLSGLALEMDPENKFSGMMQ >gi|296494682|gb|ADTN01000056.1| GENE 43 44391 - 45614 1472 407 aa, chain - ## HITS:1 COG:yeiM KEGG:ns NR:ns ## COG: yeiM COG1972 # Protein_GI_number: 16130102 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside permease # Organism: Escherichia coli K12 # 1 407 10 416 416 622 100.0 1e-178 MVVLLAIAFLLSVNKKSISLRTVGAALLLQIAIGGIMLYFPPGKWAVEQAALGVHKVMSY SDAGSAFIFGSLVGPKMDVLFDGAGFIFAFRVLPAIIFVTALISLLYYIGVMGLLIRILG SIFQKALNISKIESFVAVTTIFLGQNEIPAIVKPFIDRMNRNELFTAICSGMASIAGSMM IGYAGMGVPIDYLLAASLMAIPGGILFARILSPATEPSQVTFENLSFSETPPKSFIEAAA SGAMTGLKIAAGVATVVMAFVAIIALINGIIGGIGGWFGFANASLESIFGYVLAPLAWIM GVDWSDANLAGSLIGQKLAINEFVAYLSFSPYLQTGGTLEVKTIAIISFALCGFANFGSI GVVVGAFSAISPKRAPEIAQLGLRALAAATLSNLMSATIAGFFIGLA >gi|296494682|gb|ADTN01000056.1| GENE 44 45735 - 46673 1168 312 aa, chain - ## HITS:1 COG:ECs3057 KEGG:ns NR:ns ## COG: ECs3057 COG2313 # Protein_GI_number: 15832311 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized enzyme involved in pigment biosynthesis # Organism: Escherichia coli O157:H7 # 1 312 1 312 312 569 99.0 1e-162 MSELKISPELLQISPEVQDALKNKKPVVALESTIISHGMPFPQNAQTAIEVEETIRKQGA 
VPATIAIIGGVMKVGLSKEEIELLGREGHNVTKVSRRDLPFVVAAGKNGATTVASTMIIA ALAGIKVFATGGIGGVHRGAEHTFDISADLQELANTNVTVVCAGAKSILDLGLTTEYLET FGVPLIGYQTKALPAFFCRTSPFDVSIRLDSASEIARAMAVKWQSGLNGGLVVANPIPEQ FAMPEHTINAAIDQAVAEAEAQGVIGKESTPFLLARVAELTGGDSLKSNIQLVFNNAILA SEIAKEYQRLAG >gi|296494682|gb|ADTN01000056.1| GENE 45 46661 - 47602 574 313 aa, chain - ## HITS:1 COG:yeiC KEGG:ns NR:ns ## COG: yeiC COG0524 # Protein_GI_number: 16130104 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli K12 # 1 313 1 313 313 618 100.0 1e-177 MREKDYVVIIGSANIDVAGYSHESLNYADSNPGKIKFTPGGVGRNIAQNLALLGNKAWLL SAVGSDFYGQSLLTQTNQSGVYVDKCLIVPGENTSSYLSLLDNTGEMLVAINDMNISNAI TAEYLAQHGEFIQRAKVIVADCNISEEALAWILDNAANVPVFVDPVSAWKCVKVRDRLNQ IHTLKPNRLEAETLSGIALSGREDVAKVAAWFHQHGLNRLVLSMGGDGVYYSDISGESGW SAPIKTNVINVTGAGDAMMAGLASCWVDGMPFAESVRFAQGCSSMALSCEYTNNPDLSIA NVISLVENAECLN >gi|296494682|gb|ADTN01000056.1| GENE 46 48025 - 49716 1991 563 aa, chain - ## HITS:1 COG:fruA_3 KEGG:ns NR:ns ## COG: fruA_3 COG1299 # Protein_GI_number: 16130105 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Escherichia coli K12 # 226 563 1 338 338 545 100.0 1e-154 MKTLLIIDANLGQARAYMAKTLLGAAARKAKLEIIDNPNDAEMAIVLGDSIPNDSALNGK NVWLGDISRAVAHPELFLSEAKGHAKPYTAPVAATAPVAASGPKRVVAVTACPTGVAHTF MAAEAIETEAKKRGWWVKVETRGSVGAGNAITPEEVAAADLVIVAADIEVDLAKFAGKPM YRTSTGLALKKTAQELDKAVAEATPYEPAGKAQTATTESKKESAGAYRHLLTGVSYMLPM VVAGGLCIALSFAFGIEAFKEPGTLAAALMQIGGGSAFALMVPVLAGYIAFSIADRPGLT PGLIGGMLAVSTGSGFIGGIIAGFLAGYIAKLISTQLKLPQSMEALKPILIIPLISSLVV GLAMIYLIGKPVAGILEGLTHWLQTMGTANAVLLGAILGGMMCTDMGGPVNKAAYAFGVG LLSTQTYGPMAAIMAAGMVPPLAMGLATMVARRKFDKAQQEGGKAALVLGLCFISEGAIP FAARDPMRVLPCCIVGGALTGAISMAIGAKLMAPHGGLFVLLIPGAITPVLGYLVAIIAG TLVAGLAYAFLKRPEVDAVAKAA >gi|296494682|gb|ADTN01000056.1| GENE 47 49733 - 50671 994 312 aa, chain - ## HITS:1 COG:ECs3060 KEGG:ns NR:ns ## COG: ECs3060 COG1105 # Protein_GI_number: 15832314 
# Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) # Organism: Escherichia coli O157:H7 # 1 312 1 312 312 611 100.0 1e-175 MSRRVATITLNPAYDLVGFCPEIERGEVNLVKTTGLHAAGKGINVAKVLKDLGIDVTVGG FLGKDNQDGFQQLFSELGIANRFQVVQGRTRINVKLTEKDGEVTDFNFSGFEVTPADWER FVTDSLSWLGQFDMVCVSGSLPSGVSPEAFTDWMTRLRSQCPCIIFDSSREALVAGLKAA PWLVKPNRRELEIWAGRKLPEMKDVIEAAHALREQGIAHVVISLGAEGALWVNASGEWIA KPPSVDVVSTVGAGDSMVGGLIYGLLMRESSEHTLRLATAVAALAVSQSNVGITDRPQLA AMMARVDLQPFN >gi|296494682|gb|ADTN01000056.1| GENE 48 50671 - 51801 1304 376 aa, chain - ## HITS:1 COG:fruB_1 KEGG:ns NR:ns ## COG: fruB_1 COG4668 # Protein_GI_number: 16130107 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol/fructose-specific phosphotransferase system, IIA domain # Organism: Escherichia coli K12 # 1 286 1 286 286 471 100.0 1e-132 MFQLSVQDIHPGEKAGDKEEAIRQVAAALVQAGNVAEGYVNGMLAREQQTSTFLGNGIAI PHGTTDTRDQVLKTGVQVFQFPEGVTWGDGQVAYVAIGIAASSDEHLGLLRQLTHVLSDD SVAEQLKSATTAEELRALLMGEKQSEQLKLDNEMLTLDIVASDLLTLQALNAARLKEAGA VDATFVTKAINEQPLNLGQGIWLSDSAEGNLRSAIAVSRAANAFDVDGETAAMLVSVAMN DDQPIAVLKRLADLLLDNKADRLLKADAATLLALLTSDDAPTDDVLSAEFVVRNEHGLHA RPGTMLVNTIKQFNSDITVTNLDGTGKPANGRSLMKVVALGVKKGHRLRFTAQGADAEQA LKAIGDAIAAGLGEGA >gi|296494682|gb|ADTN01000056.1| GENE 49 52169 - 53350 977 393 aa, chain + ## HITS:1 COG:ECs3062 KEGG:ns NR:ns ## COG: ECs3062 COG0477 # Protein_GI_number: 15832316 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 393 1 393 393 693 99.0 0 MHNSPAVSSAKSFDLTSTAFLIVAFLTGIAGALQTPTLSIFLTDEVHARPAMVGFFFTGS AVIGILVSQFLAGRSDKRGDRKSLIVFCCLLGVLACTLFAWNRNYFVLLFVGVFLSSFGS TANPQMFALAREHADKTGREAVMFSSFLRAQVSLAWVIGPPLAYALAMGFSFTVMYLSAA VAFIVCGVMVWLFLPSMQKELPLATGTIEAPRRNRRDTLLLFVICTLMWGSNSLYIINMP 
LFIINELHLPEKLAGVMMGTAAGLEIPTMLIAGYFAKRLGKRFLMRVAAVGGVCFYAGML MAHSPVILLGLQLLNAIFIGILGGIGMLYFQDLMPGQAGSATTLYTNTSRVGWIIAGSVA GIVAEIWNYHAVFWFAMVMIIATLFCLLRIKDV >gi|296494682|gb|ADTN01000056.1| GENE 50 53347 - 53601 170 84 aa, chain - ## HITS:1 COG:YPO1286 KEGG:ns NR:ns ## COG: YPO1286 COG0727 # Protein_GI_number: 16121569 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster oxidoreductase # Organism: Yersinia pestis # 1 84 1 84 84 101 66.0 3e-22 MECRPGCGACCTAPSISSPIPGMPEGKPANTPCIQLDEQQRCKIFTSPLRPKVCAGLQAS AEMCGNSRQQAMTWLIDLEMLTAP >gi|296494682|gb|ADTN01000056.1| GENE 51 53756 - 54328 810 190 aa, chain + ## HITS:1 COG:ECs3063 KEGG:ns NR:ns ## COG: ECs3063 COG0231 # Protein_GI_number: 15832317 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Escherichia coli O157:H7 # 1 190 86 275 275 384 100.0 1e-107 MPRANEIKKGMVLNYNGKLLLVKDIDIQSPTARGAATLYKMRFSDVRTGLKVEERFKGDD IVDTVTLTRRYVDFSYVDGNEYVFMDKEDYTPYTFTKDQIEEELLFMPEGGMPDMQVLTW DGQLLALELPQTVDLEIVETAPGIKGASASARNKPATLSTGLVIQVPEYLSPGEKIRIHI EERRYMGRAD >gi|296494682|gb|ADTN01000056.1| GENE 52 54551 - 56017 1231 488 aa, chain + ## HITS:1 COG:yeiQ KEGG:ns NR:ns ## COG: yeiQ COG0246 # Protein_GI_number: 16130110 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Escherichia coli K12 # 1 488 1 488 488 1012 100.0 0 MNTIASVTLPHHVHAPRYDRQQLQSRIVHFGFGAFHRAHQALLTDRVLNAQGGDWGICEI SLFSGDQLMSQLRAQNHLYTVLEKGADGNQVIIVGAVHECLNAKLDSLAAIIEKFCEPQV AIVSLTITEKGYCIDPATGALDTSNPRIIHDLQTPEEPHSAPGILVEALKRRRERGLTPF TVLSCDNIPDNGHVVKNAVLGMAEKRSPELAGWIKEHVSFPGTMVDRIVPAATDESLVEI SQHLGVNDPCAISCEPFIQWVVEDNFVAGRPAWEVAGVQMVNDVLPWEEMKLRMLNGSHS FLAYLGYLSGFAHISDCMQDRAFRHAARTLMLDEQAPTLQIKDVDLTQYADKLIARFANP ALKHKTWQIAMDGSQKLPQRMLAGIRIHQGRETDWSLLALGVAGWMRYVSGVDDAGNAID VRDPLSDKIRELVAGSSSEQRVTALLSLREVFGDDLPDNPHFVQAIEQAWQQIVQFGAHQ ALLNTLKI 
>gi|296494682|gb|ADTN01000056.1| GENE 53 56135 - 57121 777 328 aa, chain + ## HITS:1 COG:yeiR KEGG:ns NR:ns ## COG: yeiR COG0523 # Protein_GI_number: 16130111 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Escherichia coli K12 # 1 328 1 328 328 658 100.0 0 MTRTNLITGFLGSGKTTSILHLLAHKDPNEKWAVLVNEFGEVGIDGALLADSGALLKEIP GGCMCCVNGLPMQVGLNTLLRQGKPDRLLIEPTGLGHPKQILDLLTAPVYEPWIDLRATL CILDPRLLLDEKSASNENFRDQLAAADIIVANKSDRTTPESEQALQRWWQQNGGDRQLIH SEHGKVDGHLLDLPRRNLAELPASAAHSHQHVVKKGLAALSLPEHQRWRRSLNSGQGYQA CGWIFDADTVFDTIGILEWARLAPVERVKGVLRIPEGLVRINRQGDDLHIETQNVAPPDS RIELISSSEADWNALQSALLKLRLATTA >gi|296494682|gb|ADTN01000056.1| GENE 54 57160 - 57873 420 237 aa, chain + ## HITS:1 COG:ECs3066 KEGG:ns NR:ns ## COG: ECs3066 COG0671 # Protein_GI_number: 15832320 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Escherichia coli O157:H7 # 1 237 13 249 249 437 100.0 1e-123 MIKNLPQIVLLNIVGLALFLSWYIPVNHGFWLPIDADIFYFFNQKLVESKAFLWLVALTN NRAFDGCSLLAMGMLMLSFWLKENAPGRRRIVIIGLVMLLTAVVLNQLGQALIPVKRASP TLTFTDINRVSELLSVPTKDASRDSFPGDHGMMLLIFSAFMWRYFGKVAGLIALIIFVVF AFPRVMIGAHWFTDIIVGSMTVILIGLPWVLLTPLSDRLITFFDKSLPGKNKHFQNK >gi|296494682|gb|ADTN01000056.1| GENE 55 58285 - 58851 358 188 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 [Haemophilus parasuis 29755] # 65 184 58 174 175 142 57 5e-33 MVKSQPILRYILRGIPAIAVAVLLSACSANNTAKNMHPETRAVGSETSSLQASQDEFENL VRNVDVKSRIMDQYADWKGVRYRLGGSTKKGIDCSGFVQRTFREQFGLELPRSTYEQQEM GKSVSRSNLRTGDLVLFRAGSTGRHVGIYIGNNQFVHASTSSGVIISSMNEPYWKKRYNE ARRVLSRS >gi|296494682|gb|ADTN01000056.1| GENE 56 59032 - 60588 1151 518 aa, chain + ## HITS:1 COG:rtn_2 KEGG:ns NR:ns ## COG: rtn_2 COG2200 # Protein_GI_number: 16130114 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 261 518 1 258 258 531 100.0 1e-150 
MFIRAPNFGRKLLLTCIVAGVMIAILVSCLQFLVAWHKHEVKYDTLITDVQKYLDTYFAD LKSTTDRLQPLTLDTCQQANPELTARAAFSMNVRTFVLVKDKKTFCSSATGEMDIPLNEL IPALDINKNVDMAILPGTPMVPNKPAIVIWYRNPLLKNSGVFAALNLNLTPSLFYSSRQE DYDGVALIIGNTALSTFSSRLMNVNELTDMPVRETKIAGIPLTVRLYADDWTWNDVWYAF LLGGMSGTVVGLLCYYLMSVRMRPGREIMTAIKREQFYVAYQPVVDTQALRVTGLEVLLR WRHPVAGEIPPDAFINFAESQKMIVPLTQHLFELIARDAAELEKVLPVGVKFGINIAPDH LHSESFKADIQKLLTSLPAHHFQIVLEITERDMLKEQEATQLFAWLHSVGVEIAIDDFGT GHSALIYLERFTLDYLKIDRGFINAIGTETITSPVLDAVLTLAKRLNMLTVAEGVETPEQ ARWLSERGVNFMQGYWISRPLPLDDFVRWLKKPYTPQW >gi|296494682|gb|ADTN01000056.1| GENE 57 60664 - 62484 1390 606 aa, chain + ## HITS:1 COG:yejA KEGG:ns NR:ns ## COG: yejA COG4166 # Protein_GI_number: 16130115 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Escherichia coli K12 # 1 606 1 606 606 1198 99.0 0 MQMIVRILLLFIALFTFGVQAQAIKESYAFAVLGEPRYAFNFNHFDYVNPAAPKGGQITL SALGTFDNFNRYALRGNPGARTEQLYDTLFTTSDDEPGSYYPLIAESARYADDYSWVEVA INPRARFHDGSPITARDVEFTFQKFMTEGVPQFRLVYKGTTVKAIAPLTVRIELAKPGKE DMLSLFSLPVFPEKYWKDHKLSDPFATPPLASGPYRVTSWKMGQNIVYSRVKDYWAANLP VNRGRWNFDTIRYDYYLDDNVAFEAFKAGAFDLRMENDAKNWATRYTGKNFDKKYIIKDE QKNESAQDTRWLAFNIQRPVFSDRRVREAITLAFDFEWMNKALFYNAWSRTNSYFQNTEY AARNYPDAAELVLLAPMKKDLPSEVFTQIYQPPVSKGDGYDRDNLLKADKLLNEAGWVLK GQQRVNATTGQPLSFELLLPASSNSQWVLPFQHSLQRLGINMDIRKVDNSQITNRMRSRD YDMMPRVWRAMPWPSSDLQISWSSEYINSTYNAPGVQSPVIDSLINQIIAAQGNKEKLLP LGRALDRVLTWNYYMLPMWYMAEDRLAWWDKFSQPAVRPIYSLGIDTWWYDVNKAAKLPS ASKQGE >gi|296494682|gb|ADTN01000056.1| GENE 58 62485 - 63579 1362 364 aa, chain + ## HITS:1 COG:ECs3070 KEGG:ns NR:ns ## COG: ECs3070 COG4174 # Protein_GI_number: 15832324 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Escherichia coli O157:H7 # 1 364 1 364 364 685 100.0 0 MGAYLIRRLLLVIPTLWAIITINFFIVQIAPGGPVDQAIAAIEFGNAGVLPGAGGEGVRA SHAQTGVGNISDSNYRGGRGLDPEVIAEITHRYGFDKPIHERYFKMLWDYIRFDFGDSLF 
RSASVLTLIKDSLPVSITLGLWSTLIIYLVSIPLGIRKAVYNGSRFDVWSSAFIIIGYAI PAFLFAILLIVFFAGGSYFDLFPLRGLVSANFDSLPWYQKITDYLWHITLPVLATVIGGF AALTMLTKNSFLDEVRKQYVVTARAKGVSEKNILWKHVFRNAMLLVIAGFPATFISMFFT GSLLIEVMFSLNGLGLLGYEATVSRDYPVMFGTLYIFTLIGLLLNIVSDISYTLVDPRID FEGR >gi|296494682|gb|ADTN01000056.1| GENE 59 63579 - 64604 1139 341 aa, chain + ## HITS:1 COG:ECs3071 KEGG:ns NR:ns ## COG: ECs3071 COG4239 # Protein_GI_number: 15832325 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Escherichia coli O157:H7 # 1 341 1 341 341 649 99.0 0 MSRLSPVNQARWARFRHNRRGYWSLWIFLVLFGLSLCSELIANDKPLLVRYDGSWYFPLL KNYSESDFGGPLASQADYQDPWLKQRLENNGWVLWAPIRFGATSINFATDKPFPSPPSRQ NWLGTDANGGDVLARILYGTRISVLFGLMLTLCSSVMGVLAGALQGYYGGKVDLWGQRFI EVWSGMPTLFLIILLSSVVQPNFWWLLAITVLFGWMSLVGVVRAEFLRTRNFDYIRAAQA LGVSDRSIILRHMLPNAMVATLTFLPFILCSSITTLTSLDFLGFGLPLGSPSLGELLLQG KNNLQAPWLGITAFLSVAILLSLLIFIGEAVRDAFDPNKAV >gi|296494682|gb|ADTN01000056.1| GENE 60 64606 - 66195 332 529 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 268 528 1 261 563 132 32 6e-30 MTQTLLAIENLSVGFRHQQTVRTVVNDVSLQIEAGETLALVGESGSGKSVTALSILRLLP SPPVEYLSGDIRFHGESLLHASDQTLRGVRGNKIAMIFQEPMVSLNPLHTLEKQLYEVLS LHRGMRREAARGEILNCLDRVGIRQAAKRLTDYPHQLSGGERQRVMIAMALLTRPELLIA DEPTTALDVSVQAQILQLLRELQGELNMGMLFITHNLSIVRKLAHRVAVMQNGRCVEQNY AATLFASPTHPYTQKLLNSEPSGDPVPLPEPASTLLDVEQLQVAFPIRKGILKRIVDHNV VVKNISFTLRAGETLGLVGESGSGKSTTGLALLRLINSQGSIIFDGQPLQNLNRRQLLPI RHRIQVVFQDPNSSLNPRLNVLQIIEEGLRVHQPTLSAAQREQQVIAVMHEVGLDPETRH RYPAEFSGGQRQRIAIARALILKPSLIILDEPTSSLDKTVQAQILTLLKSLQQKHQLAYL FISHDLHVVRALCHQVIILRQGEVVEQGPCARVFATPQQEYTRQLLALS >gi|296494682|gb|ADTN01000056.1| GENE 61 66199 - 66543 187 114 aa, chain - ## HITS:1 COG:no KEGG:EC55989_2435 NR:ns ## KEGG: EC55989_2435 # Name: yejG # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 114 1 114 114 233 100.0 2e-60 MTSLQLSIVHRLPQNYRWSAGFAGSKVEPIPQNGPCGDNSLVALKLLSPDGDNAWSVMYK LSQALSDIEVPCSVLECEGEPCLFVNRQDEFAATCRLKNFGVAIAEPFSNYNPF >gi|296494682|gb|ADTN01000056.1| GENE 62 66876 - 68066 1220 396 aa, chain - ## HITS:1 COG:ECs3074 KEGG:ns NR:ns ## COG: ECs3074 COG0477 # Protein_GI_number: 15832328 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 396 1 396 396 652 99.0 0 MTTRQHSSFAIVFILGLLAMLMPLSIDMYLPALPVISAQFGVPAGSTQMTLSTYILGFAL GQLIYGPMADSFGRKPVVLGGTLVFAAAAVACALANTIDQLIVMRFFHGLAAAAASVVIN ALMRDIYPKEEFSRMMSFVMLVTTIAPLMAPIVGGWVLVWLSWHYIFWILALAAILASAM IFFLIKETLPPERRQPFHIRTTIGNFAALFRHKRVLSYMLASGFSFAGMFSFLSAGPFVY IEINHVAPENFGYYFALNIVFLFVMTIFNSRFVRRIGALNMFRSGLWIQFIMAAWMVISA LLGLGFWSLVVGVAAFVGCVSMVSSNAMAVILDEFPHMAGTASSLAGTFRFGIGAIVGAL LSLATFNSAWPMIWSIAFCATSSILFCLYASRPKKR >gi|296494682|gb|ADTN01000056.1| GENE 63 68094 - 68789 794 231 aa, chain - ## HITS:1 COG:ECs3075 KEGG:ns NR:ns ## COG: ECs3075 COG1187 # Protein_GI_number: 15832329 # 
Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Escherichia coli O157:H7 # 1 231 1 231 231 469 100.0 1e-132 MRLDKFIAQQLGVSRAIAGREIRGNRVTVDGEIVRNAAFKLLPEHDVAYDGNPLAQQHGP RYFMLNKPQGYVCSTDDPDHPTVLYFLDEPVAWKLHAAGRLDIDTTGLVLMTDDGQWSHR ITSPRHHCEKTYLVTLESPVADDTAEQFAKGVQLHNEKDLTKPAVLEVITPTQVRLTISE GRYHQVKRMFAAVGNHVVELHRERIGGITLDADLAPGEYRPLTEEEIASVV >gi|296494682|gb|ADTN01000056.1| GENE 64 68938 - 70698 1785 586 aa, chain + ## HITS:1 COG:yejH_1 KEGG:ns NR:ns ## COG: yejH_1 COG1061 # Protein_GI_number: 16130122 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Escherichia coli K12 # 1 358 1 358 358 730 99.0 0 MIFTLRPYQQEAVDATLNHFRRHKTPAVIVLPTGAGKSLVIAELARLARGRVLVLAHVKE LVAQNHAKYQALGLEADIFAAGLKRKESHGKVVFGSVQSVARNLDAFQGEFSLLIVDECH RIGDDEESQYQQILTHLTKVNPHLRLLGLTATPFRLGKGWIYQFHYHGMVRGDEKALFRD CIYELPLRYMIKHGYLTPPERLDMPVVQYDFSRLQAQSNGLFSEADLNRELKKQQRITPH IISQIMEFAAMRKGVMIFAATVEHAKEIVGLLPAEDAALITGDTPGAERDVLIENFKAQR FRYLVNVAVLTTGFDAPHVDLIAILRPTESVSLYQQIVGRGLRLAPGKTDCLILDYAGNP HDLYAPEVGTPKGKSDNVPVQVFCPACGFANTFWGKTTADGTLIEHFGRRCQGWFEDDDG HREQCDFRFRFKNCPQCNAENDIAARRCRECDTVLVDPDDMLKAALRLKDALVLRCSGMS LQHGHDEKGEWLKITYYDEDGADVSERFRLQTPAQRTAFEQLFIRPHTRTPGIPLRWITA ADILAQQALLRHPDFVVARMKGQYWQVREKVFDYEGRFRLAHELRG >gi|296494682|gb|ADTN01000056.1| GENE 65 70727 - 71107 633 126 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|26108973|gb|AAN81176.1|AE016763_135 50S ribosomal protein L25 [Escherichia coli CFT073] # 1 126 1 126 126 248 96 7e-65 MSIESRPLLHTQSRSLTCCWVACSRINLREKEMFTINAEVRKEQGKGASRRLRAANKFPA IIYGGKEAPLAIELDHDKVMNMQAKAEFYSEVLTIVVDGKEIKVKAQDVQRHPYKPKLQH IDFVRA >gi|296494682|gb|ADTN01000056.1| GENE 66 71246 - 72253 1253 335 aa, chain - ## HITS:1 COG:yejK KEGG:ns NR:ns ## COG: yejK COG3081 # Protein_GI_number: 16130124 # Func_class: R General function prediction only # Function: 
Nucleoid-associated protein # Organism: Escherichia coli K12 # 1 335 1 335 335 643 100.0 0 MSLDINQIALHQLIKRDEQNLELVLRDSLLEPTETVVEMVAELHRVYSAKNKAYGLFSEE SELAQTLRLQRQGEEDFLAFSRAATGRLRDELAKYPFADGGFVLFCHYRYLAVEYLLVAV LSNLSSMRVNENLDINPTHYLDINHADIVARIDLTEWETNPESTRYLTFLKGRVGRKVAD FFMDFLGASEGLNAKAQNRGLLQAVDDFTAEAQLDKAERQNVRQQVYSYCNEQLQAGEEI ELKSLSKELAGVSEVSFTEFAAEKGYELEESFPADRSTLRQLTKFAGSGGGLTINFDAML LGERIFWDPATDTLTIKGTPPNLRDQLQRRTSGGN >gi|296494682|gb|ADTN01000056.1| GENE 67 72435 - 72662 298 75 aa, chain + ## HITS:1 COG:ECs3079 KEGG:ns NR:ns ## COG: ECs3079 COG3082 # Protein_GI_number: 15832333 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 75 1 75 75 120 100.0 8e-28 MPQISRYSDEQVEQLLAELLNVLEKHKAPTDLSLMVLGNMVTNLINTSIAPAQRQAIANS FARALQSSINEDKAH >gi|296494682|gb|ADTN01000056.1| GENE 68 72682 - 74442 1660 586 aa, chain + ## HITS:1 COG:ECs3080 KEGG:ns NR:ns ## COG: ECs3080 COG3083 # Protein_GI_number: 15832334 # Func_class: R General function prediction only # Function: Predicted hydrolase of alkaline phosphatase superfamily # Organism: Escherichia coli O157:H7 # 1 586 1 586 586 1172 100.0 0 MVTHRQRYREKVSQMVSWGHWFALFNILLSLVIGSRYLFIADWPTTLAGRIYSYVSIIGH FSFLVFATYLLILFPLTFIVGSQRLMRFLSVILATAGMTLLLIDSEVFTRFHLHLNPIVW QLVINPDENEMARDWQLMFISVPVILLLELVFATWSWQKLRSLTRRRRFARPLAAFLFIA FIASHVVYIWADANFYRPITMQRANLPLSYPMTARRFLEKHGLLDAQEYQRRLIEQGNPD AVSVQYPLSELRYRDMGTGQNVLLITVDGLNYSRFEKQMPALAGFAEQNISFTRHMSSGN TTDNGIFGLFYGISPSYMDGILSTRTPAALITALNQQGYQLGLFSSDGFTSPLYRQALLS DFSMPSVRTQSDEQTATQWINWLGRYAQEDNRWFSWVSFNGTNIDDSNQQAFARKYSRAA GNVDDQINRVLNALRDSGKLDNTVVIITAGRGIPLSEEEETFDWSHGHLQVPLVIHWPGT PAQRINALTDHTDLMTTLMQRLLHVSTPASEYSQGQDLFNPQRRHYWVTAADNDTLAITT PKKTLVLNNNGKYRTYNLRGERVKDEKPQLSLLLQVLTDEKRFIAN >gi|296494682|gb|ADTN01000056.1| GENE 69 74696 - 76687 1312 663 aa, chain - ## HITS:1 COG:yejO KEGG:ns NR:ns ## COG: yejO COG3468 # Protein_GI_number: 16130127 # Func_class: M Cell wall/membrane/envelope biogenesis; U 
Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 1 663 174 836 836 1176 100.0 0 MLLENGGSLRVEENDFAYNTTVDSGGLLEVMDGGTVTGVDKKAGGKLIVSTNALEVSGPN SKGQFSIKDGVSKNYELDDGSGLIVMEDTQAIDTILDKHATMQSLGKDTGTKVQANAVYD LGRSYQNGSITYSSKAISENMVINNGRANVWAGTMVNVSVRGNDGILEVMKPQINYAPAM LVGKVVVSEGASFRTHGAVDTSKADVSLENSVWTIIADITTTNQNTLLNLANLAMSDANV IMMDEPVTRSSVTASAENFITLTTNTLSGNGNFYMRTDMANHQSDQLNVTGQATGDFKIF VTDTGASPAAGDSLTLVTTGGGDAAFTLGNAGGVVDIGTYEYTLLDNGNHSWSLAENRAQ ITPSTTDVLNMAAAQPLVFDAELDTVRERLGSVKGVSYDTAMWSSAINTRNNVTTDAGAG FEQTLTGLTLGIDSRFSREESSTIRGLIFGYSHSDIGFDRGGKGNIDSYTLGAYAGWEHQ NGAYVDGVVKVDRFANTIHGKMSNGATAFGDYNSNGAGAHVESGFRWVDGLWSVRPYLAF TGFTTDGQDYTLSNGMRADVGNTRILRAEAGTAVSYHMDLQNGTTLEPWLKAAVRQEYAD SNQVKVNDDGKFNNDVAGTSGVYQAGIRSSFTPTLSGHLSVSYGNGAGVESPWNTQAGVV WTF >gi|296494682|gb|ADTN01000056.1| GENE 70 76792 - 77286 151 164 aa, chain - ## HITS:1 COG:yejO KEGG:ns NR:ns ## COG: yejO COG3468 # Protein_GI_number: 16130127 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 28 142 1 115 836 215 99.0 4e-56 MHQSGSVSLCRSAISVLVATALYSPIALASTVEYGETVDGVVLEKDIQLVYGTANNTKIN PGGEQHIKEFGVSNNTEINGGYQYIEMNGAAEYSVLNDGYQIVQMGGAANQTTLNNGVLQ VYGAANDTTIKGGRLIVEKDGGPSLSLSKREDYWRLKRGDLHLR Prediction of potential genes in microbial genomes Time: Sun May 15 23:16:52 2011 Seq name: gi|296494681|gb|ADTN01000057.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont147.1, whole genome shotgun sequence Length of sequence - 23757 bp Number of predicted genes - 23, with homology - 21 Number of transcription units - 15, operones - 2 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 3.2 1 1 Tu 1 . + CDS 111 - 416 114 ## JW5424 hypothetical protein 2 2 Tu 1 . 
- CDS 449 - 589 97 ## gi|193071002|ref|ZP_03051930.1| hypothetical protein EcE110019_1782 - Prom 695 - 754 5.3 + Prom 441 - 500 1.8 3 3 Tu 1 . + CDS 564 - 1307 259 ## ECSP_3600 hypothetical protein + Term 1430 - 1465 0.4 4 4 Tu 1 . + CDS 1737 - 1880 77 ## ECIAI1_2748 putative DNA invertase fragment, putative PinH + Prom 1915 - 1974 3.4 5 5 Tu 1 . + CDS 2219 - 6799 2351 ## COG3468 Type V secretory pathway, adhesin AidA + Term 6814 - 6842 1.0 6 6 Op 1 . - CDS 7163 - 7492 291 ## JW2627 toxin of the YpjF-YfjZ toxin-antitoxin system 7 6 Op 2 . - CDS 7530 - 7895 313 ## ECO103_5087 hypothetical protein 8 6 Op 3 . - CDS 7921 - 8115 62 ## KPK_4943 hypothetical protein 9 6 Op 4 . - CDS 8139 - 8681 416 ## COG2003 DNA repair proteins 10 6 Op 5 . - CDS 8694 - 9137 445 ## KPK_4945 antirestriction protein 11 6 Op 6 . - CDS 9168 - 9989 602 ## ECO103_5084 hypothetical protein - Term 10006 - 10047 5.7 12 7 Op 1 . - CDS 10081 - 10422 315 ## YPK_3135 hypothetical protein 13 7 Op 2 . - CDS 10457 - 10921 360 ## YPK_3136 hypothetical protein 14 7 Op 3 . - CDS 10934 - 11050 95 ## 15 7 Op 4 . - CDS 11047 - 11205 253 ## - Prom 11271 - 11330 3.7 - Term 11376 - 11415 4.0 16 8 Tu 1 . - CDS 11427 - 12314 475 ## COG3596 Predicted GTPase - Prom 12411 - 12470 6.3 + Prom 12540 - 12599 3.8 17 9 Tu 1 . + CDS 12629 - 14086 611 ## Spro_1780 hypothetical protein + Term 14215 - 14271 -0.9 18 10 Tu 1 . + CDS 14524 - 15108 256 ## YPTS_1204 hypothetical protein 19 11 Tu 1 . - CDS 15179 - 15685 93 ## YPTS_1205 hypothetical protein - Prom 15893 - 15952 2.9 + Prom 15827 - 15886 5.0 20 12 Tu 1 . + CDS 16016 - 16243 203 ## COG3311 Predicted transcriptional regulator - Term 16366 - 16401 0.2 21 13 Tu 1 . - CDS 16524 - 16805 60 ## COG0582 Integrase - Prom 16974 - 17033 3.9 + Prom 17484 - 17543 2.9 22 14 Tu 1 . + CDS 17576 - 20986 568 ## Bpet4495 hypothetical protein + Prom 20995 - 21054 1.8 23 15 Tu 1 . 
+ CDS 21285 - 23757 49 ## Rmet_6257 hypothetical protein Predicted protein(s) >gi|296494681|gb|ADTN01000057.1| GENE 1 111 - 416 114 101 aa, chain + ## HITS:1 COG:no KEGG:JW5424 NR:ns ## KEGG: JW5424 # Name: ypjC # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 101 60 160 160 174 100.0 7e-43 MSECIYQEGNAFVIMGAGEQLKRIKYEVGENNLKVFNVHFNNNHELVSSGEPDVICLSKQ VWENLLIKLKLENNENVFSETKKLSNKNNADQFFECAKRNE >gi|296494681|gb|ADTN01000057.1| GENE 2 449 - 589 97 46 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|193071002|ref|ZP_03051930.1| ## NR: gi|193071002|ref|ZP_03051930.1| hypothetical protein EcE110019_1782 [Escherichia coli E110019] # 1 46 1 46 46 80 100.0 4e-14 MGCSFDSDKDEISSILFSYELRDSIQTFGGVSKITLRVLLGLSKPT >gi|296494681|gb|ADTN01000057.1| GENE 3 564 - 1307 259 247 aa, chain + ## HITS:1 COG:no KEGG:ECSP_3600 NR:ns ## KEGG: ECSP_3600 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 7 239 217 449 449 423 89.0 1e-117 MSESKEHPIDIQEKKDAFVNEFKGVLFDKNTRSSELLFNFYECCYKFLPRAQPQDKIDSY NSALQAFSIFCSSTLTHNNIGFDFKLFPEVKLSGEHLETVFKYKNGDDVREIAKINITLQ KEEGGLYNLRGLDFKGCFFSGQNFSNYDIQYVNWGTSLFDVDTPCIFNAPAYNKSNEKSL KPVSENGLSGVLTDRNNKIKLITGVAPFDDILFMDDDFDDSSSEDDPVENSPVVTSPVVS SSKSSFQ >gi|296494681|gb|ADTN01000057.1| GENE 4 1737 - 1880 77 47 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_2748 NR:ns ## KEGG: ECIAI1_2748 # Name: not_defined # Def: putative DNA invertase fragment, putative PinH # Organism: E.coli_IAI1 # Pathway: not_defined # 1 47 16 62 85 88 93.0 8e-17 MWHLVLLLEELCERGINFRALAQSIFAQQWGDECCKSKTICDLKVIV >gi|296494681|gb|ADTN01000057.1| GENE 5 2219 - 6799 2351 1526 aa, chain + ## HITS:1 COG:ypjA KEGG:ns NR:ns ## COG: ypjA COG3468 # Protein_GI_number: 16130562 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 1 1526 44 1569 1569 2313 99.0 0 
MNRTSPYYCRRSVLSLLISALIYAPPGMAAFTTNVIGVVNDETVDGNQKVDERGTTNNTH IINHGQQNVHGGVSNGSLIESGGYQDIGSHNNFVGQANNTTINGGRQSIHDGGISTGTTI ESGNQDVYKGGISNGTTIKGGASRVEGGSANGILIDGGSQIVKVQGHADGTTINKSGSQD VVQGSLATNTTINGGRQYVEQSTVETTTIKNGGEQRVYESRALDTTIEGGTQSLNSKSTA KNTHIYSGGTQIVDNTSTSDVIEVYSGGVLDVSGGTATNVTQHDGAILKTNTNGTTVSGT NSEGAFSIHNHVADNVLLENGGHLDINAYGSANKTIIKDKGTMSVLTNAKADATRIDNGG VMDVAGNATNTIINGGTQNINNYGIATGTNINSGTQNIKSGGKADTTIISSGSRQVVEKD GTAIGSNISAGGSLIVYTGGIAHGVNQETGSALVANTGAGTDIEGYNKLSHFTITGGEAN YVVLENTGELTVVAKTSAKNTTIDTGGKLIVQKEAKTDSTRLNNGGVLEGQDGGEAKHVE QQSGGALIASTTSGTLIEGTNSYGDAFYIRNSEAKNVVLENAGSLTVVTGSRAVDTIINA NGKMDVYGKDVGTVLNSAGTQTIYASATSDKANIKGGKQTVYGLATEANIESGEQIVDGG STEKTHINGGTQTVQNYGKAINTDIVSGLQQIMANGTAEGSIINGGSQVVNEGGLAENSV LNDGGTLDVREKGSATGIQQSSQGALVATTRATRVTGTRADGVAFSIEQGAANNILLANG GVLTVESDTSSDKTQVNMGGREIVKTKATATGTTLTGGEQIVEGVANETTINDGGIQTVS ANGEAIKTKINEGGTLTVNDNGKATDIVQNSGAALQTSTANGIEISGTHQYGTFSISGNL ATNMLLENGGNLLVLAGTEARDSTVGKGGAMQNLGQDSATKVNSGGQYTLGRSKDEFQAL ARAEDLQVAGGTAIVYAGTLADASVSGATGSLSLMTPRDNVTPVKLEGAVRITDSATLTL GNGVDTTLADLTAASRGSVWLNSNNSCAGTSNCEYRVNSLLLNDGDVYLSAQTAAPATTN GIYNTLTTNELSGSGNFYLHTNVAGSRGDQLVVNNNATGNFKIFVQDTGVSPQSDDAMTL VKTGGGDASFTLGNTGGFVDLGTYEYVLKSDGNSNWNLTNDVKPNPDPIPNPKPDPKPDP KPDPNPKPDPTPDPTPTPVPEKRITPSTAAVLNMAATLPLVFDAELNSIRERLNIMKASP HNNNVWGATYNTRNNVTTDAGAGFEQTLTGMTVGIDSRNDIPEGITTLGAFMGYSHSHIG FDRGGHGSVGSYSLGGYASWEHESGFYLDGVVKLNRFKSNVAGKMSSGGAANGSYHSNGL GGHIETGMRFTDGNWNLTPYASLTGFTADNPEYHLSNGMKSKSVDTRSIYRELGATLSYN MRLGNGMEVEPWLKAAVRKEFVDDNRVKVNNDGNFVNDLSGRRGIYQAGIKASFSSTLSG HLGVGYSHSAGVESPWNAVAGVNWSF >gi|296494681|gb|ADTN01000057.1| GENE 6 7163 - 7492 291 109 aa, chain - ## HITS:1 COG:no KEGG:JW2627 NR:ns ## KEGG: JW2627 # Name: ypjF # Def: toxin of the YpjF-YfjZ toxin-antitoxin system # Organism: E.coli_J # Pathway: not_defined # 1 109 1 109 109 172 77.0 3e-42 MQTLSSHPTRATQPCLSPVETWQRLLTHLLSQHYGLTLNDTPFSNETTIREDIDAGVSLS DAVNFLVEKYELVRIDRKGFSWQEQTPYISVVDILRARRSTGLLKTNVK >gi|296494681|gb|ADTN01000057.1| GENE 7 7530 - 7895 313 121 aa, chain 
- ## HITS:1 COG:no KEGG:ECO103_5087 NR:ns ## KEGG: ECO103_5087 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 121 1 121 121 213 81.0 2e-54 MNNHSVFGTKPENLTCQEWGLKRTITPCFGARLVQEGNRVHFLADRAGFNGAFIDDNALR LDLAFPLILKQLELMLTSGELSPRHEHCVTLYHNGLTCEADTLGSCGYVYIAIYPDQPEP H >gi|296494681|gb|ADTN01000057.1| GENE 8 7921 - 8115 62 64 aa, chain - ## HITS:1 COG:no KEGG:KPK_4943 NR:ns ## KEGG: KPK_4943 # Name: not_defined # Def: hypothetical protein # Organism: K.pneumoniae_342 # Pathway: not_defined # 1 64 10 73 73 121 92.0 8e-27 MALYRQHPGSRLFRFCTGKYKWTGSICHYAGREVQDIRNVLAVFAERRQDRNGPYVILRS VTLN >gi|296494681|gb|ADTN01000057.1| GENE 9 8139 - 8681 416 180 aa, chain - ## HITS:1 COG:yfjY KEGG:ns NR:ns ## COG: yfjY COG2003 # Protein_GI_number: 16130559 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Escherichia coli K12 # 26 178 6 158 160 181 58.0 8e-46 MNTVLCPESSQRELFPLTVMAQHGYLLPATSGLTPYAQRTIRRAINLLDKYLRQPGISFT SSIAARDWLRLQLAGQEREVFMVLYLDNQHRLLESETLFAGSVYHVQVHPREVVKSALRF NAAAVVLAHNHPSGDPEPSKVDRQMTDKLKEVLGLVDVKTLDHLVVGQDGIVSFAERGWI >gi|296494681|gb|ADTN01000057.1| GENE 10 8694 - 9137 445 147 aa, chain - ## HITS:1 COG:no KEGG:KPK_4945 NR:ns ## KEGG: KPK_4945 # Name: not_defined # Def: antirestriction protein # Organism: K.pneumoniae_342 # Pathway: not_defined # 1 147 1 147 147 287 91.0 7e-77 MDTQNLQFAEQTAITASVVPDELRIGFWPQHFGSIPQWITLEPRIFAWMDRLCTDYHGGI WHFSTLSNGGAFMAPESEQDEKWKLFNSMNGNGAELTGEAAGIVACLMVYSHHACRIECD AMTEHYYRLRDFALNHPECSAIMYLID >gi|296494681|gb|ADTN01000057.1| GENE 11 9168 - 9989 602 273 aa, chain - ## HITS:1 COG:no KEGG:ECO103_5084 NR:ns ## KEGG: ECO103_5084 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 273 1 273 273 524 95.0 1e-147 MRLASRFGRINQIRRDRPLTHEELMSHVPSVFGSDKHESRSDRYTYIPTITILESLQREG FQPFFACQTKVRDQSKREHTKHMLRLRRAGQLNGHQVPEIILLNSHDGSSSYQMLPGLFR GVCTNGLVCGQSFGEVRVPHKGNVVEKVIEGAYEVLGVFDRVDEKRDAMASLLLPPPAQH 
ALANAALKYRFGEDHQPVTVSQLLTSRRREDCSDDLWTVYQRVQENLMKGGLSGRTAQGK SSRTRAVTGINGDVKLNRALWVMAENMLEFFGR >gi|296494681|gb|ADTN01000057.1| GENE 12 10081 - 10422 315 113 aa, chain - ## HITS:1 COG:no KEGG:YPK_3135 NR:ns ## KEGG: YPK_3135 # Name: not_defined # Def: hypothetical protein # Organism: Y.pseudotuberculosis_YPIII # Pathway: not_defined # 1 110 1 110 112 110 49.0 1e-23 MKKIITVAALSLLSFSSLAGSSPVSVSVSPGSYSHYSSIRVTSKVDSIVIKQLIVNRGNC QDAEFASPWKSVRLGFGGAVSHEFTGKGLMVPCNVLEVVVQTSEGAWQFDFDS >gi|296494681|gb|ADTN01000057.1| GENE 13 10457 - 10921 360 154 aa, chain - ## HITS:1 COG:no KEGG:YPK_3136 NR:ns ## KEGG: YPK_3136 # Name: not_defined # Def: hypothetical protein # Organism: Y.pseudotuberculosis_YPIII # Pathway: not_defined # 1 153 1 153 154 192 60.0 5e-48 MKNKLVISFLLASVLTPSVSHAFGNPDSWVSGYAQGTSEYTILGKGQSQLYLACDSSGSQ PATIIFTDVNGQQVRMDSGQTLAMRIDNAEEVNISESESHVGSDNVMWAWNKLRTGKRVV VSGSGVKPATFTLAGAAAVLPAFGDNGCVPGFAL >gi|296494681|gb|ADTN01000057.1| GENE 14 10934 - 11050 95 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKTFLRLFGKFTRWYITLIIFMACLLLIITTLGNMAGF >gi|296494681|gb|ADTN01000057.1| GENE 15 11047 - 11205 253 52 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPLILLWGGLALLLGIVASANGRSFWGWFILGLIIDPILAGLLYWLIAKDRT >gi|296494681|gb|ADTN01000057.1| GENE 16 11427 - 12314 475 295 aa, chain - ## HITS:1 COG:ykfA KEGG:ns NR:ns ## COG: ykfA COG3596 # Protein_GI_number: 16128238 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Escherichia coli K12 # 1 294 2 288 288 227 43.0 2e-59 MKHPSGYTLIRRHLRRYPHALRQHLLQELNRLVTYEPVIGIMGKTGVGKSSLCNALFRSE VCTVNAVEACTRQPQRVRLRFGSHYLTLVDLPGVGESVTRDGEYRELYRDLMPQLDMVLW VLKADDRAFAVEEQFYQDVFAQFRGPIPPVLWILNQVDKTEPAEQWRWLSAQPSALQAER IAQKRQAVARQLQIAEIDILPVSVRGRYRLPRLVETMITRLPKQARSPLVPHLQNRYRTD TVINTASSSFGDAVVEVIDNIIDLAPLPQVARTALHAVTHTVARAAGSLWGFFFG >gi|296494681|gb|ADTN01000057.1| GENE 17 12629 - 14086 611 485 aa, chain + ## HITS:1 COG:no KEGG:Spro_1780 NR:ns ## KEGG: Spro_1780 # Name: not_defined # Def: hypothetical protein # 
Organism: S.proteamaculans # Pathway: not_defined # 1 479 1 475 477 560 58.0 1e-158 MQEFPFNAFPSSLLETIFDIEECTRAPGEMIALTLLSGLSLACQSLINVKITDTMKSPVS LYSFVIADSGERKTAVEKLLLKPFHEHDLRVLLQYEKQKKEHEVQSKIWSKKENALLKCI DKDTMKGLCTDKLSRQLEQLSAEKPVQPKAKRYIYNNVTPEALQFEMYTHSPQTGLITDE GANILSRRTMSDLGFMNSIWDDESFHVNRKTGPSFTIKNGRITLLVMVQKAIFYDFLKRK GEHARGSGLFARCFILYIDAKLSTQGQRFISRLSSPQSGKLENLNRFYQRMNELIEKSAS QHGTGNQTCLSFEPSALAAWKEIHDEIESSIGPDREYANMNDFASKLPNNVARLAALLSY YTEGECAIKKEYVENAWLLCEWYMQQAIKVFGAQEGYYEALLLSWLRREYYETEMDHVRF NTIRNAGPNVLRKGKLLERVIERLEKEGAIDIVRSRRKARMVYAGRYFRNNPFTKNAAYK HYWPR >gi|296494681|gb|ADTN01000057.1| GENE 18 14524 - 15108 256 194 aa, chain + ## HITS:1 COG:no KEGG:YPTS_1204 NR:ns ## KEGG: YPTS_1204 # Name: not_defined # Def: hypothetical protein # Organism: Y.pseudotuberculosis_PB1 # Pathway: not_defined # 1 194 1 194 194 310 84.0 2e-83 MQHNSLNAYYQQAFYDVIRNAVDEFPRTLALRVDLRFPRHYQYGDSNKEVTRFIESLKAK LLVDCQRKNLRWKRNRSNRLRYAWVREVGELNRRKHYHVLLLLNKDFYHGAGNYNSDDSL YALIQQAWCSALGLDTEQYTGLANMTENGCFYLNRKLPNYMQQVKELLKRMEYMAKDHTK SYDDGYRSIGMSRR >gi|296494681|gb|ADTN01000057.1| GENE 19 15179 - 15685 93 168 aa, chain - ## HITS:1 COG:no KEGG:YPTS_1205 NR:ns ## KEGG: YPTS_1205 # Name: not_defined # Def: hypothetical protein # Organism: Y.pseudotuberculosis_PB1 # Pathway: not_defined # 1 168 73 240 241 282 82.0 3e-75 MSFDEPDTVSETISELVFLADVNWQLGLGLLNTNKNEAIKCLLLAIEYLDYCRGLEDQEL WQQQQTTKLDVPAQGGRSKAARFDPIKAKIIRLLQEACPSDGWDTKSEALKSIENGINEL KWPSARDQNNLPKTGAEIFAMKIRQVEVWSSQDQKIKAAFDSVVKPKK >gi|296494681|gb|ADTN01000057.1| GENE 20 16016 - 16243 203 75 aa, chain + ## HITS:1 COG:alpA KEGG:ns NR:ns ## COG: alpA COG3311 # Protein_GI_number: 16130542 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Escherichia coli K12 # 5 67 6 68 70 77 52.0 5e-15 MEPKSVKILRVPALVKKLGIGRSTVYDWLDPSSDRHDPTFPQRVKLGAKTVGWLESEIDA WILSKVKKPLTPVSP >gi|296494681|gb|ADTN01000057.1| GENE 21 16524 - 16805 60 93 aa, chain - ## HITS:1 COG:intA KEGG:ns NR:ns ## COG: intA COG0582 # 
Protein_GI_number: 16130540 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Escherichia coli K12 # 1 55 240 294 413 92 78.0 2e-19 MWQLLTISRPAEAAEARWDEIDFEAKEWKIPAARMKMNRDHTVPLGSRGLSLAQLSPAES PLMQVAQAQLKHLLPAHLMASFSPFFSQAGHTS >gi|296494681|gb|ADTN01000057.1| GENE 22 17576 - 20986 568 1136 aa, chain + ## HITS:1 COG:no KEGG:Bpet4495 NR:ns ## KEGG: Bpet4495 # Name: not_defined # Def: hypothetical protein # Organism: B.petrii # Pathway: not_defined # 205 1133 2 932 939 1083 56.0 0 MLEDYSPEADWGQIKVRLGQDFVPMFYGSCIERMSDFVEAFRITYAHIPEAQAHMDLAIA LQAQIIESIPELMLSVNTEAQCADVELATEDFWLTCKSLLQQLGNELANRRKKAGCTFDT RVGTFKAPLKPDAFGNAVMQGMALPFLAVEMDRVWIPMSVRSAPGLIIDHWANKNLNGIS PSTHRKLAEFVAERFQRTLIGPLTLYVDGTACKDLPISCIILGNSGIYLICTCDHTSNDR LSNAAESIYAKVSRGSPIFFRLDDGRFLSLSRVGAEPSVDELHIVIVVTLAGTAFNSINL PSKPTRLFPLADFITIFDSLNDLEELESYWKFTDAQEGTLSPFSRGTVDLFASFKDMHGV LVEGAIAPNFIGLDPHWGTSWRFKILTEFWSRAPKVFPDSSIGWTLSSSTEGVVRLESRH HKAMAYSTSIGSCTVQTLIEITKELRFEDGRLIDLFAQLLADCSYRCRKMMSDISLFRQP HVLFTCSPEPSSLVKAEESPSSLDEFKHVVTSVIPDSHNPDVFHVQVDTRAVLAGLNLSK DGSFEVRCLLETLEKCHAVLGLQMPDEFAQRFTHIASQAARYHLKVMTRNIDVPDYVYPI IPSPTDYKLARRKLAAEIMGLGFIPGRYELSDAKARIDPSSARMRLHIENRLAAFDRHQL LQAFIEQHDALLVTERTRIQRARLSLAHEVEYDRLDAIEHARKEFGDKARHYRYLLEKTV NLSSNGVDRVTEDVLRELVGLVDWFMVLTGASDTLHNGIDVGGVEIDDSYIPEVFYSTDF GDREATFAKEYAKSRLGLGTNDKDAVEGESEGLLASEKLKKAFITDLGFELQNMLTVLAV LSQAQRYGFGNELSLSYAAAPDRLARGLADSIDRLDYSEAQKIVAFLTLSEVGVRRLVGR DIDEGDIPYWERNKRIDRYTIKPLVIDGADLRWGAESASRAMYIWMSAVRDGYLPAEFDW PHVKPVIREVKESIEKRLELRTEEIFRRHTPYVQRGIDFFRRFREEKFDDVGDFDVFAYW PESNLLVTVECKYNQPPFTMKDSRRLRDRIFGKAENDRAGQFSRILDRRKFLETHRSRLL ELLQWPQPESVPLRNIELYVSRDVYYWMVHPPYPVPTQFVRVDTLDTWIKTELNIP >gi|296494681|gb|ADTN01000057.1| GENE 23 21285 - 23757 49 824 aa, chain + ## HITS:1 COG:no KEGG:Rmet_6257 NR:ns ## KEGG: Rmet_6257 # Name: not_defined # Def: hypothetical protein # Organism: R.metallidurans # Pathway: not_defined # 1 816 99 915 1474 1123 65.0 0 MLRKWFDAFKQLDPVRIGEISLITNRRPDAAIEVCLERDRIDPSKIPEPQRTLVEVELGG 
VEECKLFFRQLCIRHSDKGYLALEHEVDARLRLHGTPEGIANLKYVALNWATQKNLPAPD GWIKLAEVGTILRASPPAPLPEDFVVPEGYEVPDEIFHQEFVQSTINATGKAIVLTGPPG RGKSTYLSALCDTLAEKGIPTVRHHYFLSTTERGRDRVNSYIVEQSILAQIQQFHEDVPK TGGGLHTQLEACAAHYKVLGKPFVLILDGLDHVWRINAEDKRPLDDLFCQVIPCPENMVL LVGTQPVNDEQLPADLLAFAPKSEWHTLPAMSESAVLSYLRRTVQEGWLTTGFESEEQIE EQLQSAASALRKKTNGHPLHVIYATSELKHSGRRLSSWDIEQLKGDLSQDARFYYASLWE RLAPSQKDTLRLICAFTFFWPKTAFSEIATKIKAVEPDVDKVEHLLHSSVAGLKVFHESL AVFVRATDDYESRINELMPVIADWLEHQAPTSLRVNWLWTVQAKLGNPNNLITGLTRDWI MSRLKEGYSEALFEILLSEALSAALETANFSDAYRLAHLKARMVDGSQFQMQDSDLARLI SFTLTLTSEESVIRESIASRHETDILHVVALGLALLSRGDIIQAEICGEEAFHRFRALCQ FSNKYNSRAGSDELLFLASAFTQLGVIADTSENLAWLVDKNSPDIWLPRVQMLIQKGNLD DLMLLAASLPDGQKKNIISDACIRAAAVAGVVVTEREDFDELVRTPFVAAIEVALIKTST PSSQPIPVNWISDSYYKHKDNLATLTHHWLFGAVLLWSTKTGHR Prediction of potential genes in microbial genomes Time: Sun May 15 23:18:02 2011 Seq name: gi|296494680|gb|ADTN01000058.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont152.1, whole genome shotgun sequence Length of sequence - 8626 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 60 - 443 257 ## UTI89_C4570 hypothetical protein 2 2 Tu 1 . - CDS 506 - 949 415 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Prom 969 - 1028 3.5 + Prom 925 - 984 4.4 3 3 Op 1 5/0.000 + CDS 1106 - 2035 885 ## COG1897 Homoserine trans-succinylase + Term 2081 - 2109 1.0 + Prom 2121 - 2180 7.2 4 3 Op 2 7/0.000 + CDS 2304 - 3905 1651 ## COG2225 Malate synthase 5 3 Op 3 5/0.000 + CDS 3935 - 5239 1532 ## COG2224 Isocitrate lyase + Term 5462 - 5505 2.1 + Prom 5257 - 5316 2.5 6 3 Op 4 . + CDS 5520 - 7256 1268 ## COG4579 Isocitrate dehydrogenase kinase/phosphatase 7 4 Tu 1 . 
- CDS 7225 - 8577 310 ## COG0666 FOG: Ankyrin repeat Predicted protein(s) >gi|296494680|gb|ADTN01000058.1| GENE 1 60 - 443 257 127 aa, chain + ## HITS:1 COG:no KEGG:UTI89_C4570 NR:ns ## KEGG: UTI89_C4570 # Name: yjaA # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 127 13 139 139 223 97.0 2e-57 MSVLYIQIRRNQITVRDLESKREVSGDAAFSNQRLLIANFFVAEKVLQDLVLQLHPRSTW HSFLPAKRMDIVVSALEMNEGGLSQVEERILHEVVAGATLMKYRQFHIHAQSAVLSDSAV MAMLKQK >gi|296494680|gb|ADTN01000058.1| GENE 2 506 - 949 415 147 aa, chain - ## HITS:1 COG:yjaB KEGG:ns NR:ns ## COG: yjaB COG0454 # Protein_GI_number: 16131838 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Escherichia coli K12 # 1 147 1 147 147 304 100.0 4e-83 MVISIRRSRHEEGEELVAIWCRSVDATHDFLSAEYRTELEDLVRSFLPEAPLWVAVNERD QPVGFMLLSGQHMDALFIDPDVRGCGVGRVLVEHALSMAPELTTNVNEQNEQAVGFYKKV GFKVTGRSEVDDLGKPYPLLNLAYVGA >gi|296494680|gb|ADTN01000058.1| GENE 3 1106 - 2035 885 309 aa, chain + ## HITS:1 COG:metA KEGG:ns NR:ns ## COG: metA COG1897 # Protein_GI_number: 16131839 # Func_class: E Amino acid transport and metabolism # Function: Homoserine trans-succinylase # Organism: Escherichia coli K12 # 1 309 1 309 309 627 99.0 1e-179 MPIRVPDELPAVNFLREENVFVMTTSRASGQEIRPLKVLILNLMPKKIETENQFLRLLSN SPLQVDIQLLRIDSRESRNTPAEHLNNFYCNFEDIQDQNFDGLIVTGAPLGLVEFNDVAY WPQIKQVLEWSKDHVTSTLFVCWAVQAALNILYGIPKQTRTDKLSGVYEHHILHPHALLT RGFDDSFLAPHSRYADFPAALIRDYTDLEILAETEEGDAYLFASKDKRIAFVTGHPEYDA QTLAQEFFRDVEAGLDPDVPYNYFPHNDPQNTPRASWRSHGNLLFTNWLNYYVYQITPYD LRHMNPTLD >gi|296494680|gb|ADTN01000058.1| GENE 4 2304 - 3905 1651 533 aa, chain + ## HITS:1 COG:aceB KEGG:ns NR:ns ## COG: aceB COG2225 # Protein_GI_number: 16131840 # Func_class: C Energy production and conversion # Function: Malate synthase # Organism: Escherichia coli K12 # 1 533 1 533 533 1112 100.0 0 MTEQATTTDELAFTRPYGEQEKQILTAEAVEFLTELVTHFTPQRNKLLAARIQQQQDIDN 
GTLPDFISETASIRDADWKIRGIPADLEDRRVEITGPVERKMVINALNANVKVFMADFED SLAPDWNKVIDGQINLRDAVNGTISYTNEAGKIYQLKPNPAVLICRVRGLHLPEKHVTWR GEAIPGSLFDFALYFFHNYQALLAKGSGPYFYLPKTQSWQEAAWWSEVFSYAEDRFNLPR GTIKATLLIETLPAVFQMDEILHALRDHIVGLNCGRWDYIFSYIKTLKNYPDRVLPDRQA VTMDKPFLNAYSRLLIKTCHKRGAFAMGGMAAFIPSKDEEHNNQVLNKVKADKSLEANNG HDGTWIAHPGLADTAMAVFNDILGSRKNQLEVMREQDAPITADQLLAPCDGERTEEGMRA NIRVAVQYIEAWISGNGCVPIYGLMEDAATAEISRTSIWQWIHHQKTLSNGKPVTKALFR QMLGEEMKVIASELGEERFSQGRFDDAARLMEQITTSDELIDFLTLPGYRLLA >gi|296494680|gb|ADTN01000058.1| GENE 5 3935 - 5239 1532 434 aa, chain + ## HITS:1 COG:aceA KEGG:ns NR:ns ## COG: aceA COG2224 # Protein_GI_number: 16131841 # Func_class: C Energy production and conversion # Function: Isocitrate lyase # Organism: Escherichia coli K12 # 1 434 1 434 434 866 100.0 0 MKTRTQQIEELQKEWTQPRWEGITRPYSAEDVVKLRGSVNPECTLAQLGAAKMWRLLHGE SKKGYINSLGALTGGQALQQAKAGIEAVYLSGWQVAADANLAASMYPDQSLYPANSVPAV VERINNTFRRADQIQWSAGIEPGDPRYVDYFLPIVADAEAGFGGVLNAFELMKAMIEAGA AAVHFEDQLASVKKCGHMGGKVLVPTQEAIQKLVAARLAADVTGVPTLLVARTDADAADL ITSDCDPYDSEFITGERTSEGFFRTHAGIEQAISRGLAYAPYADLVWCETSTPDLELARR FAQAIHAKYPGKLLAYNCSPSFNWQKNLDDKTIASFQQQLSDMGYKFQFITLAGIHSMWF NMFDLANAYAQGEGMKHYVEKVQQPEFAAAKDGYTFVSHQQEVGTGYFDKVTTIIQGGTS SVTALTGSTEESQF >gi|296494680|gb|ADTN01000058.1| GENE 6 5520 - 7256 1268 578 aa, chain + ## HITS:1 COG:aceK KEGG:ns NR:ns ## COG: aceK COG4579 # Protein_GI_number: 16131842 # Func_class: T Signal transduction mechanisms # Function: Isocitrate dehydrogenase kinase/phosphatase # Organism: Escherichia coli K12 # 1 578 1 578 578 1189 100.0 0 MPRGLELLIAQTILQGFDAQYGRFLEVTSGAQQRFEQADWHAVQQAMKNRIHLYDHHVGL VVEQLRCITNGQSTDAAFLLRVKEHYTRLLPDYPRFEIAESFFNSVYCRLFDHRSLTPER LFIFSSQPERRFRTIPRPLAKDFHPDHGWESLLMRVISDLPLRLRWQNKSRDIHYIIRHL TETLGTDNLAESHLQVANELFYRNKAAWLVGKLITPSGTLPFLLPIHQTDDGELFIDTCL TTTAEASIVFGFARSYFMVYAPLPAALVEWLREILPGKTTAELYMAIGCQKHAKTESYRE YLVYLQGCNEQFIEAPGIRGMVMLVFTLPGFDRVFKVIKDRFAPQKEMSAAHVRACYQLV KEHDRVGRMADTQEFENFVLEKRHISPALMELLLQEAAEKITDLGEQIVIRHLYIERRMV 
PLNIWLEQVEGQQLRDAIEEYGNAIRQLAAANIFPGDMLFKNFGVTRHGRVVFYDYDEIC YMTEVNFRDIPPPRYPEDELASEPWYSVSPGDVFPEEFRHWLCADPRIGPLFEEMHADLF RADYWRALQNRIREGHVEDVYAYRRRQRFSVRYGEMLF >gi|296494680|gb|ADTN01000058.1| GENE 7 7225 - 8577 310 450 aa, chain - ## HITS:1 COG:arp KEGG:ns NR:ns ## COG: arp COG0666 # Protein_GI_number: 16131843 # Func_class: R General function prediction only # Function: FOG: Ankyrin repeat # Organism: Escherichia coli K12 # 1 450 279 728 728 877 100.0 0 MSESKENIKHYSLMDFMNVDYSLLKWSNDHVINQSVAIIPALPKEQLLMLKGSVDEITPP LSPATMNLLMAIGQNHQLTQLMIQLQKMPELHRTEMLTAYNSINLPGLYLAINYGNADIV ETIFNSLSETGYEGLLSKKNLMHILEAKDKNGFSGLFLAISRKDKNVVTSILNALPKLAA THHLDNEQVYKFLSAKNRTSSHVLYHVMANGDADMLKIVLNALPLLIRTCHLTKEQVLDL LKAKDFYGCPGLYLAMQNGHSDIVKVILEALPSLAQEINISASDIVDLLTAKSLARDTGL FMAMQRGHMNVINTIFNALPTLFNTFKFDKKNMKPLLLANNSNEYPGLFSAIQHKQQNVV ETVYLALSDHARLFGFTAEDIMDFWQHKAPQKYSAFELAFEFGHRVIAELILNTLNKMAE SFGFTDNPRYIAEKNYMEALLKKASPHTVR Prediction of potential genes in microbial genomes Time: Sun May 15 23:18:08 2011 Seq name: gi|296494679|gb|ADTN01000059.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont166.1, whole genome shotgun sequence Length of sequence - 10167 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 3, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 103 - 162 3.1 1 1 Op 1 2/0.000 + CDS 233 - 451 77 ## COG3209 Rhs family protein 2 1 Op 2 2/0.000 + CDS 472 - 1314 219 ## COG1413 FOG: HEAT repeat 3 1 Op 3 . + CDS 1362 - 2300 345 ## COG3209 Rhs family protein 4 1 Op 4 . + CDS 2312 - 2773 -36 ## JW3570 conserved hypothetical protein - Term 4310 - 4344 -0.8 5 2 Op 1 . - CDS 4378 - 5514 1071 ## COG1566 Multidrug resistance efflux pump 6 2 Op 2 . 
- CDS 5517 - 5879 272 ## EcSMS35_3931 hypothetical protein - Prom 6120 - 6179 5.2 7 3 Op 1 11/0.000 + CDS 6416 - 8329 2512 ## COG2213 Phosphotransferase system, mannitol-specific IIBC component + Term 8336 - 8376 6.2 + Prom 8365 - 8424 2.6 8 3 Op 2 7/0.000 + CDS 8559 - 9707 1607 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases 9 3 Op 3 . + CDS 9707 - 10162 355 ## COG3722 Transcriptional regulator Predicted protein(s) >gi|296494679|gb|ADTN01000059.1| GENE 1 233 - 451 77 72 aa, chain + ## HITS:1 COG:rhsA KEGG:ns NR:ns ## COG: rhsA COG3209 # Protein_GI_number: 16131464 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli K12 # 1 72 1306 1377 1377 154 100.0 3e-38 MIASRGNVADTGITDRVNDIINDRFWSDGKKPDRCDVLQELIDCGDISAKDAKSTQKAWN CRHSRQSNDKKR >gi|296494679|gb|ADTN01000059.1| GENE 2 472 - 1314 219 280 aa, chain + ## HITS:1 COG:ZyibA KEGG:ns NR:ns ## COG: ZyibA COG1413 # Protein_GI_number: 15804135 # Func_class: C Energy production and conversion # Function: FOG: HEAT repeat # Organism: Escherichia coli O157:H7 EDL933 # 1 280 1 280 280 509 99.0 1e-144 MSNTYQKRKASKEYGLYNQCKKLNDDELFRLLDDRNSLKRISSARVLQLRGGQDAVRLAI EFCSDKNYIRRDIGAFILGQIKICKKCEDNVFNILNNMALNDKSACVRATAIESTAQRCK KNPIYSPKIVEQSQITAFDKSTNVRRATAFAISVINDKATIPLLINLLKDPNGDVRNWAA FAININKYDNSDIRDCFVEMLQDKNEEVRIEAIIGLSYRKDKRVLSVLCDELKKNTVYDD IIEAAGELGDKTLLPVLDTMLYKFDDNEIITSAIDKLKRS >gi|296494679|gb|ADTN01000059.1| GENE 3 1362 - 2300 345 312 aa, chain + ## HITS:1 COG:ECs4470 KEGG:ns NR:ns ## COG: ECs4470 COG3209 # Protein_GI_number: 15833724 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli O157:H7 # 1 312 1098 1409 1409 639 97.0 0 MLDRLESEILADRVSEESRRWLASCGLTVEQMQNQMDPVYTPARKIHLYHCDHRGLPLAL ISTEGATAWCAEYDEWGNLLSDENPHHLQQLIRLPGQQYDEESGLYYNRHRYYDPLQGRY ITQDPIGLKGGWNFYQYPLNPVINVDPQGLVDINLYPESDLIHSVADEINIPGVFTIGGH GTPTSIESATRSIMTAKDLAYLIKFDGNYKDGMTVWLFSCNTGKGQNSFASQLAKELHTN 
VIGPDTLWTWWGRGTNGKLKMDTVLTAPTNLNSNKDLMAITTKDLGNWITYGPSGHPISN MQGTPEKPSDIR >gi|296494679|gb|ADTN01000059.1| GENE 4 2312 - 2773 -36 153 aa, chain + ## HITS:1 COG:no KEGG:JW3570 NR:ns ## KEGG: JW3570 # Name: yibG # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 153 1 153 153 277 100.0 7e-74 MKACLLLFFYFSFICQLHGADVKIKQNESMMGSTAMTYDLSEEKLMKLKYKSQHGDSEAS FRLYQYYCFTKNNIYKQLRFLERSASQGNVTAQFNYGVFLSDTNPTLSEYYNLNRAIYWM EFAVNNGNIDAKSKLQELKKLKRMDRRKNKENP >gi|296494679|gb|ADTN01000059.1| GENE 5 4378 - 5514 1071 378 aa, chain - ## HITS:1 COG:ECs4473 KEGG:ns NR:ns ## COG: ECs4473 COG1566 # Protein_GI_number: 15833727 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Escherichia coli O157:H7 # 1 378 1 378 378 723 100.0 0 MDLLIVLTYVALAWAVFKIFRIPVNQWTLATAALGGVFLVSGLILLMNYNHPYTFTAQKA VIAIPITPQVTGIVTEVTDKNNQLIQKGEVLFKLDPVRYQARVDRLQADLMTATHNIKTL RAQLTEAQANTTQVSAERDRLFKNYQRYLKGSQAAVNPFSERDIDDARQNFLAQDALVKG SVAEQAQIQSQLDSMVNGEQSQIVSLRAQLTEAKYNLEQTVIRAPSNGYVTQVLIRPGTY AAALPLRPVMVFIPEQKRQIVAQFRQNSLLRLKPGDDAEVVFNALPGQVFHGKLTSILPV VPGGSYQAQGVLQSLTVVPGTDGVLGTIELDPNDDIDALPDGIYAQVAVYSDHFSHVSVM RKVLLRMTSWMHYLYLDH >gi|296494679|gb|ADTN01000059.1| GENE 6 5517 - 5879 272 120 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_3931 NR:ns ## KEGG: EcSMS35_3931 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 120 35 154 154 238 100.0 6e-62 MFLNYFALGVLIFVFLVIFYGIIAIHDIPYLIAKKRNHPHADAIHTAGWVSLFTLHVIWP FLWIWATLYQPERGWGMQSHVASQEKATDPEIAALSDRISRLEHQLAAEKKTDYSTFPEI >gi|296494679|gb|ADTN01000059.1| GENE 7 6416 - 8329 2512 637 aa, chain + ## HITS:1 COG:mtlA_1 KEGG:ns NR:ns ## COG: mtlA_1 COG2213 # Protein_GI_number: 16131470 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannitol-specific IIBC component # Organism: Escherichia coli K12 # 1 493 1 493 493 926 100.0 0 MSSDIKIKVQSFGRFLSNMVMPNIGAFIAWGIITALFIPTGWLPNETLAKLVGPMITYLL 
PLLIGYTGGKLVGGERGGVVGAITTMGVIVGADMPMFLGSMIAGPLGGWCIKHFDRWVDG KIKSGFEMLVNNFSAGIIGMILAILAFLGIGPIVEALSKMLAAGVNFMVVHDMLPLASIF VEPAKILFLNNAINHGIFSPLGIQQSHELGKSIFFLIEANPGPGMGVLLAYMFFGRGSAK QSAGGAAIIHFLGGIHEIYFPYVLMNPRLILAVILGGMTGVFTLTILGGGLVSPASPGSI LAVLAMTPKGAYFANIAGVCAAMAVSFVVSAILLKTSKVKEEDDIEAATRRMQDMKAESK GASPLSAGDVTNDLSHVRKIIVACDAGMGSSAMGAGVLRKKIQDAGLSQISVTNSAINNL PPDVDLVITHRDLTERAMRQVPQAQHISLTNFLDSGLYTSLTERLVAAQRHTANEEKVKD SLKDSFDDSSANLFKLGAENIFLGRKAATKEEAIRFAGEQLVKGGYVEPEYVQAMLDREK LTPTYLGESIAVPHGTVEAKDRVLKTGVVFCQYPEGVRFGEEEDDIARLVIGIAARNNEH IQVITSLTNALDDESVIERLAHTTSVDEVLELLAGRK >gi|296494679|gb|ADTN01000059.1| GENE 8 8559 - 9707 1607 382 aa, chain + ## HITS:1 COG:mtlD KEGG:ns NR:ns ## COG: mtlD COG0246 # Protein_GI_number: 16131471 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Escherichia coli K12 # 1 382 1 382 382 733 99.0 0 MKALHFGAGNIGRGFIGKLLADAGIQLTFADVNQVVLDALNARHSYQVHVVGETEQVDTV SGVNAVSSIGDDVVDLIAQVDLVTTAVGPVVLERIAPAIAKGLVKRKEQGNESPLNIIAC ENMVRGTTQLKGHVMNALPEDAKAWVEEHVGFVDSAVDRIVPPSASATNDPLEVTVETFS EWIVDKTQFKGALPNIPGMELTDNLMAFVERKLFTLNTGHAITAYLGKLAGHQTIRDAIL DEKIRAVVKGAMEESGAVLIKRYGFDADKHAAYIQKILGRFENPYLKDDVERVGRQPLRK LSAGDRLIKPLLGTLEYGLPHKNLIEGIAAAMHFRSEDDPQAQELAALIADKGPQAALAQ ISGLDANSEVVSEAVTAYKAMQ >gi|296494679|gb|ADTN01000059.1| GENE 9 9707 - 10162 355 151 aa, chain + ## HITS:1 COG:mtlR KEGG:ns NR:ns ## COG: mtlR COG3722 # Protein_GI_number: 16131472 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 146 1 146 195 278 99.0 3e-75 MVDQAQDTLRPNNRLSDMQATMEQTQAFENRVLERLNAGKTVRSFLITAVELLTEAVNLL VLQVFRKDDYAVKYAVEPLLDGDGPLGDLSVRLKLIYGLGVINRQEYEDAELLMALREEL NHDGNEYAFTDDEILGPFGELHCVAGKQYRR Prediction of potential genes in microbial genomes Time: Sun May 15 23:18:15 2011 Seq name: gi|296494678|gb|ADTN01000060.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont166.2, whole genome shotgun sequence Length of sequence - 5078 bp Number of 
predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 223 - 432 228 ## JW3576 hypothetical protein - Prom 603 - 662 4.0 + Prom 507 - 566 4.8 2 2 Tu 1 . + CDS 717 - 1079 467 ## ECSP_4596 hypothetical protein + Term 1180 - 1217 7.1 + Prom 1271 - 1330 2.5 3 3 Op 1 4/0.000 + CDS 1451 - 3106 1906 ## COG1620 L-lactate permease 4 3 Op 2 5/0.000 + CDS 3106 - 3882 783 ## COG2186 Transcriptional regulators 5 3 Op 3 . + CDS 3879 - 5069 1505 ## COG1304 L-lactate dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases Predicted protein(s) >gi|296494678|gb|ADTN01000060.1| GENE 1 223 - 432 228 69 aa, chain - ## HITS:1 COG:no KEGG:JW3576 NR:ns ## KEGG: JW3576 # Name: yibT # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 69 1 69 69 121 100.0 6e-27 MGKLGENVPLLIDKAVDFMASSQAFREYLKKLPPRNAIPSGIPDESVPLYLQRLEYYRRL YRPKQVEGQ >gi|296494678|gb|ADTN01000060.1| GENE 2 717 - 1079 467 120 aa, chain + ## HITS:1 COG:no KEGG:ECSP_4596 NR:ns ## KEGG: ECSP_4596 # Name: yibL # Def: hypothetical protein # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 120 1 120 120 167 100.0 1e-40 MKEVEKNEIKRLSDRLDAIRHQQADLSLVEAADKYAELEKEKATLEAEIARLREVHSQKL SKEAQKLMKMPFQRAITKKEQADMGKLKKSVRGLVVVHPMTALGREMGLQEMTGFSKTAF >gi|296494678|gb|ADTN01000060.1| GENE 3 1451 - 3106 1906 551 aa, chain + ## HITS:1 COG:lldP KEGG:ns NR:ns ## COG: lldP COG1620 # Protein_GI_number: 16131474 # Func_class: C Energy production and conversion # Function: L-lactate permease # Organism: Escherichia coli K12 # 1 551 1 551 551 928 100.0 0 MNLWQQNYDPAGNIWLSSLIASLPILFFFFALIKLKLKGYVAASWTVAIALAVALLFYKM PVANALASVVYGFFYGLWPIAWIIIAAVFVYKISVKTGQFDIIRSSILSITPDQRLQMLI VGFCFGAFLEGAAGFGAPVAITAALLVGLGFKPLYAAGLCLIVNTAPVAFGAMGIPILVA GQVTGIDSFEIGQMVGRQLPFMTIIVLFWIMAIMDGWRGIKETWPAVVVAGGSFAIAQYL SSNFIGPELPDIISSLVSLLCLTLFLKRWQPVRVFRFGDLGASQVDMTLAHTGYTAGQVL 
RAWTPFLFLTATVTLWSIPPFKALFASGGALYEWVINIPVPYLDKLVARMPPVVSEATAY AAVFKFDWFSATGTAILFAALLSIVWLKMKPSDAISTFGSTLKELALPIYSIGMVLAFAF ISNYSGLSSTLALALAHTGHAFTFFSPFLGWLGVFLTGSDTSSNALFAALQATAAQQIGV SDLLLVAANTTGGVTGKMISPQSIAIACAAVGLVGKESDLFRFTVKHSLIFTCIVGVITT LQAYVLTWMIP >gi|296494678|gb|ADTN01000060.1| GENE 4 3106 - 3882 783 258 aa, chain + ## HITS:1 COG:lldR KEGG:ns NR:ns ## COG: lldR COG2186 # Protein_GI_number: 16131475 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 258 1 258 258 489 100.0 1e-138 MIVLPRRLSDEVADRVRALIDEKNLEAGMKLPAERQLAMQLGVSRNSLREALAKLVSEGV LLSRRGGGTFIRWRHDTWSEQNIVQPLKTLMADDPDYSFDILEARYAIEASTAWHAAMRA TPGDKEKIQLCFEATLSEDPDIASQADVRFHLAIAEASHNIVLLQTMRGFFDVLQSSVKH SRQRMYLVPPVFSQLTEQHQAVIDAIFAGDADGARKAMMAHLSFVHTTMKRFDEDQARHA RITRLPGEHNEHSREKNA >gi|296494678|gb|ADTN01000060.1| GENE 5 3879 - 5069 1505 396 aa, chain + ## HITS:1 COG:lldD KEGG:ns NR:ns ## COG: lldD COG1304 # Protein_GI_number: 16131476 # Func_class: C Energy production and conversion # Function: L-lactate dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases # Organism: Escherichia coli K12 # 1 396 1 396 396 783 100.0 0 MIISAASDYRAAAQRILPPFLFHYMDGGAYSEYTLRRNVEDLSEVALRQRILKNMSDLSL ETTLFNEKLSMPVALAPVGLCGMYARRGEVQAAKAADAHGIPFTLSTVSVCPIEEVAPAI KRPMWFQLYVLRDRGFMRNALERAKAAGCSTLVFTVDMPTPGARYRDAHSGMSGPNAAMR RYLQAVTHPQWAWDVGLNGRPHDLGNISAYLGKPTGLEDYIGWLGNNFDPSISWKDLEWI RDFWDGPMVIKGILDPEDARDAVRFGADGIVVSNHGGRQLDGVLSSARALPAIADAVKGD IAILADSGIRNGLDVVRMIALGADTVLLGRAFLYALATAGQAGVANLLNLIEKEMKVAMT LTGAKSISEITQDSLVQGLGKELPAALAPMAKGNAA Prediction of potential genes in microbial genomes Time: Sun May 15 23:18:24 2011 Seq name: gi|296494677|gb|ADTN01000061.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont166.3, whole genome shotgun sequence Length of sequence - 15989 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 7, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 
. + CDS 89 - 562 433 ## COG0219 Predicted rRNA methylase (SpoU class) 2 2 Tu 1 6/0.250 - CDS 615 - 1436 865 ## COG1045 Serine acetyltransferase - Term 1449 - 1490 6.7 3 3 Op 1 7/0.000 - CDS 1516 - 2535 1086 ## COG0240 Glycerol-3-phosphate dehydrogenase 4 3 Op 2 9/0.000 - CDS 2535 - 3002 489 ## COG1952 Preprotein translocase subunit SecB 5 3 Op 3 7/0.000 - CDS 3065 - 3316 304 ## COG0695 Glutaredoxin and related proteins - Prom 3388 - 3447 2.4 6 3 Op 4 . - CDS 3458 - 3889 473 ## COG0607 Rhodanese-related sulfurtransferase - Prom 3929 - 3988 5.1 + Prom 4044 - 4103 4.2 7 4 Op 1 4/0.500 + CDS 4134 - 5678 1761 ## COG0696 Phosphoglyceromutase 8 4 Op 2 3/0.500 + CDS 5712 - 6971 1361 ## COG4942 Membrane-bound metallopeptidase 9 4 Op 3 . + CDS 6975 - 7934 818 ## COG2861 Uncharacterized protein conserved in bacteria + Term 7936 - 7977 -0.5 10 5 Tu 1 5/0.250 - CDS 7921 - 8952 502 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 9125 - 9184 5.7 - Term 9154 - 9184 3.0 11 6 Op 1 9/0.000 - CDS 9194 - 10219 1123 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 12 6 Op 2 . - CDS 10229 - 11425 1650 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes - Prom 11509 - 11568 3.9 + Prom 11451 - 11510 2.6 13 7 Op 1 6/0.250 + CDS 11639 - 12619 987 ## COG0451 Nucleoside-diphosphate-sugar epimerases 14 7 Op 2 11/0.000 + CDS 12699 - 13745 1068 ## COG0859 ADP-heptose:LPS heptosyltransferase 15 7 Op 3 6/0.250 + CDS 13749 - 14705 793 ## COG0859 ADP-heptose:LPS heptosyltransferase 16 7 Op 4 . 
+ CDS 14744 - 15961 323 ## COG3307 Lipid A core - O-antigen ligase and related enzymes Predicted protein(s) >gi|296494677|gb|ADTN01000061.1| GENE 1 89 - 562 433 157 aa, chain + ## HITS:1 COG:yibK KEGG:ns NR:ns ## COG: yibK COG0219 # Protein_GI_number: 16131477 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase (SpoU class) # Organism: Escherichia coli K12 # 1 157 1 157 157 323 100.0 7e-89 MLNIVLYEPEIPPNTGNIIRLCANTGFRLHIIEPMGFAWDDKRLRRAGLDYHEFTAVTRH HDYRAFLEAENPQRLFALTTKGTPAHSAVSYQDGDYLMFGPETRGLPASILDALPAEQKI RIPMVPDSRSMNLSNAVSVVVYEAWRQLGYPGAVLRD >gi|296494677|gb|ADTN01000061.1| GENE 2 615 - 1436 865 273 aa, chain - ## HITS:1 COG:ECs4485 KEGG:ns NR:ns ## COG: ECs4485 COG1045 # Protein_GI_number: 15833739 # Func_class: E Amino acid transport and metabolism # Function: Serine acetyltransferase # Organism: Escherichia coli O157:H7 # 1 273 1 273 273 551 100.0 1e-157 MSCEELEIVWNNIKAEARTLADCEPMLASFYHATLLKHENLGSALSYMLANKLSSPIMPA IAIREVVEEAYAADPEMIASAACDIQAVRTRDPAVDKYSTPLLYLKGFHALQAYRIGHWL WNQGRRALAIFLQNQVSVTFQVDIHPAAKIGRGIMLDHATGIVVGETAVIENDVSILQSV TLGGTGKSGGDRHPKIREGVMIGAGAKILGNIEVGRGAKIGAGSVVLQPVPPHTTAAGVP ARIVGKPDSDKPSMDMDQHFNGINHTFEYGDGI >gi|296494677|gb|ADTN01000061.1| GENE 3 1516 - 2535 1086 339 aa, chain - ## HITS:1 COG:ECs4486 KEGG:ns NR:ns ## COG: ECs4486 COG0240 # Protein_GI_number: 15833740 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Escherichia coli O157:H7 # 1 339 1 339 339 664 100.0 0 MNQRNASMTVIGAGSYGTALAITLARNGHEVVLWGHDPEHIATLERDRCNAAFLPDVPFP DTLHLESDLATALAASRNILVVVPSHVFGEVLRQIKPLMRPDARLVWATKGLEAETGRLL QDVAREALGDQIPLAVISGPTFAKELAAGLPTAISLASTDQTFADDLQQLLHCGKSFRVY SNPDFIGVQLGGAVKNVIAIGAGMSDGIGFGANARTALITRGLAEMSRLGAALGADPATF MGMAGLGDLVLTCTDNQSRNRRFGMMLGQGMDVQSAQEKIGQVVEGYRNTKEVRELAHRF GVEMPITEEIYQVLYCGKNAREAALTLLGRARKDERSSH >gi|296494677|gb|ADTN01000061.1| GENE 4 2535 - 3002 489 155 aa, chain - ## HITS:1 COG:ECs4487 KEGG:ns NR:ns ## COG: ECs4487 
COG1952 # Protein_GI_number: 15833741 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecB # Organism: Escherichia coli O157:H7 # 1 155 1 155 155 309 100.0 1e-84 MSEQNNTEMTFQIQRIYTKDISFEAPNAPHVFQKDWQPEVKLDLDTASSQLADDVYEVVL RVTVTASLGEETAFLCEVQQGGIFSIAGIEGTQMAHCLGAYCPNILFPYARECITSMVSR GTFPQLNLAPVNFDALFMNYLQQQAGEGTEEHQDA >gi|296494677|gb|ADTN01000061.1| GENE 5 3065 - 3316 304 83 aa, chain - ## HITS:1 COG:ECs4488 KEGG:ns NR:ns ## COG: ECs4488 COG0695 # Protein_GI_number: 15833742 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutaredoxin and related proteins # Organism: Escherichia coli O157:H7 # 1 83 1 83 83 173 100.0 9e-44 MANVEIYTKETCPYCHRAKALLSSKGVSFQELPIDGNAAKREEMIKRSGRTTVPQIFIDA QHIGGCDDLYALDARGGLDPLLK >gi|296494677|gb|ADTN01000061.1| GENE 6 3458 - 3889 473 143 aa, chain - ## HITS:1 COG:ECs4489 KEGG:ns NR:ns ## COG: ECs4489 COG0607 # Protein_GI_number: 15833743 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Escherichia coli O157:H7 # 1 143 1 143 143 276 100.0 6e-75 MQEIMQFVGRHPILSIAWIALLVAVLVTTFKSLTSKVKVITRGEATRLINKEDAVVVDLR QRDDFRKGHIAGSINLLPSEIKANNVGELEKHKDKPVIVVDGSGMQCQEPANALTKAGFA QVFVLKEGVAGWAGENLPLVRGK >gi|296494677|gb|ADTN01000061.1| GENE 7 4134 - 5678 1761 514 aa, chain + ## HITS:1 COG:ECs4490 KEGG:ns NR:ns ## COG: ECs4490 COG0696 # Protein_GI_number: 15833744 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglyceromutase # Organism: Escherichia coli O157:H7 # 1 514 1 514 514 1008 100.0 0 MSVSKKPMVLVILDGYGYREEQQDNAIFSAKTPVMDALWANRPHTLIDASGLEVGLPDRQ MGNSEVGHVNLGAGRIVYQDLTRLDVEIKDRAFFANPVLTGAVDKAKNAGKAVHIMGLLS AGGVHSHEDHIMAMVELAAERGAEKIYLHAFLDGRDTPPRSAESSLKKFEEKFAALGKGR VASIIGRYYAMDRDNRWDRVEKAYDLLTLAQGEFQADTAVAGLQAAYARDENDEFVKATV IRAEGQPDAAMEDGDALIFMNFRADRAREITRAFVNADFDGFARKKVVNVDFVMLTEYAA DIKTAVAYPPASLVNTFGEWMAKNDKTQLRISETEKYAHVTFFFNGGVEESFKGEDRILI 
NSPKVATYDLQPEMSSAELTEKLVAAIKSGKYDTIICNYPNGDMVGHTGVMEAAVKAVEA LDHCVEEVAKAVESVGGQLLITADHGNAEQMRDPATGQAHTAHTNLPVPLIYVGDKNVKA VAGGKLSDIAPTMLSLMGMEIPQEMTGKPLFIVE >gi|296494677|gb|ADTN01000061.1| GENE 8 5712 - 6971 1361 419 aa, chain + ## HITS:1 COG:ECs4491 KEGG:ns NR:ns ## COG: ECs4491 COG4942 # Protein_GI_number: 15833745 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Membrane-bound metallopeptidase # Organism: Escherichia coli O157:H7 # 1 419 9 427 427 602 100.0 1e-172 MTRAVKPRRFAIRPIIYASVLSAGVLLCAFSAHADERDQLKSIQADIAAKERAVRQKQQQ RASLLAQLKKQEEAISEATRKLRETQNTLNQLNKQIDEMNASIAKLEQQKAAQERSLAAQ LDAAFRQGEHTGIQLILSGEESQRGQRLQAYFGYLNQARQETIAQLKQTREEVAMQRAEL EEKQSEQQTLLYEQRAQQAKLTQALNERKKTLAGLESSIQQGQQQLSELRANESRLRNSI ARAEAAAKARAEREAREAQAVRDRQKEATRKGTTYKPTESEKSLMSRTGGLGAPRGQAFW PVRGPTLHRYGEQLQGELRWKGMVIGASEGTEVKAIADGRVILADWLQGYGLVVVVEHGK GDMSLYGYNQSALVSVGSQVRAGQPIALVGSSGGQGRPSLYFEIRRQGQAVNPQPWLGR >gi|296494677|gb|ADTN01000061.1| GENE 9 6975 - 7934 818 319 aa, chain + ## HITS:1 COG:ECs4492 KEGG:ns NR:ns ## COG: ECs4492 COG2861 # Protein_GI_number: 15833746 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 43 319 1 277 277 529 99.0 1e-150 MFPFRRNVLAFAALLALSSPVLAGKLAIVIDDFGYRPHNENQVLAMPSAISVAVLPDSPH AREMATKAHNSGHEVLIHLPMAPLSKQPLEKNTLRPEMSSDEIERIIRSAVNNVPYAVGI NNHMGSKMTSNLFGMQKVMQALERYNLYFLDSVTIGNTQAMRAAQGTGVKVIKRKVFLDD SQNEADIRVQFNRAIDLARRNGSTIAIGHPHPSTVRVLQQMVYNLPPDITLVKASSLLNE PQVDTSTPPKNAVPDAPRNPFRGVKLCKPKKPIEPVYANRFFEVLSESISQSTLIVYFQH QWQGWGKQPEAAKFNASAN >gi|296494677|gb|ADTN01000061.1| GENE 10 7921 - 8952 502 343 aa, chain - ## HITS:1 COG:yibD KEGG:ns NR:ns ## COG: yibD COG0463 # Protein_GI_number: 16131486 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Escherichia coli K12 # 1 343 2 344 344 682 99.0 0 MNSTNKLSVIIPLYNAGDDFRTCMESLITQTWTALEIIIINDGSTDNSVEIAKHYAENYP 
HVRLLHQANAGASVARNRGIEVATGKYVAFVDADDEVYPTMYETLMTMALEDDLDVAQCN ADWCFRETGETWQSIPTDRLRSTGVLTGPDWLRMGLSSRRWTHVVWMGVYRRDVIVKNNI KFIAGLHHQDIVWTTEFMFNALRARYTEQSLYKYYLHNTSVSRLHRQGNKNLNYQRHYIK ITRLLEKLNRNYADKIMIYPEFHQQITYEALRVCHAVRKEPDILTRQRMIAEIFTSGMYK RLITNVRSVKVGYQALLWSFRLWQWRDKTRSHHRITRSAFNLR >gi|296494677|gb|ADTN01000061.1| GENE 11 9194 - 10219 1123 341 aa, chain - ## HITS:1 COG:tdh KEGG:ns NR:ns ## COG: tdh COG1063 # Protein_GI_number: 16131487 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 341 1 341 341 707 99.0 0 MKALSKLKAEEGIWMTDVPEPELGHNDLLIKIRKTAICGTDVHIYNWDEWSQKTIPVPMV VGHEYVGEVVGIGQEVKGFKIGDRVSGEGHITCGHCRNCRGGRTHLCRNTIGVGVNRPGC FAEYLVIPAFNAFKIPDNISDDLASIFDPFGNAVHTALSFDLVGEDVLVSGAGPIGIMAA AVAKHVGARNVVITDVNEYRLELARKMGITRAVNVAKENLNDVMAELGMTEGFDVGLEMS GAPPAFRTMLDTMNHGGRIAMLGIPPSDMSIDWTKVIFKGLFIKGIYGREMFETWYKMAA LIQSGLDLSPIITHRFSIDDFQKGFDAMRSGQSGKVILSWD >gi|296494677|gb|ADTN01000061.1| GENE 12 10229 - 11425 1650 398 aa, chain - ## HITS:1 COG:ECs4495 KEGG:ns NR:ns ## COG: ECs4495 COG0156 # Protein_GI_number: 15833749 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Escherichia coli O157:H7 # 1 398 1 398 398 763 98.0 0 MRGDFYQQLANDLETARAEGLFKEERIITSAQQADITVADGSHVINFCANNYLGLANHPE LIAAAKAGMDSHGFGMASVRFICGTQDSHKELEQKLAAFLGMEDAILYSSCFDANGGLFE TLLGAEDAIISDALNHASIIDGVRLCKAKRYRYANNDMQELEARLKEAREAGARHVLIAT DGVFSMDGVIANLKGVCNLADKYDALVMVDDSHAVGFVGENGRGSHEYCDVMGRVDIITG TLGKALGGASGGYTAARKEVVEWLRQRSRPYLFSNSLAPAIVAASIKVLEMVEAGSELRD RLWANARQFREQMSAAGFTLAGADHAIIPVMLGDAVVAQKFARELQKEGIYVTGFFYPVV PKGQARIRTQMSAAHTSEQITRAVEAFTRIGKQLGVIA >gi|296494677|gb|ADTN01000061.1| GENE 13 11639 - 12619 987 326 aa, chain + ## HITS:1 COG:ECs4497 KEGG:ns NR:ns ## COG: ECs4497 COG0451 # Protein_GI_number: 15833751 # Func_class: M Cell wall/membrane/envelope biogenesis; G 
Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Escherichia coli O157:H7 # 1 309 1 309 310 617 99.0 1e-177 MIIVTGGAGFIGSNIVKALNDKGITDILVVDNLKDGTKFVNLVDLNIADYMDKEDFLIQI MAGGEVGDVEAIFHEGACSSTTEWDGKYMMDNNYQYSKELLHYCLEREIPFLYASSAATY GGRTSDFIESREYEKPLNVYGYSKFLFDEYVRQILPEANSQIVGFRYFNVYGPREGHKGS MASVAFHLNTQLNNGESPKLFEGSENFKRDFVYVGDVADVNLWFLENGVSGIFNLGTGRA ESFQAVADATLAYHKKGQIEYIPFPDKLKGRYQAFTQADLTNLRAAGYDKPFKTVAEGVT EYMAWLNRDINVNSQNHSSGIKAATK >gi|296494677|gb|ADTN01000061.1| GENE 14 12699 - 13745 1068 348 aa, chain + ## HITS:1 COG:rfaF KEGG:ns NR:ns ## COG: rfaF COG0859 # Protein_GI_number: 16131491 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Escherichia coli K12 # 1 348 1 348 348 710 99.0 0 MKILVIGPSWVGDMMMSQSLYRTLQARYPQAIIDVMAPAWCRPLLSRMPEVNEAIPMPLG HGALEIGERRKLGHSLREKRYDRAYVLPNSFKSALVPFFAGIPHRTGWRGEMRYGLLNDV RVLDKEAWPLMVERYVALAYDKGIMRTAQDLPQPLLWPQLQVSEGEKSYTCNQFSLSSER PMIGFCPGAEFGPAKRWPHYHYAELAKQLIDEGYQVVLFGSAKDHEAGNEILAALNTEQQ AWCRNLAGETQLDQAVILIAACKAIVTNDSGLMHVAAALNRPLVALYGPSSPDFTPPLSH KARVIRLITGYHKVRKGDAAEGYHQSLIDITPQRVLEELNSLLLQEEA >gi|296494677|gb|ADTN01000061.1| GENE 15 13749 - 14705 793 318 aa, chain + ## HITS:1 COG:rfaC KEGG:ns NR:ns ## COG: rfaC COG0859 # Protein_GI_number: 16131492 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Escherichia coli K12 # 1 318 1 318 319 607 94.0 1e-174 MRVLIVKTSSMGDVLHTLPALTDAQQAIPGIKFDWVVEEGFAQIPSWHAAVERVIPVAIR RWRKAWFSAPIKAERKAFREALQAENYDAVIDAQGLVKSAALVTRLAHGVKHGMDWQTAR EPLASLFYNRKHHIAKQQHAVERIRELFAKSLGYSKPQTQGDYAIAQHFLTNLPTDAGEY AVFLHATTRDDKHWPEEHWRELIGLLADSGIRIKLPWGAPHEEKRAKRLAEGFAYVEVLP KMSLEGVARVLAGAKFVVSVDTGLSHLTAALDRPNITVYGPTDPGLIGGYGKNQVVCRAP DKDLAHLTAETVFNKINS >gi|296494677|gb|ADTN01000061.1| GENE 16 14744 - 15961 323 405 aa, chain + ## HITS:1 COG:STM3713 KEGG:ns NR:ns ## COG: STM3713 COG3307 # Protein_GI_number: 16766998 # 
Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipid A core - O-antigen ligase and related enzymes # Organism: Salmonella typhimurium LT2 # 1 405 1 404 404 523 65.0 1e-148 MLTASLALRNKEKWKPYWNKALVFLFIATFFLDGITRYKHIISILMIITVIYQVSRAPGT FKVLYKNNLFYSVLALSLILLYATFISPDLKISFKEFSNTVLKGFLAYSLLIPALLKDED NESIGKIVLYSLVTGLGLRCLVELILYIQDYNKGIMPFSTYEHRSISDSMVFLFPALLNL WLIKKTSYKIAFVIFSAVFLFLLLGTLSRGAWLAVFIVTLLWLILNRQWKLLMLTSIVIS VAAVGVFTYKGDHAGKDRLIYKLQQTDSSYRYTNGTQGTAWTLIMENPLKGYGYGDDIYH AIYNKRVVDFPSWKFRQSIGPHNVVLSIWFAAGLAGLLALLYLYGSIIKETANATFKTVV VTPYNGQLLLFLTLVSFYIIRGNFEEVDLKPIGLIVGLLLAMRNK Prediction of potential genes in microbial genomes Time: Sun May 15 23:18:39 2011 Seq name: gi|296494676|gb|ADTN01000062.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont166.4, whole genome shotgun sequence Length of sequence - 47728 bp Number of predicted genes - 44, with homology - 43 Number of transcription units - 27, operones - 11 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 35 - 550 406 ## COG0438 Glycosyltransferase - Prom 579 - 638 4.6 2 2 Op 1 . - CDS 1230 - 2039 289 ## EFER_3913 lipopolysaccharide core biosynthesis protein 3 2 Op 2 . - CDS 2111 - 2809 365 ## EFER_3914 lipopolysaccharide core biosynthesis protein 4 2 Op 3 5/0.250 - CDS 2827 - 3843 76 ## COG1442 Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases - Prom 3870 - 3929 3.1 5 3 Tu 1 4/0.250 - CDS 4007 - 4564 233 ## COG1442 Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases - Prom 4754 - 4813 6.0 6 4 Tu 1 . - CDS 5026 - 6105 400 ## COG0438 Glycosyltransferase - Prom 6131 - 6190 5.7 + Prom 6131 - 6190 6.1 7 5 Tu 1 . + CDS 6232 - 6306 58 ## - Term 6998 - 7049 -0.7 8 6 Op 1 . - CDS 7132 - 7929 491 ## EFER_3921 kinase that phosphorylates core heptose of lipopolysaccharide 9 6 Op 2 5/0.250 - CDS 7922 - 9046 581 ## COG0438 Glycosyltransferase 10 6 Op 3 . 
- CDS 9043 - 10077 472 ## COG0859 ADP-heptose:LPS heptosyltransferase - Prom 10279 - 10338 10.2 + Prom 10258 - 10317 9.4 11 7 Op 1 6/0.167 + CDS 10520 - 11797 1118 ## COG1519 3-deoxy-D-manno-octulosonic-acid transferase 12 7 Op 2 . + CDS 11805 - 12284 388 ## PROTEIN SUPPORTED gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 + Term 12285 - 12334 4.2 13 8 Tu 1 5/0.250 - CDS 12323 - 13132 765 ## COG0266 Formamidopyrimidine-DNA glycosylase - Prom 13155 - 13214 2.2 - Term 13171 - 13212 9.7 14 9 Op 1 16/0.000 - CDS 13230 - 13397 280 ## PROTEIN SUPPORTED gi|15804177|ref|NP_290216.1| 50S ribosomal protein L33 15 9 Op 2 7/0.000 - CDS 13418 - 13654 403 ## PROTEIN SUPPORTED gi|15804178|ref|NP_290217.1| 50S ribosomal protein L28 - Prom 13794 - 13853 3.4 16 9 Op 3 . - CDS 13871 - 14539 473 ## COG2003 DNA repair proteins - Prom 14735 - 14794 2.4 + Prom 14621 - 14680 2.2 17 10 Op 1 5/0.250 + CDS 14711 - 15931 1230 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase 18 10 Op 2 4/0.250 + CDS 15909 - 16367 591 ## COG0756 dUTPase + Term 16438 - 16473 -0.5 19 11 Tu 1 . + CDS 16474 - 17070 741 ## COG1309 Transcriptional regulator + Term 17076 - 17111 7.4 - Term 17062 - 17099 7.8 20 12 Op 1 6/0.167 - CDS 17107 - 17748 810 ## COG0461 Orotate phosphoribosyltransferase - Term 17777 - 17804 0.1 21 12 Op 2 . - CDS 17813 - 18529 906 ## COG0689 RNase PH - Prom 18559 - 18618 4.8 + Prom 18569 - 18628 5.5 22 13 Tu 1 . + CDS 18656 - 19519 1168 ## COG1561 Uncharacterized stress-induced protein + Prom 19622 - 19681 3.9 23 14 Tu 1 . + CDS 19741 - 20565 573 ## JW3620 DNA-damage-inducible protein + Prom 20631 - 20690 4.9 24 15 Tu 1 . + CDS 20857 - 21474 924 ## COG2860 Predicted membrane protein + Term 21494 - 21531 5.3 25 16 Tu 1 . 
- CDS 21471 - 23153 1115 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II) - Prom 23351 - 23410 2.0 + Prom 23311 - 23370 4.6 26 17 Op 1 25/0.000 + CDS 23411 - 24034 532 ## COG0194 Guanylate kinase 27 17 Op 2 18/0.000 + CDS 24089 - 24364 513 ## COG1758 DNA-directed RNA polymerase, subunit K/omega 28 17 Op 3 5/0.250 + CDS 24383 - 26491 2180 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases 29 17 Op 4 4/0.250 + CDS 26498 - 27187 558 ## COG0566 rRNA methylases 30 17 Op 5 . + CDS 27193 - 29274 2052 ## COG1200 RecG-like helicase + Term 29372 - 29409 2.1 - Term 29267 - 29297 -1.0 31 18 Tu 1 . - CDS 29440 - 30645 1455 ## COG0786 Na+/glutamate symporter - Prom 30777 - 30836 7.0 + Prom 30842 - 30901 4.8 32 19 Tu 1 . + CDS 30925 - 32316 1632 ## COG2233 Xanthine/uracil permeases + Prom 32352 - 32411 1.8 33 20 Tu 1 . + CDS 32437 - 34146 1488 ## LF82_3356 uncharacterized protein yich + Term 34347 - 34384 -0.3 34 21 Op 1 3/0.500 - CDS 34199 - 36517 2114 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases 35 21 Op 2 . - CDS 36527 - 37909 1216 ## COG2211 Na+/melibiose symporter and related transporters + TRNA 38202 - 38292 75.2 # SeC(p) TCA 0 0 + Prom 38691 - 38750 9.8 36 22 Op 1 8/0.000 + CDS 38933 - 40117 665 ## COG0477 Permeases of the major facilitator superfamily + Prom 40134 - 40193 2.3 37 22 Op 2 . + CDS 40228 - 41151 613 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 41219 - 41275 5.2 38 23 Tu 1 . - CDS 41155 - 41973 793 ## COG1464 ABC-type metal ion transport system, periplasmic component/surface antigen - Prom 41997 - 42056 5.8 + Prom 41936 - 41995 5.6 39 24 Op 1 . + CDS 42071 - 42190 60 ## ECBD_0043 hypothetical protein 40 24 Op 2 . + CDS 42195 - 42488 267 ## SSON_3614 putative transport protein 41 25 Tu 1 . - CDS 42529 - 43719 1119 ## COG2814 Arabinose efflux permease - Prom 43751 - 43810 3.7 42 26 Op 1 . - CDS 43930 - 44382 362 ## APECO1_2789 hypothetical protein 43 26 Op 2 . 
- CDS 44435 - 45781 1240 ## COG2252 Permeases - Prom 45900 - 45959 5.8 + Prom 45835 - 45894 6.1 44 27 Tu 1 . + CDS 45944 - 47710 1753 ## COG1001 Adenine deaminase Predicted protein(s) >gi|296494676|gb|ADTN01000062.1| GENE 1 35 - 550 406 171 aa, chain - ## HITS:1 COG:STM3714 KEGG:ns NR:ns ## COG: STM3714 COG0438 # Protein_GI_number: 16766999 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Salmonella typhimurium LT2 # 1 170 210 379 381 268 73.0 3e-72 MLMEAFNQLNKIQDNLKLVIVGDPFASKKGEKAEYQKKVLDAAKAIGAQCIMAGGQPPEQ MHNYYRLADLVVVPSQVEEAFCMVAVEAMAAGKPVLASQKGGISEFVLEGITGYHLAEPM TSESILADIKRVLADADRAQIAKNARNFVFSKYSWEHVTQRFEAQIQDWFG >gi|296494676|gb|ADTN01000062.1| GENE 2 1230 - 2039 289 269 aa, chain - ## HITS:1 COG:no KEGG:EFER_3913 NR:ns ## KEGG: EFER_3913 # Name: waaZ # Def: lipopolysaccharide core biosynthesis protein # Organism: E.fergusonii # Pathway: not_defined # 1 269 1 269 269 515 98.0 1e-145 MKNIRYIDKKDVENLIESKTSDDVIIFLSGPTSQKTPLSVLQTRDVIAVNGSAQYLLSHN IIPYIYVLTDVRFLHQRRDDFYKFSQRSRYTIVNVDVYEHASEEDKRYILQNCLVLRSFY RREKGGLIKKIKFNILSRIHKELLISVPFSKKGRLVGFCKDINLGYCSCHTVAFAAIQIA YSLKYARIICSGLDLTGSCSRFYDEDKNPMPSELTRDLFKILPFFRFMRENIEDINIYNL SDDTAIQYDIIPCMKISEIEEPCVYEKIS >gi|296494676|gb|ADTN01000062.1| GENE 3 2111 - 2809 365 232 aa, chain - ## HITS:1 COG:no KEGG:EFER_3914 NR:ns ## KEGG: EFER_3914 # Name: waaY # Def: lipopolysaccharide core biosynthesis protein # Organism: E.fergusonii # Pathway: Lipopolysaccharide biosynthesis [PATH:efe00540]; Metabolic pathways [PATH:efe01100] # 1 232 1 232 232 416 98.0 1e-115 MIQKNKIKDLVVFTDENNSKYLNVLNDFLSYDINIIKVFRSIDDTKVMLIDTDYGKLILK VFSPKVKRNERFFKSLLKGDYYERLFEHTQKVRNEGLHSLNDFYLLAERKTLRFVHTYIM LIEYIDGVELCDIPDIDETLKNKIQQSIRSLHEHGMVSGDPHRGNFIIENGEVRIIDLSG KRASAQRKAKDRIDLERHYGIKNEVKDLGYYLLVYRKKIRNLMRRLKGKPAR >gi|296494676|gb|ADTN01000062.1| GENE 4 2827 - 3843 76 338 aa, chain - ## HITS:1 COG:rfaJ KEGG:ns NR:ns ## COG: rfaJ COG1442 # Protein_GI_number: 16131497 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases # Organism: Escherichia coli K12 # 1 338 1 338 338 602 89.0 1e-172 MNSFPAIEIDKVKAWDFRLIDENTAESLNVAYGVDSNYLDGVGVSITSIVINNRHVNLDF YIIADVYNDDFFQKVEKLAEQYQLRITLYRINTDNLQCLPCTQVWSRAMYFRLFAFQLLG VTLNRLLYLDADVVCKGNISQLLHLEFNGAVAAVVRDVDPMQEKAVVRLSDPELRGQYFN SGVVYLDLKKWTEAKLTEKALSILMSKDSIYKYPDQDVMNVLLKGMTIFLPREFNTIYTI KSELKDKTHQKYKELIKEDTLLIHYTGATKPWHKWAIYPSVKYYKIALERSPWKDDSPRD AKSIIEFKKRYKHLLVQHHYISGLIAGVCYLCRKYYRK >gi|296494676|gb|ADTN01000062.1| GENE 5 4007 - 4564 233 185 aa, chain - ## HITS:1 COG:rfaI KEGG:ns NR:ns ## COG: rfaI COG1442 # Protein_GI_number: 16131498 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases # Organism: Escherichia coli K12 # 1 185 155 339 339 335 85.0 2e-92 MVVTEGQKDWWAKRAHSLGVAGIANGYFNSGFLLINTNQWTNERVSARAIAMLSDPEIVK KITHPDQDVLNMLLADKLVYADIKYNTQFSLNYQLKESFKNPVTNDTVFIHYIGPTKPWH DWAWDYPISQAFMAAKNASPWKDTALLKPVNSNQLRYSAKHMLKKKQYIKGFGNYLLYFI KKLKH >gi|296494676|gb|ADTN01000062.1| GENE 6 5026 - 6105 400 359 aa, chain - ## HITS:1 COG:rfaB KEGG:ns NR:ns ## COG: rfaB COG0438 # Protein_GI_number: 16131499 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Escherichia coli K12 # 1 359 11 369 369 714 92.0 0 MKIAFIGEAVSGFGGMETVIRDVITTFRQQHIQSEIFFFCRNDKMDKGWLEGIKYSCSFS NIRLGFLRRAKHIHALSKWLHDYQPDVVICIDVISCLYAAKARKKSGIDVPVFSWPHFSL DHKKHAEYITCADYHLAISSGIKQQMISRGVPESTINVIFNPVEAKNSVIPAPGEGETAT FIYVGRMKFEGQKRVKDLLDGLSQVQGDWKLHVLGDGSDFEKCQAYGRGLNIDDRIVWYG WQQHPWKLVQQDIKKVSALLLTSSFEGFPMTLLEALSWGIPCISADCVSGPVDIIQPDVN GHLYQPGDMTGFVALLNKYIAGEIHIAHEKIPASIDKFYQPKYYDRLQKIIISVISRRK >gi|296494676|gb|ADTN01000062.1| GENE 7 6232 - 6306 58 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MMELDFFYGSSKENLKNNDQNFVL >gi|296494676|gb|ADTN01000062.1| GENE 8 7132 - 7929 491 265 aa, chain - ## HITS:1 COG:no KEGG:EFER_3921 NR:ns ## KEGG: EFER_3921 # 
Name: waaP # Def: kinase that phosphorylates core heptose of lipopolysaccharide # Organism: E.fergusonii # Pathway: Lipopolysaccharide biosynthesis [PATH:efe00540]; Metabolic pathways [PATH:efe01100] # 1 265 1 265 265 503 93.0 1e-141 MVELEEPLATLWRGKDAFAEVKKLNGEVFRELETRRTLRFELAGKSYFLKWHKGTTLKEI IKNLLSLRMPVLGADREWHAIHRLHDVGVDTMHGIGFGEKGLNPLTRTSFIITEDLTPTI SLEDYCADWAVNPPDVHIKRMLIARVATMVRKMHAAGINHRDCYICHFLLHLPFTGREDE LKISVIDLHRAQIRAKVPRRWRDKDLIGLYFSSMNIGLTQRDIWRFMKVYFGMPLRDIYR LEIDLLKKARIKAGKIEARTIRKNL >gi|296494676|gb|ADTN01000062.1| GENE 9 7922 - 9046 581 374 aa, chain - ## HITS:1 COG:rfaG KEGG:ns NR:ns ## COG: rfaG COG0438 # Protein_GI_number: 16131502 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Escherichia coli K12 # 1 374 1 374 374 747 95.0 0 MIVAFCLYKYFPFGGLQRDFMRIASTVAARGHHVRVYTQSWEGDCPEAFELIRVPVKSHT NHGRNAEYYAWVQNHLKAHPADRVVGFNKMPGLDVYFAADVCYAEKVAQEKGFFYRLTSR YRHYAAFERATFEHGKSTKLMMLTDKQIADFQKHYQTEPERFQILPPGIYPDRKYNAQIP NSREIYRQKNGITEQQNLLLQVGSDFGRKGVDRSIEALASLPESLRHNTLLFVVGQDKPR KFEVLAEKLGVRSKVHFFSGRNDVSELMAAADLLMHPAYQEAAGIVLLEAIAAGLPVLTT AVCGYAHYITDANCGTVIAEPFSQEQLNDVLRKALTQSPLRMAWAENARYYADTQDLYSL PEKAADIITGGLDG >gi|296494676|gb|ADTN01000062.1| GENE 10 9043 - 10077 472 344 aa, chain - ## HITS:1 COG:rfaQ KEGG:ns NR:ns ## COG: rfaQ COG0859 # Protein_GI_number: 16131503 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Escherichia coli K12 # 1 344 1 344 344 715 97.0 0 MRYHGDMLLTTPVISTLKQNYPDAKIDMLLYQDTIPILSENPEINALYGISNKGTGTFDK IKNALSLIKTLRANNYDLVINLTDQWMVALLVRCLPARMKISQLYGHRQHGIWKKSFTHL APIHGTHIVERNLSVLEPLGITDFYTETTMSYAEDCWKKMRQELDALGVKDHYVVIQPTA RQIFKCWDNDKFSMVIDALQHRGYQVVLTCGPSADDLACVDEIARGCQTKPITGLAGKTR FPELGALIDHAVLFIGVDSAPGHIAAAVKTPVISLFGATDHVFWRPWTENIIQFWAGNYQ KMPTRHELDRNKKYLSVIPAEDVIAATEKLLPEDAPSADRNAQL >gi|296494676|gb|ADTN01000062.1| GENE 11 10520 - 11797 1118 425 aa, chain + ## HITS:1 COG:ECs4508 KEGG:ns NR:ns ## COG: ECs4508 COG1519 
# Protein_GI_number: 15833762 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic-acid transferase # Organism: Escherichia coli O157:H7 # 1 425 1 425 425 830 99.0 0 MLELLYTALLYLIQPLIWIRLWVRGRKAPAYRKRWGERYGFYRHPLKPGGIMLHSVSVGE TLAAIPLVRALRHRYPDLPITVTTMTPTGSERVQSAFGKDVQHVYLPYDLPDALNRFLNK VDPKLVLIMETELWPNLIAALHKRKIPLVIANARLSARSAAGYAKLGKFVRRLLRRITLI AAQNEEDGARFVALGAKNNQVTVTGSLKFDISVTPQLAAKAVTLRRQWAPHRPVWIATST HEGEESVVIAAHQALLQQFPNLLLILVPRHPERFPDAINLVRQAGLSYITRSSGEVPSTS TQVVVGDTMGELMLLYGIADLAFVGGSLVERGGHNPLEAAAHAIPVLMGPHTFNFKDICA RLEQASGLITVTDATTLVKEVSSLLTDADYRSFYGRHAVEVLYQNQGALQRLLQLLEPYL PPKTH >gi|296494676|gb|ADTN01000062.1| GENE 12 11805 - 12284 388 159 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 [Bacillus selenitireducens MLS10] # 2 159 3 160 164 154 46 1e-36 MQKRAIYPGTFDPITNGHIDIVTRATQMFDHVILAIAASPSKKPMFTLEERVELAQQATA HLGNVEVVGFSDLMANFARNQHATVLIRGLRAVADFEYEMQLAHMNRHLMPELESVFLMP SKEWSFISSSLVKEVARHQGDVTHFLPENVHQALMAKLA >gi|296494676|gb|ADTN01000062.1| GENE 13 12323 - 13132 765 269 aa, chain - ## HITS:1 COG:ECs4510 KEGG:ns NR:ns ## COG: ECs4510 COG0266 # Protein_GI_number: 15833764 # Func_class: L Replication, recombination and repair # Function: Formamidopyrimidine-DNA glycosylase # Organism: Escherichia coli O157:H7 # 1 269 1 269 269 555 99.0 1e-158 MPELPEVETSRRGIEPHLVGATILHAVVRNGRLRWPVSEEIYRLSDQPVLSVQRRAKYLL LELPEGWIIIHLGMSGSLRILPEELPPEKHDHVDLVMNNGKVLRYTDPRRFGAWLWTKEL EGHNVLAHLGPEPLSDDFNGEYLHQKCAKKKTAIKPWLMDNKLVVGVGNIYASESLFAAG IHPDRLASSLSLAECELLARVIKAVLLRSIEQGGTTLKDFLQSDGKPGYFAQELQVYGRK GEPCRVCGTPIVATKHAQRATFYCRQCQK >gi|296494676|gb|ADTN01000062.1| GENE 14 13230 - 13397 280 55 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15804177|ref|NP_290216.1| 50S ribosomal protein L33 [Escherichia coli O157:H7 EDL933] # 1 55 1 55 55 112 100 4e-24 MAKGIREKIKLVSSAGTGHFYTTTKNKRTKPEKLELKKFDPVVRQHVIYKEAKIK >gi|296494676|gb|ADTN01000062.1| GENE 15 13418 - 13654 403 78 aa, chain - ## PROTEIN 
SUPPORTED ## NR: gi|15804178|ref|NP_290217.1| 50S ribosomal protein L28 [Escherichia coli O157:H7 EDL933] # 1 78 1 78 78 159 100 2e-38 MSRVCQVTGKRPVTGNNRSHALNATKRRFLPNLHSHRFWVESEKRFVTLRVSAKGMRVID KKGIDTVLAELRARGEKY >gi|296494676|gb|ADTN01000062.1| GENE 16 13871 - 14539 473 222 aa, chain - ## HITS:1 COG:ECs4513 KEGG:ns NR:ns ## COG: ECs4513 COG2003 # Protein_GI_number: 15833767 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Escherichia coli O157:H7 # 1 222 3 224 224 431 99.0 1e-121 MKNNSQLLMPREKMLKFGISALTDVELLALFLRTGTRGKDVLTLAKEMLENFGSLYGLLT SEYEQFSGVHGIGVAKFAQLKGIAELARRYYNVRMREESPLLSPEMTREFLQSQLTGEER EIFMVIFLDSQHRVITHSRLFSGTLNHVEVHPREIIREAIKINASALILAHNHPSGCAEP SKADKLITERIIKSCQFMDLRVLDHIVIGRGEYVSFAERGWI >gi|296494676|gb|ADTN01000062.1| GENE 17 14711 - 15931 1230 406 aa, chain + ## HITS:1 COG:ECs4514 KEGG:ns NR:ns ## COG: ECs4514 COG0452 # Protein_GI_number: 15833768 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Escherichia coli O157:H7 # 1 406 25 430 430 765 100.0 0 MSLAGKKIVLGVSGGIAAYKTPELVRRLRDRGADVRVAMTEAAKAFITPLSLQAVSGYPV SDSLLDPAAEAAMGHIELGKWADLVILAPATADLIARVAAGMANDLVSTICLATPAPVAV LPAMNQQMYRAAATQHNLEVLASRGLLIWGPDSGSQACGDIGPGRMLDPLTIVDMAVAHF SPVNDLKHLNIMITAGPTREPLDPVRYISNHSSGKMGFAIAAAAARRGANVTLVSGPVSL PTPPFVKRVDVMTALEMEAAVNASVQQQNIFIGCAAVADYRAATVAPEKIKKQATQGDEL TIKMVKNPDIVAGVAALKDHRPYVVGFAAETNNVEEYARQKRIRKNLDLICANDVSQPTQ GFNSDNNALHLFWQDGDKVLPLERKELLGQLLLDEIVTRYDEKNRR >gi|296494676|gb|ADTN01000062.1| GENE 18 15909 - 16367 591 152 aa, chain + ## HITS:1 COG:ECs4515 KEGG:ns NR:ns ## COG: ECs4515 COG0756 # Protein_GI_number: 15833769 # Func_class: F Nucleotide transport and metabolism # Function: dUTPase # Organism: Escherichia coli O157:H7 # 2 152 1 151 151 298 100.0 2e-81 MMKKIDVKILDPRVGKEFPLPTYATSGSAGLDLRACLDDAVELAPGDTTLVPTGLAIHIA DPSLAAMMLPRSGLGHKHGIVLGNLVGLIDSDYQGQLMISVWNRGQDSFTIQPGERIAQM IFVPVVQAEFNLVEDFDATDRGEGGFGHSGRQ 
>gi|296494676|gb|ADTN01000062.1| GENE 19 16474 - 17070 741 198 aa, chain + ## HITS:1 COG:ECs4516 KEGG:ns NR:ns ## COG: ECs4516 COG1309 # Protein_GI_number: 15833770 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 198 15 212 212 340 100.0 7e-94 MAEKQTAKRNRREEILQSLALMLESSDGSQRITTAKLAASVGVSEAALYRHFPSKTRMFD SLIEFIEDSLITRINLILKDEKDTTARLRLIVLLLLGFGERNPGLTRILTGHALMFEQDR LQGRINQLFERIEAQLRQVLREKRMREGEGYTTDETLLASQILAFCEGMLSRFVRSEFKY RPTDDFDARWPLIAAQLQ >gi|296494676|gb|ADTN01000062.1| GENE 20 17107 - 17748 810 213 aa, chain - ## HITS:1 COG:pyrE KEGG:ns NR:ns ## COG: pyrE COG0461 # Protein_GI_number: 16131513 # Func_class: F Nucleotide transport and metabolism # Function: Orotate phosphoribosyltransferase # Organism: Escherichia coli K12 # 1 213 1 213 213 421 100.0 1e-118 MKPYQRQFIEFALSKQVLKFGEFTLKSGRKSPYFFNAGLFNTGRDLALLGRFYAEALVDS GIEFDLLFGPAYKGIPIATTTAVALAEHHDLDLPYCFNRKEAKDHGEGGNLVGSALQGRV MLVDDVITAGTAIRESMEIIQANGATLAGVLISLDRQERGRGEISAIQEVERDYNCKVIS IITLKDLIAYLEEKPEMAEHLAAVKAYREEFGV >gi|296494676|gb|ADTN01000062.1| GENE 21 17813 - 18529 906 238 aa, chain - ## HITS:1 COG:ECs4518 KEGG:ns NR:ns ## COG: ECs4518 COG0689 # Protein_GI_number: 15833772 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNase PH # Organism: Escherichia coli O157:H7 # 1 238 1 238 238 416 100.0 1e-116 MRPAGRSNNQVRPVTLTRNYTKHAEGSVLVEFGDTKVLCTASIEEGVPRFLKGQGQGWIT AEYGMLPRSTHTRNAREAAKGKQGGRTMEIQRLIARALRAAVDLKALGEFTITLDCDVLQ ADGGTRTASITGACVALADALQKLVENGKLKTNPMKGMVAAVSVGIVNGEAVCDLEYVED SAAETDMNVVMTEDGRIIEVQGTAEGEPFTHEELLTLLALARGGIESIVATQKAALAN >gi|296494676|gb|ADTN01000062.1| GENE 22 18656 - 19519 1168 287 aa, chain + ## HITS:1 COG:yicC KEGG:ns NR:ns ## COG: yicC COG1561 # Protein_GI_number: 16131515 # Func_class: S Function unknown # Function: Uncharacterized stress-induced protein # Organism: Escherichia coli K12 # 1 287 1 287 287 456 100.0 1e-128 MIRSMTAYARREIKGEWGSATWEMRSVNQRYLETYFRLPEQFRSLEPVVRERIRSRLTRG 
KVECTLRYEPDVSAQGELILNEKLAKQLVTAANWVKMQSDEGEINPVDILRWPGVMAAQE QDLDAIAAEILAALDGTLDDFIVARETEGQALKALIEQRLEGVTAEVVKVRSHMPEILQW QRERLVAKLEDAQVQLENNRLEQELVLLAQRIDVAEELDRLEAHVKETYNILKKKEAVGR RLDFMMQEFNRESNTLASKSINAEVTNSAIELKVLIEQMREQIQNIE >gi|296494676|gb|ADTN01000062.1| GENE 23 19741 - 20565 573 274 aa, chain + ## HITS:1 COG:no KEGG:JW3620 NR:ns ## KEGG: JW3620 # Name: dinD # Def: DNA-damage-inducible protein # Organism: E.coli_J # Pathway: not_defined # 1 274 1 274 274 503 99.0 1e-141 MNEHHQPFEEIKLINANGAEQWSARQLGKLLGYSEYRHFIPVLTRAKEACENSGHTIDDH FEEILDMVKIGSNAKRALKDIVLSRYACYLVVQNGDPAKPVIAAGQTYFAIQTRRQELAD DEAFKQLREDEKRLFLRNELKEHNKQLVEAAQQAGVATATDFAIFQNHGYQGLYGGLDQK AIHQRKGLKKNQKILDHMGSTELAANLFRATQTEEKLKRDGVNSKQQANTTHFDVGSKVM QTIQELGGTMPEELPTPQVSIKQLENSVKITEKK >gi|296494676|gb|ADTN01000062.1| GENE 24 20857 - 21474 924 205 aa, chain + ## HITS:1 COG:yicG KEGG:ns NR:ns ## COG: yicG COG2860 # Protein_GI_number: 16131517 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 205 19 223 223 326 100.0 2e-89 MLLHILYLVGITAEAMTGALAAGRRRMDTFGVIIIATATAIGGGSVRDILLGHYPLGWVK HPEYVIIVATAAVLTTIVAPVMPYLRKVFLVLDALGLVVFSIIGAQVALDMGHGPIIAVV AAVTTGVFGGVLRDMFCKRIPLVFQKELYAGVSFASAVLYIALQHYVSNHDVVIISTLVF GFFARLLALRLKLGLPVFYYSHEGH >gi|296494676|gb|ADTN01000062.1| GENE 25 21471 - 23153 1115 560 aa, chain - ## HITS:1 COG:yicF KEGG:ns NR:ns ## COG: yicF COG0272 # Protein_GI_number: 16131518 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Escherichia coli K12 # 1 560 3 562 562 1056 99.0 0 MKVWMAILISILCWQSSVWAVCPAWSPARAQEEISRLQQQIKQWDDDYWKEGKSEVEDGV YDQLSARLTQWQRCFGSEPRDVMMPPLNGAVMHPVAHTGVRKMVDKNALSLWMRERSDLW VQPKVDGVAVTLVYRDGKLNKAISRGNGLKGEDWTQKVSLISAVPQTVSGPLANSTLQGE IFLQREGHIQQQMGGINARAKVAGLMMRQDDSDTLNSLGVFVWAWPDGPQLMSDRLKELA TAGFTLTQTYTRAVKNADEVARVRNEWWKAELPFVTDGVVVRAAKEPESRHWLPGQAEWL VAWKYQPVAQVAEVKAIQFAVGKSGKISVVASLAPVMLDDKKVQRVNIGSVRRWQEWDIA 
PGDQILVSLAGQGIPRIDDVVWRGAERTKPTPPENRFNSLTCYFASDVCQEQFISRLVWL GSKQVLGLDGIGEAGWRALHQTHRFEHIFSWLLLTPEQLQNTPGIAKSKSAQLWHQFNLA RKQPFTRWVMAMGIPLTRAALNASDERSWSQLLFSTEQFWQQLPGTGSGRARQVIEWKEN AQIKKLGSWLAAQQITGFEP >gi|296494676|gb|ADTN01000062.1| GENE 26 23411 - 24034 532 207 aa, chain + ## HITS:1 COG:gmk KEGG:ns NR:ns ## COG: gmk COG0194 # Protein_GI_number: 16131519 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Escherichia coli K12 # 1 207 1 207 207 398 100.0 1e-111 MAQGTLYIVSAPSGAGKSSLIQALLKTQPLYDTQVSVSHTTRQPRPGEVHGEHYFFVNHD EFKEMISRDAFLEHAEVFGNYYGTSREAIEQVLATGVDVFLDIDWQGAQQIRQKMPHARS IFILPPSKIELDRRLRGRGQDSEEVIAKRMAQAVAEMSHYAEYDYLIVNDDFDTALTDLK TIIRAERLRMSRQKQRHDALISKLLAD >gi|296494676|gb|ADTN01000062.1| GENE 27 24089 - 24364 513 91 aa, chain + ## HITS:1 COG:ECs4524 KEGG:ns NR:ns ## COG: ECs4524 COG1758 # Protein_GI_number: 15833778 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, subunit K/omega # Organism: Escherichia coli O157:H7 # 1 91 1 91 91 112 100.0 2e-25 MARVTVQDAVEKIGNRFDLVLVAARRARQMQVGGKDPLVPEENDKTTVIALREIEEGLIN NQILDVRERQEQQEQEAAELQAVTAIAEGRR >gi|296494676|gb|ADTN01000062.1| GENE 28 24383 - 26491 2180 702 aa, chain + ## HITS:1 COG:ECs4525 KEGG:ns NR:ns ## COG: ECs4525 COG0317 # Protein_GI_number: 15833779 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Escherichia coli O157:H7 # 1 702 1 702 702 1414 100.0 0 MYLFESLNQLIQTYLPEDQIKRLRQAYLVARDAHEGQTRSSGEPYITHPVAVACILAEMK LDYETLMAALLHDVIEDTPATYQDMEQLFGKSVAELVEGVSKLDKLKFRDKKEAQAENFR KMIMAMVQDIRVILIKLADRTHNMRTLGSLRPDKRRRIARETLEIYSPLAHRLGIHHIKT ELEELGFEALYPNRYRVIKEVVKAARGNRKEMIQKILSEIEGRLQEAGIPCRVSGREKHL YSIYCKMVLKEQRFHSIMDIYAFRVIVNDSDTCYRVLGQMHSLYKPRPGRVKDYIAIPKA NGYQSLHTSMIGPHGVPVEVQIRTEDMDQMAEMGVAAHWAYKEHGETSTTAQIRAQRWMQ SLLELQQSAGSSFEFIESVKSDLFPDEIYVFTPEGRIVELPAGATPVDFAYAVHTDIGHA CVGARVDRQPYPLSQPLTSGQTVEIITAPGARPNAAWLNFVVSSKARAKIRQLLKNLKRD 
DSVSLGRRLLNHALGGSRKLNEIPQENIQRELDRMKLATLDDLLAEIGLGNAMSVVVAKN LQHGDASIPPATQSHGHLPIKGADGVLITFAKCCRPIPGDPIIAHVSPGKGLVIHHESCR NIRGYQKEPEKFMAVEWDKETAQEFITEIKVEMFNHQGALANLTAAINTTTSNIQSLNTE EKDGRVYSAFIRLTARDRVHLANIMRKIRVMPDVIKVTRNRN >gi|296494676|gb|ADTN01000062.1| GENE 29 26498 - 27187 558 229 aa, chain + ## HITS:1 COG:ECs4526 KEGG:ns NR:ns ## COG: ECs4526 COG0566 # Protein_GI_number: 15833780 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Escherichia coli O157:H7 # 1 229 1 229 229 447 99.0 1e-125 MNPTRYARICEMLARRQPDLTVCMEQVHKPHNVSAIIRTADAVGVHEVHAVWPGSRMRTM ASAAAGSNSWVQVKTHRTIGDAVAHLKGQGMQILATHLSDNAVDFREIDYTRPTCILMGQ EKTGITQEALALADQDIIIPMIGMVQSLNVSVASALILYEAQRQRQNAGMYLRENSMLPE AEQQRLLFEGGYPVLAKVAKRKGLPYPHVNQQGEIEADADWWSTMQAAG >gi|296494676|gb|ADTN01000062.1| GENE 30 27193 - 29274 2052 693 aa, chain + ## HITS:1 COG:ZrecG KEGG:ns NR:ns ## COG: ZrecG COG1200 # Protein_GI_number: 15804193 # Func_class: L Replication, recombination and repair; K Transcription # Function: RecG-like helicase # Organism: Escherichia coli O157:H7 EDL933 # 1 693 12 704 704 1309 99.0 0 MKGRLLDTVPLSSLTGVGAALSNKLAKINLHTVQDLLLHLPLRYEDRTHLYPIGELLPGV YATVEGEVLNCNISFGGRRMMTCQISDGSGILTMRFFNFSAAMKNSLATGRRVLAYGEAK RGKYGAEMIHPEYRVQGDLSTPELQETLTPVYPTTEGVKQATLRKLTDQALDLLDTCAIE ELLPPELSQGMMTLPEALRTLHRPPPTLQLSDLETGQHPAQRRLILEELLAHNLSMLALR AGAQRFHAQPLSANDTLKNKLLAALPFKPTGAQARVVAEIEHDMALDVPMMRLVQGDVGS GKTLVAALAALRAIAHGKQVALMAPTELLAEQHANNFRNWFAPLGIKVGWLAGKQKGKAR LSQQEAIASGQVQMIVGTHAIFQEQVQFNGLALVIIDEQHRFGVHQRLALWEKGQQQGFH PHQLIMTATPIPRTLAMTAYADLDTSVIDELPPGRTPVTTVAIPDTRRTDIIDRVRHACI TEGRQAYWVCTLIEESELLEAQAAEATWEELKLALPELNVGLVHGRMKPAEKQAVMASFK QGELHLLVATTVIEVGVDVPNASLMIIENPERLGLAQLHQLRGRVGRGAVASHCVLLYKT PLSKTAQIRLQVLRDSNDGFVIAQKDLEIRGPGELLGTRQTGNAEFKVADLLRDQAMIPE VQRLARHIHERYPQQAKALIERWMPETERYSNA >gi|296494676|gb|ADTN01000062.1| GENE 31 29440 - 30645 1455 401 aa, chain - ## HITS:1 COG:gltS KEGG:ns NR:ns ## COG: gltS COG0786 # Protein_GI_number: 16131524 # 
Func_class: E Amino acid transport and metabolism # Function: Na+/glutamate symporter # Organism: Escherichia coli K12 # 1 401 1 401 401 619 100.0 1e-177 MFHLDTLATLVAATLTLLLGRKLVHSVSFLKKYTIPEPVAGGLLVALALLVLKKSMGWEV NFDMSLRDPLMLAFFATIGLNANIASLRAGGRVVGIFLIVVVGLLVMQNAIGIGMASLLG LDPLMGLLAGSITLSGGHGTGAAWSKLFIERYGFTNATEVAMACATFGLVLGGLIGGPVA RYLVKHSTTPNGIPDDQEVPTAFEKPDVGRMITSLVLIETIALIAICLTVGKIVAQLLAG TAFELPTFVCVLFVGVILSNGLSIMGFYRVFERAVSVLGNVSLSLFLAMALMGLKLWELA SLALPMLAILVVQTIFMALYAIFVTWRMMGKNYDAAVLAAGHCGFGLGATPTAIANMQAI TERFGPSHMAFLVVPMVGAFFIDIVNALVIKLYLMLPIFAG >gi|296494676|gb|ADTN01000062.1| GENE 32 30925 - 32316 1632 463 aa, chain + ## HITS:1 COG:ECs4530 KEGG:ns NR:ns ## COG: ECs4530 COG2233 # Protein_GI_number: 15833784 # Func_class: F Nucleotide transport and metabolism # Function: Xanthine/uracil permeases # Organism: Escherichia coli O157:H7 # 1 463 1 463 463 785 100.0 0 MSVSTLESENAQPVAQTQNSELIYRLEDRPPLPQTLFAACQHLLAMFVAVITPALLICQA LGLPAQDTQHIISMSLFASGVASIIQIKAWGPVGSGLLSIQGTSFNFVAPLIMGGTALKT GGADVPTMMAALFGTLMLASCTEMVISRVLHLARRIITPLVSGVVVMIIGLSLIQVGLTS IGGGYAAMSDNTFGAPKNLLLAGVVLALIILLNRQRNPYLRVASLVIAMAAGYALAWFMG MLPESNEPMTQELIMVPTPLYYGLGIEWSLLLPLMLVFMITSLETIGDITATSDVSEQPV SGPLYMKRLKGGVLANGLNSFVSAVFNTFPNSCFGQNNGVIQLTGVASRYVGFVVALMLI VLGLFPAVSGFVQHIPEPVLGGATLVMFGTIAASGVRIVSREPLNRRAILIIALSLAVGL GVSQQPLILQFAPEWLKNLLSSGIAAGGITAIVLNLIFPPEKQ >gi|296494676|gb|ADTN01000062.1| GENE 33 32437 - 34146 1488 569 aa, chain + ## HITS:1 COG:no KEGG:LF82_3356 NR:ns ## KEGG: LF82_3356 # Name: yicH # Def: uncharacterized protein yich # Organism: E.coli_LF82 # Pathway: not_defined # 1 569 1 569 569 1090 99.0 0 MKFIGKLLLYILIALLVVIAGLYFLLQTRWGAEHISAWVSENSDYHLAFGAMDHRFSAPS HIVLENVTFGRDGQPATLVAKSVDIALSSRQLTEPRHVDTILLENGTLNLTDQTAPLPFK ADRLQLRDMAFNSPNSEWKLSAQRVNGGVVPWSPEAGKVLGTKAQIQFSAGSLSLNDVPA TNVLIEGSIDNDRVTLTNLGADIARGTLTGNAQRNADGSWQVENLRMADIRLQSEKSLTD FFAPLRSVPSLQIGRLEVIDARLQGPDWAVTDLDLSLRNMTFSKDDWQTQEGKLSMNASE FIYGSLHLFDPIINAEFSPQGVALRQFTSRWEGGMVRTSGNWLRDGKTLILDDAAIAGLE 
YTLPKNWQQLWMETTPGWLNSLQLKRFSASRNLIIDIDPDFPWQLTTLDGYGANLTLVTD HKWGVWSGSANLNAAAATFNRVDVRRPSLALTANSSTVNISELSAFTEKGILEATASVSQ TPQRQTHISLNGRGVPVNILQQWGWPELPLTGDGNIQLTASGDIQANAPLKPTVSGQLHA VNAAKQQVTQTMNAGVVSSGEVTSTEPVQ >gi|296494676|gb|ADTN01000062.1| GENE 34 34199 - 36517 2114 772 aa, chain - ## HITS:1 COG:yicI KEGG:ns NR:ns ## COG: yicI COG1501 # Protein_GI_number: 16131527 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Escherichia coli K12 # 1 772 1 772 772 1625 98.0 0 MKISDGNWLIQPGLNLIHPLQVFEVEQQGNEMVVYAAPRDVRERTWQLDTPLFTLRFFSP QEGIVGVRIEHFQGALNNGPHYPLNILQDVKVTIENTERYAEFKSGNLSARVSKGEFWSL DFLRNGERITGSQVKNNGYVQDTNNQRNYMFERLDLGVGETVYGLGERFTALVRNGQTVE TWNRDGGTSTEQAYKNIPFYMTNRGYGVLVNHPQCVSFEVGSEKVSKVQFSVESEYLEYF VIDGPTPKAVLDRYTRFTGRPALPPAWSFGLWLTTSFTTNYDEATVNSFIDGMAERNLPL HVFHFDCFWMKAFQWCDFEWDPLTFPDPEGMIRRLKAKGLKICVWINPYIGQKSPVFKEL QEKGYLLKRPDGSLWQWDKWQPGLAIYDFTNPDACTWYADKLKGLVAMGVDCFKTDFGER IPTDVQWFDGSDPQKMHNHYAYIYNELVWNVLKDTVGEEEAVLFARSASVGAQKFPVHWG GDCYANYESMAESLRGGLSIGLSGFGFWSHDIGGFENTAPAHVYKRWCAFGLLSSHSRLH GSKSYRVPWAYDDESCDVVRFFTQLKCRMMPYLYREAARANTRGTPMMRAMMMEFPDDPA CDYLDRQYMLGDNVMVAPVFTEAGDVQFYLPEGRWTHLWHNDELDGSRWHKQQHSFLSLP VYVRDNTLLALGNNDQRPDYAWHEGTAFHLFNQQDGHEAVCEVPAADGSVIFTLKATRTG NTITVIGAGEARNWTLCLRNVVKVNGLQGGSQAESEQGLVVKPQGNALTITL >gi|296494676|gb|ADTN01000062.1| GENE 35 36527 - 37909 1216 460 aa, chain - ## HITS:1 COG:yicJ KEGG:ns NR:ns ## COG: yicJ COG2211 # Protein_GI_number: 16131528 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli K12 # 1 460 20 479 479 875 99.0 0 MKSEVLSVKEKIGYGMGDAASHIIFDNVMLYMMFFYTDIFGIPAGFVGTMFLVARALDAI SDPCMGLLADRTRSRWGKFRPWVLFGALPFGIVCVLAYSTPDLSMNGKMIYAAITYTLLT LLYTVVNIPYCALGGVITNDPTQRISLQSWRFVLATAGGMLSTVLMMPLVNLIGGDNKPL GFQGGIAVLSVVAFMMLAFCFFTTKERVEAPPTTTSMREDLRDIWQNDQWRIVGLLTIFN ILAVCVRGGAMMYYVTWILGTPEVFVAFLTIYCVGNLIGSALAKPLTDWKCKVTIFWWTN 
ALLAVISLAMFFVPMQASITMFVFIFVIGVLHQLVTPIQWVMMSDTVDYGEWCNGKRLTG ISFAGTLFVLKLGLAFGGALIGWMLAYGGYDAAEKAQNSATISIIIALFTIVPAICYLLS AIIAKRYYSLTTHNLKTVMEQLAQGKRRCQQQFTSQEVQN >gi|296494676|gb|ADTN01000062.1| GENE 36 38933 - 40117 665 394 aa, chain + ## HITS:1 COG:setC KEGG:ns NR:ns ## COG: setC COG0477 # Protein_GI_number: 16131529 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 394 1 394 394 711 99.0 0 MQKTATTPSKILDLTAAAFLLVAFLTGIAGALQTPTLSIFLADELKARPIMVGFFFTGSA IMGILVSQFLARHSDKQGDRKLLILLCCLFGVLACTLFAWNRNYFILLSTGVLLSSFAST ANPQMFALAREHADRTGRETVMFSTFLRAQISLAWVIGPPLAYELAMGFSFKVMYLTAAI AFVVCGLIVWLFLPSIQRNIPVVTQPVEILPSTHRKRDTRLLFVVCSMMWAANNLYMINM PLFIIDELHLTDKLAGEMIGIAAGLEIPMMLIAGYYMKRIGKRLLMLIAIVSGMCFYASV LMATTPAVELELQILNAIFLGILCGIGMLYFQDLMPEKIGSATTLYANTSRVGWIIAGSV DGIMVEIWSYHALFWLAIGMLGIAMICLLFIKDI >gi|296494676|gb|ADTN01000062.1| GENE 37 40228 - 41151 613 307 aa, chain + ## HITS:1 COG:yicL KEGG:ns NR:ns ## COG: yicL COG0697 # Protein_GI_number: 16131530 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli K12 # 1 307 1 307 307 499 100.0 1e-141 MGSTRKGMLNVLIAAVLWGSSGVCAQYIMEQSQMSSQFLTMTRLIFAGLILLTLSFVHGD KIFSIINNHKDAISLLIFSVVGALTVQLTFLLTIEKSNAATATVLQFLSPTIIVAWFSLV RKSRPGILVFCAILTSLVGTFLLVTHGNPTSLSISPAALFWGIASAFAAAFYTTYPSTLI ARYGTLPVVGWSMLIGGLILLPFYARQGTNFVVNGSLILAFFYLVVIGTSLTFSLYLKGA QLIGGPKASILSCAEPLSSALLSLLLLGITFTLPDWLGTLLILSSVILISMDSRRRARKI NRPARHK >gi|296494676|gb|ADTN01000062.1| GENE 38 41155 - 41973 793 272 aa, chain - ## HITS:1 COG:nlpA KEGG:ns NR:ns ## COG: nlpA COG1464 # Protein_GI_number: 16131531 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal 
ion transport system, periplasmic component/surface antigen # Organism: Escherichia coli K12 # 1 272 1 272 272 503 100.0 1e-142 MKLTTHHLRTGAALLLAGILLAGCDQSSSDAKHIKVGVINGAEQDVAEVAKKVAKEKYGL DVELVGFSGSLLPNDATNHGELDANVFQHRPFLEQDNQAHGYKLVAVGNTFVFPMAGYSK KIKTVAQIKEGATVAIPNDPTNLGRALLLLQKEKLITLKEGKGLLPTALDITDNPRHLQI MELEGAQLPRVLDDPKVDVAIISTTYIQQTGLSPVHDSVFIEDKNSPYVNILVAREDNKN AENVKEFLQSYQSPEVAKAAETIFNGGAVPGW >gi|296494676|gb|ADTN01000062.1| GENE 39 42071 - 42190 60 39 aa, chain + ## HITS:1 COG:no KEGG:ECBD_0043 NR:ns ## KEGG: ECBD_0043 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_BL21_DE3 # Pathway: not_defined # 1 39 1 39 39 70 100.0 2e-11 MEYKVWHFLLTTQARFVQHDESDESKLHLCFIRYTFVKG >gi|296494676|gb|ADTN01000062.1| GENE 40 42195 - 42488 267 97 aa, chain + ## HITS:1 COG:no KEGG:SSON_3614 NR:ns ## KEGG: SSON_3614 # Name: not_defined # Def: putative transport protein # Organism: S.sonnei # Pathway: not_defined # 1 97 1 97 97 187 100.0 9e-47 MKPTTLLLIFTFFAMPGIVYAESPFSSLQSAKEKTTVLQDLRKICTPQASLSDEAWEKLM LSDENNKQHIREAIVAMERNNQSNYWEALGKVECPDM >gi|296494676|gb|ADTN01000062.1| GENE 41 42529 - 43719 1119 396 aa, chain - ## HITS:1 COG:yicM KEGG:ns NR:ns ## COG: yicM COG2814 # Protein_GI_number: 16131532 # Func_class: G Carbohydrate transport and metabolism # Function: Arabinose efflux permease # Organism: Escherichia coli K12 # 1 396 56 451 451 667 99.0 0 MSEFIAENRGADAITRPNWSAVFSVAFCVACLIIVEFLPVSLLTPMAQDLGISEGVAGQS VTVTAFVAMFASLFIPQTIQATDRRYVVILFAVLLTLSCLLVSFANSFSLLLIGRACLGL ALGGFWAMSASLTMRLVPPRTVPKALSVIFGAVSIALVIAAPLGSFLGELIGWRNVFNAA AVMGVLCIFWIIKSLPSLPGEPSHQKQNTFRLLQRPGVMAGMIAIFMSFAGQFAFFTYIR PVYMNLAGFGVDGLTLVLLSFGIASFIGTSLSSFILKRSVKLALAGAPLILAVSALVLTL WGSDKIVATGVAIIWGLTFALVPVGWSTWITRSLADQAEKAGSIQVAVIQLANTCGAAIG GYALDNIGLTSPLMLSGTLMLLTALLVTAKVKMKKS >gi|296494676|gb|ADTN01000062.1| GENE 42 43930 - 44382 362 150 aa, chain - ## HITS:1 COG:no KEGG:APECO1_2789 NR:ns ## KEGG: APECO1_2789 # Name: yicN # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: 
not_defined # 1 150 10 159 159 290 100.0 9e-78 MIWIMLATLAVVFVVGFRVLTSGARKAIRRLSDRLNIDVVPVESMVDQMGKSAGDEFLRY LHRPDESHLQNAAQVLLIWQIVIVDGSEQNLLQWHRILQKARLAAPITDAQVRLALGFLR ETEPEMQDINAFQMRYNAFFQPAEGVHWLH >gi|296494676|gb|ADTN01000062.1| GENE 43 44435 - 45781 1240 448 aa, chain - ## HITS:1 COG:yicO KEGG:ns NR:ns ## COG: yicO COG2252 # Protein_GI_number: 16131534 # Func_class: R General function prediction only # Function: Permeases # Organism: Escherichia coli K12 # 1 448 23 470 470 766 100.0 0 MDKKMNNDNTDYVSNESGTLSRLFKLPQHGTTVRTELIAGMTTFLTMVYIVFVNPQILGA AQMDPKVVFVTTCLIAGIGSIAMGIFANLPVALAPAMGLNAFFAFVVVGAMGISWQTGMG AIFWGAVGLFLLTLFRIRYWMISNIPLSLRIGITSGIGLFIALMGLKNTGVIVANKDTLV MIGDLSSHGVLLGILGFFIITVLSSRHFHAAVLVSIVVTSCCGLFFGDVHFSGVYSIPPD ISGVIGEVDLSGALTLELAGIIFSFMLINLFDSSGTLIGVTDKAGLIDGNGKFPNMNKAL YVDSVSSVAGAFIGTSSVTAYIESTSGVAVGGRTGLTAVVVGVMFLLVMFFSPLVAIVPP YATAGALIFVGVLMTSSLARVNWDDFTESVPAFITTVMMPFTFSITEGIALGFMSYCIMK VCTGRWRDLNLCVVVVAALFALKIILVD >gi|296494676|gb|ADTN01000062.1| GENE 44 45944 - 47710 1753 588 aa, chain + ## HITS:1 COG:yicP KEGG:ns NR:ns ## COG: yicP COG1001 # Protein_GI_number: 16131535 # Func_class: F Nucleotide transport and metabolism # Function: Adenine deaminase # Organism: Escherichia coli K12 # 1 588 1 588 588 1200 100.0 0 MNNSINHKFHHISRAEYQELLAVSRGDAVADYIIDNVSILDLINGGEISGPIVIKGRYIA GVGAEYTDAPALQRIDARGATAVPGFIDAHLHIESSMMTPVTFETATLPRGLTTVICDPH EIVNVMGEAGFAWFARCAEQARQNQYLQVSSCVPALEGCDVNGASFTLEQMLAWRDHPQV TGLAEMMDYPGVISGQNALLDKLDAFRHLTLDGHCPGLGGKELNAYITAGIENCHESYQL EEGRRKLQLGMSLMIREGSAARNLNALAPLINEFNSPQCMLCTDDRNPWEIAHEGHIDAL IRRLIEQHNVPLHVAYRVASWSTARHFGLNHLGLLAPGKQADIVLLSDARKVTVQQVLVK GEPIDAQTLQAEESARLAQSAPPYGNTIARQPVSASDFALQFTPGKRYRVIDVIHNELIT HSHSSVYSENGFDRDDVSFIAVLERYGQRLAPACGLLGGFGLNEGALAATVSHDSHNIVV IGRSAEEMALAVNQVIQDGGGLCVVRNGQVQSHLPLPIAGLMSTDTAQSLAEQIDALKAA ARECGPLPDEPFIQMAFLSLPVIPALKLTSQGLFDGEKFAFTTLEVTE Prediction of potential genes in microbial genomes Time: Sun May 15 23:19:27 2011 Seq name: gi|296494675|gb|ADTN01000063.1| Escherichia coli MS 
146-1 E_coli146-1-1.0_Cont166.5, whole genome shotgun sequence
Length of sequence - 49527 bp
Number of predicted genes - 47, with homology - 46
Number of transcription units - 26, operones - 12 average op.length - 2.8
N Tu/Op Conserved S Start End Score pairs(N/Pv)
- Term 8 - 35 -0.1
1 1 Tu 1 3/0.429 - CDS 59 - 1450 1788 ## COG2271 Sugar phosphate permease
- Prom 1477 - 1536 2.3
2 2 Op 1 5/0.143 - CDS 1588 - 2907 1342 ## COG2271 Sugar phosphate permease
3 2 Op 2 5/0.143 - CDS 2917 - 4410 1275 ## COG3851 Signal transduction histidine kinase, glucose-6-phosphate specific
4 2 Op 3 1/0.857 - CDS 4419 - 5009 799 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain
- Term 5024 - 5062 6.0
5 3 Op 1 32/0.000 - CDS 5085 - 5375 331 ## COG0440 Acetolactate synthase, small (regulatory) subunit
6 3 Op 2 . - CDS 5379 - 7067 1902 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase]
+ Prom 7561 - 7620 5.0
7 4 Tu 1 . + CDS 7836 - 7925 179 ##
+ Term 7948 - 7984 10.1
+ Prom 8125 - 8184 5.3
8 5 Tu 1 . + CDS 8239 - 9390 1286 ## COG0477 Permeases of the major facilitator superfamily
+ Term 9403 - 9452 2.0
9 6 Op 1 . - CDS 9398 - 9895 368 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase)
10 6 Op 2 . - CDS 9892 - 10254 283 ## ECO111_4500 putative inner membrane protein
11 6 Op 3 . - CDS 10244 - 10591 409 ## COG2149 Predicted membrane protein
- Prom 10698 - 10757 3.6
+ Prom 10575 - 10634 4.8
12 7 Tu 1 . + CDS 10738 - 11148 340 ## B21_03504 hypothetical protein
+ Term 11162 - 11200 1.2
- Term 11150 - 11188 4.2
13 8 Op 1 4/0.143 - CDS 11195 - 12688 1331 ## COG3119 Arylsulfatase A and related enzymes
14 8 Op 2 . - CDS 12685 - 14400 1711 ## COG4146 Predicted symporter
- Prom 14552 - 14611 5.0
+ Prom 14478 - 14537 2.6
15 9 Tu 1 . + CDS 14567 - 15460 532 ## COG2207 AraC-type DNA-binding domain-containing proteins
- Term 15408 - 15454 10.0
16 10 Op 1 . - CDS 15457 - 15633 217 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases
17 10 Op 2 2/0.571 - CDS 15633 - 16271 608 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases
18 10 Op 3 . - CDS 16271 - 17887 1794 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific
- Prom 18088 - 18147 3.4
19 11 Tu 1 . + CDS 18183 - 18899 637 ## COG2188 Transcriptional regulators
- Term 18842 - 18888 11.3
20 12 Tu 1 . - CDS 18896 - 20557 1821 ## COG2985 Predicted permease
- Prom 20674 - 20733 4.4
- Term 20705 - 20737 5.4
21 13 Tu 1 4/0.143 - CDS 20753 - 21181 571 ## COG0071 Molecular chaperone (small heat shock protein)
22 14 Tu 1 . - CDS 21293 - 21706 585 ## COG0071 Molecular chaperone (small heat shock protein)
+ Prom 21834 - 21893 3.0
23 15 Tu 1 . + CDS 22081 - 22344 338 ## COG5645 Predicted periplasmic lipoprotein
24 16 Tu 1 . - CDS 22346 - 23560 984 ## JW5860 conserved hypothetical protein
- Prom 23585 - 23644 4.6
25 17 Tu 1 . + CDS 23661 - 24725 638 ## COG0644 Dehydrogenases (flavoproteins)
- Term 24665 - 24711 4.0
26 18 Tu 1 . - CDS 24722 - 26014 1520 ## COG0477 Permeases of the major facilitator superfamily
27 19 Op 1 1/0.857 - CDS 26134 - 27282 1652 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily
28 19 Op 2 2/0.571 - CDS 27279 - 27896 510 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase
29 19 Op 3 1/0.857 - CDS 27880 - 28758 526 ## COG3734 2-keto-3-deoxy-galactonokinase
30 19 Op 4 . - CDS 28755 - 29444 726 ## COG2186 Transcriptional regulators
- Prom 29576 - 29635 8.7
+ Prom 29520 - 29579 4.8
31 20 Tu 1 . + CDS 29722 - 30378 357 ## EcE24377A_4205 hypothetical protein
+ Term 30379 - 30435 3.2
- Term 30368 - 30420 12.7
32 21 Op 1 . - CDS 30424 - 31236 1196 ## COG0561 Predicted hydrolases of the HAD superfamily
- Prom 31287 - 31346 4.2
- Term 31305 - 31343 2.3
33 21 Op 2 . - CDS 31351 - 31749 592 ## COG3753 Uncharacterized protein conserved in bacteria
- Prom 31796 - 31855 6.6
- Term 31815 - 31857 11.9
34 22 Op 1 9/0.000 - CDS 31989 - 34403 3167 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit
35 22 Op 2 18/0.000 - CDS 34432 - 35505 753 ## COG1195 Recombinational DNA repair ATPase (RecF pathway)
36 22 Op 3 16/0.000 - CDS 35505 - 36605 1302 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog)
37 22 Op 4 . - CDS 36610 - 37944 1271 ## COG0593 ATPase involved in DNA replication initiation
+ Prom 38465 - 38524 6.3
38 23 Op 1 . + CDS 38620 - 38760 228 ## PROTEIN SUPPORTED gi|15804297|ref|NP_290336.1| 50S ribosomal protein L34
39 23 Op 2 22/0.000 + CDS 38810 - 39136 206 ## COG0594 RNase P protein component
+ Prom 39237 - 39296 3.9
40 23 Op 3 10/0.000 + CDS 39360 - 41006 2012 ## COG0706 Preprotein translocase subunit YidC
+ Term 41032 - 41062 3.0
41 23 Op 4 . + CDS 41112 - 42476 1671 ## COG0486 Predicted GTPase
+ Term 42486 - 42532 13.3
+ Prom 42839 - 42898 3.9
42 24 Op 1 3/0.429 + CDS 43014 - 44429 1882 ## COG3033 Tryptophanase
+ Term 44441 - 44479 3.7
43 24 Op 2 3/0.429 + CDS 44520 - 45767 1153 ## COG0814 Amino acid permeases
+ Term 45780 - 45808 0.6
+ Prom 45813 - 45872 2.5
44 25 Op 1 9/0.000 + CDS 45899 - 47074 814 ## COG0477 Permeases of the major facilitator superfamily
45 25 Op 2 4/0.143 + CDS 47049 - 48008 807 ## COG0583 Transcriptional regulator
+ Prom 48012 - 48071 6.7
46 26 Op 1 3/0.429 + CDS 48153 - 48914 520 ## COG2091 Phosphopantetheinyl transferase
47 26 Op 2 .
+ CDS 48936 - 49502 764 ## COG0431 Predicted flavoprotein Predicted protein(s) >gi|296494675|gb|ADTN01000063.1| GENE 1 59 - 1450 1788 463 aa, chain - ## HITS:1 COG:ECs4603 KEGG:ns NR:ns ## COG: ECs4603 COG2271 # Protein_GI_number: 15833857 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Escherichia coli O157:H7 # 1 463 1 463 463 845 100.0 0 MLAFLNQVRKPTLDLPLEVRRKMWFKPFMQSYLVVFIGYLTMYLIRKNFNIAQNDMISTY GLSMTQLGMIGLGFSITYGVGKTLVSYYADGKNTKQFLPFMLILSAICMLGFSASMGSGS VSLFLMIAFYALSGFFQSTGGSCSYSTITKWTPRRKRGTFLGFWNISHNLGGAGAAGVAL FGANYLFDGHVIGMFIFPSIIALIVGFIGLRYGSDSPESYGLGKAEELFGEEISEEDKET ESTDMTKWQIFVEYVLKNKVIWLLCFANIFLYVVRIGIDQWSTVYAFQELKLSKAVAIQG FTLFEAGALVGTLLWGWLSDLANGRRGLVACIALALIIATLGVYQHASNEYIYLASLFAL GFLVFGPQLLIGVAAVGFVPKKAIGAADGIKGTFAYLIGDSFAKLGLGMIADGTPVFGLT GWAGTFAALDIAAIGCICLMAIVAVMEERKIRREKKIQQLTVA >gi|296494675|gb|ADTN01000063.1| GENE 2 1588 - 2907 1342 439 aa, chain - ## HITS:1 COG:uhpC KEGG:ns NR:ns ## COG: uhpC COG2271 # Protein_GI_number: 16131537 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Escherichia coli K12 # 1 439 2 440 440 810 100.0 0 MLPFLKAPADAPLMTDKYEIDARYRYWRRHILLTIWLGYALFYFTRKSFNAAVPEILANG VLSRSDIGLLATLFYITYGVSKFVSGIVSDRSNARYFMGIGLIATGIINILFGFSTSLWA FAVLWVLNAFFQGWGSPVCARLLTAWYSRTERGGWWALWNTAHNVGGALIPIVMAAAALH YGWRAGMMIAGCMAIVVGIFLCWRLRDRPQALGLPAVGEWRHDALEIAQQQEGAGLTRKE ILTKYVLLNPYIWLLSFCYVLVYVVRAAINDWGNLYMSETLGVDLVTANTAVTMFELGGF IGALVAGWGSDKLFNGNRGPMNLIFAAGILLSVGSLWLMPFASYVMQATCFFTIGFFVFG PQMLIGMAAAECSHKEAAGAATGFVGLFAYLGASLAGWPLAKVLDTWHWSGFFVVISIAA GISALLLLPFLNAQTPREA >gi|296494675|gb|ADTN01000063.1| GENE 3 2917 - 4410 1275 497 aa, chain - ## HITS:1 COG:uhpB KEGG:ns NR:ns ## COG: uhpB COG3851 # Protein_GI_number: 16131538 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase, glucose-6-phosphate specific # Organism: Escherichia coli K12 # 1 497 5 501 501 841 99.0 0 
MFSRLITVIACFFIFSAAWFCLWSISLHLVERPDMAVLLFPFGLRLGLMLQCPRGYWPVL LGAEWLLIYWLTQAVGLTHFPLLMIGSLLTLLPVALISRYRHQRDWRTLLLQGAALTAAA LLQSLPWLWHGKESWNALLLTLTGGLTLAPICLVFWHYLANNTWLPLGPSLVSQPINWRG RHLVWYLLLFVISLWLQLGLPDELSRFTPFCLALPIIALAWHYGWQGALIATLMNAIALI ASQTWRDHPVDLLLSLLVQSLTGLLLGAGIQRLRELNQSLQKELARNQHLAERLLETEES VRRDVARELHDDIGQTITAIRTQAGIVQRLAADNASVKQSGQLIEQLSLGVYDAVRRLLG RLRPRQLDDLTLEQAIRSLMREMELEGRGIVSHLEWRIDESALSENQRVTLFRVCQEGLN NIVKHADASAVTLQGWQQDERLMLVIEDDGSGLPPGSGQQGFGLTGMRERVTALGGTLHI SCLHGTRVSVSLPQRYV >gi|296494675|gb|ADTN01000063.1| GENE 4 4419 - 5009 799 196 aa, chain - ## HITS:1 COG:ECs4606 KEGG:ns NR:ns ## COG: ECs4606 COG2197 # Protein_GI_number: 15833860 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 196 1 196 196 375 100.0 1e-104 MITVALIDDHLIVRSGFAQLLGLEPDLQVVAEFGSGREALAGLPGRGVQVCICDISMPDI SGLELLSQLPKGMATIMLSVHDSPALVEQALNAGARGFLSKRCSPDELIAAVHTVATGGC YLTPDIAIKLASGRQDPLTKRERQVAEKLAQGMAVKEIAAELGLSPKTVHVHRANLMEKL GVSNDVELARRMFDGW >gi|296494675|gb|ADTN01000063.1| GENE 5 5085 - 5375 331 96 aa, chain - ## HITS:1 COG:ECs4611 KEGG:ns NR:ns ## COG: ECs4611 COG0440 # Protein_GI_number: 15833865 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Escherichia coli O157:H7 # 1 96 1 96 96 184 100.0 3e-47 MQNTTHDNVILELTVRNHPGVMTHVCGLFARRAFNVEGILCLPIQDSDKSHIWLLVNDDQ RLEQMISQIDKLEDVVKVQRNQSDPTMFNKIAVFFQ >gi|296494675|gb|ADTN01000063.1| GENE 6 5379 - 7067 1902 562 aa, chain - ## HITS:1 COG:ECs4612 KEGG:ns NR:ns ## COG: ECs4612 COG0028 # Protein_GI_number: 15833866 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Escherichia coli 
O157:H7 # 1 562 1 562 562 1099 98.0 0 MASSGTTSTRKRFTGAEFIVHFLEQQGIKIVTGIPGGSILPVYDALSQSTQIRHILARHE QGAGFIAQGMARTDGKPAVCMACSGPGATNLVTAIADARLDSIPLICITGQVPASMIGTD AFQEVDTYGISIPITKHNYLVRHIEELPQVMSDAFRIAQSGRPGPVWIDIPKDVQTAVFE IETQPAMAEKAAAPAFSEESIRDAAAMINAAKRPVLYLGGGVINAPARVRELAEKAQLPT TMTLMALGMLPKAHPLSLGMLGMHGVRSTNYILQEADLLIVLGARFDDRAIGKTEQFCPN AKIIHVDIDRAELGKIKQPHVAIQADVDDVLAQLIPLVEAQPRAEWHQLVADLQREFPCP IPKACDPLSHYGLINAVAACVDDNAIITTDVGQHQMWTAQAYPLNRPRQWLTSGGLGTMG FGLPAAIGAALANPDRKVLCFSGDGSLMMNIQEMATASENQLDVKIILMNNEALGLVHQQ QSLFYEQGVFAATYPGKINFMQIAAGFGLETCDLNNEADPQASLQEIINRPGPALIHVRI DAEEKVYPMVPPGAANTEMVGE >gi|296494675|gb|ADTN01000063.1| GENE 7 7836 - 7925 179 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNLVDIAILILKLIVAALQLLDAVLKYLK >gi|296494675|gb|ADTN01000063.1| GENE 8 8239 - 9390 1286 383 aa, chain + ## HITS:1 COG:ECs4614 KEGG:ns NR:ns ## COG: ECs4614 COG0477 # Protein_GI_number: 15833868 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 9 383 22 396 396 619 100.0 1e-177 MLVLLVAVGQMAQTIYIPAIADMARDLNVREGAVQSVMGAYLLTYGVSQLFYGPISDRVG RRPVILVGMSIFMLATLVAVTTSSLTVLIAASAMQGMGTGVGGVMARTLPRDLYERTQLR HANSLLNMGILVSPLLAPLIGGLLDTMWNWRACYLFLLVLCAGVTFSMARWMPETRPVDA PRTRLLTSYKTLFGNSGFNCYLLMLIGGLAGIAAFEACSGVLMGAVLGLSSMTVSILFIL PIPAAFFGAWFAGRPNKRFSTLMWQSVICCLLAGLLMWIPDWFGVMNVWTLLVPAALFFF GAGMLFPLATSGAMEPFPFLAGTAGALVGGLQNIGSGVLASLSAMLPQTGQGSLGLLMTL MGLLIVLCWLPLATRMSHQGQPV >gi|296494675|gb|ADTN01000063.1| GENE 9 9398 - 9895 368 165 aa, chain - ## HITS:1 COG:yidF KEGG:ns NR:ns ## COG: yidF COG0641 # Protein_GI_number: 16131544 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Escherichia coli K12 # 1 165 1 165 165 343 100.0 7e-95 MTGSQVIDAEEDRHKLVVEYKDALQPADFYHNFKQRGIRSVQLIPYLEFDDRGDLTAASV 
TAELWGKFLIALFECWVRADISRISIELFDATLQKWCGSENPQPRCDCQACDWHRLCPHA RQETPDSVLCAGYQAFYSYSAPHMRVMRDLIKQHRSPMELMTMLR >gi|296494675|gb|ADTN01000063.1| GENE 10 9892 - 10254 283 120 aa, chain - ## HITS:1 COG:no KEGG:ECO111_4500 NR:ns ## KEGG: ECO111_4500 # Name: yidG # Def: putative inner membrane protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 120 1 120 120 221 100.0 8e-57 MPDSRKARRIADPGLQPERTSLAWFRTMLGYGALMALAIKHNWHQAGMLFWISIGILAIV ALILWHYTRNRNLMDVTNSDFSQFHVVRDKFLISLAVLSLAILFAVTHIHQLIVFIERVA >gi|296494675|gb|ADTN01000063.1| GENE 11 10244 - 10591 409 115 aa, chain - ## HITS:1 COG:ECs4617 KEGG:ns NR:ns ## COG: ECs4617 COG2149 # Protein_GI_number: 15833871 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 115 1 115 115 190 100.0 7e-49 MKISRLGEAPDYRFSLANERTFLAWIRTALGFLAAGVGLDQLAPDFATPVIRELLALLLC LFSGGLAMYGYLRWLRNEKAMRLKEDLPYTNSLLIISLILMVVAVIVMGLVLYAG >gi|296494675|gb|ADTN01000063.1| GENE 12 10738 - 11148 340 136 aa, chain + ## HITS:1 COG:no KEGG:B21_03504 NR:ns ## KEGG: B21_03504 # Name: yidI # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 136 14 149 149 217 100.0 1e-55 MLFGAIALMMGIIHFSFGPFSAPPPTFESIVADKTAEIKRGLLAGIKGEKITTVEKKEDV DVDKILNQSGIALAIAALLCAFIGGMRKENRWGIRGALVFGGGTLAFHTLLFGIGIVCSI LLIFLIFSFLTGGSLV >gi|296494675|gb|ADTN01000063.1| GENE 13 11195 - 12688 1331 497 aa, chain - ## HITS:1 COG:yidJ KEGG:ns NR:ns ## COG: yidJ COG3119 # Protein_GI_number: 16131548 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Escherichia coli K12 # 1 497 1 497 497 1050 100.0 0 MKRPNFLFVMTDTQATNMVGCYSGKPLNTQNIDSLAAEGIRFNSAYTCSPVCTPARAGLF TGIYANQSGPWTNNVAPGKNISTMGRYFKDAGYHTCYIGKWHLDGHDYFGTGECPPEWDA DYWFDGANYLSELTEKEISLWRNGLNSVEDLQANHIDETFTWAHRISNRAVDFLQQPARA DEPFLMVVSYDEPHHPFTCPVEYLEKYADFYYELGEKAQDDLANKPEHHRLWAQAMPSPV GDDGLYHHPLYFACNDFVDDQIGRVINALTPEQRENTWVIYTSDHGEMMGAHKLISKGAA 
MYDDITRIPLIIRSPQGERRQVDTPVSHIDLLPTMMALADIEKPEILPGENILAVKEPRG VMVEFNRYEIEHDSFGGFIPVRCWVTDDFKLVLNLFTSDELYDRRNDPNEMHNLIDDIRF ADVRSKMHDALLDYMDKIRDPFRSYQWSLRPWRKDARPRWMGAFRPRPQDGYSPVVRDYD TGLPTQGVKVEEKKQKF >gi|296494675|gb|ADTN01000063.1| GENE 14 12685 - 14400 1711 571 aa, chain - ## HITS:1 COG:yidK KEGG:ns NR:ns ## COG: yidK COG4146 # Protein_GI_number: 16131549 # Func_class: R General function prediction only # Function: Predicted symporter # Organism: Escherichia coli K12 # 1 571 1 571 571 1053 100.0 0 MNSLQILSFVGFTLLVAVITWWKVRKTDTGSQQGYFLAGRSLKAPVIAASLMLTNLSTEQ LVGLSGQAYKSGMSVMGWEVTSAVTLIFLALIFLPRYLKRGIATIPDFLEERYDKTTRII IDFCFLIATGVCFLPIVLYSGALALNSLFHVGESLQISHGAAIWLLVILLGLAGILYAVI GGLRAMAVADSINGIGLVIGGLMVPVFGLIAMGKGSFMQGIEQLTTVHAEKLNSIGGPTD PLPIGAAFTGLILVNTFYWCTNQGIVQRTLASKSLAEGQKGALLTAVLKMLDPLVLVLPG LIAFHLYQDLPKADMAYPTLVNNVLPVPMVGFFGAVLFGAVISTFNGFLNSASTLFSMGI YRRIINQNAEPQQLVTVGRKFGFFIAIVSVLVAPWIANAPQGLYSWMKQLNGIYNVPLVT IIIMGFFFPRIPALAAKVAMGIGIISYITINYLVKFDFHFLYVLACTFCINVVVMLVIGF IKPRATPFTFKDAFAVDMKPWKNVKIASIGILFAMIGVYAGLAEFGGYGTRWLAMISYFI AAVVIVYLIFDSWRHRHDPAVTFTPDGKDSL >gi|296494675|gb|ADTN01000063.1| GENE 15 14567 - 15460 532 297 aa, chain + ## HITS:1 COG:yidL KEGG:ns NR:ns ## COG: yidL COG2207 # Protein_GI_number: 16131550 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 297 11 307 307 624 100.0 1e-179 MNGKLQSSDVKNETPYNIPLLINENVISSGISLISLWHTYADEHYRVIWPRDKKKPLIAN SWVAVYTVQGCGKILLKNGEQITLHGNCIIFLKPMDIHSYHCEGLVWEQYWMEFTPTSMM DIPVGQQSVIYNGEIYNQELTEVAELITSPEAIKNNLAVAFLTKIIYQWICLMYADGKKD PQRRQIEKLIATLHASLQQRWSVADMAATIPCSEAWLRRLFLRYTGKTPKEYYLDARLDL ALSLLKQQGNSVGEVADTLNFFDSFHFSKAFKHKFGYAPSAVLKNTDQHPTDASPHN >gi|296494675|gb|ADTN01000063.1| GENE 16 15457 - 15633 217 58 aa, chain - ## HITS:1 COG:Z5177 KEGG:ns NR:ns ## COG: Z5177 COG1486 # Protein_GI_number: 15804281 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of 
glycosyl hydrolases # Organism: Escherichia coli O157:H7 EDL933 # 1 58 383 440 440 117 100.0 5e-27 MSQQVAVEKLVVDAWEQRSYQHLWQAITLSKTVPSASVAKAILDELLEANKAYWPELR >gi|296494675|gb|ADTN01000063.1| GENE 17 15633 - 16271 608 212 aa, chain - ## HITS:1 COG:glvG KEGG:ns NR:ns ## COG: glvG COG1486 # Protein_GI_number: 16131551 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Escherichia coli K12 # 1 212 1 212 212 444 100.0 1e-125 MTKFSVVVAGGGSTFTPGIVLMLLANQDRFPLRALKFYDNDGARQEVIAEACKVILKEKA PDIAFSYTTDPEVAFSDVDFVMAHIRVGKYPMRELDEKIPLRHGVVGQETCGPGGIAYGM RSIGGVLELVDYMEKYSPNAWMLNYSNPAAIVAEATRRLRPNAKILNICDMPIGIESRMA QIVGLQDRKQMRVRYYGLNHWWSAISRSFRKG >gi|296494675|gb|ADTN01000063.1| GENE 18 16271 - 17887 1794 538 aa, chain - ## HITS:1 COG:glvCm_1 KEGG:ns NR:ns ## COG: glvCm_1 COG1263 # Protein_GI_number: 16132269 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Escherichia coli K12 # 1 447 88 534 534 852 99.0 0 MLSQIQRFGGAMFTPVLLFPFAGIVVGLAILLQNPMFVGESLTDPNSLFAQIVHIIEEGG WTVFRNMPLIFAVGLPIGLAKQAQGRACLAVMVSFLTWNYFINAMGMTWGSYFGVDFTQD AVAGSGLTMMAGIKTLDTSIIGAIIISGIVTALHNRLFDKKLPVFLGIFQGTSYVVIIAF LVMIPCAWLTLLGWPKVQMGIESLQAFLRSAGALGVWVYTFLERILIPTGLHHFIYGQFI FGPAAVEGGIQMYWAQHLQEFSLSAEPLKSLFPEGGFALHGNSKIFGAVGISLAMYFTAA PENRVKVAGLLIPATLTAMLVGITEPLEFTFLFISPLLFAVHAVLAASMSTVMYLFGVVG NMGGGLIDQVLPQNWIPMFSNHADMMLTQIAIGLCFTLLYFVVFRTLILQFNMCTPGRED AEVKLYSKAEYKASRGQTTAAEPKKELDQAAGILQALGGVGNISSINNCATRLRIALHDM SQTLDDEVFKKLGAHGVFRSGDAIQVIIGLHVSQLREQLDSLINSHQSAENVAITEAV >gi|296494675|gb|ADTN01000063.1| GENE 19 18183 - 18899 637 238 aa, chain + ## HITS:1 COG:yidP KEGG:ns NR:ns ## COG: yidP COG2188 # Protein_GI_number: 16131552 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 238 1 238 238 448 100.0 1e-126 
MIYKSIAERLRIRLNSADFTLNSLLPGEKKLAEEFAVSRMTIRKAIDLLVAWGLVVRRHG SGTYLVRKDVLHQTASLTGLVEVLKRQGKTVTSQVLIFEIMPAPPAIASQLRIQINEQIY FSRRVRFVEGKPLMLEDSYMPVKLFRNLSLQHLEGSKFEYIEQECGILIGGNYESLTPVL ADRLLARQMKVAEHTPLLRITSLSYSESGEFLNYSVMFRNASEYQVEYHLRRLHPEKS >gi|296494675|gb|ADTN01000063.1| GENE 20 18896 - 20557 1821 553 aa, chain - ## HITS:1 COG:ECs4625 KEGG:ns NR:ns ## COG: ECs4625 COG2985 # Protein_GI_number: 15833879 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Escherichia coli O157:H7 # 1 553 9 561 561 985 100.0 0 MSDIALTVSILALVAVVGLFIGNVKFRGIGLGIGGVLFGGIIVGHFVSQAGMTLSSDMLH VIQEFGLILFVYTIGIQVGPGFFASLRVSGLRLNLFAVLIVIIGGLVTAILHKLFDIPLP VVLGIFSGAVTNTPALGAGQQILRDLGTPMEMVDQMGMSYAMAYPFGICGILFTMWMLRV IFRVNVETEAQQHESSRTNGGALIKTINIRVENPNLHDLAIKDVPILNGDKIICSRLKRE ETLKVPSPDTIIQLGDLLHLVGQPADLHNAQLVIGQEVDTSLSTKGTDLRVERVVVTNEN VLGKRIRDLHFKERYDVVISRLNRAGVELVASGDISLQFGDILNLVGRPSAIDAVANVLG NAQQKLQQVQMLPVFIGIGLGVLLGSIPVFVPGFPAALKLGLAGGPLIMALILGRIGSIG KLYWFMPPSANLALRELGIVLFLSVVGLKSGGDFVNTLVNGEGLSWIGYGALITAVPLIT VGILARMLAKMNYLTMCGMLAGSMTDPPALAFANNLHPTSGAAALSYATVYPLVMFLRII TPQLLAVLFWSIG >gi|296494675|gb|ADTN01000063.1| GENE 21 20753 - 21181 571 142 aa, chain - ## HITS:1 COG:ECs4626 KEGG:ns NR:ns ## COG: ECs4626 COG0071 # Protein_GI_number: 15833880 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Escherichia coli O157:H7 # 1 142 3 144 144 275 100.0 2e-74 MRNFDLSPLMRQWIGFDKLANALQNAGESQSFPPYNIEKSDDNHYRITLALAGFRQEDLE IQLEGTRLSVKGTPEQPKEEKKWLHQGLMNQPFSLSFTLAENMEVSGATFVNGLLHIDLI RNEPEPIAAQRIAISERPALNS >gi|296494675|gb|ADTN01000063.1| GENE 22 21293 - 21706 585 137 aa, chain - ## HITS:1 COG:ECs4627 KEGG:ns NR:ns ## COG: ECs4627 COG0071 # Protein_GI_number: 15833881 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Escherichia coli O157:H7 # 1 137 1 137 137 256 
100.0 7e-69 MRNFDLSPLYRSAIGFDRLFNHLENNQSQSNGGYPPYNVELVDENHYRIAIAVAGFAESE LEITAQDNLLVVKGAHADEQKERTYLYQGIAERNFERKFQLAENIHVRGANLVNGLLYID LERVIPEAKKPRRIEIN >gi|296494675|gb|ADTN01000063.1| GENE 23 22081 - 22344 338 87 aa, chain + ## HITS:1 COG:ECs4628 KEGG:ns NR:ns ## COG: ECs4628 COG5645 # Protein_GI_number: 15833882 # Func_class: R General function prediction only # Function: Predicted periplasmic lipoprotein # Organism: Escherichia coli O157:H7 # 1 87 49 135 135 167 98.0 5e-42 MMSHTGGKEGTYPGTRASATMIGDDETNWGTKSLAILDMPFTAVMDTLLLPWDVFRKDSS VRSRVEKSEANAQATNAVIPPARMPDN >gi|296494675|gb|ADTN01000063.1| GENE 24 22346 - 23560 984 404 aa, chain - ## HITS:1 COG:no KEGG:JW5860 NR:ns ## KEGG: JW5860 # Name: yidR # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 404 1 404 404 815 100.0 0 MKQITFAPRNHLLTNTNTWTPDSQWLVFDVRPSGASFTGETIERVNIHTGEVEVIYRASQ GAHVGVVTVHPKSEKYVFIHGPENPDETWHYDFHHRRGVIVEGGKMSNLDAMDITAPYTP GVLRGGSHVHVFSPNGERVSFTYNDHVMHELDPALDLRNVGVAAPFGPVNVQKQHPREYS GSHWCVLVSKTTPTPQPGSDEINRAYEEGWVGNHALAFIGDTLSPKGEKVPELFIVELPQ DEAGWKAAGDAPLSGTETTLPAPPRGVVQRRLTFTHHRAYPGLVNVPRHWVRCNPQGTQI AFLMRDDNGIVQLWLISPQGGEPRQLTHNKTDIQSAFNWHPSGEWLGFVLDNRIACAHAQ SGEVEYLTEHHANSPSADAVVFSPDGQWLAWMEGGQLWITETDR >gi|296494675|gb|ADTN01000063.1| GENE 25 23661 - 24725 638 354 aa, chain + ## HITS:1 COG:yidS KEGG:ns NR:ns ## COG: yidS COG0644 # Protein_GI_number: 16131558 # Func_class: C Energy production and conversion # Function: Dehydrogenases (flavoproteins) # Organism: Escherichia coli K12 # 1 354 8 361 361 749 100.0 0 MEHFDVAIIGLGPAGSALARKLAGKMQVIALDKKHQCGTEGFSKPCGGLLAPDAQRSFIR DGLTLPVDVIANPQIFSVKTVDVAASLTRNYQRSYININRHAFDLWMKSLIPASVEVYHD SLCRKIWREDDKWHVIFRADGWEQHITARYLVGADGANSMVRRHLYPDHQIRKYVAIQQW FAEKHPVPFYSCIFDNSITNCYSWSISKDGYFIFGGAYPMKDGQTRFTTLKEKMSAFQFQ FGKTVKSEKCTVLFPSRWQDFVCGKDNAFLIGEAAGFISASSLEGISYALDSTDILRSVL LKQPEKLNTAYWRATRKLRLKLFGKIVKSRCLTAPALRKWIMRSGVAHIPQLKD >gi|296494675|gb|ADTN01000063.1| GENE 26 24722 - 26014 1520 430 aa, chain - ## HITS:1 
COG:dgoT KEGG:ns NR:ns ## COG: dgoT COG0477 # Protein_GI_number: 16131559 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 430 16 445 445 801 100.0 0 MDIPVNAAKPGRRRYLTLVMIFITVVICYVDRANLAVASAHIQEEFGITKAEMGYVFSAF AWLYTLCQIPGGWFLDRVGSRVTYFIAIFGWSVATLFQGFATGLMSLIGLRAITGIFEAP AFPTNNRMVTSWFPEHERASAVGFYTSGQFVGLAFLTPLLIWIQEMLSWHWVFIVTGGIG IIWSLIWFKVYQPPRLTKGISKAELDYIRDGGGLVDGDAPVKKEARQPLTAKDWKLVFHR KLIGVYLGQFAVASTLWFFLTWFPNYLTQEKGITALKAGFMTTVPFLAAFVGVLLSGWVA DLLVRKGFSLGFARKTPIICGLLISTCIMGANYTNDPMMIMCLMALAFFGNGFASITWSL VSSLAPMRLIGLTGGVFNFAGGLGGITVPLVVGYLAQGYGFAPALVYISAVALIGALSYI LLVGDVKRVG >gi|296494675|gb|ADTN01000063.1| GENE 27 26134 - 27282 1652 382 aa, chain - ## HITS:1 COG:dgoA_2 KEGG:ns NR:ns ## COG: dgoA_2 COG4948 # Protein_GI_number: 16131560 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Escherichia coli K12 # 1 382 96 477 477 799 100.0 0 MKITKITTYRLPPRWMFLKIETDEGVVGWGEPVIEGRARTVEAAVHELGDYLIGQDPSRI NDLWQVMYRAGFYRGGPILMSAIAGIDQALWDIKGKVLNAPVWQLMGGLVRDKIKAYSWV GGDRPADVIDGIKTLREIGFDTFKLNGCEELGLIDNSRAVDAAVNTVAQIREAFGNQIEF GLDFHGRVSAPMAKVLIKELEPYRPLFIEEPVLAEQAEYYPKLAAQTHIPLAAGERMFSR FDFKRVLEAGGISILQPDLSHAGGITECYKIAGMAEAYDVTLAPHCPLGPIALAACLHID FVSYNAVLQEQSMGIHYNKGAELLDFVKNKEDFSMVGGFFKPLTKPGLGVEIDEAKVIEF SKNAPDWRNPLWRHEDNSVAEW >gi|296494675|gb|ADTN01000063.1| GENE 28 27279 - 27896 510 205 aa, chain - ## HITS:1 COG:RSc2752 KEGG:ns NR:ns ## COG: RSc2752 COG0800 # Protein_GI_number: 17547471 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Ralstonia solanacearum # 1 201 3 203 213 235 62.0 4e-62 
MQWQTKLPLIAILRGITPDEALAHVGAVIDAGFDAVEIPLNSPQWEQSIPAIVDAYGDKA LIGAGTVLKPEQVDALARMSCQLIVTPNIHSEVIRRAVGYGMTVCPGCATATEAFTALEA GAQALKIFPSSAFGPQYIKALKAVLPSDIAVFAVGGVTPENLAQWIDAGCAGAGLGSDLY RAGQSVERTAQQAAAFVKAYREAVQ >gi|296494675|gb|ADTN01000063.1| GENE 29 27880 - 28758 526 292 aa, chain - ## HITS:1 COG:dgoK KEGG:ns NR:ns ## COG: dgoK COG3734 # Protein_GI_number: 16131561 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-galactonokinase # Organism: Escherichia coli K12 # 1 292 1 292 292 558 100.0 1e-159 MTARYIAIDWGSTNLRAWLYQGDHCLESRQSEAGVTRLNGKSPAAVLAEVTTDWREEKTP VVMAGMVGSNVGWKVAPYLSVPACFSSIGEQLTSVGDNIWIIPGLCVSHDDNHNVMRGEE TQLIGARALAPSSLYVMPGTHCKWVQADSQQINDFRTVMTGELHHLLLNHSLIGAGLPPQ ENSADAFTAGLERGLNTPAILPQLFEVRASHVLGTLPREQVSEFLSGLLIGAEVASMRDY VAHQHAITLVAGTSLTARYQQAFQAMGCDVTAVAGDTAFQAGIRSIAHAVAN >gi|296494675|gb|ADTN01000063.1| GENE 30 28755 - 29444 726 229 aa, chain - ## HITS:1 COG:STM3830 KEGG:ns NR:ns ## COG: STM3830 COG2186 # Protein_GI_number: 16767115 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Salmonella typhimurium LT2 # 1 229 1 229 229 409 94.0 1e-114 MTLNKTDRIVITLGKQIVHGKYVPGSPLPAEAELCEEFATSRNIIREVFRSLMAKRLIEM KRYRGAFVAPRNQWNYLDTDVLQWVLENDYDPRLISAMSEVRNLVEPAIARWAAERATSS DLAQIESALNEMIANNQDREAFNEADIRYHEAVLQSVHNPVLQQLSIAISSLQRAVFERT WMGDEANMPQTLQEHKALFDAIRHQDGDAAEQAALTMIASSTRRLKEIT >gi|296494675|gb|ADTN01000063.1| GENE 31 29722 - 30378 357 218 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_4205 NR:ns ## KEGG: EcE24377A_4205 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 218 36 253 253 442 99.0 1e-123 MKLNFKGFFKAAGLFPLALMLSGCISYALVSHTAKGSSGKYQSQSDTITGLSQAKDSNGT KGYVFVGESLDYLITDGADDIVKMLNDPALNRHNIQVADDARFVLNAGKKKFTGTISLYY YWNNEEEKALATHYGFACGVQHCTRSLENLKGTIHEKNKNMDYSKVMAFYHPFKVRFYEY YSPRGIPDGVSAALLPVTVTLDIITAPLQFLVVYAVNQ >gi|296494675|gb|ADTN01000063.1| GENE 32 30424 - 31236 1196 270 aa, chain - ## HITS:1 COG:yidA KEGG:ns NR:ns ## COG: yidA COG0561 # 
Protein_GI_number: 16131565 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Escherichia coli K12 # 1 270 1 270 270 525 100.0 1e-149 MAIKLIAIDMDGTLLLPDHTISPAVKNAIAAARARGVNVVLTTGRPYAGVHNYLKELHME QPGDYCITYNGALVQKAADGSTVAQTALSYDDYRFLEKLSREVGSHFHALDRTTLYTANR DISYYTVHESFVATIPLVFCEAEKMDPNTQFLKVMMIDEPAILDQAIARIPQEVKEKYTV LKSAPYFLEILDKRVNKGTGVKSLADVLGIKPEEIMAIGDQENDIAMIEYAGVGVAMDNA IPSVKEVANFVTKSNLEDGVAFAIEKYVLN >gi|296494675|gb|ADTN01000063.1| GENE 33 31351 - 31749 592 132 aa, chain - ## HITS:1 COG:ECs4633 KEGG:ns NR:ns ## COG: ECs4633 COG3753 # Protein_GI_number: 15833887 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 132 4 135 135 214 98.0 3e-56 MGLFDEVVGAFLKGDAGKYQAILSWVEEQGGIQVLLEKLQSGGLGAILSTWLSNQQGNQS VSGEQLESALGTNAVSDLGQKLGVDTSTASILLAEQLPKIIDALSPQGEVSPQANNDLLS AGMELLKGKLFR >gi|296494675|gb|ADTN01000063.1| GENE 34 31989 - 34403 3167 804 aa, chain - ## HITS:1 COG:ECs4634 KEGG:ns NR:ns ## COG: ECs4634 COG0187 # Protein_GI_number: 15833888 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Escherichia coli O157:H7 # 1 804 1 804 804 1588 100.0 0 MSNSYDSSSIKVLKGLDAVRKRPGMYIGDTDDGTGLHHMVFEVVDNAIDEALAGHCKEII VTIHADNSVSVQDDGRGIPTGIHPEEGVSAAEVIMTVLHAGGKFDDNSYKVSGGLHGVGV SVVNALSQKLELVIQREGKIHRQIYEHGVPQAPLAVTGETEKTGTMVRFWPSLETFTNVT EFEYEILAKRLRELSFLNSGVSIRLRDKRDGKEDHFHYEGGIKAFVEYLNKNKTPIHPNI FYFSTEKDGIGVEVALQWNDGFQENIYCFTNNIPQRDGGTHLAGFRAAMTRTLNAYMDKE GYSKKAKVSATGDDAREGLIAVVSVKVPDPKFSSQTKDKLVSSEVKSAVEQQMNELLAEY LLENPTDAKIVVGKIIDAARAREAARRAREMTRRKGALDLAGLPGKLADCQERDPALSEL YLVEGDSAGGSAKQGRNRKNQAILPLKGKILNVEKARFDKMLSSQEVATLITALGCGIGR DEYNPDKLRYHSIIIMTDADVDGSHIRTLLLTFFYRQMPEIVERGHVYIAQPPLYKVKKG KQEQYIKDDEAMDQYQISIALDGATLHTNASAPALAGEALEKLVSEYNATQKMINRMERR YPKAMLKELIYQPTLTEADLSDEQTVTRWVNALVSELNDKEQHGSQWKFDVHTNAEQNLF 
EPIVRVRTHGVDTDYPLDHEFITGGEYRRICTLGEKLRGLLEEDAFIERGERRQPVASFE QALDWLVKESRRGLSIQRYKGLGEMNPEQLWETTMDPESRRMLRVTVKDAIAADQLFTTL MGDAVEPRRAFIEENALKAANIDI >gi|296494675|gb|ADTN01000063.1| GENE 35 34432 - 35505 753 357 aa, chain - ## HITS:1 COG:ECs4635 KEGG:ns NR:ns ## COG: ECs4635 COG1195 # Protein_GI_number: 15833889 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair ATPase (RecF pathway) # Organism: Escherichia coli O157:H7 # 1 357 1 357 357 712 99.0 0 MSLTRLLIRDFRNIETADLALSPGFNFLVGANGSGKTSVLEAIYTLGHGRAFRSLQIGRV IRHEQEAFVLHGRLQGEERETAIGLTKDKQGDSKVRIDGTDGHKVAELAHLMPMQLITPE GFTLLNGGPKYRRAFLDWGCFHNEPGFFTAWSNLKRLLKQRNAALRQVTCYEQLRPWDKE LIPLAEQISTWRAEYSAGIAADMADTCKQFLPEFSLTFSFQRGWEKETEYAEVLERNFER DRQLTYTAHGPHKADLRIRADGAPVEDTLSRGQLKLLMCALRLAQGEFLTRESGRRCLYL IDDFASELDDERRGLLASRLKATQSQVFVSAISAEHVIDMSDENSKMFTVEKGKITD >gi|296494675|gb|ADTN01000063.1| GENE 36 35505 - 36605 1302 366 aa, chain - ## HITS:1 COG:ECs4636 KEGG:ns NR:ns ## COG: ECs4636 COG0592 # Protein_GI_number: 15833890 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Escherichia coli O157:H7 # 1 366 1 366 366 729 100.0 0 MKFTVEREHLLKPLQQVSGPLGGRPTLPILGNLLLQVADGTLSLTGTDLEMEMVARVALV QPHEPGATTVPARKFFDICRGLPEGAEIAVQLEGERMLVRSGRSRFSLSTLPAADFPNLD DWQSEVEFTLPQATMKRLIEATQFSMAHQDVRYYLNGMLFETEGEELRTVATDGHRLAVC SMPIGQSLPSHSVIVPRKGVIELMRMLDGGDNPLRVQIGSNNIRAHVGDFIFTSKLVDGR FPDYRRVLPKNPDKHLEAGCDLLKQAFARAAILSNEKFRGVRLYVSENQLKITANNPEQE EAEEILDVTYSGAEMEIGFNVSYVLDVLNALKCENVRMMLTDSVSSVQIEDAASQSAAYV VMPMRL >gi|296494675|gb|ADTN01000063.1| GENE 37 36610 - 37944 1271 444 aa, chain - ## HITS:1 COG:dnaA KEGG:ns NR:ns ## COG: dnaA COG0593 # Protein_GI_number: 16131570 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # Organism: Escherichia coli K12 # 1 444 24 467 467 884 100.0 0 MWIRPLQAELSDNTLALYAPNRFVLDWVRDKYLNNINGLLTSFCGADAPQLRFEVGTKPV 
TQTPQAAVTSNVAAPAQVAQTQPQRAAPSTRSGWDNVPAPAEPTYRSNVNVKHTFDNFVE GKSNQLARAAARQVADNPGGAYNPLFLYGGTGLGKTHLLHAVGNGIMARKPNAKVVYMHS ERFVQDMVKALQNNAIEEFKRYYRSVDALLIDDIQFFANKERSQEEFFHTFNALLEGNQQ IILTSDRYPKEINGVEDRLKSRFGWGLTVAIEPPELETRVAILMKKADENDIRLPGEVAF FIAKRLRSNVRELEGALNRVIANANFTGRAITIDFVREALRDLLALQEKLVTIDNIQKTV AEYYKIKVADLLSKRRSRSVARPRQMAMALAKELTNHSLPEIGDAFGGRDHTTVLHACRK IEQLREESHDIKEDFSNLIRTLSS >gi|296494675|gb|ADTN01000063.1| GENE 38 38620 - 38760 228 46 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15804297|ref|NP_290336.1| 50S ribosomal protein L34 [Escherichia coli O157:H7 EDL933] # 1 46 1 46 46 92 100 4e-18 MKRTFQPSVLKRNRSHGFRARMATKNGRQVLARRRAKGRARLTVSK >gi|296494675|gb|ADTN01000063.1| GENE 39 38810 - 39136 206 108 aa, chain + ## HITS:1 COG:rnpA KEGG:ns NR:ns ## COG: rnpA COG0594 # Protein_GI_number: 16131572 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNase P protein component # Organism: Escherichia coli K12 # 1 108 12 119 119 190 99.0 5e-49 MLTPSQFTFVFQQPQRAGTPQITILGRLNSLGHPRIGLTVAKKNVRRAHERNRIKRLTRE SFRLRQHELPAMDFVVVAKKGVADLDNRALSEALEKLWRRHCRLARGS >gi|296494675|gb|ADTN01000063.1| GENE 40 39360 - 41006 2012 548 aa, chain + ## HITS:1 COG:yidC KEGG:ns NR:ns ## COG: yidC COG0706 # Protein_GI_number: 16131573 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YidC # Organism: Escherichia coli K12 # 1 548 1 548 548 1064 100.0 0 MDSQRNLLVIALLFVSFMIWQAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASGQGKL ISVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQAQSGLTGRDGP DNPANGPRPLYNVEKDAYVLAEGQNELQVPMTYTDAAGNTFTKTFVLKRGDYAVNVNYNV QNAGEKPLEISSFGQLKQSITLPPHLDTGSSNFALHTFRGAAYSTPDEKYEKYKFDTIAD NENLNISSKGGWVAMLQQYFATAWIPHNDGTNNFYTANLGNGIAAIGYKSQPVLVQPGQT GAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGWLWFISQPLFKLLKWIHSFVGNWGFSIII ITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMRERLGDDKQRISQEMMALYKAEKVNPL GGCFPLLIQMPIFLALYYMLMGSVELRQAPFALWIHDLSAQDPYYILPILMGVTMFFIQK MSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSGLVLYYIVSNLVTIIQQQLIYRGLEKRGL 
HSREKKKS >gi|296494675|gb|ADTN01000063.1| GENE 41 41112 - 42476 1671 454 aa, chain + ## HITS:1 COG:thdF KEGG:ns NR:ns ## COG: thdF COG0486 # Protein_GI_number: 16131574 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Escherichia coli K12 # 1 454 1 454 454 862 100.0 0 MSDNDTIVAQATPPGRGGVGILRISGFKAREVAETVLGKLPKPRYADYLPFKDADGSVLD QGIALWFPGPNSFTGEDVLELQGHGGPVILDLLLKRILTIPGLRIARPGEFSERAFLNDK LDLAQAEAIADLIDASSEQAARSALNSLQGAFSARVNHLVEALTHLRIYVEAAIDFPDEE IDFLSDGKIEAQLNDVIADLDAVRAEARQGSLLREGMKVVIAGRPNAGKSSLLNALAGRE AAIVTDIAGTTRDVLREHIHIDGMPLHIIDTAGLREASDEVERIGIERAWQEIEQADRVL FMVDGTTTDAVDPAEIWPEFIARLPAKLPITVVRNKADITGETLGMSEVNGHALIRLSAR TGEGVDVLRNHLKQSMGFDTNMEGGFLARRRHLQALEQAAEHLQQGKAQLLGAWAGELLA EELRLAQQNLSEITGEFTSDDLLGRIFSSFCIGK >gi|296494675|gb|ADTN01000063.1| GENE 42 43014 - 44429 1882 471 aa, chain + ## HITS:1 COG:tnaA KEGG:ns NR:ns ## COG: tnaA COG3033 # Protein_GI_number: 16131576 # Func_class: E Amino acid transport and metabolism # Function: Tryptophanase # Organism: Escherichia coli K12 # 1 471 6 476 476 971 99.0 0 MENFKHLPEPFRIRVIEPVKRTTRAYREEAIIKSGMNPFLLDSEDVFIDLLTDSGTGAVT QSMQAAMMRGDEAYSGSRSYYALAESVKNIFGYQYTIPTHQGRGAEQIYIPVLIKKREQE KGLDRSKMVAFSNYFFDTTQGHSQINGCTVRNVYIKEAFDTGVRYDFKGNFDLEGLERGI EEVGPNNVPYIVATITSNSAGGQPVSLANLKAMYSIAKKYDIPVVMDSARFAENAYFIKQ REAEYKDWTIEQITRETYKYADMLAMSAKKDAMVPMGGLLCMKDDSFFDVYTECRTLCVV QEGFPTYGGLEGGAMERLAVGLYDGMNLDWLAYRIAQVQYLVDGLEEIGVVCQQAGGHAA FVDAGKLLPHIPADQFPAQALACELYKVAGIRAVEIGSFLLGRDPKTGKQLPCPAELLRL TIPRATYTQTHMDFIIEAFKNVKENAANIKGLTFTYEPKVLRHFTAKLKEV >gi|296494675|gb|ADTN01000063.1| GENE 43 44520 - 45767 1153 415 aa, chain + ## HITS:1 COG:tnaB KEGG:ns NR:ns ## COG: tnaB COG0814 # Protein_GI_number: 16131577 # Func_class: E Amino acid transport and metabolism # Function: Amino acid permeases # Organism: Escherichia coli K12 # 1 415 1 415 415 733 100.0 0 MTDQAEKKHSAFWGVMVIAGTVIGGGMFALPVDLAGAWFFWGAFILIIAWFSMLHSGLLL LEANLNYPVGSSFNTITKDLIGNTWNIISGITVAFVLYILTYAYISANGAIISETISMNL 
GYHANPRIVGICTAIFVASVLWLSSLAASRITSLFLGLKIISFVIVFGSFFFQVDYSILR DATSSTAGTSYFPYIFMALPVCLASFGFHGNIPSLIICYGKRKDKLIKSVVFGSLLALVI YLFWLYCTMGNIPRESFKAIISSGGNVDSLVKSFLGTKQHGIIEFCLLVFSNLAVASSFF GVTLGLFDYLADLFKIDNSHGGRFKTVLLTFLPPALLYLIFPNGFIYGIGGAGLCATIWA VIIPAVLAIKARKKFPNQMFTVWGGNLIPAIVILFGITVILCWFGNVFNVLPKFG >gi|296494675|gb|ADTN01000063.1| GENE 44 45899 - 47074 814 391 aa, chain + ## HITS:1 COG:yidY KEGG:ns NR:ns ## COG: yidY COG0477 # Protein_GI_number: 16131578 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 391 1 391 391 661 100.0 0 MSRFLICSFALVLLYPAGIDMYLVGLPRIAADLNASEAQLHIAFSVYLAGMAAAMLFAGK VADRSGRKPVAIPGAALFIIASVFCSLAETSTLFLAGRFLQGLGAGCCYVVAFAILRDTL DDRRRAKVLSLLNGITCIIPVLAPVLGHLIMLKFPWQSLFWAMAMMGIAVLMLSLFILKE TRPAAPAASDKPRENSESLLNRFFLSRVVITTLSVSVILTFVNTSPVLLMEIMGFERGEY ATIMALTAGVSMTVSFSTPFALGIFKPRTLMITSQVLFLAAGITLAVSPSHAVSLFGITL ICAGFSVGFGVAMSQALGPFSLRAGVASSTLGIAQVCGSSLWIWLAAVVGIGAWNMLIGI LIACSIVSLLLIMFVAPGRPVAAHEEIHHHA >gi|296494675|gb|ADTN01000063.1| GENE 45 47049 - 48008 807 319 aa, chain + ## HITS:1 COG:yidZ KEGG:ns NR:ns ## COG: yidZ COG0583 # Protein_GI_number: 16131579 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 319 1 319 319 588 100.0 1e-168 MKKSITTLDLNLLLCLQLLMQERSVTKAAKRINVTPSAVSKSLAKLRAWFDDPLFVNSPL GLSPTPLMVSMEQNLAEWMQMSNLLLDKPHHQTPRGLKFELAAESPLMMIMLNALSKQIY QRYPQATIKLRNWDYDSLDAITRGEVDIGFSGRESHPRSRELLSSLPLAIDYEVLFSDVP CVWLRKDHPALHQTWNLDTFLRYPHISICWEQSDTWALDNVLQELGRERTIAMSLPEFEQ SLFMAAQPDNLLLATAPRYCQYYNQLHQLPLVALPLPFDESQQKKLEVPFTLLWHKRNSH NPKIVWLRETIKNLYASMA >gi|296494675|gb|ADTN01000063.1| GENE 46 48153 - 48914 520 253 aa, chain + ## HITS:1 COG:yieE KEGG:ns NR:ns ## COG: yieE COG2091 # Protein_GI_number: 16131580 # Func_class: H Coenzyme transport and metabolism # Function: 
Phosphopantetheinyl transferase # Organism: Escherichia coli K12 # 1 253 1 253 253 518 100.0 1e-147 MERKMATHFARGILTEGHLISVRLPSQCHQEARNIPPHRQSRFLASRGLLAELMFMLYGI GELPEIVTLPKGKPVFSDKNLPSFSISYAGNMVGVALTTEGECGLDMELQRATRGFHSPH APDNHTFSSNESLWISKQNDPNEARAQLITLRRSVLKLTGDVLNDDPRDLQLLPIAGRLK CAHVNHVEALCDAEDVLVWSVAVTPTIEKLSVWELDGKHGWKSLPDIHSRANNPTSRMMR FAQLSTVKAFSPN >gi|296494675|gb|ADTN01000063.1| GENE 47 48936 - 49502 764 188 aa, chain + ## HITS:1 COG:ECs4650 KEGG:ns NR:ns ## COG: ECs4650 COG0431 # Protein_GI_number: 15833904 # Func_class: R General function prediction only # Function: Predicted flavoprotein # Organism: Escherichia coli O157:H7 # 1 188 1 188 188 364 100.0 1e-101 MSEKLQVVTLLGSLRKGSFNGMVARTLPKIAPASMEVNALPSIADIPLYDADVQQEEGFP ATVEALAEQIRQADGVVIVTPEYNYSVPGGLKNAIDWLSRLPDQPLAGKPVLIQTSSMGV IGGARCQYHLRQILVFLDAMVMNKPEFMGGVIQNKVDPQTGEVIDQGTLDHLTGQLTAFG EFIQRVKI Prediction of potential genes in microbial genomes Time: Sun May 15 23:19:54 2011 Seq name: gi|296494674|gb|ADTN01000064.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont166.6, whole genome shotgun sequence Length of sequence - 13297 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 5, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 6 - 44 7.3 1 1 Tu 1 . - CDS 65 - 1402 1665 ## COG2252 Permeases - Prom 1490 - 1549 2.5 + Prom 1446 - 1505 4.3 2 2 Op 1 . + CDS 1567 - 2232 602 ## COG0637 Predicted phosphatase/phosphohexomutase 3 2 Op 2 . + CDS 2299 - 2766 239 ## JW3694 predicted inner membrane protein 4 2 Op 3 . + CDS 2815 - 3402 203 ## COG3196 Uncharacterized protein conserved in bacteria - Term 3409 - 3450 8.0 5 3 Op 1 . - CDS 3464 - 3712 280 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase 6 3 Op 2 . - CDS 3731 - 4186 574 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase 7 3 Op 3 . 
- CDS 4201 - 5370 1145 ## COG2382 Enterochelin esterase and related enzymes 8 3 Op 4 1/0.500 - CDS 5397 - 7013 1474 ## COG4580 Maltoporin (phage lambda and maltose receptor) - Term 7034 - 7077 9.2 9 4 Op 1 8/0.000 - CDS 7082 - 8494 1202 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 10 4 Op 2 7/0.000 - CDS 8513 - 10390 1520 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific - Prom 10430 - 10489 4.9 - Term 10421 - 10476 12.3 11 4 Op 3 1/0.500 - CDS 10524 - 11360 612 ## COG3711 Transcriptional antiterminator - Prom 11570 - 11629 6.8 - Term 11594 - 11636 9.0 12 5 Op 1 32/0.000 - CDS 11646 - 12371 883 ## COG0704 Phosphate uptake regulator 13 5 Op 2 . - CDS 12386 - 13159 345 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 13195 - 13254 1.7 Predicted protein(s) >gi|296494674|gb|ADTN01000064.1| GENE 1 65 - 1402 1665 445 aa, chain - ## HITS:1 COG:STM3851 KEGG:ns NR:ns ## COG: STM3851 COG2252 # Protein_GI_number: 16767135 # Func_class: R General function prediction only # Function: Permeases # Organism: Salmonella typhimurium LT2 # 1 445 43 487 487 733 96.0 0 MSHQHTTQTSGQGMLERVFKLREHGTTARTEVIAGFTTFLTMVYIVFVNPQILGVAGMDT SAVFVTTCLIAAFGSIMMGLFANLPVALAPAMGLNAFFAFVVVQAMGLPWQVGMGAIFWG AIGLLLLTIFRVRYWMIANIPVSLRVGITSGIGLFIGMMGLKNAGVIVANPETLVSIGNL TSHSVLLGILGFFIIAILASRNIHAAVLVSIVVTTLLGWMLGDVHYNGIVSAPPSVMTVV GHVDLAGSFNLGLAGVIFSFMLVNLFDSSGTLIGVTDKAGLADEKGKFPRMKQALYVDSI SSVTGSFIGTSSVTAYIESSSGVSVGGRTGLTAVVVGLLFLLVIFLSPLAGMVPGYAAAG ALIYVGVLMTSSLARVNWQDLTESVPAFITAVMMPFSFSITEGIALGFISYCVMKIGTGR LRDLSPCVIIVALLFILKIVFIDAH >gi|296494674|gb|ADTN01000064.1| GENE 2 1567 - 2232 602 221 aa, chain + ## HITS:1 COG:yieH KEGG:ns NR:ns ## COG: yieH COG0637 # Protein_GI_number: 16131583 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Escherichia coli K12 # 1 221 1 221 221 468 100.0 1e-132 
MSRIEAVFFDCDGTLVDSEVICSRAYVTMFQEFGITLDPEEVFKRFKGVKLYEIIDIVSL EHGVTLAKTEAEHVYRAEVARLFDSELEAIEGAGALLSAITAPMCVVSNGPNNKMQHSMG KLNMLHYFPDKLFSGYDIQRWKPDPALMFHAAKAMNVNVENCILVDDSVAGAQSGIDAGM EVFYFCADPHNKPIVHPKVTTFTHLSQLPELWKARGWDITA >gi|296494674|gb|ADTN01000064.1| GENE 3 2299 - 2766 239 155 aa, chain + ## HITS:1 COG:no KEGG:JW3694 NR:ns ## KEGG: JW3694 # Name: yieI # Def: predicted inner membrane protein # Organism: E.coli_J # Pathway: not_defined # 1 155 1 155 155 261 100.0 5e-69 MSVSRRVIHHGLYFAVLGPLIGVLFLVLYIFFAKEPLVLWVIIHPIFLLLSITTGAIPAL LTGVMVACLPEKIGSQKRYRCLAGGIGGVVITEIYCAVIVHIKGMASSELFENILSGDSL VVRIIPALLAGVVMSRIITRLPGLDISCPETDSLS >gi|296494674|gb|ADTN01000064.1| GENE 4 2815 - 3402 203 195 aa, chain + ## HITS:1 COG:yieJ KEGG:ns NR:ns ## COG: yieJ COG3196 # Protein_GI_number: 16131585 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 195 1 195 195 376 100.0 1e-104 MTQNIRPLPQFKYHPKPLETGAFEQDKTVECDCCEQQTSVYYSGPFYCVDEVEHLCPWCI ADGSAAEKFAGSFQDDASIEGVEFEYDEEDEFAGIKNTYPDEMLKELVERTPGYHGWQQE FWLAHCGDFCVFIGYVGWNDIKDRLDEFANLEEDCENFGIRNSDLAKCLQKGGHCQGYLF RCLHCGKLRLWGDFS >gi|296494674|gb|ADTN01000064.1| GENE 5 3464 - 3712 280 82 aa, chain - ## HITS:1 COG:yieK KEGG:ns NR:ns ## COG: yieK COG0363 # Protein_GI_number: 16131586 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Escherichia coli K12 # 1 74 136 209 213 145 98.0 3e-35 MAHGELGGDFSLVPDSYVTMGPKSIMAAKNLLIIVSGAGKAQALKNVLQGPVTEDVPASV LQLHPSLMVIADKAAAAELALG >gi|296494674|gb|ADTN01000064.1| GENE 6 3731 - 4186 574 151 aa, chain - ## HITS:1 COG:yieK KEGG:ns NR:ns ## COG: yieK COG0363 # Protein_GI_number: 16131586 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Escherichia coli K12 # 24 151 1 128 213 278 100.0 2e-75 
MKLIITEDYQEMSRVAAHHLLGYMSKTRRVNLAITAGSTPKGMYEYLTTLVKGKPWYDNC YFYNFDEIPFRGKEGEGVTITNLRNLFFTPAGIKEENIQKLTIDNYREHDQKLAREGGLD LVVLGLGADGHFCGNLPNTTHFHEQTVEFPI >gi|296494674|gb|ADTN01000064.1| GENE 7 4201 - 5370 1145 389 aa, chain - ## HITS:1 COG:yieL KEGG:ns NR:ns ## COG: yieL COG2382 # Protein_GI_number: 16131587 # Func_class: P Inorganic ion transport and metabolism # Function: Enterochelin esterase and related enzymes # Organism: Escherichia coli K12 # 1 389 12 400 400 763 99.0 0 MNIKIAALTLAIASGISAQWAIAADMPASPAPTIPVKQYVTQVNADNSVTFRYFAPGAKN VSVVVGVPVPDNIHPMTKDEAGVWSWRTPILKGNLYEYFFNVDGVRSIDTGTAMTNPQRQ VNSSMILVPGSYLDTRSVAHGDLIAITYHSNALQSERQMYVWTPPGYTGMGEPLPVLYFY HGFGDTGRSAIDQGRIPQIMDNLLAEGKIKPMLVVIPDTETDAKGIIPEDFVPQERRKVF YPLNAKAADRELMNDIIPLISKRFNVRKDADGRALAGLSQGGYQALVSGMNHLESFGWLA TFSGVTTTTVPDEGVAARLNDPAAINQQLRNFTVVVGDKDVVTGKDIAGLKTELEQKKIN FDYQEYPGLNHEMDVWRSAYAAFVQKLFK >gi|296494674|gb|ADTN01000064.1| GENE 8 5397 - 7013 1474 538 aa, chain - ## HITS:1 COG:yieC KEGG:ns NR:ns ## COG: yieC COG4580 # Protein_GI_number: 16131588 # Func_class: G Carbohydrate transport and metabolism # Function: Maltoporin (phage lambda and maltose receptor) # Organism: Escherichia coli K12 # 1 538 1 538 538 1016 100.0 0 MFRRNLITSAILLMAPLAFSAQSLAESLTVEQRLELLEKALRETQSELKKYKDEEKKKYT PATVNRSVSTNDQGYAANPFPTSSAAKPDAVLVKNEEKNASETGSIYSSMTLKDFSKFVK DEIGFSYNGYYRSGWGTASHGSPKSWAIGSLGRFGNEYSGWFDLQLKQRVYNENGKRVDA VVMMDGNVGQQYSTGWFGDNAGGENYMQFSDMYVTTKGFLPFAPEADFWVGKHGAPKIEI QMLDWKTQRTDAAAGVGLENWKVGPGKIDIALVREDVDDYDRSLQNKQQINTNTIDLRYK DIPLWDKATLMVSGRYVTANESASEKDNQDNNGYYDWKDTWMFGTSLTQKFDKGGFNEFS FLVANNSIASNFGRYAGASPFTTFNGRYYGDHTGGTAVRLTSQGEAYIGDHFIVANAIVY SFGNDIYSYETGAHSDFESIRAVVRPAYIWDQYNQTGVELGYFTQQNKDANSNKFNESGY KTTLFHTFKVNTSMLTSRPEIRFYATYIKALENELDGFTFEDNKDDQFAVGAQAEIWW >gi|296494674|gb|ADTN01000064.1| GENE 9 7082 - 8494 1202 470 aa, chain - ## HITS:1 COG:bglB KEGG:ns NR:ns ## COG: bglB COG2723 # Protein_GI_number: 16131589 # Func_class: G Carbohydrate transport and metabolism # Function: 
Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Escherichia coli K12 # 1 470 1 470 470 996 100.0 0 MKAFPETFLWGGATAANQVEGAWQEDGKGISTSDLQPHGVMGKMEPRILGKENIKDVAID FYHRYPEDIALFAEMGFTCLRISIAWARIFPQGDEVEPNEAGLAFYDRLFDEMAQAGIKP LVTLSHYEMPYGLVKNYGGWANRAVIDHFEHYARTVFTRYQHKVALWLTFNEINMSLHAP FTGVGLAEESGEAEVYQAIHHQLVASARAVKACHSLLPEAKIGNMLLGGLVYPLTCQPQD MLQAMEENRRWMFFGDVQARGQYPGYMQRFFRDHNITIEMTESDAEDLKHTVDFISFSYY MTGCVSHDESINKNAQGNILNMIPNPHLKSSEWGWQIDPVGLRVLLNTLWDRYQKPLFIV ENGLGAKDSVEADGSIQDDYRIAYLNDHLVQVNEAIADGVDIMGYTSWGPIDLVSASHSQ MSKRYGFIYVDRDDNGEGSLTRTRKKSFGWYAEVIKTRGLSLKKITIKAP >gi|296494674|gb|ADTN01000064.1| GENE 10 8513 - 10390 1520 625 aa, chain - ## HITS:1 COG:bglF_2 KEGG:ns NR:ns ## COG: bglF_2 COG1263 # Protein_GI_number: 16131590 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Escherichia coli K12 # 91 451 1 361 361 686 100.0 0 MTELARKIVAGVGGADNIVSLMHCATRLRFKLKDESKAQAEVLKKTPGIIMVVESGGQFQ VVIGNHVADVFLAVNSVAGLDEKAQQAPENDDKGNLLNRFVYVISGIFTPLIGLMAATGI LKGMLALALTFQWTTEQSGTYLILFSASDALFWFFPIILGYTAGKRFGGNPFTAMVIGGA LVHPLILTAFENGQKADALGLDFLGIPVTLLNYSSSVIPIIFSAWLCSILERRLNAWLPS AIKNFFTPLLCLMVITPVTFLLVGPLSTWISELIAAGYLWLYQAVPAFAGAVMGGFWQIF VMFGLHWGLVPLCINNFTVLGYDTMIPLLMPAIMAQVGAALGVFLCERDAQKKVVAGSAA LTSLFGITEPAVYGVNLPRKYPFVIACISGALGATIIGYAQTKVYSFGLPSIFTFMQTIP STGIDFTVWASVIGGVIAIGCAFVGTVMLHFITAKRQPAQGAPQEKTPEVITPPEQGGIC SPMTGEIVPLIHVADTTFASGLLGKGIAILPSVGEVRSPVAGRIASLFATLHAIGIESDD GVEILIHVGIDTVKLDGKFFSAHVNVGDKVNTGDRLISFDIPAIREAGFDLTTPVLISNS DDFTDVLPHGTAQISAGEPLLSIIR >gi|296494674|gb|ADTN01000064.1| GENE 11 10524 - 11360 612 278 aa, chain - ## HITS:1 COG:bglG KEGG:ns NR:ns ## COG: bglG COG3711 # Protein_GI_number: 16131591 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Escherichia coli K12 # 1 278 1 278 278 524 100.0 1e-149 MNMQITKILNNNVVVVIDDQQREKVVMGRGIGFQKRAGERINSSGIEKEYALSSHELNGR 
LSELLSHIPLEVMATCDRIISLAQERLGKLQDSIYISLTDHCQFAIKRFQQNVLLPNPLL WDIQRLYPKEFQLGEEALTIIDKRLGVQLPKDEVGFIAMHLVSAQMSGNMEDVAGVTQLM REMLQLIKFQFSLNYQEESLSYQRLVTHLKFLSWRILEHASINDSDESLQQAVKQNYPQA WQCAERIAIFIGLQYQRKISPAEIMFLAINIERVRKEH >gi|296494674|gb|ADTN01000064.1| GENE 12 11646 - 12371 883 241 aa, chain - ## HITS:1 COG:phoU KEGG:ns NR:ns ## COG: phoU COG0704 # Protein_GI_number: 16131592 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate uptake regulator # Organism: Escherichia coli K12 # 1 241 1 241 241 449 100.0 1e-126 MDSLNLNKHISGQFNAELESIRTQVMTMGGMVEQQLSDAITAMHNQDSDLAKRVIEGDKN VNMMEVAIDEACVRIIAKRQPTASDLRLVMVISKTIAELERIGDVADKICRTALEKFSQQ HQPLLVSLESLGRHTIQMLHDVLDAFARMDIDEAVRIYREDKKVDQEYEGIVRQLMTYMM EDSRTIPSVLTALFCARSIERIGDRCQNICEFIFYYVKGQDFRHVGGDELDKLLAGKDSD K >gi|296494674|gb|ADTN01000064.1| GENE 13 12386 - 13159 345 257 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 11 252 2 239 245 137 36 4e-32 MSMVETAPSKIQVRNLNFYYGKFHALKNINLDIAKNQVTAFIGPSGCGKSTLLRTFNKMF ELYPEQRAEGEILLDGDNILTNSQDIALLRAKVGMVFQKPTPFPMSIYDNIAFGVRLFEK LSRADMDERVQWALTKAALWNETKDKLHQSGYSLSGGQQQRLCIARGIAIRPEVLLLDEP CSALDPISTGRIEELITELKQDYTVVIVTHNMQQAARCSDHTAFMYLGELIEFSNTDDLF TKPAKKQTEDYITGRYG Prediction of potential genes in microbial genomes Time: Sun May 15 23:20:05 2011 Seq name: gi|296494673|gb|ADTN01000065.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont166.7, whole genome shotgun sequence Length of sequence - 22704 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 7, operones - 6 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 38/0.000 - CDS 45 - 935 1189 ## COG0581 ABC-type phosphate transport system, permease component 2 1 Op 2 39/0.000 - CDS 935 - 1894 1293 ## COG0573 ABC-type phosphate transport system, permease component 3 1 Op 3 2/1.000 - CDS 1981 - 3021 1254 ## COG0226 ABC-type phosphate transport system, periplasmic component 
- Prom 3100 - 3159 8.3 - Term 3292 - 3329 5.3 4 2 Op 1 9/0.000 - CDS 3335 - 5164 2413 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains 5 2 Op 2 6/0.500 - CDS 5326 - 6696 1752 ## COG1207 N-acetylglucosamine-1-phosphate uridyltransferase (contains nucleotidyltransferase and I-patch acetyltransferase domains) - Prom 6757 - 6816 3.2 - Term 6993 - 7026 4.5 6 3 Op 1 42/0.000 - CDS 7049 - 7462 498 ## COG0355 F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) 7 3 Op 2 42/0.000 - CDS 7489 - 8871 1784 ## COG0055 F0F1-type ATP synthase, beta subunit 8 3 Op 3 42/0.000 - CDS 8898 - 9761 1029 ## COG0224 F0F1-type ATP synthase, gamma subunit 9 3 Op 4 41/0.000 - CDS 9812 - 11353 1770 ## COG0056 F0F1-type ATP synthase, alpha subunit 10 3 Op 5 38/0.000 - CDS 11366 - 11899 594 ## COG0712 F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) 11 3 Op 6 37/0.000 - CDS 11914 - 12384 581 ## COG0711 F0F1-type ATP synthase, subunit b 12 3 Op 7 40/0.000 - CDS 12446 - 12685 380 ## COG0636 F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K 13 3 Op 8 8/0.000 - CDS 12732 - 13547 806 ## COG0356 F0F1-type ATP synthase, subunit a 14 3 Op 9 3/0.500 - CDS 13556 - 13936 160 ## COG3312 F0F1-type ATP synthase, subunit I - Prom 14021 - 14080 5.8 - Term 14414 - 14465 4.2 15 4 Op 1 24/0.000 - CDS 14553 - 15176 572 ## COG0357 Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division 16 4 Op 2 5/0.500 - CDS 15240 - 17129 1911 ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division - Prom 17222 - 17281 4.7 17 5 Op 1 4/0.500 - CDS 17508 - 17951 622 ## COG0716 Flavodoxins - Prom 17973 - 18032 2.0 18 5 Op 2 . - CDS 18041 - 18499 500 ## COG1522 Transcriptional regulators - Prom 18525 - 18584 6.7 + Prom 18522 - 18581 4.0 19 6 Tu 1 . 
+ CDS 18651 - 19643 1277 ## COG2502 Asparagine synthetase A 20 7 Op 1 7/0.500 - CDS 19648 - 21099 1144 ## COG2425 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain 21 7 Op 2 . - CDS 21093 - 22589 1365 ## COG0714 MoxR-like ATPases - Prom 22626 - 22685 2.6 Predicted protein(s) >gi|296494673|gb|ADTN01000065.1| GENE 1 45 - 935 1189 296 aa, chain - ## HITS:1 COG:pstA KEGG:ns NR:ns ## COG: pstA COG0581 # Protein_GI_number: 16131594 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Escherichia coli K12 # 1 296 1 296 296 507 100.0 1e-143 MAMVEMQTTAALAESRRKMQARRRLKNRIALTLSMATMAFGLFWLIWILMSTITRGIDGM SLALFTEMTPPPNTEGGGLANALAGSGLLILWATVFGTPLGIMAGIYLAEYGRKSWLAEV IRFINDILLSAPSIVVGLFVYTIVVAQMEHFSGWAGVIALALLQVPIVIRTTENMLKLVP YSLREAAYALGTPKWKMISAITLKASVSGIMTGILLAIARIAGETAPLLFTALSNQFWST DMMQPIANLPVTIFKFAMSPFAEWQQLAWAGVLIITLCVLLLNILARVVFAKNKHG >gi|296494673|gb|ADTN01000065.1| GENE 2 935 - 1894 1293 319 aa, chain - ## HITS:1 COG:ECs4663 KEGG:ns NR:ns ## COG: ECs4663 COG0573 # Protein_GI_number: 15833917 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Escherichia coli O157:H7 # 1 319 1 319 319 533 99.0 1e-151 MAATKPAFNPPGKKGDIIFSVLVKLAALIVLLMLGGIIVSLIISSWPSIQKFGLAFLWTK EWDAPNDIYGALVPIYGTLVTSFIALLIAVPVSFGIALFLTELAPGWLKRPLGIAIELLA AIPSIVYGMWGLFIFAPLFAVYFQEPVGNIMSNIPIVGALFSGSAFGIGILAAGVILAIM IIPYIAAVMRDVFEQTPVMMKESAYGIGCTTWEVIWRIVLPFTKNGVIGGIMLGLGRALG ETMAVTFIIGNTYQLDSASLYMPGNSITSALANEFAEAESGLHVAALMELGLILFVITFI VLAASKFMIMRLAKNEGAR >gi|296494673|gb|ADTN01000065.1| GENE 3 1981 - 3021 1254 346 aa, chain - ## HITS:1 COG:pstS KEGG:ns NR:ns ## COG: pstS COG0226 # Protein_GI_number: 16131596 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Escherichia coli K12 # 1 346 1 346 346 632 100.0 0 
MKVMRTTVATVVAATLSMSAFSVFAEASLTGAGATFPAPVYAKWADTYQKETGNKVNYQG IGSSGGVKQIIANTVDFGASDAPLSDEKLAQEGLFQFPTVIGGVVLAVNIPGLKSGELVL DGKTLGDIYLGKIKKWDDEAIAKLNPGLKLPSQNIAVVRRADGSGTSFVFTSYLAKVNEE WKNNVGTGSTVKWPIGLGGKGNDGIAAFVQRLPGAIGYVEYAYAKQNNLAYTKLISADGK PVSPTEENFANAAKGADWSKTFAQDLTNQKGEDAWPITSTTFILIHKDQKKPEQGTEVLK FFDWAYKTGAKQANDLDYASLPDSVVEQVRAAWKTNIKDSSGKPLY >gi|296494673|gb|ADTN01000065.1| GENE 4 3335 - 5164 2413 609 aa, chain - ## HITS:1 COG:glmS KEGG:ns NR:ns ## COG: glmS COG0449 # Protein_GI_number: 16131597 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Escherichia coli K12 # 1 609 1 609 609 1196 100.0 0 MCGIVGAIAQRDVAEILLEGLRRLEYRGYDSAGLAVVDAEGHMTRLRRLGKVQMLAQAAE EHPLHGGTGIAHTRWATHGEPSEVNAHPHVSEHIVVVHNGIIENHEPLREELKARGYTFV SETDTEVIAHLVNWELKQGGTLREAVLRAIPQLRGAYGTVIMDSRHPDTLLAARSGSPLV IGLGMGENFIASDQLALLPVTRRFIFLEEGDIAEITRRSVNIFDKTGAEVKRQDIESNLQ YDAGDKGIYRHYMQKEIYEQPNAIKNTLTGRISHGQVDLSELGPNADELLSKVEHIQILA CGTSYNSGMVSRYWFESLAGIPCDVEIASEFRYRKSAVRRNSLMITLSQSGETADTLAGL RLSKELGYLGSLAICNVPGSSLVRESDLALMTNAGTEIGVASTKAFTTQLTVLLMLVAKL SRLKGLDASIEHDIVHGLQALPSRIEQMLSQDKRIEALAEDFSDKHHALFLGRGDQYPIA LEGALKLKEISYIHAEAYAAGELKHGPLALIDADMPVIVVAPNNELLEKLKSNIEEVRAR GGQLYVFADQDAGFVSSDNMHIIEMPHVEEVIAPIFYTVPLQLLAYHVALIKGTDVDQPR NLAKSVTVE >gi|296494673|gb|ADTN01000065.1| GENE 5 5326 - 6696 1752 456 aa, chain - ## HITS:1 COG:ECs4672 KEGG:ns NR:ns ## COG: ECs4672 COG1207 # Protein_GI_number: 15833926 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylglucosamine-1-phosphate uridyltransferase (contains nucleotidyltransferase and I-patch acetyltransferase domains) # Organism: Escherichia coli O157:H7 # 1 456 1 456 456 890 100.0 0 MLNNAMSVVILAAGKGTRMYSDLPKVLHTLAGKAMVQHVIDAANELGAAHVHLVYGHGGD LLKQALKDDNLNWVLQAEQLGTGHAMQQAAPFFADDEDILMLYGDVPLISVETLQRLRDA KPQGGIGLLTVKLDDPTGYGRITRENGKVTGIVEHKDATDEQRQIQEINTGILIANGADM 
KRWLAKLTNNNAQGEYYITDIIALAYQEGREIVAVHPQRLSEVEGVNNRLQLSRLERVYQ SEQAEKLLLAGVMLRDPARFDLRGTLTHGRDVEIDTNVIIEGNVTLGHRVKIGTGCVIKN SVIGDDCEISPYTVVEDANLAAACTIGPFARLRPGAELLEGAHVGNFVEMKKARLGKGSK AGHLTYLGDAEIGDNVNIGAGTITCNYDGANKFKTIIGDDVFVGSDTQLVAPVTVGKGAT IAAGTTVTRNVGENALAISRVPQTQKEGWRRPVKKK >gi|296494673|gb|ADTN01000065.1| GENE 6 7049 - 7462 498 137 aa, chain - ## HITS:1 COG:atpC KEGG:ns NR:ns ## COG: atpC COG0355 # Protein_GI_number: 16131599 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) # Organism: Escherichia coli K12 # 1 137 3 139 139 243 100.0 8e-65 MTYHLDVVSAEQQMFSGLVEKIQVTGSEGELGIYPGHAPLLTAIKPGMIRIVKQHGHEEF IYLSGGILEVQPGNVTVLADTAIRGQDLDEARAMEAKRKAEEHISSSHGDVDYAQASAEL AKAIAQLRVIELTKKAM >gi|296494673|gb|ADTN01000065.1| GENE 7 7489 - 8871 1784 460 aa, chain - ## HITS:1 COG:ECs4674 KEGG:ns NR:ns ## COG: ECs4674 COG0055 # Protein_GI_number: 15833928 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Escherichia coli O157:H7 # 1 460 1 460 460 890 100.0 0 MATGKIVQVIGAVVDVEFPQDAVPRVYDALEVQNGNERLVLEVQQQLGGGIVRTIAMGSS DGLRRGLDVKDLEHPIEVPVGKATLGRIMNVLGEPVDMKGEIGEEERWAIHRAAPSYEEL SNSQELLETGIKVIDLMCPFAKGGKVGLFGGAGVGKTVNMMELIRNIAIEHSGYSVFAGV GERTREGNDFYHEMTDSNVIDKVSLVYGQMNEPPGNRLRVALTGLTMAEKFRDEGRDVLL FVDNIYRYTLAGTEVSALLGRMPSAVGYQPTLAEEMGVLQERITSTKTGSITSVQAVYVP ADDLTDPSPATTFAHLDATVVLSRQIASLGIYPAVDPLDSTSRQLDPLVVGQEHYDTARG VQSILQRYQELKDIIAILGMDELSEEDKLVVARARKIQRFLSQPFFVAEVFTGSPGKYVS LKDTIRGFKGIMEGEYDHLPEQAFYMVGSIEEAVEKAKKL >gi|296494673|gb|ADTN01000065.1| GENE 8 8898 - 9761 1029 287 aa, chain - ## HITS:1 COG:ECs4675 KEGG:ns NR:ns ## COG: ECs4675 COG0224 # Protein_GI_number: 15833929 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Escherichia coli O157:H7 # 1 287 1 287 287 555 100.0 1e-158 MAGAKEIRSKIASVQNTQKITKAMEMVAASKMRKSQDRMAASRPYAETMRKVIGHLAHGN 
LEYKHPYLEDRDVKRVGYLVVSTDRGLCGGLNINLFKKLLAEMKTWTDKGVQCDLAMIGS KGVSFFNSVGGNVVAQVTGMGDNPSLSELIGPVKVMLQAYDEGRLDKLYIVSNKFINTMS QVPTISQLLPLPASDDDDLKHKSWDYLYEPDPKALLDTLLRRYVESQVYQGVVENLASEQ AARMVAMKAATDNGGSLIKELQLVYNKARQASITQELTEIVSGAAAV >gi|296494673|gb|ADTN01000065.1| GENE 9 9812 - 11353 1770 513 aa, chain - ## HITS:1 COG:ECs4676 KEGG:ns NR:ns ## COG: ECs4676 COG0056 # Protein_GI_number: 15833930 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Escherichia coli O157:H7 # 1 513 1 513 513 986 100.0 0 MQLNSTEISELIKQRIAQFNVVSEAHNEGTIVSVSDGVIRIHGLADCMQGEMISLPGNRY AIALNLERDSVGAVVMGPYADLAEGMKVKCTGRILEVPVGRGLLGRVVNTLGAPIDGKGP LDHDGFSAVEAIAPGVIERQSVDQPVQTGYKAVDSMIPIGRGQRELIIGDRQTGKTALAI DAIINQRDSGIKCIYVAIGQKASTISNVVRKLEEHGALANTIVVVATASESAALQYLAPY AGCAMGEYFRDRGEDALIIYDDLSKQAVAYRQISLLLRRPPGREAFPGDVFYLHSRLLER AARVNAEYVEAFTKGEVKGKTGSLTALPIIETQAGDVSAFVPTNVISITDGQIFLETNLF NAGIRPAVNPGISVSRVGGAAQTKIMKKLSGGIRTALAQYRELAAFSQFASDLDDATRKQ LDHGQKVTELLKQKQYAPMSVAQQSLVLFAAERGYLADVELSKIGSFEAALLAYVDRDHA PLMQEINQTGGYNDEIEGKLKGILDSFKATQSW >gi|296494673|gb|ADTN01000065.1| GENE 10 11366 - 11899 594 177 aa, chain - ## HITS:1 COG:ECs4677 KEGG:ns NR:ns ## COG: ECs4677 COG0712 # Protein_GI_number: 15833931 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) # Organism: Escherichia coli O157:H7 # 1 177 1 177 177 300 100.0 8e-82 MSEFITVARPYAKAAFDFAVEHQSVERWQDMLAFAAEVTKNEQMAELLSGALAPETLAES FIAVCGEQLDENGQNLIRVMAENGRLNALPDVLEQFIHLRAVSEATAEVDVISAAALSEQ QLAKISAAMEKRLSRKVKLNCKIDKSVMAGVIIRAGDMVIDGSVRGRLERLADVLQS >gi|296494673|gb|ADTN01000065.1| GENE 11 11914 - 12384 581 156 aa, chain - ## HITS:1 COG:ECs4678 KEGG:ns NR:ns ## COG: ECs4678 COG0711 # Protein_GI_number: 15833932 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit b # Organism: Escherichia coli O157:H7 # 1 156 1 156 156 201 100.0 5e-52 
MNLNATILGQAIAFVLFVLFCMKYVWPPLMAAIEKRQKEIADGLASAERAHKDLDLAKAS ATDQLKKAKAEAQVIIEQANKRRSQILDEAKAEAEQERTKIVAQAQAEIEAERKRAREEL RKQVAILAVAGAEKIIERSVDEAANSDIVDKLVAEL >gi|296494673|gb|ADTN01000065.1| GENE 12 12446 - 12685 380 79 aa, chain - ## HITS:1 COG:BU003 KEGG:ns NR:ns ## COG: BU003 COG0636 # Protein_GI_number: 15616633 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K # Organism: Buchnera sp. APS # 1 79 1 79 79 97 83.0 5e-21 MENLNMDLLYMAAAVMMGLAAIGAAIGIGILGGKFLEGAARQPDLIPLLRTQFFIVMGLV DAIPMIAVGLGLYVMFAVA >gi|296494673|gb|ADTN01000065.1| GENE 13 12732 - 13547 806 271 aa, chain - ## HITS:1 COG:STM3871 KEGG:ns NR:ns ## COG: STM3871 COG0356 # Protein_GI_number: 16767155 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit a # Organism: Salmonella typhimurium LT2 # 1 271 1 271 271 467 96.0 1e-131 MASENMTPQDYIGHHLNNLQLDLRTFSLVDPQNPPATFWTINIDSMFFSVVLGLLFLVLF RSVAKKATSGVPGKFQTAIELVIGFVNGSVKDMYHGKSKLIAPLALTIFVWVFLMNLMDL LPIDLLPYIAEHVLGLPALRVVPSADVNVTLSMALGVFILILFYSIKMKGIGGFTKELTL QPFNHWAFIPVNLILEGVSLLSKPVSLGLRLFGNMYAGELIFILIAGLLPWWSQWILNVP WAIFHILIITLQAFIFMVLTIVYLSMASEEH >gi|296494673|gb|ADTN01000065.1| GENE 14 13556 - 13936 160 126 aa, chain - ## HITS:1 COG:ECs4681 KEGG:ns NR:ns ## COG: ECs4681 COG3312 # Protein_GI_number: 15833935 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit I # Organism: Escherichia coli O157:H7 # 1 126 5 130 130 157 100.0 6e-39 MSVSLVSRNVARKLLLVQLLVVIASGLLFSLKDPFWGVSAISGGLAVFLPNVLFMIFAWR HQAHTPAKGRVAWTFAFGEAFKVLAMLVLLVVALAVLKAVFLPLIVTWVLVLVVQILAPA VINNKG >gi|296494673|gb|ADTN01000065.1| GENE 15 14553 - 15176 572 207 aa, chain - ## HITS:1 COG:ECs4682 KEGG:ns NR:ns ## COG: ECs4682 COG0357 # Protein_GI_number: 15833936 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division # 
Organism: Escherichia coli O157:H7 # 1 207 1 207 207 411 100.0 1e-115 MLNKLSLLLKDAGISLTDHQKNQLIAYVNMLHKWNKAYNLTSVRDPNEMLVRHILDSIVV APYLQGERFIDVGTGPGLPGIPLSIVRPEAHFTLLDSLGKRVRFLRQVQHELKLENIEPV QSRVEEFPSEPPFDGVISRAFASLNDMVSWCHHLPGEQGRFYALKGQMPEDEIALLPEEY QVESVVKLQVPALDGERHLVVIKANKI >gi|296494673|gb|ADTN01000065.1| GENE 16 15240 - 17129 1911 629 aa, chain - ## HITS:1 COG:gidA KEGG:ns NR:ns ## COG: gidA COG0445 # Protein_GI_number: 16131609 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: NAD/FAD-utilizing enzyme apparently involved in cell division # Organism: Escherichia coli K12 # 1 629 1 629 629 1248 100.0 0 MFYPDPFDVIIIGGGHAGTEAAMAAARMGQQTLLLTHNIDTLGQMSCNPAIGGIGKGHLV KEVDALGGLMAKAIDQAGIQFRILNASKGPAVRATRAQADRVLYRQAVRTALENQPNLMI FQQAVEDLIVENDRVVGAVTQMGLKFRAKAVVLTVGTFLDGKIHIGLDNYSGGRAGDPPS IPLSRRLRELPLRVGRLKTGTPPRIDARTIDFSVLAQQHGDNPMPVFSFMGNASQHPQQV PCYITHTNEKTHDVIRSNLDRSPMYAGVIEGVGPRYCPSIEDKVMRFADRNQHQIFLEPE GLTSNEIYPNGISTSLPFDVQMQIVRSMQGMENAKIVRPGYAIEYDFFDPRDLKPTLESK FIQGLFFAGQINGTTGYEEAAAQGLLAGLNAARLSADKEGWAPARSQAYLGVLVDDLCTL GTKEPYRMFTSRAEYRLMLREDNADLRLTEIGRELGLVDDERWARFNEKLENIERERQRL KSTWVTPSAEAAAEVNAHLTAPLSREASGEDLLRRPEMTYEKLTTLTPFAPALTDEQAAE QVEIQVKYEGYIARQQDEIEKQLRNENTLLPATLDYRQVSGLSNEVIAKLNDHKPASIGQ ASRISGVTPAAISILLVWLKKQGMLRRSA >gi|296494673|gb|ADTN01000065.1| GENE 17 17508 - 17951 622 147 aa, chain - ## HITS:1 COG:mioC KEGG:ns NR:ns ## COG: mioC COG0716 # Protein_GI_number: 16131610 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Escherichia coli K12 # 1 147 1 147 147 280 100.0 5e-76 MADITLISGSTLGGAEYVAEHLAEKLEEAGFTTETLHGPLLEDLPASGIWLVISSTHGAG DIPDNLSPFYEALQEQKPDLSAVRFGAIGIGSREYDTFCGAIDKLEAELKNSGAKQTGET LKINILDHDIPEDPAEEWLGSWVNLLK >gi|296494673|gb|ADTN01000065.1| GENE 18 18041 - 18499 500 152 aa, chain - ## HITS:1 COG:ECs4685 KEGG:ns NR:ns ## COG: ECs4685 COG1522 # Protein_GI_number: 15833939 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: 
Escherichia coli O157:H7 # 1 152 1 152 152 296 100.0 8e-81 MENYLIDNLDRGILEALMGNARTAYAELAKQFGVSPGTIHVRVEKMKQAGIITGARIDVS PKQLGYDVGCFIGIILKSAKDYPSALAKLESLDEVTEAYYTTGHYSIFIKVMCRSIDALQ HVLINKIQTIDEIQSTETLIVLQNPIMRTIKP >gi|296494673|gb|ADTN01000065.1| GENE 19 18651 - 19643 1277 330 aa, chain + ## HITS:1 COG:asnA KEGG:ns NR:ns ## COG: asnA COG2502 # Protein_GI_number: 16131612 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthetase A # Organism: Escherichia coli K12 # 1 330 1 330 330 657 100.0 0 MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGCEKAVQVKVK ALPDAQFEVVHSLAKWKRQTLGQHDFSAGEGLYTHMKALRPDEDRLSPLHSVYVDQWDWE RVMGDGERQFSTLKSTVEAIWAGIKATEAAVSEEFGLAPFLPDQIHFVHSQELLSRYPDL DAKGRERAIAKDLGAVFLVGIGGKLSDGHRHDVRAPDYDDWSTPSELGHAGLNGDILVWN PVLEDAFELSSMGIRVDADTLKHQLALTGDEDRLELEWHQALLRGEMPQTIGGGIGQSRL TMLLLQLPHIGQVQCGVWPAAVRESVPSLL >gi|296494673|gb|ADTN01000065.1| GENE 20 19648 - 21099 1144 483 aa, chain - ## HITS:1 COG:ECs4687 KEGG:ns NR:ns ## COG: ECs4687 COG2425 # Protein_GI_number: 15833941 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Escherichia coli O157:H7 # 1 473 1 473 483 890 99.0 0 MLTLDTLNVMLAVSEEGLIEEMIIALLASPQLAVFFEKFPRLKAAITDDVPRWREALRSR LKDARVPPELTEEVMCYQQSQLLSTPQFIVQLPQILDLLHRLNSPWAEQARQLVDANSTI TSALHTLFLQRWRLSLIVQATTLNQQLLEEEREQLLSEVQERMTLSGQLEPILADNNTAA GRLWDMSAGQLKRGDYQLIVKYGEFLNEHPELKRLAEQLGRSREAKSIPRNDAQMETFRT MVREPATVPEQVDGLQQSDDILRLLPPELATLGITELEYEFYRRLVEKQLLTYRLHGESW REKVIERPVVHKDYDEQPRGPFIVCVDTSGSMGGFNEQCAKAFCLALMRIALAENRRCYI MLFSTEIVRYELSGPQGIEQAIRFLSQQFRGGTDLASCFRAIMERLQSREWFDADAVVIS DFIAQRLPDDVTSKVKELQRVHQHRFHAVAMSAHGKPGIMRIFDHIWRFDTGMRSRLLRR WRR >gi|296494673|gb|ADTN01000065.1| GENE 21 21093 - 22589 1365 498 aa, chain - ## HITS:1 COG:yieN KEGG:ns NR:ns ## COG: yieN COG0714 # Protein_GI_number: 16131614 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Escherichia coli K12 # 
1 498 9 506 506 947 100.0 0 MAHPHLLAERISRLSSSLEKGLYERSHAIRLCLLAALSGESVFLLGPPGIAKSLIARRLK FAFQNARAFEYLMTRFSTPEEVFGPLSIQALKDEGRYERLTSGYLPEAEIVFLDEIWKAG PAILNTLLTAINERQFRNGAHVEKIPMRLLVAASNELPEADSSLEALYDRMLIRLWLDKV QDKANFRSMLTSQQDENDNPVPDALQVTDEEYERWQKEIGEITLPDHVFELIFMLRQQLD KLPDAPYVSDRRWKKAIRLLQASAFFSGRSAVAPVDLILLKDCLWYDAQSLNLIQQQIDV LMTGHAWQQQGMLTRLGAIVQRHLQLQQQQSDKTALTVIRLGGIFSRRQQYQLPVNVTAS TLTLLLQKPLKLHDMEVVHISFERSALEQWLSKGGEIRGKLNGIGFAQKLNLEVDSAQHL VVRDVSLQGSTLALPGSSAEGLPGEIKQQLEELESDWRKQHALFSEQQKCLFIPGDWLGR IEASLQDVGAQIRQAQQC Prediction of potential genes in microbial genomes Time: Sun May 15 23:20:08 2011 Seq name: gi|296494672|gb|ADTN01000066.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont170.1, whole genome shotgun sequence Length of sequence - 4125 bp Number of predicted genes - 9, with homology - 8 Number of transcription units - 5, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 44 - 334 300 ## gi|301644424|ref|ZP_07244422.1| hypothetical protein HMPREF9543_01076 2 1 Op 2 . + CDS 355 - 597 260 ## gi|301644425|ref|ZP_07244423.1| conserved domain protein 3 1 Op 3 . + CDS 597 - 1016 313 ## gi|301644426|ref|ZP_07244424.1| hypothetical protein HMPREF9543_01078 4 2 Op 1 . + CDS 1139 - 1429 221 ## EcolC_1232 hypothetical protein 5 2 Op 2 . + CDS 1445 - 1813 302 ## ECDH10B_2610 CPZ-55 prophage; hypothetical protein + Prom 1857 - 1916 3.5 6 3 Op 1 . + CDS 2165 - 2347 124 ## ECDH10B_2611 CPZ-55 prophage; hypothetical protein 7 3 Op 2 . + CDS 2344 - 2937 267 ## ECDH10B_2612 CPZ-55 prophage; hypothetical protein + Prom 3346 - 3405 3.6 8 4 Tu 1 . + CDS 3571 - 3633 70 ## + Prom 3664 - 3723 2.7 9 5 Tu 1 . 
+ CDS 3854 - 4090 292 ## gi|301644432|ref|ZP_07244430.1| hypothetical protein HMPREF9543_01084 Predicted protein(s) >gi|296494672|gb|ADTN01000066.1| GENE 1 44 - 334 300 96 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301644424|ref|ZP_07244422.1| ## NR: gi|301644424|ref|ZP_07244422.1| hypothetical protein HMPREF9543_01076 [Escherichia coli MS 146-1] # 1 96 15 110 110 174 100.0 1e-42 MSFIQNHFDGQNHELIAEQKAFKKAMDKKVEEIVESLPEMLRPIIMQAAEQLEAGLSDEL RKADPITHDELDKKAIYMRLLFSVSNKLGHGASWLK >gi|296494672|gb|ADTN01000066.1| GENE 2 355 - 597 260 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301644425|ref|ZP_07244423.1| ## NR: gi|301644425|ref|ZP_07244423.1| conserved domain protein [Escherichia coli MS 146-1] # 1 80 1 80 80 103 100.0 3e-21 MNRYDELEAKGNELVEKRNKASGSERELLNREINAIRQAQLKTLAEDVTAALKLAIPELV SIAIKSSNVQKAIEELQRGE >gi|296494672|gb|ADTN01000066.1| GENE 3 597 - 1016 313 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301644426|ref|ZP_07244424.1| ## NR: gi|301644426|ref|ZP_07244424.1| hypothetical protein HMPREF9543_01078 [Escherichia coli MS 146-1] # 1 139 2 140 140 259 100.0 5e-68 MDEQIKALLNQIRILETEANNHSGNERAELFKAAGKLKNQVATLLNQQGKGKAESKIPDP SEYFGAKLPEGFENYFPGDYIDPEIEDYFNNLTPEQRDYFTVQVPDNAPDEPKQEPAKAK VWDDLEYRRKVFHALGIPC >gi|296494672|gb|ADTN01000066.1| GENE 4 1139 - 1429 221 96 aa, chain + ## HITS:1 COG:no KEGG:EcolC_1232 NR:ns ## KEGG: EcolC_1232 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 96 15 105 109 67 47.0 1e-10 MLALRIISGMAKQIAIRVAYIKHKCPAHCVAQKRVALREFAELVLGTISILIAGKLKDGT SLHECSAVEKEFVKKLGLIRRDINAMLRSVQCDLDD >gi|296494672|gb|ADTN01000066.1| GENE 5 1445 - 1813 302 122 aa, chain + ## HITS:1 COG:no KEGG:ECDH10B_2610 NR:ns ## KEGG: ECDH10B_2610 # Name: yffN # Def: CPZ-55 prophage; hypothetical protein # Organism: E.coli_DH10B # Pathway: not_defined # 1 122 6 127 127 209 88.0 2e-53 MKHVFKYLDFAEDREHAEAVATKELELGHVEKFAIRDLANDIKERGCVELVQPGGFDELV QIYEAGGDGIEPLNCGIEARKVAIAALLRVMREPDFQCREMVHELVKAARDLEMPVSDKF DC 
>gi|296494672|gb|ADTN01000066.1| GENE 6 2165 - 2347 124 60 aa, chain + ## HITS:1 COG:no KEGG:ECDH10B_2611 NR:ns ## KEGG: ECDH10B_2611 # Name: yffO # Def: CPZ-55 prophage; hypothetical protein # Organism: E.coli_DH10B # Pathway: not_defined # 1 60 79 138 138 86 100.0 3e-16 MSWLWCAEAGRRAVEIIDEVDINAEDGPKQLRKAEAKAKALLAAAKLNSLKHSPFGDDKQ >gi|296494672|gb|ADTN01000066.1| GENE 7 2344 - 2937 267 197 aa, chain + ## HITS:1 COG:no KEGG:ECDH10B_2612 NR:ns ## KEGG: ECDH10B_2612 # Name: yffP # Def: CPZ-55 prophage; hypothetical protein # Organism: E.coli_DH10B # Pathway: not_defined # 1 197 1 197 197 387 94.0 1e-107 MSLIRTETRDTKRAADPLHDLRSKPFSEWGEDEIRRFNLIDALLEFVYTDTSSPFGIGMT FDYTECWEIGVRDDCLVMTRVKPVHPEYAKHWNMKGVMNDKTRFHAEKWVGYSKVLAWVS LSHKDTFTGAKRFQYFQAMYDMERQINANLPIGGLPYVDAERTGKLFQRDDFSEDSHAND PKLAGDDYVAAPPEQVD >gi|296494672|gb|ADTN01000066.1| GENE 8 3571 - 3633 70 20 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKRWEVILLMVMMFVTSTEC >gi|296494672|gb|ADTN01000066.1| GENE 9 3854 - 4090 292 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301644432|ref|ZP_07244430.1| ## NR: gi|301644432|ref|ZP_07244430.1| hypothetical protein HMPREF9543_01084 [Escherichia coli MS 146-1] # 1 78 1 78 78 153 100.0 4e-36 MDKTATILINREQVQKMLGGLSKSSFFKLVRKWKDAGTPFPDPVEGMPALKHGGFLYRYQ DVMKFFKSIGLLSDSDNQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:20:59 2011 Seq name: gi|296494671|gb|ADTN01000067.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont170.2, whole genome shotgun sequence Length of sequence - 55228 bp Number of predicted genes - 50, with homology - 50 Number of transcription units - 22, operones - 11 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 19 - 78 4.1 1 1 Tu 1 . + CDS 105 - 914 715 ## ECDH10B_2615 CPZ-55 prophage; hypothetical protein + Term 944 - 981 2.2 + Prom 959 - 1018 3.2 2 2 Tu 1 . 
+ CDS 1128 - 1730 189 ## COG3646 Uncharacterized phage-encoded protein + Term 1745 - 1773 1.3 - Term 1733 - 1761 2.1 3 3 Op 1 4/0.571 - CDS 1859 - 3262 1419 ## COG4819 Ethanolamine utilization protein, possible chaperonin protecting lyase from inhibition 4 3 Op 2 2/0.857 - CDS 3259 - 4485 1659 ## COG3192 Ethanolamine utilization protein - Prom 4519 - 4578 1.7 5 4 Op 1 2/0.857 - CDS 4702 - 5889 1245 ## COG1454 Alcohol dehydrogenase, class IV 6 4 Op 2 4/0.571 - CDS 5879 - 6715 909 ## COG4820 Ethanolamine utilization protein, possible chaperonin 7 4 Op 3 4/0.571 - CDS 6726 - 8129 837 ## PROTEIN SUPPORTED gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P 8 4 Op 4 4/0.571 - CDS 8141 - 8428 240 ## COG4576 Carbon dioxide concentrating mechanism/carboxysome shell protein - Term 8490 - 8527 5.6 9 5 Op 1 1/0.857 - CDS 8535 - 8828 472 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein 10 5 Op 2 4/0.571 - CDS 8867 - 9883 942 ## COG0280 Phosphotransacetylase 11 5 Op 3 4/0.571 - CDS 9880 - 10683 830 ## COG4812 Ethanolamine utilization cobalamin adenosyltransferase 12 5 Op 4 4/0.571 - CDS 10680 - 11381 749 ## COG4766 Ethanolamine utilization protein 13 5 Op 5 4/0.571 - CDS 11356 - 11835 401 ## COG4917 Ethanolamine utilization protein 14 5 Op 6 1/0.857 - CDS 11848 - 12183 391 ## COG4810 Ethanolamine utilization protein - Prom 12354 - 12413 5.6 - Term 12419 - 12459 10.0 15 6 Tu 1 . - CDS 12476 - 14755 2690 ## COG0281 Malic enzyme - Prom 14876 - 14935 3.5 + Prom 14838 - 14897 3.3 16 7 Op 1 13/0.000 + CDS 15044 - 15994 999 ## COG0176 Transaldolase 17 7 Op 2 . + CDS 16014 - 18017 2403 ## COG0021 Transketolase + Term 18033 - 18090 5.5 18 8 Tu 1 . - CDS 18112 - 19032 651 ## B21_02319 hypothetical protein 19 9 Op 1 3/0.714 - CDS 19281 - 19856 706 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 20 9 Op 2 . 
- CDS 19924 - 21903 1644 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases - Prom 21990 - 22049 2.3 + Prom 21972 - 22031 2.8 21 10 Tu 1 . + CDS 22109 - 23809 1242 ## COG3850 Signal transduction histidine kinase, nitrate/nitrite-specific + Prom 23814 - 23873 4.7 22 11 Tu 1 . + CDS 23973 - 27086 3374 ## COG0841 Cation/multidrug efflux pump + Term 27091 - 27139 7.5 + Prom 27161 - 27220 2.1 23 12 Tu 1 . + CDS 27264 - 27386 100 ## ECP_2483 hypothetical protein + Term 27466 - 27511 2.3 + Prom 27520 - 27579 5.7 24 13 Op 1 9/0.000 + CDS 27625 - 27981 390 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family 25 13 Op 2 . + CDS 27985 - 29112 1107 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 26 13 Op 3 . + CDS 29140 - 29340 301 ## G2583_2995 hypothetical protein - Term 29325 - 29358 -0.9 27 14 Tu 1 . - CDS 29450 - 30148 835 ## COG0400 Predicted esterase 28 15 Op 1 4/0.571 - CDS 30222 - 32198 1163 ## COG1444 Predicted P-loop ATPase fused to an acetyltransferase 29 15 Op 2 5/0.286 - CDS 32252 - 33046 838 ## COG2321 Predicted metalloprotease - Prom 33130 - 33189 3.6 30 16 Tu 1 . - CDS 33283 - 33996 1176 ## COG0152 Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase - Term 34170 - 34201 3.2 31 17 Op 1 9/0.000 - CDS 34209 - 35243 933 ## COG3317 Uncharacterized lipoprotein 32 17 Op 2 . 
- CDS 35260 - 36138 816 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase - Prom 36337 - 36396 3.0 + Prom 36196 - 36255 6.0 33 18 Op 1 4/0.571 + CDS 36284 - 36856 311 ## COG2716 Glycine cleavage system regulatory protein 34 18 Op 2 1/0.857 + CDS 36856 - 37326 554 ## COG1225 Peroxiredoxin + Prom 37478 - 37537 5.0 35 19 Op 1 4/0.571 + CDS 37579 - 38196 308 ## COG1142 Fe-S-cluster-containing hydrogenase components 2 36 19 Op 2 10/0.000 + CDS 38196 - 40214 1689 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit 37 19 Op 3 1/0.857 + CDS 40225 - 41172 935 ## COG0650 Formate hydrogenlyase subunit 4 38 19 Op 4 3/0.714 + CDS 41189 - 42628 1246 ## COG1009 NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit 39 19 Op 5 7/0.000 + CDS 42640 - 43290 655 ## COG4237 Hydrogenase 4 membrane component (E) 40 19 Op 6 7/0.000 + CDS 43295 - 44875 1199 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit 41 19 Op 7 5/0.286 + CDS 44865 - 46532 1813 ## COG3261 Ni,Fe-hydrogenase III large subunit 42 19 Op 8 6/0.000 + CDS 46542 - 47087 372 ## COG1143 Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) 43 19 Op 9 . + CDS 47084 - 47842 545 ## COG3260 Ni,Fe-hydrogenase III small subunit 44 19 Op 10 . + CDS 47835 - 48248 312 ## SSON_2571 putative protein processing element 45 19 Op 11 1/0.857 + CDS 48278 - 50290 1346 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains 46 19 Op 12 . + CDS 50312 - 51160 559 ## COG2116 Formate/nitrite family of transporters + Term 51165 - 51208 11.5 - Term 51153 - 51194 7.3 47 20 Tu 1 . - CDS 51198 - 52259 1363 ## COG0628 Predicted permease - Prom 52329 - 52388 4.5 + Prom 52344 - 52403 3.1 48 21 Op 1 6/0.000 + CDS 52472 - 53935 1779 ## COG4783 Putative Zn-dependent protease, contains TPR repeats 49 21 Op 2 . 
+ CDS 53956 - 54315 547 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family 50 22 Tu 1 . - CDS 54453 - 55130 567 ## COG0593 ATPase involved in DNA replication initiation - Prom 55164 - 55223 1.5 Predicted protein(s) >gi|296494671|gb|ADTN01000067.1| GENE 1 105 - 914 715 269 aa, chain + ## HITS:1 COG:no KEGG:ECDH10B_2615 NR:ns ## KEGG: ECDH10B_2615 # Name: yffS # Def: CPZ-55 prophage; hypothetical protein # Organism: E.coli_DH10B # Pathway: not_defined # 1 269 11 279 279 441 88.0 1e-122 MSYEIKVCDILKGAAMEGKYKGAQRGAKCEEIANELTRRGVKNNKGEPITKGGVSHWLEG RREPNFDTLAELCDMFGVYALMPMRGGNWIRVHPEDGAMKELREAVAERDAIIDDLKARV AELEAALAEKQVLAITVEMGDEKVNMVAAEQTPNYDKEMGVKEWVNPNPKKYSVGMLCQV LAALGGEYLGNNAGLQQKITAMDNDGNRKPISNGAFYRLIEQAKERGLIKVDQEITHKKD ENGNQIGKGKKGDKLITLLPNWEYNLGDK >gi|296494671|gb|ADTN01000067.1| GENE 2 1128 - 1730 189 200 aa, chain + ## HITS:1 COG:Z1457_1 KEGG:ns NR:ns ## COG: Z1457_1 COG3646 # Protein_GI_number: 15800956 # Func_class: S Function unknown # Function: Uncharacterized phage-encoded protein # Organism: Escherichia coli O157:H7 EDL933 # 1 109 3 99 104 58 33.0 6e-09 MNELISNDLTGARMTTIDIMNLLNANLPANKKHPYDHFEIIKKVEDLMRMNIIHLRKKIE VDNNQTLSNHSKVRGYEFIGKQGERDSIVVIAQFQPKVTGVLYDKWDALREENRQLKEHL LMLTKQEDLKKDAERLLNELDHGDVDVNAVTELLNRYNHSSHHAGQQLSLVRKTKPIVKQ LHHNTIELLQPELPLETAEQ >gi|296494671|gb|ADTN01000067.1| GENE 3 1859 - 3262 1419 467 aa, chain - ## HITS:1 COG:eutA KEGG:ns NR:ns ## COG: eutA COG4819 # Protein_GI_number: 16130376 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein, possible chaperonin protecting lyase from inhibition # Organism: Escherichia coli K12 # 1 467 1 467 467 897 100.0 0 MNTRQLLSVGIDIGTTTTQVIFSRLELVNRAAVSQVPRYEFIKREISWQSPVFFTPVDKQ GGLKEAELKTLILEQYHAAGIEPESVDSGAIIITGESAKTRNARPAVMALSQSLGDFVVA SAGPHLESVIAGHGAGAQTLSEQRLCRVLNIDIGGGTANYALFDAGKISGTACLNVGGRL LETDSHGRVVYAHKPGQMIVDECFGAGTDARSLTGAQLVQVTRRMAELIVEVIDGTLSPL AQALMQTGLLPAGVTPEIITLSGGVGECYRHQPADPFCFADIGPLLATALHDHPRLREMN 
VQFPAQTVRATVIGAGAHTLSLSGSTIWLEGVQLPLRNLPVAIPIDETDLVGAWQQALIQ LDLDPKTDAYVLALPASLPVRYAAVLTVINALVDFVARFPNPHPLLVVAGQDFGKALGML LRPQLQQLPLAVIDEVIVRAGDYIDIGTPLFGGSVVPVTVKSLAFPS >gi|296494671|gb|ADTN01000067.1| GENE 4 3259 - 4485 1659 408 aa, chain - ## HITS:1 COG:eutH KEGG:ns NR:ns ## COG: eutH COG3192 # Protein_GI_number: 16130377 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Escherichia coli K12 # 1 408 1 408 408 663 100.0 0 MGINEIIMYIMMFFMLIAAVDRILSQFGGSARFLGKFGKSIEGSGGQFEEGFMAMGALGL AMVGMTALAPVLAHVLGPVIIPVYEMLGANPSMFAGTLLACDMGGFFLAKELAGGDVAAW LYSGLILGSMMGPTIVFSIPVALGIIEPSDRRYLALGVLAGIVTIPIGCIAGGLVAMYSG VQINGQPVEFTFALILMNMIPVIIVAILVALGLKFIPEKMINGFQIFAKFLVALITLGLA AAVVKFLLGWELIPGLDPIFMAPGDKPGEVMRAIEVIGSISCVLLGAYPMVLLLTRWFEK PLMSVGKVLNMNNIAAAGMVATLANNIPMFGMMKQMDTRGKVINCAFAVSAAFALGDHLG FAAANMNAMIFPMIVGKLIGGVTAIGVAMMLVPKEDATATKTEAEAQS >gi|296494671|gb|ADTN01000067.1| GENE 5 4702 - 5889 1245 395 aa, chain - ## HITS:1 COG:eutG KEGG:ns NR:ns ## COG: eutG COG1454 # Protein_GI_number: 16130378 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Escherichia coli K12 # 1 395 10 404 404 727 100.0 0 MQNELQTALFQAFDTLNLQRVKTFSVPPVTLCGPGSVSSCGQQAQTRGLKHLFVMADSFL HQAGMTAGLTRSLTVKGIAMTLWPCPVGEPCITDVCAAVAQLRESGCDGVIAFGGGSVLD AAKAVTLLVTNPDSTLAEMSETSVLQPRLPLIAIPTTAGTGSETTNVTVIIDAVSGRKQV LAHASLMPDVAILDAALTEGVPSHVTAMTGIDALTHAIEAYSALNATPFTDSLAIGAIAM IGKSLPKAVGYGHDLAARESMLLASCMAGMAFSSAGLGLCHAMAHQPGAALHIPHGLANA MLLPTVMEFNRMVCRERFSQIGRALRTKKSDDRDAINAVSELIAEVGIGKRLGDVGATSA HYGAWAQAALEDICLRSNPRTASLEQIVGLYAAAQ >gi|296494671|gb|ADTN01000067.1| GENE 6 5879 - 6715 909 278 aa, chain - ## HITS:1 COG:eutJ KEGG:ns NR:ns ## COG: eutJ COG4820 # Protein_GI_number: 16130379 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein, possible chaperonin # Organism: Escherichia coli K12 # 1 278 1 278 278 564 100.0 1e-161 
MAHDEQWLTPRLQTAATLCNQTPAATESPLWLGVDLGTCDVVSMVVDRDGQPVAVCLDWA DVVRDGIVWDFFGAVTIVRRHLDTLEQQFGRRFSHAATSFPPGTDPRISINVLESAGLEV SHVLDEPTAVADLLQLDNAGVVDIGGGTTGIAIVKKGKVTYSADEATGGHHISLTLAGNR RISLEEAEQYKRGHGEEIWPAVKPVYEKMADIVARHIEGQGITDLWLAGGSCMQPGVAEL FRKQFPALQVHLPQHSLFMTPLAIASSGREKAEGLYAK >gi|296494671|gb|ADTN01000067.1| GENE 7 6726 - 8129 837 467 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P [Lactobacillus reuteri DSM 20016] # 1 454 1 461 477 327 40 1e-88 MNQQDIEQVVKAVLLKMQSSDTPSAAVHEMGVFASLDDAVAAAKVAQQGLKSVAMRQLAI AAIREAGEKHARDLAELAVSETGMGRVEDKFAKNVAQARGTPGVECLSPQVLTGDNGLTL IENAPWGVVASVTPSTNPAATVINNAISLIAAGNSVIFAPHPAAKKVSQRAITLLNQAIV AAGGPENLLVTVANPDIETAQRLFKFPGIGLLVVTGGEAVVEAARKHTNKRLIAAGAGNP PVVVDETADLARAAQSIVKGASFDNNIICADEKVLIVVDSVADELMRLMEGQHAVKLTAE QAQQLQPVLLKNIDERGKGTVSRDWVGRDAGKIAAAIGLKVPQETRLLFVETTAEHPFAV TELMMPVLPVVRVANVADAIALAVKLEGGCHHTAAMHSRNIENMNQMANAIDTSIFVKNG PCIAGLGLGGEGWTTMTITTPTGEGVTSARTFVRLRRCVLVDAFRIV >gi|296494671|gb|ADTN01000067.1| GENE 8 8141 - 8428 240 95 aa, chain - ## HITS:1 COG:cchB KEGG:ns NR:ns ## COG: cchB COG4576 # Protein_GI_number: 16130381 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Escherichia coli K12 # 1 95 1 95 95 176 100.0 9e-45 MKLAVVTGQIVCTVRHHGLAHDKLLMVEMIDPQGNPDGQCAVAIDNIGAGTGEWVLLVSG SSARQAHKSETSPVDLCVIGIVDEVVSGGQVIFHK >gi|296494671|gb|ADTN01000067.1| GENE 9 8535 - 8828 472 97 aa, chain - ## HITS:1 COG:ECs3319 KEGG:ns NR:ns ## COG: ECs3319 COG4577 # Protein_GI_number: 15832573 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Escherichia coli O157:H7 # 1 97 15 111 111 153 100.0 9e-38 MEALGMIETRGLVALIEASDAMVKAARVKLVGVKQIGGGLCTAMVRGDVAACKAATDAGA 
AAAQRIGELVSVHVIPRPHGDLEEVFPIGLKGDSSNL >gi|296494671|gb|ADTN01000067.1| GENE 10 8867 - 9883 942 338 aa, chain - ## HITS:1 COG:eutI KEGG:ns NR:ns ## COG: eutI COG0280 # Protein_GI_number: 16130383 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Escherichia coli K12 # 1 338 1 338 338 624 100.0 1e-179 MIIERCRELALRAPARVVFPDALDQRVLKAAQYLHQQGLATPILVANPFELRQFALSHGV AMDGLQVIDPHGNLAMREEFAHRWLARAGEKTPPDALEKLTDPLMFAAAMVSAGKADVCI AGNLSSTANVLRAGLRIIGLQPGCKTLSSIFLMLPQYSGPALGFADCSVVPQPTAAQLAD IALASAETWRAITGEEPRVAMLSFSSNGSARHPCVANVQQATEIVRERAPKLVVDGELQF DAAFVPEVAAQKAPASPLQGKANVMVFPSLEAGNIGYKIAQRLGGYRAVGPLIQGLAAPM HDLSRGCSVQEIIELALVAAVPRQTEVNRESSLQTLVE >gi|296494671|gb|ADTN01000067.1| GENE 11 9880 - 10683 830 267 aa, chain - ## HITS:1 COG:eutT KEGG:ns NR:ns ## COG: eutT COG4812 # Protein_GI_number: 16130384 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization cobalamin adenosyltransferase # Organism: Escherichia coli K12 # 1 267 1 267 267 508 100.0 1e-144 MKDFITEAWLRANHTLSEGAEIHLPADSRLTPSARELLESRHLRIKFIDEQGRLFVDDEQ QQPQPVHGLTSSDEHPQACCELCRQPVAKKPDTLTHLSAEKMVAKSDPRLGFRAVLDSTI ALAVWLQIELAEPWQPWLADIRSRLGNIMRADALGEPLGCQAIVGLSDEDLHRLSHQPLR YLDHDHLVPEASHGRDAALLNLLRTKVRETETVAAQVFITRSFEVLRPDILQALNRLSST VYVMMILSVTKQPLTVKQIQQRLGETQ >gi|296494671|gb|ADTN01000067.1| GENE 12 10680 - 11381 749 233 aa, chain - ## HITS:1 COG:eutQ KEGG:ns NR:ns ## COG: eutQ COG4766 # Protein_GI_number: 16130385 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Escherichia coli K12 # 1 233 1 233 233 456 100.0 1e-128 MKKLITANDIREAHARGEQAMSVVLRASIITPEAREVADLLGFTITECDESIPVTASVPA SVPADKTESQRIRETIIAQLPEGQFTESLVAQLMEKVMKEKQSLEQGAMQPSFKSVTGKG GIKVIDGSSVKFGRFDGAEPHCVGLTDLVTGDDGSSMAAGFMQWENAFFPWTLNYDEIDM VLEGELHVRHEGQTMIAKAGDVMFIPKGSSIEFGTTSSVKFLYVAWPANWQSL >gi|296494671|gb|ADTN01000067.1| GENE 13 11356 - 11835 401 159 aa, chain - ## HITS:1 COG:eutP KEGG:ns NR:ns ## COG: eutP COG4917 # 
Protein_GI_number: 16130386 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Escherichia coli K12 # 1 159 1 159 159 316 99.0 1e-86 MKRIAFVGSVGAGKTTLFNALQGNYTLARKTQAVEFNDKGDIDTPGEYFNHPRWYHALIT TLQDVDMLIYVHGANDPESRLPAGLLDIGVSKRQIAVISKTDMPDADVAATRKLLLETGF EEPIFELNSHDPQSVQQLVDYLASLTKQEEAGEKTHHSE >gi|296494671|gb|ADTN01000067.1| GENE 14 11848 - 12183 391 111 aa, chain - ## HITS:1 COG:ECs3324 KEGG:ns NR:ns ## COG: ECs3324 COG4810 # Protein_GI_number: 15832578 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Escherichia coli O157:H7 # 1 111 25 135 135 213 100.0 1e-55 MDKERIIQEFVPGKQVTLAHLIAHPGEELAKKIGVPDAGAIGIMTLTPGETAMIAGDLAL KAADVHIGFLDRFSGALVIYGSVGAVEEALSQTVSGLGRLLNYTLCEMTKS >gi|296494671|gb|ADTN01000067.1| GENE 15 12476 - 14755 2690 759 aa, chain - ## HITS:1 COG:maeB_1 KEGG:ns NR:ns ## COG: maeB_1 COG0281 # Protein_GI_number: 16130388 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Escherichia coli K12 # 1 434 1 434 434 849 100.0 0 MDDQLKQSALDFHEFPVPGKIQVSPTKPLATQRDLALAYSPGVAAPCLEIEKDPLKAYKY TARGNLVAVISNGTAVLGLGNIGALAGKPVMEGKGVLFKKFAGIDVFDIEVDELDPDKFI EVVAALEPTFGGINLEDIKAPECFYIEQKLRERMNIPVFHDDQHGTAIISTAAILNGLRV VEKNISDVRMVVSGAGAAAIACMNLLVALGLQKHNIVVCDSKGVIYQGREPNMAETKAAY AVVDDGKRTLDDVIEGADIFLGCSGPKVLTQEMVKKMARAPMILALANPEPEILPPLAKE VRPDAIICTGRSDYPNQVNNVLCFPFIFRGALDVGATAINEEMKLAAVRAIAELAHAEQS EVVASAYGDQDLSFGPEYIIPKPFDPRLIVKIAPAVAKAAMESGVATRPIADFDVYIDKL TEFVYKTNLFMKPIFSQARKAPKRVVLPEGEEARVLHATQELVTLGLAKPILIGRPNVIE MRIQKLGLQIKAGVDFEIVNNESDPRFKEYWTEYFQIMKRRGVTQEQAQRALISNPTVIG AIMVQRGEADAMICGTVGDYHEHFSVVKNVFGYRDGVHTAGAMNALLLPSGNTFIADTYV NDEPDAEELAEITLMAAETVRRFGIEPRVALLSHSNFGSSDCPSSSKMRQALELVRERAP ELMIDGEMHGDAALVEAIRNDRMPDSSLKGSANILVMPNMEAARISYNLLRVSSSEGVTV GPVLMGVAKPVHVLTPIASVRRIVNMVALAVVEAQTQPL >gi|296494671|gb|ADTN01000067.1| GENE 16 15044 - 15994 999 316 aa, chain + ## HITS:1 COG:ECs3326 KEGG:ns NR:ns ## COG: ECs3326 
COG0176 # Protein_GI_number: 15832580 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Escherichia coli O157:H7 # 1 316 1 316 316 635 100.0 0 MNELDGIKQFTTVVADSGDIESIRHYHPQDATTNPSLLLKAAGLSQYEHLIDDAIAWGKK NGKTQEQQVVAACDKLAVNFGAEILKIVPGRVSTEVDARLSFDKEKSIEKARHLVDLYQQ QGVEKSRILIKLASTWEGIRAAEELEKEGINCNLTLLFSFAQARACAEAGVFLISPFVGR IYDWYQARKPMDPYVVEEDPGVKSVRNIYDYYKQHHYETIVMGASFRRTEQILALTGCDR LTIAPNLLKELQEKVSPVVRKLIPPSQTFPRPAPMSEAEFRWEHNQDAMAVEKLSEGIRL FAVDQRKLEDLLAAKL >gi|296494671|gb|ADTN01000067.1| GENE 17 16014 - 18017 2403 667 aa, chain + ## HITS:1 COG:tktB KEGG:ns NR:ns ## COG: tktB COG0021 # Protein_GI_number: 16130390 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase # Organism: Escherichia coli K12 # 1 667 1 667 667 1360 100.0 0 MSRKDLANAIRALSMDAVQKANSGHPGAPMGMADIAEVLWNDFLKHNPTDPTWYDRDRFI LSNGHASMLLYSLLHLTGYDLPLEELKNFRQLHSKTPGHPEIGYTPGVETTTGPLGQGLA NAVGLAIAERTLAAQFNQPDHEIVDHFTYVFMGDGCLMEGISHEVCSLAGTLGLGKLIGF YDHNGISIDGETEGWFTDDTAKRFEAYHWHVIHEIDGHDPQAVKEAILEAQSVKDKPSLI ICRTVIGFGSPNKAGKEEAHGAPLGEEEVALARQKLGWHHPPFEIPKEIYHAWDAREKGE KAQQSWNEKFAAYKKAHPQLAEEFTRRMSGGLPKDWEKTTQKYINELQANPAKIATRKAS QNTLNAYGPMLPELLGGSADLAPSNLTIWKGSVSLKEDPAGNYIHYGVREFGMTAIANGI AHHGGFVPYTATFLMFVEYARNAARMAALMKARQIMVYTHDSIGLGEDGPTHQAVEQLAS LRLTPNFSTWRPCDQVEAAVGWKLAVERHNGPTALILSRQNLAQVERTPDQVKEIARGGY VLKDSGGKPDIILIATGSEMEITLQAAEKLAGEGRNVRVVSLPSTDIFDAQDEEYRESVL PSNVAARVAVEAGIADYWYKYVGLKGAIVGMTGYGESAPADKLFPFFGFTAENIVAKAHK VLGVKGA >gi|296494671|gb|ADTN01000067.1| GENE 18 18112 - 19032 651 306 aa, chain - ## HITS:1 COG:no KEGG:B21_02319 NR:ns ## KEGG: B21_02319 # Name: ypfG # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 306 42 347 347 602 99.0 1e-171 MARNTGDHNGLVMTLSRSAGAHTDAVLRIERGGLKSPEASEGEIAPRLLLDGEPLALSGD KWRISPWLLVTDDTATITAFLQMIQEGKAITLRDGDQTISLSGLKAALLFIDAQQKRVGS ETAWIKKGDEPPLSVPPAPALKEVAVVNPTPTPLSLEERNDLLDYGNWRMNGLRCSLDPF RREVNVTALTDDKALMMISCEAGAYNTIDLAWIVSRKKPLASRPVRLRLPFNNGQETNEL 
ELMNATFDEKSRELVTLAKGRGLSDCGIQARWRFDGQRFRLVRYAAEPTCDNWHGPDAWP TLWITR >gi|296494671|gb|ADTN01000067.1| GENE 19 19281 - 19856 706 191 aa, chain - ## HITS:1 COG:yffH KEGG:ns NR:ns ## COG: yffH COG0494 # Protein_GI_number: 16130392 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli K12 # 1 191 1 191 191 373 100.0 1e-103 MTQQITLIKDKILSDNYFTLHNITYDLTRKDGEVIRHKREVYDRGNGATILLYNTKKKTV VLIRQFRVATWVNGNESGQLIESCAGLLDNDEPEVCIRKEAIEETGYEVGEVRKLFELYM SPGGVTELIHFFIAEYSDNQRANAGGGVEDEDIEVLELPFSQALEMIKTGEIRDGKTVLL LNYLQTSHLMD >gi|296494671|gb|ADTN01000067.1| GENE 20 19924 - 21903 1644 659 aa, chain - ## HITS:1 COG:aegA_2 KEGG:ns NR:ns ## COG: aegA_2 COG0493 # Protein_GI_number: 16130393 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Escherichia coli K12 # 181 659 1 479 479 1000 100.0 0 MNRFIMANSQQCLGCHACEIACVMAHNDEQHVLSQHHFHPRITVIKHQQQRSAVTCHHCE DAPCARSCPNGAISHVDDSIQVNQQKCIGCKSCVVACPFGTMQIVLTPVAAGKVKATAHK CDLCAGRENGPACVENCPADALQLVTDVALSGMAKSRRLRTARQEHQPWHASTAAQEMPV MSKVEQMQATPARGEPDKLAIEARKTGFDEIYLPFRADQAQREASRCLKCGEHSVCEWTC PLHNHIPQWIELVKAGNIDAAVELSHQTNTLPEITGRVCPQDRLCEGACTIRDEHGAVTI GNIERYISDQALAKGWRPDLSHVTKVDKRVAIIGAGPAGLACADVLTRNGVGVTVYDRHP EIGGLLTFGIPSFKLDKSLLARRREIFSAMGIHFELNCEVGKDVSLDSLLEQYDAVFVGV GTYRSMKAGLPNEDAPGVYDALPFLIANTKQVMGLEELPEEPFINTAGLNVVVLGGGDTA MDCVRTALRHGASNVTCAYRRDEANMPGSKKEVKNAREEGANFEFNVQPVALELNEQGHV CGIRFLRTRLGEPDAQGRRRPVPVEGSEFVMPADAVIMAFGFNPHGMPWLESHGVTVDKW GRIIADVESQYRYQTTNPKIFAGGDAVRGADLVVTAMAEGRHAAQGIIDWLGVKSVKSH >gi|296494671|gb|ADTN01000067.1| GENE 21 22109 - 23809 1242 566 aa, chain + ## HITS:1 COG:narQ KEGG:ns NR:ns ## COG: narQ COG3850 # Protein_GI_number: 16130394 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase, 
nitrate/nitrite-specific # Organism: Escherichia coli K12 # 1 566 1 566 566 1080 100.0 0 MIVKRPVSASLARAFFYIVLLSILSTGIALLTLASSLRDAEAINIAGSLRMQSYRLGYDL QSGSPQLNAHRQLFQQALHSPVLTNLNVWYVPEAVKTRYAHLNANWLEMNNRLSKGDLPW YQANINNYVNQIDLFVLALQHYAERKMLLVVAISLAGGIGIFTLVFFTLRRIRHQVVAPL NQLVTASQRIEHGQFDSPPLDTNLPNELGLLAKTFNQMSSELHKLYRSLEASVEEKTRDL HEAKRRLEVLYQCSQALNTSQIDVHCFRHILQIVRDNEAAEYLELNVGENWRISEGQPNP ELPMQILPVTMQETVYGELHWQNSHVSSSEPLLNSVSSMLGRGLYFNQAQKHFQQLLLME ERATIARELHDSLAQVLSYLRIQLTLLKRSIPEDNATAQSIMADFSQALNDAYRQLRELL TTFRLTLQQADLPSALREMLDTLQNQTSAKLTLDCRLPTLALDAQMQVHLLQIIREAVLN AMKHANASEIAVSCVTAPDGNHTVYIRDNGIGIGEPKEPEGHYGLNIMRERAERLGGTLT FSQPSGGGTLVSISFRSAEGEESQLM >gi|296494671|gb|ADTN01000067.1| GENE 22 23973 - 27086 3374 1037 aa, chain + ## HITS:1 COG:acrD KEGG:ns NR:ns ## COG: acrD COG0841 # Protein_GI_number: 16130395 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Escherichia coli K12 # 1 1037 1 1037 1037 2042 99.0 0 MANFFIDRPIFAWVLAILLCLTGTLAIFSLPVEQYPDLAPPNVRVTANYPGASAQTLENT MTQVIEHNMTGLDNLMYMSSQSSGTGQASVTLSFKAGTDPDEAVQQVQNQLQSAMRKLPQ AVQNQGVTVRKTGDTNILTIAFVSTDGSMDKQDIADYVASNIQDPLSRVNGVGDIDAYGS QYSMRIWLDPAKLNSFQMTAKDVTDAIESQNAQIAVGQLGGTPSVDKQALNATINAQSLL QTPEQFRDITLRVNQDGSEVRLGDVATVEMGAEKYDYLSRFNGKPASGLGVKLASGANEM ATAELVLNRLDELAQYFPHGLEYKVAYETTSFVKASIEDVVKTLLEAIALVFLVMYLFLQ NFRATLIPTIAVPVVLMGTFSVLYAFGYSVNTLTMFAMVLAIGLLVDDAIVVVENVERIM SEEGLTPREATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGTTGAIYRQFSITIVAAMVL SVLVAMILTPALCATLLKPLKKGEHHGQKGFFAWFNQMFNRNAERYEKGVAKILHRSLRW IVIYVLLLGGMVFLFLRLPTSFLPLEDRGMFTTSVQLPSGSTQQQTLKVVEQIEKYYFTH EKDNIMSVFATVGSGPGGNGQNVARMFIRLKDWSERDSKTGTSFAIIERATKAFNQIKEA RVIASSPPAISGLGSSAGFDMELQDHAGAGHDALMAARNQLLALAAENPELTRVRHNGLD DSPQLQIDIDQRKAQALGVAIDDINDTLQTAWGSSYVNDFMDRGRVKKVYVQAAAPYRML PDDINLWYVRNKDGGMVPFSAFATSRWETGSPRLERYNGYSAVEIVGEAAPGVSTGTAMD IMESLVKQLPNGFGLEWTAMSYQERLSGAQAPALYAISLLVVFLCLAALYESWSVPFSVM LVVPLGVIGALLATWMRGLENDVYFQVGLLTVIGLSAKNAILIVEFANEMNQKGHDLFEA 
TLHACRQRLRPILMTSLAFIFGVLPMATSTGAGSGGQHAVGTGVMGGMISATILAIYFVP LFFVLVRRRFPLKPRPE >gi|296494671|gb|ADTN01000067.1| GENE 23 27264 - 27386 100 40 aa, chain + ## HITS:1 COG:no KEGG:ECP_2483 NR:ns ## KEGG: ECP_2483 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_536 # Pathway: not_defined # 1 40 36 75 75 62 95.0 3e-09 MGILRKQYTVVKLIYSIAAMTTSGGQILCKYYGPAVMICM >gi|296494671|gb|ADTN01000067.1| GENE 24 27625 - 27981 390 118 aa, chain + ## HITS:1 COG:yffB KEGG:ns NR:ns ## COG: yffB COG1393 # Protein_GI_number: 16130396 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Escherichia coli K12 # 1 118 1 118 118 246 100.0 6e-66 MVTLYGIKNCDTIKKARRWLEANNIDYRFHDYRVDGLDSELLNDFINELGWEALLNTRGT TWRKLDETTRNKITDAASAAALMTEMPAIIKRPLLCVPGKPMLLGFSDSSYQQFFHEV >gi|296494671|gb|ADTN01000067.1| GENE 25 27985 - 29112 1107 375 aa, chain + ## HITS:1 COG:dapE KEGG:ns NR:ns ## COG: dapE COG0624 # Protein_GI_number: 16130397 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Escherichia coli K12 # 1 375 1 375 375 778 100.0 0 MSCPVIELTQQLIRRPSLSPDDAGCQALLIERLQAIGFTVERMDFADTQNFWAWRGQGET LAFAGHTDVVPPGDADRWINPPFEPTIRDGMLFGRGAADMKGSLAAMVVAAERFVAQHPN HTGRLAFLITSDEEASAHNGTVKVVEALMARNERLDYCLVGEPSSIEVVGDVVKNGRRGS LTCNLTIHGVQGHVAYPHLADNPVHRAAPFLNELVAIEWDQGNEFFPATSMQIANIQAGT GSNNVIPGELFVQFNFRFSTELTDEMIKAQVLALLEKHQLRYTVDWWLSGQPFLTARGKL VDAVVNAVEHYNEIKPQLLTTGGTSDGRFIARMGAQVVELGPVNATIHKINECVNAADLQ LLARMYQRIMEQLVA >gi|296494671|gb|ADTN01000067.1| GENE 26 29140 - 29340 301 66 aa, chain + ## HITS:1 COG:no KEGG:G2583_2995 NR:ns ## KEGG: G2583_2995 # Name: ypfN # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 66 1 66 66 85 100.0 7e-16 MDWLAKYWWILVIVFLVGVLLNVIKDLKRVDHKKFLANKPELPPHRDFNDKWDDDDDWPK KDQPKK >gi|296494671|gb|ADTN01000067.1| GENE 27 29450 - 30148 835 232 aa, chain 
- ## HITS:1 COG:ypfH KEGG:ns NR:ns ## COG: ypfH COG0400 # Protein_GI_number: 16130398 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Escherichia coli K12 # 1 232 9 240 240 459 100.0 1e-129 MKHDHFVVQSPDKPAQQLLLLFHGVGDNPVAMGEIGNWFAPLFPDALVVSVGGAEPSGNP AGRQWFSVQGITEDNRQARVDAIMPTFIETVRYWQKQSGVGANATALIGFSQGAIMVLES IKAEPGLASRVIAFNGRYASLPETASTATTIHLIHGGEDPVIDLAHAVAAQEALISAGGD VTLDIVEDLGHAIDNRSMQFALDHLRYTIPKHYFDEALSGGKPGDDDVIEMM >gi|296494671|gb|ADTN01000067.1| GENE 28 30222 - 32198 1163 658 aa, chain - ## HITS:1 COG:ypfI KEGG:ns NR:ns ## COG: ypfI COG1444 # Protein_GI_number: 16130399 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase fused to an acetyltransferase # Organism: Escherichia coli K12 # 1 658 14 671 671 1293 100.0 0 MKREGIRRLLVLSGEEGWCFEHTLKLRDALPGDWLWISPRPDAENHCSPSALQTLLGREF RHAVFDARHGFDAAAFAALSGTLKAGSWLVLLLPVWEEWENQPDADSLRWSDCPDPIATP HFVQHLKRVLTADNEAILWRQNQPFSLAHFTPRTDWYPATGAPQPEQQQLLKQLMTMPPG VAAVTAARGRGKSALAGQLISRIAGRAIVTAPAKASTDVLAQFAGEKFRFIAPDALLASD EQADWLVVDEAAAIPAPLLHQLVSRFPRTLLTTTVQGYEGTGRGFLLKFCARFPHLHRFE LQQPIRWAQGCPLEKMVSEALVFDDENFTHTPQGNIVISAFEQTLWQSDPETPLKVYQLL SGAHYRTSPLDLRRMMDAPGQHFLQAAGENEIAGALWLVDEGGLSQQLSQAVWAGFRRPR GNLVAQSLAAHGNNPLAATLRGRRVSRIAVHPARQREGTGRQLIAGALQYTQDLDYLSVS FGYTGELWRFWQRCGFVLVRMGNHREASSGCYTAMALLPMSDAGKQLAEREHYRLRRDAQ ALAQWNGETLPVDPLNDAVLSDDDWLELAGFAFAHRPLLTSLGCLLRLLQTSELALPALR GRLQKNASDAQLCTTLKLSGRKMLLVRQREEAAQALFALNDVRTERLRDRITQWQLFH >gi|296494671|gb|ADTN01000067.1| GENE 29 32252 - 33046 838 264 aa, chain - ## HITS:1 COG:ECs3337 KEGG:ns NR:ns ## COG: ECs3337 COG2321 # Protein_GI_number: 15832591 # Func_class: R General function prediction only # Function: Predicted metalloprotease # Organism: Escherichia coli O157:H7 # 6 264 29 287 287 508 100.0 1e-144 MGGPGFRLPSGKGGLILLIVVLVAGYYGVDLTGLMTGQPVSQQQSTRSISPNEDEAAKFT SVILATTEDTWGQQFEKMGKTYQQPKLVMYRGMTRTGCGAGQSIMGPFYCPADGTVYIDL SFYDDMKDKLGADGDFAQGYVIAHEVGHHVQKLLGIEPKVRQLQQNATQAEVNRLSVRME 
LQADCFAGVWGHSMQQQGVLETGDLEEALNAAQAIGDDRLQQQSQGRVVPDSFTHGTSQQ RYSWFKRGFDSGDPAQCNTFGKSI >gi|296494671|gb|ADTN01000067.1| GENE 30 33283 - 33996 1176 237 aa, chain - ## HITS:1 COG:ECs3338 KEGG:ns NR:ns ## COG: ECs3338 COG0152 # Protein_GI_number: 15832592 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase # Organism: Escherichia coli O157:H7 # 1 237 1 237 237 464 100.0 1e-131 MQKQAELYRGKAKTVYSTENPDLLVLEFRNDTSAGDGARIEQFDRKGMVNNKFNYFIMSK LAEAGIPTQMERLLSDTECLVKKLDMVPVECVVRNRAAGSLVKRLGIEEGIELNPPLFDL FLKNDAMHDPMVNESYCETFGWVSKENLARMKELTYKANDVLKKLFDDAGLILVDFKLEF GLYKGEVVLGDEFSPDGSRLWDKETLEKMDKDRFRQSLGGLIEAYEAVARRLGVQLD >gi|296494671|gb|ADTN01000067.1| GENE 31 34209 - 35243 933 344 aa, chain - ## HITS:1 COG:nlpB KEGG:ns NR:ns ## COG: nlpB COG3317 # Protein_GI_number: 16130402 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized lipoprotein # Organism: Escherichia coli K12 # 1 344 2 345 345 622 99.0 1e-178 MAYSVQKSRLAKVAGVSLVLLLAACSSDSRYKRQVSGDEAYLEAAPLAELHAPAGMILPV TSGDYAIPVTNGSGAVGKALDIRPPAQPLALVSGARTQFTGDTASLLVENGRGNTLWPQV VSVLQAKNYTITQRDDAGQTLTTDWVQWNRLDEDEQYRGRYQISVKPQGYQQAVTVKLLN LEQAGKPVADAASMQRYSTEMMNVISAGLDKSATDAANAAQNRASTTMDVQSAADDTGLP MLVVRGPFNVVWQRLPAALEKGGMKVTDSTRSQGNMAVTYKPLSDSDWQELGASDPGLAS GDYKLQVGDLDNRSSLQFIDPKGHTLTQSQNDALVAVFQAAFSK >gi|296494671|gb|ADTN01000067.1| GENE 32 35260 - 36138 816 292 aa, chain - ## HITS:1 COG:dapA KEGG:ns NR:ns ## COG: dapA COG0329 # Protein_GI_number: 16130403 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Escherichia coli K12 # 1 292 1 292 292 580 100.0 1e-165 MFTGSIVAIVTPMDEKGNVCRASLKKLIDYHVASGTSAIVSVGTTGESATLNHDEHADVV MMTLDLADGRIPVIAGTGANATAEAISLTQRFNDSGIVGCLTVTPYYNRPSQEGLYQHFK AIAEHTDLPQILYNVPSRTGCDLLPETVGRLAKVKNIIGIKEATGNLTRVNQIKELVSDD FVLLSGDDASALDFMQLGGHGVISVTANVAARDMAQMCKLAAEGHFAEARVINQRLMPLH 
NKLFVEPNPIPVKWACKELGLVATDTLRLPMTPITDSGRETVRAALKHAGLL >gi|296494671|gb|ADTN01000067.1| GENE 33 36284 - 36856 311 190 aa, chain + ## HITS:1 COG:ECs3341 KEGG:ns NR:ns ## COG: ECs3341 COG2716 # Protein_GI_number: 15832595 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system regulatory protein # Organism: Escherichia coli O157:H7 # 1 190 23 212 212 374 99.0 1e-104 MTLSSQHYLVITALGADRPGIVNTITRHVSSCGCNIEDSRLAMLGEEFTFIMLLSGSWNA ITLIESTLPLKGAELDLLIVMKRTTARPRPPMPASVWVQVDVADSPHLIERFTALFDAHH MNIAELVSRTQPAENERAAQLHIQITAHSPASADAANIEQAFKALCTELNAQGSINVVNY SQHDEQDGVK >gi|296494671|gb|ADTN01000067.1| GENE 34 36856 - 37326 554 156 aa, chain + ## HITS:1 COG:ECs3342 KEGG:ns NR:ns ## COG: ECs3342 COG1225 # Protein_GI_number: 15832596 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Escherichia coli O157:H7 # 1 156 1 156 156 322 100.0 1e-88 MNPLKAGDIAPKFSLPDQDGEQVNLTDFQGQRVLVYFYPKAMTPGCTVQACGLRDNMDEL KKAGVDVLGISTDKPEKLSRFAEKELLNFTLLSDEDHQVCEQFGVWGEKSFMGKTYDGIH RISFLIDADGKIEHVFDDFKTSNHHDVVLNWLKEHA >gi|296494671|gb|ADTN01000067.1| GENE 35 37579 - 38196 308 205 aa, chain + ## HITS:1 COG:hyfA KEGG:ns NR:ns ## COG: hyfA COG1142 # Protein_GI_number: 16130406 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 2 # Organism: Escherichia coli K12 # 1 205 14 218 218 387 100.0 1e-108 MNRFVVAEPLWCTGCNTCLAACSDVHKTQGLQQHPRLALAKTSTITAPVVCHHCEEAPCL QVCPVNAISQRDDAIQLNESLCIGCKLCAVVCPFGAISASGSRPVNAHAQYVFQAEGSLK DGEENAPTQHALLRWEPGVQTVAVKCDLCDFLPEGPACVRACPNQALRLITGDSLQRQMK EKQRLAASWFANGGEDPLSLTQEQR >gi|296494671|gb|ADTN01000067.1| GENE 36 38196 - 40214 1689 672 aa, chain + ## HITS:1 COG:hyfB KEGG:ns NR:ns ## COG: hyfB COG0651 # Protein_GI_number: 16130407 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit # Organism: Escherichia 
coli K12 # 1 672 1 672 672 1102 99.0 0 MDALQLLTWSLILYLFASLASLFLLGLDRLAIKLSGITSLVGGVIGIISGITQLHAGVTL VARFATPFEFADLTLRMDSLSAFMVLVISLLVVVCSLYSLTYMREYEGKGAAAMGFFMNI FIASMVALLVMDNAFWFIVLFEMMSLSSWFLVIARQDKTSINAGMLYFFIAHAGSVLIMI AFLLMGRESGSLDFASFRTLSLSPGLASAVFLLAFFGFGAKAGMMPLHSWLPRAHPAAPS HASALMSGVMVKIGIFGILKVAMDLLAQTGLPLWWGILVMAIGAISALLGVLYALAEQDI KRLLAWSTVENVGIILLAVGVAMVGLSLHDPLLTVVGLLGALFHLLNHALFKGLLFLGAG AIISRLHTHDMEKMGALAKRMPWTAAACLIGCLAISAIPPLNGFISEWYTWQSLFSLSRV EAVALQLAGPIAMVMLAVTGGLAVMCFVKMYGITFCGAPRSTHAEEAQEVPNTMIVAMLL LAALCVLIALSASWLAPKIMHIAHAFTNTPPATVASGIALVPGTFHTQVTPSLLLLLLLA MPLLPGLYWLWCRSRRAAFRRTGDAWACGYGWENAMAPSGNGVMQPLRVVFSALFRLRQQ LDPTLRLNKGLAHVTARAQSTEPFWDERVIRPIVSATQRLAKEIQHLQSGDFRLYCLYVV AALVVLLIAIAV >gi|296494671|gb|ADTN01000067.1| GENE 37 40225 - 41172 935 315 aa, chain + ## HITS:1 COG:hyfC KEGG:ns NR:ns ## COG: hyfC COG0650 # Protein_GI_number: 16130408 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 4 # Organism: Escherichia coli K12 # 1 315 8 322 322 508 100.0 1e-144 MRQTLCDGYLVIFALAQAVILLMLTPLFTGISRQIRARMHSRRGPGIWQDYRDIHKLFKR QEVAPTSSGLMFRLMPWVLISSMLVLAMALPLFITVSPFAGGGDLITLIYLLALFRFFFA LSGLDTGSPFAGVGASRELTLGILVEPMLILSLLVLALIAGSTHIEMISNTLAMGWNSPL TTVLALLACGFACFIEMGKIPFDVAEAEQELQEGPLTEYSGAGLALAKWGLGLKQVVMAS LFVALFLPFGRAQELSLACLLTSLVVTLLKVLLIFVLASIAENTLARGRFLLIHHVTWLG FSLAALAWVFWLTGL >gi|296494671|gb|ADTN01000067.1| GENE 38 41189 - 42628 1246 479 aa, chain + ## HITS:1 COG:hyfD KEGG:ns NR:ns ## COG: hyfD COG1009 # Protein_GI_number: 16130409 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit # Organism: Escherichia coli K12 # 1 479 1 479 479 830 100.0 0 MENLALTTLLLPFIGALVVSFSPQRRAAEWGVLFAALTTLCMLSLISAFYQADKVAVTLT LVNVGDVALFGLVIDRVSTLILFVVVFLGLLVTIYSTGYLTDKNREHPHNGTNRYYAFLL VFIGAMAGLVLSSTLLGQLLFFEITGGCSWALISYYQSDKAQRSALKALLITHIGSLGLY 
LAAATLFLQTGTFALSAMSELHGDARYLVYGGILFAAWGKSAQLPMQAWLPDAMEAPTPI SAYLHAASMVKVGVYIFARAIIDGGNIPHVIGGVGMVMALVTILYGFLMYLPQQDMKRLL AWSTITQLGWMFFGLSLSIFGSRLALEGSIAYIVNHAFAKSLFFLVAGALSYSCGTRLLP RLRGVLHTLPLPGVGFCVAALAITGVPPFNGFFSKFPLFAAGFALSVEYWILLPAMILLM IESVASFAWFIRWFGRVVPGKPSEAVADAAPLPGSMRLVLIVLIVMSLISSVIAATWLQ >gi|296494671|gb|ADTN01000067.1| GENE 39 42640 - 43290 655 216 aa, chain + ## HITS:1 COG:ECs3347 KEGG:ns NR:ns ## COG: ECs3347 COG4237 # Protein_GI_number: 15832601 # Func_class: C Energy production and conversion # Function: Hydrogenase 4 membrane component (E) # Organism: Escherichia coli O157:H7 # 1 216 1 216 216 339 100.0 2e-93 MTGSMIVNNLAGLMMLTSLFVISVKSYRLSCGFYACQSLVLVSIFATLSCLFAAEQLLIW SASAFITKVLLVPLIMTYAARNIPQNIPEKALFGPAMMALLAALIVLLCAFVVQPVKLPM ATGLKPALAVALGHFLLGLLCIVSQRNILRQIFGYCLMENGSHLVLALLAWRAPELVEIG IATDAIFAVIVMVLLARKIWRTHGTLDVNNLTALKG >gi|296494671|gb|ADTN01000067.1| GENE 40 43295 - 44875 1199 526 aa, chain + ## HITS:1 COG:hyfF KEGG:ns NR:ns ## COG: hyfF COG0651 # Protein_GI_number: 16130411 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit # Organism: Escherichia coli K12 # 1 526 1 526 526 872 99.0 0 MSYSVMFALLLLTPLLFSLLCFACRKRRLSATRTVTVLHSLGITLLLILALWVVQTAADA GEIFAAGLWLHIDGLGGLFLAILGVIGFLTGIYSIGYMCHEVAHGELSPVTLCDYYGFFH LFLFTMLLVVTSNNLIVMWAAIEATTLSSAFLVGIYGQRSSLEAAWKYIIICTVGVAFGL FGTVLVYANAASVMPQAEMAIFWSEVLKQSSLLDPTLMLLAFVFLLIGFGTKTGLFPMHA WLPDAHSEAPSPVSALLSAVLLNCALLVLIRYYIIICQAIGSDFPNRLLLIFGMLSVAVA AFFILVQRDIKRLLAYSSVENMGLVAVELGIGGPLGIFAALLHILNHSLAKTLLFCGSGN VLLKYGTRDLNVVCGMLKIMPFTAVLFGGGALALAGMPPFNIFLSEFMTITAGLARNHLL IIVLLLLLLTLVLAGLVRMAARVLMAKPPQAVNRGDLGWLTTSPMVILLVMMLAMGTHIP QPVIRILAGASTIVLSGTHDLPAQRSTWHDFLPSGTASVSEKHSER >gi|296494671|gb|ADTN01000067.1| GENE 41 44865 - 46532 1813 555 aa, chain + ## HITS:1 COG:ECs3349_2 KEGG:ns NR:ns ## COG: ECs3349_2 COG3261 # Protein_GI_number: 15832603 # Func_class: C Energy production 
and conversion # Function: Ni,Fe-hydrogenase III large subunit # Organism: Escherichia coli O157:H7 # 156 555 1 416 416 823 95.0 0 MNVNSSSNRGEAILAALKTQFPGAVLDEERQTPEQVTITVKINLLPDVVQYLYYQHDGWL PVLFGNDERTLNGHYAVYYALSMEGAEKCWIVVKALVDADSREFPSVTPRVPAAVWGERE IRDMYGLIPVGLPDQRRLVLPDDWPEDMHPLRKDAMDYRLRPEPTTDSETYPFINEGNSD ARVIPVGPLHITSDEPGHFRLFVDGEQIVDADYRLFYVHRGMEKLAETRMGYNEVTFLSD RVCGICGFAHSVAYTNSVENALGIEVPQRAHTIRSILLEVERLHSHLLNLGLSCHFVGFD TGFMQFFRVREKSMTMAELLIGSRKTYGLNLIGGVRRDILKEQRLQTLKLVREMRADVSE LVEMLLATPNMEQRTQGIGILDRQIARDLRFDHPYADYGNIPKTLFTFTGGDVFSRVMVR VKETFDSLAMLEFALDNMPDTPLLTEGFSYKPHAFALGFVEAPRGEDVHWSMLGDNQKLF RWRCRAATYANWPVLRYMLRGNTVSDAPLIIGSLDPCYSCTDRVTLVDVRKRQSKTVPYK EIERYGIDRNRSPLK >gi|296494671|gb|ADTN01000067.1| GENE 42 46542 - 47087 372 181 aa, chain + ## HITS:1 COG:hyfH KEGG:ns NR:ns ## COG: hyfH COG1143 # Protein_GI_number: 16130413 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) # Organism: Escherichia coli K12 # 1 181 1 181 181 306 100.0 1e-83 MLKLLKTIMRAGTATVKYPFAPLEVSPGFRGKPDLMPSQCIACGACACACPANALTIQTD DQQNSRTWQLYLGRCIYCGRCEEVCPTRAIQLTNNFELTVTNKADLYTRATFHLQRCSRC ERPFAPQKTIALAAELLAQQQNAPQNREMLWAQASVCPECKQRATLINDDTDVLLVAKEQ L >gi|296494671|gb|ADTN01000067.1| GENE 43 47084 - 47842 545 252 aa, chain + ## HITS:1 COG:hyfI KEGG:ns NR:ns ## COG: hyfI COG3260 # Protein_GI_number: 16130414 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase III small subunit # Organism: Escherichia coli K12 # 1 252 1 252 252 521 100.0 1e-148 MSPVLTQHVSQPITLDEQTQKMKRHLLQDIRRSAYVYRVDCGGCNACEIEIFAAITPVFD AERFGIKVVSSPRHADILLFTGAVTRAMRMPALRAYESAPDHKICVSYGACGVGGGIFHD LYSVWGGSDTIVPIDVWIPGCPPTPAATIHGFAVALGLLQQKIHAVDYRDPTGVTMQPLW PQIPPSQRIAIEREARRLAGYRQGREICDRLLRHLSDDPTGNRVNTWLRDADDPRLNSIV QQLFRVLRGLHD >gi|296494671|gb|ADTN01000067.1| GENE 44 47835 - 48248 312 137 aa, chain + ## HITS:1 COG:no KEGG:SSON_2571 NR:ns ## KEGG: SSON_2571 # Name: not_defined # Def: 
putative protein processing element # Organism: S.sonnei # Pathway: not_defined # 1 137 22 158 158 272 99.0 2e-72 MTEECGEIVFWTLRKKFVASSDEMPEHSSQVMYYSLAIGHHVGVIDCLNVAFRCPLTEYE DWLALVEEEQARRKMLGVMTFGEIVIDASHTALLTRAFAPLADDATSVWQARSIQFIHLL DEIVQEPAIYLMARKIA >gi|296494671|gb|ADTN01000067.1| GENE 45 48278 - 50290 1346 670 aa, chain + ## HITS:1 COG:hyfR KEGG:ns NR:ns ## COG: hyfR COG3604 # Protein_GI_number: 16130416 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Escherichia coli K12 # 8 670 1 663 663 1314 99.0 0 MAMSDEAMFAPPQGITIEAVNGMLAERLAQKHGKASLLRAFIPLPPPFSPVQLIELHVLK SNFYYRYHDDGSDVTATTEYQGEMVDYSRHAVLLGSSGMAELRFIRTHGSRFTSQDCTLF NWLARIITPVLQSWLNDEEQQVALRLLEKDRDHHRVLVDITNAVLSHLDLDDLIADVARE IHHFFGLASVSMVLGDHRKNEKFSLWCSDLSASHCACLPRCMPGESVLLTQTLQTRQPTL THRADDLFLWQRDPLLLLLASNGCESALLIPLTFGNHTPGALLLAHTSSTLFSEENCQLL QHIADRIAIAVGNADAWRSMTDLQESLQQENHQLSEQLLSNLGIGDIIYQSQAMEDLLQQ VDIVAKSDSTVLICGETGTGKEVIARAIHQLSPRRDKPLVKINCAAIPASLLESELFGHD KGAFTGAINTHRGRFEIADGGTLFLDEIGDLPLELQPKLLRVLQEREIERLGGSRTIPVN VRVIAATNRDLWQMVEDRQFRSDLFYRLNVFPLELPPLRDRPEDIPLLAKHFTQKMARHM NRAIDAIPTEALRQLMSWDWPGNVHELENVIERAVLLTRGNSLNLHLNVRQSRLLPTLNE DSALRSSMAQLLHPTTPENDEEERQRIVQVLRETNGIVAGPRGAATRLGMKRTTLLSRMQ RLGISVREVL >gi|296494671|gb|ADTN01000067.1| GENE 46 50312 - 51160 559 282 aa, chain + ## HITS:1 COG:focB KEGG:ns NR:ns ## COG: focB COG2116 # Protein_GI_number: 16130417 # Func_class: P Inorganic ion transport and metabolism # Function: Formate/nitrite family of transporters # Organism: Escherichia coli K12 # 1 282 1 282 282 507 100.0 1e-144 MRNKLSFDLQLSARKAAIAERIAAHKIARSKVSVFLMAMSAGVFMAIGFTFYLSVIADAP SSQALTHLVGGLCFTLGFILLAVCGTSLFTSSVMTVMAKSRGVISWRTWLINALLVACGN LAGIACFSLLIWFSGLVMSENAMWGVAVLHCAEGKMHHTFTESVSLGIMCNLMVCLALWM SYCGRSLCDKIVAMILPITLFVASGFEHCIANLFVIPFAIAIRHFAPPPFWQLAHSSADN FPALTVSHFITANLLPVMLGNIIGGAVLVSMCYRAIYLRQEP >gi|296494671|gb|ADTN01000067.1| GENE 47 51198 - 52259 1363 
353 aa, chain - ## HITS:1 COG:ECs3355 KEGG:ns NR:ns ## COG: ECs3355 COG0628 # Protein_GI_number: 15832609 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Escherichia coli O157:H7 # 1 353 1 353 353 588 100.0 1e-168 MLEMLMQWYRRRFSDPEAIALLVILVAGFGIIFFFSGLLAPLLVAIVLAYLLEWPTVRLQ SIGCSRRWATSIVLVVFVGILLLMAFVVLPIAWQQGIYLIRDMPGMLNKLSDFAATLPRR YPALMDAGIIDAMAENMRSRMLTMGDSVVKISLASLVGLLTIAVYLVLVPLMVFFLLKDK EQMLNAVRRVLPRNRGLAGQVWKEMNQQITNYIRGKVLEMIVVGIATWLGFLLFGLNYSL LLAVLVGFSVLIPYIGAFVVTIPVVGVALFQFGAGTEFWSCFAVYLIIQALDGNLLVPVL FSEAVNLHPLVIILSVVIFGGLWGFWGVFFAIPLATLIKAVIHAWPDGQIAQE >gi|296494671|gb|ADTN01000067.1| GENE 48 52472 - 53935 1779 487 aa, chain + ## HITS:1 COG:yfgC KEGG:ns NR:ns ## COG: yfgC COG4783 # Protein_GI_number: 16130419 # Func_class: R General function prediction only # Function: Putative Zn-dependent protease, contains TPR repeats # Organism: Escherichia coli K12 # 1 487 1 487 487 872 99.0 0 MFRQLKKNLVATLIAAMTIGQVAPAFADSADTLPDMGTSAGSTLSIGQEMQMGDYYVRQL RGSAPLINDPLLTQYINSLGMRLVSHANSVKTPFHFFLINNDEINAFAFFGGNVVLHSAL FRYSDNESQLASVMAHEISHVTQRHLARAMEDQQRSAPLTWVGALGSILLAMASPQAGMA ALTGTLAGTRQGMISFTQQNEQEADRIGIQVLQRSGFDPQAMPTFLEKLLDQARYSSRPP EILLTHPLPESRLADARNRANQMRPMVVQSSEDFYLAKARTLGMYNSGRNQLTSDLLDEW AKGNVRQQRAAQYGRALQAMEANKYDEARKTLQPLLAAEPGHAWYLDLATDIDLGQNKAN EAINRLKNARDLRTNPVLQLNLANAYLQGGQPQEAANILNRYTFNNKDDSNGWDLLAQAE AALNNRDQELAARAEGYALAGRLDQAISLLSSASSQVKLGSLQQARYDARIDQLRQLQER FKPYTKM >gi|296494671|gb|ADTN01000067.1| GENE 49 53956 - 54315 547 119 aa, chain + ## HITS:1 COG:yfgD KEGG:ns NR:ns ## COG: yfgD COG1393 # Protein_GI_number: 16130420 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Escherichia coli K12 # 1 119 1 119 119 207 100.0 4e-54 MTKQVKIYHNPRCSKSRETLNLLKENGVEPEVVLYLETPADAATLRDLLKILGMNSAREL MRQKEDLYKELNLADSSLSEEALIQAMVDNPKLMERPIVVANGKARIGRPPEQVLEIVG >gi|296494671|gb|ADTN01000067.1| GENE 50 54453 - 55130 567 225 aa, chain - 
## HITS:1 COG:ECs3358 KEGG:ns NR:ns ## COG: ECs3358 COG0593 # Protein_GI_number: 15832612 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # Organism: Escherichia coli O157:H7 # 1 225 24 248 248 447 99.0 1e-126 MPLYLPDDETFASFWPGDNSSLLAALQNVLRQEHSGYIYLWAREGAGRSHLLHAACAELS QRGDAVGYVPLDKRTWFVPEVLDGMEHLSLVCIDNIECIAGDELWEMAIFDLYNRILESG KTRLLITGDRPPRQLNLGLPDLASRLDWGQIYKLQPLSDEDKLQALQLRARLRGFELPED VGRFLLKRLDREMRTLFMTLDQLDRASITAQRKLTIPFVKEILKL Prediction of potential genes in microbial genomes Time: Sun May 15 23:21:18 2011 Seq name: gi|296494670|gb|ADTN01000068.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont170.3, whole genome shotgun sequence Length of sequence - 7925 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 - CDS 15 - 1304 1017 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 - Prom 1329 - 1388 7.7 - Term 1339 - 1371 3.1 2 1 Op 2 . - CDS 1390 - 2016 853 ## COG0035 Uracil phosphoribosyltransferase - Prom 2052 - 2111 7.5 + Prom 2238 - 2297 7.0 3 2 Op 1 21/0.000 + CDS 2341 - 3378 1238 ## PROTEIN SUPPORTED gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase 4 2 Op 2 4/0.000 + CDS 3378 - 4016 721 ## COG0299 Folate-dependent phosphoribosylglycinamide formyltransferase PurN + Term 4034 - 4065 2.4 + Prom 4093 - 4152 2.8 5 2 Op 3 11/0.000 + CDS 4188 - 6254 2129 ## COG0855 Polyphosphate kinase 6 2 Op 4 . 
+ CDS 6259 - 7800 1494 ## COG0248 Exopolyphosphatase + Term 7806 - 7846 10.3 Predicted protein(s) >gi|296494670|gb|ADTN01000068.1| GENE 1 15 - 1304 1017 429 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 3 423 2 428 447 396 47 1e-110 MTRRAIGVSERPPLLQTIPLSLQHLFAMFGATVLVPVLFHINPATVLLFNGIGTLLYLFI CKGKIPAYLGSSFAFISPVLLLLPLGYEVALGGFIMCGVLFCLVSFIVKKAGTGWLDVLF PPAAMGAIVAVIGLELAGVAAGMAGLLPAEGQTPDSKTIIISITTLAVTVLGSVLFRGFL AIIPILIGVLVGYALSFAMGIVDTTPIINAHWFALPTLYTPRFEWFAILTILPAALVVIA EHVGHLVVTANIVKKDLLRDPGLHRSMFANGLSTVISGFFGSTPNTTYGENIGVMAITRV YSTWVIGGAAIFAILLSCVGKLAAAIQMIPLPVMGGVSLLLYGVIGASGIRVLIESKVDY NKAQNLILTSVILIIGVSGAKVNIGAAELKGMALATIVGIGLSLIFKLISVLRPEEVVLD AEDADITDK >gi|296494670|gb|ADTN01000068.1| GENE 2 1390 - 2016 853 208 aa, chain - ## HITS:1 COG:ECs3360 KEGG:ns NR:ns ## COG: ECs3360 COG0035 # Protein_GI_number: 15832614 # Func_class: F Nucleotide transport and metabolism # Function: Uracil phosphoribosyltransferase # Organism: Escherichia coli O157:H7 # 1 208 10 217 217 394 100.0 1e-110 MKIVEVKHPLVKHKLGLMREQDISTKRFRELASEVGSLLTYEATADLETEKVTIEGWNGP VEIDQIKGKKITVVPILRAGLGMMDGVLENVPSARISVVGMYRNEETLEPVPYFQKLVSN IDERMALIVDPMLATGGSVIATIDLLKKAGCSSIKVLVLVAAPEGIAALEKAHPDVELYT ASIDQGLNEHGYIIPGLGDAGDKIFGTK >gi|296494670|gb|ADTN01000068.1| GENE 3 2341 - 3378 1238 345 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase [Acinetobacter baumannii SDF] # 2 345 7 356 356 481 68 1e-136 MTDKTSLSYKDAGVDIDAGNALVGRIKGVVKKTRRPEVMGGLGGFGALCALPQKYREPVL VSGTDGVGTKLRLAMDLKRHDTIGIDLVAMCVNDLVVQGAEPLFFLDYYATGKLDVDTAS AVISGIAEGCLQSGCSLVGGETAEMPGMYHGEDYDVAGFCVGVVEKSEIIDGSKVSDGDV LIALGSSGPHSNGYSLVRKILEVSGCDPQTTELDGKPLADHLLAPTRIYVKSVLELIEKV DVHAIAHLTGGGFWENIPRVLPDNTQAVIDESSWQWPEVFNWLQTAGNVEHHEMYRTFNC GVGMIIALPAPEVDKALALLNANGENAWKIGIIKASDSEQRVVIE >gi|296494670|gb|ADTN01000068.1| GENE 4 3378 - 4016 721 212 aa, chain + ## HITS:1 COG:purN KEGG:ns NR:ns ## COG: purN 
COG0299 # Protein_GI_number: 16130425 # Func_class: F Nucleotide transport and metabolism # Function: Folate-dependent phosphoribosylglycinamide formyltransferase PurN # Organism: Escherichia coli K12 # 1 212 1 212 212 431 100.0 1e-121 MNIVVLISGNGSNLQAIIDACKTNKIKGTVRAVFSNKADAFGLERARQAGIATHTLIASA FDSREAYDRELIHEIDMYAPDVVVLAGFMRILSPAFVSHYAGRLLNIHPSLLPKYPGLHT HRQALENGDEEHGTSVHFVTDELDGGPVILQAKVPVFAGDSEDDITARVQTQEHAIYPLV ISWFADGRLKMHENAAWLDGQRLPPQGYAADE >gi|296494670|gb|ADTN01000068.1| GENE 5 4188 - 6254 2129 688 aa, chain + ## HITS:1 COG:ECs3363 KEGG:ns NR:ns ## COG: ECs3363 COG0855 # Protein_GI_number: 15832617 # Func_class: P Inorganic ion transport and metabolism # Function: Polyphosphate kinase # Organism: Escherichia coli O157:H7 # 1 688 1 688 688 1347 100.0 0 MGQEKLYIEKELSWLSFNERVLQEAADKSNPLIERMRFLGIYSNNLDEFYKVRFAELKRR IIISEEQGSNSHSRHLLGKIQSRVLKADQEFDGLYNELLLEMARNQIFLINERQLSVNQQ NWLRHYFKQYLRQHITPILINPDTDLVQFLKDDYTYLAVEIIRGDTIRYALLEIPSDKVP RFVNLPPEAPRRRKPMILLDNILRYCLDDIFKGFFDYDALNAYSMKMTRDAEYDLVHEME ASLMELMSSSLKQRLTAEPVRFVYQRDMPNALVEVLREKLTISRYDSIVPGGRYHNFKDF INFPNVGKANLVNKPLPRLRHIWFDKAQFRNGFDAIRERDVLLYYPYHTFEHVLELLRQA SFDPSVLAIKINIYRVAKDSRIIDSMIHAAHNGKKVTVVVELQARFDEEANIHWAKRLTE AGVHVIFSAPGLKIHAKLFLISRKENGEVVRYAHIGTGNFNEKTARLYTDYSLLTADARI TNEVRRVFNFIENPYRPVTFDYLMVSPQNSRRLLYEMVDREIANAQQGLPSGITLKLNNL VDKGLVDRLYAASSSGVPVNLLVRGMCSLIPNLEGISDNIRAISIVDRYLEHDRVYIFEN GGDKKVYLSSADWMTRNIDYRIEVATPLLDPRLKQRVLDIIDILFSDTVKARYIDKELSN RYVPRGNRRKVRAQLAIYDYIKSLEQPE >gi|296494670|gb|ADTN01000068.1| GENE 6 6259 - 7800 1494 513 aa, chain + ## HITS:1 COG:ECs3364 KEGG:ns NR:ns ## COG: ECs3364 COG0248 # Protein_GI_number: 15832618 # Func_class: F Nucleotide transport and metabolism; P Inorganic ion transport and metabolism # Function: Exopolyphosphatase # Organism: Escherichia coli O157:H7 # 1 513 1 513 513 1009 100.0 0 MPIHDKSPRPQEFAAVDLGSNSFHMVIARVVDGAMQIIGRLKQRVHLADGLGPDNMLSEE AMTRGLNCLSLFAERLQGFSPASVCIVGTHTLRQALNATDFLKRAEKVIPYPIEIISGNE 
EARLIFMGVEHTQPEKGRKLVIDIGGGSTELVIGENFEPILVESRRMGCVSFAQLYFPGG VINKENFQRARMAAAQKLETLTWQFRIQGWNVAMGASGTIKAAHEVLMEMGEKDGIITPE RLEKLVKEVLRHRNFASLSLPGLSEERKTVFVPGLAILCGVFDALAIRELRLSDGALREG VLYEMEGRFRHQDVRSRTASSLANQYHIDSEQARRVLDTTMQMYEQWREQQPKLAHPQLE ALLRWAAMLHEVGLNINHSGLHRHSAYILQNSDLPGFNQEQQLMMATLVRYHRKAIKLDD LPRFTLFKKKQFLPLIQLLRLGVLLNNQRQATTTPPTLTLITDDSHWTLRFPHDWFSQNA LVLLDLEKEQEYWEGVAGWRLKIEEESTPEIAA Prediction of potential genes in microbial genomes Time: Sun May 15 23:21:22 2011 Seq name: gi|296494669|gb|ADTN01000069.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont179.1, whole genome shotgun sequence Length of sequence - 12668 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 5, operones - 2 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 21/0.000 + CDS 3 - 821 834 ## COG1256 Flagellar hook-associated protein 2 1 Op 2 . + CDS 833 - 1786 958 ## COG1344 Flagellin and related hook-associated proteins + Term 1828 - 1868 7.1 - Term 1929 - 1960 3.2 3 2 Tu 1 . - CDS 1982 - 5167 3400 ## COG1530 Ribonucleases G and E + Prom 5580 - 5639 3.6 4 3 Tu 1 . + CDS 5740 - 6699 190 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 5 4 Tu 1 . 
- CDS 6811 - 7434 521 ## COG0424 Nucleotide-binding protein implicated in inhibition of septum formation - Prom 7560 - 7619 3.1 + Prom 7390 - 7449 5.4 6 5 Op 1 20/0.000 + CDS 7594 - 8115 435 ## COG1399 Predicted metal-binding, possibly nucleic acid-binding protein 7 5 Op 2 14/0.000 + CDS 8167 - 8340 292 ## PROTEIN SUPPORTED gi|15801206|ref|NP_287223.1| 50S ribosomal protein L32 8 5 Op 3 16/0.000 + CDS 8451 - 9491 650 ## COG0416 Fatty acid/phospholipid biosynthesis enzyme 9 5 Op 4 14/0.000 + CDS 9559 - 10512 856 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III 10 5 Op 5 26/0.000 + CDS 10528 - 11457 1008 ## COG0331 (acyl-carrier-protein) S-malonyltransferase 11 5 Op 6 22/0.000 + CDS 11470 - 12204 252 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 + Term 12233 - 12259 -0.6 + Prom 12317 - 12376 4.9 12 5 Op 7 . + CDS 12415 - 12651 414 ## COG0236 Acyl carrier protein Predicted protein(s) >gi|296494669|gb|ADTN01000069.1| GENE 1 3 - 821 834 272 aa, chain + ## HITS:1 COG:flgK KEGG:ns NR:ns ## COG: flgK COG1256 # Protein_GI_number: 16129045 # Func_class: N Cell motility # Function: Flagellar hook-associated protein # Organism: Escherichia coli K12 # 5 272 280 547 547 435 98.0 1e-122 TINKLESLPTFRSQDLDQTRNTLGQLALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAV LQNTKNKGDVAIGATVTDASAVLATDYKISFDNNQWQVTRLASNTTFTVTPDANGKVAFD GLELTFTGTPAVNDSFTLKPVSDAIVNMDVLITDEAKIAMASEEDAGDSDNRNGQALLDL QSNSKTVGGAKSFNDAYASLVSDIGNKTATLKTSSATQGNVVTQLSNQQQSISGVNLDEE YGNLQRFQQYYLANAQVLQTANAIFDALINIR >gi|296494669|gb|ADTN01000069.1| GENE 2 833 - 1786 958 317 aa, chain + ## HITS:1 COG:flgL KEGG:ns NR:ns ## COG: flgL COG1344 # Protein_GI_number: 16129046 # Func_class: N Cell motility # Function: Flagellin and related hook-associated proteins # Organism: Escherichia coli K12 # 1 317 1 317 317 514 100.0 1e-146 MRFSTQMMYQQNMRGITNSQAEWMKYGEQMSTGKRVVNPSDDPIAASQAVVLSQAQAQNS QYTLARTFATQKVSLEESVLSQVTTAIQNAQEKIVYASNGTLSDDDRASLATDIQGLRDQ 
LLNLANTTDGNGRYIFAGYKTETAPFSEEKGKYVGGAESIKQQVDASRSMVIGHTGDKIF DSITSNAVAEPDGSASETNLFAMLDSAIAALKTPVADSEADKETAAAALDKTNRGLKNSL NNVLTVRAELGTQLNELESLDSLGSDRALGQTQQMSDLVDVDWNATISSYIMQQTALQAS YKAFTDMQGLSLFQLSK >gi|296494669|gb|ADTN01000069.1| GENE 3 1982 - 5167 3400 1061 aa, chain - ## HITS:1 COG:rne KEGG:ns NR:ns ## COG: rne COG1530 # Protein_GI_number: 16129047 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonucleases G and E # Organism: Escherichia coli K12 # 1 1061 1 1061 1061 1486 99.0 0 MKRMLINATQQEELRVALVDGQRLYDLDIESPGHEQKKANIYKGKITRIEPSLEAAFVDY GAERHGFLPLKEIAREYFPANYSAHGRPNIKDVLREGQEVIVQIDKEERGNKGAALTTFI SLAGSYLVLMPNNPRAGGISRRIEGDDRTELKEALASLELPEGMGLIVRTAGVGKSAEAL QWDLSFRLKHWEAIKKAAESRPAPFLIHQESNVIVRAFRDYLRQDIGEILIDNPKVLELA RQHIAALGRPDFSSKIKLYTGEIPLFSHYQIESQIESAFQREVRLPSGGSIVIDSTEALT AIDINSARATRGGDIEETAFNTNLEAADEIARQLRLRDLGGLIVIDFIDMTPVRHQRAVE NRLREAVRQDRARIQISHISRFGLLEMSRQRLSPSLGESSHHVCPRCSGTGTVRDNESLS LSILRLIEEEALKENTQEVHAIVPVPIASYLLNEKRSAVNAIETRQDGVRCVIVPNDQME TPHYHVLRVRKGEETPTLSYMLPKLHEEAMALPSEEEFAERKRPEQPALATFAMPDVPPA PTPAEPAAPVVAPAPKAAPATPAAPAQPGLLSRFFGALKALFSGGEETKPTEQPAPKAEA KPERQQDRRKPRQNNRRDRNERRDTRSERTEGSDNREENRRNRRQAQQQTAETRESRQQA EVTEKARTADEQQAPRRERSRRRNDDKRQAQQEAKALNVEEQSVQETEQEERVRPVQPRR KQRQLNQKVRYEQSVAEEAVVAPVVEETVAAEPIVQEATAPRTELVKVPLPVVAQTAPEQ QEENNADNRDNGGMPRRSRRSPRHLRVSGQRRRRYRDERYPTQSPMPLTVACASPELASG KVWIRYPIVRPQDVQVEEQREQEEVHVQPMVTEVPVAAAIEPVVSAPVVEEVAGVVEAPV QVAEPQPEVVETTHPEVIAAAVTEQPQVITESDVAVAQEVAEQAEPVVEPQEETADIEEV VETAEVVVAEPEVVAQPAAPVVAEVAAEVETVAAVEPEVTVEHNHATAPMTRAPAPEYVP EAPRHSDWQRPTFAFEGKGAAGGHTATHHASAAPARPQPVE >gi|296494669|gb|ADTN01000069.1| GENE 4 5740 - 6699 190 319 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 95 309 83 284 285 77 29 4e-14 MKTETPSVKIVAITADEAGQRIDNFLRTQLKGVPKSMIYRILRKGEVRVNKKRIKPEYKL EAGDEVRIPPVRVAEREEEAVSPHLQKVAALADVILYEDDHILVLNKPSGTAVHGGSGLS 
FGVIEGLRALRPEARFLELVHRLDRDTSGVLLVAKKRSALRSLHEQLREKGMQKDYLALV RGQWQSHVKSVQAPLLKNILQSGERIVRVSQEGKPSETRFKVEERYAFATLVRCSPVTGR THQIRVHTQYAGHPIAFDDRYGDREFDRQLTEAGTGLNRLFLHAAALKFTHPGTGEVMRI EAPMDEGLKRCLQKLRNAR >gi|296494669|gb|ADTN01000069.1| GENE 5 6811 - 7434 521 207 aa, chain - ## HITS:1 COG:yceF KEGG:ns NR:ns ## COG: yceF COG0424 # Protein_GI_number: 16129050 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Nucleotide-binding protein implicated in inhibition of septum formation # Organism: Escherichia coli K12 # 1 207 1 207 207 418 100.0 1e-117 MAIINTLYLMEKNMPKLILASTSPWRRALLEKLQISFECAAPEVDETPRSDESPRQLVLR LAQEKAQSLASRYPDHLIIGSDQVCVLDGEITGKPLTEENARLQLRKASGNIVTFYTGLA LFNSANGHLQTEVEPFDVHFRHLSEAEIDNYVRKEHPLHCAGSFKSEGFGITLFERLEGR DPNTLVGLPLIALCQMLRREGKNPLMG >gi|296494669|gb|ADTN01000069.1| GENE 6 7594 - 8115 435 173 aa, chain + ## HITS:1 COG:ECs1466 KEGG:ns NR:ns ## COG: ECs1466 COG1399 # Protein_GI_number: 15830720 # Func_class: R General function prediction only # Function: Predicted metal-binding, possibly nucleic acid-binding protein # Organism: Escherichia coli O157:H7 # 1 173 1 173 173 320 100.0 7e-88 MQKVKLPLTLDPVRTAQKRLDYQGIYTPDQVERVAESVVSVDSDVECSMSFAIDNQRLAV LNGDAKVTVTLECQRCGKPFTHQVYTTYCFSPVRSDEQAEALPEAYEPIEVNEFGEIDLL AMVEDEIILALPVVPVHDSEHCEVSEADMVFGELPEEAQKPNPFAVLASLKRK >gi|296494669|gb|ADTN01000069.1| GENE 7 8167 - 8340 292 57 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15801206|ref|NP_287223.1| 50S ribosomal protein L32 [Escherichia coli O157:H7 EDL933] # 1 57 1 57 57 117 100 6e-26 MAVQQNKPTRSKRGMRRSHDALTAVTSLSVDKTSGEKHLRHHITADGYYRGRKVIAK >gi|296494669|gb|ADTN01000069.1| GENE 8 8451 - 9491 650 346 aa, chain + ## HITS:1 COG:ZplsX KEGG:ns NR:ns ## COG: ZplsX COG0416 # Protein_GI_number: 15801207 # Func_class: I Lipid transport and metabolism # Function: Fatty acid/phospholipid biosynthesis enzyme # Organism: Escherichia coli O157:H7 EDL933 # 1 346 1 346 346 633 99.0 0 MGGDFGPSVTVPAALQALNSNSQLTLLLVGNPDAITPLLAKADFEQRSRLQIIPAQSVIA 
SDARPSQAIRASRGSSMRVALELVKEGRAQACVSAGNTGALMGLAKLLLKPLEGIERPAL VTVLPHQQKGKTVVLDLGANVDCDSTMLVQFAIMGSVLAEEVVEIPNPRVALLNIGEEEV KGLDSIRDASAVLKTIPSINYIGYLEANELLTGKTDVLVCDGFTGNVTLKTMEGVVRMFL SLLKSQGEGKKRSWWLLLLKRWLQKSLTRRFSHLNPDQYNGACLLGLRGTVIKSHGAANQ RAFAVAIEQAVQAVQRQVPQRIAARLESVYPAGFELLDGGKSGTLR >gi|296494669|gb|ADTN01000069.1| GENE 9 9559 - 10512 856 317 aa, chain + ## HITS:1 COG:ECs1469 KEGG:ns NR:ns ## COG: ECs1469 COG0332 # Protein_GI_number: 15830723 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Escherichia coli O157:H7 # 1 317 1 317 317 612 100.0 1e-175 MYTKIIGTGSYLPEQVRTNADLEKMVDTSDEWIVTRTGIRERHIAAPNETVSTMGFEAAT RAIEMAGIEKDQIGLIVVATTSATHAFPSAACQIQSMLGIKGCPAFDVAAACAGFTYALS VADQYVKSGAVKYALVVGSDVLARTCDPTDRGTIIIFGDGAGAAVLAASEEPGIISTHLH ADGSYGELLTLPNADRVNPENSIHLTMAGNEVFKVAVTELAHIVDETLAANNLDRSQLDW LVPHQANLRIISATAKKLGMSMDNVVVTLDRHGNTSAASVPCALDEAVRDGRIKPGQLVL LEAFGGGFTWGSALVRF >gi|296494669|gb|ADTN01000069.1| GENE 10 10528 - 11457 1008 309 aa, chain + ## HITS:1 COG:fabD KEGG:ns NR:ns ## COG: fabD COG0331 # Protein_GI_number: 16129055 # Func_class: I Lipid transport and metabolism # Function: (acyl-carrier-protein) S-malonyltransferase # Organism: Escherichia coli K12 # 1 309 1 309 309 532 100.0 1e-151 MTQFAFVFPGQGSQTVGMLADMAASYPIVEETFAEASAALGYDLWALTQQGPAEELNKTW QTQPALLTASVALYRVWQQQGGKAPAMMAGHSLGEYSALVCAGVIDFADAVRLVEMRGKF MQEAVPEGTGAMAAIIGLDDASIAKACEEAAEGQVVSPVNFNSPGQVVIAGHKEAVERAG AACKAAGAKRALPLPVSVPSHCALMKPAADKLAVELAKITFNAPTVPVVNNVDVKCETNG DAIRDALVRQLYNPVQWTKSVEYMAAQGVEHLYEVGPGKVLTGLTKRIVDTLTASALNEP SAMAAALEL >gi|296494669|gb|ADTN01000069.1| GENE 11 11470 - 12204 252 244 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 6 243 4 242 242 101 29 2e-21 MNFEGKIALVTGASRGIGRAIAETLAARGAKVIGTATSENGAQAISDYLGANGKGLMLNV TDPASIESVLEKIRAEFGEVDILVNNAGITRDNLLMRMKDEEWNDIIETNLSSVFRLSKA 
VMRAMMKKRHGRIITIGSVVGTMGNGGQANYAAAKAGLIGFSKSLAREVASRGITVNVVA PGFIETDMTRALSDDQRAGILAQVPAGRLGGAQEIANAVAFLASDEAAYITGETLHVNGG MYMV >gi|296494669|gb|ADTN01000069.1| GENE 12 12415 - 12651 414 78 aa, chain + ## HITS:1 COG:ECs1472 KEGG:ns NR:ns ## COG: ECs1472 COG0236 # Protein_GI_number: 15830726 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Escherichia coli O157:H7 # 1 78 1 78 78 108 100.0 2e-24 MSTIEERVKKIIGEQLGVKQEEVTNNASFVEDLGADSLDTVELVMALEEEFDTEIPDEEA EKITTVQAAIDYINGHQA Prediction of potential genes in microbial genomes Time: Sun May 15 23:21:29 2011 Seq name: gi|296494668|gb|ADTN01000070.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont179.2, whole genome shotgun sequence Length of sequence - 17517 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 7, operones - 2 average op.length - 6.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 + CDS 68 - 1309 1244 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase + Term 1335 - 1362 -0.1 + Prom 1350 - 1409 2.9 2 1 Op 2 6/0.000 + CDS 1429 - 2238 497 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase 3 1 Op 3 10/0.000 + CDS 2241 - 3263 1110 ## COG1559 Predicted periplasmic solute-binding protein 4 1 Op 4 22/0.000 + CDS 3253 - 3894 807 ## COG0125 Thymidylate kinase 5 1 Op 5 10/0.000 + CDS 3891 - 4895 782 ## COG0470 ATPase involved in DNA replication 6 1 Op 6 6/0.000 + CDS 4906 - 5703 783 ## COG0084 Mg-dependent DNase + Prom 5844 - 5903 5.6 7 1 Op 7 . + CDS 5998 - 7431 1672 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific + Term 7444 - 7496 5.1 - Term 7444 - 7473 2.8 8 2 Tu 1 . 
- CDS 7491 - 9680 2080 ## COG4773 Outer membrane receptor for ferric coprogen and ferric-rhodotorulic acid - Prom 9777 - 9836 4.2 9 3 Op 1 4/0.333 + CDS 10014 - 10373 478 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases 10 3 Op 2 6/0.000 + CDS 10376 - 10753 194 ## COG5633 Predicted periplasmic lipoprotein 11 3 Op 3 6/0.000 + CDS 10767 - 11408 628 ## COG3417 Collagen-binding surface adhesin SpaP (antigen I/II family) 12 3 Op 4 5/0.000 + CDS 11389 - 12213 357 ## COG0510 Predicted choline kinase involved in LPS biosynthesis 13 3 Op 5 2/1.000 + CDS 12224 - 13249 1124 ## COG1472 Beta-glucosidase-related glycosidases 14 3 Op 6 2/1.000 + CDS 13272 - 13814 522 ## COG3150 Predicted esterase + Term 13820 - 13874 -0.8 + Prom 14115 - 14174 5.2 15 4 Tu 1 4/0.333 + CDS 14214 - 15518 1363 ## COG1252 NADH dehydrogenase, FAD-containing subunit + Prom 15608 - 15667 2.1 16 5 Tu 1 . + CDS 15728 - 16267 592 ## COG3134 Predicted outer membrane lipoprotein + Term 16290 - 16329 3.1 - Term 16246 - 16282 -0.4 17 6 Tu 1 . - CDS 16329 - 17039 463 ## COG1309 Transcriptional regulator - Prom 17098 - 17157 10.0 + Prom 17034 - 17093 9.8 18 7 Tu 1 . 
+ CDS 17202 - 17459 283 ## ECO103_1157 hypothetical protein Predicted protein(s) >gi|296494668|gb|ADTN01000070.1| GENE 1 68 - 1309 1244 413 aa, chain + ## HITS:1 COG:ECs1473 KEGG:ns NR:ns ## COG: ECs1473 COG0304 # Protein_GI_number: 15830727 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Escherichia coli O157:H7 # 1 413 1 413 413 729 100.0 0 MSKRRVVVTGLGMLSPVGNTVESTWKALLAGQSGISLIDHFDTSAYATKFAGLVKDFNCE DIISRKEQRKMDAFIQYGIVAGVQAMQDSGLEITEENATRIGAAIGSGIGGLGLIEENHT SLMNGGPRKISPFFVPSTIVNMVAGHLTIMYGLRGPSISIATACTSGVHNIGHAARIIAY GDADVMVAGGAEKASTPLGVGGFGAARALSTRNDNPQAASRPWDKERDGFVLGDGAGMLV LEEYEHAKKRGAKIYAELVGFGMSSDAYHMTSPPENGAGAALAMANALRDAGIEASQIGY VNAHGTSTPAGDKAEAQAVKTIFGEAASRVLVSSTKSMTGHLLGAAGAVESIYSILALRD QAVPPTINLDNPDEGCDLDFVPHEARQVSGMEYTLCNSFGFGGTNGSLIFKKI >gi|296494668|gb|ADTN01000070.1| GENE 2 1429 - 2238 497 269 aa, chain + ## HITS:1 COG:pabC KEGG:ns NR:ns ## COG: pabC COG0115 # Protein_GI_number: 16129059 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Escherichia coli K12 # 1 269 1 269 269 551 100.0 1e-157 MFLINGHKQESLAVSDRATQFGDGCFTTARVIDGKVSLLSAHIQRLQDACQRLMISCDFW PQLEQEMKTLAAEQQNGVLKVVISRGSGGRGYSTLNSGPATRILSVTAYPAHYDRLRNEG ITLALSPVRLGRNPHLAGIKHLNRLEQVLIRSHLEQTNADEALVLDSEGWVTECCAANLF WRKGNVVYTPRLDQAGVNGIMRQFCIRLLAQSSYQLVEVQASLEESLQADEMVICNALMP VMPVCACGDVSFSSATLYEYLAPLCERPN >gi|296494668|gb|ADTN01000070.1| GENE 3 2241 - 3263 1110 340 aa, chain + ## HITS:1 COG:yceG KEGG:ns NR:ns ## COG: yceG COG1559 # Protein_GI_number: 16129060 # Func_class: R General function prediction only # Function: Predicted periplasmic solute-binding protein # Organism: Escherichia coli K12 # 1 340 1 340 340 653 100.0 0 MKKVLLIILLLLVVLGIAAGVGVWKVRHLADSKLLIKEETIFTLKPGTGRLALGEQLYAD 
KIINRPRVFQWLLRIEPDLSHFKAGTYRFTPQMTVREMLKLLESGKEAQFPLRLVEGMRL SDYLKQLREAPYIKHTLSDDKYATVAQALELENPEWIEGWFWPDTWMYTANTTDVALLKR AHKKMVKAVDSAWEGRADGLPYKDKNQLVTMASIIEKETAVASERDKVASVFINRLRIGM RLQTDPTVIYGMGERYNGKLSRADLETPTAYNTYTITGLPPGAIATPGADSLKAAAHPAK TPYLYFVADGKGGHTFNTNLASHNKSVQDYLKVLKEKNAQ >gi|296494668|gb|ADTN01000070.1| GENE 4 3253 - 3894 807 213 aa, chain + ## HITS:1 COG:tmk KEGG:ns NR:ns ## COG: tmk COG0125 # Protein_GI_number: 16129061 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate kinase # Organism: Escherichia coli K12 # 1 213 1 213 213 399 100.0 1e-111 MRSKYIVIEGLEGAGKTTARNVVVETLEQLGIRDMVFTREPGGTQLAEKLRSLVLDIKSV GDEVITDKAEVLMFYAARVQLVETVIKPALANGTWVIGDRHDLSTQAYQGGGRGIDQHML ATLRDAVLGDFRPDLTLYLDVTPEVGLKRARARGELDRIEQESFDFFNRTRARYLELAAQ DKSIHTIDATQPLEAVMDAIRTTVTHWVKELDA >gi|296494668|gb|ADTN01000070.1| GENE 5 3891 - 4895 782 334 aa, chain + ## HITS:1 COG:ECs1477 KEGG:ns NR:ns ## COG: ECs1477 COG0470 # Protein_GI_number: 15830731 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication # Organism: Escherichia coli O157:H7 # 1 334 1 334 334 608 99.0 1e-174 MRWYPWLRPDFEKLVASYQAGRGHHALLIQALPGMGDDALIYALSRYLLCQQPQGHKSCG HCRGCQLMQAGTHPDYYTLAPEKGKNTLGVDAVREVTEKLNEHARLGGAKVVWVTDAALL TDAAANALLKTLEEPPAETWFFLATREPERLLATLRSRCRLHYLAPPPEQYAVTWLSREV TMSQDALLAALRLSAGSPGAALALFQGDNWQARETLCQALAYSVPSGDWYSLLAALNHEQ APARLHWLATLLMDALKRHHGAAQVTNVDVPGLVAELANHLSPSRLQAILGDVCHIREQL MSVTGINRELLITDLLLRIEHYLQPGVVLPVPHL >gi|296494668|gb|ADTN01000070.1| GENE 6 4906 - 5703 783 265 aa, chain + ## HITS:1 COG:ECs1478 KEGG:ns NR:ns ## COG: ECs1478 COG0084 # Protein_GI_number: 15830732 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Escherichia coli O157:H7 # 1 265 1 265 265 528 100.0 1e-150 MFLVDSHCHLDGLDYESLHKDVDDVLAKAAARDVKFCLAVATTLPGYLHMRDLVGERDNV VFSCGVHPLNQNDPYDVEDLRRLAAEEGVVALGETGLDYYYTPETKVRQQESFIHHIQIG RELNKPVIVHTRDARADTLAILREEKVTDCGGVLHCFTEDRETAGKLLDLGFYISFSGIV 
TFRNAEQLRDAARYVPLDRLLVETDSPYLAPVPHRGKENQPAMVRDVAEYMAVLKGVAVE ELAQVTTDNFARLFHIDASRLQSIR >gi|296494668|gb|ADTN01000070.1| GENE 7 5998 - 7431 1672 477 aa, chain + ## HITS:1 COG:ptsG_1 KEGG:ns NR:ns ## COG: ptsG_1 COG1263 # Protein_GI_number: 16129064 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Escherichia coli K12 # 1 397 1 397 397 736 99.0 0 MFKNAFANLQKVGKSLMLPVSVLPIAGILLGVGSANFSWLPAVVSHVMAEAGGSVFANMP LIFAIGVALGFTNNDGVSALAAVVAYGIMVKTMAVVAPLVLHLPAEEIASKHLADTGVLG GIISGAIAAYMFNRFYRIKLPEYLGFFAGKRFVPIISGLAAIFTGVVLSFIWPPIGSAIQ TFSQWAAYQNPVVAFGIYGFIERCLVPFGLHHIWNVPFQMQIGEYTNAAGQVFHGDIPRY MAGDPTAGKLSGGFLFKMYGLPAAAIAIWHSAKPENRAKVGGIMISAALTSFLTGITEPI EFSFMFVAPILYIIHAILAGLAFPICILLGMRDGTSFSHGLIDFIVLSGNSSKLWLFPIV GIGYAIVYYTIFRGLIKALDLKTPGREDATEDAKATGTSEMAPALVAAFGGKENITNLDA CITRLRVSVADVSKVDQAGLKKLGAAGVVVAGSGVQAIFGTKSDNLKTEMDEYIRNH >gi|296494668|gb|ADTN01000070.1| GENE 8 7491 - 9680 2080 729 aa, chain - ## HITS:1 COG:fhuE KEGG:ns NR:ns ## COG: fhuE COG4773 # Protein_GI_number: 16129065 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferric coprogen and ferric-rhodotorulic acid # Organism: Escherichia coli K12 # 1 729 1 729 729 1407 100.0 0 MLSTQFNRDNQYQAITKPSLLAGCIALALLPSAAFAAPATEETVIVEGSATAPDDGENDY SVTSTSAGTKMQMTQRDIPQSVTIVSQQRMEDQQLQTLGEVMENTLGISKSQADSDRALY YSRGFQIDNYMVDGIPTYFESRWNLGDALSDMALFERVEVVRGATGLMTGTGNPSAAINM VRKHATSREFKGDVSAEYGSWNKERYVADLQSPLTEDGKIRARIVGGYQNNDSWLDRYNS EKTFFSGIVDADLGDLTTLSAGYEYQRIDVNSPTWGGLPRWNTDGSSNSYDRARSTAPDW AYNDKEINKVFMTLKQQFADTWQATLNATHSEVEFDSKMMYVDAYVNKADGMLVGPYSNY GPGFDYVGGTGWNSGKRKVDALDLFADGSYELFGRQHNLMFGGSYSKQNNRYFSSWANIF PDEIGSFYNFNGNFPQTDWSPQSLAQDDTTHMKSLYAATRVTLADPLHLILGARYTNWRV DTLTYSMEKNHTTPYAGLVFDINDNWSTYASYTSIFQPQNDRDSSGKYLAPITGNNYELG LKSDWMNSRLTTTLAIFRIEQDNVAQSTGTPIPGSNGETAYKAVDGTVSKGVEFELNGAI TDNWQLTFGATRYIAEDNEGNAVNPNLPRTTVKMFTSYRLPVMPELTVGGGVNWQNRVYT 
DTVTPYGTFRAEQGSYALVDLFTRYQVTKNFSLQGNVNNLFDKTYDTNVEGSIVYGTPRN FSITGTYQF >gi|296494668|gb|ADTN01000070.1| GENE 9 10014 - 10373 478 119 aa, chain + ## HITS:1 COG:ECs1481 KEGG:ns NR:ns ## COG: ECs1481 COG0537 # Protein_GI_number: 15830735 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Escherichia coli O157:H7 # 1 119 1 119 119 205 100.0 2e-53 MAEETIFSKIIRREIPSDIVYQDDLVTAFRDISPQAPTHILIIPNILIPTVNDVSAEHEQ ALGRMITVAAKIAEQEGIAEDGYRLIMNTNRHGGQEVYHIHMHLLGGRPLGPMLAHKGL >gi|296494668|gb|ADTN01000070.1| GENE 10 10376 - 10753 194 125 aa, chain + ## HITS:1 COG:ycfL KEGG:ns NR:ns ## COG: ycfL COG5633 # Protein_GI_number: 16129067 # Func_class: R General function prediction only # Function: Predicted periplasmic lipoprotein # Organism: Escherichia coli K12 # 1 125 1 125 125 227 100.0 4e-60 MRKGCFGLVSLVLLLLVGCRSHPEIPVNDEQSLVMESSLLAAGISAEKPFLSTSDIQPSA SSTLYNERQEPVTVHYRFYWYDARGLEMHPLERPRSVTIPAHSAVTLYGSANFLGAHKVR LYLYL >gi|296494668|gb|ADTN01000070.1| GENE 11 10767 - 11408 628 213 aa, chain + ## HITS:1 COG:ECs1483 KEGG:ns NR:ns ## COG: ECs1483 COG3417 # Protein_GI_number: 15830737 # Func_class: R General function prediction only # Function: Collagen-binding surface adhesin SpaP (antigen I/II family) # Organism: Escherichia coli O157:H7 # 1 213 1 213 213 367 100.0 1e-102 MTKMSRYALITALAMFLAGCVGQREPAPVEEVKPAPEQPAEPQQPVPTVPSVPTIPQQPG PIEHEDQTAPPAPHIRHYDWNGAMQPMVSKMLGADGVTAGSVLLVDSVNNRTNGSLNAAE ATETLRNALANNGKFTLVSAQQLSMAKQQLGLSPQDSLGTRSKAIGIARNVGAHYVLYSS ASGNVNAPTLQMQLMLVQTGEIIWSGKGAVSQQ >gi|296494668|gb|ADTN01000070.1| GENE 12 11389 - 12213 357 274 aa, chain + ## HITS:1 COG:ycfN KEGG:ns NR:ns ## COG: ycfN COG0510 # Protein_GI_number: 16129069 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted choline kinase involved in LPS biosynthesis # Organism: Escherichia coli K12 # 1 274 1 274 274 
528 100.0 1e-150 MPFRSNNPITRDELLSRFFPQYHPVTTFNSGLSGGSFLIEHQGQRFVVRQPHDPDAPQSA FLRQYRALSQLPACIAPKPHLYLRDWMVVDYLPGAVKTYLPDTNELAGLLYYLHQQPRFG WRITLLPLLELYWQQSDPARRTVGWLRMLKRLRKAREPRPLRLSPLHMDVHAGNLVHSAS GLKLIDWEYAGDGDIALELAAVWVENTEQHRQLVNDYATRAKIYPAQLWRQVRRWFPWLL MLKAGWFEYRWRQTGDQQFIRLADDTWRQLLIKQ >gi|296494668|gb|ADTN01000070.1| GENE 13 12224 - 13249 1124 341 aa, chain + ## HITS:1 COG:ycfO KEGG:ns NR:ns ## COG: ycfO COG1472 # Protein_GI_number: 16129070 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Escherichia coli K12 # 1 341 1 341 341 679 100.0 0 MGPVMLDVEGYELDAEEREILAHPLVGGLILFTRNYHDPAQLRELVRQIRAASRNRLVVA VDQEGGRVQRFREGFTRLPAAQSFAALSGMEEGGKLAQEAGWLMASEMIAMDIDISFAPV LDVGHISAAIGERSYHADPQKALAIASRFIDGMHEAGMKTTGKHFPGHGAVTADSHKETP CDPRPQAEIRAKDMSVFSSLIRENKLDAIMPAHVIYSDVDPRPASGSPYWLKTVLRQELG FDGVIFSDDLSMEGAAIMGSYAERGQASLDAGCDMILVCNNRKGAVSVLDNLSPIKAERV TRLYHKGSFSRQELMDSARWKAISTRLNQLHERWQEEKAGH >gi|296494668|gb|ADTN01000070.1| GENE 14 13272 - 13814 522 180 aa, chain + ## HITS:1 COG:ECs1486 KEGG:ns NR:ns ## COG: ECs1486 COG3150 # Protein_GI_number: 15830740 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Escherichia coli O157:H7 # 1 180 20 199 199 375 100.0 1e-104 MIIYLHGFDSNSPGNHEKVLQLQFIDPDVRLISYSTRHPKHDMQHLLKEVDKMLQLNVDE RPLICGVGLGGYWAERIGFLCDIRQVIFNPNLFPYENMEGKIDRPEEYADIATKCVTNFR EKNRDRCLVILSRNDEALNSQRTSEELHHYYEIVWDEEQTHKFKNISPHLQRIKAFKTLG >gi|296494668|gb|ADTN01000070.1| GENE 15 14214 - 15518 1363 434 aa, chain + ## HITS:1 COG:ndh KEGG:ns NR:ns ## COG: ndh COG1252 # Protein_GI_number: 16129072 # Func_class: C Energy production and conversion # Function: NADH dehydrogenase, FAD-containing subunit # Organism: Escherichia coli K12 # 1 434 1 434 434 863 100.0 0 MTTPLKKIVIVGGGAGGLEMATQLGHKLGRKKKAKITLVDRNHSHLWKPLLHEVATGSLD EGVDALSYLAHARNHGFQFQLGSVIDIDREAKTITIAELRDEKGELLVPERKIAYDTLVM ALGSTSNDFNTPGVKENCIFLDNPHQARRFHQEMLNLFLKYSANLGANGKVNIAIVGGGA 
TGVELSAELHNAVKQLHSYGYKGLTNEALNVTLVEAGERILPALPPRISAAAHNELTKLG VRVLTQTMVTSADEGGLHTKDGEYIEADLMVWAAGIKAPDFLKDIGGLETNRINQLVVEP TLQTTRDPDIYAIGDCASCPRPEGGFVPPRAQAAHQMATCAMNNILAQMNGKPLKNYQYK DHGSLVSLSNFSTVGSLMGNLTRGSMMIEGRIARFVYISLYRMHQIALHGYFKTGLMMLV GSINRVIRPRLKLH >gi|296494668|gb|ADTN01000070.1| GENE 16 15728 - 16267 592 179 aa, chain + ## HITS:1 COG:ycfJ KEGG:ns NR:ns ## COG: ycfJ COG3134 # Protein_GI_number: 16129073 # Func_class: S Function unknown # Function: Predicted outer membrane lipoprotein # Organism: Escherichia coli K12 # 1 179 1 179 179 300 100.0 1e-81 MNKSMLAGIGIGVAAALGVAAVASLNVFERGPQYAQVVSATPIKETVKTPRQECRNVTVT HRRPVQDENRITGSVLGAVAGGVIGHQFGGGRGKDVATVVGALGGGYAGNQIQGSLQESD TYTTTQQRCKTVYDKSEKMLGYDVTYKIGDQQGKIRMDRDPGTQIPLDSNGQLILNNKV >gi|296494668|gb|ADTN01000070.1| GENE 17 16329 - 17039 463 236 aa, chain - ## HITS:1 COG:ycfQ KEGG:ns NR:ns ## COG: ycfQ COG1309 # Protein_GI_number: 16129074 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 236 1 236 236 488 100.0 1e-138 MGSGLVNGGDYFYNNLSFTVTRYNGIMATDSTQCVKKSRGRPKVFDRDAALDKAMKLFWQ HGYEATSLADLVEATGAKAPTLYAEFTNKEGLFRAVLDRYIDRFAAKHEAQLFCEEKSVE SALADYFAAIANCFTSKDTPAGCFMINNCTTLSPDSGDIANTLKSRHAMQERTLQQFLCQ RQARGEIPPHCDVTHLAEFLNCIIQGMSISAREGASLEKLMQIAGTTLRLWPELVK >gi|296494668|gb|ADTN01000070.1| GENE 18 17202 - 17459 283 85 aa, chain + ## HITS:1 COG:no KEGG:ECO103_1157 NR:ns ## KEGG: ECO103_1157 # Name: ycfR # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 85 1 85 85 127 100.0 9e-29 MKNVKTLIAAAILSSMSFASFAAVEVQSTPEGQQKVGTISANAGTNLGSLEEQLAQKADE MGAKSFRITSVTGPNTLHGTAVIYK Prediction of potential genes in microbial genomes Time: Sun May 15 23:21:41 2011 Seq name: gi|296494667|gb|ADTN01000071.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont179.3, whole genome shotgun sequence Length of sequence - 30364 bp Number of predicted genes - 27, with homology - 27 Number of transcription units - 13, operones - 5 average op.length - 3.8 N Tu/Op 
Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 39 - 998 996 ## COG1376 Uncharacterized protein conserved in bacteria - Prom 1021 - 1080 4.9 2 2 Tu 1 . - CDS 1145 - 4591 3706 ## COG1197 Transcription-repair coupling factor (superfamily II helicase) - Prom 4656 - 4715 2.7 3 3 Tu 1 . - CDS 4719 - 5765 1019 ## COG4763 Predicted membrane protein - Prom 5847 - 5906 2.3 + Prom 5942 - 6001 4.4 4 4 Op 1 23/0.000 + CDS 6027 - 7226 1354 ## COG4591 ABC-type transport system, involved in lipoprotein release, permease component 5 4 Op 2 23/0.000 + CDS 7219 - 7920 226 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 6 4 Op 3 5/0.200 + CDS 7926 - 9164 1324 ## COG4591 ABC-type transport system, involved in lipoprotein release, permease component 7 4 Op 4 5/0.200 + CDS 9193 - 10104 869 ## COG1940 Transcriptional regulator/sugar kinase 8 4 Op 5 . + CDS 10120 - 10959 672 ## COG0846 NAD-dependent protein deacetylases, SIR2 family 9 5 Op 1 . - CDS 11079 - 11867 238 ## JW1107 predicted inner membrane protein 10 5 Op 2 . - CDS 11864 - 12325 339 ## JW5164 predicted inner membrane protein - Term 12336 - 12378 7.7 11 6 Op 1 25/0.000 - CDS 12383 - 13429 1512 ## COG0687 Spermidine/putrescine-binding periplasmic protein 12 6 Op 2 36/0.000 - CDS 13426 - 14220 948 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II 13 6 Op 3 30/0.000 - CDS 14217 - 15044 994 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I 14 6 Op 4 . - CDS 15058 - 16176 1293 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components - Prom 16261 - 16320 4.0 + Prom 16278 - 16337 5.9 15 7 Tu 1 . 
+ CDS 16444 - 17670 1496 ## COG2195 Di- and tripeptidases - Term 17666 - 17703 9.1 16 8 Op 1 2/0.800 - CDS 17719 - 18840 1175 ## COG2850 Uncharacterized conserved protein 17 8 Op 2 40/0.000 - CDS 18916 - 20376 1221 ## COG0642 Signal transduction histidine kinase 18 8 Op 3 4/0.400 - CDS 20376 - 21047 858 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 21106 - 21165 7.4 - Term 21163 - 21206 9.1 19 9 Op 1 9/0.000 - CDS 21216 - 22586 1676 ## COG0015 Adenylosuccinate lyase 20 9 Op 2 4/0.400 - CDS 22590 - 23231 752 ## COG2915 Uncharacterized protein involved in purine metabolism 21 9 Op 3 6/0.000 - CDS 23267 - 24373 1404 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain 22 9 Op 4 3/0.600 - CDS 24427 - 24888 433 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 23 9 Op 5 . - CDS 24898 - 25521 425 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases - Prom 25543 - 25602 6.5 + Prom 25549 - 25608 4.4 24 10 Tu 1 . + CDS 25723 - 26973 1560 ## COG0538 Isocitrate dehydrogenases + Term 26995 - 27034 4.1 - Term 27885 - 27924 1.2 25 11 Tu 1 . - CDS 28099 - 28503 71 ## COG5562 Phage envelope protein - Prom 28633 - 28692 2.6 26 12 Tu 1 . - CDS 28724 - 29455 389 ## COG0789 Predicted transcriptional regulators - Prom 29522 - 29581 7.3 - Term 29494 - 29541 4.6 27 13 Tu 1 . 
- CDS 29660 - 30190 323 ## COG2200 FOG: EAL domain - Prom 30302 - 30361 2.2 Predicted protein(s) >gi|296494667|gb|ADTN01000071.1| GENE 1 39 - 998 996 319 aa, chain - ## HITS:1 COG:ECs1491 KEGG:ns NR:ns ## COG: ECs1491 COG1376 # Protein_GI_number: 15830745 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 319 2 320 320 598 100.0 1e-171 MIKTRFSRWLTFFTFAAAVALALPAKANTWPLPPAGSRLVGENKFHVVENDGGSLEAIAK KYNVGFLALLQANPGVDPYVPRAGSVLTIPLQTLLPDAPREGIVINIAELRLYYYPPGKN SVTVYPIGIGQLGGDTLTPTMVTTVSDKRANPTWTPTANIRARYKAQGIELPAVVPAGPD NPMGHHAIRLAAYGGVYLLHGTNADFGIGMRVSSGCIRLRDDDIKTLFSQVTPGTKVNII NTPIKVSAEPNGARLVEVHQPLSEKIDDDPQLLPITLNSAMQSFKDAAQTDAEVMQHVMD VRSGMPVDVRRHQVSPQTL >gi|296494667|gb|ADTN01000071.1| GENE 2 1145 - 4591 3706 1148 aa, chain - ## HITS:1 COG:mfd KEGG:ns NR:ns ## COG: mfd COG1197 # Protein_GI_number: 16129077 # Func_class: L Replication, recombination and repair; K Transcription # Function: Transcription-repair coupling factor (superfamily II helicase) # Organism: Escherichia coli K12 # 1 1148 1 1148 1148 2282 100.0 0 MPEQYRYTLPVKAGEQRLLGELTGAACATLVAEIAERHAGPVVLIAPDMQNALRLHDEIS QFTDQMVMNLADWETLPYDSFSPHQDIISSRLSTLYQLPTMQRGVLIVPVNTLMQRVCPH SFLHGHALVMKKGQRLSRDALRTQLDSAGYRHVDQVMEHGEYATRGALLDLFPMGSELPY RLDFFDDEIDSLRVFDVDSQRTLEEVEAINLLPAHEFPTDKAAIELFRSQWRDTFEVKRD PEHIYQQVSKGTLPAGIEYWQPLFFSEPLPPLFSYFPANTLLVNTGDLETSAERFQADTL ARFENRGVDPMRPLLPPQSLWLRVDELFSELKNWPRVQLKTEHLPTKAANANLGFQKLPD LAVQAQQKAPLDALRKFLETFDGPVVFSVESEGRREALGELLARIKIAPQRIMRLDEASD RGRYLMIGAAEHGFVDTVRNLALICESDLLGERVARRRQDSRRTINPDTLIRNLAELHIG QPVVHLEHGVGRYAGMTTLEAGGITGEYLMLTYANDAKLYVPVSSLHLISRYAGGAEENA PLHKLGGDAWSRARQKAAEKVRDVAAELLDIYAQRAAKEGFAFKHDREQYQLFCDSFPFE TTPDQAQAINAVLSDMCQPLAMDRLVCGDVGFGKTEVAMRAAFLAVDNHKQVAVLVPTTL LAQQHYDNFRDRFANWPVRIEMISRFRSAKEQTQILAEVAEGKIDILIGTHKLLQSDVKF KDLGLLIVDEEHRFGVRHKERIKAMRANVDILTLTATPIPRTLNMAMSGMRDLSIIATPP ARRLAVKTFVREYDSMVVREAILREILRGGQVYYLYNDVENIQKAAERLAELVPEARIAI 
GHGQMRERELERVMNDFHHQRFNVLVCTTIIETGIDIPTANTIIIERADHFGLAQLHQLR GRVGRSHHQAYAWLLTPHPKAMTTDAQKRLEAIASLEDLGAGFALATHDLEIRGAGELLG EEQSGSMETIGFSLYMELLENAVDALKAGREPSLEDLTSQQTEVELRMPSLLPDDFIPDV NTRLSFYKRIASAKTENELEEIKVELIDRFGLLPDPARTLLDIARLRQQAQKLGIRKLEG NEKGGVIEFAEKNHVNPAWLIGLLQKQPQHYRLDGPTRLKFIQDLSERKTRIEWVRQFMR ELEENAIA >gi|296494667|gb|ADTN01000071.1| GENE 3 4719 - 5765 1019 348 aa, chain - ## HITS:1 COG:ycfT KEGG:ns NR:ns ## COG: ycfT COG4763 # Protein_GI_number: 16129078 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 348 1 357 357 610 97.0 1e-174 MKQKELWINQIKGLCICLVVIYHSVITFYPHLTTFQHPLSEVLSKCWIYFNLYLAPFRMP VFFFISGYLIRRYIDSVPWGNCLDKRIWNIFWVLALWGVVQWLAPERDLSNASNAAYADS TGEFLHGMITASTSLWYLYALIVYFVVCKIFSRLALPLFALFVLLSVAVNFVPTPWWGMN SVIRNLPYYSLGAWFGATIMTCVKEVPLRRHLLMASLLTVLAVGAWLFTISLLLSLVSIV VIMKLFYQYEQRFGMRSTSLLNVIGSNTIAIYTTHRILVEIFSLTLLAQMNAARWSPQVE LTLLLVYPFVSLFICTVAGLLVRKLSQRAFSDLLFSPPSLPAAVSYSR >gi|296494667|gb|ADTN01000071.1| GENE 4 6027 - 7226 1354 399 aa, chain + ## HITS:1 COG:ECs1494 KEGG:ns NR:ns ## COG: ECs1494 COG4591 # Protein_GI_number: 15830748 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ABC-type transport system, involved in lipoprotein release, permease component # Organism: Escherichia coli O157:H7 # 1 399 1 399 399 703 100.0 0 MYQPVALFIGLRYMRGRAADRFGRFVSWLSTIGITLGVMALVTVLSVMNGFERELQNNIL GLMPQAILSSEHGSLNPQQLPETAVKLDGVNRVAPITTGDVVLQSARSVAVGVMLGIDPA QKDPLTPYLVNVKQTDLEPGKYNVILGEQLASQLGVNRGDQIRVMVPSASQFTPMGRIPS QRLFNVIGTFAANSEVDGYEMLVNIEDASRLMRYPAGNITGWRLWLDEPLKVDSLSQQKL PEGSKWQDWRDRKGELFQAVRMEKNMMGLLLSLIVAVAAFNIITSLGLMVMEKQGEVAIL QTQGLTPRQIMMVFMVQGASAGIIGAILGAALGALLASQLNNLMPIIGVLLDGAALPVAI EPLQVIVIALVAMAIALLSTLYPSWRAAATQPAEALRYE >gi|296494667|gb|ADTN01000071.1| GENE 5 7219 - 7920 226 233 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 6 216 9 210 309 91 29 5e-18 
MNKILLQCDNLCKRYQEGSVQTDVLHNVSFSVGEGEMMAIVGSSGSGKSTLLHLLGGLDT PTSGDVIFNGQPMSKLSSAAKAELRNQKLGFIYQFHHLLPDFTALENVAMPLLIGKKKPA EINSRALEMLKAVGLDHRANHRPSELSGGERQRVAIARALVNNPRLVLADEPTGNLDARN ADSIFQLLGELNRLQGTAFLVVTHDLQLAKRMSRQLEMRDGRLTAELSLMGAE >gi|296494667|gb|ADTN01000071.1| GENE 6 7926 - 9164 1324 412 aa, chain + ## HITS:1 COG:lolE KEGG:ns NR:ns ## COG: lolE COG4591 # Protein_GI_number: 16129081 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ABC-type transport system, involved in lipoprotein release, permease component # Organism: Escherichia coli K12 # 1 412 3 414 414 759 100.0 0 MPLSLLIGLRFSRGRRRGGMVSLISVISTIGIALGVAVLIVGLSAMNGFERELNNRILAV VPHGEIEAVDQPWTNWQEALDHVQKVPGIAAAAPYINFTGLVESGANLRAIQVKGVNPQQ EQRLSALPSFVQGDAWRNFKAGEQQIIIGKGVADALKVKQGDWVSIMIPNSNPEHKLMQP KRVRLHVAGILQLSGQLDHSFAMIPLADAQQYLDMGSSVSGIALKMTDVFNANKLVRDAG EVTNSYVYIKSWIGTYGYMYRDIQMIRAIMYLAMVLVIGVACFNIVSTLVMAVKDKSGDI AVLRTLGAKDGLIRAIFVWYGLLAGLFGSLCGVIIGVVVSLQLTPIIEWIEKLIGHQFLS SDIYFIDFLPSELHWLDVFYVLVTALLLSLLASWYPARRASNIDPARVLSGQ >gi|296494667|gb|ADTN01000071.1| GENE 7 9193 - 10104 869 303 aa, chain + ## HITS:1 COG:ycfX KEGG:ns NR:ns ## COG: ycfX COG1940 # Protein_GI_number: 16129082 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Escherichia coli K12 # 1 303 1 303 303 625 100.0 1e-179 MYYGFDIGGTKIALGVFDSGRQLQWEKRVPTPRDSYDAFLDAVCELVAEADQRFGCKGSV GIGIPGMPETEDGTLYAANVPAASGKPLRADLSARLDRDVRLDNDANCFALSEAWDDEFT QYPLVMGLILGTGVGGGLIFNGKPITGKSYITGEFGHMRLPVDALTMMGLDFPLRRCGCG QHGCIENYLSGRGFAWLYQHYYHQPLQAPEIIALYDQGDEQARAHVERYLDLLAVCLGNI LTIVDPDLVVIGGGLSNFPAITTQLADRLPRHLLPVARVPRIERARHGDAGGMRGAAFLH LTD >gi|296494667|gb|ADTN01000071.1| GENE 8 10120 - 10959 672 279 aa, chain + ## HITS:1 COG:ycfY KEGG:ns NR:ns ## COG: ycfY COG0846 # Protein_GI_number: 16129083 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Escherichia coli K12 # 1 279 1 279 279 555 100.0 1e-158 
MLSRRGHRLSRFRKNKRRLRERLRQRIFFRDKVVPEAMEKPRVLVLTGAGISAESGIRTF RAADGLWEEHRVEDVATPEGFDRDPELVQAFYNARRRQLQQPEIQPNAAHLALAKLQDAL GDRFLLVTQNIDNLHERAGNTNVIHMHGELLKVRCSQSGQVLDWTGDVTPEDKCHCCQFP APLRPHVVWFGEMPLGMDEIYMALSMADIFIAIGTSGHVYPAAGFVHEAKLHGAHTVELN LEPSQVGNEFAEKYYGPASQVVPEFVEKLLKGLKAGSIA >gi|296494667|gb|ADTN01000071.1| GENE 9 11079 - 11867 238 262 aa, chain - ## HITS:1 COG:no KEGG:JW1107 NR:ns ## KEGG: JW1107 # Name: ycfZ # Def: predicted inner membrane protein # Organism: E.coli_J # Pathway: not_defined # 1 262 1 262 262 484 100.0 1e-135 MKKFIILLSLLILLPLTAASKPLIPIMKTLFTDVTGTVPDAEEIAHKAELFRQQTGIAPF IVVLPDINNEASLRQNGKAMLAHASSSLSDVKGSVLLLFTTREPRLIMITNGQVESGLDD KHLGLLIENHTLAYLNADLWYQGINNALAVLQAQILKQSTPPLTYYPHPGQQHENAPPGS TNTLGFIAWAATFILFSRIFYYTTRFIYALKFAVAMTIANMGYQALCLYIDNSFAITRIS PLWAGLIGVCTFIAALLLTSKR >gi|296494667|gb|ADTN01000071.1| GENE 10 11864 - 12325 339 153 aa, chain - ## HITS:1 COG:no KEGG:JW5164 NR:ns ## KEGG: JW5164 # Name: ymfA # Def: predicted inner membrane protein # Organism: E.coli_J # Pathway: not_defined # 1 140 1 140 153 276 100.0 1e-73 MSQDSKVFFRIFLGIGLVLILISVVVFYNQFTYSKDAIHTEGVIVDTVWHSSHSHRTGKD GSWYPVVAFRPTPDYTLIFNSSIGSDFYEDSEGDKVNVYYSPGHPEKAEINNPWVNFFKW GFIGIMGVIFIAVGLLISMPSSKKSRRKRKSRP >gi|296494667|gb|ADTN01000071.1| GENE 11 12383 - 13429 1512 348 aa, chain - ## HITS:1 COG:potD KEGG:ns NR:ns ## COG: potD COG0687 # Protein_GI_number: 16129086 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Escherichia coli K12 # 1 348 1 348 348 652 100.0 0 MKKWSRHLLAAGALALGMSAAHADDNNTLYFYNWTEYVPPGLLEQFTKETGIKVIYSTYE SNETMYAKLKTYKDGAYDLVVPSTYYVDKMRKEGMIQKIDKSKLTNFSNLDPDMLNKPFD PNNDYSIPYIWGATAIGVNGDAVDPKSVTSWADLWKPEYKGSLLLTDDAREVFQMALRKL GYSGNTTDPKEIEAAYNELKKLMPNVAAFNSDNPANPYMEGEVNLGMIWNGSAFVARQAG TPIDVVWPKEGGIFWMDSLAIPANAKNKEGALKLINFLLRPDVAKQVAETIGYPTPNLAA RKLLSPEVANDKTLYPDAETIKNGEWQNDVGAASSIYEEYYQKLKAGR >gi|296494667|gb|ADTN01000071.1| GENE 12 13426 - 14220 948 264 aa, chain - ## HITS:1 
COG:ECs1500 KEGG:ns NR:ns ## COG: ECs1500 COG1177 # Protein_GI_number: 15830754 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # Organism: Escherichia coli O157:H7 # 1 264 1 264 264 431 100.0 1e-121 MIGRLLRGGFMTAIYAYLYIPIIILIVNSFNSSRFGINWQGFTTKWYSLLMNNDSLLQAA QHSLTMAVFSATFATLIGSLTAVALYRYRFRGKPFVSGMLFVVMMSPDIVMAISLLVLFM LLGIQLGFWSLLFSHITFCLPFVVVTVYSRLKGFDVRMLEAAKDLGASEFTILRKIILPL AMPAVAAGWVLSFTLSMDDVVVSSFVTGPSYEILPLKIYSMVKVGVSPEVNALATILLVL SLVMVIASQLIARDKTKGNTGDVK >gi|296494667|gb|ADTN01000071.1| GENE 13 14217 - 15044 994 275 aa, chain - ## HITS:1 COG:ECs1570 KEGG:ns NR:ns ## COG: ECs1570 COG1176 # Protein_GI_number: 15830824 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Escherichia coli O157:H7 # 1 271 11 281 287 444 99.0 1e-124 MIVTIVGWLVLFVFLPNLMIIGTSFLTRDDASFVKMVFTLDNYTRLLDPLYFEVLLHSLN MALIATLACLVLGYPFAWFLAKLPHKVRPLLLFLLIVPFWTNSLIRIYGLKIFLSTKGYL NEFLLWLGVIDTPIRIMFTPSAVIIGLVYILLPFMVMPLYSSIEKLDKPLLEAARDLGAS KLQTFIRIIIPLTMPGIIAGCLLVMLPAMGLFYVSDLMGGAKNLLIGNVIKVQFLNIRDW PFGAATSITLTIVMGLMLLVYWRASRLLNKKVELE >gi|296494667|gb|ADTN01000071.1| GENE 14 15058 - 16176 1293 372 aa, chain - ## HITS:1 COG:ECs1571 KEGG:ns NR:ns ## COG: ECs1571 COG3842 # Protein_GI_number: 15830825 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Escherichia coli O157:H7 # 1 372 7 378 378 733 99.0 0 MNKQPSSLSPLVQLAGIRKCFDGKEVIPQLDLTINNGEFLTLLGPSGCGKTTVLRLIAGL ETVDSGRIMLDNEDITHVPAENRYVNTVFQSYALFPHMTVFENVAFGLRMQKTPAAEITP RVMEALRMVQLETFAQRKPHQLSGGQQQRVAIARAVVNKPRLLLLDESLSALDYKLRKQM QNELKALQRKLGITFVFVTHDQEEALTMSDRIVVMRDGRIEQDGTPREIYEEPKNLFVAG FIGEINMFNATVIERLDEQRVRANVEGRECNIYVNFAVEPGQKLHVLLRPEDLRVEEIND DNHAEGLIGYVRERNYKGMTLESVVELENGKMVMVSEFFNEDDPDFDHSLDQKMAINWVE SWEVVLADEEHK >gi|296494667|gb|ADTN01000071.1| GENE 15 16444 - 
17670 1496 408 aa, chain + ## HITS:1 COG:pepT KEGG:ns NR:ns ## COG: pepT COG2195 # Protein_GI_number: 16129090 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Escherichia coli K12 # 1 408 1 408 408 835 100.0 0 MDKLLERFLNYVSLDTQSKAGVRQVPSTEGQWKLLHLLKEQLEEMGLINVTLSEKGTLMA TLPANVPGDIPAIGFISHVDTSPDCSGKNVNPQIVENYRGGDIALGIGDEVLSPVMFPVL HQLLGQTLITTDGKTLLGADDKAGIAEIMTALAVLQQKKIPHGDIRVAFTPDEEVGKGAK HFDVDAFDARWAYTVDGGGVGELEFENFNAASVNIKIVGNNVHPGTAKGVMVNALSLAAR IHAEVPADESPEMTEGYEGFYHLASMKGTVERADMHYIIRDFDRKQFEARKRKMMEIAKK VGKGLHPDCYIELVIEDSYYNMREKVVEHPHILDIAQQAMRDCDIEPELKPIRGGTDGAQ LSFMGLPCPNLFTGGYNYHGKHEFVTLEGMEKAVQVIVRIAELTAQRK >gi|296494667|gb|ADTN01000071.1| GENE 16 17719 - 18840 1175 373 aa, chain - ## HITS:1 COG:ycfD KEGG:ns NR:ns ## COG: ycfD COG2850 # Protein_GI_number: 16129091 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 373 4 376 376 749 100.0 0 MEYQLTLNWPDFLERHWQKRPVVLKRGFNNFIDPISPDELAGLAMESEVDSRLVSHQDGK WQVSHGPFESYDHLGETNWSLLVQAVNHWHEPTAALMRPFRELPDWRIDDLMISFSVPGG GVGPHLDQYDVFIIQGTGRRRWRVGEKLQMKQHCPHPDLLQVDPFEAIIDEELEPGDILY IPPGFPHEGYALENAMNYSVGFRAPNTRELISGFADYVLQRELGGNYYSDPDVPPRAHPA DVLPQEMDKLREMMLELINQPEHFKQWFGEFISQSRHELDIAPPEPPYQPDEIYDALKQG EVLVRLGGLRVLRIGDDVYANGEKIDSPHRPALDALASNIALTAENFGDALEDPSFLAML AALVNSGYWFFEG >gi|296494667|gb|ADTN01000071.1| GENE 17 18916 - 20376 1221 486 aa, chain - ## HITS:1 COG:phoQ KEGG:ns NR:ns ## COG: phoQ COG0642 # Protein_GI_number: 16129092 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 486 1 486 486 956 100.0 0 MKKLLRLFFPLSLRVRFLLATAAVVLVLSLAYGMVALIGYSVSFDKTTFRLLRGESNLFY TLAKWENNKLHVELPENIDKQSPTMTLIYDENGQLLWAQRDVPWLMKMIQPDWLKSNGFH EIEADVNDTSLLLSGDHSIQQQLQEVREDDDDAEMTHSVAVNVYPATSRMPKLTIVVVDT IPVELKSSYMVWSWFIYVLSANLLLVIPLLWVAAWWSLRPIEALAKEVRELEEHNRELLN PATTRELTSLVRNLNRLLKSERERYDKYRTTLTDLTHSLKTPLAVLQSTLRSLRSEKMSV 
SDAEPVMLEQISRISQQIGYYLHRASMRGGTLLSRELHPVAPLLDNLTSALNKVYQRKGV NISLDISPEISFVGEQNDFVEVMGNVLDNACKYCLEFVEISARQTDEHLYIVVEDDGPGI PLSKREVIFDRGQRVDTLRPGQGVGLAVAREITEQYEGKIVAGESMLGGARMEVIFGRQH SAPKDE >gi|296494667|gb|ADTN01000071.1| GENE 18 20376 - 21047 858 223 aa, chain - ## HITS:1 COG:phoP KEGG:ns NR:ns ## COG: phoP COG0745 # Protein_GI_number: 16129093 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli K12 # 1 223 1 223 223 417 100.0 1e-117 MRVLVVEDNALLRHHLKVQIQDAGHQVDDAEDAKEADYYLNEHIPDIAIVDLGLPDEDGL SLIRRWRSNDVSLPILVLTARESWQDKVEVLSAGADDYVTKPFHIEEVMARMQALMRRNS GLASQVISLPPFQVDLSRRELSINDEVIKLTAFEYTIMETLIRNNGKVVSKDSLMLQLYP DAELRESHTIDVLMGRLRKKIQAQYPQEVITTVRGQGYLFELR >gi|296494667|gb|ADTN01000071.1| GENE 19 21216 - 22586 1676 456 aa, chain - ## HITS:1 COG:purB KEGG:ns NR:ns ## COG: purB COG0015 # Protein_GI_number: 16129094 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate lyase # Organism: Escherichia coli K12 # 1 456 1 456 456 931 100.0 0 MELSSLTAVSPVDGRYGDKVSALRGIFSEYGLLKFRVQVEVRWLQKLAAHAAIKEVPAFA ADAIGYLDAIVASFSEEDAARIKTIERTTNHDVKAVEYFLKEKVAEIPELHAVSEFIHFA CTSEDINNLSHALMLKTARDEVILPYWRQLIDGIKDLAVQYRDIPLLSRTHGQPATPSTI GKEMANVAYRMERQYRQLNQVEILGKINGAVGNYNAHIAAYPEVDWHQFSEEFVTSLGIQ WNPYTTQIEPHDYIAELFDCVARFNTILIDFDRDVWGYIALNHFKQKTIAGEIGSSTMPH KVNPIDFENSEGNLGLSNAVLQHLASKLPVSRWQRDLTDSTVLRNLGVGIGYALIAYQST LKGVSKLEVNRDHLLDELDHNWEVLAEPIQTVMRRYGIEKPYEKLKELTRGKRVDAEGMK QFIDGLALPEEEKARLKAMTPANYIGRAITMVDELK >gi|296494667|gb|ADTN01000071.1| GENE 20 22590 - 23231 752 213 aa, chain - ## HITS:1 COG:ycfC KEGG:ns NR:ns ## COG: ycfC COG2915 # Protein_GI_number: 16129095 # Func_class: R General function prediction only # Function: Uncharacterized protein involved in purine metabolism # Organism: Escherichia coli K12 # 1 213 1 213 213 379 99.0 1e-105 MAKNYYDITLALAGICQSARLVQQLAHQGHCDADALHVSLNSIIDMNPSSTLAVFGGSEA 
NLRVGLETLLDVLNASSRQGLNAELTRYTLSLMVLERKLSSAKGALDTLGNRINGLQRQF EHFDLQSETLMSAMAAIYVDVISPLGPRIQVTGSPAVLQSPQVQAKVRATLLAGIRAAVL WHQVGGGRLQLMFSRNRLTTQAKQILAHLTPEL >gi|296494667|gb|ADTN01000071.1| GENE 21 23267 - 24373 1404 368 aa, chain - ## HITS:1 COG:trmU KEGG:ns NR:ns ## COG: trmU COG0482 # Protein_GI_number: 16129096 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Escherichia coli K12 # 1 368 16 383 383 748 100.0 0 MSETAKKVIVGMSGGVDSSVSAWLLQQQGYQVEGLFMKNWEEDDGEEYCTAAADLADAQA VCDKLGIELHTVNFAAEYWDNVFELFLAEYKAGRTPNPDILCNKEIKFKAFLEFAAEDLG ADYIATGHYVRRADVDGKSRLLRGLDSNKDQSYFLYTLSHEQIAQSLFPVGELEKPQVRK IAEDLGLVTAKKKDSTGICFIGERKFREFLGRYLPAQPGKIITVDGDEIGEHQGLMYHTL GQRKGLGIGGTKEGTEEPWYVVDKDVENNILVVAQGHEHPRLMSVGLIAQQLHWVDREPF TGTMRCTVKTRYRQTDIPCTVKALDDDRIEVIFDEPVAAVTPGQSAVFYNGEVCLGGGII EQRLPLPV >gi|296494667|gb|ADTN01000071.1| GENE 22 24427 - 24888 433 153 aa, chain - ## HITS:1 COG:ECs1606 KEGG:ns NR:ns ## COG: ECs1606 COG0494 # Protein_GI_number: 15830860 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli O157:H7 # 1 153 1 153 153 314 100.0 3e-86 MFKPHVTVACVVHAEGKFLVVEETINGKALWNQPAGHLEADETLVEAAARELWEETGISA QPQHFIRMHQWIAPDKTPFLRFLFAIELEQICPTQPHDSDIDCCRWVSAEEILQASNLRS PLVAESIRCYQSGQRYPLEMIGDFNWPFTKGVI >gi|296494667|gb|ADTN01000071.1| GENE 23 24898 - 25521 425 207 aa, chain - ## HITS:1 COG:ymfC KEGG:ns NR:ns ## COG: ymfC COG1187 # Protein_GI_number: 16129098 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Escherichia coli K12 # 1 207 1 207 207 400 100.0 1e-112 MQKTSFRNHQVKRFSSQRSTRRKPENQPTRVILFNKPYDVLPQFTDEAGRKTLKEFIPVQ 
GVYAAGRLDRDSEGLLVLTNNGALQARLTQPGKRTGKIYYVQVEGIPTQDALEALRNGVT LNDGPTLPAGAELVDEPAWLWPRNPPIRERKSIPTSWLKITLYEGRNRQVRRMTAHVGFP TLRLIRYAMGDYSLDNLANGEWREVTD >gi|296494667|gb|ADTN01000071.1| GENE 24 25723 - 26973 1560 416 aa, chain + ## HITS:1 COG:icd KEGG:ns NR:ns ## COG: icd COG0538 # Protein_GI_number: 16129099 # Func_class: C Energy production and conversion # Function: Isocitrate dehydrogenases # Organism: Escherichia coli K12 # 1 416 1 416 416 840 99.0 0 MESKVVVPAQGKKITLQNGKLNVPENPIIPYIEGDGIGVDVTPAMLKVVDAAVEKAYKGE RKISWMEIYTGEKSTQVYGQDVWLPAETLDLIREYRVAIKGPLTTPVGGGIRSLNVALRQ ELDLYICLRPVRYYQGTPSPVKHPELTDMVIFRENSEDIYAGIEWKADSADAEKVIKFLR EEMGVKKIRFPEHCGIGIKPCSEEGTKRLVRAAIEYAIANDRDSVTLVHKGNIMKFTEGA FKDWGYQLAREEFGGELIDGGPWLKVKNPNTGKEIVIKDVIADAFLQQILLRPAEYDVIA CMNLNGDYISDALAAQVGGIGIAPGANIGDECALFEATHGTAPKYAGQDKVNPGSIILSA EMMLRHMGWTEAADLIVKGMEGAINAKTVTYDFERLMEGAKLLKCSEFGEAIIENM >gi|296494667|gb|ADTN01000071.1| GENE 25 28099 - 28503 71 134 aa, chain - ## HITS:1 COG:ycgX KEGG:ns NR:ns ## COG: ycgX COG5562 # Protein_GI_number: 16129124 # Func_class: R General function prediction only # Function: Phage envelope protein # Organism: Escherichia coli K12 # 1 134 1 134 134 265 100.0 1e-71 MDQVVVFQKMFEQVRKEQNFSWFYSELKHHRIAHYIYYLATDNIRIITHDDTVLLLRGTR NLLKVSTTKNPAKIKEAALLHICGKSTFREYCSTLAGAGVFRWVTDVNHNKRSYYAIDNT LLYIEDVENNKPLI >gi|296494667|gb|ADTN01000071.1| GENE 26 28724 - 29455 389 243 aa, chain - ## HITS:1 COG:ycgE KEGG:ns NR:ns ## COG: ycgE COG0789 # Protein_GI_number: 16129125 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 243 1 243 243 475 100.0 1e-134 MAYYSIGDVAERCGINPVTLRAWQRRYGLLKPQRSEGGHRLFDEEDIQRIEEIKRWISNG VPVGKVKALLETTSQDTEDDWSRLQEEMMSILRMANPAKLRARIISLGREYPVDQLINHV YLPVRQRLVLDHNTSRIMSSMFDGALIEYAATSLFEMRRKPGKEAILMAWNVEERARLWL EAWRLSLSGWHISVLADPIESPRPELFPTQTLIVWTGMAPTRRQNELLQHWGEQGYKVIF HAP >gi|296494667|gb|ADTN01000071.1| GENE 27 29660 - 30190 323 176 aa, chain - ## HITS:1 COG:ycgF_2 KEGG:ns NR:ns ## COG: 
ycgF_2 COG2200 # Protein_GI_number: 16129126 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 1 176 68 243 243 338 100.0 4e-93 MAHALELGDKMISINLLPMTLVNEPDAVSFLLNEIKANALVPEQIIVEFTESEVISRFDE FAEAIKSLKAAGISVAIDHFGAGFAGLLLLSRFQPDRIKISQELITNVHKSGPRQAIIQA IIKCCTSLEIQVSAMGVATPEEWMWLESAGIEMFQGDLFAKAKLNGIPSIAWPEKK Prediction of potential genes in microbial genomes Time: Sun May 15 23:21:53 2011 Seq name: gi|296494666|gb|ADTN01000072.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont210.1, whole genome shotgun sequence Length of sequence - 9897 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 8, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 19 - 339 235 ## ECSE_2263 hypothetical protein - Prom 432 - 491 4.1 - TRNA 709 - 784 87.1 # Asn GTT 0 0 - Term 739 - 777 -0.8 2 2 Tu 1 . - CDS 885 - 1682 408 ## COG3228 Uncharacterized protein conserved in bacteria - Prom 1806 - 1865 80.3 + TRNA 1776 - 1865 75.5 # Ser CGA 0 0 - Term 2150 - 2185 2.3 3 3 Tu 1 1/0.500 - CDS 2435 - 2965 244 ## COG3038 Cytochrome B561 - Prom 3044 - 3103 4.0 - Term 3173 - 3217 9.1 4 4 Tu 1 3/0.500 - CDS 3308 - 3979 382 ## COG3443 Predicted periplasmic or secreted protein - Prom 4085 - 4144 8.9 5 5 Op 1 13/0.000 - CDS 4215 - 4850 507 ## COG2717 Predicted membrane protein 6 5 Op 2 3/0.500 - CDS 4851 - 5798 897 ## COG2041 Sulfite oxidase and related enzymes - Prom 5854 - 5913 3.1 7 6 Tu 1 . - CDS 5964 - 6377 286 ## COG2351 Transthyretin-like protein - Prom 6422 - 6481 5.2 + Prom 6325 - 6384 4.8 8 7 Op 1 40/0.000 + CDS 6462 - 7181 189 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Prom 7333 - 7392 4.1 9 7 Op 2 . + CDS 7637 - 8539 198 ## COG0642 Signal transduction histidine kinase + Term 8579 - 8624 1.3 - Term 8567 - 8611 9.4 10 8 Tu 1 . 
- CDS 8647 - 9498 721 ## COG0693 Putative intracellular protease/amidase Predicted protein(s) >gi|296494666|gb|ADTN01000072.1| GENE 1 19 - 339 235 106 aa, chain - ## HITS:1 COG:no KEGG:ECSE_2263 NR:ns ## KEGG: ECSE_2263 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 103 29 131 538 204 99.0 6e-52 MGIKLRRLTAGICLITQLAFPMAAAAQGVVNAATQQPVPAQIAIANANTVPYTLGALESA QSVAERFGISVAELRKLNQFRTFARGFDNVRQGDELDVPAQVSGLC >gi|296494666|gb|ADTN01000072.1| GENE 2 885 - 1682 408 265 aa, chain - ## HITS:1 COG:ECs2774 KEGG:ns NR:ns ## COG: ECs2774 COG3228 # Protein_GI_number: 15832028 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 265 14 278 278 532 100.0 1e-151 MIKWPWKVQESAHQTALPWQEALSIPLLTCLTEQEQSKLVTLAERFLQQKRLVPLQGFEL DSLRSCRIALLFCLPVLELGLEWLDGFHEVLIYPAPFVVDDEWEDDIGLVHNQRIVQSGQ SWQQGPIVLNWLDIQDSFDASGFNLIIHEVAHKLDTRNGDRASGVPFIPLREVAGWEHDL HAAMNNIQEEIELVGENAASIDAYAASDPAECFAVLSEYFFSAPELFAPRFPSLWQRFCQ FYQQDPLQRLHHANDTDSFSATNVH >gi|296494666|gb|ADTN01000072.1| GENE 3 2435 - 2965 244 176 aa, chain - ## HITS:1 COG:yodB KEGG:ns NR:ns ## COG: yodB COG3038 # Protein_GI_number: 16129920 # Func_class: C Energy production and conversion # Function: Cytochrome B561 # Organism: Escherichia coli K12 # 1 176 11 186 186 329 100.0 2e-90 MNRFSKTQIYLHWITLLFVAITYAAMELRGWFPKGSSTYLLMRETHYNAGIFVWVLMFSR LIIKHRYSDPSIVPPPPAWQMKAASLMHIMLYITFLALPLLGIALMAYSGKSWSFLGFNV SPFVTPNSEIKALIKNIHETWANIGYFLIAAHAGAALFHHYIQKDNTLLRMMPRRK >gi|296494666|gb|ADTN01000072.1| GENE 4 3308 - 3979 382 223 aa, chain - ## HITS:1 COG:yodA KEGG:ns NR:ns ## COG: yodA COG3443 # Protein_GI_number: 16129919 # Func_class: R General function prediction only # Function: Predicted periplasmic or secreted protein # Organism: Escherichia coli K12 # 8 223 1 216 216 433 99.0 1e-121 MTLEETVLAIRLHKLAVALGVFIVSAPAFSHGHHSHGKPLTEVEQKAANGVFDDANVQNR TLSDWDGVWQSVYPLLQSGKLDPVFQKKADADKTKTFAEIKDYYHKGYATDIEMIGIEDG 
IVEFHRNNETTSCKYDYDGYKILTYKSGKKGVRYLFECKDPESKAPKYIQFSDHIIAPRK SSHFHIFMGNDSQQSLLNEMENWPTYYPYQLSSEEVVEEMMSH >gi|296494666|gb|ADTN01000072.1| GENE 5 4215 - 4850 507 211 aa, chain - ## HITS:1 COG:yedZ KEGG:ns NR:ns ## COG: yedZ COG2717 # Protein_GI_number: 16129918 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 211 1 211 211 337 99.0 9e-93 MRLTAKQVTWLKVSLHLAGLLPFLWLVWAINHGGLGADPVKDIQHFTGRTALKFLLATLL ITPLARYAKQPLLIRTRRLLGLWCFAWATLHLTSYALLELGVNNLALLGKELITRPYLTL GIISWVILLALAFTSTQAMQRKLGKHWQQLHNFVYLVAILAPIHYLWSVKIISPQPLIYA GLAVLLLALRYKKLRSLFNRLRKQVHNKLSV >gi|296494666|gb|ADTN01000072.1| GENE 6 4851 - 5798 897 315 aa, chain - ## HITS:1 COG:ECs2709 KEGG:ns NR:ns ## COG: ECs2709 COG2041 # Protein_GI_number: 15831963 # Func_class: R General function prediction only # Function: Sulfite oxidase and related enzymes # Organism: Escherichia coli O157:H7 # 1 315 20 334 334 631 99.0 0 MKRRQVLKALGISAAALSLPHAAHADLLSWFKGNDRPPAPAGKPLEFSKPAAWQNNLPLT PVDKVSGYNNFYEFGLDKADPAANAGSLKTDPWTLKISGEVAKPLTLDHDDLTRRFPLEE RIYRMRCVEAWSMVVPWIGFPLHKLLALAEPTSNAKYVAFETIYAPEQMPGQQDRFIGGG LKYPYVEGLRLDEAMHPLTLMTVGVYGKALPPQNGAPVRLIVPWKYGFKGIKSIVSIKLT RERPPTTWNLAAPDEYGFYANVNPHVDHPRWSQATERFIGSGGILDVQRQPTLLFNGYAD QVASLYRGLDLRENF >gi|296494666|gb|ADTN01000072.1| GENE 7 5964 - 6377 286 137 aa, chain - ## HITS:1 COG:ECs2708 KEGG:ns NR:ns ## COG: ECs2708 COG2351 # Protein_GI_number: 15831962 # Func_class: R General function prediction only # Function: Transthyretin-like protein # Organism: Escherichia coli O157:H7 # 1 137 1 137 137 270 99.0 7e-73 MLKRYLVLSVATAAFSLPSLVYAAQQNILSVHILNQQTGKPAADVTVTLEKKADNGWLQL NTAKTDKDGRIKALWPEQTATTGDYRVVFKTGDYFKKQNLESFFPEIPVEFHINKVNEHY HVPLLLSQYGYSTYRGS >gi|296494666|gb|ADTN01000072.1| GENE 8 6462 - 7181 189 239 aa, chain + ## HITS:1 COG:ECs2707 KEGG:ns NR:ns ## COG: ECs2707 COG0745 # Protein_GI_number: 15831961 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a 
CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 239 1 239 239 460 99.0 1e-129 MNQAVSITYDLWHIIFMKILLIEDNQRTQEWVTQGLSEAGYVIDAVSDGRDGLYLALKDD YALIILDIMLPGMDGWQILQTLRTAKQTPVICLTARDSVDDRVRGLDSGANDYLVKPFSF SELLARVRAQLRQHHALNSTLEISGLRMDSVSQSVSRDNISITLTRKEFQLLWLLASRAG EIIPRTVIASEIWGINFDSDTNTVDVAIRRLRAKVDDPFPEKLIATIRGMGYSFVAVKK >gi|296494666|gb|ADTN01000072.1| GENE 9 7637 - 8539 198 300 aa, chain + ## HITS:1 COG:yedV KEGG:ns NR:ns ## COG: yedV COG0642 # Protein_GI_number: 16129914 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 300 153 452 452 568 98.0 1e-162 MLEQYKINSIIICIVAIVLCSVLSPLLIRTGLREIKKLSGVTEALNYNYSREPVEVSALP RELKPLGQALNKMHQALVKDFERLSQFADDLAHELRTPINALLGQNQVTLSQTRSIAEYQ KTIAGNIEELENISRLTENILFLARADKNNVLVKLDSLSLNKEVENLLDYLEYLSDEKEI CFKVECNQQIFADKILLQRMLSNLIVNAIRYSPEKSRIHITSFLDTNGYLNIDIASPGTK IHEPEKLFRRFWRGDNSRHSVGQGLGLSLVKAIAELHGGSASYHYLNKHNVFRITLPQRN >gi|296494666|gb|ADTN01000072.1| GENE 10 8647 - 9498 721 283 aa, chain - ## HITS:1 COG:yedU KEGG:ns NR:ns ## COG: yedU COG0693 # Protein_GI_number: 16129913 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Escherichia coli K12 # 1 283 1 283 283 576 99.0 1e-164 MTVQTSKNPQVDIAEDNAFFPSEYSLSQYTSPVSDLDGVDYPKPYRGKHKILVIAADERY LPTDNGKLFSTGNHPIETLLPLYHLHAAGFEFEVATISGLMTKFEYWAMPHKDEKVMPFF EQHKSLFRNPKKLADVVASLNADSEYAAIFVPGGHGALIGLPESQDVAAALQWAIKNDRF VISLCHGPAAFLALRHGDNPLNGYSICAFPDAADKQTPDIGYMPGHLTWYFGEELKKMGM NIINDDITGRVHKDRKLLTGDSPFAANALGKLAAQEMLAAYAG Prediction of potential genes in microbial genomes Time: Sun May 15 23:21:56 2011 Seq name: gi|296494665|gb|ADTN01000073.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont210.2, whole genome shotgun sequence Length of sequence - 1415 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start 
End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 50 - 1123 1169 ## COG3203 Outer membrane protein (porin) - Prom 1215 - 1274 9.0 Predicted protein(s) >gi|296494665|gb|ADTN01000073.1| GENE 1 50 - 1123 1169 357 aa, chain - ## HITS:1 COG:STM1572 KEGG:ns NR:ns ## COG: STM1572 COG3203 # Protein_GI_number: 16764916 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein (porin) # Organism: Salmonella typhimurium LT2 # 1 357 1 362 362 338 57.0 8e-93 MKRKVLAMLVPALLVAGAANAAEIYNKDGNKVDFYGKMVGERIWSNTDDNNSENEDTSYA RFGVKGETQITSELTGFGQFEYNLDASKPEGSNQEKTRLTFAGLKYNELGSFDYGRNYGV AYDAAAYTDMLVEWGGDSWASADNFMNGRTNGVATYRNSDFFGLVDGLNFAVQYQGKNSN RGVTKQNGDGYALSVDYNIEGFGFVGAYSKSDRTNEQAGDGYGDNAEVWSLAAKYDANNI YAAMMYGETRNMTVLANDHFANKTQNFEAVVQYQFDFGLRPSLGYVYSKGKDLYARDGHK GVDADRVNYIEVGTWYYFNKNMNVYTAYKFNLLDKDDAAITDAATDDQFAVGIVYQF Prediction of potential genes in microbial genomes Time: Sun May 15 23:21:59 2011 Seq name: gi|296494664|gb|ADTN01000074.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont210.3, whole genome shotgun sequence Length of sequence - 8499 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 5, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 43 - 102 9.7 1 1 Op 1 . + CDS 336 - 617 305 ## ECED1_2230 conserved hypothetical protein; putative inner membrane protein 2 1 Op 2 4/0.667 + CDS 657 - 1352 564 ## COG1418 Predicted HD superfamily hydrolase 3 1 Op 3 6/0.000 + CDS 1419 - 2837 1422 ## COG0270 Site-specific DNA methylase 4 1 Op 4 . + CDS 2818 - 3288 465 ## COG3727 DNA G:T-mismatch repair endonuclease 5 2 Tu 1 . - CDS 3277 - 4197 958 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 4223 - 4282 4.5 + Prom 4149 - 4208 4.0 6 3 Op 1 3/1.000 + CDS 4376 - 5287 1102 ## COG2354 Uncharacterized protein conserved in bacteria 7 3 Op 2 2/1.000 + CDS 5366 - 5548 224 ## COG5475 Uncharacterized small protein + Prom 5637 - 5696 4.5 8 4 Tu 1 . 
+ CDS 5737 - 7413 1394 ## COG2199 FOG: GGDEF domain + Term 7586 - 7622 3.7 9 5 Tu 1 . - CDS 7410 - 8225 693 ## COG3769 Predicted hydrolase (HAD superfamily) - Prom 8248 - 8307 2.2 Predicted protein(s) >gi|296494664|gb|ADTN01000074.1| GENE 1 336 - 617 305 93 aa, chain + ## HITS:1 COG:no KEGG:ECED1_2230 NR:ns ## KEGG: ECED1_2230 # Name: yedR # Def: conserved hypothetical protein; putative inner membrane protein # Organism: E.coli_ED1a # Pathway: not_defined # 1 93 29 121 121 171 100.0 6e-42 MEERLSRSPGGKPALWAFYTWCGYFVWAMARYIWVMSRIPDAPVSGFESDLGSTAGKWLG ALVGFLFMALVGALLGSIAWYTRPRPARSRRYE >gi|296494664|gb|ADTN01000074.1| GENE 2 657 - 1352 564 231 aa, chain + ## HITS:1 COG:yedJ KEGG:ns NR:ns ## COG: yedJ COG1418 # Protein_GI_number: 16129908 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Escherichia coli K12 # 1 231 1 231 231 422 99.0 1e-118 MDLQHWQAQFENWLKNHHQHQDAAHDVCHFRRVWATAQKLAADDDVDMLVILTACYFHDI VSLAKNHPQRQRSSILAAEETRRLLREEFEQFPAEKIEAVCHAIAAHSFSAQIAPLTTEA KIVQDADRLEALGAIGLARVFAVSGALGVALFDGEDPFAQHRPLDDKRYALDHFQTKLLK LPQTMQTARGKQLAQHNAHFLVEFMAKLSAELAGENEGVDHKVIDAFSPAG >gi|296494664|gb|ADTN01000074.1| GENE 3 1419 - 2837 1422 472 aa, chain + ## HITS:1 COG:ECs2699 KEGG:ns NR:ns ## COG: ECs2699 COG0270 # Protein_GI_number: 15831953 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Escherichia coli O157:H7 # 1 472 1 472 472 961 100.0 0 MQENISVTDSYSTGNAAQAMLEKLLQIYDVKTLVAQLNGVGENHWSAAILKRALANDSAW HRLSEKEFAHLQTLLPKPPAHHPHYAFRFIDLFAGIGGIRRGFESIGGQCVFTSEWNKHA VRTYKANHYCDPATHHFNEDIRDITLSHKEGVSDEAAAEHIRQHIPEHDVLLAGFPCQPF SLAGVSKKNSLGRAHGFACDTQGTLFFDVVRIIDARRPAMFVLENVKNLKSHDQGKTFRI IMQTLDELGYDVADAEDNGPDDPKIIDGKHFLPQHRERIVLVGFRRDLNLKADFTLRDIS ECFPAQRVTLAQLLDPMVEAKYILTPVLWKYLYRYAKKHQARGNGFGYGMVYPNNPQSVT RTLSARYYKDGAEILIDRGWDMATGEKDFDDPLNQQHRPRRLTPRECARLMGFEAPGEAK FRIPVSDTQAYRQFGNSVVVPVFAAVAKLLEPKIKQAVALRQQEAQHGRRSR >gi|296494664|gb|ADTN01000074.1| GENE 4 2818 - 3288 
465 156 aa, chain + ## HITS:1 COG:vsr KEGG:ns NR:ns ## COG: vsr COG3727 # Protein_GI_number: 16129906 # Func_class: L Replication, recombination and repair # Function: DNA G:T-mismatch repair endonuclease # Organism: Escherichia coli K12 # 1 156 1 156 156 312 99.0 2e-85 MADVHDKATRSKNMRAIATRDTAIEKRLASLLTGQGLAFRVQDASLPGSPDFVVDEYRCV IFTHGCFWHHHHCYLFKVPATRTEFWLEKIGKNVERDRRDISRLQELGWRVLIVWECALR GREKLTDEALTERLEEWICGEGASAQIDTQGIHLLA >gi|296494664|gb|ADTN01000074.1| GENE 5 3277 - 4197 958 306 aa, chain - ## HITS:1 COG:ECs2697 KEGG:ns NR:ns ## COG: ECs2697 COG0697 # Protein_GI_number: 15831951 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli O157:H7 # 1 306 1 306 306 449 99.0 1e-126 MRFRQLLPLFGALFALYIIWGSTYFVIRIGVESWPPLMMAGVRFLAAGILLLAFLLLRGH KLPPLRPLLNAALIGLLLLAVGNGMVTVAEHQNVPSGIAAVVVATVPLFTLCFSRLFGIK TRKLEWVGIAIGLAGIIMLNSGGNLSGNPWGAILILIGSISWAFGSVYGSRITLPVGMMA GAIEMLAAGVVLMIASMIAGEKLTALPSLSGFLAVGYLALFGSIIAINAYMYLIRNVSPA LATSYAYVNPVVAVLLGTGLGGETLSKIEWLALGVIVFAVLLVTLGKYLFPAKPVVAPVI QDASSE >gi|296494664|gb|ADTN01000074.1| GENE 6 4376 - 5287 1102 303 aa, chain + ## HITS:1 COG:yedI KEGG:ns NR:ns ## COG: yedI COG2354 # Protein_GI_number: 16129904 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 303 3 305 305 500 99.0 1e-141 MAGSSLLTLLDDIATLLDDISVMGKLAAKKTAGVLGDDLSLNAQQVSGVRANRELPVVWG VAKGSLINKVILVPLALIISAFIPWAITPLLMIGGAFLCFEGVEKVLHMLEARKHKEDPA QSQQRLEKLAAQDPLKFEKDKIKGAIRTDFILSAEIVAITLGIVAEAPLLNQVLVLSGIA LVVTVGVYGLVGVIVKIDDLGYWLAEKSSALMQALGKGLLIIAPWLMKALSIVGTLAMFL VGGGIVVHGIAPLHHAIEHFAGQQSAVVAMILPTVLNLILGFIIGGIVVLGVKAVAKIRG QAH >gi|296494664|gb|ADTN01000074.1| GENE 7 5366 - 5548 224 60 aa, chain + ## HITS:1 COG:ECs2695 KEGG:ns NR:ns ## COG: ECs2695 COG5475 # Protein_GI_number: 15831949 # Func_class: S Function unknown # 
Function: Uncharacterized small protein # Organism: Escherichia coli O157:H7 # 1 60 1 60 60 102 100.0 2e-22 MSFMVSEEVTVKEGGPRMIVTGYSSGMVECRWYDGYGVKREAFHETELVPGEGSRSAEEV >gi|296494664|gb|ADTN01000074.1| GENE 8 5737 - 7413 1394 558 aa, chain + ## HITS:1 COG:yedQ_2 KEGG:ns NR:ns ## COG: yedQ_2 COG2199 # Protein_GI_number: 16129902 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 380 558 1 179 179 356 100.0 6e-98 MENQSWLKKLARRLGPGHVVNLCFIVVLLFSTLLTWREVVVLEDAYISSQRNHLENVANA LDKHLQYNVDKLIFLRNGMREALVAPLDFTSLRDAVTEFEQHRDEHAWQIELNRRRTLSV NGVSDALVSEGNLLSRENESLDNEITAALEVGYLLRLAHNTSSMVEQAMYVSRAGFYVST QPTLFTRNVPTRYYGYVTQPWFIGHSQRENRHRAVRWFTSQPEHASNTEPQVTVSVPVDS NNYWYGVLGMSIPVRTMQQFLRNAIDKNLDGEYQLYDSKLRFLTSSNPDHPTGNIFDPRE LALLAQAMEHDTRGGIRMDSRYVSWERLNHFDGVLVRVHTLSEGVRGDFGSISIALTLLW ALFTTMLLISWYVIRRMVSNMYVLQSSLQWQAWHDTLTRLYNRGALFEKARPLAKLCQTH QHPFSVIQVDLDHFKAINDRFGHQAGDRVLSHAAGLISSSLRAQDVAGRVGGEEFCVILP GASLTEAAEVAERIRLKLNEKEMLIAKSTTIRISASLGVSSSEETGDYDFEQLQSLADRR LYLAKQAGRNRVFASDNA >gi|296494664|gb|ADTN01000074.1| GENE 9 7410 - 8225 693 271 aa, chain - ## HITS:1 COG:ECs2693 KEGG:ns NR:ns ## COG: ECs2693 COG3769 # Protein_GI_number: 15831947 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Escherichia coli O157:H7 # 1 271 1 271 271 532 98.0 1e-151 MFSIQQPLLVFSDLDGTLLDSHSYDWQPAAPWLSRLREANVPVILCSSKTSAEMLYLQKT LGLQGLPLIAENGAVIQLAEQWQDIDGFPRIISGISHGEISQVLNTLREKEHFKFTTFDD VDDATIAEWTGLSRSQAALTQLHEASVTLIWRDSDERMAQFTARLNELGLQFMQGARFWH VLDASAGKDQAANWIIATYQQLSGKRPTTLGLGDGPNDAPLLEVMDYAVIVKGLNREGGH LHDEDPAHVWRTQREGPEGWREGLDHFFSAR Prediction of potential genes in microbial genomes Time: Sun May 15 23:22:07 2011 Seq name: gi|296494663|gb|ADTN01000075.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont210.4, whole genome shotgun sequence Length of sequence - 15678 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 8, operones - 3 average 
op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 31 - 258 299 ## G2583_2404 hypothetical protein - Prom 317 - 376 4.2 2 2 Tu 1 . + CDS 421 - 609 352 ## LF82_0529 protein DsrB + Term 701 - 746 4.0 3 3 Tu 1 4/1.000 - CDS 653 - 1276 377 ## COG2771 DNA-binding HTH domain-containing proteins - Prom 1428 - 1487 2.7 4 4 Op 1 17/0.000 - CDS 1566 - 2351 749 ## COG1684 Flagellar biosynthesis pathway, component FliR 5 4 Op 2 16/0.000 - CDS 2359 - 2628 396 ## COG1987 Flagellar biosynthesis pathway, component FliQ 6 4 Op 3 6/1.000 - CDS 2638 - 3375 720 ## COG1338 Flagellar biosynthesis pathway, component FliP 7 4 Op 4 6/1.000 - CDS 3375 - 3710 220 ## COG3190 Flagellar biogenesis protein 8 4 Op 5 20/0.000 - CDS 3713 - 4126 520 ## COG1886 Flagellar motor switch/type III secretory pathway protein 9 4 Op 6 13/0.000 - CDS 4123 - 5127 1099 ## COG1868 Flagellar motor switch protein 10 4 Op 7 7/1.000 - CDS 5132 - 5596 549 ## COG1580 Flagellar basal body-associated protein - Prom 5629 - 5688 2.2 11 5 Op 1 8/0.000 - CDS 5701 - 6828 817 ## COG3144 Flagellar hook-length control protein 12 5 Op 2 12/0.000 - CDS 6825 - 7268 529 ## COG2882 Flagellar biosynthesis chaperone 13 5 Op 3 13/0.000 - CDS 7287 - 8660 1641 ## COG1157 Flagellar biosynthesis/type III secretory pathway ATPase 14 5 Op 4 15/0.000 - CDS 8660 - 9346 882 ## COG1317 Flagellar biosynthesis/type III secretory pathway protein 15 5 Op 5 19/0.000 - CDS 9339 - 10334 1281 ## COG1536 Flagellar motor switch protein 16 5 Op 6 . - CDS 10327 - 11985 1575 ## COG1766 Flagellar biosynthesis/type III secretory pathway lipoprotein - Prom 12071 - 12130 2.2 + Prom 12117 - 12176 3.6 17 6 Tu 1 . + CDS 12200 - 12514 453 ## COG1677 Flagellar hook-basal body protein + Prom 12674 - 12733 2.2 18 7 Tu 1 . 
+ CDS 12848 - 13180 296 ## COG2076 Membrane transporters of cations and cationic drugs + Term 13268 - 13294 -0.6 + Prom 13243 - 13302 6.3 19 8 Op 1 3/1.000 + CDS 13349 - 13900 23 ## COG1881 Phospholipid-binding protein 20 8 Op 2 . + CDS 13910 - 14707 -42 ## COG2207 AraC-type DNA-binding domain-containing proteins Predicted protein(s) >gi|296494663|gb|ADTN01000075.1| GENE 1 31 - 258 299 75 aa, chain - ## HITS:1 COG:no KEGG:G2583_2404 NR:ns ## KEGG: G2583_2404 # Name: yodD # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 75 6 80 80 112 100.0 4e-24 MKTAKEYSDTAKREVSVDVDALLAAINEISESEVHRSQNDSEHVSVDGREYHTWRELADA FELDIHDFSVSEVNR >gi|296494663|gb|ADTN01000075.1| GENE 2 421 - 609 352 62 aa, chain + ## HITS:1 COG:no KEGG:LF82_0529 NR:ns ## KEGG: LF82_0529 # Name: dsrB # Def: protein DsrB # Organism: E.coli_LF82 # Pathway: not_defined # 1 62 1 62 62 124 100.0 1e-27 MKVNDRVTVKTDGGPRRPGVVLAVEEFSEGTMYLVSLEDYPLGIWFFNEAGHQDGIFVEK AE >gi|296494663|gb|ADTN01000075.1| GENE 3 653 - 1276 377 207 aa, chain - ## HITS:1 COG:ECs2690 KEGG:ns NR:ns ## COG: ECs2690 COG2771 # Protein_GI_number: 15831944 # Func_class: K Transcription # Function: DNA-binding HTH domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 207 1 207 207 379 99.0 1e-105 MSTIIMDLCSYTRLGLTGYLLSRGVKKREINDIETVDDLAIACDSQRPSVVFINEDCFIH DASNSQHIKHIINQHPNTLFIVFMAIANVHFDEYLLVRKNLLISSKSIKPESLDDILGDI LKKETTITSFLNMPTLSLSRTESSMLRMWMAGQGTIQISDQMNIKAKTVSSHKGNIKRKI KTHNKQVIYHVVRLTDNVTNGIFVNMR >gi|296494663|gb|ADTN01000075.1| GENE 4 1566 - 2351 749 261 aa, chain - ## HITS:1 COG:fliR KEGG:ns NR:ns ## COG: fliR COG1684 # Protein_GI_number: 16129897 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FliR # Organism: Escherichia coli K12 # 1 261 1 261 261 363 98.0 1e-100 MLQVTSDQWLSWLSLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 
NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC EHLFSEMFNLLADIISELPLI >gi|296494663|gb|ADTN01000075.1| GENE 5 2359 - 2628 396 89 aa, chain - ## HITS:1 COG:STM1980 KEGG:ns NR:ns ## COG: STM1980 COG1987 # Protein_GI_number: 16765318 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FliQ # Organism: Salmonella typhimurium LT2 # 1 89 1 89 89 98 95.0 3e-21 MTPESVMMMGTEAMKVALALAAPLLLVALVTGLIISILQAATQINEMTLSFIPKIIAVFI AIIIAGPWMLNLLLDYVRTLFTNLPYIIG >gi|296494663|gb|ADTN01000075.1| GENE 6 2638 - 3375 720 245 aa, chain - ## HITS:1 COG:ECs2687 KEGG:ns NR:ns ## COG: ECs2687 COG1338 # Protein_GI_number: 15831941 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FliP # Organism: Escherichia coli O157:H7 # 1 245 1 245 245 392 99.0 1e-109 MRRLLSVAPVLLWLVTPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA QSFYS >gi|296494663|gb|ADTN01000075.1| GENE 7 3375 - 3710 220 111 aa, chain - ## HITS:1 COG:ECs2686 KEGG:ns NR:ns ## COG: ECs2686 COG3190 # Protein_GI_number: 15831940 # Func_class: N Cell motility # Function: Flagellar biogenesis protein # Organism: Escherichia coli O157:H7 # 21 111 1 101 101 118 85.0 3e-27 MNNHATVQSSTPVSAAPLLQVSGALIAIIALILAAAWLVKRLGFAPKRTGVNGLKISASA SLGARERVVLGVTAGQINLLHKLPPSAPTEEIPQTDFQSVMKNLLKRSGRS >gi|296494663|gb|ADTN01000075.1| GENE 8 3713 - 4126 520 137 aa, chain - ## HITS:1 COG:ECs2685 KEGG:ns NR:ns ## COG: ECs2685 COG1886 # Protein_GI_number: 15831939 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar motor switch/type III secretory pathway protein 
# Organism: Escherichia coli O157:H7 # 1 137 1 137 137 245 100.0 1e-65 MSDMNNPADDNNGAMDDLWAEALSEQKSTSSKSAADAVFQQFGGGDVSGTLQDIDLIMDI PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV RITDIITPSERMRRLSR >gi|296494663|gb|ADTN01000075.1| GENE 9 4123 - 5127 1099 334 aa, chain - ## HITS:1 COG:ECs2684 KEGG:ns NR:ns ## COG: ECs2684 COG1868 # Protein_GI_number: 15831938 # Func_class: N Cell motility # Function: Flagellar motor switch protein # Organism: Escherichia coli O157:H7 # 1 334 1 334 334 649 100.0 0 MGDSILSQAEIDALLNGDSEVKDEPTASVSGESDIRPYDPNTQRRVVRERLQALEIINER FARHFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSP SLVFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYV RSEMQVKFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENSR NEDQNWRDNLVRQVQHSQLELVANFADISLRLSQILKLKPGDVLPIEKPDRIIAHVDGVP VLTSQYGTLNGQYALRIEHLINPILNSLNEEQPK >gi|296494663|gb|ADTN01000075.1| GENE 10 5132 - 5596 549 154 aa, chain - ## HITS:1 COG:ECs2683 KEGG:ns NR:ns ## COG: ECs2683 COG1580 # Protein_GI_number: 15831937 # Func_class: N Cell motility # Function: Flagellar basal body-associated protein # Organism: Escherichia coli O157:H7 # 1 154 1 154 154 289 100.0 2e-78 MTDYAISKKSKRSLWIPILVFITLAACASAGYSYWHSHQVAADDKAQQRVVPSPVFYALD TFTVNLGDADRVLYIGITLRLKDEATRSRLSEYLPEVRSRLLLLFSRQDAAVLATEEGKK NLIAEIKTTLSTPLVAGQPKQDVTDVLYTAFILR >gi|296494663|gb|ADTN01000075.1| GENE 11 5701 - 6828 817 375 aa, chain - ## HITS:1 COG:fliK KEGG:ns NR:ns ## COG: fliK COG3144 # Protein_GI_number: 16129890 # Func_class: N Cell motility # Function: Flagellar hook-length control protein # Organism: Escherichia coli K12 # 1 375 1 375 375 521 97.0 1e-147 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK GEPLISDILADAQQADLLIPVDETPPVINDEQSTSTPLTTAQTMTLAAVAGNNTAKDEKA DDLNEDVTASLSALFAMLPGFDNTPKVTDVPSTVLPAEKPTLFTKLTSAQLTTAQPDDAP GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 
LQVRVTGNSGVDIFA >gi|296494663|gb|ADTN01000075.1| GENE 12 6825 - 7268 529 147 aa, chain - ## HITS:1 COG:fliJ KEGG:ns NR:ns ## COG: fliJ COG2882 # Protein_GI_number: 16129889 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport; O Posttranslational modification, protein turnover, chaperones # Function: Flagellar biosynthesis chaperone # Organism: Escherichia coli K12 # 1 147 1 147 147 206 99.0 2e-53 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG MTSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST AALLAENRLDQKKMDEFAQRAAMRKPE >gi|296494663|gb|ADTN01000075.1| GENE 13 7287 - 8660 1641 457 aa, chain - ## HITS:1 COG:ZfliI KEGG:ns NR:ns ## COG: ZfliI COG1157 # Protein_GI_number: 15802376 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis/type III secretory pathway ATPase # Organism: Escherichia coli O157:H7 EDL933 # 13 457 52 496 496 824 99.0 0 MTTRLTRWLTTLDNFEAKMAQLPAVRRYGRLTRATGLVLEATGLQLPLGATCVIERQNGT ETHEVESEVVGFNGQRLFLMPLEEVEGVLPGARVYAKNISAEGLQSGKQLPLGPALLGRV LDGSGKPLDGLPSPDTTETGALITPPFNPLQRTPIEHVLDTGVHPINALLTVGRGQRMGL FAGSGVGKSVLLGMMARYTRADVIVVGLIGERGREVKDFIENILGAEGRARSVVIAAPAD VSPLLRMQGAAYATRIAEDFRDRGQHVLLIMDSLTRYAMAQREIALAIGEPPATKGYPPS VFAKLPALVERAGNGISGGGSITAFYTVLTEGDDQQDPIADSARAILDGHIVLSRRLAEA GHYPAIDIEASISRAMTALISEQHYARVRTFKQLLSSFQRNRDLVSVGAYAKGSDPMLDK AIALWPQLEGYLQQGIFERADWEASLQGLERIFPTVS >gi|296494663|gb|ADTN01000075.1| GENE 14 8660 - 9346 882 228 aa, chain - ## HITS:1 COG:fliH KEGG:ns NR:ns ## COG: fliH COG1317 # Protein_GI_number: 16129887 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis/type III secretory pathway protein # Organism: Escherichia coli K12 # 1 228 8 235 235 361 98.0 1e-100 MSDNLPWKTWTPDDLAPPQAEFVPMVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI AEGRQQGHEQGYQEGLARGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 
MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV >gi|296494663|gb|ADTN01000075.1| GENE 15 9339 - 10334 1281 331 aa, chain - ## HITS:1 COG:ECs2678 KEGG:ns NR:ns ## COG: ECs2678 COG1536 # Protein_GI_number: 15831932 # Func_class: N Cell motility # Function: Flagellar motor switch protein # Organism: Escherichia coli O157:H7 # 1 331 1 331 331 551 100.0 1e-157 MSNLTGTDKSVILLMTIGEDRAAEVFKHLSQREVQTLSAAMANVTQISNKQLTDVLAEFE QEAEQFAALNINANDYLRSVLVKALGEERAASLLEDILETRDTASGIETLNFMEPQSAAD LIRDEHPQIIATILVHLKRAQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLN GLLDGQNLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENLV DVDDRSIQRLLQEVDSESLLIALKGAEQPLREKFLRNMSQRAADILRDDLANRGPVRLSQ VENEQKAILLIVRRLAETGEMVIGSGEDTYV >gi|296494663|gb|ADTN01000075.1| GENE 16 10327 - 11985 1575 552 aa, chain - ## HITS:1 COG:fliF KEGG:ns NR:ns ## COG: fliF COG1766 # Protein_GI_number: 16129885 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis/type III secretory pathway lipoprotein # Organism: Escherichia coli K12 # 1 552 1 552 552 963 99.0 0 MNATAAQTKSLEWLNRLRANPKIPLIVAGSAAVAVMVALILWAKAPDYRTLFSNLSDQDG GAIVSQLTQMNIPYRFSEASGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGI SQFSEQVNYQRALEGELSRTIETIGPVKGARVHLAMPKPSLFVREQKSPSASVTVNLLPG RALDEGQISAIVHLVSSAVAGLPPGNVTLVDQGGHLLTQSNTSGRDLNDAQLKYASDVEG RIQRRIEAILSPIVGNGNIHAQVTAQLDFASKEQTEEQYRPNGDESQAALRSRQLNESEQ SGSGYPGGVPGALSNQPAPANNAPISTPPANQNNRQQQASTTSNSGPRSTQRNETSNYEV DRTIRHTKMNVGDVQRLSVAVVVNYKTLPDGKPLPLSNEQMKQIEDLTREAMGFSEKRGD SLNVVNSPFNSSDESGGELPFWQQQAFIDQLLAAGRWLLVLLVAWLLWRKAVRPQLTRRA EAMKAVQQQAQAREEVEDAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPRVVA LVIRQWINNDHE >gi|296494663|gb|ADTN01000075.1| GENE 17 12200 - 12514 453 104 aa, chain + ## HITS:1 COG:ECs2676 KEGG:ns NR:ns ## COG: ECs2676 COG1677 # Protein_GI_number: 15831930 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar 
hook-basal body protein # Organism: Escherichia coli O157:H7 # 1 104 1 104 104 144 99.0 5e-35 MSAIQGIEGVISQLQATAMSARAQDSLPQPTISFAGQLHAALDRISDTQTAARTQAEKFT LGEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV >gi|296494663|gb|ADTN01000075.1| GENE 18 12848 - 13180 296 110 aa, chain + ## HITS:1 COG:ECs1614 KEGG:ns NR:ns ## COG: ECs1614 COG2076 # Protein_GI_number: 15830868 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane transporters of cations and cationic drugs # Organism: Escherichia coli O157:H7 # 1 110 1 110 110 186 98.0 8e-48 MNPYIYLGGAILAEVIGTTLMKFSEGFTRLWPSVGTIICYCASFWLLAQTLAYIPTGIAY AIWSGVGIVLISLLSWGFFGQRLDLPAIIGMMLICAGVLVINLLSRSAPH >gi|296494663|gb|ADTN01000075.1| GENE 19 13349 - 13900 23 183 aa, chain + ## HITS:1 COG:ybcL KEGG:ns NR:ns ## COG: ybcL COG1881 # Protein_GI_number: 16128528 # Func_class: R General function prediction only # Function: Phospholipid-binding protein # Organism: Escherichia coli K12 # 1 183 1 183 183 338 95.0 3e-93 MKKLIVSSVLAFITFSAQAAAFQVTSNEIKTGEQLTTSHVFSGFGCEGGNTSPSLTWSGA PEGTKSFAVTVYDPDAPTGSGWWHWTVANIPATVTYLPADAGRRDGTKLPTGAVQGRNDF GYAGFGGACPPKGDKPHHYQFKIWALKTDKIPVDSNSSGALVGYMLNANKIATAEIIPVY EVK >gi|296494663|gb|ADTN01000075.1| GENE 20 13910 - 14707 -42 265 aa, chain + ## HITS:1 COG:ybcM KEGG:ns NR:ns ## COG: ybcM COG2207 # Protein_GI_number: 16128529 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 265 1 265 265 494 91.0 1e-140 MLAKDRTNLKIEEIRMHKHHEIHRVKPLMPALCRIRQGKKIINWETHSLTVDNNQIILFP CGYEFYIANYPEAGLYLAEMLYYPIDLIEKFQNLYAITDQIRNTTGFCLPQNPELIYCWE QLKTSISRGFSTQIQEHLAMGVLLSLGAHHVNCLLLSDSKQSLISRCYNLMLSEPGTKWT ANKVARYLYISVSTLHRRLASEGVSFQSILDDVRLNNALSAIQTTVKPISEIARENGYKC PSRFTERFHNRFKITPRELRKASRE Prediction of potential genes in microbial genomes Time: Sun May 15 23:22:12 2011 Seq name: gi|296494662|gb|ADTN01000076.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont211.1, whole genome shotgun sequence Length of sequence - 1740 bp Number of 
predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 511 -11 ## COG3209 Rhs family protein 2 1 Op 2 . + CDS 580 - 837 60 ## B21_00650 hypothetical protein 3 2 Tu 1 . + CDS 955 - 1738 464 ## COG3209 Rhs family protein Predicted protein(s) >gi|296494662|gb|ADTN01000076.1| GENE 1 2 - 511 -11 169 aa, chain + ## HITS:1 COG:rhsC KEGG:ns NR:ns ## COG: rhsC COG3209 # Protein_GI_number: 16128676 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli K12 # 1 169 1229 1397 1397 380 100.0 1e-106 WNFYQYPLNPISNIDPLGLETLKCIKPLHSMGGTGERSGPDIWGNPFYHQYLCVPDGKGD YTCGGQDQRGESKGDGLWGPGKASNDTKEAAGRCDLVETDNSCVENCLKGKFKEVRPRYS VLPDIFTPINLGLFKNCQDWSNDSLETCKMKCSGNNIGRFIRFVFTGVM >gi|296494662|gb|ADTN01000076.1| GENE 2 580 - 837 60 85 aa, chain + ## HITS:1 COG:no KEGG:B21_00650 NR:ns ## KEGG: B21_00650 # Name: ybfB # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 85 24 108 108 124 100.0 7e-28 MHRLSLLDSTRDVSELISLMSYGMMVICFPTGIVFFIALIFIGTVSDIIGVRIDSKYIMA IIIWLYFLSGGYIQWFVLSKRIINK >gi|296494662|gb|ADTN01000076.1| GENE 3 955 - 1738 464 261 aa, chain + ## HITS:1 COG:rhsC KEGG:ns NR:ns ## COG: rhsC COG3209 # Protein_GI_number: 16128676 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli K12 # 1 261 923 1183 1397 523 99.0 1e-149 MWPDNRIARDAHYLYRYDRHGRLTEKTDLIPEGVIRTDDERTHRYHYDSQHRLVHYTRTQ YEEPLVESRYLYDPLGRRVAKRVWRRERDLTGWMSLSRKPQVTWYGWDGDRLTTIQNDRT RIQTIYQPGSFTPLIRVETATGELAKTQRRSLADTLQQSGGEDGGSVVFPPVLVQMLDRL ESEILADRVSEESRRWLASCGLTVAQMQSQMDPVYTPARKIHLYHCDHRGLPLALISTEG TTAWYAEYDEWGNLLNEENPH Prediction of potential genes in microbial genomes Time: Sun May 15 23:22:19 2011 Seq name: gi|296494661|gb|ADTN01000077.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont212.1, whole genome shotgun sequence Length of sequence - 19806 bp 
Number of predicted genes - 29, with homology - 27 Number of transcription units - 7, operones - 6 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 8/0.000 - CDS 36 - 1091 214 ## COG0630 Type IV secretory pathway, VirB11 components, and related ATPases involved in archaeal flagella biosynthesis 2 1 Op 2 11/0.000 - CDS 1134 - 2273 471 ## COG2948 Type IV secretory pathway, VirB10 components 3 1 Op 3 8/0.000 - CDS 2263 - 3030 188 ## COG3504 Type IV secretory pathway, VirB9 components 4 1 Op 4 . - CDS 3030 - 3764 325 ## COG3736 Type IV secretory pathway, component VirB8 5 1 Op 5 . - CDS 3764 - 3898 113 ## gi|10955509|ref|NP_065361.1| conjugal transfer lipoprotein TraF 6 1 Op 6 . - CDS 3930 - 6287 754 ## COG3451 Type IV secretory pathway, VirB4 components - Prom 6404 - 6463 7.2 - Term 6618 - 6662 5.2 7 2 Op 1 . - CDS 6684 - 6974 248 ## ECH74115_A0010 conjugal transfer prepropilin 8 2 Op 2 . - CDS 6974 - 7558 209 ## ECH74115_A0009 conjugal transfer protein 9 2 Op 3 . - CDS 7579 - 7977 251 ## gi|10955514|ref|NP_065366.1| hypothetical protein R721_75 10 3 Op 1 . - CDS 8135 - 8296 79 ## 11 3 Op 2 . - CDS 8317 - 8754 326 ## ECED1_3503 putative PilM protein 12 3 Op 3 . - CDS 8760 - 9488 51 ## ECL_00414 pilus biogenesis protein PilL 13 4 Op 1 . - CDS 9998 - 10390 231 ## SeD_B0090 killer protein - Term 10410 - 10436 -0.7 14 4 Op 2 . - CDS 10457 - 11092 82 ## SeHA_A0126 hypothetical protein - Term 11112 - 11138 -0.7 15 4 Op 3 . - CDS 11165 - 11452 309 ## gi|301644631|ref|ZP_07244618.1| conserved domain protein 16 4 Op 4 . - CDS 11465 - 11719 101 ## gi|301644632|ref|ZP_07244619.1| conserved hypothetical protein 17 4 Op 5 . - CDS 11721 - 12362 122 ## ECH74115_A0008 conjugal transfer protein TrbJ 18 4 Op 6 . - CDS 12368 - 12637 77 ## gi|301644634|ref|ZP_07244621.1| TrbL/VirB6 plasmid conjugal transfer protein - Prom 12718 - 12777 5.7 19 5 Op 1 . - CDS 13367 - 13624 158 ## gi|10955523|ref|NP_065375.1| hypothetical protein R721_84 20 5 Op 2 . 
- CDS 13621 - 13923 99 ## gi|10955524|ref|NP_065376.1| hypothetical protein R721_85 21 5 Op 3 . - CDS 13939 - 14154 114 ## - Prom 14267 - 14326 3.7 22 6 Op 1 . - CDS 14338 - 14799 172 ## gi|301644636|ref|ZP_07244623.1| conserved hypothetical protein 23 6 Op 2 . - CDS 14810 - 14980 171 ## gi|195940671|ref|ZP_03086053.1| hypothetical protein EscherichcoliO157_30478 24 6 Op 3 26/0.000 - CDS 14984 - 15427 288 ## COG1585 Membrane protein implicated in regulation of membrane protease activity - Term 15747 - 15790 -0.9 25 6 Op 4 . - CDS 15801 - 16778 885 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs 26 6 Op 5 . - CDS 16781 - 16957 253 ## EC55989_1040 hypothetical protein 27 6 Op 6 . - CDS 16950 - 17165 103 ## ECED1_3501 hypothetical protein 28 6 Op 7 . - CDS 17158 - 17664 256 ## plu0270 hypothetical protein + Prom 18626 - 18685 2.8 29 7 Tu 1 . + CDS 18742 - 19770 504 ## SeHA_A0002 replication initiation protein Predicted protein(s) >gi|296494661|gb|ADTN01000077.1| GENE 1 36 - 1091 214 351 aa, chain - ## HITS:1 COG:XFa0015 KEGG:ns NR:ns ## COG: XFa0015 COG0630 # Protein_GI_number: 10956726 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB11 components, and related ATPases involved in archaeal flagella biosynthesis # Organism: Xylella fastidiosa 9a5c # 7 348 6 346 347 243 36.0 4e-64 MNNIRDTSNTSQPRDNSKTARRYLDMTGIQSVLNIEHVTEISVNKPGTIWFEGKNGWESK DAPDATFDNLMTLAKTLTNLSKIKIPLSHDNPIASVVLPGGERGQIIIPPATENNSVVIS IRKPSLTRFTIDDYVRTGRFDNVRIATKHEAILTERQRYLYELSRRPDGQSKAQFLREAV KDRLNFLIVGGTGSGKTTIAKAIADIFPPERRYITVEDVPEMSLPLHPNHIRLFYKKNTV EAKEIIEACMRLKPDHIFLAELRGNEAWSYLEALNTGHEGSISTIHANNTYASFSRLASI VKQSDVGMTVDMDLIMRTIKTSIDVILFFNHTRMTELYYEPEEKNRFLSTM >gi|296494661|gb|ADTN01000077.1| GENE 2 1134 - 2273 471 379 aa, chain - ## HITS:1 COG:XFa0014 KEGG:ns NR:ns ## COG: XFa0014 COG2948 # Protein_GI_number: 10956725 # Func_class: U Intracellular trafficking, secretion, and vesicular 
transport # Function: Type IV secretory pathway, VirB10 components # Organism: Xylella fastidiosa 9a5c # 162 378 229 443 447 162 42.0 1e-39 MKNKDDEKNNDAGNRGIIEVKGKAAPKKILILIILLIAALFVIIILFKVLSREQVVQQTP LEKSDETLVTNTNNGVSLTTMMKNIEEKEKLDAANRRKAQEEQEQKQADNAPSAPASQKA DQTAVNVIANGTPQTASGDPNQPQPLPKSVRQLMGDTMVKIDNQEPGEKNQERDDLQGSQ YADGKVSPVLNRRYLLSAGTALSCVLKTKIVTSYPGITMCQLTRDVWSDNGEVLLARKGA LLIGEQNKVMTQGVARVFVNWTTLKDENVNVRIGALGTDSLGASGLPAWVDNHFGQRFGG ALLLSLLGDGLDILKNSTQQTGSNSNITYENTSDATKEMAKTTLDNTINIPPTAYINQGT VLSVIVPRNIDFSSVYELQ >gi|296494661|gb|ADTN01000077.1| GENE 3 2263 - 3030 188 255 aa, chain - ## HITS:1 COG:XFa0013 KEGG:ns NR:ns ## COG: XFa0013 COG3504 # Protein_GI_number: 10956724 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB9 components # Organism: Xylella fastidiosa 9a5c # 8 246 4 263 278 114 29.0 2e-25 MLKRTTAIVLLSMLQVPGALAAMYGTPSERDGRIQTVDYNEQDVFNVRVKAGAQTTIKFG QDETIKDVGIGDPEAWSVSVRDNTLFLRPKAEEPDTNVTVQTNKHIYPLYLISTTKQPTY ILRFNYPKPPSATVMTEKPFPCTDGGIINGHYQLKGDKSIFPYQVWDNGEFTCMRWTNKQ EIPVLYRVDADGNEHLVNGDRNKNTMVYYDVAENLRLRLGDQVADIRTSSIVNRPWNKKG TSNGKTRVEKFSYEK >gi|296494661|gb|ADTN01000077.1| GENE 4 3030 - 3764 325 244 aa, chain - ## HITS:1 COG:SMa1308 KEGG:ns NR:ns ## COG: SMa1308 COG3736 # Protein_GI_number: 16263165 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, component VirB8 # Organism: Sinorhizobium meliloti # 23 243 5 222 223 122 31.0 4e-28 MENTNYYEHSKKEATKKKFEEKNEKKDYFKAIRDFERSEIEIIKKKAKTFTILAIGEFVV ICILGFAIASLAPLKTAVPFLVRVDNSTGYTDIAPQLSDAKESYQDVETKYFLSKYLINY EAYDWQTIQEQADTVKTMSSQKVFSAYDTMIRADSSPLNILKNNYKIKVQINSVILLRKD MAQVRFKKMVLDLSGKPAPGYRATEWISTISFDWDKDIKTEKERLVNPLGLQVLSYQPDP EVIK >gi|296494661|gb|ADTN01000077.1| GENE 5 3764 - 3898 113 44 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|10955509|ref|NP_065361.1| ## NR: gi|10955509|ref|NP_065361.1| conjugal transfer lipoprotein TraF [Escherichia coli] # 1 44 1 44 44 87 
100.0 2e-16 MKIPLIIASALLISACSHTPNPVQKKGGWFELNTAIEQIKAGDY >gi|296494661|gb|ADTN01000077.1| GENE 6 3930 - 6287 754 785 aa, chain - ## HITS:1 COG:SMa1315 KEGG:ns NR:ns ## COG: SMa1315 COG3451 # Protein_GI_number: 16263169 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Sinorhizobium meliloti # 21 782 20 788 792 287 29.0 4e-77 MHKIKLKNSDNMDVAREYPDYRFHITDSIIFTSDRKMLASLVVAGIPFETENDNVLTNLF NSVKNFLIGLGKEGDLYLWTHLIKKRITIDGEWNFDNDFLSRFSKKYLALFTSSAFYKST WYLTFGIPYDDVDTGIERMNEMTQQAQKALQPFNASVLSVHNTYISEVADYLSLLLNTEH NMIPLSSTPLSSSICDSEWYFGADVLELRNNESDSKKFATNYILKDFPIETTPGQWDFLL KQPYEFILTQSFIFESPTKTLKNIDSQLNKLQSANDAAKTQQEELEAGKEAVAAGITLFG SLHCALTVFGDTPDQARSNGIKLSAEFITSGKGFRFSRASLASPFVFFSHMPLNKRRPLD TRRTITNLACLMSFHNYSSGKKSGNPIGDGSAIMPLKTVSDSIYWFNTHYSPPEKNVTGQ KIAGHGMILGATGTGKTTFEAAASGFLQRFDPLMFVVDFNRSTELFVRAYGGSYFTLQEG IYTGCNPWQLAEGPESPVWHRLLAFLKRWTQVLARDNQGNPCSDEHGIELNAAVESIMRM PVEERRTALLLDIVSPELQTRLAKWCDNGEYAWAVDSPRNTFNPLHHKKVGFDTTVVLDT KNGIHPACEPLLAVLFFYKEIMQRGGNLMLSIIEEFWMPANFPMTQSMIKSALKAGRMKG EMMWLTSQSPEDAINCAIFAALVQQTATKILLPNPDAKWEGYKEIGLTEKEFEKLKELTK ESRTMLIKQSGSSVFAKMDLFGFDEFIPVLSGSETGLSIFDEIIAEKGDVAPDIWIPELL KRLNG >gi|296494661|gb|ADTN01000077.1| GENE 7 6684 - 6974 248 96 aa, chain - ## HITS:1 COG:no KEGG:ECH74115_A0010 NR:ns ## KEGG: ECH74115_A0010 # Name: not_defined # Def: conjugal transfer prepropilin # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 2 96 3 97 97 90 54.0 1e-17 MKVKSTLQYFILAFFLSLTASVAYAGGLDTATNTMTELKTWAFGFGGVCALCYLIYNIIM AFMEKKSWSDVGMALIYCSLAGGALYAGNWALELFK >gi|296494661|gb|ADTN01000077.1| GENE 8 6974 - 7558 209 194 aa, chain - ## HITS:1 COG:no KEGG:ECH74115_A0009 NR:ns ## KEGG: ECH74115_A0009 # Name: not_defined # Def: conjugal transfer protein # Organism: E.coli_O157_EC4115 # Pathway: Bacterial secretion system [PATH:ecf03070] # 1 194 1 201 203 180 49.0 3e-44 MQLSAAVLAALISQCAPDVSPDTMNALIMTESGANPYVIANVSDGTSKYFKDEKGAIEYA 
EKLTAENKRFSAGLTQIYSKNFPSLNLTNKTVFDPCTNIKAGAAVLTDNYLRQKEGSSNQ KILRALSLYYSGNESTGFIKEKKFNNTSYLERVIRNANNYIVPSIREKTDESEIKPPNDE EKTSPDWDVFGDFS >gi|296494661|gb|ADTN01000077.1| GENE 9 7579 - 7977 251 132 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|10955514|ref|NP_065366.1| ## NR: gi|10955514|ref|NP_065366.1| hypothetical protein R721_75 [Escherichia coli] # 1 120 1 120 132 216 100.0 4e-55 MQDNDDVSELDIATREVLILRCNLIQLASEYIKINPLKNTRIYTDLIHRSADVIKNIEQG NIQNKHNYLSQDVKHTLVLYDLYKSDFEQCPGAKSLIPVREKIVSLKELLSGANELDSIV RRPRRKKRSSRI >gi|296494661|gb|ADTN01000077.1| GENE 10 8135 - 8296 79 53 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKLKNSYTSTPAFLSPKSQAEAERLKHFADMVYMNCATTLSLGTKNNSPSTKE >gi|296494661|gb|ADTN01000077.1| GENE 11 8317 - 8754 326 145 aa, chain - ## HITS:1 COG:no KEGG:ECED1_3503 NR:ns ## KEGG: ECED1_3503 # Name: not_defined # Def: putative PilM protein # Organism: E.coli_ED1a # Pathway: not_defined # 1 145 1 144 144 113 37.0 2e-24 MGWFFSLLLIFLSIGSYYLTDNNNQTIQRMKLLPAKQEAIEFVHFANVINDYLYKYPDKR NSGGTLTSEQIGITPVYDIHHIIYGKRVYIWSADTEGLMSALQQQTKHSAMLGRVKNKKI VDNQGNDMGVTIPSSIPEGSIVFIN >gi|296494661|gb|ADTN01000077.1| GENE 12 8760 - 9488 51 242 aa, chain - ## HITS:1 COG:no KEGG:ECL_00414 NR:ns ## KEGG: ECL_00414 # Name: not_defined # Def: pilus biogenesis protein PilL # Organism: E.cloacae # Pathway: not_defined # 114 239 300 422 428 100 44.0 6e-20 MSKSGFLKKNANNAVPATTLTGPSTPEKQIYFVRHDGKGSLQKAVKEIVPSDWTAEISPE VTKTFRRTISWQGNDQWPYVLDKMLRNYGLTVIKNTEKRHLLIDFQQQKGKTSQKAPVSP DKTPASPVRTPVSSDKLPQKPLTPATVPQKQTEKLTWNLKKGSTLRDGLTEWASSSPCGN STWNLVWDTPVNYRIDAPLIFRGNFESALKDTLMLYLSADKPLYAFRNIAQCVITIKDTQ QE >gi|296494661|gb|ADTN01000077.1| GENE 13 9998 - 10390 231 130 aa, chain - ## HITS:1 COG:no KEGG:SeD_B0090 NR:ns ## KEGG: SeD_B0090 # Name: not_defined # Def: killer protein # Organism: S.enterica_Dublin # Pathway: not_defined # 32 130 1 100 100 147 79.0 1e-34 MKSTILLMVAATAALRPGCADSRTHPVPETPLSKIMLSMIAASAIALSAPVYASDPCASV LCLYGKAIGSGGGSECKSAEKDFFNIIKKKKGSIRWSKTFDARKAFLNQCTTADPAAISK IMSKFGRSWG 
>gi|296494661|gb|ADTN01000077.1| GENE 14 10457 - 11092 82 211 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0126 NR:ns ## KEGG: SeHA_A0126 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 2 211 3 183 183 122 44.0 1e-26 MKSIILSMVAATALLSGCADSKTQPVPESPLTSNDHQITLKIDTVRTDLPDSSPVCLNGM NIYESLSERHPGLFAKNQNVIFRSAFFDDGQAISSETIRKIMQSTPCVESQTTSAGSWEP ATELIPRQAISGDTTLAEKDRTAGFTMRIAPEIVRGNFHLLLTTTDPIRKQTVEQNILLK PDQPLIVSGFTGVNADSQSENRIVIITPHVR >gi|296494661|gb|ADTN01000077.1| GENE 15 11165 - 11452 309 95 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|301644631|ref|ZP_07244618.1| ## NR: gi|301644631|ref|ZP_07244618.1| conserved domain protein [Escherichia coli MS 146-1] # 1 95 1 95 95 147 100.0 2e-34 MNKGTIISLALFCGLLTGCEDKIYDVSYYKEHQDEAQKISDKCKAGEITNNNCKNANEAL YDIKRKEIINQMLGQSYKEKEEHKKKVNELMERLQ >gi|296494661|gb|ADTN01000077.1| GENE 16 11465 - 11719 101 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|301644632|ref|ZP_07244619.1| ## NR: gi|301644632|ref|ZP_07244619.1| conserved hypothetical protein [Escherichia coli MS 146-1] # 1 84 1 84 84 133 100.0 5e-30 MNKSRNAIIASSIIFASLMLVGCKEKIYSVEYYSNNISEATKTLEDCKKGTITDQNCDNA RAALQQKQDSEYKKKVSEMRRRLD >gi|296494661|gb|ADTN01000077.1| GENE 17 11721 - 12362 122 213 aa, chain - ## HITS:1 COG:no KEGG:ECH74115_A0008 NR:ns ## KEGG: ECH74115_A0008 # Name: not_defined # Def: conjugal transfer protein TrbJ # Organism: E.coli_O157_EC4115 # Pathway: Bacterial secretion system [PATH:ecf03070] # 26 202 23 199 213 182 61.0 9e-45 MKDKVLSFLFAGIITITSPSIFASGIPVVDVTAIAKTVEEGLNRAAEAARQLEQLKQQYE QTIRYAEEQKKRLEGFTDFSNGFDSASSYMKDSLSTITNSAKSDLSSLRSQYDLSSNVAE TQQKYDAILAKIKFYENFNNEMHERAERLTTLQKEFASADTPQKKADLSNQLNTEKLTME LQLKQYDIAERQLENERKAQYETYVRQKRRDLS >gi|296494661|gb|ADTN01000077.1| GENE 18 12368 - 12637 77 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|301644634|ref|ZP_07244621.1| ## NR: gi|301644634|ref|ZP_07244621.1| TrbL/VirB6 plasmid conjugal transfer protein [Escherichia coli MS 146-1] # 1 89 243 331 331 155 100.0 
6e-37 MVTVISIFLVEQVGTLCSTLTGGVGINGLTSAANGFGGKVASGFMRASGLRSFTNGFASK MPNPFFNAGSRSASALVQGLNRGSKLKAG >gi|296494661|gb|ADTN01000077.1| GENE 19 13367 - 13624 158 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|10955523|ref|NP_065375.1| ## NR: gi|10955523|ref|NP_065375.1| hypothetical protein R721_84 [Escherichia coli] # 1 85 1 85 85 155 100.0 7e-37 MKKNDDLITLPSHNSHALDYIIKKAFTPRKTKTSTATHNVYSYNKEESTLNFQKIGEAIF FGTDLVIKLKPNISASGELFICQKD >gi|296494661|gb|ADTN01000077.1| GENE 20 13621 - 13923 99 100 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|10955524|ref|NP_065376.1| ## NR: gi|10955524|ref|NP_065376.1| hypothetical protein R721_85 [Escherichia coli] # 1 86 18 103 129 134 95.0 2e-30 MTQAKANLRKLRRARYVPPRTVKKSIANTERVVSKAITGCGVFRDIPQVKINLIETRKIL QIELALSIANLSEANPIEIRKKIKSIDRQLSSTSKIRTTL >gi|296494661|gb|ADTN01000077.1| GENE 21 13939 - 14154 114 71 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSAFIKLASVLALFISPVVVAIAGEQSHCTKINEYPFIVTQCDDGTVTVVNVINNRVAVC RKGESCKEIKL >gi|296494661|gb|ADTN01000077.1| GENE 22 14338 - 14799 172 153 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|301644636|ref|ZP_07244623.1| ## NR: gi|301644636|ref|ZP_07244623.1| conserved hypothetical protein [Escherichia coli MS 146-1] # 1 153 1 153 153 281 100.0 6e-75 MRNANEGVSLSEIVTKATLRKFGIRGVNKYETYTVIIITVAFFILESAAGRAKDTFAVAV KVLSYMLPCTAIIMAVIWIIFFIGDRREKLKHAELDVIKIKARQKIYDRLRYVENEHIVF DPVTGREVPAERTCINKLIEALVDESNDISKRN >gi|296494661|gb|ADTN01000077.1| GENE 23 14810 - 14980 171 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|195940671|ref|ZP_03086053.1| ## NR: gi|195940671|ref|ZP_03086053.1| hypothetical protein EscherichcoliO157_30478 [Escherichia coli O157:H7 str. 
EC4024] # 1 56 1 56 56 73 100.0 3e-12 MHLSLLIPLTGAAMNALCIFIIRLLDKDFIEAYKKTLKNITALHILLMVLSVTGIV >gi|296494661|gb|ADTN01000077.1| GENE 24 14984 - 15427 288 147 aa, chain - ## HITS:1 COG:AGl2125 KEGG:ns NR:ns ## COG: AGl2125 COG1585 # Protein_GI_number: 15891177 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Membrane protein implicated in regulation of membrane protease activity # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 3 140 14 153 154 65 27.0 2e-11 MLWFTLFLLCIALEIITGTGWLLLISLGALSSAIMGFFLPISQEANICFFASISILASII KFFYDKKHKKLDTLLVNTGHSRFKGKEFTLSDDIVNGKGQLLIGDTFWPVETLHNENYPA GTRVIVTDMQGITLKVMTIEDKSENKE >gi|296494661|gb|ADTN01000077.1| GENE 25 15801 - 16778 885 325 aa, chain - ## HITS:1 COG:ECs0552 KEGG:ns NR:ns ## COG: ECs0552 COG0330 # Protein_GI_number: 15829806 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Escherichia coli O157:H7 # 17 307 7 297 305 422 78.0 1e-118 MIDAIITPIVTSIPLLLLILVALIFVKSAVKIVPQGNAWTVERFGKYTHTLSPGLHFLIP FMDRIGQRINMMETVLDIPKQEVISKDNANVTIDAVCFVQVIDAAKAAYEVDNLASAISN LVMTNIRTVVGGMNLDDMLSQRDSINSKLLTVVDYATDPWGIKVTRIEIRDVKPPKELTE AMNAQMKAERTKRARILEAEGIRQSEILKAEGEKQSQILKAEGERQSAFLQSEARERQAE AEARATKLVSDAIAEGDVQSVNYFIAQKYTEALQAIGTASNSKLVMMPLDSSSLVSSVAG ISELLKNVSQDTPKVDMKQAQQYLS >gi|296494661|gb|ADTN01000077.1| GENE 26 16781 - 16957 253 58 aa, chain - ## HITS:1 COG:no KEGG:EC55989_1040 NR:ns ## KEGG: EC55989_1040 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 58 1 58 58 83 86.0 2e-15 MADFGSTKQTVTFEEWHELLMDYAELRGGNAADAEAWRADYEAGKTPVEAYCDEWGED >gi|296494661|gb|ADTN01000077.1| GENE 27 16950 - 17165 103 71 aa, chain - ## HITS:1 COG:no KEGG:ECED1_3501 NR:ns ## KEGG: ECED1_3501 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ED1a # Pathway: not_defined # 3 63 12 
72 76 69 59.0 3e-11 MPKYQINATIKKPGGGPVKWTHYSDFPMTAEECRKNFSKEKEAGKCFSVDVHISDFICKE IPPQNAGNTYG >gi|296494661|gb|ADTN01000077.1| GENE 28 17158 - 17664 256 168 aa, chain - ## HITS:1 COG:no KEGG:plu0270 NR:ns ## KEGG: plu0270 # Name: not_defined # Def: hypothetical protein # Organism: P.luminescens # Pathway: not_defined # 29 159 38 162 166 64 30.0 1e-09 MTLYSGDVSISDKQPGENMARKLKINPVIPECVADLKGYPLYIIVAYWGLRNNKTLTTTD VSRNFLITQQQSYDVLTYIYQEAQKQITATKRTLFNEHKRRISAFRILDIDKGILRTPKK EIAYASYDNIPTVTMKSTRCEIQDSFTELRRWMCSRKTGDRVEVFLNA >gi|296494661|gb|ADTN01000077.1| GENE 29 18742 - 19770 504 342 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0002 NR:ns ## KEGG: SeHA_A0002 # Name: repZ # Def: replication initiation protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 10 342 28 358 358 444 72.0 1e-123 MQLAVISKRHWSELSPEEQVRFWQDYEAGIESSFLVPQENKGGTTKRRRGEHSTKPKCEN PAWFRPDSYKALGGQLGHAYNRLVKKDPVTGQYTLRMHMSLHPFYVLKRQSVGRKYKFRP EKQRLLDAIWVVLVSFCDRGLHTVGMSVSRLAEEISPKDSKGNVIPETAVTVSRLSRLLA EQVCFGTLGTSEKTIWDRESRQRLPKYVWITETGWKMLGVDLVKLQEQQRKRLAESEIRL QLIKEGVIREGEEISVHSARKRWYAQRSLDAIKFRREKAAKRKRANRLAKLPYDEQRNEI ARFILKRMPPDEAYWCTKERLEQLVARDLRQLELALTASPPH Prediction of potential genes in microbial genomes Time: Sun May 15 23:23:42 2011 Seq name: gi|296494660|gb|ADTN01000078.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont215.1, whole genome shotgun sequence Length of sequence - 2057 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 139 - 198 3.4 1 1 Tu 1 . + CDS 225 - 1124 672 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase - Term 1150 - 1193 1.5 2 2 Tu 1 . 
- CDS 1206 - 1979 526 ## COG1349 Transcriptional regulators of sugar metabolism Predicted protein(s) >gi|296494660|gb|ADTN01000078.1| GENE 1 225 - 1124 672 299 aa, chain + ## HITS:1 COG:ECs2892 KEGG:ns NR:ns ## COG: ECs2892 COG1597 # Protein_GI_number: 15832146 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Escherichia coli O157:H7 # 1 299 1 299 299 593 99.0 1e-169 MAEFPASLLILNGKSTDNLPLREAIMLLREEGMTIHVRVTWEKGDAARYVEEARKLGVAT VIAGGGDGTINEVSTALIQCEGDGIPALGILPLGTANDFATSVGIPEALDKALKLAIAGN AIAIDMAQVNKQTCFINMATGGFGTRITTETPEKLKAALGGVSYIIHGLMRMDTLQPDRC EIRGENFHWQGDALVIGIGNGRQAGGGQQLCPNALINDGLLQLRIFTGDEILPALVSTLK SDEDNPNIIEGASSWFDIQAPHEITFNLDGEPLSGQNFHIEILPAALRCRLPPDCPLLR >gi|296494660|gb|ADTN01000078.1| GENE 2 1206 - 1979 526 257 aa, chain - ## HITS:1 COG:gatR KEGG:ns NR:ns ## COG: gatR COG1349 # Protein_GI_number: 16132228 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli K12 # 1 257 3 259 259 454 100.0 1e-128 MNSFERRNKIIQLVNEQGTVLVQDLAGVFAASEATIRADLRFLEQKGVVTRFHGGAAKIM SGNSETETQEVGFKERFQLASAPKNRIAQAAVKMIHEGMTVILDSGSTTMLIAEGLMTAK NITVITNSLPAAFALSENKDITLVVCGGTVRHKTRSMHGSIAERSLQDINADLMFVGADG IDAVNGITTFNEGYSISGAMVTAANKVIAVLDSSKFNRRGFNQVLPIEKIDIIITDDAVS EVDKLALQKTRVKLITV Prediction of potential genes in microbial genomes Time: Sun May 15 23:23:44 2011 Seq name: gi|296494659|gb|ADTN01000079.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont215.2, whole genome shotgun sequence Length of sequence - 2594 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 - CDS 10 - 1050 797 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 2 1 Op 2 . 
- CDS 1098 - 2453 1545 ## COG3775 Phosphotransferase system, galactitol-specific IIC component - Prom 2477 - 2536 4.8 Predicted protein(s) >gi|296494659|gb|ADTN01000079.1| GENE 1 10 - 1050 797 346 aa, chain - ## HITS:1 COG:ECs2894 KEGG:ns NR:ns ## COG: ECs2894 COG1063 # Protein_GI_number: 15832148 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli O157:H7 # 1 346 1 346 346 712 99.0 0 MKSVVNDTDGIVRVAESVIPEIKHQDEVRVKIASSGLCGSDLPRIFKNGAHYYPITLGHE FSGYIDAVGSGVDDLHPGDAVACVPLLPCFTCPECLKGFYSQCAKYDFIGSRRDGGFAEY IVVKRKNVFALPTDMPIEDGAFIEPITVGLHAFHLAQGCENKNVIIIGAGTIGLLAIQCA VALGAKSVTAIDISSEKLALAKSFGAMQTFNSSEMSAPQMQSVLRELRFNQLILETAGVP QTVELAVEIAGPHAQLALVGTLHQDLHLTSATFGKILRKELTVIGSWMNYSSPWPGQEWE TASRLLTERKLSLEPLIAHRGSFESFTQAVRDIARNAMPGKVLLIP >gi|296494659|gb|ADTN01000079.1| GENE 2 1098 - 2453 1545 451 aa, chain - ## HITS:1 COG:ECs2895 KEGG:ns NR:ns ## COG: ECs2895 COG3775 # Protein_GI_number: 15832149 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIC component # Organism: Escherichia coli O157:H7 # 1 451 1 451 451 806 100.0 0 MFSEVMRYILDLGPTVMLPIVIIIFSKILGMKAGDCFKAGLHIGIGFVGIGLVIGLMLDS IGPAAKAMAENFDLNLHVVDVGWPGSSPMTWASQIALVAIPIAILVNVAMLLTRMTRVVN VDIWNIWHMTFTGALLHLATGSWMIGMAGVVIHAAFVYKLGDWFARDTRNFFELEGIAIP HGTSAYMGPIAVLVDAIIEKIPGVNRIKFSADDIQRKFGPFGEPVTVGFVMGLIIGILAG YDVKGVLQLAVKTAAVMLLMPRVIKPIMDGLTPIAKQARSRLQAKFGGQEFLIGLDPALL LGHTAVVSASLIFIPLTILIAVCVPGNQVLPFGDLATIGFFVAMAVAVHRGNLFRTLISG VIIMSITLWIATQTIGLHTQLAANAGALKAGGMVASMDQGGSPITWLLIQVFSPQNIPGF IIIGAIYLTGIFMTWRRARGFIKQEKVVLAE Prediction of potential genes in microbial genomes Time: Sun May 15 23:23:47 2011 Seq name: gi|296494658|gb|ADTN01000080.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont216.1, whole genome shotgun sequence Length of sequence - 11280 bp Number of predicted genes - 11, with homology - 11 Number of 
transcription units - 9, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 482 4 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 2 2 Tu 1 . - CDS 799 - 1032 225 ## B21_01514 hypothetical protein - Prom 1093 - 1152 3.8 + Prom 1696 - 1755 2.1 3 3 Op 1 3/1.000 + CDS 1818 - 3101 1414 ## COG0477 Permeases of the major facilitator superfamily + Term 3110 - 3145 4.2 + Prom 3111 - 3170 3.1 4 3 Op 2 . + CDS 3190 - 4650 1254 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases - Term 4643 - 4678 5.2 5 4 Tu 1 . - CDS 4685 - 4888 366 ## ECIAI1_1560 putative selenium carrying protein - Prom 4930 - 4989 1.5 - Term 5021 - 5056 5.8 6 5 Op 1 5/0.000 - CDS 5065 - 5751 715 ## COG1802 Transcriptional regulators - Prom 5777 - 5836 5.5 - Term 5799 - 5835 3.0 7 5 Op 2 . - CDS 5840 - 6586 871 ## COG4221 Short-chain alcohol dehydrogenase of unknown specificity 8 6 Tu 1 . + CDS 6732 - 8768 1925 ## COG0339 Zn-dependent oligopeptidases - Term 8760 - 8811 0.6 9 7 Tu 1 . - CDS 8812 - 9330 232 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase - Prom 9541 - 9600 10.1 + Prom 9483 - 9542 9.6 10 8 Tu 1 . + CDS 9606 - 9998 286 ## COG3111 Uncharacterized conserved protein + Term 10010 - 10051 6.2 + Prom 10166 - 10225 6.9 11 9 Tu 1 . 
+ CDS 10253 - 11143 475 ## COG2199 FOG: GGDEF domain Predicted protein(s) >gi|296494658|gb|ADTN01000080.1| GENE 1 3 - 482 4 159 aa, chain - ## HITS:1 COG:pinR KEGG:ns NR:ns ## COG: pinR COG1961 # Protein_GI_number: 16129335 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Escherichia coli K12 # 1 154 1 154 196 285 98.0 2e-77 MSRIFAYCRISTLDQTTENQRREIESAGFKIKPQQIIEEHISGSAATSERPGFNRLLARL KCGDQLIVTKLDRLGCNAMDIRKTVEQLTETGIRVHCLALGGIDLTSPTGKMMMHVISAV AEFERDLLLERTHSGIVRARGAGKRFGRPPVLRKVRTSS >gi|296494658|gb|ADTN01000080.1| GENE 2 799 - 1032 225 77 aa, chain - ## HITS:1 COG:no KEGG:B21_01514 NR:ns ## KEGG: B21_01514 # Name: ybl69 # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 77 12 88 88 136 98.0 2e-31 MKSKDTLKWFPAQLPEVRIILGDAVVEVAKQGGPINTRTLLDYIEGNIKKKSWLDNKELL QTAISVLKDNQNLNGKM >gi|296494658|gb|ADTN01000080.1| GENE 3 1818 - 3101 1414 427 aa, chain + ## HITS:1 COG:ydfJ KEGG:ns NR:ns ## COG: ydfJ COG0477 # Protein_GI_number: 16129502 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 427 1 427 427 787 99.0 0 MDFQLYSLGAALVFHEIFFPESSTAMALILAMGTYGAGYVARIVGAFIFGKMGDRIGRKK VLFITITMMGICTTLIGVLPTYAQIGVFAPILLVTLRIIQGVGAGAEISGAGTMLAEYAP KGKRGIISSFVAMGTNCGTLSATAIWAFMFFILSKEELLAWGWRIPFLASVVVMVFAIWL RMNLKESPVFEKVNDSNQPTAKPAPAGSMFQSKSFWLATGLRFGQAGNSGLIQTFLAGYL VQTLLFNKAIPTDALMISSILGFMTIPFLGWLSDKIGRRIPYIIMNTSAIVLAWPMLSII VDKSYAPSTIMVALIVIHNCAVLGLFALENITMAEMFGCKNRFTRMAISKEIGGLIASGF GPILAGIFCTMTESWYPIAIMIMAYSVIGLISALKMPEVKDRDLSALEDAAEDQPRVVRA AQPSRSL >gi|296494658|gb|ADTN01000080.1| GENE 4 3190 - 4650 1254 486 aa, chain + ## HITS:1 COG:ydfI KEGG:ns NR:ns ## COG: ydfI COG0246 # Protein_GI_number: 16129501 # Func_class: G Carbohydrate transport and metabolism # 
Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Escherichia coli K12 # 1 486 1 486 486 987 99.0 0 MGNNLLSAKATLPVYDRNNLAPRIVHLGFGAFHRAHQGVYADILATEHFSDWGYYEVNLI GGEQQIADLQQQDNLYTVAEMSADVWTARVVGVVKKALHVQIDGLETVLAAMCEPQIAIV SLTITEKGYFHSPATGQLMLDHPMVAADVQNPHQPKTATGVIVEALARRKAAGLPAFTVM SCDNMPENGHVMRDVVTSYAQAVDVKLAQWIEDNVTFPSTMVDRIVPAVTEDTLAKIEQL TGVRDPAGVACEPFRQWVIEDNFVAGRPEWEKAGAELVSDVLPYEEMKLRMLNGSHSFLA YLGYLAGYQHINDCMEDEHYRYAAYGLMLQEQAPTLKVQGVDLQDYANRLIARYSNPALR HRTWQIAMDGSQKLPQRMLDSVRWHLAHDSKFDLLALGVAGWMRYVGGVDEQGNPIEISD PLLPVIQKAVQSSAEGKARVQSLLAIKAIFGDDLPDNSLFTARVTETYLSLLAHGAKATV AKYSVK >gi|296494658|gb|ADTN01000080.1| GENE 5 4685 - 4888 366 67 aa, chain - ## HITS:1 COG:no KEGG:ECIAI1_1560 NR:ns ## KEGG: ECIAI1_1560 # Name: ydfZ # Def: putative selenium carrying protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 67 1 67 67 114 100.0 1e-24 MTTYDRNRNAITTGSRVMVSGTGHTGKILSIDTEGLTAEQIRRGKTVVVEGCEEKLAPLD LIRLGMN >gi|296494658|gb|ADTN01000080.1| GENE 6 5065 - 5751 715 228 aa, chain - ## HITS:1 COG:ECs2149 KEGG:ns NR:ns ## COG: ECs2149 COG1802 # Protein_GI_number: 15831403 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 228 1 228 228 414 100.0 1e-116 MTVETQLNPTQPVNQQIYRILRRDIVHCLIAPGTPLSEKEVSVRFNVSRQPVREAFIKLA ENGLIQIRPQRGSYVNKISMAQVRNGSFIRQAIECAVARRAASMITESQCYQLEQNLHQQ RIAIERKQLDDFFELDDNFHQLLTQIADCQLAWDTIENLKATVDRVRYMSFDHVSPPEML LRQHLDIFSALQKRDGDAVERAMTQHLQEISESVRQIRQENSDWFSEE >gi|296494658|gb|ADTN01000080.1| GENE 7 5840 - 6586 871 248 aa, chain - ## HITS:1 COG:ydfG KEGG:ns NR:ns ## COG: ydfG COG4221 # Protein_GI_number: 16129498 # Func_class: R General function prediction only # Function: Short-chain alcohol dehydrogenase of unknown specificity # Organism: Escherichia coli K12 # 1 248 1 248 248 504 100.0 1e-143 MIVLVTGATAGFGECITRRFIQQGHKVIATGRRQERLQELKDELGDNLYIAQLDVRNRAA IEEMLASLPAEWCNIDILVNNAGLALGMEPAHKASVEDWETMIDTNNKGLVYMTRAVLPG 
MVERNHGHIINIGSTAGSWPYAGGNVYGATKAFVRQFSLNLRTDLHGTAVRVTDIEPGLV GGTEFSNVRFKGDDGKAEKTYQNTVALTPEDVSEAVWWVSTLPAHVNINTLEMMPVTQSY AGLNVHRQ >gi|296494658|gb|ADTN01000080.1| GENE 8 6732 - 8768 1925 678 aa, chain + ## HITS:1 COG:dcp KEGG:ns NR:ns ## COG: dcp COG0339 # Protein_GI_number: 16129497 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent oligopeptidases # Organism: Escherichia coli K12 # 1 678 4 681 681 1328 100.0 0 MNPFLVQSTLPYLAPHFDQIANHHYRPAFDEGMQQKRAEIAAIALNPQMPDFNNTILALE QSGELLTRVTSVFFAMTAAHTNDELQRLDEQFSAELAELANDIYLNGELFARVDAVWQRR ESLGLDSESIRLVEVIHQRFVLAGAKLAQADKAKLKVLNTEAATLTSQFNQRLLAANKSG GLVVNDIAQLAGMSEQEIALAAEAAREKGLDNKWLIPLLNTTQQPALAEMRDRATREKLF IAGWTRAEKNDANDTRAIIQRLVEIRAQQATLLGFPHYAAWKIADQMAKTPEAALNFMRE IVPAARQRASDELASIQAVIDKQQGGFSAQPWDWAFYAEQVRREKFDLDEAQLKPYFELN TVLNEGVFWTANQLFGIKFVERFDIPVYHPDVRVWEIFDHNGVGLALFYGDFFARDSKSG GAWMGNFVEQSTLNKTHPVIYNVCNYQKPAAGEPALLLWDDVITLFHEFGHTLHGLFARQ RYATLSGTNTPRDFVEFPSQINEHWATHPQVFARYARHYQSGAAMPDELQQKMRNASLFN KGYEMSELLSAALLDMRWHCLEENEAMQDVDDFELRALVAENMDLPAIPPRYRSSYFAHI FGGGYAAGYYAYLWTQMLADDGYQWFVEQGGLTRENGLRFREAILSRGNSEDLERLYRQW RGKAPKIMPMLQHRGLNI >gi|296494658|gb|ADTN01000080.1| GENE 9 8812 - 9330 232 172 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 18 172 747 902 904 94 36 4e-19 MNINRDKIVQLVDTDTIENLTSALSQRLIADQLRLTTAESCTGGKLASALCAAEDTPKFY GAGFVTFTDQAKMKILSVSQQSLERYSAVSEKVAAEMATGAIERADADVSIAITGYGGPE GGEDGTPAGTVWFAWHIKGQNYTAVMHFAGDCETVLALAVRFALAQLLQLLL >gi|296494658|gb|ADTN01000080.1| GENE 10 9606 - 9998 286 130 aa, chain + ## HITS:1 COG:ydeI KEGG:ns NR:ns ## COG: ydeI COG3111 # Protein_GI_number: 16129495 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 130 1 130 130 246 100.0 9e-66 MKFQAIVLASFLVMPYALADDQGGLKQDAAPPPPHAIEDGYRGTDDAKKMTVDFAKTMHD GASVSLRGNLISHKGEDRYVFRDKSGEINVVIPAAVFDGREVQPDQMINISGSLDKKSAP AVVRVTHLQK 
>gi|296494658|gb|ADTN01000080.1| GENE 11 10253 - 11143 475 296 aa, chain + ## HITS:1 COG:ydeH_2 KEGG:ns NR:ns ## COG: ydeH_2 COG2199 # Protein_GI_number: 16129494 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 121 296 1 176 176 362 100.0 1e-100 MIKKTTEIDAILLNLNKAIDAHYQWLVSMFHSVVARDASKPEITDNHSYGLCQFGRWIDH LGPLDNDELPYVRLMDSAHQHMHNCGRELMLAIVENHWQDAHFDAFQEGLLSFTAALTDY KIYLLTIRSNMDVLTGLPGRRVLDESFDHQLRNAEPLNLYLMLLDIDRFKLVNDTYGHLI GDVVLRTLATYLASWTRDYETVYRYGGEEFIIIVKAANDEEACRAGVRICQLVDNHAITH SEGHINITVTAGVSRAFPEEPLDVVIGRADRAMYEGKQTGRNRCMFIDEQNVINRV Prediction of potential genes in microbial genomes Time: Sun May 15 23:23:55 2011 Seq name: gi|296494657|gb|ADTN01000081.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont216.2, whole genome shotgun sequence Length of sequence - 10679 bp Number of predicted genes - 13, with homology - 12 Number of transcription units - 9, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 34 - 129 133 ## - Prom 189 - 248 3.0 2 2 Tu 1 . - CDS 256 - 1443 837 ## COG0477 Permeases of the major facilitator superfamily - Prom 1509 - 1568 3.7 + Prom 1422 - 1481 4.7 3 3 Tu 1 . + CDS 1638 - 2537 882 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Term 2528 - 2558 3.0 4 4 Op 1 . - CDS 2568 - 2786 106 ## B21_01502 hypothetical protein 5 4 Op 2 3/0.750 - CDS 2818 - 3201 292 ## COG2207 AraC-type DNA-binding domain-containing proteins 6 4 Op 3 . - CDS 3221 - 3655 442 ## COG1846 Transcriptional regulators - Prom 3683 - 3742 7.1 + Prom 3775 - 3834 5.2 7 5 Tu 1 . + CDS 3867 - 4532 670 ## COG2095 Multiple antibiotic transporter + Term 4537 - 4562 -0.5 - Term 4521 - 4554 3.8 8 6 Tu 1 . - CDS 4557 - 5747 860 ## COG2814 Arabinose efflux permease - Prom 5780 - 5839 6.6 9 7 Tu 1 . - CDS 5897 - 6604 228 ## JW1520 hypothetical protein - Prom 6760 - 6819 4.8 10 8 Tu 1 . 
- CDS 7090 - 7971 795 ## COG0583 Transcriptional regulator - Prom 8004 - 8063 3.8 + Prom 7989 - 8048 2.4 11 9 Op 1 3/0.750 + CDS 8072 - 9460 1454 ## COG1012 NAD-dependent aldehyde dehydrogenases 12 9 Op 2 . + CDS 9536 - 10450 653 ## COG2066 Glutaminase 13 9 Op 3 . + CDS 10450 - 10678 84 ## SF1570 hypothetical protein Predicted protein(s) >gi|296494657|gb|ADTN01000081.1| GENE 1 34 - 129 133 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLGNMNVFMAVLGIILFSGFLAAYFSHKWDD >gi|296494657|gb|ADTN01000081.1| GENE 2 256 - 1443 837 395 aa, chain - ## HITS:1 COG:ydeF KEGG:ns NR:ns ## COG: ydeF COG0477 # Protein_GI_number: 16129493 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 395 1 395 395 662 99.0 0 MNLSLRRSTSALLASSLLLTIGRGATLPFMTIYLSRQYSLSVDLIGYAMTIALTIGVVFS LGFGILADKFDKKRYMLLAITAFASGFIAIPLVNNVTLVVLFFALINCAYSVFATVLKAW FADNLSSTSKTKIFSINYTMLNIGWTIGPPLGTLLVMQNINLPFWLAAICSAFPMLFIQI WVKRSEKIIATETGSVWSPKVLLQDKALLWFTCSGFLASFVSGAFASCISQYVMVIADGD FAEKVVAVVLPVNAAMVVTLQYSVGRRLNPANIRALMTAGTLCFVIGLVGFIFSGNSLLL WGMSAAVFTVGEIIYAPGEYMLIDHIAPPEMKASYFSAQSLGWLGAAINPLVSGVVLTSL PPSSLFVILALVIIAAWVLMLKGIRARPWGQPALC >gi|296494657|gb|ADTN01000081.1| GENE 3 1638 - 2537 882 299 aa, chain + ## HITS:1 COG:ECs2140 KEGG:ns NR:ns ## COG: ECs2140 COG0697 # Protein_GI_number: 15831394 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli O157:H7 # 1 297 1 297 299 459 96.0 1e-129 MSRKDGVLALLVVVVWGLNFVVIKVGLHNMPPLMLAGLRFMLVAFPAIFFVARPKVPLNL LLGYGLTISFAQFAFLFCAINFGMPAGLASLVLQAQAFFTIMLGAFTFGERLHGKQLAGI ALAIFGVLVLIEDSLNGQHVAMLGFMLTLAAAFSWACGNIFNKKIMSHSTRPAVMSLVIW 
SALIPIIPFFVASLILDGSATMIHSLVTIDMTTILSLMYLAFVATIVGYGIWGTLLGRYE TWRVAPLSLLVPVVGLASAALLLDERLTGLQFLGAVLIMTGLYINVFGLRWRKAVKVGS >gi|296494657|gb|ADTN01000081.1| GENE 4 2568 - 2786 106 72 aa, chain - ## HITS:1 COG:no KEGG:B21_01502 NR:ns ## KEGG: B21_01502 # Name: marB # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 72 1 72 72 123 98.0 2e-27 MKPLSSAIAAALILFSARGVAEQTTQPVVTSCANVVVVPPSQEHPPFDLNHMGTGSDKSD ALGVPYYNQHAM >gi|296494657|gb|ADTN01000081.1| GENE 5 2818 - 3201 292 127 aa, chain - ## HITS:1 COG:ECs2138 KEGG:ns NR:ns ## COG: ECs2138 COG2207 # Protein_GI_number: 15831392 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 127 3 129 129 233 100.0 9e-62 MSRRNTDAITIHSILDWIEDNLESPLSLEKVSERSGYSKWHLQRMFKKETGHSLGQYIRS RKMTEIAQKLKESNEPILYLAERYGFESQQTLTRTFKNYFDVPPHKYRMTNMQGESRFLH PLNHYNS >gi|296494657|gb|ADTN01000081.1| GENE 6 3221 - 3655 442 144 aa, chain - ## HITS:1 COG:STM1520 KEGG:ns NR:ns ## COG: STM1520 COG1846 # Protein_GI_number: 16764865 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Salmonella typhimurium LT2 # 1 144 1 144 144 259 92.0 8e-70 MKSTSDLFNEIIPLGRLIHMVNQKKDRLLNEYLSPLDITAAQFKVLCSIRCAACITPVEL KKVLSVDLGALTRMLDRLVCKGWVERLPNPNDKRGVLVKLTTGGAAICEQCHQLVGQDLH QELTKNLTADEVATLEYLLKKVLP >gi|296494657|gb|ADTN01000081.1| GENE 7 3867 - 4532 670 221 aa, chain + ## HITS:1 COG:marC KEGG:ns NR:ns ## COG: marC COG2095 # Protein_GI_number: 16129488 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Escherichia coli K12 # 1 221 1 221 221 366 100.0 1e-101 MLDLFKAIGLGLVVLLPLANPLTTVALFLGLAGNMNSAERNRQSLMASVYVFAIMMVAYY AGQLVMDTFGISIPGLRIAGGLIVAFIGFRMLFPQQKAIDSPEAKSKSEELEDEPSANIA FVPLAMPSTAGPGTIAMIISSASTVRQSSTFADWVLMVAPPLIFFLVAVILWGSLRSSGA IMRLVGKGGIEAISRLMGFLLVCMGVQFIINGILEIIKTYH >gi|296494657|gb|ADTN01000081.1| GENE 8 4557 - 5747 860 396 aa, chain - ## HITS:1 
COG:ydeA KEGG:ns NR:ns ## COG: ydeA COG2814 # Protein_GI_number: 16129487 # Func_class: G Carbohydrate transport and metabolism # Function: Arabinose efflux permease # Organism: Escherichia coli K12 # 1 396 1 396 396 572 100.0 1e-163 MTTNTVSRKVAWLRVVTLAVAAFIFNTTEFVPVGLLSDIAQSFHMQTAQVGIMLTIYAWV VALMSLPFMLMTSQVERRKLLICLFVVFIASHVLSFLSWSFTVLVISRIGVAFAHAIFWS ITASLAIRMAPAGKRAQALSLIATGTALAMVLGLPLGRIVGQYFGWRMTFFAIGIGALIT LLCLIKLLPLLPSEHSGSLKSLPLLFRRPALMSIYLLTVVVVTAHYTAYSYIEPFVQNIA GFSANFATALLLLLGGAGIIGSVIFGKLGNQYASALVSTAIALLLVCLALLLPAANSEIH LGVLSIFWGIAMMIIGLGMQVKVLALAPDATDVAMALFSGIFNIGIGAGALVGNQVSLHW SMSMIGYVGAVPAFAALIWSIIIFRRWPVTLEEQTQ >gi|296494657|gb|ADTN01000081.1| GENE 9 5897 - 6604 228 235 aa, chain - ## HITS:1 COG:no KEGG:JW1520 NR:ns ## KEGG: JW1520 # Name: yneK # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 235 137 371 371 455 100.0 1e-127 MGRVFQTGVERVLFLFLNDFIEQFPMINPGVPIKRAHTPHIEPLPSDHHTAADYLRQFDL LVLNFISRGNFVILPRLWNNSEVHRWFVNKDPNLITAILDITDSELKEDLLQSLMDSLGS NKHVLPEVCICFLSLLAEQESPHFQNLFLFFANMLLHYHQFMNPNESDLNDVLMPASLSD DKIIKHMARRTLKLFVKNETPPKVTHEDLVKNRPRSPVRPPIPATAKTPDLPERH >gi|296494657|gb|ADTN01000081.1| GENE 10 7090 - 7971 795 293 aa, chain - ## HITS:1 COG:yneJ KEGG:ns NR:ns ## COG: yneJ COG0583 # Protein_GI_number: 16129485 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 293 1 293 293 592 100.0 1e-169 MDLTQLEMFNAVAEAGSITQAAAKVHRVPSNLTTRLRQLETELGVDLFIRENQRLRLSPA GHNFLRYSQQILALVDEARSVVAGDEPQGLFSLGSLESTAAVRIPATLAEFNRRYPKIQF SLSTGPSGTMLDGVLEGKLNAAFIDGPINHTAIDGIPVYREELMIVTPQGYAPVTRASQV NGSNIYAFRANCSYRRHFESWFHADGAAPGTIHEMESYHGMLACVIAGAGIALIPRSMLE SMPGHHQVEAWPLAEQWRWLTTWLVWRRGAKTRPLEAFIQLLDVPDSARQGYQ >gi|296494657|gb|ADTN01000081.1| GENE 11 8072 - 9460 1454 462 aa, chain + ## HITS:1 COG:yneI KEGG:ns NR:ns ## COG: yneI COG1012 # Protein_GI_number: 16129484 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Escherichia coli 
K12 # 1 462 9 470 470 893 100.0 0 MTITPATHAISINPATGEQLSVLPWAGADDIENALQLAAAGFRDWRETNIDYRAEKLRDI GKALRARSEEMAQMITREMGKPINQARAEVAKSANLCDWYAEHGPAMLKAEPTLVENQQA VIEYRPLGTILAIMPWNFPLWQVMRGAVPIILAGNGYLLKHAPNVMGCAQLIAQVFKDAG IPQGVYGWLNADNDGVSQMIKDSRIAAVTVTGSVRAGAAIGAQAGAALKKCVLELGGSDP FIVLNDADLELAVKAAVAGRYQNTGQVCAAAKRFIIEEGIASAFTERFVAAAAALKMGDP RDEENALGPMARFDLRDELHHQVEKTLAQGARLLLGGEKMAGAGNYYPPTVLANVTPEMT AFREEMFGPVAAITIAKDAEHALELANDSEFGLSATIFTTDETQARQMAARLECGGVFIN GYCASDARVAFGGVKKSGFGRELSHFGLHEFCNIQTVWKDRI >gi|296494657|gb|ADTN01000081.1| GENE 12 9536 - 10450 653 304 aa, chain + ## HITS:1 COG:ECs2131 KEGG:ns NR:ns ## COG: ECs2131 COG2066 # Protein_GI_number: 15831385 # Func_class: E Amino acid transport and metabolism # Function: Glutaminase # Organism: Escherichia coli O157:H7 # 1 304 5 308 308 612 100.0 1e-175 MDNAILENILRQVRPLIGQGKVADYIPALATVDGSRLGIAICTVDGQLFQAGDAQERFSI QSISKVLSLVVAMRHYSEEEIWQRVGKDPSGSPFNSLVQLEMEQGIPRNPFINAGALVVC DMLQGRLSAPRQRMLEVVRGLSGVSDISYDTVVARSEFEHSARNAAIAWLMKSFGNFHHD VTTVLQNYFHYCALKMSCVELARTFVFLANQGKAIHIDEPVVTPMQARQINALMATSGMY QNAGEFAWRVGLPAKSGVGGGIVAIVPHEMAIAVWSPELDDAGNSLAGIAVLEQLTKQLG RSVY >gi|296494657|gb|ADTN01000081.1| GENE 13 10450 - 10678 84 76 aa, chain + ## HITS:1 COG:no KEGG:SF1570 NR:ns ## KEGG: SF1570 # Name: not_defined # Def: hypothetical protein # Organism: S.flexneri # Pathway: not_defined # 1 76 1 76 114 150 100.0 1e-35 MQSLDPLFARLSRSKFRSRFRLGMKERQYCLEKGAPVIEQHAADFVAKRLAPALPANDGK QTPMRGHPVFIAQHAT Prediction of potential genes in microbial genomes Time: Sun May 15 23:24:12 2011 Seq name: gi|296494656|gb|ADTN01000082.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont216.3, whole genome shotgun sequence Length of sequence - 17261 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 7, operones - 2 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 139 62 ## EcolC_2135 hypothetical protein + Prom 162 - 221 2.0 2 2 Tu 1 . 
+ CDS 308 - 1198 738 ## COG2199 FOG: GGDEF domain + Prom 1200 - 1259 3.5 3 3 Tu 1 . + CDS 1425 - 2876 1803 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases + Term 2893 - 2926 3.8 + Prom 2907 - 2966 4.1 4 4 Tu 1 . + CDS 3083 - 3997 632 ## COG3781 Predicted membrane protein 5 5 Op 1 3/0.500 - CDS 4001 - 4759 600 ## COG4106 Trans-aconitate methyltransferase 6 5 Op 2 5/0.500 - CDS 4816 - 5106 389 ## COG1359 Uncharacterized conserved protein 7 5 Op 3 6/0.500 - CDS 5130 - 6005 907 ## COG1830 DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes 8 5 Op 4 16/0.000 - CDS 6032 - 7054 1093 ## COG1879 ABC-type sugar transport system, periplasmic component 9 5 Op 5 11/0.000 - CDS 7066 - 8058 1085 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 10 5 Op 6 21/0.000 - CDS 8058 - 9086 1006 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 11 5 Op 7 . - CDS 9080 - 10615 178 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 10853 - 10912 4.9 + Prom 10681 - 10740 6.2 12 6 Op 1 5/0.500 + CDS 10864 - 11817 872 ## COG2390 Transcriptional regulator, contains sigma factor-related N-terminal domain 13 6 Op 2 1/1.000 + CDS 11896 - 12726 678 ## COG1070 Sugar (pentulose and hexulose) kinases 14 6 Op 3 . + CDS 12769 - 13488 481 ## COG1070 Sugar (pentulose and hexulose) kinases + Term 13494 - 13539 -0.9 + Prom 13788 - 13847 8.2 15 7 Tu 1 . 
+ CDS 14019 - 17259 2095 ## COG4625 Uncharacterized protein with a C-terminal OMP (outer membrane protein) domain Predicted protein(s) >gi|296494656|gb|ADTN01000082.1| GENE 1 2 - 139 62 45 aa, chain + ## HITS:1 COG:no KEGG:EcolC_2135 NR:ns ## KEGG: EcolC_2135 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 45 75 119 119 94 97.0 1e-18 ATAPCCRGCLAKWHNIPQGVSLSEEQQRYIVAVIYHWLVVQMNQP >gi|296494656|gb|ADTN01000082.1| GENE 2 308 - 1198 738 296 aa, chain + ## HITS:1 COG:yneF_2 KEGG:ns NR:ns ## COG: yneF_2 COG2199 # Protein_GI_number: 16129481 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 112 296 1 185 185 381 100.0 1e-105 MLTLAIPGVLPRFKAEQMMPAIALIVSVIASVVIGGAGSLAFPLPALIWCAVRYTPQVTC LLTFVTGAVEIVLVANSVIDISVGSPFSIPQMFSARLGIATMAICPIMVSFSVAAINSLM KQVALRADFDFLTQVYSRSGLYEALKSPSLKQTQHLTVMLLDIDYFKSINDNYGHECGDK VLSVFARHIQKIVGDKGLVARMGGEEFAVAVPSVNPVDGLLMAEKIRKGVELQPFTWQQK TLYLTVSIGVGSGRASYLTLTDDFNKLMVEADTCLYRSKKDGRNRTSTMRYGEEVV >gi|296494656|gb|ADTN01000082.1| GENE 3 1425 - 2876 1803 483 aa, chain + ## HITS:1 COG:ECs2128 KEGG:ns NR:ns ## COG: ECs2128 COG0246 # Protein_GI_number: 15831382 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Escherichia coli O157:H7 # 1 483 1 483 483 996 100.0 0 MKTLNRRDFPGAQYPERIIQFGEGNFLRAFVDWQIDLLNEHTDLNSGVVVVRPIETSFPP SLSTQDGLYTTIIRGLNEKGEAVSDARLIRSVNREISVYSEYDEFLKLAHNPEMRFVFSN TTEAGISYHAGDKFDDAPAVSYPAKLTRLLFERFSHFNGALDKGWIIIPCELIDYNGDAL RELVLRYAQEWALPEAFIQWLDQANSFCSTLVDRIVTGYPRDEVAKLEEELGYHDGFLDT AEHFYLFVIQGPKSLATELRLDKYPLNVLIVDDIKPYKERKVAILNGAHTALVPVAFQAG LDTVGEAMNDAEICAFVEKAIYEEIIPVLDLPRDELESFASAVTGRFRNPYIKHQLLSIA LNGMTKFRTRILPQLLAGQKANGTLPARLTFALAALIAFYRGERNGETYPVQDDAHWLER YQQLWSQHRDRVIGTQELVAIVLAEKDHWEQDLTQVPGLVEQVANDLDAILEKGMREAVR PLC >gi|296494656|gb|ADTN01000082.1| GENE 4 3083 - 3997 632 304 aa, chain + ## HITS:1 COG:yneE KEGG:ns NR:ns ## COG: yneE 
COG3781 # Protein_GI_number: 16129479 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 304 18 321 321 597 100.0 1e-170 MIVRPQQHWLRRIFVWHGSVLSKISSRLLLNFLFSIAVIFMLPWYTHLGIKFTLAPFSIL GVAIAIFLGFRNNAGYARYVEARKLWGQLMIASRSLLREVKTTLPDSASVREFARLQIAF AHCLRMTLRKQPQAEVLAHYLKTEDLQRVLASNSPANRILLIMGEWLAVQRRNGQLSDIL FISLNDRLNDISAVLAGCERIAYTPIPFAYTLILHRTVYLFCIMLPFALVVDLHYMTPFI SVLISYTFISLDCLAEELEDPFGTENNDLPLDAICNAIEIDLLQMNDEAEIPAKILPDRH YQLT >gi|296494656|gb|ADTN01000082.1| GENE 5 4001 - 4759 600 252 aa, chain - ## HITS:1 COG:tam KEGG:ns NR:ns ## COG: tam COG4106 # Protein_GI_number: 16129478 # Func_class: R General function prediction only # Function: Trans-aconitate methyltransferase # Organism: Escherichia coli K12 # 1 252 1 252 252 494 100.0 1e-140 MSDWNPSLYLHFSAERSRPAVELLARVPLENVEYVADLGCGPGNSTALLQQRWPAARITG IDSSPAMIAEARSALPDCQFVEADIRNWQPVQALDLIFANASLQWLPDHYELFPHLVSLL NPQGVLAVQMPDNWLEPTHVLMREVAWEQNYPDRGREPLAGVHAYYDILSEAGCEVDIWR TTYYHQMPSHQAIIDWVTATGLRPWLQDLTESEQQLFLKRYHQMLEEQYPLQENGQILLA FPRLFIVARRME >gi|296494656|gb|ADTN01000082.1| GENE 6 4816 - 5106 389 96 aa, chain - ## HITS:1 COG:ECs2125 KEGG:ns NR:ns ## COG: ECs2125 COG1359 # Protein_GI_number: 15831379 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 96 1 96 96 188 100.0 2e-48 MHVTLVEINVHEDKVDEFIEVFRQNHLGSVQEEGNLRFDVLQDPEVNSRFYIYEAYKDED AVAFHKTTPHYKTCVAKLESLMTGPRKKRLFNGLMP >gi|296494656|gb|ADTN01000082.1| GENE 7 5130 - 6005 907 291 aa, chain - ## HITS:1 COG:yneB KEGG:ns NR:ns ## COG: yneB COG1830 # Protein_GI_number: 16129476 # Func_class: G Carbohydrate transport and metabolism # Function: DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes # Organism: Escherichia coli K12 # 1 291 1 291 291 597 100.0 1e-171 MADLDDIKDGKDFRTDQPQKNIPFTLKGCGALDWGMQSRLSRIFNPKTGKTVMLAFDHGY FQGPTTGLERIDINIAPLFEHADVLMCTRGILRSVVPPATNRPVVLRASGANSILAELSN 
EAVALSMDDAVRLNSCAVAAQVYIGSEYEHQSIKNIIQLVDAGMKVGMPTMAVTGVGKDM VRDQRYFSLATRIAAEMGAQIIKTYYVEKGFERIVAGCPVPIVIAGGKKLPEREALEMCW QAIDQGASGVDMGRNIFQSDHPVAMMKAVQAVVHHNETADRAYELYLSEKQ >gi|296494656|gb|ADTN01000082.1| GENE 8 6032 - 7054 1093 340 aa, chain - ## HITS:1 COG:yneA KEGG:ns NR:ns ## COG: yneA COG1879 # Protein_GI_number: 16129475 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli K12 # 1 340 1 340 340 647 100.0 0 MTLHRFKKIALLSALGIAAISMNVQAAERIAFIPKLVGVGFFTSGGNGAQQAGKELGVDV TYDGPTEPSVSGQVQLINNFVNQGYNAIIVSAVSPDGLCPALKRAMQRGVRVLTWDSDTK PECRSYYINQGTPAQLGGMLVDMAARQVNKDKAKVAFFYSSPTVTDQNQWVKEAKAKIAK EHPGWEIVTTQFGYNDATKSLQTAEGILKAYSDLDAIIAPDANALPAAAQAAENLKNDKV AIVGFSTPNVMRPYVERGTVKEFGLWDVVQQGKISVYVADALLKKGSMKTGDKLDIKGVG QVEVSPNSVQGYDYEADGNGIVLLPERVIFNKENIGKYDF >gi|296494656|gb|ADTN01000082.1| GENE 9 7066 - 8058 1085 330 aa, chain - ## HITS:1 COG:ydeZ KEGG:ns NR:ns ## COG: ydeZ COG1172 # Protein_GI_number: 16129474 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Escherichia coli K12 # 1 330 1 330 330 511 100.0 1e-145 MRIRYGWELALAALLVIEIVAFGAINPRMLDLNMLLFSTSDFICIGIVALPLTMVIVSGG IDISFGSTIGLCAIALGVLFQSGVPMPLAILLTLLLGALCGLINAGLIIYTKVNPLVITL GTLYLFAGSALLLSGMAGATGYEGIGGFPMAFTDFANLDVLGLPVPLIIFLICLLVFWLW LHKTHAGRNVFLIGQSPRVALYSAIPVNRTLCALYAMTGLASAVAAVLLVSYFGSARSDL GASFLMPAITAVVLGGANIYGGSGSIIGTAIAVLLVGYLQQGLQMAGVPNQVSSALSGAL LIVVVVGRSVSLHRQQIKEWLARRANNPLP >gi|296494656|gb|ADTN01000082.1| GENE 10 8058 - 9086 1006 342 aa, chain - ## HITS:1 COG:ydeY KEGG:ns NR:ns ## COG: ydeY COG1172 # Protein_GI_number: 16129473 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Escherichia coli K12 # 1 342 1 342 342 539 99.0 1e-153 
MLKFIQNNREITALLAVVLLFVLPGFLDRQYLSVQTLTMVYSSAQILILLAMGATLVMLT RNIDVSVGSITGMCAVLLGMLLNAGYSLPVACVATLLLGLLAGFFNGVLVAWLKIPAIVA TLGTLGLYRGIMLLWTGGKWIEGLPAELKQLSAPLLLGVSAIGWLTIILVAFMAWLPAKT AFGRSFYATGDNLQGARQLGVRTEAIRIVAFSLNGCMAALAGIVFASQIGFIPNQTGTGL EMKAIAACVLGGISLLGGSGAIIGAVLGAWFLTQIDSVLVLLRIPAWWNDFIAGLVLLAV LVFDGRLRCALERNLRRQKYARFMTPPPSVKPASSGKKREAA >gi|296494656|gb|ADTN01000082.1| GENE 11 9080 - 10615 178 511 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 268 481 1 217 245 73 26 1e-12 MQTSDTRALPLLCARSVYKQYSGVNVLKGIDFTLHQGEVHALLGGNGAGKSTLMKIIAGI TPADSGTLEIEGNNYVRLTPVHAHQLGIYLVPQEPLLFPSLSIKENILFGLAKKQLSMQK MKNLLAALGCQFDLHSLAGSLDVADRQMVEILRGLMRDSRILILDEPTASLTPAETERLF SRLQELLATGVGIVFISHKLPEIRQIADRISVMRDGTIALSGKTSELSTDDIIQAITPAV REKSLSASQKLWLELPGNRPQHAAGTPVLTLENLTGEGFRNVSLTLNAGEILGLAGLVGA GRTELAETLYGLRTLRGGRIMLNGKEINKLSTGERLLRGLVYLPEDRQSSGLNLDASLAW NVCALTHNLRGFWAKTAKDNATLERYRRALNIKFNQPEQAARTLSGGNQQKILIAKCLEA SPQVLIVDEPTRGVDVSARNDIYQLLRSIAAQNVAVLLISSDLEEIELMADRVYVMHQGE ITHSALTERDINVETIMRVAFGDSQRQEASC >gi|296494656|gb|ADTN01000082.1| GENE 12 10864 - 11817 872 317 aa, chain + ## HITS:1 COG:ydeW KEGG:ns NR:ns ## COG: ydeW COG2390 # Protein_GI_number: 16129471 # Func_class: K Transcription # Function: Transcriptional regulator, contains sigma factor-related N-terminal domain # Organism: Escherichia coli K12 # 1 317 1 317 317 588 100.0 1e-168 MTINDSAISEQGMCEEEQVARIAWFYYHDGLTQSEISDRLGLTRLKVSRLLEKGHQSGII RVQINSRFEGCLEYETQLRRQFSLQHVRVIPGLADADVGGRLGIGAAHMLMSLLQPQQML AIGFGEATMNTLQRLSGFISSQQIRLVTLSGGVGSYMTGIGQLNAACSVNIIPAPLRASS ADIARTLKNENCVKDVLLAAQAADVAIVGIGAVSQQDDATIIRSGYISQGEQLMIGRKGA VGDILGYFFDAKGDVVTNIKIHNELIGLPLSALKTIPVRVGVAGGENKAEAIAAAMKGGY INALVTDQDTAAAILRS >gi|296494656|gb|ADTN01000082.1| GENE 13 11896 - 12726 678 276 aa, chain + ## HITS:1 COG:ydeV KEGG:ns NR:ns ## COG: ydeV COG1070 # Protein_GI_number: 16129470 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and 
hexulose) kinases # Organism: Escherichia coli K12 # 1 276 1 276 530 556 100.0 1e-158 MARLFTLSESKYYLMALDAGTGSIRAVIFDLEGNQIAVGQAEWRHLAVPDVPGSMEFDLN KNWQLACECMRQALHNAGIAPEYIAAVSACSMREGIVLYNNEGAPIWACANVDARAAREV SELKELHNNTFENEVYRATGQTLALSAIPRLLWLAHHRSDIYRQASTITMISDWLAYMLS GELAVDPSNAGTTGLLDLTTRDWKPALLDMAGLRADILSPVKETGTLLGVVSSQAAELCG LKAGTPVVVGGGDVQLGCLGLGVVRPAQTAVLGGTF >gi|296494656|gb|ADTN01000082.1| GENE 14 12769 - 13488 481 239 aa, chain + ## HITS:1 COG:ydeV KEGG:ns NR:ns ## COG: ydeV COG1070 # Protein_GI_number: 16129470 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Escherichia coli K12 # 1 239 292 530 530 501 100.0 1e-142 MNVRVNPHVIPGMVQAESISFFTGLTMRWFRDAFCAEEKLIAERLGIDTYTLLEEMASRV PPGSWGVMPIFSDRMRFKTWYHAAPSFINLSIDPDKCNKATLFRALEENAAIVSACNLQQ IADFSNIHPSSLVFAGGGSKGKLWSQILADVSGLPVNIPVVKEATALGCAIAAGVGAGIF SSMAETGERLVRWERTHTPDPEKHELYQDSRDKWQAVYQDQLGLVDHGLTTSLWKAPGL >gi|296494656|gb|ADTN01000082.1| GENE 15 14019 - 17259 2095 1080 aa, chain + ## HITS:1 COG:AGl3085 KEGG:ns NR:ns ## COG: AGl3085 COG4625 # Protein_GI_number: 15891657 # Func_class: S Function unknown # Function: Uncharacterized protein with a C-terminal OMP (outer membrane protein) domain # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 110 752 142 778 1341 100 25.0 2e-20 MNRIYRVIWNCTLQVFQACSELTRRAGKTSTVNLRKSSGLTTKFSRLTLGVLLALSGSAS GASLEVDNDQITNIDTDVAYDAYLVGWYGTGVLNILAGGNASLTTITTSVIGANEDSEGT VNVLGGTWRLYDSGNNARPLNVGQSGTGTLNIKQKGHVDGGYLRLGSSTGGVGTVNVEGE DSVLTTELFEIGSYGTGSLNITDKGYVTSSIVAILGYQAGSNGQVVVEKGGEWLIKNNDS SIEFQIGNQGTGEATIREGGVVTAENTIIGGNATGIGTLNVQDQDSVITVRRLYNGYFGN GTVNISNNGLINNKEYSLVGVQDGSHGVVNVTDKGHWNFLGTGEAFRYIYIGDAGDGELN VSSEGKVDSGIITAGMKETGTGNITVKDKNSVITNLGTNLGYDGHGEMNISNQGLVVSNG GSSLGYGETGVGNVSITTGGMWEVNKNVYTTIGVAGVGNLNISDGGKFVSQNITFLGDKA SGIGTLNLMDATSSFDTVGINVGNFGSGIVNVSNGAPLNSTGYGFIGGNASGKGIVNIST DSLWNLKTSSTNAQLLQVGVLGTGELNITTGGIVKARDTQIALNDKSKGDVRVDGQNSLL 
ETFNMYVGTSGTGTLTLTNNGTLNVEGGEVYLGVFEPAVGTLNIGAAHGEAAADAGFITN ATKVEFGLGEGVFVFNHTNNSDAGYQVDMLITGDDKDGKVIHDAGHTVFNAGNTYSGKTL VNDGLLTIASHTADGVTGMGSSEVTIANPGTLDILASTNSAGDYTLTNALKGDGLMRVQL SSSDKMFGFTHATGTEFAGVAQLKDSTFTLERDNTAALTHAMLQSDSENTTSVKVGEQSI GGLAMNGGTIIFDTDIPAATLAEGYISVDTLVVGAGDYTWKGRNYQVNGTGDVLIDVPKP WNDPMANNPLTTLNLLEHDDSHVGVQLVKAQTVIGSGGSLTLRDLQGDEVEADKTLHIAQ NGTVVAEGDYGFRLTTAPGNGLYVNYGLKALNIHGGQKLTLAEHGGAYGATADMSAKIGG EGDLAINTVRQVSLSNGQNDYQGATYVQMGTLRTDADGALGNTRELNISNAAIVDLNGST Prediction of potential genes in microbial genomes Time: Sun May 15 23:24:18 2011 Seq name: gi|296494655|gb|ADTN01000083.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont216.4, whole genome shotgun sequence Length of sequence - 13575 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 5, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 1498 642 ## COG3468 Type V secretory pathway, adhesin AidA 2 1 Op 2 1/0.000 + CDS 1318 - 1974 718 ## COG3468 Type V secretory pathway, adhesin AidA + Term 1984 - 2020 7.1 + Prom 2105 - 2164 5.9 3 2 Op 1 8/0.000 + CDS 2197 - 2463 271 ## COG1396 Predicted transcriptional regulators 4 2 Op 2 2/0.000 + CDS 2463 - 3785 596 ## COG3550 Uncharacterized protein related to capsule biosynthesis enzymes + Prom 3828 - 3887 3.4 5 2 Op 3 . + CDS 4103 - 4282 87 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 4460 - 4519 5.6 6 3 Op 1 6/0.000 + CDS 4638 - 5786 585 ## COG3188 P pilus assembly protein, porin PapC 7 3 Op 2 4/0.000 + CDS 5800 - 6330 322 ## COG3539 P pilus assembly protein, pilin FimA 8 3 Op 3 . + CDS 6343 - 6846 250 ## COG3539 P pilus assembly protein, pilin FimA 9 3 Op 4 . + CDS 6905 - 7819 363 ## B21_01472 hypothetical protein + Prom 7898 - 7957 9.9 10 4 Tu 1 . + CDS 8153 - 10432 1259 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing + Term 10446 - 10488 8.0 + Prom 10496 - 10555 5.2 11 5 Op 1 . 
+ CDS 10680 - 10877 128 ## S1655 hypothetical protein 12 5 Op 2 2/0.000 + CDS 10952 - 11713 353 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 11761 - 11797 3.0 + Prom 11869 - 11928 5.0 13 5 Op 3 . + CDS 12115 - 13572 956 ## COG3119 Arylsulfatase A and related enzymes Predicted protein(s) >gi|296494655|gb|ADTN01000083.1| GENE 1 2 - 1498 642 498 aa, chain + ## HITS:1 COG:ECs2116 KEGG:ns NR:ns ## COG: ECs2116 COG3468 # Protein_GI_number: 15831370 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli O157:H7 # 264 434 1 171 466 320 100.0 5e-87 GAAQTVETFTGQMGSTVLFKEGALTVNKGGISQGELTGGGNLNVTGGTLAIEGLNARYNA LTSISPNAEVSLDNTQGLGRGNIANDGLLTLKNVTGELRNSISGKGIVSATARTDVELDG DNSRFVGQFNIDTGSALSVNEQKNLGDASVINNGLLTISTERSWAMTHSISGSGDVTKLG TGILTLNNDSAAYQGTTDIVGGEIAFGSDSAINMASQHINIHNSGVMSGNVTTAGDMNVM PGGALRVAKTTIGGNLENGGTVQMNSEGGKPGNVLTVNGNYTGNNGLMTFNATLGGDNSP TDKMNVKGDTQGNTRVRVDNIGGVGAQTVNGIELIEVGGNSAGNFALTTGTVEAGAYVYT LAKGKGNDEKNWYLTSKWDGVTPADTPDPINNPPVVDPEGPSVYRPEAGSYISNIAAANS LFSHRLHDRLGEPQLRVIVWVINRMGASAVTALGCTRPGIRTMRIRPALMLTAGRCITGL ITASVPITVLLTTMILAV >gi|296494655|gb|ADTN01000083.1| GENE 2 1318 - 1974 718 218 aa, chain + ## HITS:1 COG:ydeU KEGG:ns NR:ns ## COG: ydeU COG3468 # Protein_GI_number: 16129468 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 1 218 249 466 466 386 99.0 1e-107 MGYKSDGCISGYSAGLYATWYQNDANKTGAYVDSWALYNWFDNSVSSDNRSADDYDSRGV TASVEGGYTFEAGTFSGSEGTLNTWYVQPQAQITWMGVKDSDHTRKDGTRIETEGDGNVQ TRLGVKTYLNSHHQRDDGKQREFQPYIEANWINNSKVYAVKMNGQTVGREGARNLGEVRT GVEAKVNNNLSLWGNVGVQLGDKGYSDTQGMLGVKYSW >gi|296494655|gb|ADTN01000083.1| GENE 3 2197 - 2463 271 88 aa, chain + ## HITS:1 COG:hipB KEGG:ns NR:ns ## COG: hipB COG1396 # Protein_GI_number: 16129467 # 
Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 88 1 88 88 139 100.0 1e-33 MMSFQKIYSPTQLANAMKLVRQQNGWTQSELAKKIGIKQATISNFENNPDNTTLTTFFKI LQSLELSMTLCDAKNASPESTEQQNLEW >gi|296494655|gb|ADTN01000083.1| GENE 4 2463 - 3785 596 440 aa, chain + ## HITS:1 COG:hipA KEGG:ns NR:ns ## COG: hipA COG3550 # Protein_GI_number: 16129466 # Func_class: R General function prediction only # Function: Uncharacterized protein related to capsule biosynthesis enzymes # Organism: Escherichia coli K12 # 1 440 1 440 440 911 100.0 0 MPKLVTWMNNQRVGELTKLANGAHTFKYAPEWLASRYARPLSLSLPLQRGNITSDAVFNF FDNLLPDSPIVRDRIVKRYHAKSRQPFDLLSEIGRDSVGAVTLIPEDETVTHPIMAWEKL TEARLEEVLTAYKADIPLGMIREENDFRISVAGAQEKTALLRIGNDWCIPKGITPTTHII KLPIGEIRQPNATLDLSQSVDNEYYCLLLAKELGLNVPDAEIIKAGNVRALAVERFDRRW NAERTVLLRLPQEDMCQTFGLPSSVKYESDGGPGIARIMAFLMGSSEALKDRYDFMKFQV FQWLIGATDGHAKNFSVFIQAGGSYRLTPFYDIISAFPVLGGTGIHISDLKLAMGLNASK GKKTAIDKIYPRHFLATAKVLRFPEVQMHEILSDFARMIPAALDNVKTSLPTDFPENVVT AVESNVLRLHGRLSREYGSK >gi|296494655|gb|ADTN01000083.1| GENE 5 4103 - 4282 87 59 aa, chain + ## HITS:1 COG:yneL KEGG:ns NR:ns ## COG: yneL COG2207 # Protein_GI_number: 16129465 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 59 1 59 59 117 100.0 4e-27 MSPLRYQKWLRLNEVRRQMLNEHYDVTTAAYAVGYESYPISVGNIRGCLESHPREILPG >gi|296494655|gb|ADTN01000083.1| GENE 6 4638 - 5786 585 382 aa, chain + ## HITS:1 COG:Z2203 KEGG:ns NR:ns ## COG: Z2203 COG3188 # Protein_GI_number: 15801633 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli O157:H7 EDL933 # 1 382 502 883 883 725 98.0 0 MSGYTVKPPTGDTNEQTQFIDYFNLFYSKRGQEQISISQQLGNYGTTFFSASRQSYWNTS RSDQQISFGLNVPFGDITTSLNYSYSNNIWQNDRDHLLAFTLNVPFSHWMRTDSQSAFRN SNASYSMSNDLKGGMTNLSGVYGTLLPDNNLNYSVQVGNTHGGNTSSGTSGYSSLNYRGA 
YGNTNVGYSRSGDSSQIYYGMSGGIIAHADGITFGQPLGDTMVLVKAPGADNVKIENQTG IHTDWRGYAILPFATEYRENRVALNANSLADNVELDETVVTVIPTHGAIARATFNAQIGG KVLMTLKYGNKSVPFGAIVTHGENKNGSIVAENGQVYLTGLPQSGQLQVSWGKDKNSNCI VEYKLPEVSPGTLLNQQTAICR >gi|296494655|gb|ADTN01000083.1| GENE 7 5800 - 6330 322 176 aa, chain + ## HITS:1 COG:ydeS KEGG:ns NR:ns ## COG: ydeS COG3539 # Protein_GI_number: 16129463 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 176 1 176 176 338 100.0 2e-93 MKYNNIIFLGLCLGLTTYSALSADSVIKISGRVLDYGCTVSSDSLNFTVDLQKNSARQFP TTGSTSPAVPFQITLSECSKGTTGVRVAFNGIEDAENNTLLKLDEGSNTASGLGIEILDA NMRPVKLNDLHAGMQWIPLVPEQNNILPYSARLKSTQKSVNPGLVRASATFTLEFQ >gi|296494655|gb|ADTN01000083.1| GENE 8 6343 - 6846 250 167 aa, chain + ## HITS:1 COG:ydeR KEGG:ns NR:ns ## COG: ydeR COG3539 # Protein_GI_number: 16129462 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 167 1 167 167 308 100.0 3e-84 MKRLHKRFLLATFCALFTATLQAADVTITVNGRVVAKPCTIQTKEANVNLGDLYTRNLQQ PGSASGWHNITLSLTDCPVETSAVTAIVTGSTDNTGYYKNEGTAENIQIELRDDQDAALK NGDSKTVIVDEITRNAQFPLKARAITVNGNASQGTIEALINVIYTWQ >gi|296494655|gb|ADTN01000083.1| GENE 9 6905 - 7819 363 304 aa, chain + ## HITS:1 COG:no KEGG:B21_01472 NR:ns ## KEGG: B21_01472 # Name: ydeQ # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 304 1 304 304 573 100.0 1e-162 MGKTISIKVLFGIYLLLMAGKVFAFSCNVDGGSSIGAGTTSVYVNLDPVIQPGQNLVVDL SQHISCWNDYGGWYDTDHINLVQGSAFAGSLQSYKGSLYWNNVTYPFPLTTNTNVLDIGD KTPMPLPLKLYITPVGAAGGVVIKAGEVIARIHMYKIATLGSGNPRNFTWNIISNNNVVM PTGGCTVDSRNVTVDLPDFPGSAEIPLGVYCSSEQKLSFYLSGATTDSSRQVFANTAPDA TKASGVGVTLMRNGKILATGENVSLGTVNKSKVPLGLSATYGQTGNKVSAGTVQSVIGVT FIYE >gi|296494655|gb|ADTN01000083.1| GENE 10 8153 - 10432 1259 759 aa, chain + ## HITS:1 COG:ydeP KEGG:ns NR:ns ## COG: ydeP COG0243 # 
Protein_GI_number: 16129460 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 759 1 759 759 1566 100.0 0 MKKKIESYQGAAGGWGAVKSVANAVRKQMDIRQDVIAMFDMNKPEGFDCPGCAWPDPKHS ASFDICENGAKAIAWEVTDKQVNASFFAENTVQSLLTWGDHELEAAGRLTQPLKYDAVSD CYKPLSWQQAFDEIGARLQSYSDPNQVEFYTSGRTSNEAAFLYQLFAREYGSNNFPDCSN MCHEPTSVGLAASIGVGKGTVLLEDFEKCDLVICIGHNPGTNHPRMLTSLRALVKRGAKM IAINPLQERGLERFTAPQNPFEMLTNSETQLASAYYNVRIGGDMALLKGMMRLLIERDDA ASAAGRPSLLDDEFIQTHTVGFDELRRDVLNSEWKDIERISGLSQTQIAELADAYAAAER TIICYGMGITQHEHGTQNVQQLVNLLLMKGNIGKPGAGICPLRGHSNVQGDRTVGITEKP SAEFLARLGERYGFTPPHAPGHAAIASMQAICTGQARALICMGGNFALAMPDREASAVPL TQLDLAVHVATKLNRSHLLTARHSYILPVLGRSEIDMQKNGAQAVTVEDSMSMIHASRGV LKPAGVMLKSECAVVAGIAQAALPQSVVAWEYLVEDYDRIRNDIEAVLPEFADYNQRIRH PGGFHLINAAAERRWMTPSGKANFITSKGLLEDPSSAFNSKLVMATVRSHDQYNTTIYGM DDRYRGVFGQRDVVFMSAKQAKICRVKNGERVNLIALTPDGKRSSRRMDRLKVVIYPMAD RSLVTYFPESNHMLTLDNHDPLSGIPGYKSIPVELEPSN >gi|296494655|gb|ADTN01000083.1| GENE 11 10680 - 10877 128 65 aa, chain + ## HITS:1 COG:no KEGG:S1655 NR:ns ## KEGG: S1655 # Name: not_defined # Def: hypothetical protein # Organism: S.flexneri_2457T # Pathway: not_defined # 1 64 1 64 67 87 98.0 2e-16 MHATTVKNKITQRDNYKEIMSAIVVVLLLTLTLIAIFSAIDQLSISEMGRIARDLTHFII NSLQG >gi|296494655|gb|ADTN01000083.1| GENE 12 10952 - 11713 353 253 aa, chain + ## HITS:1 COG:ydeO KEGG:ns NR:ns ## COG: ydeO COG2207 # Protein_GI_number: 16129458 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 253 1 253 253 507 99.0 1e-144 MSLVCSVIFIHHAFNANILDKDYAFSDGEILMVDNAVRTHFEPYERHFKEIGFTENTIKK YLQCTNIQTVTVPVPAKFLRASNVPTGLLNEMIAYLNSEERNHHNFSELLLFSCLSIFAA CKGFITLLTNGVLSVSGKVRNIVNMKLAHPWKLKDICDCLYISESLLKKKLKQEQTTFSQ ILLDARMQHAKNLIRVEGSVNKIAEQCGYASTSYFIYAFRKHFGNSPKRVSKEYRCQSHT GMNTGNTMNALAI >gi|296494655|gb|ADTN01000083.1| GENE 13 12115 - 13572 956 485 aa, chain + ## HITS:1 COG:ydeN KEGG:ns NR:ns ## COG: 
ydeN COG3119 # Protein_GI_number: 16129457 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Escherichia coli K12 # 1 480 12 491 571 939 99.0 0 MKSALKKSVVSTSISLILASGMAAFAAHAADDVKLKATKTNVAFSDFTPTEYSTKGKPNI IVLTMDDLGYGQLPFDKGSFDPKTMENREVVDTYKIGIDKAIEAAQKSTPTLLSLMDEGV RFTNGYVAHGVSGPSRAAIMTGRAPARFGVYSNTDAQDGIPLTETFLPELFQNHGYYTAA VGKWHLSKISNVPVPEDKQTRDYHDNFTTFSAEEWQPQNRGFDYFMGFHAAGTAYYNSPS LFKNRERVPAKGYISDQLTDEAIGVVDRAKTLDQPFMLYLAYNAPHLPNDNPAPDQYQKQ FNTGSQTADNYYASVYSVDQGVKRILEQLKKNGQYDNTIILFTSDNGAVIDGPLPLNGAQ KGYKSQTYPGGTHTPMFMWWKGKLQPGNYDKLISAMDFYPTALDAADISIPKDLKLDGVS LLPWLQDKKQGEPHKNLTWITSYSHWFDEENIPFWDNYHKFVRHQSDDYPHNPNTEDLRK VRTSS Prediction of potential genes in microbial genomes Time: Sun May 15 23:24:25 2011 Seq name: gi|296494654|gb|ADTN01000084.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont217.1, whole genome shotgun sequence Length of sequence - 1048 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 1046 311 ## COG3209 Rhs family protein Predicted protein(s) >gi|296494654|gb|ADTN01000084.1| GENE 1 2 - 1046 311 348 aa, chain + ## HITS:1 COG:rhsD KEGG:ns NR:ns ## COG: rhsD COG3209 # Protein_GI_number: 16128481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli K12 # 1 348 291 638 1426 655 99.0 0 RGIRLSAVWLMHDPAYPESLPAAPLVRYTYTEAGELLAVYDRSNTQVRAFTYDAQHPGRM VAHRYAGRPEMRYRYDDTGRVVEQLNPAGLSYRYLYEQDRITVTDSLNRREVLHTEGGAG LKRVVKKELADGSVTRSGYDAAGRLTAQTDAAGRRTEYGLNVVSGDITDITTPDGRETKF YYNDGNQLTAVVSPDGLESRREYDEPGRLVSETSRSGETVRYRYDDAHSELPATTTDATG STRQMTWSRYGQLLAFTDCSGYQTRYEYDRFGQMTAVHREEGISLYRHYDNRGRLTSVKD AQGRETRYEYNAAGDLTAVITPDGNRSETQYDAWGKAVSTTQGGLTRS Prediction of potential genes in microbial genomes Time: Sun May 15 23:24:26 2011 Seq name: gi|296494653|gb|ADTN01000085.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont218.1, whole genome shotgun sequence Length of sequence - 583 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 582 588 ## COG3209 Rhs family protein Predicted protein(s) >gi|296494653|gb|ADTN01000085.1| GENE 1 3 - 582 588 193 aa, chain - ## HITS:1 COG:rhsD KEGG:ns NR:ns ## COG: rhsD COG3209 # Protein_GI_number: 16128481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli K12 # 1 193 640 832 1426 387 99.0 1e-108 EYDAAGRVISLTNENGSHSVFSYDALDRLVQQGGFDGRTQRYHYDLTGKLTQSEDEGLVI LWYYDESDRITHRTVNGEPAEQWQYDGHGWLTDISHLSEGHRVAVHYGYDDKGRLTGECQ TVENPETGELLWQHETKHAYNEQGLANRVTPDSLSPVEWLTYGSGYLAGMKLGGTPLVEY TRDRLHRETVRSF Prediction of potential genes in microbial genomes Time: Sun May 15 23:24:30 2011 Seq name: gi|296494652|gb|ADTN01000086.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont221.1, whole genome shotgun sequence Length of sequence - 11792 bp Number of predicted genes - 10, with homology - 9 Number of transcription units - 6, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 702 - 761 8.7 2 2 Op 1 . + CDS 785 - 3013 1829 ## COG0699 Predicted GTPases (dynamin-related) 3 2 Op 2 . + CDS 3010 - 3888 471 ## JW5729 conserved hypothetical protein + Prom 3963 - 4022 3.2 4 3 Tu 1 . + CDS 4152 - 5654 1666 ## COG0477 Permeases of the major facilitator superfamily + Prom 5685 - 5744 1.6 5 4 Tu 1 . + CDS 5766 - 5855 151 ## 6 5 Op 1 40/0.000 - CDS 5831 - 6922 930 ## COG0642 Signal transduction histidine kinase 7 5 Op 2 3/1.000 - CDS 6932 - 7600 952 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 8 5 Op 3 3/1.000 - CDS 7597 - 9240 1390 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase - Term 9275 - 9322 8.2 9 6 Op 1 5/0.000 - CDS 9344 - 10681 1501 ## COG0531 Amino acid transporters - Prom 10722 - 10781 3.9 10 6 Op 2 . 
- CDS 10818 - 11579 414 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 11645 - 11704 5.7 Predicted protein(s) >gi|296494652|gb|ADTN01000086.1| GENE 1 49 - 384 457 111 aa, chain - ## HITS:1 COG:ECs5090 KEGG:ns NR:ns ## COG: ECs5090 COG2824 # Protein_GI_number: 15834344 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized Zn-ribbon-containing protein involved in phosphonate metabolism # Organism: Escherichia coli O157:H7 # 1 111 1 111 111 216 100.0 8e-57 MSLPHCPKCNSEYTYEDNGMYICPECAYEWNDAEPAQESDELIVKDANGNLLADGDSVTI IKDLKVKGSSSMLKIGTKVKNIRLVEGDHNIDCKIDGFGPMKLKSEFVKKN >gi|296494652|gb|ADTN01000086.1| GENE 2 785 - 3013 1829 742 aa, chain + ## HITS:1 COG:yjdA_1 KEGG:ns NR:ns ## COG: yjdA_1 COG0699 # Protein_GI_number: 16131935 # Func_class: R General function prediction only # Function: Predicted GTPases (dynamin-related) # Organism: Escherichia coli K12 # 1 275 1 275 275 512 100.0 1e-145 MYTQTLYELSQEAERLLQLSRQQLQLLEKMPLSVPGDDAPQLALPWSQPNIAERHAMLNN ELRKISRLEMVLAIVGTMKAGKSTTINAIVGTEVLPNRNRPMTALPTLIRHTPGQKEPVL HFSHVAPIDCLIQQLQQRLRDCDIKHLTDVLEIDKDMRALMQRIENGVAFEKYYLGAQPI FHCLKSLNDLVRLAKALDVDFPFSAYAAIEHIPVIEVEFVHLAGLESYPGQLTLLDTPGP NEAGQPHLQKMLNQQLARASAVLAVLDYTQLKSISDEEVREAILAVGQSVPLYVLVNKFD QQDRNSDDADQVRALISGTLMKGCITPQQIFPVSSMWGYLANRARYELANNGKLPPPEQQ RWVEDFAHAALGRRWRHADLADLEHIRHAADQLWEDSLFAQPIQALLHAAYANASLYALR SAAHKLLNYAQQAREYLDFRAHGLNVACEQLRQNIHQIEESLQLLQLNQAQVSGEIKHEI ELALTSANHFLRQQQDALKVQLAALFQDDSEPLSEIRTRCETLLQTAQNTISRDFTLRFA ELESTLCRVLTDVIRPIEQQVKMELSESGFRPGFHFPVFHGVVPHFNTRQLFSEVISRQE ATDEQSTRLGVVRETFSRWLNQPDWGRGNEKSPTETVDYSVLQRALSAEVDLYCQQMAKV LAEQVDESVTAGMNTFFAEFASCLTELQTRLRESLALRQQNESVVRLMQQQLQQTVMTHG WIYTDAQLLRDDIQTLFTAERY >gi|296494652|gb|ADTN01000086.1| GENE 3 3010 - 3888 471 292 aa, chain + ## HITS:1 COG:no KEGG:JW5729 NR:ns ## KEGG: JW5729 # Name: yjcZ # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 292 1 292 292 558 100.0 1e-158 
MTKTLLDGPGRVLESVYPRFLVDLAQGDDARLPQAHQQQFRERLMQELLSRVQLQTWTNG GMLNAPLSLRLTLVEKLASMLDPGHLALTQIAQHLALLQKMDHRQHSAFPELPQQIAALY EWFSARCRWKEKALTQRGLLVQAGDQSEQIFTRWRAGAYNAWSLPGRCFIVLEELRWGAF GDACRLGSPQAVALLLGDLLEKATQHLAESINAAPTTRHYYHQWFASSTVPTGGEHADFL SWLGKWTTADKQPVCWSVTQRWQTVALGMPRLCSAQRLAGAMLEEIFSVNLA >gi|296494652|gb|ADTN01000086.1| GENE 4 4152 - 5654 1666 500 aa, chain + ## HITS:1 COG:ECs5093 KEGG:ns NR:ns ## COG: ECs5093 COG0477 # Protein_GI_number: 15834347 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 500 1 500 500 977 100.0 0 MLKRKKVKPITLRDVTIIDDGKLRKAITAASLGNAMEWFDFGVYGFVAYALGKVFFPGAD PSVQMVAALATFSVPFLIRPLGGLFFGMLGDKYGRQKILAITIVIMSISTFCIGLIPSYD TIGIWAPILLLICKMAQGFSVGGEYTGASIFVAEYSPDRKRGFMGSWLDFGSIAGFVLGA GVVVLISTIVGEANFLDWGWRIPFFIALPLGIIGLYLRHALEETPAFQQHVDKLEQGDRE GLQDGPKVSFKEIATKYWRSLLTCIGLVIATNVTYYMLLTYMPSYLSHNLHYSEDHGVLI IIAIMIGMLFVQPVMGLLSDRFGRRPFVLLGSVALFVLAIPAFILINSNVIGLIFAGLLM LAVILNCFTGVMASTLPAMFPTHIRYSALAAAFNISVLVAGLTPTLAAWLVESSQNLMMP AYYLMVVAVVGLITGVTMKETANRPLKGATPAASDIQEAKEILVEHYDNIEQKIDDIDHE IADLQAKRTRLVQQHPRIDE >gi|296494652|gb|ADTN01000086.1| GENE 5 5766 - 5855 151 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKNRVYESLTTVFSVLVVSSFLYIWFATY >gi|296494652|gb|ADTN01000086.1| GENE 6 5831 - 6922 930 363 aa, chain - ## HITS:1 COG:basS KEGG:ns NR:ns ## COG: basS COG0642 # Protein_GI_number: 16131938 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 363 1 363 363 681 100.0 0 MHFLRRPISLRQRLILTIGAILLVFELISVFWLWHESTEQIQLFEQALRDNRNNDRHIMR EIREAVASLIVPGVFMVSLTLFICYQAVRRITRPLAELQKELEARTADNLTPIAIHSATL EIEAVVSALNDLVSRLTSTLDNERLFTADVAHELRTPLAGVRLHLELLAKTHHIDVAPLV ARLDQMMESVSQLLQLARAGQSFSSGNYQHVKLLEDVILPSYDELSTMLDQRQQTLLLPE 
SAADITVQGDATLLRMLLRNLVENAHRYSPQGSNIMIKLQEDDGAVMAVEDEGPGIDESK CGELSKAFVRMDSRYGGIGLGLSIVSRITQLHHGQFFLQNRQETSGTRAWVRLKKDQYVA NQI >gi|296494652|gb|ADTN01000086.1| GENE 7 6932 - 7600 952 222 aa, chain - ## HITS:1 COG:ECs5095 KEGG:ns NR:ns ## COG: ECs5095 COG0745 # Protein_GI_number: 15834349 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 222 1 222 222 432 100.0 1e-121 MKILIVEDDTLLLQGLILAAQTEGYACDGVTTARMAEQSLEAGHYSLVVLDLGLPDEDGL HFLARIRQKKYTLPVLILTARDTLTDKIAGLDVGADDYLVKPFALEELHARIRALLRRHN NQGESELIVGNLTLNMGRRQVWMGGEELILTPKEYALLSRLMLKAGSPVHREILYNDIYN WDNEPSTNTLEVHIHNLRDKVGKARIRTVRGFGYMLVANEEN >gi|296494652|gb|ADTN01000086.1| GENE 8 7597 - 9240 1390 547 aa, chain - ## HITS:1 COG:ZyjdB KEGG:ns NR:ns ## COG: ZyjdB COG2194 # Protein_GI_number: 15804706 # Func_class: R General function prediction only # Function: Predicted membrane-associated, metal-dependent hydrolase # Organism: Escherichia coli O157:H7 EDL933 # 1 547 11 557 557 1083 100.0 0 MLKRLLKRPSLNLLAWLLLAAFYISICLNIAFFKQVLQALPLDSLHNVLVFLSMPVVAFS VINIVLTLSSFLWLNRPLACLFILVGAAAQYFIMTYGIVIDRSMIANIIDTTPAESYALM TPQMLLTLGFSGVLAALIACWIKIKPATSRLRSVLFRGANILVSVLLILLVAALFYKDYA SLFRNNKELVKSLSPSNSIVASWSWYSHQRLANLPLVRIGEDAHRNPLMQNEKRKNLTIL IVGETSRAENFSLNGYPRETNPRLAKDNVVYFPNTASCGTATAVSVPCMFSDMPREHYKE ELAQHQEGVLDIIQRAGINVLWNDNDGGCKGACDRVPHQNVTALNLPDQCINGECYDEVL FHGLEEYINNLQGDGVIVLHTIGSHGPTYYNRYPPQFRKFTPTCDTNEIQTCTKEQLVNT YDNTLVYVDYIVDKAINLLKEHQDKFTTSLVYLSDHGESLGENGIYLHGLPYAIAPDSQK QVPMLLWLSEDYQKRYQVDQNCLQKQAQTQHYSQDNLFSTLLGLTGVETKYYQAADDILQ TCRRVSE >gi|296494652|gb|ADTN01000086.1| GENE 9 9344 - 10681 1501 445 aa, chain - ## HITS:1 COG:ECs5097 KEGG:ns NR:ns ## COG: ECs5097 COG0531 # Protein_GI_number: 15834351 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli O157:H7 # 1 445 1 445 445 780 100.0 0 
MSSDADAHKVGLIPVTLMVSGNIMGSGVFLLPANLASTGGIAIYGWLVTIIGALGLSMVY AKMSFLDPSPGGSYAYARRCFGPFLGYQTNVLYWLACWIGNIAMVVIGVGYLSYFFPILK DPLVLTITCVVVLWIFVLLNIVGPKMITRVQAVATVLALIPIVGIAVFGWFWFRGETYMA AWNVSGLGTFGAIQSTLNVTLWSFIGVESASVAAGVVKNPKRNVPIATIGGVLIAAVCYV LSTTAIMGMIPNAALRVSASPFGDAARMALGDTAGAIVSFCAAAGCLGSLGGWTLLAGQT AKAAADDGLFPPIFARVNKAGTPVAGLIIVGILMTIFQLSSISPNATKEFGLVSSVSVIF TLVPYLYTCAALLLLGHGHFGKARPAYLAVTTIAFLYCIWAVVGSGAKEVMWSFVTLMVI TAMYALNYNRLHKNPYPLDAPISKD >gi|296494652|gb|ADTN01000086.1| GENE 10 10818 - 11579 414 253 aa, chain - ## HITS:1 COG:adiY KEGG:ns NR:ns ## COG: adiY COG2207 # Protein_GI_number: 16131942 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 253 1 253 253 493 100.0 1e-139 MRICSDQPCIVLLTEKDVWIRVNGKEPISLKANHMALLNCENNIIDVSSLNNTLVAHISH DIIKDYLRFLNKDLSQIPVWQRSATPILTLPCLTPDVFRVAAQHSMMPAETESEKERTRA LLFTVLSRFLDSKKFVSLMMYMLRNCVSDSVYQIIESDIHKDWNLSMVASCLCLSPSLLK KKLKSENTSYSQIITTCRMRYAVNELMMDGKNISQVSQSCGYNSTSYFISVFKDFYGMTP LHYVSQHRERTVA Prediction of potential genes in microbial genomes Time: Sun May 15 23:24:43 2011 Seq name: gi|296494651|gb|ADTN01000087.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont221.2, whole genome shotgun sequence Length of sequence - 18303 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 12, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 119 - 2386 2248 ## COG1982 Arginine/lysine/ornithine decarboxylases - Prom 2508 - 2567 7.2 2 2 Tu 1 . - CDS 2585 - 3493 802 ## COG2207 AraC-type DNA-binding domain-containing proteins 3 3 Tu 1 4/0.500 + CDS 3779 - 5131 1227 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases 4 4 Tu 1 . + CDS 5246 - 6655 1069 ## COG2211 Na+/melibiose symporter and related transporters + Term 6678 - 6719 6.1 5 5 Tu 1 . 
- CDS 6794 - 7423 632 ## COG3647 Predicted membrane protein - Prom 7446 - 7505 2.5 6 6 Op 1 4/0.500 - CDS 7546 - 9192 503 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase - Term 9219 - 9260 6.0 7 6 Op 2 3/1.000 - CDS 9270 - 10610 1498 ## COG2704 Anaerobic C4-dicarboxylate transporter 8 7 Op 1 9/0.000 - CDS 11181 - 11900 278 ## PROTEIN SUPPORTED gi|149011191|ref|ZP_01832496.1| 30S ribosomal protein S9 9 7 Op 2 . - CDS 11897 - 13528 1440 ## COG3290 Signal transduction histidine kinase regulating citrate/malate metabolism - Prom 13552 - 13611 2.9 10 8 Op 1 . + CDS 13520 - 13654 97 ## EcSMS35_4593 hypothetical protein 11 8 Op 2 5/0.500 + CDS 13709 - 13939 231 ## COG3592 Uncharacterized conserved protein 12 8 Op 3 . + CDS 13951 - 14223 348 ## COG2388 Predicted acetyltransferase 13 9 Tu 1 . - CDS 14153 - 14344 114 ## gi|293407852|ref|ZP_06651692.1| predicted protein - Prom 14440 - 14499 3.3 + Prom 14302 - 14361 1.7 14 10 Op 1 . + CDS 14450 - 14746 233 ## ECUMN_4661 hypothetical protein 15 10 Op 2 . + CDS 14774 - 14947 163 ## ECH74115_5644 hypothetical protein + Term 15026 - 15061 6.7 - Term 15014 - 15049 6.7 16 11 Tu 1 . - CDS 15066 - 16583 1711 ## COG1190 Lysyl-tRNA synthetase (class II) - Prom 16666 - 16725 6.5 17 12 Tu 1 . 
- CDS 16820 - 18277 1415 ## COG3104 Dipeptide/tripeptide permease Predicted protein(s) >gi|296494651|gb|ADTN01000087.1| GENE 1 119 - 2386 2248 755 aa, chain - ## HITS:1 COG:adiA KEGG:ns NR:ns ## COG: adiA COG1982 # Protein_GI_number: 16131943 # Func_class: E Amino acid transport and metabolism # Function: Arginine/lysine/ornithine decarboxylases # Organism: Escherichia coli K12 # 1 755 2 756 756 1602 100.0 0 MKVLIVESEFLHQDTWVGNAVERLADALSQQNVTVIKSTSFDDGFAILSSNEAIDCLMFS YQMEHPDEHQNVRQLIGKLHERQQNVPVFLLGDREKALAAMDRDLLELVDEFAWILEDTA DFIAGRAVAAMTRYRQQLLPPLFSALMKYSDIHEYSWAAPGHQGGVGFTKTPAGRFYHDY YGENLFRTDMGIERTSLGSLLDHTGAFGESEKYAARVFGADRSWSVVVGTSGSNRTIMQA CMTDNDVVVVDRNCHKSIEQGLMLTGAKPVYMVPSRNRYGIIGPIYPQEMQPETLQKKIS ESPLTKDKAGQKPSYCVVTNCTYDGVCYNAKEAQDLLEKTSDRLHFDEAWYGYARFNPIY ADHYAMRGEPGDHNGPTVFATHSTHKLLNALSQASYIHVREGRGAINFSRFNQAYMMHAT TSPLYAICASNDVAVSMMDGNSGLSLTQEVIDEAVDFRQAMARLYKEFTADGSWFFKPWN KEVVTDPQTGKTYDFADAPTKLLTTVQDCWVMHPGESWHGFKDIPDNWSMLDPIKVSILA PGMGEDGELEETGVPAALVTAWLGRHGIVPTRTTDFQIMFLFSMGVTRGKWGTLVNTLCS FKRHYDANTPLAQVMPELVEQYPDTYANMGIHDLGDTMFAWLKENNPGARLNEAYSGLPV AEVTPREAYNAIVDNNVELVSIENLPGRIAANSVIPYPPGIPMLLSGENFGDKNSPQVSY LRSLQSWDHHFPGFEHETEGTEIIDGIYHVMCVKA >gi|296494651|gb|ADTN01000087.1| GENE 2 2585 - 3493 802 302 aa, chain - ## HITS:1 COG:melR KEGG:ns NR:ns ## COG: melR COG2207 # Protein_GI_number: 16131944 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 302 1 302 302 610 100.0 1e-174 MNTDTFMCSSDEKQTRSPLSLYSEYQRMEIEFRAPHIMPTSHWHGQVEVNVPFDGDVEYL INNEKVNINQGHITLFWACTPHQLTDTGTCQSMAIFNLPMHLFLSWPLDKDLINHVTHGM VIKSLATQQLSPFEVRRWQQELNSPNEQIRQLAIDEIGLMLKRFSLSGWEPILVNKTSRT HKNSVSRHAQFYVSQMLGFIAENYDQALTINDVAEHVKLNANYAMGIFQRVMQLTMKQYI TAMRINHVRALLSDTDKSILDIALTAGFRSSSRFYSTFGKYVGMSPQQYRKLSQQRRQTF PG >gi|296494651|gb|ADTN01000087.1| GENE 3 3779 - 5131 1227 450 aa, chain + ## HITS:1 COG:melA KEGG:ns NR:ns ## COG: melA COG1486 # Protein_GI_number: 16131945 # Func_class: G Carbohydrate transport 
and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Escherichia coli K12 # 1 450 2 451 451 959 100.0 0 MSAPKITFIGAGSTIFVKNILGDVFHREALKTAHIALMDIDPTRLEESHIVVRKLMDSAG ASGKITCHTQQKEALEDADFVVVAFQIGGYEPCTVTDFEVCKRHGLEQTIADTLGPGGIM RALRTIPHLWQICEDMTEVCPDATMLNYVNPMAMNTWAMYARYPHIKQVGLCHSVQGTAE ELARDLNIDPATLRYRCAGINHMAFYLELERKTADGSYVNLYPELLAAYEAGQAPKPNIH GNTRCQNIVRYEMFKKLGYFVTESSEHFAEYTPWFIKPGREDLIERYKVPLDEYPKRCVE QLANWHKELEEYKKASRIDIKPSREYASTIMNAIWTGEPSVIYGNVRNDGLIDNLPQGCC VEVACLVDANGIQPTKVGTLPSHLAALMQTNINVQTLLTEAILTENRDRVYHAAMMDPHT AAVLGIDEIYALVDDLIAAHGDWLPGWLHR >gi|296494651|gb|ADTN01000087.1| GENE 4 5246 - 6655 1069 469 aa, chain + ## HITS:1 COG:melB KEGG:ns NR:ns ## COG: melB COG2211 # Protein_GI_number: 16131946 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli K12 # 1 469 1 469 469 855 100.0 0 MTTKLSYGFGAFGKDFAIGIVYMYLMYYYTDVVGLSVGLVGTLFLVARIWDAINDPIMGW IVNATRSRWGKFKPWILIGTLANSVILFLLFSAHLFEGTTQIVFVCVTYILWGMTYTIMD IPFWSLVPTITLDKREREQLVPYPRFFASLAGFVTAGVTLPFVNYVGGGDRGFGFQMFTL VLIAFFIVSTIITLRNVHEVFSSDNQPSAEGSHLTLKAIVALIYKNDQLSCLLGMALAYN VASNIITGFAIYYFSYVIGDADLFPYYLSYAGAANLVTLVFFPRLVKSLSRRILWAGASI LPVLSCGVLLLMALMSYHNVVLIVIAGILLNVGTALFWVLQVIMVADIVDYGEYKLHVRC ESIAYSVQTMVVKGGSAFAAFFIAVVLGMIGYVPNVEQSTQALLGMQFIMIALPTLFFMV TLILYFRFYRLNGDTLRRIQIHLLDKYRKVPPEPVHADIPVGAVSDVKA >gi|296494651|gb|ADTN01000087.1| GENE 5 6794 - 7423 632 209 aa, chain - ## HITS:1 COG:yjdF KEGG:ns NR:ns ## COG: yjdF COG3647 # Protein_GI_number: 16131947 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 209 1 209 209 392 100.0 1e-109 MTRTLKPLILNTSALTLTLILIYTGISAHDKLTWLMEVTPVIIVVQLLLATARRYPLTPL LYTLIFLHAIILMVGGQYTYAKVPVGFEVQEWLGLSRNPYDKLGHFFQGLVPALVAREIL VRGMYVRGRKMVAFLVCCVALAISAMYELIEWWAALAMGQGADDFLGTQGDQWDTQSDMF CALLGALTTVIFLARFHCRQLRRFGLITG >gi|296494651|gb|ADTN01000087.1| GENE 6 7546 - 
9192 503 548 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 69 531 31 482 508 198 31 3e-50 MSNKPFIYQAPFPMGKDNTEYYLLTSDYVSVADFDGETILKVEPEALTLLAQQAFHDASF MLRPAHQKQVAAILHDPEASENDKYVALQFLRNSEIAAKGVLPTCQDTGTAIIVGKKGQR VWTGGGDEETLSKGVYNTYIEDNLRYSQNAALDMYKEVNTGTNLPAQIDLYAVDGDEYKF LCVAKGGGSANKTYLYQETKALLTPGKLKNFLVEKMRTLGTAACPPYHIAFVIGGTSAET NLKTVKLASAHYYDELPTEGNEHGQAFRDVQLEQELLEEAQKLGLGAQFGGKYFAHDIRV IRLPRHGASCPVGMGVSCSADRNIKAKINREGIWIEKLEHNPGQYIPQELRQAGEGEAVK VDLNRPMKEILAQLSQYPVSTRLSLTGTIIVGRDIAHAKLKELIDAGKELPQYIKDHPIY YAGPAKTPAGYPSGSLGPTTAGRMDSYVDLLQSHGGSMIMLAKGNRSQQVTDACHKHGGF YLGSIGGPAAVLAQQSIKHLECVAYPELGMEAIWKIEVEDFPAFILVDDKGNDFFQQIVN KQCANCTK >gi|296494651|gb|ADTN01000087.1| GENE 7 9270 - 10610 1498 446 aa, chain - ## HITS:1 COG:ECs5105 KEGG:ns NR:ns ## COG: ECs5105 COG2704 # Protein_GI_number: 15834359 # Func_class: R General function prediction only # Function: Anaerobic C4-dicarboxylate transporter # Organism: Escherichia coli O157:H7 # 1 446 1 446 446 743 100.0 0 MLFTIQLIIILICLFYGARKGGIALGLLGGIGLVILVFVFHLQPGKPPVDVMLVIIAVVA ASATLQASGGLDVMLQIAEKLLRRNPKYVSIVAPFVTCTLTILCGTGHVVYTILPIIYDV AIKNNIRPERPMAASSIGAQMGIIASPVSVAVVSLVAMLGNVTFDGRHLEFLDLLAITIP STLIGILAIGIFSWFRGKDLDKDEEFQKFISVPENREYVYGDTATLLDKKLPKSNWLAMW IFLGAIAVVALLGADSDLRPSFGGKPLSMVLVIQMFMLLTGALIIILTKTNPASISKNEV FRSGMIAIVAVYGIAWMAETMFGAHMSEIQGVLGEMVKEYPWAYAIVLLLVSKFVNSQAA ALAAIVPVALAIGVDPAYIVASAPACYGYYILPTYPSDLAAIQFDRSGTTHIGRFVINHS FILPGLIGVSVSCVFGWIFAAMYGFL >gi|296494651|gb|ADTN01000087.1| GENE 8 11181 - 11900 278 239 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149011191|ref|ZP_01832496.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP19-BS75] # 2 225 1 222 226 111 36 3e-24 MINVLIIDDDAMVAELNRRYVAQIPGFQCCGTASTLEKAKEIIFNSDTPIDLILLDIYMQ KENGLDLLPVLHNARCKSDVIVISSAADAATIKDSLHYGVVDYLIKPFQASRFEEALTGW RQKKMALEKHQYYDQAELDQLIHGSSSNEQDPRRLPKGLTPQTLRTLCQWIDAHQDYEFS TDELANEVNISRVSCRKYLIWLVNCHILFTSIHYGVTGRPVYRYRIQAEHYSLLKQYCQ 
>gi|296494651|gb|ADTN01000087.1| GENE 9 11897 - 13528 1440 543 aa, chain - ## HITS:1 COG:ECs5107 KEGG:ns NR:ns ## COG: ECs5107 COG3290 # Protein_GI_number: 15834361 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase regulating citrate/malate metabolism # Organism: Escherichia coli O157:H7 # 1 543 1 543 543 1069 99.0 0 MRHSLPYRMLRKRPMKLSTTVILMVSAVLFSVLLVVHLIYFSQISDMTRDGLANKALAVA RTLADSPEIRQGLQKKPQESGIQAIAEAVRKRNDLLFIVVTDMQSLRYSHPEAQRIGQPF KGDDILKALNGEENVAINRGFLAQALRVFTPIYDENHKQIGVVAIGLELSRVTQQINDSR WSIIWSVLFGMLVGLIGTCILVKVLKKILFGLEPYEISTLFEQRQAMLQSIKEGVVAVDD RGEVTLINDAAQELLNYRKSQDDEKLSTLSHSWSQVVDVSEVLRDGTPRRDEEIMIKDRL LLINTVPVRSNGVIIGAISTFRDKTEVRKLMQRLDGLVNYADALRERSHEFMNKLHVILG LLHLKSYKQLEDYILKTANNYQEEIGSLLGKIKSPVIAGFLISKINRATDLGHTLILNSE SQLPDSGSEDQVATLITTLGNLIENALEALGPEPGGEISVTLHYRHGWLHCEVNDDGPGI APDKIDHIFDKGVSTKGSERGVGLALVKQQVENLGGSIAVESEPGIFTQFFVQIPWDGER SNR >gi|296494651|gb|ADTN01000087.1| GENE 10 13520 - 13654 97 44 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_4593 NR:ns ## KEGG: EcSMS35_4593 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 44 1 44 44 79 100.0 3e-14 MSHQLPCVTNFLSIISDEAGNSKGVRMIGYIGEETLATETASAV >gi|296494651|gb|ADTN01000087.1| GENE 11 13709 - 13939 231 76 aa, chain + ## HITS:1 COG:ECs5108 KEGG:ns NR:ns ## COG: ECs5108 COG3592 # Protein_GI_number: 15834362 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 76 1 76 76 162 100.0 1e-40 MDQALLDGGYRCYTGEKIDVYFNTAICQHSGNCVRGNGKLFNLKRKPWIMPDEVDVATVV KVIDTCPSGALKYRHK >gi|296494651|gb|ADTN01000087.1| GENE 12 13951 - 14223 348 90 aa, chain + ## HITS:1 COG:yjdJ KEGG:ns NR:ns ## COG: yjdJ COG2388 # Protein_GI_number: 16131953 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Escherichia coli K12 # 1 90 1 90 90 163 100.0 7e-41 MEIREGHNKFYINDKQGKQIAEIVFVPTGENLAIIEHTDVDESLKGQGIGKQLVAKVVEK 
MRREKRKIIPLCPFAKHEFDKTREYDDIRS >gi|296494651|gb|ADTN01000087.1| GENE 13 14153 - 14344 114 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293407852|ref|ZP_06651692.1| ## NR: gi|293407852|ref|ZP_06651692.1| predicted protein [Escherichia coli B354] # 1 63 1 63 63 104 85.0 2e-21 MTWDKKTRSLPILCASPYPSDALLHHIAERMKNIVTLYSPINCEYHHTPAFYQIHVSQMG IMG >gi|296494651|gb|ADTN01000087.1| GENE 14 14450 - 14746 233 98 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_4661 NR:ns ## KEGG: ECUMN_4661 # Name: yjdK # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 98 1 98 98 172 100.0 3e-42 MEGKNKFNTYVVSFDYPSSYSSVFLRLRSLMYDMNFSSIVADEYGIPRQLNENSFAITTS LAASEIEDLIRLKCLDLPDIDFDLNIMTVDDYFRQFYK >gi|296494651|gb|ADTN01000087.1| GENE 15 14774 - 14947 163 57 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_5644 NR:ns ## KEGG: ECH74115_5644 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 57 1 57 57 68 100.0 6e-11 MALFSKILIFYVIGVNISFVIIWFISHEKTHIRLLSAFLVGITWPMSLPVALLFSLF >gi|296494651|gb|ADTN01000087.1| GENE 16 15066 - 16583 1711 505 aa, chain - ## HITS:1 COG:ECs5111 KEGG:ns NR:ns ## COG: ECs5111 COG1190 # Protein_GI_number: 15834365 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA synthetase (class II) # Organism: Escherichia coli O157:H7 # 1 505 1 505 505 1000 100.0 0 MSEQETRGANEAIDFNDELRNRREKLAALRQQGVAFPNDFRRDHTSDQLHEEFDAKDNQE LESLNIEVSVAGRMMTRRIMGKASFVTLQDVGGRIQLYVARDSLPEGVYNDQFKKWDLGD IIGARGTLFKTQTGELSIHCTELRLLTKALRPLPDKFHGLQDQEVRYRQRYLDLIANDKS RQTFVVRSKILAAIRQFMVARGFMEVETPMMQVIPGGASARPFITHHNALDLDMYLRIAP ELYLKRLVVGGFERVFEINRNFRNEGISVRHNPEFTMMELYMAYADYHDLIELTESLFRT LAQEVLGTTKVTYGEHVFDFGKPFEKLTMREAIKKYRPETDMADLDNFDAAKALAESIGI TVEKSWGLGRIVTEIFDEVAEAHLIQPTFITEYPAEVSPLARRNDVNPEITDRFEFFIGG REIGNGFSELNDAEDQAERFQEQVNAKAAGDDEAMFYDEDYVTALEYGLPPTAGLGIGID RMIMLFTNSHTIRDVILFPAMRPQK >gi|296494651|gb|ADTN01000087.1| GENE 17 16820 - 18277 1415 485 aa, chain - ## HITS:1 COG:yjdL KEGG:ns NR:ns ## COG: 
yjdL COG3104 # Protein_GI_number: 16131956 # Func_class: E Amino acid transport and metabolism # Function: Dipeptide/tripeptide permease # Organism: Escherichia coli K12 # 1 485 1 485 485 888 99.0 0 MKTPSQPRAIYYIVAIQIWEYFSFYGMRALLILYLTHQLGFDDNHAISLFSAYASLVYVT PILGGWLADRLLGNRTAVIAGALLMTLGHVVLGIDTNSTFSLYLALAIIICGYGLFKSNI SCLLGELYDENDHRRDGGFSLLYAAGNIGSIAAPIACGLAAQWYGWHVGFALAGGGMFIG LLIFLSGHRHFQSTRSMDKKALTSVKFALPVWSWLVVMLCLAPVFFTLLLENDWSGYLLA IVCLIAAQIIARMMIKFPEHRRALWQIVLLMFVGTLFWVLAQQGGSTISLFIDRFVNRQA FNIEVPTALFQSVNAIAVMLAGVVLAWLASPESRGNSTLRVWLKFAFGLLLMACGFMLLA FDARHAAADGQASMGVMISGLALMGFAELFIDPVAIAQITRLKMSGVLTGIYMLATGAVA NWLAGVVAQQTTESQISGMAIAAYQRFFSQMGEWTLACVAIIVVLTFATRFLFSTPTNMI QESND Prediction of potential genes in microbial genomes Time: Sun May 15 23:24:55 2011 Seq name: gi|296494650|gb|ADTN01000088.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont221.3, whole genome shotgun sequence Length of sequence - 5881 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 10/0.000 - CDS 42 - 2189 2413 ## COG1982 Arginine/lysine/ornithine decarboxylases - Term 2198 - 2253 3.0 2 1 Op 2 4/0.000 - CDS 2269 - 3603 1471 ## COG0531 Amino acid transporters - Term 3928 - 3962 -1.0 3 1 Op 3 . - CDS 3968 - 5161 747 ## COG3710 DNA-binding winged-HTH domains - Prom 5203 - 5262 2.2 + Prom 4835 - 4894 2.6 4 2 Tu 1 . 
+ CDS 5118 - 5216 65 ## Predicted protein(s) >gi|296494650|gb|ADTN01000088.1| GENE 1 42 - 2189 2413 715 aa, chain - ## HITS:1 COG:ECs5113 KEGG:ns NR:ns ## COG: ECs5113 COG1982 # Protein_GI_number: 15834367 # Func_class: E Amino acid transport and metabolism # Function: Arginine/lysine/ornithine decarboxylases # Organism: Escherichia coli O157:H7 # 1 715 1 715 715 1494 100.0 0 MNVIAILNHMGVYFKEEPIRELHRALERLNFQIVYPNDRDDLLKLIENNARLCGVIFDWD KYNLELCEEISKMNENLPLYAFANTYSTLDVSLNDLRLQISFFEYALGAAEDIANKIKQT TDEYINTILPPLTKALFKYVREGKYTFCTPGHMGGTAFQKSPVGSLFYDFFGPNTMKSDI SISVSELGSLLDHSGPHKEAEQYIARVFNADRSYMVTNGTSTANKIVGMYSAPAGSTILI DRNCHKSLTHLMMMSDVTPIYFRPTRNAYGILGGIPQSEFQHATIAKRVKETPNATWPVH AVITNSTYDGLLYNTDFIKKTLDVKSIHFDSAWVPYTNFSPIYEGKCGMSGGRVEGKVIY ETQSTHKLLAAFSQASMIHVKGDVNEETFNEAYMMHTTTSPHYGIVASTETAAAMMKGNA GKRLINGSIERAIKFRKEIKRLRTESDGWFFDVWQPDHIDTTECWPLRSDSTWHGFKNID NEHMYLDPIKVTLLTPGMEKDGTMSDFGIPASIVAKYLDEHGIVVEKTGPYNLLFLFSIG IDKTKALSLLRALTDFKRAFDLNLRVKNMLPSLYREDPEFYENMRIQELAQNIHKLIVHH NLPDLMYRAFEVLPTMVMTPYAAFQKELHGMTEEVYLDEMVGRINANMILPYPPGVPLVM PGEMITEESRPVLEFLQMLCEIGAHYPGFETDIHGAYRQADGRYTVKVLKEESKK >gi|296494650|gb|ADTN01000088.1| GENE 2 2269 - 3603 1471 444 aa, chain - ## HITS:1 COG:ECs5114 KEGG:ns NR:ns ## COG: ECs5114 COG0531 # Protein_GI_number: 15834368 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli O157:H7 # 1 444 1 444 444 761 100.0 0 MSSAKKIGLFACTGVVAGNMMGSGIALLPANLASIGGIAIWGWIISIIGAMSLAYVYARL ATKNPQQGGPIAYAGEISPAFGFQTGVLYYHANWIGNLAIGITAVSYLSTFFPVLNDPVP AGIACIAIVWVFTFVNMLGGTWVSRLTTIGLVLVLIPVVMTAIVGWHWFDAATYAANWNT ADTTDGHAIIKSILLCLWAFVGVESAAVSTGMVKNPKRTVPLATMLGTGLAGIVYIAATQ VLSGMYPSSVMAASGAPFAISASTILGNWAAPLVSAFTAFACLTSLGSWMMLVGQAGVRA ANDGNFPKVYGEVDSNGIPKKGLLLAAVKMTALMILITLMNSAGGKASDLFGELTGIAVL LTMLPYFYSCVDLIRFEGVNIRNFVSLICSVLGCVFCFIALMGASSFELAGTFIVSLIIL MFYARKMHERQSHSMDNHTASNAH >gi|296494650|gb|ADTN01000088.1| GENE 3 3968 - 5161 747 397 aa, chain - ## HITS:1 COG:cadC_1 KEGG:ns NR:ns ## COG: cadC_1 
COG3710 # Protein_GI_number: 16131959 # Func_class: K Transcription # Function: DNA-binding winged-HTH domains # Organism: Escherichia coli K12 # 1 65 116 180 180 105 100.0 1e-22 MLSSPPPIPEAVPATDSPSHSLNIQNTATPPEQSPVKSKRFTTFWVWFFFLLSLGICVAL VAFSSLDTRLPMSKSRILLNPRDIDINMVNKSCNSWSSPYQLSYAIGVGDLVATSLNTFS TFMVHDKINYNIDEPSSSGKTLSIAFVNQRQYRAQQCFMSIKLVDNADGSTMLDKRYVIT NGNQLAIQNDLLESLSKALNQPWPQRMQETLQKILPHRGALLTNFYQAHDYLLHGDDKSL NRASELLGEIVQSSPEFTYARAEKALVDIVRHSQHPLDEKQLAALNTEIDNIVTLPELNN LSIIYQIKAVSALVKGKTDESYQAINTGIDLEMSWLNYVLLGKVYEMKGMNREAADAYLT AFNLRPGANTLYWIENGIFQTSVPYVVPYLDKFLASE >gi|296494650|gb|ADTN01000088.1| GENE 4 5118 - 5216 65 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAGTASGIGGGEDSIISSPSSSLYQITGTINL Prediction of potential genes in microbial genomes Time: Sun May 15 23:25:13 2011 Seq name: gi|296494649|gb|ADTN01000089.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont238.1, whole genome shotgun sequence Length of sequence - 47979 bp Number of predicted genes - 40, with homology - 40 Number of transcription units - 25, operones - 10 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 68 - 106 3.5 1 1 Tu 1 . - CDS 107 - 307 186 ## SSON_3186 glycogen synthesis protein GlgS - Prom 401 - 460 6.9 + Prom 392 - 451 6.2 2 2 Op 1 . + CDS 576 - 1205 177 ## B21_02870 hypothetical protein 3 2 Op 2 . + CDS 1232 - 2893 1767 ## COG2268 Uncharacterized protein conserved in bacteria + Term 2906 - 2951 5.4 - Term 3458 - 3489 3.2 4 3 Op 1 5/0.250 - CDS 3688 - 5121 1780 ## COG2870 ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase 5 3 Op 2 5/0.250 - CDS 5169 - 8009 3061 ## COG1391 Glutamine synthetase adenylyltransferase 6 3 Op 3 . - CDS 8032 - 9333 1512 ## COG3025 Uncharacterized conserved protein - Prom 9405 - 9464 2.6 + Prom 9489 - 9548 2.2 7 4 Op 1 7/0.125 + CDS 9575 - 10195 703 ## COG3103 SH3 domain protein 8 4 Op 2 . 
+ CDS 10259 - 11497 1181 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase + Term 11564 - 11608 4.1 9 5 Op 1 4/0.438 - CDS 11678 - 12499 1023 ## COG1968 Uncharacterized bacitracin resistance protein 10 5 Op 2 . - CDS 12589 - 12957 432 ## COG1539 Dihydroneopterin aldolase + Prom 12899 - 12958 1.8 11 6 Tu 1 . + CDS 13062 - 13679 559 ## COG0344 Predicted membrane protein 12 7 Tu 1 . - CDS 13692 - 14624 872 ## COG0583 Transcriptional regulator - Prom 14750 - 14809 3.9 + Prom 14691 - 14750 3.5 13 8 Op 1 11/0.000 + CDS 14834 - 15742 262 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase 14 8 Op 2 3/0.812 + CDS 15715 - 16344 248 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase + Term 16352 - 16383 1.7 15 8 Op 3 . + CDS 16392 - 17855 1599 ## COG0471 Di- and tricarboxylate transporters + Term 18031 - 18066 -0.7 16 9 Tu 1 . - CDS 17898 - 18911 660 ## PROTEIN SUPPORTED gi|227425790|ref|ZP_03908856.1| SSU ribosomal protein S18P alanine acetyltransferase - Prom 18935 - 18994 4.0 + Prom 18954 - 19013 6.3 17 10 Tu 1 . + CDS 19149 - 19364 357 ## PROTEIN SUPPORTED gi|15803607|ref|NP_289640.1| 30S ribosomal protein S21 + Term 19411 - 19451 6.1 + Prom 19393 - 19452 3.0 18 11 Op 1 31/0.000 + CDS 19475 - 21220 1389 ## COG0358 DNA primase (bacterial type) 19 11 Op 2 . + CDS 21415 - 23256 2455 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) + Term 23312 - 23344 3.0 - Term 23300 - 23330 2.6 20 12 Tu 1 . - CDS 23335 - 23841 433 ## COG3663 G:T/U mismatch-specific DNA glycosylase - Prom 23983 - 24042 79.6 + TRNA 23966 - 24041 93.9 # Met CAT 0 0 - Term 24041 - 24074 5.4 21 13 Tu 1 . - CDS 24095 - 24859 745 ## COG2375 Siderophore-interacting protein - Prom 24903 - 24962 5.7 + Prom 25062 - 25121 7.5 22 14 Tu 1 . + CDS 25147 - 25770 764 ## COG1695 Predicted transcriptional regulators + Term 25858 - 25901 3.2 23 15 Tu 1 . - CDS 25923 - 27443 1557 ## COG0840 Methyl-accepting chemotaxis protein 24 16 Tu 1 . 
+ CDS 27951 - 29240 1346 ## COG4992 Ornithine/acetylornithine aminotransferase + Term 29248 - 29289 9.1 - Term 29236 - 29276 8.1 25 17 Tu 1 . - CDS 29282 - 29614 510 ## COG0073 EMAP domain - Prom 29635 - 29694 3.1 + Prom 29585 - 29644 4.8 26 18 Tu 1 . + CDS 29833 - 30816 1021 ## COG1609 Transcriptional regulators + Term 30826 - 30875 12.3 + Prom 30821 - 30880 3.9 27 19 Op 1 . + CDS 31000 - 34092 3159 ## COG3250 Beta-galactosidase/beta-glucuronidase 28 19 Op 2 3/0.812 + CDS 34089 - 34538 380 ## COG2731 Beta-galactosidase, beta subunit 29 19 Op 3 4/0.438 + CDS 34601 - 34879 359 ## COG0531 Amino acid transporters + Prom 34883 - 34942 1.5 30 19 Op 4 . + CDS 34964 - 36034 1105 ## COG0531 Amino acid transporters + Term 36054 - 36114 9.3 + Prom 36071 - 36130 6.7 31 20 Op 1 . + CDS 36168 - 37238 1075 ## B21_02898 hypothetical protein 32 20 Op 2 . + CDS 37255 - 39606 2627 ## JW3051 predicted glycosyl hydrolase + Prom 39932 - 39991 7.2 33 21 Tu 1 . + CDS 40032 - 42050 1833 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family + Term 42060 - 42106 5.6 34 22 Op 1 6/0.188 - CDS 42095 - 42511 286 ## COG5499 Predicted transcription regulator containing HTH domain 35 22 Op 2 1/0.875 - CDS 42508 - 42822 200 ## COG4680 Uncharacterized protein conserved in bacteria - Prom 42847 - 42906 5.2 - Term 42849 - 42887 10.2 36 23 Tu 1 . - CDS 43106 - 44242 451 ## PROTEIN SUPPORTED gi|225082609|ref|YP_002654106.1| ribosomal protein L11 methyltransferase, putative - Prom 44287 - 44346 4.6 + Prom 44212 - 44271 2.0 37 24 Op 1 3/0.812 + CDS 44327 - 44830 436 ## COG1451 Predicted metal-dependent hydrolase + Prom 44832 - 44891 1.6 38 24 Op 2 3/0.812 + CDS 44911 - 45603 503 ## COG2949 Uncharacterized membrane protein 39 24 Op 3 3/0.812 + CDS 45664 - 46668 749 ## COG0673 Predicted dehydrogenases and related proteins + Term 46694 - 46746 6.6 + Prom 46683 - 46742 6.7 40 25 Tu 1 . 
+ CDS 46951 - 47916 1269 ## COG0861 Membrane protein TerC, possibly involved in tellurium resistance Predicted protein(s) >gi|296494649|gb|ADTN01000089.1| GENE 1 107 - 307 186 66 aa, chain - ## HITS:1 COG:no KEGG:SSON_3186 NR:ns ## KEGG: SSON_3186 # Name: glgS # Def: glycogen synthesis protein GlgS # Organism: S.sonnei # Pathway: not_defined # 1 66 1 66 66 126 100.0 2e-28 MDHSLNSLNNFDFLARSFARMHAEGRPVDILAVTGNMDEEHRTWFCARYAWYCQQMMQAR ELELEH >gi|296494649|gb|ADTN01000089.1| GENE 2 576 - 1205 177 209 aa, chain + ## HITS:1 COG:no KEGG:B21_02870 NR:ns ## KEGG: B21_02870 # Name: yqiJ # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 209 1 209 209 409 100.0 1e-113 MILFADYNTPYLFAISFVLLIGLLEIFALICGHMLSGALDAHLDHYDSITTGHISQALHY LNIGRLPALVVLCLLAGFFGLIGILLQHACIMVWQSPLSNLFVVPVSLLFTIIAVHYTGK IVAPWIPRDHSSAITEEEYIGSMALITGHQATSGNPCEGKLTDQFGQIHYLLLEPEEGKI FTKGDKVLIICRLSATRYLAENNPWPQIL >gi|296494649|gb|ADTN01000089.1| GENE 3 1232 - 2893 1767 553 aa, chain + ## HITS:1 COG:yqiK KEGG:ns NR:ns ## COG: yqiK COG2268 # Protein_GI_number: 16130947 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 553 1 553 553 820 100.0 0 MDDIVNSVPSWMFTAIIAVCILFIIGIIFARLYRRASAEQAFVRTGLGGQKVVMSGGAIV MPIFHEIIPINMNTLKLEVSRSTIDSLITKDRMRVDVVVAFFVRVKPSVEGIATAAQTLG QRTLSPEDLRMLVEDKFVDALRATAAQMTMHELQDTRENFVQGVQNTVAEDLSKNGLELE SVSLTNFNQTSKEHFNPNNAFDAEGLTKLTQETERRRRERNEVEQDVEVAVREKNRDALS RKLEIEQQEAFMTLEQEQQVKTRTAEQNARIAAFEAERRREAEQTRILAERQIQETEIDR EQAVRSRKVEAEREVRIKEIEQQQVTEIANQTKSIAIAAKSEQQSQAEARANLALAEAVS AQQNVETTRQTAEADRAKQVALIAAAQDAETKAVELTVRAKAEKEAAEMQAAAIVELAEA TRKKGLAEAEAQRALNDAINVLSDEQTSLKFKLALLQALPAVIEKSVEPMKSIDGIKIIQ VDGLNRGGAAGDANTGNVGGGNLAEQALSAALSYRTQAPLIDSLLNEIGVSGGSLAALTS PLTSTTPVEEKAE >gi|296494649|gb|ADTN01000089.1| GENE 4 3688 - 5121 1780 477 aa, chain - ## HITS:1 COG:rfaE KEGG:ns NR:ns ## COG: rfaE COG2870 # Protein_GI_number: 16130948 # Func_class: M Cell wall/membrane/envelope biogenesis # 
Function: ADP-heptose synthase, bifunctional sugar kinase/adenylyltransferase # Organism: Escherichia coli K12 # 1 477 1 477 477 916 100.0 0 MKVTLPEFERAGVMVVGDVMLDRYWYGPTSRISPEAPVPVVKVNTIEERPGGAANVAMNI ASLGANARLVGLTGIDDAARALSKSLADVNVKCDFVSVPTHPTITKLRVLSRNQQLIRLD FEEGFEGVDPQPLHERINQALSSIGALVLSDYAKGALASVQQMIQLARKAGVPVLIDPKG TDFERYRGATLLTPNLSEFEAVVGKCKTEEEIVERGMKLIADYELSALLVTRSEQGMSLL QPGKAPLHMPTQAQEVYDVTGAGDTVIGVLAATLAAGNSLEEACFFANAAAGVVVGKLGT STVSPIELENAVRGRADTGFGVMTEEELKLAVAAARKRGEKVVMTNGVFDILHAGHVSYL ANARKLGDRLIVAVNSDASTKRLKGDSRPVNPLEQRMIVLGALEAVDWVVSFEEDTPQRL IAGILPDLLVKGGDYKPEEIAGSKEVWANGGEVLVLNFEDGCSTTNIIKKIQQDKKG >gi|296494649|gb|ADTN01000089.1| GENE 5 5169 - 8009 3061 946 aa, chain - ## HITS:1 COG:glnE KEGG:ns NR:ns ## COG: glnE COG1391 # Protein_GI_number: 16130949 # Func_class: O Posttranslational modification, protein turnover, chaperones; T Signal transduction mechanisms # Function: Glutamine synthetase adenylyltransferase # Organism: Escherichia coli K12 # 1 946 1 946 946 1822 100.0 0 MKPLSSPLQQYWQTVVERLPEPLAEESLSAQAKSVLTFSDFVQDSVIAHPEWLTELESQP PQADEWQHYAAWLQEALCNVSDEAGLMRELRLFRRRIMVRIAWAQTLALVTEESILQQLS YLAETLIVAARDWLYDACCREWGTPCNAQGEAQPLLILGMGKLGGGELNFSSDIDLIFAW PEHGCTQGGRRELDNAQFFTRMGQRLIKVLDQPTQDGFVYRVDMRLRPFGESGPLVLSFA ALEDYYQEQGRDWERYAMVKARIMGDSEGVYANELRAMLRPFVFRRYIDFSVIQSLRNMK GMIAREVRRRGLTDNIKLGAGGIREIEFIVQVFQLIRGGREPSLQSRSLLPTLSAIAELH LLSENDAEQLRVAYLFLRRLENLLQSINDEQTQTLPSDELNRARLAWAMDFADWPQLTGA LTAHMTNVRRVFNELIGDDESETQEESLSEQWRELWQDALQEDDTTPVLAHLSEDDRKQV LTLIADFRKELDKRTIGPRGRQVLDHLMPHLLSDVCAREDAAVTLSRITALLVGIVTRTT YLELLSEFPAALKHLISLCAASPMIASQLARYPLLLDELLDPNTLYQPTATDAYRDELRQ YLLRVPEDDEEQQLEALRQFKQAQLLRIAAADIAGTLPVMKVSDHLTWLAEAMIDAVVQQ AWVQMVARYGKPNHLNEREGRGFAVVGYGKLGGWELGYSSDLDLIFLHDCPMDAMTDGER EIDGRQFYLRLAQRIMHLFSTRTSSGILYEVDARLRPSGAAGMLVTSAEAFADYQKNEAW TWEHQALVRARVVYGDPQLTAHFDAVRREIMTLPREGKTLQTEVREMREKMRAHLGNKHR DRFDIKADEGGITDIEFITQYLVLRYAHEKPKLTRWSDNVRILELLAQNDIMEEQEAMAL TRAYTTLRDELHHLALQELPGHVSEDCFTAERELVRASWQKWLVEE 
>gi|296494649|gb|ADTN01000089.1| GENE 6 8032 - 9333 1512 433 aa, chain - ## HITS:1 COG:ygiF KEGG:ns NR:ns ## COG: ygiF COG3025 # Protein_GI_number: 16130950 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 433 1 433 433 843 100.0 0 MAQEIELKFIVNHSAVEALRDHLNTLGGEHHDPVQLLNIYYETPDNWLRGHDMGLRIRGE NGRYEMTMKVAGRVTGGLHQRPEYNVALSEPTLDLAQLPTEVWPNGELPADLASRVQPLF STDFYREKWLVAVDGSQIEIALDQGEVKAGEFAEPICELELELLSGDTRAVLKLANQLVS QTGLRQGSLSKAARGYHLAQGNPAREIKPTTILHVAAKADVEQGLEAALELALAQWQYHE ELWVRGNDAAKEQVLAAISLVRHTLMLFGGIVPRKASTHLRDLLTQCEATIASAVSAVTA VYSTETAMAKLALTEWLVSKAWQPFLDAKAQGKISDSFKRFADIHLSRHAAELKSVFCQP LGDRYRDQLPRLTRDIDSILLLAGYYDPVVAQAWLENWQGLHHAIATGQRIEIEHFRNEA NNQEPFWLHSGKR >gi|296494649|gb|ADTN01000089.1| GENE 7 9575 - 10195 703 206 aa, chain + ## HITS:1 COG:ECs3938 KEGG:ns NR:ns ## COG: ECs3938 COG3103 # Protein_GI_number: 15833192 # Func_class: T Signal transduction mechanisms # Function: SH3 domain protein # Organism: Escherichia coli O157:H7 # 1 206 1 206 206 348 100.0 4e-96 MPKLRLIGLTLLALSATAVSHAEETRYVSDELNTWVRSGPGDHYRLVGTVNAGEEVTLLQ TDANTNYAQVKDSSGRTAWIPLKQLSTEPSLRSRVPDLENQVKTLTDKLTNIDNTWNQRT AEMQQKVAQSDSVINGLKEENQKLKNELIVAQKKVDAASVQLDDKQRTIIMQWFMYGGGV LGLGLLLGLVLPHLIPSRKRKDRWMN >gi|296494649|gb|ADTN01000089.1| GENE 8 10259 - 11497 1181 412 aa, chain + ## HITS:1 COG:cca KEGG:ns NR:ns ## COG: cca COG0617 # Protein_GI_number: 16130952 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Escherichia coli K12 # 1 412 1 412 412 837 100.0 0 MKIYLVGGAVRDALLGLPVKDRDWVVVGSTPQEMLDAGYQQVGRDFPVFLHPQTHEEYAL ARTERKSGSGYTGFTCYAAPDVTLEDDLKRRDLTINALAQDDNGEIIDPYNGLGDLQNRL LRHVSPAFGEDPLRVLRVARFAARYAHLGFRIADETLALMREMTHAGELEHLTPERVWKE TESALTTRNPQVFFQVLRDCGALRVLFPEIDALFGVPAPAKWHPEIDTGIHTLMTLSMAA MLSPQVDVRFATLCHDLGKGLTPPELWPRHHGHGPAGVKLVEQLCQRLRVPNEIRDLARL VAEFHDLIHTFPMLNPKTIVKLFDSIDAWRKPQRVEQLALTSEADVRGRTGFESADYPQG 
RWLREAWEVAQSVPTKAVVEAGFKGVEIREELTRRRIAAVASWKEQRCPKPE >gi|296494649|gb|ADTN01000089.1| GENE 9 11678 - 12499 1023 273 aa, chain - ## HITS:1 COG:ECs3940 KEGG:ns NR:ns ## COG: ECs3940 COG1968 # Protein_GI_number: 15833194 # Func_class: V Defense mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: Escherichia coli O157:H7 # 1 273 1 273 273 482 100.0 1e-136 MSDMHSLLIAAILGVVEGLTEFLPVSSTGHMIIVGHLLGFEGDTAKTFEVVIQLGSILAV VVMFWRRLFGLIGIHFGRPLQHEGESKGRLTLIHILLGMIPAVVLGLLFHDTIKSLFNPI NVMYALVVGGLLLIAAECLKPKEPRAPGLDDMTYRQAFMIGCFQCLALWPGFSRSGATIS GGMLMGVSRYAASEFSFLLAVPMMMGATALDLYKSWGFLTSGDIPMFAVGFITAFVVALI AIKTFLQLIKRISFIPFAIYRFIVAAAVYVVFF >gi|296494649|gb|ADTN01000089.1| GENE 10 12589 - 12957 432 122 aa, chain - ## HITS:1 COG:folB KEGG:ns NR:ns ## COG: folB COG1539 # Protein_GI_number: 16130954 # Func_class: H Coenzyme transport and metabolism # Function: Dihydroneopterin aldolase # Organism: Escherichia coli K12 # 1 122 2 123 123 222 100.0 1e-58 MDIVFIEQLSVITTIGVYDWEQTIEQKLVFDIEMAWDNRKAAKSDDVADCLSYADIAETV VSHVEGARFALVERVAEEVAELLLARFNSPWVRIKLSKPGAVARAANVGVIIERGNNLKE NN >gi|296494649|gb|ADTN01000089.1| GENE 11 13062 - 13679 559 205 aa, chain + ## HITS:1 COG:ECs3942 KEGG:ns NR:ns ## COG: ECs3942 COG0344 # Protein_GI_number: 15833196 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 205 1 205 205 388 100.0 1e-108 MSAIAPGMILIAYLCGSISSAILVCRLCGLPDPRTSGSGNPGATNVLRIGGKGAAVAVLI FDVLKGMLPVWGAYELGVSPFWLGLIAIAACLGHIWPVFFGFKGGKGVATAFGAIAPIGW DLTGVMAGTWLLTVLLSGYSSLGAIVSALIAPFYVWWFKPQFTFPVSMLSCLILLRHHDN IQRLWRRQETKIWTKFKRKREKDPE >gi|296494649|gb|ADTN01000089.1| GENE 12 13692 - 14624 872 310 aa, chain - ## HITS:1 COG:ygiP KEGG:ns NR:ns ## COG: ygiP COG0583 # Protein_GI_number: 16130956 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 310 1 310 310 640 100.0 0 MLNSWPLAKDLQVLVEIVHSGSFSAAAATLGQTPAFVTKRIQILENTLATTLLNRSARGV 
ALTESGQRCYEHALEILTQYQRLVDDVTQIKTRPEGMIRIGCSFGFGRSHIAPAITELMR NYPELQVHFELFDRQIDLVQDNIDLDIRINDEIPDYYIAHLLTKNKRILCAAPEYLQKYP QPQSLQELSRHDCLVTKERDMTHGIWELGNGQEKKSVKVSGHLSSNSGEIVLQWALEGKG IMLRSEWDVLPFLESGKLVQVLPEYAQSANIWAVYREPLYRSMKLRVCVEFLAAWCQQRL GKPDEGYQVM >gi|296494649|gb|ADTN01000089.1| GENE 13 14834 - 15742 262 302 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 24 300 21 297 508 105 27 1e-45 MSESNKQQAVNKLTEIVANFTAMISTRMPDDVVDKLKQLKDAETSSMGKIIYHTMFDNMQ KAIDLNRPACQDTGEIMFFVKVGSRFPLLGELQSILKQAVEEATVKAPLRHNAVEIFDEV NTGKNTGSGVPWVTWDIIPDNDDAEIEVYMAGGGCTLPGRSKVLMPSEGYEGVVKFVFEN ISTLAVNACPPVLVGVGIATSVETAAVLSRKAILRPIGSRHPNPKAAELELRLEEGLNRL GIGPQGLTGNSSVMGVHIESAARHPSTIGVAVSTGCWAHRRGTLLVHADLTFENLSHTRS AL >gi|296494649|gb|ADTN01000089.1| GENE 14 15715 - 16344 248 209 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 1 199 305 504 508 100 30 1e-45 SVSHPERVMKKILTTPIKAEDLQDIRVGDVIYLTGTLVTCRDVCHRRLIELKRPIPYDLN GKAIFHAGPIVRKNGDKWEMVSVGPTTSMRMESFEREFIEQTGVKLVVGKGGMGPLTEEG CQKFKALHVIFPAGCAVLAATQVEEIEEVHWTELGMPESLWVCRVKEFGPLIVSIDTHGN NLIAENKKLFAERRDPIVEEICEHVHYIK >gi|296494649|gb|ADTN01000089.1| GENE 15 16392 - 17855 1599 487 aa, chain + ## HITS:1 COG:ygjE KEGG:ns NR:ns ## COG: ygjE COG0471 # Protein_GI_number: 16130959 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Escherichia coli K12 # 1 487 1 487 487 801 99.0 0 MKPSTEWWRYLAPLAVIAIIALLPVPAGLENHTWLYFAVFTGVIVGLILEPVPGAVVAMV GISIIAILSPWLLFSPEQLAQPGFKFTAKSLSWAVSGFSNSVIWLIFAAFMFGTGYEKTG LGRRIALILVKKMGHRTLFLGYAVMFSELILAPVTPSNSARGAGIIYPIIRNLPPLYQSQ PNDSSSRSIGSYIMWMGIVADCVTSAIFLTAMAPNLLLIGLMKSASHATLSWGDWFLGML PLSILLVLLVPWLAYVLYPPVLKSGDQVPRWAETELQAMGPFCSREKRMLGLMVGALVLW IFGGDYIDAAMVGYSVVALMLLLRIISWDDIVSNKAAWNVFFWLASLITLATGLNNTGFI SWFGKLLAGSLSGYSPTMVMVALIVVFYLLRYFFASATAYTSALAPMMIAAALAMPEIPL 
PVFCLMVGAAIGLGSILTPYATGPSPIYYGSGYLPTADYWRLGAIFGLIFLVLLVITGLL WMPVVLL >gi|296494649|gb|ADTN01000089.1| GENE 16 17898 - 18911 660 337 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227425790|ref|ZP_03908856.1| SSU ribosomal protein S18P alanine acetyltransferase [Atopobium parvulum DSM 20469] # 3 327 480 813 832 258 44 4e-68 MRVLGIETSCDETGIAIYDDEKGLLANQLYSQVKLHADYGGVVPELASRDHVRKTVPLIQ AALKESGLTAKDIDAVAYTAGPGLVGALLVGATVGRSLAFAWDVPAIPVHHMEGHLLAPM LEDNPPEFPFVALLVSGGHTQLISVTGIGQYELLGESIDDAAGEAFDKTAKLLGLDYPGG PLLSKMAAQGTAGRFVFPRPMTDRPGLDFSFSGLKTFAANTIRDNGTDDQTRADIARAFE DAVVDTLMIKCKRALDQTGFKRLVMAGGVSANRTLRAKLAEMMKKRRGEVFYARPEFCTD NGAMIAYAGMVRFKAGATADLGVSVRPRWPLAELPAA >gi|296494649|gb|ADTN01000089.1| GENE 17 19149 - 19364 357 71 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803607|ref|NP_289640.1| 30S ribosomal protein S21 [Escherichia coli O157:H7 EDL933] # 1 71 1 71 71 142 100 5e-33 MPVIKVRENEPFDVALRRFKRSCEKAGVLAEVRRREFYEKPTTERKRAKASAVKRHAKKL ARENARRTRLY >gi|296494649|gb|ADTN01000089.1| GENE 18 19475 - 21220 1389 581 aa, chain + ## HITS:1 COG:ECs3949 KEGG:ns NR:ns ## COG: ECs3949 COG0358 # Protein_GI_number: 15833203 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Escherichia coli O157:H7 # 1 581 1 581 581 1191 100.0 0 MAGRIPRVFINDLLARTDIVDLIDARVKLKKQGKNFHACCPFHNEKTPSFTVNGEKQFYH CFGCGAHGNAIDFLMNYDKLEFVETVEELAAMHNLEVPFEAGSGPSQIERHQRQTLYQLM DGLNTFYQQSLQQPVATSARQYLEKRGLSHEVIARFAIGFAPPGWDNVLKRFGGNPENRQ SLIDAGMLVTNDQGRSYDRFRERVMFPIRDKRGRVIGFGGRVLGNDTPKYLNSPETDIFH KGRQLYGLYEAQQDNAEPNRLLVVEGYMDVVALAQYGINYAVASLGTSTTADHIQLLFRA TNNVICCYDGDRAGRDAAWRALETALPYMTDGRQLRFMFLPDGEDPDTLVRKEGKEAFEA RMEQAMPLSAFLFNSLMPQVDLSTPDGRARLSTLALPLISQVPGETLRIYLRQELGNKLG ILDDSQLERLMPKAAESGVSRPVPQLKRTTMRILIGLLVQNPELATLVPPLENLDENKLP GLGLFRELVNTCLSQPGLTTGQLLEHYRGTNNAATLEKLSMWDDIADKNIAEQTFTDSLN HMFDSLLELRQEELIARERTHGLSNEERLELWTLNQELAKK >gi|296494649|gb|ADTN01000089.1| GENE 19 21415 - 23256 2455 613 aa, chain + ## HITS:1 COG:ECs3950 KEGG:ns NR:ns ## COG: 
ECs3950 COG0568 # Protein_GI_number: 15833204 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Escherichia coli O157:H7 # 1 613 1 613 613 1031 99.0 0 MEQNPQSQLKLLVTRGKEQGYLTYAEVNDHLPEDIVDSDQIEDIIQMINDMGIQVMEEAP DADDLMLAENTADEDAAEAAAQVLSSVESEIGRTTDPVRMYMREMGTVELLTREGEIDIA KRIEDGINQVQCSVAEYPEAITYLLEQYDRVEAEEARLSDLITGFVDPNAEEDLAPTATH VGSELSQEDLDDDEDEDEEDGDDDSADDDNSIDPELAREKFAELRAQYVVTRDTIKAKGR SHATAQEEILKLSEVFKQFRLVPKQFDYLVNSMRVMMDRVRTQERLIMKLCVEQCKMPKK NFITLFTGNETSDTWFNAAIAMNKPWSEKLHDVSEEVHRALQKLQQIEEETGLTIEQVKD INRRMSIGEAKARRAKKEMVEANLRLVISIAKKYTNRGLQFLDLIQEGNIGLMKAVDKFE YRRGYKFSTYATWWIRQAITRSIADQARTIRIPVHMIETINKLNRISRQMLQEMGREPTP EELAERMLMPEDKIRKVLKIAKEPISMETPIGDDEDSHLGDFIEDTTLELPLDSATTESL RAATHDVLAGLTAREAKVLRMRFGIDMNTDHTLEEVGKQFDVTRERIRQIEAKALRKLRH PSRSEVLRSFLDD >gi|296494649|gb|ADTN01000089.1| GENE 20 23335 - 23841 433 168 aa, chain - ## HITS:1 COG:ygjF KEGG:ns NR:ns ## COG: ygjF COG3663 # Protein_GI_number: 16130964 # Func_class: L Replication, recombination and repair # Function: G:T/U mismatch-specific DNA glycosylase # Organism: Escherichia coli K12 # 1 168 1 168 168 336 100.0 1e-92 MVEDILAPGLRVVFCGINPGLSSAGTGFPFAHPANRFWKVIYQAGFTDRQLKPQEAQHLL DYRCGVTKLVDRPTVQANEVSKQELHAGGRKLIEKIEDYQPQALAILGKQAYEQGFSQRG AQWGKQTLTIGSTQIWVLPNPSGLSRVSLEKLVEAYRELDQALVVRGR >gi|296494649|gb|ADTN01000089.1| GENE 21 24095 - 24859 745 254 aa, chain - ## HITS:1 COG:yqjH KEGG:ns NR:ns ## COG: yqjH COG2375 # Protein_GI_number: 16130965 # Func_class: P Inorganic ion transport and metabolism # Function: Siderophore-interacting protein # Organism: Escherichia coli K12 # 1 254 1 254 254 518 100.0 1e-147 MNNTPRYPQRVRNDLRFRELTVLRVERISAGFQRIVLGGEALDGFTSRGFDDHSKLFFPQ PDAHFVPPTVTEEGIVWPEGPRPPSRDYTPLYDELRHELAIDFFIHDGGVASGWAMQAQP GDKLTVAGPRGSLVVPEDYAYQLYVCDESGMPALRRRLETLSKLAVKPQVSALVSVRDNA CQDYLAHLDGFNIEWLAHDEQAVDARLAQMQIPADDYFIWITGEGKVVKNLSRRFEAEQY DPQRVRAAAYWHAK >gi|296494649|gb|ADTN01000089.1| GENE 22 25147 - 25770 764 207 aa, 
chain + ## HITS:1 COG:yqjI KEGG:ns NR:ns ## COG: yqjI COG1695 # Protein_GI_number: 16130966 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 207 1 207 207 351 100.0 5e-97 MSHHHEGCCKHEGQPRHEGCCKGEKSEHEHCGHGHQHEHGQCCGGRHGRGGGRRQRFFGH GELRLVILDILSRDDSHGYELIKAIENLTQGNYTPSPGVIYPTLDFLQEQSLITIREEEG GKKQIALTEQGAQWLEENREQVEMIEERIKARCVGAALRQNPQMKRALDNFKAVLDLRVN QSDISDAQIKKIIAVIDRAAFDITQLD >gi|296494649|gb|ADTN01000089.1| GENE 23 25923 - 27443 1557 506 aa, chain - ## HITS:1 COG:aer_2 KEGG:ns NR:ns ## COG: aer_2 COG0840 # Protein_GI_number: 16130967 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Escherichia coli K12 # 125 506 1 382 382 674 100.0 0 MSSHPYVTQQNTPLADDTTLMSTTDLQSYITHANDTFVQVSGYTLQELQGQPHNMVRHPD MPKAAFADMWFTLKKGEPWSGIVKNRRKNGDHYWVRANAVPMVREGKISGYMSIRTRATD EEIAAVEPLYKALNAGRTSKRIHKGLVVRKGWLGKLPSLPLRWRARGVMTLMFILLAAML WFVAAPVVTYILCALVVLLASACFEWQIVRPIENVAHQALKVATGERNSVEHLNRSDELG LTLRAVGQLGLMCRWLINDVSSQVSSVRNGSETLAKGTDELNEHTQQTVDNVQQTVATMN QMAASVKQNSATASAADKLSITASNAAVQGGEAMTTVIKTMDDIADSTQRIGTITSLIND IAFQTNILALNAAVEAARAGEQGKGFAVVAGEVRHLASRSANAANDIRKLIDASADKVQS GSQQVHAAGRTMEDIVAQVKNVTQLIAQISHSTLEQADGLSSLTRAVDELNLITQKNAEL VEESAQVSAMVKHRASRLEDAVTVLH >gi|296494649|gb|ADTN01000089.1| GENE 24 27951 - 29240 1346 429 aa, chain + ## HITS:1 COG:ygjG KEGG:ns NR:ns ## COG: ygjG COG4992 # Protein_GI_number: 16130968 # Func_class: E Amino acid transport and metabolism # Function: Ornithine/acetylornithine aminotransferase # Organism: Escherichia coli K12 # 1 429 68 496 496 863 100.0 0 MKALNREVIEYFKEHVNPGFLEYRKSVTAGGDYGAVEWQAGSLNTLVDTQGQEFIDCLGG FGIFNVGHRNPVVVSAVQNQLAKQPLHSQELLDPLRAMLAKTLAALTPGKLKYSFFCNSG TESVEAALKLAKAYQSPRGKFTFIATSGAFHGKSLGALSATAKSTFRKPFMPLLPGFRHV PFGNIEAMRTALNECKKTGDDVAAVILEPIQGEGGVILPPPGYLTAVRKLCDEFGALMIL DEVQTGMGRTGKMFACEHENVQPDILCLAKALGGGVMPIGATIATEEVFSVLFDNPFLHT TTFGGNPLACAAALATINVLLEQNLPAQAEQKGDMLLDGFRQLAREYPDLVQEARGKGML 
MAIEFVDNEIGYNFASEMFRQRVLVAGTLNNAKTIRIEPPLTLTIEQCELVIKAARKALA AMRVSVEEA >gi|296494649|gb|ADTN01000089.1| GENE 25 29282 - 29614 510 110 aa, chain - ## HITS:1 COG:ygjH KEGG:ns NR:ns ## COG: ygjH COG0073 # Protein_GI_number: 16130969 # Func_class: R General function prediction only # Function: EMAP domain # Organism: Escherichia coli K12 # 1 110 1 110 110 194 100.0 3e-50 METVAYADFARLEMRVGKIVEVKRHENADKLYIVQVDVGQKTLQTVTSLVPYYSEEELMG KTVVVLCNLQKAKMRGETSECMLLCAETDDGSESVLLTPERMMPAGVRVV >gi|296494649|gb|ADTN01000089.1| GENE 26 29833 - 30816 1021 327 aa, chain + ## HITS:1 COG:ebgR KEGG:ns NR:ns ## COG: ebgR COG1609 # Protein_GI_number: 16130970 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 327 1 327 327 659 100.0 0 MATLKDIAIEAGVSLATVSRVLNDDPTLNVKEETKHRILEIAEKLEYKTSSARKLQTGAV NQHHILAIYSYQQELEINDPYYLAIRHGIETQCEKLGIELTNCYEHSGLPDIKNVTGILI VGKPTPALRAAASALTDNICFIDFHEPGSGYDAVDIDLARISKEIIDFYINQGVNRIGFI GGEDEPGKADIREVAFAEYGRLKQVVREEDIWRGGFSSSSGYELAKQMLAREDYPKALFV ASDSIAIGVLRAIHERGLNIPQDISLISVNDIPTARFTFPPLSTVRIHSEMMGSQGVNLV YEKARDGRALPLLVFVPSKLKLRGTTR >gi|296494649|gb|ADTN01000089.1| GENE 27 31000 - 34092 3159 1030 aa, chain + ## HITS:1 COG:ebgA KEGG:ns NR:ns ## COG: ebgA COG3250 # Protein_GI_number: 16130971 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Escherichia coli K12 # 1 1030 13 1042 1042 2182 99.0 0 MNRWENIQLTHENRLAPRAYFFSYDSVAQARTFARETSSLFLPLSGQWNFHFFDHPLQVP EAFTSELMADWGHITVPAMWQMEGHGKLQYTDEGFPFPIDVPFVPSDNPTGAYQRIFTLS DGWQGKQTLIKFDGVETYFEVYVNGQYVGFSKGSRLTAEFDISAMVKTGDNLLCVRVMQW ADSTYVEDQDMWWSAGIFRDVYLVGKHLTHINDFTVRTDFDEAYCDATLSCEVVLENLAA SPVVTTLEYTLFDGERVVHSSAIDHLAIEKLTSASFAFTVEQPQQWSAESPYLYHLVMTL KDANGNVLEVVPQRVGFRDIKVRDGLFWINNRYVMLHGVNRHDNDHRKGRAVGMDRVEKD LQLMKQHNINSVRTAHYPNDPRFYELCDIYGLFVMAETDVESHGFANVGDISRITDDPQW EKVYVERIVRHIHAQKNHPSIIIWSLGNESGYGCNIRAMYHAAKALDDTRLVHYEEDRDA EVVDIISTMYTRVPLMNEFGEYPHPKPRIICEYAHAMGNGPGGLTEYQNVFYKHDCIQGH 
YVWEWCDHGIQAQDDHGNVWYKFGGDYGDYPNNYNFCLDGLIYSDQTPGPGLKEYKQVIA PVKIHARDLTRGELKVENKLWFTTLDDYTLHAEVRAEGETLATQQIKLRDVAPNSEAPLQ ITLPQLDAREAFLNITVTKDSRTRYSEAGHPIATYQFPLKENTAQPVPFAPNNARPLTLE DDRLSCTVRGYNFAITFSKMSGKPTSWQVNGESLLTREPKINFFKPMIDNHKQEYEGLWQ PNHLQIMQEHLRDFAVEQSDGEVLIISRTVIAPPVFDFGMRCTYIWRIAADGQVNVALSG ERYGDYPHIIPCIGFTMGINGEYDQVAYYGRGPGENYADSQQANIIDIWRSTVDAMFENY PFPQNNGNRQHVRWTALTNRHGNGLLVVPQRPINFSAWHYTQENIHAAQHCNELQRSDDI TLNLDHQLLGLGSNSWGSEVLDSWRVWFRDFSYGFTLLPVSGGEATAQSLASYEFGAGFF STNLHSESKQ >gi|296494649|gb|ADTN01000089.1| GENE 28 34089 - 34538 380 149 aa, chain + ## HITS:1 COG:ebgC KEGG:ns NR:ns ## COG: ebgC COG2731 # Protein_GI_number: 16130972 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Escherichia coli K12 # 1 149 1 149 149 287 100.0 4e-78 MRIIDNLEQFRQIYASGKKWQRCVEAIENIDNIQPGVAHSIGDSLTYRVETDSATDALFT GHRRYFEVHYYLQGQQKIEYAPKETLQVVEYYRDETDREYLKGCGETVEVHEGQIVICDI HEAYRFICNNAVKKVVLKVTIEDGYFHNK >gi|296494649|gb|ADTN01000089.1| GENE 29 34601 - 34879 359 92 aa, chain + ## HITS:1 COG:ygjI KEGG:ns NR:ns ## COG: ygjI COG0531 # Protein_GI_number: 16130973 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 92 1 92 477 169 100.0 1e-42 MSDTKRNTIGKFGLLSLTFAAVYSFNNVINNNIELGLASAPMFFLATIFYFIPFCLIIAE FVSLNKNSEAGVYAWVKSSLGGRWAFITAYTY >gi|296494649|gb|ADTN01000089.1| GENE 30 34964 - 36034 1105 356 aa, chain + ## HITS:1 COG:ECs3960 KEGG:ns NR:ns ## COG: ECs3960 COG0531 # Protein_GI_number: 15833214 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli O157:H7 # 1 356 122 477 477 593 99.0 1e-169 MTPVATTIISMVLFAFSTWVSTNGAKMLGPITSVTSTLMLLLTLSYILLAGTALVGGVQP ADAITVDAMIPNFNWAFLGVTTWIFMAAGGAESVAVYVNDVKGGSKSFVKVIILAGIFIG VLYSVSSVLINVFVSSKELKFTGGSVQVFHGMAAYFGLPEALMNRFVGLVSFTAMFGSLL MWTATPVKIFFSEIPEGIFGKKTVELNENGVPARAAWIQFLIVIPLMIIPMLGSNTVQDL 
MNTIINMTAAASMLPPLFIMLAYLNLRAKLDHLPRDFRMGSRRTGIIVVSMLIAIFAVGF VASTFPTGANILTIIFYNVGGIVIFLGFAWWKYSKYIKGLTAEERHIEATPASNVD >gi|296494649|gb|ADTN01000089.1| GENE 31 36168 - 37238 1075 356 aa, chain + ## HITS:1 COG:no KEGG:B21_02898 NR:ns ## KEGG: B21_02898 # Name: ygjJ # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 356 1 356 356 698 100.0 0 MKLITAPCRALLALPFCYAFSAAGEEARPAEHDDTKTPAITSTSSPSFRFYGELGVGGYM DLEGENKHKYSDGTYIEGGLEMKYGSWFGLIYGEGWTVQADHDGNAWVPDHSWGGFEGGI NRFYGGYRTNDGTEIMLSLRQDSSLDDLQWWGDFTPDLGYVIPNTRDIMTALKVQNLSGN FRYSVTATPAGHHDESKAWLHFGKYDRYDDKYTYPAMMNGYIQYDLAEGITWMNGLEITD GTGQLYLTGLLTPNFAARAWHHTGRADGLDVPGSESGMMVSAMYEALKGVYLSTAYTYAK HRPDHADDETTSFMQFGIWYEYGGGRFATAFDSRFYMKNASHDPSDQIFLMQYFYW >gi|296494649|gb|ADTN01000089.1| GENE 32 37255 - 39606 2627 783 aa, chain + ## HITS:1 COG:no KEGG:JW3051 NR:ns ## KEGG: JW3051 # Name: ygjK # Def: predicted glycosyl hydrolase # Organism: E.coli_J # Pathway: not_defined # 1 783 1 783 783 1590 99.0 0 MKIKTILTPVTCALLISFSAHAANADNYKNVINRTGAPQYMKDYDYDDHQRFNPFFDLGA WHGHLLPDGPNTMGGFPGVALLTEEYINFMASNFDRLTVWQDGKKVDFTLEAYSIPGALV QKLTAKDVQVEMTLRFATPRTSLLETKITSNKPLDLVWDGELLEKLEAKEGKPLSDKTIA GEYPDYQRKISATRDGLKVTFGKVRATWDLLTSGESEYQVHKSLPVQTEINGNRFTSKAH INGSTTLYTTYSHLLTAQEVSKEQMQIRDILARPAFYLTASQQRWEEYLKKGLTNPDATP EQTRVAVKAIETLNGNWRSPGGAVKFNTVTPSVTGRWFSGNQTWPWDTWKQAFAMAHFNP DIAKENIRAVFSWQIQPGDSVRPQDVGFVPDLIAWNLSPERGGDGGNWNERNTKPSLAAW SVMEVYNVTQDKTWVAEMYPKLVAYHDWWLRNRDHNGNGVPEYGATRDKAHNTESGEMLF TVKKGDKEETQSGLNNYARVVEKGQYDSLEIPAQVAASWESGRDDAAVFGFIDKEQLDKY VANGGKRSDWTVKFAENRSQDGTLLGYSLLQESVDQASYMYSDNHYLAEMATILGKPEEA KRYHQLAQQLADYINTCMFDPTTQFYYDVRIEDKPLANGCAGKPIVERGKGPEGWSPLFN GAATQANADAVVKVMLDPKEFNTFVPLGTAALTNPAFGADIYWRGRVWVDQFWFGLKGME RYGYRDDALKLADTFFRHAKGLTADGPIQENYNPLTGAQQGAPNFSWSAAHLYMLYNDFF RKQ >gi|296494649|gb|ADTN01000089.1| GENE 33 40032 - 42050 1833 672 aa, chain + ## HITS:1 COG:fadH_1 KEGG:ns NR:ns ## COG: fadH_1 COG1902 # Protein_GI_number: 16130976 # Func_class: C Energy production and conversion # 
Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Escherichia coli K12 # 1 354 1 354 354 729 100.0 0 MSYPSLFAPLDLGFTTLKNRVLMGSMHTGLEEYPDGAERLAAFYAERARHGVALIVSGGI APDLTGVGMEGGAMLNDASQIPHHRTITEAVHQEGGKIALQILHTGRYSYQPHLVAPSAL QAPINRFVPHELSHEEILQLIDNFARCAQLAREAGYDGVEVMGSEGYLINEFLTLRTNQR SDQWGGDYRNRMRFAVEVVRAVRERVGNDFIIIYRLSMLDLVEDGGTFAETVELAQAIEA AGATIINTGIGWHEARIPTIATPVPRGAFSWVTRKLKGHVSLPLVTTNRINDPQVADDIL SRGDADMVSMARPFLADAELLSKAQSGRADEINTCIGCNQACLDQIFVGKVTSCLVNPRA CHETKMPILPAVQKKNLAVVGAGPAGLAFAINAAARGHQVTLFDAHSEIGGQFNIAKQIP GKEEFYETLRYYRRMIEVTGVTLKLNHTVTADQLQAFDETILASGIVPRTPPIDGIDHPK VLSYLDVLRDKAPVGNKVAIIGCGGIGFDTAMYLSQPGESTSQNIAGFCNEWGIDSSLQQ AGGLSPQGMQIPRSPRQIVMLQRKASKPGQGLGKTTGWIHRTTLLSRGVKMIPGVSYQKI DDDGLHVVINGETQVLAVDNVVICAGQEPNRALAQPLIDSGKTVHLIGGCDVAMELDARR AIAQGTRLALEI >gi|296494649|gb|ADTN01000089.1| GENE 34 42095 - 42511 286 138 aa, chain - ## HITS:1 COG:ygjM KEGG:ns NR:ns ## COG: ygjM COG5499 # Protein_GI_number: 16130977 # Func_class: K Transcription # Function: Predicted transcription regulator containing HTH domain # Organism: Escherichia coli K12 # 1 138 1 138 138 259 100.0 1e-69 MIAIADILQAGEKLTAVAPFLAGIQNEEQYTQALELVDHLLLNDPENPLLDLVCAKITAW EESAPEFAEFNAMAQAMPGGIAVIRTLMDQYGLTLSDLPEIGSKSMVSRVLSGKRKLTLE HAKKLATRFGISPALFID >gi|296494649|gb|ADTN01000089.1| GENE 35 42508 - 42822 200 104 aa, chain - ## HITS:1 COG:ECs3965 KEGG:ns NR:ns ## COG: ECs3965 COG4680 # Protein_GI_number: 15833219 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 104 1 104 104 192 100.0 2e-49 MHLITQKALKDAAEKYPQHKTELVALGNTIAKGYFKKPESLKAVFPSLDNFKYLDKHYVF NVGGNELRVVAMVFFESQKCYIREVMTHKEYDFFTAVHRTKGKK >gi|296494649|gb|ADTN01000089.1| GENE 36 43106 - 44242 451 378 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225082609|ref|YP_002654106.1| ribosomal protein L11 methyltransferase, putative [marine gamma proteobacterium HTCC2148] # 13 371 18 368 371 178 33 6e-44 
MSHLDNGFRSLTLQRFPATDDVNPLQAWEAADEYLLQQLDDTEIRGPVLILNDAFGALSC ALAEHKPYSIGDSYISELATRENLRLNGIDESSVKFLDSTADYPQQPGVVLIKVPKTLAL LEQQLRALRKVVTSDTRIIAGAKARDIHTSTLELFEKVLGPTTTTLAWKKARLINCTFNE PQLADAPQTVSWKLEGTDWTIHNHANVFSRTGLDIGARFFMQHLPENLEGEIVDLGCGNG VIGLTLLDKNPQAKVVFVDESPMAVASSRLNVETNMPEALDRCEFMINNALSGVEPFRFN AVLCNPPFHQQHALTDNVAWEMFHHARRCLKINGELYIVANRHLDYFHKLKKIFGNCTTI ATNNKFVVLKAVKLGRRR >gi|296494649|gb|ADTN01000089.1| GENE 37 44327 - 44830 436 167 aa, chain + ## HITS:1 COG:ygjP KEGG:ns NR:ns ## COG: ygjP COG1451 # Protein_GI_number: 16130980 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Escherichia coli K12 # 1 167 13 179 179 331 99.0 4e-91 MSNLTYLQDYPEQLLSQVRTLINEQRLGDVLAKRYPGTHDYATDKALWQYTQDLKNQFLR NAPPINKVMYDNKIHVLKNALGLHTAVSRVQGGKLKAKVEIRVATVFRNAPEPFLRMIVV HELAHLKEKEHNKAFYQLCCHMEPQYHQLEFDTRLWLTQLSLGQNKI >gi|296494649|gb|ADTN01000089.1| GENE 38 44911 - 45603 503 230 aa, chain + ## HITS:1 COG:ygjQ KEGG:ns NR:ns ## COG: ygjQ COG2949 # Protein_GI_number: 16130981 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Escherichia coli K12 # 1 230 1 230 230 470 100.0 1e-133 MLRAFARLLLRICFSRRTLKIACLLLLVAGATILIADRVMVNASKQLTWSDVNAVPARNV GLLLGARPGNRYFTRRIDTAAALYHAGKVKWLLVSGDNGRKNYDEASGMQQALIAKGVPA KVIFCDYAGFSTLDSVVRAKKVFGENHITIISQEFHNQRAIWLAKQYGIDAIGFNAPDLN MKHGFYTQLREKLARVSAVIDAKILHRQPKYLGPSVMIGPFSEHGCPAQK >gi|296494649|gb|ADTN01000089.1| GENE 39 45664 - 46668 749 334 aa, chain + ## HITS:1 COG:ygjR KEGG:ns NR:ns ## COG: ygjR COG0673 # Protein_GI_number: 16130982 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Escherichia coli K12 # 1 334 1 334 334 664 100.0 0 MEYPRVMIRFAVIGTNWITRQFVEAAHESGKYKLTAVYSRSLEQAQHFANDFSVEHLFTS LEAMAESDAIDAVYIASPNSLHFSQTQLFLSHKINVICEKPLASNLAEVDAAIACARENQ VVLFEAFKTACLPNFHLLRQALPKVGKLRKVFFNYCQYSSRYQRYLDGENPNTFNPAFSN GSIMDIGFYCLASAVALFGEPKSVQATASLLASGVDAQGVVVMDYGDFSVTLQHSKVSDS 
VLASEIQGEAGSLVIEKLSECQKVCFVPRGSQMQDLTQPQHINTMLYEAELFATLVDEHL VDHPGLAVSRITAKLLTEIRRQTGVIFPADSVKL >gi|296494649|gb|ADTN01000089.1| GENE 40 46951 - 47916 1269 321 aa, chain + ## HITS:1 COG:ECs3970 KEGG:ns NR:ns ## COG: ECs3970 COG0861 # Protein_GI_number: 15833224 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane protein TerC, possibly involved in tellurium resistance # Organism: Escherichia coli O157:H7 # 1 321 1 321 321 550 99.0 1e-156 MNTVGTPLLWGGFAVVVAIMLAIDLLLQGRRGAHAMTMKQAAAWSLVWVTLSLLFNAAFW WYLVQTEGRAVADPQALAFLTGYLIEKSLAVDNVFVWLMLFSYFSVPAALQRRVLVYGVL GAIVLRTIMIFTGSWLISQFDWILYIFGAFLLFTGVKMALAHEDESGIGDKPLVRWLRGH LRMTDTIDNEHFFVRKNGLLYATPLMLVLILVELSDVIFAVDSIPAIFAVTTDPFIVLTS NLFAILGLRAMYFLLAGVAERFSMLKYGLAVILVFIGIKMLIVDFYHIPIAVSLGVVFGI LVMTFIINAWVNYRHDKQRGG Prediction of potential genes in microbial genomes Time: Sun May 15 23:25:52 2011 Seq name: gi|296494648|gb|ADTN01000090.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont238.2, whole genome shotgun sequence Length of sequence - 50438 bp Number of predicted genes - 52, with homology - 52 Number of transcription units - 24, operones - 13 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 93 - 152 6.7 1 1 Tu 1 . + CDS 276 - 1520 1393 ## COG3633 Na+/serine symporter + Term 1529 - 1565 7.1 2 2 Tu 1 . - CDS 1525 - 2076 249 ## ECUMN_3574 conserved hypothetical protein; putative inner membrane protein - Term 2096 - 2132 6.4 3 3 Op 1 3/1.000 - CDS 2159 - 3646 1674 ## COG2721 Altronate dehydratase 4 3 Op 2 . - CDS 3661 - 5073 1788 ## COG1904 Glucuronate isomerase - Prom 5180 - 5239 3.3 + Prom 5259 - 5318 3.0 5 4 Tu 1 5/0.200 + CDS 5556 - 6854 1823 ## COG0477 Permeases of the major facilitator superfamily + Term 6886 - 6916 3.0 + Prom 6880 - 6939 3.3 6 5 Tu 1 . + CDS 6969 - 7760 889 ## COG2186 Transcriptional regulators + Term 7772 - 7807 4.1 + Prom 7877 - 7936 6.0 7 6 Op 1 . + CDS 8105 - 8767 605 ## COG0586 Uncharacterized membrane-associated protein 8 6 Op 2 . 
+ CDS 8786 - 9154 242 ## B21_02916 hypothetical protein + Term 9218 - 9256 6.4 + Prom 9161 - 9220 4.1 9 7 Op 1 . + CDS 9286 - 9669 474 ## SSON_3256 hypothetical protein 10 7 Op 2 7/0.200 + CDS 9707 - 10012 455 ## COG4575 Uncharacterized conserved protein 11 7 Op 3 . + CDS 10015 - 10419 448 ## COG5393 Predicted membrane protein 12 7 Op 4 . + CDS 10409 - 10708 376 ## JW3071 conserved hypothetical protein + Term 10718 - 10748 3.4 + Prom 10724 - 10783 1.8 13 8 Op 1 3/1.000 + CDS 10804 - 11286 506 ## COG2259 Predicted membrane protein + Term 11314 - 11343 2.1 14 8 Op 2 4/0.600 + CDS 11356 - 12342 1082 ## COG0435 Predicted glutathione S-transferase + Term 12390 - 12428 4.1 + Prom 12536 - 12595 5.2 15 9 Tu 1 6/0.200 + CDS 12636 - 13001 344 ## COG3152 Predicted membrane protein + Term 13015 - 13049 3.5 + Prom 13035 - 13094 6.4 16 10 Tu 1 . + CDS 13243 - 13599 317 ## COG3152 Predicted membrane protein + Term 13610 - 13640 3.0 - Term 13596 - 13627 3.2 17 11 Tu 1 . - CDS 13650 - 14546 971 ## COG0583 Transcriptional regulator + Prom 14534 - 14593 1.7 18 12 Op 1 . + CDS 14651 - 15352 594 ## COG1741 Pirin-related protein 19 12 Op 2 . + CDS 15375 - 15539 228 ## EC55989_3525 hypothetical protein - Term 15620 - 15653 -0.5 20 13 Op 1 3/1.000 - CDS 15673 - 16983 1325 ## COG3681 Uncharacterized conserved protein 21 13 Op 2 10/0.200 - CDS 17011 - 18342 1392 ## COG0814 Amino acid permeases - Prom 18528 - 18587 4.8 - Term 18563 - 18592 2.1 22 14 Op 1 1/1.000 - CDS 18617 - 19981 1196 ## COG1760 L-serine deaminase 23 14 Op 2 1/1.000 - CDS 20053 - 20442 418 ## COG0251 Putative translation initiation inhibitor, yjgF family 24 14 Op 3 3/1.000 - CDS 20456 - 22750 2631 ## COG1882 Pyruvate-formate lyase 25 14 Op 4 3/1.000 - CDS 22784 - 23992 1075 ## COG0282 Acetate kinase 26 14 Op 5 4/0.600 - CDS 24018 - 25349 1383 ## COG0814 Amino acid permeases 27 14 Op 6 4/0.600 - CDS 25371 - 26360 993 ## COG1171 Threonine dehydratase - Prom 26391 - 26450 4.9 28 14 Op 7 . 
- CDS 26459 - 27397 751 ## COG0583 Transcriptional regulator - Prom 27541 - 27600 8.0 - Term 27948 - 27985 1.6 29 15 Tu 1 . - CDS 28012 - 28149 76 ## EcE24377A_3594 hypothetical protein + Prom 28091 - 28150 3.5 30 16 Op 1 . + CDS 28186 - 28725 88 ## EcE24377A_3595 hypothetical protein 31 16 Op 2 . + CDS 28747 - 29934 193 ## ECIAI1_3270 hypothetical protein 32 17 Op 1 1/1.000 - CDS 30731 - 31876 1116 ## COG1929 Glycerate kinase 33 17 Op 2 1/1.000 - CDS 31973 - 32863 1196 ## COG2084 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases 34 17 Op 3 6/0.200 - CDS 32893 - 33663 799 ## COG3836 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase 35 17 Op 4 . - CDS 33679 - 35013 1687 ## COG0477 Permeases of the major facilitator superfamily - Prom 35105 - 35164 6.5 + Prom 35128 - 35187 6.9 36 18 Tu 1 3/1.000 + CDS 35388 - 36959 1306 ## COG2721 Altronate dehydratase + Term 36980 - 37020 2.5 + Prom 37020 - 37079 7.6 37 19 Op 1 . + CDS 37108 - 37443 249 ## COG2002 Regulators of stationary/sporulation gene expression 38 19 Op 2 . + CDS 37443 - 37907 252 ## ECUMN_3612 hypothetical protein + Term 38043 - 38085 2.1 - Term 37836 - 37880 -0.9 39 20 Tu 1 . 
- CDS 37962 - 38771 734 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 38932 - 38991 6.0 + Prom 38877 - 38936 5.5 40 21 Op 1 1/1.000 + CDS 39020 - 40300 1295 ## COG4573 Predicted tagatose 6-phosphate kinase 41 21 Op 2 13/0.000 + CDS 40323 - 40796 548 ## COG3444 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB 42 21 Op 3 13/0.000 + CDS 40807 - 41586 1083 ## COG3715 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC 43 21 Op 4 4/0.600 + CDS 41576 - 42454 1053 ## COG3716 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID 44 21 Op 5 3/1.000 + CDS 42472 - 42906 575 ## COG2893 Phosphotransferase system, mannose/fructose-specific component IIA 45 21 Op 6 9/0.200 + CDS 42903 - 44036 1033 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase + Term 44055 - 44090 -0.0 + Prom 44239 - 44298 6.9 46 22 Op 1 3/1.000 + CDS 44387 - 45541 1004 ## COG2222 Predicted phosphosugar isomerases 47 22 Op 2 3/1.000 + CDS 45554 - 46414 813 ## COG0191 Fructose/tagatose bisphosphate aldolase + Prom 46486 - 46545 5.8 48 23 Op 1 13/0.000 + CDS 46581 - 47057 573 ## COG3444 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB 49 23 Op 2 13/0.000 + CDS 47096 - 47899 872 ## COG3715 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC 50 23 Op 3 2/1.000 + CDS 47889 - 48680 809 ## COG3716 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID + Term 48688 - 48724 4.7 51 23 Op 4 2/1.000 + CDS 48735 - 49436 712 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase + Term 49515 - 49566 7.3 + Prom 49716 - 49775 6.4 52 24 Tu 1 . 
+ CDS 49837 - 50421 540 ## COG3539 P pilus assembly protein, pilin FimA Predicted protein(s) >gi|296494648|gb|ADTN01000090.1| GENE 1 276 - 1520 1393 414 aa, chain + ## HITS:1 COG:ECs3971 KEGG:ns NR:ns ## COG: ECs3971 COG3633 # Protein_GI_number: 15833225 # Func_class: E Amino acid transport and metabolism # Function: Na+/serine symporter # Organism: Escherichia coli O157:H7 # 1 414 1 414 414 700 100.0 0 MTTQRSPGLFRRLAHGSLVKQILVGLVLGILLAWISKPAAEAVGLLGTLFVGALKAVAPI LVLMLVMASIANHQHGQKTNIRPILFLYLLGTFSAALAAVVFSFAFPSTLHLSSSAGDIS PPSGIVEVMRGLVMSMVSNPIDALLKGNYIGILVWAIGLGFALRHGNETTKNLVNDMSNA VTFMVKLVIRFAPIGIFGLVSSTLATTGFSTLWGYAQLLVVLVGCMLLVALVVNPLLVWW KIRRNPFPLVLLCLRESGVYAFFTRSSAANIPVNMALCEKLNLDRDTYSVSIPLGATINM AGAAITITVLTLAAVNTLGIPVDLPTALLLSVVASLCACGASGVAGGSLLLIPLACNMFG ISNDIAMQVVAVGFIIGVLQDSCETALNSSTDVLFTAAACQAEDDRLANSALRN >gi|296494648|gb|ADTN01000090.1| GENE 2 1525 - 2076 249 183 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_3574 NR:ns ## KEGG: ECUMN_3574 # Name: ygjV # Def: conserved hypothetical protein; putative inner membrane protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 183 1 183 183 350 100.0 1e-95 MTAYWLAQGVGVIAFLIGITTFFNRDERRFKKQLSVYSAVIGVHFFLLGTYPAGASAILN AIRTLITLRTRSLWVMAIFIVLTGGIGLAKFHHPVELLPVIGTIVSTWALFRCKGLTMRC VMWFSTCCWVIHNFWAGSIGGTMIEGSFLLMNGLNIIRFWRMQKRGIDPFKVEKTPSAVD ERG >gi|296494648|gb|ADTN01000090.1| GENE 3 2159 - 3646 1674 495 aa, chain - ## HITS:1 COG:uxaA KEGG:ns NR:ns ## COG: uxaA COG2721 # Protein_GI_number: 16130986 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: Escherichia coli K12 # 1 495 1 495 495 1023 100.0 0 MQYIKIHALDNVAVALADLAEGTEVSVDNQTVTLRQDVARGHKFALTDIAKGANVIKYGL PIGYALADIAAGVHVHAHNTRTNLSDLDQYRYQPDFQDLPAQAADREVQIYRRANGDVGV RNELWILPTVGCVNGIARQIQNRFLKETNNAEGTDGVFLFSHTYGCSQLGDDHINTRTML QNMVRHPNAGAVLVIGLGCENNQVAAFRETLGDIDPERVHFMICQQQDDEIEAGIEHLHQ LYNVMRNDKREPGKLSELKFGLECGGSDGLSGITANPMLGRFSDYVIANGGTTVLTEVPE MFGAEQLLMDHCRDEATFEKLVTMVNDFKQYFIAHDQPIYENPSPGNKAGGITTLEDKSL 
GCTQKAGSSVVVDVLRYGERLKTPGLNLLSAPGNDAVATSALAGAGCHMVLFSTGRGTPY GGFVPTVKIATNSELAAKKKHWIDFDAGQLIHGKAMPQLLEEFIDTIVEFANGKQTCNER NDFRELAIFKSGVTL >gi|296494648|gb|ADTN01000090.1| GENE 4 3661 - 5073 1788 470 aa, chain - ## HITS:1 COG:uxaC KEGG:ns NR:ns ## COG: uxaC COG1904 # Protein_GI_number: 16130987 # Func_class: G Carbohydrate transport and metabolism # Function: Glucuronate isomerase # Organism: Escherichia coli K12 # 1 470 1 470 470 993 100.0 0 MTPFMTEDFLLDTEFARRLYHDYAKDQPIFDYHCHLPPQQIAEDYRFKNLYDIWLKGDHY KWRAMRTNGVAERLCTGDASDREKFDAWAATVPHTIGNPLYHWTHLELRRPFGITGKLLS PSTADEIWNECNELLAQDNFSARGIMQQMNVKMVGTTDDPIDSLEHHAEIAKDGSFTIKV LPSWRPDKAFNIEQATFNDYMAKLGEVSDTDIRRFADLQTALTKRLDHFAAHGCKVSDHA LDVVMFAEANEAELDSILARRLAGETLSEHEVAQFKTAVLVFLGAEYARRGWVQQYHIGA LRNNNLRQFKLLGPDVGFDSINDRPMAEELSKLLSKQNEENLLPKTILYCLNPRDNEVLG TMIGNFQGEGMPGKMQFGSGWWFNDQKDGMERQMTQLAQLGLLSRFVGMLTDSRSFLSYT RHEYFRRILCQMIGRWVEAGEAPADINLLGEMVKNICFNNARDYFAIELN >gi|296494648|gb|ADTN01000090.1| GENE 5 5556 - 6854 1823 432 aa, chain + ## HITS:1 COG:ECs3975 KEGG:ns NR:ns ## COG: ECs3975 COG0477 # Protein_GI_number: 15833229 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 432 41 472 472 788 100.0 0 MRKIKGLRWYMIALVTLGTVLGYLTRNTVAAAAPTLMEELNISTQQYSYIIAAYSAAYTV MQPVAGYVLDVLGTKIGYAMFAVLWAVFCGATALAGSWGGLAVARGAVGAAEAAMIPAGL KASSEWFPAKERSIAVGYFNVGSSIGAMIAPPLVVWAIVMHSWQMAFIISGALSFIWAMA WLIFYKHPRDQKHLTDEERDYIINGQEAQHQVSTAKKMSVGQILRNRQFWGIALPRFLAE PAWGTFNAWIPLFMFKVYGFNLKEIAMFAWMPMLFADLGCILGGYLPPLFQRWFGVNLIV SRKMVVTLGAVLMIGPGMIGLFTNPYVAIMLLCIGGFAHQALSGALITLSSDVFGRNEVA TANGLTGMSAWLASTLFALVVGALADTIGFSPLFAVLAVFDLLGALVIWTVLQNKPAIEV AQETHNDPAPQH >gi|296494648|gb|ADTN01000090.1| GENE 6 6969 - 7760 889 263 aa, chain + ## HITS:1 COG:ECs3976 KEGG:ns NR:ns ## COG: ECs3976 COG2186 # Protein_GI_number: 15833230 # 
Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 263 1 263 263 511 100.0 1e-145 MPGAHMEITEPRRLYQQLAADLKERIEQGVYLVGDKLPAERFIADEKNVSRTVVREAIIM LEVEGYVEVRKGSGIHVVSNQPRHQQAADNNMEFANYGPFELLQARQLIESNIAEFAATQ VTKQDIMKLMAIQEQARGEQCFRDSEWDLQFHIQVALATQNSALAAIVEKMWTQRSHNPY WKKLHEHIDSRTVDNWCDDHDQILKALIRKDPHAAKLAMWQHLENTKIMLFNETSDDFEF NADRYLFAENPVVHLDTATSGSK >gi|296494648|gb|ADTN01000090.1| GENE 7 8105 - 8767 605 220 aa, chain + ## HITS:1 COG:STM3226 KEGG:ns NR:ns ## COG: STM3226 COG0586 # Protein_GI_number: 16766525 # Func_class: S Function unknown # Function: Uncharacterized membrane-associated protein # Organism: Salmonella typhimurium LT2 # 1 220 1 220 220 372 95.0 1e-103 MELLTQLLQALWAQDFETLANPSMIGMLYFVLFVILFLENGLLPAAFLPGDSLLVLVGVL IAKGAMGYPQTILLLTVAASLGCWVSYIQGRWLGNTRTVQNWLSHLPAHYHQRAHHLFHK HGLSALLIGRFIAFVRTLLPTIAGLSGLNNARFQFFNWMSGLLWVLILTTLGYMLGKTPV FLKYEDQLMSCLMLLPVVLLVFGLAGSLVVLWKKKYGNRG >gi|296494648|gb|ADTN01000090.1| GENE 8 8786 - 9154 242 122 aa, chain + ## HITS:1 COG:no KEGG:B21_02916 NR:ns ## KEGG: B21_02916 # Name: yqjB # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 122 6 127 127 209 100.0 2e-53 MSLRQLAWSGAVLLLVGTLLLAWSAVRQQESTLAIRAVHQGTTMPDGFSIWHHLDAHGIP FKSITPKNDTLLITFDSSDQSAAAKAVLDRTLPHGYIIAQQDNNSQAMQWLTRLRDNSHR FG >gi|296494648|gb|ADTN01000090.1| GENE 9 9286 - 9669 474 127 aa, chain + ## HITS:1 COG:no KEGG:SSON_3256 NR:ns ## KEGG: SSON_3256 # Name: yqjC # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 127 1 127 127 158 99.0 6e-38 MEGSRMKYRIALAVSLFALSAGSYATTLCQEKEQNILKEISYAEKHQNQNRIDGLNKALS EVRANCSDSQLRADHQKKIAKQKDEVAERQQDLAEAKQKGDADKIAKRERKLAEAQEELK KLEARDY >gi|296494648|gb|ADTN01000090.1| GENE 10 9707 - 10012 455 101 aa, chain + ## HITS:1 COG:ECs3980 KEGG:ns NR:ns ## COG: ECs3980 COG4575 # Protein_GI_number: 15833234 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 101 1 
101 101 150 100.0 4e-37 MSKEHTTEHLRAELKSLSDTLEEVLSSSGEKSKEELSKIRSKAEQALKQSRYRLGETGDA IAKQTRVAAARADEYVRENPWTGVGIGAAIGVVLGVLLSRR >gi|296494648|gb|ADTN01000090.1| GENE 11 10015 - 10419 448 134 aa, chain + ## HITS:1 COG:STM3230 KEGG:ns NR:ns ## COG: STM3230 COG5393 # Protein_GI_number: 16766529 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Salmonella typhimurium LT2 # 1 130 1 130 132 198 89.0 2e-51 MADTHHAQGPGKSVLGIGQRIVSIMVEMVETRLRLAVVELEEEKANLFQLLLMLGLTMLF AAFGLMSLMVLIIWAVDPQYRLNAMIATTVVLLLLALIGGIWTLRKSRKSTLLRHTRHEL ANDRQLLEEESREQ >gi|296494648|gb|ADTN01000090.1| GENE 12 10409 - 10708 376 99 aa, chain + ## HITS:1 COG:no KEGG:JW3071 NR:ns ## KEGG: JW3071 # Name: yqjK # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 99 1 99 99 140 100.0 1e-32 MSSKVERERRKAQLLSQIQQQRLDLSASRREWLETTGAYDRRWNMLLSLRSWALVGSSVM AIWTIRHPNMLVRWARRGFGVWSAWRLVKTTLKQQQLRG >gi|296494648|gb|ADTN01000090.1| GENE 13 10804 - 11286 506 160 aa, chain + ## HITS:1 COG:ECs3983 KEGG:ns NR:ns ## COG: ECs3983 COG2259 # Protein_GI_number: 15833237 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 160 1 160 160 275 100.0 2e-74 MILSIDSNDANTAPLHKKTISSLSGAVESMMKKLEDVGVLVARILMPILFITAGWGKITG YAGTQQYMEAMGVPGFMLPLVILLEFGGGLAILFGFLTRTTALFTAGFTLLTAFLFHSNF AEGVNSLMFMKNLTISGGFLLLAITGPGAYSIDRLLNKKW >gi|296494648|gb|ADTN01000090.1| GENE 14 11356 - 12342 1082 328 aa, chain + ## HITS:1 COG:yqjG KEGG:ns NR:ns ## COG: yqjG COG0435 # Protein_GI_number: 16130997 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted glutathione S-transferase # Organism: Escherichia coli K12 # 1 328 1 328 328 662 100.0 0 MGQLIDGVWHDTWYDTKSTGGKFQRSASAFRNWLTADGAPGPTGTGGFIAEKDRYHLYVS LACPWAHRTLIMRKLKGLEPFISVSVVNPLMLENGWTFDDSFPGATGDTLYQNEFLYQLY LHADPHYSGRVTVPVLWDKKNHTIVSNESAEIIRMFNTAFDALGAKAGDYYPPALQTKID ELNGWIYDTVNNGVYKAGFATSQEAYDEAVAKVFESLARLEQILGQHRYLTGNQLTEADI 
RLWTTLVRFDPVYVTHFKCDKHRISDYLNLYGFLRDIYQMPGIAETVNFDHIRNHYFRSH KTINPTGIISIGPWQDLDEPHGRDVRFG >gi|296494648|gb|ADTN01000090.1| GENE 15 12636 - 13001 344 121 aa, chain + ## HITS:1 COG:ECs3985 KEGG:ns NR:ns ## COG: ECs3985 COG3152 # Protein_GI_number: 15833239 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 121 1 121 121 223 100.0 5e-59 MDWYLKVLKNYVGFRGRARRKEYWMFILVNIIFTFVLGLLDKMLGWQRAGGEGILTTIYG ILVFLPWWAVQFRRLHDTDRSAWWALLFLIPFIGWLIIIVFNCQAGTPGENRFGPDPKLE P >gi|296494648|gb|ADTN01000090.1| GENE 16 13243 - 13599 317 118 aa, chain + ## HITS:1 COG:ECs3986 KEGG:ns NR:ns ## COG: ECs3986 COG3152 # Protein_GI_number: 15833240 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 118 1 118 118 220 100.0 5e-58 MQWYLSVLKNYVGFSGRARRKEYWMFTLINAIVGAIINVIQLILGLELPYLSMLYLLATF LPVLALAIRRLHDTDRSGAWALLFFVPFIGWLVLLVFFCTEGTSGSNRYGNDPKFGSN >gi|296494648|gb|ADTN01000090.1| GENE 17 13650 - 14546 971 298 aa, chain - ## HITS:1 COG:ECs3987 KEGG:ns NR:ns ## COG: ECs3987 COG0583 # Protein_GI_number: 15833241 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 298 1 298 298 553 100.0 1e-157 MAKERALTLEALRVMDAIDRRGSFAAAADELGRVPSALSYTMQKLEEELDVVLFDRSGHR TKFTNVGRMLLERGRVLLEAADKLTTDAEALARGWETHLTIVTEALVPTPAFFPLIDKLA AKANTQLAIITEVLAGAWERLEQGRADIVIAPDMHFRSSSEINSRKLYTLMNVYVAAPDH PIHQEPEPLSEVTRVKYRGIAVADTARERPVLTVQLLDKQPRLTVSTIEDKRQALLAGLG VATMPYPMVEKDIAEGRLRVVSPESTSEIDIIMAWRRDSMGEAKSWCLREIPKLFNGK >gi|296494648|gb|ADTN01000090.1| GENE 18 14651 - 15352 594 233 aa, chain + ## HITS:1 COG:yhaK KEGG:ns NR:ns ## COG: yhaK COG1741 # Protein_GI_number: 16131001 # Func_class: R General function prediction only # Function: Pirin-related protein # Organism: Escherichia coli K12 # 1 233 1 233 233 474 100.0 1e-134 MITTRTARQCGQADYGWLQARYTFSFGHYFDPKLLGYASLRVLNQEVLAPGAAFQPRTYP KVDILNVILDGEAEYRDSEGNHVQASAGEALLLSTQPGVSYSEHNLSKDKPLTRMQLWLD 
ACPQRENPLIQKLALNMGKQQLIASPEGAMGSLQLRQQVWLHHIVLDKGESANFQLHGPR AYLQSIHGKFHALTHHEEKAALTCGDGAFIRDEANITLVADSPLRALLIDLPV >gi|296494648|gb|ADTN01000090.1| GENE 19 15375 - 15539 228 54 aa, chain + ## HITS:1 COG:no KEGG:EC55989_3525 NR:ns ## KEGG: EC55989_3525 # Name: yhaL # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 54 1 54 54 65 98.0 7e-10 MSKKLAKKRQPVKPVVAKEPARTAKNFGYEEMLSELEAIVADAETRLAEDEATA >gi|296494648|gb|ADTN01000090.1| GENE 20 15673 - 16983 1325 436 aa, chain - ## HITS:1 COG:yhaN+M KEGG:ns NR:ns ## COG: yhaN+M COG3681 # Protein_GI_number: 16132252 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 436 1 436 436 753 100.0 0 MFDSTLNPLWQRYILAVQEEVKPALGCTEPISLALAAAVAAAELEGPVERVEAWVSPNLM KNGLGVTVPGTGMVGLPIAAALGALGGNANAGLEVLKDATAQAIADAKALLAAGKVSVKI QEPCDEILFSRAKVWNGEKWACVTIVGGHTNIVHIETHDGVVFTQQACVAEGEQESPLTV LSRTTLAEILKFVNEVPFAAIRFILDSAKLNCALSQEGLSGKWGLHIGATLEKQCERGLL AKDLSSSIVIRTSAASDARMGGATLPAMSNSGSGNQGITATMPVVVVAEHFGADDERLAR ALMLSHLSAIYIHNQLPRLSALCAATTAAMGAAAGMAWLVDGRYETISMAISSMIGDVSG MICDGASNSCAMKVSTSASAAWKAVLMALDDTAVTGNEGIVAHDVEQSIANLCALASHSM QQTDRQIIEIMASKAR >gi|296494648|gb|ADTN01000090.1| GENE 21 17011 - 18342 1392 443 aa, chain - ## HITS:1 COG:ECs3991 KEGG:ns NR:ns ## COG: ECs3991 COG0814 # Protein_GI_number: 15833245 # Func_class: E Amino acid transport and metabolism # Function: Amino acid permeases # Organism: Escherichia coli O157:H7 # 1 443 1 443 443 788 99.0 0 MEIASNKGVIADASTPAGRAGMSESEWREAIKFDSTDTGWVIMSIGMAIGAGIVFLPVQV GLMGLWVFLLSSVIGYPAMYLFQRLFINTLAESPECKDYPSVISGYLGKNWGILLGALYF VMLVIWMFVYSTAITNDSASYLHTFGVTEGLLSDSPFYGLVLICILVAISSRGEKLLFKI STGMVLTKLLVVAALGVSMVGMWHLYNVGSLPPLGLLVKNAIITLPFTLTSILFIQTLSP MVISYRSREKSIEVARHKALRAMNIAFGILFVTVFFYAVSFTLAMGHDEAVKAYEQNISA LAIAAQFISGDGAAWVKVVSVILNIFAVMTAFFGVYLGFREATQGIVMNILRRKMPAEKI NENLVQRGIMIFAILLAWSAIVLNAPVLSFTSICSPIFGMVGCLIPAWLVYKVPALHKYK GMSLYLIIVTGLLLCVSPFLAFS >gi|296494648|gb|ADTN01000090.1| GENE 22 18617 - 19981 
1196 454 aa, chain - ## HITS:1 COG:ECs3992 KEGG:ns NR:ns ## COG: ECs3992 COG1760 # Protein_GI_number: 15833246 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Escherichia coli O157:H7 # 1 454 1 454 454 902 99.0 0 MISAFDIFKIGIGPSSSHTVGPMNAGKSFIDRLESSGLLTATSHIVVDLYGSLSLTGKGH ATDVAIIMGLAGNSPQDVVIDEIPAFIELVTRSGRLPVASGAHIVDFPVAKNIIFHPEML PRHENGMRITAWKGQEELLSKTYYSVGGGFIVEEEHFGLSHDVETSVPYDFHSAGELLKM CDYNGLSISGLMMHNELALRSKAEIDAGFARIWQVMHDGIERGMNTEGVLPGPLNVPRRA VALRRQLVSSDNISNDPMNVIDWINMYALAVSEENAAGGRVVTAPTNGACGIIPAVLAYY DKFRRPVNERSIARYFLAAGAIGALYKMNASISGAEVGCQGEIGVACSMAAAGLTELLGG SPAQVCNAAEIAMEHNLGLTCDPVAGQVQIPCIERNAINAVKAVNAARMAMRRTSAPRVS LDKVIETMYETGKDMNDKYRETSRGGLAIKVVCG >gi|296494648|gb|ADTN01000090.1| GENE 23 20053 - 20442 418 129 aa, chain - ## HITS:1 COG:tdcF KEGG:ns NR:ns ## COG: tdcF COG0251 # Protein_GI_number: 16131006 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Escherichia coli K12 # 1 129 22 150 150 249 100.0 1e-66 MKKIIETQRAPGAIGPYVQGVDLGSMVFTSGQIPVCPQTGEIPADVQDQARLSLENVKAI VVAAGLSVGDIIKMTVFITDLNDFATINEVYKQFFDEHQATYPTRSCVQVARLPKDVKLE IEAIAVRSA >gi|296494648|gb|ADTN01000090.1| GENE 24 20456 - 22750 2631 764 aa, chain - ## HITS:1 COG:ECs3994 KEGG:ns NR:ns ## COG: ECs3994 COG1882 # Protein_GI_number: 15833248 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli O157:H7 # 1 764 1 764 764 1582 99.0 0 MKVDIDTSDKLYADAWLGFKGTDWKNEINVRDFIQHNYTPYEGDESFLAEATPATTELWE KVMEGIRIENATHAPVDFDTNIATTITAHDAGYINQPLEKIVGLQTDAPLKRALHPFGGI NMIKSSFHAYGREMDSEFEYLFTDLRKTHNQGVFDVYSPDMLRCRKSGVLTGLPDGYGRG RIIGDYRRVALYGISYLVRERELQFADLQSRLEKGEDLEATIRLREELAEHRHALLQIQE MAAKYGFDISRPAQNAQEAVQWLYFAYLAAVKSQNGGAMSLGRTASFLDIYIERDFKAGV LNEQQAQELIDHFIMKIRMVRFLRTPEFDSLFSGDPIWATEVIGGMGLDGRTLVTKNSFR YLHTLHTMGPAPEPNLTILWSEELPIAFKKYAAQVSIVTSSLQYENDDLMRTDFNSDDYA 
IACCVSPMVIGKQMQFFGARANLAKTLLYAINGGVDEKLKIQVGPKTAPLMDDVLDYDKV MDSLDHFMDWLAVQYISALNIIHYMHDKYSYEASLMALHDRDVYRTMACGIAGLSVATDS LSAIKYARVKPIRDENGLAVDFEIDGEYPQYGNNDERVDSIACDLVERFMKKIKALPTYR NAVPTQSILTITSNVVYGQKTGNTPDGRRAGTPFAPGANPMHGRDRKGAVASLTSVAKLP FTYAKDGISYTFSIVPAALGKEDPVRKTNLVGLLDGYFHHEADVEGGQHLNVNVMNREML LDAIEHPEKYPNLTIRVSGYAVRFNALTREQQQDVISRTFTQAL >gi|296494648|gb|ADTN01000090.1| GENE 25 22784 - 23992 1075 402 aa, chain - ## HITS:1 COG:tdcD KEGG:ns NR:ns ## COG: tdcD COG0282 # Protein_GI_number: 16131008 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Escherichia coli K12 # 1 402 5 406 406 790 100.0 0 MNEFPVVLVINCGSSSIKFSVLDASDCEVLMSGIADGINSENAFLSVNGGEPAPLAHHSY EGALKAIAFELEKRNLNDSVALIGHRIAHGGSIFTESAIITDEVIDNIRRVSPLAPLHNY ANLSGIESAQQLFPGVTQVAVFDTSFHQTMAPEAYLYGLPWKYYEELGVRRYGFHGTSHR YVSQRAHSLLNLAEDDSGLVVAHLGNGASICAVRNGQSVDTSMGMTPLEGLMMGTRSGDV DFGAMSWVASQTNQSLGDLERVVNKESGLLGISGLSSDLRVLEKAWHEGHERAQLAIKTF VHRIARHIAGHAASLRRLDGIIFTGGIGENSSLIRRLVMEHLAVLGLEIDTEMNNRSNSC GERIVSSENARVICAVIPTNEEKMIALDAIHLGKVNAPAEFA >gi|296494648|gb|ADTN01000090.1| GENE 26 24018 - 25349 1383 443 aa, chain - ## HITS:1 COG:ECs3996 KEGG:ns NR:ns ## COG: ECs3996 COG0814 # Protein_GI_number: 15833250 # Func_class: E Amino acid transport and metabolism # Function: Amino acid permeases # Organism: Escherichia coli O157:H7 # 1 443 1 443 443 795 100.0 0 MSTSDSIVSSQTKQSSWRKSDTTWTLGLFGTAIGAGVLFFPIRAGFGGLIPILLMLVLAY PIAFYCHRALARLCLSGSNPSGNITETVEEHFGKTGGVVITFLYFFAICPLLWIYGVTIT NTFMTFWENQLGFAPLNRGFVALFLLLLMAFVIWFGKDLMVKVMSYLVWPFIASLVLISL SLIPYWNSAVIDQVDLGSLSLTGHDGILITVWLGISIMVFSFNFSPIVSSFVVSKREEYE KDFGRDFTERKCSQIISRASMLMVAVVMFFAFSCLFTLSPANMAEAKAQNIPVLSYLANH FASMTGTKTTFAITLEYAASIIALVAIFKSFFGHYLGTLEGLNGLVLKFGYKGDKTKVSL GKLNTISMIFIMGSTWVVAYANPNILDLIEAMGAPIIASLLCLLPMYAIRKAPSLAKYRG RLDNVFVTVIGLLTILNIVYKLF >gi|296494648|gb|ADTN01000090.1| GENE 27 25371 - 26360 993 329 aa, chain - ## HITS:1 COG:ECs3997 KEGG:ns NR:ns ## COG: ECs3997 COG1171 # Protein_GI_number: 15833251 # 
Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Escherichia coli O157:H7 # 1 329 1 329 329 590 100.0 1e-169 MHITYDLPVAIDDIIEAKQRLAGRIYKTGMPRSNYFSERCKGEIFLKFENMQRTGSFKIR GAFNKLSSLTDAEKRKGVVACSAGNHAQGVSLSCAMLGIDGKVVMPKGAPKSKVAATCDY SAEVVLHGDNFNDTIAKVSEIVEMEGRIFIPPYDDPKVIAGQGTIGLEIMEDLYDVDNVI VPIGGGGLIAGIAVAIKSINPTIRVIGVQSENVHGMAASFHSGEITTHRTTGTLADGCDV SRPGNLTYEIVRELVDDIVLVSEDEIRNSMIALIQRNKVVTEGAGALACAALLSGKLDQY IQNRKTVSIISGGNIDLSRVSQITGFVDA >gi|296494648|gb|ADTN01000090.1| GENE 28 26459 - 27397 751 312 aa, chain - ## HITS:1 COG:ECs3998 KEGG:ns NR:ns ## COG: ECs3998 COG0583 # Protein_GI_number: 15833252 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 312 1 312 312 593 100.0 1e-169 MSTILLPKTQHLVVFQEVIRSGSIGSAAKELGLTQPAVSKIINDIEDYFGVELVVRKNTG VTLTPAGQLLLSRSESITREMKNMVNEISGMSSEAVVEVSFGFPSLIGFTFMSGMINKFK EVFPKAQVSMYEAQLSSFLPAIRDGRLDFAIGTLSAEMKLQDLHVEPLFESEFVLVASKS RTCTGTTTLESLKNEQWVLPQTNMGYYSELLTTLQRNGISIENIVKTDSVVTIYNLVLNA DFLTVIPCDMTSPFGSNQFITIPVEETLPVAQYAAVWSKNYRIKKAASVLVELAKEYSSY NGCRRRQLIEVG >gi|296494648|gb|ADTN01000090.1| GENE 29 28012 - 28149 76 45 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_3594 NR:ns ## KEGG: EcE24377A_3594 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 45 1 45 45 64 100.0 1e-09 MLIFLAGFFRTIKKRSFTRRYSKRSNILWPTPIKSSVKISLVVFV >gi|296494648|gb|ADTN01000090.1| GENE 30 28186 - 28725 88 179 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_3595 NR:ns ## KEGG: EcE24377A_3595 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 179 8 186 186 363 100.0 1e-99 MKGFPIAHIFHPSIPPMHAVVNNHNRNIDYWTVKRKFAEIVSTNDVNKIYSISNELRRVL SAITALNFYHGDVPSVMIRIQPENMSPFIIDISTGEHDDYIIQTLDVGTFAPFGEQCTCS AVNKKELECIKETISKYCAKFTRKEAILTPLVHFNKTSITSDCWQILFFSPDHFNNDFY >gi|296494648|gb|ADTN01000090.1| GENE 31 28747 - 29934 193 395 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_3270 NR:ns 
## KEGG: ECIAI1_3270 # Name: yhaC # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 395 1 395 395 679 96.0 0 MFPVSSIGNDISSDLVRRKMNDLPESPTGNNLEALAPGIEKLKQTSIEMVTLLNTLQPGG KCIITGDFQKELAYLQNVILYNVSSLRLDFLGYNAQIIQRSDNTCELTINEPLKNQEIST GNININCPLKDIYNEIRRLNVIFSCGTGDIVDLSSLDLRNVDLDYYDFTDKHMANTILNP FKLNSTNFTNANMFQVNFVSSTQNATISWDYLLKITPVLISISDMYSEEKIKFVESCLNE PGDITEEQLKIMRFAIIKSIPRATLTDKLENELTKEIYKSSSKIINCLNRIKLPEMKEFS SEKIYDYIDIIIEDYENIKENAYLVVPQINYTMDLNIEDSSSEELLSDNTLEKDENSPDN GFEVGEYNTYEAYNSEKQYFTREDYTYDYDLLNAI >gi|296494648|gb|ADTN01000090.1| GENE 32 30731 - 31876 1116 381 aa, chain - ## HITS:1 COG:yhaD KEGG:ns NR:ns ## COG: yhaD COG1929 # Protein_GI_number: 16131016 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerate kinase # Organism: Escherichia coli K12 # 1 381 28 408 408 674 100.0 0 MKIVIAPDSYKESLSASEVAQAIEKGFREIFPDAQYVSVPVADGGEGTVEAMIAATQGAE RHAWVTGPLGEKVNASWGISGDGKTAFIEMAAASGLELVPAEKRDPLVTTSRGTGELILQ ALESGATNIIIGIGGSATNDGGAGMVQALGAKLCDANGNEIGFGGGSLNTLNDIDISGLD PRLKDCVIRVACDVTNPLVGDNGASRIFGPQKGASEAMIVELDNNLSHYAEVIKKALHVD VKDVPGAGAAGGMGAALMAFLGAELKSGIEIVTTALNLEEHIHDCTLVITGEGRIDSQSI HGKVPIGVANVAKKYHKPVIGIAGSLTDDVGVVHQHGIDAVFSVLTSIGTLDEAFRGAYD NICRASRNIAATLAIGMRNAG >gi|296494648|gb|ADTN01000090.1| GENE 33 31973 - 32863 1196 296 aa, chain - ## HITS:1 COG:ECs4003 KEGG:ns NR:ns ## COG: ECs4003 COG2084 # Protein_GI_number: 15833257 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases # Organism: Escherichia coli O157:H7 # 1 296 4 299 299 513 100.0 1e-145 MTMKVGFIGLGIMGKPMSKNLLKAGYSLVVADRNPEAIADVIAAGAETASTAKAIAEQCD VIITMLPNSPHVKEVALGENGIIEGAKPGTVLIDMSSIAPLASREISEALKAKGIDMLDA PVSGGEPKAIDGTLSVMVGGDKAIFDKYYDLMKAMAGSVVHTGEIGAGNVTKLANQVIVA LNIAAMSEALTLATKAGVNPDLVYQAIRGGLAGSTVLDAKAPMVMDRNFKPGFRIDLHIK DLANALDTSHGVGAQLPLTAAVMEMMQALRADGLGTADHSALACYYEKLAKVEVTR >gi|296494648|gb|ADTN01000090.1| GENE 34 32893 - 33663 799 256 aa, chain - ## HITS:1 
COG:yhaF KEGG:ns NR:ns ## COG: yhaF COG3836 # Protein_GI_number: 16131018 # Func_class: G Carbohydrate transport and metabolism # Function: 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase # Organism: Escherichia coli K12 # 1 256 1 256 256 503 100.0 1e-142 MNNDVFPNKFKAALAAKQVQIGCWSALSNPISTEVLGLAGFDWLVLDGEHAPNDISTFIP QLMALKGSASAPVVRVPTNEPVIIKRLLDIGFYNFLIPFVETKEEAELAVASTRYPPEGI RGVSVSHRANMFGTVADYFAQSNKNITILVQIESQQGVDNVDAIAATEGVDGIFVGPSDL AAALGHLGNASHPDVQKAIQHIFNRASAHGKPSGILAPVEADARRYLEWGATFVAVGSDL GVFRSATQKLADTFKK >gi|296494648|gb|ADTN01000090.1| GENE 35 33679 - 35013 1687 444 aa, chain - ## HITS:1 COG:ECs4005 KEGG:ns NR:ns ## COG: ECs4005 COG0477 # Protein_GI_number: 15833259 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 444 1 444 444 828 100.0 0 MILDTVDEKKKGVHTRYLILLIIFIVTAVNYADRATLSIAGTEVAKELQLSAVSMGYIFS AFGWAYLLMQIPGGWLLDKFGSKKVYTYSLFFWSLFTFLQGFVDMFPLAWAGISMFFMRF MLGFSEAPSFPANARIVAAWFPTKERGTASAIFNSAQYFSLALFSPLLGWLTFAWGWEHV FTVMGVIGFVLTALWIKLIHNPTDHPRMSAEELKFISENGAVVDMDHKKPGSAAASGPKL HYIKQLLSNRMMLGVFFGQYFINTITWFFLTWFPIYLVQEKGMSILKVGLVASIPALCGF AGGVLGGVFSDYLIKRGLSLTLARKLPIVLGMLLASTIILCNYTNNTTLVVMLMALAFFG KGFGALGWPVISDTAPKEIVGLCGGVFNVFGNVASIVTPLVIGYLVSELHSFNAALVFVG CSALMAMVCYLFVVGDIKRMELQK >gi|296494648|gb|ADTN01000090.1| GENE 36 35388 - 36959 1306 523 aa, chain + ## HITS:1 COG:yhaG KEGG:ns NR:ns ## COG: yhaG COG2721 # Protein_GI_number: 16131020 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: Escherichia coli K12 # 1 523 1 523 523 1059 100.0 0 MANIEIRQETPTAFYIKVHDTDNVAIIVNDNGLKAGTRFPDGLELIEHIPQGHKVALLDI PANGEIIRYGEVIGYAVRAIPRGSWIDESMVVLPEAPPLHTLPLATKVPEPLPPLEGYTF EGYRNADGSVGTKNLLGITTSVHCVAGVVDYVVKIIERDLLPKYPNVDGVVGLNHLYGCG VAINAPAAVVPIRTIHNISLNPNFGGEVMVIGLGCEKLQPERLLTGTDDVQAIPVESASI 
VSLQDEKHVGFQSMVEDILQIAERHLQKLNQRQRETCPASELVVGMQCGGSDAFSGVTAN PAVGYASDLLVRCGATVMFSEVTEVRDAIHLLTPRAVNEEVGKRLLEEMEWYDNYLNMGK TDRSANPSPGNKKGGLANVVEKALGSIAKSGKSAIVEVLSPGQRPTKRGLIYAATPASDF VCGTQQVASGITVQVFTTGRGTPYGLMAVPVIKMATRTELANRWFDLMDINAGTIATGEE TIEEVGWKLFHFILDVASGKKKTFSDQWGLHNQLAVFNPAPVT >gi|296494648|gb|ADTN01000090.1| GENE 37 37108 - 37443 249 111 aa, chain + ## HITS:1 COG:sohA KEGG:ns NR:ns ## COG: sohA COG2002 # Protein_GI_number: 16131021 # Func_class: K Transcription # Function: Regulators of stationary/sporulation gene expression # Organism: Escherichia coli K12 # 1 111 1 111 111 216 100.0 8e-57 MPANARSHAVLTTESKVTIRGQTTIPAPVREALKLKPGQDSIHYEILPGGQVFMCRLGDE QEDHTMNAFLRFLDADIQNNPQKTRPFNIQQGKKLVAGMDVNIDDEIGDDE >gi|296494648|gb|ADTN01000090.1| GENE 38 37443 - 37907 252 154 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_3612 NR:ns ## KEGG: ECUMN_3612 # Name: yhaV # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 154 1 154 154 308 99.0 3e-83 MDFPQRVNGWALYAHPCFQETYDALVAEVETLKGKDPENYQRKAATKLLAVVHKVIEEHI TVNPSSPAFRHGKSLGSGKNKDWSRVKFGAGRYRLFFRYSEKEKVIILGWMNDENTLRTY GKKTDAYTVFSKMLKRGHPPADWETLTRETEETH >gi|296494648|gb|ADTN01000090.1| GENE 39 37962 - 38771 734 269 aa, chain - ## HITS:1 COG:ECs4009 KEGG:ns NR:ns ## COG: ECs4009 COG1349 # Protein_GI_number: 15833263 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli O157:H7 # 1 269 1 269 269 492 100.0 1e-139 MSNTDASGEKRVTGTSERREQIIQRLRQQGSVQVNDLSALYGVSTVTIRNDLAFLEKQGI AVRAYGGALICDSTTPSVEPSVEDKSALNTAMKRSVAKAAVELIQPGHRVILDSGTTTFE IARLMRKHTDVIAMTNGMNVANALLEAEGVELLMTGGHLRRQSQSFYGDQAEQSLQNYHF DMLFLGVDAIDLERGVSTHNEDEARLNRRMCEVAERIIVVTDSSKFNRSSLHKIIDTQRI DMIIVDEGIPADSLEGLRKAGVEVILVGE >gi|296494648|gb|ADTN01000090.1| GENE 40 39020 - 40300 1295 426 aa, chain + ## HITS:1 COG:agaZ KEGG:ns NR:ns ## COG: agaZ COG4573 # Protein_GI_number: 16131024 # Func_class: G Carbohydrate transport and metabolism # 
Function: Predicted tagatose 6-phosphate kinase # Organism: Escherichia coli K12 # 1 426 1 426 426 862 100.0 0 MKHLTEMVRQHKAGKTNGIYAVCSAHPLVLEAAIRYASANQTPLLIEATSNQVDQFGGYT GMTPADFRGFVCQLADSLNFPQDALILGGDHLGPNRWQNLPAAQAMANADDLIKSYVAAG FKKIHLDCSMSCQDDPIPLTDDIVAERAARLAKVAEETCLEHFGEADLEYVIGTEVPVPG GAHETLSELAVTTPDAARATLEAHRHAFEKQGLNAIWPRIIALVVQPGVEFDHTNVIDYQ PAKASALSQMVENYETLIFEAHSTDYQTPQSLRQLVIDHFAILKVGPALTFALREALFSL AAIEEELVPAKACSGLRQVLEDVMLDRPEYWQSHYHGDGNARRLARGYSYSDRVRYYWPD SQIDDAFAHLVRNLADSPIPLPLISQYLPLQYVKVRSGELQPTPRELIINHIQDILAQYH TACEGQ >gi|296494648|gb|ADTN01000090.1| GENE 41 40323 - 40796 548 157 aa, chain + ## HITS:1 COG:agaV KEGG:ns NR:ns ## COG: agaV COG3444 # Protein_GI_number: 16131025 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB # Organism: Escherichia coli K12 # 1 157 13 169 169 306 100.0 9e-84 MPNIVLSRIDERLIHGQVGVQWVGFAGANLVLVANDEVAEDPVQQNLMEMVLAEGIAVRF WTLQKVIDNIHRAADRQKILLVCKTPADFLTLVKGGVPVNRINVGNMHYANGKQQIAKTV SVDAGDIAAFNDLKTAGVECFVQGVPTEPAVDLFKLL >gi|296494648|gb|ADTN01000090.1| GENE 42 40807 - 41586 1083 259 aa, chain + ## HITS:1 COG:ECs4012 KEGG:ns NR:ns ## COG: ECs4012 COG3715 # Protein_GI_number: 15833266 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC # Organism: Escherichia coli O157:H7 # 1 259 1 259 259 427 99.0 1e-119 MEISLLQAFALGIIAFIAGLDMFNGLTHMHRPVVLGPLVGLVLGDLHTGILTGGTLELVW MGLAPLAGAQPPNVIIGTIVGTAFAITTGVKPDVAVGVAVPFAVAVQMGITFLFSVMSGV MSRCDRMAENADTRGIERVNYLALLALGIFYFLCAFLPIYFGAEHAKTIIDVLPQRLIDG LGVAGGIMPAIGFAVLLKIMMKNVYIPYFILGFVAAAWLKLPVLAIAAAALAMALIDLLR KSPEPTQPAAQKEEFEDGI >gi|296494648|gb|ADTN01000090.1| GENE 43 41576 - 42454 1053 292 aa, chain + ## HITS:1 COG:ECs4013 KEGG:ns NR:ns ## COG: ECs4013 COG3716 # Protein_GI_number: 15833267 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase 
system, mannose/fructose/N-acetylgalactosamine-specific component IID # Organism: Escherichia coli O157:H7 # 1 292 1 292 292 561 100.0 1e-160 MASNQTTLPNVSENEETLLTGVNENVYEDQSIGAELTKKDINRVAWRSMLLQASFNYERM QASGWLYGLLPALKKIHTNKRDLARAMKGHMGFFNTHPFLVTFVIGIILAMERSKQDVNS IQSTKIAVGAPLGGIGDAMFWLTLLPICGGIGASLALQGSILGAVVFIVLFNVVHLGLRF GLAHYAYRMGVAAIPLIKANTKKVGHAASIVGMTVIGALVATYVRLSTTLEITAGDAVVK LQADVIDKLMPAFLPLVYTLTMFWLVRRGWSPLRLIAVTVVLGIVGKFCHFL >gi|296494648|gb|ADTN01000090.1| GENE 44 42472 - 42906 575 144 aa, chain + ## HITS:1 COG:ECs4014 KEGG:ns NR:ns ## COG: ECs4014 COG2893 # Protein_GI_number: 15833268 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose-specific component IIA # Organism: Escherichia coli O157:H7 # 1 144 1 144 144 256 100.0 1e-68 MLSIILTGHGGFASGMEKAMKQILGEQSQFIAIDFPETSSTALLTSQLEEAIAQLDCEDG IVFLTDLLGGTPFRVASTLAMQKPGCEVITGTNLQLLLEMVLEREGLSGEEFRVQALECG HRGLTSLVDELGRCHEECPVEEGI >gi|296494648|gb|ADTN01000090.1| GENE 45 42903 - 44036 1033 377 aa, chain + ## HITS:1 COG:Z4489 KEGG:ns NR:ns ## COG: Z4489 COG1820 # Protein_GI_number: 15803675 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Escherichia coli O157:H7 EDL933 # 1 377 1 377 377 723 98.0 0 MTHVLRARRLLTEEGWLDDHQLRIADGVIAAIEPIPAGVTERDAELLCPAYIDTHVHGGA GVDVMDDAPDVLDKLAMHKAREGVGSWLPTTVTTPLNTIHAALKRIAQRCQRGGPGAQVL GSYLEGPYFTPQNKGAHPPELFRELEIAELDQLIAVSQHTLRVVALAPEKEGALQAIRHL KQQNVRVMLGHSAATWQQTRAAFDAGADGLVHCYNGMTGLHHREPGMVGAGLTDKRAWLE LIADGHHVHPAAMSLCCCCAKERIVLITDAMQAAGMPDGRYTLCGEEVQMHGGVVRTASG GLAGSTLSVDAAVRNMVELTGVTPAEAIHMASLHPARMLGVDGVLGSLKPGKRASVVALD SGLHVQQIWIQGQLASF >gi|296494648|gb|ADTN01000090.1| GENE 46 44387 - 45541 1004 384 aa, chain + ## HITS:1 COG:agaS KEGG:ns NR:ns ## COG: agaS COG2222 # Protein_GI_number: 16131028 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted phosphosugar isomerases # Organism: Escherichia coli K12 # 1 384 1 384 384 771 
100.0 0 MPENYTPAAAATGTWTEEEIRHQPRAWIRSLTNIDALRSALNNFLEPLLRKENLRIILTG AGTSAFIGDIIAPWLASHTGKNFSAVPTTDLVTNPMDYLNPAHPLLLISFGRSGNSPESV AAVELANQFVPECYHLPITCNEAGALYQNAINSDNAFALLMPAETHDRGFAMTSSITTMM ASCLAVFAPETINSQTFRDVADRCQAILTSLGDFSEGVFGYAPWKRIVYLGSGGLQGAAR ESALKVLELTAGKLAAFYDSPTGFRHGPKSLVDDETLVVVFVSSHPYTRQYDLDLLAELR RDNQAMRVIAIAAESSDIVAAGPHIILPPSRHFIDVEQAFCFLMYAQTFALMQSLHMGNT PDTPSASGTVNRVVQGVIIHPWQA >gi|296494648|gb|ADTN01000090.1| GENE 47 45554 - 46414 813 286 aa, chain + ## HITS:1 COG:ECs4017 KEGG:ns NR:ns ## COG: ECs4017 COG0191 # Protein_GI_number: 15833271 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Escherichia coli O157:H7 # 1 286 1 286 286 577 100.0 1e-165 MSIISTKYLLQDAQANGYAVPAFNIHNAETIQAILEVCSEMRSPVILAGTPGTFKHIALE EIYALCSAYSTTYNMPLALHLDHHESLDDIRRKVHAGVRSAMIDGSHFPFAENVKLVKSV VDFCHSQDCSVEAELGRLGGVEDDMSVDAESAFLTDPQEAKRFVELTGVDSLAVAIGTAH GLYSKTPKIDFQRLAEIREVVDVPLVLHGASDVPDEFVRRTIELGVTKVNVATELKIAFA GAVKAWFAENPQGNDPRYYMRVGMDAMKEVVRNKINVCGSANRISA >gi|296494648|gb|ADTN01000090.1| GENE 48 46581 - 47057 573 158 aa, chain + ## HITS:1 COG:agaB KEGG:ns NR:ns ## COG: agaB COG3444 # Protein_GI_number: 16131030 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB # Organism: Escherichia coli K12 # 1 158 1 158 158 280 100.0 8e-76 MTSPNILLTRIDNRLVHGQVGVTWTSTIGANLLVVVDDVVANDDIQQKLMGITAETYGFG IRFFTIEKTINVIGKAAPHQKIFLICRTPQTVRKLVEGGIDLKDVNVGNMHFSEGKKQIS SKVYVDDQDLTDLRFIKQRGVNVFIQDVPGDQKEQIPD >gi|296494648|gb|ADTN01000090.1| GENE 49 47096 - 47899 872 267 aa, chain + ## HITS:1 COG:agaC KEGG:ns NR:ns ## COG: agaC COG3715 # Protein_GI_number: 16131031 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC # Organism: Escherichia coli K12 # 1 267 1 267 267 479 100.0 1e-135 
MHEITLLQGLSLAALVFVLGIDFWLEALFLFRPIIVCTLTGAILGDIQTGLITGGLTELA FAGLTPAGGVQPPNPIMAGLMTTVIAWSTGVDAKTAIGLGLPFSLLMQYVILFFYSAFSL FMTKADKCAKEADTAAFSRLNWTTMLIVASAYAVIAFLCTYLAQGAMQALVKAMPAWLTH GFEVAGGILPAVGFGLLLRVMFKAQYIPYLIAGFLFVCYIQVSNLLPVAVLGAGFAVYEF FNAKSRQQAQPQPVASKNEEEDYSNGI >gi|296494648|gb|ADTN01000090.1| GENE 50 47889 - 48680 809 263 aa, chain + ## HITS:1 COG:agaD KEGG:ns NR:ns ## COG: agaD COG3716 # Protein_GI_number: 16131032 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID # Organism: Escherichia coli K12 # 1 263 1 263 263 503 99.0 1e-142 MGSEISKKDITRLGFRSSLLQASFNYERMQAGGFTWAMLPILKKIYKDDKPGLSAAMKDN LEFINTHPNLVGFLMGLLISMEEKGENRDTIKGLKVALFGPIAGIGDAIFWFTLLPIMAG ICSSFASQGNLLGPILFFAVYLLIFFLRVGWTHVGYSVGVKAIDKVRENSQMIARSATIL GITVIGGLIASYVHINVVTSFAIDNTHSVALQQDFFDIVFPNILPMAYTLLMYYFLRVKK AHPVLLIGVTFVLSIVCSAFGIL >gi|296494648|gb|ADTN01000090.1| GENE 51 48735 - 49436 712 233 aa, chain + ## HITS:1 COG:agaI KEGG:ns NR:ns ## COG: agaI COG0363 # Protein_GI_number: 16131033 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Escherichia coli K12 # 1 233 19 251 251 460 99.0 1e-130 MQTLQQVENYTALSERASEYLLAVIRSKPNAVICLATGATPLLTYHYLVEKIHQQQVDVS QLTFVKLDEWVDLPLTMPGTCETFLQQHIVQPLGLREDQLISFRSEEINETECERVTNLI ARKGGLDLCVLGLGKNGHLGLNEPGESLQPACHISQLDARTQQHEMLKTAGRPVTRGITL GLKDILNAREVLLLVTGEGKQDATDRFLTAKVSTAIPASFLWLHSNFICLINT >gi|296494648|gb|ADTN01000090.1| GENE 52 49837 - 50421 540 194 aa, chain + ## HITS:1 COG:yraH KEGG:ns NR:ns ## COG: yraH COG3539 # Protein_GI_number: 16131034 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 194 1 194 194 374 100.0 1e-104 MNKVTKTAIAGLLALFAGNAAATDGEIVFDGEILKSACEINDSDKKIEVALGHYNAEQFR 
NIGERSPKIPFTIPLVNCPMTGWEHDNGNVEASFRLWLETRDNGTVPNFPNLAKVGSFAG IAATGVGIRIDDAESGNIMPLNAMGNDNTVYQIPAESNGIVNVDLIAYYVSTVVPSEITP GEADAIVNVTLDYR Prediction of potential genes in microbial genomes Time: Sun May 15 23:26:18 2011 Seq name: gi|296494647|gb|ADTN01000091.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont238.3, whole genome shotgun sequence Length of sequence - 6276 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 10/0.500 + CDS 74 - 769 323 ## COG3121 P pilus assembly protein, chaperone PapD 2 1 Op 2 6/0.500 + CDS 798 - 3314 1577 ## COG3188 P pilus assembly protein, porin PapC 3 1 Op 3 . + CDS 3325 - 4416 400 ## COG3539 P pilus assembly protein, pilin FimA + Term 4520 - 4546 -0.3 - Term 4407 - 4451 8.1 4 2 Tu 1 . - CDS 4459 - 5319 1052 ## COG0313 Predicted methyltransferases - Prom 5357 - 5416 6.5 + Prom 5137 - 5196 2.7 5 3 Tu 1 . + CDS 5384 - 6275 969 ## COG3107 Putative lipoprotein Predicted protein(s) >gi|296494647|gb|ADTN01000091.1| GENE 1 74 - 769 323 231 aa, chain + ## HITS:1 COG:yraI KEGG:ns NR:ns ## COG: yraI COG3121 # Protein_GI_number: 16131035 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli K12 # 1 231 1 231 231 469 100.0 1e-132 MSKRTFAVILTLLCSFCIGQALAGGIVLQRTRVIYDASRKEAALPVANKGAETPYLLQSW VDNIDGKSRAPFIITPPLFRLEAGDDSSLRIIKTADNLPENKESLFYINVRAIPAKKKSD DVNANELTLVFKTRIKMFYRPAHLKGRVNDAWKSLEFKRSDHSLNIYNPTEYYVVFAGLA VDKTDLTSKIEYIAPGEHKQLPLPASGGKNVKWAAINDYGGSSGTETRPLQ >gi|296494647|gb|ADTN01000091.1| GENE 2 798 - 3314 1577 838 aa, chain + ## HITS:1 COG:yraJ KEGG:ns NR:ns ## COG: yraJ COG3188 # Protein_GI_number: 16131036 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli K12 # 1 838 1 
838 838 1583 99.0 0 MPQRHHQGHKRTPKQLALIIKRCLPMVLTGSGMLCTTANAEEYYFDPIMLETTKSGMQTT DLSRFSKKYAQLPGTYQVDIWLNKKKVSQKKITFTANAEQLLQPQFTVEQLRELGIKVDE IPALAEKDDDSVINSLEQIIPGTAAEFDFNHQQLNLSIPQIALYRDARGYVSPSRWDDGI PTLFTNYSFTGSDNRYRQGNRSQRQYLNMQNGANFGPWRLRNYSTWTRNDQTSSWNTISS YLQRDIKALKSQLLLGESATSGSIFSSYTFTGVQLASDDNMLPNSQRGFAPTVRGIANSS AIVTIRQNGYVIYQSNVSAGAFEINDLYPSSNSGDLEVTIEESDGTQRRFIQPYSSLPMM QRPGHLKYSATAGRYRADANSDSKEPEFAEATAIYGLNNTFTLYGGLLGSEDYYALGIGI GGTLGALGALSMDINRADTQFDNQHSFHGYQWRTQYIKDIPETNTNIAVSYYRYTNDGYF SFNEANTRNWDYNSRQKSEIQFNISQTIFDGVSLYASGSQQDYWGNNDKNRNISVGVSGQ QWGVGYSLNYQYSRYTDQNNDRALSLNLSIPLERWLPRSRVSYQMTSQKDRPTQHEMRLD GSLLDDGRLSYSLEQSLDDDNNHNSSLNASYRSPYGTFSAGYSYGNDSSQYNYGVTGSVV IHPHGVTLSQYLGNAFALIDANGASGVRIQNYPGIATDPFGYAVVPYLTTYQENRLSVDT TQLPDNVDLEQTTQFVVPNRGAMVAARFNANIGYRVLVTVSDRNGKPLPFGALASNDDTG QQSIVDEGGILYLSGISSKSQSWTVRWGNQADQQCQFAFSTPDSEPTTSVLQGTAQCH >gi|296494647|gb|ADTN01000091.1| GENE 3 3325 - 4416 400 363 aa, chain + ## HITS:1 COG:yraK KEGG:ns NR:ns ## COG: yraK COG3539 # Protein_GI_number: 16131037 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 363 1 363 363 672 100.0 0 MKRAPLITGLLLISTSCAYASSGGCGADSTSGATNYSSVVDDVTVNQTDNVTGREFTSAT LSSTNWQYACSCSAGKAVKLVYMVSPVLTTTGHQTGYYKLNDSLDIKTTLQANDIPGLTT DQVVSVNTRFTQIKNNTVYSAATQTGVCQGDTSRYGPVNIGANTTFTLYVTKPFLGSMTI PKTDIAVIKGAWVDGMGSPSTGDFHDLVKLSIQGNLTAPQSCKINQGDVIKVNFGFINGQ KFTTRNAMPDGFTPVDFDITYDCGDTSKIKNSLQMRIDGTTGVVDQYNLVARRRSSDNVP DVGIRIENLGGGVANIPFQNGILPVDPSGHGTVNMRAWPVNLVGGELETGKFQGTATITV IVR >gi|296494647|gb|ADTN01000091.1| GENE 4 4459 - 5319 1052 286 aa, chain - ## HITS:1 COG:ECs4027 KEGG:ns NR:ns ## COG: ECs4027 COG0313 # Protein_GI_number: 15833281 # Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Escherichia coli O157:H7 # 1 286 1 286 286 533 100.0 1e-151 MKQHQSADNSQGQLYIVPTPIGNLADITQRALEVLQAVDLIAAEDTRHTGLLLQHFGINA 
RLFALHDHNEQQKAETLLAKLQEGQNIALVSDAGTPLINDPGYHLVRTCREAGIRVVPLP GPCAAITALSAAGLPSDRFCYEGFLPAKSKGRRDALKAIEAEPRTLIFYESTHRLLDSLE DIVAVLGESRYVVLARELTKTWETIHGAPVGELLAWVKEDENRRKGEMVLIVEGHKAQEE DLPADALRTLALLQAELPLKKAAALAAEIHGVKKNALYKYALEQQG >gi|296494647|gb|ADTN01000091.1| GENE 5 5384 - 6275 969 297 aa, chain + ## HITS:1 COG:yraM KEGG:ns NR:ns ## COG: yraM COG3107 # Protein_GI_number: 16131039 # Func_class: R General function prediction only # Function: Putative lipoprotein # Organism: Escherichia coli K12 # 1 288 1 288 678 528 100.0 1e-150 MVPSTFSRLKAARCLPVVLAALIFAGCGTHTPDQSTAYMQGTAQADSAFYLQQMQQSSDD TRINWQLLAIRALVKEGKTGQAVELFNQLPQELNDAQRREKTLLAVEIKLAQKDFAGAQN LLAKITPADLEQNQQARYWQAKIDASQGRPSIDLLRALIAQEPLLGAKEKQQNIDATWQA LSSMTQEQANTLVINADENILQGWLDLQRVWFDNRNDPDMMKAGIADWQKRYPNNPGAKM LPTQLVNVKAFKPASTNKIALLLPLNGQAAVFGRTIQQGFEAAKNIGTQPVAAQGAA Prediction of potential genes in microbial genomes Time: Sun May 15 23:26:30 2011 Seq name: gi|296494646|gb|ADTN01000092.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont238.4, whole genome shotgun sequence Length of sequence - 30699 bp Number of predicted genes - 29, with homology - 29 Number of transcription units - 13, operones - 6 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 10/0.000 + CDS 48 - 1151 969 ## COG3107 Putative lipoprotein 2 1 Op 2 11/0.000 + CDS 1109 - 1504 216 ## COG0792 Predicted endonuclease distantly related to archaeal Holliday junction resolvase 3 1 Op 3 11/0.000 + CDS 1524 - 2114 450 ## COG0279 Phosphoheptose isomerase 4 1 Op 4 . + CDS 2124 - 2699 605 ## COG2823 Predicted periplasmic or secreted lipoprotein - Term 2616 - 2650 2.0 5 2 Op 1 2/0.889 - CDS 2813 - 3853 1277 ## COG0701 Predicted permeases - Term 3860 - 3912 3.2 6 2 Op 2 . - CDS 3926 - 4561 575 ## COG0702 Predicted nucleoside-diphosphate-sugar epimerases - Prom 4731 - 4790 2.1 7 3 Tu 1 . + CDS 4689 - 5207 591 ## COG0693 Putative intracellular protease/amidase - Term 5121 - 5149 1.4 8 4 Tu 1 . 
- CDS 5187 - 5630 386 ## COG3787 Uncharacterized protein conserved in bacteria - Prom 5652 - 5711 3.1 + Prom 5597 - 5656 4.2 9 5 Tu 1 . + CDS 5681 - 5983 264 ## COG2827 Predicted endonuclease containing a URI domain - Term 5876 - 5918 1.1 10 6 Op 1 6/0.000 - CDS 5970 - 6473 699 ## COG3153 Predicted acetyltransferase 11 6 Op 2 . - CDS 6467 - 6991 721 ## COG3154 Putative lipid carrier protein - Prom 7032 - 7091 4.4 + Prom 7111 - 7170 4.4 12 7 Op 1 13/0.000 + CDS 7200 - 8195 867 ## COG0826 Collagenase and related proteases 13 7 Op 2 2/0.889 + CDS 8204 - 9082 986 ## COG0826 Collagenase and related proteases 14 7 Op 3 . + CDS 9163 - 10170 941 ## COG2141 Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases + Term 10224 - 10259 0.2 - Term 10212 - 10246 0.6 15 8 Tu 1 . - CDS 10288 - 11532 1620 ## COG0814 Amino acid permeases - Prom 11619 - 11678 7.9 - Term 11608 - 11642 3.5 16 9 Tu 1 . - CDS 11686 - 13575 2532 ## COG0513 Superfamily II DNA and RNA helicases - Prom 13601 - 13660 5.7 - Term 13683 - 13719 1.3 17 10 Op 1 6/0.000 - CDS 13755 - 14639 639 ## COG4785 Lipoprotein NlpI, contains TPR repeats - Term 14703 - 14736 5.9 18 10 Op 2 26/0.000 - CDS 14748 - 16883 188 ## PROTEIN SUPPORTED gi|229537485|ref|ZP_04426621.1| ribosomal protein S1 - Term 17080 - 17108 1.3 19 10 Op 3 14/0.000 - CDS 17130 - 17399 445 ## PROTEIN SUPPORTED gi|16131057|ref|NP_417634.1| 30S ribosomal subunit protein S15 - Prom 17476 - 17535 4.1 - Term 17417 - 17455 -0.8 20 10 Op 4 26/0.000 - CDS 17548 - 18492 1100 ## COG0130 Pseudouridine synthase 21 10 Op 5 32/0.000 - CDS 18492 - 18893 653 ## COG0858 Ribosome-binding factor A - Prom 18930 - 18989 2.3 - Term 18971 - 19029 1.2 22 10 Op 6 20/0.000 - CDS 19057 - 21729 3362 ## COG0532 Translation initiation factor 2 (IF-2; GTPase) 23 10 Op 7 32/0.000 - CDS 21754 - 23241 1026 ## PROTEIN SUPPORTED gi|17988250|ref|NP_540884.1| transcription elongation factor NusA 24 10 Op 8 . 
- CDS 23269 - 23691 429 ## COG0779 Uncharacterized protein conserved in bacteria - TRNA 23928 - 24004 86.1 # Met CAT 0 0 + Prom 24163 - 24222 4.4 25 11 Tu 1 . + CDS 24352 - 25695 1667 ## COG0137 Argininosuccinate synthase + Term 25703 - 25744 5.7 26 12 Tu 1 . - CDS 25703 - 27319 497 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase - Prom 27361 - 27420 5.7 - TRNA 27778 - 27864 70.3 # Leu GAG 0 0 - Term 27728 - 27770 9.2 27 13 Op 1 7/0.000 - CDS 27879 - 28211 384 ## COG1314 Preprotein translocase subunit SecG - Prom 28339 - 28398 4.0 - Term 28388 - 28415 -0.1 28 13 Op 2 9/0.000 - CDS 28439 - 29776 1652 ## COG1109 Phosphomannomutase 29 13 Op 3 . - CDS 29769 - 30617 852 ## COG0294 Dihydropteroate synthase and related enzymes Predicted protein(s) >gi|296494646|gb|ADTN01000092.1| GENE 1 48 - 1151 969 367 aa, chain + ## HITS:1 COG:yraM KEGG:ns NR:ns ## COG: yraM COG3107 # Protein_GI_number: 16131039 # Func_class: R General function prediction only # Function: Putative lipoprotein # Organism: Escherichia coli K12 # 1 367 312 678 678 669 99.0 0 MDGVASPAQASVSDLTGEQPAAQPVPVSAPATSTAAVSAPANPSAELKIYDTSSQPLSQI LSQVQQDGASIVVGPLLKNNVEELLKSNTPLNVLALNQPENIENRVNICYFALSPEDEAR DAARHIRDQGKQAPLVLIPRSSLGDRVANAFAQEWQKLGGGTVLQQKFGSTSELRAGVNG GSGIALTGSPITLRATTDSGMTTNNPTLQTTPTDDQFTNNGGRVDAVYIVATPGEIAFIK PMIAMRNGSQSGATLYASSRSAQGTAGPDFRLEMEGLQYSEIPMLAGGNLPLMQQALSAV NNDYSLARMYAMGVDAWSLANHFSQMRQVQGFEINGNTGSLTANPDCVINRNLSWLQYQQ GQVVPVS >gi|296494646|gb|ADTN01000092.1| GENE 2 1109 - 1504 216 131 aa, chain + ## HITS:1 COG:yraN KEGG:ns NR:ns ## COG: yraN COG0792 # Protein_GI_number: 16131040 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase # Organism: Escherichia coli K12 # 1 131 1 131 131 249 100.0 1e-66 MATVPTRSGSPRQLTTKQTGDAWEAQARRWLEGKGLRFIAANVNERGGEIDLIMREGRTT IFVEVRYRRSALYGGAAASVTRSKQHKLLQTARLWLARHNGSFDTVDCRFDVVAFTGNEV EWIKDAFNDHS >gi|296494646|gb|ADTN01000092.1| GENE 3 1524 - 
2114 450 196 aa, chain + ## HITS:1 COG:ECs4030 KEGG:ns NR:ns ## COG: ECs4030 COG0279 # Protein_GI_number: 15833284 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoheptose isomerase # Organism: Escherichia coli O157:H7 # 1 196 1 196 196 370 100.0 1e-103 MQERIKACFTESIQTQIAAAEALPDAISRAAMTLVQSLLNGNKILCCGNGTSAANAQHFA ASMINRFETERPSLPAIALNTDNVVLTAIANDRLHDEVYAKQVRALGHAGDVLLAISTRG NSRDIVKAVEAAVTRDMTIVALTGYDGGELAGLLGPQDVEIRIPSHRSARIQEMHMLTVN CLCDLIDNTLFPHQDD >gi|296494646|gb|ADTN01000092.1| GENE 4 2124 - 2699 605 191 aa, chain + ## HITS:1 COG:ECs4031 KEGG:ns NR:ns ## COG: ECs4031 COG2823 # Protein_GI_number: 15833285 # Func_class: R General function prediction only # Function: Predicted periplasmic or secreted lipoprotein # Organism: Escherichia coli O157:H7 # 1 191 1 191 191 296 100.0 2e-80 MKALSPIAVLISALLLQGCVAAAVVGTAAVGTKAATDPRSVGTQVDDGTLEVRVNSALSK DEQIKKEARINVTAYQGKVLLVGQSPNAELSARAKQIAMGVDGANEVYNEIRQGQPIGLG EASNDTWITTKVRSQLLTSDLVKSSNVKVTTENGEVFLMGLVTEREAKAAADIASRVSGV KRVTTAFTFIK >gi|296494646|gb|ADTN01000092.1| GENE 5 2813 - 3853 1277 346 aa, chain - ## HITS:1 COG:yraQ KEGG:ns NR:ns ## COG: yraQ COG0701 # Protein_GI_number: 16131043 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Escherichia coli K12 # 1 346 1 346 346 575 100.0 1e-164 MTGQSSSQAATPIQWWKPALFFLVVIAGLWYVKWEPYYGKAFTAAETHSIGKSILAQADA NPWQAALDYAMIYFLAVWKAAVLGVILGSLIQVLIPRDWLLRTLGQSRFRGTLLGTLFSL PGMMCTCCAAPVAAGMRRQQVSMGGALAFWMGNPVLNPATLVFMGFVLGWGFAAIRLVAG LVMVLLIATLVQKWVRETPQTQAPVEIDIPEAQGGFFSRWGRALWTLFWSTIPVYILAVL VLGAARVWLFPHADGAVDNSLMWVVAMAVAGCLFVIPTAAEIPIVQTMMLAGMGTAPALA LLMTLPAVSLPSLIMLRKAFPAKALWLTGAMVAVSGVIVGGLALLF >gi|296494646|gb|ADTN01000092.1| GENE 6 3926 - 4561 575 211 aa, chain - ## HITS:1 COG:yraR KEGG:ns NR:ns ## COG: yraR COG0702 # Protein_GI_number: 16131044 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate-sugar epimerases # Organism: 
Escherichia coli K12 # 1 211 16 226 226 418 99.0 1e-117 MSQVLITGATGLVGGHLLRMLINEPKVNAIAAPTRRPLGDMPGVFNPHDPQLSDALAQVT DPIDIVFCCLGTTRREAGSKEAFIHADYTLVVDTALTGRRLGAQHMLVVSAMGANAHSPF FYNRVKGEMEEALIAQNWPKLTIARPSMLLGARSKQRMNETLFAPLFRLLPGNWKSIDAR DVARVMLAESMRPEHEGVTILSSSELRKRAE >gi|296494646|gb|ADTN01000092.1| GENE 7 4689 - 5207 591 172 aa, chain + ## HITS:1 COG:ECs4034 KEGG:ns NR:ns ## COG: ECs4034 COG0693 # Protein_GI_number: 15833288 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Escherichia coli O157:H7 # 1 172 15 186 186 330 100.0 7e-91 MSKKIAVLITDEFEDSEFTSPADEFRKAGHEVITIEKQAGKTVKGKKGEASVTIDKSIDE VTPAEFDALLLPGGHSPDYLRGDNRFVTFTRDFVNSGKPVFAICHGPQLLISADVIRGRK LTAVKPIIIDVKNAGAEFYDQEVVVDKDQLVTSRTPDDLPAFNREALRLLGA >gi|296494646|gb|ADTN01000092.1| GENE 8 5187 - 5630 386 147 aa, chain - ## HITS:1 COG:yhbP KEGG:ns NR:ns ## COG: yhbP COG3787 # Protein_GI_number: 16131046 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 147 1 147 147 290 99.0 8e-79 METLIAISRWLAKQHVVTWCVQQEGELWCANAFYLFDAQKVAFYILTEEKPRHAQMSGPQ AAVAGTVNGQPKTVALIRGVQFKGEIRRLEGEESDLARKAYNRRFPVARMLSAPVWEIRL DEIKFTDNTLGFGKKMIWLRDSGTEQA >gi|296494646|gb|ADTN01000092.1| GENE 9 5681 - 5983 264 100 aa, chain + ## HITS:1 COG:yhbQ KEGG:ns NR:ns ## COG: yhbQ COG2827 # Protein_GI_number: 16131047 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease containing a URI domain # Organism: Escherichia coli K12 # 1 100 1 100 100 181 100.0 3e-46 MTPWFLYLIRTADNKLYTGITTDVERRYQQHQSGKGAKALRGKGELTLVFSAPVGDRSLA LRAEYRVKQLTKRQKERLVAEGAGFAELLSSLQTPEIKSD >gi|296494646|gb|ADTN01000092.1| GENE 10 5970 - 6473 699 167 aa, chain - ## HITS:1 COG:ECs4037 KEGG:ns NR:ns ## COG: ECs4037 COG3153 # Protein_GI_number: 15833291 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Escherichia coli O157:H7 # 1 167 1 167 167 332 100.0 
2e-91 MLIRVEIPIDAPGIDALLRRSFESDAEAKLVHDLREDGFLTLGLVATDDEGQVIGYVAFS PVDVQGEDLQWVGMAPLAVDEKYRGQGLARQLVYEGLDSLNEFGYAAVVTLGDPALYSRF GFELAAHHDLRCRWPGTESAFQVHRLADDALNGVTGLVEYHEHFNRF >gi|296494646|gb|ADTN01000092.1| GENE 11 6467 - 6991 721 174 aa, chain - ## HITS:1 COG:ECs4038 KEGG:ns NR:ns ## COG: ECs4038 COG3154 # Protein_GI_number: 15833292 # Func_class: I Lipid transport and metabolism # Function: Putative lipid carrier protein # Organism: Escherichia coli O157:H7 # 1 174 1 174 174 325 100.0 2e-89 MLDKLRSRIVHLGPSLLSVPVKLTPFALKRQVLEQVLSWQFRQALDDGELEFLEGRWLSI HVRDIDLQWFTSVVNGKLVVSQNAQADVSFSADASDLLMIAARKQDPDTLFFQRRLVIEG DTELGLYVKNLMDAIELEQMPKALRMMLLQLADFVEAGMKTAPETKQTSVGEPC >gi|296494646|gb|ADTN01000092.1| GENE 12 7200 - 8195 867 331 aa, chain + ## HITS:1 COG:yhbU KEGG:ns NR:ns ## COG: yhbU COG0826 # Protein_GI_number: 16131050 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Escherichia coli K12 # 1 331 1 331 331 678 100.0 0 MELLCPAGNLPALKAAIENGADAVYIGLKDDTNARHFAGLNFTEKKLQEAVSFVHQHRRK LHIAINTFAHPDGYARWQRAVDMAAQLGADALILADLAMLEYAAERYPHIERHVSVQASA TNEEAINFYHRHFDVARVVLPRVLSIHQVKQLARVTPVPLEVFAFGSLCIMSEGRCYLSS YLTGESPNTIGACSPARFVRWQQTPQGLESRLNEVLIDRYQDGENAGYPTLCKGRYLVDG ERYHALEEPTSLNTLELLPELMAANIASVKIEGRQRSPAYVSQVAKVWRQAIDRCKADPQ NFVPQSAWMETLGSMSEGTQTTLGAYHRKWQ >gi|296494646|gb|ADTN01000092.1| GENE 13 8204 - 9082 986 292 aa, chain + ## HITS:1 COG:yhbV KEGG:ns NR:ns ## COG: yhbV COG0826 # Protein_GI_number: 16131051 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Escherichia coli K12 # 1 292 7 298 298 600 100.0 1e-171 MKYSLGPVLWYWPKETLEEFYQQAATSSADVIYLGEAVCSKRRATKVGDWLEMAKSLAGS GKQIVLSTLALVQASSELGELKRYVENGEFLIEASDLGVVNMCAERKLPFVAGHALNCYN AVTLKILLKQGMMRWCMPVELSRDWLVNLLNQCDELGIRNQFEVEVLSYGHLPLAYSARC FTARSEDRPKDECETCCIKYPNGRNVLSQENQQVFVLNGIQTMSGYVYNLGNELASMQGL 
VDVVRLSPQGTDTFAMLDAFRANENGAAPLPLTANSDCNGYWRRLAGLELQA >gi|296494646|gb|ADTN01000092.1| GENE 14 9163 - 10170 941 335 aa, chain + ## HITS:1 COG:yhbW KEGG:ns NR:ns ## COG: yhbW COG2141 # Protein_GI_number: 16131052 # Func_class: C Energy production and conversion # Function: Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases # Organism: Escherichia coli K12 # 1 335 1 335 335 676 100.0 0 MTDKTIAFSLLDLAPIPEGSSAREAFSHSLDLARLAEKRGYHRYWLAEHHNMTGIASAAT SVLIGYLAANTTTLHLGSGGVMLPNHSPLVIAEQFGTLNTLYPGRIDLGLGRAPGSDQRT MMALRRHMSGDIDNFPRDVAELVDWFDARDPNPHVRPVPGYGEKIPVWLLGSSLYSAQLA AQLGLPFAFASHFAPDMLFQALHLYRSNFKPSARLEKPYAMVCINIIAADSNRDAEFLFT SMQQAFVKLRRGETGQLPPPIQNMDQFWSPSEQYGVQQALSMSLVGDKAKVRHGLQSILR ETDADEIMVNGQIFDHQARLHSFELAMDVKEELLG >gi|296494646|gb|ADTN01000092.1| GENE 15 10288 - 11532 1620 414 aa, chain - ## HITS:1 COG:mtr KEGG:ns NR:ns ## COG: mtr COG0814 # Protein_GI_number: 16131053 # Func_class: E Amino acid transport and metabolism # Function: Amino acid permeases # Organism: Escherichia coli K12 # 1 414 1 414 414 696 100.0 0 MATLTTTQTSPSLLGGVVIIGGTIIGAGMFSLPVVMSGAWFFWSMAALIFTWFCMLHSGL MILEANLNYRIGSSFDTITKDLLGKGWNVVNGISIAFVLYILTYAYISASGSILHHTFAE MSLNVPARAAGFGFALLVAFVVWLSTKAVSRMTAIVLGAKVITFFLTFGSLLGHVQPATL FNVAESNASYAPYLLMTLPFCLASFGYHGNVPSLMKYYGKDPKTIVKCLVYGTLMALALY TIWLLATMGNIPRPEFIGIAEKGGNIDVLVQALSGVLNSRSLDLLLVVFSNFAVASSFLG VTLGLFDYLADLFGFDDSAVGRLKTALLTFAPPVVGGLLFPNGFLYAIGYAGLAATIWAA IVPALLARASRKRFGSPKFRVWGGKPMIALILVFGVGNALVHILSSFNLLPVYQ >gi|296494646|gb|ADTN01000092.1| GENE 16 11686 - 13575 2532 629 aa, chain - ## HITS:1 COG:ECs4043 KEGG:ns NR:ns ## COG: ECs4043 COG0513 # Protein_GI_number: 15833297 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Escherichia coli O157:H7 # 1 629 18 646 646 1055 99.0 0 MAEFETTFADLGLKAPILEALNDLGYEKPSPIQAECIPHLLNGRDVLGMAQTGSGKTAAF 
SLPLLQNLDPELKAPQILVLAPTRELAVQVAEAMTDFSKHMRGVNVVALYGGQRYDVQLR ALRQGPQIVVGTPGRLLDHLKRGTLDLSKLSGLVLDEADEMLRMGFIEDVETIMAQIPEG HQTALFSATMPEAIRRITRRFMKEPQEVRIQSSVTTRPDISQSYWTVWGMRKNEALVRFL EAEDFDAAIIFVRTKNATLEVAEALERNGYNSAALNGDMNQALREQTLERLKDGRLDILI ATDVAARGLDVERISLVVNYDIPMDSESYVHRIGRTGRAGRAGRALLFVENRERRLLRNI ERTMKLTIPEVELPNAELLGKRRLEKFAAKVQQQLESSDLDQYRALLSKIQPTAEGEELD LETLAAALLKMAQGERTLIVPPDAPMRPKREFRDRDDRGPRDRNDRGPRGDREDRPRRER RDVGDMQLYRIEVGRDDGVEVRHIVGAIANEGDISSRYIGNIKLFASHSTIELPKGMPGE VLQHFTRTRILNKPMNMQLLGDAQPHTGGERRGGGRGFGGERREGGRNFSGERREGGRGD GRRFSGERREGRAPRRDDSTGRRRFGGDA >gi|296494646|gb|ADTN01000092.1| GENE 17 13755 - 14639 639 294 aa, chain - ## HITS:1 COG:ECs4044 KEGG:ns NR:ns ## COG: ECs4044 COG4785 # Protein_GI_number: 15833298 # Func_class: R General function prediction only # Function: Lipoprotein NlpI, contains TPR repeats # Organism: Escherichia coli O157:H7 # 1 294 1 294 294 561 100.0 1e-160 MKPFLRWCFVATALTLAGCSNTSWRKSEVLAVPLQPTLQQEVILARMEQILASRALTDDE RAQLLYERGVLYDSLGLRALARNDFSQALAIRPDMPEVFNYLGIYLTQAGNFDAAYEAFD SVLELDPTYNYAHLNRGIALYYGGRDKLAQDDLLAFYQDDPNDPFRSLWLYLAEQKLDEK QAKEVLKQHFEKSDKEQWGWNIVEFYLGNISEQTLMERLKADATDNTSLAEHLSETNFYL GKYYLSLGDLDSATALFKLAVANNVHNFVEHRYALLELSLLGQDQDDLAESDQQ >gi|296494646|gb|ADTN01000092.1| GENE 18 14748 - 16883 188 711 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229537485|ref|ZP_04426621.1| ribosomal protein S1 [Planctomyces limnophilus DSM 3776] # 621 703 433 516 557 77 46 1e-13 MLNPIVRKFQYGQHTVTLETGMMARQATAAVMVSMDDTAVFVTVVGQKKAKPGQDFFPLT VNYQERTYAAGRIPGSFFRREGRPSEGETLIARLIDRPIRPLFPEGFVNEVQVIATVVSV NPQVNPDIVAMIGASAALSLSGIPFNGPIGAARVGYINDQYVLNPTQDELKESKLDLVVA GTEAAVLMVESEAQLLSEDQMLGAVVFGHEQQQVVIQNINELVKEAGKPRWDWQPEPVNE ALNARVAALAEARLSDAYRITDKQERYAQVDVIKSETIATLLAEDETLDENELGEILHAI EKNVVRSRVLAGEPRIDGREKDMIRGLDVRTGVLPRTHGSALFTRGETQALVTATLGTAR DAQVLDELMGERTDTFLFHYNFPPYSVGETGMVGSPKRREIGHGRLAKRGVLAVMPDMDK FPYTVRVVSEITESNGSSSMASVCGASLALMDAGVPIKAAVAGIAMGLVKEGDNYVVLSD ILGDEDHLGDMDFKVAGSRDGISALQMDIKIEGITKEIMQVALNQAKGARLHILGVMEQA 
INAPRGDISEFAPRIHTIKINPDKIKDVIGKGGSVIRALTEETGTTIEIEDDGTVKIAAT DGEKAKHAIRRIEEITAEIEVGRVYTGKVTRIVDFGAFVAIGGGKEGLVHISQIADKRVE KVTDYLQMGQEVPVKVLEVDRQGRIRLSIKEATEQSQPAAAPEAPAAEQGE >gi|296494646|gb|ADTN01000092.1| GENE 19 17130 - 17399 445 89 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|16131057|ref|NP_417634.1| 30S ribosomal subunit protein S15 [Escherichia coli str. K-12 substr. MG1655] # 1 89 1 89 89 176 100 2e-43 MSLSTEATAKIVSEFGRDANDTGSTEVQVALLTAQINHLQGHFAEHKKDHHSRRGLLRMV SQRRKLLDYLKRKDVARYTQLIERLGLRR >gi|296494646|gb|ADTN01000092.1| GENE 20 17548 - 18492 1100 314 aa, chain - ## HITS:1 COG:ECs4047 KEGG:ns NR:ns ## COG: ECs4047 COG0130 # Protein_GI_number: 15833301 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridine synthase # Organism: Escherichia coli O157:H7 # 1 314 1 314 314 615 100.0 1e-176 MSRPRRRGRDINGVLLLDKPQGMSSNDALQKVKRIYNANRAGHTGALDPLATGMLPICLG EATKFSQYLLDSDKRYRVIARLGQRTDTSDADGQIVEERPVTFSAEQLAAALDTFRGDIE QIPSMYSALKYQGKKLYEYARQGIEVPREARPITVYELLFIRHEGNELELEIHCSKGTYI RTIIDDLGEKLGCGAHVIYLRRLAVSKYPVERMVTLEHLRELVEQAEQQDIPAAELLDPL LMPMDSPASDYPVVNLPLTSSVYFKNGNPVRTSGAPLEGLVRVTEGENGKFIGMGEIDDE GRVAPRRLVVEYPA >gi|296494646|gb|ADTN01000092.1| GENE 21 18492 - 18893 653 133 aa, chain - ## HITS:1 COG:ECs4048 KEGG:ns NR:ns ## COG: ECs4048 COG0858 # Protein_GI_number: 15833302 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-binding factor A # Organism: Escherichia coli O157:H7 # 1 133 1 133 133 239 100.0 1e-63 MAKEFGRPQRVAQEMQKEIALILQREIKDPRLGMMTTVSGVEMSRDLAYAKVYVTFLNDK DEDAVKAGIKALQEASGFIRSLLGKAMRLRIVPELTFFYDNSLVEGMRMSNLVTSVVKHD EERRVNPDDSKED >gi|296494646|gb|ADTN01000092.1| GENE 22 19057 - 21729 3362 890 aa, chain - ## HITS:1 COG:ECs4049 KEGG:ns NR:ns ## COG: ECs4049 COG0532 # Protein_GI_number: 15833303 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 2 (IF-2; GTPase) # Organism: Escherichia coli O157:H7 # 1 890 1 890 890 1394 100.0 0 
MTDVTIKTLAAERQTSVERLVQQFADAGIRKSADDSVSAQEKQTLIDHLNQKNSGPDKLT LQRKTRSTLNIPGTGGKSKSVQIEVRKKRTFVKRDPQEAERLAAEEQAQREAEEQARREA EESAKREAQQKAEREAAEQAKREAAEQAKREAAEKDKVSNQQDDMTKNAQAEKARREQEA AELKRKAEEEARRKLEEEARRVAEEARRMAEENKWTDNAEPTEDSSDYHVTTSQHARQAE DESDREVEGGRGRGRNAKAARPKKGNKHAESKADREEARAAVRGGKGGKRKGSSLQQGFQ KPAQAVNRDVVIGETITVGELANKMAVKGSQVIKAMMKLGAMATINQVIDQETAQLVAEE MGHKVILRRENELEEAVMSDRDTGAAAEPRAPVVTIMGHVDHGKTSLLDYIRSTKVASGE AGGITQHIGAYHVETENGMITFLDTPGHAAFTSMRARGAQATDIVVLVVAADDGVMPQTI EAIQHAKAAQVPVVVAVNKIDKPEADPDRVKNELSQYGILPEEWGGESQFVHVSAKAGTG IDELLDAILLQAEVLELKAVRKGMASGAVIESFLDKGRGPVATVLVREGTLHKGDIVLCG FEYGRVRAMRNELGQEVLEAGPSIPVEILGLSGVPAAGDEVTVVRDEKKAREVALYRQGK FREVKLARQQKSKLENMFANMTEGEVHEVNIVLKADVQGSVEAISDSLLKLSTDEVKVKI IGSGVGGITETDATLAAASNAILVGFNVRADASARKVIEAESLDLRYYSVIYNLIDEVKA AMSGMLSPELKQQIIGLAEVRDVFKSPKFGAIAGCMVTEGVVKRHNPIRVLRDNVVIYEG ELESLRRFKDDVNEVRNGMECGIGVKNYNDVRTGDVIEVFEIIEIQRTIA >gi|296494646|gb|ADTN01000092.1| GENE 23 21754 - 23241 1026 495 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|17988250|ref|NP_540884.1| transcription elongation factor NusA [Brucella melitensis 16M] # 4 482 9 483 537 399 43 1e-111 MNKEILAVVEAVSNEKALPREKIFEALESALATATKKKYEQEIDVRVQIDRKSGDFDTFR RWLVVDEVTQPTKEITLEAARYEDESLNLGDYVEDQIESVTFDRITTQTAKQVIVQKVRE AERAMVVDQFREHEGEIITGVVKKVNRDNISLDLGNNAEAVILREDMLPRENFRPGDRVR GVLYSVRPEARGAQLFVTRSKPEMLIELFRIEVPEIGEEVIEIKAAARDPGSRAKIAVKT NDKRIDPVGACVGMRGARVQAVSTELGGERIDIVLWDDNPAQFVINAMAPADVASIVVDE DKHTMDIAVEAGNLAQAIGRNGQNVRLASQLSGWELNVMTVDDLQAKHQAEAHAAIDTFT KYLDIDEDFATVLVEEGFSTLEELAYVPMKELLEIEGLDEPTVEALRERAKNALATIAQA QEESLGDNKPADDLLNLEGVDRDLAFKLAARGVCTLEDLAEQGIDDLADIEGLTDEKAGA LIMAARNICWFGDEA >gi|296494646|gb|ADTN01000092.1| GENE 24 23269 - 23691 429 140 aa, chain - ## HITS:1 COG:ECs4051 KEGG:ns NR:ns ## COG: ECs4051 COG0779 # Protein_GI_number: 15833305 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 140 13 152 152 273 100.0 5e-74 
MITAPVEALGFELVGIEFIRGRTSTLRIYIDSEDGINVDDCADVSHQVSAVLDVEDPITV AYNLEVSSPGLDRPLFTAEHYARFVGEEVTLVLRMAVQNRRKWQGVIKAVDGEMITVTVE GKDEVFALSNIQKANLVPHF >gi|296494646|gb|ADTN01000092.1| GENE 25 24352 - 25695 1667 447 aa, chain + ## HITS:1 COG:argG KEGG:ns NR:ns ## COG: argG COG0137 # Protein_GI_number: 16131063 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate synthase # Organism: Escherichia coli K12 # 1 447 1 447 447 910 99.0 0 MTTILKHLPVGQRIGIAFSGGLDTSAALLWMRQKGAVPYAYTANLGQPDEEDYDAIPRRA MEYGAENARLIDCRKQLVAEGIAAIQCGAFHNTTGGLTYFNTTPLGRAVTGTMLVAAMKE DGVNIWGDGSTYKGNDIERFYRYGLLTNAELQIYKPWLDTDFIDELGGRHEMSEFMIACG FDYKKSVEKAYSTDSNMLGATHEAKDLEYLNSSVKIVNPIMGVKFWDESVKIPAEEVTVR FEQGHPVALNGKTFSDDVEMMLEANRIGGRHGLGMSDQIENRIIEAKSRGIYEAPGMALL HIAYERLLTGIHNEDTIEQYHAHGRQLGRLLYQGRWFDSQALMLRDSLQRWVASQITGEV TLELRRGNDYSILNTVSENLTYKPERLTMEKGDSVFSPDDRIGQLTMRNLDITDTREKLF GYAKTGLLSSSAASGVPQVENLENKGQ >gi|296494646|gb|ADTN01000092.1| GENE 26 25703 - 27319 497 538 aa, chain - ## HITS:1 COG:yhbX KEGG:ns NR:ns ## COG: yhbX COG2194 # Protein_GI_number: 16131064 # Func_class: R General function prediction only # Function: Predicted membrane-associated, metal-dependent hydrolase # Organism: Escherichia coli K12 # 1 538 7 547 547 1082 99.0 0 MTVFNKFARTFKSHWLLYLCVIVFGITNLVASSGAHMVQRLLFFVLTILVVKRISSLPLR LLVAAPFVLLTAADMSISLYSWCTFGTTFNDGFAISVLQSDPDEVVKMLGMYIPYLCAFA FLSLLFLAVIIKYDVSLPTKKVTGILLLIVISGSLFSACQFAYKDAKNKKAFSPYILASR FATYTPFFNLNYFALAAKEHQRLLSIANTVPYFQLSVRDTGIDTYVLIVGESVRVDNMSL YGYTRSTTPQVEAQRKQIKLFNQAISGAPYTALSVPLSLTADSVLSHDIHNYPDNIINMA NQAGFQTFWLSSQSAFRQNGTAVTSIAMETVYVRGFDELLLPHLSQALQQNTQQKKLIVL HLNGSHEPACSAYPQSSAVFQPQDDQDACYDNSIHYTDSLLGQVFELLKDRRASVMYFAD HGLERDPTKKNVYFHGGREASQQAYHVPMFIWYSPVLGDGVDRTTENNIFSTAYNNYLIN AWMGVTKPEQPQTLEEVIAHYKGDSRVVDANHDVFDYVMLRKEFTEDKQGNPTPEGQG >gi|296494646|gb|ADTN01000092.1| GENE 27 27879 - 28211 384 110 aa, chain - ## HITS:1 COG:ECs4054 KEGG:ns NR:ns ## COG: ECs4054 COG1314 # Protein_GI_number: 15833308 # Func_class: U Intracellular 
trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecG # Organism: Escherichia coli O157:H7 # 1 110 1 110 110 185 100.0 2e-47 MYEALLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLFGSSGSGNFMTRMTALLA TLFFIISLVLGNINSNKTNKGSEWENLSAPAKTEQTQPAAPAKPTSDIPN >gi|296494646|gb|ADTN01000092.1| GENE 28 28439 - 29776 1652 445 aa, chain - ## HITS:1 COG:mrsA KEGG:ns NR:ns ## COG: mrsA COG1109 # Protein_GI_number: 16131066 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Escherichia coli K12 # 1 445 1 445 445 836 100.0 0 MSNRKYFGTDGIRGRVGDAPITPDFVLKLGWAAGKVLARHGSRKIIIGKDTRISGYMLES ALEAGLAAAGLSALFTGPMPTPAVAYLTRTFRAEAGIVISASHNPFYDNGIKFFSIDGTK LPDAVEEAIEAEMEKEISCVDSAELGKASRIVDAAGRYIEFCKATFPNELSLSELKIVVD CANGATYHIAPNVLRELGANVIAIGCEPNGVNINAEVGATDVRALQARVLAEKADLGIAF DGDGDRVIMVDHEGNKVDGDQIMYIIAREGLRQGQLRGGAVGTLMSNMGLELALKQLGIP FARAKVGDRYVLEKMQEKGWRIGAENSGHVILLDKTTTGDGIVAGLQVLAAMARNHMSLH DLCSGMKMFPQILVNVRYTAGSGDPLEHESVKAVTAEVEAALGNRGRVLLRKSGTEPLIR VMVEGEDEAQVTEFAHRIADAVKAV >gi|296494646|gb|ADTN01000092.1| GENE 29 29769 - 30617 852 282 aa, chain - ## HITS:1 COG:ECs4056 KEGG:ns NR:ns ## COG: ECs4056 COG0294 # Protein_GI_number: 15833310 # Func_class: H Coenzyme transport and metabolism # Function: Dihydropteroate synthase and related enzymes # Organism: Escherichia coli O157:H7 # 1 282 16 297 297 556 100.0 1e-158 MKLFAQGTSLDLSHPHVMGILNVTPDSFSDGGTHNSLIDAVKHANLMINAGATIIDVGGE STRPGAAEVSVEEELQRVIPVVEAIAQRFEVWISVDTSKPEVIRESAKVGAHIINDIRSL SEPGALEAAAETGLPVCLMHMQGNPKTMQEAPKYDDVFAEVNRYFIEQIARCEQAGIAKE KLLLDPGFGFGKNLSHNYSLLARLAEFHHFNLPLLVGMSRKSMIGQLLNVGPSERLSGSL ACAVIAAMQGAHIIRVHDVKETVEAMRVVEATLSAKENKRYE Prediction of potential genes in microbial genomes Time: Sun May 15 23:26:35 2011 Seq name: gi|296494645|gb|ADTN01000093.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont238.5, whole genome shotgun sequence Length of sequence - 7856 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 5, 
operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 13/0.000 - CDS 41 - 1984 1683 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 - Prom 2014 - 2073 4.2 2 1 Op 2 . - CDS 2075 - 2704 591 ## COG0293 23S rRNA methylase - Prom 2782 - 2841 4.2 + Prom 2603 - 2662 3.0 3 2 Tu 1 . + CDS 2830 - 3123 454 ## PROTEIN SUPPORTED gi|188532496|ref|YP_001906293.1| Predicted RNA-binding protein containing KH domain, possibly ribosomal protein + Term 3209 - 3248 7.4 - Term 3204 - 3230 1.7 4 3 Tu 1 . - CDS 3279 - 3755 583 ## COG0782 Transcription elongation factor - Prom 3902 - 3961 5.7 + Prom 3841 - 3900 3.2 5 4 Tu 1 . + CDS 4117 - 5436 1036 ## COG2027 D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 4) + Term 5444 - 5484 4.9 - Term 5432 - 5470 9.2 6 5 Op 1 6/0.250 - CDS 5622 - 6794 1409 ## COG0536 Predicted GTPase 7 5 Op 2 . - CDS 6810 - 7775 751 ## PROTEIN SUPPORTED gi|46133178|ref|ZP_00156740.2| COG0697: Permeases of the drug/metabolite transporter (DMT) superfamily Predicted protein(s) >gi|296494645|gb|ADTN01000093.1| GENE 1 41 - 1984 1683 647 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. 
McKiel] # 1 599 1 597 636 652 56 0.0 MSDMAKNLILWLVIAVVLMSVFQSFGPSESNGRKVDYSTFLQEVNNDQVREARINGREIN VTKKDSNRYTTYIPVQDPKLLDNLLTKNVKVVGEPPEEPSLLASIFISWFPMLLLIGVWI FFMRQMQGGGGKGAMSFGKSKARMLTEDQIKTTFADVAGCDEAKEEVAELVEYLREPSRF QKLGGKIPKGVLMVGPPGTGKTLLAKAIAGEAKVPFFTISGSDFVEMFVGVGASRVRDMF EQAKKAAPCIIFIDEIDAVGRQRGAGLGGGHDEREQTLNQMLVEMDGFEGNEGIIVIAAT NRPDVLDPALLRPGRFDRQVVVGLPDVRGREQILKVHMRRVPLAPDIDAAIIARGTPGFS GADLANLVNEAALFAARGNKRVVSMVEFEKAKDKIMMGAERRSMVMTEAQKESTAYHEAG HAIIGRLVPEHDPVHKVTIIPRGRALGVTFFLPEGDAISASRQKLESQISTLYGGRLAEE IIYGPEHVSTGASNDIKVATNLARNMVTQWGFSEKLGPLLYAEEEGEVFLGRSVAKAKHM SDETARIIDQEVKALIERNYNRARQLLTDNMDILHAMKDALMKYETIDAPQIDDLMARRD VRPPAGWEEPGASNNSGDNGSPKAPRPVDEPRTPNPGNTMSEQLGDK >gi|296494645|gb|ADTN01000093.1| GENE 2 2075 - 2704 591 209 aa, chain - ## HITS:1 COG:ECs4058 KEGG:ns NR:ns ## COG: ECs4058 COG0293 # Protein_GI_number: 15833312 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 23S rRNA methylase # Organism: Escherichia coli O157:H7 # 1 209 1 209 209 409 100.0 1e-114 MTGKKRSASSSRWLQEHFSDKYVQQAQKKGLRSRAWFKLDEIQQSDKLFKPGMTVVDLGA APGGWSQYVVTQIGGKGRIIACDLLPMDPIVGVDFLQGDFRDELVMKALLERVGDSKVQV VMSDMAPNMSGTPAVDIPRAMYLVELALEMCRDVLAPGGSFVVKVFQGEGFDEYLREIRS LFTKVKVRKPDSSRARSREVYIVATGRKP >gi|296494645|gb|ADTN01000093.1| GENE 3 2830 - 3123 454 97 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|188532496|ref|YP_001906293.1| Predicted RNA-binding protein containing KH domain, possibly ribosomal protein [Erwinia tasmaniensis Et1/99] # 1 97 1 97 97 179 93 6e-45 MNLSTKQKQHLKGLAHPLKPVVLLGSNGLTEGVLAEIEQALEHHELIKVKIATEDRETKT LIVEAIVRETGACNVQVIGKTLVLYRPTKERKISLPR >gi|296494645|gb|ADTN01000093.1| GENE 4 3279 - 3755 583 158 aa, chain - ## HITS:1 COG:STM3299 KEGG:ns NR:ns ## COG: STM3299 COG0782 # Protein_GI_number: 16766595 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Salmonella typhimurium LT2 # 1 158 1 158 158 282 96.0 2e-76 MQAIPMTLRGAEKLREELDFLKSVRRPEIIAAIAEAREHGDLKENAEYHAAREQQGFCEG 
RIKDIEAKLSNAQVIDVTKMPNNGRVIFGATVTVLNLDSDEEQTYRIVGDDEADFKQNLI SVNSPIARGLIGKEEDDVVVIKTPGGEVEFEVIKVEYL >gi|296494645|gb|ADTN01000093.1| GENE 5 4117 - 5436 1036 439 aa, chain + ## HITS:1 COG:dacB KEGG:ns NR:ns ## COG: dacB COG2027 # Protein_GI_number: 16131072 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase (penicillin-binding protein 4) # Organism: Escherichia coli K12 # 1 439 39 477 477 882 100.0 0 MVQKVGASAPAIDYHSQQMALPASTQKVITALAALIQLGPDFRFTTTLETKGNVENGVLK GDLVARFGADPTLKRQDIRNMVATLKKSGVNQIDGNVLIDTSIFASHDKAPGWPWNDMTQ CFSAPPAAAIVDRNCFSVSLYSAPKPGDMAFIRVASYYPVTMFSQVRTLPRGSAEAQYCE LDVVPGDLNRFTLTGCLPQRSEPLPLAFAVQDGASYAGAILKDELKQAGITWSGTLLRQT QVNEPGTVVASKQSAPLHDLLKIMLKKSDNMIADTVFRMIGHARFNVPGTWRAGSDAVRQ ILRQQAGVDIGNTIIADGSGLSRHNLIAPATMMQVLQYIAQHDNELNFISMLPLAGYDGS LQYRAGLHQAGVDGKVSAKTGSLQGVYNLAGFITTASGQRMAFVQYLSGYAVEPADQRNR RIPLVRFESRLYKDIYQNN >gi|296494645|gb|ADTN01000093.1| GENE 6 5622 - 6794 1409 390 aa, chain - ## HITS:1 COG:yhbZ KEGG:ns NR:ns ## COG: yhbZ COG0536 # Protein_GI_number: 16131073 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Escherichia coli K12 # 1 390 1 390 390 690 100.0 0 MKFVDEASILVVAGDGGNGCVSFRREKYIPKGGPDGGDGGDGGDVWMEADENLNTLIDYR FEKSFRAERGQNGASRDCTGKRGKDVTIKVPVGTRVIDQGTGETMGDMTKHGQRLLVAKG GWHGLGNTRFKSSVNRTPRQKTNGTPGDKRELLLELMLLADVGMLGMPNAGKSTFIRAVS AAKPKVADYPFTTLVPSLGVVRMDNEKSFVVADIPGLIEGAAEGAGLGIRFLKHLERCRV LLHLIDIDPIDGTDPVENARIIISELEKYSQDLATKPRWLVFNKIDLLDKVEAEEKAKAI AEALGWEDKYYLISAASGLGVKDLCWDVMTFIIENPVVQAEEAKQPEKVEFMWDDYHRQQ LEEIAEEDDEDWDDDWDEDDEEGVEFIYKR >gi|296494645|gb|ADTN01000093.1| GENE 7 6810 - 7775 751 321 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46133178|ref|ZP_00156740.2| COG0697: Permeases of the drug/metabolite transporter (DMT) superfamily [Haemophilus influenzae R2866] # 1 303 1 300 306 293 49 2e-79 MKQQAGIGILLALTTAICWGALPIAMKQVLEVMEPPTIVFYRFLMASIGLGAILAVKKRL PPLRVFRKPRWLILLAVATAGLFGNFILFSSSLQYLSPTASQVIGQLSPVGMMVASVFIL 
KEKMRSTQVVGALMLLSGLVMFFNTSLVEIFTKLTDYTWGVIFGVGAATVWVSYGVAQKV LLRRLASPQILFLLYTLCTIALFPLAKPGVIAQLSHWQLACLIFCGLNTLVGYGALAEAM ARWQAAQVSAIITLTPLFTLFFSDLLSLAWPDFFARPMLNLLGYLGAFVVVAGAMYSAIG HRIWGGLRKHTTVVSQPRAGE Prediction of potential genes in microbial genomes Time: Sun May 15 23:26:37 2011 Seq name: gi|296494644|gb|ADTN01000094.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont238.6, whole genome shotgun sequence Length of sequence - 1852 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 32/0.000 - CDS 44 - 301 437 ## PROTEIN SUPPORTED gi|15803725|ref|NP_289759.1| 50S ribosomal protein L27 2 1 Op 2 . - CDS 322 - 657 562 ## PROTEIN SUPPORTED gi|224585100|ref|YP_002638899.1| 50S ribosomal protein L21 - Prom 699 - 758 3.6 + Prom 759 - 818 3.2 3 2 Tu 1 . + CDS 892 - 1852 1014 ## COG0142 Geranylgeranyl pyrophosphate synthase Predicted protein(s) >gi|296494644|gb|ADTN01000094.1| GENE 1 44 - 301 437 85 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803725|ref|NP_289759.1| 50S ribosomal protein L27 [Escherichia coli O157:H7 EDL933] # 1 85 1 85 85 172 100 1e-43 MAHKKAGGSTRNGRDSEAKRLGVKRFGGESVLAGSIIVRQRGTKFHAGANVGCGRDHTLF AKADGKVKFEVKGPKNRKFISIEAE >gi|296494644|gb|ADTN01000094.1| GENE 2 322 - 657 562 111 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|224585100|ref|YP_002638899.1| 50S ribosomal protein L21 [Salmonella enterica subsp. 
enterica serovar Paratyphi C strain RKS4594] # 1 111 1 111 111 221 97 5e-58 MCAEAEFYMYAVFQSGGKQHRVSEGQTVRLEKLDIATGETVEFAEVLMIANGEEVKIGVP FVDGGVIKAEVVAHGRGEKVKIVKFRRRKHYRKQQGHRQWFTDVKITGISA >gi|296494644|gb|ADTN01000094.1| GENE 3 892 - 1852 1014 320 aa, chain + ## HITS:1 COG:ispB KEGG:ns NR:ns ## COG: ispB COG0142 # Protein_GI_number: 16131077 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Escherichia coli K12 # 1 320 1 320 323 617 100.0 1e-177 MNLEKINELTAQDMAGVNAAILEQLNSDVQLINQLGYYIVSGGGKRIRPMIAVLAARAVG YEGNAHVTIAALIEFIHTATLLHDDVVDESDMRRGKATANAAFGNAASVLVGDFIYTRAF QMMTSLGSLKVLEVMSEAVNVIAEGEVLQLMNVNDPDITEENYMRVIYSKTARLFEAAAQ CSGILAGCTPEEEKGLQDYGRYLGTAFQLIDDLLDYNADGEQLGKNVGDDLNEGKPTLPL LHAMHHGTPEQAQMIRTAIEQGNGRHLLEPVLEAMNACGSLEWTRQRAEEEADKAIAALQ VLPDTPWREALIGLAHIAVQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:26:50 2011 Seq name: gi|296494643|gb|ADTN01000095.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont238.7, whole genome shotgun sequence Length of sequence - 38744 bp Number of predicted genes - 38, with homology - 38 Number of transcription units - 14, operones - 7 average op.length - 4.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 30 - 89 3.0 1 1 Tu 1 . 
+ CDS 200 - 478 337 ## COG3423 Predicted transcriptional regulator + Term 482 - 536 7.2 2 2 Op 1 11/0.000 - CDS 526 - 1785 1526 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase 3 2 Op 2 6/0.000 - CDS 1840 - 2109 328 ## COG5007 Predicted transcriptional regulator, BolA superfamily - Prom 2155 - 2214 1.6 - Term 2151 - 2203 6.0 4 2 Op 3 10/0.000 - CDS 2254 - 2547 356 ## COG3113 Predicted NTP binding protein (contains STAS domain) 5 2 Op 4 13/0.000 - CDS 2547 - 3182 898 ## COG2854 ABC-type transport system involved in resistance to organic solvents, auxiliary component 6 2 Op 5 16/0.000 - CDS 3201 - 3752 631 ## COG1463 ABC-type transport system involved in resistance to organic solvents, periplasmic component 7 2 Op 6 23/0.000 - CDS 3757 - 4539 809 ## COG0767 ABC-type transport system involved in resistance to organic solvents, permease component 8 2 Op 7 . - CDS 4547 - 5356 604 ## COG1127 ABC-type transport system involved in resistance to organic solvents, ATPase component - Prom 5424 - 5483 4.4 + Prom 5444 - 5503 5.5 9 3 Op 1 6/0.000 + CDS 5566 - 6543 709 ## COG0530 Ca2+/Na+ antiporter 10 3 Op 2 13/0.000 + CDS 6557 - 7543 920 ## COG0794 Predicted sugar phosphate isomerase involved in capsule formation 11 3 Op 3 11/0.000 + CDS 7564 - 8130 755 ## COG1778 Low specificity phosphatase (HAD superfamily) 12 3 Op 4 12/0.000 + CDS 8127 - 8702 439 ## COG3117 Uncharacterized protein conserved in bacteria 13 3 Op 5 19/0.000 + CDS 8671 - 9228 558 ## COG1934 Uncharacterized protein conserved in bacteria 14 3 Op 6 17/0.000 + CDS 9235 - 9960 287 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 15 3 Op 7 11/0.000 + CDS 10008 - 11441 1453 ## COG1508 DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog 16 3 Op 8 11/0.000 + CDS 11464 - 11751 462 ## PROTEIN SUPPORTED gi|227335124|ref|ZP_03838780.1| hypothetical protein CIT292_04930 + Term 11780 - 11816 6.4 17 3 Op 9 8/0.000 + CDS 11869 - 12360 398 ## COG1762 Phosphotransferase system 
mannitol/fructose-specific IIA domain (Ntr-type) 18 3 Op 10 7/0.000 + CDS 12406 - 13260 885 ## COG1660 Predicted P-loop-containing kinase 19 3 Op 11 . + CDS 13257 - 13529 261 ## COG1925 Phosphotransferase system, HPr-related proteins + Prom 13573 - 13632 2.6 20 4 Tu 1 . + CDS 13743 - 14375 491 ## B21_03023 hypothetical protein - Term 14138 - 14178 3.7 21 5 Op 1 3/1.000 - CDS 14372 - 15079 513 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) 22 5 Op 2 2/1.000 - CDS 15097 - 15750 816 ## COG3155 Uncharacterized protein involved in an early stage of isoprenoid biosynthesis - Prom 15804 - 15863 3.7 23 6 Op 1 4/0.750 - CDS 15980 - 18316 2779 ## COG0642 Signal transduction histidine kinase - Prom 18340 - 18399 3.4 - Term 18367 - 18410 -0.6 24 6 Op 2 . - CDS 18412 - 19317 927 ## COG1242 Predicted Fe-S oxidoreductase + Prom 19787 - 19846 1.9 25 7 Op 1 21/0.000 + CDS 20016 - 24476 4800 ## COG0069 Glutamate synthase domain 2 26 7 Op 2 . + CDS 24489 - 25907 1646 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases + Term 25981 - 26014 -0.6 + Prom 26381 - 26440 13.2 27 8 Tu 1 . + CDS 26467 - 27231 538 ## JW3181 periplasmic protein + Term 27285 - 27326 4.0 + Prom 27328 - 27387 5.7 28 9 Op 1 10/0.000 + CDS 27445 - 28077 351 ## COG3121 P pilus assembly protein, chaperone PapD 29 9 Op 2 . + CDS 28098 - 30479 1033 ## COG3188 P pilus assembly protein, porin PapC 30 9 Op 3 . + CDS 30476 - 31021 343 ## gi|256024208|ref|ZP_05438073.1| hypothetical protein E4_12593 + Prom 31036 - 31095 2.3 31 10 Tu 1 . + CDS 31231 - 31734 246 ## JW3188 predicted transcriptional regulator + Prom 31763 - 31822 1.8 32 11 Tu 1 . 
+ CDS 31919 - 33046 542 ## PROTEIN SUPPORTED gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 + Term 33064 - 33096 4.9 33 12 Op 1 3/1.000 - CDS 33106 - 33570 383 ## COG2731 Beta-galactosidase, beta subunit 34 12 Op 2 5/0.250 - CDS 33567 - 34475 314 ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 35 12 Op 3 1/1.000 - CDS 34439 - 35128 792 ## COG3010 Putative N-acetylmannosamine-6-phosphate epimerase - Term 35137 - 35168 2.4 36 12 Op 4 4/0.750 - CDS 35176 - 36666 1720 ## COG0477 Permeases of the major facilitator superfamily - Prom 36711 - 36770 2.9 37 13 Tu 1 . - CDS 36775 - 37668 947 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase - Prom 37703 - 37762 3.7 38 14 Tu 1 . - CDS 37790 - 38581 959 ## COG2186 Transcriptional regulators - Prom 38666 - 38725 3.7 Predicted protein(s) >gi|296494643|gb|ADTN01000095.1| GENE 1 200 - 478 337 92 aa, chain + ## HITS:1 COG:Znlp KEGG:ns NR:ns ## COG: Znlp COG3423 # Protein_GI_number: 15803728 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Escherichia coli O157:H7 EDL933 # 1 92 1 92 92 174 100.0 4e-44 MESNFIDWHPADIIAGLRKKGTSMAAESRRNGLSSSTLANALSRPWPKGEMIIAKALGTD PWVIWPSRYHDPQTHEFIDRTQLMRSYTKPKK >gi|296494643|gb|ADTN01000095.1| GENE 2 526 - 1785 1526 419 aa, chain - ## HITS:1 COG:murA KEGG:ns NR:ns ## COG: murA COG0766 # Protein_GI_number: 16131079 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Escherichia coli K12 # 1 419 1 419 419 812 100.0 0 MDKFRVQGPTKLQGEVTISGAKNAALPILFAALLAEEPVEIQNVPKLKDVDTSMKLLSQL GAKVERNGSVHIDARDVNVFCAPYDLVKTMRASIWALGPLVARFGQGQVSLPGGCTIGAR PVDLHISGLEQLGATIKLEEGYVKASVDGRLKGAHIVMDKVSVGATVTIMCAATLAEGTT IIENAAREPEIVDTANFLITLGAKISGQGTDRIVIEGVERLGGGVYRVLPDRIETGTFLV AAAISRGKIICRNAQPDTLDAVLAKLRDAGADIEVGEDWISLDMHGKRPKAVNVRTAPHP AFPTDMQAQFTLLNLVAEGTGFITETVFENRFMHVPELSRMGAHAEIESNTVICHGVEKL SGAQVMATDLRASASLVLAGCIAEGTTVVDRIYHIDRGYERIEDKLRALGANIERVKGE 
>gi|296494643|gb|ADTN01000095.1| GENE 3 1840 - 2109 328 89 aa, chain - ## HITS:1 COG:ECs4069 KEGG:ns NR:ns ## COG: ECs4069 COG5007 # Protein_GI_number: 15833323 # Func_class: K Transcription # Function: Predicted transcriptional regulator, BolA superfamily # Organism: Escherichia coli O157:H7 # 1 89 1 89 89 179 100.0 1e-45 MIEDPMENNEIQSVLMNALSLQEVHVSGDGSHFQVIAVGELFDGMSRVKKQQTVYGPLME YIADNRIHAVSIKAYTPAEWARDRKLNGF >gi|296494643|gb|ADTN01000095.1| GENE 4 2254 - 2547 356 97 aa, chain - ## HITS:1 COG:yrbB KEGG:ns NR:ns ## COG: yrbB COG3113 # Protein_GI_number: 16131081 # Func_class: R General function prediction only # Function: Predicted NTP binding protein (contains STAS domain) # Organism: Escherichia coli K12 # 1 97 33 129 129 170 100.0 6e-43 MSESLSWMQTGDTLALSGELDQDVLLPLWEMREEAVKGITCIDLSRVSRVDTGGLALLLH LIDLAKKQGNNVTLQGVNDKVYTLAKLYNLPADVLPR >gi|296494643|gb|ADTN01000095.1| GENE 5 2547 - 3182 898 211 aa, chain - ## HITS:1 COG:yrbC KEGG:ns NR:ns ## COG: yrbC COG2854 # Protein_GI_number: 16131082 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: ABC-type transport system involved in resistance to organic solvents, auxiliary component # Organism: Escherichia coli K12 # 1 211 1 211 211 394 100.0 1e-109 MFKRLMMVALLVIAPLSAATAADQTNPYKLMDEAAQKTFDRLKNEQPQIRANPDYLRTIV DQELLPYVQVKYAGALVLGQYYKSATPAQREAYFAAFREYLKQAYGQALAMYHGQTYQIA PEQPLGDKTIVPIRVTIIDPNGRPPVRLDFQWRKNSQTGNWQAYDMIAEGVSMITTKQNE WGTLLRTKGIDGLTAQLKSISQQKITLEEKK >gi|296494643|gb|ADTN01000095.1| GENE 6 3201 - 3752 631 183 aa, chain - ## HITS:1 COG:ECs4072 KEGG:ns NR:ns ## COG: ECs4072 COG1463 # Protein_GI_number: 15833326 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: ABC-type transport system involved in resistance to organic solvents, periplasmic component # Organism: Escherichia coli O157:H7 # 1 183 1 183 183 335 100.0 3e-92 MQTKKNEIWVGIFLLAALLAALFVCLKAANVTSIRTEPTYTLYATFDNIGGLKARSPVSI 
GGVVVGRVADITLDPKTYLPRVTLEIEQRYNHIPDTSSLSIRTSGLLGEQYLALNVGFED PELGTAILKDGDTIQDTKSAMVLEDLIGQFLYGSKGDDNKNSGDAPAAAPGNNETTEPVG TTK >gi|296494643|gb|ADTN01000095.1| GENE 7 3757 - 4539 809 260 aa, chain - ## HITS:1 COG:ECs4073 KEGG:ns NR:ns ## COG: ECs4073 COG0767 # Protein_GI_number: 15833327 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: ABC-type transport system involved in resistance to organic solvents, permease component # Organism: Escherichia coli O157:H7 # 1 260 1 260 260 438 100.0 1e-123 MLLNALASLGHKGIKTLRTFGRAGLMLFNALVGKPEFRKHAPLLVRQLYNVGVLSMLIIV VSGVFIGMVLGLQGYLVLTTYSAETSLGMLVALSLLRELGPVVAALLFAGRAGSALTAEI GLMRATEQLSSMEMMAVDPLRRVISPRFWAGVISLPLLTVIFVAVGIWGGSLVGVSWKGI DSGFFWSAMQNAVDWRMDLVNCLIKSVVFAITVTWISLFNGYDAIPTSAGISRATTRTVV HSSLAVLGLDFVLTALMFGN >gi|296494643|gb|ADTN01000095.1| GENE 8 4547 - 5356 604 269 aa, chain - ## HITS:1 COG:ECs4074 KEGG:ns NR:ns ## COG: ECs4074 COG1127 # Protein_GI_number: 15833328 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: ABC-type transport system involved in resistance to organic solvents, ATPase component # Organism: Escherichia coli O157:H7 # 1 269 1 269 269 505 100.0 1e-143 MEQSVANLVDMRDVSFTRGNRCIFDNISLTVPRGKITAIMGPSGIGKTTLLRLIGGQIAP DHGEILFDGENIPAMSRSRLYTVRKRMSMLFQSGALFTDMNVFDNVAYPLREHTQLPAPL LHSTVMMKLEAVGLRGAAKLMPSELSGGMARRAALARAIALEPDLIMFDEPFVGQDPITM GVLVKLISELNSALGVTCVVVSHDVPEVLSIADHAWILADKKIVAHGSAQALQANPDPRV RQFLDGIADGPVPFRYPAGDYHADLLPGS >gi|296494643|gb|ADTN01000095.1| GENE 9 5566 - 6543 709 325 aa, chain + ## HITS:1 COG:yrbG KEGG:ns NR:ns ## COG: yrbG COG0530 # Protein_GI_number: 16131086 # Func_class: P Inorganic ion transport and metabolism # Function: Ca2+/Na+ antiporter # Organism: Escherichia coli K12 # 1 325 1 325 325 482 100.0 1e-136 MLLATALLIVGLLLVVYSADRLVFAASILCRTFGIPPLIIGMTVVSIGTSLPEVIVSLAA SLHEQRDLAVGTALGSNIINILLILGLAALVRPFTVHSDVLRRELPLMLLVSVVAGSVLY DGQLSRSDGIFLLFLAVLWLLFIVKLARQAERQGTDSLTREQLAELPRDGGLPVAFLWLG 
IALIIMPVATRMVVDNATVLANYFAISELTMGLTAIAIGTSLPELATAIAGVRKGENDIA VGNIIGANIFNIVIVLGLPALITPGEIDPLAYSRDYSVMLLVSIIFALLCWRRSPQPGRG VGVLLTGGFIVWLAMLYWLSPILVE >gi|296494643|gb|ADTN01000095.1| GENE 10 6557 - 7543 920 328 aa, chain + ## HITS:1 COG:yrbH_1 KEGG:ns NR:ns ## COG: yrbH_1 COG0794 # Protein_GI_number: 16131087 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar phosphate isomerase involved in capsule formation # Organism: Escherichia coli K12 # 1 212 1 212 212 424 100.0 1e-118 MSHVELQPGFDFQQAGKEVLAIERECLAELDQYINQNFTLACEKMFWCKGKVVVMGMGKS GHIGRKMAATFASTGTPSFFVHPGEAAHGDLGMVTPQDVVIAISNSGESSEITALIPVLK RLHVPLICITGRPESSMARAADVHLCVKVAKEACPLGLAPTSSTTATLVMGDALAVALLK ARGFTAEDFALSHPGGALGRKLLLRVNDIMHTGDEIPHVKKTASLRDALLEVTRKNLGMT VICDDNMMIEGIFTDGDLRRVFDMGVDVRQLSIADVMTPGGIRVRPGILAVEALNLMQSR HITSVMVADGDHLLGVLHMHDLLRAGVV >gi|296494643|gb|ADTN01000095.1| GENE 11 7564 - 8130 755 188 aa, chain + ## HITS:1 COG:ECs4077 KEGG:ns NR:ns ## COG: ECs4077 COG1778 # Protein_GI_number: 15833331 # Func_class: R General function prediction only # Function: Low specificity phosphatase (HAD superfamily) # Organism: Escherichia coli O157:H7 # 1 188 1 188 188 355 100.0 3e-98 MSKAGASLATCYGPVSADVIAKAENIRLLILDVDGVLSDGLIYMGNNGEELKAFNVRDGY GIRCALTSDIEVAIITGRKAKLVEDRCATLGITHLYQGQSNKLIAFSDLLEKLAIAPENV AYVGDDLIDWPVMEKVGLSVAVADAHPLLIPRADYVTRIAGGRGAVREVCDLLLLAQGKL DEAKGQSI >gi|296494643|gb|ADTN01000095.1| GENE 12 8127 - 8702 439 191 aa, chain + ## HITS:1 COG:ECs4078 KEGG:ns NR:ns ## COG: ECs4078 COG3117 # Protein_GI_number: 15833332 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 191 1 191 191 354 100.0 7e-98 MSKARRWVIIVLSLAVLVMIGINMAEKDDTAQVVVNNNDPTYKSEHTDTLVYNPEGALSY RLIAQHVEYYSDQAVSWFTQPVLTTFDKDKIPTWSVKADKAKLTNDRMLYLYGHVEVNAL VPDSQLRRITTDNAQINLVTQDVTSEDLVTLYGTTFNSSGLKMRGNLRSKNAELIEKVRT SYEIQNKQTQP >gi|296494643|gb|ADTN01000095.1| GENE 13 8671 - 9228 558 185 aa, chain + ## HITS:1 COG:ECs4079 KEGG:ns 
NR:ns ## COG: ECs4079 COG1934 # Protein_GI_number: 15833333 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 185 1 185 185 330 100.0 1e-90 MKFKTNKLSLNLVLASSLLAASIPAFAVTGDTDQPIHIESDQQSLDMQGNVVTFTGNVIV TQGTIKINADKVVVTRPGGEQGKEVIDGYGKPATFYQMQDNGKPVEGHASQMHYELAKDF VVLTGNAYLQQVDSNIKGDKITYLVKEQKMQAFSDKGKRVTTVLVPSQLQDKNNKGQTPA QKKGN >gi|296494643|gb|ADTN01000095.1| GENE 14 9235 - 9960 287 241 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 2 235 3 233 305 115 31 5e-25 MATLTAKNLAKAYKGRRVVEDVSLTVNSGEIVGLLGPNGAGKTTTFYMVVGIVPRDAGNI IIDDDDISLLPLHARARRGIGYLPQEASIFRRLSVYDNLMAVLQIRDDLSAEQREDRANE LMEEFHIEHLRDSMGQSLSGGERRRVEIARALAANPKFILLDEPFAGVDPISVIDIKRII EHLRDSGLGVLITDHNVRETLAVCERAYIVSQGHLIAHGTPTEILQDEHVKRVYLGEDFR L >gi|296494643|gb|ADTN01000095.1| GENE 15 10008 - 11441 1453 477 aa, chain + ## HITS:1 COG:rpoN KEGG:ns NR:ns ## COG: rpoN COG1508 # Protein_GI_number: 16131092 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog # Organism: Escherichia coli K12 # 1 477 1 477 477 872 100.0 0 MKQGLQLRLSQQLAMTPQLQQAIRLLQLSTLELQQELQQALESNPLLEQIDTHEEIDTRE TQDSETLDTADALEQKEMPEELPLDASWDTIYTAGTPSGTSGDYIDDELPVYQGETTQTL QDYLMWQVELTPFSDTDRAIATSIVDAVDETGYLTVPLEDILESIGDEEIDIDEVEAVLK RIQRFDPVGVAAKDLRDCLLIQLSQFDKTTPWLEEARLIISDHLDLLANHDFRTLMRVTR LKEDVLKEAVNLIQSLDPRPGQSIQTGEPEYVIPDVLVRKHNGHWTVELNSDSIPRLQIN QHYASMCNNARNDGDSQFIRSNLQDAKWLIKSLESRNDTLLRVSRCIVEQQQAFFEQGEE YMKPMVLADIAQAVEMHESTISRVTTQKYLHSPRGIFELKYFFSSHVNTEGGGEASSTAI RALVKKLIAAENPAKPLSDSKLTSLLSEQGIMVARRTVAKYRESLSIPPSNQRKQLV >gi|296494643|gb|ADTN01000095.1| GENE 16 11464 - 11751 462 95 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227335124|ref|ZP_03838780.1| hypothetical protein CIT292_04930 [Citrobacter youngae ATCC 29220] # 1 95 1 95 95 182 94 3e-45 MQLNITGNNVEITEALREFVTAKFAKLEQYFDRINQVYVVLKVEKVTHTSDATLHVNGGE 
IHASAEGQDMYAAIDGLIDKLARQLTKHKDKLKQH >gi|296494643|gb|ADTN01000095.1| GENE 17 11869 - 12360 398 163 aa, chain + ## HITS:1 COG:ptsN KEGG:ns NR:ns ## COG: ptsN COG1762 # Protein_GI_number: 16131094 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Escherichia coli K12 # 1 163 1 163 163 303 100.0 1e-82 MTNNDTTLQLSSVLNRECTRSRVHCQSKKRALEIISELAAKQLSLPPQVVFEAILTREKM GSTGIGNGIAIPHGKLEEDTLRAVGVFVQLETPIAFDAIDNQPVDLLFALLVPADQTKTH LHTLSLVAKRLADKTICRRLRAAQSDEELYQIITDTEGTPDEA >gi|296494643|gb|ADTN01000095.1| GENE 18 12406 - 13260 885 284 aa, chain + ## HITS:1 COG:ECs4084 KEGG:ns NR:ns ## COG: ECs4084 COG1660 # Protein_GI_number: 15833338 # Func_class: R General function prediction only # Function: Predicted P-loop-containing kinase # Organism: Escherichia coli O157:H7 # 1 284 1 284 284 556 100.0 1e-158 MVLMIVSGRSGSGKSVALRALEDMGFYCVDNLPVVLLPDLARTLADREISAAVSIDVRNM PESPEIFEQAMSNLPDAFSPQLLFLDADRNTLIRRYSDTRRLHPLSSKNLSLESAIDKES DLLEPLRSRADLIVDTSEMSVHELAEMLRTRLLGKRERELTMVFESFGFKHGIPIDADYV FDVRFLPNPHWDPKLRPMTGLDKPVAAFLDRHTEVHNFIYQTRSYLELWLPMLETNNRSY LTVAIGCTGGKHRSVYIAEQLADYFRSRGKNVQSRHRTLEKRKP >gi|296494643|gb|ADTN01000095.1| GENE 19 13257 - 13529 261 90 aa, chain + ## HITS:1 COG:ECs4085 KEGG:ns NR:ns ## COG: ECs4085 COG1925 # Protein_GI_number: 15833339 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Escherichia coli O157:H7 # 1 90 1 90 90 149 100.0 2e-36 MTVKQTVEITNKLGMHARPAMKLFELMQGFDAEVLLRNDEGTEAEANSVIALLMLDSAKG RQIEVEATGPQEEEALAAVIALFNSGFDED >gi|296494643|gb|ADTN01000095.1| GENE 20 13743 - 14375 491 210 aa, chain + ## HITS:1 COG:no KEGG:B21_03023 NR:ns ## KEGG: B21_03023 # Name: yrbL # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 210 1 210 210 426 100.0 1e-118 MIRLSEQSPLGTGRHRKCYAHPEDAQRCIKIVYHRGDGGDKEIRRELKYYAHLGRRLKDW 
SGIPRYHGTVETDCGTGYVYDVIADFDGKPSITLTEFAEQCRYEEDIAQLRQLLKQLKRY LQDNRIVTMSLKPQNILCHRISESEVIPVVCDNIGESTLIPLATWSKWCCLRKQERLWKR FIAQPALAIALQKDLQPRESKTLALTSREA >gi|296494643|gb|ADTN01000095.1| GENE 21 14372 - 15079 513 235 aa, chain - ## HITS:1 COG:mtgA KEGG:ns NR:ns ## COG: mtgA COG0744 # Protein_GI_number: 16131098 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Escherichia coli K12 # 1 235 8 242 242 457 99.0 1e-129 MFSFVRRFLLRLMVVLAVFWGGGIALFSVAPVPFSAVMVERQVSAWLHGNFRYVAHSDWV SMDQISPWMGLAVIAAEDQKFPEHWGFDVASIEKALAHNERNENRIRGASTISQQTAKNL FLWDGRSWVRKGLEAGLTLGIETVWSKKRILTVYLNIAEFGDGVFGVEAAAQRYFHKPAS KLTRSEAALLAAVLPNPLRFKVSSPSGYVRSRQAWILRQMYQLGGEPFMQQHQLD >gi|296494643|gb|ADTN01000095.1| GENE 22 15097 - 15750 816 217 aa, chain - ## HITS:1 COG:yhbL KEGG:ns NR:ns ## COG: yhbL COG3155 # Protein_GI_number: 16131099 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein involved in an early stage of isoprenoid biosynthesis # Organism: Escherichia coli K12 # 1 217 4 220 220 407 100.0 1e-114 MKKIGVILSGCGVYDGSEIHEAVLTLLAISRSGAQAVCFAPDKQQVDVINHLTGEAMTET RNVLIEAARITRGEIRPLAQADAAELDALIVPGGFGAAKNLSNFASLGSECTVDRELKAL AQAMHQAGKPLGFMCIAPAMLPKIFDFPLRLTIGTDIDTAEVLEEMGAEHVPCPVDDIVV DEDNKIVTTPAYMLAQNIAEAASGIDKLVSRVLVLAE >gi|296494643|gb|ADTN01000095.1| GENE 23 15980 - 18316 2779 778 aa, chain - ## HITS:1 COG:ZarcB_1 KEGG:ns NR:ns ## COG: ZarcB_1 COG0642 # Protein_GI_number: 15803750 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli O157:H7 EDL933 # 1 562 1 562 562 1055 100.0 0 MKQIRLLAQYYVDLMMKLGLVRFSMLLALALVVLAIVVQMAVTMVLHGQVESIDVIRSIF FGLLITPWAVYFLSVVVEQLEESRQRLSRLVQKLEEMRERDLSLNVQLKDNIAQLNQEIA VREKAEAELQETFGQLKIEIKEREETQIQLEQQSSFLRSFLDASPDLVFYRNEDKEFSGC NRAMELLTGKSEKQLVHLKPADVYSPEAAAKVIETDEKVFRHNVSLTYEQWLDYPDGRKA CFEIRKVPYYDRVGKRHGLMGFGRDITERKRYQDALERASRDKTTFISTISHELRTPLNG 
IVGLSRILLDTELTAEQEKYLKTIHVSAVTLGNIFNDIIDMDKMERRKVQLDNQPVDFTS FLADLENLSALQAQQKGLRFNLEPTLPLPHQVITDGTRLRQILWNLISNAVKFTQQGQVT VRVRYDEGDMLHFEVEDSGIGIPQDELDKIFAMYYQVKDSHGGKPATGTGIGLAVSRRLA KNMGGDITVTSEQGKGSTFTLTIHAPSVAEEVDDAFDEDDMPLPALNVLLVEDIELNVIV ARSVLEKLGNSVDVAMTGKAALEMFKPGEYDLVLLDIQLPDMTGLDISRELTKRYPREDL PPLVALTANVLKDKQEYLNAGMDDVLSKPLSVPALTAMIKKFWDTQDDEESTVTTEENSK SEALLDIPMLEQYLELVGPKLITDGLAVFEKMMPGYVSVLESNLTAQDKKGIVEEGHKIK GAAGSVGLRHLQQLGQQIQSPDLPAWEDNVGEWIEEMKEEWRHDVEVLKAWVAKATKK >gi|296494643|gb|ADTN01000095.1| GENE 24 18412 - 19317 927 301 aa, chain - ## HITS:1 COG:ECs4090 KEGG:ns NR:ns ## COG: ECs4090 COG1242 # Protein_GI_number: 15833344 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductase # Organism: Escherichia coli O157:H7 # 1 301 9 309 309 631 100.0 0 MFGGDLTRRYGQKVHKLTLHGGFSCPNRDGTIGRGGCTFCNVASFADEAQQHRSIAEQLA HQANLVNRAKRYLAYFQAYTSTFAEVQVLRSMYQQAVSQANIVGLCVGTRPDCVPDAVLD LLCEYKDQGYEVWLELGLQTAHDKTLHRINRGHDFACYQRTTQLARQRGLKVCSHLIVGL PGEGQAECLQTLERVVETGVDGIKLHPLHIVKGSIMAKAWEAGRLNGIELEDYTLTAGEM IRHTPPEVIYHRISASARRPTLLAPLWCENRWTGMVELDRYLNEHGVQGSALGRPWLPPT E >gi|296494643|gb|ADTN01000095.1| GENE 25 20016 - 24476 4800 1486 aa, chain + ## HITS:1 COG:ECs4091_2 KEGG:ns NR:ns ## COG: ECs4091_2 COG0069 # Protein_GI_number: 15833345 # Func_class: E Amino acid transport and metabolism # Function: Glutamate synthase domain 2 # Organism: Escherichia coli O157:H7 # 379 1194 1 816 816 1647 99.0 0 MLYDKSLERDNCGFGLIAHIEGEPSHKVVRTAIHALARMQHRGAILADGKTGDGCGLLLQ KPDRFFRIVAQERGWRLAKNYAVGMLFLNKDPELAAAARRIVEEELQRETLSIVGWRDVP TNEGVLGEIALSSLPRIEQIFVNAPAGWRPRDMERRLFIARRRIEKRLEADKDFYVCSLS NLVNIYKGLCMPTDLPRFYLDLADLRLESAICLFHQRFSTNTVPRWPLAQPFRYLAHNGE INTITGNRQWARARTYKFQTPLIPDLHDAAPFVNETGSDSSSMDNMLELLLAGGMDIIRA MRLLVPPAWQNNPDMDPELRAFFDFNSMHMEPWDGPAGIVMSDGRFAACNLDRNGLRPAR YVITKDKLITCASEVGIWDYQPDEVVEKGRVGPGELMVIDTRSGRILHSAETDDDLKSRH PYKEWMEKNVRRLVPFEDLPDEEVGSRELDDDTLASYQKQFNYSAEELDSVIRVLGENGQ EAVGSMGDDTPFAVLSSQPRIIYDYFRQQFAQVTNPPIDPLREAHVMSLATSIGREMNVF 
CEAEGQAHRLSFKSPILLYSDFKQLTTMKEEHYRADTLDITFDVTKTTLEATVKELCDKA EKMVRSGTVLLVLSDRNIAKDRLPVPAPMAVGAIQTRLVDQSLRCDANIIVETASARDPH HFAVLLGFGATAIYPYLAYETLGRLVDTHAIAKDYRTVMLNYRNGINKGLYKIMSKMGIS TIASYRCSKLFEAVGLHDDVVGLCFQGAVSRIGGASFEDFQQDLLNLSKRAWLARKPISQ GGLLKYVHGGEYHAYNPDVVRTLQQAVQSGEYSDYQEYAKLVNERPATTLRDLLAITPGE NAVNIADVEPASELFKRFDTAAMSIGALSPEAHEALAEAMNSIGGNSNSGEGGEDPARYG TNKVSRIKQVASGRFGVTPAYLVNADVIQIKVAQGAKPGEGGQLPGDKVTPYIAKLRYSV PGVTLISPPPHHDIYSIEDLAQLIFDLKQVNPKAMISVKLVSEPGVGTIATGVAKAYADL ITIAGYDGGTGASPLSSVKYAGCPWELGLVETQQALVANGLRHKIRLQVDGGLKTGVDII KAAILGAESFGFGTGPMVALGCKYLRICHLNNCATGVATQDDKLRKNHYHGLPFKVTNYF EFIARETRELMAQLGVTRLVDLIGRTDLLKELDGFTAKQQKLALSKLLETAEPHPGKALY CTENNPPFDNGLLNAQLLQQAKPFVDERQSKTFWFDIRNTDRSVGASLSGYIAQTHGDQG LAADPIKAYFNGTAGQSFGVWNAGGVELYLTGDANDYVGKGMAGGLIAIRPPVGSAFRSH EASIIGNTCLYGATGGRLYAAGRAGERFGVRNSGAITVVEGIGDNGCEYMTGGIVCILGK TGVNFGAGMTGGFAYVLDESGDFRKRVNPELVEVLSVDALAIHEEHLRGLITEHVQHTGS QRGEEILANWSTFATKFALVKPKSSDVKALLGHRSRSAAELRVQAQ >gi|296494643|gb|ADTN01000095.1| GENE 26 24489 - 25907 1646 472 aa, chain + ## HITS:1 COG:gltD KEGG:ns NR:ns ## COG: gltD COG0493 # Protein_GI_number: 16131103 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Escherichia coli K12 # 1 472 1 472 472 975 100.0 0 MSQNVYQFIDLQRVDPPKKPLKIRKIEFVEIYEPFSEGQAKAQADRCLSCGNPYCEWKCP VHNYIPNWLKLANEGRIFEAAELSHQTNTLPEVCGRVCPQDRLCEGSCTLNDEFGAVTIG NIERYINDKAFEMGWRPDMSGVKQTGKKVAIIGAGPAGLACADVLTRNGVKAVVFDRHPE IGGLLTFGIPAFKLEKEVMTRRREIFTGMGIEFKLNTEVGRDVQLDDLLSDYDAVFLGVG TYQSMRGGLENEDADGVYAALPFLIANTKQLMGFGETRDEPFVSMEGKRVVVLGGGDTAM DCVRTSVRQGAKHVTCAYRRDEENMPGSRREVKNAREEGVEFKFNVQPLGIEVNGNGKVS GVKMVRTEMGEPDAKGRRRAEIVAGSEHIVPADAVIMAFGFRPHNMEWLAKHSVELDSQG RIIAPEGSDNAFQTSNPKIFAGGDIVRGSDLVVTAIAEGRKAADGIMNWLEV >gi|296494643|gb|ADTN01000095.1| GENE 27 26467 - 27231 538 254 aa, chain + ## HITS:1 COG:no KEGG:JW3181 NR:ns ## KEGG: JW3181 # Name: gltF # Def: periplasmic 
protein # Organism: E.coli_J # Pathway: not_defined # 1 254 1 254 254 446 100.0 1e-124 MFFKKNLTTAAICAALSVAAFSAMATDSTDTELTIIGEYTPGACTPVVTGGGIVDYGKHH NSALNPTGKSNKLVQLGRKNSTLNITCTAPTLIAVTSKDNRQSTIVALNDTSYIEKAYDT LVDMKGTKNAFGLGSAPNGQKIGAASIGIDRSNGGIHAADDTGEIPVDLIQTDHWSAATP TWKASSNGAFCSLTSCSAIERGYSVAKTGELTPVAITAVTFPLLIDAAVNDNTILGSDET IKLDGNVTISVQYL >gi|296494643|gb|ADTN01000095.1| GENE 28 27445 - 28077 351 210 aa, chain + ## HITS:1 COG:yhcA KEGG:ns NR:ns ## COG: yhcA COG3121 # Protein_GI_number: 16131105 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli K12 # 1 210 15 224 224 389 100.0 1e-108 MNTLATGMVPETSVLLVDEKRGEASINIKNTDDHPSLLYTTIVDLPESNKSIRLIPTQPV IRVEAGQVQQVRFLLQATVPLQSEELKRVTFEGIPPKDDKSSRVTVSIRQDLPVLIHPAS LPEERETWKFLEWRKNGDQIEISNPSNYVVRMTLQFKTLPSGKTGAINKTYFLPHTSTTT ALTNATDTKVEFYPASRYGYRGNKYVTDLK >gi|296494643|gb|ADTN01000095.1| GENE 29 28098 - 30479 1033 793 aa, chain + ## HITS:1 COG:yhcD KEGG:ns NR:ns ## COG: yhcD COG3188 # Protein_GI_number: 16131106 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli K12 # 1 793 1 793 793 1533 99.0 0 MLKKTLLAYTIGFAFSPPANADGIEIAAVDFDRETLKSLGVDPNISHYFSRSARFLPGEY SLIVSVNGEKKGNIATRFDENGDICLDQAFLQQAGLKIPSEEKNGCYDYILSYPGTTITP LPNQEALDIIVSPQAIIPIGLDLTNAATGGTAALLNYSLMSSRAEFSNGSSDYSQAALEG GININDWMLRSHQFLTQTNGTFSNQNSSTYLQRTFTDLKTLMRAGEVNLNNSVLEGASIY GIEIAPDNALQTSGSGVQVTGIANTSQARVEIRQQGVLIHSILVPAGAFTIPDVPVRNGN SDLNVTVVETDGSSHNYIVPSTLFNQHVESFQGYRFAIGRGDDDYDESPWVISASSGWNL TRWSAMNGGVIVAENYQAASIRSSLVPLPDLTVSSQISTSQDTKDSLQGQKYRLDANYNL PFSLGLTTSLTRSDRHYRELSEAIDDDYTDPTKSTYALGLNWSNSILGGFNISGYKTYSY DGDNDSSNLNINWNKAFKHATVSVNWQHQLSASENNEDDGDLFYVNISIPFGRSNTATLY TRHDDHKTHYGTGVMGVVSDEMSYYVNAERDHDERETSLNGSISSNLHYTQVSLAAGASG SDSRTYNGTMSGGIAVHDQGVTFSPWTINDTFAIAKMDNNIAGVRITSQAGPVWTDFRGN 
AVIPSIQPWRTSGVEIDTASLPKNVDIGNGTKMIKQGRGAVGKVGFSAITQRRALLNITL SDGKKLPRGVAIEDSEGNYLTTSVDDGVVFLNNIKPDMVLDIKDEQQSCRIHLTFPEDAP KDVFYETATGECQ >gi|296494643|gb|ADTN01000095.1| GENE 30 30476 - 31021 343 181 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|256024208|ref|ZP_05438073.1| ## NR: gi|256024208|ref|ZP_05438073.1| hypothetical protein E4_12593 [Escherichia sp. 4_1_40B] # 1 181 1 181 181 320 100.0 3e-86 MKRIITGCLLLNFAMAAQAECNISSSIQNIDYGKRSAAMRQVDRGKTTQLADRTITLVMQ CDQDAHIRVQLNTANISNNGFGFGPNGSLNLIASDAFSGSNNLDLALASGKNDNPGSTGT ASISTSPNNWLVFMQNGQEVVIDSGKSVSLTLTMAPAFKDEGELTDMTDITGNLTVLVEA K >gi|296494643|gb|ADTN01000095.1| GENE 31 31231 - 31734 246 167 aa, chain + ## HITS:1 COG:no KEGG:JW3188 NR:ns ## KEGG: JW3188 # Name: yhcF # Def: predicted transcriptional regulator # Organism: E.coli_J # Pathway: not_defined # 1 167 72 238 238 286 100.0 1e-76 MTITCESATGIAITARDTRMDSMTTGKDSGGQSGVKYTLNGGGYISQTTRLFGLGKTKDN KNIGSYAVLIDSNNISASNGSQTLAVSIAGADAVITGQKRAWQTLTAYPLAVDQSYYYTF VKPGETTPTPVTNAIIPLQVSASIANDLGGSEKIELDGKAVISVVYL >gi|296494643|gb|ADTN01000095.1| GENE 32 31919 - 33046 542 375 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 [Haemophilus parasuis 29755] # 19 372 12 338 339 213 35 1e-54 MESLSEGTTAGYQQIHDGIIHLVDSARTETVRSVNALMTATYQEIGRRIVEFEQGGEARA AYGAQLIKRLSKDLCLRYKRGFSAKNLRQMRLFYLFFQHVEIRQTMSGELTPLGIPQTPS AEFPSAKIWQTLSAKSFPLPRSTYVRLLSVKNADARSFYEKETLRCGWSVRQLERQIATQ FYERTLLSHDKSAMLQQHAPAETHILPQQAIRDPFVLEFLELKDEYSESDFEEALINHLM DFMLELGDDFAFVGRQRRLRIDDNWFRVDLLFFHRRLRCLLIVDLKVGKFSYSDAGQMNM YLNYAKEHWTLPDENPPIGLVLCAEKGAGEAHYALAGLPNTVLASEYKMQLPDEKRLADE LVRTQAVLEEGYRRR >gi|296494643|gb|ADTN01000095.1| GENE 33 33106 - 33570 383 154 aa, chain - ## HITS:1 COG:yhcH KEGG:ns NR:ns ## COG: yhcH COG2731 # Protein_GI_number: 16131111 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Escherichia coli K12 # 1 154 1 154 154 285 100.0 3e-77 
MMMGEVQSLPSAGLHPALQDALTLALAARPQEKAPGRYELQGDNIFMNVMTFNTQSPVEK KAELHEQYIDIQLLLNGEERILFGMAGTARQCEEFHHEDDYQLCSTIDNEQAIILKPGMF AVFMPGEPHKPGCVVGEPGEIKKVVVKVKADLMA >gi|296494643|gb|ADTN01000095.1| GENE 34 33567 - 34475 314 302 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 17 298 8 314 323 125 30 4e-28 MVQHSDEKGGAMTTLAIDIGGTKLAAALIGADGQIRDRRELPTPASQTPEALRDALSALV SPLQAHAQRVAIASTGIIRDGSLLALNPHNLGGLLHFPLVKTLEQLTNLPTIAINDAQAA AWAEFQALDGDITDMVFITVSTGVGGGVVSGCKLLTGPGGLAGHIGHTLADPHGPVCGCG RTGCVEAIASGRGIAAAAQGELAGADAKTIFTRAGQGDEQAQQLIHRSARTLARLIADIK ATTDCQCVVVGGSVGLAEGYLALVETYLAQEPAAFHVDLLAAHYRHDAGLLGAALLAQGE KL >gi|296494643|gb|ADTN01000095.1| GENE 35 34439 - 35128 792 229 aa, chain - ## HITS:1 COG:nanE KEGG:ns NR:ns ## COG: nanE COG3010 # Protein_GI_number: 16131113 # Func_class: G Carbohydrate transport and metabolism # Function: Putative N-acetylmannosamine-6-phosphate epimerase # Organism: Escherichia coli K12 # 1 229 1 229 229 402 100.0 1e-112 MSLLAQLDQKIAANGGLIVSCQPVPDSPLDKPEIVAAMALAAEQAGAVAIRIEGVANLQA TRAVVSVPIIGIVKRDLEDSPVRITAYIEDVDALAQAGADIIAIDGTDRPRPVPVETLLA RIHHHGLLAMTDCSTPEDGLACQKLGAEIIGTTLSGYTTPETPEEPDLALVKTLSDAGCR VIAEGRYNTPAQAADAMRHGAWAVTVGSAITRLEHICQWYNTAMKKAVL >gi|296494643|gb|ADTN01000095.1| GENE 36 35176 - 36666 1720 496 aa, chain - ## HITS:1 COG:ECs4097 KEGG:ns NR:ns ## COG: ECs4097 COG0477 # Protein_GI_number: 15833351 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 496 11 506 506 919 99.0 0 MSTTTQNIPWYRHLNRAQWRAFSAAWLGYLLDGFDFVLIALVLTEVQGEFGLTTVQAASL ISAAFISRWFGGLMLGAMGDRYGRRLAMVTSIVLFSAGTLACGFAPGYITMFIARLVIGM GMAGEYGSSATYVIESWPKHLRNKASGFLISGFSVGAVVAAQVYSLVVPVWGWRALFFIG ILPIIFALWLRKNIPEAEDWKEKHAGKAPVRTMVDILYRGEHRIANIVMTLAAATALWFC 
FAGNLQNAAIVAVLGLLCAAIFISFMVQSTGKRWPTGVMLMVVVLFAFLYSWPIQALLPT YLKTDLAYNPHTVANVLFFSGFGAAVGCCVGGFLGDWLGTRKAYVCSLLASQLLIIPVFA IGGANVWVLGLLLFFQQMLGQGIAGILPKLIGGYFDTDQRAAGLGFTYNVGALGGALAPI IGALIAQRLDLGTALASLSFSLTFVVILLIGLDMPSRVQRWLRPEALRTHDAIDGKPFSG AVPFGSAKNDLVKTKS >gi|296494643|gb|ADTN01000095.1| GENE 37 36775 - 37668 947 297 aa, chain - ## HITS:1 COG:ECs4098 KEGG:ns NR:ns ## COG: ECs4098 COG0329 # Protein_GI_number: 15833352 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Escherichia coli O157:H7 # 1 297 1 297 297 595 100.0 1e-170 MATNLRGVMAALLTPFDQQQALDKASLRRLVQFNIQQGIDGLYVGGSTGEAFVQSLSERE QVLEIVAEEAKGKIKLIAHVGCVSTAESQQLAASAKRYGFDAVSAVTPFYYPFSFEEHCD HYRAIIDSADGLPMVVYNIPALSGVKLTLDQINTLVTLPGVGALKQTSGDLYQMEQIRRE HPDLVLYNGYDEIFASGLLAGADGGIGSTYNIMGWRYQGIVKALKEGDIQTAQKLQTECN KVIDLLIKTGVFRGLKTVLHYMDVVSVPLCRKPFGPVDEKYLPELKALAQQLMQERG >gi|296494643|gb|ADTN01000095.1| GENE 38 37790 - 38581 959 263 aa, chain - ## HITS:1 COG:nanR KEGG:ns NR:ns ## COG: nanR COG2186 # Protein_GI_number: 16131116 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 263 1 263 263 492 99.0 1e-139 MGLMNAFDSQTEDSSSAIGRNLRSRPLARKKLSEMVEEELEQMIRRREFGEGEQLPSERE LMAFFNVGRPSVREALAALKRKGLVQINNGERARVSRPSADTIIGELSGMAKDFLSHPGG IAHFEQLRLFFESSLVRYAAEHATDEQIDLLAKALEINSQSLDNNAAFIRSDVDFHRVLA EIPGNPIFMAIHVALLDWLIAARPTVTDQALHEHNNVSYQQHIAIVDAIRRHDPDEADRA LQSHLNSVSATWHAFGQTTNKKK Prediction of potential genes in microbial genomes Time: Sun May 15 23:27:12 2011 Seq name: gi|296494642|gb|ADTN01000096.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont242.1, whole genome shotgun sequence Length of sequence - 13633 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 9, operones - 6 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 67 - 99 1.4 1 1 Op 1 . 
- CDS 136 - 525 79 ## APECO1_124 curli assembly protein CsgE
2 1 Op 2 . - CDS 530 - 1180 244 ## COG2771 DNA-binding HTH domain-containing proteins
+ Prom 914 - 973 3.6
3 2 Tu 1 . + CDS 1148 - 1294 77 ## UTI89_C1162 hypothetical protein
+ Prom 1659 - 1718 4.0
4 3 Op 1 . + CDS 1908 - 2390 332 ## ECH74115_1421 curlin minor subunit
5 3 Op 2 . + CDS 2431 - 2886 398 ## B21_01046 hypothetical protein
+ Term 2909 - 2937 1.0
6 4 Tu 1 . + CDS 2945 - 3277 179 ## ECUMN_1217 putative autoagglutination protein
7 5 Tu 1 . + CDS 3398 - 3709 272 ## JW1031 hypothetical protein
+ Prom 3722 - 3781 1.5
8 6 Op 1 2/0.000 + CDS 3804 - 4337 347 ## COG2110 Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1
9 6 Op 2 . + CDS 4339 - 5760 815 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes
- Term 5677 - 5713 4.0
10 7 Op 1 . - CDS 5768 - 6400 562 ## ECDH10B_1119 glucans biosynthesis protein
11 7 Op 2 . - CDS 6394 - 6924 145 ## B21_01051 hypothetical protein
- Prom 7040 - 7099 1.8
+ Prom 7201 - 7260 4.5
12 8 Op 1 7/0.000 + CDS 7300 - 8853 1575 ## COG3131 Periplasmic glucans biosynthesis protein
13 8 Op 2 4/0.000 + CDS 8876 - 11389 2243 ## COG2943 Membrane glycosyltransferase
+ Term 11520 - 11552 3.1
+ Prom 11482 - 11541 3.2
14 8 Op 3 . + CDS 11562 - 11789 294 ## COG5645 Predicted periplasmic lipoprotein
- Term 11494 - 11548 4.2
15 9 Op 1 . - CDS 11790 - 12164 528 ## G2583_1310 acidic protein MsyB
16 9 Op 2 .
- CDS 12247 - 13473 887 ## COG0477 Permeases of the major facilitator superfamily - Prom 13527 - 13586 2.5 Predicted protein(s) >gi|296494642|gb|ADTN01000096.1| GENE 1 136 - 525 79 129 aa, chain - ## HITS:1 COG:no KEGG:APECO1_124 NR:ns ## KEGG: APECO1_124 # Name: csgE # Def: curli assembly protein CsgE # Organism: E.coli_APEC # Pathway: not_defined # 1 129 1 129 129 256 100.0 2e-67 MKRYLRWIVAAEFLFAAGNLHAVEVEVPGLLTDHTVSSIGHDFYRAFSDKWESDYTGNLT INERPSARWGSWITITVNQDVIFQTFLFPLKRDFEKTVVFALIQTEEALNRRQINQALLS TGDLAHDEF >gi|296494642|gb|ADTN01000096.1| GENE 2 530 - 1180 244 216 aa, chain - ## HITS:1 COG:ECs1417 KEGG:ns NR:ns ## COG: ECs1417 COG2771 # Protein_GI_number: 15830671 # Func_class: K Transcription # Function: DNA-binding HTH domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 216 1 216 216 402 99.0 1e-112 MFNEVHSIHGHTLLLITKSSLQATALLQHLKQSLAITGKLHNIQRSLDDISSGSIILLDM MEADKKLIHYWQDTLSRKNNNIKILLLNTPEDYPYRDIENWPHINGVFYSMEDQERVVNG LQGVLRGECYFTQKLASYLITHSGNYRYNSTESALLTHREKEILNKLRIGASNNEIARSL FISENTVKTHLYNLFKKIAVKNRTQAVSWANDNLRR >gi|296494642|gb|ADTN01000096.1| GENE 3 1148 - 1294 77 48 aa, chain + ## HITS:1 COG:no KEGG:UTI89_C1162 NR:ns ## KEGG: UTI89_C1162 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 3 48 1 46 46 75 100.0 7e-13 MTMNTMDFIKHDETPLFLLIAHLTAASKIEAPEVLTDVALLCVVINQP >gi|296494642|gb|ADTN01000096.1| GENE 4 1908 - 2390 332 160 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_1421 NR:ns ## KEGG: ECH74115_1421 # Name: csgB # Def: curlin minor subunit # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 160 1 160 160 223 100.0 1e-57 MYDQVQGDNMKNKLLFMMLTILGAPGIAAAAGYDLANSEYNFAVNELSKSSFNQAAIIGQ AGTNNSAQLRQGGSKLLAVVAQEGSSNRAKIDQTGDYNLAYIDQAGSANDASISQGAYGN TAMIIQKGSGNKANITQYGTQKTAIVVQRQSQMAIRVTQR >gi|296494642|gb|ADTN01000096.1| GENE 5 2431 - 2886 398 151 aa, chain + ## HITS:1 COG:no KEGG:B21_01046 NR:ns ## KEGG: B21_01046 # Name: csgA # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: 
not_defined # 1 151 1 151 151 187 100.0 1e-46 MKLLKVAAIAAIVFSGSALAGVVPQYGGGGNHGGGGNNSGPNSELNIYQYGGGNSALALQ TDARNSDLTITQHGGGNGADVGQGSDDSSIDLTQRGFGNSATLDQWNGKNSEMTVKQFGG GNGAAVDQTASNSSVNVTQVGFGNNATAHQY >gi|296494642|gb|ADTN01000096.1| GENE 6 2945 - 3277 179 110 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_1217 NR:ns ## KEGG: ECUMN_1217 # Name: csgC # Def: putative autoagglutination protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 110 54 163 163 159 98.0 2e-38 MNTLLLLAALSSQITFNTTQQGDVYTIIPEVTLTQSCLCRVQILSLREGSSGQSQTKQEK TLSLPANQPIALTKLSLNISPDDRVKIVVTVSDGQSLHLSQQWPPSSEKS >gi|296494642|gb|ADTN01000096.1| GENE 7 3398 - 3709 272 103 aa, chain + ## HITS:1 COG:no KEGG:JW1031 NR:ns ## KEGG: JW1031 # Name: ymdA # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 103 1 103 103 196 100.0 2e-49 MFRPFLNSLMLGSLFFPFIAIAGSTVQGGVIHFYGQIVEPACDVSTQSSPVEMNCPQNGS IPGKTYSSKALMSGNVKNAQIASVKVQYLDKQKKLAVMNIEYN >gi|296494642|gb|ADTN01000096.1| GENE 8 3804 - 4337 347 177 aa, chain + ## HITS:1 COG:ECs1423 KEGG:ns NR:ns ## COG: ECs1423 COG2110 # Protein_GI_number: 15830677 # Func_class: R General function prediction only # Function: Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 # Organism: Escherichia coli O157:H7 # 1 177 1 177 177 347 100.0 5e-96 MKTRIHVVQGDITKLAVDVIVNAANPSLMGGGGVDGAIHRAAGPALLDACLKVRQQQGDC PTGHAVITLAGDLPAKAVVHTVGPVWRGGEQNEDQLLQDAYLNSLRLVAANSYTSVAFPA ISTGVYGYPRAAAAEIAVKTVSEFITRHALPEQVYFVCYDEENAHLYERLLTQQGDE >gi|296494642|gb|ADTN01000096.1| GENE 9 4339 - 5760 815 473 aa, chain + ## HITS:1 COG:ymdC KEGG:ns NR:ns ## COG: ymdC COG1502 # Protein_GI_number: 16129009 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Escherichia coli K12 # 1 473 21 493 493 948 99.0 0 MPRLASAVLPLCSQHPGQCGLFPLEKSLDAFAARYRLAEMAEHTLDVQYYIWQDDMSGRL LFSALLAAAKRGVRVRLLLDDNNTPGLDDILRLLDSHPRIEVRLFNPFSFRLLRPLGYIT 
DFSRLNRRMHNKSFTVDGVVTLVGGRNIGDAYFGAGEEPLFSDLDVMAIGPVVEDVADDF ARYWYCKSVSPLQQVLDVPEGEMADRIELPASWHNDAMTHRYLRKMESSPFINHLVDGTL PLIWAKTRLLSDDPAKGEGKAKRHSLLPQRLFDIMGSPSERIDIISSYFVPTRAGVAQLL RMVRKGVKIAILTNSLAANDVAVVHAGYARWRKKLLRYGVELYELKPTREQSSTLHDRGI TGNSGASLHAKTFSIDGKTVFIGSFNFDPRSTLLNTEMGFVIESETLAQLIDKRFIQSQY DAAWQLRLDRWGRINWVDRHAKKEIILKKEPATSFWKRVMVRLASILPVEWLL >gi|296494642|gb|ADTN01000096.1| GENE 10 5768 - 6400 562 210 aa, chain - ## HITS:1 COG:no KEGG:ECDH10B_1119 NR:ns ## KEGG: ECDH10B_1119 # Name: mdoC # Def: glucans biosynthesis protein # Organism: E.coli_DH10B # Pathway: not_defined # 1 210 176 385 385 380 100.0 1e-104 MVKLSVIFLCLGIGYAVIRRTIFIVYPPILSNGMFNFIVMQTLFYLPFFILGALAFIFPH LKALFTTPSRGCTLAAALAFVAYLLNQRYGSGDAWMYETESVITMVLGLWMVNVVFSFGH RLLNFQSARVTYFVNASLFIYLVHHPLTLFFGAYITPHITSNWLGFLCGLIFVVGIAIIL YEIHLRIPLLKFLFSGKPVVKRENDKAPAR >gi|296494642|gb|ADTN01000096.1| GENE 11 6394 - 6924 145 176 aa, chain - ## HITS:1 COG:no KEGG:B21_01051 NR:ns ## KEGG: B21_01051 # Name: mdoC # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 165 1 165 385 305 99.0 4e-82 MNPVPAQREYFLDSIRAWLMLLGIPFHISLIYSSHTWHVNSAESSLWLTLFNDFIHSFRM QVFFVISGYFSYMLFLRYPLKKWWKVRVERVGIPMLTAIPLLTLPQFIMLQYVKGKAESW PGLSLYDKYNTLAWELISHLWFLLVLVVMTTLCVWIFKRIRNNLKILIKRIKNSRW >gi|296494642|gb|ADTN01000096.1| GENE 12 7300 - 8853 1575 517 aa, chain + ## HITS:1 COG:mdoG KEGG:ns NR:ns ## COG: mdoG COG3131 # Protein_GI_number: 16129011 # Func_class: P Inorganic ion transport and metabolism # Function: Periplasmic glucans biosynthesis protein # Organism: Escherichia coli K12 # 7 517 1 511 511 1032 100.0 0 MKHKLQMMKMRWLSAAVMLTLYTSSSWAFSIDDVAKQAQSLAGKGYETPKSNLPSVFRDM KYADYQQIQFNHDKAYWNNLKTPFKLEFYHQGMYFDTPVKINEVTATAVKRIKYSPDYFT FGDVQHDKDTVKDLGFAGFKVLYPINSKDKNDEIVSMLGASYFRVIGAGQVYGLSARGLA IDTALPSGEEFPRFKEFWIERPKPTDKRLTIYALLDSPRATGAYKFVVMPGRDTVVDVQS KIYLRDKVGKLGVAPLTSMFLFGPNQPSPANNYRPELHDSNGLSIHAGNGEWIWRPLNNP KHLAVSSFSMENPQGFGLLQRGRDFSRFEDLDDRYDLRPSAWVTPKGEWGKGSVELVEIP 
TNDETNDNIVAYWTPDQLPEPGKEMNFKYTITFSRDEDKLHAPDNAWVQQTRRSTGDVKQ SNLIRQPDGTIAFVVDFTGAEMKKLPEDTPVTAQTSIGDNGEIVESTVRYNPVTKGWRLV MRVKVKDAKKTTEMRAALVNADQTLSETWSYQLPANE >gi|296494642|gb|ADTN01000096.1| GENE 13 8876 - 11389 2243 837 aa, chain + ## HITS:1 COG:ECs1427 KEGG:ns NR:ns ## COG: ECs1427 COG2943 # Protein_GI_number: 15830681 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane glycosyltransferase # Organism: Escherichia coli O157:H7 # 1 837 21 857 857 1677 99.0 0 MPIAASEKAALPKTDIRAVHQALDAEHRTWAREDDSPQGSVKARLEQAWPDSLADGQLIK DDEGRDQLKAMPEAKRSSMFPDPWRTNPVGRFWDRLRGRDVTPRYLARLTKEEQESEQKW RTVGTIRRYILLILTLAQTVVATWYMKTILPYQGWALINPMDMVGQDLWVSFMQLLPYML QTGILILFAVLFCWVSAGFWTALMGFLQLLIGRDKYSISASTVGDEPLNPEHRTALIMPI CNEDVNRVFAGLRATWESVKATGNAKHFDVYILSDSYNPDICVAEQKAWMELIAEVGGEG QIFYRRRRRRVKRKSGNIDDFCRRWGSQYSYMVVLDADSVMTGDCLCGLVRLMEANPNAG IIQSSPKASGMDTLYARCQQFATRVYGPLFTAGLHFWQLGESHYWGHNAIIRVKPFIEHC ALAPLPGEGSFAGSILSHDFVEAALMRRAGWGVWIAYDLPGSYEELPPNLLDELKRDRRW CHGNLMNFRLFLVKGMHPVHRAVFLTGVMSYLSAPLWFMFLALSTALQVVHALTEPQYFL QPRQLFPVWPQWRPELAIALFASTMVLLFLPKLLSILLIWCKGTKEYGGFWRVTLSLLLE VLFSVLLAPVRMLFHTVFVVSAFLGWEVVWNSPQRDDDSTSWGEAFKRHGSQLLLGFVWA VGMAWLDLRFLFWLAPIVFSLILSPFVSVISSRATVGLRTKRWKLFLIPEEYSPPQVLVD TDRFLEMNRQRSLDDGFMHAVFNPSFNALATAMATARHRASKVLEIARDRHVEQALNETP EKLNRDRRLVLLSDPVTMARLHFRVWNSPERYSSWVSYYEGIKLNPLALRKPDAASQ >gi|296494642|gb|ADTN01000096.1| GENE 14 11562 - 11789 294 75 aa, chain + ## HITS:1 COG:ECs1428 KEGG:ns NR:ns ## COG: ECs1428 COG5645 # Protein_GI_number: 15830682 # Func_class: R General function prediction only # Function: Predicted periplasmic lipoprotein # Organism: Escherichia coli O157:H7 # 1 75 1 75 75 147 100.0 3e-36 MRLIVVSIMVTLLSGCGSIISRTIPGQGHGNQYYPGVQWDVRDSAWRYVTILDLPFSLVF DTLLLPIDIHHGPYE >gi|296494642|gb|ADTN01000096.1| GENE 15 11790 - 12164 528 124 aa, chain - ## HITS:1 COG:no KEGG:G2583_1310 NR:ns ## KEGG: G2583_1310 # Name: msyB # Def: acidic protein MsyB # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 124 2 125 125 183 100.0 
1e-45 MTMYATLEEAIDAAREEFLADNPGIDAEDANVQQFNAQKYVLQDGDIMWQVEFFADEGEE GECLPMLSGEAAQSVFDGDYDEIEIRQEWQEENTLHEWDEGEFQLEPPLDTEEGRAAADE WDER >gi|296494642|gb|ADTN01000096.1| GENE 16 12247 - 13473 887 408 aa, chain - ## HITS:1 COG:yceE KEGG:ns NR:ns ## COG: yceE COG0477 # Protein_GI_number: 16129016 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 408 1 408 408 715 100.0 0 MSPCENDTPINWKRNLIVAWLGCFLTGAAFSLVMPFLPLYVEQLGVTGHSALNMWSGIVF SITFLFSAIASPFWGGLADRKGRKLMLLRSALGMGIVMVLMGLAQNIWQFLILRALLGLL GGFVPNANALIATQVPRNKSGWALGTLSTGGVSGALLGPMAGGLLADSYGLRPVFFITAS VLILCFFVTLFCIREKFQPVSKKEMLHMREVVTSLKNPKLVLSLFVTTLIIQVATGSIAP ILTLYVRELAGNVSNVAFISGMIASVPGVAALLSAPRLGKLGDRIGPEKILITALIFSVL LLIPMSYVQTPLQLGILRFLLGAADGALLPAVQTLLVYNSSNQIAGRIFSYNQSFRDIGN VTGPLMGAAISANYGFRAVFLVTAGVVLFNAVYSWNSLRRRRIPQVSN Prediction of potential genes in microbial genomes Time: Sun May 15 23:27:36 2011 Seq name: gi|296494641|gb|ADTN01000097.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont242.2, whole genome shotgun sequence Length of sequence - 5609 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 5, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 39 - 959 972 ## COG1560 Lauroyl/myristoyl acyltransferase - Prom 1085 - 1144 4.8 + Prom 1054 - 1113 5.9 2 2 Tu 1 . + CDS 1184 - 2236 954 ## COG1054 Predicted sulfurtransferase + Term 2243 - 2281 6.9 - Term 2222 - 2274 8.2 3 3 Op 1 12/0.000 - CDS 2278 - 2853 586 ## COG2353 Uncharacterized conserved protein 4 3 Op 2 . - CDS 2857 - 3423 264 ## COG3038 Cytochrome B561 - Prom 3482 - 3541 8.1 - Term 3603 - 3643 -0.5 5 4 Op 1 . - CDS 3684 - 3797 177 ## - Term 3804 - 3835 2.4 6 4 Op 2 . 
- CDS 3845 - 4963 828 ## COG0665 Glycine/D-amino acid oxidases (deaminating) - Prom 4986 - 5045 2.2 - Term 5025 - 5053 0.6 7 5 Tu 1 . - CDS 5078 - 5332 233 ## APECO1_142 biofilm formation regulatory protein BssS - Prom 5524 - 5583 6.8 Predicted protein(s) >gi|296494641|gb|ADTN01000097.1| GENE 1 39 - 959 972 306 aa, chain - ## HITS:1 COG:ECs1432 KEGG:ns NR:ns ## COG: ECs1432 COG1560 # Protein_GI_number: 15830686 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lauroyl/myristoyl acyltransferase # Organism: Escherichia coli O157:H7 # 1 306 1 306 306 618 100.0 1e-177 MTNLPKFSTALLHPRYWLTWLGIGVLWLVVQLPYPVIYRLGCGLGKLALRFMKRRAKIVH RNLELCFPEMSEQERRKMVVKNFESVGMGLMETGMAWFWPDRRIARWTEVIGMEHIRDVQ AQKRGILLVGIHFLTLELGARQFGMQEPGIGVYRPNDNPLIDWLQTWGRLRSNKSMLDRK DLKGMIKALKKGEVVWYAPDHDYGPRSSVFVPLFAVEQAATTTGTWMLARMSGACLVPFV PRRKPDGKGYQLIMLPPECSPPLDDAETTAAWMNKVVEKCIMMAPEQYMWLHRRFKTRPE GVPSRY >gi|296494641|gb|ADTN01000097.1| GENE 2 1184 - 2236 954 350 aa, chain + ## HITS:1 COG:yceA KEGG:ns NR:ns ## COG: yceA COG1054 # Protein_GI_number: 16129018 # Func_class: R General function prediction only # Function: Predicted sulfurtransferase # Organism: Escherichia coli K12 # 1 350 1 350 350 742 100.0 0 MPVLHNRISNDALKAKMLAESEPRTTISFYKYFHIADPKATRDALYQLFTALNVFGRVYL AHEGINAQISVPASNVETFRAQLYAFDPALEGLRLNIALDDDGKSFWVLRMKVRDRIVAD GIDDPHFDASNVGEYLQAAEVNAMLDDPDALFIDMRNHYEYEVGHFENALEIPADTFREQ LPKAVEMMQAHKDKKIVMYCTGGIRCEKASAWMKHNGFNKVWHIEGGIIEYARKAREQGL PVRFIGKNFVFDERMGERISDEIIAHCHQCGAPCDSHTNCKNDGCHLLFIQCPVCAEKYK GCCSEICCEESALPPEEQRRRRAGRENGNKIFNKSRGRLNTTLCIPDPTE >gi|296494641|gb|ADTN01000097.1| GENE 3 2278 - 2853 586 191 aa, chain - ## HITS:1 COG:ECs1434 KEGG:ns NR:ns ## COG: ECs1434 COG2353 # Protein_GI_number: 15830688 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 191 1 191 191 360 100.0 1e-99 MKKSLLGLTFASLMFSAGSAVAADYKIDKEGQHAFVNFRIQHLGYSWLYGTFKDFDGTFT FDEKNPAADKVNVTINTTSVDTNHAERDKHLRSADFLNTAKYPQATFTSTSVKKDGDELD 
ITGDLTLNGVTKPVTLEAKLIGQGDDPWGGKRAGFEAEGKIKLKDFNIKTDLGPASQEVD LIISVEGVQQK >gi|296494641|gb|ADTN01000097.1| GENE 4 2857 - 3423 264 188 aa, chain - ## HITS:1 COG:yceJ KEGG:ns NR:ns ## COG: yceJ COG3038 # Protein_GI_number: 16129020 # Func_class: C Energy production and conversion # Function: Cytochrome B561 # Organism: Escherichia coli K12 # 1 188 1 188 188 345 99.0 3e-95 MSFTNTPERYGVISAAFHWLSAIIVYGMFALGLWMVTLSYYDGWYHKAPELHKSIGILLM MGLVIRVLWRVISPPPGPLPSYSPMTRLAARAGHLALYLLLFAIGISGYLISTADGKPIS VFGWFDVPANLADAGAQADFAGALHFWLAWSVVVLSVMHGFMALKHHFIDKDDTLKRMLG KSSSDYGV >gi|296494641|gb|ADTN01000097.1| GENE 5 3684 - 3797 177 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRRLLHYLINNIREHLMLYLFLWGLLAIMDLIYVFYF >gi|296494641|gb|ADTN01000097.1| GENE 6 3845 - 4963 828 372 aa, chain - ## HITS:1 COG:solA KEGG:ns NR:ns ## COG: solA COG0665 # Protein_GI_number: 16129022 # Func_class: E Amino acid transport and metabolism # Function: Glycine/D-amino acid oxidases (deaminating) # Organism: Escherichia coli K12 # 1 372 1 372 372 786 100.0 0 MKYDLIIIGSGSVGAAAGYYATRAGLNVLMTDAHMPPHQHGSHHGDTRLIRHAYGEGEKY VPLVLRAQTLWDELSRHNEEDPIFVRSGVINLGPADSTFLANVAHSAEQWQLNVEKLDAQ GIMARWPEIRVPDNYIGLFETDSGFLRSELAIKTWIQLAKEAGCAQLFNCPVTAIRHDDD GVTIETADGEYQAKKAIVCAGTWVKDLLPELPVQPVRKVFAWYQADGRYSVKNKFPAFTG ELPNGDQYYGFPAENDALKIGKHNGGQVIHSADERVPFAEVASDGSEAFPFLRNVLPGIG CCLYGAACTYDNSPDEDFIIDTLPGHDNTLLITGLSGHGFKFASVLGEIAADFAQDKKSD FDLTPFRLSRFQ >gi|296494641|gb|ADTN01000097.1| GENE 7 5078 - 5332 233 84 aa, chain - ## HITS:1 COG:no KEGG:APECO1_142 NR:ns ## KEGG: APECO1_142 # Name: bssS, yceP # Def: biofilm formation regulatory protein BssS # Organism: E.coli_APEC # Pathway: not_defined # 1 84 2 85 85 164 100.0 1e-39 MEKNNEVIQTHPLVGWDISTVDSYDALMLRLHYQTPNKSEQEGTEVGQTLWLTTDVARQF ISILEAGIAKIESGDFQVNEYRRH Prediction of potential genes in microbial genomes Time: Sun May 15 23:27:49 2011 Seq name: gi|296494640|gb|ADTN01000098.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont242.3, whole genome shotgun sequence Length of sequence - 
18056 bp
Number of predicted genes - 22, with homology - 22
Number of transcription units - 7, operones - 5 average op.length - 4.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Op 1 . - CDS 33 - 278 180 ## UTI89_C1186 DNA damage-inducible protein I
2 1 Op 2 . - CDS 352 - 1272 766 ## COG0418 Dihydroorotase
- Prom 1435 - 1494 2.1
- Term 1468 - 1499 3.2
3 2 Tu 1 . - CDS 1504 - 2064 672 ## APECO1_145 hypothetical protein
- Term 2146 - 2185 8.0
4 3 Op 1 1/1.000 - CDS 2198 - 2845 646 ## COG2999 Glutaredoxin 2
5 3 Op 2 . - CDS 2909 - 4117 1172 ## COG0477 Permeases of the major facilitator superfamily
- Prom 4146 - 4205 2.1
+ Prom 4198 - 4257 5.5
6 4 Op 1 5/0.667 + CDS 4353 - 4937 1067 ## PROTEIN SUPPORTED gi|15801183|ref|NP_287200.1| ribosomal-protein-S5-alanine N-acetyltransferase
7 4 Op 2 4/1.000 + CDS 4948 - 5595 815 ## COG3132 Uncharacterized protein conserved in bacteria
8 4 Op 3 5/0.667 + CDS 5597 - 6520 717 ## COG0673 Predicted dehydrogenases and related proteins
+ Term 6583 - 6611 3.0
9 5 Tu 1 . + CDS 6630 - 8165 1012 ## PROTEIN SUPPORTED gi|145628098|ref|ZP_01783899.1| 30S ribosomal protein S20
+ Term 8173 - 8213 2.7
- Term 8157 - 8200 10.2
10 6 Op 1 7/0.333 - CDS 8205 - 8621 393 ## COG3418 Flagellar biosynthesis/type III secretory pathway chaperone
11 6 Op 2 8/0.000 - CDS 8626 - 8919 375 ## COG2747 Negative regulator of flagellin synthesis (anti-sigma28 factor)
12 6 Op 3 .
- CDS 8995 - 9633 356 ## COG1261 Flagellar basal body P-ring biosynthesis protein - Prom 9677 - 9736 3.9 + Prom 9596 - 9655 2.3 13 7 Op 1 24/0.000 + CDS 9809 - 10225 484 ## COG1815 Flagellar basal body protein 14 7 Op 2 9/0.000 + CDS 10229 - 10633 287 ## COG1558 Flagellar basal body rod protein 15 7 Op 3 16/0.000 + CDS 10645 - 11340 852 ## COG1843 Flagellar hook capping protein 16 7 Op 4 8/0.000 + CDS 11365 - 12573 1295 ## COG1749 Flagellar hook protein FlgE 17 7 Op 5 8/0.000 + CDS 12593 - 13348 655 ## COG4787 Flagellar basal body rod protein + Prom 13413 - 13472 1.7 18 7 Op 6 9/0.000 + CDS 13520 - 14302 994 ## COG4786 Flagellar basal body rod protein + Term 14307 - 14346 2.5 19 7 Op 7 9/0.000 + CDS 14355 - 15053 735 ## COG2063 Flagellar basal body L-ring protein 20 7 Op 8 7/0.333 + CDS 15065 - 16162 929 ## COG1706 Flagellar basal-body P-ring protein 21 7 Op 9 9/0.000 + CDS 16162 - 17103 1040 ## COG3951 Rod binding protein 22 7 Op 10 . + CDS 17169 - 18050 821 ## COG1256 Flagellar hook-associated protein Predicted protein(s) >gi|296494640|gb|ADTN01000098.1| GENE 1 33 - 278 180 81 aa, chain - ## HITS:1 COG:no KEGG:UTI89_C1186 NR:ns ## KEGG: UTI89_C1186 # Name: not_defined # Def: DNA damage-inducible protein I # Organism: E.coli_UTI89 # Pathway: not_defined # 1 81 20 100 100 156 100.0 2e-37 MRIEVTIAKTSPLPAGAIDALAGELSRRIQYAFPDNEGHVSVRYAAANNLSVIGATKEDK QRISEILQETWESADDWFVSE >gi|296494640|gb|ADTN01000098.1| GENE 2 352 - 1272 766 306 aa, chain - ## HITS:1 COG:pyrC KEGG:ns NR:ns ## COG: pyrC COG0418 # Protein_GI_number: 16129025 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase # Organism: Escherichia coli K12 # 1 306 43 348 348 630 100.0 1e-180 MPNLAPPVTTVEAAVAYRQRILDAVPAGHDFTPLMTCYLTDSLDPNELERGFNEGVFTAA KLYPANATTNSSHGVTSIDAIMPVLERMEKIGMPLLVHGEVTHADIDIFDREARFIESVM EPLRQRLTALKVVFEHITTKDAADYVRDGNERLAATITPQHLMFNRNHMLVGGVRPHLYC LPILKRNIHQQALRELVASGFNRVFLGTDSAPHARHRKESSCGCAGCFNAPTALGSYATV FEEMNALQHFEAFCSVNGPQFYGLPVNDTFIELVREEQQVAESIALTDDTLVPFLAGETV 
RWSVKQ >gi|296494640|gb|ADTN01000098.1| GENE 3 1504 - 2064 672 186 aa, chain - ## HITS:1 COG:no KEGG:APECO1_145 NR:ns ## KEGG: APECO1_145 # Name: yceB # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 186 20 205 205 353 100.0 2e-96 MNKFLFAAALIVSGLLVGCNQLTQYTITEQEINQSLAKHNNFSKDIGLPGVADAHIVLTN LTSQIGREEPNKVTLTGDANLDMNSLFGSQKATMKLKLKALPVFDKEKGAIFLKEMEVVD ATVQPEKMQTVMQTLLPYLNQALRNYFNQQPAYVLREDGSQGEAMAKKLAKGIEVKPGEI VIPFTD >gi|296494640|gb|ADTN01000098.1| GENE 4 2198 - 2845 646 215 aa, chain - ## HITS:1 COG:ECs1442 KEGG:ns NR:ns ## COG: ECs1442 COG2999 # Protein_GI_number: 15830696 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutaredoxin 2 # Organism: Escherichia coli O157:H7 # 1 215 1 215 215 426 99.0 1e-119 MKLYIYDHCPYCLKARMIFGLKNIPVELHVLLNDDAETPTRMVGQKQVPILQKDDSRYMP ESMDIVHYVDKLNGKPLLTGKRSPAIEEWLRKVNGYANKLLLPRFAKSAFDEFSTPAARK YFVDKKEASAGNFADLLAHSDGLIKNISDDLRALDKLIVKPNAVNGELSEDDIQLFPLLR NLTLVAGINWPSRVADYRDNMAKQTQINLLSSMAI >gi|296494640|gb|ADTN01000098.1| GENE 5 2909 - 4117 1172 402 aa, chain - ## HITS:1 COG:yceL KEGG:ns NR:ns ## COG: yceL COG0477 # Protein_GI_number: 16129028 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 402 11 412 412 701 100.0 0 MSRVSQARNLGKYFLLIDNMLVVLGFFVVFPLISIRFVDQMGWAAVMVGIALGLRQFIQQ GLGIFGGAIADRFGAKPMIVTGMLMRAAGFATMGIAHEPWLLWFSCLLSGLGGTLFDPPR SALVVKLIRPQQRGRFFSLLMMQDSAGAVIGALLGSWLLQYDFRLVCATGAVLFVLCAAF NAWLLPAWKLSTVRTPVREGMTRVMRDKRFVTYVLTLAGYYMLAVQVMLMLPIMVNDVAG APSAVKWMYAIEACLSLTLLYPIARWSEKHFRLEHRLMAGLLIMSLSMMPVGMVSGLQQL FTLICLFYIGSIIAEPARETLSASLADARARGSYMGFSRLGLAIGGAIGYIGGGWLFDLG KSAHQPELPWMMLGIIGIFTFLALGWQFSQKRAARRLLERDA >gi|296494640|gb|ADTN01000098.1| GENE 6 4353 - 4937 1067 194 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15801183|ref|NP_287200.1| 
ribosomal-protein-S5-alanine N-acetyltransferase [Escherichia coli O157:H7 EDL933] # 1 194 1 194 194 415 100 1e-115 MFGYRSNVPKVRLTTDRLVVRLVHDRDAWRLADYYAENRHFLKPWEPVRDESHCYPSGWQ ARLGMINEFHKQGSAFYFGLFDPDEKEIIGVANFSNVVRGSFHACYLGYSIGQKWQGKGL MFEALTAAIRYMQRTQHIHRIMANYMPHNKRSGDLLARLGFEKEGYAKDYLLIDGQWRDH VLTALTTPDWTPGR >gi|296494640|gb|ADTN01000098.1| GENE 7 4948 - 5595 815 215 aa, chain + ## HITS:1 COG:yceH KEGG:ns NR:ns ## COG: yceH COG3132 # Protein_GI_number: 16129030 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 215 1 215 215 397 100.0 1e-111 MKYQLTALEARVIGCLLEKQVTTPEQYPLSVNGVVTACNQKTNREPVMNLSESEVQEQLD NLVKRHYLRTVSGFGNRVTKYEQRFCNSEFGDLKLSAAEVALITTLLLRGAQTPGELRSR AARMYEFSDMAEVESTLEQLANREDGPFVVRLAREPGKRENRYMHLFSGEVEDQPAVTDM SNAVDGDLQARVEALEIEVAELKQRLDSLLAHLGD >gi|296494640|gb|ADTN01000098.1| GENE 8 5597 - 6520 717 307 aa, chain + ## HITS:1 COG:mviM KEGG:ns NR:ns ## COG: mviM COG0673 # Protein_GI_number: 16129031 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Escherichia coli K12 # 1 307 1 307 307 622 100.0 1e-178 MKKLRIGVVGLGGIAQKAWLPVLAAASDWTLQGAWSPTRAKALPICESWRIPYADSLSSL AASCDAVFVHSSTASHFDVVSTLLNAGVHVCVDKPLAENLRDAERLVELAARKKLTLMVG FNRRFAPLYGELKTQLATAASLRMDKHRSNSVGPHDLYFTLLDDYLHVVDTALWLSGGKA SLDGGTLLTNDAGEMLFAEHHFSAGPLQITTCMHRRAGSQRETVQAVTDGALIDITDMRE WREERGQGVVHKPIPGWQSTLEQRGFVGCARHFIECVQNQTVPQTAGEQAVLAQRIVDKI WRDAMSE >gi|296494640|gb|ADTN01000098.1| GENE 9 6630 - 8165 1012 511 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145628098|ref|ZP_01783899.1| 30S ribosomal protein S20 [Haemophilus influenzae 22.1-21] # 3 502 5 516 524 394 40 1e-109 MNLLKSLAAVSSMTMFSRVLGFARDAIVARIFGAGMATDAFFVAFKLPNLLRRIFAEGAF SQAFVPILAEYKSKQGEDATRVFVSYVSGLLTLALAVVTVAGMLAAPWVIMVTAPGFADT ADKFALTSQLLKITFPYILLISLASLVGAILNTWNRFSIPAFAPTLLNISMIGFALFAAP YFNPPVLALAWAVTVGGVLQLVYQLPHLKKIGMLVLPRINFHDAGAMRVVKQMGPAILGV 
SVSQISLIINTIFASFLASGSVSWMYYADRLMEFPSGVLGVALGTILLPSLSKSFASGNH DEYNRLMDWGLRLCFLLALPSAVALGILSGPLTVSLFQYGKFTAFDALMTQRALIAYSVG LIGLIVVKVLAPGFYSRQDIKTPVKIAIVTLILTQLMNLAFIGPLKHAGLSLSIGLAACL NASLLYWQLRKQKIFTPQPGWMAFLLRLVVAVLVMSGVLLGMLHIMPEWSLGTMPWRLLR LMAVVLAGIAAYFAALAVLGFKVKEFARRTV >gi|296494640|gb|ADTN01000098.1| GENE 10 8205 - 8621 393 138 aa, chain - ## HITS:1 COG:flgN KEGG:ns NR:ns ## COG: flgN COG3418 # Protein_GI_number: 16129033 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport; O Posttranslational modification, protein turnover, chaperones # Function: Flagellar biosynthesis/type III secretory pathway chaperone # Organism: Escherichia coli K12 # 1 138 1 138 138 209 100.0 1e-54 MTRLAEILDQMSAVLNDLKTVMDQEQQHLSMGQINGSQLQWITEQKSSLLATLDYLEQLR RKEPNTANSVDISQRWQEITVKTQQLRQMNQHNGWLLEGQIERNQQALEMLKPHQEPTLY GANGQTSTTHRGGKKISI >gi|296494640|gb|ADTN01000098.1| GENE 11 8626 - 8919 375 97 aa, chain - ## HITS:1 COG:flgM KEGG:ns NR:ns ## COG: flgM COG2747 # Protein_GI_number: 16129034 # Func_class: K Transcription; N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Negative regulator of flagellin synthesis (anti-sigma28 factor) # Organism: Escherichia coli K12 # 1 97 1 97 97 123 100.0 8e-29 MSIDRTSPLKPVSTVQPRETTDAPVTNSRAAKTTASTSTSVTLSDAQAKLMQPGSSDINL ERVEALKLAIRNGELKMDTGKIADALINEAQQDLQSN >gi|296494640|gb|ADTN01000098.1| GENE 12 8995 - 9633 356 212 aa, chain - ## HITS:1 COG:flgA KEGG:ns NR:ns ## COG: flgA COG1261 # Protein_GI_number: 16129035 # Func_class: N Cell motility; O Posttranslational modification, protein turnover, chaperones # Function: Flagellar basal body P-ring biosynthesis protein # Organism: Escherichia coli K12 # 1 212 8 219 219 366 99.0 1e-101 MAIIAILFSPLSTASNLTSQLHNFFSAQLAGVSDEVRVSIRTAPNLLPPCEQPLLSMSNN SRLWGNVNVLARCGNDKRYLQVNVQATGNYVVAAMPIARGGKLEAGNVKLKRGRLDTLPP RTVLDINQLVDAISLRDLSPDQPIQLTQFRQAWRVKAGQRVNVIASGDGFSANAEGQALN NAAVAQNARVRMVSGQVVSGVVDADGNILINL 
>gi|296494640|gb|ADTN01000098.1| GENE 13 9809 - 10225 484 138 aa, chain + ## HITS:1 COG:flgB KEGG:ns NR:ns ## COG: flgB COG1815 # Protein_GI_number: 16129036 # Func_class: N Cell motility # Function: Flagellar basal body protein # Organism: Escherichia coli K12 # 1 138 1 138 138 228 100.0 2e-60 MLDKLDAALRFQQEALNLRAQRQEVLAANIANADTPGYQARDIDFASELKKVMQRGRDAT SVVALTMTSTQHIPAQALTPPTAELQYRIPDQPSLDGNTVDMDRERTQFADNSLQYQMSL SALSGQIKGMMNVLQSGN >gi|296494640|gb|ADTN01000098.1| GENE 14 10229 - 10633 287 134 aa, chain + ## HITS:1 COG:ECs1452 KEGG:ns NR:ns ## COG: ECs1452 COG1558 # Protein_GI_number: 15830706 # Func_class: N Cell motility # Function: Flagellar basal body rod protein # Organism: Escherichia coli O157:H7 # 1 134 1 134 134 223 100.0 8e-59 MALLNIFDIAGSALTAQSQRLNVAASNLANADSVTGPDGQPYRAKQVVFQVNAAPGAATG GVKVADVIESQAPDKLVYEPGNPLADAKGYVKMPNVDVVGEMVNTMSASRSYQANVEVLN TVKSMMLKTLTLGQ >gi|296494640|gb|ADTN01000098.1| GENE 15 10645 - 11340 852 231 aa, chain + ## HITS:1 COG:ECs1453 KEGG:ns NR:ns ## COG: ECs1453 COG1843 # Protein_GI_number: 15830707 # Func_class: N Cell motility # Function: Flagellar hook capping protein # Organism: Escherichia coli O157:H7 # 1 231 1 231 231 312 99.0 3e-85 MSIAVTTTDPTNTGVSTTSSSSLTGSNAADLQSSFLTLLVAQLKNQDPTNPMENNELTSQ LAQISTVSGIEKLNTTLGSISGQIDNSQSLQASNLIGHGVMIPGTTVLAGTGSEEGAVTT TTPFGVELQQAADKVTATITDKNGAVVRTIDIGELTAGVHSFTWDGTLTDGSTAPNGSYN VAISASNGGTQLVAQPLQFALVQGVIRGNSGNTLDLGTYGTTTLDEVRQII >gi|296494640|gb|ADTN01000098.1| GENE 16 11365 - 12573 1295 402 aa, chain + ## HITS:1 COG:flgE KEGG:ns NR:ns ## COG: flgE COG1749 # Protein_GI_number: 16129039 # Func_class: N Cell motility # Function: Flagellar hook protein FlgE # Organism: Escherichia coli K12 # 1 402 1 402 402 620 100.0 1e-177 MAFSQAVSGLNAAATNLDVIGNNIANSATYGFKSGTASFADMFAGSKVGLGVKVAGITQD FTDGTTTNTGRGLDVAISQNGFFRLVDSNGSVFYSRNGQFKLDENRNLVNMQGLQLTGYP ATGTPPTIQQGANPTNISIPNTLMAAKTTTTASMQINLNSSDPLPTVTPFSASNADSYNK KGSVTVFDSQGNAHDMSVYFVKTGDNNWQVYTQDSSDPNSIAKTATTLEFNANGTLVDGA 
MANNIATGAINGAEPATFSLSFLNSMQQNTGANNIVATTQNGYKPGDLVSYQINDDGTVV GNYSNEQTQLLGQIVLANFANNEGLASEGDNVWSATQSSGVALLGTAGTGNFGTLTNGAL EASNVDLSKELVNMIVAQRNYQSNAQTIKTQDQILNTLVNLR >gi|296494640|gb|ADTN01000098.1| GENE 17 12593 - 13348 655 251 aa, chain + ## HITS:1 COG:flgF KEGG:ns NR:ns ## COG: flgF COG4787 # Protein_GI_number: 16129040 # Func_class: N Cell motility # Function: Flagellar basal body rod protein # Organism: Escherichia coli K12 # 1 251 1 251 251 418 100.0 1e-117 MDHAIYTAMGAASQTLNQQAVTASNLANASTPGFRAQLNALRAVPVEGLSLPTRTLVTAS TPGADMTPGKMDYTSRPLDVALQQDGWLAVQTADGSEGYTRNGSIQVDPTGQLTIQGHPV IGEAGPIAVPEGAEITIAADGTISALNPGDPANTVAPVGRLKLVKATGSEVQRGDDGIFR LSAETQATRGPVLQADPTLRVMSGVLEGSNVNAVAAMSDMIASARRFEMQMKVISSVDDN AGRANQLLSMS >gi|296494640|gb|ADTN01000098.1| GENE 18 13520 - 14302 994 260 aa, chain + ## HITS:1 COG:ECs1456 KEGG:ns NR:ns ## COG: ECs1456 COG4786 # Protein_GI_number: 15830710 # Func_class: N Cell motility # Function: Flagellar basal body rod protein # Organism: Escherichia coli O157:H7 # 1 260 1 260 260 434 100.0 1e-122 MISSLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTIRQPGAQSSEQT TLPSGLQIGTGVRPVATERLHSQGNLSQTNNSKDVAIKGQGFFQVMLPDGSSAYTRDGSF QVDQNGQLVTAGGFQVQPAITIPANALSITIGRDGVVSVTQQGQAAPVQVGQLNLTTFMN DTGLESIGENLYTETQSSGAPNESTPGLNGAGLLYQGYVETSNVNVAEELVNMIQVQRAY EINSKAVSTTDQMLQKLTQL >gi|296494640|gb|ADTN01000098.1| GENE 19 14355 - 15053 735 232 aa, chain + ## HITS:1 COG:ECs1457 KEGG:ns NR:ns ## COG: ECs1457 COG2063 # Protein_GI_number: 15830711 # Func_class: N Cell motility # Function: Flagellar basal body L-ring protein # Organism: Escherichia coli O157:H7 # 1 232 1 232 232 412 100.0 1e-115 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM >gi|296494640|gb|ADTN01000098.1| GENE 20 15065 - 16162 929 365 aa, chain + ## HITS:1 COG:flgI KEGG:ns NR:ns ## COG: flgI COG1706 # 
Protein_GI_number: 16129043 # Func_class: N Cell motility # Function: Flagellar basal-body P-ring protein # Organism: Escherichia coli K12 # 1 365 1 365 365 553 99.0 1e-157 MIKFLSALILLLVTTAAQAERIRDLTSVQGVRQNSLIGYGLVVGLDGTGDQTTQTPFTTQ TLNNMLSQLGITVPTGTNMQLKNVAAVMVTASLPPFGRQGQTIDVVVSSMGNAKSLRGGT LLMTPLKGVDSQVYALAQGNILVGGAGASAGGSSVQVNQLNGGRITNGAVIERELPSQFG VGNTLNLQLNDEDFSMAQQIADTINRVRGYGSATALDARTIQVRVPSGNSSQVRFLADIQ NMQVNVTPQDAKVVINSRTGSVVMNREVTLDSCAGAQGNLSVTVNRQANVSQPDTPFGGG QTVVTPQTQIDLRQSGGSLQSVRSSASLNNVVRALNALGATPMDLMSILQSMQSAGCLRA KLEII >gi|296494640|gb|ADTN01000098.1| GENE 21 16162 - 17103 1040 313 aa, chain + ## HITS:1 COG:flgJ_1 KEGG:ns NR:ns ## COG: flgJ_1 COG3951 # Protein_GI_number: 16129044 # Func_class: M Cell wall/membrane/envelope biogenesis; N Cell motility; O Posttranslational modification, protein turnover, chaperones # Function: Rod binding protein # Organism: Escherichia coli K12 # 1 167 1 167 167 291 100.0 1e-78 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK VSKTYSMNIDNLF >gi|296494640|gb|ADTN01000098.1| GENE 22 17169 - 18050 821 293 aa, chain + ## HITS:1 COG:flgK KEGG:ns NR:ns ## COG: flgK COG1256 # Protein_GI_number: 16129045 # Func_class: N Cell motility # Function: Flagellar hook-associated protein # Organism: Escherichia coli K12 # 1 287 1 287 547 467 100.0 1e-131 MSSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYV SGVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTL VSNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLN DQISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGST ARQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRVMLPTY Prediction of potential genes in microbial genomes Time: Sun May 15 23:27:59 2011 Seq name: gi|296494639|gb|ADTN01000099.1| Escherichia 
coli MS 146-1 E_coli146-1-1.0_Cont245.1, whole genome shotgun sequence Length of sequence - 20906 bp Number of predicted genes - 16, with homology - 14 Number of transcription units - 14, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 25 - 852 73 ## COG0666 FOG: Ankyrin repeat - Prom 877 - 936 6.2 + Prom 866 - 925 6.0 2 2 Tu 1 . + CDS 1066 - 1134 89 ## 3 3 Tu 1 . - CDS 1169 - 1993 765 ## COG1414 Transcriptional regulator - Prom 2153 - 2212 3.7 + Prom 1989 - 2048 5.3 4 4 Tu 1 . + CDS 2205 - 5876 4238 ## COG1410 Methionine synthase I, cobalamin-binding domain + Term 5909 - 5945 7.1 + Prom 5894 - 5953 4.3 5 5 Tu 1 . + CDS 6096 - 7727 1736 ## COG1283 Na+/phosphate symporter - Term 7779 - 7811 3.2 6 6 Tu 1 . - CDS 7818 - 8507 790 ## COG3340 Peptidase E - Prom 8712 - 8771 3.2 + Prom 8640 - 8699 4.3 7 7 Tu 1 . + CDS 8719 - 9591 941 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases + Term 9637 - 9689 2.5 - Term 9605 - 9654 -0.9 8 8 Tu 1 . - CDS 9724 - 9996 355 ## B21_03855 hypothetical protein - Prom 10099 - 10158 6.8 - Term 10201 - 10242 5.5 9 9 Tu 1 . - CDS 10249 - 11598 166 ## PROTEIN SUPPORTED gi|145633033|ref|ZP_01788765.1| 50S ribosomal protein L25 + Prom 12029 - 12088 5.4 10 10 Tu 1 . + CDS 12123 - 13772 2050 ## COG0166 Glucose-6-phosphate isomerase + Term 13800 - 13859 6.2 + Prom 14044 - 14103 9.6 11 11 Tu 1 . + CDS 14271 - 14513 234 ## + Term 14585 - 14612 0.1 + Prom 14519 - 14578 2.1 12 12 Op 1 . + CDS 14627 - 15265 729 ## JW5711 predicted lipoprotein 13 12 Op 2 . + CDS 15262 - 15999 499 ## S3554 hypothetical protein 14 12 Op 3 . + CDS 15999 - 18095 2337 ## JW3989 predicted porin + Term 18103 - 18149 6.0 + Prom 18500 - 18559 2.5 15 13 Tu 1 . + CDS 18690 - 19100 598 ## COG3223 Predicted membrane protein + Term 19139 - 19183 1.9 - Term 19090 - 19136 8.2 16 14 Tu 1 . 
- CDS 19144 - 20619 1682 ## COG0477 Permeases of the major facilitator superfamily - Prom 20742 - 20801 5.2 Predicted protein(s) >gi|296494639|gb|ADTN01000099.1| GENE 1 25 - 852 73 275 aa, chain - ## HITS:1 COG:arp KEGG:ns NR:ns ## COG: arp COG0666 # Protein_GI_number: 16131843 # Func_class: R General function prediction only # Function: FOG: Ankyrin repeat # Organism: Escherichia coli K12 # 1 275 1 275 728 555 99.0 1e-158 MITRIPRSSFSANINNTAQTNEHQTLSELFYKELEDKFSGKELATPLLKSFSENCRQNGR HIFSNKDFVIKFSTSVLQADKKEITIINKNENTTLTQTIAPIFEKYLMEILPQRSDTLDK QELNLKSDRKEKEFPRIKLNGQCYFPGRPQNRIVCRHIAAQYINDIYQNVDYKPHQDDYS SAEKFLTHFNKKCKNQTLALVSSRPEGRCVAACGDFGLVMKAYFDKMESNGISVMAAILL VDNHALTVRLRIKNTTEGCTHYVVSVYDPNVTNDR >gi|296494639|gb|ADTN01000099.1| GENE 2 1066 - 1134 89 22 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTLSYFGVNAINRHPYSIATQV >gi|296494639|gb|ADTN01000099.1| GENE 3 1169 - 1993 765 274 aa, chain - ## HITS:1 COG:ECs4936 KEGG:ns NR:ns ## COG: ECs4936 COG1414 # Protein_GI_number: 15834190 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 274 14 287 287 526 100.0 1e-149 MVAPIPAKRGRKPAVATAPATGQVQSLTRGLKLLEWIAESNGSVALTELAQQAGLPNSTT HRLLTTMQQQGFVRQVGELGHWAIGAHAFMVGSSFLQSRNLLAIVHPILRNLMEESGETV NMAVLDQSDHEAIIIDQVQCTHLMRMSAPIGGKLPMHASGAGKAFLAQLSEEQVTKLLHR KGLHAYTHATLVSPVHLKEDLAQTRKRGYSFDDEEHALGLRCLAACIFDEHREPFAAISI SGPISRITDDRVTEFGAMVIKAAKEVTLAYGGMR >gi|296494639|gb|ADTN01000099.1| GENE 4 2205 - 5876 4238 1223 aa, chain + ## HITS:1 COG:metH_2 KEGG:ns NR:ns ## COG: metH_2 COG1410 # Protein_GI_number: 16131845 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I, cobalamin-binding domain # Organism: Escherichia coli K12 # 323 1223 1 901 901 1833 99.0 0 MEQLRAQLNERILVLDGGMGTMIQSYRLNEADFRGERFADWPCDLKGNNDLLVLSKPEVI AAIHNAYFEAGADIIETNTFNSTTIAMADYQMESLSAEINFAAAKLARACADEWTARTPE KPRYVAGVLGPTNRTASISPDVNDPAFRNITFDGLVAAYRESTKALVEGGADLILIETVF DTLNAKAAVFAVKTEFEALGVELPIMISGTITDASGRTLSGQTTEAFYNSLRHAEALTFG 
LNCALGPDELRQYVQELSRIAECYVTAHPNAGLPNAFGEYDLDADTMAKQIREWAQAGFL NIVGGCCGTTPQHIAAMSRAVEGLAPRKLPEIPVACRLSGLEPLNIGEDSLFVNVGERTN VTGSAKFKRLIKEEKYSEALDVARQQVENGAQIIDINMDEGMLDAEAAMVRFLNLIAGEP DIARVPIMIDSSKWDVIEKGLKCIQGKGIVNSISMKEGVDAFIHHAKLLRRYGAAVVVMA FDEQGQADTRARKIEICRRAYKILTEEVGFPPEDIIFDPNIFAVATGIEEHNNYAQDFIG ACEDIKRELPHALISGGVSNVSFSFRGNDPVREAIHAVFLYYAIRNGMDMGIVNAGQLAI YDDLPAELRDAVEDVILNRRDDGTERLLELAEKYRGSKTDDTANAQQAEWRSWEVNKRLE YSLVKGITEFIEQDTEEARQQATRPIEVIEGPLMDGMNVVGDLFGEGKMFLPQVVKSARV MKQAVAYLEPFIEASKEQGKTNGKMVIATVKGDVHDIGKNIVGVVLQCNNYEIVDLGVMV PAEKILRTAKEVNADLIGLSGLITPSLDEMVNVAKEMERQGFTIPLLIGGATTSKAHTAV KIEQNYSGPTVYVQNASRTVGVVAALLSDTQRDDFVARTRKEYETVRIQHGRKKPRTPPV TLEAARDNDFAFDWQAYTPPVAHRLGVQEVEAGIETLRNYIDWTPFFMTWSLAGKYPRIL EDEVVGVEAQRLFKDANDMLDKLSAEKTLNPRGVVGLFPANRVGDDIEIYRDETRTHVIN VSHHLRQQTEKTGFANYCLADFVAPKLSGKADYIGAFAVTGGLEEDALADAFEAQHDDYN KIMVKALADRLAEAFAEYLHERVRKVYWGYAPNENLSNEELIRENYQGIRPAPGYPACPE HTEKATIWELLEVEKHTGMKLTESFAMWPGASVSGWYFSHPDSKYYAVAQIQRDQVEDYA RRKGMSVTEVERWLAPNLGYDAD >gi|296494639|gb|ADTN01000099.1| GENE 5 6096 - 7727 1736 543 aa, chain + ## HITS:1 COG:yjbB KEGG:ns NR:ns ## COG: yjbB COG1283 # Protein_GI_number: 16131846 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Escherichia coli K12 # 1 543 1 543 543 945 100.0 0 MLTLLHLLSAVALLVWGTHIVRTGVMRVFGARLRTVLSRSVEKKPLAFCAGIGVTALVQS SNATTMLVTSFVAQDLVALAPALVIVLGADVGTALMARILTFDLSWLSPLLIFIGVIFFL GRKQSRAGQLGRVGIGLGLILLALELIVQAVTPITQANGVQVIFASLTGDILLDALIGAM FAIISYSSLAAVLLTATLTAAGIISFPVALCLVIGANLGSGLLAMLNNSAANAAARRVAL GSLLFKLVGSLIILPFVHLLAETMGKLSLPKAELVIYFHVFYNLVRCLVMLPFVDPMARF CKTIIRDEPELDTQLRPKHLDVSALDTPTLALANAARETLRIGDAMEQMMEGLNKVMHGE PRQEKELRKLADDINVLYTAIKLYLARMPKEELAEEESRRWAEIIEMSLNLEQASDIVER MGSEIADKSLAARRAFSLDGLKELDALYEQLLSNLKLAMSVFFSGDVTSARRLRRSKHRF RILNRRYSHAHVDRLHQQNVQSIETSSLHLGLLGDMQRLNSLFCSVAYSVLEQPDEDEGR DEY >gi|296494639|gb|ADTN01000099.1| GENE 6 7818 - 8507 790 229 aa, chain - ## HITS:1 COG:ECs4939 KEGG:ns NR:ns ## COG: ECs4939 COG3340 # 
Protein_GI_number: 15834193 # Func_class: E Amino acid transport and metabolism # Function: Peptidase E # Organism: Escherichia coli O157:H7 # 1 229 1 229 229 452 100.0 1e-127 MELLLLSNSTLPGKAWLEHALPLIAEQLQGRRSAVFIPFAGVTQTWDDYTAKTAAVLAPL GVSVTGIHSVVDPVAAIENAEIVIVGGGNTFQLLKQCRERGLLAPITDVVKRGALYIGWS AGANLACPTIRTTNDMPIVDPQGFDALNLFPLQINPHFTNALPEGHKGETREQRIRELLV VAPELTIIGLPEGNWITVSKGHATLGGPNTTYVFKAGEEAVPLEAGHRF >gi|296494639|gb|ADTN01000099.1| GENE 7 8719 - 9591 941 290 aa, chain + ## HITS:1 COG:yjbC KEGG:ns NR:ns ## COG: yjbC COG1187 # Protein_GI_number: 16131848 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Escherichia coli K12 # 1 290 1 290 290 518 100.0 1e-147 MLPDSSVRLNKYISESGICSRREADRYIEQGNVFLNGKRATIGDQVKPGDVVKVNGQLIE PREAEDLVLIALNKPVGIVSTTEDGERDNIVDFVNHSKRVFPIGRLDKDSQGLIFLTNHG DLVNKILRAGNDHEKEYLVTVDKPITEEFIRGMSAGVPILGTVTKKCKVKKEAPFVFRIT LVQGLNRQIRRMCEHFGYEVKKLERTRIMNVSLSGIPLGEWRDLTDDELIDLFKLIENSS SEVKPKAKAKPKTAGIKRPVVKMEKTAEKGGRPASNGKRFTSPGRKKKGR >gi|296494639|gb|ADTN01000099.1| GENE 8 9724 - 9996 355 90 aa, chain - ## HITS:1 COG:no KEGG:B21_03855 NR:ns ## KEGG: B21_03855 # Name: yjbD # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 90 1 90 90 120 100.0 2e-26 MALPRITQKEMTEREQRELKTLLDRARIAHGRVLTNSETNSIKKDYIDKLMVEREAEAKK ARQLKKKQAYKPDPEASFSWSANTSTRGRR >gi|296494639|gb|ADTN01000099.1| GENE 9 10249 - 11598 166 449 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145633033|ref|ZP_01788765.1| 50S ribosomal protein L25 [Haemophilus influenzae 3655] # 391 449 63 121 121 68 49 3e-11 MSEIVVSKFGGTSVADFDAMNRSADIVLSDANVRLVVLSASAGITNLLVALAEGLEPGER FEKLDAIRNIQFAILERLRYPNVIREEIERLLENITVLAEAAALATSPALTDELVSHGEL MSTLLFVEILRERDVQAQWFDVRKVMRTNDRFGRAEPDIAALAELAALQLLPRLNEGLVI TQGFIGSENKGRTTTLGRGGSDYTAALLAEALHASRVDIWTDVPGIYTTDPRVVSAAKRI DEIAFAEAAEMATFGAKVLHPATLLPAVRSDIPVFVGSSKDPRAGGTLVCNKTENPPLFR 
ALALRRNQTLLTLHSLNMLHSRGFLAEVFGILARHNISVDLITTSEVSVALTLDTTGSTS TGDTLLTQSLLMELSALCRVEVEEGLALVALIGNDLSKACGVGKEVFGVLEPFNIRMICY GASSHNLCFLVPGEDAEQVVQKLHSNLFE >gi|296494639|gb|ADTN01000099.1| GENE 10 12123 - 13772 2050 549 aa, chain + ## HITS:1 COG:ECs5008 KEGG:ns NR:ns ## COG: ECs5008 COG0166 # Protein_GI_number: 15834262 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate isomerase # Organism: Escherichia coli O157:H7 # 1 549 1 549 549 1134 100.0 0 MKNINPTQTAAWQALQKHFDEMKDVTIADLFAKDGDRFSKFSATFDDQMLVDYSKNRITE ETLAKLQDLAKECDLAGAIKSMFSGEKINRTENRAVLHVALRNRSNTPILVDGKDVMPEV NAVLEKMKTFSEAIISGEWKGYTGKAITDVVNIGIGGSDLGPYMVTEALRPYKNHLNMHF VSNVDGTHIAEVLKKVNPETTLFLVASKTFTTQETMTNAHSARDWFLKAAGDEKHVAKHF AALSTNAKAVGEFGIDTANMFEFWDWVGGRYSLWSAIGLSIVLSIGFDNFVELLSGAHAM DKHFSTTPAEKNLPVLLALIGIWYNNFFGAETEAILPYDQYMHRFAAYFQQGNMESNGKY VDRNGNVVDYQTGPIIWGEPGTNGQHAFYQLIHQGTKMVPCDFIAPAITHNPLSDHHQKL LSNFFAQTEALAFGKSREVVEQEYRDQGKDPATLDYVVPFKVFEGNRPTNSILLREITPF SLGALIALYEHKIFTQGVILNIFTFDQWGVELGKQLANRILPELKDDKEISSHDSSTNGL INRYKAWRG >gi|296494639|gb|ADTN01000099.1| GENE 11 14271 - 14513 234 80 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKVLYGIFAISALAATSAWAAPVQVGEAAGSAATSVSAGSSSATSVSTVSSAVGVALAA TGGGDGSNTGTTTTTTTSTQ >gi|296494639|gb|ADTN01000099.1| GENE 12 14627 - 15265 729 212 aa, chain + ## HITS:1 COG:no KEGG:JW5711 NR:ns ## KEGG: JW5711 # Name: yjbF # Def: predicted lipoprotein # Organism: E.coli_J # Pathway: not_defined # 1 212 1 212 212 415 100.0 1e-115 MKRPALILICLLLQACSATTKELGNSLWDSLFGTPGVQLTDDDIQNMPYASQYMQLNGGP QLFVVLAFAEDGQQKWVTQDQATLVTQHGRLVKTLLGGDNLIEVNNLAADPLIKPAQIVD GASWTRTMGWTEYQQVRYATARSVFKWDGTDTVKVGSDETPVRVLDEEVSTDQARWHNRY WIDSEGQIRQSEQYLGADYFPVKTTLIKAAKQ >gi|296494639|gb|ADTN01000099.1| GENE 13 15262 - 15999 499 245 aa, chain + ## HITS:1 COG:no KEGG:S3554 NR:ns ## KEGG: S3554 # Name: yjbG # Def: hypothetical protein # Organism: S.flexneri_2457T # Pathway: not_defined # 1 245 1 245 245 447 98.0 1e-124 
MIKQTIVALLLSVGASSVFAAGTVKVFSNGSSEAKTLTGAEHLIDLVGQPRLANSWWPGA VISEELATAAALRQQQALLTRLAEQGADSSADDAAAINALRQQIQALKVTGRQKINLDPD IVRVAERGNPPLQGNYTLWVGPPPSTVTLFGLISRPGKQPFTPGRDVASYLSDQSLLSGA DRSYAWVVYPDGRTQKAPVAYWNKRHVEPMPGSIIYVGLADSVWSETPDALNADILQTLT QRIPQ >gi|296494639|gb|ADTN01000099.1| GENE 14 15999 - 18095 2337 698 aa, chain + ## HITS:1 COG:no KEGG:JW3989 NR:ns ## KEGG: JW3989 # Name: yjbH # Def: predicted porin # Organism: E.coli_J # Pathway: not_defined # 1 698 1 698 698 1415 100.0 0 MKKRHLLSLLALGISTACYGETYPAPIGPSQSDFGGVGLLQTPTARMAREGELSLNYRDN DQYRYYSASVQLFPWLETTLRYTDVRTRQYSSVEAFSGDQTYKDKAFDLKLRLWEESYWL PQVAVGARDIGGTGLFDAEYLVASKAWGPFDFTLGLGWGYLGTSGNVKNPLCSASDKYCY RDNSYKQAGSIDGSQMFHGPASLFGGVEYQTPWQPLRLKLEYEGNNYQQDFAGKLEQKSK FNVGAIYRVTDWADVNLSYERGNTFMFGVTLRTNFNDLRPSYNDNARPQYQPQPQDAILQ HSVVANQLTLLKYNAGLADPQIQAKGDTLYVTGEQVKYRDSREGIIRANRIVMNDLPDGI KTIRITENRLNMPQVTTETDVASLKNHLAGEPLGHETTLAQKRVEPVVPQSTEQGWYIDK SRFDFHIDPVLNQSVGGPENFYMYQLGVMGTADLWLTDHLLTTGSLFANLANNYDKFNYT NPPQDSHLPRVRTHVREYVQNDVYVNNLQANYFQHLGNGFYGQVYGGYLETMFGGAGAEV LYRPLDSNWAFGLDANYVKQRDWRSAKDMMKFTDYSVKTGHLTAYWTPSFAQDVLVKASV GQYLAGDKGGTLEIAKRFDSGVVVGGYATITNVSKEEYGEGDFTKGVYVSVPLDLFSSGP TRSRAAIGWTPLTRDGGQQLGRKFQLYDMTSDRSVNFR >gi|296494639|gb|ADTN01000099.1| GENE 15 18690 - 19100 598 136 aa, chain + ## HITS:1 COG:yjbA KEGG:ns NR:ns ## COG: yjbA COG3223 # Protein_GI_number: 16131856 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 136 1 136 136 199 100.0 2e-51 MTSLSRPRVEFISTILQTVLNLGLLCLGLILVVFLGKETVHLADVLFAPEQTSKYELVEG LVVYFLYFEFIALIVKYFQSGFHFPLRYFVYIGITAIVRLIIVDHKSPLDVLIYSAAILL LVITLWLCNSKRLKRE >gi|296494639|gb|ADTN01000099.1| GENE 16 19144 - 20619 1682 491 aa, chain - ## HITS:1 COG:ECs5014 KEGG:ns NR:ns ## COG: ECs5014 COG0477 # Protein_GI_number: 15834268 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of 
the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 491 1 491 491 940 100.0 0 MNTQYNSSYIFSITLVATLGGLLFGYDTAVISGTVESLNTVFVAPQNLSESAANSLLGFC VASALIGCIIGGALGGYCSNRFGRRDSLKIAAVLFFISGVGSAWPELGFTSINPDNTVPV YLAGYVPEFVIYRIIGGIGVGLASMLSPMYIAELAPAHIRGKLVSFNQFAIIFGQLLVYC VNYFIARSGDASWLNTDGWRYMFASECIPALLFLMLLYTVPESPRWLMSRGKQEQAEGIL RKIMGNTLATQAVQEIKHSLDHGRKTGGRLLMFGVGVIVIGVMLSIFQQFVGINVVLYYA PEVFKTLGASTDIALLQTIIVGVINLTFTVLAIMTVDKFGRKPLQIIGALGMAIGMFSLG TAFYTQAPGIVALLSMLFYVAAFAMSWGPVCWVLLSEIFPNAIRGKALAIAVAAQWLANY FVSWTFPMMDKNSWLVAHFHNGFSYWIYGCMGVLAALFMWKFVPETKGKTLEELEALWEP ETKKTQQTATL Prediction of potential genes in microbial genomes Time: Sun May 15 23:28:28 2011 Seq name: gi|296494638|gb|ADTN01000100.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont245.2, whole genome shotgun sequence Length of sequence - 4127 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 20/0.000 - CDS 76 - 966 1135 ## COG3833 ABC-type maltose transport systems, permease component 2 1 Op 2 19/0.000 - CDS 981 - 2525 1715 ## COG1175 ABC-type sugar transport systems, permease components - Prom 2601 - 2660 5.0 - Term 2569 - 2614 4.3 3 1 Op 3 . - CDS 2679 - 3869 1537 ## COG2182 Maltose-binding periplasmic proteins/domains - Prom 3899 - 3958 2.6 4 2 Tu 1 . 
- CDS 3962 - 4057 66 ## Predicted protein(s) >gi|296494638|gb|ADTN01000100.1| GENE 1 76 - 966 1135 296 aa, chain - ## HITS:1 COG:ECs5015 KEGG:ns NR:ns ## COG: ECs5015 COG3833 # Protein_GI_number: 15834269 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type maltose transport systems, permease component # Organism: Escherichia coli O157:H7 # 1 296 1 296 296 516 99.0 1e-146 MAMVQPKSQKARLFITHLLLLLFIAAIMFPLLMVVAISLRQGNFATGSLIPEQISWDHWK LALGFSVEQADGRITPPPFPVLLWLWNSVKVAGISAIGIVALSTTCAYAFARMRFPGKAT LLKGMLIFQMFPAVLSLVALYALFDRLGEYIPFIGLNTHGGVIFAYLGGIALHVWTIKGY FETIDSSLEEAAALDGATPWQAFRLVLLPLSVPILAVVFILSFIAAITEVPVASLLLRDV NSYTLAVGMQQSLNPQNYLWGDFAAAAVMSALPITIVFLLAQRWLVNGLTAGGVKG >gi|296494638|gb|ADTN01000100.1| GENE 2 981 - 2525 1715 514 aa, chain - ## HITS:1 COG:malF KEGG:ns NR:ns ## COG: malF COG1175 # Protein_GI_number: 16131859 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Escherichia coli K12 # 1 514 1 514 514 978 100.0 0 MDVIKKKHWWQSDALKWSVLGLLGLLVGYLVVLMYAQGEYLFAITTLILSSAGLYIFANR KAYAWRYVYPGMAGMGLFVLFPLVCTIAIAFTNYSSTNQLTFERAQEVLLDRSWQAGKTY NFGLYPAGDEWQLALSDGETGKNYLSDAFKFGGEQKLQLKETTAQPEGERANLRVITQNR QALSDITAILPDGNKVMMSSLRQFSGTQPLYTLDGDGTLTNNQSGVKYRPNNQIGFYQSI TADGNWGDEKLSPGYTVTTGWKNFTRVFTDEGIQKPFLAIFVWTVVFSLITVFLTVAVGM VLACLVQWEALRGKAVYRVLLILPYAVPSFISILIFKGLFNQSFGEINMMLSALFGVKPA WFSDPTTARTMLIIVNTWLGYPYMMILCMGLLKAIPDDLYEASAMDGAGPFQNFFKITLP LLIKPLTPLMIASFAFNFNNFVLIQLLTNGGPDRLGTTTPAGYTDLLVNYTYRIAFEGGG GQDFGLAAAIATLIFLLVGALAIVNLKATRMKFD >gi|296494638|gb|ADTN01000100.1| GENE 3 2679 - 3869 1537 396 aa, chain - ## HITS:1 COG:ECs5017 KEGG:ns NR:ns ## COG: ECs5017 COG2182 # Protein_GI_number: 15834271 # Func_class: G Carbohydrate transport and metabolism # Function: Maltose-binding periplasmic proteins/domains # Organism: Escherichia coli O157:H7 # 1 396 1 396 396 761 99.0 0 MKIKTGARILALSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIK 
VTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTW DAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEP YFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAE AAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPPNPFVGVLSAGINAASPNKE LAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIP QMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITK >gi|296494638|gb|ADTN01000100.1| GENE 4 3962 - 4057 66 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLLAKIVAILCAHLHITANSVTEITQSDGGA Prediction of potential genes in microbial genomes Time: Sun May 15 23:28:42 2011 Seq name: gi|296494637|gb|ADTN01000101.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont245.3, whole genome shotgun sequence Length of sequence - 36586 bp Number of predicted genes - 36, with homology - 35 Number of transcription units - 29, operones - 7 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.250 + CDS 107 - 1222 1206 ## COG3839 ABC-type sugar transport systems, ATPase components 2 1 Op 2 . + CDS 1297 - 2634 1583 ## COG4580 Maltoporin (phage lambda and maltose receptor) + Term 2820 - 2870 1.3 + Prom 2721 - 2780 1.9 3 2 Tu 1 . + CDS 2877 - 3797 831 ## B21_03869 hypothetical protein + Term 3830 - 3861 3.2 - Term 3658 - 3692 -0.3 4 3 Tu 1 . - CDS 3772 - 3924 58 ## ECBD_3997 hypothetical protein + Prom 3841 - 3900 7.2 5 4 Tu 1 3/0.625 + CDS 4068 - 4220 81 ## COG1357 Uncharacterized low-complexity proteins 6 5 Tu 1 . + CDS 4736 - 5605 240 ## COG1357 Uncharacterized low-complexity proteins + Prom 5745 - 5804 4.2 7 6 Op 1 6/0.125 + CDS 5828 - 6325 555 ## COG3161 4-hydroxybenzoate synthetase (chorismate lyase) 8 6 Op 2 . + CDS 6338 - 7210 1061 ## COG0382 4-hydroxybenzoate polyprenyltransferase and related prenyltransferases + Term 7223 - 7252 2.1 - Term 7208 - 7244 5.8 9 7 Tu 1 . 
- CDS 7365 - 9848 2620 ## COG2937 Glycerol-3-phosphate O-acyltransferase - Prom 9959 - 10018 6.0 + Prom 9700 - 9759 4.0 10 8 Tu 1 5/0.250 + CDS 9959 - 10327 434 ## COG0818 Diacylglycerol kinase + Term 10340 - 10377 -1.0 + Prom 10350 - 10409 8.2 11 9 Op 1 1/1.000 + CDS 10437 - 11045 511 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) + Term 11057 - 11096 8.9 12 9 Op 2 . + CDS 11118 - 12443 1343 ## COG0534 Na+-driven multidrug efflux pump + Prom 12448 - 12507 2.5 13 10 Tu 1 . + CDS 12559 - 12768 363 ## COG3237 Uncharacterized protein conserved in bacteria 14 11 Tu 1 . - CDS 12810 - 13325 488 ## COG0735 Fe2+/Zn2+ uptake regulation proteins - Prom 13474 - 13533 4.1 + Prom 13426 - 13485 7.7 15 12 Op 1 . + CDS 13643 - 13897 128 ## SSON_4227 hypothetical protein 16 12 Op 2 . + CDS 13921 - 14628 329 ## JW4008 hypothetical protein + Term 14670 - 14703 3.1 + Prom 14711 - 14770 6.1 17 13 Tu 1 . + CDS 14991 - 16028 1151 ## COG0042 tRNA-dihydrouridine synthase + Term 16059 - 16096 7.0 + Prom 16076 - 16135 4.4 18 14 Tu 1 . + CDS 16162 - 16404 414 ## ECIAI39_4470 phage shock protein G + Term 16537 - 16578 5.4 - Term 16528 - 16561 4.1 19 15 Tu 1 . - CDS 16570 - 17553 820 ## COG0604 NADPH:quinone reductase and related Zn-dependent oxidoreductases - Prom 17626 - 17685 1.7 + Prom 17522 - 17581 3.1 20 16 Op 1 9/0.125 + CDS 17636 - 19051 1564 ## COG0305 Replicative DNA helicase 21 16 Op 2 5/0.250 + CDS 19104 - 20183 1002 ## COG0787 Alanine racemase + Prom 20196 - 20255 3.7 22 17 Tu 1 . + CDS 20436 - 21629 1280 ## COG1448 Aspartate/tyrosine/aromatic aminotransferase 23 18 Tu 1 . - CDS 22131 - 22373 131 ## ECSE_4348 hypothetical protein + Prom 22553 - 22612 7.7 24 19 Tu 1 4/0.625 + CDS 22736 - 23449 499 ## COG3700 Acid phosphatase (class B) + Prom 23477 - 23536 3.3 25 20 Op 1 4/0.625 + CDS 23560 - 23976 345 ## COG0432 Uncharacterized conserved protein 26 20 Op 2 . 
+ CDS 23980 - 24336 457 ## COG2315 Uncharacterized protein conserved in bacteria - Term 24333 - 24363 3.4 27 21 Tu 1 . - CDS 24371 - 27193 3272 ## COG0178 Excinuclease ATPase subunit - Prom 27254 - 27313 5.0 + Prom 27361 - 27420 5.6 28 22 Tu 1 . + CDS 27447 - 27983 759 ## COG0629 Single-stranded DNA-binding protein + Term 28056 - 28084 1.4 - Term 28041 - 28075 5.2 29 23 Tu 1 . - CDS 28082 - 28363 279 ## ECSE_4354 hypothetical protein + Prom 28666 - 28725 3.7 30 24 Tu 1 . + CDS 28793 - 30379 1107 ## COG4943 Predicted signal transduction protein containing sensor and EAL domains - Term 30335 - 30376 7.7 31 25 Tu 1 . - CDS 30382 - 30705 305 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 30746 - 30805 5.8 + Prom 30708 - 30767 5.6 32 26 Tu 1 3/0.625 + CDS 30791 - 31255 485 ## COG0789 Predicted transcriptional regulators + Term 31257 - 31283 -0.7 33 27 Tu 1 5/0.250 + CDS 31801 - 33150 1640 ## COG2252 Permeases + Term 33166 - 33209 9.2 + Prom 33190 - 33249 3.8 34 28 Tu 1 . + CDS 33301 - 34950 1694 ## COG0025 NhaP-type Na+/H+ and K+/H+ antiporters + Term 35136 - 35170 1.2 35 29 Op 1 . - CDS 34930 - 35034 57 ## 36 29 Op 2 . 
- CDS 35104 - 35847 270 ## COG1357 Uncharacterized low-complexity proteins - Prom 36071 - 36130 6.2 Predicted protein(s) >gi|296494637|gb|ADTN01000101.1| GENE 1 107 - 1222 1206 371 aa, chain + ## HITS:1 COG:ECs5018 KEGG:ns NR:ns ## COG: ECs5018 COG3839 # Protein_GI_number: 15834272 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Escherichia coli O157:H7 # 1 371 1 371 371 729 100.0 0 MASVQLQNVTKAWGEVVVSKDINLDIHEGEFVVFVGPSGCGKSTLLRMIAGLETITSGDL FIGEKRMNDTPPAERGVGMVFQSYALYPHLSVAENMSFGLKLAGAKKEVINQRVNQVAEV LQLAHLLDRKPKALSGGQRQRVAIGRTLVAEPSVFLLDEPLSNLDAALRVQMRIEISRLH KRLGRTMIYVTHDQVEAMTLADKIVVLDAGRVAQVGKPLELYHYPADRFVAGFIGSPKMN FLPVKVTATAIDQVQVELPMPNRQQVWLPVESRDVQVGANMSLGIRPEHLLPSDIADVIL EGEVQVVEQLGNETQIHIQIPSIRQNLVYRQNDVVLVEEGATFAIGLPPERCHLFREDGT ACRRLHKEPGV >gi|296494637|gb|ADTN01000101.1| GENE 2 1297 - 2634 1583 445 aa, chain + ## HITS:1 COG:lamB KEGG:ns NR:ns ## COG: lamB COG4580 # Protein_GI_number: 16131862 # Func_class: G Carbohydrate transport and metabolism # Function: Maltoporin (phage lambda and maltose receptor) # Organism: Escherichia coli K12 # 1 445 2 446 446 849 100.0 0 MITLRKLPLAVAVAAGVMSAQAMAVDFHGYARSGIGWTGSGGEQQCFQTTGAQSKYRLGN ECETYAELKLGQEVWKEGDKSFYFDTNVAYSVAQQNDWEATDPAFREANVQGKNLIEWLP GSTIWAGKRFYQRHDVHMIDFYYWDISGPGAGLENIDVGFGKLSLAATRSSEAGGSSSFA SNNIYDYTNETANDVFDVRLAQMEINPGGTLELGVDYGRANLRDNYRLVDGASKDGWLFT AEHTQSVLKGFNKFVVQYATDSMTSQGKGLSQGSGVAFDNEKFAYNINNNGHMLRILDHG AISMGDNWDMMYVGMYQDINWDNDNGTKWWTVGIRPMYKWTPIMSTVMEIGYDNVESQRT GDKNNQYKITLAQQWQAGDSIWSRPAIRVFATYAKWDEKWGYDYTGNADNNANFGKAVPA DFNGGSFGRGDSDEWTFGAQMEIWW >gi|296494637|gb|ADTN01000101.1| GENE 3 2877 - 3797 831 306 aa, chain + ## HITS:1 COG:no KEGG:B21_03869 NR:ns ## KEGG: B21_03869 # Name: malM # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 306 1 306 306 497 100.0 1e-139 MKMNKSLIVLCLSAGLLASAPGISLADVNYVPQNTSDAPAIPSAALQQLTWTPVDQSKTQ 
TTQLATGGQQLNVPGISGPVAAYSVPANIGELTLTLTSEVNKQTSVFAPNVLILDQNMTP SAFFPSSYFTYQEPGVMSADRLEGVMRLTPALGQQKLYVLVFTTEKDLQQTTQLLDPAKA YAKGVGNSIPDIPDPVARHTTDGLLKLKVKTNSSSSVLVGPLFGSSAPAPVTVGNTAAPA VAAPAPAPVKKSEPMLNDTESYFNTAIKNAVAKGDVDKALKLLDEAERLGSTSARSTFIS SVKGKG >gi|296494637|gb|ADTN01000101.1| GENE 4 3772 - 3924 58 50 aa, chain - ## HITS:1 COG:no KEGG:ECBD_3997 NR:ns ## KEGG: ECBD_3997 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_BL21_DE3 # Pathway: not_defined # 1 50 1 50 50 94 100.0 8e-19 MDILSEEEVSYYKKSLISQEGSIKKGAPGDAPVVAKSALWGVITPCLLHC >gi|296494637|gb|ADTN01000101.1| GENE 5 4068 - 4220 81 50 aa, chain + ## HITS:1 COG:ECs5021_1 KEGG:ns NR:ns ## COG: ECs5021_1 COG1357 # Protein_GI_number: 15834275 # Func_class: S Function unknown # Function: Uncharacterized low-complexity proteins # Organism: Escherichia coli O157:H7 # 1 47 15 61 277 77 85.0 7e-15 MKHDVVQGNNIVDLDLLRNLNGVPGLNRDNFIYISNIFSNIKQRNEKIMQ >gi|296494637|gb|ADTN01000101.1| GENE 6 4736 - 5605 240 289 aa, chain + ## HITS:1 COG:yjbI_1 KEGG:ns NR:ns ## COG: yjbI_1 COG1357 # Protein_GI_number: 16131864 # Func_class: S Function unknown # Function: Uncharacterized low-complexity proteins # Organism: Escherichia coli K12 # 1 95 154 248 248 187 100.0 3e-47 MFDYVRMSTGNFKDCITEQLELTIDYSDIFWNEDLDGYINNIIKMIDTLPDNAMILKSVL AVKLVMQLKILNIVNKNFIENMKKIFSHCPYIKDPIIRSYIHSDEDNKFDDFMRQHRFSE VNFDTQQMIDFINRFNTNKWLIDKNNNFFIQLIDQALRSTDDMIKANVWHLYKEWIRSDD VSPIFIETEDNLRTFNTNELTRNDNIFILFSSVDDGPVMVVSSQRLHDMLNPTKDTNWNS TYIYKSRHEMLPVNLTQETLFSSKSHGKYALFPIFTASWRAHRIMNKGV >gi|296494637|gb|ADTN01000101.1| GENE 7 5828 - 6325 555 165 aa, chain + ## HITS:1 COG:ubiC KEGG:ns NR:ns ## COG: ubiC COG3161 # Protein_GI_number: 16131865 # Func_class: H Coenzyme transport and metabolism # Function: 4-hydroxybenzoate synthetase (chorismate lyase) # Organism: Escherichia coli K12 # 1 165 38 202 202 314 100.0 6e-86 MSHPALTQLRALRYCKEIPALDPQLLDWLLLEDSMTKRFEQQGKTVSVTMIREGFVEQNE IPEELPLLPKESRYWLREILLCADGEPWLAGRTVVPVSTLSGPELALQKLGKTPLGRYLF 
TSSTLTRDFIEIGRDAGLWGRRSRLRLSGKPLLLTELFLPASPLY >gi|296494637|gb|ADTN01000101.1| GENE 8 6338 - 7210 1061 290 aa, chain + ## HITS:1 COG:ECs5023 KEGG:ns NR:ns ## COG: ECs5023 COG0382 # Protein_GI_number: 15834277 # Func_class: H Coenzyme transport and metabolism # Function: 4-hydroxybenzoate polyprenyltransferase and related prenyltransferases # Organism: Escherichia coli O157:H7 # 1 290 1 290 290 513 100.0 1e-145 MEWSLTQNKLLAFHRLMRTDKPIGALLLLWPTLWALWVATPGVPQLWILAVFVAGVWLMR AAGCVVNDYADRKFDGHVKRTANRPLPSGAVTEKEARALFVVLVLISFLLVLTLNTMTIL LSIAALALAWVYPFMKRYTHLPQVVLGAAFGWSIPMAFAAVSESVPLSCWLMFLANILWA VAYDTQYAMVDRDDDVKIGIKSTAILFGQYDKLIIGILQIGVLALMAIIGELNGLGWGYY WSILVAGALFVYQQKLIANREREACFKAFMNNNYVGLVLFLGLAMSYWHF >gi|296494637|gb|ADTN01000101.1| GENE 9 7365 - 9848 2620 827 aa, chain - ## HITS:1 COG:plsB KEGG:ns NR:ns ## COG: plsB COG2937 # Protein_GI_number: 16131867 # Func_class: I Lipid transport and metabolism # Function: Glycerol-3-phosphate O-acyltransferase # Organism: Escherichia coli K12 # 1 827 1 827 827 1657 99.0 0 MTFCYPCRAFALLTRGFTSFMSGWPRIYYKLLNLPLSILVKSKSIPADPAPELGLDTSRP IMYVLPYNSKADLLTLRAQCLAHDLPDPLEPLEIDGTLLPRYVFIHGGPRVFTYYTPKEE SIKLFHDYLDLHRSNPNLDVQMVPVSVMFGRAPGREKGEVNPPLRMLNGVQKFFAVLWLG RDSFVRFSPSVSLRRMADEHGTDKTIAQKLARVARMHFARQRLAAVGPRLPARQDLFNKL LASRAIAKAVEDEARSKKISHEKAQQNAIALMEEIAANFSYEMIRLTDRILGFTWNRLYQ GINVHNAERVRQLAHDGHELVYVPCHRSHMDYLLLSYVLYPQGLVPPHIAAGINLNFWPA GPIFRRLGAFFIRRTFKGNKLYSTVFREYLGELFSRGYSVEYFVEGGRSRTGRLLDPKTG TLSMTIQAMLRGGTRPITLIPIYIGYEHVMEVGTYAKELRGATKEKESLPQMLRGLSKLR NLGQGYVNFGEPMPLMTYLNQHVPDWRESIDPIEAVRPAWLTPTVNNIAADLMVRINNAG AANAMNLCCTALLASRQRSLTREQLTEQLNCYLDLMRNVPYSTDSTVPSASASELIDHAL QMNKFEVEKDTIGDIIILPREQAVLMTYYRNNIAHMLVLPSLMAAIVTQHRHISRDVLME HVNVLYPMLKAELFLRWDRDELPDVIDALANEMQRQGLITLQDDELHINPAHSRTLQLLA AGARETLQRYAITFWLLSANPSINRGTLEKESRTVAQRLSVLHGINAPEFFDKAVFSSLV LTLRDEGYISDSGDAEPAETMKVYQLLAELITSDVRLTIESATQGEG >gi|296494637|gb|ADTN01000101.1| GENE 10 9959 - 10327 434 122 aa, chain + ## HITS:1 COG:ECs5025 KEGG:ns NR:ns ## COG: ECs5025 
COG0818 # Protein_GI_number: 15834279 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Diacylglycerol kinase # Organism: Escherichia coli O157:H7 # 1 122 1 122 122 195 100.0 2e-50 MANNTTGFTRIIKAAGYSWKGLRAAWINEAAFRQEGVAVLLAVVIACWLDVDAITRVLLI SSVMLVMIVEILNSAIEAVVDRIGSEYHELSGRAKDMGSAAVLIAIIVAVITWCILLWSH FG >gi|296494637|gb|ADTN01000101.1| GENE 11 10437 - 11045 511 202 aa, chain + ## HITS:1 COG:ECs5026 KEGG:ns NR:ns ## COG: ECs5026 COG1974 # Protein_GI_number: 15834280 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Escherichia coli O157:H7 # 1 202 1 202 202 380 100.0 1e-106 MKALTARQQEVFDLIRDHISQTGMPPTRAEIAQRLGFRSPNAAEEHLKALARKGVIEIVS GASRGIRLLQEEEEGLPLVGRVAAGEPLLAQQHIEGHYQVDPSLFKPNADFLLRVSGMSM KDIGIMDGDLLAVHKTQDVRNGQVVVARIDDEVTVKRLKKQGNKVELLPENSEFKPIVVD LRQQSFTIEGLAVGVIRNGDWL >gi|296494637|gb|ADTN01000101.1| GENE 12 11118 - 12443 1343 441 aa, chain + ## HITS:1 COG:dinF KEGG:ns NR:ns ## COG: dinF COG0534 # Protein_GI_number: 16131870 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Escherichia coli K12 # 1 441 19 459 459 682 100.0 0 MAFLTSSDKALWHLALPMIFSNITVPLLGLVDTAVIGHLDSPVYLGGVAVGATATSFLFM LLLFLRMSTTGLTAQAYGAKNPQALARTLVQPLLLALGAGALIALLRTPIIDLALHIVGG SEAVLEQARRFLEIRWLSAPASLANLVLLGWLLGVQYARAPVILLVVGNILNIVLDVWLV MGLHMNVQGAALATVIAEYATLLIGLLMVRKILKLRGISGEMLKTAWRGNFRRLLALNRD IMLRSLLLQLCFGAITVLGARLGSDIIAVNAVLMTLLTFTAYALDGFAYAVEAHSGQAYG ARDGSQLLDVWRAACRQSGIVALLFSVVYLLAGEHIIALLTSLTQIQQLADRYLIWQVIL PVVGVWCYLLDGMFIGATRATEMRNSMAVAAAGFALTLLTLPWLGNHALWLALTVFLALR GLSLAAIWRRHWRNGTWFAAT >gi|296494637|gb|ADTN01000101.1| GENE 13 12559 - 12768 363 69 aa, chain + ## HITS:1 COG:ECs5028 KEGG:ns NR:ns ## COG: ECs5028 COG3237 # Protein_GI_number: 15834282 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 69 1 69 69 95 100.0 2e-20 
MNKDEAGGNWKQFKGKVKEQWGKLTDDDMTIIEGKRDQLVGKIQERYGYQKDQAEKEVVD WETRNEYRW >gi|296494637|gb|ADTN01000101.1| GENE 14 12810 - 13325 488 171 aa, chain - ## HITS:1 COG:zur KEGG:ns NR:ns ## COG: zur COG0735 # Protein_GI_number: 16131872 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Escherichia coli K12 # 1 171 21 191 191 325 100.0 3e-89 MEKTTTQELLAQAEKICAQRNVRLTPQRLEVLRLMSLQDGAISAYDLLDLLREAEPQAKP PTVYRALDFLLEQGFVHKVESTNSYVLCHLFDQPTHTSAMFICDRCGAVKEECAEGVEDI MHTLAAKMGFALRHNVIEAHGLCAACVEVEACRHPEQCQHDHSVQVKKKPR >gi|296494637|gb|ADTN01000101.1| GENE 15 13643 - 13897 128 84 aa, chain + ## HITS:1 COG:no KEGG:SSON_4227 NR:ns ## KEGG: SSON_4227 # Name: yjbL # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 84 1 84 84 157 100.0 9e-38 MLKIIPGATGYFNKTLNSNQFDNEDAIKDKLDNRGSIKGKLNNIYGKSIDYAALRHRDII IAKIDLFIQRITHNLWHARKKMCF >gi|296494637|gb|ADTN01000101.1| GENE 16 13921 - 14628 329 235 aa, chain + ## HITS:1 COG:no KEGG:JW4008 NR:ns ## KEGG: JW4008 # Name: yjbM # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 235 1 235 235 425 100.0 1e-118 MWVNKYIDDCTDEDLNDRDFIASVVDRAIFHFAINSICNPGDNKDAMPIEQCTFDVETKN DLPSTVQLFYEESKDNEPLANIHFQAIGSGFLTFVNACQEHDDNSLKLFASLLISLSYSS AYADLSETVYINENNESYLKAQFEKLSQRDMKKYLGEMKRLADGGEMNFDGYLDKMSHLV NEGTLDPDILSKMRDAAPQLISFAKSFDPTSKEEIKILTDTSKLIYDLFGVKSEK >gi|296494637|gb|ADTN01000101.1| GENE 17 14991 - 16028 1151 345 aa, chain + ## HITS:1 COG:ECs5031 KEGG:ns NR:ns ## COG: ECs5031 COG0042 # Protein_GI_number: 15834285 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-dihydrouridine synthase # Organism: Escherichia coli O157:H7 # 1 345 1 345 345 732 100.0 0 MHGNSEMQKINQTSAMPEKTDVHWSGRFSVAPMLDWTDRHCRYFLRLLSRNTLLYTEMVT TGAIIHGKGDYLAYSEEEHPVALQLGGSDPAALAQCAKLAEARGYDEINLNVGCPSDRVQ NGMFGACLMGNAQLVADCVKAMRDVVSIPVTVKTRIGIDDQDSYEFLCDFINTVSGKGEC EMFIIHARKAWLSGLSPKENREIPPLDYPRVYQLKRDFPHLTMSINGGIKSLEEAKAHLQ 
HMDGVMVGREAYQNPGILAAVDREIFGSSDTDADPVAVVRAMYPYIERELSQGTYLGHIT RHMLGLFQGIPGARQWRRYLSENAHKAGADINVLEHALKLVADKR >gi|296494637|gb|ADTN01000101.1| GENE 18 16162 - 16404 414 80 aa, chain + ## HITS:1 COG:no KEGG:ECIAI39_4470 NR:ns ## KEGG: ECIAI39_4470 # Name: pspG # Def: phage shock protein G # Organism: E.coli_IAI39 # Pathway: not_defined # 1 80 71 150 150 128 98.0 8e-29 MLELLFVIGFFVMLMVTGVSLLGIIAALVVATAIMFLGGMLALMIKLLPWLLLAIAVVWV IKAIKAPKVPKYQRYDRWRY >gi|296494637|gb|ADTN01000101.1| GENE 19 16570 - 17553 820 327 aa, chain - ## HITS:1 COG:qor KEGG:ns NR:ns ## COG: qor COG0604 # Protein_GI_number: 16131877 # Func_class: C Energy production and conversion; R General function prediction only # Function: NADPH:quinone reductase and related Zn-dependent oxidoreductases # Organism: Escherichia coli K12 # 1 327 1 327 327 629 100.0 1e-180 MATRIEFHKHGGPEVLQAVEFTPADPAENEIQVENKAIGINFIDTYIRSGLYPPPSLPSG LGTEAAGIVSKVGSGVKHIKAGDRVVYAQSALGAYSSVHNIIADKAAILPAAISFEQAAA SFLKGLTVYYLLRKTYEIKPDEQFLFHAAAGGVGLIACQWAKALGAKLIGTVGTAQKAQS ALKAGAWQVINYREEDLVERLKEITGGKKVRVVYDSVGRDTWERSLDCLQRRGLMVSFGN SSGAVTGVNLGILNQKGSLYVTRPSLQGYITTREELTEASNELFSLIASGVIKVDVAEQQ KYPLKDAQRAHEILESRATQGSSLLIP >gi|296494637|gb|ADTN01000101.1| GENE 20 17636 - 19051 1564 471 aa, chain + ## HITS:1 COG:dnaB KEGG:ns NR:ns ## COG: dnaB COG0305 # Protein_GI_number: 16131878 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Escherichia coli K12 # 1 471 1 471 471 892 100.0 0 MAGNKPFNKQQAEPRERDPQVAGLKVPPHSIEAEQSVLGGLMLDNERWDDVAERVVADDF YTRPHRHIFTEMARLQESGSPIDLITLAESLERQGQLDSVGGFAYLAELSKNTPSAANIS AYADIVRERAVVREMISVANEIAEAGFDPQGRTSEDLLDLAESRVFKIAESRANKDEGPK NIADVLDATVARIEQLFQQPHDGVTGVNTGYDDLNKKTAGLQPSDLIIVAARPSMGKTTF AMNLVENAAMLQDKPVLIFSLEMPSEQIMMRSLASLSRVDQTKIRTGQLDDEDWARISGT MGILLEKRNIYIDDSSGLTPTEVRSRARRIAREHGGIGLIMIDYLQLMRVPALSDNRTLE IAEISRSLKALAKELNVPVVALSQLNRSLEQRADKRPVNSDLRESGSIEQDADLIMFIYR DEVYHENSDLKGIAEIIIGKQRNGPIGTVRLTFNGQWSRFDNYAGPQYDDE >gi|296494637|gb|ADTN01000101.1| GENE 
21 19104 - 20183 1002 359 aa, chain + ## HITS:1 COG:alr KEGG:ns NR:ns ## COG: alr COG0787 # Protein_GI_number: 16131879 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Escherichia coli K12 # 1 359 1 359 359 726 100.0 0 MQAATVVINRRALRHNLQRLRELAPASKMVAVVKANAYGHGLLETARTLPDADAFGVARL EEALRLRAGGITKPVLLLEGFFDARDLPTISAQHFHTAVHNEEQLAALEEASLDEPVTVW MKLDTGMHRLGVRPEQAEAFYHRLTQCKNVRQPVNIVSHFARADEPKCGATEKQLAIFNT FCEGKPGQRSIAASGGILLWPQSHFDWVRPGIILYGVSPLEDRSTGADFGCQPVMSLTSS LIAVREHKAGEPVGYGGTWVSERDTRLGVVAMGYGDGYPRAAPSGTPVLVNGREVPIVGR VAMDMICVDLGPQAQDKAGDPVILWGEGLPVERIAEMTKVSAYELITRLTSRVAMKYVD >gi|296494637|gb|ADTN01000101.1| GENE 22 20436 - 21629 1280 397 aa, chain + ## HITS:1 COG:tyrB KEGG:ns NR:ns ## COG: tyrB COG1448 # Protein_GI_number: 16131880 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Escherichia coli K12 # 1 397 1 397 397 808 100.0 0 MFQKVDAYAGDPILTLMERFKEDPRSDKVNLSIGLYYNEDGIIPQLQAVAEAEARLNAQP HGASLYLPMEGLNCYRHAIAPLLFGADHPVLKQQRVATIQTLGGSGALKVGADFLKRYFP ESGVWVSDPTWENHVAIFAGAGFEVSTYPWYDEATNGVRFNDLLATLKTLPARSIVLLHP CCHNPTGADLTNDQWDAVIEILKARELIPFLDIAYQGFGAGMEEDAYAIRAIASAGLPAL VSNSFSKIFSLYGERVGGLSVMCEDAEAAGRVLGQLKATVRRNYSSPPNFGAQVVAAVLN DEALKASWLAEVEEMRTRILAMRQELVKVLSTEMPERNFDYLLNQRGMFSYTGLSAAQVD RLREEFGVYLIASGRMCVAGLNTANVQRVAKAFAAVM >gi|296494637|gb|ADTN01000101.1| GENE 23 22131 - 22373 131 80 aa, chain - ## HITS:1 COG:no KEGG:ECSE_4348 NR:ns ## KEGG: ECSE_4348 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 59 1 59 139 108 98.0 5e-23 MVINFITRDGDDDMNISYVNSNKTTSLPVELDALNNKDISYAKDFFLYIETQLKIAKDFL DLEKKYQVLLQVKFFTHLLI >gi|296494637|gb|ADTN01000101.1| GENE 24 22736 - 23449 499 237 aa, chain + ## HITS:1 COG:aphA KEGG:ns NR:ns ## COG: aphA COG3700 # Protein_GI_number: 16131881 # Func_class: R General function prediction only # Function: Acid phosphatase (class B) # Organism: Escherichia coli K12 # 1 237 1 
237 237 473 100.0 1e-133 MRKITQAISAVCLLFALNSSAVALASSPSPLNPGTNVARLAEQAPIHWVSVAQIENSLAG RPPMAVGFDIDDTVLFSSPGFWRGKKTFSPESEDYLKNPVFWEKMNNGWDEFSIPKEVAR QLIDMHVRRGDAIFFVTGRSPTKTETVSKTLADNFHIPATNMNPVIFAGDKPGQNTKSQW LQDKNIRIFYGDSDNDITAARDVGARGIRILRASNSTYKPLPQAGAFGEEVIVNSEY >gi|296494637|gb|ADTN01000101.1| GENE 25 23560 - 23976 345 138 aa, chain + ## HITS:1 COG:ECs5038 KEGG:ns NR:ns ## COG: ECs5038 COG0432 # Protein_GI_number: 15834292 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 138 1 138 138 277 100.0 3e-75 MWYQKTLTLSAKSRGFHLVTDEILNQLADMPRVNIGLLHLLLQHTSASLTLNENCDPTVR HDMERFFLRTVPDNGNYEHDYEGADDMPSHIKSSMLGTSLVLPVHKGRIQTGTWQGIWLG EHRIHGGSRRIIATLQGE >gi|296494637|gb|ADTN01000101.1| GENE 26 23980 - 24336 457 118 aa, chain + ## HITS:1 COG:ECs5039 KEGG:ns NR:ns ## COG: ECs5039 COG2315 # Protein_GI_number: 15834293 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 118 1 118 118 216 100.0 5e-57 MTISELLQYCMAKPGAEQSVHNDWKATQIKVEDVLFAMVKEVENRPAVSLKTSPELAELL RQQHSDVRPSRHLNKAHWSTVYLDGSLPDSQIYYLVDASYQQAVNLLPEEKRKLLVQL >gi|296494637|gb|ADTN01000101.1| GENE 27 24371 - 27193 3272 940 aa, chain - ## HITS:1 COG:uvrA KEGG:ns NR:ns ## COG: uvrA COG0178 # Protein_GI_number: 16131884 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Escherichia coli K12 # 1 940 1 940 940 1900 100.0 0 MDKIEVRGARTHNLKNINLVIPRDKLIVVTGLSGSGKSSLAFDTLYAEGQRRYVESLSAY ARQFLSLMEKPDVDHIEGLSPAISIEQKSTSHNPRSTVGTITEIHDYLRLLFARVGEPRC PDHDVPLAAQTVSQMVDNVLSQPEGKRLMLLAPIIKERKGEHTKTLENLASQGYIRARID GEVCDLSDPPKLELQKKHTIEVVVDRFKVRDDLTQRLAESFETALELSGGTAVVADMDDP KAEELLFSANFACPICGYSMRELEPRLFSFNNPAGACPTCDGLGVQQYFDPDRVIQNPEL SLAGGAIRGWDRRNFYYFQMLKSLADHYKFDVEAPWGSLSANVHKVVLYGSGKENIEFKY MNDRGDTSIRRHPFEGVLHNMERRYKETESSAVREELAKFISNRPCASCEGTRLRREARH VYVENTPLPAISDMSIGHAMEFFNNLKLAGQRAKIAEKILKEIGDRLKFLVNVGLNYLTL 
SRSAETLSGGEAQRIRLASQIGAGLVGVMYVLDEPSIGLHQRDNERLLGTLIHLRDLGNT VIVVEHDEDAIRAADHVIDIGPGAGVHGGEVVAEGPLEAIMAVPESLTGQYMSGKRKIEV PKKRVPANPEKVLKLTGARGNNLKDVTLTLPVGLFTCITGVSGSGKSTLINDTLFPIAQR QLNGATIAEPAPYRDIQGLEHFDKVIDIDQSPIGRTPRSNPATYTGVFTPVRELFAGVPE SRARGYTPGRFSFNVRGGRCEACQGDGVIKVEMHFLPDIYVPCDQCKGKRYNRETLEIKY KGKTIHEVLDMTIEEAREFFDAVPALARKLQTLMDVGLTYIRLGQSATTLSGGEAQRVKL ARELSKRGTGQTLYILDEPTTGLHFADIQQLLDVLHKLRDQGNTIVVIEHNLDVIKTADW IVDLGPEGGSGGGEILVSGTPETVAECEASHTARFLKPML >gi|296494637|gb|ADTN01000101.1| GENE 28 27447 - 27983 759 178 aa, chain + ## HITS:1 COG:ECs5041 KEGG:ns NR:ns ## COG: ECs5041 COG0629 # Protein_GI_number: 15834295 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Escherichia coli O157:H7 # 1 178 1 178 178 270 100.0 1e-72 MASRGVNKVILVGNLGQDPEVRYMPNGGAVANITLATSESWRDKATGEMKEQTEWHRVVL FGKLAEVASEYLRKGSQVYIEGQLRTRKWTDQSGQDRYTTEVVVNVGGTMQMLGGRQGGG APAGGNIGGGQPQGGWGQPQQPQGGNQFSGGAQSRPQQSAPAAPSNEPPMDFDDDIPF >gi|296494637|gb|ADTN01000101.1| GENE 29 28082 - 28363 279 93 aa, chain - ## HITS:1 COG:no KEGG:ECSE_4354 NR:ns ## KEGG: ECSE_4354 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 93 34 126 126 166 100.0 3e-40 MATLTTGVVLLRWQLLSAVMMFLASTLNIRFRRSDYVGLAVISSGLGVVSACWFAMGLLG ITMADITAIWHNIESVMIEEMNQTPPQWPMILT >gi|296494637|gb|ADTN01000101.1| GENE 30 28793 - 30379 1107 528 aa, chain + ## HITS:1 COG:yjcC KEGG:ns NR:ns ## COG: yjcC COG4943 # Protein_GI_number: 16131887 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing sensor and EAL domains # Organism: Escherichia coli K12 # 1 528 1 528 528 1057 100.0 0 MSHRARHQLLALPGIIFLVLFPIILSLWIAFLWAKSEVNNQLRTFAQLALDKSELVIRQA DLVSDAAERYQGQVCTPAHQKRMLNIIRGYLYINELIYARDNHFLCSSLIAPVNGYTIAP ADYKREPNVSIYYYRDTPFFSGYKMTYMQRGNYVAVINPLFWSEVMSDDPTLQWGVYDTV TKTFFSLSKEASAATFSPLIHLKDLTVQRNGYLYATVYSTKRPIAAIVATSYQRLITHFY 
NHLIFALPAGILGSLVLLLLWLRIRQNYLSPKRKLQRALEKHQLCLYYQPIIDIKTEKCI GAEALLRWPGEQGQIMNPAEFIPLAEKEGMIEQITDYVIDNVFRDLGDYLATHADRYVSI NLSASDFHTSRLIARINQKTEQYAVRPQQIKFEVTEHAFLDVDKMTPIILAFRQAGYEVA IDDFGIGYSNLHNLKSLNVDILKIDKSFVETLTTHKTSHLIAEHIIELAHSLGLKTIAEG VETEEQVNWLRKRGVRYCQGWFFAKAMPPQVFMQWMEQLPARELTRGQ >gi|296494637|gb|ADTN01000101.1| GENE 31 30382 - 30705 305 107 aa, chain - ## HITS:1 COG:ECs5044 KEGG:ns NR:ns ## COG: ECs5044 COG2207 # Protein_GI_number: 15834298 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 107 1 107 107 194 100.0 3e-50 MSHQKIIQDLIAWIDEHIDQPLNIDVVAKKSGYSKWYLQRMFRTVTHQTLGDYIRQRRLL LAAVELRTTERPIFDIAMDLGYVSQQTFSRVFRRQFDRTPSDYRHRL >gi|296494637|gb|ADTN01000101.1| GENE 32 30791 - 31255 485 154 aa, chain + ## HITS:1 COG:ECs5045 KEGG:ns NR:ns ## COG: ECs5045 COG0789 # Protein_GI_number: 15834299 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 154 1 154 154 301 100.0 2e-82 MEKKLPRIKALLTPGEVAKRSGVAVSALHFYESKGLITSIRNSGNQRRYKRDVLRYVAII KIAQRIGIPLATIGEAFGVLPEGHTLSAKEWKQLSSQWREELDRRIHTLVALRDELDGCI GCGCLSRSDCPLRNPGDRLGEEGTGARLLEDEQN >gi|296494637|gb|ADTN01000101.1| GENE 33 31801 - 33150 1640 449 aa, chain + ## HITS:1 COG:ECs5046 KEGG:ns NR:ns ## COG: ECs5046 COG2252 # Protein_GI_number: 15834300 # Func_class: R General function prediction only # Function: Permeases # Organism: Escherichia coli O157:H7 # 1 449 1 449 449 689 100.0 0 MSTPSARTGGSLDAWFKISQRGSTVRQEVVAGLTTFLAMVYSVIVVPGMLGKAGFPPAAV FVATCLVAGLGSIVMGLWANLPLAIGCAISLTAFTAFSLVLGQHISVPVALGAVFLMGVL FTVISATGIRSWILRNLPHGVAHGTGIGIGLFLLLIAANGVGLVIKNPLDGLPVALGDFA TFPVIMSLVGLAVIIGLEKLKVPGGILLTIIGISIVGLIFDPNVHFSGVFAMPSLSDENG NSLIGSLDIMGALNPVVLPSVLALVMTAVFDATGTIRAVAGQANLLDKDGQIIDGGKALT TDSMSSVFSGLVGAAPAAVYIESAAGTAAGGKTGLTAITVGVLFLLILFLSPLSYLVPGY ATAPALMYVGLLMLSNVAKIDFADFVDAMAGLVTAVFIVLTCNIVTGIMIGFATLVIGRL VSGEWRKLNIGTVVIAVALVTFYAGGWAI >gi|296494637|gb|ADTN01000101.1| GENE 
34 33301 - 34950 1694 549 aa, chain + ## HITS:1 COG:ECs5047 KEGG:ns NR:ns ## COG: ECs5047 COG0025 # Protein_GI_number: 15834301 # Func_class: P Inorganic ion transport and metabolism # Function: NhaP-type Na+/H+ and K+/H+ antiporters # Organism: Escherichia coli O157:H7 # 1 549 1 549 549 979 100.0 0 MEIFFTILIMTLVVSLSGVVTRVMPFQIPLPLMQIAIGALLAWPTFGLHVEFDPELFLVL FIPPLLFADGWKTPTREFLEHGREIFGLALALVVVTVVGIGFLIYWVVPGIPLIPAFALA AVLSPTDAVALSGIVGEGRIPKKIMGILQGEALMNDASGLVSLKFAVAVAMGTMIFTVGG ATVEFMKVAIGGILAGFVVSWLYGRSLRFLSRWGGDEPATQIVLLFLLPFASYLIAEHIG VSGILAAVAAGMTITRSGVMRRAPLAMRLRANSTWAMLEFVFNGMVFLLLGLQLPGILET SLMAAEIDPNVEIWMLFTDIILIYAALMLVRFGWLWTMKKFSNRFLKKKPMEFGSWTTRE ILIASFAGVRGAITLAGVLSIPLLLPDGNVFPARYELVFLAAGVILFSLFVGVVMLPILL QHIEVADHSQQLKEERIARAATAEVAIVAIQKMEERLAADTEENIDNQLLTEVSSRVIGN LRRRADGRNDVESSVQEENLERRFRLAALRSERAELYHLRATREISNETLQKLLHDLDLL EALLIEENQ >gi|296494637|gb|ADTN01000101.1| GENE 35 34930 - 35034 57 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLARANLLKVQKIAPFSDYLGPGYVAFGLLIFLN >gi|296494637|gb|ADTN01000101.1| GENE 36 35104 - 35847 270 247 aa, chain - ## HITS:1 COG:yjcF KEGG:ns NR:ns ## COG: yjcF COG1357 # Protein_GI_number: 16131892 # Func_class: S Function unknown # Function: Uncharacterized low-complexity proteins # Organism: Escherichia coli K12 # 1 247 184 430 430 436 99.0 1e-122 MYKTNFNFAIMEKILFDNCILDDSNFAQIKMTDGTLNSCSAMHVQFYNATMNRANIKNTF LDYSNFHMAYMAEVNLYKVIAPYINLFRADLSFSKLDLINFEHADLSRVNLNKATLQNIN LIDSKLFFTRLTNTFLEMVICTGSNMANVNFNNANLSNCHFNCSVLTKAWMFNIRLYRVN FDEASVQGMGITILRGEENISINSDILVTLQKFFEEDCATHTGMSQTEDNLHAVAMKITA DIMQDAD Prediction of potential genes in microbial genomes Time: Sun May 15 23:29:09 2011 Seq name: gi|296494636|gb|ADTN01000102.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont245.4, whole genome shotgun sequence Length of sequence - 12651 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 2, operones - 2 average op.length - 5.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 10/0.000 - CDS 43 
- 1692 2254 ## COG4147 Predicted symporter
2 1 Op 2 5/0.000 - CDS 1689 - 2003 339 ## COG3162 Predicted membrane protein
- Prom 2055 - 2114 4.2
3 1 Op 3 . - CDS 2203 - 4161 2466 ## COG0365 Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases
- Prom 4387 - 4446 7.0
4 2 Op 1 . + CDS 4554 - 5990 1507 ## COG3303 Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit
5 2 Op 2 . + CDS 6035 - 6601 332 ## G2583_4896 NrfB, formate-dependent nitrite reductase
6 2 Op 3 8/0.000 + CDS 6598 - 7269 452 ## COG0437 Fe-S-cluster-containing hydrogenase components 1
7 2 Op 4 5/0.000 + CDS 7266 - 8222 1297 ## COG3301 Formate-dependent nitrite reductase, membrane component
+ Term 8223 - 8259 -0.9
8 2 Op 5 7/0.000 + CDS 8356 - 9960 1243 ## COG1138 Cytochrome c biogenesis factor
9 2 Op 6 9/0.000 + CDS 9953 - 10336 306 ## COG3088 Uncharacterized protein involved in biosynthesis of c-type cytochromes
10 2 Op 7 4/0.000 + CDS 10267 - 10929 701 ## COG4235 Cytochrome c biogenesis factor
+ Prom 10964 - 11023 8.3
11 2 Op 8 .
+ CDS 11271 - 12584 1711 ## COG1301 Na+/H+-dicarboxylate symporters Predicted protein(s) >gi|296494636|gb|ADTN01000102.1| GENE 1 43 - 1692 2254 549 aa, chain - ## HITS:1 COG:yjcG KEGG:ns NR:ns ## COG: yjcG COG4147 # Protein_GI_number: 16131893 # Func_class: R General function prediction only # Function: Predicted symporter # Organism: Escherichia coli K12 # 1 549 1 549 549 961 100.0 0 MKRVLTALAATLPFAANAADAISGAVERQPTNWQAIIMFLIFVVFTLGITYWASKRVRSR SDYYTAGGNITGFQNGLAIAGDYMSAASFLGISALVFTSGYDGLIYSLGFLVGWPIILFL IAERLRNLGRYTFADVASYRLKQGPIRILSACGSLVVVALYLIAQMVGAGKLIELLFGLN YHIAVVLVGVLMMMYVLFGGMLATTWVQIIKAVLLLFGASFMAFMVMKHVGFSFNNLFSE AMAVHPKGVDIMKPGGLVKDPISALSLGLGLMFGTAGLPHILMRFFTVSDAREARKSVFY ATGFMGYFYILTFIIGFGAIMLVGANPEYKDAAGHLIGGNNMAAVHLANAVGGNLFLGFI SAVAFATILAVVAGLTLAGASAVSHDLYANVFKKGATEREELRVSKITVLILGVIAIILG VLFENQNIAFMVGLAFAIAASCNFPIILLSMYWSKLTTRGAMMGGWLGLITAVVLMILGP TIWVQILGHEKAIFPYEYPALFSITVAFLGIWFFSATDNSAEGARERELFRAQFIRSQTG FGVEQGRAH >gi|296494636|gb|ADTN01000102.1| GENE 2 1689 - 2003 339 104 aa, chain - ## HITS:1 COG:yjcH KEGG:ns NR:ns ## COG: yjcH COG3162 # Protein_GI_number: 16131894 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 104 1 104 104 199 100.0 1e-51 MNGTIYQRIEDNAHFRELVEKRQRFATILSIIMLAVYIGFILLIAFAPGWLGTPLNPNTS VTRGIPIGVGVIVISFVLTGIYIWRANGEFDRLNNEVLHEVQAS >gi|296494636|gb|ADTN01000102.1| GENE 3 2203 - 4161 2466 652 aa, chain - ## HITS:1 COG:acs KEGG:ns NR:ns ## COG: acs COG0365 # Protein_GI_number: 16131895 # Func_class: I Lipid transport and metabolism # Function: Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases # Organism: Escherichia coli K12 # 1 652 1 652 652 1354 100.0 0 MSQIHKHTIPANIADRCLINPQQYEAMYQQSINVPDTFWGEQGKILDWIKPYQKVKNTSF APGNVSIKWYEDGTLNLAANCLDRHLQENGDRTAIIWEGDDASQSKHISYKELHRDVCRF ANTLLELGIKKGDVVAIYMPMVPEAAVAMLACARIGAVHSVIFGGFSPEAVAGRIIDSNS RLVITSDEGVRAGRSIPLKKNVDDALKNPNVTSVEHVVVLKRTGGKIDWQEGRDLWWHDL VEQASDQHQAEEMNAEDPLFILYTSGSTGKPKGVLHTTGGYLVYAALTFKYVFDYHPGDI 
YWCTADVGWVTGHSYLLYGPLACGATTLMFEGVPNWPTPARMAQVVDKHQVNILYTAPTA IRALMAEGDKAIEGTDRSSLRILGSVGEPINPEAWEWYWKKIGNEKCPVVDTWWQTETGG FMITPLPGATELKAGSATRPFFGVQPALVDNEGNPLEGATEGSLVITDSWPGQARTLFGD HERFEQTYFSTFKNMYFSGDGARRDEDGYYWITGRVDDVLNVSGHRLGTAEIESALVAHP KIAEAAVVGIPHNIKGQAIYAYVTLNHGEEPSPELYAEVRNWVRKEIGPLATPDVLHWTD SLPKTRSGKIMRRILRKIAAGDTSNLGDTSTLADPGVVEKLLEEKQAIAMPS >gi|296494636|gb|ADTN01000102.1| GENE 4 4554 - 5990 1507 478 aa, chain + ## HITS:1 COG:ECs5052 KEGG:ns NR:ns ## COG: ECs5052 COG3303 # Protein_GI_number: 15834306 # Func_class: P Inorganic ion transport and metabolism # Function: Formate-dependent nitrite reductase, periplasmic cytochrome c552 subunit # Organism: Escherichia coli O157:H7 # 1 478 1 478 478 972 100.0 0 MTRIKINARRIFSLLIPFFFFTSVHAEQTAAPAKPVTVEAKNETFAPQHPDQYLSWKATS EQSERVDALAEDPRLVILWAGYPFSRDYNKPRGHAFAVTDVRETLRTGAPKNAEDGPLPM ACWSCKSPDVARLIQKDGEDGYFHGKWARGGPEIVNNLGCADCHNTASPEFAKGKPELTL SRPYAARAMEAIGKPFEKAGRFDQQSMVCGQCHVEYYFDGKNKAVKFPWDDGMKVENMEQ YYDKIAFSDWTNSLSKTPMLKAQHPEYETWTAGIHGKNNVTCIDCHMPKVQNAEGKLYTD HKIGNPFDNFAQTCANCHTQDKAALQKVVAERKQSINDLKIKVEDQLVHAHFEAKAALDA GATEAEMKPIQDDIRHAQWRWDLAIASHGIHMHAPEEGLRMLGTAMDKAADARTKLARLL ATKGITHEIQIPDISTKEKAQQAIGLNMEQIKAEKQDFIKTVIPQWEEQARKNGLLSQ >gi|296494636|gb|ADTN01000102.1| GENE 5 6035 - 6601 332 188 aa, chain + ## HITS:1 COG:no KEGG:G2583_4896 NR:ns ## KEGG: G2583_4896 # Name: nrfB # Def: NrfB, formate-dependent nitrite reductase # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 188 3 190 190 362 100.0 5e-99 MSVLRSLLTAGVLASGLLWSLNGITATPAAQASDDRYEVTQQRNPDAACLDCHKPDTEGM HGKHASVINPNNKLPVTCTNCHGQPSPQHREGVKDVMRFNEPMYKVGEQNSVCMSCHLPE QLQKAFWPHDVHVTKVACASCHSLHPQQDTMQTLSDKGRIKICVDCHSDQRTNPNFNPAS VPLLKEQP >gi|296494636|gb|ADTN01000102.1| GENE 6 6598 - 7269 452 223 aa, chain + ## HITS:1 COG:ECs5054 KEGG:ns NR:ns ## COG: ECs5054 COG0437 # Protein_GI_number: 15834308 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli O157:H7 # 1 223 
1 223 223 446 100.0 1e-125 MTWSRRQFLTGVGVLAAVSGTAGRVVAKTLNINGVRYGMVHDESLCIGCTACMDACREVN KVPEGVSRLTIIRSEPQGEFPDVKYRFFRKSCQHCDHAPCVDVCPTGASFRDAASGIVDV NPDLCVGCQYCIAACPYRVRFIHPVTKTADKCDFCRKTNLQAGKLPACVEACPTKALTFG NLDDPNSEISQLLRQKPTYRYKLALGTKPKLYRVPFKYGEVSQ >gi|296494636|gb|ADTN01000102.1| GENE 7 7266 - 8222 1297 318 aa, chain + ## HITS:1 COG:nrfD KEGG:ns NR:ns ## COG: nrfD COG3301 # Protein_GI_number: 16131899 # Func_class: P Inorganic ion transport and metabolism # Function: Formate-dependent nitrite reductase, membrane component # Organism: Escherichia coli K12 # 1 318 1 318 318 525 100.0 1e-149 MTQTSAFHFESLVWDWPIAIYLFLIGISAGLVTLAVLLRRFYPQAGGADSTLLRTTLIVG PGAVILGLLILVFHLTRPWTFWKLMFHYSFTSVMSMGVMLFQLYMVVLVLWLAKIFEHDL LALQQRWLPKLGIVQKVLSLLTPVHRGLETLMLVLAVLLGAYTGFLLSALKSYPFLNNPI LPVLFLFSGISSGAAVALIAMAIRQRSNPHSTEAQFVHRMEIPVVWGEIFLLVAFFVGLA LGDDGKVRALVAALGGGFWTWWFWLGVAGLGLIVPMLLKPWVNRSSGIPAVLAACGASLV GVLMLRFFILYAGQLTVA >gi|296494636|gb|ADTN01000102.1| GENE 8 8356 - 9960 1243 534 aa, chain + ## HITS:1 COG:nrfE KEGG:ns NR:ns ## COG: nrfE COG1138 # Protein_GI_number: 16131900 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c biogenesis factor # Organism: Escherichia coli K12 # 1 534 19 552 552 912 100.0 0 MRLTCIGILAQFALLLLAFGVLTYCFLISDFSVIYVAQHSYSLLSWELKLAAVWGGHEGS LLLWVLLLSAWSALFAWHYRQQTDPLFPLTLAVLSLMLAALLLFVVLWSDPFVRIFPPAI EGRDLNPMLQHPGLIFHPPLLYLGYGGLMVAASVALASLLRGEFDGACARICWRWALPGW SALTAGIILGSWWAYCELGWGGWWFWDPVENASLLPWLSATALLHSLSLTRQRGIFCHWS LLLAIVTLMLSLLGTLIVRSGILVSVHAFALDNVRAVPLFSLFALISLASLALYGWRARD GGPAVHFSGLSREMLILATLLLFCAVLLIVLVGTLYPMIYGLLGWGRLSVGAPYFNRATL PFGLLMLVVIVLATFVSGKRVQLPALVAHAGVLLFAAGVVVSSVSRQEISLNLQPGQQVT LAGYTFRFECLDLQAKGNYTSEKAIVALFDHQQRIGELTPERRFYEARRQQMMEPSIRWN GIHDWYAVMGEKTGPDRYAFRLYVQSGVRWIWGGGLLMIAGALLSGWRGKKRDE >gi|296494636|gb|ADTN01000102.1| GENE 9 9953 - 10336 306 127 aa, chain + ## HITS:1 COG:nrfF KEGG:ns NR:ns ## COG: nrfF COG3088 # Protein_GI_number: 16131901 # Func_class: O 
Posttranslational modification, protein turnover, chaperones # Function: Uncharacterized protein involved in biosynthesis of c-type cytochromes # Organism: Escherichia coli K12 # 1 127 1 127 127 188 100.0 2e-48 MNKGLLTLLLLFTCFAHAQVVDTWQFANPQQQQQALNIASQLRCPQCQNQNLLESNAPVA VSMRHQVYSMVAEGKNEVEIIGWMTERYGDFVRYNPPLTGQTLVLWALPVVLLLLMALIL WRVRAKR >gi|296494636|gb|ADTN01000102.1| GENE 10 10267 - 10929 701 220 aa, chain + ## HITS:1 COG:nrfG KEGG:ns NR:ns ## COG: nrfG COG4235 # Protein_GI_number: 16131902 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c biogenesis factor # Organism: Escherichia coli K12 # 23 220 1 198 198 337 100.0 7e-93 MGAASGVVTADGTDPLASEGEAMKQPKIPVKMLTTLTILMVFLCVGSYLLSPKWQAVRAE YQRQRDPLHQFASQQTPEAQLQALQDKIRANPQNSEQWALLGEYYLWQNDYSNSLLAYRQ ALQLRGENAELYAALATVLYYQASQHMTAQTRAMIDKALALDSNEITALMLLASDAFMQA NYAQAIELWQKVMDLNSPRVNRTQLVESINMAKLLQRRLD >gi|296494636|gb|ADTN01000102.1| GENE 11 11271 - 12584 1711 437 aa, chain + ## HITS:1 COG:ECs5059 KEGG:ns NR:ns ## COG: ECs5059 COG1301 # Protein_GI_number: 15834313 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Escherichia coli O157:H7 # 1 437 1 437 437 751 100.0 0 MKNIKFSLAWQILFAMVLGILLGSYLHYHSDSRDWLVVNLLSPAGDIFIHLIKMIVVPIV ISTLVVGIAGVGDAKQLGRIGAKTIIYFEVITTVAIILGITLANVFQPGAGVDMSQLATV DISKYQSTTEAVQSSSHGIMGTILSLVPTNIVASMAKGEMLPIIFFSVLFGLGLSSLPAT HREPLVTVFRSISETMFKVTHMVMRYAPVGVFALIAVTVANFGFSSLWPLAKLVLLVHFA ILFFALVVLGIVARLCGLSVWILIRILKDELILAYSTASSESVLPRIIEKMEAYGAPASI TSFVVPTGYSFNLDGSTLYQSIAAIFIAQLYGIDLSIWQEIILVLTLMVTSKGIAGVPGV SFVVLLATLGSVGIPLEGLAFIAGVDRILDMARTALNVVGNALAVLVIAKWEHKFDRKKA LAYEREVLGKFDKTADQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:29:14 2011 Seq name: gi|296494635|gb|ADTN01000103.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont246.1, whole genome shotgun sequence Length of sequence - 5417 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 
3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 - CDS 95 - 2509 3039 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing 2 1 Op 2 . - CDS 2558 - 3145 489 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing 3 1 Op 3 . - CDS 3170 - 3301 87 ## - Prom 3362 - 3421 4.2 + Prom 3135 - 3194 4.4 4 2 Tu 1 . + CDS 3339 - 4172 628 ## COG1526 Uncharacterized protein required for formate dehydrogenase activity + Term 4179 - 4227 4.1 + Prom 4183 - 4242 7.4 5 3 Tu 1 . + CDS 4325 - 5380 976 ## B21_03730 hypothetical protein Predicted protein(s) >gi|296494635|gb|ADTN01000103.1| GENE 1 95 - 2509 3039 804 aa, chain - ## HITS:1 COG:fdoG KEGG:ns NR:ns ## COG: fdoG COG0243 # Protein_GI_number: 16131734 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 804 213 1016 1016 1684 100.0 0 MTNHWVDIKNANLVVVMGGNAAEAHPVGFRWAMEAKIHNGAKLIVIDPRFTRTAAVADYY APIRSGTDIAFLSGVLLYLLNNEKFNREYTEAYTNASLIVREDYGFEDGLFTGYDAEKRK YDKSSWTYELDENGFAKRDTTLQHPRCVWNLLKQHVSRYTPDVVENICGTPKDAFLKVCE YIAETSAHDKTASFLYALGWTQHSVGAQNIRTMAMIQLLLGNMGMAGGGVNALRGHSNIQ GLTDLGLLSQSLPGYMTLPSEKQTDLQTYLTANTPKPLLEGQVNYWGNYPKFFVSMMKAF FGDKATAENSWGFDWLPKWDKGYDVLQYFEMMKEGKVNGYICQGFNPVASFPNKNKVIGC LSKLKFLVTIDPLNTETSNFWQNHGELNEVDSSKIQTEVFRLPSTCFAEENGSIVNSGRW LQWHWKGADAPGIALTDGEILSGIFLRLRKMYAEQGGANPDQVLNMTWNYAIPHEPSSEE VAMESNGKALADITDPATGAVIVKKGQQLSSFAQLRDDGTTSCGCWIFAGSWTPEGNQMA RRDNADPSGLGNTLGWAWAWPLNRRILYNRASADPQGNPWDPKRQLLKWDGTKWTGWDIP DYSAAPPGSGVGPFIMQQEGMGRLFALDKMAEGPFPEHYEPFETPLGTNPLHPNVISNPA ARIFKDDAEALGKADKFPYVGTTYRLTEHFHYWTKHALLNAILQPEQFVEIGESLANKLG IAQGDTVKVSSNRGYIKAKAVVTKRIRTLKANGKDIDTIGIPIHWGYEGVAKKGFIANTL TPFVGDANTQTPEFKSFLVNVEKV >gi|296494635|gb|ADTN01000103.1| GENE 2 2558 - 3145 489 195 aa, chain - ## HITS:1 COG:fdoG KEGG:ns NR:ns ## COG: fdoG COG0243 # Protein_GI_number: 16131734 # Func_class: C Energy production and 
conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 195 1 195 1016 412 100.0 1e-115 MQVSRRQFFKICAGGMAGTTAAALGFAPSVALAETRQYKLLRTRETRNTCTYCSVGCGLL MYSLGDGAKNAKASIFHIEGDPDHPVNRGALCPKGAGLVDFIHSESRLKFPEYRAPGSDK WQQISWEEAFDRIAKLMKEDRDANYIAQNAEGVTVNRWLSTGMLCASASSNETGYLTQKF SRALGMLAVDNQARV >gi|296494635|gb|ADTN01000103.1| GENE 3 3170 - 3301 87 43 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVFCYKPFLDGGKLSQFWTFAAPSAKKNNSHSLHTQQANVNGT >gi|296494635|gb|ADTN01000103.1| GENE 4 3339 - 4172 628 277 aa, chain + ## HITS:1 COG:fdhD KEGG:ns NR:ns ## COG: fdhD COG1526 # Protein_GI_number: 16131735 # Func_class: C Energy production and conversion # Function: Uncharacterized protein required for formate dehydrogenase activity # Organism: Escherichia coli K12 # 1 277 1 277 277 572 100.0 1e-163 MKKTQRKEIENVTNITGVRQIELWRRDDLQHPRLDEVAEEVPVALVYNGISHVVMMASPK DLEYFALGFSLSEGIIESPRDIFGMDVVPSCNGLEVQIELSSRRFMGLKERRRALAGRTG CGVCGVEQLNDIGKPVQPLPFTQTFDLNKLDDALRHLNDFQPVGQLTGCTHAAAWMLPSG ELVGGHEDVGRHVALDKLLGRRSQEGESWQQGAVLVSSRASYEMVQKSAMCGVEILFAVS AATTLAVEVAERCNLTLVGFCKPGRATVYTHPQRLSN >gi|296494635|gb|ADTN01000103.1| GENE 5 4325 - 5380 976 351 aa, chain + ## HITS:1 COG:no KEGG:B21_03730 NR:ns ## KEGG: B21_03730 # Name: yiiG # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 351 1 351 351 631 100.0 1e-179 MKRNLLSSAIIVAIMSLGLTGCDDKKAETETLPPANSQPAAPAPEAKPTEAPVAKAEAKP ETPAQPVVDEQAVFDEKMDVYIKCYNKLQIPVQRSLARYADWLKDFKQGPTGEERTVYGI YGISESNLAECEKGVKSAVALTPALQPIDGVAVSYIDAAVALGNTINEMDKYYTQENYKD DAFAKGKTLHQTFLKNLEAFEPVAESYHAAIQEINDKRQLAELKNIEEREGKTFHYYSLA VMISAKQINNLISQDKFDAEAAMKKVSELETLVAQAKEADKGGMNFSFINSAGQYQLEAK KYVRRIRDKVPYSDWDKEQLQDANSSWMVEDSFPRALREYNEMVDDYNSLR Prediction of potential genes in microbial genomes Time: Sun May 15 23:29:29 2011 Seq name: gi|296494634|gb|ADTN01000104.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont246.2, whole genome shotgun sequence Length of sequence - 18089 bp Number of predicted 
genes - 18, with homology - 17
Number of transcription units - 12, operones - 4 average op.length - 2.5
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Op 1 3/0.600 - CDS 8 - 1756 1134 ## COG3711 Transcriptional antiterminator
2 1 Op 2 2/0.600 - CDS 1756 - 2826 1139 ## COG1363 Cellulase M and related proteins
3 1 Op 3 7/0.200 - CDS 2816 - 4267 1610 ## COG1299 Phosphotransferase system, fructose-specific IIC component
4 1 Op 4 1/0.800 - CDS 4278 - 4724 340 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type)
- Prom 4915 - 4974 5.1
- Term 4980 - 5016 2.2
5 2 Op 1 4/0.400 - CDS 5025 - 5339 413 ## COG3254 Uncharacterized conserved protein
6 2 Op 2 5/0.200 - CDS 5349 - 6173 963 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases
- Prom 6363 - 6422 5.8
- Term 6337 - 6381 4.1
7 3 Op 1 6/0.200 - CDS 6440 - 7699 1424 ## COG4806 L-rhamnose isomerase
8 3 Op 2 . - CDS 7696 - 9165 1261 ## COG1070 Sugar (pentulose and hexulose) kinases
- Prom 9189 - 9248 2.4
+ Prom 9337 - 9396 2.6
9 4 Tu 1 . + CDS 9453 - 10289 476 ## COG2207 AraC-type DNA-binding domain-containing proteins
- Term 10124 - 10179 1.8
10 5 Tu 1 . - CDS 10237 - 10371 104 ##
- Prom 10609 - 10668 2.9
11 6 Tu 1 . + CDS 10363 - 11211 596 ## COG2207 AraC-type DNA-binding domain-containing proteins
12 7 Tu 1 . - CDS 11208 - 12242 1325 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily
- Prom 12414 - 12473 6.4
+ Prom 12417 - 12476 7.2
13 8 Tu 1 . + CDS 12527 - 13147 571 ## PROTEIN SUPPORTED gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent
+ Term 13202 - 13238 6.4
+ Prom 13217 - 13276 7.3
14 9 Tu 1 . + CDS 13401 - 14390 981 ## ECSE_4198 2-keto-3-deoxygluconate permease
+ Term 14409 - 14453 8.0
+ Prom 14444 - 14503 3.7
15 10 Tu 1 .
+ CDS 14539 - 15213 611 ## COG2258 Uncharacterized protein conserved in bacteria + Term 15242 - 15305 0.8 - Term 15175 - 15210 -0.7 16 11 Op 1 40/0.000 - CDS 15319 - 16692 1487 ## COG0642 Signal transduction histidine kinase 17 11 Op 2 . - CDS 16689 - 17387 892 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 17444 - 17503 2.1 + Prom 17451 - 17510 2.0 18 12 Tu 1 . + CDS 17537 - 18037 256 ## COG3678 P pilus assembly/Cpx signaling pathway, periplasmic inhibitor/zinc-resistance associated protein Predicted protein(s) >gi|296494634|gb|ADTN01000104.1| GENE 1 8 - 1756 1134 582 aa, chain - ## HITS:1 COG:frvR_1 KEGG:ns NR:ns ## COG: frvR_1 COG3711 # Protein_GI_number: 16131737 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Escherichia coli K12 # 1 463 1 463 463 910 100.0 0 MLNERQLKIVDLLEQQPRTPGELAQQTGVSGRTILRDIDYLNFTLNGKARIFASGSAGYQ LEIFERRSFFQLLQKHDNDDRLLALLLLNTFTPRAQLASALNLPETWVAERLPRLKQRYE RTCCLASRPGLGHFIDETEEKRVILLANLLRKDPFLIPLAGITRDNLQHLSTACDNQHRW PLMQGDYLSSLILAIYALRNQLTDEWPQYPGDEIKQIVEHSGLFLGDNAVRTLTGLIEKQ HQQAQVISADNVQGLLQRVPGIASLNIIDAQLVENITGHLLRCLAAPVWIAEHRQSSMNN LKAAWPAAFDMSLHFITLLREQLDIPLFDSDLIGLYFACALERHQNERQPIILLSDQNAI ATINQLAIERDVLNCRVIIARSLSELVAIREEIEPLLIINNSHYLLDDAVNNYITVKNII TAAGIEQIKHFLATAFIRQQPERFFSAPGSFHYSNVRGESWQHITRQICAQLVAQHHITA DEAQRIIAREGEGENLIVNRLAIPHCWSEQERRFRGFFITLAQPVEVNNEVINHVLIACA AADARHELKIFSYLASILCQHPAEIIAGLTGYEAFMELLHKG >gi|296494634|gb|ADTN01000104.1| GENE 2 1756 - 2826 1139 356 aa, chain - ## HITS:1 COG:frvX KEGG:ns NR:ns ## COG: frvX COG1363 # Protein_GI_number: 16131738 # Func_class: G Carbohydrate transport and metabolism # Function: Cellulase M and related proteins # Organism: Escherichia coli K12 # 1 356 1 356 356 738 100.0 0 MNIELLQQLCEASAVSGDEQEVRDILINTLEPCVNEITFDGLGSFVARKGNKGPKVAVVG HMDEVGFMVTHIDESGFLRFTTIGGWWNQSMLNHRVTIRTHKGVKIPGVIGSVAPHALTE KQKQQPLSFDEMFIDIGANSREEVEKRGVEIGNFISPEANFACWGEDKVVGKALDNRIGC 
AMMAELLQTVNNPEITLYGVGSVEEEVGLRGAQTSAEHIKPDVVIVLDTAVAGDVPGIDN IKYPLKLGQGPGLMLFDKRYFPNQKLVAALKSCAAHNDLPLQFSTMKTGATDGGRYNVMG GGRPVVALCLPTRYLHANSGMISKADYEALLTLIRGFLTTLTAEKVNAFSQFRQVD >gi|296494634|gb|ADTN01000104.1| GENE 3 2816 - 4267 1610 483 aa, chain - ## HITS:1 COG:frvB_2 KEGG:ns NR:ns ## COG: frvB_2 COG1299 # Protein_GI_number: 16131739 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Escherichia coli K12 # 129 483 1 355 355 629 100.0 1e-180 MESSLRIVAITNCPAGIAHTYMVAEALEQKARSLGHTIKVETQGSSGVENRLSSEEIAAA DYVILATGRGLSGDDRARFAGKKVYEIAISQALKNIDQIFSELPTNSQLFAADSGVKLGK QEVQSGSVMSHLMAGVSAALPFVIGGGILVALANMLVQFGLPYTDMSKGAPSFTWVVESI GYLGFTFMIPIMGAYIASSIADKPAFAPAFLVCYLANDKALLGTQSGAGFLGAVVLGLAI GYFVFWFRKVRLGKALQPLLGSMLIPFVTLLVFGVLTYYVIGPVMSDLMGGLLHFLNTIP PSMKFAAAFLVGAMLAFDMGGPINKTAWFFCFSLLEKHIYDWYAIVGVVALMPPVAAGLA TFIAPKLFTRQEKEAASSAIVVGATVATEPAIPYALAAPLPMITANTLAGGITGVLVIAF GIKRLAPGLGIFDPLIGLMSPVGSFYLVLAIGLALNISFIIVLKGLWLRRKAKAAQQELV HEH >gi|296494634|gb|ADTN01000104.1| GENE 4 4278 - 4724 340 148 aa, chain - ## HITS:1 COG:frvA KEGG:ns NR:ns ## COG: frvA COG1762 # Protein_GI_number: 16131740 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Escherichia coli K12 # 1 148 1 148 148 292 100.0 2e-79 MAALTASCIDLNIQGNGAYSVLKQLATIALQNGFITDSHQFLQTLLLREKMHSTGFGSGV AVPHGKSACVKQPFVLFARKAQAIDWKASDGEDVNCWICLGVPQSGEEDQVKIIGTLCRK IIHKEFIHQLQQGDTDQVLALLNQTLSS >gi|296494634|gb|ADTN01000104.1| GENE 5 5025 - 5339 413 104 aa, chain - ## HITS:1 COG:yiiL KEGG:ns NR:ns ## COG: yiiL COG3254 # Protein_GI_number: 16131741 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 104 1 104 104 199 100.0 8e-52 MIRKAFVMQVNPDAHEEYQRRHNPIWPELEAVLKSHGAHNYAIYLDKARNLLFAMVEIES EERWNAVASTDVCQRWWKYMTDVMPANPDNSPVSSELQEVFYLP 
>gi|296494634|gb|ADTN01000104.1| GENE 6 5349 - 6173 963 274 aa, chain - ## HITS:1 COG:rhaD KEGG:ns NR:ns ## COG: rhaD COG0235 # Protein_GI_number: 16131742 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli K12 # 1 274 1 274 274 573 100.0 1e-163 MQNITQSWFVQGMIKATTDAWLKGWDERNGGNLTLRLDDADIAPYHDNFHQQPRYIPLSQ PMPLLANTPFIVTGSGKFFRNVQLDPAANLGIVKVDSDGAGYHILWGLTNEAVPTSELPA HFLSHCERIKATNGKDRVIMHCHATNLIALTYVLENDTAVFTRQLWEGSTECLVVFPDGV GILPWMVPGTDEIGQATAQEMQKHSLVLWPFHGVFGSGPTLDETFGLIDTAEKSAQVLVK VYSMGGMKQTISREELIALGKRFGVTPLASALAL >gi|296494634|gb|ADTN01000104.1| GENE 7 6440 - 7699 1424 419 aa, chain - ## HITS:1 COG:rhaA KEGG:ns NR:ns ## COG: rhaA COG4806 # Protein_GI_number: 16131743 # Func_class: G Carbohydrate transport and metabolism # Function: L-rhamnose isomerase # Organism: Escherichia coli K12 # 1 419 1 419 419 856 99.0 0 MTTQLEQAWELAKQRFAAVGIDVEEALRQLDRLPVSMHCWQGDDVSGFENPEGSLTGGIQ ATGNYPGKARNASELRADLEQAMRLIPGPKRLNLHAIYLESDTPVSRDQIKPEHFKNWVE WAKANQLGLDFNPSCFSHPLSADGFTLSHADDSIRQFWIDHCKASRRVSAYFGEQLGTPS VMNIWIPDGMKDITVDRLAPRQRLLAALDEVISEKLNPAHHIDAVESKLFGIGAESYTVG SNEFYMGYATSRQTALCLDAGHFHPTEVISDKISAAMLYVPQLLLHVSRPVRWDSDHVVL LDDETQAIASEIVRHDLFDRVHIGLDFFDASINRIAAWVIGTRNMKKALLRALLEPTAEL RKLEAAGDYTARLALLEEQKSLPWQAVWEMYCQRHDTPAGSEWLESVRAYEKEILSRRG >gi|296494634|gb|ADTN01000104.1| GENE 8 7696 - 9165 1261 489 aa, chain - ## HITS:1 COG:rhaB KEGG:ns NR:ns ## COG: rhaB COG1070 # Protein_GI_number: 16131744 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Escherichia coli K12 # 1 489 1 489 489 1004 99.0 0 MTFRNCVAVDLGASSGRVMLARYERECRSLTLREIHRFNNGLHSQNGYVTRDVDSLESAI RLGLNKVCEEGIRIDSIGIDTWGVDFVLLDQQGQRVGLPVAYRDSRTNGLMAQAQQQLGK RDIYQRSGIQFLPFNTLYQLRALTEQQPELIPHIAHALLMPDNFSYRLTGKMNWEYTNAT TTQLVNINSDDWDESLLAWSGANKAWFGRPTHPGNVIGHWICPQGNEIPVVAVASHDTAS 
AVIASPLNGSRAAYLSSGTWSLMGFESQTPFTNDTALAANITNEGGAEGRYRVLKNIMGL WLLQRVLQEQQINDLPALISATQALPACRFIINPNDDRFINPETMCSEIQAACRETAQPI PESDAELARCIFDSLALLYADVLHELAQLRGEDFSQLHIVGGGCQNTLLNQLCADACGIR VIAGPVEASTLGNIGIQLMTLDELNNVDDFRQVVSTTANLTTFTPNPDSEIAHYVAQIHS TRQTKELCA >gi|296494634|gb|ADTN01000104.1| GENE 9 9453 - 10289 476 278 aa, chain + ## HITS:1 COG:rhaS KEGG:ns NR:ns ## COG: rhaS COG2207 # Protein_GI_number: 16131745 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 278 1 278 278 529 99.0 1e-150 MTVLHSVDFFPSGNASVAIEPRLPQADFPEHHHDFHEIVIVEHGTGIHVFNGQPYTITDG TVCFVRDHDRHLYEHTDNLCLTNVLYRSPDRFQFLAGLNQLLPQELDGQYPSHWRVNHSV LQQVRQLVAQMEQQEGENDLPSTASREILFMQLLLLLCKSSLQENLENSASRLNLLLAWL EDHFADEVNWDAVADQFSLSLRTLHRQLKQQTGLTPQRYLNRLRLMKARHLLRHSEASVT DIAYRCGFSDSNHFSTLFRREFNWSPRDIRQGRDGFLQ >gi|296494634|gb|ADTN01000104.1| GENE 10 10237 - 10371 104 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRHRGYLGQRTKLIIRNMAYKYVEKIRVIAESHPVPGEYHAVTS >gi|296494634|gb|ADTN01000104.1| GENE 11 10363 - 11211 596 282 aa, chain + ## HITS:1 COG:rhaR KEGG:ns NR:ns ## COG: rhaR COG2207 # Protein_GI_number: 16131746 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 282 31 312 312 578 99.0 1e-165 MAHQLKLLKDDFFASDQQAVAVADRYPQDVFAEHTHDFCELVIVWRGNGLHVLNDRPYRI TRGDLFYIHADDKHSYASVNDLVLQNIIYCPERLKLNLDWQGAIPGFNASAGQPHWRLGS MGMAQARQVIGQLEHESSQHVPFANEMAELLFGQLVMLLNRHRYTSDSLPPTSSETLLDK LITRLAASLKSPFALDKFCDEASCSERVLRQQFRQQTGMTINQYLRQVRVCHAQYLLQHS RLLISDISTECGFEDSNYFSVVFTRETGMTPSQWRHLNSQKD >gi|296494634|gb|ADTN01000104.1| GENE 12 11208 - 12242 1325 344 aa, chain - ## HITS:1 COG:rhaT KEGG:ns NR:ns ## COG: rhaT COG0697 # Protein_GI_number: 16131747 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli 
K12 # 1 344 1 344 344 606 99.0 1e-173 MSNAITMGIFWHLIGAASAACFYAPFKKVKKWSWETMWSVGGIVSWIILPWAISALLLPN FWAYYSSFSLSTLLPVFLFGAMWGIGNINYGLTMRYLGMSMGIGIAIGITLIVGTLMTPI INGNFDVLISTEGGRMTLLGVLVALIGVGIVTRAGQLKERKMGIKAEEFNLKKGLVLAVM CGIFSAGMSFAMNAAKPMHEAAAALGVDPLYVALPSYVVIMGGGAIINLGFCFIRLAKVK DLSLKADFSLAKSLIIHNVLLSTLGGLMWYLQFFFYAWGHARIPAQYDYISWMLHMSFYV LCGGIVGLVLKEWNNAGRRPVTVLSLGCVVIIVAANIVGIGMAN >gi|296494634|gb|ADTN01000104.1| GENE 13 12527 - 13147 571 206 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent [Streptococcus pneumoniae TIGR4] # 1 206 1 201 201 224 53 3e-58 MSYTLPSLPYAYDALEPHFDKQTMEIHHTKHHQTYVNNANAALESLPEFANLPVEELITK LDQLPADKKTVLRNNAGGHANHSLFWKGLKKGTTLQGDLKAAIERDFGSVDNFKAEFEKA AASRFGSGWAWLVLKGDKLAVVSTANQDSPLMGEAISGASGFPIMGLDVWEHAYYLKFQN RRPDYIKEFWNVVNWDEAAARFAAKK >gi|296494634|gb|ADTN01000104.1| GENE 14 13401 - 14390 981 329 aa, chain + ## HITS:1 COG:no KEGG:ECSE_4198 NR:ns ## KEGG: ECSE_4198 # Name: not_defined # Def: 2-keto-3-deoxygluconate permease # Organism: E.coli_SE11 # Pathway: not_defined # 1 329 13 341 341 500 99.0 1e-140 MEMQIKRSIEKIPGGMMLVPLFLGALCHTFSPGAGKYFGSFTNGMITGTVPILAVWFFCM GASIKLSATGTVLRKSGTLVVTKIAVAWVVAAIASRIIPEHGVEVGFFAGLSTLALVAAM DMTNGGLYASIMQQYGTKEEAGAFVLMSLESGPLMTMIILGTAGIASFEPHVFVGAVLPF LVGFALGNLDPELREFFSKAVQTLIPFFAFALGNTIDLTVIAQTGLLGILLGVAVIIVTG IPLIIADKLIGGGDGTAGIAASSSAGAAVATPVLIAEMVPAFKPMAPAATSLVATAVIVT SILVPILTSIWSRKVKARAAKIEILGTVK >gi|296494634|gb|ADTN01000104.1| GENE 15 14539 - 15213 611 224 aa, chain + ## HITS:1 COG:yiiM KEGG:ns NR:ns ## COG: yiiM COG2258 # Protein_GI_number: 16131750 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 224 11 234 234 462 100.0 1e-130 MRYPVDVYTGKIQAYPEGKPSAIAKIQVDGELMLTELGLEGDEQAEKKVHGGPDRALCHY PREHYLYWAREFPEQAELFVAPAFGENLSTDGLTESNVYMGDIFRWGEALIQVSQPRSPC YKLNYHFDISDIAQLMQNTGKVGWLYSVIAPGKVSADAPLELVSRVSDVTVQEAAAIAWH MPFDDDQYHRLLSAAGLSKSWTRTMQKRRLSGKIEDFSRRLWGK 
>gi|296494634|gb|ADTN01000104.1| GENE 16 15319 - 16692 1487 457 aa, chain - ## HITS:1 COG:ECs4837 KEGG:ns NR:ns ## COG: ECs4837 COG0642 # Protein_GI_number: 15834091 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli O157:H7 # 1 457 1 457 457 816 100.0 0 MIGSLTARIFAIFWLTLALVLMLVLMLPKLDSRQMTELLDSEQRQGLMIEQHVEAELAND PPNDLMWWRRLFRAIDKWAPPGQRLLLVTTEGRVIGAERSEMQIIRNFIGQADNADHPQK KKYGRVELVGPFSVRDGEDNYQLYLIRPASSSQSDFINLLFDRPLLLLIVTMLVSTPLLL WLAWSLAKPARKLKNAADEVAQGNLRQHPELEAGPQEFLAAGASFNQMVTALERMMTSQQ RLLSDISHELRTPLTRLQLGTALLRRRSGESKELERIETEAQRLDSMINDLLVMSRNQQK NALVSETIKANQLWSEVLDNAAFEAEQMGKSLTVNFPPGPWPLYGNPNALESALENIVRN ALRYSHTKIEVGFAVDKDGITITVDDDGPGVSPEDREQIFRPFYRTDEARDRESGGTGLG LAIVETAIQQHRGWVKAEDSPLGGLRLVIWLPLYKRS >gi|296494634|gb|ADTN01000104.1| GENE 17 16689 - 17387 892 232 aa, chain - ## HITS:1 COG:ECs4838 KEGG:ns NR:ns ## COG: ECs4838 COG0745 # Protein_GI_number: 15834092 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 232 1 232 232 410 100.0 1e-114 MNKILLVDDDRELTSLLKELLEMEGFNVIVAHDGEQALDLLDDSIDLLLLDVMMPKKNGI DTLKALRQTHQTPVIMLTARGSELDRVLGLELGADDYLPKPFNDRELVARIRAILRRSHW SEQQQNNDNGSPTLEVDALVLNPGRQEASFDGQTLELTGTEFTLLYLLAQHLGQVVSREH LSQEVLGKRLTPFDRAIDMHISNLRRKLPDRKDGHPWFKTLRGRGYLMVSAS >gi|296494634|gb|ADTN01000104.1| GENE 18 17537 - 18037 256 166 aa, chain + ## HITS:1 COG:ECs4839 KEGG:ns NR:ns ## COG: ECs4839 COG3678 # Protein_GI_number: 15834093 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; N Cell motility; T Signal transduction mechanisms; P Inorganic ion transport and metabolism # Function: P pilus assembly/Cpx signaling pathway, periplasmic inhibitor/zinc-resistance associated protein # Organism: Escherichia coli O157:H7 # 1 151 2 152 167 211 100.0 5e-55 
MRIVTAAVMASTLAVSSLSHAAEVGSGDNWHPGEELTQRSTQSHMFDGISLTEHQRQQMR DLMQQARHEQPPVNVSELETMHRLVTAENFDENAVRAQAEKMANEQIARQVEMAKVRNQM YRLLTPEQQAVLNEKHQQRMEQLRDVTQWQKSSSLKLLSSSNSRSQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:29:42 2011 Seq name: gi|296494633|gb|ADTN01000105.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont246.3, whole genome shotgun sequence Length of sequence - 9273 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 5, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 11 - 70 4.3 1 1 Op 1 7/0.000 + CDS 90 - 992 539 ## COG0053 Predicted Co/Zn/Cd cation transporters + Term 1040 - 1087 1.3 + Prom 1034 - 1093 4.2 2 1 Op 2 4/0.000 + CDS 1173 - 2135 1248 ## COG0205 6-phosphofructokinase + Term 2319 - 2351 2.3 + Prom 2356 - 2415 6.2 3 1 Op 3 4/0.000 + CDS 2455 - 3444 1224 ## COG1613 ABC-type sulfate transport system, periplasmic component + Term 3454 - 3516 1.8 + Prom 3468 - 3527 4.8 4 1 Op 4 . + CDS 3551 - 4306 485 ## COG2134 CDP-diacylglycerol pyrophosphatase - Term 4317 - 4351 5.2 5 2 Op 1 . - CDS 4361 - 5128 906 ## COG0149 Triosephosphate isomerase - Prom 5191 - 5250 2.4 - Term 5154 - 5184 2.1 6 2 Op 2 . - CDS 5298 - 5834 352 ## EcolC_4098 hypothetical protein - Prom 5858 - 5917 5.8 + Prom 5852 - 5911 2.5 7 3 Tu 1 3/1.000 + CDS 5935 - 6375 459 ## COG3152 Predicted membrane protein + Prom 6492 - 6551 3.3 8 4 Op 1 3/1.000 + CDS 6587 - 6886 349 ## COG3691 Uncharacterized protein conserved in bacteria 9 4 Op 2 . + CDS 6913 - 7341 416 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins + Term 7372 - 7424 4.6 10 5 Op 1 4/0.000 - CDS 7346 - 8092 930 ## COG1018 Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 - Term 8133 - 8168 5.1 11 5 Op 2 . 
- CDS 8189 - 9199 1384 ## COG1494 Fructose-1,6-bisphosphatase/sedoheptulose 1,7-bisphosphatase and related proteins Predicted protein(s) >gi|296494633|gb|ADTN01000105.1| GENE 1 90 - 992 539 300 aa, chain + ## HITS:1 COG:ECs4840 KEGG:ns NR:ns ## COG: ECs4840 COG0053 # Protein_GI_number: 15834094 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Escherichia coli O157:H7 # 1 300 1 300 300 558 100.0 1e-159 MNQSYGRLVSRAAIAATAMASLLLLIKIFAWWYTGSVSILAALVDSLVDIGASLTNLLVV RYSLQPADDNHSFGHGKAESLAALAQSMFISGSALFLFLTGIQHLISPTPMTDPGVGVIV TIVALICTIILVSFQRWVVRRTQSQAVRADMLHYQSDVMMNGAILLALGLSWYGWHRADA LFALGIGIYILYSALRMGYEAVQSLLDRALPDEERQEIIDIVTSWPGVSGAHDLRTRQSG PTRFIQIHLEMEDSLPLVQAHMVADQVEQAILRRFPGSDVIIHQDPCSVVPREGKRSMLS >gi|296494633|gb|ADTN01000105.1| GENE 2 1173 - 2135 1248 320 aa, chain + ## HITS:1 COG:ECs4841 KEGG:ns NR:ns ## COG: ECs4841 COG0205 # Protein_GI_number: 15834095 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Escherichia coli O157:H7 # 1 320 1 320 320 634 100.0 0 MIKKIGVLTSGGDAPGMNAAIRGVVRSALTEGLEVMGIYDGYLGLYEDRMVQLDRYSVSD MINRGGTFLGSARFPEFRDENIRAVAIENLKKRGIDALVVIGGDGSYMGAMRLTEMGFPC IGLPGTIDNDIKGTDYTIGFFTALSTVVEAIDRLRDTSSSHQRISVVEVMGRYCGDLTLA AAIAGGCEFVVVPEVEFSREDLVNEIKAGIAKGKKHAIVAITEHMCDVDELAHFIEKETG RETRATVLGHIQRGGSPVPYDRILASRMGAYAIDLLLAGYGGRCVGIQNEQLVHHDIIDA IENMKRPFKGDWLDCAKKLY >gi|296494633|gb|ADTN01000105.1| GENE 3 2455 - 3444 1224 329 aa, chain + ## HITS:1 COG:sbp KEGG:ns NR:ns ## COG: sbp COG1613 # Protein_GI_number: 16131755 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type sulfate transport system, periplasmic component # Organism: Escherichia coli K12 # 1 329 1 329 329 649 100.0 0 MNKWGVGLTFLLAATSVMAKDIQLLNVSYDPTRELYEQYNKAFSAHWKQQTGDNVVIRQS HGGSGKQATSVINGIEADVVTLALAYDVDAIAERGRIDKEWIKRLPDNSAPYTSTIVFLV RKGNPKQIHDWNDLIKPGVSVITPNPKSSGGARWNYLAAWGYALHHNNNDQAKAQDFVRA 
LYKNVEVLDSGARGSTNTFVERGIGDVLIAWENEALLAANELGKDKFEIVTPSESILAEP TVSVVDKVVEKKGTKEVAEAYLKYLYSPEGQEIAAKNYYRPRDAEVAKKYENAFPKLKLF TIDEEFGGWTKAQKEHFANGGTFDQISKR >gi|296494633|gb|ADTN01000105.1| GENE 4 3551 - 4306 485 251 aa, chain + ## HITS:1 COG:cdh KEGG:ns NR:ns ## COG: cdh COG2134 # Protein_GI_number: 16131756 # Func_class: I Lipid transport and metabolism # Function: CDP-diacylglycerol pyrophosphatase # Organism: Escherichia coli K12 # 1 251 1 251 251 515 100.0 1e-146 MKKAGLLFLVMIVIAVVAAGIGYWKLTGEESDTLRKIVLEECLPNQQQNQNPSPCAEVKP NAGYVVLKDLNGPLQYLLMPTYRINGTESPLLTDPSTPNFFWLAWQARDFMSKKYGQPVP DRAVSLAINSRTGRTQNHFHIHISCIRPDVRKQLDNNLANISSRWLPLPGGLRGHEYLAR RVTESELVQRSPFMMLAEEVPEAREHMGRYGLAMVRQSDNSFVLLATQRNLLTLNRASAE EIQDHQCEILR >gi|296494633|gb|ADTN01000105.1| GENE 5 4361 - 5128 906 255 aa, chain - ## HITS:1 COG:ECs4844 KEGG:ns NR:ns ## COG: ECs4844 COG0149 # Protein_GI_number: 15834098 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Escherichia coli O157:H7 # 1 255 1 255 255 454 100.0 1e-128 MRHPLVMGNWKLNGSRHMVHELVSNLRKELAGVAGCAVAIAPPEMYIDMAKREAEGSHIM LGAQNVDLNLSGAFTGETSAAMLKDIGAQYIIIGHSERRTYHKESDELIAKKFAVLKEQG LTPVLCIGETEAENEAGKTEEVCARQIDAVLKTQGAAAFEGAVIAYEPVWAIGTGKSATP AQAQAVHKFIRDHIAKVDANIAEQVIIQYGGSVNASNAAELFAQPDIDGALVGGASLKAD AFAVIVKAAEAAKQA >gi|296494633|gb|ADTN01000105.1| GENE 6 5298 - 5834 352 178 aa, chain - ## HITS:1 COG:no KEGG:EcolC_4098 NR:ns ## KEGG: EcolC_4098 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 157 1 157 199 276 100.0 3e-73 MKPGCTLFFLLCSALTVTTAAHAQTPDTATTAPYLLAGAPTFDLSISQFREDFNSQNPSL PLNEFRAIDSSPDKANLTRAASKINENLYASTALERGTLKIKSIQMTWLPIQGPEQKAAK AKAQEYMAAVIRTLTPLMTKTQSQKKLQSLLTAGKNKVITPRQKVHCVMLSRTTAKRG >gi|296494633|gb|ADTN01000105.1| GENE 7 5935 - 6375 459 146 aa, chain + ## HITS:1 COG:ECs4846 KEGG:ns NR:ns ## COG: ECs4846 COG3152 # Protein_GI_number: 15834100 # Func_class: S Function unknown # Function: Predicted membrane protein # 
Organism: Escherichia coli O157:H7 # 1 146 1 146 146 260 100.0 6e-70 MTIQQWLFSFKGRIGRRDFWIWIGLWFAGMLVLFSLAGKNLLDIQTAAFCLVCLLWPTAA VTVKRLHDRGRSGAWAFLMIVAWMLLAGNWAILPGVWQWAVGRFVPTLILVMMLIDLGAF VGTQGENKYGKDTQDVKYKADNKSSN >gi|296494633|gb|ADTN01000105.1| GENE 8 6587 - 6886 349 99 aa, chain + ## HITS:1 COG:yiiS KEGG:ns NR:ns ## COG: yiiS COG3691 # Protein_GI_number: 16131760 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 99 1 99 99 161 100.0 3e-40 MKDVVDKCSTKGCAIDIGTVIDNDNCTSKFSRFFATREEAESFMTKLKELAAATSSADEG ASVAYKIKDLEGQVELDAAFTFSCQAEMIIFELSLRSLA >gi|296494633|gb|ADTN01000105.1| GENE 9 6913 - 7341 416 142 aa, chain + ## HITS:1 COG:yiiT KEGG:ns NR:ns ## COG: yiiT COG0589 # Protein_GI_number: 16131761 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Escherichia coli K12 # 1 142 1 142 142 280 100.0 4e-76 MAYKHIGVAISGNEEDALLVNKALELARHNDAHLTLIHIDDGLSELYPGIYFPATEDILQ LLKNKSDNKLYKLTKNIQWPKTKLRIERGEMPETLLEIMQKEQCDLLVCGHHHSFINRLM PAYRGMINKMSADLLIVPFIDK >gi|296494633|gb|ADTN01000105.1| GENE 10 7346 - 8092 930 248 aa, chain - ## HITS:1 COG:fpr KEGG:ns NR:ns ## COG: fpr COG1018 # Protein_GI_number: 16131762 # Func_class: C Energy production and conversion # Function: Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 # Organism: Escherichia coli K12 # 1 248 1 248 248 505 100.0 1e-143 MADWVTGKVTKVQNWTDALFSLTVHAPVLPFTAGQFTKLGLEIDGERVQRAYSYVNSPDN PDLEFYLVTVPDGKLSPRLAALKPGDEVQVVSEAAGFFVLDEVPHCETLWMLATGTAIGP YLSILQLGKDLDRFKNLVLVHAARYAADLSYLPLMQELEKRYEGKLRIQTVVSRETAAGS LTGRIPALIESGELESTIGLPMNKETSHVMLCGNPQMVRDTQQLLKETRQMTKHLRRRPG HMTAEHYW >gi|296494633|gb|ADTN01000105.1| GENE 11 8189 - 9199 1384 336 aa, chain - ## HITS:1 COG:glpX KEGG:ns NR:ns ## COG: glpX COG1494 # Protein_GI_number: 16131763 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1,6-bisphosphatase/sedoheptulose 
1,7-bisphosphatase and related proteins # Organism: Escherichia coli K12 # 1 336 1 336 336 620 100.0 1e-177 MRRELAIEFSRVTESAALAGYKWLGRGDKNTADGAAVNAMRIMLNQVNIDGTIVIGEGEI DEAPMLYIGEKVGTGRGDAVDIAVDPIEGTRMTAMGQANALAVLAVGDKGCFLNAPDMYM EKLIVGPGAKGTIDLNLPLADNLRNVAAALGKPLSELTVTILAKPRHDAVIAEMQQLGVR VFAIPDGDVAASILTCMPDSEVDVLYGIGGAPEGVVSAAVIRALDGDMNGRLLARHDVKG DNEENRRIGEQELARCKAMGIEAGKVLRLGDMARSDNVIFSATGITKGDLLEGISRKGNI ATTETLLIRGKSRTIRRIQSIHYLDRKDPEMQVHIL Prediction of potential genes in microbial genomes Time: Sun May 15 23:30:00 2011 Seq name: gi|296494632|gb|ADTN01000106.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont246.4, whole genome shotgun sequence Length of sequence - 46432 bp Number of predicted genes - 39, with homology - 38 Number of transcription units - 23, operones - 10 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 18/0.000 - CDS 105 - 1613 1870 ## COG0554 Glycerol kinase 2 1 Op 2 . - CDS 1636 - 2481 706 ## COG0580 Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) - Prom 2587 - 2646 5.3 3 2 Tu 1 . + CDS 2912 - 3151 512 ## COG3074 Uncharacterized protein conserved in bacteria + Term 3213 - 3241 1.0 - Term 3197 - 3232 5.1 4 3 Op 1 7/0.200 - CDS 3236 - 3721 558 ## COG0684 Demethylmenaquinone methyltransferase - Prom 3747 - 3806 5.3 5 3 Op 2 6/0.300 - CDS 3814 - 4740 1082 ## COG1575 1,4-dihydroxy-2-naphthoate octaprenyltransferase - Term 4745 - 4778 2.9 6 4 Op 1 24/0.000 - CDS 4807 - 6138 1193 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 7 4 Op 2 7/0.200 - CDS 6148 - 6678 623 ## COG5405 ATP-dependent protease HslVU (ClpYQ), peptidase subunit - Prom 6698 - 6757 1.7 - Term 6702 - 6733 4.1 8 5 Op 1 6/0.300 - CDS 6771 - 7625 689 ## COG3087 Cell division protein - Prom 7760 - 7819 2.3 9 5 Op 2 . - CDS 7822 - 8847 894 ## COG1609 Transcriptional regulators - Prom 8936 - 8995 5.1 10 6 Tu 1 . 
- CDS 9003 - 11201 2249 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase - Prom 11226 - 11285 2.8 + Prom 11284 - 11343 5.2 11 7 Tu 1 . + CDS 11404 - 11616 384 ## PROTEIN SUPPORTED gi|15804527|ref|NP_290567.1| 50S ribosomal protein L31 + Term 11624 - 11677 5.7 - Term 11624 - 11649 -0.5 12 8 Tu 1 . - CDS 11677 - 12285 623 ## JW3908 predicted peptidoglycan peptidase - Prom 12363 - 12422 2.4 13 9 Tu 1 . - CDS 12469 - 12786 440 ## COG3060 Transcriptional regulator of met regulon - Prom 12921 - 12980 7.3 + Prom 12967 - 13026 4.8 14 10 Op 1 5/0.400 + CDS 13063 - 14223 1249 ## COG0626 Cystathionine beta-lyases/cystathionine gamma-synthases 15 10 Op 2 4/0.400 + CDS 14226 - 16658 2865 ## COG0527 Aspartokinases + Term 16809 - 16843 2.2 + Prom 16797 - 16856 6.0 16 11 Op 1 . + CDS 17007 - 17897 975 ## COG0685 5,10-methylenetetrahydrofolate reductase 17 11 Op 2 . + CDS 17955 - 18074 65 ## + Prom 18143 - 18202 3.0 18 11 Op 3 1/1.000 + CDS 18226 - 20406 2694 ## COG0376 Catalase (peroxidase I) + Term 20415 - 20460 13.5 + Prom 20415 - 20474 1.7 19 12 Tu 1 . + CDS 20499 - 21404 957 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 20 13 Tu 1 . - CDS 21431 - 22048 491 ## COG3738 Uncharacterized protein conserved in bacteria - Prom 22223 - 22282 3.4 21 14 Op 1 3/0.600 - CDS 22323 - 23426 1370 ## COG0371 Glycerol dehydrogenase and related enzymes 22 14 Op 2 2/0.900 - CDS 23437 - 24099 861 ## COG0176 Transaldolase 23 14 Op 3 . 
- CDS 24111 - 26612 2483 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) - Prom 26797 - 26856 4.2 + Prom 26697 - 26756 3.3 24 15 Op 1 7/0.200 + CDS 26921 - 28000 1329 ## COG1299 Phosphotransferase system, fructose-specific IIC component 25 15 Op 2 4/0.400 + CDS 28015 - 28335 529 ## COG1445 Phosphotransferase system fructose-specific component IIB 26 15 Op 3 11/0.000 + CDS 28386 - 30683 2275 ## COG1882 Pyruvate-formate lyase 27 15 Op 4 2/0.900 + CDS 30649 - 31527 568 ## COG1180 Pyruvate-formate lyase-activating enzyme 28 15 Op 5 . + CDS 31529 - 31870 326 ## COG1445 Phosphotransferase system fructose-specific component IIB - Term 31744 - 31778 -1.0 29 16 Tu 1 . - CDS 31857 - 32708 847 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 32841 - 32900 3.0 - Term 32750 - 32782 1.3 30 17 Tu 1 . - CDS 32923 - 34656 2001 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase - Prom 34681 - 34740 3.5 - Term 34787 - 34820 5.2 31 18 Tu 1 . - CDS 34838 - 37450 3108 ## COG2352 Phosphoenolpyruvate carboxylase - Prom 37697 - 37756 5.3 32 19 Tu 1 . - CDS 37964 - 39115 1054 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases - Prom 39164 - 39223 3.3 + Prom 39098 - 39157 6.1 33 20 Op 1 8/0.000 + CDS 39269 - 40273 930 ## COG0002 Acetylglutamate semialdehyde dehydrogenase 34 20 Op 2 2/0.900 + CDS 40284 - 41057 1113 ## COG0548 Acetylglutamate kinase 35 20 Op 3 4/0.400 + CDS 41118 - 42491 1845 ## COG0165 Argininosuccinate lyase + Term 42514 - 42548 3.4 + Prom 42665 - 42724 5.1 36 21 Tu 1 . + CDS 42758 - 43675 883 ## COG0583 Transcriptional regulator + Term 43923 - 43948 -0.8 37 22 Tu 1 . - CDS 43658 - 45058 428 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 - Prom 45090 - 45149 4.2 + Prom 45054 - 45113 4.2 38 23 Op 1 . + CDS 45335 - 46039 716 ## COG1309 Transcriptional regulator 39 23 Op 2 . 
+ CDS 46039 - 46398 450 ## LF82_3412 inner membrane protein YijD Predicted protein(s) >gi|296494632|gb|ADTN01000106.1| GENE 1 105 - 1613 1870 502 aa, chain - ## HITS:1 COG:ECs4851 KEGG:ns NR:ns ## COG: ECs4851 COG0554 # Protein_GI_number: 15834105 # Func_class: C Energy production and conversion # Function: Glycerol kinase # Organism: Escherichia coli O157:H7 # 1 502 1 502 502 1023 100.0 0 MTEKKYIVALDQGTTSSRAVVMDHDANIISVSQREFEQIYPKPGWVEHDPMEIWATQSST LVEVLAKADISSDQIAAIGITNQRETTIVWEKETGKPIYNAIVWQCRRTAEICEHLKRDG LEDYIRSNTGLVIDPYFSGTKVKWILDHVEGSRERARRGELLFGTVDTWLIWKMTQGRVH VTDYTNASRTMLFNIHTLDWDDKMLEVLDIPREMLPEVRRSSEVYGQTNIGGKGGTRIPI SGIAGDQQAALFGQLCVKEGMAKNTYGTGCFMLMNTGEKAVKSENGLLTTIACGPTGEVN YALEGAVFMAGASIQWLRDEMKLINDAYDSEYFATKVQNTNGVYVVPAFTGLGAPYWDPY ARGAIFGLTRGVNANHIIRATLESIAYQTRDVLEAMQADSGIRLHALRVDGGAVANNFLM QFQSDILGTRVERPEVREVTALGAAYLAGLAVGFWQNLDELQEKAVIEREFRPGIETTER NYRYAGWKKAVKRAMAWEEHDE >gi|296494632|gb|ADTN01000106.1| GENE 2 1636 - 2481 706 281 aa, chain - ## HITS:1 COG:ECs4852 KEGG:ns NR:ns ## COG: ECs4852 COG0580 # Protein_GI_number: 15834106 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) # Organism: Escherichia coli O157:H7 # 1 281 1 281 281 533 100.0 1e-151 MSQTSTLKGQCIAEFLGTGLLIFFGVGCVAALKVAGASFGQWEISVIWGLGVAMAIYLTA GVSGAHLNPAVTIALWLFACFDKRKVIPFIVSQVAGAFCAAALVYGLYYNLFFDFEQTHH IVRGSVESVDLAGTFSTYPNPHINFVQAFAVEMVITAILMGLILALTDDGNGVPRGPLAP LLIGLLIAVIGASMGPLTGFAMNPARDFGPKVFAWLAGWGNVAFTGGRDIPYFLVPLFGP IVGAIVGAFAYRKLIGRHLPCDICVVEEKETTTPSEQKASL >gi|296494632|gb|ADTN01000106.1| GENE 3 2912 - 3151 512 79 aa, chain + ## HITS:1 COG:ECs4853 KEGG:ns NR:ns ## COG: ECs4853 COG3074 # Protein_GI_number: 15834107 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 79 3 81 81 85 100.0 2e-17 MSLEVFEKLEAKVQQAIDTITLLQMEIEELKEKNNSLSQEVQNAQHQREELERENNHLKE QQNGWQERLQALLGRMEEV 
>gi|296494632|gb|ADTN01000106.1| GENE 4 3236 - 3721 558 161 aa, chain - ## HITS:1 COG:ECs4856 KEGG:ns NR:ns ## COG: ECs4856 COG0684 # Protein_GI_number: 15834110 # Func_class: H Coenzyme transport and metabolism # Function: Demethylmenaquinone methyltransferase # Organism: Escherichia coli O157:H7 # 1 161 1 161 161 299 100.0 1e-81 MKYDTSELCDIYQEDVNVVEPLFSNFGGRASFGGQIITVKCFEDNGLLYDLLEQNGRGRV LVVDGGGSVRRALVDAELARLAVQNEWEGLVIYGAVRQVDDLEELDIGIQAMAAIPVGAA GEGIGESDVRVNFGGVTFFSGDHLYADNTGIILSEDPLDIE >gi|296494632|gb|ADTN01000106.1| GENE 5 3814 - 4740 1082 308 aa, chain - ## HITS:1 COG:menA KEGG:ns NR:ns ## COG: menA COG1575 # Protein_GI_number: 16131768 # Func_class: H Coenzyme transport and metabolism # Function: 1,4-dihydroxy-2-naphthoate octaprenyltransferase # Organism: Escherichia coli K12 # 1 308 1 308 308 572 100.0 1e-163 MTEQQISRTQAWLESLRPKTLPLAFAAIIVGTALAWWQGHFDPLVALLALITAGLLQILS NLANDYGDAVKGSDKPDRIGPLRGMQKGVITQQEMKRALIITVVLICLSGLALVAVACHT LADFVGFLILGGLSIIAAITYTVGNRPYGYIGLGDISVLVFFGWLSVMGSWYLQAHTLIP ALILPATACGLLATAVLNINNLRDINSDRENGKNTLVVRLGEVNARRYHACLLMGSLVCL ALFNLFSLHSLWGWLFLLAAPLLVKQARYVMREMDPVAMRPMLERTVKGALLTNLLFVLG IFLSQWAA >gi|296494632|gb|ADTN01000106.1| GENE 6 4807 - 6138 1193 443 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 4 443 8 466 466 464 51 1e-130 MSEMTPREIVSELDKHIIGQDNAKRSVAIALRNRWRRMQLNEELRHEVTPKNILMIGPTG VGKTEIARRLAKLANAPFIKVEATKFTEVGYVGKEVDSIIRDLTDAAVKMVRVQAIEKNR YRAEELAEERILDVLIPPAKNNWGQTEQQQEPSAARQAFRKKLREGQLDDKEIEIDLAAA PMGVEIMAPPGMEEMTSQLQSMFQNLGGQKQKARKLKIKDAMKLLIEEEAAKLVNPEELK QDAIDAVEQHGIVFIDEIDKICKRGESSGPDVSREGVQRDLLPLVEGCTVSTKHGMVKTD HILFIASGAFQIAKPSDLIPELQGRLPIRVELQALTTSDFERILTEPNASITVQYKALMA TEGVNIEFTDSGIKRIAEAAWQGNESTENIGARRLHTVLERLMEEISYDASDLSGQNITI DADYVSKHLDALVADEDLSRFIL >gi|296494632|gb|ADTN01000106.1| GENE 7 6148 - 6678 623 176 aa, chain - ## HITS:1 COG:ECs4859 KEGG:ns NR:ns ## COG: ECs4859 COG5405 # Protein_GI_number: 15834113 # Func_class: O 
Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent protease HslVU (ClpYQ), peptidase subunit # Organism: Escherichia coli O157:H7 # 1 176 1 176 176 332 100.0 3e-91 MTTIVSVRRNGHVVIAGDGQATLGNTVMKGNVKKVRRLYNDKVIAGFAGGTADAFTLFEL FERKLEMHQGHLVKAAVELAKDWRTDRMLRKLEALLAVADETASLIITGNGDVVQPENDL IAIGSGGPYAQAAARALLENTELSAREIAEKALDIAGDICIYTNHFHTIEELSYKA >gi|296494632|gb|ADTN01000106.1| GENE 8 6771 - 7625 689 284 aa, chain - ## HITS:1 COG:ECs4860 KEGG:ns NR:ns ## COG: ECs4860 COG3087 # Protein_GI_number: 15834114 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Escherichia coli O157:H7 # 1 284 36 319 319 388 99.0 1e-108 MVAIAAAVLVTFIGGLYFITHHKKEESETLQSQKVTGNGLPPKPEERWRYIKELESRQPG VRAPTEPSAGGEVKTPEQLTPEQRQLLEQMQADMRQQPTQLVEVPWNEQTPEQRQQTLQR QRQAQQLAEQQRLAQQSRTTEQSWQQQTRTSQAAPVQAQPRQSKPASSQQPYQDLLQTPA HTTAQSKPQQAAPVARAADAPKPTAEKKDERRWMVQCGSFRGAEQAETVRAQLAFEGFDS KITTNNGWNRVVIGPVKGKENADSTLNRLKMAGHTNCIRLAAGG >gi|296494632|gb|ADTN01000106.1| GENE 9 7822 - 8847 894 341 aa, chain - ## HITS:1 COG:ECs4861 KEGG:ns NR:ns ## COG: ECs4861 COG1609 # Protein_GI_number: 15834115 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 341 1 341 341 684 99.0 0 MKAKKQETAATMKDVALKAKVSTATVSRALMNPDKVSQATRNRVEKAAREVGYLPQPMGR NVKRNESRTILVIVPDICDPFFSEIIRGIEVTAANHGYLVLIGDCAHQNQQEKTFIDLII TKQIDGMLLLGSRLPFDASIEEQRNLPPMVMANEFAPELELPTVHIDNLTAAFDAVNYLY EQGHKRIGCIAGPEEMPLCHYRLQGYVQALRRCDIMVDPQYIARGDFTFEAGSKAMQQLL DLPQPPTAVFCHSDVMALGALSQAKRQGLKVPEDLSIIGFDNIDLTQFCDPPLTTIAQPR YEIGREAMLLLLDQMQGQHVGSGSRLMDCELIIRGSTRALP >gi|296494632|gb|ADTN01000106.1| GENE 10 9003 - 11201 2249 732 aa, chain - ## HITS:1 COG:priA KEGG:ns NR:ns ## COG: priA COG1198 # Protein_GI_number: 16131773 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication factor Y) - superfamily II helicase # Organism: Escherichia coli K12 # 1 732 1 732 
732 1477 100.0 0 MPVAHVALPVPLPRTFDYLLPEGMTVKAGCRVRVPFGKQQERIGIVVSVSDASELPLNEL KAVVEVLDSEPVFTHSVWRLLLWAADYYHHPIGDVLFHALPILLRQGRPAANAPMWYWFA TEQGQAVDLNSLKRSPKQQQALAALRQGKIWRDQVATLEFNDAALQALRKKGLCDLASET PEFSDWRTNYAVSGERLRLNTEQATAVGAIHSAADTFSAWLLAGVTGSGKTEVYLSVLEN VLAQGKQALVMVPEIGLTPQTIARFRERFNAPVEVLHSGLNDSERLSAWLKAKNGEAAIV IGTRSALFTPFKNLGVIVIDEEHDSSYKQQEGWRYHARDLAVYRAHSEQIPIILGSATPA LETLCNVQQKKYRLLRLTRRAGNARPAIQHVLDLKGQKVQAGLAPALITRMRQHLQADNQ VILFLNRRGFAPALLCHDCGWIAECPRCDHYYTLHQAQHHLRCHHCDSQRPVPRQCPSCG STHLVPVGLGTEQLEQTLAPLFPGVPISRIDRDTTSRKGALEQQLAEVHRGGARILIGTQ MLAKGHHFPDVTLVALLDVDGALFSADFRSAERFAQLYTQVAGRAGRAGKQGEVVLQTHH PEHPLLQTLLYKGYDAFAEQALAERRMMQLPPWTSHVIVRAEDHNNQHAPLFLQQLRNLI LSSPLADEKLWVLGPVPALAPKRGGRWRWQILLQHPSRVRLQHIINGTLALINTIPDSRK VKWVLDVDPIEG >gi|296494632|gb|ADTN01000106.1| GENE 11 11404 - 11616 384 70 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15804527|ref|NP_290567.1| 50S ribosomal protein L31 [Escherichia coli O157:H7 EDL933] # 1 70 1 70 70 152 100 3e-36 MKKDIHPKYEEITASCSCGNVMKIRSTVGHDLNLDVCSKCHPFFTGKQRDVATGGRVDRF NKRFNIPGSK >gi|296494632|gb|ADTN01000106.1| GENE 12 11677 - 12285 623 202 aa, chain - ## HITS:1 COG:no KEGG:JW3908 NR:ns ## KEGG: JW3908 # Name: yiiX # Def: predicted peptidoglycan peptidase # Organism: E.coli_J # Pathway: not_defined # 1 202 1 202 202 365 100.0 1e-100 MKNRLLILSLLVSVPAFAWQPQTGDIIFQISRSSQSKAIQLATHTDYSHTGMLVIRNKKP YVFEAVGPVKYTPLKQWIAHGEKGKYVVRRVEGGLSVEQQQKLAQTAKRYLGKPYDFSFS WSDDRQYCSEVVWKVYQNALGMRVGEQQKLKEFDLSSPQVQAKLKERYGKNIPLEETVVS PQAVFDAPQLTTVAKEWPLFSW >gi|296494632|gb|ADTN01000106.1| GENE 13 12469 - 12786 440 105 aa, chain - ## HITS:1 COG:ECs4867 KEGG:ns NR:ns ## COG: ECs4867 COG3060 # Protein_GI_number: 15834121 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulator of met regulon # Organism: Escherichia coli O157:H7 # 1 105 1 105 105 198 100.0 2e-51 MAEWSGEYISPYAEHGKKSEQVKKITVSIPLKVLKILTDERTRRQVNNLRHATNSELLCE AFLHAFTGQPLPDDADLRKERSDEIPEAAKEIMREMGINPETWEY 
>gi|296494632|gb|ADTN01000106.1| GENE 14 13063 - 14223 1249 386 aa, chain + ## HITS:1 COG:metB KEGG:ns NR:ns ## COG: metB COG0626 # Protein_GI_number: 16131777 # Func_class: E Amino acid transport and metabolism # Function: Cystathionine beta-lyases/cystathionine gamma-synthases # Organism: Escherichia coli K12 # 1 386 1 386 386 773 100.0 0 MTRKQATIAVRSGLNDDEQYGCVVPPIHLSSTYNFTGFNEPRAHDYSRRGNPTRDVVQRA LAELEGGAGAVLTNTGMSAIHLVTTVFLKPGDLLVAPHDCYGGSYRLFDSLAKRGCYRVL FVDQGDEQALRAALAEKPKLVLVESPSNPLLRVVDIAKICHLAREVGAVSVVDNTFLSPA LQNPLALGADLVLHSCTKYLNGHSDVVAGVVIAKDPDVVTELAWWANNIGVTGGAFDSYL LLRGLRTLVPRMELAQRNAQAIVKYLQTQPLVKKLYHPSLPENQGHEIAARQQKGFGAML SFELDGDEQTLRRFLGGLSLFTLAESLGGVESLISHAATMTHAGMAPEARAAAGISETLL RISTGIEDGEDLIADLENGFRAANKG >gi|296494632|gb|ADTN01000106.1| GENE 15 14226 - 16658 2865 810 aa, chain + ## HITS:1 COG:metL_1 KEGG:ns NR:ns ## COG: metL_1 COG0527 # Protein_GI_number: 16131778 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Escherichia coli K12 # 1 448 1 448 448 881 100.0 0 MSVIAQAGAKGRQLHKFGGSSLADVKCYLRVAGIMAEYSQPDDMMVVSAAGSTTNQLINW LKLSQTDRLSAHQVQQTLRRYQCDLISGLLPAEEADSLISAFVSDLERLAALLDSGINDA VYAEVVGHGEVWSARLMSAVLNQQGLPAAWLDAREFLRAERAAQPQVDEGLSYPLLQQLL VQHPGKRLVVTGFISRNNAGETVLLGRNGSDYSATQIGALAGVSRVTIWSDVAGVYSADP RKVKDACLLPLLRLDEASELARLAAPVLHARTLQPVSGSEIDLQLRCSYTPDQGSTRIER VLASGTGARIVTSHDDVCLIEFQVPASQDFKLAHKEIDQILKRAQVRPLAVGVHNDRQLL QFCYTSEVADSALKILDEAGLPGELRLRQGLALVAMVGAGVTRNPLHCHRFWQQLKGQPV EFTWQSDDGISLVAVLRTGPTESLIQGLHQSVFRAEKRIGLVLFGKGNIGSRWLELFARE QSTLSARTGFEFVLAGVVDSRRSLLSYDGLDASRALAFFNDEAVEQDEESLFLWMRAHPY DDLVVLDVTASQQLADQYLDFASHGFHVISANKLAGASDSNKYRQIHDAFEKTGRHWLYN ATVGAGLPINHTVRDLIDSGDTILSISGIFSGTLSWLFLQFDGSVPFTELVDQAWQQGLT EPDPRDDLSGKDVMRKLVILAREAGYNIEPDQVRVESLVPAHCEGGSIDHFFENGDELNE QMVQRLEAAREMGLVLRYVARFDANGKARVGVEAVREDHPLASLLPCDNVFAIESRWYRD NPLVIRGPGAGRDVTAGAIQSDINRLAQLL >gi|296494632|gb|ADTN01000106.1| GENE 16 17007 - 17897 975 296 aa, chain + ## HITS:1 COG:metF KEGG:ns NR:ns ## COG: metF COG0685 # 
Protein_GI_number: 16131779 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Escherichia coli K12 # 1 296 1 296 296 611 100.0 1e-175 MSFFHASQRDALNQSLAEVQGQINVSFEFFPPRTSEMEQTLWNSIDRLSSLKPKFVSVTY GANSGERDRTHSIIKGIKDRTGLEAAPHLTCIDATPDELRTIARDYWNNGIRHIVALRGD LPPGSGKPEMYASDLVTLLKEVADFDISVAAYPEVHPEAKSAQADLLNLKRKVDAGANRA ITQFFFDVESYLRFRDRCVSAGIDVEIIPGILPVSNFKQAKKFADMTNVRIPAWMAQMFD GLDDDAETRKLVGANIAMDMVKILSREGVKDFHFYTLNRAEMSYAICHTLGVRPGL >gi|296494632|gb|ADTN01000106.1| GENE 17 17955 - 18074 65 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLPRNEGGKIRLSALFSPSLLEGYEAKTLFYKAFVRIRT >gi|296494632|gb|ADTN01000106.1| GENE 18 18226 - 20406 2694 726 aa, chain + ## HITS:1 COG:katG KEGG:ns NR:ns ## COG: katG COG0376 # Protein_GI_number: 16131780 # Func_class: P Inorganic ion transport and metabolism # Function: Catalase (peroxidase I) # Organism: Escherichia coli K12 # 1 726 1 726 726 1427 100.0 0 MSTSDDIHNTTATGKCPFHQGGHDQSAGAGTTTRDWWPNQLRVDLLNQHSNRSNPLGEDF DYRKEFSKLDYYGLKKDLKALLTESQPWWPADWGSYAGLFIRMAWHGAGTYRSIDGRGGA GRGQQRFAPLNSWPDNVSLDKARRLLWPIKQKYGQKISWADLFILAGNVALENSGFRTFG FGAGREDVWEPDLDVNWGDEKAWLTHRHPEALAKAPLGATEMGLIYVNPEGPDHSGEPLS AAAAIRATFGNMGMNDEETVALIAGGHTLGKTHGAGPTSNVGPDPEAAPIEEQGLGWAST YGSGVGADAITSGLEVVWTQTPTQWSNYFFENLFKYEWVQTRSPAGAIQFEAVDAPEIIP DPFDPSKKRKPTMLVTDLTLRFDPEFEKISRRFLNDPQAFNEAFARAWFKLTHRDMGPKS RYIGPEVPKEDLIWQDPLPQPIYNPTEQDIIDLKFAIADSGLSVSELVSVAWASASTFRG GDKRGGANGARLALMPQRDWDVNAAAVRALPVLEKIQKESGKASLADIIVLAGVVGVEKA ASAAGLSIHVPFAPGRVDARQDQTDIEMFELLEPIADGFRNYRARLDVSTTESLLIDKAQ QLTLTAPEMTALVGGMRVLGANFDGSKNGVFTDRVGVLSNDFFVNLLDMRYEWKATDESK ELFEGRDRETGEVKFTASRADLVFGSNSVLRAVAEVYASSDAHEKFVKDFVAAWVKVMNL DRFDLL >gi|296494632|gb|ADTN01000106.1| GENE 19 20499 - 21404 957 301 aa, chain + ## HITS:1 COG:yijE KEGG:ns NR:ns ## COG: yijE COG0697 # Protein_GI_number: 16131781 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: 
Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli K12 # 1 301 12 312 312 489 100.0 1e-138 MSAAGKSNPLAISGLVVLTLIWSYSWIFMKQVTSYIGAFDFTALRCIFGALVLFIVLLLR GRGMRPTPFKYTLAIALLQTCGMVGLAQWALVSGGAGKVAILSYTMPFWVVIFAALFLGE RLRRGQYFAILIAAFGLFLVLQPWQLDFSSMKSAMLAILSGVSWGASAIVAKRLYARHPR VDLLSLTSWQMLYAALVMSVVALLVPQREIDWQPTVFWALAYSAILATALAWSLWLFVLK NLPASIASLSTLAVPVCGVLFSWWLLGENPGAVEGSGIVLIVLALALVSRKKKEAVSVKR I >gi|296494632|gb|ADTN01000106.1| GENE 20 21431 - 22048 491 205 aa, chain - ## HITS:1 COG:ECs4873 KEGG:ns NR:ns ## COG: ECs4873 COG3738 # Protein_GI_number: 15834127 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 205 1 205 205 413 96.0 1e-115 MKASLALLSLLTAFTSHSLKSPAVPPTVVQIQANTNLAIADGARQQIGSTLFYDPAYVQL TYPGGDVPQERGVCSDVVIRALRSQKVDLQKLVHEDMAKNFAEYPQKWKLKRPDSNIDHR RVPNLETWFSRHDKTRPTSKNPSDYQAGDIVSWRLDNGLAHIGVVSDGFARDGTPLVIHN IGAGAQEEDVLFNWRMVGHYRYFVK >gi|296494632|gb|ADTN01000106.1| GENE 21 22323 - 23426 1370 367 aa, chain - ## HITS:1 COG:gldA KEGG:ns NR:ns ## COG: gldA COG0371 # Protein_GI_number: 16131783 # Func_class: C Energy production and conversion # Function: Glycerol dehydrogenase and related enzymes # Organism: Escherichia coli K12 # 1 367 14 380 380 702 100.0 0 MDRIIQSPGKYIQGADVINRLGEYLKPLAERWLVVGDKFVLGFAQSTVEKSFKDAGLVVE IAPFGGECSQNEIDRLRGIAETAQCGAILGIGGGKTLDTAKALAHFMGVPVAIAPTIAST DAPCSALSVIYTDEGEFDRYLLLPNNPNMVIVDTKIVAGAPARLLAAGIGDALATWFEAR ACSRSGATTMAGGKCTQAALALAELCYNTLLEEGEKAMLAAEQHVVTPALERVIEANTYL SGVGFESGGLAAAHAVHNGLTAIPDAHHYYHGEKVAFGTLTQLVLENAPVEEIETVAALS HAVGLPITLAQLDIKEDVPAKMRIVAEAACAEGETIHNMPGGATPDQVYAALLVADQYGQ RFLQEWE >gi|296494632|gb|ADTN01000106.1| GENE 22 23437 - 24099 861 220 aa, chain - ## HITS:1 COG:talC KEGG:ns NR:ns ## COG: talC COG0176 # Protein_GI_number: 16131784 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Escherichia coli K12 # 1 220 1 220 220 389 100.0 1e-108 
MELYLDTANVAEVERLARIFPIAGVTTNPSIIAASKESIWEVLPRLQKAIGDEGILFAQT MSRDAQGMVEEAKRLRDAIPGIVVKIPVTSEGLAAIKILKKEGITTLGTAVYSAAQGLLA ALAGAKYVAPYVNRVDAQGGDGIRTVQELQTLLEMHAPESMVLAASFKTPRQALDCLLAG CESITLPLDVAQQMLNTPAVESAIEKFEHDWNAAFGTTHL >gi|296494632|gb|ADTN01000106.1| GENE 23 24111 - 26612 2483 833 aa, chain - ## HITS:1 COG:ptsA_1 KEGG:ns NR:ns ## COG: ptsA_1 COG1080 # Protein_GI_number: 16131785 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Escherichia coli K12 # 123 687 1 565 565 1100 99.0 0 MALIVEFICELPNGVHARPASHVETLCNTFSSQIEWHNLRTDRKGNAKSALALIGTDTLA GDNCQLLISGADEQEAHQRLSQWLRDEFPHCDAPLAEVKSDELEPLPVSLTNLNPQIIRA RTVCSGSAGGILTPISSLDLNALGNLPAAKGVDAEQSALENGLTLVLKNIEFRLLDSDGA TSAILEAHRSLAGDTSLREHLLAGVSAGLSCAEAIVASANHFCEEFSRSSSSYLQERALD VRDVCFQLLQQIYGEQRFPAPGKLTQPAICMADELTPSQFLELDKNHLKGLLLKSGGTTS HTVILARSFNIPTLVGVDIDALTPWQQQTIYIDGNAGAIVVEPGEAVARYYQQEARVQDA LREQQRVWLTQQARTADGIRIEIAANIAHSVEAQAAFGNGAEGVGLFRTEMLYMDRTSAP GESELYNIFCQALESANGRSIIVRTMDIGGDKPVDYLNIPAEANPFLGYRAVRIYEEYAS LFTTQLRSILRASAHGSLKIMIPMISSMEEILWVKEKLAEAKQQLRNEHIPFDEKIQLGI MLEVPSVMFIIDQCCEEIDFFSIGSNDLTQYLLAVDRDNAKVTRHYNSLNPAFLRALDYA VQAVHRQGKWIGLCGELGAKGSVLPLLVGLGLDELSMSAPSIPAAKARMAQLDSRECRKL LNQAMACRTSLEVEHLLAQFRMTQQDAPLVTAECITLESDWRSKEEVLKGMTDNLLLAGR CRYPRKLEADLWAREAVFSTGLGFSFAIPHSKSEHIEQSTISVARLQAPVRWGDDEAQFI IMLTLNKHAAGDQHMRIFSRLARRIMHEEFRNTLVNAASADAIASLLQHELEL >gi|296494632|gb|ADTN01000106.1| GENE 24 26921 - 28000 1329 359 aa, chain + ## HITS:1 COG:frwC KEGG:ns NR:ns ## COG: frwC COG1299 # Protein_GI_number: 16131787 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Escherichia coli K12 # 1 359 1 359 359 566 100.0 1e-161 MNELVQILKNTRQHLMTGVSHMIPFVVSGGILLAVSVMLYGKGAVPDAVADPNLKKLFDI GVAGLTLMVPFLAAYIGYSIAERSALAPCAIGAWVGNSFGAGFFGALIAGIIGGIVVHYL KKIPVHKVLRSVMPIFIIPIVGTLITAGIMMWGLGEPVGALTNSLTQWLQGMQQGSIVML 
AVIMGLMLAFDMGGPVNKVAYAFMLICVAQGVYTVVAIAAVGICIPPLGMGLATLIGRKN FSAEERETGKAALVMGCVGVTEGAIPFAAADPLRVIPSIMVGSVCGAVTAALVGAQCYAG WGGLIVLPVVEGKLGYIAAVAVGAVVTAVCVNVLKSLARKNGSSTDEKEDDLDLDFEIN >gi|296494632|gb|ADTN01000106.1| GENE 25 28015 - 28335 529 106 aa, chain + ## HITS:1 COG:STM4113 KEGG:ns NR:ns ## COG: STM4113 COG1445 # Protein_GI_number: 16767378 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system fructose-specific component IIB # Organism: Salmonella typhimurium LT2 # 1 106 1 106 106 162 96.0 2e-40 MTKIIAVTACPSGVAHTYMAAEALESAAKAKGWEVKVETQGSIGLENELTAEDVASADMV ILTKDIGIKFEERFAGKTIVRVNISDAVKRADAIMSKIEAHLAQTA >gi|296494632|gb|ADTN01000106.1| GENE 26 28386 - 30683 2275 765 aa, chain + ## HITS:1 COG:pflD KEGG:ns NR:ns ## COG: pflD COG1882 # Protein_GI_number: 16131789 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli K12 # 1 765 1 765 765 1553 99.0 0 MTNRISRLKTALFANTREISLERALLYTASHRQTEGEPVILRRAKATAYILEHVEISIRD EELIAGNRTVKPRAGIMSPEMDPYWLLKELDQFPTRPQDRFAISEEDKRIYREELFPYWE KRSMKDFINGQMTDEVKAATNTQIFSINQTDKGQGHIIIDYPRLLNHGLGELVAQMQQHC QQQPENHFYQAALLLLEASQKHILRYAELAETMAANCTDAQRREELLTIAEISRHNAQHK PQTFWQACQLFWYMNIILQYESNASSLSLGRFDQYMLPFYQTSLTQGEDAAFLKELLESL WVKCNDIVLLRSTSSARYFAGFPTGYTALLGGLTENGRSAVNVLSFLCLDAYQSVQLPQP NLGVRTNALIDTPFLMKTAETIRFGTGIPQIFNDEVVVPAFLNRGVSLEDARDYSIVGCV ELSIPGRTYGLHDIAMFNLLKVMEICLHENEGNAALTYEGLLEQIRAKISHYITLMVEGS NICDIGHRDWAPVPLLSSFISDCLEKGRDITDGGARYNFSGVQGIGIANLSDSLHALKGM VFEQQRLSFDELLSVLKANFATPEGEKVRARLINRFEKYGNDIDEVDNISAELLRHYCKE VEKYQNPRGGYFTPGSYTVSAHVPLGSVVGATPDGRFAGEQLADGGLSPMLGQDAQGPTA VLKSVSKLDNTLLSNGTLLNVKFTPATLEGEAGLRKLADFLRAFTQLKLQHIQFNVVNAD TLREAQQRPQDYAGLVVRVAGYSAFFVELSKEIQDDIIRRTAHQL >gi|296494632|gb|ADTN01000106.1| GENE 27 30649 - 31527 568 292 aa, chain + ## HITS:1 COG:pflC KEGG:ns NR:ns ## COG: pflC COG1180 # Protein_GI_number: 16131790 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: 
Pyruvate-formate lyase-activating enzyme # Organism: Escherichia coli K12 # 1 292 1 292 292 600 100.0 1e-172 MTSSAGQRISCNVVETRRDDVARIFNIQRYSLNDGEGIRTVVFFKGCPHLCPWCANPESI SGKIQTVRREAKCLHCAKCLRDADECPSGAFERIGRDISLDALEREVMKDDIFFRTSGGG VTLSGGEVLMQAEFATRFLQRLRLWGVSCAIETAGDAPASKLLPLAKLCDEVLFDLKIMD ATQARDVVKMNLPRVLENLRLLVSEGVNVIPRLPLIPGFTLSRENMQQALDVLIPLNIRQ IHLLPFHQYGEPKYRLLGKTWSMKEVPAPSSADVATMREMAERAGLQVTVGG >gi|296494632|gb|ADTN01000106.1| GENE 28 31529 - 31870 326 113 aa, chain + ## HITS:1 COG:frwD KEGG:ns NR:ns ## COG: frwD COG1445 # Protein_GI_number: 16131791 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system fructose-specific component IIB # Organism: Escherichia coli K12 # 1 113 1 113 113 207 100.0 3e-54 MAYLVAVTACVSGVAHTYMAAERLEKLCLLEKWGVSIETQGALGTENRLADEDIRRADVA LLITDIELAGAERFEHCRYVQCSIYAFLREPQRVMSAVRKVLSAPQQTHLILE >gi|296494632|gb|ADTN01000106.1| GENE 29 31857 - 32708 847 283 aa, chain - ## HITS:1 COG:yijO KEGG:ns NR:ns ## COG: yijO COG2207 # Protein_GI_number: 16131792 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 283 1 283 283 551 100.0 1e-157 MYHDVSYLLSRLINGPLSLRQIYFASSNGPVPDLAYQVDFPRLEIVLEGEFVDTGAGATL VPGDVLYVPAGGWNFPQWQAPATTFSVLFGKQQLGFSVVQWDGKQYQNLAKQHVARRGPR IGSFLLQTLNEMQMQPQEQQTARLIVASLLSHCRDLLGSQIQTASRSQALFEAIRDYIDE RYASALTRESVAQAFYISPNYLSHLFQKTGAIGFNEYLNHTRLEHAKTLLKGYDLKVKEV AHACGFVDSNYFCRLFRKNTERSPSEYRRQYHSQLTEKPTTPE >gi|296494632|gb|ADTN01000106.1| GENE 30 32923 - 34656 2001 577 aa, chain - ## HITS:1 COG:yijP KEGG:ns NR:ns ## COG: yijP COG2194 # Protein_GI_number: 16131793 # Func_class: R General function prediction only # Function: Predicted membrane-associated, metal-dependent hydrolase # Organism: Escherichia coli K12 # 1 577 1 577 577 1149 100.0 0 MHSTEVQAKPLFSWKALGWALLYFWFFSTLLQAIIYISGYSGTNGIRDSLLFSSLWLIPV FLFPKRIKIIAAVIGVVLWAASLAALCYYVIYGQEFSQSVLFVMFETNTNEASEYLSQYF 
SLKIVLIALAYTAVAVLLWTRLRPVYIPKPWRYVVSFALLYGLILHPIAMNTFIKNKPFE KTLDNLASRMEPAAPWQFLTGYYQYRQQLNSLTKLLNENNALPPLANFKDESGNEPRTLV LVIGESTQRGRMSLYGYPRETTPELDALHKTDPNLTVFNNVVTSRPYTIEILQQALTFAN EKNPDLYLTQPSLMNMMKQAGYKTFWITNQQTMTARNTMLTVFSRQTDKQYYMNQQRTQS AREYDTNVLKPFQEVLNDPAPKKLIIVHLLGTHIKYKYRYPENQGKFDGNTDHVPPGLNA EELESYNDYDNANLYNDHVVASLIKDFKAANPNGFLVYFSDHGEEVYDTPPHKTQGRNED NPTRHMYTIPFLLWTSEKWQATHPRDFSQDVDRKYSLAELIHTWSDLAGLSYDGYDPTRS VVNPQFKETTRWIGNPYKKNALIDYDTLPYGDQVGNQ >gi|296494632|gb|ADTN01000106.1| GENE 31 34838 - 37450 3108 870 aa, chain - ## HITS:1 COG:ppc KEGG:ns NR:ns ## COG: ppc COG2352 # Protein_GI_number: 16131794 # Func_class: C Energy production and conversion # Function: Phosphoenolpyruvate carboxylase # Organism: Escherichia coli K12 # 1 870 14 883 883 1701 100.0 0 MLGKVLGETIKDALGEHILERVETIRKLSKSSRAGNDANRQELLTTLQNLSNDELLPVAR AFSQFLNLANTAEQYHSISPKGEAASNPEVIARTLRKLKNQPELSEDTIKKAVESLSLEL VLTAHPTEITRRTLIHKMVEVNACLKQLDNKDIADYEHNQLMRRLRQLIAQSWHTDEIRK LRPSPVDEAKWGFAVVENSLWQGVPNYLRELNEQLEENLGYKLPVEFVPVRFTSWMGGDR DGNPNVTADITRHVLLLSRWKATDLFLKDIQVLVSELSMVEATPELLALVGEEGAAEPYR YLMKNLRSRLMATQAWLEARLKGEELPKPEGLLTQNEELWEPLYACYQSLQACGMGIIAN GDLLDTLRRVKCFGVPLVRIDIRQESTRHTEALGELTRYLGIGDYESWSEADKQAFLIRE LNSKRPLLPRNWQPSAETREVLDTCQVIAEAPQGSIAAYVISMAKTPSDVLAVHLLLKEA GIGFAMPVAPLFETLDDLNNANDVMTQLLNIDWYRGLIQGKQMVMIGYSDSAKDAGVMAA SWAQYQAQDALIKTCEKAGIELTLFHGRGGSIGRGGAPAHAALLSQPPGSLKGGLRVTEQ GEMIRFKYGLPEITVSSLSLYTGAILEANLLPPPEPKESWRRIMDELSVISCDVYRGYVR ENKDFVPYFRSATPEQELGKLPLGSRPAKRRPTGGVESLRAIPWIFAWTQNRLMLPAWLG AGTALQKVVEDGKQSELEAMCRDWPFFSTRLGMLEMVFAKADLWLAEYYDQRLVDKALWP LGKELRNLQEEDIKVVLAIANDSHLMADLPWIAESIQLRNIYTDPLNVLQAELLHRSRQA EKEGQEPDPRVEQALMVTIAGIAAGMRNTG >gi|296494632|gb|ADTN01000106.1| GENE 32 37964 - 39115 1054 383 aa, chain - ## HITS:1 COG:argE KEGG:ns NR:ns ## COG: argE COG0624 # Protein_GI_number: 16131795 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: 
Escherichia coli K12 # 1 383 1 383 383 800 100.0 0 MKNKLPPFIEIYRALIATPSISATEEALDQSNADLITLLADWFKDLGFNVEVQPVPGTRN KFNMLASIGQGAGGLLLAGHTDTVPFDDGRWTRDPFTLTEHDGKLYGLGTADMKGFFAFI LDALRDVDVTKLKKPLYILATADEETSMAGARYFAETTALRPDCAIIGEPTSLQPVRAHK GHISNAIRIQGQSGHSSDPARGVNAIELMHDAIGHILQLRDNLKERYHYEAFTVPYPTLN LGHIHGGDASNRICACCELHMDIRPLPGMTLNELNGLLNDALAPVSERWPGRLTVDELHP PIPGYECPPNHQLVEVVEKLLGAKTEVVNYCTEAPFIQTLCPTLVLGPGSINQAHQPDEY LETRFIKPTRELITQVIHHFCWH >gi|296494632|gb|ADTN01000106.1| GENE 33 39269 - 40273 930 334 aa, chain + ## HITS:1 COG:argC KEGG:ns NR:ns ## COG: argC COG0002 # Protein_GI_number: 16131796 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate semialdehyde dehydrogenase # Organism: Escherichia coli K12 # 1 334 1 334 334 656 100.0 0 MLNTLIVGASGYAGAELVTYVNRHPHMNITALTVSAQSNDAGKLISDLHPQLKGIVDLPL QPMSDISEFSPGVDVVFLATAHEVSHDLAPQFLEAGCVVFDLSGAFRVNDATFYEKYYGF THQYPELLEQAAYGLAEWCGNKLKEANLIAVPGCYPTAAQLALKPLIDADLLDLNQWPVI NATSGVSGAGRKAAISNSFCEVSLQPYGVFTHRHQPEIATHLGADVIFTPHLGNFPRGIL ETITCRLKSGVTQAQVAQVLQQAYAHKPLVRLYDKGVPALKNVVGLPFCDIGFAVQGEHL IIVATEDNLLKGAAAQAVQCANIRFGYAETQSLI >gi|296494632|gb|ADTN01000106.1| GENE 34 40284 - 41057 1113 257 aa, chain + ## HITS:1 COG:ECs4888 KEGG:ns NR:ns ## COG: ECs4888 COG0548 # Protein_GI_number: 15834142 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate kinase # Organism: Escherichia coli O157:H7 # 1 257 2 258 258 453 100.0 1e-127 MNPLIIKLGGVLLDSEEALERLFSALVNYRESHQRPLVIVHGGGCVVDELMKGLNLPVKK KNGLRVTPADQIDIITGALAGTANKTLLAWAKKHQIAAVGLFLGDGDSVKVTQLDEELGH VGLAQPGSPKLINSLLENGYLPVVSSIGVTDEGQLMNVNADQAATALAATLGADLILLSD VSGILDGKGQRIAEMTAAKAEQLIEQGIITDGMIVKVNAALDAARTLGRPVDIASWRHAE QLPALFNGMPMGTRILA >gi|296494632|gb|ADTN01000106.1| GENE 35 41118 - 42491 1845 457 aa, chain + ## HITS:1 COG:argH KEGG:ns NR:ns ## COG: argH COG0165 # Protein_GI_number: 16131798 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate lyase # Organism: Escherichia coli K12 # 1 457 1 457 457 887 
100.0 0 MALWGGRFTQAADQRFKQFNDSLRFDYRLAEQDIVGSVAWSKALVTVGVLTAEEQAQLEE ALNVLLEDVRARPQQILESDAEDIHSWVEGKLIDKVGQLGKKLHTGRSRNDQVATDLKLW CKDTVSELLTANRQLQSALVETAQNNQDAVMPGYTHLQRAQPVTFAHWCLAYVEMLARDE SRLQDALKRLDVSPLGCGALAGTAYEIDREQLAGWLGFASATRNSLDSVSDRDHVLELLS AAAIGMVHLSRFAEDLIFFNTGEAGFVELSDRVTSGSSLMPQKKNPDALELIRGKCGRVQ GALTGMMMTLKGLPLAYNKDMQEDKEGLFDALDTWLDCLHMAALVLDGIQVKRPRCQEAA QQGYANATELADYLVAKGVPFREAHHIVGEAVVEAIRQGKPLEDLPLSELQKFSQVIDED VYPILSLQSCLDKRAAKGGVSPQQVAQAIAFAQARLG >gi|296494632|gb|ADTN01000106.1| GENE 36 42758 - 43675 883 305 aa, chain + ## HITS:1 COG:ECs4890 KEGG:ns NR:ns ## COG: ECs4890 COG0583 # Protein_GI_number: 15834144 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 305 1 305 305 596 100.0 1e-170 MNIRDLEYLVALAEHRHFRRAADSCHVSQPTLSGQIRKLEDELGVMLLERTSRKVLFTQA GMLLVDQARTVLREVKVLKEMASQQGETMSGPLHIGLIPTVGPYLLPHIIPMLHQTFPKL EMYLHEAQTHQLLAQLDSGKLDCVILALVKESEAFIEVPLFDEPMLLAIYEDHPWANREC VPMADLAGEKLLMLEDGHCLRDQAMGFCFEAGADEDTHFRATSLETLRNMVAAGSGITLL PALAVPPERKRDGVVYLPCIKPEPRRTIGLVYRPGSPLRSRYEQLAEAIRARMDGHFDKV LKQAV >gi|296494632|gb|ADTN01000106.1| GENE 37 43658 - 45058 428 466 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 7 453 4 443 458 169 26 3e-41 MPHSYDYDAIVIGSGPGGEGAAMGLVKQGARVAVIERYQNVGGGCTHWGTIPSKALRHAV SRIIEFNQNPLYSDHSRLLRSSFADILNHADNVINQQTRMRQGFYERNHCEILQGNARFV DEHTLALDCPDGSVETLTAEKFVIACGSRPYHPTDVDFTHPRIYDSDSILSMHHEPRHVL IYGAGVIGCEYASIFRGMDVKVDLINTRDRLLAFLDQEMSDSLSYHFWNSGVVIRHNEEY EKIEGCDDGVIMHLKSGKKLKADCLLYANGRTGNTDSLALQNIGLETDSRGQLKVNSMYQ TAQPHVYAVGDVIGYPSLASAAYDQGRIAAQALVKGEATAHLIEDIPTGIYTIPEISSVG KTEQQLTAMKVPYEVGRAQFKHLARAQIVGMNVGTLKILFHRETKEILGIHCFGERAAEI IHIGQAIMEQKGGGNTIEYFVNTTFNSPTMAEAYRVAALNGLNRLF >gi|296494632|gb|ADTN01000106.1| GENE 38 45335 - 46039 716 234 aa, chain + ## HITS:1 COG:ECs4894 KEGG:ns NR:ns ## COG: ECs4894 COG1309 # Protein_GI_number: 15834148 # Func_class: K Transcription # Function: 
Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 234 1 234 234 409 100.0 1e-114 MFILWYSASSTFGKDSDIVMGVRAQQKEKTRRSLVEAAFSQLSAERSFASLSLREVAREA GIAPTSFYRHFRDVDELGLTMVDESGLMLRQLMRQARQRIAKGGSVIRTSVSTFMEFIGN NPNAFRLLLRERSGTSAAFRAAVAREIQHFIAELADYLELENHMPRAFTEAQAEAMVTIV FSAGAEALDVGVEQRRQLEERLVLQLRMISKGAYYWYRREQEKTAIIPGNVKDE >gi|296494632|gb|ADTN01000106.1| GENE 39 46039 - 46398 450 119 aa, chain + ## HITS:1 COG:no KEGG:LF82_3412 NR:ns ## KEGG: LF82_3412 # Name: yijD # Def: inner membrane protein YijD # Organism: E.coli_LF82 # Pathway: not_defined # 1 119 1 119 119 222 100.0 4e-57 MKQANQDRGTLLLALVAGLSINGTFAALFSSIVPFSVFPIISLVLTVYCLHQRYLNRTMP VGLPGLAAACFILGVLLYSTVVRAEYPDIGSNFFPAVLSVIMVFWIGAKMRNRKQEVAE Prediction of potential genes in microbial genomes Time: Sun May 15 23:30:12 2011 Seq name: gi|296494631|gb|ADTN01000107.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont246.5, whole genome shotgun sequence Length of sequence - 4193 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 35 - 1135 1118 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase - Prom 1160 - 1219 6.0 + Prom 1209 - 1268 3.3 2 2 Op 1 2/1.000 + CDS 1504 - 3348 1552 ## COG4206 Outer membrane cobalamin receptor protein 3 2 Op 2 . 
+ CDS 3293 - 4150 699 ## COG0796 Glutamate racemase Predicted protein(s) >gi|296494631|gb|ADTN01000107.1| GENE 1 35 - 1135 1118 366 aa, chain - ## HITS:1 COG:trmA KEGG:ns NR:ns ## COG: trmA COG2265 # Protein_GI_number: 16131803 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Escherichia coli K12 # 1 366 1 366 366 737 99.0 0 MTPEHLPTEQYEAQLAEKVVRLQSMMAPFSDLVPEVFRSPVSHYRMRAEFRIWHDGDDLY HIIFDQQTKSRIRVDSFPAASELINQLMTAMIAGVRNNPVLRHKLFQIDYLTTLSNQAVV SLLYHKKLDDEWRQEAEALRDALRAQNLNVHLIGRATKTKIELDQDYIDERLPVAGKEMI YRQVENSFTQPNAAMNIQMLEWALDVTKGSKGDLLELYCGNGNFSLALARNFDRVLATEI AKPSVAAAQYNIAANHIDNVQIIRMSAEEFTQAMNGVREFNRLQGIDLKSYQCETIFVDP PRSGLDSETEKMVQAYPRILYISCNPETLCKNLETLSQTHKVERLALFDQFPYTHHMECG VLLTAK >gi|296494631|gb|ADTN01000107.1| GENE 2 1504 - 3348 1552 614 aa, chain + ## HITS:1 COG:btuB KEGG:ns NR:ns ## COG: btuB COG4206 # Protein_GI_number: 16131804 # Func_class: H Coenzyme transport and metabolism # Function: Outer membrane cobalamin receptor protein # Organism: Escherichia coli K12 # 1 614 1 614 614 1158 100.0 0 MIKKASLLTACSVTAFSAWAQDTSPDTLVVTANRFEQPRSTVLAPTTVVTRQDIDRWQST SVNDVLRRLPGVDITQNGGSGQLSSIFIRGTNASHVLVLIDGVRLNLAGVSGSADLSQFP IALVQRVEYIRGPRSAVYGSDAIGGVVNIITTRDEPGTEISAGWGSNSYQNYDVSTQQQL GDKTRVTLLGDYAHTHGYDVVAYGNTGTQAQTDNDGFLSKTLYGALEHNFTDAWSGFVRG YGYDNRTNYDAYYSPGSPLLDTRKLYSQSWDAGLRYNGELIKSQLITSYSHSKDYNYDPH YGRYDSSATLDEMKQYTVQWANNVIVGHGSIGAGVDWQKQTTTPGTGYVEDGYDQRNTGI YLTGLQQVGDFTFEGAARSDDNSQFGRHGTWQTSAGWEFIEGYRFIASYGTSYKAPNLGQ LYGFYGNPNLDPEKSKQWEGAFEGLTAGVNWRISGYRNDVSDLIDYDDHTLKYYNEGKAR IKGVEATANFDTGPLTHTVSYDYVDARNAITDTPLLRRAKQQVKYQLDWQLYDFDWGITY QYLGTRYDKDYSSYPYQTVKMGGVSLWDLAVAYPVTSHLTVRGKIANLFDKDYETVYGYQ TAGREYTLSGSYTF >gi|296494631|gb|ADTN01000107.1| GENE 3 3293 - 4150 699 285 aa, chain + ## HITS:1 COG:murI KEGG:ns NR:ns ## COG: murI COG0796 # Protein_GI_number: 16131805 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
Glutamate racemase # Organism: Escherichia coli K12 # 1 285 5 289 289 532 100.0 1e-151 MATKLQDGNTPCLAATPSEPRPTVLVFDSGVGGLSVYDEIRHLLPDLHYIYAFDNVAFPY GEKSEAFIVERVVAIVTAVQERYPLALAVVACNTASTVSLPALREKFDFPVVGVVPAIKP AARLTANGIVGLLATRGTVKRSYTHELIARFANECQIEMLGSAEMVELAEAKLHGEDVSL DALKRILRPWLRMKEPPDTVVLGCTHFPLLQEELLQVLPEGTRLVDSGAAIARRTAWLLE HEAPDAKSADANIAFCMAMTPGAEQLLPVLQRYGFETLEKLAVLG Prediction of potential genes in microbial genomes Time: Sun May 15 23:30:18 2011 Seq name: gi|296494630|gb|ADTN01000108.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont248.1, whole genome shotgun sequence Length of sequence - 15061 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 7, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 61 - 345 298 ## COG3093 Plasmid maintenance system antidote protein 2 2 Tu 1 . - CDS 491 - 1501 1010 ## COG1064 Zn-dependent alcohol dehydrogenases - Prom 1562 - 1621 4.0 - Term 1593 - 1628 6.5 3 3 Tu 1 . - CDS 1635 - 3332 1929 ## COG0281 Malic enzyme - Prom 3415 - 3474 4.3 - Term 3415 - 3466 8.4 4 4 Op 1 . - CDS 3489 - 3626 230 ## PROTEIN SUPPORTED gi|15801657|ref|NP_287675.1| 30S ribosomal subunit S22 5 4 Op 2 . - CDS 3728 - 3943 313 ## B21_01451 hypothetical protein - Prom 4060 - 4119 4.8 + Prom 4017 - 4076 7.6 6 5 Tu 1 . 
+ CDS 4288 - 4719 541 ## COG1764 Predicted redox protein, regulator of disulfide bond formation + Term 4735 - 4774 6.5 7 6 Op 1 17/0.000 - CDS 4775 - 5701 618 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 8 6 Op 2 44/0.000 - CDS 5694 - 6680 474 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 9 6 Op 3 49/0.000 - CDS 6677 - 7570 777 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 10 6 Op 4 38/0.000 - CDS 7570 - 8592 316 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 11 6 Op 5 3/1.000 - CDS 8594 - 10144 1621 ## COG0747 ABC-type dipeptide transport system, periplasmic component 12 6 Op 6 1/1.000 - CDS 10158 - 10739 418 ## COG2173 D-alanyl-D-alanine dipeptidase - Prom 10912 - 10971 1.6 - Term 10958 - 10983 -0.5 13 7 Op 1 6/0.000 - CDS 10997 - 12499 1251 ## COG2200 FOG: EAL domain 14 7 Op 2 5/0.000 - CDS 12253 - 13395 473 ## COG2202 FOG: PAS/PAC domain 15 7 Op 3 . - CDS 13420 - 14802 1293 ## COG2199 FOG: GGDEF domain - Prom 14956 - 15015 6.3 Predicted protein(s) >gi|296494630|gb|ADTN01000108.1| GENE 1 61 - 345 298 94 aa, chain - ## HITS:1 COG:yddM KEGG:ns NR:ns ## COG: yddM COG3093 # Protein_GI_number: 16129436 # Func_class: R General function prediction only # Function: Plasmid maintenance system antidote protein # Organism: Escherichia coli K12 # 1 94 27 120 120 169 100.0 1e-42 MKMANHPRPGDIIQESLDELNVSLREFARAMEIAPSTASRLLTGKAALTPEMAIKLSVVI GSSPQMWLNLQNAWSLAEAEKTVDVSRLRRLVTQ >gi|296494630|gb|ADTN01000108.1| GENE 2 491 - 1501 1010 336 aa, chain - ## HITS:1 COG:adhP KEGG:ns NR:ns ## COG: adhP COG1064 # Protein_GI_number: 16129437 # Func_class: R General function prediction only # Function: Zn-dependent alcohol dehydrogenases # Organism: Escherichia coli K12 # 1 336 11 346 346 609 100.0 1e-174 MKAAVVTKDHHVDVTYKTLRSLKHGEALLKMECCGVCHTDLHVKNGDFGDKTGVILGHEG IGVVAEVGPGVTSLKPGDRASVAWFYEGCGHCEYCNSGNETLCRSVKNAGYSVDGGMAEE 
CIVVADYAVKVPDGLDSAAASSITCAGVTTYKAVKLSKIRPGQWIAIYGLGGLGNLALQY AKNVFNAKVIAIDVNDEQLKLATEMGADLAINSHTEDAAKIVQEKTGGAHAAVVTAVAKA AFNSAVDAVRAGGRVVAVGLPPESMSLDIPRLVLDGIEVVGSLVGTRQDLTEAFQFAAEG KVVPKVALRPLADINTIFTEMEEGKIRGRMVIDFRH >gi|296494630|gb|ADTN01000108.1| GENE 3 1635 - 3332 1929 565 aa, chain - ## HITS:1 COG:sfcA KEGG:ns NR:ns ## COG: sfcA COG0281 # Protein_GI_number: 16129438 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Escherichia coli K12 # 1 565 10 574 574 1139 99.0 0 MEPKTKKQRSLYIPYAGPVLLEFPLLNKGSAFSMEERRNFNLLGLLPEVVETIEEQAERA WIQYQGFKTEIDKHIYLRNIQDTNETLFYRLVNNHLDEMMPVIYTPPVGAACERFSEIYR RSRGVFISYQNRHNMDDILQNVPNHNIKVIVVTDGERILGLGDQGIGGMGIPIGKLSLYT ACGGISPAYTLPVVLDVGTNNQQLLNDPLYMGWRNPRITDDEYYEFVDEFIQAVKQRWPD VLLQFEDFAQKNAMPLLNRYRNEICSFNDDIQGTAAVTVGTLIAASRAAGGQLSEKKIVF LGAGSAGCGIAEMIISQTQREGLSEEAARQKVFMVDRFGLLTDKMPNLLPFQTKLVQKRE NLSDWDTDSDVLSLLDVVRNVKPDILIGVSGQTGLFTEEIIREMHKHCPRPIVMPLSNPT SRVEATPQDIIAWTEGNALVATGSPFNPVVWKDKIYPIAQCNNAFIFPGIGLGVIASGAS RITDEMLMSASETLAQYSPLVLNGEGMVLPELKDIQKVSRAIAFAVGKMAQQQGVAVKTS AEALQQAIDDNFWQAEYRDYRRTSI >gi|296494630|gb|ADTN01000108.1| GENE 4 3489 - 3626 230 45 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15801657|ref|NP_287675.1| 30S ribosomal subunit S22 [Escherichia coli O157:H7 EDL933] # 1 45 1 45 45 93 100 1e-18 MKSNRQARHILGLDHKISNQRKIVTEGDKSSVVNNPTGRKRPAEK >gi|296494630|gb|ADTN01000108.1| GENE 5 3728 - 3943 313 71 aa, chain - ## HITS:1 COG:no KEGG:B21_01451 NR:ns ## KEGG: B21_01451 # Name: bdm # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 71 1 71 71 129 100.0 4e-29 MFTYYQAENSTAEPALVNAIEQGLRAQHGVVTEDDILMELTKWVEASDNDILSDIYQQTI NYVVSGQHPTL >gi|296494630|gb|ADTN01000108.1| GENE 6 4288 - 4719 541 143 aa, chain + ## HITS:1 COG:osmC KEGG:ns NR:ns ## COG: osmC COG1764 # Protein_GI_number: 16129441 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted redox protein, regulator of disulfide bond formation # Organism: 
Escherichia coli K12 # 1 143 1 143 143 270 100.0 5e-73 MTIHKKGQAHWEGDIKRGKGTVSTESGVLNQQPYGFNTRFEGEKGTNPEELIGAAHAACF SMALSLMLGEAGFTPTSIDTTADVSLDKVDAGFAITKIALKSEVAVPGIDASTFDGIIQK AKAGCPVSQVLKAEITLDYQLKS >gi|296494630|gb|ADTN01000108.1| GENE 7 4775 - 5701 618 308 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 2 291 8 310 329 242 40 1e-63 MSDTLLTLRDVHINFPARKNWLGKTTEHVHAINGIDLQIRRGETLGIVGESGCGKSTLAQ LLMGMLQPSHGQYIRSGSQRIMQMVFQDPLSSLNPRLPVWRIITEPLWIAKRSSEQQRRA LAEELAVQVGIRPEYLDRLPHAFSGGQRQRIAIARALSSQPDVIVLDEPTSALDISVQAQ ILNLLVTLQENHGLTYVLISHNVSVIRHMSDRVAVMYLGQIVELGDAQQVLTAPAHPYTR LLLDSLPAIDKPLEEEWALRKTDLPGNRTLPQGCFFYERCPLATHGCEVRQSLAIREDGR ELRCWRAL >gi|296494630|gb|ADTN01000108.1| GENE 8 5694 - 6680 474 328 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 20 316 33 324 329 187 37 5e-47 MTQPVLDIQQLHLSFPGFNGDVHALNNVSLQINRGEIVGLVGESGSGKSVTAMLIMRLLP TGSYCVHRGQISLLGEDVLNAREKQLRQWRGARVAMIFQEPMTALNPTRRIGLQMMDVIR HHQPISRREARAKAIDLLEEMQIPDAVEVMSRYPFELSGGMRQRVMIALAFSCEPQLIIA DEPTTALDVTVQLQVLRLLKHKARASGTAVLFISHDMAVVSQLCDSVYVMYAGSVIESGV TADVIHHPRHPYTIGLLQCAPEHGVPRQLLPAIPGTVPNLTHLPDGCAFRDRCYAAGAQC ENVPALTACGDNNQRCACWYPQQEVISV >gi|296494630|gb|ADTN01000108.1| GENE 9 6677 - 7570 777 297 aa, chain - ## HITS:1 COG:ddpC KEGG:ns NR:ns ## COG: ddpC COG1173 # Protein_GI_number: 16129444 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Escherichia coli K12 # 1 297 2 298 298 540 100.0 1e-153 MLSEETSAVRPQKQTRFNGAKLVWMLKGSPLTVTSAVIIVLMLLMMIFSPWLATHDPNAI DLTARLLPPSAAHWFGTDEVGRDLFSRVLVGSQQSILAGLVVVAIAGMIGSLLGCLSGVL GGRADAIIMRIMDIMLSIPSLVLTMALAAALGPSLFNAMLAIAIVRIPFYVRLARGQALV VRQYTYVQAAKTFGASRWHLINWHILRNSLPPLIVQASLDIGSAILMAATLGFIGLGAQQ PSAEWGAMVANGRNYVLDQWWYCAFPGAAILLTAVGFNLFGDGIRDLLDPKAGGKQS 
>gi|296494630|gb|ADTN01000108.1| GENE 10 7570 - 8592 316 340 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 70 332 46 310 320 126 29 1e-28 MTFWSILRQRCWGLVLVVAGVCVITFIISHLIPGDPARLLAGDRASDAIVENIRQQLGLD QPLYVQFYRYVSDLFHGDLGTSIRTGRPVLEELRIFFPATLELAFGALLLALLIGIPLGI LSAVWRNRWLDHLVRIMAITGISTPAFWLGLGVIVLFYGHLQILPGGGRLDDWLDPPTHV TGFYLLDALLEGNGEVFFNALQHLILPALTLAFVHLGIVARQIRSAMLEQLSEDYIRTAR ASGLPGWYIVLCYALPNALIPSITVLGLALGDLLYGAVLTETVFAWPGMGAWVVTSIQAL DFPAVMGFAVVVSFAYVLVNLVVDLLYLWIDPRIGRGGGE >gi|296494630|gb|ADTN01000108.1| GENE 11 8594 - 10144 1621 516 aa, chain - ## HITS:1 COG:ddpA KEGG:ns NR:ns ## COG: ddpA COG0747 # Protein_GI_number: 16129446 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Escherichia coli K12 # 1 516 1 516 516 1014 100.0 0 MKRSISFRPTLLALVLATNFPVAHAAVPKDMLVIGKAADPQTLDPAVTIDNNDWTVTYPS YQRLVQYKTDGDKGSTDVEGDLASSWKASDDQKEWTFTLKDNAKFADGTPVTAEAVKLSF ERLLKIGQGPAEAFPKDLKIDAPDEHTVKFTLSQPFAPFLYTLANDGASIINPAVLKEHA ADDARGFLAQNTAGSGPFMLKSWQKGQQLVLVPNPHYPGNKPNFKRVSVKIIGESASRRL QLSRGDIDIADALPVDQLNALKQENKVNVAEYPSLRVTYLYLNNSKAPLNQADLRRAISW STDYQGMVNGILSGNGKQMRGPIPEGMWGYDATAMQYNHDETKAKAEWDKVTSKPTSLTF LYSDNDPNWEPIALATQSSLNKLGIIVKLEKLANATMRDRVGKGDYDIAIGNWSPDFADP YMFMNYWFESDKKGLPGNRSFYENSEVDKLLRNALATTDQTQRTRDYQQAQKIVIDDAAY VYLFQKNYQLAMNKEVKGFVFNPMLEQVFNINTMSK >gi|296494630|gb|ADTN01000108.1| GENE 12 10158 - 10739 418 193 aa, chain - ## HITS:1 COG:ddpX KEGG:ns NR:ns ## COG: ddpX COG2173 # Protein_GI_number: 16129447 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine dipeptidase # Organism: Escherichia coli K12 # 1 193 1 193 193 395 100.0 1e-110 MSDTTELVDLAVIFPDLEIELKYACADNITGKAIYQQARCLLHKDAITALAKSISIAQLS GLQLVIYDAYRPQQAQAMLWQACPDPQYVVDVTVGSNHSRGTAIDLTLRDEHGNILDMGA GFDEMHERSHAYHPSVPPAAQRNRLLLNAIMTGGGFVGISSEWWHFELPQAASYPLLADQ FSCFISPGTQHVS >gi|296494630|gb|ADTN01000108.1| GENE 
13 10997 - 12499 1251 500 aa, chain - ## HITS:1 COG:yddU_3 KEGG:ns NR:ns ## COG: yddU_3 COG2200 # Protein_GI_number: 16129448 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 234 500 1 267 267 546 100.0 1e-155 MPIHWASSSHGAEIQNAQSWSATIRQRDGAPAGILQIKTSSGAETSAFIERVADISQHMA ALALEQEKSRQHIEQLIQFDPMTGLPNRNNLHNYLDDLVDKAVSPVVYLIGVDHIQDVID SLGYAWADQALLEVVNRFREKLKPDQYLCRIEGTQFVLVSLENDVSNITQIADELRNVVS KPIMIDDKPFPLTLSIGISYDLGKNRDYLLSTAHNAMDYIRKNGGNGWQFFSPAMNEMVK ERLVLGAALKEAISNNQLKLVYQPQIFAETGELYGIEALARWHDPLHGHVPPSRFIPLAE EIGEIENIGRWVIAEACRQLAEWRSQNIHIPALSVNLSALHFRSNQLPNQVSDAMHAWGI DGHQLTVEITESMMMEHDTEIFKRIQILRDMGVGLSVDDFGTGFSGLSRLVSLPVTEIKI DKSFVDRCLTEKRILALLEAITSIGQSLNLTVVAEGVETKEQFEMLRKIHCRVIQGYFFS RPLPAEEIPGWMSSVLPLKI >gi|296494630|gb|ADTN01000108.1| GENE 14 12253 - 13395 473 380 aa, chain - ## HITS:1 COG:ECs2094_1 KEGG:ns NR:ns ## COG: ECs2094_1 COG2202 # Protein_GI_number: 15831348 # Func_class: T Signal transduction mechanisms # Function: FOG: PAS/PAC domain # Organism: Escherichia coli O157:H7 # 1 277 1 277 342 573 100.0 1e-163 MKLTDADNAADGIFFPALEQNMMGAVLINENDEVMFFNPAAEKLWGYKREEVIGNNIDML IPRDLRPAHPEYIRHNREGGKARVEGMSRELQLEKKDGSKIWTRFALSKVSAEGKVYYLA LVRDASVEMAQKEQTRQLIIAVDHLDRPVIVLDPERHIVQCNRAFTEMFGYCISEASGMQ PDTLLNIPEFPADNRIRLQQLLWKTARDQDEFLLLTRTGEKIWIKASISPVYDVLAHLQN LVMTFSDITEERQIRQLEGNILAAMCSSPPFHEMGEIFVVTSNLYSTNRMFRCSHCATGC RYTGRHLPTVQKFKMRKAGQRPFVSVMARLRGSCKLKPRQEQKPAPLSNAWQISASIWPR WRWNRKKAVSILNNSSNLIR >gi|296494630|gb|ADTN01000108.1| GENE 15 13420 - 14802 1293 460 aa, chain - ## HITS:1 COG:ECs2095_2 KEGG:ns NR:ns ## COG: ECs2095_2 COG2199 # Protein_GI_number: 15831349 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli O157:H7 # 269 460 1 192 192 367 100.0 1e-101 MEMYFKRMKDEWTGLVEQADPPIRAKAAEIAVAHAHYLSIEFYRIVRIDPHAEEFLSNEQ VERQLKSAMERWIINVLSAQVDDVERLIQIQHTVAEVHARIGIPVEIVEMGFRVLKKILY PVIFSSDYSAAEKLQVYHFSINSIDIAMEVMTRAFTFSDSSASKEDENYRIFSLLENAEE 
EKERQIASILSWEIDIIYKILLDSDLGSSLPLSQADFGLWFNHKGRHYFSGIAEVGHISR LIQDFDGIFNQTMRNTRNLNNRSLRVKFLLQIRNTVSQIITLLRELFEEVSRHEVGMDVL TKLLNRRFLPTIFKREIAHANRTGTPLSVLIIDVDKFKEINDTWGHNTGDEILRKVSQAF YDNVRSSDYVFRYGGDEFIIVLTEASENETLRTAERIRSRVEKTKLKAANGEDIALSLSI GAAMFNGHPDYERLIQIADEALYIAKRRGRNRVELWKASL Prediction of potential genes in microbial genomes Time: Sun May 15 23:30:24 2011 Seq name: gi|296494629|gb|ADTN01000109.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont249.1, whole genome shotgun sequence Length of sequence - 6600 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 7 - 549 -162 ## COG3710 DNA-binding winged-HTH domains - Prom 591 - 650 8.5 - Term 809 - 848 6.1 2 2 Tu 1 . - CDS 883 - 1401 166 ## COG2771 DNA-binding HTH domain-containing proteins - Prom 1507 - 1566 4.7 - Term 1918 - 1964 13.1 3 3 Tu 1 . - CDS 1975 - 3204 571 ## COG0814 Amino acid permeases - Prom 3239 - 3298 4.3 + Prom 3353 - 3412 6.0 4 4 Tu 1 . + CDS 3459 - 4640 1065 ## COG0183 Acetyl-CoA acetyltransferase + Prom 4766 - 4825 5.8 5 5 Op 1 9/0.000 + CDS 5035 - 5763 695 ## COG3717 5-keto 4-deoxyuronate isomerase 6 5 Op 2 . 
+ CDS 5793 - 6554 165 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 Predicted protein(s) >gi|296494629|gb|ADTN01000109.1| GENE 1 7 - 549 -162 180 aa, chain - ## HITS:1 COG:ZyqeI KEGG:ns NR:ns ## COG: ZyqeI COG3710 # Protein_GI_number: 15803367 # Func_class: K Transcription # Function: DNA-binding winged-HTH domains # Organism: Escherichia coli O157:H7 EDL933 # 1 174 1 174 269 355 98.0 4e-98 MYWIINDNIESWPEHRKLISVHNADLNVVLTTPASRCLSLLLEAFPDVVAQQDFFTRVWE EEGMRVPTNTLYQNISIIRRGFRAVGDTTHSLIATVPRRGFKIHNDINIQNHVINSSTDA HTHNAPPAIKVNAGYKESIGGAKNFNNKILKHIKSHLIMLSAFVIGAYSAYWLWVMLPTY >gi|296494629|gb|ADTN01000109.1| GENE 2 883 - 1401 166 172 aa, chain - ## HITS:1 COG:yqeH KEGG:ns NR:ns ## COG: yqeH COG2771 # Protein_GI_number: 16130750 # Func_class: K Transcription # Function: DNA-binding HTH domain-containing proteins # Organism: Escherichia coli K12 # 1 172 59 230 230 345 99.0 2e-95 MWPEESSYFNRGVVEGILTKNHNARLSGYIFVDFSVSFLRLFLEKDWIDYLASTDMGIVL VSDRNMQSLANYWRKHNSAISAVIYNDDGLDVANEKIRQLFIGRYLSFTRGNTLTQMEFT IMGYMVSGYNPYQIAEVLDMDIRSIYAYKQRIEKRMGGKINELFIRSHSVQH >gi|296494629|gb|ADTN01000109.1| GENE 3 1975 - 3204 571 409 aa, chain - ## HITS:1 COG:ECs3702 KEGG:ns NR:ns ## COG: ECs3702 COG0814 # Protein_GI_number: 15832956 # Func_class: E Amino acid transport and metabolism # Function: Amino acid permeases # Organism: Escherichia coli O157:H7 # 1 409 1 409 409 707 100.0 0 MSNIWSKEETLWSFALYGTAVGAGTLFLPIQLGSAGAVVLFITALVAWPLTYWPHKALCQ FILSSKTSAGEGITGAVTHYYGKKIGNLITTLYFIAFFVVVLIYAVAITNSLTEQLAKHM VIDLRIRMLVSLGVVLILNLIFLMGRHATIRVMGFLVFPLIAYFLFLSIYLVGSWQPDLL TTQVEFNQNTLHQIWISIPVMVFAFSHTPIISTFAIDRREKYGEHAMDKCKKIMKVAYLI ICISVLFFVFSCLLSIPPSYIEAAKEEGVTILSALSMLPNAPAWLSISGIIVAVVAMSKS FLGTYFGVIEGATEVVKTTLQQVGVKKSRAFNRALSIMLVSLITFIVCCINPNAISMIYA ISGPLIAMILFIMPTLSTYLIPALKPWRSIGNLITLIVGILCVSVMFFS >gi|296494629|gb|ADTN01000109.1| GENE 4 3459 - 4640 1065 393 aa, chain + ## HITS:1 COG:yqeF KEGG:ns NR:ns ## COG: yqeF COG0183 # Protein_GI_number: 16130748 # Func_class: I 
Lipid transport and metabolism # Function: Acetyl-CoA acetyltransferase # Organism: Escherichia coli K12 # 1 393 2 394 394 719 100.0 0 MKDVVIVGALRTPIGCFRGALAGHSAVELGSLVVKALIERTGVPAYAVDEVILGQVLTAG AGQNPARQSAIKGGLPNSVSAITINDVCGSGLKALHLATQAIQCGEADIVIAGGQENMSR APHVLTDSRTGAQLGNSQLVDSLVHDGLWDAFNDYHIGVTAENLAREYGISRQLQDAYAL SSQQKARAAIDAGRFKDEIVPVMTQSNGQTLVVDTDEQPRTDASAEGLARLNPSFDSLGS VTAGNASSINDGAAAVMMMSEAKARALNLPVLARIRAFASVGVDPALMGIAPVYATRRCL ERVGWQLAEVDLIEANEAFAAQALSVGKMLEWDERRVNVNGGAIALGHPIGASGCRILVS LVHEMVKRNARKGLATLCIGGGQGVALTIERDE >gi|296494629|gb|ADTN01000109.1| GENE 5 5035 - 5763 695 242 aa, chain + ## HITS:1 COG:kduI KEGG:ns NR:ns ## COG: kduI COG3717 # Protein_GI_number: 16130747 # Func_class: G Carbohydrate transport and metabolism # Function: 5-keto 4-deoxyuronate isomerase # Organism: Escherichia coli K12 # 1 242 37 278 278 512 100.0 1e-145 MVYSHIDRIIVGGIMPITKTVSVGGEVGKQLGVSYFLERRELGVINIGGAGTITVDGQCY EIGHRDALYVGKGAKEVVFASIDTGTPAKFYYNCAPAHTTYPTKKVTPDEVSPVTLGDNL TSNRRTINKYFVPDVLETCQLSMGLTELAPGNLWNTMPCHTHERRMEVYFYFNMDDDACV FHMMGQPQETRHIVMHNEQAVISPSWSIHSGVGTKAYTFIWGMVGENQVFDDMDHVAVKD LR >gi|296494629|gb|ADTN01000109.1| GENE 6 5793 - 6554 165 253 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 63 248 60 238 242 68 27 2e-11 MILSAFSLEGKVAVVTGCDTGLGQGMALGLAQAGCDIVGINIVEPTETIEQVTALGRRFL SLTADLRKIDGIPALLDRAVAEFGHIDILVNNAGLIRREDALEFSEKDWDDVMNLNIKSV FFMSQAAAKHFIAQGNGGKIINIASMLSFQGGIRVPSYTASKSGVMGVTRLMANEWAKHN INVNAIAPGYMATNNTQQLRADEQRSAEILDRIPAGRWGLPSDLMGPIVFLASSASDYVN GYTIAVDGGWLAR Prediction of potential genes in microbial genomes Time: Sun May 15 23:30:25 2011 Seq name: gi|296494628|gb|ADTN01000110.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont249.2, whole genome shotgun sequence Length of sequence - 1721 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 
48 - 107 6.5 1 1 Tu 1 . + CDS 334 - 1692 1447 ## COG0477 Permeases of the major facilitator superfamily Predicted protein(s) >gi|296494628|gb|ADTN01000110.1| GENE 1 334 - 1692 1447 452 aa, chain + ## HITS:1 COG:ECs3698 KEGG:ns NR:ns ## COG: ECs3698 COG0477 # Protein_GI_number: 15832952 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 452 21 472 472 817 100.0 0 MNMFVSVAAAVAGLLFGLDIGVIAGALPFITDHFVLTSRLQEWVVSSMMLGAAIGALFNG WLSFRLGRKYSLMAGAILFVLGSIGSAFATSVEMLIAARVVLGIAVGIASYTAPLYLSEM ASENVRGKMISMYQLMVTLGIVLAFLSDTAFSYSGNWRAMLGVLALPAVLLIILVVFLPN SPRWLAEKGRHIEAEEVLRMLRDTSEKAREELNEIRESLKLKQGGWALFKINRNVRRAVF LGMLLQAMQQFTGMNIIMYYAPRIFKMAGFTTTEQQMIATLVVGLTFMFATFIAVFTVDK AGRKPALKIGFSVMALGTLVLGYCLMQFDNGTASSGLSWLSVGMTMMCIAGYAMSAAPVV WILCSEIQPLKCRDFGITCSTTTNWVSNMIIGATFLTLLDSIGAAGTFWLYTALNIAFVG ITFWLIPETKNVTLEHIERKLMAGEKLRNIGV Prediction of potential genes in microbial genomes Time: Sun May 15 23:30:43 2011 Seq name: gi|296494627|gb|ADTN01000111.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont249.3, whole genome shotgun sequence Length of sequence - 62615 bp Number of predicted genes - 52, with homology - 52 Number of transcription units - 30, operones - 15 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 117 - 809 745 ## COG1794 Aspartate racemase + Term 1026 - 1074 -1.0 2 2 Tu 1 . - CDS 796 - 1731 751 ## COG0583 Transcriptional regulator - Prom 1815 - 1874 6.7 + Prom 1761 - 1820 9.1 3 3 Tu 1 . + CDS 1853 - 3115 1308 ## COG0019 Diaminopimelate decarboxylase 4 4 Tu 1 . - CDS 3122 - 4153 979 ## COG1609 Transcriptional regulators + Prom 4643 - 4702 3.2 5 5 Op 1 5/0.364 + CDS 4738 - 6897 2022 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II 6 5 Op 2 . 
+ CDS 6890 - 8083 1387 ## COG0477 Permeases of the major facilitator superfamily + Term 8087 - 8124 8.0 - Term 8075 - 8112 8.0 7 6 Tu 1 . - CDS 8115 - 9155 1205 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) - Prom 9181 - 9240 7.3 - Term 9221 - 9258 5.1 8 7 Tu 1 . - CDS 9263 - 9481 261 ## G2583_3488 hypothetical protein - Prom 9551 - 9610 3.6 - Term 9570 - 9615 2.7 9 8 Op 1 5/0.364 - CDS 9619 - 10332 801 ## COG0861 Membrane protein TerC, possibly involved in tellurium resistance 10 8 Op 2 . - CDS 10401 - 11090 685 ## COG3066 DNA mismatch repair protein - Prom 11136 - 11195 4.8 + Prom 11137 - 11196 3.6 11 9 Tu 1 . + CDS 11276 - 11422 83 ## ECO103_3390 hypothetical protein + Prom 11476 - 11535 6.5 12 10 Op 1 7/0.000 + CDS 11775 - 12305 223 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 13 10 Op 2 5/0.364 + CDS 12318 - 14564 2185 ## COG3605 Signal transduction protein containing GAF and PtsI domains 14 11 Op 1 11/0.000 + CDS 14715 - 15590 1127 ## COG0682 Prolipoprotein diacylglyceryltransferase 15 11 Op 2 4/0.727 + CDS 15597 - 16391 902 ## COG0207 Thymidylate synthase + Term 16410 - 16447 5.6 + Prom 16406 - 16465 2.1 16 12 Op 1 12/0.000 + CDS 16575 - 17045 202 ## COG2165 Type II secretory pathway, pseudopilin PulG 17 12 Op 2 . + CDS 17036 - 17599 660 ## COG4795 Type II secretory pathway, component PulJ 18 12 Op 3 . + CDS 17596 - 18003 271 ## B21_02633 hypothetical protein 19 12 Op 4 5/0.364 + CDS 17988 - 18311 83 ## COG4967 Tfp pilus assembly protein PilV 20 12 Op 5 5/0.364 + CDS 18324 - 21692 3059 ## COG1330 Exonuclease V gamma subunit + Prom 21748 - 21807 4.1 21 13 Op 1 5/0.364 + CDS 21868 - 24756 2950 ## COG1025 Secreted/periplasmic Zn-dependent peptidases, insulinase-like 22 13 Op 2 13/0.000 + CDS 24749 - 28291 2875 ## COG1074 ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) 23 13 Op 3 . 
+ CDS 28291 - 30117 1346 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member + Term 30128 - 30178 9.5 24 14 Tu 1 . - CDS 30179 - 31510 1319 ## COG0548 Acetylglutamate kinase - Prom 31540 - 31599 5.6 + Prom 31533 - 31592 8.0 25 15 Tu 1 . + CDS 31742 - 32995 1174 ## COG0860 N-acetylmuramoyl-L-alanine amidase - TRNA 33069 - 33145 86.1 # Met CAT 0 0 - TRNA 33179 - 33255 86.1 # Met CAT 0 0 + Prom 33279 - 33338 4.4 26 16 Op 1 8/0.000 + CDS 33464 - 34561 1102 ## COG2821 Membrane-bound lytic murein transglycosylase + Term 34596 - 34636 4.1 + Prom 34668 - 34727 7.0 27 16 Op 2 . + CDS 34800 - 35606 867 ## COG1179 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 + Term 35614 - 35659 12.8 - Term 35599 - 35647 11.2 28 17 Op 1 7/0.000 - CDS 35657 - 36100 419 ## COG2166 SufE protein probably involved in Fe-S center assembly 29 17 Op 2 . - CDS 36100 - 37305 1042 ## COG0520 Selenocysteine lyase - Prom 37433 - 37492 5.2 + Prom 37418 - 37477 4.3 30 18 Tu 1 . + CDS 37497 - 37724 355 ## ECIAI1_2919 hypothetical protein + Prom 37996 - 38055 7.8 31 19 Op 1 6/0.182 + CDS 38075 - 38992 637 ## COG0583 Transcriptional regulator 32 19 Op 2 5/0.364 + CDS 39011 - 39406 505 ## COG2363 Uncharacterized small membrane protein 33 19 Op 3 . + CDS 39399 - 40499 1150 ## COG2933 Predicted SAM-dependent methyltransferase - Term 40491 - 40531 1.2 34 20 Tu 1 4/0.727 - CDS 40543 - 41274 528 ## COG1349 Transcriptional regulators of sugar metabolism - Term 41287 - 41318 3.5 35 21 Op 1 5/0.364 - CDS 41332 - 41754 387 ## COG4154 Fucose dissimilation pathway protein FucU 36 21 Op 2 4/0.727 - CDS 41756 - 43174 1022 ## COG1070 Sugar (pentulose and hexulose) kinases - Term 43181 - 43216 5.4 37 22 Op 1 3/0.909 - CDS 43283 - 45058 1852 ## COG2407 L-fucose isomerase and related proteins 38 22 Op 2 . 
- CDS 45091 - 46407 837 ## COG0738 Fucose permease - Prom 46523 - 46582 6.7 + Prom 46833 - 46892 2.4 39 23 Op 1 5/0.364 + CDS 46954 - 47601 628 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 40 23 Op 2 . + CDS 47629 - 48777 1288 ## COG1454 Alcohol dehydrogenase, class IV + Term 48909 - 48946 1.0 - Term 48783 - 48825 8.2 41 24 Tu 1 . - CDS 48832 - 49587 533 ## COG0258 5'-3' exonuclease (including N-terminal domain of PolI) - Prom 49608 - 49667 2.7 42 25 Op 1 10/0.000 - CDS 49699 - 51066 1443 ## COG1760 L-serine deaminase - Term 51084 - 51115 3.9 43 25 Op 2 4/0.727 - CDS 51124 - 52413 1519 ## COG0814 Amino acid permeases - Prom 52609 - 52668 6.8 - Term 52924 - 52959 4.0 44 26 Tu 1 6/0.182 - CDS 52970 - 54334 1377 ## COG1611 Predicted Rossmann fold nucleotide-binding protein - Prom 54370 - 54429 1.7 45 27 Tu 1 . - CDS 54446 - 55294 635 ## COG0780 Enzyme related to GTP cyclohydrolase I - Prom 55386 - 55445 3.6 + Prom 55269 - 55328 3.5 46 28 Tu 1 . + CDS 55362 - 55907 511 ## ECO103_3336 SecY interacting protein Syd + Term 55990 - 56044 0.5 + Prom 56445 - 56504 6.4 47 29 Op 1 7/0.000 + CDS 56529 - 56858 374 ## COG3098 Uncharacterized protein conserved in bacteria 48 29 Op 2 5/0.364 + CDS 56858 - 57640 192 ## PROTEIN SUPPORTED gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family 49 29 Op 3 4/0.727 + CDS 57658 - 58107 603 ## COG0716 Flavodoxins + Term 58130 - 58164 -0.1 + Prom 58347 - 58406 5.7 50 30 Op 1 7/0.000 + CDS 58542 - 59894 1531 ## COG0477 Permeases of the major facilitator superfamily 51 30 Op 2 5/0.364 + CDS 59896 - 61236 1497 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 52 30 Op 3 . 
+ CDS 61257 - 62597 1716 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily Predicted protein(s) >gi|296494627|gb|ADTN01000111.1| GENE 1 117 - 809 745 230 aa, chain + ## HITS:1 COG:ECs3697 KEGG:ns NR:ns ## COG: ECs3697 COG1794 # Protein_GI_number: 15832951 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Aspartate racemase # Organism: Escherichia coli O157:H7 # 1 230 1 230 230 457 100.0 1e-129 MKTIGLLGGMSWESTIPYYRLINEGIKQRLGGLHSAQVLLHSVDFHEIEECQRRGEWDKT GDILAEAALGLQRAGAEGIVLCTNTMHKVADAIESRCTLPFLHIADATGRAITGAGMTRV ALLGTRYTMEQDFYRGRLTEQFSINCLIPEADERAKINQIIFEELCLGQFTEASRAYYAQ VIARLAEQGAQGVIFGCTEIGLLVPEERSVLPVFDTAAIHAEDAVAFMLS >gi|296494627|gb|ADTN01000111.1| GENE 2 796 - 1731 751 311 aa, chain - ## HITS:1 COG:lysR KEGG:ns NR:ns ## COG: lysR COG0583 # Protein_GI_number: 16130743 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 311 1 311 311 579 99.0 1e-165 MAAVNLRHIEIFHAVMTAGSLTEAAHLLHTSQPTVSRELARFEKVIGLKLFERVRGRLHP TVQGLRLFEEVQRSWYGLDRIVSAAESLREFRQGELSIACLPVFSQSFLPQLLQPFLARY PDVSLNIVPQESPLLEEWLSAQRHDLGLTETLHTPAGTERTELLSLDEVCVLPPGHPLAV KKVLTPDDFQGENYISLSRTDSYRQLLDQLFTEHQVKRRMIVETHNAASVCAMVRAGVGV SVVNPLTALDYAASGLVVRRFSIAVPFTVSLIRPLHRPSSALVQAFSGHLQAGLPKLVTS LDAILSSATTA >gi|296494627|gb|ADTN01000111.1| GENE 3 1853 - 3115 1308 420 aa, chain + ## HITS:1 COG:lysA KEGG:ns NR:ns ## COG: lysA COG0019 # Protein_GI_number: 16130742 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Escherichia coli K12 # 1 411 1 411 420 833 100.0 0 MPHSLFSTDTDLTAENLLRLPAEFGCPVWVYDAQIIRRQIAALKQFDVVRFAQKACSNIH ILRLMREQGVKVDSVSLGEIERALAAGYNPQTHPDDIVFTADVIDQATLERVSELQIPVN AGSVDMLDQLGQVSPGHRVWLRVNPGFGHGHSQKTNTGGENSKHGIWYTDLPAALDVIQR HHLQLVGIHMHIGSGVDYAHLEQVCGAMVRQVIEFGQDLQAISAGGGLSVPYQQGEEAVD TEHYYGLWNAAREQIARHLGHPVKLEIEPGRFLVAQSGVLITQVRSVKQMGSRHFVLVDA GFNDLMRPAMYGSYHHISALAADGRSLEHAPTVETVVAGPLCESGDVFTQQEGGNVETRA 
LPEVKAGDYLVLHDTGAYGASMSSNYNSRPLLPEVLFDNGQARLIRRRQTIEELLALELL >gi|296494627|gb|ADTN01000111.1| GENE 4 3122 - 4153 979 343 aa, chain - ## HITS:1 COG:galR KEGG:ns NR:ns ## COG: galR COG1609 # Protein_GI_number: 16130741 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 343 1 343 343 658 100.0 0 MATIKDVARLAGVSVATVSRVINNSPKASEASRLAVHSAMESLSYHPNANARALAQQTTE TVGLVVGDVSDPFFGAMVKAVEQVAYHTGNFLLIGNGYHNEQKERQAIEQLIRHRCAALV VHAKMIPDADLASLMKQMPGMVLINRILPGFENRCIALDDRYGAWLATRHLIQQGHTRIG YLCSNHSISDAEDRLQGYYDALAESGIAANDRLVTFGEPDESGGEQAMTELLGRGRNFTA VACYNDSMAAGAMGVLNDNGIDVPGEISLIGFDDVLVSRYVRPRLTTVRYPIVTMATQAA ELALALADNRPLPEITNVFSPTLVRRHSVSTPSLEASHHATSD >gi|296494627|gb|ADTN01000111.1| GENE 5 4738 - 6897 2022 719 aa, chain + ## HITS:1 COG:aas_2 KEGG:ns NR:ns ## COG: aas_2 COG0318 # Protein_GI_number: 16130740 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Escherichia coli K12 # 198 719 1 522 522 1065 100.0 0 MLFSFFRNLCRVLYRVRVTGDTQALKGERVLITPNHVSFIDGILLGLFLPVRPVFAVYTS ISQQWYMRWLKSFIDFVPLDPTQPMAIKHLVRLVEQGRPVVIFPEGRITTTGSLMKIYDG AGFVAAKSGATVIPVRIEGAELTHFSRLKGLVKRRLFPQITLHILPPTQVAMPDAPRARD RRKIAGEMLHQIMMEARMAVRPRETLYESLLSAMYRFGAGKKCVEDVNFTPDSYRKLLTK TLFVGRILEKYSVEGERIGLMLPNAGISAAVIFGAIARRRMPAMMNYTAGVKGLTSAITA AEIKTIFTSRQFLDKGKLWHLPEQLTQVRWVYLEDLKADVTTADKVWIFAHLLMPRLAQV KQQPEEEALILFTSGSEGHPKGVVHSHKSILANVEQIKTIADFTTNDRFMSALPLFHSFG LTVGLFTPLLTGAEVFLYPSPLHYRIVPELVYDRSCTVLFGTSTFLGHYARFANPYDFYR LRYVVAGAEKLQESTKQLWQDKFGLRILEGYGVTECAPVVSINVPMAAKPGTVGRILPGM DARLLSVPGIEEGGRLQLKGPNIMNGYLRVEKPGVLEVPTAENVRGEMERGWYDTGDIVR FDEQGFVQIQGRAKRFAKIAGEMVSLEMVEQLALGVSPDKVHATAIKSDASKGEALVLFT TDNELTRDKLQQYAREHGVPELAVPRDIRYLKQMPLLGSGKPDFVTLKSWVDEAEQHDE >gi|296494627|gb|ADTN01000111.1| GENE 6 6890 - 8083 1387 397 aa, chain + ## HITS:1 COG:ygeD KEGG:ns NR:ns ## COG: ygeD COG0477 # Protein_GI_number: 16130739 # 
Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 397 1 397 397 642 100.0 0 MSESVHTNTSLWSKGMKAVIVAQFLSAFGDNALLFATLALLKAQFYPEWSQPILQMVFVG AYILFAPFVGQVADSFAKGRVMMFANGLKLLGAASICFGINPFLGYTLVGVGAAAYSPAK YGILGELTTGSKLVKANGLMEASTIAAILLGSVAGGVLADWHVLVALAACALAYGGAVVA NIYIPKLAAARPGQSWNLINMTRSFLNACTSLWRNGETRFSLVGTSLFWGAGVTLRFLLV LWVPVALGITDNATPTYLNAMVAIGIVVGAGAAAKLVTLETVSRCMPAGILIGVVVLIFS LQHELLPAYALLMLIGVMGGFFVVPLNALLQERGKKSVGAGNAIAVQNLGENSAMLLMLG IYSLAVMIGIPVVPIGIGFGALFALAITALWIWQRRH >gi|296494627|gb|ADTN01000111.1| GENE 7 8115 - 9155 1205 346 aa, chain - ## HITS:1 COG:tas KEGG:ns NR:ns ## COG: tas COG0667 # Protein_GI_number: 16130738 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Escherichia coli K12 # 1 346 1 346 346 699 100.0 0 MQYHRIPHSSLEVSTLGLGTMTFGEQNSEADAHAQLDYAVAQGINLIDVAEMYPVPPRPE TQGLTETYVGNWLAKHGSREKLIIASKVSGPSRNNDKGIRPDQALDRKNIREALHDSLKR LQTDYLDLYQVHWPQRPTNCFGKLGYSWTDSAPAVSLLDTLDALAEYQRAGKIRYIGVSN ETAFGVMRYLHLADKHDLPRIVTIQNPYSLLNRSFEVGLAEVSQYEGVELLAYSCLGFGT LTGKYLNGAKPAGARNTLFSRFTRYSGEQTQKAVAAYVDIARRHGLDPAQMALAFVRRQP FVASTLLGATTMDQLKTNIESLHLELSEDVLAEIEAVHQVYTYPAP >gi|296494627|gb|ADTN01000111.1| GENE 8 9263 - 9481 261 72 aa, chain - ## HITS:1 COG:no KEGG:G2583_3488 NR:ns ## KEGG: G2583_3488 # Name: ygdR # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 72 1 72 72 132 100.0 4e-30 MKKWAVIISAVGLAFAVSGCSSDYVMATKDGRMILTDGKPEIDDDTGLVSYHDQQGNAMQ INRDDVSQIIER >gi|296494627|gb|ADTN01000111.1| GENE 9 9619 - 10332 801 237 aa, chain - ## HITS:1 COG:ECs3689 KEGG:ns NR:ns ## COG: ECs3689 COG0861 # Protein_GI_number: 15832943 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane protein TerC, possibly involved in 
tellurium resistance # Organism: Escherichia coli O157:H7 # 1 237 1 237 237 381 100.0 1e-106 MLFAWITDPNAWLALGTLTLLEIVLGIDNIIFLSLVVAKLPTAQRAHARRLGLAGAMVMR LALLASIAWVTRLTNPLFTIFSQEISARDLILLLGGLFLIWKASKEIHESIEGEEEGLKT RVSSFLGAIVQIMLLDIIFSLDSVITAVGLSDHLFIMMAAVVIAVGVMMFAARSIGDFVE RHPSVKMLALSFLILVGFTLILESFDIHVPKGYIYFAMFFSIAVESLNLIRNKKNPL >gi|296494627|gb|ADTN01000111.1| GENE 10 10401 - 11090 685 229 aa, chain - ## HITS:1 COG:mutH KEGG:ns NR:ns ## COG: mutH COG3066 # Protein_GI_number: 16130735 # Func_class: L Replication, recombination and repair # Function: DNA mismatch repair protein # Organism: Escherichia coli K12 # 1 229 1 229 229 433 100.0 1e-121 MSQPRPLLSPPETEEQLLAQAQQLSGYTLGELAALVGLVTPENLKRDKGWIGVLLEIWLG ASAGSKPEQDFAALGVELKTIPVDSLGRPLETTFVCVAPLTGNSGVTWETSHVRHKLKRV LWIPVEGERSIPLAQRRVGSPLLWSPNEEEDRQLREDWEELMDMIVLGQVERITARHGEY LQIRPKAANAKALTEAIGARGERILTLPRGFYLKKNFTSALLARHFLIQ >gi|296494627|gb|ADTN01000111.1| GENE 11 11276 - 11422 83 48 aa, chain + ## HITS:1 COG:no KEGG:ECO103_3390 NR:ns ## KEGG: ECO103_3390 # Name: ygdT # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 48 1 48 48 95 100.0 5e-19 MLSTESWDNCEKPPLLFPFTALTCDETPVFSGSVLNLVAHSVDKYGIG >gi|296494627|gb|ADTN01000111.1| GENE 12 11775 - 12305 223 176 aa, chain + ## HITS:1 COG:ECs3687 KEGG:ns NR:ns ## COG: ECs3687 COG0494 # Protein_GI_number: 15832941 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli O157:H7 # 1 176 1 176 176 340 100.0 1e-93 MIDDDGYRPNVGIVICNRQGQVMWARRFGQHSWQFPQGGINPGESAEQAMYRELFEEVGL SRKDVRILASTRNWLRYKLPKRLVRWDTKPVCIGQKQKWFLLQLVSGDAEINMQTSSTPE FDGWRWVSYWYPVRQVVSFKRDVYRRVMKEFASVVMSLQENTPKPQNASAYRRKRG >gi|296494627|gb|ADTN01000111.1| GENE 13 12318 - 14564 2185 748 aa, chain + ## HITS:1 COG:ptsP KEGG:ns NR:ns ## COG: ptsP COG3605 # Protein_GI_number: 16130733 # Func_class: T Signal transduction mechanisms # Function: 
Signal transduction protein containing GAF and PtsI domains # Organism: Escherichia coli K12 # 1 748 1 748 748 1434 100.0 0 MLTRLREIVEKVASAPRLNEALNILVTDICLAMDTEVCSVYLADHDRRCYYLMATRGLKK PRGRTVTLAFDEGIVGLVGRLAEPINLADAQKHPSFKYIPSVKEERFRAFLGVPIIQRRQ LLGVLVVQQRELRQYDESEESFLVTLATQMAAILSQSQLTALFGQYRQTRIRALPAAPGV AIAEGWQDATLPLMEQVYQASTLDPALERERLTGALEEAANEFRRYSKRFAAGAQKETAA IFDLYSHLLSDTRLRRELFAEVDKGSVAEWAVKTVIEKFAEQFAALSDNYLKERAGDLRA LGQRLLFHLDDANQGPNAWPERFILVADELSATTLAELPQDRLVGVVVRDGAANSHAAIM VRALGIPTVMGADIQPSVLHRRTLIVDGYRGELLVDPEPVLLQEYQRLISEEIELSRLAE DDVNLPAQLKSGERIKVMLNAGLSPEHEEKLGSRIDGIGLYRTEIPFMLQSGFPSEEEQV AQYQGMLQMFNDKPVTLRTLDVGADKQLPYMPISEENPCLGWRGIRITLDQPEIFLIQVR AMLRANAATGNLNILLPMVTSLDEVDEARRLIERAGREVEEMIGYEIPKPRIGIMLEVPS MVFMLPHLAKRVDFISVGTNDLTQYILAVDRNNTRVANIYDSLHPAMLRALAMIAREAEI HGIDLRLCGEMAGDPMCVAILIGLGYRHLSMNGRSVARAKYLLRRIDYAEAENLAQRSLE AQLATEVRHQVAAFMERRGMGGLIRGGL >gi|296494627|gb|ADTN01000111.1| GENE 14 14715 - 15590 1127 291 aa, chain + ## HITS:1 COG:ECs3685 KEGG:ns NR:ns ## COG: ECs3685 COG0682 # Protein_GI_number: 15832939 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Escherichia coli O157:H7 # 1 291 1 291 291 548 100.0 1e-156 MTSSYLHFPEFDPVIFSIGPVALHWYGLMYLVGFIFAMWLATRRANRPGSGWTKNEVENL LYAGFLGVFLGGRIGYVLFYNFPQFMADPLYLFRVWDGGMSFHGGLIGVIVVMIIFARRT KRSFFQVSDFIAPLIPFGLGAGRLGNFINGELWGRVDPNFPFAMLFPGSRTEDILLLQTN PQWQSIFDTYGVLPRHPSQLYELLLEGVVLFIILNLYIRKPRPMGAVSGLFLIGYGAFRI IVEFFRQPDAQFTGAWVQYISMGQILSIPMIVAGVIMMVWAYRRSPQQHVS >gi|296494627|gb|ADTN01000111.1| GENE 15 15597 - 16391 902 264 aa, chain + ## HITS:1 COG:ECs3684 KEGG:ns NR:ns ## COG: ECs3684 COG0207 # Protein_GI_number: 15832938 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate synthase # Organism: Escherichia coli O157:H7 # 1 264 1 264 264 567 100.0 1e-162 MKQYLELMQKVLDEGTQKNDRTGTGTLSIFGHQMRFNLQDGFPLVTTKRCHLRSIIHELL WFLQGDTNIAYLHENNVTIWDEWADENGDLGPVYGKQWRAWPTPDGRHIDQITTVLNQLK 
NDPDSRRIIVSAWNVGELDKMALAPCHAFFQFYVADGKLSCQLYQRSCDVFLGLPFNIAS YALLVHMMAQQCDLEVGDFVWTGGDTHLYSNHMDQTHLQLSREPRPLPKLIIKRKPESIF DYRFEDFEIEGYDPHPGIKAPVAI >gi|296494627|gb|ADTN01000111.1| GENE 16 16575 - 17045 202 156 aa, chain + ## HITS:1 COG:ppdA KEGG:ns NR:ns ## COG: ppdA COG2165 # Protein_GI_number: 16130730 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, pseudopilin PulG # Organism: Escherichia coli K12 # 1 156 1 156 156 293 100.0 8e-80 MKTQRGYTLIETLVAMLILVMLSASGLYGWQYWQQSQRLWQTASQARDYLLYLREDANWH NRDHSISVIREGTLWCLVSSAAGANTCHGSSPLVFVPRWPEVEMSDLTPSLAFFGLRNTA WAGHIRFKNSTGEWWLVVSPWGRLRLCQQGETEGCL >gi|296494627|gb|ADTN01000111.1| GENE 17 17036 - 17599 660 187 aa, chain + ## HITS:1 COG:ppdB KEGG:ns NR:ns ## COG: ppdB COG4795 # Protein_GI_number: 16130729 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulJ # Organism: Escherichia coli K12 # 1 187 1 187 187 382 100.0 1e-106 MPVKEQGFSLLEVLIAMAISSVLLLGAARFLPALQRESLTSTRKLALEDEIWLRVFTVAK HLQRAGYCHGICTGEGLEIVGQGDCVIVQWDANSNGIWDREPVKESDQIGFRLKEHVLET LRGATSCEGKGWDKVTNPDAIIIDTFQVVRQDVSGFSPVLTVNMRAASKSEPQTVVNASY SVTGFNL >gi|296494627|gb|ADTN01000111.1| GENE 18 17596 - 18003 271 135 aa, chain + ## HITS:1 COG:no KEGG:B21_02633 NR:ns ## KEGG: B21_02633 # Name: ygdB # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 135 1 135 135 221 100.0 4e-57 MNREKGVSSLALVLMLLVLGSLLLQGMSQQDRSFASRVSMESQSLRRQAIVQSALAWGKM HSWQTQPAVQCSQYAETDAQVCLRLLADNEALLIAGYEGVSLWRTGEVIDGNIVFSPRGW SDFCPLKERALCQLP >gi|296494627|gb|ADTN01000111.1| GENE 19 17988 - 18311 83 107 aa, chain + ## HITS:1 COG:ppdC KEGG:ns NR:ns ## COG: ppdC COG4967 # Protein_GI_number: 16130727 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Tfp pilus assembly protein PilV # Organism: Escherichia coli K12 # 1 107 1 107 107 182 100.0 1e-46 
MSASLKNQQGFSLPEVMLAMVLMVMIVTALSGFQRTLMNSLASRNQYQQLWRHGWQQTQL RAISPPANWQVNRMQTSQAGCVSISVTLVSPGGREGEMTRLHCPNRQ >gi|296494627|gb|ADTN01000111.1| GENE 20 18324 - 21692 3059 1122 aa, chain + ## HITS:1 COG:recC KEGG:ns NR:ns ## COG: recC COG1330 # Protein_GI_number: 16130726 # Func_class: L Replication, recombination and repair # Function: Exonuclease V gamma subunit # Organism: Escherichia coli K12 # 1 1122 1 1122 1122 2226 100.0 0 MLRVYHSNRLDVLEALMEFIVERERLDDPFEPEMILVQSTGMAQWLQMTLSQKFGIAANI DFPLPASFIWDMFVRVLPEIPKESAFNKQSMSWKLMTLLPQLLEREDFTLLRHYLTDDSD KRKLFQLSSKAADLFDQYLVYRPDWLAQWETGHLVEGLGEAQAWQAPLWKALVEYTHQLG QPRWHRANLYQRFIETLESATTCPPGLPSRVFICGISALPPVYLQALQALGKHIEIHLLF TNPCRYYWGDIKDPAYLAKLLTRQRRHSFEDRELPLFRDSENAGQLFNSDGEQDVGNPLL ASWGKLGRDYIYLLSDLESSQELDAFVDVTPDNLLHNIQSDILELENRAVAGVNIEEFSR SDNKRPLDPLDSSITFHVCHSPQREVEVLHDRLLAMLEEDPTLTPRDIIVMVADIDSYSP FIQAVFGSAPADRYLPYAISDRRARQSHPVLEAFISLLSLPDSRFVSEDVLALLDVPVLA ARFDITEEGLRYLRQWVNESGIRWGIDDDNVRELELPATGQHTWRFGLTRMLLGYAMESA QGEWQSVLPYDESSGLIAELVGHLASLLMQLNIWRRGLAQERPLEEWLPVCRDMLNAFFL PDAETEAAMTLIEQQWQAIIAEGLGAQYGDAVPLSLLRDELAQRLDQERISQRFLAGPVN ICTLMPMRSIPFKVVCLLGMNDGVYPRQLAPLGFDLMSQKPKRGDRSRRDDDRYLFLEAL ISAQQKLYISYIGRSIQDNSERFPSVLVQELIDYIGQSHYLPGDEALNCDESEARVKAHL TCLHTRMPFDPQNYQPGERQSYAREWLPAASQAGKAHSEFVQPLPFTLPETVPLETLQRF WAHPVRAFFQMRLQVNFRTEDSEIPDTEPFILEGLSRYQINQQLLNALVEQDDAERLFRR FRAAGDLPYGAFGEIFWETQCQEMQQLADRVIACRQPGQSMEIDLACNGVQITGWLPQVQ PDGLLRWRPSLLSVAQGMQLWLEHLVYCASGGNGESRLFLRKDGEWRFPPLAAEQALHYL SQLIEGYREGMSAPLLVLPESGGAWLKTCYDAQNDAMLDDDSTLQKARTKFLQAYEGNMM VRGEGDDIWYQRLWRQLTPETMEAIVEQSQRFLLPLFRFNQS >gi|296494627|gb|ADTN01000111.1| GENE 21 21868 - 24756 2950 962 aa, chain + ## HITS:1 COG:ptr KEGG:ns NR:ns ## COG: ptr COG1025 # Protein_GI_number: 16130725 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Secreted/periplasmic Zn-dependent peptidases, insulinase-like # Organism: Escherichia coli K12 # 1 962 1 962 962 1854 100.0 0 
MPRSTWFKALLLLVALWAPLSQAETGWQPIQETIRKSDKDNRQYQAIRLDNGMVVLLVSD PQAVKSLSALVVPVGSLEDPEAYQGLAHYLEHMSLMGSKKYPQADSLAEYLKMHGGSHNA STAPYRTAFYLEVENDALPGAVDRLADAIAEPLLDKKYAERERNAVNAELTMARTRDGMR MAQVSAETINPAHPGSKFSGGNLETLSDKPGNPVQQALKDFHEKYYSANLMKAVIYSNKP LPELAKMAADTFGRVPNKESKKPEITVPVVTDAQKGIIIHYVPALPRKVLRVEFRIDNNS AKFRSKTDELITYLIGNRSPGTLSDWLQKQGLVEGISANSDPIVNGNSGVLAISASLTDK GLANRDQVVAAIFSYLNLLREKGIDKQYFDELANVLDIDFRYPSITRDMDYVEWLADTMI RVPVEHTLDAVNIADRYDAKAVKERLAMMTPQNARIWYISPKEPHNKTAYFVDAPYQVDK ISAQTFADWQKKAADIALSLPELNPYIPDDFSLIKSEKKYDHPELIVDESNLRVVYAPSR YFASEPKADVSLILRNPKAMDSARNQVMFALNDYLAGLALDQLSNQASVGGISFSTNANN GLMVNANGYTQRLPQLFQALLEGYFSYTATEDQLEQAKSWYNQMMDSAEKGKAFEQAIMP AQMLSQVPYFSRDERRKILPSITLKEVLAYRDALKSGARPEFMVIGNMTEAQATTLARDV QKQLGADGSEWCRNKDVVVDKKQSVIFEKAGNSTDSALAAVFVPTGYDEYTSSAYSSLLG QIVQPWFYNQLRTEEQLGYAVFAFPMSVGRQWGMGFLLQSNDKQPSFLWERYKAFFPTAE AKLRAMKPDEFAQIQQAVITQMLQAPQTLGEEASKLSKDFDRGNMRFDSRDKIVAQIKLL TPQKLADFFHQAVVEPQGMAILSQISGSQNGKAEYVHPEGWKVWENVSALQQTMPLMSEK NE >gi|296494627|gb|ADTN01000111.1| GENE 22 24749 - 28291 2875 1180 aa, chain + ## HITS:1 COG:recB KEGG:ns NR:ns ## COG: recB COG1074 # Protein_GI_number: 16130724 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) # Organism: Escherichia coli K12 # 1 1180 1 1180 1180 2297 99.0 0 MSDVAETLDPLRLPLQGERLIEASAGTGKTFTIAALYLRLLLGLGGSAAFPRPLTVEELL VVTFTEAATAELRGRIRSNIHELRIACLRETTDNPLYERLLEEIDDKAQAAQWLLLAERQ MDEAAVFTIHGFCQRMLNLNAFESGMLFEQQLIEDESLLRYQACADFWRRHCYPLPREIA QVVFETWKGPQALLRDINRYLQGEAPVIKAPPPDDETLASRHAQIVARIDTVKQQWRDAV GELDALIESSGIDRRKFNRSNQAKWIDKISAWAEEETNSYQLPESLEKFSQRFLEDRTKA GGGTPRHPLFEAIDQLLAEPLSIRDLVITRALAEIRETVAREKRRRGELGFDDMLSRLDS ALRSESGEVLAAAIRTRFPVAMIDEFQDTDPQQYRIFRRIWHHQPETALLLIGDPKQAIY AFRGADIFTYMKARSEVHAHYTLDTNWRSAPGMVNSVNKLFSQTDDAFMFREIPFIPVKS AGKNQALRFVFKGETQPAMKMWLMEGESCGVGDYQSTMAQVCAAQIRDWLQAGQRGEALL MNGDDARPVRASDISVLVRSRQEAAQVRDALTLLEIPSVYLSNRDSVFETLEAQEMLWLL 
QAVMTPERENTLRSALATSMMGLNALDIETLNNDEHAWDVVVEEFDGYRQIWRKRGVMPM LRALMSARNIAENLLATAGGERRLTDILHISELLQEAGTQLESEHALVRWLSQHILEPDS NASSQQMRLESDKHLVQIVTIHKSKGLEYPLVWLPFITNFRVQEQAFYHDRHSFEAVLDL NAAPESVDLAEAERLAEDLRLLYVALTRSVWHCSLGVAPLVRRRGDKKGDTDVHQSALGR LLQKGEPQDAAGLRTCIEALCDDDIAWQTAQTGDNQPWQVNDVSTAELNAKTLQRLPGDN WRVTSYSGLQQRGHGIAQDLMPRLDVDAAGVASVVEEPTLTPHQFPRGASPGTFLHSLFE DLDFTQPVDPNWVREKLELGGFESQWEPVLTEWITAVLQAPLNETGVSLSQLSARNKQVE MEFYLPISEPLIASQLDTLIRQFDPLSAGCPPLEFMQVRGMLKGFIDLVFRHEGRYYLLD YKSNWLGEDSSAYTQQAMAAAMQAHRYDLQYQLYTLALHRYLRHRIADYDYEHHFGGVIY LFLRGVDKEHPQQGIYTTRPNAGLIALMDEMFAGMTLEEA >gi|296494627|gb|ADTN01000111.1| GENE 23 28291 - 30117 1346 608 aa, chain + ## HITS:1 COG:recD KEGG:ns NR:ns ## COG: recD COG0507 # Protein_GI_number: 16130723 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Escherichia coli K12 # 1 608 1 608 608 1163 100.0 0 MKLQKQLLEAVEHKQLRPLDVQFALTVAGDEHPAVTLAAALLSHDAGEGHVCLPLSRLEN NEASHPLLATCVSEIGELQNWEECLLASQAVSRGDEPTPMILCGDRLYLNRMWCNERTVA RFFNEVNHAIEVDEALLAQTLDKLFPVSDEINWQKVAAAVALTRRISVISGGPGTGKTTT VAKLLAALIQMADGERCRIRLAAPTGKAAARLTESLGKALRQLPLTDEQKKRIPEDASTL HRLLGAQPGSQRLRHHAGNPLHLDVLVVDEASMIDLPMMSRLIDALPDHARVIFLGDRDQ LASVEAGAVLGDICAYANAGFTAERARQLSRLTGTHVPAGTGTEAASLRDSLCLLQKSYR FGSDSGIGQLAAAINRGDKTAVKTVFQQDFTDIEKRLLQSGEDYIAMLEEALAGYGRYLD LLQARAEPDLIIQAFNEYQLLCALREGPFGVAGLNERIEQFMQQKRKIHRHPHSRWYEGR PVMIARNDSALGLFNGDIGIALDRGQGTRVWFAMPDGNIKSVQPSRLPEHETTWAMTVHK SQGSEFDHAALILPSQRTPVVTRELVYTAVTRARRRLSLYADERILSAAIATRTERRSGL AALFSSRE >gi|296494627|gb|ADTN01000111.1| GENE 24 30179 - 31510 1319 443 aa, chain - ## HITS:1 COG:ECs3675_1 KEGG:ns NR:ns ## COG: ECs3675_1 COG0548 # Protein_GI_number: 15832929 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate kinase # Organism: Escherichia coli O157:H7 # 1 291 1 291 291 581 100.0 1e-166 MVKERKTELVEGFRHSVPYINTHRGKTFVIMLGGEAIEHENFSSIVNDIGLLHSLGIRLV 
VVYGARPQIDANLAAHHHEPLYHKNIRVTDAKTLELVKQAAGTLQLDITARLSMSLNNTP LQGAHINVVSGNFIIAQPLGVDDGVDYCHSGRIRRIDEDAIHRQLDSGAIVLMGPVAVSV TGESFNLTSEEIATQLAIKLKAEKMIGFCSSQGVTNDDGDIVSELFPNEAQARVEAQEEK GDYNSGTVRFLRGAVKACRSGVRRCHLISYQEDGALLQELFSRDGIGTQIVMESAEQIRR ATINDIGGILELIRPLEQQGILVRRSREQLEMEIDKFTIIQRDNTTIACAALYPFPEEKI GEMACVAVHPDYRSSSRGEVLLERIAAQAKQSGLSKLFVLTTRSIHWFQERGFTPVDIDL LPESKKQLYNYQRKSKVLMADLG >gi|296494627|gb|ADTN01000111.1| GENE 25 31742 - 32995 1174 417 aa, chain + ## HITS:1 COG:ECs3674 KEGG:ns NR:ns ## COG: ECs3674 COG0860 # Protein_GI_number: 15832928 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Escherichia coli O157:H7 # 1 417 31 447 447 783 100.0 0 MSGSNTAISRRRLLQGAGAMWLLSVSQVSLAAVSQVVAVRVWPASSYTRVTVESNRQLKY KQFALSNPERVVVDIEDVNLNSVLKGMAAQIRADDPFIKSARVGQFDPQTVRMVFELKQN VKPQLFALAPVAGFKERLVMDLYPANAQDMQDPLLALLEDYNKGDLEKQVPPAQSGPQPG KAGRDRPIVIMLDPGHGGEDSGAVGKYKTREKDVVLQIARRLRSLIEKEGNMKVYMTRNE DIFIPLQVRVAKAQKQRADLFVSIHADAFTSRQPSGSSVFALSTKGATSTAAKYLAQTQN ASDLIGGVSKSGDRYVDHTMFDMVQSLTIADSLKFGKAVLNKLGKINKLHKNQVEQAGFA VLKAPDIPSILVETAFISNVEEERKLKTATFQQEVAESILAGIKAYFADGATLARRG >gi|296494627|gb|ADTN01000111.1| GENE 26 33464 - 34561 1102 365 aa, chain + ## HITS:1 COG:ECs3673 KEGG:ns NR:ns ## COG: ECs3673 COG2821 # Protein_GI_number: 15832927 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-bound lytic murein transglycosylase # Organism: Escherichia coli O157:H7 # 1 365 1 365 365 740 100.0 0 MKGRWVKYLLMGTVVAMLAACSSKPTDRGQQYKDGKFTQPFSLVNQPDAVGAPINAGDFA EQINHIRNSSPRLYGNQSNVYNAVQEWLRAGGDTRNMRQFGIDAWQMEGADNYGNVQFTG YYTPVIQARHTRQGEFQYPIYRMPPKRGRLPSRAEIYAGALSDKYILAYSNSLMDNFIMD VQGSGYIDFGDGSPLNFFSYAGKNGHAYRSIGKVLIDRGEVKKEDMSMQAIRHWGETHSE AEVRELLEQNPSFVFFKPQSFAPVKGASAVPLVGRASVASDRSIIPPGTTLLAEVPLLDN NGKFNGQYELRLMVALDVGGAIKGQHFDIYQGIGPEAGHRAGWYNHYGRVWVLKTAPGAG NVFSG >gi|296494627|gb|ADTN01000111.1| GENE 27 34800 - 35606 867 268 aa, chain + ## HITS:1 COG:ygdL KEGG:ns NR:ns ## COG: ygdL COG1179 # 
Protein_GI_number: 16130719 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 # Organism: Escherichia coli K12 # 1 268 1 268 268 515 100.0 1e-146 MSVVISDAWRQRFGGTARLYGEKALQLFADAHICVVGIGGVGSWAAEALARTGIGAITLI DMDDVCVTNTNRQIHALRDNVGLAKAEVMAERIRQINPECRVTVVDDFVTPDNVAQYMSV GYSYVIDAIDSVRPKAALIAYCRRNKIPLVTTGGAGGQIDPTQIQVTDLAKTIQDPLAAK LRERLKSDFGVVKNSKGKLGVDCVFSTEALVYPQSDGTVCAMKATAEGPKRMDCASGFGA ATMVTATFGFVAVSHALKKMMAKAARQG >gi|296494627|gb|ADTN01000111.1| GENE 28 35657 - 36100 419 147 aa, chain - ## HITS:1 COG:ECs3671 KEGG:ns NR:ns ## COG: ECs3671 COG2166 # Protein_GI_number: 15832925 # Func_class: R General function prediction only # Function: SufE protein probably involved in Fe-S center assembly # Organism: Escherichia coli O157:H7 # 1 147 1 147 147 275 100.0 2e-74 MTNPQFAGHPFGTTVTAETLRNTFAPLTQWEDKYRQLIMLGKQLPALPDELKAQAKEIAG CENRVWLGYTVAENGKMHFFGDSEGRIVRGLLAVLLTAVEGKTAAELQAQSPLALFDELG LRAQLSASRSQGLNALSEAIIAATKQV >gi|296494627|gb|ADTN01000111.1| GENE 29 36100 - 37305 1042 401 aa, chain - ## HITS:1 COG:csdA KEGG:ns NR:ns ## COG: csdA COG0520 # Protein_GI_number: 16130717 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: Escherichia coli K12 # 1 401 1 401 401 791 100.0 0 MNVFNPAQFRAQFPALQDAGVYLDSAATALKPEAVVEATQQFYSLSAGNVHRSQFAEAQR LTARYEAAREKVAQLLNAPDDKTIVWTRGTTESINMVAQCYARPRLQPGDEIIVSVAEHH ANLVPWLMVAQQTGAKVVKLPLNAQRLPDVDLLPELITPRSRILALGQMSNVTGGCPDLA RAITFAHSAGMVVMVDGAQGAVHFPADVQQLDIDFYAFSGHKLYGPTGIGVLYGKSELLE AMSPWLGGGKMVHEVSFDGFTTQSAPWKLEAGTPNVAGVIGLSAALEWLADYDINQAESW SRSLATLAEDALAKRPGFRSFRCQDSSLLAFDFAGVHHSDMVTLLAEYGIALRAGQHCAQ PLLAELGVTGTLRASFAPYNTKSDVDALVNAVDRALELLVD >gi|296494627|gb|ADTN01000111.1| GENE 30 37497 - 37724 355 75 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_2919 NR:ns ## KEGG: ECIAI1_2919 # Name: ygdI # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 75 1 75 75 129 100.0 3e-29 
MKKTAAIISACMLTFALSACSGSNYVMHTNDGRTIVSDGKPQTDNDTGMISYKDANGNKQ QINRTDVKEMVELDQ >gi|296494627|gb|ADTN01000111.1| GENE 31 38075 - 38992 637 305 aa, chain + ## HITS:1 COG:ECs3668 KEGG:ns NR:ns ## COG: ECs3668 COG0583 # Protein_GI_number: 15832922 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 305 1 305 305 604 100.0 1e-173 MSKRLPPLNALRVFDAAARHLSFTRAAEELFVTQAAVSHQIKSLEDFLGLKLFRRRNRSL LLTEEGQSYFLDIKEIFSQLTEATRKLQARSAKGALTVSLLPSFAIHWLVPRLSSFNSAY PGIDVRIQAVDRQEDKLADDVDVAIFYGRGNWPGLRVEKLYAEYLLPVCSPLLLTGEKPL KTPEDLAKHTLLHDASRRDWQTYTRQLGLNHINVQQGPIFSHSAMVLQAAIHGQGVALAN NVMAQSEIEAGRLVCPFNDVLVSKNAFYLVCHDSQAELGKIAAFRQWILAKAAAEQEKFR FRYEQ >gi|296494627|gb|ADTN01000111.1| GENE 32 39011 - 39406 505 131 aa, chain + ## HITS:1 COG:ECs3667 KEGG:ns NR:ns ## COG: ECs3667 COG2363 # Protein_GI_number: 15832921 # Func_class: S Function unknown # Function: Uncharacterized small membrane protein # Organism: Escherichia coli O157:H7 # 1 131 1 131 131 204 100.0 3e-53 MTSRFMLIFAAISGFIFVALGAFGAHVLSKTMGAVEMGWIQTGLEYQAFHTLAILGLAVA MQRRISIWFYWSSVFLALGTVLFSGSLYCLALSHLRLWAFVTPVGGVSFLAGWALMLVGA IRLKRKGVSHE >gi|296494627|gb|ADTN01000111.1| GENE 33 39399 - 40499 1150 366 aa, chain + ## HITS:1 COG:ECs3666 KEGG:ns NR:ns ## COG: ECs3666 COG2933 # Protein_GI_number: 15832920 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferase # Organism: Escherichia coli O157:H7 # 1 366 1 366 366 780 100.0 0 MNKVVLLCRPGFEKECAAEITDKAGQREIFGFARVKENAGYVIYECYQPDDGDKLIRELP FSSLIFARQWFVVGELLQHLPPEDRITPIVGMLQGVVEKGGELRVEVADTNESKELLKFC RKFTVPLRAALRDAGVLANYETPKRPVVHVFFIAPGCCYTGYSYSNNNSPFYMGIPRLKF PADAPSRSTLKLEEAFHVFIPADEWDERLANGMWAVDLGACPGGWTYQLVKRNMWVYSVD NGPMAQSLMDTGQVTWLREDGFKFRPTRSNISWMVCDMVEKPAKVAALMAQWLVNGWCRE TIFNLKLPMKKRYEEVSHNLAYIQAQLDEHGINAQIQARQLYHDREEVTVHVRRIWAAVG GRRDER >gi|296494627|gb|ADTN01000111.1| GENE 34 40543 - 41274 528 243 aa, chain - ## HITS:1 COG:fucR KEGG:ns NR:ns ## COG: fucR COG1349 # 
Protein_GI_number: 16130712 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli K12 # 1 243 1 243 243 471 100.0 1e-133 MKAARQQAIVDLLLNHTSLTTEALSEQLKVSKETIRRDLNELQTQGKILRNHGRAKYIHR QNQDSGDPFHIRLKSHYAHKADIAREALAWIEEGMVIALDASSTCWYLARQLPDINIQVF TNSHPICHELGKRERIQLISSGGTLERKYGCYVNPSLISQLKSLEIDLFIFSCEGIDSSG ALWDSNAINADYKSMLLKRAAQSLLLIDKSKFNRSGEARIGHLDEVTHIISDERQVATSL VTA >gi|296494627|gb|ADTN01000111.1| GENE 35 41332 - 41754 387 140 aa, chain - ## HITS:1 COG:ECs3664 KEGG:ns NR:ns ## COG: ECs3664 COG4154 # Protein_GI_number: 15832918 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose dissimilation pathway protein FucU # Organism: Escherichia coli O157:H7 # 1 140 1 140 140 268 100.0 2e-72 MLKTISPLISPELLKVLAEMGHGDEIIFSDAHFPAHSMGPQVIRADGLLVSDLLQAIIPL FELDSYAPPLVMMAAVEGDTLDPEVERRYRNALSLQAPCPDIIRINRFAFYERAQKAFAI VITGERAKYGNILLKKGVTP >gi|296494627|gb|ADTN01000111.1| GENE 36 41756 - 43174 1022 472 aa, chain - ## HITS:1 COG:fucK KEGG:ns NR:ns ## COG: fucK COG1070 # Protein_GI_number: 16130710 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Escherichia coli K12 # 1 472 11 482 482 961 100.0 0 MKQEVILVLDCGATNVRAIAVNRQGKIVARASTPNASDIAMENNTWHQWSLDAILQRFAD CCRQINSELTECHIRGIAVTTFGVDGALVDKQGNLLYPIISWKCPRTAAVMDNIERLISA QRLQAISGVGAFSFNTLYKLVWLKENHPQLLERAHAWLFISSLINHRLTGEFTTDITMAG TSQMLDIQQRDFSPQILQATGIPRRLFPRLVEAGEQIGTLQNSAAAMLGLPVGIPVISAG HDTQFALFGAGAEQNEPVLSSGTWEILMVRSAQVDTSLLSQYAGSTCELDSQAGLYNPGM QWLASGVLEWVRKLFWTAETPWQMLIEEARLIAPGADGVKMQCDLLSCQNAGWQGVTLNT TRGHFYRAALEGLTAQLQRNLQMLEKIGHFKASELLLVGGGSRNTLWNQIKANMLDIPVK VLDDAETTVAGAALFGWYGVGEFNSPEEARAQIHYQYRYFYPQTEPEFIEEV >gi|296494627|gb|ADTN01000111.1| GENE 37 43283 - 45058 1852 591 aa, chain - ## HITS:1 COG:fucI KEGG:ns NR:ns ## COG: fucI COG2407 # Protein_GI_number: 16130709 # Func_class: G Carbohydrate transport and metabolism # Function: 
L-fucose isomerase and related proteins # Organism: Escherichia coli K12 # 1 591 1 591 591 1222 100.0 0 MKKISLPKIGIRPVIDGRRMGVRESLEEQTMNMAKATAALLTEKLRHACGAAVECVISDT CIAGMAEAAACEEKFSSQNVGLTITVTPCWCYGSETIDMDPTRPKAIWGFNGTERPGAVY LAAALAAHSQKGIPAFSIYGHDVQDADDTSIPADVEEKLLRFARAGLAVASMKGKSYLSL GGVSMGIAGSIVDHNFFESWLGMKVQAVDMTELRRRIDQKIYDEAELEMALAWADKNFRY GEDENNKQYQRNAEQSRAVLRESLLMAMCIRDMMQGNSKLADIGRVEESLGYNAIAAGFQ GQRHWTDQYPNGDTAEAILNSSFDWNGVREPFVVATENDSLNGVAMLMGHQLTGTAQVFA DVRTYWSPEAIERVTGHKLDGLAEHGIIHLINSGSAALDGSCKQRDSEGNPTMKPHWEIS QQEADACLAATEWCPAIHEYFRGGGYSSRFLTEGGVPFTMTRVNIIKGLGPVLQIAEGWS VELPKDVHDILNKRTNSTWPTTWFAPRLTGKGPFTDVYSVMANWGANHGVLTIGHVGADF ITLASMLRIPVCMHNVEETKVYRPSAWAAHGMDIEGQDYRACQNYGPLYKR >gi|296494627|gb|ADTN01000111.1| GENE 38 45091 - 46407 837 438 aa, chain - ## HITS:1 COG:fucP KEGG:ns NR:ns ## COG: fucP COG0738 # Protein_GI_number: 16130708 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Escherichia coli K12 # 1 438 1 438 438 808 100.0 0 MGNTSIQTQSYRAVDKDAGQSRSYIIPFALLCSLFFLWAVANNLNDILLPQFQQAFTLTN FQAGLIQSAFYFGYFIIPIPAGILMKKLSYKAGIITGLFLYALGAALFWPAAEIMNYTLF LVGLFIIAAGLGCLETAANPFVTVLGPESSGHFRLNLAQTFNSFGAIIAVVFGQSLILSN VPHQSQDVLDKMSPEQLSAYKHSLVLSVQTPYMIIVAIVLLVALLIMLTKFPALQSDNHS DAKQGSFSASLSRLARIRHWRWAVLAQFCYVGAQTACWSYLIRYAVEEIPGMTAGFAANY LTGTMVCFFIGRFTGTWLISRFAPHKVLAAYALIAMALCLISAFAGGHVGLIALTLCSAF MSIQYPTIFSLGIKNLGQDTKYGSSFIVMTIIGGGIVTPVMGFVSDAAGNIPTAELIPAL CFAVIFIFARFRSQTATN >gi|296494627|gb|ADTN01000111.1| GENE 39 46954 - 47601 628 215 aa, chain + ## HITS:1 COG:fucA KEGG:ns NR:ns ## COG: fucA COG0235 # Protein_GI_number: 16130707 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli K12 # 1 215 1 215 215 431 100.0 1e-121 MERNKLARQIIDTCLEMTRLGLNQGTAGNVSVRYQDGMLITPTGIPYEKLTESHIVFIDG NGKHEEGKLPSSEWRFHMAAYQSRPDANAVVHNHAVHCTAVSILNRSIPAIHYMIAAAGG 
NSIPCAPYATFGTRELSEHVALALKNRKATLLQHHGLIACEVNLEKALWLAHEVEVLAQL YLTTLAITDPVPVLSDEEIAVVLEKFKTYGLRIEE >gi|296494627|gb|ADTN01000111.1| GENE 40 47629 - 48777 1288 382 aa, chain + ## HITS:1 COG:ECs3659 KEGG:ns NR:ns ## COG: ECs3659 COG1454 # Protein_GI_number: 15832913 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Escherichia coli O157:H7 # 1 382 2 383 383 745 100.0 0 MANRMILNETAWFGRGAVGALTDEVKRRGYQKALIVTDKTLVQCGVVAKVTDKMDAAGLA WAIYDGVVPNPTITVVKEGLGVFQNSGADYLIAIGGGSPQDTCKAIGIISNNPEFADVRS LEGLSPTNKPSVPILAIPTTAGTAAEVTINYVITDEEKRRKFVCVDPHDIPQVAFIDADM MDGMPPALKAATGVDALTHAIEGYITRGAWALTDALHIKAIEIIAGALRGSVAGDKDAGE EMALGQYVAGMGFSNVGLGLVHGMAHPLGAFYNTPHGVANAILLPHVMRYNADFTGEKYR DIARVMGVKVEGMSLEEARNAAVEAVFALNRDVGIPPHLRDVGVRKEDIPALAQAALDDV CTGGNPREATLEDIVELYHTAW >gi|296494627|gb|ADTN01000111.1| GENE 41 48832 - 49587 533 251 aa, chain - ## HITS:1 COG:exo KEGG:ns NR:ns ## COG: exo COG0258 # Protein_GI_number: 16130705 # Func_class: L Replication, recombination and repair # Function: 5'-3' exonuclease (including N-terminal domain of PolI) # Organism: Escherichia coli K12 # 1 251 31 281 281 519 99.0 1e-147 MAVHLLIVDALNLIRRIHAVQGSPCVETCQHALDQLIMHSQPTHAVAVFDDENRSSGWRH QRLPDYKAGRPPMPEELHDEMPALRAAFEQRGVPCWSTSGNEADDLAATLAVKVTQAGHQ ATIVSTDKGYCQLLSPTLRIRDYFQKRWLDAPFIDKEFGVQPQQLPDYWGLAGISSSKVP GVAGIGPKSATQLLVEFQSLEGIYENLDAVAEKWRKKLETHKEMAFLCRDIARLQTDLHI DGNLQQLRLVR >gi|296494627|gb|ADTN01000111.1| GENE 42 49699 - 51066 1443 455 aa, chain - ## HITS:1 COG:sdaB KEGG:ns NR:ns ## COG: sdaB COG1760 # Protein_GI_number: 16130704 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Escherichia coli K12 # 1 455 1 455 455 910 99.0 0 MISVFDIFKIGIGPSSSHTVGPMKAGKQFTDDLIARNLLKDVTRVVVDVYGSLSLTGKGH HTDIAIIMGLAGNLPDTVDIDSIPSFIQDVNTHGRLMLANGQHEVEFPVDQCMNFHADNL SLHENGMRITALAGDKVVYSHTYYSIGGGFIVDEEHFGQQDSAPVEVPYPYSSAADLQKH CQETGLSLSGLMMKNELALHSKEELEQHLANVWEVMRGGIERGISTEGVLPGKLRVPRRA 
AALRRMLVSQDKTTTDPMAVVDWINMFALAVNEENAAGGRVVTAPTNGACGIIPAVLAYY DKFIREVNANSLARYLLVASAIGSLYKMNASISGAEVGCQGEVGVACSMAAAGLAELLGA SPAQVCIAAEIAMEHNLGLTCDPVAGQVQVPCIERNAIAAVKAVNAARMALRRTSEPRVC LDKVIETMYETGKDMNAKYRETSRGGLAMKIVACD >gi|296494627|gb|ADTN01000111.1| GENE 43 51124 - 52413 1519 429 aa, chain - ## HITS:1 COG:ECs3656 KEGG:ns NR:ns ## COG: ECs3656 COG0814 # Protein_GI_number: 15832910 # Func_class: E Amino acid transport and metabolism # Function: Amino acid permeases # Organism: Escherichia coli O157:H7 # 1 429 1 429 429 733 100.0 0 METTQTSTIASKDSRSAWRKTDTMWMLGLYGTAIGAGVLFLPINAGVGGMIPLIIMAILA FPMTFFAHRGLTRFVLSGKNPGEDITEVVEEHFGIGAGKLITLLYFFAIYPILLVYSVAI TNTVESFMSHQLGMTPPPRAILSLILIVGMMTIVRFGEQMIVKAMSILVFPFVGVLMLLA LYLIPQWNGAALETLSLDTASATGNGLWMTLWLAIPVMVFSFNHSPIISSFAVAKREEYG DMAEQKCSKILAFAHIMMVLTVMFFVFSCVLSLTPADLAAAKEQNISILSYLANHFNAPV IAWMAPIIAIIAITKSFLGHYLGAREGFNGMVIKSLRGKGKSIEINKLNRITALFMLVTT WIVATLNPSILGMIETLGGPIIAMILFLMPMYAIQKVPAMRKYSGHISNVFVVVMGLIAI SAIFYSLFS >gi|296494627|gb|ADTN01000111.1| GENE 44 52970 - 54334 1377 454 aa, chain - ## HITS:1 COG:ECs3655 KEGG:ns NR:ns ## COG: ECs3655 COG1611 # Protein_GI_number: 15832909 # Func_class: R General function prediction only # Function: Predicted Rossmann fold nucleotide-binding protein # Organism: Escherichia coli O157:H7 # 1 454 1 454 454 931 100.0 0 MITHISPLGSMDMLSQLEVDMLKRTASSDLYQLFRNCSLAVLNSGSLTDNSKELLSRFEN FDINVLRRERGVKLELINPPEEAFVDGRIIRALQANLFAVLRDILFVYGQIHNTVRFPNL NLDNSVHITNLVFSILRNARALHVGEAPNMVVCWGGHSINENEYLYARRVGNQLGLRELN ICTGCGPGAMEAPMKGAAVGHAQQRYKDSRFIGMTEPSIIAAEPPNPLVNELIIMPDIEK RLEAFVRIAHGIIIFPGGVGTAEELLYLLGILMNPANKDQVLPLILTGPKESADYFRVLD EFVVHTLGENARRHYRIIIDDAAEVARQMKKSMPLVKENRRDTGDAYSFNWSMRIAPDLQ MPFEPSHENMANLKLYPDQPVEVLAADLRRAFSGIVAGNVKEVGIRAIEEFGPYKINGDK EIMRRMDDLLQGFVAQHRMKLPGSAYIPCYEICT >gi|296494627|gb|ADTN01000111.1| GENE 45 54446 - 55294 635 282 aa, chain - ## HITS:1 COG:yqcD_2 KEGG:ns NR:ns ## COG: yqcD_2 COG0780 # Protein_GI_number: 16130701 # Func_class: R General function prediction only # 
Function: Enzyme related to GTP cyclohydrolase I # Organism: Escherichia coli K12 # 138 282 1 145 145 308 100.0 1e-83 MSSYANHQALAGLTLGKSTDYRDTYDASLLQGVPRSLNRDPLGLKADNLPFHGTDIWTLY ELSWLNAKGLPQVAVGHVELDYTSVNLIESKSFKLYLNSFNQTRFNNWDEVRQTLERDLS TCAQGKISVALYRLDELEGQPIGHFNGTCIDDQDITIDNYEFTTDYLENATCGEKVVEET LVSHLLKSNCLITHQPDWGSLQIQYRGRQIDREKLLRYLVSFRHHNEFHEQCVERIFNDL LRFCQPEKLSVYARYTRRGGLDINPWRSNSDFVPSTTRLVRQ >gi|296494627|gb|ADTN01000111.1| GENE 46 55362 - 55907 511 181 aa, chain + ## HITS:1 COG:no KEGG:ECO103_3336 NR:ns ## KEGG: ECO103_3336 # Name: syd # Def: SecY interacting protein Syd # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 181 1 181 181 367 100.0 1e-101 MDDLTAQALKDFTARYCDAWHEEHKSWPLSEELYGVPSPCIISTTEDAVYWQPQPFTGEQ NVNAVERAFDIVIQPTIHTFYTTQFAGDMHAQFGDIKLTLLQTWSEDDFRRVQENLIGHL VTQKRLKLPPTLFIATLEEELEVISVCNLSGEVCKETLGTRKRTHLASNLAEFLNQLKPL L >gi|296494627|gb|ADTN01000111.1| GENE 47 56529 - 56858 374 109 aa, chain + ## HITS:1 COG:yqcC KEGG:ns NR:ns ## COG: yqcC COG3098 # Protein_GI_number: 16130699 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 109 1 109 109 176 100.0 9e-45 MTTHDRVRLQLQALEALLREHQHWRNDEPQPHQFNSTQPFFMDTMEPLEWLQWVLIPRMH DLLDNKQPLPGAFAVAPYYEMALATDHPQRALILAELEKLDALFADDAS >gi|296494627|gb|ADTN01000111.1| GENE 48 56858 - 57640 192 260 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family [Lactobacillus jensenii 269-3] # 3 219 82 276 287 78 27 8e-14 MLEILYQDEWLVAVNKPSGWLVHRSWLDRDEKVVVMQTVRDQIGQHVFTAHRLDRPTSGV LLMGLSSEAGRLLAQQFEQHQIQKRYHAIVRGWLMEEAVLDYPLVEELDKIADKFAREDK GPQPAVTHYRGLATVEMPVATGRYPTTRYGLVELEPKTGRKHQLRRHLAHLRHPIIGDSK HGDLRQNRSGAEHFGLQRLMLHASQLSLTHPFTGEPLTIHAGLDDTWMQALSQFGWRGLL PENERVEFSAPSGQDGEISS >gi|296494627|gb|ADTN01000111.1| GENE 49 57658 - 58107 603 149 aa, chain + ## HITS:1 COG:ECs3650 KEGG:ns NR:ns ## COG: ECs3650 COG0716 # Protein_GI_number: 15832904 # Func_class: C Energy production and conversion 
# Function: Flavodoxins # Organism: Escherichia coli O157:H7 # 1 149 1 149 149 293 100.0 9e-80 MAEIGIFVGTMYGNSLLVAEEAEAILTAQGHKATVFEDPELSDWLPYQDKYVLVVTSTTG QGDLPDSIVPLFQGIKDSLGFQPNLRYGVIALGDSSYVNFCNGGKQFDALLQEQSAQRVG EMLLIDASENPEPETESNPWVEQWGTLLS >gi|296494627|gb|ADTN01000111.1| GENE 50 58542 - 59894 1531 450 aa, chain + ## HITS:1 COG:ygcZ KEGG:ns NR:ns ## COG: ygcZ COG0477 # Protein_GI_number: 16130696 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 450 1 450 450 819 100.0 0 MSSLSQAASSVEKRTNARYWIVVMLFIVTSFNYGDRATLSIAGSEMAKDIGLDPVGMGYV FSAFSWAYVIGQIPGGWLLDRFGSKRVYFWSIFIWSMFTLLQGFVDIFSGFGIIVALFTL RFLVGLAEAPSFPGNSRIVAAWFPAQERGTAVSIFNSAQYFATVIFAPIMGWLTHEVGWS HVFFFMGGLGIVISFIWLKVIHEPNQHPGVNKKELEYIAAGGALINMDQQNTKVKVPFSV KWGQIKQLLGSRMMIGVYIGQYCINALTYFFITWFPVYLVQARGMSILKAGFVASVPAVC GFIGGVLGGIISDWLMRRTGSLNIARKTPIVMGMLLSMVMVFCNYVNVEWMIIGFMALAF FGKGIGALGWAVMADTAPKEISGLSGGLFNMFGNISGIVTPIAIGYIVGTTGSFNGALIY VGVHALIAVLSYLVLVGDIKRIELKPVAGQ >gi|296494627|gb|ADTN01000111.1| GENE 51 59896 - 61236 1497 446 aa, chain + ## HITS:1 COG:ygcY KEGG:ns NR:ns ## COG: ygcY COG4948 # Protein_GI_number: 16130695 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Escherichia coli K12 # 1 446 1 446 446 905 99.0 0 MATQSSPVITDMKVIPVAGHDSMLLNIGGAHNAYFTRNIVVLTDNAGHTGIGEAPGGDVI YQTLVDAIPMVLGQEVARLNKVVQQVHKGNQAADFDTFGKGAWTFELRVNAVAALEAALL DLLGKALNVPVCELLGPGKQREAITVLGYLFYIGDRTKTDLPYVENTPGNHEWYQLRHQK AMNSEAVVRLAEASQDRYGFKDFKLKGGVLPGEQEIDTVRALKKRFPDARITVDPNGAWL LDEAISLCKGLNDVLTYAEDPCGAEQGFSGREVMAEFRRATGLPVATNMIATNWREMGHA VMLNAVDIPLADPHFWTLSGAVHVAQLCDDWGLTWGCHSNNHFDISLAMFTHVGAAAPGN PTAIDTHWIWQEGDCRLTQNPLEIKNGKIAVPDAPGLGVELDWEQVQKAHEAYKRLPGGA 
RNDAGPMQYLIPGWTFDRKRPVFGRH >gi|296494627|gb|ADTN01000111.1| GENE 52 61257 - 62597 1716 446 aa, chain + ## HITS:1 COG:ECs3647 KEGG:ns NR:ns ## COG: ECs3647 COG4948 # Protein_GI_number: 15832901 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Escherichia coli O157:H7 # 1 446 1 446 446 931 100.0 0 MSSQFTTPVVTEMQVIPVAGHDSMLMNLSGAHAPFFTRNIVIIKDNSGHTGVGEIPGGEK IRKTLEDAIPLVVGKTLGEYKNVLTLVRNTFADRDAGGRGLQTFDLRTTIHVVTGIEAAM LDLLGQHLGVNVASLLGDGQQRSEVEMLGYLFFVGNRKATPLPYQSQPDDSCDWYRLRHE EAMTPDAVVRLAEAAYEKYGFNDFKLKGGVLAGEEEAESIVALAQRFPQARITLDPNGAW SLNEAIKIGKYLKGSLAYAEDPCGAEQGFSGREVMAEFRRATGLPTATNMIATDWRQMGH TLSLQSVDIPLADPHFWTMQGSVRVAQMCHEFGLTWGSHSNNHFDISLAMFTHVAAAAPG KITAIDTHWIWQEGNQRLTKEPFEIKGGLVQVPEKPGLGVEIDMDQVMKAHELYQKHGLG ARDDAMGMQYLIPGWTFDNKRPCMVR Prediction of potential genes in microbial genomes Time: Sun May 15 23:31:00 2011 Seq name: gi|296494626|gb|ADTN01000112.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont249.4, whole genome shotgun sequence Length of sequence - 11418 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 4, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 202 - 2958 2525 ## COG0642 Signal transduction histidine kinase - Prom 3016 - 3075 3.6 + Prom 2746 - 2805 1.6 2 2 Op 1 8/0.000 + CDS 3015 - 4316 1013 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase 3 2 Op 2 3/1.000 + CDS 4364 - 6598 2318 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases + Term 6611 - 6644 1.1 4 3 Op 1 8/0.000 + CDS 6676 - 6924 280 ## COG2336 Growth regulator 5 3 Op 2 3/1.000 + CDS 6924 - 7259 217 ## COG2337 Growth inhibitor 6 3 Op 3 5/1.000 + CDS 7330 - 8121 1053 ## COG1694 Predicted pyrophosphatase + Term 8135 - 8188 1.1 + Prom 8241 - 8300 5.3 7 4 Op 1 8/0.000 + CDS 8349 - 9986 1710 ## COG0504 CTP synthase (UTP-ammonia lyase) + Term 10005 - 10042 9.1 8 4 Op 2 . + CDS 10074 - 11372 1487 ## COG0148 Enolase Predicted protein(s) >gi|296494626|gb|ADTN01000112.1| GENE 1 202 - 2958 2525 918 aa, chain - ## HITS:1 COG:ECs3646_1 KEGG:ns NR:ns ## COG: ECs3646_1 COG0642 # Protein_GI_number: 15832900 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli O157:H7 # 1 531 1 531 531 1031 99.0 0 MTNYSLRARMMILILAPTVLIGLLLSIFFVVHRYNDLQRQLEDAGASIIEPLAVSTEYGM SLQNRESIGQLISVLHRRHSDIVRAISVYDENNRLFVTSNFHLDPSSMQLGSNVPFPRQL TVTRDGDIMILRTPIISESYSPDESPSSDAKNSQNMLGYIALELDLKSVRLQQYKEIFIS SVMMLFCIGIALIFGWRLMRDVTGPIRNMVNTVDRIRRGQLDSRVEGFMLGELDMLKNGI NAMAMSLAAYHEEMQHNIDQATSDLRETLEQMEIQNVELDLAKKRAQEAARIKSEFLANM SHELRTPLNGVIGFTRLTLKTELTPTQRDHLNTIERSANNLLAIINDVLDFSKLEAGKLI LESIPFPLRSTLDEVVTLLAHSSHDKGLELTLNIKSDVPDNVIGDPLRLQQIITNLVGNA IKFTENGNIDILVEKRALSNTKVQIEVQIRDTGIGIPERDQSRLFQAFRQADASISRRHG GTGLGLVITQKLVNEMGGDISFHSQPNRGSTFWFHINLDLNPNIIIEGPSTQCLAGKRLA YVEPNSAAAQCTLDILSETPLEVVYSPTFSALPPAHYDMMLLGIAVTFREPLTMQHERLA KAVSMTDFLMLALPCHAQVNAEKLKQDGIGACLLKPLTPTRLLPALTEFCHHKQNTLLPV TDESKLAMTVMAVDDNPANLKLIGALLEDMVQHVELCDSGHQAVERAKQMPFDLILMDIQ MPDMDGIRACELIHQLPHQQQTPVIAVTAHAMAGQKEKLLGAGMSDYLAKPIEEERLHNL 
LLRYKPGSGISSRVVTPEVNEIVVNPNATLDWQLALRQAAGKTDLARDMLQMLLDFLPEV RNKVEEQLVGENPEGLVDLIHKLHGSCGYSGVPRMKNLCQLIEQQLRSGTKEEDLEPELL ELLDEMDNVAREASKILG >gi|296494626|gb|ADTN01000112.1| GENE 2 3015 - 4316 1013 433 aa, chain + ## HITS:1 COG:ygcA KEGG:ns NR:ns ## COG: ygcA COG2265 # Protein_GI_number: 16130692 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Escherichia coli K12 # 1 433 1 433 433 877 100.0 0 MAQFYSAKRRTTTRQIITVSVNDLDSFGQGVARHNGKTLFIPGLLPQENAEVTVTEDKKQ YARAKVVRRLSDSPERETPRCPHFGVCGGCQQQHASVDLQQRSKSAALARLMKHDVSEVI ADVPWGYRRRARLSLNYLPKTQQLQMGFRKAGSSDIVDVKQCPILAPQLEALLPKVRACL GSLQAMRHLGHVELVQATSGTLMILRHTAPLSSADREKLERFSHSEGLDLYLAPDSEILE TVSGEMPWYDSNGLRLTFSPRDFIQVNAGVNQKMVARALEWLDVQPEDRVLDLFCGMGNF TLPLATQAASVVGVEGVPALVEKGQQNARLNGLQNVTFYHENLEEDVTKQPWAKNGFDKV LLDPARAGAAGVMQQIIKLEPIRIVYVSCNPATLARDSEALLKAGYTIARLAMLDMFPHT GHLESMVLFSRVK >gi|296494626|gb|ADTN01000112.1| GENE 3 4364 - 6598 2318 744 aa, chain + ## HITS:1 COG:ECs3644 KEGG:ns NR:ns ## COG: ECs3644 COG0317 # Protein_GI_number: 15832898 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Escherichia coli O157:H7 # 1 744 1 744 744 1509 100.0 0 MVAVRSAHINKAGEFDPEKWIASLGITSQKSCECLAETWAYCLQQTQGHPDASLLLWRGV EMVEILSTLSMDIDTLRAALLFPLADANVVSEDVLRESVGKSVVNLIHGVRDMAAIRQLK ATHTDSVSSEQVDNVRRMLLAMVDDFRCVVIKLAERIAHLREVKDAPEDERVLAAKECTN IYAPLANRLGIGQLKWELEDYCFRYLHPTEYKRIAKLLHERRLDREHYIEEFVGHLRAEM KAEGVKAEVYGRPKHIYSIWRKMQKKNLAFDELFDVRAVRIVAERLQDCYAALGIVHTHY RHLPDEFDDYVANPKPNGYQSIHTVVLGPGGKTVEIQIRTKQMHEDAELGVAAHWKYKEG AAAGGARSGHEDRIAWLRKLIAWQEEMADSGEMLDEVRSQVFDDRVYVFTPKGDVVDLPA GSTPLDFAYHIHSDVGHRCIGAKIGGRIVPFTYQLQMGDQIEIITQKQPNPSRDWLNPNL GYVTTSRGRSKIHAWFRKQDRDKNILAGRQILDDELEHLGISLKEAEKHLLPRYNFNDVD ELLAAIGGGDIRLNQMVNFLQSQFNKPSAEEQDAAALKQLQQKSYTPQNRSKDNGRVVVE GVGNLMHHIARCCQPIPGDEIVGFITQGRGISVHRADCEQLAELRSHAPERIVDAVWGES 
YSAGYSLVVRVVANDRSGLLRDITTILANEKVNVLGVASRSDTKQQLATIDMTIEIYNLQ VLGRVLGKLNQVPDVIDARRLHGS >gi|296494626|gb|ADTN01000112.1| GENE 4 6676 - 6924 280 82 aa, chain + ## HITS:1 COG:ECs3643 KEGG:ns NR:ns ## COG: ECs3643 COG2336 # Protein_GI_number: 15832897 # Func_class: T Signal transduction mechanisms # Function: Growth regulator # Organism: Escherichia coli O157:H7 # 1 82 1 82 82 148 100.0 2e-36 MIHSSVKRWGNSPAVRIPATLMQALNLNIDDEVKIDLVDGKLIIEPVRKEPVFTLAELVN DITPENLHENIDWGEPKDKEVW >gi|296494626|gb|ADTN01000112.1| GENE 5 6924 - 7259 217 111 aa, chain + ## HITS:1 COG:ECs3642 KEGG:ns NR:ns ## COG: ECs3642 COG2337 # Protein_GI_number: 15832896 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Escherichia coli O157:H7 # 1 111 1 111 111 230 100.0 4e-61 MVSRYVPDMGDLIWVDFDPTKGSEQAGHRPAVVLSPFMYNNKTGMCLCVPCTTQSKGYPF EVVLSGQERDGVALADQVKSIAWRARGATKKGTVAPEELQLIKAKINVLIG >gi|296494626|gb|ADTN01000112.1| GENE 6 7330 - 8121 1053 263 aa, chain + ## HITS:1 COG:ECs3641 KEGG:ns NR:ns ## COG: ECs3641 COG1694 # Protein_GI_number: 15832895 # Func_class: R General function prediction only # Function: Predicted pyrophosphatase # Organism: Escherichia coli O157:H7 # 1 263 1 263 263 478 100.0 1e-135 MNQIDRLLTIMQRLRDPENGCPWDKEQTFATIAPYTLEETYEVLDAIAREDFDDLRGELG DLLFQVVFYAQMAQEEGRFDFNDICAAISDKLERRHPHVFADSSAENSSEVLARWEQIKT EERAQKAQHSALDDIPRSLPALMRAQKIQKRCANVGFDWTTLGPVVDKVYEEIDEVMYEA RQAVVDQAKLEEEMGDLLFATVNLARHLGTKAEIALQKANEKFERRFREVERIVAARGLE MTGVDLETMEEVWQQVKRQEIDL >gi|296494626|gb|ADTN01000112.1| GENE 7 8349 - 9986 1710 545 aa, chain + ## HITS:1 COG:ECs3640 KEGG:ns NR:ns ## COG: ECs3640 COG0504 # Protein_GI_number: 15832894 # Func_class: F Nucleotide transport and metabolism # Function: CTP synthase (UTP-ammonia lyase) # Organism: Escherichia coli O157:H7 # 1 545 1 545 545 1101 100.0 0 MTTNYIFVTGGVVSSLGKGIAAASLAAILEARGLNVTIMKLDPYINVDPGTMSPIQHGEV FVTEDGAETDLDLGHYERFIRTKMSRRNNFTTGRIYSDVLRKERRGDYLGATVQVIPHIT NAIKERVLEGGEGHDVVLVEIGGTVGDIESLPFLEAIRQMAVEIGREHTLFMHLTLVPYM 
AASGEVKTKPTQHSVKELLSIGIQPDILICRSDRAVPANERAKIALFCNVPEKAVISLKD VDSIYKIPGLLKSQGLDDYICKRFSLNCPEANLSEWEQVIFEEANPVSEVTIGMVGKYIE LPDAYKSVIEALKHGGLKNRVSVNIKLIDSQDVETRGVEILKGLDAILVPGGFGYRGVEG MITTARFARENNIPYLGICLGMQVALIDYARHVANMENANSTEFVPDCKYPVVALITEWR DENGNVEVRSEKSDLGGTMRLGAQQCQLVDDSLVRQLYNAPTIVERHRHRYEVNNMLLKQ IEDAGLRVAGRSGDDQLVEIIEVPNHPWFVACQFHPEFTSTPRDGHPLFAGFVKAASEFQ KRQAK >gi|296494626|gb|ADTN01000112.1| GENE 8 10074 - 11372 1487 432 aa, chain + ## HITS:1 COG:ECs3639 KEGG:ns NR:ns ## COG: ECs3639 COG0148 # Protein_GI_number: 15832893 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Escherichia coli O157:H7 # 1 432 1 432 432 760 100.0 0 MSKIVKIIGREIIDSRGNPTVEAEVHLEGGFVGMAAAPSGASTGSREALELRDGDKSRFL GKGVTKAVAAVNGPIAQALIGKDAKDQAGIDKIMIDLDGTENKSKFGANAILAVSLANAK AAAAAKGMPLYEHIAELNGTPGKYSMPVPMMNIINGGEHADNNVDIQEFMIQPVGAKTVK EAIRMGSEVFHHLAKVLKAKGMNTAVGDEGGYAPNLGSNAEALAVIAEAVKAAGYELGKD ITLAMDCAASEFYKDGKYVLAGEGNKAFTSEEFTHFLEELTKQYPIVSIEDGLDESDWDG FAYQTKVLGDKIQLVGDDLFVTNTKILKEGIEKGIANSILIKFNQIGSLTETLAAIKMAK DAGYTAVISHRSGETEDATIADLAVGTAAGQIKTGSMSRSDRVAKYNQLIRIEEALGEKA PYNGRKEIKGQA Prediction of potential genes in microbial genomes Time: Sun May 15 23:31:09 2011 Seq name: gi|296494625|gb|ADTN01000113.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont249.5, whole genome shotgun sequence Length of sequence - 24043 bp Number of predicted genes - 28, with homology - 27 Number of transcription units - 12, operones - 5 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 869 -327 ## COG1512 Beta-propeller domains of methanol dehydrogenase type 2 1 Op 2 . - CDS 883 - 1023 187 ## ECBD_0952 hypothetical protein - Prom 1084 - 1143 4.2 + Prom 877 - 936 7.4 3 2 Op 1 . + CDS 1162 - 1833 220 ## PROTEIN SUPPORTED gi|157803532|ref|YP_001492081.1| 50S ribosomal protein L35 4 2 Op 2 . + CDS 1857 - 1940 69 ## 5 3 Tu 1 . - CDS 2330 - 2536 114 ## ECUMN_3078 hypothetical protein 6 4 Tu 1 . 
- CDS 2650 - 3687 524 ## COG2234 Predicted aminopeptidases - Prom 3721 - 3780 3.9 + Prom 3702 - 3761 4.2 7 5 Op 1 18/0.000 + CDS 3939 - 4847 1098 ## COG0175 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes 8 5 Op 2 7/0.000 + CDS 4849 - 6276 1679 ## COG2895 GTPases - Sulfate adenylate transferase subunit 1 9 5 Op 3 . + CDS 6276 - 6881 566 ## COG0529 Adenylylsulfate kinase and related kinases 10 5 Op 4 . + CDS 6931 - 7254 512 ## EcSMS35_2875 hypothetical protein + Prom 7263 - 7322 3.4 11 6 Op 1 11/0.000 + CDS 7448 - 7759 210 ## COG2919 Septum formation initiator 12 6 Op 2 19/0.000 + CDS 7778 - 8488 315 ## PROTEIN SUPPORTED gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 13 6 Op 3 8/0.000 + CDS 8488 - 8967 833 ## COG0245 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase 14 6 Op 4 8/0.000 + CDS 8964 - 10013 1104 ## COG0585 Uncharacterized conserved protein 15 6 Op 5 13/0.000 + CDS 9994 - 10755 693 ## COG0496 Predicted acid phosphatase 16 6 Op 6 11/0.000 + CDS 10749 - 11375 558 ## COG2518 Protein-L-isoaspartate carboxylmethyltransferase + Term 11427 - 11485 -0.7 + Prom 11426 - 11485 2.1 17 6 Op 7 8/0.000 + CDS 11515 - 12654 589 ## COG0739 Membrane proteins related to metalloendopeptidases 18 6 Op 8 . + CDS 12717 - 13709 1154 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) + Term 13728 - 13763 6.5 - Term 13759 - 13790 3.4 19 7 Op 1 1/1.000 - CDS 13803 - 15167 1214 ## COG2610 H+/gluconate symporter and related permeases 20 7 Op 2 4/0.500 - CDS 15256 - 16032 779 ## COG3622 Hydroxypyruvate isomerase 21 7 Op 3 6/0.000 - CDS 16037 - 16675 479 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 22 7 Op 4 5/0.000 - CDS 16672 - 17934 1025 ## COG3395 Uncharacterized protein conserved in bacteria 23 7 Op 5 . 
- CDS 17931 - 18839 957 ## COG2084 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases - Prom 18921 - 18980 4.9 + Prom 18825 - 18884 5.8 24 8 Tu 1 . + CDS 19035 - 19802 617 ## COG1349 Transcriptional regulators of sugar metabolism + Term 19824 - 19866 4.5 - Term 19792 - 19840 -0.9 25 9 Tu 1 . - CDS 19853 - 20509 396 ## COG0639 Diadenosine tetraphosphatase and related serine/threonine protein phosphatases - Prom 20531 - 20590 8.2 26 10 Tu 1 . - CDS 20615 - 23176 3146 ## COG0249 Mismatch repair ATPase (MutS family) + Prom 23369 - 23428 7.4 27 11 Tu 1 . + CDS 23463 - 23816 312 ## B21_02547 hypothetical protein + Term 23819 - 23856 7.8 - Term 23805 - 23844 7.4 28 12 Tu 1 . - CDS 23853 - 24041 131 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains Predicted protein(s) >gi|296494625|gb|ADTN01000113.1| GENE 1 2 - 869 -327 289 aa, chain - ## HITS:1 COG:ygcG KEGG:ns NR:ns ## COG: ygcG COG1512 # Protein_GI_number: 16130685 # Func_class: R General function prediction only # Function: Beta-propeller domains of methanol dehydrogenase type # Organism: Escherichia coli K12 # 1 262 24 285 313 489 99.0 1e-138 MRYFILMFTFVCSFVAAQPTIVPQLQQQVTDLTSSSNSQEKKELTHKLESIFNNTQVQIA VLIVPTTKDETIEQYATRVFDNWRLGDAKRNDGILIVVAWSDRTVRIQVGYGLEEKVTDA LAGDIIRSNMIPAFKQQKLAKGLELAINALNNQLTSQHQYPTNPSESESASSSDHYYFAI FWVFAVMFFPFWFFHQGSNFCRACKSGVCISAIYLLDLFLFSDKIFSIAVFSFFFTFTIF MVFTCLCVLQKRASGRSYHSDNSGSAGGSDSGGFSGGGGGCFGGGGGGG >gi|296494625|gb|ADTN01000113.1| GENE 2 883 - 1023 187 46 aa, chain - ## HITS:1 COG:no KEGG:ECBD_0952 NR:ns ## KEGG: ECBD_0952 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_BL21_DE3 # Pathway: not_defined # 1 46 1 46 46 67 100.0 2e-10 MSEENKENGFNHVKTFTKIIFIFSVLVFNDNEYKITDAAVNLFIQI >gi|296494625|gb|ADTN01000113.1| GENE 3 1162 - 1833 220 223 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157803532|ref|YP_001492081.1| 50S ribosomal protein L35 [Rickettsia canadensis str. 
McKiel] # 5 222 20 224 225 89 30 2e-17 MQYPINEMFQTLQGEGYFTGVPAIFIRLQGCPVGCAWCDTKHTWEKLEDREVSLFSILAK TKESDKWGAASSEDLLAVIGRQGYTARHVVITGGEPCIHDLLPLTDLLEKNGFSCQIETS GTHEVRCTPNTWVTVSPKLNMRGGYEVLSQALERANEIKHPVGRVRDIEALDELLATLTD DKPRVIALQPISQKDDATRLCIETCIARNWRLSMQTHKYLNIA >gi|296494625|gb|ADTN01000113.1| GENE 4 1857 - 1940 69 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGGTCKDFYSFILFVASENSSILPIQG >gi|296494625|gb|ADTN01000113.1| GENE 5 2330 - 2536 114 68 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_3078 NR:ns ## KEGG: ECUMN_3078 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 11 63 93 145 172 65 64.0 4e-10 MNVGSDRVFTRFIPAGAGNTVLDHFKFTIDSVYPRWRGEHGRTEYKAVYLLGLSPLARGT LALQQQHP >gi|296494625|gb|ADTN01000113.1| GENE 6 2650 - 3687 524 345 aa, chain - ## HITS:1 COG:ECs3607 KEGG:ns NR:ns ## COG: ECs3607 COG2234 # Protein_GI_number: 15832861 # Func_class: R General function prediction only # Function: Predicted aminopeptidases # Organism: Escherichia coli O157:H7 # 1 345 1 345 345 694 99.0 0 MFSALRHRTAALALGVCFILPVHASSPKPGDFANTQARHIATFFPGRMTGTPAEMLSADY IRQQFQQMGYRSDIRTFNSRYIYTARDNRKNWHNVTGSTVIAAHEGKAPQQIIIMAHLDT YAPLSDADADANLGGLTLQGMDDNAAGLGVMLELAERLKNTPTEYGIRFVATSGEEEGKL GAENLLKRMSDTEKKNTLLVINLDNLIVGDKLYFNSGVKTPEAVRKLTRDRALAIARSHG TAATTNPGLNKNYPKGTGCCNDAEIFDKAGIAVLSVEATNWNLGNKDGYQQRAKTAAFPA GNSWHDVRLDNQQHIDKALPGRIERRCRDVMRIMLPLVKELAKAS >gi|296494625|gb|ADTN01000113.1| GENE 7 3939 - 4847 1098 302 aa, chain + ## HITS:1 COG:cysD KEGG:ns NR:ns ## COG: cysD COG0175 # Protein_GI_number: 16130659 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes # Organism: Escherichia coli K12 # 1 302 1 302 302 625 99.0 1e-179 MDQIRLTHLRQLEAESIHIIREVAAEFSNPVMLYSIGKDSSVMLHLARKAFYPGTLPFPL LHVDTGWKFREMYEFRDRTAKAYGCELLVHKNPEGVAMGINPFVHGSAKHTDIMKTEGLK 
QALNKYGFDAAFGGARRDEEKSRAKERIYSFRDRFHRWDPKNQRPELWHNYNGQINKGES IRVFPLSNWTEQDIWQYIWLENIDIVPLYLAVERPVLERDGMLMMIDDNRIDLQPGEVIK KRMVRFRTLGCWPLTGAVESNAQTLPEIIEEMLVSTTSERQGRVIDRDQAGSMELKKRQG YF >gi|296494625|gb|ADTN01000113.1| GENE 8 4849 - 6276 1679 475 aa, chain + ## HITS:1 COG:cysN KEGG:ns NR:ns ## COG: cysN COG2895 # Protein_GI_number: 16130658 # Func_class: P Inorganic ion transport and metabolism # Function: GTPases - Sulfate adenylate transferase subunit 1 # Organism: Escherichia coli K12 # 1 475 1 475 475 943 99.0 0 MNTALAQQIANEGGVEAWMIAQQHKSLLRFLTCGSVDDGKSTLIGRLLHDTRQIYEDQLS SLHNDSKRHGTQGEKLDLALLVDGLQAEREQGITIDVAYRYFSTEKRKFIIADTPGHEQY TRNMATGASTCELAILLIDARKGVLDQTRRHSFISTLLGIKHLVVAINKMDLVDYSEKTF TRIREDYLTFAGQLPGNLDIRFVPLSALEGDNVASQSESMAWYSGPTLLEVLETVEIQRV VDAQPMRFPVQYVNRPNLDFRGYAGTLASGRVEVGQRVKVLPSGVESNVARIVTFDGDRE EAFAGEAITLVLTDEIDISRGDLLLAADEALPAVQSASVDVVWMAEQPLSPGQSYDIKIA GKKTRARVDGIRYQVDINNLTQREVENLPLNGIGLVDLTFDEPLVLDRYQQNPVTGGLIF IDRLSNVTVGAGMVHEPVSQATAAPSEFSAFELELNALVRRHFPHWGARDLLGDK >gi|296494625|gb|ADTN01000113.1| GENE 9 6276 - 6881 566 201 aa, chain + ## HITS:1 COG:ECs3604 KEGG:ns NR:ns ## COG: ECs3604 COG0529 # Protein_GI_number: 15832858 # Func_class: P Inorganic ion transport and metabolism # Function: Adenylylsulfate kinase and related kinases # Organism: Escherichia coli O157:H7 # 1 201 1 201 201 395 99.0 1e-110 MALHDENVVWHSHPVTVQQRELHHGHRGVVLWFTGLSGSGKSTVAGALEEALHKLGVSTY LLDGDNVRHGLCSDLGFSDADRKENIRRVGEVANLMVEAGLVVLTAFISPHRAERQMVRE RVGEGRFIEVFVDTPLAICEARDPKGLYKKARAGELRYFTGIDSVYEAPESAEIHLNGEQ LVTNLVQQLLDLLRQNDIIRS >gi|296494625|gb|ADTN01000113.1| GENE 10 6931 - 7254 512 107 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_2875 NR:ns ## KEGG: EcSMS35_2875 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 107 1 107 107 176 100.0 2e-43 MRNSHNITLTNNDSLTEDEETTWSLPGAVVGFISWLFALAMPMLIYGSNTLFFFIYTWPF FLALMPVAVVVGIALHSLMDGKLRYSIVFTLVTVGIMFGALFMWLLG >gi|296494625|gb|ADTN01000113.1| GENE 11 7448 - 7759 210 103 aa, chain 
+ ## HITS:1 COG:ECs3602 KEGG:ns NR:ns ## COG: ECs3602 COG2919 # Protein_GI_number: 15832856 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation initiator # Organism: Escherichia coli O157:H7 # 16 103 16 103 103 164 100.0 4e-41 MGKLTLLLLAILVWLQYSLWFGKNGIHDYTRVNDDVAAQQATNAKLKARNDQLFAEIDDL NGGQEALEERARNELSMTRPGETFYRLVPDASKRAQSAGQNNR >gi|296494625|gb|ADTN01000113.1| GENE 12 7778 - 8488 315 236 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 [Bacillus selenitireducens MLS10] # 10 227 6 223 234 125 37 2e-28 MATTHLDVCAVVPAAGFGRRMQTECPKQYLSIGNQTILEHSVHALLAHPRVKRVVIAISP GDSRFAQLPLANHPQITVVDGGDERADSVLAGLKAAGDAQWVLVHDAARPCLHQDDLARL LALSETSRTGGILAAPVRDTMKRAEPGKNAIAHTVDRNGLWHALTPQFFPRELLHDCLTR ALNEGATITDEASALEYCGFHPQLVEGRADNIKVTRPEDLALAEFYLTRTIHQENT >gi|296494625|gb|ADTN01000113.1| GENE 13 8488 - 8967 833 159 aa, chain + ## HITS:1 COG:ECs3600 KEGG:ns NR:ns ## COG: ECs3600 COG0245 # Protein_GI_number: 15832854 # Func_class: I Lipid transport and metabolism # Function: 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase # Organism: Escherichia coli O157:H7 # 1 159 1 159 159 288 100.0 3e-78 MRIGHGFDVHAFGGEGPIIIGGVRIPYEKGLLAHSDGDVALHALTDALLGAAALGDIGKL FPDTDPAFKGADSRELLREAWRRIQAKGYTLGNVDVTIIAQAPKMLPHIPQMRVFIAEDL GCHMDDVNVKATTTEKLGFTGRGEGIACEAVALLIKATK >gi|296494625|gb|ADTN01000113.1| GENE 14 8964 - 10013 1104 349 aa, chain + ## HITS:1 COG:ygbO KEGG:ns NR:ns ## COG: ygbO COG0585 # Protein_GI_number: 16130652 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 349 1 349 349 692 99.0 0 MIEFDNLTYLHGKPQGTGLLKANPEDFVVVEDLGFEPDGEGEHILVRILKNGCNTRFVAD ALAKFLKIHAREVSFAGQKDKHAVTEQWLCARVPGKEMPDLSAFQLEGCQVLEYARHKRK LRLGALKGNAFTLVLREVSNRDDVEQRLNDICVKGVPNYFGAQRFGIGGSNLQGAQRWAQ TNTPVRDRNKRSFWLSAARSALFNQIVAERLKKADVNQVVDGDALQLAGRGSWFVATTEE LAELQRRVNDKELMITAALPGSGEWGTQREALAFEQAAVAAETELQALLVREKVEAARRA 
MLLYPQQLSWNWWDDVTVEIRFWLPAGSFATSVVRELINTTGDYAHIAE >gi|296494625|gb|ADTN01000113.1| GENE 15 9994 - 10755 693 253 aa, chain + ## HITS:1 COG:ECs3598 KEGG:ns NR:ns ## COG: ECs3598 COG0496 # Protein_GI_number: 15832852 # Func_class: R General function prediction only # Function: Predicted acid phosphatase # Organism: Escherichia coli O157:H7 # 1 253 1 253 253 505 100.0 1e-143 MRILLSNDDGVHAPGIQTLAKALREFADVQVVAPDRNRSGASNSLTLESSLRTFTFENGD IAVQMGTPTDCVYLGVNALMRPRPDIVVSGINAGPNLGDDVIYSGTVAAAMEGRHLGFPA LAVSLDGHKHYDTAAAVTCSILRALCKEPLRTGRILNINVPDLPLDQIKGIRVTRCGTRH PADQVIPQQDPRGNTLYWIGPPGGKCDAGPGTDFAAVDEGYVSITPLHVDLTAHSAQDVV SDWLNSVGVGTQW >gi|296494625|gb|ADTN01000113.1| GENE 16 10749 - 11375 558 208 aa, chain + ## HITS:1 COG:ECs3597 KEGG:ns NR:ns ## COG: ECs3597 COG2518 # Protein_GI_number: 15832851 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Protein-L-isoaspartate carboxylmethyltransferase # Organism: Escherichia coli O157:H7 # 1 208 1 208 208 395 99.0 1e-110 MVSRRVQALLDQLRAQGIQDEQVLNALAAVPREKFVDEAFEQKAWDNIALPIGQGQTISQ PYMVARMTELLELTPQSRVLGIGTGSGYQTAILAHLVQHVCSVERIKGLQWQARRRLKNL DLHNVSTRHGDGWQGWQARAPFDAIIVTAAPPEIPTALMTQLDEGGILVLPVGEEHQYLK RVRRRGGEFIIDTVEAVRFVPLVKGELA >gi|296494625|gb|ADTN01000113.1| GENE 17 11515 - 12654 589 379 aa, chain + ## HITS:1 COG:ECs3596 KEGG:ns NR:ns ## COG: ECs3596 COG0739 # Protein_GI_number: 15832850 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Escherichia coli O157:H7 # 1 379 1 379 379 553 99.0 1e-157 MSAGSPKFTVRRIAALSLVSLWLAGCSDTSNPPAPVSSVNGNAPANTNSGMLITPPPKMG TTSTAQQPQIQPVQQPQIQATQQPQIQPVQPVAQQPVQMENGRIVYNRQYGNIPKGSYSG STYTVKKGDTLFYIAWITGNDFRDLAQRNNIQAPYALNVGQTLQVGNASGTPITGGNAIT QADAAEQGVVIKPAQNSTVAVASQPTITYSESSGEQSANKMLPNNKPAATTVTAPVTVPT ASTTEPTVSSTSTSTPISTWRWPTEGKVIETFGASEGGNKGIDIAGSKGQAIIATADGRV VYAGNALRGYGNLIIIKHNDDYLSAYAHNDTMLVREQQEVKAGQKIATMGSTGTSSTRLH FEIRYKGKSVNPLRYLPQR >gi|296494625|gb|ADTN01000113.1| GENE 18 
12717 - 13709 1154 330 aa, chain + ## HITS:1 COG:ECs3595 KEGG:ns NR:ns ## COG: ECs3595 COG0568 # Protein_GI_number: 15832849 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Escherichia coli O157:H7 # 1 330 1 330 330 576 100.0 1e-164 MSQNTLKVHDLNEDAEFDENGVEVFDEKALVEEEPSDNDLAEEELLSQGATQRVLDATQL YLGEIGYSPLLTAEEEVYFARRALRGDVASRRRMIESNLRLVVKIARRYGNRGLALLDLI EEGNLGLIRAVEKFDPERGFRFSTYATWWIRQTIERAIMNQTRTIRLPIHIVKELNVYLR TARELSHKLDHEPSAEEIAEQLDKPVDDVSRMLRLNERITSVDTPLGGDSEKALLDILAD EKENGPEDTTQDDDMKQSIVKWLFELNAKQREVLARRFGLLGYEAATLEDVGREIGLTRE RVRQIQVEGLRRLREILQTQGLNIEALFRE >gi|296494625|gb|ADTN01000113.1| GENE 19 13803 - 15167 1214 454 aa, chain - ## HITS:1 COG:ygbN KEGG:ns NR:ns ## COG: ygbN COG2610 # Protein_GI_number: 16130647 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Escherichia coli K12 # 1 454 1 454 454 687 99.0 0 MSTITLLCIALTGVIMLLLLVIKAKVQPFVALLLVSLLVALAAGIPAGEVGKVMIAGMGG VLGSVTIIIGLGAMLGRMIEHSGGAESLANYFSRKLGDKRTIAALTLAAFFLGIPVFFDV GFIILAPIIYGFAKVAKISPLKFGLPVAGIMLTVHVAVPPHPGPVAAAGLLHADIGWLTI IGIAISIPVGVVGYFAAKIINKRQYAMSVEVLEQMQLAPASEEGATKLSDKINPPGVALV TSLIVIPIAIIMAGTVSATLMPPSHPLLGTLQLIGSPMVALMIALVLAFWLLALRRGWSL QHTSDIMGSALPTAAVVILVTGAGGVFGKVLVESGVGKALANMLQMIDLPLLPAAFIISL ALRASQGSATVAILTTGGLLSEAVMGLNPIQCVLVTLAACFGGLGASHINDSGFWIVTKY LGLSVADGLKTWTVLTTILGFTGFLITWCVWAVI >gi|296494625|gb|ADTN01000113.1| GENE 20 15256 - 16032 779 258 aa, chain - ## HITS:1 COG:ygbM KEGG:ns NR:ns ## COG: ygbM COG3622 # Protein_GI_number: 16130646 # Func_class: G Carbohydrate transport and metabolism # Function: Hydroxypyruvate isomerase # Organism: Escherichia coli K12 # 1 258 1 258 258 528 99.0 1e-150 MPRFAANLSMMFTEVPFIERFAAARKAGFDAVEFLFPYDYSTLQIQKQLEQNHLTLALFN TAPGDINAGEWGLSALPGREHEAHADIDLALEYALALNCEQVHVMAGVVPAGEDAERYRA VFIDNLRYAADRFAPHGKRILVEALSPGVKPHYLFSSQYQALAIVEEVARDNVFIQLDTF 
HAQKVDGNLTHLIRDYAGKYAHVQIAGLPDRHEPDDGEINYPWLFRLFDEVGYQGWIGCE YKPRGLTEEGLGWFDAWR >gi|296494625|gb|ADTN01000113.1| GENE 21 16037 - 16675 479 212 aa, chain - ## HITS:1 COG:ygbL KEGG:ns NR:ns ## COG: ygbL COG0235 # Protein_GI_number: 16130645 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli K12 # 1 212 1 212 212 432 100.0 1e-121 MSDFAKVEQSLREEMTRIASSFFQRGYATGSAGNLSLLLPDGNLLATPTGSCLGNLDPQR LSKVAADGEWLSGDKPSKEVLFHLALYRNNPRCKAVVHLHSTWSTALSCLQGLDSSNVIR PFTPYVVMRMGNVPLVPYYRPGDKRIAQDLAELAADNQAFLLANHGPVVCGESLQEAANN MEELEETAKLIFILGDRPIRYLTAGEIAELRS >gi|296494625|gb|ADTN01000113.1| GENE 22 16672 - 17934 1025 420 aa, chain - ## HITS:1 COG:ygbK KEGG:ns NR:ns ## COG: ygbK COG3395 # Protein_GI_number: 16130644 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 385 1 385 388 741 99.0 0 MIKIGVIADDFTGATDIASFLVENGLPTVQINGVPTGKMPEAIDALVISLKTRSCPVVEA TQQSLAALSWLQQQGCKQIYFKYCSTFDSTAKGNIGPVTDALMDALDTPFTVFSPALPVN GRTVYQGYLFVMNQLLAESGMRHHPVNPMTDSYLPRLVEAQSTGRCGVVSAHVFEQGVDA VRQELARLQQEGYRYAVLDALTEHHLEIQGEALRDAPLVTGGSGLAIGLARQWAQENGNQ ARKAGRPLAGRGVVLSGSCSQMTNRQVAHYRQIAPAREVDVARCLSIETLAAYAHELAEW VLGQESVLAPLVFATASTDALAAIQQQYGAQKASQAVETLFSQLAARLAAEGVTRFIVAG GETSGVVTQSLGIKGFHIGPTISPGVPWVNALDKPVSLALKSGNFGDDAFFSRAQREFLS >gi|296494625|gb|ADTN01000113.1| GENE 23 17931 - 18839 957 302 aa, chain - ## HITS:1 COG:ygbJ KEGG:ns NR:ns ## COG: ygbJ COG2084 # Protein_GI_number: 16130643 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases # Organism: Escherichia coli K12 # 1 302 1 302 302 493 99.0 1e-139 MKTGSEFHVGIVGLGSMGMGAALSCVRAGLSTWGADLNSNACATLKEAGACGVSDNAATF AEKLDALLVLVVNAAQVKQVLFGETGVAQHLKPGTAVMVSSTIASADAQEIATALAGFDL EMLDAPVSGGAVKAANGEMTIMASGSDIAFERLAPVLEAVAGKVYRIGAEPGLGSTVKII 
HQLLAGVHIAAGAEAMALAARAGIPLDVMYDVVTNAAGNSWMFENRMRHVVDGDYTPHSA VDIFVKDLGLVADTAKALHFPLPLASTALNMFTSASNAGYGKEDDSAVIKIFSGITLPGA KS >gi|296494625|gb|ADTN01000113.1| GENE 24 19035 - 19802 617 255 aa, chain + ## HITS:1 COG:ygbI KEGG:ns NR:ns ## COG: ygbI COG1349 # Protein_GI_number: 16130642 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli K12 # 1 255 11 265 265 464 99.0 1e-130 MIPVERRQIILEMVAEKGIVSIAELTDRMNVSHMTIRRDLQKLEQQGAVVLVSGGVQSPG RVAHEPSHQVKTALAMTQKAAIGKLAASLVQPGSCIYLDAGTTTLAIAQHLIHMESLTVV TNDFVIADYLLDNSNCTIIHTGGAVCRENRSCVGEAAATMLRSLMIDQAFISASSWSVRG ISTPAEDKVTVKRAIASASRQRVLVCDATKYGQVATWLALPLSEFDQIITDDGLPESASR ALVKQDLSLLVAKNE >gi|296494625|gb|ADTN01000113.1| GENE 25 19853 - 20509 396 218 aa, chain - ## HITS:1 COG:pphB KEGG:ns NR:ns ## COG: pphB COG0639 # Protein_GI_number: 16130641 # Func_class: T Signal transduction mechanisms # Function: Diadenosine tetraphosphatase and related serine/threonine protein phosphatases # Organism: Escherichia coli K12 # 1 218 1 218 218 441 99.0 1e-124 MPSTRYQKINAHHYRHIWVVGDIHGEYQLLQSRLHQLSFFPEIDLLISVGDNIDRGPESL DVLRLLNQPWFTSVKGNHEAMALEAFETGDGNMWLASGGDWFFDLNDSEQQEAIDLLLKF HHLPHIIEITNDNIKYAIAHADYPGSEYLFGKEIAESELLWPVDRVQKSLNGELQQINGA DYFIFGHMMFDNIQTFANQIYIDTGSPNSGRLSFYKIK >gi|296494625|gb|ADTN01000113.1| GENE 26 20615 - 23176 3146 853 aa, chain - ## HITS:1 COG:mutS KEGG:ns NR:ns ## COG: mutS COG0249 # Protein_GI_number: 16130640 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Escherichia coli K12 # 1 853 1 853 853 1615 99.0 0 MSAIENFDAHTPMMQQYLRLKAQHPEILLFYRMGDFYELFYDDAKRASQLLDISLTKRGA SAGEPIPMAGIPYHAVENYLAKLVNQGESVAICEQIGDPATSKGPVERKVVRIVTPGTIS DEALLQERQDNLLAAIWQDSKGFGYATLDISSGRFRLSEPADRETMAAELQRTNPAELLY AEDFAEMSLIEGRRGLRRRPLWEFEIDTARQQLNLQFGTRDLVGFGVENAPRGLCAAGCL LQYAKDTQRTTLPHIRSITMEREQDSIIMDAATRRNLEITQNLAGGAENTLASVLDCTVT 
PMGSRMLKRWLHMPVRDTRVLLERQQTIGALQDFTAGLQPVLRQVGDLERILARLALRTA RPRDLARMRHAFQQLPELRAQLETVDSAPVQALREKMGEFAELRDLLERAIIDTPPVLVR DGGVIASGYNEELDEWRALADGATDYLERLEVRERERTGLDTLKVGFNAVHGYYIQISRG QSHLAPINYMRRQTLKNAERYIIPELKEYEDKVLTSKGKALALEKQLYEELFDLLLPHLE ALQQSASALAELDVLVNLAERAYTLNYTCPTFIDKPGIRITEGRHPVVEQVLNEPFIANP LNLSPQRRMLIITGPNMGGKSTYMRQTALIALMAYIGSYVPAQKVEIGPIDRIFTRVGAA DDLASGRSTFMVEMTKTANILHNATEYSLVLMDEIGRGTSTYDGLSLAWACAENLANKIK ALTLFATHYFELTQLPEKMEGVANVHLDALEHGDTIAFMHSVQDGAASKSYGLAVAALAG VPKEVIKRARQKLRELESISPNAAATQVDGTQMSLLSVPEETSPAVEALENLDPDSLTPR QALEWIYRLKSLV >gi|296494625|gb|ADTN01000113.1| GENE 27 23463 - 23816 312 117 aa, chain + ## HITS:1 COG:no KEGG:B21_02547 NR:ns ## KEGG: B21_02547 # Name: ygbA # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 117 1 117 117 199 100.0 2e-50 MSGKRISREKLTIKKMIDLYQAKCPQASAEPEHYEALFVYAQKRLDKCVFGEEKPACKQC PVHCYQPAKREEMKQIMRWAGPRMLWRHPILTVRHLIDDKRPVPELPEKYRPKKPHE >gi|296494625|gb|ADTN01000113.1| GENE 28 23853 - 24041 131 62 aa, chain - ## HITS:1 COG:fhlA KEGG:ns NR:ns ## COG: fhlA COG3604 # Protein_GI_number: 16130638 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Escherichia coli K12 # 2 62 632 692 692 119 100.0 2e-27 PPPAATVVALEGEDEYQLIVRVLKETNGVVAGPKGAAQRLGLKRTTLLSRMKRLGIDKSA LI Prediction of potential genes in microbial genomes Time: Sun May 15 23:31:26 2011 Seq name: gi|296494624|gb|ADTN01000114.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont249.6, whole genome shotgun sequence Length of sequence - 16952 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 5, operones - 3 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 3/1.000 - CDS 2 - 1931 2047 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains - Term 1937 - 1965 3.0 2 2 Op 1 4/0.000 - CDS 2005 - 3015 1189 ## COG0309 
Hydrogenase maturation factor 3 2 Op 2 13/0.000 - CDS 3012 - 4133 1078 ## COG0409 Hydrogenase maturation factor 4 2 Op 3 8/0.000 - CDS 4133 - 4405 344 ## COG0298 Hydrogenase maturation factor 5 2 Op 4 11/0.000 - CDS 4396 - 5268 875 ## COG0378 Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase 6 2 Op 5 . - CDS 5272 - 5622 158 ## COG0375 Zn finger protein HypA/HybF (possibly regulating hydrogenase expression) - Prom 5764 - 5823 3.9 + Prom 5723 - 5782 4.6 7 3 Tu 1 . + CDS 5834 - 6295 421 ## ECO103_3263 formate hydrogenlyase regulatory protein HycA + Term 6344 - 6394 10.1 8 4 Op 1 4/0.000 + CDS 6420 - 7031 315 ## COG1142 Fe-S-cluster-containing hydrogenase components 2 9 4 Op 2 10/0.000 + CDS 7028 - 8854 2122 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit 10 4 Op 3 7/0.000 + CDS 8857 - 9780 1479 ## COG0650 Formate hydrogenlyase subunit 4 11 4 Op 4 5/0.000 + CDS 9798 - 11507 2314 ## COG3261 Ni,Fe-hydrogenase III large subunit 12 4 Op 5 6/0.000 + CDS 11517 - 12059 525 ## COG1143 Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) 13 4 Op 6 . + CDS 12059 - 12826 919 ## COG3260 Ni,Fe-hydrogenase III small subunit 14 4 Op 7 . + CDS 12823 - 13233 484 ## B21_02533 hypothetical protein 15 4 Op 8 . + CDS 13259 - 13696 657 ## COG0680 Ni,Fe-hydrogenase maturation factor + Term 13768 - 13805 3.5 - Term 13754 - 13792 5.1 16 5 Op 1 8/0.000 - CDS 13855 - 15279 1571 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 17 5 Op 2 . 
- CDS 15288 - 16745 1858 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific - Prom 16841 - 16900 2.6 Predicted protein(s) >gi|296494624|gb|ADTN01000114.1| GENE 1 2 - 1931 2047 643 aa, chain - ## HITS:1 COG:fhlA KEGG:ns NR:ns ## COG: fhlA COG3604 # Protein_GI_number: 16130638 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Escherichia coli K12 # 1 642 1 642 692 1281 99.0 0 MSYTPMSDLGQQGLFDITRTLLQQPDLASLCEALSQLVKRSALADNAAIVLWQAQTQRAS YYASREKDTPIKYEDETVLAHGPVRSILSRPDTLHCSYEEFCETWPQLDAGGLYPKFGHY CLMPLAAEGHIFGGCEFIRYDDRPWSEKEFNRLQTFTQIVSVVTEQIQSRVVNNVDYELL CRERDNFRILVAITNAVLSRLDMDELVSEVAKEIHYYFDIDDISIVLRSHRKNKLNIYST HYLDKQHPAHEQSEVDEAGTLTERVFKSKEMLLINLHERDDLAPYERMLFDTWGNQIQTL CLLPLMSGDTMLGVLKLAQCEEKVFTTTNLNLLRQIAERVAIAVDNALAYQEIHRLKERL VDENLALTEQLNNVDSEFGEIIGRSEAMYSVLKQVEMVAQSDSTVLILGETGTGKELIAR AIHNLSGRNNRRMVKMNCAAMPAGLLESDLFGHERGAFTGASAQRIGRFELADKSSLFLD EVGDMPLELQPKLLRVLQEQEFERLGSNKIIQTDVRLIAATNRDLKKMVADREFRSDLYY RLNVFPIHLPPLRERPEDIPLLAKAFTFKIARRLGRNIDSIPAETLRTLSNMEWPGNVRE LENVIERAVLLTRGNVLQLSLPDIVLPEPETPPAATVVAPGGG >gi|296494624|gb|ADTN01000114.1| GENE 2 2005 - 3015 1189 336 aa, chain - ## HITS:1 COG:hypE KEGG:ns NR:ns ## COG: hypE COG0309 # Protein_GI_number: 16130637 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Escherichia coli K12 # 15 336 1 322 322 610 100.0 1e-174 MNNIQLAHGSGGQAMQQLINSLFMEAFANPWLAEQEDQARLDLAQLVAEGDRLAFSTDSY VIDPLFFPGGNIGKLAICGTANDVAVSGAIPRYLSCGFILEEGLPMETLKAVVTSMAETA RAAGIAIVTGDTKVVQRGAVDKLFINTAGMGAIPANIHWGAQTLTAGDVLLVSGTLGDHG ATILNLREQLGLDGELVSDCAVLTPLIQTLRDIPGVKALRDATRGGVNAVVHEFAAACGC GIELSEAALPVKPAVRGVCELLGLDALNFANEGKLVIAVERNAAEQVLAALHSHPLGKDA ALIGEVVERKGVRLAGLYGVKRTLDLPHAEPLPRIC >gi|296494624|gb|ADTN01000114.1| GENE 3 3012 - 4133 1078 373 aa, chain - ## HITS:1 COG:hypD KEGG:ns NR:ns ## COG: 
hypD COG0409 # Protein_GI_number: 16130636 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Escherichia coli K12 # 1 373 1 373 373 779 100.0 0 MRFVDEYRAPEQVMQLIEHLRERASHLSYTAERPLRIMEVCGGHTHAIFKFGLDQLLPEN VEFIHGPGCPVCVLPMGRIDTCVEIASHPEVIFCTFGDAMRVPGKQGSLLQAKARGADVR IVYSPMDALKLAQENPTRKVVFFGLGFETTMPTTAITLQQAKARDVQNFYFFCQHITLIP TLRSLLEQPDNGIDAFLAPGHVSMVIGTDAYNFIASDFHRPLVVAGFEPLDLLQGVVMLV QQKIAAHSKVENQYRRVVPDAGNLLAQQAIADVFCVNGDSEWRGLGVIESSGVHLTPDYQ RFDAEAHFRPAPQQVCDDPRARCGEVLTGKCKPHQCPLFGNTCNPQTAFGALMVSSEGAC AAWYQYRQQESEA >gi|296494624|gb|ADTN01000114.1| GENE 4 4133 - 4405 344 90 aa, chain - ## HITS:1 COG:ECs3584 KEGG:ns NR:ns ## COG: ECs3584 COG0298 # Protein_GI_number: 15832838 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Escherichia coli O157:H7 # 1 90 1 90 90 174 100.0 3e-44 MCIGVPGQIRTIDGNQAKVDVCGIQRDVDLTLVGSCDENGQPRVGQWVLVHVGFAMSVIN EAEARDTLDALQNMFDVEPDVGALLYGEEK >gi|296494624|gb|ADTN01000114.1| GENE 5 4396 - 5268 875 290 aa, chain - ## HITS:1 COG:hypB KEGG:ns NR:ns ## COG: hypB COG0378 # Protein_GI_number: 16130634 # Func_class: O Posttranslational modification, protein turnover, chaperones; K Transcription # Function: Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase # Organism: Escherichia coli K12 # 1 290 1 290 290 597 100.0 1e-171 MCTTCGCGEGNLYIEGDEHNPHSAFRSAPFAPAARPKMKITGIKAPEFTPSQTEEGDLHY GHGEAGTHAPGMSQRRMLEVEIDVLDKNNRLAERNRARFAARKQLVLNLVSSPGSGKTTL LTETLMRLKDSVPCAVIEGDQQTVNDAARIRATGTPAIQVNTGKGCHLDAQMIADAAPRL PLDDNGILFIENVGNLVCPASFDLGEKHKVAVLSVTEGEDKPLKYPHMFAAASLMLLNKV DLLPYLNFDVEKCIACAREVNPEIEIILISATSGEGMDQWLNWLETQRCA >gi|296494624|gb|ADTN01000114.1| GENE 6 5272 - 5622 158 116 aa, chain - ## HITS:1 COG:ECs3582 KEGG:ns NR:ns ## COG: ECs3582 COG0375 # Protein_GI_number: 15832836 # Func_class: R General function prediction only # Function: Zn 
finger protein HypA/HybF (possibly regulating hydrogenase expression) # Organism: Escherichia coli O157:H7 # 1 116 1 116 116 202 100.0 1e-52 MHEITLCQRALELIEQQAAKHGAKRVTGVWLKIGAFSCVETSSLAFCFDLVCRGSVAEGC KLHLEEQEAECWCETCQQYVTLLTQRVRRCPQCHGDMLQIVADDGLQIRRIEIDQE >gi|296494624|gb|ADTN01000114.1| GENE 7 5834 - 6295 421 153 aa, chain + ## HITS:1 COG:no KEGG:ECO103_3263 NR:ns ## KEGG: ECO103_3263 # Name: hycA # Def: formate hydrogenlyase regulatory protein HycA # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 153 1 153 153 309 100.0 2e-83 MTIWEISEKADYIAQRHRRLQDQWHIYCNSLVQGITLSKARLHHAMSCAPDKELCFVLFE HFRIYVTLADGFNSHTIEYYVETKDGEDKQRIAQAQLSIDGMIDGKVNIRDREQVLEHYL EKIAGVYDSLYTAIENNVPVNLSQLVKGQSPAA >gi|296494624|gb|ADTN01000114.1| GENE 8 6420 - 7031 315 203 aa, chain + ## HITS:1 COG:hycB KEGG:ns NR:ns ## COG: hycB COG1142 # Protein_GI_number: 16130631 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 2 # Organism: Escherichia coli K12 # 1 203 1 203 203 360 100.0 1e-99 MNRFVIADSTLCIGCHTCEAACSETHRQHGLQSMPRLRVMLNEKESAPQLCHHCEDAPCA VVCPVNAITRVDGAVQLNESLCVSCKLCGIACPFGAIEFSGSRPLDIPANANTPKAPPAP PAPARVSTLLDWVPGIRAIAVKCDLCSFDEQGPACVRMCPTKALHLVDNTDIARVSKRKR ELTFNTDFGDLTLFQQAQSGEAK >gi|296494624|gb|ADTN01000114.1| GENE 9 7028 - 8854 2122 608 aa, chain + ## HITS:1 COG:hycC KEGG:ns NR:ns ## COG: hycC COG0651 # Protein_GI_number: 16130630 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit # Organism: Escherichia coli K12 # 1 608 1 608 608 996 99.0 0 MSAISLINSGVAWFVAAAVLAFLFSFQKALSGWIAGIGGAVGSLYTAAAGFTVLTGAVGV SGALSLVSYDVQISPLNAIWLITLGLCGLFVSLYNIDWHRHAQVKCNGLQINMLMAAAVC AVIASNLGMFVVMAEIMALCAVFLTSNSKEGKLWFALGRLGTLLLAIACWLLWQRYGTLD LRLLDMRMQQLPLGSDIWLLGVIGFGLLAGIIPLHGWVPQAHANASAPAAALFSTVVMKI GLLGILTLSLLGGNAPLWWGIALLVLGMITAFVGGLYALVEHNIQRLLAYHTLENIGIIL 
LGLGAGVTGIALEQPALIALGLVGGLYHLLNHSLFKSVLFLGAGSVWFRTGHRDIEKLGG IGKKMPVISIAMLVGLMAMAALPPLNGFAGEWVIYQSFFKLSNSGAFVTRLLGPLLAVGL AITGALAVMCMAKVYGVTFLGAPRTKEAENATCAPLLMSVSVVALAICCVIGGVAAPWLL PMLSAAVPLPLEPANTTVSQPMITLLLIACPLLPFIIMAICKGDRLPSRSRGAAWVCGYD HEKSMVITAHGFAMPVKQAFAPVLKLRKWLNPVSLVPGWQCEGSALLFRRMALVELAVLV VIIVSRGA >gi|296494624|gb|ADTN01000114.1| GENE 10 8857 - 9780 1479 307 aa, chain + ## HITS:1 COG:hycD KEGG:ns NR:ns ## COG: hycD COG0650 # Protein_GI_number: 16130629 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 4 # Organism: Escherichia coli K12 # 1 307 1 307 307 535 100.0 1e-152 MSVLYPLIQALVLFAVAPLLSGITRVARARLHNRRGPGVLQEYRDIIKLLGRQSVGPDAS GWVFRLTPYVMVGVMLTIATALPVVTVGSPLPQLGDLITLLYLFAIARFFFAISGLDTGS PFTAIGASREAMLGVLVEPMLLLGLWVAAQVAGSTNISNITDTVYHWPLSQSIPLVLALC ACAFATFIEMGKLPFDLAEAEQELQEGPLSEYSGSGFGVMKWGISLKQLVVLQMFVGVFI PWGQMETFTAGGLLLALVIAIVKLVVGVLVIALFENSMARLRLDITPRITWAGFGFAFLA FVSLLAA >gi|296494624|gb|ADTN01000114.1| GENE 11 9798 - 11507 2314 569 aa, chain + ## HITS:1 COG:ECs3577_2 KEGG:ns NR:ns ## COG: ECs3577_2 COG3261 # Protein_GI_number: 15832831 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase III large subunit # Organism: Escherichia coli O157:H7 # 158 569 1 412 412 873 100.0 0 MSEEKLGQHYLAALNEAFPGVVLDHAWQTKDQLTVTVKVNYLPEVVEFLYYKQGGWLSVL FGNDERKLNGHYAVYYVLSMEKGTKCWITVRVEVDANKPEYPSVTPPGSGAVWGEREVRD MYGLIPVGLPDERRLVLPDDWPDELYPLRKDSMDYRQRPAPTTDAETYEFINELGDKKNN VVPIGPLHVTSDEPGHFRLFVDGENIIDADYRLFYVHRGMEKLAETRMGYNEVTFLSDRV CGICGFAHSTAYTTSVENAMGIQVPERAQMIRAILLEVERLHSHLLNLGLACHFTGFDSG FMQFFRVRETSMKMAEILTGARKTYGLNLIGGIRRDLLKDDMIQTRQLAQQMRREVQELV DVLLSTPNMEQRTVGIGRLDPEIARDFSNVGPMVRASGHARDTRADHPFVGYGLLPMEVH SEQGCDVISRLKVRINEVYTALNMIDYGLDNLPGGPLMVEGFTYIPHRFALGFAEAPRGD DIHWSMTGDNQKLYRWRCRAATYANWPTLRYMLRGNTVSDAPLIIGSLDPCYSCTDRMTV VDVRKKKSKVVPYKELERYSIERKNSPLK >gi|296494624|gb|ADTN01000114.1| GENE 12 11517 - 12059 525 180 aa, chain + ## HITS:1 COG:hycF KEGG:ns NR:ns ## COG: hycF COG1143 # 
Protein_GI_number: 16130627 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) # Organism: Escherichia coli K12 # 1 180 1 180 180 348 100.0 4e-96 MFTFIKKVIKTGTATSSYPLEPIAVDKNFRGKPEQNPQQCIGCAACVNACPSNALTVETD LATGELAWEFNLGHCIFCGRCEEVCPTAAIKLSQEYELAVWKKEDFLQQSRFALCNCRVC NRPFAVQKEIDYAIALLKHNGDSRAENHRESFETCPECKRQKCLVPSDRIELTRHMKEAI >gi|296494624|gb|ADTN01000114.1| GENE 13 12059 - 12826 919 255 aa, chain + ## HITS:1 COG:hycG KEGG:ns NR:ns ## COG: hycG COG3260 # Protein_GI_number: 16130626 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase III small subunit # Organism: Escherichia coli K12 # 1 255 1 255 255 530 100.0 1e-151 MSNLLGPRDANGIPVPMTVDESIASMKASLLKKIKRSAYVYRVDCGGCNGCEIEIFATLS PLFDAERFGIKVVPSPRHADILLFTGAVTRAMRSPALRAWQSAPDPKICISYGACGNSGG IFHDLYCVWGGTDKIVPVDVYIPGCPPTPAATLYGFAMALGLLEQKIHARGPGELDEQPA EILHGDMVQPLRVKVDREARRLAGYRYGRQIADDYLTQLGQGEEQVARWLEAENDPRLNE IVSHLNHVVEEARIR >gi|296494624|gb|ADTN01000114.1| GENE 14 12823 - 13233 484 136 aa, chain + ## HITS:1 COG:no KEGG:B21_02533 NR:ns ## KEGG: B21_02533 # Name: hycH # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 136 1 136 136 270 100.0 1e-71 MSEKVVFSQLSRKFIDENDATPAEAQQVVYYSLAIGHHLGVIDCLEAALTCPWDEYLAWI ATLEAGSEARRKMEGVPKYGEIVIDINHVPMLANAFDKARAAQTSQQQEWSTMLLSMLHD IHQENAIYLMVRRLRD >gi|296494624|gb|ADTN01000114.1| GENE 15 13259 - 13696 657 145 aa, chain + ## HITS:1 COG:ECs3573 KEGG:ns NR:ns ## COG: ECs3573 COG0680 # Protein_GI_number: 15832827 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase maturation factor # Organism: Escherichia coli O157:H7 # 1 145 12 156 156 283 100.0 7e-77 MMGDDGAGPLLAEKCAAAPKGNWVVIDGGSAPENDIVAIRELRPTRLLIVDATDMGLNPG EIRIIDPDDIAEMFMMTTHNMPLNYLIDQLKEDIGEVIFLGIQPDIVGFYYPMTQPIKDA VETVYQRLEGWEGNGGFAQLAVEEE >gi|296494624|gb|ADTN01000114.1| GENE 16 13855 - 15279 1571 474 aa, chain - ## HITS:1 COG:ascB KEGG:ns NR:ns ## 
COG: ascB COG2723 # Protein_GI_number: 16130623 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Escherichia coli K12 # 1 474 1 474 474 1002 100.0 0 MSVFPESFLWGGALAANQSEGAFREGDKGLTTVDMIPHGEHRMAVKLGLEKRFQLRDDEF YPSHEATDFYHRYKEDIALMAEMGFKVFRTSIAWSRLFPQGDEITPNQQGIAFYRSVFEE CKKYGIEPLVTLCHFDVPMHLVTEYGSWRNRKLVEFFSRYARTCFEAFDGLVKYWLTFNE INIMLHSPFSGAGLVFEEGENQDQVKYQAAHHQLVASALATKIAHEVNPQNQVGCMLAGG NFYPYSCKPEDVWAALEKDRENLFFIDVQARGTYPAYSARVFREKGVTINKAPGDDEILK NTVDFVSFSYYASRCASAEMNANNSSAANVVKSLRNPYLQVSDWGWGIDPLGLRITMNMM YDRYQKPLFLVENGLGAKDEFAANGEINDDYRISYLREHIRAMGEAIADGIPLMGYTTWG CIDLVSASTGEMSKRYGFVFVDRDDAGNGTLTRTRKKSFWWYKKVIASNGEDLE >gi|296494624|gb|ADTN01000114.1| GENE 17 15288 - 16745 1858 485 aa, chain - ## HITS:1 COG:ECs3571_2 KEGG:ns NR:ns ## COG: ECs3571_2 COG1263 # Protein_GI_number: 15832825 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Escherichia coli O157:H7 # 91 485 2 396 396 640 98.0 0 MAKNYAALARSVIAALGGVDNISAVTHCMTRLRFVIKDDALIDSPTLKTIPGVLGVVRSD NQCQVIIGNTVSQAFQEVVSLLPGDMQPAQPVGKPKLTLRRIGAGILDALIGTMSPLIPA IIGGSMVKLLAMILEMSGVLTKGSPTLTILNVIGDGAFFFMPLMVAASAAIKFKTNMSLA IAIAGVLVHPSFIELMAKAAQGEHVEFALIPVTAVKYTYTVIPALVMTWCLSYIERWVDS ITPAVTKNFLKPMLIVLIAAPLAILLIGPIGIWIGSAISALVYTIHGYLGWLSVAIMGAL WPLLVMTGMHRVFTPTIIQTIAETGKEGMVMPSEIGANLSLGGSSLAVAWKTKNPELRLT ALAAAASAIMAGISEPALYGVAIRLKRPLIASLISGFICGAVAGMAGLASHSMAAPGLFT SVQFFDPANPMSIVWVFAVMALAVVLSFILTLLLGFEDIPVEEAAAQARKYQSVQPTVAK EVSLN Prediction of potential genes in microbial genomes Time: Sun May 15 23:31:35 2011 Seq name: gi|296494623|gb|ADTN01000115.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont249.7, whole genome shotgun sequence Length of sequence - 8386 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End 
Score pairs(N/Pv) 1 1 Tu 1 3/0.500 + CDS 54 - 1064 931 ## COG1609 Transcriptional regulators + Term 1079 - 1118 3.0 + Prom 1116 - 1175 4.5 2 2 Op 1 4/0.500 + CDS 1213 - 1740 409 ## COG1142 Fe-S-cluster-containing hydrogenase components 2 3 2 Op 2 . + CDS 1893 - 4145 1650 ## COG0068 Hydrogenase maturation factor + Term 4220 - 4264 4.1 4 3 Op 1 5/0.000 - CDS 4273 - 5406 988 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 5 3 Op 2 . - CDS 5403 - 6842 1373 ## COG0426 Uncharacterized flavoproteins - Prom 6933 - 6992 6.5 + Prom 6890 - 6949 7.4 6 4 Tu 1 . + CDS 7029 - 8385 1235 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains Predicted protein(s) >gi|296494623|gb|ADTN01000115.1| GENE 1 54 - 1064 931 336 aa, chain + ## HITS:1 COG:ascG KEGG:ns NR:ns ## COG: ascG COG1609 # Protein_GI_number: 16130621 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 336 2 337 337 664 100.0 0 MTTMLEVAKRAGVSKATVSRVLSGNGYVSQETKDRVFQAVEESGYRPNLLARNLSAKSTQ TLGLVVTNTLYHGIYFSELLFHAARMAEEKGRQLLLADGKHSAEEERQAIQYLLDLRCDA IMIYPRFLSVDEIDDIIDAHSQPIMVLNRRLRKNSSHSVWCDHKQTSFNAVAELINAGHQ EIAFLTGSMDSPTSIERLAGYKDALAQHGIALNEKLIANGKWTPASGAEGVEMLLERGAK FSALVASNDDMAIGAMKALHERGVAVPEQVSVIGFDDIAIAPYTVPALSSVKIPVTEMIQ EIIGRLIFMLDGGDFSPPKTFSGKLIRRDSLIAPSR >gi|296494623|gb|ADTN01000115.1| GENE 2 1213 - 1740 409 175 aa, chain + ## HITS:1 COG:ECs3569 KEGG:ns NR:ns ## COG: ECs3569 COG1142 # Protein_GI_number: 15832823 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 2 # Organism: Escherichia coli O157:H7 # 1 175 1 175 175 302 100.0 2e-82 MNRFIIADASKCIGCRTCEVACVVSHQENQDCASLTPETFLPRIHVIKGVNISTATVCRQ CEDAPCANVCPNGAISRDKGFVHVMQERCIGCKTCVVACPYGAMEVVVRPVIRNSGAGLN VRADKAEANKCDLCNHREDGPACMAACPTHALICVDRNKLEQLSAEKRRRTALMF >gi|296494623|gb|ADTN01000115.1| GENE 3 1893 - 4145 1650 750 aa, chain + ## HITS:1 COG:hypF KEGG:ns NR:ns ## COG: hypF COG0068 # 
Protein_GI_number: 16130619 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Escherichia coli K12 # 1 750 1 750 750 1455 100.0 0 MAKNTSCGVQLRIRGKVQGVGFRPFVWQLAQQLNLHGDVCNDGDGVEVRLREDPETFLVQ LYQHCPPLARIDSVEREPFIWSQLPTEFTIRQSTGGTMNTQIVPDAATCPACLAEMNTPG ERRYRYPFINCTHCGPRFTIIRAMPYDRPFTVMAAFPLCPACDKEYRDPLDRRFHAQPVA CPECGPHLEWVSHGEHAEQEAALQAAIAQLKMGKIVAIKGIGGFHLACDARNSNAVATLR ARKHRPAKPLAVMLPVADGLPDAARQLLTTPAAPIVLVDKKYVPELCDDIAPDLNEVGVM LPANPLQHLLLQELQCPLVMTSGNLSGKPPAISNEQALADLQGIADGFLIHNRDIVQRMD DSVVRESGEMLRRSRGYVPDALALPPGFKNVPPVLCLGADLKNTFCLVRGEQAVLSQHLG DLSDDGIQMQWREALRLMQNIYDFTPQYVVHDAHPGYVSSQWAREMNLPTQTVLHHHAHA AACLAEHQWPLDGGDVIALTLDGIGMGENGALWGGECLRVNYRECEHLGGLPAVALPGGD LAAKQPWRNLLAQCLRFVPEWQNYSETASVQQQNWSVLARAIERGINAPLASSCGRFFDA VAAALGCAPATLSYEGEAACALEALAASCHGVTHPVTMPRVDNQLDLATFWQQWLNWQAP VNQRAWAFHDALAQGFAALMREQATMRGITTLVFSGGVIHNRLLRARLAHYLADFTLLFP QSLPAGDGGLSLGQGVIAAARWLAGEVQNG >gi|296494623|gb|ADTN01000115.1| GENE 4 4273 - 5406 988 377 aa, chain - ## HITS:1 COG:ygbD KEGG:ns NR:ns ## COG: ygbD COG0446 # Protein_GI_number: 16130618 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 377 1 377 377 718 98.0 0 MSNGIVIIGSGFAARQLVKNIRKQDATIPLTLIAADSIDEYNKPDLSHVISQGQRADDLT RQTAGEFAEQFNLRLFPHTWVTDIDAEAHVVKSQNNQWQYDKLVLATGASAFVPPVPGRE LMLTLNSQQEYRACETQLRDARRVLIVGGGLIGSELAMDFCRAGKAVTLIDNAASILASL MPPEVSSRLQHRLTEMGVHLLLKSQLQGLEKTDSGILATLDRQRSIEVDAVIAATGLRPE TALARRAGLTINRGVCVDSYLQTSNTDIYALGDCAEINGQVLPFLQPIQLSAMVLAKNLL GNNTPLKLPAMLVKIKTPELPLHLAGETQRQDLRWQINTERQGMVARGVDDADQLRAFVV SEDRMKEAFGLLKTLPM >gi|296494623|gb|ADTN01000115.1| GENE 5 5403 - 6842 1373 479 aa, chain - ## HITS:1 COG:ygaK_1 KEGG:ns NR:ns ## COG: ygaK_1 COG0426 # Protein_GI_number: 16130617 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Escherichia coli K12 # 1 394 1 394 394 828 99.0 
0 MSIVVKNNIHWVGQRDWEVRDFHGTEYKTLRGSSYNSYLIREEKNVLIDTVDHKFSREFV QNLRNEIDLADIDYIVINHAEEDHAGALTELMAQIPDTPIYCTANAIDSINGHHHHPEWN FNVVKTGDTLDIGNGKQLIFVETPMLHWPDSMMTYLTGDAVLFSNDAFGQHYCDEHLFND EVDQTELFEQCQRYYANILTPFSRLVTPKITEILGFNLPVDMIATSHGVVWRDNPTQIVE LYLKWAADYQEDRITIFYDTMSNNTRMMADAIAQGIAETDPRVAVKIFNVARSDKNEILT NVFRSKGVLVGTSTMNNVMMPKIAGLVEEMTGLRFRNKRASAFGSHGWSGGAVDRLSTRL QDAGFEMSLSLKAKWRPDQDALELCREHGREIARQWALAPLPQSTVNTVVKEETSAATTA DLGPRMQCSVCRWIYDPAKGEPMQDVAPGTPWSEVPDNFLCPECSLGKDVFDELASEAK >gi|296494623|gb|ADTN01000115.1| GENE 6 7029 - 8385 1235 452 aa, chain + ## HITS:1 COG:ygaA KEGG:ns NR:ns ## COG: ygaA COG3604 # Protein_GI_number: 16130616 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Escherichia coli K12 # 1 448 26 473 529 838 100.0 0 MSFSVDVLANIAIELQRGIGHQDRFQRLITTLRQVLECDASALLRYDSRQFIPLAIDGLA KDVLGRRFALEGHPRLEAIARAGDVVRFPADSELPDPYDGLIPGQESLKVHACVGLPLFA GQNLIGALTLDGMQPDQFDVFSDEELRLIAALAAGALSNALLIEQLESQNMLPGDATPFE AVKQTQMIGLSPGMTQLKKEIEIVAASDLNVLISGETGTGKELVAKAIHEASPRAVNPLV YLNCAALPESVAESELFGHVKGAFTGAISNRSGKFEMADNGTLFLDEIGELSLALQAKLL RVLQYGDIQRVGDDRCLRVDVRVLAATNRDLREEVLAGRFRADLFHRLSVFPLSVPPLRE RGDDVILLAGYFCEQCRLRQGLSRVVLSAGARNLLQHYSFPGNVRELEHAIHRAVVLARA TRSGDEVILEAQHFAFPEVTLPTPEVAAGAGV Prediction of potential genes in microbial genomes Time: Sun May 15 23:31:40 2011 Seq name: gi|296494622|gb|ADTN01000116.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont249.8, whole genome shotgun sequence Length of sequence - 12707 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 5, operones - 3 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 16 - 186 180 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains + Term 212 - 250 2.0 2 2 Op 1 . 
- CDS 183 - 1148 1283 ## COG0794 Predicted sugar phosphate isomerase involved in capsule formation 3 2 Op 2 4/0.000 - CDS 1141 - 1914 726 ## COG1349 Transcriptional regulators of sugar metabolism 4 2 Op 3 5/0.000 - CDS 1981 - 2340 312 ## COG4578 Glucitol operon activator - Prom 2377 - 2436 2.3 5 2 Op 4 5/0.000 - CDS 2445 - 3224 879 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 6 2 Op 5 6/0.000 - CDS 3228 - 3599 271 ## COG3731 Phosphotransferase system sorbitol-specific component IIA 7 2 Op 6 6/0.000 - CDS 3610 - 4569 1084 ## COG3732 Phosphotransferase system sorbitol-specific component IIBC 8 2 Op 7 . - CDS 4566 - 5129 618 ## COG3730 Phosphotransferase system sorbitol-specific component IIC - Prom 5253 - 5312 6.0 + Prom 5192 - 5251 4.8 9 3 Op 1 4/0.000 + CDS 5385 - 6470 1106 ## COG2951 Membrane-bound lytic murein transglycosylase B + Prom 6488 - 6547 2.6 10 3 Op 2 12/0.000 + CDS 6615 - 7112 301 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase 11 3 Op 3 14/0.000 + CDS 7192 - 8253 1548 ## COG0468 RecA/RadA recombinase + Term 8269 - 8308 4.4 12 3 Op 4 . + CDS 8322 - 8822 561 ## COG2137 Uncharacterized protein conserved in bacteria + Term 8879 - 8931 10.2 13 4 Tu 1 . - CDS 8863 - 9045 128 ## ECSP_3645 hypothetical protein - Prom 9201 - 9260 1.9 14 5 Op 1 8/0.000 + CDS 8950 - 11580 3021 ## COG0013 Alanyl-tRNA synthetase + Prom 11628 - 11687 4.1 15 5 Op 2 . 
+ CDS 11815 - 12000 204 ## PROTEIN SUPPORTED gi|167855109|ref|ZP_02477881.1| 30S ribosomal protein S1 + TRNA 12316 - 12408 72.5 # Ser GCT 0 0 + TRNA 12412 - 12488 87.2 # Arg ACG 0 0 Predicted protein(s) >gi|296494622|gb|ADTN01000116.1| GENE 1 16 - 186 180 56 aa, chain + ## HITS:1 COG:ECs3565 KEGG:ns NR:ns ## COG: ECs3565 COG3604 # Protein_GI_number: 15832819 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Escherichia coli O157:H7 # 1 56 474 529 529 113 98.0 8e-26 MPVVKQNLREATEAFQRETIRQALAQNHHNWAACARMLETDVANLHRLAKRLGLKD >gi|296494622|gb|ADTN01000116.1| GENE 2 183 - 1148 1283 321 aa, chain - ## HITS:1 COG:gutQ_1 KEGG:ns NR:ns ## COG: gutQ_1 COG0794 # Protein_GI_number: 16130615 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar phosphate isomerase involved in capsule formation # Organism: Escherichia coli K12 # 14 205 1 192 192 363 100.0 1e-100 MSEALLNAGRQTLMLELQEASRLPERLGDDFVRAANIILHCEGKVVVSGIGKSGHIGKKI AATLASTGTPAFFVHPAEALHGDLGMIESRDVMLFISYSGGAKELDLIIPRLEDKSIALL AMTGKPTSPLGLAAKAVLDISVEREACPMHLAPTSSTVNTLMMGDALAMAVMQARGFNEE DFARSHPAGALGARLLNKVHHLMRRDDAIPQVALTASVMDAMLELSRTGLGLVAVCDAQQ QVQGVFTDGDLRRWLVGGGALTTPVNEAMTVGGTTLQSQSRAIDAKEILMKRKITAAPVV DENGKLTGAINLQDFYQAGII >gi|296494622|gb|ADTN01000116.1| GENE 3 1141 - 1914 726 257 aa, chain - ## HITS:1 COG:srlR KEGG:ns NR:ns ## COG: srlR COG1349 # Protein_GI_number: 16130614 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli K12 # 1 257 1 257 257 504 100.0 1e-143 MKPRQRQAAILEYLQKQGKCSVEELAQYFDTTGTTIRKDLVILEHAGTVIRTYGGVVLNK EESDPPIDHKTLINTHKKELIAEAAVSFIHDGDSIILDAGSTVLQMVPLLSRFNNITVMT NSLHIVNALSELDNEQTILMPGGTFRKKSASFHGQLAENAFEHFTFDKLFMGTDGIDLNA GVTTFNEVYTVSKAMCNAAREVILMADSSKFGRKSPNVVCSLESVDKLITDAGIDPAFRQ ALEEKGIDVIITGESNE >gi|296494622|gb|ADTN01000116.1| GENE 4 1981 - 
2340 312 119 aa, chain - ## HITS:1 COG:gutM KEGG:ns NR:ns ## COG: gutM COG4578 # Protein_GI_number: 16130613 # Func_class: K Transcription # Function: Glucitol operon activator # Organism: Escherichia coli K12 # 1 119 1 119 119 231 100.0 2e-61 MVSALITVAVIAWCAQLALGGWQISRFNRAFDTLCQQGRVGVGRSSGRFKPRVVVAIALD DQQRIVDTLFMKGLTVFARPQKIPAITGMHAGDLQPDVIFPHDPLSQNALSLALKLKRG >gi|296494622|gb|ADTN01000116.1| GENE 5 2445 - 3224 879 259 aa, chain - ## HITS:1 COG:srlD KEGG:ns NR:ns ## COG: srlD COG1028 # Protein_GI_number: 16130612 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Escherichia coli K12 # 1 259 1 259 259 493 100.0 1e-139 MNQVAVVIGGGQTLGAFLCHGLAAEGYRVAVVDIQSDKAANVAQEINAEYGESMAYGFGA DATSEQSVLALSRGVDEIFGRVDLLVYSAGIAKAAFISDFQLGDFDRSLQVNLVGYFLCA REFSRLMIRDGIQGRIIQINSKSGKVGSKHNSGYSAAKFGGVGLTQSLALDLAEYGITVH SLMLGNLLKSPMFQSLLPQYATKLGIKPDQVEQYYIDKVPLKRGCDYQDVLNMLLFYASP KASYCTGQSINVTGGQVMF >gi|296494622|gb|ADTN01000116.1| GENE 6 3228 - 3599 271 123 aa, chain - ## HITS:1 COG:srlB KEGG:ns NR:ns ## COG: srlB COG3731 # Protein_GI_number: 16130611 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system sorbitol-specific component IIA # Organism: Escherichia coli K12 # 1 123 1 123 123 248 100.0 2e-66 MTVIYQTTITRIGASAIDALSDQMLITFREGAPADLEEYCFIHCHGELKGALHPGLQFSL GQHRYPVTAVGSVAEDNLRELGHVTLRFDGLNEAEFPGTVHVAGPVPDDIAPGSVLKFES VKE >gi|296494622|gb|ADTN01000116.1| GENE 7 3610 - 4569 1084 319 aa, chain - ## HITS:1 COG:srlE KEGG:ns NR:ns ## COG: srlE COG3732 # Protein_GI_number: 16130610 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system sorbitol-specific component IIBC # Organism: Escherichia coli K12 # 1 319 1 319 319 592 99.0 1e-169 MTHIRIEKGTGGWGGPLELKATPGKKIVYITAGTRPAIVDKLAQLTGWQAIDGFKEGEPA 
EAEIGVAVIDCGGTLRCGIYPKRRIPTINIHSTGKSGPLAQYIVEDIYVSGVKEENITVV GDATPQPSSVGRDYDTSKKITEQSDGLLAKVGMGMGSTVAVLFQSGRDTIDTVLKTILPF MAFVSALIGIIMASGLGDWIAHGLAPLASHPLGLVMLALICSFPLLSPFLGPGAVIAQVI GVLIGVQIGLGNIPPHLALPALFAINAQAACDFIPVGLSLAEARQDTVRVGVPSVLVSRF LTGAPTVLIAWFVSGFIYQ >gi|296494622|gb|ADTN01000116.1| GENE 8 4566 - 5129 618 187 aa, chain - ## HITS:1 COG:srlA KEGG:ns NR:ns ## COG: srlA COG3730 # Protein_GI_number: 16130609 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system sorbitol-specific component IIC # Organism: Escherichia coli K12 # 1 187 1 187 187 364 99.0 1e-101 MIETITHGAEWFIGLFQKGGEVFTGMVTGILPLLISLLVIMNALINFIGQHRIERFAQRC AGNPVSRYLLLPCIGTFVFCNPMTLSLGRFMPEKYKPSYYAAASYSCHSMNGLFPHINPG ELFVYLGIASGLTTLNLPLGPLAVSYLLVGLVTNFFRGWVTDLTTAIFEKKMGIQLEQKV HLAGATS >gi|296494622|gb|ADTN01000116.1| GENE 9 5385 - 6470 1106 361 aa, chain + ## HITS:1 COG:mltB KEGG:ns NR:ns ## COG: mltB COG2951 # Protein_GI_number: 16130608 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-bound lytic murein transglycosylase B # Organism: Escherichia coli K12 # 1 361 1 361 361 717 100.0 0 MFKRRYVTLLPLFVLLAACSSKPKPTETDTTTGTPSGGFLLEPQHNVMQMGGDFANNPNA QQFIDKMVNKHGFDRQQLQEILSQAKRLDSVLRLMDNQAPTTSVKPPSGPNGAWLRYRKK FITPDNVQNGVVFWNQYEDALNRAWQVYGVPPEIIVGIIGVETRWGRVMGKTRILDALAT LSFNYPRRAEYFSGELETFLLMARDEQDDPLNLKGSFAGAMGYGQFMPSSYKQYAVDFSG DGHINLWDPVDAIGSVANYFKAHGWVKGDQVAVMANGQAPGLPNGFKTKYSISQLAAAGL TPQQPLGNHQQASLLRLDVGTGYQYWYGLPNFYTITRYNHSTHYAMAVWQLGQAVALARV Q >gi|296494622|gb|ADTN01000116.1| GENE 10 6615 - 7112 301 165 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 8 157 748 898 904 120 46 5e-27 MTDSELMQLSEQVGQALKARGATVTTAESCTGGWVAKVITDIAGSSAWFERGFVTYSNEA KAQMIGVREETLAQHGAVSEPVVVEMAIGALKAARADYAVSISGIAGPDGGSEEKPVGTV WFAFATARGEGITRRECFSGDRDAVRRQATAYALQTLWQQFLQNT >gi|296494622|gb|ADTN01000116.1| GENE 11 7192 - 8253 1548 353 aa, chain + ## 
HITS:1 COG:ECs3556 KEGG:ns NR:ns ## COG: ECs3556 COG0468 # Protein_GI_number: 15832810 # Func_class: L Replication, recombination and repair # Function: RecA/RadA recombinase # Organism: Escherichia coli O157:H7 # 1 353 1 353 353 666 100.0 0 MAIDENKQKALAAALGQIEKQFGKGSIMRLGEDRSMDVETISTGSLSLDIALGAGGLPMG RIVEIYGPESSGKTTLTLQVIAAAQREGKTCAFIDAEHALDPIYARKLGVDIDNLLCSQP DTGEQALEICDALARSGAVDVIVVDSVAALTPKAEIEGEIGDSHMGLAARMMSQAMRKLA GNLKQSNTLLIFINQIRMKIGVMFGNPETTTGGNALKFYASVRLDIRRIGAVKEGENVVG SETRVKVVKNKIAAPFKQAEFQILYGEGINFYGELVDLGVKEKLIEKAGAWYSYKGEKIG QGKANATAWLKDNPETAKEIEKKVRELLLSNPNSTPDFSVDDSEGVAETNEDF >gi|296494622|gb|ADTN01000116.1| GENE 12 8322 - 8822 561 166 aa, chain + ## HITS:1 COG:oraA KEGG:ns NR:ns ## COG: oraA COG2137 # Protein_GI_number: 16130605 # Func_class: R General function prediction only # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 166 1 166 166 310 100.0 6e-85 MTESTSRRPAYARLLDRAVRILAVRDHSEQELRRKLAAPIMGKNGPEEIDATAEDYERVI AWCHEHGYLDDSRFVARFIASRSRKGYGPARIRQELNQKGISREATEKAMRECDIDWCAL ARDQATRKYGEPLPTVFSEKVKIQRFLLYRGYLMEDIQEIWRNFAD >gi|296494622|gb|ADTN01000116.1| GENE 13 8863 - 9045 128 60 aa, chain - ## HITS:1 COG:no KEGG:ECSP_3645 NR:ns ## KEGG: ECSP_3645 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 60 74 133 133 115 98.0 5e-25 MGYQGAAGNYLMSLTMEKVEKRLTDLSGALAHNYPEIKLTKYRHQLQRVLTTGLVTEKWE >gi|296494622|gb|ADTN01000116.1| GENE 14 8950 - 11580 3021 876 aa, chain + ## HITS:1 COG:alaS KEGG:ns NR:ns ## COG: alaS COG0013 # Protein_GI_number: 16130604 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Escherichia coli K12 # 1 876 1 876 876 1721 100.0 0 MSKSTAEIRQAFLDFFHSKGHQVVASSSLVPHNDPTLLFTNAGMNQFKDVFLGLDKRNYS RATTSQRCVRAGGKHNDLENVGYTARHHTFFEMLGNFSFGDYFKHDAIQFAWELLTSEKW FALPKERLWVTVYESDDEAYEIWEKEVGIPRERIIRIGDNKGAPYASDNFWQMGDTGPCG PCTEIFYDHGDHIWGGPPGSPEEDGDRYIEIWNIVFMQFNRQADGTMEPLPKPSVDTGMG 
LERIAAVLQHVNSNYDIDLFRTLIQAVAKVTGATDLSNKSLRVIADHIRSCAFLIADGVM PSNENRGYVLRRIIRRAVRHGNMLGAKETFFYKLVGPLIDVMGSAGEDLKRQQAQVEQVL KTEEEQFARTLERGLALLDEELAKLSGDTLDGETAFRLYDTYGFPVDLTADVCRERNIKV DEAGFEAAMEEQRRRAREASGFGADYNAMIRVDSASEFKGYDHLELNGKVTALFVDGKAV DAINAGQEAVVVLDQTPFYAESGGQVGDKGELKGANFSFAVEDTQKYGQAIGHIGKLAAG SLKVGDAVQADVDEARRARIRLNHSATHLMHAALRQVLGTHVSQKGSLVNDKVLRFDFSH NEAMKPEEIRAVEDLVNTQIRRNLPIETNIMDLEAAKAKGAMALFGEKYDERVRVLSMGD FSTELCGGTHASRTGDIGLFRIISESGTAAGVRRIEAVTGEGAIATVHADSDRLSEVAHL LKGDSNNLADKVRSVLERTRQLEKELQQLKEQAAAQESANLSSKAIDVNGVKLLVSELSG VEPKMLRTMVDDLKNQLGSTIIVLATVVEGKVSLIAGVSKDVTDRVKAGELIGMVAQQVG GKGGGRPDMAQAGGTDAAALPAALASVKGWVSAKLQ >gi|296494622|gb|ADTN01000116.1| GENE 15 11815 - 12000 204 61 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855109|ref|ZP_02477881.1| 30S ribosomal protein S1 [Haemophilus parasuis 29755] # 1 60 1 60 61 83 61 9e-16 MLILTRRVGETLMIGDEVTVTVLGVKGNQVRIGVNAPKEVSVHREEIYQRIQAEKSQQSS Y Prediction of potential genes in microbial genomes Time: Sun May 15 23:31:48 2011 Seq name: gi|296494621|gb|ADTN01000117.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont249.9, whole genome shotgun sequence Length of sequence - 18461 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 6, operones - 4 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 221 - 297 87.2 # Arg ACG 0 0 1 1 Op 1 6/0.000 + CDS 578 - 1144 580 ## COG0637 Predicted phosphatase/phosphohexomutase + Prom 1147 - 1206 3.6 2 1 Op 2 5/0.000 + CDS 1285 - 1569 305 ## COG1238 Predicted membrane protein 3 1 Op 3 6/0.000 + CDS 1642 - 3198 1766 ## COG2918 Gamma-glutamylcysteine synthetase + Term 3212 - 3243 2.5 + Prom 3234 - 3293 2.5 4 1 Op 4 . 
+ CDS 3348 - 3863 592 ## COG1854 LuxS protein involved in autoinducer AI2 synthesis + Term 3897 - 3928 3.4 - Term 3885 - 3916 3.4 5 2 Op 1 19/0.000 - CDS 3927 - 5465 1663 ## COG0477 Permeases of the major facilitator superfamily 6 2 Op 2 7/0.000 - CDS 5482 - 6654 1152 ## COG1566 Multidrug resistance efflux pump - Prom 6718 - 6777 1.9 - Term 6698 - 6733 5.0 7 2 Op 3 . - CDS 6781 - 7311 525 ## COG1846 Transcriptional regulators - Prom 7335 - 7394 5.2 8 3 Op 1 . - CDS 7402 - 7737 357 ## B21_02503 hypothetical protein 9 3 Op 2 4/0.000 - CDS 7727 - 8464 530 ## COG1296 Predicted branched-chain amino acid permease (azaleucine resistance) - Prom 8484 - 8543 8.4 10 3 Op 3 3/1.000 - CDS 8588 - 9772 1106 ## COG0477 Permeases of the major facilitator superfamily - Prom 9804 - 9863 4.5 - Term 9912 - 9953 1.7 11 4 Op 1 14/0.000 - CDS 9964 - 10956 1097 ## COG2113 ABC-type proline/glycine betaine transport systems, periplasmic components 12 4 Op 2 16/0.000 - CDS 11014 - 12078 1268 ## COG4176 ABC-type proline/glycine betaine transport system, permease component 13 4 Op 3 5/0.000 - CDS 12071 - 13273 1213 ## COG4175 ABC-type proline/glycine betaine transport system, ATPase component - Term 13476 - 13512 2.4 14 4 Op 4 24/0.000 - CDS 13627 - 14586 959 ## COG0208 Ribonucleotide reductase, beta subunit 15 4 Op 5 18/0.000 - CDS 14596 - 16740 2166 ## COG0209 Ribonucleotide reductase, alpha subunit 16 4 Op 6 11/0.000 - CDS 16713 - 17123 293 ## COG1780 Protein involved in ribonucleotide reduction 17 4 Op 7 . - CDS 17120 - 17365 286 ## COG0695 Glutaredoxin and related proteins - Prom 17548 - 17607 3.9 - Term 17547 - 17583 5.6 18 5 Tu 1 . - CDS 17613 - 17942 284 ## COG4575 Uncharacterized conserved protein - Prom 17968 - 18027 2.4 + Prom 17947 - 18006 3.4 19 6 Tu 1 . 
+ CDS 18094 - 18438 369 ## G2583_3317 hypothetical protein Predicted protein(s) >gi|296494621|gb|ADTN01000117.1| GENE 1 578 - 1144 580 188 aa, chain + ## HITS:1 COG:yqaB KEGG:ns NR:ns ## COG: yqaB COG0637 # Protein_GI_number: 16130602 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Escherichia coli K12 # 1 188 1 188 188 372 100.0 1e-103 MYERYAGLIFDMDGTILDTEPTHRKAWREVLGHYGLQYDIQAMIALNGSPTWRIAQAIIE LNQADLDPHALAREKTEAVRSMLLDSVEPLPLVDVVKSWHGRRPMAVGTGSESAIAEALL AHLGLRHYFDAVVAADHVKHHKPAPDTFLLCAQRMGVQPTQCVVFEDADFGIQAARAAGM DAVDVRLL >gi|296494621|gb|ADTN01000117.1| GENE 2 1285 - 1569 305 94 aa, chain + ## HITS:1 COG:yqaA KEGG:ns NR:ns ## COG: yqaA COG1238 # Protein_GI_number: 16130601 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 94 49 142 142 163 100.0 7e-41 MGNSLGGLTNVILGRFFPLRKTSRWQEKATGWLKRYGAVTLLLSWMPVVGDLLCLLAGWM RISWGPVIFFLCLGKALRYVAVAAATVQGMMWWH >gi|296494621|gb|ADTN01000117.1| GENE 3 1642 - 3198 1766 518 aa, chain + ## HITS:1 COG:gshA KEGG:ns NR:ns ## COG: gshA COG2918 # Protein_GI_number: 16130600 # Func_class: H Coenzyme transport and metabolism # Function: Gamma-glutamylcysteine synthetase # Organism: Escherichia coli K12 # 1 518 1 518 518 1084 100.0 0 MIPDVSQALAWLEKHPQALKGIQRGLERETLRVNADGTLATTGHPEALGSALTHKWITTD FAEALLEFITPVDGDIEHMLTFMRDLHRYTARNMGDERMWPLSMPCYIAEGQDIELAQYG TSNTGRFKTLYREGLKNRYGALMQTISGVHYNFSLPMAFWQAKCGDISGADAKEKISAGY FRVIRNYYRFGWVIPYLFGASPAICSSFLQGKPTSLPFEKTECGMYYLPYATSLRLSDLG YTNKSQSNLGITFNDLYEYVAGLKQAIKTPSEEYAKIGIEKDGKRLQINSNVLQIENELY APIRPKRVTRSGESPSDALLRGGIEYIEVRSLDINPFSPIGVDEQQVRFLDLFMVWCALA DAPEMSSSELACTRVNWNRVILEGRKPGLTLGIGCETAQFPLPQVGKDLFRDLKRVAQTL DSINGGEAYQKVCDELVACFDNPDLTFSARILRSMIDTGIGGTGKAFAEAYRNLLREEPL EILREEDFVAEREASERRQQEMEAADTEPFAVWLEKHA >gi|296494621|gb|ADTN01000117.1| GENE 4 3348 - 3863 592 171 aa, chain + ## HITS:1 COG:luxS KEGG:ns NR:ns ## COG: luxS COG1854 # Protein_GI_number: 16130599 # Func_class: 
T Signal transduction mechanisms # Function: LuxS protein involved in autoinducer AI2 synthesis # Organism: Escherichia coli K12 # 1 171 1 171 171 351 100.0 4e-97 MPLLDSFTVDHTRMEAPAVRVAKTMNTPHGDAITVFDLRFCVPNKEVMPERGIHTLEHLF AGFMRNHLNGNGVEIIDISPMGCRTGFYMSLIGTPDEQRVADAWKAAMEDVLKVQDQNQI PELNVYQCGTYQMHSLQEAQDIARSILERDVRINSNEELALPKEKLQELHI >gi|296494621|gb|ADTN01000117.1| GENE 5 3927 - 5465 1663 512 aa, chain - ## HITS:1 COG:STM2815 KEGG:ns NR:ns ## COG: STM2815 COG0477 # Protein_GI_number: 16766126 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium LT2 # 1 502 1 502 512 907 95.0 0 MQQQKPLEGAQLVIMTIALSLATFMQVLDSTIANVAIPTIAGNLGSSLSQGTWVITSFGV ANAISIPLTGWLAKRVGEVKLFLWSTIAFAIASWACGVSSSLNMLIFFRVIQGIVAGPLI PLSQSLLLNNYPPAKRSIALALWSMTVIVAPICGPILGGYISDNYHWGWIFFINVPIGVA VVLMTLQTLRGRETRTERRRIDAVGLALLVIGIGSLQIMLDRGKELDWFSSQEIIILTVV AVVAICFLIVWELTDDNPIVDLSLFKSRNFTIGCLCISLAYMLYFGAIVLLPQLLQEVYG YTATWAGLASAPVGIIPVILSPIIGRFAHKLDMRRLVTFSFIMYAVCFYWRAYTFEPGMD FGASAWPQFIQGFAVACFFMPLTTITLSGLPPERLAAASSLSNFTRTLAGSIGTSITTTM WTNRESMHHAQLTESVNPFNPNAQAMYSQLEGLGMTQQQASGWIAQQITNQGLIISANEI FWMSAGIFLVLLGLVWFAKPPFGAGGGGGGAH >gi|296494621|gb|ADTN01000117.1| GENE 6 5482 - 6654 1152 390 aa, chain - ## HITS:1 COG:emrA KEGG:ns NR:ns ## COG: emrA COG1566 # Protein_GI_number: 16130597 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Escherichia coli K12 # 1 390 1 390 390 689 100.0 0 MSANAETQTPQQPVKKSGKRKRLLLLLTLLFIIIAVAIGIYWFLVLRHFEETDDAYVAGN QIQIMSQVSGSVTKVWADNTDFVKEGDVLVTLDPTDARQAFEKAKTALASSVRQTHQLMI NSKQLQANIEVQKIALAKAQSDYNRRVPLGNANLIGREELQHARDAVTSAQAQLDVAIQQ YNANQAMILGTKLEDQPAVQQAATEVRNAWLALERTRIISPMTGYVSRRAVQPGAQISPT TPLMAVVPATNMWVDANFKETQIANMRIGQPVTITTDIYGDDVKYTGKVVGLDMGTGSAF SLLPAQNATGNWIKVVQRLPVRIELDQKQLEQYPLRIGLSTLVSVNTTNRDGQVLANKVR 
STPVAVSTAREISLAPVNKLIDDIVKANAG >gi|296494621|gb|ADTN01000117.1| GENE 7 6781 - 7311 525 176 aa, chain - ## HITS:1 COG:ECs3546 KEGG:ns NR:ns ## COG: ECs3546 COG1846 # Protein_GI_number: 15832800 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 176 1 176 176 306 100.0 2e-83 MDSSFTPIEQMLKFRASRHEDFPYQEILLTRLCMHMQSKLLENRNKMLKAQGINETLFMA LITLESQENHSIQPSELSCALGSSRTNATRIADELEKRGWIERRESDNDRRCLHLQLTEK GHEFLREVLPPQHNCLHQLWSALSTTEKDQLEQITRKLLSRLDQMEQDGVVLEAMS >gi|296494621|gb|ADTN01000117.1| GENE 8 7402 - 7737 357 111 aa, chain - ## HITS:1 COG:no KEGG:B21_02503 NR:ns ## KEGG: B21_02503 # Name: ygaH # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 111 1 111 111 176 100.0 2e-43 MSYEVLLLGLLVGVANYCFRYLPLRLRVGNARPTKRGAVGILLDTIGIASICALLVVSTA PEVMHDTRRFVPTLVGFAVLGASFYKTRSIIIPTLLSALAYGLAWKVMAII >gi|296494621|gb|ADTN01000117.1| GENE 9 7727 - 8464 530 245 aa, chain - ## HITS:1 COG:ygaZ KEGG:ns NR:ns ## COG: ygaZ COG1296 # Protein_GI_number: 16130594 # Func_class: E Amino acid transport and metabolism # Function: Predicted branched-chain amino acid permease (azaleucine resistance) # Organism: Escherichia coli K12 # 1 245 1 245 245 433 100.0 1e-121 MESPTPQPAPGSATFMEGCKDSLPIVISYIPVAFAFGLNATRLGFSPLESVFFSCIIYAG ASQFVITAMLAAGSSLWIAALTVMAMDVRHVLYGPSLRSRIIQRLQKSKTALWAFGLTDE VFAAATAKLVRNNRRWSENWMIGIAFSSWSSWVFGTVIGAFSGSGLLQGYPAVEAALGFM LPALFMSFLLASFQRKQSLCVTAALVGALAGVTLFSIPVAILAGIVCGCLTALIQAFWQG APDEL >gi|296494621|gb|ADTN01000117.1| GENE 10 8588 - 9772 1106 394 aa, chain - ## HITS:1 COG:ECs3543 KEGG:ns NR:ns ## COG: ECs3543 COG0477 # Protein_GI_number: 15832797 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 394 1 394 394 620 99.0 1e-177 
MTKPNHELSPALIVLMSIATGLAVASNYYAQPLLDTIARNFSLSASSAGFIVTAAQLGYA AGLLFLVPLGDMFERRRLIVSMTLLAACGMLITASSQSLAMMILGTALTGLFSVVAQILV PLAATLASPDKRGKVVGTIMSGLLLGILLARTVAGLLANLGGWRTVFWVASVLMALMALA LWRGLPQMKSETHLNYPQLLGSVFSMFISDKILRTRALLGCLTFANFSILWTSMAFLLAA PPFNYSDGVIGLFGLAGAAGALGARPAGGFADKGKSHHTTTFGLLLLLLSWLAIWFGHTS VLALIIGILVLDLTVQGVHITNQTVIYRIHPDARNRLTAGYMTSYFIGGAAGSLISASAW QHGGWAGVCLAGATIALVNLLVWWRGFHRQEAAN >gi|296494621|gb|ADTN01000117.1| GENE 11 9964 - 10956 1097 330 aa, chain - ## HITS:1 COG:ECs3542 KEGG:ns NR:ns ## COG: ECs3542 COG2113 # Protein_GI_number: 15832796 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport systems, periplasmic components # Organism: Escherichia coli O157:H7 # 1 330 1 330 330 653 100.0 0 MRHSVLFATAFATLISTQTFAADLPGKGITVNPVQSTITEETFQTLLVSRALEKLGYTVN KPSEVDYNVGYTSLASGDATFTAVNWTPLHDNMYEAAGGDKKFYREGVFVNGAAQGYLID KKTADQYKITNIAQLKDPKIAKLFDTNGDGKADLTGCNPGWGCEGAINHQLAAYELTNTV THNQGNYAAMMADTISRYKEGKPVFYYTWTPYWVSNELKPGKDVVWLQVPFSALPGDKNA DTKLPNGANYGFPVSTMHIVANKAWAEKNPAAAKLFAIMQLPVADINAQNAIMHDGKASE GDIQGHVDGWIKAHQQQFDGWVNEALAAQK >gi|296494621|gb|ADTN01000117.1| GENE 12 11014 - 12078 1268 354 aa, chain - ## HITS:1 COG:proW KEGG:ns NR:ns ## COG: proW COG4176 # Protein_GI_number: 16130592 # Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport system, permease component # Organism: Escherichia coli K12 # 1 354 1 354 354 583 100.0 1e-166 MADQNNPWDTTPAADSAAQSADAWGTPTTAPTDGGGADWLTSTPAPNVEHFNILDPFHKT LIPLDSWVTEGIDWVVTHFRPVFQGVRVPVDYILNGFQQLLLGMPAPVAIIVFALIAWQI SGVGMGVATLVSLIAIGAIGAWSQAMVTLALVLTALLFCIVIGLPLGIWLARSPRAAKII RPLLDAMQTTPAFVYLVPIVMLFGIGNVPGVVVTIIFALPPIIRLTILGINQVPADLIEA SRSFGASPRQMLFKVQLPLAMPTIMAGVNQTLMLALSMVVIASMIAVGGLGQMVLRGIGR LDMGLATVGGVGIVILAIILDRLTQAVGRDSRSRGNRRWYTTGPVGLLTRPFIK >gi|296494621|gb|ADTN01000117.1| GENE 13 12071 - 13273 1213 400 aa, chain - ## HITS:1 COG:proV KEGG:ns NR:ns ## COG: proV COG4175 # Protein_GI_number: 16130591 # 
Func_class: E Amino acid transport and metabolism # Function: ABC-type proline/glycine betaine transport system, ATPase component # Organism: Escherichia coli K12 # 1 400 1 400 400 764 100.0 0 MAIKLEIKNLYKIFGEHPQRAFKYIEQGLSKEQILEKTGLSLGVKDASLAIEEGEIFVIM GLSGSGKSTMVRLLNRLIEPTRGQVLIDGVDIAKISDAELREVRRKKIAMVFQSFALMPH MTVLDNTAFGMELAGINAEERREKALDALRQVGLENYAHSYPDELSGGMRQRVGLARALA INPDILLMDEAFSALDPLIRTEMQDELVKLQAKHQRTIVFISHDLDEAMRIGDRIAIMQN GEVVQVGTPDEILNNPANDYVRTFFRGVDISQVFSAKDIARRTPNGLIRKTPGFGPRSAL KLLQDEDREYGYVIERGNKFVGAVSIDSLKTALTQQQGLDAALIDAPLAVDAQTPLSELL SHVGQAPCAVPVVDEDQQYVGIISKGMLLRALDREGVNNG >gi|296494621|gb|ADTN01000117.1| GENE 14 13627 - 14586 959 319 aa, chain - ## HITS:1 COG:nrdF KEGG:ns NR:ns ## COG: nrdF COG0208 # Protein_GI_number: 16130590 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, beta subunit # Organism: Escherichia coli K12 # 1 319 1 319 319 637 100.0 0 MKLSRISAINWNKISDDKDLEVWNRLTSNFWLPEKVPLSNDIPAWQTLTVVEQQLTMRVF TGLTLLDTLQNVIGAPSLMPDALTPHEEAVLSNISFMEAVHARSYSSIFSTLCQTKDVDA AYAWSEENAPLQRKAQIIQQHYRGDDPLKKKIASVFLESFLFYSGFWLPMYFSSRGKLTN TADLIRLIIRDEAVHGYYIGYKYQKNMEKISLGQREELKSFAFDLLLELYDNELQYTDEL YAETPWADDVKAFLCYNANKALMNLGYEPLFPAEMAEVNPAILAALSPNADENHDFFSGS GSSYVMGKAVETEDEDWNF >gi|296494621|gb|ADTN01000117.1| GENE 15 14596 - 16740 2166 714 aa, chain - ## HITS:1 COG:nrdE KEGG:ns NR:ns ## COG: nrdE COG0209 # Protein_GI_number: 16130589 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Escherichia coli K12 # 1 714 1 714 714 1456 99.0 0 MATTTAECLTQETMDYHALNAMLNLYDSAGRIQFDKDRQAVDAFIATHVRPNSVTFSSQQ QRLNWLVNEGYYDESVLNRYSRDFVITLFAHAHTSGFRFQTFLGAWKFYTSYTLKTFDGK RYLEDFADRVTMVALTLAQGDETLALQLTDEMLSGRFQPATPTFLNCGKQQRGELVSCFL LRIEDNMESIGRAVNSALQLSKRGGGVAFLLSNLREAGAPIKRIENQSSGVIPVMKMLED AFSYANQLGARQGAGAVYLHAHHPDILRFLDTKRENADEKIRIKTLSLGVVIPDITFHLA KENAQMALFSPYDVERVYGKPFADVAISQHYDELVADERIRKKYLNARDFFQRLAEIQFE 
SGYPYIMYEDTVNRANPIAGRINMSNLCSEILQVNSASEYDENLDYTRTGHDISCNLGSL NIAHTMDSPDFARTVETAVRGLTAVSDMSHIRSVPSIEAGNAASHAIGLGQMNLHGYLAR EGIAYGSPEALDFTNLYFYAITWHALRTSMLLARERGETFAGFKQSRYASGEYFSQYLQG NWQPKTAKVGELFTRSGITLPTREMWAQLRDDVMRYGIYNQNLQAVPPTGSISYINHATS SIHPIVAKVEIRKEGKTGRVYYPAPFMTNENLALYQDAYEIGAEKIIDTYAEATRHVDQG LSLTLFFPDTATTRDINKAQIYAWRKGIKTLYYIRLRQMALEGTEIEGCVSCAL >gi|296494621|gb|ADTN01000117.1| GENE 16 16713 - 17123 293 136 aa, chain - ## HITS:1 COG:ECs3537 KEGG:ns NR:ns ## COG: ECs3537 COG1780 # Protein_GI_number: 15832791 # Func_class: F Nucleotide transport and metabolism # Function: Protein involved in ribonucleotide reduction # Organism: Escherichia coli O157:H7 # 1 136 1 136 136 268 100.0 1e-72 MSQLVYFSSSSENTQRFIERLGLPAVRIPLNERERIQVDEPYILIVPSYGGGGTAGAVPR QVIRFLNDEHNRALLRGVIASGNRNFGEAYGRAGDVIARKCGVPWLYRFELMGTQSDIEN VRKGVTEFWQRQPQNA >gi|296494621|gb|ADTN01000117.1| GENE 17 17120 - 17365 286 81 aa, chain - ## HITS:1 COG:ECs3536 KEGG:ns NR:ns ## COG: ECs3536 COG0695 # Protein_GI_number: 15832790 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutaredoxin and related proteins # Organism: Escherichia coli O157:H7 # 1 81 1 81 81 164 100.0 4e-41 MRITIYTRNDCVQCHATKRAMENRGFDFEMINVDRVPEAAEALRAQGFRQLPVVIAGDLS WSGFRPDMINRLHPAPHAASA >gi|296494621|gb|ADTN01000117.1| GENE 18 17613 - 17942 284 109 aa, chain - ## HITS:1 COG:ECs3533 KEGG:ns NR:ns ## COG: ECs3533 COG4575 # Protein_GI_number: 15832787 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 109 5 113 113 192 100.0 2e-49 MFNRPNRNDVDDGVQDIQNDVNQLADSLESVLKSWGSDAKGEAEAARSKAQALLKETRAR MHGRTRVQQAARDAVGCADSFVRERPWCSVGTAAAVGIFIGALLSMRKS >gi|296494621|gb|ADTN01000117.1| GENE 19 18094 - 18438 369 114 aa, chain + ## HITS:1 COG:no KEGG:G2583_3317 NR:ns ## KEGG: G2583_3317 # Name: ygaC # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 114 1 114 114 221 100.0 8e-57 
MYLRPDEVARVLEKVGFTVDVVTQKAYGYRRGENYVYVNREARMGRTALVIHPTLKERSS TLAEPASDIKTCDHYQQFPLYLAGERHEHYGIPHGFSSRVALERYLNGLFGEAS Prediction of potential genes in microbial genomes Time: Sun May 15 23:31:57 2011 Seq name: gi|296494620|gb|ADTN01000118.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont249.10, whole genome shotgun sequence Length of sequence - 8632 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 6, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 14 - 463 400 ## ECSP_3616 predicted inner membrane protein - Prom 578 - 637 6.2 + Prom 924 - 983 5.8 2 2 Tu 1 . + CDS 1132 - 1536 464 ## COG2916 DNA-binding protein H-NS - Term 1413 - 1446 1.0 3 3 Op 1 7/0.000 - CDS 1583 - 2107 383 ## COG0607 Rhodanese-related sulfurtransferase 4 3 Op 2 . - CDS 2117 - 2416 337 ## COG0640 Predicted transcriptional regulators - Prom 2440 - 2499 5.6 + Prom 2483 - 2542 4.8 5 4 Op 1 4/0.667 + CDS 2599 - 2757 235 ## COG0401 Uncharacterized homolog of Blt101 + Term 2773 - 2804 3.9 + Prom 2759 - 2818 2.2 6 4 Op 2 . + CDS 2841 - 3290 531 ## COG1652 Uncharacterized protein containing LysM domain - Term 3229 - 3262 5.1 7 5 Op 1 1/1.000 - CDS 3291 - 3953 756 ## COG1802 Transcriptional regulators 8 5 Op 2 4/0.667 - CDS 3974 - 5374 1514 ## COG1113 Gamma-aminobutyrate permease and related permeases - Prom 5541 - 5600 2.0 - Term 5511 - 5554 3.0 9 6 Op 1 12/0.000 - CDS 5612 - 6892 1386 ## COG0160 4-aminobutyrate aminotransferase and related aminotransferases 10 6 Op 2 3/0.667 - CDS 6906 - 8354 1812 ## COG1012 NAD-dependent aldehyde dehydrogenases 11 6 Op 3 . 
- CDS 8377 - 8613 70 ## COG0579 Predicted dehydrogenase Predicted protein(s) >gi|296494620|gb|ADTN01000118.1| GENE 1 14 - 463 400 149 aa, chain - ## HITS:1 COG:no KEGG:ECSP_3616 NR:ns ## KEGG: ECSP_3616 # Name: ygaW # Def: predicted inner membrane protein # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 149 1 149 149 269 100.0 2e-71 MFSPQSRLRHAVADTFAMVVYCSVVNMCIEVFLSGMSFEQSFYSRLVAIPVNILIAWPYG MYRDLFMRAARKVSPSGWIKNLADILAYVTFQSPVYVAILLVVGADWHQIMAAVSSNIVV SMLMGAVYGYFLDYCRRLFKVSRYQQVKA >gi|296494620|gb|ADTN01000118.1| GENE 2 1132 - 1536 464 134 aa, chain + ## HITS:1 COG:ECs3530 KEGG:ns NR:ns ## COG: ECs3530 COG2916 # Protein_GI_number: 15832784 # Func_class: R General function prediction only # Function: DNA-binding protein H-NS # Organism: Escherichia coli O157:H7 # 1 134 1 134 134 213 100.0 1e-55 MSVMLQSLNNIRTLRAMAREFSIDVLEEMLEKFRVVTKERREEEEQQQRELAERQEKIST WLELMKADGINPEELLGNSSAAAPRAGKKRQPRPAKYKFTDVNGETKTWTGQGRTPKPIA QALAEGKSLDDFLI >gi|296494620|gb|ADTN01000118.1| GENE 3 1583 - 2107 383 174 aa, chain - ## HITS:1 COG:ygaP KEGG:ns NR:ns ## COG: ygaP COG0607 # Protein_GI_number: 16130582 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Escherichia coli K12 # 1 174 1 174 174 340 100.0 5e-94 MALTTISPHDAQELIARGAKLIDIRDADEYLREHIPEADLAPLSVLEQSGLPAKLRHEQI IFHCQAGKRTSNNADKLAAIAAPAEIFLLEDGIDGWKKAGLPVAVNKSQPLPLMRQVQIA AGGLILIGVVLGYTVNSGFFLLSGFVGAGLLFAGISGFCGMARLLDKMPWNQRA >gi|296494620|gb|ADTN01000118.1| GENE 4 2117 - 2416 337 99 aa, chain - ## HITS:1 COG:ygaV KEGG:ns NR:ns ## COG: ygaV COG0640 # Protein_GI_number: 16130581 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 99 1 99 99 162 100.0 2e-40 MTELAQLQASAEQAAALLKAMSHPKRLLILCMLSGSPGTSAGELTRITGLSASATSQHLA RMRDEGLIDSQRDAQRILYSIKNEAVNAIIATLKNVYCP >gi|296494620|gb|ADTN01000118.1| GENE 5 2599 - 2757 235 52 aa, chain + ## HITS:1 COG:ECs3527 KEGG:ns NR:ns ## COG: ECs3527 COG0401 # 
Protein_GI_number: 15832781 # Func_class: S Function unknown # Function: Uncharacterized homolog of Blt101 # Organism: Escherichia coli O157:H7 # 1 52 1 52 52 77 100.0 6e-15 MGFWRIVITIILPPLGVLLGKGFGWAFIINILLTLLGYIPGLIHAFWVQTRD >gi|296494620|gb|ADTN01000118.1| GENE 6 2841 - 3290 531 149 aa, chain + ## HITS:1 COG:ygaU KEGG:ns NR:ns ## COG: ygaU COG1652 # Protein_GI_number: 16130579 # Func_class: S Function unknown # Function: Uncharacterized protein containing LysM domain # Organism: Escherichia coli K12 # 1 149 1 149 149 267 100.0 4e-72 MGLFNFVKDAGEKLWDAVTGQHDKDDQAKKVQEHLNKTGIPDADKVNIQIADGKATVTGD GLSQEAKEKILVAVGNISGIASVDDQVKTATPATASQFYTVKSGDTLSAISKQVYGNANL YNKIFEANKPMLKSPDKIYPGQVLRIPEE >gi|296494620|gb|ADTN01000118.1| GENE 7 3291 - 3953 756 220 aa, chain - ## HITS:1 COG:ECs3525 KEGG:ns NR:ns ## COG: ECs3525 COG1802 # Protein_GI_number: 15832779 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 220 7 226 226 411 100.0 1e-115 MTITSLDGYRWLKNDIIRGNFQPDEKLRMSLLTSRYALGVGPLREALSQLVAERLVTVVN QKGYRVASMSEQELLDIFDARANMEAMLVSLAIARGGDEWEADVLAKAHLLSKLEACDAS EKMLDEWDLRHQAFHTAIVAGCGSHYLLQMRERLFDLAARYRFIWLRRTVLSVEMLEDKH DQHQTLTAAVLARDTARASELMRQHLLTPIPIIQQAMAGN >gi|296494620|gb|ADTN01000118.1| GENE 8 3974 - 5374 1514 466 aa, chain - ## HITS:1 COG:gabP KEGG:ns NR:ns ## COG: gabP COG1113 # Protein_GI_number: 16130577 # Func_class: E Amino acid transport and metabolism # Function: Gamma-aminobutyrate permease and related permeases # Organism: Escherichia coli K12 # 1 466 1 466 466 824 100.0 0 MGQSSQPHELGGGLKSRHVTMLSIAGVIGASLFVGSSVAIAEAGPAVLLAYLFAGLLVVM IMRMLAEMAVATPDTGSFSTYADKAIGRWAGYTIGWLYWWFWVLVIPLEANIAAMILHSW VPGIPIWLFSLVITLALTGSNLLSVKNYGEFEFWLALCKVIAILAFIFLGAVAISGFYPY AEVSGISRLWDSGGFMPNGFGAVLSAMLITMFSFMGAEIVTIAAAESDTPEKHIVRATNS VIWRISIFYLCSIFVVVALIPWNMPGLKAVGSYRSVLELLNIPHAKLIMDCVILLSVTSC LNSALYTASRMLYSLSRRGDAPAVMGKINRSKTPYVAVLLSTGAAFLTVVVNYYAPAKVF KFLIDSSGAIALLVYLVIAVSQLRMRKILRAEGSEIRLRMWLYPWLTWLVIGFITFVLVV 
MLFRPAQQLEVISTGLLAIGIICTVPIMARWKKLVLWQKTPVHNTR >gi|296494620|gb|ADTN01000118.1| GENE 9 5612 - 6892 1386 426 aa, chain - ## HITS:1 COG:gabT KEGG:ns NR:ns ## COG: gabT COG0160 # Protein_GI_number: 16130576 # Func_class: E Amino acid transport and metabolism # Function: 4-aminobutyrate aminotransferase and related aminotransferases # Organism: Escherichia coli K12 # 1 426 1 426 426 854 100.0 0 MNSNKELMQRRSQAIPRGVGQIHPIFADRAENCRVWDVEGREYLDFAGGIAVLNTGHLHP KVVAAVEAQLKKLSHTCFQVLAYEPYLELCEIMNQKVPGDFAKKTLLVTTGSEAVENAVK IARAATKRSGTIAFSGAYHGRTHYTLALTGKVNPYSAGMGLMPGHVYRALYPCPLHGISE DDAIASIHRIFKNDAAPEDIAAIVIEPVQGEGGFYASSPAFMQRLRALCDEHGIMLIADE VQSGAGRTGTLFAMEQMGVAPDLTTFAKSIAGGFPLAGVTGRAEVMDAVAPGGLGGTYAG NPIACVAALEVLKVFEQENLLQKANDLGQKLKDGLLAIAEKHPEIGDVRGLGAMIAIELF EDGDHNKPDAKLTAEIVARARDKGLILLSCGPYYNVLRILVPLTIEDAQIRQGLEIISQC FDEAKQ >gi|296494620|gb|ADTN01000118.1| GENE 10 6906 - 8354 1812 482 aa, chain - ## HITS:1 COG:gabD KEGG:ns NR:ns ## COG: gabD COG1012 # Protein_GI_number: 16130575 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Escherichia coli K12 # 1 482 1 482 482 954 100.0 0 MKLNDSNLFRQQALINGEWLDANNGEAIDVTNPANGDKLGSVPKMGADETRAAIDAANRA LPAWRALTAKERATILRNWFNLMMEHQDDLARLMTLEQGKPLAEAKGEISYAASFIEWFA EEGKRIYGDTIPGHQADKRLIVIKQPIGVTAAITPWNFPAAMITRKAGPALAAGCTMVLK PASQTPFSALALAELAIRAGVPAGVFNVVTGSAGAVGNELTSNPLVRKLSFTGSTEIGRQ LMEQCAKDIKKVSLELGGNAPFIVFDDADLDKAVEGALASKFRNAGQTCVCANRLYVQDG VYDRFAEKLQQAVSKLHIGDGLDNGVTIGPLIDEKAVAKVEEHIADALEKGARVVCGGKA HERGGNFFQPTILVDVPANAKVSKEETFGPLAPLFRFKDEADVIAQANDTEFGLAAYFYA RDLSRVFRVGEALEYGIVGINTGIISNEVAPFGGIKASGLGREGSKYGIEDYLEIKYMCI GL >gi|296494620|gb|ADTN01000118.1| GENE 11 8377 - 8613 70 78 aa, chain - ## HITS:1 COG:ygaF KEGG:ns NR:ns ## COG: ygaF COG0579 # Protein_GI_number: 16130574 # Func_class: R General function prediction only # Function: Predicted dehydrogenase # Organism: Escherichia coli K12 # 1 78 367 444 444 154 98.0 3e-38 
MRAQAVSPDGKLIDDFLFVTTPRTIHTCNAPSPAATSAIPIGAHIVSKVQTLLASQSNPG RTLRAARSVDALHAAFNQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:32:02 2011 Seq name: gi|296494619|gb|ADTN01000119.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont249.11, whole genome shotgun sequence Length of sequence - 5209 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 1030 831 ## COG0579 Predicted dehydrogenase 2 1 Op 2 . - CDS 1050 - 2027 1038 ## JW5427 hypothetical protein - Prom 2249 - 2308 4.8 - Term 2312 - 2352 8.5 3 2 Op 1 7/0.000 - CDS 2363 - 2527 67 ## COG0366 Glycosidases - Prom 2588 - 2647 5.8 4 2 Op 2 7/0.000 - CDS 2774 - 3565 0 ## COG0366 Glycosidases 5 2 Op 3 . - CDS 3578 - 4264 417 ## COG0366 Glycosidases - Prom 4370 - 4429 4.1 Predicted protein(s) >gi|296494619|gb|ADTN01000119.1| GENE 1 1 - 1030 831 343 aa, chain - ## HITS:1 COG:ygaF KEGG:ns NR:ns ## COG: ygaF COG0579 # Protein_GI_number: 16130574 # Func_class: R General function prediction only # Function: Predicted dehydrogenase # Organism: Escherichia coli K12 # 1 343 23 365 444 696 98.0 0 MYDFVIIGGGIIGRPPAMQLIDVYPDARIALLEKESGPACHQTGHNSGVIHAGVYYTPGS LKAQFCLAGNRATKAFCDQNGIRYDNCGKMLVATSDLEMERMRALWERTAANGIEREWLN ADELREREPNITGLGGIFVPSSGIVSYRDVTAAMAKIFQSRGGEIIYNAEVSGLNEHKNG VVIRTRQGGEYEASTLISCSGLMADRLVKMLGLEPGFIICPFRGEYFRLAPEHNQIVNHL IYPIPDPAMPFLGVHLTRMIDGSVTVGPNAVLAFKREGYRKRDFSFSDTLEILGSSGIRR VLQNHLRSGLGEMKNSLCKSGYLRLVQKYCPRLSLSDLQPWPA >gi|296494619|gb|ADTN01000119.1| GENE 2 1050 - 2027 1038 325 aa, chain - ## HITS:1 COG:no KEGG:JW5427 NR:ns ## KEGG: JW5427 # Name: ygaT # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 325 1 325 325 639 100.0 0 MNALTAVQNNAVDSGQDYSGFTLTPSAQSPRLLELTFTEQTTKQFLEQVAEWPVQALEYK SFLRFRVAKILDDLCANQLQPLLLKTLLNRAEGALLINAVGVDDVKQADEMVKLATAVAH LIGRSNFDAMSGQYYARFVVKNVDNSDSYLRQPHRVMELHNDGTYVEEITDYVLMMKIDE 
QNMQGGNSLLLHLDDWEHLDNYFRHPLARRPMRFAAPPSKNVSKDVFHPVFDVDQQGRPV MRYIDQFVQPKDFEEGVWLSELSDAIETSKGILSVPVPVGKFLLINNLFWLHGRDRFTPH PDLRRELMRQRGYFAYASNHYQTHQ >gi|296494619|gb|ADTN01000119.1| GENE 3 2363 - 2527 67 54 aa, chain - ## HITS:1 COG:ECs3519 KEGG:ns NR:ns ## COG: ECs3519 COG0366 # Protein_GI_number: 15832773 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Escherichia coli O157:H7 # 1 54 282 335 335 103 94.0 1e-22 MGSNSVNIIINNTRKIIPPGKVFTLRGGTLNINIPGRSALLLGKTGEPPNYLYL >gi|296494619|gb|ADTN01000119.1| GENE 4 2774 - 3565 0 263 aa, chain - ## HITS:1 COG:b2657 KEGG:ns NR:ns ## COG: b2657 COG0366 # Protein_GI_number: 16130571 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Escherichia coli K12 # 54 263 1 210 210 416 99.0 1e-116 MIPFIKDPVTKERKQIHPEDIHLTAKDFEASKDNISKDEWENLHALKEKRLNGMPKTTPK SDQVIMLQNQYVREMRKYGVRGLRYDAAKHSKHEQIERSITPPLKNYNERLHNTNLFNPK YHKKAVMNYMEYLVTCQLDEQQMSSLLYERDDLSAIDFSLLMKTIKAFSFGGDLQTLASK PGSTISSIPSERRILININHDFPNNGNLFNDFLFNHQQDEQLAMAYIDALPFSRPLVYWD GQVLKSTTEIKNYDGSTRVGGEA >gi|296494619|gb|ADTN01000119.1| GENE 5 3578 - 4264 417 228 aa, chain - ## HITS:1 COG:ECs3518 KEGG:ns NR:ns ## COG: ECs3518 COG0366 # Protein_GI_number: 15832772 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Escherichia coli O157:H7 # 1 228 118 345 412 461 98.0 1e-130 MIERGDNSGVYQYGRAEHFTHIISDKPSPKDKYVAYAINIPDYELAADVYNINVTSPSGQ QETFKILINLEHLRQTLERKSLTAVQKSQCEIITPKKPGEAILHAFNATYQQIRENMSEF ARCHYGYIQIPPVTTFRADGPETPEEEKGYWFHAYQPEDLCTIHNPMGDLQDFIALVKDA KKFGIDIIPDYTFNFMGIGGSGKNDLDYPSADIRAKISKDIEGGIPGY Prediction of potential genes in microbial genomes Time: Sun May 15 23:32:11 2011 Seq name: gi|296494618|gb|ADTN01000120.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont250.1, whole genome shotgun sequence Length of sequence - 15569 bp Number of predicted genes - 17, with homology - 16 Number of transcription units - 8, operones - 5 average op.length - 2.8 N Tu/Op 
Conserved S Start End Score pairs(N/Pv) - Term 319 - 349 1.6 1 1 Tu 1 . - CDS 388 - 1563 859 ## COG3135 Uncharacterized protein involved in benzoate metabolism 2 2 Op 1 3/0.500 + CDS 1655 - 2191 362 ## COG1396 Predicted transcriptional regulators + Term 2222 - 2252 1.1 3 2 Op 2 . + CDS 2264 - 4225 1632 ## COG0826 Collagenase and related proteases 4 3 Tu 1 . - CDS 4317 - 4487 203 ## JW1432 hypothetical protein - Prom 4687 - 4746 4.2 5 4 Op 1 . + CDS 4871 - 4945 85 ## 6 4 Op 2 2/1.000 + CDS 4970 - 5407 428 ## COG1598 Uncharacterized conserved protein 7 4 Op 3 3/0.500 + CDS 5486 - 6892 1099 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs + Prom 6936 - 6995 4.5 8 5 Op 1 13/0.000 + CDS 7137 - 8282 1184 ## COG0687 Spermidine/putrescine-binding periplasmic protein 9 5 Op 2 30/0.000 + CDS 8300 - 9313 867 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 10 5 Op 3 36/0.000 + CDS 9314 - 10255 1042 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I 11 5 Op 4 5/0.000 + CDS 10245 - 11039 799 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II 12 5 Op 5 . + CDS 11061 - 12485 1255 ## COG1012 NAD-dependent aldehyde dehydrogenases + Prom 12661 - 12720 3.1 13 6 Op 1 . + CDS 12824 - 13045 142 ## ECIAI1_1442 conserved hypothetical protein; putative inner membrane protein 14 6 Op 2 . + CDS 13131 - 13364 356 ## G2583_1807 hypothetical protein 15 7 Op 1 4/0.000 - CDS 13365 - 13814 420 ## COG3238 Uncharacterized protein conserved in bacteria 16 7 Op 2 . - CDS 13811 - 14329 527 ## COG1247 Sortase and related acyltransferases - Prom 14360 - 14419 2.3 + Prom 14296 - 14355 2.7 17 8 Tu 1 . 
+ CDS 14510 - 15547 962 ## COG2130 Putative NADP-dependent oxidoreductases Predicted protein(s) >gi|296494618|gb|ADTN01000120.1| GENE 1 388 - 1563 859 391 aa, chain - ## HITS:1 COG:ydcO KEGG:ns NR:ns ## COG: ydcO COG3135 # Protein_GI_number: 16129392 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein involved in benzoate metabolism # Organism: Escherichia coli K12 # 1 391 88 478 478 640 100.0 0 MRLFSIPPPTLLAGFLAVLIGYASSAAIIWQAAIVAGATTAQISGWMTALGLAMGVSTLT LTLWYRVPVLTAWSTPGAALLVTGLQGLTLNEAIGVFIVTNALIVLCGITGLFARLMRII PHSLAAAMLAGILLRFGLQAFASLDGQFTLCGSMLLVWLATKAVAPRYAVIAAMIIGIVI VIAQGDVVTTDVVFKPVLPTYITPDFSFAHSLSVALPLFLVTMASQNAPGIAAMKAAGYS APVSPLIVFTGLLALVFSPFGVYSVGIAAITAAICQSPEAHPDKDQRWLAAAVAGIFYLL AGLFGSAITGMMAALPVSWIQMLAGLALLSTIGGSLYQALHNERERDAAVVAFLVTASGL TLVGIGSAFWGLIAGGVCYVVLNLIADRNRY >gi|296494618|gb|ADTN01000120.1| GENE 2 1655 - 2191 362 178 aa, chain + ## HITS:1 COG:ydcN KEGG:ns NR:ns ## COG: ydcN COG1396 # Protein_GI_number: 16129393 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 178 1 178 178 361 100.0 1e-100 MENLARFLSTTLKQLRQQRGWSLSRLAEATGVSKAMLGQIERNESSPTVATLWKIATGLN VPFSTFISPPQSATPSVYDPQQQAMVITSLFPYDPQLCFEHFSIQMASGAISESTPHEKG VIEHVVVIDGQLDLCVDGEWQTLNCGEGVRFAADVTHIYRNGGEQTVHFHSLIHYPRS >gi|296494618|gb|ADTN01000120.1| GENE 3 2264 - 4225 1632 653 aa, chain + ## HITS:1 COG:ECs2039 KEGG:ns NR:ns ## COG: ECs2039 COG0826 # Protein_GI_number: 15831293 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Escherichia coli O157:H7 # 1 653 15 667 667 1354 99.0 0 MTVSSHRLELLSPARDAAIAREAILHGADAVYIGGPGFGARHNASNSLKDIAELVPFAHR YGAKIFVTLNTILHDDELEPAQRLITDLYQTGVDALIVQDMGILELDIPPIELHASTQCD IRTVEKAKFLSDVGFTQIVLARELNLDQIRAIHQATDATIEFFIHGALCVAYSGQCYISH AQTGRSANRGDCSQACRLPYTLKDDQGRVVSYEKHLLSMKDNDQTANLGALIDAGVRSFK 
IEGRYKDMSYVKNITAHYRQMLDAIIEERGDLARASSGRTEHFFVPSTEKTFHRGSTDYF VNARKGDIGAFDSPKFIGLPVGEVVKVAKDHLDVAVTEPLANGDGLNVLIKREVVGFRAN TVEKTGENQYRVWPNEMPADLHKIRPHHPLNRNLDHNWQQALTKTSSERRVAVDIELGGW QEQLILTLTSEEGVSITHTLDGQFDEANNAEKAMNNLKDGLAKLGQTLYYARDVQINLPG ALFVPNSLLNQFRREAADMLDAARLASYQRGSRKPVADPAPVYPQTHLSFLANVYNQKAR EFYHRYGVQLIDAAYEAHEEKGEVPVMITKHCLRFAFNLCPKQAKGNIKSWKATPMQLVN GDEVLTLKFDCRPCEMHVIGKIKNHILKMPLPGSVVASVSPDELLKTLPKRKG >gi|296494618|gb|ADTN01000120.1| GENE 4 4317 - 4487 203 56 aa, chain - ## HITS:1 COG:no KEGG:JW1432 NR:ns ## KEGG: JW1432 # Name: yncJ # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 56 21 76 76 102 100.0 6e-21 MAGHKGHEFVWVKNVDHQLRHEADSDELRAVAEESAEGLREHFYWQKSRKPEAGQR >gi|296494618|gb|ADTN01000120.1| GENE 5 4871 - 4945 85 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPRHPCDEIKEPLRKAILKQLGLS >gi|296494618|gb|ADTN01000120.1| GENE 6 4970 - 5407 428 145 aa, chain + ## HITS:1 COG:ECs2042 KEGG:ns NR:ns ## COG: ECs2042 COG1598 # Protein_GI_number: 15831296 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 145 1 145 145 263 100.0 9e-71 MRETVEIMRYPVTLTPAPEGGYMVSFVDIPEALTQGETVAEAMEAAKDALLTAFDFYFED NELIPLPSPLNSHDHFIEVPLSVASKVLLLNAFLQSEITQQELARRIGKPKQEITRLFNL HHATKIDAVQLAAKALGKELSLVMV >gi|296494618|gb|ADTN01000120.1| GENE 7 5486 - 6892 1099 468 aa, chain + ## HITS:1 COG:ydcR KEGG:ns NR:ns ## COG: ydcR COG1167 # Protein_GI_number: 16129398 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Escherichia coli K12 # 1 468 1 468 468 917 99.0 0 MKKYQQLAEQLREQIASGIWQPGDRLPSLRDQVALSGMSFMTVSHAYQLLESQGYIIARP QSGYYVAPQAIKMPKAPVIPVTRDEAVDINTYIFDMLQASRDPSVVPFASAFPDPRLFPL QQLNRSLAQVSKTATAMSVIENLPPGNAELRQAIARRYALQGITISPDEIVITAGALEAL NLSLQAVTEPGDWVIVENPCFYGALQALERLRLKALSVATDVKEGIDLQALELALQEYPV 
KACWLMTNSQNPLGFTLTPQKKAQLVALLNQYNVTLIEDDVYSELYFGREKPLPAKAWDR HDGVLHCSSFSKCLVPGFRIGWVAAGKHARKIQRLQLMSTLSTSSPMQLALVDYLSTRRY DAHLRRLRRQLAERKQRAWQALLRYLPAEVKIHHNDSGSFLWLELPEPLDAGELSLAALT HHISIAPGKMFSTGENWSRFFRFNTAWQWGEREEQAVKQLGKLIQERL >gi|296494618|gb|ADTN01000120.1| GENE 8 7137 - 8282 1184 381 aa, chain + ## HITS:1 COG:ydcS KEGG:ns NR:ns ## COG: ydcS COG0687 # Protein_GI_number: 16129399 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Escherichia coli K12 # 1 381 1 381 381 770 100.0 0 MSKTFARSSLCALSMTIMTAHAAEPPTNLDKPEGRLDIIAWPGYIERGQTDKQYDWVTQF EKETGCAVNVKTAATSDEMVSLMTKGGYDLVTASGDASLRLIMGKRVQPINTALIPNWKT LDPRVVKGDWFNVGGKVYGTPYQWGPNLLMYNTKTFPTPPDSWQVVFVEQNLPDGKSNKG RVQAYDGPIYIADAALFVKATQPQLGISDPYQLTEEQYQAVLKVLRAQHSLIHRYWHDTT VQMSDFKNEGVVASSAWPYQANALKAEGQPVATVFPKEGVTGWADTTMLHSEAKHPVCAY KWMNWSLTPKVQGDVAAWFGSLPVVPEGCKASPLLGEKGCETNGFNYFDKIAFWKTPIAE GGKFVPYSRWTQDYIAIMGGR >gi|296494618|gb|ADTN01000120.1| GENE 9 8300 - 9313 867 337 aa, chain + ## HITS:1 COG:ydcT KEGG:ns NR:ns ## COG: ydcT COG3842 # Protein_GI_number: 16129400 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Escherichia coli K12 # 1 337 1 337 337 670 100.0 0 MTYAVEFDNVSRLYGDVRAVDGVSIAIKDGEFFSMLGPSGSGKTTCLRLIAGFEQLSGGA ISIFGKPASNLPPWERDVNTVFQDYALFPHMSILDNVAYGLMVKGVNKKQRHAMAQEALE KVALGFVHQRKPSQLSGGQRQRVAIARALVNEPRVLLLDEPLGALDLKLREQMQLELKKL QQSLGITFIFVTHDQGEALSMSDRVAVFNNGRIEQVDSPRDLYMRPRTPFVAGFVGTSNV FDGLMAEKLCGMTGSFALRPEHIRLNTPGELQANGTIQAVQYQGAATRFELKLNGGEKLL VSQANMTGEELPATLTPGQQVMVSWSRDVMVPLVEER >gi|296494618|gb|ADTN01000120.1| GENE 10 9314 - 10255 1042 313 aa, chain + ## HITS:1 COG:ydcU KEGG:ns NR:ns ## COG: ydcU COG1176 # Protein_GI_number: 16129401 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Escherichia coli K12 # 1 313 1 313 313 506 
99.0 1e-143 MAMNVLQSPSRPGLGKVSGFFWHNPGLGLFLLLLGPLMWLGIVYFGSLLTLLWQGFYTFD DFTMSVTPELTLANIRALFNPANYDIILRTLTMAVAVTIASAILAFPMAWYMARYTSGKM KAFFYIAVMLPMWASYIVKAYAWTLLLAKDGVAQWFLQHLGLEPLLTAFLTLPAVGGNTL STSGLGRFLVFLYIWLPFMILPVQAALERLPPSLLQASADLGARPRQTFRYVVLPLAIPG IAAGSIFTFSLTLGDFIVPQLVGPPGYFIGNMVYSQQGAIGNMPMAAAFTLVPIILIALY LAFVKRLGAFDAL >gi|296494618|gb|ADTN01000120.1| GENE 11 10245 - 11039 799 264 aa, chain + ## HITS:1 COG:ydcV KEGG:ns NR:ns ## COG: ydcV COG1177 # Protein_GI_number: 16129402 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # Organism: Escherichia coli K12 # 1 264 1 264 264 422 100.0 1e-118 MHSERAPFFLKLAAWGGVVFLHFPILIIAAYAFNTEDAAFSFPPQGLTLRWFSVAAQRSD ILDAVTLSLKVAALATLIALVLGTLAAAALWRRDFFGKNAISLLLLLPIALPGIVTGLAL LTAFKTINLEPGFFTIVVGHATFCVVVVFNNVIARFRRTSWSLVEASMDLGANGWQTFRY VVLPNLSSALLAGGMLAFALSFDEIIVTTFTAGHERTLPLWLLNQLGRPRDVPVTNVVAL LVMLVTTLPILGAWWLTREGDNGQ >gi|296494618|gb|ADTN01000120.1| GENE 12 11061 - 12485 1255 474 aa, chain + ## HITS:1 COG:ydcW KEGG:ns NR:ns ## COG: ydcW COG1012 # Protein_GI_number: 16129403 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Escherichia coli K12 # 1 474 1 474 474 928 99.0 0 MQHKLLINGELVSGEGEKQPVYNPATGDVLLEIAEASAEQVDAAVRAADAAFAEWGQTTP KVRAECLLKLADVIEENGQVFAELESRNCGKPLHSAFNDEIPAIVDVFRFFAGAARCLNG LAAGEYLEGHTSMIRRDPLGVVASIAPWNYPLMMAAWKLAPALAAGNCVVLKPSEITPLT ALKLAELAKDIFPAGVINILFGRGKTVGDPLTGHPKVRMVSLTGSIATGEHIISHTASSI KRTHMELGGKAPVIVFDDADIEAVVEGVRTFGYYNAGQDCTAACRIYAQKGIYDTLVEKL GAAVATLKSGAPDDESTELGPLSSLAHLERVSKAVEEAKATGHIKVITGGEKRKGNGYYY APTLLAGALQDDAIVQKEVFGPVVSVTPFDNEEQVVNWANDSQYGLASSVWTKDVGRAHR VSARLQYGCTWVNTHFMLVSEMPHGGQKLSGYGKDMSLYGLEDYTVVRHVMVKH >gi|296494618|gb|ADTN01000120.1| GENE 13 12824 - 13045 142 73 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_1442 NR:ns ## KEGG: ECIAI1_1442 # Name: ydcX # Def: conserved hypothetical protein; putative inner membrane protein # Organism: 
E.coli_IAI1 # Pathway: not_defined # 1 73 10 82 82 117 100.0 1e-25 MTHICARFIHLAGRPYMSLYQHMLVFYAVMAAIAFLITWFLSHDKKRIRFLSAFLVGATW PMSFPVALLFSLF >gi|296494618|gb|ADTN01000120.1| GENE 14 13131 - 13364 356 77 aa, chain + ## HITS:1 COG:no KEGG:G2583_1807 NR:ns ## KEGG: G2583_1807 # Name: ydcY # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 77 1 77 77 112 100.0 5e-24 MSHLDEVIARVDAAIEESVIAHMNELLIALSDDAELSREDRYTQQQRLRTAIAHHGRKHK EDMEARHEQLTKGGTIL >gi|296494618|gb|ADTN01000120.1| GENE 15 13365 - 13814 420 149 aa, chain - ## HITS:1 COG:ydcZ KEGG:ns NR:ns ## COG: ydcZ COG3238 # Protein_GI_number: 16129406 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 149 1 149 149 221 99.0 3e-58 MNQSLTLAFLIAAGIGLVVQNTLMVRITQTTSTILIAMLLNSLVGIVLFVSILWFKQGMA GFGELVSSVRWWTLIPGLLGSFFVFASISGYQNVGAATTIAVLVASQLIGGLMLDIFRSH GVPLRALFGPICGAILLVVGAWLVARRSF >gi|296494618|gb|ADTN01000120.1| GENE 16 13811 - 14329 527 172 aa, chain - ## HITS:1 COG:yncA KEGG:ns NR:ns ## COG: yncA COG1247 # Protein_GI_number: 16129407 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sortase and related acyltransferases # Organism: Escherichia coli K12 # 1 172 1 172 172 356 100.0 1e-98 MSIRFARKADCAAIAEIYNHAVLYTAAIWNDQTVDADNRIAWFEARTLAGYPVLVSEENG VVTGYASFGDWRSFDGFRHTVEHSVYVHPDHQGKGLGRKLLSRLIDEARDCGKHVMVAGI ESQNQASLHLHQSLGFVVTAQMPQVGTKFGRWLDLTFMQLQLDERTEPDAIG >gi|296494618|gb|ADTN01000120.1| GENE 17 14510 - 15547 962 345 aa, chain + ## HITS:1 COG:yncB KEGG:ns NR:ns ## COG: yncB COG2130 # Protein_GI_number: 16129408 # Func_class: R General function prediction only # Function: Putative NADP-dependent oxidoreductases # Organism: Escherichia coli K12 # 1 345 32 376 376 709 100.0 0 MGQQKQRNRRWVLASRPHGAPVPENFRLEEDDVATPGEGQVLLRTVYLSLDPYMRGRMSD EPSYSPPVDIGGVMVGGTVSRVVESNHPDYQSGDWVLGYSGWQDYDISSGDDLVKLGDHP QNPSWSLGVLGMPGFTAYMGLLDIGQPKEGETLVVAAATGPVGATVGQIGKLKGCRVVGV 
AGGAEKCRHATEVLGFDVCLDHHADDFAEQLAKACPKGIDIYYENVGGKVFDAVLPLLNT SARIPVCGLVSSYNATELPPGPDRLPLLMATVLKKRIRLQGFIIAQDYGHRIHEFQREMG QWVKEDKIHYREEITDGLENAPQTFIGLLKGKNFGKVVIRVAGDD Prediction of potential genes in microbial genomes Time: Sun May 15 23:32:23 2011 Seq name: gi|296494617|gb|ADTN01000121.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont250.2, whole genome shotgun sequence Length of sequence - 8990 bp Number of predicted genes - 8, with homology - 7 Number of transcription units - 7, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 5.9 1 1 Tu 1 . + CDS 185 - 850 632 ## COG1802 Transcriptional regulators + Term 933 - 970 0.3 - Term 840 - 879 5.1 2 2 Tu 1 . - CDS 886 - 2988 1669 ## COG1629 Outer membrane receptor proteins, mostly Fe transport - Prom 3017 - 3076 3.9 + Prom 3015 - 3074 5.5 3 3 Tu 1 . + CDS 3230 - 4291 1320 ## COG3391 Uncharacterized conserved protein + Term 4373 - 4407 6.8 - Term 4222 - 4266 1.0 4 4 Tu 1 . - CDS 4404 - 5903 1561 ## COG1113 Gamma-aminobutyrate permease and related permeases - Prom 6016 - 6075 3.8 + Prom 6074 - 6133 4.4 5 5 Op 1 . + CDS 6170 - 6787 427 ## COG0625 Glutathione S-transferase 6 5 Op 2 . + CDS 6863 - 7075 90 ## B21_01423 hypothetical protein + Prom 7539 - 7598 6.7 7 6 Tu 1 . + CDS 7659 - 7751 64 ## + Term 7838 - 7878 5.5 8 7 Tu 1 . 
+ CDS 7895 - 8990 891 ## COG3501 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|296494617|gb|ADTN01000121.1| GENE 1 185 - 850 632 221 aa, chain + ## HITS:1 COG:yncC KEGG:ns NR:ns ## COG: yncC COG1802 # Protein_GI_number: 16129409 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 221 20 240 240 403 99.0 1e-112 MPGTGKMKHVSLTLQVENDLKHQLSIGALKPGARLITKNLAEQLGMSITPVREALLRLVS VNALSVAPAQAFTVPEVGKCQLDEINRIRYELELMAVALAVENLTPQDLAELQELLEKLQ QAQEKGDMEQIINVNRLFRLAIYHRSNMPILCEMIEQLWVRMGPGLHYLYEAINPAELRE HIENYHLLLAALKAKDKEGCRHCLAEIMQQNIAILYQQYNR >gi|296494617|gb|ADTN01000121.1| GENE 2 886 - 2988 1669 700 aa, chain - ## HITS:1 COG:yncD KEGG:ns NR:ns ## COG: yncD COG1629 # Protein_GI_number: 16129410 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Escherichia coli K12 # 1 700 1 700 700 1379 100.0 0 MKIFSVRQTVLPALLVLSPVVFAADEQTMIVSAAPQVVSELDTPAAVSVVDGEEMRLATP RINLSESLTGVPGLQVQNRQNYAQDLQLSIRGFGSRSTYGIRGIRLYVDGIPATMPDGQG QTSNIDLSSVQNVEVLRGPFSALYGNASGGVMNVTTQTGQQPPTIEASSYYGSFGSWRYG LKATGATGDGTQPGDVDYTVSTTRFTTHGYRDHSGAQKNLANAKLGVRIDEASKLSLIFN SVDIKADDPGGLTKAEWKANPQQAPRAEQYDTRKTIKQTQAGLRYERSLSSRDDMSVMMY AGERETTQYQSIPMAPQLNPSHAGGVITLQRHYQGIDSRWTHRGELGVPVTFTTGLNYEN MSENRKGYNNFRLNSGMPEYGQKGELRRDERNLMWNIDPYLQTQWQLSEKLSLDAGVRYS SVWFDSNDHYVTPGNGDDSGDASYHKWLPAGSLKYAMTDAWNIYLAAGRGFETPTINELS YRADGQSGMNLGLKPSTNDTIEIGSKTRIGDGLLSLALFQTDTDDEIVVDSSSGGRTTYK NAGKTRRQGAELAWDQRFAGDFRVNASWTWLDATYRSNVCNEQDCNGNRMPGIARNMGFA SIGYVPEDGWYAGTEARYMGDIMADDENTAKAPSYTLVGLFTGYKYNYHNLTVDLFGRVD NLFDKEYVGSVIVNESNGRYYEPSPGRNYGVGMNIAWRFE >gi|296494617|gb|ADTN01000121.1| GENE 3 3230 - 4291 1320 353 aa, chain + ## HITS:1 COG:yncE KEGG:ns NR:ns ## COG: yncE COG3391 # Protein_GI_number: 16129411 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 353 1 353 353 621 100.0 1e-178 
MHLRHLFSSRLRGSLLLGSLLVVSSFSTQAAEEMLRKAVGKGAYEMAYSQQENALWLATS QSRKLDKGGVVYRLDPVTLEVTQAIHNDLKPFGATINNTTQTLWFGNTVNSAVTAIDAKT GEVKGRLVLDDRKRTEEVRPLQPRELVADDATNTVYISGIGKESVIWVVDGGNIKLKTAI QNTGKMSTGLALDSEGKRLYTTNADGELITIDTADNKILSRKKLLDDGKEHFFINISLDT ARQRAFITDSKAAEVLVVDTRNGNILAKVAAPESLAVLFNPARNEAYVTHRQAGKVSVID AKSYKVVKTFDTPTHPNSLALSADGKTLYVSVKQKSTKQQEATQPDDVIRIAL >gi|296494617|gb|ADTN01000121.1| GENE 4 4404 - 5903 1561 499 aa, chain - ## HITS:1 COG:ECs2057 KEGG:ns NR:ns ## COG: ECs2057 COG1113 # Protein_GI_number: 15831311 # Func_class: E Amino acid transport and metabolism # Function: Gamma-aminobutyrate permease and related permeases # Organism: Escherichia coli O157:H7 # 1 499 18 516 516 918 100.0 0 MSKHDTDTSDQHAAKRRWLNAHEEGYHKAMGNRQVQMIAIGGAIGTGLFLGAGARLQMAG PALALVYLICGLFSFFILRALGELVLHRPSSGSFVSYAREFLGEKAAYVAGWMYFINWAM TGIVDITAVALYMHYWGAFGGVPQWVFALAALTIVGTMNMIGVKWFAEMEFWFALIKVLA IVTFLVVGTVFLGSGQPLDGNTTGFHLITDNGGFFPHGLLPALVLIQGVVFAFASIEMVG TAAGECKDPQTMVPKAINSVIWRIGLFYVGSVVLLVMLLPWSAYQAGQSPFVTFFSKLGV PYIGSIMNIVVLTAALSSLNSGLYCTGRILRSMAMGGSAPSFMAKMSRQHVPYAGILATL VVYVVGVFLNYLVPSRVFEIVLNFASLGIIASWAFIIVCQMRLRKAIKEGKAADVSFKLP GAPFTSWLTLLFLLSVLVLMAFDYPNGTYTIAALPIIGILLVIGWFGVRKRVAEIHSTAP VVEEDEEKQEIVFKPETAS >gi|296494617|gb|ADTN01000121.1| GENE 5 6170 - 6787 427 205 aa, chain + ## HITS:1 COG:yncG KEGG:ns NR:ns ## COG: yncG COG0625 # Protein_GI_number: 16129413 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione S-transferase # Organism: Escherichia coli K12 # 1 205 1 205 205 418 100.0 1e-117 MIKVYGVPGWGSTISELMLTLADIPYQFVDVSGFDHEGASRELLKTLNPLCQVPTLALEN DEIMTETAAIALMVLDRRPDLAPPVGRAERQLFQRLLVWLVANVYPTFTFADYPERWAPD APEQLKKNVIEYRKSLYIWLNSQLTAEPYAFGEQLTLVDCYLCTMRTWGPGHEWFQDNAT NISAIADAVCQLPKLQEVLKRNEII >gi|296494617|gb|ADTN01000121.1| GENE 6 6863 - 7075 90 70 aa, chain + ## HITS:1 COG:no KEGG:B21_01423 NR:ns ## KEGG: B21_01423 # Name: yncH # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 70 1 70 70 102 
100.0 3e-21 MVCFLIYITLLFIQRVYFISSEKKLTIHIVQMFQLLSQAFYNLKMFLMMDMLGVGDAINI NTNKNIRQVC >gi|296494617|gb|ADTN01000121.1| GENE 7 7659 - 7751 64 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYEPVVNNVENPSVAFVIILLTKQTPVSLK >gi|296494617|gb|ADTN01000121.1| GENE 8 7895 - 8990 891 365 aa, chain + ## HITS:1 COG:ECs0607 KEGG:ns NR:ns ## COG: ECs0607 COG3501 # Protein_GI_number: 15829861 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 356 1 356 633 708 98.0 0 MSTGLRFTLEVDGLPPDAFAVVSFHLNQSLSSLFSLDLSLVSQQFLSLEFAQVLDKMAYL TIWQGDEVQRRVKGVVTWFELGENDKNQMLYSMKVHPPLWRAGLRQNFRIFQNEDIKSIL GTMLQENGVTEWSPLFSEPHPSREFCVQYGETDYDFLCRMAAEEGIFFYEEHAYKSTDQS LVLCDTVRHLPESFEIPWNPNTRTEVSTLCISQFRYSAQIRPSSVVTKDYTFKRPGWAGR FEQEGQHQDYQRTQYEVYDYPGRFKGAHGQNFARWQMDGWRNNAETARGMSRSPEIWPGR RIVLTGHPQANLNREWQVVASELHGEQPQAVPGRQGAGTALENHFAVIPADRTWRPGVSA FRRCG Prediction of potential genes in microbial genomes Time: Sun May 15 23:32:31 2011 Seq name: gi|296494616|gb|ADTN01000122.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont251.1, whole genome shotgun sequence Length of sequence - 8819 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.000 - CDS 73 - 1719 1834 ## COG0579 Predicted dehydrogenase - Prom 1810 - 1869 5.9 2 1 Op 2 . - CDS 1937 - 3580 198 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P - Prom 3606 - 3665 9.6 + Prom 3699 - 3758 5.0 3 2 Tu 1 . + CDS 3903 - 5060 326 ## COG0582 Integrase + Prom 5109 - 5168 1.6 4 3 Op 1 . + CDS 5279 - 5416 66 ## gi|301645408|ref|ZP_07245350.1| hypothetical protein HMPREF9543_02026 + Prom 5419 - 5478 1.7 5 3 Op 2 . + CDS 5512 - 5697 123 ## gi|301645409|ref|ZP_07245351.1| transcriptional regulator, AlpA family 6 4 Tu 1 . 
+ CDS 5912 - 8293 1088 ## IL0684 hypothetical protein + Term 8295 - 8344 4.2 Predicted protein(s) >gi|296494616|gb|ADTN01000122.1| GENE 1 73 - 1719 1834 548 aa, chain - ## HITS:1 COG:yojH KEGG:ns NR:ns ## COG: yojH COG0579 # Protein_GI_number: 16130147 # Func_class: R General function prediction only # Function: Predicted dehydrogenase # Organism: Escherichia coli K12 # 1 548 1 548 548 1078 100.0 0 MKKVTAMLFSMAVGLNAVSMAAKAKASEEQETDVLLIGGGIMSATLGTYLRELEPEWSMT MVERLEGVAQESSNGWNNAGTGHSALMELNYTPQNADGSISIEKAVAINEAFQISRQFWA HQVERGVLRTPRSFINTVPHMSFVWGEDNVNFLRARYAALQQSSLFRGMRYSEDHAQIKE WAPLVMEGRDPQQKVAATRTEIGTDVNYGEITRQLIASLQKKSNFSLQLSSEVRALKRND DNTWTVTVADLKNGTAQNIRAKFVFIGAGGAALKLLQESGIPEAKDYAGFPVGGQFLVSE NPDVVNHHLAKVYGKASVGAPPMSVPHIDTRVLDGKRVVLFGPFATFSTKFLKNGSLWDL MSSTTTSNVMPMMHVGLDNFDLVKYLVSQVMLSEEDRFEALKEYYPQAKKEDWRLWQAGQ RVQIIKRDAEKGGVLRLGTEVVSDQQGTIAALLGASPGASTAAPIMLNLLEKVFGDRVSS PQWQATLKAIVPSYGRKLNGDVAATERELQYTSEVLGLNYDKPQAADSTPKPQLKPQPVQ KEVADIAL >gi|296494616|gb|ADTN01000122.1| GENE 2 1937 - 3580 198 547 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 326 541 133 357 398 80 31 3e-15 MELLVLVWRQYRWPFISVMALSLASAALGIGLIAFINQRLIETADTSLLVLPEFLGLLLL LMAVTLGSQLALTTLGHHFVYRLRSEFIKRILDTHVERIEQLGSASLLAGLTSDVRNITI AFVRLPELVQGIILTIGSAAYLWMLSGKMLLVTAIWMAITIWGGFVLVARVYKHMATLRE TEDKLYTDFQTVLEGRKELTLNRERAEYVFNNLYIPDAQEYRHHIIRADTFHLSAVNWSN IMMLGAIGLVFWMANSLGWADTNVAATYSLTLLFLRTPLLSAVGALPTLLTAQVAFNKLN KFALAPFKAEFPRPQAFPNWQTLELRNVTFAYQDNAFSVGPINLTIKRGELLFLIGGNGS GKSTLAMLLTGLYQPQSGEILLDGKPVSGEQPEDYRKLFSAVFTDVWLFDQLLGPEGKPA NPQLVEKWLAQLKMAHKLELSNGRIVNLKLSKGQKKRVALLLALAEERDIILLDEWAADQ DPHFRREFYQVLLPLMQEMGKTIFAISHDDHYFIHADRLLEMRNGQLSELTGEERDAASR DAVARTA >gi|296494616|gb|ADTN01000122.1| GENE 3 3903 - 5060 326 385 aa, chain + ## HITS:1 COG:Z1835 KEGG:ns NR:ns ## COG: Z1835 COG0582 # Protein_GI_number: 15801304 # Func_class: L Replication, recombination and repair # Function: Integrase # 
Organism: Escherichia coli O157:H7 EDL933 # 2 378 15 393 416 272 39.0 8e-73 MLNDAKLRALARGHDNPNPQKIADSNGLAALVRPNGQISFIFRFRWSAKQQNFTLGQYPS LSLKAARERAAQCRKWLADGFDPRIQARLEKQANREQLTVRQAVDFWLTDSETAAADNYR KQFEKHIFPHLGDLPIDQLDTAAWLKVFSDIKRGTYHRAAPNAAAYVFGLTKTAMKYCRA RQMITTTVLNDLSAGDVGATVNKRDRVLKQNELSDAVEWSNDIANPFYYRALLKLLMVFG CRLSELRLSTIKEWDMDAMIWTAPKQHTKTGVEIQRPIPAALVPVLASLINDRKSGLLLT EKKTVGSVSNYIIGISRKLGHDSWTAHDIRRTVATRLADMGIHHHVIEALLGHVLPGVAG IYNRSHLLNEKREALELWISEGVKL >gi|296494616|gb|ADTN01000122.1| GENE 4 5279 - 5416 66 45 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301645408|ref|ZP_07245350.1| ## NR: gi|301645408|ref|ZP_07245350.1| hypothetical protein HMPREF9543_02026 [Escherichia coli MS 146-1] # 1 45 75 119 119 86 100.0 4e-16 MGSGKETVLKPNTEVYRQFAEIDEGFRLFFEANPEFMPPTSKIKK >gi|296494616|gb|ADTN01000122.1| GENE 5 5512 - 5697 123 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301645409|ref|ZP_07245351.1| ## NR: gi|301645409|ref|ZP_07245351.1| transcriptional regulator, AlpA family [Escherichia coli MS 146-1] # 1 61 1 61 61 112 100.0 8e-24 MEKLISIAQLARMLNRHRTRIYVYQKTGYLPPAIMRNGRTIGWRESDIAAWMEKNKDVRD N >gi|296494616|gb|ADTN01000122.1| GENE 6 5912 - 8293 1088 793 aa, chain + ## HITS:1 COG:no KEGG:IL0684 NR:ns ## KEGG: IL0684 # Name: not_defined # Def: hypothetical protein # Organism: I.loihiensis # Pathway: not_defined # 283 778 55 530 547 277 34.0 1e-72 MLERTGYCAVPCLSNGATAPYGNTQTYPREDTIWQSSNVKIDRVAFRLDDVILLDYDGNK APEGEIISVSKLADLLGLSGLMQFCTQTNDEGNSLHFLFRWPEGYDRSQFKAANNGKWLP HIDVKTGNQLCYLKQGKQIIEGILPFIEDLPEAPVALIEALRKEPKASPSKATPEGHDNE TDIEWFNRTAPDWESLFFEYGFEKHGDKWLHPESTTGNPGIFINRDNRYVSSHGCDPLGD GHSHDKFDFVAQNQWEGDKSQLLEDIRQKRLEEDLQDEPDGLEWLPPKELKSELVPVQQF DARLLPEAVRDYVKDYAARMDNAAPDYAAISVMIAAGAVIGGTAQIQPKRKDTGWRLIPN LWGSAIGYPSAMKTPSLKCGLDLLAHSQKVLDRDFMQRQSKYEAESAINNARRAEIEQGL EEATKAALYAQSVKDAGPEEKQAYEDALARLVYIKESLKDEPKEPKPRELMVNDSTVEAL AIMASNNPQGIMVFRDELSGWLASLDSPQRPNDRAFYLEGFSCGSYKQSRVSRAPLKIDK LIVSLLGGIQPGKLAPILAARAAGGGDDGLLERIIQMSVFPDLTGEYRDAAPDIEAERRA 
KAVFETLADADYGDGKPAIYTFSEDAQKMWDEWATEHKKREQTANADWQGILGKYPALCA KIALVYHLLDEADGVNYESDFFSPSLKVTAKSLRCALHWMEYLESHATRITSYFRAEKAM TPAITLRDRLSQLSPSFTRNSLGQKDWRHLTTKEDRESAIEQLIKTGHIREVTTPPKNGT GRPSVSFLVNPHI Prediction of potential genes in microbial genomes Time: Sun May 15 23:32:51 2011 Seq name: gi|296494615|gb|ADTN01000123.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont252.1, whole genome shotgun sequence Length of sequence - 9890 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 2, operones - 1 average op.length - 9.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 8 - 67 3.5 1 1 Op 1 8/0.000 + CDS 125 - 1327 1283 ## COG1749 Flagellar hook protein FlgE 2 1 Op 2 8/0.000 + CDS 1327 - 2064 865 ## COG4787 Flagellar basal body rod protein 3 1 Op 3 9/0.000 + CDS 2143 - 2928 962 ## COG4786 Flagellar basal body rod protein + Prom 3032 - 3091 3.6 4 1 Op 4 9/0.000 + CDS 3132 - 3797 584 ## COG2063 Flagellar basal body L-ring protein 5 1 Op 5 7/0.000 + CDS 3812 - 4912 1162 ## COG1706 Flagellar basal-body P-ring protein 6 1 Op 6 9/0.000 + CDS 4912 - 5211 348 ## COG3951 Rod binding protein + Term 5313 - 5377 2.0 7 1 Op 7 21/0.000 + CDS 5633 - 7009 1689 ## COG1256 Flagellar hook-associated protein 8 1 Op 8 . + CDS 7024 - 7953 1066 ## COG1344 Flagellin and related hook-associated proteins 9 1 Op 9 . + CDS 7970 - 8947 806 ## EcolC_3360 hypothetical protein + Term 8958 - 9003 4.6 10 2 Tu 1 . 
- CDS 9006 - 9848 459 ## COG3710 DNA-binding winged-HTH domains Predicted protein(s) >gi|296494615|gb|ADTN01000123.1| GENE 1 125 - 1327 1283 400 aa, chain + ## HITS:1 COG:YPO0725 KEGG:ns NR:ns ## COG: YPO0725 COG1749 # Protein_GI_number: 16121044 # Func_class: N Cell motility # Function: Flagellar hook protein FlgE # Organism: Yersinia pestis # 1 400 1 413 413 301 47.0 1e-81 MSYEIAATGLNAVNEQLDGISNNIANAGTVGYKSMTTQFSAMYAGSQAMGVSVAGTAQSI SRGGSLVSTGNALDLAINDDGFFVTCDSAGNISYTRAGSFETDKNGYIVNASGAYLQGYP VDDTGTLQTGTVTDIQIKTGNIPAQASSSLTFTANFDASDDAIDRTTVPFDATNSSSYTD SYTTTVYDSLGNEHSVCQYFTKTSDNTWEVQYTFDGQQQTGVPATTLTFDPNTGKLTSPT TPQTIEFQTDAAAPIDLTVDYSTCTQYGSDFSVTTNAANGYASATQNGVQVDDDGKVYAT YSNGERMLQGQVVLATFPNENGLEAVSGTAWVQTGESGTPLIGVPGSGTCGTLSSGVLES SNVDITSELVNLMTAQRNYQANTKVIATSTQLDDALFQAM >gi|296494615|gb|ADTN01000123.1| GENE 2 1327 - 2064 865 245 aa, chain + ## HITS:1 COG:VC2196 KEGG:ns NR:ns ## COG: VC2196 COG4787 # Protein_GI_number: 15642195 # Func_class: N Cell motility # Function: Flagellar basal body rod protein # Organism: Vibrio cholerae # 1 245 1 249 249 163 43.0 2e-40 MDRLIYTALSGASQTLYEQQISANNLANVNTNGFRADMAMATNDKVKGGGFDTRYMAQEG ASGVNDSTGVAEKTERPLDVAIQGAGYIAVQDKNGNEVYTRNGNIQQDDQGQLAIDGNLV LGDNGPIILPPNAIASFGSDGTLSVTPDDGDVTATMDIDRLKLVDIPVANLAKNPEGMLI TADGVPAQRDENIKVSGGFLEGSNVSAVSEMMSSIAMNRQFEAQIKMMKTAEDISDAGNR LLRGS >gi|296494615|gb|ADTN01000123.1| GENE 3 2143 - 2928 962 261 aa, chain + ## HITS:1 COG:YPO0728 KEGG:ns NR:ns ## COG: YPO0728 COG4786 # Protein_GI_number: 16121046 # Func_class: N Cell motility # Function: Flagellar basal body rod protein # Organism: Yersinia pestis # 1 261 1 261 261 320 66.0 1e-87 MNAALWISKTGLSAQDAEMSAIANNIANVNTTGFKRDRVMFQDLFYQTQEAPGAMLDQNN IMPTGLQFGSGVRIVGTQKTFTEGNVETTDNAMNVAIMGQGFLQVQKANGDIAYTRDGNL QVNADGVLTNSQGLPLQPEIDVPAGATNVAFGEDGTVTAILPGDSDATELGQLTLVNFAN PAGLSAEADNLYLETAASGQPTEGVPGEDGLGTLQDNALEGSNVDIVNEMVAMITVQRAY EMNAKMVSAADDMLQYISQTL >gi|296494615|gb|ADTN01000123.1| GENE 4 3132 - 3797 584 221 aa, chain + ## HITS:1 COG:YPO0729 KEGG:ns 
NR:ns ## COG: YPO0729 COG2063 # Protein_GI_number: 16121047 # Func_class: N Cell motility # Function: Flagellar basal body L-ring protein # Organism: Yersinia pestis # 1 221 1 221 221 306 68.0 1e-83 MKNYLWLVALLPLLSGCESQAILVKKDDAYFAPPKTEAPPPADGRAGGVFETGYNWSLTA DRRAYRVGDILTVILEESTQSSKQAKTNFGKSNTVDIGAPTIFGHTKDKLSGSIDANRDF DGSATSQQQNSLRGEITVSVHAVQPNGILEIRGEKWLTLNQGDEYIRLSGLVRADDIQND NSVSSQRIADARISYAGRGALSDANAAGWLTRLFNHPLFPI >gi|296494615|gb|ADTN01000123.1| GENE 5 3812 - 4912 1162 366 aa, chain + ## HITS:1 COG:YPO0730 KEGG:ns NR:ns ## COG: YPO0730 COG1706 # Protein_GI_number: 16121048 # Func_class: N Cell motility # Function: Flagellar basal-body P-ring protein # Organism: Yersinia pestis # 8 366 12 370 370 428 68.0 1e-120 MQNWIKTVVVAVSLALPGVVLAQSLESLVNVQGVRENQLVGYSLVVGLDGTGDKNQVKFT NQTITNMLRQFGVQLPNKIDPKVKNVAAVAVSATLPPMYSRGQTIDVTVSSIGDAKSIRG GTLLLTQLHGADGEVYALAQGSVVVGGMNATGASGSSVTVNTPTAGLIPNGATVEREIPS DFQMGDTITLNLKRPSFKDANNIAAAINASFGGIATAQSSTNVTVRAPTSPGARVAFMSQ LDDVQVQAEKIRARVVFNSRTGTVVMGDGVALHAAAVSHGSLTVSINETSNVSQPNAFAG GRTAVTPQSNIAVNHARPGVVSLPESSSLKTLVNALNSLGATPDDIMSILQALHEAGALD ADLEVI >gi|296494615|gb|ADTN01000123.1| GENE 6 4912 - 5211 348 99 aa, chain + ## HITS:1 COG:YPO0731 KEGG:ns NR:ns ## COG: YPO0731 COG3951 # Protein_GI_number: 16121049 # Func_class: M Cell wall/membrane/envelope biogenesis; N Cell motility; O Posttranslational modification, protein turnover, chaperones # Function: Rod binding protein # Organism: Yersinia pestis # 20 97 20 97 97 80 50.0 7e-16 MKVNGSGGIDGSDALMGPKVQANDIRQAAEQFEAIFLRNMLKEMRKTNELFDSKDNPFNS DSVRMMQGFYDDELCNTLAQQHGIGIAAMIVKQLSPKHR >gi|296494615|gb|ADTN01000123.1| GENE 7 5633 - 7009 1689 458 aa, chain + ## HITS:1 COG:YPO0732 KEGG:ns NR:ns ## COG: YPO0732 COG1256 # Protein_GI_number: 16121050 # Func_class: N Cell motility # Function: Flagellar hook-associated protein # Organism: Yersinia pestis # 1 456 1 452 453 288 38.0 2e-77 MDMINIGYSGASTAQVELNVTAQNTANAMITGYTRQVAEISTIGASGGSPNSAGNGVQVD 
SIRRVSNQYQVNQVWYAASDYGYYSTQQGYLSQLEAVLSDDNSSLSGGFDNFFAALNEAT TSPDDSALREQVISEAGALSLRIDNTLDYVDSQSTEIISQQQAMVSQINTLTSGIASYNQ QIAQAEANGDNASALYDARDQMVEELSGMMEVQVNIDDQGNYNVTLKNGQPLVSGQQSST IALETNADGTPTMSLTFAGTTSTMTTDTGGSLGALFDYQNDVLTPLTDTINSMASQFADA VNNQLAQGYDLNGNPGEPLFIYDASNADGPLTVNPDITADELAFSSSPDESGNSDNLQAL INISTEPLEIANLGSVTVGQACSSIISNIGIYSQQNQTEVDAASNVYSAAQNQQSSVSGV SMDEEAVNLITYQQIYEANLKVISAGAEIFDSVLEMCS >gi|296494615|gb|ADTN01000123.1| GENE 8 7024 - 7953 1066 309 aa, chain + ## HITS:1 COG:YPO0733 KEGG:ns NR:ns ## COG: YPO0733 COG1344 # Protein_GI_number: 16121051 # Func_class: N Cell motility # Function: Flagellin and related hook-associated proteins # Organism: Yersinia pestis # 1 309 1 307 307 137 29.0 2e-32 MRVTTQQTYVSMTQSFNDLSGDLAHVVEQMATGKQILQPSDDPIAATRITQLNRQQSAIE QYQSNIDSASAGLSQQESILDGVNNSLLAVRDDLLEAANGTNTADSLASLGQDIESLTES MVAALNYQDEDGHYVFGGTINDQPPIVAVDDDGDGVTDSYSYQGNSDHRQTTVSNGVEVD TNVAASDFFGSNLDVLNTLNSLSQELQAPNVDPADPQVQSDIQNAVDVVDTASDDLNASI ASLGETQNTMSMLSDAQTDISTSNDELIGSLQDLDYGPASITFTGLEVAMEATLKTYSKV SELNLFSVL >gi|296494615|gb|ADTN01000123.1| GENE 9 7970 - 8947 806 325 aa, chain + ## HITS:1 COG:no KEGG:EcolC_3360 NR:ns ## KEGG: EcolC_3360 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 325 1 325 325 496 97.0 1e-139 MQVGLTSSSLATGSAHSAAVSSSTVAPTQAVRQKLPATASEYPASPLITTRPQRYSVQLN DQLTTLQQADHYLGQLEQQLLDYRHSQRKGGQAQSTALMQMLDKRTALSGGAVDRQLQPV LQGEARVTFHSPDLANLVHNPTPGTRMFSVSDGRQTQLSAVMLSEDDSAAQYQTRLTNAL RRVGVQLHQQADGISFSTTEKQWPNIESTLSVRTDGDKSAFMPLKTFAEPSQAERLAQSL QQGGAGISQMLENINQQRAQMAVQQEKARQLIDGMSRFPQTENAVQASENLGGVLDSANH NYQVLLQAVNGQARISSQTVRSLLG >gi|296494615|gb|ADTN01000123.1| GENE 10 9006 - 9848 459 280 aa, chain - ## HITS:1 COG:YPO0736_1 KEGG:ns NR:ns ## COG: YPO0736_1 COG3710 # Protein_GI_number: 16121054 # Func_class: K Transcription # Function: DNA-binding winged-HTH domains # Organism: Yersinia pestis # 2 182 13 179 181 105 35.0 7e-23 MSIIINNWRMDPSLNALIHCETGETHRLGEYHFILLETLAKNADVVLSRSYLCAEVWKNR 
IVGGNSLPTAIHALRVAIDDDGKQQNIIKTIPKKGYLCNKEYVSLPESSPAEALIITNQV QETVPGEISSTTLPVPVRKKHKGLMGLALTAAVIFIGSTVGYSHLKSTPDAPQLVKESIN SPRIKIFHLSSGKENNSAPLLSQTLAPGKDKLDNLLSAHNMTMTTYYKYVRNRLESDIVL RNQCNGSWQLTFNVDEWQNSDINSAMYQNLEKLLNTVQKC Prediction of potential genes in microbial genomes Time: Sun May 15 23:32:57 2011 Seq name: gi|296494614|gb|ADTN01000124.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont255.1, whole genome shotgun sequence Length of sequence - 3284 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 13 - 72 2.6 1 1 Op 1 . + CDS 283 - 1830 2039 ## COG0074 Succinyl-CoA synthetase, alpha subunit 2 1 Op 2 . + CDS 1830 - 3248 1823 ## JW0313 conserved hypothetical protein Predicted protein(s) >gi|296494614|gb|ADTN01000124.1| GENE 1 283 - 1830 2039 515 aa, chain + ## HITS:1 COG:yahF KEGG:ns NR:ns ## COG: yahF COG0074 # Protein_GI_number: 16128305 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, alpha subunit # Organism: Escherichia coli K12 # 1 515 1 515 515 988 100.0 0 MSVKIVIKPNTYFDSVSLMSISTRANKLDGVEQAFVAMATEMNKGVLKNLGLLTPELEQA KNGDLMIVINGKSGADNEQLLVEIEELFNTKAQSGSHEARYATIGSAKKHIPESNLAVIS VNGLFAAREARQALQNDLNVMLFSDNVSVEDELALKQLAHEKGLLMMGPDCGTAIINGAA LCFGNAVRRGNIGIVGASGTGSQELSVRIHEFGGGVSQLIGTGGRDLSEKIGGLMMLDAI GMLENDPQTEIIALISKPPAPAVARKVLERARACRKPVVVCFLDRGETPVDEQGLQFARG TKEAALKAVMLSGVKQENLDLHTLNQPLIADVRARLQPQQKYIRGLFCGGTLCDETMFAV MEKHGDVYSNIQPDPEFRLKDINRSIKHTFLDFGDDDFTNGKPHPMIDPTNRISRLIEEA RDPEVAVIVMDFVLGFGSHEDPVGSTIETIKEAKAIAAAEGRELIILAYVLGTDLDTPSL EQQSQMLLDAGVILASSSTNTGLLAREFICKGEEA >gi|296494614|gb|ADTN01000124.1| GENE 2 1830 - 3248 1823 472 aa, chain + ## HITS:1 COG:no KEGG:JW0313 NR:ns ## KEGG: JW0313 # Name: yahG # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 472 1 472 472 941 100.0 0 MSQSLFSQPLNVINVGIAMFSDDLKKQHVEVTQLDWTPPGQGNMQVVQALDNIADSPLAD 
KIAAANQQALERIIQSHPVLIGFDQAINVVPGMTAKTILHAGPPITWEKMCGAMKGAVTG ALVFEGLAKDLDEAAELAASGEITFSPCHEHDCVGSMAGVTSASMFMHIVKNKTYGNIAY TNMSEQMAKILRMGANDQSVIDRLNWMRDVQGPILRDAMKIIGEIDLRLMLAQALHMGDE CHNRNNAGTTLLIQALTPGIIQAGYSVEQQREVFEFVASSDYFSGPTWMAMCKAAMDAAH GIEYSTVVTTMARNGVEFGLRVSGLPGQWFTGPAQQVIGPMFAGYKPEDSGLDIGDSAIT ETYGIGGFAMATAPAIVALVGGTVEEAIDFSRQMREITLGENPNVTIPLLGFMGVPSAID ITRVGSSGILPVINTAIAHKDAGVGMIGAGIVHPPFACFEKAILGWCERYGV Prediction of potential genes in microbial genomes Time: Sun May 15 23:33:07 2011 Seq name: gi|296494613|gb|ADTN01000125.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont259.1, whole genome shotgun sequence Length of sequence - 14082 bp Number of predicted genes - 10, with homology - 9 Number of transcription units - 8, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 398 - 449 3.0 1 1 Op 1 45/0.000 - CDS 457 - 1581 1147 ## COG0842 ABC-type multidrug transport system, permease component 2 1 Op 2 10/0.000 - CDS 1581 - 4316 2765 ## COG1131 ABC-type multidrug transport system, ATPase component 3 1 Op 3 . - CDS 4313 - 5380 1164 ## COG0845 Membrane-fusion protein - Prom 5588 - 5647 3.6 4 2 Tu 1 . - CDS 5746 - 7368 155 ## JW3455 hypothetical protein - Prom 7446 - 7505 5.2 - Term 7521 - 7563 2.6 5 3 Tu 1 . - CDS 7630 - 8439 27 ## ECB_03339 hypothetical protein + Prom 8858 - 8917 9.3 6 4 Tu 1 . + CDS 9061 - 9180 102 ## 7 5 Tu 1 . - CDS 9224 - 9427 101 ## SF3505 hypothetical protein - Prom 9447 - 9506 5.5 + Prom 9387 - 9446 8.7 8 6 Tu 1 . + CDS 9665 - 10672 541 ## G2583_4217 inner membrane protein YhiM + Term 10693 - 10734 3.0 - Term 10912 - 10957 -0.9 9 7 Tu 1 . - CDS 10987 - 12189 1044 ## COG2081 Predicted flavoproteins - Prom 12277 - 12336 8.2 + Prom 12252 - 12311 6.9 10 8 Tu 1 . 
+ CDS 12421 - 13920 1633 ## COG0306 Phosphate/sulphate permeases Predicted protein(s) >gi|296494613|gb|ADTN01000125.1| GENE 1 457 - 1581 1147 374 aa, chain - ## HITS:1 COG:yhhJ KEGG:ns NR:ns ## COG: yhhJ COG0842 # Protein_GI_number: 16131357 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Escherichia coli K12 # 1 374 2 375 375 667 100.0 0 MRHLRNIFNLGIKELRSLLGDKAMLTLIVFSFTVSVYSSATVTPGSLNLAPIAIADMDQS QLSNRIVNSFYRPWFLPPEMITADEMDAGLDAGRYTFAINIPPNFQRDVLAGRQPDIQVN VDATRMSQAFTGNGYIQNIINGEVNSFVARYRDNSEPLVSLETRMRFNPNLDPAWFGGVM AIINNITMLAIVLTGSALIREREHGTVEHLLVMPITPFEIMMAKIWSMGLVVLVVSGLSL VLMVKGVLGVPIEGSIPLFMLGVALSLFATTSIGIFMGTIARSMPQLGLLVILVLLPLQM LSGGSTPRESMPQMVQDIMLTMPTTHFVSLAQAILYRGAGFEIVWPQFLTLMAIGGAFFT IALLRFRKTIGTMA >gi|296494613|gb|ADTN01000125.1| GENE 2 1581 - 4316 2765 911 aa, chain - ## HITS:1 COG:yhiH_1 KEGG:ns NR:ns ## COG: yhiH_1 COG1131 # Protein_GI_number: 16131358 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Escherichia coli K12 # 18 549 1 532 532 1085 99.0 0 MTHLELVPVPPVAQLAGVSQHYGKTVALNNITLDIPARCMVGLIGPDGVGKSSLLSLISG ARVIEQGNVMVLGGDMRDPKHRRDVCPRIAWMPQGLGKNLYHTLSVYENVDFFARLFGHD KAEREVRINELLTSTGLAPFRDRPAGKLSGGMKQKLGLCCALIHDPELLILDEPTTGVDP LSRSQFWDLIDSIRQRQSNMSVLVATAYMEEAERFDWLVAMNAGEVLATGSAEELRQQTQ SATLEEAFINLLPQAQRQAHQAVVIPPYQPENAEIAIEARDLTMRFGSFVAVDHVNFRIP RGEIFGFLGSNGCGKSTTMKMLTGLLPASEGEAWLFGQPVDPKDIDTRRRVGYMSQAFSL YNELTVRQNLELHARLFHIPEAEIPARVAEMSERFKLNDVEDILPESLPLGIRQRLSLAV AVIHRPEMLILDEPTSGVDPVARDMFWQLMVDLSRQDKVTIFISTHFMNEAERCDRISLM HAGKVLASGTPQELVEKRGAASLEEAFIAYLQEAAGQSNEAEAPPVVHDTTHAPRQGFSL RRLFSYSRREALELRRDPVRSTLALMGTVILMLIMGYGISMDVENLRFAVLDRDQTVSSQ AWTLNLSGSRYFIEQPPLTSYDELDRRMRAGDITVAIEIPPNFGRDIARGTPVELGVWID GAMPSRAETVKGYVQAMHQSWLQDVASRQSTPASQSGLMNIETRYRYNPDVKSLPAIVPA VIPLLLMMIPSMLSALSVVREKELGSIINLYVTPTTRSEFLLGKQLPYIALGMLNFFLLC GLSVFVFGVPHKGSFLTLTLAALLYIIIATGMGLLISTFMKSQIAAIFGTAIITLIPATQ 
FSGMIDPVASLEGPGRWIGEVYPTSHFLTIARGTFSKALDLTDLWQLFIPLLIAIPLVMG LSILLLKKQEG >gi|296494613|gb|ADTN01000125.1| GENE 3 4313 - 5380 1164 355 aa, chain - ## HITS:1 COG:yhiI KEGG:ns NR:ns ## COG: yhiI COG0845 # Protein_GI_number: 16131359 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli K12 # 1 355 1 355 355 566 99.0 1e-161 MDKSKRHLAWWVVGLLAVAAIVAWWLLRPAGVPEGFAVSNGRIEATEVDIASKIAGRIDT ILVKEGQFVREGEVLAKMDTRVLQEQRLEAIAQIKEAQSAVAAAQALLEQRQSETRAAQS LVNQRQAELDSVAKRHTRSRSLAQRGAISAQQLDDDRAAAESARAALESAKAQVSASKAA IEAARTNIIQAQTRVEAAQATERRIAADIDDSELKAPRDGRVQYRVAEPGEVLAAGGRVL NMVDLSDVYMTFFLPTEQAGTLKLGGEARLILDAAPDLRIPATISFVASVAQFTPKTVET SDERLKLMFRVKARIPPELLQQHLEYVKTGLPGVAWVRVNEELPWPDDLVVRLPQ >gi|296494613|gb|ADTN01000125.1| GENE 4 5746 - 7368 155 540 aa, chain - ## HITS:1 COG:no KEGG:JW3455 NR:ns ## KEGG: JW3455 # Name: yhiJ # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 540 1 540 540 1099 99.0 0 MKIGTVAGTNDSTTTIATNDMVQEHVTNFTKELFGYIANGIGDDISSIARTMLGEVVEKI DDWQIERFQQSIQDDKISFTIQTDHSEKYSMLSGMRAHILRRNNNYQFIVTINSKNYGCS LDNTDVNWCSIVYLLNNMTVNDNANDVAVTESYKPIWNWKISQYNVSDIKFETMIKPQFA DRIYFSNCLPVDPTSTRPTYFGDTDGSVGAVLFALFATGHLGIMAEGENFLSQLLNIEDE VLNVLLRENFNEQLNTNVNTIISILNRRDIILESLQPYLVINKDAVTPCTFLGDQTGDRF SNICGDQFIIDLLKRIMSINENVHVLAGNHETNCNGNYMQNFTRMKPLDEDTYSGIKDYP VCFYDPKYKIMANHHGITFDDQRKRYIIGPITVSIDEMTNALDPVELAAIINKKHHAIIN GKKFKTSRAISCRSFNRYFSVSTDYRPKLEALLACSQMLGINQVVAHNGNGGRERIGETG TVLGLNARDSKHAGRMFSMHNCQINPGAGPEITTPWKSYQHEKNRNGLMPLIRRRTMLQL >gi|296494613|gb|ADTN01000125.1| GENE 5 7630 - 8439 27 269 aa, chain - ## HITS:1 COG:no KEGG:ECB_03339 NR:ns ## KEGG: ECB_03339 # Name: yhiKL # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 269 267 535 535 544 99.0 1e-153 MLYDLLNTRDMILNELHQHVFLKDDAITPCIFLGDHTGDRFSTIFGDKYILTLLNSMRNM EGNKDSRINKNVVVLAGNHEINFNGNYTARLANHKLSAGDTYNLIKTLDVCNYDSERQVL TSHHGIIRDEEKKCYCLGALQVPFNQMKNPTDPEELANIFNKKHKEHMDDPLFHLIRSNT 
LKPTPVYANYFDNTTDFRPARERIFICGETLKGEDPSKYIRQKYGHHGPGVDHNQQFDNG IMGLNSLKEARDKNNKIIYSSGLSCFQPH >gi|296494613|gb|ADTN01000125.1| GENE 6 9061 - 9180 102 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQAKIVIYLRIIIQFIKFNYLCTMCFINLAMVLMTVNFI >gi|296494613|gb|ADTN01000125.1| GENE 7 9224 - 9427 101 67 aa, chain - ## HITS:1 COG:no KEGG:SF3505 NR:ns ## KEGG: SF3505 # Name: not_defined # Def: hypothetical protein # Organism: S.flexneri # Pathway: not_defined # 2 58 1 57 591 88 75.0 6e-17 MLLFFTIIFQGGIILHYLVKDRIVSVLKISEHISIIILYRRKSVFQTSIIYFVTNNSKIS QVIYESN >gi|296494613|gb|ADTN01000125.1| GENE 8 9665 - 10672 541 335 aa, chain + ## HITS:1 COG:no KEGG:G2583_4217 NR:ns ## KEGG: G2583_4217 # Name: yhiM # Def: inner membrane protein YhiM # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 335 49 383 383 583 100.0 1e-165 MGLICIALGGFVLESSGQSEYFVAGHVLISLAAICLALFTTAFIIISQLTRGVNTFYNTL FPIIGYAGSIITMIWGWALLAGNDVMADEFVAGHVIFGVGMIAACVSTVAASSGHFLLIP KNAAGSKSDGTPVQAYSSLIGNCLIAVPVLLTLLGFIWSITLLRSADITPHYVAGHVLLG LTAICACLIGLVATIVHQTRNTFSTKEHWLWCYWVIFLGSITVLQGIYVLVSSDASARLA PGIILICLGMICYSIFSKVWLLALVWRRTCSLANRIPMIPVFTCLFCLFLASFLAEMAQT DMGYFIPSRVLVGLGAVCFTLFSIVSILEAGSAKK >gi|296494613|gb|ADTN01000125.1| GENE 9 10987 - 12189 1044 400 aa, chain - ## HITS:1 COG:yhiN KEGG:ns NR:ns ## COG: yhiN COG2081 # Protein_GI_number: 16131364 # Func_class: R General function prediction only # Function: Predicted flavoproteins # Organism: Escherichia coli K12 # 1 400 1 400 400 810 100.0 0 MERFDAIIIGAGAAGMFCSALAGQAGRRVLLIDNGKKPGRKILMSGGGRCNFTNLYVEPG AYLSQNPHFCKSALARFTQWDFIDLVNKHGIAWHEKTLGQLFCDDSAQQIVDMLVDECEK GNVTFRLRSEVLSVAKDETGFTLDLNGMTVGCEKLVIATGGLSMPGLGASPFGYKIAEQF GLNVLPTRAGLVPFTLHKPLLEELQVLAGVAVPSVITAENGTVFRENLLFTHRGLSGPAV LQISSYWQPGEFVSINLLPDVDLETFLNEQRNAHPNQSLKNTLAVHLPKRLVERLQQLGQ IPDVSLKQLNVRDQQALISTLTDWRVQPNGTEGYRTAEVTLGGVDTNELSSRTMEARKVP GLYFIGEVMDVTGWLGGYNFQWAWSSAWACAQDLIAAKSS >gi|296494613|gb|ADTN01000125.1| GENE 10 12421 - 13920 1633 499 aa, chain + ## HITS:1 COG:ECs4365 KEGG:ns NR:ns ## COG: ECs4365 
COG0306 # Protein_GI_number: 15833619 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate/sulphate permeases # Organism: Escherichia coli O157:H7 # 1 499 1 499 499 893 100.0 0 MLHLFAGLDLHTGLLLLLALAFVLFYEAINGFHDTANAVATVIYTRAMRSQLAVVMAAVF NFLGVLLGGLSVAYAIVHMLPTDLLLNMGSSHGLAMVFSMLLAAIIWNLGTWYFGLPASS SHTLIGAIIGIGLTNALMTGTSVVDALNIPKVLSIFGSLIVSPIVGLVFAGGLIFLLRRY WSGTKKRARIHLTPAEREKKDGKKKPPFWTRIALILSAIGVAFSHGANDGQKGIGLVMLV LIGVAPAGFVVNMNATGYEITRTRDAINNVEAYFEQHPALLKQATGADQLVPAPEAGATQ PAEFHCHPSNTINALNRLKGMLTTDVESYDKLSLDQRSQMRRIMLCVSDTIDKVVKMPGV SADDQRLLKKLKSDMLSTIEYAPVWIIMAVALALGIGTMIGWRRVATTIGEKIGKKGMTY AQGMSAQMTAAVSIGLASYTGMPVSTTHVLSSSVAGTMVVDGGGLQRKTVTSILMAWVFT LPAAVLLSGGLYWLSLQFL Prediction of potential genes in microbial genomes Time: Sun May 15 23:33:30 2011 Seq name: gi|296494612|gb|ADTN01000126.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont259.2, whole genome shotgun sequence Length of sequence - 12266 bp Number of predicted genes - 14, with homology - 13 Number of transcription units - 10, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 35 - 370 247 ## G2583_4220 universal stress protein B - Prom 528 - 587 2.2 + Prom 505 - 564 5.2 2 2 Tu 1 . + CDS 761 - 1195 581 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins + Term 1218 - 1257 7.1 + Prom 1332 - 1391 6.2 3 3 Tu 1 . + CDS 1512 - 2981 1687 ## COG3104 Dipeptide/tripeptide permease + Term 2993 - 3037 8.9 - Term 2986 - 3021 7.4 4 4 Op 1 5/0.500 - CDS 3030 - 3782 844 ## COG0500 SAM-dependent methyltransferases 5 4 Op 2 . - CDS 3790 - 5832 2714 ## COG0339 Zn-dependent oligopeptidases - Prom 5888 - 5947 3.1 + Prom 5917 - 5976 4.1 6 5 Op 1 7/0.000 + CDS 6035 - 6877 883 ## COG2961 Protein involved in catabolism of external DNA 7 5 Op 2 . + CDS 6949 - 8301 502 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 + Term 8310 - 8348 8.6 - Term 8298 - 8336 6.1 8 6 Tu 1 . 
- CDS 8355 - 8471 128 ## - Prom 8632 - 8691 4.6 9 7 Tu 1 . + CDS 8466 - 8639 82 ## ECBD_0240 hypothetical protein + Term 8773 - 8801 -1.0 + Prom 8654 - 8713 3.6 10 8 Op 1 7/0.000 + CDS 8947 - 9300 246 ## COG0640 Predicted transcriptional regulators 11 8 Op 2 2/1.000 + CDS 9354 - 10643 1581 ## COG1055 Na+/H+ antiporter NhaD and related arsenite permeases 12 8 Op 3 . + CDS 10656 - 11081 618 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family + Prom 11105 - 11164 2.6 13 9 Tu 1 . + CDS 11184 - 11300 80 ## EcHS_A3706 hypothetical protein + Term 11327 - 11360 0.0 14 10 Tu 1 . + CDS 11710 - 12243 126 ## ECB_03353 hypothetical protein Predicted protein(s) >gi|296494612|gb|ADTN01000126.1| GENE 1 35 - 370 247 111 aa, chain - ## HITS:1 COG:no KEGG:G2583_4220 NR:ns ## KEGG: G2583_4220 # Name: uspB # Def: universal stress protein B # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 111 1 111 111 214 100.0 1e-54 MISTVALFWALCVVCIVNMARYFSSLRALLVVLRNCDPLLYQYVDGGGFFTSHGQPNKQV RLVWYIYAQRYRDHHDDEFIRRCERVRRQFILTSALCGLVVVSLIALMIWH >gi|296494612|gb|ADTN01000126.1| GENE 2 761 - 1195 581 144 aa, chain + ## HITS:1 COG:ECs4367 KEGG:ns NR:ns ## COG: ECs4367 COG0589 # Protein_GI_number: 15833621 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Escherichia coli O157:H7 # 1 144 1 144 144 271 100.0 3e-73 MAYKHILIAVDLSPESKVLVEKAVSMARPYNAKVSLIHVDVNYSDLYTGLIDVNLGDMQK RISEETHHALTELSTNAGYPITETLSGSGDLGQVLVDAIKKYDMDLVVCGHHQDFWSKLM SSARQLINTVHVDMLIVPLRDEEE >gi|296494612|gb|ADTN01000126.1| GENE 3 1512 - 2981 1687 489 aa, chain + ## HITS:1 COG:yhiP KEGG:ns NR:ns ## COG: yhiP COG3104 # Protein_GI_number: 16131368 # Func_class: E Amino acid transport and metabolism # Function: Dipeptide/tripeptide permease # Organism: Escherichia coli K12 # 1 489 1 489 489 867 100.0 0 MNTTTPMGMLQQPRPFFMIFFVELWERFGYYGVQGVLAVFFVKQLGFSQEQAFVTFGAFA ALVYGLISIGGYVGDHLLGTKRTIVLGALVLAIGYFMTGMSLLKPDLIFIALGTIAVGNG 
LFKANPASLLSKCYPPKDPRLDGAFTLFYMSINIGSLIALSLAPVIADRFGYSVTYNLCG AGLIIALLVYIACRGMVKDIGSEPDFRPMSFSKLLYVLLGSVVMIFVCAWLMHNVEVANL VLIVLSIVVTIIFFRQAFKLDKTGRNKMFVAFVLMLEAVVFYILYAQMPTSLNFFAINNV HHEILGFSINPVSFQALNPFWVVLASPILAGIYTHLGNKGKDLSMPMKFTLGMFMCSLGF LTAAAAGMWFADAQGLTSPWFIVLVYLFQSLGELFISALGLAMIAALVPQHLMGFILGMW FLTQAAAFLLGGYVATFTAVPDNITDPLETLPVYTNVFGKIGLVTLGVAVVMLLMVPWLK RMIATPESH >gi|296494612|gb|ADTN01000126.1| GENE 4 3030 - 3782 844 250 aa, chain - ## HITS:1 COG:ECs4369 KEGG:ns NR:ns ## COG: ECs4369 COG0500 # Protein_GI_number: 15833623 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli O157:H7 # 1 250 1 250 250 484 100.0 1e-137 MKICLIDETGTGDGALSVLAARWGLEHDEDNLMALVLTPEHLELRKRDEPKLGGIFVDFV GGAMAHRRKFGGGRGEAVAKAVGIKGDYLPDVVDATAGLGRDAFVLASVGCRVRMLERNP VVAALLDDGLARGYADAEIGGWLQERLQLIHASSLTALTDITPRPQVVYLDPMFPHKQKS ALVKKEMRVFQSLVGPDLDADGLLEPARLLATKRVVVKRPDYAPPLANVATPNAVVTKGH RFDIYAGTPV >gi|296494612|gb|ADTN01000126.1| GENE 5 3790 - 5832 2714 680 aa, chain - ## HITS:1 COG:prlC KEGG:ns NR:ns ## COG: prlC COG0339 # Protein_GI_number: 16131370 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent oligopeptidases # Organism: Escherichia coli K12 # 1 680 1 680 680 1369 100.0 0 MTNPLLTPFELPPFSKILPEHVVPAVTKALNDCRENVERVVAQGAPYTWENLCQPLAEVD DVLGRIFSPVSHLNSVKNSPELREAYEQTLPLLSEYSTWVGQHEGLYKAYRDLRDGDHYA TLNTAQKKAVDNALRDFELSGIGLPKEKQQRYGEIATRLSELGNQYSNNVLDATMGWTKL VTDEAELAGMPESALAAAKAQAEAKELEGYLLTLDIPSYLPVMTYCDNQALREEMYRAYS TRASDQGPNAGKWDNSKVMEEILALRHELAQLLGFENYAFKSLATKMAENPQQVLDFLTD LAKRARPQGEKELAQLRAFAKAEFGVDELQPWDIAYYSEKQKQHLYSISDEQLRPYFPEN KAVNGLFEVVKRIYGITAKERKDVDVWHPDVRFFELYDENNELRGSFYLDLYARENKRGG AWMDDCVGQMRKADGSLQKPVAYLTCNFNRPVNGKPALFTHDEVITLFHEFGHGLHHMLT RIETAGVSGISGVPWDAVELPSQFMENWCWEPEALAFISGHYETGEPLPKELLDKMLAAK NYQAALFILRQLEFGLFDFRLHAEFRPDQGAKILETLAEIKKLVAVVPSPSWGRFPHAFS 
HIFAGGYAAGYYSYLWADVLAADAFSRFEEEGIFNRETGQSFLDNILSRGGSEEPMDLFK RFRGREPQLDAMLEHYGIKG >gi|296494612|gb|ADTN01000126.1| GENE 6 6035 - 6877 883 280 aa, chain + ## HITS:1 COG:yhiR KEGG:ns NR:ns ## COG: yhiR COG2961 # Protein_GI_number: 16131371 # Func_class: R General function prediction only # Function: Protein involved in catabolism of external DNA # Organism: Escherichia coli K12 # 1 280 1 280 280 559 100.0 1e-159 MLSYRHSFHAGNHADVLKHTVQSLIIESLKEKDKPFLYLDTHAGAGRYQLGSEHAERTGE YLEGIARIWQQDDLPAELEAYINVVKHFNRSGQLRYYPGSPLIARLLLREQDSLQLTELH PSDYPLLRSEFQKDSRARVEKADGFQQLKAKLPPVSRRGLILIDPPYEMKTDYQAVVSGI AEGYKRFATGIYALWYPVVLRQQIKRMIHDLEATGIRKILQIELAVLPDSDRRGMTASGM IVINPPWKLEQQMNNVLPWLHSKLVPAGTGHATVSWIVPE >gi|296494612|gb|ADTN01000126.1| GENE 7 6949 - 8301 502 450 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 5 444 4 442 458 197 31 2e-50 MTKHYDYIAIGGGSGGIASINRAAMYGQKCALIEAKELGGTCVNVGCVPKKVMWHAAQIR EAIHMYGPDYGFDTTINKFNWETLIASRTAYIDRIHTSYENVLGKNNVDVIKGFARFVDA KTLEVNGETITADHILIATGGRPSHPDIPGVEYGIDSDGFFALPALPERVAVVGAGYIAV ELAGVINGLGAKTHLFVRKHAPLRSFDPMISETLVEVMNAEGPQLHTNAIPKAVVKNADG SLTLELEDGRSETVDCLIWAIGREPANDNINLEAAGVKTNEKGYIVVDKYQNTNIEGIYA VGDNTGAVELTPVAVAAGRRLSERLFNNKPDEHLDYSNIPTVVFSHPPIGTVGLTEPQAR EQYGDDQVKVYKSSFTAMYTAVTTHRQPCRMKLVCVGSEEKIVGIHGIGFGMDEMLQGFA VALKMGATKKDFDNTVAIHPTAAEEFVTMR >gi|296494612|gb|ADTN01000126.1| GENE 8 8355 - 8471 128 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHSHSIAWRKRVIDKAIIVLGALIALLELIRFLLQLLN >gi|296494612|gb|ADTN01000126.1| GENE 9 8466 - 8639 82 57 aa, chain + ## HITS:1 COG:no KEGG:ECBD_0240 NR:ns ## KEGG: ECBD_0240 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_BL21_DE3 # Pathway: not_defined # 1 57 1 57 57 92 100.0 5e-18 MHPLTHPLPVTAHVSLLDKNSLTPARASVNGTTRTSDQDFESVYAHCQSENASELTG >gi|296494612|gb|ADTN01000126.1| GENE 10 8947 - 9300 246 117 aa, chain + ## HITS:1 COG:arsR KEGG:ns NR:ns ## COG: arsR COG0640 # Protein_GI_number: 16131373 # 
Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 117 1 117 117 225 100.0 2e-59 MSFLLPIQLFKILADETRLGIVLLLSELGELCVCDLCTALDQSQPKISRHLALLRESGLL LDRKQGKWVHYRLSPHIPAWAAKIIDEAWRCEQEKVQAIVRNLARQNCSGDSKNICS >gi|296494612|gb|ADTN01000126.1| GENE 11 9354 - 10643 1581 429 aa, chain + ## HITS:1 COG:arsB KEGG:ns NR:ns ## COG: arsB COG1055 # Protein_GI_number: 16131374 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter NhaD and related arsenite permeases # Organism: Escherichia coli K12 # 1 429 8 436 436 680 100.0 0 MLLAGAIFVLTIVLVIWQPKGLGIGWSATLGAVLALVTGVVHPGDIPVVWNIVWNATAAF IAVIIISLLLDESGFFEWAALHVSRWGNGRGRLLFTWIVLLGAAVAALFANDGAALILTP IVIAMLLALGFSKGTTLAFVMAAGFIADTASLPLIVSNLVNIVSADFFGLGFREYASVMV PVDIAAIVATLVMLHLYFRKDIPQNYDMALLKSPAEAIKDPATFKTGWVVLLLLLVGFFV LEPLGIPVSAIAAVGALILFVVAKRGHAINTGKVLRGAPWQIVIFSLGMYLVVYGLRNAG LTEYLSGVLNVLADNGLWAATLGTGFLTAFLSSIMNNMPTVLVGALSIDGSTASGVIKEA MVYANVIGCDLGPKITPIGSLATLLWLHVLSQKNMTISWGYYFRTGIIMTLPVLFVTLAA LALRLSFTL >gi|296494612|gb|ADTN01000126.1| GENE 12 10656 - 11081 618 141 aa, chain + ## HITS:1 COG:arsC KEGG:ns NR:ns ## COG: arsC COG1393 # Protein_GI_number: 16131375 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Escherichia coli K12 # 1 141 1 141 141 273 100.0 1e-73 MSNITIYHNPACGTSRNTLEMIRNSGTEPTIIHYLETPPTRDELVKLIADMGISVRALLR KNVEPYEELGLAEDKFTDDRLIDFMLQHPILINRPIVVTPLGTRLCRPSEVVLEILPDAQ KGAFSKEDGEKVVDEAGKRLK >gi|296494612|gb|ADTN01000126.1| GENE 13 11184 - 11300 80 38 aa, chain + ## HITS:1 COG:no KEGG:EcHS_A3706 NR:ns ## KEGG: EcHS_A3706 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_HS # Pathway: not_defined # 1 38 17 54 54 63 100.0 1e-09 MNSMLFVGLTGVAHQAILLFTQYSLREIPIIASSMVYQ >gi|296494612|gb|ADTN01000126.1| GENE 14 11710 - 12243 126 177 aa, chain + ## HITS:1 COG:no KEGG:ECB_03353 NR:ns ## KEGG: ECB_03353 # Name: yhiS # Def: 
hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 176 1 176 407 330 100.0 1e-89 MSIDFTPGIINTYHGDIYNCTTNTDNAKTPDTPKWPCDNWEEQQPINSTFSGEGYISDQY DLAQHQLQQINACHTNTTYTNADYSKVVAQLVSLITNIETISSTQLTQQTQSILNQINNI RYEKNKSAECRIIVIANPKPDKAIITKISVEEGIPITFSVQTMFSDTNFIAEQRADW Prediction of potential genes in microbial genomes Time: Sun May 15 23:33:48 2011 Seq name: gi|296494611|gb|ADTN01000127.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont263.1, whole genome shotgun sequence Length of sequence - 22422 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 11, operones - 5 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 2 1 Op 2 . - CDS 711 - 2024 288 ## COG3209 Rhs family protein 3 2 Op 1 . + CDS 2138 - 4633 514 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 4 2 Op 2 . + CDS 4726 - 5373 580 ## EC55989_0222 conserved hypothetical protein Aec28 5 3 Op 1 5/0.571 + CDS 5554 - 6348 502 ## COG3515 Uncharacterized protein conserved in bacteria 6 3 Op 2 5/0.571 + CDS 6367 - 9825 3027 ## COG3523 Uncharacterized protein conserved in bacteria 7 3 Op 3 2/0.857 + CDS 9836 - 11188 1331 ## COG3515 Uncharacterized protein conserved in bacteria 8 3 Op 4 . + CDS 11212 - 11694 358 ## COG3157 Hemolysin-coregulated protein (uncharacterized) 9 3 Op 5 . + CDS 11738 - 12652 67 ## ECO103_0212 hypothetical protein + Term 12772 - 12826 0.3 - Term 13011 - 13074 1.9 10 4 Tu 1 . - CDS 13278 - 14063 551 ## JW0206 predicted aminopeptidase - Prom 14136 - 14195 8.0 - TRNA 14391 - 14467 90.7 # Asp GTC 0 0 - Term 14317 - 14367 9.1 11 5 Tu 1 . - CDS 14600 - 15340 770 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases - Prom 15376 - 15435 4.3 + Prom 15314 - 15373 2.1 12 6 Tu 1 . + CDS 15396 - 15863 483 ## COG0328 Ribonuclease HI + Term 15874 - 15928 6.0 13 7 Tu 1 . 
- CDS 15860 - 16582 448 ## COG0500 SAM-dependent methyltransferases - Prom 16613 - 16672 3.8 + Prom 16391 - 16450 2.7 14 8 Op 1 . + CDS 16616 - 17371 334 ## COG0491 Zn-dependent hydrolases, including glyoxylases + Prom 17410 - 17469 1.5 15 8 Op 2 . + CDS 17581 - 18801 729 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) + Term 18809 - 18867 7.4 - Term 18801 - 18847 8.2 16 9 Op 1 4/0.714 - CDS 18849 - 19472 307 ## COG0500 SAM-dependent methyltransferases 17 9 Op 2 . - CDS 19476 - 20255 469 ## COG3021 Uncharacterized protein conserved in bacteria - Prom 20335 - 20394 5.7 + Prom 20417 - 20476 4.1 18 10 Tu 1 . + CDS 20517 - 21431 602 ## COG0583 Transcriptional regulator + Term 21467 - 21509 7.1 - Term 21386 - 21419 3.8 19 11 Tu 1 . - CDS 21428 - 22231 1121 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase - Prom 22329 - 22388 3.1 Predicted protein(s) >gi|296494611|gb|ADTN01000127.1| GENE 1 341 - 733 117 130 aa, chain - ## HITS:1 COG:no KEGG:ECB_00212 NR:ns ## KEGG: ECB_00212 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 13 130 6 123 123 234 99.0 6e-61 MNVLRHVNAINMGVCILIWVIGCANAKVLKRCILPAPDGGVEKLSKPNLHGVIKKVDLNT NEAEIKIKEGGAVSFIFNEETLYFTVFGGDFIMDDLKKVKNTEAWVWYKNCDSKNKNVAV ELLQIYELHQ >gi|296494611|gb|ADTN01000127.1| GENE 2 711 - 2024 288 437 aa, chain - ## HITS:1 COG:Z0268 KEGG:ns NR:ns ## COG: Z0268 COG3209 # Protein_GI_number: 15799917 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli O157:H7 EDL933 # 1 301 970 1270 1404 509 85.0 1e-144 MHDERTHHYHYDSQHRLVFYTRIQHGEPQVESRYLYDPLGRRTGKRVWRRERDLTGWMSL SRKPEVTWYGWDGDRLTTVQTGTTRIQTVYQPGSFTPLIRIETENGEQAKARHRSLAEVL QEDTGVTLPAELAVMLGRLERELRAGAVSAESEAWLAQCGLTAEQMAAQMEAEYIPERKL HLYHCDHRGLPQALITPEGETAWCGEYDEWGNQLNEENPHHLYQPYRLPGQQYDEESGLY YNRHRHYDPLQGRYITQDPIGLKGGINLYTYPLVPIRYTDPLGLERVISVYGPPAPDRAG 
AETPLVLTDMTGGVTIYYDPETGDSMTFDSSNRIDRRSQRGAGDPYTGEVVGCETNESGI SAAYGTTKIYTTDTRARWLHGGGSSLRDPYAPRQGWKPTMGCTRAQNEDVDELCKKVTSW MYSHPGERIRYERFKTR >gi|296494611|gb|ADTN01000127.1| GENE 3 2138 - 4633 514 831 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 489 791 477 789 815 202 36 2e-51 MLVEWLKEGWLLASAEMQHSELRGGVLLLALLHSPLRYIPPAAARLLTGINRDRLQQDFV QWTQESAESVVPDADGKGAGTMTDASDTLLARYAKNMTADARNGRLDPVLCRDHEIDLMI DILCRRRKNNPVVVGEAGVGKSALIEGLALRIVAGQVPDKLKNTDIMTLDLGALQAGASV KGEFEKRFKGLMAEVISSPVPVILFIDEAHTLIGAGNQQGGLDISNLLKPALARGELKTI AATTWSEYKKYFEKDAALSRRFQLVKVSEPNAAEATIILRGLSAVYEQSHGVLIDDDALQ AAATLSERYLSGRQLPDKAIDVLDTACARVAINLSSPPKQISALTTLSHQQEAEIRQLER ELRIGLRTDTSRMTEVLVQYDETLTALDELEAAWHQQQTLVREIIALRQQLLGVAEDDAA PLPDADTVEDTQPESESEQDNTGAVPADEADREQPEETAETVSPVQRLAQLTAELDALHN DRLLVSPHVDKKQIAAVIAEWTGVPLNRLSQNEMSVITDLTKWLGDTIKGQDLAIASLHK HLLTARADLRRPGRPLGAFLLAGPSGVGKTETVLQLAELLYGGRQYLTTINMSEFQEKHT VSRLIGSPPGYVGYGEGGVLTEAIRQKPYSVVLLDEVEKAHPDVLNLFYQAFDKGEMADG EGRLIDCKNIVFFLTSNLGYQVIVEHADDPETMQEVLYPVLADFFKPALLARMEVVPYLP LSKETLATIIAGKLARLDNVLRSRFGAEVVIEPEVTDEIMSRVTRAENGARMLESVIDGD MLPPLSLLLLQKMAANTAIARIRLSAVDGAFTADVEDAQNDESVTKDETVL >gi|296494611|gb|ADTN01000127.1| GENE 4 4726 - 5373 580 215 aa, chain + ## HITS:1 COG:no KEGG:EC55989_0222 NR:ns ## KEGG: EC55989_0222 # Name: not_defined # Def: conserved hypothetical protein Aec28 # Organism: E.coli_55989 # Pathway: not_defined # 1 215 33 247 247 398 94.0 1e-110 MAGVAVAVATTTPPDATATLQAIQSCRRESAALERLDCYDRLLAPLSPSGFDGALVKAGF VGEAWTRATEQEKRREGNTTELLVTQVPGERPTVVITTPAIGHVPPRPVLMFSCVDNITR MQVALMHPLDVHDIAVTLNADNRALRSHWFVRENGTLLESSRGLSGIDEIKQLFGAKTLT VDTGADNAAGKLTFNIDGLARAIAPLRDACHWAGE >gi|296494611|gb|ADTN01000127.1| GENE 5 5554 - 6348 502 264 aa, chain + ## HITS:1 COG:ECs0220 KEGG:ns NR:ns ## COG: ECs0220 COG3515 # Protein_GI_number: 15829474 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 
1 264 1 264 264 447 92.0 1e-125 MTQPSAPAPQIVIDSHDDKAWRDTLLKVAVILCERQPDSPQGYRLRRHALWQNITSTPQA ESDGRTPLAAVSADMVADYHAQLGSADMALWQQVEKSVLLAPYWLDGHCLSAQTALRLGY KQVADAIRDEVIRFLERLPQLTGLLFNDHTPFISEQTKQWLAASPDAKVAPVAQIGEESK AARACFAEQGLEAALRYLDMLPEGDPRDQFHRQYLAAQLTEEAGLVQLAQQQYRMLFRMG LQMMVADWEPSLLEQLEQKFTAEQ >gi|296494611|gb|ADTN01000127.1| GENE 6 6367 - 9825 3027 1152 aa, chain + ## HITS:1 COG:Z0250 KEGG:ns NR:ns ## COG: Z0250 COG3523 # Protein_GI_number: 15799899 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 EDL933 # 31 1152 1 1144 1144 2163 96.0 0 MFRLPTPRLLSGLKSALRPAMPRFKVSAFWLLILAWIFLLVWIWWKGPTWTLYEEQWLKP LANRWLATAAWGIIALMWLTVRVMKRLQQLEKMQKQQREEAVDPLSVELNAQQRYLDRWL LRLQRYLDNRRFLWQLPWYMVIGPAGSGKTTLLREGFPSDIIYAPEGARGAEQRLYLTPH VGKQAVIFDIDGTLCAPADADILHRRLWEHALGWLKEKRARQPLNGIILTLDLPDLLTAD KRRREHLLQTLRSRLQDIRQHLHCQLPVYVVLTRLDLLQGFAALFQSLNRQDRDAILGVT FTRRAHENDDWRTELNAFWQTWVDRMNLALPDLMVAQTHTRTSLFSFSRQMQGSREPLVS LLEGLLDGENMNVMLRGVYLTSSLQRGQMDDIFTQSAARQYRLGNNPLASWPLVDTAPYF TRSLFPQALLAEPNLATESRAWLIRSRRRLTVFSATGGVAALLLITGWHHYYNGNYQSGI TVLKQAKAFMDVPPPQGEDDFGNLQLPLLNPVRDATLAYGDWGDRSRLADMGLYQGRRIG PYVEQTYLQLLEQRYLPSLFNGLVKAMNAAPPESEEKLAVLRVMRMLEDKSGRNNEVVKQ YMAKRWSEKFHGQRDIQAQLMSHLDYALAHTDWHAERQAGDGDAISRWTPYDKPVVSAQK ELSKLPVYQRVYQSLKTRALGVLPADLNLRDQVGPTFDQVFTSADDNKLVVPQFLTRYGL QSYFVKQRDELVELTAMDSWVLNLTRSVKYSDADRAEIQRQLTEQYISDYTATWRAGMDN LNIRNFESIGQLTGALEQVISGDQPLQRALTVLRDNTQPGVFSEKLSAKEREEALAVQKD KESTMQAVYQQLTELHRYLLAIQNAPVPGKSALKAVQLRLDQNSSDPIFATRQMAKTLPA PLNRWVGRLTDQAWHVVMVEAVHYMEVDWRDSVVKPFNEQLANNYPFNPRSAQDASLDAF ERFFKPDGILDTFYQQNLKLFIDNDLSLEDGDNNVIIREDIIAQLETAQKIRDIFFSKQN GLGTSFAVETVSLSGNKRRSVLNLDGQLVDYSQGRNYTAHLVWPNNMREGNESKLTLIGT SGNAPRSISFSGPWAQFRLFGAGQLTGVQDGNFTVRFSVDGGAMTYRVHTDTEDNPFSGG LFSQFGLSDTLY >gi|296494611|gb|ADTN01000127.1| GENE 7 9836 - 11188 1331 450 aa, chain + ## HITS:1 COG:Z0249 KEGG:ns NR:ns ## COG: Z0249 COG3515 # Protein_GI_number: 15799898 # Func_class: S Function unknown # 
Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 EDL933 # 1 440 31 470 499 818 99.0 0 MNSNVLTQTIVTGSDPRGLPEFSAIREEINKASHPSQPELNWKLVESLALAIFKANGVDL HTATYYTLARTRTQGLAGFCEGAELLAAMVSHDWDKFWPQGGPARTEMLDWFNSRTGNIL RQQISFAESDLPLIYRTERALQLICDKLQQVELKRVPRVENLLYFMQNTRKRLEPQLKSN TENAAQTTVRTLIYAPETQASSTPEAVVPPLPGLPEMKVEVRSLTENPPQASVIKQGSTV RGFIAGIACSVAVASALWWWQVYPVQQQLLQVNDTAQGAATVWMASPELENYERRLQQLL DTSPVQPLETGMQMMRVADSRWPESLQQQQASTQWNEALKTRAQSSPQLRGWLQTRQDLH AFADLVMQREKEGLTLSYIKNVIWQAERGLGQETPVESLLTQYHDARAQKQNTDALEKQI NERLEGVLSRWLLLKNNVMPEAATGTTAEK >gi|296494611|gb|ADTN01000127.1| GENE 8 11212 - 11694 358 160 aa, chain + ## HITS:1 COG:ECs0216 KEGG:ns NR:ns ## COG: ECs0216 COG3157 # Protein_GI_number: 15829470 # Func_class: S Function unknown # Function: Hemolysin-coregulated protein (uncharacterized) # Organism: Escherichia coli O157:H7 # 1 159 1 158 159 182 54.0 2e-46 MANISYLSLSGETQGLISAGCSTLDSVGNKAQPEHKDQIMVYALMHSISRSQNVNHHELI ITKPVDKSSPLLAKALSDNEKMAICEFILYRTSKAGIYQPYYKINLSKARISSIDFVTPH AVLEKELEPQERIAFIYEDISWEHTLAGTNAMSKWQDRVQ >gi|296494611|gb|ADTN01000127.1| GENE 9 11738 - 12652 67 304 aa, chain + ## HITS:1 COG:no KEGG:ECO103_0212 NR:ns ## KEGG: ECO103_0212 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 304 1 304 304 628 99.0 1e-178 MAGLRNNNNTQNAQWADYVGDILRGAQPINQLVPQHPYLNDVPLIDELRHQNTHHVIPLT LDVAKKILSPITSFDYIHFITTHPSSIKDTLAWLVNAGKLMTEFDDNGKIIFNLNALKYT KASYFEILGEKYIKITTSSPWLLEKLGKYIFSSRAPQVLELAIGWRGALSESIKGVKFCI WFSVAWRTIEFIMSSERDLVNFLGDFSMDVAKAVIAGGVATAIGSLASFACVSFGFPVIL VGGAILLTGIVCTVVLNEIDAQCHLSEKLKYAIRDGLKRQQELDKWKRENMTPFMYVLNT PPVI >gi|296494611|gb|ADTN01000127.1| GENE 10 13278 - 14063 551 261 aa, chain - ## HITS:1 COG:no KEGG:JW0206 NR:ns ## KEGG: JW0206 # Name: yafT # Def: predicted aminopeptidase # Organism: E.coli_J # Pathway: not_defined # 1 261 1 261 261 492 100.0 1e-138 MNSKKLCCICVLFSLLAGCASESSIDEKKKKAQVTQSNINKNTPQQLTDKDLFGNETTLA 
VSEEDIQAALDGDEFRVPLNSPVILVQSGNRAPETIMQEEMRKYYTVSTFSGIPDRQKPL TCNKNKDKNENEDVASAENMNWMQALRFVAAKGHQKAIIVYQDMLQTGKYDSALKSTVWS DYKNDKLTDAISLRYLVRFTLVDVATGEWATWSPVNYEYKVLPPLPDKNEASTTDMTEQQ IMQLKQKTYKAMVKDLVNRYQ >gi|296494611|gb|ADTN01000127.1| GENE 11 14600 - 15340 770 246 aa, chain - ## HITS:1 COG:dnaQ KEGG:ns NR:ns ## COG: dnaQ COG0847 # Protein_GI_number: 16128202 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Escherichia coli K12 # 4 246 1 243 243 487 100.0 1e-137 MTAMSTAITRQIVLDTETTGMNQIGAHYEGHKIIEIGAVEVVNRRLTGNNFHVYLKPDRL VDPEAFGVHGIADEFLLDKPTFAEVADEFMDYIRGAELVIHNAAFDIGFMDYEFSLLKRD IPKTNTFCKVTDSLAVARKMFPGKRNSLDALCARYEIDNSKRTLHGALLDAQILAEVYLA MTGGQTSMAFAMEGETQQQQGEATIQRIVRQASKLRVVFATDEEIAAHEARLDLVQKKGG SCLWRA >gi|296494611|gb|ADTN01000127.1| GENE 12 15396 - 15863 483 155 aa, chain + ## HITS:1 COG:ECs0210 KEGG:ns NR:ns ## COG: ECs0210 COG0328 # Protein_GI_number: 15829464 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HI # Organism: Escherichia coli O157:H7 # 1 155 1 155 155 314 100.0 5e-86 MLKQVEIFTDGSCLGNPGPGGYGAILRYRGREKTFSAGYTRTTNNRMELMAAIVALEALK EHCEVILSTDSQYVRQGITQWIHNWKKRGWKTADKKPVKNVDLWQRLDAALGQHQIKWEW VKGHAGHPENERCDELARAAAMNPTLEDTGYQVEV >gi|296494611|gb|ADTN01000127.1| GENE 13 15860 - 16582 448 240 aa, chain - ## HITS:1 COG:yafS KEGG:ns NR:ns ## COG: yafS COG0500 # Protein_GI_number: 16128200 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli K12 # 1 240 7 246 246 487 100.0 1e-138 MKPARVPQTVVAPDCWGDLPWGKLYRKALERQLNPWFTKMYGFHLLKIGNLSAEINCEAC AVSHQVNVSAQGMPVQVQADPLHLPFADKSVDVCLLAHTLPWCTDPHRLLREADRVLIDD GWLVISGFNPISFMGLRKLVPVLRKTSPYNSRMFTLMRQLDWLSLLNFEVLHASRFHVLP WNKHGGKLLNAHIPALGCLQLIVARKRTIPLTLNPMKQSKNKPRIRQAVGATRQCRKPQA >gi|296494611|gb|ADTN01000127.1| GENE 14 16616 - 17371 334 251 aa, chain + ## 
HITS:1 COG:ECs0208 KEGG:ns NR:ns ## COG: ECs0208 COG0491 # Protein_GI_number: 15829462 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Escherichia coli O157:H7 # 1 251 1 251 251 520 100.0 1e-147 MNLNSIPAFDDNYIWVLNDEAGRCLIVDPGDAEPVLNAIAANNWQPEAIFLTHHHHDHVG GVKELVEKFPQIVVYGPQETQDKGTTQVVKDGETAFVLGHEFSVIATPGHTLGHICYFSK PYLFCGDTLFSGGCGRLFEGTASQMYQSLKKLSALPDDTLVCCAHEYTLSNMKFALSILP HDLSINDYYRKVKELRAKNQITLPVILKNERQINVFLRTEDIDLINVINEETLLQQPEER FAWLRSKKDRF >gi|296494611|gb|ADTN01000127.1| GENE 15 17581 - 18801 729 406 aa, chain + ## HITS:1 COG:dniR KEGG:ns NR:ns ## COG: dniR COG0741 # Protein_GI_number: 16128198 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Escherichia coli K12 # 1 406 47 452 452 775 100.0 0 MDDGTSIAPDGDLWAFIGDELKMGIPENDRIREQKQKYLRNKSYLHDVTLRAEPYMYWIA GQVKKRNMPMELVLLPIVESAFDPHATSGANAAGIWQIIPSTGRNYGLKQTRNYDARRDV VASTTAALNMMQRLNKMFDGDWLLTVAAYNSGEGRVMKAIKTNKARGKSTDFWSLPLPQE TKQYVPKMLALSDILKNSKRYGVRLPTTDESRALARVHLSSPVEMAKVADMAGISVSKLK TFNAGVKGSTLGASGPQYVMVPKKHADQLRESLASGEIAAVQSTLVADNTPLNSRVYTVR SGDTLSSIASRLGVSTKDLQQWNKLRGSKLKPGQSLTIGAGSSAQRLANNSDSITYRVRK GDSLSSIAKRHGVNIKDVMRWNSDTANLQPGDKLTLFVKNNNMPDS >gi|296494611|gb|ADTN01000127.1| GENE 16 18849 - 19472 307 207 aa, chain - ## HITS:1 COG:yafE KEGG:ns NR:ns ## COG: yafE COG0500 # Protein_GI_number: 16128197 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli K12 # 1 207 1 207 207 407 100.0 1e-114 MSGLPQGRPTFGAAQNVSAVVAYDLSAHMLDVVAQAAEARQLKNITTRQGYAESLPFADN AFDIVISRYSAHHWHDVGAALREVNRILKPGGRLIVMDVMSPGHPVRDIWLQTVEALRDT SHVRNYASGEWLTLINEANLIVDNLITDKLPLEFSSWVARMRTPEALVDAIRIYQQSAST EVRTYFALQNDGFFTSDIIMVDAHKAA >gi|296494611|gb|ADTN01000127.1| GENE 17 19476 - 20255 469 259 aa, 
chain - ## HITS:1 COG:ECs0205 KEGG:ns NR:ns ## COG: ECs0205 COG3021 # Protein_GI_number: 15829459 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 259 8 266 266 512 100.0 1e-145 MRYVAGQPAERILPPGSFASIGQALPPGEPLSTEERIRILVWNIYKQQRAEWLSVLKNYG KDAHLVLLQEAQTTPELVQFATANYLAADQVPAFVLPQHPSGVMTLSAAHPVYCCPLRER EPILRLAKSALVTVYPLPDTRLLMVVNIHAVNFSLGVDVYSKQLLPIGDQIAHHSGPVIM AGDFNAWSRRRMNALYRFAREMSLRQVRFTDDQRRRAFGRPLDFVFYRGLNVSEASVLVT RASDHNPLLVEFSPGKPDK >gi|296494611|gb|ADTN01000127.1| GENE 18 20517 - 21431 602 304 aa, chain + ## HITS:1 COG:yafC KEGG:ns NR:ns ## COG: yafC COG0583 # Protein_GI_number: 16128195 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 304 1 304 304 577 100.0 1e-164 MKATSEELAIFVSVVESGSFSRAAEQLGQANSAVSRAVKKLEMKLGVSLLNRTTRQLSLT EEGERYFRRVQSILQEMAAAESEIMETRNTPRGLLRIDAATPVVLHFLMPLIKPFRERYP EVTLSLVSSETIINLIERKVDVAIRAGTLTDSSLRARPLFNSYRKIIASPDYISRYGKPE TIDDLKQHICLGFTEPASLNTWPIARSDGQLHEVKYGLSSNSGETLKQLCLSGNGIACLS DYMIDKEIARGELVELMADKVLPVEMPFSAVYYSDRAVSTRIRAFIDFLSEHVKTAPGGA VREA >gi|296494611|gb|ADTN01000127.1| GENE 19 21428 - 22231 1121 267 aa, chain - ## HITS:1 COG:yafB KEGG:ns NR:ns ## COG: yafB COG0656 # Protein_GI_number: 16128194 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Escherichia coli K12 # 1 267 1 267 267 512 100.0 1e-145 MAIPAFGLGTFRLKDDVVISSVITALELGYRAIDTAQIYDNEAAVGQAIAESGVPRHELY ITTKIWIENLSKDKLIPSLKESLQKLRTDYVDLTLIHWPSPNDEVSVEEFMQALLEAKKQ GLTREIGISNFTIPLMEKAIAAVGAENIATNQIELSPYLQNRKVVAWAKQHGIHITSYMT LAYGKALKDEVIARIAAKHNATPAQVILAWAMGEGYSVIPSSTKRKNLESNLKAQNLQLD AEDKKAIAALDCNDRLVSPEGLAPEWD Prediction of potential genes in microbial genomes Time: Sun May 15 23:34:09 2011 Seq name: gi|296494610|gb|ADTN01000128.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont276.1, whole genome shotgun sequence Length of sequence - 22464 bp Number of 
predicted genes - 25, with homology - 24 Number of transcription units - 17, operones - 5 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 488 65 ## ECB_01891 adhesin + Term 511 - 545 3.1 - Term 996 - 1031 -0.1 2 2 Tu 1 . - CDS 1098 - 1802 319 ## COG0859 ADP-heptose:LPS heptosyltransferase + Prom 2032 - 2091 4.6 3 3 Tu 1 3/0.500 + CDS 2117 - 3433 1139 ## COG0477 Permeases of the major facilitator superfamily + Term 3434 - 3479 1.5 + Prom 3445 - 3504 5.8 4 4 Tu 1 3/0.500 + CDS 3535 - 4989 1008 ## COG0775 Nucleoside phosphorylase + Term 5045 - 5089 7.3 + Prom 5213 - 5272 9.9 5 5 Tu 1 . + CDS 5332 - 6048 675 ## COG0217 Uncharacterized conserved protein + Term 6061 - 6097 4.2 - TRNA 6501 - 6576 87.1 # Asn GTT 0 0 - Term 6397 - 6438 4.1 6 6 Op 1 . - CDS 6567 - 6668 84 ## 7 6 Op 2 1/0.500 - CDS 6677 - 8131 767 ## COG0534 Na+-driven multidrug efflux pump - Prom 8343 - 8402 80.1 + TRNA 8325 - 8400 87.1 # Asn GTT 0 0 - Term 8402 - 8430 0.6 8 7 Op 1 9/0.000 - CDS 8438 - 9316 631 ## COG0583 Transcriptional regulator - Prom 9421 - 9480 4.0 9 7 Op 2 3/0.500 - CDS 9490 - 10407 437 ## COG0583 Transcriptional regulator + TRNA 10734 - 10809 87.1 # Asn GTT 0 0 - Term 10809 - 10838 2.1 10 8 Op 1 4/0.000 - CDS 10865 - 11797 593 ## COG1376 Uncharacterized protein conserved in bacteria 11 8 Op 2 11/0.000 - CDS 11862 - 12941 749 ## COG2038 NaMN:DMB phosphoribosyltransferase 12 8 Op 3 8/0.000 - CDS 12953 - 13696 766 ## COG0368 Cobalamin-5-phosphate synthase 13 8 Op 4 . - CDS 13693 - 14235 477 ## COG2087 Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase - Prom 14333 - 14392 6.1 14 9 Tu 1 . - CDS 14703 - 14867 67 ## EcSMS35_1129 hypothetical protein - Prom 14953 - 15012 4.0 15 10 Op 1 23/0.000 - CDS 16085 - 16549 195 ## COG2801 Transposase and inactivated derivatives 16 10 Op 2 . - CDS 16546 - 16845 293 ## COG2963 Transposase and inactivated derivatives + Prom 16782 - 16841 1.8 17 11 Tu 1 . 
+ CDS 16932 - 17762 745 ## COG1735 Predicted metal-dependent hydrolase with the TIM-barrel fold 18 12 Tu 1 . - CDS 18414 - 18701 127 ## ECP_2011 putative transposase - Term 19856 - 19900 1.5 19 13 Tu 1 . - CDS 20085 - 20489 224 ## ECS88_2074 hypothetical protein - Prom 20512 - 20571 4.7 + Prom 20472 - 20531 4.0 20 14 Tu 1 . + CDS 20675 - 21154 320 ## COG3547 Transposase and inactivated derivatives 21 15 Tu 1 . - CDS 21209 - 21628 62 ## ECB_02812 hypothetical protein + Prom 21323 - 21382 1.7 22 16 Tu 1 . + CDS 21509 - 21643 90 ## ECO26_2889 putative transposase 23 17 Op 1 . - CDS 21653 - 22045 217 ## EcHS_A2131 hypothetical protein 24 17 Op 2 . - CDS 22000 - 22167 56 ## gi|301645509|ref|ZP_07245445.1| hypothetical protein HMPREF9543_02125 25 17 Op 3 . - CDS 22211 - 22462 208 ## UTI89_C2250 hypothetical protein Predicted protein(s) >gi|296494610|gb|ADTN01000128.1| GENE 1 3 - 488 65 161 aa, chain + ## HITS:1 COG:no KEGG:ECB_01891 NR:ns ## KEGG: ECB_01891 # Name: yeeJ # Def: adhesin # Organism: E.coli_B_REL606 # Pathway: not_defined # 11 161 2233 2383 2383 306 100.0 1e-82 AKVRFIQQSHYEFSSSASWVDVDATGKVTFKNVGSNSERITATPKSGGPSYVYEIRVKSW WVNAGEAFMIYSLAENFCSSNGYTLPRANYLNHCSSRGIGSLYSEWGDMGHYTTDAGFQS NMYWSSSPANSSEQYVVSLATGDQSVFEKLGFAYATCYKNL >gi|296494610|gb|ADTN01000128.1| GENE 2 1098 - 1802 319 234 aa, chain - ## HITS:1 COG:b1980 KEGG:ns NR:ns ## COG: b1980 COG0859 # Protein_GI_number: 16129924 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Escherichia coli K12 # 1 234 1 234 234 453 99.0 1e-127 MFLASLLRRIAFSYYDYKAYNFNIEKTDFVVIHIPDQIGDAMAIFPVIRALELHKIKHLL IVTSTINLEVFNALKLEQTKLTLVTMTMQDHATLKEIKDLAKNITQQYGTPDLCIEGMRK KNLKTMLFISQLKAKTNFQVVGITMNCFSPLCKNASRMDQKLRAPVPMTWAFMMREAGFP AVRPIYELPLSEDVLDEVREEMRSLGSYIAFNLEGSSQERTFSLSIAENLIAKI >gi|296494610|gb|ADTN01000128.1| GENE 3 2117 - 3433 1139 438 aa, chain + ## HITS:1 COG:shiA KEGG:ns NR:ns ## COG: shiA COG0477 # Protein_GI_number: 16129925 # Func_class: G 
Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 438 1 438 438 807 99.0 0 MDSTLISTRPDEGTLSLSRARRAALGSFAGAVVDWYDFLLYGITAALVFNREFFPQVSPA MGTLAAFATFGVGFLFRPLGGVIFGHFGDRLGRKRMLMLTVWMMGIATALIGILPSFSTI GWWAPILLVTLRAIQGFAAGGEWGGAALLSVESAPKNKKAFYSSGVQVGYGVGLLLSTGL VSLISMMTTDEQFLSWGWRIPFLFSIVLVLGALWVRNGMEESAEFEQQQHYQAAAKKRIP VIEALLRHPGAFLKIIALRLCELLTMYIVTAFALNYSTQNMGLPRELFLNIGLLVGGLSC LTIPCFAWLADRFGRRRVYITGTLIGTLSAFPFFMALEAQSIFWIVFFSIMLANIAHDMV VCVQQPMFTEMFGASYRYSGAGVGYQVASVVGGGFTPFIAAALITYFAGNWHSVAIYLLA GCLISAMTALLMKDSQRA >gi|296494610|gb|ADTN01000128.1| GENE 4 3535 - 4989 1008 484 aa, chain + ## HITS:1 COG:ECs2779 KEGG:ns NR:ns ## COG: ECs2779 COG0775 # Protein_GI_number: 15832033 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside phosphorylase # Organism: Escherichia coli O157:H7 # 1 484 1 484 484 999 100.0 0 MNNKGSGLTPAQALDKLDALYEQSVVALRNAIGNYITSGELPDENARKQGLFVYPSLTVT WDGSTTNPPKTRAFGRFTHAGSYTTTITRPTLFRSYLNEQLTLLYQDYGAHISVQPSQHE IPYPYVIDGSELTLDRSMSAGLTRYFPTTELAQIGDETADGIYHPTEFSPLSHFDARRVD FSLARLRHYTGTPVEHFQPFVLFTNYTRYVDEFVRWGCSQILDPDSPYIALSCAGGNWIT AETEAPEEAISDLAWKKHQMPAWHLITADGQGITLVNIGVGPSNAKTICDHLAVLRPDVW LMIGHCGGLRESQAIGDYVLAHAYLRDDHVLDAVLPPDIPIPSIAEVQRALYDATKLVSG RPGEEVKQRLRTGTVVTTDDRNWELRYSASALRFNLSRAVAIDMESATIAAQGYRFRVPY GTLLCVSDKPLHGEIKLPGQANRFYEGAISEHLQIGIRAIDLLRAEGDRLHSRKLRTFNE PPFR >gi|296494610|gb|ADTN01000128.1| GENE 5 5332 - 6048 675 238 aa, chain + ## HITS:1 COG:yeeN KEGG:ns NR:ns ## COG: yeeN COG0217 # Protein_GI_number: 16129927 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 238 1 238 238 441 100.0 1e-124 MGRKWANIVAKKTAKDGATSKIYAKFGVEIYAAAKQGEPDPELNTSLKFVIERAKQAQVP KHVIDKAIDKAKGGGDETFVQGRYEGFGPNGSMIIAETLTSNVNRTIANVRTIFNKKGGN 
IGAAGSVSYMFDNTGVIVFKGTDPDHIFEILLEAEVDVRDVTEEEGNIVIYTEPTDLHKG IAALKAAGITEFSTTELEMIAQSEVELSPEDLEIFEGLVDALEDDDDVQKVYHNVANL >gi|296494610|gb|ADTN01000128.1| GENE 6 6567 - 6668 84 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQIISANDFKFKKQALTLWVGIANIRLVLTIPL >gi|296494610|gb|ADTN01000128.1| GENE 7 6677 - 8131 767 484 aa, chain - ## HITS:1 COG:yeeO KEGG:ns NR:ns ## COG: yeeO COG0534 # Protein_GI_number: 16129928 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Escherichia coli K12 # 1 484 64 547 547 876 99.0 0 MNISSALRQVVHGTRWHAKRKSYKVLFWREITPLAVPIFMENACVLLMGVLSTFLVSWLG KDAMAGVGLADSFNMVIMAFFAAIDLGTTVVVAFSLGKRDRRRARVATRQSLVIMTLFAV LLATLIHHFGEQIIDFVAGDATTEVKALALTYLELTVLSYPAAAITLIGSGALRGAGNTK IPLLINGSLNILNIIISGILIYGLFSWPGLGFVGAGLGLTISRYIGAVAILWVLAIGFNP ALRISLKSYFKPLNFSIIWEAMGIGIPASVESVLFTSGRLLTQMFVAGMGTSVIAGNFIA FSIAALINLPGSALGSASTIITGRRLGVGQIAQAEIQLRHVFWLSTLGLTAIAWLTAPFA GVMASFYTQDPQVKHVVVILIWLNALFMPIWSASWVLPAGFKGARDARYAMWVSMLSMWG CRVVVGYVLGIMLGWGVVGVWMGMFADWAVRAVLFYWRMVTGRWLWKYPRPEPQKCEKKP VVSE >gi|296494610|gb|ADTN01000128.1| GENE 8 8438 - 9316 631 292 aa, chain - ## HITS:1 COG:ECs2783 KEGG:ns NR:ns ## COG: ECs2783 COG0583 # Protein_GI_number: 15832037 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 292 25 316 316 565 98.0 1e-161 MLFTSQSGVSRHIRELEDELGIEIFVRRGKRLLGMTEPGKALLVIAERILNEASNVRRLA DLFTNDTSGVLTIATTHTQARYSLPEVIKAFRELFPEVRLELIQGTPQEIATLLQNGEAD IGIASERLSNDPQLVAFPWFRWHHSLLVPHDHPLTQISPLTLESIAKWPLITYRQGITGR SRIDDAFARKGLLADIVLSAQDSDVIKTYVALGLGIGLVAEQSSGEQEEENLIRLDTRHL FDANTVWLGLKRGQLQRNYVWRFLELCNAGLSVEDIKRQVMESSEEEIDYQI >gi|296494610|gb|ADTN01000128.1| GENE 9 9490 - 10407 437 305 aa, chain - ## HITS:1 COG:nac KEGG:ns NR:ns ## COG: nac COG0583 # Protein_GI_number: 16129930 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 305 1 305 305 535 100.0 1e-152 MNFRRLKYFVKIVDIGSLTQAAEVLHIAQPALSQQVATLEGELNQQLLIRTKRGVTPTDA 
GKILYTHARAILRQCEQAQLAVHNVGQALSGQVSIGFAPGTAASSITMPLLQAVRAEFPE IVIYLHENSGAVLNEKLINHQLDMAVIYEHSPVAGVSSQALLKEDLFLVGTQDCPGQSVD VNAIAQMNLFLPSDYSAIRLRVDEAFSLRRLTAKVIGEIESIATLTAAIASGMGVAVLPE SAARSLCGAVNGWMSRITTPSMSLSLSLNLPARANLSPQAQAVKELLMSVISSPVMEKRQ WQLVS >gi|296494610|gb|ADTN01000128.1| GENE 10 10865 - 11797 593 310 aa, chain - ## HITS:1 COG:erfK KEGG:ns NR:ns ## COG: erfK COG1376 # Protein_GI_number: 16129931 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 310 1 310 310 605 100.0 1e-173 MRRVNILCSFALLFASHTSLAVTYPLPPEGSRLVGQSFTVTVPDHNTQPLETFAAQYGQG LSNMLEANPGADVFLPKSGSQLTIPQQLILPDTVRKGIVVNVAEMRLYYYPPDSNTVEVF PIGIGQAGRETPRNWVTTVERKQEAPTWTPTPNTRREYAKRGESLPAFVPAGPDNPMGLY AIYIGRLYAIHGTNANFGIGLRVSQGCIRLRNDDIKYLFDNVPVGTRVQIIDQPVKYTTE PDGSNWLEVHEPLSRNRAEYESDRKVPLPVTPSLRAFINGQEVDVNRANAALQRRSGMPV QISSGSRQMF >gi|296494610|gb|ADTN01000128.1| GENE 11 11862 - 12941 749 359 aa, chain - ## HITS:1 COG:cobT KEGG:ns NR:ns ## COG: cobT COG2038 # Protein_GI_number: 16129932 # Func_class: H Coenzyme transport and metabolism # Function: NaMN:DMB phosphoribosyltransferase # Organism: Escherichia coli K12 # 1 359 1 359 359 646 100.0 0 MQILADLLNTIPAIDSTAMSRAQRHIDGLLKPVGSLGKLEVLAIQLAGMPGLNGIPHVGK KAVLVMCADHGVWEEGVAISPKEVTAIQAENMTRGTTGVCVLAEQAGANVHVIDVGIDTA EPIPGLINMRVARGSGNIASAPAMSRRQAEKLLLDVICYTQELAKNGVTLFGVGELGMAN TTPAAAIVSTITGRDPEEVVGIGANLPTDKLANKIDVVRRAITLNQPNPQDGVDVLAKVG GFDLVGIAGVMLGAASCGLPVLLDGFLSYAAALAACQMSPAIKPYLIPSHLSAEKGARIA LSHLGLEPYLNMEMRLGEGSGAALAMPIIEAACAIYNNMGELAASNIVLPGNTTSDLNS >gi|296494610|gb|ADTN01000128.1| GENE 12 12953 - 13696 766 247 aa, chain - ## HITS:1 COG:cobS KEGG:ns NR:ns ## COG: cobS COG0368 # Protein_GI_number: 16129933 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin-5-phosphate synthase # Organism: Escherichia coli K12 # 1 247 1 247 247 413 100.0 1e-115 MSKLFWAMLSFITRLPVPRRWSQGLDFEHYSRGIITFPLIGLLLGAISGLVFMVLQAWCG 
APLAALFSVLVLVLMTGGFHLDGLADTCDGVFSARSRDRMLEIMRDSRLGTHGGLALIFV VLAKILVLSELALRGESILASLAAACAVSRGTAALLMYRHRYAREEGLGNVFIGKIDGRQ TCVTLGLAAIFAAVLLPGMHGVAAMVVTMVAIFILGQLLKRTLGGQTGDTLGAAIELGEL VFLLALL >gi|296494610|gb|ADTN01000128.1| GENE 13 13693 - 14235 477 180 aa, chain - ## HITS:1 COG:ECs2788 KEGG:ns NR:ns ## COG: ECs2788 COG2087 # Protein_GI_number: 15832042 # Func_class: H Coenzyme transport and metabolism # Function: Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase # Organism: Escherichia coli O157:H7 # 1 180 2 181 181 357 100.0 5e-99 MILVTGGARSGKSRHAEALIGDSSQVLYIATSQILDDEMAARIEHHRQGRPEHWRTVERW QHLDELIHADINPNEVVLLECVTTMVTNLLFDYGGDKDPDEWDYQAMEQAINAEIQSLIA ACQRCPAKVVLVTNEVGMGIVPESRLARHFRDIAGRVNQQLAAAANEVWLVVSGIGVKIK >gi|296494610|gb|ADTN01000128.1| GENE 14 14703 - 14867 67 54 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_1129 NR:ns ## KEGG: EcSMS35_1129 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 54 1 54 54 82 100.0 3e-15 MPLITHLLNVEELSRLLKNVALSVINLFILDDVSEKMEYIPDWVVVTRLNIWWR >gi|296494610|gb|ADTN01000128.1| GENE 15 16085 - 16549 195 154 aa, chain - ## HITS:1 COG:tra5_g1 KEGG:ns NR:ns ## COG: tra5_g1 COG2801 # Protein_GI_number: 16128357 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 1 154 1 154 288 305 100.0 3e-83 MKYVFIEKHQAEFSIKAMCRVLRVARSGWYTWCQRRTRISTRQQFRQHCDSVVLAAFTRS KQRYGAPRLTDELRAQGYPFNVKTVAASLRRQGLRAKASRKFSPVSYRAHGLPVSENLLE QDFYASGPNQKWAGDITYLRTDEGWLYLAVVIDL >gi|296494610|gb|ADTN01000128.1| GENE 16 16546 - 16845 293 99 aa, chain - ## HITS:1 COG:b0298 KEGG:ns NR:ns ## COG: b0298 COG2963 # Protein_GI_number: 16128283 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 1 99 4 102 102 114 100.0 6e-26 MTKTVSTSKKPRKQHSPEFRSEALKLAERIGVTAAARELSLYESQLYNWRSKQQNQQTSS ERELEMSTEIARLKRQLAERDEELAILQKAATYFAKRLK 
>gi|296494610|gb|ADTN01000128.1| GENE 17 16932 - 17762 745 276 aa, chain + ## HITS:1 COG:STM3550 KEGG:ns NR:ns ## COG: STM3550 COG1735 # Protein_GI_number: 16766836 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Salmonella typhimurium LT2 # 1 276 69 344 344 497 88.0 1e-141 MDRKPIEDVIFEINNFISLGGRTIVDATGSESIGRDAQALREVALKTGLNIVASSGPYLE KFESQRIHKTVDELAATIDKELNQGIGDTDIRAGMIGEIGVSPTFTEAEHNSLRAASLAQ INNPHVAMNIHMPGWLRRGDEVLDIVLGEMGVSPNKVSLAHSDPSGKDVAYQRKMLDKGV WLEFDMIGLDITFPKEGIAPGVQETADAVAHLIELGYADQLVLSHDVFLKQMWAKNGGNG WGFVPDVFLAYLAERGVDKTILKKLCIDNPGRLLTA >gi|296494610|gb|ADTN01000128.1| GENE 18 18414 - 18701 127 95 aa, chain - ## HITS:1 COG:no KEGG:ECP_2011 NR:ns ## KEGG: ECP_2011 # Name: not_defined # Def: putative transposase # Organism: E.coli_536 # Pathway: not_defined # 12 95 202 285 285 155 98.0 6e-37 MPAGYGGKWWPRGGTGSETIVRDTVAKWRKGWNPPVTTAARLPSVSRVSRWLMPWRIIRG EENYASRFISLMCEKEPELKIAQQLVLEFYRILKT >gi|296494610|gb|ADTN01000128.1| GENE 19 20085 - 20489 224 134 aa, chain - ## HITS:1 COG:no KEGG:ECS88_2074 NR:ns ## KEGG: ECS88_2074 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 126 30 155 165 260 98.0 1e-68 MRILNCYMANDSKGHFVTAKEAAKHNRQDVLCCVSCGCPLTLKRGNDGQPPWFEHDQMTV AAKILLRCTWLDPAEKEARRLHLQGMTVPDYTVKVRKWFCVMCDEDYEGEKCCPRCGTGV YSREGGAAGRQLEG >gi|296494610|gb|ADTN01000128.1| GENE 20 20675 - 21154 320 159 aa, chain + ## HITS:1 COG:all0306 KEGG:ns NR:ns ## COG: all0306 COG3547 # Protein_GI_number: 17227802 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Nostoc sp. 
PCC 7120 # 6 153 5 146 320 100 38.0 1e-21 MSQPGSLCVDIDVSKAILDIAASSAIEQFSVGNDSDGFDAIIAELRMYAIALVLMKATGG LETAIVCALQAEGFELAVVNPRQARDFARAMGYLAKTDSIDARVLSQMAEVLNRHPERER FIRALPDTECQVLTAMVVRRRQLITMLVAERNRLHPAHP >gi|296494610|gb|ADTN01000128.1| GENE 21 21209 - 21628 62 139 aa, chain - ## HITS:1 COG:no KEGG:ECB_02812 NR:ns ## KEGG: ECB_02812 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 62 2 63 97 90 69.0 2e-17 MVRFIPFFVLSEHCVQDSQQLAYTGNQRHFFGFSCGKQSHVKHLYYRVKAGCDKRCHVQG CSHTSPAAKDGSSPSHRARVPVYWNHTDKRTDFTPGEVTLFGYFRQQRSSRHGANTFNTA QSLSKLFEVAMDMAVHVSI >gi|296494610|gb|ADTN01000128.1| GENE 22 21509 - 21643 90 44 aa, chain + ## HITS:1 COG:no KEGG:ECO26_2889 NR:ns ## KEGG: ECO26_2889 # Name: not_defined # Def: putative transposase # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 44 279 322 322 91 100.0 1e-17 MRLLAAGKPKKVALVTCIRKLLTILNAMLRKNEEWNESYHHVAP >gi|296494610|gb|ADTN01000128.1| GENE 23 21653 - 22045 217 130 aa, chain - ## HITS:1 COG:no KEGG:EcHS_A2131 NR:ns ## KEGG: EcHS_A2131 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_HS # Pathway: not_defined # 1 130 1 130 130 248 99.0 6e-65 MYAKSFLALDGNGRLTGARTAQAAPYAHYTCHLCGSALRYHPQYDTELPWFEHTDDRLTE HGQQCPYVRPERREIQLIKRLQQFVPDTLPVVRKASWHCRQCHHDYYGERYCTHCQTGGF SLPRTAQQLS >gi|296494610|gb|ADTN01000128.1| GENE 24 22000 - 22167 56 55 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|301645509|ref|ZP_07245445.1| ## NR: gi|301645509|ref|ZP_07245445.1| hypothetical protein HMPREF9543_02125 [Escherichia coli MS 146-1] # 1 55 1 55 55 96 100.0 6e-19 MATTLIFPYETELKPSFTPDHATRTGTAVACVSSDGKEHILCTQNPFSLLMATDV >gi|296494610|gb|ADTN01000128.1| GENE 25 22211 - 22462 208 83 aa, chain - ## HITS:1 COG:no KEGG:UTI89_C2250 NR:ns ## KEGG: UTI89_C2250 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 83 119 201 201 177 100.0 1e-43 CRAGDYRAPGSLAGMIEQAWCSALGVDAGCHATLVHFPAWPAVWLARNDDTGFQQVLERA DYLAKEHTKAHCTGERNFGCSRG Prediction of 
potential genes in microbial genomes Time: Sun May 15 23:34:35 2011 Seq name: gi|296494609|gb|ADTN01000129.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont277.1, whole genome shotgun sequence Length of sequence - 1206 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 349 189 ## ECO111_p2-089 putative morphogenetic protein 2 1 Op 2 . - CDS 351 - 569 107 ## gi|168467993|ref|ZP_02701830.1| conserved domain protein - Prom 589 - 648 3.2 - Term 589 - 620 1.0 3 2 Tu 1 . - CDS 651 - 1193 -52 ## ROD_26041 hypothetical protein Predicted protein(s) >gi|296494609|gb|ADTN01000129.1| GENE 1 1 - 349 189 116 aa, chain - ## HITS:1 COG:no KEGG:ECO111_p2-089 NR:ns ## KEGG: ECO111_p2-089 # Name: not_defined # Def: putative morphogenetic protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 116 1 116 420 193 97.0 1e-48 MSTSAQNQSIENVSIPDVLNAGIPAIIQNIRAAQRRVSCDDLTARFFDNAIQSAEMLHAQ LIDVYNAEADSHNSLVDAAENMQLDLGLKGKEIEELQLQIEHLKRQQQDAIDDATH >gi|296494609|gb|ADTN01000129.1| GENE 2 351 - 569 107 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|168467993|ref|ZP_02701830.1| ## NR: gi|168467993|ref|ZP_02701830.1| conserved domain protein [Salmonella enterica subsp. enterica serovar Newport str. 
SL317] # 4 72 4 72 72 116 76.0 5e-25 MWPFRRKYHYWLIAFVTPTGGIRHVITRYRNKRLTLARILQAAIGEGLDTNCVVLPPSYL GKMTEAQANTEL >gi|296494609|gb|ADTN01000129.1| GENE 3 651 - 1193 -52 180 aa, chain - ## HITS:1 COG:no KEGG:ROD_26041 NR:ns ## KEGG: ROD_26041 # Name: not_defined # Def: hypothetical protein # Organism: C.rodentium # Pathway: not_defined # 5 171 56 221 221 121 43.0 1e-26 MAASPGWWAKIELFRPDITDDLFYLDLDTVIAGDIRPILENPPTSFTMLRDFYHPQYRGS GALWIPNSVKAHIWSSFWQDPEGWISRCVTTECWGDQGFLRKVMGDDTPAFQDLYPGWFV SYKADVVEPSSKYASARYSRGNGALPKDCRIIFFHGKPRPREVSEDWLPLISSFFEQKSE Prediction of potential genes in microbial genomes Time: Sun May 15 23:34:50 2011 Seq name: gi|296494608|gb|ADTN01000130.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont282.1, whole genome shotgun sequence Length of sequence - 18402 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 6, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 47 - 211 72 ## EcolC_0453 hypothetical protein 2 1 Op 2 . - CDS 248 - 1222 1221 ## COG0604 NADPH:quinone reductase and related Zn-dependent oxidoreductases - Prom 1260 - 1319 3.4 + Prom 1223 - 1282 3.7 3 2 Tu 1 2/1.000 + CDS 1374 - 3314 1146 ## COG2200 FOG: EAL domain + Prom 3520 - 3579 4.9 4 3 Op 1 22/0.000 + CDS 3619 - 4662 1171 ## COG1077 Actin-like ATPase involved in cell morphogenesis 5 3 Op 2 19/0.000 + CDS 4728 - 5831 1170 ## COG1792 Cell shape-determining protein 6 3 Op 3 7/0.000 + CDS 5831 - 6319 417 ## COG2891 Cell shape-determining protein 7 3 Op 4 8/0.000 + CDS 6328 - 6921 585 ## COG0424 Nucleotide-binding protein implicated in inhibition of septum formation 8 3 Op 5 5/1.000 + CDS 6911 - 8380 1620 ## COG1530 Ribonucleases G and E 9 3 Op 6 3/1.000 + CDS 8448 - 12248 3204 ## COG3164 Predicted membrane protein + Term 12353 - 12394 1.3 10 4 Tu 1 . + CDS 12587 - 14032 1795 ## COG0312 Predicted Zn-dependent proteases and their inactivated homologs - Term 13947 - 13996 -0.5 11 5 Tu 1 . 
- CDS 14166 - 15095 822 ## COG0583 Transcriptional regulator - Prom 15136 - 15195 7.2 + Prom 15112 - 15171 5.5 12 6 Op 1 . + CDS 15278 - 15481 128 ## G2583_3962 hypothetical protein 13 6 Op 2 6/0.000 + CDS 15489 - 16421 1028 ## COG1566 Multidrug resistance efflux pump 14 6 Op 3 . + CDS 16427 - 18394 1403 ## COG1289 Predicted membrane protein Predicted protein(s) >gi|296494608|gb|ADTN01000130.1| GENE 1 47 - 211 72 54 aa, chain - ## HITS:1 COG:no KEGG:EcolC_0453 NR:ns ## KEGG: EcolC_0453 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 48 1 48 276 103 97.0 2e-21 MKSLFNRLTGKAVSRTAFVEHLGQEVIQHHPNWKVMISTDHKLMRIDTPLNSYY >gi|296494608|gb|ADTN01000130.1| GENE 2 248 - 1222 1221 324 aa, chain - ## HITS:1 COG:ECs4125 KEGG:ns NR:ns ## COG: ECs4125 COG0604 # Protein_GI_number: 15833379 # Func_class: C Energy production and conversion; R General function prediction only # Function: NADPH:quinone reductase and related Zn-dependent oxidoreductases # Organism: Escherichia coli O157:H7 # 1 324 1 324 324 638 99.0 0 MQALLLEQQDGKTLASVQTLDESRLPEGDVTVDVHWSSLNYKDALAITGKGKIIRNFPMI PGIDFAGTVRTSEDPRFHAGQEVLLTGWGVGENHWGGLAEQARVKGDWLVAMPQGLDARK AMIIGTAGFTAMLCVMALEDAGVRPQDGEIVVTGASGGVGSTAVALLHKLGYQVVAVSGR ESTHEYLKSLGASRVLPRDEFAESRPLEKQVWAGAIDTVGDKVLAKVLAQMNYGGCVAAC GLAGGFTLPTTVMPFILRNVRLQGVDSVMTPPERRAQAWQRLVADLPESFYTQAAKEISL SEAPNFAEAIINNQIQGRTLVKVN >gi|296494608|gb|ADTN01000130.1| GENE 3 1374 - 3314 1146 646 aa, chain + ## HITS:1 COG:ECs4124_3 KEGG:ns NR:ns ## COG: ECs4124_3 COG2200 # Protein_GI_number: 15833378 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli O157:H7 # 395 646 1 252 252 499 100.0 1e-141 MRLTTKFSAFVTLLTGLTIFVTLLGCSLSFYNAIQYKFSHRVQAVATAIDTHLVSNDFSV LRPQITELMMSADIVRVDLLHGDKQVYTLARNGSYRPVGSSDLFRELSVPLIKHPGMSLR LVYQDPMGNYFHSLMTTAPLTGAIGFIIVMLFLAVRWLQRQLAGQELLETRATRILNGER GSNVLGTIYEWPPRTSSALDTLLREIQNAREQHSRLDTLIRSYAAQDVKTGLNNRLFFDN 
QLATLLEDQEKVGTHGIVMMIRLPDFNMLSDTWGHSQVEEQFFTLTNLLSTFMMRYPGAL LARYHRSDFAALLPHRTLKEAESIAGQLIKAVDTLPNNKMLDRDDMIHIGICAWRSGQDT EQVMEHAESATRNAGLQGGNSWAIYDDSLPEKGRGNVRWRTLIEQMLSRGGPRLYQKPAV TREGQVHHRELMCRIFDGNEEVSSAEYMPMVLQFGLSEEYDRLQISRLIPLLRYWPEENL AIQVTVESLIRPRFQRWLRDTLMQCEKSQRKRIIIELAEADVGQHISRLQPVIRLVNALG VRVAVNQAGLTLVSTSWIKELNVELLKLHPGLVRNIEKRTENQLLVQSLVEACSGTSTQV YATGVRSRSEWQTLIQRGVTGGQGDFFASSQPLDTNVKKYSQRYSV >gi|296494608|gb|ADTN01000130.1| GENE 4 3619 - 4662 1171 347 aa, chain + ## HITS:1 COG:ECs4123 KEGG:ns NR:ns ## COG: ECs4123 COG1077 # Protein_GI_number: 15833377 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Escherichia coli O157:H7 # 1 347 21 367 367 649 100.0 0 MLKKFRGMFSNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSPKSVAAVGHDAK QMLGRTPGNIAAIRPMKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQ VERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNG VVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRN LAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALL RNLDRLLMEETGIPVVVAEDPLTCVARGGGKALEMIDMHGGDLFSEE >gi|296494608|gb|ADTN01000130.1| GENE 5 4728 - 5831 1170 367 aa, chain + ## HITS:1 COG:ECs4122 KEGG:ns NR:ns ## COG: ECs4122 COG1792 # Protein_GI_number: 15833376 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell shape-determining protein # Organism: Escherichia coli O157:H7 # 1 367 1 367 367 592 99.0 1e-169 MKPIFSRGPSLQIRLILAVLVALGIIIADSRLGTFSQIRTYMDTAVSPFYFVSNAPRELL DGVSQTLASRDQLELENRALRQELLLKNSELLMLGQYKQENARLRELLGSPLRQDEQKMV TQVISTVNDPYSDQVVIDKGSVNGVYEGQPVISDKGVVGQVVAVAKLTSRVLLICDATHA LPIQVLRNDIRVIAAGNGCTDDLQLEHLPANTDIRVGDVLVTSGLGGRFPEGYPVAVVSS VKLDTQRAYTVIQARPTAGLQRLRYLLLLWGADRNGANPMTPEEVHRVANERLMQMMPQV LPSPDAMGPKLPEPATGIAQPTPQQPATGNAATAPAAPTQPAANRSPQRATPPQSGAQPP ARAPGGQ >gi|296494608|gb|ADTN01000130.1| GENE 6 5831 - 6319 417 162 aa, chain + ## HITS:1 COG:ECs4121 KEGG:ns NR:ns ## COG: ECs4121 COG2891 # Protein_GI_number: 
15833375 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell shape-determining protein # Organism: Escherichia coli O157:H7 # 1 162 1 162 162 227 100.0 6e-60 MASYRSQGRWVIWLSFLIALLLQIMPWPDNLIVFRPNWVLLILLYWILALPHRVNVGTGF VMGAILDLISGSTLGVRVLAMSIIAYLVALKYQLFRNLALWQQALVVMLLSLVVDIIVFW AEFLVINVSFRPEVFWSSVVNGVLWPWIFLLMRKVRQQFAVQ >gi|296494608|gb|ADTN01000130.1| GENE 7 6328 - 6921 585 197 aa, chain + ## HITS:1 COG:yhdE KEGG:ns NR:ns ## COG: yhdE COG0424 # Protein_GI_number: 16131136 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Nucleotide-binding protein implicated in inhibition of septum formation # Organism: Escherichia coli K12 # 1 197 1 197 197 375 100.0 1e-104 MTSLYLASGSPRRQELLAQLGVTFERIVTGIEEQRQPQESAQQYVVRLAREKARAGVAQT AKDLPVLGADTIVILNGEVLEKPRDAEHAAQMLRKLSGQTHQVMTAVALADSQHILDCLV VTDVTFRTLTDEDIAGYVASDEPLDKAGAYGIQGLGGCFVRKINGSYHAVVGLPLVETYE LLSNFNALREKRDKHDG >gi|296494608|gb|ADTN01000130.1| GENE 8 6911 - 8380 1620 489 aa, chain + ## HITS:1 COG:ZcafA KEGG:ns NR:ns ## COG: ZcafA COG1530 # Protein_GI_number: 15803780 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonucleases G and E # Organism: Escherichia coli O157:H7 EDL933 # 1 489 7 495 495 917 100.0 0 MTAELLVNVTPSETRVAYIDGGILQEIHIEREARRGIVGNIYKGRVSRVLPGMQAAFVDI GLDKAAFLHASDIMPHTECVAGEEQKQFTVRDISELVRQGQDLMVQVVKDPLGTKGARLT TDITLPSRYLVFMPGASHVGVSQRIESESERERLKKVVAEYCDEQGGFIIRTAAEGVGEA ELASDAAYLKRVWTKVMERKKRPQTRYQLYGELALAQRVLRDFADAELDRIRVDSRLTYE ALLEFTSEYIPEMTSKLEHYTGRQPIFDLFDVENEIQRALERKVELKSGGYLIIDQTEAM TTVDINTGAFVGHRNLDDTIFNTNIEATQAIARQLRLRNLGGIIIIDFIDMNNEDHRRRV LHSLEQALSKDRVKTSVNGFSALGLVEMTRKRTRESIEHVLCNECPTCHGRGTVKTVETV CYEIMREIVRVHHAYDSDRFLVYASPAVAEALKGEESHSLAEVEIFVGKQVKVQIEPLYN QEQFDVVMM >gi|296494608|gb|ADTN01000130.1| GENE 9 8448 - 12248 3204 1266 aa, chain + ## HITS:1 COG:yhdR+P KEGG:ns NR:ns ## COG: yhdR+P COG3164 # Protein_GI_number: 16132254 # Func_class: S Function unknown # Function: Predicted membrane protein # 
Organism: Escherichia coli K12 # 1 1266 1 1266 1266 2518 100.0 0 MRRLPGILLLTGAALVVIAALLVSGLRIALPHLDAWRPEILNKIESATGMPVEASQLSAS WQNFGPTLEAHDIRAELKDGGEFSVKRVTLALDVWQSLLHMRWQFRDLTFWQLRFRTNTP ITSGGSDDSLEASHISDLFLRQFDHFDLRDSEVSFLTPSGQRAELAIPQLTWLNDPRRHR AEGLVSLSSLTGQHGVMQVRMDLRDDEGLLSNGRVWLQADDIDLKPWLGKWMQDNIALET AQFSLEGWMTIDKGDVTGGDVWLKQGGASWLGEKQTHTLSVDNLTAHITRENPGWQFSIP DTRITMDGKPWPSGALTLAWIPEQDVGGKDNKRSDELRIRASNLELAGLEGIRPLAAKLS PALGDVWRSTQPSGKINTLALDIPLQAADKTRFQASWSDLAWKQWKLLPGAEHFSGTLSG SVENGLLTASMKQAKMPYETVFRAPLEIADGQATISWLNNNKGFQLDGRNIDVKAKAVHA RGGFRYLQPANDEPWLGILAGISTDDGSQAWRYFPENLMGKDLVDYLSGAIQGGEADNAT LVYGGNPQLFPYKHNEGQFEVLVPLRNAKFAFQPDWPALTNLDIELDFINDGLWMKTDGV NLGGVRASNLTAVIPDYSKEKLLIDADIKGPGKAVGPYFDETPLKDSLGATLQELQLDGD VNARLHLDIPLNGELVTAKGEVTLRNNSLFIKPLDSTLKNLSGKFSFINSDLQSEPLTAS WFNQPLNVDFSTKEGAKAYQVAVNLNGNWQPAKTGVLPEAVNEALSGSVAWDGKVGIDLP YHAGATYNIELNGDLKNVSSHLPSPLAKPAGEPLAVNVKVDGNLNSFELTGQAGADNHFN SRWLLGQKLTLDRAIWAADSKTLPPLPEQSGVELNMPPMNGAEWLALFQKGAAESVGGAA SFPQHITLRTPMLSLGNQQWNNLSIVSQPTANGTLVEAQGREINATLAMRNNAPWLANIK YLYYNPSVAKTRGDSTPSSPFPTTERINFRGWPDAQIRCTECWFWGQKFGRIDSDITISG DTLTLTNGLIDTGFSRLTADGEWVNNPGNERTSLKGKLRGQKIDAAAEFFGVTTPIRQSS FNVDYDLHWRKAPWQPDEATLNGIIHTQLGKGEITEINTGHAGQLLRLLSVDALMRKLRF DFRDTFGEGFYFDSIRSTAWIKDGVMHTDDTLVDGLEADIAMKGSVNLVRRDLNMEAVVA PEISATVGVAAAFAVNPIVGAAVFAASKVLGPLWSKVSILRYHISGPLDDPQINEVLRQP RKEKAQ >gi|296494608|gb|ADTN01000130.1| GENE 10 12587 - 14032 1795 481 aa, chain + ## HITS:1 COG:tldD KEGG:ns NR:ns ## COG: tldD COG0312 # Protein_GI_number: 16131134 # Func_class: R General function prediction only # Function: Predicted Zn-dependent proteases and their inactivated homologs # Organism: Escherichia coli K12 # 1 481 1 481 481 889 100.0 0 MSLNLVSEQLLAANGLKHQDLFAILGQLAERRLDYGDLYFQSSYHESWVLEDRIIKDGSY NIDQGVGVRAISGEKTGFAYADQISLLALEQSAQAARTIVRDSGDGKVQTLGAVEHSPLY TSVDPLQSMSREEKLDILRRVDKVAREADKRVQEVTASLSGVYELILVAATDGTLAADVR PLVRLSVSVLVEEDGKRERGASGGGGRFGYEFFLADLDGEVRADAWAKEAVRMALVNLSA 
VAAPAGTMPVVLGAGWPGVLLHEAVGHGLEGDFNRRGTSVFSGQVGELVASELCTVVDDG TMVDRRGSVAIDDEGTPGQYNVLIENGILKGYMQDKLNARLMGMTPTGNGRRESYAHLPM PRMTNTYMLPGKSTPQEIIESVEYGIYAPNFGGGQVDITSGKFVFSTSEAYLIENGKVTK PVKGATLIGSGIETMQQISMVGNDLKLDNGVGVCGKEGQSLPVGVGQPTLKVDNLTVGGT A >gi|296494608|gb|ADTN01000130.1| GENE 11 14166 - 15095 822 309 aa, chain - ## HITS:1 COG:ECs4116 KEGG:ns NR:ns ## COG: ECs4116 COG0583 # Protein_GI_number: 15833370 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 309 1 309 309 623 100.0 1e-178 MERLKRMSVFAKVVEFGSFTAAARQLQMSVSSISQTVSKLEDELQVKLLNRSTRSIGLTE AGRIYYQGCRRMLHEVQDVHEQLYAFNNTPIGTLRIGCSSTMAQNVLAGLTAKMLKEYPG LSVNLVTGIPAPDLIADGLDVVIRVGALQDSSLFSRRLGAMPMVVCAAKSYLTQYGIPEK PADLSSHSWLEYSVRPDNEFELIAPEGISTRLIPQGRFVTNDPMTLVRWLTAGAGIAYVP LMWVINEINRGELEILLPRYQSDPRPVYALYTEKDKLPLKVQVVINSLTDYFVEVGKLFQ EMHGRGKEK >gi|296494608|gb|ADTN01000130.1| GENE 12 15278 - 15481 128 67 aa, chain + ## HITS:1 COG:no KEGG:G2583_3962 NR:ns ## KEGG: G2583_3962 # Name: aaeX # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 67 24 90 90 103 100.0 3e-21 MSLFPVIVVFGLSFPPIFFELLLSLAIFWLVRRVLVPTGIYDFVWHPALFNTALYCCLFY LISRLFV >gi|296494608|gb|ADTN01000130.1| GENE 13 15489 - 16421 1028 310 aa, chain + ## HITS:1 COG:yhcQ KEGG:ns NR:ns ## COG: yhcQ COG1566 # Protein_GI_number: 16131131 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Escherichia coli K12 # 1 310 1 310 310 580 100.0 1e-165 MKTLIRKFSRTAITVVLVILAFIAIFNAWVYYTESPWTRDARFSADVVAIAPDVSGLITQ VNVHDNQLVKKGQILFTIDQPRYQKALEEAQADVAYYQVLAQEKRQEAGRRNRLGVQAMS REEIDQANNVLQTVLHQLAKAQATRDLAKLDLERTVIRAPADGWVTNLNVYTGEFITRGS TAVALVKQNSFYVLAYMEETKLEGVRPGYRAEITPLGSNKVLKGTVDSVAAGVTNASSTR DDKGMATIDSNLEWVRLAQRVPVRIRLDNQQENIWPAGTTATVVVTGKQDRDESQDSFFR KMAHRLREFG >gi|296494608|gb|ADTN01000130.1| GENE 14 16427 - 18394 1403 655 aa, chain + ## HITS:1 COG:yhcP KEGG:ns NR:ns ## COG: yhcP COG1289 # Protein_GI_number: 16131130 # Func_class: S Function 
unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 655 1 655 655 1277 99.0 0 MGIFSIANQHIRFAVKLATAIVLALFVGFHFQLETPRWAVLTAAIVAAGPAFAAGGEPYS GAIRYRGFLRIIGTFIGCIAGLVIIIAMIRAPLLMILVCCIWAGFCTWISSLVRIENSYA WGLAGYTALIIVITIPPEPLLTPQFAVERCSEIVIGIVCAIMADLLFSPRSIKQEVDREL ESLLVAQYQLMQLCIKHGDGEVVDKAWGDLVRRTTALQGMRSNLNMESSRWARANRRLKA INTLSLTLITQSCETYLIQNTRPELITDTFREFFDTPVETAQDVHKQLKRLRRVIAWTGE RETPVTIYSWVAAATRYQLLKRGVISNTKINATEEEILQGEPEVKVESAERHHAMVNFWR TTLSCILGTLFWLWTGWTSGSGAMVMIAVVTSLAMRLPNPRMVAIDFIYGTLAALPLGLL YFLVIIPNTQQSMLLLCISLAVLGFFLGIEVQKRRLGSMGALASTINIIVLDNPMTFHFS QFLDSALGQIVGCVLAFTVILLVRDKSRDRTGRVLLNQFVSAAVSAMTTNVARRKENHLP ALYQQLFLLMNKFPGDLPKFRLALTMIIAHQRLRDAPIPVNEDLSAFHRQMRRTADHVIS ARSDDKRRRYFGQLLEELEIYQEKLRIWQAPPQVTEPVNRLAGMLHKYQHALTDS Prediction of potential genes in microbial genomes Time: Sun May 15 23:34:58 2011 Seq name: gi|296494607|gb|ADTN01000131.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont282.2, whole genome shotgun sequence Length of sequence - 8422 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 8, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 69 - 341 358 ## COG2732 Barstar, RNAse (barnase) inhibitor - Term 339 - 369 4.1 2 2 Tu 1 . - CDS 397 - 660 373 ## G2583_3958 hypothetical protein - Prom 790 - 849 7.5 - Term 981 - 1020 4.7 3 3 Tu 1 . - CDS 1025 - 1495 546 ## COG1438 Arginine repressor - Prom 1521 - 1580 5.2 + Prom 1667 - 1726 4.3 4 4 Tu 1 . + CDS 1930 - 2868 1331 ## COG0039 Malate/lactate dehydrogenases + Term 2884 - 2927 8.6 - Term 2879 - 2908 1.2 5 5 Op 1 6/0.333 - CDS 2931 - 3998 1166 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain - Prom 4025 - 4084 4.0 6 5 Op 2 5/0.667 - CDS 4088 - 5455 1459 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain - Prom 5500 - 5559 4.2 - Term 5566 - 5598 3.8 7 6 Tu 1 . 
- CDS 5609 - 6007 551 ## COG3105 Uncharacterized protein conserved in bacteria - Prom 6121 - 6180 3.3 + Prom 6252 - 6311 2.2 8 7 Tu 1 . + CDS 6357 - 7328 779 ## COG1485 Predicted ATPase + Term 7351 - 7383 3.1 + Prom 7331 - 7390 5.6 9 8 Op 1 59/0.000 + CDS 7472 - 7975 881 ## PROTEIN SUPPORTED gi|226956764|ref|YP_002807559.1| 50S ribosomal subunit protein L13 10 8 Op 2 . + CDS 7991 - 8383 650 ## PROTEIN SUPPORTED gi|15803764|ref|NP_289798.1| 30S ribosomal protein S9 Predicted protein(s) >gi|296494607|gb|ADTN01000131.1| GENE 1 69 - 341 358 90 aa, chain + ## HITS:1 COG:ECs4112 KEGG:ns NR:ns ## COG: ECs4112 COG2732 # Protein_GI_number: 15833366 # Func_class: K Transcription # Function: Barstar, RNAse (barnase) inhibitor # Organism: Escherichia coli O157:H7 # 1 90 1 90 90 158 100.0 3e-39 MNIYTFDFDEIESQEDFYRDFSQTFGLAKDKVRDLDSLWDVLMNDVLPLPLEIEFVHLGE KTRRRFGALILLFDEAEEELEGHLRFNVRH >gi|296494607|gb|ADTN01000131.1| GENE 2 397 - 660 373 87 aa, chain - ## HITS:1 COG:no KEGG:G2583_3958 NR:ns ## KEGG: G2583_3958 # Name: yhcN # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 87 18 104 104 133 100.0 2e-30 MKIKTTVAALSVLSVLSFGAFAADSIDAAQAQNREAIGTVSVSGVASSPMDMREMLNKKA EEKGATAYQITEARSGDTWHATAELYK >gi|296494607|gb|ADTN01000131.1| GENE 3 1025 - 1495 546 156 aa, chain - ## HITS:1 COG:ECs4110 KEGG:ns NR:ns ## COG: ECs4110 COG1438 # Protein_GI_number: 15833364 # Func_class: K Transcription # Function: Arginine repressor # Organism: Escherichia coli O157:H7 # 1 156 1 156 156 293 100.0 6e-80 MRSSAKQEELVKAFKALLKEEKFSSQGEIVAALQEQGFDNINQSKVSRMLTKFGAVRTRN AKMEMVYCLPAELGVPTTSSPLKNLVLDIDYNDAVVVIHTSPGAAQLIARLLDSLGKAEG ILGTIAGDDTIFTTPANGFTVKDLYEAILELFDQEL >gi|296494607|gb|ADTN01000131.1| GENE 4 1930 - 2868 1331 312 aa, chain + ## HITS:1 COG:ECs4109 KEGG:ns NR:ns ## COG: ECs4109 COG0039 # Protein_GI_number: 15833363 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Escherichia coli O157:H7 # 1 312 1 312 312 557 100.0 
1e-159 MKVAVLGAAGGIGQALALLLKTQLPSGSELSLYDIAPVTPGVAVDLSHIPTAVKIKGFSG EDATPALEGADVVLISAGVARKPGMDRSDLFNVNAGIVKNLVQQVAKTCPKACIGIITNP VNTTVAIAAEVLKKAGVYDKNKLFGVTTLDIIRSNTFVAELKGKQPGEVEVPVIGGHSGV TILPLLSQVPGVSFTEQEVADLTKRIQNAGTEVVEAKAGGGSATLSMGQAAARFGLSLVR ALQGEQGVVECAYVEGDGQYARFFSQPLLLGKNGVEERKSIGTLSAFEQNALEGMLDTLK KDIALGEEFVNK >gi|296494607|gb|ADTN01000131.1| GENE 5 2931 - 3998 1166 355 aa, chain - ## HITS:1 COG:ECs4108 KEGG:ns NR:ns ## COG: ECs4108 COG0265 # Protein_GI_number: 15833362 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Escherichia coli O157:H7 # 1 355 1 355 355 645 100.0 0 MFVKLLRSVAIGLIVGAILLVAMPSLRSLNPLSTPQFDSTDETPASYNLAVRRAAPAVVN VYNRGLNTNSHNQLEIRTLGSGVIMDQRGYIITNKHVINDADQIIVALQDGRVFEALLVG SDSLTDLAVLKINATGGLPTIPINARRVPHIGDVVLAIGNPYNLGQTITQGIISATGRIG LNPTGRQNFLQTDASINHGNSGGALVNSLGELMGINTLSFDKSNDGETPEGIGFAIPFQL ATKIMDKLIRDGRVIRGYIGIGGREIAPLHAQGGGIDQLQGIVVNEVSPDGPAANAGIQV NDLIISVDNKPAISALETMDQVAEIRPGSVIPVVVMRDDKQLTLQVTIQEYPATN >gi|296494607|gb|ADTN01000131.1| GENE 6 4088 - 5455 1459 455 aa, chain - ## HITS:1 COG:degQ KEGG:ns NR:ns ## COG: degQ COG0265 # Protein_GI_number: 16131124 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Escherichia coli K12 # 1 455 1 455 455 777 100.0 0 MKKQTQLLSALALSVGLTLSASFQAVASIPGQVADQAPLPSLAPMLEKVLPAVVSVRVEG TASQGQKIPEEFKKFFGDDLPDQPAQPFEGLGSGVIINASKGYVLTNNHVINQAQKISIQ LNDGREFDAKLIGSDDQSDIALLQIQNPSKLTQIAIADSDKLRVGDFAVAVGNPFGLGQT ATSGIVSALGRSGLNLEGLENFIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSV GIGFAIPSNMARTLAQQLIDFGEIKRGLLGIKGTEMSADIAKAFNLDVQRGAFVSEVLPG SGSAKAGVKAGDIITSLNGKPLNSFAELRSRIATTEPGTKVKLGLLRNGKPLEVEVTLDT STSSSASAEMITPALEGATLSDGQLKDGGKGIKIDEVVKGSPAAQAGLQKDDVIIGVNRD RVNSIAEMRKVLAAKPAIIALQIVRGNESIYLLMR >gi|296494607|gb|ADTN01000131.1| GENE 7 5609 - 6007 
551 132 aa, chain - ## HITS:1 COG:STM3347 KEGG:ns NR:ns ## COG: STM3347 COG3105 # Protein_GI_number: 16766642 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Salmonella typhimurium LT2 # 1 132 3 134 134 229 96.0 9e-61 MTWEYALIGLVVGIIIGAVAMRFGNRKLRQQQALQYELEKNKAELDEYREELVSHFARSA ELLDTMAHDYRQLYQHMAKSSSSLLPELSAEANPFRNRLAESEASNDQAPVQMPRDYSEG ASGLLRTGAKRD >gi|296494607|gb|ADTN01000131.1| GENE 8 6357 - 7328 779 323 aa, chain + ## HITS:1 COG:ECs4105 KEGG:ns NR:ns ## COG: ECs4105 COG1485 # Protein_GI_number: 15833359 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Escherichia coli O157:H7 # 1 323 53 375 375 661 100.0 0 MARVGKLWGKREDTKHTPVRGLYMWGGVGRGKTWLMDLFYQSLPGERKQRLHFHRFMLRV HEELTALQGQTDPLEIIADRFKAETDVLCFDEFFVSDITDAMLLGGLMKALFARGITLVA TSNIPPDELYRNGLQRARFLPAIDAIKQHCDVMNVDAGVDYRLRTLTQAHLWLSPLHDET RAQMDKLWLALAGGKRENSPTLEINHRPLATMGVENQTLAVSFTTLCVDARSQHDYIALS RLFHTVMLFDVPVMTRLMESEARRFIALVDEFYERHVKLVVSAEVPLYEIYQGDRLKFEF QRCLSRLQEMQSEEYLKREHLAG >gi|296494607|gb|ADTN01000131.1| GENE 9 7472 - 7975 881 167 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|226956764|ref|YP_002807559.1| 50S ribosomal subunit protein L13 [Escherichia sp. 
1_1_43] # 1 167 1 167 167 343 99 2e-94 MSCEPQQLKTFGCSPTCNYLLGKLLMKTFTAKPETVKRDWYVVDATGKTLGRLATELARR LRGKHKAEYTPHVDTGDYIIVLNADKVAVTGNKRTDKVYYHHTGHIGGIKQATFEEMIAR RPERVIEIAVKGMLPKGPLGRAMFRKLKVYAGNEHNHAAQQPQVLDI >gi|296494607|gb|ADTN01000131.1| GENE 10 7991 - 8383 650 130 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15803764|ref|NP_289798.1| 30S ribosomal protein S9 [Escherichia coli O157:H7 EDL933] # 1 130 1 130 130 254 100 1e-67 MAENQYYGTGRRKSSAARVFIKPGNGKIVINQRSLEQYFGRETARMVVRQPLELVDMVEK LDLYITVKGGGISGQAGAIRHGITRALMEYDESLRSELRKAGFVTRDARQVERKKVGLRK ARRRPQFSKR Prediction of potential genes in microbial genomes Time: Sun May 15 23:35:02 2011 Seq name: gi|296494606|gb|ADTN01000132.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont282.3, whole genome shotgun sequence Length of sequence - 3180 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 46 - 98 4.8 1 1 Tu 1 . - CDS 206 - 415 78 ## ECP_3312 hypothetical protein 2 2 Op 1 13/0.000 + CDS 357 - 995 663 ## PROTEIN SUPPORTED gi|46133488|ref|ZP_00157281.2| COG0625: Glutathione S-transferase 3 2 Op 2 . + CDS 1001 - 1498 525 ## COG2969 Stringent starvation protein B + Term 1501 - 1540 8.0 - Term 1487 - 1527 8.2 4 3 Tu 1 . 
- CDS 1541 - 2908 1371 ## COG3069 C4-dicarboxylate transporter - Prom 3011 - 3070 5.7 Predicted protein(s) >gi|296494606|gb|ADTN01000132.1| GENE 1 206 - 415 78 69 aa, chain - ## HITS:1 COG:no KEGG:ECP_3312 NR:ns ## KEGG: ECP_3312 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_536 # Pathway: not_defined # 1 69 1 69 69 122 98.0 4e-27 MSVGPENSVITERLLAATAMKTSRYSQNFYCYQPPGGQSEVVLPNKERLSLFENQTKNEQ YPTFGQKIG >gi|296494606|gb|ADTN01000132.1| GENE 2 357 - 995 663 212 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|46133488|ref|ZP_00157281.2| COG0625: Glutathione S-transferase [Haemophilus influenzae R2866] # 1 203 1 203 212 259 62 1e-69 MAVAANKRSVMTLFSGPTDIYSHQVRIVLAEKGVSFEIEHVEKDNPPQDLIDLNPNQSVP TLVDRELTLWESRIIMEYLDERFPHPPLMPVYPVARGESRLYMHRIEKDWYTLMNTIING SASEADAARKQLREELLAIAPVFGQKPYFLSDEFSLVDCYLAPLLWRLPQLGIEFSGPGA KELKGYMTRVFERDSFLASLTEAEREMRLGRS >gi|296494606|gb|ADTN01000132.1| GENE 3 1001 - 1498 525 165 aa, chain + ## HITS:1 COG:ECs4101 KEGG:ns NR:ns ## COG: ECs4101 COG2969 # Protein_GI_number: 15833355 # Func_class: R General function prediction only # Function: Stringent starvation protein B # Organism: Escherichia coli O157:H7 # 1 133 1 133 165 253 100.0 1e-67 MDLSQLTPRRPYLLRAFYEWLLDNQLTPHLVVDVTLPGVQVPMEYARDGQIVLNIAPRAV GNLELANDEVRFNARFGGIPRQVSVPLAAVLAIYARENGAGTMFEPEAAYDEDTSIMNDE EASADNETVMSVIDGDKPDHDDDTHPDDEPPQPPRGGRPALRVVK >gi|296494606|gb|ADTN01000132.1| GENE 4 1541 - 2908 1371 455 aa, chain - ## HITS:1 COG:dcuD KEGG:ns NR:ns ## COG: dcuD COG3069 # Protein_GI_number: 16131117 # Func_class: C Energy production and conversion # Function: C4-dicarboxylate transporter # Organism: Escherichia coli K12 # 1 455 1 455 455 746 100.0 0 MFGIIISVIVLITMGYLILKNYKPQVVLAAAGIFLMMCGVWLGFGGVLDPTKSSGYLIVD IYNEILRMLSNRIAGLGLSIMAVGGYARYMERIGASRAMVSLLSRPLKLIRSPYIILSAT YVIGQIMAQFITSASGLGMLLMVTLFPTLVSLGVSRLSAVAVIATTMSIEWGILETNSIF AAQVAGMKIATYFFHYQLPVASCVIISVAISHFFVQRAFDKKDKNINHEQAEQKALDNVP PLYYAILPVMPLILMLGSLFLAHVGLMQSELHLVVVMLLSLTVTMFVEFFRKHNLRETMD 
DVQAFFDGMGTQFANVVTLVVAGEIFAKGLTTIGTVDAVIRGAEHSGLGGIGVMIIMALV IAICAIVMGSGNAPFMSFASLIPNIAAGLHVPAVVMIMPMHFATTLARAVSPITAVVVVT SGIAGVSPFAVVKRTAIPMAVGFVVNMIATITLFY Prediction of potential genes in microbial genomes Time: Sun May 15 23:35:06 2011 Seq name: gi|296494605|gb|ADTN01000133.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont283.1, whole genome shotgun sequence Length of sequence - 8995 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 6, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 195 - 428 119 ## gi|300897681|ref|ZP_07116079.1| transporter, major facilitator family protein - Term 319 - 349 1.6 2 2 Tu 1 2/1.000 - CDS 388 - 1059 532 ## COG2135 Uncharacterized conserved protein - Prom 1079 - 1138 2.6 3 3 Op 1 10/0.000 - CDS 1168 - 1401 328 ## COG0425 Predicted redox protein, regulator of disulfide bond formation 4 3 Op 2 . - CDS 1398 - 2603 1572 ## COG2391 Predicted transporter component - Prom 2641 - 2700 3.9 + Prom 2632 - 2691 7.2 5 4 Tu 1 . + CDS 2790 - 3203 523 ## SDY_1087 hypothetical protein + Term 3211 - 3245 5.0 - Term 3197 - 3231 5.0 6 5 Op 1 . - CDS 3237 - 4724 1538 ## COG0366 Glycosidases 7 5 Op 2 . - CDS 4802 - 5167 455 ## EcolC_1713 flagellar biosynthesis protein FliT 8 5 Op 3 15/0.000 - CDS 5167 - 5577 488 ## COG1516 Flagellin-specific chaperone FliS 9 5 Op 4 . - CDS 5602 - 7008 1266 ## COG1345 Flagellar capping protein - Prom 7160 - 7219 5.1 + Prom 7048 - 7107 4.2 10 6 Tu 1 . 
+ CDS 7274 - 8959 1391 ## COG1344 Flagellin and related hook-associated proteins Predicted protein(s) >gi|296494605|gb|ADTN01000133.1| GENE 1 195 - 428 119 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|300897681|ref|ZP_07116079.1| ## NR: gi|300897681|ref|ZP_07116079.1| transporter, major facilitator family protein [Escherichia coli MS 198-1] # 32 65 1 34 443 67 94.0 2e-10 MPGMWIYSERRCKWRSQHFSGGARRSCLWRDGAVRPPVEAGTHRNDSGDSLNVAGILVLQ GVEDVNKQVELTPRPGF >gi|296494605|gb|ADTN01000133.1| GENE 2 388 - 1059 532 223 aa, chain - ## HITS:1 COG:ZyedK KEGG:ns NR:ns ## COG: ZyedK COG2135 # Protein_GI_number: 15802366 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 EDL933 # 1 222 1 222 222 443 97.0 1e-124 MCGRFAQSQTREDYLALLAEDIERDIPYDPEPIGRYNVAPGTKVLLLSERDEHLHLDPVF WGYAPGWWDKPPLINARVETAATSRMFKPLWQHGRAICFADGWFEWKKEGDKKQPYFIYR ADGQPVFMAAIGSTPFERGDEAEGFLIVTAAADQGLVDIHDRRPLVLSPEAAQEWMRQEI GGKEASEIATNGCVPANQFTWHPVSRAVGNVKNQGAELIQPVC >gi|296494605|gb|ADTN01000133.1| GENE 3 1168 - 1401 328 77 aa, chain - ## HITS:1 COG:ECs2669 KEGG:ns NR:ns ## COG: ECs2669 COG0425 # Protein_GI_number: 15831923 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted redox protein, regulator of disulfide bond formation # Organism: Escherichia coli O157:H7 # 1 77 1 77 77 144 98.0 3e-35 MKNIIPDYRLDMVGEPCPYPAVATLEAMPQLKKGEILEVVSDCPQSINNIPLDARNHGYT VLDIQQDGPTIRYLIQK >gi|296494605|gb|ADTN01000133.1| GENE 4 1398 - 2603 1572 401 aa, chain - ## HITS:1 COG:yedE KEGG:ns NR:ns ## COG: yedE COG2391 # Protein_GI_number: 16129876 # Func_class: R General function prediction only # Function: Predicted transporter component # Organism: Escherichia coli K12 # 1 401 1 401 401 706 99.0 0 MSWQQFKHAWLIKFWAPIPAVIAAGILSTYYFGITGTFWAVTGEFTRWGGQLLQLFGVHA EEWGYFKIIHLEGSPLTRIDGMMILGMFGGCFAAALWANNVKLRMPRSRIRIMQAIIGGI IAGFGARLAMGCNLAAFFTGIPQFSLHAWFFAIATAIGSWFGARFTLLPIFRIPVKMQKV 
SAASPLTQKPDQARRRFRLGMLVFFGMLGWALLTAMNQPKLGLAMLFGVGFGLLIERAQI CFTSAFRDMWITGRTHMAKAIIIGMAVSAIGIFSYVQLGVEPKIMWAGPNAVIGGLLFGF GIVLAGGCETGWMYRAVEGQVHYWWVGLGNVIGSTILAYYWDDFAPALATDWDKINLLKT FGPMGGLLVTYLLLFTALMLIIGWEKRFFRRAAPQTAKEIA >gi|296494605|gb|ADTN01000133.1| GENE 5 2790 - 3203 523 137 aa, chain + ## HITS:1 COG:no KEGG:SDY_1087 NR:ns ## KEGG: SDY_1087 # Name: yedD # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 137 1 137 137 262 99.0 2e-69 MKKLAIAGALMLLAGCAEVENYNNVVKTPAPGWLAGYWQTKGPQRALVSPEAIGSLIVTK EGDTLDCRQWQRVIAVPGKLTLMSDDLTNVTVKRELYEVERDGNTIEYDGMTMERVDRPT AECAAALDKAPLPTPLP >gi|296494605|gb|ADTN01000133.1| GENE 6 3237 - 4724 1538 495 aa, chain - ## HITS:1 COG:amyA KEGG:ns NR:ns ## COG: amyA COG0366 # Protein_GI_number: 16129874 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Escherichia coli K12 # 1 495 1 495 495 1031 99.0 0 MRNPTLLQCFHWYYPEGGKLWPELAERADGFNDIGINMVWLPPAYKGASGGYSVGYDSYD LFDLGEFDQKGSIPTKYGDKAQLLAAIDALKRNDIAVLLDVVVNHKMGADEKEAIRVQRV NADDRTQIDEEIIECEGWTRYTFPARAGQYSQFIWDFKCFSGIDHIENPDEDGIFKIVND YTGEGWNDQVDDELGNFDYLMGENIDFRNHAVTEEIKYWARWVMEQTQCDGFRLDAVKHI PAWFYKEWIEHVQEVAPKPLFIVAEYWSHEVDKLQTYIDQVEGKTMLFDAPLQMKFHEAS RMGRDYDMTQIFTGTLVEADPFHAVTLVANHDTQPLQALEAPVEPWFKPLAYALILLREN GVPSVFYPDLYGAHYEDVGGDGHTYPIDMPIIEQLDELILARQRFAHGVQTLFFDHPNCI AFSRSGTDEYPGCVVVMSNGDDGEKTIHLGENYGNKTWRDFLGNRQESVVTDENGEATFF CNGGSVSVWVIEEVI >gi|296494605|gb|ADTN01000133.1| GENE 7 4802 - 5167 455 121 aa, chain - ## HITS:1 COG:no KEGG:EcolC_1713 NR:ns ## KEGG: EcolC_1713 # Name: not_defined # Def: flagellar biosynthesis protein FliT # Organism: E.coli_ATCC8739 # Pathway: Flagellar assembly [PATH:ecl02040] # 1 121 1 121 121 204 99.0 6e-52 MNHAPHLYFAWQQLVEKSQLMLRLATEEQWDELIASEMAYVNAVQEIAHLTEEIDPSTTM QEQLRPMLRLILDNESKVKQLLQIRMDELAKLVGQSSVQKSVLSAYGDQGGFVLAPQDNL S >gi|296494605|gb|ADTN01000133.1| GENE 8 5167 - 5577 488 136 aa, chain - ## HITS:1 COG:ECs2664 KEGG:ns NR:ns ## COG: ECs2664 COG1516 # Protein_GI_number: 
15831918 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport; O Posttranslational modification, protein turnover, chaperones # Function: Flagellin-specific chaperone FliS # Organism: Escherichia coli O157:H7 # 1 136 1 136 136 224 98.0 3e-59 MYAAKGTQAYAQIGVESAVMSASQQQLVTMLFDGVLSALVRARLFMQDNNQQGKGVSLSK AINIIENGLRVSLDEESKDELTQNLIALYSYMVRRLLQANLRNDVSAVEEVEALMRNIAD AWKESLLTPSLIQDPV >gi|296494605|gb|ADTN01000133.1| GENE 9 5602 - 7008 1266 468 aa, chain - ## HITS:1 COG:ECs2663 KEGG:ns NR:ns ## COG: ECs2663 COG1345 # Protein_GI_number: 15831917 # Func_class: N Cell motility # Function: Flagellar capping protein # Organism: Escherichia coli O157:H7 # 1 468 1 465 465 540 98.0 1e-153 MASISSLGVGSGLDLSSILDSLTAAQKATLTPISNQQSSFTAKLSAYGTLKSALTTFQTA NTALSKADLFSATSTTSSTTAFSATTAGNAIAGKYTISVTHLAQAQTLTTRTTRDDTKTA IATSDSKLTIQQGGDKDPISIDISAANSSLSGIRDAINNAKAGVSASIINVGNGEYRLSV TSNDTGLDNAMTLSVSGDDALQSFMGYDASASSNGMEVSVAAQNAQLTVNNVAIENSSNT ISDALENITLNLNDVTTGNQTLTITQDTSKAQTAIKDWMNAYNSLIDTFSSLTKYTAVDA GADSQSSSNGALLGDSTLRTIQTQLKSMLSNTVSSSNYKTLAQIGITTDPSDGKLELDAD KLTAALKKDASGVGALIVGDGKKTGITTTIGSNLTSWLSTTGIIKAATDGVSKTLNKLTK DYNAASDRIDAQVARYKEQFTQLDVLMTSLNSTSSYLTQQFENNSNSK >gi|296494605|gb|ADTN01000133.1| GENE 10 7274 - 8959 1391 561 aa, chain + ## HITS:1 COG:ECs2662 KEGG:ns NR:ns ## COG: ECs2662 COG1344 # Protein_GI_number: 15831916 # Func_class: N Cell motility # Function: Flagellin and related hook-associated proteins # Organism: Escherichia coli O157:H7 # 1 561 1 585 585 353 55.0 7e-97 MAQVINTNSLSLITQNNINKNQSALSSSIERLSSGLRINSAKDDAAGQAIANRFTSNIKG LTQAARNANDGISVAQTTEGALSEINNNLQRVRELTVQATTGTNSQSDLDSIQDEIKSRL DEIDRVSGQTQFNGVNVLAKDGSMKIQVGANDGQTITIDLKKIDSSTLKLTGFNVNGKAA VDNAKATDANLTTAGFTQGVVDSNGNSTWTKSTTTNFDAATAVNVLAAVKDGSTINYTGT GNGLGIAATSAYTYHDSTKSYTFDSTGAAVAGAASSLQGTFGTDTNTAKITIDGSAQEVN IAKDGKITDTDGKALYIDSTGNLTKNGSDTLTQATLNDVLTGANSVDDTRIDFDSGMSVT LDKVNSTVDITGASISAAAMTNELTGKAYTVVNGAESYAVATNNTVKTTADAKNVYVDAS 
GKLTTDDKATVTETYHEFANGNIYDDKGAAVYAAADGSLTTETTSKSEATANPLAALDDA ISQIDKFRSSLGAIQNRLDSAVTNLNNTTTNLSEAQSRIQDADYATEVSNMSKAQIIQQA GNSVLAKANQVPQQVLSLLQG Prediction of potential genes in microbial genomes Time: Sun May 15 23:35:19 2011 Seq name: gi|296494604|gb|ADTN01000134.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont283.2, whole genome shotgun sequence Length of sequence - 4900 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 43 - 102 3.0 1 1 Op 1 . + CDS 132 - 851 848 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit 2 1 Op 2 . + CDS 900 - 1448 308 ## ECED1_2186 flagella biosynthesis protein FliZ + Prom 1452 - 1511 4.0 3 2 Op 1 2/0.000 + CDS 1536 - 2336 1261 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain + Term 2346 - 2399 3.9 + Prom 2354 - 2413 8.0 4 2 Op 2 1/0.000 + CDS 2441 - 3427 1183 ## COG2515 1-aminocyclopropane-1-carboxylate deaminase 5 2 Op 3 34/0.000 + CDS 3442 - 4110 686 ## COG0765 ABC-type amino acid transport system, permease component 6 2 Op 4 . 
+ CDS 4107 - 4859 660 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 Predicted protein(s) >gi|296494604|gb|ADTN01000134.1| GENE 1 132 - 851 848 239 aa, chain + ## HITS:1 COG:ECs2661 KEGG:ns NR:ns ## COG: ECs2661 COG1191 # Protein_GI_number: 15831915 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Escherichia coli O157:H7 # 1 239 1 239 239 416 100.0 1e-116 MNSLYTAEGVMDKHSLWQRYVPLVRHEALRLQVRLPASVELDDLLQAGGIGLLNAVERYD ALQGTAFTTYAVQRIRGAMLDELRSRDWVPRSVRRNAREVAQAIGQLEQELGRNATETEV AERLGIDIADYRQMLLDTNNSQLFSYDEWREEHGDSIELVTDDHQRENPLQQLLDSNLRQ RVMEAIETLPEREKLVLTLYYQEELNLKEIGAVLEVGESRVSQLHSQAIKRLRTKLGKL >gi|296494604|gb|ADTN01000134.1| GENE 2 900 - 1448 308 182 aa, chain + ## HITS:1 COG:no KEGG:ECED1_2186 NR:ns ## KEGG: ECED1_2186 # Name: fliZ # Def: flagella biosynthesis protein FliZ # Organism: E.coli_ED1a # Pathway: not_defined # 1 182 2 183 183 370 99.0 1e-101 MVQHLKRRPLSRYLKDFKHSQTHCAHCRKLLDRITLVRDGKIVNKIEISRLDALLDENGW QTEQKSWAALCRFCGDLHCKTQSDFFDIIGFKQFLFEQTEMSPGTVREYVVRLRRLGNHL HEQNISLDPLQDGFLDEILAPWLPTTSTNNYRIALRKYQHYQRQTCTGLVQKSSSQPASD IY >gi|296494604|gb|ADTN01000134.1| GENE 3 1536 - 2336 1261 266 aa, chain + ## HITS:1 COG:ECs2659 KEGG:ns NR:ns ## COG: ECs2659 COG0834 # Protein_GI_number: 15831913 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Escherichia coli O157:H7 # 1 266 1 266 266 488 99.0 1e-138 MKLAHLGRQALMGVMAVALVAGMSVKSFADEGLLNKVKERGTLLVGLEGTYPPFSFQGDD GKLTGFEVEFAQQLAKHLGVEASLKPTKWDGMLASLDSKRIDVVINQVTISDERKKKYDF STPYTISGIQALVKKGNEGTIKTAADLKGKKVGVGLGTNYEEWLRQNVQGVDVRTYDDDP TKYQDLRVGRIDAILVDRLAALDLVKKTNDTLAVTGEAFSRQESGVALRKGNEDLLKAVN DAIAEMQKDGTLQALSEKWFGADVTK >gi|296494604|gb|ADTN01000134.1| GENE 4 2441 - 3427 1183 328 aa, chain + ## HITS:1 COG:yedO KEGG:ns NR:ns ## COG: yedO COG2515 # Protein_GI_number: 16129866 # 
Func_class: E Amino acid transport and metabolism # Function: 1-aminocyclopropane-1-carboxylate deaminase # Organism: Escherichia coli K12 # 1 328 33 360 360 647 100.0 0 MPLHNLTRFPRLEFIGAPTPLEYLPRFSDYLGREIFIKRDDVTPMAMGGNKLRKLEFLAA DALREGADTLITAGAIQSNHVRQTAAVAAKLGLHCVALLENPIGTTAENYLTNGNRLLLD LFNTQIEMCDALTDPNAQLEELATRVEAQGFRPYVIPVGGSNALGALGYVESALEIAQQC EGAVNISSVVVASGSAGTHAGLAVGLEHLMPESELIGVTVSRSVADQLPKVVNLQQAIAK ELELTASAEILLWDDYFAPGYGVPNDEGMEAVKLLARLEGILLDPVYTGKAMAGLIDGIS QKRFKDEGPILFIHTGGAPALFAYHPHV >gi|296494604|gb|ADTN01000134.1| GENE 5 3442 - 4110 686 222 aa, chain + ## HITS:1 COG:yecS KEGG:ns NR:ns ## COG: yecS COG0765 # Protein_GI_number: 16129865 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Escherichia coli K12 # 1 222 1 222 222 395 99.0 1e-110 MQESIQLVIDSLPFLLKGAGYTLQLSIGGMFFGLLLGFILALMRLSPIWPVRWLARFYIS IFRGTPLIAQLFMIYYGLPQFGIELDPIPSAMIGLSLNTAAYAAETLRAAISSIDKGQWE AAASIGMTPWQTMRRAILPQAARVALPPLSNSFISLVKDTSLAATIQVPELFRQAQLITS RTLDVFTMYLAASLIYWIMATVLSTLQNHFENQLNRQEREPK >gi|296494604|gb|ADTN01000134.1| GENE 6 4107 - 4859 660 250 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 4 246 2 241 245 258 51 5e-69 MSAIEVKNLVKKFHGQTVLHGIDLEVKPGEVVAIIGPSGSGKTTLLRSINLLEQPEAGTI TVGDITIDTARSLSQQKSLIRQLRQHVGFVFQNFNLFPHRTVLENIIEGPVIVKGEPKEE ATARARELLAKVGLAGKETSYPRRLSGGQQQRVAIARALAMRPEVILFDEPTSALDPELV GEVLNTIRQLAQEKRTMVIVTHEMSFARDVADRAIFMDQGRIVEQGAAKALFADPQQPRT RQFLEKFLLQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:35:42 2011 Seq name: gi|296494603|gb|ADTN01000135.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont283.3, whole genome shotgun sequence Length of sequence - 62737 bp Number of predicted genes - 60, with homology - 60 Number of transcription units - 30, operones - 14 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 59 - 118 7.7 1 1 Tu 1 . 
+ CDS 233 - 919 468 ## COG2771 DNA-binding HTH domain-containing proteins + Term 952 - 992 4.2 - Term 938 - 978 8.0 2 2 Tu 1 . - CDS 986 - 1210 359 ## ECDH10B_2056 hypothetical protein - Prom 1401 - 1460 5.5 + Prom 1566 - 1625 5.2 3 3 Op 1 3/0.800 + CDS 1669 - 2325 434 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 4 3 Op 2 9/0.000 + CDS 2388 - 4154 1583 ## COG0322 Nuclease subunit of the excinuclease complex 5 3 Op 3 1/1.000 + CDS 4211 - 4759 264 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase + TRNA 4911 - 4986 93.7 # Gly GCC 0 0 + TRNA 5041 - 5114 51.5 # Cys GCA 0 0 + TRNA 5127 - 5213 71.6 # Leu TAA 0 0 + Prom 5268 - 5327 2.9 6 4 Tu 1 . + CDS 5409 - 6074 775 ## COG3318 Predicted metal-binding protein related to the C-terminal domain of SecA + Term 6079 - 6132 3.6 - Term 6072 - 6113 2.5 7 5 Tu 1 . - CDS 6136 - 7314 1164 ## COG0814 Amino acid permeases - Prom 7436 - 7495 6.6 + Prom 7406 - 7465 6.2 8 6 Tu 1 . + CDS 7539 - 7778 330 ## ECUMN_2201 hypothetical protein - Term 7775 - 7807 5.4 9 7 Tu 1 . - CDS 7816 - 8313 658 ## COG1528 Ferritin-like protein - Prom 8347 - 8406 7.9 10 8 Tu 1 . + CDS 9271 - 9522 286 ## B21_01860 hypothetical protein + Term 9528 - 9567 7.4 - Term 9512 - 9559 7.9 11 9 Tu 1 . - CDS 9601 - 10104 501 ## COG1528 Ferritin-like protein - Prom 10284 - 10343 6.8 + Prom 10792 - 10851 5.1 12 10 Op 1 16/0.000 + CDS 10901 - 11890 1260 ## COG1879 ABC-type sugar transport system, periplasmic component 13 10 Op 2 21/0.000 + CDS 11960 - 13474 183 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 14 10 Op 3 1/1.000 + CDS 13489 - 14475 1045 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components + Term 14514 - 14551 5.4 + Prom 14528 - 14587 3.6 15 11 Op 1 8/0.000 + CDS 14642 - 15442 721 ## COG1877 Trehalose-6-phosphatase 16 11 Op 2 . 
+ CDS 15417 - 16841 1215 ## COG0380 Trehalose-6-phosphate synthase - Term 16767 - 16796 2.5 17 12 Tu 1 . - CDS 16848 - 17276 413 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins - Prom 17429 - 17488 4.3 18 13 Op 1 . + CDS 18054 - 18404 280 ## EC55989_2071 transcriptional activator FlhD 19 13 Op 2 . + CDS 18407 - 18985 367 ## ECSP_2465 transcriptional activator FlhC + Term 19012 - 19049 7.2 20 14 Op 1 19/0.000 + CDS 19112 - 19999 995 ## COG1291 Flagellar motor component 21 14 Op 2 5/0.400 + CDS 19996 - 20922 630 ## COG1360 Flagellar motor protein 22 14 Op 3 20/0.000 + CDS 20933 - 22891 1706 ## COG0643 Chemotaxis protein histidine kinase and related kinases 23 14 Op 4 17/0.000 + CDS 22912 - 23415 602 ## COG0835 Chemotaxis signal transduction protein + Term 23441 - 23479 4.5 + Prom 23455 - 23514 5.0 24 14 Op 5 13/0.000 + CDS 23560 - 25221 1551 ## COG0840 Methyl-accepting chemotaxis protein 25 14 Op 6 9/0.000 + CDS 25267 - 26868 1557 ## COG0840 Methyl-accepting chemotaxis protein 26 14 Op 7 13/0.000 + CDS 26887 - 27747 615 ## COG1352 Methylase of chemotaxis methyl-accepting proteins 27 14 Op 8 18/0.000 + CDS 27750 - 28799 866 ## COG2201 Chemotaxis response regulator containing a CheY-like receiver domain and a methylesterase domain 28 14 Op 9 8/0.000 + CDS 28814 - 29203 501 ## COG0784 FOG: CheY-like receiver 29 14 Op 10 4/0.600 + CDS 29214 - 29858 736 ## COG3143 Chemotaxis protein + Prom 29893 - 29952 2.8 30 15 Op 1 13/0.000 + CDS 30060 - 31208 1096 ## COG1377 Flagellar biosynthesis pathway, component FlhB 31 15 Op 2 . + CDS 31201 - 33279 2142 ## COG1298 Flagellar biosynthesis pathway, component FlhA 32 15 Op 3 . + CDS 33279 - 33671 178 ## B21_01838 hypothetical protein + Term 33834 - 33879 -1.0 33 16 Tu 1 . - CDS 33791 - 34279 167 ## COG3755 Uncharacterized protein conserved in bacteria - Prom 34375 - 34434 4.4 - Term 34394 - 34437 7.4 34 17 Tu 1 . 
- CDS 34456 - 36189 2279 ## COG0018 Arginyl-tRNA synthetase - Prom 36248 - 36307 4.1 + Prom 36322 - 36381 4.1 35 18 Op 1 3/0.800 + CDS 36405 - 36971 390 ## COG3102 Uncharacterized protein conserved in bacteria 36 18 Op 2 1/1.000 + CDS 36985 - 37731 404 ## COG3142 Uncharacterized protein involved in copper resistance + Term 37743 - 37778 2.0 + Prom 38035 - 38094 7.7 37 19 Op 1 7/0.000 + CDS 38119 - 39219 614 ## COG3005 Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit 38 19 Op 2 . + CDS 39244 - 40194 1087 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing - Term 40255 - 40292 -0.2 39 20 Op 1 17/0.000 - CDS 40381 - 41352 1028 ## COG0500 SAM-dependent methyltransferases 40 20 Op 2 2/0.900 - CDS 41349 - 42092 823 ## COG0500 SAM-dependent methyltransferases 41 20 Op 3 2/0.900 - CDS 42133 - 42528 469 ## COG3788 Uncharacterized relative of glutathione S-transferase, MAPEG superfamily 42 20 Op 4 1/1.000 - CDS 42581 - 43399 474 ## COG1801 Uncharacterized conserved protein 43 20 Op 5 . - CDS 43396 - 43962 560 ## COG1335 Amidases related to nicotinamidase - Prom 43989 - 44048 6.5 + Prom 44118 - 44177 4.6 44 21 Op 1 5/0.400 + CDS 44272 - 46044 2212 ## COG0173 Aspartyl-tRNA synthetase 45 21 Op 2 7/0.000 + CDS 46105 - 46614 302 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 46 21 Op 3 8/0.000 + CDS 46643 - 47383 1031 ## COG0217 Uncharacterized conserved protein 47 21 Op 4 . + CDS 47418 - 47939 556 ## COG0817 Holliday junction resolvasome, endonuclease subunit + Term 47947 - 47984 5.2 - Term 47764 - 47805 2.3 48 22 Tu 1 . - CDS 47941 - 48543 248 ## JW5306 hypothetical protein - Prom 48575 - 48634 6.0 + Prom 48728 - 48787 7.4 49 23 Op 1 29/0.000 + CDS 48818 - 49429 671 ## COG0632 Holliday junction resolvasome, DNA-binding subunit 50 23 Op 2 . 
+ CDS 49438 - 50448 1141 ## COG2255 Holliday junction resolvasome, helicase subunit + Term 50465 - 50515 5.6 51 24 Op 1 42/0.000 - CDS 50595 - 51380 937 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 52 24 Op 2 . - CDS 51377 - 52132 219 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 - Prom 52189 - 52248 7.3 + Prom 51981 - 52040 3.0 53 25 Op 1 . + CDS 52157 - 53143 338 ## COG4531 ABC-type Zn2+ transport system, periplasmic component/surface adhesin 54 25 Op 2 1/1.000 + CDS 53159 - 54481 1277 ## COG0739 Membrane proteins related to metalloendopeptidases + Term 54498 - 54532 3.5 + Prom 54507 - 54566 6.4 55 26 Tu 1 . + CDS 54601 - 55572 757 ## COG1560 Lauroyl/myristoyl acyltransferase + Term 55662 - 55714 8.2 - Term 55561 - 55599 5.3 56 27 Tu 1 . - CDS 55703 - 57145 1387 ## COG0469 Pyruvate kinase - Term 57214 - 57254 1.4 57 28 Tu 1 . - CDS 57273 - 58142 709 ## COG1737 Transcriptional regulators + Prom 58248 - 58307 7.8 58 29 Tu 1 . + CDS 58480 - 59955 1718 ## COG0364 Glucose-6-phosphate 1-dehydrogenase + Term 60035 - 60072 5.1 + Prom 60023 - 60082 6.6 59 30 Op 1 8/0.000 + CDS 60190 - 62001 1706 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase 60 30 Op 2 . 
+ CDS 62038 - 62679 996 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase + Term 62691 - 62722 3.9 Predicted protein(s) >gi|296494603|gb|ADTN01000135.1| GENE 1 233 - 919 468 228 aa, chain + ## HITS:1 COG:sdiA KEGG:ns NR:ns ## COG: sdiA COG2771 # Protein_GI_number: 16129863 # Func_class: K Transcription # Function: DNA-binding HTH domain-containing proteins # Organism: Escherichia coli K12 # 1 228 13 240 240 447 99.0 1e-126 MLLRFQRMEAAEEVYHEIELQAQQLEYDYYSLCVRHPVPFTRPKVAFYTNYPEAWVSYYQ AKNFLAIDPVLNPENFSQGHLMWNDDLFSEAQPLWEAARAHGLRRGVTQYLMLPNRALGF LSFSRCSAREIPILSDELQLKMQLLVRESLMALMRLNDEIVMTPEMNFSKREKEILKWTA EGKTSAEIAMILSISENTVNFHQKNMQKKINAPNKTQVACYAAATGLI >gi|296494603|gb|ADTN01000135.1| GENE 2 986 - 1210 359 74 aa, chain - ## HITS:1 COG:no KEGG:ECDH10B_2056 NR:ns ## KEGG: ECDH10B_2056 # Name: yecF # Def: hypothetical protein # Organism: E.coli_DH10B # Pathway: not_defined # 1 74 1 74 74 103 100.0 1e-21 MSTPDFSTAENNQELANEVSCLKAMLTLMLQAMGQADAGRVMLKMEKQLALIEDETQAAV FSKTVKQIKQAYRQ >gi|296494603|gb|ADTN01000135.1| GENE 3 1669 - 2325 434 218 aa, chain + ## HITS:1 COG:uvrY KEGG:ns NR:ns ## COG: uvrY COG2197 # Protein_GI_number: 16129861 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli K12 # 1 218 1 218 218 416 100.0 1e-116 MINVLLVDDHELVRAGIRRILEDIKGIKVVGEASCGEDAVKWCRTNAVDVVLMDMSMPGI GGLEATRKIARSTADVKIIMLTVHTENPLPAKVMQAGAAGYLSKGAAPQEVVSAIRSVYS GQRYIASDIAQQMALSQIEPEKTESPFASLSERELQIMLMITKGQKVNEISEQLNLSPKT VNSYRYRMFSKLNIHGDVELTHLAIRHGLCNAETLSSQ >gi|296494603|gb|ADTN01000135.1| GENE 4 2388 - 4154 1583 588 aa, chain + ## HITS:1 COG:ECs2651 KEGG:ns NR:ns ## COG: ECs2651 COG0322 # Protein_GI_number: 15831905 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: Escherichia coli O157:H7 # 1 588 1 588 588 1193 100.0 0 
MYDAGGTVIYVGKAKDLKKRLSSYFRSNLASRKTEALVAQIQQIDVTVTHTETEALLLEH NYIKLYQPRYNVLLRDDKSYPFIFLSGDTHPRLAMHRGAKHAKGEYFGPFPNGYAVRETL ALLQKIFPIRQCENSVYRNRSRPCLQYQIGRCLGPCVEGLVSEEEYAQQVEYVRLFLSGK DDQVLTQLISRMETASQNLEFEEAARIRDQIQAVRRVTEKQFVSNTGDDLDVIGVAFDAG MACVHVLFIRQGKVLGSRSYFPKVPGGTELSEVVETFVGQFYLQGSQMRTLPGEILLDFN LSDKTLLADSLSELAGRKINVQTKPRGDRARYLKLARTNAATALTSKLSQQSTVHQRLTA LASVLKLPEVKRMECFDISHTMGEQTVASCVVFDANGPLRAEYRRYNITGITPGDDYAAM NQVLRRRYGKAIDDSKIPDVILIDGGKGQLAQAKNVFAELDVSWDKNHPLLLGVAKGADR KAGLETLFFEPEGEGFSLPPDSPALHVIQHIRDESHDHAIGGHRKKRAKVKNTSSLETIE GVGPKRRQMLLKYMGGLQGLRNASVEEIAKVPGISQGLAEKIFWSLKH >gi|296494603|gb|ADTN01000135.1| GENE 5 4211 - 4759 264 182 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 9 175 486 665 904 106 38 4e-22 MQFNIPTLLTLFRVILIPFFVLVFYLPVTWSPFAAALIFCVAAVTDWFDGFLARRWNQST RFGAFLDPVADKVLVAIAMVLVTEHYHSWWVTLPAATMIAREIIISALREWMAELGKRSS VAVSWIGKVKTTAQMVALAWLLWRPNIWVEYAGIALFFVAAVLTLWSMLQYLSAARADLL DQ >gi|296494603|gb|ADTN01000135.1| GENE 6 5409 - 6074 775 221 aa, chain + ## HITS:1 COG:yecA KEGG:ns NR:ns ## COG: yecA COG3318 # Protein_GI_number: 16129858 # Func_class: R General function prediction only # Function: Predicted metal-binding protein related to the C-terminal domain of SecA # Organism: Escherichia coli K12 # 1 221 1 221 221 437 100.0 1e-122 MKTGPLNESELEWLDDILTKYNTDHAILDVAELDGLLTAVLSSPQEIEPEQWLVAVWGGA DYVPRWASEKEMTRFMNLAFQHMADTAERLNEFPEQFEPLFGLREVDGSELTIVEEWCFG YMRGVALSDWSTLPDSLKPALEAIALHGTEENFERVEKMSPEAFEESVDAIRLAALDLHA YWMAHPQEKAVQQPIKAEEKPGRNDPCPCGSGKKFKQCCLH >gi|296494603|gb|ADTN01000135.1| GENE 7 6136 - 7314 1164 392 aa, chain - ## HITS:1 COG:ECs2615 KEGG:ns NR:ns ## COG: ECs2615 COG0814 # Protein_GI_number: 15831869 # Func_class: E Amino acid transport and metabolism # Function: Amino acid permeases # Organism: Escherichia coli O157:H7 # 1 392 12 403 403 643 99.0 0 MAGTTIGAGMLAMPLAAAGVGFSVTLILLIGLWALMCYTALLLLEVYQHVPADTGLGTLA 
KRYLGRYGQWLTGFSMMFLMYALTAAYISGAGELLASSISDWTGISMSATAGVLLFTFVA GGVVCVGTSLVDLFNRFLFSAKIIFLVVMLVLLLPHIHKVNLLTLPLQQGLALSAIPVIF TSFGFHGSVPSIVSYMDGNVRKLRWVFITGSAIPLVAYIFWQVATLGSIDSTTFMGLLAN HAGLNGLLQALREMVASPHVELAVHLFADLALATSFLGVALGLFDYLADLFQRSNTVGGR LQTGAITFLPPLAFALFYPRGFVMALGYAGVALAVLALIIPSLLTWQSRKHNPQAGYRVK GGRPALVVVFLCGIAVIGVQFLIAAGLLPEVG >gi|296494603|gb|ADTN01000135.1| GENE 8 7539 - 7778 330 79 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_2201 NR:ns ## KEGG: ECUMN_2201 # Name: yecH # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 79 1 79 79 142 100.0 3e-33 MDSIHGHEVLNMMIESGEQYTHASLEAAIKARFGEQARFHTCSAEGMTAGELVAFLAAKG KFIPSEDGFSTDQSKICRH >gi|296494603|gb|ADTN01000135.1| GENE 9 7816 - 8313 658 165 aa, chain - ## HITS:1 COG:ECs2613 KEGG:ns NR:ns ## COG: ECs2613 COG1528 # Protein_GI_number: 15831867 # Func_class: P Inorganic ion transport and metabolism # Function: Ferritin-like protein # Organism: Escherichia coli O157:H7 # 1 165 1 165 165 291 100.0 3e-79 MLKPEMIEKLNEQMNLELYSSLLYQQMSAWCSYHTFEGAAAFLRRHAQEEMTHMQRLFDY LTDTGNLPRINTVESPFAEYSSLDELFQETYKHEQLITQKINELAHAAMTNQDYPTFNFL QWYVSEQHEEEKLFKSIIDKLSLAGKSGEGLYFIDKELSTLDTQN >gi|296494603|gb|ADTN01000135.1| GENE 10 9271 - 9522 286 83 aa, chain + ## HITS:1 COG:no KEGG:B21_01860 NR:ns ## KEGG: B21_01860 # Name: yecJ # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 83 1 83 83 138 100.0 5e-32 MSQPLNADQELVSDVVACQLVIKQILDVLDVIAPVEVREKMSSQLKNIDFTNHPAAADPV TMRAIQKAIALIELKFTPQGESH >gi|296494603|gb|ADTN01000135.1| GENE 11 9601 - 10104 501 167 aa, chain - ## HITS:1 COG:ECs2610 KEGG:ns NR:ns ## COG: ECs2610 COG1528 # Protein_GI_number: 15831864 # Func_class: P Inorganic ion transport and metabolism # Function: Ferritin-like protein # Organism: Escherichia coli O157:H7 # 1 167 1 167 167 295 99.0 2e-80 MATAGMLLKLNSQMNREFYASNLYLHLSNWCSEQSLNGTATFLRALAQSNVTQMMRMFNF MKSVGATPIVKAIDVPGEKLNSLEELFQKTMEEYEQRSSTLAQLADEAKELNDDSTVNFL RDLEKEQQHDGLLLQTILDEVRSAKLAGMCPVQTDQHVLNVVSHQLH 
>gi|296494603|gb|ADTN01000135.1| GENE 12 10901 - 11890 1260 329 aa, chain + ## HITS:1 COG:araF KEGG:ns NR:ns ## COG: araF COG1879 # Protein_GI_number: 16129851 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli K12 # 1 329 1 329 329 634 100.0 0 MHKFTKALAAIGLAAVMSQSAMAENLKLGFLVKQPEEPWFQTEWKFADKAGKDLGFEVIK IAVPDGEKTLNAIDSLAASGAKGFVICTPDPKLGSAIVAKARGYDMKVIAVDDQFVNAKG KPMDTVPLVMMAATKIGERQGQELYKEMQKRGWDVKESAVMAITANELDTARRRTTGSMD ALKAAGFPEKQIYQVPTKSNDIPGAFDAANSMLVQHPEVKHWLIVGMNDSTVLGGVRATE GQGFKAADIIGIGINGVDAVSELSKAQATGFYGSLLPSPDVHGYKSSEMLYNWVAKDVEP PKFTEVTDVVLITRDNFKEELEKKGLGGK >gi|296494603|gb|ADTN01000135.1| GENE 13 11960 - 13474 183 504 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 275 477 20 217 245 75 25 9e-13 MQQSTPYLSFRGIGKTFPGVKALTDISFDCYAGQVHALMGENGAGKSTLLKILSGNYAPT TGSVVINGQEMSFSDTTAALNAGVAIIYQELHLVPEMTVAENIYLGQLPHKGGIVNRSLL NYEAGLQLKHLGMDIDPDTPLKYLSIGQWQMVEIAKALARNAKIIAFDEPTSSLSAREID NLFRVIRELRKEGRVILYVSHRMEEIFALSDAITVFKDGRYVKTFTDMQQVDHDALVQAM VGRDIGDIYGWQPRSYGEERLRLDAVKAPGVRTPISLAVRSGEIVGLFGLVGAGRSELMK GMFGGTQITAGQVYIDQQPIDIRKPSHAIAAGMMLCPEDRKAEGIIPVHSVRDNINISAR RKHVLGGCVINNGWEENNADHHIRSLNIKTPGAEQLIMNLSGGNQQKAILGRWLSEEMKV ILLDEPTRGIDVGAKHEIYNVIYALAAQGVAVLFASSDLPEVLGVADRIVVMREGEIAGE LLHEQADERQALSLAMPKVSQAVA >gi|296494603|gb|ADTN01000135.1| GENE 14 13489 - 14475 1045 328 aa, chain + ## HITS:1 COG:araH KEGG:ns NR:ns ## COG: araH COG1172 # Protein_GI_number: 16132221 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Escherichia coli K12 # 1 328 2 329 329 517 99.0 1e-146 MSSVSTSGSGAPKSSFSFGRIWDQYGMLVVFAVLFIACAIFVPNFATFINMKGLGLAISM SGMVACGMLFCLASGDFDLSVASVIACAGVTTAVGINLTESLWIGVAAGLLLGVLCGLVN GFVIAKLKINALITTLATMQIVRGLAYIISDGKAVGIEDESFFALGYANWFGLPAPIWLT 
VACLIIFGLLLNKTTFGRNTLAIGGNEEAARLAGVPVVRTKIIIFVLSGLVSAIAGIILA SRMTSGQPMTSIGYELIVISACVLGGVSLKGGIGKISYVVAGILILGTVENAMNLLNISP FAQYVVRGLILLAAVIFDRYKQKAKRTV >gi|296494603|gb|ADTN01000135.1| GENE 15 14642 - 15442 721 266 aa, chain + ## HITS:1 COG:otsB KEGG:ns NR:ns ## COG: otsB COG1877 # Protein_GI_number: 16129849 # Func_class: G Carbohydrate transport and metabolism # Function: Trehalose-6-phosphatase # Organism: Escherichia coli K12 # 1 266 1 266 266 531 100.0 1e-151 MTEPLTETPELSAKYAWFFDLDGTLAEIKPHPDQVVVPDNILQGLQLLATASDGALALIS GRSMVELDALAKPYRFPLAGVHGAERRDINGKTHIVHLPDAIARDISVQLHTVIAQYPGA ELEAKGMAFALHYRQAPQHEDALMTLAQRITQIWPQMALQQGKCVVEIKPRGTSKGEAIA AFMQEAPFIGRTPVFLGDDLTDESGFAVVNRLGGMSVKIGTGATQASWRLAGVPDVWSWL EMITTALQQKRENNRSDDYESFSRSI >gi|296494603|gb|ADTN01000135.1| GENE 16 15417 - 16841 1215 474 aa, chain + ## HITS:1 COG:otsA KEGG:ns NR:ns ## COG: otsA COG0380 # Protein_GI_number: 16129848 # Func_class: G Carbohydrate transport and metabolism # Function: Trehalose-6-phosphate synthase # Organism: Escherichia coli K12 # 1 474 1 474 474 959 100.0 0 MSRLVVVSNRIAPPDEHAASAGGLAVGILGALKAAGGLWFGWSGETGNEDQPLKKVKKGN ITWASFNLSEQDLDEYYNQFSNAVLWPAFHYRLDLVQFQRPAWDGYLRVNALLADKLLPL LQDDDIIWIHDYHLLPFAHELRKRGVNNRIGFFLHIPFPTPEIFNALPTYDTLLEQLCDY DLLGFQTENDRLAFLDCLSNLTRVTTRSAKSHTAWGKAFRTEVYPIGIEPKEIAKQAAGP LPPKLAQLKAELKNVQNIFSVERLDYSKGLPERFLAYEALLEKYPQHHGKIRYTQIAPTS RGDVQAYQDIRHQLENEAGRINGKYGQLGWTPLYYLNQHFDRKLLMKIFRYSDVGLVTPL RDGMNLVAKEYVAAQDPANPGVLVLSQFAGAANELTSALIVNPYDRDEVAAALDRALTMS LAERISRHAEMLDVIVKNDINHWQECFISDLKQIVPRSAESQQRDKVATFPKLA >gi|296494603|gb|ADTN01000135.1| GENE 17 16848 - 17276 413 142 aa, chain - ## HITS:1 COG:ECs2603 KEGG:ns NR:ns ## COG: ECs2603 COG0589 # Protein_GI_number: 15831857 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Escherichia coli O157:H7 # 1 142 1 142 142 267 97.0 4e-72 MSYSNILVAVAVTPESQQLLAKAVSIARPVKGHISLITLASDPEMYNQLAAPMLEDLRSV 
MQEETQSFLDKLIQDAGYPVDKTFIAYGELSEHILEVCHKHHFDLVICGNHNHSFFSRAS CSAKRVIASSEVDVLLVPLMGD >gi|296494603|gb|ADTN01000135.1| GENE 18 18054 - 18404 280 116 aa, chain + ## HITS:1 COG:no KEGG:EC55989_2071 NR:ns ## KEGG: EC55989_2071 # Name: flhD # Def: transcriptional activator FlhD # Organism: E.coli_55989 # Pathway: Two-component system [PATH:eck02020]; Flagellar assembly [PATH:eck02040] # 1 116 4 119 119 194 99.0 6e-49 MHTSELLKHIYDINLSYLLLAQRMIVQDKASAMFRLGINEEMATTLAALTLPQMVKLAET NQLVCHFRFDSHQTITQLTQDSRVDDLQQIHTGIMLSTRLLNDVNQPEEALRKKRA >gi|296494603|gb|ADTN01000135.1| GENE 19 18407 - 18985 367 192 aa, chain + ## HITS:1 COG:no KEGG:ECSP_2465 NR:ns ## KEGG: ECSP_2465 # Name: flhC # Def: transcriptional activator FlhC # Organism: E.coli_O157_TW14359 # Pathway: Two-component system [PATH:etw02020]; Flagellar assembly [PATH:etw02040] # 1 192 1 192 192 385 100.0 1e-106 MSEKSIVQEARDIQLAMELITLGARLQMLESETQLSRGRLIKLYKELRGSPPPKGMLPFS TDWFMTWEQNVHASMFCNAWQFLLKTGLCNGVDAVIKAYRLYLEQCPQAEEGPLLALTRA WTLVRFVESGLLQLSSCNCCGGNFITHAHQPVGSFACSLCQPPSRAVKRRKLSQNPADII PQLLDEQRVQAV >gi|296494603|gb|ADTN01000135.1| GENE 20 19112 - 19999 995 295 aa, chain + ## HITS:1 COG:motA KEGG:ns NR:ns ## COG: motA COG1291 # Protein_GI_number: 16129842 # Func_class: N Cell motility # Function: Flagellar motor component # Organism: Escherichia coli K12 # 1 295 1 295 295 547 100.0 1e-156 MLILLGYLVVLGTVFGGYLMTGGSLGALYQPAELVIIAGAGIGSFIVGNNGKAIKGTLKA LPLLFRRSKYTKAMYMDLLALLYRLMAKSRQMGMFSLERDIENPRESEIFASYPRILADS VMLDFIVDYLRLIISGHMNTFEIEALMDEEIETHESEAEVPANSLALVGDSLPAFGIVAA VMGVVHALGSADRPAAELGALIAHAMVGTFLGILLAYGFISPLATVLRQKSAETSKMMQC VKVTLLSNLNGYAPPIAVEFGRKTLYSSERPSFIELEEHVRAVKNPQQQTTTEEA >gi|296494603|gb|ADTN01000135.1| GENE 21 19996 - 20922 630 308 aa, chain + ## HITS:1 COG:ECs2599 KEGG:ns NR:ns ## COG: ECs2599 COG1360 # Protein_GI_number: 15831853 # Func_class: N Cell motility # Function: Flagellar motor protein # Organism: Escherichia coli O157:H7 # 1 308 1 308 308 587 100.0 1e-168 
MKNQAHPIIVVKRRKAKSHGAAHGSWKIAYADFMTAMMAFFLVMWLISISSPKELIQIAE YFRTPLATAVTGGDRISNSESPIPGGGDDYTQSQGEVNKQPNIEELKKRMEQSRLRKLRG DLDQLIESDPKLRALRPHLKIDLVQEGLRIQIIDSQNRPMFRTGSADVEPYMRDILRAIA PVLNGIPNRISLSGHTDDFPYASGEKGYSNWELSADRANASRRELMVGGLDSGKVLRVVG MAATMRLSDRGPDDAVNRRISLLVLNKQAEQAILHENAESQNEPVSALEKPEVAPQVSVP TMPSAEPR >gi|296494603|gb|ADTN01000135.1| GENE 22 20933 - 22891 1706 652 aa, chain + ## HITS:1 COG:cheA KEGG:ns NR:ns ## COG: cheA COG0643 # Protein_GI_number: 16129840 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Chemotaxis protein histidine kinase and related kinases # Organism: Escherichia coli K12 # 1 652 3 654 654 1199 99.0 0 MDISDFYQTFFDEADELLADMEQHLLVLQPEAPDAEQLNAIFRAAHSIKGGAGTFGFSVL QETTHLMENLLDEARRGEMQLNTDIINLFLETKDIMQEQLDAYKQSQEPDAASFDYICQA LRQLALEAKGETPSAVTRLSVVAKSEPQDEQSRSQSPRRIILSRLKAGEVDLLEEELGHL TTLTDVVKGADSLSAILPGDIAEDDITAVLCFVIEADQITFETVEVSPKISTPPVLKLAA EQAPTGRVEREKTTRSNESTSIRVAVEKVDQLINLVGELVITQSMLAQRSSELDPVNHGD LITSMGQLQRNARDLQESVMSIRMMPMEYVFSRYPRLVRDLAGKLGKQVELTLVGSSTEL DKSLIERIIDPLTHLVRNSLDHGIELPEKRLAAGKNSVGNLILSAEHQGGNICIEVTDDG AGLNRERILAKAASQGLTVSENMSDDEVAMLIFAPGFSTAEQVTDVSGRGVGMDVVKRNI QEMGGHVEIQSKQGTGTTIRILLPLTLAILDGMSVRVADEVFILPLNAVMESLQPREADL HPLAGGERVLEVRGEYLPIVELWKVFNVAGAKTEATQGIVVILQSGGRRYALLVDQLIGQ HQVVVKNLESNYRKVPGISAATILGDGSVALIVDVSALQAINREQRMANTAA >gi|296494603|gb|ADTN01000135.1| GENE 23 22912 - 23415 602 167 aa, chain + ## HITS:1 COG:ECs2597 KEGG:ns NR:ns ## COG: ECs2597 COG0835 # Protein_GI_number: 15831851 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Chemotaxis signal transduction protein # Organism: Escherichia coli O157:H7 # 1 167 1 167 167 293 100.0 1e-79 MTGMTNVTKLASEPSGQEFLVFTLGDEEYGIDILKVQEIRGYDQVTRIANTPAFIKGVTN LRGVIVPIVDLRIKFSQVDVDYNDNTVVIVLNLGQRVVGIVVDGVSDVLSLTAEQIRPAP EFAVTLSTEYLTGLGALGDRMLILVNIEKLLNSEEMALLDSAASEVA >gi|296494603|gb|ADTN01000135.1| GENE 24 23560 - 25221 1551 553 aa, chain + ## HITS:1 COG:tar KEGG:ns NR:ns ## COG: tar COG0840 # 
Protein_GI_number: 16129838 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Escherichia coli K12 # 1 553 1 553 553 897 100.0 0 MINRIRVVTLLVMVLGVFALLQLISGSLFFSSLHHSQKSFVVSNQLREQQGELTSTWDLM LQTRINLSRSAVRMMMDSSNQQSNAKVELLDSARKTLAQAATHYKKFKSMAPLPEMVATS RNIDEKYKNYYTALTELIDYLDYGNTGAYFAQPTQGMQNAMGEAFAQYALSSEKLYRDIV TDNADDYRFAQWQLAVIALVVVLILLVAWYGIRRMLLTPLAKIIAHIREIAGGNLANTLT IDGRSEMGDLAQSVSHMQRSLTDTVTHVREGSDAIYAGTREIAAGNTDLSSRTEQQASAL EETAASMEQLTATVKQNADNARQASQLAQSASDTAQHGGKVVDGVVKTMHEIADSSKKIA DIISVIDGIAFQTNILALNAAVEAARAGEQGRGFAVVAGEVRNLASRSAQAAKEIKALIE DSVSRVDTGSVLVESAGETMNNIVNAVTRVTDIMGEIASASDEQSRGIDQVALAVSEMDR VTQQNASLVQESAAAAAALEEQASRLTQAVSAFRLAASPLTNKPQTPSRPASEQPPAQPR LRIAEQDPNWETF >gi|296494603|gb|ADTN01000135.1| GENE 25 25267 - 26868 1557 533 aa, chain + ## HITS:1 COG:tap KEGG:ns NR:ns ## COG: tap COG0840 # Protein_GI_number: 16129837 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Escherichia coli K12 # 1 533 1 533 533 836 99.0 0 MFNRIRISTTLFLILILCGILQIGSNGMSFWAFRDDLQRLNQVEQSNQQRAALAQTRAVM LQASTALNKAGTLTALSYPADDIKTLMTTARASLTQSTTLFKSFMAMTAGNEHVRALQKE TEKSFARWHNDLEHQATWLESNQLSDFLTAPVQGSQNAFDVNFEAWQLEINHVLEAASAQ SQRNYQISALVFISMIIVAAIYISSALWWTRKMIVQPLAIIGSHFDSIAAGNLARPIAVY GRNEITAIFASLKTMQQALRGTVSDVRKGSQEMHIGIAEIVAGNNDLSSRTEQQAASLAQ TAASMEQLTATVGQNADNARQASELAKNAATTAQAGGVQVSTMTHTMQEIATSSQKIGDI ISVIDGIAFQTNILALNAAVEAARAGEQGRGFAVVAGEVRNLASRSAQAAKEIKGLIEES VNRVQQGSKLVNNAAATMIDIVSSVTRVNDIMGEIASASEEQQRGIEQVAQAVSQMDQVT QQNASLVEEAAVATEQLANQADHLSSRVAVFTLEEHEVARHESVQLQIAPVVS >gi|296494603|gb|ADTN01000135.1| GENE 26 26887 - 27747 615 286 aa, chain + ## HITS:1 COG:cheR KEGG:ns NR:ns ## COG: cheR COG1352 # Protein_GI_number: 16129836 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methylase of chemotaxis methyl-accepting proteins # Organism: Escherichia coli K12 # 1 286 1 286 286 565 98.0 1e-161 
MTSSLPCGQTSLLLQMTERLALSDAHFRRISQLIYQRAGIVLADHKRDMVYNRLVRRLRS LGLTDFGHYLNLLESNQHSGEWQAFINSLTTNLTAFFREAHHFPLLADHARRRSGEYRVW SAAASTGEEPYSIAMTLADTLGTAPGRWKVFASDIDTKVLEKARSGIYRHEELKNLTPQQ LQRYFMRGTGPHEGLVRVRQELANYVDFAPLNLLAKQYTVPGPFDAIFCRNVMIYFDQNT QQEILRRFVPLLKTDGLLFAGHSENFSHLERRFTLRGQTVYALSKD >gi|296494603|gb|ADTN01000135.1| GENE 27 27750 - 28799 866 349 aa, chain + ## HITS:1 COG:ECs2593 KEGG:ns NR:ns ## COG: ECs2593 COG2201 # Protein_GI_number: 15831847 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Chemotaxis response regulator containing a CheY-like receiver domain and a methylesterase domain # Organism: Escherichia coli O157:H7 # 1 349 1 349 349 654 99.0 0 MSKIRVLSVDDSALMRQIMTEIINSHSDMEMVATAPDPLVARDLIKKFNPDVLTLDVEMP RMDGLDFLEKLMRLRPMPVVMVSSLTGKGSEVTLRALELGAIDFVTKPQLGIREGMLAYS EMIAEKVRTAAKASLAAHKPLSAPTTLKAGPLLSSEKLIAIGASTGGTEAIRHVLQPLPL SSPALLITQHMPPGFTRSFADRLNKLCQIGVKEAEDGERVLPGHAYIAPGDRHMELARSG ANYQIKIHDGPAVNRHRPSVDVLFHSVAKQAGRNAVGVILTGMGNDGAAGMLAMRQAGAW TLAQNEASCVVFGMPREAINMGGVCEVVDLSQVSQQMLAKISAGQAIRI >gi|296494603|gb|ADTN01000135.1| GENE 28 28814 - 29203 501 129 aa, chain + ## HITS:1 COG:ECs2592 KEGG:ns NR:ns ## COG: ECs2592 COG0784 # Protein_GI_number: 15831846 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Escherichia coli O157:H7 # 1 129 1 129 129 234 100.0 4e-62 MADKELKFLVVDDFSTMRRIVRNLLKELGFNNVEEAEDGVDALNKLQAGGYGFVISDWNM PNMDGLELLKTIRADGAMSALPVLMVTAEAKKENIIAAAQAGASGYVVKPFTAATLEEKL NKIFEKLGM >gi|296494603|gb|ADTN01000135.1| GENE 29 29214 - 29858 736 214 aa, chain + ## HITS:1 COG:cheZ KEGG:ns NR:ns ## COG: cheZ COG3143 # Protein_GI_number: 16129833 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Chemotaxis protein # Organism: Escherichia coli K12 # 1 214 1 214 214 345 100.0 3e-95 MMQPSIKPADEHSAGDIIARIGSLTRMLRDSLRELGLDQAIAEAAEAIPDARDRLYYVVQ MTAQAAERALNSVEASQPHQDQMEKSAKALTQRWDDWFADPIDLADARELVTDTRQFLAD 
VPAHTSFTNAQLLEIMMAQDFQDLTGQVIKRMMDVIQEIERQLLMVLLENIPEQESRPKR ENQSLLNGPQVDTSKAGVVASQDQVDDLLDSLGF >gi|296494603|gb|ADTN01000135.1| GENE 30 30060 - 31208 1096 382 aa, chain + ## HITS:1 COG:flhB KEGG:ns NR:ns ## COG: flhB COG1377 # Protein_GI_number: 16129832 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FlhB # Organism: Escherichia coli K12 # 1 382 1 382 382 692 99.0 0 MSDESDDKTEAPTPHRLEKAREEGQIPRSRELTSLLILLVGVSVIWFGGVSLARRLSGML SAGLHFDHSIIKDPNLILGQIILLIREAMLALLPLISGVVLVALISPVMLGGLVFSGKSL QPKFSKLNPLPGIKRMFSAQTGAELLKAILKTILVGSVTGFFLWHHWPQMMRLMAESPIT AMGNAMDLVGLCALLVVLGVIPMVGFDVFFQIFSHLKKLRMSRQDIRDEFKQSEGDPHVK GRIRQMQRAAARRRMMADVPKADVIVNNPTHYSVALQYDENKMSAPKVVAKGAGLVALRI REIGAENNVPTLEAPPLARALYRHAEIGQQIPGQLYAAVAEVLAWVWQLKRWRLAGGQRP VQPTHLPVPEALDFINEKPTHE >gi|296494603|gb|ADTN01000135.1| GENE 31 31201 - 33279 2142 692 aa, chain + ## HITS:1 COG:flhA KEGG:ns NR:ns ## COG: flhA COG1298 # Protein_GI_number: 16129831 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FlhA # Organism: Escherichia coli K12 # 1 692 1 692 692 1264 99.0 0 MSNLAAMLRLPANLKSTQWQILAGPILILLILSMMVLPLPAFILDLLFTFNIALSIMVLL VAMFTQRTLEFAAFPTILLFTTLLRLALNVASTRIILMEGHTGAAAAGKVVEAFGHFLVG GNFAIGIVVFVILVIINFMVITKGAGRIAEVGARFVLDGMPGKQMAIDADLNAGLIGEDE AKKRRSEVTQEADFYGSMDGASKFVRGDAIAGILIMVINVVGGLLVGVLQHGMSMGHAAE SYTLLTIGDGLVAQIPALVISTAAGVIVTRVSTDQDVGEQMVNQLFSNPSVMLLSAAVLG LLGLVPGMPNLVFLLFTAGLLGLAWWIRGREQKAPAEPKPVKMAENNTVVEATWNDVQLE DSLGMEVGYRLIPMVDFQQDGELLGRIRSIRKKFAQEMGFLPPVVHIRDNMDLQPACYRI LMKGVEIGSGDAYPGRWLAINPGTAAGTLPGEATVDPAFGLNAIWIESALKEQAQIQGYT VVEASTVVATHLNHLISQHAAELFGRQEAQQLLDRVAQEMPKLTEDLVPGVVTLTTLHKV LQNLLDEKVPIRDMRTILETLAEHAPIQSDPHELTAVVRVALGRAITQQWFPGKDEVHVI GLDTPLERLLLQALQGGGGLEPGLADRLLAQTQEALSRQEMLGAPPVLLVNHALRPLLSR FLRRSLPQLVVLSNLELSDNRHIRMTATIGGK >gi|296494603|gb|ADTN01000135.1| GENE 32 33279 - 33671 
178 130 aa, chain + ## HITS:1 COG:no KEGG:B21_01838 NR:ns ## KEGG: B21_01838 # Name: flhE # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 130 1 130 130 218 100.0 4e-56 MRTLLAILLFPLLVQAAGEGMWQASSVGITLNHRGESMSSAPLSTRQPASGLMTLVAWRY QLIGPTPSGLRVRLCSQSRCVELEGQSGTTVAFSGIAAAEPLRFIWEVPGGGRLIPPLKV QRNEVIVNYR >gi|296494603|gb|ADTN01000135.1| GENE 33 33791 - 34279 167 162 aa, chain - ## HITS:1 COG:yecT KEGG:ns NR:ns ## COG: yecT COG3755 # Protein_GI_number: 16129829 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 162 8 169 169 311 99.0 4e-85 MFKFLVLTLGIISCQAYAEDTVIVNDHDISAIKDCWQKNSDDDTDVNVIKSCLRQEYNLV DAQLNKAYGEAYRYIEQVPRTGVKKPDTEQLNLLKKSQRAWLDFRDKECELILSNEDVQD LSDPYSESEWLSCMIIQTNTRTRQLQLYRNSEDFYPSPLTRG >gi|296494603|gb|ADTN01000135.1| GENE 34 34456 - 36189 2279 577 aa, chain - ## HITS:1 COG:argS KEGG:ns NR:ns ## COG: argS COG0018 # Protein_GI_number: 16129828 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Arginyl-tRNA synthetase # Organism: Escherichia coli K12 # 1 577 1 577 577 1151 100.0 0 MNIQALLSEKVRQAMIAAGAPADCEPQVRQSAKVQFGDYQANGMMAVAKKLGMAPRQLAE QVLTHLDLNGIASKVEIAGPGFINIFLDPAFLAEHVQQALASDRLGVATPEKQTIVVDYS APNVAKEMHVGHLRSTIIGDAAVRTLEFLGHKVIRANHVGDWGTQFGMLIAWLEKQQQEN AGEMELADLEGFYRDAKKHYDEDEEFAERARNYVVKLQSGDEYFREMWRKLVDITMTQNQ ITYDRLNVTLTRDDVMGESLYNPMLPGIVADLKAKGLAVESEGATVVFLDEFKNKEGEPM GVIIQKKDGGYLYTTTDIACAKYRYETLHADRVLYYIDSRQHQHLMQAWAIVRKAGYVPE SVPLEHHMFGMMLGKDGKPFKTRAGGTVKLADLLDEALERARRLVAEKNPDMPADELEKL ANAVGIGAVKYADLSKNRTTDYIFDWDNMLAFEGNTAPYMQYAYTRVLSVFRKAEIDEEQ LAAAPVIIREDREAQLAARLLQFEETLTVVAREGTPHVMCAYLYDLAGLFSGFYEHCPIL SAENEEVRNSRLKLAQLTAKTLKLGLDTLGIETVERM >gi|296494603|gb|ADTN01000135.1| GENE 35 36405 - 36971 390 188 aa, chain + ## HITS:1 COG:yecM KEGG:ns NR:ns ## COG: yecM COG3102 # Protein_GI_number: 16129827 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia 
coli K12 # 1 188 3 190 190 373 100.0 1e-104 MANWQSIDELQDIASDLPRFIHALDELSRRLGLNITPLTADHISLRCHQNATAERWRRGF EQCGELLSENMINGRPICLFKLHEPVQVAHWQFSIVELPWPGEKRYPHEGWEHIEIVLPG DPETLNARALALLSDEGLSLPGISVKTSSPKGEHERLPNPTLAVTDGKTTIKFHPWSIEE IVASEQSA >gi|296494603|gb|ADTN01000135.1| GENE 36 36985 - 37731 404 248 aa, chain + ## HITS:1 COG:cutCm KEGG:ns NR:ns ## COG: cutCm COG3142 # Protein_GI_number: 16132223 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in copper resistance # Organism: Escherichia coli K12 # 1 248 1 248 248 495 99.0 1e-140 MALLEICCYSMECALTAQQNGADRVELCAAPKEGGLTPSLGVLKSVRQRVTIPVHPIIRP RGGDFCYSDGEFAAILEDVRTVRELGFPGLVTGVLDVDGNVDMPRMEKIMAAAGPLAVTF HRAFDMCANPLHTLNNLAELGIARVLTSGQKSDALQGLSKIMELIAHRDAPIIMAGAGVR AENLHHFLDAGVLEVHSSAGAWQASPMRYRNQGLSMSSDEHADEYSRYIVDGAAVAEMKG IIERHQAK >gi|296494603|gb|ADTN01000135.1| GENE 37 38119 - 39219 614 366 aa, chain + ## HITS:1 COG:ECs2583 KEGG:ns NR:ns ## COG: ECs2583 COG3005 # Protein_GI_number: 15831837 # Func_class: C Energy production and conversion # Function: Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit # Organism: Escherichia coli O157:H7 # 1 366 1 366 366 721 98.0 0 MRGKKRIGLLFLLIAVVVGGGGLLLAQKALHKTSDTAFCLSCHSMSKPFEEYQGTVHFSN QKGIRAECADCHIPKSGMDYLFAKLKASKDIYHEFVSGKIDSDDKFEAHRQEMAETVWKE LKATDSATCRSCHSFDAMDIASQSESAQKMHNKAQKDSETCIDCHKGIAHFPPEIKMDDN AAHELESQAATSVTNGAHIYPFKTSHIGELATVNPGTDLTVVDASGKQPIVLLQGYQMQG SENTLYLAAGQRLALATLSEEGIKALTVNGEWQADEYGNQWRQASLQGALTDPALADRKP LWQYAEKLDDTYCAGCHAPIAADHYTVNAWPSIAKGMGARTSMSENELDILTRYFQYNAK DITEKQ >gi|296494603|gb|ADTN01000135.1| GENE 38 39244 - 40194 1087 316 aa, chain + ## HITS:1 COG:bisZ KEGG:ns NR:ns ## COG: bisZ COG0243 # Protein_GI_number: 16129825 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 303 7 309 815 608 99.0 1e-174 MTLTRREFIKHSGIAAGALVVTSAAPLPAWAEEKGGKILSAGRWGAMNVEVKDGKIVSST 
GALAKTIPNSLQSTAADQVHTTARIQHPMVRKSYLDNPLQPAKGRGEDTYVQVSWEQALK LIHEQHDRIRKANGPSAIFAGSYGWRSSGVLHKAQTLLQRYMNLAGGYSGHSGDYSTGAA QVIMPHVVGSVEVYEQQTSWPLILENSQVVVLWGMNPLNTLKIAWSSTDEQGLEYFHQLK KSGKPVIAIDPIRSETIEFFDDNATWIAPNMGTDVALMLGIAHTLMTQGKHDKVFLEKYT TGYRKIYGQCTEVNGV >gi|296494603|gb|ADTN01000135.1| GENE 39 40381 - 41352 1028 323 aa, chain - ## HITS:1 COG:yecP KEGG:ns NR:ns ## COG: yecP COG0500 # Protein_GI_number: 16129824 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli K12 # 1 323 1 323 323 671 100.0 0 MIDFGNFYSLIAKNHLSHWLETLPAQIANWQREQQHGLFKQWSNAVEFLPEIKPYRLDLL HSVTAESEEPLSAGQIKRIETLMRNLMPWRKGPFSLYGVNIDTEWRSDWKWDRVLPHLSD LTGRTILDVGCGSGYHMWRMIGAGAHLAVGIDPTQLFLCQFEAVRKLLGNDQRAHLLPLG IEQLPALKAFDTVFSMGVLYHRRSPLEHLWQLKDQLVNEGELVLETLVIDGDENTVLVPG DRYAQMRNVYFIPSALALKNWLKKCGFVDIRIADVSVTTTEEQRRTEWMVTESLADFLDP HDPGKTVEGYPAPKRAVLIARKP >gi|296494603|gb|ADTN01000135.1| GENE 40 41349 - 42092 823 247 aa, chain - ## HITS:1 COG:yecO KEGG:ns NR:ns ## COG: yecO COG0500 # Protein_GI_number: 16129823 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli K12 # 1 247 1 247 247 513 100.0 1e-146 MSHRDTLFSAPIARLGDWTFDERVAEVFPDMIQRSVPGYSNIISMIGMLAERFVQPGTQV YDLGCSLGAATLSVRRNIHHDNCKIIAIDNSPAMIERCRRHIDAYKAPTPVDVIEGDIRD IAIENASMVVLNFTLQFLEPSERQALLDKIYQGLNPGGALVLSEKFSFEDAKVGELLFNM HHDFKRANGYSELEISQKRSMLENVMLTDSVETHKARLHNAGFEHSELWFQCFNFGSLVA LKAEDAA >gi|296494603|gb|ADTN01000135.1| GENE 41 42133 - 42528 469 131 aa, chain - ## HITS:1 COG:ECs2579 KEGG:ns NR:ns ## COG: ECs2579 COG3788 # Protein_GI_number: 15831833 # Func_class: R General function prediction only # Function: Uncharacterized relative of glutathione S-transferase, MAPEG superfamily # Organism: Escherichia coli O157:H7 # 1 131 11 141 141 228 100.0 2e-60 
MVSALYAVLSALLLMKFSFDVVRLRMQYRVAYGDGGFSELQSAIRIHGNAVEYIPIAIVL MLFMEMNGAETWMVHICGIVLLAGRLMHYYGFHHRLFRWRRSGMSATWCALLLMVLANLW YMPWELVFSLR >gi|296494603|gb|ADTN01000135.1| GENE 42 42581 - 43399 474 272 aa, chain - ## HITS:1 COG:yecE KEGG:ns NR:ns ## COG: yecE COG1801 # Protein_GI_number: 16129821 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 272 1 272 272 542 100.0 1e-154 MIYIGLPQWSHPKWVRLGITSLEEYARHFNCVEGNTTLYALPKPEVVLRWREQTTDDFRF CFKFPATISHQAALRHCDDLVTEFLTRMSPLAPRIGQYWLQLPATFGPRELPALWHFLDS LPGEFNYGVEVRHPQFFAKGEEEQTLNRGLHQRGVNRVILDSRPVHAARPHSEAIRDAQR KKPKVPVHAVLTATNPLIRFIGSDDMTQNRELFQVWLQKLAQWHQTTTPYLFLHTPDIAQ APELVHTLWEDLRKTLPEIGAVPAIPQQSSLF >gi|296494603|gb|ADTN01000135.1| GENE 43 43396 - 43962 560 188 aa, chain - ## HITS:1 COG:ECs2577 KEGG:ns NR:ns ## COG: ECs2577 COG1335 # Protein_GI_number: 15831831 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Escherichia coli O157:H7 # 1 188 12 199 199 376 100.0 1e-104 MLELNAKTTALVVIDLQEGILPFAGGPHTADEVVNRAGKLAAKFRASGQPVFLVRVGWSA DYAEALKQPVDAPSPAKVLPENWWQHPAALGATDSDIEIIKRQWGAFYGTDLELQLRRRG IDTIVLCGISTNIGVESTARNAWELGFNLVIAEDACSAASAEQHNNSINHIYPRIARVRS VEEILNAL >gi|296494603|gb|ADTN01000135.1| GENE 44 44272 - 46044 2212 590 aa, chain + ## HITS:1 COG:aspS KEGG:ns NR:ns ## COG: aspS COG0173 # Protein_GI_number: 16129819 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Escherichia coli K12 # 1 590 1 590 590 1199 100.0 0 MRTEYCGQLRLSHVGQQVTLCGWVNRRRDLGSLIFIDMRDREGIVQVFFDPDRADALKLA SELRNEFCIQVTGTVRARDEKNINRDMATGEIEVLASSLTIINRADVLPLDSNHVNTEEA RLKYRYLDLRRPEMAQRLKTRAKITSLVRRFMDDHGFLDIETPMLTKATPEGARDYLVPS RVHKGKFYALPQSPQLFKQLLMMSGFDRYYQIVKCFRDEDLRADRQPEFTQIDVETSFMT APQVREVMEALVRHLWLEVKGVDLGDFPVMTFAEAERRYGSDKPDLRNPMELTDVADLLK SVEFAVFAGPANDPKGRVAALRVPGGASLTRKQIDEYGNFVKIYGAKGLAYIKVNERAKG 
LEGINSPVAKFLNAEIIEDILDRTAAQDGDMIFFGADNKKIVADAMGALRLKVGKDLGLT DESKWAPLWVIDFPMFEDDGEGGLTAMHHPFTSPKDMTAAELKAAPENAVANAYDMVING YEVGGGSVRIHNGDMQQTVFGILGINEEEQREKFGFLLDALKYGTPPHAGLAFGLDRLTM LLTGTDNIRDVIAFPKTTAAACLMTEAPSFANPTALAELSIQVVKKAENN >gi|296494603|gb|ADTN01000135.1| GENE 45 46105 - 46614 302 169 aa, chain + ## HITS:1 COG:ECs2575 KEGG:ns NR:ns ## COG: ECs2575 COG0494 # Protein_GI_number: 15831829 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli O157:H7 # 20 169 1 150 150 292 99.0 2e-79 MSIDNYVNGMSEGSQRRGSVKDKVYKRPVSILVVIYAQDTKRVLMLQRRDDPDFWQSVTG SVEEGETAPQAAMREVKEEVTIDVVAEQLTLIDCQRTVEFEIFSHLRHRYAPGVTRNTES WFCLALPHERQIVFTEHLAYKWLDAPAAAALTKSWSNRQAIEQFVINAA >gi|296494603|gb|ADTN01000135.1| GENE 46 46643 - 47383 1031 246 aa, chain + ## HITS:1 COG:ECs2574 KEGG:ns NR:ns ## COG: ECs2574 COG0217 # Protein_GI_number: 15831828 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 246 1 246 246 444 99.0 1e-125 MAGHSKWANTRHRKAAQDAKRGKIFTKIIRELVTAAKLGGGDPDANPRLRAAVDKALSNN MTRDTLNRAIARGVGGDDDANMETIIYEGYGPGGTAIMIECLSDNRNRTVAEVRHAFSKC GGNLGTDGSVAYLFSKKGVISFEKGDEDTIMEAALEAGAEDVVTYDDGAIDVYTAWEEMG KVRDALEAAGLKADSAEVSMIPSTKADMDAETAPKLMRLIDMLEDCDDVQEVYHNGEISD EVAATL >gi|296494603|gb|ADTN01000135.1| GENE 47 47418 - 47939 556 173 aa, chain + ## HITS:1 COG:ECs2573 KEGG:ns NR:ns ## COG: ECs2573 COG0817 # Protein_GI_number: 15831827 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, endonuclease subunit # Organism: Escherichia coli O157:H7 # 1 173 1 173 173 302 100.0 2e-82 MAIILGIDPGSRVTGYGVIRQVGRQLSYLGSGCIRTKVDDLPSRLKLIYAGVTEIITQFQ PDYFAIEQVFMAKNADSALKLGQARGVAIVAAVNQELPVFEYAARQVKQTVVGIGSAEKS QVQHMVRTLLKLPANPQADAADALAIAITHCHVSQNAMQMSESRLNLARGRLR >gi|296494603|gb|ADTN01000135.1| GENE 48 47941 - 48543 248 200 aa, chain - ## HITS:1 
COG:no KEGG:JW5306 NR:ns ## KEGG: JW5306 # Name: yebB # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 200 1 200 200 419 100.0 1e-116 MNINYPAEYEIGDIVFTCIGAALFGQISAASNCWSNHVGIIIGHNGEDFLVAESRVPLST ITTLSRFIKRSSNQRYAIKRLDAGLTERQKQRIVEQVPSRLRKLYHTGFKYESSRQFCSK FVFDIYKEALCIPVGEIETFGELLNSNPNAKLTFWKFWFLGSIPWERKTVTPASLWHHPG LVLIHAEGVETPQPELTEAV >gi|296494603|gb|ADTN01000135.1| GENE 49 48818 - 49429 671 203 aa, chain + ## HITS:1 COG:ECs2571 KEGG:ns NR:ns ## COG: ECs2571 COG0632 # Protein_GI_number: 15831825 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, DNA-binding subunit # Organism: Escherichia coli O157:H7 # 1 203 1 203 203 377 99.0 1e-105 MIGRLRGIIIEKQPPLVLIEVGGVGYEVHMPMTCFYELPEAGQEAIVFTHFVVREDAQLL YGFNNKQERTLFKELIKTNGVGPKLALAILSGMSAQQFVNAVEREEVGALVKLPGIGKKT AERLIVEMKDRFKGLHGDLFTPAADLVLTSPASPATDDAEQEAVAALVALGYKPQEASRM VSKIARPDASSESLIREALRAAL >gi|296494603|gb|ADTN01000135.1| GENE 50 49438 - 50448 1141 336 aa, chain + ## HITS:1 COG:ECs2570 KEGG:ns NR:ns ## COG: ECs2570 COG2255 # Protein_GI_number: 15831824 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, helicase subunit # Organism: Escherichia coli O157:H7 # 1 336 1 336 336 644 100.0 0 MIEADRLISAGTTLPEDVADRAIRPKLLEEYVGQPQVRSQMEIFIKAAKLRGDALDHLLI FGPPGLGKTTLANIVANEMGVNLRTTSGPVLEKAGDLAAMLTNLEPHDVLFIDEIHRLSP VVEEVLYPAMEDYQLDIMIGEGPAARSIKIDLPPFTLIGATTRAGSLTSPLRDRFGIVQR LEFYQVPDLQYIVSRSARFMGLEMSDDGALEVARRARGTPRIANRLLRRVRDFAEVKHDG TISADIAAQALDMLNVDAEGFDYMDRKLLLAVIDKFFGGPVGLDNLAAAIGEERETIEDV LEPYLIQQGFLQRTPRGRMATTRAWNHFGITPPEMP >gi|296494603|gb|ADTN01000135.1| GENE 51 50595 - 51380 937 261 aa, chain - ## HITS:1 COG:znuB KEGG:ns NR:ns ## COG: znuB COG1108 # Protein_GI_number: 16129812 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Escherichia coli K12 # 1 261 1 261 261 395 100.0 1e-110 
MIELLFPGWLAGIMLACAAGPLGSFVVWRRMSYFGDTLAHASLLGVAFGLLLDVNPFYAV IAVTLLLAGGLVWLEKRPQLAIDTLLGIMAHSALSLGLVVVSLMSNIRVDLMAYLFGDLL AVTPEDLISIAIGVVIVVAILFWQWRNLLSMTISPDLAFVDGVKLQRVKLLLMLVTALTI GVAMKFVGALIITSLLIIPAATARRFARTPEQMAGVAVLVGMVAVTGGLTFSAVYDTPAG PSVVLCAALLFILSMMKKQAS >gi|296494603|gb|ADTN01000135.1| GENE 52 51377 - 52132 219 251 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 202 1 212 305 89 29 6e-17 MTSLVSLENVSVSFGQRRVLSDVSLELKPGKILTLLGPNGAGKSTLVRVVLGLVTPDEGV IKRNGKLRIGYVPQKLYLDTTLPLTVNRFLRLRPGTHKEDILPALKRVQAGHLINAPMQK LSGGETQRVLLARALLNRPQLLVLDEPTQGVDVNGQVALYDLIDQLRRELDCGVLMVSHD LHLVMAKTDEVLCLNHHICCSGTPEVVSLHPEFISMFGPRGAEQLGIYRHHHNHRHDLQG RIVLRRGNDRS >gi|296494603|gb|ADTN01000135.1| GENE 53 52157 - 53143 338 328 aa, chain + ## HITS:1 COG:ECs2567 KEGG:ns NR:ns ## COG: ECs2567 COG4531 # Protein_GI_number: 15831821 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Zn2+ transport system, periplasmic component/surface adhesin # Organism: Escherichia coli O157:H7 # 1 328 1 328 328 635 98.0 0 MKCYNITLLIFITIIGRIMLHKKTLLFAALSAALWGGATQAADAAVVASLKPVGFIASAI ADGVTETEVLLPDGASEHDYSLRPSDVKRLQNADLVVWVGPEMEAFMQKPVSKLPGAKQV TIAQLEDVKPLLMKSIHGDDDDHDHAEKSDEDHHHGDFNMHLWLSPEIARATAVAIHGKL VELMPQSRAKLDANLKDFEAQLASTETQVGNELAPLKGKGYFVFHDAYGYFEKQFGLTPL GHFTVNPEIQPGAQRLHEIRTQLVEQKATCVFAEPQFRPAVVESVARGTSVRMGTLDPLG TNIKLGKTSYSEFLSQLANQYASCLKGD >gi|296494603|gb|ADTN01000135.1| GENE 54 53159 - 54481 1277 440 aa, chain + ## HITS:1 COG:ECs2566 KEGG:ns NR:ns ## COG: ECs2566 COG0739 # Protein_GI_number: 15831820 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Escherichia coli O157:H7 # 22 440 1 419 419 823 100.0 0 MQQIARSVALAFNNLPRPHRVMLGSLTVLTLAVAVWRPYVYHRDATPIVKTIELEQNEIR SLLPEASEPIDQAAQEDEAIPQDELDDKIAGEAGVHEYVVSTGDTLSSILNQYGIDMGDI TQLAAADKELRNLKIGQQLSWTLTADGELQRLTWEVSRRETRTYDRTAANGFKMTSEMQQ 
GEWVNNLLKGTVGGSFVASARNAGLTSAEVSAVIKAMQWQMDFRKLKKGDEFAVLMSREM LDGKREQSQLLGVRLRSEGKDYYAIRAEDGKFYDRNGTGLAKGFLRFPTAKQFRISSNFN PRRTNPVTGRVAPHRGVDFAMPQGTPVLSVGDGEVVVAKRSGAAGYYVAIRHGRSYTTRY MHLRKILVKPGQKVKRGDRIALSGNTGRSTGPHLHYEVWINQQAVNPLTAKLPRTEGLTG SDRREFLAQAKEIVPQLRFD >gi|296494603|gb|ADTN01000135.1| GENE 55 54601 - 55572 757 323 aa, chain + ## HITS:1 COG:msbB KEGG:ns NR:ns ## COG: msbB COG1560 # Protein_GI_number: 16129808 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lauroyl/myristoyl acyltransferase # Organism: Escherichia coli K12 # 1 323 1 323 323 654 100.0 0 METKKNNSEYIPEFDKSFRHPRYWGAWLGVAAMAGIALTPPKFRDPILARLGRFAGRLGK SSRRRALINLSLCFPERSEAEREAIVDEMFATAPQAMAMMAELAIRGPEKIQPRVDWQGL EIIEEMRRNNEKVIFLVPHGWAVDIPAMLMASQGQKMAAMFHNQGNPVFDYVWNTVRRRF GGRLHARNDGIKPFIQSVRQGYWGYYLPDQDHGPEHSEFVDFFATYKATLPAIGRLMKVC RARVVPLFPIYDGKTHRLTIQVRPPMDDLLEADDHTIARRMNEEVEIFVGPRPEQYTWIL KLLKTRKPGEIQPYKRKDLYPIK >gi|296494603|gb|ADTN01000135.1| GENE 56 55703 - 57145 1387 480 aa, chain - ## HITS:1 COG:pykA KEGG:ns NR:ns ## COG: pykA COG0469 # Protein_GI_number: 16129807 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Escherichia coli K12 # 1 480 1 480 480 884 100.0 0 MSRRLRRTKIVTTLGPATDRDNNLEKVIAAGANVVRMNFSHGSPEDHKMRADKVREIAAK LGRHVAILGDLQGPKIRVSTFKEGKVFLNIGDKFLLDANLGKGEGDKEKVGIDYKGLPAD VVPGDILLLDDGRVQLKVLEVQGMKVFTEVTVGGPLSNNKGINKLGGGLSAEALTEKDKA DIKTAALIGVDYLAVSFPRCGEDLNYARRLARDAGCDAKIVAKVERAEAVCSQDAMDDII LASDVVMVARGDLGVEIGDPELVGIQKALIRRARQLNRAVITATQMMESMITNPMPTRAE VMDVANAVLDGTDAVMLSAETAAGQYPSETVAAMARVCLGAEKIPSINVSKHRLDVQFDN VEEAIAMSAMYAANHLKGVTAIITMTESGRTALMTSRISSGLPIFAMSRHERTLNLTALY RGVTPVHFDSANDGVAAASEAVNLLRDKGYLMSGDLVIVTQGDVMSTVGSTNTTRILTVE >gi|296494603|gb|ADTN01000135.1| GENE 57 57273 - 58142 709 289 aa, chain - ## HITS:1 COG:yebK KEGG:ns NR:ns ## COG: yebK COG1737 # Protein_GI_number: 16129806 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 289 1 289 289 538 100.0 
1e-153 MNMLEKIQSQLEHLSKSERKVAEVILASPDNAIHSSIAAMALEANVSEPTVNRFCRSMDT RGFPDFKLHLAQSLANGTPYVNRNVNEDDSVESYTGKIFESAMATLDHVRHSLDKSAINR AVDLLTQAKKIAFFGLGSSAAVAHDAMNKFFRFNVPVVYSDDIVLQRMSCMNCSDGDVVV LISHTGRTKNLVELAQLARENDAMVIALTSAGTPLAREATLAITLDVPEDTDIYMPMVSR LAQLTVIDVLATGFTLRRGAKFRDNLKRVKEALKESRFDKQLLNLSDDR >gi|296494603|gb|ADTN01000135.1| GENE 58 58480 - 59955 1718 491 aa, chain + ## HITS:1 COG:zwf KEGG:ns NR:ns ## COG: zwf COG0364 # Protein_GI_number: 16129805 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate 1-dehydrogenase # Organism: Escherichia coli K12 # 1 491 1 491 491 1012 100.0 0 MAVTQTAQACDLVIFGAKGDLARRKLLPSLYQLEKAGQLNPDTRIIGVGRADWDKAAYTK VVREALETFMKETIDEGLWDTLSARLDFCNLDVNDTAAFSRLGAMLDQKNRITINYFAMP PSTFGAICKGLGEAKLNAKPARVVMEKPLGTSLATSQEINDQVGEYFEECQVYRIDHYLG KETVLNLLALRFANSLFVNNWDNRTIDHVEITVAEEVGIEGRWGYFDKAGQMRDMIQNHL LQILCMIAMSPPSDLSADSIRDEKVKVLKSLRRIDRSNVREKTVRGQYTAGFAQGKKVPG YLEEEGANKSSNTETFVAIRVDIDNWRWAGVPFYLRTGKRLPTKCSEVVVYFKTPELNLF KESWQDLPQNKLTIRLQPDEGVDIQVLNKVPGLDHKHNLQITKLDLSYSETFNQTHLADA YERLLLETMRGIQALFVRRDEVEEAWKWVDSITEAWAMDNDAPKPYQAGTWGPVASVAMI TRDGRSWNEFE >gi|296494603|gb|ADTN01000135.1| GENE 59 60190 - 62001 1706 603 aa, chain + ## HITS:1 COG:ECs2561 KEGG:ns NR:ns ## COG: ECs2561 COG0129 # Protein_GI_number: 15831815 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Escherichia coli O157:H7 # 1 603 1 603 603 1207 100.0 0 MNPQLLRVTNRIIERSRETRSAYLARIEQAKTSTVHRSQLACGNLAHGFAACQPEDKASL KSMLRNNIAIITSYNDMLSAHQPYEHYPEIIRKALHEANAVGQVAGGVPAMCDGVTQGQD GMELSLLSREVIAMSAAVGLSHNMFDGALFLGVCDKIVPGLTMAALSFGHLPAVFVPSGP MASGLPNKEKVRIRQLYAEGKVDRMALLESEAASYHAPGTCTFYGTANTNQMVVEFMGMQ LPGSSFVHPDSPLRDALTAAAARQVTRMTGNGNEWMPIGKMIDEKVVVNGIVALLATGGS TNHTMHLVAMARAAGIQINWDDFSDLSDVVPLMARLYPNGPADINHFQAAGGVPVLVREL LKAGLLHEDVNTVAGFGLSRYTLEPWLNNGELDWREGAEKSLDSNVIASFEQPFSHHGGT 
KVLSGNLGRAVMKTSAVPVENQVIEAPAVVFESQHDVMPAFEAGLLDRDCVVVVRHQGPK ANGMPELHKLMPPLGVLLDRCFKIALVTDGRLSGASGKVPSAIHVTPEAYDGGLLAKVRD GDIIRVNGQTGELTLLVDEAELAAREPHIPDLSASRVGTGRELFSALREKLSGAEQGATC ITF >gi|296494603|gb|ADTN01000135.1| GENE 60 62038 - 62679 996 213 aa, chain + ## HITS:1 COG:STM1884 KEGG:ns NR:ns ## COG: STM1884 COG0800 # Protein_GI_number: 16765226 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Salmonella typhimurium LT2 # 1 212 1 212 213 373 97.0 1e-103 MKNWKTSAESILTTGPVVPVIVVKKLEHAVPMAKALVAGGVRVLEVTLRTECAVDAIRAI AKEVPEAIVGAGTVLNPQQLAEVTEAGAQFAISPGLTEPLLKAATEGTIPLIPGISTVSE LMLGMDYGLKEFKFFPAEANGGVKALQAIAGPFSQVRFCPTGGISPANYRDYLALKSVLC IGGSWLVPADALEAGDYDRITKLAREAVEGAKL Prediction of potential genes in microbial genomes Time: Sun May 15 23:36:08 2011 Seq name: gi|296494602|gb|ADTN01000136.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont283.4, whole genome shotgun sequence Length of sequence - 27321 bp Number of predicted genes - 32, with homology - 31 Number of transcription units - 22, operones - 7 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 42 - 1220 1206 ## COG0027 Formate-dependent phosphoribosylglycinamide formyltransferase (GAR transformylase) - Prom 1313 - 1372 4.7 + Prom 1268 - 1327 5.0 2 2 Op 1 . + CDS 1354 - 1644 369 ## COG3141 Uncharacterized protein conserved in bacteria 3 2 Op 2 . + CDS 1711 - 2067 179 ## ECIAI39_1203 hypothetical protein 4 3 Tu 1 . - CDS 1981 - 2187 101 ## gi|188494878|ref|ZP_03002148.1| hypothetical protein Ec53638_0054 - Prom 2251 - 2310 3.2 + Prom 2178 - 2237 2.8 5 4 Tu 1 . + CDS 2394 - 3053 816 ## COG2979 Uncharacterized protein conserved in bacteria + Prom 3177 - 3236 4.3 6 5 Tu 1 . + CDS 3262 - 5322 2007 ## COG1770 Protease II 7 6 Op 1 4/0.500 - CDS 5319 - 5981 809 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases 8 6 Op 2 . 
- CDS 6005 - 6661 424 ## COG0388 Predicted amidohydrolase - Prom 6701 - 6760 3.7 9 7 Tu 1 . - CDS 6763 - 6993 313 ## ECP_1786 DNA polymerase III subunit theta (EC:2.7.7.7) - Prom 7015 - 7074 2.1 + Prom 7044 - 7103 4.0 10 8 Op 1 8/0.000 + CDS 7132 - 7506 368 ## COG2372 Uncharacterized protein, homolog of Cu resistance protein CopC 11 8 Op 2 . + CDS 7510 - 8382 742 ## COG1276 Putative copper export protein 12 8 Op 3 . + CDS 8395 - 8736 347 ## SSON_1309 hypothetical protein + Term 8773 - 8816 2.2 13 9 Tu 1 . + CDS 8821 - 8907 72 ## + Term 8937 - 8966 -0.3 + Prom 9035 - 9094 4.3 14 10 Tu 1 . + CDS 9132 - 9788 521 ## COG0639 Diadenosine tetraphosphatase and related serine/threonine protein phosphatases 15 11 Op 1 . - CDS 9789 - 10064 159 ## SbBS512_E2105 hypothetical protein 16 11 Op 2 . - CDS 10085 - 10321 208 ## ECSP_2410 hypothetical protein - Prom 10375 - 10434 4.4 17 12 Tu 1 4/0.500 - CDS 10439 - 11878 1182 ## COG0144 tRNA and rRNA cytosine-C5-methylases - Term 11914 - 11949 7.2 18 13 Op 1 11/0.000 - CDS 11958 - 14597 2586 ## COG3008 Paraquat-inducible protein B 19 13 Op 2 . - CDS 14560 - 15843 784 ## COG2995 Uncharacterized paraquat-inducible protein A - Prom 15918 - 15977 4.8 + Prom 15847 - 15906 3.7 20 14 Op 1 4/0.500 + CDS 15973 - 16470 302 ## PROTEIN SUPPORTED gi|15902812|ref|NP_358362.1| hypothetical protein spr0768 + Term 16497 - 16531 3.1 21 14 Op 2 7/0.000 + CDS 16567 - 17265 525 ## COG3109 Activator of osmoprotectant transporter ProP 22 14 Op 3 5/0.000 + CDS 17285 - 19333 2057 ## COG0793 Periplasmic protease + Term 19340 - 19375 6.0 + Prom 19362 - 19421 4.4 23 14 Op 4 . + CDS 19525 - 20406 867 ## COG0501 Zn-dependent protease with chaperone function + Term 20427 - 20458 2.4 - Term 20415 - 20446 2.4 24 15 Tu 1 . - CDS 20452 - 21825 1072 ## COG0477 Permeases of the major facilitator superfamily - Prom 21945 - 22004 8.3 + Prom 21904 - 21963 6.3 25 16 Tu 1 . 
+ CDS 22002 - 22793 967 ## COG1414 Transcriptional regulator + Term 22803 - 22847 3.1 - Term 22791 - 22833 2.3 26 17 Tu 1 . - CDS 22937 - 23176 247 ## ECO103_2017 hypothetical protein - Prom 23200 - 23259 5.2 + Prom 23474 - 23533 6.0 27 18 Tu 1 . + CDS 23553 - 23840 225 ## ECS88_1877 hypothetical protein + Term 23854 - 23897 6.1 + Prom 24424 - 24483 2.4 28 19 Op 1 . + CDS 24510 - 24653 85 ## ECO111_2331 hypothetical protein 29 19 Op 2 3/1.000 + CDS 24666 - 24875 323 ## COG1278 Cold shock proteins + Term 24924 - 24954 1.0 + Prom 24883 - 24942 3.7 30 20 Tu 1 . + CDS 25125 - 25850 544 ## COG0500 SAM-dependent methyltransferases + Term 26099 - 26138 3.0 - Term 25770 - 25811 2.1 31 21 Tu 1 . - CDS 25847 - 26413 691 ## COG1971 Predicted membrane protein - Term 26608 - 26648 0.8 32 22 Tu 1 . - CDS 26842 - 27300 265 ## COG4811 Predicted membrane protein Predicted protein(s) >gi|296494602|gb|ADTN01000136.1| GENE 1 42 - 1220 1206 392 aa, chain - ## HITS:1 COG:purT KEGG:ns NR:ns ## COG: purT COG0027 # Protein_GI_number: 16129802 # Func_class: F Nucleotide transport and metabolism # Function: Formate-dependent phosphoribosylglycinamide formyltransferase (GAR transformylase) # Organism: Escherichia coli K12 # 1 392 1 392 392 759 100.0 0 MTLLGTALRPAATRVMLLGSGELGKEVAIECQRLGVEVIAVDRYADAPAMHVAHRSHVIN MLDGDALRRVVELEKPHYIVPEIEAIATDMLIQLEEEGLNVVPCARATKLTMNREGIRRL AAEELQLPTSTYRFADSESLFREAVADIGYPCIVKPVMSSSGKGQTFIRSAEQLAQAWKY AQQGGRAGAGRVIVEGVVKFDFEITLLTVSAVDGVHFCAPVGHRQEDGDYRESWQPQQMS PLALERAQEIARKVVLALGGYGLFGVELFVCGDEVIFSEVSPRPHDTGMVTLISQDLSEF ALHVRAFLGLPVGGIRQYGPAASAVILPQLTSQNVTFDNVQNAVGADLQIRLFGKPEIDG SRRLGVALATAESVVDAIERAKHAAGQVKVQG >gi|296494602|gb|ADTN01000136.1| GENE 2 1354 - 1644 369 96 aa, chain + ## HITS:1 COG:ECs2558 KEGG:ns NR:ns ## COG: ECs2558 COG3141 # Protein_GI_number: 15831812 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 96 1 96 96 148 100.0 2e-36 
MAVEVKYVVIREGEEKMSFTSKKEADAYDKMLDTADLLDTWLTNSPVQMEDEQREALSLW LAEQKDVLSTILKTGKLPSPQVVGAESEEEDASHAA >gi|296494602|gb|ADTN01000136.1| GENE 3 1711 - 2067 179 118 aa, chain + ## HITS:1 COG:no KEGG:ECIAI39_1203 NR:ns ## KEGG: ECIAI39_1203 # Name: yebF # Def: hypothetical protein # Organism: E.coli_IAI39 # Pathway: not_defined # 1 118 5 122 122 222 100.0 4e-57 MKKRGAFLGLLLVSACASVFAANNETSKSVTFPKCEGLDAAGIAASVKRDYQQNRVARWA DDQKIVGQADPVAWVSLQDIQGKDDKWSVPLTVRGKSADIHYQVSVDCKAGMAEYQRR >gi|296494602|gb|ADTN01000136.1| GENE 4 1981 - 2187 101 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|188494878|ref|ZP_03002148.1| ## NR: gi|188494878|ref|ZP_03002148.1| hypothetical protein Ec53638_0054 [Escherichia coli 53638] # 1 68 1 68 68 130 98.0 2e-29 MLIARCDAVASSQAYDCGAFWGGQGTRAASANSQSNYQKPLTPLIFRHSRFAVHADLVMN IGTFTTHG >gi|296494602|gb|ADTN01000136.1| GENE 5 2394 - 3053 816 219 aa, chain + ## HITS:1 COG:yebE KEGG:ns NR:ns ## COG: yebE COG2979 # Protein_GI_number: 16129799 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 219 1 219 219 352 100.0 2e-97 MANWLNQLQSLLGQSSSSTSSSADQGLVKLLVPGALGGLAGLLVANKSARKLLTKYGTNA LLVGGGAVAGTVLWNKYKDKIRAAHQDEPQFGAQSTPLDERTARLILALVFAAKSDGHID AKERAAIDQQLRGAGVEEQGRVLIEQAIEQPLDPQRLATGVRNEEEALEIYFLSCAAIDI DHFMERSYLNALGDALKIPQDVRDGIERDLEQQKRTLAE >gi|296494602|gb|ADTN01000136.1| GENE 6 3262 - 5322 2007 686 aa, chain + ## HITS:1 COG:ptrB KEGG:ns NR:ns ## COG: ptrB COG1770 # Protein_GI_number: 16129798 # Func_class: E Amino acid transport and metabolism # Function: Protease II # Organism: Escherichia coli K12 # 1 686 1 686 686 1409 100.0 0 MLPKAARIPHAMTLHGDTRIDNYYWLRDDTRSQPEVLDYLQQENSYGHRVMASQQALQDR ILKEIIDRIPQREVSAPYIKNGYRYRHIYEPGCEYAIYQRQSAFSEEWDEWETLLDANKR AAHSEFYSMGGMAITPDNTIMALAEDFLSRRQYGIRFRNLETGNWYPELLDNVEPSFVWA NDSWIFYYVRKHPVTLLPYQVWRHAIGTPASQDKLIYEEKDDTYYVSLHKTTSKHYVVIH LASATTSEVRLLDAEMADAEPFVFLPRRKDHEYSLDHYQHRFYLRSNRHGKNFGLYRTRM 
RDEQQWEELIPPRENIMLEGFTLFTDWLVVEERQRGLTSLRQINRKTREVIGIAFDDPAY VTWIAYNPEPETARLRYGYSSMTTPDTLFELDMDTGERRVLKQTEVPGFYAANYRSEHLW IVARDGVEVPVSLVYHRKHFRKGHNPLLVYGYGSYGASIDADFSFSRLSLLDRGFVYAIV HVRGGGELGQQWYEDGKFLKKKNTFNDYLDACDALLKLGYGSPSLCYAMGGSAGGMLMGV AINQRPELFHGVIAQVPFVDVVTTMLDESIPLTTGEFEEWGNPQDPQYYEYMKSYSPYDN VTAQAYPHLLVTTGLHDSQVQYWEPAKWVAKLRELKTDDHLLLLCTDMDSGHGGKSGRFK SYEGVAMEYAFLVALAQGTLPATPAD >gi|296494602|gb|ADTN01000136.1| GENE 7 5319 - 5981 809 220 aa, chain - ## HITS:1 COG:exoX KEGG:ns NR:ns ## COG: exoX COG0847 # Protein_GI_number: 16129797 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Escherichia coli K12 # 1 220 1 220 220 458 99.0 1e-129 MLRIIDTETCGLQGGIVEIASVDVIDGKIVNPMSHLVRPDRPISPQAMAIHRITEAMVAD KPWIEDVIPHYYGSEWYVAHNASFDRRVLPEMPGEWICTMKLARRLWPGIKYSNVALYKT RKLNVQTPPGLHHHRALYDCYITAALLIDIMNTSGWTAEQMADITGRPSLMTTFTFGKYR GKAVSDVAERDPGYLRWLFNNLDSMSPELRLTLKHYLENT >gi|296494602|gb|ADTN01000136.1| GENE 8 6005 - 6661 424 218 aa, chain - ## HITS:1 COG:yobB KEGG:ns NR:ns ## COG: yobB COG0388 # Protein_GI_number: 16129796 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Escherichia coli K12 # 1 218 1 218 218 444 100.0 1e-125 MSFWKVAAAQYEPRKTSLTEQVAHHLEFVRAAARQQCQLLVFPSLSLLGCDYSRRALPAP PDLSLLDPLCYAATTWRMTIIAGLPVEYNDRFIRGIAVFAPWRKTPGIYHQSHGACLGRR SRTITVVDEQPQGMDMDPTCSLFTTGQCLGEPDLLASARRLQFFSHQYSIAVLMANARGN SALWDEYGRLIVRADRGSLLLVGQRSSQGWQGDIIPLR >gi|296494602|gb|ADTN01000136.1| GENE 9 6763 - 6993 313 76 aa, chain - ## HITS:1 COG:no KEGG:ECP_1786 NR:ns ## KEGG: ECP_1786 # Name: not_defined # Def: DNA polymerase III subunit theta (EC:2.7.7.7) # Organism: E.coli_536 # Pathway: Purine metabolism [PATH:ecp00230]; Pyrimidine metabolism [PATH:ecp00240]; Metabolic pathways [PATH:ecp01100]; DNA replication [PATH:ecp03030]; Mismatch repair [PATH:ecp03430]; Homologous recombination [PATH:ecp03440] # 1 76 30 105 105 136 100.0 2e-31 
MLKNLAKLDQTEMDKVNVDLAAAGVAFKERYNMPVIAEAVEREQPEHLRSWFRERLIAHR LASVNLSRLPYEPKLK >gi|296494602|gb|ADTN01000136.1| GENE 10 7132 - 7506 368 124 aa, chain + ## HITS:1 COG:yobA KEGG:ns NR:ns ## COG: yobA COG2372 # Protein_GI_number: 16129794 # Func_class: R General function prediction only # Function: Uncharacterized protein, homolog of Cu resistance protein CopC # Organism: Escherichia coli K12 # 1 124 1 124 124 224 100.0 4e-59 MASTARSLRYALAILTTSLVTPSVWAHAHLTHQYPAANAQVTAAPQAITLNFSEGVETGF SGAKITGPKNENIKTLPAKRNEQDQKQLIVPLADSLKPGTYTVDWHVVSVDGHKTKGHYT FSVK >gi|296494602|gb|ADTN01000136.1| GENE 11 7510 - 8382 742 290 aa, chain + ## HITS:1 COG:yebZ KEGG:ns NR:ns ## COG: yebZ COG1276 # Protein_GI_number: 16129793 # Func_class: P Inorganic ion transport and metabolism # Function: Putative copper export protein # Organism: Escherichia coli K12 # 1 290 1 290 290 476 100.0 1e-134 MLAFTWIALRFIHFTSLMLVFGFAMYGAWLAPLTIRRLLAKRFLRLQQHAAVWSLISATA MLAVQGGLMGTGWTDVFSPNIWQAVLQTQFGGIWLWQIVLALVTLIVALMQPRNMPRLLF MLTTAQFILLAGVGHATLNEGVTAKIHQTNHAIHLICAAAWFGGLLPVLWCMQLIKGRWR HQAIQALMRFSWCGHFAVIGVLASGVLNALLITGFPPTLTTYWGQLLLLKAILVMIMVVI ALANRYVLVPRMRQDEDRAAPWFVWMTKLEWAIGAVVLVIISLLATLEPF >gi|296494602|gb|ADTN01000136.1| GENE 12 8395 - 8736 347 113 aa, chain + ## HITS:1 COG:no KEGG:SSON_1309 NR:ns ## KEGG: SSON_1309 # Name: not_defined # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 113 1 113 113 195 100.0 4e-49 MMKKSILAFLLLTSSAAALAAPQVITVSRFEVGKDKWAFNREEVMLTCRPGNALYVINPS TLVQYPLNDIAQKEVASGKTNAQPISVIQIDDPNNPGEKMSLAPFIERAEKLC >gi|296494602|gb|ADTN01000136.1| GENE 13 8821 - 8907 72 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTTTLFIDHFARKLENLASSSILKGQGN >gi|296494602|gb|ADTN01000136.1| GENE 14 9132 - 9788 521 218 aa, chain + ## HITS:1 COG:pphA KEGG:ns NR:ns ## COG: pphA COG0639 # Protein_GI_number: 16129791 # Func_class: T Signal transduction mechanisms # Function: Diadenosine tetraphosphatase and related serine/threonine protein phosphatases # Organism: Escherichia coli 
K12 # 1 218 2 219 219 429 100.0 1e-120 MKQPAPVYQRIAGHQWRHIWLSGDIHGCLEQLRRKLWHCRFDPWRDLLISVGDVIDRGPQ SLRCLQLLEQHWVCAVRGNHEQMAMDALASQQMSLWLMNGGDWFIALADNQQKQAKTALE KCQHLPFILEVHSRTGKHVIAHADYPDDVYEWQKDVDLHQVLWSRSRLGERQKGQGITGA DHFWFGHTPLRHRVDIGNLHYIDTGAVFGGELTLVQLQ >gi|296494602|gb|ADTN01000136.1| GENE 15 9789 - 10064 159 91 aa, chain - ## HITS:1 COG:no KEGG:SbBS512_E2105 NR:ns ## KEGG: SbBS512_E2105 # Name: not_defined # Def: hypothetical protein # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 1 91 1 91 91 191 100.0 8e-48 MAGYLSWLFPRCKISPKLNGTAPHFGDEMFALVLFVCYLDGGCEDIVVDVYNTEQQCLYS MSDQRIRQGGCFPIEDFIDGFWRPAQEYGDF >gi|296494602|gb|ADTN01000136.1| GENE 16 10085 - 10321 208 78 aa, chain - ## HITS:1 COG:no KEGG:ECSP_2410 NR:ns ## KEGG: ECSP_2410 # Name: yebV # Def: hypothetical protein # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 78 1 78 78 157 100.0 2e-37 MKTSVRIGAFEIDDGELHGESPGDRTLTIPCKSDPDLCMQLDAWDAETSIPALLNGEHSV LYRTRYDQQSDAWIMRLA >gi|296494602|gb|ADTN01000136.1| GENE 17 10439 - 11878 1182 479 aa, chain - ## HITS:1 COG:yebU_1 KEGG:ns NR:ns ## COG: yebU_1 COG0144 # Protein_GI_number: 16129788 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Escherichia coli K12 # 1 383 3 385 385 788 99.0 0 MAQHTVYFPDAFLTQMREAMPSTLSFDDFLAACQRPLRRSIRVNTLKISVADFLQLTAPY GWTLTPIPWCGEGFWIERDNEDALPLGSTAEHLSGLFYIQEASSMLPVAALFADGNAPQR VMDVAAAPGSKTTQISARMNNEGAILANEFSASRVKVLHANISRCGISNVALTHFDGRVF GAAVPEMFDAILLDAPCSGEGVVRKDPDALKNWSPESNQEIAATQRELIDSAFHALRPGG TLVYSTCTLNQEENEAVCLWLKETYPDAVEFLPLGDLFPGANKALTEEGFLHVFPQIYDC EGFFVARLRKTQAIPALPAPKYKVGNFPFSPVKDREAGQIRQAATGVGLNWDENLRLWQR DKELWLFPVGIEALIGKVRFSRLGIKLAETHNKGYRWQHEAVIALASPDNMNAFELTPQE AEEWYRGRDVYPQAAPVADDVLVTFQHQPIGLAKRIGSRLKNSYPRELVRDGKLFTGNA >gi|296494602|gb|ADTN01000136.1| GENE 18 11958 - 14597 2586 879 aa, chain - ## HITS:1 COG:yebT KEGG:ns NR:ns ## COG: yebT COG3008 # Protein_GI_number: 16129787 # Func_class: R General function prediction only 
# Function: Paraquat-inducible protein B # Organism: Escherichia coli K12 # 1 879 1 879 879 1734 100.0 0 MHMSQETPASTTEAQIKNKRRISPFWLLPFIALMIASWLIWDSYQDRGNTVTIDFMSADG IVPGRTPVRYQGVEVGTVQDISLSDDLRKIEVKVSIKSDMKDALREETQFWLVTPKASLA GVSGLDALVGGNYIGMMPGKGKEQDHFVALDTQPKYRLDNGDLMIHLQAPDLGSLNSGSL VYFRKIPVGKVYDYAINPNKQGVVIDVLIERRFTDLVKKGSRFWNVSGVDANVSISGAKV KLESLAALVNGAIAFDSPEESKPAEAEDTFGLYEDLAHSQRGVIIKLELPSGAGLTADST PLMYQGLEVGQLTKLDLNPGGKVTGEMTVDPSVVTLLRENTRIELRNPKLSLSDANLSAL LTGKTFELVPGDGEPRKEFVVVPGEKALLHEPDVLTLTLTAPESYGIDAGQPLILHGVQV GQVIDRKLTSKGVTFTVAIEPQHRELVKGDSKFVVNSRVDVKVGLDGVEFLGASASEWIN GGIRILPGDKGEMKASYPLYANLEKALENSLSDLPTTTVSLSAETLPDVQAGSVVLYRKF EVGEVITVRPRANAFDIDLHIKPEYRNLLTSNSVFWAEGGAKVQLNGSGLTVQASPLSRA LKGAISFDNLSGASASQRKGDKRILYASETAARAVGGQITLHAFDAGKLAVGMPIRYLGI DIGQIQTLDLITARNEVQAKAVLYPEYVQTFARGGTRFSVVTPQISAAGVEHLDTILQPY INVEPGRGNPRRDFELQEATITDSRYLDGLSIIVEAPEAGSLGIGTPVLFRGLEVGTVTG MTLGTLSDRVMIAMRISKRYQHLVRNNSVFWLASGYSLDFGLTGGVVKTGTFNQFIRGGI AFATPPGTPLAPKAQEGKHFLLQESEPKEWREWGTALPK >gi|296494602|gb|ADTN01000136.1| GENE 19 14560 - 15843 784 427 aa, chain - ## HITS:1 COG:ECs2543 KEGG:ns NR:ns ## COG: ECs2543 COG2995 # Protein_GI_number: 15831797 # Func_class: S Function unknown # Function: Uncharacterized paraquat-inducible protein A # Organism: Escherichia coli O157:H7 # 1 427 1 427 427 850 100.0 0 MALNTPQITPTKKITVRAIGEELPRGDYQRCPQCDMLFSLPEINSHQSAYCPRCQAKIRD GRDWSLTRLAAMAFTMLLLMPFAWGEPLLHIWLLGIRIDANVMQGIWQMTKQGDAITGSM VFFCVIGAPLILVTSIAYLWFGNRLGMNLRPVLLMLERLKEWVMLDIYLVGIGVASIKVQ DYAHIQAGVGLFSFVALVILTTVTLSHLNVEELWERFYPQRPATRRDEKLRVCLGCHFTG YPDQRGRCPRCHIPLRLRRRHSLQKCWAALLASIVLLLPANLLPISIIYLNGGRQEDTIL SGIMSLASSNIAVAGIVFIASILVPFTKVIVMFTLLLSIHFKCQQGLRTRILLLRMVTWI GRWSMLDLFVISLTMSLINRDQILAFTMGPAAFYFGAAVILTILAVEWLDSRLLWDAHES GNARFDD >gi|296494602|gb|ADTN01000136.1| GENE 20 15973 - 16470 302 165 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15902812|ref|NP_358362.1| hypothetical protein spr0768 [Streptococcus pneumoniae R6] # 3 159 6 161 165 120 41 7e-27 
MNKTEFYADLNRDFNALMAGETSFLATLANTSALLYERLTDINWAGFYLLEDDTLVLGPF QGKIACVRIPVGRGVCGTAVARNQVQRIEDVHVFDGHIACDAASNSEIVLPLVVKNQIIG VLDIDSTVFGRFTDEDEQGLRQLVAQLEKVLATTDYKKFFASVAG >gi|296494602|gb|ADTN01000136.1| GENE 21 16567 - 17265 525 232 aa, chain + ## HITS:1 COG:proQm KEGG:ns NR:ns ## COG: proQm COG3109 # Protein_GI_number: 16132234 # Func_class: T Signal transduction mechanisms # Function: Activator of osmoprotectant transporter ProP # Organism: Escherichia coli K12 # 1 232 1 232 232 387 100.0 1e-108 MENQPKLNSSKEVIAFLAERFPHCFSAEGEARPLKIGIFQDLVDRVAGEMNLSKTQLRSA LRLYTSSWRYLYGVKPGATRVDLDGNPCGELDEQHVEHARKQLEEAKARVQAQRAEQQAK KREAAATAGEKEDAPRRERKPRPTTPRRKEGAERKPRAQKPVEKAPKTVKAPREEQHTPV SDISALTVGQALKVKAGQNAMDATVLEITKDGVRVQLNSGMSLIVRAEHLVF >gi|296494602|gb|ADTN01000136.1| GENE 22 17285 - 19333 2057 682 aa, chain + ## HITS:1 COG:prc KEGG:ns NR:ns ## COG: prc COG0793 # Protein_GI_number: 16129784 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Escherichia coli K12 # 1 682 1 682 682 1321 100.0 0 MNMFFRLTALAGLLAIAGQTFAVEDITRADQIPVLKEETQHATVSERVTSRFTRSHYRQF DLDQAFSAKIFDRYLNLLDYSHNVLLASDVEQFAKKKTELGDELRSGKLDVFYDLYNLAQ KRRFERYQYALSVLEKPMDFTGNDTYNLDRSKAPWPKNEAELNALWDSKVKFDELSLKLT GKTDKEIRETLTRRYKFAIRRLAQTNSEDVFSLAMTAFAREIDPHTNYLSPRNTEQFNTE MSLSLEGIGAVLQMDDDYTVINSMVAGGPAAKSKAISVGDKIVGVGQTGKPMVDVIGWRL DDVVALIKGPKGSKVRLEILPAGKGTKTRTVTLTRERIRLEDRAVKMSVKTVGKEKVGVL DIPGFYVGLTDDVKVQLQKLEKQNVSSVIIDLRSNGGGALTEAVSLSGLFIPAGPIVQVR DNNGKVREDSDTDGQVFYKGPLVVLVDRFSASASEIFAAAMQDYGRALVVGEPTFGKGTV QQYRSLNRIYDQMLRPEWPALGSVQYTIQKFYRVNGGSTQRKGVTPDIIMPTGNEETETG EKFEDNALPWDSIDAATYVKSGDLTAFEPELLKEHNARIAKDPEFQNIMKDIARFNAMKD KRNIVSLNYAVREKENNEDDATRLARLNERFKREGKPELKKLDDLPKDYQEPDPYLDETV NIALDLAKLEKARPAEQPAPVK >gi|296494602|gb|ADTN01000136.1| GENE 23 19525 - 20406 867 293 aa, chain + ## HITS:1 COG:htpX KEGG:ns NR:ns ## COG: htpX COG0501 # Protein_GI_number: 16129783 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: 
Zn-dependent protease with chaperone function # Organism: Escherichia coli K12 # 1 293 1 293 293 547 100.0 1e-155 MMRIALFLLTNLAVMVVFGLVLSLTGIQSSSVQGLMIMALLFGFGGSFVSLLMSKWMALR SVGGEVIEQPRNERERWLVNTVATQARQAGIAMPQVAIYHAPDINAFATGARRDASLVAV STGLLQNMSPDEAEAVIAHEISHIANGDMVTMTLIQGVVNTFVIFISRILAQLAAGFMGG NRDEGEESNGNPLIYFAVATVLELVFGILASIITMWFSRHREFHADAGSAKLVGREKMIA ALQRLKTSYEPQEATSMMALCINGKSKSLSELFMTHPPLDKRIEALRTGEYLK >gi|296494602|gb|ADTN01000136.1| GENE 24 20452 - 21825 1072 457 aa, chain - ## HITS:1 COG:yebQ KEGG:ns NR:ns ## COG: yebQ COG0477 # Protein_GI_number: 16129782 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 457 38 494 494 753 100.0 0 MPKVQADGLPLPQRYGAILTIVIGISMAVLDGAIANVALPTIATDLHATPASSIWVVNAY QIAIVISLLSFSFLGDMFGYRRIYKCGLVVFLLSSLFCALSDSLQMLTLARVIQGFGGAA LMSVNTALIRLIYPQRFLGRGMGINSFIVAVSSAAGPTIAAAILSIASWKWLFLINVPLG IIALLLAMRFLPPNGSRASKPRFDLPSAVMNALTFGLLITALSGFAQGQSLTLIAAELVV MVVVGIFFIRRQLSLPVPLLPVDLLRIPLFSLSICTSVCSFCAQMLAMVSLPFYLQTVLG RSEVETGLLLTPWPLATMVMAPLAGYLIERVHAGLLGALGLFIMAAGLFSLVLLPASPAD INIIWPMILCGAGFGLFQSPNNHTIITSAPRERSGGASGMLGTARLLGQSSGAALVALML NQFGDNGTHVSLMAAAILAVIAACVSGLRITQPRSRA >gi|296494602|gb|ADTN01000136.1| GENE 25 22002 - 22793 967 263 aa, chain + ## HITS:1 COG:kdgR KEGG:ns NR:ns ## COG: kdgR COG1414 # Protein_GI_number: 16129781 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 263 1 263 263 494 100.0 1e-140 MANADLDKQPDSVSSVLKVFGILQALGEEREIGITELSQRVMMSKSTVYRFLQTMKTLGY VAQEGESEKYSLTLKLFELGARALQNVDLIRSADIQMRELSRLTKETIHLGALDEDSIVY IHKIDSMYNLRMYSRIGRRNPLYSTAIGKVLLAWRDRDEVKQILEGVEYKRSTERTITST EALLPVLDQVREQGYGEDNEEQEEGLRCIAVPVFDRFGVVIAGLSISFPTLRFSEERLQE YVAMLHTAARKISAQMGYHDYPF >gi|296494602|gb|ADTN01000136.1| GENE 26 22937 - 23176 247 79 aa, chain - ## HITS:1 COG:no 
KEGG:ECO103_2017 NR:ns ## KEGG: ECO103_2017 # Name: yobH # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 79 1 79 79 141 100.0 8e-33 MRFIIRTVMLIALVWIGLLLSGYGVLIGSKENAAGLGLQCTYLTARGTSTVQYLHTKSGF LGITDCPLLRKSNIVVDNG >gi|296494602|gb|ADTN01000136.1| GENE 27 23553 - 23840 225 95 aa, chain + ## HITS:1 COG:no KEGG:ECS88_1877 NR:ns ## KEGG: ECS88_1877 # Name: yebO # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 95 1 95 95 134 100.0 1e-30 MNEVVNSGVMNIASLVVSVVVLLIGLILWFFINRASSRTNEQIELLEALLDQQKRQNALL RRLCEANEPEKADKKTVESQKSVEDEDIIRLVAER >gi|296494602|gb|ADTN01000136.1| GENE 28 24510 - 24653 85 47 aa, chain + ## HITS:1 COG:no KEGG:ECO111_2331 NR:ns ## KEGG: ECO111_2331 # Name: yobF # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 47 1 47 47 84 100.0 1e-15 MCGIFSKEVLSKHVDVEYRFSAEPYIGASCSNVSVLSMLCLRAKKTI >gi|296494602|gb|ADTN01000136.1| GENE 29 24666 - 24875 323 69 aa, chain + ## HITS:1 COG:ECs2533 KEGG:ns NR:ns ## COG: ECs2533 COG1278 # Protein_GI_number: 15831787 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Escherichia coli O157:H7 # 1 69 1 69 69 120 100.0 5e-28 MAKIKGQVKWFNESKGFGFITPADGSKDVFVHFSAIQGNGFKTLAEGQNVEFEIQDGQKG PAAVNVTAI >gi|296494602|gb|ADTN01000136.1| GENE 30 25125 - 25850 544 241 aa, chain + ## HITS:1 COG:rrmA KEGG:ns NR:ns ## COG: rrmA COG0500 # Protein_GI_number: 16129776 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli K12 # 1 241 29 269 269 496 100.0 1e-140 MAKEGYVNLLPVQHKRSRDPGDSAEMMQARRAFLDAGHYQPLRDAIVAQLRERLDDKATA VLDIGCGEGYYTHAFADALPEITTFGLDVSKVAIKAAAKRYPQVTFCVASSHRLPFSDTS MDAIIRIYAPCKAEELARVVKPGGWVITATPGPRHLMELKGLIYNEVHLHAPHAEQLEGF TLQQSAELCYPMRLRGDEAVALLQMTPFAWRAKPEVWQTLAAKEVFDCQTDFNIHLWQRS Y >gi|296494602|gb|ADTN01000136.1| GENE 31 25847 - 26413 691 188 aa, chain - ## HITS:1 COG:ECs2531 KEGG:ns 
NR:ns ## COG: ECs2531 COG1971 # Protein_GI_number: 15831785 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 188 19 206 206 324 100.0 5e-89 MNITATVLLAFGMSMDAFAASIGKGATLHKPKFSEALRTGLIFGAVETLTPLIGWGMGML ASRFVLEWNHWIAFVLLIFLGGRMIIEGFRGADDEDEEPRRRHGFWLLVTTAIATSLDAM AVGVGLAFLQVNIIATALAIGCATLIMSTLGMMVGRFIGSIIGKKAEILGGLVLIGIGVQ ILWTHFHG >gi|296494602|gb|ADTN01000136.1| GENE 32 26842 - 27300 265 152 aa, chain - ## HITS:1 COG:yobD KEGG:ns NR:ns ## COG: yobD COG4811 # Protein_GI_number: 16129774 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 152 1 152 152 256 100.0 1e-68 MTITDLVLILFIAALLAFAIYDQFIMPRRNGPTLLAIPLLRRGRIDSVIFVGLIVILIYN NVTNHGALITTWLLSALALMGFYIFWIRVPKIIFKQKGFFFANVWIEYSRIKAMNLSEDG VLVMQLEQRRLLIRVRNIDDLEKIYKLLVSTQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:36:40 2011 Seq name: gi|296494601|gb|ADTN01000137.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont283.5, whole genome shotgun sequence Length of sequence - 24970 bp Number of predicted genes - 22, with homology - 22 Number of transcription units - 13, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 13/0.000 - CDS 24 - 875 1068 ## COG3716 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID 2 1 Op 2 13/0.000 - CDS 888 - 1688 1010 ## COG3715 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC 3 1 Op 3 . - CDS 1751 - 2722 1290 ## COG3444 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB - Prom 2857 - 2916 4.5 4 2 Tu 1 . 
+ CDS 3185 - 4741 1813 ## COG1253 Hemolysins and related proteins containing CBS domains
5 3 Tu 1 2/1.000 - CDS 4745 - 6343 1305 ## COG2200 FOG: EAL domain
- Prom 6400 - 6459 1.9
- Term 6426 - 6463 8.2
6 4 Tu 1 5/0.167 - CDS 6474 - 7838 1475 ## COG1760 L-serine deaminase
- Prom 7931 - 7990 5.2
7 5 Op 1 6/0.000 - CDS 8022 - 8600 440 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes
8 5 Op 2 . - CDS 8604 - 9965 1090 ## COG0147 Anthranilate/para-aminobenzoate synthases component I
- Prom 10013 - 10072 5.2
+ Prom 9957 - 10016 4.7
9 6 Tu 1 . + CDS 10039 - 10218 338 ## COG3140 Uncharacterized protein conserved in bacteria
- Term 10265 - 10308 5.2
10 7 Tu 1 . - CDS 10338 - 10637 214 ## S1533 hypothetical protein
11 8 Tu 1 . - CDS 11059 - 11403 477 ## COG0251 Putative translation initiation inhibitor, yjgF family
+ Prom 11371 - 11430 2.9
12 9 Op 1 8/0.000 + CDS 11535 - 13445 1653 ## COG1199 Rad3-related DNA helicases
13 9 Op 2 7/0.000 + CDS 13503 - 14198 728 ## COG1214 Inactive homolog of metal-dependent proteases, putative molecular chaperone
14 9 Op 3 7/0.000 + CDS 14238 - 14819 612 ## COG3065 Starvation-inducible outer membrane lipoprotein
+ Term 14845 - 14873 0.6
+ Prom 14879 - 14938 3.1
15 9 Op 4 8/0.000 + CDS 14958 - 16709 1619 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II
+ Term 16728 - 16760 3.0
16 9 Op 5 . + CDS 16791 - 17906 1132 ## COG0349 Ribonuclease D
- Term 17899 - 17955 13.2
17 10 Op 1 11/0.000 - CDS 17960 - 18925 617 ## COG1018 Flavodoxin reductases (ferredoxin-NADPH reductases) family 1
- Term 18935 - 18977 5.5
18 10 Op 2 3/0.833 - CDS 18981 - 20105 1231 ## COG4638 Phenylpropionate dioxygenase and related ring-hydroxylating dioxygenases, large terminal subunit
19 10 Op 3 3/0.833 - CDS 20137 - 21567 1536 ## COG1292 Choline-glycine betaine transporter
- Prom 21629 - 21688 7.2
- Term 21720 - 21754 6.0
20 11 Tu 1 .
- CDS 21773 - 22855 1057 ## COG0473 Isocitrate/isopropylmalate dehydrogenase - Prom 22893 - 22952 3.0 + Prom 22861 - 22920 3.8 21 12 Tu 1 1/1.000 + CDS 22961 - 23884 869 ## COG0583 Transcriptional regulator + Prom 23911 - 23970 3.5 22 13 Tu 1 . + CDS 24011 - 24649 495 ## COG1280 Putative threonine efflux protein Predicted protein(s) >gi|296494601|gb|ADTN01000137.1| GENE 1 24 - 875 1068 283 aa, chain - ## HITS:1 COG:ECs2529 KEGG:ns NR:ns ## COG: ECs2529 COG3716 # Protein_GI_number: 15831783 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID # Organism: Escherichia coli O157:H7 # 1 283 4 286 286 559 100.0 1e-159 MVDTTQTTTEKKLTQSDIRGVFLRSNLFQGSWNFERMQALGFCFSMVPAIRRLYPENNEA RKQAIRRHLEFFNTQPFVAAPILGVTLALEEQRANGAEIDDGAINGIKVGLMGPLAGVGD PIFWGTVRPVFAALGAGIAMSGSLLGPLLFFILFNLVRLATRYYGVAYGYSKGIDIVKDM GGGFLQKLTEGASILGLFVMGALVNKWTHVNIPLVVSRITDQTGKEHVTTVQTILDQLMP GLVPLLLTFACMWLLRKKVNPLWIIVGFFVIGIAGYACGLLGL >gi|296494601|gb|ADTN01000137.1| GENE 2 888 - 1688 1010 266 aa, chain - ## HITS:1 COG:ECs2528 KEGG:ns NR:ns ## COG: ECs2528 COG3715 # Protein_GI_number: 15831782 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC # Organism: Escherichia coli O157:H7 # 1 266 1 266 266 394 100.0 1e-109 MEITTLQIVLVFIVACIAGMGSILDEFQFHRPLIACTLVGIVLGDMKTGIIIGGTLEMIA LGWMNIGAAVAPDAALASIISTILVIAGHQSIGAGIALAIPLAAAGQVLTIIVRTITVAF QHAADKAADNGNLTAISWIHVSSLFLQAMRVAIPAVIVALSVGTSEVQNMLNAIPEVVTN GLNIAGGMIVVVGYAMVINMMRAGYLMPFFYLGFVTAAFTNFNLVALGVIGTVMAVLYIQ LSPKYNRVAGAPAQAAGNNDLDNELD >gi|296494601|gb|ADTN01000137.1| GENE 3 1751 - 2722 1290 323 aa, chain - ## HITS:1 COG:ECs2527_2 KEGG:ns NR:ns ## COG: ECs2527_2 COG3444 # Protein_GI_number: 15831781 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB # 
Organism: Escherichia coli O157:H7 # 158 323 1 166 166 311 100.0 1e-84 MTIAIVIGTHGWAAEQLLKTAEMLLGEQENVGWIDFVPGENAETLIEKYNAQLAKLDTTK GVLFLVDTWGGSPFNAASRIVVDKEHYEVIAGVNIPMLVETLMARDDDPSFDELVALAVE TGREGVKALKAKPVEKAAPAPAAAAPKAAPTPAKPMGPNDYMVIGLARIDDRLIHGQVAT RWTKETNVSRIIVVSDEVAADTVRKTLLTQVAPPGVTAHVVDVAKMIRVYNNPKYAGERV MLLFTNPTDVERLVEGGVKITSVNVGGMAFRQGKTQVNNAVSVDEKDIEAFKKLNARGIE LEVRKVSTDPKLKMMDLISKIDK >gi|296494601|gb|ADTN01000137.1| GENE 4 3185 - 4741 1813 518 aa, chain + ## HITS:1 COG:yoaE_2 KEGG:ns NR:ns ## COG: yoaE_2 COG1253 # Protein_GI_number: 16129770 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Escherichia coli K12 # 231 518 1 288 288 540 100.0 1e-153 MEFLMDPSIWAGLLTLVVLEIVLGIDNLVFIAILADKLPPKQRDKARLLGLSLALIMRLG LLSLISWMVTLTKPLFTVMDFSFSGRDLIMLFGGIFLLFKATTELHERLENRDHDSGHGK GYASFWVVVTQIVILDAVFSLDAVITAVGMVNHLPVMMAAVVIAMAVMLLASKPLTRFVN QHPTVVVLCLSFLLMIGLSLVAEGFGFHIPKGYLYAAIGFSIIIEVFNQIARRNFIRHQS TLPLRARTADAILRLMGGKRQANVQHDADNPMPMPIPEGAFAEEERYMINGVLTLASRSL RGIMTPRGEISWVDANLGVDEIREQLLSSPHSLFPVCRGELDEIIGIVRAKELLVALEEG VDVAAIASASPAIIVPETLDPINLLGVLRRARGSFVIVTNEFGVVQGLVTPLDVLEAIAG EFPDADETPEIITDGDGWLVKGGTDLHALQQALDVEHLADDDDIATVAGLVISANGHIPR VGDVIDVGPLHITIIEANDYRVDLVRIVKEQPAHDEDE >gi|296494601|gb|ADTN01000137.1| GENE 5 4745 - 6343 1305 532 aa, chain - ## HITS:1 COG:yoaD_2 KEGG:ns NR:ns ## COG: yoaD_2 COG2200 # Protein_GI_number: 16129769 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 261 532 1 272 272 561 100.0 1e-159 MQKAQRIIKTYRRNRMIVCTICALVTLASTLSVRFISQRNLNQQRVVQFANHAVEELDKV LLPLQAGSEVLLPLIGLPCSVAHLPLRKQAAKLQTVRSIGLVQDGTLYCSSIFGYRNVPV VDILAELPAPQPLLRLTIDRALIKGSPVLIQWTPAAGSSNAGVMEMINIDLLTAMLLEPQ LPQISSASLTVDKRHLLYGNGLVDSLPQPEDNENYQVSSQRFPFTINVNGPGATALAWHY LPTQLPLAVLLSLLVGYIAWLATAYRMSFSREINLGLAQHEFELFCQPLLNARSQQCIGV EILLRWNNPRQGWISPDVFIPIAEEHHLIVPLTRYVMAETIRQRHVFPMSSQFHVGINVA 
PSHFRRGVLIKDLNQYWFSAHPIQQLILEITERDALLDVDYRIARELHRKNVKLAIDDFG TGNSSFSWLETLRPDVLKIDKSFTAAIGSDAVNSTVTDIIIALGQRLNIELVAEGVETQE QAKYLRRHGVHILQGYLYAQPMPLRDFPKWLAGSQPPPARHNGHITPIMPLR >gi|296494601|gb|ADTN01000137.1| GENE 6 6474 - 7838 1475 454 aa, chain - ## HITS:1 COG:sdaA KEGG:ns NR:ns ## COG: sdaA COG1760 # Protein_GI_number: 16129768 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Escherichia coli K12 # 1 454 1 454 454 925 100.0 0 MISLFDMFKVGIGPSSSHTVGPMKAGKQFVDDLVEKGLLDSVTRVAVDVYGSLSLTGKGH HTDIAIIMGLAGNEPATVDIDSIPGFIRDVEERERLLLAQGRHEVDFPRDNGMRFHNGNL PLHENGMQIHAYNGDEVVYSKTYYSIGGGFIVDEEHFGQDAANEVSVPYPFKSATELLAY CNETGYSLSGLAMQNELALHSKKEIDEYFAHVWQTMQACIDRGMNTEGVLPGPLRVPRRA SALRRMLVSSDKLSNDPMNVIDWVNMFALAVNEENAAGGRVVTAPTNGACGIVPAVLAYY DHFIESVSPDIYTRYFMAAGAIGALYKMNASISGAEVGCQGEVGVACSMAAAGLAELLGG SPEQVCVAAEIGMEHNLGLTCDPVAGQVQVPCIERNAIASVKAINAARMALRRTSAPRVS LDKVIETMYETGKDMNAKYRETSRGGLAIKVQCD >gi|296494601|gb|ADTN01000137.1| GENE 7 8022 - 8600 440 192 aa, chain - ## HITS:1 COG:ECs2522 KEGG:ns NR:ns ## COG: ECs2522 COG0494 # Protein_GI_number: 15831776 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli O157:H7 # 1 192 1 192 192 342 99.0 2e-94 MEYRSLTLDDFLSRFQLLRPQINRETLNHRQAAVLIPIVRRPQPGLLLTQRSIHLRKHAG QVAFPGGAVDDTDASAIAAALREAEEEVAIPPSAVEVIGVLPPVDSVTGYQVTPVVGIIP PDLPYRASEDEVSAVFEMPLAQALHLGRYHPLDIYRRGDSHRVWLSWYEQYFVWGMTAGI IRELALQIGVKP >gi|296494601|gb|ADTN01000137.1| GENE 8 8604 - 9965 1090 453 aa, chain - ## HITS:1 COG:pabB KEGG:ns NR:ns ## COG: pabB COG0147 # Protein_GI_number: 16129766 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Escherichia coli K12 # 1 453 1 453 453 927 99.0 0 MKTLSPAVITLPWRQDAAEFYFSRLSHLPWAMLLHSGYADHPYSRFDIVVAEPICTLTTF 
GKETVVSESEKRTTTTDDPLQVLQQVLDRADIRPTHNEDLPFQGGALGLFGYDLGRRFES LPEIAEQDIVLPDMAVGIYDWALIVDHQRHTVSLLSHNDVNARRAWLESQQFSPQEDFTL TSDWQSNMTREQYGEKFRQVQEYLHSGDCYQVNLAQRFHATYSGDEWQAFLQLNQANRAP FSAFLRLEQGAILSLSPERFILCDNSEIQTRPIKGTLPRLPDPQEDSKQAVKLANSAKDR AENLMIVDLMRNDIGRVAVAGSVKVPELFVVEPFPAVHHLVSTITAQLPEQLHASDLLRA AFPGGSITGAPKVRAMEIIDELEPQRRNAWCGSIGYLSFCGNMDTSITIRTLTAINGQIF CSAGGGIVADSQEEAEYQETFDKVNRILKQLEK >gi|296494601|gb|ADTN01000137.1| GENE 9 10039 - 10218 338 59 aa, chain + ## HITS:1 COG:ECs2520 KEGG:ns NR:ns ## COG: ECs2520 COG3140 # Protein_GI_number: 15831774 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 59 1 59 59 83 100.0 1e-16 MFAGLPSLTHEQQQKAVERIQELMAQGMSSGQAIALVAEELRANHSGERIVARFEDEDE >gi|296494601|gb|ADTN01000137.1| GENE 10 10338 - 10637 214 99 aa, chain - ## HITS:1 COG:no KEGG:S1533 NR:ns ## KEGG: S1533 # Name: not_defined # Def: hypothetical protein # Organism: S.flexneri_2457T # Pathway: not_defined # 1 99 21 119 119 187 100.0 7e-47 MPAVIDKALDFIGAMDVSAPTPSSMNESTAKGIFKYLKELGVPASAADITARADQEGWNP GFTEKMVGWAKKMETGERSVIKNPEYFSTYMQEELKALV >gi|296494601|gb|ADTN01000137.1| GENE 11 11059 - 11403 477 114 aa, chain - ## HITS:1 COG:yoaB KEGG:ns NR:ns ## COG: yoaB COG0251 # Protein_GI_number: 16129763 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Escherichia coli K12 # 1 114 17 130 130 209 100.0 9e-55 MTIVRIDAEARWSDVVIHNNTLYYTGVPENLDADAFEQTANTLAQIDAVLEKQGSNKSSI LDATIFLADKNDFAAMNKAWDAWVVAGHAPVRCTVQAGLMNPKYKVEIKIVAAV >gi|296494601|gb|ADTN01000137.1| GENE 12 11535 - 13445 1653 636 aa, chain + ## HITS:1 COG:ECs2517 KEGG:ns NR:ns ## COG: ECs2517 COG1199 # Protein_GI_number: 15831771 # Func_class: K Transcription; L Replication, recombination and repair # Function: Rad3-related DNA helicases # Organism: Escherichia coli O157:H7 # 1 636 1 636 636 1265 100.0 0 
MTDDFAPDGQLAKAIPGFKPREPQRQMAVAVTQAIEKGQPLVVEAGTGTGKTYAYLAPAL RAKKKVIISTGSKALQDQLYSRDLPTVSKALKYTGNVALLKGRSNYLCLERLEQQALAGG DLPVQILSDVILLRSWSNQTVDGDISTCVSVAEDSQAWPLVTSTNDNCLGSDCPMYKDCF VVKARKKAMDADVVVVNHHLFLADMVVKESGFGELIPEADVMIFDEAHQLPDIASQYFGQ SLSSRQLLDLAKDITIAYRTELKDTQQLQKCADRLAQSAQDFRLQLGEPGYRGNLRELLA NPQIQRAFLLLDDTLELCYDVAKLSLGRSALLDAAFERATLYRTRLKRLKEINQPGYSYW YECTSRHFTLALTPLSVADKFKELMAQKPGSWIFTSATLSVNDDLHHFTSRLGIEQAESL LLPSPFDYSRQALLCVPRNLPQTNQPGSARQLAAMLRPIIEANNGRCFMLCTSHAMMRDL AEQFRATMTLPVLLQGETSKGQLLQQFVSAGNALLVATSSFWEGVDVRGDTLSLVIIDKL PFTSPDDPLLKARMEDCRLRGGDPFDEVQLPDAVITLKQGVGRLIRDADDRGVLVICDNR LVMRPYGATFLASLPPAPRTRDIARAVRFLAIPSSR >gi|296494601|gb|ADTN01000137.1| GENE 13 13503 - 14198 728 231 aa, chain + ## HITS:1 COG:yeaZ KEGG:ns NR:ns ## COG: yeaZ COG1214 # Protein_GI_number: 16129761 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Inactive homolog of metal-dependent proteases, putative molecular chaperone # Organism: Escherichia coli K12 # 1 231 1 231 231 452 100.0 1e-127 MRILAIDTATEACSVALWNDGTVNAHFELCPREHTQRILPMVQDILTTSGTSLTDINALA YGRGPGSFTGVRIGIGIAQGLALGAELPMIGVSTLMTMAQGAWRKNGATRVLAAIDARMG EVYWAEYQRDENGIWHGEETEAVLKPEIVHERMQQLSGEWVTVGTGWQAWPDLGKESGLV LRDGEVLLPAAEDMLPIACQMFAEGKTVAVEHAEPVYLRNNVAWKKLPGKE >gi|296494601|gb|ADTN01000137.1| GENE 14 14238 - 14819 612 193 aa, chain + ## HITS:1 COG:ECs2515 KEGG:ns NR:ns ## COG: ECs2515 COG3065 # Protein_GI_number: 15831769 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Starvation-inducible outer membrane lipoprotein # Organism: Escherichia coli O157:H7 # 1 193 1 193 193 358 98.0 2e-99 MAVQKNVIKGILAGTFALMLSGCVTVPDAIKGSSPTPQQDLVRVMSAPQLYVGQEARFGG KVVAVQNQQGKTRLEIATVPLDSGARPTLGELSRGRIYADVNGFLDPVDFRGQLVTVVGP ITGAVDGKIGNTPYKFMVMQVTGYKRWHLTQQVIMPPQPIDPWFYGGRGWPYGYGGWGWY NPGPARVQTVVTE >gi|296494601|gb|ADTN01000137.1| GENE 15 14958 - 16709 1619 583 aa, chain + ## HITS:1 COG:fadD KEGG:ns NR:ns ## COG: fadD COG0318 # Protein_GI_number: 16129759 # Func_class: I 
Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Escherichia coli K12 # 23 583 1 561 561 1149 99.0 0 MLTACISFGVAMTTNTHFRGEELKKVWLNRYPADVPTEINPDRYQSLVDMFEQSVARYAD QPAFVNMGEVMTFRKLEERSRAFAAYLQQGLGLKKGDRVALMMPNLLQYPVALFGILRAG MIVVNVNPLYTPRELEHQLNDSGASAIVIVSNFAHTLEKVVDKTAVQHVILTRMGDQLST AKGTVVNFVVKYIKRLVPKYHLPDAISFRSALHNGYRMQYVKPELVPEDLAFLQYTGGTT GVAKGAMLTHRNMLANLEQVNATYGPLLHPGKELVVTALPLYHIFALTINCLLFIELGGQ NLLITNPRDIPGLVKELAKYPFTAITGVNTLFNALLNNKEFQQLDFSSLHLSAGGGMPVQ QVVAERWVKLTGQYLLEGYGLTECAPLVSVNPYDIDYHSGSIGLPVPSTEAKLVDDDDNE VPPGQPGELCVKGPQVMLGYWQRPDATDEIIKNGWLHTGDIAVMDEEGFLRIVDRKKDMI LVSGFNVYPNEIEDVVMQHPGVQEVAAVGVPSGSSGEAVKIFVVKKDPSLTEESLVTFCR RQLTGYKVPKLVEFRDELPKSNVGKILRRELRDEARGKVDNKA >gi|296494601|gb|ADTN01000137.1| GENE 16 16791 - 17906 1132 371 aa, chain + ## HITS:1 COG:rnd KEGG:ns NR:ns ## COG: rnd COG0349 # Protein_GI_number: 16129758 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonuclease D # Organism: Escherichia coli K12 # 1 371 5 375 375 729 100.0 0 MITTDDALASLCEAVRAFPAIALDTEFVRTRTYYPQLGLIQLFDGEHLALIDPLGITDWS PLKAILRDPSITKFLHAGSEDLEVFLNVFGELPQPLIDTQILAAFCGRPMSWGFASMVEE YSGVTLDKSESRTDWLARPLTERQCEYAAADVWYLLPITAKLMVETEASGWLPAALDECR LMQMRRQEVVAPEDAWRDITNAWQLRTRQLACLQLLADWRLRKARERDLAVNFVVREEHL WSVARYMPGSLGELDSLGLSGSEIRFHGKTLLALVEKAQTLPEDALPQPMLNLMDMPGYR KAFKAIKSLITDVSETHKISAELLASRRQINQLLNWHWKLKPQNNLPELISGWRGELMAE ALHNLLQEYPQ >gi|296494601|gb|ADTN01000137.1| GENE 17 17960 - 18925 617 321 aa, chain - ## HITS:1 COG:yeaX KEGG:ns NR:ns ## COG: yeaX COG1018 # Protein_GI_number: 16129757 # Func_class: C Energy production and conversion # Function: Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 # Organism: Escherichia coli K12 # 1 321 1 321 321 648 100.0 0 MSDYQMFEVQVSQVEPLTEQVKRFTLVATDGKPLPAFTGGSHVIVQMSDGDNQYSNAYSL LSSPHDTSCYQIAVRLEENSRGGSRFLHQQVKVGDRLTISTPNNLFALIPSARKHLFIAG 
GIGITPFLSHMAELQHSDVDWQLHYCSRNPESCAFRDELVQHPQAEKVHLHHSSTGTRLE LARLLADIEPGTHVYTCGPEALIEAVRSEAARLDIAADTLHFEQFAIEDKTGDAFTLVLA RSGKEFVVPEEMTILQVIENNKAAKVECLCREGVCGTCETAILEGEADHRDQYFSDEERA SQQSMLICCSRAKGKRLVLDL >gi|296494601|gb|ADTN01000137.1| GENE 18 18981 - 20105 1231 374 aa, chain - ## HITS:1 COG:ECs2511 KEGG:ns NR:ns ## COG: ECs2511 COG4638 # Protein_GI_number: 15831765 # Func_class: P Inorganic ion transport and metabolism; R General function prediction only # Function: Phenylpropionate dioxygenase and related ring-hydroxylating dioxygenases, large terminal subunit # Organism: Escherichia coli O157:H7 # 1 374 1 374 374 801 100.0 0 MSNLSPDFVLPENFCANPQEAWTIPARFYTDQNAFEHEKENVFAKSWICVAHSSELANAN DYVTREIIGESIVLVRGRDKVLRAFYNVCPHRGHQLLSGEGKAKNVITCPYHAWAFKLDG NLAHARNCENVANFDSDKAQLVPVRLEEYAGFVFINMDPNATSVEDQLPGLGAKVLEACP EVHDLKLAARFTTRTPANWKNIVDNYLECYHCGPAHPGFSDSVQVDRYWHTMHGNWTLQY GFAKPSEQSFKFEEGTDAAFHGFWLWPCTMLNVTPIKGMMTVIYEFPVDSETTLQNYDIY FTNEELTDEQKSLIEWYRDVFRPEDLRLVESVQKGLKSRGYRGQGRIMADSSGSGISEHG IAHFHNLLAQVFKD >gi|296494601|gb|ADTN01000137.1| GENE 19 20137 - 21567 1536 476 aa, chain - ## HITS:1 COG:ECs2510 KEGG:ns NR:ns ## COG: ECs2510 COG1292 # Protein_GI_number: 15831764 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Choline-glycine betaine transporter # Organism: Escherichia coli O157:H7 # 1 476 6 481 481 899 100.0 0 MGLVIYLATSKYGNIRLGEGKPEYSTLSWLFMFICAGLGSSTLYWGVAEWAYYYQTPGLN IAPRSQQALEFSVPYSFFHWGISAWATYTLASLIMAYHFHVRKNKGLSLSGIIAAITGVR PQGPWGKLVDLMFLIATVGALTISLVVTAATFTRGLSALTGLPDNFTVQAFVILLSGGIF CLSSWIGINNGLQRLSKMVGWGAFLLPLLVLIVGPTEFITNSIINAIGLTTQNFLQMSLF TDPLGDGSFTRNWTVFYWLWWISYTPGVAMFVTRVSRGRKIKEVIWGLILGSTVGCWFFF GVMESYAIHQFINGVINVPQVLETLGGETAVQQVLMSLPAGKLFLAAYLGVMIIFLASHM DAVAYTMAATSTRNLQEGDDPDRGLRLFWCVVITLIPLSILFTGASLETMKTTVVLTALP FLVILLVKVGGFIRWLKQDYADIPAHQVEHYLPQTPVEALEKTPVLPAGTVFKGDN >gi|296494601|gb|ADTN01000137.1| GENE 20 21773 - 22855 1057 360 aa, chain - ## HITS:1 COG:yeaU KEGG:ns NR:ns ## COG: yeaU COG0473 # Protein_GI_number: 16129754 
# Func_class: C Energy production and conversion; E Amino acid transport and metabolism # Function: Isocitrate/isopropylmalate dehydrogenase # Organism: Escherichia coli K12 # 1 360 2 361 361 761 100.0 0 MKTMRIAAIPGDGIGKEVLPEGIRVLQAAAERWGFALSFEQMEWASCEYYSHHGKMMPDD WHEQLSRFDAIYFGAVGWPDTVPDHISLWGSLLKFRREFDQYVNLRPVRLFPGVPCPLAG KQPGDIDFYVVRENTEGEYSSLGGRVNEGTEHEVVIQESVFTRRGVDRILRYAFELAQSR PRKTLTSATKSNGLAISMPYWDERVEAMAENYPEIRWDKQHIDILCARFVMQPERFDVVV ASNLFGDILSDLGPACTGTIGIAPSANLNPERTFPSLFEPVHGSAPDIYGKNIANPIATI WAGAMMLDFLGNGDERFQQAHNGILAAIEEVIAHGPKTPDMKGNATTPQVADAICKIILR >gi|296494601|gb|ADTN01000137.1| GENE 21 22961 - 23884 869 307 aa, chain + ## HITS:1 COG:ECs2508 KEGG:ns NR:ns ## COG: ECs2508 COG0583 # Protein_GI_number: 15831762 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 307 8 314 314 623 99.0 1e-178 MNNLPLLNDLRVFMLVARRAGFAAVAEELGVSPAFVSKRIALLEQTLNVVLLHRTTRRVT ITEEGERIYEWAQRILQDVGQMMDELSDVRQVPQGMLRIISSFGFGRQVVAPALSALAKA YPQLELRFDVEDRLVDLVNEGVDLDIRIGDDIAPNLIARKLATNYRILCASPEFIAQHGA PKHLTDLSALPCLVIKERDHPFGVWQLRNKEGPHAIKVTGPLSSNHGEIVHQWCLDGQGI ALRSWWDVSENIASGHLVQVLPEYYQPANVWAVYVSRLATSAKVRITVEFLRQYFAEHYP NFSLEHA >gi|296494601|gb|ADTN01000137.1| GENE 22 24011 - 24649 495 212 aa, chain + ## HITS:1 COG:yeaS KEGG:ns NR:ns ## COG: yeaS COG1280 # Protein_GI_number: 16129752 # Func_class: E Amino acid transport and metabolism # Function: Putative threonine efflux protein # Organism: Escherichia coli K12 # 1 212 1 212 212 359 100.0 2e-99 MFAEYGVLNYWTYLVGAIFIVLVPGPNTLFVLKNSVSSGMKGGYLAACGVFIGDAVLMFL AWAGVATLIKTTPILFNIVRYLGAFYLLYLGSKILYATLKGKNSEAKSDEPQYGAIFKRA LILSLTNPKAILFYVSFFVQFIDVNAPHTGISFFILAATLELVSFCYLSFLIISGAFVTQ YIRTKKKLAKVGNSLIGLMFVGFAARLATLQS Prediction of potential genes in microbial genomes Time: Sun May 15 23:36:44 2011 Seq name: gi|296494600|gb|ADTN01000138.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont287.1, whole genome shotgun sequence Length of sequence - 991 bp Number of predicted genes - 1, with 
homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 991 439 ## COG3209 Rhs family protein Predicted protein(s) >gi|296494600|gb|ADTN01000138.1| GENE 1 1 - 991 439 330 aa, chain - ## HITS:1 COG:rhsE KEGG:ns NR:ns ## COG: rhsE COG3209 # Protein_GI_number: 16129415 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli K12 # 1 330 99 428 682 645 98.0 0 VHRETVRSFGSMAGSNAAYELTSTYTPAGQLQSQHLNSLVYDRDYGWSDNGDLVRISGPR QTREYGYSATGRLESVRTLAPDLDIRIPYATDPAGNRLPDPELHPDSTLTVWPDNRIAED AHYVYRHDEYGRLTEKTDRIPAGVIRPDDERTHHYHYDSQHRLVFYTRIQHGEPLVESRY LYDPLGRRMAKRVWRRERDLTGWMSLSRKPEVTWYGWDGDRLTTVQTDTTRIQTVYEPGS FTPLIRVETENGEREKAQRRSLAETLQQEGSENGHGVVFPAELVRLLDRLEEEIRADRVS SESRAWLAQCGLTVEQLARQVEPEYTPARK Prediction of potential genes in microbial genomes Time: Sun May 15 23:36:49 2011 Seq name: gi|296494599|gb|ADTN01000139.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont288.1, whole genome shotgun sequence Length of sequence - 17432 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 8, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/1.000 + CDS 78 - 1028 1167 ## COG0549 Carbamate kinase 2 1 Op 2 1/1.000 + CDS 1038 - 2420 1518 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases + Prom 2551 - 2610 2.4 3 2 Tu 1 . + CDS 2797 - 3846 1014 ## COG1064 Zn-dependent alcohol dehydrogenases + Prom 4003 - 4062 5.5 4 3 Tu 1 . + CDS 4089 - 4904 344 ## B21_00285 hypothetical protein + TRNA 5247 - 5327 28.4 # Pseudo ??? 0 0 - Term 5419 - 5452 -0.4 5 4 Tu 1 . - CDS 5579 - 6247 545 ## COG1280 Putative threonine efflux protein - Prom 6496 - 6555 6.3 + Prom 6216 - 6275 4.5 6 5 Tu 1 . + CDS 6397 - 6672 394 ## ECUMN_0373 hypothetical protein - Term 6722 - 6767 2.7 7 6 Tu 1 . 
- CDS 6770 - 8356 1579 ## COG1221 Transcriptional regulators containing an AAA-type ATPase domain and a DNA-binding domain - Prom 8474 - 8533 4.7 + Prom 8452 - 8511 4.8 8 7 Op 1 9/0.000 + CDS 8595 - 9485 1017 ## COG2513 PEP phosphonomutase and related enzymes 9 7 Op 2 8/0.000 + CDS 9739 - 10908 1303 ## COG0372 Citrate synthase 10 7 Op 3 4/1.000 + CDS 10942 - 12393 1671 ## COG2079 Uncharacterized protein involved in propionate catabolism 11 7 Op 4 2/1.000 + CDS 12433 - 14319 1922 ## COG0365 Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases + Prom 14552 - 14611 5.6 12 8 Op 1 6/0.000 + CDS 14649 - 15908 1439 ## COG1457 Purine-cytosine permease and related proteins 13 8 Op 2 . + CDS 15898 - 17181 1452 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases + Term 17386 - 17415 -0.8 Predicted protein(s) >gi|296494599|gb|ADTN01000139.1| GENE 1 78 - 1028 1167 316 aa, chain + ## HITS:1 COG:yahI KEGG:ns NR:ns ## COG: yahI COG0549 # Protein_GI_number: 16128308 # Func_class: E Amino acid transport and metabolism # Function: Carbamate kinase # Organism: Escherichia coli K12 # 1 316 1 316 316 584 100.0 1e-167 MKELVVVAIGGNSIIKDNASQSIEHQAEAVKAVADTVLEMLASDYDIVLTHGNGPQVGLD LRRAEIAHKREGLPLTPLANCVADTQGGIGYLIQQALNNRLARHGEKKAVTVVTQVEVDK NDPGFAHPTKPIGAFFSDSQRDELQKANPDWCFVEDAGRGYRRVVASPEPKRIVEAPAIK ALIQQGFVVIGAGGGGIPVVRTDAGDYQSVDAVIDKDLSTALLAREIHADILVITTGVEK VCIHFGKPQQQALDRVDIATMTRYMQEGHFPPGSMLPKIIASLTFLEQGGKEVIITTPEC LPAALRGETGTHIIKT >gi|296494599|gb|ADTN01000139.1| GENE 2 1038 - 2420 1518 460 aa, chain + ## HITS:1 COG:yahJ KEGG:ns NR:ns ## COG: yahJ COG0402 # Protein_GI_number: 16128309 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Escherichia coli K12 # 1 460 1 460 460 927 100.0 0 MKESNSRREFLSQSGKMVTAAALFGTSVPLAHAAVAGTLNCEANNTMKITDPHYYLDNVL LETGFDYENGVAVQTRTARQTVEIQDGKIVALRENKLHPDATLPHYDAGGKLMLPTTRDM 
HIHLDKTFYGGPWRSLNRPAGTTIQDMIKLEQKMLPELQPYTQERAEKLIDLLQSKGTTI ARSHCNIEPVSGLKNLQNLQAVLARRQAGFECEIVAFPQHGLLLSKSEPLMREAMQAGAH YVGGLDPTSVDGAMEKSLDTMFQIALDYDKGVDIHLHETTPAGVAAINYMVETVEKTPQL KGKLTISHAFALATLNEQQVDELANRMVVQQISIASTVPIGTLHMPLKQLHDKGVKVMTG TDSVIDHWSPYGLGDMLEKANLYAQLYIRPNEQNLSRSLFLATGDVLPLNEKGERVWPKA QDDASFVLVDASCSAEAVARISPRTATFHKGQLVWGSVAG >gi|296494599|gb|ADTN01000139.1| GENE 3 2797 - 3846 1014 349 aa, chain + ## HITS:1 COG:yahK KEGG:ns NR:ns ## COG: yahK COG1064 # Protein_GI_number: 16128310 # Func_class: R General function prediction only # Function: Zn-dependent alcohol dehydrogenases # Organism: Escherichia coli K12 # 1 349 1 349 349 690 100.0 0 MKIKAVGAYSAKQPLEPMDITRREPGPNDVKIEIAYCGVCHSDLHQVRSEWAGTVYPCVP GHEIVGRVVAVGDQVEKYAPGDLVGVGCIVDSCKHCEECEDGLENYCDHMTGTYNSPTPD EPGHTLGGYSQQIVVHERYVLRIRHPQEQLAAVAPLLCAGITTYSPLRHWQAGPGKKVGV VGIGGLGHMGIKLAHAMGAHVVAFTTSEAKREAAKALGADEVVNSRNADEMAAHLKSFDF ILNTVAAPHNLDDFTTLLKRDGTMTLVGAPATPHKSPEVFNLIMKRRAIAGSMIGGIPET QEMLDFCAEHGIVADIEMIRADQINEAYERMLRGDVKYRFVIDNRTLTD >gi|296494599|gb|ADTN01000139.1| GENE 4 4089 - 4904 344 271 aa, chain + ## HITS:1 COG:no KEGG:B21_00285 NR:ns ## KEGG: B21_00285 # Name: yahL # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 271 1 271 271 482 100.0 1e-135 MISLKAPHNNLMPYTQQSILNTVKNNQLPEDIKSSLVSCVDIFKVLIKQYYDYPYDCRDD LVDDDKLIHLMAAVRDCEWSDDNALTINVQFNDFPGFYDWMDYPDHPVKFVFHILENQKG TVWVYDQDDAFLDIKANVQAGRFTGLKKLVQFIDSVRTDCKCILLEYHMPLLRIFPKGKE CMHVEKWLREMSSIPETDAPIKQALAHGLLLHLKNIYPVFPESLVMLLLSVLDVKTYRDD ARLNEWISNRVQELGDRYYPVNKHVKIRYTL >gi|296494599|gb|ADTN01000139.1| GENE 5 5579 - 6247 545 222 aa, chain - ## HITS:1 COG:yahN KEGG:ns NR:ns ## COG: yahN COG1280 # Protein_GI_number: 16128313 # Func_class: E Amino acid transport and metabolism # Function: Putative threonine efflux protein # Organism: Escherichia coli K12 # 1 222 2 223 223 401 100.0 1e-112 MQLVHLFMDEITMDPLHAVYLTVGLFVITFFNPGANLFVVVQTSLASGRRAGVLTGLGVA LGDAFYSGLGLFGLATLITQCEEIFSLIRIVGGAYLLWFAWCSMRRQSTPQMSTLQQPIS 
APWYVFFRRGLITDLSNPQTVLFFISIFSVTLNAETPTWARLMAWAGIVLASIIWRVFLS QAFSLPAVRRAYGRMQRVASRVIGAIIGVFALRLIYEGVTQR >gi|296494599|gb|ADTN01000139.1| GENE 6 6397 - 6672 394 91 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_0373 NR:ns ## KEGG: ECUMN_0373 # Name: yahO # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 91 1 91 91 134 98.0 1e-30 MKIISKMLVGALALAVTNVYAAELMTKAEFEKVESQYEKIGDISTSNEMSTADAKEDLIK KADEKGADVLVLTSGQTDNKIHGTANIYKKK >gi|296494599|gb|ADTN01000139.1| GENE 7 6770 - 8356 1579 528 aa, chain - ## HITS:1 COG:prpR KEGG:ns NR:ns ## COG: prpR COG1221 # Protein_GI_number: 16128315 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulators containing an AAA-type ATPase domain and a DNA-binding domain # Organism: Escherichia coli K12 # 1 528 1 528 528 1029 100.0 0 MAHPPRLNDDKPVIWTVSVTRLFELFRDISLEFDHLANITPIQLGFEKAVTYIRKKLANE RCDAIIAAGSNGAYLKSRLSVPVILIKPSGYDVLQALAKAGKLTSSIGVVTYQETIPALV AFQKTFNLRLDQRSYITEEDARGQINELKANGTEAVVGAGLITDLAEEAGMTGIFIYSAA TVRQAFSDALDMTRMSLRHNTHDATRNALRTRYVLGDMLGQSPQMEQVRQTILLYARSSA AVLIEGETGTGKELAAQAIHREYFARHDARQGKKSHPFVAVNCGAIAESLLEAELFGYEE GAFTGSRRGGRAGLFEIAHGGTLFLDEIGEMPLPLQTRLLRVLEEKEVTRVGGHQPVPVD VRVISATHCNLEEDMQQGRFRRDLFYRLSILRLQLPPLRERVADILPLAESFLKVSLAAL SAPFSAALRQGLQASETVLLHYDWPGNIRELRNMMERLALFLSVEPTPDLTPQFMQLLLP ELARESAKTPAPRLLTPQQALEKFNGDKTAAANYLGISRTTFWRRLKS >gi|296494599|gb|ADTN01000139.1| GENE 8 8595 - 9485 1017 296 aa, chain + ## HITS:1 COG:prpB KEGG:ns NR:ns ## COG: prpB COG2513 # Protein_GI_number: 16128316 # Func_class: G Carbohydrate transport and metabolism # Function: PEP phosphonomutase and related enzymes # Organism: Escherichia coli K12 # 1 296 1 296 296 562 100.0 1e-160 MSLHSPGKAFRAALTKENPLQIVGTINANHALLAQRAGYQAIYLSGGGVAAGSLGLPDLG ISTLDDVLTDIRRITDVCSLPLLVDADIGFGSSAFNVARTVKSMIKAGAAGLHIEDQVGA KRCGHRPNKAIVSKEEMVDRIRAAVDAKTDPDFVIMARTDALAVEGLDAAIERAQAYVEA GAEMLFPEAITELAMYRQFADAVQVPILANITEFGATPLFTTDELRSAHVAMALYPLSAF 
RAMNRAAEHVYNVLRQEGTQKSVIDTMQTRNELYESINYYQYEEKLDNLFARSQVK >gi|296494599|gb|ADTN01000139.1| GENE 9 9739 - 10908 1303 389 aa, chain + ## HITS:1 COG:prpC KEGG:ns NR:ns ## COG: prpC COG0372 # Protein_GI_number: 16128318 # Func_class: C Energy production and conversion # Function: Citrate synthase # Organism: Escherichia coli K12 # 1 389 1 389 389 780 100.0 0 MSDTTILQNSTHVIKPKKSVALSGVPAGNTALCTVGKSGNDLHYRGYDILDLAKHCEFEE VAHLLIHGKLPTRDELAAYKTKLKALRGLPANVRTVLEALPAASHPMDVMRTGVSALGCT LPEKEGHTVSGARDIADKLLASLSSILLYWYHYSHNGERIQPETDDDSIGGHFLHLLHGE KPSQSWEKAMHISLVLYAEHEFNASTFTSRVIAGTGSDMYSAIIGAIGALRGPKHGGANE VSLEIQQRYETPDEAEADIRKRVENKEVVIGFGHPVYTIADPRHQVIKRVAKQLSQEGGS LKMYNIADRLETVMWESKKMFPNLDWFSAVSYNMMGVPTEMFTPLFVIARVTGWAAHIIE QRQDNKIIRPSANYVGPEDRPFVALDKRQ >gi|296494599|gb|ADTN01000139.1| GENE 10 10942 - 12393 1671 483 aa, chain + ## HITS:1 COG:prpD KEGG:ns NR:ns ## COG: prpD COG2079 # Protein_GI_number: 16128319 # Func_class: R General function prediction only # Function: Uncharacterized protein involved in propionate catabolism # Organism: Escherichia coli K12 # 1 483 1 483 483 1008 100.0 0 MSAQINNIRPEFDREIVDIVDYVMNYEISSKVAYDTAHYCLLDTLGCGLEALEYPACKKL LGPIVPGTVVPNGVRVPGTQFQLDPVQAAFNIGAMIRWLDFNDTWLAAEWGHPSDNLGGI LATADWLSRNAVASGKAPLTMKQVLTAMIKAHEIQGCIALENSFNRVGLDHVLLVKVAST AVVAEMLGLTREEILNAVSLAWVDGQSLRTYRHAPNTGTRKSWAAGDATSRAVRLALMAK TGEMGYPSALTAPVWGFYDVSFKGESFRFQRPYGSYVMENVLFKISFPAEFHSQTAVEAA MTLYEQMQAAGKTAADIEKVTIRTHEACIRIIDKKGPLNNPADRDHCIQYMVAIPLLFGR LTAADYEDNVAQDKRIDALREKINCFEDPAFTADYHDPEKRAIANAITLEFTDGTRFEEV VVEYPIGHARRRQDGIPKLVDKFKINLARQFPTRQQQRILEVSLDRARLEQMPVNEYLDL YVI >gi|296494599|gb|ADTN01000139.1| GENE 11 12433 - 14319 1922 628 aa, chain + ## HITS:1 COG:prpE KEGG:ns NR:ns ## COG: prpE COG0365 # Protein_GI_number: 16128320 # Func_class: I Lipid transport and metabolism # Function: Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases # Organism: Escherichia coli K12 # 1 628 1 628 628 1283 99.0 0 MSFSEFYQRSINEPEQFWAEQARRIDWQTPFTQTLDHSNPPFARWFCEGRTNLCHNAIDR 
WLEKQPEALALIAVSSETEEERTFTFRQLHDEVNAVASMLRSLGVQRGDRVLVYMPMIAE AHITLLACARIGAIHSVVFGGFASHSVAARIDDAKPVLIVSADAGARGGKIIPYKKLLDD AISQAQHQPRHVLLVDRGLAKMARVSGRDVDFASLRHQHIGARVPVAWLESNETSCILYT SGTTGKPKGVQRDVGGYAVALATSMDTIFGGKAGSVFFCASDIGWVVGHSYIVYAPLLAG MATIVYEGLPTWPDCGVWWKIVEKYQVSRMFSAPTAIRVLKKFPTAEIRKHDLSSLEVLY LAGEPLDEPTASWVSNTLDVPVIDNYWQTESGWPIMAIARGLDDRPTRLGSPGVPMYGYN VQLLNEVTGEPCGVNEKGMLVVEGPLPPGCIQTIWGDDGRFVKTYWSLFSRPVYATFDWG IRDADGYHFILGRTDDVINVAGHRLGTREIEESISSHPGVAEVAVVGVKDALKGQVAVAF VIPKESDSLEDRDVAHSQEKAIMALVDSQIGNFGRPAHVWFVSQLPKTRSGKMLRRTIQA ICEGRDPGDLTTIDDPASLDQIRQAMEE >gi|296494599|gb|ADTN01000139.1| GENE 12 14649 - 15908 1439 419 aa, chain + ## HITS:1 COG:codB KEGG:ns NR:ns ## COG: codB COG1457 # Protein_GI_number: 16128321 # Func_class: F Nucleotide transport and metabolism # Function: Purine-cytosine permease and related proteins # Organism: Escherichia coli K12 # 1 419 1 419 419 672 100.0 0 MSQDNNFSQGPVPQSARKGVLALTFVMLGLTFFSASMWTGGTLGTGLSYHDFFLAVLIGN LLLGIYTSFLGYIGAKTGLTTHLLARFSFGVKGSWLPSLLLGGTQVGWFGVGVAMFAIPV GKATGLDINLLIAVSGLLMTVTVFFGISALTVLSVIAVPAIACLGGYSVWLAVNGMGGLD ALKAVVPAQPLDFNVALALVVGSFISAGTLTADFVRFGRNAKLAVLVAMVAFFLGNSLMF IFGAAGAAALGMADISDVMIAQGLLLPAIVVLGLNIWTTNDNALYASGLGFANITGMSSK TLSVINGIIGTVCALWLYNNFVGWLTFLSAAIPPVGGVIIADYLMNRRRYEHFATTRMMS VNWVAILAVALGIAAGHWLPGIVPVNAVLGGALSYLILNPILNRKTTAAMTHVEANSVE >gi|296494599|gb|ADTN01000139.1| GENE 13 15898 - 17181 1452 427 aa, chain + ## HITS:1 COG:codA KEGG:ns NR:ns ## COG: codA COG0402 # Protein_GI_number: 16128322 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Escherichia coli K12 # 1 427 1 427 427 882 100.0 0 MSNNALQTIINARLPGEEGLWQIHLQDGKISAIDAQSGVMPITENSLDAEQGLVIPPFVE PHIHLDTTQTAGQPNWNQSGTLFEGIERWAERKALLTHDDVKQRAWQTLKWQIANGIQHV RTHVDVSDATLTALKAMLEVKQEVAPWIDLQIVAFPQEGILSYPNGEALLEEALRLGADV VGAIPHFEFTREYGVESLHKTFALAQKYDRLIDVHCDEIDDEQSRFVETVAALAHHEGMG 
ARVTASHTTAMHSYNGAYTSRLFRLLKMSGINFVANPLVNIHLQGRFDTYPKRRGITRVK EMLESGINVCFGHDDVFDPWYPLGTANMLQVLHMGLHVCQLMGYGQINDGLNLITHHSAR TLNLQDYGIAAGNSANLIILPAENGFDALRRQVPVRYSVRGGKVIASTQPAQTTVYLEQP EAIDYKR Prediction of potential genes in microbial genomes Time: Sun May 15 23:36:56 2011 Seq name: gi|296494598|gb|ADTN01000140.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont288.2, whole genome shotgun sequence Length of sequence - 2462 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 930 369 ## COG0583 Transcriptional regulator - Prom 958 - 1017 2.7 + Prom 851 - 910 4.2 2 2 Op 1 4/1.000 + CDS 1003 - 1662 454 ## COG0288 Carbonic anhydrase 3 2 Op 2 3/1.000 + CDS 1693 - 2163 714 ## COG1513 Cyanate lyase 4 2 Op 3 . + CDS 2196 - 2460 295 ## COG2807 Cyanate permease Predicted protein(s) >gi|296494598|gb|ADTN01000140.1| GENE 1 3 - 930 369 309 aa, chain - ## HITS:1 COG:cynR KEGG:ns NR:ns ## COG: cynR COG0583 # Protein_GI_number: 16128323 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 13 283 1 271 299 499 99.0 1e-141 MMSSIGFTYRLRMLSRHINYFLAVAEHGSFTRAASALHVSQPALSQQIRQLEESLGVPLF DRSGRTIRLTDAGEVWRQYASRALQELGAGKRAIHDVADLTRGSLRIAVTPTFTSYFIGP LMADFYARYPSITLQLQEMSQEKIEDMLCRDELDVGIAFAPVHSPELEAIPLLTESLALV VAQHHPLAVHEQVALSRLHDEKLVLLSAEFATREQIDHYCEKAGLHPQVVIEANSISAVL ELIRRTSLSTLLPAAIATQHDGLKAISLASPLLERTAVLLRRKIAGRQPPRRHFCTWRWI NARLLAEMN >gi|296494598|gb|ADTN01000140.1| GENE 2 1003 - 1662 454 219 aa, chain + ## HITS:1 COG:ZcynT KEGG:ns NR:ns ## COG: ZcynT COG0288 # Protein_GI_number: 15800068 # Func_class: P Inorganic ion transport and metabolism # Function: Carbonic anhydrase # Organism: Escherichia coli O157:H7 EDL933 # 1 219 1 219 219 429 100.0 1e-120 MKEIIDGFLKFQREAFPKREALFKQLATQQSPRTLFISCSDSRLVPELVTQREPGDLFVI RNAGNIVPSYGPEPGGVSASVEYAVAALRVSDIVICGHSNCGAMTAIASCQCMDHMPAVS 
HWLRYADSARVVNEARPHSDLPSKAAAMVRENVIAQLANLQTHPSVRLALEEGRIALHGW VYDIESGSIAAFDGATRQFVPLAANPRVCAIPLRQPTAA >gi|296494598|gb|ADTN01000140.1| GENE 3 1693 - 2163 714 156 aa, chain + ## HITS:1 COG:cynS KEGG:ns NR:ns ## COG: cynS COG1513 # Protein_GI_number: 16128325 # Func_class: P Inorganic ion transport and metabolism # Function: Cyanate lyase # Organism: Escherichia coli K12 # 1 156 1 156 156 291 100.0 3e-79 MIQSQINRNIRLDLADAILLSKAKKDLSFAEIADGTGLAEAFVTAALLGQQALPADAARL VGAKLDLDEDSILLLQMIPLRGCIDDRIPTDPTMYRFYEMLQVYGTTLKALVHEKFGDGI ISAINFKLDVKKVADPEGGERAVITLDGKYLPTKPF >gi|296494598|gb|ADTN01000140.1| GENE 4 2196 - 2460 295 88 aa, chain + ## HITS:1 COG:ECs0394 KEGG:ns NR:ns ## COG: ECs0394 COG2807 # Protein_GI_number: 15829648 # Func_class: P Inorganic ion transport and metabolism # Function: Cyanate permease # Organism: Escherichia coli O157:H7 # 1 87 1 87 384 141 100.0 3e-34 MLLVLVLIGLNMRPLLTSVGPLLPQLRQASGMSFSVAALLTALPVVTMGGLALAGSWLHQ HVSERRSVAISLLLIAVGALMRELYPQK Prediction of potential genes in microbial genomes Time: Sun May 15 23:36:58 2011 Seq name: gi|296494597|gb|ADTN01000141.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont288.3, whole genome shotgun sequence Length of sequence - 5818 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 89 - 871 678 ## COG2807 Cyanate permease + Term 937 - 973 3.5 2 2 Op 1 4/0.000 - CDS 974 - 1489 255 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 3 2 Op 2 3/1.000 - CDS 1651 - 2904 1316 ## COG0477 Permeases of the major facilitator superfamily - Term 2916 - 2948 5.4 4 2 Op 3 . 
- CDS 2956 - 5817 2448 ## COG3250 Beta-galactosidase/beta-glucuronidase Predicted protein(s) >gi|296494597|gb|ADTN01000141.1| GENE 1 89 - 871 678 260 aa, chain + ## HITS:1 COG:cynX KEGG:ns NR:ns ## COG: cynX COG2807 # Protein_GI_number: 16128326 # Func_class: P Inorganic ion transport and metabolism # Function: Cyanate permease # Organism: Escherichia coli K12 # 1 260 125 384 384 410 100.0 1e-114 MGLWSAALMGGGGLGAAITPWLVQHSETWYQTLAWWALPAVVALFAWWWQSAREVASSHK TTTTPVRVVFTPRAWTLGVYFGLINGGYASLIAWLPAFYIEIGASAQYSGSLLALMTLGQ AAGALLMPAMARHQDRRKLLMLALVLQLVGFCGFIWLPMQLPVLWAMVCGLGLGGAFPLC LLLALDHSVQPAIAGKLVAFMQGIGFIIAGLAPWFSGVLRSISGNYLMDWAFHALCVVGL MIITLRFAPVRFPQLWVKEA >gi|296494597|gb|ADTN01000141.1| GENE 2 974 - 1489 255 171 aa, chain - ## HITS:1 COG:lacA KEGG:ns NR:ns ## COG: lacA COG0110 # Protein_GI_number: 16128327 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Escherichia coli K12 # 1 171 33 203 203 339 100.0 1e-93 MYEFNHSHPSEVEKRESLIKEMFATVGENAWVEPPVYFSYGSNIHIGRNFYANFNLTIVD DYTVTIGDNVLIAPNVTLSVTGHPVHHELRKNGEMYSFPITIGNNVWIGSHVVINPGVTI GDNSVIGAGSIVTKDIPPNVVAAGVPCRVIREINDRDKHYYFKDYKVESSV >gi|296494597|gb|ADTN01000141.1| GENE 3 1651 - 2904 1316 417 aa, chain - ## HITS:1 COG:lacY KEGG:ns NR:ns ## COG: lacY COG0477 # Protein_GI_number: 16128328 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 417 1 417 417 686 99.0 0 MYYLKNTNFWMFGLFFFFYFFIMGAYFPFFPIWLHDINHISKSDTGIIFAAISLFSLLFQ PLFGLLSDKLGLRKYLLWIITGMLVMFAPFFIFIFGPLLQYNILVGSIVGGIYLGFCFNA GAPAVEAFIDKVSRRSNFEFGRARMFGCVGWALCASIVGIMFTINNQFVFWLGSGCALIL AVLLFFAKTDAPSSATVANAVGANHSAFSLKLALELFRQPKLWFLSLYVIGVSCTYDVFD QQFANFFTSFFATGEQGTRVFGYVTTMGELLNASIMFFAPLIINRIGGKNALLLAGTIMS VRIIGSSFATSALEVVILKTLHMFEVPFLLVGCFKYITSQFEVRFSATIYLVCFCFFKQL 
AMIFMSVLAGNMYESIGFQGAYLVLGLVALGFTLISVFTLSGPGPLSLLRRQVNEVA >gi|296494597|gb|ADTN01000141.1| GENE 4 2956 - 5817 2448 953 aa, chain - ## HITS:1 COG:lacZ KEGG:ns NR:ns ## COG: lacZ COG3250 # Protein_GI_number: 16128329 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Escherichia coli K12 # 1 953 72 1024 1024 1963 100.0 0 ESWLECDLPEADTVVVPSNWQMHGYDAPIYTNVTYPITVNPPFVPTENPTGCYSLTFNVD ESWLQEGQTRIIFDGVNSAFHLWCNGRWVGYGQDSRLPSEFDLSAFLRAGENRLAVMVLR WSDGSYLEDQDMWRMSGIFRDVSLLHKPTTQISDFHVATRFNDDFSRAVLEAEVQMCGEL RDYLRVTVSLWQGETQVASGTAPFGGEIIDERGGYADRVTLRLNVENPKLWSAEIPNLYR AVVELHTADGTLIEAEACDVGFREVRIENGLLLLNGKPLLIRGVNRHEHHPLHGQVMDEQ TMVQDILLMKQNNFNAVRCSHYPNHPLWYTLCDRYGLYVVDEANIETHGMVPMNRLTDDP RWLPAMSERVTRMVQRDRNHPSVIIWSLGNESGHGANHDALYRWIKSVDPSRPVQYEGGG ADTTATDIICPMYARVDEDQPFPAVPKWSIKKWLSLPGETRPLILCEYAHAMGNSLGGFA KYWQAFRQYPRLQGGFVWDWVDQSLIKYDENGNPWSAYGGDFGDTPNDRQFCMNGLVFAD RTPHPALTEAKHQQQFFQFRLSGQTIEVTSEYLFRHSDNELLHWMVALDGKPLASGEVPL DVAPQGKQLIELPELPQPESAGQLWLTVRVVQPNATAWSEAGHISAWQQWRLAENLSVTL PAASHAIPHLTTSEMDFCIELGNKRWQFNRQSGFLSQMWIGDKKQLLTPLRDQFTRAPLD NDIGVSEATRIDPNAWVERWKAAGHYQAEAALLQCTADTLADAVLITTAHAWQHQGKTLF ISRKTYRIDGSGQMAITVDVEVASDTPHPARIGLNCQLAQVAERVNWLGLGPQENYPDRL TAACFDRWDLPLSDMYTPYVFPSENGLRCGTRELNYGPHQWRGDFQFNISRYSQQQLMET SHRHLLHAEEGTWLNIDGFHMGIGGDDSWSPSVSAEFQLSAGRYHYQLVWCQK Prediction of potential genes in microbial genomes Time: Sun May 15 23:37:02 2011 Seq name: gi|296494596|gb|ADTN01000142.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont288.4, whole genome shotgun sequence Length of sequence - 8856 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 3, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 2 2 Op 1 . - CDS 334 - 1425 1089 ## COG1609 Transcriptional regulators 3 2 Op 2 . - CDS 1493 - 2326 696 ## COG1414 Transcriptional regulator - Prom 2419 - 2478 6.1 + Prom 2367 - 2426 5.8 4 3 Op 1 . 
+ CDS 2517 - 4175 1803 ## COG0654 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases 5 3 Op 2 . + CDS 4182 - 5126 855 ## SSON_0295 3-(2,3-dihydroxyphenyl)propionate dioxygenase (EC:1.13.11.16) 6 3 Op 3 1/1.000 + CDS 5144 - 6010 1024 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 7 3 Op 4 4/0.000 + CDS 6020 - 6829 882 ## COG3971 2-keto-4-pentenoate hydratase 8 3 Op 5 6/0.000 + CDS 6826 - 7776 983 ## COG4569 Acetaldehyde dehydrogenase (acetylating) 9 3 Op 6 . + CDS 7773 - 8786 1249 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases Predicted protein(s) >gi|296494596|gb|ADTN01000142.1| GENE 1 1 - 211 190 70 aa, chain - ## HITS:1 COG:lacZ KEGG:ns NR:ns ## COG: lacZ COG3250 # Protein_GI_number: 16128329 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Escherichia coli K12 # 1 70 1 70 1024 141 100.0 2e-34 MTMITDSLAVVLQRRDWENPGVTQLNRLAAHPPFASWRNSEEARTDRPSQQLRSLNGEWR FAWFPAPEAV >gi|296494596|gb|ADTN01000142.1| GENE 2 334 - 1425 1089 363 aa, chain - ## HITS:1 COG:lacI KEGG:ns NR:ns ## COG: lacI COG1609 # Protein_GI_number: 16128330 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 4 363 1 360 360 645 99.0 0 MVNVKPVTLYDVAEYAGVSYQTVSRVVNQASHVSAKTREKVEAAMAELNYIPNRVAQQLA GKQSLLIGVATSSLALHAPSQIVAAIKSRADQLGASVVVSMVERSGVEACKAAVHNLLAQ RVSGLIINYPLDDQDAIAVEAACTNVPALFLDVSDQTPINSIIFSHEDGTRLGVEHLVAL GHQQIALLAGPLSSVSARLRLAGWHKYLTRNQIQPIAEREGDWSAMSGFQQTMQMLNEGI VPTAMLVANDQMALGAMRAITESGLRVGADISVVGYDDTEDSSCYIPPLTTIKQDFRLLG QTSVDRLLQLSQGQAVKGNQLLPVSLVKRKTTLAPNTQTASPRALADSLMQLARQVSRLE SGQ >gi|296494596|gb|ADTN01000142.1| GENE 3 1493 - 2326 696 277 aa, chain - ## HITS:1 COG:mhpR KEGG:ns NR:ns ## COG: mhpR COG1414 # Protein_GI_number: 16128331 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 277 39 315 315 507 100.0 1e-143 
MQNNEQTEYKTVRGLTRGLMLLNMLNKLDGGASVGLLAELSGLHRTTVRRLLETLQEEGY VRRSPSDDSFRLTIKVRQLSEGFRDEQWISALAAPLLGDLLREVVWPTDVSTLDVDAMVV RETTHRFSRLSFHRAMVGRRLPLLKTASGLTWLAFCPEQDRKELIEMLASRPGDDYQLAR EPLKLEAILARARKEGYGQNYRGWDQEEKIASIAVPLRSEQRVIGCLNLVYMASAMTIEQ AAEKHLPALQRVAKQIEEGVESQAILVAGRRSGMHLR >gi|296494596|gb|ADTN01000142.1| GENE 4 2517 - 4175 1803 552 aa, chain + ## HITS:1 COG:mhpA KEGG:ns NR:ns ## COG: mhpA COG0654 # Protein_GI_number: 16128332 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenyl-6-methoxyphenol hydroxylase and related FAD-dependent oxidoreductases # Organism: Escherichia coli K12 # 1 552 1 552 554 1135 99.0 0 MAIQHPDIQPAVNHSVQVAIAGAGPVGLMMANYLGQMGIDVLVVEKLDKLIDYPRAIGID DEALRTMQSVGLVDDVLPHTTPWHAMRFLTPKGRCFADIQPMTDEFGWPRRNAFIQPQVD AVMLEGVSRFPNVRCLFSRELEAFSQQDDEVTLHLKTAEGQREIVKAQWLVACDGGASFV RRTLNVPFEGKTAPNQWIVVDIANDPLSTPHIYLCCDPVRPYVSAALPHAVRRFEFMVMP GETEEQLREPQNMRKLLSKVLPNPDNVELIRQRVYTHNARLAQRFRIDRVLLAGDAAHIM PVWQGQGYNSGMRDAFNLAWKLALVIQGKARDALLDTYQQERRDHAKAMIDLSVTAGNVL APPKRWQGTLRDGVSWLLNYLPPVKRYFLEMRFKPMPQYYGGALMREGEAKHSPVGKMFI QPKVTLENGDVTLLDNAIGANFAVIGWGCNPLWGMSDEQIQQWRALGTRFIQVVPEVQIH TAQDNHDGVLRVGDTQGRLRSWFAQHNASLVVMRPDRFVAATAIPQTLGKTLNKLASVMM LTRPDADVSVER >gi|296494596|gb|ADTN01000142.1| GENE 5 4182 - 5126 855 314 aa, chain + ## HITS:1 COG:no KEGG:SSON_0295 NR:ns ## KEGG: SSON_0295 # Name: mhpB # Def: 3-(2,3-dihydroxyphenyl)propionate dioxygenase (EC:1.13.11.16) # Organism: S.sonnei # Pathway: Phenylalanine metabolism [PATH:ssn00360] # 1 314 1 314 314 640 100.0 0 MHAYLHCLSHSPLVGYVDPAQEVLDEVNGVIASARERIAAFSPELVVLFAPDHYNGFFYD VMPPFCLGVGATAIGDFGSAAGELPVPVELAEACAHAVMKSGIDLAVSYCMQVDHGFAQP LEFLLGGLDKVPVLPVFINGVATPLPGFQRTRMLGEAIGRFTSTLNKRVLFLGSGGLSHQ PPVPELAKADAHMRDRLLGSGKDLPASERELRQQRVISAAEKFVEDQRTLHPLNPIWDNQ FMTLLEQGRIQELDAVSNEELSAIAGKSTHEIKTWVAAFAAISAFGNWRSEGRYYRPIPE WIAGFGSLSARTEN >gi|296494596|gb|ADTN01000142.1| GENE 6 5144 - 6010 1024 288 aa, chain + ## HITS:1 COG:mhpC KEGG:ns NR:ns ## COG: mhpC 
COG0596 # Protein_GI_number: 16128334 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Escherichia coli K12 # 1 288 22 309 309 603 100.0 1e-172 MSYQPQTEAATSRFLNVEEAGKTLRIHFNDCGQGDETVVLLHGSGPGATGWANFSRNIDP LVEAGYRVILLDCPGWGKSDSVVNSGSRSDLNARILKSVVDQLDIAKIHLLGNSMGGHSS VAFTLKWPERVGKLVLMGGGTGGMSLFTPMPTEGIKRLNQLYRQPTIENLKLMMDIFVFD TSDLTDALFEARLNNMLSRRDHLENFVKSLEANPKQFPDFGPRLAEIKAQTLIVWGRNDR FVPMDAGLRLLSGIAGSELHIFRDCGHWAQWEHADAFNQLVLNFLARP >gi|296494596|gb|ADTN01000142.1| GENE 7 6020 - 6829 882 269 aa, chain + ## HITS:1 COG:mhpD KEGG:ns NR:ns ## COG: mhpD COG3971 # Protein_GI_number: 16128335 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: 2-keto-4-pentenoate hydratase # Organism: Escherichia coli K12 # 1 269 3 271 271 538 100.0 1e-153 MTKHTLEQLAADLRRAAEQGEAIAPLRDLIGIDNAEAAYAIQHINVQHDVAQGRRVVGRK VGLTHPKVQQQLGVDQPDFGTLFADMCYGDNEIIPFSRVLQPRIEAEIALVLNRDLPATD ITFDELYNAIEWVLPALEVVGSRIRDWSIQFVDTVADNASCGVYVIGGPAQRPAGLDLKN CAMKMTRNNEEVSSGRGSECLGHPLNAAVWLARKMASLGEPLRTGDIILTGALGPMVAVN AGDRFEAHIEGIGSVAATFSSAAPKGSLS >gi|296494596|gb|ADTN01000142.1| GENE 8 6826 - 7776 983 316 aa, chain + ## HITS:1 COG:mhpF KEGG:ns NR:ns ## COG: mhpF COG4569 # Protein_GI_number: 16128336 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acetaldehyde dehydrogenase (acetylating) # Organism: Escherichia coli K12 # 1 316 1 316 316 578 100.0 1e-165 MSKRKVAIIGSGNIGTDLMIKILRHGQHLEMAVMVGIDPQSDGLARARRMGVATTHEGVI GLMNMPEFADIDIVFDATSAGAHVKNDAALREAKPDIRLIDLTPAAIGPYCVPVVNLEAN VDQLNVNMVTCGGQATIPMVAAVSRVARVHYAEIIASIASKSAGPGTRANIDEFTETTSR AIEVVGGAAKGKAIIVLNPAEPPLMMRDTVYVLSDEASQDDIEASINEMAEAVQAYVPGY RLKQRVQFEVIPQDKPVNLPGVGQFSGLKTAVWLEVEGAAHYLPAYAGNLDIMTSSALAT AEKMAQSLARKAGEAA >gi|296494596|gb|ADTN01000142.1| GENE 9 7773 - 8786 1249 337 aa, chain + ## HITS:1 COG:mhpE KEGG:ns NR:ns ## COG: mhpE COG0119 # Protein_GI_number: 16128337 # 
Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Escherichia coli K12 # 1 337 1 337 337 631 100.0 0 MNGKKLYISDVTLRDGMHAIRHQYSLENVRQIAKALDDARVDSIEVAHGDGLQGSSFNYG FGAHSDLEWIEAAADVVKHAKIATLLLPGIGTIHDLKNAWQAGARVVRVATHCTEADVSA QHIQYARELGMDTVGFLMMSHMTTPENLAKQAKLMEGYGATCIYVVDSGGAMNMSDIRDR FRALKAELKPETQTGMHAHHNLSLGVANSIAAVEEGCDRIDASLAGMGAGAGNAPLEVFI AAADKLGWQHGTDLYALMDAADDLVRPLQDRPVRVDRETLALGYAGVYSSFLRHCETAAA RYGLSAVDILVELGKRRMVGGQEDMIVDVALDLRNNK Prediction of potential genes in microbial genomes Time: Sun May 15 23:37:13 2011 Seq name: gi|296494595|gb|ADTN01000143.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont290.1, whole genome shotgun sequence Length of sequence - 18273 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 8, operones - 4 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 187 - 246 3.6 1 1 Op 1 17/0.000 + CDS 341 - 1930 1844 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) 2 1 Op 2 . + CDS 1942 - 3231 1615 ## COG0151 Phosphoribosylamine-glycine ligase + Term 3277 - 3316 1.1 3 2 Op 1 13/0.000 - CDS 3228 - 4553 1305 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 4 2 Op 2 . - CDS 4550 - 5938 1243 ## COG0642 Signal transduction histidine kinase - Prom 6097 - 6156 3.5 + Prom 6048 - 6107 2.0 5 3 Tu 1 . + CDS 6185 - 6610 416 ## COG3678 P pilus assembly/Cpx signaling pathway, periplasmic inhibitor/zinc-resistance associated protein - Term 6348 - 6382 -0.7 6 4 Tu 1 . 
- CDS 6612 - 7247 432 ## c4958 hypothetical protein - Term 7259 - 7292 3.8 7 5 Op 1 6/0.000 - CDS 7320 - 7592 415 ## COG0776 Bacterial nucleoid DNA-binding protein - Prom 7697 - 7756 1.8 8 5 Op 2 4/0.667 - CDS 7779 - 8369 637 ## COG3068 Uncharacterized protein conserved in bacteria 9 5 Op 3 4/0.667 - CDS 8412 - 9083 644 ## COG1515 Deoxyinosine 3'endonuclease (endonuclease V) 10 5 Op 4 5/0.333 - CDS 9093 - 10157 1266 ## COG0407 Uroporphyrinogen-III decarboxylase 11 5 Op 5 . - CDS 10197 - 10970 597 ## COG2816 NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding - Prom 11189 - 11248 2.8 + Prom 10952 - 11011 3.2 12 6 Tu 1 4/0.667 + CDS 11065 - 11541 470 ## COG3160 Regulator of sigma D + Term 11576 - 11604 -0.7 + Prom 11545 - 11604 3.5 13 7 Op 1 8/0.000 + CDS 11774 - 13669 2060 ## COG0422 Thiamine biosynthesis protein ThiC 14 7 Op 2 3/0.667 + CDS 13669 - 14304 690 ## COG0352 Thiamine monophosphate synthase 15 7 Op 3 5/0.333 + CDS 14297 - 15052 707 ## COG0476 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 16 7 Op 4 16/0.000 + CDS 15036 - 15236 227 ## COG2104 Sulfur transfer protein involved in thiamine biosynthesis 17 7 Op 5 5/0.333 + CDS 15238 - 16008 790 ## COG2022 Uncharacterized enzyme of thiazole biosynthesis 18 7 Op 6 . + CDS 16005 - 17138 1028 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes + Term 17251 - 17295 4.3 - Term 17373 - 17413 5.6 19 8 Tu 1 . 
- CDS 17548 - 18087 189 ## B21_03818 hypothetical protein - Prom 18210 - 18269 4.2 Predicted protein(s) >gi|296494595|gb|ADTN01000143.1| GENE 1 341 - 1930 1844 529 aa, chain + ## HITS:1 COG:purH KEGG:ns NR:ns ## COG: purH COG0138 # Protein_GI_number: 16131836 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Escherichia coli K12 # 1 529 1 529 529 1033 100.0 0 MQQRRPVRRALLSVSDKAGIVEFAQALSARGVELLSTGGTARLLAEKGLPVTEVSDYTGF PEMMDGRVKTLHPKVHGGILGRRGQDDAIMEEHQIQPIDMVVVNLYPFAQTVAREGCSLE DAVENIDIGGPTMVRSAAKNHKDVAIVVKSSDYDAIIKEMDDNEGSLTLATRFDLAIKAF EHTAAYDSMIANYFGSMVPAYHGESKEAAGRFPRTLNLNFIKKLDMRYGENSHQQAAFYI EENVKEASVATATQVQGKALSYNNIADTDAALECVKEFAEPACVIVKHANPCGVAIGNSI LDAYDRAYKTDPTSAFGGIIAFNRELDAETAQAIISRQFVEVIIAPSASEEALKITAAKQ NVRVLTCGQWGERVPGLDFKRVNGGLLVQDRDLGMVGAEELRVVTKRQPSEQELRDALFC WKVAKFVKSNAIVYAKNNMTIGIGAGQMSRVYSAKIAGIKAADEGLEVKGSSMASDAFFP FRDGIDAAAAAGVTCVIQPGGSIRDDEVIAAADEHGIAMLFTDMRHFRH >gi|296494595|gb|ADTN01000143.1| GENE 2 1942 - 3231 1615 429 aa, chain + ## HITS:1 COG:purD KEGG:ns NR:ns ## COG: purD COG0151 # Protein_GI_number: 16131835 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylamine-glycine ligase # Organism: Escherichia coli K12 # 1 429 1 429 429 854 100.0 0 MKVLVIGNGGREHALAWKAAQSPLVETVFVAPGNAGTALEPALQNVAIGVTDIPALLDFA QNEKIDLTIVGPEAPLVKGVVDTFRAAGLKIFGPTAGAAQLEGSKAFTKDFLARHKIPTA EYQNFTEVEPALAYLREKGAPIVIKADGLAAGKGVIVAMTLEEAEAAVHDMLAGNAFGDA GHRIVIEEFLDGEEASFIVMVDGEHVLPMATSQDHKRVGDKDTGPNTGGMGAYSPAPVVT DDVHQRTMERIIWPTVKGMAAEGNTYTGFLYAGLMIDKQGNPKVIEFNCRFGDPETQPIM LRMKSDLVELCLAACESKLDEKTSEWDERASLGVVMAAGGYPGDYRTGDVIHGLPLEEVA GGKVFHAGTKLADDEQVVTNGGRVLCVTALGHTVAEAQKRAYALMTDIHWDDCFCRKDIG WRAIEREQN >gi|296494595|gb|ADTN01000143.1| GENE 3 3228 - 4553 1305 441 aa, chain - ## HITS:1 COG:hydG KEGG:ns NR:ns ## COG: hydG COG2204 # Protein_GI_number: 16131834 # Func_class: T Signal transduction mechanisms # Function: Response 
regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 1 441 1 441 441 837 100.0 0 MTHDNIDILVVDDDISHCTILQALLRGWGYNVALANSGRQALEQVREQVFDLVLCDVRMA EMDGIATLKEIKALNPAIPVLIMTAYSSVETAVEALKTGALDYLIKPLDFDNLQATLEKA LAHTHSIDAETPAVTASQFGMVGKSPAMQHLLSEIALVAPSEATVLIHGDSGTGKELVAR AIHASSARSEKPLVTLNCAALNESLLESELFGHEKGAFTGADKRREGRFVEADGGTLFLD EIGDISPMMQVRLLRAIQEREVQRVGSNQIISVDVRLIAATHRDLAAEVNAGRFRQDLYY RLNVVAIEVPSLRQRREDIPLLAGHFLQRFAERNRKAVKGFTPQAMDLLIHYDWPGNIRE LENAVERAVVLLTGEYISERELPLAIASTPIPLGQSQDIQPLVEVEKEVILAALEKTGGN KTEAARQLGITRKTLLAKLSR >gi|296494595|gb|ADTN01000143.1| GENE 4 4550 - 5938 1243 462 aa, chain - ## HITS:1 COG:hydH KEGG:ns NR:ns ## COG: hydH COG0642 # Protein_GI_number: 16131833 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 462 4 465 465 899 100.0 0 MQRSKDSLAKWLSAILPVVIVGLVGLFAVTVIRDYGRASEADRQALLEKGNVLIRALESG SRVGMGMRMHHVQQQALLEEMAGQPGVLWFAVTDAQGIIILHSDPDKVGRALYSPDEMQK LKPEENSRWRLLGKTETTPALEVYRLFQPMSAPWRHGMHNMPRCNGKAVPQVDAQQAIFI AVDASDLVATQSGEKRNTLIILFALATVLLASVLSFFWYRRYLRSRQLLQDEMKRKEKLV ALGHLAAGVAHEIRNPLSSIKGLAKYFAERAPAGGEAHQLAQVMAKEADRLNRVVSELLE LVKPTHLALQAVDLNTLINHSLQLVSQDANSREIQLRFTANDTLPEIQADPDRLTQVLLN LYLNAIQAIGQHGVISVTASESGAGVKISVTDSGKGIAADQLDAIFTPYFTTKAEGTGLG LAVVHNIVEQHGGTIQVASQEGKGSTFTLWLPVNITRKDPQG >gi|296494595|gb|ADTN01000143.1| GENE 5 6185 - 6610 416 141 aa, chain + ## HITS:1 COG:ECs4925 KEGG:ns NR:ns ## COG: ECs4925 COG3678 # Protein_GI_number: 15834179 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; N Cell motility; T Signal transduction mechanisms; P Inorganic ion transport and metabolism # Function: P pilus assembly/Cpx signaling pathway, periplasmic inhibitor/zinc-resistance associated protein # Organism: Escherichia coli O157:H7 # 1 141 48 188 188 219 100.0 1e-57 MKRNTKIALVMMALSAMAMGSTSAFAHGGHGMWQQNAAPLTSEQQTAWQKIHNDFYAQSS 
ALQQQLVTKRYEYNALLAANPPDSSKINAVAKEMENLRQSLDELRVKRDIAMAEAGIPRG AGMGMGYGGCGGGGHMGMGHW >gi|296494595|gb|ADTN01000143.1| GENE 6 6612 - 7247 432 211 aa, chain - ## HITS:1 COG:no KEGG:c4958 NR:ns ## KEGG: c4958 # Name: yjaH # Def: hypothetical protein # Organism: E.coli_CFT073 # Pathway: not_defined # 1 211 23 233 233 416 99.0 1e-115 MLAGALLLTACSHNSSLPPFTASGFAEDQGAVRIWRKDSGDNVHLLAVFSPWRSGDTTTR EYRWQGDNLTLININVYSKPPVNIRARFDDRGDLSFMQRESDGEKQQLSNDQIDLYRYRA DQIRQISDALRQGRVVLRQGRWHAMEQTVTTCEGQTIKPDLDSQAIAHIERRQSRSSVDV SVAWLEAPEGSQLLLVANSDFCRWQPNEKTF >gi|296494595|gb|ADTN01000143.1| GENE 7 7320 - 7592 415 90 aa, chain - ## HITS:1 COG:STM4170 KEGG:ns NR:ns ## COG: STM4170 COG0776 # Protein_GI_number: 16767424 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Salmonella typhimurium LT2 # 1 90 1 90 90 135 98.0 1e-32 MNKTQLIDVIAEKAELSKTQAKAALESTLAAITESLKEGDAVQLVGFGTFKVNHRAERTG RNPQTGKEIKIAAANVPAFVSGKALKDAVK >gi|296494595|gb|ADTN01000143.1| GENE 8 7779 - 8369 637 196 aa, chain - ## HITS:1 COG:yjaG KEGG:ns NR:ns ## COG: yjaG COG3068 # Protein_GI_number: 16131829 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 196 1 196 196 384 100.0 1e-107 MLQNPIHLRLERLESWQHVTFMACLCERMYPNYAMFCQQTGFGDGQIYRRILDLIWETLT VKDAKVNFDSQLEKFEEAIPSADDFDLYGVYPAIDACVALSELVHSRLSGETLEHAVEVS KTSITTVAMLEMTQAGREMSDEELKENPAVEQEWDIQWEIFRLLAECEERDIELIKGLRA DLREAGESNIGIIFQQ >gi|296494595|gb|ADTN01000143.1| GENE 9 8412 - 9083 644 223 aa, chain - ## HITS:1 COG:nfi KEGG:ns NR:ns ## COG: nfi COG1515 # Protein_GI_number: 16131828 # Func_class: L Replication, recombination and repair # Function: Deoxyinosine 3'endonuclease (endonuclease V) # Organism: Escherichia coli K12 # 1 223 3 225 225 442 100.0 1e-124 MDLASLRAQQIELASSVIREDRLDKDPPDLIAGADVGFEQGGEVTRAAMVLLKYPSLELV EYKVARIATTMPYIPGFLSFREYPALLAAWEMLSQKPDLVFVDGHGISHPRRLGVASHFG 
LLVDVPTIGVAKKRLCGKFEPLSSEPGALAPLMDKGEQLAWVWRSKARCNPLFIATGHRV SVDSALAWVQRCMKGYRLPEPTRWADAVASERPAFVRYTANQP >gi|296494595|gb|ADTN01000143.1| GENE 10 9093 - 10157 1266 354 aa, chain - ## HITS:1 COG:hemE KEGG:ns NR:ns ## COG: hemE COG0407 # Protein_GI_number: 16131827 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III decarboxylase # Organism: Escherichia coli K12 # 1 354 1 354 354 736 100.0 0 MTELKNDRYLRALLRQPVDVTPVWMMRQAGRYLPEYKATRAQAGDFMSLCKNAELACEVT LQPLRRYPLDAAILFSDILTVPDAMGLGLYFEAGEGPRFTSPVTCKADVDKLPIPDPEDE LGYVMNAVRTIRRELKGEVPLIGFSGSPWTLATYMVEGGSSKAFTVIKKMMYADPQALHA LLDKLAKSVTLYLNAQIKAGAQAVMIFDTWGGVLTGRDYQQFSLYYMHKIVDGLLRENDG RRVPVTLFTKGGGQWLEAMAETGCDALGLDWTTDIADARRRVGNKVALQGNMDPSMLYAP PARIEEEVATILAGFGHGEGHVFNLGHGIHQDVPPEHAGVFVEAVHRLSEQYHR >gi|296494595|gb|ADTN01000143.1| GENE 11 10197 - 10970 597 257 aa, chain - ## HITS:1 COG:yjaD KEGG:ns NR:ns ## COG: yjaD COG2816 # Protein_GI_number: 16131826 # Func_class: L Replication, recombination and repair # Function: NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding # Organism: Escherichia coli K12 # 1 257 1 257 257 531 99.0 1e-151 MDRIIEKLDHGWWVVSHEQKLWLPKGELPYGEAANFDLVGQRALQIGEWQGEPVWLVQQQ RRHDMGSVRQVIDLDVGLFQLAGRGVQLAEFYRSHKYCGYCGHEMYPSKTEWAMLCSHCR ERYYPQIAPCIIVAIRRDDSILLAQHTRHRNGVHTVLAGFVEVGETLEQAVAREVMEESG IKVKNLRYVTSQPWPFPQSLMTAFMAEYDSGDIVIDPKELLEANWYRYDDLPLLPPPGTV ARRLIEDTVAMCRAEYE >gi|296494595|gb|ADTN01000143.1| GENE 12 11065 - 11541 470 158 aa, chain + ## HITS:1 COG:ECs4918 KEGG:ns NR:ns ## COG: ECs4918 COG3160 # Protein_GI_number: 15834172 # Func_class: K Transcription # Function: Regulator of sigma D # Organism: Escherichia coli O157:H7 # 1 158 1 158 158 305 100.0 3e-83 MLNQLDNLTERVRGSNKLVDRWLHVRKHLLVAYYNLVGIKPGKESYMRLNEKALDDFCQS LVDYLSAGHFSIYERILHKLEGNGQLARAAKIWPQLEANTQQIMDYYDSSLETAIDHDNY LEFQQVLSDIGEALEARFVLEDKLILLVLDAARVKHPA >gi|296494595|gb|ADTN01000143.1| GENE 13 11774 - 13669 2060 631 aa, chain + ## HITS:1 COG:thiC KEGG:ns NR:ns ## COG: thiC 
COG0422 # Protein_GI_number: 16131824 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis protein ThiC # Organism: Escherichia coli K12 # 1 631 1 631 631 1313 100.0 0 MSATKLTRREQRARAQHFIDTLEGTAFPNSKRIYITGTHPGVRVPMREIQLSPTLIGGSK EQPQYEENEAIPVYDTSGPYGDPQIAINVQQGLAKLRQPWIDARGDTEELTVRSSDYTKA RLADDGLDELRFSGVLTPKRAKAGRRVTQLHYARQGIITPEMEFIAIRENMGRERIRSEV LRHQHPGMSFGAHLPENITAEFVRDEVAAGRAIIPANINHPESEPMIIGRNFLVKVNANI GNSAVTSSIEEEVEKLVWSTRWGADTVMDLSTGRYIHETREWILRNSPVPIGTVPIYQAL EKVNGIAEDLTWEAFRDTLLEQAEQGVDYFTIHAGVLLRYVPMTAKRLTGIVSRGGSIMA KWCLSHHQENFLYQHFREICEICAAYDVSLSLGDGLRPGSIQDANDEAQFAELHTLGELT KIAWEYDVQVMIEGPGHVPMQMIRRNMTEELEHCHEAPFYTLGPLTTDIAPGYDHFTSGI GAAMIGWFGCAMLCYVTPKEHLGLPNKEDVKQGLITYKIAAHAADLAKGHPGAQIRDNAM SKARFEFRWEDQFNLALDPFTARAYHDETLPQESGKVAHFCSMCGPKFCSMKISQEVRDY AATQTIEMGMADMSENFRARGGEIYLRKEEA >gi|296494595|gb|ADTN01000143.1| GENE 14 13669 - 14304 690 211 aa, chain + ## HITS:1 COG:thiE KEGG:ns NR:ns ## COG: thiE COG0352 # Protein_GI_number: 16131823 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate synthase # Organism: Escherichia coli K12 # 1 211 1 211 211 395 100.0 1e-110 MYQPDFPPVPFRSGLYPVVDSVQWIERLLDAGVRTLQLRIKDRRDEEVEADVVAAIALGR RYNARLFINDYWRLAIKHQAYGVHLGQEDLQATDLNAIRAAGLRLGVSTHDDMEIDVALA ARPSYIALGHVFPTQTKQMPSAPQGLEQLARHVERLADYPTVAIGGISLARAPAVIATGV GSIAVVSAITQAADWRLATAQLLEIAGVGDE >gi|296494595|gb|ADTN01000143.1| GENE 15 14297 - 15052 707 251 aa, chain + ## HITS:1 COG:thiF KEGG:ns NR:ns ## COG: thiF COG0476 # Protein_GI_number: 16131822 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 # Organism: Escherichia coli K12 # 7 251 1 245 245 444 100.0 1e-125 MNDRDFMRYSRQILLDDIALDGQQKLLDSQVLIIGLGGLGTPAALYLAGAGVGTLVLADD DDVHLSNLQRQILFTTEDIDRPKSQVSQQRLTQLNPDIQLTALQQRLTGEALKDAVARAD VVLDCTDNMATRQEINAACVALNTPLITASAVGFGGQLMVLTPPWEQGCYRCLWPDNQEP 
ERNCRTAGVVGPVVGVMGTLQALEAIKLLSGIETPAGELRLFDGKSSQWRSLALRRASGC PVCGGSNADPV >gi|296494595|gb|ADTN01000143.1| GENE 16 15036 - 15236 227 66 aa, chain + ## HITS:1 COG:thiS KEGG:ns NR:ns ## COG: thiS COG2104 # Protein_GI_number: 16132237 # Func_class: H Coenzyme transport and metabolism # Function: Sulfur transfer protein involved in thiamine biosynthesis # Organism: Escherichia coli K12 # 1 66 1 66 66 91 100.0 4e-19 MQILFNDQAMQCAAGQTVHELLEQLDQRQAGAALAINQQIVPREQWAQHIVQDGDQILLF QVIAGG >gi|296494595|gb|ADTN01000143.1| GENE 17 15238 - 16008 790 256 aa, chain + ## HITS:1 COG:thiG KEGG:ns NR:ns ## COG: thiG COG2022 # Protein_GI_number: 16131821 # Func_class: H Coenzyme transport and metabolism # Function: Uncharacterized enzyme of thiazole biosynthesis # Organism: Escherichia coli K12 # 1 256 26 281 281 471 99.0 1e-133 MLRIADKTFDSHLFTGTGKFASSQLMVEAIRASGSQLVTLAMKRVDLRQHNDAILEPLIA AGVTLLPNTSGAKTAEEAIFAAHLAREALGTNWLKLEIHPDARWLLPDPIETLKAAETLV QQGFVVLPYCGADPVLCKCLEEVGCAAVMPLGAPIGSNQGLETRAMLEIIIQQATVPVVV DAGIGVPSHAAQALEMGADAVLVNTAIAVADDPVNMAKAFRLAVEAGLLARQSGPGSRSY FAHATSPLTGFLEASA >gi|296494595|gb|ADTN01000143.1| GENE 18 16005 - 17138 1028 377 aa, chain + ## HITS:1 COG:thiH KEGG:ns NR:ns ## COG: thiH COG1060 # Protein_GI_number: 16131820 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Escherichia coli K12 # 1 377 1 377 377 764 100.0 0 MKTFSDRWRQLDWDDIRLRINGKTAADVERALNASQLTRDDMMALLSPAASGYLEQLAQR AQRLTRQRFGNTVSFYVPLYLSNLCANDCTYCGFSMSNRIKRKTLDEADIARESAAIREM GFEHLLLVTGEHQAKVGMDYFRRHLPALREQFSSLQMEVQPLAETEYAELKQLGLDGVMV YQETYHEATYARHHLKGKKQDFFWRLETPDRLGRAGIDKIGLGALIGLSDNWRVDSYMVA EHLLWLQQHYWQSRYSVSFPRLRPCTGGIEPASIMDERQLVQTICAFRLLAPEIELSLST RESPWFRDRVIPLAINNVSAFSKTQPGGYADNHPELEQFSPHDDRRPEAVAAALTAQGLQ PVWKDWDSYLGRASQRL >gi|296494595|gb|ADTN01000143.1| GENE 19 17548 - 18087 189 179 aa, chain - ## HITS:1 COG:no KEGG:B21_03818 NR:ns ## KEGG: B21_03818 # Name: htrC 
# Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 179 1 179 179 347 100.0 1e-94 MKQEVEKWRPFGHPDGDIRDLSFLDAHQAVYVQHHEGKEPLEYRFWVTYSLHCFTKDYEH QTNEEKQSLMYHAPKESRPFCQHRYNLARTHLKRTILALPESNVIHAGYGSYAVIEVDLD GGDKAFYFVAFRAFREKKKLRLHVTSAYPISEKQKGKSVKFFTIAYNLLRNKQLPQPSK Prediction of potential genes in microbial genomes Time: Sun May 15 23:37:30 2011 Seq name: gi|296494594|gb|ADTN01000144.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont290.2, whole genome shotgun sequence Length of sequence - 18907 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 4, operones - 4 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 58/0.000 - CDS 30 - 4253 4877 ## COG0086 DNA-directed RNA polymerase, beta' subunit/160 kD subunit - Term 4271 - 4310 8.9 2 1 Op 2 28/0.000 - CDS 4330 - 8358 4095 ## PROTEIN SUPPORTED gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 - Prom 8515 - 8574 2.6 - Term 8619 - 8667 10.2 3 2 Op 1 47/0.000 - CDS 8678 - 9043 562 ## PROTEIN SUPPORTED gi|15804576|ref|NP_290617.1| 50S ribosomal protein L7/L12 4 2 Op 2 43/0.000 - CDS 9110 - 9607 798 ## PROTEIN SUPPORTED gi|15804575|ref|NP_290616.1| 50S ribosomal protein L10 5 2 Op 3 55/0.000 - CDS 10020 - 10724 1148 ## PROTEIN SUPPORTED gi|15804574|ref|NP_290615.1| 50S ribosomal protein L1 6 2 Op 4 45/0.000 - CDS 10728 - 11156 705 ## PROTEIN SUPPORTED gi|15804573|ref|NP_290614.1| 50S ribosomal protein L11 - Prom 11248 - 11307 6.1 - Term 11229 - 11281 10.2 7 3 Op 1 46/0.000 - CDS 11315 - 11860 674 ## COG0250 Transcription antiterminator 8 3 Op 2 14/0.000 - CDS 11862 - 12245 479 ## COG0690 Preprotein translocase subunit SecE - Prom 12325 - 12384 6.0 - Term 12402 - 12449 8.9 9 4 Op 1 30/0.000 - CDS 12475 - 13659 1634 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 10 4 Op 2 51/0.000 - CDS 13730 - 15844 2282 ## COG0480 Translation elongation factors (GTPases) - Prom 15868 - 15927 1.6 - Term 15880 - 15915 
3.1 11 4 Op 3 56/0.000 - CDS 15941 - 16411 777 ## PROTEIN SUPPORTED gi|15803854|ref|NP_289888.1| 30S ribosomal protein S7 - Prom 16440 - 16499 2.1 12 4 Op 4 7/0.000 - CDS 16508 - 16942 744 ## PROTEIN SUPPORTED gi|226956878|ref|YP_002807671.1| 30S ribosomal subunit protein S12 13 4 Op 5 10/0.000 - CDS 17008 - 17295 248 ## COG2168 Uncharacterized conserved protein involved in oxidation of intracellular sulfur 14 4 Op 6 13/0.000 - CDS 17303 - 17662 214 ## COG2923 Uncharacterized protein involved in the oxidation of intracellular sulfur 15 4 Op 7 6/0.000 - CDS 17662 - 18048 415 ## COG1553 Uncharacterized conserved protein involved in intracellular sulfur reduction 16 4 Op 8 . - CDS 18048 - 18770 742 ## COG2964 Uncharacterized protein conserved in bacteria - Prom 18844 - 18903 3.2 Predicted protein(s) >gi|296494594|gb|ADTN01000144.1| GENE 1 30 - 4253 4877 1407 aa, chain - ## HITS:1 COG:ECs4911 KEGG:ns NR:ns ## COG: ECs4911 COG0086 # Protein_GI_number: 15834165 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta' subunit/160 kD subunit # Organism: Escherichia coli O157:H7 # 1 1407 1 1407 1407 2776 100.0 0 MKDLLKFLKAQTKTEEFDAIKIALASPDMIRSWSFGEVKKPETINYRTFKPERDGLFCAR IFGPVKDYECLCGKYKRLKHRGVICEKCGVEVTQTKVRRERMGHIELASPTAHIWFLKSL PSRIGLLLDMPLRDIERVLYFESYVVIEGGMTNLERQQILTEEQYLDALEEFGDEFDAKM GAEAIQALLKSMDLEQECEQLREELNETNSETKRKKLTKRIKLLEAFVQSGNKPEWMILT VLPVLPPDLRPLVPLDGGRFATSDLNDLYRRVINRNNRLKRLLDLAAPDIIVRNEKRMLQ EAVDALLDNGRRGRAITGSNKRPLKSLADMIKGKQGRFRQNLLGKRVDYSGRSVITVGPY LRLHQCGLPKKMALELFKPFIYGKLELRGLATTIKAAKKMVEREEAVVWDILDEVIREHP VLLNRAPTLHRLGIQAFEPVLIEGKAIQLHPLVCAAYNADFDGDQMAVHVPLTLEAQLEA RALMMSTNNILSPANGEPIIVPSQDVVLGLYYMTRDCVNAKGEGMVLTGPKEAERLYRSG LASLHARVKVRITEYEKDANGELVAKTSLKDTTVGRAILWMIVPKGLPYSIVNQALGKKA ISKMLNTCYRILGLKPTVIFADQIMYTGFAYAARSGASVGIDDMVIPEKKHEIISEAEAE VAEIQEQFQSGLVTAGERYNKVIDIWAAANDRVSKAMMDNLQTETVINRDGQEEKQVSFN SIYMMADSGARGSAAQIRQLAGMRGLMAKPDGSIIETPITANFREGLNVLQYFISTHGAR KGLADTALKTANSGYLTRRLVDVAQDLVVTEDDCGTHEGIMMTPVIEGGDVKEPLRDRVL 
GRVTAEDVLKPGTADILVPRNTLLHEQWCDLLEENSVDAVKVRSVVSCDTDFGVCAHCYG RDLARGHIINKGEAIGVIAAQSIGEPGTQLTMRTFHIGGAASRAAAESSIQVKNKGSIKL SNVKSVVNSSGKLVITSRNTELKLIDEFGRTKESYKVPYGAVLAKGDGEQVAGGETVANW DPHTMPVITEVSGFVRFTDMIDGQTITRQTDELTGLSSLVVLDSAERTAGGKDLRPALKI VDAQGNDVLIPGTDMPAQYFLPGKAIVQLEDGVQISSGDTLARIPQESGGTKDITGGLPR VADLFEARRPKEPAILAEISGIVSFGKETKGKRRLVITPVDGSDPYEEMIPKWRQLNVFE GERVERGDVISDGPEAPHDILRLRGVHAVTRYIVNEVQDVYRLQGVKINDKHIEVIVRQM LRKATIVNAGSSDFLEGEQVEYSRVKIANRELEANGKVGATYSRDLLGITKASLATESFI SAASFQETTRVLTEAAVAGKRDELRGLKENVIVGRLIPAGTGYAYHQDRMRRRAAGEAPA APQVTAEDASASLAELLNAGLGGSDNE >gi|296494594|gb|ADTN01000144.1| GENE 2 4330 - 8358 4095 1342 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 [alpha proteobacterium BAL199] # 2 1342 6 1390 1392 1582 58 0.0 MVYSYTEKKRIRKDFGKRPQVLDVPYLLSIQLDSFQKFIEQDPEGQYGLEAAFRSVFPIQ SYSGNSELQYVSYRLGEPVFDVQECQIRGVTYSAPLRVKLRLVIYEREAPEGTVKDIKEQ EVYMGEIPLMTDNGTFVINGTERVIVSQLHRSPGVFFDSDKGKTHSSGKVLYNARIIPYR GSWLDFEFDPKDNLFVRIDRRRKLPATIILRALNYTTEQILDLFFEKVIFEIRDNKLQME LVPERLRGETASFDIEANGKVYVEKGRRITARHIRQLEKDDVKLIEVPVEYIAGKVVAKD YIDESTGELICAANMELSLDLLAKLSQSGHKRIETLFTNDLDHGPYISETLRVDPTNDRL SALVEIYRMMRPGEPPTREAAESLFENLFFSEDRYDLSAVGRMKFNRSLLREEIEGSGIL SKDDIIDVMKKLIDIRNGKGEVDDIDHLGNRRIRSVGEMAENQFRVGLVRVERAVKERLS LGDLDTLMPQDMINAKPISAAVKEFFGSSQLSQFMDQNNPLSEITHKRRISALGPGGLTR ERAGFEVRDVHPTHYGRVCPIETPEGPNIGLINSLSVYAQTNEYGFLETPYRKVTDGVVT DEIHYLSAIEEGNYVIAQANSNLDEEGHFVEDLVTCRSKGESSLFSRDQVDYMDVSTQQV VSVGASLIPFLEHDDANRALMGANMQRQAVPTLRADKPLVGTGMERAVAVDSGVTAVAKR GGVVQYVDASRIVIKVNEDEMYPGEAGIDIYNLTKYTRSNQNTCINQMPCVSLGEPVERG DVLADGPSTDLGELALGQNMRVAFMPWNGYNFEDSILVSERVVQEDRFTTIHIQELACVS RDTKLGPEEITADIPNVGEAALSKLDESGIVYIGAEVTGGDILVGKVTPKGETQLTPEEK LLRAIFGEKASDVKDSSLRVPNGVSGTVIDVQVFTRDGVEKDKRALEIEEMQLKQAKKDL SEELQILEAGLFSRIRAVLVAGGVEAEKLDKLPRDRWLELGLTDEEKQNQLEQLAEQYDE LKHEFEKKLEAKRRKITQGDDLAPGVLKIVKVYLAVKRRIQPGDKMAGRHGNKGVISKIN PIEDMPYDENGTPVDIVLNPLGVPSRMNIGQILETHLGMAAKGIGDKINAMLKQQQEVAK 
LREFIQRAYDLGADVRQKVDLSTFSDEEVMRLAENLRKGMPIATPVFDGAKEAEIKELLK LGDLPTSGQIRLYDGRTGEQFERPVTVGYMYMLKLNHLVDDKMHARSTGSYSLVTQQPLG GKAQFGGQRFGEMEVWALEAYGAAYTLQEMLTVKSDDVNGRTKMYKNIVDGNHQMEPGMP ESFNVLLKEIRSLGINIELEDE >gi|296494594|gb|ADTN01000144.1| GENE 3 8678 - 9043 562 121 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15804576|ref|NP_290617.1| 50S ribosomal protein L7/L12 [Escherichia coli O157:H7 EDL933] # 1 121 1 121 121 221 100 4e-57 MSITKDQIIEAVAAMSVMDVVELISAMEEKFGVSAAAAVAVAAGPVEAAEEKTEFDVILK AAGANKVAVIKAVRGATGLGLKEAKDLVESAPAALKEGVSKDDAEALKKALEEAGAEVEV K >gi|296494594|gb|ADTN01000144.1| GENE 4 9110 - 9607 798 165 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15804575|ref|NP_290616.1| 50S ribosomal protein L10 [Escherichia coli O157:H7 EDL933] # 1 165 1 165 165 311 100 2e-84 MALNLQDKQAIVAEVSEVAKGALSAVVADSRGVTVDKMTELRKAGREAGVYMRVVRNTLL RRAVEGTPFECLKDAFVGPTLIAYSMEHPGAAARLFKEFAKANAKFEVKAAAFEGELIPA SQIDRLATLPTYEEAIARLMATMKEASAGKLVRTLAAVRDAKEAA >gi|296494594|gb|ADTN01000144.1| GENE 5 10020 - 10724 1148 234 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15804574|ref|NP_290615.1| 50S ribosomal protein L1 [Escherichia coli O157:H7 EDL933] # 1 234 1 234 234 446 100 1e-125 MAKLTKRMRVIREKVDATKQYDINEAIALLKELATAKFVESVDVAVNLGIDARKSDQNVR GATVLPHGTGRSVRVAVFTQGANAEAAKAAGAELVGMEDLADQIKKGEMNFDVVIASPDA MRVVGQLGQVLGPRGLMPNPKVGTVTPNVAEAVKNAKAGQVRYRNDKNGIIHTTIGKVDF DADKLKENLEALLVALKKAKPTQAKGVYIKKVSISTTMGAGVAVDQAGLSASVN >gi|296494594|gb|ADTN01000144.1| GENE 6 10728 - 11156 705 142 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15804573|ref|NP_290614.1| 50S ribosomal protein L11 [Escherichia coli O157:H7 EDL933] # 1 142 1 142 142 276 100 1e-73 MAKKVQAYVKLQVAAGMANPSPPVGPALGQQGVNIMEFCKAFNAKTDSIEKGLPIPVVIT VYADRSFTFVTKTPPAAVLLKKAAGIKSGSGKPNKDKVGKISRAQLQEIAQTKAADMTGA DIEAMTRSIEGTARSMGLVVED >gi|296494594|gb|ADTN01000144.1| GENE 7 11315 - 11860 674 181 aa, chain - ## HITS:1 COG:ECs4905 KEGG:ns NR:ns ## COG: ECs4905 COG0250 # Protein_GI_number: 15834159 # Func_class: K Transcription # Function: Transcription 
antiterminator # Organism: Escherichia coli O157:H7 # 1 181 1 181 181 348 100.0 3e-96 MSEAPKKRWYVVQAFSGFEGRVATSLREHIKLHNMEDLFGEVMVPTEEVVEIRGGQRRKS ERKFFPGYVLVQMVMNDASWHLVRSVPRVMGFIGGTSDRPAPISDKEVDAIMNRLQQVGD KPRPKTLFEPGEMVRVNDGPFADFNGVVEEVDYEKSRLKVSVSIFGRATPVELDFSQVEK A >gi|296494594|gb|ADTN01000144.1| GENE 8 11862 - 12245 479 127 aa, chain - ## HITS:1 COG:STM4147 KEGG:ns NR:ns ## COG: STM4147 COG0690 # Protein_GI_number: 16767401 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecE # Organism: Salmonella typhimurium LT2 # 1 127 1 127 127 184 96.0 4e-47 MSANTEAQGSGRGLEAMKWVVVVALLLVAIVGNYLYRDIMLPLRALAVVILIAAAGGVAL LTTKGKATVAFAREARTEVRKVIWPTRQETLHTTLIVAAVTAVMSLILWGLDGILVRLVS FITGLRF >gi|296494594|gb|ADTN01000144.1| GENE 9 12475 - 13659 1634 394 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 393 1 406 407 634 76 0.0 MSKEKFERTKPHVNVGTIGHVDHGKTTLTAAITTVLAKTYGGAARAFDQIDNAPEEKARG ITINTSHVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMDGAILVVAATDGPMPQTREHI LLGRQVGVPYIIVFLNKCDMVDDEELLELVEMEVRELLSQYDFPGDDTPIVRGSALKALE GDAEWEAKILELAGFLDSYIPEPERAIDKPFLLPIEDVFSISGRGTVVTGRVERGIIKVG EEVEIVGIKETQKSTCTGVEMFRKLLDEGRAGENVGVLLRGIKREEIERGQVLAKPGTIK PHTKFESEVYILSKDEGGRHTPFFKGYRPQFYFRTTDVTGTIELPEGVEMVMPGDNIKMV VTLIHPIAMDDGLRFAIREGGRTVGAGVVAKVLS >gi|296494594|gb|ADTN01000144.1| GENE 10 13730 - 15844 2282 704 aa, chain - ## HITS:1 COG:ECs4191 KEGG:ns NR:ns ## COG: ECs4191 COG0480 # Protein_GI_number: 15833445 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Escherichia coli O157:H7 # 1 704 1 704 704 1402 100.0 0 MARTTPIARYRNIGISAHIDAGKTTTTERILFYTGVNHKIGEVHDGAATMDWMEQEQERG ITITSAATTAFWSGMAKQYEPHRINIIDTPGHVDFTIEVERSMRVLDGAVMVYCAVGGVQ PQSETVWRQANKYKVPRIAFVNKMDRMGANFLKVVNQIKTRLGANPVPLQLAIGAEEHFT GVVDLVKMKAINWNDADQGVTFEYEDIPADMVELANEWHQNLIESAAEASEELMEKYLGG 
EELTEAEIKGALRQRVLNNEIILVTCGSAFKNKGVQAMLDAVIDYLPSPVDVPAINGILD DGKDTPAERHASDDEPFSALAFKIATDPFVGNLTFFRVYSGVVNSGDTVLNSVKAARERF GRIVQMHANKREEIKEVRAGDIAAAIGLKDVTTGDTLCDPDAPIILERMEFPEPVISIAV EPKTKADQEKMGLALGRLAKEDPSFRVWTDEESNQTIIAGMGELHLDIIVDRMKREFNVE ANVGKPQVAYRETIRQKVTDVEGKHAKQSGGRGQYGHVVIDMYPLEPGSNPKGYEFINDI KGGVIPGEYIPAVDKGIQEQLKAGPLAGYPVVDMGIRLHFGSYHDVDSSELAFKLAASIA FKEGFKKAKPVLLEPIMKVEVETPEENTGDVIGDLSRRRGMLKGQESEVTGVKIHAEVPL SEMFGYATQLRSLTKGRASYTMEFLKYDEAPSNVAQAVIEARGK >gi|296494594|gb|ADTN01000144.1| GENE 11 15941 - 16411 777 156 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803854|ref|NP_289888.1| 30S ribosomal protein S7 [Escherichia coli O157:H7 EDL933] # 1 156 1 156 156 303 100 4e-82 MPRRRVIGQRKILPDPKFGSELLAKFVNILMVDGKKSTAESIVYSALETLAQRSGKSELE AFEVALENVRPTVEVKSRRVGGSTYQVPVEVRPVRRNALAMRWIVEAARKRGDKSMALRL ANELSDAAENKGTAVKKREDVHRMAEANKAFAHYRW >gi|296494594|gb|ADTN01000144.1| GENE 12 16508 - 16942 744 144 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|226956878|ref|YP_002807671.1| 30S ribosomal subunit protein S12 [Escherichia sp. 
1_1_43] # 1 144 1 144 144 291 99 3e-78 MCEDVLLRVYEAKAKTRSYLMATVNQLVRKPRARKVAKSNVPALEACPQKRGVCTRVYTT TPKKPNSALRKVCRVRLTNGFEVTSYIGGEGHNLQEHSVILIRGGRVKDLPGVRYHTVRG ALDCSGVKDRKQARSKYGVKRPKA >gi|296494594|gb|ADTN01000144.1| GENE 13 17008 - 17295 248 95 aa, chain - ## HITS:1 COG:yheL KEGG:ns NR:ns ## COG: yheL COG2168 # Protein_GI_number: 16131222 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized conserved protein involved in oxidation of intracellular sulfur # Organism: Escherichia coli K12 # 1 95 1 95 95 181 100.0 4e-46 MLHTLHRSPWLTDFAALLRLLSEGDELLLLQDGVTAAVDGNRYLESLRNAPIKVYALNED LIARGLTGQISNDIILIDYTDFVRLTVKHPSQMAW >gi|296494594|gb|ADTN01000144.1| GENE 14 17303 - 17662 214 119 aa, chain - ## HITS:1 COG:yheM KEGG:ns NR:ns ## COG: yheM COG2923 # Protein_GI_number: 16131223 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in the oxidation of intracellular sulfur # Organism: Escherichia coli K12 # 1 119 1 119 119 226 100.0 8e-60 MKRIAFVFSTAPHGTAAGREGLDALLATSALTDDLAVFFIADGVFQLLPGQKPDAVLARD YIATFKLLGLYDIEQCWVCAASLRERGLDPQTPFVVEATPLEADALRRELANYDVILRF >gi|296494594|gb|ADTN01000144.1| GENE 15 17662 - 18048 415 128 aa, chain - ## HITS:1 COG:yheN KEGG:ns NR:ns ## COG: yheN COG1553 # Protein_GI_number: 16131224 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized conserved protein involved in intracellular sulfur reduction # Organism: Escherichia coli K12 # 1 128 1 128 128 227 100.0 3e-60 MRFAIVVTGPAYGTQQASSAFQFAQALIADGHELSSVFFYREGVYNANQLTSPASDEFDL VRAWQQLNAQHGVALNICVAAALRRGVVDETEAGRLGLASSNLQQGFTLSGLGALAEASL TCDRVVQF >gi|296494594|gb|ADTN01000144.1| GENE 16 18048 - 18770 742 240 aa, chain - ## HITS:1 COG:yheO KEGG:ns NR:ns ## COG: yheO COG2964 # Protein_GI_number: 16131225 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 240 5 244 244 469 100.0 1e-132 
MSRSLLTNETSELDLLDQRPFDQTDFDILKSYEAVVDGLAMLIGSHCEIVLHSLQDLKCS AIRIANGEHTGRKIGSPITDLALRMLHDMTGADSSVSKCYFTRAKSGVLMKSLTIAIRNR EQRVIGLLCINMNLDVPFSQIMSTFVPPETPDVGSSVNFASSVEDLVTQTLEFTIEEVNA DRNVSNNAKNRQIVLNLYEKGIFDIKDAINQVADRLNISKHTVYLYIRQFKSGDFQGQDK Prediction of potential genes in microbial genomes Time: Sun May 15 23:37:36 2011 Seq name: gi|296494593|gb|ADTN01000145.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont290.3, whole genome shotgun sequence Length of sequence - 15121 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 9, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 21 - 833 1015 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 - Prom 925 - 984 3.4 + Prom 887 - 946 3.3 2 2 Tu 1 . + CDS 1054 - 1272 301 ## COG2900 Uncharacterized protein conserved in bacteria + Term 1276 - 1324 10.2 - Term 1271 - 1299 2.1 3 3 Tu 1 4/0.833 - CDS 1321 - 1911 753 ## COG1047 FKBP-type peptidyl-prolyl cis-trans isomerases 2 - Prom 1940 - 1999 3.9 - Term 1927 - 1968 0.7 4 4 Op 1 3/1.000 - CDS 2006 - 2206 235 ## COG3529 Predicted nucleic-acid-binding protein containing a Zn-ribbon domain 5 4 Op 2 7/0.000 - CDS 2216 - 4021 1056 ## PROTEIN SUPPORTED gi|229845962|ref|ZP_04466074.1| 30S ribosomal protein S2 6 4 Op 3 . - CDS 4021 - 4572 422 ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B) - Prom 4654 - 4713 4.7 + Prom 4540 - 4599 3.4 7 5 Op 1 3/1.000 + CDS 4703 - 6616 2504 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 8 5 Op 2 5/0.500 + CDS 6676 - 7638 840 ## COG0429 Predicted hydrolase of the alpha/beta-hydrolase fold 9 5 Op 3 6/0.333 + CDS 7632 - 7850 337 ## COG3089 Uncharacterized protein conserved in bacteria 10 6 Tu 1 . + CDS 7904 - 8773 952 ## COG3954 Phosphoribulokinase + Term 8849 - 8880 -0.8 - Term 8762 - 8817 2.7 11 7 Tu 1 . 
- CDS 8828 - 9232 431 ## COG1765 Predicted redox protein, regulator of disulfide bond formation + Prom 9325 - 9384 1.8 12 8 Op 1 4/0.833 + CDS 9534 - 10166 930 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases + Term 10178 - 10215 8.2 13 8 Op 2 . + CDS 10217 - 12307 1679 ## COG1289 Predicted membrane protein - Term 12308 - 12349 3.2 14 9 Op 1 5/0.500 - CDS 12374 - 13594 1365 ## COG4992 Ornithine/acetylornithine aminotransferase - Prom 13619 - 13678 7.0 15 9 Op 2 4/0.833 - CDS 13680 - 14243 493 ## COG0512 Anthranilate/para-aminobenzoate synthases component II 16 9 Op 3 . - CDS 14275 - 14877 697 ## COG2184 Protein involved in cell division 17 9 Op 4 . - CDS 14867 - 15034 146 ## SDY_3524 hypothetical protein - Prom 15061 - 15120 2.9 Predicted protein(s) >gi|296494593|gb|ADTN01000145.1| GENE 1 21 - 833 1015 270 aa, chain - ## HITS:1 COG:ECs4198 KEGG:ns NR:ns ## COG: ECs4198 COG0545 # Protein_GI_number: 15833452 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Escherichia coli O157:H7 # 1 259 1 259 270 416 99.0 1e-116 MKSLFKVTLLATTMAVALHAPITFAAEAAKPATAADSKAAFKNDDQKSAYALGASLGRYM ENSLKEQEKLGIKLDKDQLIAGVQDAFADKSKLSDQEIEQTLQAFEARVKSSAQAKMEKD AADNEAKGKEYREKFAKEKGVKTSSTGLVYQVVEAGKGEAPKDSDTVVVNYKGTLIDGKE FDNSYTRGEPLSFRLDGVIPGWTEGLKNIKKGGKIKLVIPPELAYGKAGVPGIPPNSTLV FDVELLDVKPAPKADAKPEADAKAADSAKK >gi|296494593|gb|ADTN01000145.1| GENE 2 1054 - 1272 301 72 aa, chain + ## HITS:1 COG:ECs4199 KEGG:ns NR:ns ## COG: ECs4199 COG2900 # Protein_GI_number: 15833453 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 72 1 72 72 100 100.0 7e-22 MQDLSLEARLAELESRLAFQEITIEELNVTVTAHEMEMAKLRDHLRLLTEKLKASQPSNI ASQAEETPPPHY >gi|296494593|gb|ADTN01000145.1| GENE 3 1321 - 1911 753 196 aa, chain - ## HITS:1 COG:ECs4200 KEGG:ns NR:ns ## COG: ECs4200 COG1047 
# Protein_GI_number: 15833454 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 2 # Organism: Escherichia coli O157:H7 # 1 148 1 148 196 271 100.0 7e-73 MKVAKDLVVSLAYQVRTEDGVLVDESPVSAPLDYLHGHGSLISGLETALEGHEVGDKFDV AVGANDAYGQYDENLVQRVPKDVFMGVDELQVGMRFLAETDQGPVPVEITAVEDDHVVVD GNHMLAGQNLKFNVEVVAIREATEEELAHGHVHGAHDHHHDHDHDGCCGGHGHDHGHEHG GEGCCGGKGNGGCGCH >gi|296494593|gb|ADTN01000145.1| GENE 4 2006 - 2206 235 66 aa, chain - ## HITS:1 COG:Z4708 KEGG:ns NR:ns ## COG: Z4708 COG3529 # Protein_GI_number: 15803863 # Func_class: R General function prediction only # Function: Predicted nucleic-acid-binding protein containing a Zn-ribbon domain # Organism: Escherichia coli O157:H7 EDL933 # 1 66 1 66 66 120 100.0 6e-28 MAIRKRFIAGAKCPACQAQDSMAMWRENNIDIVECVKCGHQMREADKEARDHVRKDEQVI GIFHPD >gi|296494593|gb|ADTN01000145.1| GENE 5 2216 - 4021 1056 601 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229845962|ref|ZP_04466074.1| 30S ribosomal protein S2 [Haemophilus influenzae 7P49H1] # 7 590 9 612 618 411 36 1e-114 MEGSDFLLAGVLFLFAAVAAVPLASRLGIGAVLGYLLAGIAIGPWGLGFISDVDEILHFS ELGVVFLMFIIGLELNPSKLWQLRRSIFGVGAAQVLLSAALLAGLLMLTDFAWQAAVVGG IGLAMSSTAMALQLMREKGMNRSESGQLGFSVLLFQDLAVIPALALVPLLAGSADEHFDW MKVGMKVLAFVGMLIGGRYLLRPVFRFIAASGVREVFTAATLLLVLGSALFMDALGLSMA LGTFIAGVLLAESEYRHELETAIDPFKGLLLGLFFISVGMSLNLGVLYTHLLWVVISVVV LVAVKILVLYLLARLYGVRSSERMQFAGVLSQGGEFAFVLFSTASSQRLFQGDQMALLLV TVTLSMMTTPLLMKLVDKWLSRQFNGPEEEDEKPWVNDDKPQVIVVGFGRFGQVIGRLLM ANKMRITVLERDISAVNLMRKYGYKVYYGDATQVDLLRSAGAEAAESIVITCNEPEDTMK LVEICQQHFPHLHILARARGRVEAHELLQAGVTQFSRETFSSALELGRKTLVTLGMHPHQ AQRAQLHFRRLDMRMLRELIPMHADTVQISRAREARRELEEIFQREMQQERRQLDGWDEF E >gi|296494593|gb|ADTN01000145.1| GENE 6 4021 - 4572 422 183 aa, chain - ## HITS:1 COG:ECs4202 KEGG:ns NR:ns ## COG: ECs4202 COG2249 # Protein_GI_number: 15833456 # Func_class: R General function prediction only # Function: Putative NADPH-quinone reductase (modulator of drug activity B) # 
Organism: Escherichia coli O157:H7 # 1 183 2 184 184 365 100.0 1e-101 MSQPAKVLLLYAHPESQDSVANRVLLKPATQLSNVTVHDLYAHYPDFFIDIPREQALLRE HEVIVFQHPLYTYSCPALLKEWLDRVLSRGFASGPGGNQLAGKYWRSVITTGEPESAYRY DALNRYPMSDVLRPFELAAGMCRMHWLSPIIIYWARRQSAQELASHARAYGDWLANPLSP GGR >gi|296494593|gb|ADTN01000145.1| GENE 7 4703 - 6616 2504 637 aa, chain + ## HITS:1 COG:ECs4203 KEGG:ns NR:ns ## COG: ECs4203 COG0488 # Protein_GI_number: 15833457 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Escherichia coli O157:H7 # 1 637 1 637 637 1209 100.0 0 MIVFSSLQIRRGVRVLLDNATATINPGQKVGLVGKNGCGKSTLLALLKNEISADGGSYTF PGSWQLAWVNQETPALPQAALEYVIDGDREYRQLEAQLHDANERNDGHAIATIHGKLDAI DAWSIRSRAASLLHGLGFSNEQLERPVSDFSGGWRMRLNLAQALICRSDLLLLDEPTNHL DLDAVIWLEKWLKSYQGTLILISHDRDFLDPIVDKIIHIEQQSMFEYTGNYSSFEVQRAT RLAQQQAMYESQQERVAHLQSYIDRFRAKATKAKQAQSRIKMLERMELIAPAHVDNPFRF SFRAPESLPNPLLKMEKVSAGYGDRIILDSIKLNLVPGSRIGLLGRNGAGKSTLIKLLAG ELAPVSGEIGLAKGIKLGYFAQHQLEYLRADESPIQHLARLAPQELEQKLRDYLGGFGFQ GDKVTEETRRFSGGEKARLVLALIVWQRPNLLLLDEPTNHLDLDMRQALTEALIEFEGAL VVVSHDRHLLRSTTDDLYLVHDRKVEPFDGDLEDYQQWLSDVQKQENQTDEAPKENANSA QARKDQKRREAELRAQTQPLRKEIARLEKEMEKLNAQLAQAEEKLGDSELYDQSRKAELT ACLQQQASAKSGLEECEMAWLEAQEQLEQMLLEGQSN >gi|296494593|gb|ADTN01000145.1| GENE 8 6676 - 7638 840 320 aa, chain + ## HITS:1 COG:yheT KEGG:ns NR:ns ## COG: yheT COG0429 # Protein_GI_number: 16131232 # Func_class: R General function prediction only # Function: Predicted hydrolase of the alpha/beta-hydrolase fold # Organism: Escherichia coli K12 # 1 320 21 340 340 678 100.0 0 MRGFSNCHLQTMLPRLFRRQVKFTPYWQRLELPDGDFVDLAWSENPAQAQHKPRLVVFHG LEGSLNSPYAHGLVEAAQKRGWLGVVMHFRGCSGEPNRMHRIYHSGETEDASWFLRWLQR EFGHAPTAAVGYSLGGNMLACLLAKEGNDLPVDAAVIVSAPFMLEACSYHMEKGFSRVYQ RYLLNLLKANAARKLAAYPGTLPINLAQLKSVRRIREFDDLITARIHGYADAIDYYRQCS AMPMLNRIAKPTLIIHAKDDPFMDHQVIPKPESLPPQVEYQLTEHGGHVGFIGGTLLHPQ MWLESRIPDWLTTYLEAKSC >gi|296494593|gb|ADTN01000145.1| GENE 9 7632 - 7850 337 72 aa, chain + ## 
HITS:1 COG:ECs4205 KEGG:ns NR:ns ## COG: ECs4205 COG3089 # Protein_GI_number: 15833459 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 72 1 72 72 130 100.0 4e-31 MLIPWQDLSPETLENLIESFVLREGTDYGEHERTLEQKVADVKRQLQCGEAVLVWSELHE TVNIMPRSQFRE >gi|296494593|gb|ADTN01000145.1| GENE 10 7904 - 8773 952 289 aa, chain + ## HITS:1 COG:prkB KEGG:ns NR:ns ## COG: prkB COG3954 # Protein_GI_number: 16131234 # Func_class: C Energy production and conversion # Function: Phosphoribulokinase # Organism: Escherichia coli K12 # 1 289 1 289 289 595 100.0 1e-170 MSAKHPVIAVTGSSGAGTTTTSLAFRKIFAQLNLHAAEVEGDSFHRYTRPEMDMAIRKAR DAGRHISYFGPEANDFGLLEQTFIEYGQSGKGKSRKYLHTYDEAVPWNQVPGTFTPWQPL PEPTDVLFYEGLHGGVVTPQHNVAQHVDLLVGVVPIVNLEWIQKLIRDTSERGHSREAVM DSVVRSMEDYINYITPQFSRTHLNFQRVPTVDTSNPFAAKGIPSLDESFVVIHFRNLEGI DFPWLLAMLQGSFISHINTLVVPGGKMGLAMELIMLPLVQRLMEGKKIE >gi|296494593|gb|ADTN01000145.1| GENE 11 8828 - 9232 431 134 aa, chain - ## HITS:1 COG:ECs4207 KEGG:ns NR:ns ## COG: ECs4207 COG1765 # Protein_GI_number: 15833461 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted redox protein, regulator of disulfide bond formation # Organism: Escherichia coli O157:H7 # 1 134 1 134 134 257 100.0 3e-69 MQARVKWVEGLTFLGESASGHQILMDGNSGDKAPSPMEMVLMAAGGCSAIDVVSILQKGR QDVVDCEVKLTSERREEAPRLFTHINLHFIVTGRDLKDAAVARAVDLSAEKYCSVALMLE KAVNITHSYEVVAA >gi|296494593|gb|ADTN01000145.1| GENE 12 9534 - 10166 930 210 aa, chain + ## HITS:1 COG:ECs4208 KEGG:ns NR:ns ## COG: ECs4208 COG0664 # Protein_GI_number: 15833462 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Escherichia coli O157:H7 # 1 210 1 210 210 418 100.0 1e-117 MVLGKPQTDPTLEWFLSHCHIHKYPSKSTLIHQGEKAETLYYIVKGSVAVLIKDEEGKEM ILSYLNQGDFIGELGLFEEGQERSAWVRAKTACEVAEISYKKFRQLIQVNPDILMRLSAQ 
MARRLQVTSEKVGNLAFLDVTGRIAQTLLNLAKQPDAMTHPDGMQIKITRQEIGQIVGCS RETVGRILKMLEDQNLISAHGKTIVVYGTR >gi|296494593|gb|ADTN01000145.1| GENE 13 10217 - 12307 1679 696 aa, chain + ## HITS:1 COG:ECs4209 KEGG:ns NR:ns ## COG: ECs4209 COG1289 # Protein_GI_number: 15833463 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 696 1 696 696 1332 100.0 0 MWRRLIYHPDINYALRQTLVLCLPVAVGLMLGELRFGLLFSLVPACCNIAGLDTPHKRFF KRLIIGASLFATCSLLTQLLLAKDVPLPFLLTGLTLVLGVTAELGPLHAKLLPASLLAAI FTLSLAGYMPVWEPLLIYALGTLWYGLFNWFWFWIWREQPLRESLSLLYRELADYCEAKY SLLTQHTDPEKALPPLLVRQQKAVDLITQCYQQMHMLSAQNNTDYKRMLRIFQEALDLQE HISVSLHQPEEVQKLVERSHAEEVIRWNAQTVAARLRVLADDILYHRLPTRFTMEKQIGA LEKIARQHPDNPVGQFCYWHFSRIARVLRTQKPLYARDLLADKQRRMPLLPALKSYLSLK SPALRNAGRLSVMLSVASLMGTALHLPKSYWILMTVLLVTQNGYGATRLRIVNRSVGTVV GLIIAGVALHFKIPEGYTLTLMLITTLASYLILRKNYGWATVGFTITAVYTLQLLWLNGE QYILPRLIDTIIGCLIAFGGTVWLWPQWQSGLLRKNAHDALEAYQEAIRLILSEDPQPTP LAWQRMRVNQAHNTLYNSLNQAMQEPAFNSHYLADMKLWVTHSQFIVEHINAMTTLAREH RALPPELAQEYLQSCEIAIQRCQQRLEYDEPGSSGDANIMDAPEMQPHEGAAGTLEQHLQ RVIGHLNTMHTISSMAWRQRPHHGIWLSRKLRDSKA >gi|296494593|gb|ADTN01000145.1| GENE 14 12374 - 13594 1365 406 aa, chain - ## HITS:1 COG:argD KEGG:ns NR:ns ## COG: argD COG4992 # Protein_GI_number: 16131238 # Func_class: E Amino acid transport and metabolism # Function: Ornithine/acetylornithine aminotransferase # Organism: Escherichia coli K12 # 1 406 1 406 406 822 100.0 0 MAIEQTAITRATFDEVILPIYAPAEFIPVKGQGSRIWDQQGKEYVDFAGGIAVTALGHCH PALVNALKTQGETLWHISNVFTNEPALRLGRKLIEATFAERVVFMNSGTEANETAFKLAR HYACVRHSPFKTKIIAFHNAFHGRSLFTVSVGGQPKYSDGFGPKPADIIHVPFNDLHAVK AVMDDHTCAVVVEPIQGEGGVTAATPEFLQGLRELCDQHQALLVFDEVQCGMGRTGDLFA YMHYGVTPDILTSAKALGGGFPISAMLTTAEIASAFHPGSHGSTYGGNPLACAVAGAAFD IINTPEVLEGIQAKRQRFVDHLQKIDQQYDVFSDIRGMGLLIGAELKPQYKGRARDFLYA GAEAGVMVLNAGPDVMRFAPSLVVEDADIDEGMQRFAHAVAKVVGA >gi|296494593|gb|ADTN01000145.1| GENE 15 13680 - 14243 493 187 aa, chain - ## HITS:1 COG:pabA KEGG:ns NR:ns ## COG: pabA COG0512 # Protein_GI_number: 
16131239 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component II # Organism: Escherichia coli K12 # 1 187 1 187 187 400 100.0 1e-112 MILLIDNYDSFTWNLYQYFCELGADVLVKRNDALTLADIDALKPQKIVISPGPCTPDEAG ISLDVIRHYAGRLPILGVCLGHQAMAQAFGGKVVRAAKVMHGKTSPITHNGEGVFRGLAN PLTVTRYHSLVVEPDSLPACFDVTAWSETREIMGIRHRQWDLEGVQFHPESILSEQGHQL LANFLHR >gi|296494593|gb|ADTN01000145.1| GENE 16 14275 - 14877 697 200 aa, chain - ## HITS:1 COG:fic KEGG:ns NR:ns ## COG: fic COG2184 # Protein_GI_number: 16131240 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Protein involved in cell division # Organism: Escherichia coli K12 # 1 200 1 200 200 398 100.0 1e-111 MSDKFGEGRDPYLYPGLDIMRNRLNIRQQQRLEQAAYEMTALRAATIELGPLVRGLPHLR TIHRQLYQDIFDWAGQLREVDIYQGDTPFCHFAYIEKEGNALMQDLEEEGYLVGLEKAKF VERLAHYYCEINVLHPFRVGSGLAQRIFFEQLAIHAGYQLSWQGIEKEAWNQANQSGAMG DLTALQMIFSKVVSEAGESE >gi|296494593|gb|ADTN01000145.1| GENE 17 14867 - 15034 146 55 aa, chain - ## HITS:1 COG:no KEGG:SDY_3524 NR:ns ## KEGG: SDY_3524 # Name: yhfG # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 55 1 55 55 72 100.0 4e-12 MKKLTDKQKSRLWELQRNRNFQASRRLEGVEMPLVTLTAAEALARLEELRSHYER Prediction of potential genes in microbial genomes Time: Sun May 15 23:37:41 2011 Seq name: gi|296494592|gb|ADTN01000146.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont290.4, whole genome shotgun sequence Length of sequence - 10612 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 6, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 28 - 600 535 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family - Prom 679 - 738 5.8 + Prom 705 - 764 7.3 2 2 Tu 1 4/1.000 + CDS 871 - 2052 1312 ## COG0477 Permeases of the major facilitator superfamily + Term 2061 - 2099 7.0 + Prom 2162 - 2221 3.9 3 3 Op 1 14/0.000 + CDS 2314 - 4857 3271 ## COG1251 NAD(P)H-nitrite reductase 4 3 Op 2 3/1.000 + CDS 4854 - 5180 473 ## COG2146 Ferredoxin subunits of nitrite reductase and ring-hydroxylating dioxygenases + Term 5190 - 5220 3.4 + Prom 5224 - 5283 10.8 5 4 Op 1 3/1.000 + CDS 5306 - 6112 1016 ## COG2116 Formate/nitrite family of transporters 6 4 Op 2 . + CDS 6131 - 7504 1475 ## COG0007 Uroporphyrinogen-III methylase + Term 7514 - 7543 2.1 + Prom 7658 - 7717 7.6 7 5 Tu 1 . + CDS 7751 - 7918 255 ## LF82_3285 uncharacterized protein YhfL + Term 7960 - 7995 5.5 + Prom 8081 - 8140 4.8 8 6 Op 1 2/1.000 + CDS 8213 - 9550 1468 ## COG0531 Amino acid transporters 9 6 Op 2 . + CDS 9571 - 10593 1290 ## COG2222 Predicted phosphosugar isomerases Predicted protein(s) >gi|296494592|gb|ADTN01000146.1| GENE 1 28 - 600 535 190 aa, chain - ## HITS:1 COG:ECs4214 KEGG:ns NR:ns ## COG: ECs4214 COG0652 # Protein_GI_number: 15833468 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Escherichia coli O157:H7 # 1 190 1 190 190 359 100.0 2e-99 MFKSTLAAMAAVFALSALSPAAMAAKGDPHVLLTTSAGNIELELDKQKAPVSVQNFVDYV NSGFYNNTTFHRVIPGFMIQGGGFTEQMQQKKPNPPIKNEADNGLRNTRGTIAMARTADK DSATSQFFINVADNAFLDHGQRDFGYAVFGKVVKGMDVADKISQVPTHDVGPYQNVPSKP VVILSAKVLP >gi|296494592|gb|ADTN01000146.1| GENE 2 871 - 2052 1312 393 aa, chain + ## HITS:1 COG:yhfC KEGG:ns NR:ns ## COG: yhfC COG0477 # Protein_GI_number: 16131243 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator 
superfamily # Organism: Escherichia coli K12 # 1 393 1 393 393 684 100.0 0 MTNSNRIKLTWISFLSYALTGALVIVTGMVMGNIADYFNLPVSSMSNTFTFLNAGILISI FLNAWLMEIVPLKTQLRFGFLLMVLAVAGLMFSHSLALFSAAMFILGVVSGITMSIGTFL VTQMYEGRQRGSRLLFTDSFFSMAGMIFPMIAAFLLARSIEWYWVYACIGLVYVAIFILT FGCEFPALGKHAPKTDAPVEKEKWGIGVLFLSVAALCYILGQLGFISWVPEYAKGLGMSL NDAGTLVSNFWMSYMVGMWAFSFILRFFDLQRILTVLAGLAAILMYVFNTGTPAHMAWSI LALGFFSSAIYTTIITLGSQQTKVPSPKLVNFVLTCGTIGTMLTFVVTGPIVEHSGPQAA LLTANGLYAVVFVMCFLLGFVSRHRQHNTLTSH >gi|296494592|gb|ADTN01000146.1| GENE 3 2314 - 4857 3271 847 aa, chain + ## HITS:1 COG:nirB KEGG:ns NR:ns ## COG: nirB COG1251 # Protein_GI_number: 16131244 # Func_class: C Energy production and conversion # Function: NAD(P)H-nitrite reductase # Organism: Escherichia coli K12 # 1 847 1 847 847 1740 100.0 0 MSKVRLAIIGNGMVGHRFIEDLLDKSDAANFDITVFCEEPRIAYDRVHLSSYFSHHTAEE LSLVREGFYEKHGIKVLVGERAITINRQEKVIHSSAGRTVFYDKLIMATGSYPWIPPIKG SDTQDCFVYRTIEDLNAIESCARRSKRGAVVGGGLLGLEAAGALKNLGIETHVIEFAPML MAEQLDQMGGEQLRRKIESMGVRVHTSKNTLEIVQEGVEARKTMRFADGSELEVDFIVFS TGIRPRDKLATQCGLDVAPRGGIVINDSCQTSDPDIYAIGECASWNNRVFGLVAPGYKMA QVAVDHILGSENAFEGADLSAKLKLLGVDVGGIGDAHGRTPGARSYVYLDESKEIYKRLI VSEDNKTLLGAVLVGDTSDYGNLLQLVLNAIELPENPDSLILPAHSGSGKPSIGVDKLPD SAQICSCFDVTKGDLIAAINKGCHTVAALKAETKAGTGCGGCIPLVTQVLNAELAKQGIE VNNNLCEHFAYSRQELFHLIRVEGIKTFEELLAKHGKGYGCEVCKPTVGSLLASCWNEYI LKPEHTPLQDSNDNFLANIQKDGTYSVIPRSPGGEITPEGLMAVGRIAREFNLYTKITGS QRLAMFGAQKDDLPEIWRQLIEAGFETGHAYAKALRMAKTCVGSTWCRYGVGDSVGLGVE LENRYKGIRTPHKMKFGVSGCTRECSEAQGKDVGIIATEKGWNLYVCGNGGMKPRHADLL AADIDRETLIKYLDRFMMFYIRTADKLTRTAPWLENLEGGIDYLKAVIIDDKLGLNAHLE EEMARLREAVLCEWTETVNTPSAQTRFKHFINSDKRDPNVQMVPEREQHRPATPYERIPV TLVEDNA >gi|296494592|gb|ADTN01000146.1| GENE 4 4854 - 5180 473 108 aa, chain + ## HITS:1 COG:ECs4217 KEGG:ns NR:ns ## COG: ECs4217 COG2146 # Protein_GI_number: 15833471 # Func_class: P Inorganic ion transport and metabolism; R General function prediction only # Function: Ferredoxin subunits of nitrite reductase and ring-hydroxylating dioxygenases # Organism: 
Escherichia coli O157:H7 # 1 108 1 108 108 223 100.0 8e-59 MSQWKDICKIDDILPETGVCALLGDEQVAIFRPYHSDQVFAISNIDPFFESSVLSRGLIA EHQGELWVASPLKKQRFRLSDGLCMEDEQFSVKHYEARVKDGVVQLRG >gi|296494592|gb|ADTN01000146.1| GENE 5 5306 - 6112 1016 268 aa, chain + ## HITS:1 COG:nirCm KEGG:ns NR:ns ## COG: nirCm COG2116 # Protein_GI_number: 16132233 # Func_class: P Inorganic ion transport and metabolism # Function: Formate/nitrite family of transporters # Organism: Escherichia coli K12 # 1 268 1 268 268 464 99.0 1e-131 MFTDTINKCAANAARIARLSANNPLGFWVSSAMAGAYVGLGIILIFTLGNLLDPSVRPLV MGATFGIALTLVIIAGSELFTGHTMFLTFGVKAGSISHGQMWAILTQTWLGNLVGSVFVA MLYSWGGGSLLPVDTSIVHSVALAKTTAPAMVLFFKGALCNWLVCLAIWMALRTEGAAKF IAIWWCLLAFIASGYEHSIANMTLFALSWFGNHSEAYTLAGIGHNLLWVTLGNTLSGAVF MGLGYWYATPKANRPVADKFNQTETAAG >gi|296494592|gb|ADTN01000146.1| GENE 6 6131 - 7504 1475 457 aa, chain + ## HITS:1 COG:cysG_2 KEGG:ns NR:ns ## COG: cysG_2 COG0007 # Protein_GI_number: 16131246 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III methylase # Organism: Escherichia coli K12 # 211 457 1 247 247 503 100.0 1e-142 MDHLPIFCQLRDRDCLIVGGGDVAERKARLLLDAGARLTVNALAFIPQFTAWADAGMLTL VEGPFDESLLDTCWLAIAATDDDALNQRVSEAAEARRIFCNVVDAPKAASFIMPSIIDRS PLMVAVSSGGTSPVLARLLREKLESLLPLHLGQVAKYAGQLRGRVKQQFATMGERRRFWE KLFVNDRLAQSLANNDQKAITETTEQLINEPLDHRGEVVLVGAGPGDAGLLTLKGLQQIQ QADVVVYDRLVSDDIMNLVRRDADRVFVGKRAGYHCVPQEEINQILLREAQKGKRVVRLK GGDPFIFGRGGEELETLCNAGIPFSVVPGITAASGCSAYSGIPLTHRDYAQSVRLITGHL KTGGELDWENLAAEKQTLVFYMGLNQAATIQQKLIEHGMPGEMPVAIVENGTAVTQRVID GTLTQLGELAQQMNSPSLIIIGRVVGLRDKLNWFSNH >gi|296494592|gb|ADTN01000146.1| GENE 7 7751 - 7918 255 55 aa, chain + ## HITS:1 COG:no KEGG:LF82_3285 NR:ns ## KEGG: LF82_3285 # Name: yhfL # Def: uncharacterized protein YhfL # Organism: E.coli_LF82 # Pathway: not_defined # 1 55 1 55 55 107 100.0 1e-22 MNKFIKVALVGAVLATLTACTGHIENRDKNCSYDYLLHPAISISKIIGGCGPTAQ >gi|296494592|gb|ADTN01000146.1| GENE 8 8213 - 9550 1468 445 aa, chain + ## HITS:1 COG:yhfM KEGG:ns NR:ns ## COG: yhfM 
COG0531 # Protein_GI_number: 16131248 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 445 18 462 462 782 100.0 0 MGSQELQRKLGFWAVLAIAVGTTVGSGIFVSVGEVAKAAGTPWLTVLAFVIGGLIVIPQM CVYAELSTAYPENGADYVYLKNAGSRPLAFLSGWASFWANDAPSLSIMALAIVSNLGFLT PIDPLLGKFIAAGLIIAFMLLHLRSVEGGAAFQTLITIAKIIPFTIVIGLGIFWFKAENF AAPTTTAIGATGSFMALLAGISATSWSYTGMASICYMTGEIKNPGKTMPRALIGSCLLVL VLYTLLALVISGLMPFDKLANSETPISDALTWIPALGSTAGIFVAITAMIVILGSLSSCV MYQPRLEYAMAKDNLFFKCFGHVHPKYNTPDVSIILQGALGIFFIFVSDLTSLLGYFTLV MCFKNTLTFGSIIWCRKRDDYKPLWRTPAFGLMTTLAIASSLILVASTFVWAPIPGLICA VIVIATGLPAYAFWAKRSRQLNALS >gi|296494592|gb|ADTN01000146.1| GENE 9 9571 - 10593 1290 340 aa, chain + ## HITS:1 COG:yhfN KEGG:ns NR:ns ## COG: yhfN COG2222 # Protein_GI_number: 16131249 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted phosphosugar isomerases # Organism: Escherichia coli K12 # 1 340 8 347 347 725 100.0 0 MLDIDKSTVDFLVTENMVQEVEKVLSHDVPLVHAIVEEMVKRDIDRIYFVACGSPLNAAQ TAKHLADRFSDLQVYAISGWEFCDNTPYRLDDRCAVIGVSDYGKTEEVIKALELGRACGA LTAAFTKRADSPITSAAEFSIDYQADCIWEIHLLLCYSVVLEMITRLAPNAEIGKIKNDL KQLPNALGHLVRTWEEKGRQLGELASQWPMIYTVAAGPLRPLGYKEGIVTLMEFTWTHGC VIESGEFRHGPLEIVEPGVPFLFLLGNDESRHTTERAINFVKQRTDNVIVIDYAEISQGL HPWLAPFLMFVPMEWLCYYLSIYKDHNPDERRYYGGLVEY Prediction of potential genes in microbial genomes Time: Sun May 15 23:37:44 2011 Seq name: gi|296494591|gb|ADTN01000147.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont290.5, whole genome shotgun sequence Length of sequence - 2523 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 + CDS 52 - 882 940 ## COG1082 Sugar phosphate isomerases/epimerases 2 1 Op 2 1/0.000 + CDS 879 - 1664 848 ## COG0524 Sugar kinases, ribokinase family + Term 1671 - 1712 9.5 + Prom 1679 - 1738 3.6 3 1 Op 3 . 
+ CDS 1764 - 2495 647 ## COG2188 Transcriptional regulators Predicted protein(s) >gi|296494591|gb|ADTN01000147.1| GENE 1 52 - 882 940 276 aa, chain + ## HITS:1 COG:ECs4223 KEGG:ns NR:ns ## COG: ECs4223 COG1082 # Protein_GI_number: 15833477 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Escherichia coli O157:H7 # 1 275 1 275 275 564 99.0 1e-161 MKTGMFTCGHQRLPIEHAFRDASELGYDGIEIWGGRPHAFAPDLKAGGIKQIKALAQTYQ MPIIGYTPETNGYPYNMMLGDEHMRRESLDMIKLAMDMAKEMNAGYTLISAAHAGYLTPP NVIWGRLAENLSELCEYAENIGMDLILEPLTPYESNVVCNANDVLHALALVPSPRLFSMV DICAPYVQAEPVMSYFDKLGDKLRHLHIVDSDGASDTHYIPGEGKMPLRELMRDIIERGY EGYCTVELVTMYMNEPRLYARQALERFRALLPEDER >gi|296494591|gb|ADTN01000147.1| GENE 2 879 - 1664 848 261 aa, chain + ## HITS:1 COG:yhfQ KEGG:ns NR:ns ## COG: yhfQ COG0524 # Protein_GI_number: 16131252 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli K12 # 1 261 1 261 261 520 100.0 1e-147 MKTLATIGDNCVDIYPQLNKAFSGGNAVNVAVYCTRYGIQPGCITWVGDDDYGTKLKQDL ARMGVDISHVHTKHGVTAQTQVELHDNDRVFGDYTEGVMADFALSEEDYAWLAQYDIVHA AIWGHAEDAFPQLHAAGKLTAFDFSDKWDSPLWQTLVPHLDFAFASAPQEDETLRLKMKA IVARGAGTVIVTLGENGSIAWDGAQFWRQAPEPVTVIDTMGAGDSFIAGFLCGWSAGMTL PQAIAQGTACAAKTIQYHGAW >gi|296494591|gb|ADTN01000147.1| GENE 3 1764 - 2495 647 243 aa, chain + ## HITS:1 COG:ECs4225 KEGG:ns NR:ns ## COG: ECs4225 COG2188 # Protein_GI_number: 15833479 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 243 23 265 265 501 99.0 1e-142 MSATDRYSHQLLYATVRQRLLDDIAQGVYQAGQQIPTENELCTQYNVSRITIRKAISDLV ADGVLIRWQGKGTFVQSQKVENALLTVSGFTDFGVSQGKATKEKVIEQERVSAAPFCEKL NIPGNSEVFHLCRVMYLDKEPLFIDSSWIPLSRYPDFDEIYVEGSSTYQLFQERFDTRVV SDKKTIDIFAATRPQAKWLKCELGEPLFRISKIAFDQNDKPVHVSELFCRANRITLTIDN KRH Prediction of potential genes in microbial genomes Time: Sun May 15 23:37:48 2011 Seq name: gi|296494590|gb|ADTN01000148.1| Escherichia coli MS 146-1 
E_coli146-1-1.0_Cont290.6, whole genome shotgun sequence Length of sequence - 10923 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 3, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 56 - 1141 939 ## JW3339 conserved hypothetical protein 2 1 Op 2 . - CDS 1153 - 2457 1408 ## EC55989_3782 conserved hypothetical protein; putative inner membrane protein 3 1 Op 3 . - CDS 2469 - 2822 508 ## UTI89_C3876 hypothetical protein 4 1 Op 4 1/0.000 - CDS 2833 - 3711 1054 ## COG1735 Predicted metal-dependent hydrolase with the TIM-barrel fold 5 1 Op 5 2/0.000 - CDS 3708 - 4934 1273 ## COG1015 Phosphopentomutase 6 1 Op 6 . - CDS 4934 - 6097 1136 ## COG3457 Predicted amino acid racemase - Prom 6117 - 6176 1.5 7 2 Op 1 . - CDS 6181 - 6543 340 ## SSON_3513 hypothetical protein 8 2 Op 2 . - CDS 6560 - 7465 779 ## B21_03187 hypothetical protein - Prom 7638 - 7697 9.1 - Term 7620 - 7656 2.4 9 3 Op 1 6/0.000 - CDS 7755 - 8759 1266 ## COG0180 Tryptophanyl-tRNA synthetase 10 3 Op 2 9/0.000 - CDS 8752 - 9510 945 ## COG0546 Predicted phosphatases 11 3 Op 3 6/0.000 - CDS 9503 - 10180 815 ## COG0036 Pentose-5-phosphate-3-epimerase 12 3 Op 4 . 
- CDS 10198 - 10833 499 ## COG0338 Site-specific DNA methylase Predicted protein(s) >gi|296494590|gb|ADTN01000148.1| GENE 1 56 - 1141 939 361 aa, chain - ## HITS:1 COG:no KEGG:JW3339 NR:ns ## KEGG: JW3339 # Name: yhfS # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 361 1 361 361 660 99.0 0 MKTFPLQSLTIIEAQQKQFALVDSICRHFPGSEFLTGGDLGLTPGLNQPRVTQRVEQVLA DAFHAQAAALVQGAGTGAIRAGLAALLKPGQRLLVHDAPVYPTTRVIIEQMGLTLITVDF NDLSALKQVVDEQQPDAALVQHTRQQPQDSYVLADVLATLRAAGVPALTDDNYAVMKVAR IGCECGANVSTFSCFKLFGPEGVGAVVGDADVINRIRATLYSGGSQIQGAQALEVLRGLV FAPVMHAVQAGVSERLLALLNGGAVPEVKSAVIANAQSKVLIIEFHQPIAARVLEEVQKR GALPYPVGAESKYEIPPLFYRLSGTFRQANPQSEHCAIRINPNRSGEETVLRILRESIAS I >gi|296494590|gb|ADTN01000148.1| GENE 2 1153 - 2457 1408 434 aa, chain - ## HITS:1 COG:no KEGG:EC55989_3782 NR:ns ## KEGG: EC55989_3782 # Name: yhfT # Def: conserved hypothetical protein; putative inner membrane protein # Organism: E.coli_55989 # Pathway: not_defined # 1 434 1 434 434 729 100.0 0 MDLYIQIIVVACLTGMTSLLAHRSAAVFHDGIRPILPQLIEGYMNRREAGSIAFGLSIGF VASVGISFTLKTGLLNAWLLFLPTDILGVLAINSLMAFGLGAIWGVLILTCLLPVNQLLT ALPVDVLGSLGELSSPVVSAFALFPLVAIFYQFGWKQSLIAAVVVLMTRVVVVRYFPHLN PESIEIFIGMVMLLGIAITHDLRHRDENDIDASGLSVFEERTSRIIKNLPYIAIVGALIA AVASMKIFAGSEVSIFTLEKAYSAGVTPEQSQTLINQAALAEFMRGLGFVPLIATTALAT GVYAVAGFTFVYAVGYLSPNPMVAAVLGAVVISAEVLLLRSIGKWLGRYPSVRNASDNIR NAMNMLMEVALLVGSIFAAIKMAGYTGFSIAVAIYFLNESLGRPVQKMAAPVVAVMITGI LLNVLYWLGLFVPA >gi|296494590|gb|ADTN01000148.1| GENE 3 2469 - 2822 508 117 aa, chain - ## HITS:1 COG:no KEGG:UTI89_C3876 NR:ns ## KEGG: UTI89_C3876 # Name: yhfU # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 117 14 130 130 215 100.0 4e-55 MKKIGVAGLQREQIKKTIEATAPGCFEVFIHNDMEAAMKVKSGQLDYYIGACNTGAGAAL SIAIAVIGYNKSCTIAKPGIKAKDEHIAKMIAEGKVAFGLSVEHVEHAIPMLINHLK >gi|296494590|gb|ADTN01000148.1| GENE 4 2833 - 3711 1054 292 aa, chain - ## HITS:1 COG:php KEGG:ns NR:ns ## COG: php COG1735 # Protein_GI_number: 16131257 # Func_class: R General 
function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Escherichia coli K12 # 1 292 1 292 292 593 100.0 1e-169 MSFDPTGYTLAHEHLHIDLSGFKNNVDCRLDQYAFICQEMNDLMTRGVRNVIEMTNRYMG RNAQFMLDVMRETGINVVACTGYYQDAFFPEHVATRSVQELAQEMVDEIEQGIDGTELKA GIIAEIGTSEGKITPLEEKVFIAAALAHNQTGRPISTHTSFSTMGLEQLALLQAHGVDLS RVTVGHCDLKDNLDNILKMIDLGAYVQFDTIGKNSYYPDEKRIAMLHALRDRGLLNRVML SMDITRRSHLKANGGYGYDYLLTTFIPQLRQSGFSQADVDVMLRENPSQFFQ >gi|296494590|gb|ADTN01000148.1| GENE 5 3708 - 4934 1273 408 aa, chain - ## HITS:1 COG:yhfW KEGG:ns NR:ns ## COG: yhfW COG1015 # Protein_GI_number: 16131258 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphopentomutase # Organism: Escherichia coli K12 # 1 408 1 408 408 825 99.0 0 MARFVVLVIDSFGVGAMKDVTLVRPQDAGANTCGHILSQLPHLQLPTLEKLGLINALGYA PGDMQPSDSATWGVAELQHEGGDTFMGHQEILGTRPLPPLRMPFRDVIDRVEQALVSAGW QVERRGDDLQFLWVNQAVAIGDNLEADLGQVYNITANLSVISFDDAIKIGRIVREQVQVG RVITFGGLLTDSQRILDAAESKEGRFIGINAPRSGAYDNGFQVVHMGYGVDEKVQVPQKL YEAGVPTVLVGKVADIVNNPYGVSWQNLVDSQRIMDITLNEFNTHPTAFICTNIQETDLA GHAEDVARYAERLQVVDRNLARLVEAMQPDDCLVVMADHGNDPTIGHSHHTREVVPVLVY QQGMIATQLGVRTTLSDVGATVCEFFRAPPPQNGRSFLSSLRFAGDTL >gi|296494590|gb|ADTN01000148.1| GENE 6 4934 - 6097 1136 387 aa, chain - ## HITS:1 COG:yhfX KEGG:ns NR:ns ## COG: yhfX COG3457 # Protein_GI_number: 16131259 # Func_class: E Amino acid transport and metabolism # Function: Predicted amino acid racemase # Organism: Escherichia coli K12 # 1 387 1 387 387 792 100.0 0 MFVEALKRQNPALISAALSLWQQGKIAPDSWVIDVDQILENGKRLIETARLYGIELYLMT KQFGRNPWLAEKLLALGYSGIVAVDYKEARVMRRAGLPVAHQGHLVQIPCHQVADAVEQG TDVITVFTLDKAREVSAAAVKAGRIQSVLLKVYSDDDFLYPGQESGFALKVLPEIVAEIQ NLPGLHLAGLTHFPCLLWDEAVGKVLPTPNLHTLIQARDQLAKSGIALEQLNAPSATSCT SLPLLAQYGVTHAEPGHALTGTIPANQQGDQPERIAMLWLSEISHHFRGDSYCYGGGYYR RGHAQHALVFTPENQKITETNLKTVDDSSIDYTLPLAGEFPVSSAVVLCFRTQIFVTRSD VVLVSGIHRGEPEIVGRYDSLGNSLGA >gi|296494590|gb|ADTN01000148.1| GENE 7 6181 - 6543 340 120 aa, chain - ## HITS:1 COG:no KEGG:SSON_3513 NR:ns 
## KEGG: SSON_3513 # Name: yhfY # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 120 15 134 134 220 100.0 1e-56 METRLNLLCEAGVIDKDICKGMMQVVNVLETECHLPVRSEQGTMAMTHMASALMRSRRGE EIEPLDNELLAELAQSSHWQAVVQLHQVLLKEFALEVNPCEEGYLLANLYGLWMAANEEV >gi|296494590|gb|ADTN01000148.1| GENE 8 6560 - 7465 779 301 aa, chain - ## HITS:1 COG:no KEGG:B21_03187 NR:ns ## KEGG: B21_03187 # Name: yhfZ # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 301 1 301 301 605 100.0 1e-172 MRRTFIKKEGVVITTLARYLLGEKCGNRLKTIDELANECRSSVGLTQAALKTLESSGAIR IERRGRNGSYLVEMDNKALLTHVDINNVVCAMPLPYTRLYEGLASGLKAQFDGIPFYYAH MRGADIRVECLLNGVYDMAVVSRLAAESYLTQKGLCLALELGPHTYVGEHQLICRKGESA NVKRVGLDNRSADQKIMTDVFFGGSDVERVDLSYHESLQRIVKGDVDAVIWNVVAENELT MLGLEATPLTDDPRFLQATEAVVLTRVDDYPMQQLLRAVVDKHALLAHQQRVVSGEQEPS Y >gi|296494590|gb|ADTN01000148.1| GENE 9 7755 - 8759 1266 334 aa, chain - ## HITS:1 COG:trpS KEGG:ns NR:ns ## COG: trpS COG0180 # Protein_GI_number: 16131262 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Tryptophanyl-tRNA synthetase # Organism: Escherichia coli K12 # 1 334 1 334 334 684 100.0 0 MTKPIVFSGAQPSGELTIGNYMGALRQWVNMQDDYHCIYCIVDQHAITVRQDAQKLRKAT LDTLALYLACGIDPEKSTIFVQSHVPEHAQLGWALNCYTYFGELSRMTQFKDKSARYAEN INAGLFDYPVLMAADILLYQTNLVPVGEDQKQHLELSRDIAQRFNALYGEIFKVPEPFIP KSGARVMSLLEPTKKMSKSDDNRNNVIGLLEDPKSVVKKIKRAVTDSDEPPVVRYDVQNK AGVSNLLDILSAVTGQSIPELEKQFEGKMYGHLKGEVADAVSGMLTELQERYHRFRNDEA FLQQVMKDGAEKASAHASRTLKAVYEAIGFVAKP >gi|296494590|gb|ADTN01000148.1| GENE 10 8752 - 9510 945 252 aa, chain - ## HITS:1 COG:gph KEGG:ns NR:ns ## COG: gph COG0546 # Protein_GI_number: 16131263 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Escherichia coli K12 # 1 252 1 252 252 493 100.0 1e-139 MNKFEDIRGVAFDLDGTLVDSAPGLAAAVDMALYALELPVAGEERVITWIGNGADVLMER ALTWARQERATQRKTMGKPPVDDDIPAEEQVRILRKLFDRYYGEVAEEGTFLFPHVADTL GALQAKGLPLGLVTNKPTPFVAPLLEALDIAKYFSVVIGGDDVQNKKPHPDPLLLVAERM 
GIAPQQMLFVGDSRNDIQAAKAAGCPSVGLTYGYNYGEAIDLSQPDVIYQSINDLLPALG LPHSENQESKND >gi|296494590|gb|ADTN01000148.1| GENE 11 9503 - 10180 815 225 aa, chain - ## HITS:1 COG:ECs4228 KEGG:ns NR:ns ## COG: ECs4228 COG0036 # Protein_GI_number: 15833482 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Escherichia coli O157:H7 # 1 225 1 225 225 441 100.0 1e-124 MKQYLIAPSILSADFARLGEDTAKALAAGADVVHFDVMDNHYVPNLTIGPMVLKSLRNYG ITAPIDVHLMVKPVDRIVPDFAAAGASIITFHPEASEHVDRTLQLIKENGCKAGLVFNPA TPLSYLDYVMDKLDVILLMSVNPGFGGQSFIPQTLDKLREVRRRIDESGFDIRLEVDGGV KVNNIGEIAAAGADMFVAGSAIFDQPDYKKVIDEMRSELAKVSHE >gi|296494590|gb|ADTN01000148.1| GENE 12 10198 - 10833 499 211 aa, chain - ## HITS:1 COG:ECs4229 KEGG:ns NR:ns ## COG: ECs4229 COG0338 # Protein_GI_number: 15833483 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Escherichia coli O157:H7 # 1 211 68 278 278 423 100.0 1e-118 MRTDEYVQAARELFVPETNCAEVYYQFREEFNKSQDPFRRAVLFLYLNRYGYNGLCRYNL RGEFNVPFGRYKKPYFPEAELYHFAEKAQNAFFYCESYADSMARADDASVVYCDPPYAPL SATANFTAYHTNSFTLEQQAHLAEIAEGLVERHIPVLISNHDTMLTREWYQRAKLHVVKV RRSISSNGGTRKKVDELLALYKPGVVSPAKK Prediction of potential genes in microbial genomes Time: Sun May 15 23:38:14 2011 Seq name: gi|296494589|gb|ADTN01000149.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont290.7, whole genome shotgun sequence Length of sequence - 21543 bp Number of predicted genes - 20, with homology - 19 Number of transcription units - 8, operones - 4 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 170 - 208 -0.9 1 1 Op 1 7/0.000 - CDS 249 - 1535 1082 ## COG3266 Uncharacterized protein conserved in bacteria - Term 1581 - 1611 2.7 2 1 Op 2 20/0.000 - CDS 1627 - 2715 1031 ## COG0337 3-dehydroquinate synthetase 3 1 Op 3 8/0.000 - CDS 2772 - 3293 523 ## COG0703 Shikimate kinase - Prom 3396 - 3455 3.9 - Term 3581 - 3614 0.6 4 1 Op 4 . 
- CDS 3694 - 4932 1050 ## COG4796 Type II secretory pathway, component HofQ 5 1 Op 5 . - CDS 4844 - 5194 176 ## G2583_4089 hypothetical protein 6 1 Op 6 . - CDS 5238 - 5678 238 ## ECO103_4111 hypothetical protein 7 1 Op 7 . - CDS 5662 - 6201 220 ## COG3166 Tfp pilus assembly protein PilN 8 1 Op 8 . - CDS 6201 - 6980 727 ## JW5693 predicted pilus assembly protein - Prom 7053 - 7112 2.0 + Prom 7017 - 7076 3.4 9 2 Tu 1 . + CDS 7100 - 9652 2718 ## COG5009 Membrane carboxypeptidase/penicillin-binding protein + Term 9653 - 9692 3.0 - Term 9648 - 9673 -0.5 10 3 Op 1 . - CDS 9755 - 9817 62 ## 11 3 Op 2 . - CDS 9818 - 10378 609 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes - Prom 10461 - 10520 7.0 + Prom 10420 - 10479 5.3 12 4 Op 1 . + CDS 10698 - 12833 1839 ## JW3361 predicted inner membrane protein + Term 12838 - 12884 10.1 13 4 Op 2 4/0.500 + CDS 12898 - 13566 905 ## COG1011 Predicted hydrolase (HAD superfamily) 14 4 Op 3 3/0.500 + CDS 13577 - 13978 452 ## COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) 15 4 Op 4 . + CDS 14003 - 14881 1112 ## COG1281 Disulfide bond chaperones of the HSP33 family + Term 14890 - 14925 7.4 16 5 Tu 1 . - CDS 14944 - 16668 1449 ## B21_03206 hypothetical protein - Prom 16890 - 16949 5.7 + Prom 16832 - 16891 4.0 17 6 Tu 1 . + CDS 17047 - 18669 2033 ## COG1866 Phosphoenolpyruvate carboxykinase (ATP) + Term 18718 - 18751 4.4 - Term 18699 - 18743 7.1 18 7 Op 1 40/0.000 - CDS 18745 - 20097 1345 ## COG0642 Signal transduction histidine kinase 19 7 Op 2 . - CDS 20094 - 20813 981 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 20918 - 20977 11.2 + Prom 20900 - 20959 4.9 20 8 Tu 1 . 
+ CDS 21041 - 21517 648 ## COG0782 Transcription elongation factor Predicted protein(s) >gi|296494589|gb|ADTN01000149.1| GENE 1 249 - 1535 1082 428 aa, chain - ## HITS:1 COG:damX KEGG:ns NR:ns ## COG: damX COG3266 # Protein_GI_number: 16131266 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 428 1 428 428 484 100.0 1e-137 MDEFKPEDELKPDPSDRRTGRSRQSSERSERTERGEPQINFDDIELDDTDDRRPTRAQKE RNEEPEIEEEIDESEDETVDEERVERRPRKRKKAASKPASRQYMMMGVGILVLLLLIIGI GSALKAPSTTSSDQTASGEKSIDLAGNATDQANGVQPAPGTTSAENTQQDVSLPPISSTP TQGQTPVATDGQQRVEVQGDLNNALTQPQNQQQLNNVAVNSTLPTEPATVAPVRNGNASR DTAKTQTAERPSTTRPARQQAVIEPKKPQATVKTEPKPVAQTPKRTEPAAPVASTKAPAA TSTPAPKETATTAPVQTASPAQTTATPAAGAKTAGNVGSLKSAPSSHYTLQLSSSSNYDN LNGWAKKENLKNYVVYETTRNGQPWYVLVSGVYASKEEAKKAVSTLPADVQAKNPWAKPL RQVQADLK >gi|296494589|gb|ADTN01000149.1| GENE 2 1627 - 2715 1031 362 aa, chain - ## HITS:1 COG:aroB KEGG:ns NR:ns ## COG: aroB COG0337 # Protein_GI_number: 16131267 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate synthetase # Organism: Escherichia coli K12 # 1 362 1 362 362 694 100.0 0 MERIVVTLGERSYPITIASGLFNEPASFLPLKSGEQVMLVTNETLAPLYLDKVRGVLEQA GVNVDSVILPDGEQYKSLAVLDTVFTALLQKPHGRDTTLVALGGGVVGDLTGFAAASYQR GVRFIQVPTTLLSQVDSSVGGKTAVNHPLGKNMIGAFYQPASVVVDLDCLKTLPPRELAS GLAEVIKYGIILDGAFFNWLEENLDALLRLDGPAMAYCIRRCCELKAEVVAADERETGLR ALLNLGHTFGHAIEAEMGYGNWLHGEAVAAGMVMAARTSERLGQFSSAETQRIITLLKRA GLPVNGPREMSAQAYLPHMLRDKKVLAGEMRLILPLAIGKSEVRSGVSHELVLNAIADCQ SA >gi|296494589|gb|ADTN01000149.1| GENE 3 2772 - 3293 523 173 aa, chain - ## HITS:1 COG:ECs4232 KEGG:ns NR:ns ## COG: ECs4232 COG0703 # Protein_GI_number: 15833486 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Escherichia coli O157:H7 # 1 173 68 240 240 321 100.0 4e-88 MAEKRNIFLVGPMGAGKSTIGRQLAQQLNMEFYDSDQEIEKRTGADVGWVFDLEGEEGFR DREEKVINELTEKQGIVLATGGGSVKSRETRNRLSARGVVVYLETTIEKQLARTQRDKKR 
PLLHVETPPREVLEALANERNPLYEEIADVTIRTDDQSAKVVANQIIHMLESN >gi|296494589|gb|ADTN01000149.1| GENE 4 3694 - 4932 1050 412 aa, chain - ## HITS:1 COG:hofQ KEGG:ns NR:ns ## COG: hofQ COG4796 # Protein_GI_number: 16131268 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component HofQ # Organism: Escherichia coli K12 # 1 412 1 412 412 738 100.0 0 MKQWIAALLLMLIPGVQAAKPQKVTLMVDDVPVAQVLQALAEQEKLNLVVSPDVSGTVSL HLTDVPWKQALQTVVKSAGLITRQEGNILSVHSIAWQNNNIARQEAEQARAQANLPLENR SITLQYADAGELAKAGEKLLSAKGSMTVDKRTNRLLLRDNKTALSALEQWVAQMDLPVGQ VELSAHIVTINEKSLRELGVKWTLADAQHAGGVGQVTTLGSDLSVATATTHVGFNIGRIN GRLLDLELSALEQKQQLDIIASPRLLASHLQPASIKQGSEIPYQVSSGESGATSVEFKEA VLGMEVTPTVLQKGRIRLKLHISQNVPGQVLQQADGEVLAIDKQEIETQVEVKSGETLAL GGIFTRKNKSGQDSVPLLGDIPWFGQLFRHDGKEDERRELVVFITPRLVSSE >gi|296494589|gb|ADTN01000149.1| GENE 5 4844 - 5194 176 116 aa, chain - ## HITS:1 COG:no KEGG:G2583_4089 NR:ns ## KEGG: G2583_4089 # Name: hofP # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 116 126 241 241 235 100.0 4e-61 MRDPFKPPEDLCRISELSQWRYQGMVGRGERIIGVIKDGQKKWRRVQQNDVLENGWTILQ LTPDVLTLGTGTNCEPPQWLWQRQGDTNEAMDSRTTVDADTRRTGGKAAKSDADGG >gi|296494589|gb|ADTN01000149.1| GENE 6 5238 - 5678 238 146 aa, chain - ## HITS:1 COG:no KEGG:ECO103_4111 NR:ns ## KEGG: ECO103_4111 # Name: hofO # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 146 1 146 146 266 100.0 2e-70 MNMFFDWWFATSPRLRQLCWAFWLLMLVTLIFLSSTHHEERDALIRLRASHHQQWAALYR LVDTTPFSEEKTLPFSPLDFQLSGAQLVSWHPSAQGGELALKTLWEAVPSAFTRLAERNV SVSRFSLSVEGDDLLFTLQLETPHEG >gi|296494589|gb|ADTN01000149.1| GENE 7 5662 - 6201 220 179 aa, chain - ## HITS:1 COG:yrfC KEGG:ns NR:ns ## COG: yrfC COG3166 # Protein_GI_number: 16131271 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Tfp pilus assembly protein PilN # Organism: Escherichia coli K12 # 1 179 1 179 179 253 100.0 1e-67 
MNPPINFLPWRQQRRTAFLRFWLLMFVAPLLLAVGITLILRLTGSAEARIDAVLLQAEQQ LARSLQITKPRLLEQQQLREQRSQRQRQRQFTRDWQSALEALAALLPEHAWLTTISWQQG TLEIKGLTTSITALNALETSLRQDASFHLNQRGATQQDAQGRWQFEYQLTRKVSDEHVL >gi|296494589|gb|ADTN01000149.1| GENE 8 6201 - 6980 727 259 aa, chain - ## HITS:1 COG:no KEGG:JW5693 NR:ns ## KEGG: JW5693 # Name: yrfD # Def: predicted pilus assembly protein # Organism: E.coli_J # Pathway: not_defined # 1 259 1 259 259 493 100.0 1e-138 MAFKIWQIGLHLQQQEAVAVAIVRGAKECFLQRWWRLPLENDIIKDGRIVDAQQLAKTLL PWSRELPQRHHIMLAFPASRTLQRSFPRPSMSLGEREQTAWLSGTMARELDMDPDSLRFD YSEDSLSPAYNVTAAQSKELATLLTLAERLRVHVSAITPDASALQRFLPFLPSHQQCLAW RDNEQWLWATRYSWGRKLAVGMTSAKELAAALSVDPESVAICGEGGFDPWEAVSVRQPPL PPPGGDFAIALGLALGKAY >gi|296494589|gb|ADTN01000149.1| GENE 9 7100 - 9652 2718 850 aa, chain + ## HITS:1 COG:mrcA KEGG:ns NR:ns ## COG: mrcA COG5009 # Protein_GI_number: 16131273 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase/penicillin-binding protein # Organism: Escherichia coli K12 # 1 850 9 858 858 1710 99.0 0 MKFVKYFLILAVCCILLGAGSIYGLYRYIEPQLPDVATLKDVRLQIPMQIYSADGELIAQ YGEKRRIPVTLDQIPPEMVKAFIATEDSRFYEHHGVDPVGIFRAASVALFSGHASQGAST ITQQLARNFFLSPERTLMRKIKEVFLAIRIEQLLTKDEILELYLNKIYLGYRAYGVGAAA QVYFGKTVDQLTLNEMAVIAGLPKAPSTFNPLYSMDRAVARRNVVLSRMLDEGYITQQQF DQTRTEAINANYHAPEIAFSAPYLSEMVRQEMYNRYGESAYEDGYRIYTTITRKVQQAAQ QAVRNNVLDYDMRHGYRGPANVLWKVGESAWDNNKITDTLKALPTYGPLLPAAVTSANPQ QATAMLADGSTVALSMEGVRWARPYRSDTQQGPTPRKVTDVLQTGQQIWVRQVGDAWWLA QVPEVNSALVSINPQNGAVMALVGGFDFNQSKFNRATQALRQVGSNIKPFLYTAAMDKGL TLASMLNDVPISRWDASAGSDWQPKNSPPQYAGPIRLRQGLGQSKNVVMVRAMRAMGVDY AAEYLQRFGFPAQNIVHTESLALGSASFTPMQVARGYAVMANGGFLVDPWFISKIENDQG GVIFEAKPKVACPECDIPVIYGDTQKSNVLENNDVEDVAISREQQNVSVPMPQLEQANQA LVAKTGAQEYAPHVINTPLAFLIKSALNTNIFGEPGWQGTGWRAGRDLQRRDIGGKTGTT NSSKDAWFSGYGPGVVTSVWIGFDDHRRNLGHTTASGAIKDQISGYEGGAKSAQPAWDAY MKAVLEGVPEQPLTPPPGIVTVNIDRSTGQLANGGNSREEYFIEGTQPTQQAVHEVGTTI IDNGEAQELF >gi|296494589|gb|ADTN01000149.1| GENE 10 9755 - 9817 62 20 aa, chain - ## HITS:0 COG:no 
KEGG:no NR:no MREVCLMHNVYQAYGVSAIY >gi|296494589|gb|ADTN01000149.1| GENE 11 9818 - 10378 609 186 aa, chain - ## HITS:1 COG:yrfE KEGG:ns NR:ns ## COG: yrfE COG0494 # Protein_GI_number: 16131274 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli K12 # 1 186 1 186 186 360 100.0 1e-99 MSKSLQKPTILNVETVARSRLFTVESVDLEFSNGVRRVYERMRPTNREAVMIVPIVDDHL ILIREYAVGTESYELGFSKGLIDPGESVYEAANRELKEEVGFGANDLTFLKKLSMAPSYF SSKMNIVVAQDLYPESLEGDEPEPLPQVRWPLAHMMDLLEDPDFNEARNVSALFLVREWL KGQGRV >gi|296494589|gb|ADTN01000149.1| GENE 12 10698 - 12833 1839 711 aa, chain + ## HITS:1 COG:no KEGG:JW3361 NR:ns ## KEGG: JW3361 # Name: yrfF # Def: predicted inner membrane protein # Organism: E.coli_J # Pathway: not_defined # 1 711 1 711 711 1363 100.0 0 MSTIVIFLAALLACSLLAGWLIKVRSRRRQLPWTNAFADAQTRKLTPEERSAVENYLESL TQVLQVPGPTGASAAPISLALNAESNNVMMLTHAITRYGISTDDPNKWRYYLDSVEVHLP PFWEQYINDENTVELIHTDSLPLVISLNGHTLQEYMQETRSYALQPVPSTQASIRGEESE QIELLNIRKETHEEYALSRPRGLREALLIVASFLMFFFCLITPDVFVPWLAGGALLLLGA GLWGLFAPPAKSSLREIHCLRGTPRRWGLFGENDQEQINNISLGIIDLVYPAHWQPYIAQ DLGQQTDIDIYLDRHVVRQGRYLSLHDEVKNFPLQHWLRSTIIAAGSLLVLFMLLFWIPL DMPLKFTLSWMKGAQTIEATSVKQLADAGVRVGDTLRISGTGMCNIRTSGTWSAKTNSPF LPFDCSQIIWNDARSLPLPESELVNKATALTEAVNRQLHPKPEDESRVSASLRSAIQKSG MVLLDDFGDIVLKTADLCSAKDDCVRLKNALVNLGNSKDWDALVKRANAGKLDGVNVLLR PVSAESLDNLVATSTAPFITHETARAAQSLNSPAPGGFLIVSDEGSDFVDQPWPSASLYD YPPQEQWNAFQKLAQMLMHTPFNAEGIVTKIFTDANGTQHIGLHPIPDRSGLWRYLSTTL LLLTMLGSAIYNGVQAWRRYQRHRTRMMEIQAYYESCLNPQLITPSESLIE >gi|296494589|gb|ADTN01000149.1| GENE 13 12898 - 13566 905 222 aa, chain + ## HITS:1 COG:ECs4241 KEGG:ns NR:ns ## COG: ECs4241 COG1011 # Protein_GI_number: 15833495 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Escherichia coli O157:H7 # 1 222 16 237 237 459 100.0 1e-129 MHINIAWQDVDTVLLDMDGTLLDLAFDNYFWQKLVPETWGAKNGVTPQEAMEYMRQQYHD 
VQHTLNWYCLDYWSEQLGLDICAMTTEMGPRAVLREDTIPFLEALKASGKQRILLTNAHP HNLAVKLEHTGLDAHLDLLLSTHTFGYPKEDQRLWHAVAEATGLKAERTLFIDDSEAILD AAAQFGIRYCLGVTNPDSGIAEKQYQRHPSLNDYRRLIPSLM >gi|296494589|gb|ADTN01000149.1| GENE 14 13577 - 13978 452 133 aa, chain + ## HITS:1 COG:ECs4242 KEGG:ns NR:ns ## COG: ECs4242 COG1188 # Protein_GI_number: 15833496 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) # Organism: Escherichia coli O157:H7 # 1 133 1 133 133 214 100.0 3e-56 MKEKPAVEVRLDKWLWAARFYKTRALAREMIEGGKVHYNGQRSKPSKIVELNATLTLRQG NDERTVIVKAITEQRRPASEAALLYEETAESVEKREKMALARKLNALTMPHPDRRPDKKE RRDLLRFKHGDSE >gi|296494589|gb|ADTN01000149.1| GENE 15 14003 - 14881 1112 292 aa, chain + ## HITS:1 COG:ZyrfI KEGG:ns NR:ns ## COG: ZyrfI COG1281 # Protein_GI_number: 15803904 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Disulfide bond chaperones of the HSP33 family # Organism: Escherichia coli O157:H7 EDL933 # 1 292 3 294 294 585 100.0 1e-167 MPQHDQLHRYLFENFAVRGELVTVSETLQQILENHDYPQPVKNVLAELLVATSLLTATLK FDGDITVQLQGDGPMNLAVINGNNNQQMRGVARVQGEIPENADLKTLVGNGYVVITITPS EGERYQGVVGLEGDTLAACLEDYFMRSEQLPTRLFIRTGDVDGKPAAGGMLLQVMPAQNA QQDDFDHLATLTETIKTEELLTLPANEVLWRLYHEEEVTVYDPQDVEFKCTCSRERCADA LKTLPDEEVDSILAEDGEIDMHCDYCGNHYLFNAMDIAEIRNNASPADPQVH >gi|296494589|gb|ADTN01000149.1| GENE 16 14944 - 16668 1449 574 aa, chain - ## HITS:1 COG:no KEGG:B21_03206 NR:ns ## KEGG: B21_03206 # Name: yhgE # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 574 1 574 574 983 99.0 0 MDNVELSPATRWGMIATGLLQGLVCYLLIAWLSGKNHSWIVYGVPATVAFSSVLLFSVIS FKQKRLWGWLALVFIATLGMSGWLKWQTDGMNPWRAEKALWDFGCYLLLMAMLLLPWIQQ SLRIRNDSSRYRYFYQSVWHNVLILLVIFLANGLTWLVLLLWSELFKLVGITFFNTLFFA TDWFIYLTLGLVTALAVILARTQSRLIDSIQKLFTLIATGLLPLVSLLTLMFIITLPFTG LSAISRHISAAGLLLTLAFLQLILMAIVRDPQKASLPWTGPLRCLIKTALLVAPLYVFVA 
AWALWLRVAQYGWTVDRLQGVLAVLVLLVWSLGYFVSIVWRKGQNPVVLQGKVNLAVSLL VLVILVLLNSPVLDSMRISVNSHMARYQSGKNTSDQVTIYMLEQSGRYGRAALESLKSDA GFMKDPKRARDLLMALDGEQHLQQQVSEKVLADNVLIAPGSVKPDATFWSALIQDRYNVM TCIEKDACVLVEQDLNSDGQAERILFAFNDDRVIVYGFDSDRKEWDALDMSLLPNEITKE KLLTAAKDGKLGTKPKAWRDLVVDGERLNVNLNE >gi|296494589|gb|ADTN01000149.1| GENE 17 17047 - 18669 2033 540 aa, chain + ## HITS:1 COG:pckA KEGG:ns NR:ns ## COG: pckA COG1866 # Protein_GI_number: 16131280 # Func_class: C Energy production and conversion # Function: Phosphoenolpyruvate carboxykinase (ATP) # Organism: Escherichia coli K12 # 1 540 1 540 540 1119 100.0 0 MRVNNGLTPQELEAYGISDVHDIVYNPSYDLLYQEELDPSLTGYERGVLTNLGAVAVDTG IFTGRSPKDKYIVRDDTTRDTFWWADKGKGKNDNKPLSPETWQHLKGLVTRQLSGKRLFV VDAFCGANPDTRLSVRFITEVAWQAHFVKNMFIRPSDEELAGFKPDFIVMNGAKCTNPQW KEQGLNSENFVAFNLTERMQLIGGTWYGGEMKKGMFSMMNYLLPLKGIASMHCSANVGEK GDVAVFFGLSGTGKTTLSTDPKRRLIGDDEHGWDDDGVFNFEGGCYAKTIKLSKEAEPEI YNAIRRDALLENVTVREDGTIDFDDGSKTENTRVSYPIYHIDNIVKPVSKAGHATKVIFL TADAFGVLPPVSRLTADQTQYHFLSGFTAKLAGTERGITEPTPTFSACFGAAFLSLHPTQ YAEVLVKRMQAAGAQAYLVNTGWNGTGKRISIKDTRAIIDAILNGSLDNAETFTLPMFNL AIPTELPGVDTKILDPRNTYASPEQWQEKAETLAKLFIDNFDKYTDTPAGAALVAAGPKL >gi|296494589|gb|ADTN01000149.1| GENE 18 18745 - 20097 1345 450 aa, chain - ## HITS:1 COG:envZ KEGG:ns NR:ns ## COG: envZ COG0642 # Protein_GI_number: 16131281 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 450 1 450 450 897 99.0 0 MRRLRFSPRSSFARTLLLIVTLLFASLVTTYLVVLNFAILPSLQQFNKVLAYEVRMLMTD KLQLEDGTQLVVPPAFRREIYRELGISLYSNEAAEEAGLRWAQHYEFLSHQMAQQLGGPT EVRVEVNKSSPVVWLKTWLSPNIWVRVPLTEIHQGDFSPLFRYTLAIMLLAIGGAWLFIR IQNRPLVDLEHAALQVGKGIIPPPLREYGASEVRSVTRAFNHMAAGVKQLADDRTLLMAG VSHDLRTPLTRIRLATEMMGEQDGYLAESINKDIEECNAIIEQFIDYLRTGQEMPMEMAD LNAVLGEVIAAESGYEREIETALYPGSIEVKMHPLSIKRAVANMVVNAARYGNGWIKVSS GTEPNRAWFQVEDDGPGIAPEQRKHLFQPFVRGDSARTISGTGLGLAIVQRIVDNHNGML ELGTSERGGLSIRAWLPVPVTRAQGTTKEG >gi|296494589|gb|ADTN01000149.1| GENE 19 20094 - 20813 981 239 aa, 
chain - ## HITS:1 COG:ECs4247 KEGG:ns NR:ns ## COG: ECs4247 COG0745 # Protein_GI_number: 15833501 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 239 1 239 239 446 99.0 1e-125 MQENYKILVVDDDMRLRALLERYLTEQGFQVRSVANAEQMDRLLTRESFHLMVLDLMLPG EDGLSICRRLRSQSNPMPIIMVTAKGEEVDRIVGLEIGADDYIPKPFNPRELLARIRAVL RRQANELPGAPSQEEAVIAFGKFKLNLGTREMFREDEPMPLTSGEFAVLKALVSHPREPL SRDKLMNLARGREYSAMERSIDVQISRLRRMGEEDPAHPRYIQTVWGLGYVFVPDGSKA >gi|296494589|gb|ADTN01000149.1| GENE 20 21041 - 21517 648 158 aa, chain + ## HITS:1 COG:ECs4248 KEGG:ns NR:ns ## COG: ECs4248 COG0782 # Protein_GI_number: 15833502 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Escherichia coli O157:H7 # 1 158 13 170 170 305 99.0 3e-83 MKTPLVTREGYEKLKQELNYLWREERPEVTKKVTWAASLGDRSENADYQYNKKRLREIDR RVRYLTKCLENLKIVDYSPQQEGKVFFGAWVEIENDDGVTHRFRIVGYDEIFGRKDYISI DSPMARALLKKEVGDLAVVNTPAGEASWYVNAIEYVKP Prediction of potential genes in microbial genomes Time: Sun May 15 23:39:03 2011 Seq name: gi|296494588|gb|ADTN01000150.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont290.8, whole genome shotgun sequence Length of sequence - 70608 bp Number of predicted genes - 61, with homology - 61 Number of transcription units - 35, operones - 14 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.500 + CDS 76 - 2397 1644 ## PROTEIN SUPPORTED gi|51894064|ref|YP_076755.1| ribosomal protein S1-like protein + Term 2496 - 2530 6.0 2 1 Op 2 4/0.500 + CDS 76 - 2397 166 ## PROTEIN SUPPORTED gi|124005220|ref|ZP_01690062.1| 30S ribosomal protein S1 + Term 2496 - 2530 6.0 3 2 Op 1 22/0.000 + CDS 2854 - 3081 260 ## COG1918 Fe2+ transport system protein A 4 2 Op 2 . + CDS 3098 - 5419 2843 ## COG0370 Fe2+ transport system protein B 5 2 Op 3 . 
+ CDS 5419 - 5655 210 ## SDY_3666 hypothetical protein + Term 5682 - 5719 6.1 + Prom 5682 - 5741 2.4 6 3 Tu 1 . + CDS 5858 - 6736 669 ## COG5464 Uncharacterized conserved protein 7 4 Tu 1 . - CDS 6765 - 7535 638 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) - Prom 7566 - 7625 2.9 8 5 Op 1 5/0.250 + CDS 7573 - 8256 209 ## COG1040 Predicted amidophosphoribosyltransferases 9 5 Op 2 3/0.667 + CDS 8315 - 8890 684 ## COG0694 Thioredoxin-like proteins and domains + Term 8955 - 8990 0.7 + Prom 9139 - 9198 4.0 10 6 Tu 1 . + CDS 9250 - 10566 1804 ## COG2610 H+/gluconate symporter and related permeases + Term 10755 - 10798 1.6 11 7 Op 1 7/0.083 - CDS 10677 - 12761 2019 ## COG1640 4-alpha-glucanotransferase 12 7 Op 2 . - CDS 12771 - 15164 2768 ## COG0058 Glucan phosphorylase - Prom 15284 - 15343 2.7 + Prom 15692 - 15751 6.1 13 8 Tu 1 . + CDS 15776 - 18481 2505 ## COG2909 ATP-dependent transcriptional regulator + Term 18484 - 18527 0.5 - Term 18330 - 18368 -0.6 14 9 Op 1 3/0.667 - CDS 18524 - 19540 777 ## COG0430 RNA 3'-terminal phosphate cyclase 15 9 Op 2 . - CDS 19544 - 20770 1325 ## COG1690 Uncharacterized conserved protein - Prom 20833 - 20892 5.2 + Prom 20803 - 20862 4.4 16 10 Tu 1 . + CDS 20959 - 22557 1378 ## COG4650 Sigma54-dependent transcription regulator containing an AAA-type ATPase domain and a DNA-binding domain - Term 22340 - 22374 -1.0 17 11 Op 1 6/0.083 - CDS 22539 - 23297 828 ## COG1349 Transcriptional regulators of sugar metabolism 18 11 Op 2 4/0.500 - CDS 23314 - 24144 879 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) 19 11 Op 3 . - CDS 24189 - 24515 501 ## COG0607 Rhodanese-related sulfurtransferase - Prom 24617 - 24676 4.6 + Prom 24603 - 24662 5.7 20 12 Tu 1 . + CDS 24705 - 26210 1722 ## COG0578 Glycerol-3-phosphate dehydrogenase + Term 26308 - 26345 -0.9 - Term 26339 - 26378 1.2 21 13 Tu 1 . 
- CDS 26416 - 26697 149 ## COG0226 ABC-type phosphate transport system, periplasmic component - Prom 26720 - 26779 2.4 - Term 26775 - 26810 4.9 22 14 Op 1 10/0.000 - CDS 26826 - 29273 3051 ## COG0058 Glucan phosphorylase 23 14 Op 2 17/0.000 - CDS 29292 - 30725 1258 ## COG0297 Glycogen synthase 24 14 Op 3 7/0.083 - CDS 30725 - 32020 1251 ## COG0448 ADP-glucose pyrophosphorylase 25 14 Op 4 9/0.000 - CDS 32038 - 34011 1500 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases 26 14 Op 5 4/0.500 - CDS 34008 - 36194 2498 ## COG0296 1,4-alpha-glucan branching enzyme - Prom 36349 - 36408 3.1 - Term 36419 - 36448 1.1 27 15 Tu 1 . - CDS 36467 - 37570 1108 ## COG0136 Aspartate-semialdehyde dehydrogenase - Prom 37736 - 37795 3.7 + Prom 37541 - 37600 3.4 28 16 Tu 1 . + CDS 37763 - 38356 653 ## COG2095 Multiple antibiotic transporter - Term 38368 - 38403 5.2 29 17 Op 1 4/0.500 - CDS 38413 - 39753 1553 ## COG2610 H+/gluconate symporter and related permeases 30 17 Op 2 1/0.750 - CDS 39757 - 40245 512 ## COG3265 Gluconate kinase - Prom 40335 - 40394 2.0 31 18 Tu 1 . - CDS 40423 - 41418 1040 ## COG1609 Transcriptional regulators - Prom 41490 - 41549 4.4 - Term 41592 - 41621 0.4 32 19 Tu 1 5/0.250 - CDS 41642 - 42337 895 ## COG1741 Pirin-related protein - Prom 42388 - 42447 4.5 - Term 42401 - 42453 3.6 33 20 Tu 1 . - CDS 42460 - 43497 1189 ## COG0673 Predicted dehydrogenases and related proteins - Prom 43708 - 43767 4.4 34 21 Tu 1 . + CDS 43444 - 43596 176 ## EcE24377A_3919 hypothetical protein + Prom 43741 - 43800 3.2 35 22 Tu 1 . + CDS 43830 - 44318 347 ## PROTEIN SUPPORTED gi|229877854|ref|ZP_04497362.1| acetyltransferase, ribosomal protein N-acetylase + Term 44348 - 44384 -1.0 36 23 Op 1 . + CDS 45617 - 45733 57 ## B21_03246 hypothetical protein 37 23 Op 2 . + CDS 45769 - 46224 191 ## ECIAI1_3588 hypothetical protein + Term 46433 - 46480 1.5 + Prom 46466 - 46525 4.3 38 24 Tu 1 . 
+ CDS 46674 - 46958 278 ## B21_03248 hypothetical protein - Term 46954 - 46987 5.4 39 25 Tu 1 . - CDS 46996 - 48738 1978 ## COG0405 Gamma-glutamyltransferase - Prom 48825 - 48884 2.2 + Prom 48776 - 48835 3.7 40 26 Tu 1 . + CDS 48858 - 49298 367 ## S4297 hypothetical protein + Term 49346 - 49389 8.7 - Term 49220 - 49263 8.1 41 27 Op 1 4/0.500 - CDS 49285 - 50028 840 ## COG0584 Glycerophosphoryl diester phosphodiesterase 42 27 Op 2 21/0.000 - CDS 50025 - 51095 1425 ## COG3839 ABC-type sugar transport systems, ATPase components 43 27 Op 3 38/0.000 - CDS 51097 - 51942 969 ## COG0395 ABC-type sugar transport system, permease component 44 27 Op 4 35/0.000 - CDS 51939 - 52826 991 ## COG1175 ABC-type sugar transport systems, permease components - Prom 52863 - 52922 2.6 45 27 Op 5 . - CDS 52924 - 54240 1758 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 54316 - 54375 7.3 - Term 54523 - 54565 -0.8 46 28 Op 1 18/0.000 - CDS 54639 - 55352 267 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 47 28 Op 2 19/0.000 - CDS 55354 - 56121 258 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 48 28 Op 3 24/0.000 - CDS 56118 - 57395 1558 ## COG4177 ABC-type branched-chain amino acid transport system, permease component 49 28 Op 4 20/0.000 - CDS 57392 - 58318 1372 ## COG0559 Branched-chain amino acid ABC-type transport system, permease components 50 28 Op 5 . - CDS 58366 - 59475 1337 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component - Prom 59495 - 59554 2.1 + Prom 59801 - 59860 5.4 51 29 Tu 1 . + CDS 59899 - 60282 366 ## JW3424 conserved hypothetical protein + Term 60443 - 60474 2.1 52 30 Tu 1 . - CDS 60470 - 61573 1508 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component - Prom 61676 - 61735 5.1 - Term 61786 - 61827 6.3 53 31 Tu 1 . 
- CDS 61844 - 62698 1068 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) - Prom 62795 - 62854 6.2 - Term 62860 - 62899 3.7 54 32 Op 1 28/0.000 - CDS 62943 - 64001 1006 ## COG2177 Cell division protein 55 32 Op 2 9/0.000 - CDS 63994 - 64662 348 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 56 32 Op 3 . - CDS 64665 - 66158 747 ## PROTEIN SUPPORTED gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 - Prom 66230 - 66289 5.1 + Prom 66119 - 66178 3.2 57 33 Op 1 6/0.083 + CDS 66308 - 66904 190 ## PROTEIN SUPPORTED gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 58 33 Op 2 . + CDS 66894 - 67163 347 ## COG3776 Predicted membrane protein - Term 67053 - 67085 -1.0 59 34 Tu 1 . - CDS 67166 - 67525 356 ## JW3432 conserved hypothetical protein - Prom 67549 - 67608 3.4 60 35 Op 1 5/0.250 + CDS 67666 - 68292 733 ## COG3714 Predicted membrane protein + Term 68303 - 68334 3.4 61 35 Op 2 . + CDS 68366 - 70564 2473 ## COG2217 Cation transport ATPase Predicted protein(s) >gi|296494588|gb|ADTN01000150.1| GENE 1 76 - 2397 1644 773 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|51894064|ref|YP_076755.1| ribosomal protein S1-like protein [Symbiobacterium thermophilum IAM 14863] # 1 743 1 743 764 637 48 0.0 MMNDSFCRIIAGEIQARPEQVDAAVRLLDEGNTVPFIARYRKEITGGLDDTQLRNLETRL SYLRELEERRQAILKSISEQGKLTDDLAKAINATLSKTELEDLYLPYKPKRRTRGQIAIE AGLEPLADLLWSDPSHTPEVAAAQYVDADKGVADTKAALDGARYILMERFAEDAALLAKV RDYLWKNAHLVSTVVSGKEEEGAKFRDYFDHHEPLSTVPSHRALAMFRGRNEGVLQLSLN ADPQFDEPPKESYCEQIIMDHLGLRLNNAPADSWRKGVVSWTWRIKVLMHLETELMGTVR ERAEDEAINVFARNLHDLLMAAPAGLRATMGLDPGLRTGVKVAVVDATGKLVATDTIYPH TGQAAKAAMTVAALCEKHNVELVAIGNGTASRETERFYLDVQKQFPKVTAQKVIVSEAGA SVYSASELAAQEFPDLDVSLRGAVSIARRLQDPLAELVKIDPKSIGVGQYQHDVSQTQLA RKLDAVVEDCVNAVGVDLNTASVPLLTRVAGLTRMMAQNIVAWRDENGQFQNRQQLLKVS RLGPKAFEQCAGFLRINHGDNPLDASTVHPEAYPVVERILAATQQALKDLMGNSSELRNL KASDFTDEKFGVPTVTDIIKELEKPGRDPRPEFKTAQFADGVETMNDLQPGMILEGAVTN 
VTNFGAFVDIGVHQDGLVHISSLSNKFVEDPHTVVKAGDIVKVKVLEVDLQRKRIALTMR LDEQPGETNARRGGGNERPQNNRPAAKPRGREAQPAGNSAMMDALAAAMGKKR >gi|296494588|gb|ADTN01000150.1| GENE 2 76 - 2397 166 773 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|124005220|ref|ZP_01690062.1| 30S ribosomal protein S1 [Microscilla marina ATCC 23134] # 645 770 206 322 596 68 33 9e-11 MMNDSFCRIIAGEIQARPEQVDAAVRLLDEGNTVPFIARYRKEITGGLDDTQLRNLETRL SYLRELEERRQAILKSISEQGKLTDDLAKAINATLSKTELEDLYLPYKPKRRTRGQIAIE AGLEPLADLLWSDPSHTPEVAAAQYVDADKGVADTKAALDGARYILMERFAEDAALLAKV RDYLWKNAHLVSTVVSGKEEEGAKFRDYFDHHEPLSTVPSHRALAMFRGRNEGVLQLSLN ADPQFDEPPKESYCEQIIMDHLGLRLNNAPADSWRKGVVSWTWRIKVLMHLETELMGTVR ERAEDEAINVFARNLHDLLMAAPAGLRATMGLDPGLRTGVKVAVVDATGKLVATDTIYPH TGQAAKAAMTVAALCEKHNVELVAIGNGTASRETERFYLDVQKQFPKVTAQKVIVSEAGA SVYSASELAAQEFPDLDVSLRGAVSIARRLQDPLAELVKIDPKSIGVGQYQHDVSQTQLA RKLDAVVEDCVNAVGVDLNTASVPLLTRVAGLTRMMAQNIVAWRDENGQFQNRQQLLKVS RLGPKAFEQCAGFLRINHGDNPLDASTVHPEAYPVVERILAATQQALKDLMGNSSELRNL KASDFTDEKFGVPTVTDIIKELEKPGRDPRPEFKTAQFADGVETMNDLQPGMILEGAVTN VTNFGAFVDIGVHQDGLVHISSLSNKFVEDPHTVVKAGDIVKVKVLEVDLQRKRIALTMR LDEQPGETNARRGGGNERPQNNRPAAKPRGREAQPAGNSAMMDALAAAMGKKR >gi|296494588|gb|ADTN01000150.1| GENE 3 2854 - 3081 260 75 aa, chain + ## HITS:1 COG:ECs4250 KEGG:ns NR:ns ## COG: ECs4250 COG1918 # Protein_GI_number: 15833504 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein A # Organism: Escherichia coli O157:H7 # 1 75 1 75 75 147 100.0 6e-36 MQYTPDTAWKITGFSREISPAYRQKLLSLGMLPGSSFNVVRVAPLGDPIHIETRRVSLVL RKKDLALLEVEAVSC >gi|296494588|gb|ADTN01000150.1| GENE 4 3098 - 5419 2843 773 aa, chain + ## HITS:1 COG:feoB KEGG:ns NR:ns ## COG: feoB COG0370 # Protein_GI_number: 16131285 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Escherichia coli K12 # 1 773 1 773 773 1503 100.0 0 MKKLTIGLIGNPNSGKTTLFNQLTGSRQRVGNWAGVTVERKEGQFSTTDHQVTLVDLPGT YSLTTISSQTSLDEQIACHYILSGDADLLINVVDASNLERNLYLTLQLLELGIPCIVALN 
MLDIAEKQNIRIEIDALSARLGCPVIPLVSTRGRGIEALKLAIDRYKANENVELVHYAQP LLNEADSLAKVMPSDIPLKQRRWLGLQMLEGDIYSRAYAGEASQHLDAALARLRNEMDDP ALHIADARYQCIAAICDVVSNTLTAEPSRFTTAVDKIVLNRFLGLPIFLFVMYLMFLLAI NIGGALQPLFDVGSVALFVHGIQWIGYTLHFPDWLTIFLAQGLGGGINTVLPLVPQIGMM YLFLSFLEDSGYMARAAFVMDRLMQALGLPGKSFVPLIVGFGCNVPSVMGARTLDAPRER LMTIMMAPFMSCGARLAIFAVFAAAFFGQNGALAVFSLYMLGIVMAVLTGLMLKYTIMRG EATPFVMELPVYHVPHVKSLIIQTWQRLKGFVLRAGKVIIIVSIFLSAFNSFSLSGKIVD NINDSALASVSRVITPVFKPIGVHEDNWQATVGLFTGAMAKEVVVGTLNTLYTAENIQDE EFNPAEFNLGEELFSAIDETWQSLKDTFSLSVLMNPIEASKGDGEMGTGAMGVMDQKFGS AAAAYSYLIFVLLYVPCISVMGAIARESSRGWMGFSILWGLNIAYSLATLFYQVASYSQH PTYSLVCILAVILFNIVVIGLLRRARSRVDIELLATRKSVSSCCAASTTGDCH >gi|296494588|gb|ADTN01000150.1| GENE 5 5419 - 5655 210 78 aa, chain + ## HITS:1 COG:no KEGG:SDY_3666 NR:ns ## KEGG: SDY_3666 # Name: yhgG # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 78 1 78 78 136 100.0 3e-31 MASLIQVRDLLALRGRMEAAQISQTLNTPQPMINAMLQQLESMGKAVRIQEEPDGCLSGS CKSCPEGKACLREWWALR >gi|296494588|gb|ADTN01000150.1| GENE 6 5858 - 6736 669 292 aa, chain + ## HITS:1 COG:yhgA KEGG:ns NR:ns ## COG: yhgA COG5464 # Protein_GI_number: 16131287 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 292 1 292 292 586 100.0 1e-167 MSKKQSSTPHDALFKLFLRQPDTARDFLAFHLPAPIHALCDMKTLKLESSSFIDDDLRES YSDVLWSVKTEQGPGYIYCLIEHQSTSNKLIAFRMMRYAIAAMQNHLDAGYKTLPMVVPL LFYHGIESPYPYSLCWLDCFADPKLARQLYASAFPLIDVTVMPDDEIMQHRRMALLELIQ KHIRQRDLMGLVEQMACLLSSGYANDRQIKGLFNYILQTGDAVRFNDFIDGVAERSPKHK ESLMTIAERLRQEGEQSKALHIAKIMLESGVPLADIMRFTGLSEEELAAASQ >gi|296494588|gb|ADTN01000150.1| GENE 7 6765 - 7535 638 256 aa, chain - ## HITS:1 COG:bioH KEGG:ns NR:ns ## COG: bioH COG0596 # Protein_GI_number: 16131288 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Escherichia coli K12 # 1 256 1 256 256 518 100.0 1e-147 
MNNIWWQTKGQGNVHLVLLHGWGLNAEVWRCIDEELSSHFTLHLVDLPGFGRSRGFGALS LADMAEAVLQQAPDKAIWLGWSLGGLVASQIALTHPERVQALVTVASSPCFSARDEWPGI KPDVLAGFQQQLSDDFQRTVERFLALQTMGTETARQDARALKKTVLALPMPEVDVLNGGL EILKTVDLRQPLQNVSMPFLRLYGYLDGLVPRKVVPMLDKLWPHSESYIFAKAAHAPFIS HPAEFCHLLVALKQRV >gi|296494588|gb|ADTN01000150.1| GENE 8 7573 - 8256 209 227 aa, chain + ## HITS:1 COG:yhgH KEGG:ns NR:ns ## COG: yhgH COG1040 # Protein_GI_number: 16131289 # Func_class: R General function prediction only # Function: Predicted amidophosphoribosyltransferases # Organism: Escherichia coli K12 # 1 227 17 243 243 404 100.0 1e-113 MLTVPGLCWLCRMPLALGHWGICSVCSRATRTDKTLCPQCGLPATHSHLPCGRCLQKPPP WQRLVTVADYAPPLSPLIHQLKFSRRSEIASALSRLLLLEVLHARRTTGLQLPDRIVSVP LWQRRHWRRGFNQSDLLCQPLSRWLHCQWDSEAVTRTRATATQHFLSARLRKRNLKNAFR LELPVQGRHMVIVDDVVTTGSTVAEIAQLLLRNGAAAVQVWCLCRTL >gi|296494588|gb|ADTN01000150.1| GENE 9 8315 - 8890 684 191 aa, chain + ## HITS:1 COG:yhgI_2 KEGG:ns NR:ns ## COG: yhgI_2 COG0694 # Protein_GI_number: 16131290 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Thioredoxin-like proteins and domains # Organism: Escherichia coli K12 # 97 191 1 95 95 197 100.0 8e-51 MIRISDAAQAHFAKLLANQEEGTQIRVFVINPGTPNAECGVSYCPPDAVEATDTALKFDL LTAYVDELSAPYLEDAEIDFVTDQLGSQLTLKAPNAKMRKVADDAPLMERVEYMLQSQIN PQLAGHGGRVSLMEITEDGYAILQFGGGCNGCSMVDVTLKEGIEKQLLNEFPELKGVRDL TEHQRGEHSYY >gi|296494588|gb|ADTN01000150.1| GENE 10 9250 - 10566 1804 438 aa, chain + ## HITS:1 COG:ECs4257 KEGG:ns NR:ns ## COG: ECs4257 COG2610 # Protein_GI_number: 15833511 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Escherichia coli O157:H7 # 1 438 1 438 438 700 100.0 0 MPLVIVAIGVILLLLLMIRFKMNGFIALVLVALAVGLMQGMPLDKVIGSIKAGVGGTLGS LALIMGFGAMLGKMLADCGGAQRIATTLIAKFGKKHIQWAVVLTGFTVGFALFYEVGFVL MLPLVFTIAASANIPLLYVGVPMAAALSVTHGFLPPHPGPTAIATIFNADMGKTLLYGTI LAIPTVILAGPVYARVLKGIDKPIPEGLYSAKTFSEEEMPSFGVSVWTSLVPVVLMAMRA 
IAEMILPKGHAFLPVAEFLGDPVMATLIAVLIAMFTFGLNRGRSMDQINDTLVSSIKIIA MMLLIIGGGGAFKQVLVDSGVDKYIASMMHETNISPLLMAWSIAAVLRIALGSATVAAIT AGGIAAPLIATTGVSPELMVIAVGSGSVIFSHVNDPGFWLFKEYFNLTIGETIKSWSMLE TIISVCGLVGCLLLNMVI >gi|296494588|gb|ADTN01000150.1| GENE 11 10677 - 12761 2019 694 aa, chain - ## HITS:1 COG:malQ KEGG:ns NR:ns ## COG: malQ COG1640 # Protein_GI_number: 16131292 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Escherichia coli K12 # 1 684 1 684 694 1413 99.0 0 MESKRLDNAALAAGISPNYINAHGKPQSISAETKRRLLDAMHQRTATKVAVTPVPNVMVY TSGKKMPMVVEGSGEYSWLLTTEEGTQYKGHVTGGKAFNLPTKLPEGYHTLTLTQDDQRA HCRVIVAPKRCYEPQALLNKQKLWGACVQLYTLRSEKNWGIGDFGDLKAMLVDVAKRGGS FIGLNPIHALYPANPESASPYSPSSRRWLNVIYIDVNAVEDFHLSEEAQAWWQLPTTQQT LQQARDADWVDYSTVTALKMTALRMAWKGFAQRDDEQMAAFRQFVAEQGDSLFWQAAFDA LHAQQVKEDEMRWGWPAWPEMYQNVDSPEVRQFCEEHRDDVDFYLWLQWLAYSQFAACWE ISQGYEMPIGLYRDLAVGVAEGGAETWCDRELYCLKASVGAPPDILGPLGQNWGLPPMDP HIITARAYEPFIELLRANMQNCGALRIDHVMSMLRLWWIPYGETADQGAYVHYPVDDLLS ILALESKRHRCMVIGEDLGTVPVEIVGKLRSSGVYSYKVLYFENDHEKTFRAPKAYPEQS MAVAATHDLPTLRGYWESGDLTLGKTLGLYPDEVVLRGLYQDRELAKQGLLDALHKYGCL PKRAGHKASLMSMTPTLNRGLQRYIADSNSALLGLQPEDWLDMAEPVNIPGTSYQYKNWR RKLSATLESMFADDGVNKLLKDLDRRRRAAAKKK >gi|296494588|gb|ADTN01000150.1| GENE 12 12771 - 15164 2768 797 aa, chain - ## HITS:1 COG:ECs4259 KEGG:ns NR:ns ## COG: ECs4259 COG0058 # Protein_GI_number: 15833513 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Escherichia coli O157:H7 # 1 797 1 797 797 1644 99.0 0 MSQPIFNDKQFQEALSRQWQRYGLNSAAEMTPRQWWLAVSEALAEMLRAQPFAKPVANQR HVNYISMEFLIGRLTGNNLLNLGWYQDVQDSLKAYDINLTDLLEEEIDPALGNGGLGRLA ACFLDSMATVGQSATGYGLNYQYGLFRQSFVDGKQVEAPDDWHRSNYPWFRHNEALDVQV GIGGKVTKDGRWEPEFTITGQAWDLPVVGYRNGVAQPLRLWQATHAHPFDLTKFNDGDFL RAEQQGINAEKLTKVLYPNDNHTAGKKLRLMQQYFQCACSVADILRRHHLAGRKLHELAD YEVIQLNDTHPTIAIPELLRVLIDEHQMSWDDAWAITSKTFAYTNHTLMPEALERWDVKL VKGLLPRHMQIINEINTRFKTLVEKTWPGDEKVWAKLAVVHDKQVHMANLCVVGGFAVNG 
VAALHSDLVVKDLFPEYHQLWPNKFHNVTNGITPRRWIKQCNPALAALLDKSLQKEWAND LDQLINLEKFADDAKFRQQYREIKQANKVRLAEFVKVRTGIEINPQAIFDIQIKRLHEYK RQHLNLLHILALYKEIRENPQADRVPRVFLFGAKAAPGYYLAKNIIFAINKVADVINNDP LVGDKLKVVFLPDYCVSAAEKLIPAADISEQISTAGKEASGTGNMKLALNGALTVGTLDG ANVEIAEKVGEENIFIFGHTVEQVKAILAKGYDPVKWRKKDKVLDAVLKELESGKYSDGD KHAFDQMLHSIGKQGGDPYLVMADFAAYVEAQKQVDVLYRDQEAWTRAAILNTARCGMFS SDRSIRDYQARIWQAKR >gi|296494588|gb|ADTN01000150.1| GENE 13 15776 - 18481 2505 901 aa, chain + ## HITS:1 COG:malT KEGG:ns NR:ns ## COG: malT COG2909 # Protein_GI_number: 16131294 # Func_class: K Transcription # Function: ATP-dependent transcriptional regulator # Organism: Escherichia coli K12 # 1 901 1 901 901 1677 100.0 0 MLIPSKLSRPVRLDHTVVRERLLAKLSGANNFRLALITSPAGYGKTTLISQWAAGKNDIG WYSLDEGDNQQERFASYLIAAVQQATNGHCAICETMAQKRQYASLTSLFAQLFIELAEWH SPLYLVIDDYHLITNPVIHESMRFFIRHQPENLTLVVLSRNLPQLGIANLRVRDQLLEIG SQQLAFTHQEAKQFFDCRLSSPIEAAESSRICDDVSGWATALQLIALSARQNTHSAHKSA RRLAGINASHLSDYLVDEVLDNVDLATRHFLLKSAILRSMNDALITRVTGEENGQMRLEE IERQGLFLQRMDDTGEWFCYHPLFGNFLRQRCQWELAAELPEIHRAAAESWMAQGFPSEA IHHALAAGDALMLRDILLNHAWSLFNHSELSLLEESLKALPWDSLLENPQLVLLQAWLMQ SQHRYGEVNTLLARAEHEIKDIREDTMHAEFNALRAQVAINDGNPDEAERLAKLALEELP PGWFYSRIVATSVLGEVLHCKGELTRSLALMQQTEQMARQHDVWHYALWSLIQQSEILFA QGFLQTAWETQEKAFQLINEQHLEQLPMHEFLVRIRAQLLWAWARLDEAEASARSGIEVL SSYQPQQQLQCLAMLIQCSLARGDLDNARSQLNRLENLLGNGKYHSDWISNANKVRVIYW QMTGDKAAAANWLRHTAKPEFANNHFLQGQWRNIARAQILLGEFEPAEIVLEELNENARS LRLMSDLNRNLLLLNQLYWQAGRKSDAQRVLLDALKLANRTGFISHFVIEGEAMAQQLRQ LIQLNTLPELEQHRAQRILREINQHHRHKFAHFDENFVERLLNHPEVPELIRTSPLTQRE WQVLGLIYSGYSNEQIAGELEVAATTIKTHIRNLYQKLGVAHRQDAVQHAQQLLKMMGYG V >gi|296494588|gb|ADTN01000150.1| GENE 14 18524 - 19540 777 338 aa, chain - ## HITS:1 COG:yhgK+J KEGG:ns NR:ns ## COG: yhgK+J COG0430 # Protein_GI_number: 16132255 # Func_class: A RNA processing and modification # Function: RNA 3'-terminal phosphate cyclase # Organism: Escherichia coli K12 # 1 338 2 339 339 628 99.0 1e-180 MKRMIALDGAQGEGGGQILRSALSLSMITGQPFTITSIRAGRAKPGLLRQHLTAVKAATE 
ICGATVEGAELGSQRLLFRPGTVRGGDYRFAIGSAGSCTLVLQTVLPALWFADGPSRVEV SGGTDNPSAPPADFIRRVLEPLLAKIGIHQQTTLLRHGFYPAGGGVVATEVSPVASFNTL QLGERGNIVQMRGEVLLAGVPRHVAEREIATLAGSFSLHEQNIHNLPRDQGPGNTVSLEV ESENITERFFVVGEKRVSAEVVAAQLVKEVKRYLASTAAVGEYLADQLVLPMALAGAGEF TVAHPSCHLLTNIAVVERFLPVRFSLIETDGVTRVSIE >gi|296494588|gb|ADTN01000150.1| GENE 15 19544 - 20770 1325 408 aa, chain - ## HITS:1 COG:rtcB KEGG:ns NR:ns ## COG: rtcB COG1690 # Protein_GI_number: 16131295 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 408 1 408 408 852 100.0 0 MNYELLTTENAPVKMWTKGVPVEADARQQLINTAKMPFIFKHIAVMPDVHLGKGSTIGSV IPTKGAIIPAAVGVDIGCGMNALRTALTAEDLPENLAELRQAIETAVPHGRTTGRCKRDK GAWENPPVNVDAKWAELEAGYQWLTQKYPRFLNTNNYKHLGTLGTGNHFIEICLDESDQV WIMLHSGSRGIGNAIGTYFIDLAQKEMQETLETLPSRDLAYFMEGTEYFDDYLKAVAWAQ LFASLNRDAMMENVVTALQSITQKTVRQPQTLAMEEINCHHNYVQKEQHFGEEIYVTRKG AVSARAGQYGIIPGSMGAKSFIVRGLGNEESFCSCSHGAGRVMSRTKAKKLFSVEDQIRA TAHVECRKDAEVIDEIPMAYKDIDAVMAAQSDLVEVIYTLRQVVCVKG >gi|296494588|gb|ADTN01000150.1| GENE 16 20959 - 22557 1378 532 aa, chain + ## HITS:1 COG:rtcR KEGG:ns NR:ns ## COG: rtcR COG4650 # Protein_GI_number: 16131296 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Sigma54-dependent transcription regulator containing an AAA-type ATPase domain and a DNA-binding domain # Organism: Escherichia coli K12 # 1 532 1 532 532 1051 99.0 0 MRKTVAFGFVGTVLDYAGRGSQRWSKWRPTLCLCQQESLVIDRLELLHDARSRSLFETLK RDIASVSPETEVVSVEIELHNPWDFEEVYACLHDFARGYEFQPEKEDYLIHITTGTHVAQ ICWFLLAEARYLPARLIQSSPPRKKEQPRGPGEVTIIDLDLSRYNAIASRFAEERQQTLD FLKSGIATRNPHFNRMIEQIEKVAIKSRAPILLNGPTGAGKSFLARRIFELKQARHQFSG AFVEVNCATLRGDTAMSTLFGHVKGAFTGARESREGLLRSANGGMLFLDEIGELGADEQA MLLKAIEEKTFYPFGSDRQVSSDFQLIAGTVRDLRQLVAEGKFREDLYARINLWTFTLPG LRQRQEDIEPNLDYEVERHASLTGDSVRFNTEARRAWLAFATSPQATWRGNFRELSASVT RMATFATSGRITLDVVEDEINRLRYNWQESRPSALTALLGAEAENIDLFDRMQLEHVIAI CRQAKSLSAAGRQLFDVSRQGKASVNDADRLRKYLARFGLTWEAVQDQHSSS >gi|296494588|gb|ADTN01000150.1| GENE 17 22539 - 23297 
828 252 aa, chain - ## HITS:1 COG:glpR KEGG:ns NR:ns ## COG: glpR COG1349 # Protein_GI_number: 16131297 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli K12 # 1 252 1 252 252 491 100.0 1e-139 MKQTQRHNGIIELVKQQGYVSTEELVEHFSVSPQTIRRDLNELAEQNLILRHHGGAALPS SSVNTPWHDRKATQTEEKERIARKVAEQIPNGSTLFIDIGTTPEAVAHALLNHSNLRIVT NNLNVANTLMVKEDFRIILAGGELRSRDGGIIGEATLDFISQFRLDFGILGISGIDSDGS LLEFDYHEVRTKRAIIENSRHVMLVVDHSKFGRNAMVNMGSISMVDAVYTDAPPPVSVMQ VLTDHHIQLELC >gi|296494588|gb|ADTN01000150.1| GENE 18 23314 - 24144 879 276 aa, chain - ## HITS:1 COG:glpG KEGG:ns NR:ns ## COG: glpG COG0705 # Protein_GI_number: 16131298 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Escherichia coli K12 # 1 276 1 276 276 518 99.0 1e-147 MLMITSFANPRVAQAFVDYMATQGVILTIQQHNQSDVWLADESQAERVRAELARFLENPA DPRYLAASWQAGHTGSGLHYRRYPFFAALRERAGPVTWVMMIACVVVFIAMQILGDQEVM LWLAWPFDPTLKFEFWRYFTHALMHFSLMHILFNLLWWWYLGGAVEKRLGSGKLIVITLI SALLSGYVQQKFSGPWFGGLSGVVYALMGYVWLRGERDPQSGIYLQRGLIIFALIWIVAG WFDLFGMSMANGAHIAGLAVGLAMAFVDSLNARKRK >gi|296494588|gb|ADTN01000150.1| GENE 19 24189 - 24515 501 108 aa, chain - ## HITS:1 COG:glpE KEGG:ns NR:ns ## COG: glpE COG0607 # Protein_GI_number: 16131299 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Escherichia coli K12 # 1 108 1 108 108 212 100.0 2e-55 MDQFECINVADAHQKLQEKEAVLVDIRDPQSFAMGHAVQAFHLTNDTLGAFMRDNDFDTP VMVMCYHGNSSKGAAQYLLQQGYDVVYSIDGGFEAWQRQFPAEVAYGA >gi|296494588|gb|ADTN01000150.1| GENE 20 24705 - 26210 1722 501 aa, chain + ## HITS:1 COG:glpD KEGG:ns NR:ns ## COG: glpD COG0578 # Protein_GI_number: 16131300 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Escherichia coli K12 # 1 501 1 501 501 988 99.0 0 
METKDLIVIGGGINGAGIAADAAGRGLSVLMLEAQDLACATSSASSKLIHGGLRYLEHYE FRLVSEALAEREVLLKMAPHIAFPMRFRLPHRPHLRPAWMIRIGLFMYDHLGKRTSLPGS TGLRFGANSVLKPEIKRGFEYSDCWVDDARLVLANAQMVVRKGGEVLTRTRATSARRENG LWIVEAEDIDTGKKYSWQARGLVNATGPWVKQFFDDGMHLPSPYGIRLIKGSHIVVPRVH TQKQAYILQNEDKRIVFVIPWMDEFSIIGTTDVEYKGDPKAVKIEESEINYLLNVYNTHF KKQLSRDDIVWTYSGVRPLCDDESDSPQAITRDYTLDIHDENGKAPLLSVFGGKLTTYRK LAEHALEKLTPYYQGIGPAWTKESVLPGGAIEGDRDDYAARLRRRYPFLTESLARHYART YGSNSELLLGNAGTGSDLGEDFGHEFYEAELKYLVDHEWVRRADDALWRRTKQGMWLNAD QQSRVSQWLVEYTQQRLSLAS >gi|296494588|gb|ADTN01000150.1| GENE 21 26416 - 26697 149 93 aa, chain - ## HITS:1 COG:ECs4272 KEGG:ns NR:ns ## COG: ECs4272 COG0226 # Protein_GI_number: 15833526 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 1 73 5 77 505 129 84.0 1e-30 MQNRKWILTSLVMTFFGIPILAQFLAVVIAMLGVGLAGIIEVCNILITPTIYLLLKIFML ALGALMLFFSGRVGNVPEFCYVGYDGVGFAVIP >gi|296494588|gb|ADTN01000150.1| GENE 22 26826 - 29273 3051 815 aa, chain - ## HITS:1 COG:glgP KEGG:ns NR:ns ## COG: glgP COG0058 # Protein_GI_number: 16131302 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Escherichia coli K12 # 1 815 1 815 815 1660 100.0 0 MNAPFTYSSPTLSVEALKHSIAYKLMFTIGKDPVVANKHEWLNATLFAVRDRLVERWLRS NRAQLSQETRQVYYLSMEFLIGRTLSNAMLSLGIYEDVQGALEAMGLNLEELIDEENDPG LGNGGLGRLAACFLDSLATLGLPGRGYGIRYDYGMFKQNIVNGSQKESPDYWLEYGNPWE FKRHNTRYKVRFGGRIQQEGKKTRWIETEEILGVAYDQIIPGYDTDATNTLRLWSAQASS EINLGKFNQGDYFAAVEDKNHSENVSRVLYPDDSTYSGRELRLRQEYFLVSSTIQDILSR HYQLHKTYDNLADKIAIHLNDTHPVLSIPEMMRLLIDEHQFSWDDAFEVCCQVFSYTNHT LMSEALETWPVDMLGKILPRHLQIIFEINDYFLKTLQEQYPNDTDLLGRASIIDESNGRR VRMAWLAVVVSHKVNGVSELHSNLMVQSLFADFAKIFPGRFTNVTNGVTPRRWLAVANPS LSAVLDEHLGRNWRTDLSLLNELQQHCDFPMVNHAVHQAKLENKKRLAEYIAQQLNVVVN PKALFDVQIKRIHEYKRQLMNVLHVITRYNRIKADPDAKWVPRVNIFGGKAASAYYMAKH IIHLINDVAKVINNDPQIGDKLKVVFIPNYSVSLAQLIIPAADLSEQISLAGTEASGTSN 
MKFALNGALTIGTLDGANVEMLDHVGADNIFIFGNTAEEVEELRRQGYKPREYYEKDEEL HQVLTQIGSGVFSPEDPGRYRDLVDSLINFGDHYQVLADYRSYVDCQDKVDELYELQEEW TAKAMLNIANMGYFSSDRTIKEYADHIWHIDPVRL >gi|296494588|gb|ADTN01000150.1| GENE 23 29292 - 30725 1258 477 aa, chain - ## HITS:1 COG:ECs4274 KEGG:ns NR:ns ## COG: ECs4274 COG0297 # Protein_GI_number: 15833528 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen synthase # Organism: Escherichia coli O157:H7 # 1 477 1 477 477 976 100.0 0 MQVLHVCSEMFPLLKTGGLADVIGALPAAQIADGVDARVLLPAFPDIRRGVTDAQVVSRR DTFAGHITLLFGHYNGVGIYLIDAPHLYDRPGSPYHDTNLFAYTDNVLRFALLGWVGAEM ASGLDPFWRPDVVHAHDWHAGLAPAYLAARGRPAKSVFTVHNLAYQGMFYAHHMNDIQLP WSFFNIHGLEFNGQISFLKAGLYYADHITAVSPTYAREITEPQFAYGMEGLLQQRHREGR LSGVLNGVDEKIWSPETDLLLASRYTRDTLEDKAENKRQLQIAMGLKVDDKVPLFAVVSR LTSQKGLDLVLEALPGLLEQGGQLALLGAGDPVLQEGFLAAAAEYPGQVGVQIGYHEAFS HRIMGGADVILVPSRFEPCGLTQLYGLKYGTLPLVRRTGGLADTVSDCSLENLADGVASG FVFEDSNAWSLLRAIRRAFVLWSRPSLWRFVQRQAMAMDFSWQVAAKSYRELYYRLK >gi|296494588|gb|ADTN01000150.1| GENE 24 30725 - 32020 1251 431 aa, chain - ## HITS:1 COG:ECs4275 KEGG:ns NR:ns ## COG: ECs4275 COG0448 # Protein_GI_number: 15833529 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Escherichia coli O157:H7 # 1 431 1 431 431 909 100.0 0 MVSLEKNDHLMLARQLPLKSVALILAGGRGTRLKDLTNKRAKPAVHFGGKFRIIDFALSN CINSGIRRMGVITQYQSHTLVQHIQRGWSFFNEEMNEFVDLLPAQQRMKGENWYRGTADA VTQNLDIIRRYKAEYVVILAGDHIYKQDYSRMLIDHVEKGARCTVACMPVPIEEASAFGV MAVDENDKIIEFVEKPANPPSMPNDPSKSLASMGIYVFDADYLYELLEEDDRDENSSHDF GKDLIPKITEAGLAYAHPFPLSCVQSDPDAEPYWRDVGTLEAYWKANLDLASVVPELDMY DRNWPIRTYNESLPPAKFVQDRSGSHGMTLNSLVSGGCVISGSVVVQSVLFSRVRVNSFC NIDSAVLLPEVWVGRSCRLRRCVIDRACVIPEGMVIGENAEEDARRFYRSEEGIVLVTRE MLRKLGHKQER >gi|296494588|gb|ADTN01000150.1| GENE 25 32038 - 34011 1500 657 aa, chain - ## HITS:1 COG:glgX KEGG:ns NR:ns ## COG: glgX COG1523 # Protein_GI_number: 16131305 # Func_class: G Carbohydrate transport and metabolism # Function: Type II secretory pathway, pullulanase PulA 
and related glycosidases # Organism: Escherichia coli K12 # 1 657 1 657 657 1382 100.0 0 MTQLAIGKPAPLGAHYDGQGVNFTLFSAHAERVELCVFDANGQEHRYDLPGHSGDIWHGY LPDARPGLRYGYRVHGPWQPAEGHRFNPAKLLIDPCARQIDGEFKDNPLLHAGHNEPDYR DNAAIAPKCVVVVDHYDWEDDAPPRTPWGSTIIYEAHVKGLTYLHPEIPVEIRGTYKALG HPVMINYLKQLGITALELLPVAQFASEPRLQRMGLSNYWGYNPVAMFALHPAYACSPETA LDEFRDAIKALHKAGIEVILDIVLNHSAELDLDGPLFSLRGIDNRSYYWIREDGDYHNWT GCGNTLNLSHPAVVDYASACLRYWVETCHVDGFRFDLAAVMGRTPEFRQDAPLFTAIQNC PVLSQVKLIAEPWDIAPGGYQVGNFPPLFAEWNDHFRDAARRFWLHYDLPLGAFAGRFAA SSDVFKRNGRLPSAAINLVTAHDGFTLRDCVCFNHKHNEANGEENRDGTNNNYSNNHGKE GLGGSLDLVERRRDSIHALLTTLLLSQGTPMLLAGDEHGHSQHGNNNAYCQDNQLTWLDW SQASSGLTAFTAALIHLRKRIPALVENRWWEEGDGNVRWLNRYAQPLSTDEWQNGPKQLQ ILLSDRFLIAINATLEVTEIVLPAGEWHAIPPFAGEDNPVITAVWQGPAHGLCVFQR >gi|296494588|gb|ADTN01000150.1| GENE 26 34008 - 36194 2498 728 aa, chain - ## HITS:1 COG:glgB KEGG:ns NR:ns ## COG: glgB COG0296 # Protein_GI_number: 16131306 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Escherichia coli K12 # 1 728 1 728 728 1507 100.0 0 MSDRIDRDVINALIAGHFADPFSVLGMHKTTAGLEVRALLPDATDVWVIEPKTGRKLAKL ECLDSRGFFSGVIPRRKNFFRYQLAVVWHGQQNLIDDPYRFGPLIQEMDAWLLSEGTHLR PYETLGAHADTMDGVTGTRFSVWAPNARRVSVVGQFNYWDGRRHPMRLRKESGIWELFIP GAHNGQLYKYEMIDANGNLRLKSDPYAFEAQMRPETASLICGLPEKVVQTEERKKANQFD APISIYEVHLGSWRRHTDNNFWLSYRELADQLVPYAKWMGFTHLELLPINEHPFDGSWGY QPTGLYAPTRRFGTRDDFRYFIDAAHAAGLNVILDWVPGHFPTDDFALAEFDGTNLYEHS DPREGYHQDWNTLIYNYGRREVSNFLVGNALYWIERFGIDALRVDAVASMIYRDYSRKEG EWIPNEFGGRENLEAIEFLRNTNRILGEQVSGAVTMAEESTDFPGVSRPQDMGGLGFWYK WNLGWMHDTLDYMKLDPVYRQYHHDKLTFGILYNYTENFVLPLSHDEVVHGKKSILDRMP GDAWQKFANLRAYYGWMWAFPGKKLLFMGNEFAQGREWNHDASLDWHLLEGGDNWHHGVQ RLVRDLNLTYRHHKAMHELDFDPYGFEWLVVDDKERSVLIFVRRDKEGNEIIVASNFTPV PRHDYRFGINQPGKWREILNTDSMHYHGSNAGNGGTVHSDEIASHGRQHSLSLTLPPLAT IWLVREAE >gi|296494588|gb|ADTN01000150.1| GENE 27 36467 - 37570 1108 367 aa, chain - ## HITS:1 COG:ECs4278 KEGG:ns NR:ns ## COG: ECs4278 COG0136 # Protein_GI_number: 15833532 # 
Func_class: E Amino acid transport and metabolism # Function: Aspartate-semialdehyde dehydrogenase # Organism: Escherichia coli O157:H7 # 1 367 1 367 367 746 100.0 0 MKNVGFIGWRGMVGSVLMQRMVEERDFDAIRPVFFSTSQLGQAAPSFGGTTGTLQDAFDL EALKALDIIVTCQGGDYTNEIYPKLRESGWQGYWIDAASSLRMKDDAIIILDPVNQDVIT DGLNNGIRTFVGGNCTVSLMLMSLGGLFANDLVDWVSVATYQAASGGGARHMRELLTQMG HLYGHVADELATPSSAILDIERKVTTLTRSGELPVDNFGVPLAGSLIPWIDKQLDNGQSR EEWKGQAETNKILNTSSVIPVDGLCVRVGALRCHSQAFTIKLKKDVSIPTVEELLAAHNP WAKVVPNDREITMRELTPAAVTGTLTTPVGRLRKLNMGPEFLSAFTVGDQLLWGAAEPLR RMLRQLA >gi|296494588|gb|ADTN01000150.1| GENE 28 37763 - 38356 653 197 aa, chain + ## HITS:1 COG:ECs4279 KEGG:ns NR:ns ## COG: ECs4279 COG2095 # Protein_GI_number: 15833533 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Escherichia coli O157:H7 # 1 197 1 197 197 294 100.0 9e-80 MNEIISAAVLLILIMDPLGNLPIFMSVLKHTEPKRRRAIMVRELLIALLVMLVFLFAGEK ILAFLSLRAETVSISGGIILFLIAIKMIFPSASGNSSGLPAGEEPFIVPLAIPLVAGPTI LATLMLLSHQYPNQMGHLVIALLLAWGGTFVILLQSSLFLRLLGEKGVNALERLMGLILV MMATQMFLDGIRMWMKG >gi|296494588|gb|ADTN01000150.1| GENE 29 38413 - 39753 1553 446 aa, chain - ## HITS:1 COG:ECs4285 KEGG:ns NR:ns ## COG: ECs4285 COG2610 # Protein_GI_number: 15833539 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Escherichia coli O157:H7 # 1 446 1 446 446 699 100.0 0 MTTLTLVLTAVGSVLLLLFLVMKARMHAFLALMVVSMGAGLFSGMPLDKIAATMEKGMGG TLGFLAVVVALGAMFGKILHETGAVDQIAVKMLKSFGHSRAHYAIGLAGLVCALPLFFEV AIVLLISVAFSMARHTGTNLVKLVIPLFAGVAAAAAFLVPGPAPMLLASQMNADFGWMIL IGLCAAIPGMIIAGPLWGNFISRYVELHIPDDISEPHLGEGKMPSFGFSLSLILLPLVLV GLKTIAARFVPEGSTAYEWFEFIGHPFTAILVACLVAIYGLAMRQGMPKDKVMEICGHAL QPAGIILLVIGAGGVFKQVLVDSGVGPALGEALTGMGLPIAITCFVLAAAVRIIQGSATV ACLTAVGLVMPVIEQLNYSGAQMAALSICIAGGSIVVSHVNDAGFWLFGKFTGATEAETL KTWTMMETILGTVGAIVGMIAFQLLS >gi|296494588|gb|ADTN01000150.1| GENE 30 39757 - 40245 512 162 
aa, chain - ## HITS:1 COG:ECs4286 KEGG:ns NR:ns ## COG: ECs4286 COG3265 # Protein_GI_number: 15833540 # Func_class: G Carbohydrate transport and metabolism # Function: Gluconate kinase # Organism: Escherichia coli O157:H7 # 1 162 1 162 162 325 100.0 3e-89 MGVSGSGKSAVASEVAHQLHAAFLDGDFLHPRRNIEKMASGEPLNDDDRKPWLQALNDAA FAMQRTNKVSLIVCSALKKHYRDLLREGNPNLSFIYLKGDFDVIESRLKARKGHFFKTQM LVTQFETLQEPGADETDVLVVDIDQPLEGVVASTIEVIKKGK >gi|296494588|gb|ADTN01000150.1| GENE 31 40423 - 41418 1040 331 aa, chain - ## HITS:1 COG:ECs4287 KEGG:ns NR:ns ## COG: ECs4287 COG1609 # Protein_GI_number: 15833541 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 331 1 331 331 652 99.0 0 MKKKRPVLQDVADRVGVTKMTVSRFLRNPEQVSVALRGKIAAALDELGYIPNRAPDILSN ATSRAIGVLLPSLTNQVFAEVLRGIESVTDAHGYQTMLAHYGYKPEMEQERLESMLSWNI DGLILTERTHTPRTLKMIEVAGIPVVELMDSKSPCLDIAVGFDNFEAARQMTTAIIARGH RHIAYLGARLDERTIIKQKGYEQAMLDAGLVPYSVMVEQSSSYSSGIELIRQARREYPQL DGVFCTNDDLAVGAAFECQRLGLKVPDDMAIAGFHGHDIGQVMEPRLASVLTPRERMGSI GAERLLARIRGESVTPKMLDLGFTLSPGGSI >gi|296494588|gb|ADTN01000150.1| GENE 32 41642 - 42337 895 231 aa, chain - ## HITS:1 COG:yhhW KEGG:ns NR:ns ## COG: yhhW COG1741 # Protein_GI_number: 16131311 # Func_class: R General function prediction only # Function: Pirin-related protein # Organism: Escherichia coli K12 # 1 231 1 231 231 471 100.0 1e-133 MIYLRKANERGHANHGWLDSWHTFSFANYYDPNFMGFSALRVINDDVIEAGQGFGTHPHK DMEILTYVLEGTVEHQDSMGNKEQVPAGEFQIMSAGTGIRHSEYNPSSTERLHLYQIWIM PEENGITPRYEQRRFDAVQGKQLVLSPDARDGSLKVHQDMELYRWALLKDEQSVHQIAAE RRVWIQVVKGNVTINGVKASTSDGLAIWDEQAISIHADSDSEVLLFDLPPV >gi|296494588|gb|ADTN01000150.1| GENE 33 42460 - 43497 1189 345 aa, chain - ## HITS:1 COG:yhhX KEGG:ns NR:ns ## COG: yhhX COG0673 # Protein_GI_number: 16131312 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Escherichia coli K12 # 1 345 1 345 345 702 100.0 0 MVINCAFIGFGKSTTRYHLPYVLNRKDSWHVAHIFRRHAKPEEQAPIYSHIHFTSDLDEV 
LNDPDVKLVVVCTHADSHFEYAKRALEAGKNVLVEKPFTPTLAQAKELFALAKSKGLTVT PYQNRRFDSCFLTAKKAIESGKLGEIVEVESHFDYYRPVAETKPGLPQDGAFYGLGVHTM DQIISLFGRPDHVAYDIRSLRNKANPDDTFEAQLFYGDLKAIVKTSHLVKIDYPKFIVHG KKGSFIKYGIDQQETSLKANIMPGEPGFAADDSVGVLEYVNDEGVTVREEMKPEMGDYGR VYDALYQTITHGAPNYVKESEVLTNLEILERGFEQASPSTVTLAK >gi|296494588|gb|ADTN01000150.1| GENE 34 43444 - 43596 176 50 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_3919 NR:ns ## KEGG: EcE24377A_3919 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 50 13 62 62 95 100.0 5e-19 MVTGGAFAEANKGAVDDHDFVLFKVVIYTLAQSGRGSYWSAHNEHKHSRG >gi|296494588|gb|ADTN01000150.1| GENE 35 43830 - 44318 347 162 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229877854|ref|ZP_04497362.1| acetyltransferase, ribosomal protein N-acetylase [Sphaerobacter thermophilus DSM 20745] # 1 162 1 163 179 138 46 9e-32 MSEIVIRHAETRDYEAIRQIHAQPEVYCNTLQVPHPSDHMWQERLADRPGIKQLVACIDG DVVGHLTIDVQQRPRRSHVADFGICVDSRWKNRGVASALMREMIEMCDNWLRVDRIELTV FVDNAPAIKVYKKYGFEIEGTGKKYALRNGEYVDAYYMARVK >gi|296494588|gb|ADTN01000150.1| GENE 36 45617 - 45733 57 38 aa, chain + ## HITS:1 COG:no KEGG:B21_03246 NR:ns ## KEGG: B21_03246 # Name: yhhZ # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 38 355 392 392 87 100.0 2e-16 MQSRIAKGILVGESKITPWAIPSGSIYPPMKNIMDHTK >gi|296494588|gb|ADTN01000150.1| GENE 37 45769 - 46224 191 151 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_3588 NR:ns ## KEGG: ECIAI1_3588 # Name: yrhA # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 151 14 164 164 241 99.0 8e-63 MNDLDYPFEAPLKESFIESIIQIEFNSNSTNCLEKLCNEVSILFKNQPDYLTFLRAMDGF EVNGLRLFSLSIPEPSVKNLFAVNEFYRNNDDFINPDLQERLVIGDYSISIFTYDIKSNF FEIRDNIGTENIFSSFSDFSSFLNEIMDSCS >gi|296494588|gb|ADTN01000150.1| GENE 38 46674 - 46958 278 94 aa, chain + ## HITS:1 COG:no KEGG:B21_03248 NR:ns ## KEGG: B21_03248 # Name: yrhB # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 94 1 94 94 189 100.0 2e-47 
MITYHDAFAKANHYLDDADLPVVITLHGRFSQGWYFCFEAREFLETGDEAARLAGNAPFI IDKDSGEIHSLGTAKPLEEYLQDYEIKKATFGLP >gi|296494588|gb|ADTN01000150.1| GENE 39 46996 - 48738 1978 580 aa, chain - ## HITS:1 COG:ggt KEGG:ns NR:ns ## COG: ggt COG0405 # Protein_GI_number: 16131319 # Func_class: E Amino acid transport and metabolism # Function: Gamma-glutamyltransferase # Organism: Escherichia coli K12 # 1 580 1 580 580 1098 99.0 0 MIKPTFLRRVAIAALLSGSCFSAAAAPPAPPVSYGVEEDVFHPVRAKQGMVASVDATATQ VGVDILKEGGNAVDAAVAVGYALAVTHPQAGNLGGGGFMLIRSKNGNTTAIDFREMAPAK ATRDMFLDDQGNPDSKKSLTSHLASGTPGTVAGFSLALDKYGTMPLNKVVQPAFKLARDG FIVNDALADDLKTYGSEVLPNHENSKAIFWKEGEPLKKGDTLVQANLAKSLEMIAENGPD EFYKGTIAEQIAQEMQKNGGLITKEDLAAYKAVERTPISGDYRGYQVYSMPPPSSGGIHI VQILNILENFDMKKYGFGSADAMQIMAEAEKYAYADRSEYLGDPDFVKVPWQALTNKAYA KSIADQIDINKAKPSSEIRPGKLAPYESNQTTHYSVVDKDGNAVAVTYTLNTTFGTGIVA GESGILLNNQMDDFSAKPGVPNVYGLVGGDANAVGPNKRPLSSMSPTIVVKDGKTWLVTG SPGGSRIITTVLQMVVNSIDYGLNVAEATNAPRFHHQWLSDELRVEKGFSPDTLKLLEAK GQKVALKEAMGSTQSIMVGPDGELYGASDPRSVDDLTAGY >gi|296494588|gb|ADTN01000150.1| GENE 40 48858 - 49298 367 146 aa, chain + ## HITS:1 COG:no KEGG:S4297 NR:ns ## KEGG: S4297 # Name: yhhA # Def: hypothetical protein # Organism: S.flexneri_2457T # Pathway: not_defined # 1 146 1 146 146 157 100.0 1e-37 MKRLLILTALLPFVGFAQPINTLNNPNQPGYQIPSQQRMQTQMQTQQIQQKGMLNQQLKT QTQLQQQHLENQINNNSQRVLQSQPGERNPARQQMLPNTNGGMLNSNRNPDSSLNQQHML PERRNGDMLNQPSTPQPDIPLKTIGP >gi|296494588|gb|ADTN01000150.1| GENE 41 49285 - 50028 840 247 aa, chain - ## HITS:1 COG:ugpQ KEGG:ns NR:ns ## COG: ugpQ COG0584 # Protein_GI_number: 16131321 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Escherichia coli K12 # 1 247 1 247 247 505 99.0 1e-143 MSNWPYPRIVAHRGGGKLAPENTLAAIDVGAKYGHKMIEFDAKLSKDGEIFLLHDDNLER TSNGWGVAGELNWQDLLRVDAGSWYSKMFKGEPLPLLSQVAERCREHGMMANIEIKPTTG TGPLTGKMVALAARELWAGMTPPLLSSFEIDALEAAQQAAPELPRGLLLDEWRDDWRELT ARLGCVSIHLNHKLLNKARVMQLKDAGLRILVYTVNKPQRAAELLRWGVDCICTDAIDVI GPNFTAQ 
>gi|296494588|gb|ADTN01000150.1| GENE 42 50025 - 51095 1425 356 aa, chain - ## HITS:1 COG:ugpC KEGG:ns NR:ns ## COG: ugpC COG3839 # Protein_GI_number: 16131322 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Escherichia coli K12 # 1 356 14 369 369 697 99.0 0 MAGLKLQAVTKSWDGKTQVIKPLTLDVADGEFIVMVGPSGCGKSTLLRMVAGLERVTEGD IWINDQRVTEMEPKDRGIAMVFQNYALYPHMCVEENMAWGLKIRGMGKQQIAERVKEAAR ILELDGLLKRRPRELSGGQRQRVAMGRAIVRDPAVFLFDEPLSNLDAKLRVQMRLELQQL HRRLKTTSLYVTHDQVEAMTLAQRVMVMNGGVAEQIGTPVEVYEKPASLFVASFIGSPAM NLLTGRVNNEGTHFELDGGIELPLNGGYRQYAGRKMTLGIRPEHIALSSQAEGGVPMVMD TLEILGADNLAHGRWGEQKLVVRLAHQERPTAGSTLWLHLAENQLHLFDGETGQRV >gi|296494588|gb|ADTN01000150.1| GENE 43 51097 - 51942 969 281 aa, chain - ## HITS:1 COG:ECs4297 KEGG:ns NR:ns ## COG: ECs4297 COG0395 # Protein_GI_number: 15833551 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Escherichia coli O157:H7 # 1 281 1 281 281 482 99.0 1e-136 MIENRPWLTIFSHTMLILGIAVILFPLYVAFVAATLDKQAVYAAPMTLIPGTHLLENIHN IWVNGVGTNSAPFWRMLLNSFVMAFSITLGKITVSMLSAFAIVWFRFPLRNLFFWMIFIT LMLPVEVRIFPTVEVIANLQMLDSYAGLTLPLMASATATFLFRQFFMTLPDELVEAARID GASPMRFFCDIVFPLSKTNLAALFVITFIYGWNQYLWPLLIITDVDLGTTVAGIKGMIAT GEGTTEWNSVMAAMLLTLIPPVVIVLVMQRAFVRGLVDSEK >gi|296494588|gb|ADTN01000150.1| GENE 44 51939 - 52826 991 295 aa, chain - ## HITS:1 COG:ugpA KEGG:ns NR:ns ## COG: ugpA COG1175 # Protein_GI_number: 16131324 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Escherichia coli K12 # 1 295 1 295 295 485 100.0 1e-137 MSSSRPVFRSRWLPYLLVAPQLIITVIFFIWPAGEALWYSLQSVDPFGFSSQFVGLDNFV TLFHDSYYLDSFWTTIKFSTFVTVSGLLVSLFFAALVEYIVRGSRFYQTLMLLPYAVAPA VAAVLWIFLFNPGRGLITHFLAEFGYDWNHAQNSGQAMFLVVFASVWKQISYNFLFFYAA LQSIPRSLIEAAAIDGAGPIRRFFKIALPLIAPVSFFLLVVNLVYAFFDTFPVIDAATSG GPVQATTTLIYKIYREGFTGLDLASSAAQSVVLMFLVIVLTVVQFRYVESKVRYQ 
>gi|296494588|gb|ADTN01000150.1| GENE 45 52924 - 54240 1758 438 aa, chain - ## HITS:1 COG:ECs4299 KEGG:ns NR:ns ## COG: ECs4299 COG1653 # Protein_GI_number: 15833553 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 1 438 1 438 438 845 100.0 0 MKPLHYTASALALGLALMGNAQAVTTIPFWHSMEGELGKEVDSLAQRFNAENPDYKIVPT YKGNYEQNLSAGIAAFRTGNAPAILQVYEVGTATMMASKAIKPVYDVFKEAGIQFDESQF VPTVSGYYSDSKTGHLLSQPFNSSTPVLYYNKDAFKKAGLDPEQPPKTWQDLADYAAKLK ASGMKCGYASGWQGWIQLENFSAWNGLPFASKNNGFDGTDAVLEFNKPEQVKHIAMLEEM NKKGDFSYVGRKDESTEKFYNGDCAMTTASSGSLANIREYAKFNYGVGMMPYDADAKDAP QNAIIGGASLWVMQGKDKETYTGVAKFLDFLAKPENAAEWHQKTGYLPITKAAYDLTREQ GFYEKNPGADTATRQMLNKPPLPFTKGLRLGNMPQIRVIVDEELESVWTGKKTPQQALDT AVERGNQLLRRFEKSTKS >gi|296494588|gb|ADTN01000150.1| GENE 46 54639 - 55352 267 237 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 5 216 1 218 245 107 29 2e-22 MEKVMLSFDKVSAHYGKIQALHEVSLHINQGEIVTLIGANGAGKTTLLGTLCGDPRATSG RIVFDDKDITDWQTAKIMREAVAIVPEGRRVFSRMTVEENLAMGGFFAERDQFQERIKWV YELFPRLHERRIQRAGTMSGGEQQMLAIGRALMSNPRLLLLDEPSLGLAPIIIQQIFDTI EQLREQGMTIFLVEQNANQALKLADRGYVLENGHVVLSDTGDALLANEAVRSAYLGG >gi|296494588|gb|ADTN01000150.1| GENE 47 55354 - 56121 258 255 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 5 245 1 230 245 103 25 2e-21 MSQPLLSVNGLMMRFGGLLAVNNVNLELYPQEIVSLIGPNGAGKTTVFNCLTGFYKPTGG TILLRDQHLEGLPGQQIARMGVVRTFQHVRLFREMTVIENLLVAQHQQLKTGLFSGLLKT PSFRRAQSEALDRAATWLERIGLLEHANRQASNLAYGDQRRLEIARCMVTQPEILMLDEP AAGLNPKETKELDELIAELRNHHNTTILLIEHDMKLVMGISDRIYVVNQGTPLANGTPEQ IRNNPDVIRAYLGEA >gi|296494588|gb|ADTN01000150.1| GENE 48 56118 - 57395 1558 425 aa, chain - ## HITS:1 COG:livM KEGG:ns NR:ns ## COG: livM COG4177 # Protein_GI_number: 16131328 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid 
transport system, permease component # Organism: Escherichia coli K12 # 1 425 1 425 425 718 100.0 0 MKPMHIAMALLSAAMFFVLAGVFMGVQLELDGTKLVVDTASDVRWQWVFIGTAVVFFFQL LRPAFQKGLKSVSGPKFILPAIDGSTVKQKLFLVALLVLAVAWPFMVSRGTVDIATLTMI YIILGLGLNVVVGLSGLLVLGYGGFYAIGAYTFALLNHYYGLGFWTCLPIAGLMAAAAGF LLGFPVLRLRGDYLAIVTLGFGEIVRILLLNNTEITGGPNGISQIPKPTLFGLEFSRTAR EGGWDTFSNFFGLKYDPSDRVIFLYLVALLLVVLSLFVINRLLRMPLGRAWEALREDEIA CRSLGLSPRRIKLTAFTISAAFAGFAGTLFAARQGFVSPESFTFAESAFVLAIVVLGGMG SQFAVILAAILLVVSRELMRDFNEYSMLMLGGLMVLMMIWRPQGLLPMTRPQLKLKNGAA KGEQA >gi|296494588|gb|ADTN01000150.1| GENE 49 57392 - 58318 1372 308 aa, chain - ## HITS:1 COG:ECs4304 KEGG:ns NR:ns ## COG: ECs4304 COG0559 # Protein_GI_number: 15833558 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid ABC-type transport system, permease components # Organism: Escherichia coli O157:H7 # 1 308 1 308 308 491 100.0 1e-138 MSEQFLYFLQQMFNGVTLGSTYALIAIGYTMVYGIIGMINFAHGEVYMIGSYVSFMIIAA LMMMGIDTGWLLVAAGFVGAIVIASAYGWSIERVAYRPVRNSKRLIALISAIGMSIFLQN YVSLTEGSRDVALPSLFNGQWVVGHSENFSASITTMQAVIWIVTFLAMLALTIFIRYSRM GRACRACAEDLKMASLLGINTDRVIALTFVIGAAMAAVAGVLLGQFYGVINPYIGFMAGM KAFTAAVLGGIGSIPGAMIGGLILGIAEALSSAYLSTEYKDVVSFALLILVLLVMPTGIL GRPEVEKV >gi|296494588|gb|ADTN01000150.1| GENE 50 58366 - 59475 1337 369 aa, chain - ## HITS:1 COG:livK KEGG:ns NR:ns ## COG: livK COG0683 # Protein_GI_number: 16131330 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Escherichia coli K12 # 1 369 1 369 369 707 100.0 0 MKRNAKTIIAGMIALAISHTAMADDIKVAVVGAMSGPIAQWGDMEFNGARQAIKDINAKG GIKGDKLVGVEYDDACDPKQAVAVANKIVNDGIKYVIGHLCSSSTQPASDIYEDEGILMI SPGATNPELTQRGYQHIMRTAGLDSSQGPTAAKYILETVKPQRIAIIHDKQQYGEGLARS VQDGLKAANANVVFFDGITAGEKDFSALIARLKKENIDFVYYGGYYPEMGQMLRQARSVG LKTQFMGPEGVGNASLSNIAGDAAEGMLVTMPKRYDQDPANQGIVDALKADKKDPSGPYV WITYAAVQSLATALERTGSDEPLALVKDLKANGANTVIGPLNWDEKGDLKGFDFGVFQWH ADGSSTAAK >gi|296494588|gb|ADTN01000150.1| GENE 
51 59899 - 60282 366 127 aa, chain + ## HITS:1 COG:no KEGG:JW3424 NR:ns ## KEGG: JW3424 # Name: yhhK # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 127 1 127 127 253 100.0 1e-66 MKLTIIRLEKFSDQDRIDLQKIWPEYSPSSLQVDDNHRIYAARFNERLLAAVRVTLSGTE GALDSLRVREVTRRRGVGQYLLEEVLRNNPGVSCWWMADAGVEDRGVMTAFMQALGFTAQ QGGWEKC >gi|296494588|gb|ADTN01000150.1| GENE 52 60470 - 61573 1508 367 aa, chain - ## HITS:1 COG:ECs4309 KEGG:ns NR:ns ## COG: ECs4309 COG0683 # Protein_GI_number: 15833563 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Escherichia coli O157:H7 # 1 367 20 386 386 706 100.0 0 MNIKGKALLAGCIALAFSNMALAEDIKVAVVGAMSGPVAQYGDQEFTGAEQAVADINAKG GIKGNKLQIVKYDDACDPKQAVAVANKVVNDGIKYVIGHLCSSSTQPASDIYEDEGILMI TPAATAPELTARGYQLILRTTGLDSDQGPTAAKYILEKVKPQRIAIVHDKQQYGEGLARA VQDGLKKGNANVVFFDGITAGEKDFSTLVARLKKENIDFVYYGGYHPEMGQILRQARAAG LKTQFMGPEGVANVSLSNIAGESAEGLLVTKPKNYDQVPANKPIVDAIKAKKQDPSGAFV WTTYAALQSLQAGLNQSDDPAEIAKYLKANSVDTVMGPLTWDEKGDLKGFEFGVFDWHAN GTATDAK >gi|296494588|gb|ADTN01000150.1| GENE 53 61844 - 62698 1068 284 aa, chain - ## HITS:1 COG:ECs4310 KEGG:ns NR:ns ## COG: ECs4310 COG0568 # Protein_GI_number: 15833564 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Escherichia coli O157:H7 # 1 284 1 284 284 514 100.0 1e-146 MTDKMQSLALAPVGNLDSYIRAANAWPMLSADEERALAEKLHYHGDLEAAKTLILSHLRF VVHIARNYAGYGLPQADLIQEGNIGLMKAVRRFNPEVGVRLVSFAVHWIKAEIHEYVLRN WRIVKVATTKAQRKLFFNLRKTKQRLGWFNQDEVEMVARELGVTSKDVREMESRMAAQDM TFDLSSDDDSDSQPMAPVLYLQDKSSNFADGIEDDNWEEQAANRLTDAMQGLDERSQDII RARWLDEDNKSTLQELADRYGVSAERVRQLEKNAMKKLRAAIEA >gi|296494588|gb|ADTN01000150.1| GENE 54 62943 - 64001 1006 352 aa, chain - ## HITS:1 COG:ftsX KEGG:ns NR:ns ## COG: ftsX COG2177 # Protein_GI_number: 16131334 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # 
Organism: Escherichia coli K12 # 1 352 1 352 352 687 100.0 0 MNKRDAINHIRQFGGRLDRFRKSVGGSGDGGRNAPKRAKSSPKPVNRKTNVFNEQVRYAF HGALQDLKSKPFATFLTVMVIAISLTLPSVCYMVYKNVNQAATQYYPSPQITVYLQKTLD DDAAAGVVAQLQAEQGVEKVNYLSREDALGEFRNWSGFGGALDMLEENPLPAVAVVIPKL DFQGTESLNTLRDRITQINGIDEVRMDDSWFARLAALTGLVGRVSAMIGVLMVAAVFLVI GNSVRLSIFARRDSINVQKLIGATDGFILRPFLYGGALLGFSGALLSLILSEILVLRLSS AVAEVAQVFGTKFDINGLSFDECLLLLLVCSMIGWVAAWLATVQHLRHFTPE >gi|296494588|gb|ADTN01000150.1| GENE 55 63994 - 64662 348 222 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 216 4 220 223 138 36 7e-32 MIRFEHVSKAYLGGRQALQGVTFHMQPGEMAFLTGHSGAGKSTLLKLICGIERPSAGKIW FSGHDITRLKNREVPFLRRQIGMIFQDHHLLMDRTVYDNVAIPLIIAGASGDDIRRRVSA ALDKVGLLDKAKNFPIQLSGGEQQRVGIARAVVNKPAVLLADEPTGNLDDALSEGILRLF EEFNRVGVTVLMATHDINLISRRSYRMLTLSDGHLHGGVGHE >gi|296494588|gb|ADTN01000150.1| GENE 56 64665 - 66158 747 497 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 [Bacillus selenitireducens MLS10] # 198 494 21 321 336 292 48 4e-78 MAKEKKRGFFSWLGFGQKEQTPEKETEVQNEQPVVEEIVQAQEPVKASEQAVEEQPQAHT EAEAETFAADVVEVTEQVAESEKAQPEAEVVAQPEPVVEETPEPVAIEREELPLPEDVNA EAVSPEEWQAEAETVEIVEAAEEEAAKEEITDEELETALAAEAAEEAVMVVPPAEEEQPV EEIAQEQEKPTKEGFFARLKRSLLKTKENLGSGFISLFRGKKIDDDLFEELEEQLLIADV GVETTRKIITNLTEGASRKQLRDAEALYGQLKEEMGEILAKVDEPLNVEGKAPFVILMVG VNGVGKTTTIGKLARQFEQQGKSVMLAAGDTFRAAAVEQLQVWGQRNNIPVIAQHTGADS ASVIFDAIQAAKARNIDVLIADTAGRLQNKSHLMEELKKIVRVMKKLDVEAPHEVMLTID ASTGQNAVSQAKLFHEAVGLTGITLTKLDGTAKGGVIFSVADQFGIPIRYIGVGERIEDL RPFKADDFIEALFARED >gi|296494588|gb|ADTN01000150.1| GENE 57 66308 - 66904 190 198 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 [Bacillus selenitireducens MLS10] # 12 194 13 199 199 77 27 1e-13 MKKPNHSGSGQIRIIGGQWRGRKLPVPDSPGLRPTTDRVRETLFNWLAPVIVDAQCLDCF AGSGALGLEALSRYAAGATLIEMDRAVSQQLIKNLATLKAGNARVVNSNAMSFLAQKGTP 
HNIVFVDPPFRRGLLEETINLLEDNGWLADEALIYVESEVENGLPTVPANWSLHREKVAG QVAYRLYQREAQGESDAD >gi|296494588|gb|ADTN01000150.1| GENE 58 66894 - 67163 347 89 aa, chain + ## HITS:1 COG:yhhL KEGG:ns NR:ns ## COG: yhhL COG3776 # Protein_GI_number: 16131338 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 89 1 89 89 149 100.0 1e-36 MLINIGRLLMLCVWGFLILNLVHPFPRPLNIFVNVALIFTVLMHGMQLALLKSTLPKDGP QMTTAEKVRIFLFGVFELLAWQKKFKVKK >gi|296494588|gb|ADTN01000150.1| GENE 59 67166 - 67525 356 119 aa, chain - ## HITS:1 COG:no KEGG:JW3432 NR:ns ## KEGG: JW3432 # Name: yhhM # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 119 1 119 119 218 100.0 5e-56 MSKPPLFFIVIIGLIVVAASFRFMQQRREKADNDMAPLQQKLVVVSNKREKPINDRRSRQ QEVTPAGTSIRYEASFKPQSGGMEQTFRLDAQQYHALTVGDKGTLSYKGTRFVSFVGEQ >gi|296494588|gb|ADTN01000150.1| GENE 60 67666 - 68292 733 208 aa, chain + ## HITS:1 COG:STM3575 KEGG:ns NR:ns ## COG: STM3575 COG3714 # Protein_GI_number: 16766861 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Salmonella typhimurium LT2 # 1 208 1 208 208 302 91.0 3e-82 MLWSFIAVCLSAWLSVDASYRGPTWQRWVFKPLTLLLLLLLAWQAPMFDAISYLVLAGLC ASLLGDALTLLPRQRLMYAIGAFFLSHLLYTIYFASQMTLSFFWPLPLVLLVLGALLLAI IWTRLEEYRWPICTFIGMTLVMVWLAGELWFFRPTAPALSAFVGASLLFISNFVWLGSHY RRRFRADNAIAAACYFAGHFLIVRSLYL >gi|296494588|gb|ADTN01000150.1| GENE 61 68366 - 70564 2473 732 aa, chain + ## HITS:1 COG:zntA KEGG:ns NR:ns ## COG: zntA COG2217 # Protein_GI_number: 16131341 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Escherichia coli K12 # 1 732 1 732 732 1262 99.0 0 MSTPDNHGKKAPQFAAFKPLTTVQNANDCCCDGACSSTPTLSENVSGTRYSWKVSGMDCA ACARKVENAVRQLAGVNQVQVLFATEKLVVDADNDIRAQVESALQKAGYSLRDEQAAEEP QASRLKENLPLITLIVMMAISWGLEQFNHPFGQLAFIATTLVGLYPIARQALRLIKSGSY FAIETLMSVAAIGALFIGATAEAAMVLLLFLIGERLEGWAASRARQGVSALMALKPETAT RLRKGEREEVAINSLRPGDVIEVAAGGRLPADGKLLSPFASFDESALTGESIPVERATGD 
KVPAGATSVDRLVTLEVLSEPGASAIDRILKLIEEAEERRAPIERFIDRFSRIYTPAIMA VALLVTLVPPLLFAASWQEWIYKGLTLLLIGCPCALVISTPAAITSGLAAAARRGALIKG GAALEQLGRVTQVAFDKTGTLTVGKPRVTAIHPATGISESELLTLAAAVEQGATHPLAQA IVREAQVAELAIPTAESQRALVGSGIEAQVNGERVLICAAGKHPADAFTGLINELESAGQ TVVLVVRNDDVLGVIALQDTLRADAATAISELNALGVKGVILTGDNPRAAAAIAGELGLE FKAGLLPEDKVKAVTGLNQHAPLAMVGDGINDAPAMKAAAIGIAMGSGTDVALETADAAL THNHLRGLVQMIELARATHANIRQNITIALGLKGIFLVTTLLGMTGLWLAVLADTGATVL VTANALRLLRRR Prediction of potential genes in microbial genomes Time: Sun May 15 23:39:27 2011 Seq name: gi|296494587|gb|ADTN01000151.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont290.9, whole genome shotgun sequence Length of sequence - 11451 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 7, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 58 - 303 320 ## COG0425 Predicted redox protein, regulator of disulfide bond formation - Prom 327 - 386 6.5 + Prom 403 - 462 4.5 2 2 Op 1 . + CDS 524 - 1189 663 ## COG1738 Uncharacterized conserved protein 3 2 Op 2 . + CDS 1262 - 1819 761 ## ECO103_4192 hypothetical protein + Term 1833 - 1880 7.6 - Term 1732 - 1776 5.1 4 3 Tu 1 . - CDS 1823 - 3040 1314 ## COG0477 Permeases of the major facilitator superfamily - Prom 3066 - 3125 2.2 + Prom 3092 - 3151 2.9 5 4 Op 1 1/1.000 + CDS 3172 - 4221 1023 ## COG0628 Predicted permease 6 4 Op 2 . + CDS 4276 - 4863 536 ## COG2091 Phosphopantetheinyl transferase + Term 4909 - 4949 3.0 - Term 4647 - 4683 2.0 7 5 Tu 1 . 
- CDS 4773 - 4949 73 ## UTI89_C3992 hypothetical protein + Prom 4885 - 4944 4.7 8 6 Op 1 38/0.000 + CDS 4974 - 6548 1812 ## COG0747 ABC-type dipeptide transport system, periplasmic component 9 6 Op 2 49/0.000 + CDS 6548 - 7492 254 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 10 6 Op 3 44/0.000 + CDS 7489 - 8322 1089 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 11 6 Op 4 17/0.000 + CDS 8322 - 9086 355 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 12 6 Op 5 3/0.500 + CDS 9083 - 9889 404 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 13 6 Op 6 . + CDS 9895 - 10296 437 ## COG0864 Predicted transcriptional regulators containing the CopG/Arc/MetJ DNA-binding domain and a metal-binding domain + Prom 10354 - 10413 3.6 14 7 Tu 1 . + CDS 10499 - 11450 746 ## COG3209 Rhs family protein Predicted protein(s) >gi|296494587|gb|ADTN01000151.1| GENE 1 58 - 303 320 81 aa, chain - ## HITS:1 COG:ECs4319 KEGG:ns NR:ns ## COG: ECs4319 COG0425 # Protein_GI_number: 15833573 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted redox protein, regulator of disulfide bond formation # Organism: Escherichia coli O157:H7 # 1 81 1 81 81 159 100.0 1e-39 MTDLFSSPDHTLDALGLRCPEPVMMVRKTVRNMQPGETLLIIADDPATTRDIPGFCTFME HELVAKETDGLPYRYLIRKGG >gi|296494587|gb|ADTN01000151.1| GENE 2 524 - 1189 663 221 aa, chain + ## HITS:1 COG:yhhQ KEGG:ns NR:ns ## COG: yhhQ COG1738 # Protein_GI_number: 16131343 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 221 1 221 221 374 100.0 1e-104 MNVFSQTQRYKALFWLSLFHLLVITSSNYLVQLPVSILGFHTTWGAFSFPFIFLATDLTV RIFGAPLARRIIFAVMIPALLISYVISSLFYMGSWQGFGALAHFNLFVARIATASFMAYA LGQILDVHVFNRLRQSRRWWLAPTASTLFGNVSDTLAFFFIAFWRSPDAFMAEHWMEIAL VDYCFKVLISIVFFLPMYGVLLNMLLKRLADKSEINALQAS >gi|296494587|gb|ADTN01000151.1| GENE 3 1262 - 1819 761 185 
aa, chain + ## HITS:1 COG:no KEGG:ECO103_4192 NR:ns ## KEGG: ECO103_4192 # Name: dcrB # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 185 15 199 199 324 100.0 1e-87 MRNLVKYVGIGLLVMGLAACDDKDTNATAQGSVAESNATGNPVNLLDGKLSFSLPADMTD QSGKLGTQANNMHVWSDATGQKAVIVIMGDDPKEDLAVLAKRLEDQQRSRDPQLQVVTNK AIELKGHKMQQLDSIISAKGQTAYSSVILGNVGNQLLTMQITLPADDQQKAQTTAENIIN TLVIQ >gi|296494587|gb|ADTN01000151.1| GENE 4 1823 - 3040 1314 405 aa, chain - ## HITS:1 COG:yhhS KEGG:ns NR:ns ## COG: yhhS COG0477 # Protein_GI_number: 16131345 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 405 15 419 419 673 100.0 0 MPEPVAEPALNGLRLNLRIVSIVMFNFASYLTIGLPLAVLPGYVHDVMGFSAFWAGLVIS LQYFATLLSRPHAGRYADSLGPKKIVVFGLCGCFLSGLGYLTAGLTASLPVISLLLLCLG RVILGIGQSFAGTGSTLWGVGVVGSLHIGRVISWNGIVTYGAMAMGAPLGVVFYHWGGLQ ALALIIMGVALVAILLAIPRPTVKASKGKPLPFRAVLGRVWLYGMALALASAGFGVIATF ITLFYDAKGWDGAAFALTLFSCAFVGTRLLFPNGINRIGGLNVAMICFSVEIIGLLLVGV ATMPWMAKIGVLLAGAGFSLVFPALGVVAVKAVPQQNQGAALATYTVFMDLSLGVTGPLA GLVMSWAGVPVIYLAAAGLVAIALLLTWRLKKRPPEHVPEAASSS >gi|296494587|gb|ADTN01000151.1| GENE 5 3172 - 4221 1023 349 aa, chain + ## HITS:1 COG:yhhT KEGG:ns NR:ns ## COG: yhhT COG0628 # Protein_GI_number: 16131346 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Escherichia coli K12 # 1 349 28 376 376 581 100.0 1e-166 METPQPDKTGMHILLKLASLVVILAGIHAAADIIVQLLLALFFAIVLNPLVTWFIRRGVQ RPVAITIVVVVMLIALTALVGVLAASFNEFISMLPKFNKELTRKLFKLQEMLPFLNLHMS PERMLQRMDSEKVVTFTTALMTGLSGAMASVLLLVMTVVFMLFEVRHVPYKMRFALNNPQ IHIAGLHRALKGVSHYLALKTLLSLWTGVIVWLGLELMGVQFALMWAVLAFLLNYVPNIG AVISAVPPMIQVLLFNGVYECILVGALFLVVHMVIGNILEPRMMGHRLGMSTMVVFLSLL IWGWLLGPVGMLLSVPLTSVCKIWMETTKGGSKLAILLGPGRPKSRLPG >gi|296494587|gb|ADTN01000151.1| GENE 6 4276 - 4863 536 195 aa, chain + ## HITS:1 
COG:yhhU KEGG:ns NR:ns ## COG: yhhU COG2091 # Protein_GI_number: 16131347 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantetheinyl transferase # Organism: Escherichia coli K12 # 1 195 1 195 195 387 100.0 1e-108 MYRIVLGKVSTLSAAPLPPGLREQAPQGPRRERWLAGRALLSHTLSPLPEIIYGEQGKPA FAPEMPLWFNLSHSGDDIALLLSDEGEVGCDIEVIRPRANWRWLANAVFSLGEHAEMDAV HPDQQLEMFWRIWTRKEAIVKQRGGSAWQIVSVDSTYHSSLSVSHCQLENLSLAICTPTP FTLTADSVQWIDSVN >gi|296494587|gb|ADTN01000151.1| GENE 7 4773 - 4949 73 58 aa, chain - ## HITS:1 COG:no KEGG:UTI89_C3992 NR:ns ## KEGG: UTI89_C3992 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 58 1 58 58 102 100.0 3e-21 MAEISMTILSIRHTDYLFWINRWAVGWADQLTESIHCTLSAVSVKGVGVQIARLKFSS >gi|296494587|gb|ADTN01000151.1| GENE 8 4974 - 6548 1812 524 aa, chain + ## HITS:1 COG:nikA KEGG:ns NR:ns ## COG: nikA COG0747 # Protein_GI_number: 16131348 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Escherichia coli K12 # 1 524 1 524 524 1045 100.0 0 MLSTLRRTLFALLACASFIVHAAAPDEITTAWPVNVGPLNPHLYTPNQMFAQSMVYEPLV KYQADGSVIPWLAKSWTHSEDGKTWTFTLRDDVKFSNGEPFDAEAAAENFRAVLDNRQRH AWLELANQIVDVKALSKTELQITLKSAYYPFLQELALPRPFRFIAPSQFKNHETMNGIKA PIGTGPWILQESKLNQYDVFVRNENYWGEKPAIKKITFNVIPDPTTRAVAFETGDIDLLY GNEGLLPLDTFARFSQNPAYHTQLSQPIETVMLALNTAKAPTNELAVREALNYAVNKKSL IDNALYGTQQVADTLFAPSVPYANLGLKPSQYDPQKAKALLEKAGWTLPAGKDIREKNGQ PLRIELSFIGTDALSKSMAEIIQADMRQIGADVSLIGEEESSIYARQRDGRFGMIFHRTW GAPYDPHAFLSSMRVPSHADFQAQQGLADKPLIDKEIGEVLATHDETQRQALYRDILTRL HDEAVYLPISYISMMVVSKPELGNIPYAPIATEIPFEQIKPVKP >gi|296494587|gb|ADTN01000151.1| GENE 9 6548 - 7492 254 314 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 68 304 43 310 320 102 26 1e-21 MLRYVLRRFLLLIPMVLAASVIIFLMLRLGTGDPALDYLRLSNLPPTPEMLASTRTMLGL DQPLYVQYGTWLWKALHLDFGISFASQRPVLDDMLNFLPATLELAGAALVLILLTSVPLG 
IWAARHRDRLPDFAVRFIAFLGVSMPNFWLAFLLVMAFSVYLQWLPAMGYGGWQHIILPA VSIAFMSLAINARLLRASMLDVAGQRHVTWARLRGLNDKQTERRHILRNASLPMITAVGM HIGELIGGTMIIENIFAWPGVGRYAVSAIFNRDYPVIQCFTLMMVVVFVVCNLIVDLLNA ALDPRIRRHEGAHA >gi|296494587|gb|ADTN01000151.1| GENE 10 7489 - 8322 1089 277 aa, chain + ## HITS:1 COG:ECs4345 KEGG:ns NR:ns ## COG: ECs4345 COG1173 # Protein_GI_number: 15833599 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Escherichia coli O157:H7 # 1 277 1 277 277 461 100.0 1e-130 MNFFLSSRWSVRLALIIIALLALIALTSQWWLPYDPQAIDLPSRLLSPDAQHWLGTDHLG RDIFSRLMAATRVSLGSVMACLLLVLTLGLVIGGSAGLIGGRVDQATMRVADMFMTFPTS ILSFFMVGVLGTGLTNVIIAIALSHWAWYARMVRSLVISLRQREFVLASRLSGAGHVRVF VDHLAGAVIPSLLVLATLDIGHMMLHVAGMSFLGLGVTAPTAEWGVMINDARQYIWTQPL QMFWPGLALFISVMAFNLVGDALRDHLDPHLVTEHAH >gi|296494587|gb|ADTN01000151.1| GENE 11 8322 - 9086 355 254 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 5 240 9 258 563 141 34 7e-75 MPQQIELRNIALQAAQPLVHGVSLTLQRGRVLALVGGSGSGKSLTCAATLGILPAGVRQT AGEILADGKPVSPCALRGIKIATIMQNPRSAFNPLHTMHTHARETCLALGKPADDATLTA AIEAVGLENAARVLKLYPFEMSGGMLQRMMIAMAVLCESPFIIADEPTTDLDVVAQARIL DLLESIMQKQAPGMLLVTHDMGVVARLADDVAVMSDGKIVEQGDVETLFNAPKHTVTRSL VSAHLALYGMELAS >gi|296494587|gb|ADTN01000151.1| GENE 12 9083 - 9889 404 268 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 27 232 294 500 563 160 41 7e-75 MTLLNISGLSHHYAHGGFNGKHQHQAVLNNVSLTLKSGETVALLGRSGCGKSTLARLLVG LESPAQGNISWRGEPLAKLNRAQRKAFRRDIQMVFQDSISAVNPRKTVREILREPMRHLL SLKKSEQLARASEMLKAVDLDDSVLDKRPPQLSGGQLQRVCLARALAVEPKLLILDEAVS NLDLVLQAGVIRLLKKLQQQFGTACLFITHDLRLVERFCQRVMVMDNGQIVETQVVGEKL TFSSDAGRVLQNAVLPAFPVRRRTTEKV >gi|296494587|gb|ADTN01000151.1| GENE 13 9895 - 10296 437 133 aa, chain + ## HITS:1 COG:ECs4348 KEGG:ns NR:ns ## COG: ECs4348 COG0864 # Protein_GI_number: 15833602 # Func_class: K Transcription # Function: Predicted transcriptional regulators containing the CopG/Arc/MetJ DNA-binding domain and a metal-binding domain # Organism: Escherichia coli O157:H7 # 1 133 1 133 133 205 100.0 2e-53 MQRVTITLDDDLLETLDSLSQRRGYNNRSEAIRDILRSALAQEATQQHGTQGFAVLSYVY EHEKRDLASRIVSTQHHHHDLSVATLHVHINHDDCLEIAVLKGDMGDVQHFADDVIAQRG VRHGHLQCLPKED >gi|296494587|gb|ADTN01000151.1| GENE 14 10499 - 11450 746 317 aa, chain + ## HITS:1 COG:rhsB KEGG:ns NR:ns ## COG: rhsB COG3209 # Protein_GI_number: 16131354 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli K12 # 1 317 1 317 1411 649 100.0 0 MSGKPAARQGDMTQYGGSIVQGSAGVRIGAPTGVACSVCPGGVTSGHPVNPLLGAKVLPG ETDIALPGPLPFILSRTYSSYRTKTPAPVGSLGPGWKMPADIRLQLRDNTLILSDNGGRS LYFEHLFPGEDGYSRSESLWLVRGGVAKLDEGHRLAALWQALPEELRLSPHRYLATNSPQ GPWWLLGWCERVPEADEVLPAPLPPYRVLTGLVDRFGRTQTFHREAAGEFSGEITGVTDG AWRHFRLVLTTQAQRAEEARQQAISGGTEPSAFPDTLPGYTEYGRDNGIRLSAVWLTHDP EYPENLPAAPLVRYGWT Prediction of potential genes in microbial genomes Time: Sun May 15 23:39:33 2011 Seq name: gi|296494586|gb|ADTN01000152.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont297.1, whole genome shotgun sequence Length of sequence - 5565 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 96 - 155 4.2 1 1 Op 1 4/0.000 + CDS 276 - 884 696 ## COG0625 Glutathione S-transferase 2 1 Op 2 7/0.000 + CDS 982 
- 2373 1382 ## COG1921 Selenocysteine synthase [seryl-tRNASer selenium transferase] 3 1 Op 3 2/0.000 + CDS 2370 - 4214 1738 ## COG3276 Selenocysteine-specific translation elongation factor + Term 4265 - 4324 13.1 + Prom 4295 - 4354 9.5 4 2 Tu 1 . + CDS 4404 - 5555 1358 ## COG1454 Alcohol dehydrogenase, class IV Predicted protein(s) >gi|296494586|gb|ADTN01000152.1| GENE 1 276 - 884 696 202 aa, chain + ## HITS:1 COG:yibF KEGG:ns NR:ns ## COG: yibF COG0625 # Protein_GI_number: 16131463 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione S-transferase # Organism: Escherichia coli K12 # 1 202 1 202 202 401 100.0 1e-112 MKLVGSYTSPFVRKLSILLLEKGITFEFINELPYNADNGVAQFNPLGKVPVLVTEEGECW FDSPIIAEYIELMNVAPAMLPRDPLESLRVRKIEALADGIMDAGLVSVREQARPAAQQSE DELLRQREKINRSLDVLEGYLVDGTLKTDTVNLATIAIACAVGYLNFRRVAPGWCVDRPH LVKLVENLFSRESFARTEPPKA >gi|296494586|gb|ADTN01000152.1| GENE 2 982 - 2373 1382 463 aa, chain + ## HITS:1 COG:ECs4468 KEGG:ns NR:ns ## COG: ECs4468 COG1921 # Protein_GI_number: 15833722 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine synthase [seryl-tRNASer selenium transferase] # Organism: Escherichia coli O157:H7 # 1 463 1 463 463 843 100.0 0 MTTETRSLYSQLPAIDRLLRDSSFLSLRDTYGHTRVVELLRQMLDEAREVIRGSQTLPAW CENWAQEVDARLTKEAQSALRPVINLTGTVLHTNLGRALQAEAAVEAVAQAMRSPVTLEY DLDDAGRGHRDRALAQLLCRITGAEDACIVNNNAAAVLLMLAATASGKEVVVSRGELVEI GGAFRIPDVMRQAGCTLHEVGTTNRTHANDYRQAVNENTALLMKVHTSNYSIQGFTKAID EAELVALGKELDVPVVTDLGSGSLVDLSQYGLPKEPMPQELIAAGVSLVSFSGDKLLGGP QAGIIVGKKEMIARLQSHPLKRALRADKMTLAALEATLRLYLHPEALSEKLPTLRLLTRS AEVIQIQAQRLQAPLAAHYGAEFAVQVMPCLSQIGSGSLPVDRLPSAALTFTPHDGRGSH LESLAARWRELPVPVIGRIYDGRLWLDLRCLEDEQRFLEMLLK >gi|296494586|gb|ADTN01000152.1| GENE 3 2370 - 4214 1738 614 aa, chain + ## HITS:1 COG:selB KEGG:ns NR:ns ## COG: selB COG3276 # Protein_GI_number: 16131461 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Selenocysteine-specific translation elongation factor # 
Organism: Escherichia coli K12 # 1 614 1 614 614 1218 100.0 0 MIIATAGHVDHGKTTLLQAITGVNADRLPEEKKRGMTIDLGYAYWPQPDGRVPGFIDVPG HEKFLSNMLAGVGGIDHALLVVACDDGVMAQTREHLAILQLTGNPMLTVALTKADRVDEA RVDEVERQVKEVLREYGFAEAKLFITAATEGRGMDALREHLLQLPEREHASQHSFRLAID RAFTVKGAGLVVTGTALSGEVKVGDSLWLTGVNKPMRVRALHAQNQPTETANAGQRIALN IAGDAEKEQINRGDWLLADVPPEPFTRVIVELQTHTPLTQWQPLHIHHAASHVTGRVSLL EDNLAELVFDTPLWLADNDRLVLRDISARNTLAGARVVMLNPPRRGKRKPEYLQWLASLA RAQSDADALSVHLERGAVNLADFAWARQLNGEGMRELLQQPGYIQAGYSLLNAPVAARWQ RKILDTLATYHEQHRDEPGPGRERLRRMALPMEDEALVLLLIEKMRESGDIHSHHGWLHL PDHKAGFSEEQQAIWQKAEPLFGDEPWWVRDLAKETGTDEQAMRLTLRQAAQQGIITAIV KDRYYRNDRIVEFANMIRDLDQECGSTCAADFRDRLGVGRKLAIQILEYFDRIGFTRRRG NDHLLRDALLFPEK >gi|296494586|gb|ADTN01000152.1| GENE 4 4404 - 5555 1358 383 aa, chain + ## HITS:1 COG:ECs4466 KEGG:ns NR:ns ## COG: ECs4466 COG1454 # Protein_GI_number: 15833720 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Escherichia coli O157:H7 # 1 383 1 383 383 704 99.0 0 MAASTFFIPSVNVIGADSLTDAMNMMADYGFTRTLIVTDNMLTKLGMAGDVQKALEERNI FSVIYDGTQPNPTTENVAAGLKLLKENNCDSVISLGGGSPHDCAKGIALVAANGGDIRDY EGVDRSAKPQLPMIAINTTAGTASEMTRFCIITDEARHIKMAIVDKHVTPLLSVNDSSLM IGMPKSLTAATGMDALTHAIEAYVSIAATPITDACALKAVTMIAENLPLAVEDGSNAKAR EAMAYAQFLAGMAFNNASLGYVHAMAHQLGGFYNLPHGVCNAVLLPHVQVFNSKVAAARL RDCAAAMGVNVTGKNDAEGAEACINAIRELAKKVDIPAGLRDLNVKEEDFAVLATNALKD ACGFTNPIQATHEEIVAIYRAAM Prediction of potential genes in microbial genomes Time: Sun May 15 23:39:39 2011 Seq name: gi|296494585|gb|ADTN01000153.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont297.2, whole genome shotgun sequence Length of sequence - 21451 bp Number of predicted genes - 20, with homology - 19 Number of transcription units - 11, operones - 5 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 37 - 243 247 ## ECO103_0695 hypothetical protein - Prom 414 - 473 9.8 + Prom 412 - 471 9.5 2 2 Op 1 . 
+ CDS 556 - 645 113 ## 3 2 Op 2 20/0.000 + CDS 645 - 2318 1870 ## COG2060 K+-transporting ATPase, A chain 4 2 Op 3 18/0.000 + CDS 2341 - 4389 2511 ## COG2216 High-affinity K+ transport system, ATPase chain B 5 2 Op 4 15/0.000 + CDS 4398 - 4970 494 ## COG2156 K+-transporting ATPase, c chain 6 2 Op 5 16/0.000 + CDS 4963 - 7647 2360 ## COG2205 Osmosensitive K+ channel histidine kinase 7 2 Op 6 . + CDS 7719 - 8321 629 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Term 8360 - 8391 -0.7 - Term 8348 - 8379 -0.7 8 3 Tu 1 . - CDS 8537 - 8794 112 ## ECB_00650 hypothetical protein - Prom 8883 - 8942 4.0 + Prom 8712 - 8771 3.0 9 4 Op 1 10/0.000 + CDS 8917 - 11115 2069 ## COG1982 Arginine/lysine/ornithine decarboxylases 10 4 Op 2 . + CDS 11112 - 12431 1605 ## COG0531 Amino acid transporters + Term 12449 - 12480 2.5 + Prom 12630 - 12689 5.3 11 5 Tu 1 . + CDS 12780 - 13430 256 ## ECB_00647 hypothetical protein + Term 13441 - 13477 0.0 12 6 Tu 1 . - CDS 13471 - 13695 71 ## B21_00637 hypothetical protein - Prom 13746 - 13805 1.5 - Term 14129 - 14160 3.2 13 7 Op 1 6/1.000 - CDS 14179 - 15819 2049 ## COG0033 Phosphoglucomutase 14 7 Op 2 . - CDS 15845 - 16390 390 ## COG3057 Negative regulator of replication initiationR - Prom 16527 - 16586 4.4 + Prom 16483 - 16542 2.4 15 8 Op 1 . + CDS 16575 - 17339 719 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 16 8 Op 2 . + CDS 17410 - 17772 383 ## LF82_2603 uncharacterized protein YbfE + Prom 17796 - 17855 2.9 17 9 Tu 1 6/1.000 + CDS 17912 - 18442 784 ## COG0716 Flavodoxins + Term 18448 - 18487 9.1 + Prom 18601 - 18660 7.0 18 10 Tu 1 . + CDS 18731 - 19177 468 ## COG0735 Fe2+/Zn2+ uptake regulation proteins + Term 19197 - 19226 2.1 - Term 19229 - 19260 0.1 19 11 Op 1 . - CDS 19261 - 19587 395 ## ECIAI39_0639 putative lipoprotein 20 11 Op 2 . 
- CDS 19637 - 21043 1654 ## JW0667 predicted outer membrane porin Predicted protein(s) >gi|296494585|gb|ADTN01000153.1| GENE 1 37 - 243 247 68 aa, chain - ## HITS:1 COG:no KEGG:ECO103_0695 NR:ns ## KEGG: ECO103_0695 # Name: ybfA # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 68 1 68 68 117 100.0 1e-25 MELYREYPAWLIFLRRTYAVAAGVLALPFMLFWKDRARFYSYLHRVWSKTSDKPVWMDQA EKATGDFY >gi|296494585|gb|ADTN01000153.1| GENE 2 556 - 645 113 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSAGVITGVLLVFLLLGYLVYALINAEAF >gi|296494585|gb|ADTN01000153.1| GENE 3 645 - 2318 1870 557 aa, chain + ## HITS:1 COG:kdpA KEGG:ns NR:ns ## COG: kdpA COG2060 # Protein_GI_number: 16128674 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, A chain # Organism: Escherichia coli K12 # 1 557 1 557 557 982 99.0 0 MAAQGFLLIATFLLVLMVLARPLGSGLARLINDIPLPGTTGVERVLFRALGVSDREMNWK QYLCAILGLNMLGLAVLFFMLLGQHYLPLNPQQLPGLSWDLALNTAVSFVTNTNWQSYSG ETTLSYFSQMAGLTVQNFLSAASGIAVIFALIRAFTRQSMSTLGNAWVDLLRITLWVLVP VALLIALFFIQQGALQNFLPYQAVNTVEGAQQLLPMGPVASQEAIKMLGTNGGGFFNANS SHPFENPTALTNFVQMLAIFLIPTALCFAFGEVTGDRRQGRMLLWAMSVIFVICVGVVMW AEVQGNPHLLALGTDSSINMEGKESRFGVLVSSLFAVVTTAASCGAVIAMHDSFTALGGM VPMWLMQIGEVVFGGVGSGLYGMMLFVLLAVFIAGLMIGRTPEYLGKKIDVREMKLTALA ILVTPTLVLMGAALAMMTDAGRSAMLNPGPHGFSEVLYAVSSAANNNGSAFAGLSANSPF WNCLLAFCMFVGRFGVIIPVMAIAGSLVSKKSQAASSGTLPTHGPLFVGLLIGTVLLVGA LTFIPALALGPVAEYLS >gi|296494585|gb|ADTN01000153.1| GENE 4 2341 - 4389 2511 682 aa, chain + ## HITS:1 COG:ECs0725 KEGG:ns NR:ns ## COG: ECs0725 COG2216 # Protein_GI_number: 15829979 # Func_class: P Inorganic ion transport and metabolism # Function: High-affinity K+ transport system, ATPase chain B # Organism: Escherichia coli O157:H7 # 1 682 1 682 682 1246 99.0 0 MSRKQLALFEPTLVVQALKEAVKKLNPQAQWRNPVMFIVWIGSLLTTCISIAMASGAMPG NALFSAAISGWLWITVLFANFAEALAEGRSKAQANSLKGVKKTAFARKLREPKYGAAADK VPADQLRKGDIVLVEAGDIIPCDGEVIEGGASVDESAITGESAPVIRESGGDFASVTGGT 
RILSDWLVIECSVNPGETFLDRMIAMVEGAQRRKTPNEIALTILLIALTIVFLLATATLW PFSAWGGNAVSVTVLVALLVCLIPTTIGGLLSAIGVAGMSRMLGANVIATSGRAVEAAGD VDVLLLDKTGTITLGNRQASEFIPAQGVDEKTLADAAQLASLADETPEGRSIVILAKQRF NLRERDVQSLHATFVPFTAQSRMSGINIDNRMIRKGSVDAIRRHVEANGGHFPTDVDQKV DQVARQGATPLVVVEGSRVLGVIALKDIVKGGIKERFAQLRKMGIKTVMITGDNRLTAAA IAAEAGVDDFLAEATPEAKLALIRQYQAEGRLVAMTGDGTNDAPALAQADVAVAMNSGTQ AAKEAGNMVDLDSNPTKLIEVVHIGKQMLMTRGSLTTFSIANDVAKYFAIIPAAFAATYP QLNALNIMCLHSPDSAILSAVIFNALIIVFLIPLALKGVSYKPLTASAMLRRNLWIYGLG GLLVPFIGIKVIDLLLTVCGLV >gi|296494585|gb|ADTN01000153.1| GENE 5 4398 - 4970 494 190 aa, chain + ## HITS:1 COG:kdpC KEGG:ns NR:ns ## COG: kdpC COG2156 # Protein_GI_number: 16128672 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, c chain # Organism: Escherichia coli K12 # 1 190 1 190 190 337 100.0 6e-93 MSGLRPALSTFIFLLLITGGVYPLLTTVLGQWWFPWQANGSLIREGDTVRGSALIGQNFT GNGYFHGRPSATAEMPYNPQASGGSNLAVSNPELDKLIAARVAALRAANPDASASVPVEL VTASASGLDNNITPQAAAWQIPRVAKARNLSVEQLTQLIAKYSQQPLVKYIGQPVVNIVE LNLALDKLDE >gi|296494585|gb|ADTN01000153.1| GENE 6 4963 - 7647 2360 894 aa, chain + ## HITS:1 COG:kdpD KEGG:ns NR:ns ## COG: kdpD COG2205 # Protein_GI_number: 16128671 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Escherichia coli K12 # 1 894 1 894 894 1768 100.0 0 MNNEPLRPDPDRLLEQTAAPHRGKLKVFFGACAGVGKTWAMLAEAQRLRAQGLDIVVGVV ETHGRKDTAAMLEGLAVLPLKRQAYRGRHISEFDLDAALARRPALILMDELAHSNAPGSR HPKRWQDIEELLEAGIDVFTTVNVQHLESLNDVVSGVTGIQVRETVPDPFFDAADDVVLV DLPPDDLRQRLKEGKVYIAGQAERAIEHFFRKGNLIALRELALRRTADRVDEQMRAWRGH PGEEKVWHTRDAILLCIGHNTGSEKLVRAAARLASRLGSVWHAVYVETPALHRLPEKKRR AILSALRLAQELGAETATLSDPAEEKAVVRYAREHNLGKIILGRPASRRWWRRETFADRL ARIAPDLDQVLVALDEPPARTINNAPDNRSFKDKWRVQIQGCVVAAALCAVITLIAMQWL MAFDAANLVMLYLLGVVVVALFYGRWPSVVATVINVVSFDLFFIAPRGTLAVSDVQYLLT FAVMLTVGLVIGNLTAGVRYQARVARYREQRTRHLYEMSKALAVGRSPQDIAATSEQFIA STFHARSQVLLPDDNGKLQPLTHPQGMTPWDDAIAQWSFDKGLPAGAGTDTLPGVPYQIL 
PLKSGEKTYGLVVVEPGNLRQLMIPEQQRLLETFTLLVANALERLTLTASEEQARMASER EQIRNALLAALSHDLRTPLTVLFGQAEILTLDLASEGSPHARQASEIRQHVLNTTRLVNN LLDMARIQSGGFNLKKEWLTLEEVVGSALQMLEPGLSSPINLSLPEPLTLIHVDGPLFER VLINLLENAVKYAGAQAEIGIDAHVEGENLQLDVWDNGPGLPPGQEQTIFDKFARGNKES AVPGVGLGLAICRAIVDVHGGTITAFNRPEGGACFRVTLPQQTAPELEEFHEDM >gi|296494585|gb|ADTN01000153.1| GENE 7 7719 - 8321 629 200 aa, chain + ## HITS:1 COG:kdpE KEGG:ns NR:ns ## COG: kdpE COG0745 # Protein_GI_number: 16128670 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli K12 # 1 200 26 225 225 389 100.0 1e-108 MRVFEAETLQRGLLEAATRKPDLIILDLGLPDGDGIEFIRDLRQWSAVPVIVLSARSEES DKIAALDAGADDYLSKPFGIGELQARLRVALRRHSATTAPDPLVKFSDVTVDLAARVIHR GEEEVHLTPIEFRLLAVLLNNAGKVLTQRQLLNQVWGPNAVEHSHYLRIYMGHLRQKLEQ DPARPRHFITETGIGYRFML >gi|296494585|gb|ADTN01000153.1| GENE 8 8537 - 8794 112 85 aa, chain - ## HITS:1 COG:no KEGG:ECB_00650 NR:ns ## KEGG: ECB_00650 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 85 1 85 85 169 100.0 3e-41 MQKPLPVRWLIIQGLTENEHCGIQAYACNTESQEDLYAEVPLVGRLTGIEEVKVEAIAMS KLHNMSCPPYMGHSAAVIFHLIYLS >gi|296494585|gb|ADTN01000153.1| GENE 9 8917 - 11115 2069 732 aa, chain + ## HITS:1 COG:speF KEGG:ns NR:ns ## COG: speF COG1982 # Protein_GI_number: 16128669 # Func_class: E Amino acid transport and metabolism # Function: Arginine/lysine/ornithine decarboxylases # Organism: Escherichia coli K12 # 1 732 1 732 732 1551 99.0 0 MSKLKIAVSDSCPDCFTTQRECIYINESRNIDVAAIVLSLNDVTCGKLDEIDATGYGIPV FIATENQERVPAEYLPRISGVFENCESRREFYGRQLETAASHYETQLRPPFFRALVDYVN QGNSAFDCPGHQGGEFFRRHPAGNQFVEYFGEALFRADLCNADVAMGDLLIHEGAPCIAQ QHAAKVFNADKTYFVLNGTSSSNKVVLNALLTPGDLVLFDRNNHKSNHHGALLQAGATPV YLETARNPYGFIGGIDAHCFEESYLRELIAEVAPQRAKEARPFRLAVIQLGTYDGTIYNA RQVVDKIGHLCDYILFDSAWVGYEQFIPMMADCSPLLLDLNENDPGILVTQSVHKQQAGF 
SQTSQIHKKDSHIKGQQRYVPHKRMNNAFMMHASTSPFYPLFAALDINAKMHEGVSGRNM WMDCVVNGINARKLILDNCQHIRPFVPELVDGKPWQSYETAQIAVDLRFFQFVPGEHWHS FEGYAENQYFVDPCKLLLTTPGIDARNGEYEAFGVPATILANFLRENGVVPEKCDLNSIL FLLTPAEDMAKLQQLVALLVRFEKLLESDAPLAEVLPSIYKQHEERYAGYTLRQLCQEMH DLYARHNVKQLQKEMFRKEHFPRVSMNPQEANYAYLRGEVELVRLPDAEGRIAAEGALPY PPGVLCVVPGEIWGGAVLRYFSALEEGINLLPGFAPELQGVYIEEHDGRKQVWCYVIKPR DAQSTLLKGEKL >gi|296494585|gb|ADTN01000153.1| GENE 10 11112 - 12431 1605 439 aa, chain + ## HITS:1 COG:potE KEGG:ns NR:ns ## COG: potE COG0531 # Protein_GI_number: 16128668 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 439 1 439 439 770 100.0 0 MSQAKSNKMGVVQLTILTMVNMMGSGIIMLPTKLAEVGTISIISWLVTAVGSMALAWAFA KCGMFSRKSGGMGGYAEYAFGKSGNFMANYTYGVSLLIANVAIAISAVGYGTELLGASLS PVQIGLATIGVLWICTVANFGGARITGQISSITVWGVIIPVVGLCIIGWFWFSPTLYVDS WNPHHAPFFSAVGSSIAMTLWAFLGLESACANTDVVENPERNVPIAVLGGTLGAAVIYIV STNVIAGIVPNMELANSTAPFGLAFAQMFTPEVGKVIMALMVMSCCGSLLGWQFTIAQVF KSSSDEGYFPKIFSRVTKVDAPVQGMLTIVIIQSGLALMTISPSLNSQFNVLVNLAVVTN IIPYILSMAALVIIQKVANVPPSKAKVANFVAFVGAMYSFYALYSSGEEAMLYGSIVTFL GWTLYGLVSPRFELKNKHG >gi|296494585|gb|ADTN01000153.1| GENE 11 12780 - 13430 256 216 aa, chain + ## HITS:1 COG:no KEGG:ECB_00647 NR:ns ## KEGG: ECB_00647 # Name: ybfGH # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 216 1 216 216 426 100.0 1e-118 MREMNQLVNTEDSAWPIIQNWLKDATNHTELLPVNKDLAETALYQLQVTTKSPMGALVYG SGGLLIDNGWLRIAGSGHPRLPRDPVSWTQRPEFAGVRALPIADDVAGGIFALNGGDLGE DTGCVYYFAPDTLNWESLEVGYSEFLQWALSGDLDTFYENVRWQQWREDVIKLSATEAFT FYPFLWVQSEEARTRKVISLTELWEMQYQMKETFTQ >gi|296494585|gb|ADTN01000153.1| GENE 12 13471 - 13695 71 74 aa, chain - ## HITS:1 COG:no KEGG:B21_00637 NR:ns ## KEGG: B21_00637 # Name: ybfP # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 74 91 164 164 161 100.0 7e-39 MSAQEKVGLNPGWQCYTSFFMRVCQGKPGTRPIVNEDYVSESGFFGSMMHVGIIELRRCQ SENCQQELKAINTH >gi|296494585|gb|ADTN01000153.1| GENE 13 14179 - 15819 
2049 546 aa, chain - ## HITS:1 COG:pgm KEGG:ns NR:ns ## COG: pgm COG0033 # Protein_GI_number: 16128664 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglucomutase # Organism: Escherichia coli K12 # 1 546 1 546 546 1081 100.0 0 MAIHNRAGQPAQQSDLINVAQLTAQYYVLKPEAGNAEHAVKFGTSGHRGSAARHSFNEPH ILAIAQAIAEERAKNGITGPCYVGKDTHALSEPAFISVLEVLAANGVDVIVQENNGFTPT PAVSNAILVHNKKGGPLADGIVITPSHNPPEDGGIKYNPPNGGPADTNVTKVVEDRANAL LADGLKGVKRISLDEAMASGHVKEQDLVQPFVEGLADIVDMAAIQKAGLTLGVDPLGGSG IEYWKRIGEYYNLNLTIVNDQVDQTFRFMHLDKDGAIRMDCSSECAMAGLLALRDKFDLA FANDPDYDRHGIVTPAGLMNPNHYLAVAINYLFQHRPQWGKDVAVGKTLVSSAMIDRVVN DLGRKLVEVPVGFKWFVDGLFDGSFGFGGEESAGASFLRFDGTPWSTDKDGIIMCLLAAE ITAVTGKNPQEHYNELAKRFGAPSYNRLQAAATSAQKAALSKLSPEMVSASTLAGDPITA RLTAAPGNGASIGGLKVMTDNGWFAARPSGTEDAYKIYCESFLGEEHRKQIEKEAVEIVS EVLKNA >gi|296494585|gb|ADTN01000153.1| GENE 14 15845 - 16390 390 181 aa, chain - ## HITS:1 COG:seqA KEGG:ns NR:ns ## COG: seqA COG3057 # Protein_GI_number: 16128663 # Func_class: L Replication, recombination and repair # Function: Negative regulator of replication initiationR # Organism: Escherichia coli K12 # 1 181 1 181 181 344 100.0 5e-95 MKTIEVDDELYSYIASHTKHIGESASDILRRMLKFSAASQPAAPVTKEVRVASPAIVEAK PVKTIKDKVRAMRELLLSDEYAEQKRAVNRFMLLLSTLYSLDAQAFAEATESLHGRTRVY FAADEQTLLKNGNQTKPKHVPGTPYWVITNTNTGRKCSMIEHIMQSMQFPAELIEKVCGT I >gi|296494585|gb|ADTN01000153.1| GENE 15 16575 - 17339 719 254 aa, chain + ## HITS:1 COG:ybfF KEGG:ns NR:ns ## COG: ybfF COG0596 # Protein_GI_number: 16128662 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Escherichia coli K12 # 1 254 1 254 254 500 100.0 1e-141 MKLNIRAQTAQNQHNNSPIVLVHGLFGSLDNLGVLARDLVNDHNIIQVDMRNHGLSPRDP VMNYPAMAQDLVDTLDAQQIDKATFIGHSMGGKAVMALTALASDRIDKLVAIDIAPVDYH VRRHDEIFAAINAVSESDAQTRQQAAAIMRQHLNEEGVIQFLLKSFVDGEWRFNVPVLWD QYPHIVGWEKIPAWDHPALFIPGGNSPYVSEQYRDDLLAQFPQARAHVIAGAGHWVHAEK PDAVLRAIRRYLND >gi|296494585|gb|ADTN01000153.1| GENE 16 
17410 - 17772 383 120 aa, chain + ## HITS:1 COG:no KEGG:LF82_2603 NR:ns ## KEGG: LF82_2603 # Name: ybfE # Def: uncharacterized protein YbfE # Organism: E.coli_LF82 # Pathway: not_defined # 1 120 1 120 120 190 99.0 1e-47 MYYGALSIRAEAWLIVSPEVTKIMAKEQTDRTTLDLFAHERRPGRPKTNPLSRDEQLRIN KRNQLKRDKVRGLKRVELKLNAEAVEALNELAESRNMSRSELIEEMLMQQLAALRSQGIV >gi|296494585|gb|ADTN01000153.1| GENE 17 17912 - 18442 784 176 aa, chain + ## HITS:1 COG:ECs0715 KEGG:ns NR:ns ## COG: ECs0715 COG0716 # Protein_GI_number: 15829969 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Escherichia coli O157:H7 # 1 176 1 176 176 348 100.0 3e-96 MAITGIFFGSDTGNTENIAKMIQKQLGKDVADVHDIAKSSKEDLEAYDILLLGIPTWYYG EAQCDWDDFFPTLEEIDFNGKLVALFGCGDQEDYAEYFCDALGTIRDIIEPRGATIVGHW PTAGYHFEASKGLADDDHFVGLAIDEDRQPELTAERVEKWVKQISEELHLDEILNA >gi|296494585|gb|ADTN01000153.1| GENE 18 18731 - 19177 468 148 aa, chain + ## HITS:1 COG:ECs0714 KEGG:ns NR:ns ## COG: ECs0714 COG0735 # Protein_GI_number: 15829968 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Escherichia coli O157:H7 # 1 148 1 148 148 279 100.0 1e-75 MTDNNTALKKAGLKVTLPRLKILEVLQEPDNHHVSAEDLYKRLIDMGEEIGLATVYRVLN QFDDAGIVTRHNFEGGKSVFELTQQHHHDHLICLDCGKVIEFSDDSIEARQREIAAKHGI RLTNHSLYLYGHCAEGDCREDEHAHEGK >gi|296494585|gb|ADTN01000153.1| GENE 19 19261 - 19587 395 108 aa, chain - ## HITS:1 COG:no KEGG:ECIAI39_0639 NR:ns ## KEGG: ECIAI39_0639 # Name: ybfN # Def: putative lipoprotein # Organism: E.coli_IAI39 # Pathway: not_defined # 1 108 1 108 108 172 99.0 3e-42 MKKLILIAIMASGLVACAQSTAPQEDSRLKEAYSACINTAQGSPEKIEACQSVLNVLKKE KQHQQFADQESVRVLDYQQCLRATQTGNDQAVKADCDKVWQEIRSNNK >gi|296494585|gb|ADTN01000153.1| GENE 20 19637 - 21043 1654 468 aa, chain - ## HITS:1 COG:no KEGG:JW0667 NR:ns ## KEGG: JW0667 # Name: ybfM # Def: predicted outer membrane porin # Organism: E.coli_J # Pathway: not_defined # 1 468 1 468 468 878 100.0 0 
MRTFSGKRSTLALAIAGVTAMSGFMAMPEARAEGFIDDSTLTGGIYYWQRERDRKDVTDG DKYKTNLSHSTWNANLDFQSGYAADMFGLDIAAFTAIEMAENGDSSHPNEIAFSKSNKAY DEDWSGDKSGISLYKAAAKFKYGPVWARAGYIQPTGQTLLAPHWSFMPGTYQGAEAGANF DYGDAGALSFSYMWTNEYKAPWHLEMDEFYQNDKTTKVDYLHSFGAKYDFKNNFVLEAAF GQAEGYIDQYFAKASYKFDIAGSPLTTSYQFYGTRDKVDDRSVNDLYDGTAWLQALTFGY RAADVVDLRLEGTWVKADGQQGYFLQRMTPTYASSNGRLDIWWDNRSDFNANGEKAVFFG AMYDLKNWNLPGFAIGASYVYAWDAKPATWQSNPDAYYDKNRTIEESAYSLDAVYTIQDG RAKGTMFKLHFTEYDNHSDIPSWGGGYGNIFQDERDVKFMVIAPFTIF Prediction of potential genes in microbial genomes Time: Sun May 15 23:40:06 2011 Seq name: gi|296494584|gb|ADTN01000154.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont297.3, whole genome shotgun sequence Length of sequence - 10371 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 5, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 105 - 1769 2162 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases - Prom 1852 - 1911 7.7 + Prom 1801 - 1860 3.6 2 2 Tu 1 . + CDS 1880 - 2002 67 ## ECH74115_0771 hypothetical protein - Term 1883 - 1926 6.6 3 3 Tu 1 . - CDS 1972 - 3918 2191 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific + Prom 3952 - 4011 5.4 4 4 Op 1 12/0.000 + CDS 4251 - 5051 947 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase + Term 5062 - 5099 2.1 5 4 Op 2 7/0.000 + CDS 5111 - 6259 1350 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase 6 4 Op 3 5/1.000 + CDS 6268 - 7488 261 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase 7 4 Op 4 5/1.000 + CDS 7536 - 8288 800 ## COG0647 Predicted sugar phosphatases of the HAD superfamily + Prom 8535 - 8594 8.9 8 5 Tu 1 . 
+ CDS 8685 - 10349 1940 ## COG0367 Asparagine synthase (glutamine-hydrolyzing) Predicted protein(s) >gi|296494584|gb|ADTN01000154.1| GENE 1 105 - 1769 2162 554 aa, chain - ## HITS:1 COG:glnS KEGG:ns NR:ns ## COG: glnS COG0008 # Protein_GI_number: 16128656 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Escherichia coli K12 # 1 554 1 554 554 1169 100.0 0 MSEAEARPTNFIRQIIDEDLASGKHTTVHTRFPPEPNGYLHIGHAKSICLNFGIAQDYKG QCNLRFDDTNPVKEDIEYVESIKNDVEWLGFHWSGNVRYSSDYFDQLHAYAIELINKGLA YVDELTPEQIREYRGTLTQPGKNSPYRDRSVEENLALFEKMRAGGFEEGKACLRAKIDMA SPFIVMRDPVLYRIKFAEHHQTGNKWCIYPMYDFTHCISDALEGITHSLCTLEFQDNRRL YDWVLDNITIPVHPRQYEFSRLNLEYTVMSKRKLNLLVTDKHVEGWDDPRMPTISGLRRR GYTAASIREFCKRIGVTKQDNTIEMASLESCIREDLNENAPRAMAVIDPVKLVIENYQGE GEMVTMPNHPNKPEMGSRQVPFSGEIWIDRADFREEANKQYKRLVLGKEVRLRNAYVIKA ERVEKDAEGNITTIFCTYDADTLSKDPADGRKVKGVIHWVSAAHALPVEIRLYDRLFSVP NPGAADDFLSVINPESLVIKQGFAEPSLKDAVAGKAFQFEREGYFCLDSRHSTAEKPVFN RTVGLRDTWAKVGE >gi|296494584|gb|ADTN01000154.1| GENE 2 1880 - 2002 67 40 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_0771 NR:ns ## KEGG: ECH74115_0771 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 40 12 51 51 68 100.0 8e-11 MIDEKKSGREDLHRLSLLDATLKRRIRHKADYFLISYSGV >gi|296494584|gb|ADTN01000154.1| GENE 3 1972 - 3918 2191 648 aa, chain - ## HITS:1 COG:ECs0709_1 KEGG:ns NR:ns ## COG: ECs0709_1 COG1263 # Protein_GI_number: 15829963 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Escherichia coli O157:H7 # 1 389 1 389 390 687 99.0 0 MNILGFFQRLGRALQLPIAVLPVAALLLRFGQPDLLNVAFIAQAGGAIFDNLALIFAIGV ASSWSKDSAGAAALAGAVGYFVLTKAMVTINPEINMGVLAGIITGLVGGAAYNRWSDIKL PDFLSFFGGKRFVPIATGFFCLVLAAIFGYVWPPVQHAIHAGGEWIVSAGALGSGIFGFI NRLLIPTGLHQVLNTIAWFQIGEFTNAAGTVFHGDINRFYAGDGTAGMFMSGFFPIMMFG 
LPGAALAMYFAAPKERRPMVGGMLLSVAVTAFLTGVTEPLEFLFMFLAPLLYLLHALLTG ISLFVATLLGIHAGFSFSAGAIDYALMYNLPAASQNVWMLLVMGVIFFAIYFVVFSLVIR MFNLKTPGREDKEDEIVTEEANSNTEEGLTQLATNYIAAVGGTDNLKAIDACITRLRLTV ADSARVNDTMCKRLGASGVVKLNKQTIQVIVGAKAESIGDAMKKVVARGPVAAASAEATP ATAAPVAKPQAVPNAVSIAELVSPITGDVVALDQVPDEAFASKAVGDGVAVKPTDKIVVS PAAGTIVKIFNTNHAFCLETEKGAEIVVHMGIDTVALEGKGFKRLVEEGAQVSAGQPILE MDLDYLNANARSMISPVVCSNIDDFSGLIIKAQGHVVAGQTPLYEIKK >gi|296494584|gb|ADTN01000154.1| GENE 4 4251 - 5051 947 266 aa, chain + ## HITS:1 COG:ECs0708 KEGG:ns NR:ns ## COG: ECs0708 COG0363 # Protein_GI_number: 15829962 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Escherichia coli O157:H7 # 1 266 1 266 266 560 100.0 1e-159 MRLIPLTTAEQVGKWAARHIVNRINAFKPTADRPFVLGLPTGGTPMTTYKALVEMHKAGQ VSFKHVVTFNMDEYVGLPKEHPESYYSFMHRNFFDHVDIPAENINLLNGNAPDIDAECRQ YEEKIRSYGKIHLFMGGVGNDGHIAFNEPASSLASRTRIKTLTHDTRVANSRFFDNDVNQ VPKYALTVGVGTLLDAEEVMILVLGSQKALALQAAVEGCVNHMWTISCLQLHPKAIMVCD EPSTMELKVKTLRYFNELEAENIKGL >gi|296494584|gb|ADTN01000154.1| GENE 5 5111 - 6259 1350 382 aa, chain + ## HITS:1 COG:ECs0707 KEGG:ns NR:ns ## COG: ECs0707 COG1820 # Protein_GI_number: 15829961 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Escherichia coli O157:H7 # 1 382 1 382 382 769 100.0 0 MYALTQGRIFTGHEFLDDHAVVIADGLIKSVCPVAELPPEIEQRSLNGAILSPGFIDVQL NGCGGVQFNDTAEAVSVETLEIMQKANEKSGCTNYLPTLITTSDELMKQGVRVMREYLAK HPNQALGLHLEGPWLNLVKKGTHNPNFVRKPDAALVDFLCENADVITKVTLAPEMVPAEV ISKLANAGIVVSAGHSNATLKEAKAGFRAGITFATHLYNAMPYITGREPGLAGAILDEAD IYCGIIADGLHVDYANIRNAKRLKGDKLCLVTDATAPAGANIEQFIFAGKTIYYRNGLCV DENGTLSGSSLTMIEGVRNLVEHCGIALDEVLRMATLYPARAIGVEKRLGTLAAGKVANL TAFTPDFKITKTIVNGNEVVTQ >gi|296494584|gb|ADTN01000154.1| GENE 6 6268 - 7488 261 406 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 129 390 47 317 319 105 27 2e-22 
MTPGGQAQIGNVDLVKQLNSAAVYRLIDQYGPISRIQIAEQSQLAPASVTKITRQLIERG LIKEVDQQASTGGRRAISIVTETRNFHAIGVRLGRHDATITLFDLSSKVLAEEHYPLPER TQQTLEHALLNAIAQFIDSYQRKLRELIAISVILPGLVDPDSGKIHYMPHIQVENWGLVE ALEERFKVTCFVGHDIRSLALAEHYFGASQDCEDSILVRVHRGTGAGIISNGRIFIGRNG NVGEIGHIQVEPLGERCHCGNFGCLETIAANAAIEQRVLNLLKQGYQSRVPLDDCTIKTI CKAANKGDSLASEVIEYVGRHLGKTIAIAINLFNPQKIVIAGEITEADKVLLPAIESCIN TQALKAFRTNLPVVRSELDHRSAIGAFALVKRAMLNGILLQHLLEN >gi|296494584|gb|ADTN01000154.1| GENE 7 7536 - 8288 800 250 aa, chain + ## HITS:1 COG:ECs0705 KEGG:ns NR:ns ## COG: ECs0705 COG0647 # Protein_GI_number: 15829959 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar phosphatases of the HAD superfamily # Organism: Escherichia coli O157:H7 # 1 250 1 250 250 517 100.0 1e-147 MTIKNVICDIDGVLMHDNVAVPGAAEFLHGIMDKGLPLVLLTNYPSQTGQDLANRFATAG VDVPDSVFYTSAMATADFLRRQEGKKAYVVGEGALIHELYKAGFTITDVNPDFVIVGETR SYNWDMMHKAAYFVANGARFIATNPDTHGRGFYPACGALCAGIEKISGRKPFYVGKPSPW IIRAALNKMQAHSEETVIVGDNLRTDILAGFQAGLETILVLSGVSSLDDIDSMPFRPSWI YPSVAEIDVI >gi|296494584|gb|ADTN01000154.1| GENE 8 8685 - 10349 1940 554 aa, chain + ## HITS:1 COG:asnB KEGG:ns NR:ns ## COG: asnB COG0367 # Protein_GI_number: 16128650 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthase (glutamine-hydrolyzing) # Organism: Escherichia coli K12 # 1 554 1 554 554 1145 100.0 0 MCSIFGVFDIKTDAVELRKKALELSRLMRHRGPDWSGIYASDNAILAHERLSIVDVNAGA QPLYNQQKTHVLAVNGEIYNHQALRAEYGDRYQFQTGSDCEVILALYQEKGPEFLDDLQG MFAFALYDSEKDAYLIGRDHLGIIPLYMGYDEHGQLYVASEMKALVPVCRTIKEFPAGSY LWSQDGEIRSYYHRDWFDYDAVKDNVTDKNELRQALEDSVKSHLMSDVPYGVLLSGGLDS SIISAITKKYAARRVEDQERSEAWWPQLHSFAVGLPGSPDLKAAQEVANHLGTVHHEIHF TVQEGLDAIRDVIYHIETYDVTTIRASTPMYLMSRKIKAMGIKMVLSGEGSDEVFGGYLY FHKAPNAKELHEETVRKLLALHMYDCARANKAMSAWGVEARVPFLDKKFLDVAMRINPQD KMCGNGKMEKHILRECFEAYLPASVAWRQKEQFSDGVGYSWIDTLKEVAAQQVSDQQLET ARFRFPYNTPTSKEAYLYREIFEELFPLPSAAECVPGGPSVACSSAKAIEWDEAFKKMDD PSGRAVGVHQSAYK Prediction of potential genes in microbial genomes Time: Sun May 15 23:40:08 2011 Seq 
name: gi|296494583|gb|ADTN01000155.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont297.4, whole genome shotgun sequence Length of sequence - 598 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 375 - 451 96.2 # Met CAT 0 0 + TRNA 461 - 545 78.3 # Leu TAG 0 0 Prediction of potential genes in microbial genomes Time: Sun May 15 23:40:08 2011 Seq name: gi|296494582|gb|ADTN01000156.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont297.5, whole genome shotgun sequence Length of sequence - 304 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 63 - 137 74.3 # Gln TTG 0 0 + TRNA 153 - 229 96.2 # Met CAT 0 0 Prediction of potential genes in microbial genomes Time: Sun May 15 23:40:10 2011 Seq name: gi|296494581|gb|ADTN01000157.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont297.6, whole genome shotgun sequence Length of sequence - 5793 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 45 - 1583 1954 ## COG1012 NAD-dependent aldehyde dehydrogenases + Term 1592 - 1646 -0.3 - Term 1694 - 1745 -0.6 2 2 Tu 1 . - CDS 1802 - 2014 135 ## ECIAI1_3755 conserved hypothetical protein, putative membrane protein - Prom 2037 - 2096 5.8 + Prom 2036 - 2095 8.8 3 3 Op 1 . + CDS 2128 - 2451 329 ## B21_03393 hypothetical protein 4 3 Op 2 . + CDS 2457 - 3593 956 ## COG1566 Multidrug resistance efflux pump + Term 3643 - 3677 -0.7 5 4 Tu 1 . - CDS 3590 - 4564 705 ## COG0583 Transcriptional regulator - Prom 4587 - 4646 10.5 + Prom 4536 - 4595 8.9 6 5 Tu 1 . 
+ CDS 4688 - 5428 750 ## COG3713 Outer membrane protein V Predicted protein(s) >gi|296494581|gb|ADTN01000157.1| GENE 1 45 - 1583 1954 512 aa, chain + ## HITS:1 COG:aldB KEGG:ns NR:ns ## COG: aldB COG1012 # Protein_GI_number: 16131459 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Escherichia coli K12 # 1 512 31 542 542 1074 100.0 0 MTNNPPSAQIKPGEYGFPLKLKARYDNFIGGEWVAPADGEYYQNLTPVTGQLLCEVASSG KRDIDLALDAAHKVKDKWAHTSVQDRAAILFKIADRMEQNLELLATAETWDNGKPIRETS AADVPLAIDHFRYFASCIRAQEGGISEVDSETVAYHFHEPLGVVGQIIPWNFPLLMASWK MAPALAAGNCVVLKPARLTPLSVLLLMEIVGDLLPPGVVNVVNGAGGVIGEYLATSKRIA KVAFTGSTEVGQQIMQYATQNIIPVTLELGGKSPNIFFADVMDEEDAFFDKALEGFALFA FNQGEVCTCPSRALVQESIYERFMERAIRRVESIRSGNPLDSVTQMGAQVSHGQLETILN YIDIGKKEGADVLTGGRRKLLEGELKDGYYLEPTILFGQNNMRVFQEEIFGPVLAVTTFK TMEEALELANDTQYGLGAGVWSRNGNLAYKMGRGIQAGRVWTNCYHAYPAHAAFGGYKQS GIGRETHKMMLEHYQQTKCLLVSYSDKPLGLF >gi|296494581|gb|ADTN01000157.1| GENE 2 1802 - 2014 135 70 aa, chain - ## HITS:1 COG:no KEGG:ECIAI1_3755 NR:ns ## KEGG: ECIAI1_3755 # Name: not_defined # Def: conserved hypothetical protein, putative membrane protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 70 1 70 70 124 100.0 1e-27 MCCLVFQQDEVYHIEIVELLAKLMNNSSKTSTVQIKRIKPSIIYRLLLIGLGSPMVIYGL VRPLTIETRD >gi|296494581|gb|ADTN01000157.1| GENE 3 2128 - 2451 329 107 aa, chain + ## HITS:1 COG:no KEGG:B21_03393 NR:ns ## KEGG: B21_03393 # Name: yiaW # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 107 1 107 107 199 100.0 3e-50 MFLDYFALGVLIFVFLVIFYGIIILHDIPYLIAKKRNHPHADAIHVAGWVSLFTLHVIWP FLWIWATLYRPERGWGMQSHDSSVMQLQQRIAGLEKQLADIKSSSAE >gi|296494581|gb|ADTN01000157.1| GENE 4 2457 - 3593 956 378 aa, chain + ## HITS:1 COG:yiaV KEGG:ns NR:ns ## COG: yiaV COG1566 # Protein_GI_number: 16131457 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Escherichia coli K12 # 1 378 1 378 378 726 100.0 0 
MDLLIILTYVAFAWAMFKIFKIPVNKWTIPTAALGGIFIVSGLILLMNYNHPYTFKAQKA VISIPVVPQVTGVVIEVTDKKNTLIKKGEVLFRLDPTRYQARVDRLMADIVTAEHKQRAL GAELDEMAANTQQAKATRDKFAKEYQRYARGSQAKVNPFSERDIDVARQNYLAQEASVKS SAAEQKQIQSQLDSLVLGEHSQIASLKAQLAEAKYNLEQTIVRAPSDGYVTQVLIRPGTY AASLPLRPVMVFIPDQKRQIVAQFRQNSLLRLAPGDDAEVVFNALPGKVFSGKLAAISPA VPGGAYQSTGTLQTLNTAPGSDGVIATIELDEHTDLSALPDGIYAQVAVYSDHFSHVSVM RKVLLRMTSWVHYLYLDH >gi|296494581|gb|ADTN01000157.1| GENE 5 3590 - 4564 705 324 aa, chain - ## HITS:1 COG:yiaU KEGG:ns NR:ns ## COG: yiaU COG0583 # Protein_GI_number: 16131456 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 324 1 324 324 661 100.0 0 MTKLQLKYRELKIISVIAASENISHAATVLGIAQANVSKYLADFESKVGLKVFDRTTRQL MLTPFGTALLPYINDMLDRNEQLNNFIADYKHEKRGRVTIYAPTGIITYLSKHVIDKIKD IGDITLSLKTCNLERNAFYEGVEFPDDCDVLISYAPPKDESLVASFITQYAVTAYASQRY LEKHPISRPDELEHHSCILIDSMMIDDANIWRFNVAGSKEVRDYRVKGNYVCDNTQSALE LARNHLGIVFAPDKSVQSDLQDGTLVPCFQQPYEWWLDLVAIFRKREYQPWRVQYVLDEM LREIRHQLAQSQQLRPEQAAESED >gi|296494581|gb|ADTN01000157.1| GENE 6 4688 - 5428 750 246 aa, chain + ## HITS:1 COG:yiaT KEGG:ns NR:ns ## COG: yiaT COG3713 # Protein_GI_number: 16131455 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein V # Organism: Escherichia coli K12 # 1 246 1 246 246 474 100.0 1e-134 MLINRNIVALFALPFMASATASELSIGAGAAYNESPYRGYNENTKAIPLISYEGDTFYVR QTTLGFILSQSEKNELSLTASWMPLEFDPTDNDDYAMQQLDKRDSTAMAGVAWYHHERWG TVKASAAADVLDNSNGWVGELSVFHKMQIGRLSLTPALGVLYYDENFSDYYYGISESESR RSGLASYSAQDAWVPYVSLTAKYPIGEHVVLMASAGYSELPEEITDSPMIDRNESFTFVT GVSWRF Prediction of potential genes in microbial genomes Time: Sun May 15 23:40:28 2011 Seq name: gi|296494580|gb|ADTN01000158.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont297.7, whole genome shotgun sequence Length of sequence - 42209 bp Number of predicted genes - 41, with homology - 41 Number of transcription units - 26, operones - 8 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 8/0.000 - 
CDS 28 - 723 597 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 2 1 Op 2 9/0.000 - CDS 717 - 1577 978 ## COG3623 Putative L-xylulose-5-phosphate 3-epimerase 3 1 Op 3 3/0.700 - CDS 1570 - 2232 783 ## COG0269 3-hexulose-6-phosphate synthase and related proteins 4 1 Op 4 3/0.700 - CDS 2229 - 3725 1462 ## COG1070 Sugar (pentulose and hexulose) kinases 5 1 Op 5 9/0.000 - CDS 3729 - 4715 392 ## PROTEIN SUPPORTED gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 6 1 Op 6 11/0.000 - CDS 4728 - 6008 754 ## PROTEIN SUPPORTED gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 7 1 Op 7 2/0.900 - CDS 6008 - 6481 300 ## COG3090 TRAP-type C4-dicarboxylate transport system, small permease component - Prom 6537 - 6596 1.9 - Term 6551 - 6589 -0.7 8 2 Op 1 3/0.700 - CDS 6599 - 7066 445 ## COG2731 Beta-galactosidase, beta subunit 9 2 Op 2 . - CDS 7078 - 8076 1122 ## COG2055 Malate/L-lactate dehydrogenases - Prom 8137 - 8196 4.5 + Prom 8067 - 8126 6.0 10 3 Op 1 . + CDS 8277 - 9125 742 ## COG1414 Transcriptional regulator 11 3 Op 2 . + CDS 9227 - 9700 156 ## COG1142 Fe-S-cluster-containing hydrogenase components 2 + Term 9703 - 9748 8.2 12 4 Tu 1 . - CDS 9851 - 11104 1288 ## COG3977 Alanine-alpha-ketoisovalerate (or valine-pyruvate) aminotransferase - Prom 11135 - 11194 2.4 13 5 Tu 1 . - CDS 11282 - 13312 1606 ## COG0366 Glycosidases - Prom 13553 - 13612 5.6 14 6 Tu 1 . + CDS 13884 - 14456 436 ## COG2992 Uncharacterized FlgJ-related protein + Term 14469 - 14514 13.0 15 7 Tu 1 4/0.200 - CDS 14652 - 15830 991 ## COG1609 Transcriptional regulators - Term 15846 - 15891 8.0 16 8 Op 1 11/0.000 - CDS 15908 - 17089 1140 ## COG4214 ABC-type xylose transport system, permease component 17 8 Op 2 11/0.000 - CDS 17067 - 18608 231 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 - Term 18642 - 18682 5.1 18 8 Op 3 . 
- CDS 18686 - 19678 930 ## COG4213 ABC-type xylose transport system, periplasmic component - Prom 19897 - 19956 6.9 + Prom 19888 - 19947 4.7 19 9 Op 1 11/0.000 + CDS 20044 - 21366 1487 ## COG2115 Xylose isomerase + Term 21368 - 21405 -0.3 20 9 Op 2 3/0.700 + CDS 21438 - 22892 916 ## COG1070 Sugar (pentulose and hexulose) kinases + Term 22895 - 22943 1.1 + Prom 22925 - 22984 5.8 21 10 Op 1 1/1.000 + CDS 23061 - 23402 317 ## COG4682 Predicted membrane protein 22 10 Op 2 . + CDS 23448 - 23885 489 ## COG4682 Predicted membrane protein + Term 23895 - 23932 5.3 23 11 Tu 1 . - CDS 23927 - 24922 677 ## COG3274 Uncharacterized protein conserved in bacteria - Prom 25054 - 25113 3.8 + Prom 24987 - 25046 4.5 24 12 Op 1 . + CDS 25100 - 25396 284 ## JW3532 hypothetical protein + Prom 25400 - 25459 3.4 25 12 Op 2 19/0.000 + CDS 25491 - 26402 1191 ## COG0752 Glycyl-tRNA synthetase, alpha subunit 26 12 Op 3 . + CDS 26412 - 28481 2949 ## COG0751 Glycyl-tRNA synthetase, beta subunit + Term 28500 - 28530 3.0 - Term 28683 - 28717 0.2 27 13 Tu 1 . - CDS 28760 - 29125 160 ## COG2801 Transposase and inactivated derivatives + Prom 28941 - 29000 4.0 28 14 Tu 1 . + CDS 29066 - 29353 69 ## gi|10955443|ref|NP_065295.1| hypothetical protein R721_05 + Term 29362 - 29399 0.1 29 15 Tu 1 . - CDS 29608 - 30129 124 ## COG2963 Transposase and inactivated derivatives + Prom 30066 - 30125 3.5 30 16 Tu 1 . + CDS 30209 - 30361 75 ## ECP_3660 small toxic polypeptide + Term 30370 - 30401 4.1 - Term 30486 - 30521 4.9 31 17 Tu 1 4/0.200 - CDS 30548 - 30760 295 ## COG1278 Cold shock proteins - Prom 30921 - 30980 5.2 - Term 30983 - 31014 4.1 32 18 Tu 1 . - CDS 31041 - 31331 238 ## COG2944 Predicted transcriptional regulator + Prom 31548 - 31607 5.0 33 19 Tu 1 . + CDS 31765 - 32475 763 ## APECO1_2894 hypothetical protein + Term 32485 - 32515 0.2 - Term 32465 - 32510 6.3 34 20 Tu 1 . 
- CDS 32525 - 33499 862 ## COG1052 Lactate dehydrogenase and related dehydrogenases - Prom 33522 - 33581 3.3 - Term 33554 - 33593 7.6 35 21 Tu 1 . - CDS 33603 - 34262 213 ## PROTEIN SUPPORTED gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 - Prom 34330 - 34389 2.4 36 22 Tu 1 . + CDS 34595 - 36748 1874 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing - Term 36619 - 36654 6.1 37 23 Op 1 5/0.100 - CDS 36717 - 37157 445 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 38 23 Op 2 . - CDS 37154 - 37717 357 ## PROTEIN SUPPORTED gi|157164512|ref|YP_001467500.1| 50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12) - Prom 37871 - 37930 6.5 + Prom 37739 - 37798 3.1 39 24 Tu 1 . + CDS 37875 - 38573 580 ## COG5571 Autotransporter protein or domain, integral membrane beta-barrel involved in protein secretion + Prom 38705 - 38764 3.5 40 25 Tu 1 . + CDS 38802 - 40010 1615 ## COG0477 Permeases of the major facilitator superfamily + Term 40095 - 40124 0.4 + Prom 40101 - 40160 7.4 41 26 Tu 1 . 
+ CDS 40334 - 42025 1539 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase + Term 42080 - 42126 6.6 + TRNA 42117 - 42193 85.1 # Pro CGG 0 0 Predicted protein(s) >gi|296494580|gb|ADTN01000158.1| GENE 1 28 - 723 597 231 aa, chain - ## HITS:1 COG:sgbE KEGG:ns NR:ns ## COG: sgbE COG0235 # Protein_GI_number: 16131454 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli K12 # 1 231 1 231 231 464 100.0 1e-131 MLEQLKADVLAANLALPAHHLVTFTWGNVSAVDETRQWMVIKPSGVEYDVMTADDMVVVE IASGKVVEGSKKPSSDTPTHLALYRRYAEIGGIVHTHSRHATIWSQAGLDLPAWGTTHAD YFYGAIPCTRQMTAEEINGEYEYQTGEVIIETFEERGRSPAQIPAVLVHSHGPFAWGKNA ADAVHNAVVLEECAYMGLFSRQLAPQLPAMQNELLDKHYLRKHGANAYYGQ >gi|296494580|gb|ADTN01000158.1| GENE 2 717 - 1577 978 286 aa, chain - ## HITS:1 COG:sgbU KEGG:ns NR:ns ## COG: sgbU COG3623 # Protein_GI_number: 16131453 # Func_class: G Carbohydrate transport and metabolism # Function: Putative L-xylulose-5-phosphate 3-epimerase # Organism: Escherichia coli K12 # 1 286 12 297 297 576 99.0 1e-164 MRNHPLGIYEKALAKDLSWPERLVLAKSCGFDFVEMSVDETDERLSRLDWSAAQRTSLVA AMIETGVGIPSMCLSAHRRFPFGSRDEAVRERAREIMSKAIRLARDLGIRTIQLAGYDVY YEDHDEGTRQRFAEGLAWAVEQAAASQVMLAVEIMDTAFMNSISKWKKWDEMLASPWFTV YPDVGNLSAWGNDVPAELKLGIDRIAAIHLKDTQPVTGQSPGQFRDVPFGEGCVDFVGIF KTLHKLNYRGSFLIEMWTEKAKEPVLEIIQARRWIEARMQEAGFIC >gi|296494580|gb|ADTN01000158.1| GENE 3 1570 - 2232 783 220 aa, chain - ## HITS:1 COG:sgbH KEGG:ns NR:ns ## COG: sgbH COG0269 # Protein_GI_number: 16131452 # Func_class: G Carbohydrate transport and metabolism # Function: 3-hexulose-6-phosphate synthase and related proteins # Organism: Escherichia coli K12 # 1 220 1 220 220 406 100.0 1e-113 MSRPLLQLALDHSSLEAAQRDVTLLKDSVDIVEAGTILCLNEGLGAVKALREQCPDKIIV ADWKVADAGETLAQQAFGAGANWMTIICAAPLATVEKGHAMAQRCGGEIQIELFGNWTLD DARDWHRIGVRQAIYHRGRDAQASGQQWGEADLARMKALSDIGLELSITGGITPADLPLF KDIRVKAFIAGRALAGAANPAQVAGDFHAQIDAIWGGARA >gi|296494580|gb|ADTN01000158.1| GENE 
4 2229 - 3725 1462 498 aa, chain - ## HITS:1 COG:lyxK KEGG:ns NR:ns ## COG: lyxK COG1070 # Protein_GI_number: 16131451 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Escherichia coli K12 # 1 498 1 498 498 1042 100.0 0 MTQYWLGLDCGGSWLKAGLYDREGREAGVQRLPLCALSPQPGWAERDMAELWQCCMAVIR ALLTHSGVSGEQIVGIGISAQGKGLFLLDKNDKPLGNAILSSDRRAMEIVRRWQEDGIPE KLYPLTRQTLWTGHPVSLLRWLKEHEPERYAQIGCVMMTHDYLRWCLTGVKGCEESNISE SNLYNMSLGEYDPCLTDWLGIAEINHALPPVVGSAEICGEITAQTAALTGLKAGTPVVGG LFDVVSTALCAGIEDEFTLNAVMGTWAVTSGITRGLRDGEAHPYVYGRYVNDGEFIVHEA SPTSSGNLEWFTAQWGEISFDEINQAVASLPKAGGDLFFLPFLYGSNAGLEMTSGFYGMQ AIHTRAHLLQAIYEGVVFSHMTHLNRMRERFTDVHTLRVTGGPAHSDVWMQMLADVSGLR IELPQVEETGCFGAALAARVGTGVYHNFSEAQRDLRHPVRTLLPDMTAHQLYQKKYQRYQ HLIAALQGFHARIKEHTL >gi|296494580|gb|ADTN01000158.1| GENE 5 3729 - 4715 392 328 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 [Algoriphagus sp. PR1] # 4 328 3 325 328 155 28 4e-37 MKLRSVTYALFIAGLAAFSTSSLAAQSLRFGYETSQTDSQHIAAKKFNDLLQERTKGELK LKLFPDSTLGNAQAMISGVRGGTIDMEMSGSNNFAGLSPVMNLLDVPFLFRDTAHAHKTL DGKVGDDLKASLEGKGLKVLAYWENGWRDVTNSRAPVKTPADLKGLKIRTNNSPMNIAAF KVFGANPIPMPFAEVYTGLETRTIDAQEHPINVVWSAKFFEVQKFLSLTHHAYSPLLVVI NKAKFDGLSPEFQQALVSSAQEAGNYQRKLVAEDQQKIIDGMKEAGVEVITDLDRKAFSD ALGNQVRDMFVKDVPQGADLLKAVDEVQ >gi|296494580|gb|ADTN01000158.1| GENE 6 4728 - 6008 754 426 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 [Algoriphagus sp. 
PR1] # 1 419 3 423 431 295 36 4e-79 IMAVLIFLGCLLGGIAIGLPIAWALLLCGAALMFWLDMFDVQIMAQTLVNGADSFSLLAI PFFVLAGEIMNAGGLSKRIVDLPMKLVGHKPGGLGYVGVLAAMIMASLSGSAVADTAAVA ALLVPMMRSANYPVNRAAGLIASGGIIAPIIPPSIPFIIFGVSSGLSISKLFMAGIAPGM MMGATLMLTWWWQASRLNLPRQQKATMQEIWHSFVSGIWALFLPVIIIGGFRSGLFTPTE AGAVAAFYALFVATVIYREMTFATLWHVLIGAAKTTSVVMFLVASAQVSAWLITIAELPM MVSDLLQPLVDSPRLLFIVIMVAILIVGMVMDLTPTVLILTPVLMPLVKEAGIDPIYFGV MFIINCSIGLITPPIGNVLNVISGVAKLKFDDAVRGVFPYVLVLYSLLVVFVFIPDLIIL PLKWIN >gi|296494580|gb|ADTN01000158.1| GENE 7 6008 - 6481 300 157 aa, chain - ## HITS:1 COG:yiaM KEGG:ns NR:ns ## COG: yiaM COG3090 # Protein_GI_number: 16131448 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, small permease component # Organism: Escherichia coli K12 # 1 157 1 157 157 284 100.0 4e-77 MKKILEAILAINLAVLSCIVFINIILRYGFQTSILSVDELSRYLFVWLTFIGAIVAFMDN AHVQVTFLVEKLSPAWQRRVALVTHSLILFICGALAWGATLKTIQDWSDYSPILGLPIGL MYAACLPTSLVIAFFELRHLYQLITRSNSLTSPPQGA >gi|296494580|gb|ADTN01000158.1| GENE 8 6599 - 7066 445 155 aa, chain - ## HITS:1 COG:yiaL KEGG:ns NR:ns ## COG: yiaL COG2731 # Protein_GI_number: 16131447 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Escherichia coli K12 # 1 155 1 155 155 315 100.0 2e-86 MIFGHIAQPNPCRLPAAIEKALDFLRATDFNALEPGVVEIDGKNIYTQIIDLTTREAVVN RPEVHRRYIDIQFLAWGEEKIGIAIDTGNNKVSESLLEQRNIIFYHDSEHESFIEMIPGS YAIFFPQDVHRPGCIMQTASEIRKIVVKVALTALN >gi|296494580|gb|ADTN01000158.1| GENE 9 7078 - 8076 1122 332 aa, chain - ## HITS:1 COG:yiaK KEGG:ns NR:ns ## COG: yiaK COG2055 # Protein_GI_number: 16131446 # Func_class: C Energy production and conversion # Function: Malate/L-lactate dehydrogenases # Organism: Escherichia coli K12 # 1 332 1 332 332 674 100.0 0 MKVTFEQLKAAFNRVLISRGVDSETADACAEMFARTTESGVYSHGVNRFPRFIQQLENGD IIPDAQPKRITSLGAIEQWDAQRSIGNLTAKKMMDRAIELAADHGIGLVALRNANHWMRG GSYGWQAAEKGYIGICWTNSIAVMPPWGAKECRIGTNPLIVAIPSTPITMVDMSMSMFSY 
GMLEVNRLAGRQLPVDGGFDDEGNLTKEPGVIEKNRRILPMGYWKGSGMSIVLDMIATLL SDGASVAEVTQDNSDEYGISQIFIAIEVDKLIDGPTRDAKLQRIMDYVTSAERADENQAI RLPGHEFTTLLAENRRNGITVDDSVWAKIQAL >gi|296494580|gb|ADTN01000158.1| GENE 10 8277 - 9125 742 282 aa, chain + ## HITS:1 COG:yiaJ KEGG:ns NR:ns ## COG: yiaJ COG1414 # Protein_GI_number: 16131445 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 282 1 282 282 566 100.0 1e-161 MGKEVMGKKENEMAQEKERPAGSQSLFRGLMLIEILSNYPNGCPLAHLSELAGLNKSTVH RLLQGLQSCGYVTTAPAAGSYRLTTKFIAVGQKALSSLNIIHIAAPHLEALNIATGETIN FSSREDDHAILIYKLEPTTGMLRTRAYIGQHMPLYCSAMGKIYMAFGHPDYVKSYWESHQ HEIQPLTRNTITELPAMFDELAHIRESGAAMDREENELGVSCIAVPVFDIHGRVPYAVSI SLSTSRLKQVGEKNLLKPLRETAQAISNELGFTVRDDLGAIT >gi|296494580|gb|ADTN01000158.1| GENE 11 9227 - 9700 156 157 aa, chain + ## HITS:1 COG:yiaI KEGG:ns NR:ns ## COG: yiaI COG1142 # Protein_GI_number: 16131444 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 2 # Organism: Escherichia coli K12 # 1 157 1 157 157 258 100.0 4e-69 MNRFIIADATKCIGCRTCEVACAVSHHENQDCAALSPDEFISRIRVIKDHCWTTAVACHQ CEDAPCANVCPVDAISREHGHIFVEQTRCIGCKSCMLACPFGAMEVVSSRKKARAIKCDL CWHRETGPACVEACPTKALQCMDVEKVQRHRLRQQPV >gi|296494580|gb|ADTN01000158.1| GENE 12 9851 - 11104 1288 417 aa, chain - ## HITS:1 COG:avtA KEGG:ns NR:ns ## COG: avtA COG3977 # Protein_GI_number: 16131443 # Func_class: E Amino acid transport and metabolism # Function: Alanine-alpha-ketoisovalerate (or valine-pyruvate) aminotransferase # Organism: Escherichia coli K12 # 1 417 1 417 417 872 99.0 0 MTFSLFGDKFTRHSGITLLMEDLNDGLRTPGAIMLGGGNPAQIPEMQDYFQTLLTDMLES GKATDALCNYDGPQGKTELLTLLAGMLREKLGWDIEPQNIALTNGSQSAFFYLFNLFAGR RADGRVKKVLFPLAPEYIGYADAGLEEDLFVSARPNIELLPEGQFKYHVDFEHLHIGEET GMICVSRPTNPTGNVITDEELLKLDALANQHGIPLVIDNAYGVPFPGIIFSEARPLWNPN IVLCMSLSKLGLPGSRCGIIIANEKIITAITNMNGIISLAPGGIGPAMMCEMIKRNDLLR LSETVIKPFYYQRVQETIAIIRRYLPENRCLIHKPEGAIFLWLWFKDLPITTEQLYQRLK 
ARGVLMVPGHNFFPGLDKPWPHTHQCMRMNYVPEPEKIEAGVKILAEEIERAWAESH >gi|296494580|gb|ADTN01000158.1| GENE 13 11282 - 13312 1606 676 aa, chain - ## HITS:1 COG:malS KEGG:ns NR:ns ## COG: malS COG0366 # Protein_GI_number: 16131442 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Escherichia coli K12 # 1 676 1 676 676 1363 99.0 0 MKLAACFLTLLPGFAVAASWTSPGFPAFSEQGTGTFVSHAQLPKGTRPLTLNFDQQCWQP ADAIKLNQMLSLQPCSNTPPQWRLFRDGEYTLQIDTRSGTPTLMISIQNAAEPVASLVRE CPKWDGLPLTVDVSATFPEGAAVRDYYSQQIAIVKNGQIMLQPAATSNGLLLLERAETDT SAPFDWHNATVYFVLTDRFENGDPSNDQSYGRHKDGMAEIGTFHGGDLRGLTNKLDYLQQ LGVNALWISAPFEQIHGWVGGGTKGDFPHYAYHGYYTQDWTNLDANMGNEADLRTLVDSA HQRGIRILFDVVMNHTGYATLADMQEYQFGALYLSGDEVKKSLGERWSDWKPAAGQTWHS FNDYINFSDKTGWDKWWGKNWIRTDIGDYDNPGFDDLTMSLAFLPDIKTESTTASGLPVF YKNKMDTHAKAIDGYTPRDYLTHWLSQWVRDYGIDGFRVDTAKHVELPAWQQLKTEASAA LREWKKANPDKALDDKPFWMTGEAWGHGVMQSDYYRHGFDAMINFDYQEQAAKAVDCLAQ MDTTWQQMAEKLQGFNVLSSLSSHDTRLFREGGDKAAELLLLAPGAVQIFYGDESSRPFG PTGSDPLQGTRSDMNWQDVSGKSAASVAHWQKISQFRARHPAIGAGKQTTLLLKQGYGFV REHGDDKVLVVWAGQQ >gi|296494580|gb|ADTN01000158.1| GENE 14 13884 - 14456 436 190 aa, chain + ## HITS:1 COG:ECs4453 KEGG:ns NR:ns ## COG: ECs4453 COG2992 # Protein_GI_number: 15833707 # Func_class: R General function prediction only # Function: Uncharacterized FlgJ-related protein # Organism: Escherichia coli O157:H7 # 1 190 85 274 274 348 100.0 4e-96 MPYITSQNAAITAERNWLISKQYQGQWSPAERARLKDIAKRYKVKWSGNTRKIPWNTLLE RVDIIPTSMVATMAAAESGWGTSKLARNNNNLFGMKCMKGRCTNAPGKVKGYSQFSSVKE SVSAYVTNLNTHPAYSSFRKSRAQLRKADQEVTATAMIHKLKGYSTKGKSYNNYLFAMYQ DNQRLIAAHM >gi|296494580|gb|ADTN01000158.1| GENE 15 14652 - 15830 991 392 aa, chain - ## HITS:1 COG:xylR_1 KEGG:ns NR:ns ## COG: xylR_1 COG1609 # Protein_GI_number: 16131440 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 265 1 265 265 542 100.0 1e-154 MFTKRHRITLLFNANKAYDRQVVEGVGEYLQASQSEWDIFIEEDFRARIDKIKDWLGDGV IADFDDKQIEQALADVDVPIVGVGGSYHLAESYPPVHYIATDNYALVESAFLHLKEKGVN 
RFAFYGLPESSGKRWATEREYAFRQLVAEEKYRGVVYQGLETAPENWQHAQNRLADWLQT LPPQTGIIAVTDARARHILQVCEHLHIPVPEKLCVIGIDNEELTRYLSRVALSSVAQGAR QMGYQAAKLLHRLLDKEEMPLQRILVPPVRVIERRSTDYRSLTDPAVIQAMHYIRNHACK GIKVDQVLDAVGISRSNLEKRFKEEVGETIHAMIHAEKLEKARSLLISTTLSINEISQMC GYPSLQYFYSVFKKAYDTTPKEYRDVNSEVML >gi|296494580|gb|ADTN01000158.1| GENE 16 15908 - 17089 1140 393 aa, chain - ## HITS:1 COG:ECs4451 KEGG:ns NR:ns ## COG: ECs4451 COG4214 # Protein_GI_number: 15833705 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type xylose transport system, permease component # Organism: Escherichia coli O157:H7 # 1 393 1 393 393 596 100.0 1e-170 MSKSNPSEVKLAVPTSGGFSGLKSLNLQVFVMIAAIIAIMLFFTWTTDGAYLSARNVSNL LRQTAITGILAVGMVFVIISAEIDLSVGSMMGLLGGVAAICDVWLGWPLPLTIIVTLVLG LLLGAWNGWWVAYRKVPSFIVTLAGMLAFRGILIGITNGTTVSPTSAAMSQIGQSYLPAS TGFIIGALGLMAFVGWQWRGRMRRQALGLQSPASTAVVGRQALTAIIVLGAIWLLNDYRG VPTPVLLLTLLLLGGMFMATRTAFGRRIYAIGGNLEAARLSGINVERTKLAVFAINGLMV AIAGLILSSRLGAGSPSAGNIAELDAIAACVIGGTSLAGGVGSVAGAVMGAFIMASLDNG MSMMDVPTFWQYIVKGAILLLAVWMDSATKRRS >gi|296494580|gb|ADTN01000158.1| GENE 17 17067 - 18608 231 513 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 254 483 7 240 329 93 29 2e-18 MPYLLEMKNITKTFGSVKAIDNVCLRLNAGEIVSLCGENGSGKSTLMKVLCGIYPHGSYE GEIIFAGEEIQASHIRDTERKGIAIIHQELALVKELTVLENIFLGNEITHNGIMDYDLMT LRCQKLLAQVSLSISPDTRVGDLGLGQQQLVEIAKALNKQVRLLILDEPTASLTEQETSV LLDIIRDLQQHGIACIYISHKLNEVKAISDTICVIRDGQHIGTRDAAGMSEDDIITMMVG RELTALYPNEPHTTGDEILRIEHLTAWHPVNRHIKRVNDVSFSLKRGEILGIAGLVGAGR TETIQCLFGVWPGQWEGKIYIDGKQVDIRNCQQAIAQGIAMVPEDRKRDGIVPVMAVGKN ITLAALNKFTGGISQLDDAAEQKCILESIQQLKVKTSSPDLAIGRLSGGNQQKAILARCL LLNPRILILDEPTRGIDIGAKYEIYKLINQLVQQGIAVIVISSELPEVLGLSDRVLVMHE GKLKANLINHNLTQEQVMEAALRSEHHVEKQSV >gi|296494580|gb|ADTN01000158.1| GENE 18 18686 - 19678 930 330 aa, chain - ## HITS:1 COG:xylF KEGG:ns NR:ns ## COG: xylF COG4213 # Protein_GI_number: 16131437 # Func_class: G Carbohydrate transport and metabolism # 
Function: ABC-type xylose transport system, periplasmic component # Organism: Escherichia coli K12 # 1 330 1 330 330 579 100.0 1e-165 MKIKNILLTLCTSLLLTNVAAHAKEVKIGMAIDDLRLERWQKDRDIFVKKAESLGAKVFV QSANGNEETQMSQIENMINRGVDVLVIIPYNGQVLSNVVKEAKQEGIKVLAYDRMINDAD IDFYISFDNEKVGELQAKALVDIVPQGNYFLMGGSPVDNNAKLFRAGQMKVLKPYVDSGK IKVVGDQWVDGWLPENALKIMENALTANNNKIDAVVASNDATAGGAIQALSAQGLSGKVA ISGQDADLAGIKRIAAGTQTMTVYKPITLLANTAAEIAVELGNGQEPKADTTLNNGLKDV PSRLLTPIDVNKNNIKDTVIKDGFHKESEL >gi|296494580|gb|ADTN01000158.1| GENE 19 20044 - 21366 1487 440 aa, chain + ## HITS:1 COG:xylA KEGG:ns NR:ns ## COG: xylA COG2115 # Protein_GI_number: 16131436 # Func_class: G Carbohydrate transport and metabolism # Function: Xylose isomerase # Organism: Escherichia coli K12 # 1 440 1 440 440 905 100.0 0 MQAYFDQLDRVRYEGSKSSNPLAFRHYNPDELVLGKRMEEHLRFAACYWHTFCWNGADMF GVGAFNRPWQQPGEALALAKRKADVAFEFFHKLHVPFYCFHDVDVSPEGASLKEYINNFA QMVDVLAGKQEESGVKLLWGTANCFTNPRYGAGAATNPDPEVFSWAATQVVTAMEATHKL GGENYVLWGGREGYETLLNTDLRQEREQLGRFMQMVVEHKHKIGFQGTLLIEPKPQEPTK HQYDYDAATVYGFLKQFGLEKEIKLNIEANHATLAGHSFHHEIATAIALGLFGSVDANRG DAQLGWDTDQFPNSVEENALVMYEILKAGGFTTGGLNFDAKVRRQSTDKYDLFYGHIGAM DTMALALKIAARMIEDGELDKRIAQRYSGWNSELGQQILKGQMSLADLAKYAQEHHLSPV HQSGRQEQLENLVNHYLFDK >gi|296494580|gb|ADTN01000158.1| GENE 20 21438 - 22892 916 484 aa, chain + ## HITS:1 COG:xylB KEGG:ns NR:ns ## COG: xylB COG1070 # Protein_GI_number: 16131435 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Escherichia coli K12 # 1 484 1 484 484 923 100.0 0 MYIGIDLGTSGVKVILLNEQGEVVAAQTEKLTVSRPHPLWSEQDPEQWWQATDRAMKALG DQHSLQDVKALGIAGQMHGATLLDAQQRVLRPAILWNDGRCAQECTLLEARVPQSRVITG NLMMPGFTAPKLLWVQRHEPEIFRQIDKVLLPKDYLRLRMTGEFASDMSDAAGTMWLDVA KRDWSDVMLQACDLSRDQMPALYEGSEITGALLPEVAKAWGMATVPVVAGGGDNAAGAVG VGMVDANQAMLSLGTSGVYFAVSEGFLSKPESAVHSFCHALPQRWHLMSVMLSAASCLDW AAKLTGLSNVPALIAAAQQADESAEPVWFLPYLSGERTPHNNPQAKGVFFGLTHQHGPNE LARAVLEGVGYALADGMDVVHACGIKPQSVTLIGGGARSEYWRQMLADISGQQLDYRTGG 
DVGPALGAARLAQIAANPEKSLIELLPQLPLEQSHLPDAQRYAAYQPRRETFRRLYQQLL PLMA >gi|296494580|gb|ADTN01000158.1| GENE 21 23061 - 23402 317 113 aa, chain + ## HITS:1 COG:yiaB KEGG:ns NR:ns ## COG: yiaB COG4682 # Protein_GI_number: 16131434 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 113 5 117 117 193 100.0 6e-50 MKTSKTVAKLLFVVGALVYLVGLWISCPLLSGKGYFLGVLMTATFGNYAYLRAEKLGQLD DFFTHICQLVALITIGLLFIGVLNAPINTYEMVIYPIAFFVCLFGQMRLFRSA >gi|296494580|gb|ADTN01000158.1| GENE 22 23448 - 23885 489 145 aa, chain + ## HITS:1 COG:ECs4445 KEGG:ns NR:ns ## COG: ECs4445 COG4682 # Protein_GI_number: 15833699 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 145 2 146 146 268 100.0 2e-72 MDNKISTYSPAFSIVSWIALVGGIVTYLLGLWNAEMQLNEKGYYFAVLVLGLFSAASYQK TVRDKYEGIPTTSIYYMTCLTVFIISVALLMVGLWNATLLLSEKGFYGLAFFLSLFGAVA VQKNIRDAGINPPKETQVTQEEYSE >gi|296494580|gb|ADTN01000158.1| GENE 23 23927 - 24922 677 331 aa, chain - ## HITS:1 COG:yiaH KEGG:ns NR:ns ## COG: yiaH COG3274 # Protein_GI_number: 16131432 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 331 1 331 331 605 99.0 1e-173 MQPKIYWIDNLRGIACLMVVMIHTTTWYVTNAHSVSPVTWDIANVLNSASRVSVPLFFMI SGYLFFGERSAQPRHFLRIGLCLIFYSAIALLYIALFTSINMELALKNLLQKPVFYHLWF FFAIAVIYLVSPLIQVKNVGGKMLLVLMAVIGIIANPNTVPQKIDGFEWLPINLYINGDT FYYIQYGMLGRAIGMMDTQHKALSWVSAALFATGVFIISRGTLYELQWRGNFADTWYLYC GPMVFICAIALLTLVKNTLDTRTIRGLGLISRHSLGIYGFHALIIHALRTRGIELKNWPI LDIIWIFCATLAASLLLSMLVQRIDRNRLVS >gi|296494580|gb|ADTN01000158.1| GENE 24 25100 - 25396 284 98 aa, chain + ## HITS:1 COG:no KEGG:JW3532 NR:ns ## KEGG: JW3532 # Name: ysaB # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 98 2 99 99 194 100.0 6e-49 MMNAFFPAMALMVLVGCSIPSPVQKAQRVKVDPLRSLNMEALCKDQAAKRYNTGEQKIDV TAFEQFQGSYEMRGYTFRKEQFVCSFDADGHFLHLSMR >gi|296494580|gb|ADTN01000158.1| GENE 25 25491 - 26402 1191 303 aa, 
chain + ## HITS:1 COG:ECs4443 KEGG:ns NR:ns ## COG: ECs4443 COG0752 # Protein_GI_number: 15833697 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase, alpha subunit # Organism: Escherichia coli O157:H7 # 1 303 1 303 303 634 100.0 0 MQKFDTRTFQGLILTLQDYWARQGCTIVQPLDMEVGAGTSHPMTCLRALGPEPMAAAYVQ PSRRPTDGRYGENPNRLQHYYQFQVVIKPSPDNIQELYLGSLKELGMDPTIHDIRFVEDN WENPTLGAWGLGWEVWLNGMEVTQFTYFQQVGGLECKPVTGEITYGLERLAMYIQGVDSV YDLVWSDGPLGKTTYGDVFHQNEVEQSTYNFEYADVDFLFTCFEQYEKEAQQLLALENPL PLPAYERILKAAHSFNLLDARKAISVTERQRYILRIRTLTKAVAEAYYASREALGFPMCN KDK >gi|296494580|gb|ADTN01000158.1| GENE 26 26412 - 28481 2949 689 aa, chain + ## HITS:1 COG:glyS KEGG:ns NR:ns ## COG: glyS COG0751 # Protein_GI_number: 16131430 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase, beta subunit # Organism: Escherichia coli K12 # 1 689 1 689 689 1326 100.0 0 MSEKTFLVEIGTEELPPKALRSLAESFAANFTAELDNAGLAHGTVQWFAAPRRLALKVAN LAEAQPDREIEKRGPAIAQAFDAEGKPSKAAEGWARGCGITVDQAERLTTDKGEWLLYRA HVKGESTEALLPNMVATSLAKLPIPKLMRWGASDVHFVRPVHTVTLLLGDKVIPATILGI QSDRVIRGHRFMGEPEFTIDNADQYPEILRERGKVIADYEERKAKIKADAEEAARKIGGN ADLSESLLEEVASLVEWPVVLTAKFEEKFLAVPAEALVYTMKGDQKYFPVYANDGKLLPN FIFVANIESKDPQQIISGNEKVVRPRLADAEFFFNTDRKKRLEDNLPRLQTVLFQQQLGT LRDKTDRIQALAGWIAEQIGADVNHATRAGLLSKCDLMTNMVFEFTDTQGVMGMHYARHD GEAEDVAVALNEQYQPRFAGDDLPSNPVACALAIADKMDTLAGIFGIGQHPKGDKDPFAL RRAALGVLRIIVEKNLNLDLQTLTEEAVRLYGDKLTNANVVDDVIDFMLGRFRAWYQDEG YTVDTIQAVLARRPTRPADFDARMKAVSHFRTLDAAAALAAANKRVSNILAKSDEVLSDR VNASTLKEPEEIKLAMQVVVLRDKLEPYFTEGRYQDALVELAELREPVDAFFDKVMVMVD DKELRINRLTMLEKLRELFLRVADISLLQ >gi|296494580|gb|ADTN01000158.1| GENE 27 28760 - 29125 160 121 aa, chain - ## HITS:1 COG:yi5B KEGG:ns NR:ns ## COG: yi5B COG2801 # Protein_GI_number: 16131429 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 1 121 163 283 283 246 100.0 9e-66 
MNMVENMLDQAFKKLNPHEHPVLHSDQGWQYRMRRYQNILKEHGIKQSMSRKGNCLDNAV VECFFGTLKSECFYLDEFSNISELKDAVTEYIEYYNSRRISLKLKGLTPIEYRNQTYMPR V >gi|296494580|gb|ADTN01000158.1| GENE 28 29066 - 29353 69 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|10955443|ref|NP_065295.1| ## NR: gi|10955443|ref|NP_065295.1| hypothetical protein R721_05 [Escherichia coli] # 1 95 24 118 118 188 100.0 1e-46 MLVRIKLFECLIEHILNHVHHWSFRKAVRNNFVVEEIYYWRQIQLAPIDCKFSNIGNPLL VWPRSLEISLENIRGGLPYLSSVRAVSLDLNRCFK >gi|296494580|gb|ADTN01000158.1| GENE 29 29608 - 30129 124 173 aa, chain - ## HITS:1 COG:yi5A KEGG:ns NR:ns ## COG: yi5A COG2963 # Protein_GI_number: 16131428 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 1 173 1 173 173 321 100.0 3e-88 MSKPKYPFEKRLEVVNHYFTTDDGYRIISARFGVPRTQVRTWVALYEKHGEKGLIPKPKG VSADPELRIKVVKAVIEQHMSLNQAAAHFMLAGSGSVARWLKVYEERGEAGLRALKIGTK RNIAISVDPEKAASALELSKDRRIEDLERQVRFLETRLMYLKKLKALAHPTKK >gi|296494580|gb|ADTN01000158.1| GENE 30 30209 - 30361 75 50 aa, chain + ## HITS:1 COG:no KEGG:ECP_3660 NR:ns ## KEGG: ECP_3660 # Name: not_defined # Def: small toxic polypeptide # Organism: E.coli_536 # Pathway: not_defined # 1 50 21 70 70 90 100.0 1e-17 MPQKYRLLSLIVICFTLLFFTWMIRDSLCELHIKQESYELAAFLACKLKE >gi|296494580|gb|ADTN01000158.1| GENE 31 30548 - 30760 295 70 aa, chain - ## HITS:1 COG:ECs4441 KEGG:ns NR:ns ## COG: ECs4441 COG1278 # Protein_GI_number: 15833695 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Escherichia coli O157:H7 # 1 70 1 70 70 122 100.0 1e-28 MSGKMTGIVKWFNADKGFGFITPDDGSKDVFVHFSAIQNDGYKSLDEGQKVSFTIESGAK GPAAGNVTSL >gi|296494580|gb|ADTN01000158.1| GENE 32 31041 - 31331 238 96 aa, chain - ## HITS:1 COG:ECs4440 KEGG:ns NR:ns ## COG: ECs4440 COG2944 # Protein_GI_number: 15833694 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 96 1 96 96 164 100.0 5e-41 
MEYKDPMHELLSSLEQIVFKDETQKITLTHRTTSCTEIEQLRKGTGLKIDDFARVLGVSV AMVKEWESRRVKPSSAELKLMRLIQANPALSKQLME >gi|296494580|gb|ADTN01000158.1| GENE 33 31765 - 32475 763 236 aa, chain + ## HITS:1 COG:no KEGG:APECO1_2894 NR:ns ## KEGG: APECO1_2894 # Name: yiaF # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 236 41 276 276 432 100.0 1e-120 MATGKSCSRWFAPLAALLMVVSLSGCFDKEGDQRKAFIDFLQNTVMRSGERLPTLTADQK KQFGPFVSDYAILYGYSQQVNQAMDSGLRPVVDSVNAIRVPQDYVTQSGPLREMNGSLGV LAQQLQNAKLQADAAHSALKQSDDLKPVFDQAFTKVVTTPADALQPLIPAAQTFTQQLVM VGDYIAQQGTQVSFVANGIQFPTSQQASEYNKLIAPLPAQHQAFNQAWTTAVTATQ >gi|296494580|gb|ADTN01000158.1| GENE 34 32525 - 33499 862 324 aa, chain - ## HITS:1 COG:yiaE KEGG:ns NR:ns ## COG: yiaE COG1052 # Protein_GI_number: 16131424 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Escherichia coli K12 # 1 324 5 328 328 645 100.0 0 MKPSVILYKALPDDLLQRLQEHFTVHQVANLSPQTVEQNAAIFAEAEGLLGSNENVNAAL LEKMPKLRATSTISVGYDNFDVDALTARKILLMHTPTVLTETVADTLMALVLSTARRVVE VAERVKAGEWTASIGPDWYGTDVHHKTLGIVGMGRIGMALAQRAHFGFNMPILYNARRHH KEAEERFNARYCDLDTLLQESDFVCLILPLTDETHHLFGAEQFAKMKSSAIFINAGRGPV VDENALIAALQKGEIHAAGLDVFEQEPLSVDSPLLSMANVVAVPHIGSATHETRYGMAAC AVDNLIDALQGKVEKNCVNPHVAD >gi|296494580|gb|ADTN01000158.1| GENE 35 33603 - 34262 213 219 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163756109|ref|ZP_02163225.1| 30S ribosomal protein S1 [Kordia algicida OT-1] # 113 214 243 344 347 86 37 2e-16 MKKRVYLIAAVVSGALAVSGCTTNPYTGEREAGKSAIGAGLGSLVGAGIGALSSSKKDRG KGALIGAAAGAALGGGVGYYMDVQEAKLRDKMRGTGVSVTRSGDNIILNMPNNVTFDSSS ATLKPAGANTLTGVAMVLKEYPKTAVNVIGYTDSTGGHDLNMRLSQQRADSVASALITQG VDASRIRTQGLGPANPIASNSTAEGKAQNRRVEITLSPL >gi|296494580|gb|ADTN01000158.1| GENE 36 34595 - 36748 1874 717 aa, chain + ## HITS:1 COG:bisC KEGG:ns NR:ns ## COG: bisC COG0243 # Protein_GI_number: 16131422 # Func_class: C Energy production and conversion # Function: 
Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 717 23 739 739 1494 100.0 0 MVRKGFLASPENPQGIRGQDEFVRVSWDEALDLIHQQHKRIREAYGPASIFAGSYGWRSN GVLHKASTLLQRYMALAGGYTGHLGDYSTGAAQAIMPYVVGGSEVYQQQTSWPLVLEHSD VVVLWSANPLNTLKIAWNASDEQGLSYFSALRDSGKKLICIDPMRSETVDFFGDKMEWVA PHMGTDVALMLGIAHTLVENGWHDEAFLARCTTGYAVFASYLLGESDGIAKTAEWAAEIC GVGAAKIRELAAIFHQNTTMLMAGWGMQRQQFGEQKHWMIVTLAAMLGQIGTPGGGFGLS YHFANGGNPTRRSAVLSSMQGSLPGGCDAVDKIPVARIVEALENPGGAYQHNGMNRHFPD IRFIWWAGGANFTHHQDTNRLIRAWQKPELVVISECFWTAAAKHADIVLPATTSFERNDL TMTGDYSNQHLVPMKQVVPPRYEARNDFDVFAELSERWEKGGYARFTEGKSELQWLETFY NVARQRGASQQVELPPFAEFWQANQLIEMPENPDSERFIRFADFCRDPLAHPLKTASGKI EIFSQRIADYGYPDCPGHPMWLEPDEWQGNAEPEQLQVLSAHPAHRLHSQLNYSSLRELY AVANREPVTIHPDDAQERGIQDGDTVRLWNARGQILAGAVISEGIKPGVICIHEGAWPDL DLTADGICKNGAVNVLTKDLPSSRLGNGCAGNTALAWLEKYNGPELTLTAFEPPASS >gi|296494580|gb|ADTN01000158.1| GENE 37 36717 - 37157 445 146 aa, chain - ## HITS:1 COG:yiaC KEGG:ns NR:ns ## COG: yiaC COG0454 # Protein_GI_number: 16131421 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Escherichia coli K12 # 1 146 1 146 146 285 100.0 2e-77 MIREAQRSELPAILELWLESTTWGHPFIKANYWRDCIPLVRDAYLANAQNWVWEEDGKLL GFVSIMEGRFLAAMFVAPKAVRRGIGKALMQYVQQRHPHLMLEVYQKNQPAINFYQAQGF HIVDCAWQDETQLPTWIMSWPVVQTL >gi|296494580|gb|ADTN01000158.1| GENE 38 37154 - 37717 357 187 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164512|ref|YP_001467500.1| 50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12) [Campylobacter concisus 13826] # 1 181 1 181 185 142 38 4e-33 MERCGWVSQDPLYIAYHDNEWGVPETDSKKLFEMICLEGQQAGLSWITVLKKRENYRACF HQFDPVKVAAMQEEDVERLVQDAGIIRHRGKIQAIIGNARAYLQMEQNGEPFVDFVWSFV NHQPQVTQATTLSEIPTSTSASDALSKALKKRGFKFVGTTICYSFMQACGLVNDHVVGCC CYPGNKP >gi|296494580|gb|ADTN01000158.1| GENE 39 37875 - 38573 580 232 aa, chain + ## HITS:1 COG:yhjY KEGG:ns NR:ns ## COG: yhjY COG5571 # Protein_GI_number: 16131419 # 
Func_class: N Cell motility # Function: Autotransporter protein or domain, integral membrane beta-barrel involved in protein secretion # Organism: Escherichia coli K12 # 1 232 3 234 234 426 100.0 1e-119 MIIKKSGGRWQLSLLASVVISAFFLNTAYAWQQEYIVDTQPGLSTERYTWDSDHQPDYND ILSQRIQSSQRALGLEVNLAEETPVDVTSSMSMGWNFPLYEQVTTGPVAALHYDGTTTSM YNEFGDSTTTLTDPLWHASVSTLGWRVDSRLGDLRPWAQISYNQQFGENIWKAQSGLSRM TATNQNGNWLDVTVGADMLLNQNIAAYAALTQAENTTNNSDYLYTMGVSARF >gi|296494580|gb|ADTN01000158.1| GENE 40 38802 - 40010 1615 402 aa, chain + ## HITS:1 COG:yhjX KEGG:ns NR:ns ## COG: yhjX COG0477 # Protein_GI_number: 16131418 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 402 1 402 402 687 100.0 0 MTPSNYQRTRWLTLIGTIITQFALGSVYTWSLFNGALSAKLDAPVSQVAFSFGLLSLGLA ISSSVAGKLQERFGVKRVTMASGILLGLGFFLTAHSDNLMMLWLSAGVLVGLADGAGYLL TLSNCVKWFPERKGLISAFAIGSYGLGSLGFKFIDTQLLETVGLEKTFVIWGAIALLMIV FGATLMKDAPKQEVKTSNGVVEKDYTLAESMRKPQYWMLAVMFLTACMSGLYVIGVAKDI AQSLAHLDVVSAANAVTVISIANLSGRLVLGILSDKIARIRVITIGQVISLVGMAALLFA PLNAVTFFAAIACVAFNFGGTITVFPSLVSEFFGLNNLAKNYGVIYLGFGIGSICGSIIA SLFGGFYVTFYVIFALLILSLALSTTIRQPEQKMLREAHGSL >gi|296494580|gb|ADTN01000158.1| GENE 41 40334 - 42025 1539 563 aa, chain + ## HITS:1 COG:yhjW KEGG:ns NR:ns ## COG: yhjW COG2194 # Protein_GI_number: 16131417 # Func_class: R General function prediction only # Function: Predicted membrane-associated, metal-dependent hydrolase # Organism: Escherichia coli K12 # 1 563 12 574 574 1156 100.0 0 MRYIKSITQQKLSFLLAIYIGLFMNGAVFYRRFGSYAHDFTVWKGISAVVELAATVLVTF FLLRLLSLFGRRSWRILASLVVLFSAGASYYMTFLNVVIGYGIIASVMTTDIDLSKEVVG LNFILWLIAVSALPLILIWNNRCRYTLLRQLRTPGQRIRSLAVVVLAGIMVWAPIRLLDI QQKKVERATGVDLPSYGGVVANSYLPSNWLSALGLYAWARVDESSDNNSLLNPAKKFTYQ APQNVDDTYVVFIIGETTRWDHMGIFGYERNTTPKLAQEKNLAAFRGYSCDTATKLSLRC 
MFVRQGGAEDNPQRTLKEQNIFAVLKQLGFSSDLYAMQSEMWFYSNTMADNIAYREQIGA EPRNRGKPVDDMLLVDEMQQSLGRNPDGKHLIILHTKGSHFNYTQRYPRSFAQWKPECIG VDSGCTKAQMINSYDNSVTYVDHFISSVIDQVRDKKAIVFYAADHGESINEREHLHGTPR ELAPPEQFRVPMMVWMSDKYLENPANAQAFAQLKKEADMKVPRRHVELYDTIMGCLGYTS PDGGINENNNWCHIPQAKEAAAN Prediction of potential genes in microbial genomes Time: Sun May 15 23:40:54 2011 Seq name: gi|296494579|gb|ADTN01000159.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont297.8, whole genome shotgun sequence Length of sequence - 41040 bp Number of predicted genes - 32, with homology - 31 Number of transcription units - 16, operones - 6 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 421 - 480 3.2 1 1 Tu 1 . + CDS 646 - 2253 2165 ## COG0747 ABC-type dipeptide transport system, periplasmic component + Term 2474 - 2509 1.5 + Prom 2285 - 2344 3.0 2 2 Op 1 49/0.000 + CDS 2561 - 3580 355 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 3 2 Op 2 44/0.000 + CDS 3590 - 4492 1344 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 4 2 Op 3 44/0.000 + CDS 4503 - 5486 575 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 5 2 Op 4 . + CDS 5483 - 6487 802 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 + Term 6497 - 6527 1.7 6 3 Tu 1 . - CDS 6517 - 7788 1197 ## COG0814 Amino acid permeases - Prom 7889 - 7948 1.7 + Prom 7926 - 7985 2.3 7 4 Tu 1 . + CDS 8174 - 8371 83 ## G2583_4275 hypothetical protein + Term 8422 - 8460 7.0 - Term 8217 - 8276 1.2 8 5 Op 1 . - CDS 8458 - 10137 1356 ## B21_03339 hypothetical protein 9 5 Op 2 . - CDS 10134 - 10325 227 ## ECO103_4265 hypothetical protein 10 5 Op 3 . - CDS 10322 - 10636 182 ## EcSMS35_3845 hypothetical protein 11 5 Op 4 . - CDS 10669 - 10935 288 ## LF82_0214 uncharacterized protein YhjS 12 5 Op 5 . - CDS 10964 - 11890 383 ## B21_03337 hypothetical protein - Prom 11946 - 12005 4.8 + Prom 11924 - 11983 4.3 13 6 Op 1 . 
+ CDS 12163 - 12351 235 ## SSON_3856 hypothetical protein 14 6 Op 2 1/1.000 + CDS 12363 - 13115 696 ## COG1192 ATPases involved in chromosome partitioning 15 6 Op 3 . + CDS 13112 - 15730 2525 ## COG1215 Glycosyltransferases, probably involved in cell wall biogenesis 16 6 Op 4 . + CDS 15741 - 18080 2405 ## ECO26_4622 cellulose synthase regulator protein 17 6 Op 5 4/0.429 + CDS 18087 - 19193 1105 ## COG3405 Endoglucanase Y 18 6 Op 6 5/0.000 + CDS 19175 - 22648 3218 ## COG0457 FOG: TPR repeat 19 6 Op 7 3/0.857 + CDS 22769 - 24718 1966 ## COG2200 FOG: EAL domain + Term 24743 - 24796 4.3 + Prom 24753 - 24812 3.6 20 7 Op 1 5/0.000 + CDS 24901 - 26187 1546 ## COG1301 Na+/H+-dicarboxylate symporters + Term 26200 - 26241 6.1 + Prom 26371 - 26430 4.0 21 7 Op 2 . + CDS 26450 - 27904 1639 ## COG0612 Predicted Zn-dependent peptidases + Term 27905 - 27952 3.3 - Term 27893 - 27938 5.0 22 8 Tu 1 . - CDS 28000 - 28929 1039 ## COG0524 Sugar kinases, ribokinase family - Prom 29154 - 29213 2.6 + Prom 28964 - 29023 2.8 23 9 Op 1 4/0.429 + CDS 29161 - 29928 636 ## COG2200 FOG: EAL domain 24 9 Op 2 . + CDS 29998 - 32058 1692 ## COG2982 Uncharacterized protein involved in outer membrane biogenesis + Term 32062 - 32108 8.9 - Term 32053 - 32089 8.2 25 10 Tu 1 . - CDS 32240 - 33562 1658 ## COG0477 Permeases of the major facilitator superfamily - Prom 33710 - 33769 5.4 - Term 33922 - 33970 6.5 26 11 Op 1 3/0.857 - CDS 33973 - 34986 1028 ## COG1295 Predicted membrane protein 27 11 Op 2 . - CDS 35035 - 35916 741 ## COG0583 Transcriptional regulator - Prom 35990 - 36049 1.8 + Prom 36337 - 36396 4.3 28 12 Tu 1 . + CDS 36454 - 37056 436 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain + Term 37066 - 37115 8.4 - Term 37054 - 37103 8.4 29 13 Tu 1 . - CDS 37107 - 38756 1772 ## COG1626 Neutral trehalase - Prom 38867 - 38926 6.6 + Prom 39081 - 39140 4.1 30 14 Tu 1 . 
+ CDS 39160 - 40557 1170 ## COG1858 Cytochrome c peroxidase + Term 40565 - 40606 8.8 31 15 Tu 1 . - CDS 40689 - 40835 103 ## - Prom 40907 - 40966 5.5 + Prom 40596 - 40655 6.0 32 16 Tu 1 . + CDS 40768 - 41038 158 ## COG0076 Glutamate decarboxylase and related PLP-dependent proteins Predicted protein(s) >gi|296494579|gb|ADTN01000159.1| GENE 1 646 - 2253 2165 535 aa, chain + ## HITS:1 COG:dppA KEGG:ns NR:ns ## COG: dppA COG0747 # Protein_GI_number: 16131416 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Escherichia coli K12 # 1 535 1 535 535 1091 100.0 0 MRISLKKSGMLKLGLSLVAMTVAASVQAKTLVYCSEGSPEGFNPQLFTSGTTYDASSVPL YNRLVEFKIGTTEVIPGLAEKWEVSEDGKTYTFHLRKGVKWHDNKEFKPTRELNADDVVF SFDRQKNAQNPYHKVSGGSYEYFEGMGLPELISEVKKVDDNTVQFVLTRPEAPFLADLAM DFASILSKEYADAMMKAGTPEKLDLNPIGTGPFQLQQYQKDSRIRYKAFDGYWGTKPQID TLVFSITPDASVRYAKLQKNECQVMPYPNPADIARMKQDKSINLMEMPGLNVGYLSYNVQ KKPLDDVKVRQALTYAVNKDAIIKAVYQGAGVSAKNLIPPTMWGYNDDVQDYTYDPEKAK ALLKEAGLEKGFSIDLWAMPVQRPYNPNARRMAEMIQADWAKVGVQAKIVTYEWGEYLKR AKDGEHQTVMMGWTGDNGDPDNFFATLFSCAASEQGSNYSKWCYKPFEDLIQPARATDDH NKRVELYKQAQVVMHDQAPALIIAHSTVFEPVRKEVKGYVVDPLGKHHFENVSIE >gi|296494579|gb|ADTN01000159.1| GENE 2 2561 - 3580 355 339 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 66 333 43 310 320 141 29 7e-33 MLQFILRRLGLVIPTFIGITLLTFAFVHMIPGDPVMIMAGERGISPERHAQLLAELGLDK PMWQQYLHYIWGVMHGDLGISMKSRIPVWEEFVPRFQATLELGVCAMIFATAVGIPVGVL AAVKRGSIFDHTAVGLALTGYSMPIFWWGMMLIMLVSVHWNLTPVSGRVSDMVFLDDSNP LTGFMLIDTAIWGEDGNFIDAVAHMILPAIVLGTIPLAVIVRMTRSSMLEVLGEDYIRTA RAKGLTRMRVIIVHALRNAMLPVVTVIGLQVGTLLAGAILTETIFSWPGLGRWLIDALQR RDYPVVQGGVLLVATMIILVNLLVDLLYGVVNPRIRHKK >gi|296494579|gb|ADTN01000159.1| GENE 3 3590 - 4492 1344 300 aa, chain + ## HITS:1 COG:ECs4422 KEGG:ns NR:ns ## COG: ECs4422 COG1173 # Protein_GI_number: 15833676 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport 
and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Escherichia coli O157:H7 # 1 300 1 300 300 486 100.0 1e-137 MSQVTENKVISAPVPMTPLQEFWHYFKRNKGAVVGLVYVVIVLFIAIFANWIAPYNPAEQ FRDALLAPPAWQEGGSMAHLLGTDDVGRDVLSRLMYGARLSLLVGCLVVVLSLIMGVILG LIAGYFGGLVDNIIMRVVDIMLALPSLLLALVLVAIFGPSIGNAALALTFVALPHYVRLT RAAVLVEVNRDYVTASRVAGAGAMRQMFINIFPNCLAPLIVQASLGFSNAILDMAALGFL GMGAQPPTPEWGTMLSDVLQFAQSAWWVVTFPGLAILLTVLAFNLMGDGLRDALDPKLKQ >gi|296494579|gb|ADTN01000159.1| GENE 4 4503 - 5486 575 327 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 3 315 11 324 329 226 40 2e-58 MALLNVDKLSVHFGDESAPFRAVDRISYSVKQGEVVGIVGESGSGKSVSSLAIMGLIDYP GRVMAEKLEFNGQDLQRISEKERRNLMGAEVAMIFQDPMTSLNPCYTVGFQIMEAIKVHQ GGNKSTRRQRAIDLLNQVGIPDPASRLDVYPHQLSGGMSQRVMIAMAIACRPKLLIADEP TTALDVTIQAQIIELLLELQQKENMALVLITHDLALVAEAAHKIIVMYAGQVVETGDAHA IFHAPRHPYTQALLRALPEFAQDKERLASLPGVVPGKYDRPNGCLLNPRCPYATDRCRAE EPALNMLADGRQSKCHYPLDDAGRPTL >gi|296494579|gb|ADTN01000159.1| GENE 5 5483 - 6487 802 334 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 3 328 2 329 329 313 47 1e-84 MSTQEATLQQPLLQAIDLKKHYPVKKGMFAPERLVKALDGVSFNLERGKTLAVVGESGCG KSTLGRLLTMIEMPTGGELYYQGQDLLKHDPQAQKLRRQKIQIVFQNPYGSLNPRKKVGQ ILEEPLLINTSLSKEQRREKALSMMAKVGLKTEHYDRYPHMFSGGQRQRIAIARGLMLDP DVVIADEPVSALDVSVRAQVLNLMMDLQQELGLSYVFISHDLSVVEHIADEVMVMYLGRC VEKGTKDQIFNNPRHPYTQALLSATPRLNPDDRRERIKLSGELPSPLNPPPGCAFNARCR RRFGPCTQLQPQLKDYGGQLVACFAVDQDENPQR >gi|296494579|gb|ADTN01000159.1| GENE 6 6517 - 7788 1197 423 aa, chain - ## HITS:1 COG:yhjV KEGG:ns NR:ns ## COG: yhjV COG0814 # Protein_GI_number: 16131411 # Func_class: E Amino acid transport and metabolism # Function: Amino acid permeases # Organism: Escherichia coli K12 # 1 423 1 423 423 702 100.0 0 MQHNTLSKHNQKLPFTRYDFGWVLLCIGMAIGAGTVLMPVQIGLKGIWVFITAAIIAYPA 
TWVVQDIYLKTLSESDSCNDYTDIISHYLGKNWGIFLGVIYFLMIIHGIFIYSLSVVFDS ASYLKTFGLTDADLSQSLLYKVAIFAVLVAIASGGERLLFKISGPMVVVKVGIIVVFGFA MIPHWNFANITAFPQASVFFRDVLLTIPFCFFSAVFIQVLNPMNIAYRKREADKVLATRL ALRTHRISYITLIAVILFFAFSFTFSISHEEAVSAFEQNISALALAAQVIPGHIIHITST VLNIFAVLTAFFGIYLGFHEAIKGIILNLLSRIIDTKKINSRVLTLAICAFIVITLTIWV SFRVSVLVFFQLGSPLYGIVSCLIPFFLIYKVAQLEKLRGFKAWLILLYGILLCLSPLLK LIE >gi|296494579|gb|ADTN01000159.1| GENE 7 8174 - 8371 83 65 aa, chain + ## HITS:1 COG:no KEGG:G2583_4275 NR:ns ## KEGG: G2583_4275 # Name: ldrD # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 65 16 80 80 125 96.0 6e-28 MHLTTPRGLILTLDHSRIAAKRCHYNTGGYMTFAELGMAFWHDLAAPVIAGILASMIVNW LNKRK >gi|296494579|gb|ADTN01000159.1| GENE 8 8458 - 10137 1356 559 aa, chain - ## HITS:1 COG:no KEGG:B21_03339 NR:ns ## KEGG: B21_03339 # Name: bcsG # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 559 1 559 559 1088 100.0 0 MTQFTQNTAMPSSLWQYWRGLSGWNFYFLVKFGLLWAGYLNFHPLLNLVFAAFLLMPLPR YSLHRLRHWIALPIGFALFWHDTWLPGPESIMSQGSQVAGFSTDYLIDLVTRFINWQMIG AIFVLLVAWLFLSQWIRITVFVVAILLWLNVLTLAGPSFSLWPAGQPTTTVTTTGGNAAA TVAATGGAPVVGDMPAQTAPPTTANLNAWLNNFYNAEAKRKSTFPSSLPADAQPFELLVI NICSLSWSDIEAAGLMSHPLWSHFDIEFKNFNSATSYSGPAAIRLLRASCGQTSHTNLYQ PANNDCYLFDNLSKLGFTQHLMMGHNGQFGGFLKEVRENGGMQSELMDQTNLPVILLGFD GSPVYDDTAVLNRWLDVTEKDKNSRSATFYNTLPLHDGNHYPGVSKTADYKARAQKFFDE LDAFFTELEKSGRKVMVVVVPEHGGALKGDRMQVSGLRDIPSPSITDVPVGVKFFGMKAP HQGAPIVIEQPSSFLAISDLVVRVLDGKIFTEDNVDWKKLTSGLPQTAPVSENSNAVVIQ YQDKPYVRLNGGDWVPYPQ >gi|296494579|gb|ADTN01000159.1| GENE 9 10134 - 10325 227 63 aa, chain - ## HITS:1 COG:no KEGG:ECO103_4265 NR:ns ## KEGG: ECO103_4265 # Name: bcsF # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 63 1 63 63 104 100.0 9e-22 MMTISDIIEIIVVCALIFFPLGYLARHSLRRIRDTLRLFFAKPRYVKPAGTLRRTEKARA TKK >gi|296494579|gb|ADTN01000159.1| GENE 10 10322 - 10636 182 104 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_3845 NR:ns ## KEGG: EcSMS35_3845 # Name: not_defined # Def: 
hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 104 416 519 519 218 100.0 4e-56 MTIGGNRLVLFLSFCRINDLDTALNHIFPLPTGDIFSNRMVWFEDDQISAELVQMRLLAP EQWGMPLPLTQSSKPVINAEHDGRHWRRIPEPMRLLDDAVERSS >gi|296494579|gb|ADTN01000159.1| GENE 11 10669 - 10935 288 88 aa, chain - ## HITS:1 COG:no KEGG:LF82_0214 NR:ns ## KEGG: LF82_0214 # Name: bcsE # Def: uncharacterized protein YhjS # Organism: E.coli_LF82 # Pathway: not_defined # 1 87 320 406 523 181 97.0 7e-45 MVIPWNAPLSRCLTMIESVQGQKFSRYVPEDITTLLSMTQPLKLRGFQKWDVFCNAVNNM MNNPLLPAHGKGVLVALRPVPGIRVNKP >gi|296494579|gb|ADTN01000159.1| GENE 12 10964 - 11890 383 308 aa, chain - ## HITS:1 COG:no KEGG:B21_03337 NR:ns ## KEGG: B21_03337 # Name: bcsE # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 306 1 306 523 617 100.0 1e-175 MRDIVDPVFSIGISSLWDELRHMPAGGVWWFNVDRHEDAISLANQTIASQAETAHVAVIS MDSDPAKIFQLDDSQGPEKIKLFSMLNHEKGLYYLTRDLQCSIDPHNYLFILVCANNAWQ NIPAERLRSWLDKMNKWSRLNHCSLLVINPGNNNDKQFSLLLEEYRSLFGLASLRFQGDQ HLLDIAFWCNEKGVSARQQLSVQQQNGIWTLVQSEEAEIQPRSDEKRILSNVAVLEGAPP LSEHWQLFNNNEVLFNEARTAQAATVVFSLQQNAQIEPLARSIHTLRRQRGSAMKILVRE NTASLRHR >gi|296494579|gb|ADTN01000159.1| GENE 13 12163 - 12351 235 62 aa, chain + ## HITS:1 COG:no KEGG:SSON_3856 NR:ns ## KEGG: SSON_3856 # Name: yhjR # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 62 1 62 62 108 100.0 7e-23 MNNNEPDTLPDPAIGYIFQNDIVALKQAFSLPDIDYADISQREQLAAALKRWPLLAEFAQ QK >gi|296494579|gb|ADTN01000159.1| GENE 14 12363 - 13115 696 250 aa, chain + ## HITS:1 COG:yhjQ KEGG:ns NR:ns ## COG: yhjQ COG1192 # Protein_GI_number: 16131406 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Escherichia coli K12 # 9 250 1 242 242 477 99.0 1e-134 MAVLGLQGVRGGVGTTTITAALAWSLQMLGENVLVVDACPDNLLRLSFNVDFTHRQGWAR AMLDGQDWRDAGLRYTSQLDLLPFGQLSIEEQENPQHWQTRLSDICSGLQQLKASGRYQW ILIDLPRDASQITHQLLSLCDHSLAIVNVDANCHIRLHQQALPDGAHILINDFRIGSQVQ 
DDIYQLWLQSQRRLLPMLIHRDEAMAECLAAKQPVGEYRSDALAAEEILTLANWCLLNYS GLKTPVGSAS >gi|296494579|gb|ADTN01000159.1| GENE 15 13112 - 15730 2525 872 aa, chain + ## HITS:1 COG:yhjO KEGG:ns NR:ns ## COG: yhjO COG1215 # Protein_GI_number: 16131405 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases, probably involved in cell wall biogenesis # Organism: Escherichia coli K12 # 1 872 17 888 888 1811 100.0 0 MSILTRWLLIPPVNARLIGRYRDYRRHGASAFSATLGCFWMILAWIFIPLEHPRWQRIRA EHKNLYPHINASRPRPLDPVRYLIQTCWLLIGASRKETPKPRRRAFSGLQNIRGRYHQWM NELPERVSHKTQHLDEKKELGHLSAGARRLILGIIVTFSLILALICVTQPFNPLAQFIFL MLLWGVALIVRRMPGRFSALMLIVLSLTVSCRYIWWRYTSTLNWDDPVSLVCGLILLFAE TYAWIVLVLGYFQVVWPLNRQPVPLPKDMSLWPSVDIFVPTYNEDLNVVKNTIYASLGID WPKDKLNIWILDDGGREEFRQFAQNVGVKYIARTTHEHAKAGNINNALKYAKGEFVSIFD CDHVPTRSFLQMTMGWFLKEKQLAMMQTPHHFFSPDPFERNLGRFRKTPNEGTLFYGLVQ DGNDMWDATFFCGSCAVIRRKPLDEIGGIAVETVTEDAHTSLRLHRRGYTSAYMRIPQAA GLATESLSAHIGQRIRWARGMVQIFRLDNPLTGKGLKFAQRLCYVNAMFHFLSGIPRLIF LTAPLAFLLLHAYIIYAPALMIALFVLPHMIHASLTNSKIQGKYRHSFWSEIYETVLAWY IAPPTLVALINPHKGKFNVTAKGGLVEEEYVDWVISRPYIFLVLLNLVGVAVGIWRYFYG PPTEMLTVVVSMVWVFYNLIVLGGAVAVSVESKQVRRSHRVEMTMPAAIAREDGHLFSCT VQDFSDGGLGIKINGQAQILEGQKVNLLLKRGQQEYVFPTQVARVMGNEVGLKLMPLTTQ QHIDFVQCTFARADTWALWQDSYPEDKPLESLLDILKLGFRGYRHLAEFAPSSVKGIFRV LTSLVSWVVSFIPRRPERSETAQPSDQALAQQ >gi|296494579|gb|ADTN01000159.1| GENE 16 15741 - 18080 2405 779 aa, chain + ## HITS:1 COG:no KEGG:ECO26_4622 NR:ns ## KEGG: ECO26_4622 # Name: bcsB # Def: cellulose synthase regulator protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 779 1 779 779 1516 100.0 0 MKRKLFWICAVAMGMSAFPSFMTQATPATQPLINAEPAVAAQTEQNPQVGQVMPGVQGAD APVVAQNGPSRDVKLTFAQIAPPPGSMVLRGINPNGSIEFGMRSDEVVTKAMLNLEYTPS PSLLPVQSQLKVYLNDELMGVLPVTKEQLGKKTLAQMPINPLFITDFNRVRLEFVGHYQD VCENPASTTLWLDVGRSSGLDLTYQTLNVKNDLSHFPVPFFDPRDNRTNTLPMVFAGAPD VGLQQASAIVASWFGSRSGWRGQNFPVLYNQLPDRNAIVFATNDKRPDFLRDHPAVKAPV IEMINHPQNPYVKLLVVFGRDDKDLLQAAKGIAQGNILFRGESVVVNEVKPLLPRKPYDA 
PNWVRTDRPVTFGELKTYEEQLQSSGLEPAAINVSLNLPPDLYLMRSTGIDMDINYRYTM PPVKDSSRMDISLNNQFLQSFNLSSKQEANRLLLRIPVLQGLLDGKTDVSIPALKLGATN QLRFDFEYMNPMPGGSVDNCITFQPVQNHVVIGDDSTIDFSKYYHFIPMPDLRAFANAGF PFSRMADLSQTITVMPKAPNEAQMETLLNTVGFIGAQTGFPAINLTVTDDGSTIQGKDAD IMIIGGIPDKLKDDKQIDLLVQATESWVKTPMRQTPFPGIVPDESDRAAETRSTLTSSGA MAAVIGFQSPYNDQRSVIALLADSPRGYEMLNDAVNDSGKRATMFGSVAVIRESGINSLR VGDVYYVGHLPWFERVWYALANHPILLAVLAAISVILLAWVLWRLLRIISRRRLNPDNE >gi|296494579|gb|ADTN01000159.1| GENE 17 18087 - 19193 1105 368 aa, chain + ## HITS:1 COG:yhjM KEGG:ns NR:ns ## COG: yhjM COG3405 # Protein_GI_number: 16131403 # Func_class: G Carbohydrate transport and metabolism # Function: Endoglucanase Y # Organism: Escherichia coli K12 # 1 368 1 368 368 724 100.0 0 MNVLRSGIVTMLLLAAFSVQAACTWPAWEQFKKDYISQEGRVIDPSDARKITTSEGQSYG MFSALAANDRAAFDNILDWTQNNLAQGSLKERLPAWLWGKKENSKWEVLDSNSASDGDVW MAWSLLEAGRLWKEQRYTDIGSALLKRIAREEVVTVPGLGSMLLPGKVGFAEDNSWRFNP SYLPPTLAQYFTRFGAPWTTLRETNQRLLLETAPKGFSPDWVRYEKDKGWQLKAEKTLIS SYDAIRVYMWVGMMPDSDPQKARMLNRFKPMATFTEKNGYPPEKVDVATGKAQGKGPVGF SAAMLPFLQNRDAQAVQRQRVADNFPGSDAYYNYVLTLFGQGWDQHRFRFSTKGELLPDW GQECANSH >gi|296494579|gb|ADTN01000159.1| GENE 18 19175 - 22648 3218 1157 aa, chain + ## HITS:1 COG:yhjL KEGG:ns NR:ns ## COG: yhjL COG0457 # Protein_GI_number: 16131402 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Escherichia coli K12 # 11 1157 20 1166 1166 2122 100.0 0 MRKFTLNIFTLSLGLAVMPMVEAAPTAQQQLLEQVRLGEATHREDLVQQSLYRLELIDPN NPDVVAARFRSLLRQGDIDGAQKQLDRLSQLAPSSNAYKSSRTTMLLSTPDGRQALQQAR LQATTGHAEEAVASYNKLFNGAPPEGDIAVEYWSTVAKIPARRGEAINQLKRINADAPGN TGLQNNLALLLFSSDRRDEGFAVLEQMAKSNAGREGASKIWYGQIKDMPVSDASVSALKK YLSIFSDGDSVAAAQSQLAEQQKQLADPAFRARAQGLAAVDSGMAGKAIPELQQAVRANP KDSEALGALGQAYSQKGDRANAVANLEKALALDPHSSNNDKWNSLLKVNRYWLAIQQGDA ALKANNPDRAERLFQQARNVDNTDSYAVLGLGDVAMARKDYPAAERYYQQTLRMDSGNTN AVRGLANIYRQQSPEKAEAFIASLSASQRRSIDDIERSLQNDRLAQQAEALENQGKWAQA AALQRQRLALDPGSVWITYRLSQDLWQAGQRSQADTLMRNLAQQKSNDPEQVYAYGLYLS 
GHDQDRAALAHINSLPRAQWNSNIQELVNRLQSDQVLETANRLRESGKEAEAEAMLRQQP PSTRIDLTLADWAQQRRDYTAARAAYQNVLTREPANADAILGLTEVDIAAGDKAAARSQL AKLPATDNASLNTQRRVALAQAQLGDTAAAQRTFNKLIPQAKSQPPSMESAMVLRDGAKF EAQAGDPTQALETYKDAMVASGVTTTRPQDNDTFTRLTRNDEKDDWLKRGVRSDAADLYR QQDLNVTLEHDYWGSSGTGGYSDLKAHTTMLQVDAPYSDGRMFFRSDFVNMNVGSFSTNA DGKWDDNWGTCTLQDCSGNRSQSDSGASVAVGWRNDVWSWDIGTTPMGFNVVDVVGGISY SDDIGPLGYTVNAHRRPISSSLLAFGGQKDSPSNTGKKWGGVRADGVGLSLSYDKGEANG VWASLSGDQLTGKNVEDNWRVRWMTGYYYKVINQNNRRVTIGLNNMIWHYDKDLSGYSLG QGGYYSPQEYLSFAIPVMWRERTENWSWELGASGSWSHSRTKTMPRYPLMNLIPTDWQEE AARQSNDGGSSQGFGYTARALLERRVTSNWFVGTAIDIQQAKDYAPSHFLLYVRYSAAGW QGDMDLPPQPLIPYADW >gi|296494579|gb|ADTN01000159.1| GENE 19 22769 - 24718 1966 649 aa, chain + ## HITS:1 COG:ECs4409_3 KEGG:ns NR:ns ## COG: ECs4409_3 COG2200 # Protein_GI_number: 15833663 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli O157:H7 # 387 649 1 263 263 507 99.0 1e-143 MVAAVVLVFVFIFCTVLLFHLVQQNRYNTATQLESIARSVREPLSSAILKGDIPEAEAIL ASIKPAGVVSRADVVLPNQFQALRKSFIPERPVPVMVTRLFELPVQISLGVYSLERPANP QPIAYLVLQADSFRMYKFVMSTLSTLVTIYLLLSLILTVAISWCINRLILHPLRNIAREL NAIPAKELVGHQLALPRLHQDDEIGMLVRSYNLNQQLLQRHYEEQNENAMRFPVSDLPNK ALLMEMLEQVVARKQTTALMIITCETLRDTAGVLKEAQREILLLTLVEKLKSVLSPRMIL AQISGYDFAVIANGVQEPWHAITLGQQVLTIMSERLPIERIQLRPHCSIGVAMFYGDLTA EQLYSRAISAAFTARHKGKNQIQFFDPQQMEAAQKRLTEESDILNALENHQFAIWLQPQV EMTSGKLVSAEVLLRIQQPDGSWDLPDGLIDRIECCGLMVTVGHWVLEESCRLLAAWQER GIMLPLSVNLSALQLMHPNMVADMLELLTRYRIQPGTLILEVTESRRIDDPHAAVAILRP LRNAGVRVALDDFGMGYAGLRQLQHMKSLPIDVLKIDKMFVEGLPGDSSMIAAIIMLAQS LNLQMIAEGVETEAQRDWLAKAGVGIAQGFLFARPLPIEIFEESYLEEK >gi|296494579|gb|ADTN01000159.1| GENE 20 24901 - 26187 1546 428 aa, chain + ## HITS:1 COG:dctA KEGG:ns NR:ns ## COG: dctA COG1301 # Protein_GI_number: 16131400 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Escherichia coli K12 # 1 428 1 428 428 733 100.0 0 MKTSLFKSLYFQVLTAIAIGILLGHFYPEIGEQMKPLGDGFVKLIKMIIAPVIFCTVVTG 
IAGMESMKAVGRTGAVALLYFEIVSTIALIIGLIIVNVVQPGAGMNVDPATLDAKAVAVY ADQAKDQGIVAFIMDVIPASVIGAFASGNILQVLLFAVLFGFALHRLGSKGQLIFNVIES FSQVIFGIINMIMRLAPIGAFGAMAFTIGKYGVGTLVQLGQLIICFYITCILFVVLVLGS IAKATGFSIFKFIRYIREELLIVLGTSSSESALPRMLDKMEKLGCRKSVVGLVIPTGYSF NLDGTSIYLTMAAVFIAQATNSQMDIVHQITLLIVLLLSSKGAAGVTGSGFIVLAATLSA VGHLPVAGLALILGIDRFMSEARALTNLVGNGVATIVVAKWVKELDHKKLDDVLNNRAPD GKTHELSS >gi|296494579|gb|ADTN01000159.1| GENE 21 26450 - 27904 1639 484 aa, chain + ## HITS:1 COG:yhjJ KEGG:ns NR:ns ## COG: yhjJ COG0612 # Protein_GI_number: 16131399 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Escherichia coli K12 # 1 484 15 498 498 899 100.0 0 MMATAGYVQADALQPDPAWQQGTLSNGLQWQVLTTPQRPSDRVEIRLLVNTGSLAESTQQ SGYSHAIPRIALTQSGGLDAAQARSLWQQGIDPKRPMPPVIVSYDTTLFNLSLPNNRNDL LKEALSYLANATGKLTITPETINHALQSQDMVATWPADTKEGWWRYRLKGSTLLGHDPAD PLKQPVEAEKIKDFYQKWYTPDAMTLLVVGNVDARSVVDQINKTFGELKGKRETPAPVPT LSPLRAEAVSIMTDAVRQDRLSIMWDTPWQPIRESAALLRYWRADLAREALFWHVQQALS ASNSKDIGLGFDCRVLYLRAQCAINIESPNDKLNSNLNLVARELAKVRDKGLPEEEFNAL VAQKKLELQKLFAAYARADTDILMGQRMRSLQNQVVDIAPEQYQKLRQDFLNSLTVEMLN QDLRQQLSNDMALILLQPKGEPEFNMKALQAVWDQIMAPSTAAATTSVATDDVHPEVTDI PPAQ >gi|296494579|gb|ADTN01000159.1| GENE 22 28000 - 28929 1039 309 aa, chain - ## HITS:1 COG:kdgK KEGG:ns NR:ns ## COG: kdgK COG0524 # Protein_GI_number: 16131398 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli K12 # 1 309 74 382 382 626 100.0 1e-179 MSKKIAVIGECMIELSEKGADVKRGFGGDTLNTSVYIARQVDPAALTVHYVTALGTDSFS QQMLDAWHGENVDTSLTQRMENRLPGLYYIETDSTGERTFYYWRNEAAAKFWLESEQSAA ICEELANFDYLYLSGISLAILSPTSREKLLSLLRECRANGGKVIFDNNYRPRLWASKEET QQVYQQMLECTDIAFLTLDDEDALWGQQPVEDVIARTHNAGVKEVVVKRGADSCLVSIAG EGLVDVPAVKLPKEKVIDTTAAGDSFSAGYLAVRLTGGSAEDAAKRGHLTASTVIQYRGA IIPREAMPA >gi|296494579|gb|ADTN01000159.1| GENE 23 29161 - 29928 636 255 aa, chain + ## HITS:1 COG:yhjH KEGG:ns NR:ns ## COG: yhjH COG2200 # Protein_GI_number: 16131397 # 
Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 1 255 2 256 256 508 100.0 1e-144 MIRQVIQRISNPEASIESLQERRFWLQCERAYTWQPIYQTCGRLMAVELLTVVTHPLNPS QRLPPDRYFTEITVSHRMEVVKEQIDLLAQKADFFIEHGLLASVNIDGPTLIALRQQPKI LRQIERLPWLRFELVEHIRLPKDSTFASMCEFGPLWLDDFGTGMANFSALSEVRYDYIKI ARELFVMLRQSPEGRTLFSQLLHLMNRYCRGVIVEGVETPEEWRDVQNSPAFAAQGWFLS RPAPIETLNTAVLAL >gi|296494579|gb|ADTN01000159.1| GENE 24 29998 - 32058 1692 686 aa, chain + ## HITS:1 COG:yhjG KEGG:ns NR:ns ## COG: yhjG COG2982 # Protein_GI_number: 16131396 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein involved in outer membrane biogenesis # Organism: Escherichia coli K12 # 1 686 6 691 691 1327 100.0 0 MSKAGKITAAISGAFLLLIVVAIILIATFDWNRLKPTINQKVSAELNRPFAIRGDLGVVW ERQKQETGWRSWVPWPHVHAEDIILGNPPDIPEVTMVHLPRVEATLAPLALLTKTVWLPW IKLEKPDARLIRLSEKNNNWTFNLANDDNKDANAKPSAWSFRLDNILFDQGRIAIDDKVS KADLEIFVDPLGKPLPFSEVTGSKGKADKEKVGDYVFGLKAQGRYNGEPLTGTGKIGGML ALRGEGTPFPVQADFRSGNTRVAFDGVVNDPMKMGGVDLRLKFSGDSLGDLYELTGVLLP DTPPFETDGRLVAKIDTEKSSVFDYRGFNGRIGDSDIHGSLVYTTGKPRPKLEGDVESRQ LRLADLGPLIGVDSGKGAEKSKRSEQKKGEKSVQPAGKVLPYDRFETDKWDVMDADVRFK GRRIEHGSSLPISDLSTHIILKNADLRLQPLKFGMAGGSIAANIHLEGDKKPMQGRADIQ ARRLKLKELMPDVELMQKTLGEMNGDAELRGSGNSVAALLGNSNGNLKLLMNDGLVSRNL MEIVGLNVGNYIVGAIFGDDEVRVNCAAANLNIANGVARPQIFAFDTENALINVTGTASF ASEQLDLTIDPESKGIRIITLRSPLYVRGTFKNPQAGVKAGPLIARGAVAAALATLVTPA AALLALISPSEGEANQCRTILSQMKK >gi|296494579|gb|ADTN01000159.1| GENE 25 32240 - 33562 1658 440 aa, chain - ## HITS:1 COG:yhjE KEGG:ns NR:ns ## COG: yhjE COG0477 # Protein_GI_number: 16131395 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 440 1 440 440 683 100.0 0 MQATATTLDHEQEYTPINSRNKVLVASLIGTAIEFFDFYIYATAAVIVFPHIFFPQGDPT 
AATLQSLATFAIAFVARPIGSAVFGHFGDRVGRKATLVASLLTMGISTVVIGLLPGYATI GIFAPLLLALARFGQGLGLGGEWGGAALLATENAPPRKRALYGSFPQLGAPIGFFFANGT FLLLSWLLTDEQFMSWGWRVPFIFSAVLVIIGLYVRVSLHESPVFEKVAKAKKQVKIPLG TLLTKHVRVTVLGTFIMLATYTLFYIMTVYSMTFSTAAAPVGLGLPRNEVLWMLMMAVIG FGVMVPVAGLLADAFGRRKSMVIITTLIILFALFAFNPLLGSGNPILVFAFLLLGLSLMG LTFGPMGALLPELFPTEVRYTGASFSYNVASILGASVAPYIAAWLQTNYGLGAVGLYLAA MAGLTLIALLLTHETRHQSL >gi|296494579|gb|ADTN01000159.1| GENE 26 33973 - 34986 1028 337 aa, chain - ## HITS:1 COG:yhjD KEGG:ns NR:ns ## COG: yhjD COG1295 # Protein_GI_number: 16131394 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 337 1 337 337 661 100.0 0 MTQENEIKRPIQDLEHEPIKPLDNSEKGSKVSQALETVTTTAEKVQRQPVIAHLIRATER FNDRLGNQFGAAITYFSFLSMIPILMVSFAAGGFVLASHPMLLQDIFDKILQNISDPTLA ATLKNTINTAVQQRTTVGLVGLAVALYSGINWMGNLREAIRAQSRDVWERSPQDQEKFWV KYLRDFISLIGLLIALIVTLSITSVAGSAQQMIISALHLNSIEWLKPTWRLIGLAISIFA NYLLFFWIFWRLPRHRPRKKALIRGTFLAAIGFEVIKIVMTYTLPSLMKSPSGAAFGSVL GLMAFFYFFARLTLFCAAWIATAEYKDDPRMPGKTQP >gi|296494579|gb|ADTN01000159.1| GENE 27 35035 - 35916 741 293 aa, chain - ## HITS:1 COG:yhjC KEGG:ns NR:ns ## COG: yhjC COG0583 # Protein_GI_number: 16131393 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 293 31 323 323 588 99.0 1e-168 MQLFIKVAELESFSRAADFFALPKGSVSRQIQALEHQLGTQLLQRTTRRVKLTPEGMTYY QRAKDVLSNLSELDGLFQQDATSISGKLRIDIPPGIAKSLLLPRLSEFLYLHPGIELELS SHDRPVDILHDGFDCVIRTGALPEDGVIARPLGKLTMVNCASPHYLTRFGYPQSPDDLTS HAIVRYTPHLGVHPLGFEVASVNGVQWFKSGGMLTVNSSKNYLTAGLAGLGIIQIPRIAV REALRAGRLIEVLPGYRAEPLSLSLVYPQRRELSRRVNLFMQWLAGVMKEYLD >gi|296494579|gb|ADTN01000159.1| GENE 28 36454 - 37056 436 200 aa, chain + ## HITS:1 COG:yhjB KEGG:ns NR:ns ## COG: yhjB COG2197 # Protein_GI_number: 16131392 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli K12 # 1 200 1 200 200 394 
100.0 1e-110 MQIVMFDRQSIFIHGMKISLQQRIPGVSIQGASQADELWQKLESYPEALVMLDGDQDGEF CYWLLQKTVVQFPEVKVLITATDCNKRWLQEVIHFNVLAIVPRDSTVETFALAVNSAAMG MMFLPGDWRTTPEKDIKDLKSLSARQREILTMLAAGESNKEIGRALNISTGTVKAHLESL YRRLEVKNRTQAAMMLNISS >gi|296494579|gb|ADTN01000159.1| GENE 29 37107 - 38756 1772 549 aa, chain - ## HITS:1 COG:ECs4399 KEGG:ns NR:ns ## COG: ECs4399 COG1626 # Protein_GI_number: 15833653 # Func_class: G Carbohydrate transport and metabolism # Function: Neutral trehalase # Organism: Escherichia coli O157:H7 # 1 549 1 549 549 1113 100.0 0 MLNQKIQNPNPDELMIEVDLCYELDPYELKLDEMIEAEPEPEMIEGLPASDALTPADRYL ELFEHVQSAKIFPDSKTFPDCAPKMDPLDILIRYRKVRRHRDFDLRKFVENHFWLPEVYS SEYVSDPQNSLKEHIDQLWPVLTREPQDHIPWSSLLALPQSYIVPGGRFSETYYWDSYFT MLGLAESGREDLLKCMADNFAWMIENYGHIPNGNRTYYLSRSQPPVFALMVELFEEDGVR GARRYLDHLKMEYAFWMDGAESLIPNQAYRHVVRMPDGSLLNRYWDDRDTPRDESWLEDV ETAKHSGRPPNEVYRDLRAGAASGWDYSSRWLRDTGRLASIRTTQFIPIDLNAFLFKLES AIANISALKGEKETEALFRQKASARRDAVNRYLWDDENGIYRDYDWRREQLALFSAAAIV PLYVGMANHEQADRLANAVRSRLLTPGGILASEYETGEQWDKPNGWAPLQWMAIQGFKMY GDDLLGDEIARSWLKTVNQFYLEQHKLIEKYHIADGVPREGGGGEYPLQDGFGWTNGVVR RLIGLYGEP >gi|296494579|gb|ADTN01000159.1| GENE 30 39160 - 40557 1170 465 aa, chain + ## HITS:1 COG:yhjA KEGG:ns NR:ns ## COG: yhjA COG1858 # Protein_GI_number: 16131390 # Func_class: P Inorganic ion transport and metabolism # Function: Cytochrome c peroxidase # Organism: Escherichia coli K12 # 1 465 1 465 465 970 100.0 0 MKMVSRITAIGLAGVAICYLGLSGYVWYHDNKRSKQADVQASAVSENNKVLGFLREKGCD YCHTPSAELPAYYYIPGAKQLMDYDIKLGYKSFNLEAVRAALLADKPVSQSDLNKIEWVM QYETMPPTRYTALHWAGKVSDEERAEILAWIAKQRAEYYASNDTAPEHRNEPVQPIPQKL PTDAQKVALGFALYHDPRLSADSTISCAHCHALNAGGVDGRKTSIGVGGAVGPINAPTVF NSVFNVEQFWDGRAATLQDQAGGPPLNPIEMASKSWDEIIAKLEKDPQLKTQFLEVYPQG FSGENITDAIAEFEKTLITPDSPFDKWLRGDENALTAQQKKGYQLFKDNKCATCHGGIIL GGRSFEPLGLKKDFNFGEITAADIGRMNVTKEERDKLRQKVPGLRNVALTAPYFHRGDVP TLDGAVKLMLRYQVGKELPQEDVDDIVAFLHSLNGVYTPYMQDKQ >gi|296494579|gb|ADTN01000159.1| GENE 31 40689 - 40835 103 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MAFAPKRESSSSERKSVNSFWSISNSLNLFEGNKKVGFIRNGSKALQA >gi|296494579|gb|ADTN01000159.1| GENE 32 40768 - 41038 158 90 aa, chain + ## HITS:1 COG:ECs4397 KEGG:ns NR:ns ## COG: ECs4397 COG0076 # Protein_GI_number: 15833651 # Func_class: E Amino acid transport and metabolism # Function: Glutamate decarboxylase and related PLP-dependent proteins # Organism: Escherichia coli O157:H7 # 1 90 1 90 466 192 100.0 8e-50 MDQKLLTDFRSELLDSRFGAKAISTIAESKRFPLHEMRDDVAFQIINDELYLDGNARQNL ATFCQTWDDENVHKLMDLSINKNWIDKEEY Prediction of potential genes in microbial genomes Time: Sun May 15 23:41:34 2011 Seq name: gi|296494578|gb|ADTN01000160.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont297.9, whole genome shotgun sequence Length of sequence - 20935 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 11, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 725 465 ## COG4178 ABC-type uncharacterized transport system, permease and ATPase components 2 1 Op 2 . + CDS 763 - 3135 1932 ## JW1490 predicted porin protein + Term 3155 - 3184 -0.5 3 1 Op 3 3/0.000 + CDS 3192 - 5975 1859 ## COG0612 Predicted Zn-dependent peptidases + Term 5982 - 6018 6.3 + Prom 6130 - 6189 6.7 4 1 Op 4 . + CDS 6337 - 7737 1634 ## COG0076 Glutamate decarboxylase and related PLP-dependent proteins + Term 7757 - 7807 10.0 - Term 8147 - 8181 1.3 5 2 Tu 1 . - CDS 8303 - 8455 74 ## EcE24377A_4004 hypothetical protein - Prom 8490 - 8549 2.8 - Term 8944 - 8974 -0.7 6 3 Tu 1 . - CDS 9057 - 9143 85 ## - Prom 9244 - 9303 4.6 + Prom 9076 - 9135 7.7 7 4 Tu 1 . + CDS 9299 - 10027 154 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 10099 - 10126 -0.1 8 5 Tu 1 . 
+ CDS 10172 - 10453 126 ## gi|293412945|ref|ZP_06655613.1| predicted protein - Term 10333 - 10378 10.6 9 6 Op 1 27/0.000 - CDS 10390 - 13503 3090 ## COG0841 Cation/multidrug efflux pump 10 6 Op 2 2/0.000 - CDS 13528 - 14685 1095 ## COG0845 Membrane-fusion protein - Prom 14835 - 14894 2.6 11 6 Op 3 3/0.000 - CDS 15024 - 15551 387 ## COG2771 DNA-binding HTH domain-containing proteins - Prom 15643 - 15702 4.4 - Term 16287 - 16332 12.6 12 6 Op 4 . - CDS 16350 - 16922 515 ## COG3247 Uncharacterized conserved protein - Prom 17053 - 17112 4.4 + Prom 17030 - 17089 6.7 13 7 Op 1 . + CDS 17177 - 17509 502 ## EC55989_3953 acid-resistance protein + Term 17556 - 17598 0.4 + Prom 17522 - 17581 2.4 14 7 Op 2 . + CDS 17613 - 17951 295 ## SSON_3577 acid-resistance protein + Term 17969 - 18011 7.2 15 8 Tu 1 . + CDS 18015 - 18662 621 ## COG1285 Uncharacterized membrane protein - Term 18543 - 18588 0.5 16 9 Tu 1 1/1.000 - CDS 18704 - 19207 330 ## COG2771 DNA-binding HTH domain-containing proteins - Prom 19299 - 19358 5.8 17 10 Tu 1 . - CDS 19390 - 19956 607 ## COG3065 Starvation-inducible outer membrane lipoprotein - Prom 20137 - 20196 6.9 - Term 19966 - 20014 0.2 18 11 Tu 1 . 
- CDS 20204 - 20815 44 ## EcolC_0212 hypothetical protein - Prom 20836 - 20895 4.0 Predicted protein(s) >gi|296494578|gb|ADTN01000160.1| GENE 1 3 - 725 465 240 aa, chain + ## HITS:1 COG:ECs2101 KEGG:ns NR:ns ## COG: ECs2101 COG4178 # Protein_GI_number: 15831355 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease and ATPase components # Organism: Escherichia coli O157:H7 # 9 240 330 561 561 475 99.0 1e-134 NDLLIRTFHKYDELAELAAVIDRLYEFHQLTEQRPTNKPKNCQHAVQVADASIRTPDNKI ILENLNFHVSPGKWLLLKGYSGAGKTTLLKTLSHCWPWFKGDISSPADSWYVSQTPLIKT GLLKEIICKALPLPVDDKSLSEVLHQVGLGKLAARIHDHDRWGDILSSGEKQRIALARLI LRRPKWIFLDETTSHLEEQEAIRLLRLVREKLPTSGVIMVTHQPGVWNLADDICDISAVL >gi|296494578|gb|ADTN01000160.1| GENE 2 763 - 3135 1932 790 aa, chain + ## HITS:1 COG:no KEGG:JW1490 NR:ns ## KEGG: JW1490 # Name: yddB # Def: predicted porin protein # Organism: E.coli_J # Pathway: not_defined # 1 790 1 790 790 1521 100.0 0 MKRVLIPGVILCGADVAQAVDDKNMYMHFFEEMTVYAPVPVPVNGNTHYTSESIERLPTG NGNISDLLRTNPAVRMDSTQSTSLNQGDIRPEKISIHGASPYQNAYLIDGISATNNLNPA NESDASSATNISGMSQGYYLDVSLLDNVTLYDSFVPVEFGRFNGGVIDAKIKRFNADDSK VKLGYRTTRSDWLTSHIDENNKSAFNQGSSGSTYYSPDFKKNFYTLSFNQELADNFGVTA GLSRRQSDITRADYVSNDGIVAGRAQYKNVIDTALSKFTWFASDRFTHDLTLKYTGSSRD YNTSTFPQSDREMGNKSYGLAWDMDTQLAWAKLRTTVGWDHISDYTRHDHDIWYTELSCT YGDITGRCTRGGLGHISQAVDNYTFKTRLDWQKFAVGNVSHQPYFGAEYIYSDAWTERHN QSESYVINAAGKKTNHTIYHKGKGRLGIDNYTLYMADRISWRNVSLMPGVRYDYDNYLSN HNISPRFMTEWDIFANQTSMITAGYNRYYGGNILDMGLRDIRNSWTESVSGNKTLTRYQD LKTPYNDELAMGLQQKIGKNVIARANYVYREAHDQISKSSRTDSATKTTITEYNNDGKTK THSFSLSFELAEPLHIRQVDINPQIVFSYIKSKGNLSLNNGYEESNTGDNQVVYNGNLVS YDSVPVADFNNPLKISLNMDFTHQPSGLVWANTLAWQEARKARIILGKTNAQYISEYSDY KQYVDEKLDSSLTWDTRLSWTPQFLQQQNLTISADILNVLDSKTAVDTTNTGVATYASGR TFWLDVSMKF >gi|296494578|gb|ADTN01000160.1| GENE 3 3192 - 5975 1859 927 aa, chain + ## HITS:1 COG:pqqL KEGG:ns NR:ns ## COG: pqqL COG0612 # Protein_GI_number: 16129453 # Func_class: R General function prediction only # Function: Predicted 
Zn-dependent peptidases # Organism: Escherichia coli K12 # 1 927 5 931 931 1696 99.0 0 MRNLCFLLTLVATLLLPGRLIAAALPQDEKLITGQLDNGLRYMIYPHAHPKDQVNLWLQI HTGSLQEEDNERGVAHFVEHMMFNGTKTWPGNKVIETFESMGLRFGRDVNAYTSYDETVY QVSLPTTQKQNLQQVMAIFSEWSNAATFEKLEVDAERGVITEEWRAHQDAKWRTSQARRP FLLANTRNLDREPIGLMDTVATVTPAQLRQFYQRWYQPNNMTFIVVGDIDSKEALALIKD NLSKLPANKAAENRVWPTKAENHLRFNIINDKENRVNGIALYYRLPMVQVNDEQSFIEQA EWSMLVQLFNQRLQERIQSGELKTISGGTARSVKIAPDYQSLFFRVNARDDNMQDAANAL MAELATIDQHGFSAEELDDVKSTRLTWLKNAVDQQAERDLRMLTSRLASSSLNNTPFLSP EETYQLSKRLWQQITVQSLAEKWQQLRKNQDAFWEQMVNNEVAAKKALSPAAILALEKEY ANKKLAAYVFPGRNLSLTVDADPQAEISSKETLAENLTSLTLSNGARVILAKSAGEEQKL QIIAVSNKGDLSFPAQQKSLIALANKAVSGSGVGELSSSSLKRWSAENSVTMSSKVSGMN TLLSVSARTNNPEPGFQLINQRITHSTINDNIWASLQNAQIQALKTLDQRPAEKFAQQMY ETRYADDRTKLLQENQIAQFTAADALAADRQLFSSPADITFVIVGNVAEDKLVALITRYL GSIKHSDSPLAAGKPLTRATDNASVTVKEQNEPVAQVSQWKRYDSRTPVNLPTRMALDAF NVALAKDLRVNIREQASGAYSVSSRLSVDPQAKDISHLLAFTCQPERHDELLTLANEVMV KRLAKGISEQELNEYQQNVQRSLDIQQRSVQQLANTIVNSLIQYDDPAAWTEQEQLLKQM TVENVNTAVKQYLSHPVNTYTGVLLPK >gi|296494578|gb|ADTN01000160.1| GENE 4 6337 - 7737 1634 466 aa, chain + ## HITS:1 COG:ECs2098 KEGG:ns NR:ns ## COG: ECs2098 COG0076 # Protein_GI_number: 15831352 # Func_class: E Amino acid transport and metabolism # Function: Glutamate decarboxylase and related PLP-dependent proteins # Organism: Escherichia coli O157:H7 # 1 466 1 466 466 1000 100.0 0 MDKKQVTDLRSELLDSRFGAKSISTIAESKRFPLHEMRDDVAFQIINDELYLDGNARQNL ATFCQTWDDENVHKLMDLSINKNWIDKEEYPQSAAIDLRCVNMVADLWHAPAPKNGQAVG TNTIGSSEACMLGGMAMKWRWRKRMEAAGKPTDKPNLVCGPVQICWHKFARYWDVELREI PMRPGQLFMDPKRMIEACDENTIGVVPTFGVTYTGNYEFPQPLHDALDKFQADTGIDIDM HIDAASGGFLAPFVAPDIVWDFRLPRVKSISASGHKFGLAPLGCGWVIWRDEEALPQELV FNVDYLGGQIGTFAINFSRPAGQVIAQYYEFLRLGREGYTKVQNASYQVAAYLADEIAKL GPYEFICTGRPDEGIPAVCFKLKDGEDPGYTLYDLSERLRLRGWQVPAFTLGGEATDIVV MRIMCRRGFEMDFAELLLEDYKASLKYLSDHPKLQGIAQQNSFKHT >gi|296494578|gb|ADTN01000160.1| GENE 5 8303 - 8455 74 50 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_4004 NR:ns ## KEGG: EcE24377A_4004 # 
Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 50 1 50 50 76 96.0 3e-13 MLFYVAFLHSEDSYSAIVAQLPEKQEYLYYSDERVMKKLFYVHEDLMPPP >gi|296494578|gb|ADTN01000160.1| GENE 6 9057 - 9143 85 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQHEYIFAQRIAYVYKKMADLISSKSYI >gi|296494578|gb|ADTN01000160.1| GENE 7 9299 - 10027 154 242 aa, chain + ## HITS:1 COG:ECs4395 KEGG:ns NR:ns ## COG: ECs4395 COG2207 # Protein_GI_number: 15833649 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 242 1 242 242 472 100.0 1e-133 MTHVCSVILIRRSFDIYHEQQKISLHNESILLLEKNLADDFAFCSPDTRRLDIDELTVCH YLQNIRQLPRNLGLHSKDRLLINQSPPMPLVTAIFDSFNESGVNSPILSNMLYLSCLSMF SHKKELIPLLFNSISTVSGKVERLISFDIAKRWYLRDIAERMYTSESLIKKKLQDENTCF SKILLASRMSMARRLLELRQIPLHTIAEKCGYSSTSYFINTFRQYYGVTPHQFAQHSPGT FS >gi|296494578|gb|ADTN01000160.1| GENE 8 10172 - 10453 126 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293412945|ref|ZP_06655613.1| ## NR: gi|293412945|ref|ZP_06655613.1| predicted protein [Escherichia coli B354] # 1 84 1 84 93 140 98.0 3e-32 MFGIIKLTIHTITGMWVSIVLFKLMTNGWSGFYFQCCVLSLVFLTVSWLLSGEWLAGKSK AEPSCSTLLSFTRYAFLKRAKRCSTTTKKTGTK >gi|296494578|gb|ADTN01000160.1| GENE 9 10390 - 13503 3090 1037 aa, chain - ## HITS:1 COG:yhiV KEGG:ns NR:ns ## COG: yhiV COG0841 # Protein_GI_number: 16131386 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Escherichia coli K12 # 1 1037 1 1037 1037 1898 100.0 0 MANYFIDRPVFAWVLAIIMMLAGGLAIMNLPVAQYPQIAPPTITVSATYPGADAQTVEDS VTQVIEQNMNGLDGLMYMSSTSDAAGNASITLTFETGTSPDIAQVQVQNKLQLAMPSLPE AVQQQGISVDKSSSNILMVAAFISDNGSLNQYDIADYVASNIKDPLSRTAGVGSVQLFGS EYAMRIWLDPQKLNKYNLVPSDVISQIKVQNNQISGGQLGGMPQAADQQLNASIIVQTRL QTPEEFGKILLKVQQDGSQVLLRDVARVELGAEDYSTVARYNGKPAAGIAIKLAAGANAL DTSRAVKEELNRLSAYFPASLKTVYPYDTTPFIEISIQEVFKTLVEAIILVFLVMYLFLQ NFRATIIPTIAVPVVILGTFAILSAVGFTINTLTMFGMVLAIGLLVDDAIVVVENVERVI 
AEDKLPPKEATHKSMGQIQRALVGIAVVLSAVFMPMAFMSGATGEIYRQFSITLISSMLL SVFVAMSLTPALCATILKAAPEGGHKPNALFARFNTLFEKSTQHYTDSTRSLLRCTGRYM VVYLLICAGMAVLFLRTPTSFLPEEDQGVFMTTAQLPSGATMVNTTKVLQQVTDYYLTKE KDNVQSVFTVGGFGFSGQGQNNGLAFISLKPWSERVGEENSVTAIIQRAMIALSSINKAV VFPFNLPAVAELGTASGFDMELLDNGNLGHEKLTQARNELLSLAAQSPNQVTGVRPNGLE DTPMFKVNVNAAKAEAMGVALSDINQTISTAFGSSYVNDFLNQGRVKKVYVQAGTPFRML PDNINQWYVRNASGTMAPLSAYSSTEWTYGSPRLERYNGIPSMEILGEAAAGKSTGDAMK FMADLVAKLPAGVGYSWTGLSYQEALSSNQAPALYAISLVVVFLALAALYESWSIPFSVM LVVPLGVVGALLATDLRGLSNDVYFQVGLLTTIGLSAKNAILIVEFAVEMMQKEGKTPIE AIIEAARMRLRPILMTSLAFILGVLPLVISHGAGSGAQNAVGTGVMGGMFAATVLAIYFV PVFFVVVEHLFARFKKA >gi|296494578|gb|ADTN01000160.1| GENE 10 13528 - 14685 1095 385 aa, chain - ## HITS:1 COG:yhiU KEGG:ns NR:ns ## COG: yhiU COG0845 # Protein_GI_number: 16131385 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli K12 # 1 385 1 385 385 687 100.0 0 MNRRRKLLIPLLFCGAMLTACDDKSAENAAAMTPEVGVVTLSPGSVNVLSELPGRTVPYE VAEIRPQVGGIIIKRNFIEGDKVNQGDSLYQIDPAPLQAELNSAKGSLAKALSTASNARI TFNRQASLLKTNYVSRQDYDTARTQLNEAEANVTVAKAAVEQATINLQYANVTSPITGVS GKSSVTVGALVTANQADSLVTVQRLDPIYVDLTQSVQDFLRMKEEVASGQIKQVQGSTPV QLNLENGKRYSQTGTLKFSDPTVDETTGSVTLRAIFPNPNGDLLPGMYVTALVDEGSRQN VLLVPQEGVTHNAQGKATALILDKDDVVQLREIEASKAIGDQWVVTSGLQAGDRVIVSGL QRIRPGIKARAISSSQENASTESKQ >gi|296494578|gb|ADTN01000160.1| GENE 11 15024 - 15551 387 175 aa, chain - ## HITS:1 COG:ECs4392 KEGG:ns NR:ns ## COG: ECs4392 COG2771 # Protein_GI_number: 15833646 # Func_class: K Transcription # Function: DNA-binding HTH domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 175 1 175 175 302 100.0 2e-82 MIFLMTKDSFLLQGFWQLKDNHEMIKINSLSEIKKVGNKPFKVIIDTYHNHILDEEAIKF LEKLDAERIIVLAPYHISKLKAKAPIYFVSRKESIKNLLEITYGKHLPHKNSQLCFSHNQ FKIMQLILKNKNESNITSTLNISQQTLKIQKFNIMYKLKLRRMSDIVTLGITSYF >gi|296494578|gb|ADTN01000160.1| GENE 12 16350 - 16922 515 190 aa, chain - ## HITS:1 COG:ECs4391 KEGG:ns NR:ns ## COG: ECs4391 COG3247 # Protein_GI_number: 15833645 # 
Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 190 1 190 190 293 100.0 1e-79 MLYIDKATILKFDLEMLKKHRRAIQFIAVLLFIVGLLCISFPFVSGDILSTVVGALLICS GIALIVGLFSNRSHNFWPVLSGFLVAVAYLLIGYFFIRAPELGIFAIAAFIAGLFCVAGV IRLMSWYRQRSMKGSWLQLVIGVLDIVIAWIFLGATPMVSVTLVSTLVGIELIFSAASLF SFASLFVKQQ >gi|296494578|gb|ADTN01000160.1| GENE 13 17177 - 17509 502 110 aa, chain + ## HITS:1 COG:no KEGG:EC55989_3953 NR:ns ## KEGG: EC55989_3953 # Name: hdeA # Def: acid-resistance protein # Organism: E.coli_55989 # Pathway: not_defined # 19 110 19 110 110 157 100.0 7e-38 MKKVLGVILGGLLLLPVVSNAADAQKAADNKKPVNSWTCEDFLAVDESFQPTAVGFAEAL NNKDKPEDAVLDVQGIATVTPAIVQACTQDKQANFKDKVKGEWDKIKKDM >gi|296494578|gb|ADTN01000160.1| GENE 14 17613 - 17951 295 112 aa, chain + ## HITS:1 COG:no KEGG:SSON_3577 NR:ns ## KEGG: SSON_3577 # Name: hdeB # Def: acid-resistance protein # Organism: S.sonnei # Pathway: not_defined # 1 112 1 112 112 216 100.0 1e-55 MGYKMNISSLRKAFIFMGAVAALSLVNAQSALAANESAKDMTCQEFIDLNPKAMTPVAWW MLHEETVYKGGDTVTLNETDLTQIPKVIEYCKKNPQKNLYTFKNQASNDLPN >gi|296494578|gb|ADTN01000160.1| GENE 15 18015 - 18662 621 215 aa, chain + ## HITS:1 COG:ECs4388 KEGG:ns NR:ns ## COG: ECs4388 COG1285 # Protein_GI_number: 15833642 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Escherichia coli O157:H7 # 1 215 1 215 215 381 100.0 1e-106 MTAEFIIRLILAAIACGAIGMERQMRGKGAGLRTHVLIGMGSALFMIVSKYGFADVLSLD HVGLDPSRIAAQVVTGVGFIGAGNILVRNQNIVGLTTAADIWVTAAIGMVIGSGMYELGI YGSVMTLLVLEVFHQLTFRLMNKNYHLQLTLVNGNTVSMLDWFKQQKIKTDLVSLQENED HEVVAIDIQLHATTSIEDLLRLLKGMAGVKGVSIS >gi|296494578|gb|ADTN01000160.1| GENE 16 18704 - 19207 330 167 aa, chain - ## HITS:1 COG:yhiF KEGG:ns NR:ns ## COG: yhiF COG2771 # Protein_GI_number: 16131379 # Func_class: K Transcription # Function: DNA-binding HTH domain-containing proteins # Organism: Escherichia coli K12 # 1 167 10 176 176 298 100.0 3e-81 MFFTAMKNILSKGNVVHIQNEEEIDVMLHQNAFVIIDTLMNNVFHSNFLTQIERLKPVHV 
IIFSPFNIKRCLGKVPVTFVPRTITIIDFVALINGSYCSVPEAAVSLSRKQHQVLSCIAN QMTTEDILEKLKISLKTFYCHKHNIMMILNLKRINELVRHQHIDYLV >gi|296494578|gb|ADTN01000160.1| GENE 17 19390 - 19956 607 188 aa, chain - ## HITS:1 COG:ECs4377 KEGG:ns NR:ns ## COG: ECs4377 COG3065 # Protein_GI_number: 15833631 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Starvation-inducible outer membrane lipoprotein # Organism: Escherichia coli O157:H7 # 1 188 12 199 199 381 100.0 1e-106 MNMTKGALILSLSFLLAACSSIPQNIKGNNQPDIQKSFVAVHNQPGLYVGQQARFGGKVI NVINGKTDTLLEIAVLPLDSYAKPDIEANYQGRLLARQSGFLDPVNYRNHFVTILGTIQG EQPGFINKVPYNFLEVNMQGIQVWHLREVVNTTYNLWDYGYGAFWPEPGWGAPYYTNAVS QVTPELVK >gi|296494578|gb|ADTN01000160.1| GENE 18 20204 - 20815 44 203 aa, chain - ## HITS:1 COG:no KEGG:EcolC_0212 NR:ns ## KEGG: EcolC_0212 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 203 205 407 407 392 100.0 1e-108 MKVFAGTNFIDFNMTGQNLSGFVLTLSRFYFEDLLNINFTDANLGDTIFLHKEHPTPKLY KDGQYLDKQIEGLFSTLLTINDNLLRAKAEIASTIIKFLEARITNLSYNDILKYQQEFQK QCYKQVKAFTTLSRYNKIQTWAEMSEYQFEVFQYETLNPKKMSHTPYLKRPLPNEKDINY GVEIEIPSGKRIRLSNHYQNIIP Prediction of potential genes in microbial genomes Time: Sun May 15 23:42:04 2011 Seq name: gi|296494577|gb|ADTN01000161.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont305.1, whole genome shotgun sequence Length of sequence - 7862 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 - CDS 48 - 593 518 ## COG4635 Flavodoxin 2 1 Op 2 4/1.000 - CDS 605 - 2056 1548 ## COG0168 Trk-type K+ transport systems, membrane components 3 1 Op 3 2/1.000 - CDS 2095 - 2637 518 ## COG1739 Uncharacterized conserved protein 4 1 Op 4 . - CDS 2709 - 4040 1624 ## COG0006 Xaa-Pro aminopeptidase - Prom 4071 - 4130 2.4 + Prom 4108 - 4167 4.8 5 2 Op 1 20/0.000 + CDS 4230 - 6419 2801 ## COG1250 3-hydroxyacyl-CoA dehydrogenase 6 2 Op 2 . 
+ CDS 6429 - 7592 1360 ## COG0183 Acetyl-CoA acetyltransferase Predicted protein(s) >gi|296494577|gb|ADTN01000161.1| GENE 1 48 - 593 518 181 aa, chain - ## HITS:1 COG:ECs4778 KEGG:ns NR:ns ## COG: ECs4778 COG4635 # Protein_GI_number: 15834032 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism # Function: Flavodoxin # Organism: Escherichia coli O157:H7 # 1 181 1 181 181 358 100.0 2e-99 MKTLILFSTRDGQTREIASYLASELKELGIQADVANVHRIEEPQWENYDRVVIGASIRYG HYHSAFQEFVKKHATRLNSMPSAFYSVNLVARKPEKRTPQTNSYARKFLMNSQWRPDRCA VIAGALRYPRYRWYDRFMIKLIMKMSGGETDTRKEVVYTDWEQVANFAREIAHLTDKPTL K >gi|296494577|gb|ADTN01000161.1| GENE 2 605 - 2056 1548 483 aa, chain - ## HITS:1 COG:ECs4777 KEGG:ns NR:ns ## COG: ECs4777 COG0168 # Protein_GI_number: 15834031 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Escherichia coli O157:H7 # 1 483 1 483 483 848 100.0 0 MHFRAITRIVGLLVILFSGTMIIPGLVALIYRDGAGRAFTQTFFVALAIGSMLWWPNRKE KGELKSREGFLIVVLFWTVLGSVGALPFIFSESPNLTITDAFFESFSGLTTTGATTLVGL DSLPHAILFYRQMLQWFGGMGIIVLAVAILPILGVGGMQLYRAEMPGPLKDNKMRPRIAE TAKTLWLIYVLLTVACALALWFAGMDAFDAIGHSFATIAIGGFSTHDASIGYFDSPTINT IIAIFLLISGCNYGLHFSLLSGRSLKVYWRDPEFRMFIGVQFTLVVICTLVLWFHNVYSS ALMTINQAFFQVVSMATTAGFTTDSIARWPLFLPVLLLCSAFIGGCAGSTGGGLKVIRIL LLFKQGNRELKRLVHPNAVYSIKLGNRALPERILEAVWGFFSAYALVFIVSMLAIIATGV DDFSAFASVVATLNNLGPGLGVVADNFTSMNPVAKWILIANMLFGRLEVFTLLVLFTPTF WRE >gi|296494577|gb|ADTN01000161.1| GENE 3 2095 - 2637 518 180 aa, chain - ## HITS:1 COG:yigZ KEGG:ns NR:ns ## COG: yigZ COG1739 # Protein_GI_number: 16131694 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 180 26 205 205 329 100.0 2e-90 MLAHTDGVEAAKAFVESVRAEHPDARHHCVAWVAGAPDDSQQLGFSDDGEPAGTAGKPML AQLMGSGVGEITAVVVRYYGGILLGTGGLVKAYGGGVNQALRQLTTQRKTPLTEYTLQCE YHQLTGIEALLGQCDGKIINSDYQAFVLLRVALPAAKVAEFSAKLADFSRGSLQLLAIEE >gi|296494577|gb|ADTN01000161.1| GENE 4 2709 - 4040 1624 443 aa, 
chain - ## HITS:1 COG:pepQ KEGG:ns NR:ns ## COG: pepQ COG0006 # Protein_GI_number: 16131693 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Escherichia coli K12 # 1 443 1 443 443 927 100.0 0 MESLASLYKNHIATLQERTRDALARFKLDALLIHSGELFNVFLDDHPYPFKVNPQFKAWV PVTQVPNCWLLVDGVNKPKLWFYLPVDYWHNVEPLPTSFWTEDVEVIALPKADGIGSLLP AARGNIGYIGPVPERALQLGIEASNINPKGVIDYLHYYRSFKTEYELACMREAQKMAVNG HRAAEEAFRSGMSEFDINIAYLTATGHRDTDVPYSNIVALNEHAAVLHYTKLDHQAPEEM RSFLLDAGAEYNGYAADLTRTWSAKSDNDYAQLVKDVNDEQLALIATMKAGVSYVDYHIQ FHQRIAKLLRKHQIITDMSEEAMVENDLTGPFMPHGIGHPLGLQVHDVAGFMQDDSGTHL AAPAKYPYLRCTRILQPGMVLTIEPGIYFIESLLAPWREGQFSKHFNWQKIEALKPFGGI RIEDNVVIHENNVENMTRDLKLA >gi|296494577|gb|ADTN01000161.1| GENE 5 4230 - 6419 2801 729 aa, chain + ## HITS:1 COG:fadB_2 KEGG:ns NR:ns ## COG: fadB_2 COG1250 # Protein_GI_number: 16131692 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyacyl-CoA dehydrogenase # Organism: Escherichia coli K12 # 307 729 1 423 423 843 100.0 0 MLYKGDTLYLDWLEDGIAELVFDAPGSVNKLDTATVASLGEAIGVLEQQSDLKGLLLRSN KAAFIVGADITEFLSLFLVPEEQLSQWLHFANSVFNRLEDLPVPTIAAVNGYALGGGCEC VLATDYRLATPDLRIGLPETKLGIMPGFGGSVRMPRMLGADSALEIIAAGKDVGADQALK IGLVDGVVKAEKLVEGAKAVLRQAINGDLDWKAKRQPKLEPLKLSKIEATMSFTIAKGMV AQTAGKHYPAPITAVKTIEAAARFGREEALNLENKSFVPLAHTNEARALVGIFLNDQYVK GKAKKLTKDVETPKQAAVLGAGIMGGGIAYQSAWKGVPVVMKDINDKSLTLGMTEAAKLL NKQLERGKIDGLKLAGVISTIHPTLDYAGFDRVDIVVEAVVENPKVKKAVLAETEQKVRQ DTVLASNTSTIPISELANALERPENFCGMHFFNPVHRMPLVEIIRGEKSSDETIAKVVAW ASKMGKTPIVVNDCPGFFVNRVLFPYFAGFSQLLRDGADFRKIDKVMEKQFGWPMGPAYL LDVVGIDTAHHAQAVMAAGFPQRMQKDYRDAIDALFDANRFGQKNGLGFWRYKEDSKGKP KKEEDAAVEDLLAEVSQPKRDFSEEEIIARMMIPMVNEVVRCLEEGIIATPAEADMALVY GLGFPPFHGGAFRWLDTLGSAKYLDMAQQYQHLGPLYEVPEGLRNKARHNEPYYPPVEPA RPVGDLKTA >gi|296494577|gb|ADTN01000161.1| GENE 6 6429 - 7592 1360 387 aa, chain + ## HITS:1 COG:fadA KEGG:ns NR:ns ## COG: fadA COG0183 # Protein_GI_number: 16131691 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA acetyltransferase # 
Organism: Escherichia coli K12 # 1 387 1 387 387 732 99.0 0 MEQVVIVDAIRTPMGRSKGGAFRNVRAEDLSAHLMRSLLARNPALEAAALDDIYWGCVQQ TLEQGFNIARNAALLAEVPHSVPAVTVNRLCGSSMQALHDAARMIMTGDAQACLVGGVEH MGHVPMSHGVDFHPGLSRNVAKAAGMMGLTAEMLARMHGISREMQDAFAARSHARAWAAT QSAAFKNEIIPTGGHDADGVLKQFNYDEVIRPETTVEALATLRPAFDPVNGMVTAGTSSA LSDGAAAMLVMSESRAHELGLKPRARVRSMAVVGCDPSIMGYGPVPASKLALKKAGLSAS DIGVFEMNEAFAAQILPCIKDLGLIEQIDEKINLNGGAIALGHPLGCSGARISTTLLNLM ERKDVQFGLATMCIGLGQGIATVFERV Prediction of potential genes in microbial genomes Time: Sun May 15 23:42:06 2011 Seq name: gi|296494576|gb|ADTN01000162.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont305.2, whole genome shotgun sequence Length of sequence - 3717 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 - CDS 16 - 717 757 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 2 1 Op 2 . - CDS 763 - 2256 1618 ## COG0043 3-polyprenyl-4-hydroxybenzoate decarboxylase and related decarboxylases - Prom 2331 - 2390 3.9 + Prom 2270 - 2329 6.4 3 2 Tu 1 . + CDS 2423 - 2911 368 ## COG0250 Transcription antiterminator + Term 2946 - 2983 1.0 4 3 Tu 1 . 
- CDS 2908 - 3690 513 ## COG0084 Mg-dependent DNase Predicted protein(s) >gi|296494576|gb|ADTN01000162.1| GENE 1 16 - 717 757 233 aa, chain - ## HITS:1 COG:ECs4772 KEGG:ns NR:ns ## COG: ECs4772 COG0543 # Protein_GI_number: 15834026 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Escherichia coli O157:H7 # 1 233 1 233 233 480 100.0 1e-136 MTTLSCKVTSVEAITDTVYRVRIVPDAAFSFRAGQYLMVVMDERDKRPFSMASTPDEKGF IELHIGASEINLYAKAVMDRILKDHQIVVDIPHGEAWLRDDEERPMILIAGGTGFSYARS ILLTALARNPNRDITIYWGGREEQHLYDLCELEALSLKHPGLQVVPVVEQPEAGWRGRTG TVLTAVLQDHGTLAEHDIYIAGRFEMAKIARDLFCSERNAREDRLFGDAFAFI >gi|296494576|gb|ADTN01000162.1| GENE 2 763 - 2256 1618 497 aa, chain - ## HITS:1 COG:ubiD KEGG:ns NR:ns ## COG: ubiD COG0043 # Protein_GI_number: 16131689 # Func_class: H Coenzyme transport and metabolism # Function: 3-polyprenyl-4-hydroxybenzoate decarboxylase and related decarboxylases # Organism: Escherichia coli K12 # 1 497 1 497 497 1036 100.0 0 MDAMKYNDLRDFLTLLEQQGELKRITLPVDPHLEITEIADRTLRAGGPALLFENPKGYSM PVLCNLFGTPKRVAMGMGQEDVSALREVGKLLAFLKEPEPPKGFRDLFDKLPQFKQVLNM PTKRLRGAPCQQKIVSGDDVDLNRIPIMTCWPEDAAPLITWGLTVTRGPHKERQNLGIYR QQLIGKNKLIMRWLSHRGGALDYQEWCAAHPGERFPVSVALGADPATILGAVTPVPDTLS EYAFAGLLRGTKTEVVKCISNDLEVPASAEIVLEGYIEQGETAPEGPYGDHTGYYNEVDS FPVFTVTHITQREDAIYHSTYTGRPPDEPAVLGVALNEVFVPILQKQFPEIVDFYLPPEG CSYRLAVVTIKKQYAGHAKRVMMGVWSFLRQFMYTKFVIVCDDDVNARDWNDVIWAITTR MDPARDTVLVENTPIDYLDFASPVSGLGSKMGLDATNKWPGETQREWGRPIKKDPDVVAH IDAIWDELAIFNNGKSA >gi|296494576|gb|ADTN01000162.1| GENE 3 2423 - 2911 368 162 aa, chain + ## HITS:1 COG:ECs4770 KEGG:ns NR:ns ## COG: ECs4770 COG0250 # Protein_GI_number: 15834024 # Func_class: K Transcription # Function: Transcription antiterminator # Organism: Escherichia coli O157:H7 # 1 162 1 162 162 330 100.0 9e-91 MQSWYLLYCKRGQLQRAQEHLERQAVNCLAPMITLEKIVRGKRTAVSEPLFPNYLFVEFD PEVIHTTTINATRGVSHFVRFGASPAIVPSAVIHQLSVYKPKDIVDPATPYPGDKVIITE 
GAFEGFQAIFTEPDGEARSMLLLNLINKEIKHSVKNTEFRKL >gi|296494576|gb|ADTN01000162.1| GENE 4 2908 - 3690 513 260 aa, chain - ## HITS:1 COG:tatD KEGG:ns NR:ns ## COG: tatD COG0084 # Protein_GI_number: 16132236 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Escherichia coli K12 # 1 260 5 264 264 539 100.0 1e-153 MFDIGVNLTSSQFAKDRDDVVACAFDAGVNGLLITGTNLRESQQAQKLARQYSSCWSTAG VHPHDSSQWQAATEEAIIELAAQPEVVAIGECGLDFNRNFSTPEEQERAFVAQLRIAADL NMPVFMHCRDAHERFMTLLEPWLDKLPGAVLHCFTGTREEMQACVAHGIYIGITGWVCDE RRGLELRELLPLIPAEKLLIETDAPYLLPRDLTPKPSSRRNEPAHLPHILQRIAHWRGED AAWLAATTDANVKTLFGIAF Prediction of potential genes in microbial genomes Time: Sun May 15 23:42:09 2011 Seq name: gi|296494575|gb|ADTN01000163.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont305.3, whole genome shotgun sequence Length of sequence - 8199 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 3, operones - 1 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 28/0.000 - CDS 20 - 796 839 ## COG0805 Sec-independent protein secretion pathway component TatC 2 1 Op 2 16/0.000 - CDS 799 - 1314 458 ## COG1826 Sec-independent protein secretion pathway components 3 1 Op 3 6/0.000 - CDS 1318 - 1587 200 ## PROTEIN SUPPORTED gi|90022866|ref|YP_528693.1| ribosomal protein L25 - Term 1610 - 1644 -1.0 4 1 Op 4 10/0.000 - CDS 1666 - 3306 1650 ## COG0661 Predicted unusual protein kinase 5 1 Op 5 7/0.000 - CDS 3303 - 3908 701 ## COG3165 Uncharacterized protein conserved in bacteria 6 1 Op 6 5/0.000 - CDS 3922 - 4677 363 ## PROTEIN SUPPORTED gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15 - Prom 4704 - 4763 3.2 7 1 Op 7 4/1.000 - CDS 4772 - 6184 1553 ## COG1322 Uncharacterized protein conserved in bacteria - Prom 6232 - 6291 3.0 - Term 6289 - 6323 5.2 8 2 Tu 1 . - CDS 6340 - 7101 968 ## COG2820 Uridine phosphorylase - Prom 7348 - 7407 6.5 + Prom 7276 - 7335 5.3 9 3 Tu 1 . 
+ CDS 7363 - 8178 797 ## COG0412 Dienelactone hydrolase and related enzymes Predicted protein(s) >gi|296494575|gb|ADTN01000163.1| GENE 1 20 - 796 839 258 aa, chain - ## HITS:1 COG:ECs4768 KEGG:ns NR:ns ## COG: ECs4768 COG0805 # Protein_GI_number: 15834022 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Sec-independent protein secretion pathway component TatC # Organism: Escherichia coli O157:H7 # 1 258 1 258 258 465 100.0 1e-131 MSVEDTQPLITHLIELRKRLLNCIIAVIVIFLCLVYFANDIYHLVSAPLIKQLPQGSTMI ATDVASPFFTPIKLTFMVSLILSAPVILYQVWAFIAPALYKHERRLVVPLLVSSSLLFYI GMAFAYFVVFPLAFGFLANTAPEGVQVSTDIASYLSFVMALFMAFGVSFEVPVAIVLLCW MGITSPEDLRKKRPYVLVGAFVVGMLLTPPDVFSQTLLAIPMYCLFEIGVFFSRFYVGKG RNREEENDAEAESEKTEE >gi|296494575|gb|ADTN01000163.1| GENE 2 799 - 1314 458 171 aa, chain - ## HITS:1 COG:ECs4767 KEGG:ns NR:ns ## COG: ECs4767 COG1826 # Protein_GI_number: 15834021 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Sec-independent protein secretion pathway components # Organism: Escherichia coli O157:H7 # 1 171 1 171 171 267 100.0 9e-72 MFDIGFSELLLVFIIGLVVLGPQRLPVAVKTVAGWIRALRSLATTVQNELTQELKLQEFQ DSLKKVEKASLTNLTPELKASMDELRQAAESMKRSYVANDPEKASDEAHTIHNPVVKDNE AAHEGVTPAAAQTQASSPEQKPETTPEPVVKPAADAEPKTAAPSPSSSDKP >gi|296494575|gb|ADTN01000163.1| GENE 3 1318 - 1587 200 89 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90022866|ref|YP_528693.1| ribosomal protein L25 [Saccharophagus degradans 2-40] # 1 79 3 81 83 81 51 2e-15 MGGISIWQLLIIAVIVVLLFGTKKLGSIGSDLGASIKGFKKAMSDDEPKQDKTSQDADFT AKTIADKQADTNQEQAKTEDAKRHDKEQV >gi|296494575|gb|ADTN01000163.1| GENE 4 1666 - 3306 1650 546 aa, chain - ## HITS:1 COG:ECs4765 KEGG:ns NR:ns ## COG: ECs4765 COG0661 # Protein_GI_number: 15834019 # Func_class: R General function prediction only # Function: Predicted unusual protein kinase # Organism: Escherichia coli O157:H7 # 1 546 1 546 546 1109 100.0 0 MTPGEVRRLYFIIRTFLSYGLDELIPKMRITLPLRLWRYSLFWMPNRHKDKLLGERLRLA 
LQELGPVWIKFGQMLSTRRDLFPPHIADQLALLQDKVAPFDGKLAKQQIEAAMGGLPVEA WFDDFEIKPLASASIAQVHTARLKSNGKEVVIKVIRPDILPVIKADLKLIYRLARWVPRL LPDGRRLRPTEVVREYEKTLIDELNLLRESANAIQLRRNFEDSPMLYIPEVYPDYCSEGM MVMERIYGIPVSDVAALEKNGTNMKLLAERGVQVFFTQVFRDSFFHADMHPGNIFVSYEH PENPKYIGIDCGIVGSLNKEDKRYLAENFIAFFNRDYRKVAELHVDSGWVPPDTNVEEFE FAIRTVCEPIFEKPLAEISFGHVLLNLFNTARRFNMEVQPQLVLLQKTLLYVEGVGRQLY PQLDLWKTAKPFLESWIKDQVGIPALVRAFKEKAPFWVEKMPELPELVYDSLRQGKYLQH SVDKIARELQSNHVRQGQSRYFLGIGATLVLSGTFLLVSRPEWGLMPGWLMAGGLIAWFV GWRKTR >gi|296494575|gb|ADTN01000163.1| GENE 5 3303 - 3908 701 201 aa, chain - ## HITS:1 COG:yigP KEGG:ns NR:ns ## COG: yigP COG3165 # Protein_GI_number: 16131683 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 201 1 201 201 375 100.0 1e-104 MPFKPLVTAGIESLLNTFLYRSPALKTARSRLLGKVLRVEVKGFSTSLILVFSERQVDVL GEWAGDADCTVIAYASVLPKLRDRQQLTALIRSGELEVQGDIQVVQNFVALADLAEFDPA ELLAPYTGDIAAEGISKAMRGGAKFLHHGIKRQQRYVAEAITEEWRMAPGPLEVAWFAEE TAAVERAVDALTKRLEKLEAK >gi|296494575|gb|ADTN01000163.1| GENE 6 3922 - 4677 363 251 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15 [Kordia algicida OT-1] # 28 250 1 221 221 144 34 2e-34 MVDKLQETTHFGFQTVAKEQKADMVAHVFHSVASKYDVMNDLMSFGIHRLWKRFTIDCSG VRRGQTVLDLAGGTGDLTAKFSRLVGETGKVVLADINESMLKMGREKLRNIGVIGNVEYV QANAEALPFPDNTFDCITISFGLRNVTDKDKALRSMYRVLKPGGRLLVLEFSKPIIEPLS KAYDAYSFHVLPRIGSLVANDADSYRYLAESIRMHPDQDTLKAMMQDAGFESVDYYNLTA GVVALHRGYKF >gi|296494575|gb|ADTN01000163.1| GENE 7 4772 - 6184 1553 470 aa, chain - ## HITS:1 COG:ECs4762 KEGG:ns NR:ns ## COG: ECs4762 COG1322 # Protein_GI_number: 15834016 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 470 6 475 475 805 100.0 0 MVYAVIALVGVAIGWLFASYQHAQQKAEQLAEREEMVAELSAAKQQITQSEHWRAECELL NNEVRSLQSINTSLEADLREVTTRMEAAQQHADDKIRQMINSEQRLSEQFENLANRIFEH SNRRVDEQNRQSLNSLLSPLREQLDGFRRQVQDSFGKEAQERHTLTHEIRNLQQLNAQMA 
QEAINLTRALKGDNKTQGNWGEVVLTRVLEASGLREGYEYETQVSIENDARSRMQPDVIV RLPQGKDVVIDAKMTLVAYERYFNAEDDYTRESALQEHIASVRNHIRLLGRKDYQQLPGL RTLDYVLMFIPVEPAFLLALDRQPELITEALKNNIMLVSPTTLLVALRTIANLWRYEHQS RNAQQIADRASKLYDKMRLFIDDMSAIGQSLDKAQDNYRQAMKKLSSGRGNVLAQAEAFR GLGVEIKREINPDLAEQAVSQDEEYRLRSVPEQPNDEAYQRDDEYNQQSR >gi|296494575|gb|ADTN01000163.1| GENE 8 6340 - 7101 968 253 aa, chain - ## HITS:1 COG:udp KEGG:ns NR:ns ## COG: udp COG2820 # Protein_GI_number: 16131680 # Func_class: F Nucleotide transport and metabolism # Function: Uridine phosphorylase # Organism: Escherichia coli K12 # 1 253 1 253 253 489 100.0 1e-138 MSKSDVFHLGLTKNDLQGATLAIVPGDPDRVEKIAALMDKPVKLASHREFTTWRAELDGK PVIVCSTGIGGPSTSIAVEELAQLGIRTFLRIGTTGAIQPHINVGDVLVTTASVRLDGAS LHFAPLEFPAVADFECTTALVEAAKSIGATTHVGVTASSDTFYPGQERYDTYSGRVVRHF KGSMEEWQAMGVMNYEMESATLLTMCASQGLRAGMVAGVIVNRTQQEIPNAETMKQTESH AVKIVVEAARRLL >gi|296494575|gb|ADTN01000163.1| GENE 9 7363 - 8178 797 271 aa, chain + ## HITS:1 COG:ECs4760 KEGG:ns NR:ns ## COG: ECs4760 COG0412 # Protein_GI_number: 15834014 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Dienelactone hydrolase and related enzymes # Organism: Escherichia coli O157:H7 # 1 271 23 293 293 548 99.0 1e-156 MATTQQSGFAPAASPLASTIVQTPDDAIVAGFTSIPSQGDNMPAYHARPKQSDGPLPVVI VVQEIFGVHEHIRDICRRLALEGYLAIAPELYFREGDPNDFADIPTLLSGLVAKVPDSQV LADLDHVASWASRNGGDVHRLMITGFCWGGRITWLYAAHNPQLKAAVAWYGKLTGDKSLN SPKQPVDIATDLNAPILGLYGGQDNSIPQESVETMRQALRAANAKAEIIVYPDAGHAFNA DYRPSYHAASAEDGWQRMLEWFKQYGGKKSL Prediction of potential genes in microbial genomes Time: Sun May 15 23:42:19 2011 Seq name: gi|296494574|gb|ADTN01000164.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont305.4, whole genome shotgun sequence Length of sequence - 28707 bp Number of predicted genes - 27, with homology - 27 Number of transcription units - 15, operones - 6 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 18 - 2279 2661 ## COG0620 Methionine synthase II (cobalamin-independent) - Prom 2417 - 2476 6.3 + Prom 2423 - 2482 4.9 2 2 Tu 1 . + CDS 2516 - 3469 678 ## COG0583 Transcriptional regulator 3 3 Op 1 3/0.600 - CDS 3357 - 4256 1136 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 4 3 Op 2 5/0.600 - CDS 4332 - 5132 952 ## COG0561 Predicted hydrolases of the HAD superfamily 5 3 Op 3 . - CDS 5140 - 6162 637 ## COG2267 Lysophospholipase - Prom 6308 - 6367 3.6 + Prom 6141 - 6200 4.9 6 4 Tu 1 . + CDS 6273 - 6893 581 ## COG1280 Putative threonine efflux protein + Term 6902 - 6943 6.4 - Term 6890 - 6931 2.6 7 5 Op 1 4/0.600 - CDS 6955 - 7575 807 ## COG1280 Putative threonine efflux protein 8 5 Op 2 5/0.600 - CDS 7639 - 9474 1899 ## COG0514 Superfamily II DNA helicase - Prom 9497 - 9556 4.5 - Term 9484 - 9517 4.5 9 6 Tu 1 . - CDS 9601 - 10470 986 ## COG2829 Outer membrane phospholipase A - Prom 10583 - 10642 4.3 + Prom 10547 - 10606 5.2 10 7 Op 1 1/0.900 + CDS 10635 - 11102 510 ## COG2050 Uncharacterized protein, possibly involved in aromatic compounds catabolism 11 7 Op 2 . + CDS 11154 - 12044 1033 ## COG2962 Predicted permeases + Term 12070 - 12107 1.7 + Prom 12047 - 12106 3.2 12 8 Op 1 . + CDS 12139 - 12519 96 ## JW5590 predicted inner membrane protein 13 8 Op 2 . + CDS 12548 - 12913 286 ## B21_03645 hypothetical protein - Term 12908 - 12944 7.3 14 9 Tu 1 . - CDS 12956 - 13906 1213 ## COG0598 Mg2+ and Co2+ transporters 15 10 Tu 1 . 
+ CDS 14276 - 15040 755 ## COG3698 Predicted periplasmic protein - Term 15149 - 15173 -1.0 16 11 Tu 1 5/0.600 - CDS 15187 - 17349 2285 ## COG0210 Superfamily I DNA and RNA helicases - Term 17360 - 17403 5.0 17 12 Op 1 8/0.100 - CDS 17433 - 18149 514 ## COG1011 Predicted hydrolase (HAD superfamily) 18 12 Op 2 10/0.100 - CDS 18149 - 19045 933 ## COG4973 Site-specific recombinase XerC 19 12 Op 3 4/0.600 - CDS 19042 - 19749 662 ## COG3159 Uncharacterized protein conserved in bacteria 20 12 Op 4 1/0.900 - CDS 19746 - 20570 811 ## COG0253 Diaminopimelate epimerase 21 12 Op 5 . - CDS 20607 - 20810 229 ## COG5567 Predicted small periplasmic lipoprotein - Prom 20917 - 20976 4.0 + Prom 21190 - 21249 5.3 22 13 Tu 1 . + CDS 21273 - 21593 487 ## COG1965 Protein implicated in iron transport, frataxin homolog 23 14 Tu 1 . - CDS 21633 - 24179 2421 ## COG3072 Adenylate cyclase - Prom 24333 - 24392 5.3 + Prom 24398 - 24457 3.9 24 15 Op 1 23/0.000 + CDS 24566 - 25507 1087 ## COG0181 Porphobilinogen deaminase 25 15 Op 2 10/0.100 + CDS 25504 - 26244 544 ## COG1587 Uroporphyrinogen-III synthase 26 15 Op 3 11/0.100 + CDS 26266 - 27453 1367 ## COG2959 Uncharacterized enzyme of heme biosynthesis 27 15 Op 4 . 
+ CDS 27456 - 28652 1359 ## COG3071 Uncharacterized enzyme of heme biosynthesis + Term 28661 - 28699 4.0 Predicted protein(s) >gi|296494574|gb|ADTN01000164.1| GENE 1 18 - 2279 2661 753 aa, chain - ## HITS:1 COG:metE KEGG:ns NR:ns ## COG: metE COG0620 # Protein_GI_number: 16131678 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase II (cobalamin-independent) # Organism: Escherichia coli K12 # 1 753 1 753 753 1530 100.0 0 MTILNHTLGFPRVGLRRELKKAQESYWAGNSTREELLAVGRELRARHWDQQKQAGIDLLP VGDFAWYDHVLTTSLLLGNVPARHQNKDGSVDIDTLFRIGRGRAPTGEPAAAAEMTKWFN TNYHYMVPEFVKGQQFKLTWTQLLDEVDEALALGHKVKPVLLGPVTWLWLGKVKGEQFDR LSLLNDILPVYQQVLAELAKRGIEWVQIDEPALVLELPQAWLDAYKPAYDALQGQVKLLL TTYFEGVTPNLDTITALPVQGLHVDLVHGKDDVAELHKRLPSDWLLSAGLINGRNVWRAD LTEKYAQIKDIVGKRDLWVASSCSLLHSPIDLSVETRLDAEVKSWFAFALQKCHELALLR DALNSGDTAALAEWSAPIQARRHSTRVHNPAVEKRLAAITAQDSQRANVYEVRAEAQRAR FKLPAWPTTTIGSFPQTTEIRTLRLDFKKGNLDANNYRTGIAEHIKQAIVEQERLGLDVL VHGEAERNDMVEYFGEHLDGFVFTQNGWVQSYGSRCVKPPIVIGDISRPAPITVEWAKYA QSLTDKPVKGMLTGPVTILCWSFPREDVSRETIAKQIALALRDEVADLEAAGIGIIQIDE PALREGLPLRRSDWDAYLQWGVEAFRINAAVAKDDTQIHTHMCYCEFNDIMDSIAALDAD VITIETSRSDMELLESFEEFDYPNEIGPGVYDIHSPNVPSVEWIEALLKKAAKRIPAERL WVNPDCGLKTRGWPETRAALANMVQAAQNLRRG >gi|296494574|gb|ADTN01000164.1| GENE 2 2516 - 3469 678 317 aa, chain + ## HITS:1 COG:ECs4758 KEGG:ns NR:ns ## COG: ECs4758 COG0583 # Protein_GI_number: 15834012 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 317 1 317 317 614 100.0 1e-176 MIEVKHLKTLQALRNCGSLAAAAATLHQTQSALSHQFSDLEQRLGFRLFVRKSQPLRFTP QGEILLQLANQVLPQISQALQACNEPQQTRLRIAIECHSCIQWLTPALENFHKNWPQVEM DFKSGVTFDPQPALQQGELDLVMTSDILPRSGLHYSPMFDYEVRLVLAPDHPLAAKTRIT PEDLASETLLIYPVQRSRLDVWRHFLQPAGVSPSLKSVDNTLLLIQMVAARMGIAALPHW VVESFERQGLVVTKTLGEGLWSRLYAAVRDGEQRQPVTEAFIRSARNHACDHLPFVKSAE RPTYDAPTVRPGSPARL >gi|296494574|gb|ADTN01000164.1| GENE 3 3357 - 4256 1136 299 aa, chain - ## HITS:1 COG:ECs4757 KEGG:ns NR:ns ## COG: ECs4757 COG0697 # 
Protein_GI_number: 15834011 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli O157:H7 # 1 299 1 299 299 540 100.0 1e-153 MALLIITTILWAFSFSFYGEYLAGHVDSYFAVLVRVGLAALVFLPFLRTRGNSLKTVGLY MLVGAMQLGVMYMLSFRAYLYLTVSELLLFTVLTPLYITLIYDIMSKRRLRWGYAFSALL AVIGAGIIRYDQVTDHFWTGLLLVQLSNITFAIGMVGYKRLMETRPMPQHNAFAWFYLGA FLVAVIAWFLLGNAQKMPQTTLQWGILVFLGVVASGIGYFMWNYGATQVDAGTLGIMNNM HVPAGLLVNLAIWHQQPHWPTFITGALVILASLWVHRKWVAPRSSQTADDRRRDCALSE >gi|296494574|gb|ADTN01000164.1| GENE 4 4332 - 5132 952 266 aa, chain - ## HITS:1 COG:Z5347 KEGG:ns NR:ns ## COG: Z5347 COG0561 # Protein_GI_number: 15804418 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Escherichia coli O157:H7 EDL933 # 1 266 40 305 305 556 99.0 1e-158 MYQVVASDLDGTLLSPDHTLSPYAKETLKLLTARGINFVFATGRHHVDVGQIRDNLEIKS YMITSNGARVHDLDGNLIFAHNLDRDIASDLFGVVNDNPDIITNVYRDDEWFMNRHRPEE MRFFKEAVFQYALYEPGLLEPEGVSKVFFTCDSHEQLLPLEQAINARWGDRVNVSFSTLT CLEVMAGGVSKGHALEAVAKKLGYSLKDCIAFGDGMNDAEMLSMAGKGCIMGSAHQRLKD LHPELEVIGTNADDAVPHYLRKLYLS >gi|296494574|gb|ADTN01000164.1| GENE 5 5140 - 6162 637 340 aa, chain - ## HITS:1 COG:ECs4755 KEGG:ns NR:ns ## COG: ECs4755 COG2267 # Protein_GI_number: 15834009 # Func_class: I Lipid transport and metabolism # Function: Lysophospholipase # Organism: Escherichia coli O157:H7 # 1 340 1 340 340 699 99.0 0 MFQQQKDWETRENAFAAFTMGPLTDFWRQRDEAEFTGVDDIPVRFVRFRAQHHDRVVVIC PGRIESYVKYAELAYDLFHLGFDVLIIDHRGQGRSGRLLADPHLGHVNRFNDYVDDLAAF WQQEVQPGPWRKRYILAHSMGGAISTLFLQRHPGVCDAIALTAPMFGIVIRMPSFMARQI LNWAEAHPRFRDGYAIGTGRWRALPFAINVLTHSRQRYRRNLRFYADDPTIRVGGPTYHW VRESILAGEQVLAGAGDDATPTLLLQAEEERVVDNRMHDRFCELRTAAGHPVEGGRPLVI KGAYHEILFEKDAMRSVALHAIVDFFNRHNSPSGNRSTEV >gi|296494574|gb|ADTN01000164.1| GENE 6 6273 - 6893 581 206 aa, chain + ## HITS:1 COG:ECs4754 KEGG:ns NR:ns ## COG: ECs4754 
COG1280 # Protein_GI_number: 15834008 # Func_class: E Amino acid transport and metabolism # Function: Putative threonine efflux protein # Organism: Escherichia coli O157:H7 # 1 206 1 206 206 352 100.0 3e-97 MTLEWWFAYLLTSIILSLSPGSGAINTMTTSLNHGYRGAVASIAGLQTGLAIHIVLVGVG LGTLFSRSVIAFEVLKWAGAAYLIWLGIQQWRAAGAIDLKSLASTQSRRHLFQRAVFVNL TNPKSIVFLAALFPQFIMPQQPQLMQYIVLGVTTIVVDIIVMIGYATLAQRIALWIKGPK QMKALNKIFGSLFMLVGALLASARHA >gi|296494574|gb|ADTN01000164.1| GENE 7 6955 - 7575 807 206 aa, chain - ## HITS:1 COG:ECs4753 KEGG:ns NR:ns ## COG: ECs4753 COG1280 # Protein_GI_number: 15834007 # Func_class: E Amino acid transport and metabolism # Function: Putative threonine efflux protein # Organism: Escherichia coli O157:H7 # 1 206 1 206 206 364 100.0 1e-101 MLMLFLTVAMVHIVALMSPGPDFFFVSQTAVSRSRKEAMMGVLGITCGVMVWAGIALLGL HLIIEKMAWLHTLIMVGGGLYLCWMGYQMLRGALKKEAVSAPAPQVELAKSGRSFLKGLL TNLANPKAIIYFGSVFSLFVGDNVGTTARWGIFALIIVETLAWFTVVASLFALPQMRRGY QRLAKWIDGFAGALFAGFGIHLIISR >gi|296494574|gb|ADTN01000164.1| GENE 8 7639 - 9474 1899 611 aa, chain - ## HITS:1 COG:ECs4752 KEGG:ns NR:ns ## COG: ECs4752 COG0514 # Protein_GI_number: 15834006 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Escherichia coli O157:H7 # 1 611 1 611 611 1255 100.0 0 MNVAQAEVLNLESGAKQVLQETFGYQQFRPGQEEIIDTVLSGRDCLVVMPTGGGKSLCYQ IPALLLNGLTVVVSPLISLMKDQVDQLQANGVAAACLNSTQTREQQLEVMTGCRTGQIRL LYIAPERLMLDNFLEHLAHWNPVLLAVDEAHCISQWGHDFRPEYAALGQLRQRFPTLPFM ALTATADDTTRQDIVRLLGLNDPLIQISSFDRPNIRYMLMEKFKPLDQLMRYVQEQRGKS GIIYCNSRAKVEDTAARLQSKGISAAAYHAGLENNVRADVQEKFQRDDLQIVVATVAFGM GINKPNVRFVVHFDIPRNIESYYQETGRAGRDGLPAEAMLFYDPADMAWLRRCLEEKPQG QLQDIERHKLNAMGAFAEAQTCRRLVLLNYFGEGRQEPCGNCDICLDPPKQYDGSTDAQI ALSTIGRVNQRFGMGYVVEVIRGANNQRIRDYGHDKLKVYGMGRDKSHEHWVSVIRQLIH LGLVTQNIAQHSALQLTEAARPVLRGESSLQLAVPRIVALKPKAMQKSFGGNYDRKLFAK LRKLRKSIADESNVPPYVVFNDATLIEMAEQMPITASEMLSVNGVGMRKLERFGKPFMAL IRAHVDGDDEE >gi|296494574|gb|ADTN01000164.1| GENE 9 9601 - 10470 986 289 aa, chain - ## HITS:1 
COG:ECs4751 KEGG:ns NR:ns ## COG: ECs4751 COG2829 # Protein_GI_number: 15834005 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane phospholipase A # Organism: Escherichia coli O157:H7 # 1 289 1 289 289 572 100.0 1e-163 MRTLQGWLLPVFMLPMAVYAQEATVKEVHDAPAVRGSIIANMLQEHDNPFTLYPYDTNYL IYTQTSDLNKEAIASYDWAENARKDEVKFQLSLAFPLWRGILGPNSVLGASYTQKSWWQL SNSEESSPFRETNYEPQLFLGFATDYRFAGWTLRDVEMGYNHDSNGRSDPTSRSWNRLYT RLMAENGNWLVEVKPWYVVGNTDDNPDITKYMGYYQLKIGYHLGDAVLSAKGQYNWNTGY GGAELGLSYPITKHVRLYTQVYSGYGESLIDYNFNQTRVGVGVMLNDLF >gi|296494574|gb|ADTN01000164.1| GENE 10 10635 - 11102 510 155 aa, chain + ## HITS:1 COG:ECs4750 KEGG:ns NR:ns ## COG: ECs4750 COG2050 # Protein_GI_number: 15834004 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein, possibly involved in aromatic compounds catabolism # Organism: Escherichia coli O157:H7 # 1 155 7 161 161 295 100.0 2e-80 MSAVLTAEQALKLVGEMFVYHMPFNRALGMELERYEKEFAQLAFKNQPMMVGNWAQSILH GGVIASALDVAAGLVCVGSTLTRHETISEDELRQRLSRMGTIDLRVDYLRPGRGERFTAT SSLLRAGNKVAVARVELHNEEQLYIASATATYMVG >gi|296494574|gb|ADTN01000164.1| GENE 11 11154 - 12044 1033 296 aa, chain + ## HITS:1 COG:ECs4749 KEGG:ns NR:ns ## COG: ECs4749 COG2962 # Protein_GI_number: 15834003 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Escherichia coli O157:H7 # 1 295 1 295 295 508 99.0 1e-144 MDAKQTRQGVLLALAAYFIWGIAPAYFKLIYYVPADEILTHRVIWSFFFMVVLMSICRQW SYLKTLIQTPQKIFMLAVSAVLIGGNWLLFIWAVNNHHMLEASLGYFINPLVNIVLGMIF LGERFRRMQWLAVILAICGVLVQLWTFGSLPIIALGLAFSFAFYGLVRKKIAVEAQTGML IETMWLLPVAAIYLFAIADSSTSHMGQNPMSLNLLLIAAGIVTTVPLLCFTAAATRLRLS TLGFFQYIGPTLMFLLAVTFYGEKPGADKMVTFAFIWVALAIFVMDAIYTQRRTSK >gi|296494574|gb|ADTN01000164.1| GENE 12 12139 - 12519 96 126 aa, chain + ## HITS:1 COG:no KEGG:JW5590 NR:ns ## KEGG: JW5590 # Name: yigG # Def: predicted inner membrane protein # Organism: E.coli_J # Pathway: not_defined # 1 126 1 126 126 182 100.0 3e-45 
MLRIFIPTSNGKISRRRYIFSFILINFIFAFLIIFFNDGEAGFLVIVSTIVLHYLVINMN CQRLRDSGFIYIKTYVFGTLAVYIISIITMIAEDFACSGNGSMIFLICYFSTFSMLMLAP TDSSKQ >gi|296494574|gb|ADTN01000164.1| GENE 13 12548 - 12913 286 121 aa, chain + ## HITS:1 COG:no KEGG:B21_03645 NR:ns ## KEGG: B21_03645 # Name: yigF # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 121 6 126 126 218 100.0 4e-56 MNDGSLSEKWKYRFNFYDQHGFPGFWGATPEYKAAFKALKVRQRLTIQMNFIAFFCSWIY LFVLGLWKKAIIVLLLGILSLFVGALIGVNILGIAVAAYVAVNTNKWFYEKEVKGLNTWS L >gi|296494574|gb|ADTN01000164.1| GENE 14 12956 - 13906 1213 316 aa, chain - ## HITS:1 COG:ECs4746 KEGG:ns NR:ns ## COG: ECs4746 COG0598 # Protein_GI_number: 15834000 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Escherichia coli O157:H7 # 1 316 1 316 316 607 100.0 1e-174 MLSAFQLENNRLTRLEVEESQPLVNAVWIDLVEPDDDERLRVQSELGQSLATRPELEDIE ASARFFEDDDGLHIHSFFFFEDAEDHAGNSTVAFTIRDGRLFTLRERELPAFRLYRMRAR SQSMVDGNAYELLLDLFETKIEQLADEIENIYSDLEQLSRVIMEGHQGDEYDEALSTLAE LEDIGWKVRLCLMDTQRALNFLVRKARLPGGQLEQAREILRDIESLLPHNESLFQKVNFL MQAAMGFINIEQNRIIKIFSVVSVVFLPPTLVASSYGMNFEFMPELKWSFGYPGAIIFMI LAGLAPYLYFKRKNWL >gi|296494574|gb|ADTN01000164.1| GENE 15 14276 - 15040 755 254 aa, chain + ## HITS:1 COG:yigEm KEGG:ns NR:ns ## COG: yigEm COG3698 # Protein_GI_number: 16132262 # Func_class: S Function unknown # Function: Predicted periplasmic protein # Organism: Escherichia coli K12 # 1 254 1 254 254 517 100.0 1e-147 MAHQLLIGKGMITLNLKRIFLALTLLPLFAVAADDCALSDPTLTVQAYTVNPQTERVKMY WQKANGEAWGTLHALLADINSQGQVQMAMNGGIYDESYAPLGLYIENGQQKVALNLASGE GNFFIRPGGVFYVAGDKVGIVRLDAFKTSKEIQFAVQSGPMLMENGVINPRIHPNVASSK IRNGVGINKHGNAVFLLSQQATNFYDFACYAKAKLNVEQLLYLDGTISHMYMKGGAIPWQ RYPFVTMISVERKG >gi|296494574|gb|ADTN01000164.1| GENE 16 15187 - 17349 2285 720 aa, chain - ## HITS:1 COG:uvrD KEGG:ns NR:ns ## COG: uvrD COG0210 # Protein_GI_number: 16131665 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: 
Escherichia coli K12 # 1 720 1 720 720 1436 100.0 0 MDVSYLLDSLNDKQREAVAAPRSNLLVLAGAGSGKTRVLVHRIAWLMSVENCSPYSIMAV TFTNKAAAEMRHRIGQLMGTSQGGMWVGTFHGLAHRLLRAHHMDANLPQDFQILDSEDQL RLLKRLIKAMNLDEKQWPPRQAMWYINSQKDEGLRPHHIQSYGNPVEQTWQKVYQAYQEA CDRAGLVDFAELLLRAHELWLNKPHILQHYRERFTNILVDEFQDTNNIQYAWIRLLAGDT GKVMIVGDDDQSIYGWRGAQVENIQRFLNDFPGAETIRLEQNYRSTSNILSAANALIENN NGRLGKKLWTDGADGEPISLYCAFNELDEARFVVNRIKTWQDNGGALAECAILYRSNAQS RVLEEALLQASMPYRIYGGMRFFERQEIKDALSYLRLIANRNDDAAFERVVNTPTRGIGD RTLDVVRQTSRDRQLTLWQACRELLQEKALAGRAASALQRFMELIDALAQETADMPLHVQ TDRVIKDSGLRTMYEQEKGEKGQTRIENLEELVTATRQFSYNEEDEDLMPLQAFLSHAAL EAGEGQADTWQDAVQLMTLHSAKGLEFPQVFIVGMEEGMFPSQMSLDEGGRLEEERRLAY VGVTRAMQKLTLTYAETRRLYGKEVYHRPSRFIGELPEECVEEVRLRATVSRPVSHQRMG TPMVENDSGYKLGQRVRHAKFGEGTIVNMEGSGEHSRLQVAFQGQGIKWLVAAYARLESV >gi|296494574|gb|ADTN01000164.1| GENE 17 17433 - 18149 514 238 aa, chain - ## HITS:1 COG:yigB KEGG:ns NR:ns ## COG: yigB COG1011 # Protein_GI_number: 16131664 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Escherichia coli K12 # 1 238 1 238 238 486 100.0 1e-137 MRFYRPLGRISALTFDLDDTLYDNRPVILRTEREALTFVQNYHPALRSFQNEDLQRLRQA VREAEPEIYHDVTRWRFRSIEQAMLDAGLSAEEASAGAHAAMINFAKWRSRIDVPQQTHD TLKQLAKKWPLVAITNGNAQPELFGLGDYFEFVLRAGPHGRSKPFSDMYFLAAEKLNVPI GEILHVGDDLTTDVGGAIRSGMQACWIRPENGDLMQTWDSRLLPHLEISRLASLTSLI >gi|296494574|gb|ADTN01000164.1| GENE 18 18149 - 19045 933 298 aa, chain - ## HITS:1 COG:xerC KEGG:ns NR:ns ## COG: xerC COG4973 # Protein_GI_number: 16131663 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerC # Organism: Escherichia coli K12 # 1 298 1 298 298 589 100.0 1e-168 MTDLHTDVERYLRYLSVERQLSPITLLNYQRQLEAIINFASENGLQSWQQCDVTMVRNFA VRSRRKGLGAASLALRLSALRSFFDWLVSQNELKANPAKGVSAPKAPRHLPKNIDVDDMN RLLDIDINDPLAVRDRAMLEVMYGAGLRLSELVGLDIKHLDLESGEVWVMGKGSKERRLP IGRNAVAWIEHWLDLRDLFGSEDDALFLSKLGKRISARNVQKRFAEWGIKQGLNNHVHPH KLRHSFATHMLESSGDLRGVQELLGHANLSTTQIYTHLDFQHLASVYDAAHPRAKRGK 
>gi|296494574|gb|ADTN01000164.1| GENE 19 19042 - 19749 662 235 aa, chain - ## HITS:1 COG:yigA KEGG:ns NR:ns ## COG: yigA COG3159 # Protein_GI_number: 16131662 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 235 1 235 235 445 100.0 1e-125 MKQPGEELQETLTELDDRAVVDYLIKNPEFFIRNARAVEAIRVPHPVRGTVSLVEWHMAR ARNHIHVLEENMALLMEQAIANEGLFYRLLYLQRSLTAASSLDDMLMRFHRWARDLGLAG ASLRLFPDRWRLGAPSNHTHLALSRQSFEPLRIQRLGQEQHYLGPLNGPELLVVLPEAKA VGSVAMSMLGSDADLGVVLFTSRDASHYQQGQGTQLLHEIALMLPELLERWIERV >gi|296494574|gb|ADTN01000164.1| GENE 20 19746 - 20570 811 274 aa, chain - ## HITS:1 COG:ECs4739 KEGG:ns NR:ns ## COG: ECs4739 COG0253 # Protein_GI_number: 15833993 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Escherichia coli O157:H7 # 1 274 2 275 275 568 100.0 1e-162 MQFSKMHGLGNDFMVVDAVTQNVFFSPELIRRLADRHLGVGFDQLLVVEPPYDPELDFHY RIFNADGSEVAQCGNGARCFARFVRLKGLTNKRDIRVSTANGRMVLTVTDDDLVRVNMGE PNFEPSAVPFRANKAEKTYIMRAAEQTILCGVVSMGNPHCVIQVDDVDTAAVETLGPVLE SHERFPERANIGFMQVVKREHIRLRVYERGAGETQACGSGACAAVAVGIQQGLLAEEVRV ELPGGRLDIAWKGPGHPLYMTGPAVHVYDGFIHL >gi|296494574|gb|ADTN01000164.1| GENE 21 20607 - 20810 229 67 aa, chain - ## HITS:1 COG:Z5325 KEGG:ns NR:ns ## COG: Z5325 COG5567 # Protein_GI_number: 15804397 # Func_class: N Cell motility # Function: Predicted small periplasmic lipoprotein # Organism: Escherichia coli O157:H7 EDL933 # 1 67 1 67 67 109 100.0 1e-24 MKNVFKALTVLLTLFSLTGCGLKGPLYFPPADKNAPPPTKPVETQTQSTVPDKNDRATGD GPSQVNY >gi|296494574|gb|ADTN01000164.1| GENE 22 21273 - 21593 487 106 aa, chain + ## HITS:1 COG:STM3943 KEGG:ns NR:ns ## COG: STM3943 COG1965 # Protein_GI_number: 16767214 # Func_class: P Inorganic ion transport and metabolism # Function: Protein implicated in iron transport, frataxin homolog # Organism: Salmonella typhimurium LT2 # 1 106 1 106 106 193 94.0 5e-50 MNDSEFHRLADQLWLTIEERLDDWDGDSDIDCEINGGVLTITFENGSKIIINRQEPLHQV 
WLATKQGGYHFDLKGDEWICDRSGETFWDLLEQAATQQAGETVSFR >gi|296494574|gb|ADTN01000164.1| GENE 23 21633 - 24179 2421 848 aa, chain - ## HITS:1 COG:cyaA KEGG:ns NR:ns ## COG: cyaA COG3072 # Protein_GI_number: 16131658 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate cyclase # Organism: Escherichia coli K12 # 1 848 1 848 848 1780 99.0 0 MYLYIETLKQRLDAINQLRVDRALAAMGPAFQQVYSLLPTLLHYHHPLMPGYLDGNVPKG ICLYTPDETQRHYLNELELYRGMSVQEPPKGELPITGVYTMGSTSSVGQSCSSDLDIWVC HQSWLDSEERQLLQRKCSLLENWAASLGVEVSFFLIDENRFRHNESGSLGGEDCGSTQHI LLLDEFYRTAVRLAGKRILWNMVPCDEEEHYDDYVMTLYAQGVLTPNEWLDLGGLSSLSA EEYFGASLWQLYKSIDSPYKAVLKTLLLEAYSWEYPNPRLLAKDIKQRLHDGEIVSFGLD PYCMMLERVTEYLTAIEDFTRLDLVRRCFYLKVCEKLSRERACVGWRRAVLSQLVSEWGW DEARLAMLDNRANWKIDQVREAHNELLDAMMQSYRNLIRFARRNNLSVSASPQDIGVLTR KLYAAFEALPGKVTLVNPQISPDLSEPNLTFIYVPPGRANRSGWYLYNRAPNIESIISHQ PLEYNRYLNKLVAWAWFNGLLTSRTRLYIKGNGIVDLPKLQEMVADVSHHFPLRLPAPTP KALYSPCEIRHLAIIVNLEYDPTAAFRNQVVHFDFRKLDVFSFGENQNCLVGSVDLLYRN SWNEVRTLHFNGEQSMIEALKTILGKMHQDAAPPDSVEVFCYSQHLRGLIRTRVQQLVSE CIELRLSSTRQETGRFKALRVSGQTWGLFFERLNVSVQKLENAIEFYGAISHNKLHGLSV QVETNHVKLPAVVDGFASEGIIQFFFEETQDENGFNIYILDESNRVEVYHHCEGSKEELV RDVSRFYSSSHDRFTYGSSFINFNLPQFYQIVKVDGREQVIPFRTKSIGNMPPANQDHDT PLLQQYFS >gi|296494574|gb|ADTN01000164.1| GENE 24 24566 - 25507 1087 313 aa, chain + ## HITS:1 COG:ECs4735 KEGG:ns NR:ns ## COG: ECs4735 COG0181 # Protein_GI_number: 15833989 # Func_class: H Coenzyme transport and metabolism # Function: Porphobilinogen deaminase # Organism: Escherichia coli O157:H7 # 1 313 8 320 320 594 99.0 1e-170 MLDNVLRIATRQSPLALWQAHYVKDKLMASHPGLVVELVPMVTRGDVILDTPLAKVGGKG LFVKELEVALLENRADIAVHSMKDVPVEFPQGLGLVTICEREDPRDAFVSNNYDSLDALP AGSIVGTSSLRRQCQLAERRPDLIIRSLRGNVGTRLSKLDNGEYDAIILAVAGLKRLGLE SRIRAALPPEISLPAVGQGAVGIECRLDDSRTRELLAALNHHETALRVTAERAMNTRLEG GCQVPIGSYAELIDGEIWLRALVGAPDGSQIIRGERRGAPQDAEQMGISLAEELLNNGAR EILAEVYNGDAPA >gi|296494574|gb|ADTN01000164.1| GENE 25 25504 - 26244 544 246 aa, chain + ## HITS:1 COG:hemD KEGG:ns NR:ns ## COG: hemD COG1587 # 
Protein_GI_number: 16131656 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III synthase # Organism: Escherichia coli K12 # 1 246 1 246 246 473 100.0 1e-133 MSILVTRPSPAGEELVSRLRTLGQVAWHFPLIEFSPGQQLPQLADQLAALGESDLLFALS QHAVAFAQSQLHQQDRKWPRLPDYFAIGRTTALALHTVSGQKILYPQDREISEVLLQLPE LQNIAGKRALILRGNGGRELIGDTLTARGAEVTFCECYQRCAIHYDGAEEAMRWQAREVT MVVVTSGEMLQQLWSLIPQWYREHWLLHCRLLVVSERLAKLARELGWQDIKVADNADNDA LLRALQ >gi|296494574|gb|ADTN01000164.1| GENE 26 26266 - 27453 1367 395 aa, chain + ## HITS:1 COG:hemX KEGG:ns NR:ns ## COG: hemX COG2959 # Protein_GI_number: 16131655 # Func_class: H Coenzyme transport and metabolism # Function: Uncharacterized enzyme of heme biosynthesis # Organism: Escherichia coli K12 # 1 395 1 393 393 585 99.0 1e-167 MTEQEKTSAVVEETREAVDTTSQPVATEKKSKNNTALILSAVAIAIALAAGIGLYGWGKQ QAVNQTATSDALANQLTALQKAQESQKAELEGIIKQQAAQLKQANRQQETLAKQLDEVQQ KVATISGSDAKTWLLAQADFLVKLAGRKLWSDQDVTTAAALLKSADASLADMNDPSLITV RRAITDDIASLSAVSQVDYDGIILKLNQLSNQVDNLRLADNDSDGSPMDSDGEELSSSIS EWRINLQKSWQNFMDNFITIRRRDDTAVPLLAPNQDIYLRENIRSRLLVAAQAVPRHQEE TYRQALENVSTWVRAYYDTDDATTKAFLDEVDQLSQQNISMDLPETLQSQAMLEKLMQTR VRNLLAQPAAGTTEAKPAPAPAPQADTPAAAPQGE >gi|296494574|gb|ADTN01000164.1| GENE 27 27456 - 28652 1359 398 aa, chain + ## HITS:1 COG:ECs4732 KEGG:ns NR:ns ## COG: ECs4732 COG3071 # Protein_GI_number: 15833986 # Func_class: H Coenzyme transport and metabolism # Function: Uncharacterized enzyme of heme biosynthesis # Organism: Escherichia coli O157:H7 # 12 398 12 398 398 728 100.0 0 MLKVLLLFVLLIAGIVVGPMIAGHQGYVLIQTDNYNIETSVTGLAIILILAMVVLFAIEW LLRRIFRTGAHTRGWFVGRKRRRARKQTEQALLKLAEGDYQQVEKLMAKNADHAEQPVVN YLLAAEAAQQRGDEARANQHLERAAELAGNDTIPVEITRVRLQLARNENHAARHGVDKLL EVTPRHPEVLRLAEQAYIRTGAWSSLLDIIPSMAKAHVGDEEHRAMLEQQAWIGLMDQAR ADNGSEGLRNWWKNQSRKTRHQVALQVAMAEHLIECDDHDTAQQIIIDGLKRQYDDRLLL PIPRLKTNNPEQLEKVLRQQIKNVGDRPLLWSTLGQSLMKHGEWQEASLAFRAALKQRPD AYDYAWLADALDRLHKPEEAAAMRRDGLMLTLQNNPPQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:42:33 
2011 Seq name: gi|296494573|gb|ADTN01000165.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont305.5, whole genome shotgun sequence Length of sequence - 27134 bp Number of predicted genes - 23, with homology - 23 Number of transcription units - 11, operones - 3 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 433 - 492 8.7 1 1 Tu 1 . + CDS 619 - 2274 1584 ## COG3119 Arylsulfatase A and related enzymes 2 2 Tu 1 . - CDS 2433 - 3668 691 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) - Prom 3707 - 3766 3.2 - TRNA 3815 - 3891 92.7 # Pro TGG 0 0 - TRNA 3934 - 4020 69.1 # Leu CAG 0 0 - TRNA 4041 - 4116 84.9 # His GTG 0 0 - TRNA 4175 - 4251 89.5 # Arg CCG 0 0 3 3 Tu 1 . - CDS 4354 - 5739 1539 ## COG1113 Gamma-aminobutyrate permease and related permeases - Prom 5840 - 5899 4.5 - Term 5875 - 5911 7.6 4 4 Op 1 . - CDS 5930 - 6670 888 ## COG1922 Teichoic acid biosynthesis proteins 5 4 Op 2 . - CDS 6673 - 8025 1444 ## SSON_3966 putative common antigen polymerase 6 4 Op 3 . - CDS 8022 - 8990 826 ## EcolC_4210 4-alpha-L-fucosyltransferase 7 5 Op 1 6/0.000 - CDS 9098 - 10348 1187 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 8 5 Op 2 4/0.250 - CDS 10350 - 11480 1278 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 9 5 Op 3 4/0.250 - CDS 11485 - 12159 446 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 10 5 Op 4 16/0.000 - CDS 12137 - 13018 1011 ## COG1209 dTDP-glucose pyrophosphorylase 11 5 Op 5 4/0.250 - CDS 13037 - 14104 1212 ## COG1088 dTDP-D-glucose 4,6-dehydratase 12 5 Op 6 10/0.000 - CDS 14104 - 15366 1230 ## COG0677 UDP-N-acetyl-D-mannosaminuronate dehydrogenase 13 5 Op 7 4/0.250 - CDS 15363 - 16493 927 ## COG0381 UDP-N-acetylglucosamine 2-epimerase - Term 16501 - 16538 9.2 14 6 Op 1 5/0.250 - CDS 16549 - 17595 987 ## COG3765 Chain length determinant protein 15 6 Op 2 . 
- CDS 17607 - 18710 872 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 16 6 Op 3 . - CDS 18722 - 18940 169 ## ECO111_4609 hypothetical protein 17 6 Op 4 6/0.000 - CDS 18950 - 20209 1532 ## COG1158 Transcription termination factor - Prom 20391 - 20450 3.8 - Term 20450 - 20483 0.2 18 6 Op 5 . - CDS 20536 - 20865 165 ## PROTEIN SUPPORTED gi|124485582|ref|YP_001030198.1| ribosomal protein L12E/L44/L45/RPP1/RPP2-like protein - Prom 20976 - 21035 4.6 + Prom 20801 - 20860 2.4 19 7 Tu 1 . + CDS 20996 - 22261 1342 ## COG0513 Superfamily II DNA and RNA helicases 20 8 Tu 1 . + CDS 22397 - 23881 1644 ## COG0248 Exopolyphosphatase + Term 23891 - 23935 3.2 - Term 23826 - 23864 4.2 21 9 Tu 1 . - CDS 23928 - 25949 2183 ## COG0210 Superfamily I DNA and RNA helicases - Prom 25976 - 26035 3.2 + Prom 25943 - 26002 1.8 22 10 Tu 1 . + CDS 26166 - 26384 168 ## COG3692 Uncharacterized protein conserved in bacteria + Prom 26726 - 26785 4.8 23 11 Tu 1 . 
+ CDS 26813 - 27094 287 ## COG0760 Parvulin-like peptidyl-prolyl isomerase Predicted protein(s) >gi|296494573|gb|ADTN01000165.1| GENE 1 619 - 2274 1584 551 aa, chain + ## HITS:1 COG:aslA KEGG:ns NR:ns ## COG: aslA COG3119 # Protein_GI_number: 16131653 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Escherichia coli K12 # 1 551 1 551 551 1110 99.0 0 MEFSFSPKRLVVAVAAALPLMASAADTPSTATARKGFAGYDHPNQYLVKPATTIADNMMP VMQHPAQDKETQQKLAELEKKTGKKPNVVVFLLDDVGWMDVGFNGGGVAVGNPTPDIDAV ASQGLILTSAYSQPSSSPTRATILTGQYSIHHGILMPPMYGQPGGLQGLTTLPQLLHDQG YVTQAIGKWHMGENKESQPQNVGFDDFRGFNSVSDMYTEWRDVHVNPEVALSPDRSEYIK QLPFSKDDVHAVRGGEQQAIADITPKYMEDLDQRWMDYGVKFLDKMAKSDKPFFLYYGTR GCHFDNYPNAKYAGSSPARTSYGDCMVEMNDVFANLYKTLEKNGQLDNTLIVFTSDNGPE DEVPPHGRTPFRGAKGSTWEGGVRVPTFVYWKGMIQPRKSDGIVDLADLFPTALDLAGHP GAKVANLVPKTTFIDGVDQTSFFLGTNGQSNRKAEHYFLNGKLAAVRMDEFKYHVLIQQP YAYTQSGYQGGFTGTVMQTAGSSVFNLYTDPQESDSIGVRHIPMGVPLQTEMHAYMEILK KYPPRAQIKSD >gi|296494573|gb|ADTN01000165.1| GENE 2 2433 - 3668 691 411 aa, chain - ## HITS:1 COG:aslB KEGG:ns NR:ns ## COG: aslB COG0641 # Protein_GI_number: 16131652 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Escherichia coli K12 # 1 411 1 411 411 874 99.0 0 MLQQVPTRAFHVMAKPSGSDCNLNCDYCFYLEKQSLYREKPVTHMDDDTLEAYVRHYIAA SEPQNEVAFTWQGGEPTLLGLAFYRRAVALQAKYGAGRKISNSFQTNGVLLDDEWCAFLA EHHFLVGLSLDGPPEIHNQYRVTKGGRPTHKLVMRALTLLQKHHVDYNVLVCVNRTSAQQ PLQVYDFLCDAGVEFIQFIPVVERLADETTARDGLKLHAPGDIQGELTEWSVRPEEFGEF LVAIFDHWIKRDVGKIFVMNIEWAFANFVGAPGAVCHHQPTCGRSVIVEHNGDVYACDHY VYPQYRLGNMHQQTIAEMIDSPQQQAFGEDKFKQLPAQCRSCNVLKACWGGCPKHRFMLD ASGKPGLNYLCAGYQRYFRHLPPYLKAMADLLAHGRPASDIMHAHLLVVSK >gi|296494573|gb|ADTN01000165.1| GENE 3 4354 - 5739 1539 461 aa, chain - ## HITS:1 COG:ECs4729 KEGG:ns NR:ns ## COG: ECs4729 COG1113 # Protein_GI_number: 15833983 # Func_class: E Amino acid transport and metabolism # Function: Gamma-aminobutyrate permease and related permeases 
# Organism: Escherichia coli O157:H7 # 1 461 1 461 461 808 100.0 0 MADNKPELQRGLEARHIELIALGGTIGVGLFMGAASTLKWAGPSVLLAYIIAGLFVFFIM RSMGEMLFLEPVTGSFAVYAHRYMSPFFGYLTAWSYWFMWMAVGISEITAIGVYVQFWFP EMAQWIPALIAVALVALANLAAVRLYGEIEFWFAMIKVTTIIVMIVIGLGVIFFGFGNGG QSIGFSNLTEHGGFFAGGWKGFLTALCIVVASYQGVELIGITAGEAKNPQVTLRSAVGKV LWRILIFYVGAIFVIVTIFPWNEIGSNGSPFVLTFAKIGITAAAGIINFVVLTAALSGCN SGMYSCGRMLYALAKNRQLPAAMAKVSRHGVPVAGVAVSIAILLIGSCLNYIIPNPQRVF VYVYSASVLPGMVPWFVILISQLRFRRAHKAAIASHPFRSILFPWANYVTMAFLICVLIG MYFNEDTRMSLFVGIIFMLAVTAIYKVFGLNRHGKAHKLEE >gi|296494573|gb|ADTN01000165.1| GENE 4 5930 - 6670 888 246 aa, chain - ## HITS:1 COG:wecG KEGG:ns NR:ns ## COG: wecG COG1922 # Protein_GI_number: 16131650 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Teichoic acid biosynthesis proteins # Organism: Escherichia coli K12 # 1 246 1 246 246 474 99.0 1e-134 MNNNTPAPTYTLRGLQLIGWRDMQHALDYLFADGQLKQGTLVAINAEKMLTIEDNAEVRE LINAAEFKYADGISVVRSVRKKYPQAQVSRVAGADLWEELMARAGKEGTPVFLVGGKPEV LAQTEAKLRNQWNVNIVGSQDGYFKPEQRQALFERIHASGAQIVTVAMGSPKQEIIMRDC RLVHPDALYMGVGGTYDVFTGHVKRAPKIWQTLGLEWLYRLLSQPSRIKRQLRLLRYLRW HYTGNL >gi|296494573|gb|ADTN01000165.1| GENE 5 6673 - 8025 1444 450 aa, chain - ## HITS:1 COG:no KEGG:SSON_3966 NR:ns ## KEGG: SSON_3966 # Name: wecF # Def: putative common antigen polymerase # Organism: S.sonnei # Pathway: not_defined # 1 450 1 450 450 751 100.0 0 MSLLQFSGLFVVWLLCTLFIATLTWFEFRRVRFNFNVFFSLLFLLTFFFGFPLTSVLVFR FDVGVAPPEILLQALLSAGCFYAVYYVTYKTRLRKRVADVPRRPLFTMNRVETNLTWVIL MGIALVSVGIFFMHNGFLLFRLNSYSQIFSSEVSGVALKRFFYFFIPAMLVVYFLRQDSK AWLFFLVSTVAFGLLTYMIVGGTRANIIIAFAIFLFIGIIRGWISLWMLAAAGVLGIVGM FWLALKRYGMNVSGDEAFYTFLYLTRDTFSPWENLALLLQNYDNIDFQGLAPIVRDFYVF IPSWLWPGRPSMVLNSANYFTWEVLNNHSGLAISPTLIGSLVVMGGALFIPLGAIVVGLI IKWFDWLYELGNREPNRYKAAILHSFCFGAIFNMIVLAREGLDSFVSRVVFFIVVFGACL MIAKLLYWLFESAGLIHKRTKSSLRTQVEG >gi|296494573|gb|ADTN01000165.1| GENE 6 8022 - 8990 826 322 aa, chain - ## HITS:1 COG:no KEGG:EcolC_4210 NR:ns ## KEGG: EcolC_4210 # Name: not_defined # Def: 
4-alpha-L-fucosyltransferase # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 322 38 359 359 673 99.0 0 MVVGKDDGLSDSCPALSVQFFPGKKSLAEAVIAKAKANRQQRFFFHGQFNPTLWLALLSG GIKPSQFFWHIWGADLYELSSGLRYKLFYPLRRLAQKRVGCVFATRGDLSFFAKTHPKVR GELLFFPTRMDPSLNTMANDRQREGKMTILVGNSGDRSNEHIAALRAVHQQFGDTVKVVV PMGYPPNNEAYIEEVRQAGLELFSEENLQILSEKLEFDAYLALLRQCDLGYFIFARQQGI GTLCLLIQAGIPCVLNRENPFWQDMTEQHLPVLFTTDDLNEDIVREAQRQLASVDKNTIA FFSPNYLQGWQRALAIAAGEVA >gi|296494573|gb|ADTN01000165.1| GENE 7 9098 - 10348 1187 416 aa, chain - ## HITS:1 COG:wzxE KEGG:ns NR:ns ## COG: wzxE COG2244 # Protein_GI_number: 16131648 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Escherichia coli K12 # 1 416 1 416 416 697 100.0 0 MSLAKASLWTAASTLVKIGAGLLVGKLLAVSFGPAGLGLAANFRQLITVLGVLAGAGIFN GVTKYVAQYHDNPQQLRRVVGTSSAMVLGFSTLMALVFVLAAAPISQGLFGNTDYQGLVR LVALVQMGIAWGNLLLALMKGFRDAAGNALSLIVGSLIGVLAYYVSYRLGGYEGALLGLA LIPALVVIPAAIMLIKRGVIPLSYLKPSWDNGLAGQLSKFTLMALITSVTLPVAYIMMRK LLAAQYSWDEVGIWQGVSSISDAYLQFITASFSVYLLPTLSRLTEKRDITREVVKSLKFV LPAVAAASFTVWLLRDFAIWLLLSNKFTAMRDLFAWQLVGDVLKVGAYVFGYLVIAKASL RFYILAEVSQFTLLMVFAHWLIPAHGALGAAQAYMATYIVYFSLCCGVFLLWRRRA >gi|296494573|gb|ADTN01000165.1| GENE 8 10350 - 11480 1278 376 aa, chain - ## HITS:1 COG:wecE KEGG:ns NR:ns ## COG: wecE COG0399 # Protein_GI_number: 16131647 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Escherichia coli K12 # 1 376 1 376 376 799 100.0 0 MIPFNAPPVVGTELDYMQSAMGSGKLCGDGGFTRRCQQWLEQRFGSAKVLLTPSCTASLE MAALLLDIQPGDEVIMPSYTFVSTANAFVLRGAKIVFVDVRPDTMNIDETLIEAAITDKT RVIVPVHYAGVACEMDTIMALAKKHNLFVVEDAAQGVMSTYKGRALGTIGHIGCFSFHET KNYTAGGEGGATLINDKALIERAEIIREKGTNRSQFFRGQVDKYTWRDIGSSYLMSDLQA AYLWAQLEAADRINQQRLALWQNYYDALAPLAKAGRIELPSIPDGCVQNAHMFYIKLRDI DDRSALINFLKEAEIMAVFHYIPLHGCPAGEHFGEFHGEDRYTTKESERLLRLPLFYNLS PVNQRTVIATLLNYFS 
>gi|296494573|gb|ADTN01000165.1| GENE 9 11485 - 12159 446 224 aa, chain - ## HITS:1 COG:ECs4723 KEGG:ns NR:ns ## COG: ECs4723 COG0454 # Protein_GI_number: 15833977 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Escherichia coli O157:H7 # 44 224 1 181 181 322 99.0 3e-88 MPVRASIEPLTWENAFFGVNSAIVRITSEAPLLTPDALAPWSRVQAKIAASNTGELDALQ QLGFSLVEGEVDLALPVNNASDSGAVVAQETDIPALRQLASAAFAQSRFRAPWYAPDASS RFYAQWIENAVRGTFDHQCLILRAASGDIRGYVSLRELNATDARIGLLAGRGAGAELMQT ALNWAYARGKTTLRVATQMGNTAALKRYIQSGANVESTAYWLYR >gi|296494573|gb|ADTN01000165.1| GENE 10 12137 - 13018 1011 293 aa, chain - ## HITS:1 COG:rffH KEGG:ns NR:ns ## COG: rffH COG1209 # Protein_GI_number: 16131645 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Escherichia coli K12 # 1 293 1 293 293 596 100.0 1e-170 MKGIILAGGSGTRLHPITRGVSKQLLPIYDKPMIYYPLSVLMLAGIREILIITTPEDKGY FQRLLGDGSEFGIQLEYAEQPSPDGLAQAFIIGETFLNGEPSCLVLGDNIFFGQGFSPKL RHVAARTEGATVFGYQVMDPERFGVVEFDDNFRAISLEEKPKQPKSNWAVTGLYFYDSKV VEYAKQVKPSERGELEITSINQMYLEAGNLTVELLGRGFAWLDTGTHDSLIEASTFVQTV EKRQGFKIACLEEIAWRNGWLDDEGVKRAASSLAKTGYGQYLLELLRARPRQY >gi|296494573|gb|ADTN01000165.1| GENE 11 13037 - 14104 1212 355 aa, chain - ## HITS:1 COG:rffG KEGG:ns NR:ns ## COG: rffG COG1088 # Protein_GI_number: 16131644 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Escherichia coli K12 # 1 355 1 355 355 741 99.0 0 MRKILITGGAGFIGSALVRYIINETSDAVVVVDKLTYAGNLMSLAPVAQSERFAFEKVDI CDRAELARVFTEHQPDCVMHLAAESHVDRSIDGPAAFIETNIVGTYTLLEAARAYWNALT EDKKSAFRFHHISTDEVYGDLHSTDDFFTETTPYAPSSPYSASKASSDHLVRAWLRTYGL PTLITNCSNNYGPYHFPEKLIPLMILNALAGKSLPVYGNGQQIRDWLYVEDHARALYCVA TTGKVGETYNIGGHNERKNLDVVETICELLEELAPNKPHGVAHYRDLITFVADRPGHDLR YAIDASKIARELGWLPQETFESGMRKTVQWYLANESWWKQVQDGSYQGERLGLKG >gi|296494573|gb|ADTN01000165.1| GENE 12 14104 - 15366 1230 420 aa, chain - ## HITS:1 
COG:ECs4720 KEGG:ns NR:ns ## COG: ECs4720 COG0677 # Protein_GI_number: 15833974 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetyl-D-mannosaminuronate dehydrogenase # Organism: Escherichia coli O157:H7 # 1 420 1 420 420 850 100.0 0 MSFATISVIGLGYIGLPTAAAFASRQKQVIGVDINQHAVDTINRGEIHIVEPDLASVVKT AVEGGFLRASTTPVEADAWLIAVPTPFKGDHEPDMTYVESAARSIAPVLKKGALVILEST SPVGSTEKMAEWLAEMRPDLTFPQQVGEQADVNIAYCPERVLPGQVMVELIKNDRVIGGM TPVCSARASELYKIFLEGECVVTNSRTAEMCKLTENSFRDVNIAFANELSLICADQGINV WELIRLANRHPRVNILQPGPGVGGHCIAVDPWFIVAQNPQQARLIRTAREVNDHKPFWVI DQVKAAVADCLAATDKRASELKIACFGLAFKPNIDDLRESPAMEIAELIAQWHSGETLVV EPNIHQLPKKLTGLCTLAQLDEALATADVLVMLVDHSQFKVINGDNVHQQYVVDAKGVWR >gi|296494573|gb|ADTN01000165.1| GENE 13 15363 - 16493 927 376 aa, chain - ## HITS:1 COG:ECs4719 KEGG:ns NR:ns ## COG: ECs4719 COG0381 # Protein_GI_number: 15833973 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine 2-epimerase # Organism: Escherichia coli O157:H7 # 1 376 15 390 390 746 99.0 0 MKVLTVFGTRPEAIKMAPLVHALAKDPFFEAKVCVTAQHREMLDQVLKLFSIVPDYDLNI MQPGQGLTEITCRILEGLKPILAEFKPDVVLVHGDTTTTLATSLAAFYQRIPVGHVEAGL RTGDLYSPWPEEANRTLTGHLAMYHFSPTETSRQNLLRENVADSRIFITGNTVIDALLWV RDQVMSSDKLRSELAANYPFIDPDKKMILVTGHRRESFGRGFEEICHALADIATTHQDIQ IVYPVHLNPNVREPVNRILGHVKNVILIDPQEYLPFVWLMNHAWLILTDSGGIQEEAPSL GKPVLVMRDTTERPEAVTAGTVRLVGTDKQRIVEEVTRLLKDENEYQAMSRAHNPYGDGQ ACSRILEALKNNRISL >gi|296494573|gb|ADTN01000165.1| GENE 14 16549 - 17595 987 348 aa, chain - ## HITS:1 COG:ECs4718 KEGG:ns NR:ns ## COG: ECs4718 COG3765 # Protein_GI_number: 15833972 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Chain length determinant protein # Organism: Escherichia coli O157:H7 # 1 348 2 349 349 682 100.0 0 MTQPMPGKPAEDAENELDIRGLFRTLWAGKLWIIGMGLAFALIALAYTFFARQEWSSTAI TDRPTVNMLGGYYSQQQFLRNLDVRSNMASADQPSVMDEAYKEFVMQLASWDTRREFWLQ TDYYKQRMVGNSKADAALLDEMINNIQFIPGDFTRAVNDSVKLIAETAPDANNLLRQYVA FASQRAASHLNDELKGAWAARTIQMKAQVKRQEEVAKAIYDRRMNSIEQALKIAEQHNIS 
RSATDVPAEELPDSEMFLLGRPMLQARLENLQAVGPAFDLDYDQNRAMLNTLNVGPTLDP RFQTYRYLRTPEEPVKRDSPRRAFLMIMWGIVGGLIGAGVALTRRCSK >gi|296494573|gb|ADTN01000165.1| GENE 15 17607 - 18710 872 367 aa, chain - ## HITS:1 COG:rfe KEGG:ns NR:ns ## COG: rfe COG0472 # Protein_GI_number: 16131640 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Escherichia coli K12 # 1 367 1 367 367 610 99.0 1e-174 MNLLTVSTDLISIFLFTTLFLFFARKVAKKVGLVDKPNFRKRHQGLIPLVGGISVYAGIC FTFGIVDYYIPHASLYLACAGVLVFIGALDDRFDISVKIRATIQAAVGIVMMVFGKLYLS SLGYIFGSWEMVLGPFGYFLTLFAVWATINAFNMVDGIDGLLGGLSCVSFAAIGMILWFD GQTSLAIWCFAMIAAILPYIMLNLGILGRRYKVFMGDAGSTLIGFTVIWILLETTQGKTH PISPVTALWIIAIPLMDMVAIMYRRLRKGMSPFSPDRQHIHHLIMRAGFTSRQAFVLITL AAALLASIGVLAEYSHFVSEWVMLVLFLLAFFLYGYCIKRAWKVARFIKRVKRRLRRNRG GSPNLTK >gi|296494573|gb|ADTN01000165.1| GENE 16 18722 - 18940 169 72 aa, chain - ## HITS:1 COG:no KEGG:ECO111_4609 NR:ns ## KEGG: ECO111_4609 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 72 1 72 72 130 100.0 1e-29 MPKTPRVYVAFCFYICNLNAALAMLGKFLEFAGMLCNLHIKWLIFAQDWWVWNGLSLLNK GLRGYTSANNFL >gi|296494573|gb|ADTN01000165.1| GENE 17 18950 - 20209 1532 419 aa, chain - ## HITS:1 COG:ECs4716 KEGG:ns NR:ns ## COG: ECs4716 COG1158 # Protein_GI_number: 15833970 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Escherichia coli O157:H7 # 1 419 1 419 419 815 100.0 0 MNLTELKNTPVSELITLGENMGLENLARMRKQDIIFAILKQHAKSGEDIFGDGVLEILQD GFGFLRSADSSYLAGPDDIYVSPSQIRRFNLRTGDTISGKIRPPKEGERYFALLKVNEVN FDKPENARNKILFENLTPLHANSRLRMERGNGSTEDLTARVLDLASPIGRGQRGLIVAPP KAGKTMLLQNIAQSIAYNHPDCVLMVLLIDERPEEVTEMQRLVKGEVVASTFDEPASRHV QVAEMVIEKAKRLVEHKKDVIILLDSITRLARAYNTVVPASGKVLTGGVDANALHRPKRF FGAARNVEEGGSLTIIATALIDTGSKMDEVIYEEFKGTGNMELHLSRKIAEKRVFPAIDY NRSGTRKEELLTTQEELQKMWILRKIIHPMGEIDAMEFLINKLAMTKTNDDFFEMMKRS >gi|296494573|gb|ADTN01000165.1| GENE 18 20536 - 
20865 165 109 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|124485582|ref|YP_001030198.1| ribosomal protein L12E/L44/L45/RPP1/RPP2-like protein [Methanocorpusculum labreanum Z] # 5 106 17 116 120 68 34 6e-11 MSDKIIHLTDDSFDTDVLKADGAILVDFWAEWCGPCKMIAPILDEIADEYQGKLTVAKLN IDQNPGTAPKYGIRGIPTLLLFKNGEVAATKVGALSKGQLKEFLDANLA >gi|296494573|gb|ADTN01000165.1| GENE 19 20996 - 22261 1342 421 aa, chain + ## HITS:1 COG:rhlB KEGG:ns NR:ns ## COG: rhlB COG0513 # Protein_GI_number: 16131636 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Escherichia coli K12 # 1 421 1 421 421 862 100.0 0 MSKTHLTEQKFSDFALHPKVVEALEKKGFHNCTPIQALALPLTLAGRDVAGQAQTGTGKT MAFLTSTFHYLLSHPAIADRKVNQPRALIMAPTRELAVQIHADAEPLAEATGLKLGLAYG GDGYDKQLKVLESGVDILIGTTGRLIDYAKQNHINLGAIQVVVLDEADRMYDLGFIKDIR WLFRRMPPANQRLNMLFSATLSYRVRELAFEQMNNAEYIEVEPEQKTGHRIKEELFYPSN EEKMRLLQTLIEEEWPDRAIIFANTKHRCEEIWGHLAADGHRVGLLTGDVAQKKRLRILD EFTRGDLDILVATDVAARGLHIPAVTHVFNYDLPDDCEDYVHRIGRTGRAGASGHSISLA CEEYALNLPAIETYIGHSIPVSKYNPDALMTDLPKPLRLTRPRTGNGPRRTGAPRNRRRS G >gi|296494573|gb|ADTN01000165.1| GENE 20 22397 - 23881 1644 494 aa, chain + ## HITS:1 COG:ECs4712 KEGG:ns NR:ns ## COG: ECs4712 COG0248 # Protein_GI_number: 15833966 # Func_class: F Nucleotide transport and metabolism; P Inorganic ion transport and metabolism # Function: Exopolyphosphatase # Organism: Escherichia coli O157:H7 # 1 494 1 494 494 961 100.0 0 MGSTSSLYAAIDLGSNSFHMLVVREVAGSIQTLTRIKRKVRLAAGLNSENALSNEAMERG WQCLRLFAERLQDIPPSQIRVVATATLRLAVNAGDFIAKAQEILGCPVQVISGEEEARLI YQGVAHTTGGADQRLVVDIGGASTELVTGTGAQTTSLFSLSMGCVTWLERYFADRNLGQE NFDAAEKAAREVLRPVADELRYHGWKVCVGASGTVQALQEIMMAQGMDERITLEKLQQLK QRAIHCGRLEELEIDGLTLERALVFPSGLAILIAIFTELNIQCMTLAGGALREGLVYGML HLAVEQDIRSRTLRNIQRRFMIDIDQAQRVAKVAANFFDQVENEWHLEAISRDLLISACQ LHEIGLSVDFKQAPQHAAYLVRNLDLPGFTPAQKKLLATLLLNQTNPVDLSSLHQQNAVP PRVAEQLCRLLRLAIIFASRRRDDLVPEMTLQANHELLTLTLPQGWLTQHPLGKEIIAQE SQWQSYVHWPLEVH 
>gi|296494573|gb|ADTN01000165.1| GENE 21 23928 - 25949 2183 673 aa, chain - ## HITS:1 COG:rep KEGG:ns NR:ns ## COG: rep COG0210 # Protein_GI_number: 16131634 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Escherichia coli K12 # 1 673 1 673 673 1322 99.0 0 MRLNPGQQQAVEFVTGPCLVLAGAGSGKTRVITNKIAHLIRGCGYQARHIAAVTFTNKAA REMKERVGQTLGRKEARGLMISTFHTLGLDIIKREYAALGMKANFSLFDDTDQLALLKEL TEGLIEDDKVLLQQLISTISNWKNDLKTPSQAAASAIGERDRIFAHCYGLYDAHLKACNV LDFDDLILLPTLLLQRNEEVRERWQNKIRYLLVDEYQDTNTSQYELVKLLVGSRARFTVV GDDDQSIYSWRGARPQNLVLLSQDFPALKVIKLEQNYRSSGRILKAANILIANNPHVFEK RLFSELGYGAELKVLSANNEEHEAERVTGELIAHHFVNKTQYKDYAILYRGNHQSRVFEK FLMQNRIPYKISGGTSFFSRPEIKDLLAYLRVLTNPDDDSAFLRIVNTPKREIGPATLKK LGEWAMTRNKSMFTASFDMGLSQTLSGRGYEALTRFTHWLAEIQRLAEREPIAAVRDLIH GMDYESWLYETSPSPKAAEMRMKNVNQLFSWMTEMLEGSELDEPMTLTQVVTRFTLRDMM ERGESEEELDQVQLMTLHASKGLEFPYVYMVGMEEGFLPHQSSIDEDNIDEERRLAYVGI TRAQKELTFTLCKERRQYGELVRPEPSRFLLELPQDDLIWEQERKVVSAEERMQKGQSHL ANLKAMMAAKRGK >gi|296494573|gb|ADTN01000165.1| GENE 22 26166 - 26384 168 72 aa, chain + ## HITS:1 COG:yifNm KEGG:ns NR:ns ## COG: yifNm COG3692 # Protein_GI_number: 16132261 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 69 15 83 163 143 98.0 7e-35 MAINFSPKVGEILECNFGNYPVSQNGPFSTTYYDGRIPPEMIKNRLVVVLNGKINGNACI VVPLSTTRDQTS >gi|296494573|gb|ADTN01000165.1| GENE 23 26813 - 27094 287 93 aa, chain + ## HITS:1 COG:ECs4709 KEGG:ns NR:ns ## COG: ECs4709 COG0760 # Protein_GI_number: 15833963 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like peptidyl-prolyl isomerase # Organism: Escherichia coli O157:H7 # 1 93 1 93 93 188 100.0 2e-48 MAKTAAALHILVKEEKLALDLLEQIKNGADFGKLAKKHSICPSGKRGGDLGEFRQGQMVP AFDKVVFSCPVLEPTGPLHTQFGYHIIKVLYRN Prediction of potential genes in microbial genomes Time: Sun May 15 23:42:51 2011 Seq name: gi|296494572|gb|ADTN01000166.1| Escherichia coli MS 
146-1 E_coli146-1-1.0_Cont305.6, whole genome shotgun sequence Length of sequence - 12541 bp Number of predicted genes - 11, with homology - 10 Number of transcription units - 7, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 14 - 1489 2164 ## COG0059 Ketol-acid reductoisomerase - Prom 1543 - 1602 8.1 + Prom 1536 - 1595 4.4 2 2 Tu 1 . + CDS 1639 - 2532 869 ## COG0583 Transcriptional regulator + Term 2550 - 2587 6.1 - Term 2461 - 2491 1.0 3 3 Op 1 8/0.250 - CDS 2584 - 4128 1605 ## COG1171 Threonine dehydratase 4 3 Op 2 5/0.500 - CDS 4131 - 5981 2136 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase 5 3 Op 3 5/0.500 - CDS 6046 - 6975 1183 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase 6 3 Op 4 7/0.250 - CDS 6995 - 7258 252 ## COG3978 Acetolactate synthase (isozyme II), small (regulatory) subunit 7 3 Op 5 . - CDS 7255 - 8901 1435 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] - Prom 8945 - 9004 2.1 - Term 8979 - 9018 5.1 8 4 Tu 1 . - CDS 9041 - 9139 56 ## - Prom 9242 - 9301 9.1 + Prom 9228 - 9287 6.9 9 5 Tu 1 . + CDS 9492 - 11012 810 ## COG0606 Predicted ATPase with chaperone activity + Term 11261 - 11294 -0.2 10 6 Tu 1 . - CDS 11037 - 11375 449 ## COG3085 Uncharacterized protein conserved in bacteria - Prom 11454 - 11513 6.9 + Prom 11410 - 11469 6.1 11 7 Tu 1 . 
+ CDS 11494 - 12333 595 ## COG0583 Transcriptional regulator + Term 12402 - 12435 3.1 - TRNA 12429 - 12504 82.1 # Trp CCA 0 0 Predicted protein(s) >gi|296494572|gb|ADTN01000166.1| GENE 1 14 - 1489 2164 491 aa, chain - ## HITS:1 COG:ilvC KEGG:ns NR:ns ## COG: ilvC COG0059 # Protein_GI_number: 16131632 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Ketol-acid reductoisomerase # Organism: Escherichia coli K12 # 1 491 1 491 491 969 100.0 0 MANYFNTLNLRQQLAQLGKCRFMGRDEFADGASYLQGKKVVIVGCGAQGLNQGLNMRDSG LDISYALRKEAIAEKRASWRKATENGFKVGTYEELIPQADLVINLTPDKQHSDVVRTVQP LMKDGAALGYSHGFNIVEVGEQIRKDITVVMVAPKCPGTEVREEYKRGFGVPTLIAVHPE NDPKGEGMAIAKAWAAATGGHRAGVLESSFVAEVKSDLMGEQTILCGMLQAGSLLCFDKL VEEGTDPAYAEKLIQFGWETITEALKQGGITLMMDRLSNPAKLRAYALSEQLKEIMAPLF QKHMDDIISGEFSSGMMADWANDDKKLLTWREETGKTAFETAPQYEGKIGEQEYFDKGVL MIAMVKAGVELAFETMVDSGIIEESAYYESLHELPLIANTIARKRLYEMNVVISDTAEYG NYLFSYACVPLLKPFMAELQPGDLGKAIPEGAVDNGQLRDVNEAIRSHAIEQVGKKLRGY MTDMKRIAVAG >gi|296494572|gb|ADTN01000166.1| GENE 2 1639 - 2532 869 297 aa, chain + ## HITS:1 COG:ilvY KEGG:ns NR:ns ## COG: ilvY COG0583 # Protein_GI_number: 16131631 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 297 1 297 297 564 100.0 1e-161 MDLRDLKTFLHLAESRHFGRSARAMHVSPSTLSRQIQRLEEDLGQPLFVRDNRTVTLTEA GEELRVFAQQTLLQYQQLRHTIDQQGPSLSGELHIFCSVTAAYSHLPPILDRFRAEHPSV EIKLTTGDAADAMEKVVTGEADLAIAGKPETLPGAVAFSMLENLAVVLIAPALPCPVRNQ VSVEKPDWSTVPFIMADQGPVRRRIELWFRRNKISNPMIYATVGGHEAMVSMVALGCGVA LLPEVVLENSPEPVRNRVMILERSDEKTPFELGVCAQKKRLHEPLIEAFWKILPNHK >gi|296494572|gb|ADTN01000166.1| GENE 3 2584 - 4128 1605 514 aa, chain - ## HITS:1 COG:ilvA KEGG:ns NR:ns ## COG: ilvA COG1171 # Protein_GI_number: 16131630 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Escherichia coli K12 # 1 514 1 514 514 1011 100.0 0 MADSQPLSGAPEGAEYLRAVLRAPVYEAAQVTPLQKMEKLSSRLDNVILVKREDRQPVHS 
FKLRGAYAMMAGLTEEQKAHGVITASAGNHAQGVAFSSARLGVKALIVMPTATADIKVDA VRGFGGEVLLHGANFDEAKAKAIELSQQQGFTWVPPFDHPMVIAGQGTLALELLQQDAHL DRVFVPVGGGGLAAGVAVLIKQLMPQIKVIAVEAEDSACLKAALDAGHPVDLPRVGLFAE GVAVKRIGDETFRLCQEYLDDIITVDSDAICAAMKDLFEDVRAVAEPSGALALAGMKKYI ALHNIRGERLAHILSGANVNFHGLRYVSERCELGEQREALLAVTIPEEKGSFLKFCQLLG GRSVTEFNYRFADAKNACIFVGVRLSRGLEERKEILQMLNDGGYSVVDLSDDEMAKLHVR YMVGGRPSHPLQERLYSFEFPESPGALLRFLNTLGTYWNISLFHYRSHGTDYGRVLAAFE LGDHEPDFETRLNELGYDCHDETNNPAFRFFLAG >gi|296494572|gb|ADTN01000166.1| GENE 4 4131 - 5981 2136 616 aa, chain - ## HITS:1 COG:ECs4705 KEGG:ns NR:ns ## COG: ECs4705 COG0129 # Protein_GI_number: 15833959 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Escherichia coli O157:H7 # 1 616 1 616 616 1210 99.0 0 MPKYRSATTTHGRNMAGARALWRATGMTDADFGKPIIAVVNSFTQFVPGHVHLRDLGKLV AEQIEAAGGVAKEFNTIAVDDGIAMGHGGMLYSLPSRELIADSVEYMVNAHCADAMVCIS NCDKITPGMLMASLRLNIPVIFVSGGPMEAGKTKLSDQIIKLDLVDAMIQGADPKVSDSQ SDQVERSACPTCGSCSGMFTANSMNCLTEALGLSQPGNGSLLATHADRKQLFLNAGKRIV ELTKRYYEQNDESALPRNIASKAAFENAMTLDIAMGGSTNTVLHLLAAAQEAEIDFTMSD IDKLSRKVPQLCKVAPSTQKYHMEDVHRAGGVIGILGELDRAGLLNRDVKNVLGLTLPQT LEQYDVMLTQDDAVKNMFRAGPAGIRTTQAFSQDCRWDTLDDDRANGCIRSLEHAYSKDG GLAVLYGNFAENGCIVKTAGVDDSILKFTGPAKVYESQDDAVEAILGGKVVAGDVVVIRY EGPKGGPGMQEMLYPTSFLKSMGLGKACALITDGRFSGGTSGLSIGHVSPEAASGGSIGL IEDGDLIAIDIPNRGIQLQVSDAELAARREAQDARGDKAWTPKNRERQVSFALRAYASLA TSADKGAVRDKSKLGG >gi|296494572|gb|ADTN01000166.1| GENE 5 6046 - 6975 1183 309 aa, chain - ## HITS:1 COG:ECs4704 KEGG:ns NR:ns ## COG: ECs4704 COG0115 # Protein_GI_number: 15833958 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Escherichia coli O157:H7 # 1 309 1 309 309 644 100.0 0 MTTKKADYIWFNGEMVRWEDAKVHVMSHALHYGTSVFEGIRCYDSHKGPVVFRHREHMQR 
LHDSAKIYRFPVSQSIDELMEACRDVIRKNNLTSAYIRPLIFVGDVGMGVNPPAGYSTDV IIAAFPWGAYLGAEALEQGIDAMVSSWNRAAPNTIPTAAKAGGNYLSSLLVGSEARRHGY QEGIALDVNGYISEGAGENLFEVKDGVLFTPPFTSSALPGITRDAIIKLAKELGIEVREQ VLSRESLYLADEVFMSGTAAEITPVRSVDGIQVGEGRCGPVTKRIQQAFFGLFTGETEDK WGWLDQVNQ >gi|296494572|gb|ADTN01000166.1| GENE 6 6995 - 7258 252 87 aa, chain - ## HITS:1 COG:ECs4703 KEGG:ns NR:ns ## COG: ECs4703 COG3978 # Protein_GI_number: 15833957 # Func_class: S Function unknown # Function: Acetolactate synthase (isozyme II), small (regulatory) subunit # Organism: Escherichia coli O157:H7 # 1 87 1 87 87 143 100.0 8e-35 MMQHQVNVSARFNPETLERVLRVVRHRGFHVCSMNMAAASDAQNINIELTVASPRSVDLL FSQLNKLVDVAHVAICQSTTTSQQIRA >gi|296494572|gb|ADTN01000166.1| GENE 7 7255 - 8901 1435 548 aa, chain - ## HITS:1 COG:ilvG KEGG:ns NR:ns ## COG: ilvG COG0028 # Protein_GI_number: 16132231 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Escherichia coli K12 # 1 548 1 548 548 1110 99.0 0 MNGAQWVVHALRAQGVNTVFGYPGGAIMPVYDALYDGGVEHLLCRHEQGAAMAAIGYARA TGKTGVCIATSGPGATNLITGLADALLDSIPVVAITGQVSAPFIGTDAFQEVDVLGLSLA CTKHSFLVQSLEELPRIMAEAFDVACSGRPGPVLVDIPKDIQLASGDLEPWFTTVENEVT FPHAEVEQARQMLAKAQKPMLYVGGGVGMAQAVPALREFLAATKMPATCTLKGLGAVEAD YPYYLGMLGMHGTKAANFAVQECDLLIAVGARFDDRVTGKLNTFAPHASVIHMDIDPAEM NKLRQAHVALQGDLNALLPALQQPLNINDWQQHCAQLRDEHSWRYDHPGDAIYAPLLLKQ LSDRKPADCVVTTDVGQHQMWAAQHIAHTRPENFITSSGLGTMGFGLPAAVGAQVARPND TVVCISGDGSFMMNVQELGTVKRKQLPLKIVLLDNQRLGMVRQWQQLFFQERYSETTLTD NPDFLMLASAFGIHGQHITRKDQVEAALDTMLNSDGPYLLHVSIDELENVWPLVPPGASN SEMLEKLS >gi|296494572|gb|ADTN01000166.1| GENE 8 9041 - 9139 56 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTALLRVISLVVISVVVIIIPPCGAALGRGKA >gi|296494572|gb|ADTN01000166.1| GENE 9 9492 - 11012 810 506 aa, chain + ## HITS:1 COG:yifB KEGG:ns NR:ns ## COG: yifB COG0606 # Protein_GI_number: 
16131625 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATPase with chaperone activity # Organism: Escherichia coli K12 # 1 506 11 516 516 982 100.0 0 MSLSIVHTRAALGVNAPPITVEVHISKGLPGLTMVGLPETTVKEARDRVRSAIINSGYEY PAKKITINLAPADLPKEGGRYDLPIAIALLAASEQLTANKLDEYELVGELALTGALRGVP GAISSATEAIKSGRKIIVAKDNEDEVGLINGEGCLIADHLQAVCAFLEGKHALERPKPTD AVSRALQHDLSDVIGQEQGKRGLEITAAGGHNLLLIGPPGTGKTMLASRINGLLPDLSNE EALESAAILSLVNAESVQKQWRQRPFRSPHHSASLTAMVGGGAIPGPGEISLAHNGVLFL DELPEFERRTLDALREPIESGQIHLSRTRAKITYPARFQLVAAMNPSPTGHYQGNHNRCT PEQTLRYLNRLSGPFLDRFDLSLEIPLPPPGILSKTVVPGESSATVKQRVMAARERQFKR QNKLNAWLDSPEIRQFCKLESEDAMWLEGTLIHLGLSIRAWQRLLKVARTIADIDQSDII TRQHLQEAVSYRAIDRLLIHLQKLLT >gi|296494572|gb|ADTN01000166.1| GENE 10 11037 - 11375 449 112 aa, chain - ## HITS:1 COG:ECs4699 KEGG:ns NR:ns ## COG: ECs4699 COG3085 # Protein_GI_number: 15833953 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 112 1 112 112 208 100.0 2e-54 MAESFTTTNRYFDNKHYPRGFSRHGDFTIKEAQLLERHGYAFNELDLGKREPVTEEEKLF VAVCRGEREPVTEAERVWSKYMTRIKRPKRFHTLSGGKPQVEGAEDYTDSDD >gi|296494572|gb|ADTN01000166.1| GENE 11 11494 - 12333 595 279 aa, chain + ## HITS:1 COG:yifD+A KEGG:ns NR:ns ## COG: yifD+A COG0583 # Protein_GI_number: 16132258 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 279 1 279 279 537 100.0 1e-153 MDTELLKTFLEVSRTRHFGRAAESLYLTQSAVSFRIRQLENQLGVNLFTRHRNNIRLTAA GEKLLPYAETLMSTWQAARKEVAHTSRHNEFSIGASASLWECMLNQWLGRLYQNQDAHTG LQFEARIAQRQSLVKQLHERQLDLLITTEAPKMDEFSSQLLGYFTLALYTSAPSKLKGDL NYLRLEWGPDFQQHEAGLIGADEVPILTTSSAELAQQQIAMLNGCTWLPVSWARKKGGLH TVVDSTTLSRPLYAIWLQNSDKNALIRDLLKINVLDEVY Prediction of potential genes in microbial genomes Time: Sun May 15 23:43:04 2011 Seq name: gi|296494571|gb|ADTN01000167.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont306.1, whole genome shotgun sequence Length of sequence - 27461 bp Number of predicted 
genes - 26, with homology - 26 Number of transcription units - 16, operones - 6 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 129 - 188 4.5 1 1 Op 1 . + CDS 271 - 1008 476 ## COG0379 Quinolinate synthase 2 1 Op 2 5/0.429 + CDS 1047 - 1313 160 ## COG0379 Quinolinate synthase 3 1 Op 3 . + CDS 1366 - 2070 593 ## COG3201 Nicotinamide mononucleotide transporter - Term 2022 - 2058 4.7 4 2 Tu 1 . - CDS 2067 - 3008 725 ## COG1230 Co/Zn/Cd efflux system component - Prom 3032 - 3091 5.9 - Term 3067 - 3100 2.1 5 3 Tu 1 . - CDS 3122 - 3502 360 ## B21_00695 hypothetical protein + Prom 3718 - 3777 6.1 6 4 Tu 1 . + CDS 3818 - 4870 1167 ## COG0722 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase + Term 4983 - 5015 2.8 - Term 4967 - 5007 7.1 7 5 Tu 1 4/0.571 - CDS 5028 - 5780 1115 ## COG0588 Phosphoglycerate mutase 1 - Prom 5815 - 5874 4.2 - Term 5947 - 5975 1.4 8 6 Op 1 6/0.000 - CDS 5982 - 7022 1145 ## COG2017 Galactose mutarotase and related enzymes 9 6 Op 2 8/0.000 - CDS 7016 - 8164 1092 ## COG0153 Galactokinase 10 6 Op 3 3/0.714 - CDS 8168 - 9214 850 ## COG1085 Galactose-1-phosphate uridylyltransferase 11 6 Op 4 . - CDS 9224 - 10240 1129 ## COG1087 UDP-glucose 4-epimerase - Prom 10362 - 10421 6.9 12 7 Op 1 4/0.571 - CDS 10501 - 11973 177 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 13 7 Op 2 . - CDS 12041 - 12829 840 ## COG2005 N-terminal domain of molybdenum-binding protein - Prom 12951 - 13010 3.0 + Prom 12877 - 12936 6.1 14 8 Tu 1 . + CDS 12958 - 13107 148 ## ECSE_0815 hypothetical protein + Term 13114 - 13149 4.5 + Prom 13123 - 13182 4.3 15 9 Op 1 23/0.000 + CDS 13274 - 14047 899 ## COG0725 ABC-type molybdate transport system, periplasmic component 16 9 Op 2 13/0.000 + CDS 14047 - 14736 640 ## COG4149 ABC-type molybdate transport system, permease component 17 9 Op 3 . + CDS 14739 - 15797 1202 ## COG4148 ABC-type molybdate transport system, ATPase component 18 10 Tu 1 . 
- CDS 15798 - 16616 951 ## COG0561 Predicted hydrolases of the HAD superfamily - Prom 16721 - 16780 6.0 19 11 Tu 1 . + CDS 16771 - 17766 1003 ## COG2706 3-carboxymuconate cyclase + Term 17779 - 17814 6.4 - Term 17645 - 17691 5.0 20 12 Tu 1 . - CDS 17807 - 18640 485 ## COG0583 Transcriptional regulator - Prom 18804 - 18863 6.7 + Prom 18697 - 18756 4.6 21 13 Op 1 3/0.714 + CDS 18944 - 19996 740 ## COG2828 Uncharacterized protein conserved in bacteria 22 13 Op 2 . + CDS 20072 - 21505 1812 ## COG0471 Di- and tricarboxylate transporters + Prom 21593 - 21652 4.7 23 14 Tu 1 . + CDS 21688 - 23949 2280 ## COG1048 Aconitase A + Term 23966 - 24020 7.3 24 15 Tu 1 . - CDS 24183 - 25466 1377 ## COG4677 Pectin methylesterase - Prom 25514 - 25573 3.5 25 16 Op 1 4/0.571 - CDS 25618 - 26094 469 ## COG1881 Phospholipid-binding protein 26 16 Op 2 . - CDS 26153 - 27385 1190 ## COG0161 Adenosylmethionine-8-amino-7-oxononanoate aminotransferase Predicted protein(s) >gi|296494571|gb|ADTN01000167.1| GENE 1 271 - 1008 476 245 aa, chain + ## HITS:1 COG:nadA KEGG:ns NR:ns ## COG: nadA COG0379 # Protein_GI_number: 16128718 # Func_class: H Coenzyme transport and metabolism # Function: Quinolinate synthase # Organism: Escherichia coli K12 # 1 242 1 242 347 489 99.0 1e-138 MSVMFDPDTAIYPFPPKPTPLSIDEKAYYREKIKRLLKERNAVMVAHYYTDPEIQQLAEE TGGCISDSLEMARFGAKHPASTLLVAGVRFMGETAKILSPEKTILMPTLQAECSLDLGCP VEEFNAFCDAHPDRTVVVYANTSAAVKARADWVVTSSIAVELIDHLDSLGEKIIWAPDKH LGRYVQKQTGGDILCWQGACIVHDEFKTQALTRLQEEYPDAAILVHPESPQAIVDMADAG GSPVN >gi|296494571|gb|ADTN01000167.1| GENE 2 1047 - 1313 160 88 aa, chain + ## HITS:1 COG:nadA KEGG:ns NR:ns ## COG: nadA COG0379 # Protein_GI_number: 16128718 # Func_class: H Coenzyme transport and metabolism # Function: Quinolinate synthase # Organism: Escherichia coli K12 # 1 88 260 347 347 180 98.0 5e-46 MATDRGIFYKMQQAVPDKELLEAPTAGEGATCRSCAHCPWMAMNGLQAIAEALEQEGSNH EVHVDERLRERALVPLNRMLDFAATLRG >gi|296494571|gb|ADTN01000167.1| GENE 3 1366 - 2070 593 234 aa, chain + ## HITS:1 COG:pnuC 
KEGG:ns NR:ns ## COG: pnuC COG3201 # Protein_GI_number: 16128719 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinamide mononucleotide transporter # Organism: Escherichia coli K12 # 1 234 6 239 239 391 99.0 1e-109 MQNILVHIPIGAGGYDLSWIEAVGTIAGLLCIGLASLEKISNYFFGLINVTLFGIIFFQI QLYASLLLQVFFFAANIYGWYAWSRQTSQNEAELKIRWLPLPKALSWLAVCVVSIGLMTV FINPVFAFLTRVAVMIMQALGLQVVMPELQPDAFPFWDSCMMVLSIVAMILMTRKYVENW LLWVIINVISVVIFALQGVYAMSLEYIILTFIALNGSRMWINSARERGSRALSH >gi|296494571|gb|ADTN01000167.1| GENE 4 2067 - 3008 725 313 aa, chain - ## HITS:1 COG:ZybgR KEGG:ns NR:ns ## COG: ZybgR COG1230 # Protein_GI_number: 15800461 # Func_class: P Inorganic ion transport and metabolism # Function: Co/Zn/Cd efflux system component # Organism: Escherichia coli O157:H7 EDL933 # 4 307 2 305 311 529 99.0 1e-150 MAHSHSHTSSHLPEDNNARRLLYAFGVTAGFMLVEVVGGFLSGSLALLADAGHMLTDTAA LLFALLAVQFSRRPPTIRHTFGWLRLTTLAAFVNAIALVVITILIVWEAIERFRTPRPVE GGMMMAIAVAGLLANILSFWLLHHGSEEKNLNVRAAALHVLGDLLGSVGAIIAALIIIWT GWTPADPILSILVSLLVLRSAWRLLKDSVNELLEGAPVSLDIAELKRRMCREIPEVRNVH HVHVWMVGEKPVMTLHVQVIPPHDHDALLDQIQHYLMDHYQIEHATIQMEYQPCHGPDCH LNEGVSGHSHHHH >gi|296494571|gb|ADTN01000167.1| GENE 5 3122 - 3502 360 126 aa, chain - ## HITS:1 COG:no KEGG:B21_00695 NR:ns ## KEGG: B21_00695 # Name: ybgS # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 126 1 126 126 146 100.0 2e-34 MKMTKLATLFLTATLSLASGAALAADSGAQTNNGQANAAADAGQVAPDARENVAPNNVDN NGVNTGSGGTMLHSDGSSMNNDGMTKDEEHKNTMCKDGRCPDINKKVQTGDGINNDVDTK TDGTTQ >gi|296494571|gb|ADTN01000167.1| GENE 6 3818 - 4870 1167 350 aa, chain + ## HITS:1 COG:ECs0782 KEGG:ns NR:ns ## COG: ECs0782 COG0722 # Protein_GI_number: 15830036 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Escherichia coli O157:H7 # 1 350 1 350 350 717 100.0 0 MNYQNDDLRIKEIKELLPPVALLEKFPATENAANTVAHARKAIHKILKGNDDRLLVVIGP CSIHDPVAAKEYATRLLALREELKDELEIVMRVYFEKPRTTVGWKGLINDPHMDNSFQIN 
DGLRIARKLLLDINDSGLPAAGEFLDMITPQYLADLMSWGAIGARTTESQVHRELASGLS CPVGFKNGTDGTIKVAIDAINAAGAPHCFLSVTKWGHSAIVNTSGNGDCHIILRGGKEPN YSAKHVAEVKEGLNKAGLPAQVMIDFSHANSSKQFKKQMDVCADVCQQIAGGEKAIIGVM VESHLVEGNQSLESGEPLAYGKSITDACIGWEDTDALLRQLANAVKARRG >gi|296494571|gb|ADTN01000167.1| GENE 7 5028 - 5780 1115 250 aa, chain - ## HITS:1 COG:ECs0783 KEGG:ns NR:ns ## COG: ECs0783 COG0588 # Protein_GI_number: 15830037 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglycerate mutase 1 # Organism: Escherichia coli O157:H7 # 1 250 1 250 250 470 100.0 1e-133 MAVTKLVLVRHGESQWNKENRFTGWYDVDLSEKGVSEAKAAGKLLKEEGYSFDFAYTSVL KRAIHTLWNVLDELDQAWLPVEKSWKLNERHYGALQGLNKAETAEKYGDEQVKQWRRGFA VTPPELTKDDERYPGHDPRYAKLSEKELPLTESLALTIDRVIPYWNETILPRMKSGERVI IAAHGNSLRALVKYLDNMSEEEILELNIPTGVPLVYEFDENFKPLKRYYLGNADEIAAKA AAVANQGKAK >gi|296494571|gb|ADTN01000167.1| GENE 8 5982 - 7022 1145 346 aa, chain - ## HITS:1 COG:galM KEGG:ns NR:ns ## COG: galM COG2017 # Protein_GI_number: 16128724 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Escherichia coli K12 # 1 346 1 346 346 711 100.0 0 MLNETPALAPDGQPYRLLTLRNNAGMVVTLMDWGATLLSARIPLSDGSVREALLGCASPE CYQDQAAFLGASIGRYANRIANSRYTFDGETVTLSPSQGVNQLHGGPEGFDKRRWQIVNQ NDRQVLFALSSDDGDQGFPGNLGATVQYRLTDDNRISITYRATVDKPCPVNMTNHVYFNL DGEQSDVRNHKLQILADEYLPVDEGGIPHDGLKSVAGTSFDFRSAKIIASEFLADDDQRK VKGYDHAFLLQAKGDGKKVAAHVWSADEKLQLKVYTTAPALQFYSGNFLGGTPSRGTEPY ADWQGLALESEFLPDSPNHPEWPQPDCFLRPGEEYSSLTEYQFIAE >gi|296494571|gb|ADTN01000167.1| GENE 9 7016 - 8164 1092 382 aa, chain - ## HITS:1 COG:ECs0785 KEGG:ns NR:ns ## COG: ECs0785 COG0153 # Protein_GI_number: 15830039 # Func_class: G Carbohydrate transport and metabolism # Function: Galactokinase # Organism: Escherichia coli O157:H7 # 1 382 1 382 382 775 100.0 0 MSLKEKTQSLFANAFGYPATHTIQAPGRVNLIGEHTDYNDGFVLPCAIDYQTVISCAPRD DRKVRVMAADYENQLDEFSLDAPIVAHENYQWANYVRGVVKHLQLRNNSFGGVDMVISGN VPQGAGLSSSASLEVAVGTVLQQLYHLPLDGAQIALNGQEAENQFVGCNCGIMDQLISAL 
GKKDHALLIDCRSLGTKAVSMPKGVAVVIINSNFKRTLVGSEYNTRREQCETGARFFQQP ALRDVTIEEFNAVAHELDPIVAKRVRHILTENARTVEAASALEQGDLKRMGELMAESHAS MRDDFEITVPQIDTLVEIVKAVIGDKGGVRMTGGGFGGCIVALIPEELVPAVQQAVAEQY EAKTGIKETFYVCKPSQGAGQC >gi|296494571|gb|ADTN01000167.1| GENE 10 8168 - 9214 850 348 aa, chain - ## HITS:1 COG:galT KEGG:ns NR:ns ## COG: galT COG1085 # Protein_GI_number: 16128726 # Func_class: C Energy production and conversion # Function: Galactose-1-phosphate uridylyltransferase # Organism: Escherichia coli K12 # 1 348 1 348 348 707 100.0 0 MTQFNPVDHPHRRYNPLTGQWILVSPHRAKRPWQGAQETPAKQVLPAHDPDCFLCAGNVR VTGDKNPDYTGTYVFTNDFAALMSDTPDAPESHDPLMRCQSARGTSRVICFSPDHSKTLP ELSVAALTEIVKTWQEQTAELGKTYPWVQVFENKGAAMGCSNPHPHGQIWANSFLPNEAE REDRLQKEYFAEQKSPMLVDYVQRELADGSRTVVETEHWLAVVPYWAAWPFETLLLPKAH VLRITDLTDAQRSDLALALKKLTSRYDNLFQCSFPYSMGWHGAPFNGEENQHWQLHAHFY PPLLRSATVRKFMVGYEMLAETQRDLTAEQAAERLRAVSDIHFRESGV >gi|296494571|gb|ADTN01000167.1| GENE 11 9224 - 10240 1129 338 aa, chain - ## HITS:1 COG:galE KEGG:ns NR:ns ## COG: galE COG1087 # Protein_GI_number: 16128727 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Escherichia coli K12 # 1 338 1 338 338 706 100.0 0 MRVLVTGGSGYIGSHTCVQLLQNGHDVIILDNLCNSKRSVLPVIERLGGKHPTFVEGDIR NEALMTEILHDHAIDTVIHFAGLKAVGESVQKPLEYYDNNVNGTLRLISAMRAANVKNFI FSSSATVYGDQPKIPYVESFPTGTPQSPYGKSKLMVEQILTDLQKAQPDWSIALLRYFNP VGAHPSGDMGEDPQGIPNNLMPYIAQVAVGRRDSLAIFGNDYPTEDGTGVRDYIHVMDLA DGHVVAMEKLANKPGVHIYNLGAGVGNSVLDVVNAFSKACGKPVNYHFAPRREGDLPAYW ADASKADRELNWRVTRTLDEMAQDTWHWQSRHPQGYPD >gi|296494571|gb|ADTN01000167.1| GENE 12 10501 - 11973 177 490 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 276 465 21 199 311 72 28 2e-12 MSSLQILQGTFRLSDTKTLQLPQLTLNAGDSWAFVGSNGSGKSALARALAGELPLLKGER QSQFSHITRLSFEQLQKLVSDEWQRNNTDMLGPGEDDTGRTTAEIIQDEVKDAPRCMQLA QQFGITALLDRRFKYLSTGETRKTLLCQALMSEPDLLILDEPFDGLDVASRQQLAERLAS 
LHQSGITLVLVLNRFDEIPEFVQFAGVLADCTLAETGAKEELLQQALVAQLAHSEQLEGV QLPEPDEPSARHALPANEPRIVLNNGVVSYNDRPILNNLSWQVNPGEHWQIVGPNGAGKS TLLSLVTGDHPQGYSNDLTLFGRRRGSGETIWDIKKHIGYVSSSLHLDYRVSTTVRNVIL SGYFDSIGIYQAVSDRQQKLVQQWLDILGIDKRTADAPFHSLSWGQQRLALIVRALVKHP TLLILDEPLQGLDPLNRQLIRRFVDVLISEGETQLLFVSHHAEDAPACITHRLEFVPDGG LYRYVLTKIY >gi|296494571|gb|ADTN01000167.1| GENE 13 12041 - 12829 840 262 aa, chain - ## HITS:1 COG:ECs0789 KEGG:ns NR:ns ## COG: ECs0789 COG2005 # Protein_GI_number: 15830043 # Func_class: R General function prediction only # Function: N-terminal domain of molybdenum-binding protein # Organism: Escherichia coli O157:H7 # 1 262 1 262 262 451 100.0 1e-127 MQAEILLTLKLQQKLFADPRRISLLKHIALSGSISQGAKDAGISYKSAWDAINEMNQLSE HILVERATGGKGGGGAVLTRYGQRLIQLYDLLAQIQQKAFDVLSDDDALPLNSLLAAISR FSLQTSARNQWFGTITARDHDDVQQHVDVLLADGKTRLKVAITAQSGARLGLDEGKEVLI LLKAPWVGITQDEAVAQNADNQLPGIISHIERGAEQCEVLMALPDGQTLCATVPVNEATS LQQGQNVTAYFNADSVIIATLC >gi|296494571|gb|ADTN01000167.1| GENE 14 12958 - 13107 148 49 aa, chain + ## HITS:1 COG:no KEGG:ECSE_0815 NR:ns ## KEGG: ECSE_0815 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 49 3 51 51 77 100.0 1e-13 MLELLKSLVFAVIMVPVVMAIILGLIYGLGEVFNIFSGVGKKDQPGQNH >gi|296494571|gb|ADTN01000167.1| GENE 15 13274 - 14047 899 257 aa, chain + ## HITS:1 COG:modA KEGG:ns NR:ns ## COG: modA COG0725 # Protein_GI_number: 16128731 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, periplasmic component # Organism: Escherichia coli K12 # 1 257 1 257 257 465 100.0 1e-131 MARKWLNLFAGAALSFAVAGNALADEGKITVFAAASLTNAMQDIATQFKKEKGVDVVSSF ASSSTLARQIEAGAPADLFISADQKWMDYAVDKKAIDTATRQTLLGNSLVVVAPKASVQK DFTIDSKTNWTSLLNGGRLAVGDPEHVPAGIYAKEALQKLGAWDTLSPKLAPAEDVRGAL ALVERNEAPLGIVYGSDAVASKGVKVVATFPEDSHKKVEYPVAVVEGHNNATVKAFYDYL KGPQAAEIFKRYGFTIK >gi|296494571|gb|ADTN01000167.1| GENE 16 14047 - 14736 640 229 aa, chain + ## HITS:1 COG:ECs0792 KEGG:ns NR:ns ## COG: ECs0792 COG4149 # 
Protein_GI_number: 15830046 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, permease component # Organism: Escherichia coli O157:H7 # 1 229 1 229 229 370 100.0 1e-102 MILTDPEWQAVLLSLKVSSLAVLFSLPFGIFFAWLLVRCTFPGKALLDSVLHLPLVLPPV VVGYLLLVSMGRRGFIGERLYDWFGITFAFSWRGAVLAAAVMSFPLMVRAIRLALEGVDV KLEQAARTLGAGRWRVFFTITLPLTLPGIIVGTVLAFARSLGEFGATITFVSNIPGETRT IPSAMYTLIQTPGGESGAARLCIISIALAMISLLISEWLARISRERAGR >gi|296494571|gb|ADTN01000167.1| GENE 17 14739 - 15797 1202 352 aa, chain + ## HITS:1 COG:modC KEGG:ns NR:ns ## COG: modC COG4148 # Protein_GI_number: 16128733 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, ATPase component # Organism: Escherichia coli K12 # 1 352 1 352 352 669 100.0 0 MLELNFSQTLGNHCLTINETLPANGITAIFGVSGAGKTSLINAISGLTRPQKGRIVLNGR VLNDAEKGICLTPEKRRVGYVFQDARLFPHYKVRGNLRYGMSKSMVDQFDKLVALLGIEP LLDRLPGSLSGGEKQRVAIGRALLTAPELLLLDEPLASLDIPRKRELLPYLQRLTREINI PMLYVSHSLDEILHLADRVMVLENGQVKAFGALEEVWGSSVMNPWLPKEQQSSILKVTVL EHHPHYAMTALALGDQHLWVNKLDEPLQAALRIRIQASDVSLVLQPPQQTSIRNVLRAKV VNSYDDNGQVEVELEVGGKTLWARISPWARDELAIKPGLWLYAQIKSVSITA >gi|296494571|gb|ADTN01000167.1| GENE 18 15798 - 16616 951 272 aa, chain - ## HITS:1 COG:ybhA KEGG:ns NR:ns ## COG: ybhA COG0561 # Protein_GI_number: 16128734 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Escherichia coli K12 # 1 272 1 272 272 555 100.0 1e-158 MTTRVIALDLDGTLLTPKKTLLPSSIEALARAREAGYQLIIVTGRHHVAIHPFYQALALD TPAICCNGTYLYDYHAKTVLEADPMPVIKALQLIEMLNEHHIHGLMYVDDAMVYEHPTGH VIRTSNWAQTLPPEQRPTFTQVASLAETAQQVNAVWKFALTHDDLPQLQHFGKHVEHELG LECEWSWHDQVDIARGGNSKGKRLTKWVEAQGWSMENVVAFGDNFNDISMLEAAGTGVAM GNADDAVKARANIVIGDNTTDSIAQFIYSHLI >gi|296494571|gb|ADTN01000167.1| GENE 19 16771 - 17766 1003 331 aa, chain + ## HITS:1 COG:ybhE KEGG:ns NR:ns ## COG: ybhE COG2706 # Protein_GI_number: 16128735 # Func_class: G Carbohydrate transport and metabolism # Function: 
3-carboxymuconate cyclase # Organism: Escherichia coli K12 # 1 331 1 331 331 679 100.0 0 MKQTVYIASPESQQIHVWNLNHEGALTLTQVVDVPGQVQPMVVSPDKRYLYVGVRPEFRV LAYRIAPDDGALTFAAESALPGSPTHISTDHQGQFVFVGSYNAGNVSVTRLEDGLPVGVV DVVEGLDGCHSANISPDNRTLWVPALKQDRICLFTVSDDGHLVAQDPAEVTTVEGAGPRH MVFHPNEQYAYCVNELNSSVDVWELKDPHGNIECVQTLDMMPENFSDTRWAADIHITPDG RHLYACDRTASLITVFSVSEDGSVLSKEGFQPTETQPRGFNVDHSGKYLIAAGQKSHHIS VYEIVGEQGLLHEKGRYAVGQGPMWVVVNAH >gi|296494571|gb|ADTN01000167.1| GENE 20 17807 - 18640 485 277 aa, chain - ## HITS:1 COG:ybhD KEGG:ns NR:ns ## COG: ybhD COG0583 # Protein_GI_number: 16128736 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 277 62 338 338 575 99.0 1e-164 MEEDLHIQLFERTTRKVTLTKAGKRLLPEARELIKKFDETLFNIRDMNAYHRGMVTLACI PTAVFYFLPLAIGKFNELYPNIKVRILEQGTNNCMESVLCNESDFGINMNNVTNSSIDFT PLVNEPFVLACRRDHPLAKKQLVEWQELVGYKMIGVRSSSGNRLLIEQQLADKPWKLDWF YEVRHLSTSLGLVEAGLGISALPGLAMPHAPYSSIIGIPLVEPVIRRTLGIIRRKDAVLS PAAERFFALLINLWTDDKDNLWTNIVERQRHALQEIG >gi|296494571|gb|ADTN01000167.1| GENE 21 18944 - 19996 740 350 aa, chain + ## HITS:1 COG:ybhH KEGG:ns NR:ns ## COG: ybhH COG2828 # Protein_GI_number: 16128737 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 350 1 350 350 678 100.0 0 MKKIPCVMMRGGTSRGAFLLAEHLPEDQTQRDKILMAIMGSGNDLEIDGIGGGNPLTSKV AIISRSSDPRADVDYLFAQVIVHEQRVDTTPNCGNMLSGVGAFAIENGLIAATSPVTRVR IRNVNTGTFIEADVQTPNGVVEYEGSARIDGVPGTAAPVALTFLNAAGTKTGKVFPTDNQ IDYFDDVPVTCIDMAMPVVIIPAEYLGKTGYELPAELDADKALLARIESIRLQAGKAMGL GDVSNMVIPKPVLISPAQKGGAINVRYFMPHSCHRALAITGAIAISSSCALEGTVTRQIV PSVGYGNINIEHPSGALDVHLSNEGQDATTLRASVIRTTRKIFSGEVYLP >gi|296494571|gb|ADTN01000167.1| GENE 22 20072 - 21505 1812 477 aa, chain + ## HITS:1 COG:ybhI KEGG:ns NR:ns ## COG: ybhI COG0471 # Protein_GI_number: 16128738 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Escherichia coli K12 # 1 477 1 477 477 798 100.0 0 
MNKKSLWKLILILAIPCIIGFMPAPAGLSELAWVLFGIYLAAIVGLVIKPFPEPVVLLIA VAASMVVVGNLSDGAFKTTAVLSGYSSGTTWLVFSAFTLSAAFVTTGLGKRIAYLLIGKI GNTTLGLGYVTVFLDLVLAPATPSNTARAGGIVLPIINSVAVALGSEPEKSPRRVGHYLM MSIYMVTKTTSYMFFTAMAGNILALKMINDILHLQISWGGWALAAGLPGIIMLLVTPLVI YTMYPPEIKKVDNKTIAKAGLAELGPMKIREKMLLGVFVLALLGWIFSKSLGVDESTVAI VVMATMLLLGIVTWEDVVKNKGGWNTLIWYGGIIGLSSLLSKVKFFEWLAEVFKNNLAFD GHGNVAFFVIIFLSIIVRYFFASGSAYIVAMLPVFAMLANVSGAPLMLTALALLFSNSYG GMVTHYGGAAGPVIFGVGYNDIKSWWLVGAVLTILTFLVHITLGVWWWNMLIGWNML >gi|296494571|gb|ADTN01000167.1| GENE 23 21688 - 23949 2280 753 aa, chain + ## HITS:1 COG:ybhJ KEGG:ns NR:ns ## COG: ybhJ COG1048 # Protein_GI_number: 16128739 # Func_class: C Energy production and conversion # Function: Aconitase A # Organism: Escherichia coli K12 # 1 753 9 761 761 1553 100.0 0 MIKLSEKGVFLASNNEIIAEEHFTGEIKKEEAKKGTIAWSILSSHNTSGNMDKLKIKFDS LASHDITFVGIVQTAKASGMERFPLPYVLTNCHNSLCAVGGTINGDDHVFGLSAAQRYGG IFVPPHIAVIHQYMREMMAGGGKMILGSDSHTRYGALGTMAVGEGGGELVKQLLNDTWDI DYPGVVAVHLTGKPAPYVGPQDVALAIIGAVFKNGYVKNKVMEFVGPGVSALSTDFRNSV DVMTTETTCLSSVWQTDEEVHNWLALHGRGQDYCQLNPQPMAYYDGCISVDLSAIKPMIA LPFHPSNVYEIDTLNQNLTDILREIEIESERVAHGKAKLSLLDKVENGRLKVQQGIIAGC SGGNYENVIAAANALRGQSCGNDTFSLAVYPSSQPVFMDLAKKGVVADLIGAGAIIRTAF CGPCFGAGDTPINNGLSIRHTTRNFPNREGSKPANGQMSAVALMDARSIAATAANGGYLT SASELDCWDNVPEYAFDVTPYKNRVYQGFVKGATQQPLIYGPNIKDWPELGALTDNIVLK VCSKILDEVTTTDELIPSGETSSYRSNPIGLAEFTLSRRDPGYVSRSKATAELENQRLAG NVSELTEVFARIKQIAGQEHIDPLQTEIGSMVYAVKPGDGSAREQAASCQRVIGGLANIA EEYATKRYRSNVINWGMLPLQMAEVPTFEVGDYIYIPGIKAALDNPGTTFKGYVIHEDAP VTEITLYMESLTAEEREIIKAGSLINFNKNRQM >gi|296494571|gb|ADTN01000167.1| GENE 24 24183 - 25466 1377 427 aa, chain - ## HITS:1 COG:ybhC KEGG:ns NR:ns ## COG: ybhC COG4677 # Protein_GI_number: 16128740 # Func_class: G Carbohydrate transport and metabolism # Function: Pectin methylesterase # Organism: Escherichia coli K12 # 1 427 1 427 427 822 99.0 0 MNTFSVSRLALALAFGVTLTACSSTPPDQRPSDQTAPGTSSRPILSAKEAQNFDAQHYFA SLTPGAAAWNPSPITLPAQPDFVVGPAGTQGVTHTTIQAAVDAAIIKRTNKRQYIAVMPG 
EYQGTVYVPAAPGGITLYGTGEKPIDVKIGLSLDGGMSPADWRHDVNPRGKYMPGKPAWY MYDSCQSKRSDSIGVLCSAVFWSQNNGLQLQNLTIENTLGDSVDAGNHPAVALRTDGDQV QINNVNILGRQNTFFVTNSGVQNRLETNRQPRTLVTNSYIEGDVDIVSGRGAVVFDNTEF RVVNSRTQQEAYVFAPATLSNIYYGFLAVNSRFNAFGDGVAQLGRSLDVDANTNGQVVIR DSAINEGFNTAKPWADAVISNRPFAGNSGSVDDNDEIQRNLNDTNYNRMWEYNNRGVGSK VVAEAKK >gi|296494571|gb|ADTN01000167.1| GENE 25 25618 - 26094 469 158 aa, chain - ## HITS:1 COG:ybhB KEGG:ns NR:ns ## COG: ybhB COG1881 # Protein_GI_number: 16128741 # Func_class: R General function prediction only # Function: Phospholipid-binding protein # Organism: Escherichia coli K12 # 1 158 1 158 158 314 100.0 4e-86 MKLISNDLRDGDKLPHRHVFNGMGYDGDNISPHLAWDDVPAGTKSFVVTCYDPDAPTGSG WWHWVVVNLPADTRVLPQGFGSGLVAMPDGVLQTRTDFGKTGYDGAAPPKGETHRYIFTV HALDIERIDVDEGASGAMVGFNVHFHSLASASITAMFS >gi|296494571|gb|ADTN01000167.1| GENE 26 26153 - 27385 1190 410 aa, chain - ## HITS:1 COG:bioA KEGG:ns NR:ns ## COG: bioA COG0161 # Protein_GI_number: 16128742 # Func_class: H Coenzyme transport and metabolism # Function: Adenosylmethionine-8-amino-7-oxononanoate aminotransferase # Organism: Escherichia coli K12 # 1 410 20 429 429 830 100.0 0 MTSPLPVYPVVSAEGCELILSDGRRLVDGMSSWWAAIHGYNHPQLNAAMKSQIDAMSHVM FGGITHAPAIELCRKLVAMTPQPLECVFLADSGSVAVEVAMKMALQYWQAKGEARQRFLT FRNGYHGDTFGAMSVCDPDNSMHSLWKGYLPENLFAPAPQSRMDGEWDERDMVGFARLMA AHRHEIAAVIIEPIVQGAGGMRMYHPEWLKRIRKICDREGILLIADEIATGFGRTGKLFA CEHAEIAPDILCLGKALTGGTMTLSATLTTREVAETISNGEAGCFMHGPTFMGNPLACAA ANASLAILESGDWQQQVADIEVQLREQLAPARDAEMVADVRVLGAIGVVETTHPVNMAAL QKFFVEQGVWIRPFGKLIYLMPPYIILPQQLQRLTAAVNRAVQDETFFCQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:43:18 2011 Seq name: gi|296494570|gb|ADTN01000168.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont306.2, whole genome shotgun sequence Length of sequence - 22930 bp Number of predicted genes - 25, with homology - 23 Number of transcription units - 10, operones - 5 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 12/0.000 + CDS 65 - 1105 1202 
## COG0502 Biotin synthase and related enzymes 2 1 Op 2 6/0.333 + CDS 1102 - 2256 1035 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes 3 1 Op 3 9/0.000 + CDS 2243 - 2998 631 ## COG0500 SAM-dependent methyltransferases 4 1 Op 4 3/1.000 + CDS 2991 - 3668 407 ## COG0132 Dethiobiotin synthetase + Term 3825 - 3867 1.1 + Prom 4137 - 4196 6.9 5 2 Tu 1 . + CDS 4247 - 6268 2327 ## COG0556 Helicase subunit of the DNA excision repair complex - Term 6294 - 6330 2.4 6 3 Tu 1 . - CDS 6460 - 7368 886 ## COG0391 Uncharacterized conserved protein - Prom 7521 - 7580 3.7 7 4 Tu 1 . + CDS 7312 - 7446 57 ## + Prom 7556 - 7615 4.1 8 5 Op 1 6/0.333 + CDS 7765 - 8754 681 ## COG2896 Molybdenum cofactor biosynthesis enzyme 9 5 Op 2 11/0.000 + CDS 8776 - 9288 643 ## COG0521 Molybdopterin biosynthesis enzymes 10 5 Op 3 11/0.000 + CDS 9291 - 9776 660 ## COG0315 Molybdenum cofactor biosynthesis enzyme 11 5 Op 4 21/0.000 + CDS 9769 - 10014 296 ## COG1977 Molybdopterin converting factor, small subunit 12 5 Op 5 5/0.333 + CDS 10016 - 10468 645 ## COG0314 Molybdopterin converting factor, large subunit + Prom 10520 - 10579 1.9 13 6 Op 1 . + CDS 10605 - 11309 762 ## COG0670 Integral membrane protein, interacts with FtsH + Term 11329 - 11359 1.0 14 6 Op 2 . + CDS 11395 - 11487 100 ## 15 6 Op 3 . + CDS 11514 - 12227 262 ## COG0670 Integral membrane protein, interacts with FtsH 16 7 Op 1 5/0.333 - CDS 12263 - 13219 1058 ## COG0392 Predicted integral membrane protein 17 7 Op 2 8/0.000 - CDS 13219 - 14460 1222 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes 18 7 Op 3 . - CDS 14457 - 15218 571 ## COG3568 Metal-dependent hydrolase - Prom 15325 - 15384 1.5 + Prom 15201 - 15260 3.0 19 8 Tu 1 .
+ CDS 15351 - 15761 521 ## G2583_1019 inner membrane protein YbhQ 20 9 Op 1 22/0.000 - CDS 15723 - 16829 1230 ## COG0842 ABC-type multidrug transport system, permease component 21 9 Op 2 45/0.000 - CDS 16840 - 17973 1463 ## COG0842 ABC-type multidrug transport system, permease component 22 9 Op 3 10/0.000 - CDS 17966 - 19702 343 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 23 9 Op 4 15/0.000 - CDS 19695 - 20690 1319 ## COG0845 Membrane-fusion protein 24 9 Op 5 . - CDS 20693 - 21364 725 ## COG1309 Transcriptional regulator - Prom 21432 - 21491 3.5 + Prom 21415 - 21474 2.9 25 10 Tu 1 . + CDS 21593 - 22928 1340 ## COG0513 Superfamily II DNA and RNA helicases Predicted protein(s) >gi|296494570|gb|ADTN01000168.1| GENE 1 65 - 1105 1202 346 aa, chain + ## HITS:1 COG:bioB KEGG:ns NR:ns ## COG: bioB COG0502 # Protein_GI_number: 16128743 # Func_class: H Coenzyme transport and metabolism # Function: Biotin synthase and related enzymes # Organism: Escherichia coli K12 # 1 346 1 346 346 694 100.0 0 MAHRPRWTLSQVTELFEKPLLDLLFEAQQVHRQHFDPRQVQVSTLLSIKTGACPEDCKYC PQSSRYKTGLEAERLMEVEQVLESARKAKAAGSTRFCMGAAWKNPHERDMPYLEQMVQGV KAMGLEACMTLGTLSESQAQRLANAGLDYYNHNLDTSPEFYGNIITTRTYQERLDTLEKV RDAGIKVCSGGIVGLGETVKDRAGLLLQLANLPTPPESVPINMLVKVKGTPLADNDDVDA FDFIRTIAVARIMMPTSYVRLSAGREQMNEQTQAMCFMAGANSIFYGCKLLTTPNPEEDK DLQLFRKLGLNPQQTAVLAGDNEQQQRLEQALMTPDTDEYYNAAAL >gi|296494570|gb|ADTN01000168.1| GENE 2 1102 - 2256 1035 384 aa, chain + ## HITS:1 COG:bioF KEGG:ns NR:ns ## COG: bioF COG0156 # Protein_GI_number: 16128744 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Escherichia coli K12 # 1 384 1 384 384 679 100.0 0 MSWQEKINAALDARRAADALRRRYPVAQGAGRWLVADDRQYLNFSSNDYLGLSHHPQIIR AWQQGAEQFGIGSGGSGHVSGYSVVHQALEEELAEWLGYSRALLFISGFAANQAVIAAMM AKEDRIAADRLSHASLLEAASLSPSQLRRFAHNDVTHLARLLASPCPGQQMVVTEGVFSM DGDSAPLAEIQQVTQQHNGWLMVDDAHGTGVIGEQGRGSCWLQKVKPELLVVTFGKGFGV 
SGAAVLCSSTVADYLLQFARHLIYSTSMPPAQAQALRASLAVIRSDEGDARREKLAALIT RFRAGVQDLPFTLADSCSAIQPLIVGDNSRALQLAEKLRQQGCWVTAIRPPTVPAGTARL RLTLTAAHEMQDIDRLLEVLHGNG >gi|296494570|gb|ADTN01000168.1| GENE 3 2243 - 2998 631 251 aa, chain + ## HITS:1 COG:bioC KEGG:ns NR:ns ## COG: bioC COG0500 # Protein_GI_number: 16128745 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli K12 # 1 251 1 251 251 460 100.0 1e-130 MATVNKQAIAAAFGRAAAHYEQHADLQRQSADALLAMLPQRKYTHVLDAGCGPGWMSRHW RERHAQVTALDLSPPMLVQARQKDAADHYLAGDIESLPLATATFDLAWSNLAVQWCGNLS TALRELYRVVRPKGVVAFTTLVQGSLPELHQAWQAVDERPHANRFLPPDEIEQSLNGVHY QHHIQPITLWFDDALSAMRSLKGIGATHLHEGRDPRILTRSQLQRLQLAWPQQQGRYPLT YHLFLGVIARE >gi|296494570|gb|ADTN01000168.1| GENE 4 2991 - 3668 407 225 aa, chain + ## HITS:1 COG:bioD KEGG:ns NR:ns ## COG: bioD COG0132 # Protein_GI_number: 16128746 # Func_class: H Coenzyme transport and metabolism # Function: Dethiobiotin synthetase # Organism: Escherichia coli K12 # 1 225 1 225 225 433 100.0 1e-121 MSKRYFVTGTDTEVGKTVASCALLQAAKAAGYRTAGYKPVASGSEKTPEGLRNSDALALQ RNSSLQLDYATVNPYTFAEPTSPHIISAQEGRPIESLVMSAGLRALEQQADWVLVEGAGG WFTPLSDTFTFADWVTQEQLPVILVVGVKLGCINHAMLTAQVIQHAGLTLAGWVANDVTP PGKRHAEYMTTLTRMIPAPLLGEIPWLAENPENAATGKYINLALL >gi|296494570|gb|ADTN01000168.1| GENE 5 4247 - 6268 2327 673 aa, chain + ## HITS:1 COG:ECs0857 KEGG:ns NR:ns ## COG: ECs0857 COG0556 # Protein_GI_number: 15830111 # Func_class: L Replication, recombination and repair # Function: Helicase subunit of the DNA excision repair complex # Organism: Escherichia coli O157:H7 # 1 673 1 673 673 1265 100.0 0 MSKPFKLNSAFKPSGDQPEAIRRLEEGLEDGLAHQTLLGVTGSGKTFTIANVIADLQRPT MVLAPNKTLAAQLYGEMKEFFPENAVEYFVSYYDYYQPEAYVPSSDTFIEKDASVNEHIE QMRLSATKAMLERRDVVVVASVSAIYGLGDPDLYLKMMLHLTVGMIIDQRAILRRLAELQ YARNDQAFQRGTFRVRGEVIDIFPAESDDIALRVELFDEEVERLSLFDPLTGQIVSTIPR FTIYPKTHYVTPRERIVQAMEEIKEELAARRKVLLENNKLLEEQRLTQRTQFDLEMMNEL 
GYCSGIENYSRFLSGRGPGEPPPTLFDYLPADGLLVVDESHVTIPQIGGMYRGDRARKET LVEYGFRLPSALDNRPLKFEEFEALAPQTIYVSATPGNYELEKSGGDVVDQVVRPTGLLD PIIEVRPVATQVDDLLSEIRQRAAINERVLVTTLTKRMAEDLTEYLEEHGERVRYLHSDI DTVERMEIIRDLRLGEFDVLVGINLLREGLDMPEVSLVAILDADKEGFLRSERSLIQTIG RAARNVNGKAILYGDKITPSMAKAIGETERRREKQQKYNEEHGITPQGLNKKVVDILALG QNIAKTKAKGRGKSRPIVEPDNVPMDMSPKALQQKIHELEGLMMQHAQNLEFEEAAQIRD QLHQLRELFIAAS >gi|296494570|gb|ADTN01000168.1| GENE 6 6460 - 7368 886 302 aa, chain - ## HITS:1 COG:ybhK KEGG:ns NR:ns ## COG: ybhK COG0391 # Protein_GI_number: 16128748 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 302 1 302 302 548 100.0 1e-156 MRNRTLADLDRVVALGGGHGLGRVLSSLSSLGSRLTGIVTTTDNGGSTGRIRRSEGGIAW GDMRNCLNQLITEPSVASAMFEYRFGGNGELSGHNLGNLMLKALDHLSVRPLEAINLIRN LLKVDTHLIPMSEHPVDLMAIDDQGHEVYGEVNIDQLTTPIQELLLTPNVPATREAVHAI NEADLIIIGPGSFYTSLMPILLLKEIAQALRRTPAPMVYIGNLGRELSLPAANLKLESKL AIMEQYVGKKVIDAVIVGPKVDVSAVKERIVIQEVLEASDIPYRHDRQLLHNALEKALQA LG >gi|296494570|gb|ADTN01000168.1| GENE 7 7312 - 7446 57 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPSAESNDTIKISQRTIAHIVSWSQIIRATVAQIAGKQQLTCQY >gi|296494570|gb|ADTN01000168.1| GENE 8 7765 - 8754 681 329 aa, chain + ## HITS:1 COG:ECs0859 KEGG:ns NR:ns ## COG: ECs0859 COG2896 # Protein_GI_number: 15830113 # Func_class: H Coenzyme transport and metabolism # Function: Molybdenum cofactor biosynthesis enzyme # Organism: Escherichia coli O157:H7 # 1 329 1 329 329 680 99.0 0 MASQLTDAFARKFYYLRLSITDVCNFRCTYCLPDGYKPSGVTNKGFLTVDEIRRVTRAFA SLGTEKVRLTGGEPSLRRDFTDIIAAVRENDAIRQIAVTTNGYRLERDVASWRDAGLTGI NVSVDSLDARQFHAITGQDKFNQVMAGIDAAFEAGFEKVKVNTVLMRDVNHHQLDTFLNW IQHRPIQLRFIELMETGEGSELFRKHHISGQVLRDELLRRGWIHQLRQRSDGPAQVFCHP DYAGEIGLIMPYEKDFCATCNRLRVSSIGKLHLCLFGEGGVNLRDLLEDDTQQQALEARI SAALREKKQTHFLHQNNTGITQNLSYIGG >gi|296494570|gb|ADTN01000168.1| GENE 9 8776 - 9288 643 170 aa, chain + ## HITS:1 COG:ECs0860 KEGG:ns NR:ns ## COG: ECs0860 COG0521 # Protein_GI_number: 15830114 # Func_class: H Coenzyme transport 
and metabolism # Function: Molybdopterin biosynthesis enzymes # Organism: Escherichia coli O157:H7 # 1 170 1 170 170 341 100.0 4e-94 MSQVSTEFIPTRIAILTVSNRRGEEDDTSGHYLRDSAQEAGHHVVDKAIVKENRYAIRAQ VSAWIASDDVQVVLITGGTGLTEGDQAPEALLPLFDREVEGFGEVFRMLSFEEIGTSTLQ SRAVAGVANKTLIFAMPGSTKACRTAWENIIAPQLDARTRPCNFHPHLKK >gi|296494570|gb|ADTN01000168.1| GENE 10 9291 - 9776 660 161 aa, chain + ## HITS:1 COG:ECs0861 KEGG:ns NR:ns ## COG: ECs0861 COG0315 # Protein_GI_number: 15830115 # Func_class: H Coenzyme transport and metabolism # Function: Molybdenum cofactor biosynthesis enzyme # Organism: Escherichia coli O157:H7 # 1 161 1 161 161 296 100.0 9e-81 MSQLTHINAAGEAHMVDVSAKAETVREARAEAFVTMRSETLAMIIDGRHHKGDVFATARI AGIQAAKRTWDLIPLCHPLMLSKVEVNLQAEPEHNRVRIETLCRLTGKTGVEMEALTAAS VAALTIYDMCKAVQKDMVIGPVRLLAKSGGKSGDFKVEADD >gi|296494570|gb|ADTN01000168.1| GENE 11 9769 - 10014 296 81 aa, chain + ## HITS:1 COG:moaD KEGG:ns NR:ns ## COG: moaD COG1977 # Protein_GI_number: 16128752 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin converting factor, small subunit # Organism: Escherichia coli K12 # 1 81 1 81 81 149 100.0 1e-36 MIKVLFFAQVRELVGTDATEVAADFPTVEALRQHMAAQSDRWALALEDGKLLAAVNQTLV SFDHPLTDGDEVAFFPPVTGG >gi|296494570|gb|ADTN01000168.1| GENE 12 10016 - 10468 645 150 aa, chain + ## HITS:1 COG:moaE KEGG:ns NR:ns ## COG: moaE COG0314 # Protein_GI_number: 16128753 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin converting factor, large subunit # Organism: Escherichia coli K12 # 1 150 1 150 150 291 100.0 3e-79 MAETKIVVGPQPFSVGEEYPWLAERDEDGAVVTFTGKVRNHNLGDSVNALTLEHYPGMTE KALAEIVDEARNRWPLGRVTVIHRIGELWPGDEIVFVGVTSAHRSSAFEAGQFIMDYLKT RAPFWKREATPEGDRWVEARESDQQAAKRW >gi|296494570|gb|ADTN01000168.1| GENE 13 10605 - 11309 762 234 aa, chain + ## HITS:1 COG:ybhL KEGG:ns NR:ns ## COG: ybhL COG0670 # Protein_GI_number: 16128754 # Func_class: R General function prediction only # Function: Integral membrane protein, interacts with FtsH # Organism: Escherichia 
coli K12 # 1 234 1 234 234 368 100.0 1e-102 MDRFPRSDSIVQPRAGLQTYMAQVYGWMTVGLLLTAFVAWYAANSAAVMELLFTNRVFLI GLIIAQLALVIVLSAMIQKLSAGVTTMLFMLYSALTGLTLSSIFIVYTAASIASTFVVTA GMFGAMSLYGYTTKRDLSGFGNMLFMALIGIVLASLVNFWLKSEALMWAVTYIGVIVFVG LTAYDTQKLKNMGEQIDTRDTSNLRKYSILGALTLYLDFINLFLMLLRIFGNRR >gi|296494570|gb|ADTN01000168.1| GENE 14 11395 - 11487 100 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPYGRAEAFSVSINAKVFCMRGFKFIDGEY >gi|296494570|gb|ADTN01000168.1| GENE 15 11514 - 12227 262 237 aa, chain + ## HITS:1 COG:ybhM KEGG:ns NR:ns ## COG: ybhM COG0670 # Protein_GI_number: 16128755 # Func_class: R General function prediction only # Function: Integral membrane protein, interacts with FtsH # Organism: Escherichia coli K12 # 1 237 1 237 237 417 99.0 1e-116 MESYSQNSNKLDFQHEARILNGIWLITALGLVATAGLAWGAKYLEITATKYDSPPMYVAI GLLLLCMYGLSKDINKINAAIAGVIYLFLLSLVAIVVASLVPVYAIIIVFSTAGAMFLIS MLAGLLFNVDPGSHRFIIMMTLTGLALVIIVNAALMSERPIWVISCLMIVLWSGIISHGR NKLLELAGKCHSEELWSPVRCAFTGALTLYYYFIGFFGILAAIAITLVWQRHTRFFH >gi|296494570|gb|ADTN01000168.1| GENE 16 12263 - 13219 1058 318 aa, chain - ## HITS:1 COG:ybhN KEGG:ns NR:ns ## COG: ybhN COG0392 # Protein_GI_number: 16128756 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Escherichia coli K12 # 1 318 1 318 318 530 100.0 1e-150 MSKSHPRWRLAKKILTWLFFIAVIVLLVVYAKKVDWEEVWKVIRDYNRVALLSAVGLVVV SYLIYGCYDLLARFYCGHKLAKRQVMLVSFICYAFNLTLSTWVGGIGMRYRLYSRLGLPG STITRIFSLSITTNWLGYILLAGIIFTAGVVELPDHWYVDQTTLRILGIGLLMIIAVYLW FCAFAKHRHMTIKGQKLVLPSWKFALAQMLISSVNWMVMGAIIWLLLGQSVNYFFVLGVL LVSSIAGVIVHIPAGIGVLEAVFIALLAGEHTSKGTIIAALLAYRVLYYFIPLLLALICY LLLESQAKKLRAKNEAAM >gi|296494570|gb|ADTN01000168.1| GENE 17 13219 - 14460 1222 413 aa, chain - ## HITS:1 COG:ECs0867 KEGG:ns NR:ns ## COG: ECs0867 COG1502 # Protein_GI_number: 15830121 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes # Organism: Escherichia coli O157:H7 # 1 413 1 413 413
835 100.0 0 MKCSWREGNKIQLLENGEQYYPAVFKAIGEAQERIILETFIWFEDDVGKQLHAALLAAAQ RGVKAEVLLDGYGSPDLSDEFVNELTAAGVVFRYYDPRPRLFGMRTNVFRRMHRKIVVID ARIAFIGGLNYSAEHMSSYGPEAKQDYAVRLEGPIVEDILQFELENLPGQSAARRWWRRH HKAEENRQPGEAQVLLVWRDNEEHRDDIERHYLKMLTQARREVIIANAYFFPGYRFLHAL RKAARRGVRIKLIIQGEPDMPIVRVGARLLYNYLVKGGVQVFEYRRRPLHGKVALMDDHW ATVGSSNLDPLSLSLNLEANVIIHDRHFNQTLRDNLNGIIAADCQQVDETMLPKRTWWNL TKSVLAFHFLRHFPALVGWLPAHTPRLAQVDPPAQPTMETQDRVETENTGVKP >gi|296494570|gb|ADTN01000168.1| GENE 18 14457 - 15218 571 253 aa, chain - ## HITS:1 COG:ECs0868 KEGG:ns NR:ns ## COG: ECs0868 COG3568 # Protein_GI_number: 15830122 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Escherichia coli O157:H7 # 1 253 1 253 253 520 100.0 1e-147 MPDQTQQFSFKVLTINIHKGFTAFNRRFILPELRDAVRTVSADIVCLQEVMGAHEVHPLH VENWPDTSHYEFLADTMWSDFAYGRNAVYPEGHHGNAVLSRYPIEHYENRDVSVDGAEKR GVLYCRIVPPMTGKAIHVMCVHLGLREAHRQAQLAMLAEWVNELPDGEPVLVAGDFNDWR QKANHPLKVQAGLDEIFTRAHGRPARTFPVQFPLLRLDRIYVKNASASAPTALPLRTWRH LSDHAPLSAEIHL >gi|296494570|gb|ADTN01000168.1| GENE 19 15351 - 15761 521 136 aa, chain + ## HITS:1 COG:no KEGG:G2583_1019 NR:ns ## KEGG: G2583_1019 # Name: ybhQ # Def: inner membrane protein YbhQ # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 136 1 136 136 215 100.0 4e-55 MKWQQRVRVATGLSCWQIMLHLLVVALLVVGWMSKTLVHVGVGLCALYCVTVVMMLVFQR HPEQRWREVADVLEELTTTWYFGAALIVLWLLSRVLENNFLLAIAGLAILAGPAVVSLLA KDKKLHHLTSKHRVRR >gi|296494570|gb|ADTN01000168.1| GENE 20 15723 - 16829 1230 368 aa, chain - ## HITS:1 COG:ECs0870 KEGG:ns NR:ns ## COG: ECs0870 COG0842 # Protein_GI_number: 15830124 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Escherichia coli O157:H7 # 1 368 1 368 368 686 100.0 0 MFHRLWTLIRKELQSLLREPQTRAILILPVLIQVILFPFAATLEVTNATIAIYDEDNGEH SVELTQRFARASAFTHVLLLKSPQEIRPTIDTQKALLLVRFPADFSRKLDTFQTAPLQLI LDGRNSNSAQIAANYLQQIVKNYQQELLEGKPKPNNSELVVRNWYNPNLDYKWFVVPSLI AMITTIGVMIVTSLSVAREREQGTLDQLLVSPLTTWQIFIGKAVPALIVATFQATIVLAI 
GIWAYQIPFAGSLALFYFTMVIYGLSLVGFGLLISSLCSTQQQAFIGVFVFMMPAILLSG YVSPVENMPVWLQNLTWINPIRHFTDITKQIYLKDASLDIVWNSLWPLLVITATTGSAAY AMFRRKVM >gi|296494570|gb|ADTN01000168.1| GENE 21 16840 - 17973 1463 377 aa, chain - ## HITS:1 COG:ECs0871 KEGG:ns NR:ns ## COG: ECs0871 COG0842 # Protein_GI_number: 15830125 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Escherichia coli O157:H7 # 1 377 1 377 377 682 100.0 0 MSNPILSWRRVRALCVKETRQIVRDPSSWLIAVVIPLLLLFIFGYGINLDSSKLRVGILL EQRSEAALDFTHTMTGSPYIDATISDNRQELIAKMQAGKIRGLVVIPVDFAEQMERANAT APIQVITDGSEPNTANFVQGYVEGIWQIWQMQRAEDNGQTFEPLIDVQTRYWFNPAAISQ HFIIPGAVTIIMTVIGAILTSLVVAREWERGTMEALLSTEITRTELLLCKLIPYYFLGML AMLLCMLVSVFILGVPYRGSLLILFFISSLFLLSTLGMGLLISTITRNQFNAAQVALNAA FLPSIMLSGFIFQIDSMPAVIRAVTYIIPARYFVSTLQSLFLAGNIPVVLVVNVLFLIAS AVMFIGLTWLKTKRRLD >gi|296494570|gb|ADTN01000168.1| GENE 22 17966 - 19702 343 578 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 330 549 12 233 318 136 34 1e-31 MNDAVITLNGLEKRFPGMDKPAVAPLDCTIHAGYVTGLVGPDGAGKTTLMRMLAGLLKPD SGSATVIGFDPIKNDGALHAVLGYMPQKFGLYEDLTVMENLNLYADLRSVTGEARKQTFA RLLEFTSLGPFTGRLAGKLSGGMKQKLGLACTLVGEPKVLLLDEPGVGVDPISRRELWQM VHELAGEGMLILWSTSYLDEAEQCRDVLLMNEGELLYQGEPKALTQTMAGRSFLMTSPHE GNRKLLQRALKLPQVSDGMIQGKSVRLILKKEATPDDIRHADGMPEININETTPRFEDAF IDLLGGAGTSESPLGAILHTVEGTPGETVIEAKELTKKFGDFAATDHVNFAVKRGEIFGL LGPNGAGKSTTFKMMCGLLVPTSGQALVLGMDLKESSGKARQHLGYMAQKFSLYGNLTVE QNLRFFSGVYGLRGRAQNEKISRMSEAFGLKSIASHATDELPLGFKQRLALACSLMHEPD ILFLDEPTSGVDPLTRREFWLHINSMVEKGVTVMVTTHFMDEAEYCDRIGLVYRGKLIAS GTPDGLKAQSANDEQPDPTMEQAFIQLIHDWDKEHSNE >gi|296494570|gb|ADTN01000168.1| GENE 23 19695 - 20690 1319 331 aa, chain - ## HITS:1 COG:Z1015 KEGG:ns NR:ns ## COG: Z1015 COG0845 # Protein_GI_number: 15800546 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli O157:H7 EDL933 # 1 331 2 332 332 552 99.0 1e-157 
MKKPVVIGLAVVVLAAVVAGGYWWYQSRQDNGLTLYGNVDIRTVNLSFRVGGRVESLAVD EGDAIKAGQVLGELDHKPYEIALMQAKAGVSVAQAQYDLMLAGYRDEEIAQAAAAVKQAQ AAYDYAQNFYNRQQGLWKSRTISANDLENARSSRDQAQATLKSAQDKLRQYRSGNREQDI AQAKASLEQAQAQLAQAELNLQDSTLIAPSDGTLLTRAVEPGTVLNEGGTVFTVSLTRPV WVRAYVDERNLDQAQPGRKVLLYTDGRPDKPYHGQIGFVSPTAEFTPKTVETPDLRTDLV YRLRIVVTDADDALRQGMPVTVQFGDEAGHE >gi|296494570|gb|ADTN01000168.1| GENE 24 20693 - 21364 725 223 aa, chain - ## HITS:1 COG:ECs0874 KEGG:ns NR:ns ## COG: ECs0874 COG1309 # Protein_GI_number: 15830128 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 223 5 227 227 437 100.0 1e-123 MNNPAMTIKGEQAKKQLIAAALAQFGEYGMNATTREIAAQAGQNIAAITYYFGSKEDLYL ACAQWIADFIGEQFRPHAEEAERLFAQPQPDRAAIRELILRACRNMIKLLTQDDTVNLSK FISREQLSPTAAYHLVHEQVISPLHSHLTRLIAAWTGCDANDTRMILHTHALIGEILAFR LGKETILLRTGWTAFDEEKTELINQTVTCHIDLILQGLSQRSL >gi|296494570|gb|ADTN01000168.1| GENE 25 21593 - 22928 1340 445 aa, chain + ## HITS:1 COG:rhlE KEGG:ns NR:ns ## COG: rhlE COG0513 # Protein_GI_number: 16128765 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Escherichia coli K12 # 1 442 1 442 454 781 100.0 0 MSFDSLGLSPDILRAVAEQGYREPTPIQQQAIPAVLEGRDLMASAQTGTGKTAGFTLPLL QHLITRQPHAKGRRPVRALILTPTRELAAQIGENVRDYSKYLNIRSLVVFGGVSINPQMM KLRGGVDVLVATPGRLLDLEHQNAVKLDQVEILVLDEADRMLDMGFIHDIRRVLTKLPAK RQNLLFSATFSDDIKALAEKLLHNPLEIEVARRNTASDQVTQHVHFVDKKRKRELLSHMI GKGNWQQVLVFTRTKHGANHLAEQLNKDGIRSAAIHGNKSQGARTRALADFKSGDIRVLV ATDIAARGLDIEELPHVVNYELPNVPEDYVHRIGRTGRAAATGEALSLVCVDEHKLLRDI EKLLKKEIPRIAIPGYEPDPSIKAEPIQNGRQQRGGGGRGQGGGRGQQQPRRGEGGAKSA SAKPAEKPSRRLGDAKPAGEQQRRR Prediction of potential genes in microbial genomes Time: Sun May 15 23:43:33 2011 Seq name: gi|296494569|gb|ADTN01000169.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont306.3, whole genome shotgun sequence Length of sequence - 14902 bp Number of predicted genes - 13, with homology 
- 13 Number of transcription units - 9, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 177 - 626 460 ## COG3236 Uncharacterized protein conserved in bacteria - Prom 747 - 806 2.6 + Prom 693 - 752 2.5 2 2 Op 1 4/0.000 + CDS 779 - 2929 2063 ## COG1199 Rad3-related DNA helicases 3 2 Op 2 3/0.500 + CDS 2957 - 3919 1006 ## COG0547 Anthranilate phosphoribosyltransferase + Term 3932 - 3965 3.0 + Prom 3974 - 4033 3.6 4 3 Tu 1 . + CDS 4060 - 5145 1285 ## COG2055 Malate/L-lactate dehydrogenases + Term 5310 - 5368 9.1 - Term 5140 - 5173 5.2 5 4 Tu 1 . - CDS 5374 - 5634 292 ## EC55989_0846 hypothetical protein - Prom 5801 - 5860 2.5 6 5 Op 1 1/1.000 - CDS 5899 - 6165 291 ## COG1734 DnaK suppressor protein 7 5 Op 2 2/0.500 - CDS 6239 - 6916 730 ## COG3128 Uncharacterized iron-regulated protein - Term 6923 - 6948 -0.5 8 5 Op 3 . - CDS 6958 - 9240 2004 ## COG4774 Outer membrane receptor for monomeric catechols - Prom 9357 - 9416 3.7 - Term 9457 - 9490 2.2 9 6 Tu 1 . - CDS 9505 - 9765 264 ## ECIAI1_0844 hypothetical protein - Prom 9867 - 9926 8.7 + Prom 9875 - 9934 5.3 10 7 Tu 1 . + CDS 10041 - 10967 815 ## COG3129 Predicted SAM-dependent methyltransferase 11 8 Tu 1 . - CDS 10964 - 13189 1997 ## COG0668 Small-conductance mechanosensitive channel - Prom 13217 - 13276 5.6 12 9 Op 1 34/0.000 - CDS 13450 - 14172 595 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 13 9 Op 2 . 
- CDS 14169 - 14828 948 ## COG0765 ABC-type amino acid transport system, permease component Predicted protein(s) >gi|296494569|gb|ADTN01000169.1| GENE 1 177 - 626 460 149 aa, chain - ## HITS:1 COG:ybiA KEGG:ns NR:ns ## COG: ybiA COG3236 # Protein_GI_number: 16128766 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 149 12 160 160 290 100.0 8e-79 MQDTIINFYSTSDDYGDFSNFAAWPIKVDGKTWPTSEHYFQAQKFLDEKYREEIRRVSSP MVAARMGRDRSKPLRKNWESVKEQVMRKALRAKFEQHAELRALLLATAPAKLVEHTENDA YWGDGGHGKGKNRLGYLLMELREQLAIEK >gi|296494569|gb|ADTN01000169.1| GENE 2 779 - 2929 2063 716 aa, chain + ## HITS:1 COG:dinG KEGG:ns NR:ns ## COG: dinG COG1199 # Protein_GI_number: 16128767 # Func_class: K Transcription; L Replication, recombination and repair # Function: Rad3-related DNA helicases # Organism: Escherichia coli K12 # 1 716 1 716 716 1437 100.0 0 MALTAALKAQIAAWYKALQEQIPDFIPRAPQRQMIADVAKTLAGEEGRHLAIEAPTGVGK TLSYLIPGIAIAREEQKTLVVSTANVALQDQIYSKDLPLLKKIIPDLKFTAAFGRGRYVC PRNLTALASTEPTQQDLLAFLDDELTPNNQEEQKRCAKLKGDLDTYKWDGLRDHTDIAID DDLWRRLSTDKASCLNRNCYYYRECPFFVARREIQEAEVVVANHALVMAAMESEAVLPDP KNLLLVLDEGHHLPDVARDALEMSAEITAPWYRLQLDLFTKLVATCMEQFRPKTIPPLAI PERLNAHCEELYELIASLNNILNLYMPAGQEAEHRFAMGELPDEVLEICQRLAKLTEMLR GLAELFLNDLSEKTGSHDIVRLHRLILQMNRALGMFEAQSKLWRLASLAQSSGAPVTKWA TREEREGQLHLWFHCVGIRVSDQLERLLWRSIPHIIVTSATLRSLNSFSRLQEMSGLKEK AGDRFVALDSPFNHCEQGKIVIPRMRVEPSIDNEEQHIAEMAAFFRKQVESKKHLGMLVL FASGRAMQRFLDYVTDLRLMLLVQGDQPRYRLVELHRKRVANGERSVLVGLQSFAEGLDL KGDLLSQVHIHKIAFPPIDSPVVITEGEWLKSLNRYPFEVQSLPSASFNLIQQVGRLIRS HGCWGEVVIYDKRLLTKNYGKRLLDALPVFPIEQPEVPEGIVKKKEKTKSPRRRRR >gi|296494569|gb|ADTN01000169.1| GENE 3 2957 - 3919 1006 320 aa, chain + ## HITS:1 COG:ybiB KEGG:ns NR:ns ## COG: ybiB COG0547 # Protein_GI_number: 16128768 # Func_class: E Amino acid transport and metabolism # Function: Anthranilate phosphoribosyltransferase # Organism: Escherichia coli K12 # 1 320 1 320 320 644 100.0 0 
MDYRKIIKEIGRGKNHARDLDRDTARGLYAHMLNGEVPDLELGGVLIALRIKGEGEAEML GFYEAMQNHTIKLTPPAGKPMPIVIPSYNGARKQANLTPLLAILLHKLGFPVVVHGVSED PTRVLTETIFELMGITPTLHGGQAQAKLDEHQPVFMPVGAFCPPLEKQLAMRWRMGVRNS AHTLAKLATPFAEGEALRLSSVSHPEYIGRVAKFFSDIGGRALLMHGTEGEVYANPQRCP QINLIDREGMRVLYEKQDTAGSELLPQAKDPETTAQWIERCLAGSEPIPESLKIQMACCL VATGEAATISDGLARVNQAF >gi|296494569|gb|ADTN01000169.1| GENE 4 4060 - 5145 1285 361 aa, chain + ## HITS:1 COG:ybiC KEGG:ns NR:ns ## COG: ybiC COG2055 # Protein_GI_number: 16128769 # Func_class: C Energy production and conversion # Function: Malate/L-lactate dehydrogenases # Organism: Escherichia coli K12 # 1 361 1 361 361 729 100.0 0 MESGHRFDAQTLHSFIQAVFRQMGSEEQEAKLVADHLIAANLAGHDSHGIGMIPSYVRSW SQGHLQINHHAKTVKEAGAAVTLDGDRAFGQVAAHEAMALGIEKAHQHGIAAVALHNSHH IGRIGYWAEQCAAAGFVSIHFVSVVGIPMVAPFHGRDSRFGTNPFCVVFPRKDNFPLLLD YATSAIAFGKTRVAWHKGVPVPPGCLIDVNGVPTTNPAVMQESPLGSLLTFAEHKGYALA AMCEILGGALSGGKTTHQETLQTSPDAILNCMTTIIINPELFGAPDCNAQTEAFAEWVKA SPHDDDKPILLPGEWEVNTRRERQKQGIPLDAGSWQAICDAARQIGMPEETLQAFCQQLA S >gi|296494569|gb|ADTN01000169.1| GENE 5 5374 - 5634 292 86 aa, chain - ## HITS:1 COG:no KEGG:EC55989_0846 NR:ns ## KEGG: EC55989_0846 # Name: ybiJ # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 86 1 86 86 82 100.0 3e-15 MKTINTVVAAMALSTLSFGVFAAEPVTASQAQNMNKIGVVSADGASTLDALEAKLAEKAA AAGASGYSITSATNNNKLSGTAVIYK >gi|296494569|gb|ADTN01000169.1| GENE 6 5899 - 6165 291 88 aa, chain - ## HITS:1 COG:ybiI KEGG:ns NR:ns ## COG: ybiI COG1734 # Protein_GI_number: 16128771 # Func_class: T Signal transduction mechanisms # Function: DnaK suppressor protein # Organism: Escherichia coli K12 # 1 88 1 88 88 158 100.0 3e-39 MASGWANDDAVNEQINSTIEDAIARARGEIPRGESLDECEECGAPIPQARREAIPGVRLC IHCQQEKDLQKPAYTGYNRRGSKDSQLR >gi|296494569|gb|ADTN01000169.1| GENE 7 6239 - 6916 730 225 aa, chain - ## HITS:1 COG:ybiX KEGG:ns NR:ns ## COG: ybiX COG3128 # Protein_GI_number: 16128772 # Func_class: S Function unknown # Function: Uncharacterized iron-regulated protein # Organism: 
Escherichia coli K12 # 1 225 13 237 237 459 100.0 1e-129 MMYHIPGVLSPQDVARFREQLEQAEWVDGRVTTGAQGAQVKNNQQVDTRSTLYAALQNEV LNAVNQHALFFAAALPRTLSTPLFNRYQNNETYGFHVDGAVRSHPQNGWMRTDLSATLFL SDPQSYDGGELVVNDTFGQHRVKLPAGDLVLYPSSSLHCVTPVTRGVRVASFMWIQSMIR DDKKRAMLFELDNNIQSLKSRYGESEEILSLLNLYHNLLREWSEI >gi|296494569|gb|ADTN01000169.1| GENE 8 6958 - 9240 2004 760 aa, chain - ## HITS:1 COG:fiu KEGG:ns NR:ns ## COG: fiu COG4774 # Protein_GI_number: 16128773 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for monomeric catechols # Organism: Escherichia coli K12 # 1 760 1 760 760 1396 100.0 0 MENNRNFPARQFHSLTFFAGLCIGITPVAQALAAEGQTNADDTLVVEASTPSLYAPQQSA DPKFSRPVADTTRTMTVISEQVIKDQGATNLTDALKNVPGVGAFFAGENGNSTTGDAIYM RGADTSNSIYIDGIRDIGSVSRDTFNTEQVEVIKGPSGTDYGRSAPTGSINMISKQPRND SGIDASASIGSAWFRRGTLDVNQVIGDTTAVRLNVMGEKTHDAGRDKVKNERYGVAPSVA FGLGTANRLYLNYLHVTQHNTPDGGIPTIGLPGYSAPSAGTAALNHSGKVDTHNFYGTDS DYDDSTTDTATMRFEHDINDNTTIRNTTRWSRVKQDYLMTAIMGGASNITQPTSDVNSWT WSRTANTKDVSNKILTNQTNLTSTFYTGSIGHDVSTGVEFTRETQTNYGVNPVTLPAVNI YHPDSSIHPGGLTRNGANANGQTDTFAIYAFDTLQITRDFELNGGIRLDNYHTEYDSATA CGGSGRGAITCPTGVAKGSPVTTVDTAKSGNLMNWKAGALYHLTENGNVYINYAVSQQPP GGNNFALAQSGSGNSANRTDFKPQKANTSEIGTKWQVLDKRLLLTAALFRTDIENEVEQN DDGTYSQYGKKRVEGYEISVAGNITPAWQVIGGYTQQKATIKNGKDVAQDGSSSLPYTPE HAFTLWSQYQATDDISVGAGARYIGSMHKGSDGAVGTPAFTEGYWVADAKLGYRVNRNLD FQLNVYNLFDTDYVASINKSGYRYHPGEPRTFLLTANMHF >gi|296494569|gb|ADTN01000169.1| GENE 9 9505 - 9765 264 86 aa, chain - ## HITS:1 COG:no KEGG:ECIAI1_0844 NR:ns ## KEGG: ECIAI1_0844 # Name: ybiM # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 86 49 134 134 163 100.0 1e-39 MKKCLTLLIATVLSGISLTAYAAQPMSNLDSGQLRPAGTVSATGASNLSDLEDKLAEKAR EQGAKGYVINSAGGNDQMFGTATIYK >gi|296494569|gb|ADTN01000169.1| GENE 10 10041 - 10967 815 308 aa, chain + ## HITS:1 COG:ybiN KEGG:ns NR:ns ## COG: ybiN COG3129 # Protein_GI_number: 16128775 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferase # 
Organism: Escherichia coli K12 # 1 308 28 335 335 639 100.0 0 MSAQKPGLHPRNRHHSRYDLATLCQVNPELRQFLTLTPAGEQSVDFANPLAVKALNKALL AHFYAVANWDIPDGFLCPPVPGRADYIHHLADLLAEASGTIPANASILDIGVGANCIYPL IGVHEYGWRFTGSETSSQALSSAQAIISSNPGLNRAIRLRRQKESGAIFNGIIHKNEQYD ATLCNPPFHDSAAAARAGSERKRRNLGLNKDDALNFGGQQQELWCEGGEVTFIKKMIEES KGFAKQVMWFTSLVSRGENLPPLYRALTDVGAVKVVKKEMAQGQKQSRFIAWTFMNDEQR RRFVNRQR >gi|296494569|gb|ADTN01000169.1| GENE 11 10964 - 13189 1997 741 aa, chain - ## HITS:1 COG:ECs0886_2 KEGG:ns NR:ns ## COG: ECs0886_2 COG0668 # Protein_GI_number: 15830140 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Escherichia coli O157:H7 # 431 741 1 311 311 582 100.0 1e-166 MRWILFILFCLLGAPAHAVSIPGVTTTTTTDSTTEPAPEPDIEQKKAAYGALADVLDNDT SRKELIDQLRTVAATPPAEPVPKIVPPTLVEEQTVLQKVTEVSRHYGEALSARFGQLYRN ITGSPHKPFNPQTFSNALTHFSMLAVLVFGFYWLIRLCALPLYRKMGQWARQKNRERSNW LQLPAMIIGAFIIDLLLLALTLFVGQVLSDNLNAGSRTIAFQQSLFLNAFALIEFFKAVL RLIFCPNVAELRPFTIQDESARYWSRRLSWLSSLIGYGLIVAVPIISNQVNVQIGALANV IIMLCMTVWALYLIFRNKKEITQHLLNFAEHSLAFFSLFIRAFALVWHWLASAYFIVLFF FSLFDPGNSLKFMMGATVRSLAIIGIAAFVSGMFSRWLAKTITLSPHTQRNYPELQKRLN GWLSAALKTARILTVCVAVMLLLSAWGLFDFWNWLQNGAGQKTVDILIRIALILFFSAVG WTVLASLIENRLASDIHGRPLPSARTRTLLTLFRNALAVIISTITIMIVLSEIGVNIAPL LAGAGALGLAISFGSQTLVKDIITGVFIQFENGMNTGDLVTIGPLTGTVERMSIRSVGVR QDTGAYHIIPWSSITTFANFVRGIGSVVANYDVDRHEDADKANQALKDAVAELMENEEIR GLIIGEPNFAGIVGLSNTAFTLRVSFTTLPLKQWTVRFALDSQVKKHFDLAGVRAPVQTY QVLPAPGATPAEPLPPGEPTL >gi|296494569|gb|ADTN01000169.1| GENE 12 13450 - 14172 595 240 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 239 1 242 245 233 47 5e-61 MIEFKNVSKHFGPTQVLHNIDLNIAQGEVVVIIGPSGSGKSTLLRCINKLEEITSGDLIV DGLKVNDPKVDERLIRQEAGMVFQQFYLFPHLTALENVMFGPLRVRGANKEEAEKLAREL LAKVGLAERAHHYPSELSGGQQQRVAIARALAVKPKMMLFDEPTSALDPELRHEVLKVMQ DLAEEGMTMVIVTHEIGFAEKVASRLIFIDKGRIAEDGNPQVLIKNPPSQRLQEFLQHVS >gi|296494569|gb|ADTN01000169.1| GENE 13 14169 - 14828 
948 219 aa, chain - ## HITS:1 COG:ECs0888 KEGG:ns NR:ns ## COG: ECs0888 COG0765 # Protein_GI_number: 15830142 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Escherichia coli O157:H7 # 1 219 1 219 219 373 100.0 1e-103 MQFDWSAIWPAIPLLIEGAKMTLWISVLGLAGGLVIGLLAGFARTFGGWIANHVALVFIE VIRGTPIVVQVMFIYFALPMAFNDLRIDPFTAAVVTIMINSGAYIAEITRGAVLSIHKGF REAGLALGLSRWETIRYVILPLALRRMLPPLGNQWIISIKDTSLFIVIGVAELTRQGQEI IAGNFRALEIWSAVAVFYLIITLVLSFILRRLERRMKIL Prediction of potential genes in microbial genomes Time: Sun May 15 23:43:41 2011 Seq name: gi|296494568|gb|ADTN01000170.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont306.4, whole genome shotgun sequence Length of sequence - 10441 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 9, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 4/1.000 - CDS 38 - 784 1080 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain - Prom 831 - 890 4.1 - Term 1109 - 1152 9.5 2 2 Tu 1 4/1.000 - CDS 1188 - 1691 612 ## COG0783 DNA-binding ferritin-like protein (oxidative damage protectant) - Prom 1731 - 1790 4.8 - Term 1939 - 1981 10.1 3 3 Tu 1 . - CDS 1990 - 2877 983 ## COG5006 Predicted permease, DMT superfamily + Prom 3145 - 3204 4.6 4 4 Tu 1 . + CDS 3230 - 3745 591 ## COG3637 Opacity protein and related surface antigens + Term 3749 - 3811 10.7 - Term 3756 - 3783 -0.9 5 5 Tu 1 . - CDS 3794 - 5377 1400 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase - Prom 5493 - 5552 2.2 6 6 Tu 1 . - CDS 5649 - 5777 82 ## SbBS512_E2532 hypothetical protein - Prom 5845 - 5904 3.9 + Prom 5858 - 5917 4.1 7 7 Op 1 4/1.000 + CDS 5963 - 6430 518 ## COG1321 Mn-dependent transcriptional regulator 8 7 Op 2 . + CDS 6427 - 7545 1015 ## COG0471 Di- and tricarboxylate transporters + Term 7564 - 7596 3.1 - Term 7547 - 7588 3.0 9 8 Tu 1 . 
- CDS 7604 - 8524 1157 ## COG1376 Uncharacterized protein conserved in bacteria - Prom 8580 - 8639 4.9 + Prom 8565 - 8624 4.1 10 9 Tu 1 . + CDS 8743 - 10335 2134 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains Predicted protein(s) >gi|296494568|gb|ADTN01000170.1| GENE 1 38 - 784 1080 248 aa, chain - ## HITS:1 COG:ECs0889 KEGG:ns NR:ns ## COG: ECs0889 COG0834 # Protein_GI_number: 15830143 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Escherichia coli O157:H7 # 1 248 1 248 248 462 100.0 1e-130 MKSVLKVSLAALTLAFAVSSHAADKKLVVATDTAFVPFEFKQGDKYVGFDVDLWAAIAKE LKLDYELKPMDFSGIIPALQTKNVDLALAGITITDERKKAIDFSDGYYKSGLLVMVKANN NDVKSVKDLDGKVVAVKSGTGSVDYAKANIKTKDLRQFPNIDNAYMELGTNRADAVLHDT PNILYFIKTAGNGQFKAVGDSLEAQQYGIAFPKGSDELRDKVNGALKTLRENGTYNEIYK KWFGTEPK >gi|296494568|gb|ADTN01000170.1| GENE 2 1188 - 1691 612 167 aa, chain - ## HITS:1 COG:ECs0890 KEGG:ns NR:ns ## COG: ECs0890 COG0783 # Protein_GI_number: 15830144 # Func_class: P Inorganic ion transport and metabolism # Function: DNA-binding ferritin-like protein (oxidative damage protectant) # Organism: Escherichia coli O157:H7 # 1 167 1 167 167 305 100.0 3e-83 MSTAKLVKSKATNLLYTRNDVSDSEKKATVELLNRQVIQFIDLSLITKQAHWNMRGANFI AVHEMLDGFRTALIDHLDTMAERAVQLGGVALGTTQVINSKTPLKSYPLDIHNVQDHLKE LADRYAIVANDVRKAIGEAKDDDTADILTAASRDLDKFLWFIESNIE >gi|296494568|gb|ADTN01000170.1| GENE 3 1990 - 2877 983 295 aa, chain - ## HITS:1 COG:ECs0891 KEGG:ns NR:ns ## COG: ECs0891 COG5006 # Protein_GI_number: 15830145 # Func_class: R General function prediction only # Function: Predicted permease, DMT superfamily # Organism: Escherichia coli O157:H7 # 1 295 1 295 295 481 100.0 1e-135 MPGSLRKMPVWLPIVILLVAMASIQGGASLAKSLFPLVGAPGVTALRLALGTLILIAFFK PWRLRFAKEQRLPLLFYGVSLGGMNYLFYLSIQTVPLGIAVALEFTGPLAVALFSSRRPV DFVWVVLAVLGLWFLLPLGQDVSHVDLTGCALALGAGACWAIYILSGQRAGAEHGPATVA 
IGSLIAALIFVPIGALQAGEALWHWSVIPLGLAVAILSTALPYSLEMIALTRLPTRTFGT LMSMEPALAAVSGMIFLGETLTPIQLLALGAIIAASMGSTLTVRKESKIKELDIN >gi|296494568|gb|ADTN01000170.1| GENE 4 3230 - 3745 591 171 aa, chain + ## HITS:1 COG:ECs0892 KEGG:ns NR:ns ## COG: ECs0892 COG3637 # Protein_GI_number: 15830146 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Opacity protein and related surface antigens # Organism: Escherichia coli O157:H7 # 1 171 1 171 171 301 100.0 4e-82 MKKIACLSALAAVLAFTAGTSVAATSTVTGGYAQSDAQGQMNKMGGFNLKYRYEEDNSPL GVIGSFTYTEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGVGYGKFQTTEYPT YKHDTSDYGFSYGAGLQFNPMENVALDFSYEQSRIRSVDVGTWIAGVGYRF >gi|296494568|gb|ADTN01000170.1| GENE 5 3794 - 5377 1400 527 aa, chain - ## HITS:1 COG:ybiP KEGG:ns NR:ns ## COG: ybiP COG2194 # Protein_GI_number: 16128783 # Func_class: R General function prediction only # Function: Predicted membrane-associated, metal-dependent hydrolase # Organism: Escherichia coli K12 # 1 527 1 527 527 1073 100.0 0 MNLTLKESLVTRSRVFSPWTAFYFLQSLLINLGLGYPFSLLYTAAFTAILLLLWRTLPRV QKVLVGVSSLVAACYFPFAQAYGAPNFNTLLALHSTNMEESTEILTIFPWYSYLVGLFIF ALGVIAIRRKKENEKARWNTFDSLCLVFSVATFFVAPVQNLAWGGVFKLKDTGYPVFRFA KDVIVNNNEVIEEQERMAKLSGMKDTWTVTAVKPKYQTYVVVIGESARRDALGAFGGHWD NTPFASSVNGLIFADYIAASGSTQKSLGLTLNRVVDGKPQFQDNFVTLANRAGFQTWWFS NQGQIGEYDTAIASIAKRADEVYFLKEGNFEADKNTKDEALLDMTAQVLAQEHSQPQLIV LHLMGSHPQACDRTQGKYETFVQSKETSCYLYTMTQTDDLLRKLYDQLRNSGSSFSLVYF SDHGLAFKERGKDVQYLAHDDKYQQNFQVPFMVISSDDKAHRVIKARRSANDFLGFFSQW TGIKAKEINIKYPFISEKKAGPIYITNFQLQKVDYNHLGTDIFDPKP >gi|296494568|gb|ADTN01000170.1| GENE 6 5649 - 5777 82 42 aa, chain - ## HITS:1 COG:no KEGG:SbBS512_E2532 NR:ns ## KEGG: SbBS512_E2532 # Name: not_defined # Def: hypothetical protein # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 1 42 1 42 42 69 97.0 3e-11 MNEFKRCMRVFSHSPFKVRLMLLSMLCDMVNNKPQQDKPSDK >gi|296494568|gb|ADTN01000170.1| GENE 7 5963 - 6430 518 155 aa, chain + ## HITS:1 COG:ybiQ KEGG:ns NR:ns ## COG: ybiQ COG1321 # Protein_GI_number: 16128785 # Func_class: K 
Transcription # Function: Mn-dependent transcriptional regulator # Organism: Escherichia coli K12 # 1 155 1 155 155 268 100.0 2e-72 MSRRAGTPTAKKVTQLVNVEEHVEGFRQVREAHRRELIDDYVELISDLIREVGEARQVDM AARLGVSQPTVAKMLKRLATMGLIEMIPWRGVFLTAEGEKLAQESRERHQIVENFLLVLG VSPEIARRDAEGMEHHVSEETLDAFRLFTQKHGAK >gi|296494568|gb|ADTN01000170.1| GENE 8 6427 - 7545 1015 372 aa, chain + ## HITS:1 COG:ybiR KEGG:ns NR:ns ## COG: ybiR COG0471 # Protein_GI_number: 16128786 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Escherichia coli K12 # 1 372 1 372 372 618 100.0 1e-177 MSLPFLRTLQGDRFFQLLILVGIGLSFFVPFAPKSWPAAIDWHTIITLSGLMLLTKGVEL SGYFDVLGRKMVRRFATERRLAMFMVLAAALLSTFLTNDVALFIVVPLTITLKRLCEIPV NRLIIFEALAVNAGSLLTPIGNPQNILIWGRSGLSFAGFIAQMAPLAGAMMLTLLLLCWC CFPGKAMQYHTGVQTPEWKPRLVWSCLGLYIVFLTALEFKQELWGLVIVAAGFALLARRV VLSVDWTLLLVFMAMFIDVHLLTQLPALQGVLGNVSHLSEPGLWLTAIGLSQVISNVPST ILLLNYVPPSLLLVWAVNVGGFGLLPGSLANLIALRMANDRRIWWRFHLYSIPMLLWAAL VGYVLLVILPAN >gi|296494568|gb|ADTN01000170.1| GENE 9 7604 - 8524 1157 306 aa, chain - ## HITS:1 COG:ECs0896 KEGG:ns NR:ns ## COG: ECs0896 COG1376 # Protein_GI_number: 15830150 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 306 1 306 306 599 100.0 1e-171 MNMKLKTLFAAAFAVVGFCSTASAVTYPLPTDGSRLVGQNQVITIPEGNTQPLEYFAAEY QMGLSNMMEANPGVDTFLPKGGTVLNIPQQLILPDTVHEGIVINSAEMRLYYYPKGTNTV IVLPIGIGQLGKDTPINWTTKVERKKAGPTWTPTAKMHAEYRAAGEPLPAVVPAGPDNPM GLYALYIGRLYAIHGTNANFGIGLRVSHGCVRLRNEDIKFLFEKVPVGTRVQFIDEPVKA TTEPDGSRYIEVHNPLSTTEAQFEGQEIVPITLTKSVQTVTGQPDVDQVVLDEAIKNRSG MPVRLN >gi|296494568|gb|ADTN01000170.1| GENE 10 8743 - 10335 2134 530 aa, chain + ## HITS:1 COG:ECs0897 KEGG:ns NR:ns ## COG: ECs0897 COG0488 # Protein_GI_number: 15830151 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Escherichia coli O157:H7 # 1 530 1 530 530 1053 100.0 0 
MLVSSNVTMQFGSKPLFENISVKFGGGNRYGLIGANGSGKSTFMKILGGDLEPTLGNVSL DPNERIGKLRQDQFAFEEFTVLDTVIMGHKELWEVKQERDRIYALPEMSEEDGYKVADLE VKYGEMDGYSAEARAGELLLGVGIPVEQHYGPMSEVAPGWKLRVLLAQALFADPDILLLD EPTNNLDIDTIRWLEQVLNERDSTMIIISHDRHFLNMVCTHMADLDYGELRVYPGNYDEY MTAATQARERLLADNAKKKAQIAELQSFVSRFSANASKSRQATSRARQIDKIKLEEVKAS SRQNPFIRFEQDKKLFRNALEVEGLTKGFDNGPLFKNLNLLLEVGEKLAVLGTNGVGKST LLKTLVGDLQPDSGTVKWSENARIGYYAQDHEYEFENDLTVFEWMSQWKQEGDDEQAVRS ILGRLLFSQDDIKKPAKVLSGGEKGRMLFGKLMMQKPNILIMDEPTNHLDMESIESLNMA LELYQGTLIFVSHDREFVSSLATRILEITPERVIDFSGNYEDYLRSKGIE Prediction of potential genes in microbial genomes Time: Sun May 15 23:43:44 2011 Seq name: gi|296494567|gb|ADTN01000171.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont306.5, whole genome shotgun sequence Length of sequence - 1589 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 178 - 1443 1052 ## B21_00805 hypothetical protein - Prom 1487 - 1546 3.9 Predicted protein(s) >gi|296494567|gb|ADTN01000171.1| GENE 1 178 - 1443 1052 421 aa, chain - ## HITS:1 COG:no KEGG:B21_00805 NR:ns ## KEGG: B21_00805 # Name: ybiU # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 421 1 421 421 872 100.0 0 MASTFTSDTLPADHKAAIRQMKHALRAQLGDVQQIFNQLSDDIATRVAEINALKAQGDAV WPVLSYADIKAGHVTAEQREQIKRRGCAVIKGHFPREQALGWDQSMLDYLDRNRFDEVYK GPGDNFFGTLSASRPEIYPIYWSQAQMQARQSEEMANAQSFLNRLWTFESDGKQWFNPDV SVIYPDRIRRRPPGTTSKGLGAHTDSGALERWLLPAYQRVFANVFNGNLAQYDPWHAAHR TEVEEYTVDNTTKCSVFRTFQGWTALSDMLPGQGLLHVVPIPEAMAYVLLRPLLDDVPED ELCGVAPGRVLPVSEQWHPLLIEALTSIPKLEAGDSVWWHCDVIHSVAPVENQQGWGNVM YIPAAPMCEKNLAYAHKVKAALEKGASPGDFPREDYETNWEGRFTLADLNIHGKRALGMD V Prediction of potential genes in microbial genomes Time: Sun May 15 23:43:54 2011 Seq name: gi|296494566|gb|ADTN01000172.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont306.6, whole genome shotgun sequence Length of sequence - 15456 bp Number of predicted genes - 12, 
with homology - 12
Number of transcription units - 6, operones - 3 average op.length - 3.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 21 - 836 889 ## COG0561 Predicted hydrolases of the HAD superfamily
        - Prom 894 - 953 3.8
2 2 Op 1 11/0.000 - CDS 982 - 3414 2924 ## COG1882 Pyruvate-formate lyase
3 2 Op 2 . - CDS 3420 - 4319 1018 ## COG1180 Pyruvate-formate lyase-activating enzyme
        - Prom 4404 - 4463 3.6
        + Prom 4300 - 4359 3.0
4 3 Tu 1 . + CDS 4444 - 5112 865 ## COG0176 Transaldolase
        + Term 5147 - 5175 0.5
        - Term 4988 - 5027 0.1
5 4 Op 1 9/0.000 - CDS 5188 - 5937 907 ## COG0476 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2
6 4 Op 2 . - CDS 5937 - 7172 1258 ## COG0303 Molybdopterin biosynthesis enzyme
        - Prom 7309 - 7368 2.9
        + Prom 7249 - 7308 7.5
7 5 Op 1 3/0.333 + CDS 7376 - 8341 1060 ## COG1446 Asparaginase
8 5 Op 2 11/0.000 + CDS 8328 - 10199 841 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17
9 5 Op 3 38/0.000 + CDS 10219 - 11757 1674 ## COG0747 ABC-type dipeptide transport system, periplasmic component
10 5 Op 4 49/0.000 + CDS 11775 - 12695 1029 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components
11 5 Op 5 3/0.333 + CDS 12740 - 13609 743 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components
        + Term 13684 - 13718 -1.0
        + Prom 13648 - 13707 5.6
12 6 Tu 1 .
+ CDS 13838 - 15455 1042 ## COG2200 FOG: EAL domain Predicted protein(s) >gi|296494566|gb|ADTN01000172.1| GENE 1 21 - 836 889 271 aa, chain - ## HITS:1 COG:ybiV KEGG:ns NR:ns ## COG: ybiV COG0561 # Protein_GI_number: 16128790 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Escherichia coli K12 # 1 271 1 271 271 546 100.0 1e-155 MSVKVIVTDMDGTFLNDAKTYNQPRFMAQYQELKKRGIKFVVASGNQYYQLISFFPELKD EISFVAENGALVYEHGKQLFHGELTRHESRIVIGELLKDKQLNFVACGLQSAYVSENAPE AFVALMAKHYHRLKPVKDYQEIDDVLFKFSLNLPDEQIPLVIDKLHVALDGIMKPVTSGF GFIDLIIPGLHKANGISRLLKRWDLSPQNVVAIGDSGNDAEMLKMARYSFAMGNAAENIK QIARYATDDNNHEGALNVIQAVLDNTSPFNS >gi|296494566|gb|ADTN01000172.1| GENE 2 982 - 3414 2924 810 aa, chain - ## HITS:1 COG:ybiW KEGG:ns NR:ns ## COG: ybiW COG1882 # Protein_GI_number: 16128791 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli K12 # 1 810 1 810 810 1700 99.0 0 MTTLKLDTLSDRIKAHKNALVHIVKPPVCTERAQHYTEMYQQHLDKPIPVRRALALAHHL ANRTIWIKHDELIIGNQASEVRAAPIFPEYTVSWIEKEIDDLADRPGAGFAVSEENKRVL HEVCPWWRGQTVQDRCYGMFTDEQKGLLATGIIKAEGNMTSGDAHLAVNFPLLLEKGLDG LREKVAERRSRINLTVLEDLHGEQFLKAIDIVLVAVSEHIERFAALAREMAATETRESRR DELLAMAENCDLIAHQPPQTFWQALQLCYFIQLILQIESNGHSVSFGRMDQYLYPYYRRD VELNQTLDREHAIEMLHSCWLKLLEVNKIRSGSHSKASAGSPLYQNVTIGGQNLVDGQPM DAVNPLSYAILESCGRLRSTQPNLSVRYHAGMSNDFLDACVQVIRCGFGMPAFNNDEIVI PEFIKLGIEPQDAYDYAAIGCIETAVGGKWGYRCTGMSFINFARVMLAALEGGHDATSGK VFLPQEKALSAGNFNNFDEVMDAWDTQIRYYTRKSIEIEYVVDTMLEENVHDILCSALVD DCIERAKSIKQGGAKYDWVSGLQVGIANLGNSLAAVKKLVFEQGAIGQQQLAAALADDFD GLTHEQLRQRLINGAPKYGNDDDTVDTLLARAYQTYIDELKQYHNPRYGRGPVGGNYYAG TSSISANVPFGAQTMATPDGRKAHTPLAEGASPASGTDHLGPTAVIGSVGKLPTAAILGG VLLNQKLNPATLENESDKQKLMILLRTFFEVHKGWHIQYNIVSRETLLDAKKHPDQYRDL VVRVAGYSAFFTALSPDAQDDIIARTEHML >gi|296494566|gb|ADTN01000172.1| GENE 3 3420 - 4319 1018 299 aa, chain - ## HITS:1 COG:ybiY KEGG:ns NR:ns ## COG: ybiY COG1180 # Protein_GI_number: 16128792 # Func_class: O Posttranslational 
modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Escherichia coli K12 # 1 299 10 308 308 600 100.0 1e-171 MIFNIQRYSTHDGPGIRTVVFLKGCSLGCRWCQNPESRARTQDLLYDARLCLEGCELCAK AAPEVIERALNGLLIHREKLTPEHLTALTDCCPTQALTVCGEVKSVEEIMTTVLRDKPFY DRSGGGLTLSGGEPFMQPEMAMALLQASHEAGIHTAVETCLHVPWKYIAPSLPYIDLFLA DLKHVADAPFKQWTDGNAARVLDNLKKLAAAGKKIIIRVPLIQGFNADETSVKAITDFAA DELHVGEIHFLPYHTLGINKYHLLNLPYDAPEKPLDAPELLDFAQQYACQKGLTATLRG >gi|296494566|gb|ADTN01000172.1| GENE 4 4444 - 5112 865 222 aa, chain + ## HITS:1 COG:mipB KEGG:ns NR:ns ## COG: mipB COG0176 # Protein_GI_number: 16128793 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Escherichia coli K12 # 1 222 23 244 244 401 100.0 1e-112 MVMELYLDTSDVVAVKALSRIFPLAGVTTNPSIIAAGKKPLDVVLPQLHEAMGGQGRLFA QVMATTAEGMVNDALKLRSIIADIVVKVPVTAEGLAAIKMLKAEGIPTLGTAVYGAAQGL LSALAGAEYVAPYVNRIDAQGGSGIQTVTDLHQLLKMHAPQAKVLAASFKTPRQALDCLL AGCESITLPLDVAQQMISYPAVDAAVAKFEQDWQGAFGRTSI >gi|296494566|gb|ADTN01000172.1| GENE 5 5188 - 5937 907 249 aa, chain - ## HITS:1 COG:moeB KEGG:ns NR:ns ## COG: moeB COG0476 # Protein_GI_number: 16128794 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 # Organism: Escherichia coli K12 # 1 249 1 249 249 490 100.0 1e-138 MAELSDQEMLRYNRQIILRGFDFDGQEALKDSRVLIVGLGGLGCAASQYLASAGVGNLTL LDFDTVSLSNLQRQTLHSDATVGQPKVESARDALTRINPHIAITPVNALLDDAELAALIA EHDLVLDCTDNVAVRNQLNAGCFAAKVPLVSGAAIRMEGQITVFTYQDGEPCYRCLSRLF GENALTCVEAGVMAPLIGVIGSLQAMEAIKMLAGYGKPASGKIVMYDAMTCQFREMKLMR NPGCEVCGQ >gi|296494566|gb|ADTN01000172.1| GENE 6 5937 - 7172 1258 411 aa, chain - ## HITS:1 COG:moeA KEGG:ns NR:ns ## COG: moeA COG0303 # Protein_GI_number: 16128795 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin biosynthesis enzyme # Organism: Escherichia coli K12 # 1 411 1 411 411 795 100.0 0 
MEFTTGLMSLDTALNEMLSRVTPLTAQETLPLVQCFGRILASDVVSPLDVPGFDNSAMDG YAVRLADIASGQPLPVAGKSFAGQPYHGEWPAGTCIRIMTGAPVPEGCEAVVMQEQTEQM DNGVRFTAEVRSGQNIRRRGEDISAGAVVFPAGTRLTTAELPVIASLGIAEVPVIRKVRV ALFSTGDELQLPGQPLGDGQIYDTNRLAVHLMLEQLGCEVINLGIIRDDPHALRAAFIEA DSQADVVISSGGVSVGEADYTKTILEELGEIAFWKLAIKPGKPFAFGKLSNSWFCGLPGN PVSATLTFYQLVQPLLAKLSGNTASGLPARQRVRTASRLKKTPGRLDFQRGVLQRNADGE LEVTTTGHQGSHIFSSFSLGNCFIVLERDRGNVEVGEWVEVEPFNALFGGL >gi|296494566|gb|ADTN01000172.1| GENE 7 7376 - 8341 1060 321 aa, chain + ## HITS:1 COG:ybiK KEGG:ns NR:ns ## COG: ybiK COG1446 # Protein_GI_number: 16128796 # Func_class: E Amino acid transport and metabolism # Function: Asparaginase # Organism: Escherichia coli K12 # 1 321 1 321 321 581 99.0 1e-166 MGKAVIAIHGGAGAISRAQMSLQQELRYIEALSAIVETGQKMLEAGESALDVVTEAVRLL EECPLFNAGIGAVFTRDETHELDACVMDGNTLKAGAVAGVSHLRNPVLAARLVMEQSPHV MMIGEGAENFAFARGMERVSPEIFSTPLRYEQLLAARKEGATVLDHSGAPLDEKQKMGTV GAVALDLDGNLAAATSTGGMTNKLPGRVGDSPLVGAGCYANNASVAVSCTGTGEVFIRAL AAYDIAALMDYGGLSLAEACERVVMEKLPALGGSGGLIAIDHEGNVALPFNTEGMYRAWG YAGDTPTTGIYREKGDTVATQ >gi|296494566|gb|ADTN01000172.1| GENE 8 8328 - 10199 841 623 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 12 572 8 528 563 328 35 1e-89 MPHSDELDAGNVLAVENLNIAFMQDQQKIAAVRNLSFSLQRGETLAIVGESGSGKSVTAL ALMRLLEQAGGLVQCDKMLLQRRSREVIELSEQNAAQMRHVRGADMAMIFQEPMTSLNPV FTVGEQIAESIRLHQNASREEAMVEAKRMLDQVRIPEAQTILSRYPHQLSGGMRQRVMIA MALSCRPAVLIADEPTTALDVTIQAQILQLIKVLQKEMSMGVIFITHDMGVVAEIADRVL VMYQGEAVETGTVEQIFHAPQHPYTRALLAAVPQLGAMKGLDYPRRFPLISLEHPAKQAP PIEQKTVVDGEPVLRVRNLVTRFPLRSGLLNRVTREVHAVEKVSFDLWPGETLSLVGESG SGKSTTGRALLRLVESQGGEIIFNGQRIDTLSPGKLQALRRDIQFIFQDPYASLDPRQTI GDSIIEPLRVHGLLPGKDAAARVAWLLERVGLLPEHAWRYPHEFSGGQRQRICIARALAL NPKVIIADEAVSALDVSIRGQIINLLLDLQRDFGIAYLFISHDMAVVERISHRVAVMYLG QIVEIGPRRAVFENPQHPYTRKLLAAVPVAEPSRQRPQRVLLSDDLPSNIHLRGEEVAAV SLQCVGPGHYVAQPQSEYAFMRR >gi|296494566|gb|ADTN01000172.1| GENE 9 10219 - 11757 1674 512 aa, chain + ## HITS:1 COG:yliB KEGG:ns NR:ns ## COG: yliB COG0747 # Protein_GI_number: 16128798 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Escherichia coli K12 # 1 512 1 512 512 992 100.0 0 MARAVHRSGLVALGIATALMASCAFAAKDVVVAVGSNFTTLDPYDANDTLSQAVAKSFYQ GLFGLDKEMKLKNVLAESYTVSDDGITYTVKLREGIKFQDGTDFNAAAVKANLDRASDPA NHLKRYNLYKNIAKTEAIDPTTVKITLKQPFSAFINILAHPATAMISPAALEKYGKEIGF YPVGTGPYELDTWNQTDFVKVKKFAGYWQPGLPKLDSITWRPVADNNTRAAMLQTGEAQF AFPIPYEQATLLEKNKNIELMASPSIMQRYISMNVTQKPFDNPKVREALNYAINRPALVK VAFAGYATPATGVVPPSIAYAQSYKPWPYDPVKARELLKEAGYPNGFSTTLWSSHNHSTA QKVLQFTQQQLAQVGIKAQVTAMDAGQRAAEVEGKGQKESGVRMFYTGWSASTGEADWAL SPLFASQNWPPTLFNTAFYSNKQVDDFLAQALKTNDPAEKTRLYKAAQDIIWQESPWIPL VVEKLVSAHSKNLTGFWIMPDTGFSFEDADLQ >gi|296494566|gb|ADTN01000172.1| GENE 10 11775 - 12695 1029 306 aa, chain + ## HITS:1 COG:yliC KEGG:ns NR:ns ## COG: yliC COG0601 # Protein_GI_number: 16128799 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Escherichia coli K12 # 1 306 1 306 306 563 100.0 1e-160 
MLNYVIKRLLGLIPTLFIVSVLVFLFVHMLPGDPARLIAGPEADAQVIELVRQQLGLDQP LYHQFWHYISNAVQGDFGLSMVSRRPVADEIASRFMPTLWLTITSMVWAVIFGMAAGIIA AVWRNRWPDRLSMTIAVSGISFPAFALGMLLIQVFSVELGWLPTVGADSWQHYILPSLTL GAAVAAVMARFTRASFVDVLSEDYMRTARAKGVSETWVVLKHGLRNAMIPVVTMMGLQFG FLLGGSIVVEKVFNWPGLGRLLVDSVEMRDYPVIQAEILLFSLEFILINLVVDVLYAAIN PAIRYK >gi|296494566|gb|ADTN01000172.1| GENE 11 12740 - 13609 743 289 aa, chain + ## HITS:1 COG:yliD KEGG:ns NR:ns ## COG: yliD COG1173 # Protein_GI_number: 16128800 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Escherichia coli K12 # 1 289 15 303 303 535 100.0 1e-152 MPLVKPDQVRTPWHEFWRRFRRQHMAMTAALFVILLIVVAIFARWIAPYDAENYFDYDNL NNGPSLQHWFGVDSLGRDIFSRVLVGAQISLAAGVFAVFIGAAIGTLLGLLAGYYEGWWD RLIMRICDVLFAFPGILLAIAVVAVLGSGIANVIIAVAIFSIPAFARLVRGNTLVLKQQT FIESARSIGASDMTVLLRHILPGTVSSIVVFFTMRIGTSIISAASLSFLGLGAQPPTPEW GAMLNEARADMVIAPHVAVFPALAIFLTVLAFNLLGDGLRDALDPKIKG >gi|296494566|gb|ADTN01000172.1| GENE 12 13838 - 15455 1042 539 aa, chain + ## HITS:1 COG:yliE_2 KEGG:ns NR:ns ## COG: yliE_2 COG2200 # Protein_GI_number: 16128801 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 509 539 1 31 257 69 100.0 1e-11 MAALSFIGLFFIINYQLVSERAVKRADSRFELIQKNVGYFFKDIERSALTLKDSLYLLKN TEEIQRAVILKMEMMPFLDSVGLVLDDNKYYLFSRRANDKIVVYHQEQVNGPLVDESGRV IFADFNPSKRPWSVASDDSNNSWNPAYNCFDRPGKKCISFTLHINGKDHDLLAVDKIHVD LNWRYLNEYLDQISANDEVLFLKQGHEIIAKNQLAREKLIIYNSEGNYNIIDSVDTEYIE KTSAVPNNALFEIYFYYPGGNLLNASDKLFYLPFAFIIIVLLVVYLMTTRVFRRQFSEMT ELVNTLAFLPDSTDQIEALKIREGDAKEIISIKNSIAEMKDAEIERSNKLLSLISYDQES GFIKNMAIIESNNNQYLAVGIIKLCGLEAVEAVFGVDERNKIVRKLCQRIAEKYAQCCDI VTFNADLYLLLCRENVQTFTRKIAMVNDFDSSFGYRNLRIHKSAICEPLQGENAWSYAEK LKLAISSIRDHMFSEFIFCDDAKLNEIEENIWIARNIRHAMEIGELFLVYQPIVDINTR Prediction of potential genes in microbial genomes Time: Sun May 15 23:44:03 2011 Seq name: gi|296494565|gb|ADTN01000173.1| 
Escherichia coli MS 146-1 E_coli146-1-1.0_Cont306.7, whole genome shotgun sequence
Length of sequence - 26070 bp
Number of predicted genes - 28, with homology - 28
Number of transcription units - 16, operones - 5 average op.length - 3.4
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Op 1 6/0.000 + CDS 23 - 682 563 ## COG2200 FOG: EAL domain
2 1 Op 2 . + CDS 690 - 2018 1020 ## COG2199 FOG: GGDEF domain
        + Term 2029 - 2072 7.7
3 2 Tu 1 . - CDS 2065 - 3390 1844 ## PROTEIN SUPPORTED gi|183179613|ref|ZP_02957824.1| conserved hypothetical protein
        - Prom 3425 - 3484 4.9
        + Prom 3479 - 3538 4.5
4 3 Tu 1 . + CDS 3603 - 3986 272 ## EcE24377A_0907 biofilm formation regulatory protein BssR
        + Term 3991 - 4030 5.0
        + Prom 4017 - 4076 4.4
5 4 Tu 1 . + CDS 4241 - 5212 917 ## COG2133 Glucose/sorbosone dehydrogenases
6 5 Tu 1 . - CDS 5209 - 5835 650 ## COG0625 Glutathione S-transferase
        - Prom 6023 - 6082 3.8
        + Prom 5982 - 6041 5.1
7 6 Tu 1 . + CDS 6061 - 7284 1274 ## COG1686 D-alanyl-D-alanine carboxypeptidase
        + Term 7296 - 7332 6.3
8 7 Op 1 4/0.500 - CDS 7331 - 8089 579 ## COG1349 Transcriptional regulators of sugar metabolism
9 7 Op 2 . - CDS 8147 - 8743 562 ## COG0671 Membrane-associated phospholipid phosphatase
        + Prom 8866 - 8925 7.1
10 8 Tu 1 . + CDS 9028 - 10260 1238 ## COG0477 Permeases of the major facilitator superfamily
        + Term 10266 - 10304 2.2
        - Term 10251 - 10295 5.2
11 9 Op 1 . - CDS 10301 - 10579 220 ## ECO103_0887 hypothetical protein
12 9 Op 2 2/0.625 - CDS 10671 - 11486 890 ## COG0561 Predicted hydrolases of the HAD superfamily
13 9 Op 3 . - CDS 11486 - 12613 1137 ## COG0477 Permeases of the major facilitator superfamily
        - Prom 12698 - 12757 3.6
        + Prom 12657 - 12716 2.0
14 10 Tu 1 . + CDS 12778 - 13314 529 ## COG3226 Uncharacterized protein conserved in bacteria
        - Term 13325 - 13379 0.4
15 11 Tu 1 . - CDS 13489 - 15174 1707 ## COG2985 Predicted permease
        - Prom 15292 - 15351 5.5
        + Prom 15277 - 15336 4.2
16 12 Tu 1 .
+ CDS 15525 - 15821 109 ## ECIAI39_0827 conserved hypothetical protein; putative inner membrane protein
        - Term 15811 - 15839 1.4
17 13 Tu 1 . - CDS 15851 - 16108 467 ## COG0695 Glutaredoxin and related proteins
        - Prom 16132 - 16191 2.2
        + Prom 16102 - 16161 4.8
18 14 Op 1 . + CDS 16268 - 16555 243 ## SSON_0835 hypothetical protein
19 14 Op 2 4/0.500 + CDS 16539 - 17261 758 ## COG0778 Nitroreductase
20 14 Op 3 . + CDS 17322 - 18224 1507 ## PROTEIN SUPPORTED gi|15830186|ref|NP_308959.1| ribosomal protein S6 modification protein
        + Prom 18226 - 18285 1.6
21 14 Op 4 . + CDS 18312 - 18788 418 ## EcE24377A_0925 TPR repeat-containing protein
        + Term 18796 - 18850 6.1
        + Prom 18912 - 18971 6.9
22 15 Op 1 13/0.000 + CDS 19139 - 20251 1236 ## COG0687 Spermidine/putrescine-binding periplasmic protein
23 15 Op 2 30/0.000 + CDS 20346 - 21479 1430 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components
24 15 Op 3 36/0.000 + CDS 21489 - 22442 993 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I
25 15 Op 4 . + CDS 22439 - 23284 742 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II
26 15 Op 5 . + CDS 23344 - 23832 369 ## LF82_2664 inner membrane protein YbjO
27 15 Op 6 . + CDS 23873 - 25000 1067 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase
        + Term 25124 - 25160 2.4
        - Term 24990 - 25021 2.5
28 16 Tu 1 .
- CDS 25199 - 25930 861 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain - Prom 25997 - 26056 7.3 Predicted protein(s) >gi|296494565|gb|ADTN01000173.1| GENE 1 23 - 682 563 219 aa, chain + ## HITS:1 COG:yliE_2 KEGG:ns NR:ns ## COG: yliE_2 COG2200 # Protein_GI_number: 16128801 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 1 219 39 257 257 452 99.0 1e-127 MCRWVSAERGIISPLKFITIAEDIGFINELGYQIIKTAMGEFRHFSQRASLKDDFLLHIN VSPWQLNEPHFHERFTTIMKENGLKANSLCVEITETVIERINEHFYLNIEQLRKQGVRIS IDDFGTGLSNLKRFYEINPDSIKVDSQFTGDIFGTAGKIVRIIFDLARYNRIPVIAEGVE SEDVARELIKLGCVQAQGYLYQKPMPFSAWDKSGKLVKE >gi|296494565|gb|ADTN01000173.1| GENE 2 690 - 2018 1020 442 aa, chain + ## HITS:1 COG:yliF_2 KEGG:ns NR:ns ## COG: yliF_2 COG2199 # Protein_GI_number: 16128802 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 261 442 1 182 182 370 100.0 1e-102 MSRINKFVLTVSLLIFIMISAVACGIYTQMVKERVYSLKQSVIDTAFAVANIAEYRRSVA IDLINTLNPTEEQLLVGLRIAYADSVSPSYLYDVGPYLISSDECIQVKEFEKNYCADIMQ VVKYRHVKNTGFISFDGKTFVYYLYPVTHNRSLIFLLGLERFSLLSKSLAMDSENLMFSL FKNGKPVTGDEYNAKNAIFTVSEAMEHFAYLPTGLYVFAYKKDVYLRVCTLIIFFAALVA VISGASCLYLVRRVINRGIVEKEAIINNHFERVLDGGLFFSAADVKKLYSMYNSAFLDDL TKAMGRKSFDEDLKALPEKGGYLCLFDVDKFKNINDTFGHLLGDEVLMKVVKILKSQIPV DKGKVYRFGGDEFAVIYTGGTLEELLSILKEIVHFQVGSINLSTSIGVAHSNECPTVERL KMLADERLYKSKKNGRAQISWQ >gi|296494565|gb|ADTN01000173.1| GENE 3 2065 - 3390 1844 441 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|183179613|ref|ZP_02957824.1| conserved hypothetical protein [Vibrio cholerae MZO-3] # 9 439 34 464 470 714 77 0.0 MSKVTPQPKIGFVSLGCPKNLVDSERILTELRTEGYDVVPSYDDADMVIVNTCGFIDSAV QESLEAIGEALNENGKVIVTGCLGAKEDQIREVHPKVLEITGPHSYEQVLEHVHHYVPKP KHNPFLSLVPEQGVKLTPRHYAYLKISEGCNHRCTFCIIPSMRGDLVSRPIGEVLSEAKR LVDAGVKEILVISQDTSAYGVDVKHRTGFHNGEPVKTSMVSLCEQLSKLGIWTRLHYVYP YPHVDDVIPLMAEGKILPYLDIPLQHASPRILKLMKRPGSVDRQLARIKQWREICPELTL 
RSTFIVGFPGETEEDFQMLLDFLKEARLDRVGCFKYSPVEGADANALPDQVPEEVKEERW NRFMQLQQQISAERLQEKVGREILVIIDEVDEEGAIGRSMADAPEIDGAVYLNGETNVKP GDILRVKVEHADEYDLWGSRV >gi|296494565|gb|ADTN01000173.1| GENE 4 3603 - 3986 272 127 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_0907 NR:ns ## KEGG: EcE24377A_0907 # Name: bssR # Def: biofilm formation regulatory protein BssR # Organism: E.coli_E24377A # Pathway: not_defined # 1 127 12 138 138 226 100.0 2e-58 MFVDRQRIDLLNRLIDARVDLAAYVQLRKAKGYMSVSESNHLRDNFFKLNRELHDKSLRL NLHLDQEEWSALHHAEEALATAAVCLMSGHHDCPTVITVNADKLENCLMSLTLSIQSLQK HAMLEKA >gi|296494565|gb|ADTN01000173.1| GENE 5 4241 - 5212 917 323 aa, chain + ## HITS:1 COG:ECs0917 KEGG:ns NR:ns ## COG: ECs0917 COG2133 # Protein_GI_number: 15830171 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose/sorbosone dehydrogenases # Organism: Escherichia coli O157:H7 # 1 323 49 371 371 650 100.0 0 MLITLRGGELRHWQAGKGLSAPLSGVPDVWAHGQGGLLDVVLAPDFAQSRRIWLSYSEVG DDGKAGTAVGYGRLSDDLSKVTDFRTVFRQMPKLSTGNHFGGRLVFDGKGYLFIALGENN QRPTAQDLDKLQGKLVRLTDQGEIPDDNPFIKESGARAEIWSYGIRNPQGMAMNPWSNAL WLNEHGPRGGDEINIPQKGKNYGWPLATWGINYSGFKIPEAKGEIVAGTEQPVFYWKDSP AVSGMAFYNSDKFPQWQQKLFIGALKDKDVIVMSVNGDKVTEDGRILTDRGQRIRDVRTG PDGYLYVLTDESSGELLKVSPRN >gi|296494565|gb|ADTN01000173.1| GENE 6 5209 - 5835 650 208 aa, chain - ## HITS:1 COG:ECs0918 KEGG:ns NR:ns ## COG: ECs0918 COG0625 # Protein_GI_number: 15830172 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione S-transferase # Organism: Escherichia coli O157:H7 # 1 208 3 210 210 399 99.0 1e-111 MITLWGRNNSTNVKKVLLTLEELELPYEQILAGREFGINHDADFLAMNPNGLVPLLRDDE SDLILWESNAIVRYLAAQYGQKRLWIDSPARRAEAEKWMDWANQTLSNAHRGILMGLVRT PPEERDQAAIDASCKECDALFALLDAELAKVKWFSGDEFGVGDIAIAPFIYNLFNVGLTW TPRPNLQRLYQQLTERPAVRKVVMIPVS >gi|296494565|gb|ADTN01000173.1| GENE 7 6061 - 7284 1274 407 aa, chain + ## HITS:1 COG:dacC KEGG:ns NR:ns ## COG: dacC COG1686 # Protein_GI_number: 16128807 # Func_class: M Cell wall/membrane/envelope biogenesis # 
Function: D-alanyl-D-alanine carboxypeptidase # Organism: Escherichia coli K12 # 8 407 1 400 400 795 100.0 0 MDTRVAFMTQYSSLLRGLAAGSAFLFLFAPTAFAAEQTVEAPSVDARAWILMDYASGKVL AEGNADEKLDPASLTKIMTSYVVGQALKADKIKLTDMVTVGKDAWATGNPALRGSSVMFL KPGDQVSVADLNKGVIIQSGNDACIALADYVAGSQESFIGLMNGYAKKLGLTNTTFQTVH GLDAPGQFSTARDMALLGKALIHDVPEEYAIHKEKEFTFNKIRQPNRNRLLWSSNLNVDG MKTGTTAGAGYNLVASATQGDMRLISVVLGAKTDRIRFNESEKLLTWGFRFFETVTPIKP DATFVTQRVWFGDKSEVNLGAGEAGSVTIPRGQLKNLKASYTLTEPQLTAPLKKGQVVGT IDFQLNGKSIEQRPLIVMENVEEGGFFGRVWDFVMMKFHQWFGSWFS >gi|296494565|gb|ADTN01000173.1| GENE 8 7331 - 8089 579 252 aa, chain - ## HITS:1 COG:ECs0920 KEGG:ns NR:ns ## COG: ECs0920 COG1349 # Protein_GI_number: 15830174 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli O157:H7 # 1 252 1 252 252 528 100.0 1e-150 METRREERIGQLLQELKRSDKLHLKDAAALLGVSEMTIRRDLNNHSAPVVLLGGYIVLEP RSASHYLLSDQKSRLVEEKRRAAKLAATLVEPDQTLFFDCGTTTPWIIEAIDNEIPFTAV CYSLNTFLALKEKPHCRAFLCGGEFHASNAIFKPIDFQQTLNNFCPDIAFYSAAGVHVSK GATCFNLEELPVKHWAMSMAQKHVLVVDHSKFGKVRPARMGDLKRFDIVVSDCCPEDEYV KYAQTQRIKLMY >gi|296494565|gb|ADTN01000173.1| GENE 9 8147 - 8743 562 198 aa, chain - ## HITS:1 COG:ybjG KEGG:ns NR:ns ## COG: ybjG COG0671 # Protein_GI_number: 16128809 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Escherichia coli K12 # 1 198 1 198 198 355 99.0 2e-98 MLENLNLSLFSLINATPDSAPWMISLAIFISKDLITVVPLLAVVLWLWGLTAQRQLVIKI AIALAVSLFVSWTMGHLFPHDRPFVENIGYNFLHHAADDSFPSDHGTVIFTFALAFLCWH RLWSGSLLMVLAVVIAWSRVYLGVHWPLDMLGGLLAGMIGCLSAQIIWQAMGHKLYQRLQ SWYRVCFALPIRKGWVRD >gi|296494565|gb|ADTN01000173.1| GENE 10 9028 - 10260 1238 410 aa, chain + ## HITS:1 COG:ECs0922 KEGG:ns NR:ns ## COG: ECs0922 COG0477 # Protein_GI_number: 15830176 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function 
prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 410 1 410 410 721 99.0 0 MQNKLASGARLGRQALLFPLCLVLYEFSTYIGNDMIQPGMLAVVEQYQAGIDWVPTSMTA YLAGGMFLQWLLGPLSDRIGRRPVMLAGVVWFIVTCLAILLAQNIEQFTLLRFLQGISLC FIGAVGYAAIQESFEEAVCIKITALMANVALIAPLLGPLVGAAWIHVLPWEGMFVLFAAL AAISFFGLQRAMPETATRIGEKLSLKELGRDYKLVLKNGRFVAGALALGFVSLPLLAWIA QSPIIIITGEQLSSYEYGLLQVPIFGALIAGNLLLARLTSRRTVRSLIIMGGWPIMIGLL VAAAATVISSHAYLWMTAGLSIYAFGIGLANAGLVRLTLFASDMSKGTVSAAMGMLQMLI FTVGVEISKHAWLNGGNGLFNLFNLVNGILWLSLMVIFLKDKQMGNSHEG >gi|296494565|gb|ADTN01000173.1| GENE 11 10301 - 10579 220 92 aa, chain - ## HITS:1 COG:no KEGG:ECO103_0887 NR:ns ## KEGG: ECO103_0887 # Name: ybjH # Def: hypothetical protein # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 92 3 94 94 127 100.0 9e-29 MKNCLLLGALLMGFTGVAMAQSVTVDVPSGYKVVVVPDSVSVPQAVSVATVPQTVYVAPA PAPAYRPHPYVRHLASVGEGMVIEHQIDDHHH >gi|296494565|gb|ADTN01000173.1| GENE 12 10671 - 11486 890 271 aa, chain - ## HITS:1 COG:ybjI KEGG:ns NR:ns ## COG: ybjI COG0561 # Protein_GI_number: 16128812 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Escherichia coli K12 # 10 271 1 262 262 520 100.0 1e-147 MSIKLIAVDMDGTFLSDQKTYNRERFMAQYQQMKAQGIRFVVASGNQYYQLISFFPEIAN EIAFVAENGGWVVSEGKDVFNGELSKDAFATVVEHLLTRPEVEIIACGKNSAYTLKKYDD AMKTVAEMYYHRLEYVDNFDNLEDIFFKFGLNLSDELIPQVQKALHEAIGDIMVSVHTGN GSIDLIIPGVHKANGLRQLQKLWGIDDSEVVVFGDGGNDIEMLRQAGFSFAMENAGSAVV AAAKYRAGSNNREGVLDVIDKVLKHEAPFDQ >gi|296494565|gb|ADTN01000173.1| GENE 13 11486 - 12613 1137 375 aa, chain - ## HITS:1 COG:ybjJ KEGG:ns NR:ns ## COG: ybjJ COG0477 # Protein_GI_number: 16128813 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 375 28 402 402 613 100.0 1e-175 
MASWATRTPAIRDILSVSIAEMGGVLFGLSIGSMSGILCSAWLVKRFGTRNVILVTMSCA LIGMMILSLALWLTSPLLFAVGLGVFGASFGSAEVAINVEGAAVEREMNKTVLPMMHGFY SLGTLAGAGVGMALTAFGVPATVHILLAALVGIAPIYIAIQAIPDGTGKNAADGTQHGEK GVPFYRDIQLLLIGVVVLAMAFAEGSANDWLPLLMVDGHGFSPTSGSLIYAGFTLGMTVG RFTGGWFIDRYSRVAVVRASALMGALGIGLIIFVDSAWVAGVSVVLWGLGASLGFPLTIS AASDTGPDAPTRVSVVATTGYLAFLVGPPLLGYLGEHYGLRSAMLVVLALVILAAIVAKA VAKPDTKTQTAMENS >gi|296494565|gb|ADTN01000173.1| GENE 14 12778 - 13314 529 178 aa, chain + ## HITS:1 COG:ybjK KEGG:ns NR:ns ## COG: ybjK COG3226 # Protein_GI_number: 16128814 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 178 1 178 178 344 100.0 6e-95 MRRANDPQRREKIIQATLEAVKLYGIHAVTHRKIATLAGVPLGSMTYYFSGIDELLLEAF SSFTEIMSRQYQAFFSDVSDAPGACQAITDMIYSSQVATPDNMELMYQLYALASRKPLLK TVMQNWMQRSQQTLEQWFEPGTARALDAFIEGMTLHFVTDRKPLSREEILRMVERVAG >gi|296494565|gb|ADTN01000173.1| GENE 15 13489 - 15174 1707 561 aa, chain - ## HITS:1 COG:ECs0927 KEGG:ns NR:ns ## COG: ECs0927 COG2985 # Protein_GI_number: 15830181 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Escherichia coli O157:H7 # 1 561 1 561 561 1046 100.0 0 MNINVAELLNGNYILLLFVVLALGLCLGKLRLGSIQLGNSIGVLVVSLLLGQQHFSINTD ALNLGFMLFIFCVGVEAGPNFFSIFFRDGKNYLMLALVMVGSALVIALGLGKLFGWDIGL TAGMLAGSMTSTPVLVGAGDTLRHSGMESRQLSLALDNLSLGYALTYLIGLVSLIVGARY LPKLQHQDLQTSAQQIARERGLDTDANRKVYLPVIRAYRVGPELVAWTDGKNLRELGIYR QTGCYIERIRRNGILANPDGDAVLQMGDEIALVGYPDAHARLDPSFRNGKEVFDRDLLDM RIVTEEVVVKNHNAVGKRLAQLKLTDHGCFLNRVIRSQIEMPIDDNVVLNKGDVLQVSGD ARRVKTIADRIGFISIHSQVTDLLAFCAFFVIGLMIGMITFQFSTFSFGMGNAAGLLFAG IMLGFMRANHPTFGYIPQGALSMVKEFGLMVFMAGVGLSAGSGINNGLGAIGGQMLIAGL IVSLVPVVICFLFGAYVLRMNRALLFGAMMGARTCAPAMEIISDTARSNIPALGYAGTYA IANVLLTLAGTIIVMVWPGLG >gi|296494565|gb|ADTN01000173.1| GENE 16 15525 - 15821 109 98 aa, chain + ## HITS:1 COG:no KEGG:ECIAI39_0827 NR:ns ## KEGG: ECIAI39_0827 # Name: ybjM # Def: conserved hypothetical protein; putative inner membrane protein # Organism: 
E.coli_IAI39 # Pathway: not_defined # 1 98 28 125 125 170 100.0 1e-41 MKGAFRAAGHPEIGLLFFILPGAVASFFSQRREVLKPLFGAMLAAPCSMLIMRLFFSPTR SFWQELAWLLSAVFWCALGALCFLFISSLFKPQHRKNQ >gi|296494565|gb|ADTN01000173.1| GENE 17 15851 - 16108 467 85 aa, chain - ## HITS:1 COG:grxA KEGG:ns NR:ns ## COG: grxA COG0695 # Protein_GI_number: 16128817 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutaredoxin and related proteins # Organism: Escherichia coli K12 # 1 85 1 85 85 167 98.0 6e-42 MQTVIFGRSGCPYCVRAKDLAEKLSNERDDFQYQYVDFRAEGITKEDLQQKAGKPVETVP QIFVDQQHIGGYTDFAAWVKENLDA >gi|296494565|gb|ADTN01000173.1| GENE 18 16268 - 16555 243 95 aa, chain + ## HITS:1 COG:no KEGG:SSON_0835 NR:ns ## KEGG: SSON_0835 # Name: ybjC # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 95 1 95 95 137 100.0 2e-31 MRAIGKLPKGVLILEFIGMMLLAVALLSVSDSLSLPEPFSRPEVQILMIFLGVLLMLPAA VVVILQVAKRLAPQLMNRPPQYSRSEREKDNDANH >gi|296494565|gb|ADTN01000173.1| GENE 19 16539 - 17261 758 240 aa, chain + ## HITS:1 COG:mdaA KEGG:ns NR:ns ## COG: mdaA COG0778 # Protein_GI_number: 16128819 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Escherichia coli K12 # 1 240 1 240 240 477 100.0 1e-135 MTPTIELICGHRSIRHFTDEPISEAQREAIINSARATSSSSFLQCSSIIRITDKALREEL VTLTGGQKHVAQAAEFWVFCADFNRHLQICPDAQLGLAEQLLLGVVDTAMMAQNALIAAE SLGLGGVYIGGLRNNIEAVTKLLKLPQHVLPLFGLCLGWPADNPDLKPRLPASILVHENS YQPLDKGALAQYDEQLAEYYLTRGSNNRRDTWSDHIRRTIIKESRPFILDYLHKQGWATR >gi|296494565|gb|ADTN01000173.1| GENE 20 17322 - 18224 1507 300 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15830186|ref|NP_308959.1| ribosomal protein S6 modification protein [Escherichia coli O157:H7 str. 
Sakai] # 1 300 1 300 300 585 99 1e-166 MKIAILSRDGTLYSCKRLREAAIQRGHLVEILDPLSCYMNINPAASSIHYKGRKLPHFDA VIPRIGTAITFYGTAALRQFEMLGSYPLNESVAIARARDKLRSMQLLARQGIDLPVTGIA HSPDDTSDLIDMVGGAPLVVKLVEGTQGIGVVLAETRQAAESVIDAFRGLNAHILVQEYI KEAQGCDIRCLVVGDEVVAAIERRAKEGDFRSNLHRGGAASVASITPQEREIAIKAARTM ALDVAGVDILRANRGPLVMEVNASPGLEGIEKTTGIDIAGKMIRWIERHATTEYCLKTGG >gi|296494565|gb|ADTN01000173.1| GENE 21 18312 - 18788 418 158 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_0925 NR:ns ## KEGG: EcE24377A_0925 # Name: not_defined # Def: TPR repeat-containing protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 158 17 174 174 310 100.0 1e-83 MTSLVVPGLDTLRQWLDDLGMSFFECDNCQALHLPHMQNFDGVFDAKIDLIDNTILFSAM AEVRPSAVLPLAADLSAINASSLTVKAFLDMQDDNLPKLVVCQSLSVMQGVTYEQFAWFV RQSEEQISMVILEANAHQLLLPTDDEGQNNVTENYFLH >gi|296494565|gb|ADTN01000173.1| GENE 22 19139 - 20251 1236 370 aa, chain + ## HITS:1 COG:potF KEGG:ns NR:ns ## COG: potF COG0687 # Protein_GI_number: 16128822 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Escherichia coli K12 # 1 370 1 370 370 734 100.0 0 MTALNKKWLSGLVAGALMAVSVGTLAAEQKTLHIYNWSDYIAPDTVANFEKETGIKVVYD VFDSNEVLEGKLMAGSTGFDLVVPSASFLERQLTAGVFQPLDKSKLPEWKNLDPELLKLV AKHDPDNKFAMPYMWATTGIGYNVDKVKAVLGENAPVDSWDLILKPENLEKLKSCGVSFL DAPEEVFATVLNYLGKDPNSTKADDYTGPATDLLLKLRPNIRYFHSSQYINDLANGDICV AIGWAGDVWQASNRAKEAKNGVNVSFSIPKEGAMAFFDVFAMPADAKNKDEAYQFLNYLL RPDVVAHISDHVFYANANKAATPLVSAEVRENPGIYPPADVRAKLFTLKVQDPKIDRVRT RAWTKVKSGK >gi|296494565|gb|ADTN01000173.1| GENE 23 20346 - 21479 1430 377 aa, chain + ## HITS:1 COG:potG KEGG:ns NR:ns ## COG: potG COG3842 # Protein_GI_number: 16128823 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Escherichia coli K12 # 1 377 28 404 404 762 99.0 0 MNDAIPRPQAKTRKALTPLLEIRNLTKSYDGQHAVDDVSLTIYKGEIFALLGASGCGKST LLRMLAGFEQPSAGQIMLDGVDLSQVPPYLRPINMMFQSYALFPHMTVEQNIAFGLKQDK 
LPKAEIASRVNEMLGLVHMQEFAKRKPHQLSGGQRQRVALARSLAKRPKLLLLDEPMGAL DKKLRDRMQLEVVDILERVGVTCVMVTHDQEEAMTMAGRIAIMNRGKFVQIGEPEEIYEH PTTRYSAEFIGSVNVFEGVLKERQEDGLVLDSPGLVHPLKVDADASVVDNVPVHVALRPE KIMLCEEPPANGCNFAVGEVIHIAYLGDLSVYHVRLKSGQMISAQLQNAHRHRKGLPTWG DEVRLCWEVDSCVVLTV >gi|296494565|gb|ADTN01000173.1| GENE 24 21489 - 22442 993 317 aa, chain + ## HITS:1 COG:potH KEGG:ns NR:ns ## COG: potH COG1176 # Protein_GI_number: 16128824 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Escherichia coli K12 # 1 317 1 317 317 553 100.0 1e-157 MSTLEPAAQSKPPGGFKLWLSQLQMKHGRKLVIALPYIWLILLFLLPFLIVFKISLAEMA RAIPPYTELMEWADGQLSITLNLGNFLQLTDDPLYFDAYLQSLQVAAISTFCCLLIGYPL AWAVAHSKPSTRNILLLLVILPSWTSFLIRVYAWMGILKNNGVLNNFLLWLGVIDQPLTI LHTNLAVYIGIVYAYVPFMVLPIYTALIRIDYSLVEAALDLGARPLKTFFTVIVPLTKGG IIAGSMLVFIPAVGEFVIPELLGGPDSIMIGRVLWQEFFNNRDWPVASAVAIIMLLLLIV PIMWFHKHQQKSVGEHG >gi|296494565|gb|ADTN01000173.1| GENE 25 22439 - 23284 742 281 aa, chain + ## HITS:1 COG:STM0880 KEGG:ns NR:ns ## COG: STM0880 COG1177 # Protein_GI_number: 16764242 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # Organism: Salmonella typhimurium LT2 # 1 281 1 281 281 452 94.0 1e-127 MNNLPVVRSPWRIVILLLGFTFLYAPMLMLVIYSFNSSKLVTVWAGWSTRWYGELLRDDA MMSAVGLSLTIAACAATAAAILGTIAAVVLVRFGRFRGSNGFAFMITAPLVMPDVITGLS LLLLFVALAHAIGWPADRGMLTIWLAHVTFCTAYVAVVISSRLRELDRSIEEAAMDLGAT PLKVFFVITLPMIMPAIISGWLLAFTLSLDDLVIASFVSGPGATTLPMLVFSSVRMGVNP EINALATLILGAVGIVGFIAWYLMARAEKQRIRDIQRARRG >gi|296494565|gb|ADTN01000173.1| GENE 26 23344 - 23832 369 162 aa, chain + ## HITS:1 COG:no KEGG:LF82_2664 NR:ns ## KEGG: LF82_2664 # Name: ybjO # Def: inner membrane protein YbjO # Organism: E.coli_LF82 # Pathway: not_defined # 1 162 1 162 162 288 99.0 4e-77 MEDETLGFFKKTSSSHARLNVPALVQVAALAIIMIRGLDVLMIFNTLGVRGIGEFIHRSV QTWSLTLVLLSSLVLVFIEIWCAFSLVKGRRWARWLYLLTQITAASYLWAASLGYGYPEL 
FSIPGESKREIFHSLMLQKLPDMLILMLLFVPSTSRRFFQLQ >gi|296494565|gb|ADTN01000173.1| GENE 27 23873 - 25000 1067 375 aa, chain + ## HITS:1 COG:ybjF KEGG:ns NR:ns ## COG: ybjF COG2265 # Protein_GI_number: 16128827 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Escherichia coli K12 # 1 375 1 375 375 767 100.0 0 MQCALYDAGRCRSCQWIMQPIPEQLSAKTADLKNLLADFPVEEWCAPVSGPEQGFRNKAK MVVSGSVEKPLLGMLHRDGTPEDLCDCPLYPASFAPVFAALKPFIARAGLTPYNVARKRG ELKYILLTESQSDGGMMLRFVLRSDTKLAQLRKALPWLHEQLPQLKVITVNIQPVHMAIM EGETEIYLTEQQALAERFNDVPLWIRPQSFFQTNPAVASQLYATARDWVRQLPVKHMWDL FCGVGGFGLHCATPDMQLTGIEIASEAIACAKQSAAELGLTRLQFQALDSTQFATAQGDV PELVLVNPPRRGIGKPLCDYLSTMAPRFIIYSSCNAQTMAKDIRELPGFRIERVQLFDMF PHTAHYEVLTLLVKQ >gi|296494565|gb|ADTN01000173.1| GENE 28 25199 - 25930 861 243 aa, chain - ## HITS:1 COG:artJ KEGG:ns NR:ns ## COG: artJ COG0834 # Protein_GI_number: 16128828 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Escherichia coli K12 # 1 243 1 243 243 482 100.0 1e-136 MKKLVLAALLASFTFGASAAEKINFGVSATYPPFESIGANNEIVGFDIDLAKALCKQMQA ECTFTNHAFDSLIPSLKFRKYDAVISGMDITPERSKQVSFTTPYYENSAVVIAKKDTYKT FADLKGKRIGMENGTTHQKYIQDQHPEVKTVSYDSYQNAFIDLKNGRIDGVFGDTAVVNE WLKTNPQLGVATEKVTDPQYFGTGLGIAVRPDNKALLEKLNNALAAIKADGTYQKISDQW FPQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:44:22 2011 Seq name: gi|296494564|gb|ADTN01000174.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont306.8, whole genome shotgun sequence Length of sequence - 14508 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 6, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 12/0.000 - CDS 148 - 816 940 ## COG4160 ABC-type arginine/histidine transport system, permease component 2 1 Op 2 
12/0.000 - CDS 816 - 1532 832 ## COG4215 ABC-type arginine transport system, permease component 3 1 Op 3 7/0.000 - CDS 1539 - 2270 1128 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain 4 1 Op 4 . - CDS 2288 - 3022 246 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 - Prom 3053 - 3112 4.2 - Term 3163 - 3203 5.6 5 2 Tu 1 . - CDS 3234 - 3749 345 ## B21_00876 hypothetical protein - Prom 3790 - 3849 4.1 + Prom 3740 - 3799 6.2 6 3 Op 1 4/0.000 + CDS 3875 - 4198 525 ## COG0393 Uncharacterized conserved protein 7 3 Op 2 . + CDS 4195 - 5025 810 ## COG3023 Negative regulator of beta-lactamase expression + Term 5158 - 5199 1.0 8 4 Op 1 4/0.000 - CDS 5022 - 6035 928 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Prom 6063 - 6122 3.0 9 4 Op 2 4/0.000 - CDS 6134 - 7564 1239 ## COG0702 Predicted nucleoside-diphosphate-sugar epimerases 10 4 Op 3 5/0.000 - CDS 7575 - 8576 1141 ## COG2008 Threonine aldolase 11 4 Op 4 5/0.000 - CDS 8613 - 10331 1746 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] - Prom 10359 - 10418 5.3 - Term 10415 - 10454 5.0 12 5 Op 1 2/1.000 - CDS 10464 - 11432 941 ## COG1018 Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 13 5 Op 2 2/1.000 - CDS 11444 - 13096 1999 ## COG1151 6Fe-6S prismane cluster-containing protein - Prom 13129 - 13188 4.8 14 6 Tu 1 . 
- CDS 13240 - 14139 819 ## COG2431 Predicted membrane protein - Prom 14283 - 14342 3.4 Predicted protein(s) >gi|296494564|gb|ADTN01000174.1| GENE 1 148 - 816 940 222 aa, chain - ## HITS:1 COG:ECs0944 KEGG:ns NR:ns ## COG: ECs0944 COG4160 # Protein_GI_number: 15830198 # Func_class: E Amino acid transport and metabolism # Function: ABC-type arginine/histidine transport system, permease component # Organism: Escherichia coli O157:H7 # 1 222 1 222 222 372 100.0 1e-103 MFEYLPELMKGLHTSLTLTVASLIVALILALIFTIILTLKTPVLVWLVRGYITLFTGTPL LVQIFLIYYGPGQFPTLQEYPALWHLLSEPWLCALIALSLNSAAYTTQLFYGAIRAIPEG QWQSCSALGMSKKDTLAILLPYAFKRSLSSYSNEVVLVFKSTSLAYTITLMEVMGYSQLL YGRTYDVMVFGAAGIIYLVVNGLLTLMMRLIERKALAFERRN >gi|296494564|gb|ADTN01000174.1| GENE 2 816 - 1532 832 238 aa, chain - ## HITS:1 COG:artQ KEGG:ns NR:ns ## COG: artQ COG4215 # Protein_GI_number: 16128830 # Func_class: E Amino acid transport and metabolism # Function: ABC-type arginine transport system, permease component # Organism: Escherichia coli K12 # 1 238 1 238 238 426 100.0 1e-119 MNEFFPLASAAGMTVGLAVCALIVGLALAMFFAVWESAKWRPVAWAGSALVTILRGLPEI LVVLFIYFGSSQLLLTLSDGFTINLGFVQIPVQMDIENFDVSPFLCGVIALSLLYAAYAS QTLRGALKAVPVGQWESGQALGLSKSAIFFRLVMPQMWRHALPGLGNQWLVLLKDTALVS LISVNDLMLQTKSIATRTQEPFTWYIVAAAIYLVITLLSQYILKRIDLRATRFERRPS >gi|296494564|gb|ADTN01000174.1| GENE 3 1539 - 2270 1128 243 aa, chain - ## HITS:1 COG:artI KEGG:ns NR:ns ## COG: artI COG0834 # Protein_GI_number: 16128831 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Escherichia coli K12 # 1 243 1 243 243 475 100.0 1e-134 MKKVLIAALIAGFSLSATAAETIRFATEASYPPFESIDANNQIVGFDVDLAQALCKEIDA TCTFSNQAFDSLIPSLKFRRVEAVMAGMDITPEREKQVLFTTPYYDNSALFVGQQGKYTS VDQLKGKKVGVQNGTTHQKFIMDKHPEITTVPYDSYQNAKLDLQNGRIDGVFGDTAVVTE WLKDNPKLAAVGDKVTDKDYFGTGLGIAVRQGNTELQQKLNTALEKVKKDGTYETIYNKW FQK >gi|296494564|gb|ADTN01000174.1| GENE 4 2288 - 3022 246 244 
aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 222 1 215 305 99 29 1e-20 MSMSIQLNGINCFYGAHQALFDITLDCPQGETLVLLGPSGAGKSSLLRVLNLLEMPRSGT LNIAGNHFDFTKTPSDKAIRDLRRNVGMVFQQYNLWPHLTVQQNLIEAPCRVLGLSKDQA LARAEKLLERLRLKPYSDRYPLHLSGGQQQRVAIARALMMEPQVLLFDEPTAALDPEITA QIVSIIRELAETNITQVIVTHEVEVARKTASRVVYMENGHIVEQGDASCFTEPQTEAFKN YLSH >gi|296494564|gb|ADTN01000174.1| GENE 5 3234 - 3749 345 171 aa, chain - ## HITS:1 COG:no KEGG:B21_00876 NR:ns ## KEGG: B21_00876 # Name: ybjP # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 171 1 171 171 342 100.0 4e-93 MRYSKLTMLIPCALLLSACTTVTPAYKDNGTRSGPCVEGGPDNVAQQFYDYRILHRSNDI TALRPYLSDKLATLLSDASRDNNHRELLTNDPFSSRTTLPDSAHVASASTIPNRDARNIP LRVDLKQGDQGWQDEVLMIQEGQCWVIDDVRYLGGSVHATAGTLRQSIENR >gi|296494564|gb|ADTN01000174.1| GENE 6 3875 - 4198 525 107 aa, chain + ## HITS:1 COG:ECs0952 KEGG:ns NR:ns ## COG: ECs0952 COG0393 # Protein_GI_number: 15830206 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 107 1 107 107 187 100.0 4e-48 MQFSTTPTLEGQTIVEYCGVVTGEAILGANIFRDFFAGIRDIVGGRSGAYEKELRKAREI AFEELGSQARALGADAVVGIDIDYETVGQNGSMLMVSVSGTAVKTRR >gi|296494564|gb|ADTN01000174.1| GENE 7 4195 - 5025 810 276 aa, chain + ## HITS:1 COG:ECs0953 KEGG:ns NR:ns ## COG: ECs0953 COG3023 # Protein_GI_number: 15830207 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Escherichia coli O157:H7 # 1 276 1 276 276 538 98.0 1e-153 MRRFFWLVAAALLLAGCAGEKGIVEKEGYQLDTRRQAQAAYPRIKVLVIHYTADDFDSSL ATLTDKQVSSHYLVPAVPPRYNGKPRIWQLVPEQELAWHAGISAWRGATRLNDTSIGIEL ENRGWQKSAGVKYFAPFEPAQIQALIPLAKDIIARYHIKPENVVAHADIAPQRKDDPGPL FPWQQLAQQGIGAWPDAQRVNFYLAGRAPHTPVDTASLLELLARYGYDVKPDMTPREQRR VIMAFQMHFRPTLYNGEADAETQAIAEALLEKYGQD >gi|296494564|gb|ADTN01000174.1| GENE 8 5022 - 6035 928 337 aa, chain - ## HITS:1 COG:ybjS KEGG:ns NR:ns ## COG: ybjS COG0451 # 
Protein_GI_number: 16128836 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Escherichia coli K12 # 1 337 13 349 349 707 100.0 0 MKVLVTGATSGLGRNAVEFLCQKGISVRATGRNEAMGKLLEKMGAEFVPADLTELVSSQA KVMLAGIDTLWHCSSFTSPWGTQQAFDLANVRATRRLGEWAVAWGVRNFIHISSPSLYFD YHHHRDIKEDFRPHRFANEFARSKAASEEVINMLSQANPQTRFTILRPQSLFGPHDKVFI PRLAHMMHHYGSILLPHGGSALVDMTYYENAVHAMWLASQEACDKLPSGRVYNITNGEHR TLRSIVQKLIDELNIDCRIRSVPYPMLDMIARSMERLGRKSAKEPPLTHYGVSKLNFDFT LDITRAQEELGYQPVITLDEGIEKTAAWLRDHGKLPR >gi|296494564|gb|ADTN01000174.1| GENE 9 6134 - 7564 1239 476 aa, chain - ## HITS:1 COG:ybjT KEGG:ns NR:ns ## COG: ybjT COG0702 # Protein_GI_number: 16128837 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate-sugar epimerases # Organism: Escherichia coli K12 # 1 476 11 486 486 958 99.0 0 MPQRILVLGASGYIGQHLVRTLSQQGHQILAAARHVDRLAKLQLANVSCHKVDLSWPDNL PALLQDIDTVYFLVHSMGEGGDFIAQERQVALNVRDALREVPVKQLIFLSSLQAPPHEQS DHLRARQATADILREANVPVTELRAGIIVGAGSAAFEVMRDMVYNLPVLTPPRWVRSRTT PIALENLLHYLVALLDHPASEHRIFEAAGPEVLSYQQQFEHFMAVSGKRRWLIPIPLPTR WISVWFLNVITSVPPTTARALIQGLKHDLLADDTALRALIPQRLIAFDDAVRSTLKEEEK LVNSSDWGYDAQAFARWRPEYGYFAKQAGFTVKTSASLAALWQVVNQIGGKERYFFGNIL WQTRALMDRAIGHKLAKGRPEREYLQTGDAVDSWKVIVVEPEKQLTLLFGMKAPGLGRLC FSLEDKGDYRTIDVRAFWHPHGMPGLFYWLLMIPAHLFIFRGMAKQIARLAEQSTD >gi|296494564|gb|ADTN01000174.1| GENE 10 7575 - 8576 1141 333 aa, chain - ## HITS:1 COG:ybjU KEGG:ns NR:ns ## COG: ybjU COG2008 # Protein_GI_number: 16128838 # Func_class: E Amino acid transport and metabolism # Function: Threonine aldolase # Organism: Escherichia coli K12 # 1 333 1 333 333 664 100.0 0 MIDLRSDTVTRPSRAMLEAMMAAPVGDDVYGDDPTVNALQDYAAELSGKEAAIFLPTGTQ ANLVALLSHCERGEEYIVGQAAHNYLFEAGGAAVLGSIQPQPIDAAADGTLPLDKVAMKI KPDDIHFARTKLLSLENTHNGKVLPREYLKEAWEFTRERNLALHVDGARIFNAVVAYGCE LKEITQYCDSFTICLSKGLGTPVGSLLVGNRDYIKRAIRWRKMTGGGMRQSGILAAAGIY 
ALKNNVARLQEDHDNAAWMAEQLREAGADVMRQDTNMLFVRVGEENAAALGEYMKARNVL INASPIVRLVTHLDVSREQLAEVAAHWRAFLAR >gi|296494564|gb|ADTN01000174.1| GENE 11 8613 - 10331 1746 572 aa, chain - ## HITS:1 COG:poxB KEGG:ns NR:ns ## COG: poxB COG0028 # Protein_GI_number: 16128839 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Escherichia coli K12 # 1 572 1 572 572 1148 99.0 0 MKQTVAAYIAKTLESAGVKRIWGVTGDSLNGLSDSLNRMGTIEWMSTRHEEVAAFAAGAE AQLSGELAVCAGSCGPGNLHLINGLFDCHRNHVPVLAIAAHIPSSEIGSGYFQETHPQEL FRECSHYCELVSSPEQIPQVLAIAMRKAVLNRGVSVVVLPGDVALKPAPEGATMHWYHAP QPVVTPEEEELRKLAQLLRYSSNIALMCGSGCAGAHKELVEFAGKIKAPIVHALRGKEHV EYDNPYDVGMTGLIGFSSGFHTMMNADTLVLLGTQFPYRAFYPTDAKIIQIDINPASIGA HSKVDMALVGDIKSTLRALLPLVEEKADRKFLDKALEDYRDARKGLDDLAKPSEKAIHPQ YLAQQISHFAADDAIFTCDVGTPTVWAARYLKMNGKRRLLGSFNHGSMANAMPQALGAQA TEPERQVVAMCGDGGFSMLMGDFLSVVQMKLPVKIVVFNNSVLGFVAMEMKAGGYLTDGT KLHDTNFARIAEACGITGIRVEKASEVDEALQRAFSIDGPVLVDVVVAKEELAIPPQIKL EQAKGFSLYMLRAIISGRGDEVIELAKTNWLR >gi|296494564|gb|ADTN01000174.1| GENE 12 10464 - 11432 941 322 aa, chain - ## HITS:1 COG:ybjV KEGG:ns NR:ns ## COG: ybjV COG1018 # Protein_GI_number: 16128840 # Func_class: C Energy production and conversion # Function: Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 # Organism: Escherichia coli K12 # 1 322 1 322 322 663 100.0 0 MTMPTNQCPWRMQVHHITQETPDVWTISLICHDYYPYRAGQYALVSVRNSAETLRAYTIS STPGVSEYITLTVRRIDDGVGSQWLTRDVKRGDYLWLSDAMGEFTCDDKAEDKFLLLAAG CGVTPIMSMRRWLAKNRPQADVRVIYNVRTPQDVIFADEWRNYPVTLVAENNVTEGFIAG RLTRELLAGVPDLASRTVMTCGPAPYMDWVEQEVKALGVTRFFKEKFFTPVAEAATSGLK FTKLQPAREFYAPVGTTLLEALESNNVPVVAACRAGVCGCCKTKVVSGEYTVSSTMTLTD AEIAEGYVLACSCHPQGDLVLA >gi|296494564|gb|ADTN01000174.1| GENE 13 11444 - 13096 1999 550 aa, chain - ## HITS:1 COG:ybjW KEGG:ns NR:ns ## COG: ybjW COG1151 # Protein_GI_number: 16128841 # 
Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Escherichia coli K12 # 1 550 3 552 552 1160 100.0 0 MFCVQCEQTIRTPAGNGCSYAQGMCGKTAETSDLQDLLIAALQGLSAWAVKAREYGIINH DVDSFAPRAFFSTLTNVNFDSPRIVGYAREAIALREALKAQCLAVDANARVDNPMADLQL VSDDLGELQRQAAEFTPNKDKAAIGENILGLRLLCLYGLKGAAAYMEHAHVLGQYDNDIY AQYHKIMAWLGTWPADMNALLECSMEIGQMNFKVMSILDAGETGKYGHPTPTQVNVKATA GKCILISGHDLKDLYNLLEQTEGTGVNVYTHGEMLPAHGYPELRKFKHLVGNYGSGWQNQ QVEFARFPGPIVMTSNCIIDPTVGAYDDRIWTRSIVGWPGVRHLDGDDFSAVITQAQQMA GFPYSEIPHLITVGFGRQTLLGAADTLIDLVSREKLRHIFLLGGCDGARGERHYFTDFAT SVPDDCLILTLACGKYRFNKLEFGDIEGLPRLVDAGQCNDAYSAIILAVTLAEKLGCGVN DLPLSLVLSWFEQKAIVILLTLLSLGVKNIVTGPTAPGFLTPDLLAVLNEKFGLRSITTV EEDMKQLLSA >gi|296494564|gb|ADTN01000174.1| GENE 14 13240 - 14139 819 299 aa, chain - ## HITS:1 COG:ybjE KEGG:ns NR:ns ## COG: ybjE COG2431 # Protein_GI_number: 16128842 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 299 17 315 315 479 100.0 1e-135 MFSGLLIILVPLIVGYLIPLRQQAALKVINQLLSWMVYLILFFMGISLAFLDNLASNLLA ILHYSAVSITVILLCNIAALMWLERGLPWRNHHQQEKLPSRIAMALESLKLCGVVVIGFA IGLSGLAFLQHATEASEYTLILLLFLVGIQLRNNGMTLKQIVLNRRGMIVAVVVVVSSLI GGLINAFILDLPINTALAMASGFGWYSLSGILLTESFGPVIGSAAFFNDLARELIAIMLI PGLIRRSRSTALGLCGATSMDFTLPVLQRTGGLDMVPAAIVHGFILSLLVPILIAFFSA Prediction of potential genes in microbial genomes Time: Sun May 15 23:44:28 2011 Seq name: gi|296494563|gb|ADTN01000175.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont306.9, whole genome shotgun sequence Length of sequence - 7096 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 82 - 135 9.6 1 1 Tu 1 . - CDS 136 - 831 802 ## COG0580 Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) - Prom 1074 - 1133 8.2 + Prom 1027 - 1086 7.5 2 2 Tu 1 . 
+ CDS 1257 - 2915 1228 ## COG3593 Predicted ATP-dependent endonuclease of the OLD family + Term 2916 - 2944 -0.6 - Term 2754 - 2795 0.2 3 3 Tu 1 . - CDS 2912 - 3862 637 ## COG2990 Uncharacterized protein conserved in bacteria - Prom 4007 - 4066 3.9 4 4 Op 1 13/0.000 + CDS 4019 - 5134 1304 ## COG0845 Membrane-fusion protein 5 4 Op 2 . + CDS 5131 - 7077 368 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 Predicted protein(s) >gi|296494563|gb|ADTN01000175.1| GENE 1 136 - 831 802 231 aa, chain - ## HITS:1 COG:aqpZ KEGG:ns NR:ns ## COG: aqpZ COG0580 # Protein_GI_number: 16128843 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) # Organism: Escherichia coli K12 # 1 231 1 231 231 353 100.0 1e-97 MFRKLAAECFGTFWLVFGGCGSAVLAAGFPELGIGFAGVALAFGLTVLTMAFAVGHISGG HFNPAVTIGLWAGGRFPAKEVVGYVIAQVVGGIVAAALLYLIASGKTGFDAAASGFASNG YGEHSPGGYSMLSALVVELVLSAGFLLVIHGATDKFAPAGFAPIAIGLALTLIHLISIPV TNTSVNPARSTAVAIFQGGWALEQLWFFWVVPIVGGIIGGLIYRTLLEKRD >gi|296494563|gb|ADTN01000175.1| GENE 2 1257 - 2915 1228 552 aa, chain + ## HITS:1 COG:ybjD KEGG:ns NR:ns ## COG: ybjD COG3593 # Protein_GI_number: 16128844 # Func_class: L Replication, recombination and repair # Function: Predicted ATP-dependent endonuclease of the OLD family # Organism: Escherichia coli K12 # 1 552 1 552 552 1072 100.0 0 MILERVEIVGFRGINRLSLMLEQNNVLIGENAWGKSSLLDALTLLLSPESDLYHFERDDF WFPPGDINGREHHLHIILTFRESLPGRHRVRRYRPLEACWTPCTDGYHRIFYRLEGESAE DGSVMTLRSFLDKDGHPIDVEDINDQARHLVRLMPVLRLRDARFMRRIRNGTVPNVPNVE VTARQLDFLARELSSHPQNLSDGQIRQGLSAMVQLLEHYFSEQGAGQARYRLMRRRASNE QRSWRYLDIINRMIDRPGGRSYRVILLGLFATLLQAKGTLRLDKDARPLLLIEDPETRLH PIMLSVAWHLLNLLPLQRIATTNSGELLSLTPVEHVCRLVRESSRVAAWRLGPSGLSTED SRRISFHIRFNRPSSLFARCWLLVEGETETWVINELARQCGHHFDAEGIKVIEFAQSGLK PLVKFARRMGIEWHVLVDGDEAGKKYAATVRSLLNNDREAEREHLTALPALDMEHFMYRQ GFSDVFHRMAQIPENVPMNLRKIISKAIHRSSKPDLAIEVAMEAGRRGVDSVPTLLKKMF SRVLWLARGRAD >gi|296494563|gb|ADTN01000175.1| 
GENE 3 2912 - 3862 637 316 aa, chain - ## HITS:1 COG:ybjX KEGG:ns NR:ns ## COG: ybjX COG2990 # Protein_GI_number: 16128845 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 316 15 330 330 605 100.0 1e-173 MSQLTERTFTPSESLSSLSLFLSLARGQCRPGKFWHRRSFRQKFLLRSLIMPRLSVEWMN ELSHWPNLNVLLTRQPRLPVRLHRPYLAANLSRKQLLEALRYHYALLRECMSAEEFSLYL NTPGLQLAKLEGKNGEQFTLELTMMISMDKEGDSTILFRNSEGIPLAEITFTLCEYQGKR TMFIGGLQGAKWEIPHQEIQNATKACHGLFPKRLVMEAACLFAQRLQVEQIIAVSNETHI YRSLRYRDKEGKIHADYNAFWESVGGVCDAERHYRLPAQIARKEIAEIASKKRAEYRRRY EMLDAIQPQMATMFRG >gi|296494563|gb|ADTN01000175.1| GENE 4 4019 - 5134 1304 371 aa, chain + ## HITS:1 COG:ybjY KEGG:ns NR:ns ## COG: ybjY COG0845 # Protein_GI_number: 16128846 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli K12 # 1 371 10 380 380 643 99.0 0 MKKRKTVKKRYVIALVIVIAGLITLWRILNAPVPTYQTLIVRPGDLQQSVLATGKLDALR KVDVGAQVSGQLKTLSVAIGDKVKKDQLLGVIDPEQAENQIKEVEATLMELRAQRQQAEA ELKLARVTYSRQQRLAQTKAVSQQDLDTAATEMAVKQAQIGTIDAQIKRNQASLDTAKTN LDYTRIVAPMAGEVTQITTLQGQTVIAAQQAPNILTLADMSTMLVKAQVSEADVIHLKPG QKAWFTVLGDPLTRYEGQIKDVLPTPEKVNDAIFYYARFEVPNPNGLLRLDMTAQVHIQL TDVKNVLTIPLSALGDPVGDNRYKVKLLRNGETREREVTIGARNDTDVEIVKGLEAGDEV VIGEAKPGAAQ >gi|296494563|gb|ADTN01000175.1| GENE 5 5131 - 7077 368 648 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 256 648 7 413 413 146 27 5e-35 MTPLLELKDIRRSYPAGDEQVEVLKGISLDIYAGEMVAIVGASGSGKSTLMNILGCLDKA TSGTYRVAGQDVATLDADALAQLRREHFGFIFQRYHLLSHLTAEQNVEVPAVYAGLERKQ RLLRAQELLQRLGLEDRTEYYPAQLSGGQQQRVSIARALMNGGQVILADEPTGALDSHSG EEVMAILHQLRDRGHTVIIVTHDPQVAAQAERVIEIRDGEIVRNPPAIEKVNVTGGTEPV VNTVSGWRQFVSGFNEALTMAWRALAANKMRTLLTMLGIIIGIASVVSIVVVGDAAKQMV LADIRSIGTNTIDVYPGKDFGDDDPQYQQALKYDDLIAIQKQPWVASATPAVSQNLRLRY NNVDVAASANGVSGDYFNVYGMTFSEGNTFNQEQLNGRAQVVVLDSNTRRQLFPHKADVV 
GEVILVGNMPARVIGVAEEKQSMFGSSKVLRVWLPYSTMSGRVMGQSWLNSITVRVKEGF DSAEAEQQLTRLLSLRHGKKDFFTWNMDGVLKTVEKTTRTLQLFLTLVAVISLVVGGIGV MNIMLVSVTERTREIGIRMAVGARASDVLQQFLIEAVLVCLVGGALGITLSLLIAFTLQL FLPGWEIGFSPLALLLAFLCSTVTGILFGWLPARNAARLDPVDALARE Prediction of potential genes in microbial genomes Time: Sun May 15 23:44:38 2011 Seq name: gi|296494562|gb|ADTN01000176.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont306.10, whole genome shotgun sequence Length of sequence - 27929 bp Number of predicted genes - 23, with homology - 22 Number of transcription units - 13, operones - 5 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 29 - 253 230 ## COG1278 Cold shock proteins - Prom 341 - 400 4.6 + Prom 56 - 115 2.2 2 2 Tu 1 . + CDS 164 - 406 90 ## + Prom 451 - 510 1.8 3 3 Op 1 19/0.000 + CDS 576 - 896 360 ## COG2127 Uncharacterized conserved protein 4 3 Op 2 . + CDS 927 - 3203 1361 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 + Term 3229 - 3268 8.4 - TRNA 3547 - 3634 68.6 # Ser GGA 0 0 - Term 3812 - 3851 5.5 5 4 Tu 1 . - CDS 3888 - 4106 257 ## PROTEIN SUPPORTED gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 - Prom 4142 - 4201 3.7 - Term 4166 - 4202 -1.0 6 5 Op 1 5/0.333 - CDS 4391 - 5095 448 ## COG2360 Leu/Phe-tRNA-protein transferase 7 5 Op 2 14/0.000 - CDS 5137 - 6858 1584 ## COG4987 ABC-type transport system involved in cytochrome bd biosynthesis, fused ATPase and permease components 8 5 Op 3 8/0.000 - CDS 6859 - 8625 170 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P - Prom 8661 - 8720 6.1 - Term 8677 - 8721 8.0 9 5 Op 4 . - CDS 8748 - 9713 733 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 - Prom 9742 - 9801 5.8 10 6 Tu 1 . 
+ CDS 10258 - 10752 548 ## COG1522 Transcriptional regulators + Term 10809 - 10852 9.2 + Prom 10776 - 10835 2.6 11 7 Op 1 10/0.000 + CDS 10887 - 14876 3874 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins 12 7 Op 2 8/0.000 + CDS 15035 - 15646 696 ## COG2834 Outer membrane lipoprotein-sorting protein 13 7 Op 3 8/0.000 + CDS 15657 - 17000 1198 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase + Term 17001 - 17032 -1.0 14 7 Op 4 2/0.833 + CDS 17091 - 18383 1557 ## COG0172 Seryl-tRNA synthetase + Term 18409 - 18434 -0.5 + Prom 18514 - 18573 4.6 15 8 Op 1 16/0.000 + CDS 18622 - 21066 2421 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing 16 8 Op 2 9/0.000 + CDS 21077 - 21694 648 ## COG0437 Fe-S-cluster-containing hydrogenase components 1 17 8 Op 3 . + CDS 21696 - 22559 869 ## COG3302 DMSO reductase anchor subunit + Term 22569 - 22599 2.1 - Term 22557 - 22586 1.1 18 9 Tu 1 . - CDS 22594 - 23220 605 ## COG1335 Amidases related to nicotinamidase - Prom 23426 - 23485 4.3 + Prom 23445 - 23504 3.8 19 10 Tu 1 . + CDS 23534 - 24682 1038 ## COG0477 Permeases of the major facilitator superfamily + Term 24693 - 24736 8.5 + Prom 24749 - 24808 9.1 20 11 Tu 1 . + CDS 24892 - 26322 1375 ## COG0531 Amino acid transporters + Term 26436 - 26467 -0.1 21 12 Op 1 9/0.000 - CDS 26323 - 26907 244 ## COG0583 Transcriptional regulator 22 12 Op 2 . - CDS 26908 - 27231 275 ## COG0583 Transcriptional regulator - Prom 27272 - 27331 2.6 + Prom 27247 - 27306 3.2 23 13 Tu 1 . 
+ CDS 27331 - 27906 368 ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B) Predicted protein(s) >gi|296494562|gb|ADTN01000176.1| GENE 1 29 - 253 230 74 aa, chain - ## HITS:1 COG:ECs0966 KEGG:ns NR:ns ## COG: ECs0966 COG1278 # Protein_GI_number: 15830220 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Escherichia coli O157:H7 # 1 74 1 74 74 145 100.0 2e-35 MEKGTVKWFNNAKGFGFICPEGGGEDIFAHYSTIQMDGYRTLKAGQSVQFDVHQGPKGNH ASVIVPVEVEAAVA >gi|296494562|gb|ADTN01000176.1| GENE 2 164 - 406 90 80 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSENIFAAAFRADETKPFGIVEPLNSTLFHASTSFANLIQVRWNKPGSERGLFKTSPTLE IQFRELGRAVKHLTGDKGQV >gi|296494562|gb|ADTN01000176.1| GENE 3 576 - 896 360 106 aa, chain + ## HITS:1 COG:ECs0967 KEGG:ns NR:ns ## COG: ECs0967 COG2127 # Protein_GI_number: 15830221 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 106 1 106 106 209 100.0 8e-55 MGKTNDWLDFDQLAEEKVRDALKPPSMYKVILVNDDYTPMEFVIDVLQKFFSYDVERATQ LMLAVHYQGKAICGVFTAEVAETKVAMVNKYARENEHPLLCTLEKA >gi|296494562|gb|ADTN01000176.1| GENE 4 927 - 3203 1361 758 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 10 741 14 806 815 528 38 1e-149 MLNQELELSLNMAFARAREHRHEFMTVEHLLLALLSNPSAREALEACSVDLVALRQELEA FIEQTTPVLPASEEERDTQPTLSFQRVLQRAVFHVQSSGRNEVTGANVLVAIFSEQESQA AYLLRKHEVSRLDVVNFISHGTRKDEPTQSSDPGSQPNSEEQAGGEERMENFTTNLNQLA RVGGIDPLIGREKELERAIQVLCRRRKNNPLLVGESGVGKTAIAEGLAWRIVQGDVPEVM ADCTIYSLDIGSLLAGTKYRGDFEKRFKALLKQLEQDTNSILFIDEIHTIIGAGAASGGQ VDAANLIKPLLSSGKIRVIGSTTYQEFSNIFEKDRALARRFQKIDITEPSIEETVQIING LKPKYEAHHDVRYTAKAVRAAVELAVKYINDRHLPDKAIDVIDEAGARARLMPVSKRKKT VNVADIESVVARIARIPEKSVSQSDRDTLKNLGDRLKMLVFGQDKAIEALTEAIKMARAG LGHEHKPVGSFLFAGPTGVGKTEVTVQLSKALGIELLRFDMSEYMERHTVSRLIGAPPGY VGFDQGGLLTDAVIKHPHAVLLLDEIEKAHPDVFNILLQVMDNGTLTDNNGRKADFRNVV LVMTTNAGVRETERKSIGLIHQDNSTDAMEEIKKIFTPEFRNRLDNIIWFDHLSTDVIHQ 
VVDKFIVELQVQLDQKGVSLEVSQEARNWLAEKGYDRAMGARPMARVIQDNLKKPLANEL LFGSLVDGGQVTVALDKEKNELTYGFQSAQKHKAEAAH >gi|296494562|gb|ADTN01000176.1| GENE 5 3888 - 4106 257 72 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 [Streptococcus pneumoniae TIGR4] # 1 70 1 70 72 103 65 1e-21 MAKEDNIEMQGTVLETLPNTMFRVELENGHVVTAHISGKMRKNYIRILTGDKVTVELTPY DLSKGRIVFRSR >gi|296494562|gb|ADTN01000176.1| GENE 6 4391 - 5095 448 234 aa, chain - ## HITS:1 COG:aat KEGG:ns NR:ns ## COG: aat COG2360 # Protein_GI_number: 16128852 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Leu/Phe-tRNA-protein transferase # Organism: Escherichia coli K12 # 1 234 1 234 234 493 100.0 1e-139 MRLVQLSRHSIAFPSPEGALREPNGLLALGGDLSPARLLMAYQRGIFPWFSPGDPILWWS PDPRAVLWPESLHISRSMKRFHKRSPYRVTMNYAFGQVIEGCASDREEGTWITRGVVEAY HRLHELGHAHSIEVWREDELVGGMYGVAQGTLFCGESMFSRMENASKTALLVFCEEFIGH GGKLIDCQVLNDHTASLGACEIPRRDYLNYLNQMRLGRLPNNFWVPRCLFSPQE >gi|296494562|gb|ADTN01000176.1| GENE 7 5137 - 6858 1584 573 aa, chain - ## HITS:1 COG:cydC KEGG:ns NR:ns ## COG: cydC COG4987 # Protein_GI_number: 16128853 # Func_class: C Energy production and conversion; O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome bd biosynthesis, fused ATPase and permease components # Organism: Escherichia coli K12 # 1 573 1 573 573 1049 99.0 0 MHALLPYLALYKRHKWMLSLGIVLAIVTLLASIGLLTLSGWFLSASAVAGVAGLYSFNYM LPAAGVRGAAITRTAGRYFERLVSHDATFRVLQHLRIYTFSKLLPLSPAGLARYRQGELL NRVVSDVDTLDHLYLRVISPLVGAFVVIMVVTIGLSFLDFTLAFTLGGIMLLTLFLMPPL FYRAGKSTGQNLTHLRGQYRQQLTAWLQGQAELTIFGASDRYRTQLENTEIQWLEAQRRQ SELTALSQAIMLLIGALAVILMLWMASGGVGGNAQPGALIALFVFCALAAFEALAPVTGA FQHLGQVIASAVRISDLTDQKPEVTFPDTQTRVADRVSLTLRDVQFTYPEQSQQALKGIS LQVNAGEHIAILGRTGCGKSTLLQLLTRAWDPQQGEILLNDSPIASLNEAALRQTISVVP QRVHLFSATLRDNLLLASPGSSDEALSEILRRVGLEKLLEDAGLNSWLGEGGRQLSGGEL RRLAIARALLHDAPLVLLDEPTEGLDATTESQILELLAEMMREKTVLMVTHRLRGLSRFQ 
QIIVMDNGQIIEQGTHAELLARQGRYYQFKQGL >gi|296494562|gb|ADTN01000176.1| GENE 8 6859 - 8625 170 588 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 369 570 150 351 398 70 29 1e-11 MNKSRQKELTRWLKQQSVISQRWLNISRLLGFVSGILIIAQAWFMARILQHMIMENIPRE ALLLPFTLLVLTFVLRAWVVWLRERVGYHAGQHIRFAIRRQVLDRLQQAGPAWIQGKPAG SWATLVLEQIDDMHDYYARYLPQMALAVSVPLLIVVAIFPSNWAAALILLGTAPLIPLFM ALVGMGAADANRRNFLALARLSGHFLDRLRGMETLRIFGRGEAEIESIRSASEDFRQRTM EVLRLAFLSSGILEFFTSLSIALVAVYFGFSYLGELDFGHYDTGVTLAAGFLALILAPEF FQPLRDLGTFYHAKAQAVGAADSLKTFMETPLAHPQRGEAELASTDPVTIEAEELFITSP EGKTLAGPLNFTLPAGQRAVLVGRSGSGKSSLLNALSGFLSYQGSLRINGIELRDLSPES WRKHLSWVGQNPQLPAATLRDNVLLARPDASEQELQAALDNAWVSEFLPLLPQGVDTPVG DQAARLSVGQAQRVAVARALLNPCSLLLLDEPAASLDAHSEQRVMEALNAASLRQTTLMV THQLEDLADWDVIWVMQDGRIIEQGRYAELSVAGGPFATLLAHRQEEI >gi|296494562|gb|ADTN01000176.1| GENE 9 8748 - 9713 733 321 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 10 317 5 306 306 286 49 8e-77 MGTTKHSKLLILGSGPAGYTAAVYAARANLQPVLITGMEKGGQLTTTTEVENWPGDPNDL TGPLLMERMHEHATKFETEIIFDHINKVDLQNRPFRLNGDNGEYTCDALIIATGASARYL GLPSEEAFKGRGVSACATCDGFFYRNQKVAVIGGGNTAVEEALYLSNIASEVHLIHRRDG FRAEKILIKRLMDKVENGNIILHTNRTLEEVTGDQMGVTGVRLRDTQNSDNIESLDVAGL FVAIGHSPNTAIFEGQLELENGYIKVQSGIHGNATQTSIPGVFAAGDVMDHIYRQAITSA GTGCMAALDAERYLDGLADAK >gi|296494562|gb|ADTN01000176.1| GENE 10 10258 - 10752 548 164 aa, chain + ## HITS:1 COG:ECs0974 KEGG:ns NR:ns ## COG: ECs0974 COG1522 # Protein_GI_number: 15830228 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 164 1 164 164 312 100.0 2e-85 MVDSKKRPGKDLDRIDRNILNELQKDGRISNVELSKRVGLSPTPCLERVRRLERQGFIQG YTALLNPHYLDASLLVFVEITLNRGAPDVFEQFNTAVQKLEEIQECHLVSGDFDYLLKTR VPDMSAYRKLLGETLLRLPGVNDTRTYVVMEEVKQSNRLVIKTR >gi|296494562|gb|ADTN01000176.1| GENE 11 10887 - 14876 3874 1329 aa, chain + ## HITS:1 COG:ftsK KEGG:ns 
NR:ns ## COG: ftsK COG1674 # Protein_GI_number: 16128857 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Escherichia coli K12 # 1 1329 1 1329 1329 2115 99.0 0 MSQEYTEDKEVTLTKLSSGRRLLEALLILIVLFAVWLMAALLSFNPSDPSWSQTAWHEPI HNLGGMPGAWLADTLFFIFGVMAYTIPVIIVGGCWFAWRHQSSDEYIDYFAVSLRIIGVL ALILTSCGLAAINADDIWYFASGGVIGSLLSTTLQPLLHSSGGTIALLCVWAAGLTLFTG WSWVTIAEKLGGWILNILTFASNRTRRDDTWVDEDEYEDDEEYEDENHGKQHESRRARIL RGALARRKRLAEKFINPMGRQTDAALFSGKRMDDDEEITYTARGVAADPDDVLFSGNRAT QPEYDEYDPLLNGAPITEPVAVAAAATTATQSWAAPVEPVTQTPPVASVDVPPAQPTVAW QPVPGPQTGEPVIAPAPEGYPQQSQYAQPAVQYNEPLQQPVQPQQPYYAPAAEQPAQQPY YAPAPEQPVAGNAWQAEEQQSTFAPQSTYQTEQTYQQPAAQEPLYQQPQPVEQQPVVEPE PVVEETKPARPPLYYFEEVEEKRAREREQLAAWYQPIPEPVKEPEPIKSSLKAPSVAAVP PVEAAAAVSPLASGVKKATLATGAAATVAAPVFSLANSGGPRPQVKEGIGPQLPRPKRIR VPTRRELASYGIKLPSQRAAEEKAREAQRNQYDSGDQYNDDEIDAMQQDELARQFAQTQQ QRYGEQYQHDVPVNAEDADAAAEAELARQFAQTQQQRYSGEQPAGANPFSLDDFEFSPMK ALLDDGPHEPLFTPIVEPVQQPQQPVAPQQQYQQPQQPVPPQPQYQQPQQPVAPQPQYQQ PQQPVAPQQQYQQPQQPVAPQQQYQQPQQPVAPQPQDTLLHPLLMRNGDSRPLHKPTTPL PSLDLLTPPPSEVEPVDTFALEQMARLVEARLADFRIKADVVNYSPGPVITRFELNLAPG VKAARISNLSRDLARSLSTVAVRVVEVIPGKPYVGLELPNKKRQTVYLREVLDNAKFRDN PSPLTVVLGKDIAGEPVVADLAKMPHLLVAGTTGSGKSVGVNAMILSMLYKAQPEDVRFI MIDPKMLELSVYEGIPHLLTEVVTDMKDAANALRWCVNEMERRYKLMSALGVRNLAGYNE KIAEADRMMRPIPDPYWKPGDSMDAQHPVLKKEPYIVVLVDEFADLMMTVGKKVEELIAR LAQKARAAGIHLVLATQRPSVDVITGLIKANIPTRIAFTVSSKIDSRTILDQAGAESLLG MGDMLYSGPNSTLPVRVHGAFVRDQEVHAVVQDWKARGRPQYVDGITSDSESEGGAGGFD GAEELDPLFDQAVQFVTEKRKASISGVQRQFRIGYNRAARIIEQMEAQGIVSEQGHNGNR EVLAPPPFD >gi|296494562|gb|ADTN01000176.1| GENE 12 15035 - 15646 696 203 aa, chain + ## HITS:1 COG:ECs0976 KEGG:ns NR:ns ## COG: ECs0976 COG2834 # Protein_GI_number: 15830230 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane lipoprotein-sorting protein # Organism: Escherichia coli O157:H7 # 1 203 2 204 204 378 100.0 1e-105 
MKKIAITCALLSSLVASSVWADAASDLKSRLDKVSSFHASFTQKVTDGSGAAVQEGQGDL WVKRPNLFNWHMTQPDESILVSDGKTLWFYNPFVEQATATWLKDATGNTPFMLIARNQSS DWQQYNIKQNGDDFVLTPKASNGNLKQFTINVGRDGTIHQFSAVEQDDQRSSYQLKSQQN GAVDAAKFTFTPPQGVTVDDQRK >gi|296494562|gb|ADTN01000176.1| GENE 13 15657 - 17000 1198 447 aa, chain + ## HITS:1 COG:ECs0977 KEGG:ns NR:ns ## COG: ECs0977 COG2256 # Protein_GI_number: 15830231 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Escherichia coli O157:H7 # 1 447 1 447 447 872 100.0 0 MSNLSLDFSDNTFQPLAARMRPENLAQYIGQQHLLAAGKPLPRAIEAGHLHSMILWGPPG TGKTTLAEVIARYANADVERISAVTSGVKEIREAIERARQNRNAGRRTILFVDEVHRFNK SQQDAFLPHIEDGTITFIGATTENPSFELNSALLSRARVYLLKSLSTEDIEQVLTQAMED KTRGYGGQDIVLPDETRRAIAELVNGDARRALNTLEMMADMAEVDDSGKRVLKPELLTEI AGERSARFDNKGDRFYDLISALHKSVRGSAPDAALYWYARIITAGGDPLYVARRCLAIAS EDVGNADPRAMQVAIAAWDCFTRVGPAEGERAIAQAIVYLACAPKSNAVYTAFKAALADA RERPDYDVPVHLRNAPTKLMKEMGYGQEYRYAHDEANAYAAGEVYFPPEIAQTRYYFPTN RGLEGKIGEKLAWLAEQDQNSPIKRYR >gi|296494562|gb|ADTN01000176.1| GENE 14 17091 - 18383 1557 430 aa, chain + ## HITS:1 COG:ECs0978 KEGG:ns NR:ns ## COG: ECs0978 COG0172 # Protein_GI_number: 15830232 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Seryl-tRNA synthetase # Organism: Escherichia coli O157:H7 # 1 430 1 430 430 865 100.0 0 MLDPNLLRNEPDAVAEKLARRGFKLDVDKLGALEERRKVLQVKTENLQAERNSRSKSIGQ AKARGEDIEPLRLEVNKLGEELDAAKAELDALQAEIRDIALTIPNLPADEVPVGKDENDN VEVSRWGTPREFDFEVRDHVTLGEMHSGLDFAAAVKLTGSRFVVMKGQIARMHRALSQFM LDLHTEQHGYSENYVPYLVNQDTLYGTGQLPKFAGDLFHTRPLEEEADTSNYALIPTAEV PLTNLVRGEIIDEDDLPIKMTAHTPCFRSEAGSYGRDTRGLIRMHQFDKVEMVQIVRPED SMAALEEMTGHAEKVLQLLGLPYRKIILCTGDMGFGACKTYDLEVWIPAQNTYREISSCS NVWDFQARRMQARCRSKSDKKTRLVHTLNGSGLAVGRTLVAVMENYQQADGRIEVPEVLR PYMNGLEYIG >gi|296494562|gb|ADTN01000176.1| GENE 15 18622 - 21066 2421 814 aa, chain + ## HITS:1 COG:dmsA KEGG:ns NR:ns ## COG: dmsA COG0243 # Protein_GI_number: 16128861 # Func_class: C Energy production 
and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 30 814 1 785 785 1635 100.0 0 MKTKIPDAVLAAEVSRRGLVKTTAIGGLAMASSALTLPFSRIAHAVDSAIPTKSDEKVIW SACTVNCGSRCPLRMHVVDGEIKYVETDNTGDDNYDGLHQVRACLRGRSMRRRVYNPDRL KYPMKRVGARGEGKFERISWEEAYDIIATNMQRLIKEYGNESIYLNYGTGTLGGTMTRSW PPGNTLVARLMNCCGGYLNHYGDYSSAQIAEGLNYTYGGWADGNSPSDIENSKLVVLFGN NPGETRMSGGGVTYYLEQARQKSNARMIIIDPRYTDTGAGREDEWIPIRPGTDAALVNGL AYVMITENLVDQAFLDKYCVGYDEKTLPASAPKNGHYKAYILGEGPDGVAKTPEWASQIT GVPADKIIKLAREIGSTKPAFISQGWGPQRHANGEIATRAISMLAILTGNVGINGGNSGA REGSYSLPFVRMPTLENPIQTSISMFMWTDAIERGPEMTALRDGVRGKDKLDVPIKMIWN YAGNCLINQHSEINRTHEILQDDKKCELIVVIDCHMTSSAKYADILLPDCTASEQMDFAL DASCGNMSYVIFNDQVIKPRFECKTIYEMTSELAKRLGVEQQFTEGRTQEEWMRHLYAQS REAIPELPTFEEFRKQGIFKKRDPQGHHVAYKAFREDPQANPLTTPSGKIEIYSQALADI AATWELPEGDVIDPLPIYTPGFESYQDPLNKQYPLQLTGFHYKSRVHSTYGNVDVLKAAC RQEMWINPLDAQKRGIHNGDKVRIFNDRGEVHIEAKVTPRMMPGVVALGEGAWYDPDAKR VDKGGCINVLTTQRPSPLAKGNPSHTNLVQVEKV >gi|296494562|gb|ADTN01000176.1| GENE 16 21077 - 21694 648 205 aa, chain + ## HITS:1 COG:dmsB KEGG:ns NR:ns ## COG: dmsB COG0437 # Protein_GI_number: 16128862 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli K12 # 1 205 1 205 205 414 100.0 1e-116 MTTQYGFFIDSSRCTGCKTCELACKDYKDLTPEVSFRRIYEYAGGDWQEDNGVWHQNVFA YYLSISCNHCEDPACTKVCPSGAMHKREDGFVVVDEDVCIGCRYCHMACPYGAPQYNETK GHMTKCDGCYDRVAEGKKPICVESCPLRALDFGPIDELRKKHGDLAAVAPLPRAHFTKPN IVIKPNANSRPTGDTTGYLANPKEV >gi|296494562|gb|ADTN01000176.1| GENE 17 21696 - 22559 869 287 aa, chain + ## HITS:1 COG:dmsC KEGG:ns NR:ns ## COG: dmsC COG3302 # Protein_GI_number: 16128863 # Func_class: R General function prediction only # Function: DMSO reductase anchor subunit # Organism: Escherichia coli K12 # 1 287 1 287 287 437 100.0 1e-122 MGSGWHEWPLMIFTVFGQCVAGGFIVLALALLKGDLRAEAQQRVIACMFGLWVLMGIGFI ASMLHLGSPMRAFNSLNRVGASALSNEIASGSIFFAVGGIGWLLAMLKKLSPALRTLWLI 
VTMVLGVIFVWMMVRVYNSIDTVPTWYSIWTPMGFFLTMFMGGPLLGYLLLSLAGVDGWA MRLLPAISVLALVVSGVVSVMQGAELATIHSSVQQAAALVPDYGALMSWRIVLLAVALCL WIAPQLKGYQPAVPLLSVSFILLLAGELIGRGVFYGLHMTVGMAVAS >gi|296494562|gb|ADTN01000176.1| GENE 18 22594 - 23220 605 208 aa, chain - ## HITS:1 COG:ycaC KEGG:ns NR:ns ## COG: ycaC COG1335 # Protein_GI_number: 16128864 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Escherichia coli K12 # 1 208 1 208 208 429 100.0 1e-120 MTKPYVRLDKNDAAVLLVDHQAGLLSLVRDIEPDKFKNNVLALGDLAKYFNLPTILTTSF ETGPNGPLVPELKAQFPDTPYIARPGNINAWDNEDFVKAVKATGKKQLIIAGVVTEVCVA FPALSAIEEGFDVFVVTDASGTFNEITRHSAWDRLSQAGAQLMTWFGVACELHRDWRNDI EGLATLFSNHIPDYRNLMTSYDTLTKQK >gi|296494562|gb|ADTN01000176.1| GENE 19 23534 - 24682 1038 382 aa, chain + ## HITS:1 COG:ycaD KEGG:ns NR:ns ## COG: ycaD COG0477 # Protein_GI_number: 16128865 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 382 1 382 382 639 100.0 0 MSTYTQPVMLLLSGLLLLTLAIAVLNTLVPLWLAQEHMSTWQVGVVSSSYFTGNLVGTLL TGYVIKRIGFNRSYYLASFIFAAGCAGLGLMIGFWSWLAWRFVAGVGCAMIWVVVESALM CSGTSRNRGRLLAAYMMVYYVGTFLGQLLVSKVSTELMSVLPWVTGLTLAGILPLLFTRV LNQQAENHDSTSITSMLKLRQARLGVNGCIISGIVLGSLYGLMPLYLNHKGVSNASIGFW MAVLVSAGILGQWPIGRLADKFGRLLVLRVQVFVVILGSIAMLSQAAMAPALFILGAAGF TLYPVAMAWACEKVEHHQLVAMNQALLLSYTVGSLLGPSFTAMLMQNFSDNLLFIMIASV SFIYLLMLLRNAGHTPKPVAHV >gi|296494562|gb|ADTN01000176.1| GENE 20 24892 - 26322 1375 476 aa, chain + ## HITS:1 COG:ycaM KEGG:ns NR:ns ## COG: ycaM COG0531 # Protein_GI_number: 16128866 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 476 65 540 540 887 100.0 0 MAGNVQEKQLRWYNIALMSFITVWGFGNVVNNYANQGLVVVFSWVFIFALYFTPYALIVG 
QLGSTFKDGKGGVSTWIKHTMGPGLAYLAAWTYWVVHIPYLAQKPQAILIALGWAMKGDG SLIKEYSVVALQGLTLVLFIFFMWVASRGMKSLKIVGSVAGIAMFVMSLLYVAMAVTAPA ITEVHIATTNITWETFIPHIDFTYITTISMLVFAVGGAEKISPYVNQTRNPGKEFPKGML CLAVMVAVCAILGSLAMGMMFDSRNIPDDLMTNGQYYAFQKLGEYYNMGNTLMVIYAIAN TLGQVAALVFSIDAPLKVLLGDADSKYIPASLCRTNASGTPVNGYFLTLVLVAILIMLPT LGIGDMNNLYKWLLNLNSVVMPLRYLWVFVAFIAVVRLAQKYKPEYVFIRNKPLAMTVGI WCFAFTAFACLTGIFPKMEAFTAEWTFQLALNVATPFVLVGLGLIFPLLARKANSK >gi|296494562|gb|ADTN01000176.1| GENE 21 26323 - 26907 244 194 aa, chain - ## HITS:1 COG:ycaN KEGG:ns NR:ns ## COG: ycaN COG0583 # Protein_GI_number: 16128867 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 194 109 302 302 422 100.0 1e-118 MSLLVGFTREYPDIKVELTTDDSLVDIVQQGFDAGVRLSCIVEKDMISVAIGPPVKLCVA ATPEYFARYGKPRHPHDLLNHQCVVFRYPSGKPFHWQFAKELEIAVAGNIILDDVDAELE AVLMGAGIGYLLYEQIKEYLDTGRLECVLEDWSTERPGFQIYYPNRQYMSCGLRAFLDYV KTGQICQSQRHRPQ >gi|296494562|gb|ADTN01000176.1| GENE 22 26908 - 27231 275 107 aa, chain - ## HITS:1 COG:ycaN KEGG:ns NR:ns ## COG: ycaN COG0583 # Protein_GI_number: 16128867 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 107 1 107 302 213 100.0 9e-56 MRMNMSDFATFFAVARNQSFRAAGDELGLSSSAISHSIKTLEQRLKIRLFNRTTRSVSLT EAGSNLYERLRPAFDEIQIMLDEMNDFRLTPTGTLKINAARVAARIF >gi|296494562|gb|ADTN01000176.1| GENE 23 27331 - 27906 368 191 aa, chain + ## HITS:1 COG:ycaK KEGG:ns NR:ns ## COG: ycaK COG2249 # Protein_GI_number: 16128868 # Func_class: R General function prediction only # Function: Putative NADPH-quinone reductase (modulator of drug activity B) # Organism: Escherichia coli K12 # 1 190 1 190 196 392 100.0 1e-109 MQSERIYLVWAHPRHDSLTAHIADAIHQRAMERKIQVTELDLYRRNFNPVMTPEDEPDWK NMDKRYSPEVHQLYSELLEHDTLVVVFPLWWYSFPAMLKGYIDRVWNNGLAYGDGHKLPF NKVRWVALVGGDKESFVQMGWEKNISDYLKNMCSYLGIEDADVTFLCNTVVFDGEELHAS YYQSLLSQVRG Prediction of potential genes in microbial genomes Time: Sun May 15 23:44:47 2011 Seq name: gi|296494561|gb|ADTN01000177.1| 
Escherichia coli MS 146-1 E_coli146-1-1.0_Cont322.1, whole genome shotgun sequence
Length of sequence - 11489 bp
Number of predicted genes - 12, with homology - 11
Number of transcription units - 5, operones - 4 average op.length - 2.8
N Tu/Op Conserved S Start End Score pairs(N/Pv)
+ Prom 70 - 129 3.2
1 1 Op 1 . + CDS 226 - 1938 664 ## COG3505 Type IV secretory pathway, VirD4 components
2 1 Op 2 . + CDS 1985 - 2515 78 ## EcSMS35_A0025 hypothetical protein
3 1 Op 3 . + CDS 2508 - 4151 782 ## ECED1_3504 putative type IV pilus outer membrane secretin lipoprotein
+ Term 4165 - 4210 3.1
4 2 Op 1 . + CDS 4580 - 5512 447 ## ECED1_3505 putative pilin accessory protein PilO
5 2 Op 2 . + CDS 5496 - 5990 73 ## ECED1_3506 pilus type IV assembly protein
6 2 Op 3 . + CDS 6015 - 7553 525 ## COG2804 Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB
+ Term 7636 - 7682 4.8
7 3 Tu 1 . - CDS 7495 - 7686 87 ##
- Prom 7823 - 7882 6.6
+ Prom 7810 - 7869 5.1
8 4 Op 1 . + CDS 7955 - 8653 430 ## ECED1_3508 membrane protein
9 4 Op 2 . + CDS 8698 - 9255 375 ## ECED1_3509 major pillin subunit
+ Term 9272 - 9307 5.1
10 5 Op 1 . + CDS 9321 - 9803 248 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains)
11 5 Op 2 . + CDS 9807 - 10442 162 ## ECED1_3511 prepilin peptidase
12 5 Op 3 .
+ CDS 10455 - 11487 460 ## ECED1_3512 minor pilin subunit Predicted protein(s) >gi|296494561|gb|ADTN01000177.1| GENE 1 226 - 1938 664 570 aa, chain + ## HITS:1 COG:XFa0016 KEGG:ns NR:ns ## COG: XFa0016 COG3505 # Protein_GI_number: 10956727 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Xylella fastidiosa 9a5c # 11 509 78 533 656 290 35.0 4e-78 MLIVVIGIMPKKVIYGDARLATDMDLSKSGFFPDKKSPYKHPPILIGKMFKGRYKKQFIY FAGQQFLILYAPTRSGKGVGIVIPNCVNYPGSMVILDIKLENWFLSAGFRQKELGQECFL FAPAGYAETIDQAIKGQIRSHRWNPLDCVSRSDLLRETDLAKIAAILIPASDDPIWSDSA RNLFVGLGLYLLDKERFHLEQKAKGHNVPDVLVSISAILKTSVPDGGKDLAAWMGQEIEN RSWISDKTKSFFFKFMSAPDRTRGSIETNFSSPLSIFSNPITAEATNFSDFDIRDIRKKP MSIYLGLTPDALITHEKIVNLFFSLLVNENCRELPEHNPDLKYQCLILLDEFTSMGKSEV IERAVGFTAGYNLRFMFILQNEGQGQKSDMYGQEGWTTFTENSAVVLYYPPKSKNALAKK ISEEIGVRDMKISKRSISSGGGKGGSSRTRNDDVIERPVLLPEEIVSLRDKKNKARNIAI REIITSEFSRPFIANKIIWFEEPEFKRRVDIARNNHVDIPNLFTQEVMDEIAKIAEIYLP KAGGKKVMVAGGNVITNPDLDNHDKTDVSE >gi|296494561|gb|ADTN01000177.1| GENE 2 1985 - 2515 78 176 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_A0025 NR:ns ## KEGG: EcSMS35_A0025 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 17 168 1 141 153 102 43.0 5e-21 MRIKHSNMINMNMNMNMMKEAIDVLINSEKHILIIGETGTGKSTLLAELLINDTDVKYFD FCDIYKRHEHLPLLKHHYLCDENFDYFDFLSAPEKTLVLNSVAIGNNLRESKVVQFIVRF IKTARKRGKRLIVVTTPSDGGVQIQNLFDTVISLTRSHDEIGCTVIIPVPELIKYE >gi|296494561|gb|ADTN01000177.1| GENE 3 2508 - 4151 782 547 aa, chain + ## HITS:1 COG:no KEGG:ECED1_3504 NR:ns ## KEGG: ECED1_3504 # Name: not_defined # Def: putative type IV pilus outer membrane secretin lipoprotein # Organism: E.coli_ED1a # Pathway: not_defined # 3 535 2 533 533 643 68.0 0 MNKKLLCLLISSFLLSGCANFKKINANELAAERDAQKATEYAKKIRNNPVIQDSTKQWIN TTPIIEKQVKQSAPPCPIVINTRSDIELRQIAQRITQTCHIPVRISADVWTYLNGTGTGS TQQMTGSIPAPDANGMVPLASVGSQPVKVSSGGTSLPELNISGLPALLQTVSSRLGISWK 
YDKGRITFFYLETRSFPITYMDSNVAYNSKVVSGTMSSSGSTGSTSSGGMTGDASNTQTT TVEMKSSLYNDLKSEVSSMLTPGTGRMYLSTGSLTVTDTPDVLDSVQEIVNKRNNEMSRQ VVLNVEILSIKKSSQEQAGIDWNAVFNDGHLGLSLGGSFGNAAENVITSGVSIVDGKLAG SKAFLKALSSQGSVSVVTQNSAVTKNLTPVPMQIANQQGFIESVTTDSTANVGSSTSLNA ATITTGFNMTLLPYIQPDSQNLQLLFSMSLSDKPTFEVFESGGSKAQTPNVDLKTINQTI DLKSGQTVILSGFQQSNRKTSKQGVITPSFFGLGGGINSEDGDTILVVLITPNVLNSGSS AVIKNVN >gi|296494561|gb|ADTN01000177.1| GENE 4 4580 - 5512 447 310 aa, chain + ## HITS:1 COG:no KEGG:ECED1_3505 NR:ns ## KEGG: ECED1_3505 # Name: not_defined # Def: putative pilin accessory protein PilO # Organism: E.coli_ED1a # Pathway: not_defined # 3 308 139 435 435 223 41.0 6e-57 MSALTTFLEFNETPEPGWKLYQPESWDISQALPSLTLSALIDVKKPPKEAAFTRVSRKRQ FMIYGGSAILAILLWNGITMYQEYREKEAAAEAARLRLAKEMADKQAIQIAPPWQHLPEI KPFIDKCIDKWDALPLSIAGWRFDLAECSTSGNDGLLRTSYKELSGVTVEDFSTRIREIF QGTTTATFVLPEGSAGGFSLPVSFDVSPDPITPDTLPQATDIQERLTTFAQKMRLKLTWQ EIENTKTDEEGRPIILPWNEYELMIQTSTPPSILFANFHEPAVRFQYAGIKLEEGRLNYE IKGAFYVKNN >gi|296494561|gb|ADTN01000177.1| GENE 5 5496 - 5990 73 164 aa, chain + ## HITS:1 COG:no KEGG:ECED1_3506 NR:ns ## KEGG: ECED1_3506 # Name: pilP # Def: pilus type IV assembly protein # Organism: E.coli_ED1a # Pathway: not_defined # 40 162 33 151 153 77 39.0 1e-13 MLKIIKPTIISVCFFSSLAYATESESTPKDEIKYSSPTILQVEQIQAETVIYEAKLARQK AVSELRNLDLNGVASATPAITPSSGYSSLADTDIPAQRVAAKSLRVIEIFGTPQQMSARI SLLDGSVTDVKVGQSIPGTSYKVASITGNGVWIQNDTEKRQLIL >gi|296494561|gb|ADTN01000177.1| GENE 6 6015 - 7553 525 512 aa, chain + ## HITS:1 COG:aq_1474 KEGG:ns NR:ns ## COG: aq_1474 COG2804 # Protein_GI_number: 15606637 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB # Organism: Aquifex aeolicus # 94 456 95 428 469 157 34.0 3e-38 MFFSDADYPGKDISDVVYVSISEKDDKPVHVYIADNSKGQRAVQAYIGTLNVKYPGKVNI TWTTLDVIASRYQDSERRSDSNSSQHLNSNQEKVISYLAKANNLGSSDLHITPGRDGSEF TYVEARVHGELEILDVIPRKEGLELLGAAYSGMSDVIKGTQFDPAIPQDARIAENFLKPV 
NLFGARYSHYPCVGGVYAVFRLIKDDSEDIPTFEELGYMPQQIQTIRRMLQRPEGIIVLS GPTGSGKSTTLRTASEAYLSTFGFNHNDNMRLPRKRLFTIESPPEGRIPGAIQTAVRDSV DGWVDAIKSAMRLDPDAILNGEIRDHASALAAIKASMTGHLLLTTLHANDALNIIERLEM ENIQARLIADPLLFIGLLSQRLVQKLCPYCKRTYAEMADKLSTEERKLIEENCLPEQVRL RNLDGCEHCYKGVTGRTVIAEVVSPDARFFQLYRQSGRLEAKSYWHYELGGITRNQHLLH LINHGLVDPLSAHFISPIDEDKYSMLPEGTWR >gi|296494561|gb|ADTN01000177.1| GENE 7 7495 - 7686 87 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPSFNIPDKSSYKLILEYLNVYFASQWDIPEKSISIELSFSNISFIAMSPPGALSIYLRQ WGI >gi|296494561|gb|ADTN01000177.1| GENE 8 7955 - 8653 430 232 aa, chain + ## HITS:1 COG:no KEGG:ECED1_3508 NR:ns ## KEGG: ECED1_3508 # Name: not_defined # Def: membrane protein # Organism: E.coli_ED1a # Pathway: not_defined # 13 232 169 388 388 239 50.0 5e-62 MSMIYPSVLVFGCLANMYMVHDTITPVFLSLAPKERWTSSMRLLTDTSGFLIENGTYILC FCIVLFFLIRYSLARYTGPGRQYLDYIPPWSLYKTFNGVSFLFNMASMLSIDIPVAKALE RLEKNSTQNRWLLERIKEIRRYVLLGQSLAMAMKSCGYDFPSKLCINKLLLHSENADNAF MMSNYADKWLEEAKGKVKSTGVWITVVSALIVFAFIVMMITALYDVSNVLQH >gi|296494561|gb|ADTN01000177.1| GENE 9 8698 - 9255 375 185 aa, chain + ## HITS:1 COG:no KEGG:ECED1_3509 NR:ns ## KEGG: ECED1_3509 # Name: pilS # Def: major pillin subunit # Organism: E.coli_ED1a # Pathway: not_defined # 7 185 1 177 177 125 43.0 6e-28 MSSINILNMRSVFSSLSARRKKEQDKGATLMEVLLVVGVIVVLAASAYKLYSMVQSNIQS SNEQNNVLTVIANMKSLKFQGRYTDSNYIKTLYAQGLLPSDMIADTTGASAKNPWGGSVT ITTSSDKYSFNVVEANVPQKNCMAMVNALRSSSAISKINNTSTSTVSAATVCASDSNTLT FSTDS >gi|296494561|gb|ADTN01000177.1| GENE 10 9321 - 9803 248 160 aa, chain + ## HITS:1 COG:Z4175 KEGG:ns NR:ns ## COG: Z4175 COG0741 # Protein_GI_number: 15803375 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Escherichia coli O157:H7 EDL933 # 16 157 11 153 167 150 51.0 1e-36 MRSVLYILLSCISSLIFSLQPAYSATTTRFDACFAQAGQRYLIDPLLLKSIATVESSLIP DRINHNKEKGKILSTDYGLMQINSSHIPNLKKLGVIKDKSELIDNPCLNIQIGAWILATH 
FQKCGINWSCLGSYNAGFKESNEQKRIKYARYVYNKYMVR >gi|296494561|gb|ADTN01000177.1| GENE 11 9807 - 10442 162 211 aa, chain + ## HITS:1 COG:no KEGG:ECED1_3511 NR:ns ## KEGG: ECED1_3511 # Name: pilU # Def: prepilin peptidase # Organism: E.coli_ED1a # Pathway: not_defined # 48 207 46 207 218 71 30.0 2e-11 MLNSIFNTAILCCANIVCQCYAYNEVKFRLNKLNVSYKTGTEKYYCLILTTLAVLYLSLN QLSLIQSIIVIVFYAFLTLMACIDLLSFLLPRLYTVTFIFSGLLYQTWNNNILSGLFCAM LMFFIMLFVRLYFAYKNGTESFGMGDVLLIAGTGVWFPTPEIACSIVFIAVIGGIIFFTL GGLNKQKKYIPFGPFLCGGMFVYSLVPGILF >gi|296494561|gb|ADTN01000177.1| GENE 12 10455 - 11487 460 344 aa, chain + ## HITS:1 COG:no KEGG:ECED1_3512 NR:ns ## KEGG: ECED1_3512 # Name: pilV # Def: minor pilin subunit # Organism: E.coli_ED1a # Pathway: not_defined # 5 344 9 321 447 299 49.0 1e-79 MKKTDKGVSLLEVLLVIGIMVMVIPKVYENIENHLNNVRWQNAAEHANTYNTAVRNYVAD NASTLLAGSLPKTITPATLIQKGYLKSGFSESNFGQSYITGIAKNSKTSRLEALTCSNGG QSLSEAGMRSVASMIEGLGGYINSSKQAIGAGGGWSDTPSNYGLNCATGHIAMALVGADL QESDRLYRYSITNRPDLNRMHTAIDMNSNNLNNVGTLNGNAAALSGDISARNGTFSGAIS GNTATTNGDITSNNGWLVTKNSKGWMNSTYGGGWYMSDSSWLRSVNNKGIYTGGQVKGGT VRADGRLYTGEYLQLEKTATAGTSCSPNGLVGRDSTGAILSCQS Prediction of potential genes in microbial genomes Time: Sun May 15 23:45:23 2011 Seq name: gi|296494560|gb|ADTN01000178.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont350.1, whole genome shotgun sequence Length of sequence - 2078 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 22 - 81 3.4 1 1 Op 1 . + CDS 232 - 1152 1101 ## COG2214 DnaJ-class molecular chaperone 2 1 Op 2 . + CDS 1152 - 1457 333 ## ECB_01002 modulator of CbpA co-chaperone - Term 1520 - 1580 4.4 3 2 Tu 1 . 
- CDS 1609 - 2049 338 ## COG3381 Uncharacterized component of anaerobic dehydrogenases Predicted protein(s) >gi|296494560|gb|ADTN01000178.1| GENE 1 232 - 1152 1101 306 aa, chain + ## HITS:1 COG:cbpA KEGG:ns NR:ns ## COG: cbpA COG2214 # Protein_GI_number: 16128966 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone # Organism: Escherichia coli K12 # 1 306 1 306 306 612 100.0 1e-175 MELKDYYAIMGVKPTDDLKTIKTAYRRLARKYHPDVSKEPDAEARFKEVAEAWEVLSDEQ RRAEYDQMWQHRNDPQFNRQFHHGDGQSFNAEDFDDIFSSIFGQHARQSRQRPATRGHDI EIEVAVFLEETLTEHKRTISYNLPVYNAFGMIEQEIPKTLNVKIPAGVGNGQRIRLKGQG TPGENGGPNGDLWLVIHIAPHPLFDIVGQDLEIVVPVSPWEAALGAKVTVPTLKESILLT IPPGSQAGQRLRVKGKGLVSKKQTGDLYAVLKIVMPPKPDENTAALWQQLADAQSSFDPR KDWGKA >gi|296494560|gb|ADTN01000178.1| GENE 2 1152 - 1457 333 101 aa, chain + ## HITS:1 COG:no KEGG:ECB_01002 NR:ns ## KEGG: ECB_01002 # Name: yccD # Def: modulator of CbpA co-chaperone # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 101 1 101 101 189 100.0 2e-47 MANVTVTFTITEFCLHTGISEEELNEIVGLGVVEPREIQETTWVFDDHAAIVVQRAVRLR HELALDWPGIAVALTLMDDIAHLKQENRLLRQRLSRFVAHP >gi|296494560|gb|ADTN01000178.1| GENE 3 1609 - 2049 338 146 aa, chain - ## HITS:1 COG:torD KEGG:ns NR:ns ## COG: torD COG3381 # Protein_GI_number: 16128964 # Func_class: R General function prediction only # Function: Uncharacterized component of anaerobic dehydrogenases # Organism: Escherichia coli K12 # 1 146 54 199 199 283 99.0 5e-77 MNELENRIATLTVRDDARLELAADFCGLFLMTDKQAALPYASAYKQDEQEIKRLLVEAGM ETSGNFNEPADHLAIYLELLSHLHFSLGEGTVPARRIDSLRQKTLTALWQWLPEFVARCR QYDSFGFYAALSQLLLVLVECDHQNR Prediction of potential genes in microbial genomes Time: Sun May 15 23:45:29 2011 Seq name: gi|296494559|gb|ADTN01000179.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont350.2, whole genome shotgun sequence Length of sequence - 11240 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 7, operones - 2 average op.length - 2.5 N Tu/Op Conserved S 
Start End Score pairs(N/Pv)
1 1 Op 1 5/0.250 - CDS 3 - 141 143 ## COG3381 Uncharacterized component of anaerobic dehydrogenases
2 1 Op 2 7/0.000 - CDS 138 - 2684 2761 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing
3 1 Op 3 . - CDS 2684 - 3856 1031 ## COG3005 Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit
- Prom 3919 - 3978 4.5
+ Prom 3821 - 3880 3.3
4 2 Tu 1 . + CDS 3986 - 4678 788 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
5 3 Tu 1 . - CDS 4651 - 5679 750 ## COG1879 ABC-type sugar transport system, periplasmic component
- Prom 5758 - 5817 2.1
+ Prom 5658 - 5717 3.3
6 4 Op 1 1/1.000 + CDS 5792 - 8506 2581 ## COG0642 Signal transduction histidine kinase
7 4 Op 2 . + CDS 8578 - 9651 639 ## COG0348 Polyferredoxin
+ Term 9681 - 9724 -0.5
- Term 9652 - 9693 5.6
8 5 Tu 1 . - CDS 9700 - 9873 164 ## PROTEIN SUPPORTED gi|16760699|ref|NP_456316.1| hypothetical protein STY1938
- Prom 9995 - 10054 4.2
- Term 10032 - 10062 1.0
9 6 Tu 1 . - CDS 10267 - 10479 310 ## COG1278 Cold shock proteins
- Prom 10668 - 10727 6.0
+ Prom 10649 - 10708 6.8
10 7 Tu 1 .
+ CDS 10765 - 10977 222 ## COG1278 Cold shock proteins Predicted protein(s) >gi|296494559|gb|ADTN01000179.1| GENE 1 3 - 141 143 46 aa, chain - ## HITS:1 COG:ECs1153 KEGG:ns NR:ns ## COG: ECs1153 COG3381 # Protein_GI_number: 15830407 # Func_class: R General function prediction only # Function: Uncharacterized component of anaerobic dehydrogenases # Organism: Escherichia coli O157:H7 # 1 46 1 46 199 82 100.0 2e-16 MTTLTAQQIACVYAWLAQLFSRELDDEQLTQIASAQMAEWFSLLKS >gi|296494559|gb|ADTN01000179.1| GENE 2 138 - 2684 2761 848 aa, chain - ## HITS:1 COG:torA KEGG:ns NR:ns ## COG: torA COG0243 # Protein_GI_number: 16128963 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 848 1 848 848 1748 99.0 0 MNNNDLFQASRRRFLAQLGGLTVAGMLGPSLLTPRRATAAQAATDAVISKEGILTGSHWG AIRATVKDGRFVAAKPFELDKYPSKMIAGLPDHVHNAARIRYPMVRVDWLRKRHLSDTSQ RGDNRFVRVSWDEALDMFYEELERVQKTHGPSALLTASGWQSTGMFHNASGMLAKAIALH GNSVGTGGDYSTGAAQVILPRVVGSMEVYEQQTSWPLVLQNSKTIVLWGSDLLKNQQANW WCPDHDVYEYYAQLKAKVAAGEIEVISIDPVVTSTHEYLGREHVKHIAVNPQTDVPLQLA LAHTLYSENLYDKNFLANYCVGFEQFLPYLLGEKDGQPKDAAWAEKLTGIDAETIRRLAR QMAANRTQIIAGWCVQRMQHGEQWAWMIVVLAAMLGQIGLPGGGFGFGWHYNGAGTPGRK GVILSGFSGSTSIPPVHDNSDYKGYSSTIPIARFIDAILEPGKVINWNGKSVKLPPLKMC IFAGTNPFHRHQQINRIIEGLRKLETVIAIDNQWTSTCRFADIVLPATTQFERNDLDQYG NHSNRGIIAMKQVVPPQFEARNDFDIFRELCRRFNREEAFTEGLDEMGWLKRIWQEGVQQ GKGRGVHLPAFDDFWNNKEYVEFDHPQMFVRHQAFREDPDLEPLGTPSGLIEIYSKTIAD MNYDDCQGHPMWFEKIERSHGGPGSQKYPLHLQSVHPDFRLHSQLCESETLRQQYTVAGK EPVFINPQDASARGIRNGDVVRVFNARGQVLAGAVVSDRYAPGVARIHEGAWYDPDKGGE PGALCKYGNPNVLTIDIGTSQLAQATSAHTTLVEIEKYNGTVEQVTAFNGPVEMVAQCEY VPASQVKS >gi|296494559|gb|ADTN01000179.1| GENE 3 2684 - 3856 1031 390 aa, chain - ## HITS:1 COG:torC KEGG:ns NR:ns ## COG: torC COG3005 # Protein_GI_number: 16128962 # Func_class: C Energy production and conversion # Function: Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit # Organism: Escherichia coli K12 # 1 390 1 
390 390 801 100.0 0 MRKLWNALRRPSARWSVLALVAIGIVIGIALIVLPHVGIKVTSTTEFCVSCHSMQPVYEE YKQSVHFQNASGVRAECHDCHIPPDIPGMVKRKLEASNDIYQTFIAHSIDTPEKFEAKRA ELAEREWARMKENNSATCRSCHNYDAMDHAKQHPEAARQMKVAAKDNQSCIDCHKGIAHQ LPDMSSGFRKQFDELRASANDSGDTLYSIDIKPIYAAKGDKEASGSLLPASEVKVLKRDG DWLQIEITGWTESAGRQRVLTQFPGKRIFVASIRGDVQQQVKTLEKTTVADTNTEWSKLQ ATAWMKKGDMVNDIKPIWAYADSLYNGTCNQCHGAPEIAHFDANGWIGTLNGMIGFTSLD KREERTLLKYLQMNASDTAGKAHGDKKEEK >gi|296494559|gb|ADTN01000179.1| GENE 4 3986 - 4678 788 230 aa, chain + ## HITS:1 COG:torR KEGG:ns NR:ns ## COG: torR COG0745 # Protein_GI_number: 16128961 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli K12 # 1 230 1 230 230 439 100.0 1e-123 MPHHIVIVEDEPVTQARLQSYFTQEGYTVSVTASGAGLREIMQNQSVDLILLDINLPDEN GLMLTRALRERSTVGIILVTGRSDRIDRIVGLEMGADDYVTKPLELRELVVRVKNLLWRI DLARQAQPHTQDNCYRFAGYCLNVSRHTLERDGEPIKLTRAEYEMLVAFVTNPGEILSRE RLLRMLSARRVENPDLRTVDVLIRRLRHKLSADLLVTQHGEGYFLAADVC >gi|296494559|gb|ADTN01000179.1| GENE 5 4651 - 5679 750 342 aa, chain - ## HITS:1 COG:torT KEGG:ns NR:ns ## COG: torT COG1879 # Protein_GI_number: 16128960 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli K12 # 1 342 1 342 342 638 100.0 0 MRVLLFLLLSLFMLPAFSADNLLRWHDAQHFTVQASTPLKAKRAWKLCALYPSLKDSYWL SLNYGMQEAARRYGVDLKVLEAGGYSQLATQQAQIDQCKQWGAEAILLGSSTTSFPDLQK QVASLPVIELVNAIDAPQVKSRVGVPWFQMGYQPGRYLVQWAHGKPLNVLLMPGPDNAGG SKEMVEGFRAAIAGSPVRIVDIALGDNDIEIQRNLLQEMLERHPEIDVVAGTAIAAEAAM GEGRNLKTPLTVVSFYLSHQVYRGLKRGRVIMAASDQMVWQGELAVEQAIRQLQGQSVSD NVSPPILVLTPKNADREHIRRSLSPGGFRPVYFYQHTSAAKK >gi|296494559|gb|ADTN01000179.1| GENE 6 5792 - 8506 2581 904 aa, chain + ## HITS:1 COG:torS_1 KEGG:ns NR:ns ## COG: torS_1 COG0642 # Protein_GI_number: 16128959 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # 
Organism: Escherichia coli K12 # 1 650 1 650 650 1130 100.0 0 MGFALMALLTLTSTLVGWYNLRFISQVEKDNTQALIPTMNMARQLSEASAWELFAAQNLT SADNEKMWQAQGRMLTAQSLKINALLQALREQGFDTTAIEQQEQEISRSLRQQGELVGQR LQLRQQQQQLSQQIVAAADEIARLAQGQANNATTSAGATQAGIYDLIEQDQRQAAESALD RLIDIDLEYVNQMNELRLSALRVQQMVMNLGLEQIQKNAPTLEKQLNNAVKILQRRQIRI EDPGVRAQVATTLTTVSQYSDLLALYQQDSEISNHLQTLAQNNIAQFAQFSSEVSQLVDT IELRNQHGLAHLEKASARGQYSLLLLGMVSLCALILILWRVVYRSVTRPLAEQTQALQRL LDGDIDSPFPETAGVRELDTIGRLMDAFRSNVHALNRHREQLAAQVKARTAELQELVIEH RQARAEAEKASQAKSAFLAAMSHEIRTPLYGILGTAQLLADNPALNAQRDDLRAITDSGE SLLTILNDILDYSAIEAGGKNVSVSDEPFEPRPLLESTLQLMSGRVKGRPIRLATAIADD MPCALMGDPRRIRQVITNLLSNALRFTDEGYIILRSRTDGEQWLVEVEDSGCGIDPAKLA EIFQPFVQVSGKRGGTGLGLTISSRLAQAMGGELSATSTPEVGSCFCLRLPLRVATAPVP KTVNQAVRLDGLRLLLIEDNPLTQRITIEMLKTSGAQIVAVGNAAQALETLQNSEPFAAA LVDFDLPDIDGITLARQLAQQYPSLVLIGFSAHVIDETLRQRTSSLFRGIIPKPVPREVL GQLLAHYLQLQVNNDQSLDVSQLNEDAQLMGTEKIHEWLVLFTQHALPLLDEIDIARASQ DSEKIKRAAHQLKSSCSSLGMHIASQLCAQLEQQPLSAPLPHEEITRSVAALEAWLHKKD LNAI >gi|296494559|gb|ADTN01000179.1| GENE 7 8578 - 9651 639 357 aa, chain + ## HITS:1 COG:yccM KEGG:ns NR:ns ## COG: yccM COG0348 # Protein_GI_number: 16128958 # Func_class: C Energy production and conversion # Function: Polyferredoxin # Organism: Escherichia coli K12 # 1 357 1 357 357 702 100.0 0 MAENKRTRWQRRPGTTGGKLPWNDWRNATTWRKATQLLLLAMNIYIAITFWYWVRYYETA SSTTFVARPGGIEGWLPIAGLMNLKYSLVTGQLPSVHAAAMLLLVAFIVISLLLKKAFCS WLCPVGTLSELIGDLGNKLFGRQCVLPRWLDIPLRGVKYLLLSFFIYIALLMPAQAIHYF MLSPYSVVMDVKMLDFFRHMGTATLISVTVLLIASLFIRHAWCRYLCPYGALMGVVSLLS PFKIRRNAESCIDCGKCAKNCPSRIPVDKLIQVRTVECTGCMTCVESCPVASTLTFSLQK PAANKKAFALSGWLMTLLVLGIMFAVIGYAMYAGVWQSPVPEELYRRLIPQAPMIGH >gi|296494559|gb|ADTN01000179.1| GENE 8 9700 - 9873 164 57 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|16760699|ref|NP_456316.1| hypothetical protein STY1938 [Salmonella enterica subsp. enterica serovar Typhi str. 
CT18] # 1 57 1 57 57 67 59 3e-11 MNIEELKKQAETEIADFIAQKIAELNKNTGKEVSEIRFTAREKMTGLESYDVKIKIM >gi|296494559|gb|ADTN01000179.1| GENE 9 10267 - 10479 310 70 aa, chain - ## HITS:1 COG:ECs1145 KEGG:ns NR:ns ## COG: ECs1145 COG1278 # Protein_GI_number: 15830399 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Escherichia coli O157:H7 # 1 70 1 70 70 133 100.0 9e-32 MSNKMTGLVKWFNADKGFGFITPDDGSKDVFVHFTAIQSNEFRTLNENQKVEFSIEQGQR GPAAANVVTL >gi|296494559|gb|ADTN01000179.1| GENE 10 10765 - 10977 222 70 aa, chain + ## HITS:1 COG:ECs1144 KEGG:ns NR:ns ## COG: ECs1144 COG1278 # Protein_GI_number: 15830398 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Escherichia coli O157:H7 # 1 70 1 70 70 137 100.0 5e-33 MSRKMTGIVKTFDRKSGKGFIIPSDGRKEVQVHISAFTPRDAEVLIPGLRVEFCRVNGLR GPTAANVYLS Prediction of potential genes in microbial genomes Time: Sun May 15 23:45:32 2011 Seq name: gi|296494558|gb|ADTN01000180.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont364.1, whole genome shotgun sequence Length of sequence - 1485 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 11 - 54 9.3 1 1 Op 1 12/0.000 - CDS 74 - 727 937 ## COG2864 Cytochrome b subunit of formate dehydrogenase 2 1 Op 2 . 
- CDS 720 - 1484 815 ## COG0437 Fe-S-cluster-containing hydrogenase components 1 Predicted protein(s) >gi|296494558|gb|ADTN01000180.1| GENE 1 74 - 727 937 217 aa, chain - ## HITS:1 COG:STM1568 KEGG:ns NR:ns ## COG: STM1568 COG2864 # Protein_GI_number: 16764912 # Func_class: C Energy production and conversion # Function: Cytochrome b subunit of formate dehydrogenase # Organism: Salmonella typhimurium LT2 # 1 217 1 217 218 408 98.0 1e-114 MSKSKMIVRTKFIDRACHWTVVICFFLVALSGISFFFPTLQWLTQTFGTPQMGRILHPFF GIAIFVALMFMFVRFVHHNIPDKKDIPWLLNIVEVLKGNEHKVADVGKYNAGQKMMFWSI MSMIFVLLVTGVIIWRPYFAQYFPMQVVRYSLLIHAAAGIILIHAILIHMYMAFWVKGSI KGMIEGKVSRRWAKKHHPRWYREIEKAEAKKESEEGI >gi|296494558|gb|ADTN01000180.1| GENE 2 720 - 1484 815 254 aa, chain - ## HITS:1 COG:fdnH KEGG:ns NR:ns ## COG: fdnH COG0437 # Protein_GI_number: 16129434 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli K12 # 1 254 41 294 294 531 99.0 1e-151 GCKACQVACSEWNDIRDEVGHCVGVYDNPADLSAKSWTVMRFSETEQNGKLEWLIRKDGC MHCEDPGCLKACPSAGAIIQYANGIVDFQSENCIGCGYCIAGCPFNIPRLNKEDNRVYKC TLCVDRVSVGQEPACVKTCPTGAIHFGTKKEMLELAEQRVAKLKARGYEHAGVYNPEGVG GTHVMYVLHHADQPELYHGLPKDPKIDTSVSLWKGALNPLAAAGFIATFAGLIFHYIGIG PNKEVDDDEEDHHE Prediction of potential genes in microbial genomes Time: Sun May 15 23:45:33 2011 Seq name: gi|296494557|gb|ADTN01000181.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont372.1, whole genome shotgun sequence Length of sequence - 2596 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 409 - 1155 292 ## COG5433 Transposase - Prom 1190 - 1249 6.2 - Term 1217 - 1258 7.6 2 2 Tu 1 . - CDS 1337 - 1570 61 ## B21_01427 hypothetical protein - Prom 1643 - 1702 7.8 3 3 Tu 1 . 
- CDS 1803 - 2594 204 ## COG3209 Rhs family protein Predicted protein(s) >gi|296494557|gb|ADTN01000181.1| GENE 1 409 - 1155 292 248 aa, chain - ## HITS:1 COG:b1458 KEGG:ns NR:ns ## COG: b1458 COG5433 # Protein_GI_number: 16129417 # Func_class: L Replication, recombination and repair # Function: Transposase # Organism: Escherichia coli K12 # 1 248 1 248 248 469 100.0 1e-132 MSIQSLLDYISVTPDIRQQGKVKHKLSAILFLTVCAVIAGADEWQEIEDFGHERLEWLKK YGDFDNGIPVDDTIARVVSNIDSLAFEKMFIEWMQECHEITDGEIIAIDGKTIRGSFDKG KRKGAIHMVSAFSNENGVVLGQVKTEAKSNEITAIPELLNLLYLKKNLITIDAMGCQKDI ASKIKDKKADYLLAVKGNQGKLHHAFEEKFPVNVFSNYKGDSFSTQEISHGRKETRLHIV SNVTPELL >gi|296494557|gb|ADTN01000181.1| GENE 2 1337 - 1570 61 77 aa, chain - ## HITS:1 COG:no KEGG:B21_01427 NR:ns ## KEGG: B21_01427 # Name: ydcD # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 77 84 160 160 107 98.0 1e-22 MVDINLNFFNDILYSVRLKNISKLENMEFCATKQRVYFSDKNKKASYKIINYGDYYDVDY YDNNLKNEVFDWIGKWS >gi|296494557|gb|ADTN01000181.1| GENE 3 1803 - 2594 204 263 aa, chain - ## HITS:1 COG:rhsE KEGG:ns NR:ns ## COG: rhsE COG3209 # Protein_GI_number: 16129415 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli K12 # 1 263 420 682 682 556 100.0 1e-158 EPEYTPARKVHFYHCDHRGLPLALISEDGNTAWRGEYDEWGNQLNEENPHHLHQPYRLPG QQHDEESGLYYNRHRHYDPLQGRYITPDPIGLRGGWNMYQYPLNPIQVIDPMGLDAIENM TSGGLIYAVSGVPGLIAANSITNSAYQFGYDMDAIVGGAHNGAADAMRHCYLMCRMTKTF GSTIADVIGKNHEAAGDRQGQPAKERIMDLKNNTVGIACGDFSAKCSDACIEKYNTGQLF GLDGIKADNPIKAKQGSSDASNY Prediction of potential genes in microbial genomes Time: Sun May 15 23:45:47 2011 Seq name: gi|296494556|gb|ADTN01000182.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont381.1, whole genome shotgun sequence Length of sequence - 35458 bp Number of predicted genes - 33, with homology - 33 Number of transcription units - 10, operones - 6 average op.length - 4.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 16 - 75 4.0 1 1 Tu 1 . 
+ CDS 105 - 620 485 ## COG0583 Transcriptional regulator
+ Term 626 - 677 14.4
+ Prom 663 - 722 9.5
2 2 Op 1 32/0.000 + CDS 938 - 2662 1434 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase]
3 2 Op 2 3/1.000 + CDS 2665 - 3156 461 ## COG0440 Acetolactate synthase, small (regulatory) subunit
+ Term 3182 - 3214 1.5
+ Prom 3248 - 3307 5.7
4 3 Tu 1 . + CDS 3336 - 4340 1070 ## COG1609 Transcriptional regulators
+ Prom 4838 - 4897 4.1
5 4 Op 1 29/0.000 + CDS 4942 - 5400 274 ## COG2001 Uncharacterized protein conserved in bacteria
6 4 Op 2 12/0.000 + CDS 5405 - 6343 1111 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis
7 4 Op 3 12/0.000 + CDS 6340 - 6705 341 ## COG3116 Cell division protein
8 4 Op 4 26/0.000 + CDS 6721 - 8487 1651 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2
9 4 Op 5 26/0.000 + CDS 8474 - 9961 1511 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase
10 4 Op 6 28/0.000 + CDS 9958 - 11316 1351 ## COG0770 UDP-N-acetylmuramyl pentapeptide synthase
11 4 Op 7 28/0.000 + CDS 11310 - 12392 1438 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase
12 4 Op 8 25/0.000 + CDS 12395 - 13711 1425 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase
13 4 Op 9 31/0.000 + CDS 13711 - 14955 1506 ## COG0772 Bacterial cell division membrane protein
14 4 Op 10 26/0.000 + CDS 14952 - 16019 989 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase
15 4 Op 11 11/0.000 + CDS 16073 - 17548 1562 ## COG0773 UDP-N-acetylmuramate-alanine ligase
16 4 Op 12 18/0.000 + CDS 17541 - 18461 1026 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes
17 4 Op 13 25/0.000 + CDS 18463 - 19293 671 ## COG1589 Cell division septal protein
18 4 Op 14 35/0.000 + CDS 19290 - 20552 1175 ## COG0849 Actin-like ATPase involved in cell division
19 4 Op 15 11/0.000 + CDS 20613 - 21764 1359 ## COG0206 Cell division GTPase
+ Prom 21775 - 21834 1.8
20 4 Op 16 . + CDS 21865 - 22782 783 ## COG0774 UDP-3-O-acyl-N-acetylglucosamine deacetylase
+ Term 22809 - 22866 2.1
+ Prom 22830 - 22889 4.9
21 5 Op 1 . + CDS 22938 - 23525 315 ## G2583_0101 secretion monitor protein
+ Term 23556 - 23584 0.5
22 5 Op 2 8/0.000 + CDS 23587 - 26292 3668 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase)
+ Term 26314 - 26344 3.0
23 5 Op 3 . + CDS 26352 - 26741 337 ## PROTEIN SUPPORTED gi|42631237|ref|ZP_00156775.1| COG0494: NTP pyrophosphohydrolases including oxidative damage repair enzymes
- Term 26889 - 26933 3.7
24 6 Op 1 7/0.000 - CDS 26957 - 27154 129 ## COG3024 Uncharacterized protein conserved in bacteria
25 6 Op 2 7/0.000 - CDS 27164 - 27907 760 ## COG4582 Uncharacterized protein conserved in bacteria
26 6 Op 3 . - CDS 27907 - 28527 590 ## COG0237 Dephospho-CoA kinase
- Prom 28629 - 28688 4.8
+ Prom 28485 - 28544 3.0
27 7 Tu 1 . + CDS 28752 - 29795 1061 ## COG0516 IMP dehydrogenase/GMP reductase
+ Term 29804 - 29841 7.7
28 8 Op 1 24/0.000 - CDS 29830 - 31032 899 ## COG1459 Type II secretory pathway, component PulF
29 8 Op 2 8/0.000 - CDS 31022 - 32407 730 ## COG2804 Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB
30 8 Op 3 5/0.500 - CDS 32417 - 32857 467 ## COG4969 Tfp pilus assembly protein, major pilin PilA
- Prom 32934 - 32993 3.5
31 9 Tu 1 . - CDS 33060 - 33953 460 ## PROTEIN SUPPORTED gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6
- Prom 33974 - 34033 2.0
+ Prom 33771 - 33830 2.0
32 10 Op 1 6/0.250 + CDS 34041 - 34592 528 ## COG3023 Negative regulator of beta-lactamase expression
33 10 Op 2 .
+ CDS 34589 - 35443 933 ## COG3725 Membrane protein required for beta-lactamase induction Predicted protein(s) >gi|296494556|gb|ADTN01000182.1| GENE 1 105 - 620 485 171 aa, chain + ## HITS:1 COG:leuO KEGG:ns NR:ns ## COG: leuO COG0583 # Protein_GI_number: 16128070 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 171 203 373 373 343 100.0 9e-95 MFKSSLNQNTEHQLRYQETEFVISYEDFHRPEFTSVPLFKDEMVLVASKNHPTIKGPLLK HDVYNEQHAAVSLDRFASFSQPWYDTVDKQASIAYQGMAMMSVLSVVSQTHLVAIAPRWL AEEFAESLELQVLPLPLKQNSRTCYLSWHEAAGRDKGHQWMEEQLVSICKR >gi|296494556|gb|ADTN01000182.1| GENE 2 938 - 2662 1434 574 aa, chain + ## HITS:1 COG:ilvI KEGG:ns NR:ns ## COG: ilvI COG0028 # Protein_GI_number: 16132273 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Escherichia coli K12 # 1 574 1 574 574 1174 100.0 0 MEMLSGAEMVVRSLIDQGVKQVFGYPGGAVLDIYDALHTVGGIDHVLVRHEQAAVHMADG LARATGEVGVVLVTSGPGATNAITGIATAYMDSIPLVVLSGQVATSLIGYDAFQECDMVG ISRPVVKHSFLVKQTEDIPQVLKKAFWLAASGRPGPVVVDLPKDILNPANKLPYVWPESV SMRSYNPTTTGHKGQIKRALQTLVAAKKPVVYVGGGAITAGCHQQLKETVEALNLPVVCS LMGLGAFPATHRQALGMLGMHGTYEANMTMHNADVIFAVGVRFDDRTTNNLAKYCPNATV LHIDIDPTSISKTVTADIPIVGDARQVLEQMLELLSQESAHQPLDEIRDWWQQIEQWRAR QCLKYDTHSEKIKPQAVIETLWRLTKGDAYVTSDVGQHQMFAALYYPFDKPRRWINSGGL GTMGFGLPAALGVKMALPEETVVCVTGDGSIQMNIQELSTALQYELPVLVVNLNNRYLGM VKQWQDMIYSGRHSQSYMQSLPDFVRLAEAYGHVGIQISHPHELESKLSEALEQVRNNRL VFVDVTVDGSEHVYPMQIRGGGMDEMWLSKTERT >gi|296494556|gb|ADTN01000182.1| GENE 3 2665 - 3156 461 163 aa, chain + ## HITS:1 COG:ilvH KEGG:ns NR:ns ## COG: ilvH COG0440 # Protein_GI_number: 16128071 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Escherichia coli K12 # 1 163 1 163 163 284 100.0 6e-77 
MRRILSVLLENESGALSRVIGLFSQRGYNIESLTVAPTDDPTLSRMTIQTVGDEKVLEQI EKQLHKLVDVLRVSELGQGAHVEREIMLVKIQASGYGRDEVKRNTEIFRGQIIDVTPSLY TVQLAGTSGKLDAFLASIRDVAKIVEVARSGVVGLSRGDKIMR >gi|296494556|gb|ADTN01000182.1| GENE 4 3336 - 4340 1070 334 aa, chain + ## HITS:1 COG:ECs0084 KEGG:ns NR:ns ## COG: ECs0084 COG1609 # Protein_GI_number: 15829338 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 334 1 334 334 655 100.0 0 MKLDEIARLAGVSRTTASYVINGKAKQYRVSDKTVEKVMAVVREHNYHPNAVAAGLRAGR TRSIGLVIPDLENTSYTRIANYLERQARQRGYQLLIACSEDQPDNEMRCIEHLLQRQVDA IIVSTSLPPEHPFYQRWANDPFPIVALDRALDREHFTSVVGADQDDAEMLAEELRKFPAE TVLYLGALPELSVSFLREQGFRTAWKDDPREVHFLYANSYEREAAAQLFEKWLETHPMPQ ALFTTSFALLQGVMDVTLRRDGKLPSDLAIATFGDNELLDFLQCPVLAVAQRHRDVAERV LEIVLASLDEPRKPKPGLTRIKRNLYRRGVLSRS >gi|296494556|gb|ADTN01000182.1| GENE 5 4942 - 5400 274 152 aa, chain + ## HITS:1 COG:yabB KEGG:ns NR:ns ## COG: yabB COG2001 # Protein_GI_number: 16128074 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 152 1 152 152 298 100.0 2e-81 MFRGATLVNLDSKGRLSVPTRYREQLLENAAGQMVCTIDIYHPCLLLYPLPEWEIIEQKL SRLSSMNPVERRVQRLLLGHASECQMDGAGRLLIAPVLRQHAGLTKEVMLVGQFNKFELW DETTWHQQVKEDIDAEQLATGDLSERLQDLSL >gi|296494556|gb|ADTN01000182.1| GENE 6 5405 - 6343 1111 312 aa, chain + ## HITS:1 COG:ECs0086 KEGG:ns NR:ns ## COG: ECs0086 COG0275 # Protein_GI_number: 15829340 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Escherichia coli O157:H7 # 1 312 2 313 313 593 100.0 1e-169 MENYKHTTVLLDEAVNGLNIRPDGIYIDGTFGRGGHSRLILSQLGEEGRLLAIDRDPQAI AVAKTIDDPRFSIIHGPFSALGEYVAERDLIGKIDGILLDLGVSSPQLDDAERGFSFMRD GPLDMRMDPTRGQSAAEWLQTAEEADIAWVLKTYGEERFAKRIARAIVERNREQPMTRTK ELAEVVAAATPVKDKFKHPATRTFQAVRIWVNSELEEIEQALKSSLNVLAPGGRLSIISF HSLEDRIVKRFMRENSRGPQVPAGLPMTEEQLKKLGGRQLRALGKLMPGEEEVAENPRAR SSVLRIAERTNA 
>gi|296494556|gb|ADTN01000182.1| GENE 7 6340 - 6705 341 121 aa, chain + ## HITS:1 COG:ECs0087 KEGG:ns NR:ns ## COG: ECs0087 COG3116 # Protein_GI_number: 15829341 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Escherichia coli O157:H7 # 1 121 1 121 121 224 100.0 4e-59 MISRVTEALSKVKGSMGSHERHALPGVIGDDLLRFGKLPLCLFICIILTAVTVVTTAHHT RLLTAQREQLVLERDALDIEWRNLILEENALGDHSRVERIATEKLQMQHVDPSQENIVVQ K >gi|296494556|gb|ADTN01000182.1| GENE 8 6721 - 8487 1651 588 aa, chain + ## HITS:1 COG:ECs0088 KEGG:ns NR:ns ## COG: ECs0088 COG0768 # Protein_GI_number: 15829342 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Escherichia coli O157:H7 # 1 588 1 588 588 1162 100.0 0 MKAAAKTQKPKRQEEHANFISWRFALLCGCILLALAFLLGRVAWLQVISPDMLVKEGDMR SLRVQQVSTSRGMITDRSGRPLAVSVPVKAIWADPKEVHDAGGISVGDRWKALANALNIP LDQLSARINANPKGRFIYLARQVNPDMADYIKKLKLPGIHLREESRRYYPSGEVTAHLIG FTNVDSQGIEGVEKSFDKWLTGQPGERIVRKDRYGRVIEDISSTDSQAAHNLALSIDERL QALVYRELNNAVAFNKAESGSAVLVDVNTGEVLAMANSPSYNPNNLSGTPKEAMRNRTIT DVFEPGSTVKPMVVMTALQRGVVRENSVLNTIPYRINGHEIKDVARYSELTLTGVLQKSS NVGVSKLALAMPSSALVDTYSRFGLGKATNLGLVGERSGLYPQKQRWSDIERATFSFGYG LMVTPLQLARVYATIGSYGIYRPLSITKVDPPVPGERVFPESIVRTVVHMMESVALPGGG GVKAAIKGYRIAIKTGTAKKVGPDGRYINKYIAYTAGVAPASQPRFALVVVINDPQAGKY YGGAVSAPVFGAIMGGVLRTMNIEPDALTTGDKNEFVINQGEGTGGRS >gi|296494556|gb|ADTN01000182.1| GENE 9 8474 - 9961 1511 495 aa, chain + ## HITS:1 COG:murE KEGG:ns NR:ns ## COG: murE COG0769 # Protein_GI_number: 16128078 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Escherichia coli K12 # 1 495 1 495 495 937 99.0 0 MADRNLRDLLAPWVPDAPSRALREMTLDSRVAAAGDLFVAVVGHQADGRRYIPQAIAQGV AAIIAEAKDEATDGEIREMHGVPVIYLSQLNERLSALAGRFYHEPSDNLRLVGVTGTNGK TTTTQLLAQWSQLLGETSAVMGTVGNGLLGKVIPTENTTGSAVDVQHELAGLVDQGATFC AMEVSSHGLVQHRVAALKFAASVFTNLSRDHLDYHGDMEHYEAAKWLLYSEHHCGQAIIN 
ADDEVGRRWLAKLPDAVAVSMEDHINPNCHGRWLKATEVNYHDSGATIRFSSSWGDGEIE SHLMGAFNVSNLLLALATLLALGYPLADLLKTAARLQPVCGRMEVFTAPGKPTVVVDYAH TPDALEKALQAARLHCAGKLWCVFGCGGDRDKGKRPLMGAIAEEFADVAVVTDDNPRTEE PRAIINDILAGMLDAGHAKVMEGRAEAVTCAVMQAKENDVVLVAGKGHEDYQIVGNQRLD YSDRVTVARLLGVIA >gi|296494556|gb|ADTN01000182.1| GENE 10 9958 - 11316 1351 452 aa, chain + ## HITS:1 COG:murF KEGG:ns NR:ns ## COG: murF COG0770 # Protein_GI_number: 16128079 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide synthase # Organism: Escherichia coli K12 # 1 452 1 452 452 816 100.0 0 MISVTLSQLTDILNGELQGADITLDAVTTDTRKLTPGCLFVALKGERFDAHDFADQAKAG GAGALLVSRPLDIDLPQLIVKDTRLAFGELAAWVRQQVPARVVALTGSSGKTSVKEMTAA ILSQCGNTLYTAGNLNNDIGVPMTLLRLTPEYDYAVIELGANHQGEIAWTVSLTRPEAAL VNNLAAAHLEGFGSLAGVAKAKGEIFSGLPENGIAIMNADNNDWLNWQSVIGSRKVWRFS PNAANSDFTATNIHVTSHGTEFTLQTPTGSVDVLLPLPGRHNIANALAAAALSMSVGATL DAIKAGLANLKAVPGRLFPIQLAENQLLLDDSYNANVGSMTAAVQVLAEMPGYRVLVVGD MAELGAESEACHVQVGEAAKAAGIDRVLSVGKQSHAISTASGVGEHFADKTALITRLKLL IAEQQVITILVKGSRSAAMEEVVRALQENGTC >gi|296494556|gb|ADTN01000182.1| GENE 11 11310 - 12392 1438 360 aa, chain + ## HITS:1 COG:STM0125 KEGG:ns NR:ns ## COG: STM0125 COG0472 # Protein_GI_number: 16763515 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Salmonella typhimurium LT2 # 1 360 1 360 360 642 97.0 0 MLVWLAEHLVKYYSGFNVFSYLTFRAIVSLLTALFISLWMGPRMIAHLQKLSFGQVVRND GPESHFSKRGTPTMGGIMILTAIVISVLLWAYPSNPYVWCVLVVLVGYGVIGFVDDYRKV VRKDTKGLIARWKYFWMSVIALGVAFALYLAGKDTPATQLVVPFFKDVMPQLGLFYILLA YFVIVGTGNAVNLTDGLDGLAIMPTVFVAGGFALVAWATGNMNFASYLHIPYLRHAGELV IVCTAIVGAGLGFLWFNTYPAQVFMGDVGSLALGGALGIIAVLLRQEFLLVIMGGVFVVE TLSVILQVGSFKLRGQRIFRMAPIHHHYELKGWPEPRVIVRFWIISLMLVLIGLATLKVR >gi|296494556|gb|ADTN01000182.1| GENE 12 12395 - 13711 1425 438 aa, chain + ## HITS:1 COG:ECs0092 KEGG:ns NR:ns ## COG: ECs0092 COG0771 # Protein_GI_number: 15829346 # Func_class: M 
Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: Escherichia coli O157:H7 # 1 438 1 438 438 853 99.0 0 MADYQGKNVVIIGLGLTGLSCVDFFLARGVTPRVMDTRMTPPGLDKLPEAVERHTGSLND EWLMAADLIVASPGIALAHPSLSAAADAGIEIVGDIELFCREAQAPIVAITGSNGKSTVT TLVGEMAKAAGVNVGVGGNIGLPALMLLDDECELYVLELSSFQLETTSSLQAVAATILNV TEDHMDRYPFGLQQYRAAKLRIYENAKVCVVNADDALTMPIRGADERCVSFGVNMGDYHL NHQQGETWLRVKGEKVLNVKEMKLSGQHNYTNALAALALADAAGLPRASSLKALTTFTGL PHRFEVVLEHNGVRWINDSKATNVGSTEAALNGLHVDGTLHLLLGGDGKSADFSPLARYL NGDNVRLYCFGRDGAQLAALRPEVAEQTETMEQAMRLLAPRVQPGDMVLLSPACASLDQF KNFEQRGNEFARLAKELG >gi|296494556|gb|ADTN01000182.1| GENE 13 13711 - 14955 1506 414 aa, chain + ## HITS:1 COG:ECs0093 KEGG:ns NR:ns ## COG: ECs0093 COG0772 # Protein_GI_number: 15829347 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Escherichia coli O157:H7 # 1 414 1 414 414 735 100.0 0 MRLSLPRLKMPRLPGFSILVWISTALKGWVMGSREKDTDSLIMYDRTLLWLTFGLAAIGF IMVTSASMPIGQRLTNDPFFFAKRDGVYLILAFILAIITLRLPMEFWQRYSATMLLGSII LLMIVLVVGSSVKGASRWIDLGLLRIQPAELTKLSLFCYIANYLVRKGDEVRNNLRGFLK PMGVILVLAVLLLAQPDLGTVVVLFVTTLAMLFLAGAKLWQFIAIIGMGISAVVLLILAE PYRIRRVTAFWNPWEDPFGSGYQLTQSLMAFGRGELWGQGLGNSVQKLEYLPEAHTDFIF AIIGEELGYVGVVLALLMVFFVAFRAMSIGRKALEIDHRFSGFLACSIGIWFSFQALVNV GAAAGMLPTKGLTLPLISYGGSSLLIMSTAIMMLLRIDYETRLEKAQAFVRGSR >gi|296494556|gb|ADTN01000182.1| GENE 14 14952 - 16019 989 355 aa, chain + ## HITS:1 COG:murG KEGG:ns NR:ns ## COG: murG COG0707 # Protein_GI_number: 16128083 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Escherichia coli K12 # 1 355 1 355 355 655 100.0 0 MSGQGKRLMVMAGGTGGHVFPGLAVAHHLMAQGWQVRWLGTADRMEADLVPKHGIEIDFI RISGLRGKGIKALIAAPLRIFNAWRQARAIMKAYKPDVVLGMGGYVSGPGGLAAWSLGIP VVLHEQNGIAGLTNKWLAKIATKVMQAFPGAFPNAEVVGNPVRTDVLALPLPQQRLAGRE GPVRVLVVGGSQGARILNQTMPQVAAKLGDSVTIWHQSGKGSQQSVEQAYAEAGQPQHKV 
TEFIDDMAAAYAWADVVVCRSGALTVSEIAAAGLPALFVPFQHKDRQQYWNALPLEKAGA AKIIEQPQLSVDAVANTLAGWSRETLLTMAERARAASIPDATERVANEVSRVARA >gi|296494556|gb|ADTN01000182.1| GENE 15 16073 - 17548 1562 491 aa, chain + ## HITS:1 COG:murC KEGG:ns NR:ns ## COG: murC COG0773 # Protein_GI_number: 16128084 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate-alanine ligase # Organism: Escherichia coli K12 # 1 491 1 491 491 908 100.0 0 MNTQQLAKLRSIVPEMRRVRHIHFVGIGGAGMGGIAEVLANEGYQISGSDLAPNPVTQQL MNLGATIYFNHRPENVRDASVVVVSSAISADNPEIVAAHEARIPVIRRAEMLAELMRFRH GIAIAGTHGKTTTTAMVSSIYAEAGLDPTFVNGGLVKAAGVHARLGHGRYLIAEADESDA SFLHLQPMVAIVTNIEADHMDTYQGDFENLKQTFINFLHNLPFYGRAVMCVDDPVIRELL PRVGRQTTTYGFSEDADVRVEDYQQIGPQGHFTLLRQDKEPMRVTLNAPGRHNALNAAAA VAVATEEGIDDEAILRALESFQGTGRRFDFLGEFPLEPVNGKSGTAMLVDDYGHHPTEVD ATIKAARAGWPDKNLVMLFQPHRFTRTRDLYDDFANVLTQVDTLLMLEVYPAGEAPIPGA DSRSLCRTIRGRGKIDPILVPDPARVAEMLAPVLTGNDLILVQGAGNIGKIARSLAEIKL KPQTPEEEQHD >gi|296494556|gb|ADTN01000182.1| GENE 16 17541 - 18461 1026 306 aa, chain + ## HITS:1 COG:ddlB KEGG:ns NR:ns ## COG: ddlB COG1181 # Protein_GI_number: 16128085 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Escherichia coli K12 # 1 306 1 306 306 581 100.0 1e-166 MTDKIAVLLGGTSAEREVSLNSGAAVLAGLREGGIDAYPVDPKEVDVTQLKSMGFQKVFI ALHGRGGEDGTLQGMLELMGLPYTGSGVMASALSMDKLRSKLLWQGAGLPVAPWVALTRA EFEKGLSDKQLAEISALGLPVIVKPSREGSSVGMSKVVAENALQDALRLAFQHDEEVLIE KWLSGPEFTVAILGEEILPSIRIQPSGTFYDYEAKYLSDETQYFCPAGLEASQEANLQAL VLKAWTTLGCKGWGRIDVMLDSDGQFYLLEANTSPGMTSHSLVPMAARQAGMSFSQLVVR ILELAD >gi|296494556|gb|ADTN01000182.1| GENE 17 18463 - 19293 671 276 aa, chain + ## HITS:1 COG:ftsQ KEGG:ns NR:ns ## COG: ftsQ COG1589 # Protein_GI_number: 16128086 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division septal protein # Organism: Escherichia coli K12 # 1 265 1 265 276 490 100.0 1e-138 MSQAALNTRNSEEEVSSRRNNGTRLAGILFLLTVLTTVLVSGWVVLGWMEDAQRLPLSKL 
VLTGERHYTRNDDIRQSILALGEPGTFMTQDVNIIQTQIEQRLPWIKQVSVRKQWPDELK IHLVEYVPIARWNDQHMVDAEGNTFSVPPERTSKQVLPMLYGPEGSANEVLQGYREMGQM LAKDRFTLKEAAMTARRSWQLTLNNDIKLNLGRGDTMKRLARFVELYPVLQQQAQTDGKR ISYVDLRYDSGAAVGWAPLPPEESTQQQNQAQAEQQ >gi|296494556|gb|ADTN01000182.1| GENE 18 19290 - 20552 1175 420 aa, chain + ## HITS:1 COG:ECs0098 KEGG:ns NR:ns ## COG: ECs0098 COG0849 # Protein_GI_number: 15829352 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell division # Organism: Escherichia coli O157:H7 # 1 420 1 420 420 818 100.0 0 MIKATDRKLVVGLEIGTAKVAALVGEVLPDGMVNIIGVGSCPSRGMDKGGVNDLESVVKC VQRAIDQAELMADCQISSVYLALSGKHISCQNEIGMVPISEEEVTQEDVENVVHTAKSVR VRDEHRVLHVIPQEYAIDYQEGIKNPVGLSGVRMQAKVHLITCHNDMAKNIVKAVERCGL KVDQLIFAGLASSYSVLTEDERELGVCVVDIGGGTMDIAVYTGGALRHTKVIPYAGNVVT SDIAYAFGTPPSDAEAIKVRHGCALGSIVGKDESVEVPSVGGRPPRSLQRQTLAEVIEPR YTELLNLVNEEILQLQEKLRQQGVKHHLAAGIVLTGGAAQIEGLAACAQRVFHTQVRIGA PLNITGLTDYAQEPYYSTAVGLLHYGKESHLNGEAEVEKRVTASVGSWIKRLNSWLRKEF >gi|296494556|gb|ADTN01000182.1| GENE 19 20613 - 21764 1359 383 aa, chain + ## HITS:1 COG:ECs0099 KEGG:ns NR:ns ## COG: ECs0099 COG0206 # Protein_GI_number: 15829353 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Escherichia coli O157:H7 # 1 383 1 383 383 642 100.0 0 MFEPMELTNDAVIKVIGVGGGGGNAVEHMVRERIEGVEFFAVNTDAQALRKTAVGQTIQI GSGITKGLGAGANPEVGRNAADEDRDALRAALEGADMVFIAAGMGGGTGTGAAPVVAEVA KDLGILTVAVVTKPFNFEGKKRMAFAEQGITELSKHVDSLITIPNDKLLKVLGRGISLLD AFGAANDVLKGAVQGIAELITRPGLMNVDFADVRTVMSEMGYAMMGSGVASGEDRAEEAA EMAISSPLLEDIDLSGARGVLVNITAGFDLRLDEFETVGNTIRAFASDNATVVIGTSLDP DMNDELRVTVVATGIGMDKRPEITLVTNKQVQQPVMDRYQQHGMAPLTQEQKPVAKVVND NAPQTAKEPDYLDIPAFLRKQAD >gi|296494556|gb|ADTN01000182.1| GENE 20 21865 - 22782 783 305 aa, chain + ## HITS:1 COG:ECs0100 KEGG:ns NR:ns ## COG: ECs0100 COG0774 # Protein_GI_number: 15829354 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
UDP-3-O-acyl-N-acetylglucosamine deacetylase # Organism: Escherichia coli O157:H7 # 1 305 1 305 305 636 100.0 0 MIKQRTLKRIVQATGVGLHTGKKVTLTLRPAPANTGVIYRRTDLNPPVDFPADAKSVRDT MLCTCLVNEHDVRISTVEHLNAALAGLGIDNIVIEVNAPEIPIMDGSAAPFVYLLLDAGI DELNCAKKFVRIKETVRVEDGDKWAEFKPYNGFSLDFTIDFNHPAIDSSNQRYAMNFSAD AFMRQISRARTFGFMRDIEYLQSRGLCLGGSFDCAIVVDDYRVLNEDGLRFEDEFVRHKM LDAIGDLFMCGHNIIGAFTAYKSGHALNNKLLQAVLAKQEAWEYVTFQDDAELPLAFKAP SAVLA >gi|296494556|gb|ADTN01000182.1| GENE 21 22938 - 23525 315 195 aa, chain + ## HITS:1 COG:no KEGG:G2583_0101 NR:ns ## KEGG: G2583_0101 # Name: secM # Def: secretion monitor protein # Organism: E.coli_O55_H7 # Pathway: Protein export [PATH:eok03060]; Bacterial secretion system [PATH:eok03070] # 1 195 33 227 227 384 98.0 1e-106 MLWTAGFNDKICALNTFEYDRDGNNVSGILTRWRQFGKRYFWPHLLLGMVAASLGLPALS NAAEPNAPAKATTRNHEPSAKVNFGQLALLEANTRRPNSNYSVDYWHQHAIRTVIRHLSF AMAPQTLPVAEESLPLQAQHLALLDTLSALLTQEGTPSEKGYRIDYAHFTPQAKFSTPVW ISQAQGIRAGPQRLT >gi|296494556|gb|ADTN01000182.1| GENE 22 23587 - 26292 3668 901 aa, chain + ## HITS:1 COG:secA KEGG:ns NR:ns ## COG: secA COG0653 # Protein_GI_number: 16128091 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Escherichia coli K12 # 1 901 1 901 901 1721 100.0 0 MLIKLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKGEVL ENLIPEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNA LTGKGVHVVTVNDYLAQRDAENNRPLFEFLGLTVGINLPGMPAPAKREAYAADITYGTNN EYGFDYLRDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSSEMYKRVN KIIPHLIRQEKEDSETFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYS PANIMLMHHVTAALRAHALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAK EGVQIQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMIR KDLPDLVYMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLN AKFHANEAAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEKIKAD WQVRHDAVLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDALMRIFAS DRVSGMMRKLGMKPGEAIEHPWVTKAIANAQRKVESRNFDIRKQLLEYDDVANDQRRAIY 
SQRNELLDVSDVSETINSIREDVFKATIDAYIPPQSLEEMWDIPGLQERLKNDFDLDLPI AEWLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLA AMDYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEVISTLSKVQVRMPEEVEEL EQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCPCGSGKKYKQCHGRL Q >gi|296494556|gb|ADTN01000182.1| GENE 23 26352 - 26741 337 129 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42631237|ref|ZP_00156775.1| COG0494: NTP pyrophosphohydrolases including oxidative damage repair enzymes [Haemophilus influenzae R2866] # 2 126 4 128 136 134 48 8e-31 MKKLQIAVGIIRNENNEIFITRRAADAHMANKLEFPGGKIEMGETPEQAVVRELQEEVGI TPQHFSLFEKLEYEFPDRHITLWFWLVERWEGEPWGKEGQPGEWMSLVGLNADDFPPANE PVIAKLKRL >gi|296494556|gb|ADTN01000182.1| GENE 24 26957 - 27154 129 65 aa, chain - ## HITS:1 COG:ECs0105 KEGG:ns NR:ns ## COG: ECs0105 COG3024 # Protein_GI_number: 15829359 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 65 1 65 65 117 100.0 7e-27 MSETITVNCPTCGKTVVWGEISPFRPFCSKRCQLIDLGEWAAEEKRIPSSGDLSESDDWS EEPKQ >gi|296494556|gb|ADTN01000182.1| GENE 25 27164 - 27907 760 247 aa, chain - ## HITS:1 COG:yacF KEGG:ns NR:ns ## COG: yacF COG4582 # Protein_GI_number: 16128095 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 247 1 247 247 484 100.0 1e-137 MQTQVLFEHPLNEKMRTWLRIEFLIQQLTVNLPIVDHAGALHFFRNVSELLDVFERGEVR TELLKELDRQQRKLQTWIGVPGVDQSRIEALIQQLKAAGSVLISAPRIGQFLREDRLIAL VRQRLSIPGGCCSFDLPTLHIWLHLPQAQRDSQVETWIASLNPLTQALTMVLDLIRQSAP FRKQTSLNGFYQDNGGDADLLRLNLSLDSQLYPQISGHKSRFAIRFMPLDTENGQVPERL DFELACC >gi|296494556|gb|ADTN01000182.1| GENE 26 27907 - 28527 590 206 aa, chain - ## HITS:1 COG:ECs0107 KEGG:ns NR:ns ## COG: ECs0107 COG0237 # Protein_GI_number: 15829361 # Func_class: H Coenzyme transport and metabolism # Function: Dephospho-CoA kinase # Organism: Escherichia coli O157:H7 # 1 206 1 206 206 360 100.0 1e-100 MRYIVALTGGIGSGKSTVANAFADLGINVIDADIIARQVVEPGAPALHAIADHFGANMIA 
ADGTLQRRALRERIFANPEEKNWLNALLHPLIQQETQHQIQQATSPYVLWVVPLLVENSL YKKANRVLVVDVSPETQLKRTMQRDDVTREHVEQILAAQATREARLAVADDVIDNNGAPD AIASDVARLHAHYLQLASQFVSQEKP >gi|296494556|gb|ADTN01000182.1| GENE 27 28752 - 29795 1061 347 aa, chain + ## HITS:1 COG:guaC KEGG:ns NR:ns ## COG: guaC COG0516 # Protein_GI_number: 16128097 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Escherichia coli K12 # 1 347 1 347 347 697 100.0 0 MRIEEDLKLGFKDVLIRPKRSTLKSRSDVELERQFTFKHSGQSWSGVPIIAANMDTVGTF SMASALASFDILTAVHKHYSVEEWQAFINNSSADVLKHVMVSTGTSDADFEKTKQILDLN PALNFVCIDVANGYSEHFVQFVAKAREAWPTKTICAGNVVTGEMCEELILSGADIVKVGI GPGSVCTTRVKTGVGYPQLSAVIECADAAHGLGGMIVSDGGCTTPGDVAKAFGGGADFVM LGGMLAGHEESGGRIVEENGEKFMLFYGMSSESAMKRHVGGVAEYRAAEGKTVKLPLRGP VENTARDILGGLRSACTYVGASRLKELTKRTTFIRVQEQENRIFNNL >gi|296494556|gb|ADTN01000182.1| GENE 28 29830 - 31032 899 400 aa, chain - ## HITS:1 COG:hofC KEGG:ns NR:ns ## COG: hofC COG1459 # Protein_GI_number: 16128099 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Escherichia coli K12 # 1 400 1 400 400 717 100.0 0 MASKQLWRWHGITGDGNAQDGMLWAESRTLLLMALQQQMVTPLSLKRIAINSAQWRGDKS AEVIHQLATLLKAGLTLSEGLALLAEQHPSKQWQALLQSLAHDLEQGIAFSNALLPWSEV FPPLYQAMIRTGELTGKLDECCFELARQQKAQRQLTDKVKSALRYPIIILAMAIMVVVAM LHFVLPEFAAIYKTFNTPLPALTQGIMTLADFSGEWSWLLVLFGFLLAIANKLLMRRPTW LIVRQKLLLRIPIMGSLMRGQKLTQIFTILALTQSAGITFLQGVESVRETMRCPYWVQLL TQIQHDISNGQPIWLALKNTGEFSPLCLQLVRTGEASGSLDLMLDNLAHHHRENTMALAD NLAALLEPALLIITGGIIGTLVVAMYLPIFHLGDAMSGMG >gi|296494556|gb|ADTN01000182.1| GENE 29 31022 - 32407 730 461 aa, chain - ## HITS:1 COG:hofB KEGG:ns NR:ns ## COG: hofB COG2804 # Protein_GI_number: 16128100 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB # Organism: Escherichia coli K12 # 1 461 1 461 461 898 
100.0 0 MNIPQLTALCLRYHGVLLDASEEVVHVAVVDAPSHELLDALHFATTKRIEITCWTRQQME GHASRTQQTLPVAVQEKHQPKAELLTRTLQSALEQRASDIHIEPADNAYRIRLRIDGVLH PLPDVSPDAGVALTARLKVLGNLDIAEHRLPQDGQFTVELAGNAVSFRIATLPCRGGEKV VLRLLQQVGQALDVNTLGMQPLQLADFAHALQQPQGLVLVTGPTGSGKTVTLYSALQKLN TADINICSVEDPVEIPIAGLNQTQIHPRAGLTFQGVLRALLRQDPDVIMIGEIRDGETAE IAIKAAQTGHLVLSTLHTNSTCETLVRLQQMGVARWMLSSALTLVIAQRLVRKLCPHCRR QQGEPIHIPDNVWPSPLPHWQAPGCVHCYHGFYGRTALFEVLPITPVIRQLISANTDVES LETHARQAGMRTLFENGCLAVEQGLTTFEELIRVLGMPHGE >gi|296494556|gb|ADTN01000182.1| GENE 30 32417 - 32857 467 146 aa, chain - ## HITS:1 COG:ppdD KEGG:ns NR:ns ## COG: ppdD COG4969 # Protein_GI_number: 16128101 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Tfp pilus assembly protein, major pilin PilA # Organism: Escherichia coli K12 # 1 146 1 146 146 293 100.0 7e-80 MDKQRGFTLIELMVVIGIIAILSAIGIPAYQNYLRKAALTDMLQTFVPYRTAVELCALEH GGLDTCDGGSNGIPSPTTTRYVSAMSVAKGVVSLTGQESLNGLSVVMTPGWDNANGVTGW TRNCNIQSDSALQQACEDVFRFDDAN >gi|296494556|gb|ADTN01000182.1| GENE 31 33060 - 33953 460 297 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 [Kordia algicida OT-1] # 27 295 15 283 286 181 37 4e-45 MPPRRYNPDTRRDELLERINLDIPGAVAQALREDLGGTVDANNDITAKLLPENSRSHATV ITRENGVFCGKRWVEEVFIQLAGDDVTIIWHVDDGDVINANQSLFELEGPSRVLLTGERT ALNFVQTLSGVASKVRHYVELLEGTNTQLLDTRKTLPGLRSALKYAVLCGGGANHRLGLS DAFLIKENHIIASGSVRQAVEKASWLHPDAPVEVEVENLEELDEALKAGADIIMLDNFET EQMREAVKRTNGKALLEVSGNVTDKTLREFAETGVDFISVGALTKHVQALDLSMRFR >gi|296494556|gb|ADTN01000182.1| GENE 32 34041 - 34592 528 183 aa, chain + ## HITS:1 COG:ampD KEGG:ns NR:ns ## COG: ampD COG3023 # Protein_GI_number: 16128103 # Func_class: V Defense mechanisms # Function: Negative regulator of beta-lactamase expression # Organism: Escherichia coli K12 # 1 183 1 183 183 384 100.0 1e-107 MLLEQGWLVGARRVPSPHYDCRPDDETPTLLVVHNISLPPGEFGGPWIDALFTGTIDPQA HPFFAEIAHLRVSAHCLIRRDGEIVQYVPFDKRAWHAGVSQYQGRERCNDFSIGIELEGT 
DTLAYTDAQYQQLAAVTRALIDCYPDIAKNMTGHCDIAPDRKTDPGPAFDWARFRVLVSK ETT >gi|296494556|gb|ADTN01000182.1| GENE 33 34589 - 35443 933 284 aa, chain + ## HITS:1 COG:ampE KEGG:ns NR:ns ## COG: ampE COG3725 # Protein_GI_number: 16128104 # Func_class: V Defense mechanisms # Function: Membrane protein required for beta-lactamase induction # Organism: Escherichia coli K12 # 1 284 1 284 284 518 100.0 1e-147 MTLFTTLLVLIFERLFKLGEHWQLDHRLEAFFRRVKHFSLGRTLGMTIIAMGVTFLLLRA LQGVLFNVPTLLVWLLIGLLCIGAGKVRLHYHAYLTAASRNDSHARATMAGELTMIHGVP AGCDEREYLRELQNALLWINFRFYLAPLFWLIVGGTWGPVTLMGYAFLRAWQYWLARYQT PHHRLQSGIDAVLHVLDWVPVRLAGVVYALIGHGEKALPAWFASLGDFHTSQYQVLTRLA QFSLAREPHVDKVETPKAAVSMAKKTSFVVVVVIALLTIYGALV Prediction of potential genes in microbial genomes Time: Sun May 15 23:45:54 2011 Seq name: gi|296494555|gb|ADTN01000183.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont381.2, whole genome shotgun sequence Length of sequence - 7498 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 27 - 1397 1505 ## COG1113 Gamma-aminobutyrate permease and related permeases - Prom 1489 - 1548 5.2 + Prom 1668 - 1727 7.4 2 2 Op 1 6/0.000 + CDS 1941 - 2705 777 ## COG2186 Transcriptional regulators + Term 2710 - 2749 2.8 + Prom 2730 - 2789 1.7 3 2 Op 2 13/0.000 + CDS 2866 - 5529 3490 ## COG2609 Pyruvate dehydrogenase complex, dehydrogenase (E1) component 4 2 Op 3 . 
+ CDS 5544 - 7436 2229 ## COG0508 Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes Predicted protein(s) >gi|296494555|gb|ADTN01000183.1| GENE 1 27 - 1397 1505 456 aa, chain - ## HITS:1 COG:aroP KEGG:ns NR:ns ## COG: aroP COG1113 # Protein_GI_number: 16128105 # Func_class: E Amino acid transport and metabolism # Function: Gamma-aminobutyrate permease and related permeases # Organism: Escherichia coli K12 # 1 456 2 457 457 824 99.0 0 MEGQQHGEQLKRGLKNRHIQLIALGGAIGTGLFLGSASVIQSAGPGIILGYAIAGFIAFL IMRQLGEMVVEEPVAGSFSHFAYKYWGSFAGFASGWNYWVLYVLVAMAELTAVGKYIQFW YPEIPTWVSAAVFFVVINAINLTNVKVFGEMEFWFAIIKVIAVVAMIIFGGWLLFSGNGG PQATVSNLWDQGGFLPHGFTGLVMMMAIIMFSFGGLELVGITAAEADNPEQSIPKATNQV IYRILIFYIGSLAVLLSLMPWPRVTADTSPFVLIFHELGDTFVANALNIVVLTAALSVYN SCVYCNSRMLFGLAQQGNAPKALASVDKRGVPVNTILVSALVTALCVLINYLAPESAFGL LMALVVSALVINWAMISLAHMKFRRAKQEQGVVTRFPALLYPLGNWICLLFMAAVLVIML MTPGMAISVYLIPVWLIVLGIGYLFKEKTAKAVKAH >gi|296494555|gb|ADTN01000183.1| GENE 2 1941 - 2705 777 254 aa, chain + ## HITS:1 COG:ECs0117 KEGG:ns NR:ns ## COG: ECs0117 COG2186 # Protein_GI_number: 15829371 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 254 1 254 254 402 100.0 1e-112 MAYSKIRQPKLSDVIEQQLEFLILEGTLRPGEKLPPERELAKQFDVSRPSLREAIQRLEA KGLLLRRQGGGTFVQSSLWQSFSDPLVELLSDHPESQYDLLETRHALEGIAAYYAALRST DEDKERIRELHHAIELAQQSGDLDAESNAVLQYQIAVTEAAHNVVLLHLLRCMEPMLAQN VRQNFELLYSRREMLPLVSSHRTRIFEAIMAGKPEEAREASHRHLAFIEEILLDRSREES RRERSLRRLEQRKN >gi|296494555|gb|ADTN01000183.1| GENE 3 2866 - 5529 3490 887 aa, chain + ## HITS:1 COG:ECs0118 KEGG:ns NR:ns ## COG: ECs0118 COG2609 # Protein_GI_number: 15829372 # Func_class: C Energy production and conversion # Function: Pyruvate dehydrogenase complex, dehydrogenase (E1) component # Organism: Escherichia coli O157:H7 # 1 887 1 887 887 1837 100.0 0 MSERFPNDVDPIETRDWLQAIESVIREEGVERAQYLIDQLLAEARKGGVNVAAGTGISNY 
INTIPVEEQPEYPGNLELERRIRSAIRWNAIMTVLRASKKDLELGGHMASFQSSATIYDV CFNHFFRARNEQDGGDLVYFQGHISPGVYARAFLEGRLTQEQLDNFRQEVHGNGLSSYPH PKLMPEFWQFPTVSMGLGPIGAIYQAKFLKYLEHRGLKDTSKQTVYAFLGDGEMDEPESK GAITIATREKLDNLVFVINCNLQRLDGPVTGNGKIINELEGIFEGAGWNVIKVMWGSRWD ELLRKDTSGKLIQLMNETVDGDYQTFKSKDGAYVREHFFGKYPETAALVADWTDEQIWAL NRGGHDPKKIYAAFKKAQETKGKATVILAHTIKGYGMGDAAEGKNIAHQVKKMNMDGVRH IRDRFNVPVSDADIEKLPYITFPEGSEEHTYLHAQRQKLHGYLPSRQPNFTEKLELPSLQ DFGALLEEQSKEISTTIAFVRALNVMLKNKSIKDRLVPIIADEARTFGMEGLFRQIGIYS PNGQQYTPQDREQVAYYKEDEKGQILQEGINELGAGCSWLAAATSYSTNNLPMIPFYIYY SMFGFQRIGDLCWAAGDQQARGFLIGGTSGRTTLNGEGLQHEDGHSHIQSLTIPNCISYD PAYAYEVAVIMHDGLERMYGEKQENVYYYITTLNENYHMPAMPEGAEEGIRKGIYKLETI EGSKGKVQLLGSGSILRHVREAAEILAKDYGVGSDVYSVTSFTELARDGQDCERWNMLHP LETPRVPYIAQVMNDAPAVASTDYMKLFAEQVRTYVPADDYRVLGTDGFGRSDSRENLRH HFEVDASYVVVAALGELAKRGEIDKKVVADAIAKFNIDADKVNPRLA >gi|296494555|gb|ADTN01000183.1| GENE 4 5544 - 7436 2229 630 aa, chain + ## HITS:1 COG:ECs0119 KEGG:ns NR:ns ## COG: ECs0119 COG0508 # Protein_GI_number: 15829373 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes # Organism: Escherichia coli O157:H7 # 1 630 1 630 630 945 99.0 0 MAIEIKVPDIGADEVEITEILVKVGDKVEAEQSLITVEGDKASMEVPSPQAGIVKEIKVS VGDKTQTGALIMIFDSADGAADAAPAQAEEKKEAAPAAAPAAAAAKDVNVPDIGSDEVEV TEILVKVGDKVEAEQSLITVEGDKASMEVPAPFAGTVKEIKVNVGDKVSTGSLIMVFEVA GEAGAAAPAAKQEAAPAAAPAPAAGVKEVNVPDIGGDEVEVTEVMVKVGDKVAAEQSLIT VEGDKASMEVPAPFAGVVKELKVNVGDKVKTGSLIMIFEVEGAAPAAAPAKQEAAAPAPA AKAEAPAAAPAAKAEGKSEFAENDAYVHATPLIRRLAREFGVNLAKVKGTGRKGRILRED VQAYVKEAIKRAEAAPAATGGGIPGMLPWPKVDFSKFGEIEEVELGRIQKISGANLSRNW VMIPHVTHFDKTDITELEAFRKQQNEEAAKRKLDVKITPVVFIMKAVAAALEQMPRFNSS LSEDGQRLTLKKYINIGVAVDTPNGLVVPVFKDVNKKGIIELSRELMTISKKARDGKLTA GEMQGGCFTISSIGGLGTTHFAPIVNAPEVAILGVSKSAMEPVWNGKEFVPRLMLPISLS FDHRVIDGADGARFITIINNTLSDIRRLVM Prediction of potential genes in microbial genomes Time: Sun May 15 23:46:00 2011 Seq name: 
gi|296494554|gb|ADTN01000184.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont381.3, whole genome shotgun sequence Length of sequence - 18705 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 12, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 56 - 115 7.8 1 1 Tu 1 . + CDS 310 - 1734 682 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 + Term 1758 - 1813 5.0 - Term 1723 - 1769 3.7 2 2 Tu 1 . - CDS 1805 - 3658 657 ## JW0113 hypothetical protein - Prom 3872 - 3931 4.4 + Prom 3847 - 3906 7.7 3 3 Tu 1 . + CDS 4013 - 6610 3192 ## COG1049 Aconitase B + Term 6646 - 6683 6.1 + Prom 6628 - 6687 6.4 4 4 Tu 1 . + CDS 6786 - 7148 482 ## COG3112 Uncharacterized protein conserved in bacteria + Term 7304 - 7342 2.2 - Term 7140 - 7181 10.2 5 5 Op 1 9/0.000 - CDS 7186 - 7980 800 ## COG1586 S-adenosylmethionine decarboxylase 6 5 Op 2 . - CDS 7996 - 8862 855 ## COG0421 Spermidine synthase - Prom 8901 - 8960 4.8 - Term 8909 - 8942 3.8 7 6 Tu 1 . - CDS 8968 - 9315 340 ## ECSE_0122 hypothetical protein - Prom 9457 - 9516 2.0 + Prom 9383 - 9442 4.8 8 7 Tu 1 . + CDS 9481 - 11031 1691 ## COG2132 Putative multicopper oxidases - Term 11084 - 11121 2.2 9 8 Tu 1 . - CDS 11233 - 13623 2633 ## COG4993 Glucose dehydrogenase - Prom 13660 - 13719 3.7 + Prom 13631 - 13690 6.4 10 9 Tu 1 . + CDS 13829 - 14365 884 ## COG0634 Hypoxanthine-guanine phosphoribosyltransferase + Term 14374 - 14412 4.2 11 10 Tu 1 . 
- CDS 14406 - 15068 593 ## COG0288 Carbonic anhydrase - Prom 15164 - 15223 5.4 + Prom 15079 - 15138 4.2 12 11 Op 1 45/0.000 + CDS 15177 - 16103 946 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 13 11 Op 2 3/0.600 + CDS 16100 - 16870 861 ## COG0842 ABC-type multidrug transport system, permease component + Term 16878 - 16917 6.6 14 12 Op 1 4/0.400 + CDS 16975 - 17415 341 ## COG2893 Phosphotransferase system, mannose/fructose-specific component IIA + Term 17431 - 17466 1.1 15 12 Op 2 . + CDS 17479 - 18703 647 ## COG0726 Predicted xylanase/chitin deacetylase Predicted protein(s) >gi|296494554|gb|ADTN01000184.1| GENE 1 310 - 1734 682 474 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 6 452 3 444 458 267 33 5e-71 MSTEIKTQVVVLGAGPAGYSAAFRCADLGLETVIVERYNTLGGVCLNVGCIPSKALLHVA KVIEEAKALAEHGIVFGEPKTDIDKIRTWKEKVINQLTGGLAGMAKGRKVKVVNGLGKFT GANTLEVEGENGKTVINFDNAIIAAGSRPIQLPFIPHEDPRIWDSTDALELKEVPERLLV MGGGIIGLEMGTVYHALGSQIDVVEMFDQVIPAADKDIVKVFTKRISKKFNLMLETKVTA VEAKEDGIYVTMEGKKAPAEPQRYDAVLVAIGRVPNGKNLDAGKAGVEVDDRGFIRVDKQ LRTNVPHIFAIGDIVGQPMLAHKGVHEGHVAAEVIAGKKHYFDPKVIPSIAYTEPEVAWV GLTEKEAKEKGISYETATFPWAASGRAIASDCADGMTKLIFDKESHRVIGGAIVGTNGGE LLGEIGLAIEMGCDAEDIALTIHAHPTLHESVGLAAEVFEGSITDLPNPKAKKK >gi|296494554|gb|ADTN01000184.1| GENE 2 1805 - 3658 657 617 aa, chain - ## HITS:1 COG:no KEGG:JW0113 NR:ns ## KEGG: JW0113 # Name: yacH # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 617 1 617 617 1018 100.0 0 MKMTLPFKPHVLALICSAGLCAASTGLYIKSRTVEAPVEPQSTQLAVSDAAAVTFPATVS APPVTPAVVKSAFSTAQIDQWVAPVALYPDALLSQVLMASTYPTNVAQAVQWSHDNPLKQ GDAAIQAVSDQPWDASVKSLVAFPQLMALMGENPQWVQNLGDAFLAQPQDVMDSVQRLRQ LAQQTGSLKSSTEQKVITTTKKAVPVKQTVTAPVIPSNTVLTANPVITEPATTVISIEPA NPDVVYIPNYNPTVVYGNWANTAYPPVYLPPPAGEPFVDSFVRGFGYSMGVATTYALFSS IDWDDDDHDHHHHDNDDYHHHDGGHRDGNGWQHNGDNINIDVNNFNRITGEHLTDKNMAW RHNPNYRNGVPYHDQDMAKRFHQTDVNGGMSATQLPAPTRDSQRQAAANQFQQRTHAAPV 
ITRDTQRQAAAQRFNEAEHYGSYDDFHDFSRRQPLTQQQKDAARQRYQSASPEQRQAVRE RMQTNPKIQQRREAARERIQSASPEQRQAVREKMQTNPQNQQRRDAARERIQSASPEQRQ VFKEKVQQRPLNQQQRDNARQRVQSASPEQRQVFREKVQESRPQRLNDSNHTVRLNNEQR SAVCERLSERGARRLER >gi|296494554|gb|ADTN01000184.1| GENE 3 4013 - 6610 3192 865 aa, chain + ## HITS:1 COG:acnB KEGG:ns NR:ns ## COG: acnB COG1049 # Protein_GI_number: 16128111 # Func_class: C Energy production and conversion # Function: Aconitase B # Organism: Escherichia coli K12 # 1 865 1 865 865 1705 100.0 0 MLEEYRKHVAERAAEGIAPKPLDANQMAALVELLKNPPAGEEEFLLDLLTNRVPPGVDEA AYVKAGFLAAIAKGEAKSPLLTPEKAIELLGTMQGGYNIHPLIDALDDAKLAPIAAKALS HTLLMFDNFYDVEEKAKAGNEYAKQVMQSWADAEWFLNRPALAEKLTVTVFKVTGETNTD DLSPAPDAWSRPDIPLHALAMLKNAREGIEPDQPGVVGPIKQIEALQQKGFPLAYVGDVV GTGSSRKSATNSVLWFMGDDIPHVPNKRGGGLCLGGKIAPIFFNTMEDAGALPIEVDVSN LNMGDVIDVYPYKGEVRNHETGELLATFELKTDVLIDEVRAGGRIPLIIGRGLTTKAREA LGLPHSDVFRQAKDVAESDRGFSLAQKMVGRACGVKGIRPGAYCEPKMTSVGSQDTTGPM TRDELKDLACLGFSADLVMQSFCHTAAYPKPVDVNTHHTLPDFIMNRGGVSLRPGDGVIH SWLNRMLLPDTVGTGGDSHTRFPIGISFPAGSGLVAFAAATGVMPLDMPESVLVRFKGKM QPGITLRDLVHAIPLYAIKQGLLTVEKKGKKNIFSGRILEIEGLPDLKVEQAFELTDASA ERSAAGCTIKLNKEPIIEYLNSNIVLLKWMIAEGYGDRRTLERRIQGMEKWLANPELLEA DADAEYAAVIDIDLADIKEPILCAPNDPDDARPLSAVQGEKIDEVFIGSCMTNIGHFRAA GKLLDAHKGQLPTRLWVAPPTRMDAAQLTEEGYYSVFGKSGARIEIPGCSLCMGNQARVA DGATVVSTSTRNFPNRLGTGANVFLASAELAAVAALIGKLPTPEEYQTYVAQVDKTAVDT YRYLNFNQLSQYTEKADGVIFQTAV >gi|296494554|gb|ADTN01000184.1| GENE 4 6786 - 7148 482 120 aa, chain + ## HITS:1 COG:ECs0123 KEGG:ns NR:ns ## COG: ECs0123 COG3112 # Protein_GI_number: 15829377 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 120 17 136 136 217 100.0 3e-57 MDYEFLRDITGVVKVRMSMGHEVVGHWFNEEVKENLALLDEVEQAAHALKGSERSWQRAG HEYTLWMDGEEVMVRANQLEFAGDEMEEGMNYYDEESLSLCGVEDFLQVVAAYRNFVQQK >gi|296494554|gb|ADTN01000184.1| GENE 5 7186 - 7980 800 264 aa, chain - ## HITS:1 COG:ECs0124 KEGG:ns NR:ns ## COG: ECs0124 COG1586 # Protein_GI_number: 15829378 # Func_class: E 
Amino acid transport and metabolism # Function: S-adenosylmethionine decarboxylase # Organism: Escherichia coli O157:H7 # 1 264 1 264 264 537 100.0 1e-153 MKKLKLHGFNNLTKSLSFCIYDICYAKTAEERDGYIAYIDELYNANRLTEILSETCSIIG ANILNIARQDYEPQGASVTILVSEEPVDPKLIDKTEHPGPLPETVVAHLDKSHICVHTYP ESHPEGGLCTFRADIEVSTCGVISPLKALNYLIHQLESDIVTIDYRVRGFTRDINGMKHF IDHEINSIQNFMSDDMKALYDMVDVNVYQENIFHTKMLLKEFDLKHYMFHTKPEDLTDSE RQEITAALWKEMREIYYGRNMPAV >gi|296494554|gb|ADTN01000184.1| GENE 6 7996 - 8862 855 288 aa, chain - ## HITS:1 COG:speE KEGG:ns NR:ns ## COG: speE COG0421 # Protein_GI_number: 16128114 # Func_class: E Amino acid transport and metabolism # Function: Spermidine synthase # Organism: Escherichia coli K12 # 1 288 1 288 288 602 100.0 1e-172 MAEKKQWHETLHDQFGQYFAVDNVLYHEKTDHQDLIIFENAAFGRVMALDGVVQTTERDE FIYHEMMTHVPLLAHGHAKHVLIIGGGDGAMLREVTRHKNVESITMVEIDAGVVSFCRQY LPNHNAGSYDDPRFKLVIDDGVNFVNQTSQTFDVIISDCTDPIGPGESLFTSAFYEGCKR CLNPGGIFVAQNGVCFLQQEEAIDSHRKLSHYFSDVGFYQAAIPTYYGGIMTFAWATDND ALRHLSTEIIQARFLASGLKCRYYNPAIHTAAFALPQYLQDALASQPS >gi|296494554|gb|ADTN01000184.1| GENE 7 8968 - 9315 340 115 aa, chain - ## HITS:1 COG:no KEGG:ECSE_0122 NR:ns ## KEGG: ECSE_0122 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 115 42 156 156 231 100.0 8e-60 MKTFFRTVLFGSLMAVCANSYALSESEAEDMADLTAVFVFLKNDCGYQNLPNGQIRRALV FFAQQNQWDLSNYDTFDMKALGEDSYRDLSGIGIPVAKKCKALARDSLSLLAYVK >gi|296494554|gb|ADTN01000184.1| GENE 8 9481 - 11031 1691 516 aa, chain + ## HITS:1 COG:yacK KEGG:ns NR:ns ## COG: yacK COG2132 # Protein_GI_number: 16128116 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Putative multicopper oxidases # Organism: Escherichia coli K12 # 1 516 1 516 516 1051 100.0 0 MQRRDFLKYSVALGVASALPLWSRAVFAAERPTLPIPDLLTTDARNRIQLTIGAGQSTFG GKTATTWGYNGNLLGPAVKLQRGKAVTVDIYNQLTEETTLHWHGLEVPGEVDGGPQGIIP PGGKRSVTLNVDQPAATCWFHPHQHGKTGRQVAMGLAGLVVIEDDEILKLMLPKQWGIDD VPVIVQDKKFSADGQIDYQLDVMTAAVGWFGDTLLTNGAIYPQHAAPRGWLRLRLLNGCN 
ARSLNFATSDNRPLYVIASDGGLLPEPVKVSELPVLMGERFEVLVEVNDNKPFDLVTLPV SQMGMAIAPFDKPHPVMRIQPIAISASGALPDTLSSLPALPSLEGLTVRKLQLSMDPMLD MMGMQMLMEKYGDQAMAGMDHSQMMGHMGHGNMNHMNHGGKFDFHHANKINGQAFDMNKP MFAAAKGQYERWVISGVGDMMLHPFHIHGTQFRILSENGKPPAAHRAGWKDTVKVEGNVS EVLVKFNHDAPKEHAYMAHCHLLEHEDTGMMLGFTV >gi|296494554|gb|ADTN01000184.1| GENE 9 11233 - 13623 2633 796 aa, chain - ## HITS:1 COG:gcd KEGG:ns NR:ns ## COG: gcd COG4993 # Protein_GI_number: 16128117 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose dehydrogenase # Organism: Escherichia coli K12 # 1 796 1 796 796 1527 100.0 0 MAINNTGSRRLLVTLTALFAALCGLYLLIGGGWLVAIGGSWYYPIAGLVMLGVAWMLWRS KRAALWLYAALLLGTMIWGVWEVGFDFWALTPRSDILVFFGIWLILPFVWRRLVIPASGA VAALVVALLISGGILTWAGFNDPQEINGTLSADATPAEAISPVADQDWPAYGRNQEGQRF SPLKQINADNVHNLKEAWVFRTGDVKQPNDPGEITNEVTPIKVGDTLYLCTAHQRLFALD AASGKEKWHYDPELKTNESFQHVTCRGVSYHEAKAETASPEVMADCPRRIILPVNDGRLI AINAENGKLCETFANKGVLNLQSNMPDTKPGLYEPTSPPIITDKTIVMAGSVTDNFSTRE TSGVIRGFDVNTGELLWAFDPGAKDPNAIPSDEHTFTFNSPNSWAPAAYDAKLDLVYLPM GVTTPDIWGGNRTPEQERYASSILALNATTGKLAWSYQTVHHDLWDMDLPAQPTLADITV NGQKVPVIYAPAKTGNIFVLDRRNGELVVPAPEKPVPQGAAKGDYVTPTQPFSELSFRPT KDLSGADMWGATMFDQLVCRVMFHQMRYEGIFTPPSEQGTLVFPGNLGMFEWGGISVDPN REVAIANPMALPFVSKLIPRGPGNPMEQPKDAKGTGTESGIQPQYGVPYGVTLNPFLSPF GLPCKQPAWGYISALDLKTNEVVWKKRIGTPQDSMPFPMPVPVPFNMGMPMLGGPISTAG NVLFIAATADNYLRAYNMSNGEKLWQGRLPAGGQATPMTYEVNGKQYVVISAGGHGSFGT KMGDYIVAYALPDDVK >gi|296494554|gb|ADTN01000184.1| GENE 10 13829 - 14365 884 178 aa, chain + ## HITS:1 COG:ECs0129 KEGG:ns NR:ns ## COG: ECs0129 COG0634 # Protein_GI_number: 15829383 # Func_class: F Nucleotide transport and metabolism # Function: Hypoxanthine-guanine phosphoribosyltransferase # Organism: Escherichia coli O157:H7 # 1 178 5 182 182 337 100.0 9e-93 MKHTVEVMIPEAEIKARIAELGRQITERYKDSGSDMVLVGLLRGSFMFMADLCREVQVSH EVDFMTASSYGSGMSTTRDVKILKDLDEDIRGKDVLIVEDIIDSGNTLSKVREILSLREP KSLAICTLLDKPSRREVNVPVEFIGFSIPDEFVVGYGIDYAQRYRHLPYIGKVILLDE >gi|296494554|gb|ADTN01000184.1| GENE 11 14406 - 15068 593 220 aa, 
chain - ## HITS:1 COG:yadF KEGG:ns NR:ns ## COG: yadF COG0288 # Protein_GI_number: 16128119 # Func_class: P Inorganic ion transport and metabolism # Function: Carbonic anhydrase # Organism: Escherichia coli K12 # 1 220 1 220 220 457 100.0 1e-129 MKDIDTLISNNALWSKMLVEEDPGFFEKLAQAQKPRFLWIGCSDSRVPAERLTGLEPGEL FVHRNVANLVIHTDLNCLSVVQYAVDVLEVEHIIICGHYGCGGVQAAVENPELGLINNWL LHIRDIWFKHSSLLGEMPQERRLDTLCELNVMEQVYNLGHSTIMQSAWKRGQKVTIHGWA YGIHDGLLRDLDVTATNRETLEQRYRHGISNLKLKHANHK >gi|296494554|gb|ADTN01000184.1| GENE 12 15177 - 16103 946 308 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 1 302 1 304 311 369 63 1e-101 MTIALELQQLKKTYPGGVQALRGIDLQVEAGDFYALLGPNGAGKSTTIGIISSLVNKTSG RVSVFGYDLEKDVVNAKRQLGLVPQEFNFNPFETVQQIVVNQAGYYGVERKEAYIRSEKY LKQLDLWGKRNERARMLSGGMKRRLMIARALMHEPKLLILDEPTAGVDIELRRSMWGFLK DLNDKGTTIILTTHYLEEAEMLCRNIGIIQHGELVENTSMKALLAKLKSETFILDLAPKS PLPKLDGYQYRLVDTATLEVEVLREQGINSVFTQLSEQGIQVLSMRNKANRLEELFVSLV NEKQGDRA >gi|296494554|gb|ADTN01000184.1| GENE 13 16100 - 16870 861 256 aa, chain + ## HITS:1 COG:ECs0132 KEGG:ns NR:ns ## COG: ECs0132 COG0842 # Protein_GI_number: 15829386 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Escherichia coli O157:H7 # 1 256 1 256 256 431 100.0 1e-121 MMHLYWVALKSIWAKEIHRFMRIWVQTLVPPVITMTLYFIIFGNLIGSRIGDMHGFSYMQ FIVPGLIMMSVITNAYANVASSFFGAKFQRNIEELLVAPVPTHVIIAGYVGGGVARGLFV GILVTAISLFFVPFQVHSWVFVALTLVLTAVLFSLAGLLNGVFAKTFDDISLVPTFVLTP LTYLGGVFYSLTLLPPFWQGLSHLNPIVYMISGFRYGFLGINDVPLVTTFGVLVVFIVAF YLICWSLIQRGRGLRS >gi|296494554|gb|ADTN01000184.1| GENE 14 16975 - 17415 341 146 aa, chain + ## HITS:1 COG:yadI KEGG:ns NR:ns ## COG: yadI COG2893 # Protein_GI_number: 16128122 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose-specific component IIA # Organism: Escherichia coli K12 # 1 146 1 146 146 302 100.0 1e-82 
MLGWVITCHDDRAQEILDALEKKHGALLQCRAVNFWRGLSSNMLSRMMCDALHEADSGEG VIFLTDIAGAPPYRVASLLSHKHSRCEVISGVTLPLIEQMMACRETMTSSEFRERIVELG APEVSSLWHQQQKNPPFVLKHNLYEY >gi|296494554|gb|ADTN01000184.1| GENE 15 17479 - 18703 647 408 aa, chain + ## HITS:1 COG:yadE KEGG:ns NR:ns ## COG: yadE COG0726 # Protein_GI_number: 16128123 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Escherichia coli K12 # 1 401 1 401 409 824 100.0 0 MYKQAVILLLMLFTASVSAALPARYMQTIENAAVWAQIGDKMVTVGNIRAGQIIAVEPTA ASYYAFNFGFGKGFIDKGHLEPVQGRQKVEDGLGDLNKPLSNQNLVTWKDTPVYNAPSAG SAPFGVLADNLRYPILHKLKDRLNQTWYQIRIGDRLAYISALDAQPDNGLSVLTYHHILR DEENTRFRHTSTTTSVRAFNNQMAWLRDRGYATLSMVQLEGYVKNKINLPARAVVITFDD GLKSVSRYAYPVLKQYGMKATAFIVTSRIKRHPQKWNPKSLQFMSVSELNEIRDVFDFQS HTHFLHRVDGYRRPILLSRSEHNILFDFARSRRALAQFNPHVWYLSYPFGGFNDNAVKAA NDAGFHLAVTTMKGKVKPGDNPLLLKRLYILRTDSLETMSRGGGGGGG Prediction of potential genes in microbial genomes Time: Sun May 15 23:46:17 2011 Seq name: gi|296494553|gb|ADTN01000185.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont381.4, whole genome shotgun sequence Length of sequence - 18249 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 10, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 14 - 394 449 ## COG0853 Aspartate 1-decarboxylase - Prom 454 - 513 2.6 2 2 Tu 1 . + CDS 344 - 556 89 ## ECH74115_0140 hypothetical protein 3 3 Tu 1 . + CDS 668 - 1570 649 ## COG5464 Uncharacterized conserved protein + Term 1572 - 1616 2.1 - Term 1597 - 1639 10.2 4 4 Op 1 19/0.000 - CDS 1644 - 2495 1001 ## COG0414 Panthothenate synthetase 5 4 Op 2 3/0.500 - CDS 2507 - 3301 929 ## COG0413 Ketopantoate hydroxymethyltransferase - Prom 3329 - 3388 4.2 6 5 Tu 1 . - CDS 3415 - 4341 308 ## COG3539 P pilus assembly protein, pilin FimA 7 6 Op 1 . - CDS 4703 - 5299 110 ## JW0132 predicted fimbrial-like adhesin protein 8 6 Op 2 . 
- CDS 5326 - 5928 427 ## JW0133 predicted fimbrial-like adhesin protein 9 6 Op 3 6/0.000 - CDS 5943 - 6326 233 ## COG3539 P pilus assembly protein, pilin FimA - Prom 6384 - 6443 3.2 10 6 Op 4 10/0.000 - CDS 6529 - 9126 1456 ## COG3188 P pilus assembly protein, porin PapC 11 6 Op 5 7/0.000 - CDS 9161 - 9853 444 ## COG3121 P pilus assembly protein, chaperone PapD - Prom 9890 - 9949 7.2 - Term 9918 - 9965 5.4 12 6 Op 6 3/0.500 - CDS 9999 - 10583 501 ## COG3539 P pilus assembly protein, pilin FimA - Prom 10786 - 10845 5.4 - Term 10734 - 10779 1.0 13 7 Op 1 7/0.000 - CDS 10953 - 11432 208 ## PROTEIN SUPPORTED gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 14 7 Op 2 1/1.000 - CDS 11429 - 12793 1386 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase - Prom 12818 - 12877 3.5 15 8 Tu 1 3/0.500 - CDS 12886 - 13782 543 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases - Term 13788 - 13820 3.0 16 9 Op 1 6/0.000 - CDS 13849 - 14304 668 ## COG1734 DnaK suppressor protein - Prom 14358 - 14417 2.3 - Term 14382 - 14417 -0.1 17 9 Op 2 3/0.500 - CDS 14482 - 15186 329 ## COG1489 DNA-binding protein, stimulates sugar fermentation 18 9 Op 3 . - CDS 15201 - 15731 479 ## COG1514 2'-5' RNA ligase - Prom 15753 - 15812 4.3 19 10 Tu 1 . 
+ CDS 15760 - 18234 2256 ## COG1643 HrpA-like helicases Predicted protein(s) >gi|296494553|gb|ADTN01000185.1| GENE 1 14 - 394 449 126 aa, chain - ## HITS:1 COG:ECs0135 KEGG:ns NR:ns ## COG: ECs0135 COG0853 # Protein_GI_number: 15829389 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate 1-decarboxylase # Organism: Escherichia coli O157:H7 # 1 126 1 126 126 249 100.0 8e-67 MIRTMLQGKLHRVKVTHADLHYEGSCAIDQDFLDAAGILENEAIDIWNVTNGKRFSTYAI AAERGSRIISVNGAAAHCASVGDIVIIASFVTMPDEEARTWRPNVAYFEGDNEMKRTAKA IPVQVA >gi|296494553|gb|ADTN01000185.1| GENE 2 344 - 556 89 70 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_0140 NR:ns ## KEGG: ECH74115_0140 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 7 70 1 64 64 101 95.0 9e-21 MSHFHAVEFALQHRANHNFYLSTLSLTKQAMPALRKFSRSIARFLFSVYSSDGICVSSLR TAPKRAMYRL >gi|296494553|gb|ADTN01000185.1| GENE 3 668 - 1570 649 300 aa, chain + ## HITS:1 COG:yadD KEGG:ns NR:ns ## COG: yadD COG5464 # Protein_GI_number: 16128125 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 300 1 300 300 586 100.0 1e-167 MDAPSTTPHDAVFKQFLMHAETARDFLEIHLPVELRELCDLNTLHLESGSFIEESLKGHS TDVLYSVQMQGNPGYLHVVIEHQSKPDKKMAFRMMRYSIAAMHRHLEADHDKLPLVVPIL FYQGEATPYPLSMCWFDMFYSPELARRVYNSPFPLVDITITPDDEIMQHRRIAILELLQK HIRQRDLMLLLEQLVTLIDEGYTSGSQLVAMQNYMLQRGHTEQADLFYGVLRDRETGGES MMTLAQWFEEKGIEKGIQQGRQEVSQEFAQRLLSKGMSREDVAEMANLPLAEIDKVINLI >gi|296494553|gb|ADTN01000185.1| GENE 4 1644 - 2495 1001 283 aa, chain - ## HITS:1 COG:panC KEGG:ns NR:ns ## COG: panC COG0414 # Protein_GI_number: 16128126 # Func_class: H Coenzyme transport and metabolism # Function: Panthothenate synthetase # Organism: Escherichia coli K12 # 1 283 1 283 283 548 100.0 1e-156 MLIIETLPLLRQQIRRLRMEGKRVALVPTMGNLHDGHMKLVDEAKARADVVVVSIFVNPM QFDRPEDLARYPRTLQEDCEKLNKRKVDLVFAPSVKEIYPNGTETHTYVDVPGLSTMLEG ASRPGHFRGVSTIVSKLFNLVQPDIACFGEKDFQQLALIRKMVADMGFDIEIVGVPIMRA 
KDGLALSSRNGYLTAEQRKIAPGLYKVLSSIADKLQAGERDLDEIITIAGQELNEKGFRA DDIQIRDADTLLEVSETSKRAVILVAAWLGDARLIDNKMVELA >gi|296494553|gb|ADTN01000185.1| GENE 5 2507 - 3301 929 264 aa, chain - ## HITS:1 COG:panB KEGG:ns NR:ns ## COG: panB COG0413 # Protein_GI_number: 16128127 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate hydroxymethyltransferase # Organism: Escherichia coli K12 # 1 264 1 264 264 518 100.0 1e-147 MKPTTISLLQKYKQEKKRFATITAYDYSFAKLFADEGLNVMLVGDSLGMTVQGHDSTLPV TVADIAYHTAAVRRGAPNCLLLADLPFMAYATPEQAFENAATVMRAGANMVKIEGGEWLV ETVQMLTERAVPVCGHLGLTPQSVNIFGGYKVQGRGDEAGDQLLSDALALEAAGAQLLVL ECVPVELAKRITEALAIPVIGIGAGNVTDGQILVMHDAFGITGGHIPKFAKNFLAETGDI RAAVRQYMAEVESGVYPGEEHSFH >gi|296494553|gb|ADTN01000185.1| GENE 6 3415 - 4341 308 308 aa, chain - ## HITS:1 COG:yadC KEGG:ns NR:ns ## COG: yadC COG3539 # Protein_GI_number: 16128128 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 308 105 412 412 561 100.0 1e-160 MVYSGKDYGGHKLFNTSVPGLYYTMLISRVWSAYDTITDIQSPGIYIGDPSNQEFFFSVT DSDLQTKGCNKADDYDKFWAIGGIVHNITVEFYTDTNFDPTLNQQVQLSSSSNYLYSFKA YSPGTKVVDHSNHIYVNFTLNNVKLTLPTCFTSILTGPSVNGSTVRMGEYSSGTIKNGAS PVPFDISLQNCIRVRNIETKLVTGKVGTQNTQLLGNTLTGSTAAKGVGVLIEGLATSKNP LMTLKPNDTNSVYIDYETEDDTSDGVYPNQGNGTSQPLHFQATLKQDGNIAIEPGEFKAT STFQVTYP >gi|296494553|gb|ADTN01000185.1| GENE 7 4703 - 5299 110 198 aa, chain - ## HITS:1 COG:no KEGG:JW0132 NR:ns ## KEGG: JW0132 # Name: yadK # Def: predicted fimbrial-like adhesin protein # Organism: E.coli_J # Pathway: not_defined # 1 198 1 198 198 382 100.0 1e-105 MHPTQRKLMKRIILFLSLLFCIACPAIAGQDIDLVANVKNSTCKSGISNQGNIDLGVVGV GYFSGNVTPESYQPGGKEFTITVSDCALQGTGDVLNQLHIDFRALSGVMAAGSRQIFANE ISSGASNVGVVIFSTQDSANTFNVLNASGGSRSVYPVMSDDMNGSSWKFSTRMQKIDPAL SVTSGQLMSHVLVDIYYE >gi|296494553|gb|ADTN01000185.1| GENE 8 5326 - 5928 427 200 aa, chain - ## HITS:1 COG:no KEGG:JW0133 NR:ns ## KEGG: JW0133 # Name: yadL # Def: predicted 
fimbrial-like adhesin protein # Organism: E.coli_J # Pathway: not_defined # 1 200 2 201 201 307 99.0 1e-82 MTFKNLRYGLSSSVVLAASLFSVLSYAATDSIGLTVITTVEMGTCTATLVNDSDQDISVV DFGDVYISEINAKTKVKTFKLKFKDCAGIPNKKAQIKLPKRATCEGTANDGAGFANGSTA ADKASAVAVEVWSTVTPATGSATQFSCVTPASQEVTISTAANAVVYYPMSARLVVEKNKT VNNVTAGKFSAPATFTVTYN >gi|296494553|gb|ADTN01000185.1| GENE 9 5943 - 6326 233 127 aa, chain - ## HITS:1 COG:yadM KEGG:ns NR:ns ## COG: yadM COG3539 # Protein_GI_number: 16128131 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 127 77 203 203 233 100.0 7e-62 MGLDKIANKTTESQADFKLVASGCSSGISWIDTTLTGNASSSSPKLIIPQSGDSSSTTSN IGMGFKKRTTDDATFLKPNSAEKIRWSTDEMQPDKGLEMTVALRETDAGQGVPGNFRALA TFNFIYQ >gi|296494553|gb|ADTN01000185.1| GENE 10 6529 - 9126 1456 865 aa, chain - ## HITS:1 COG:htrE KEGG:ns NR:ns ## COG: htrE COG3188 # Protein_GI_number: 16128132 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli K12 # 1 865 1 865 865 1664 100.0 0 MTIEYTKNYHHLTRIATFCALLYCNTAFSAELVEYDHTFLMGQNASNIDLSRYSEGNPAI PGVYDVSVYVNDQPIINQSITFVAIEGKKNAQACITLKNLLQFHINSPDINNEKAVLLAR DETLGNCLNLTEIIPQASVRYDVNDQRLDIDVPQAWVMKNYQNYVDPSLWENGINAAMLS YNLNGYHSETPGRKNESIYAAFNGGMNLGAWRLRASGNYNWMTDSGSNYDFKNRYVQRDI ASLRSQLILGESYTTGETFDSVSIRGIRLYSDSRMLPPTLASFAPIIHGVANTNAKVTIT QGGYKIYETTVPPGAFVIDDLSPSGYGSDLIVTIEESDGSKRTFSQPFSSVVQMLRPGVG RWDISGGQVLKDDIQDEPNLFQASYYYGLNNYLTGYTGIQITDNNYTAGLLGLGLNTSVG AFSFDVTHSNVRIPDDKTYQGQSYRVSWNKLFEETSTSLNIAAYRYSTQNYLGLNDALTL IDEVKHPEQDLEPKSMRNYSRMKNQVTVSINQPLKFEKKDYGSFYLSGSWSDYWASGQNR SNYSIGYSNSTSWGSYSVSAQRSWNEDGDTDDSVYLSFTIPIEKLLGTEQRTSGFQSIDT QISSDFKGNNQLNVSSSGYSDNARVSYSVNTGYTMNKASKDLSYVGGYASYESPWGTLAG SISANSDNSRQVSLSTDGGFVLHSGGLTFSNDSFSDSDTLAVVQAPGAQGARINYGNSTI DRWGYGVTSALSPYHENRIALDINDLENDVELKSTSAVAVPRQGSVVFADFETVQGQSAI 
MNITRSDGKNIPFAADIYDEQGNVIGNVGQGGQAFVRGIEQQGNISIKWLEQSKPVSCLA HYQQSPEAEKIAQSIILNGIRCQIQ >gi|296494553|gb|ADTN01000185.1| GENE 11 9161 - 9853 444 230 aa, chain - ## HITS:1 COG:ecpD KEGG:ns NR:ns ## COG: ecpD COG3121 # Protein_GI_number: 16128133 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli K12 # 1 230 17 246 246 452 100.0 1e-127 MAFSSSSIADIVISGTRVIYKSDQKSVNVRLENKGNNPLLVQSWLDTGDDNAEPGSITVP FTATPPVSRIDAKRGQTIKLMYTASTSLPKDRESVFWFNVLEVPPKPDAEKVANQSLLQL AFRTRIKLFYRPDGLKGNPSEAPLALKWFWSGSEGKASLRVTNPTPYYVSFSSGDLEASG KRYPIDVKMIAPFSDEVMKVNGLNGKANSAKVHFYAINDFGGAIEGNARL >gi|296494553|gb|ADTN01000185.1| GENE 12 9999 - 10583 501 194 aa, chain - ## HITS:1 COG:yadN KEGG:ns NR:ns ## COG: yadN COG3539 # Protein_GI_number: 16128134 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 194 1 194 194 326 100.0 2e-89 MSKKLGFALSGLMLAMVAGTASADMDGGQLNISGLVVDNTCETRVDGGNKDGLILLQTAT VGEIDAGVLNDTVGAKAKPFSITVDCSKANPNPGSTAKMTFGSVFFGNSKGTLNNDMSIN NPSDGVNIALHNIDGSTIKQVQINNPGDVYTKALDATTKSAVYDFKASYVRAVADQTATA GYVKTNTAYTITYQ >gi|296494553|gb|ADTN01000185.1| GENE 13 10953 - 11432 208 159 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 [Streptococcus pneumoniae SP9-BS68] # 4 158 122 270 278 84 36 4e-16 MTVAYIAIGSNLASPLEQVNAALKALGDIPESHILTVSSFYRTPPLGPQDQPDYLNAAVA LETSLAPEELLNHTQRIELQQGRVRKAERWGPRTLDLDIMLFGNEVINTERLTVPHYDMK NRGFMLWPLFEIAPELVFPDGEMLRQILHTRAFDKLNKW >gi|296494553|gb|ADTN01000185.1| GENE 14 11429 - 12793 1386 454 aa, chain - ## HITS:1 COG:ECs0147 KEGG:ns NR:ns ## COG: ECs0147 COG0617 # Protein_GI_number: 15829401 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Escherichia coli O157:H7 # 1 454 1 454 454 
862 100.0 0 MLSREESEAEQAVARPQVTVIPREQHAISRKDISENALKVMYRLNKAGYEAWLVGGGVRD LLLGKKPKDFDVTTNATPEQVRKLFRNCRLVGRRFRLAHVMFGPEIIEVATFRGHHEGNV SDRTTSQRGQNGMLLRDNIFGSIEEDAQRRDFTINSLYYSVADFTVRDYVGGMKDLKDGV IRLIGNPETRYREDPVRMLRAVRFAAKLGMRISPETAEPIPRLATLLNDIPPARLFEESL KLLQAGYGYETYKLLCEYHLFQPLFPTITRYFTENGDSPMERIIEQVLKNTDTRIHNDMR VNPAFLFAAMFWYPLLETAQKIAQESGLTYHDAFALAMNDVLDEACRSLAIPKRLTTLTR DIWQLQLRMSRRQGKRAWKLLEHPKFRAAYDLLALRAEVERNAELQRLVKWWGEFQVSAP PDQKGMLNELDEEPSPRRRTRRPRKRAPRREGTA >gi|296494553|gb|ADTN01000185.1| GENE 15 12886 - 13782 543 298 aa, chain - ## HITS:1 COG:yadB KEGG:ns NR:ns ## COG: yadB COG0008 # Protein_GI_number: 16128137 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Escherichia coli K12 # 1 298 11 308 308 611 100.0 1e-175 MTDTQYIGRFAPSPSGELHFGSLIAALGSYLQARARQGRWLVRIEDIDPPREVPGAAETI LRQLEHYGLHWDGDVLWQSQRHDAYREALAWLHEQGLSYYCTCTRARIQSIGGIYDGHCR VLHHGPDNAAVRIRQQHPVTQFTDQLRGIIHADEKLAREDFIIHRRDGLFAYNLAVVVDD HFQGVTEIVRGADLIEPTVRQISLYQLFGWKVPDYIHLPLALNPQGAKLSKQNHAPALPK GDPRPVLIAALQFLGQQAEAHWQDFSVEQILQSAVKNWRLTAVPESAIVNSTFSNASC >gi|296494553|gb|ADTN01000185.1| GENE 16 13849 - 14304 668 151 aa, chain - ## HITS:1 COG:ECs0149 KEGG:ns NR:ns ## COG: ECs0149 COG1734 # Protein_GI_number: 15829403 # Func_class: T Signal transduction mechanisms # Function: DnaK suppressor protein # Organism: Escherichia coli O157:H7 # 1 151 1 151 151 261 100.0 5e-70 MQEGQNRKTSSLSILAIAGVEPYQEKPGEEYMNEAQLAHFRRILEAWRNQLRDEVDRTVT HMQDEAANFPDPVDRAAQEEEFSLELRNRDRERKLIKKIEKTLKKVEDEDFGYCESCGVE IGIRRLEARPTADLCIDCKTLAEIREKQMAG >gi|296494553|gb|ADTN01000185.1| GENE 17 14482 - 15186 329 234 aa, chain - ## HITS:1 COG:ECs0150 KEGG:ns NR:ns ## COG: ECs0150 COG1489 # Protein_GI_number: 15829404 # Func_class: R General function prediction only # Function: DNA-binding protein, stimulates sugar fermentation # Organism: Escherichia coli O157:H7 # 1 234 1 234 234 467 100.0 1e-132 
MEFSPPLQRATLIQRYKRFLADVITPDGRELTLHCPNTGAMTGCATPGDTVWYSTSDNTK RKYPHTWELTQSQSGAFICVNTLWANRLTKEAILNESISELSGYSSLKSEVKYGAERSRI DFMLQADSRPDCYIEVKSVTLAENEQGYFPDAVTERGQKHLRELMSVAAEGQRAVIFFAV LHSAITRFSPARHIDEKYAQLLSEAQQRGVEILAYKAEISAEGMALKKSLPVTL >gi|296494553|gb|ADTN01000185.1| GENE 18 15201 - 15731 479 176 aa, chain - ## HITS:1 COG:ligT KEGG:ns NR:ns ## COG: ligT COG1514 # Protein_GI_number: 16128140 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 2'-5' RNA ligase # Organism: Escherichia coli K12 # 1 176 4 179 179 336 100.0 1e-92 MSEPQRLFFAIDLPAEIREQIIHWRATHFPPEAGRPVAADNLHLTLAFLGEVSAEKEKAL SLLAGRIRQPGFTLTLDDAGQWLRSRVVWLGMRQPPRGLIQLANMLRSQAARSGCFQSNR PFHPHITLLRDASEAVTIPPPGFNWSYAVTEFTLYASSFARGRTRYTPLKRWALTQ >gi|296494553|gb|ADTN01000185.1| GENE 19 15760 - 18234 2256 824 aa, chain + ## HITS:1 COG:hrpB KEGG:ns NR:ns ## COG: hrpB COG1643 # Protein_GI_number: 16128141 # Func_class: L Replication, recombination and repair # Function: HrpA-like helicases # Organism: Escherichia coli K12 # 1 824 1 824 824 1485 99.0 0 MLQCGAKNVNPLERFVSSLPVAAVLPELLTALDCAPQVLLSAPTGAGKSTWLPLQLLAHP GINGKIILLEPRRLAARNVAQRLAELLNEKPGDTVGYRMRAQNCVGPNTRLEVVTEGVLT RMIQRDPELSGVGLVILDEFHERSLQADLALALLLDVQQGLRDDLKLLIMSATLDNDRLQ QMLPEAPVVISEGRSFPVERRYLPLPAHQRFDDAVAVATAEMLRQESGSLLLFLPGVGEI QRVQEQLASRIGSDVLLCPLYGALSLNDQRKAILPAPQGMRKVVLATNIAETSLTIEGIR LVVDCAQERVARFDPRTGLTRLITQRVSQASMTQRAGRAGRLEPGISLHLIAKEQAERAA AQSEPEILQSDLSGLLMELLQWGCSDPAQMSWLDQPPVVNLLAAKRLLQMLGALEGERLS AQGQKMAALGNDPRLAAMLVSAKNDDEAATAAKIAAILEEPPRMGNSDLGVAFSRNQPAW QQRSQQLLKRLNVRGGEADSSLIAPLLAGAFADRIARRRGQDGRYQLANGMGAMLDANDA LSRHEWLIAPLLLQGSASPDARILLALLVDIDELVQRCPQLVQQSDTVEWDDAQGTLKAW RRLQIGQLTVKVQPLAKPSEDELHQAMLNGIRDKGLSVLNWTAEAEQLRLRLLCAAKWLP EYDWPAVDDESLLAALETWLLPHMTGVHSLRGLKSLDIYQALRGLLDWGMQQRLDSELPA HYTVPTGSRIAIRYHEDNPPALAVRMQEMFGEATNPTIAQGRVPLVLELLSPAQRPLQIT RDLSAFWKGAYREVQKEMKGRYPKHVWPDDPANTAPTRRTKKYS Prediction of potential genes in microbial genomes Time: Sun May 15 23:46:26 2011 Seq name: 
gi|296494552|gb|ADTN01000186.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont381.5, whole genome shotgun sequence Length of sequence - 2667 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 8 - 67 1.7 1 1 Tu 1 . + CDS 109 - 2643 2969 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) Predicted protein(s) >gi|296494552|gb|ADTN01000186.1| GENE 1 109 - 2643 2969 844 aa, chain + ## HITS:1 COG:mrcB KEGG:ns NR:ns ## COG: mrcB COG0744 # Protein_GI_number: 16128142 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Escherichia coli K12 # 1 844 1 844 844 1521 100.0 0 MAGNDREPIGRKGKPTRPVKQKVSRRRYEDDDDYDDYDDYEDEEPMPRKGKGKGKGRKPR GKRGWLWLLLKLAIVFAVLIAIYGVYLDQKIRSRIDGKVWQLPAAVYGRMVNLEPDMTIS KNEMVKLLEATQYRQVSKMTRPGEFTVQANSIEMIRRPFDFPDSKEGQVRARLTFDGDHL ATIVNMENNRQFGFFRLDPRLITMISSPNGEQRLFVPRSGFPDLLVDTLLATEDRHFYEH DGISLYSIGRAVLANLTAGRTVQGASTLTQQLVKNLFLSSERSYWRKANEAYMALIMDAR YSKDRILELYMNEVYLGQSGDNEIRGFPLASLYYFGRPVEELSLDQQALLVGMVKGASIY NPWRNPKLALERRNLVLRLLQQQQIIDQELYDMLSARPLGVQPRGGVISPQPAFMQLVRQ ELQAKLGDKVKDLSGVKIFTTFDSVAQDAAEKAAVEGIPALKKQRKLSDLETAIVVVDRF SGEVRAMVGGSEPQFAGYNRAMQARRSIGSLAKPATYLTALSQPKIYRLNTWIADAPIAL RQPNGQVWSPQNDDRRYSESGRVMLVDALTRSMNVPTVNLGMALGLPAVTETWIKLGVPK DQLHPVPAMLLGALNLTPIEVAQAFQTIASGGNRAPLSALRSVIAEDGKVLYQSFPQAER AVPAQAAYLTLWTMQQVVQRGTGRQLGAKYPNLHLAGKTGTTNNNVDTWFAGIDGSTVTI TWVGRDNNQPTKLYGASGAMSIYQRYLANQTPTPLNLVPPEDIADMGVDYDGNFVCSGGM RILPVWTSDPQSLCQQSEMQQQPSGNPFDQSSQPQQQPQQQPAQQEQKDSDGVAGWIKDM FGSN Prediction of potential genes in microbial genomes Time: Sun May 15 23:46:33 2011 Seq name: gi|296494551|gb|ADTN01000187.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont381.6, whole genome shotgun sequence Length of sequence - 21381 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 10, operones - 4 average 
op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 91 - 150 7.0 1 1 Op 1 7/0.000 + CDS 171 - 2414 2440 ## COG1629 Outer membrane receptor proteins, mostly Fe transport 2 1 Op 2 14/0.000 + CDS 2465 - 3262 232 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 3 1 Op 3 33/0.000 + CDS 3262 - 4152 924 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 4 1 Op 4 . + CDS 4149 - 6131 2019 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component - Term 6108 - 6152 6.5 5 2 Tu 1 . - CDS 6289 - 7569 1475 ## COG0001 Glutamate-1-semialdehyde aminotransferase - Prom 7678 - 7737 5.4 + Prom 7560 - 7619 3.8 6 3 Op 1 . + CDS 7794 - 9215 1588 ## COG0038 Chloride channel protein EriC 7 3 Op 2 . + CDS 9239 - 9304 68 ## 8 3 Op 3 . + CDS 9297 - 9641 488 ## COG0316 Uncharacterized conserved protein + Term 9653 - 9689 8.2 9 4 Op 1 6/0.000 - CDS 9688 - 10311 600 ## COG2860 Predicted membrane protein 10 4 Op 2 5/0.250 - CDS 10349 - 11149 791 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 11 4 Op 3 . - CDS 11142 - 11840 751 ## COG0775 Nucleoside phosphorylase - Prom 11941 - 12000 4.3 + Prom 11793 - 11852 2.0 12 5 Tu 1 . + CDS 11924 - 13441 1003 ## COG0232 dGTP triphosphohydrolase + Prom 13463 - 13522 2.8 13 6 Tu 1 . + CDS 13571 - 14995 1651 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain + Term 15006 - 15041 6.5 + Prom 15027 - 15086 3.8 14 7 Tu 1 . + CDS 15150 - 16307 1132 ## COG3835 Sugar diacid utilization regulator + Term 16372 - 16406 1.2 15 8 Tu 1 . - CDS 16396 - 16782 460 ## ECIAI1_0162 hypothetical protein - Prom 16811 - 16870 3.6 16 9 Tu 1 . - CDS 16944 - 17687 531 ## COG1408 Predicted phosphohydrolases - Term 17764 - 17798 5.0 17 10 Op 1 5/0.250 - CDS 17810 - 18634 1014 ## COG2171 Tetrahydrodipicolinate N-succinyltransferase 18 10 Op 2 . 
- CDS 18665 - 21337 2169 ## COG2844 UTP:GlnB (protein PII) uridylyltransferase Predicted protein(s) >gi|296494551|gb|ADTN01000187.1| GENE 1 171 - 2414 2440 747 aa, chain + ## HITS:1 COG:ECs0154 KEGG:ns NR:ns ## COG: ECs0154 COG1629 # Protein_GI_number: 15829408 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Escherichia coli O157:H7 # 1 747 1 747 747 1461 99.0 0 MARSKTAQPKHSLRKIAVVVATAVSGMSVYAQAAVEPKEDTITVTAAPAPQESAWGPAAT IAARQSATGTKTDTPIQKVPQSISVVTAEEMALHQPKSVKEALSYTPGVSVGTRGASNTY DHLIIRGFAAEGQSQNNYLNGLKLQGNFYNDAVIDPYMLERAEIMRGPVSVLYGKSSPGG LLNMVSKRPTTEPLKEVQFKAGTDSLFQTGFDFSDSLDDDGVYSYRLTGLARSANAQQKG SEEQRYAIAPAFTWRPDDKTNFTFLSYFQNEPETGYYGWLPKEGTVEPLPNGKRLPTDFN EGAKNNTYSRNEKMVGYSFDHEFNDTFTVRQNLRFAENKTSQNSVYGYGVCSDPANAYSK QCAALAPADKGHYLARKYVVDDEKLQNFSVDTQLQSKFATGDIDHTLLTGVDFMRMRNDI NAWFGYDDSVPLLNLYNPVNTDFDFNAKDPANSGPYRILNKQKQTGVYVQDQAQWDKVLV TLGGRYDWADQESLNRVAGTTDKRDDKQFTWRGGVNYLFDNGVTPYFSYSESFEPSSQVG KDGNIFAPSKGKQYEVGVKYVPEDRPIVVTGAVYNLTKTNNLMADPEGSFFSVEGGEIRA RGVEIEAKAALSASVNVVGSYTYTDAEYTTDTTYKGNTPAQVPKHMASLWADYTFFDGPL SGLTLGTGGRYTGSSYGDPANSFKVGSYTVVDALVRYDLARVGMAGSNVALHVNNLFDRE YVASCFNTYGCFWGAERQVVATATFRF >gi|296494551|gb|ADTN01000187.1| GENE 2 2465 - 3262 232 265 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 9 235 274 507 563 94 29 8e-19 MQEYTNHSDTTFALRNISFRVPGRTLLHPLSLTFPAGKVTGLIGHNGSGKSTLLKMLGRH QPPSEGEILLDAQPLESWSSKAFARKVAYLPQQLPPAEGMTVRELVAIGRYPWHGALGRF GAADREKVEEAISLVGLKPLAHRLVDSLSGGERQRAWIAMLVAQDSRCLLLDEPTSALDI AHQVDVLSLVHRLSQERGLTVIAVLHDINMAARYCDYLVALRGGEMIAQGTPAEIMRGET LEMIYGIPMGILPHPAGAAPVSFVY >gi|296494551|gb|ADTN01000187.1| GENE 3 3262 - 4152 924 296 aa, chain + ## HITS:1 COG:fhuD KEGG:ns NR:ns ## COG: fhuD COG0614 # Protein_GI_number: 16128145 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Escherichia coli K12 # 1 296 1 296 296 585 100.0 1e-167 MSGLPLISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVA DTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLT TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVLDNAIGGKA >gi|296494551|gb|ADTN01000187.1| GENE 4 4149 - 6131 2019 660 aa, chain + ## HITS:1 COG:fhuB KEGG:ns NR:ns ## COG: fhuB COG0609 # Protein_GI_number: 16128146 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Escherichia coli K12 # 1 660 1 660 660 967 99.0 0 MSKRIALFPALLLALLVIVATALTWMNFSQALPRSQWAQAAWSPDIDVIEQMIFHYSLLP RLAISLLVGAGLGLVGVLFQQVLRNPLAEPTTLGVATGAQLGITVTTLWAIPGAMASQFA ALAGACVVGLIVFGVAWGKRLSPVTLILAGLVVSLYCGAINQLLVIFHHDQLQSMFLWST GTLTQTDWGGVERLWPQLLGGVMLTLLLLRPLTLMGLDDGVARNLGLALSLARLAALSLA IVISALLVNAVGIIGFIGLFAPLLAKMLGARRLLPRLMLASLIGALILWLSDQIILWLTR VWMEVSTGSVTALIGAPLLLWLLPRLRSISAPDMKVNDRVAAERQHVLAFALAGGVLLLM AVVVALSFGRDAHGWTWASGALLEDLMPWRWPRIMAALFAGVMLAVAGCIIQRLTGNPMA SPEVLGISSGAAFGVVLMLFLVPGNAFGWLLPAGSLGAAVTLLIIMIAAGRGGFSPHRML LAGMALSTAFTMLLMMLQASGDPRMAQVLTWISGSTYNATDAQVWRTGIVMVILLAITPL CRRWLTILPLGGDTARAVGMALTPTRIALLLLAACLTATATITIGPLSFVGLMAPHIARM MGFRRTMPHIVISALVGGLLLVFADWCGRMVLFPFQIPAGLLSTFIGAPYFIYLLRKQSR 
>gi|296494551|gb|ADTN01000187.1| GENE 5 6289 - 7569 1475 426 aa, chain - ## HITS:1 COG:hemL KEGG:ns NR:ns ## COG: hemL COG0001 # Protein_GI_number: 16128147 # Func_class: H Coenzyme transport and metabolism # Function: Glutamate-1-semialdehyde aminotransferase # Organism: Escherichia coli K12 # 1 426 1 426 426 843 99.0 0 MSKSENLYSAARELIPGGVNSPVRAFTGVGGTPLFIEKADGAYLYDVDGKAYIDYVGSWG PMVLGHNHPAIRNAVIEAAERGLSFGAPTEMEVKMAQLVTELVPTMDMVRMVNSGTEATM SAIRLARGFTGRDKIIKFEGCYHGHADCLLVKAGSGALTLGQPNSPGVPADFAKHTLTCT YNDLASVRAAFEQYPQEIACIIVEPVAGNMNCVPPLPEFLPGLRALCDEFGALLIIDEVM TGFRVALAGAQDYYGVVPDLTCLGKIIGGGMPVGAFGGRRDVMDALAPTGPVYQAGTLSG NPIAMAAGFACLNEVAQPGVHETLDELTTRLAEGLLEAAEEAGIPLVVNHVGGMFGIFFT DAESVTCYQDVMACDVERFKRFFHMMLGEGVYLAPSAFEAGFMSVAHSMEDINNTIDAAR RVFAKL >gi|296494551|gb|ADTN01000187.1| GENE 6 7794 - 9215 1588 473 aa, chain + ## HITS:1 COG:yadQ KEGG:ns NR:ns ## COG: yadQ COG0038 # Protein_GI_number: 16128148 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Escherichia coli K12 # 1 473 1 473 473 789 100.0 0 MKTDTPSLETPQAARLRRRQLIRQLLERDKTPLAILFMAAVVGTLVGLAAVAFDKGVAWL QNQRMGALVHTADNYPLLLTVAFLCSAVLAMFGYFLVRKYAPEAGGSGIPEIEGALEDQR PVRWWRVLPVKFFGGLGTLGGGMVLGREGPTVQIGGNIGRMVLDIFRLKGDEARHTLLAT GAAAGLAAAFNAPLAGILFIIEEMRPQFRYTLISIKAVFIGVIMSTIMYRIFNHEVALID VGKLSDAPLNTLWLYLILGIIFGIFGPIFNKWVLGMQDLLHRVHGGNITKWVLMGGAIGG LCGLLGFVAPATSGGGFNLIPIATAGNFSMGMLVFIFVARVITTLLCFSSGAPGGIFAPM LALGTVLGTAFGMVAVELFPQYHLEAGTFAIAGMGALLAASIRAPLTGIILVLEMTDNYQ LILPMIITGLGATLLAQFTGGKPLYSAILARTLAKQEAEQLARSKAASASENT >gi|296494551|gb|ADTN01000187.1| GENE 7 9239 - 9304 68 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAIIIGLEFAQLPMSFGAKYE >gi|296494551|gb|ADTN01000187.1| GENE 8 9297 - 9641 488 114 aa, chain + ## HITS:1 COG:STM0204 KEGG:ns NR:ns ## COG: STM0204 COG0316 # Protein_GI_number: 16763594 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Salmonella typhimurium LT2 # 1 114 16 129 129 221 99.0 4e-58 
MSDDVALPLEFTDAAANKVKSLIADEDNPNLKLRVYITGGGCSGFQYGFTFDDQVNEGDM TIEKQGVGLVVDPMSLQYLVGGSVDYTEGLEGSRFIVTNPNAKSTCGCGSSFSI >gi|296494551|gb|ADTN01000187.1| GENE 9 9688 - 10311 600 207 aa, chain - ## HITS:1 COG:yadS KEGG:ns NR:ns ## COG: yadS COG2860 # Protein_GI_number: 16128150 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 207 1 207 207 337 100.0 1e-92 MLVYWLDIVGTAVFAISGVLLAGKLRMDPFGVLVLGVVTAVGGGTIRDMALDHGPVFWVK DPTDLVVAMVTSMLTIVLVRQPRRLPKWMLPVLDAVGLAVFVGIGVNKAFNAEAGPLIAV CMGVITGVGGGIIRDVLAREIPMILRTEIYATACIIGGIVHATAYYTFSVPLETASMMGM VVTLLIRLAAIRWHLKLPTFALDENGR >gi|296494551|gb|ADTN01000187.1| GENE 10 10349 - 11149 791 266 aa, chain - ## HITS:1 COG:yadT KEGG:ns NR:ns ## COG: yadT COG0614 # Protein_GI_number: 16128151 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Escherichia coli K12 # 1 266 1 266 266 482 100.0 1e-136 MAKSLFRALVALSFLAPLWLNAAPRVITLSPANTELAFAAGITPVGVSSYSDYPPQAQKI EQVSTWQGMNLERIVALKPDLVIAWRGGNAERQVDQLASLGIKVMWVDATSIEQIANALR QLAPWSPQPDKAEQAAQSLLDQYAQLKAQYADKPKKRVFLQFGINPPFTSGKESIQNQVL EVCGGENIFKDSRVPWPQVSREQVLARSPQAIVITGGPDQIPKIKQYWGEQLKIPVIPLT SDWFERASPRIILAAQQLCNALSQVD >gi|296494551|gb|ADTN01000187.1| GENE 11 11142 - 11840 751 232 aa, chain - ## HITS:1 COG:STM0207 KEGG:ns NR:ns ## COG: STM0207 COG0775 # Protein_GI_number: 16763597 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside phosphorylase # Organism: Salmonella typhimurium LT2 # 1 232 1 232 232 412 96.0 1e-115 MKIGIIGAMEEEVTLLRDKIENRQTISLGGCEIYTGQLNGTEVALLKSGIGKVAAALGAT LLLEHCKPDVIINTGSAGGLAPTLKVGDIVVSDEARYHDADVTAFGYEYGQLPGCPAGFK ADDKLIAAAEACIAELNLNAVRGLIVSGDAFINGSVGLAKIRHNFPQAIAVEMEATAIAH VCHNFNVPFVVVRAISDVADQQSHLSFDEFLAVAAKQSSLMVESLVQKLAHG >gi|296494551|gb|ADTN01000187.1| GENE 12 11924 - 13441 1003 505 aa, chain + ## HITS:1 COG:dgt KEGG:ns NR:ns ## COG: dgt COG0232 # Protein_GI_number: 16128153 # Func_class: F 
Nucleotide transport and metabolism # Function: dGTP triphosphohydrolase # Organism: Escherichia coli K12 # 1 505 1 505 505 1011 99.0 0 MAQIDFRKKINWHRRYRSPQGVKTEHEILRIFESDRGRIINSPAIRRLQQKTQVFPLERN AAVRTRLTHSMEVQQVGRYIAKEILSRLKELKLLEAYGLDELTGPFESIVEMSCLMHDIG NPPFGHFGEAAINDWFRQRLHPEDAESQPLTDDRCSVAALRLRDGEEPLNELRRKIRQDL CHFEGNAQGIRLVHTLMRMNLTWAQVGGILKYTRPAWWRGETPETHHYLMKKPGYYLSEE AYIARLRKELNLALYSRFPLTWIMEAADDISYCVADLEDAVEKRIFTVEQLYHHLHEAWG QHEKGSLFSLVVENAWEKSRSNSLSRSTEDQFFMYLRVNTLNKLVPYAAQRFIDNLPAIF AGTFNHALLEDASECSDLLKLYKNVAVKHVFSHPDVEQLELQGYRVISGLLEIYRPLLSL SLSDFTELVEKERVKRFPIESRLFHKLSTRHRLAYVEAVSKLPSDSPEFPLWEYYYRCRL LQDYISGMTDLYAWDEYRRLMAVEQ >gi|296494551|gb|ADTN01000187.1| GENE 13 13571 - 14995 1651 474 aa, chain + ## HITS:1 COG:ECs0165 KEGG:ns NR:ns ## COG: ECs0165 COG0265 # Protein_GI_number: 15829419 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Escherichia coli O157:H7 # 1 474 1 474 474 780 100.0 0 MKKTTLALSALALSLGLALSPLSATAAETSSATTAQQMPSLAPMLEKVMPSVVSINVEGS TTVNTPRMPRNFQQFFGDDSPFCQEGSPFQSSPFCQGGQGGNGGGQQQKFMALGSGVIID ADKGYVVTNNHVVDNATVIKVQLSDGRKFDAKMVGKDPRSDIALIQIQNPKNLTAIKMAD SDALRVGDYTVAIGNPFGLGETVTSGIVSALGRSGLNAENYENFIQTDAAINRGNSGGAL VNLNGELIGINTAILAPDGGNIGIGFAIPSNMVKNLTSQMVEYGQVKRGELGIMGTELNS ELAKAMKVDAQRGAFVSQVLPNSSAAKAGIKAGDVITSLNGKPISSFAALRAQVGTMPVG SKLTLGLLRDGKQVNVNLELQQSSQNQVDSSSIFNGIEGAEMSNKGKDQGVVVNNVKTGT PAAQIGLKKGDVIIGANQQAVKNIAELRKVLDSKPSVLALNIQRGDSTIYLLMQ >gi|296494551|gb|ADTN01000187.1| GENE 14 15150 - 16307 1132 385 aa, chain + ## HITS:1 COG:yaeG KEGG:ns NR:ns ## COG: yaeG COG3835 # Protein_GI_number: 16128155 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Sugar diacid utilization regulator # Organism: Escherichia coli K12 # 1 385 7 391 391 734 100.0 0 MAGWHLDTKMAQDIVARTMRIIDTNINVMDARGRIIGSGDRERIGELHEGALLVLSQGRV VDIDDAVARHLHGVRQGINLPLRLEGEIVGVIGLTGEPENLRKYGELVCMTAEMMLEQSR 
LMHLLAQDSRLREELVMNLIQAEENTPALTEWAQRLGIDLNQPRVVAIVEVDSGQLGVDS AMAELQQLQNALTTPERNNLVAIVSLTEMVVLKPALNSFGRWDAEDHRKRVEQLITRMKE YGQLRFRVSLGNYFTGPGSIARSYRTAKTTMVVGKQRMPESRCYFYQDLMLPVLLDSLRG DWQANELARPLARLKTMDNNGLLRRTLAAWFRHNVQPLATSKALFIHRNTLEYRLNRISE LTGLDLGNFDDRLLLYVALQLDEER >gi|296494551|gb|ADTN01000187.1| GENE 15 16396 - 16782 460 128 aa, chain - ## HITS:1 COG:no KEGG:ECIAI1_0162 NR:ns ## KEGG: ECIAI1_0162 # Name: yaeH # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 128 1 128 128 218 100.0 6e-56 MYDNLKSLGITNPEEIDRYSLRQEANNDILKIYFQKDKGEFFAKSVKFKYPRQRKTVVAD GVGQGYKEVQEISPNLRYIIDELDQICQRDRSEVDLKRKILDDLRHLESVVTNKISEIEA DLEKLTRK >gi|296494551|gb|ADTN01000187.1| GENE 16 16944 - 17687 531 247 aa, chain - ## HITS:1 COG:yaeI KEGG:ns NR:ns ## COG: yaeI COG1408 # Protein_GI_number: 16128157 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Escherichia coli K12 # 1 247 1 247 247 520 100.0 1e-148 MHYCEPGWFELIRHRLAFFKDNAAPFKILFLADLHYSRFVPLSLISDAIALGIEQKPDLI LLGGDYVLFDMSLNFSAFSDVLSPLAECAPTFACFGNHDRPVGTEKNHLIGETLKSAGIT VLFNQATVIATPNRQFELVGTGDLWAGQCKPPPASEANLPRLVLAHNPDSKEVMRDEPWD LMLCGHTHGGQLRVPLVGEPFAPVEDKRYVAGLNAFGERHIYTTRGVGSLYGLRLNCRPE VTMLELV >gi|296494551|gb|ADTN01000187.1| GENE 17 17810 - 18634 1014 274 aa, chain - ## HITS:1 COG:dapD KEGG:ns NR:ns ## COG: dapD COG2171 # Protein_GI_number: 16128159 # Func_class: E Amino acid transport and metabolism # Function: Tetrahydrodipicolinate N-succinyltransferase # Organism: Escherichia coli K12 # 1 274 1 274 274 521 100.0 1e-148 MQQLQNIIETAFERRAEITPANADTVTREAVNQVIALLDSGALRVAEKIDGQWVTHQWLK KAVLLSFRINDNQVIEGAESRYFDKVPMKFADYDEARFQKEGFRVVPPAAVRQGAFIARN TVLMPSYVNIGAYVDEGTMVDTWATVGSCAQIGKNVHLSGGVGIGGVLEPLQANPTIIED NCFIGARSEVVEGVIVEEGSVISMGVYIGQSTRIYDRETGEIHYGRVPAGSVVVSGNLPS KDGKYSLYCAVIVKKVDAKTRGKVGINELLRTID >gi|296494551|gb|ADTN01000187.1| GENE 18 18665 - 21337 2169 890 aa, chain - ## HITS:1 COG:glnD KEGG:ns NR:ns ## COG: glnD COG2844 # Protein_GI_number: 
16128160 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: UTP:GlnB (protein PII) uridylyltransferase # Organism: Escherichia coli K12 # 1 890 1 890 890 1790 100.0 0 MNTLPEQYANTALPTLPGQPQNPCVWPRDELTVGGIKAHIDTFQRWLGDAFDNGISAEQL IEARTEFIDQLLQRLWIEAGFSQIADLALVAVGGYGRGELHPLSDVDLLILSRKKLPDDQ AQKVGELLTLLWDVKLEVGHSVRTLEECMLEGLSDLTVATNLIESRLLIGDVALFLELQK HIFSEGFWPSDKFYAAKVEEQNQRHQRYHGTSYNLEPDIKSSPGGLRDIHTLQWVARRHF GATSLDEMVGFGFLTSAERAELNECLHILWRIRFALHLVVSRYDNRLLFDRQLSVAQRLN YSGEGNEPVERMMKDYFRVTRRVSELNQMLLQLFDEAILALPADEKPRPIDDEFQLRGTL IDLRDETLFMRQPEAILRMFYTMVHNSAITGIYSTTLRQLRHARRHLQQPLCNIPEARKL FLSILRHPGAVRRGLLPMHRHSVLGAYMPQWSHIVGQMQFDLFHAYTVDEHTIRVMLKLE SFASEETRQRHPLCVDVWPRLPSTELIFIAALFHDIAKGRGGDHSILGAQDVVHFAELHG LNSRETQLVAWLVRQHLLMSVTAQRRDIQDPEVIKQFAEEVQTENRLRYLVCLTVADICA TNETLWNSWKQSLLRELYFATEKQLRRGMQNTPDMRERVRHHQLQALALLRMDNIDEEAL HQIWSRCRANYFVRHSPNQLAWHARHLLQHDLSKPLVLLSPQATRGGTEIFIWSPDRPYL FAAVCAELDRRNLSVHDAQIFTTRDGMAMDTFIVLEPDGNPLSADRHEVIRFGLEQVLTQ SSWQPPQPRRQPAKLRHFTVETEVTFLPTHTDRKSFLELIALDQPGLLARVGKIFADLGI SLHGARITTIGERVEDLFIIATADRRALNNELQQEVHQRLTEALNPNDKG Prediction of potential genes in microbial genomes Time: Sun May 15 23:46:42 2011 Seq name: gi|296494550|gb|ADTN01000188.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont381.7, whole genome shotgun sequence Length of sequence - 2926 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 696 686 ## COG0024 Methionine aminopeptidase - Prom 812 - 871 4.5 + Prom 771 - 830 3.4 2 2 Op 1 38/0.000 + CDS 866 - 1789 1604 ## PROTEIN SUPPORTED gi|26106512|gb|AAN78698.1|AE016755_198 30S ribosomal protein S2 + Term 1964 - 1997 3.1 + Prom 1962 - 2021 3.8 3 2 Op 2 . 
+ CDS 2047 - 2898 1001 ## PROTEIN SUPPORTED gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts Predicted protein(s) >gi|296494550|gb|ADTN01000188.1| GENE 1 3 - 696 686 231 aa, chain - ## HITS:1 COG:ECs0170 KEGG:ns NR:ns ## COG: ECs0170 COG0024 # Protein_GI_number: 15829424 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Escherichia coli O157:H7 # 1 231 1 231 264 473 100.0 1e-133 MAISIKTPEDIEKMRVAGRLAAEVLEMIEPYVKPGVSTGELDRICNDYIVNEQHAVSACL GYHGYPKSVCISINEVVCHGIPDDAKLLKDGDIVNIDVTVIKDGFHGDTSKMFIVGKPTI MGERLCRITQESLYLALRMVKPGINLREIGAAIQKFVEAEGFSVVREYCGHGIGRGFHEE PQVLHYDSRETNVVLKPGMTFTIEPMVNAGKKEIRTMKDGWTVKTKDRSLS >gi|296494550|gb|ADTN01000188.1| GENE 2 866 - 1789 1604 307 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|26106512|gb|AAN78698.1|AE016755_198 30S ribosomal protein S2 [Escherichia coli CFT073] # 1 307 1 307 307 622 99 1e-178 MVSTTYLWYKARRTSDPFRIHRLDGSDNLTLCNNTHVSAHIPGCPLGSVIWDTWRHNPNF YIEVLIMATVSMRDMLKAGVHFGHQTRYWNPKMKPFIFGARNKVHIINLEKTVPMFNEAL AELNKIASRKGKILFVGTKRAASEAVKDAALSCDQFFVNHRWLGGMLTNWKTVRQSIKRL KDLETQSQDGTFDKLTKKEALMRTRELEKLENSLGGIKDMGGLPDALFVIDADHEHIAIK EANNLGIPVFAIVDTNSDPDGVDFVIPGNDDAIRAVTLYLGAVAATVREGRSQDLASQAE ESFVEAE >gi|296494550|gb|ADTN01000188.1| GENE 3 2047 - 2898 1001 283 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts [Haemophilus influenzae R2866] # 1 280 1 279 283 390 71 1e-109 MAEITASLVKELRERTGAGMMDCKKALTEANGDIELAIENMRKSGAIKAAKKAGNVAADG VIKTKIDGNYGIILEVNCQTDFVAKDAGFQAFADKVLDAAVAGKITDVEVLKAQFEEERV ALVAKIGENINIRRVAALEGDVLGSYQHGARIGVLVAAKGADEELVKHIAMHVAASKPEF IKPEDVSAEVVEKEYQVQLDIAMQSGKPKEIAEKMVEGRMKKFTGEVSLTGQPFVMEPSK TVGQLLKEHNAEVTGFIRFEVGEGIEKVETDFAAEVAAMSKQS Prediction of potential genes in microbial genomes Time: Sun May 15 23:46:47 2011 Seq name: gi|296494549|gb|ADTN01000189.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont381.8, whole genome shotgun sequence Length of sequence - 15770 
bp Number of predicted genes - 15, with homology - 14 Number of transcription units - 2, operones - 2 average op.length - 7.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 41 - 100 4.9 1 1 Op 1 33/0.000 + CDS 143 - 868 997 ## COG0528 Uridylate kinase + Prom 1024 - 1083 5.4 2 1 Op 2 8/0.000 + CDS 1160 - 1717 843 ## COG0233 Ribosome recycling factor + Prom 1727 - 1786 2.1 3 1 Op 3 7/0.000 + CDS 1809 - 3005 950 ## COG0743 1-deoxy-D-xylulose 5-phosphate reductoisomerase + Term 3110 - 3139 0.4 + Prom 3012 - 3071 3.9 4 1 Op 4 32/0.000 + CDS 3194 - 3952 471 ## COG0020 Undecaprenyl pyrophosphate synthase 5 1 Op 5 12/0.000 + CDS 4073 - 4822 667 ## COG0575 CDP-diglyceride synthetase 6 1 Op 6 18/0.000 + CDS 4834 - 6186 933 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 7 1 Op 7 . + CDS 6216 - 8648 2473 ## COG4775 Outer membrane protein/protective antigen OMA87 8 1 Op 8 . + CDS 8685 - 8759 56 ## 9 1 Op 9 15/0.000 + CDS 8770 - 9255 617 ## COG2825 Outer membrane protein 10 1 Op 10 18/0.000 + CDS 9259 - 10284 941 ## COG1044 UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase + Term 10311 - 10362 7.1 + Prom 10303 - 10362 4.0 11 2 Op 1 25/0.000 + CDS 10389 - 10844 512 ## COG0764 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases 12 2 Op 2 11/0.000 + CDS 10848 - 11636 783 ## COG1043 Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase 13 2 Op 3 11/0.000 + CDS 11636 - 12784 1167 ## COG0763 Lipid A disaccharide synthetase 14 2 Op 4 6/0.000 + CDS 12781 - 13377 741 ## COG0164 Ribonuclease HII 15 2 Op 5 . 
+ CDS 13414 - 15769 2510 ## COG0587 DNA polymerase III, alpha subunit Predicted protein(s) >gi|296494549|gb|ADTN01000189.1| GENE 1 143 - 868 997 241 aa, chain + ## HITS:1 COG:ECs0173 KEGG:ns NR:ns ## COG: ECs0173 COG0528 # Protein_GI_number: 15829427 # Func_class: F Nucleotide transport and metabolism # Function: Uridylate kinase # Organism: Escherichia coli O157:H7 # 1 241 1 241 241 460 100.0 1e-130 MATNAKPVYKRILLKLSGEALQGTEGFGIDASILDRMAQEIKELVELGIQVGVVIGGGNL FRGAGLAKAGMNRVVGDHMGMLATVMNGLAMRDALHRAYVNARLMSAIPLNGVCDSYSWA EAISLLRNNRVVILSAGTGNPFFTTDSAACLRGIEIEADVVLKATKVDGVFTADPAKDPT ATMYEQLTYSEVLEKELKVMDLAAFTLARDHKLPIRVFNMNKPGALRRVVMGEKEGTLIT E >gi|296494549|gb|ADTN01000189.1| GENE 2 1160 - 1717 843 185 aa, chain + ## HITS:1 COG:ECs0174 KEGG:ns NR:ns ## COG: ECs0174 COG0233 # Protein_GI_number: 15829428 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome recycling factor # Organism: Escherichia coli O157:H7 # 1 185 1 185 185 296 100.0 1e-80 MISDIRKDAEVRMDKCVEAFKTQISKIRTGRASPSLLDGIVVEYYGTPTPLRQLASVTVE DSRTLKINVFDRSMSPAVEKAIMASDLGLNPNSAGSDIRVPLPPLTEERRKDLTKIVRGE AEQARVAVRNVRRDANDKVKALLKDKEISEDDDRRSQDDVQKLTDAAIKKIEAALADKEA ELMQF >gi|296494549|gb|ADTN01000189.1| GENE 3 1809 - 3005 950 398 aa, chain + ## HITS:1 COG:dxr KEGG:ns NR:ns ## COG: dxr COG0743 # Protein_GI_number: 16128166 # Func_class: I Lipid transport and metabolism # Function: 1-deoxy-D-xylulose 5-phosphate reductoisomerase # Organism: Escherichia coli K12 # 1 398 1 398 398 759 100.0 0 MKQLTILGSTGSIGCSTLDVVRHNPEHFRVVALVAGKNVTRMVEQCLEFSPRYAVMDDEA SAKLLKTMLQQQGSRTEVLSGQQAACDMAALEDVDQVMAAIVGAAGLLPTLAAIRAGKTI LLANKESLVTCGRLFMDAVKQSKAQLLPVDSEHNAIFQSLPQPIQHNLGYADLEQNGVVS ILLTGSGGPFRETPLRDLATMTPDQACRHPNWSMGRKISVDSATMMNKGLEYIEARWLFN ASASQMEVLIHPQSVIHSMVRYQDGSVLAQLGEPDMRTPIAHTMAWPNRVNSGVKPLDFC KLSALTFAAPDYDRYPCLKLAMEAFEQGQAATTALNAANEITVAAFLAQQIRFTDIAALN LSVLEKMDMREPQCVDDVLSVDANAREVARKEVMRLAS >gi|296494549|gb|ADTN01000189.1| GENE 4 3194 - 3952 471 252 aa, chain + ## HITS:1 COG:ECs0176 
KEGG:ns NR:ns ## COG: ECs0176 COG0020 # Protein_GI_number: 15829430 # Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Escherichia coli O157:H7 # 1 252 2 253 253 509 100.0 1e-144 MLSATQPLSEKLPAHGCRHVAIIMDGNGRWAKKQGKIRAFGHKAGAKSVRRAVSFAANNG IEALTLYAFSSENWNRPAQEVSALMELFVWALDSEVKSLHRHNVRLRIIGDTSRFNSRLQ ERIRKSEALTAGNTGLTLNIAANYGGRWDIVQGVRQLAEKVQQGNLQPDQIDEEMLNQHV CMHELAPVDLVIRTGGEHRISNFLLWQIAYAELYFTDVLWPDFDEQDFEGALNAFANRER RFGGTEPGDETA >gi|296494549|gb|ADTN01000189.1| GENE 5 4073 - 4822 667 249 aa, chain + ## HITS:1 COG:ECs0177 KEGG:ns NR:ns ## COG: ECs0177 COG0575 # Protein_GI_number: 15829431 # Func_class: I Lipid transport and metabolism # Function: CDP-diglyceride synthetase # Organism: Escherichia coli O157:H7 # 1 249 1 249 249 418 100.0 1e-117 MLAAWEWGQLSGFTTRSQRVWLAVLCGLLLALMLFLLPEYHRNIHQPLVEISLWASLGWW IVALLLVLFYPGSAAIWRNSKTLRLIFGVLTIVPFFWGMLALRAWHYDENHYSGAIWLLY VMILVWGADSGAYMFGKLFGKHKLAPKVSPGKTWQGFIGGLATAAVISWGYGMWANLDVA PVTLLICSIVAALASVLGDLTESMFKREAGIKDSGHLIPGHGGILDRIDSLTAAVPVFAC LLLLVFRTL >gi|296494549|gb|ADTN01000189.1| GENE 6 4834 - 6186 933 450 aa, chain + ## HITS:1 COG:ECs0178 KEGG:ns NR:ns ## COG: ECs0178 COG0750 # Protein_GI_number: 15829432 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Escherichia coli O157:H7 # 1 450 1 450 450 872 100.0 0 MLSFLWDLASFIVALGVLITVHEFGHFWVARRCGVRVERFSIGFGKALWRRTDKLGTEYV IALIPLGGYVKMLDERAEPVVPELRHHAFNNKSVGQRAAIIAAGPVANFIFAIFAYWLVF IIGVPGVRPVVGEIAANSIAAEAQIAPGTELKAVDGIETPDWDAVRLQLVDKIGDESTTI TVAPFGSDQRRDVKLDLRHWAFEPDKEDPVSSLGIRPRGPQIEPVLENVQPNSAASKAGL QAGDRIVKVDGQPLTQWVTFVMLVRDNPGKSLALEIERQGSPLSLTLIPESKPGNGKAIG FVGIEPKVIPLPDEYKVVRQYGPFNAIVEATDKTWQLMKLTVSMLGKLITGDVKLNNLSG PISIAKGAGMTAELGVVYYLPFLALISVNLGIINLFPLPVLDGGHLLFLAIEKIKGGPVS ERVQDFCYRIGSILLVLLMGLALFNDFSRL >gi|296494549|gb|ADTN01000189.1| GENE 7 6216 - 8648 2473 810 aa, chain + ## HITS:1 COG:ECs0179 KEGG:ns NR:ns ## COG: ECs0179 
COG4775 # Protein_GI_number: 15829433 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein/protective antigen OMA87 # Organism: Escherichia coli O157:H7 # 1 810 1 810 810 1609 100.0 0 MAMKKLLIASLLFSSATVYGAEGFVVKDIHFEGLQRVAVGAALLSMPVRTGDTVNDEDIS NTIRALFATGNFEDVRVLRDGDTLLVQVKERPTIASITFSGNKSVKDDMLKQNLEASGVR VGESLDRTTIADIEKGLEDFYYSVGKYSASVKAVVTPLPRNRVDLKLVFQEGVSAEIQQI NIVGNHAFTTDELISHFQLRDEVPWWNVVGDRKYQKQKLAGDLETLRSYYLDRGYARFNI DSTQVSLTPDKKGIYVTVNITEGDQYKLSGVEVSGNLAGHSAEIEQLTKIEPGELYNGTK VTKMEDDIKKLLGRYGYAYPRVQSMPEINDADKTVKLRVNVDAGNRFYVRKIRFEGNDTS KDAVLRREMRQMEGAWLGSDLVDQGKERLNRLGFFETVDTDTQRVPGSPDQVDVVYKVKE RNTGSFNFGIGYGTESGVSFQAGVQQDNWLGTGYAVGINGTKNDYQTYAELSVTNPYFTV DGVSLGGRLFYNDFQADDADLSDYTNKSYGTDVTLGFPINEYNSLRAGLGYVHNSLSNMQ PQVAMWRYLYSMGEHPSTSDQDNSFKTDDFTFNYGWTYNKLDRGYFPTDGSRVNLTGKVT IPGSDNEYYKVTLDTATYVPIDDDHKWVVLGRTRWGYGDGLGGKEMPFYENFYAGGSSTV RGFQSNTIGPKAVYFPHQASNYDPDYDYECATQDGAKDLCKSDDAVGGNAMAVASLEFIT PTPFISDKYANSVRTSFFWDMGTVWDTNWDSSQYSGYPDYSDPSNIRMSAGIALQWMSPL GPLVFSYAQPFKKYDGDKAEQFQFNIGKTW >gi|296494549|gb|ADTN01000189.1| GENE 8 8685 - 8759 56 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTLGDQYKIAGPRKELHPPVQMGW >gi|296494549|gb|ADTN01000189.1| GENE 9 8770 - 9255 617 161 aa, chain + ## HITS:1 COG:STM0225 KEGG:ns NR:ns ## COG: STM0225 COG2825 # Protein_GI_number: 16763615 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein # Organism: Salmonella typhimurium LT2 # 1 161 1 161 161 209 90.0 2e-54 MKKWLLAAGLGLALATSAQAADKIAIVNMGSLFQQVAQKTGVSNTLENEFKGRASELQRM ETDLQAKMKKLQSMKAGSDRTKLEKDVMAQRQTFAQKAQAFEQDRARRSNEERGKLVTRI QTAVKSVANSQDIDLVVDANAVAYNSSDVKDITADVLKQVK >gi|296494549|gb|ADTN01000189.1| GENE 10 9259 - 10284 941 341 aa, chain + ## HITS:1 COG:lpxD KEGG:ns NR:ns ## COG: lpxD COG1044 # Protein_GI_number: 16128172 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase # Organism: Escherichia coli K12 # 1 341 1 341 341 650 
100.0 0 MPSIRLADLAQQLDAELHGDGDIVITGVASMQSAQTGHITFMVNPKYREHLGLCQASAVV MTQDDLPFAKSAALVVKNPYLTYARMAQILDTTPQPAQNIAPSAVIDATAKLGNNVSIGA NAVIESGVELGDNVIIGAGCFVGKNSKIGAGSRLWANVTIYHEIQIGQNCLIQSGTVVGA DGFGYANDRGNWVKIPQIGRVIIGDRVEIGACTTIDRGALDDTIIGNGVIIDNQCQIAHN VVIGDNTAVAGGVIMAGSLKIGRYCMIGGASVINGHMEICDKVTVTGMGMVMRPITEPGV YSSGIPLQPNKVWRKTAALVMNIDDMSKRLKSLERKVNQQD >gi|296494549|gb|ADTN01000189.1| GENE 11 10389 - 10844 512 151 aa, chain + ## HITS:1 COG:ZfabZ KEGG:ns NR:ns ## COG: ZfabZ COG0764 # Protein_GI_number: 15799862 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases # Organism: Escherichia coli O157:H7 EDL933 # 1 151 1 151 151 295 100.0 2e-80 MTTNTHTLQIEEILELLPHRFPFLLVDRVLDFEEGRFLRAVKNVSVNEPFFQGHFPGKPI FPGVLILEAMAQATGILAFKSVGKLEPGELYYFAGIDEARFKRPVVPGDQMIMEVTFEKT RRGLTRFKGVALVDGKVVCEATMMCARSREA >gi|296494549|gb|ADTN01000189.1| GENE 12 10848 - 11636 783 262 aa, chain + ## HITS:1 COG:lpxA KEGG:ns NR:ns ## COG: lpxA COG1043 # Protein_GI_number: 16128174 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase # Organism: Escherichia coli K12 # 1 262 1 262 262 511 100.0 1e-145 MIDKSAFVHPTAIVEEGASIGANAHIGPFCIVGPHVEIGEGTVLKSHVVVNGHTKIGRDN EIYQFASIGEVNQDLKYAGEPTRVEIGDRNRIRESVTIHRGTVQGGGLTKVGSDNLLMIN AHIAHDCTVGNRCILANNATLAGHVSVDDFAIIGGMTAVHQFCIIGAHVMVGGCSGVAQD VPPYVIAQGNHATPFGVNIEGLKRRGFSREAITAIRNAYKLIYRSGKTLDEVKPEIAELA ETYPEVKAFTDFFARSTRGLIR >gi|296494549|gb|ADTN01000189.1| GENE 13 11636 - 12784 1167 382 aa, chain + ## HITS:1 COG:lpxB KEGG:ns NR:ns ## COG: lpxB COG0763 # Protein_GI_number: 16128175 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipid A disaccharide synthetase # Organism: Escherichia coli K12 # 1 382 1 382 382 748 100.0 0 MTEQRPLTIALVAGETSGDILGAGLIRALKEHVPNARFVGVAGPRMQAEGCEAWYEMEEL AVMGIVEVLGRLRRLLHIRADLTKRFGELKPDVFVGIDAPDFNITLEGNLKKQGIKTIHY 
VSPSVWAWRQKRVFKIGRATDLVLAFLPFEKAFYDKYNVPCRFIGHTMADAMPLDPDKNA ARDVLGIPHDAHCLALLPGSRGAEVEMLSADFLKTAQLLRQTYPDLEIVVPLVNAKRREQ FERIKAEVAPDLSVHLLDGMGREAMVASDAALLASGTAALECMLAKCPMVVGYRMKPFTF WLAKRLVKTDYVSLPNLLAGRELVKELLQEECEPQKLAAALLPLLANGKTSHAMHDTFRE LHQQIRCNADEQAAQAVLELAQ >gi|296494549|gb|ADTN01000189.1| GENE 14 12781 - 13377 741 198 aa, chain + ## HITS:1 COG:rnhB KEGG:ns NR:ns ## COG: rnhB COG0164 # Protein_GI_number: 16128176 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HII # Organism: Escherichia coli K12 # 1 198 1 198 198 372 100.0 1e-103 MIEFVYPHTQLVAGVDEVGRGPLVGAVVTAAVILDPARPIAGLNDSKKLSEKRRLALYEE IKEKALSWSLGRAEPHEIDELNILHATMLAMQRAVAGLHIAPEYVLIDGNRCPKLPMPAM AVVKGDSRVPEISAASILAKVTRDAEMAALDIVFPQYGFAQHKGYPTAFHLEKLAEHGAT EHHRRSFGPVKRALGLAS >gi|296494549|gb|ADTN01000189.1| GENE 15 13414 - 15769 2510 785 aa, chain + ## HITS:1 COG:dnaE KEGG:ns NR:ns ## COG: dnaE COG0587 # Protein_GI_number: 16128177 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Escherichia coli K12 # 1 785 1 785 1160 1615 100.0 0 MSEPRFVHLRVHSDYSMIDGLAKTAPLVKKAAALGMPALAITDFTNLCGLVKFYGAGHGA GIKPIVGADFNVQCDLLGDELTHLTVLAANNTGYQNLTLLISKAYQRGYGAAGPIIDRDW LIELNEGLILLSGGRMGDVGRSLLRGNSALVDECVAFYEEHFPDRYFLELIRTGRPDEES YLHAAVELAEARGLPVVATNDVRFIDSSDFDAHEIRVAIHDGFTLDDPKRPRNYSPQQYM RSEEEMCELFADIPEALANTVEIAKRCNVTVRLGEYFLPQFPTGDMSTEDYLVKRAKEGL EERLAFLFPDEEERLKRRPEYDERLETELQVINQMGFPGYFLIVMEFIQWSKDNGVPVGP GRGSGAGSLVAYALKITDLDPLEFDLLFERFLNPERVSMPDFDVDFCMEKRDQVIEHVAD MYGRDAVSQIITFGTMAAKAVIRDVGRVLGHPYGFVDRISKLIPPDPGMTLAKAFEAEPQ LPEIYEADEEVKALIDMARKLEGVTRNAGKHAGGVVIAPTKITDFAPLYCDEEGKHPVTQ FDKSDVEYAGLVKFDFLGLRTLTIINWALEMINKRRAKNGEPPLDIAAIPLDDKKSFDML QRSETTAVFQLESRGMKDLIKRLQPDCFEDMIALVALFRPGPLQSGMVDNFIDRKHGREE ISYPDVQWQHESLKPVLEPTYGIILYQEQVMQIAQVLSGYTLGGADMLRRAMGKKKPEEM AKQRSVFAEGAEKNGINAELAMKIFDLVEKFAGYGFNKSHSAAYALVSYQTLWLKAHYPA EFMAA Prediction of potential genes in microbial genomes Time: Sun May 15 23:46:56 2011 Seq name: 
gi|296494548|gb|ADTN01000190.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont381.9, whole genome shotgun sequence Length of sequence - 16037 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 10, operones - 6 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.500 + CDS 15 - 1151 1027 ## COG0587 DNA polymerase III, alpha subunit 2 1 Op 2 3/1.000 + CDS 1164 - 2123 1381 ## COG0825 Acetyl-CoA carboxylase alpha subunit + Term 2144 - 2173 1.2 + Prom 2142 - 2201 3.5 3 2 Op 1 1/1.000 + CDS 2222 - 4363 1991 ## COG1982 Arginine/lysine/ornithine decarboxylases 4 2 Op 2 2/1.000 + CDS 4420 - 4809 417 ## COG0346 Lactoylglutathione lyase and related lyases + Term 4829 - 4861 4.7 5 3 Tu 1 . + CDS 4874 - 6172 1126 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control + Term 6384 - 6442 1.3 - Term 6167 - 6211 6.6 6 4 Op 1 . - CDS 6221 - 6475 301 ## COG4568 Transcriptional antiterminator 7 4 Op 2 . - CDS 6468 - 6668 133 ## ECIAI39_0458 hypothetical protein - Prom 6740 - 6799 3.3 + Prom 6753 - 6812 1.8 8 5 Op 1 5/0.500 + CDS 6834 - 7379 745 ## COG4681 Uncharacterized protein conserved in bacteria 9 5 Op 2 4/0.500 + CDS 7376 - 7798 258 ## COG1186 Protein chain release factor B 10 5 Op 3 . + CDS 7812 - 8522 688 ## COG3015 Uncharacterized lipoprotein NlpE involved in copper resistance - Term 8473 - 8518 2.1 11 6 Tu 1 . - CDS 8722 - 9189 412 ## JW5016 predicted lipoprotein - Prom 9237 - 9296 2.1 - Term 9554 - 9594 9.0 12 7 Tu 1 6/0.500 - CDS 9600 - 11318 2167 ## COG0442 Prolyl-tRNA synthetase - Term 11336 - 11366 2.7 13 8 Op 1 . - CDS 11430 - 12137 626 ## COG1720 Uncharacterized conserved protein 14 8 Op 2 . 
- CDS 12134 - 12538 363 ## EC55989_0194 outer membrane lipoprotein - Prom 12563 - 12622 2.5 - Term 12585 - 12621 6.7 15 9 Op 1 22/0.000 - CDS 12656 - 13471 1216 ## COG1464 ABC-type metal ion transport system, periplasmic component/surface antigen 16 9 Op 2 32/0.000 - CDS 13511 - 14164 913 ## COG2011 ABC-type metal ion transport system, permease component 17 9 Op 3 . - CDS 14157 - 15188 1203 ## COG1135 ABC-type metal ion transport system, ATPase component + Prom 15150 - 15209 3.9 18 10 Tu 1 . + CDS 15376 - 15951 540 ## COG0241 Histidinol phosphatase and related phosphatases Predicted protein(s) >gi|296494548|gb|ADTN01000190.1| GENE 1 15 - 1151 1027 378 aa, chain + ## HITS:1 COG:dnaE KEGG:ns NR:ns ## COG: dnaE COG0587 # Protein_GI_number: 16128177 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Escherichia coli K12 # 1 378 783 1160 1160 766 99.0 0 MAAGMTADMDNTEKVVGLVDECWRMGLKILPPDINSGLYHFHVNDDGEIVYGIGAIKGVG EGPIEAIIEARNKGGYFRELFDLCARTDTKKLNRRVLEKLIMSGAFDRLGPHRAALMNSL GDALKAADQHAKAEAIGQADMFGVLAEEPEQIEQSYASCQPWPEQVVLDGERETLGLYLT GHPINQYLKEIERYVGGVRLKDMHPTERGKVITAAGLVVAARVMVTKRGNRIGICTLDDR SGRLEVMLFTDALDKYQQLLEKDRILIVSGQVSFDDFSGGLKMTAREVMDIDEAREKYAR GLAISLTDRQIDDQLLNRLRQSLEPHRSGTIPVHLYYQRADARARLRFGATWRVSPSDRL LNDLRGLIGSEQVELEFD >gi|296494548|gb|ADTN01000190.1| GENE 2 1164 - 2123 1381 319 aa, chain + ## HITS:1 COG:ECs0187 KEGG:ns NR:ns ## COG: ECs0187 COG0825 # Protein_GI_number: 15829441 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase alpha subunit # Organism: Escherichia coli O157:H7 # 1 319 1 319 319 619 100.0 1e-177 MSLNFLDFEQPIAELEAKIDSLTAVSRQDEKLDINIDEEVHRLREKSVELTRKIFADLGA WQIAQLARHPQRPYTLDYVRLAFDEFDELAGDRAYADDKAIVGGIARLDGRPVMIIGHQK GRETKEKIRRNFGMPAPEGYRKALRLMQMAERFKMPIITFIDTPGAYPGVGAEERGQSEA IARNLREMSRLGVPVVCTVIGEGGSGGALAIGVGDKVNMLQYSTYSVISPEGCASILWKS ADKAPLAAEAMGIIAPRLKELKLIDSIIPEPLGGAHRNPEAMAASLKAQLLADLADLDVL STEDLKNRRYQRLMSYGYA >gi|296494548|gb|ADTN01000190.1| 
GENE 3 2222 - 4363 1991 713 aa, chain + ## HITS:1 COG:ldcC KEGG:ns NR:ns ## COG: ldcC COG1982 # Protein_GI_number: 16128179 # Func_class: E Amino acid transport and metabolism # Function: Arginine/lysine/ornithine decarboxylases # Organism: Escherichia coli K12 # 1 713 1 713 713 1511 100.0 0 MNIIAIMGPHGVFYKDEPIKELESALVAQGFQIIWPQNSVDLLKFIEHNPRICGVIFDWD EYSLDLCSDINQLNEYLPLYAFINTHSTMDVSVQDMRMALWFFEYALGQAEDIAIRMRQY TDEYLDNITPPFTKALFTYVKERKYTFCTPGHMGGTAYQKSPVGCLFYDFFGGNTLKADV SISVTELGSLLDHTGPHLEAEEYIARTFGAEQSYIVTNGTSTSNKIVGMYAAPSGSTLLI DRNCHKSLAHLLMMNDVVPVWLKPTRNALGILGGIPRREFTRDSIEEKVAATTQAQWPVH AVITNSTYDGLLYNTDWIKQTLDVPSIHFDSAWVPYTHFHPIYQGKSGMSGERVAGKVIF ETQSTHKMLAALSQASLIHIKGEYDEEAFNEAFMMHTTTSPSYPIVASVETAAAMLRGNP GKRLINRSVERALHFRKEVQRLREESDGWFFDIWQPPQVDEAECWPVAPGEQWHGFNDAD ADHMFLDPVKVTILTPGMDEQGNMSEEGIPAALVAKFLDERGIVVEKTGPYNLLFLFSIG IDKTKAMGLLRGLTEFKRSYDLNLRIKNMLPDLYAEDPDFYRNMRIQDLAQGIHKLIRKH DLPGLMLRAFDTLPEMIMTPHQAWQRQIKGEVETIALEQLVGRVSANMILPYPPGVPLLM PGEMLTKESRTVLDFLLMLCSVGQHYPGFETDIHGAKQDEDGVYRVRVLKMAG >gi|296494548|gb|ADTN01000190.1| GENE 4 4420 - 4809 417 129 aa, chain + ## HITS:1 COG:yaeR KEGG:ns NR:ns ## COG: yaeR COG0346 # Protein_GI_number: 16128180 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Escherichia coli K12 # 1 129 10 138 138 270 99.0 6e-73 MLGLKQVHHIAIIATDYAVSTAFYCDILGFTLQSEVYREARDSWKGDLALNGQYVIELFS FPFPPERPSRPEACGLRHLAFSVDDIDAAVAHLESHNVKCETIRVDPYTQKRFTFFNDPD GLPLELYEQ >gi|296494548|gb|ADTN01000190.1| GENE 5 4874 - 6172 1126 432 aa, chain + ## HITS:1 COG:mesJ KEGG:ns NR:ns ## COG: mesJ COG0037 # Protein_GI_number: 16128181 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Escherichia coli K12 # 1 432 1 432 432 802 99.0 0 MTLTLNRQLLTSRQILVAFSGGLDSTVLLHQLVQWRTENPGVALRAIHVHHGLSANADAW VTHCENVCQQWQVPLVVERVQLAQEGLGIEAQARQARYQAFARTLLPGEVLVTAQHLDDQ 
CETFLLALKRGSGPAGLSAMAEVSEFAGTRLIRPLLARTRGELVQWARQYDLRWIEDESN QDDSYDRNFLRLRVVPLLQQRWPHFAEATARSAALCAEQESLLDELLADDLAHCQSPQGT LQIVPMLAMSDARRAAIIRRWLAGQNAPMPSRDALVRIWQEVALAREDASPCLRLGAFEI RRYQSQLWWIKSVTGQSENIVPWQTWLQPLELPAGLGSVQLNAGGDIRPPRADEAVSVRF KAPGLLHIVGRNGGRKLKKIWQELGVPPWLRDTTPLLFYGETLIAAAGGFVTQEGVAEGE NGVSFVWQKTLS >gi|296494548|gb|ADTN01000190.1| GENE 6 6221 - 6475 301 84 aa, chain - ## HITS:1 COG:ECs0191 KEGG:ns NR:ns ## COG: ECs0191 COG4568 # Protein_GI_number: 15829445 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Escherichia coli O157:H7 # 1 84 3 86 86 155 100.0 1e-38 MNDTYQPINCDDYDNLELACQHHLMLTLELKDGEKLQAKASDLVSRKNVEYLVVEAAGET RELRLDKITSFSHPEIGTVVVSES >gi|296494548|gb|ADTN01000190.1| GENE 7 6468 - 6668 133 66 aa, chain - ## HITS:1 COG:no KEGG:ECIAI39_0458 NR:ns ## KEGG: ECIAI39_0458 # Name: yaeP # Def: hypothetical protein # Organism: E.coli_IAI39 # Pathway: not_defined # 1 66 1 66 66 107 100.0 1e-22 MEKYCELIRKRYAEIASGDLGYVPDALGCVLKVLNEMAADDALSEAVREKAAYAAANLLV SDYVNE >gi|296494548|gb|ADTN01000190.1| GENE 8 6834 - 7379 745 181 aa, chain + ## HITS:1 COG:yaeQ KEGG:ns NR:ns ## COG: yaeQ COG4681 # Protein_GI_number: 16128183 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 181 1 181 181 331 100.0 5e-91 MALKATIYKATVNVADLDRNQFLDASLTLARHPSETQERMMLRLLAWLKYADERLQFTRG LCADDEPEAWLRNDHLGIDLWIELGLPDERRIKKACTQAAEVALFTYNSRAAQIWWQQNQ SKCVQFANLSVWYLDDEQLAKVSAFADRTMTLQATIQDGVIWLSDDKNNLEVNLTAWQQP S >gi|296494548|gb|ADTN01000190.1| GENE 9 7376 - 7798 258 140 aa, chain + ## HITS:1 COG:yaeJ KEGG:ns NR:ns ## COG: yaeJ COG1186 # Protein_GI_number: 16128184 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Escherichia coli K12 # 1 140 1 140 140 210 100.0 8e-55 MIVISRHVAIPDGELEITAIRAQGAGGQHVNKTSTAIHLRFDIRASSLPEYYKERLLAAS HHLISSDGVIVIKAQEYRSQELNREAALARLVAMIKELTTEKKARRPTRPTRASKERRLA SKAQKSSVKAMRGKVRSGRE 
>gi|296494548|gb|ADTN01000190.1| GENE 10 7812 - 8522 688 236 aa, chain + ## HITS:1 COG:cutF KEGG:ns NR:ns ## COG: cutF COG3015 # Protein_GI_number: 16128185 # Func_class: M Cell wall/membrane/envelope biogenesis; P Inorganic ion transport and metabolism # Function: Uncharacterized lipoprotein NlpE involved in copper resistance # Organism: Escherichia coli K12 # 1 236 1 236 236 470 100.0 1e-132 MVKKAIVTAMAVISLFTLMGCNNRAEVDTLSPAQAAELKPMPQSWRGVLPCADCEGIETS LFLEKDGTWVMNERYLGAREEPSSFASYGTWARTADKLVLTDSKGEKSYYRAKGDALEML DREGNPIESQFNYTLEAAQSSLPMTPMTLRGMYFYMADAATFTDCATGKRFMVANNAELE RSYLAARGHSEKPVLLSVEGHFTLEGNPDTGAPTKVLAPDTAGKFYPNQDCSSLGQ >gi|296494548|gb|ADTN01000190.1| GENE 11 8722 - 9189 412 155 aa, chain - ## HITS:1 COG:no KEGG:JW5016 NR:ns ## KEGG: JW5016 # Name: yaeF # Def: predicted lipoprotein # Organism: E.coli_J # Pathway: not_defined # 1 155 120 274 274 326 100.0 2e-88 MKHSDKLFVLRVPDLTPQQATDITAFANKIKDSGYNYRGIVEFIPFMVTRQMCSLNPFSE DFRQQCVSGLAKAQLSSVGEGDKKSWFCSEFVTDAFAKAGHPLTLAQSGWISPADLMHMR IGDVSAFKPETQLQYVGHLKPGIYIKAGRFVGLTR >gi|296494548|gb|ADTN01000190.1| GENE 12 9600 - 11318 2167 572 aa, chain - ## HITS:1 COG:ECs0196 KEGG:ns NR:ns ## COG: ECs0196 COG0442 # Protein_GI_number: 15829450 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Prolyl-tRNA synthetase # Organism: Escherichia coli O157:H7 # 1 572 1 572 572 1135 99.0 0 MRTSQYLLSTLKETPADAEVISHQLMLRAGMIRKLASGLYTWLPTGVRVLKKVENIVREE MNNAGAIEVSMPVVQPADLWQESGRWEQYGPELLRFVDRGERPFVLGPTHEEVITDLIRN ELSSYKQLPLNFYQIQTKFRDEVRPRFGVMRSREFLMKDAYSFHTSQESLQETYDAMYAA YSKIFSRMGLDFRAVQADTGSIGGSASHEFQVLAQSGEDDVVFSDTSDYAANIELAEAIA PKEPRAAATQEMTLVDTPNAKTIAELVEQFNLPIEKTVKTLLVKAVEGSSFPLVALLVRG DHELNEVKAEKLPQVASPLTFATEEEIRAVVKAGPGSLGPVNMPIPVVIDRTVAAMSDFA AGANIDGKHYFGINWDRDVATPEVADIRNVVAGDPSPDGQGTLLIKRGIEVGHIFQLGTK YSEALKASVQGEDGRNQILTMGCYGIGVTRVVAAAIEQNYDERGIVWPDAIAPFQVAILP MNMHKSFRVQELAEKLYSELRAQGIEVLLDDRKERPGVMFADMELIGIPHTIVLGDRNLD NDDIEYKYRRNGEKQLIKTGDIVEYLVKQIKG >gi|296494548|gb|ADTN01000190.1| GENE 13 
11430 - 12137 626 235 aa, chain - ## HITS:1 COG:ECs0197 KEGG:ns NR:ns ## COG: ECs0197 COG1720 # Protein_GI_number: 15829451 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 235 1 235 235 469 99.0 1e-132 MSSFQFEQIGVIRSPYKEKFAVPRQPGLVKSANGELHLIAPYNQADAVRGLEAFSHLWIL FVFHQTMEGGWRPTVRPPRLGGNARMGVFATRSTFRPNPIGMSLVELKEVVCHKDSVILK LGSLDLVDGTPVVDIKPYLPFAESLPDASASYAQSAPAAEMAVSFTAEVEKQLLTLEKRY PQLTLFIREVLAQDPRPAYRKGEETGKTYAVWLHDFNVRWRVTDAGFEVFALEPR >gi|296494548|gb|ADTN01000190.1| GENE 14 12134 - 12538 363 134 aa, chain - ## HITS:1 COG:no KEGG:EC55989_0194 NR:ns ## KEGG: EC55989_0194 # Name: rcsF # Def: outer membrane lipoprotein # Organism: E.coli_55989 # Pathway: Two-component system [PATH:eck02020] # 1 134 1 134 134 220 100.0 1e-56 MRALPICLVALMLSGCSMLSRSPVEPVQSTAPQPKAEPAKPKAPRATPVRIYTNAEELVG KPFRDLGEVSGDSCQASNQDSPPSIPTARKRMQINASKMKANAVLLHSCEVTSGTPGCYR QAVCIGSALNITAK >gi|296494548|gb|ADTN01000190.1| GENE 15 12656 - 13471 1216 271 aa, chain - ## HITS:1 COG:yaeC KEGG:ns NR:ns ## COG: yaeC COG1464 # Protein_GI_number: 16128190 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface antigen # Organism: Escherichia coli K12 # 1 271 1 271 271 507 100.0 1e-144 MAFKFKTFAAVGALIGSLALVGCGQDEKDPNHIKVGVIVGAEQQVAEVAQKVAKDKYGLD VELVTFNDYVLPNEALSKGDIDANAFQHKPYLDQQLKDRGYKLVAVGNTFVYPIAGYSKK IKSLDELQDGSQVAVPNDPTNLGRSLLLLQKVGLIKLKDGVGLLPTVLDVVENPKNLKIV ELEAPQLPRSLDDAQIALAVINTTYASQIGLTPAKDGIFVEDKESPYVNLIVTREDNKDA ENVKKFVQAYQSDEVYEAANKVFNGGAVKGW >gi|296494548|gb|ADTN01000190.1| GENE 16 13511 - 14164 913 217 aa, chain - ## HITS:1 COG:yaeE KEGG:ns NR:ns ## COG: yaeE COG2011 # Protein_GI_number: 16128191 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, permease component # Organism: Escherichia coli K12 # 1 217 1 217 217 310 100.0 9e-85 MSEPMMWLLVRGVWETLAMTFVSGFFGFVIGLPVGVLLYVTRPGQIIANAKLYRTVSAIV 
NIFRSIPFIILLVWMIPFTRVIVGTSIGLQAAIVPLTVGAAPFIARMVENALLEIPTGLI EASRAMGATPMQIVRKVLLPEALPGLVNAATITLITLVGYSAMGGAVGAGGLGQIGYQYG YIGYNATVMNTVLVLLVILVYLIQFAGDRIVRAVTRK >gi|296494548|gb|ADTN01000190.1| GENE 17 14157 - 15188 1203 343 aa, chain - ## HITS:1 COG:ECs0201 KEGG:ns NR:ns ## COG: ECs0201 COG1135 # Protein_GI_number: 15829455 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, ATPase component # Organism: Escherichia coli O157:H7 # 1 343 1 343 343 676 99.0 0 MIKLSNITKVFHQGTRTIQALNNVSLHVPAGQIYGVIGASGAGKSTLIRCVNLLERPTEG SVLVDGQELTTLSESELTKARRQIGMIFQHFNLLSSRTVFGNVALPLELDNTPKDEVKRR VTELLSLVGLGDKHDSYPSNLSGGQKQRVAIARALASNPKVLLCDEATSALDPATTRSIL ELLKDINRRLGLTILLITHEMDVVKRICDCVAVISNGELIEQDTVSEVFSHPKTPLAQKF IQSTLHLDIPEDYQERLQAEPFTDCVPMLRLEFTGQSVDAPLLSETARRFNVNNNIISAQ MDYAGGVKFGIMLTEMHGTQQDTQAAIAWLQEHHVKVEVLGYV >gi|296494548|gb|ADTN01000190.1| GENE 18 15376 - 15951 540 191 aa, chain + ## HITS:1 COG:ECs0202 KEGG:ns NR:ns ## COG: ECs0202 COG0241 # Protein_GI_number: 15829456 # Func_class: E Amino acid transport and metabolism # Function: Histidinol phosphatase and related phosphatases # Organism: Escherichia coli O157:H7 # 1 191 1 191 191 395 100.0 1e-110 MAKSVPAIFLDRDGTINVDHGYVHEIDNFEFIDGVIDAMRELKKMGFALVVVTNQSGIAR GKFTEAQFETLTEWMDWSLADRDVDLDGIYYCPHHPQGSVEEFRQVCDCRKPHPGMLLSA RDYLHIDMAASYMVGDKLEDMQAAVAANVGTKVLVRTGKPITPEAENAADWVLNSLADLP QAIKKQQKPAQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:47:11 2011 Seq name: gi|296494547|gb|ADTN01000191.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont396.1, whole genome shotgun sequence Length of sequence - 23665 bp Number of predicted genes - 28, with homology - 27 Number of transcription units - 10, operones - 5 average op.length - 4.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 352 312 ## COG2003 DNA repair proteins 2 1 Op 2 . + CDS 415 - 636 298 ## SDY_4591 hypothetical protein 3 1 Op 3 . 
+ CDS 636 - 749 60 ## ECUMN_4887 hypothetical protein 4 1 Op 4 . + CDS 799 - 1167 224 ## ECED1_5187 antitoxin of the YeeV-YeeU toxin-antitoxin system; CP4-44 prophage + Prom 1971 - 2030 6.5 5 2 Op 1 4/0.000 + CDS 2095 - 3354 1052 ## COG4857 Predicted kinase 6 2 Op 2 . + CDS 3364 - 4479 808 ## COG0182 Predicted translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family 7 2 Op 3 . + CDS 4510 - 5151 344 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases + Term 5183 - 5216 5.1 + Prom 5158 - 5217 3.0 8 3 Tu 1 . + CDS 5255 - 6259 857 ## ECUMN_4892 putative integral membrane protein; putative sugar permease + Term 6264 - 6299 5.0 - Term 6250 - 6287 6.2 9 4 Tu 1 . - CDS 6300 - 7286 660 ## COG2390 Transcriptional regulator, contains sigma factor-related N-terminal domain - Prom 7363 - 7422 3.7 - Term 7594 - 7631 -0.8 10 5 Op 1 2/0.000 - CDS 7633 - 8982 1133 ## COG2610 H+/gluconate symporter and related permeases - Prom 9028 - 9087 4.0 11 5 Op 2 2/0.000 - CDS 9089 - 11056 1298 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase 12 5 Op 3 . - CDS 11067 - 12026 867 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 13 5 Op 4 1/0.000 - CDS 11977 - 12765 413 ## COG1414 Transcriptional regulator - Prom 12877 - 12936 4.4 14 5 Op 5 2/0.000 - CDS 13068 - 13850 580 ## COG1349 Transcriptional regulators of sugar metabolism 15 5 Op 6 2/0.000 - CDS 13867 - 14499 406 ## COG0036 Pentose-5-phosphate-3-epimerase 16 5 Op 7 2/0.000 - CDS 14511 - 14942 300 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 17 5 Op 8 2/0.000 - CDS 15073 - 15879 791 ## COG0434 Predicted TIM-barrel enzyme 18 5 Op 9 10/0.000 - CDS 15892 - 17205 1188 ## COG3775 Phosphotransferase system, galactitol-specific IIC component 19 5 Op 10 1/0.000 - CDS 17217 - 17495 265 ## COG3414 Phosphotransferase system, galactitol-specific IIB component 20 5 Op 11 . 
- CDS 17492 - 18613 788 ## COG1363 Cellulase M and related proteins - Prom 18673 - 18732 5.1 21 6 Tu 1 . - CDS 19183 - 19272 59 ## - Prom 19315 - 19374 3.5 - Term 19349 - 19389 6.6 22 7 Op 1 3/0.000 - CDS 19399 - 20145 715 ## COG0500 SAM-dependent methyltransferases 23 7 Op 2 1/0.000 - CDS 20201 - 20746 308 ## COG3153 Predicted acetyltransferase 24 7 Op 3 . - CDS 20758 - 21015 254 ## COG3811 Uncharacterized protein conserved in bacteria - Prom 21221 - 21280 5.1 25 8 Tu 1 . - CDS 21506 - 21637 106 ## ECUMN_4909 hypothetical protein - Prom 21664 - 21723 3.5 + Prom 21653 - 21712 2.3 26 9 Op 1 2/0.000 + CDS 21801 - 22010 79 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits 27 9 Op 2 . + CDS 21977 - 22993 410 ## COG1112 Superfamily I DNA and RNA helicases and helicase subunits + Term 23051 - 23091 -0.4 + Prom 23017 - 23076 1.6 28 10 Tu 1 . + CDS 23180 - 23663 170 ## COG3039 Transposase and inactivated derivatives, IS5 family Predicted protein(s) >gi|296494547|gb|ADTN01000191.1| GENE 1 2 - 352 312 116 aa, chain + ## HITS:1 COG:ECs2803 KEGG:ns NR:ns ## COG: ECs2803 COG2003 # Protein_GI_number: 15832057 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Escherichia coli O157:H7 # 1 116 43 158 158 221 98.0 2e-58 AREWLILNMAGLEREEFRVLYLNNQNQLIAGETLFTGTINRTEVHPREVIKRALYHNAAA VVLAHNHPSGEVTPSKADRLITERLVQALGLVDIRVPDHLIVGGSQVFSFAEYGLL >gi|296494547|gb|ADTN01000191.1| GENE 2 415 - 636 298 73 aa, chain + ## HITS:1 COG:no KEGG:SDY_4591 NR:ns ## KEGG: SDY_4591 # Name: not_defined # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 73 1 73 73 148 100.0 7e-35 MKIITRGEAMRIHRQHPASRLFPFCTGKYRWHGSAEAYTGREVQDIPGVLAVFAERRKDS FGPYVRLMSVTLN >gi|296494547|gb|ADTN01000191.1| GENE 3 636 - 749 60 37 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_4887 NR:ns ## KEGG: ECUMN_4887 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 37 1 37 37 71 100.0 1e-11 
MFRHPENEYDWQSSDAAVDEVLLINEYTFTEAGLRAG >gi|296494547|gb|ADTN01000191.1| GENE 4 799 - 1167 224 122 aa, chain + ## HITS:1 COG:no KEGG:ECED1_5187 NR:ns ## KEGG: ECED1_5187 # Name: yeeU # Def: antitoxin of the YeeV-YeeU toxin-antitoxin system; CP4-44 prophage # Organism: E.coli_ED1a # Pathway: not_defined # 1 122 1 122 122 236 95.0 1e-61 MSDTLSGTTHPDDNDDHPWWGLPCTVTPCFGARLVQEGNRLHYLADRAGIRGRFRDADAY PLDQAFPLLMKQLKLMLTSGELNPRHQHTVTLYAKGLTCEADTLGSCGYVYLAVYPTPET KK >gi|296494547|gb|ADTN01000191.1| GENE 5 2095 - 3354 1052 419 aa, chain + ## HITS:1 COG:mll7285 KEGG:ns NR:ns ## COG: mll7285 COG4857 # Protein_GI_number: 13476069 # Func_class: R General function prediction only # Function: Predicted kinase # Organism: Mesorhizobium loti # 1 412 1 410 423 341 44.0 2e-93 MTDSIPSGYKPLTCDTLPGYLSSRLTPSCEPGGLPEEWKVSEVGDGNLNMVFIVEGTHKT IIVKQALPWLRAGGEGWPLSLSRAGFEYNVLCQEAKYAGHTLIPQVYFYDPEMALFAMEY LTPHVILRKELINGKKFPKLAEDIGRFLAQTLFNTSDIGMSAEQKKALTAEFALNHELCK ITEDLIFTEPYYNAERNNWTSPELDDAVHKAWADVEMIQVAMRYKYKFMTEAQALLHGDL HSGSIMVTDTDTKVIDPEFGFMGPMAFDIGNYIGNLLLAYFSRPGWDANEQRRADYQEWL LQQIVQTWSVFTREFRQLWDNKTQGDAWSTEMYQQNRAALEDAQDQFFATLLEDSLVNAG MEMNRRIIGFAGVAELKQIENTELRAGCERRALTMARDLIVNARPFKNMDSVIQSAKVK >gi|296494547|gb|ADTN01000191.1| GENE 6 3364 - 4479 808 371 aa, chain + ## HITS:1 COG:SMb20624 KEGG:ns NR:ns ## COG: SMb20624 COG0182 # Protein_GI_number: 16265284 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family # Organism: Sinorhizobium meliloti # 1 360 1 362 364 442 62.0 1e-124 MNIKGKHYRTVWVSGDGKAVEIIDQTKLPFKFEVVALTSAEMAATAIQDMWVRGAPLIGV VAAYGIALGMNHDASDMGLQRYYDLLIKTRPTAINLKWALDRMIDTLKDLCVSERKDVAW ALAAEIAEEDVALCEQIGLHGAEVIREIAQKKPAGSVVNILTHCNAGWLATVDWGTALSP IYKAHENGIPVHVWVDETRPRNQGGLTAFELGSHGIPHTLIADNAGGHLMQHGDVDLCIV GTDRTTARGDVCNKIGTYLKALAAHDNHVPFYVALPSPTIDWTIEDGKSIPIEQRDGKEQ SHVYGINPQGELSWVNTAPEGTRCGNYAFDVTPARYITGFITERGVCAASKSALADMFAD LKSKALQGEQH >gi|296494547|gb|ADTN01000191.1| GENE 
7 4510 - 5151 344 213 aa, chain + ## HITS:1 COG:fucA KEGG:ns NR:ns ## COG: fucA COG0235 # Protein_GI_number: 16130707 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli K12 # 1 212 1 212 215 257 58.0 1e-68 MERIKLAEKIISTCREMNASGLNQGTSGNVSARYTGGMLITPSGIAYSKMTPDMIVFVDD KGKPEAGKIPSSEWLIHLACYKARPELNAVIHTHAVNSTAVAIHNHSIPAIHYMVAVSGT DHIPCIPYYTFGSPELADGVSKGIRESKSLLMQHHGMLAMDVTLEKTLWLAGETETLADL YIKCGGLHHDVPVLSEAEMTIVLEKFKTYGLKA >gi|296494547|gb|ADTN01000191.1| GENE 8 5255 - 6259 857 334 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_4892 NR:ns ## KEGG: ECUMN_4892 # Name: not_defined # Def: putative integral membrane protein; putative sugar permease # Organism: E.coli_UMN026 # Pathway: not_defined # 1 334 1 334 334 587 100.0 1e-166 MFIVESYAVAIIMCFITMICWGSWANTTKLVSNKKWEFPLFYWDYSIGLLLCSLLFAFTL GSMGEAGRSFIPDIQQASSSSLMSAILAGIIFNISNILLVASINLAGMAVAFPVGVGLAL ALGVITTYIGNPQGDPLILFLGVVCVVIAIIFTAIAYGRVTQEADKSRRNKGLITAILAG IIMGWFFRFLADSMSDNFSQPASGLMTPYSALVLFAVGLFLSNFVLNRLVMKKPISGEPV NGKMYFSGSLRDHVCGWLGGMIWCVGLAFSLIASGQAGYAISYGLGQGATMIAVIWGVFI WREFASAPAGTNKLLLTMFISYIVGIVLIIAANQ >gi|296494547|gb|ADTN01000191.1| GENE 9 6300 - 7286 660 328 aa, chain - ## HITS:1 COG:yjhU KEGG:ns NR:ns ## COG: yjhU COG2390 # Protein_GI_number: 16132116 # Func_class: K Transcription # Function: Transcriptional regulator, contains sigma factor-related N-terminal domain # Organism: Escherichia coli K12 # 63 328 1 266 266 534 99.0 1e-151 MDRDQTNSSLFNDDPVLHATWLYYQEGKSQTEVAAIMGVSRVTVVKYLQTARENGLVHIN LDVNVFGSIDAALQIRDKFNLQRVIIVPDGEHAGKRDDTKLMRTRLSRAGGMYLNQVIEN GDVLGVAWGRTIHQMSKTMTPKSCKNVTVIQMLGSMPSQPDLTIIESSSQIAYKLSGRVA SLHVPAVVSSARLAMELQAEPIIRSNFDVLTRCTKAFFVVGNALDENPLIRVGVLNKKEM QTYRDLGAVGVICGRFYDKEGMPVVADVDQRILGISLAQLRQIERKIFLAGGERGYDATL GALLGGYVTDLIVDEGTAEFLLACELPH >gi|296494547|gb|ADTN01000191.1| GENE 10 7633 - 8982 1133 449 aa, chain - ## HITS:1 COG:yjhF KEGG:ns NR:ns ## COG: yjhF COG2610 # 
Protein_GI_number: 16132117 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Escherichia coli K12 # 1 449 1 449 449 695 100.0 0 MPLIIVVAGIALLLLLTIKIKLNTFVSLIIVSIAVAIASGMDLSKVVTSVESGLGGTLGH IGLIFGFGVMLGRLLADAGGAQRIALTMLNYFGKNKLDWAVVCSAFIVGIALFFEVGLIL LVPILFAIAREAKISPMFMCVPMLSGLLVAHGFLPPHPGPTVIAREYGADVGLVLIYGII VGIPTFILCGPVLNKFCQRIIPDAFKKEGNIASLGATRRFSESEMPGFGISFLTAMLPVI LMAVVTIIQMTHAKSAADSGLFYNVILFLGNSTIAMLISLLFAIYTMGLGRGKTIPDLMD SCGKAIAGIAGLLLIIGGGGAFKQVLIDSGVGQYISTLVSGMDINPILMAWGVAAFLRIC LGSATVAAISTAGLVIPLLAVHPNTNLALITLATGAGSCICSHVNDASFWMIKDFFGLTT KETLLSWTLMSTLLSISGLIFILLASLVL >gi|296494547|gb|ADTN01000191.1| GENE 11 9089 - 11056 1298 655 aa, chain - ## HITS:1 COG:yjhG KEGG:ns NR:ns ## COG: yjhG COG0129 # Protein_GI_number: 16132118 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Escherichia coli K12 # 1 655 1 655 655 1313 99.0 0 MSVRNIFADESHDIYTVRTHADGPDGELPLTAEMLINRPSGDLFGMTMNAGMGWSPDELD RDGILLLSTLGGLRGADGKPVALALHQGHYELDIQMKAAAEVIKANHALPYAVYVSDPCD GRTQGTTGMFDSLPYRNDASMVMRRLIRSLPDAKAVIGVASCDKGLPATMMALAAQHNIA TVLVPGGATLPAKDGEDNGKVQTIGARFANGELSLQDARRAGCKACASSGGGCQFLGTAG TSQVVAEGLGLAIPHSALAPSGEPVWREIARASARAALNLSQKGITTREILTDKAIENAM TVHAAFGGSTNLLLHIPAIAHQAGCHIPTVDDWIRINKRVPRLVSVLPNGPVYHPTVNAF MAGGVPGVMLHLRSLGLLHEDVMTVTGSTLKENLDWWEHSERRQRFKQLLLDQEQINADE VIMSPQQAKARGLTSTITFPVGNIAPEGSVIKSTAIDPSMIDEQGIYYHKGVAKVYLSEK SAIYDIKHDKIKAGDILVIIGVGPSGTGMEETYQVTSALKHLSYGKHVSLITDARFSGVS TGACIGHVGPEALAGGPIGKLRTGDLIEIKIDCRELHGEVNFLGTRSDEQLPSQEEATAI LNARPSHQDLLPDPELPDDTRLWAMLQAVSGGTWTGCIYDVNKIGAALRDFMNKN >gi|296494547|gb|ADTN01000191.1| GENE 12 11067 - 12026 867 319 aa, chain - ## HITS:1 COG:yjhH KEGG:ns NR:ns ## COG: yjhH COG0329 # Protein_GI_number: 16132119 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis 
# Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Escherichia coli K12 # 1 319 1 319 319 647 100.0 0 MGWDTETKMSTYEKETEVMKKFSGIIPPVSSTFHRDGTLDKKAMREVADFLINKGVDGLF YLGTGGEFSQMNTAQRMALAEEAVTIVDGRVPVLIGVGSPSTDEAVKLAQHAQAYGADGI VAINPYYWKVAPRNLDDYYQQIARSVTLPVILYNFPDLTGQDLTPETVTRLALQNENIVG IKDTIDSVGHLRTMINTVKSVRPSFSVFCGYDDHLLNTMLLGGDGAITASANFAPELSVG IYRAWREGDLATAATLNKKLLQLPAIYALETPFVSLIKYSMQCVGLPVETYCLPPILEAS EEAKDKVHVLLTAQGILPV >gi|296494547|gb|ADTN01000191.1| GENE 13 11977 - 12765 413 262 aa, chain - ## HITS:1 COG:yjhI KEGG:ns NR:ns ## COG: yjhI COG1414 # Protein_GI_number: 16132120 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 262 1 262 262 531 100.0 1e-151 MVRKGCNSLVRAEKILTHIAWVGMASYMELLNKFQYPKSSLLNLLNVMVDCGFLIKNKNG YYSLGIKNYELGCQALHRQNIFEVTKRPMQELSLKSGLVCHLGAMESISAIYLDKIESPD SVPTSKSWIGKKLELHITALGKALLAWKTREELDYFLEALTLTPHTRNTFTDKKLFLEEL QKTRLRGWAIDNEESTYGAVCLSMPVFNMYNRVNYAISLSGDPVVYSGNKIDSYLELLRK CAEQISYGLGYRNENEHLRKGN >gi|296494547|gb|ADTN01000191.1| GENE 14 13068 - 13850 580 260 aa, chain - ## HITS:1 COG:sgcR KEGG:ns NR:ns ## COG: sgcR COG1349 # Protein_GI_number: 16132121 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli K12 # 1 260 1 260 260 537 100.0 1e-153 MSQQRPDRIKQMLHYLWQHRHLSTQQAMELFGYAEATVRRDFQYIVNQYPGMIRGHGCLD FDDSTDDKEYVFDVKRTLQSVAKREIAALARTMIKDGDCFFLDSGSTCLELAKCLADARV KVICNDIKIANELGCFPHVESYIIGGLIRPGYFSVGESLALEMINAFSVERAFISCDALS LETGITNATMFEVGVKTRIIQRSREVILMADHSKFDAVEPHAVATLSCIKTIISDSGLPE TIAQRYQRAGCQLFLPHSIK >gi|296494547|gb|ADTN01000191.1| GENE 15 13867 - 14499 406 210 aa, chain - ## HITS:1 COG:sgcE KEGG:ns NR:ns ## COG: sgcE COG0036 # Protein_GI_number: 16132122 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Escherichia coli K12 # 1 210 1 210 210 399 100.0 1e-111 
MILHPSLASANPLHYGRELTALDNLDFGSLHLDIEDSSFINNITFGMKTVQAVARQTPHP LSFHFMLARPQRWFNALAEIRPAWIFVHAETLDYPSETLTEIRHTGARAGLVFNPATPID AWRYLASELDGVMVMTSEPDGQGQRFIPSMCEKIQKVRTAFPQTECWADGGITLAAAQQL AAAGAQHMVIGRALFSSSDYRATLAQFATL >gi|296494547|gb|ADTN01000191.1| GENE 16 14511 - 14942 300 143 aa, chain - ## HITS:1 COG:sgcA KEGG:ns NR:ns ## COG: sgcA COG1762 # Protein_GI_number: 16132123 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Escherichia coli K12 # 1 143 1 143 143 283 100.0 5e-77 MINDIKWVQAQRKATDWRQAVEIATRPLVAYGAAQPCYVNGIIENTLNWGPYYLIAPGIA LPHARPEQGANYNQVSITTLRTPVAFGNEECDPVWLLLCVSATDANAHILTIQRISQFID SPQRLTAVGNASTDDALFALVSG >gi|296494547|gb|ADTN01000191.1| GENE 17 15073 - 15879 791 268 aa, chain - ## HITS:1 COG:sgcQ KEGG:ns NR:ns ## COG: sgcQ COG0434 # Protein_GI_number: 16132124 # Func_class: R General function prediction only # Function: Predicted TIM-barrel enzyme # Organism: Escherichia coli K12 # 1 268 1 268 268 553 100.0 1e-157 MSWLKEVIGTEKAVIAMCHLRALPGDPSFDAQLGMNWVIDKAWDDLMALQNGGVDAVMFS NEFSLPYLTKVRPETTAAMARIIGQLMSDIRIPFGVNVLWDPVASFDLAMATGAKFIREI FTGAYASDFGVWDTNVGETIRHQHRIGAGEVKTLFNIVPEAAVYLGNRDICSIAKSTVFN NHPDALCVSGLTAGTRTDSALLKRVKETVPDTVVLANTGVCLENVEEQLSIADGCVTATT FKKDGVFANFVDQARVSQFMEKVHHIRR >gi|296494547|gb|ADTN01000191.1| GENE 18 15892 - 17205 1188 437 aa, chain - ## HITS:1 COG:sgcC KEGG:ns NR:ns ## COG: sgcC COG3775 # Protein_GI_number: 16132125 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIC component # Organism: Escherichia coli K12 # 1 437 1 437 437 703 100.0 0 MFDYILSLGGTVFVPIIMIVIGLIFRIPWLQAIKAGVTVGIGFVGMGLVIVMAIDSLSPP IKVMIERFGLALHVFDVGAGPASGVGYATAIGAMIIPVIFLLNVAMLVTRLTKTMNVDIY NYWHYAITGTVVQLMTGSLIYGVLGAICHAALSLKMADWTAKRVQNIVGLEGISIPQGYG SSSVPLFVLLDAIYEKIPFMKGRNIDAQEIQKRYGMVGDPVIIGVVLGLIFGLAAGEGFK 
GCASLMITVAAIMVLFPRMIRLIVEGLLPISDGARKFFQKYFKGREVYIGLDTAVTLGHP TTIAVGLLLIPIMLILASILPGNKVLPLADLPVAPFFICMATVIHRGDLVRTLISGVIVM ITVLLIATQFAPYFTEMALKGGFSFAGESAQISALSVGNMFGWSISELMSLGIIGVVVAV GIVASVVLFLRKRELSE >gi|296494547|gb|ADTN01000191.1| GENE 19 17217 - 17495 265 92 aa, chain - ## HITS:1 COG:STM1613 KEGG:ns NR:ns ## COG: STM1613 COG3414 # Protein_GI_number: 16764957 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIB component # Organism: Salmonella typhimurium LT2 # 1 92 2 93 93 149 93.0 1e-36 MKKILVACGTGMSTSTMIAHKLQEFLTEQGISATTAQCCLNEIPLNCNGMDLIVTSMRTN SDYGIPTLNGAALLTGINDDALKQQIKALLTQ >gi|296494547|gb|ADTN01000191.1| GENE 20 17492 - 18613 788 373 aa, chain - ## HITS:1 COG:sgcX KEGG:ns NR:ns ## COG: sgcX COG1363 # Protein_GI_number: 16132126 # Func_class: G Carbohydrate transport and metabolism # Function: Cellulase M and related proteins # Organism: Escherichia coli K12 # 1 373 11 383 383 764 100.0 0 MSFSVQETLFSLLQHNAISGHENAVADVMLCEFRRQAKEVWRDRLGNVVARYGSDKPDAL RLMIFAHMDEVGFMVRKIEPSGFLRFERVGGPAQVTMAGSIVTLTGDKGPVMGCIGIKSY HFAKGDERTQSPSVDKLWIDIGAKDKDDAIRMGIQVGTPVTLYNPPQLLANDLVCSKALD DRLGCTALLGVADAISTMELDIAVYLVASVQEEFNIRGIVPVLRRVKPDLAIGIDITPSC DTPDLHDYSEVRINQGVGITCLNYHGRGTLAGLITPPRLIRMLEQTALEHNIPVQREVAP GVITETGYIQVEQDGIPCASLSIPCRYTHSPAEVASLRDLTDCIRLLTALAGMSAAHFPV EPDSGTTQEAHPL >gi|296494547|gb|ADTN01000191.1| GENE 21 19183 - 19272 59 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSQHEQFCQACGMPMSAPDAHSASDQYCA >gi|296494547|gb|ADTN01000191.1| GENE 22 19399 - 20145 715 248 aa, chain - ## HITS:1 COG:yjhP KEGG:ns NR:ns ## COG: yjhP COG0500 # Protein_GI_number: 16132127 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli K12 # 1 248 1 248 248 507 100.0 1e-144 MDIPRIFTISESEHRIHNPFTEEKYATLGRVLRMKPGTRILDLGSGSGEMLCTWARDHGI TGTGIDMSSLFTAQAKRRAEELGVSERVHFIHNDAAGYVANEKCDVAACVGATWIAGGFA 
GAEELLAQSLKPGGIMLIGEPYWRQLPATEEIAQACGVSSTSDFLTLPGLVGAFDDLGYD VVEMVLADQEGWDRYEAAKWLTMRRWLEANPDDDFAAEVRAELNIAPKRYVTYARECFGW GVFALIAR >gi|296494547|gb|ADTN01000191.1| GENE 23 20201 - 20746 308 181 aa, chain - ## HITS:1 COG:yjhQ KEGG:ns NR:ns ## COG: yjhQ COG3153 # Protein_GI_number: 16132128 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Escherichia coli K12 # 1 181 1 181 181 365 100.0 1e-101 MTVHHFTFHITDKSDASDIREVETRAFGFSKEADLVASLLEDESARPALSLLARYEGKAV GHILFTRATFKGEMDSPLMHILAPLAVIPEYQGMGVGGRLIRTGIEHLRLMGCQTVFVLG HATYYPRHGFEPCAGDKGYPAPYPIPEEHKACWMMQSLTAQPMTLTGHIRCADPDETGAL T >gi|296494547|gb|ADTN01000191.1| GENE 24 20758 - 21015 254 85 aa, chain - ## HITS:1 COG:STM4501 KEGG:ns NR:ns ## COG: STM4501 COG3811 # Protein_GI_number: 16767745 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Salmonella typhimurium LT2 # 1 85 1 85 85 121 88.0 3e-28 MNLSRQEQHTLHVLAKGRRIAHVRDSSGRVTSVECYSREGLLLTDCTLAVFKKLKTKKLI KSVNGQPYRINTTELNKVRAQLDNR >gi|296494547|gb|ADTN01000191.1| GENE 25 21506 - 21637 106 43 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_4909 NR:ns ## KEGG: ECUMN_4909 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 43 1 43 43 78 100.0 9e-14 MTQVLIGDIYLRQQCDVFWLGYTISPAYARQGYAIEAITATID >gi|296494547|gb|ADTN01000191.1| GENE 26 21801 - 22010 79 69 aa, chain + ## HITS:1 COG:STM4489 KEGG:ns NR:ns ## COG: STM4489 COG1112 # Protein_GI_number: 16767733 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Salmonella typhimurium LT2 # 1 58 775 832 1171 100 68.0 7e-22 MQVAQFNSHYQYDPKFERGMYLYEHRRCFNNIIDYCNSLCYHGKLQPKRGMEKGRFSPQW DIYILMVEA >gi|296494547|gb|ADTN01000191.1| GENE 27 21977 - 22993 410 338 aa, chain + ## HITS:1 COG:yjhR KEGG:ns NR:ns ## COG: yjhR COG1112 # Protein_GI_number: 16132129 # Func_class: L Replication, recombination and repair # 
Function: Superfamily I DNA and RNA helicases and helicase subunits # Organism: Escherichia coli K12 # 1 338 1 338 338 684 100.0 0 MGYLHIDGRGMKPNGGSRHNPLEAETIAAWLVAHKDDIERHYGEPLYKVVGVVTPFSAQV NAIKMSLRKLEINGKDEQGLLTVGTVHSLQGAERAIVLFSPVYSKHEDGRFLDSNSTILN VAVSRAKDSFLVFGDMDLIEMQPAFSPRGLLAKYLFSSDNNALQFEFQKRQDLISAHTQI STLHGVEQHDEFLNKTLAGAQKKITIISPWLSWQKVEQTGFLASMALARSRGIDITVVTD KNCNIAHVDDDKRQEKQHLLNDAVEKLNKMGIATKLVNRVHSKIVIEDEELLCVGSFNWF SATREDKYQRYDTSLVYRGEGVKNEIKAIYGSLDQRQL >gi|296494547|gb|ADTN01000191.1| GENE 28 23180 - 23663 170 161 aa, chain + ## HITS:1 COG:yi52_g6 KEGG:ns NR:ns ## COG: yi52_g6 COG3039 # Protein_GI_number: 16129935 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives, IS5 family # Organism: Escherichia coli K12 # 1 161 13 173 338 320 100.0 7e-88 MSHQLTFADSEFSSKRRQTRKEIFLSRMEQILPWQNMVEVIEPFYPKAGNGRRPYPLETM LRIHCMQHWYNLSDGAMEDALYEIASMRLFARLSLDSALPDRTTIMNFRHLLEQHQLARQ LFKTINRWLAEAGVMMTQGTLVDATIIEAPSSTKNKEQQRD Prediction of potential genes in microbial genomes Time: Sun May 15 23:47:29 2011 Seq name: gi|296494546|gb|ADTN01000192.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont398.1, whole genome shotgun sequence Length of sequence - 8693 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 137 - 188 10.1 1 1 Tu 1 . - CDS 210 - 1256 489 ## COG1787 Predicted endonuclease distantly related to archaeal Holliday junction resolvase and Mrr-like restriction enzymes - Prom 1405 - 1464 6.5 2 2 Op 1 . - CDS 1496 - 2623 345 ## Rmet_6082 hypothetical protein 3 2 Op 2 . - CDS 2682 - 4520 575 ## COG1479 Uncharacterized conserved protein 4 2 Op 3 8/0.000 - CDS 4579 - 5310 467 ## COG1451 Predicted metal-dependent hydrolase 5 2 Op 4 . 
- CDS 5331 - 8615 2868 ## COG0610 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases Predicted protein(s) >gi|296494546|gb|ADTN01000192.1| GENE 1 210 - 1256 489 348 aa, chain - ## HITS:1 COG:jhp0345 KEGG:ns NR:ns ## COG: jhp0345 COG1787 # Protein_GI_number: 15611413 # Func_class: V Defense mechanisms # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase and Mrr-like restriction enzymes # Organism: Helicobacter pylori J99 # 224 337 76 189 189 103 44.0 5e-22 MTPLNPSFYPEFQYRSKGVLKDLFGKKKEQAQLNQLLENVLEKYSSLKDPYFTNFIYTSR FSHNNKSEADYSTDNGTYSELQLFREVLVRKGFNELEELPELLDKLLLTTSFNALYYGFA KEVKRHIKSTLTESLKSWIEEAGTTFRADLSLFLFYIWSNKIEFSSVEFNEKAEATPGVP LVSFDEVKKYLAVCEHIYFDILVDRLATRLEHFNPNKFVTMYLVDAMDGFQFEDFLVEVF QTMGYDVKETKRTQDQGADLFVTRFGKDMVIQAKNYSGSVGNSAVQQVISAKTFYGCDEA MVVTNSYFTRSAKELAESALVRLIDRDELQKYLDDYNQKIIEDFQSNS >gi|296494546|gb|ADTN01000192.1| GENE 2 1496 - 2623 345 375 aa, chain - ## HITS:1 COG:no KEGG:Rmet_6082 NR:ns ## KEGG: Rmet_6082 # Name: not_defined # Def: hypothetical protein # Organism: R.metallidurans # Pathway: not_defined # 24 375 5 377 380 162 30.0 2e-38 MRSETFFVKFIEQLKGVSVEDDVEFNFFELGGSGYLENPTTDLMALFMGAQKMVPPWLLK ALLCCLDDSLDVDEIDFTSLELMREARTEDGKYIDILIRHDEFIIGIEHKVLADTYNPFP SYVSLIDSYGGNNQKLFRCILKPDGNSAKGVDGWQLINYSLLLETAIRRLGLEMMNQEFS KWTFFYQEFLSHLKKLSEVSMDKVSDKNVEFVTENFTALIKSVQLLEMYQNAITEEAKSV VSEVLPDIHIATGINNWKGYYKAIHLMPGCWGQGKTGITLVYRPSEDGRDEAEFYVYGWI HSDDYPEVNALKEEIRVALSAGDFIPTASPEDYEVSITGKGKVLELSFWGNTRSKKDALV LLKDMTKWMDSKIRT >gi|296494546|gb|ADTN01000192.1| GENE 3 2682 - 4520 575 612 aa, chain - ## HITS:1 COG:YPO3437 KEGG:ns NR:ns ## COG: YPO3437 COG1479 # Protein_GI_number: 16123585 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Yersinia pestis # 8 611 16 628 636 377 36.0 1e-104 MSKLDNKIEARHRNLFDVLNAQKYTVDYFQREYSWGEKHIEELVTDLTSAFLNEYTVGDS REQGENYNNYYLGPFVVSSKDGKRSIIDGQQRLTSLTLFLIYLHNLQKELKYEEKIESMI 
FSELRGSKSFNIVVEDRIPCMEALFNFGSYSLVDGDDESTHNMVERYQNITDAFPDELKG QAFPFFIDWLKYNVIMVEIVAYSDENAYTIFETMNDRGLNLTPSEMLKGFLLSRFHQGDK RQKANELWKKAMMDLKNYDKDEDQRFFQSWLRAQYADTIRPGKAGSKNEDFEKIGTRFHS WVRDNLQEVGLDPDNGETFERFIQKNFLFYLNAYTQILNAERALTHQLEYVFYIHHWGIA PTLSFPLMLAPLNVGDSPEAVIAKINLVARYIETFVVRRSVNFRKFSASSIRYTMYSLVK EIRGKSIEELKDLLSKKLSEMPDTFAGMKEFRLHGQNYRFVKFLLSRITAWVEQQAGMST TFITYYQPEHGKPFEVEHIWADKYERYSDEFEQEHEFNNYRNRLGDLVLLPRGSNQSYGD LCYDQKQPHYIKENLLAKSLCPLAYMNNPNFNQLRNVFGLPFKPHDSFKKQDVDERQSLY KIICENIWDHNL >gi|296494546|gb|ADTN01000192.1| GENE 4 4579 - 5310 467 243 aa, chain - ## HITS:1 COG:BMEII0448 KEGG:ns NR:ns ## COG: BMEII0448 COG1451 # Protein_GI_number: 17988793 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Brucella melitensis # 6 236 9 241 246 118 34.0 1e-26 MKALRIVYGDEVIVVQCVPRQVVKGRVLIKVYPDCRVVASVPPETPEHEVLSALKKRGRW IYQQLRDFREQQIHIVPRQYVSGESHYYLGKQYQLKVTEDATVPQRVKMLRGRLEVTVRH KSAEKIKALLAEWYRERARDVFQRRLDLLIPQTLWVSERPPIRLRAMQTQWGNCSAKGCL TLNPWLVKASSECIDYVLLHELCHVAEHNHSEEFYRLMGQVMPGWEKVKKRLDGMAGMLL ADM >gi|296494546|gb|ADTN01000192.1| GENE 5 5331 - 8615 2868 1094 aa, chain - ## HITS:1 COG:BMEII0449 KEGG:ns NR:ns ## COG: BMEII0449 COG0610 # Protein_GI_number: 17988794 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Brucella melitensis # 92 787 83 720 783 335 33.0 3e-91 MDKHYQPKFQEEYSAKIPALTLLTSLGWTFLSPKQIMDCRGYKQDEVVLRPILREVLSER YFMVGGKTCRLSEKALDNLISQVCSPALNEGLLKANERMYNHLLYGIAVTEFVDGKKVNP TIALIDWEHPKNNQFHFAEEFSVLRSGGVETRRPDIVCFVNGIPLAVIEAKSPVGHGKKG PTIDEGISQSIRNQLNDEIPQLFAYSQLLLSINGHDGRYGTCHTPMKFWAAWREEDITDA QMYAIRNHPLSTEQIDALFAHRPPADRNWYQQLIAAGELAVSGQDKLLISLLSPERLLEM TRFFTLFDKKNGKIVARYQQVFGIKRLLERISTRRSDGGREGGVIWHTTGSGKSYTMVFL SKALILHDSLKQCRIVVVTDRIDLEEQLSGTFASGGELAGKKDKANAMATSGQMLAKQIG SGKERIIFTLIQKFNSATKLPECVNTSPDIIVLIDEGHRSQDGENHVRMKLALPNAAFVA 
FTGTPLLKEDKTTNKFGPIVHAYTMQRAVEDQAVTPLLYEERIPDLEVNDRAIDAWFDRI TDGLSEAQKADLKRKYARKGEVYSADDRIRLIALDIAMHFSKNIDEGLKGQLACDSKISA IKYKKYLDEAGLFESAVVISPPDTREGNTEVDESKLPEVTKWWKDNVGTQDESVYTRNVI SRFDTDEKLKLLIVVDKLLTGFDEPKNTVLYIDKPLKSHNLIQAIARVNRLHPLKKFGLL IDYRGILAELDTTIGKYQDLASRTQGGYDIKDIDGLYSAMSSEYKRLPHLYNQLWAIFAG VKNKNDTEQLRAVLVPKMEERDGEMVDIHQKTRDDFFEALTAFAGCLKVALQSATFFTDK SFTEQDRNLYKETVKQMSSLRQWAMQVSGEQVNYDDYAEQVKKLLDKHVTGVEVREPDGV YEVGKMGKSEKPEEWDNNKTRNETDIIKTRVTKMIEQELRDDPYAQEAFSKLLRMAIEEA EKLFDHPLKQYLLFREFEEQVEARKLSDIPDALAVNKHAQAYYGVFKKELPEVFAVNDVQ VQDKWTKLAFEVDNIIVKAVAENSLNPQDIEKVVKTSLLPLLFTACREIGAGMNQVNRIV ETIIQILRVGLMKS Prediction of potential genes in microbial genomes Time: Sun May 15 23:47:35 2011 Seq name: gi|296494545|gb|ADTN01000193.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont410.1, whole genome shotgun sequence Length of sequence - 583 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 581 495 ## COG3209 Rhs family protein Predicted protein(s) >gi|296494545|gb|ADTN01000193.1| GENE 1 2 - 581 495 193 aa, chain + ## HITS:1 COG:rhsD KEGG:ns NR:ns ## COG: rhsD COG3209 # Protein_GI_number: 16128481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli K12 # 1 193 640 832 1426 361 92.0 1e-100 EYDAAGRVISLTNENGSHSDFSYDALDRLVQQGGFDGRTQRYHYDLTGKLTQSEDEGLIT LWHYDASDRITHRTVNGDPAEQWQYDEHGWLTTLSHTSEGHRVSVHYGYDDKGRLTGECQ TVENPETGELLWHHETGHAYNEQGLANRVTPDSLPPVEWLTYGSGYLAGMKLGGTPLLEF TRDRLHRETVRSF Prediction of potential genes in microbial genomes Time: Sun May 15 23:47:37 2011 Seq name: gi|296494544|gb|ADTN01000194.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont413.1, whole genome shotgun sequence Length of sequence - 7841 bp Number of predicted genes - 8, with homology - 7 Number of transcription units - 2, operones - 1 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 57 - 161 120 ## 2 1 Op 2 . + CDS 224 - 2851 2636 ## COG3451 Type IV secretory pathway, VirB4 components 3 1 Op 3 . + CDS 2848 - 3234 336 ## APECO1_O1CoBM40 conjugal transfer protein TrbI 4 1 Op 4 . + CDS 3231 - 3863 517 ## pECS88_0079 conjugal transfer pilus assembly protein TraW 5 1 Op 5 . + CDS 3860 - 4852 778 ## APECO1_O1CoBM42 conjugal transfer pilus assembly protein TraU 6 1 Op 6 . + CDS 4861 - 5499 414 ## APECO1_O1CoBM43 conjugal transfer pilus assembly protein TrbC 7 1 Op 7 . + CDS 5496 - 7304 1176 ## E2348_P1_058 conjugal transfer mating pair stabilization protein TraN + Term 7444 - 7483 -0.9 + Prom 7457 - 7516 4.7 8 2 Tu 1 . 
+ CDS 7551 - 7839 163 ## APECO1_O1CoBM46 conjugal pilus assembly protein TraF Predicted protein(s) >gi|296494544|gb|ADTN01000194.1| GENE 1 57 - 161 120 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHKSVAEHSDLIPDEHEWIFRKQKSLYMRREMAR >gi|296494544|gb|ADTN01000194.1| GENE 2 224 - 2851 2636 875 aa, chain + ## HITS:1 COG:PSLT088_2 KEGG:ns NR:ns ## COG: PSLT088_2 COG3451 # Protein_GI_number: 17233453 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Salmonella typhimurium LT2 # 290 872 1 583 593 1141 94.0 0 MNNPLEAVTQAVNSLVTALKLPDESAKANEVLGEMSFPQFSRLLPYRDYNQESGLFMNDT TMGFMLEAIPINGANESIVEALDHMLRTKLPRGIPLCIHLMSSQLVGDRIEYGLREFSWS GEQAERFNAITRAYYMKAAATQFPLPEGMNLPLTLRHYRVFISYCSPSKKKSRADILEME NLVKIIRASLQGASITTQTVDAQAFIDIVGEMINHNPDSLYPKRRQLDPYSDLNYQCVED SFDLKVRADYLTLGLRESGRNSTARILNFHLARNPEIAFLWNMADNYSNLLNPELSISCP FILTLTLVVEDQVKTHSEANLKYMDLEKKSKTSYAKWFPSVEKEAKEWGELRQRLGSGQS SVVSYFLNITAFCKDNNETALEVEQDILNSFRKNGFELISPRFNHMRNFLTCLPFMAGKG LFKQLKEAGVVQRAESFNVANLMPLVADNPLTPAGLLAPTYRNQLAFIDIFFRGMNNTNY NMAVCGTSGAGKTGLIQPLIRSVLDSGGFAVVFDMGDGYKSLCENMGGVYLDGETLRFNP FANITDIDQSAERVRDQLSVMASPNGNLDEVHEGLLLQAVRASWLAKENRARIDDVVDFL KNASDSEQYAESPTIRSRLDEMIVLLDQYTANGTYGQYFNSDEPSLRDDAKMVVLELGGL EDRPSLLVAVMFSLIIYIENRMYRTPRNLKKLNVIDEGWRLLDFKNHKVGEFIEKGYRTA RRHTGAYITITQNIVDFDSDKASSAARAAWGNSSYKIILKQSAKEFAKYNQLYPDQFLPL QRDMIGKFGAAKDQWFSSFLLQVENHSSWHRLFVDPLSRAMYSSDGPDFEFVQQKRKEGL SIHEAVWQLAWKKSGPEMASLEAWLEEHEKYRSVA >gi|296494544|gb|ADTN01000194.1| GENE 3 2848 - 3234 336 128 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM40 NR:ns ## KEGG: APECO1_O1CoBM40 # Name: trbI # Def: conjugal transfer protein TrbI # Organism: E.coli_APEC # Pathway: not_defined # 1 128 1 128 128 208 92.0 4e-53 MTTTQKTTDVTAPRRSHWWWTVPGCLAMVLLNAAISYGIVRLNAPVTAAFNMKQTVDAFF DSASQKQLSEAQSKALSARFNTALEASLQAWQQKHHAVILVSPAVVQGAPDITREIQQDI ARRMRAEP >gi|296494544|gb|ADTN01000194.1| GENE 4 3231 - 3863 517 210 aa, chain + ## HITS:1 COG:no 
KEGG:pECS88_0079 NR:ns ## KEGG: pECS88_0079 # Name: traW # Def: conjugal transfer pilus assembly protein TraW # Organism: E.coli_S88 # Pathway: not_defined # 1 210 1 210 210 414 99.0 1e-114 MRCRGLIALLIWGQSVAAADLGTWGDLWPVKEPDMLTVIMQRLTALEQSGEMGRKMDAFK ERVSRNSLRPPAVPGIGRTEKYGSRLFDPSVRLAADIRDNEGRVFARQGEVMNPLQYVPF NQTLYFINGDDPAQVAWMKRQTPPTLESKIILVQGSIPEMQKSLDSRVYFDQNGVLCQRL GIDQVPARVSAVPGDRFLKVEFIPAEEGRK >gi|296494544|gb|ADTN01000194.1| GENE 5 3860 - 4852 778 330 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM42 NR:ns ## KEGG: APECO1_O1CoBM42 # Name: traU # Def: conjugal transfer pilus assembly protein TraU # Organism: E.coli_APEC # Pathway: not_defined # 1 330 1 330 330 652 98.0 0 MKRSLWLLMLFLLAGHVPAASADSACEGRFVNPITDICWSCIFPLSLGSIKVSQGKVPDT ANPSMPIQICPAPPPLFRRIGLAIGYWEPMALTDVTRSPGCMVNLGFSLPAFGKTAQGTA KKDEKQVNGAFYHVHWYKYPLTYWLNIITSLGCLEGGDLDIAYLSEIDPTWTDSSLTTIL NPEAVIFANPIAQGACAADAIASAFNMPLGVLFWCAGSQGSMYPFNGWVSNESSPLQSSL LVSERMAFKLHRQGMIMETIGKNNAVCNEYPSPILPKERWRYQMVNMYPDSGQCHPFGRS VTRWETGKNPPNTKKNFGYLMWRKRNCVFL >gi|296494544|gb|ADTN01000194.1| GENE 6 4861 - 5499 414 212 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM43 NR:ns ## KEGG: APECO1_O1CoBM43 # Name: trbC # Def: conjugal transfer pilus assembly protein TrbC # Organism: E.coli_APEC # Pathway: not_defined # 1 212 2 213 213 411 99.0 1e-114 MKLSMKSLAALLMMLNGAVMASENVNTPENRQFLKQQENLSRQLREKPDHQLKAWAEKQV LENPLQRSDNHFLDELVRKQQASQDGKPRQGALYFVSFSIPEEGLKRMLGETRHYGIPAT LRGMVNNDLKTTAEAVLSLVKDGATDGVQIDPTLFSQYGIRTVPALVVFCSQGYDIIRGN LRVGQALEKVAATGDCRQVAHDLLAGKGDSGK >gi|296494544|gb|ADTN01000194.1| GENE 7 5496 - 7304 1176 602 aa, chain + ## HITS:1 COG:no KEGG:E2348_P1_058 NR:ns ## KEGG: E2348_P1_058 # Name: traN # Def: conjugal transfer mating pair stabilization protein TraN # Organism: E.coli_0127 # Pathway: not_defined # 1 602 1 602 602 1170 99.0 0 MKRILPLILALVAGMAQADSNSDYRAGSDFAHQIKGQGSSSIQGFKPQESIPGYNANPDE TKYYGGVTAGGDGGLKNDGTTEWATGETGKTITESFMNKPKDILSPDAPFIQTGRDVVNR ADSIVGNTGQQCSAQEISRSEYTNYTCERDLQVEQYCTRTARMELQGSTTWETRTLEYEM 
SQLPAREVNGQYVVSITSPVTGEIVDAHYSWSRTYLQKSVPMTITVLGTPLSWNAKYSAD ASFTPVQKTLTAGVAFASSHPVRVGNTKFKRHTAMKLRLVVRVKKASYTPYVVWSESCPF SKELGKLTKTECTEAGGNRTLVKDGQSYSMYQSCWAYRDTYVTQSADKGTCQTYTDNPAC TLVSHQCAFYSEEGACLHEYATYSCESKTSGKVMVCGGDVFCLDGECDKAQSGQSNDFAE AVSQLAALAAAGKDVAALNGVDVRAFTGQAKFCKKAAAGYSNCCKDSGWGQDIGLAKCSS DEKALAKAKSNKLTVSVGEFCSKKVLGVCLEKKRSYCQFDSKLAQIVQQQGRNGQLRISF GSAKHPDCRGITVDELQKIQFNRLDFTNFYEDLMNNQKIPDSGVLTQKVKEQIADQLKQA GQ >gi|296494544|gb|ADTN01000194.1| GENE 8 7551 - 7839 163 96 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM46 NR:ns ## KEGG: APECO1_O1CoBM46 # Name: traF # Def: conjugal pilus assembly protein TraF # Organism: E.coli_APEC # Pathway: not_defined # 1 96 1 96 257 196 100.0 3e-49 MKEKVMKGMKMNKALLPLLLCCFIFPASGKDAGWQWYNEKINPKEKENKPVPAAPRQEPD IMQKLAALQTATKRALYEAILYPGVDNFVKYFRLQN Prediction of potential genes in microbial genomes Time: Sun May 15 23:48:03 2011 Seq name: gi|296494543|gb|ADTN01000195.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont416.1, whole genome shotgun sequence Length of sequence - 2850 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.000 - CDS 119 - 694 775 ## COG1309 Transcriptional regulator 2 1 Op 2 7/0.000 - CDS 731 - 2428 1579 ## COG4232 Thiol:disulfide interchange protein 3 1 Op 3 . 
- CDS 2404 - 2742 241 ## COG1324 Uncharacterized protein involved in tolerance to divalent cations Predicted protein(s) >gi|296494543|gb|ADTN01000195.1| GENE 1 119 - 694 775 191 aa, chain - ## HITS:1 COG:ECs5116 KEGG:ns NR:ns ## COG: ECs5116 COG1309 # Protein_GI_number: 15834370 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 191 9 199 199 368 99.0 1e-102 MQREDVLGEALKLLELQGIANTTLEMVAERVDYPLDELRRFWPDKEAILYDALRYLSQQI DVWRRQLMLDETQTAEQKLLARYQALSECVKNNRYPGCLFIAACTFYPDPGHPIHQLADQ QKSAAYDFTHELLTTLEVDDPAMVAKQMELVLEGCLSRMLVNRSQADVDTAHRLAEDILR FARCRQGGALT >gi|296494543|gb|ADTN01000195.1| GENE 2 731 - 2428 1579 565 aa, chain - ## HITS:1 COG:dsbD KEGG:ns NR:ns ## COG: dsbD COG4232 # Protein_GI_number: 16131961 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol:disulfide interchange protein # Organism: Escherichia coli K12 # 1 565 1 565 565 1102 100.0 0 MAQRIFTLILLLCSTSVFAGLFDAPGRSQFVPADQAFAFDFQQNQHDLNLTWQIKDGYYL YRKQIRITPEHAKIADVQLPQGVWHEDEFYGKSEIYRDRLTLPVTINQASAGATLTVTYQ GCADAGFCYPPETKTVPLSEVVANNAAPQPVSVPQQEQPTAQLPFSALWALLIGIGIAFT PCVLPMYPLISGIVLGGKQRLSTARALLLTFIYVQGMALTYTALGLVVAAAGLQFQAALQ HPYVLIGLAIVFTLLAMSMFGLFTLQLPSSLQTRLTLMSNRQQGGSPGGVFVMGAIAGLI CSPCTTAPLSAILLYIAQSGNMWLGGGTLYLYALGMGLPLMLITVFGNRLLPKSGPWMEQ VKTAFGFVILALPVFLLERVIGDVWGLRLWSALGVAFFGWAFITSLQAKRGWMRIVQIIL LAAALVSVRPLQDWAFGATHTAQTQTHLNFTQIKTVDELNQALVEAKGKPVMLDLYADWC VACKEFEKYTFSDPQVQKALADTVLLQANVTANDAQDVALLKHLNVLGLPTILFFDGQGQ EHPQARVTGFMDAETFSAHLRDRQP >gi|296494543|gb|ADTN01000195.1| GENE 3 2404 - 2742 241 112 aa, chain - ## HITS:1 COG:ECs5118 KEGG:ns NR:ns ## COG: ECs5118 COG1324 # Protein_GI_number: 15834372 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in tolerance to divalent cations # Organism: Escherichia coli O157:H7 # 1 112 1 112 112 198 100.0 2e-51 MLDEKSSNTASVVVLCTAPDEATAQDLAAKVLAEKLAACATLIPGATSLYYWEGKLEQEY 
EVQMILKTTVSHQQALLECLKSHHPYQTPELLVLPVTHGDTDYLSWLNASLR Prediction of potential genes in microbial genomes Time: Sun May 15 23:48:04 2011 Seq name: gi|296494542|gb|ADTN01000196.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont416.2, whole genome shotgun sequence Length of sequence - 1438 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 13 - 1314 1394 ## COG2704 Anaerobic C4-dicarboxylate transporter - Prom 1350 - 1409 5.3 Predicted protein(s) >gi|296494542|gb|ADTN01000196.1| GENE 1 13 - 1314 1394 433 aa, chain - ## HITS:1 COG:ECs5119 KEGG:ns NR:ns ## COG: ECs5119 COG2704 # Protein_GI_number: 15834373 # Func_class: R General function prediction only # Function: Anaerobic C4-dicarboxylate transporter # Organism: Escherichia coli O157:H7 # 1 433 1 433 433 705 100.0 0 MLVVELIIVLLAIFLGARLGGIGIGFAGGLGVLVLAAIGVKPGNIPFDVISIIMAVIAAI SAMQVAGGLDYLVHQTEKLLRRNPKYITILAPIVTYFLTIFAGTGNISLATLPVIAEVAK EQGVKPCRPLSTAVVSAQIAITASPISAAVVYMSSVMEGHGISYLHLLSVVIPSTLLAVL VMSFLVTMLFNSKLSDDPIYRKRLEEGLVELRGEKQIEIKSGAKTSVWLFLLGVVGVVIY AIINSPSMGLVEKPLMNTTNAILIIMLSVATLTTVICKVDTDNILNSSTFKAGMSACICI LGVAWLGDTFVSNNIDWIKDTAGEVIQGHPWLLAVIFFFASALLYSQAATAKALMPMALA LNVSPLTAVASFAAVSGLFILPTYPTLVAAVQMDDTGTTRIGKFVFNHPFFIPGTLGVAL AVCFGFVLGSFML Prediction of potential genes in microbial genomes Time: Sun May 15 23:48:08 2011 Seq name: gi|296494541|gb|ADTN01000197.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont416.3, whole genome shotgun sequence Length of sequence - 12124 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 11, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 44 - 1480 1557 ## COG1027 Aspartate ammonia-lyase - Prom 1638 - 1697 5.7 + Prom 1701 - 1760 3.1 2 2 Tu 1 . 
+ CDS 1817 - 2293 360 ## COG3030 Protein affecting phage T7 exclusion by the F plasmid + Term 2302 - 2346 -0.4 3 3 Tu 1 . - CDS 2309 - 3565 1149 ## COG0531 Amino acid transporters - Prom 3630 - 3689 2.3 + Prom 3662 - 3721 6.0 4 4 Op 1 41/0.000 + CDS 3841 - 4134 457 ## COG0234 Co-chaperonin GroES (HSP10) 5 4 Op 2 . + CDS 4178 - 5824 2383 ## PROTEIN SUPPORTED gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 + Term 5854 - 5891 6.0 + Prom 5852 - 5911 3.8 6 5 Tu 1 . + CDS 5962 - 6315 362 ## ECIAI1_4377 hypothetical protein + Term 6370 - 6411 -0.2 - Term 6362 - 6394 -0.2 7 6 Tu 1 . - CDS 6508 - 7377 840 ## B21_03977 hypothetical protein - Prom 7501 - 7560 5.8 8 7 Tu 1 . - CDS 7772 - 8800 878 ## COG1509 Lysine 2,3-aminomutase - Prom 8845 - 8904 3.3 + Prom 8739 - 8798 3.7 9 8 Tu 1 . + CDS 8842 - 9408 611 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) + Term 9415 - 9455 10.5 + Prom 9615 - 9674 3.5 10 9 Tu 1 . + CDS 9696 - 9842 130 ## COG5510 Predicted small secreted protein + Term 9856 - 9893 6.2 + Prom 9889 - 9948 4.2 11 10 Tu 1 . + CDS 10024 - 10341 440 ## COG2076 Membrane transporters of cations and cationic drugs 12 11 Op 1 3/0.600 - CDS 10338 - 10871 752 ## COG3040 Bacterial lipocalin - Prom 10892 - 10951 2.0 - Term 10920 - 10949 2.1 13 11 Op 2 . 
- CDS 10960 - 12093 799 ## COG1680 Beta-lactamase class C and other penicillin binding proteins Predicted protein(s) >gi|296494541|gb|ADTN01000197.1| GENE 1 44 - 1480 1557 478 aa, chain - ## HITS:1 COG:ECs5120 KEGG:ns NR:ns ## COG: ECs5120 COG1027 # Protein_GI_number: 15834374 # Func_class: E Amino acid transport and metabolism # Function: Aspartate ammonia-lyase # Organism: Escherichia coli O157:H7 # 1 478 16 493 493 932 100.0 0 MSNNIRIEEDLLGTREVPADAYYGVHTLRAIENFYISNNKISDIPEFVRGMVMVKKAAAM ANKELQTIPKSVANAIIAACDEVLNNGKCMDQFPVDVYQGGAGTSVNMNTNEVLANIGLE LMGHQKGEYQYLNPNDHVNKCQSTNDAYPTGFRIAVYSSLIKLVDAINQLREGFERKAVE FQDILKMGRTQLQDAVPMTLGQEFRAFSILLKEEVKNIQRTAELLLEVNLGATAIGTGLN TPKEYSPLAVKKLAEVTGFPCVPAEDLIEATSDCGAYVMVHGALKRLAVKMSKICNDLRL LSSGPRAGLNEINLPELQAGSSIMPAKVNPVVPEVVNQVCFKVIGNDTTVTMAAEAGQLQ LNVMEPVIGQAMFESVHILTNACYNLLEKCINGITANKEVCEGYVYNSIGIVTYLNPFIG HHNGDIVGKICAETGKSVREVVLERGLLTEAELDDIFSVQNLMHPAYKAKRYTDESEQ >gi|296494541|gb|ADTN01000197.1| GENE 2 1817 - 2293 360 158 aa, chain + ## HITS:1 COG:yjeG KEGG:ns NR:ns ## COG: yjeG COG3030 # Protein_GI_number: 16132265 # Func_class: R General function prediction only # Function: Protein affecting phage T7 exclusion by the F plasmid # Organism: Escherichia coli K12 # 1 158 1 158 158 241 100.0 3e-64 MRWLPFIAIFLYVYIEISIFIQVAHVLGVLLTLVLVIFTSVIGMSLVRNQGFKNFVLMQQ KMAAGENPAAEMIKSVSLIIAGLLLLLPGFFTDFLGLLLLLPPVQKHLTVKLMPHLRFSR MPGGGFSAGTGGGNTFDGEYQRKDDERDRLDHKDDRQD >gi|296494541|gb|ADTN01000197.1| GENE 3 2309 - 3565 1149 418 aa, chain - ## HITS:1 COG:yjeH KEGG:ns NR:ns ## COG: yjeH COG0531 # Protein_GI_number: 16131966 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 418 1 418 418 702 100.0 0 MSGLKQELGLAQGIGLLSTSLLGTGVFAVPALAALVAGNNSLWAWPVLIILVFPIAIVFA ILGRHYPSAGGVAHFVGMAFGSRLERVTGWLFLSVIPVGLPAALQIAAGFGQAMFGWHSW QLLLAELGTLALVWYIGTRGASSSANLQTVIAGLIVALIVAIWWAGDIKPANIPFPAPGN IELTGLFAALSVMFWCFVGLEAFAHLASEFKNPERDFPRALMIGLLLAGLVYWGCTVVVL 
HFDAYGEKMAAAASLPKIVVQLFGVGALWIACVIGYLACFASLNIYIQSFARLVWSQAQH NPDHYLARLSSRHIPNNALNAVLGCCVVSTLVIHALEINLDALIIYANGIFIMIYLLCML AGCKLLQGRYRLLAVVGGLLCVLLLAMVGWKSLYALIMLAGLWLLLPKRKTPENGITT >gi|296494541|gb|ADTN01000197.1| GENE 4 3841 - 4134 457 97 aa, chain + ## HITS:1 COG:ECs5123 KEGG:ns NR:ns ## COG: ECs5123 COG0234 # Protein_GI_number: 15834377 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Co-chaperonin GroES (HSP10) # Organism: Escherichia coli O157:H7 # 1 97 1 97 97 158 100.0 2e-39 MNIRPLHDRVIVKRKEVETKSAGGIVLTGSAAAKSTRGEVLAVGNGRILENGEVKPLDVK VGDIVIFNDGYGVKSEKIDNEEVLIMSESDILAIVEA >gi|296494541|gb|ADTN01000197.1| GENE 5 4178 - 5824 2383 548 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 [Haemophilus parasuis 29755] # 1 548 1 547 547 922 87 0.0 MAAKDVKFGNDARVKMLRGVNVLADAVKVTLGPKGRNVVLDKSFGAPTITKDGVSVAREI ELEDKFENMGAQMVKEVASKANDAAGDGTTTATVLAQAIITEGLKAVAAGMNPMDLKRGI DKAVTAAVEELKALSVPCSDSKAIAQVGTISANSDETVGKLIAEAMDKVGKEGVITVEDG TGLQDELDVVEGMQFDRGYLSPYFINKPETGAVELESPFILLADKKISNIREMLPVLEAV AKAGKPLLIIAEDVEGEALATLVVNTMRGIVKVAAVKAPGFGDRRKAMLQDIATLTGGTV ISEEIGMELEKATLEDLGQAKRVVINKDTTTIIDGVGEEAAIQGRVAQIRQQIEEATSDY DREKLQERVAKLAGGVAVIKVGAATEVEMKEKKARVEDALHATRAAVEEGVVAGGGVALI RVASKLADLRGQNEDQNVGIKVALRAMEAPLRQIVLNCGEEPSVVANTVKGGDGNYGYNA ATEEYGNMIDMGILDPTKVTRSALQYAASVAGLMITTECMVTDLPKNDAADLGAAGGMGG MGGMGGMM >gi|296494541|gb|ADTN01000197.1| GENE 6 5962 - 6315 362 117 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_4377 NR:ns ## KEGG: ECIAI1_4377 # Name: yjeI # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 117 1 117 117 198 100.0 6e-50 MHVKYLAGIVGAALLMAGCSSSNELSAAGQSVRIVDEQPGAECQLIGTATGKQSNWLSGQ HGEEGGSMRGAANDLRNQAAAMGGNVIYGISSPSQGMLSSFVPTDSQIIGQVYKCPN >gi|296494541|gb|ADTN01000197.1| GENE 7 6508 - 7377 840 289 aa, chain - ## HITS:1 COG:no KEGG:B21_03977 NR:ns ## KEGG: B21_03977 # Name: yjeJ # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 289 
1 289 289 554 100.0 1e-156 MAISIKGVNTGVIRKSNNFIALALKIKEPRNKESLFFMSVMELRDLLIALESRLHQKHKL DAAARLQYEQARDKVIKKMAENIPEILVDELKNADINRRVNTLELTDNQGENLTFVLTLH DGSKCELVVNELQIEMLARAIIHAINNAEMRELALRITSLLDFLPLYDVDCQENGNLEYD TYSQPEWKHNLFDHYLAVLYRFKDESGKEQFSGAVVKTREATPGKEIEAITRRMLDFSPR LKKLAGVPCQVYVRTVAANNAQPLTQDQCLRALHHLRVQSTSKTAPQAK >gi|296494541|gb|ADTN01000197.1| GENE 8 7772 - 8800 878 342 aa, chain - ## HITS:1 COG:ECs5127 KEGG:ns NR:ns ## COG: ECs5127 COG1509 # Protein_GI_number: 15834381 # Func_class: E Amino acid transport and metabolism # Function: Lysine 2,3-aminomutase # Organism: Escherichia coli O157:H7 # 1 342 1 342 342 672 98.0 0 MAHIVTLNTPSREDWLTQLADVVTDPDELLRLLNIDADEKLLAGRSAKKLFALRVPRSFI DRMEKGNPDDPLLRQVLTSQDEFVVASGFSTDPLEEQHSVVPGLLHKYHNRALLLVKGGC AVNCRYCFRRHFPYAENQGNKRNWQTALEYVAAHPELDEMIFSGGDPLMAKDHELDWLLT QLEAIPHIKRLRIHSRLPIVIPARITDALVERFSHSTLQILLVNHINHANEIDETFRQAM AKLRRVGVTLLNQSVLLRDVNDNAQTLANLSNALFDAGVMPYYLHVLDKVQGAAHFMVSD DEARQIMRELLTLVSGYLVPKLAREIGGEPSKTPLDLQLRQQ >gi|296494541|gb|ADTN01000197.1| GENE 9 8842 - 9408 611 188 aa, chain + ## HITS:1 COG:ECs5128 KEGG:ns NR:ns ## COG: ECs5128 COG0231 # Protein_GI_number: 15834382 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Escherichia coli O157:H7 # 1 188 1 188 188 377 100.0 1e-105 MATYYSNDFRAGLKIMLDGEPYAVEASEFVKPGKGQAFARVKLRRLLTGTRVEKTFKSTD SAEGADVVDMNLTYLYNDGEFWHFMNNETFEQLSADAKAIGDNAKWLLDQAECIVTLWNG QPISVTPPNFVELEIVDTDPGLKGDTAGTGGKPATLSTGAVVKVPLFVQIGEVIKVDTRS GEYVSRVK >gi|296494541|gb|ADTN01000197.1| GENE 10 9696 - 9842 130 48 aa, chain + ## HITS:1 COG:STM4336 KEGG:ns NR:ns ## COG: STM4336 COG5510 # Protein_GI_number: 16767585 # Func_class: S Function unknown # Function: Predicted small secreted protein # Organism: Salmonella typhimurium LT2 # 1 48 1 48 48 62 95.0 2e-10 MVKKTIAAIFSVLVLSTVLTACNTTRGVGEDISDGGNAISGAATKAQQ >gi|296494541|gb|ADTN01000197.1| GENE 11 10024 - 10341 440 105 aa, chain + 
## HITS:1 COG:ECs5129 KEGG:ns NR:ns ## COG: ECs5129 COG2076 # Protein_GI_number: 15834383 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane transporters of cations and cationic drugs # Organism: Escherichia coli O157:H7 # 1 105 51 155 155 160 100.0 4e-40 MSWIILVIAGLLEVVWAVGLKYTHGFSRLTPSVITVTAMIVSMALLAWAMKSLPVGTAYA VWTGIGAVGAAITGIVLLGESANPMRLASLALIVLGIIGLKLSTH >gi|296494541|gb|ADTN01000197.1| GENE 12 10338 - 10871 752 177 aa, chain - ## HITS:1 COG:ECs5130 KEGG:ns NR:ns ## COG: ECs5130 COG3040 # Protein_GI_number: 15834384 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Bacterial lipocalin # Organism: Escherichia coli O157:H7 # 1 177 1 177 177 365 99.0 1e-101 MRLLPLVAAATAAFLVVACSSPTPPRGVTVVNNFDAKRYLGTWYEIARFDHRFERGLEKV TATYSLRDDGGLNVINKGYNPDREMWQQSEGKAYFTGAPTRAALKVSFFGPFYGGYNVIA LDREYRHALVCGPDRDYLWILSRTPTISDEVKQEMLAVATREGFDVSKFIWVQQPGS >gi|296494541|gb|ADTN01000197.1| GENE 13 10960 - 12093 799 377 aa, chain - ## HITS:1 COG:ECs5131 KEGG:ns NR:ns ## COG: ECs5131 COG1680 # Protein_GI_number: 15834385 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Escherichia coli O157:H7 # 1 377 1 377 377 733 98.0 0 MFKTTLCALLITASCSTFAAPQKINDIVHRTITPLIEQQKIPGMAVAVIYQGKPYYFTWG YADIAKKQPVTQQTLFELGSVSKTFTGVLGGDAIARGEIKLSDPATKYWPELTAKQWNGI TLLHLATYPAGGLPLQVPDEVKSSSDLLRFYQNWQPAWAPGTQRLYANSSIGLFGALAVK PSGLSFEQAMQTRVFQPLKLNHTWINVPPAEEKNYAWGYREGKAVHVSPGALDAETYGVK STIEDMACWVRSNMNPRDINDKTLQQGIQLAQSRYWQTGDMYQGLGWEMLDWPVNPDSII NGSGNKIALAARPVKAITPPTPAVRASWVHKTGATGGFGSYVAFIPEKELGIVMLANKNY PNPARVTAAWQILNALQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:48:31 2011 Seq name: gi|296494540|gb|ADTN01000198.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont416.4, whole genome shotgun sequence Length of sequence - 47499 bp Number of predicted genes - 49, with homology - 49 Number of transcription units - 19, operones - 9 average op.length - 4.3 N Tu/Op Conserved S 
Start End Score pairs(N/Pv) 2 1 Op 2 10/0.000 - CDS 406 - 801 490 ## COG3029 Fumarate reductase subunit C 3 1 Op 3 36/0.000 - CDS 812 - 1546 844 ## COG0479 Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit 4 1 Op 4 . - CDS 1539 - 3347 1999 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit - Prom 3512 - 3571 3.4 + Prom 3470 - 3529 7.1 5 2 Tu 1 . + CDS 3672 - 4649 991 ## COG2269 Truncated, possibly inactive, lysyl-tRNA synthetase (class II) + Term 4682 - 4710 -1.0 + Prom 4688 - 4747 3.9 6 3 Op 1 . + CDS 4913 - 6370 1393 ## COG0531 Amino acid transporters 7 3 Op 2 . + CDS 6422 - 6736 267 ## JW4118 hypothetical protein 8 3 Op 3 . + CDS 6733 - 7047 254 ## JW4119 conserved inner membrane protein - Term 6949 - 6987 -0.1 9 4 Op 1 5/0.333 - CDS 7076 - 10399 3590 ## COG3264 Small-conductance mechanosensitive channel 10 4 Op 2 2/0.833 - CDS 10421 - 11389 978 ## COG0688 Phosphatidylserine decarboxylase - Prom 11420 - 11479 2.2 11 4 Op 3 . - CDS 11486 - 12538 953 ## COG1162 Predicted GTPases - Prom 12618 - 12677 2.7 + Prom 12531 - 12590 1.9 12 5 Tu 1 . + CDS 12633 - 13178 793 ## COG1949 Oligoribonuclease (3'->5' exoribonuclease) + Term 13411 - 13479 30.4 + TRNA 13389 - 13464 93.7 # Gly GCC 0 0 + TRNA 13500 - 13575 93.7 # Gly GCC 0 0 13 6 Tu 1 . 
- CDS 13845 - 14984 930 ## COG1600 Uncharacterized Fe-S protein - Prom 15009 - 15068 4.9 + Prom 14893 - 14952 2.8 14 7 Op 1 6/0.000 + CDS 14998 - 16530 1051 ## PROTEIN SUPPORTED gi|153825000|ref|ZP_01977667.1| ribosomal protein S15 15 7 Op 2 13/0.000 + CDS 16502 - 16963 540 ## COG0802 Predicted ATPase or kinase 16 7 Op 3 10/0.000 + CDS 16982 - 18319 1054 ## COG0860 N-acetylmuramoyl-L-alanine amidase 17 7 Op 4 12/0.000 + CDS 18329 - 20176 1725 ## COG0323 DNA mismatch repair enzyme (predicted ATPase) 18 7 Op 5 15/0.000 + CDS 20214 - 21119 540 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase 19 7 Op 6 16/0.000 + CDS 21205 - 21513 294 ## COG1923 Uncharacterized host factor I protein + Term 21534 - 21579 6.2 20 7 Op 7 8/0.000 + CDS 21589 - 22869 733 ## PROTEIN SUPPORTED gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 21 7 Op 8 21/0.000 + CDS 22955 - 24214 1513 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs 22 7 Op 9 11/0.000 + CDS 24217 - 25221 1351 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs + Term 25251 - 25289 7.2 23 7 Op 10 6/0.000 + CDS 25303 - 25500 234 ## COG3242 Uncharacterized protein conserved in bacteria + Prom 25521 - 25580 4.0 24 7 Op 11 5/0.333 + CDS 25604 - 26902 1497 ## COG0104 Adenylosuccinate synthase + Term 26957 - 27003 9.2 + Prom 27028 - 27087 4.2 25 8 Op 1 2/0.833 + CDS 27179 - 27532 239 ## COG1959 Predicted transcriptional regulator 26 8 Op 2 7/0.000 + CDS 27571 - 30012 1250 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 + Prom 30093 - 30152 7.0 27 8 Op 3 4/0.500 + CDS 30192 - 30923 455 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 + Term 30938 - 30979 7.4 + Prom 30936 - 30995 7.2 28 9 Op 1 6/0.000 + CDS 31050 - 31451 344 ## COG3789 Uncharacterized protein conserved in bacteria 29 9 Op 2 . 
+ CDS 31470 - 32168 904 ## COG1842 Phage shock protein A (IM30), suppresses sigma54-dependent transcription + Term 32174 - 32212 6.1 30 10 Op 1 . + CDS 32219 - 32878 724 ## JW4141 conserved hypothetical protein 31 10 Op 2 5/0.333 + CDS 32896 - 33294 384 ## COG3766 Predicted membrane protein 32 10 Op 3 5/0.333 + CDS 33304 - 33942 351 ## COG5463 Predicted integral membrane protein 33 10 Op 4 1/0.833 + CDS 33945 - 35108 1127 ## COG0754 Glutathionylspermidine synthase 34 10 Op 5 . + CDS 35192 - 36817 1256 ## COG1960 Acyl-CoA dehydrogenases 35 11 Tu 1 . - CDS 36934 - 37221 273 ## SSON_4370 hypothetical protein - Prom 37286 - 37345 4.3 36 12 Tu 1 . - CDS 37358 - 37687 249 ## APECO1_2203 hypothetical protein - Prom 37794 - 37853 3.1 + Prom 37652 - 37711 3.7 37 13 Tu 1 . + CDS 37869 - 38618 513 ## COG1073 Hydrolases of the alpha/beta superfamily - Term 38330 - 38382 3.3 38 14 Tu 1 . - CDS 38615 - 39370 858 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 39395 - 39454 3.5 - Term 39417 - 39468 11.1 39 15 Tu 1 . - CDS 39478 - 40542 1286 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold - Prom 40711 - 40770 6.3 + Prom 40722 - 40781 8.1 40 16 Op 1 11/0.000 + CDS 40897 - 42294 1764 ## COG3037 Uncharacterized protein conserved in bacteria 41 16 Op 2 13/0.000 + CDS 42310 - 42615 361 ## COG3414 Phosphotransferase system, galactitol-specific IIB component 42 16 Op 3 8/0.000 + CDS 42625 - 43089 640 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 43 16 Op 4 9/0.000 + CDS 43103 - 43753 917 ## COG0269 3-hexulose-6-phosphate synthase and related proteins 44 16 Op 5 8/0.000 + CDS 43763 - 44617 1100 ## COG3623 Putative L-xylulose-5-phosphate 3-epimerase 45 16 Op 6 . + CDS 44617 - 45303 847 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases + Term 45391 - 45430 0.3 - Term 45361 - 45391 3.0 46 17 Tu 1 . 
- CDS 45432 - 45707 304 ## LF82_3468 UPF0379 protein YjfY - Prom 45760 - 45819 4.0 + Prom 45884 - 45943 2.9 47 18 Tu 1 . + CDS 46035 - 46430 681 ## PROTEIN SUPPORTED gi|188495956|ref|ZP_03003226.1| ribosomal protein S6 48 19 Op 1 27/0.000 + CDS 46756 - 46983 385 ## PROTEIN SUPPORTED gi|15834432|ref|NP_313205.1| 30S ribosomal protein S18 49 19 Op 2 . + CDS 47025 - 47474 720 ## PROTEIN SUPPORTED gi|15804792|ref|NP_290833.1| 50S ribosomal protein L9 Predicted protein(s) >gi|296494540|gb|ADTN01000198.1| GENE 1 36 - 395 428 119 aa, chain - ## HITS:1 COG:ECs5132 KEGG:ns NR:ns ## COG: ECs5132 COG3080 # Protein_GI_number: 15834386 # Func_class: C Energy production and conversion # Function: Fumarate reductase subunit D # Organism: Escherichia coli O157:H7 # 1 119 1 119 119 208 100.0 2e-54 MINPNPKRSDEPVFWGLFGAGGMWSAIIAPVMILLVGILLPLGLFPGDALSYERVLAFAQ SFIGRVFLFLMIVLPLWCGLHRMHHAMHDLKIHVPAGKWVFYGLAAILTVVTLIGVVTI >gi|296494540|gb|ADTN01000198.1| GENE 2 406 - 801 490 131 aa, chain - ## HITS:1 COG:ECs5133 KEGG:ns NR:ns ## COG: ECs5133 COG3029 # Protein_GI_number: 15834387 # Func_class: C Energy production and conversion # Function: Fumarate reductase subunit C # Organism: Escherichia coli O157:H7 # 1 131 1 131 131 225 100.0 2e-59 MTTKRKPYVRPMTSTWWKKLPFYRFYMLREGTAVPAVWFSIELIFGLFALKNGPEAWAGF VDFLQNPVIVIINLITLAAALLHTKTWFELAPKAANIIVKDEKMGPEPIIKSLWAVTVVA TIVILFVALYW >gi|296494540|gb|ADTN01000198.1| GENE 3 812 - 1546 844 244 aa, chain - ## HITS:1 COG:ECs5134 KEGG:ns NR:ns ## COG: ECs5134 COG0479 # Protein_GI_number: 15834388 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit # Organism: Escherichia coli O157:H7 # 1 244 1 244 244 516 100.0 1e-146 MAEMKNLKIEVVRYNPEVDTAPHSAFYEVPYDATTSLLDALGYIKDNLAPDLSYRWSCRM AICGSCGMMVNNVPKLACKTFLRDYTDGMKVEALANFPIERDLVVDMTHFIESLEAIKPY IIGNSRTADQGTNIQTPAQMAKYHQFSGCINCGLCYAACPQFGLNPEFIGPAAITLAHRY NEDSRDHGKKERMAQLNSQNGVWSCTFVGYCSEVCPKHVDPAAAIQQGKVESSKDFLIAT LKPR 
>gi|296494540|gb|ADTN01000198.1| GENE 4 1539 - 3347 1999 602 aa, chain - ## HITS:1 COG:frdA KEGG:ns NR:ns ## COG: frdA COG1053 # Protein_GI_number: 16131979 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Escherichia coli K12 # 1 602 1 602 602 1170 99.0 0 MQTFQADLAIVGAGGAGLRAAIAAAQANPNAKIALISKVYPMRSHTVAAEGGSAAVAQDH DSFEYHFHDTVAGGDWLCEQDVVDYFVHHCPTEMTQLELWGCPWSRRPDGSVNVRRFGGM KIERTWFAADKTGFHMLHTLFQTSLQFPQIQRFDEHFVLDILVDDGHVRGLVAMNMMEGT LVQIRANAVVMATGGAGRVYRYNTNGGIVTGDGMGMALSHGVPLRDMEFVQYHPTGLPGS GILMTEGCRGEGGILVNKNGYRYLQDYGMGPETPLGEPKNKYMELGPRDKVSQAFWHEWR KGNTISTPRGDVVYLDLRHLGEKKLHERLPFICELAKAYVGVDPVKEPIPVRPTAHYTMG GIETDQNCETRIKGLFAVGECSSVGLHGANRLGSNSLAELVVFGRLAGEQATERSATAGN GNEAAIEAQAAGVEQRLKDLVNQDGGENWAKIRDEMGLAMEEGCGIYRTPELMQKTIDKL AELQERFKRVRITDTSSVFNTDLLYTIELGHGLNVAECMAHSAMARKESRGAHQRLDEGC TERDDVNFLKHTLAFRDADGTTRLEYSDVKITTLPPAKRVYGGEADAADKAEAANKKEKA NG >gi|296494540|gb|ADTN01000198.1| GENE 5 3672 - 4649 991 325 aa, chain + ## HITS:1 COG:ECs5136 KEGG:ns NR:ns ## COG: ECs5136 COG2269 # Protein_GI_number: 15834390 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Truncated, possibly inactive, lysyl-tRNA synthetase (class II) # Organism: Escherichia coli O157:H7 # 1 325 11 335 335 662 100.0 0 MSETASWQPSASIPNLLKRAAIMAEIRRFFADRGVLEVETPCMSQATVTDIHLVPFETRF VGPGHSQGMNLWLMTSPEYHMKRLLVAGCGPVFQLCRSFRNEEMGRYHNPEFTMLEWYRP HYDMYRLMNEVDDLLQQVLDCPAAESLSYQQAFLRYLEIDPLSADKTQLREVAAKLDLSN VADTEEDRDTLLQLLFTFGVEPNIGKEKPTFVYHFPASQASLAQISTEDHRVAERFEVYY KGIELANGFHELTDAREQQQRFEQDNRKRAARGLPQHPIDQNLIEALKVGMPDCSGVALG VDRLVMLALGAETLAEVIAFSVDRA >gi|296494540|gb|ADTN01000198.1| GENE 6 4913 - 6370 1393 485 aa, chain + ## HITS:1 COG:yjeM KEGG:ns NR:ns ## COG: yjeM COG0531 # Protein_GI_number: 16131981 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 485 30 514 514 860 100.0 0 
MIFTSVFGFANSPSAYYLMGYSAIPFYIFSALLFFIPFALMMAEMGAAYRKEEGGIYSWM NNSVGPRFAFIGTFMWFSSYIIWMVSTSAKVWVPFSTFLYGSDMTQHWRIAGLEPTQVVG LLAVAWMILVTVVASKGINKIARITAVGGIAVMCLNLVLLLVSITILLLNGGHFAQDINF LASPNPGYQSGLAMLSFVVFAIFAYGGIEAVGGLVDKTENPEKNFAKGIVFAAIVISIGY SLAIFLWGVSTNWQQVLSNGSVNLGNITYVLMKSLGMTLGNALHLSPEASLSLGVWFARI TGLSMFLAYTGAFFTLCYSPLKAIIQGTPKALWPEPMTRLNAMGMPSIAMWMQCGLVTVF ILLVSFGGGTASAFFNKLTLMANVSMTLPYLFLALAFPFFKARQDLDRPFVIFKTHLSAM IATVVVVLVVTFANVFTIIQPVVEAGDWDSTLWMIGGPVFFSLLAMAIYQNYCSRVAKNP QWAVE >gi|296494540|gb|ADTN01000198.1| GENE 7 6422 - 6736 267 104 aa, chain + ## HITS:1 COG:no KEGG:JW4118 NR:ns ## KEGG: JW4118 # Name: yjeN # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 104 1 104 104 211 100.0 5e-54 MDDTSRDPAITEDEIRALQFSAGDVAEIEQTILSFVDACHTRKVAMVVGSTINTLKDRDG KRWGNLPDIYCAYLIRCLVFRGELVGYGDLFRMRYSEIKRPVTL >gi|296494540|gb|ADTN01000198.1| GENE 8 6733 - 7047 254 104 aa, chain + ## HITS:1 COG:no KEGG:JW4119 NR:ns ## KEGG: JW4119 # Name: yjeO # Def: conserved inner membrane protein # Organism: E.coli_J # Pathway: not_defined # 1 104 1 104 104 169 100.0 2e-41 MSARMFVLCCIWFIVAFLWITITSALDKEWMIDGRGINNVCDVLMYLEEDDTRDVGVIMT LPLFFPFLWFALWRKKRGWFMYATALAIFGYWLWQFFLRYQFCL >gi|296494540|gb|ADTN01000198.1| GENE 9 7076 - 10399 3590 1107 aa, chain - ## HITS:1 COG:yjeP KEGG:ns NR:ns ## COG: yjeP COG3264 # Protein_GI_number: 16131984 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Escherichia coli K12 # 1 1107 1 1107 1107 2058 100.0 0 MRLIITFLMAWCLSWGAYAATAPDSKQITQELEQAKAAKPAQPEVVEALQSALNALEERK GSLERIKQYQQVIDNYPKLSATLRAQLNNMRDEPRSVSPGMSTDALNQEILQVSSQLLDK SRQAQQEQERAREIADSLNQLPQQQTDARRQLNEIERRLGTLTGNTPLNQAQNFALQSDS ARLKALVDELELAQLSANNRQELARLRSELAEKESQQLDAYLQALRNQLNSQRQLEAERA LESTELLAENSADLPKDIVAQFKINRELSAALNQQAQRMDLVASQQRQAASQTLQVRQAL NTLREQSQWLGSSNLLGEALRAQVARLPEMPKPQQLDTEMAQLRVQRLRYEDLLNKQPLL RQIHQADGQPLTAEQNRILEAQLRTQRELLNSLLQGGDTLLLELTKLKVSNGQLEDALKE 
VNEATHRYLFWTSDVRPMTIAWPLEIAQDLRRLISLDTFSQLGKASVMMLTSKETILPLF GALILVGCSIYSRRYFTRFLERSAAKVGKVTQDHFWLTLRTLFWSILVASPLPVLWMTLG YGLREAWPYPLAVAIGDGVTATVPLLWVVMICATFARPNGLFIAHFGWPRERVSRGMRYY LMSIGLIVPLIMALMMFDNLDDREFSGSLGRLCFILICGALAVVTLSLKKAGIPLYLNKE GSGDNITNHMLWNMMIGAPLVAILASAVGYLATAQALLARLETSVAIWFLLLVVYHVIRR WMLIQRRRLAFDRAKHRRAEMLAQRARGEEEAHHHSSPEGAIEVDESEVDLDAISAQSLR LVRSILMLIALLSVIVLWSEIHSAFGFLENISLWDVTSTVQGVESLEPITLGAVLIAILV FIITTQLVRNLPALLELAILQHLDLTPGTGYAITTITKYLLMLIGGLVGFSMIGIEWSKL QWLVAALGVGLGFGLQEIFANFISGLIILFEKPIRIGDTVTIRDLTGSVTKINTRATTIS DWDRKEIIVPNKAFITEQFINWSLSDSVTRVVLTIPAPADANSEEVTEILLTAARRCSLV IDNPAPEVFLVDLQQGIQIFELRIYAAEMGHRMPLRHEIHQLILAGFHAHGIDMPFPPFQ MRLESLNGKQTGRTLTSAGKGRQAGSL >gi|296494540|gb|ADTN01000198.1| GENE 10 10421 - 11389 978 322 aa, chain - ## HITS:1 COG:ECs5139 KEGG:ns NR:ns ## COG: ECs5139 COG0688 # Protein_GI_number: 15834393 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine decarboxylase # Organism: Escherichia coli O157:H7 # 1 322 1 322 322 664 100.0 0 MLNSFKLSLQYILPKLWLTRLAGWGASKRAGWLTKLVIDLFVKYYKVDMKEAQKPDTASY RTFNEFFVRPLRDEVRPIDTDPNVLVMPADGVISQLGKIEEDKILQAKGHNYSLEALLAG NYLMADLFRNGTFVTTYLSPRDYHRVHMPCNGILREMIYVPGDLFSVNHLTAQNVPNLFA RNERVICLFDTEFGPMAQILVGATIVGSIETVWAGTITPPREGIIKRWTWPAGENDGSVA LLKGQEMGRFKLGSTVINLFAPGKVNLVEQLESLSVTKIGQPLAVSTETFVTPDAEPAPL PAEEIEAEHDASPLVDDKKDQV >gi|296494540|gb|ADTN01000198.1| GENE 11 11486 - 12538 953 350 aa, chain - ## HITS:1 COG:yjeQ KEGG:ns NR:ns ## COG: yjeQ COG1162 # Protein_GI_number: 16131986 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Escherichia coli K12 # 14 350 1 337 337 692 99.0 0 MSKNKLSKGQQRRVNANHQRRLKTSKEKPDYDDNLFGEPDEGIVISRFGMHADVESADGD VHRCNIRRTIRSLVTGDRVVWRPGKPAAEGVNVKGIVEAVHERTSVLTRPDFYDGVKPIA ANIDQIVIVSAILPELSLNIIDRYLVACETLQIEPIIVLNKIDLLDDEGMAFVNEQMDIY RNIGYRVLMVSSHTQDGLKPLEEALTGRISIFAGQSGVGKSSLLNALLGLQKEILTNDIS DNSGLGQHTTTAARLYHFPHGGDVIDSPGVREFGLWHLEPEQITQGFVEFHDYLGLCKYR 
DCKHDTDPGCAIREAVEEGKIAETRFENYHRILESMAQVKTRKNFSDTDD >gi|296494540|gb|ADTN01000198.1| GENE 12 12633 - 13178 793 181 aa, chain + ## HITS:1 COG:orn KEGG:ns NR:ns ## COG: orn COG1949 # Protein_GI_number: 16131987 # Func_class: A RNA processing and modification # Function: Oligoribonuclease (3'->5' exoribonuclease) # Organism: Escherichia coli K12 # 1 181 24 204 204 367 100.0 1e-102 MSANENNLIWIDLEMTGLDPERDRIIEIATLVTDANLNILAEGPTIAVHQSDEQLALMDD WNVRTHTASGLVERVKASTMGDREAELATLEFLKQWVPAGKSPICGNSIGQDRRFLFKYM PELEAYFHYRYLDVSTLKELARRWKPEILDGFTKQGTHQAMDDIRESVAELAYYREHFIK L >gi|296494540|gb|ADTN01000198.1| GENE 13 13845 - 14984 930 379 aa, chain - ## HITS:1 COG:yjeS KEGG:ns NR:ns ## COG: yjeS COG1600 # Protein_GI_number: 16131988 # Func_class: C Energy production and conversion # Function: Uncharacterized Fe-S protein # Organism: Escherichia coli K12 # 1 379 1 379 379 784 100.0 0 MSEPLDLNQLAQKIKQWGLELGFQQVGITDTDLSESEPKLQAWLDKQYHGEMDWMARHGM LRARPHELLPGTLRVISVRMNYLPANAAFASTLKNPKLGYVSRYALGRDYHKLLRNRLKK LGEMIQQHCVSLNFRPFVDSAPILERPLAEKAGLGWTGKHSLILNREAGSFFFLGELLVD IPLPVDQPVEEGCGKCVACMTICPTGAIVEPYTVDARRCISYLTIELEGAIPEELRPLMG NRIYGCDDCQLICPWNRYSQLTTEEDFSPRKPLHAPELIELFAWSEEKFLKVTEGSAIRR IGHLRWLRNIAVALGNAPWDETILTALESRKGEHPLLDEHIAWAIAQQIERRNACIVEVQ LPKKQRLVRVIEKGLPRDA >gi|296494540|gb|ADTN01000198.1| GENE 14 14998 - 16530 1051 510 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|153825000|ref|ZP_01977667.1| ribosomal protein S15 [Vibrio cholerae MZO-2] # 8 494 3 490 490 409 44 1e-113 MKKNPVSIPHTVWYADDIRRGEREAADVLGLTLYELMLRAGEAAFQVCRSAYPDARHWLV LCGHGNNGGDGYVVARLAKAVGIEVTLLAQESDKPLPEEAALAREAWLNAGGEIHASNIV WPESVDLIVDALLGTGLRQAPRESISQLIDHANSHPAPIVAVDIPSGLLAETGATPGAVI NADHTITFIALKPGLLTGKARDVTGQLHFDSLGLDSWLAGQETKIQRFSAEQLSHWLKPR RPTSHKGDHGRLVIIGGDHGTAGAIRMTGEAALRAGAGLVRVLTRSENIAPLLTARPELM VHELTMDSLTESLEWADVVVIGPGLGQQEWGKKALQKVENFRKPMLWDADALNLLAINPD KRHNRVITPHPGEAARLLGCSVAEIESDRLHCAKRLVQRYGGVAVLKGAGTVVAAHPDAL GIIDAGNAGMASGGMGDVLSGIIGALLGQKLSPYDAACAGCVAHGAAADVLAARFGTRGM LATDLFSTLQRIVNPEVTDKNHDESSNSAP 
>gi|296494540|gb|ADTN01000198.1| GENE 15 16502 - 16963 540 153 aa, chain + ## HITS:1 COG:ECs5144 KEGG:ns NR:ns ## COG: ECs5144 COG0802 # Protein_GI_number: 15834398 # Func_class: R General function prediction only # Function: Predicted ATPase or kinase # Organism: Escherichia coli O157:H7 # 1 153 1 153 153 313 100.0 1e-85 MMNRVIPLPDEQATLDLGERVAKACDGATVIYLYGDLGAGKTTFSRGFLQALGHQGNVKS PTYTLVEPYTLDNLMVYHFDLYRLADPEELEFMGIRDYFANDAICLVEWPQQGTGVLPDP DVEIHIDYQAQGREARVSAVSSAGELLLARLAG >gi|296494540|gb|ADTN01000198.1| GENE 16 16982 - 18319 1054 445 aa, chain + ## HITS:1 COG:amiB KEGG:ns NR:ns ## COG: amiB COG0860 # Protein_GI_number: 16131991 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Escherichia coli K12 # 1 445 1 445 445 814 99.0 0 MMYRIRNWLVATLLLLCTPVGAATLSDIQVSNGNQQARITLSFIGDPDYAFSHQSKRTVA LDIKQTGVIQGLPLLFSGNNLVKAIRSGTPKDAQTLRLVVDLTENGKTEAVKRQNGSNYT VVFTINADVPPPPPPPPVVAKRVETPAVVAPRVSEPARNPFKTESNRTTGVISSNTVTRP AARATANTGDKIIIAIDAGHGGQDPGAIGPGGTREKNVTIAIARKLRTLLNDDPMFKGVL TRDGDYFISVMGRSDVARKQNANFLVSIHADAAPNRSATGASVWVLSNRRANSEMASWLE QHEKQSELLGGAGDVLANSQSDPYLSQAVLDLQFGHSQRGGYDVATSMISQLQRIGEIHK RRPEHASLGVLRSPDIPSVLVETGFISNNSEERLLASDDYQQQLAEAIYKGLRNYFLAHP MQSAPQGATAQTASTVTTPDRTLPN >gi|296494540|gb|ADTN01000198.1| GENE 17 18329 - 20176 1725 615 aa, chain + ## HITS:1 COG:mutL KEGG:ns NR:ns ## COG: mutL COG0323 # Protein_GI_number: 16131992 # Func_class: L Replication, recombination and repair # Function: DNA mismatch repair enzyme (predicted ATPase) # Organism: Escherichia coli K12 # 1 615 1 615 615 1164 100.0 0 MPIQVLPPQLANQIAAGEVVERPASVVKELVENSLDAGATRIDIDIERGGAKLIRIRDNG CGIKKDELALALARHATSKIASLDDLEAIISLGFRGEALASISSVSRLTLTSRTAEQQEA WQAYAEGRDMNVTVKPAAHPVGTTLEVLDLFYNTPARRKFLRTEKTEFNHIDEIIRRIAL ARFDVTINLSHNGKIVRQYRAVPEGGQKERRLGAICGTAFLEQALAIEWQHGDLTLRGWV ADPNHTTPALAEIQYCYVNGRMMRDRLINHAIRQACEDKLGADQQPAFVLYLEIDPHQVD VNVHPAKHEVRFHQSRLVHDFIYQGVLSVLQQQLETPLPLDDEPQPAPRSIPENRVAAGR 
NHFAEPAAREPVAPRYTPAPASGSRPAAPWPNAQPGYQKQQGEVYRQLLQTPAPMQKLKA PEPQEPALAANSQSFGRVLTIVHSDCALLERDGNISLLSLPVAERWLRQAQLTPGEAPVC AQPLLIPLRLKVSAEEKSALEKAQSALAELGIDFQSDAQHVTIRAVPLPLRQQNLQILIP ELIGYLAKQSVFEPGNIAQWIARNLMSEHAQWSMAQAITLLADVERLCPQLVKTPPGGLL QSVDLHPAIKALKDE >gi|296494540|gb|ADTN01000198.1| GENE 18 20214 - 21119 540 301 aa, chain + ## HITS:1 COG:miaA KEGG:ns NR:ns ## COG: miaA COG0324 # Protein_GI_number: 16131993 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Escherichia coli K12 # 1 301 16 316 316 580 100.0 1e-165 MGPTASGKTALAIELRKILPVELISVDSALIYKGMDIGTAKPNAEELLAAPHRLLDIRDP SQAYSAADFRRDALAEMADITAAGRIPLLVGGTMLYFKALLEGLSPLPSADPEVRARIEQ QAAEQGWESLHRQLQEVDPVAAARIHPNDPQRLSRALEVFFISGKTLTELTQTSGDALPY QVHQFAIAPASRELLHQRIEQRFHQMLASGFEAEVRALFARGDLHTDLPSIRCVGYRQMW SYLEGEISYDEMVYRGVCATRQLAKRQITWLRGWEGVHWLDSEKPEQARDEVLQVVGAIA G >gi|296494540|gb|ADTN01000198.1| GENE 19 21205 - 21513 294 102 aa, chain + ## HITS:1 COG:ECs5148 KEGG:ns NR:ns ## COG: ECs5148 COG1923 # Protein_GI_number: 15834402 # Func_class: R General function prediction only # Function: Uncharacterized host factor I protein # Organism: Escherichia coli O157:H7 # 1 102 1 102 102 177 100.0 5e-45 MAKGQSLQDPFLNALRRERVPVSIYLVNGIKLQGQIESFDQFVILLKNTVSQMVYKHAIS TVVPSRPVSHHSNNAGGGTSSNYHHGSSAQNTSAQQDSEETE >gi|296494540|gb|ADTN01000198.1| GENE 20 21589 - 22869 733 426 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 [Roseobacter sp. 
AzwK-3b] # 50 398 55 407 425 286 46 1e-76 MFDRYDAGEQAVLVHIYFTQDKDMEDLQEFESLVSSAGVEALQVITGSRKAPHPKYFVGE GKAVEIAEAVKATGASVVLFDHALSPAQERNLERLCECRVIDRTGLILDIFAQRARTHEG KLQVELAQLRHLATRLVRGWTHLERQKGGIGLRGPGETQLETDRRLLRNRIVQIQSRLER VEKQREQGRQSRIKADVPTVSLVGYTNAGKSTLFNRITEARVYAADQLFATLDPTLRRID VADVGETVLADTVGFIRHLPHDLVAAFKATLQETRQATLLLHVIDAADVRVQENIEAVNT VLEEIDAHEIPTLLVMNKIDMLEDFEPRIDRDEENKPIRVWLSAQTGAGIPQLFQALTER LSGEVAQHTLRLPPQEGRLRSRFYQLQAIEKEWMEEDGSVSLQVRMPIVDWRRLCKQEPA LIDYLI >gi|296494540|gb|ADTN01000198.1| GENE 21 22955 - 24214 1513 419 aa, chain + ## HITS:1 COG:ECs5150 KEGG:ns NR:ns ## COG: ECs5150 COG0330 # Protein_GI_number: 15834404 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Escherichia coli O157:H7 # 1 419 1 419 419 672 100.0 0 MAWNQPGNNGQDRDPWGSSKPGGNSEGNGNKGGRDQGPPDLDDIFRKLSKKLGGLGGGKG TGSGGGSSSQGPRPQLGGRVVTIAAAAIVIIWAASGFYTIKEAERGVVTRFGKFSHLVEP GLNWKPTFIDEVKPVNVEAVRELAASGVMLTSDENVVRVEMNVQYRVTNPEKYLYSVTSP DDSLRQATDSALRGVIGKYTMDRILTEGRTVIRSDTQRELEETIRPYDMGITLLDVNFQA ARPPEEVKAAFDDAIAARENEQQYIREAEAYTNEVQPRANGQAQRILEEARAYKAQTILE AQGEVARFAKLLPEYKAAPEITRERLYIETMEKVLGNTRKVLVNDKGGNLMVLPLDQMLK GGNAPAAKSDNGASNLLRLPPASSSTTSGASNTSSTSQGDIMDQRRANAQRNDYQRQGE >gi|296494540|gb|ADTN01000198.1| GENE 22 24217 - 25221 1351 334 aa, chain + ## HITS:1 COG:ECs5151 KEGG:ns NR:ns ## COG: ECs5151 COG0330 # Protein_GI_number: 15834405 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Escherichia coli O157:H7 # 1 334 1 334 334 566 100.0 1e-161 MRKSVIAIIIIVLVVLYMSVFVVKEGERGITLRFGKVLRDDDNKPLVYEPGLHFKIPFIE TVKMLDARIQTMDNQADRFVTKEKKDLIVDSYIKWRISDFSRYYLATGGGDISQAEVLLK RKFSDRLRSEIGRLDVKDIVTDSRGRLTLEVRDALNSGSAGTEDEVTTPAADNAIAEAAE RVTAETKGKVPVINPNSMAALGIEVVDVRIKQINLPTEVSEAIYNRMRAEREAVARRHRS QGQEEAEKLRATADYEVTRTLAEAERQGRIMRGEGDAEAAKLFADAFSKDPDFYAFIRSL 
RAYENSFSGNQDVMVMSPDSDFFRYMKTPTSATR >gi|296494540|gb|ADTN01000198.1| GENE 23 25303 - 25500 234 65 aa, chain + ## HITS:1 COG:ECs5152 KEGG:ns NR:ns ## COG: ECs5152 COG3242 # Protein_GI_number: 15834406 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 65 1 65 65 80 100.0 6e-16 MNSTIWLALALVLVLEGLGPMLYPKAWKKMISAMTNLPDNILRRFGGGLVVAGVVVYYML RKTIG >gi|296494540|gb|ADTN01000198.1| GENE 24 25604 - 26902 1497 432 aa, chain + ## HITS:1 COG:ECs5153 KEGG:ns NR:ns ## COG: ECs5153 COG0104 # Protein_GI_number: 15834407 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate synthase # Organism: Escherichia coli O157:H7 # 1 432 1 432 432 867 100.0 0 MGNNVVVLGTQWGDEGKGKIVDLLTERAKYVVRYQGGHNAGHTLVINGEKTVLHLIPSGI LRENVTSIIGNGVVLSPAALMKEMKELEDRGIPVRERLLLSEACPLILDYHVALDNAREK ARGAKAIGTTGRGIGPAYEDKVARRGLRVGDLFDKETFAEKLKEVMEYHNFQLVNYYKAE AVDYQKVLDDTMAVADILTSMVVDVSDLLDQARQRGDFVMFEGAQGTLLDIDHGTYPYVT SSNTTAGGVATGSGLGPRYVDYVLGILKAYSTRVGAGPFPTELFDETGEFLCKQGNEFGA TTGRRRRTGWLDTVAVRRAVQLNSLSGFCLTKLDVLDGLKEVKLCVAYRMPDGREVTTTP LAADDWKGVEPIYETMPGWSESTFGVKDRSGLPQAALNYIKRIEELTGVPIDIISTGPDR TETMILRDPFDA >gi|296494540|gb|ADTN01000198.1| GENE 25 27179 - 27532 239 117 aa, chain + ## HITS:1 COG:ECs5154 KEGG:ns NR:ns ## COG: ECs5154 COG1959 # Protein_GI_number: 15834408 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 117 25 141 141 231 100.0 3e-61 MTSISEVTDVYGVSRNHMVKIINQLSRAGYVTAVRGKNGGIRLGKPASAIRIGDVVRELE PLSLVNCSSEFCHITPACRLKQALSKAVQSFLTELDNYTLADLVEENQPLYKLLLVE >gi|296494540|gb|ADTN01000198.1| GENE 26 27571 - 30012 1250 813 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 21 728 4 705 730 486 37 1e-136 MSQDPFQEREAEKYANPIPSREFILEHLTKREKPASRDELAVELHIEGEEQLEGLRRRLR AMERDGQLVFTRRQCYALPERLDLVKGTVIGHRDGYGFLRVEGRKDDLYLSSEQMKTCIH 
GDQVLAQPLGADRKGRREARIVRVLVPKTSQIVGRYFTEAGVGFVVPDDSRLSFDILIPP DQIMGARMGFVVVVELTQRPTRRTKAVGKIVEVLGDNMGTGMAVDIALRTHEIPYIWPQA VEQQVAGLKEEVPEEAKAGRVDLRDLPLVTIDGEDARDFDDAVYCEKKRGGGWRLWVAIA DVSYYVRPSTPLDREARNRGTSVYFPSQVIPMLPEVLSNGLCSLNPQVDRLCMVCEMTVS SKGRLTGYKFYEAVMSSHARLTYTKVWHILQGDQDLREQYAPLVKHLEELHNLYKVLDKA REERGGISFESEEAKFIFNAERRIERIEQTQRNDAHKLIEECMILANISAARFVEKAKEP ALFRIHDKPSTEAITSFRSVLAELGLELPGGNKPEPRDYAELLESVADRPDAEMLQTMLL RSMKQAIYDPENRGHFGLALQSYAHFTSPIRRYPDLTLHRAIKYLLAKEQGHQGNTTETG GYHYSMEEMLQLGQHCSMAERRADEATRDVADWLKCDFMLDQVGNVFKGVISSVTGFGFF VRLDDLFIDGLVHVSSLDNDYYRFDQVGQRLMGESSGQTYRLGDRVEVRVEAVNMDERKI DFSLISSERAPRNVGKTAREKAKKGDAGKKGGKRRQVGKKVNFEPDSAFRGEKKTKPKAA KKDARKAKKPSAKTQKIAAATKAKRAAKKKVAE >gi|296494540|gb|ADTN01000198.1| GENE 27 30192 - 30923 455 243 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 3 242 7 246 255 179 40 2e-44 MSEMIYGIHAVQALLERAPERFQEVFILKGREDKRLLPLIHALESQGVVIQLANRQYLDE KSDGAVHQGIIARVKPGRQYQENDLPDLIASLDQPFLLILDGVTDPHNLGACLRSADAAG VHAVIVPKDRSAQLNATAKKVACGAAESVPLIRVTNLARTMRMLQEENIWIVGTAGEADH TLYQSKMTGRLALVMGAEGEGMRRLTREHCDELISIPMAGSVSSLNVSVATGICLFEAVR QRS >gi|296494540|gb|ADTN01000198.1| GENE 28 31050 - 31451 344 133 aa, chain + ## HITS:1 COG:ECs5157 KEGG:ns NR:ns ## COG: ECs5157 COG3789 # Protein_GI_number: 15834411 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 133 1 133 133 230 100.0 5e-61 MTWNPLALATALQTVPEQNIDVTNSENALIIKMNDYGDLQINILFTSRQMIIETFICPVS SISNPDEFNTFLLRNQKMMPLSSVGISSVQQEEYYIVFGALSLKSSLEDILLEITSLVDN ALDLAEITEEYSH >gi|296494540|gb|ADTN01000198.1| GENE 29 31470 - 32168 904 232 aa, chain + ## HITS:1 COG:yjfJ KEGG:ns NR:ns ## COG: yjfJ COG1842 # Protein_GI_number: 16132004 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Phage shock protein A (IM30), suppresses sigma54-dependent transcription # Organism: Escherichia coli K12 # 1 232 1 232 232 
291 100.0 7e-79 MGILKSLFTLGKSFISQAEESIEETQGVRMLEQHIRDAKAELDKAGKSRVDLLARVKLSH DKLKDLRERKASLEARALEALSKNVNPSLINEVAEEIARLENLITAEEQVLSNLEVSRDG VEKAVTATAQRIAQFEQQMEVVKATEAMQRAQQAVTTSTVGASSSVSTAAESLKRLQTRQ AERQARLDAAAQLEKVADGRDLDEKLAEAGIGGSNKSSAQDVLARLQRQQGE >gi|296494540|gb|ADTN01000198.1| GENE 30 32219 - 32878 724 219 aa, chain + ## HITS:1 COG:no KEGG:JW4141 NR:ns ## KEGG: JW4141 # Name: yjfK # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 219 1 219 219 421 100.0 1e-116 MSGFFQRLFGKDNKPAIARGPLGLHLNSGFTLDTLAFRLLEDELLIALPGEEFTVAAVSH IDLGGGSQIFRYYTSGDEFLQINTTGGEDIDDIDDIKLFVYEESYGISKESHWREAINAK AMGAMTLNWQEKRWQRFFNSEEPGNIEPVYMLEKVENQNHAKWEVHNFTMGYQRQVTEDT YEYLLLNGEESFNDLGEPEWLFSRALGVDIPLTSLHIIG >gi|296494540|gb|ADTN01000198.1| GENE 31 32896 - 33294 384 132 aa, chain + ## HITS:1 COG:ECs5160 KEGG:ns NR:ns ## COG: ECs5160 COG3766 # Protein_GI_number: 15834414 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 132 1 132 132 220 100.0 6e-58 MHILDSLLAFSAYFFIGVAMVIIFLFIYSKITPHNEWQLIKNNNTAASLAFSGTLLGYVI PLSSAAINAVSIPDYFAWGGIALVIQLLVFAGVRLYMPALSEKIINHNTAAGMFMGTAAL AGGIFNAACMTW >gi|296494540|gb|ADTN01000198.1| GENE 32 33304 - 33942 351 212 aa, chain + ## HITS:1 COG:yjfM KEGG:ns NR:ns ## COG: yjfM COG5463 # Protein_GI_number: 16132007 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Escherichia coli K12 # 1 212 1 212 212 387 99.0 1e-108 MARKRKSRNNSKIGHGAISRIGRPNNPFEPRRNRYAQKYLTLALMGGAAFFVLKGCSDSS DVDNDGDGTFYATVQDCIDDGNNADICARGWNNAKTAFYADVPKNMTQQNCQSKYENCYY DNVEQSWIPVVSGFLLSRVIRKDRDEPFVYNSGGSSFASRPVWRSTSGDYSWRSGSGKKE SYSSGGFTTKKASTVSRGGYGRSSSARGHWGG >gi|296494540|gb|ADTN01000198.1| GENE 33 33945 - 35108 1127 387 aa, chain + ## HITS:1 COG:ECs5162 KEGG:ns NR:ns ## COG: ECs5162 COG0754 # Protein_GI_number: 15834416 # Func_class: E Amino acid transport and metabolism # Function: Glutathionylspermidine synthase # Organism: Escherichia coli O157:H7 # 1 
387 1 387 387 798 99.0 0 MLRHNVPVRRDLDQIAADNGFDFHIIDNEIYWDESRAYRFTLRQIEEQIEKPTAELHQMC LEVVDRAVKDEEILTQLAIPPLYWDVIAESWRARDPSLYGRMDFAWCGNAPVKLLEYNAD TPTSLYESAYFQWLWLEDARRSGIIPRDADQYNAIQERLISRFSELYSREPFYFCCCQDT DEDRSTVLYLQDCAQQAGQESRFIYIEDLGLGVGGVLTDLDDNVIQRAFKLYPLEWMMRD DNGPLLRKRREQWVEPLWKSILSNKGLMPLLWRFFPGHPNLLASWFEGEKSQIAAGESYV RKPLYSREGGNVTIFDGQNNVVDHADGDYADEPMIYQAFQPLPRFGDSYTLIGSWIVDDE ACGMGIREDNTLITKDTSRFVPHYIAG >gi|296494540|gb|ADTN01000198.1| GENE 34 35192 - 36817 1256 541 aa, chain + ## HITS:1 COG:aidB KEGG:ns NR:ns ## COG: aidB COG1960 # Protein_GI_number: 16132009 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Escherichia coli K12 # 1 541 6 546 546 1107 99.0 0 MHWQTHTVFNQPIPLNNSNLYLSDGALCEAVTREGAGWDSDFLASIGQQLGTAESLELGR LANVNPPELLRYDAQGRRLDDVRFHPAWHLLMQALCTNRVHNLAWEEDARSGAFVARAAR FMLHAQVEAGSLCPITMTFAATPLLLQMLPAPFQDWTTPLLSDRYDSHLLPGGQKRGLLI GMGMTEKQGGSDVMSNTTRAERLEDGSYRLVGHKWFFSVPQSDAHLVLAQTAGGLSCFFV PRFLPDGQRNAIRLERLKDKLGNRSNASCEVEFQDAIGWLLGQEGEGIRLILKMGGMTRF DCALGSHAMMRRAFSLAIYHAHQRHVFGNPLIQQPLMRHVLSRMALQLEGQTALLFRLAR AWDRRADAKEALWARLFTPAAKFVICKRGMPFVAEAMEVLGGIGYCEESELPRLYREMPV NSIWEGSGNIMCLDVLRVLNKQAGVYDLLSEAFVEVKGQDRYFDRAVRRLQQQLRKPAEE LGREITHQLFLLGCGAQMLKYASPPMAQAWCQVMLDTRGGVRLSEQIQNDLLLRATGGVC V >gi|296494540|gb|ADTN01000198.1| GENE 35 36934 - 37221 273 95 aa, chain - ## HITS:1 COG:no KEGG:SSON_4370 NR:ns ## KEGG: SSON_4370 # Name: yjfN # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 95 6 100 100 172 100.0 2e-42 MELTMKQLLASPSLQLVTYPASATAQSAEFASADCVTGLNEIGQISVSNISGDPQDVERI VALKADEQGASWYRIITMYEDQQPDNWRVQAILYA >gi|296494540|gb|ADTN01000198.1| GENE 36 37358 - 37687 249 109 aa, chain - ## HITS:1 COG:no KEGG:APECO1_2203 NR:ns ## KEGG: APECO1_2203 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 109 34 142 142 201 99.0 8e-51 MVSRKRNSVIYRFASLLLVLMLSACSALQGTPQPAPPVTDHPQEIRRDQTQGLQRIGSVS TMVRGSPDDALAEIKAKAVAAKADYYVVVMVDETIVTGQWYSQAILYRK 
>gi|296494540|gb|ADTN01000198.1| GENE 37 37869 - 38618 513 249 aa, chain + ## HITS:1 COG:yjfP KEGG:ns NR:ns ## COG: yjfP COG1073 # Protein_GI_number: 16132012 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Escherichia coli K12 # 1 249 1 249 249 495 98.0 1e-140 MIEIESRELADIPVLHAYPVGQKDTPLPCAIFYHGFTSSSLVYSYFAVALAQAGLRVIMP DAPDHGSRFSGDAARRLNQFWQILLQSMQEFTTLRAAIAEEKWLLDDRLAVGGASMGAMT ALGITARHPTVRCTASMMGSGYFTSLARSLFPPLIPETAAQQNEFNNIVAPLAEWEATNH LEQLGDRPLLLWHGLDDDVVPADESLRLQQALSETGRDKLLTCSWQPGVRHRITPEALDA AVTFFRQHL >gi|296494540|gb|ADTN01000198.1| GENE 38 38615 - 39370 858 251 aa, chain - ## HITS:1 COG:ECs5167 KEGG:ns NR:ns ## COG: ECs5167 COG1349 # Protein_GI_number: 15834421 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli O157:H7 # 1 251 1 251 251 482 100.0 1e-136 MTEAQRHQILLEMLAQLGFVTVEKVVERLGISPATARRDINKLDESGKLKKVRNGAEAIT QQRPRWTPMNLHQAQNHDEKVRIAKAASQLVNPGESVVINCGSTAFLLGREMCGKPVQII TNYLPLANYLIDQEHDSVIIMGGQYNKSQSITLSPQGSENSLYAGHWMFTSGKGLTAEGL YKTDMLTAMAEQKMLSVVGKLVVLVDSSKIGERAGMLFSRADQIDMLITGKNANPEILQQ LEAQGVSILRV >gi|296494540|gb|ADTN01000198.1| GENE 39 39478 - 40542 1286 354 aa, chain - ## HITS:1 COG:ECs5168 KEGG:ns NR:ns ## COG: ECs5168 COG2220 # Protein_GI_number: 15834422 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Escherichia coli O157:H7 # 1 354 3 356 356 752 99.0 0 MSKVKSITRESWILSTFPEWGSWLNEEIEQEQVAPGTFAMWWLGCTGIWLKSEGGTNVCV DFWCGTGKQSHGNPLMKQGHQMQRMAGVKKLQPNLRTTPFVLDPFAIRQVDAVLATHDHN DHIDVNVAAAVMQNCADDVPFIGPKTCVDLWIGWGVPKERCIVVKPGDVVKVKDIEIHAL DAFDRTALITLPADQKAAGVLPDGMDDRAVNYLFKTPGGSLYHSGDSHYSNYYAKHGNEH QIDVALGSYGENPRGITDKMTSADMLRMGEALNAKVVIPFHHDIWSNFQADPQEIRVLWE MKKDRLKYGFKPFIWQVGGKFTWPLDKDNFEYHYPRGFDDCFTIEPDLPFKSFL >gi|296494540|gb|ADTN01000198.1| GENE 40 40897 - 42294 1764 465 aa, chain + ## 
HITS:1 COG:sgaT KEGG:ns NR:ns ## COG: sgaT COG3037 # Protein_GI_number: 16132015 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 465 20 484 484 806 99.0 0 MEILYNIFTVFFNQVMTNAPLLLGIVTCLGYILLRKSVSVIIKGTIKTIIGFMLLQAGSG ILTSTFKPVVAKMSEVYGINGAISDTYASMMATIDRMGDAYSWVGYAVLLALALNICYVL LRRITGIRTIMLTGHIMFQQAGLIAVTLFIFGYSMWTTIICTAILVSLYWGITSNMMYKP TQEVTDGCGFSIGHQQQFASWIAYKVAPFLGKKEESVEDLKLPGWLNIFHDNIVSTAIVM TIFFGAILLSFGIDTVQAMAGKVNWTVYILQTGFSFAVAIFIITQGVRMFVAELSEAFNG ISQRLIPGAVLAIDCAAIYSFAPNAVVWGFMWGTIGQLIAVGILVACGSSILIIPGFIPM FFSNATIGVFANHFGGWRAALKICLVMGMIEIFGCVWAVKLTGMSAWMGMADWSILAPPM MQGFFSIGIAFMAVIIVIALAYMFFAGRALRAEEDAEKQLAEQSA >gi|296494540|gb|ADTN01000198.1| GENE 41 42310 - 42615 361 101 aa, chain + ## HITS:1 COG:ECs5170 KEGG:ns NR:ns ## COG: ECs5170 COG3414 # Protein_GI_number: 15834424 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIB component # Organism: Escherichia coli O157:H7 # 1 101 1 101 101 196 100.0 6e-51 MTVRILAVCGNGQGSSMIMKMKVDQFLTQSNIDHTVNSCAVGEYKSELSGADIIIASTHI AGEITVTGNKYVVGVRNMLSPADFGPKLLEVIKEHFPQDVK >gi|296494540|gb|ADTN01000198.1| GENE 42 42625 - 43089 640 154 aa, chain + ## HITS:1 COG:ECs5171 KEGG:ns NR:ns ## COG: ECs5171 COG1762 # Protein_GI_number: 15834425 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Escherichia coli O157:H7 # 1 154 1 154 154 287 96.0 5e-78 MKLRDSLAENKSIRLQAEAETWQDAVKIGVDLLVAADVVEPRYYQAILDAVEQHGPYFVL APGLAMPHGRPEEGVKKTGFALVTLKKPLEFNHEDNDPVDILITMAAVDANTHQEVGIMQ IVNLFEDEENFDRLRACRTEQEVLDLIDRTNAAA >gi|296494540|gb|ADTN01000198.1| GENE 43 43103 - 43753 917 216 aa, chain + ## HITS:1 COG:ECs5172 KEGG:ns NR:ns ## COG: ECs5172 COG0269 # Protein_GI_number: 15834426 # Func_class: G Carbohydrate transport and metabolism # Function: 3-hexulose-6-phosphate 
synthase and related proteins # Organism: Escherichia coli O157:H7 # 1 216 1 216 216 420 100.0 1e-117 MSLPMLQVALDNQTMDSAYETTRLIAEEVDIIEVGTILCVGEGVRAVRDLKALYPHKIVL ADAKIADAGKILSRMCFEANADWVTVICCADINTAKGALDVAKEFNGDVQIELTGYWTWE QAQQWRDAGIQQVVYHRSRDAQAAGVAWGEADITAIKRLSDMGFKVTVTGGLALEDLPLF KGIPIHVFIAGRSIRDAASPVEAARQFKRSIAELWG >gi|296494540|gb|ADTN01000198.1| GENE 44 43763 - 44617 1100 284 aa, chain + ## HITS:1 COG:ECs5173 KEGG:ns NR:ns ## COG: ECs5173 COG3623 # Protein_GI_number: 15834427 # Func_class: G Carbohydrate transport and metabolism # Function: Putative L-xylulose-5-phosphate 3-epimerase # Organism: Escherichia coli O157:H7 # 1 284 1 284 284 582 100.0 1e-166 MLSKQIPLGIYEKALPAGECWLERLQLAKTLGFDFVEMSVDETDERLSRLDWSREQRLAL VNAIVETGVRVPSMCLSAHRRFPLGSEDDAVRAQGLEIMRKAIQFAQDVGIRVIQLAGYD VYYQEANNETRRRFRDGLKESVEMASRAQVTLAMEIMDYPLMNSISKALGYAHYLNNPWF QLYPDIGNLSAWDNDVQMELQAGIGHIVAVHVKDTKPGVFKNVPFGEGVVDFERCFETLK QSGYCGPYLIEMWSETAEDPAAEVAKARDWVKARMAKAGMVEAA >gi|296494540|gb|ADTN01000198.1| GENE 45 44617 - 45303 847 228 aa, chain + ## HITS:1 COG:sgaE KEGG:ns NR:ns ## COG: sgaE COG0235 # Protein_GI_number: 16132020 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli K12 # 1 228 1 228 228 464 99.0 1e-131 MQKLKQQVFEANMELPRYGLVTFTWGNVSAIDRERGLVVIKPSGVAYETMKAADMVVVDM SGKVVEGEYRPSSDTATHLELYRRYPSLGGIVHTHSTHATAWAQAGLAIPALGTTHADYF FGDIPCTRGLSEEEVQGEYELNTGKVIIETLGNAEPLHTPGIVVYQHGPFAWGKDAHDAV HNAVVMEEVAKMAWIARSINPQLNHIDSFLMNKHFMRKHGPNAYYGQK >gi|296494540|gb|ADTN01000198.1| GENE 46 45432 - 45707 304 91 aa, chain - ## HITS:1 COG:no KEGG:LF82_3468 NR:ns ## KEGG: LF82_3468 # Name: yjfY # Def: UPF0379 protein YjfY # Organism: E.coli_LF82 # Pathway: not_defined # 1 91 1 91 91 141 100.0 6e-33 MFSRVLALLAVLLLSANTWAAIEINNHQARNMDDVQSLGVIYINHNFATESEARQALNEE TDAQGATYYHVILMREPGSNGNMHASADIYR >gi|296494540|gb|ADTN01000198.1| GENE 47 46035 - 46430 681 131 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|188495956|ref|ZP_03003226.1| ribosomal protein S6 [Escherichia coli 53638] # 1 131 1 131 131 266 100 1e-70 MRHYEIVFMVHPDQSEQVPGMIERYTAAITGAEGKIHRLEDWGRRQLAYPINKLHKAHYV LMNVEAPQEVIDELETTFRFNDAVIRSMVMRTKHAVTEASPMVKAKDERRERRDDFANET ADDADAGDSEE >gi|296494540|gb|ADTN01000198.1| GENE 48 46756 - 46983 385 75 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15834432|ref|NP_313205.1| 30S ribosomal protein S18 [Escherichia coli O157:H7 str. Sakai] # 1 75 1 75 75 152 100 3e-36 MARYFRRRKFCRFTAEGVQEIDYKDIATLKNYITESGKIVPSRITGTRAKYQRQLARAIK RARYLSLLPYTDRHQ >gi|296494540|gb|ADTN01000198.1| GENE 49 47025 - 47474 720 149 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15804792|ref|NP_290833.1| 50S ribosomal protein L9 [Escherichia coli O157:H7 EDL933] # 1 149 1 149 149 281 100 4e-75 MQVILLDKVANLGSLGDQVNVKAGYARNFLVPQGKAVPATKKNIEFFEARRAELEAKLAE VLAAANARAEKINALETVTIASKAGDEGKLFGSIGTRDIADAVTAAGVEVAKSEVRLPNG VLRTTGEHEVSFQVHSEVFAKVIVNVVAE Prediction of potential genes in microbial genomes Time: Sun May 15 23:48:56 2011 Seq name: gi|296494539|gb|ADTN01000199.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont416.5, whole genome shotgun sequence Length of sequence - 28812 bp Number of predicted genes - 25, with homology - 25 Number of transcription units - 17, operones - 2 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 56 - 86 0.1 1 1 Tu 1 9/0.222 - CDS 98 - 1069 679 ## PROTEIN SUPPORTED gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 - Term 1080 - 1110 3.0 2 2 Op 1 11/0.000 - CDS 1122 - 2423 1256 ## PROTEIN SUPPORTED gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 3 2 Op 2 . - CDS 2466 - 2942 268 ## PROTEIN SUPPORTED gi|90020580|ref|YP_526407.1| ribosomal protein S3 4 2 Op 3 . 
- CDS 2963 - 5044 1634 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 5 2 Op 4 3/0.556 - CDS 5041 - 6582 1703 ## COG4670 Acyl CoA:acetate/3-ketoacid CoA transferase 6 2 Op 5 1/0.889 - CDS 6592 - 7368 738 ## COG1024 Enoyl-CoA hydratase/carnithine racemase 7 2 Op 6 . - CDS 7378 - 8220 720 ## COG1082 Sugar phosphate isomerases/epimerases 8 2 Op 7 . - CDS 8230 - 9021 885 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) + Prom 9116 - 9175 4.1 9 3 Tu 1 . + CDS 9234 - 9881 529 ## COG1309 Transcriptional regulator + Term 9945 - 9986 2.1 10 4 Tu 1 . - CDS 9865 - 10503 450 ## COG3061 Cell envelope opacity-associated protein A - Prom 10528 - 10587 2.8 + Prom 10591 - 10650 4.9 11 5 Tu 1 . + CDS 10722 - 11342 964 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 + Prom 11403 - 11462 3.7 12 6 Tu 1 . + CDS 11651 - 13063 1447 ## COG1113 Gamma-aminobutyrate permease and related permeases + Term 13076 - 13113 8.2 13 7 Tu 1 . - CDS 13108 - 13770 874 ## COG2846 Regulator of cell morphogenesis and NO signaling - Prom 13793 - 13852 5.3 14 8 Tu 1 . - CDS 13878 - 14843 930 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 15 9 Tu 1 . - CDS 14952 - 15812 867 ## COG0702 Predicted nucleoside-diphosphate-sugar epimerases - Prom 15886 - 15945 3.3 + Prom 15811 - 15870 4.0 16 10 Tu 1 . + CDS 15901 - 16281 431 ## COG1733 Predicted transcriptional regulators + Term 16346 - 16384 4.2 - Term 16160 - 16199 3.2 17 11 Tu 1 . - CDS 16404 - 18347 2103 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases - Prom 18397 - 18456 5.1 + Prom 18441 - 18500 4.3 18 12 Tu 1 . + CDS 18537 - 19277 778 ## COG1218 3'-Phosphoadenosine 5'-phosphosulfate (PAPS) 3'-phosphatase + Term 19286 - 19323 5.1 19 13 Tu 1 . - CDS 19267 - 19824 659 ## COG3054 Predicted transcriptional regulator - Prom 19998 - 20057 5.4 + Prom 19938 - 19997 5.0 20 14 Tu 1 . 
+ CDS 20149 - 20355 250 ## G2583_5047 hypothetical protein + Term 20380 - 20412 6.3 - Term 20368 - 20400 6.3 21 15 Tu 1 . - CDS 20417 - 21760 1646 ## COG1253 Hemolysins and related proteins containing CBS domains - Prom 21864 - 21923 3.4 - Term 21925 - 21961 -1.0 22 16 Tu 1 . - CDS 22083 - 22721 705 ## COG0225 Peptide methionine sulfoxide reductase - Prom 22777 - 22836 3.3 + Prom 22736 - 22795 2.8 23 17 Op 1 16/0.000 + CDS 22927 - 24660 1666 ## COG0729 Outer membrane protein 24 17 Op 2 6/0.333 + CDS 24657 - 28436 4121 ## COG2911 Uncharacterized protein conserved in bacteria 25 17 Op 3 . + CDS 28439 - 28780 229 ## COG2105 Uncharacterized conserved protein Predicted protein(s) >gi|296494539|gb|ADTN01000199.1| GENE 1 98 - 1069 679 323 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 [Lentisphaera araneosa HTCC2155] # 24 316 48 339 346 266 43 1e-70 MKIKVSAGIIGAVLMLSASQSWAVTLKLSHNQDKSHPVHKAMEFFAKKSKEYSNGDITIR IYPNGTLGTQRETMELIRSGAIPLVKTNAAEMEAFENSYKLFSLPYLFRDRDHYYQVMQG DIGRKILDSTKSKGYFGLTFYDGGARSFYGNKPVLKPDDLKGMKVRVQPSPGAVEMIKVM GGNPTPLDYGELYTALQQGVVDMAENSVMALTTMRHGEVAKSFSLDEHTMVPDVVLMSNA AFDKLSQENQAVILKAAKESMSYMKDLWSEEEKQEFAKLDKMGVKVYQVDKAPFIEKVQP MYANFAKDNPALAPMLADIQAAK >gi|296494539|gb|ADTN01000199.1| GENE 2 1122 - 2423 1256 433 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 [Algoriphagus sp. 
PR1] # 5 432 4 430 431 488 54 1e-137 MDYWLPIIVLFGAFFFMLALGVPIVYAIGLSTLASISTQLDFNSALSVVSQKLASGLDSF TLLAIPFFILSGNIMNHGGIARRLINFARILGGRLPGSLAHCNILANMLFGAISGSAVAS AAAMGGVMHPQQVKEGYDPAFSTAVNVASAPTGLLIPPSNTLIVYSLVSGGTSIATLFLA GYVPGILLGLALMVIAGIIAVRRGYPKPERPTLRQAGVAIWMAIPSIFLIILIMGGVLSG IFTPTEASAIAVIYTLFLALVLYREISVKDLPKIFLESVITTAIVLLLIGSSMGMSWAMS NADVPFLILDLLNTISDNPIIILLIINIILLIIGTFMDMTPAVLIFTPIFLPVVTELGMD PIHFGIVMVLNMCIGICTPPVGSVLFVGCSVSKLPINKIIKPMLPFYAVMVLVLAMVTYI PQISMALPRALGY >gi|296494539|gb|ADTN01000199.1| GENE 3 2466 - 2942 268 158 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020580|ref|YP_526407.1| ribosomal protein S3 [Saccharophagus degradans 2-40] # 1 148 1 148 164 107 37 7e-23 MRNIIRSINALLAALNITILAIIVACVTWQVAARFIFTSPSIFTDELSRLLFICLGLFGG AYTAGQNRHLAIDLLPMMLKGKARRHLFLCIQIIVIIFATIIMVYGGGLLTMDTFDSGQT SPALGWQMGYIYMSIPISGVLIIIYTIDMVLTELKQPL >gi|296494539|gb|ADTN01000199.1| GENE 4 2963 - 5044 1634 693 aa, chain - ## HITS:1 COG:TM0427 KEGG:ns NR:ns ## COG: TM0427 COG1053 # Protein_GI_number: 15643193 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Thermotoga maritima # 29 689 5 664 664 360 35.0 7e-99 MSQLDDTILDALTHVTFPKGFAQAEPAWVVTVDGVDYPLWQTDALVVGSGAAGLRAAVEL KRRQQNVLIATAGLYMGTSACSGSDKQTLFTAATAGNGDNFTKLAEALASGGAMDHDTAY VEAVGSLHTLGGLQYLGLELPEDRYGAILRYQTDHDEAGRATSCGPRTSRLMVKVLLEEV QRLAIPVLTSATVIKLLHQRDENGEDRVAGAILATGHRAHNPWGLAIVTAPNVVLATGGP GELYRDSVYPHKCFGSLGLALEEGLMLTNLTESQFGIGTPRSTFPWNLSGTYVQVIPYIY SVDAEGNEYNFLADYYRTTQELASNIFRKGYQWPFHATRVMDFGSSLLDMAVAQEQQSGR QVFMDFNRNPEAVPGDLPFSLDRLDDDVRAYLENNDALAPSPIERLQRMNPLSISLYKMH GYDLTTQPLQFAMNNQHMNGGIEVDIWGQTSLPGCFAVGEVAGTHGVTRPGGAALNAGQV FAVRLARFIGCTQKRNIDGDIAQLVAPTLASIREIITQAHDNGTGMPLSVVREKIQARMS DHAGFICHADKVRRATRDALLLSEFVQRHGLAIKHVGEVAELFMWRHMALTSAAVLTQLT HYIDAGGGSRGARIILDRDGNSIPQTRNGFCDAWRFRSERTEDKKDKLLIHYCNGIFHVR ETPVREFPIIRGIWFEKNWPGFLNGTIYQPQDE >gi|296494539|gb|ADTN01000199.1| GENE 5 5041 - 6582 1703 513 aa, chain - ## HITS:1 
COG:BH3898 KEGG:ns NR:ns ## COG: BH3898 COG4670 # Protein_GI_number: 15616460 # Func_class: I Lipid transport and metabolism # Function: Acyl CoA:acetate/3-ketoacid CoA transferase # Organism: Bacillus halodurans # 3 507 4 507 525 416 46.0 1e-116 MRKITTAEALAAQIQDGATIAISGNGGGMVEADHILAAIEARFLQTGHPRDLTLIHSLGI GDRDCKGTNRFAHAEMLKRIIAGHFTWSPKMQALVKNNTIEAYCFPGGVIQALLREIGAG RPGLFTHVGLGSFVDPRNGGGKSNECTTDELVELIEIDGETKLRYRPFKVDYAILRGTYA DPRGNVSLEEEAIDMDSYSMALAAHNSGGKVFVQVRDVLEAGAIEPRRVKLPGILVDGIV EHREQPQTYLGGYDLTISGQHRRLSSNDAIELVSHPVRRLIARRAARELVAGASTNFGFG IPGGIPGVALREGVPYQSLWLSVEQGVHNGMMLDDAFFGCARNADAIIPSLDQFEFYSGG GIDITFLGMGEMDQYGNVNVSHLNGNLIGPGGFLEIAQNARKVVFCGTFDAKGSKIDITP DGLHIAQSGQIPKLVTKVEKITFSAAYAQQSGQEVLYITERAVFQLTAEGVELIEIAPGV EIERDILPYMAFRPIIKHPRLMESSLFTPMEDA >gi|296494539|gb|ADTN01000199.1| GENE 6 6592 - 7368 738 258 aa, chain - ## HITS:1 COG:AF2273_2 KEGG:ns NR:ns ## COG: AF2273_2 COG1024 # Protein_GI_number: 11499854 # Func_class: I Lipid transport and metabolism # Function: Enoyl-CoA hydratase/carnithine racemase # Organism: Archaeoglobus fulgidus # 16 255 17 259 262 162 34.0 5e-40 MSDQPVLFSRAAASCRLTLNREDKCHAINEEMIESLDHYLNEIENDTTLRLVELTATGDK FFCAGGDIKSWSAYSPLDMGRKWIKRGNDVFNRLRNLPQLTVANLNGHTIGGGIELALCC DIRIARPGAKFSNPEVMLGMVPGWMGIERVLNQVGPVVGRQMLMLGKRLTAQEAQAANLI DEVVEKEQVESWMANQLAQLEKCGPVALAHIKQLILALENKHADYPHQLLAGLMSATQDC QQATRAFAEKSSVSFHNQ >gi|296494539|gb|ADTN01000199.1| GENE 7 7378 - 8220 720 280 aa, chain - ## HITS:1 COG:SMc04130 KEGG:ns NR:ns ## COG: SMc04130 COG1082 # Protein_GI_number: 15963875 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Sinorhizobium meliloti # 11 280 6 274 274 271 50.0 7e-73 MRDLLQRPDLFSINTATLGYKTPLPAIIDACAARGIGAIAPWRRELQGEDLQQIAHQLAA SNMSVSGLCRSTYYTAPTLAERKLAIDDNRRALDDAAVLNAACYMQVVGGLPTGTKDLYE AREQVKQGIRQLLPHSKDVGVPIALEPLHPMTAADRSCLCTLRQALDWCDELDPDGEFGL GVAVDVYHVWWDPDLASQILRAGKRILAFHVSDWLVPTTDLVNDRGMPGDGVINIPSIRR LVENAGFNGAIELEIFSPYWWQKDINSTLDISVDRIAHYC 
>gi|296494539|gb|ADTN01000199.1| GENE 8 8230 - 9021 885 263 aa, chain - ## HITS:1 COG:mlr3057 KEGG:ns NR:ns ## COG: mlr3057 COG1028 # Protein_GI_number: 13472685 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Mesorhizobium loti # 4 263 2 253 253 206 43.0 3e-53 MKSTRPVAVITGAARGIGKGCALELARGGFNLLINDLPDADSVEKLHITQQECIAEGVEV ICFPADVGDLSLHEEMLDAAQNLWGRLDCLLNNAGISVKKRGDLLDLEPDSFDQNIAINT RAPFFLAQAFSKRLLAQPKPEAELPHRSIIFVSSINAIMLAMNRGEYTIAKTAVSAAARL FAARLCNEQIGVYEVRPGLIKTDMTIPATAYYDELIAKGLVPWGRWGYPADIASTVRAMA EGKLIYTCGQAVAIDGGLSMPRF >gi|296494539|gb|ADTN01000199.1| GENE 9 9234 - 9881 529 215 aa, chain + ## HITS:1 COG:ytfA KEGG:ns NR:ns ## COG: ytfA COG1309 # Protein_GI_number: 16132027 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 108 215 1 108 108 210 99.0 2e-54 MYLDTSNVQSLKEKLLLCAVNEFAEYGYEGARVDNIVKAAGCSKQTVYHHFGNKENLFIE VLEYTWNDIRQKEKALDFSDLPPQKAIEKIIDFTWDYYIANPWFLKIVHSENQSKGVHYA KSQRLLEINHAHLQLMESLLDEGKKHNIFKPDIDPLQVNINIAALGGYYLINQHTLGLVY HISMVSPQALEARRKVIKETILSWLLVDPSSTAHE >gi|296494539|gb|ADTN01000199.1| GENE 10 9865 - 10503 450 212 aa, chain - ## HITS:1 COG:ytfB KEGG:ns NR:ns ## COG: ytfB COG3061 # Protein_GI_number: 16132028 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell envelope opacity-associated protein A # Organism: Escherichia coli K12 # 1 212 13 224 224 363 99.0 1e-100 MPGRFELKPTLEKVWHAPDNFRFMDPLPPMHRRGVIIAAIILVVGFLLPSDDTPNAPVVT REAQLDIQSQSQPPTEEQLRAQLVTPQNDPDQVAPVAPEPIQEGQPEEQPQTTQTQPFQP DSGIDNQWRSYRVEPGKTMAQLFRDHGLPATDVYAMAQVEGAGKPLSNLQNGQMVKIRQN ASGVVTGLTIDTGNNQQVLFTRQPDGSFIRAR >gi|296494539|gb|ADTN01000199.1| GENE 11 10722 - 11342 964 206 aa, chain + ## HITS:1 COG:ECs5185 KEGG:ns NR:ns ## COG: ECs5185 COG0545 # Protein_GI_number: 15834439 # Func_class: O 
Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Escherichia coli O157:H7 # 1 206 54 259 259 395 99.0 1e-110 MTTPTFDTIEAQASYGIGLQVGQQLSESGLQGLLPEALVAGIADALEGKHPAVPVDVVHR ALREIHERADAVRRQRFQAMAAEGVKYLEENAKKEGVNSTESGLQFRVINQGEGAIPART DRVRVHYTGKLIDGTVFDSSVARGEPAEFPVNGVIPGWIEALTLMPVGSKWELTIPQELA YGERGAGASIPPFSTLVFEVELLEIL >gi|296494539|gb|ADTN01000199.1| GENE 12 11651 - 13063 1447 470 aa, chain + ## HITS:1 COG:ECs5186 KEGG:ns NR:ns ## COG: ECs5186 COG1113 # Protein_GI_number: 15834440 # Func_class: E Amino acid transport and metabolism # Function: Gamma-aminobutyrate permease and related permeases # Organism: Escherichia coli O157:H7 # 1 470 1 470 470 851 100.0 0 MVDQVKVVADDQAPAEQSLRRNLTNRHIQLIAIGGAIGTGLFMGSGKTISLAGPSIIFVY MIIGFMLFFVMRAMGELLLSNLEYKSFSDFASDLLGPWAGYFTGWTYWFCWVVTGMADVV AITAYAQFWFPDLSDWVASLAVIVLLLTLNLATVKMFGEMEFWFAMIKIVAIVSLIVVGL VMVAMHFQSPTGVEASFAHLWNDGGWFPKGLSGFFAGFQIAVFAFVGIELVGTTAAETKD PEKSLPRAINSIPIRIIMFYVFALIVIMSVTPWSSVVPEKSPFVELFVLVGLPAAASVIN FVVLTSAASSANSGVFSTSRMLFGLAQEGVAPKAFAKLSKRAVPAKGLTFSCICLLGGVV MLYVNPSVIGAFTMITTVSAILFMFVWTIILCSYLVYRKQRPHLHEKSIYKMPLGKLMCW VCMAFFVFVVVLLTLEDDTRQALLVTPLWFIALGLGWLFIGKKRAAELRK >gi|296494539|gb|ADTN01000199.1| GENE 13 13108 - 13770 874 220 aa, chain - ## HITS:1 COG:ytfE KEGG:ns NR:ns ## COG: ytfE COG2846 # Protein_GI_number: 16132031 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Regulator of cell morphogenesis and NO signaling # Organism: Escherichia coli K12 # 1 220 1 220 220 423 99.0 1e-118 MAYRDQPLGELALSIPRASALFRKYDMDYCCGGKQTLARAAARKELGVEVIEAELAKLAE QPIEKDWRSAPLAEIIDHIIVRYHDRHREQLPELILQATKVERVHADKPSVPKGLTKYLT MLHEELSSHMMKEEQILFPMIKQGMGSQAMGPISVMESEHDEAGELLEVIKHTTNNVTPP PEACTTWKAMYNGINELIDDLMDHISLENNVLFPRALAGE >gi|296494539|gb|ADTN01000199.1| GENE 14 13878 - 14843 930 321 aa, chain - ## HITS:1 COG:ECs5188 KEGG:ns NR:ns ## COG: ECs5188 COG0697 # Protein_GI_number: 15834442 # Func_class: G 
Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli O157:H7 # 1 321 4 324 324 556 98.0 1e-158 MISGVLYALLAGLMWGLIFVGPLIVPEYPAMLQSMGRYLALGLIALPIAWLGRVRLRQLA RRDWLTALMLTMMGNLIYYFCLASAIQRTGAPVSTMIIGTLPVVIPVFANLLYSQRDGKL AWGKLAPALICIGIGLACVNIAELNHGLPDFDWARYTSGIVLALVSVVCWAWYALRNARW LRENPDKHPMMWATAQALVTLPVSLIGYLVACYWLNIQTPDFSLPFGPRPLVFISLMVAI AVLCSWVGALCWNVASQRLPTVILGPLIVFETLAGLLYTFLIRQQMPPLMTLSGIALLVV GVVIAVRAKPEKPLTESVSES >gi|296494539|gb|ADTN01000199.1| GENE 15 14952 - 15812 867 286 aa, chain - ## HITS:1 COG:ECs5190 KEGG:ns NR:ns ## COG: ECs5190 COG0702 # Protein_GI_number: 15834443 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate-sugar epimerases # Organism: Escherichia coli O157:H7 # 1 285 1 285 286 469 98.0 1e-132 MIAITGATGQLGHYVIKSLMKTVPASQIVAIVRNPAKAQALTAQGITVRQADYGDEAALT SALQGVEKLLLISSSEVGQRAPQHRNVINAAKTAGVKFIAYTSLLHADKSPLGLADEHIE TEKMLADSGIVYTLLRNGWYTENYLASAPAALEHGVFIGAAGDGKIASATRADYAAAAAR VISEAGHEGKVYELAGDSAWTLTQLAAELTKQSGKQVTYQNLSETDFAAALKSVGLPDGL ADMLADSDVGASKGGLFDDSKTLSKLIGRPTTTLAESVSHLFNVNK >gi|296494539|gb|ADTN01000199.1| GENE 16 15901 - 16281 431 126 aa, chain + ## HITS:1 COG:ytfH KEGG:ns NR:ns ## COG: ytfH COG1733 # Protein_GI_number: 16132034 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 126 31 156 156 243 100.0 8e-65 MSQVSLSQQLKEGNLFAEQCPSREVLKHVTSRWGVLILVALREGTHRFSDLRRKIGGVSE KMLAQSLQALEQDGFLNRIAYPVVPPHVEYSLTPLGEQVSEKVAALADWIELNLPEVLAV RDERAA >gi|296494539|gb|ADTN01000199.1| GENE 17 16404 - 18347 2103 647 aa, chain - ## HITS:1 COG:cpdB KEGG:ns NR:ns ## COG: cpdB COG0737 # Protein_GI_number: 16132035 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases 
# Organism: Escherichia coli K12 # 1 647 1 647 647 1266 99.0 0 MIKFSATLLATLIAASVNAATVDLRIMETTDLHSNMMDFDYYKDTATEKFGLVRTASLIN DARNEVKNSVLVDNGDLIQGSPLADYISAKGLKAGDVHPVYKALNTLDYTVGTLGNHEFN YGLDYLKNALAGAKFPYVNANVIDARTKQPMFTPYLIKDTEVVDKDGKKQTLKIGYIGVV PPQIMGWDKANLSGKVTVNDITETVRKYVPEMREKGADVVVVLAHSGLSADPYKVMAENS VYYLSEIPGVNAIMFGHAHAVFPGKDFADIEGADIAKGTLNGVPAVMPGMWGDHLGVVDL QLSNNSGKWQVTQAKAEARPIYDIANKKSLAAEDSKLVETLKADHDATRQFVSKPIGKSA DNMYSYLALVQDDPTVQVVNNAQKAYVEHYIQGDPDLAKLPVLSAAAPFKVGGRKNDPAS YVEVEKGQLTFRNAADLYLYPNTLIVVKASGKEVKEWLECSAGQFNQIDPDNTKPQSLIN WDGFRTYNFDVIDGVNYQIDVTQPARYDGECQMVNANAERIKNLTFNGKPIDPNAMFLVA TNNYRAYGGKFAGTGDSHIAFASPDENRSVLAAWIADESKRAGEIHPAADNNWRLAPIAG DKKLDIRFETSPSDKAAAFIKEKGQYPMNKVATDDIGFAIYQVDLSK >gi|296494539|gb|ADTN01000199.1| GENE 18 18537 - 19277 778 246 aa, chain + ## HITS:1 COG:cysQ KEGG:ns NR:ns ## COG: cysQ COG1218 # Protein_GI_number: 16132036 # Func_class: P Inorganic ion transport and metabolism # Function: 3'-Phosphoadenosine 5'-phosphosulfate (PAPS) 3'-phosphatase # Organism: Escherichia coli K12 # 1 246 1 246 246 475 99.0 1e-134 MLDQVCQLARNAGDAIMQVYDGTKPMDVVSKADNSPVTAADIAAHTVIMDGLRTLTPDIP VLSEEDPPGWEVRQHWQRYWLVDPLDGTKEFIKRNGEFTVNIALIDHGKPILGVVYAPVM NIMYSAAEGKAWKEECGVRKQIQVRDARPPLVVISRSHADAELKEYLQQLGEHQTTSIGS SLKFCLVAEGQAQLYPRFGPTNIWDTAAGHAVAAAAGAHVHDWQGKPLDYTPRESFLNPG FRVSIY >gi|296494539|gb|ADTN01000199.1| GENE 19 19267 - 19824 659 185 aa, chain - ## HITS:1 COG:ECs5194 KEGG:ns NR:ns ## COG: ECs5194 COG3054 # Protein_GI_number: 15834448 # Func_class: R General function prediction only # Function: Predicted transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 183 1 183 184 353 98.0 1e-97 MTLRKILALTCLLLPMMASAHQFETGQRVPPIGITDRGELVLDKDQFSYKTWNSAQLVGK VRVLQHIAGRTSAKEKNATLIEAIKSAKLPHDRYQTTTIVNTDDAIPGSGMFVRSSLESN KKLYPWSQFIVDSNGVARGAWQLDEESSAVVVLDKDGRVQWAKDGALTQEEVQQVMDLLH KLINK >gi|296494539|gb|ADTN01000199.1| GENE 20 20149 - 20355 250 68 aa, chain + ## HITS:1 COG:no KEGG:G2583_5047 NR:ns ## KEGG: G2583_5047 # Name: ytfK # Def: 
hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 68 14 81 81 119 100.0 4e-26 MKIFQRYNPLQVAKYVKILFRGRLYIKDVGAFEFDKGKILIPKVKDKLHLSVMSEVNRQV MRLQTEMA >gi|296494539|gb|ADTN01000199.1| GENE 21 20417 - 21760 1646 447 aa, chain - ## HITS:1 COG:ECs5196 KEGG:ns NR:ns ## COG: ECs5196 COG1253 # Protein_GI_number: 15834450 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Escherichia coli O157:H7 # 1 447 1 447 447 876 100.0 0 MLNSILVILCLIAVSAFFSMSEISLAASRKIKLKLLADEGNINAQRVLNMQENPGMFFTV VQIGLNAVAILGGIVGDAAFSPAFHSLFSRYMSAELSEQLSFILSFSLVTGMFILFADLT PKRIGMIAPEAVALRIINPMRFCLYVCTPLVWFFNGLANIIFRIFKLPMVRKDDITSDDI YAVVEAGALAGVLRKQEHELIENVFELESRTVPSSMTPRENVIWFDLHEDEQSLKNKVAE HPHSKFLVCNEDIDHIIGYVDSKDLLNRVLANQSLALNSGVQIRNTLIVPDTLTLSEALE SFKTAGEDFAVIMNEYALVVGIITLNDVMTTLMGDLVGQGLEEQIVARDENSWLIDGGTP IDDVMRVLDIDEFPQSGNYETIGGFMMFMLRKIPKRTDSVKFAGYKFEVVDIDNYRIDQL LVTRIDSKATALSPKLPDAKDKEESVA >gi|296494539|gb|ADTN01000199.1| GENE 22 22083 - 22721 705 212 aa, chain - ## HITS:1 COG:msrA KEGG:ns NR:ns ## COG: msrA COG0225 # Protein_GI_number: 16132041 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptide methionine sulfoxide reductase # Organism: Escherichia coli K12 # 1 212 1 212 212 405 100.0 1e-113 MSLFDKKHLVSPADALPGRNTPMPVATLHAVNGHSMTNVPDGMEIAIFAMGCFWGVERLF WQLPGVYSTAAGYTGGYTPNPTYREVCSGDTGHAEAVRIVYDPSVISYEQLLQVFWENHD PAQGMRQGNDHGTQYRSAIYPLTPEQDAAARASLERFQAAMLAADDDRHITTEIANATPF YYAEDDHQQYLHKNPYGYCGIGGIGVCLPPEA >gi|296494539|gb|ADTN01000199.1| GENE 23 22927 - 24660 1666 577 aa, chain + ## HITS:1 COG:ECs5198 KEGG:ns NR:ns ## COG: ECs5198 COG0729 # Protein_GI_number: 15834452 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein # Organism: Escherichia coli O157:H7 # 1 577 1 577 577 1169 99.0 0 MRYIRQLCCVSLLCLSGSAVAANVRLQVEGLSGQLEKNVRAQLSTIESDEVTPDRRFRAR VDDAIREGLKALGYYQPTIEFDLRPPPKKGRQVLIAKVTPGVPVLIGGTDVVLRGGARTD 
KDYLKLLDTRPAIGTVLNQGDYENFKKSLTSIALRKGYFDSEFTKAQLGIALGLHKAFWD IDYNSGERYRFGHVTFEGSQIRDEYLQNLVPFKEGDEYESKDLAELNRRLSATGWFNSVV VAPQFDKARETKVLPLTGVVSPRTENTIETGVGYSTDVGPRVKATWKKPWMNSYGHSLTT STSISAPEQILDFSYKMPLLKNPLEQYYLVQGGFKRTDLNDTESDSTTLVASRYWDLSSG WQRAINLRWSLDHFTQGEITNTTMLFYPGVMISRTRSRGGLMPTWGDSQRYSIDYSNTAW GSDVDFSVFQAQNVWIRTLYDRHRFVTRGTLGWIETGDFDKVPPDLRFFAGGDRSIRGYK YKSIAPKYANGDLKGASKLITGSLEYQYNVTGKWWGAVFVDSGEAVSDIRRSDFKTGTGV GVRWESPVGPIKLDFAVPVADKDEHGLQFYIGLGPEL >gi|296494539|gb|ADTN01000199.1| GENE 24 24657 - 28436 4121 1259 aa, chain + ## HITS:1 COG:ECs5199 KEGG:ns NR:ns ## COG: ECs5199 COG2911 # Protein_GI_number: 15834453 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 1259 1 1259 1259 2434 99.0 0 MSLWKKISLGVVIVILLLLGSVAFLVGTTSGLHLVFKAADRWVPGLDVGKVTGGWRDLTL SDVRYEQPGVAVKAGNLHLAVGLECLWNSSVCINDLALKDIQVNIDSKKMPPSEQVEEEE DSGPLDLSTPYPITLTRVALDNVNIKIDDTTVSVMDFTSGLNWQEKTLTLKPTSLKGLLI ALPKVAEVAQEEVVEPKIENPQPEEKPLGETLKDLFSRPVLPEMTDVHLPLNLNIEEFKG EQLRVTGDTDITVRTMLLKVSSIDGNTKLDALDIDSSQGIVNASGTAQLSDNWPVDITLN STLNVEPLKGEKVKLKVGGALREQLEIGVNLSGPVDMDLRAQTRLAEAGLPLNVEVNSKQ IYWPFTGEKQYQADDLKLKLTGKMTDYTLSMRTAVKGLEIPPATITLDAKGNEQQVNLDK LTVAALEGKTELKALLDWQQAISWRGELTLNGINTAKEIPEWPSKLNGLIKTRGSLYGGT WQMEVPELKLTGNVKQNKVNVDGTLKGNSYMQWMIPGLHLELGPNSAEVKGELGVKDLNL DATINAPGLDNALPGLGGTAKGLVKVRGTVEAPQLLADITARGLRWQELSVAQVRVEGDI KSTDQIAGKLDVRVEQISQPDVNINLVTLNAKGSEKQHELQLRIQGEPVSGQLNLAGSFD RKEERWKGTLSNTRFQTPVGPWSLTRDIALDYRNKEQKISIGPHCWLNPNAELCVPQTID AGAEGRAVVNLNRFDLAMLKPFMPETTQASGIFTGKADVAWDTTKEGLPQGSITLSGRNV QVTQTVNDAALPVAFQTLNLTAELRNNRAELGWTIRLTNNGQFDGQVQVTDPQGRRNLGG NVNIRNFNLAMINPIFTRGEKAAGMVSANLRLGGDVQSPQLFGQLQVTGVDIDGNFMPFD MQPSQLAVNFNGMRSTLAGTVRTQQGEIYLNGDADWSQIENWRARVTAKGSKVRITVPPM VRMDVSPDVVFEATPNLFTLDGRVDVPWARIVVHDLPESAVGVSSDVVMLNDNLQPEEPK TASIPINSNLIVHVGNNVRIDAFGLKARLTGDLNVVQDKQGLGLNGQINIPEGRFHAYGQ DLIVRKGELLFSGPPDQPYLNIEAIRNPDATEDDVIAGVRVTGLADEPKAEIFSDPAMSQ 
QAALSYLLRGQGLESDQSDSAAMTSMLIGLGVAQSGQIVGKIGETFGVSNLALDTQGVGD SSQVVVSGYVLPGLQVKYGVGIFDSIATLTLRYRLMPKLYLEAVSGVDQALDLLYQFEF >gi|296494539|gb|ADTN01000199.1| GENE 25 28439 - 28780 229 113 aa, chain + ## HITS:1 COG:ECs5200 KEGG:ns NR:ns ## COG: ECs5200 COG2105 # Protein_GI_number: 15834454 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 113 1 113 113 226 100.0 1e-59 MRIFVYGSLRHKQGNSHWMTNAQLLGDFSIDNYQLYSLGHYPGAVPGNGTVHGEVYRIDN ATLAELDALRTRGGEYARQLIQTPYGSAWMYVYQRPVDGLKLIESGDWLDRDK Prediction of potential genes in microbial genomes Time: Sun May 15 23:49:01 2011 Seq name: gi|296494538|gb|ADTN01000200.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont416.6, whole genome shotgun sequence Length of sequence - 5813 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 352 - 882 728 ## COG0221 Inorganic pyrophosphatase - Prom 979 - 1038 4.8 + Prom 964 - 1023 4.3 2 2 Op 1 16/0.000 + CDS 1192 - 2148 1133 ## COG1879 ABC-type sugar transport system, periplasmic component + Prom 2200 - 2259 2.7 3 2 Op 2 21/0.000 + CDS 2288 - 3790 207 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 4 2 Op 3 11/0.000 + CDS 3804 - 4826 1107 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 5 2 Op 4 . 
+ CDS 4813 - 5808 1093 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components Predicted protein(s) >gi|296494538|gb|ADTN01000200.1| GENE 1 352 - 882 728 176 aa, chain - ## HITS:1 COG:ECs5204 KEGG:ns NR:ns ## COG: ECs5204 COG0221 # Protein_GI_number: 15834458 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase # Organism: Escherichia coli O157:H7 # 1 176 1 176 176 346 98.0 1e-95 MSLLNVPAGKDLPEDIYVVIEIPANADPIKYEIDKESGALFVDRFMSTAMFYPCNYGYIN HTLSLDGDPVDVLVPTPYPLQPGSVIRCRPVGVLKMTDEAGEDAKLIAVPHTKLSKEYDH IKDVNDLPELLKAQIAHFFEHYKDLEKGKWVKVEGWENAEAAKAEIVASFERAKNK >gi|296494538|gb|ADTN01000200.1| GENE 2 1192 - 2148 1133 318 aa, chain + ## HITS:1 COG:ECs5205 KEGG:ns NR:ns ## COG: ECs5205 COG1879 # Protein_GI_number: 15834459 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 1 318 1 318 318 571 99.0 1e-163 MWKRLLVVSAVSAAMSSMALAAPLTVGFSQVGSESGWRAAETNVAKSEAEKRGITLKIAD GQQKQENQIKAVRSFVAQGVDAIFIAPVVATGWEPVLKEAKDAEIPVFLLDRSIDVKDKS LYMTTVTADNILEGKLIGDWLVKEVNGKPCNVVELQGTVGASVAIDRKKGFAEAIKNAPN IKIIRSQSGDFTRSKGKEVMESFIKAENNGKNICMVYAHNDDMVIGAIQAIKEAGLKPGK DILTGSIDGVPDIYKAMIDGEANASVELTPNMAGPAFDALEKYKKDGTMPEKLTLTKSTL YLPDTAKEELEKKKNMGY >gi|296494538|gb|ADTN01000200.1| GENE 3 2288 - 3790 207 500 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 268 477 9 214 245 84 26 2e-16 MTTNQHQEILRTEGLSKFFPGVKALDNVDFSLRRGEIMALLGENGAGKSTLIKALTGVYH ADRGTIWLEGQAISPKNTAHAQQLGIGTVYQEVNLLPNMSVADNLFIGREPKRFGLLRRK EMEKRATELMASYGFSLDVREPLNRFSVAMQQIVAICRAIDLSAKVLILDEPTASLDTQE VELLFDLMRQLRDRGVSLIFVTHFLDQVYQVSDRITVLRNGSFVGCRETCELPQIELVKM MLGRELDTHALQRAGRTLLSDKPVAAFKNYGKKGTIAPFDLEVRPGEIVGLAGLLGSGRT ETAEVIFGIKPADSGTALIKGKPQNLRSPHQASVLGIGFCPEDRKTDGIIAAASVRENII LALQAQRGWLRPISRKEQQEIAERFIRQLGIRTPSTEQPIEFLSGGNQQKVLLSRWLLTR 
PQFLILDEPTRGIDVGAHAEIIRLIETLCADGLALLVISSELEELVGYADRVIIMRDRKQ VAEIPLAELSVPAIMNAIAA >gi|296494538|gb|ADTN01000200.1| GENE 4 3804 - 4826 1107 340 aa, chain + ## HITS:1 COG:ECs5207 KEGG:ns NR:ns ## COG: ECs5207 COG1172 # Protein_GI_number: 15834461 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Escherichia coli O157:H7 # 1 340 2 341 341 495 99.0 1e-140 MPQSLPDTTPPKRRFRWPTGMPQLAALLLVLLVDSLVAPHFWQVVLQDGRLFGSPIDILN RAAPVALLAIGMTLVIATGGIDLSVGAVMAIAGATTAAMTVAGFSLPIVLLSALGTGILA GLWNGILVAILKIQPFVATLILMVAGRGVAQLITSGQIVTFNSPDLSWFGSGSLLFLPTP VIIAVLTLILFWLLTRKTALGMFIEAVGINIRAAKNAGVNTRIIVMLTYVLSGLCAAIAG IIVAADIRGADANNAGLWLELDAILAVVIGGGSLMGGRFNLLLSVVGALIIQGMNTGILL SGFPPEMNQVVKAVVVLCVLIVQSQRFISLIKGVRSRDKT >gi|296494538|gb|ADTN01000200.1| GENE 5 4813 - 5808 1093 331 aa, chain + ## HITS:1 COG:yjfF KEGG:ns NR:ns ## COG: yjfF COG1172 # Protein_GI_number: 16132053 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Escherichia coli K12 # 9 331 1 323 323 520 99.0 1e-147 MIKRNLPLMITIGVFVLGYLYCLTQFPGFASTRVICNILTDNAFLGIIAVGMTFVILSGG IDLSVGSVIAFTGVFLAKVIGDFGLSPLLAFPLVLVMGCAFGAFMGLLIDALKIPAFIIT LAGMFFLRGVSYLVSEESIPINHPIYDTLSSLAWKIPGGGRLSAMGLLMLAVVVIGIFLA HRTRFGNQVYAIGGNATSANLMGISTRSTTIRIYMLSTGLATLAGIVFSIYTQAGYALAG VGVELDAIASVVIGGTLLSGGVGTVLGTLFGVAIQGLIQTYINFDGTLSSWWTKIAIGIL LFIFIALQRGLTVLWENRQSSPVTRVNIAQR Prediction of potential genes in microbial genomes Time: Sun May 15 23:49:07 2011 Seq name: gi|296494537|gb|ADTN01000201.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont416.7, whole genome shotgun sequence Length of sequence - 17736 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 11, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 23 - 1021 1199 ## COG0158 Fructose-1,6-bisphosphatase - Prom 1077 - 1136 2.2 + Prom 969 - 1028 4.9 2 2 Tu 1 . + CDS 1197 - 2570 1715 ## COG0773 UDP-N-acetylmuramate-alanine ligase + Term 2618 - 2669 2.1 3 3 Tu 1 . - CDS 2726 - 3277 761 ## COG3028 Uncharacterized protein conserved in bacteria - Prom 3297 - 3356 7.1 + Prom 3268 - 3327 3.1 4 4 Op 1 2/0.833 + CDS 3371 - 4723 1510 ## COG0312 Predicted Zn-dependent proteases and their inactivated homologs + Term 4729 - 4761 2.6 5 4 Op 2 . + CDS 4778 - 5164 551 ## COG3783 Soluble cytochrome b562 + Term 5173 - 5216 11.7 - Term 4987 - 5028 -0.5 6 5 Tu 1 12/0.167 - CDS 5209 - 5673 463 ## COG0602 Organic radical activating enzymes - Prom 5737 - 5796 4.3 7 6 Tu 1 3/0.667 - CDS 5831 - 7969 2620 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase - Prom 8148 - 8207 6.5 - Term 8313 - 8350 4.4 8 7 Op 1 9/0.167 - CDS 8363 - 10018 1417 ## COG0366 Glycosidases 9 7 Op 2 7/0.167 - CDS 10068 - 11486 1560 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific - Prom 11538 - 11597 4.4 - Term 11542 - 11568 -1.0 10 8 Tu 1 . - CDS 11608 - 12555 785 ## COG1609 Transcriptional regulators - Prom 12589 - 12648 7.1 11 9 Tu 1 . + CDS 12934 - 15630 2717 ## COG0474 Cation transport ATPase + Term 15753 - 15789 2.4 12 10 Tu 1 6/0.333 - CDS 15836 - 16222 570 ## COG0251 Putative translation initiation inhibitor, yjgF family - Term 16237 - 16276 5.0 13 11 Op 1 19/0.000 - CDS 16295 - 16756 505 ## COG1781 Aspartate carbamoyltransferase, regulatory subunit 14 11 Op 2 . 
- CDS 16769 - 17704 996 ## COG0540 Aspartate carbamoyltransferase, catalytic chain Predicted protein(s) >gi|296494537|gb|ADTN01000201.1| GENE 1 23 - 1021 1199 332 aa, chain - ## HITS:1 COG:fbp KEGG:ns NR:ns ## COG: fbp COG0158 # Protein_GI_number: 16132054 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1,6-bisphosphatase # Organism: Escherichia coli K12 # 1 332 1 332 332 685 100.0 0 MKTLGEFIVEKQHEFSHATGELTALLSAIKLGAKIIHRDINKAGLVDILGASGAENVQGE VQQKLDLFANEKLKAALKARDIVAGIASEEEDEIVVFEGCEHAKYVVLMDPLDGSSNIDV NVSVGTIFSIYRRVTPVGTPVTEEDFLQPGNKQVAAGYVVYGSSTMLVYTTGCGVHAFTY DPSLGVFCLCQERMRFPEKGKTYSINEGNYIKFPNGVKKYIKFCQEEDKSTNRPYTSRYI GSLVADFHRNLLKGGIYLYPSTASHPDGKLRLLYECNPMAFLAEQAGGKASDGKERILDI IPETLHQRRSFFVGNDHMVEDVERFIREFPDA >gi|296494537|gb|ADTN01000201.1| GENE 2 1197 - 2570 1715 457 aa, chain + ## HITS:1 COG:ZyjfG KEGG:ns NR:ns ## COG: ZyjfG COG0773 # Protein_GI_number: 15804823 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate-alanine ligase # Organism: Escherichia coli O157:H7 EDL933 # 1 457 1 457 457 953 100.0 0 MRIHILGICGTFMGGLAMLARQLGHEVTGSDANVYPPMSTLLEKQGIELIQGYDASQLDP QPDLVIIGNAMTRGNPCVEAVLEKNIPYMSGPQWLHDFVLRDRWVLAVAGTHGKTTTAGM ATWILEQCGYKPGFVIGGVPGNFEVSARLGESDFFVIEADEYDCAFFDKRSKFVHYCPRT LILNNLEFDHADIFDDLKAIQKQFHHLVRIVPGQGRIIWPENDINLKQTMAMGCWSEQEL VGEQGHWQAKKLTTDASEWEVLLDGEKVGEVKWSLVGEHNMHNGLMAIAAARHVGVAPAD AANALGSFINARRRLELRGEANGVTVYDDFAHHPTAILATLAALRGKVGGTARIIAVLEP RSNTMKMGICKDDLAPSLGRADEVFLLQPAHIPWQVAEVAEACVQPAHWSGDVDTLADMV VKTAQPGDHILVMSNGGFGGIHQKLLDGLAKKAEAAQ >gi|296494537|gb|ADTN01000201.1| GENE 3 2726 - 3277 761 183 aa, chain - ## HITS:1 COG:ECs5211 KEGG:ns NR:ns ## COG: ECs5211 COG3028 # Protein_GI_number: 15834465 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 183 1 183 183 271 100.0 7e-73 MTKQPEDWLDDVPGDDIEDEDDEIIWVSKSEIKRDAEELKRLGAEIVDLGKNALDKIPLD ADLRAAIELAQRIKMEGRRRQLQLIGKMLRQRDVEPIRQALDKLKNRHNQQVVLFHKLEN 
LRDRLIDQGDDAIAEVLNLWPDADRQQLRTLIRNAKKEKEGNKPPKSARQIFQYLRELAE NEG >gi|296494537|gb|ADTN01000201.1| GENE 4 3371 - 4723 1510 450 aa, chain + ## HITS:1 COG:ECs5212 KEGG:ns NR:ns ## COG: ECs5212 COG0312 # Protein_GI_number: 15834466 # Func_class: R General function prediction only # Function: Predicted Zn-dependent proteases and their inactivated homologs # Organism: Escherichia coli O157:H7 # 1 450 1 450 450 883 99.0 0 MALAMKVISQVEAQRKILEEAVSTALELASGKSDGAEVAVSKTTGISVSTRYGEVENVEF NSDGALGITVYHQNRKGSASSTDLSPQAIARTVQAALDIARYTSPDPCAGVADKELLAFE APDLDLFHPAEVSPDEAIELAARAEQAALQADKRITNTEGGSFNSHYGVKVFGNSHGMLQ GYCSTRHSLSSCVIAEENGDMERDYAYTIGRAMSDLQTPEWVGADCARRTLSRLSPRKLS TMKAPVIFANEVATGLFGHLVGAIAGGAVYRKSTFLLDSLGKQILPDWLTIEEHPHLLKG LASTPFDSEGVRTERRDIIKDGILTQWLLTSYSARKLGLKSTGHAGGIHNWRIAGQGLSF EQMLKEMGTGLVVTELMGQGVSAITGDYSRGAAGFWVENGEIQYPVSEITIAGNLKDMWR NIVTVGNDIETRSNIQCGSVLLPEMKIAGQ >gi|296494537|gb|ADTN01000201.1| GENE 5 4778 - 5164 551 128 aa, chain + ## HITS:1 COG:STM4439 KEGG:ns NR:ns ## COG: STM4439 COG3783 # Protein_GI_number: 16767685 # Func_class: C Energy production and conversion # Function: Soluble cytochrome b562 # Organism: Salmonella typhimurium LT2 # 1 128 1 128 128 174 85.0 4e-44 MRKSLLAILAVSSLVFSSASFAADLEDNMETLNDTLKVVEKADNAAQVKDALTKMRAAAL DAQKATPPKLEDKSPDSPEMKDFRHGFDILVGQIDDALKLANEGKVKEAQAAAEQLKTTR NAYHQKYR >gi|296494537|gb|ADTN01000201.1| GENE 6 5209 - 5673 463 154 aa, chain - ## HITS:1 COG:ECs5214 KEGG:ns NR:ns ## COG: ECs5214 COG0602 # Protein_GI_number: 15834468 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Escherichia coli O157:H7 # 1 154 1 154 154 319 98.0 1e-87 MNYHQYYPVDIVNGPGTRCTLFVSGCVHECPGCYNKSTWRVNSGQPFTKAMEDQIINDLN DTRIKRQGISLSGGDPLHPQNVPDILKLVKRIRAECPDKDIWVWTGYKLDDLNAAQMQVV DLINVLVDGKFVQDLKDPSLIWRGSSNQVVHHLR >gi|296494537|gb|ADTN01000201.1| GENE 7 5831 - 7969 2620 712 aa, chain - ## HITS:1 COG:nrdD KEGG:ns NR:ns ## COG: nrdD COG1328 # Protein_GI_number: 16132060 # 
Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Escherichia coli K12 # 1 712 1 712 712 1505 99.0 0 MTPHVMKRDGCKVPFKSERIKEAILRAAKAAEVDDADYCATVAAVVSEQMQGRNQVDINE IQTAVENQLMSGPYKQLARAYIEYRHDRDIEREKRGRLNQEIRGLVEQTNASLLNENANK DSKVIPTQRDLLAGIVAKHYARQHLLPRDVVQAHERGDIHYHDLDYSPFFPMFNCMLIDL KGMLTQGFKMGNAEIEPPKSISTATAVTAQIIAQVASHIYGGTTINRIDEVLAPFVTASY NKHRKTAEEWNIPDAEGYANSRTIKECYDAFQSLEYEVNTLHTANGQTPFVTFGFGLGTS WESRLIQESILRNRIAGLGKNRKTAVFPKLVFAIRDGLNHKKGDPNYDIKQLALECASKR MYPDILNYDQVVKVTGSFKTPMGCRSFLGVWENENGEQIHDGRNNLGVISLNLPRIALEA KGDEATFWKLLDERLVLARKALMTRIARLEGVKARVAPILYMEGACGVRLNADDDVSEIF KNGRASISLGYIGIHETINALFGGEHVYDNEQLRAKGIAIVERLRQAVDKWKEETGYGFS LYSTPSENLCDRFCRLDTAEFGVVPGVTDKGYYTNSFHLDVEKKVNPYDKIDFEAPYPPL ANGGFICYGEYPNIQHNLKALEDVWDYSYQHVPYYGTNTPIDECYECGFTGEFECTSKGF TCPKCGNHDASRVSVTRRVCGYLGSPDARPFNAGKQEEVKRRVKHLGNGQIG >gi|296494537|gb|ADTN01000201.1| GENE 8 8363 - 10018 1417 551 aa, chain - ## HITS:1 COG:ECs5216 KEGG:ns NR:ns ## COG: ECs5216 COG0366 # Protein_GI_number: 15834470 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Escherichia coli O157:H7 # 1 551 1 551 551 1130 98.0 0 MTNLPHWWQNGVIYQIYPKSFQDTTGSGTGDLRGVIQRLDYLHKLGVDAIWLTPFYISPQ VDNGYDVANYTAIDPIYGTLDDFDELVTQAKSRGIRIILDMVFNHTSTQHAWFREALNKE SPYRQFYIWRDGEPETPPNNWRSKFGGSAWRWHAESEQYYLHLFAPEQADLNWENPAVRA ELKKVCEFWADRGVDGLRLDVVNLISKDPRFPDDLDGDGRRFYTDGPRAHEFLHEMNRDV FTPRGLMTVGEMSSTSLEHCQRYAALTGSELSMTFNFHHLKVDYPGGEKWTLAKPDFVAL KTLFRHWQQGMHNVAWNALFWCNHDQPRIVSRFGDEGEYRVPAAKMLAMVLHGMQGTPYI YQGEEIGMTNPHFTRITDYRDVESLNMFAELRNDGRDADELLAILASKSRDNSRTPMQWS NGDNAGFTAGEPWIGLGDNYQQINVEAALADDSSVFYTYQKLIALRKQEAILTWGNYQDL LPNSPVLWCYRREWKGQTLLVIANLSRGIQPWQPGQMRGNWQLVMHNYEEASPQPCAMNL RPFEAVWWLQK >gi|296494537|gb|ADTN01000201.1| GENE 9 10068 - 11486 1560 472 aa, chain - ## HITS:1 COG:ECs5217_2 KEGG:ns NR:ns ## COG: ECs5217_2 COG1263 # Protein_GI_number: 15834471 # Func_class: G Carbohydrate transport and metabolism # 
Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Escherichia coli O157:H7 # 92 472 1 381 381 689 99.0 0 MSKINQTDIDRLIELVGGRGNIATVSHCITRLRFVLNQPANARPKEIEQLPMVKGCFTNA GQFQVVIGTNVGDYYQALIASTGQAQVDKEQVKKAARQNMKWHEQLISHFAEIFFPLLPA LISGGLILGFRNVIGDLPMSNGQTLAQMYPSLQTIYDFLWLIGEAIFFYLPVGICWSAVK KMGGTPILGIVLGVTLVSPQLMNAYLLGQQLPEVWDFGMFSIAKVGYQAQVIPALLAGLA LGVIETRLKRIVPDYLYLVVVPVCSLILAVFLAHALIGPFGRMIGDGVAFAVRHLMTGSF APIGAALFGFLYAPLVITGVHQTTLAIDLQMIQSMGGTPVWPLIALSNIAQGSAVIGIII SSRKHNEREISVPAAISAWLGVTEPAMYGINLKYRFPMLCAMIGSGLAGLLCGLNGVMAN GIGVGGLPGILSIQPSYWQVFALAMVIAIIIPIVLTSFIYQRKYRLGTLDIV >gi|296494537|gb|ADTN01000201.1| GENE 10 11608 - 12555 785 315 aa, chain - ## HITS:1 COG:treR KEGG:ns NR:ns ## COG: treR COG1609 # Protein_GI_number: 16132063 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 315 1 315 315 619 99.0 1e-177 MQNRLTIKDIARLSGVGKSTVSRVLNNESGVSQRTRERVEAVMNQHGFSPSRSARAMRGQ SDKVVAIIVTRLDSLSENLAVQTMLPAFYEQGYDPIMMESQFSPQLVAEHLGVLKRRNID GVVLFGFTGITEEMLAHWQSSLVLLARDAKGFASVCYDDEGAIKILMQRLYDQGHRNISY LGVPHSDVTTGKRRHEAYLAFCKAHKLHPVAALPGLAMKQGYENVAKVITPETTALLCAT DTLALGASKYLQEQRIDTLQLASVGNTPLMKFLHPEIVTVDPGYAEAGRQAACQLIAQVT GRSEPQQIIIPATLS >gi|296494537|gb|ADTN01000201.1| GENE 11 12934 - 15630 2717 898 aa, chain + ## HITS:1 COG:ECs5219 KEGG:ns NR:ns ## COG: ECs5219 COG0474 # Protein_GI_number: 15834473 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Escherichia coli O157:H7 # 1 898 1 898 898 1813 99.0 0 MFKEIFTRLIRHLPSRLVHRDPLPGAQQTVNTAVPPSLSAHCLKMAVMPEEELWKTFDTH PEGLNQAEVESAREQHGENKLPAQQPSPWWVHLWVCYRNPFNILLTILGAISYATEDLFA AGVIALMVAISTLLNFIQEARSTKAADALKAMVSNTATVLRVINDKGENGWLEIPIDQLV PGDIIKLAAGDMIPADLRILQARDLFVAQASLTGESLPVEKAATTRRPEHSNPLECDTLC FMGTTVVSGTAQAMVIATGANTWFGQLAGRVSEQESEPNAFQQGISRVSMLLIRFMLVMA PVVLLINGYTKGDWWEAALFALSVAVGLTPEMLPMIVTSTLARGAVKLSKQKVIVKHLDA 
IQNFGAMDILCTDKTGTLTQDKIVLENHTDISGKTSERVLHSAWLNSHYQTGLKNLLDTA VLEGTDEESARSLASRWQKIDEIPFDFERRRMSVVVAENTEHHQLVCKGALQEILNVCSQ VRHNGEIVPLDDIMLRKIKRVTDTLNRQGLRVVAVATKYLPAREGDYQRADESDLILEGY IAFLDPPKETTAPALKALKASGITVKILTGDSELVAAKVCHEVGLDAGEVVIGSDIETLS DDELANLAQRTTLFARLTPMHKERIVTLLKREGHVVGFMGDGINDAPALRAADIGISVDG AVDIAREAADIILLEKSLMVLEEGVIEGRRTFANMLKYIKMTASSNFGNVFSVLVASAFL PFLPMLPLHLLIQNLLYDVSQVAIPFDNVDDEQIQKPQRWNPADLGRFMIFFGPISSIFD ILTFCLMWWVFHANTPETQTLFQSGWFVVGLLSQTLIVHMIRTRRVPFIQSCASWPLMIM TVIVMIVGIALPFSPLASYLQLQALPLSYFPWLVAILAGYMTLTQLVKGFYSRRYGWQ >gi|296494537|gb|ADTN01000201.1| GENE 12 15836 - 16222 570 128 aa, chain - ## HITS:1 COG:ECs5220 KEGG:ns NR:ns ## COG: ECs5220 COG0251 # Protein_GI_number: 15834474 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Escherichia coli O157:H7 # 1 128 14 141 141 225 98.0 1e-59 MSKTIATENAPAAIGPYVQGVDLGSMIITSGQIPVNPKTGEVPTDVAAQARQSLDNVKAI VEAAGLKVGDIVKTTVFVKDLNDFATVNATYEAFFTEHNATFPARSCVEVARLPKDVKIE IEAIAVRR >gi|296494537|gb|ADTN01000201.1| GENE 13 16295 - 16756 505 153 aa, chain - ## HITS:1 COG:ECs5221 KEGG:ns NR:ns ## COG: ECs5221 COG1781 # Protein_GI_number: 15834475 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, regulatory subunit # Organism: Escherichia coli O157:H7 # 1 153 1 153 153 304 100.0 5e-83 MTHDNKLQVEAIKRGTVIDHIPAQIGFKLLSLFKLTETDQRITIGLNLPSGEMGRKDLIK IENTFLSEDQVDQLALYAPQATVNRIDNYEVVGKSRPSLPERIDNVLVCPNSNCISHAEP VSSSFAVRKRANDIALKCKYCEKEFSHNVVLAN >gi|296494537|gb|ADTN01000201.1| GENE 14 16769 - 17704 996 311 aa, chain - ## HITS:1 COG:ECs5222 KEGG:ns NR:ns ## COG: ECs5222 COG0540 # Protein_GI_number: 15834476 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, catalytic chain # Organism: Escherichia coli O157:H7 # 1 311 1 311 311 613 100.0 1e-176 MANPLYQKHIISINDLSRDDLNLVLATAAKLKANPQPELLKHKVIASCFFEASTRTRLSF 
ETSMHRLGASVVGFSDSANTSLGKKGETLADTISVISTYVDAIVMRHPQEGAARLATEFS GNVPVLNAGDGSNQHPTQTLLDLFTIQETQGRLDNLHVAMVGDLKYGRTVHSLTQALAKF DGNRFYFIAPDALAMPQYILDMLDEKGIAWSLHSSIEEVMAEVDILYMTRVQKERLDPSE YANVKAQFVLRASDLHNAKANMKVLHPLPRVDEIATDVDKTPHAWYFQQAGNGIFARQAL LALVLNRDLVL Prediction of potential genes in microbial genomes Time: Sun May 15 23:49:09 2011 Seq name: gi|296494536|gb|ADTN01000202.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont416.8, whole genome shotgun sequence Length of sequence - 2982 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 370 - 765 211 ## COG0251 Putative translation initiation inhibitor, yjgF family - Prom 833 - 892 2.0 - Term 838 - 877 2.1 2 2 Tu 1 . - CDS 896 - 1609 274 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 - Prom 1633 - 1692 5.5 + Prom 1595 - 1654 5.0 3 3 Tu 1 . + CDS 1680 - 2273 273 ## COG1309 Transcriptional regulator + Prom 2333 - 2392 4.6 4 4 Tu 1 . 
+ CDS 2418 - 2870 462 ## COG2731 Beta-galactosidase, beta subunit + Term 2879 - 2918 4.3 Predicted protein(s) >gi|296494536|gb|ADTN01000202.1| GENE 1 370 - 765 211 131 aa, chain - ## HITS:1 COG:yjgH KEGG:ns NR:ns ## COG: yjgH COG0251 # Protein_GI_number: 16132070 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Escherichia coli K12 # 1 131 1 131 131 265 100.0 1e-71 MVERTAVFPAGRHSLYAEHRYSAAIRSGDLLFVSGQVGSREDGTPEPDFQQQVRLAFDNL HATLAAAGCTFDDIIDVTSFHTDPENQFEDIMTVKNEIFSAPPYPNWTAVGVTWLAGFDF EIKVIARIPEQ >gi|296494536|gb|ADTN01000202.1| GENE 2 896 - 1609 274 237 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 7 234 4 239 242 110 35 2e-24 MGAFTGKTVLILGGSRGIGAAIVRRFVTDGANVRFTYAGSKDAAEHLAQETGATAVFTDS ADRDAVIDVVRKSGALDILVVNAGIGVFGDALELNADDIDRLFKINIHAPYHASVEAARQ MPEGGRILIIGSVNGDRMPVAGMAAYAASKSALQGMARGLARDFGPRGITINVVQPGPID TDANPANGPMRDMLHSLMAIKRHGQPEEVAGMVAWLAGPEASFVTGAMHTIDGAFGA >gi|296494536|gb|ADTN01000202.1| GENE 3 1680 - 2273 273 197 aa, chain + ## HITS:1 COG:STM1674 KEGG:ns NR:ns ## COG: STM1674 COG1309 # Protein_GI_number: 16765017 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Salmonella typhimurium LT2 # 2 197 1 196 196 217 54.0 1e-56 MVTKKQSRVPGRPRRFAPEQAVSAAKVLFHQKGFDAVSVAEVTDYLGINPPSLYAAFGSK AGLFSRVLNEYVGTEAIPLADILRDDRPVGECLVEVLKEAARRYSQNGGCAGCMVLEGIH SHDPQARDIAVQYYHAAETTIYDYIARRHPQSAQCVTDFMSTVMSGLSAKAREGHSIEQL CATAALAGGAIKTILKE >gi|296494536|gb|ADTN01000202.1| GENE 4 2418 - 2870 462 150 aa, chain + ## HITS:1 COG:ECs5229 KEGG:ns NR:ns ## COG: ECs5229 COG2731 # Protein_GI_number: 15834483 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Escherichia coli O157:H7 # 1 150 4 153 153 302 98.0 2e-82 MIIGNIHNLQPWLPQELRQAIEHIKAHVTPETPKGKHDIEGNRLFYLISEDMTEPYEARR 
AEYHARYLDIQIVLRGQEGMTFSTQPAGTPDTDWLADKDIAFLPEGVDEKTVILNEGDFV VFYPGEVHKPLCAVGAPAQVRKAVVKMLMA Prediction of potential genes in microbial genomes Time: Sun May 15 23:49:10 2011 Seq name: gi|296494535|gb|ADTN01000203.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont424.1, whole genome shotgun sequence Length of sequence - 2903 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 368 - 427 3.4 1 1 Tu 1 . + CDS 513 - 1349 149 ## EcolC_2594 hypothetical protein + Term 1493 - 1535 1.8 + Prom 1390 - 1449 5.5 2 2 Tu 1 . + CDS 1642 - 2883 1427 ## JW0987 glucose-1-phosphatase/inositol phosphatase Predicted protein(s) >gi|296494535|gb|ADTN01000203.1| GENE 1 513 - 1349 149 278 aa, chain + ## HITS:1 COG:no KEGG:EcolC_2594 NR:ns ## KEGG: EcolC_2594 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 278 141 418 418 533 99.0 1e-150 MEYIENINACDDVFSEYCFDDENISVQPERINTPGISDLDSDIDLSGISFIQRETNQALG LKYAPVDGDGYCLLRAILVLKQHDYSWALVSYKMQKEVYNEFIKMVDKKTIEALVDTAFY NLREDVKTLFGVDLQSDNQIQGQSSLMSWSFLFFKKQFIDSCLNNEKCILHLPEFIFNDN KNLLALDTDTSDRIKAVKNFLAVLSDSICSLFIVNSNVASISLGNESFSTDEDLECGYLM NTGNHYDVYLPPELFAQAYKLNNKEMNAQLDYLNRYAI >gi|296494535|gb|ADTN01000203.1| GENE 2 1642 - 2883 1427 413 aa, chain + ## HITS:1 COG:no KEGG:JW0987 NR:ns ## KEGG: JW0987 # Name: agp # Def: glucose-1-phosphatase/inositol phosphatase # Organism: E.coli_J # Pathway: Glycolysis / Gluconeogenesis [PATH:ecj00010] # 1 413 1 413 413 834 100.0 0 MNKTLIAAAVAGIVLLASNAQAQTVPEGYQLQQVLMMSRHNLRAPLANNGSVLEQSTPNK WPEWDVPGGQLTTKGGVLEVYMGHYMREWLAEQGMVKSGECPPPYTVYAYANSLQRTVAT AQFFITGAFPGCDIPVHHQEKMGTMDPTFNPVITDDSAAFSEQAVAAMEKELSKLQLTDS YQLLEKIVNYKDSPACKEKQQCSLVDGKNTFSAKYQQEPGVSGPLKVGNSLVDAFTLQYY EGFPMDQVAWGEIKSDQQWKVLSKLKNGYQDSLFTSPEVARNVAKPLVSYIDKALVTDRT SAPKITVLVGHDSNIASLLTALDFKPYQLHDQNERTPIGGKIVFQRWHDSKANRDLMKIE YVYQSAEQLRNADALTLQAPAQRVTLELSGCPIDADGFCPMDKFDSVLNEAVK 
Prediction of potential genes in microbial genomes Time: Sun May 15 23:49:19 2011 Seq name: gi|296494534|gb|ADTN01000204.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont424.2, whole genome shotgun sequence Length of sequence - 1588 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 9 - 236 370 ## LF82_2690 uncharacterized protein YccJ 2 1 Op 2 . - CDS 257 - 853 679 ## COG0655 Multimeric flavodoxin WrbA - Prom 1023 - 1082 3.7 + Prom 1002 - 1061 4.7 3 2 Tu 1 . + CDS 1226 - 1399 75 ## COG3729 General stress protein + Term 1455 - 1484 1.9 Predicted protein(s) >gi|296494534|gb|ADTN01000204.1| GENE 1 9 - 236 370 75 aa, chain - ## HITS:1 COG:no KEGG:LF82_2690 NR:ns ## KEGG: LF82_2690 # Name: yccJ # Def: uncharacterized protein YccJ # Organism: E.coli_LF82 # Pathway: not_defined # 1 75 1 75 75 125 100.0 6e-28 MPTQEAKAHHVGEWASLRNTSPEIAEAIFEVAGYDEKMAEKIWEEGSDEVLVKAFAKTDK DSLFWGEQTIERKNV >gi|296494534|gb|ADTN01000204.1| GENE 2 257 - 853 679 198 aa, chain - ## HITS:1 COG:wrbA KEGG:ns NR:ns ## COG: wrbA COG0655 # Protein_GI_number: 16128970 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Escherichia coli K12 # 1 198 1 198 198 349 100.0 2e-96 MAKVLVLYYSMYGHIETMARAVAEGASKVDGAEVVVKRVPETMPPQLFEKAGGKTQTAPV ATPQELADYDAIIFGTPTRFGNMSGQMRTFLDQTGGLWASGALYGKLASVFSSTGTGGGQ EQTITSTWTTLAHHGMVIVPIGYAAQELFDVSQVRGGTPYGATTIAGGDGSRQPSQEELS IARYQGEYVAGLAVKLNG >gi|296494534|gb|ADTN01000204.1| GENE 3 1226 - 1399 75 57 aa, chain + ## HITS:1 COG:STM1121 KEGG:ns NR:ns ## COG: STM1121 COG3729 # Protein_GI_number: 16764478 # Func_class: R General function prediction only # Function: General stress protein # Organism: Salmonella typhimurium LT2 # 1 55 1 55 55 59 92.0 1e-09 MANHRGGSGNFAEDRERASEAGKKGGQHSGGNFKNDPQRASEAGKKGGKSSHGKSDN Prediction of potential genes in microbial genomes Time: Sun May 15 23:49:26 2011 Seq 
name: gi|296494533|gb|ADTN01000205.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont424.3, whole genome shotgun sequence Length of sequence - 12455 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 5, operones - 1 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/1.000 - CDS 51 - 1379 723 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 2 1 Op 2 5/0.000 - CDS 1400 - 1894 313 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family 3 1 Op 3 4/0.000 - CDS 1905 - 2495 542 ## COG0778 Nitroreductase 4 1 Op 4 5/0.000 - CDS 2505 - 3305 712 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 5 1 Op 5 2/0.500 - CDS 3313 - 3699 394 ## COG0251 Putative translation initiation inhibitor, yjgF family 6 1 Op 6 4/0.000 - CDS 3711 - 4403 640 ## COG1335 Amidases related to nicotinamidase 7 1 Op 7 . - CDS 4403 - 5494 1059 ## COG2141 Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases - Prom 5635 - 5694 3.6 + Prom 5579 - 5638 5.1 8 2 Tu 1 . + CDS 5782 - 6420 727 ## COG1309 Transcriptional regulator + Term 6424 - 6467 2.0 - Term 6412 - 6452 6.2 9 3 Tu 1 . - CDS 6460 - 10422 4497 ## COG4230 Delta 1-pyrroline-5-carboxylate dehydrogenase - Prom 10536 - 10595 7.9 + Prom 10262 - 10321 2.3 10 4 Tu 1 . + CDS 10477 - 10686 81 ## EcSMS35_2110 hypothetical protein + Prom 10692 - 10751 2.9 11 5 Tu 1 . 
+ CDS 10845 - 12353 1778 ## COG0591 Na+/proline symporter Predicted protein(s) >gi|296494533|gb|ADTN01000205.1| GENE 1 51 - 1379 723 442 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 24 428 8 421 447 283 37 6e-76 MAMFGFPHWQLKSTSTESGVVAPDERLPFAQTAVMGVQHAVAMFGATVLMPILMGLDPNL SILMSGIGTLLFFFITGGRVPSYLGSSAAFVGVVIAATGFNGQGINPNISIALGGIIACG LVYTVIGLVVMKIGTRWIERLMPPVVTGAVVMAIGLNLAPIAVKSVSASAFDSWMAVMTV LCIGLVAVFTRGMIQRLLILVGLIVACLLYGVMTNVLGLGKAVDFTLVSHAAWFGLPHFS TPAFNGQAMMLIAPVAVILVAENLGHLKAVAGMTGRNMDPYMGRAFVGDGLATMLSGSVG GSGVTTYAENIGVMAVTKVYSTLVFVAAAVIAMLLGFSPKFGALIHTIPAAVIGGASIVV FGLIAVAGARIWVQNRVDLSQNGNLIMVAVTLVLGAGDFALTLGGFTLGGIGTATFGAIL LNALLSRKLVDVPPPEVVHQEP >gi|296494533|gb|ADTN01000205.1| GENE 2 1400 - 1894 313 164 aa, chain - ## HITS:1 COG:ycdH KEGG:ns NR:ns ## COG: ycdH COG1853 # Protein_GI_number: 16128973 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Escherichia coli K12 # 13 164 1 152 152 308 100.0 2e-84 MNIVDQQTFRDAMSCMGAAVNIITTDGPAGRAGFTASAVCSVTDTPPTLLVCLNRGASVW PAFNENRTLCVNTLSAGQEPLSNLFGGKTPMEHRFAAARWQTGVTGCPQLEEALVSFDCR ISQVVSVGTHDILFCAIEAIHRHTTPYGLVWFDRSYHALMRPAC >gi|296494533|gb|ADTN01000205.1| GENE 3 1905 - 2495 542 196 aa, chain - ## HITS:1 COG:ycdI KEGG:ns NR:ns ## COG: ycdI COG0778 # Protein_GI_number: 16128974 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Escherichia coli K12 # 1 196 1 196 196 387 100.0 1e-108 MNEAVSPGALSTLFTDARTHNGWRETPVSDETLREIYALMKWGPTSANCSPARIVFTRTA EGKERLRPALSSGNLQKTLTAPVTAIVAWDSEFYERLPLLFPHGDARSWFTSSPQLAEET AFRNSSMQAAYLIVACRALGLDTGPMSGFDRQHVDDAFFTGSTLKSNLLINIGYGDSSKL YARLPRLSFEEACGLL >gi|296494533|gb|ADTN01000205.1| GENE 4 2505 - 3305 712 266 aa, chain - ## HITS:1 COG:ycdJ KEGG:ns NR:ns ## COG: ycdJ COG0596 # Protein_GI_number: 16128975 # Func_class: R General function prediction only # 
Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Escherichia coli K12 # 1 266 1 266 266 503 100.0 1e-142 MKLSLSPPPYADAPVVVLISGLGGSGSYWLPQLAVLEQEYQVVCYDQRGTGNNPDTLAED YSIAQMAAELHQALVAAGIEHYAVVGHALGALVGMQLALDYPASVTVLISVNGWLRINAH TRRCFQVRERLLYSGGAQAWVEAQPLFLYPADWMAARAPRLEAEDALALAHFQGKNNLLR RLNALKRADFSHHADRIRCPVQIICASDDLLVPTACSSELHAALPDSQKMVMPYGGHACN VTDPETFNALLLNGLASLLHHREAAL >gi|296494533|gb|ADTN01000205.1| GENE 5 3313 - 3699 394 128 aa, chain - ## HITS:1 COG:ycdK KEGG:ns NR:ns ## COG: ycdK COG0251 # Protein_GI_number: 16128976 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Escherichia coli K12 # 1 128 1 128 128 249 99.0 8e-67 MPKSVIIPAGSSAPLAPFVPGTLTDGVVYVSGTLAFDQHNNVLFADDPKAQTRHVLETIR KVIETAGGTMADVTFNSIFITDWKNYAAINEIYAEFFPGDKPARFCIQCGLVKPDALVEI ATIAHIAK >gi|296494533|gb|ADTN01000205.1| GENE 6 3711 - 4403 640 230 aa, chain - ## HITS:1 COG:ycdL KEGG:ns NR:ns ## COG: ycdL COG1335 # Protein_GI_number: 16128977 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Escherichia coli K12 # 1 230 15 244 244 469 100.0 1e-132 MTTLTARPEAITFDPQQSALIVVDMQNAYATPGGYLDLAGFDVSTTRPVIANIQTAVTAA RAAGMLIIWFQNGWDEQYVEAGGPGSPNFHKSNALKTMRKQPQLQGKLLAKGSWDYQLVD ELVPQPGDIVLPKPRYSGFFNTPLDSILRSRGIRHLVFTGIATNVCVESTLRDGFFLEYF GVVLEDATHQAGPKFAQKAALFNIETFFGWVSDVETFCDALSPTSFAHIA >gi|296494533|gb|ADTN01000205.1| GENE 7 4403 - 5494 1059 363 aa, chain - ## HITS:1 COG:ycdM KEGG:ns NR:ns ## COG: ycdM COG2141 # Protein_GI_number: 16128978 # Func_class: C Energy production and conversion # Function: Coenzyme F420-dependent N5,N10-methylene tetrahydromethanopterin reductase and related flavin-dependent oxidoreductases # Organism: Escherichia coli K12 # 1 363 20 382 382 745 100.0 0 MKIGVFVPIGNNGWLISTHAPQYMPTFELNKAIVQKAEHYHFDFALSMIKLRGFGGKTEF 
WDHNLESFTLMAGLAAVTSRIQIYATAATLTLPPAIVARMAATIDSISGGRFGVNLVTGW QKPEYEQMGIWPGDDYFSRRYDYLTEYVQVLRDLWGTGKSDFKGDFFTMNDCRVSPQPSV PMKVICAGQSDAGMAFSARYADFNFCFGKGVNTPTAFAPTAARMKQAAEQTGRDVGSYVL FMVIADETDDAARAKWEHYKAGADEEALSWLTEQSQKDTRSGTDTNVRQMADPTSAVNIN MGTLVGSYASVARMLDEVASVPGAEGVLLTFDDFLSGIETFGERIQPLMQCRAHLPALTQ EVA >gi|296494533|gb|ADTN01000205.1| GENE 8 5782 - 6420 727 212 aa, chain + ## HITS:1 COG:ycdC KEGG:ns NR:ns ## COG: ycdC COG1309 # Protein_GI_number: 16128979 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 212 1 212 212 412 100.0 1e-115 MTQGAVKTTGKRSRAVSAKKKAILSAALDTFSQFGFHGTRLEQIAELAGVSKTNLLYYFP SKEALYIAVLRQILDIWLAPLKAFREDFAPLAAIKEYIRLKLEVSRDYPQASRLFCMEML AGAPLLMDELTGDLKALIDEKSALIAGWVKSGKLAPIDPQHLIFMIWASTQHYADFAPQV EAVTGATLRDEVFFNQTVENVQRIIIEGIRPR >gi|296494533|gb|ADTN01000205.1| GENE 9 6460 - 10422 4497 1320 aa, chain - ## HITS:1 COG:putA_2 KEGG:ns NR:ns ## COG: putA_2 COG4230 # Protein_GI_number: 16128980 # Func_class: C Energy production and conversion # Function: Delta 1-pyrroline-5-carboxylate dehydrogenase # Organism: Escherichia coli K12 # 526 1320 1 795 795 1540 100.0 0 MGTTTMGVKLDDATRERIKSAATRIDRTPHWLIKQAIFSYLEQLENSDTLPELPALLSGA ANESDEAPTPAEEPHQPFLDFAEQILPQSVSRAAITAAYRRPETEAVSMLLEQARLPQPV AEQAHKLAYQLADKLRNQKNASGRAGMVQGLLQEFSLSSQEGVALMCLAEALLRIPDKAT RDALIRDKISNGNWQSHIGRSPSLFVNAATWGLLFTGKLVSTHNEASLSRSLNRIIGKSG EPLIRKGVDMAMRLMGEQFVTGETIAEALANARKLEEKGFRYSYDMLGEAALTAADAQAY MVSYQQAIHAIGKASNGRGIYEGPGISIKLSALHPRYSRAQYDRVMEELYPRLKSLTLLA RQYDIGINIDAEESDRLEISLDLLEKLCFEPELAGWNGIGFVIQAYQKRCPLVIDYLIDL ATRSRRRLMIRLVKGAYWDSEIKRAQMDGLEGYPVYTRKVYTDVSYLACAKKLLAVPNLI YPQFATHNAHTLAAIYQLAGQNYYPGQYEFQCLHGMGEPLYEQVTGKVADGKLNRPCRIY APVGTHETLLAYLVRRLLENGANTSFVNRIADTSLPLDELVADPVTAVEKLAQQEGQTGL PHPKIPLPRDLYGHGRDNSAGLDLANEHRLASLSSALLNSALQKWQALPMLEQPVAAGEM SPVINPAEPKDIVGYVREATPREVEQALESAVNNAPIWFATPPAERAAILHRAAVLMESQ MQQLIGILVREAGKTFSNAIAEVREAVDFLHYYAGQVRDDFANETHRPLGPVVCISPWNF 
PLAIFTGQIAAALAAGNSVLAKPAEQTPLIAAQGIAILLEAGVPPGVVQLLPGRGETVGA QLTGDDRVRGVMFTGSTEVATLLQRNIASRLDAQGRPIPLIAETGGMNAMIVDSSALTEQ VVVDVLASAFDSAGQRCSALRVLCLQDEIADHTLKMLRGAMAECRMGNPGRLTTDIGPVI DSEAKANIERHIQTMRSKGRPVFQAVRENSEDAREWQSGTFVAPTLIELDDFAELQKEVF GPVLHVVRYNRNQLPELIEQINASGYGLTLGVHTRIDETIAQVTGSAHVGNLYVNRNMVG AVVGVQPFGGEGLSGTGPKAGGPLYLYRLLANRPESALAVTLARQDAKYPVDAQLKAALT QPLNALREWAANRPELQALCTQYGELAQAGTQRLLPGPTGERNTWTLLPRERVLCIADDE QDALTQLAAVLAVGSQVLWPDDALHRQLVKALPSAVSERIQLAKAENITAQPFDAVIFHG DSDQLRALCEAVAARDGTIVSVQGFARGESNILLERLYIERSLSVNTAAAGGNASLMTIG >gi|296494533|gb|ADTN01000205.1| GENE 10 10477 - 10686 81 69 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_2110 NR:ns ## KEGG: EcSMS35_2110 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 69 1 69 69 101 100.0 1e-20 MMLQLCATMLNVTCVASLKMNEMLIKEIDMTGIKKITQTFSLRQLTFLKGATAKNVRECN LMKNSVAEH >gi|296494533|gb|ADTN01000205.1| GENE 11 10845 - 12353 1778 502 aa, chain + ## HITS:1 COG:putP KEGG:ns NR:ns ## COG: putP COG0591 # Protein_GI_number: 16128981 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Escherichia coli K12 # 1 502 1 502 502 891 100.0 0 MAISTPMLVTFCVYIFGMILIGFIAWRSTKNFDDYILGGRSLGPFVTALSAGASDMSGWL LMGLPGAVFLSGISESWIAIGLTLGAWINWKLVAGRLRVHTEYNNNALTLPDYFTGRFED KSRILRIISALVILLFFTIYCASGIVAGARLFESTFGMSYETALWAGAAATILYTFIGGF LAVSWTDTVQASLMIFALILTPVIVIISVGGFGDSLEVIKQKSIENVDMLKGLNFVAIIS LMGWGLGYFGQPHILARFMAADSHHSIVHARRISMTWMILCLAGAVAVGFFGIAYFNDHP ALAGAVNQNAERVFIELAQILFNPWIAGILLSAILAAVMSTLSCQLLVCSSAITEDLYKA FLRKHASQKELVWVGRVMVLVVALVAIALAANPENRVLGLVSYAWAGFGAAFGPVVLFSV MWSRMTRNGALAGMIIGALTVIVWKQFGWLGLYEIIPGFIFGSIGIVVFSLLGKAPSAAM QKRFAEADAHYHSAPPSRLQES Prediction of potential genes in microbial genomes Time: Sun May 15 23:49:31 2011 Seq name: gi|296494532|gb|ADTN01000206.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont424.4, whole genome shotgun sequence Length of sequence - 13395 bp Number of predicted 
genes - 9, with homology - 9 Number of transcription units - 4, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 292 - 351 4.5 1 1 Op 1 7/0.000 + CDS 447 - 1277 855 ## COG0672 High-affinity Fe2+/Pb2+ permease 2 1 Op 2 9/0.000 + CDS 1335 - 2462 1661 ## COG2822 Predicted periplasmic lipoprotein involved in iron transport 3 1 Op 3 3/0.000 + CDS 2468 - 3739 1122 ## COG2837 Predicted iron-dependent peroxidase 4 1 Op 4 . + CDS 4360 - 5148 586 ## COG1702 Phosphate starvation-inducible protein PhoH, predicted ATPase + Term 5163 - 5199 6.3 - Term 5086 - 5133 1.5 5 2 Tu 1 . - CDS 5198 - 5500 155 ## ECB_01023 predicted inner membrane protein - Prom 5520 - 5579 3.6 - Term 5547 - 5581 1.2 6 3 Op 1 6/0.000 - CDS 5613 - 6818 964 ## COG1215 Glycosyltransferases, probably involved in cell wall biogenesis - Prom 6860 - 6919 3.3 - Term 6860 - 6897 3.2 7 3 Op 2 . - CDS 6931 - 8949 962 ## COG0726 Predicted xylanase/chitin deacetylase 8 3 Op 3 . - CDS 8958 - 11381 1378 ## B21_01033 hypothetical protein - Prom 11418 - 11477 2.9 + Prom 12787 - 12846 6.1 9 4 Tu 1 . 
+ CDS 12910 - 13326 227 ## COG2199 FOG: GGDEF domain Predicted protein(s) >gi|296494532|gb|ADTN01000206.1| GENE 1 447 - 1277 855 276 aa, chain + ## HITS:1 COG:ECs1263 KEGG:ns NR:ns ## COG: ECs1263 COG0672 # Protein_GI_number: 15830517 # Func_class: P Inorganic ion transport and metabolism # Function: High-affinity Fe2+/Pb2+ permease # Organism: Escherichia coli O157:H7 # 1 276 1 276 276 455 100.0 1e-128 MFVPFLIMLREGLEAALIVSLIASYLKRTQRGRWIGVMWIGVLLAAALCLGLGIFINETT GEFPQKEQELFEGIVAVIAVVILTWMVFWMRKVSRNVKVQLEQAVDSALQRGNHHGWALV MMVFFAVAREGLESVFFLLAAFQQDVGIWPPLGAMLGLATAVVLGFLLYWGGIRLNLGAF FKWTSLFILFVAAGLAAGAIRAFHEAGLWNHFQEIAFDMSAVLSTHSLFGTLMEGIFGYQ EAPSVSEVAVWFIYLIPALVAFALPPRAGATASRSA >gi|296494532|gb|ADTN01000206.1| GENE 2 1335 - 2462 1661 375 aa, chain + ## HITS:1 COG:ycdO KEGG:ns NR:ns ## COG: ycdO COG2822 # Protein_GI_number: 16128982 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted periplasmic lipoprotein involved in iron transport # Organism: Escherichia coli K12 # 1 375 1 375 375 704 99.0 0 MTINFRRNALQLSVAALFSSAFMANAADVPQVKVTVTDKQCEPMTITVNAGKTQFIIQNH SQKALEWEILKGVMVVEERENIAPGFSQKMTANLQPGEYDMTCGLLTNPKGKLIVKGEAT ADAAQSDALLSLGGAITAYKAYVMAETTQLVTDTKAFTDAIKAGDIEKAKALYAPTRQHY ERIEPIAELFSDLDGSIDAREDDYEQKAADPKFTGFHRLEKALFGDNTTKGMDQYAEQLY TDVVDLQKRISELAFPPSKVVGGATGLIEEVAASKISGEEDRYSHTDLWDFQANVEGSQK IVDLLRPQLQKANPELLAKVDANFKKVDTILAKYRTKDGFETYDKLTDADRNALKGPITA LAEDLAQLRGVLGLD >gi|296494532|gb|ADTN01000206.1| GENE 3 2468 - 3739 1122 423 aa, chain + ## HITS:1 COG:ycdB KEGG:ns NR:ns ## COG: ycdB COG2837 # Protein_GI_number: 16128983 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted iron-dependent peroxidase # Organism: Escherichia coli K12 # 1 423 1 423 423 866 100.0 0 MQYKDENGVNEPSRRRLLKVIGALALAGSCPVAHAQKTQSAPGTLSPDARNEKQPFYGEH QAGILTPQQAAMMLVAFDVLASDKADLERLFRLLTQRFAFLTQGGAAPETPNPRLPPLDS GILGGYIAPDNLTITLSVGHSLFDERFGLAPQMPKKLQKMTRFPNDSLDAALCHGDVLLQ ICANTQDTVIHALRDIIKHTPDLLSVRWKREGFISDHAARSKGKETPINLLGFKDGTANP 
DSQNDKLMQKVVWVTADQQEPAWTIGGSYQAVRLIQFRVEFWDRTPLKEQQTIFGRDKQT GAPLGMQHEHDVPDYASDPEGKVIALDSHIRLANPRTAESESSLMLRRGYSYSLGVTNSG QLDMGLLFVCYQHDLEKGFLTVQKRLNGEALEEYVKPIGGGYFFALPGVKDANDYFGSAL LRV >gi|296494532|gb|ADTN01000206.1| GENE 4 4360 - 5148 586 262 aa, chain + ## HITS:1 COG:ECs1266 KEGG:ns NR:ns ## COG: ECs1266 COG1702 # Protein_GI_number: 15830520 # Func_class: T Signal transduction mechanisms # Function: Phosphate starvation-inducible protein PhoH, predicted ATPase # Organism: Escherichia coli O157:H7 # 1 262 93 354 354 523 100.0 1e-148 MGRQKAVIKARREAKRVLRRDSRSHKQREEESVTSLVQMGGVEAIGMARDSRDTSPILAR NEAQLHYLKAIESKQLIFATGEAGCGKTWISAAKAAEALIHKDVDRIIVTRPVLQADEDL GFLPGDIAEKFAPYFRPVYDVLVRRLGASFMQYCLRPEIGKVEIAPFAYMRGRTFENAVV ILDEAQNVTAAQMKMFLTRLGENVTVIVNGDITQCDLPRGVCSGLSDALERFEEDEMVGI VRFGKEDCVRSALCQRTLHAYS >gi|296494532|gb|ADTN01000206.1| GENE 5 5198 - 5500 155 100 aa, chain - ## HITS:1 COG:no KEGG:ECB_01023 NR:ns ## KEGG: ECB_01023 # Name: pgaD # Def: predicted inner membrane protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 100 38 137 137 171 100.0 7e-42 MDLLTGYYWQSEARSRLQFYFLLAVANAVVLIVWALYNKLRFQKQQHHAAYQYTPQEYAE SLAIPDELYQQLQKSHRMSVHFTSQGQIKMVVSEKALVRA >gi|296494532|gb|ADTN01000206.1| GENE 6 5613 - 6818 964 401 aa, chain - ## HITS:1 COG:ycdQ KEGG:ns NR:ns ## COG: ycdQ COG1215 # Protein_GI_number: 16128986 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases, probably involved in cell wall biogenesis # Organism: Escherichia coli K12 # 1 401 41 441 441 806 100.0 0 MSIMWIVGGVYFWVYRERHWPWGENAPAPQLKDNPSISIIIPCFNEEKNVEETIHAALAQ RYENIEVIAVNDGSTDKTRAILDRMAAQIPHLRVIHLAQNQGKAIALKTGAAAAKSEYLV CIDGDALLDRDAAAYIVEPMLYNPRVGAVTGNPRIRTRSTLVGKIQVGEYSSIIGLIKRT QRIYGNVFTVSGVIAAFRRSALAEVGYWSDDMITEDIDISWKLQLNQWTIFYEPRALCWI LMPETLKGLWKQRLRWAQGGAEVFLKNMTRLWRKENFRMWPLFFEYCLTTIWAFTCLVGF IIYAVQLAGVPLNIELTHIAATHTAGILLCTLCLLQFIVSLMIENRYEHNLTSSLFWIIW FPVIFWMLSLATTLVSFTRVMLMPKKQRARWVSPDRGILRG >gi|296494532|gb|ADTN01000206.1| GENE 7 6931 - 8949 962 672 
aa, chain - ## HITS:1 COG:ycdR KEGG:ns NR:ns ## COG: ycdR COG0726 # Protein_GI_number: 16128987 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Escherichia coli K12 # 1 672 1 672 672 1344 100.0 0 MLRNGNKYLLMLVSIIMLTACISQSRTSFIPPQDRESLLAEQPWPHNGFVAISWHNVEDE AADQRFMSVRTSALREQFAWLRENGYQPVSIAQIREAHRGGKPLPEKAVVLTFDDGYQSF YTRVFPILQAFQWPAVWAPVGSWVDTPADKQVKFGDELVDREYFATWQQVREVARSRLVE LASHTWNSHYGIQANATGSLLPVYVNRAYFTDHARYETAAEYRERIRLDAVKMTEYLRTK VEVNPHVFVWPYGEANGIAIEELKKLGYDMFFTLESGLANASQLDSIPRVLIANNPSLKE FAQQIITVQEKSPQRIMHIDLDYVYDENLQQMDRNIDVLIQRVKDMQISTVYLQAFADPD GDGLVKEVWFPNRLLPMKADIFSRVAWQLRTRSGVNIYAWMPVLSWDLDPTLTRVKYLPT GEKKAQIHPEQYHRLSPFDDRVRAQVGMLYEDLAGHAAFDGILFHDDALLSDYEDASAPA ITAYQQAGFSGSLSEIRQNPEQFKQWARFKSRALTDFTLELSARVKAIRGPHIKTARNIF ALPVIQPESEAWFAQNYADFLKSYDWTAIMAMPYLEGVAEKSADQWLIQLTNQIKNIPQA KDKSILELQAQNWQKNGQHQAISSQQLAHWMSLLQLNGVKNYGYYPDNFLHNQPEIDLIR PEFSTAWYPKND >gi|296494532|gb|ADTN01000206.1| GENE 8 8958 - 11381 1378 807 aa, chain - ## HITS:1 COG:no KEGG:B21_01033 NR:ns ## KEGG: B21_01033 # Name: ycdS # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 807 1 807 807 1563 100.0 0 MYSSSRKRCPKTKWALKLLTAAFLAASPAAKSAVNNAYDALIIEARKGNTQPALSWFALK SALSNNQIADWLQIALWAGQDKQVITVYNRYRHQQLPARGYAAVAVAYRNLQQWQNSLTL WQKALSLEPQNKDYQRGQILTLADAGHYDTALVKLKQLNSGAPDKANLLAEAYIYKLAGR HQDELRAMTESLPENASTQQYPTEYVQALRNNQLAAAIDDANLTPDIRADIHAELVRLSF MPTRSESERYAIADRALAQYAALEILWHDNPDRTAQYQRIQVDHLGALLTRDRYKDVISH YQRLKKTGQIIPPWGQYWVASAYLKDHQPKKAQSIMTELFYHKETIAPDLSDEELADLFY SHLESENYPGALTVTQHTINTSPPFLRLMGTPTSIPNDTWLQGHSFLSTVAKYSNDLPQA EMTARELAYNAPGNQGLRIDYASVLQARGWPRAAENELKKAEVIEPRNINLEVEQAWTAL TLQEWQQAAVLTHDVVEREPQDPGVVRLKRAVDVHNLAELRIAGSTGIDAEGPDSGKHDV DLTTIVYSPPLKDNWRGFAGFGYADGQFSEGKGIVRDWLAGVEWRSRNIWLEAEYAERVF NHEHKPGARLSGWYDFNDNWRIGSQLERLSHRVPLRAMKNGVTGNSAQAYVRWYQNERRK YGVSWAFTDFSDSNQRHEVSLEGQERIWSSPYLIVDFLPSLYYEQNTEHDTPYYNPIKTF DIVPAFEASHLLWRSYENSWEQIFSAGVGASWQKHYGTDVVTQLGYGQRISWNDVIDAGA 
TLRWEKRPYDGDREHNLYVEFDMTFRF >gi|296494532|gb|ADTN01000206.1| GENE 9 12910 - 13326 227 138 aa, chain + ## HITS:1 COG:ycdT_2 KEGG:ns NR:ns ## COG: ycdT_2 COG2199 # Protein_GI_number: 16128989 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 1 138 65 202 202 271 100.0 2e-73 MIMDIDHFKKVNDTWGHPVGDQVIKTVVNIIGKSIRPDDLLARVGGEEFGVLLTDIDTER AKALAERIRENVERLTGDNPEYAIPQKVTISIGAVVTQENALNPNEIYRLADNALYEAKE TGRNKVVVRDVVNFCESP Prediction of potential genes in microbial genomes Time: Sun May 15 23:49:45 2011 Seq name: gi|296494531|gb|ADTN01000207.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont441.1, whole genome shotgun sequence Length of sequence - 6483 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 5, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 32 - 361 109 ## COG3316 Transposase and inactivated derivatives 2 1 Op 2 . - CDS 512 - 736 106 ## COG3316 Transposase and inactivated derivatives - Prom 760 - 819 4.6 + Prom 1021 - 1080 4.2 3 2 Op 1 . + CDS 1117 - 1272 154 ## gi|218511192|ref|YP_002415650.1| hypothetical protein pEC55989_0044 4 2 Op 2 . + CDS 1321 - 2313 570 ## ROD_p1471 hypothetical protein 5 2 Op 3 . + CDS 2340 - 2501 187 ## COG5464 Uncharacterized conserved protein + Term 2640 - 2674 1.5 6 3 Op 1 . - CDS 2498 - 3109 358 ## KPN_pKPN4p07047 hypothetical protein 7 3 Op 2 . - CDS 3163 - 3444 243 ## KPN_pKPN4p07046 putative cytoplasmic protein - Prom 3683 - 3742 4.9 - Term 3693 - 3734 3.4 8 4 Op 1 11/0.000 - CDS 3898 - 4128 183 ## COG2801 Transposase and inactivated derivatives 9 4 Op 2 23/0.000 - CDS 4192 - 5079 497 ## COG2801 Transposase and inactivated derivatives 10 4 Op 3 . - CDS 5079 - 5405 291 ## COG2963 Transposase and inactivated derivatives 11 4 Op 4 . - CDS 5431 - 5625 217 ## COG2963 Transposase and inactivated derivatives 12 4 Op 5 . 
- CDS 5686 - 6018 279 ## COG0286 Type I restriction-modification system methyltransferase subunit - Prom 6044 - 6103 3.7 + Prom 6111 - 6170 6.5 13 5 Tu 1 . + CDS 6195 - 6416 285 ## ECO111_p2-111 antitoxin of P1 addiction system Predicted protein(s) >gi|296494531|gb|ADTN01000207.1| GENE 1 32 - 361 109 109 aa, chain - ## HITS:1 COG:Cgl0933 KEGG:ns NR:ns ## COG: Cgl0933 COG3316 # Protein_GI_number: 19552183 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Corynebacterium glutamicum # 2 108 128 234 236 120 56.0 7e-28 MKSWKIPRRINTDKAPTYGRALALLKCEGQCPPDVEHRQIKYRNNVIECDHGKLKRIISA TLGFKSMKTAYATIKGIEVMRALRKGQASAFYYGDPLGEMRLVSRVFEM >gi|296494531|gb|ADTN01000207.1| GENE 2 512 - 736 106 74 aa, chain - ## HITS:1 COG:Cgl0933 KEGG:ns NR:ns ## COG: Cgl0933 COG3316 # Protein_GI_number: 19552183 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Corynebacterium glutamicum # 1 69 1 69 236 106 68.0 1e-23 MNPFKGRHFQRDIILWAVRWYCKYGISYRELQEMLAERGVNVDHSTIYRWVQSYAPEIEK RLRWYWRNPSRFCP >gi|296494531|gb|ADTN01000207.1| GENE 3 1117 - 1272 154 51 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|218511192|ref|YP_002415650.1| ## NR: gi|218511192|ref|YP_002415650.1| hypothetical protein pEC55989_0044 [Escherichia coli 55989] # 1 51 11 61 61 67 94.0 2e-10 MSPVQAKQKQHERYEAVAVQVLRGRAGYKPAVKSRFSKSASSKFAHTIAFA >gi|296494531|gb|ADTN01000207.1| GENE 4 1321 - 2313 570 330 aa, chain + ## HITS:1 COG:no KEGG:ROD_p1471 NR:ns ## KEGG: ROD_p1471 # Name: not_defined # Def: hypothetical protein # Organism: C.rodentium # Pathway: not_defined # 1 330 1 330 330 665 96.0 0 MPVQDVIPPYEQMYLLNQQLICNADQFKHAVITVGGQAVQYWISYYHAQYGDRLPDERLT TSVDCDYSARKDDIAAIAKTLNVKTWENKDGQPPSLAQFMLIDQDTHDIKRDDGRLFAVP DAPDEPNVVDIIDRPGGFDRSDFQGKKLYLYTAPFYVEATGPGMPEMNEKVRVLNPVACM RSRFSNLIALRRDAEIEIARINALKIPCYFFLIEQFDEQPFKVARGIFMDLWRLANDESC LRHQAFWHSWQGPLLEGQQSNNITLIDVLEGVHVYLEGHLDDFEIPEAFVTKEVPLKLAQ 
LRERWERYVVLNAEWAARGRRGFERNPRDD >gi|296494531|gb|ADTN01000207.1| GENE 5 2340 - 2501 187 53 aa, chain + ## HITS:1 COG:PSLT051 KEGG:ns NR:ns ## COG: PSLT051 COG5464 # Protein_GI_number: 17233500 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Salmonella typhimurium LT2 # 11 53 271 313 313 82 95.0 2e-16 MCKPPEHTLIGRSEGERQATLKIARTMLQNGIDRNTVMKMTGLTEDDLAQIRH >gi|296494531|gb|ADTN01000207.1| GENE 6 2498 - 3109 358 203 aa, chain - ## HITS:1 COG:no KEGG:KPN_pKPN4p07047 NR:ns ## KEGG: KPN_pKPN4p07047 # Name: not_defined # Def: hypothetical protein # Organism: K.pneumoniae # Pathway: not_defined # 1 203 1 203 203 397 100.0 1e-109 MTLTEKTGHLAWCALVALALARQEQGELSPAQENLFLTRWLAAALKQRRFSRDVAQDIGW LLNQGRLLGVRAKLADKLGYVWRSCSGELTEQNDMFRLTYALETAKDMGWNYRVMSDREW AGRYALVLNPGVNGVYLLRTNLDAAFDDNGQQTNPLTVRLTGNVTGIMKLLNRCGWQAEP ESDASLPHQFSLMARQGVPGKGD >gi|296494531|gb|ADTN01000207.1| GENE 7 3163 - 3444 243 93 aa, chain - ## HITS:1 COG:no KEGG:KPN_pKPN4p07046 NR:ns ## KEGG: KPN_pKPN4p07046 # Name: not_defined # Def: putative cytoplasmic protein # Organism: K.pneumoniae # Pathway: not_defined # 1 93 1 93 93 176 98.0 2e-43 MRVAESIILDALTRGGCIKTFYRISSRQAAESATRIPEGYILESPGEREDIVLSRADFHA LEKLLEQKVTWEQVVGVTCFGGATWQLRPTVQS >gi|296494531|gb|ADTN01000207.1| GENE 8 3898 - 4128 183 76 aa, chain - ## HITS:1 COG:mlr6150 KEGG:ns NR:ns ## COG: mlr6150 COG2801 # Protein_GI_number: 13475138 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Mesorhizobium loti # 1 68 215 282 286 84 57.0 7e-17 MSRRGNCHDNAVAESFFQLLKRERIKKKIYGTREEARSDIFDYIEMFYNSKRRHGSSDQM SPTEYENQYYQRLGSV >gi|296494531|gb|ADTN01000207.1| GENE 9 4192 - 5079 497 295 aa, chain - ## HITS:1 COG:ECs1208 KEGG:ns NR:ns ## COG: ECs1208 COG2801 # Protein_GI_number: 15830462 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 295 2 296 296 565 96.0 1e-161 
MPLLDKLRKLYGVGPVCSELHIAPSTYYHCQQQRHHPDKRSARAQRDDWLKKEIQRVYDE NHKVYGVRKVWRQLLREGIRVARCTVARLMAVMGLAGVLRGKKVRTTISRKAVAAGDRVN RQFVAERPDQLWVADFTYVSTWRGFVYVAFIIDVFAGYIVGWRVSSSMETTFVLDALEQA LWARRPSGTVHHSDKGSQYVSLAYTQRLKEAGLLASTGSTGDSYDNAMAESINGLYKAEV IHRKSWKNRAEVELATLTWVDWYNNRRLLERQGHIPPAEAEKAYYASIGNDDLAA >gi|296494531|gb|ADTN01000207.1| GENE 10 5079 - 5405 291 108 aa, chain - ## HITS:1 COG:ECs1665 KEGG:ns NR:ns ## COG: ECs1665 COG2963 # Protein_GI_number: 15830919 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 108 1 108 108 164 96.0 3e-41 MTKNTRFSPEVRQRAVRMVLESQSEYDSQWATICSIAPKIGCTPETLRVWVRQHERDTGG GDGGLTTAERQRLKELERENRELRRSNDILRQASAYFAKAEFDRLWKK >gi|296494531|gb|ADTN01000207.1| GENE 11 5431 - 5625 217 64 aa, chain - ## HITS:1 COG:STM0556 KEGG:ns NR:ns ## COG: STM0556 COG2963 # Protein_GI_number: 16763934 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Salmonella typhimurium LT2 # 1 56 1 56 72 82 80.0 1e-16 MSGKRYPEEFKTEAVKQVVDRGYSVASVATRLDITTHSLYAWIKKYGPDSSTNKELNRPG NPGD >gi|296494531|gb|ADTN01000207.1| GENE 12 5686 - 6018 279 110 aa, chain - ## HITS:1 COG:NMA1038 KEGG:ns NR:ns ## COG: NMA1038 COG0286 # Protein_GI_number: 15793994 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Neisseria meningitidis Z2491 # 3 109 2 108 514 171 71.0 4e-43 MKMTSIQQRAELHRQIWQIANDVRGSVDGWDFKQYVLGALFYRFISENFSSYIEAGDDSI CYAKLDDSVITDDIKDDAIKTKGYFIYPSQLFCNVAAKANTNDRLNADLN >gi|296494531|gb|ADTN01000207.1| GENE 13 6195 - 6416 285 73 aa, chain + ## HITS:1 COG:no KEGG:ECO111_p2-111 NR:ns ## KEGG: ECO111_p2-111 # Name: not_defined # Def: antitoxin of P1 addiction system # Organism: E.coli_O111_H- # Pathway: not_defined # 1 73 1 73 73 117 100.0 1e-25 MQSINFRTARGNLSEVLNNVEAGEEVEITRRGREPAVIVSKATFEAYKKAALDAEFASLF DTLDSTNKELVNR Prediction of potential genes in 
microbial genomes Time: Sun May 15 23:50:01 2011 Seq name: gi|296494530|gb|ADTN01000208.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont441.2, whole genome shotgun sequence Length of sequence - 6239 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 3, operones - 1 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 71 - 322 126 ## COG3654 Prophage maintenance system killer protein 2 1 Op 2 . + CDS 327 - 506 87 ## ECO111_p2-113 hypothetical protein 3 1 Op 3 . + CDS 534 - 1577 508 ## ECO111_p2-114 hypothetical protein 4 1 Op 4 . + CDS 1666 - 2118 174 ## ECO111_p2-115 late promoter activating protein + Term 2129 - 2153 -0.3 5 1 Op 5 . + CDS 2204 - 3397 405 ## ECO111_p2-116 DNA packaging protein 6 1 Op 6 . + CDS 3397 - 4881 239 ## ECO111_p2-117 DNA packaging protein + Term 4921 - 4957 -0.9 7 2 Tu 1 . - CDS 4906 - 5757 334 ## ECO111_p2-122 primary repressor of lytic functions - Term 5779 - 5809 -0.6 8 3 Tu 1 . - CDS 5868 - 6077 171 ## ECO111_p2-123 C1 inactivator protein - Prom 6147 - 6206 2.7 Predicted protein(s) >gi|296494530|gb|ADTN01000208.1| GENE 1 71 - 322 126 83 aa, chain + ## HITS:1 COG:STM3558 KEGG:ns NR:ns ## COG: STM3558 COG3654 # Protein_GI_number: 16766844 # Func_class: R General function prediction only # Function: Prophage maintenance system killer protein # Organism: Salmonella typhimurium LT2 # 2 76 47 121 122 78 49.0 3e-15 MYEEITDLFEVSATYLVATARGHIFNDANKRTALNSALLFLRRNGVQVFDSPELADLTVG AATGEISVSSVADTLRRLYGSAE >gi|296494530|gb|ADTN01000208.1| GENE 2 327 - 506 87 59 aa, chain + ## HITS:1 COG:no KEGG:ECO111_p2-113 NR:ns ## KEGG: ECO111_p2-113 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 59 1 59 59 85 100.0 5e-16 MARKYNKLSREALKMLLDGVSRREVKQYLVGKQIGARTAIAVLCRQEMVVLKQRMPGSR >gi|296494530|gb|ADTN01000208.1| GENE 3 534 - 1577 508 347 aa, chain + ## HITS:1 COG:no KEGG:ECO111_p2-114 NR:ns ## KEGG: ECO111_p2-114 # Name: not_defined # Def: 
hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 347 1 347 347 683 97.0 0 MKAVITPFVQKELGVATFKVDQEVRKLVEAGRKFIMEPVPRELIEHMDDGLVVSEQTMAT NEALQPFFNSDELFRRIGGIDALVAWLRRKEGQCQAADRSWCDNHIVHAERDNSAVLLCW HHDNHYRMRGFNELKETLHNNRVNWILDVARQEMGLSDGHDLSIQELCWWAFMRDMMHLM PEEVCRISINKIKAATQDSGPMKEADIRPYDDRATAYVQMMEERAAPMRAKVCPVDVDSD PGMAHFKIPKLQSLKLPEYMDFVASRPCCGCGAAGAGARITPYIVRHSRLCAHDIYAIPL CQSCQRDIERDRDNWEKTHGRLAMHQRLFFDYALGVGVITSHSSSVR >gi|296494530|gb|ADTN01000208.1| GENE 4 1666 - 2118 174 150 aa, chain + ## HITS:1 COG:no KEGG:ECO111_p2-115 NR:ns ## KEGG: ECO111_p2-115 # Name: not_defined # Def: late promoter activating protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 150 1 150 150 263 99.0 2e-69 MLLNWQGRHFMEINHSRITSYEIADYMIRTKSLLSAKELAAILEKEYPHLDVDKRDVYLR LKAIAVSKYSSVLIDDSTRPRRFQIHSLNPEFFRRSRAPRRFDEKLQNELYMTQDEKERR EHQPWVMARQLFNKVARQHRHYGNATSARI >gi|296494530|gb|ADTN01000208.1| GENE 5 2204 - 3397 405 397 aa, chain + ## HITS:1 COG:no KEGG:ECO111_p2-116 NR:ns ## KEGG: ECO111_p2-116 # Name: not_defined # Def: DNA packaging protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 397 1 397 397 629 97.0 1e-179 MTWDDHKKNFARLARDGGYTIAQYAAEFNLNPNTARRYLRAFKEDTRTADSRKPNKPVRK PLKSMIIDHANDQRAGDHIAAEIAEKQRVNAVVSAAVENAKRQNKRINDRSDDHDVITRA HLTLRDRLERDTLDDDGERFEFEAGDYLIDNVEARKAARAMLRRSGADVLETTLLEKSLS HLLMLENARDTCIRLVQEMRDQQKDDDEGTPPEYRIASMLNSCSAQISSLINTIYSIRNN YRKESREAEKHALSMGQAGIVKLAYERKRENNWSVLEAAEFIEAHGGKVPPLMLEQIKAD LRAPKTNTDDEENQTASGAPSLEDLDKIARERAASRRADAALWIEHRREEIADIVDTGGY GDVDAEGISNEAWLEQDLDEDEEEDEEVTRKLYGDDD >gi|296494530|gb|ADTN01000208.1| GENE 6 3397 - 4881 239 494 aa, chain + ## HITS:1 COG:no KEGG:ECO111_p2-117 NR:ns ## KEGG: ECO111_p2-117 # Name: not_defined # Def: DNA packaging protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 494 1 494 494 1018 99.0 0 MARSCVTDPRWRELVALYRYDWIAAADVLFGKTPTWQQDEIIESTQQDGSWTSVTSGHGT GKSDMTSIIAILFIMFFPGARVILVANKRQQVLDGIFKYIKSNWATAVSRFPWLSKYFIL 
TETSFFEVTGKGVWTILIKSCRPGNEEALAGEHADHLLYIIDEASGVSDKAFSVITGALT GKDNRILLLSQPTRPSGYFYDSHHRLAIRPGNPDGLFTAIILNSEESPLVDAKFIRAKLA EYGGRDNPMYMIKVRGEFPKSQDGFLLGRDEVERATRRKVKIAKGWGWVACVDVAGGTGR DKSVINIMMVSGQRNKRRVINYRMLEYTDVTETQLAAKIFAECNPERFPNITIAIDGDGL GKSTADLMYERYGITVQRIRWGKKMHSREDKSLYFDMRAFANIQAAEAVKSGRMRLDKGA ATIEEASKIPVGINSAGQWKVMSKEDMKKKLNLHSPDHWDTYCFAMLANYVPQDEVLSVE DEAQVDEALAWLNE >gi|296494530|gb|ADTN01000208.1| GENE 7 4906 - 5757 334 283 aa, chain - ## HITS:1 COG:no KEGG:ECO111_p2-122 NR:ns ## KEGG: ECO111_p2-122 # Name: not_defined # Def: primary repressor of lytic functions # Organism: E.coli_O111_H- # Pathway: not_defined # 1 283 1 283 283 552 98.0 1e-156 MINYVYGEQLYQEFVSFRDLFLKKAVARAQHVDTASDGRPVRPVVVLPFKETDSIQAEID KWTLMARELEQYPDLNIPKTILYPVPNILRGVRKVTTYQTEAVNSVNMTAGRIIHLIDKD IRIQKSAGINEHSAKYIENLEATKELMKQYPEDEKFRMRVHGFSETMLRVHYISSSPNYN DGKSVSYHVPLCGVFICDETLRDGIIINGEFEKAKFSLYDSIEPIICDRWPQAKIYRLAD IENVKKQIAITREEKKVKSAASVTRRRKTKKGQPVNDNPESAQ >gi|296494530|gb|ADTN01000208.1| GENE 8 5868 - 6077 171 69 aa, chain - ## HITS:1 COG:no KEGG:ECO111_p2-123 NR:ns ## KEGG: ECO111_p2-123 # Name: not_defined # Def: C1 inactivator protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 69 1 69 69 96 98.0 3e-19 MAFIQPTIDDVRHCSNALSVDPAETDAARAIAEHYSKISNQEYRITQDDLDDLTDTIEYL MATNQLDSQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:50:27 2011 Seq name: gi|296494529|gb|ADTN01000209.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont442.1, whole genome shotgun sequence Length of sequence - 3207 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 49 - 201 141 ## ECO103_0016 regulatory protein MokC for HokC + Prom 697 - 756 3.9 2 2 Op 1 5/0.000 + CDS 787 - 1953 1218 ## COG3004 Na+/H+ antiporter + Term 1969 - 2000 2.4 3 2 Op 2 . + CDS 2019 - 2918 838 ## COG0583 Transcriptional regulator 4 3 Tu 1 . 
+ CDS 3064 - 3174 70 ## Predicted protein(s) >gi|296494529|gb|ADTN01000209.1| GENE 1 49 - 201 141 50 aa, chain - ## HITS:1 COG:no KEGG:ECO103_0016 NR:ns ## KEGG: ECO103_0016 # Name: mokC # Def: regulatory protein MokC for HokC # Organism: E.coli_O103_H2 # Pathway: not_defined # 1 50 20 69 69 74 100.0 1e-12 MKQHKAMIVALIVICITAVVAALVTRKDLCEVHIRTGQTEVAVFTAYESE >gi|296494529|gb|ADTN01000209.1| GENE 2 787 - 1953 1218 388 aa, chain + ## HITS:1 COG:ECs0017 KEGG:ns NR:ns ## COG: ECs0017 COG3004 # Protein_GI_number: 15829271 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter # Organism: Escherichia coli O157:H7 # 1 388 1 388 388 627 99.0 1e-179 MKHLHRFFSSDASGGIILIIAAILAMIMANSGATSGWYHDFLETPVQLRVGSLEINKNML LWINDALMAVFFLLVGLEVKSELMQGSLASLRQAAFPVIAAIGGMIVPALLYLAFNYADP ITREGWAIPAATDIAFALGVLALLGSRVPLALKIFLMALAIIDDLGAIIIIALFYTNDLS MASLGVAAVAIAVLAVLNLCGVRRTGVYILVGVVLWTAVLKSGVHATLAGVIVGFFIPLK EKHGRSPAKRLEHVLHPWVAYLILPLFAFANAGVSLQGVTLDGLTSILPLGIIAGLLIGK PLGISLFCWLALRLKLAHLPEGTTYQQIMVVGILCGIGFTMSIFIASLAFGSVDPELINW AKLGILVGSISSAVIGYSWLRVRLRPSV >gi|296494529|gb|ADTN01000209.1| GENE 3 2019 - 2918 838 299 aa, chain + ## HITS:1 COG:nhaR KEGG:ns NR:ns ## COG: nhaR COG0583 # Protein_GI_number: 16128014 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 299 3 301 301 629 100.0 1e-180 MSHINYNHLYYFWHVYKEGSVVGAAEALYLTPQTITGQIRALEERLQGKLFKRKGRGLEP SELGELVYRYADKMFTLSQEMLDIVNYRKESNLLFDVGVADALSKRLVSSVLNAAVVEGE PIHLRCFESTHEMLLEQLSQHKLDMIISDCPIDSTQQEGLFSVRIGECGVSFWCTNPPPE KPFPACLEERRLLIPGRRSMLGRKLLNWFNSQGLNVEILGEFDDAALMKAFGAMHNAIFV APTLYAYDFYADKTVVEIGRVENVMEEYHAIFAERMIQHPAVQRICNTDYSALFSPAVR >gi|296494529|gb|ADTN01000209.1| GENE 4 3064 - 3174 70 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGMLKRVVIAFTILDFIRTIIGELKAYLTRTVLTVF Prediction of potential genes in microbial genomes Time: Sun May 15 23:50:38 2011 Seq name: gi|296494528|gb|ADTN01000210.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont447.1, whole genome 
shotgun sequence Length of sequence - 19674 bp Number of predicted genes - 22, with homology - 21 Number of transcription units - 10, operones - 5 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 72 - 842 557 ## COG0388 Predicted amidohydrolase - Prom 903 - 962 6.7 + Prom 894 - 953 3.9 2 2 Tu 1 . + CDS 996 - 1469 576 ## EC55989_0244 C-lysozyme inhibitor + Term 1486 - 1516 3.0 - Term 1473 - 1503 3.0 3 3 Tu 1 . - CDS 1512 - 3956 3083 ## COG1960 Acyl-CoA dehydrogenases - Prom 4043 - 4102 3.2 + Prom 4105 - 4164 4.2 4 4 Op 1 6/0.000 + CDS 4196 - 4774 783 ## COG0279 Phosphoheptose isomerase + Term 4797 - 4861 3.8 + Prom 4807 - 4866 2.5 5 4 Op 2 . + CDS 4980 - 5747 639 ## COG0121 Predicted glutamine amidotransferase + Term 5792 - 5848 6.0 - Term 5674 - 5707 2.1 6 5 Tu 1 3/0.750 - CDS 5718 - 6458 814 ## COG3034 Uncharacterized protein conserved in bacteria - Prom 6539 - 6598 3.0 7 6 Op 1 9/0.000 - CDS 6614 - 6892 99 ## COG3041 Uncharacterized protein conserved in bacteria 8 6 Op 2 . - CDS 6895 - 7155 403 ## COG3077 DNA-damage-inducible protein J - Prom 7186 - 7245 7.8 + Prom 7226 - 7285 8.3 9 7 Op 1 3/0.750 + CDS 7305 - 8114 291 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) + Prom 8136 - 8195 2.0 10 7 Op 2 . + CDS 8290 - 8787 21 ## COG1943 Transposase and inactivated derivatives + Term 8839 - 8880 1.0 11 8 Op 1 13/0.000 - CDS 9011 - 11104 2412 ## COG1298 Flagellar biosynthesis pathway, component FlhA 12 8 Op 2 10/0.000 - CDS 11088 - 12227 1190 ## COG1377 Flagellar biosynthesis pathway, component FlhB 13 8 Op 3 17/0.000 - CDS 12217 - 12972 930 ## COG1684 Flagellar biosynthesis pathway, component FliR 14 8 Op 4 16/0.000 - CDS 13001 - 13273 176 ## COG1987 Flagellar biosynthesis pathway, component FliQ 15 8 Op 5 2/1.000 - CDS 13276 - 14028 1021 ## COG1338 Flagellar biosynthesis pathway, component FliP 16 8 Op 6 . 
- CDS 14025 - 14396 496 ## COG1886 Flagellar motor switch/type III secretory pathway protein 17 8 Op 7 . - CDS 14389 - 15240 791 ## EcolC_3385 surface presentation of antigens (SpoA) protein - Prom 15335 - 15394 7.8 - Term 15423 - 15465 -0.3 18 9 Tu 1 . - CDS 15519 - 15623 63 ## - Prom 15777 - 15836 5.3 + Prom 15492 - 15551 9.2 19 10 Op 1 2/1.000 + CDS 15627 - 16613 865 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 20 10 Op 2 9/0.000 + CDS 16628 - 16969 378 ## COG1677 Flagellar hook-basal body protein 21 10 Op 3 19/0.000 + CDS 16974 - 18620 1743 ## COG1766 Flagellar biosynthesis/type III secretory pathway lipoprotein 22 10 Op 4 . + CDS 18598 - 19608 1251 ## COG1536 Flagellar motor switch protein Predicted protein(s) >gi|296494528|gb|ADTN01000210.1| GENE 1 72 - 842 557 256 aa, chain - ## HITS:1 COG:yafV KEGG:ns NR:ns ## COG: yafV COG0388 # Protein_GI_number: 16128205 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Escherichia coli K12 # 1 256 1 256 256 529 100.0 1e-150 MPGLKITLLQQPLVWMDGPANLRHFDRQLEGITGRDVIVLPEMFTSGFAMEAAASSLAQD DVVNWMTAKAQQCNALIAGSVALQTESGSVNRFLLVEPGGTVHFYDKRHLFRMADEHLHY KAGNARVIVEWRGWRILPLVCYDLRFPVWSRNLNDYDLALYVANWPAPRSLHWQALLTAR AIENQAYVAGCNRVGSDGNGCHYRGDSRVINPQGEIIATADAHQATRIDAELSMAALREY REKFPAWQDADEFRLW >gi|296494528|gb|ADTN01000210.1| GENE 2 996 - 1469 576 157 aa, chain + ## HITS:1 COG:no KEGG:EC55989_0244 NR:ns ## KEGG: EC55989_0244 # Name: ivy # Def: C-lysozyme inhibitor # Organism: E.coli_55989 # Pathway: not_defined # 1 157 1 157 157 292 100.0 2e-78 MGRISSGGMMFKAITTVAALVIATSAMAQDDLTISSLAKGETTKAAFNQMVQGHKLPAWV MKGGTYTPAQTVTLGDETYQVMSACKPHDCGSQRIAVMWSEKSNQMTGLFSTIDEKTSQE KLTWLNVNDALSIDGKTVLFAALTGSLENHPDGFNFK >gi|296494528|gb|ADTN01000210.1| GENE 3 1512 - 3956 3083 814 aa, chain - ## HITS:1 COG:yafH KEGG:ns NR:ns ## COG: yafH COG1960 # Protein_GI_number: 16128207 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA 
dehydrogenases # Organism: Escherichia coli K12 # 1 814 13 826 826 1674 100.0 0 MMILSILATVVLLGALFYHRVSLFISSLILLAWTAALGVAGLWSAWVLVPLAIILVPFNF APMRKSMISAPVFRGFRKVMPPMSRTEKEAIDAGTTWWEGDLFQGKPDWKKLHNYPQPRL TAEEQAFLDGPVEEACRMANDFQITHELADLPPELWAYLKEHRFFAMIIKKEYGGLEFSA YAQSRVLQKLSGVSGILAITVGVPNSLGPGELLQHYGTDEQKDHYLPRLARGQEIPCFAL TSPEAGSDAGAIPDTGIVCMGEWQGQQVLGMRLTWNKRYITLAPIATVLGLAFKLSDPEK LLGGAEDLGITCALIPTTTPGVEIGRRHFPLNVPFQNGPTRGKDVFVPIDYIIGGPKMAG QGWRMLVECLSVGRGITLPSNSTGGVKSVALATGAYAHIRRQFKISIGKMEGIEEPLARI AGNAYVMDAAASLITYGIMLGEKPAVLSAIVKYHCTHRGQQSIIDAMDITGGKGIMLGQS NFLARAYQGAPIAITVEGANILTRSMMIFGQGAIRCHPYVLEEMEAAKNNDVNAFDKLLF KHIGHVGSNKVRSFWLGLTRGLTSSTPTGDATKRYYQHLNRLSANLALLSDVSMAVLGGS LKRRERISARLGDILSQLYLASAVLKRYDDEGRNEADLPLVHWGVQDALYQAEQAMDDLL QNFPNRVVAGLLNVVIFPTGRHYLAPSDKLDHKVAKILQVPNATRSRIGRGQYLTPSEHN PVGLLEEALVDVIAADPIHQRICKELGKNLPFTRLDELAHNALVKGLIDKDEAAILVKAE ESRLRSINVDDFDPEELATKPVKLPEKVRKVEAA >gi|296494528|gb|ADTN01000210.1| GENE 4 4196 - 4774 783 192 aa, chain + ## HITS:1 COG:ECs0249 KEGG:ns NR:ns ## COG: ECs0249 COG0279 # Protein_GI_number: 15829503 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoheptose isomerase # Organism: Escherichia coli O157:H7 # 1 192 1 192 192 373 100.0 1e-103 MYQDLIRNELNEAAETLANFLKDDANIHAIQRAAVLLADSFKAGGKVLSCGNGGSHCDAM HFAEELTGRYRENRPGYPAIAISDVSHISCVGNDFGFNDIFSRYVEAVGREGDVLLGIST SGNSANVIKAIAAAREKGMKVITLTGKDGGKMAGTADIEIRVPHFGYADRIQEIHIKVIH ILIQLIEKEMVK >gi|296494528|gb|ADTN01000210.1| GENE 5 4980 - 5747 639 255 aa, chain + ## HITS:1 COG:yafJ KEGG:ns NR:ns ## COG: yafJ COG0121 # Protein_GI_number: 16128209 # Func_class: R General function prediction only # Function: Predicted glutamine amidotransferase # Organism: Escherichia coli K12 # 1 255 1 255 255 548 100.0 1e-156 MCELLGMSANVPTDICFSFTGLVQRGGGTGPHKDGWGITFYEGKGCRTFKDPQPSFNSPI AKLVQDYPIKSCSVVAHIRQANRGEVALENTHPFTRELWGRNWTYAHNGQLTGYKSLETG NFRPVGETDSEKAFCWLLHKLTQRYPRTPGNMAAVFKYIASLADELRQKGVFNMLLSDGR YVMAYCSTNLHWITRRAPFGVATLLDQDVEIDFSSQTTPNDVVTVIATQPLTGNETWQKI 
MPGEWRLFCLGERVV >gi|296494528|gb|ADTN01000210.1| GENE 6 5718 - 6458 814 246 aa, chain - ## HITS:1 COG:ECs0251 KEGG:ns NR:ns ## COG: ECs0251 COG3034 # Protein_GI_number: 15829505 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 246 1 246 246 491 100.0 1e-139 MRKIALILAMLLIPCVSFAGLLGSSSSTTPVSKEYKQQLMGSPVYIQIFKEERTLDLYVK MGEQYQLLDSYKICKYSGGLGPKQRQGDFKSPEGFYSVQRNQLKPDSRYYKAINIGFPNA YDRAHGYEGKYLMIHGDCVSIGCYAMTNQGIDEIFQFVTGALVFGQPSVQVSIYPFRMTD ANMKRHKYSNFKDFWEQLKPGYDYFEQTRKPPTVSVVNGRYVVSKPLSHEVVQPQLASNY TLPEAK >gi|296494528|gb|ADTN01000210.1| GENE 7 6614 - 6892 99 92 aa, chain - ## HITS:1 COG:yafQ KEGG:ns NR:ns ## COG: yafQ COG3041 # Protein_GI_number: 16128211 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 92 1 92 92 180 100.0 5e-46 MIQRDIEYSGQYSKDVKLAQKRHKDMNKLKYLMTLLINNTLPLPAVYKDHPLQGSWKGYR DAHVEPDWILIYKLTDKLLRFERTGTHAALFG >gi|296494528|gb|ADTN01000210.1| GENE 8 6895 - 7155 403 86 aa, chain - ## HITS:1 COG:dinJ KEGG:ns NR:ns ## COG: dinJ COG3077 # Protein_GI_number: 16128212 # Func_class: L Replication, recombination and repair # Function: DNA-damage-inducible protein J # Organism: Escherichia coli K12 # 1 86 1 86 86 144 100.0 3e-35 MAANAFVRARIDEDLKNQAADVLAGMGLTISDLVRITLTKVAREKALPFDLREPNQLTIQ SIKNSEAGIDVHKAKDADDLFDKLGI >gi|296494528|gb|ADTN01000210.1| GENE 9 7305 - 8114 291 269 aa, chain + ## HITS:1 COG:yafL KEGG:ns NR:ns ## COG: yafL COG0791 # Protein_GI_number: 16128213 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Escherichia coli K12 # 21 269 1 249 249 502 100.0 1e-142 MFRNDPGYTATLKHDQPLLSMSLPSIPSFVLSGLLLICLPFSSFASATTSHISFSYAARQ RMQNRARLLKQYQTHLKKQASYIVEGNAESKRALRQHNREQIKQHPEWFPAPLKASDRRW QALAENNHFLSSDHLHNITEVAIHRLEQQLGKPYVWGGTRPDKGFDCSGLVFYAYNKILE AKLPRTANEMYHYRRATIVANNDLRRGDLLFFHIHSREIADHMGVYLGDGQFIESPRTGE 
TIRISRLAEPFWQDHFLGARRILTEETIL >gi|296494528|gb|ADTN01000210.1| GENE 10 8290 - 8787 21 165 aa, chain + ## HITS:1 COG:yafM KEGG:ns NR:ns ## COG: yafM COG1943 # Protein_GI_number: 16128214 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 1 165 1 165 165 325 100.0 2e-89 MSEYRRYYIKGGTWFFTVNLRNRRSQLLTTQYQMLRHAIIKVKRDRPFEINAWVVLPEHM HCIWTLPEGDDDFSSRWREIKKQFTHACGLKNIWQPRFWEHAIRNTKDYRHHVDYIYINP VKHGWVKQVSDWPFSTFHRDVARGLYPIDWAGDVTDFSAGERIIS >gi|296494528|gb|ADTN01000210.1| GENE 11 9011 - 11104 2412 697 aa, chain - ## HITS:1 COG:fhiA KEGG:ns NR:ns ## COG: fhiA COG1298 # Protein_GI_number: 16128215 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FlhA # Organism: Escherichia coli K12 # 125 697 7 579 579 1079 98.0 0 MAKTTKSFLALLRGGNLGVPLVILCILAMVILPLPPALLDILFTFNIVLAVMVLLVAVSA KRPLEFSLFPTILLITTLMRLTLNVASTRVVLLHGHLGAGAAGKVIESFGQVVIGGNFVV GFVVFIILMIINFIVVTKGAERISEVSARFTLDAMPGKQMAIDADLNAGLINQAQAQTRR KDVASEADFYGAMDGASKFVRGDAIAGMMILAINLIGGVCIGIFKYNLSADAAFQQYVLM TIGDGLVAQIPSLQLSTAAAIIVTRVSDNGDIAHDVRNQLLASPSVLYTATGIMFVLAVV PGMPHLPFLLFSALLGFTGWRMSKQPLAAEAEEKSLETLTRTITETSEQQVSWETIPLIE PISLSLGYKLVALVDKAQGNPLTQRIRGVRQVISDGNGVLLPEIRIRENFRLKPSQYAIF INGIKADEADIPADKLMALPSSETYGEIDGVQGNDPAYGMPVTWIQAAQKAKALNMGYQV IDSASVIATHVNKIVRSYIPDLFNYDDITQLHNRLSSTAPRLAEDLSAALNYSQLLKVYR ALLTEGVSLRDIVTIATVLVASSTVTKDHILLAADVRLALRRSITHPFVRKQELTVYTLN NELENLLTNVVNQAQQGGKVMLDSVPVDPNMLNQFQSTMPQVKEQMKAAGKDPVLLVPPQ LRPLLARYARLFAPGLHVLSYNEVPDELELKIMGALM >gi|296494528|gb|ADTN01000210.1| GENE 12 11088 - 12227 1190 379 aa, chain - ## HITS:1 COG:YPO0706 KEGG:ns NR:ns ## COG: YPO0706 COG1377 # Protein_GI_number: 16121025 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FlhB # Organism: Yersinia pestis # 4 377 3 376 377 388 50.0 1e-108 
MADSSSEEKTEKPSAQKLRKAREEGQLPRSKDMGLAASLFAAFVVISSSFPWYADFVRES FISVHQYAQEINNPEVIGQFLRHHLLILGKFILTLLPMPAAALLSSLVPGGWLFLPKKIL PDFSKISPLKGIGRLFSSEHLAETGKMTVKSVVVLVMLWVSLRNNFAAFLGLQALPFKLA MNEGLSLYACVMRNFVILFIFFALIDVPLAKALFTKGLKMTKQELKEEYKNQEGKPEVKA RVRRLQRQLAMGQIRKVVPKANVVITNPTHYAVALQYDSSRAAAPFVVAKGTDEIALYIR QVAAENQVEVVEFPRLARSVYYTTQVNQQIPFQLYRAIAHVLTYVLQMKHWREGAQPRPA LNRHISIPKEVLKLDGENN >gi|296494528|gb|ADTN01000210.1| GENE 13 12217 - 12972 930 251 aa, chain - ## HITS:1 COG:YPO0707 KEGG:ns NR:ns ## COG: YPO0707 COG1684 # Protein_GI_number: 16121026 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FliR # Organism: Yersinia pestis # 5 248 13 257 257 232 56.0 5e-61 MDLALGLWFPFVRIMAFLRYVPVLDNSALTVRVRIILSLALAIIITPLIPHPIPHDLLSL NSLILTVEQILWGMLFGLMFQFLFLALQLAGQILSFNMGMSMAVMNDPSSGASTTVLAEL INVYAILLFFAMDGHLLLVSVLYKGFTYWPIGNALHPQTLRTIALAFSWVLASASLLALP TTFIMLIVQGCFGLLNRIAPPLNLFSLGFPINMLAGLVCFATLLYNLPDHYLHLANFVLQ QLDALKGHYGG >gi|296494528|gb|ADTN01000210.1| GENE 14 13001 - 13273 176 90 aa, chain - ## HITS:1 COG:YPO0708 KEGG:ns NR:ns ## COG: YPO0708 COG1987 # Protein_GI_number: 16121027 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FliQ # Organism: Yersinia pestis # 2 89 1 88 89 73 59.0 7e-14 MLTVDVAADIVASGIKVVILLVSVLVVPSLLVGLLVSVFQAVTQINEQTLSFLPRLIVTL VVLGVCGKWMIIQLHDLCIHLFTQAALLVH >gi|296494528|gb|ADTN01000210.1| GENE 15 13276 - 14028 1021 250 aa, chain - ## HITS:1 COG:YPO0709 KEGG:ns NR:ns ## COG: YPO0709 COG1338 # Protein_GI_number: 16121028 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FliP # Organism: Yersinia pestis # 9 247 18 256 256 254 67.0 1e-67 MKRRTQLTLGLGLLALAPLAIAQGGDIALLNVVTHGNTQEYSVKIQVLILMTLVGLLPTM ALMMTCFARFIIVLSLLRQALGLQQTPPNRILIGIALSLTMLVMRPIWLNIYDHAVVPFE 
NDQITLTDALSTAATPLKRFMLAQTDKKAMAQIMTIGGAKGNAADQDLSIVVPAYVLSEL KTAFQIGFMIYIPFLVIDLIVASVLMAMGMMMLSPLIVSLPFKLMLFVLIDGWSLTIGTL TTSIRGLGLG >gi|296494528|gb|ADTN01000210.1| GENE 16 14025 - 14396 496 123 aa, chain - ## HITS:1 COG:YPO0710 KEGG:ns NR:ns ## COG: YPO0710 COG1886 # Protein_GI_number: 16121029 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar motor switch/type III secretory pathway protein # Organism: Yersinia pestis # 36 122 50 136 137 99 62.0 1e-21 MSKQEDILAGDFGLTDEVAPVAKSADAETLVTRLEDRFSDSMTLLKRIPVTLTLEVSSVE IMLADLLNIDDDTVIELDKLAGEPLDIKVNNILLGKAEVVVVNEKYGLRVLEFNTRDIND LAP >gi|296494528|gb|ADTN01000210.1| GENE 17 14389 - 15240 791 283 aa, chain - ## HITS:1 COG:no KEGG:EcolC_3385 NR:ns ## KEGG: EcolC_3385 # Name: not_defined # Def: surface presentation of antigens (SpoA) protein # Organism: E.coli_ATCC8739 # Pathway: Bacterial chemotaxis [PATH:ecl02030]; Flagellar assembly [PATH:ecl02040] # 1 283 1 283 283 545 99.0 1e-154 MLKYSKTPGIFKLEGNRLGRPYHHLPTLFTGNFDVIDSHLGSYFLKKHRSNITLKKIGCE MDIINKNAELMISQVGHLAFDIDRSLLLMLLGNFYGLESSLEEAKAHNGLPTKTETRLKN RLALDICTLIFNLQTSGIALKLKLDSSTVITHWAYQLTFTLAGDEESCFRILLDDAYTDF ILNLIRHSEHSKPQQVAKSVNKPALIKEIIRSLPLTLNVKIAELSMNVADLTQIKAGDIL PISLGETFPVAIGQSELFSALIVEDKDKLFLSELAGKNENSHE >gi|296494528|gb|ADTN01000210.1| GENE 18 15519 - 15623 63 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNDISSDRKQDIVKRITGGINKFELNDNKNIILW >gi|296494528|gb|ADTN01000210.1| GENE 19 15627 - 16613 865 328 aa, chain + ## HITS:1 COG:YPO0712 KEGG:ns NR:ns ## COG: YPO0712 COG2204 # Protein_GI_number: 16121031 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Yersinia pestis # 4 326 16 338 342 382 59.0 1e-106 MSELIATAASSINAFTLAKRVAAFNVPVLIQGETGAGKECVAKYIHTVAFGENDNAPYIG VNCAAIPENMLEATLFGYDKGAFTGAIASVPGKMELANNGTLLLDEIGDMPLALQAKILR VLQEQQVERLGSNRQIKLNFRLIACTNKNLEQEVAAGRFREDLYYRLAVIPITMPPLRER 
LNDIIPLAESFIKKYSTVLVKNITLSESTRRALLNYRWPGNVRQLENAIQRGMILNRDGV IYPDALGLPETDVADRSELQWPVQPAVHIAETSDLGQHGRSAQYQYIADLMRKYQGNRSK IADLLGITPRALRYRLASMRKQGIEVFS >gi|296494528|gb|ADTN01000210.1| GENE 20 16628 - 16969 378 113 aa, chain + ## HITS:1 COG:YPO0713 KEGG:ns NR:ns ## COG: YPO0713 COG1677 # Protein_GI_number: 16121032 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar hook-basal body protein # Organism: Yersinia pestis # 11 113 13 126 126 67 41.0 6e-12 MSITTINSTMQAQIMQDVQRMQANAQAPVLPAMTFSSTDPDVSFNRIMYGALGHVDQFQQ VAEQQQTAVDTGKSDDLAGAMIASQQASLSFSALVQVRNKIATGFNDLMSMSI >gi|296494528|gb|ADTN01000210.1| GENE 21 16974 - 18620 1743 548 aa, chain + ## HITS:1 COG:YPO0714 KEGG:ns NR:ns ## COG: YPO0714 COG1766 # Protein_GI_number: 16121033 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis/type III secretory pathway lipoprotein # Organism: Yersinia pestis # 18 548 13 544 546 442 46.0 1e-123 MNAQIKKLTQAFPAFRLRLVDNKRWALMAGVGLAVAATAIIVSVLWTGNRGYVSLYGRQE NLPVSQIVTVLDGEKLSYRIDPQSGQILVPEDELSKTRMTLAAKGVQAILPSGYELMDKD EVLGSSQFVQNVRYKRSLEGELAQSIMSLDAVESARVHLALNEESSFVVSDEPQNSASVV VRLHYGAKLNMDQVNAIVHLVSGSIPGLQASKVSVVDQAGNLLTDGIGAGEAVSAATRKR DQILKDIQDKTRASVANVLDSLVGSGNYRVSVMPDLDLSTIDETQEHYGDAPKINREESV LDSDTNQVAMGVPGSLSNRPPVAANQMTNGTEENRSPEALSKHSESKRDYSYDRSVQHIQ HPGFAVKRLNVAVVLNQNAPALKNWKPEQTTQLTALLNNAAGIDAQRGDNLTLSLLNFVP QVVPVEPVIPLWKDDSVLAWVRLIGCGLLALLLLFFVVRPVMKRLTAVRAPVITPEPEAV SEPWIAMPEEERKNVDLPSLPGDDSLPSQSSGLEVKLEFLQKLAMSDTDRVAEVLRQWIT SNERIDNK >gi|296494528|gb|ADTN01000210.1| GENE 22 18598 - 19608 1251 336 aa, chain + ## HITS:1 COG:YPO0715 KEGG:ns NR:ns ## COG: YPO0715 COG1536 # Protein_GI_number: 16121034 # Func_class: N Cell motility # Function: Flagellar motor switch protein # Organism: Yersinia pestis # 13 336 1 324 324 402 69.0 1e-112 MSELTTNKSNSYLEQAAILLLCLGEEAAATVMQKLSREEVVRLSENMARLSGVKTSMARK 
VINNFFDEFREQSGINGASRSMLQGILNKALGTEIASSVINGIYGDEIRSRMARLQWVEP RQLAMLISEEHLQLQAVFLAFLTPEISAAVLSYLNESVQNEIFYRVAKLNDVNRDVVDEL DRLIERGLSVLSEHGSKVKGIKQAADIVNRFQGNQQVILDQMRERDEEVLEQLQDEMYDF FILSRQNEEVRRRLLDEVPMEDWAVALKGTEALLRRSIYAVMPKRQVQQLEAITARLGPV PVSRIEQIRREIMGIARELEEAGEIQLQLFAEQTAE Prediction of potential genes in microbial genomes Time: Sun May 15 23:50:51 2011 Seq name: gi|296494527|gb|ADTN01000211.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont447.2, whole genome shotgun sequence Length of sequence - 5455 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 13/0.000 + CDS 86 - 676 754 ## COG1317 Flagellar biosynthesis/type III secretory pathway protein 2 1 Op 2 . + CDS 669 - 2006 1229 ## COG1157 Flagellar biosynthesis/type III secretory pathway ATPase 3 1 Op 3 . + CDS 2009 - 2443 432 ## EcolC_3378 flagellar export protein FliJ 4 1 Op 4 . + CDS 2446 - 2841 342 ## COG0615 Cytidylyltransferase 5 2 Tu 1 . 
- CDS 2959 - 5403 1970 ## COG3306 Glycosyltransferase involved in LPS biosynthesis Predicted protein(s) >gi|296494527|gb|ADTN01000211.1| GENE 1 86 - 676 754 196 aa, chain + ## HITS:1 COG:YPO0716 KEGG:ns NR:ns ## COG: YPO0716 COG1317 # Protein_GI_number: 16121035 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis/type III secretory pathway protein # Organism: Yersinia pestis # 1 196 48 239 239 181 47.0 8e-46 MDGFQEGLQKGFAQGMTEGQEQGFSEGHQQGFAEGRRQGYTEGSLAGQQEGRKQFVEAAQ PLEAITGKVNDFLAHIERKQREDLLQLVEKVTRQVIRCELALQPTQLLALVEEALAAFPA MPETLQVMLSTEEFNRLRDAVPEKVSEWGLTPSPDLPPGECRVITDKSELDIGCEHRLEQ CMTALKETLTPESQGE >gi|296494527|gb|ADTN01000211.1| GENE 2 669 - 2006 1229 445 aa, chain + ## HITS:1 COG:YPO0717 KEGG:ns NR:ns ## COG: YPO0717 COG1157 # Protein_GI_number: 16121036 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis/type III secretory pathway ATPase # Organism: Yersinia pestis # 1 438 1 438 446 635 73.0 0 MSDFSAFDNALRSLESIPLARVAGRLVRLNGILLESVGCPLMTGQLCRIESANHTLIDAQ AVGFNRDITYLMPFKQPVGLMAGARVFPEEKAHDILIGESWLGRVVNGLGEPLDGKGRLN GNDLLPPLPPSVNPLTRRSVDEPLDVGVKAINGLLTIGKGQRVGLMAGSGVGKSVLLGMI TRQTKADIVVVGLIGERGREVKEFIDHSLGADGLAKSIVVVAPADESPLMRLKATELCHS IAAWFRDRGHHVLLLVDSLTRYAMAQREIALSLGEPPATKGYPPSAFGMIPKLVESAGNS ESAGSMTAIYTVLAEGDDQQDPIVDCARAVLDGHIVLTRKLAEAGHYPAIDIGQSISRCM NQVTSLEHQQSARALKQNYAAYMEIKPLIPLGGYVAGADASVDKAVKMFPAIERFLRQEM REPASLELVQSRLQILFPCAKKAEQ >gi|296494527|gb|ADTN01000211.1| GENE 3 2009 - 2443 432 144 aa, chain + ## HITS:1 COG:no KEGG:EcolC_3378 NR:ns ## KEGG: EcolC_3378 # Name: not_defined # Def: flagellar export protein FliJ # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 144 1 144 144 214 99.0 7e-55 MRQIIDTLAQLQRLRDKSVKDKTVELAKQKQICAGYDNNIKALGYLVEKTSAGAAVSVES LKNVSGYKGTLRKVIAWQEQEKTLANIKATRMQKNLTAAACEEKVVALTLDDKRREQQES ATAKAQKAVDDIAVQCWLRHKLAE >gi|296494527|gb|ADTN01000211.1| GENE 4 2446 - 2841 
342 131 aa, chain + ## HITS:1 COG:aq_1368 KEGG:ns NR:ns ## COG: aq_1368 COG0615 # Protein_GI_number: 15606564 # Func_class: M Cell wall/membrane/envelope biogenesis; I Lipid transport and metabolism # Function: Cytidylyltransferase # Organism: Aquifex aeolicus # 2 127 6 130 168 136 52.0 1e-32 MKTVITFGTFDVFHVGHLRLLQRARALGERLLVGVSSDALNIAKKGRAPVYHQDDRMAII AGLACVDGVFLEESLEQKAEYLRGYSADILVMGDDWAGKFDSFAYICEVVYFPRTPSVST TGIIEVIRGLL >gi|296494527|gb|ADTN01000211.1| GENE 5 2959 - 5403 1970 814 aa, chain - ## HITS:1 COG:YPO1382 KEGG:ns NR:ns ## COG: YPO1382 COG3306 # Protein_GI_number: 16121662 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase involved in LPS biosynthesis # Organism: Yersinia pestis # 258 457 6 213 224 113 32.0 2e-24 MPTPTIIKTLNTHREKPFVSVVTPTWNRGAFLPYLLYMYRYQDYPADRRELIILDDSPQS HQHIIDRLTNGTPEAFNIRYIHHPEKLPLGKKRNMLNELARGEYILCMDDDDYYPADKIS YTIAMMQKHRALISGSDQIPIWYSHINRIFQSRSFGPHNILNGTFCYHRNYLKKHRYDDD CNLGEEKRFTDNFSVNPLQLPGERTILCISHSHNTFDKDFILGASTPVNAALTDIVRDPL LRNAYLSLHNATHHQAINHQAIDQIVLLNLDKRPDRLQQIREELALLHIPPEKITRLAAS EDENGQRGRRQSHLQALRLAQQHGWQNYLLLEDDAVILKQEKHIQVLNTLLASLAKIPWQ VMILGGEISQGTMLKSLPGLVHARDCRKVCAYLVNSRYYPQLAQQMSNDEHSLEACWQPL LRADKWLACYPSLCYQRPGFSDIEKKTTDNIAHYFNKLPVATKPSTLPIADTIGFFMETS FHYTLYRPIITALQAQGQSCTLVINDRVFKPFLDEMLETLKNIDDPQLKGMRLSEMQTHG QRVKCLVSPYHTPALNGLAAVNIRAMYGLAKETWNHADWNRFYQSILCYSHYSQQALAHF GSAKVVGNPRFDACHNGTFDRALPENIQSDYRKPTVLYAPTFGALSSLPHWAEKLGRLSG DVNLICKLHHGTCSRPEEAASLALVRRHLKQRTDSARHTLALLAKADYVLTDNSGFIFDA IHVDKRVILLDFPGMNDLLDGEKSYSTAESADQRIREILPVAHDVAELRYLLSEAFDWGS VQARLTEIRHHYCDAFMDGKAGERAAIVIVEALG Prediction of potential genes in microbial genomes Time: Sun May 15 23:50:55 2011 Seq name: gi|296494526|gb|ADTN01000212.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont447.3, whole genome shotgun sequence Length of sequence - 3525 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score 
pairs(N/Pv) 1 1 Tu 1 . - CDS 33 - 962 749 ## EcolC_3375 hypothetical protein - Prom 997 - 1056 1.7 2 2 Op 1 . - CDS 1090 - 1518 375 ## EcolC_3374 hypothetical protein 3 2 Op 2 . - CDS 1531 - 1809 272 ## ECO111_0270 putative lateral flagellar anti-sigma factor 28 protein 4 2 Op 3 . - CDS 1890 - 2627 451 ## COG1261 Flagellar basal body P-ring biosynthesis protein - Prom 2832 - 2891 2.9 + Prom 2589 - 2648 2.4 5 3 Op 1 24/0.000 + CDS 2709 - 3044 312 ## COG1815 Flagellar basal body protein 6 3 Op 2 . + CDS 3047 - 3478 509 ## COG1558 Flagellar basal body rod protein Predicted protein(s) >gi|296494526|gb|ADTN01000212.1| GENE 1 33 - 962 749 309 aa, chain - ## HITS:1 COG:no KEGG:EcolC_3375 NR:ns ## KEGG: EcolC_3375 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 309 1 309 309 543 96.0 1e-153 MSFIECYEPEYVRNFMTQYPGSPLYVRKVWKNEIRLSLMLTEPESCEAMLNDVRAFDLQL TYRPVEENAAELSARDAIVNQVILSTLTLRDLTPELSLYAVGILLSRASKMPGRDGDILA RLTTLPQALADHAKKGILQAQFAQLPPVPQLARHLATLLGSFTFDWSILPESPRKTSLPL QMPLLTLHDANSEALLQQQLQTQWQTTWQQHFATAPWMMRNWLIYRVYHDVIGQTDGADY FPLVCDFYLLRTLISLWTLDGSSLRQEDIFALFAMFERWRASENALLVRQQIQSLCAADP LLSAFSLLT >gi|296494526|gb|ADTN01000212.1| GENE 2 1090 - 1518 375 142 aa, chain - ## HITS:1 COG:no KEGG:EcolC_3374 NR:ns ## KEGG: EcolC_3374 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_ATCC8739 # Pathway: Flagellar assembly [PATH:ecl02040] # 1 142 1 142 142 236 97.0 2e-61 MSIRLQQVKALLQGIRDDDALYDSLRERLQRQRICMIRRASEELLAVNDEITHHYEQLHS HSHQRHSLLKMLGVSVNRDGLAQVFAWLPAVQKAAAQQLWQHLEQKAERCKAYNDKNGEL LIRQYEFIQSFLGSEADFLYQE >gi|296494526|gb|ADTN01000212.1| GENE 3 1531 - 1809 272 92 aa, chain - ## HITS:1 COG:no KEGG:ECO111_0270 NR:ns ## KEGG: ECO111_0270 # Name: lfgM # Def: putative lateral flagellar anti-sigma factor 28 protein # Organism: E.coli_O111_H- # Pathway: Flagellar assembly [PATH:eoi02040] # 1 92 1 92 92 99 97.0 5e-20 MKITPTMPGNRPTTTTAGSSQPRKTTATTTTTLSADDITQAGLQSAQQTLNDQQESDIDS NKVAQVQAMLASGSPQVDTEQLASDMLSFFQN 
>gi|296494526|gb|ADTN01000212.1| GENE 4 1890 - 2627 451 245 aa, chain - ## HITS:1 COG:YPO0721 KEGG:ns NR:ns ## COG: YPO0721 COG1261 # Protein_GI_number: 16121040 # Func_class: N Cell motility; O Posttranslational modification, protein turnover, chaperones # Function: Flagellar basal body P-ring biosynthesis protein # Organism: Yersinia pestis # 24 239 12 229 237 141 33.0 1e-33 MSFQFCRRKFYFLLALALPGYAAVHPVQHSAREQVNAQVLNAASQKIESLAQQRQWHDYH YTFKVYIPSQIATAAPCATTPGVTLTSPAEIALNRMNFTVSCPQSWQMNVAVRPDVLVPV VMTKSLVARDTPLTANDVELKPYNVSAQRREVLMELDDAIGFSSKHALQPGRPITKEELV SPVLVGRDQPVMIVYQSAGITASMPGVALKNGRKGEMVKVRNASSQRVISAMVAESGVVT TVSAE >gi|296494526|gb|ADTN01000212.1| GENE 5 2709 - 3044 312 111 aa, chain + ## HITS:1 COG:YPO0722 KEGG:ns NR:ns ## COG: YPO0722 COG1815 # Protein_GI_number: 16121041 # Func_class: N Cell motility # Function: Flagellar basal body protein # Organism: Yersinia pestis # 1 111 1 148 148 125 50.0 2e-29 MGISFQQALGVHPQAVKLRLERTELLTANLANVDTPNFKAKDIDFAREMQRANNAAVDVQ YRVPMQPSEDGNTVELNSEQARFSQNSMDYQSSLTFLNLQISGIREAIEGK >gi|296494526|gb|ADTN01000212.1| GENE 6 3047 - 3478 509 143 aa, chain + ## HITS:1 COG:YPO0723 KEGG:ns NR:ns ## COG: YPO0723 COG1558 # Protein_GI_number: 16121042 # Func_class: N Cell motility # Function: Flagellar basal body rod protein # Organism: Yersinia pestis # 1 142 1 140 141 160 66.0 9e-40 MSFTDIYQISGSAMTAQTIRLNTVASNLANAEAPAGSEAQAYKARSPVFAAVYHHSLLAG THRHAIDAASVQVQDVLQTGGAVKRYEPHSPLADANGDVWYPDVNVVEQMADMMSASRDF ETNVDVLNNVKSMQQSLLKLGEA Prediction of potential genes in microbial genomes Time: Sun May 15 23:51:04 2011 Seq name: gi|296494525|gb|ADTN01000213.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont456.1, whole genome shotgun sequence Length of sequence - 2909 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 905 801 ## ECO26_p2-70 conjugal transfer pilus assembly protein TraH 2 1 Op 2 . 
+ CDS 902 - 2909 1350 ## APECO1_O1CoBM53 hypothetical protein Predicted protein(s) >gi|296494525|gb|ADTN01000213.1| GENE 1 3 - 905 801 300 aa, chain + ## HITS:1 COG:no KEGG:ECO26_p2-70 NR:ns ## KEGG: ECO26_p2-70 # Name: not_defined # Def: conjugal transfer pilus assembly protein TraH # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 300 158 457 457 585 99.0 1e-166 IGGLFPRTQVSQQKVCQDIAGESNIFADWAASRQGCTVGGKSDSVRDKASDKDKERVTKN INIMWNALSKNRMFDGNKELKEFVMTLTGSLVFGPNGEITPLSARTTDRSIIRAMMEGGT AKIYHCNDSDKCLKVVADTPVTISRDNALKSQITKLLASIQNKAVSDTPLDDKEKGFISS TTIPVFKYLVDPQMLGVSNSMIYQLTDYIGYDILLQYIQELIQQARAMVATGNYDEAVIG HINDNMNDATRQIAAFQSQVQVQQDALLVVDRQMSYMRQQLSARMLSRYQNNYHFGGSTL >gi|296494525|gb|ADTN01000213.1| GENE 2 902 - 2909 1350 669 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM53 NR:ns ## KEGG: APECO1_O1CoBM53 # Name: traG # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 669 458 1126 1398 1269 99.0 0 MNEVYVIAGGEWLRNNLNAIAAFMGTRTWDSIEKIALTLSVLAVAVMWVQRHNVMDLLGW VAVFVLISLLVNVRTSVQIIDNSDLVKVHRVDNVPVGLAMPLSLTTRIGHAMVASYEMIF TQPDSVTYSKTGMLFGAEMVSKSTDFLSRNPEIANIFQDYVQNCVMGDIYLNHKYTLEEL MASADPYTLIFSRPSPLRGVYDSNNNFVTCKDASVSLKDKLNLDTQSGGKTWHYYAQQLF GGRPDPNLLFSTLIGDSYSYFYGSSKSASQIIRQNVTINALKEGITSYAARNGDSASLVN LATTSSMEKQRLAHVSIGHVAMRTLPMTQTILTGIAIGIFPLLVLAAVFNKLTLSVLKGY VFALMWLQSWPMLYAILNSAMTFYAKQNGAPVVLSEISQIQLKYSDLASTAGYLSMMIPP LSWMMVRGLGAGFSSVYSHFASSAISPTASAAGSVVDGNYSYGNMQTENVNGFSWSTNST TSFGQMMYQTGSGATATQTRDGNMVMDASGAMSRLPVGINTTRQIAAAQQEMAREASNRA ESALHGFSSSIASAWNTLSQFGSNRGSSDSVTGGADSTMSAQDSMMASRMRSAVESYAKA HNISNEQATRELASRSTTASAGLYGDAHAGLDAKPSLFGVRVDVGAKVGGRASIDGSDLD SHEASSGSR Prediction of potential genes in microbial genomes Time: Sun May 15 23:51:15 2011 Seq name: gi|296494524|gb|ADTN01000214.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont456.2, whole genome shotgun sequence Length of sequence - 487 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S 
Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 485 279 ## APECO1_O1CoBM53 hypothetical protein Predicted protein(s) >gi|296494524|gb|ADTN01000214.1| GENE 1 2 - 485 279 161 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM53 NR:ns ## KEGG: APECO1_O1CoBM53 # Name: traG # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 161 1134 1294 1398 301 100.0 7e-81 DIDARASKDFKEASDYFTSRKVSESGSHTDNNADSRVDQLSAALNSAKQSYDQYTTNMTR SHEYAEMASRTESMSGQMSEDLSQQFAQYVMKHAPQDAEAILTNTSSPEIAERRRAMAWS FVQEQVQPGVDNAWRESRGDIGKGMESVPSGGGSQDIIADH Prediction of potential genes in microbial genomes Time: Sun May 15 23:51:21 2011 Seq name: gi|296494523|gb|ADTN01000215.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont457.1, whole genome shotgun sequence Length of sequence - 7871 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 1, operones - 1 average op.length - 8.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 18 - 77 1.8 1 1 Op 1 3/0.000 + CDS 97 - 693 359 ## COG0582 Integrase 2 1 Op 2 4/0.000 + CDS 1175 - 1723 452 ## COG3539 P pilus assembly protein, pilin FimA 3 1 Op 3 7/0.000 + CDS 1788 - 2327 255 ## COG3539 P pilus assembly protein, pilin FimA 4 1 Op 4 10/0.000 + CDS 2364 - 3089 519 ## COG3121 P pilus assembly protein, chaperone PapD 5 1 Op 5 6/0.000 + CDS 3156 - 5792 1342 ## COG3188 P pilus assembly protein, porin PapC 6 1 Op 6 4/0.000 + CDS 5862 - 6332 101 ## COG3539 P pilus assembly protein, pilin FimA 7 1 Op 7 . + CDS 6345 - 6848 241 ## COG3539 P pilus assembly protein, pilin FimA 8 1 Op 8 . 
+ CDS 6868 - 7770 504 ## ECUMN_4927 minor component of type 1 fimbriae Predicted protein(s) >gi|296494523|gb|ADTN01000215.1| GENE 1 97 - 693 359 198 aa, chain + ## HITS:1 COG:fimE KEGG:ns NR:ns ## COG: fimE COG0582 # Protein_GI_number: 16132134 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Escherichia coli K12 # 1 198 1 198 198 367 100.0 1e-102 MSKRRYLTGKEVQAMMQAVCYGATGARDYCLILLAYRHGMRISELLDLHYQDLDLNEGRI NIRRLKNGFSTVHPLRFDEREAVERWTQERANWKGADRTDAIFISRRGSRLSRQQAYRII RDAGIEAGTVTQTHPHMLRHACGYELAERGADTRLIQDYLGHRNIRHTVRYTASNAARFA GLWERNNLINEKLKREEV >gi|296494523|gb|ADTN01000215.1| GENE 2 1175 - 1723 452 182 aa, chain + ## HITS:1 COG:ECs5273 KEGG:ns NR:ns ## COG: ECs5273 COG3539 # Protein_GI_number: 15834527 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli O157:H7 # 1 182 1 182 182 223 96.0 1e-58 MKIKTLAIVVLSALSLSSTAALAAATTVNGGTVHFKGEVVNAACAVDAGSVDQTVQLGQV RTASLAQEGATSSAVGFNIQLNDCDTNVASKAAVAFLGTAIDAGHTNVLALQSSAAGSAT NVGVQILDRTGAALTLDGATFSSETTLNNGTNTIPFQARYFATGAATPGAANADATFKVQ YQ >gi|296494523|gb|ADTN01000215.1| GENE 3 1788 - 2327 255 179 aa, chain + ## HITS:1 COG:fimI KEGG:ns NR:ns ## COG: fimI COG3539 # Protein_GI_number: 16132136 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 179 37 215 215 370 100.0 1e-102 MKRKRLFLLASLLPMFALAGNKWNTTLPGGNMQFQGVIIAETCRIEAGDKQMTVNMGQIS SNRFHAVGEDSAPVPFVIHLRECSTVVSERVGVAFHGVADGKNPDVLSVGEGPGIATNIG VALFDDEGNLVPINRPPANWKRLYSGSTSLHFIAKYRATGRRVTGGIANAQAWFSLTYQ >gi|296494523|gb|ADTN01000215.1| GENE 4 2364 - 3089 519 241 aa, chain + ## HITS:1 COG:ECs5275 KEGG:ns NR:ns ## COG: ECs5275 COG3121 # Protein_GI_number: 15834529 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, 
chaperone PapD # Organism: Escherichia coli O157:H7 # 1 241 1 241 241 456 99.0 1e-128 MSNKNVNVRKSQEITFCLLAGILMFMAMMVAGRAEAGVALGATRVIYPAGQKQVQLAVTN NDENSTYLIQSWVENADGVKDGRFIVTPPLFAMKGKKENTLRILDATNNQLPQDRESLFW MNVKAIPSMDKSKLTENTLQLAIISRIKLYYRPAKLALPPDQAAEKLRFRRSANSLTLIN PTPYYLTVTELNAGTRVLENALVPPMGESTVKLPSDAGSNITYRTINDYGALTPKMTGVM E >gi|296494523|gb|ADTN01000215.1| GENE 5 3156 - 5792 1342 878 aa, chain + ## HITS:1 COG:fimD KEGG:ns NR:ns ## COG: fimD COG3188 # Protein_GI_number: 16132138 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli K12 # 1 878 1 878 878 1727 99.0 0 MSYLNLRLYQRNTQCLHIRKHRLAGFFVRLVVACAFAAQAPLSSADLYFNPRFLADDPQA VADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLN TASVAGMNLLADDACVPLTTMVQDATAHLDVGQQRLNLTIPQAFMSNRARGYIPPELWDP GINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDRSSGSK NKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPV IHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTV PYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRY RAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYR YSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRT STLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNI PFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGD GNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVL VKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP TRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLA GKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR >gi|296494523|gb|ADTN01000215.1| GENE 6 5862 - 6332 101 156 aa, chain + ## HITS:1 COG:fimF KEGG:ns NR:ns ## COG: fimF COG3539 # Protein_GI_number: 16132139 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 156 21 176 176 276 99.0 1e-74 
MAADSTITIRGYVRDNGCSVAAESTNFTVDLMENAAKQFNNIGATTPVVPFRILLSPCGN AVSAVKVGFTGVADSHNANLLALENTVSAASGLGIQLLNEQQNQIPLNAPSSALSWTTLT PGKPNTLNFYARLMATQVPVTAGHINATATFTLEYQ >gi|296494523|gb|ADTN01000215.1| GENE 7 6345 - 6848 241 167 aa, chain + ## HITS:1 COG:fimG KEGG:ns NR:ns ## COG: fimG COG3539 # Protein_GI_number: 16132140 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 167 1 167 167 255 100.0 3e-68 MKWCKRGYVLAAILALASATIQAADVTITVNGKVVAKPCTVSTTNATVDLGDLYSFSLMS AGAASAWHDVALELTNCPVGTSRVTASFSGAADSTGYYKNQGTAQNIQLELQDDSGNTLN TGATKTVQVDDSSQSAHFPLQVRALTVNGGATQGTIQAVISITYTYS >gi|296494523|gb|ADTN01000215.1| GENE 8 6868 - 7770 504 300 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_4927 NR:ns ## KEGG: ECUMN_4927 # Name: fimH # Def: minor component of type 1 fimbriae # Organism: E.coli_UMN026 # Pathway: not_defined # 1 300 1 300 300 528 100.0 1e-149 MKRVITLFAVLLMGWSVNAWSFACKTANGTAIPIGGGSANVYVNLAPVVNVGQNLVVDLS TQIFCHNDYPETITDYVTLQRGSAYGGVLSNFSGTVKYSGSSYPFPTTSETPRVVYNSRT DKPWPVALYLTPVSSAGGVAIKAGSLIAVLILRQTNNYNSDDFQFVWNIYANNDVVVPTG GCDVSARDVTVTLPDYPGSVPIPLTVYCAKSQNLGYYLSGTTADAGNSIFTNTASFSPAQ GVGVQLTRNGTIIPANNTVSLGAVGTSAVSLGLTANYARTGGQVTAGNVQSIIGVTFVYQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:51:34 2011 Seq name: gi|296494522|gb|ADTN01000216.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont457.2, whole genome shotgun sequence Length of sequence - 32236 bp Number of predicted genes - 26, with homology - 25 Number of transcription units - 16, operones - 6 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 94 - 1437 1672 ## COG2610 H+/gluconate symporter and related permeases - Prom 1462 - 1521 2.4 + Prom 1600 - 1659 3.3 2 2 Op 1 7/0.000 + CDS 1777 - 2961 1545 ## COG1312 D-mannonate dehydratase + Term 2968 - 3004 8.6 3 2 Op 2 . 
+ CDS 3042 - 4502 1866 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases + Term 4527 - 4570 8.6 4 3 Tu 1 . - CDS 4524 - 4718 61 ## ECBD_3711 hypothetical protein - Prom 4773 - 4832 2.6 + Prom 4560 - 4619 5.1 5 4 Tu 1 . + CDS 4717 - 5490 966 ## COG2186 Transcriptional regulators + Term 5504 - 5541 4.1 - Term 5506 - 5554 0.3 6 5 Tu 1 . - CDS 5631 - 6461 360 ## B21_04156 hypothetical protein - Prom 6599 - 6658 6.6 + Prom 6461 - 6520 5.5 7 6 Tu 1 . + CDS 6715 - 6816 72 ## + Term 6910 - 6949 2.1 + Prom 6844 - 6903 5.5 8 7 Tu 1 . + CDS 7134 - 7526 192 ## B21_04157 hypothetical protein 9 8 Op 1 . - CDS 7519 - 8430 819 ## COG0583 Transcriptional regulator 10 8 Op 2 . - CDS 8495 - 9667 1334 ## B21_04159 hypothetical protein 11 8 Op 3 3/0.400 - CDS 9680 - 10141 557 ## COG0700 Uncharacterized membrane protein 12 8 Op 4 . - CDS 10138 - 10833 634 ## COG3314 Uncharacterized protein conserved in bacteria - Prom 11032 - 11091 4.2 + Prom 10753 - 10812 3.3 13 9 Op 1 . + CDS 10851 - 11042 58 ## ECH74115_5839 hypothetical protein 14 9 Op 2 . + CDS 11071 - 11625 451 ## COG1859 RNA:NAD 2'-phosphotransferase + Term 11701 - 11750 0.1 - Term 11518 - 11566 1.8 15 10 Tu 1 2/0.800 - CDS 11638 - 12759 774 ## COG0477 Permeases of the major facilitator superfamily - Prom 12809 - 12868 1.9 16 11 Op 1 . - CDS 12884 - 13744 506 ## COG3204 Uncharacterized protein conserved in bacteria 17 11 Op 2 . - CDS 13809 - 14066 220 ## ECUMN_4944 hypothetical protein 18 11 Op 3 4/0.200 - CDS 14063 - 14830 409 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) 19 11 Op 4 1/0.800 - CDS 14840 - 15991 1428 ## COG1775 Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB - Prom 16029 - 16088 1.8 20 12 Op 1 4/0.200 - CDS 16107 - 17387 1078 ## COG2733 Predicted membrane protein 21 12 Op 2 . - CDS 17428 - 18660 1133 ## COG0477 Permeases of the major facilitator superfamily - Prom 18895 - 18954 4.5 + Prom 18876 - 18935 4.3 22 13 Tu 1 . 
+ CDS 19139 - 20059 543 ## COG5464 Uncharacterized conserved protein - Term 20072 - 20115 0.5 23 14 Tu 1 . - CDS 20303 - 21715 1201 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs - Prom 21758 - 21817 2.0 + Prom 21758 - 21817 3.8 24 15 Tu 1 . + CDS 21892 - 22056 154 ## COG5457 Uncharacterized conserved small protein + Term 22199 - 22253 14.8 + Prom 22380 - 22439 5.5 25 16 Op 1 . + CDS 22555 - 25845 1327 ## ECUMN_4952 hypothetical protein 26 16 Op 2 . + CDS 25858 - 32202 4304 ## COG1205 Distinct helicase family with a unique C-terminal domain including a metal-binding cysteine cluster Predicted protein(s) >gi|296494522|gb|ADTN01000216.1| GENE 1 94 - 1437 1672 447 aa, chain - ## HITS:1 COG:ECs5280 KEGG:ns NR:ns ## COG: ECs5280 COG2610 # Protein_GI_number: 15834534 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Escherichia coli O157:H7 # 1 447 1 447 447 704 100.0 0 MHVLNILWVVFGIGLMLVLNLKFKINSMVALLVAALSVGMLAGMDLMSLLHTMKAGFGNT LGELAIIVVFGAVIGKLMVDSGAAHQIAHTLLARLGLRYVQLSVIIIGLIFGLAMFYEVA FIMLAPLVIVIAAEAKIPFLKLAIPAVAAATTAHSLFPPQPGPVALVNAYGADMGMVYIY GVLVTIPSVICAGLILPKFLGNLERPTPSFLKADQPVDMNNLPSFGVSILVPLIPAIIMI STTIANIWLVKDTPAWEVVNFIGSSPIAMFIAMVVAFVLFGTARGHDMQWVMNAFESAVK SIAMVILIIGAGGVLKQTIIDTGIGDTIGMLMSHGNISPYIMAWLITVLIRLATGQGVVS AMTAAGIISAAILDPATGQLVGVNPALLVLATAAGSNTLTHINDASFWLFKGYFDLSVKD TLKTWGLLELVNSVVGLIIVLIISMVA >gi|296494522|gb|ADTN01000216.1| GENE 2 1777 - 2961 1545 394 aa, chain + ## HITS:1 COG:uxuA KEGG:ns NR:ns ## COG: uxuA COG1312 # Protein_GI_number: 16132143 # Func_class: G Carbohydrate transport and metabolism # Function: D-mannonate dehydratase # Organism: Escherichia coli K12 # 1 394 1 394 394 815 100.0 0 MEQTWRWYGPNDPVSLADVRQAGATGVVTALHHIPNGEVWSVEEILKRKAIIEDAGLVWS VVESVPIHEDIKTHTGNYEQWIANYQQTLRNLAQCGIRTVCYNFMPVLDWTRTDLEYVLP 
DGSKALRFDQIEFAAFEMHILKRPGAEADYTEEEIAQAAERFATMSDEDKARLTRNIIAG LPGAEEGYTLDQFRKHLELYKDIDKAKLRENFAVFLKAIIPVAEEVGVRMAVHPDDPPRP ILGLPRIVSTIEDMQWMVDTVNSMANGFTMCTGSYGVRADNDLVDMIKQFGPRIYFTHLR STMREDNPKTFHEAAHLNGDVDMYEVVKAIVEEEHRRKAEGKEDLIPMRPDHGHQMLDDL KKKTNPGYSAIGRLKGLAEVRGVELAIQRAFFSR >gi|296494522|gb|ADTN01000216.1| GENE 3 3042 - 4502 1866 486 aa, chain + ## HITS:1 COG:uxuB KEGG:ns NR:ns ## COG: uxuB COG0246 # Protein_GI_number: 16132144 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Escherichia coli K12 # 1 486 1 486 486 1014 100.0 0 MTTIVDSNLPVARPSWDHSRLESRIVHLGCGAFHRAHQALYTHHLLESTDSDWGICEVNL MPGNDRVLIENLKKQQLLYTVAEKGAESTELKIIGSMKEALHPEIDGCEGILNAMARPQT AIVSLTVTEKGYCADAASGQLDLNNPLIKHDLENPTAPKSAIGYIVEALRLRREKGLKAF TVMSCDNVRENGHVAKVAVLGLAQARDPQLAAWIEENVTFPCTMVDRIVPAATPETLQEI ADQLGVYDPCAIACEPFRQWVIEDNFVNGRPDWDKVGAQFVADVVPFEMMKLRMLNGSHS FLAYLGYLGGYETIADTVTNPAYRKAAFALMMQEQAPTLSMPEGTDLNAYATLLIERFSN PSLRHRTWQIAMDGSQKLPQRLLDPVRLHLQNGGSWRHLALGVAGWMRYTQGVDEQGNAI DVVDPMLAEFQKINAQYQGADRVKALLGLSGIFADDLPQNADFVGAVTAAYQQLCERGAR ECVAAL >gi|296494522|gb|ADTN01000216.1| GENE 4 4524 - 4718 61 64 aa, chain - ## HITS:1 COG:no KEGG:ECBD_3711 NR:ns ## KEGG: ECBD_3711 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_BL21_DE3 # Pathway: not_defined # 1 64 1 64 64 122 100.0 4e-27 MVLCPLGAVNQIFAGCGVNALSGLQRAILRPPLISNYTISTGSWHVQVGKKTKTGQPNIS LIDQ >gi|296494522|gb|ADTN01000216.1| GENE 5 4717 - 5490 966 257 aa, chain + ## HITS:1 COG:uxuR KEGG:ns NR:ns ## COG: uxuR COG2186 # Protein_GI_number: 16132145 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 257 1 257 257 498 100.0 1e-141 MKSATSAQRPYQEVGAMIRDLIIKTPYNPGERLPPEREIAEMLDVTRTVVREALIMLEIK GLVEVRRGAGIYVLDNSGSQNTDSPDANVCNDAGPFELLQARQLLESNIAEFAALQATRE DIVKMRQALQLEERELASSAPGSSESGDMQFHLAIAEATHNSMLVELFRQSWQWRENNPM WIQLHSHLDDSLYRKEWLGDHKQILAALIKKDARAAKLAMWQHLENVKQRLLEFSNVDDI YFDGYLFDSWPLDKVDA 
>gi|296494522|gb|ADTN01000216.1| GENE 6 5631 - 6461 360 276 aa, chain - ## HITS:1 COG:no KEGG:B21_04156 NR:ns ## KEGG: B21_04156 # Name: yjiC # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 276 1 276 276 569 100.0 1e-161 MSMPLSNALQSQIITDAHFLHHPIVDSEFTRKLKYARMDSENIYLPPLTRGNNHNYDGKS VVEIRKLDISKEPWPFNYVTGACRESDGITTTGRMLYRNLKITSALDEIYGGICKKAHAT TELAEGLRLNLFMKSPFDPVEDYTVHEITLGPGCNVPGYAGTTIGYISTLPASQAKRWTN EQPRIDIYIDQIMTVTGVANSSGFALAALLNANIELGNDPIIGIEAYPGTAEIHAKMGYK VIPGDENAPLKRMTLQPSSLPELFELKNGEWNYIGK >gi|296494522|gb|ADTN01000216.1| GENE 7 6715 - 6816 72 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPVNGIFDVFDMLSIYIIYKLIVSNNTWLIMRK >gi|296494522|gb|ADTN01000216.1| GENE 8 7134 - 7526 192 130 aa, chain + ## HITS:1 COG:no KEGG:B21_04157 NR:ns ## KEGG: B21_04157 # Name: yjiD # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 130 1 130 130 252 100.0 2e-66 MMRQSLQAVLPEISGNKTSSLRKSVCSDLLTLFNSPHSALPSLLVSGMPEWQVHNPSDKH LQSWYCRQLRSALLFHEPRIAALQVNLKEAYCHTLAISLEIMLYHDDEPLTFDLVWDNGG WRSATLENVS >gi|296494522|gb|ADTN01000216.1| GENE 9 7519 - 8430 819 303 aa, chain - ## HITS:1 COG:yjiE KEGG:ns NR:ns ## COG: yjiE COG0583 # Protein_GI_number: 16132148 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 303 1 303 303 606 100.0 1e-173 MDDCGAILHNIETKWLYDFLTLEKCRNFSQAAVSRNVSQPAFSRRIRALEQAIGVELFNR QVTPLQLSEQGKIFHSQIRHLLQQLESNLAELRGGSDYAQRKIKIAAAHSLSLGLLPSII SQMPPLFTWAIEAIDVDEAVDKLREGQSDCIFSFHDEDLLEAPFDHIRLFESQLFPVCAS DEHGEALFNLAQPHFPLLNYSRNSYMGRLINRTLTRHSELSFSTFFVSSMSELLKQVALD GCGIAWLPEYAIQQEIRSGKLVVLNRDELVIPIQAYAYRMNTRMNPVAERFWRELRELEI VLS >gi|296494522|gb|ADTN01000216.1| GENE 10 8495 - 9667 1334 390 aa, chain - ## HITS:1 COG:no KEGG:B21_04159 NR:ns ## KEGG: B21_04159 # Name: iadA # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 390 1 390 390 761 100.0 0 MIDYTAAGFTLLQGAHLYAPEDRGICDVLVANGKIIAVASNIPSDIVPNCTVVDLSGQIL 
CPGFIDQHVHLIGGGGEAGPTTRTPEVALSRLTEAGVTSVVGLLGTDSISRHPESLLAKT RALNEEGISAWMLTGAYHVPSRTITGSVEKDVAIIDRVIGVKCAISDHRSAAPDVYHLAN MAAESRVGGLLGGKPGVTVFHMGDSKKALQPIYDLLENCDVPISKLLPTHVNRNVPLFEQ ALEFARKGGTIDITSSIDEPVAPAEGIARAVQAGIPLARVTLSSDGNGSQPFFDDEGNLT HIGVAGFETLLETVQVLVKDYDFSISDALRPLTSSVAGFLNLTGKGEILPGNDADLLVMT PELRIEQVYARGKLMVKDGKACVKGTFETA >gi|296494522|gb|ADTN01000216.1| GENE 11 9680 - 10141 557 153 aa, chain - ## HITS:1 COG:ECs5287 KEGG:ns NR:ns ## COG: ECs5287 COG0700 # Protein_GI_number: 15834541 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Escherichia coli O157:H7 # 1 153 1 153 153 261 100.0 4e-70 MTTQVRKNVMDMFIDGARRGFTIATTNLLPNVVMAFVIIQALKITGLLDWVGHICEPVMA LWGLPGEAATVLLAALMSMGGAVGVAASLATAGALTGHDVTVLLPAMYLMGNPVQNVGRC LGTAEVNAKYYPHIITVCVINALLSIWVMQLIV >gi|296494522|gb|ADTN01000216.1| GENE 12 10138 - 10833 634 231 aa, chain - ## HITS:1 COG:ECs5288 KEGG:ns NR:ns ## COG: ECs5288 COG3314 # Protein_GI_number: 15834542 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 231 1 231 231 399 100.0 1e-111 MGIVMTQQGDAVAGELATEKVGIKGYLAFFLTIIFFSGVFSGTDSWWRVFDFSVLNGSFG QLPGANGATTSFRGAGGAGAKDGFLFALELAPSVILSLGIISITDGLGGLRAAQQLMTPV LKPLLGIPGICSLALIANLQNTDAAAGMTKELAQEGEITERDKVIFAAYQTSGSAIITNY FSSGVAVFAFLGTSVIVPLAVILVFKFVGANILRVWLNFEERRNPTQGAQA >gi|296494522|gb|ADTN01000216.1| GENE 13 10851 - 11042 58 63 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_5839 NR:ns ## KEGG: ECH74115_5839 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 63 1 63 63 114 95.0 1e-24 MLLEPGKFLELKNLISHVGMGKMQVCFWLCVFCIAEKLRCVEKALRRIELQKIHKPSDND ISK >gi|296494522|gb|ADTN01000216.1| GENE 14 11071 - 11625 451 184 aa, chain + ## HITS:1 COG:kptA KEGG:ns NR:ns ## COG: kptA COG1859 # Protein_GI_number: 16132152 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNA:NAD 2'-phosphotransferase # Organism: Escherichia coli 
K12 # 1 184 35 218 218 366 100.0 1e-101 MAKYNEKELADTSKFLSFVLRHKPEAIGIVLDREGWADIDKLILCAQKAGKRLTRALLDT VVATSDKKRFSYSSDGRCIRAVQGHSTSQVAISFAEKTPPQFLYHGTASRFLDEIKKQGL IAGERHYVHLSADEATARKVGARHGSPVILTVKAQEMAKRGLPFWQAENGVWLTSTVAVE FLEW >gi|296494522|gb|ADTN01000216.1| GENE 15 11638 - 12759 774 373 aa, chain - ## HITS:1 COG:yjiJ KEGG:ns NR:ns ## COG: yjiJ COG0477 # Protein_GI_number: 16132153 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 373 20 392 392 560 99.0 1e-159 MLVLTLGMGLGRFLYTPMLPVMMAEGSFSFSQLSWIASGNYAGYLAGSLLFSFGAFHQPS RLRPFLLASALASGLLILAMAWLPPFILVLLIRVLAGVASAGMLIFGSTLIMQHTRHPFV LAALFSGVGIGIALGNEYVLAGLHFDLSSQTLWQGAGALSGMMLIALTLLMPSKKHAITT MPLAKTEQQIMSWWLLAILYGLAGFGYIIVATYLPLMAKDAGSPLLTAHLWTLVGLSIVP GCFGWLWAAKRWGALPCLTANLLVQAICVLLTLASDSPLLLIISSLGFGGTFMGTTSLVM TIARQLSVPGNLNLLGFVTLIYGIGQILGPALTSMLSNGTSALASATLCGAAALFIAALI STVQLFKLQVVTS >gi|296494522|gb|ADTN01000216.1| GENE 16 12884 - 13744 506 286 aa, chain - ## HITS:1 COG:yjiK KEGG:ns NR:ns ## COG: yjiK COG3204 # Protein_GI_number: 16132154 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 286 38 323 323 557 99.0 1e-159 MTKSISLSKRISVIVILFAIVAVCTFFVQSCARKSNHAASFQNYHATIDGKEIAGITNNI SSLTWSAQSNTLFSTINKPAAIVEMTTNGDFIRTIPLDFVKDLETIEYIGDNQFVISDER DYAIYVISLTPNSEVKILKKIKIPLQDSPTNCGFEGLAYSRQDHTFWFFKEKNPIEVYKV NGLLSSNELHISKDEALQRQFTLDDVSGAEFNQQKNTLLVLSHESRALQEVTLVGEVIGE MSLTKGSRGLSHNIKQAEGVAMDASGNIYIVSEPNRFYRFTPQSSH >gi|296494522|gb|ADTN01000216.1| GENE 17 13809 - 14066 220 85 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_4944 NR:ns ## KEGG: ECUMN_4944 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 85 1 85 85 176 100.0 2e-43 MKEFLFLFHSTVGVIQTRKALQAAGMTFRVSDIPRDLRGGCGLCIWLTCPPGEEIQWVIP 
GLTESIYCQQDGVWRCIAHYGVSPR >gi|296494522|gb|ADTN01000216.1| GENE 18 14063 - 14830 409 255 aa, chain - ## HITS:1 COG:yjiL KEGG:ns NR:ns ## COG: yjiL COG1924 # Protein_GI_number: 16132155 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Escherichia coli K12 # 1 255 3 257 257 490 99.0 1e-139 MAYSIGIDSGSTATKGILLADGVITRRFLVPTPFRPATAITEAWETLREGLETTPFLTLT GYGRQLVDFADKQVTEISCHGLGARFLAPATRAVIDIGGQDSKVIQLDDDGNLCDFLMND KCAAGTGRFLEVISRTLGTSVEQLDSITENVTPHAITSMCTVFAESEAISLRSAGVAPEA ILAGVINAMARRSANFIARLSCEAPILFTGGVSHCQKFARMLESHLRMPVNTHPDAQFAG AIGAAVIGQRVRTRR >gi|296494522|gb|ADTN01000216.1| GENE 19 14840 - 15991 1428 383 aa, chain - ## HITS:1 COG:yjiM KEGG:ns NR:ns ## COG: yjiM COG1775 # Protein_GI_number: 16132156 # Func_class: E Amino acid transport and metabolism # Function: Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB # Organism: Escherichia coli K12 # 1 383 8 390 390 792 99.0 0 MSLVTDLPAIFDQFSEARQTGFLTVMDLKERGIPLVGTYCTFMPQEIPMAAGAVVVSLCS TSDETIEEAEKDLPRNLCPLIKSSYGFGKTDKCPYFYFSDLVVGETTCDGKKKMYEYMAE FKPVHVMQLPNSVKDDASRALWKAEMLRLQKTVEERFGHEISEDVLRDAIALKNRERRAL ANFYHLGQLNPPALSGSDILKVVYGATFRFDKEALINELDAMTARVRQQWEEGQRLDPRP RILITGCPIGGAAEKVVRAIEENGGWVVGYENCTGAKATEQCVAETGDVYDALADKYLAI GCSCVSPNDQRLKMLSQMVEEYQVDGVVDVILQACHTYAVESLAIKRHVRQQHNIPYIAI ETDYSTSDVGQLSTRVAAFIEML >gi|296494522|gb|ADTN01000216.1| GENE 20 16107 - 17387 1078 426 aa, chain - ## HITS:1 COG:yjiN KEGG:ns NR:ns ## COG: yjiN COG2733 # Protein_GI_number: 16132157 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 426 1 426 426 797 100.0 0 MNKLIELRRAKRLALSLLLIAAATFVVTLFLPPNFWVSGVKAIAEAAMVGALADWFAVVA LFRRVPIPIISRHTAIIPRNKDRIGENLGQFVQEKFLDTQSLVALIRRHEPALLIGNWFS QPENARRVGQHLLQIMSGFLELTDDARIQRLLKRAVHRAIDKVDLSGTSALMLESMTKND RHQVLLDTLIAQLIALLQRDKSRKFIAQQIVRWLESEHPLKAKILPTEWLGEHSAELVSD 
AVNSLLDDISRDRAHQIRHAFDRATFALIDKLKNDPEMAARADAVKSYLKEDEAFNRYLS ELWGDLREWLKVDINSEDSRVKERIARAGQWFGETLIADDALRASLNGHLEQAAHRVAPE FSAFLTRHISDTVKSWDARDMSRQIELNIGKDLQFIRVNGTLVGGCIGLILYLLSQLPAL FPLGNF >gi|296494522|gb|ADTN01000216.1| GENE 21 17428 - 18660 1133 410 aa, chain - ## HITS:1 COG:yjiO KEGG:ns NR:ns ## COG: yjiO COG0477 # Protein_GI_number: 16132158 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 410 1 410 410 714 100.0 0 MPRFFTRHAATLFFPMALILYDFAAYLSTDLIQPGIINVVRDFNADVSLAPAAVSLYLAG GMALQWLLGPLSDRIGRRPVLITGALIFTLACAATMFTTSMTQFLIARAIQGTSICFIAT VGYVTVQEAFGQTKGIKLMAIITSIVLIAPIIGPLSGAALMHFMHWKVLFAIIAVMGFIS FVGLLLAMPETVKRGAVPFSAKSVLRDFRNVFCNRLFLFGAATISLSYIPMMSWVAVSPV ILIDAGSLTTSQFAWTQVPVFGAVIVANAIVARFVKDPTEPRFIWRAVPIQLVGLSLLIV GNLLSPHVWLWSVLGTSLYAFGIGLIFPTLFRFTLFSNKLPKGTVSASLNMVILMVMSVS VEIGRWLWFNGGRLPFHLLAVVAGVIVVFTLAGLLNRVRQHQAAELVEEQ >gi|296494522|gb|ADTN01000216.1| GENE 22 19139 - 20059 543 306 aa, chain + ## HITS:1 COG:ECs5301 KEGG:ns NR:ns ## COG: ECs5301 COG5464 # Protein_GI_number: 15834555 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 254 1 254 260 493 97.0 1e-139 MTNFTTSTPHDALFKTFLTHPDTARDFMEIHLPKDLRELCDLDSLKLESASFVDEKLRAL HSDILWSVKTREGDGYIYVVIEHQSREDIHMAFRLMRYSMAVMQRHIEHDKRQPLPLVIP MLFYHGSRSPYPWSLCWLDEFADPTTARKLYNAAFPLVDVTVVPDDEIVQHRRVALLELI QKHIRQRDLMGLIDQLVVLLVTECANDSQITALLNYILLTGDEARFNEFISELTRRMPQH RERIMTIAERIHNDGYIKGEQRILRLLLQNGADPEWIQKITGLSAEQMQALRQPLPERER YSWLKS >gi|296494522|gb|ADTN01000216.1| GENE 23 20303 - 21715 1201 470 aa, chain - ## HITS:1 COG:yjiR KEGG:ns NR:ns ## COG: yjiR COG1167 # Protein_GI_number: 16132161 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an 
aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Escherichia coli K12 # 1 470 1 470 470 957 100.0 0 MTRYQHLATLLAERIEQGLYRHGEKLPSVRSLSQEHGVSISTVQQAYQTLETMKLITPQP RSGYFVAQRKAQPPVPPMTRPVQRPVEITQWDQVLDMLEAHSDSSIVPLSKSTPDVEAPS LKPLWRELSRVVQHNLQTVLGYDLLAGQRVLREQIARLMLDSGSVVTADDIIITSGCHNS MSLALMAVCKPGDIVAVESPCYYGSMQMLRGMGVKVIEIPTDPETGISVEALELALEQWP IKGIILVPNCNNPLGFIMPDARKRAVLSLAQRHDIVIFEDDVYGELATEYPRPRTIHSWD IDGRVLLCSSFSKSIAPGLRVGWVAPGRYHDKLMHMKYAISSFNVPSTQMAAATFVLEGH YHRHIRRMRQIYQRNLALYTCWIREYFPCEICITRPKGGFLLWIELPEQVDMVCVARQLC RMKIQVAAGSIFSASGKYRNCLRINCALPLSETYREALKQIGEAVYRAME >gi|296494522|gb|ADTN01000216.1| GENE 24 21892 - 22056 154 54 aa, chain + ## HITS:1 COG:yjiS KEGG:ns NR:ns ## COG: yjiS COG5457 # Protein_GI_number: 16132162 # Func_class: S Function unknown # Function: Uncharacterized conserved small protein # Organism: Escherichia coli K12 # 1 54 1 54 54 74 100.0 5e-14 MEFHENRAKAPFIGLVQLWQAVRRWRRQMQTRRVLQQMSDERLKDIGLRREDVE >gi|296494522|gb|ADTN01000216.1| GENE 25 22555 - 25845 1327 1096 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_4952 NR:ns ## KEGG: ECUMN_4952 # Name: yjgT # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 1096 17 1112 1112 2268 99.0 0 MGQSEYISWVKCTSWLSNFVNLRGLRQPDGRPLYEYHATNDEYTQLTQLLRAVGQSQSNI CNRDFAACFVLFCSEWYRRDYERQCGWTWDPIYKKIGISFTATELGTIVPKGMEDYWLRP IRFYESERRNFLGTLFSEGGLPFRLLKESDSRFLAVFSRILGQYEQAKQSGFSALSLARA VIEKSALPTVFSEDTSVELISHMADNLNSLVLTHNLINHKEPVQQLEKVHPTWRSEFPIP LDDETGTHFLNGLLCAASVEAKPRLQKNKSTRCQFYWSEKHPDELRVIVSLPDEVSFPVT SEPSTTRFELAICEDGEEVSGLGPAYASLENRQATVRLRKSEVRFGRQNPSAGLSLVARA GGMIVGSIKLDDSEIAIGEVPLTFIVDADQWLLQGQASCSVRSSDVLIVLPRDNSNVAGF DGQSRAVNVLGLKALPVKGCQDVTVTANETYRIRTGREQISIGRFALNGKRASWVCHPDE TFIGVPKVISTLPDIQSIDVTRYLSGISIEQCHIQEMLGAQYLSIRNSKNETLLRRKIGI LPADFSIEIKGGIHANEGTIVITTQHPCVAKLKDKTLEATRKRTEGRTEIQLKAEGVPPA FITLQVLPNLAADPIDIELPFPAKGCLALDVNGRPLDKNITLHDLLGSRAFLFSRNGEPT RYTLQLHLRSVSGLQAWHEWCYTALSDRPVELNLYSLREHIENLISLEAGIDQVVEMRIT 
GAGVVMAWQIRRYKYSLRYDYEKELLLSQSVNHRAGQIPSPVIMLLSEPERKSIPLASRM SEGVPVGEYELSSIVNKNGPWLVVPKPGEEMAFRPCFIRGESSLPVEESNIRSLQKATQL FNPQAEVNTITLVLGQMANDPAHSGWQFMRSLYDQFGYLPLATFEVWRALVQHPQALAMS LFKFEMSAEYLSRIENEFPILWEFFPIFEIKAASERFKLFLSQKGAPEETQKLLVTNMFQ RLGLVFPTYADEIEKWLSNGHLPPSIPESFVHVWYQELLREHSEAQWPEYGSKRLHSWMA SQKNPVIGINPDANHRYSVALLPVFAAAVASGNASFESVFDRKPGAVFFLRQVRDFDSRW FKAIFQCSLLRYVAKK >gi|296494522|gb|ADTN01000216.1| GENE 26 25858 - 32202 4304 2114 aa, chain + ## HITS:1 COG:ECs5260 KEGG:ns NR:ns ## COG: ECs5260 COG1205 # Protein_GI_number: 15834514 # Func_class: R General function prediction only # Function: Distinct helicase family with a unique C-terminal domain including a metal-binding cysteine cluster # Organism: Escherichia coli O157:H7 # 5 2088 7 2084 2104 978 33.0 0 MTTRYFSSLIEQSLSRSTEATLSIMGVTNPQLREHLAQQMGADCGKPGSFLASPVFQQMF GWKESNHTMRSLTEGNALLSKAVVDSLDDQNNGRYRFGADWRPFTHQLASWKALLEKKHS VVVTSGTGSGKTECFMVPVLEDLYRELHENGNNPLVGVRALFLYPLNALINSQRERLDAW TRGFGTGIRYCLYNGNTENLHASVKSEQAKRPNEVLSREKMREEPAPILVTNGTMLEYMM VRQVDAPIIQQSKAQKSLRWIVLDEAHTYVGSQAAELALQLRRVMTAFGVTPDDVRFVAT SATIAGSDAEKQLKKFLSELSGIPQERIDVLDGSRVIPELEPSKNVSVPLDEIEQIPDLD LRDKNDNKIKGISPERFYALTHSPEARYIREMLVKQPNPMKLDEITERLHVLTKQHYSQQ EVLRWIDVCSGTQPNAKDPAFLKVRAHIFQRNTQGIWACVDKDCRLKHGTPLDKGWPFGY VYVNQRQNCDCGSPVYEVAFCNDCNEPHLLARDKKGKLVQWENKGGDEFSLQDEVPVEHD ATEEKVEKENSFQPPLIIAAGETSEAGYTLQRLDRQTRRIGVINNDSIPLIINDIEQVCS ASGCGYRGMSGKQPFRRALLGGPFYVTNIVPTVLEYCQDFTSDEGKEGVGPDSLPGRGRR LITFTDSRQGTARMAVRMQQEAERSRLRGSVVEILSWHQRTQTSTAPNANADLEKLAARA KQAREQAEEYRSWGMPDQAKLSQAQAEQLEQAYQAATGGKAATILVSRTWTEMVNELKER ADIRGPVLQYNHYLKPEVFNENGGPLKLSEMLLFREFMRRPKRTNSLETQGLVQVGYQGL EKIHKSPLHWQEKGLTLDDWRDFLKVTLDHYVRESNFTQLDDELKNWIGSRFSSKFVRNP ESKDPEDNQNRRWPQIRNGNVSHRLAKLLMLGAGFKTVNAATIDIINTWLKEAWAQLTGP LAVLKPDGNRFYLPKEHMTFSLITDAWICPVTNKILDTAFKGLTPYLPTHISFEHLTLAQ YDTFVAQKVTMPEIWKLDRSQEDYAEGLAKARDWVSHDPLIAQLRSENVWTDINDRVVEG GFYYRTAEHSAQQSSERLQSYEKMFKNGQLNVLNCSTTMEMGVDIGGITAVVMNNVPPHP ANYLQRAGRAGRSKESRAISYTLCKGNPHDQQVFANPLWPFETMIPAPMVAMNSARLVQR 
HVNALLLSDFLCNVIGETDKEKTSLDSLWFFGEDDGQSKCERFKIWLERPVLDIDTALER LVKGTALHGARAEYLRDKTINAITFLQQRWLSVYRDLVTQERESQPQTPYRKRIELEKKR HCGEYLLRDLAARTFLPGYGFPTDVVTFDNFTMEDYIREKSQKSRDKKDREDNVSRYKGL PSRNLGVAIREYAPGAEIILDGRVFRSAGVSLHWHNINADTNEAQRLDCAWRCHKCGTIG YEEGMSSSGMLFCSNSACGEKIIMDNRRQVLQPAGFVTDAHAPVTNNIETMKFVPVVPAW VFVKAEPVPLPNPLMGYMASGADGHVFQQSLGEGGHGYALCLSCGRAESMLNENDAPKSM EAHYPPRPGKADRDSQNHRLICPGSTALMKNVTLGALARTDVFEMVLRKPQNGEYLPDNT EEGRIVAMTLAVALRQALAGVLGISAAELGYSVRPVRLEDGQSVLAVQLYDVISGGAGFA SSAPVHIEAILQGMVKQLGCRHCETACSECLLDSQTRHDHDLLDRKAALAWLGDDFTYYI GLPDEETFSLPDARYCPGAIGDTIRRAINEGAEKLTLWMTGAPNEWDLYARQFRAAVQNY RLKDNVEVDLVIPTGVDDPDLLHELSQFTALGVRLCHVEQDLQLPIVAQVTFTDRVMTLA SRSQQATIPGPEWHLNDELVVRSLGYKTVELNEFILPAKATNAVERVKDIQIHKQLNGPL SQFGQRFWDVLFNDHEEAQSLMNNTRITGVHYTDRYLQNPVALALLGSILRPLKTKLTDG AEVTLDTLFKDKDRPGNRPFHDWMSIADFQDFADQWFAAALGRPVELTVFDSPRDIPHHR KLTVTFEDGQVLKIRFDQGMGYWRINFSSQWHYFDFRDDVSFQLVKMAQACKEGNVANSE ESWATDVLVEVIAS Prediction of potential genes in microbial genomes Time: Sun May 15 23:52:06 2011 Seq name: gi|296494521|gb|ADTN01000217.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont457.3, whole genome shotgun sequence Length of sequence - 2616 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 - CDS 49 - 1095 194 ## COG4268 McrBC 5-methylcytosine restriction system component 2 1 Op 2 . 
- CDS 1095 - 2474 768 ## COG1401 GTPase subunit of restriction endonuclease - Prom 2498 - 2557 2.4 Predicted protein(s) >gi|296494521|gb|ADTN01000217.1| GENE 1 49 - 1095 194 348 aa, chain - ## HITS:1 COG:mcrC KEGG:ns NR:ns ## COG: mcrC COG4268 # Protein_GI_number: 16132166 # Func_class: V Defense mechanisms # Function: McrBC 5-methylcytosine restriction system component # Organism: Escherichia coli K12 # 1 348 1 348 348 705 100.0 0 MEQPVIPVRNIYYMLTYAWGYLQEIKQANLEAIPGNNLLDILGYVLNKGVLQLSRRGLEL DYNPNTEIIPGIKGRIEFAKTIRGFHLNHGKTVSTFDMLNEDTLANRIIKSTLAILIKHE KLNSTIRDEARSLYRKLPGISTLHLTPQHFSYLNGGKNTRYYKFVISVCKFIVNNSIPGQ NKGHYRFYDFERNEKEMSLLYQKFLYEFCRRELTSANTTRSYLKWDASSISDQSLNLLPR METDITIRSSEKILIVDAKYYKSIFSRRMGTEKFHSQNLYQLMNYLWSLKPENGENIGGL LIYPHVDTAVKHRYKINGFDIGLCTVNLGQEWPCIHQELLDIFDEYLK >gi|296494521|gb|ADTN01000217.1| GENE 2 1095 - 2474 768 459 aa, chain - ## HITS:1 COG:mcrB KEGG:ns NR:ns ## COG: mcrB COG1401 # Protein_GI_number: 16132167 # Func_class: V Defense mechanisms # Function: GTPase subunit of restriction endonuclease # Organism: Escherichia coli K12 # 1 459 7 465 465 936 100.0 0 MESIQPWIEKFIKQAQQQRSQSTKDYPTSYRNLRVKLSFGYGNFTSIPWFAFLGEGQEAS NGIYPVILYYKDFDELVLAYGISDTNEPHAQWQFSSDIPKTIAEYFQATSGVYPKKYGQS YYACSQKVSQGIDYTRFASMLDNIINDYKLIFNSGKSVIPPMSKTESYCLEDALNDLFIP ETTIETILKRLTIKKNIILQGPPGVGKTFVARRLAYLLTGEKAPQRVNMVQFHQSYSYED FIQGYRPNGVGFRRKDGIFYNFCQQAKEQPEKKYIFIIDEINRANLSKVFGEVMMLMEHD KRGENWSVPLTYSENDEERFYVPENVYIIGLMNTADRSLAVVDYALRRRFSFIDIEPGFD TPQFRNFLLNKKAEPSFVESLCQKMNELNQEISKEATILGKGFRIGHSYFCCGLEDGTSP DTQWLNEIVMTDIAPLLEEYFFDDPYKQQKWTNKLLGDS Prediction of potential genes in microbial genomes Time: Sun May 15 23:52:20 2011 Seq name: gi|296494520|gb|ADTN01000218.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont457.4, whole genome shotgun sequence Length of sequence - 42647 bp Number of predicted genes - 37, with homology - 37 Number of transcription units - 25, operones - 8 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 20 - 361 480 ## ECUMN_4956 endoribonuclease SymE 2 2 Tu 1 . - CDS 479 - 1699 410 ## COG3464 Transposase and inactivated derivatives - Prom 1800 - 1859 5.1 - Term 1868 - 1900 3.0 3 3 Tu 1 . - CDS 1913 - 2269 242 ## COG0732 Restriction endonuclease S subunits - Prom 2406 - 2465 5.2 - Term 3169 - 3215 2.1 4 4 Tu 1 5/0.200 - CDS 3304 - 4893 1942 ## COG0286 Type I restriction-modification system methyltransferase subunit - Prom 4954 - 5013 4.7 - Term 4901 - 4940 2.2 5 5 Tu 1 . - CDS 5094 - 8606 4017 ## COG4096 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases - Prom 8657 - 8716 4.5 6 6 Tu 1 . + CDS 8794 - 9708 782 ## COG1715 Restriction endonuclease - Term 9702 - 9746 4.3 7 7 Op 1 3/0.300 - CDS 9754 - 10710 1066 ## COG0523 Putative GTPases (G3E family) 8 7 Op 2 9/0.000 - CDS 10721 - 10924 265 ## COG2879 Uncharacterized small protein - Term 10935 - 10964 3.5 9 7 Op 3 . - CDS 10974 - 13124 2788 ## COG1966 Carbon starvation protein, predicted membrane protein - Prom 13372 - 13431 4.4 + Prom 13319 - 13378 5.4 10 8 Tu 1 . + CDS 13501 - 15165 1732 ## COG0840 Methyl-accepting chemotaxis protein - Term 15164 - 15205 7.1 11 9 Op 1 . - CDS 15214 - 16575 1294 ## COG0477 Permeases of the major facilitator superfamily - Prom 16607 - 16666 1.8 - Term 16630 - 16673 1.7 12 9 Op 2 . - CDS 16790 - 17704 460 ## COG1802 Transcriptional regulators - Prom 17729 - 17788 4.0 + Prom 17693 - 17752 3.9 13 10 Tu 1 . + CDS 17843 - 18865 971 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases + Term 18893 - 18931 3.4 - Term 18951 - 18993 11.9 14 11 Tu 1 . - CDS 19005 - 21296 2189 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily - Prom 21348 - 21407 4.2 - Term 21495 - 21539 5.1 15 12 Op 1 . - CDS 21550 - 22041 485 ## S4661 hypothetical protein 16 12 Op 2 . - CDS 22093 - 22830 800 ## COG1484 DNA replication protein 17 12 Op 3 . 
- CDS 22833 - 23372 546 ## COG5529 Pyocin large subunit - Prom 23413 - 23472 2.1 - Term 23415 - 23462 7.8 18 13 Op 1 12/0.000 - CDS 23479 - 23952 471 ## COG3610 Uncharacterized conserved protein 19 13 Op 2 . - CDS 23943 - 24713 361 ## COG2966 Uncharacterized conserved protein 20 14 Op 1 . + CDS 25383 - 26057 293 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 21 14 Op 2 . + CDS 26069 - 26692 260 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain + Term 26702 - 26737 2.4 22 15 Tu 1 . - CDS 26730 - 27518 626 ## COG4114 Uncharacterized Fe-S protein - Prom 27547 - 27606 5.4 + Prom 27469 - 27528 3.5 23 16 Tu 1 . + CDS 27659 - 27895 250 ## EC55989_5030 hypothetical protein - TRNA 27934 - 28020 69.1 # Leu CAG 0 0 - TRNA 28049 - 28135 69.1 # Leu CAG 0 0 - TRNA 28169 - 28255 69.1 # Leu CAG 0 0 24 17 Tu 1 . - CDS 28556 - 29587 264 ## PROTEIN SUPPORTED gi|225082609|ref|YP_002654106.1| ribosomal protein L11 methyltransferase, putative - Prom 29610 - 29669 6.0 + Prom 29608 - 29667 3.0 25 18 Op 1 8/0.000 + CDS 29690 - 30103 365 ## COG3050 DNA polymerase III, psi subunit 26 18 Op 2 4/0.200 + CDS 30072 - 30518 753 ## PROTEIN SUPPORTED gi|15804944|ref|NP_290986.1| ribosomal-protein-alanine N-acetyltransferase 27 18 Op 3 3/0.300 + CDS 30533 - 31210 728 ## COG1011 Predicted hydrolase (HAD superfamily) 28 18 Op 4 5/0.200 + CDS 31301 - 32890 1951 ## COG4108 Peptide chain release factor RF-3 + Term 32901 - 32940 8.0 29 19 Tu 1 . + CDS 33283 - 33888 735 ## COG2823 Predicted periplasmic or secreted lipoprotein 30 20 Tu 1 . + CDS 34015 - 34176 204 ## gi|157368899|ref|YP_001476888.1| hypothetical protein Spro_0654 + Term 34203 - 34245 7.2 + Prom 34211 - 34270 2.3 31 21 Op 1 4/0.200 + CDS 34298 - 35371 912 ## COG4667 Predicted esterase of the alpha-beta hydrolase superfamily 32 21 Op 2 . + CDS 35368 - 36150 770 ## COG0084 Mg-dependent DNase + Term 36185 - 36236 3.0 33 22 Tu 1 . 
- CDS 36376 - 37110 685 ## COG1180 Pyruvate-formate lyase-activating enzyme - Term 37150 - 37185 4.0 34 23 Tu 1 . - CDS 37211 - 38761 1479 ## JW4343 conserved hypothetical protein - Prom 38865 - 38924 4.7 + Prom 38865 - 38924 4.5 35 24 Tu 1 7/0.100 + CDS 39019 - 39798 1047 ## COG0274 Deoxyribose-phosphate aldolase + Prom 39908 - 39967 2.3 36 25 Op 1 4/0.200 + CDS 40027 - 41349 1847 ## COG0213 Thymidine phosphorylase 37 25 Op 2 . + CDS 41401 - 42624 1674 ## COG1015 Phosphopentomutase Predicted protein(s) >gi|296494520|gb|ADTN01000218.1| GENE 1 20 - 361 480 113 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_4956 NR:ns ## KEGG: ECUMN_4956 # Name: yjiW # Def: endoribonuclease SymE # Organism: E.coli_UMN026 # Pathway: not_defined # 1 113 20 132 132 212 100.0 4e-54 MTDTHSIAQPFEAEVSPANNRHVTVGYASRYPDYSRIPAITLKGQWLEAAGFATGTAVDV KVMEGCIVLTAQPPAAEESELMQSLRQVCKLSARKQKQVQAFIGVIAGKQKVA >gi|296494520|gb|ADTN01000218.1| GENE 2 479 - 1699 410 406 aa, chain - ## HITS:1 COG:MA2406 KEGG:ns NR:ns ## COG: MA2406 COG3464 # Protein_GI_number: 20091237 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Methanosarcina acetivorans str.C2A # 6 406 5 401 403 191 27.0 2e-48 MDEKSLYAHILNLSDPWQVKSLSLDENAGSVTVTIEIAENTRLACPTCGKSCSVHDHRHR KWRHLDTCQFTTIVEADVPRIMCPEHGCLTLPVPWAGPGSRYTLLFESFVLSWLKISTVD AVRKQLKLSWNAVDGIMTRAVKRGLARIKKPLSARHMNVDEVAFKKGHRYITVISDRDGR ALALTDDRGTESLAGYLRTLTDGQLLAIKTLSMDMNAGYIRAARIHLPSAVEKIAFDRFH VAKQLGEVVDKTRQNEHPHLPVESRHQAKGTRFLWQYSDKWMTESRQEKLMWLRAQMKLT SQCWALKELAKDIWNRPWSEERRSDWQRWLALAANSDVPMMKNAAKTIGKRLYGILNAMR HSVSNGNAEALNSKIRLLRIKARGYRNRERFKLGVMFHYGKLNMAF >gi|296494520|gb|ADTN01000218.1| GENE 3 1913 - 2269 242 118 aa, chain - ## HITS:1 COG:hsdS KEGG:ns NR:ns ## COG: hsdS COG0732 # Protein_GI_number: 16132169 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Escherichia coli K12 # 1 118 347 464 464 219 100.0 9e-58 
MMNCVKTTSGQKGISGKDIKSQVVLLPPVKEQAEIVRRVEQLFAYADTIEKQVNNALARV NNLTQSILAKAFRGELTAQWRAENPDLISGENSAAALLEKIKAERAASGGKKASRKKS >gi|296494520|gb|ADTN01000218.1| GENE 4 3304 - 4893 1942 529 aa, chain - ## HITS:1 COG:hsdM KEGG:ns NR:ns ## COG: hsdM COG0286 # Protein_GI_number: 16132170 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Escherichia coli K12 # 1 529 1 529 529 1091 99.0 0 MNNNDLVAKLWKLCDNLRDGGVSYQNYVNELASLLFLKMCKETGQEAEYLPEGYRWDDLK SRIGQEQLQFYRKMLVHLGEDDKKLVQAVFHNVSTTITEPKQITALVSNMDSLDWYNGAH GKSRDDFGDMYEGLLQKNANETKSGAGQYFTPRPLIKTIIHLLKPQPREVVQDPAAGTAG FLIEADRYVKSQTNDLDDLDGDTQDFQIHRAFIGLELVPGTRRLALMNCLLHDIEGNLDH GGAIRLGNTLGSDGENLPKAHIVATNPPFGSAAGTNITRTFVHPTSNKQLCFMQHIIETL HPGGRAAVVVPDNVLFEGGKGTDIRRDLMDKCHLHTILRLPTGIFYAQGVKTNVLFFTKG TVANPHQDKNCTDDVWVYDLRTNMPSFGKRTPFTDEHLQPFERVYGEDPHGLSPRTEGEW SFNAEETEVADSEENKNTDQHLATSRWRKFSREWIRTAKSDSLDISWLKDKDSIDADSLP EPDVLAAEAMGELVQALSELDALMRELGASDEADLQRQLLEEAFGGVKE >gi|296494520|gb|ADTN01000218.1| GENE 5 5094 - 8606 4017 1170 aa, chain - ## HITS:1 COG:hsdR KEGG:ns NR:ns ## COG: hsdR COG4096 # Protein_GI_number: 16132171 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Escherichia coli K12 # 1 1170 19 1188 1188 2329 98.0 0 MMNKSNFEFLKGVNDFTYAIACAAENNYPDDPNTTLIKMRMFGEATAKHLGLLLNIPPCE NQHDLLRELGKIAFVDDNILSVFHKLRRIGNQAVHEYHNDLDDAQMCLRLGFRLAVWYYR LVTKDYDFPVPVFVLPERGENLYHQEVLTLKQQLEQQVREKAQTQAEVEAQQQKLVALNG YIAILEGKQQETEAQTQARLAALEAQLAEKNAELAKQTEQERKAYHKEITDQAIKRTLNL SEEESRFLIDAQLRKAGWQADSKTLRFSKGARPEPGVNKAIAEWPTGKDETGKQGFADYV LFVGLKPIAVVEAKRKNIDVPGKLNESYRYSKCFDNGFLRETLLEHYSPDEVHEAVPEYE TSWQDSSGQQRFKIPFCYSTNGREYRAAMKTKSGIWYRDVRDTRNMSKALPEWHRPEELL EMLGSEPQKQNQWFADNPGMSELGLRYYQEDAVRAVEKAIVKGQQEILLAMATGTGKTRT AIAMMFRLIQSQRFKRILFLVDRRSLGEQALGAFEDTRINGDTFNSIFDIKGLTDKFPED STKIHVATVQSLVKRTLQSDEPMPVARYDCIVVDEAHRGYILDKEQTEGELQFRSQLDYV 
SAYRRILDHFDAIKIALTATPALHTVQIFGEPVYRYTYRTAVIDGFLIDQDPPIQITTRN AQEGVYLSKGEQVERISPQGEVINDTLEDDQDFEVADFNRGLVIPAFNRAVCNELTNYLD PTGSQKTLVFCVTNAHADMVVEELRTAFKKKYPQLEHDAIIKITGDADKDARKVQTMITR FNKERLPNIVVTVDLLTTGVDIPSICNIVFLRKVRSRILYEQMKGRATRLCPDVNKTSFK IFDCVDIYSTLESVDTMRPVVVRPKVELQTLVNEITDSETYKITEADGRSFAEHSHEQLV AKLQRIIGLATFNRDRSETIDKQVRRLDELCQDAAGVGFNGFASRLREKGPHWSAEVFNK LPGFIARLEKLKTDINNLNDAPIFLDIDDEVVSVKSLYGDYDTPQDFLEAFDSLVQRSPN AQPALQAVINRPRDLTRKGLVELQEWFDRQHFEESSLRKAWKETRNEDIAARLIGHIRRA AVGDALKPFEERVDHALTRIKGENDWSSEQLSWLDRLAQALKEKVVLDDDVFKTGNFHRR GGKAMLQRTFDDNLDTLLGKFSDYIWDELA >gi|296494520|gb|ADTN01000218.1| GENE 6 8794 - 9708 782 304 aa, chain + ## HITS:1 COG:mrr KEGG:ns NR:ns ## COG: mrr COG1715 # Protein_GI_number: 16132172 # Func_class: V Defense mechanisms # Function: Restriction endonuclease # Organism: Escherichia coli K12 # 1 304 1 304 304 571 99.0 1e-163 MTVPTYDKFIEPVLRYLATKPEGAAARDVHEAAADALGLDDSQRAKVITSGQLVYKNRAG WAHDRLKRAGLSQSLSRGKWCLTPAGFDWVASHPQPMTEQETNHLAFDFVNVKLKSRPDA VDLDPKADSPDHEELAKSSPDDRLDQALKELRDAVADEVLENLLQVSPSRFEVIVLDVLH RLGYGGHRDDLQRVGGTGDGGIDGVISLDKLGLEKVYVQAKRWQNTVGRPELQAFYGALA GQKAKRGVFITTSGFTSQARDFAQSVEGMVLVDGERLVHLMIENEVGVSSRLLKVPKLDM DYFE >gi|296494520|gb|ADTN01000218.1| GENE 7 9754 - 10710 1066 318 aa, chain - ## HITS:1 COG:STM4530 KEGG:ns NR:ns ## COG: STM4530 COG0523 # Protein_GI_number: 16767774 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Salmonella typhimurium LT2 # 1 318 1 318 318 584 91.0 1e-167 MNPIAVTLLTGFLGAGKTTLLRHILNEQHGYKIAVIENEFGEVSVDDQLIGDRATQIKTL TNGCICCSRSNELEDALLDLLDNLDKGNIQFDRLVIECTGMADPGPIIQTFFSHEILCQR YLLDGVIALVDAVHADEQMNQFTIAQSQVGYADRILLTKTDVAGEAEKLRERLARINARA PVYTVTHGDIDLGLLFNTNGFMLEENVVSTKPRFHFIADKQNDISSIVVELDYPVDISEV SRVMENLLLESADKLLRYKGMLWIDGEANRLLFQGVQRLYSADWDRPWGDEKPHSTMVFI GIQLPEDEIRAAFAGLRK >gi|296494520|gb|ADTN01000218.1| GENE 8 10721 - 10924 265 67 aa, chain - ## HITS:1 COG:ECs5312 KEGG:ns NR:ns ## COG: ECs5312 COG2879 # Protein_GI_number: 15834566 # 
Func_class: S Function unknown # Function: Uncharacterized small protein # Organism: Escherichia coli O157:H7 # 1 67 1 67 67 127 100.0 4e-30 MFGNLGQAKKYLGQAAKMLIGIPDYDNYVEHMKTNHPDKPYMSYEEFFRERQNARYGGDG KGGMRCC >gi|296494520|gb|ADTN01000218.1| GENE 9 10974 - 13124 2788 716 aa, chain - ## HITS:1 COG:ECs5313 KEGG:ns NR:ns ## COG: ECs5313 COG1966 # Protein_GI_number: 15834567 # Func_class: T Signal transduction mechanisms # Function: Carbon starvation protein, predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 716 6 721 721 1340 99.0 0 MDTKKIFKHIPWVILGIIGAFCLAVVALRRGEHVSALWIVVASVSVYLVAYRYYSLYIAQ KVMKLDPTRATPAVINNDGLNYVPTNRYVLFGHHFAAIAGAGPLVGPVLAAQMGYLPGTL WLLAGVVLAGAVQDFMVLFISSRRNGASLGEMIKEEMGPVPGTIALFGCFLIMIIILAVL ALIVVKALAESPWGVFTVCSTVPIALFMGIYMRFIRPGRVGEVSVIGIVLLVASIYFGGV IAHDPYWGPALTFKDTTITFALIGYAFVSALLPVWLILAPRDYLATFLKIGVIVGLALGI VVLNPELKMPAMTQYIDGTGPLWKGALFPFLFITIACGAVSGFHALISSGTTPKLLANET DARFIGYGAMLMESFVAIMALVAASIIEPGLYFAMNTPPAGLGITMPNLHEMGGENAPII MAQLKDVTAHAAATVSSWGFVISPEQILQTAKDIGEPSVLNRAGGAPTLAVGIAHVFHKV LPMADMGFWYHFGILFEALFILTALDAGTRSGRFMLQDLLGNFIPFLKKTDSLVAGIIGT AGCVGLWGYLLYQGVVDPLGGVKSLWPLFGISNQMLAAVALVLGTVVLIKMKRTQYIWVT VVPAVWLLICTTWALGLKLFSTNPQMEGFFYMASQYKEKIANGTDLTAQQIANMNHIVVN NYTNAGLSILFLIVVYSIIFYGFKTWLAVRNSDKRTDKETPYVPIPEGGVKISSHH >gi|296494520|gb|ADTN01000218.1| GENE 10 13501 - 15165 1732 554 aa, chain + ## HITS:1 COG:ECs5315 KEGG:ns NR:ns ## COG: ECs5315 COG0840 # Protein_GI_number: 15834569 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Escherichia coli O157:H7 # 1 554 1 554 554 899 99.0 0 MLKRIKIVTSLLLVLAVFGLLQLTSGGLFFNALKNDKENFTVLQTIRQQQSTLNGSWVAL LQTRNTLNRAGIRYMMDQNNIGSGSTVAELMQSASISLKQAEKNWADYEALPRDPRQSTA AAAEIKRNYDIYHNALAELIQLLGAGKINEFFDQPTQGYQDGFEKQYVAYMEQDDRLYDI AVSDNNASYSQAMWILVGVMIVVLAVIFAVWFGIKASLVAPMNRLIDSIRHIAGGDLVKP IEVDGSNEMGQLAESLRHMQGELMRTVGDVRNGANAIYSGASEIATGNNDLSSRTEQQAA SLEETAASMEQLTATVKQNAENARQASHLALSASETAQRGGKVVDNVVQTMRDISTSSQK 
IADIISVIDGIAFQTNILALNAAVEAARAGEQGRGFAVVAGEVRNLAQRSAQAAREIKSL IEDSVGKVDVGSTLVESAGETMAEIVSAVTRVTDIMGEIASASDEQSRGIDQVGLAVAEM DRVTQQNAALVEESAAAAAALEEQASRLTEAVAVFRIQQQQQQQRETSAVVKNVTPATPR KMAVADSGENWETF >gi|296494520|gb|ADTN01000218.1| GENE 11 15214 - 16575 1294 453 aa, chain - ## HITS:1 COG:yjiZ KEGG:ns NR:ns ## COG: yjiZ COG0477 # Protein_GI_number: 16132177 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 453 1 453 453 877 99.0 0 MEKENITLDPRSSFTPSSSADIPVPPDGLVQRSTRIKRIQTTAMLLLFFAAVINYLDRSS LSVANLTIREELGLSATEIGALLSVFSLAYGIAQLPCGPLLDRKGPRLMLGLGMFFWSLF QAMSGMVHSFTQFVLVRIGMGIGEAPMNPCGVKVINDWFNIKERGRPMGFFNAASTIGVA VSPPILAAMMLVMGWRGMFITIGVLGIFLAIGWYMLYRNREHVELTAVEQAYLNAGSVNA RRDPLSFAEWRSLFRNRTMWGMMLGFSGINYTAWLYLAWLPGYLQTAYNLDLKSTGLMAA IPFLFGAAGMLVNGYVTDWLVKGGMAPIKSRKICIIAGMFCSAAFTLVVPQATTSMTAVL LIGMALFCIHFAGTSCWGLIHVAVASRMTASVGSIQNFASFICASFAPIITGFIVDTTHS FRLALIICGCVTAAGALAYIFLVRQPINDPRKD >gi|296494520|gb|ADTN01000218.1| GENE 12 16790 - 17704 460 304 aa, chain - ## HITS:1 COG:ECs5317 KEGG:ns NR:ns ## COG: ECs5317 COG1802 # Protein_GI_number: 15834571 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 37 304 1 268 268 516 100.0 1e-146 MSRSQNLRHNVINQVIDDMARGHIPSPLPSQSALAEMYNISRTTVRHILSHLRECGVLTQ VGNDYVIARKPDHDDGFACTTASMSEQNKVFEQAFFTMINQRQLRPGETFSELQLARAAG VSPVVVREYLLKFGRYNLIQSEKRGQWSMKQFDQSYAEQLFELREMLETHSLQHFLNLPD HDPRWLQAKTMLERHRLLRDNIGNSFRMFSQLDRDFHSLLLSAADNIFFDQSLEIISVIF HFHYQWDESDLKQRNIIAVDEHMTILSALICRSDLDATLALRNHLNSAKQSMIRSINENT RYAH >gi|296494520|gb|ADTN01000218.1| GENE 13 17843 - 18865 971 340 aa, chain + ## HITS:1 COG:yjjN KEGG:ns NR:ns ## COG: yjjN COG1063 # Protein_GI_number: 16132179 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine 
dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 340 6 345 345 671 99.0 0 MSTMNVLICQQPKELVWKQREIPIPGDNEALIKIKSVGICGTDIHAWGGNQPFFSYPRVL GHEICGEIVGLGKNIANLKNGQQVAVIPYVACQQCPACKSGRTNCCEKISVIGVHQDGGF SEYLSVPVANILPADGIDPQAAALIEPFAISAHAVRRAAIAPGEQVLVVGAGPIGLGAAA IAKADGAQVVVADTSPARREHVATRLELPVLDPSAEDFDAQLRAQFGGSLAQKVIDATGN QHAMNNTVNLIRHGGTVVFVGLFKGELQFSDPEFHKKETTMMGSRNATPEDFAKVGRLMA EGKITADMMLTHRYPFATLAETYERDVINNRELIKGVITF >gi|296494520|gb|ADTN01000218.1| GENE 14 19005 - 21296 2189 763 aa, chain - ## HITS:1 COG:mdoB KEGG:ns NR:ns ## COG: mdoB COG1368 # Protein_GI_number: 16132180 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Escherichia coli K12 # 14 763 1 750 750 1524 99.0 0 MSELLSFALFLASVLIYAWKAGRNTWWFAATLTVLGLFVVLNITLFASDYFTGDGINDAV LYTLTNSLTGAGVSKYILPGIGIVLGLTAVFGALGWILRRRRHHPHHFGYSLLALLLALG SVDASPAFRQITELVKSQSRDGDPDFAAYYKEPSKTIPDPKLNLVYIYGESLERTYFDNE AFPDLTPELGALKNEGLDFSHTQQLPGTDYTIAGMVASQCGIPLFAPFEGNASASVSSFF PQNICLGDILKNSGYQNYFVQGANLRFAGKDVFLKSHGFDHLYGSEELKSVVADPHYRND WGFYDDTVLDEAWKKFEELSRSGQRFSLFTLTVDTHHPDGFISRTCNRKKYDFDGKPNQS FSAVSCSQENIATFINKIKASPWFKDTVIVVSSDHLAMNNTAWKYLNKQDRNNLFFVIRG DKPQQETLAVKRNTMDNGATVLDILGGDNYLGLGRSSLSGQSMSEIFLNIKEKTLAWKPD IIRLWKFPKEMKEFTIDQQKNMIAFSGSHFRLPLLLRVSDKRVEPLPESEYSAPLRFQLA DFAPRDNFVWVDRCYKMAQLWAPELALSTDWCVSQGQLGGQQIVQHVDKTTWQGKTAFKD TVIDMARYKGNVDTLKIVDNDIRYKADSFIFNVAGAPEEVKQFSGISRPESWGRWSNAQL GDEVKIEYKHPLPKKFDLVITAKAYGNNASRPIPVRVGNEEQTLVLGNEVTTTTLHFDNP TDADTLVIVPPEPVSTNEGNILGHSPRKLGIGMVEIKVVEREG >gi|296494520|gb|ADTN01000218.1| GENE 15 21550 - 22041 485 163 aa, chain - ## HITS:1 COG:no KEGG:S4661 NR:ns ## KEGG: S4661 # Name: yjjA # Def: hypothetical protein # Organism: S.flexneri_2457T # Pathway: not_defined # 1 163 3 165 165 264 99.0 1e-69 MKTVKHLLCCAIAASALISTGVHAASWKDALSSAASELGNQNSTTQEGGWSLASLTNLLS SGNQALSADNMNNAAGILQYCAKQKLASVTDAENIKNQVLEKLGLNSEEQKEDTNYLDGI 
QGLLKTKDGQQLNLDNIGTTPLAEKVKTKACDLVLKQGLNFIS >gi|296494520|gb|ADTN01000218.1| GENE 16 22093 - 22830 800 245 aa, chain - ## HITS:1 COG:ECs5321 KEGG:ns NR:ns ## COG: ECs5321 COG1484 # Protein_GI_number: 15834575 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Escherichia coli O157:H7 # 1 245 1 245 245 473 100.0 1e-133 MKNVGDLMQRLQKMMPAHIKPAFKTGEELLAWQKEQGAIRSAALERENRAMKMQRTFNRS GIRPLHQNCSFENYRVECEGQMNALSKARQYVEEFDGNIASFIFSGKPGTGKNHLAAAIC NELLLRGKSVLIITVADIMSAMKDTFRNSGTSEEQLLNDLSNVDLLVIDEIGVQTESKYE KVIINQIVDRRSSSKRPTGMLTNSNMEEMTKLLGERVMDRMRLGNSLWVIFNWDSYRSRV TGKEY >gi|296494520|gb|ADTN01000218.1| GENE 17 22833 - 23372 546 179 aa, chain - ## HITS:1 COG:ECs1768 KEGG:ns NR:ns ## COG: ECs1768 COG5529 # Protein_GI_number: 15831022 # Func_class: R General function prediction only # Function: Pyocin large subunit # Organism: Escherichia coli O157:H7 # 78 153 184 266 346 77 51.0 1e-14 MSSRVLTPDVVGIDALVHDHQTVLAKAEGGVVAVFANNAPAFYAVTPARLAELLALEEKL ARPGSDVALDDQLYQEPQAAPVAVPMGKFAMYPDWQPDADFIRLAALWGVALREPVTTEE LASFIAYWQAEGKVFHHVQWQQKLARSLQIGRASNGGLPKRDVNTVSEPDSQIPPGFRG >gi|296494520|gb|ADTN01000218.1| GENE 18 23479 - 23952 471 157 aa, chain - ## HITS:1 COG:STM4545 KEGG:ns NR:ns ## COG: STM4545 COG3610 # Protein_GI_number: 16767789 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Salmonella typhimurium LT2 # 1 157 1 157 157 244 88.0 4e-65 MGVIEFLLALAQDMILAAIPAVGFAMVFNVPVRALRWCALLGAIGHGSRMILMTSGLNIE WSTFMASMLVGTIGIQWSRWYLAHPKVFTVAAVIPMFPGISAYTAMISAVKISQLGYSEP LMITLLTNFLTASSIVGALSIGLSIPGLWLYRKRPRV >gi|296494520|gb|ADTN01000218.1| GENE 19 23943 - 24713 361 256 aa, chain - ## HITS:1 COG:yjjP KEGG:ns NR:ns ## COG: yjjP COG2966 # Protein_GI_number: 16132185 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 256 22 277 277 501 99.0 1e-142 MQTEQQRAVTRLCIQCGLFLLQHGAESALVDELSSRLGRALGMDSVESSISSNAIVLTTI 
KDGQCLTSTRKNHDRGINMHVVTEVQHIVILAEHHLLDYKGVEKRFSQIQPLRYPRWLVA LMVGLSCACFCKLNKGGWDGAVITFFASTAAMYIRQLLAQRHLHPQINFCLTAFAATTIS GLLLQLPTFSNTPTIAMAASVLLLVPGFPLINAVADMFKGHINTGLARWAIASLLTLATC VGVVMALTIWGLRGWV >gi|296494520|gb|ADTN01000218.1| GENE 20 25383 - 26057 293 224 aa, chain + ## HITS:1 COG:ECs5325 KEGG:ns NR:ns ## COG: ECs5325 COG2197 # Protein_GI_number: 15834579 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 224 18 241 241 424 100.0 1e-119 MQAGLKEVMRTHFPEYEIISSASAEDLTLLQLRRSGLVIADLAGESEDPRSVCEHYYSLI SQYREIHWVFMVSRSWYSQAVELLMCPTATLLSDVEPIENLVKTVRSGNTHAERISAMLT SPAMTETHDFSYRSVILTLSERKVLRLLGKGWGINQIASLLKKSNKTISAQKNSAMRRLA IHSNAEMYAWINSAQGARELNLPSVYGDAAEWNTAELRREMSHS >gi|296494520|gb|ADTN01000218.1| GENE 21 26069 - 26692 260 207 aa, chain + ## HITS:1 COG:ECs5326 KEGG:ns NR:ns ## COG: ECs5326 COG2197 # Protein_GI_number: 15834580 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 207 19 225 225 395 99.0 1e-110 MSSIGIESLFRKFAGNPYKLHTYTSQESFQDAMSRISFAAVIFSFSAMRSERREGLSCLT ELAIKFPRTRRLVIADDDIEARLIGSLSPSPLDGVLSKASTLEIFHQELFLSLNGVRQAT DRLNNQWYINQSRTLSPTEREILRFMSRGYSMTQIAEQLKRNIKTIRAHKFNVMSKLGVS SDAGLLEAADILLCMRHCEASNVLHPY >gi|296494520|gb|ADTN01000218.1| GENE 22 26730 - 27518 626 262 aa, chain - ## HITS:1 COG:ECs5327 KEGG:ns NR:ns ## COG: ECs5327 COG4114 # Protein_GI_number: 15834581 # Func_class: R General function prediction only # Function: Uncharacterized Fe-S protein # Organism: Escherichia coli O157:H7 # 1 262 1 262 262 519 98.0 1e-147 MAYRSAPLYEDIIWRTHLQPQDAGLAQAVRATIAEHREHLLEFIRLDEPAPLNAMTLAQW SSPNALSSLLAVYSDHIYRNQPTMIRENKPLISLWAQWYIGLMVPPLMLALLTQEKALDV SPEHFHAEFHETGRAACFWVDVCEDKNATPHSPQQRMETLISQALVPVVQALEATGEING 
KLIWSNTGYLINWYLTEMKQLLGEATVESLRHAIFFEKTLTNGEDNPLWRTVVLRDGLLV RRTCCQRYRLPDVQQCGDCTLK >gi|296494520|gb|ADTN01000218.1| GENE 23 27659 - 27895 250 78 aa, chain + ## HITS:1 COG:no KEGG:EC55989_5030 NR:ns ## KEGG: EC55989_5030 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 78 38 115 115 121 100.0 6e-27 MLQRTLGSGWGVLLPGLLIAGLMYADLSPDQWRIVILMGLILTPVMLYHKQLRHYILLPS CLALSAGIMLMLMNLNQG >gi|296494520|gb|ADTN01000218.1| GENE 24 28556 - 29587 264 343 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225082609|ref|YP_002654106.1| ribosomal protein L11 methyltransferase, putative [marine gamma proteobacterium HTCC2148] # 78 334 106 368 371 106 30 3e-22 MSAFTPASEVLLRHSDDFEQSRILFAGDLQDDLPARLDTAASRAHTQQFHHWQVLSRQMG DNARFSLVATADDVADCDTLIYYWPKNKPEAQFQLMNLLSLLPVGTDIFVVGENRSGVRS AEQMLADYAPLNKVDSARRCGLYFGRLEKQPVFDADKFWGEYSVDGLTVKTLPGVFSRDG LDVGSQLLLSTLTPHTKGKVLDVGCGAGVLSVAFARHSPKIRLTLCDVSAPAVEASRATL AANGVEGEVFASNVFSEVKGRFDMIISNPPFHDGMQTSLDAAQTLIRGAVRHLNSGGELR IVANAFLPYPDVLDETFGFHEVIAQTGRFKVYRAIMTRQAKKG >gi|296494520|gb|ADTN01000218.1| GENE 25 29690 - 30103 365 137 aa, chain + ## HITS:1 COG:ECs5330 KEGG:ns NR:ns ## COG: ECs5330 COG3050 # Protein_GI_number: 15834584 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, psi subunit # Organism: Escherichia coli O157:H7 # 1 137 1 137 137 253 100.0 6e-68 MTSRRDWQLQQLGITQWSLRRPGALQGEIAIAIPAHVRLVMVANDLPALTDPLVSDVLRA LTVSPDQVLQLTPEKIAMLPQGSRCNSWRLGTDEPLSLEGAQVASPALTELRANPTARAA LWQQICTYEHDFFPRND >gi|296494520|gb|ADTN01000218.1| GENE 26 30072 - 30518 753 148 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15804944|ref|NP_290986.1| ribosomal-protein-alanine N-acetyltransferase [Escherichia coli O157:H7 EDL933] # 1 148 1 148 148 294 100 5e-79 MNTISSLETTDLPAAYHIEQRAHAFPWSEKTFASNQGERYLNFQLTQNGKMAAFAITQVV LDEATLFNIAVDPDYQRQGLGRALLEHLIDELEKRGVATLWLEVRASNAAAIALYESLGF NEATIRRNYYPTTDGREDAIIMALPISM >gi|296494520|gb|ADTN01000218.1| GENE 27 30533 - 31210 728 225 aa, chain + ## HITS:1 
COG:ECs5332 KEGG:ns NR:ns ## COG: ECs5332 COG1011 # Protein_GI_number: 15834586 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Escherichia coli O157:H7 # 1 225 1 225 225 469 99.0 1e-132 MKWDWIFFDADETLFTFDSFTGLQRMFLDYSVTFTAEDFQDYQAVNKPLWVDYQNGAITS LQLQHGRFESWAERLNVEPGKLNEAFINAMAEICTPLPGAVSLLNAIRGNAKIGIITNGF SALQQVRLERTGLRDYFDLLVISEEVGVAKPNKKIFDYALEQAGNPDRSRVLMVGDTAES DILGGINAGLATCWLYAHHREQPEGIAPTWTVSSLHELEQLLCKH >gi|296494520|gb|ADTN01000218.1| GENE 28 31301 - 32890 1951 529 aa, chain + ## HITS:1 COG:ECs5333 KEGG:ns NR:ns ## COG: ECs5333 COG4108 # Protein_GI_number: 15834587 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptide chain release factor RF-3 # Organism: Escherichia coli O157:H7 # 1 529 1 529 529 1087 99.0 0 MTLSPYLQEVAKRRTFAIISHPDAGKTTITEKVLLFGQAIQTAGTVKGRGSNQHAKSDWM EMEKQRGISITTSVMQFPYHDCLVNLLDTPGHEDFSEDTYRTLTAVDCCLMVIDAAKGVE DRTRKLMEVTRLRDTPILTFMNKLDRDIRDPMELLDEVENELKIGCAPITWPIGCGKLFK GVYHLYKDETYLYQSGKGHTIQEVRIVKGLNNPDLDAAVGEDLAQQLRDELELVKGASNE FDKELFLAGEITPVFFGTALGNFGVDHMLDGLVEWAPAPMPRQTDTRTVEASEDKFTGFV FKIQANMDPKHRDRVAFMRVVSGKYEKGMKLRQVRTAKDVVISDALTFMAGDRSHVEEAY PGDILGLHNHGTIQIGDTFTQGEMMKFTGIPNFAPELFRRIRLKDPLKQKQLLKGLVQLS EEGAVQVFRPISNNDLIVGAVGVLQFDVVVSRLKSEYNVEAVYESVNVATARWVECADAK KFEEFKRKNESQLALDGGDNLAYIATSMVNLRLAQERYPDVQFHQTREH >gi|296494520|gb|ADTN01000218.1| GENE 29 33283 - 33888 735 201 aa, chain + ## HITS:1 COG:ECs5334 KEGG:ns NR:ns ## COG: ECs5334 COG2823 # Protein_GI_number: 15834588 # Func_class: R General function prediction only # Function: Predicted periplasmic or secreted lipoprotein # Organism: Escherichia coli O157:H7 # 1 201 1 201 201 259 99.0 2e-69 MTMTRLKISKTLLAVMLTSAVATGSAYAENNAQTTNESAGQKVDSSMNKVGNFMDDSAIT AKVKAALVDHDNIKSTDISVKTDQKVVTLSGFVESQAQAEEAVKVAKGVEGVTSVSDKLH VRDAKEGSVKGYAGDTATTSEIKAKLLADDIVPSRKVKVETTDGVVQLSGTVDSQAQSDR AESIAKAVDGVKSVKNDLKTK >gi|296494520|gb|ADTN01000218.1| GENE 30 34015 - 34176 204 53 aa, chain + ## HITS:1 COG:no 
KEGG:no NR:gi|157368899|ref|YP_001476888.1| ## NR: gi|157368899|ref|YP_001476888.1| hypothetical protein Spro_0654 [Serratia proteamaculans 568] # 1 53 33 85 85 77 92.0 4e-13 MFRWGIIFLVIALIAAALGFGGLAGTAAGAAKIVFVVGIILFLVSLFMGRKRP >gi|296494520|gb|ADTN01000218.1| GENE 31 34298 - 35371 912 357 aa, chain + ## HITS:1 COG:yjjU KEGG:ns NR:ns ## COG: yjjU COG4667 # Protein_GI_number: 16132195 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Escherichia coli K12 # 1 357 1 357 357 729 99.0 0 MGQRIPVTLGNIAPLSLRPFQPGRIALVCEGGGQRGIFTAGVLDEFMRAQFNPFDLYLGT SAGAQNLSAFICNQPGYARKVIMRYTTKREFFDPLRFVRGGNLIDLDWLVEATASQMPLQ MDTAARLFDSGKSFYMCACRQDDYAPNYFLPTKQNWLDVIRASSAIPGFYRSGVSLEGIN YLDGGISDAIPVKEAARQGAKTLVVIRTVPSQMYYTPQWFKRMERWLGDSSLQPLVNLVQ HHETSYRDIQQFIEKPPGKLRIFEIYPPKPLHSIALGSRIPALREDYKLGRLCGRYFLAT VGKLLTEKAPLTRHLVPVVTPESIVIPPAPVANDTLVAEVIDAPQANDPTFNNEDLA >gi|296494520|gb|ADTN01000218.1| GENE 32 35368 - 36150 770 260 aa, chain + ## HITS:1 COG:yjjVm KEGG:ns NR:ns ## COG: yjjVm COG0084 # Protein_GI_number: 16132266 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Escherichia coli K12 # 1 258 1 258 259 493 95.0 1e-139 MICRFIDTHCHFDFPPFSGDEEASLQRAAQAGVGKIIVPATEAANFARVQALAENFQPLY AALGLHPGMLEKHSDVSLEQLQQALERRPAKVVAVGEIGLDLFGDDPQFERQQWLLDEQL KLAKRYDLPVILHSRRTHDKLAMHLKRHDLSRTGVVHGFSGSLQQAERFVQLGYKIGVGG TITYPRASKTRDVIAKLPLASLLLETDAPDMPLNGFQGQPNRPEQAARVFDVLCELRPEP EDEIAEVLLNNTYAVFNVRG >gi|296494520|gb|ADTN01000218.1| GENE 33 36376 - 37110 685 244 aa, chain - ## HITS:1 COG:yjjW KEGG:ns NR:ns ## COG: yjjW COG1180 # Protein_GI_number: 16132196 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Escherichia coli K12 # 1 244 44 287 287 493 98.0 1e-139 MGRCNDCGECVPQCPHQALQIVDGKVLWNAVVCEQCDTCLKRCPQHATPMAQSMSVEEVL SHVRKAVLFIEGITVSGGEATTQLPFVVALFTAIKNDPQLRHLTCLVDSNGMLSETGWEK 
LLPVCDGAMLDLKAWGSECHQQLTGRDNQQIKRSICLLAERGKLAELRLLMIPGQVDYLQ HIEELAAFIKGLGDVPVRLNAFHAHGVYGEAQSWASATPEDVEPLADALKVRGVSRLIFP ALYL >gi|296494520|gb|ADTN01000218.1| GENE 34 37211 - 38761 1479 516 aa, chain - ## HITS:1 COG:no KEGG:JW4343 NR:ns ## KEGG: JW4343 # Name: yjjI # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 516 1 516 516 1071 99.0 0 MPTSHENALQQRCQQIVTSPVLSPEQKRHFLALEAENNLPYPQLPAEARRALDEGVICDM FEGHAPYKPRYVLPDYARFLANGSEWLELEGAKDLDDALSLLTILYHHVPSVTSMPVYLG QLDALLQPYVRILTQDEIDVRIKRFWRYLDRTLPDAFMHTNIGPSDSPITRAILRADAEL KQVSPNLTFIYDPEITPDDLLLEVAKNICECSKPHVANGPVHDKIFTKGGYGIVSCYNSL PLAGGGSTLVRLNLKAIAERSESLDDFFTRTLPHYCQQQIAIIDARCEFLYQQSHFFENS FLVKEGLINPERFVPMFGMYGLAEAVNLLCEKEGIAARYGKEAAANEVGYRISAQLAEFV ANTPVKYGWQKRAMLHAQSGISSDIGTTPGARLPYGDEPDPITHLQTVAPHHAYYYSGIS DILTLDETIKRNPQALVQLCLGAFKAGMREFTANVSGNDLVRVTGYMVRLSDLEKYRAEG SRTNTTWLGEEAARNTRILERQPRVISHEQQMRFSQ >gi|296494520|gb|ADTN01000218.1| GENE 35 39019 - 39798 1047 259 aa, chain + ## HITS:1 COG:ECs5340 KEGG:ns NR:ns ## COG: ECs5340 COG0274 # Protein_GI_number: 15834594 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Escherichia coli O157:H7 # 1 259 1 259 259 446 100.0 1e-125 MTDLKASSLRALKLMDLTTLNDDDTDEKVIALCHQAKTPVGNTAAICIYPRFIPIARKTL KEQGTPEIRIATVTNFPHGNDDIEIALAETRAAIAYGADEVDVVFPYRALMAGNEQVGFD LVKACKEACAAANVLLKVIIETGELKDEALIRKASEISIKAGADFIKTSTGKVAVNATPE SARIMMEVIRDMGVEKTVGFKPAGGVRTAEDAQKYLAIADELFGADWADARHYRFGASSL LASLLKALGHGDGKSASSY >gi|296494520|gb|ADTN01000218.1| GENE 36 40027 - 41349 1847 440 aa, chain + ## HITS:1 COG:ZdeoA KEGG:ns NR:ns ## COG: ZdeoA COG0213 # Protein_GI_number: 15804954 # Func_class: F Nucleotide transport and metabolism # Function: Thymidine phosphorylase # Organism: Escherichia coli O157:H7 EDL933 # 1 440 1 440 440 816 100.0 0 MFLAQEIIRKKRDGHALSDEEIRFFINGIRDNTISEGQIAALAMTIFFHDMTMPERVSLT MAMRDSGTVLDWKSLHLNGPIVDKHSTGGVGDVTSLMLGPMVAACGGYIPMISGRGLGHT 
GGTLDKLESIPGFDIFPDDNRFREIIKDVGVAIIGQTSSLAPADKRFYATRDITATVDSI PLITASILAKKLAEGLDALVMDVKVGSGAFMPTYELSEALAEAIVGVANGAGVRTTALLT DMNQVLASSAGNAVEVREAVQFLTGEYRNPRLFDVTMALCVEMLISGKLAKDDAEARAKL QAVLDNGKAAEVFGRMVAAQKGPTDFVENYAKYLPTAMLTKAVYADTEGFVSEMDTRALG MAVVAMGGGRRQASDTIDYSVGFTDMARLGDQVDGQRPLAVIHAKDENSWQEAAKAVKAA IKLADKAPESTPTVYRRISE >gi|296494520|gb|ADTN01000218.1| GENE 37 41401 - 42624 1674 407 aa, chain + ## HITS:1 COG:ECs5342 KEGG:ns NR:ns ## COG: ECs5342 COG1015 # Protein_GI_number: 15834596 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphopentomutase # Organism: Escherichia coli O157:H7 # 1 407 1 407 407 840 100.0 0 MKRAFIMVLDSFGIGATEDAERFGDVGADTLGHIAEACAKGEADNGRKGPLNLPNLTRLG LAKAHEGSTGFIPAGMDGNAEVIGAYAWAHEMSSGKDTPSGHWEIAGVPVLFEWGYFSDH ENSFPQELLDKLVERANLPGYLGNCHSSGTVILDQLGEEHMKTGKPIFYTSADSVFQIAC HEETFGLDKLYELCEIAREELTNGGYNIGRVIARPFIGDKAGNFQRTGNRHDLAVEPPAP TVLQKLVDEKHGQVVSVGKIADIYANCGITKKVKATGLDALFDATIKEMKEAGDNTIVFT NFVDFDSSWGHRRDVAGYAAGLELFDRRLPELMSLLRDDDILILTADHGCDPTWTGTDHT REHIPVLVYGPKVKPGSLGHRETFADIGQTLAKYFGTSDMEYGKAMF Prediction of potential genes in microbial genomes Time: Sun May 15 23:52:45 2011 Seq name: gi|296494519|gb|ADTN01000219.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont457.5, whole genome shotgun sequence Length of sequence - 20013 bp Number of predicted genes - 21, with homology - 20 Number of transcription units - 12, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 28 - 747 956 ## COG0813 Purine-nucleoside phosphorylase + Term 781 - 818 6.0 2 2 Op 1 2/0.778 - CDS 908 - 1171 218 ## COG1396 Predicted transcriptional regulators 3 2 Op 2 2/0.778 - CDS 1203 - 2219 1191 ## COG0095 Lipoate-protein ligase A 4 2 Op 3 . 
- CDS 2247 - 2891 810 ## COG3726 Uncharacterized membrane protein affecting hemolysin expression - Prom 2973 - 3032 4.6 + Prom 2900 - 2959 3.1 5 3 Op 1 5/0.333 + CDS 2997 - 3965 1199 ## COG0560 Phosphoserine phosphatase 6 3 Op 2 1/0.889 + CDS 4014 - 5396 1494 ## COG1066 Predicted ATP-dependent serine protease 7 3 Op 3 . + CDS 5417 - 6649 1417 ## COG3172 Predicted ATPase/kinase involved in NAD metabolism 8 3 Op 4 . + CDS 6683 - 6850 76 ## EcSMS35_4940 hypothetical protein + Term 6878 - 6921 -0.9 - Term 6800 - 6862 7.3 9 4 Tu 1 . - CDS 6957 - 8624 2364 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 8721 - 8780 3.2 10 5 Op 1 4/0.333 + CDS 8835 - 10772 1855 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) 11 5 Op 2 . + CDS 10862 - 11188 362 ## COG2973 Trp operon repressor 12 6 Tu 1 . - CDS 11272 - 11793 437 ## COG1986 Uncharacterized conserved protein - Prom 11820 - 11879 4.3 + Prom 11744 - 11803 2.7 13 7 Tu 1 . + CDS 11845 - 12492 717 ## COG0406 Fructose-2,6-bisphosphatase 14 8 Tu 1 . - CDS 12489 - 13358 1003 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 13551 - 13610 4.5 + Prom 13465 - 13524 4.1 15 9 Op 1 4/0.333 + CDS 13569 - 14042 558 ## COG3045 Uncharacterized protein conserved in bacteria 16 9 Op 2 40/0.000 + CDS 14055 - 14744 610 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 17 9 Op 3 6/0.222 + CDS 14744 - 16168 1254 ## COG0642 Signal transduction histidine kinase 18 9 Op 4 . + CDS 16226 - 17578 1095 ## COG4452 Inner membrane protein involved in colicin E2 resistance + Term 17595 - 17625 3.0 - Term 17582 - 17612 3.0 19 10 Tu 1 . - CDS 17637 - 18353 898 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Prom 18738 - 18797 8.1 20 11 Tu 1 . 
+ CDS 18989 - 19675 482 ## COG0565 rRNA methylase + Prom 19753 - 19812 4.1 21 12 Tu 1 . + CDS 19889 - 19954 82 ## Predicted protein(s) >gi|296494519|gb|ADTN01000219.1| GENE 1 28 - 747 956 239 aa, chain + ## HITS:1 COG:ECs5343 KEGG:ns NR:ns ## COG: ECs5343 COG0813 # Protein_GI_number: 15834597 # Func_class: F Nucleotide transport and metabolism # Function: Purine-nucleoside phosphorylase # Organism: Escherichia coli O157:H7 # 1 239 1 239 239 469 99.0 1e-132 MATPHINAEMGDFADVVLMPGDPLRAKYIAETFLEDAREVNNVRGMLGFTGTYKGRKISV MGHGMGIPSCSIYTKELITDFGVKKIIRVGSCGAVLPHVKLRDVVIGMGACTDSKVNRIR FKDHDFAAIADFDMVRNAVDAAKALGVDARVGNLFSADLFYSPDGEMFDVMEKYGILGVE MEAAGIYGVAAEFGAKALTICTVSDHIRTHEQTTAAERQTTFNDMIKIALESVLLGDKE >gi|296494519|gb|ADTN01000219.1| GENE 2 908 - 1171 218 87 aa, chain - ## HITS:1 COG:ECs5344 KEGG:ns NR:ns ## COG: ECs5344 COG1396 # Protein_GI_number: 15834598 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 87 1 87 87 152 100.0 1e-37 MIDLENQEREIINLMFSQRISWLAAVRIRHKLSLAEVSKMLGISINSLKQIEKTERLSSN IKSKMAEIYGCPPELLICPSWMTAEHK >gi|296494519|gb|ADTN01000219.1| GENE 3 1203 - 2219 1191 338 aa, chain - ## HITS:1 COG:Z5988_2 KEGG:ns NR:ns ## COG: Z5988_2 COG0095 # Protein_GI_number: 15804958 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate-protein ligase A # Organism: Escherichia coli O157:H7 EDL933 # 1 338 5 342 342 695 99.0 0 MSTLRLLISDSYDPWFNLAVEECIFRQMPATQRVLFLWRNADTVVIGRAQNPWKECNTRR MEEDNVRLARRSSGGGAVFHDLGNTCFTFMAGKPEYDKTISTSIVLNALNALGVSAEASG RNDLVVKTVEGDRKVSGSAYRETKDRGFHHGTLLLNADLSRLANYLNPDKKKLAAKGITS VRSRVTNLTELLPGITHEQVCEAITEAFFAHYGERVEAEIISPDKTPDLPNFAETFARQS SWEWNFGQAPAFSHLLDERFSWGGVELHFDVEKGHITRAQVFTDSLNPAPLEALAGRLQG CLYRADMLQQECEALLVDFPEQEKELRELSTWIAGAVR >gi|296494519|gb|ADTN01000219.1| GENE 4 2247 - 2891 810 214 aa, chain - ## HITS:1 COG:ECs5345_1 KEGG:ns NR:ns ## COG: ECs5345_1 COG3726 # Protein_GI_number: 15834599 # Func_class: R General function prediction only # Function: 
Uncharacterized membrane protein affecting hemolysin expression # Organism: Escherichia coli O157:H7 # 1 205 1 205 224 366 100.0 1e-101 MARTKLKFRLHRAVIVLFCLALLVALMQGASWFSQNHQRQRNPQLEELARTLARQVTLNV APLMRTDSPDEKRIQAILDQLTDESRILDAGVYDEQGDLIARSGESVEVRDRLALDGKKA GGYFNQQIVEPIAGKNGPLGYLRLTLDTHTLATEAQQVDNTTNILRLMLLLSLAIGVVLT RTLLQGKRTRWQQSPFLLTASKPVPEEEESEKKE >gi|296494519|gb|ADTN01000219.1| GENE 5 2997 - 3965 1199 322 aa, chain + ## HITS:1 COG:ECs5346 KEGG:ns NR:ns ## COG: ECs5346 COG0560 # Protein_GI_number: 15834600 # Func_class: E Amino acid transport and metabolism # Function: Phosphoserine phosphatase # Organism: Escherichia coli O157:H7 # 1 322 1 322 322 628 98.0 1e-180 MPNITWCDLPEDVSLWPGLPLSLSGDEVMPLDYHAGRSGWLLYGRGLDKQRLTQYQSKLG AAMVIVAAWCVEDYQVIRLAGSLTARATRLAHEAQLDVAPLGKIPHLRTPGLLVMDMDST AIQIECIDEIAKLAGTGEMVAEVTERAMRGELDFTASLRSRVATLKGADANILQQVRENL PLMPGLTQLVLKLETLGWKVAIASGGFTFFAEYLRDKLRLTAVVANELEIMDGKFTGNVI GDIVDAQYKAKTLTRLAQEYETPLAQTGAIGEGANALPMIKAAGLGIAYHAKPKVNEKAE VTIRHADLMGVFCILSGSLNQK >gi|296494519|gb|ADTN01000219.1| GENE 6 4014 - 5396 1494 460 aa, chain + ## HITS:1 COG:ECs5347 KEGG:ns NR:ns ## COG: ECs5347 COG1066 # Protein_GI_number: 15834601 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent serine protease # Organism: Escherichia coli O157:H7 # 1 460 1 460 460 901 99.0 0 MAKAPKRAFVCNECGADYPRWQGQCSACHAWNTITEVRLAASPTVARNERLSGYAGSTGV AKVQKLSDISLEELPRFSTGFKEFDRVLGGGVVPGSAILIGGNPGAGKSTLLLQTLCKLA QQMKTLYVTGEESLQQVAMRAHRLGLPTDNLNMLSETSIEQICLIAEEEQPKLMVIDSIQ VMHMADVQSSPGSVAQVRETAAYLTRFAKTRGVAIVMVGHVTKDGSLAGPKVLEHCIDCS VLLDGDADSRFRTLRSHKNRFGAVNELGVFAMTEQGLREVSNPSAIFLSRGDEVTSGSSV MVVWEGTRPLLVEIQALVDHSMMANPRRVAVGLEQNRLAILLAVLHRHGGLQMADQDVFV NVVGGVKVTETSADLALLLAMVSSLRDRPLPQDLVVFGEVGLAGEIRPVPSGQERISEAA KHGFRRAIVPAANVPKKAPEGMQIFGVKKLSDALSVFDDL >gi|296494519|gb|ADTN01000219.1| GENE 7 5417 - 6649 1417 410 aa, chain + ## HITS:1 COG:nadR_3 KEGG:ns NR:ns ## COG: nadR_3 COG3172 # Protein_GI_number: 16132207 # 
Func_class: H Coenzyme transport and metabolism # Function: Predicted ATPase/kinase involved in NAD metabolism # Organism: Escherichia coli K12 # 224 409 1 186 187 380 99.0 1e-105 MSSFDYLKTAIKQQGCTLQQVADASGMTKGYLSQLLNAKIKSPSAQKLEALHRFLGLEFP RQKKTIGVVFGKFYPLHTGHIYLIQRACSQVDELHIIMGFDDTRDRALFEDSAMSQQPTV PDRLRWLLQTFKYQKNIRIHAFNEEGMEPYPHGWDVWSNGIKKFMAEKGIQPDLIYTSEE ADAPQYMEHLGIETVLVDPKRTFMSISGAQIRENPFRYWEYIPTEVKPFFVRTVAILGGE SSGKSTLVNKLANIFNTTSAWEYGRDYVFSHLGGDEIALQYSDYDKIALGHAQYIDFAVK YANKVAFIDTDFVTTQAFCKKYEGREHPFVQALIDEYRFDLVILLENNTPWVADGLRSLG SSVDRKEFQNLLVEMLEENNIEFVRVEEDDYDSRFLRCVELVREMMGEQG >gi|296494519|gb|ADTN01000219.1| GENE 8 6683 - 6850 76 55 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_4940 NR:ns ## KEGG: EcSMS35_4940 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 55 1 55 55 85 98.0 7e-16 MSFFDELKTSLEEAVEIKQGLKKPARVTRHEIEDAKAVVDRKRCSRRIRHSVLNA >gi|296494519|gb|ADTN01000219.1| GENE 9 6957 - 8624 2364 555 aa, chain - ## HITS:1 COG:ECs5349 KEGG:ns NR:ns ## COG: ECs5349 COG0488 # Protein_GI_number: 15834603 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Escherichia coli O157:H7 # 1 555 1 555 555 1104 100.0 0 MAQFVYTMHRVGKVVPPKRHILKNISLSFFPGAKIGVLGLNGAGKSTLLRIMAGIDKDIE GEARPQPDIKIGYLPQEPQLNPEHTVRESIEEAVSEVVNALKRLDEVYALYADPDADFDK LAAEQGRLEEIIQAHDGHNLNVQLERAADALRLPDWDAKIANLSGGERRRVALCRLLLEK PDMLLLDEPTNHLDAESVAWLERFLHDFEGTVVAITHDRYFLDNVAGWILELDRGEGIPW EGNYSSWLEQKDQRLAQEASQEAARRKSIEKELEWVRQGTKGRQSKGKARLARFEELNST EYQKRNETNELFIPPGPRLGDKVLEVSNLRKSYGDRLLIDDLSFSIPKGAIVGIIGPNGA GKSTLFRMISGQEQPDSGTITLGETVKLASVDQFRDSMDNSKTVWEEVSGGLDIMKIGNT EMPSRAYVGRFNFKGVDQGKRVGELSGGERGRLHLAKLLQVGGNMLLLDEPTNDLDIETL RALENALLEFPGCAMVISHDRWFLDRIATHILDYQDEGKVEFFEGNFTEYEEYKKRTLGA DALEPKRIKYKRIAK >gi|296494519|gb|ADTN01000219.1| GENE 10 8835 - 10772 1855 645 aa, chain + ## HITS:1 COG:ECs5350 KEGG:ns NR:ns ## COG: ECs5350 COG0741 # Protein_GI_number: 
15834604 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Escherichia coli O157:H7 # 1 645 10 654 654 1239 99.0 0 MEKAKQVTWRLLAAGVCLLTVSSVARADSLDEQRSRYAQIKQAWDNRQMDVVEQMMPRLK DYPLYPYLEYRQITDDLMNQPAVTVTNFVRANPTLPPARTLQSRFVNELARREDWRGLLA FSPEKPGTTEAQCNYYYAKWNTGQSEEAWQGAKELWLTGKSQPNACDKLFSVWRASGKQD PLAYLERIRLAMKAGNTGLVTVLAGQMPADYQTIASAIISLANNPNTVLTFARTTGATDF TRQMAAVAFASVARQDAENARLMIPSLAQAQQLNEDQIQELRDIVAWRLMGNDVTDEQAK WRDDAIMRSQSTSLIERRVRMALGTGDRRGLNTWLARLPMEAKEKDEWRYWQADLLLERG REAEAKEILHQLMQQRGFYPMVAAQRIGEEYELKIDKAPQNVDSALTQGPEMARVRELMY WNLDNTARSEWANLVKSKSKTEQAQLARYAFNNQWWDLSVQATIAGKLWDHLEERFPLAY NDLFKRYTSGKEIPQSYAMAIARQESAWNPKVKSPVGASGLMQIMPGTATHTVKMFSIPS YSSPGQLLDPETNINIGTSYLQYVYQQFGNNRIFSSAAYNAGPGRVRTWLGNSAGRIDAV AFVESIPFSETRGYVKNVLAYDAYYRYFMGDKPTLMSATEWGRRY >gi|296494519|gb|ADTN01000219.1| GENE 11 10862 - 11188 362 108 aa, chain + ## HITS:1 COG:ECs5351 KEGG:ns NR:ns ## COG: ECs5351 COG2973 # Protein_GI_number: 15834605 # Func_class: K Transcription # Function: Trp operon repressor # Organism: Escherichia coli O157:H7 # 1 108 1 108 108 166 100.0 9e-42 MAQQSPYSAAMAEQRHQEWLRFVDLLKNAYQNDLHLPLLNLMLTPDEREALGTRVRIVEE LLRGEMSQRELKNELGAGIATITRGSNSLKAAPVELRQWLEEVLLKSD >gi|296494519|gb|ADTN01000219.1| GENE 12 11272 - 11793 437 173 aa, chain - ## HITS:1 COG:yjjX KEGG:ns NR:ns ## COG: yjjX COG1986 # Protein_GI_number: 16132211 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 170 4 173 173 325 100.0 3e-89 MHQVVCATTNPAKIQAILQAFHEIFGEGSCHIASVAVESGVPEQPFGSEETRAGARNRVA NARRLLPEADFWVAIEAGIDGDSTFSWVVIENASQRGEARSATLPLPAVILEKVREGEAL GPVMSRYTGIDEIGRKEGAIGVFTAGKLTRASVYHQAVILALSPFHNAVYQAL >gi|296494519|gb|ADTN01000219.1| GENE 13 11845 - 12492 717 215 aa, chain + ## HITS:1 COG:ECs5353 KEGG:ns NR:ns ## COG: ECs5353 COG0406 # Protein_GI_number: 15834607 # Func_class: G Carbohydrate transport 
and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Escherichia coli O157:H7 # 1 215 1 215 215 406 100.0 1e-113 MLQVYLVRHGETQWNAERRIQGQSDSPLTAKGEQQAMQVATRAKELGITHIISSDLGRTR RTAEIIAQACGCDIIFDSRLRELNMGVLEKRHIDSLTEEEENWRRQLVNGTVDGRIPEGE SMQELSDRVNAALESCRDLPQGSRPLLVSHGIALGCLVSTILGLPAWAERRLRLRNCSIS RVDYQESLWLASGWVVETAGDISHLDAPALDELQR >gi|296494519|gb|ADTN01000219.1| GENE 14 12489 - 13358 1003 289 aa, chain - ## HITS:1 COG:ECs5354 KEGG:ns NR:ns ## COG: ECs5354 COG2207 # Protein_GI_number: 15834608 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 1 289 1 289 289 597 100.0 1e-171 MDQAGIIRDLLIWLEGHLDQPLSLDNVAAKAGYSKWHLQRMFKDVTGHAIGAYIRARRLS KSAVALRLTARPILDIALQYRFDSQQTFTRAFKKQFAQTPALYRRSPEWSAFGIRPPLRL GEFTMPEHKFVTLEDTPLIGVTQSYSCSLEQISDFRHEMRYQFWHDFLGNAPTIPPVLYG LNETRPSQDKDDEQEVFYTTALAQDQADGYVLTGHPVMLQGGEYVMFTYEGLGTGVQEFI LTVYGTCMPMLNLTRRKGQDIERYYPAEDAKAGDRPINLRCELLIPIRR >gi|296494519|gb|ADTN01000219.1| GENE 15 13569 - 14042 558 157 aa, chain + ## HITS:1 COG:ECs5355 KEGG:ns NR:ns ## COG: ECs5355 COG3045 # Protein_GI_number: 15834609 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 157 1 157 157 291 100.0 3e-79 MKYKHLILSLSLIMLGPLAHAEEIGSVDTVFKMIGPDHKIVVEAFDDPDVKNVTCYVSRA KTGGIKGGLGLAEDTSDAAISCQQVGPIELSDRIKNGKAQGEVVFKKRTSLVFKSLQVVR FYDAKRNALAYLAYSDKVVEGSPKNAISAVPVMPWRQ >gi|296494519|gb|ADTN01000219.1| GENE 16 14055 - 14744 610 229 aa, chain + ## HITS:1 COG:creB KEGG:ns NR:ns ## COG: creB COG0745 # Protein_GI_number: 16132215 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli K12 # 1 229 1 229 229 449 99.0 1e-126 MQRETVWLVEDEQGIADTLVYMLQQEGFAVEVFERGLPVLDKARQQVPDVMILDVGLPDI SGFELCRQLLALHPALPVLFLTARSEEVDRLLGLEIGADDYVAKPFSPREVCARVRTLLR 
RVKKFSTPSPVIRIGHFELNEPAAQISWFDTPLTLTRYEFLLLKTLLKSPGRVWSRQQLM DSVWEDAQDTYDRTVDTHIKTLRAKLRAINPDLSPINTHRGMGYSLRGL >gi|296494519|gb|ADTN01000219.1| GENE 17 14744 - 16168 1254 474 aa, chain + ## HITS:1 COG:creC KEGG:ns NR:ns ## COG: creC COG0642 # Protein_GI_number: 16132216 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 474 1 474 474 907 99.0 0 MRIGMRLLLGYFLLVAVAAWFVLAIFVKEVKPGVRRATEGTLIDTATLLAELARPDLLSG DPTHGQLAQAFNQLQHRPFRANIGGINKVRNEYHVYMTDAQGKVLFDSANKAVGQDYSRW NDVWLTLRGQYGARSTLQNPADPESSVMYVAAPIMDGSRLIGVLSVGKPNAAMAPVIKRS ERRILWASAILLGIALVIGAGMVWWINRSIARLTRYADSVTDNKPVPLPDLASSELRKLA QALESMRVKLEGKNYIEQYVYALTHELKSPLAAIRGAAEILREGPPPEVVARFTDNILTQ NARMQALVETLLRQARLENRQEVVLTAVDVAALFRRVSEARTVQLAEKNITLHVTPTEVN VAAEPALLDQALGNLLDNAIDFTPESGRITLSAEVDQEHVTLKVLDTGSGIPDYALSRIF ERFYSLPRANGQKSSGLGLAFVSEVARLFNGEVTLRNVQEGGVLASLRLHRHFT >gi|296494519|gb|ADTN01000219.1| GENE 18 16226 - 17578 1095 450 aa, chain + ## HITS:1 COG:ECs5358 KEGG:ns NR:ns ## COG: ECs5358 COG4452 # Protein_GI_number: 15834612 # Func_class: V Defense mechanisms # Function: Inner membrane protein involved in colicin E2 resistance # Organism: Escherichia coli O157:H7 # 1 450 1 450 450 840 98.0 0 MLKSPLFWKMTTLFGAVLLLLIPIMLIRQVIVERADYRSDVEDAIRQSTSGPQKLVGPLI AIPVIELYTVQEEDKTVERKRSFIHFWLPESLMVDGNQNVEERKIGIYTGQVWHSDLTLK ADFDVSRLSELNAPNIILGKPFIVISVGDARGIGVVKAPEVNGTALTIEPGTGLEQGGQG VHIPLPEGDWRKQNLQLNMALNLSGTGDLSVVPAGRNSEMTLTSNWPHPSFLGDFLPAKR EVSESGFQAQWQSSWFANNLGERFASGNDTGWENFPAFSVAVTTPADQYQLTDRATKYAI LLIALTFMAFFVFETLTAQRLHPMQYLLVGLSLVMFYLLLLALSEHTGFTVAWIIASLIG ALMNGIYLQAVLKGWRNSMLFTLALLLLDGVMWGLLNSADSALLLGTSVLVVALAGMMFV TRNIDWYAFSLPKMKASKEVTTDDELRIWK >gi|296494519|gb|ADTN01000219.1| GENE 19 17637 - 18353 898 238 aa, chain - ## HITS:1 COG:ECs5359 KEGG:ns NR:ns ## COG: ECs5359 COG0745 # Protein_GI_number: 15834613 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like 
receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 238 1 238 238 470 100.0 1e-133 MQTPHILIVEDELVTRNTLKSIFEAEGYDVFEATDGAEMHQILSEYDINLVIMDINLPGK NGLLLARELREQANVALMFLTGRDNEVDKILGLEIGADDYITKPFNPRELTIRARNLLSR TMNLGTVSEERRSVESYKFNGWELDINSRSLIGPDGEQYKLPRSEFRAMLHFCENPGKIQ SRAELLKKMTGRELKPHDRTVDVTIRRIRKHFESTPDTPEIIATIHGEGYRFCGDLED >gi|296494519|gb|ADTN01000219.1| GENE 20 18989 - 19675 482 228 aa, chain + ## HITS:1 COG:ECs5361 KEGG:ns NR:ns ## COG: ECs5361 COG0565 # Protein_GI_number: 15834615 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylase # Organism: Escherichia coli O157:H7 # 1 228 1 228 228 422 98.0 1e-118 MRITIILVAPARAENIGAAARAMKTMGFSELRIVDSQAHLEPATRWVAHGSGDVIDNIKV FPTLAESLHDVDFTVATTARSRAKYHYYATPVELVPLLEEKSSWMSHAALVFGREDSGLT NEELALADVLTGVPMVADYPSLNLGQAVMVYCYQLATLIQQPAKSDTTADQHQLQALRER VMALLTTLAVADDIKLVDWLQQRLGLLEQRDTAMLHRLLHDIEKNITK >gi|296494519|gb|ADTN01000219.1| GENE 21 19889 - 19954 82 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKRISTTITTTITITTGNGAG Prediction of potential genes in microbial genomes Time: Sun May 15 23:52:56 2011 Seq name: gi|296494518|gb|ADTN01000220.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont457.6, whole genome shotgun sequence Length of sequence - 15129 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 7, operones - 5 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 11/0.000 + CDS 52 - 2514 2269 ## COG0527 Aspartokinases 2 1 Op 2 19/0.000 + CDS 2516 - 3448 742 ## COG0083 Homoserine kinase 3 1 Op 3 . + CDS 3449 - 4735 1464 ## COG0498 Threonine synthase + Term 4759 - 4798 4.2 + Prom 4744 - 4803 2.9 4 2 Tu 1 . + CDS 4949 - 5245 136 ## ECUMN_0005 conserved hypothetical protein; putative exported protein + Term 5282 - 5321 3.7 - Term 5235 - 5279 2.2 5 3 Op 1 5/0.000 - CDS 5399 - 6175 924 ## COG3022 Uncharacterized protein conserved in bacteria 6 3 Op 2 . 
- CDS 6245 - 7675 812 ## PROTEIN SUPPORTED gi|145634045|ref|ZP_01789756.1| 50S ribosomal protein L21 - Prom 7732 - 7791 4.3 + Prom 7848 - 7907 1.9 7 4 Op 1 5/0.000 + CDS 7954 - 8907 1339 ## COG0176 Transaldolase + Term 8915 - 8955 9.3 + Prom 8933 - 8992 2.4 8 4 Op 2 . + CDS 9022 - 9609 692 ## PROTEIN SUPPORTED gi|134277849|ref|ZP_01764564.1| ribosomal protein S16 - Term 9599 - 9636 6.2 9 5 Tu 1 4/1.000 - CDS 9644 - 10210 744 ## COG1584 Predicted membrane protein - Prom 10279 - 10338 4.5 - Term 10239 - 10279 3.2 10 6 Op 1 2/1.000 - CDS 10359 - 10751 369 ## COG4735 Uncharacterized protein conserved in bacteria 11 6 Op 2 . - CDS 10772 - 11071 173 ## COG4735 Uncharacterized protein conserved in bacteria 12 6 Op 3 . - CDS 11097 - 11501 367 ## JW0012 hypothetical protein - Prom 11594 - 11653 4.8 + Prom 11583 - 11642 5.7 13 7 Op 1 31/0.000 + CDS 11878 - 13794 2358 ## COG0443 Molecular chaperone + Term 13823 - 13869 11.0 14 7 Op 2 . + CDS 13883 - 15013 1163 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain Predicted protein(s) >gi|296494518|gb|ADTN01000220.1| GENE 1 52 - 2514 2269 820 aa, chain + ## HITS:1 COG:thrA_1 KEGG:ns NR:ns ## COG: thrA_1 COG0527 # Protein_GI_number: 16127996 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Escherichia coli K12 # 1 460 1 460 460 911 100.0 0 MRVLKFGGTSVANAERFLRVADILESNARQGQVATVLSAPAKITNHLVAMIEKTISGQDA LPNISDAERIFAELLTGLAAAQPGFPLAQLKTFVDQEFAQIKHVLHGISLLGQCPDSINA ALICRGEKMSIAIMAGVLEARGHNVTVIDPVEKLLAVGHYLESTVDIAESTRRIAASRIP ADHMVLMAGFTAGNEKGELVVLGRNGSDYSAAVLAACLRADCCEIWTDVDGVYTCDPRQV PDARLLKSMSYQEAMELSYFGAKVLHPRTITPIAQFQIPCLIKNTGNPQAPGTLIGASRD EDELPVKGISNLNNMAMFSVSGPGMKGMVGMAARVFAAMSRARISVVLITQSSSEYSISF CVPQSDCVRAERAMQEEFYLELKEGLLEPLAVTERLAIISVVGDGMRTLRGISAKFFAAL ARANINIVAIAQGSSERSISVVVNNDDATTGVRVTHQMLFNTDQVIEVFVIGVGGVGGAL LEQLKRQQSWLKNKHIDLRVCGVANSKALLTSVHGLNLENWQEELAQAKEPFNLGRLIRL VKEYHLLNPVIVDCTSSQAVADQYADFLREGFHVVTPNKKANTSSMDYYHQLRYAAEKSR 
RKFLYDTNVGAGLPVIENLQNLLNAGDELVKFSGILSGSLSYIFGKLDEGMSFSEATTLA REMGYTEPDPRDDLSGMDVARKLLILARETGRELELADIEIEPVLPAEFNAEGDVAAFMA NLSQLDDLFAARVAKARDEGKVLRYVGNIDEDGVCRVKIAEVDGNDPLFKVKNGENALAF YSHYYQPLPLVLRGYGAGNDVTAAGVFADLLRTLSWKLGV >gi|296494518|gb|ADTN01000220.1| GENE 2 2516 - 3448 742 310 aa, chain + ## HITS:1 COG:ECs0003 KEGG:ns NR:ns ## COG: ECs0003 COG0083 # Protein_GI_number: 15829257 # Func_class: E Amino acid transport and metabolism # Function: Homoserine kinase # Organism: Escherichia coli O157:H7 # 1 310 1 310 310 638 99.0 0 MVKVYAPASSANMSVGFDVLGAAVTPVDGALLGDVVTVEAAETFSLNNLGRFADKLPSEP RENIVYQCWERFCQEQGKQIPVAMTLEKNMPIGSGLGSSACSVVAALMAMNEHCGKPLND TRLLALMGELEGRISGSIHYDNVAPCFLGGMQLMIEENDIISQQVPGFDEWLWVLAYPGI KVSTAEARAILPAQYRRQDCIAHGRHLAGFIHACYSRQPELAAKLMKDVIAEPYRERLLP GFRQARQAVAEIGAVASGISGSGPTLFALCDKPDTAQRVADWLGKNYLQNQEGFVHICRL DTAGARVLEN >gi|296494518|gb|ADTN01000220.1| GENE 3 3449 - 4735 1464 428 aa, chain + ## HITS:1 COG:ECs0004 KEGG:ns NR:ns ## COG: ECs0004 COG0498 # Protein_GI_number: 15829258 # Func_class: E Amino acid transport and metabolism # Function: Threonine synthase # Organism: Escherichia coli O157:H7 # 1 428 1 428 428 842 99.0 0 MKLYNLKDHNEQVSFAQAVTQGLGKNQGLFFPHDLPEFSLTEIDEMLKLDFVTRSAKILS AFIGDEIPQEILEERVRAAFAFPAPVANVESDVGCLELFHGPTLAFKDFGGRFMAQMLTH IAGDKPVTILTATSGDTGAAVAHAFYGLPNVKVVILYPRGKISPLQEKLFCTLGGNIETV AIDGDFDACQTLVKQAFDDEELKVALGLNSANSINISRLLAQICYYFEAVAQLPQEARNQ LVVSVPSGNFGDLTAGLLAKSLGLPVKRFIAATNVNDTVPRFLHDGQWSPKATQATLSNA MDVSQPNNWPRVEELFRRKIWQLKELGYAAVDDETTQQTMRELKELGYTSEPHAAVAYRA LRDQLNPGEYGLFLGTAHPAKFKESVEAILGETLDLPKELAERADLPLLSHNLPADFAAL RKLMMNHQ >gi|296494518|gb|ADTN01000220.1| GENE 4 4949 - 5245 136 98 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_0005 NR:ns ## KEGG: ECUMN_0005 # Name: yaaX # Def: conserved hypothetical protein; putative exported protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 84 1 84 98 134 98.0 7e-31 MKKMQSIVLALSLVLVAPMAAQAAEITLVPSVKLQIGDRDNRGYYWDGGHWRDHGWWKQH YEWRGNRWHPHGPPPPPRHHKKAPHDHHGGHGPGKHHR 
>gi|296494518|gb|ADTN01000220.1| GENE 5 5399 - 6175 924 258 aa, chain - ## HITS:1 COG:yaaA KEGG:ns NR:ns ## COG: yaaA COG3022 # Protein_GI_number: 16128000 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 258 1 258 258 510 100.0 1e-145 MLILISPAKTLDYQSPLTTTRYTLPELLDNSQQLIHEARKLTPPQISTLMRISDKLAGIN AARFHDWQPDFTPANARQAILAFKGDVYTGLQAETFSEDDFDFAQQHLRMLSGLYGVLRP LDLMQPYRLEMGIRLENARGKDLYQFWGDIITNKLNEALAAQGDNVVINLASDEYFKSVK PKKLNAEIIKPVFLDEKNGKFKIISFYAKKARGLMSRFIIENRLTKPEQLTGFNSEGYFF DEDSSSNGELVFKRYEQR >gi|296494518|gb|ADTN01000220.1| GENE 6 6245 - 7675 812 476 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145634045|ref|ZP_01789756.1| 50S ribosomal protein L21 [Haemophilus influenzae PittAA] # 6 447 8 446 456 317 40 3e-86 MPDFFSFINSVLWGSVMIYLLFGAGCWFTFRTGFVQFRYIRQFGKSLKNSIHPQPGGLTS FQSLCTSLAARVGSGNLAGVALAITAGGPGAVFWMWVAAFIGMATSFAECSLAQLYKERD VNGQFRGGPAWYMARGLGMRWMGVLFAVFLLIAYGIIFSGVQANAVARALSFSFDFPPLV TGIILAVFTLLAITRGLHGVARLMQGFVPLMAIIWVLTSLVICVMNIGQLPHVIWSIFES AFGWQEAAGGAAGYTLSQAITNGFQRSMFSNEAGMGSTPNAAAAAASWPPHPAAQGIVQM IGIFIDTLVICTASAMLILLAGNGTTYMPLEGIQLIQKAMRVLMGSWGAEFVTLVVILFA FSSIVANYIYAENNLFFLRLNNPKAIWCLRICTFATVIGGTLLSLPLMWQLADIIMACMA ITNLTAILLLSPVVHTIASDYLRQRKLGVRPVFDPLRYPDIGRQLSPDAWDDVSQE >gi|296494518|gb|ADTN01000220.1| GENE 7 7954 - 8907 1339 317 aa, chain + ## HITS:1 COG:ECs0008 KEGG:ns NR:ns ## COG: ECs0008 COG0176 # Protein_GI_number: 15829262 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Escherichia coli O157:H7 # 1 317 1 317 317 609 100.0 1e-174 MTDKLTSLRQYTTVVADTGDIAAMKLYQPQDATTNPSLILNAAQIPEYRKLIDDAVAWAK QQSNDRAQQIVDATDKLAVNIGLEILKLVPGRISTEVDARLSYDTEASIAKAKRLIKLYN DAGISNDRILIKLASTWQGIRAAEQLEKEGINCNLTLLFSFAQARACAEAGVFLISPFVG RILDWYKANTDKKEYAPAEDPGVVSVSEIYQYYKEHGYETVVMGASFRNIGEILELAGCD RLTIAPALLKELAESEGAIERKLSYTGEVKARPARITESEFLWQHNQDPMAVDKLAEGIR KFAIDQEKLEKMIGDLL >gi|296494518|gb|ADTN01000220.1| GENE 8 9022 - 9609 692 195 aa, chain + ## 
PROTEIN SUPPORTED ## NR: gi|134277849|ref|ZP_01764564.1| ribosomal protein S16 [Burkholderia pseudomallei 305] # 6 191 2 191 194 271 73 3e-72 MNTLRIGLVSISDRASSGVYQDKGIPALEEWLTSALTTPFELETRLIPDEQAIIEQTLCE LVDEMSCHLVLTTGGTGPARRDVTPDATLAVADREMPGFGEQMRQISLHFVPTAILSRQV GVIRKQALILNLPGQPKSIKETLEGVKDAEGNVVVHGIFASVPYCIQLLEGPYVETAPEV VAAFRPKSARRDVSE >gi|296494518|gb|ADTN01000220.1| GENE 9 9644 - 10210 744 188 aa, chain - ## HITS:1 COG:ECs0010 KEGG:ns NR:ns ## COG: ECs0010 COG1584 # Protein_GI_number: 15829264 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 188 1 188 188 299 100.0 2e-81 MGNTKLANPAPLGLMGFGMTTILLNLHNVGYFALDGIILAMGIFYGGIAQIFAGLLEYKK GNTFGLTAFTSYGSFWLTLVAILLMPKLGLTDAPNAQFLGVYLGLWGVFTLFMFFGTLKG ARVLQFVFFSLTVLFALLAIGNIAGNAAIIHFAGWIGLICGASAIYLAMGEVLNEQFGRT VLPIGESH >gi|296494518|gb|ADTN01000220.1| GENE 10 10359 - 10751 369 130 aa, chain - ## HITS:1 COG:yaaW KEGG:ns NR:ns ## COG: yaaW COG4735 # Protein_GI_number: 16128005 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 130 108 237 237 192 98.0 1e-49 MSTFEIEQQLLEQFLRNTWKKMDEEHKQEFLHAVDTRVNELEELLPLLMKDKLLAKGVSH LLSSQLTRILRTHAAMSVLEHGLLRGAGLGGPVGAALNGVKAVSGSAYRVTIPAVLQIAC LRRMVSATQV >gi|296494518|gb|ADTN01000220.1| GENE 11 10772 - 11071 173 99 aa, chain - ## HITS:1 COG:yaaW KEGG:ns NR:ns ## COG: yaaW COG4735 # Protein_GI_number: 16128005 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 84 1 84 237 173 100.0 5e-44 MNVNYLNDSDLDFLQHCSEEQLANFARLLTHNEKGKTRLSSVLMRNELFKSMEGHPEQHR RNWQLIAGELQHFGGDSIANKLRGTVNCIGPFCSMFQSD >gi|296494518|gb|ADTN01000220.1| GENE 12 11097 - 11501 367 134 aa, chain - ## HITS:1 COG:no KEGG:JW0012 NR:ns ## KEGG: JW0012 # Name: yaaI # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 134 1 134 134 245 100.0 3e-64 
MKSVFTISASLAISLMLCCTAQANDHKLLGAIAMPRNETNDLALKLPVCRIVKRIQLSAD HGDLQLSGASVYFKAARSASQSLNIPSEIKEGQTTDWININSDNDNKRCVSKITFSGHTV NSSDMATLKIIGDD >gi|296494518|gb|ADTN01000220.1| GENE 13 11878 - 13794 2358 638 aa, chain + ## HITS:1 COG:ECs0014 KEGG:ns NR:ns ## COG: ECs0014 COG0443 # Protein_GI_number: 15829268 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Escherichia coli O157:H7 # 1 638 1 638 638 1106 100.0 0 MGKIIGIDLGTTNSCVAIMDGTTPRVLENAEGDRTTPSIIAYTQDGETLVGQPAKRQAVT NPQNTLFAIKRLIGRRFQDEEVQRDVSIMPFKIIAADNGDAWVEVKGQKMAPPQISAEVL KKMKKTAEDYLGEPVTEAVITVPAYFNDAQRQATKDAGRIAGLEVKRIINEPTAAALAYG LDKGTGNRTIAVYDLGGGTFDISIIEIDEVDGEKTFEVLATNGDTHLGGEDFDSRLINYL VEEFKKDQGIDLRNDPLAMQRLKEAAEKAKIELSSAQQTDVNLPYITADATGPKHMNIKV TRAKLESLVEDLVNRSIEPLKVALQDAGLSVSDIDDVILVGGQTRMPMVQKKVAEFFGKE PRKDVNPDEAVAIGAAVQGGVLTGDVKDVLLLDVTPLSLGIETMGGVMTTLIAKNTTIPT KHSQVFSTAEDNQSAVTIHVLQGERKRAADNKSLGQFNLDGINPAPRGMPQIEVTFDIDA DGILHVSAKDKNSGKEQKITIKASSGLNEDEIQKMVRDAEANAEADRKFEELVQTRNQGD HLLHSTRKQVEEAGDKLPADDKTAIESALTALETALKGEDKAAIEAKMQELAQVSQKLME IAQQQHAQQQTAGADASANNAKDDDVVDAEFEEVKDKK >gi|296494518|gb|ADTN01000220.1| GENE 14 13883 - 15013 1163 376 aa, chain + ## HITS:1 COG:dnaJ KEGG:ns NR:ns ## COG: dnaJ COG0484 # Protein_GI_number: 16128009 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Escherichia coli K12 # 1 376 1 376 376 698 99.0 0 MAKQDYYEILGVSKTAEEREIRKAYKRLAMKYHPDRNQGDKEAEAKFKEIKEAYEVLTDS QKRAAYDQYGHAAFEQGGMGGGGFGGGADFSDIFGDVFGDIFGGGRGRQRAARGADLRYN MELTLEEAVRGVTKEIRIPTLEECDVCHGSGAKPGTQPQTCPTCHGSGQVQMRQGFFAVQ QTCPHCQGRGTLIKDPCNKCHGHGRVERSKTLSVKIPAGVDTGDRIRLAGEGEAGEHGAP AGDLYVQVQVKQHPIFEREGNNLYCEVPINFAMAALGGEIEVPTLDGRVKLKVPGETQTG KLFRMRGKGVKSVRGGAQGDLLCRVVVETPVGLNEKQKQLLQELQESFGGPTGEHNSPRS KSFFDGVKKFFDDLTR Prediction of potential genes in microbial genomes Time: Sun May 15 23:53:03 2011 Seq name: 
gi|296494517|gb|ADTN01000221.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont471.1, whole genome shotgun sequence Length of sequence - 5667 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 352 - 411 9.2 1 1 Op 1 25/0.000 + CDS 518 - 991 96 ## COG0438 Glycosyltransferase + Prom 1428 - 1487 2.7 2 1 Op 2 11/0.000 + CDS 1553 - 2155 127 ## COG0438 Glycosyltransferase 3 1 Op 3 11/0.000 + CDS 2203 - 3393 212 ## COG0500 SAM-dependent methyltransferases + Term 3548 - 3593 1.1 + Prom 3785 - 3844 6.0 4 1 Op 4 . + CDS 3871 - 5478 267 ## COG0438 Glycosyltransferase Predicted protein(s) >gi|296494517|gb|ADTN01000221.1| GENE 1 518 - 991 96 157 aa, chain + ## HITS:1 COG:TM0622 KEGG:ns NR:ns ## COG: TM0622 COG0438 # Protein_GI_number: 15643387 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Thermotoga maritima # 5 150 226 369 388 104 40.0 7e-23 MMQLPHSELIIVGEGEERLTLQTFIDQNSLSERVTLLGKKDNVVDYYHDSDAFVLASDYE GFALVVAEAMACGLPVVATDCGGPAEIIGNGQDFGITISVNDVHHLTAAMQQIESLDIEQ RKGSGYRARERVIEKFSAKKIVSQWEDIYLCIQKNRI >gi|296494517|gb|ADTN01000221.1| GENE 2 1553 - 2155 127 200 aa, chain + ## HITS:1 COG:NMB0218 KEGG:ns NR:ns ## COG: NMB0218 COG0438 # Protein_GI_number: 15676144 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Neisseria meningitidis MC58 # 1 189 185 375 376 193 51.0 2e-49 MSDYPYCPPPVEPVSFIFVARFLQEKGVYEFIEAAKIVKKKFPETHFCMLGHLDLHNPGS LTIERLNDLKCNNIIELPGHVDNVQEWLAKASVFVLPSWREGFPRSTQEAMAMGRAVITS DVPGCRDTVVDGVNGFLIKPRSSHALAEKMLCFLHQPELITQMGNASHQIALQQFDSSIV NSKLMKILRVDFENTKSNVC >gi|296494517|gb|ADTN01000221.1| GENE 3 2203 - 3393 212 396 aa, chain + ## HITS:1 COG:BH0418 KEGG:ns NR:ns ## COG: BH0418 COG0500 # Protein_GI_number: 15612981 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: 
SAM-dependent methyltransferases # Organism: Bacillus halodurans # 20 164 24 168 203 68 30.0 2e-11 MERLNLSEKPAHSLQESAIHCARYANILHLVKDKVVLDIACGEGYGSALLMKAGAKRVVG VDISEESIERAKKLFGKYDVEYIVSDANTISERYGEDFFDIVVSIETIEHINTPDVFLSS IKKTAKENAIFYITCPNDYWYYPNNEQSNPCHVRKYTFQEFKELSTGVLGNNVQWAYGGS VFGFGTVSANNRGLEKIGSSWMEISESENSINVLNTEIDAVNENNCSFFVGLWNAPETNF SNSTFPISMDAYSRMTEEFESNVSSVLREDQAENKKAISILKKDMKQLTMNLRKSNLLLS ALKIENVTLRKNISELQQQLAKQIAMYHDVTNAKNNLELENADLISLHGDLATKNSSLME RISQMEIPYYRYLRLSALLPGFLKKIILKIVRFIRK >gi|296494517|gb|ADTN01000221.1| GENE 4 3871 - 5478 267 535 aa, chain + ## HITS:1 COG:mll7087 KEGG:ns NR:ns ## COG: mll7087 COG0438 # Protein_GI_number: 13475905 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Mesorhizobium loti # 357 525 162 322 340 103 34.0 1e-21 MPLLDSDGETLFANGGNVFLDIDDDSVHVGAGSACIQEKTITYDGEGFLSTFLFGGASVI KIATLKNLGKYDENMFIGFEDIDLSIRLYQSGMKVGTCGAISLIHDHPKPSSNKDKDYEK IRFTRSILKESADYLENKWGMVFWSAPVDDWLEEKQRSFELTTQHNDNLSQFDDNKKGAT YDLSVSKPKIALIIDTENWAFSNIANQIVNYLSDSYDFTIIPTEIVDNISQVIMMTRNYD ITHFFWRESLRLIYDEYYINYNRTIGFDELSFKKEFLDGRIITSSVYDHLFLDEQAIKLR KTFYNDLITAYTVSSSKLYDIYSSIVEYPKPSVLAEDGVNLKLFYPINLERFDNIESRAL RVGWAGNSKWAGELEDFKGYHSLLKPAVEQLQSEGLNIELVLADRQLGFIPHDEMVKYYS QIDVYVCPSKIEGTPNPVLESMACGVPVISTDVGVVKDAFGEMQKEWILPVRSKDTLIDK LRDFYHKRNSVVKCLSSENLQQIKKWDWKVKSENFRTFFDKVLESKKHSEISNKM Prediction of potential genes in microbial genomes Time: Sun May 15 23:53:04 2011 Seq name: gi|296494516|gb|ADTN01000222.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont475.1, whole genome shotgun sequence Length of sequence - 1968 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 2 1 Op 2 . - CDS 1104 - 1181 66 ## - Prom 1373 - 1432 7.2 + Prom 1426 - 1485 4.5 3 2 Tu 1 . 
+ CDS 1521 - 1934 551 ## COG2916 DNA-binding protein H-NS Predicted protein(s) >gi|296494516|gb|ADTN01000222.1| GENE 1 299 - 916 242 205 aa, chain - ## HITS:1 COG:ECs1740 KEGG:ns NR:ns ## COG: ECs1740 COG1435 # Protein_GI_number: 15830994 # Func_class: F Nucleotide transport and metabolism # Function: Thymidine kinase # Organism: Escherichia coli O157:H7 # 1 205 1 205 205 407 100.0 1e-114 MAQLYFYYSAMNAGKSTALLQSSYNYQERGMRTVVYTAEIDDRFGAGKVSSRIGLSSPAK LFNQNSSLFDEIRAEHEQQAIHCVLVDECQFLTRQQVYELSEVVDQLDIPVLCYGLRTDF RGELFIGSQYLLAWSDKLVELKTICFCGRKASMVLRLDQAGRPYNEGEQVVIGGNERYVS VCRKHYKEALQVGSLTAIQERHRHD >gi|296494516|gb|ADTN01000222.1| GENE 2 1104 - 1181 66 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDLVLHYLLEVNRGLLLAQQVKDCE >gi|296494516|gb|ADTN01000222.1| GENE 3 1521 - 1934 551 137 aa, chain + ## HITS:1 COG:ECs1739 KEGG:ns NR:ns ## COG: ECs1739 COG2916 # Protein_GI_number: 15830993 # Func_class: R General function prediction only # Function: DNA-binding protein H-NS # Organism: Escherichia coli O157:H7 # 1 137 1 137 137 197 100.0 4e-51 MSEALKILNNIRTLRAQARECTLETLEEMLEKLEVVVNERREEESAAAAEVEERTRKLQQ YREMLIADGIDPNELLNSLAAVKSGTKAKRAQRPAKYSYVDENGETKTWTGQGRTPAVIK KAMDEQGKSLDDFLIKQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:53:09 2011 Seq name: gi|296494515|gb|ADTN01000223.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont475.2, whole genome shotgun sequence Length of sequence - 5524 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 47 - 88 8.8 1 1 Op 1 5/0.000 - CDS 111 - 1019 972 ## COG1210 UDP-glucose pyrophosphorylase - Prom 1156 - 1215 4.9 - Term 1143 - 1177 -0.4 2 1 Op 2 4/1.000 - CDS 1221 - 2234 707 ## COG0784 FOG: CheY-like receiver - Term 2253 - 2282 -0.3 3 1 Op 3 . 
- CDS 2326 - 3231 367 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily - Prom 3476 - 3535 3.0 + Prom 3240 - 3299 3.8 4 2 Op 1 4/1.000 + CDS 3344 - 3802 195 ## PROTEIN SUPPORTED gi|90021194|ref|YP_527021.1| ribosomal protein L20 5 2 Op 2 . + CDS 3852 - 4694 863 ## COG0788 Formyltetrahydrofolate hydrolase + TRNA 4854 - 4938 66.9 # Tyr GTA 0 0 + TRNA 5148 - 5232 66.9 # Tyr GTA 0 0 - Term 5036 - 5064 2.3 6 3 Tu 1 . - CDS 5229 - 5411 94 ## ECUMN_1528 hypothetical protein - Prom 5451 - 5510 1.6 Predicted protein(s) >gi|296494515|gb|ADTN01000223.1| GENE 1 111 - 1019 972 302 aa, chain - ## HITS:1 COG:ECs1738 KEGG:ns NR:ns ## COG: ECs1738 COG1210 # Protein_GI_number: 15830992 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose pyrophosphorylase # Organism: Escherichia coli O157:H7 # 1 302 1 302 302 589 100.0 1e-168 MAAINTKVKKAVIPVAGLGTRMLPATKAIPKEMLPLVDKPLIQYVVNECIAAGITEIVLV THSSKNSIENHFDTSFELEAMLEKRVKRQLLDEVQSICPPHVTIMQVRQGLAKGLGHAVL CAHPVVGDEPVAVILPDVILDEYESDLSQDNLAEMIRRFDETGHSQIMVEPVADVTAYGV VDCKGVELAPGESVPMVGVVEKPKADVAPSNLAIVGRYVLSADIWPLLAKTPPGAGDEIQ LTDAIDMLIEKETVEAYHMKGKSHDCGNKLGYMQAFVEYGIRHNTLGTEFKAWLEEEMGI KK >gi|296494515|gb|ADTN01000223.1| GENE 2 1221 - 2234 707 337 aa, chain - ## HITS:1 COG:STM1753_1 KEGG:ns NR:ns ## COG: STM1753_1 COG0784 # Protein_GI_number: 16765097 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Salmonella typhimurium LT2 # 1 134 1 134 134 248 91.0 2e-65 MTQPLVGKQILIVEDEQVFRSLLDSWFSSLGATTVLAADGVDALELLGGFTPDLMICDIA MPRMNGLKLLEHIRNRGDQTPVLVISATENMADIAKALRLGVEDVLLKPVKDLNRLREMV FACLYPSMFNSRVEEEERLFRDWDAMVDNPAAAAKLLQELQPPVQQVISHCRVNYRQLVA ADKPGLVLDIAALSENDLAFYCLDVTRAGHNGVLAALLLRALFNGLLQEQLAHQNQRLPE LGALLKQVNHLLRQANLPGQFPLLVGYYHRELKNLILVSAGLNATLNTGEHQVQISNGVP LGTLGNAYLNQLSQRCDAWQCQIWGTGGRLRLMLSAE >gi|296494515|gb|ADTN01000223.1| GENE 3 2326 - 3231 367 301 aa, chain - ## HITS:1 COG:rssA KEGG:ns NR:ns ## COG: rssA COG1752 # Protein_GI_number: 16129195 # 
Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Escherichia coli K12 # 1 301 14 314 314 607 99.0 1e-174 MRKIKIGLALGSGAARGWSHIGVINALKKVGIEIDIIAGCSIGSLVGAAYACDRLSALED WVTSFSYWDVLRLMDLSWQRGGLLRGERVFNQYREIMPETEIENCSRRFAAVATNLSTGR ELWFTEGDLHLAIRASCSIPGLMAPVAHNGYWLVDGAVVNPIPISLTRALGADIVIAVDL QHDAHLMQQDLLSFNVSEENSENGDSLPWHARLKERLGSITTRRAVTAPTATEIMTTSIQ VLENRLKRNRMAGDPPDILIQPVCPQISTLDFHRAHAAIAAGQLAVERKMDELLPLVRTN I >gi|296494515|gb|ADTN01000223.1| GENE 4 3344 - 3802 195 152 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90021194|ref|YP_527021.1| ribosomal protein L20 [Saccharophagus degradans 2-40] # 4 127 3 131 134 79 36 5e-15 MSQLCPCGSAVEYSLCCHPYVSGEKVAPDPEHLMRSRYCTFVMQDADYLIKTWHPSCGAA ALRAELIAGFAHTEWLGLTVFEHCWQDADNIGFVSFVARFTEGGKTGAIIERSRFLKENG QWYYIDGTRPQFGRNDPCPCGSGKKFKKCCGQ >gi|296494515|gb|ADTN01000223.1| GENE 5 3852 - 4694 863 280 aa, chain + ## HITS:1 COG:purU KEGG:ns NR:ns ## COG: purU COG0788 # Protein_GI_number: 16129193 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate hydrolase # Organism: Escherichia coli K12 # 1 280 1 280 280 563 99.0 1e-161 MHSLQRKVLRTICPDQKGLIARITNICYKHELNIVQNNEFVDHRTGRFFMRTELEGIFND STLLADLDSALPEGSVRELNPAGRRRIVILVTKEAHCLGDLLMKANYGGLDVEIAAVIGN HDTLRSLVERFDIPFELVSHEGLSRNEHDQKMADAIDAYQPDYVVLAKYMRVLTPEFVAR FPNKIINIHHSFLPAFIGARPYHQAYERGVKIIGATAHYVNDNLDEGPIIMQDVIHVDHT YTAEDMMRAGRDVEKNVLSRALYKVLAQRVFVYGNRTIIL >gi|296494515|gb|ADTN01000223.1| GENE 6 5229 - 5411 94 60 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_1528 NR:ns ## KEGG: ECUMN_1528 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 59 2 60 149 97 94.0 1e-19 MVVGEGLLSAARFALRVVACGNALSLALESNLGRSFSSFPAWAEYLIADSLGSSGTFESG Prediction of potential genes in microbial genomes Time: Sun May 15 23:53:16 2011 Seq name: gi|296494514|gb|ADTN01000224.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont475.3, whole genome shotgun sequence Length of 
sequence - 16456 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 7, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 161 - 192 1.6 1 1 Op 1 12/0.000 - CDS 212 - 889 883 ## COG2181 Nitrate reductase gamma subunit 2 1 Op 2 12/0.000 - CDS 889 - 1599 902 ## COG2180 Nitrate reductase delta subunit 3 1 Op 3 13/0.000 - CDS 1596 - 3134 1870 ## COG1140 Nitrate reductase beta subunit 4 1 Op 4 10/0.000 - CDS 3131 - 6874 4599 ## COG5013 Nitrate reductase alpha subunit - Prom 7064 - 7123 5.7 - Term 7213 - 7243 3.4 5 1 Op 5 . - CDS 7390 - 8781 1200 ## COG2223 Nitrate/nitrite transporter - Prom 8807 - 8866 1.8 6 1 Op 6 . - CDS 8876 - 9094 120 ## SSON_1954 hypothetical protein - Prom 9332 - 9391 3.1 + Prom 8901 - 8960 8.3 7 2 Op 1 8/0.000 + CDS 9120 - 10916 1554 ## COG3850 Signal transduction histidine kinase, nitrate/nitrite-specific 8 2 Op 2 . + CDS 10909 - 11559 859 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 9 3 Tu 1 . - CDS 11560 - 12990 825 ## ECBD_2399 hypothetical protein - Prom 13130 - 13189 2.7 + Prom 13047 - 13106 2.7 10 4 Tu 1 . + CDS 13140 - 13493 374 ## COG1553 Uncharacterized conserved protein involved in intracellular sulfur reduction + Term 13496 - 13541 2.8 11 5 Tu 1 2/1.000 - CDS 13537 - 14211 523 ## COG3703 Uncharacterized protein involved in cation transport - Prom 14295 - 14354 2.9 12 6 Tu 1 . - CDS 14390 - 14620 246 ## COG4572 Putative cation transport regulator - Prom 14840 - 14899 5.0 + Prom 14784 - 14843 4.8 13 7 Tu 1 . 
+ CDS 14890 - 15990 1127 ## COG0387 Ca2+/H+ antiporter + Term 15994 - 16048 4.8 Predicted protein(s) >gi|296494514|gb|ADTN01000224.1| GENE 1 212 - 889 883 225 aa, chain - ## HITS:1 COG:ECs1732 KEGG:ns NR:ns ## COG: ECs1732 COG2181 # Protein_GI_number: 15830986 # Func_class: C Energy production and conversion # Function: Nitrate reductase gamma subunit # Organism: Escherichia coli O157:H7 # 1 225 1 225 225 422 100.0 1e-118 MQFLNMFFFDIYPYIAGAVFLIGSWLRYDYGQYTWRAASSQMLDRKGMNLASNLFHIGIL GIFVGHFFGMLTPHWMYEAWLPIEVKQKMAMFAGGASGVLCLIGGVLLLKRRLFSPRVRA TTTGADILILSLLVIQCALGLLTIPFSAQHMDGSEMMKLVGWAQSVVTFHGGASQHLDGV AFIFRLHLVLGMTLFLLFPFSRLVHIWSVPVEYLTRKYQLVRARH >gi|296494514|gb|ADTN01000224.1| GENE 2 889 - 1599 902 236 aa, chain - ## HITS:1 COG:ECs1731 KEGG:ns NR:ns ## COG: ECs1731 COG2180 # Protein_GI_number: 15830985 # Func_class: C Energy production and conversion # Function: Nitrate reductase delta subunit # Organism: Escherichia coli O157:H7 # 1 236 1 236 236 426 98.0 1e-119 MIELVIVSRLLEYPDAALWQHQQEMFEAIAASKNLSKEDAHALGIFLRDLTAMDPLDAQA QYSELFDRGRATSLLLFEHVHGESRDRGQAMVDLLAQYEQHGLQLNSRELPDHLPLYLEY LSQLPQSEAVEGLKDIAPILALLSARLQQRESRYAVMFDLLLKLANTAIDSDKVAEKIAD EARDDTPQALDAVWEEEQVKFFADKGCGDSAITAHQRRFAGAVAPQYLNITTGGQH >gi|296494514|gb|ADTN01000224.1| GENE 3 1596 - 3134 1870 512 aa, chain - ## HITS:1 COG:ECs1730 KEGG:ns NR:ns ## COG: ECs1730 COG1140 # Protein_GI_number: 15830984 # Func_class: C Energy production and conversion # Function: Nitrate reductase beta subunit # Organism: Escherichia coli O157:H7 # 1 512 1 512 512 1093 100.0 0 MKIRSQVGMVLNLDKCIGCHTCSVTCKNVWTSREGVEYAWFNNVETKPGQGFPTDWENQE KYKGGWIRKINGKLQPRMGNRAMLLGKIFANPHLPGIDDYYEPFDFDYQNLHTAPEGSKS QPIARPRSLITGERMAKIEKGPNWEDDLGGEFDKLAKDKNFDNIQKAMYSQFENTFMMYL PRLCEHCLNPACVATCPSGAIYKREEDGIVLIDQDKCRGWRMCITGCPYKKIYFNWKSGK SEKCIFCYPRIEAGQPTVCSETCVGRIRYLGVLLYDADAIERAASTENEKDLYQRQLEVF LDPNDPKVIEQAIKDGIPLSVIEAAQQSPVYKMAMEWKLALPLHPEYRTLPMVWYVPPLS PIQSAADAGELGSNGILPDVESLRIPVQYLANLLTAGDTKPVLRALKRMLAMRHYKRAET 
VDGKVDTRALEEVGLTEAQAQEMYRYLAIANYEDRFVVPSSHRELAREAFPEKNGCGFTF GDGCHGSDTKFNLFNSRRIDAIDVTSKTEPHP >gi|296494514|gb|ADTN01000224.1| GENE 4 3131 - 6874 4599 1247 aa, chain - ## HITS:1 COG:ECs1729 KEGG:ns NR:ns ## COG: ECs1729 COG5013 # Protein_GI_number: 15830983 # Func_class: C Energy production and conversion # Function: Nitrate reductase alpha subunit # Organism: Escherichia coli O157:H7 # 1 1247 1 1247 1247 2601 100.0 0 MSKFLDRFRYFKQKGETFADGHGQLLNTNRDWEDGYRQRWQHDKIVRSTHGVNCTGSCSW KIYVKNGLVTWETQQTDYPRTRPDLPNHEPRGCPRGASYSWYLYSANRLKYPMMRKRLMK MWREAKALHSDPVEAWASIIEDADKAKSFKQARGRGGFVRSSWQEVNELIAASNVYTIKN YGPDRVAGFSPIPAMSMVSYASGARYLSLIGGTCLSFYDWYCDLPPASPQTWGEQTDVPE SADWYNSSYIIAWGSNVPQTRTPDAHFFTEVRYKGTKTVAVTPDYAEIAKLCDLWLAPKQ GTDAAMALAMGHVMLREFHLDNPSQYFTDYVRRYTDMPMLVMLEERDGYYAAGRMLRAAD LVDALGQENNPEWKTVAFNTNGEMVAPNGSIGFRWGEKGKWNLEQRDGKTGEETELQLSL LGSQDEIAEVGFPYFGGDGTEHFNKVELENVLLHKLPVKRLQLADGSTALVTTVYDLTLA NYGLERGLNDVNCATSYDDVKAYTPAWAEQITGVSRSQIIRIAREFADNADKTHGRSMII VGAGLNHWYHLDMNYRGLINMLIFCGCVGQSGGGWAHYVGQEKLRPQTGWQPLAFALDWQ RPARHMNSTSYFYNHSSQWRYETVTAEELLSPMADKSRYTGHLIDFNVRAERMGWLPSAP QLGTNPLTIAGEAEKAGMNPVDYTVKSLKEGSIRFAAEQPENGKNHPRNLFIWRSNLLGS SGKGHEFMLKYLLGTEHGIQGKDLGQQGGVKPEEVDWQDNGLEGKLDLVVTLDFRLSSTC LYSDIILPTATWYEKDDMNTSDMHPFIHPLSAAVDPAWEAKSDWEIYKAIAKKFSEVCVG HLGKETDIVTLPIQHDSAAELAQPLDVKDWKKGECDLIPGKTAPHIMVVERDYPATYERF TSIGPLMEKIGNGGKGIAWNTQSEMDLLRKLNYTKAEGPAKGQPMLNTAIDAAEMILTLA PETNGQVAVKAWAALSEFTGRDHTHLALNKEDEKIRFRDIQAQPRKIISSPTWSGLEDEH VSYNAGYTNVHELIPWRTLSGRQQLYQDHQWMRDFGESLLVYRPPIDTRSVKEVIGQKSN GNPEKALNFLTPHQKWGIHSTYSDNLLMLTLGRGGPVVWLSEADAKDLGIADNDWIEVFN SNGALTARAVVSQRVPAGMTMMYHAQERIVNLPGSEITQQRGGIHNSVTRITPKPTHMIG GYAHLAYGFNYYGTVGSNRDEFVVVRKMKNIDWLDGEGNDQVQESVK >gi|296494514|gb|ADTN01000224.1| GENE 5 7390 - 8781 1200 463 aa, chain - ## HITS:1 COG:narK KEGG:ns NR:ns ## COG: narK COG2223 # Protein_GI_number: 16129186 # Func_class: P Inorganic ion transport and metabolism # Function: Nitrate/nitrite transporter # Organism: Escherichia coli K12 # 1 463 1 463 463 801 
100.0 0 MSHSSAPERATGAVITDWRPEDPAFWQQRGQRIASRNLWISVPCLLLAFCVWMLFSAVAV NLPKVGFNFTTDQLFMLTALPSVSGALLRVPYSFMVPIFGGRRWTAFSTGILIIPCVWLG FAVQDTSTPYSVFIIISLLCGFAGANFASSMANISFFFPKQKQGGALGLNGGLGNMGVSV MQLVAPLVVSLSIFAVFGSQGVKQPDGTELYLANASWIWVPFLAIFTIAAWFGMNDLATS KASIKEQLPVLKRGHLWIMSLLYLATFGSFIGFSAGFAMLSKTQFPDVQILQYAFFGPFI GALARSAGGALSDRLGGTRVTLVNFILMAIFSGLLFLTLPTDGQGGSFMAFFAVFLALFL TAGLGSGSTFQMISVIFRKLTMDRVKAEGGSDERAMREAATDTAAALGFISAIGAIGGFF IPKAFGSSLALTGSPVGAMKVFLIFYIACVVITWAVYGRHSKK >gi|296494514|gb|ADTN01000224.1| GENE 6 8876 - 9094 120 72 aa, chain - ## HITS:1 COG:no KEGG:SSON_1954 NR:ns ## KEGG: SSON_1954 # Name: not_defined # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 72 1 72 72 141 100.0 8e-33 MSNNLNECDDTFWNGSILGYWLKYTHTRKELLLICRVVSRFTSVRVGILQHREKSHNFYE ITVLTMGNDKYQ >gi|296494514|gb|ADTN01000224.1| GENE 7 9120 - 10916 1554 598 aa, chain + ## HITS:1 COG:ECs1727 KEGG:ns NR:ns ## COG: ECs1727 COG3850 # Protein_GI_number: 15830981 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase, nitrate/nitrite-specific # Organism: Escherichia coli O157:H7 # 1 598 1 598 598 1133 100.0 0 MLKRCLSPLTLVNQVALIVLLSTAIGLAGMAVSGWLVQGVQGSAHAINKAGSLRMQSYRL LAAVPLSEKDKPLIKEMEQTAFSAELTRAAERDGQLAQLQGLQDYWRNELIPALMRAQNR ETVSADVSQFVAGLDQLVSGFDRTTEMRIETVVLVHRVMAVFMALLLVFTIIWLRARLLQ PWRQLLAMASAVSHRDFTQRANISGRNEMAMLGTALNNMSAELAESYAVLEQRVQEKTAG LEHKNQILSFLWQANRRLHSRAPLCERLSPVLNGLQNLTLLRDIELRVYDTDDEENHQEF TCQPDMTCDDKGCQLCPRGVLPVGDRGTTLKWRLADSHTQYGILLATLPQGRHLSHDQQQ LVDTLVEQLTATLALDRHQERQQQLIVMEERATIARELHDSIAQSLSCMKMQVSCLQMQG DALPESSRELLSQIRNELNASWAQLRELLTTFRLQLTEPGLRPALEASCEEYSAKFGFPV KLDYQLPPRLVPSHQAIHLLQIAREALSNALKHSQASEVVVTVAQNDNQVKLTVQDNGCG VPENAIRSNHYGMIIMRDRAQSLRGDCRVRRRESGGTEVVVTFIPEKTFTDVQGDTHE >gi|296494514|gb|ADTN01000224.1| GENE 8 10909 - 11559 859 216 aa, chain + ## HITS:1 COG:ECs1726 KEGG:ns NR:ns ## COG: ECs1726 COG2197 # Protein_GI_number: 15830980 # Func_class: T Signal transduction mechanisms; K Transcription # 
Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 216 1 216 216 346 100.0 2e-95 MSNQEPATILLIDDHPMLRTGVKQLISMAPDITVVGEASNGEQGIELAESLDPDLILLDL NMPGMNGLETLDKLREKSLSGRIVVFSVSNHEEDVVTALKRGADGYLLKDMEPEDLLKAL HQAAAGEMVLSEALTPVLAASLRANRATTERDVNQLTPRERDILKLIAQGLPNKMIARRL DITESTVKVHVKHMLKKMKLKSRVEAAVWVHQERIF >gi|296494514|gb|ADTN01000224.1| GENE 9 11560 - 12990 825 476 aa, chain - ## HITS:1 COG:no KEGG:ECBD_2399 NR:ns ## KEGG: ECBD_2399 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_BL21_DE3 # Pathway: not_defined # 1 476 1 476 476 925 100.0 0 MVFHFLPEVTDVLSRFVPRIISFYLLLLAAGGTANAQSTFEQKAANPFDNNNDGLPDLGM APENHDGEKHFAEIVKDFGETSMNDNGLDTGEQAKAFALGKVRDALSQQVNQHVESWLSP WGNASVDVKVDNEGHFTGSRGSWFVPLQDNDRYLTWSQLGLTQQDDGLVSNVGVGQRWAR GNWLVGYNTFYDNLLDENLQRAGFGAEAWGEYLRLSANFYQPFAAWHEQTATQEQRMARG YDLTARMRMPFYQHLNTSVSVEQYFGDRVDLFNSGTGYHNPVALSLGLNYTPVPLVTVTA QHKQGESGENQNNLGLNLNYRFGVPLKKQLSAGEVAESQSLRGSRYDNPQRNNLPTLEYR QRKTLTVFLATPPWDLKPGETVPLKLQIRSRYGIRQLIWQGDTQILSLTPGAQANSAEGW TLIMPDWQNGEGASNHWRLSVVVEDNQGQRVSSNEITLTLVEPFDALSNDELRWEP >gi|296494514|gb|ADTN01000224.1| GENE 10 13140 - 13493 374 117 aa, chain + ## HITS:1 COG:ECs1724 KEGG:ns NR:ns ## COG: ECs1724 COG1553 # Protein_GI_number: 15830978 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized conserved protein involved in intracellular sulfur reduction # Organism: Escherichia coli O157:H7 # 1 117 1 117 117 224 100.0 2e-59 MQKIVIVANGAPYGSESLFNSLRLAIALREQESNLDLRLFLMSDAVTAGLRGQKPGEGYN IQQMLEILTAQNVPVKLCKTCTDGRGISTLPLIDGVEIGTLVELAQWTLSADKVLTF >gi|296494514|gb|ADTN01000224.1| GENE 11 13537 - 14211 523 224 aa, chain - ## HITS:1 COG:chaC KEGG:ns NR:ns ## COG: chaC COG3703 # Protein_GI_number: 16129181 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in cation transport # Organism: Escherichia coli K12 # 1 224 15 238 238 454 99.0 1e-128 
MNADCKTAFGAIEESLLWSAEQRAASLAATLACRPDEGPVWIFGYGSLMWNPALEFTESC TGTLVGWHRAFCLRLTAGRGTAHQPGRMLALKEGGRTTGVAYRLPEETLEQELTLLWKRE MITGCYLPTWCQLDLDDGCTVNAIVFIMDPRHPEYESDTRAQVIAPLIAAASGPLGTNAQ YLFSLEQELIKLGMQDDGLNDLLVSVKKLLAENYPDGVLRPGFA >gi|296494514|gb|ADTN01000224.1| GENE 12 14390 - 14620 246 76 aa, chain - ## HITS:1 COG:ECs1722 KEGG:ns NR:ns ## COG: ECs1722 COG4572 # Protein_GI_number: 15830976 # Func_class: R General function prediction only # Function: Putative cation transport regulator # Organism: Escherichia coli O157:H7 # 1 76 1 76 76 99 100.0 1e-21 MPYKTKSDLPESVKHVLPSHAQDIYKEAFNSAWDQYKDKEDRRDDASREETAHKVAWAAV KHEYAKGDDDKWHKKS >gi|296494514|gb|ADTN01000224.1| GENE 13 14890 - 15990 1127 366 aa, chain + ## HITS:1 COG:STM1771 KEGG:ns NR:ns ## COG: STM1771 COG0387 # Protein_GI_number: 16765112 # Func_class: P Inorganic ion transport and metabolism # Function: Ca2+/H+ antiporter # Organism: Salmonella typhimurium LT2 # 1 366 1 366 366 577 93.0 1e-165 MSNAQEAVKTRHKETSLIFPVLALVVLFLWGSSQTLPVVIAINLLALIGILSSAFSVVRH ADVLAHRLGEPYGSLILSLSVVILEVSLISALMATGDAAPTLMRDTLYSIIMIVTGGLVG FSLLLGGRKFATQYMNLFGIKQYLIALFPLAIIVLVFPMALPAANFSTGQALLVALISAA MYGVFLLIQTKTHQSLFVYEHEDDSDDDDPHHGKPSAHSSVWHAIWLIIHLIAVIAVTKM NASPLETLLDSMNAPVAFTGFLVALLILSPEGLGALKAVLNNQVQRAMNLFFGSVLATIS LTVPVVTLIAFMTGNELQFALGAPEMVVMVASLVLCHISFSTGRTNVLNGAAHLALFAAY LMTIFA Prediction of potential genes in microbial genomes Time: Sun May 15 23:53:24 2011 Seq name: gi|296494513|gb|ADTN01000225.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont478.1, whole genome shotgun sequence Length of sequence - 721 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Sun May 15 23:53:25 2011 Seq name: gi|296494512|gb|ADTN01000226.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont497.1, whole genome shotgun sequence Length of sequence - 4085 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score 
pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 4083 3055 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member Predicted protein(s) >gi|296494512|gb|ADTN01000226.1| GENE 1 3 - 4083 3055 1360 aa, chain - ## HITS:1 COG:PSLT108 KEGG:ns NR:ns ## COG: PSLT108 COG0507 # Protein_GI_number: 17233470 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Salmonella typhimurium LT2 # 1 1360 118 1473 1752 2212 88.0 0 EALASTRVMTDGQSETVLTGNLVMALFNHDTSRDQEPQLHTHAVAANVTQHNGEWKTLSS DKVGKTGFIENVYANQIAFGRLYREKLKEQVEALGYETEVVGKHGMWEMPGVPVEAFSGR SQTIREAVGEDASLKSRDVAALDTRKSKQHVDPEVRMAEWMQTLKETGFDIRAYRDAADQ RAETRTQAPGAVSQEGPDVQQAVTQAIAGLSERKVQFTYTDVLARTVGILPPENGVIERA RAGIDEAISREQLIPLDREKGLFTSGIHVLDELSVRALSRDIMKQNRVTIHPEKSVPRTA GYSDAVSVLAQDRPSLAIVSGQGGAAGQRERVAELVMMAREQGREVQIIAADRRSQMNLK QDERLSGELITGRRQLLEGMAFTPGSTVIVDQGEKLSLKETLTLLDGAARHNVQVLITDS GQRTGTGSALMAMKDAGVNTYRWQGGEQRPATIISEPDRNVRYARLAGDFAASVKAGEES VAQVSGVREQAILTQAIRSELKTQGVLGHQEVTMTALSPVWLDSRSRYLRDMYRPGMVME QWNPETRSHDRYVIDRVTAQSHSLTLRDAQGETQVVRISSLDSSWSLFRPEKMPVADGER LRVTGKIPGLRVSGGDRLQVASVSEDAMTVVVPGRAEPATLPVADSPFTALKLENGWVET PGHSVSDSAKVFASVTQMAMDNATLNGLARSGRDVRLYSSLDETRTAEKLARHPSFTVVS EQIKARAGETLLETAISLQKAWLHTPAQQAIHLALPVVESKNLAFSMVDLLTEVKSFAAE GTSFTELGGEINAQIKRGDLLYVDVAKGYGTGLLVSRASYEAEKSILRHILEGKEAVTPL MERVPGELMEKLTSGQRAATRMILETSDRFTVVQGYAGVGKTTQFRAVMSAVNMLPESER PRVVGLGPTHRAVGEMRSAGVDAQTLASFLHDTQLLQRSGETPNFSNTLFLLDESSMVGN TDMARAYALIAAGGGRAVASGDTDQLQAIAPGQPFRLQQTRSAADVAIMKEIVRQTPELR EAVYSLINRDVERALSGLESVKPSQVPRQEGAWVPEHSVTEFSHSQEAKLAEAQQKAMLK GEAFPDIPMTLYEAIVRDYTGRTPEAREQTLIVTHLNEDRRVLNSMIHDAREKAGELGKE QVMVPVLNTANIRDGELRRLSTWETHRDALAQVDNVYHRIAGISKDDGLITLQDAEGNTR LISPREAVAEGVTLYTPDTIRVGTGDRMRFTKSDRERGYVANSVWTVTAVSGDSVTLSDG QQTRVIRPGQERAEQHIDLAYAITAHGAQGASETFAIALEGTEGNRKQMAGFESAYVALS RMKQHVQVYTDNRQGWTDAINNAVQKGTAHDVLEPKSDRE Prediction of potential genes in microbial genomes Time: Sun May 15 
23:53:27 2011 Seq name: gi|296494511|gb|ADTN01000227.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont497.2, whole genome shotgun sequence Length of sequence - 1386 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 377 61 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member - Prom 449 - 508 2.0 + Prom 368 - 427 3.8 2 2 Op 1 7/0.000 + CDS 459 - 686 272 ## COG4456 Virulence-associated protein and related proteins 3 2 Op 2 . + CDS 686 - 1084 331 ## COG1487 Predicted nucleic acid-binding protein, contains PIN domain + Term 1233 - 1275 -0.9 - Term 933 - 971 6.0 4 3 Tu 1 . - CDS 1093 - 1293 271 ## ECO26_p2-75 conjugal transfer protein TraD Predicted protein(s) >gi|296494511|gb|ADTN01000227.1| GENE 1 2 - 377 61 125 aa, chain - ## HITS:1 COG:PSLT108 KEGG:ns NR:ns ## COG: PSLT108 COG0507 # Protein_GI_number: 17233470 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Salmonella typhimurium LT2 # 1 116 1 116 1752 201 86.0 4e-52 MMSIAQVRSAGSAGNYYTDKDNYYVLGSMGERWAGQGAEQLGLQGSVDKDVFTRLLEGRL PDGADLSRMQDGRNRHRPGYDLTFSAPKSVSMMAMLGGDKRLIEAHNQAVDFAVRQGGGG GPPPG >gi|296494511|gb|ADTN01000227.1| GENE 2 459 - 686 272 75 aa, chain + ## HITS:1 COG:PSLT107 KEGG:ns NR:ns ## COG: PSLT107 COG4456 # Protein_GI_number: 17233505 # Func_class: S Function unknown # Function: Virulence-associated protein and related proteins # Organism: Salmonella typhimurium LT2 # 1 75 2 76 76 131 93.0 4e-31 METTVFLSNRSQAVRLPKAVALPENVKRVEVIAVGRTRIITPAGETWDEWFDGNSVSADF MDNREQPGMQERESF >gi|296494511|gb|ADTN01000227.1| GENE 3 686 - 1084 331 132 aa, chain + ## HITS:1 COG:PSLT106 KEGG:ns NR:ns ## COG: PSLT106 COG1487 # Protein_GI_number: 17233504 # Func_class: R General function prediction only # Function: Predicted nucleic 
acid-binding protein, contains PIN domain # Organism: Salmonella typhimurium LT2 # 1 132 1 132 132 261 96.0 3e-70 MLKFMLDTNICIFTIKNKPASVRERFNLNQGRMCISSVTLMELIYGAEKSQMPERNLAVI EGFVSRIDVLDYDAAAATHTGQIRAELARQGRPVGPFDQMIAGHARSRGLIIVTNNTREF ERVGGLRIEDWS >gi|296494511|gb|ADTN01000227.1| GENE 4 1093 - 1293 271 66 aa, chain - ## HITS:1 COG:no KEGG:ECO26_p2-75 NR:ns ## KEGG: ECO26_p2-75 # Name: not_defined # Def: conjugal transfer protein TraD # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 66 667 732 732 137 100.0 1e-31 MKPEEEMEQQLPPGISESGEVVDMAAYEAWQQENHPDIQQQMQRREEVNINVHRERGEDV EPGDDF Prediction of potential genes in microbial genomes Time: Sun May 15 23:53:31 2011 Seq name: gi|296494510|gb|ADTN01000228.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont498.1, whole genome shotgun sequence Length of sequence - 7336 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 6, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 114 - 152 6.1 1 1 Tu 1 . - CDS 162 - 599 425 ## EcE24377A_D0011 hypothetical protein 2 2 Tu 1 . - CDS 771 - 1088 251 ## SeHA_A0041 hypothetical protein + Prom 1388 - 1447 9.5 3 3 Op 1 . + CDS 1482 - 1730 382 ## SeHA_A0038 protein ImpC 4 3 Op 2 4/0.000 + CDS 1727 - 2164 188 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) 5 3 Op 3 . + CDS 2164 - 3435 533 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair + Term 3479 - 3526 -0.9 - Term 3389 - 3424 -0.6 6 4 Op 1 . - CDS 3440 - 3832 181 ## E2348_P1_101 plasmid stability protein 7 4 Op 2 . - CDS 3837 - 4808 593 ## p1ECUMN_0151 plasmid segregation protein - Prom 4962 - 5021 4.3 + Prom 5312 - 5371 2.2 8 5 Op 1 . + CDS 5418 - 5681 209 ## p1ECUMN_0152 hypothetical protein 9 5 Op 2 . + CDS 5675 - 5950 177 ## p1ECUMN_0153 conserved hypothetical protein, putative helix-turn-helix protein + Term 6020 - 6055 1.0 - Term 6048 - 6087 1.6 10 6 Op 1 . 
- CDS 6088 - 6879 495 ## COG0582 Integrase 11 6 Op 2 . - CDS 6876 - 7334 165 ## APECO1_O1CoBM115 hypothetical protein Predicted protein(s) >gi|296494510|gb|ADTN01000228.1| GENE 1 162 - 599 425 145 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_D0011 NR:ns ## KEGG: EcE24377A_D0011 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 145 164 308 308 298 98.0 3e-80 MPFDLLTVLPTRLDVEVNGFNGGVLNGVPSAYHWYTERYGVKWPCGYDLNISSQGDNFIQ VDFDTPWCQPESDVVAALSRRFGCTLEHWYAEQGCNFCGWQLYERGELVDVLWGELEWSS PTDDDELPEVTAPEWIVDKVAHYGG >gi|296494510|gb|ADTN01000228.1| GENE 2 771 - 1088 251 105 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0041 NR:ns ## KEGG: SeHA_A0041 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 105 1 105 308 219 99.0 3e-56 MPNWCSNRMHFSGEPAQIAEIKRLASGAVTPLYRRATNEGIQLFLAGSAGLLQITENIRS EQCPGVTAAGRGAVSTENIAFTRWLTHLQNGVLLDEQNCLMLHEL >gi|296494510|gb|ADTN01000228.1| GENE 3 1482 - 1730 382 82 aa, chain + ## HITS:1 COG:no KEGG:SeHA_A0038 NR:ns ## KEGG: SeHA_A0038 # Name: not_defined # Def: protein ImpC # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 82 1 82 82 141 96.0 6e-33 MIRIEILFDRQSTKNLKSGILQALQNEIEQRLKPHSPEIWLRIDQGSAPSVSVTGARNDK DKERIMSLLEEIWQDDSWLPAA >gi|296494510|gb|ADTN01000228.1| GENE 4 1727 - 2164 188 145 aa, chain + ## HITS:1 COG:PSLT055 KEGG:ns NR:ns ## COG: PSLT055 COG1974 # Protein_GI_number: 17233502 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Salmonella typhimurium LT2 # 20 142 17 137 140 167 67.0 4e-42 MSTVYHRPADPSGDDSYVRPLFADRCQAGFPSPATDYAEQELDLNSYCISRPAATFFLRA SGESMNQAGVQNGDLLVVDRAEKPQHGDIVIAEIDGEFTVKRLLLRPRPALEPVSDSREF RTLYPENICIFGVVTHVIHRTRELR >gi|296494510|gb|ADTN01000228.1| GENE 5 2164 - 3435 533 423 aa, chain + ## HITS:1 COG:PSLT054 KEGG:ns NR:ns ## COG: PSLT054 COG0389 # Protein_GI_number: 17233501 # Func_class: L Replication, 
recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Salmonella typhimurium LT2 # 1 423 1 423 424 629 70.0 1e-180 MFALADINSFYASCEKVFRPDLRNEPVIVLSNNDGCVIARSPEAKALGIRMGQPWFQVRQ MRLEKKIHVFSSNYALYHSMSQRVMAVLESLSPAVEPYSIDEMFIDLRGINHCISPEVFG HQLREQVKSWTGLTMGVGIAPTKTLAKSAQWATKQWPQFSGVVALEAENRNRILKLLGLQ PVGEVWGVGRRLTEKLNALGINTALQLAQANTAFIRKNFSVILERTVRELNGESCISLEE APPAKQQIVCSRSFGERITDKDAMHQAVVQYAERAAEKLRGERQYCRQVTTFVRTSPFAV KEPCYSNAAVEKLPLPTQDSRDIIAAACRALNHVWREGYRYMKAGVMLADFTPSGIAQPG LFDEIQPRKNSEKLMKTLDELNQSGKGKVWFAGRGTAPEWQMKREMLSPSYTTQWTEIPI ASF >gi|296494510|gb|ADTN01000228.1| GENE 6 3440 - 3832 181 130 aa, chain - ## HITS:1 COG:no KEGG:E2348_P1_101 NR:ns ## KEGG: E2348_P1_101 # Name: stbB # Def: plasmid stability protein # Organism: E.coli_0127 # Pathway: not_defined # 1 130 1 130 130 226 96.0 2e-58 MDDERKRKKFTLYLHPEKAADFQTLEAIESVPRSERGELFRNAFISGMALHQLDPRLPVL LTAILSEEFSADQVVTLLSQTTGWKPSQADIRAVLTELGALQSAEKMPPSATDSVQEAMN DVRLKMQKLF >gi|296494510|gb|ADTN01000228.1| GENE 7 3837 - 4808 593 323 aa, chain - ## HITS:1 COG:no KEGG:p1ECUMN_0151 NR:ns ## KEGG: p1ECUMN_0151 # Name: parM # Def: plasmid segregation protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 323 1 323 323 649 99.0 0 MNVYCDDGSTTIKLAWNDNGKICKSLSQNSFRHGWKVDGLGIRQTFNYELDGKKYTYDEV SNQSILTTHIEYQYTDVNLLAVHHALLNSGLAPQPVSLTVTLPISEFYTKECQKNELNIQ RKIENLMRPIRLNKGDVFTIEHVDVMPESLPAVFSRLVMDKVGQFEKSLVVDIGGTTLDV GVIVGQFDSVSAIHGNSGIGVSSVTKAAMSALRMASSDTSFLVADELIKRRNDPDFVRQV INDETKTDLVLNTIEGAIASLGEQVVNELGDFHHVNRVYVVGGGAPLIYDSIKTAWHHLG QKVVMMESPQTALVEAIAAFKEE >gi|296494510|gb|ADTN01000228.1| GENE 8 5418 - 5681 209 87 aa, chain + ## HITS:1 COG:no KEGG:p1ECUMN_0152 NR:ns ## KEGG: p1ECUMN_0152 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 87 128 214 214 185 98.0 4e-46 MRNLNPELKVYCLQSMATTNPVLRGNERKEFLEYLEEFPTIQVLDSVICFRKVYRDCMSN GTGVVETNNTAARAEIEHLMNEVFGPW >gi|296494510|gb|ADTN01000228.1| GENE 9 5675 - 5950 177 91 
aa, chain + ## HITS:1 COG:no KEGG:p1ECUMN_0153 NR:ns ## KEGG: p1ECUMN_0153 # Name: not_defined # Def: conserved hypothetical protein, putative helix-turn-helix protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 91 1 91 91 144 100.0 1e-33 MVKKPSQQALNRAAVTVEQAEALAQRLADKPYGAPEKPEPEKQCRTTISLGESMLVTIED LALRNKRNGKDPKNVSAIVRVALEQYLKTLT >gi|296494510|gb|ADTN01000228.1| GENE 10 6088 - 6879 495 263 aa, chain - ## HITS:1 COG:PSLT031 KEGG:ns NR:ns ## COG: PSLT031 COG0582 # Protein_GI_number: 17233417 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Salmonella typhimurium LT2 # 18 259 16 257 260 380 88.0 1e-105 MSGPQLLSLPGSFLTPPATLPVAIDYPAALALRQMALVQDELPKYLLAPEISALLHYVPD LHRKMLLATLWNTGARINEALALTRGDFSLAPPYPFVQLATLKQRTEKATRTAGRVPAGS QVHRLVPLSDTQYVSQLQMMVATLKIPLERRNKRTGRMEKARIWGITDRTVRTWLNEAVE AAAADGVTFSVPVTPHTFRHSYAMHMLYAGIPLKVLQSLMGHKSISSTEVYTKVFALDVA ARHRVQFSMPESDAVTMLKNRHA >gi|296494510|gb|ADTN01000228.1| GENE 11 6876 - 7334 165 152 aa, chain - ## HITS:1 COG:no KEGG:APECO1_O1CoBM115 NR:ns ## KEGG: APECO1_O1CoBM115 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 10 152 87 229 229 286 99.0 2e-76 TTDYLQQCPVEHDQDGRGEAEFLLPEIDYSPVSGNWRSLPSGLMYRLSELSVLSYEAVVC VDNVFVEDTPYGGAGEYSLHKNAAMLGVKALRLSRELRMLCGLPLHGLSDTLSPTRLVLL KARGKTLQKEYEMVKKSKKTEQEIEDFIKGTS Prediction of potential genes in microbial genomes Time: Sun May 15 23:53:52 2011 Seq name: gi|296494509|gb|ADTN01000229.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont498.2, whole genome shotgun sequence Length of sequence - 3474 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 189 - 248 5.4 1 1 Tu 1 . + CDS 272 - 976 344 ## COG3316 Transposase and inactivated derivatives - Term 832 - 869 -0.1 2 2 Tu 1 . - CDS 922 - 1236 182 ## COG0582 Integrase 3 3 Op 1 . 
+ CDS 1235 - 1447 79 ## gi|301647000|ref|ZP_07246835.1| conserved domain protein 4 3 Op 2 . + CDS 1502 - 1630 160 ## ECSE_P1-0012 hypothetical protein 5 4 Tu 1 . - CDS 1686 - 2861 426 ## COG0477 Permeases of the major facilitator superfamily - Prom 2900 - 2959 3.9 + Prom 2851 - 2910 3.7 6 5 Tu 1 . + CDS 2954 - 3473 222 ## COG1309 Transcriptional regulator Predicted protein(s) >gi|296494509|gb|ADTN01000229.1| GENE 1 272 - 976 344 234 aa, chain + ## HITS:1 COG:Cgl0933 KEGG:ns NR:ns ## COG: Cgl0933 COG3316 # Protein_GI_number: 19552183 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Corynebacterium glutamicum # 1 233 1 234 236 274 58.0 9e-74 MNPFKGRHFQRDIILWAVRWYCKYGISYRELQEMLAERGVNVDHSTIYRWVQRYAPEMEK RLRWYWRNPSDLCPWHMDETYVKVNGRWAYLYRAVDSRGRTVDFYLSSRRNSKAAYRFLG KILNNVKKWQIPRFINTDKAPAYGRALALLKREGRCPSDVEHRQIKYRNNVIECDHGKLK RIIGATLGFKSMKTAYATIKGIEVMRALRKGQASAFYYGDPLGEMRLVSRVFEM >gi|296494509|gb|ADTN01000229.1| GENE 2 922 - 1236 182 104 aa, chain - ## HITS:1 COG:PSLT031 KEGG:ns NR:ns ## COG: PSLT031 COG0582 # Protein_GI_number: 17233417 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Salmonella typhimurium LT2 # 13 69 15 71 260 92 85.0 2e-19 MSGLISYSELQQSAPLPVAIDYPAALALRQMALIQGKLPKYLLAPEVSALHHYVPDLHRR MLLATLWNTWHCCKVSDEAAFCLIQRPYISKTLLTRRISPRGSP >gi|296494509|gb|ADTN01000229.1| GENE 3 1235 - 1447 79 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301647000|ref|ZP_07246835.1| ## NR: gi|301647000|ref|ZP_07246835.1| conserved domain protein [Escherichia coli MS 146-1] # 1 70 1 70 70 124 100.0 3e-27 MALTLSHSIVVSPVGSTLCCGQLQQHVGDFRVSRLYETRKPKTIHVVAQVADVLQQQSLH VRSRIGDSFC >gi|296494509|gb|ADTN01000229.1| GENE 4 1502 - 1630 160 42 aa, chain + ## HITS:1 COG:no KEGG:ECSE_P1-0012 NR:ns ## KEGG: ECSE_P1-0012 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 42 1 42 42 70 100.0 1e-11 MRTRGQDPTLPEMRRVRLLEMADAMDMFCQGLVCAFTVLRKN 
>gi|296494509|gb|ADTN01000229.1| GENE 5 1686 - 2861 426 391 aa, chain - ## HITS:1 COG:AGl1300 KEGG:ns NR:ns ## COG: AGl1300 COG0477 # Protein_GI_number: 15890776 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 371 2 369 394 307 47.0 2e-83 MKSNNALIVILGTVTLDAVGIGLVMPVLPGLLRDIVHSDSIASHYGVLLALYALMQFLCA PVLGALSDRFGRRPVLLASLLGATIDYAIMATTPVLWILYAGRIVAGITGATGAVAGAYI ADITDGEDRARHFGLMSACFGVGMVAGPVAGGLLGAISLHAPFLAAAVLNGLNLLLGCFL MQESHKGERRPMPLRAFNPVSSFRWARGMTIVAALMTVFFIMQLVGQVPAALWVIFGEDR FRWSATMIGLSLAVFGILHALAQAFVTGPATKRFGEKQAIIAGMAADALGYVLLAFATRG WMAFPIMILLASGGIGMPALQAMLSRQVDDDHQGQLQGSLAALTSLTSIIGPLIVTAIYA ASASTWNGLAWIVGAALYLVCLPALRRATST >gi|296494509|gb|ADTN01000229.1| GENE 6 2954 - 3473 222 173 aa, chain + ## HITS:1 COG:AGl1301 KEGG:ns NR:ns ## COG: AGl1301 COG1309 # Protein_GI_number: 15890777 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 6 163 51 202 261 135 45.0 3e-32 MNKLQREAVIRTALELLNDVGMEGLTTRRLAERLGVQQPALYWHFKNKRALLDALAEAML TINHTHSTPRDDDDWRSFLKGNACSFRRALLAYRDGARIHAGTRPAAPQMEKADAQLRFL CDAGFSAGDATYALMAISYFTVGAVLEQQASEADAEERGEDQLTTSASTMPAR Prediction of potential genes in microbial genomes Time: Sun May 15 23:54:02 2011 Seq name: gi|296494508|gb|ADTN01000230.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont498.3, whole genome shotgun sequence Length of sequence - 17976 bp Number of predicted genes - 18, with homology - 15 Number of transcription units - 11, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 261 183 ## ECS88_4998 hypothetical protein from phage origin 2 2 Op 1 . - CDS 775 - 987 144 ## ECO111_p2-099 hypothetical protein 3 2 Op 2 . 
- CDS 984 - 1421 239 ## ECO26_1568 hypothetical protein 4 2 Op 3 . - CDS 1441 - 2172 279 ## ECH74115_A0041 hypothetical protein 5 2 Op 4 . - CDS 2150 - 2902 404 ## ECH74115_A0040 TraL - Prom 2955 - 3014 2.2 + Prom 3904 - 3963 7.0 6 3 Op 1 . + CDS 4029 - 8252 1617 ## ECH74115_A0037 relaxase 7 3 Op 2 . + CDS 8326 - 8499 83 ## 8 4 Tu 1 . - CDS 8980 - 9423 97 ## gi|195940427|ref|ZP_03085809.1| hypothetical protein EscherichcoliO157_29210 - Prom 9460 - 9519 3.1 - Term 9436 - 9481 3.3 9 5 Tu 1 . - CDS 9569 - 9775 170 ## ECP_2998 putative hemolysin expression modulating protein - Prom 9902 - 9961 5.8 10 6 Tu 1 . - CDS 10239 - 10385 83 ## gi|10955487|ref|NP_065339.1| hypothetical protein R721_48 - Prom 10478 - 10537 5.1 - Term 10510 - 10543 2.0 11 7 Op 1 . - CDS 10689 - 12845 1100 ## COG0550 Topoisomerase IA 12 7 Op 2 . - CDS 12850 - 13425 152 ## ROD_p2301 hypothetical protein - Prom 13466 - 13525 1.9 13 8 Tu 1 . - CDS 13595 - 13825 64 ## PMIP24 hypothetical protein - Prom 13999 - 14058 3.7 14 9 Tu 1 . - CDS 14773 - 14868 65 ## - Prom 15000 - 15059 7.0 + Prom 15053 - 15112 5.8 15 10 Tu 1 . + CDS 15221 - 15286 80 ## - Term 15406 - 15437 -0.7 16 11 Op 1 . - CDS 15544 - 16059 345 ## COG1525 Micrococcal nuclease (thermonuclease) homologs 17 11 Op 2 . - CDS 16106 - 16759 244 ## gi|301647023|ref|ZP_07246857.1| conserved hypothetical protein 18 11 Op 3 . 
- CDS 16771 - 17895 262 ## COG0582 Integrase Predicted protein(s) >gi|296494508|gb|ADTN01000230.1| GENE 1 3 - 261 183 86 aa, chain - ## HITS:1 COG:no KEGG:ECS88_4998 NR:ns ## KEGG: ECS88_4998 # Name: not_defined # Def: hypothetical protein from phage origin # Organism: E.coli_S88 # Pathway: not_defined # 1 41 1 41 206 79 90.0 5e-14 MTTLTDKEMIKEIKERIGSLDVRDNIERRAYEIALASLEAEAPNLPGGFTIEEAKELHEN LVKSHVSKALSGEKMKKGHCCKVSDE >gi|296494508|gb|ADTN01000230.1| GENE 2 775 - 987 144 70 aa, chain - ## HITS:1 COG:no KEGG:ECO111_p2-099 NR:ns ## KEGG: ECO111_p2-099 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 8 70 3 63 63 94 90.0 1e-18 MNDMATNNRKAKMLISRVYRLCYPSQWLRVSNRRVVLFSFSGIAREGVKDKRSAVQNRWK NHCYLRTKGE >gi|296494508|gb|ADTN01000230.1| GENE 3 984 - 1421 239 145 aa, chain - ## HITS:1 COG:no KEGG:ECO26_1568 NR:ns ## KEGG: ECO26_1568 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 145 1 152 152 182 69.0 2e-45 MTTFTDEDKELIKEIRERIGSLDVRDNIERRAYEIALASLEAEAVMFCISGQNVDSEEHV STSKAVVDAWVEEWNQVDRTPGEPLYKTVPLYHHPVLPASGLVNAVRFYEQVKRENPPVE TVAWKDAVEWVLEEACQAVNTGIKG >gi|296494508|gb|ADTN01000230.1| GENE 4 1441 - 2172 279 243 aa, chain - ## HITS:1 COG:no KEGG:ECH74115_A0041 NR:ns ## KEGG: ECH74115_A0041 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 16 241 5 232 234 220 52.0 3e-56 MTMNTVNNMSTVNQPELKQPEKAVSSEDIDNFIVDVFKETGHKISKDDPVISLIFLNQKI QEKFSNELQANFTALSEGFRQVVSSVENDYIQRFKNIVETCGDLDNEIKEKVEEGKNDLK ETSVEVKEKLTDDIIELISGIKRNQEKNNKLYEEKLKSLSKAVKPFSTRTAIAICALCTL CLSAAFSGATWYVAQHEKEASLRFYARAFLDMKKITEESMRKLPKSDQQNIKLKLEEIDS RKP >gi|296494508|gb|ADTN01000230.1| GENE 5 2150 - 2902 404 250 aa, chain - ## HITS:1 COG:no KEGG:ECH74115_A0040 NR:ns ## KEGG: ECH74115_A0040 # Name: not_defined # Def: TraL # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 250 1 250 253 343 72.0 4e-93 
MKNSINFILQGKGGVGKSFATAILAQYFIDENHMDNIVVGDTDPVNTTTVKVKRLNADLI QITENSKVIQSKFDPMFESMLTNSQNTFVIDNGASTFLPLIQYFNDNCVMDMFEDVEQDV YIHTVIVGGQALADTLQGFEELKELVKGSKVKLIVWINEFQGIPALENIPLIETKFIEKN KDVIAGVVVIQDRKSDAFASDIKELTEKSLTLKEALESDHFGLMAKSRLKRVFNDIYKQL DSIYDDEHGE >gi|296494508|gb|ADTN01000230.1| GENE 6 4029 - 8252 1617 1407 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_A0037 NR:ns ## KEGG: ECH74115_A0037 # Name: not_defined # Def: relaxase # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 820 1 823 1065 874 57.0 0 MIVRYGGGNDGIVDYLINGRKAERQYTRDELDHRVVLDGDLQTTDKIIDSIENKSQERYL HITLSFHESHVSNEVLKAVVDDYKKLLMNAYHPDEYSFYAEAHLPKIRHIQDNSTGELVE RKPHIHIVIPKVNLITEKFLNPVGDVTKGHTIEQLDAIQEFINNKYNLDSPKDYPRKDAD YGKIISRVKGDLYKEHHSELKGELLSRIENEKIENYSVFKDIVAEYGELRIRNAGKTNEY LAVKLPGDKKYTNLKSPLFRQNYIETRTLTLEKPTHKEIEKRLNTWLNKTSHEIKHIFNQ AEKTRELYKTLSPSQQIDFLQERIKEYDSREKLNERNSQQTSGREGGYKSCPKKFARIRQ SEATVGLSRMPQRGMVYGINGFTRPDSVSVLSDISQRDLAEQLPQREHPGQDVRRDYDRQ FTESGIKSLERSSFLCETMFQTLNEAAEKNEITTMAEIRRNIDPVRFLSSAAERFNIISA QHKIRTAKDGSPRFSVGNRNMNASDFLTKHINLAWKDAKSFLLEVYSQQLENTPYTRYPT YRRLTHHEARERLNSLNLSEKTLRNTIRFERGKLYNDLREMRRELKLIPREQRDIAVGVI VYKKLTTLERLSELDTEGRHIIRQYHADWHKDKDEMKALERLKSYLNFDEINAISADEPE LSLQKAVDSQRRLEEAKKVNSKLKDLVMDKQDSRIVYRDQESEKPVFTDKGNFVVAGKNP SKEEIGIMLEYSREKFGGVLKLTGSEDFKKMCAEVAAEQDMKIILRPEQYQQMMLELKAE LQGNKFEQVETQENSQESESRIEKGDALKEQATEQAQTTEQAQATEQAQATEQAQVAEQA QVAEQAQVAEQAQVAEQATSSYDPGVITRANTLDSQMLSKGTNGEFGYLKSLDSDENEIW EVLGHVPGDSDDIFDVASFDNENDAKEFCKIVNELGIDRTQALIQEQLTTQHDQATAQVH NKQEIYCINFSRFHDLNEGIVFHSKDAAIQCYEESKSSAIEKYESNYLNGDGFDTVVLMS KTVSTDELSSYPEGALVTFDRPFEIIANSYEEYRTPVYAVSFSKDEFSEDIKTFESLHDA SEYKNKMLQEHGLNQDDILITPVTREEIAFKGIKDAVNDANMAVMEQAGDSSRESPEEIL ASISANEHMISGLENFLVKDRSQFSSCNGDIVVEAEITRGEGGLYRLAVAGKHGLERGDA VARVDVTEQQFAAITGKTPSEVLTGDQTSARVPVITGIHFSTRAIENVNKLEQQKDYVYF STHEGLNAEIKDFSSLKDAIEWGRVECELHDLNKRDTVIYRVESEHISQGIDAVMKNAER VERHEIEKAQGRDCTPEDGKILEAIDRFEDKFRGEGLKFEREKAESDLLNHGFTREMAED ALGKQFVQAREDHQESQQQRDDSQDMH >gi|296494508|gb|ADTN01000230.1| GENE 7 8326 - 8499 
83 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDSNNAKKVMLSHMLDESFKTNMAVNAIVMKLMVIMAANKYCTMFFIVLGCSPILSL >gi|296494508|gb|ADTN01000230.1| GENE 8 8980 - 9423 97 147 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|195940427|ref|ZP_03085809.1| ## NR: gi|195940427|ref|ZP_03085809.1| hypothetical protein EscherichcoliO157_29210 [Escherichia coli O157:H7 str. EC4024] # 1 147 1 147 147 284 100.0 1e-75 MFFLYETYNFFYYLIKLIVIQPQYICVYMIFFFFNAGIAYSITNDIEDQVCRWLLFVSML HALMIPPAVIMPLIMPPQEILQETEKRQELHESIPKTCKLKALDAQQGGLFGVDKDEWVF PDNKSFYLPEKYRPENRITELAMMKEG >gi|296494508|gb|ADTN01000230.1| GENE 9 9569 - 9775 170 68 aa, chain - ## HITS:1 COG:no KEGG:ECP_2998 NR:ns ## KEGG: ECP_2998 # Name: not_defined # Def: putative hemolysin expression modulating protein # Organism: E.coli_536 # Pathway: not_defined # 1 67 61 127 128 102 67.0 3e-21 MKTKQDWLFQLRRCSTLDTLEKIIEKNQNSLPPSELESFNAAADHRLAELTMNKLYDKVP PSVWKYVY >gi|296494508|gb|ADTN01000230.1| GENE 10 10239 - 10385 83 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|10955487|ref|NP_065339.1| ## NR: gi|10955487|ref|NP_065339.1| hypothetical protein R721_48 [Escherichia coli] # 1 48 5 52 52 75 100.0 8e-13 MLHLKTASLLSECSEYRTKCDNLYHLCQMQEKELEHLRALIDAHNIDY >gi|296494508|gb|ADTN01000230.1| GENE 11 10689 - 12845 1100 718 aa, chain - ## HITS:1 COG:VC2043 KEGG:ns NR:ns ## COG: VC2043 COG0550 # Protein_GI_number: 15642045 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Vibrio cholerae # 2 624 3 622 647 399 39.0 1e-111 MKLFIAEKPAVANDIVKALGGNFTRHDGWFESDNTIVTNCFGHIIESQPPENYNPEYKEW KIETLPLRLYPVKYQPVESAEKQVKTIIELIRRADVTEIIHAGDPDDEGQLLVDEVLEYA GNTKPVKRVLINDNTLPAVKKALANPKNNRDFRGLYLKALARSVADAIYGLSMTRAYTIP AKTKGYKGVLSVGRVQTPVLGLIVNRTRANKNHKSSFYYTMTGHFQRGADVIRANWKPGE FAPLTDRKLLDKTWANGTATSLAGKPATVEAAATDDKKTAAPLPFNLVRLQQYMNKKFKM TAQKTLDITQQLREKYKAITYNRSDCSYLSDEQFSEAPQIIDALKSVFDQPMDIDTTRKS KAFNSAKVTAHTAIIPTVSVPDVNALSTDERNVYLAIAQHYLVQFMPEKAYQEVSVAIQC GDESFYARARKTTDSGFEAFLGVENAGDDEAEVENDDSAFDLLCKIRTGETVTTKEVVVN 
EKKTTPPPLFTEATLLAALVRVADFVTDPVIKKLLKDKDRDKKDEHGGIGTPATRASILE TLKKRNYITLEKGKLIPTDTGYALIDALPDIAVNPDMTALWAEKQTLIENGEMTIEQFVD ELYNDLIPMISNANSAEIKVSPSAPSGQSQRLSSPCPSCGKQIVIRPKSYSCTGCEFKIW NEFSGKKITQAQAEKLIKSGKTDLIKGFKKKSGGTYDAVLVLEDKKTGKLGFPARAKK >gi|296494508|gb|ADTN01000230.1| GENE 12 12850 - 13425 152 191 aa, chain - ## HITS:1 COG:no KEGG:ROD_p2301 NR:ns ## KEGG: ROD_p2301 # Name: not_defined # Def: hypothetical protein # Organism: C.rodentium # Pathway: not_defined # 1 191 1 191 191 402 98.0 1e-111 MQYALLDGFERKFLLDALEFGVLKDWKENPVKELPDIDESAHPFHVCYGGYLLNPGVSDS DISRKIKDQTGFWLAAIDDTRMDCHSIAYYDIHTLPLISCGHQKIVPFAALIKADECIIS KISSYSGFAVTAFLRIKDQDIATNILNREGIFAFNGCERRFRHPVSEDNWQQAVSEERAI RCAKRLIQCKG >gi|296494508|gb|ADTN01000230.1| GENE 13 13595 - 13825 64 76 aa, chain - ## HITS:1 COG:no KEGG:PMIP24 NR:ns ## KEGG: PMIP24 # Name: not_defined # Def: hypothetical protein # Organism: P.mirabilis # Pathway: not_defined # 1 76 476 551 559 64 38.0 1e-09 MEHCVASYADWCASGEYIAISVLMDNERATLGLSRKEHDLTYQFDQMRGIRNQAVSRNML IKGRHILKIINSSLKR >gi|296494508|gb|ADTN01000230.1| GENE 14 14773 - 14868 65 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRVVIFPHYIHPMYITHMSMEYMTLFIALVK >gi|296494508|gb|ADTN01000230.1| GENE 15 15221 - 15286 80 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQRKNIIVPCLLINYIHAATY >gi|296494508|gb|ADTN01000230.1| GENE 16 15544 - 16059 345 171 aa, chain - ## HITS:1 COG:Cj0979c KEGG:ns NR:ns ## COG: Cj0979c COG1525 # Protein_GI_number: 15792306 # Func_class: L Replication, recombination and repair # Function: Micrococcal nuclease (thermonuclease) homologs # Organism: Campylobacter jejuni # 2 171 19 173 175 106 35.0 2e-23 MKILLTIFIYLLTAFQSANAGGIPYTFNKKILQGKVIRVLDGDTIEIKTLPAKIVVYEVP IRVRLINIDAPEKKQPFGRWSTSQLKTLVAGKQVTVSYSHKDRYGRIIGHVFTTNGTDAS RFMVQSGAAWVYERYNEDESLPALQREAQEQKRGLWADANPVPPWEWRIRN >gi|296494508|gb|ADTN01000230.1| GENE 17 16106 - 16759 244 217 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|301647023|ref|ZP_07246857.1| ## NR: gi|301647023|ref|ZP_07246857.1| conserved 
hypothetical protein [Escherichia coli MS 146-1] # 1 217 1 217 217 434 100.0 1e-120 MIEQLQSDFLFWIFIYFLIVGLLIRNIHLTRKSKSIEKNKMFPAEEYAVLDIEQHMSNKY IGRLVTRFMKRTNGAPLLIVFPEYDKGDNNEDKNSPPFQLVKLTSDINNIKEIYSNDDSA CFLNNISAQKNLYTTHSGPSQAWIEHAYPLQQVTILLQGTRHSEREDIISQLEIILSRLK NGENTGFEHDDDFGYSFQYDAAAVGKSSFFDEAAGRK >gi|296494508|gb|ADTN01000230.1| GENE 18 16771 - 17895 262 374 aa, chain - ## HITS:1 COG:RSc1554 KEGG:ns NR:ns ## COG: RSc1554 COG0582 # Protein_GI_number: 17546273 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Ralstonia solanacearum # 1 285 65 354 354 177 38.0 4e-44 MFQKIKIRKMTISRALDKYLKTVSIHKKGHLQEFYRVNVIKRHPIAERYMDDITTVDIAN YRDQRLAQINPRTGRQITGNTVRLELALLSSLFNIARVEWGTCRMNPVELVRKPKISSGR DRRLTSGEERRLSRYFKEKNQALYVIFHLALETAMRQGEILSLRWEHVDLQHGVAHLPTT KNGAPRDVPLSRKARNYLQMLPTQLNGNIFSYTSSGFKSAWRTALQELKIENLHFHDLRH EAISRFFELGTLNVIEVAAISGHRSLNMLKRYTHLRAYQLVSKLDARRKQTSKIAPYFVP YPATVENRNGQVVVTLSDFDLEISAATKEQAIFHASVLLLRTLAQAAQRGERVPTPGELP TNIDERVMICPLTN Prediction of potential genes in microbial genomes Time: Sun May 15 23:55:07 2011 Seq name: gi|296494507|gb|ADTN01000231.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont518.1, whole genome shotgun sequence Length of sequence - 2655 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 27/0.000 + CDS 784 - 1254 467 ## COG0511 Biotin carboxyl carrier protein 2 1 Op 2 . 
+ CDS 1265 - 2614 1537 ## COG0439 Biotin carboxylase Predicted protein(s) >gi|296494507|gb|ADTN01000231.1| GENE 1 784 - 1254 467 156 aa, chain + ## HITS:1 COG:ECs4127 KEGG:ns NR:ns ## COG: ECs4127 COG0511 # Protein_GI_number: 15833381 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Escherichia coli O157:H7 # 1 156 1 156 156 246 100.0 2e-65 MDIRKIKKLIELVEESGISELEISEGEESVRISRAAPAASFPVMQQAYAAPMMQQPAQSN AAAPATVPSMEAPAAAEISGHIVRSPMVGTFYRTPSPDAKAFIEVGQKVNVGDTLCIVEA MKMMNQIEADKSGTVKAILVESGQPVEFDEPLVVIE >gi|296494507|gb|ADTN01000231.1| GENE 2 1265 - 2614 1537 449 aa, chain + ## HITS:1 COG:ECs4128 KEGG:ns NR:ns ## COG: ECs4128 COG0439 # Protein_GI_number: 15833382 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Escherichia coli O157:H7 # 1 449 1 449 449 882 99.0 0 MLDKIVIANRGEIALRILRACKELGIKTVAVHSSADRDLKHVLLADETVCIGPAPSVKSY LNIPAIISAAEITGAVAIHPGYGFLSENANFAEQVERSGFIFIGPKAETIRLMGDKVSAI AAMKKAGVPCVPGSDGPLGDDMDKNRAIAKRIGYPVIIKASGGGGGRGMRVVRGDAELAQ SISMTRAEAKAAFSNDMVYMEKYLENPRHVEIQVLADGQGNAIYLAERDCSMQRRHQKVV EEAPAPGITPELRRYIGERCAKACVDIGYRGAGTFEFLFENGEFYFIEMNTRIQVEHPVT EMITGVDLIKEQLRIAAGQPLSIKQEEVHVRGHAVECRINAEDPNTFLPSPGKITRFHAP GGFGVRWESHIYAGYTVPPYYDSMIGKLICYGENRDVAIARMKNALQELIIDGIKTNVDL QIRIMNDENFQHGGTNIHYLEKKLGLQEK Prediction of potential genes in microbial genomes Time: Sun May 15 23:55:13 2011 Seq name: gi|296494506|gb|ADTN01000232.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont518.2, whole genome shotgun sequence Length of sequence - 15995 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 8, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 5 - 64 5.4 1 1 Op 1 4/1.000 + CDS 89 - 331 265 ## COG3924 Predicted membrane protein 2 1 Op 2 3/1.000 + CDS 321 - 1772 1570 ## COG4145 Na+/panthothenate symporter 3 1 Op 3 7/1.000 + CDS 1784 - 2665 1543 ## PROTEIN SUPPORTED gi|15833385|ref|NP_312158.1| ribosomal 
protein L11 methyltransferase + Term 2719 - 2773 11.8 + Prom 2787 - 2846 5.5 4 2 Op 1 12/1.000 + CDS 2994 - 3959 1101 ## PROTEIN SUPPORTED gi|42631300|ref|ZP_00156838.1| COG0042: tRNA-dihydrouridine synthase 5 2 Op 2 1/1.000 + CDS 3985 - 4281 381 ## COG2901 Factor for inversion stimulation Fis, transcriptional activator + Term 4311 - 4348 1.4 6 3 Tu 1 . + CDS 4367 - 5251 744 ## COG0863 DNA modification methylase + Prom 5256 - 5315 1.9 7 4 Tu 1 . + CDS 5401 - 5514 106 ## EcE24377A_3748 hypothetical protein + Term 5528 - 5564 0.1 - Term 5442 - 5483 2.3 8 5 Tu 1 . - CDS 5517 - 6179 495 ## COG1309 Transcriptional regulator - Prom 6302 - 6361 6.9 + Prom 6274 - 6333 7.3 9 6 Op 1 27/0.000 + CDS 6578 - 7735 947 ## COG0845 Membrane-fusion protein 10 6 Op 2 . + CDS 7747 - 10851 2781 ## COG0841 Cation/multidrug efflux pump + Term 10862 - 10897 3.5 + Prom 10868 - 10927 4.7 11 7 Tu 1 . + CDS 11104 - 11325 265 ## ECH74115_4585 putative lipoprotein + Term 11436 - 11481 5.6 + Prom 11452 - 11511 5.4 12 8 Op 1 6/1.000 + CDS 11756 - 12781 976 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain + Term 12789 - 12839 6.1 13 8 Op 2 6/1.000 + CDS 12849 - 14030 900 ## COG4597 ABC-type amino acid transport system, permease component 14 8 Op 3 34/0.000 + CDS 14040 - 15143 809 ## COG0765 ABC-type amino acid transport system, permease component 15 8 Op 4 . 
+ CDS 15151 - 15909 532 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Term 15936 - 15967 4.1 Predicted protein(s) >gi|296494506|gb|ADTN01000232.1| GENE 1 89 - 331 265 80 aa, chain + ## HITS:1 COG:yhdT KEGG:ns NR:ns ## COG: yhdT COG3924 # Protein_GI_number: 16131145 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 80 1 80 80 149 100.0 2e-36 MDTRFVQAHKEARWALGLTLLYLAVWLVAAYLSGVAPGFTGFPRWFEMACILTPLLFIGL CWAMVKFIYRDIPLEDDDAA >gi|296494506|gb|ADTN01000232.1| GENE 2 321 - 1772 1570 483 aa, chain + ## HITS:1 COG:ECs4130 KEGG:ns NR:ns ## COG: ECs4130 COG4145 # Protein_GI_number: 15833384 # Func_class: H Coenzyme transport and metabolism # Function: Na+/panthothenate symporter # Organism: Escherichia coli O157:H7 # 1 483 3 485 485 797 100.0 0 MQLEVILPLVAYLVVVFGISVYAMRKRSTGTFLNEYFLGSRSMGGIVLAMTLTATYISAS SFIGGPGAAYKYGLGWVLLAMIQLPAVWLSLGILGKKFAILARRYNAVTLNDMLFARYQS RLLVWLASLSLLVAFVGAMTVQFIGGARLLETAAGIPYETGLLIFGISIALYTAFGGFRA SVLNDTMQGLVMLIGTVVLLIGVVHAAGGLSNAVQTLQTIDPQLVTPQGADDILSPAFMT SFWVLVCFGVIGLPHTAVRCISYKDSKAVHRGIIIGTIVVAILMFGMHLAGALGRAVIPD LTVPDLVIPTLMVKVLPPFAAGIFLAAPMAAIMSTINAQLLQSSATIIKDLYLNIRPDQM QNETRLKRMSAVITLVLGALLLLAAWKPPEMIIWLNLLAFGGLEAVFLWPLVLGLYWERA NAKGALSAMIVGGVLYAVLATLNIQYLGFHPIVPSLLLSLLAFLVGNRFGTSVPQATVLT TDK >gi|296494506|gb|ADTN01000232.1| GENE 3 1784 - 2665 1543 293 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15833385|ref|NP_312158.1| ribosomal protein L11 methyltransferase [Escherichia coli O157:H7 str. 
Sakai] # 1 293 1 293 293 598 100 1e-171 MPWIQLKLNTTGANAEDLSDALMEAGAVSITFQDTHDTPVFEPLPGETRLWGDTDVIGLF DAETDMNDVVAILENHPLLGAGFAHKIEQLEDKDWEREWMDNFHPMRFGERLWICPSWRD VPDENAVNVMLDPGLAFGTGTHPTTSLCLQWLDSLDLTGKTVIDFGCGSGILAIAALKLG AAKAIGIDIDPQAIQASRDNAERNGVSDRLELYLPKDQPEEMKADVVVANILAGPLRELA PLISVLPVSGGLLGLSGILASQAESVCEAYADSFALDPVVEKEEWCRITGRKN >gi|296494506|gb|ADTN01000232.1| GENE 4 2994 - 3959 1101 321 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42631300|ref|ZP_00156838.1| COG0042: tRNA-dihydrouridine synthase [Haemophilus influenzae R2866] # 1 317 28 344 353 428 65 1e-119 MRIGQYQLRNRLIAAPMAGITDRPFRTLCYEMGAGLTVSEMMSSNPQVWESDKSRLRMVH IDEPGIRTVQIAGSDPKEMADAARINVESGAQIIDINMGCPAKKVNRKLAGSALLQYPDV VKSILTEVVNAVDVPVTLKIRTGWAPEHRNCEEIAQLAEDCGIQALTIHGRTRACLFNGE AEYDSIRAVKQKVSIPVIANGDITDPLKARAVLDYTGADALMIGRAAQGRPWIFREIQHY LDTGELLPPLPLAEVKRLLCAHVRELHDFYGPAKGYRIARKHVSWYLQEHAPNDQFRRTF NAIEDASEQLEALEAYFENFA >gi|296494506|gb|ADTN01000232.1| GENE 5 3985 - 4281 381 98 aa, chain + ## HITS:1 COG:ECs4133 KEGG:ns NR:ns ## COG: ECs4133 COG2901 # Protein_GI_number: 15833387 # Func_class: K Transcription; L Replication, recombination and repair # Function: Factor for inversion stimulation Fis, transcriptional activator # Organism: Escherichia coli O157:H7 # 1 98 1 98 98 162 100.0 1e-40 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN >gi|296494506|gb|ADTN01000232.1| GENE 6 4367 - 5251 744 294 aa, chain + ## HITS:1 COG:yhdJ KEGG:ns NR:ns ## COG: yhdJ COG0863 # Protein_GI_number: 16131150 # Func_class: L Replication, recombination and repair # Function: DNA modification methylase # Organism: Escherichia coli K12 # 1 294 3 296 296 611 100.0 1e-175 MRTGCEPTRFGNEAKTIIHGDALAELKKIPAESVDLIFADPPYNIGKNFDGLIEAWKEDL FIDWLFEVIAECHRVLKKQGSMYIMNSTENMPFIDLQCRKLFTIKSRIVWSYDSSGVQAK KHYGSMYEPILMMVKDAKNYTFNGDAILVEAKTGSQRALIDYRKNPPQPYNHQKVPGNVW DFPRVRYLMDEYENHPTQKPEALLKRIILASSNPGDIVLDPFAGSFTTGAVAIASGRKFI GIEINSEYIKMGLRRLDVASHYSAEELAKVKKRKTGNLSKRSRLSEVDPDLITK 
>gi|296494506|gb|ADTN01000232.1| GENE 7 5401 - 5514 106 37 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_3748 NR:ns ## KEGG: EcE24377A_3748 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 37 32 68 68 67 100.0 2e-10 MQWIELLATETDKCRNMNSVNPLKLVNCDELNFQDRM >gi|296494506|gb|ADTN01000232.1| GENE 8 5517 - 6179 495 220 aa, chain - ## HITS:1 COG:ECs4136 KEGG:ns NR:ns ## COG: ECs4136 COG1309 # Protein_GI_number: 15833390 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 220 1 220 220 426 100.0 1e-119 MAKRTKAEALKTRQELIETAIAQFAQHGVSKTTLNDIADAANVTRGAIYWHFENKTQLFN EMWLQQPSLRELIQEHLTAGLEHDPFQQLREKLIVGLQYIAKIPRQQALLKILYHKCEFN DEMLAEGVIREKMGFNPQTLREVLQACQQQGCVANNLDLDVVMIIIDGAFSGIVQNWLMN MAGYDLYKQAPALVDNVLRMFMPDENITKLIHQTNELSVM >gi|296494506|gb|ADTN01000232.1| GENE 9 6578 - 7735 947 385 aa, chain + ## HITS:1 COG:acrE KEGG:ns NR:ns ## COG: acrE COG0845 # Protein_GI_number: 16131153 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli K12 # 1 385 1 385 385 674 100.0 0 MTKHARFFLLPSFILISAALIAGCNDKGEEKAHVGEPQVTVHIVKTAPLEVKTELPGRTN AYRIAEVRPQVSGIVLNRNFTEGSDVQAGQSLYQIDPATYQANYDSAKGELAKSEAAAAI AHLTVKRYVPLVGTKYISQQEYDQAIADARQADAAVIAAKATVESARINLAYTKVTAPIS GRIGKSTVTEGALVTNGQTTELATVQQLDPIYVDVTQSSNDFMRLKQSVEQGNLHKENAT SNVELVMENGQTYPLKGTLQFSDVTVDESTGSITLRAVFPNPQHTLLPGMFVRARIDEGV QPDAILIPQQGVSRTPRGDATVLIVNDKSQVEARPVVASQAIGDKWLISEGLKSGDQVIV SGLQKARPGEQVKATTDTPADTASK >gi|296494506|gb|ADTN01000232.1| GENE 10 7747 - 10851 2781 1034 aa, chain + ## HITS:1 COG:acrF KEGG:ns NR:ns ## COG: acrF COG0841 # Protein_GI_number: 16131154 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Escherichia coli K12 # 1 1034 1 1034 1034 1956 99.0 0 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 
EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLALGLLVDDAIVVVENVERVM MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF VPVFFVVIRRCFKG >gi|296494506|gb|ADTN01000232.1| GENE 11 11104 - 11325 265 73 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_4585 NR:ns ## KEGG: ECH74115_4585 # Name: not_defined # Def: putative lipoprotein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 73 1 73 73 115 100.0 6e-25 MKRLIPVALLTALLAGCAHDSPCVPVYDDQGRLVHTNTCMKGTTQDNWETAGAIAGGAAA VAGLTMGIIALSK >gi|296494506|gb|ADTN01000232.1| GENE 12 11756 - 12781 976 341 aa, chain + ## HITS:1 COG:yhdW KEGG:ns NR:ns ## COG: yhdW COG0834 # Protein_GI_number: 16131156 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Escherichia coli K12 # 37 341 1 305 305 621 99.0 1e-178 MKKMMIATLAAASVLLAVANQAHAGATLDAVQKKGFVQCGISDGLPGFSYADADGKFSGI DVDICRGVAAAVFGDDTKVKYTPLTAKERFTALQSGEVDLLSRNTTWTSSRDAGMGMAFT GVTYYDGIGFLTHDKAGLKSAKELDGATVCIQAGTDTELNVADYFKANNMKYTPVTFDRS DESAKALESGRCDTLASDQSQLYALRIKLSNPAEWIVLPEVISKEPLGPVVRRGDDEWFS IVRWTLFAMLNAEEMGINSQNVDEKAANPATPDMAHLLGKEGDYGKDLKLDNKWAYNIIK 
QVGNYSEIFERNVGSESPLKIKRGQNNLWNNGGIQYAPPVR >gi|296494506|gb|ADTN01000232.1| GENE 13 12849 - 14030 900 393 aa, chain + ## HITS:1 COG:ECs4142 KEGG:ns NR:ns ## COG: ECs4142 COG4597 # Protein_GI_number: 15833396 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Escherichia coli O157:H7 # 1 393 7 399 399 698 99.0 0 MSHRRSTVKGSLSFANPTVRAWLFQILAVVAVVGIVGWLFHNTVTNLNNRGITSGFAFLD RGAGFGIVQHLIDYQQGDTYGRVFIVGLLNTLLVSALCIVFASVLGFFIGLARLSDNWLL RKLSTIYIEIFRNIPPLLQIFFWYFAVLRNLPGPRQAVSAFDLAFLSNRGLYIPSPQLGD GFIAFILAVVMAIVLSVGLFRFNKTYQIKTGQLRRTWPIAAVLIIGLPLLAQWLFGAALH WDVPALRGFNFRGGMVLIPELAALTLALSVYTSAFIAEIIRAGIQAVPYGQHEAARSLGL PNPVTLRQVIIPQALRVIIPPLTSQYLNIVKNSSLAAAIGYPDMVSLFAGTVLNQTGQAI ETIAMTMSVYLIISLTISLLMNIYNRRIAIVER >gi|296494506|gb|ADTN01000232.1| GENE 14 14040 - 15143 809 367 aa, chain + ## HITS:1 COG:yhdY KEGG:ns NR:ns ## COG: yhdY COG0765 # Protein_GI_number: 16131158 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Escherichia coli K12 # 1 367 2 368 368 678 99.0 0 MTKVLLSHPPRPASHNSSRAMVWVRKNLFSSWSNSLLTIGCIWLMWELIPPLLNWAFLQA NWVGSTRADCTKAGACWVFIHERFGQFMYRLYPHDQRWRINLALLIGLVSIAPMFWKILP HRGRYIAAWAVIYPLIVWWLMYGGFFALERVETRQWGGLTLTLIIASVGIAGALPWGILL ALGRRSHMPIVRILSVIFIEFWRGVPLITVLFMSSVMLPLFMAEGTSIDKLIRALVGVIL FQSAYVAEVVRGGLQALPKGQYEAAESLALGYWKTQGLVILPQALKLVIPGLVNTIIALF KDTSLVIIIGLFDLFSSVQQATVDPAWLGMSTEGYVFAALIYWIFCFSMSRYSQYLEKRF NTGRTPH >gi|296494506|gb|ADTN01000232.1| GENE 15 15151 - 15909 532 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 12 250 1 242 245 209 45 1e-53 MSQILLQPANAMITLENVNKWYGQFHVLKNINLTVQPGERIVLCGPSGSGKSTTIRCINH LEEHQQGRIVVDGIELNEDIRNIERVRQEVGMVFQHFNLFPHLTVLQNCTLAPIWVRKMP KKEAEDLAVHYLERVRIAEHAHKFPGQISGGQQQRVAIARSLCMKPKIMLFDEPTSALDP EMVKEVLDTMIGLAQSGMTMLCVTHEMGFARTVADRVIFMDRGEIVEQAAPDEFFAHPKS ERTRAFLSQVIH Prediction of 
potential genes in microbial genomes Time: Sun May 15 23:55:17 2011 Seq name: gi|296494505|gb|ADTN01000233.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont536.1, whole genome shotgun sequence Length of sequence - 1003 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 5 - 64 4.2 1 1 Tu 1 . + CDS 214 - 861 873 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain + Term 865 - 904 6.2 Predicted protein(s) >gi|296494505|gb|ADTN01000233.1| GENE 1 214 - 861 873 215 aa, chain + ## HITS:1 COG:ECs3082 KEGG:ns NR:ns ## COG: ECs3082 COG2197 # Protein_GI_number: 15832336 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 215 1 215 215 385 99.0 1e-107 MPEATPFQVMIVDDHPLMRRGVRQLLELDPGFEVVAEAGDGASAIDLANRLDIDVILLDL NMKGMSGLDTLNALRRDGVTAQIIILTVSDASSDVFALIDAGADGYLLKDSDPEVLLEAI RAGAKGSKVFSERVNQYLREREMFGAEEDPFSVLTERELDVLHELAQGLSNKQIASVLNI SEQTVKVHIRNLLRKLNVRSRVAATILFLQQRGAQ Prediction of potential genes in microbial genomes Time: Sun May 15 23:55:22 2011 Seq name: gi|296494504|gb|ADTN01000234.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont536.2, whole genome shotgun sequence Length of sequence - 11351 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 1, operones - 1 average op.length - 13.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 - CDS 12 - 1064 1263 ## COG4235 Cytochrome c biogenesis factor 2 1 Op 2 11/0.000 - CDS 1061 - 1618 735 ## COG0526 Thiol-disulfide isomerase and thioredoxins 3 1 Op 3 16/0.000 - CDS 1615 - 3558 2386 ## COG1138 Cytochrome c biogenesis factor 4 1 Op 4 9/0.000 - CDS 3555 - 4034 663 ## COG2332 Cytochrome c-type biogenesis protein CcmE 5 1 Op 5 9/0.000 - CDS 4031 - 4240 276 ## 
COG3114 Heme exporter protein D 6 1 Op 6 14/0.000 - CDS 4237 - 4974 801 ## COG0755 ABC-type transport system involved in cytochrome c biogenesis, permease component 7 1 Op 7 14/0.000 - CDS 5016 - 5675 719 ## COG2386 ABC-type transport system involved in cytochrome c biogenesis, permease component 8 1 Op 8 3/0.000 - CDS 5675 - 6298 553 ## COG4133 ABC-type transport system involved in cytochrome c biogenesis, ATPase component 9 1 Op 9 7/0.000 - CDS 6311 - 6913 551 ## COG3005 Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit 10 1 Op 10 4/0.000 - CDS 6923 - 7393 490 ## COG3043 Nitrate reductase cytochrome c-type subunit 11 1 Op 11 7/0.000 - CDS 7369 - 8232 667 ## COG0348 Polyferredoxin 12 1 Op 12 10/0.000 - CDS 8219 - 8914 484 ## COG1145 Ferredoxin 13 1 Op 13 . - CDS 8921 - 11335 3028 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing Predicted protein(s) >gi|296494504|gb|ADTN01000234.1| GENE 1 12 - 1064 1263 350 aa, chain - ## HITS:1 COG:ccmH_2 KEGG:ns NR:ns ## COG: ccmH_2 COG4235 # Protein_GI_number: 16130131 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c biogenesis factor # Organism: Escherichia coli K12 # 130 350 1 221 221 430 100.0 1e-120 MRFLLGVLMLMISGSALATIDVLQFKDEAQEQQFRQLTEELRCPKCQNNSIADSNSMIAT DLRQKVYELMQEGKSKKEIVDYMVARYGNFVTYDPPLTPLTVLLWVLPVVAIGIGGWVIY ARSRRRVRVVPEAFPEQSVPEGKRAGYVVYLPGIVVALIVAGVSYYQTGNYQQVKIWQQA TAQAPALLDRALDPKADPLNEEEMSRLALGMRTQLQKNPGDIEGWIMLGRVGMALGNASI ATDAYATAYRLDPKNSDAALGYAEALTRSSDPNDNRLGGELLRQLVRTDHSNIRVLSMYA FNAFEQQRFGEAVAAWEMMLKLLPANDTRRAVIERSIAQAMQHLSPQESK >gi|296494504|gb|ADTN01000234.1| GENE 2 1061 - 1618 735 185 aa, chain - ## HITS:1 COG:ECs3084 KEGG:ns NR:ns ## COG: ECs3084 COG0526 # Protein_GI_number: 15832338 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Escherichia coli O157:H7 # 1 185 1 185 185 
378 100.0 1e-105 MKRKVLLIPLIIFLAIAAALLWQLARNAEGDDPTNLESALIGKPVPKFRLESLDNPGQFY QADVLTQGKPVLLNVWATWCPTCRAEHQYLNQLSAQGIRVVGMNYKDDRQKAISWLKELG NPYALSLFDGDGMLGLDLGVYGAPETFLIDGNGIIRYRHAGDLNPRVWEEEIKPLWEKYS KEAAQ >gi|296494504|gb|ADTN01000234.1| GENE 3 1615 - 3558 2386 647 aa, chain - ## HITS:1 COG:ccmF KEGG:ns NR:ns ## COG: ccmF COG1138 # Protein_GI_number: 16130133 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c biogenesis factor # Organism: Escherichia coli K12 # 1 647 1 647 647 1142 99.0 0 MMPEIGNGLLCLALGIALLLSVYPLWGVARGDARMMASSRLFAWLLFMSVAGAFLVLVNA FVVNDFTVTYVASNSNTQLPVWYRVAATWGAHEGSLLLWVLLMSGWTFAVAIFSQRIPLD IVARVLAIMGMVSVGFLLFILFTSNPFSRTLPNFPIEGRDLNPLLQDPGLIFHPPLLYMG YVGFSVAFAFAIASLLSGRLDSTYARFTRPWTLAAWIFLTLGIVLGSAWAYYELGWGGWW FWDPVENASFMPWLVGTALMHSLAVTEQRASFKAWTLLLAISAFSLCLLGTFLVRSGVLV SVHAFASDPARGMFILAFMVLVIGGSLLLFAARGHKVRSRVNNALWSRESLLLANNVLLV AAMLVVLLGTLLPLVHKQLGLGSISIGEPFFNTMFTWLMVPFALLLGVGPLVRWGRDRPR KIRNLLIIAFISTLVLSLLLPWLFESKVVAMTVLGLAMACWIAVLAIAEAALRISRGTKT TFSYWGMVAAHLGLAVTIVGIAFSQNYSVERDVRMKSGDSVDIHEYRFTFRDVKEVTGPN WRGGVATIGVTRDGKPETVLYAEKRYYNTAGSMMTEAAIDGDITRDLYAALGEELENGAW AVRLYYKPFVRWIWAGGLMMALGGLLCLFDPRYRKRVSPQKTAPEAV >gi|296494504|gb|ADTN01000234.1| GENE 4 3555 - 4034 663 159 aa, chain - ## HITS:1 COG:ECs3086 KEGG:ns NR:ns ## COG: ECs3086 COG2332 # Protein_GI_number: 15832340 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c-type biogenesis protein CcmE # Organism: Escherichia coli O157:H7 # 1 159 1 159 159 321 100.0 3e-88 MNIRRKNRLWIACAVLAGLALTIGLVLYALRSNIDLFYTPGEILYGKRETQQMPEVGQRL RVGGMVMPGSVQRDPNSLKVTFTIYDAEGSVDVSYEGILPDLFREGQGVVVQGELEKGNH ILAKEVLAKHDENYTPPEVEKAMEANHRRPASVYKDPAS >gi|296494504|gb|ADTN01000234.1| GENE 5 4031 - 4240 276 69 aa, chain - ## HITS:1 COG:ECs3087 KEGG:ns NR:ns ## COG: ECs3087 COG3114 # Protein_GI_number: 15832341 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Heme exporter 
protein D # Organism: Escherichia coli O157:H7 # 1 69 1 69 69 90 100.0 7e-19 MTPAFASWNEFFAMGGYAFFVWLAVVMTVIPLVVLVVHSVMQHRAILRGVAQQRAREARL RAAQQQEAA >gi|296494504|gb|ADTN01000234.1| GENE 6 4237 - 4974 801 245 aa, chain - ## HITS:1 COG:ECs3088 KEGG:ns NR:ns ## COG: ECs3088 COG0755 # Protein_GI_number: 15832342 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome c biogenesis, permease component # Organism: Escherichia coli O157:H7 # 1 245 1 245 245 417 100.0 1e-116 MWKTLHQLAIPPRLYQICGWFIPWLAIASVVVLTVGWIWGFGFAPADYQQGNSYRIIYLH VPAAIWSMGIYASMAVAAFIGLVWQMKMANLAVAAMAPIGAVFTFIALVTGSAWGKPMWG TWWVWDARLTSELVLLFLYVGVIALWHAFDDRRLAGRAAGILVLIGVVNLPIIHYSVEWW NTLHQGSTRMQQSIDPAMRSPLRWSIFGFLLLSATLTLMRMRNLILLMEKRRPWVSELIL KRGRK >gi|296494504|gb|ADTN01000234.1| GENE 7 5016 - 5675 719 219 aa, chain - ## HITS:1 COG:ECs3089 KEGG:ns NR:ns ## COG: ECs3089 COG2386 # Protein_GI_number: 15832343 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome c biogenesis, permease component # Organism: Escherichia coli O157:H7 # 1 219 2 220 220 280 100.0 2e-75 MFWRIFRLELRVAFRHSAEIANPLWFFLIVITLFPLSIGPEPQLLARIAPGIIWVAALLS SLLALERLFRDDLQDGSLEQLMLLPLPLPAVVLAKVMAHWMVTGLPLLILSPLVAMLLGM DVYGWQVMALTLLLGTPTLGFLGAPGVALTVGLKRGGVLLSILVLPLTIPLLIFATAAMD AASMHLPVDGYLAILGALLAGTATLSPFATAAALRISIQ >gi|296494504|gb|ADTN01000234.1| GENE 8 5675 - 6298 553 207 aa, chain - ## HITS:1 COG:ccmA KEGG:ns NR:ns ## COG: ccmA COG4133 # Protein_GI_number: 16130138 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in cytochrome c biogenesis, ATPase component # Organism: Escherichia coli K12 # 3 207 1 205 205 375 100.0 1e-104 MGMLEARELLCERDERTLFSGLSFTLNAGEWVQITGSNGAGKTTLLRLLTGLSRPDAGEV LWQGQPLHQVRDSYHQNLLWIGHQPGIKTRLTALENLHFYHRDGDTAQCLEALAQAGLAG 
FEDIPVNQLSAGQQRRVALARLWLTRATLWILDEPFTAIDVNGVDRLTQRMAQHTEQGGI VILTTHQPLNVAESKIRRISLTQTRAA >gi|296494504|gb|ADTN01000234.1| GENE 9 6311 - 6913 551 200 aa, chain - ## HITS:1 COG:ECs3091 KEGG:ns NR:ns ## COG: ECs3091 COG3005 # Protein_GI_number: 15832345 # Func_class: C Energy production and conversion # Function: Nitrate/TMAO reductases, membrane-bound tetraheme cytochrome c subunit # Organism: Escherichia coli O157:H7 # 1 200 1 200 200 429 100.0 1e-120 MGNSDRKPGLIKRLWKWWRTPSRLALGTLLLIGFVGGIVFWGGFNTGMEKANTEEFCISC HEMRNTVYQEYMDSVHYNNRSGVRATCPDCHVPHEFVPKMIRKLKASKELYGKIFGVIDT PQKFEAHRLTMAQNEWRRMKDNNSQECRNCHNFEYMDTTAQKSVAAKMHDQAVKDGQTCI DCHKGIAHKLPDMREVEPGF >gi|296494504|gb|ADTN01000234.1| GENE 10 6923 - 7393 490 156 aa, chain - ## HITS:1 COG:ECs3092 KEGG:ns NR:ns ## COG: ECs3092 COG3043 # Protein_GI_number: 15832346 # Func_class: C Energy production and conversion # Function: Nitrate reductase cytochrome c-type subunit # Organism: Escherichia coli O157:H7 # 1 156 1 156 156 320 100.0 7e-88 MEFGSEIMKSHDLKKALCQWTAMLALVVSGAVWAANGVDFSQSPEVSGTQEGAIRMPKEQ DRMPLNYVNQPPMIPHSVEGYQVTTNTNRCLQCHGVESYRTTGAPRISPTHFMDSDGKVG AEVAPRRYFCLQCHVPQADTAPIVGNTFTPSKGYGK >gi|296494504|gb|ADTN01000234.1| GENE 11 7369 - 8232 667 287 aa, chain - ## HITS:1 COG:napH KEGG:ns NR:ns ## COG: napH COG0348 # Protein_GI_number: 16130141 # Func_class: C Energy production and conversion # Function: Polyferredoxin # Organism: Escherichia coli K12 # 1 287 1 287 287 549 100.0 1e-156 MANRKRDAGREALEKKGWWRSHRWLVLRRLCQFFVLGMFLSGPWFGVWILHGNYSSSLLF DTVPLTDPLMTLQSLASGHLPATVALTGAVIITVLYALAGKRLFCSWVCPLNPITDLANW LRRRFDLNQSATIPRHIRYVLLVVILVGSALTGTLIWEWINPVSLMGRSLVMGFGSGALL ILALFLFDLLVVEHGWCGHICPVGALYGVLGSKGVITVAATDRQKCNRCMDCFHVCPEPH VLRAPVLDEQSPVQVTSRDCMTCGRCVDVCSEDVFTITTRWSSGAKS >gi|296494504|gb|ADTN01000234.1| GENE 12 8219 - 8914 484 231 aa, chain - ## HITS:1 COG:napG KEGG:ns NR:ns ## COG: napG COG1145 # Protein_GI_number: 16130142 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: 
Escherichia coli K12 # 1 231 1 231 231 449 100.0 1e-126 MSRSAKPQNGRRRFLRDVVRTAGGLAAVGVALGLQQQTARASGVRLRPPGAINENAFASA CVRCGQCVQACPYDTLKLATLASGLSAGTPYFVARDIPCEMCEDIPCAKVCPSGALDREI ESIDDARMGLAVLVDQENCLNFQGLRCDVCYRECPKIDEAITLELERNTRTGKHARFLPT VHSDACTGCGKCEKVCVLEQPAIKVLPLSLAKGELGHHYRFGWLEGNNGKS >gi|296494504|gb|ADTN01000234.1| GENE 13 8921 - 11335 3028 804 aa, chain - ## HITS:1 COG:napA KEGG:ns NR:ns ## COG: napA COG0243 # Protein_GI_number: 16130143 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 804 25 828 828 1699 99.0 0 MPGVARAVVGQQEAIKWDKAPCRFCGTGCGVLVGTQQGRVVACQGDPDAPVNRGLNCIKG YFLPKIMYGKDRLTQPLLRMKNGKYDKEGEFTPITWDQAFDVMEEKFKTALKEKGPESIG MFGSGQWTIWEGYAASKLFKAGFRSNNIDPNARHCMASAVVGFMRTFGMDEPMGCYDDIE QADAFVLWGANMAEMHPILWSRITNRRLSNQNVTVAVLSTYQHRSFELADNGIIFTPQSD LVILNYIANYIIQNNAINQDFFSKHVNLRKGATDIGYGLRPTHPLEKAAKNPGSDASEPM SFEDYKAFVAEYTLEKTAEMTGVPKDQLEQLAQLYADPNKKVISYWTMGFNQHTRGVWAN NLVYNLHLLTGKISQPGCGPFSLTGQPSACGTAREVGTFAHRLPADMVVTNEKHRDICEK KWNIPSGTIPAKIGLHAVAQDRALKDGKLNVYWTMCTNNMQAGPNINEERMPGWRDPRNF IIVSDPYPTVSALAADLILPTAMWVEKEGAYGNAERRTQFWRQQVQAPGEAKSDLWQLVQ FSRRFKTEEVWPEDLLAKKPELRGKTLYEVLYATPEVSKFPVSELAEDQLNDESRELGFY LQKGLFEEYAWFGRGHGHDLAPFDDYHKARGLRWPVVNGKETQWRYSEGNDPYVKAGEGY KFYGKPDGKAVIFALPFEPAAEAPDEEYDLWLSTGRVLEHWHTGSMTRRVPELHRAFPEA VLFIHPLDAKARDLRRGDKVKVVSRRGEVISIVETRGRNRPPQGLVYMPFFDAAQLVNKL TLDATDPLSKETDFKKCAVKLEKV Prediction of potential genes in microbial genomes Time: Sun May 15 23:55:23 2011 Seq name: gi|296494503|gb|ADTN01000235.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont536.3, whole genome shotgun sequence Length of sequence - 1775 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 62 76 ## 2 1 Op 2 . 
- CDS 59 - 322 155 ## PROTEIN SUPPORTED gi|167855983|ref|ZP_02478730.1| 50S ribosomal protein L31 type B 3 2 Tu 1 . + CDS 1214 - 1702 541 ## COG4574 Serine protease inhibitor ecotin Predicted protein(s) >gi|296494503|gb|ADTN01000235.1| GENE 1 2 - 62 76 20 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKLSRRSFMKANAVAAAAAA >gi|296494503|gb|ADTN01000235.1| GENE 2 59 - 322 155 87 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855983|ref|ZP_02478730.1| 50S ribosomal protein L31 type B [Haemophilus parasuis 29755] # 4 81 15 92 94 64 37 7e-11 MHTNWQVCSLVVQAKSERISDISTQLNAFPGCEVAVSDAPSGQLIVVVEAEDSETLIQTI ESVRNVEGVLAVSLVYHQQEEQGEETP >gi|296494503|gb|ADTN01000235.1| GENE 3 1214 - 1702 541 162 aa, chain + ## HITS:1 COG:eco KEGG:ns NR:ns ## COG: eco COG4574 # Protein_GI_number: 16130146 # Func_class: R General function prediction only # Function: Serine protease inhibitor ecotin # Organism: Escherichia coli K12 # 1 162 1 162 162 322 100.0 2e-88 MKTILPAVLFAAFATTSAWAAESVQPLEKIAPYPQAEKGMKRQVIQLTPQEDESTLKVEL LIGQTLEVDCNLHRLGGKLENKTLEGWGYDYYVFDKVSSPVSTMMACPDGKKEKKFVTAY LGDAGMLRYNSKLPIVVYTPDNVDVKYRVWKAEEKIDNAVVR Prediction of potential genes in microbial genomes Time: Sun May 15 23:55:28 2011 Seq name: gi|296494502|gb|ADTN01000236.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont551.1, whole genome shotgun sequence Length of sequence - 3881 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 84 - 143 2.2 1 1 Tu 1 . + CDS 202 - 1020 298 ## COG2801 Transposase and inactivated derivatives + Term 1111 - 1145 0.1 + Prom 1085 - 1144 2.8 2 2 Tu 1 . + CDS 1174 - 3171 1101 ## COG1292 Choline-glycine betaine transporter + Term 3180 - 3210 3.4 - Term 3392 - 3435 -0.5 3 3 Op 1 . - CDS 3470 - 3559 75 ## 4 3 Op 2 . 
- CDS 3582 - 3788 68 ## COG2801 Transposase and inactivated derivatives Predicted protein(s) >gi|296494502|gb|ADTN01000236.1| GENE 1 202 - 1020 298 272 aa, chain + ## HITS:1 COG:ECs1311 KEGG:ns NR:ns ## COG: ECs1311 COG2801 # Protein_GI_number: 15830565 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 272 1 272 272 541 99.0 1e-154 MCQVFGVSRSGYYNWVQHEPSDRKQSDERLKLEIKVAHIRTRETYGTRRLQTELAENGII VGRDRLARLRKELRLRCKQKRKFRATTNSNHNLPVAPNLLNQTFAPTAPNQVWVADLTYV ATQEGWLYLAGIKDVYTCEIVGYAMGERMTKELTGKALFMALRSQRPPAGLIHHSDRGSQ YCAYDYRVIQEQFGLKTSMSRKGNCYDNAPMESFWGTLKNESLSHYRFNNRDEAISVIRE YIEIFYNRQRRHSRLGNISPAAFREKYHQMAA >gi|296494502|gb|ADTN01000236.1| GENE 2 1174 - 3171 1101 665 aa, chain + ## HITS:1 COG:PA5291 KEGG:ns NR:ns ## COG: PA5291 COG1292 # Protein_GI_number: 15600484 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Choline-glycine betaine transporter # Organism: Pseudomonas aeruginosa # 8 665 4 661 661 871 66.0 0 MSENDTIPKKSTSQINKAVFFTSALLIFLLVAFAAVFPDVADKNFKLLQQQIFTNASWFY ILAVALILLSVTFLGLSRYGDIKLGPDHAQPDFSYHSWFAMLFSAGMGIGLMFFGVAEPV MHYLSPPVGTPETVAAAKEAMRLTFFHWGLHAWAIYAIVALILAFFSYRHGLPLTLRSAL YPIIGDRIYGPVGHAVDIFAVIGTVFGVATSLGYGVLQVNAGLNHLFGVPINETVQVILI VVITGLATISVVSGLDKGIRILSELNLGLALLLLALVLCLGPTVLLLKSFVENTGGYLSE LVSKTFNLYAYEPKSSNWLGGWTLLYWGWWLSWSPFVGMFIARVSRGRTIREFVTGVLFV PAGFTLMWMTVFGNSAIYLIMNQGATDLANTVQQDVSLALFNFLEHFPFSSVLSFIAMAM VIVFFVTSADSGAMVVDTLASGGVANTPVWQRIFWASLMGIVAIALLLAGGLSALQTVTI ASALPFSVILLISIYGLLKALRRDLTKRESLSMATIAPTAARNPIPWQRRLRNIAYLPKR SLVKRFMDDVIQPAMTLVQEELNKQGTISHISDAVDDRIRLEVDLGNELNFIYEVRLRGY ISPTFALAAMDNDEQQTEQHRYYRAEVYLKEGGQNYDVMGWNQEQLINDILDQYEKHLHF LHLVR >gi|296494502|gb|ADTN01000236.1| GENE 3 3470 - 3559 75 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKVASTFCVVTSSTTKRVALGVTLVVAEP >gi|296494502|gb|ADTN01000236.1| GENE 4 3582 - 3788 68 68 aa, chain - ## HITS:1 COG:VC0257 KEGG:ns NR:ns ## COG: VC0257 COG2801 # Protein_GI_number: 
15640286 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Vibrio cholerae # 1 67 223 289 290 102 68.0 2e-22 MERFFRSLKNEWMPVVGYVSFSEAAHAITDYIVGYYSALRPHEYNGGLPPNESENRYWKN SNAVASFC Prediction of potential genes in microbial genomes Time: Sun May 15 23:55:32 2011 Seq name: gi|296494501|gb|ADTN01000237.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont561.1, whole genome shotgun sequence Length of sequence - 551 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 115 - 174 2.7 1 1 Tu 1 . + CDS 214 - 303 56 ## Predicted protein(s) >gi|296494501|gb|ADTN01000237.1| GENE 1 214 - 303 56 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAVLTELTCRGNVVTSWFASDKYRHWQMA Prediction of potential genes in microbial genomes Time: Sun May 15 23:55:36 2011 Seq name: gi|296494500|gb|ADTN01000238.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont561.2, whole genome shotgun sequence Length of sequence - 576 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Sun May 15 23:55:36 2011 Seq name: gi|296494499|gb|ADTN01000239.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont561.3, whole genome shotgun sequence Length of sequence - 493 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 262 - 459 91 ## ECED1_3516 fragment of PilV protein, shufflon protein (C-ter part) Predicted protein(s) >gi|296494499|gb|ADTN01000239.1| GENE 1 262 - 459 91 65 aa, chain - ## HITS:1 COG:no KEGG:ECED1_3516 NR:ns ## KEGG: ECED1_3516 # Name: pilV # Def: fragment of PilV protein, shufflon protein (C-ter part) # Organism: E.coli_ED1a # Pathway: not_defined # 1 65 15 77 77 106 83.0 2e-22 MNYSACKWYQSSVAMNHFIGGKSGGSIYYKPIQCPTGFIMTGTRMYGIGDGVDEEHVDAY CCPFG Prediction of potential genes in microbial genomes Time: Sun May 15 23:55:40 2011 Seq name: gi|296494498|gb|ADTN01000240.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont585.1, whole genome shotgun sequence Length of sequence - 9745 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 6, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 306 370 ## SDY_4589 hypothetical protein - Prom 490 - 549 1.9 2 2 Tu 1 . - CDS 648 - 1466 653 ## ECUMN_4883 hypothetical protein - Term 1504 - 1542 7.1 3 3 Tu 1 . - CDS 1621 - 1779 148 ## ECUMN_4882 hypothetical protein - Term 1793 - 1827 8.3 4 4 Op 1 2/0.000 - CDS 1850 - 4969 2459 ## COG3468 Type V secretory pathway, adhesin AidA - Term 5187 - 5223 3.4 5 4 Op 2 . - CDS 5341 - 6213 796 ## COG3596 Predicted GTPase + Prom 7282 - 7341 4.4 6 5 Tu 1 . + CDS 7444 - 8928 447 ## ECUMN_4878 hypothetical protein 7 6 Tu 1 . 
- CDS 9250 - 9744 333 ## ECO26_2894 hypothetical protein Predicted protein(s) >gi|296494498|gb|ADTN01000240.1| GENE 1 3 - 306 370 101 aa, chain - ## HITS:1 COG:no KEGG:SDY_4589 NR:ns ## KEGG: SDY_4589 # Name: not_defined # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 101 1 101 157 209 96.0 3e-53 MKTLSQNTTSSACAPETDLQQLVATPVPDERRISFWPQHFGLIPQWVTLEPRIFGWMDRL CEDYCGGIWNLYTLNNGGAFMAPEPDDDDDETWVLFNAMNG >gi|296494498|gb|ADTN01000240.1| GENE 2 648 - 1466 653 272 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_4883 NR:ns ## KEGG: ECUMN_4883 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 272 1 272 272 530 100.0 1e-149 MRLASRFGYANQIRRDRPLTHEELMHYVPSIFGEDRHTSRSKRYAYIPTITVLESLQQEG FQPFFACQTRVRDPGRRGYTKHMLRLRRAGEINGEHVPEIILLNSHDGTSSYQMLPGYFR FVCQNGCVCGQSLGEVRVPHRGNVVEKVIEGAYEVVGVFDRIEEKRDAMQSLVLPPPARQ ALAQAALTYRYGDEHQPVTTADILTPRRREDYGQDLWSAYQTIQENMLKGGISGRSAKGK RIHTRAIHNIDTDIKLNRALWVMAETLLESLR >gi|296494498|gb|ADTN01000240.1| GENE 3 1621 - 1779 148 52 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_4882 NR:ns ## KEGG: ECUMN_4882 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 52 13 64 64 106 100.0 2e-22 MPGCTSRLLPEGPFSREQAVAVKTAYRNVFTEDDQGTYSRLVIRNAEGQLRW >gi|296494498|gb|ADTN01000240.1| GENE 4 1850 - 4969 2459 1039 aa, chain - ## HITS:1 COG:flu KEGG:ns NR:ns ## COG: flu COG3468 # Protein_GI_number: 16129941 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 1 1039 53 1091 1091 1585 94.0 0 MKRHLNTSYRLVWNHITGTLVVASELARSRGKRAGVAVALSLAAVTSVPALAADKVVQAG ETVNDGTLTNHDNQIVLGTANGMTISTGLEYGPDNEANTGGQWIQNGGIANNTTVTGGGL QRVNAGGSVSDTVISAGGGQSLQGQAVNTTLNGGEQWVHEGGIATGTVINEKGWQAIKSG AVATDTVVNTGAEGGPDAENGDTGQTVYGDAVRTTINKNGRQIVAAEGTANTTVVYAGGD QTVHGHALDTTLNGGYQYVHNGGTASGTVVNSDGWQIVKNGGVAGNTTVNQKGRLQVDAG 
GTATNVTLKQGGALVTSTAATVTGINRLGAFSVVEGKADNVVLENGGRLDVLTGHTATNT RVDDGGTLDVRNGGTATTVSMGNGGVLLADSGAAVSGTRSDGKAFSIGGGQADALMLEKG SSFTLNAGDTATDTTVNGGLFTARGGTLAGTTTLNNGAILTLSGKTVNNDTLTIREGDAL LQGGSLTGNGSVEKSGSGTLTVSNTTLTQKAVNLNEGTLTLNDSTVTTDVIAQRGTALKL TGSTVLNGAIDPTNVTLASGATWNIPDNATVQSVVDDLSHAGQIHFTSTRTGKFVPATLK VKNLNGQNGTISLRVRPDMAQNNADRLVIDGGRATGKTILNLVNAGNSASGLATSGKGIQ VVEAINGATTEEGAFIQGNKLQAGAFNYSLNRDSDESWYLRSENAYRAEVPLYTSMLTQA MDYDRILAGSRSHQTGVNGENNSVRLSIQGGHLGHDNNGGIARGATPESSGSYGFVRLEG DLLRTEVAGMSLTTGVYGAAGHSSVDVKDDDGSRAGTVRDDAGSLGGYLNLTHTSSGLWA DIVAQGTRHSMKASSDNNDFRARGWGWLGSLETGLPFSITDNLMLEPQLQYTWQGLSLDD GQDNAGYVKFGHGSAQHVRAGFRLGSHNDMSFGEGTSSRDTLRDSAKHRVRELPVNWWVQ PSVIRTFSSRGDMSMGTAAAGSNMTFSPSRNGTSLDLQAGLEARVRENITLGVQAGYAHS VSGSSAEGYNGQATLNVTF >gi|296494498|gb|ADTN01000240.1| GENE 5 5341 - 6213 796 290 aa, chain - ## HITS:1 COG:ECs1395 KEGG:ns NR:ns ## COG: ECs1395 COG3596 # Protein_GI_number: 15830649 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Escherichia coli O157:H7 # 1 290 1 290 290 554 96.0 1e-158 MNPSDAIEAIEKPLSSLPYPISRHILEHLRKLTRHEPVPGIMGKSGAGKSSLCNALFQGE VTPVSDVHAGTREVRRFRLSGHGHSMVITDLPGVGESRDRDAEYEALYRDILPELDLVLW LIKADDRALSVDEYFWRHILHRGHQQVLFVVTQADKTEPCHEWDMAGIQPSPAQAQNIRE KTEAVFRLFRPVHPVVAVSARTGWELDTLVSALMTALPDHAASPLMTRLQDELRTESVRA QAREQFTGAVDRIFDTAESVCVASVARTVLRAVRDTVVSVARAVWNWIFF >gi|296494498|gb|ADTN01000240.1| GENE 6 7444 - 8928 447 494 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_4878 NR:ns ## KEGG: ECUMN_4878 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 494 24 517 517 998 100.0 0 MSFGQPCDEFPLSSLPPLIRDAVIEAQQITQAPLGLVAASALGAVSLVCQNLIDVCRLNT LRGPVSLFFLTLAESGERKTAVDKLLMKPLYQQEMQLYSRYKSELAVWKNKEELLKAQKK ALLSKLNKELRKGADESETLRQLEVLQKNSAEEPVRYKFIFNDATTAAIKNQLCGKWRSV GIMSDEAGIIFDGYTLSELPFINKMWDGSVLSVDRKNEPEQMIENARMTLSLMVQPGLFD RYMERKGSVARDSGFLARCLISKPATTQGKRFINGAVIPGGSLTAFHERLMELARGSIEK SSEDERYCLHFSPEAQKIFIEHYNVLEQDLSPSGPLSPFRGHVSKKTENIARIAALFQYF 
SYGEGKISADIMTSAVVISSWYTDEYKKLFALPDESELQQKDAEELFDWLIEECRGECPP RVRKNYILQCGPGRFRNRKKLNALLNILESQFRLSVVPEGKTMYVLLPQIASLKLSDVSG IFTSGYHYNKLRAK >gi|296494498|gb|ADTN01000240.1| GENE 7 9250 - 9744 333 164 aa, chain - ## HITS:1 COG:no KEGG:ECO26_2894 NR:ns ## KEGG: ECO26_2894 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 164 86 249 249 338 98.0 4e-92 IYYFHNLTPGWVSFNGEKPEIAIVPQSLHRLIYGPDKRATPPLDDDLIVNLCTSEHLLVH HPMLEGILLSECERLRQRSLANKLISLFRQFGGTELRLKLVWLCWLDLMTGNCLDDWTEN LKRKSEKELEEWIINRQKQSAALTDLMDQYVLLSYRTTVDDKRT Prediction of potential genes in microbial genomes Time: Sun May 15 23:55:57 2011 Seq name: gi|296494497|gb|ADTN01000241.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont596.1, whole genome shotgun sequence Length of sequence - 1519 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 760 216 ## COG3209 Rhs family protein 2 1 Op 2 . 
+ CDS 757 - 1326 88 ## JW0693 hypothetical protein Predicted protein(s) >gi|296494497|gb|ADTN01000241.1| GENE 1 2 - 760 216 252 aa, chain + ## HITS:1 COG:ECs0729 KEGG:ns NR:ns ## COG: ECs0729 COG3209 # Protein_GI_number: 15829983 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli O157:H7 # 1 252 1148 1399 1399 517 98.0 1e-147 CDHRGLPLALVSTEGATEWCAEYDEWGNLLNEENPHQLQQLIRLPGQQYDEESGLYYNRH RYYDPLQGRYITQDPIGLKGGWNFYQYPLNPVQYIDSMGLASKYGHLNNGGYGARPNKPP TPDPSKLPDIAKQLRLPYPIDQASSAPNVFKTFFRALSPYDYTLYCRKWVKPNLTCTPQD DSQYPGMDTKTASDYLPQTNWPTTQLPPGYTCAEPYLFPDINKPDGPATAGIDDLGEILA KMKQRTSRGIRK >gi|296494497|gb|ADTN01000241.1| GENE 2 757 - 1326 88 189 aa, chain + ## HITS:1 COG:no KEGG:JW0693 NR:ns ## KEGG: JW0693 # Name: ybfC # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 189 1 189 189 394 100.0 1e-109 MKRVLFFLLMIFVSFGVIADCEIQAKDHDCFTIFAKGTIFSAFPVLNNKAMWRWYQNEDI GEYYWQTELGTCKNNKFTPSGARLLIRVGSLRLNENHAIKGTLQELINTAEKTAFLGDRF RSYIRAGIYQKKSSDPVQLLAVLDNSIMVKYFKDEKPTYARMTAHLPNKNESYECLIKIQ HELIRSEEK Prediction of potential genes in microbial genomes Time: Sun May 15 23:56:18 2011 Seq name: gi|296494496|gb|ADTN01000242.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont602.1, whole genome shotgun sequence Length of sequence - 64445 bp Number of predicted genes - 54, with homology - 51 Number of transcription units - 31, operones - 11 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 73 - 104 4.1 1 1 Tu 1 . - CDS 113 - 781 647 ## EC55989_1562 putative lipoprotein - Prom 866 - 925 4.5 - Term 864 - 923 9.7 2 2 Op 1 4/0.300 - CDS 1083 - 1676 659 ## COG0500 SAM-dependent methyltransferases 3 2 Op 2 . - CDS 1673 - 2665 944 ## COG1275 Tellurite resistance protein and related permeases - Prom 2866 - 2925 4.5 + Prom 2649 - 2708 6.6 4 3 Tu 1 . 
+ CDS 2789 - 3769 707 ## JW1424 hypothetical protein 5 4 Op 1 1/0.700 - CDS 3761 - 4300 934 ## PROTEIN SUPPORTED gi|16129386|ref|NP_415944.1| ribosomal-protein-L7/L12-serine acetyltransferase - Term 4324 - 4352 1.3 6 4 Op 2 . - CDS 4363 - 4587 232 ## COG2841 Uncharacterized protein conserved in bacteria 7 5 Tu 1 . - CDS 4727 - 6346 1501 ## COG3131 Periplasmic glucans biosynthesis protein - Prom 6461 - 6520 3.9 8 6 Tu 1 . - CDS 6607 - 7950 1025 ## COG5383 Uncharacterized protein conserved in bacteria - Prom 8088 - 8147 4.4 + Prom 8063 - 8122 4.5 9 7 Tu 1 . + CDS 8167 - 9090 461 ## COG0583 Transcriptional regulator - Term 8929 - 8970 -0.7 10 8 Tu 1 . - CDS 9128 - 10768 1132 ## COG0840 Methyl-accepting chemotaxis protein - Prom 10852 - 10911 5.7 + Prom 10647 - 10706 3.6 11 9 Tu 1 . + CDS 10917 - 10982 89 ## - Term 11151 - 11179 -0.9 12 10 Tu 1 . - CDS 11388 - 11561 119 ## SSON_1723 hypothetical protein - Prom 11685 - 11744 8.1 13 11 Op 1 . - CDS 11806 - 12336 291 ## COG3038 Cytochrome B561 14 11 Op 2 . - CDS 12385 - 12465 77 ## - Prom 12693 - 12752 3.5 + Prom 12320 - 12379 5.2 15 12 Tu 1 . + CDS 12525 - 12641 132 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase + Prom 12775 - 12834 1.9 16 13 Op 1 . + CDS 12900 - 13277 418 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase 17 13 Op 2 . + CDS 13274 - 13525 268 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase + Term 13534 - 13567 4.4 - Term 13299 - 13338 -0.0 18 14 Tu 1 . - CDS 13567 - 14520 985 ## COG1012 NAD-dependent aldehyde dehydrogenases - Prom 14629 - 14688 5.0 19 15 Tu 1 . - CDS 14717 - 15517 487 ## COG1434 Uncharacterized conserved protein - Prom 15696 - 15755 6.5 20 16 Op 1 . - CDS 15789 - 19691 4271 ## COG1643 HrpA-like helicases 21 16 Op 2 . - CDS 19719 - 19793 95 ## + Prom 19683 - 19742 4.0 22 17 Tu 1 . 
+ CDS 19892 - 20497 764 ## COG1182 Acyl carrier protein phosphodiesterase 23 18 Op 1 5/0.200 - CDS 20551 - 21843 591 ## COG0671 Membrane-associated phospholipid phosphatase 24 18 Op 2 3/0.400 - CDS 21857 - 23614 1020 ## COG0500 SAM-dependent methyltransferases 25 18 Op 3 4/0.300 - CDS 23630 - 24526 193 ## COG4589 Predicted CDP-diglyceride synthetase/phosphatidate cytidylyltransferase 26 18 Op 4 . - CDS 24526 - 24993 221 ## COG0558 Phosphatidylglycerophosphate synthase - Prom 25213 - 25272 5.0 - Term 25218 - 25270 2.3 27 19 Op 1 . - CDS 25302 - 27608 1071 ## JW5221 hypothetical protein - Term 27619 - 27669 -0.9 28 19 Op 2 . - CDS 27671 - 28531 725 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) - Prom 28647 - 28706 5.0 - Term 28679 - 28716 5.2 29 20 Tu 1 . - CDS 28739 - 34801 4369 ## EcHS_A1488 autotransporter (AT) family porin - Prom 34943 - 35002 11.4 - Term 35058 - 35100 6.4 30 21 Op 1 1/0.700 - CDS 35132 - 35722 574 ## COG0663 Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily 31 21 Op 2 2/0.600 - CDS 35704 - 36654 772 ## COG3327 Phenylacetic acid-responsive transcriptional repressor - Prom 36683 - 36742 2.4 32 22 Op 1 1/0.700 - CDS 36755 - 38068 1191 ## COG1541 Coenzyme F390 synthetase 33 22 Op 2 1/0.700 - CDS 38095 - 39300 1123 ## COG0183 Acetyl-CoA acetyltransferase 34 22 Op 3 1/0.700 - CDS 39300 - 39722 404 ## COG2050 Uncharacterized protein, possibly involved in aromatic compounds catabolism 35 22 Op 4 7/0.000 - CDS 39712 - 41136 1220 ## COG1250 3-hydroxyacyl-CoA dehydrogenase 36 22 Op 5 12/0.000 - CDS 41141 - 41926 800 ## COG1024 Enoyl-CoA hydratase/carnithine racemase 37 22 Op 6 1/0.700 - CDS 41929 - 42696 714 ## COG1024 Enoyl-CoA hydratase/carnithine racemase 38 22 Op 7 2/0.600 - CDS 42693 - 43763 990 ## COG1018 Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 39 22 Op 8 4/0.300 - CDS 43771 - 44268 516 ## COG2151 Predicted metal-sulfur cluster biosynthetic enzyme 40 22 Op 9 5/0.200 - CDS 44283 - 
45029 685 ## COG3396 Uncharacterized conserved protein 41 22 Op 10 5/0.200 - CDS 45038 - 45325 373 ## COG3460 Uncharacterized enzyme of phenylacetate metabolism 42 22 Op 11 . - CDS 45337 - 46266 701 ## COG3396 Uncharacterized conserved protein - Prom 46489 - 46548 7.1 + Prom 46304 - 46363 8.0 43 23 Tu 1 . + CDS 46551 - 48596 1664 ## COG1012 NAD-dependent aldehyde dehydrogenases + Term 48600 - 48655 6.7 + Prom 48622 - 48681 3.4 44 24 Tu 1 . + CDS 48844 - 51117 2439 ## COG3733 Cu2+-containing amine oxidase + Term 51136 - 51162 1.0 - Term 51124 - 51150 1.0 45 25 Tu 1 . - CDS 51175 - 52674 1214 ## COG1012 NAD-dependent aldehyde dehydrogenases - Prom 52781 - 52840 4.2 + Prom 52784 - 52843 4.1 46 26 Tu 1 . + CDS 52910 - 53815 516 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 53820 - 53858 2.4 - Term 53797 - 53856 9.8 47 27 Op 1 . - CDS 53987 - 54361 206 ## COG3784 Uncharacterized protein conserved in bacteria 48 27 Op 2 . - CDS 54321 - 54506 272 ## JW1377 predicted lipoprotein 49 27 Op 3 . - CDS 54503 - 57142 2404 ## JW1376 hypothetical protein - Prom 57285 - 57344 4.4 + Prom 57251 - 57310 8.4 50 28 Op 1 . + CDS 57350 - 58339 1206 ## COG1052 Lactate dehydrogenase and related dehydrogenases + Prom 58357 - 58416 2.2 51 28 Op 2 . + CDS 58438 - 58872 201 ## PROTEIN SUPPORTED gi|163801140|ref|ZP_02195040.1| 50S ribosomal protein L25 - Term 58586 - 58643 6.2 52 29 Tu 1 . - CDS 58869 - 59135 205 ## COG3042 Putative hemolysin - Prom 59235 - 59294 3.0 + Prom 59203 - 59262 4.7 53 30 Tu 1 . + CDS 59409 - 62933 3386 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit + Term 62971 - 63002 3.2 + Prom 63107 - 63166 9.2 54 31 Tu 1 . 
+ CDS 63300 - 64433 1246 ## COG3203 Outer membrane protein (porin) Predicted protein(s) >gi|296494496|gb|ADTN01000242.1| GENE 1 113 - 781 647 222 aa, chain - ## HITS:1 COG:no KEGG:EC55989_1562 NR:ns ## KEGG: EC55989_1562 # Name: ydcL # Def: putative lipoprotein # Organism: E.coli_55989 # Pathway: not_defined # 1 222 1 222 222 434 100.0 1e-120 MRTTSFAKVAALCGLLALSGCASKITQPDKYSGFLNNYSDLKETTSATGKPVLRWVDPSF DQSKYDSIVWNPITYYPVPKPSTQVGQKVLDKILNYTNTEMKEAIAQRKPLVTTAGPRSL IFRGAITGVDTSKEGLQFYEVVPVALVVAGTQMATGHRTMDTRLYFEGELIDAATNKPVI KVVRQGEGKDLNNESTPMAFENIKQVIDDMATDATMFDVNKK >gi|296494496|gb|ADTN01000242.1| GENE 2 1083 - 1676 659 197 aa, chain - ## HITS:1 COG:tehB KEGG:ns NR:ns ## COG: tehB COG0500 # Protein_GI_number: 16129389 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli K12 # 1 197 1 197 197 403 100.0 1e-113 MIIRDENYFTDKYELTRTHSEVLEAVKVVKPGKTLDLGCGNGRNSLYLAANGYDVDAWDK NAMSIANVERIKSIENLDNLHTRVVDLNNLTFDRQYDFILSTVVLMFLEAKTIPGLIANM QRCTKPGGYNLIVAAMDTADYPCTVGFPFAFKEGELRRYYEGWERVKYNEDVGELHRTDA NGNRIKLRFATMLARKK >gi|296494496|gb|ADTN01000242.1| GENE 3 1673 - 2665 944 330 aa, chain - ## HITS:1 COG:tehA KEGG:ns NR:ns ## COG: tehA COG1275 # Protein_GI_number: 16129388 # Func_class: P Inorganic ion transport and metabolism # Function: Tellurite resistance protein and related permeases # Organism: Escherichia coli K12 # 1 330 1 330 330 579 100.0 1e-165 MQSDKVLNLPAGYFGIVLGTIGMGFAWRYASQVWQVSHWLGDGLVILAMIIWGLLTSAFI ARLIRFPHSVLAEVRHPVLSSFVSLFPATTMLVAIGFVPWFRPLAVCLFSFGVVVQLAYA AWQTAGLWRGSHPEEATTPGLYLPTVANNFISAMACGALGYTDAGLVFLGAGVFSWLSLE PVILQRLRSSGELPTALRTSLGIQLAPALVACSAWLSVNGGEGDTLAKMLFGYGLLQLLF MLRLMPWYLSQPFNASFWSFSFGVSALATTGLHLGSGSDNGFFHTLAVPLFIFTNFIIAI LLIRTFALLMQGKLLVRTERAVLMKAEDKE >gi|296494496|gb|ADTN01000242.1| GENE 4 2789 - 3769 707 326 aa, chain + ## HITS:1 COG:no KEGG:JW1424 NR:ns ## KEGG: JW1424 # Name: ydcK # Def: hypothetical protein # 
Organism: E.coli_J # Pathway: not_defined # 1 326 1 326 326 640 100.0 0 MRKYRLSEEQRAFSYQEDGTKKNVLLRQIIAISDFNDVIAGTAGGWIDRETVLAQEGNCW IYDQNAIAFGGAVISGNTRITGTSVLWGEVYATDNVWIDNSEISQGAYISDSVTIHDSLV YGQCRIFGHALIDQHSMIVAAQGLTPDHQLLLQIYDRARVSASRIVHQAQIYGDAVVRYA FIEHRAEVFDFASIEGNEENNVWLCDCAKVYGHAQVKAGIEEDAIPTIHYSSQVAEYAIV EGNCVLKHHVLVGGNAVVRGGPILLDEHVVIQGESRITGAVIIENHVELTDHAVVEAFDG DTVHVRGPKVINGEERITRTPLAGLL >gi|296494496|gb|ADTN01000242.1| GENE 5 3761 - 4300 934 179 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|16129386|ref|NP_415944.1| ribosomal-protein-L7/L12-serine acetyltransferase [Escherichia coli str. K-12 substr. MG1655] # 1 179 1 179 179 364 100 1e-100 MTETIKVSESLELHAVAENHVKPLYQLICKNKTWLQQSLNWPQFVQSEEDTRKTVQGNVM LHQRGYAKMFMIFKEDELIGVISFNRIEPLNKTAEIGYWLDESHQGQGIISQALQALIHH YAQSGELRRFVIKCRVDNPQSNQVALRNGFILEGCLKQAEFLNDAYDDVNLYARIIDSQ >gi|296494496|gb|ADTN01000242.1| GENE 6 4363 - 4587 232 74 aa, chain - ## HITS:1 COG:ydcH KEGG:ns NR:ns ## COG: ydcH COG2841 # Protein_GI_number: 16129385 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 20 74 1 55 55 82 100.0 2e-16 MFPEYRDLISRLKNENPRFMSLFDKHNKLDHEIARKEGSDGRGYNAEVVRMKKQKLQLKD EMLKILQQESVKEV >gi|296494496|gb|ADTN01000242.1| GENE 7 4727 - 6346 1501 539 aa, chain - ## HITS:1 COG:ydcG KEGG:ns NR:ns ## COG: ydcG COG3131 # Protein_GI_number: 16129383 # Func_class: P Inorganic ion transport and metabolism # Function: Periplasmic glucans biosynthesis protein # Organism: Escherichia coli K12 # 1 539 13 551 551 1130 100.0 0 MAAVCGTSGIASLFSQAAFAADSDIADGQTQRFDFSILQSMAHDLAQTAWRGAPRPLPDT LATMTPQAYNSIQYDAEKSLWHNVENRQLDAQFFHMGMGFRRRVRMFSVDPATHLAREIH FRPELFKYNDAGVDTKQLEGQSDLGFAGFRVFKAPELARRDVVSFLGASYFRAVDDTYQY GLSARGLAIDTYTDSKEEFPDFTAFWFDTVKPGATTFTVYALLDSASITGAYKFTIHCEK SQVIMDVENHLYARKDIKQLGIAPMTSMFSCGTNERRMCDTIHPQIHDSDRLSMWRGNGE WICRPLNNPQKLQFNAYTDNNPKGFGLLQLDRDFSHYQDIMGWYNKRPSLWVEPRNKWGK GTIGLMEIPTTGETLDNIVCFWQPEKAVKAGDEFAFQYRLYWSAQPPVHCPLARVMATRT 
GMGGFSEGWAPGEHYPEKWARRFAVDFVGGDLKAAAPKGIEPVITLSSGEAKQIEILYIE PIDGYRIQFDWYPTSDSTDPVDMRMYLRCQGDAISETWLYQYFPPAPDKRQYVDDRVMS >gi|296494496|gb|ADTN01000242.1| GENE 8 6607 - 7950 1025 447 aa, chain - ## HITS:1 COG:ydcJ KEGG:ns NR:ns ## COG: ydcJ COG5383 # Protein_GI_number: 16129382 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 447 1 447 447 871 99.0 0 MANSITADEIREQFSQAMSAMYQQEVPQYGTLLELVADVNLAVLENNPQLHEKMVNADEL ARLNVERHGAIRVGTAQELATLRRMFAIMGMYPVSYYDLSQAGVPVHSTAFRPIDDASLA RNPFRVFTSLLRLELIENEILRQKAAEILRQRDIFTPRCRQLLEEYEQQGGFNETQAQEF MQEALETFRWHQSATVDEETYRALHNKHRLIADVVCFPGCHINHLTPRTLDIDRVQSMMP ECGIEPKILIEGPPRREVPILLRQTSFKALEETVLFAGQKQGTHTARFGEIEQRGVALTP KGRQLYDDLLRNAGTGQDNLTHQMHLQETFRTFPDSEFLMRQQGLAWFRYRLTPSGEAHR QAIHPGDDPQPLIERGWVVAQPITYEDFLPVSAAGIFQSNLGNETQTRSHGNASREAFEQ ALGCPVLDEFQLYQEAEERSKRRCGLL >gi|296494496|gb|ADTN01000242.1| GENE 9 8167 - 9090 461 307 aa, chain + ## HITS:1 COG:ydcI KEGG:ns NR:ns ## COG: ydcI COG0583 # Protein_GI_number: 16129381 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 307 48 354 354 600 100.0 1e-171 MEKNSLFSQRIRLRHLHTFVAVAQQGTLGRAAETLNLSQPALSKTLNELEQLTGARLFER GRQGAQLTLPGEQFLTHAVRVLDAINTAGRSLHRKEGLNNDVVRVGALPTAALGILPSVI GQFHQQQKETTLQVATMSNPMILAGLKTGEIDIGIGRMSDPELMTGLNYELLFLESLKLV VRPNHPLLQENVTLSRVLEWPVVVSPEGTAPRQHSDALVQSQGCKIPSGCIETLSASLSR QLTVEYDYVWFVPSGAVKDDLRHATLVALPVPGHGAGEPIGILTRVDATFSSGCQLMINA IRKSMPF >gi|296494496|gb|ADTN01000242.1| GENE 10 9128 - 10768 1132 546 aa, chain - ## HITS:1 COG:trg KEGG:ns NR:ns ## COG: trg COG0840 # Protein_GI_number: 16129380 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Escherichia coli K12 # 1 546 1 546 546 899 100.0 0 MNTTPSQRLGFLHHIRLVPLFACILGGILVLFALSSALAGYFLWQADRDQRDVTAEIEIR TGLANSSDFLRSARINMIQAGAASRIAEMEAMKRNIAQAESEIKQSQQGYRAYQNRPVKT 
PADEALDTELNQRFQAYITGMQPMLKYAKNGMFEAIINHESEQIRPLDNAYTDILNKAVK IRSTRANQLAELAHQRTRLGGMFMIGAFVLALVMTLITFMVLRRIVIRPLQHAAQRIEKI ASGDLTMNDEPAGRNEIGRLSRHLQQMQHSLGMTVGTVRQGAEEIYRGTSEISAGNADLS SRTEEQAAAIEQTAASMEQLTATVKQNADNAHHASKLAQEASIKASDGGQTVSGVVKTMG AISTSSKKISEITAVINSIAFQTNILALNAAVEAARAGEQGRGFAVVASEVRTLASRSAQ AAKEIEGLISESVRLIDLGSDEVATAGKTMSTIVDAVASVTHIMQEIAAASDEQSRGITQ VSQAISEMDKVTQQNASLVEEASAAAVSLEEQAARLTEAVDVFRLHKHSVSAEPRGAGEP VSFATV >gi|296494496|gb|ADTN01000242.1| GENE 11 10917 - 10982 89 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAKLANPNDEKFWKTARNFVK >gi|296494496|gb|ADTN01000242.1| GENE 12 11388 - 11561 119 57 aa, chain - ## HITS:1 COG:no KEGG:SSON_1723 NR:ns ## KEGG: SSON_1723 # Name: ydcA # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 57 1 57 57 92 100.0 4e-18 MKKLALILFMGTLVSFYADAGRKPCSGSKGGISHCTAGGKFVCNDGSISASKKTCTN >gi|296494496|gb|ADTN01000242.1| GENE 13 11806 - 12336 291 176 aa, chain - ## HITS:1 COG:STM1639 KEGG:ns NR:ns ## COG: STM1639 COG3038 # Protein_GI_number: 16764983 # Func_class: C Energy production and conversion # Function: Cytochrome B561 # Organism: Salmonella typhimurium LT2 # 1 175 1 175 176 281 84.0 5e-76 MENKYSRLQISIHWLVFLLVIAAYCAMEFRGFFPRSDRPLINMIHVSCGISILVLMVVRL LLRLKYPTPPIIPKPKPMMTGLAHLGHLVIYLLFIALPVIGLVMMYNRGNPWFAFGLTMP YASEANFERVDSLKSWHETLANLGYFVIGLHAAAALAHHYFWKDNTLLRMMPRKRS >gi|296494496|gb|ADTN01000242.1| GENE 14 12385 - 12465 77 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAGWAISILTVKRNVLKTGEQYLMKF >gi|296494496|gb|ADTN01000242.1| GENE 15 12525 - 12641 132 38 aa, chain + ## HITS:1 COG:gapC KEGG:ns NR:ns ## COG: gapC COG0057 # Protein_GI_number: 16132227 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Escherichia coli K12 # 1 38 1 38 332 78 100.0 4e-15 MSKVGINGFGRIGRLVLGRLLEVKSNIDVVAINDLTSP >gi|296494496|gb|ADTN01000242.1| GENE 16 12900 - 13277 418 125 aa, chain + ## HITS:1 COG:ECs2022 KEGG:ns 
NR:ns ## COG: ECs2022 COG0057 # Protein_GI_number: 15831276 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Escherichia coli O157:H7 # 1 125 126 250 333 238 99.0 2e-63 MKTIVYNVNDDTLDGNDTIVSVASCTTNCLAPMAKALHDSFGIEVGTMTTIHAYTGTQSL VDGPRGKDLRASRAAAENIIPHTTGAAKAIGLVIPELSGKLKGHAQRVPVKTGSVTELVS ILGKK >gi|296494496|gb|ADTN01000242.1| GENE 17 13274 - 13525 268 83 aa, chain + ## HITS:1 COG:ECs2022 KEGG:ns NR:ns ## COG: ECs2022 COG0057 # Protein_GI_number: 15831276 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Escherichia coli O157:H7 # 1 83 251 333 333 160 98.0 6e-40 MTAEEVNNALKQATTNNESFGYTDEEIVSSDIIGSHFGSVFDATQTEITAVGDLQLVKTV AWYDNEYGFVTQLIRTLEKFAKL >gi|296494496|gb|ADTN01000242.1| GENE 18 13567 - 14520 985 317 aa, chain - ## HITS:1 COG:ECs2021 KEGG:ns NR:ns ## COG: ECs2021 COG1012 # Protein_GI_number: 15831275 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Escherichia coli O157:H7 # 1 265 1 273 479 505 94.0 1e-143 MSVPVQHPMYIDGQFVTWRGDAWIDVVNPATEAVISRIPDGQAEDARKAIDAAERAQPEW EALPAIERASWLRKISAGIRERASEISALIVEEGGKIQQLAEVEVAFTADYIDYMAEWAR RYEGEIIQSDRPGENILLFKRALGVTTGILPWNFPFFLIARKMAPALLTGNTIVIKPSEF TPNNAIAFAKIVDEIGLPRGVFNLVLGRGETVGQELAGNPKVAMVSMTGSVSAGEKIMAT AAKNITKVCLELGGTQNLNVAMKAIKGLKFGETYINRENFEAMQGFHAGWRKSGIGGADG KHGLHEYLQTQVVYLQS >gi|296494496|gb|ADTN01000242.1| GENE 19 14717 - 15517 487 266 aa, chain - ## HITS:1 COG:ydcF KEGG:ns NR:ns ## COG: ydcF COG1434 # Protein_GI_number: 16129375 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 266 1 266 266 550 100.0 1e-156 MNITPFPTLSPATIDAINVIGQWLAQDDFSGEVPYQADCVILAGNAVMPTIDAACKIARD QQIPLLISGGIGHSTTFLYSAIAQHPHYNTIRTTGRAEATILADIAHQFWHIPHEKIWIE 
DQSTNCGENARFSIALLNQAVERVHTAIVVQDPTMQRRTMATFRRMTGDNPDAPRWLSYP GFVPQLGNNADSVIFINQLQGLWPVERYLSLLTGELPRLRDDSDGYGPRGRDFIVHVDFP AEVIHAWQTLKHDAVLIEAMESRSLR >gi|296494496|gb|ADTN01000242.1| GENE 20 15789 - 19691 4271 1300 aa, chain - ## HITS:1 COG:hrpA KEGG:ns NR:ns ## COG: hrpA COG1643 # Protein_GI_number: 16129374 # Func_class: L Replication, recombination and repair # Function: HrpA-like helicases # Organism: Escherichia coli K12 # 20 1300 1 1281 1281 2546 100.0 0 MTEQQKLTFTALQQRLDSLMLRDRLRFSRRLHGVKKVKNPDAQQAIFQEMAKEIDQAAGK VLLREAARPEITYPDNLPVSQKKQDILEAIRDHQVVIVAGETGSGKTTQLPKICMELGRG IKGLIGHTQPRRLAARTVANRIAEELKTEPGGCIGYKVRFSDHVSDNTMVKLMTDGILLA EIQQDRLLMQYDTIIIDEAHERSLNIDFLLGYLKELLPRRPDLKIIITSATIDPERFSRH FNNAPIIEVSGRTYPVEVRYRPIVEEADDTERDQLQAIFDAVDELSQESHGDILIFMSGE REIRDTADALNKLNLRHTEILPLYARLSNSEQNRVFQSHSGRRIVLATNVAETSLTVPGI KYVIDPGTARISRYSYRTKVQRLPIEPISQASANQRKGRCGRVSEGICIRLYSEDDFLSR PEFTDPEILRTNLASVILQMTALGLGDIAAFPFVEAPDKRNIQDGVRLLEELGAITTDEQ ASAYKLTPLGRQLSQLPVDPRLARMVLEAQKHGCVREAMIITSALSIQDPRERPMDKQQA SDEKHRRFHDKESDFLAFVNLWNYLGEQQKALSSNAFRRLCRTDYLNYLRVREWQDIYTQ LRQVVKELGIPVNSEPAEYREIHIALLTGLLSHIGMKDADKQEYTGARNARFSIFPGSGL FKKPPKWVMVAELVETSRLWGRIAARIDPEWVEPVAQHLIKRTYSEPHWERAQGAVMATE KVTVYGLPIVAARKVNYSQIDPALCRELFIRHALVEGDWQTRHAFFRENLKLRAEVEELE HKSRRRDILVDDETLFEFYDQRISHDVISARHFDSWWKKVSRETPDLLNFEKSMLIKEGA EKISKLDYPNFWHQGNLKLRLSYQFEPGADADGVTVHIPLPLLNQVEESGFEWQIPGLRR ELVIALIKSLPKPVRRNFVPAPNYAEAFLGRVKPLELPLLDSLERELRRMTGVTVDREDW HWDQVPDHLKITFRVVDDKNKKLKEGRSLQDLKDALKGKVQETLSAVADDGIEQSGLHIW SFGQLPESYEQKRGNYKVKAWPALVDERDSVAIKLFDNPLEQKQAMWNGLRRLLLLNIPS PIKYLHEKLPNKAKLGLYFNPYGKVLELIDDCISCGVDKLIDANGGPVWTEEGFAALHEK VRAELNDTVVDIAKQVEQILTAVFNINKRLKGRVDMTMALGLSDIKAQMGGLVYRGFVTG NGFKRLGDTLRYLQAIEKRLEKLAVDPHRDRAQMLKVENVQQAWQQWINKLPPARREDED VKEIRWMIEELRVSYFAQQLGTPYPISDKRILQAMEQISG >gi|296494496|gb|ADTN01000242.1| GENE 21 19719 - 19793 95 24 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFEFIEQRIESRDVVLYIYHLKEN >gi|296494496|gb|ADTN01000242.1| GENE 22 19892 - 20497 764 201 aa, chain + ## HITS:1 
COG:acpD KEGG:ns NR:ns ## COG: acpD COG1182 # Protein_GI_number: 16129373 # Func_class: I Lipid transport and metabolism # Function: Acyl carrier protein phosphodiesterase # Organism: Escherichia coli K12 # 1 201 1 201 201 386 100.0 1e-107 MSKVLVLKSSILAGYSQSNQLSDYFVEQWREKHSADEITVRDLAANPIPVLDGELVGALR PSDAPLTPRQQEALALSDELIAELKAHDVIVIAAPMYNFNISTQLKNYFDLVARAGVTFR YTENGPEGLVTGKKAIVITSRGGIHKDGPTDLVTPYLSTFLGFIGITDVKFVFAEGIAYG PEMAAKAQSDAKAAIDSIVSA >gi|296494496|gb|ADTN01000242.1| GENE 23 20551 - 21843 591 430 aa, chain - ## HITS:1 COG:ynbD_1 KEGG:ns NR:ns ## COG: ynbD_1 COG0671 # Protein_GI_number: 16129372 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Escherichia coli K12 # 1 342 1 342 342 652 100.0 0 MLQGAGWLLLLAPFFFFTYGSLNQFTAVQDLNSHDIPSQVFGWETAIPFLPWTIVPYWSL DLLYGFSLFVCSTTFEQRRLVHRLILATVMACCGFLLYPLKFSFIRPEVSGVTGWLFSQL ELFDLPYNQSPSLHIILCWLLWRHFRQHLAERWRKVCGGWFLLIAISTLTTWQHHFIDVI TGLAVGMLIDWMVPVDRRWNYQKPDQRRIKIALPYVVGAGSCIVLMELMMMIQLWWSVWL CWPVLSLLIIGRGYGGLGAITTGKDSQGKLPPAVYWLTLPCRIGMWLSMRWFCRRLEPVS KMTAGVYLGAFPRHIPAQNAVLDVTFEFPRGRATKDRLYFCVPMLDLVVPEEGELRQAVA MLETLREEQGSVLVHCALGLSRSALVVAAWLLCYGHCKTVNEAISYIRARRPQIVLTDEH KAMLRLWENR >gi|296494496|gb|ADTN01000242.1| GENE 24 21857 - 23614 1020 585 aa, chain - ## HITS:1 COG:ynbC_2 KEGG:ns NR:ns ## COG: ynbC_2 COG0500 # Protein_GI_number: 16129371 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Escherichia coli K12 # 275 585 1 311 311 645 100.0 0 MENSRIPGEHFFTTSDNTALFYRHWPALQPGAKKVIVLFHRGHEHSGRLQHLVDELAMPD TAFYAWDARGHGKSSGPRGYSPSLARSVRDVDEFVRFAASDSQVGLEEVVVIAQSVGAVL VATWIHDYAPAIRGLVLASPAFKVKLYVPLARPALALWHRLRGLFFINSYVKGRYLTHDR QRGASFNNDPLITRAIAVNILLDLYKTSERIIRDAAAITLPTQLLISGDDYVVHRQPQID FYQRLRSPLKELHLLPGFYHDTLGEENRALAFEKMQSFISRLYANKSQKFDYQHEDCTGP SADRWRLLSGGPVPLSPVDLAYRFMRKAMKLFGTHSSGLHLGMSTGFDSGSSLDYVYQNQ 
PQGSNAFGRLVDKIYLNSVGWRGIRQRKTHLQILIKQAVADLHAKGLAVRVVDIAAGHGR YVLDALANEPAVSDILLRDYSELNVAQGQEMIAQRGMSGRVRFEQGDAFNPEELSALTPR PTLAIVSGLYELFPENEQVKNSLAGLANAIEPGGILIYTGQPWHPQLEMIAGVLTSHKDG KPWVMRVRSQGEMDSLVRDAGFDKCTQRIDEWGIFTVSMAVRRDN >gi|296494496|gb|ADTN01000242.1| GENE 25 23630 - 24526 193 298 aa, chain - ## HITS:1 COG:ynbB KEGG:ns NR:ns ## COG: ynbB COG4589 # Protein_GI_number: 16129370 # Func_class: R General function prediction only # Function: Predicted CDP-diglyceride synthetase/phosphatidate cytidylyltransferase # Organism: Escherichia coli K12 # 1 298 1 298 298 522 100.0 1e-148 MLEKSLATLFALLILATLINRFLLWRLPERKGGEVTLRIRTWWGIVICFSMVISGPRWMT LTFFALISFLALKEYCTLISVHFPRWLYWGIPLNYLLIGFNCFELFLLFIPLAGFLILAT GQVLVGDPSGFLHTVSAIFWGWIMTVFALSHAAWLLMLPTTNIQGGALLVLFLLALTESN DIAQYLWGKSCGRRKVVPKVSPGKTLEGLMGGVITIMIASLIIGPLLTPLNTLQALLAGL LIGISGFCGDVVMSAIKRDIGVKDSGKLLPGHGGLLDRIDSLIFTAPVFFYFIRYCCY >gi|296494496|gb|ADTN01000242.1| GENE 26 24526 - 24993 221 155 aa, chain - ## HITS:1 COG:ynbA KEGG:ns NR:ns ## COG: ynbA COG0558 # Protein_GI_number: 16129369 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphate synthase # Organism: Escherichia coli K12 # 5 155 53 203 203 269 100.0 1e-72 MLAAQPILFLLLPIVLFIRMALNALDGMLARECNQQTRLGAILNETGDVISDIALYLPFL FLPESNASLVILMLFCTILTEFCGLLAQTINGVRSYAGPFGKSDRALIFGLWGLAVAIYP QWMQWNNLLWSIASILLLWTAINRCRSVLLMSAEI >gi|296494496|gb|ADTN01000242.1| GENE 27 25302 - 27608 1071 768 aa, chain - ## HITS:1 COG:no KEGG:JW5221 NR:ns ## KEGG: JW5221 # Name: ydbD # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 768 1 768 768 1506 99.0 0 MFKARNCGWIRLLPLFMLSLPVQAELRCVANAVDIESFFSAATAEDKQQVEQAINSSVNL VPFGLSASNWKVHRGDLVVEGNIESNQKLIVLGNLTVKGNISTFSLSNPWVILGNVTATN IVADSPLLITGSINASGLVFIDSYYDNPSTIKGSINARGIFINDIIAPVVASSTNSEFMV RASDKHDTENVKKALMIINPDAYYWGLINDEDALKEIFKRSNIRMAGNVCNQMKKEALFR PKPSPELVQELQMLDEGKVAAFEGRDIATFDLAVMRTLPRLKGISANLRKQLINSNDEQT IESMARYMPDNEILELTDQQLGYQPVVLGLLDREPLSVEIMTRMSRLPDGVGPLNLALRE 
NLPLDIVMTLAKRDWDMIIQELYKDAWLLPESIIDGYIRSDDSSIRQVGAGGQLTYNQAM QLANDSSNNVVTSLAFKLAEMKHHGQLLRMAPQESDKVAGYLYQKFENDDDLIRVLFLAL PDNLQFNFVKRMEKKSPAYFCCRDMQVIHSDAALQRLLTRFNDPEGWSNLAKNQYLSTSM KQKIWQRALSHRKNNPKADSDAYETSADMILSELISHGEVDDQMLLNATALIRSDDWDFL ESALISWDNLPAVVLKELQQNTPRNDIWAKFFLRQENSSRAQVDEALRVYYALDPDALAQ LDVLAKQPDRIWWSTLAKSNLTFFKFGALNNRHTPPAVLAAEIDPEWWIVAMNNPRFPVD VLKARLKRDPLLALELVNPELDLVRQLALNGKTRAIREQAMRKLDELY >gi|296494496|gb|ADTN01000242.1| GENE 28 27671 - 28531 725 286 aa, chain - ## HITS:1 COG:ydbC KEGG:ns NR:ns ## COG: ydbC COG0667 # Protein_GI_number: 16129367 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Escherichia coli K12 # 1 286 1 286 286 560 100.0 1e-159 MSSNTFTLGTKSVNRLGYGAMQLAGPGVFGPPRDRHVAITVLREALALGVNHIDTSDFYG PHVTNQIIREALYPYSDDLTIVTKIGARRGEDASWLPAFSPAELQKAVHDNLRNLGLDVL DVVNLRVMMGDGHGPAEGSIEASLTVLAEMQQQGLVKHIGLSNVTPTQVAEARKIAEIVC VQNEYNIAHRADDAMIDALAHDGIAYVPFFPLGGFTPLQSSTLSDVAASLGATPMQVALA WLLQRSPNILLIPGTSSVAHLRENMAAEKLHLSEEVLSTLDGISRE >gi|296494496|gb|ADTN01000242.1| GENE 29 28739 - 34801 4369 2020 aa, chain - ## HITS:1 COG:no KEGG:EcHS_A1488 NR:ns ## KEGG: EcHS_A1488 # Name: not_defined # Def: autotransporter (AT) family porin # Organism: E.coli_HS # Pathway: not_defined # 1 2020 1 1951 1951 2767 93.0 0 MQRKTLLSACIALALSGQGWAADITEVETTTGEKKNTNVTCPADPGKLSPEELKRLPSEC SPLVEQNLMPWLSTGAAALITALAVVELNDDDDHHHRNNSPLPPTPPDDESDDTPVPPTP GGDEIIPDDPDDTPTPPKPVSFNNDVILDKTEKTLTIRDSVFTYTENADGTISLQDSNGR KATINLWQIDEANNTVALEGVSADGATKWQYNHNGELVITGDNATVNNNGKTTVDGKDST GTEINGNNGKVIQDGDLDVSGGGHGIDITGDSATVDNKGTMTVTDPESMGIQIDGDKAIV NNEGESTITNGGTGTQINGDDATANNNGKTTVDGKDSTGTEINGNNGKVIQDGDLDVSGG GHGIDITGDSATVDNKGTMTVTDPESIGIQVDGDQAVVNNEGESAITNGGTGTQINGDDA TANNNGKTTVDGKDSTGTEIAGNNGKVIQDGDLDVSGGGHGIDITGDSATVDNKGTMTVT DPESIGIQIDGDQAIVNNEGESTITNGGTGTQINGNDATANNSGKTTVDGKDSTGTKIAG NIGIVNLDGSLTVTGGAHGVENIGDNGTVNNKGDIVVSDTGSIGVLINGEGATVSNTGDV 
NVSNEATGFSITTNSGKVSLAGSMQVGDFSTGVDLNGNNNSVTLAAKDLKVVGQKATGIN VSGDANTVNITGNVLVDKDKTADNAAEYFFDPSVGINVYGSDNNVTLDGKLTVVSDSEVT SRQSNLFDGSAEKTSGLVVIGDGNTVNMNGGLELIGEKNALADGSQVTSLRTGYSYTSVI VVSGESSVYLNGDTTISGEFPLGFAGVIRVQDKALLEIGSGATLTMQDIDSFEHHGTRTP ELTYADSGAKIVNKGTVEIQNLGFAFVTGENTTGINSGTISLLQNGKDPAPSPIVLLATN GGSATNAGTITGKVTEQHSVFNKYSTGTSNSFIFNNDVSSITGLVAQSNSTIINTDSGII DLYGRGSVGMLAIADSTAENQGKITLDSMWVDANDTTAMRDIASNSAIDFGTGVGVGTDS YSGAGKNATAINQLGGVITIYNAGAGMAAYGASNTVINQGTINLEKNGNYDDSLAANTLV GMAVYEHGTAINDQTGVININVGTGQAFYNDGTGTIVNYGTICTFGVCQSGNEYNNTDDF TSLIYTGGDTITRSGETVTLNKSAAVTDKLAGNVVNSGTLSGDQITVSSGLLENTSGGII NNLVKLDKGAVIKNAGVMTNNVDVSGGILNNAGEMTAQITMNAGADSSLVNNTGTINKIV QNAGVFNNSGSVTGRMMSAGGVFNNQTDGAIMRGAALTGTAVANNEGTWNLGSSSEGNNT GMLEVNNNSAFNNRGEFILDNDKNAVHINQSGTLYNTGHMNISNSSHNGAVNMWGGNGRF INDGTIDVSAKSLVVSANNAGDQNAFFWNQDNGVINFDHDSASAVKVTHSNFIAQNDGIM NISGTGAVAMEGDKNAQLVNNGTINLGTAGTTDTGMIGMQLDANATADAVIENNGTINIF ANDSFAFSVLGTVGHVVNNGTVVIADGVTGSGLIKQGDSINVEGMNGNNGNSSEVHYGDY TLPDVPKPNTVSVTSGSDEAGGSMNNLNGYVVGTNVNGSAGKLKVNNASMNGVEINTGFT AGTADTTVSFDNVVEGSNLTDADAITSTSVVWTAKGSTDASGNVDVTMSKNAYTDVATDA SVNDIAKALDAGYTNNELFTSLNVGTTAELNSALKQVSGSQATTVFREARVLSNRFSMLA DAAPKVGNGLAFNVVAKGDPRAELGNNTEYDMLALRKTIDLSESQTMSLEYGIARLDGDG AQKAGDNGVTGGYSQFFGLKHQMSFDNGMNWNNALRYDVHNLDSSRSIAFGNTNKTADTD VKQQYLEFRSEGAKTFEPSEGLKVTPYAGVKLRHTLEGGYQERNAGDFNLNMNSGSETAV DSIVGLKLDYAGKDGWSASATLEGGPNLSYAKSQRTASLAGAGSQHFNVDDGQKGGGINS LTSVGVKYSSKESSLNLDAYNWKEDGISDKGVMLNFKKTF >gi|296494496|gb|ADTN01000242.1| GENE 30 35132 - 35722 574 196 aa, chain - ## HITS:1 COG:paaY KEGG:ns NR:ns ## COG: paaY COG0663 # Protein_GI_number: 16129361 # Func_class: R General function prediction only # Function: Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily # Organism: Escherichia coli K12 # 1 196 1 196 196 393 100.0 1e-109 MPIYQIDGLTPVVPEESFVHPTAVLIGDVILGKGVYVGPNASLRGDFGRIVVKDGANIQD NCVMHGFPEQDTVVGEDGHIGHSAILHGCIIRRNALVGMNAVVMDGAVIGENSIVGASAF VKAKAEMPANYLIVGSPAKAIRELSEQELAWKKQGTHEYQVLVTRCKQTLHQVEPLREIE 
PGRKRLVFDENLRPKQ >gi|296494496|gb|ADTN01000242.1| GENE 31 35704 - 36654 772 316 aa, chain - ## HITS:1 COG:paaX KEGG:ns NR:ns ## COG: paaX COG3327 # Protein_GI_number: 16129360 # Func_class: K Transcription # Function: Phenylacetic acid-responsive transcriptional repressor # Organism: Escherichia coli K12 # 1 316 1 316 316 627 100.0 1e-179 MSKLDTFIQHAVNAVPVSGTSLISSLYGDSLSHRGGEIWLGSLAALLEGLGFGERFVRTA LFRLNKEGWLDVSRIGRRSFYSLSDKGLRLTRRAESKIYRAEQPAWDGKWLLLLSEGLDK STLADVKKQLIWQGFGALAPSLMASPSQKLADVQTLLHEAGVADNVICFEAQIPLALSRA ALRARVEECWHLTEQNAMYETFIQSFRPLVPLLKEAADELTPERAFHIQLLLIHFYRRVV LKDPLLPEELLPAHWAGHTARQLCINIYQRVAPAALAFVSEKGETSVGELPAPGSLYFQR FGGLNIEQEALCQFIR >gi|296494496|gb|ADTN01000242.1| GENE 32 36755 - 38068 1191 437 aa, chain - ## HITS:1 COG:paaK KEGG:ns NR:ns ## COG: paaK COG1541 # Protein_GI_number: 16129359 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Escherichia coli K12 # 1 437 1 437 437 912 100.0 0 MITNTKLDPIETASVDELQALQTQRLKWTLKHAYENVPMYRRKFDAAGVHPDDFRELSDL RKFPCTTKQDLRDNYPFDTFAVPMEQVVRIHASSGTTGKPTVVGYTQNDIDNWANIVARS LRAAGGSPKDKIHVAYGYGLFTGGLGAHYGAERLGATVIPMSGGQTEKQAQLIRDFQPDM IMVTPSYCLNLIEELERQLGGDASGCSLRVGVFGAEPWTQAMRKEIERRLGITALDIYGL SEVMGPGVAMECLETTDGPTIWEDHFYPEIVNPHDGTPLADGEHGELLFTTLTKEALPVI RYRTRDLTRLLPGTARTMRRMDRISGRSDDMLIIRGVNVFPSQLEEEIVKFEHLSPHYQL EVNRRGHLDSLSVKVELKESSLTLTHEQRCQVCHQLRHRIKSMVGISTDVMIVNCGSIPR SEGKACRVFDLRNIVGA >gi|296494496|gb|ADTN01000242.1| GENE 33 38095 - 39300 1123 401 aa, chain - ## HITS:1 COG:paaJ KEGG:ns NR:ns ## COG: paaJ COG0183 # Protein_GI_number: 16129358 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA acetyltransferase # Organism: Escherichia coli K12 # 1 401 1 401 401 737 100.0 0 MREAFICDGIRTPIGRYGGALSSVRADDLAAIPLRELLVRNPRLDAECIDDVILGCANQA GEDNRNVARMATLLAGLPQSVSGTTINRLCGSGLDALGFAARAIKAGDGDLLIAGGVESM SRAPFVMGKAASAFSRQAEMFDTTIGWRFVNPLMAQQFGTDSMPETAENVAELLKISRED QDSFALRSQQRTAKAQSSGILAEEIVPVVLKNKKGVVTEIQHDEHLRPETTLEQLRGLKA 
PFRANGVITAGNASGVNDGAAALIIASEQMAAAQGLTPRARIVAMATAGVEPRLMGLGPV PATRRVLERAGLSIHDMDVIELNEAFAAQALGVLRELGLPDDAPHVNPNGGAIALGHPLG MSGARLALAASHELHRRNGRYALCTMCIGVGQGIAMILERV >gi|296494496|gb|ADTN01000242.1| GENE 34 39300 - 39722 404 140 aa, chain - ## HITS:1 COG:paaI KEGG:ns NR:ns ## COG: paaI COG2050 # Protein_GI_number: 16129357 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein, possibly involved in aromatic compounds catabolism # Organism: Escherichia coli K12 # 1 140 1 140 140 255 100.0 2e-68 MSHKAWQNAHAMYENDACAKALGIDIISMDEGFAVVTMTVTAQMLNGHQSCHGGQLFSLA DTAFAYACNSQGLAAVASACTIDFLRPGFAGDTLTATAQVRHQGKQTGVYDIEIVNQQQK TVALFRGKSHRIGGTITGEA >gi|296494496|gb|ADTN01000242.1| GENE 35 39712 - 41136 1220 474 aa, chain - ## HITS:1 COG:ydbU KEGG:ns NR:ns ## COG: ydbU COG1250 # Protein_GI_number: 16129356 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyacyl-CoA dehydrogenase # Organism: Escherichia coli K12 # 1 474 2 475 475 920 100.0 0 MINVQTVAVIGSGTMGAGIAEVAASHGHQVLLYDISAEALTRAIDGIHARLNSRVTRGKL TAETCERTLKRLIPVTDIHALAAADLVIEAASERLEVKKALFAQLAEVCPPQTLLTTNTS SISITAIAAEIKNPERVAGLHFFNPAPVMKLVEVVSGLATAAEVVEQLCELTLSWGKQPV RCHSTPGFIVNRVARPYYSEAWRALEEQVAAPEVIDAALRDGAGFPMGPLELTDLIGQDV NFAVTCSVFNAFWQERRFLPSLVQQELVIGGRLGKKSGLGVYDWRAEREAVVGLEAVSDS FSPMKVEKKSDGVTEIDDVLLIETQGETAQALAIRLARPVVVIDKMAGKVVTIAAAAVNP DSATRKAIYYLQQQGKTVLQIADYPGMLIWRTVAMIINEALDALQKGVASEQDIDTAMRL GVNYPYGPLAWGAQLGWQRILRLLENLQHHYGEERYRPCSLLRQRALLESGYES >gi|296494496|gb|ADTN01000242.1| GENE 36 41141 - 41926 800 261 aa, chain - ## HITS:1 COG:paaG KEGG:ns NR:ns ## COG: paaG COG1024 # Protein_GI_number: 16129355 # Func_class: I Lipid transport and metabolism # Function: Enoyl-CoA hydratase/carnithine racemase # Organism: Escherichia coli K12 # 1 261 2 262 262 489 100.0 1e-138 MEFILSHVEKGVMTLTLNRPERLNSFNDEMHAQLAECLKQVERDDTIRCLLLTGAGRGFC AGQDLNDRNVDPTGPAPDLGMSVERFYNPLVRRLAKLPKPVICAVNGVAAGAGATLALGG 
DIVIAARSAKFVMAFSKLGLIPDCGGTWLLPRVAGRARAMGLALLGNQLSAEQAHEWGMI WQVVDDETLADTAQQLARHLATQPTFGLGLIKQAINSAETNTLDTQLDLERDYQRLAGRS ADYREGVSAFLAKRSPQFTGK >gi|296494496|gb|ADTN01000242.1| GENE 37 41929 - 42696 714 255 aa, chain - ## HITS:1 COG:ydbS KEGG:ns NR:ns ## COG: ydbS COG1024 # Protein_GI_number: 16129354 # Func_class: I Lipid transport and metabolism # Function: Enoyl-CoA hydratase/carnithine racemase # Organism: Escherichia coli K12 # 1 255 1 255 255 435 100.0 1e-122 MSELIVSRQQRVLLLTLNRPAARNALNNALLMQLVNELEAAATDTSISVCVITGNARFFA AGADLNEMAEKDLAATLNDTRPQLWARLQAFNKPLIAAVNGYALGAGCELALLCDVVVAG ENARFGLPEITLGIMPGAGGTQRLIRSVGKSLASKMVLSGESITAQQAQQAGLVSDVFPS DLTLEYALQLASKMARHSPLALQAAKQALRQSQEVALQAGLAQERQLFTLLAATEDRHEG ISAFLQKRTPDFKGR >gi|296494496|gb|ADTN01000242.1| GENE 38 42693 - 43763 990 356 aa, chain - ## HITS:1 COG:paaE KEGG:ns NR:ns ## COG: paaE COG1018 # Protein_GI_number: 16129353 # Func_class: C Energy production and conversion # Function: Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 # Organism: Escherichia coli K12 # 1 356 1 356 356 715 100.0 0 MTTFHSLTVAKVESETRDAVTITFAVPQPLQEAYRFRPGQHLTLKASFDGEELRRCYSIC RSYLPGEISVAVKAIEGGRFSRYAREHIRQGMTLEVMVPQGHFGYQPQAERQGRYLAIAA GSGITPMLAIIATTLQTEPESQFTLIYGNRTSQSMMFRQALADLKDKYPQRLQLLCIFSQ ETLDSDLLHGRIDGEKLQSLGASLINFRLYDEAFICGPAAMMDDAETALKALGMPDKTIH LERFNTPGTRVKRSVNVQSDGQKVTVRQDGRDREIVLNADDESILDAALRQGADLPYACK GGVCATCKCKVLRGKVAMETNYSLEPDELAAGYVLSCQALPLTSDVVVDFDAKGMA >gi|296494496|gb|ADTN01000242.1| GENE 39 43771 - 44268 516 165 aa, chain - ## HITS:1 COG:paaD KEGG:ns NR:ns ## COG: paaD COG2151 # Protein_GI_number: 16129352 # Func_class: R General function prediction only # Function: Predicted metal-sulfur cluster biosynthetic enzyme # Organism: Escherichia coli K12 # 1 165 3 167 167 329 100.0 1e-90 MQRLATIAPPQVHEIWALLSQIPDPEIPVLTITDLGMVRNVTQMGEGWVIGFTPTYSGCP ATEHLIGAIREAMTTNGFTPVQVVLQLDPAWTTDWMTPDARERLREYGISPPAGHSCHAH LPPEVRCPRCASVHTTLISEFGSTACKALYRCDSCREPFDYFKCI >gi|296494496|gb|ADTN01000242.1| GENE 40 44283 - 45029 
685 248 aa, chain - ## HITS:1 COG:ydbP KEGG:ns NR:ns ## COG: ydbP COG3396 # Protein_GI_number: 16129351 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 248 1 248 248 487 100.0 1e-138 MNQLTAYTLRLGDNCLVLSQRLGEWCGHAPELEIDLALANIGLDLLGQARNFLSYAAELA GEGDEDTLAFTRDERQFSNLLLVEQPNGNFADTIARQYFIDAWHVALFTRLMESRDPQLA AISAKAIKEARYHLRFSRGWLERLGNGTDVSGQKMQQAINKLWRFTAELFDADEIDIALS EEGIAVDPRTLRAAWEAEVFAGINEATLNVPQEQAYRTGGKKGLHTEHLGPMLAEMQYLQ RVLPGQQW >gi|296494496|gb|ADTN01000242.1| GENE 41 45038 - 45325 373 95 aa, chain - ## HITS:1 COG:ynbF KEGG:ns NR:ns ## COG: ynbF COG3460 # Protein_GI_number: 16129350 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized enzyme of phenylacetate metabolism # Organism: Escherichia coli K12 # 1 95 1 95 95 189 100.0 8e-49 MSNVYWPLYEVFVRGKQGLSHRHVGSLHAADERMALENARDAYTRRSEGCSIWVVKASEI VASQPEERGEFFDPAESKVYRHPTFYTIPDGIEHM >gi|296494496|gb|ADTN01000242.1| GENE 42 45337 - 46266 701 309 aa, chain - ## HITS:1 COG:ydbO KEGG:ns NR:ns ## COG: ydbO COG3396 # Protein_GI_number: 16129349 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 309 1 309 309 632 100.0 0 MTQEERFEQRIAQETAIEPQDWMPDAYRKTLIRQIGQHAHSEIVGMLPEGNWITRAPTLR RKAILLAKVQDEAGHGLYLYSAAETLGCAREDIYQKMLDGRMKYSSIFNYPTLSWADIGV IGWLVDGAAIVNQVALCRTSYGPYARAMVKICKEESFHQRQGFEACMALAQGSEAQKQML QDAINRFWWPALMMFGPNDDNSPNSARSLTWKIKRFTNDELRQRFVDNTVPQVEMLGMTV PDPDLHFDTESGHYRFGEIDWQEFNEVINGRGICNQERLDAKRKAWEEGTWVREAALAHA QKQHARKVA >gi|296494496|gb|ADTN01000242.1| GENE 43 46551 - 48596 1664 681 aa, chain + ## HITS:1 COG:maoC_1 KEGG:ns NR:ns ## COG: maoC_1 COG1012 # Protein_GI_number: 16129348 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Escherichia coli K12 # 1 500 1 500 500 934 99.0 0 MQQLASFLSGTWQSGRGRSRLIHHAISGEALWEVTSEGLDMAAARQFAIEKGAPALRAMT 
FIERAAMLKAVARHLLSEKERFYALSAQTGATRADSWVDIEGGIGTLFTYASLGSRELPD DTLWPEDELIPLSKEGGFAARHLLTSKSGVAVHINAFNFPCWGMLEKLAPTWLGGMPAII KPATATAQLTQAMVKSIVDSGLVPEGAISLICGSAGDLLDHLDSQDVVTFTGSAATGQML RVQPNIVAKSIPFTMEADSLNCCVLGEDVTPDQPEFALFIREVVREMTTKAGQKCTAIRR IIVPQALVNAVSDALVARLQKVVVGDPAQEGVKMGALVNAEQRADVQEKVNILLAAGCEI RLGGQADLSAAGAFFPPTLLYCPQPDETPAVHATEAFGPVATLMPAQNQRHALQLACAGG GSLAGTLVTADPQIARQFIADAARTHGRIQILNEESAKESTGHGSPLPQLVHGGPGRAGG GEELGGLRAVKHYMQRTAVQGSPTMLAAISKQWVRGAKVEEDRIHPFRKYFEELQPGDSL LTPRRTMTEADIVNFACLSGDHFYAHMDKIAAAESIFGERVVHGYFVLSAAAGLFVDAGV GPVIANYGLESLRFIEPVKPGDTIQVRLTCKRKTLKKQRSAEEKPTGVVEWAVEVFNQHQ TPVALYSILTLVARQHGDFVD >gi|296494496|gb|ADTN01000242.1| GENE 44 48844 - 51117 2439 757 aa, chain + ## HITS:1 COG:tynA KEGG:ns NR:ns ## COG: tynA COG3733 # Protein_GI_number: 16129347 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Cu2+-containing amine oxidase # Organism: Escherichia coli K12 # 1 757 1 757 757 1540 100.0 0 MGSPSLYSARKTTLALAVALSFAWQAPVFAHGGEAHMVPMDKTLKEFGADVQWDDYAQLF TLIKDGAYVKVKPGAQTAIVNGQPLALQVPVVMKDNKAWVSDTFINDVFQSGLDQTFQVE KRPHPLNALTADEIKQAVEIVKASADFKPNTRFTEISLLPPDKEAVWAFALENKPVDQPR KADVIMLDGKHIIEAVVDLQNNKLLSWQPIKDAHGMVLLDDFASVQNIINNSEEFAAAVK KRGITDAKKVITTPLTVGYFDGKDGLKQDARLLKVISYLDVGDGNYWAHPIENLVAVVDL EQKKIVKIEEGPVVPVPMTARPFDGRDRVAPAVKPMQIIEPEGKNYTITGDMIHWRNWDF HLSMNSRVGPMISTVTYNDNGTKRKVMYEGSLGGMIVPYGDPDIGWYFKAYLDSGDYGMG TLTSPIARGKDAPSNAVLLNETIADYTGVPMEIPRAIAVFERYAGPEYKHQEMGQPNVST ERRELVVRWISTVGNYDYIFDWIFHENGTIGIDAGATGIEAVKGVKAKTMHDETAKDDTR YGTLIDHNIVGTTHQHIYNFRLDLDVDGENNSLVAMDPVVKPNTAGGPRTSTMQVNQYNI GNEQDAAQKFDPGTIRLLSNPNKENRMGNPVSYQIIPYAGGTHPVAKGAQFAPDEWIYHR LSFMDKQLWVTRYHPGERFPEGKYPNRSTHDTGLGQYSKDNESLDNTDAVVWMTTGTTHV ARAEEWPIMPTEWVHTLLKPWNFFDETPTLGALKKDK >gi|296494496|gb|ADTN01000242.1| GENE 45 51175 - 52674 1214 499 aa, chain - ## HITS:1 COG:feaB KEGG:ns NR:ns ## COG: feaB COG1012 # Protein_GI_number: 16129346 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde 
dehydrogenases # Organism: Escherichia coli K12 # 1 499 2 500 500 1001 100.0 0 MTEPHVAVLSQVQQFLDRQHGLYIDGRPGPAQSEKRLAIFDPATGQEIASTADANEADVD NAVMSAWRAFVSRRWAGRLPAERERILLRFADLVEQHSEELAQLETLEQGKSIAISRAFE VGCTLNWMRYTAGLTTKIAGKTLDLSIPLPQGARYQAWTRKEPVGVVAGIVPWNFPLMIG MWKVMPALAAGCSIVIKPSETTPLTMLRVAELASEAGIPDGVFNVVTGSGAVCGAALTSH PHVAKISFTGSTATGKGIARTAADHLTRVTLELGGKNPAIVLKDADPQWVIEGLMTGSFL NQGQVCAASSRIYIEAPLFDTLVSGFEQAVKSLQVGPGMSPVAQINPLVSRAHCDKVCSF LDDAQAQQAELIRGSNGPAGEGYYVAPTLVVNPDAKLRLTREEVFGPVVNLVRVADGEEA LQLANDTEYGLTASVWTQNLSQALEYSDRLQAGTVWVNSHTLIDANLPFGGMKQSGTGRD FGPDWLDGWCETKSVCVRY >gi|296494496|gb|ADTN01000242.1| GENE 46 52910 - 53815 516 301 aa, chain + ## HITS:1 COG:feaR KEGG:ns NR:ns ## COG: feaR COG2207 # Protein_GI_number: 16129345 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 301 1 301 301 611 99.0 1e-175 MNPAVDNEFQQWLSQINQVCGNFTGRLLTERYTGVLDTHFAKGLKLSTVTTSGVNLSRTW QEVKGSDDAWFYTVFQLSGQAIMEQDERQVQIGAGDITLLDASRPCSLYWQESSKQISLL LPRTLLEQYFPHQKPVCAERLDADLPMVQLSHRLLQESMNNPALSETESEAALQAMVCLL RPVLHQRESVQPRRERQFQKVVTLIDDNIREEILRPEWIAGETGMSVRSLYRMFADKGLV VAQYIRNRRLDFCADAIRHAADDEKLAGIGFHWGFSDQSHFSTVFKQRFGMTPGEYRRKF R >gi|296494496|gb|ADTN01000242.1| GENE 47 53987 - 54361 206 124 aa, chain - ## HITS:1 COG:ydbL KEGG:ns NR:ns ## COG: ydbL COG3784 # Protein_GI_number: 16129344 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 15 124 1 110 110 200 99.0 5e-52 MSKSCLKLVAIFSEVMMKKTLLLCAFLVGLVSSNVMALTLDEARTQGRVGETFYGYLVAL KTDAETEKLVADINAERKASYQQLAKQNNVSVDDIAKLAGQKLVARAKPGEYVQGINGKW VRKF >gi|296494496|gb|ADTN01000242.1| GENE 48 54321 - 54506 272 61 aa, chain - ## HITS:1 COG:no KEGG:JW1377 NR:ns ## KEGG: JW1377 # Name: ynbE # Def: predicted lipoprotein # Organism: E.coli_J # Pathway: not_defined # 1 61 1 61 61 101 100.0 1e-20 MKILLAALTSSFMLVGCTPRIEVAAPKEPITINMNVKIEHEIIIKADKDVEELLETRSDL F >gi|296494496|gb|ADTN01000242.1| GENE 
49 54503 - 57142 2404 879 aa, chain - ## HITS:1 COG:no KEGG:JW1376 NR:ns ## KEGG: JW1376 # Name: ydbH # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 879 1 879 879 1696 100.0 0 MLGKYKAVLALLLLIILVPLTLLMTLGLWVPTLAGIWLPLGTRIALDESPRITRKGLIIP DLRYLVGDCQLAHITNASLSHPSRWLLNVGTVELDSACLAKLPQTEQSPAAPKTLAQWQA MLPNTWINIDKLIFSPWQEWQGKLSLALTSDIQQLRYQGEKVKFQGQLKGQQLTVSELDV VAFENQPPVKLVGEFAMPLVPDGLPVSGHATATLNLPQEPSLVDAELDWQENSGQLIVLA RDNGDPLLDLPWQITRQQLTVSDGRWSWPYAGFPLSGRLGVKVDNWQAGLENALVSGRLS VLTQGQAGKGNAVLNFGPGKLSMDNSQLPLQLTGEAKQADLILYARLPAQLSGSLSDPTL TFEPGALLRSKGRVIDSLDIDEIRWPLAGVKVTQRGVDGRLQAILQAHENELGDFVLHMD GLANDFLPDAGRWQWRYWGKGSFTPMNATWDVAGKGEWHDSTITLTDLSTGFDQLQYGTM TVEKPRLILDKPIVWVRDAQHPSFSGALSLDAGQTLFTGGSVLPPSTLKFSVDGRDPTYF LFKGDLHAGEIGPVRVNGRWDGIRLRGNAWWPKQSLTVFQPLVPPDWKMNLRDGELYAQV AFSAAPEQGFRAGGHGVLKGGSAWMPDNQVNGVDFVLPFRFADGAWHLGTRGPVTLRIAE VINLVTAKNITADLQGRYPWTEEEPLLLTDVSVDVLGGNVLMKQLRMPQHDPALLRLNNL SSSELVSAVNPKQFAMSGAFSGALPLWLNNEKWIVKDGWLANSGPMTLRLDKDTADAVVK DNMTAGSAINWLRYMEISRSSTKINLDNLGLLTMQANITGTSRVDGKSGTVNLNYHHEEN IFTLWRSLRFGDNLQAWLEQNARLPGNDCPQGKECEEKQ >gi|296494496|gb|ADTN01000242.1| GENE 50 57350 - 58339 1206 329 aa, chain + ## HITS:1 COG:ldhA KEGG:ns NR:ns ## COG: ldhA COG1052 # Protein_GI_number: 16129341 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Escherichia coli K12 # 1 329 1 329 329 668 100.0 0 MKLAVYSTKQYDKKYLQQVNESFGFELEFFDFLLTEKTAKTANGCEAVCIFVNDDGSRPV LEELKKHGVKYIALRCAGFNNVDLDAAKELGLKVVRVPAYDPEAVAEHAIGMMMTLNRRI HRAYQRTRDANFSLEGLTGFTMYGKTAGVIGTGKIGVAMLRILKGFGMRLLAFDPYPSAA ALELGVEYVDLPTLFSESDVISLHCPLTPENYHLLNEAAFEQMKNGVMIVNTSRGALIDS QAAIEALKNQKIGSLGMDVYENERDLFFEDKSNDVIQDDVFRRLSACHNVLFTGHQAFLT AEALTSISQTTLQNLSNLEKGETCPNELV >gi|296494496|gb|ADTN01000242.1| GENE 51 58438 - 58872 201 144 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163801140|ref|ZP_02195040.1| 50S ribosomal protein 
L25 [Vibrio campbellii AND4] # 1 144 1 147 147 82 29 7e-15 MRTTMKKVAAFVALSLLMAGCVSNDKIAVTPEQLQHHRFVLESVNGKPVTSDKNPPEISF GEKMMISGSMCNRFSGEGKLSNGELTAKGLAMTRMMCANPQLNELDNTISEMLKEGAQVD LTANQLTLATAKQTLTYKLADLMN >gi|296494496|gb|ADTN01000242.1| GENE 52 58869 - 59135 205 88 aa, chain - ## HITS:1 COG:STM1649 KEGG:ns NR:ns ## COG: STM1649 COG3042 # Protein_GI_number: 16764993 # Func_class: R General function prediction only # Function: Putative hemolysin # Organism: Salmonella typhimurium LT2 # 38 88 1 51 51 63 92.0 1e-10 MRAAFWVGCAALLLSACSSEPVQQATAAHVAPGLKASMSSSGEANCAMIGGSLSVARQLD GTAIGMCALPNGKRCSEQSLAAGSCGSY >gi|296494496|gb|ADTN01000242.1| GENE 53 59409 - 62933 3386 1174 aa, chain + ## HITS:1 COG:ECs2000_1 KEGG:ns NR:ns ## COG: ECs2000_1 COG0674 # Protein_GI_number: 15831254 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Escherichia coli O157:H7 # 1 411 1 411 411 832 99.0 0 MITIDGNGAVASVAFRTSEVIAIYPITPSSTMAEQADAWAGNGLKNVWGDTPRVVEMQSE AGAIATVHGALQTGALSTSFTSSQGLLLMIPTLYKLAGELTPFVLHVAARTVATHALSIF GDHSDVMAVRQTGCAMLCAANVQEAQDFALISQIATLKSRVPFIHFFDGFRTSHEINKIV PLADDTILDLMPQVEIDAHRARALNPEHPVIRGTSANPDTYFQSREATNPWYNAVYDHVE QAMNDFSAATGRQYQPFEYYGHPQAERVIILMGSAIGTCEEVVDELLTRGEKVGVLKVRL YRPFSAKHLLQALPGSVRSVAVLDRTKEPGAQAEPLYLDVMTALAEAFNNGERETLPRVI GGRYGLSSKEFGPDCVLAVFAELNAAKPKARFTVGIYDDVTNLSLPLPENTLPNSAKLEA LFYGLGSDGSVSATKNNIKIIGNSTPWYAQGYFVYDSKKAGGLTVSHLRVSEQPIRSAYL ISQADFVGCHQLQFIDKYQMAERLKPGGIFLLNTPYSADEVWSRLPQEVQAVLNQKKARF YVINAAKIARECGLAARINTVMQMAFFHLTQILPGDSALAELQGAIAKSYSSKGQDLVER NWQALALARESVEEVPLQPVNPHSANRPPVVSDAAPDFVKTVTAAMLAGLGDALPVSALP PDGTWPMGTTRWEKRNIAEEIPILKEELCTQCNHCVAACPHSAIRAKVVPPEAMENAPAS LHSLDVKSRDMRGQKYVLQVAPEDCTGCNLCVEVCPAKDRQNPEIKAINMMSRLEHVEEE KINYDFFLNLPEIDRSKLERIDIRTSQLITPLFEYSGACSGCGETPYIKLLTQLYGDRML IANATGCSSIYGGNLPSTPYTTDANGRGPAWANSLFEDNAEFGLGFRLTVDQHRVRVLRL LDQFADKIPAELLTALKSDATPEVRREQVAALRQQLNDVAEAHELLRDADALVEKSIWLI 
GGDGWAYDIGFGGLDHVLSLTENVNILVLDTQCYSNTGGQASKATPLGAVTKFGEHGKRK ARKDLGVSMMMYGHVYVAQISLGAQLNQTVKAIQEAEAYPGPSLIIAYSPCEEHGYDLAL SHDQMRQLTATGFWPLYRFDPRRADEGKLPLALDSRPPSEALEETLLHEQRFRRLNSQQP EVAEQLWKDAAADLQKRYDFLAQMAGKAEKSNTD >gi|296494496|gb|ADTN01000242.1| GENE 54 63300 - 64433 1246 377 aa, chain + ## HITS:1 COG:ompN KEGG:ns NR:ns ## COG: ompN COG3203 # Protein_GI_number: 16129338 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein (porin) # Organism: Escherichia coli K12 # 1 377 1 377 377 620 100.0 1e-177 MKSKVLALLIPALLAAGAAHAAEVYNKDGNKLDLYGKVDGLHYFSDNSAKDGDQSYARLG FKGETQINDQLTGYGQWEYNIQANNTESSKNQSWTRLAFAGLKFADYGSFDYGRNYGVMY DIEGWTDMLPEFGGDSYTNADNFMTGRANGVATYRNTDFFGLVNGLNFAVQYQGNNEGAS NGQEGTNNGRDVRHENGDGWGLSTTYDLGMGFSAGAAYTSSDRTNDQVNHTAAGGDKADA WTAGLKYDANNIYLATMYSETRNMTPFGDSDYAVANKTQNFEVTAQYQFDFGLRPAVSFL MSKGRDLHAAGGADNPAGVDDKDLVKYADIGATYYFNKNMSTYVDYKINLLDEDDSFYAA NGISTDDIVALGLVYQF Prediction of potential genes in microbial genomes Time: Sun May 15 23:57:42 2011 Seq name: gi|296494495|gb|ADTN01000243.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont602.2, whole genome shotgun sequence Length of sequence - 51954 bp Number of predicted genes - 51, with homology - 49 Number of transcription units - 24, operones - 9 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 23 - 82 4.7 1 1 Tu 1 1/0.917 + CDS 154 - 588 392 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins + Term 596 - 627 2.4 + Prom 616 - 675 4.5 2 2 Tu 1 . + CDS 765 - 1700 929 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control + Term 1711 - 1748 6.9 - Term 1603 - 1646 -0.8 3 3 Tu 1 4/0.250 - CDS 1829 - 3127 1081 ## COG0513 Superfamily II DNA and RNA helicases - Term 3534 - 3578 2.2 4 4 Tu 1 . - CDS 3680 - 4663 720 ## COG0598 Mg2+ and Co2+ transporters - Prom 4752 - 4811 5.1 + Prom 4785 - 4844 2.3 5 5 Tu 1 . 
+ CDS 4918 - 6150 883 ## COG2199 FOG: GGDEF domain 6 6 Tu 1 3/0.583 - CDS 6171 - 6734 653 ## COG2840 Uncharacterized protein conserved in bacteria - Prom 6831 - 6890 5.2 7 7 Tu 1 . - CDS 7064 - 7972 595 ## COG0583 Transcriptional regulator - Prom 8041 - 8100 2.0 + Prom 8025 - 8084 2.3 8 8 Op 1 3/0.583 + CDS 8148 - 9458 1035 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 9 8 Op 2 4/0.250 + CDS 9458 - 10903 1004 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 10 8 Op 3 2/0.833 + CDS 10940 - 12466 1132 ## COG2978 Putative p-aminobenzoyl-glutamate transporter 11 8 Op 4 4/0.250 + CDS 12477 - 12992 344 ## COG0350 Methylated DNA-protein cysteine methyltransferase + Prom 13102 - 13161 6.2 12 9 Op 1 7/0.000 + CDS 13187 - 13939 770 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases + Term 13961 - 14021 10.1 + Prom 13994 - 14053 3.7 13 9 Op 2 . + CDS 14091 - 15041 1034 ## COG0589 Universal stress protein UspA and related nucleotide-binding proteins + Term 15063 - 15093 3.0 - Term 15051 - 15080 2.8 14 10 Tu 1 . - CDS 15091 - 15348 336 ## ECSP_1858 predicted inner membrane protein - Prom 15397 - 15456 1.7 + Prom 15403 - 15462 4.3 15 11 Tu 1 . + CDS 15592 - 16623 822 ## COG0668 Small-conductance mechanosensitive channel + Term 16825 - 16859 0.3 16 12 Tu 1 2/0.833 - CDS 16674 - 18287 1368 ## COG4166 ABC-type oligopeptide transport system, periplasmic component - Prom 18317 - 18376 6.2 - Term 18453 - 18483 1.1 17 13 Op 1 . - CDS 18624 - 19523 163 ## COG0583 Transcriptional regulator - Prom 19551 - 19610 2.3 18 13 Op 2 . - CDS 19618 - 19707 74 ## + Prom 19536 - 19595 2.4 19 14 Op 1 . + CDS 19664 - 20581 937 ## COG1073 Hydrolases of the alpha/beta superfamily 20 14 Op 2 . + CDS 20581 - 20646 102 ## 21 14 Op 3 1/0.917 + CDS 20652 - 20834 148 ## COG0702 Predicted nucleoside-diphosphate-sugar epimerases 22 14 Op 4 . 
+ CDS 20916 - 21644 416 ## COG2866 Predicted carboxypeptidase 23 15 Tu 1 . - CDS 21619 - 22584 874 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily - Prom 22647 - 22706 4.3 + Prom 22615 - 22674 2.3 24 16 Tu 1 . + CDS 22703 - 23209 673 ## COG2077 Peroxiredoxin + Term 23211 - 23264 11.2 - Term 23205 - 23243 10.2 25 17 Op 1 5/0.000 - CDS 23253 - 24794 1454 ## COG3283 Transcriptional regulator of aromatic amino acids metabolism - Prom 24823 - 24882 6.5 - Term 24847 - 24889 1.9 26 17 Op 2 9/0.000 - CDS 24942 - 26003 1259 ## COG3768 Predicted membrane protein 27 17 Op 3 . - CDS 26000 - 27397 1260 ## COG3106 Predicted ATPase - Prom 27417 - 27476 3.3 + Prom 27465 - 27524 3.8 28 18 Tu 1 . + CDS 27553 - 28551 811 ## COG1609 Transcriptional regulators 29 19 Op 1 . - CDS 28662 - 29567 1037 ## B21_01307 hypothetical protein 30 19 Op 2 3/0.583 - CDS 29612 - 30694 1219 ## COG3839 ABC-type sugar transport systems, ATPase components 31 19 Op 3 11/0.000 - CDS 30708 - 31367 482 ## COG0637 Predicted phosphatase/phosphohexomutase 32 19 Op 4 4/0.250 - CDS 31364 - 33562 1502 ## COG1554 Trehalose and maltose hydrolases (possible phosphorylases) 33 19 Op 5 16/0.000 - CDS 33628 - 34671 899 ## COG0673 Predicted dehydrogenases and related proteins 34 19 Op 6 3/0.583 - CDS 34693 - 35481 966 ## COG1082 Sugar phosphate isomerases/epimerases 35 19 Op 7 5/0.000 - CDS 35500 - 36552 1179 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 36 19 Op 8 38/0.000 - CDS 36583 - 37425 967 ## COG0395 ABC-type sugar transport system, permease component 37 19 Op 9 35/0.000 - CDS 37412 - 38293 892 ## COG1175 ABC-type sugar transport systems, permease components 38 19 Op 10 2/0.833 - CDS 38314 - 39606 1657 ## COG1653 ABC-type sugar transport system, periplasmic component 39 19 Op 11 1/0.917 - CDS 39620 - 41326 1059 ## COG0366 Glycosidases - Prom 41433 - 41492 3.9 - Term 41465 - 41498 3.6 40 20 Op 1 . 
- CDS 41512 - 41826 435 ## COG0607 Rhodanese-related sulfurtransferase 41 20 Op 2 . - CDS 41901 - 42122 237 ## ECDH10B_1424 peripheral inner membrane phage-shock protein 42 20 Op 3 . - CDS 42131 - 42490 539 ## COG1983 Putative stress-responsive transcriptional regulator 43 20 Op 4 . - CDS 42490 - 42714 324 ## ECDH10B_1422 phage shock protein B 44 20 Op 5 . - CDS 42768 - 43436 999 ## COG1842 Phage shock protein A (IM30), suppresses sigma54-dependent transcription - Prom 43568 - 43627 4.5 + Prom 43441 - 43500 3.7 45 21 Tu 1 . + CDS 43603 - 44580 883 ## COG1221 Transcriptional regulators containing an AAA-type ATPase domain and a DNA-binding domain + Term 44828 - 44856 -1.0 - Term 44458 - 44493 3.1 46 22 Op 1 4/0.250 - CDS 44700 - 45965 1318 ## COG0160 4-aminobutyrate aminotransferase and related aminotransferases 47 22 Op 2 6/0.000 - CDS 46003 - 47283 1562 ## COG0665 Glycine/D-amino acid oxidases (deaminating) 48 22 Op 3 4/0.250 - CDS 47285 - 48772 1747 ## COG1012 NAD-dependent aldehyde dehydrogenases 49 23 Op 1 1/0.917 - CDS 49047 - 49604 489 ## COG1396 Predicted transcriptional regulators 50 23 Op 2 . - CDS 49631 - 50383 347 ## COG2071 Predicted glutamine amidotransferases - Prom 50463 - 50522 4.6 + Prom 50460 - 50519 5.4 51 24 Tu 1 . 
+ CDS 50607 - 51954 1032 ## COG0174 Glutamine synthetase Predicted protein(s) >gi|296494495|gb|ADTN01000243.1| GENE 1 154 - 588 392 144 aa, chain + ## HITS:1 COG:ynaF KEGG:ns NR:ns ## COG: ynaF COG0589 # Protein_GI_number: 16129337 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Escherichia coli K12 # 1 144 25 168 168 259 100.0 1e-69 MNRTILVPIDISDSELTQRVISHVEEEAKIDDAEVHFLTVIPSLPYYASLGLAYSAELPA MDDLKAEAKSQLEEIIKKFKLPTDRVHVHVEEGSPKDRILELAKKIPAHMIIIASHRPDI TTYLLGSNAAAVVRHAECSVLVVR >gi|296494495|gb|ADTN01000243.1| GENE 2 765 - 1700 929 311 aa, chain + ## HITS:1 COG:ydaO KEGG:ns NR:ns ## COG: ydaO COG0037 # Protein_GI_number: 16129305 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Escherichia coli K12 # 1 311 1 311 311 640 98.0 0 MSQNQEISKKEQYNLNKLQKRLRRNVGEAIADFNMIEEGDRIMVCLSGGKDSYTMLEILR NLQQSAPINFSLVAVNLDQKQPGFPEHVLPEYLEKLGVEYKIVEENTYGIVKEKIPEGKT TCSLCSRLRRGILYRTATELGATKIALGHHRDDILQTLFLNMFYGGKMKGMPPKLMSDDG KHIVIRPLAYCREKDIQRFADAKAFPIIPCNLCGSQPNLQRQVIADMLRDWDKRYPGRIE TMFSAMQNVVPSHLCDTNLFDFKGITHGSEVVNGGDLAFDREEIPLQPACWQPEEDENQL DELRLNVVEVK >gi|296494495|gb|ADTN01000243.1| GENE 3 1829 - 3127 1081 432 aa, chain - ## HITS:1 COG:dbpA KEGG:ns NR:ns ## COG: dbpA COG0513 # Protein_GI_number: 16129304 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Escherichia coli K12 # 1 432 26 457 457 845 100.0 0 MTPVQAAALPAILAGKDVRVQAKTGSGKTAAFGLGLLQQIDASLFQTQALVLCPTRELAD QVAGELRRLARFLPNTKILTLCGGQPFGMQRDSLQHAPHIIVATPGRLLDHLQKGTVSLD ALNTLVMDEADRMLDMGFSDAIDDVIRFAPASRQTLLFSATWPEAIAAISGRVQRDPLAI EIDSTDALPPIEQQFYETSSKGKIPLLQRLLSLHQPSSCVVFCNTKKDCQAVCDALNEVG QSALSLHGDLEQRDRDQTLVRFANGSARVLVATDVAARGLDIKSLELVVNFELAWDPEVH 
VHRIGRTARAGNSGLAISFCAPEEAQRANIISDMLQIKLNWQTPPANSSIATLEAEMATL CIDGGKKAKMRPGDVLGALTGDIGLDGADIGKIAVHPAHVYVAVRQAVAHKAWKQLQGGK IKGKTCRVRLLK >gi|296494495|gb|ADTN01000243.1| GENE 4 3680 - 4663 720 327 aa, chain - ## HITS:1 COG:ECs1926 KEGG:ns NR:ns ## COG: ECs1926 COG0598 # Protein_GI_number: 15831180 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Escherichia coli O157:H7 # 1 327 1 327 327 667 100.0 0 MEAIKGSDVNVPDAVFAWMLDGRGGVKPLENTDVIDEAHPCWLHLNYVHHDSAQWLATTP LLPNNVRDALAGESTRPRVSRLGEGTLITLRCINGSTDERPDQLVAMRVYMDGRLIVSTR QRKVLALDDVVSDLEEGTGPTDCGGWLVDVCDALTDHSSEFIEQLHDKIIDLEDNLLDQQ IPPRGFLALLRKQLIVMRRYMAPQRDVYARLASERLPWMSDDQRRRMQDIADRLGRGLDE IDACIARTGVMADEIAQVMQENLARRTYTMSLMAMVFLPSTFLTGLFGVNLGGIPGGGWQ FGFSIFCILLVVLIGGVALWLHRSKWL >gi|296494495|gb|ADTN01000243.1| GENE 5 4918 - 6150 883 410 aa, chain + ## HITS:1 COG:ydaM_3 KEGG:ns NR:ns ## COG: ydaM_3 COG2199 # Protein_GI_number: 16129302 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 241 410 1 170 170 346 100.0 6e-95 MITHNFNTLDLLTSPVWIVSPFEEQLIYANSAAKLLMQDLTFSQLRTGPYSVSSQKELPK YLSDLQNQHDIIEILTVQRKEEETALSCRLVLRKLTETEPVIIFEGIEAPATLGLKASRS ANYQRKKQGFYARFFLTNSAPMLLIDPSRDGQIIDANLAALNFYGYNHETMCQKHTWEIN MLGRRVMPIMHEISHLPGGHKPLNFVHKLADGSTRHVQTYAGPIEIYGDKLMLCIVHDIT EQKRLEEQLEHAAHHDAMTGLLNRRQFYHITEPGQMQHLAIAQDYSLLLIDTDRFKHIND LYGHSKGDEVLCALARTLESCARKGDLVFRWGGEEFVLLLPRTPLDTALSLAETIRVSVA KVSISGLPRFTVSIGVAHHEGNESIDELFKRVDDALYRAKNDGRNRVLAA >gi|296494495|gb|ADTN01000243.1| GENE 6 6171 - 6734 653 187 aa, chain - ## HITS:1 COG:ydaL KEGG:ns NR:ns ## COG: ydaL COG2840 # Protein_GI_number: 16129301 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 187 1 187 187 378 100.0 1e-105 MNLDDKSLFLDAMEDVQPLKRATDVHWHPTRNQRAPQRIDTLQLDNFLTTGFLDIIPLSQ PLEFRREGLQHGVLDKLRSGKYPQQASLNLLRQPVEECRKMVFSFIQQALADGLRNVLII 
HGKGRDDKSHANIVRSYVARWLTEFDDVQAYCTALPHHGGSGACYVALRKTAQAKQENWE RHAKRSR >gi|296494495|gb|ADTN01000243.1| GENE 7 7064 - 7972 595 302 aa, chain - ## HITS:1 COG:ECs1923 KEGG:ns NR:ns ## COG: ECs1923 COG0583 # Protein_GI_number: 15831177 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 302 1 302 302 583 100.0 1e-166 MAFQVKIHQIRAFVEVARQGSIRGASRMLNMSQPALSKSIQELEEGLAAQLFFRRSKGVT LTDAGESFYQHASLILEELRAAQEDIRQRQGQLAGQINIGMGASISRSLMPAVISRFHQQ HPQVKVRIMEGQLVSMINELRQGELDFTINTYYQGPYDHEFTFEKLLEKQFAIFCRPGHP AIGARSIKQLLDYSWTMPTPHGSYYKQLSELLDDQAQTPQVGVVCETFSACISLVAKSDF LSILPEEMGCDPLHGQGLVMLPVSEILPKAAYYLIQRRDSRQTPLTASLITQFRRECGYL QS >gi|296494495|gb|ADTN01000243.1| GENE 8 8148 - 9458 1035 436 aa, chain + ## HITS:1 COG:ydaJ KEGG:ns NR:ns ## COG: ydaJ COG1473 # Protein_GI_number: 16129299 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Escherichia coli K12 # 1 436 6 441 441 778 100.0 0 MESLNQFVNSLAPKLSHWRRDFHHYAESGWVEFRTATLVAEELHQLGYSLALGREVVNES SRMGLPDEFTLQREFERARQQGALAQWIAAFEGGFTGIVATLDTGRPGPVMAFRVDMDAL DLSEEQDVSHRPYRDGFASCNAGMMHACGHDGHTAIGLGLAHTLKQFESGLHGVIKLIFQ PAEEGTRGARAMVDAGVVDDVDYFTAVHIGTGVPAGTVVCGSDNFMATTKFDAHFTGTAA HAGAKPEDGHNALLAAAQATLALHAIAPHSEGASRVNVGVMQAGSGRNVVPASALLKVET RGASDVINQYVFDRAQQAIQGAATMYGVGVETRLMGAATASSPSPQWVAWLQSQAAQVAG VNQAIERVEAPAGSEDATLMMARVQQHQGQASYVVFGTQLAAGHHNEKFDFDEQVLAIAV ETLARTALNFPWTRGI >gi|296494495|gb|ADTN01000243.1| GENE 9 9458 - 10903 1004 481 aa, chain + ## HITS:1 COG:abgB KEGG:ns NR:ns ## COG: abgB COG1473 # Protein_GI_number: 16129298 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Escherichia coli K12 # 1 481 1 481 481 962 100.0 0 MQEIYRFIDDAIEADRQRYTDIADQIWDHPETRFEEFWSAEHLASALESAGFTVTRNVGN IPNAFIASFGQGKPVIALLGEYDALAGLSQQAGCAQPTSVTPGENGHGCGHNLLGTAAFA AAIAVKKWLEQYGQGGTVRFYGCPGEEGGSGKTFMVREGVFDDVDAALTWHPEAFAGMFN 
TRTLANIQASWRFKGIAAHAANSPHLGRSALDAVTLMTTGTNFLNEHIIEKARVHYAITN SGGISPNVVQAQAEVLYLIRAPEMTDVQHIYDRVAKIAEGAALMTETTVECRFDKACSSY LPNRTLENAMYQALSHFGTPEWNSEELAFAKQIQATLTSNDRQNSLNNIAATGGENGKVF ALRHRETVLANEVAPYAATDNVLAASTDVGDVSWKLPVAQCFSPCFAVGTPLHTWQLVSQ GRTSIAHKGMLLAAKTMAATTVNLFLDSGLLQECQQEHQQVTDTQPYHCPIPKNVTPSPL K >gi|296494495|gb|ADTN01000243.1| GENE 10 10940 - 12466 1132 508 aa, chain + ## HITS:1 COG:abgT KEGG:ns NR:ns ## COG: abgT COG2978 # Protein_GI_number: 16129297 # Func_class: H Coenzyme transport and metabolism # Function: Putative p-aminobenzoyl-glutamate transporter # Organism: Escherichia coli K12 # 13 508 15 510 510 856 100.0 0 MSMSSIPSSSQSGKLYGWVERIGNKVPHPFLLFIYLIIVLMVTTAILSAFGVSAKNPTDG TPVVVKNLLSVEGLHWFLPNVIKNFSGFAPLGAILALVLGAGLAERVGLLPALMVKMASH VNARYASYMVLFIAFFSHISSDAALVIMPPMGALIFLAVGRHPVAGLLAAIAGVGCGFTA NLLIVTTDVLLSGISTEAAAAFNPQMHVSVIDNWYFMASSVVVLTIVGGLITDKIIEPRL GQWQGNSDEKLQTLTESQRFGLRIAGVVSLLFIAAIALMVIPQNGILRDPINHTVMPSPF IKGIVPLIILFFFVVSLAYGIATRTIRRQADLPHLMIEPMKEMAGFIVMVFPLAQFVAMF NWSNMGKFIAVGLTDILESSGLSGIPAFVGLALLSSFLCMFIASGSAIWSILAPIFVPMF MLLGFHPAFAQILFRIADSSVLPLAPVSPFVPLFLGFLQRYKPDAKLGTYYSLVLPYPLI FLVVWLLMLLAWYLVGLPIGPGIYPRLS >gi|296494495|gb|ADTN01000243.1| GENE 11 12477 - 12992 344 171 aa, chain + ## HITS:1 COG:ogt KEGG:ns NR:ns ## COG: ogt COG0350 # Protein_GI_number: 16129296 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Escherichia coli K12 # 1 171 1 171 171 346 98.0 9e-96 MLRLLKKKIATPLGPLWVICDEQFRLRAVEWEEYSERMVQLLDIHYRKEGYERISATNPG GLSDKLREYFAGNLSIIDTLPTATGGTPFQREVWKTLRTIPCGQVMHYGQLAEQLGRPGA ARAVGAANGSNPISIVVPCHRVIGRNGTMTGYAGGVQRKEWLLRHEGYLLL >gi|296494495|gb|ADTN01000243.1| GENE 12 13187 - 13939 770 250 aa, chain + ## HITS:1 COG:ECs1915 KEGG:ns NR:ns ## COG: ECs1915 COG0664 # Protein_GI_number: 15831169 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein 
kinases # Organism: Escherichia coli O157:H7 # 1 250 1 250 250 501 100.0 1e-142 MIPEKRIIRRIQSGGCAIHCQDCSISQLCIPFTLNEHELDQLDNIIERKKPIQKGQTLFK AGDELKSLYAIRSGTIKSYTITEQGDEQITGFHLAGDLVGFDAIGSGHHPSFAQALETSM VCEIPFETLDDLSGKMPNLRQQMMRLMSGEIKGDQDMILLLSKKNAEERLAAFIYNLSRR FAQRGFSPREFRLTMTRGDIGNYLGLTVETISRLLGRFQKSGMLAVKGKYITIENNDALA QLAGHTRNVA >gi|296494495|gb|ADTN01000243.1| GENE 13 14091 - 15041 1034 316 aa, chain + ## HITS:1 COG:ECs1914 KEGG:ns NR:ns ## COG: ECs1914 COG0589 # Protein_GI_number: 15831168 # Func_class: T Signal transduction mechanisms # Function: Universal stress protein UspA and related nucleotide-binding proteins # Organism: Escherichia coli O157:H7 # 1 316 1 316 316 634 100.0 0 MAMYQNMLVVIDPNQDDQPALRRAVYLHQRIGGKIKAFLPIYDFSYEMTTLLSPDERTAM RQGVISQRTAWIHEQAKYYLNAGVPIEIKVVWHNRPFEAIIQEVISGGHDLVLKMAHQHD RLEAVIFTPTDWHLLRKCPSPVWMVKDQPWPEGGKALVAVNLASEEPYHNALNEKLVKET IELAEQVNHTEVHLVGAYPVTPINIAIELPEFDPSVYNDAIRGQHLLAMKALRQKFGINE NMTHVEKGLPEEVIPDLAEHLQAGIVVLGTVGRTGISAAFLGNTAEQVIDHLRCDLLVIK PDQYQTPVELDDEEDD >gi|296494495|gb|ADTN01000243.1| GENE 14 15091 - 15348 336 85 aa, chain - ## HITS:1 COG:no KEGG:ECSP_1858 NR:ns ## KEGG: ECSP_1858 # Name: ynaJ # Def: predicted inner membrane protein # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 85 1 85 85 132 100.0 4e-30 MIMAKLKSAKGKKFLFGLLAVFIIAASVVTRATIGGVIEQYNIPLSEWTTSMYVIQSSMI FVYSLVFTVLLAIPLGIYFLGGEEQ >gi|296494495|gb|ADTN01000243.1| GENE 15 15592 - 16623 822 343 aa, chain + ## HITS:1 COG:ECs1912 KEGG:ns NR:ns ## COG: ECs1912 COG0668 # Protein_GI_number: 15831166 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Escherichia coli O157:H7 # 1 343 1 343 343 672 100.0 0 MIAELFTNNALNLVIIFGSCAALILMSFWFRRGNRKRKGFLFHAVQFLIYTIIISAVGSI INYVIENYKLKFITPGVIDFICTSLIAVILTIKLFLLINQFEKQQIKKGRDITSARIMSR IIKITIIVVLVLLYGEHFGMSLSGLLTFGGIGGLAVGMAGKDILSNFFSGIMLYFDRPFS IGDWIRSPDRNIEGTVAEIGWRITKITTFDNRPLYVPNSLFSSISVENPGRMTNRRITTT 
IGLRYEDAAKVGVIVEAVREMLKNHPAIDQRQTLLVYFNQFADSSLNIMVYCFTKTTVWA EWLAAQQDVYLKIIDIVQSHGADFAFPSQTLYMDNITPPEQGR >gi|296494495|gb|ADTN01000243.1| GENE 16 16674 - 18287 1368 537 aa, chain - ## HITS:1 COG:mppA KEGG:ns NR:ns ## COG: mppA COG4166 # Protein_GI_number: 16129290 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Escherichia coli K12 # 1 537 8 544 544 1065 100.0 0 MKHSVSVTCCALLVSSISLSYAAEVPSGTVLAEKQELVRHIKDEPASLDPAKAVGLPEIQ VIRDLFEGLVNQNEKGEIVPGVATQWKSNDNRIWTFTLRDNAKWADGTPVTAQDFVYSWQ RLVDPKTLSPFAWFAALAGINNAQAIIDGKATPDQLGVTAVDAHTLKIQLDKPLPWFVNL TANFAFFPVQKANVESGKEWTKPGNLIGNGAYVLKERVVNEKLVVVPNTHYWDNAKTVLQ KVTFLPINQESAATKRYLAGDIDITESFPKNMYQKLLKDIPGQVYTPPQLGTYYYAFNTQ KGPTADQRVRLALSMTIDRRLMTEKVLGTGEKPAWHFTPDVTAGFTPEPSPFEQMSQEEL NAQAKTLLSAAGYGPQKPLKLTLLYNTSENHQKIAIAVASMWKKNLGVDVKLQNQEWKTY IDSRNTGNFDVIRASWVGDYNEPSTFLTLLTSTHSGNISRFNNPAYDKVLAQASTENTVK ARNADYNAAEKILMEQAPIAPIYQYTNGRLIKPWLKGYPINNPEDVAYSRTMYIVKH >gi|296494495|gb|ADTN01000243.1| GENE 17 18624 - 19523 163 299 aa, chain - ## HITS:1 COG:ycjZ KEGG:ns NR:ns ## COG: ycjZ COG0583 # Protein_GI_number: 16129289 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 299 1 299 299 569 100.0 1e-162 MKREEIADLMAFVVVAEERSFTRAAARLSMAQSALSQIVRRIEERLGLRLLTRTTRSVVP TEAGEHLLSVLGPMLHDIDSAMASLSDLQNRPSGTIRITTVEHAAKTILLPAMRTFLKSH PEIDIQLTIDYGLTDVVSERFDAGVRLGGEMDKDMIAIRIGPDIPMAIVGSPDYFSRRSV PTSVSQLIDHQAINLYLPTSGTANRWRLIRGGREVRVRMEGQLLLNTIDLIIDAAIDGHG LAYLPYDQVERAIKEKKLIRVLDKFTPDLPGYHLYYPHRRHAGSAFSLFIDRLKYKGAV >gi|296494495|gb|ADTN01000243.1| GENE 18 19618 - 19707 74 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVGLLLLVKLTLLFIIPFLMELSVRFNGW >gi|296494495|gb|ADTN01000243.1| GENE 19 19664 - 20581 937 305 aa, chain + ## HITS:1 COG:ycjY KEGG:ns NR:ns ## COG: ycjY COG1073 # Protein_GI_number: 16129288 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # 
Organism: Escherichia coli K12 # 1 305 6 310 310 585 100.0 1e-167 MNNKVSFTNSNNPTISLSAVIYFPPKFDETRQYQAIVLSHPGGGVKEQTAGTYAKKLAEK GFVTIAYDASYQGESGGEPRQLENPYIRTEDISAVIDYLTTLSYVDNTRIGAMGICAGAG YTANAAIQDRRIKAIGTVSAVNIGSIFRNGWENNVKSIDALPYVEAGSNARTSDISSGEY AIMPLAPMKESDAPNEELRQAWEYYHTPRAQYPTAPGYATLRSLNQIITYDAYHMAEVYL TQPTQIVAGSQAGSKWMSDDLYDRASSQDKRYHIVEGANHMDLYDGKAYVAEAISVLAPF FEETL >gi|296494495|gb|ADTN01000243.1| GENE 20 20581 - 20646 102 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKHIQIRNSDMDWHIAANNLG >gi|296494495|gb|ADTN01000243.1| GENE 21 20652 - 20834 148 60 aa, chain + ## HITS:1 COG:ECs1906 KEGG:ns NR:ns ## COG: ECs1906 COG0702 # Protein_GI_number: 15831160 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate-sugar epimerases # Organism: Escherichia coli O157:H7 # 1 46 9 54 220 94 100.0 3e-20 MKNVLILGAGGQIARHVINQLADKQTIKQTLFARQPAKIHKPYPTNKMQTTSGKKVIQDR >gi|296494495|gb|ADTN01000243.1| GENE 22 20916 - 21644 416 242 aa, chain + ## HITS:1 COG:ECs1905 KEGG:ns NR:ns ## COG: ECs1905 COG2866 # Protein_GI_number: 15831159 # Func_class: E Amino acid transport and metabolism # Function: Predicted carboxypeptidase # Organism: Escherichia coli O157:H7 # 1 242 21 262 262 489 100.0 1e-138 MTVTRPRAERGAFPPGTEHYGRSLLGAPLIWFPAPAASRESGLILAGTHGDENSSVVTLS CALRTLTPSLRRHHVVLCVNPDGCQLGLRANANGVDLNRNFPAANWKEGETVYRWNSAAE ERDVVLLTGDKPGSEPETQALCQLIHRIQPAWVVSFHDPLACIEDPRHSELGEWLAQAFE LPLVTSVGYETPGSFGSWCADLNLHCITAEFPPISSDEASEKYLFAMANLLRWHPKDAIR PS >gi|296494495|gb|ADTN01000243.1| GENE 23 21619 - 22584 874 321 aa, chain - ## HITS:1 COG:ycjG KEGG:ns NR:ns ## COG: ycjG COG4948 # Protein_GI_number: 16129286 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Escherichia coli K12 # 1 321 15 335 335 594 100.0 1e-170 MRTVKVFEEAWPLHTPFVIARGSRSEARVVVVELEEEGIKGTGECTPYPRYGESDASVMA 
QIMSVVPQLEKGLTREELQKILPAGAARNALDCALWDLAARRQQQSLADLIGITLPETVI TAQTVVIGTPDQMANSASTLWQAGAKLLKVKLDNHLISERMVAIRTAVPDATLIVDANES WRAEGLAARCQLLADLGVAMLEQPLPAQDDAALENFIHPLPICADESCHTRSNLKALKGR YEMVNIKLDKTGGLTEALALATEARAQGFSLMLGCMLCTSRAISAALPLVPQVSFADLDG PTWLAVDVEPALQFTTGELHL >gi|296494495|gb|ADTN01000243.1| GENE 24 22703 - 23209 673 168 aa, chain + ## HITS:1 COG:ECs1903 KEGG:ns NR:ns ## COG: ECs1903 COG2077 # Protein_GI_number: 15831157 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Escherichia coli O157:H7 # 1 168 1 168 168 310 100.0 9e-85 MSQTVHFQGNPVTVANSIPQAGSKAQTFTLVAKDLSDVTLGQFAGKRKVLNIFPSIDTGV CAASVRKFNQLATEIDNTVVLCISADLPFAQSRFCGAEGLNNVITLSTFRNAEFLQAYGV AIADGPLKGLAARAVVVIDENDNVIFSQLVDEITTEPDYEAALAVLKA >gi|296494495|gb|ADTN01000243.1| GENE 25 23253 - 24794 1454 513 aa, chain - ## HITS:1 COG:tyrR KEGG:ns NR:ns ## COG: tyrR COG3283 # Protein_GI_number: 16129284 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulator of aromatic amino acids metabolism # Organism: Escherichia coli K12 # 1 513 1 513 513 1014 100.0 0 MRLEVFCEDRLGLTRELLDLLVLRGIDLRGIEIDPIGRIYLNFAELEFESFSSLMAEIRR IAGVTDVRTVPWMPSEREHLALSALLEALPEPVLSVDMKSKVDMANPASCQLFGQKLDRL RNHTAAQLINGFNFLRWLESEPQDSHNEHVVINGQNFLMEITPVYLQDENDQHVLTGAVV MLRSTIRMGRQLQNVAAQDVSAFSQIVAVSPKMKHVVEQAQKLAMLSAPLLITGDTGTGK DLFAYACHQASPRAGKPYLALNCASIPEDAVESELFGHAPEGKKGFFEQANGGSVLLDEI GEMSPRMQAKLLRFLNDGTFRRVGEDHEVHVDVRVICATQKNLVELVQKGMFREDLYYRL NVLTLNLPPLRDCPQDIMPLTELFVARFADEQGVPRPKLAADLNTVLTRYAWPGNVRQLK NAIYRALTQLDGYELRPQDILLPDYDAATVAVGEDAMEGSLDEITSRFERSVLTQLYRNY PSTRKLAKRLGVSHTAIANKLREYGLSQKKNEE >gi|296494495|gb|ADTN01000243.1| GENE 26 24942 - 26003 1259 353 aa, chain - ## HITS:1 COG:ycjF KEGG:ns NR:ns ## COG: ycjF COG3768 # Protein_GI_number: 16129283 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 353 1 353 353 660 100.0 0 
MTEPLKPRIDFDGPLEVDQNPKFRAQQTFDENQAQNFAPATLDEAQEEEGQVEAVMDAAL RPKRSLWRKMVMGGLALFGASVVGQGVQWTMNAWQTQDWVALGGCAAGALIIGAGVGSVV TEWRRLWRLRQRAHERDEARDLLHSHGTGKGRAFCEKLAQQAGIDQSHPALQRWYASIHE TQNDREVVSLYAHLVQPVLDAQARREISRSAAESTLMIAVSPLALVDMAFIAWRNLRLIN RIATLYGIELGYYSRLRLFKLVLLNIAFAGASELVREVGMDWMSQDLAARLSTRAAQGIG AGLLTARLGIKAMELCRPLPWIDDDKPRLGDFRRQLIGQVKETLQKGKTPSEK >gi|296494495|gb|ADTN01000243.1| GENE 27 26000 - 27397 1260 465 aa, chain - ## HITS:1 COG:ycjX KEGG:ns NR:ns ## COG: ycjX COG3106 # Protein_GI_number: 16129282 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Escherichia coli K12 # 1 465 1 465 465 962 100.0 0 MKRLKNELNALVNRGVDRHLRLAVTGLSRSGKTAFITAMVNQLLNIHAGARLPLLSAVRE ERLLGVKRIPQRDFGIPRFTYDEGLAQLYGDPPAWPTPTRGVSEIRLALRFKSNDSLLRH FKDTSTLYLEIVDYPGEWLLDLPMLAQDYLSWSRQMTGLLNGQRGEWSAKWRMMSEGLDP LAPADENRLADIAAAWTDYLHHCKEQGLHFIQPGRFVLPGDMAGAPALQFFPWPDVDTWG ESKLAQADKHTNAGMLRERFNYYCEKVVKGFYKNHFLRFDRQIVLVDCLQPLNSGPQAFN DMRLALTQLMQSFHYGQRTLFRRLFSPVIDKLLFAATKADHVTIDQHANMVSLLQQLIQD AWQNAAFEGISMDCLGLASVQATTSGIIDVNGEKIPALRGNRLSDGAPLTVYPGEVPARL PGQAFWDKQGFQFEAFRPQVMDVDKPLPHIRLDAALEFLIGDKLR >gi|296494495|gb|ADTN01000243.1| GENE 28 27553 - 28551 811 332 aa, chain + ## HITS:1 COG:ycjW KEGG:ns NR:ns ## COG: ycjW COG1609 # Protein_GI_number: 16129281 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 332 1 332 332 638 100.0 0 MSPTIYDIARVAGVSKSTVSRVLNKQTNISPEAREKVLRAIEELQYQPNKLARALTSSGF DAIMVISTRSTKTTAGNPFFSEVLHAITAKAEEEGFDVILQTSHNPAEDLQKCESKIKQK MIKGIIMLSSPADESFFAQLDKYDIPVVVIGKVEGQYAHVYSVDTDNFGDSIALTDALIE SGHQNIACLHAPLDVHVSVDRVNGYKQSLAAHNIAVRDEWIVDGGYTHETALKAARQLLS QSPLPEAVFATDSLKLMSIYRAAAEKNIAIPQQLAVVGYSNETLSFILTPAPGGIDVPTQ ELGQQSCELLFRLISGKPSPQNITVATHMTLK >gi|296494495|gb|ADTN01000243.1| GENE 29 28662 - 29567 1037 301 aa, chain - ## HITS:1 COG:no KEGG:B21_01307 NR:ns ## KEGG: B21_01307 # Name: ompG # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 301 1 301 
301 596 100.0 1e-169 MKKLLPCTALVMCAGMACAQAEERNDWHFNIGAMYEIENVEGYGEDMDGLAEPSVYFNAA NGPWRIALAYYQEGPVDYSAGKRGTWFDRPELEVHYQFLENDDFSFGLTGGFRNYGYHYV DEPGKDTANMQRWKIAPDWDVKLTDDLRFNGWLSMYKFANDLNTTGYADTRVETETGLQY TFNETVALRVNYYLERGFNMDDSRNNGEFSTQEIRAYLPLTLGNHSVTPYTRIGLDRWSN WDWQDDIEREGHDFNRVGLFYGYDFQNGLSVSLEYAFEWQDHDEGDSDKFHYAGVGVNYS F >gi|296494495|gb|ADTN01000243.1| GENE 30 29612 - 30694 1219 360 aa, chain - ## HITS:1 COG:ECs1897 KEGG:ns NR:ns ## COG: ECs1897 COG3839 # Protein_GI_number: 15831151 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Escherichia coli O157:H7 # 1 360 1 360 360 710 98.0 0 MAQLSLQHIQKIYDNQVHVVKDFNLEIADKEFIVFVGPSGCGKSTTLRMIAGLEEISGGD LLIDGKRMNDVPAKARNIAMVFQNYALYPHMTVYDNMAFGLKMQKIAKEVIDERVNWAAQ ILGLREYLKRKPEALSGGQRQRVALGRAIVREAGVFLMDEPLSNLDAKLRVQMRAEISKL HQKLNTTMIYVTHDQTEAMTMATRIVIMKDGIVQQVGAPKTVYNQPANMFVSGFIGSPAM NFIRGTIDGDKFVTETLKLTIPEEKLAVLKTQESLHKPIVMGIRPEDIHPDAQEENNISA KISVAELTGAEFMLYTTVGGHELVVRAGALNDYHAGENITIHFDMTKCHFFDAETEIAIR >gi|296494495|gb|ADTN01000243.1| GENE 31 30708 - 31367 482 219 aa, chain - ## HITS:1 COG:ycjU KEGG:ns NR:ns ## COG: ycjU COG0637 # Protein_GI_number: 16129278 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Escherichia coli K12 # 1 219 1 219 219 425 100.0 1e-119 MKLQGVIFDLDGVITDTAHLHFQAWQQIAAEIGISIDAQFNESLKGISRDESLRRILQHG GKEGDFNSQERAQLAYRKNLLYVHSLRELTVNAVLPGIRSLLADLRAQQISVGLASVSLN APTILAALELREFFTFCADASQLKNSKPDPEIFLAACAGLGVPPQACIGIEDAQAGIDAI NASGMRSVGIGAGLTGAQLLLPSTESLTWPRLSAFWQNV >gi|296494495|gb|ADTN01000243.1| GENE 32 31364 - 33562 1502 732 aa, chain - ## HITS:1 COG:ycjT KEGG:ns NR:ns ## COG: ycjT COG1554 # Protein_GI_number: 16129277 # Func_class: G Carbohydrate transport and metabolism # Function: Trehalose and maltose hydrolases (possible phosphorylases) # Organism: Escherichia coli K12 # 1 732 24 755 755 1489 99.0 0 
MAQGNGYLGLRASHEEDYTRQTRGMYLAGLYHRAGKGEINELVNLPDVVGMEIAINGEVF SLSHEAWQRELDFASGELRRNVVWRTSNGSGYTIASRRFVSADQLPLIALEITITPLDAD ASVLISTGIDATQTNHGRQHLDETQVRVFGQHLMQGSYTTQDGRSDVAISCCCKVSGDVQ QCYTAKERRLLQHTSAQLHAGETMTLQKLVWIDWRDDRQAALDEWGSASLRQLEMCAQQS YDQLLAASTENWRQWWQKRRITVNGGEAHDQQALDYALYHLRIMTPAHDERSSIAAKGLT GEGYKGHVFWDTEVFLLPFHLFSDPTVARSLLRYRWHNLPGAQEKARRNGWQGALFPWES ARSGEEETPELAAINIRTGLRQKVASAQAEHHLVADIAWAVIQYWQTTGDESFIAHEGMA LLLETAKFWISRAVRVNDRLEIHDVIGPDEYTEHVNNNAYTSYMARYNVQQALNIARQFG CSDDAFIHRAEMFLKELWMPEIQPDGVLPQDDSFMAKPAINLAKYKAAAGKQTILLDYSR AEVNEMQILKQADVVMLNYMLPEQFSAASCLANLQFYEPRTIHDSSLSKAIHGIVAARCG LLTQSYQFWREGTEIDLGADPHSCDDGIHAAATGAIWLGAIQGFAGVSVRDGELHLNPAL PEQWQQLSFPLFWQGCELQVTLDAQRIAIRTSAPVSLRLNGQLITVAEESVFCLGDFILP FNGTATKHQEDE >gi|296494495|gb|ADTN01000243.1| GENE 33 33628 - 34671 899 347 aa, chain - ## HITS:1 COG:ycjS KEGG:ns NR:ns ## COG: ycjS COG0673 # Protein_GI_number: 16129276 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Escherichia coli K12 # 1 347 5 351 351 728 99.0 0 MTSSPLRVAIIGAGQVADKVHASYYCTRNDLELVAVCDSRLSQAQALAEKYGNASVWDDP QAMLLAVKPDVVSVCSPNRFHYEHTLMALEAGCHVMCEKPPAMTPEQAREMCDTARKLGK VLAYDFHHRFALDTQQLREQVTNGVLGEIYVTTARALRRCGVPGWGVFTNKELQGGGPLI DIGIHMLDAAMYVLGFPAVKSVNAHSFQKIRTQKSCGQFGEWDPATYSVEDSLFGTIEFH NGGILWLETSFALNIREQSIMNVSFCGDKAGATLFPAHIYTDNNGELMTLMQREMADDNR HLRSMEAFINHVQGKPVMIADAEQGYIIQQLVAALYQSAETGTRVEL >gi|296494495|gb|ADTN01000243.1| GENE 34 34693 - 35481 966 262 aa, chain - ## HITS:1 COG:ycjR KEGG:ns NR:ns ## COG: ycjR COG1082 # Protein_GI_number: 16129275 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Escherichia coli K12 # 1 262 4 265 265 546 100.0 1e-155 MKIGTQNQAFFPENILEKFRYIKEMGFDGFEIDGKLLVNNIEEVKAAIKETGLPVTTACG GYDGWIGDFIEERRLNGLKQIERILEALAEVGGKGIVVPAAWGMFTFRLPPMTSPRSLDG DRKMVSDSLRVLEQVAARTGTVVYLEPLNRYQDHMINTLADARRYIVENDLKHVQIIGDF 
YHMNIEEDNLAQALHDNRDLLGHVHIADNHRYQPGSGTLDFHALFEQLRADNYQGYVVYE GRIRAEDPAQAYRDSLAWLRTC >gi|296494495|gb|ADTN01000243.1| GENE 35 35500 - 36552 1179 350 aa, chain - ## HITS:1 COG:ycjQ KEGG:ns NR:ns ## COG: ycjQ COG1063 # Protein_GI_number: 16129274 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 350 1 350 350 700 100.0 0 MKKLVATAPRVAALVEYEDRAILANEVKIRVRFGAPKHGTEVVDFRAASPFIDEDFNGEW QMFTPRPADAPRGIEFGKFQLGNMVVGDIIECGSDVTDYAVGDSVCGYGPLSETVIINAV NNYKLRKMPQGSSWKNAVCYDPAQFAMSGVRDANVRVGDFVVVVGLGAIGQIAIQLAKRA GASVVIGVDPIAHRCDIARRHGADFCLNPIGTDVGKEIKTLTGKQGADVIIETSGYADAL QSALRGLAYGGTISYVAFAKPFAEGFNLGREAHFNNAKIVFSRACSEPNPDYPRWSRKRI EETCWELLMNGYLNCEDLIDPVVTFANSPESYMQYVDQHPEQSIKMGVTF >gi|296494495|gb|ADTN01000243.1| GENE 36 36583 - 37425 967 280 aa, chain - ## HITS:1 COG:ycjP KEGG:ns NR:ns ## COG: ycjP COG0395 # Protein_GI_number: 16129273 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Escherichia coli K12 # 1 280 1 280 280 462 100.0 1e-130 MATNKRTLSRIGFYCGLALFLIITLFPFFVMLMTSFKGAKEAISLHPTLLPQQWTLEHYV DIFNPMIFPFVDYFRNSLVVSVVSSVVAVFLGILGAYALSRLRFKGRMTINASFYTVYMF SGILLVVPLFKIITALGIYDTEMALIITMVTQTLPTAVFMLKSYFDTIPDEIEEAAMMDG LNRLQIIFRITVPLAMSGLISVFVYCFMVAWNDYLFASIFLSSASNFTLPVGLNALFSTP DYIWGRMMAASLVTALPVVIMYALSERFIKSGLTAGGVKG >gi|296494495|gb|ADTN01000243.1| GENE 37 37412 - 38293 892 293 aa, chain - ## HITS:1 COG:ECs1890 KEGG:ns NR:ns ## COG: ECs1890 COG1175 # Protein_GI_number: 15831144 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Escherichia coli O157:H7 # 1 293 1 293 293 487 99.0 1e-138 MNRLFSGRSDMPFALLLLAPSLLLLGGLVAWPMVSNIEISFLRLPLNPNIESTFVGVSNY VRILSDPGFWHSLWMTVWYTALVVAGSTVLGLAVAMFFNREFRLRKTARSLVILSYVTPS ISLVFAWKYMFNNGYGIVNYLGVDLLHLYEQAPLWFDNPGSSFVLVVLFAIWRYFPYAFI 
SFLAILQTIDKSLYEAAEMDGANAWQRFRIVTLPAIMPVLATVVTLRTIWMFYMFADVYL LTTKVDILGVYLYKTAFAFNDLGKAAAISVVLFIIIFAVILLTRKRVNLNGNK >gi|296494495|gb|ADTN01000243.1| GENE 38 38314 - 39606 1657 430 aa, chain - ## HITS:1 COG:ycjN KEGG:ns NR:ns ## COG: ycjN COG1653 # Protein_GI_number: 16129271 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli K12 # 1 430 1 430 430 817 99.0 0 MIKSKIVLLSALVSCALISGCKEENKTNVSIEFMHSSVEQERQAVISKLIARFEKENPGI TVKQVPVEEDAYNTKVITLSRSGSLPEVIETSHDYAKVMDKEQLIDRKAVATVISNVGEG AFYDGVLRIVRTEDGSAWIGVPVSAWIGGIWYRKDVLAKAGLEEPKNWQQLLDVAQKLND PANKKYGIALPTAESVLTEQSFSQFALSNQANVFNAEGKITLDTPEMMQALTYYRDLAAN TMPGSNDIMEVKDAFMNGTAPMAIYSTYILPAVIKEGDPKNVGFVVPTEKNSAVYGMLTS LTITAGQKTEETEAAEKFVTFMEQADNIADWVMMSPGAALPVNKAVVTTATWKDNDVIKA LGELPNQLIGELPNIQVFGAVGDKNFTRMGDVTGSGVVSSMVHNVTVGKADLSTTLQASQ KKLDELIEQH >gi|296494495|gb|ADTN01000243.1| GENE 39 39620 - 41326 1059 568 aa, chain - ## HITS:1 COG:ycjM KEGG:ns NR:ns ## COG: ycjM COG0366 # Protein_GI_number: 16129270 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Escherichia coli K12 # 1 568 1 568 568 1184 99.0 0 MGPLPRKGNMKQKITDYLDEIYGGTFTATHLQKLVTRLESAKRLITQRRKKHWDESDVVL ITYADQFHSNDLKPLPTFNQFYHQWLQSIFSHVHLLPFYPWSSDDGFSVIDYHQVASEAG EWQDIQQLGECSHLMFDFVCNHMSAKSEWFKNYLQQHPGFEDFFIAVDPQTDLSAVTRPR ALPLLTPFQMRDHSTRHLWTTFSDDQIDLNYRSPEVLLAMVDVLLCYLAKGAEYVRLDAV GFMWKEPGTSCIHLEKTHLIIKLLRSIIDNVAPGTVIITEINVPHKDNIAYFGAGDDEAH MVYQFSLPPLVLHAVQKQNVEALCAWAQNLTLPSSNTTWFNFLASHDGIGLNPLRGLLPE SEILELVEALQQEGALVNWKNNPDGTRSPYEINVTYMDALSRRESSDEERCARFILAHAI LLSFPGVPAIYIQSILGSRNDYAGVEKLGYNRAINRKKYHSKEITRELNDEATLKHAVYH ELSRLITLRRSHNEFHPDNNFTIDTINSSVMRIQRSNADGNCLTGLFNVSKNIQHVNITN LHGRDLISEVDILGNEITLRPWQVMWIK >gi|296494495|gb|ADTN01000243.1| GENE 40 41512 - 41826 435 104 aa, chain - ## HITS:1 COG:pspE KEGG:ns NR:ns ## COG: pspE COG0607 # Protein_GI_number: 16129269 # Func_class: P Inorganic ion transport and metabolism # 
Function: Rhodanese-related sulfurtransferase # Organism: Escherichia coli K12 # 1 104 1 104 104 206 100.0 1e-53 MFKKGLLALALVFSLPVFAAEHWIDVRVPEQYQQEHVQGAINIPLKEVKERIATAVPDKN DTVKVYCNAGRQSGQAKEILSEMGYTHVENAGGLKDIAMPKVKG >gi|296494495|gb|ADTN01000243.1| GENE 41 41901 - 42122 237 73 aa, chain - ## HITS:1 COG:no KEGG:ECDH10B_1424 NR:ns ## KEGG: ECDH10B_1424 # Name: pspD # Def: peripheral inner membrane phage-shock protein # Organism: E.coli_DH10B # Pathway: not_defined # 1 73 1 73 73 112 100.0 6e-24 MNTRWQQAGQKVKPGFKLAGKLVLLTALRYGPAGVAGWAIKSVARRPLKMLLAVALEPLL SRAANKLAQRYKR >gi|296494495|gb|ADTN01000243.1| GENE 42 42131 - 42490 539 119 aa, chain - ## HITS:1 COG:ECs1883 KEGG:ns NR:ns ## COG: ECs1883 COG1983 # Protein_GI_number: 15831137 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Putative stress-responsive transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 119 1 119 119 228 100.0 3e-60 MAGINLNKKLWRIPQQGMVRGVCAGIANYFDVPVKLVRILVVLSIFFGLALFTLVAYIIL SFALDPMPDNMAFGEQLPSSSELLDEVDRELAASETRLREMERYVTSDTFTLRSRFRQL >gi|296494495|gb|ADTN01000243.1| GENE 43 42490 - 42714 324 74 aa, chain - ## HITS:1 COG:no KEGG:ECDH10B_1422 NR:ns ## KEGG: ECDH10B_1422 # Name: pspB # Def: phage shock protein B # Organism: E.coli_DH10B # Pathway: not_defined # 1 74 1 74 74 127 100.0 9e-29 MSALFLAIPLTIFVLFVLPIWLWLHYSNRSGRSELSQSEQQRLAQLADEAKRMRERIQAL ESILDAEHPNWRDR >gi|296494495|gb|ADTN01000243.1| GENE 44 42768 - 43436 999 222 aa, chain - ## HITS:1 COG:ECs1881 KEGG:ns NR:ns ## COG: ECs1881 COG1842 # Protein_GI_number: 15831135 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Phage shock protein A (IM30), suppresses sigma54-dependent transcription # Organism: Escherichia coli O157:H7 # 1 222 1 222 222 280 100.0 2e-75 MGIFSRFADIVNANINALLEKAEDPQKLVRLMIQEMEDTLVEVRSTSARALAEKKQLTRR IEQASAREVEWQEKAELALLKEREDLARAALIEKQKLTDLIKSLEHEVTLVDDTLARMKK EIGELENKLSETRARQQALMLRHQAANSSRDVRRQLDSGKLDEAMARFESFERRIDQMEA 
EAESHSFGKQKSLDDQFAELKADDAISEQLAQLKAKMKQDNQ >gi|296494495|gb|ADTN01000243.1| GENE 45 43603 - 44580 883 325 aa, chain + ## HITS:1 COG:ECs1880 KEGG:ns NR:ns ## COG: ECs1880 COG1221 # Protein_GI_number: 15831134 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulators containing an AAA-type ATPase domain and a DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 325 6 330 330 641 100.0 0 MAEYKDNLLGEANSFLEVLEQVSHLAPLDKPVLIIGERGTGKELIASRLHYLSSRWQGPF ISLNCAALNENLLDSELFGHEAGAFTGAQKRHPGRFERADGGTLFLDELATAPMMVQEKL LRVIEYGELERVGGSQPLQVNVRLVCATNADLPAMVNEGTFRADLLDRLAFDVVQLPPLR ERESDIMLMAEHFAIQMCREIKLPLFPGFTERARETLLNYRWPGNIRELKNVVERSVYRH GTSDYPLDDIIIDPFKRRPPEDAIAVSETTSLPTLPLDLREFQMQQEKELLQLSLQQGKY NQKRAAELLGLTYHQFRALLKKHQI >gi|296494495|gb|ADTN01000243.1| GENE 46 44700 - 45965 1318 421 aa, chain - ## HITS:1 COG:goaG KEGG:ns NR:ns ## COG: goaG COG0160 # Protein_GI_number: 16129263 # Func_class: E Amino acid transport and metabolism # Function: 4-aminobutyrate aminotransferase and related aminotransferases # Organism: Escherichia coli K12 # 1 421 1 421 421 794 99.0 0 MSNNEFHQRRLSATPRGVGVMCNFFAQSAENATLKDVEGNEYIDFAAGIAVLNTGHRHPD LVAAVEQQLQQFTHTAYQIVPYESYVTLAEKINALAPVSGQAKTAFFTTGAEAVENAVKI ARAHTGRPGVIAFSGGFHGRTYMTMALTGKVAPYKIGFGPFPGSVYHVPYPSDLHGISTQ DSLDAIERLFKSDIEAKQVAAIIFEPVQGEGGFNVAPKELVAAIRRLCDEHGIVMIADEV QSGFARTGKLFAMDHYADKPDLMTMAKSLAGGMPLSGVVGNANIMDAPAPGGLGGTYAGN PLAVAAAHAVLNIIDKESLCERANQLGQRLTNTLIDAKESVPAIAAVRGLGSMIAAEFND PQTGEPSAAIAQKIQQRALAQGLLLLTCGAYGNVIRFLYPLTIPDAQFDAAMKILQDALS D >gi|296494495|gb|ADTN01000243.1| GENE 47 46003 - 47283 1562 426 aa, chain - ## HITS:1 COG:ordL KEGG:ns NR:ns ## COG: ordL COG0665 # Protein_GI_number: 16129262 # Func_class: E Amino acid transport and metabolism # Function: Glycine/D-amino acid oxidases (deaminating) # Organism: Escherichia coli K12 # 1 426 1 426 426 871 100.0 0 MTEHTSSYYAASANKYAPFDTLNESITCDVCVVGGGYTGLSSALHLAEAGFDVVVLEASR 
IGFGASGRNGGQLVNSYSRDIDVIEKSYGMDTARMLGSMMFEGGEIIRERIKRYQIDCDY RPGGLFVAMNDKQLATLEEQKENWERYGNKQLELLDANAIRREVASDRYTGALLDHSGGH IHPLNLAIGEADAIRLNGGRVYELSAVTQIQHTTPAVVRTAKGQVTAKYVIVAGNAYLGD KVEPELAKRSMPCGTQVITTERLSEDLARSLIPKNYCVEDCNYLLDYYRLTADNRLLYGG GVVYGARDPDDVERLVVPKLLKTFPQLKGVKIDYRWTGNFLLTLSRMPQFGRLDTNIYYM QGYSGHGVTCTHLAGRLIAELLRGDAERFDAFANLPHYPFPGGRTLRVPFTAMGAAYYSL RDRLGV >gi|296494495|gb|ADTN01000243.1| GENE 48 47285 - 48772 1747 495 aa, chain - ## HITS:1 COG:aldH KEGG:ns NR:ns ## COG: aldH COG1012 # Protein_GI_number: 16129261 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Escherichia coli K12 # 1 495 1 495 495 973 99.0 0 MNFHHLAYWQDKALSLAIENRLFINGEYTAAAENETFETVDPVTQAPLAKIARGKSVDID RAVSAARGVFERGDWSLSSPAKRKAVLNKLADLMEAHAEELALLETLDTGKPIRHSLRDD IPGAARAIRWYAEAIDKVYGEVATTSSHELAMIVREPVGVIAAIVPWNFPLLLTCWKLGP ALAAGNSVILKPSEKSPLSAIRLAGLAKEAGLPDGVLNVVTGFGHEAGQALSRHNDIDAI AFTGSTRTGKQLLKDAGDSNMKRVWLEAGGKSANLVFADCPDLQKAASATAAGIFYNQGQ VCIAGTRLLLEESIADEFLALLKQQAQNWQPGHPLDPATTMGTLIDCAHADSVHSFIREG ESKGQLLLDGRNAELAAAIGPTIFVDVDPNASLSREEIFGPVLVVTRFTSEEQALQLAND SQYGLGAAVWTRDLSRAHRMSRRLKAGSVFVNNYNDGDMTVPFGGYKQSGNGRDKSLHAL EKFTELKTIWISLEA >gi|296494495|gb|ADTN01000243.1| GENE 49 49047 - 49604 489 185 aa, chain - ## HITS:1 COG:ycjC KEGG:ns NR:ns ## COG: ycjC COG1396 # Protein_GI_number: 16129260 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 185 1 185 185 344 100.0 5e-95 MSDEGLAPGKRLSEIRQQQGLSQRRAAELSGLTHSAISTIEQDKVSPAISTLQKLLKVYG LSLSEFFSEPEKPDEPQVVINQDDLIEMGSQGVSMKLVHNGNPNRTLAMIFETYQPGTTT GERIKHQGEEIGTVLEGEIVLTINGQDYHLVAGQSYAINTGIPHSFSNTSAGICRIISAH TPTTF >gi|296494495|gb|ADTN01000243.1| GENE 50 49631 - 50383 347 250 aa, chain - ## HITS:1 COG:ycjL KEGG:ns NR:ns ## COG: ycjL COG2071 # Protein_GI_number: 16129259 # Func_class: R General function prediction only # Function: Predicted glutamine amidotransferases # Organism: Escherichia coli K12 # 1 250 9 258 
258 488 99.0 1e-138 MNNPVIGVVMCRNRLKGHATQTLQEKYLNAIIHAGGLPIALPHALAEPSLLEQLLPKLDG IYLPGSPSNVQPHLYGENGDEPDADPGRDLLSMALINAALERRIPIFAICRGLQELVVAT GGSLHRKLCEQPELLEHREDPELPVEQQYAPSHEVQVEEGGLLSALLPECSNFWVNSLHG QGAKVVSPRLRVEARSPDGLVEAVSVINHPFALGVQWHPEWNSSEYALSRILFEGFITAC QHHIAEKQRL >gi|296494495|gb|ADTN01000243.1| GENE 51 50607 - 51954 1032 449 aa, chain + ## HITS:1 COG:ycjK KEGG:ns NR:ns ## COG: ycjK COG0174 # Protein_GI_number: 16129258 # Func_class: E Amino acid transport and metabolism # Function: Glutamine synthetase # Organism: Escherichia coli K12 # 1 440 27 466 498 900 99.0 0 METNIVEVENFVQQSEERRGSAFTQEVKRYLERYPNTQYVDVLLTDLNGCFRGKRIPVSS LKKLEKGCYFPASVFAMDILGNVVEEAGLGQEMGEPDRTCVPVLGSLTPSAADPEFIGQM LLTMVDEDGAPFDVEPRNVLNRLWQQLRQRGLFPVVAVELEFYLLDRQRDAEGYLQPPCA PGTDDRNTQSQVYSVDNLNHFADVLNDIDELAQLQLIPADGAVAEASPGQFEINLYHTDN VLEACDDALALKRLVRLMAEKHKMHATFMAKPYEEHAGNGMHIHISMQNNRGENVLSDAE GEDSPLLKKMLAGMIDLMPSSMALLAPNVNSYRRFQPGMYVPTQASWGHNNRTVALRIPC GDRHNHRVEYRVAGADANPYLVMAAIFAGILHGLDNELPLQEEVEGNGLEQEGLPFPIRQ SDALGEFIENDHLRRYLGERPFLKKKKKK Prediction of potential genes in microbial genomes Time: Sun May 15 23:58:04 2011 Seq name: gi|296494494|gb|ADTN01000244.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont602.3, whole genome shotgun sequence Length of sequence - 8246 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 3, operones - 1 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 171 - 230 2.2 1 1 Tu 1 . + CDS 273 - 1658 1515 ## COG0531 Amino acid transporters + Prom 1700 - 1759 3.9 2 2 Tu 1 . 
+ CDS 1792 - 2037 333 ## Z2493 hypothetical protein + Prom 2268 - 2327 4.0 3 3 Op 1 8/0.000 + CDS 2350 - 3993 1459 ## COG4166 ABC-type oligopeptide transport system, periplasmic component 4 3 Op 2 8/0.000 + CDS 3990 - 4955 588 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 5 3 Op 3 8/0.000 + CDS 4942 - 5832 942 ## COG4171 ABC-type antimicrobial peptide transport system, permease component 6 3 Op 4 8/0.000 + CDS 5832 - 6824 468 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 7 3 Op 5 1/0.000 + CDS 6814 - 7632 448 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 + Prom 7661 - 7720 6.0 8 3 Op 6 . + CDS 7826 - 8053 76 ## COG2852 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|296494494|gb|ADTN01000244.1| GENE 1 273 - 1658 1515 461 aa, chain + ## HITS:1 COG:ycjJ KEGG:ns NR:ns ## COG: ycjJ COG0531 # Protein_GI_number: 16129257 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 461 19 479 479 862 100.0 0 MAINSPLNIAAQPGKTRLRKSLKLWQVVMMGLAYLTPMTVFDTFGIVSGISDGHVPASYL LALAGVLFTAISYGKLVRQFPEAGSAYTYAQKSINPHVGFMVGWSSLLDYLFLPMINVLL AKIYLSALFPEVPPWVWVVTFVAILTAANLKSVNLVANFNTLFVLVQISIMVVFIFLVVQ GLHKGEGVGTVWSLQPFISENAHLIPIITGATIVCFSFLGFDAVTTLSEETPDAARVIPK AIFLTAVYGGVIFIAASFFMQLFFPDISRFKDPDAALPEIALYVGGKLFQSIFLCTTFVN TLASGLASHASVSRLLYVMGRDNVFPERVFGYVHPKWRTPALNVIMVGIVALSALFFDLV TATALINFGALVAFTFVNLSVFNHFWRRKGMNKSWKDHFHYLLMPLVGALTVGVLWVNLE STSLTLGLVWASLGGAYLWYLIRRYRKVPLYDGDRTPVSET >gi|296494494|gb|ADTN01000244.1| GENE 2 1792 - 2037 333 81 aa, chain + ## HITS:1 COG:no KEGG:Z2493 NR:ns ## KEGG: Z2493 # Name: ymjA # Def: hypothetical protein # Organism: E.coli_O157 # Pathway: not_defined # 1 81 1 81 81 141 98.0 9e-33 MNHDIPLKYFDIADEYATECAEPVAEAERTPLAHYFQLLLTRLMNNEEISEEAQHEMAAE AGINPVRIDEIAEFLNQWGNE >gi|296494494|gb|ADTN01000244.1| GENE 3 2350 - 3993 1459 547 aa, chain + ## HITS:1 COG:sapA KEGG:ns NR:ns ## COG: sapA 
COG4166 # Protein_GI_number: 16129255 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Escherichia coli K12 # 1 547 1 547 547 1075 99.0 0 MRQVLSSLLVIAGLVSGQAIAAPESPPHADIRDSGFVYCVSGQVNTFNPSKASSGLIVDT LAAQFYDRLLDVDPYTYRLMPELAESWEVLDNGATYRFHLRRDVPFQKTDWFTPTRKMNA DDVVFTFQRIFDRNNPWHNVNGSNFPYFDSLQFADNVKSVRKLDNHTVEFRLAQPDASFL WHLATHYASVMSAEYARKLEKEDRQEQLDRQPVGTGPYQLSEYRAGQFIRLQRHDDFWRG KPLMPQVVVDLGSGGTGRLSKLLTGECDVLAWPAASQLSILRDDPRLRLTLRPGMNVAYL AFNTAKPPLNNPAVRHALALAINNQRLMQSIYYGTAETAASILPRASWAYDNEAKITEYN PAKSREQLKALGLENLTLKLWVPTRSQAWNPSPLKTAELIQADMAQVGVKVVIVPVEGRF QEARLMDMSHDLTLSGWATDSNDPDSFFRPLLSCAAIHSQTNLAHWCDPKFDSVLRKALS SQQLAARIEAYDEAQSILAQELPILPLASSLRLQAYRYDIKGLVLSPFGNASFAGVYREK QDEVKKP >gi|296494494|gb|ADTN01000244.1| GENE 4 3990 - 4955 588 321 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 1 318 1 317 320 231 35 2e-60 MIIFTLRRILLLIVTLFLLTFVGFSLSYFTPHAPLQGASLWNAWVFWFNGLIHWDFGVSS INGQPIAEQLKEVFPATMELCILAFGFALIVGIPVGMIAGITRHKWQDNLINAIALLGFS IPVFWLALLLTLFCSLTLGWLPVSGRFDLLYEVKPITGFALIDAWLSDSPWRDEMIMSAI RHMILPVITLSVAPTTEVIRLMRISTIEVYDQNYVKAAATRGLSRFTILRRHVLHNALPP VIPRLGLQFSTMLTLAMITEMVFSWPGLGRWLINAIRQQDYAAISAGVMVCGSLVIIVNV ISDILGAMANPLKHKEWYALR >gi|296494494|gb|ADTN01000244.1| GENE 5 4942 - 5832 942 296 aa, chain + ## HITS:1 COG:sapC KEGG:ns NR:ns ## COG: sapC COG4171 # Protein_GI_number: 16129253 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Escherichia coli K12 # 1 296 1 296 296 535 100.0 1e-152 MPYDSVYSEKRPPGTLRTAWRKFYSDASAMVGLYGCAGLAVLCIFGGWFAPYGIDQQFLG YQLLPPSWSRYGEVSFFLGTDDLGRDVLSRLLSGAAPTVGGAFVVTLAATICGLVLGTFA GATHGLRSAVLNHILDTLLAIPSLLLAIIVVAFAGPSLSHAMFAVWLALLPRMVRSIYSM VHDELEKEYVIAARLDGASTLNILWFAVMPNITAGLVTEITRALSMAILDIAALGFLDLG AQLPSPEWGAMLGDALELIYVAPWTVMLPGAAIMISVLLVNLLGDGVRRAIIAGVE 
>gi|296494494|gb|ADTN01000244.1| GENE 6 5832 - 6824 468 330 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 14 321 31 324 329 184 37 1e-46 MPLLDIRNLTIEFKTGDEWVKAVDRVSMTLTEGEIRGLVGESGSGKSLIAKAICGVNKDN WRVTADRMRFDDIDLLRLSARERRKLVGHNVSMIFQEPQSCLDPSERVGRQLMQNIPAWT YKGRWWQRFGWRKRRAIELLHRVGIKDHKDAMRSFPYELTEGECQKVMIAIALANQPRLL IADEPTNSMEPTTQAQIFRLLTRLNQNSNTTILLISHDLQMLSQWADKINVLYCGQTVET APSKELVTMPHHPYTQALIRAIPDFGSAMPHKSRLNTLPGAIPLLEQLPIGCRLGPRCPY AQRECIVTPRLTGAKNHLYACHFPLNMEKE >gi|296494494|gb|ADTN01000244.1| GENE 7 6814 - 7632 448 272 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 256 3 263 329 177 35 3e-44 RKSEMIETLLEVRNLSKTFRYRTGWFRRQTVEAVKPLSFTLREGQTLAIIGENGSGKSTL AKMLAGMIEPTSGELLIDDHPLHFGDYSFRSQRIRMIFQDPSTSLNPRQRISQILDFPLR LNTDLEPEQRRKQIIETMRMVGLLPDHVSYYPHMLAPGQKQRLGLARALILRPKVIIADE ALASLDMSMRSQLINLMLELQEKQGISYIYVTQHIGMMKHISDQVLVMHQGEVVERGSTA DVLASPLHELTKRLIAGHFGEALTADAWRKDR >gi|296494494|gb|ADTN01000244.1| GENE 8 7826 - 8053 76 75 aa, chain + ## HITS:1 COG:ycjD KEGG:ns NR:ns ## COG: ycjD COG2852 # Protein_GI_number: 16129250 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 75 43 117 117 151 98.0 3e-37 MGSYILDFACCSARVVVELDGGQHDLAVAYDSRRTSWLESQGWTVLRFWNNEIDCNEETV LENILQELNRRSPSP Prediction of potential genes in microbial genomes Time: Sun May 15 23:58:22 2011 Seq name: gi|296494493|gb|ADTN01000245.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont602.4, whole genome shotgun sequence Length of sequence - 54635 bp Number of predicted genes - 53, with homology - 52 Number of transcription units - 32, operones - 12 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 31 - 90 3.6 1 1 Tu 1 . 
+ CDS 171 - 959 1055 ## COG0623 Enoyl-[acyl-carrier-protein] reductase (NADH) + Term 967 - 1027 5.1 + Prom 962 - 1021 2.5 2 2 Op 1 1/0.900 + CDS 1103 - 2230 401 ## COG4950 Uncharacterized protein conserved in bacteria 3 2 Op 2 4/0.600 + CDS 2298 - 4232 2073 ## COG4776 Exoribonuclease II + Term 4243 - 4279 6.3 + Prom 4320 - 4379 6.4 4 3 Tu 1 . + CDS 4468 - 6453 1160 ## COG2200 FOG: EAL domain + Term 6524 - 6566 3.1 + Prom 6461 - 6520 6.3 5 4 Op 1 . + CDS 6601 - 6774 236 ## EC55989_1446 hypothetical protein + Prom 6776 - 6835 6.9 6 4 Op 2 . + CDS 6864 - 7613 662 ## COG1349 Transcriptional regulators of sugar metabolism + Term 7624 - 7682 4.1 + Prom 7675 - 7734 5.9 7 5 Tu 1 . + CDS 7882 - 8100 335 ## G2583_1624 hypothetical protein + Term 8194 - 8233 8.4 - Term 8186 - 8217 4.1 8 6 Op 1 6/0.200 - CDS 8226 - 8552 467 ## COG0023 Translation initiation factor 1 (eIF-1/SUI1) and related proteins 9 6 Op 2 7/0.100 - CDS 8552 - 9289 689 ## COG0284 Orotidine-5'-phosphate decarboxylase - Prom 9341 - 9400 7.6 10 7 Op 1 8/0.000 - CDS 9483 - 10652 1124 ## COG2956 Predicted N-acetylglucosaminyl transferase 11 7 Op 2 5/0.400 - CDS 10659 - 10967 201 ## PROTEIN SUPPORTED gi|46133578|ref|ZP_00203203.1| COG3771: Predicted membrane protein - Prom 11031 - 11090 2.9 12 8 Tu 1 . - CDS 11116 - 11880 448 ## COG0671 Membrane-associated phospholipid phosphatase - Prom 11901 - 11960 4.5 + Prom 11917 - 11976 1.6 13 9 Tu 1 . + CDS 12050 - 12640 700 ## COG0807 GTP cyclohydrolase II + Term 12661 - 12712 5.6 - Term 12607 - 12643 4.1 14 10 Tu 1 . - CDS 12704 - 15379 2460 ## COG1048 Aconitase A - Prom 15429 - 15488 3.5 - Term 15503 - 15538 4.0 15 11 Tu 1 . - CDS 15543 - 15638 97 ## 16 12 Op 1 . - CDS 15752 - 15919 144 ## SSON_1865 hypothetical protein 17 12 Op 2 . - CDS 15922 - 16050 100 ## EC55989_1434 hypothetical protein - Prom 16104 - 16163 4.7 - Term 16318 - 16370 9.1 18 13 Tu 1 . 
- CDS 16381 - 17355 879 ## COG0583 Transcriptional regulator - Prom 17422 - 17481 7.5 - Term 17503 - 17533 2.1 19 14 Tu 1 . - CDS 17565 - 20162 2829 ## COG0550 Topoisomerase IA - Prom 20324 - 20383 3.7 + Prom 20362 - 20421 4.8 20 15 Tu 1 . + CDS 20542 - 20793 422 ## ECIAI1_1292 hypothetical protein + Term 20796 - 20836 6.3 - Term 20784 - 20824 6.6 21 16 Tu 1 . - CDS 20829 - 21878 1057 ## COG0616 Periplasmic serine proteases (ClpP class) - Prom 21908 - 21967 3.6 + Prom 21987 - 22046 2.1 22 17 Op 1 5/0.400 + CDS 22098 - 22856 756 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 23 17 Op 2 . + CDS 22853 - 23443 697 ## COG2109 ATP:corrinoid adenosyltransferase + Term 23454 - 23487 2.5 - Term 23442 - 23474 1.5 24 18 Tu 1 . - CDS 23483 - 24358 1181 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases - Prom 24507 - 24566 2.7 25 19 Op 1 . - CDS 24569 - 26464 708 ## JW5197 predicted inner membrane protein 26 19 Op 2 6/0.200 - CDS 26492 - 27112 773 ## COG0009 Putative translation factor (SUA5) 27 19 Op 3 . 
- CDS 27109 - 27990 377 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family) - Prom 28069 - 28128 6.1 + Prom 28042 - 28101 6.1 28 20 Op 1 10/0.000 + CDS 28264 - 29826 1582 ## COG0147 Anthranilate/para-aminobenzoate synthases component I 29 20 Op 2 21/0.000 + CDS 29826 - 31421 1690 ## COG0547 Anthranilate phosphoribosyltransferase 30 20 Op 3 13/0.000 + CDS 31425 - 32783 1195 ## COG0134 Indole-3-glycerol phosphate synthase 31 20 Op 4 37/0.000 + CDS 32795 - 33988 1356 ## COG0133 Tryptophan synthase beta chain 32 20 Op 5 2/0.800 + CDS 33988 - 34794 392 ## PROTEIN SUPPORTED gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc + Term 34802 - 34834 5.4 + Prom 34977 - 35036 4.8 33 21 Op 1 3/0.600 + CDS 35175 - 35354 214 ## COG3729 General stress protein + Term 35395 - 35434 6.2 34 21 Op 2 4/0.600 + CDS 35440 - 35940 595 ## COG3685 Uncharacterized protein conserved in bacteria 35 21 Op 3 . + CDS 35986 - 36492 628 ## COG3685 Uncharacterized protein conserved in bacteria + Term 36516 - 36550 6.0 36 22 Tu 1 . - CDS 36552 - 37106 665 ## COG3047 Outer membrane protein W - Prom 37219 - 37278 7.0 37 23 Op 1 . + CDS 37547 - 38290 498 ## JW1247 predicted inner membrane protein 38 23 Op 2 9/0.000 + CDS 38320 - 38859 559 ## COG2917 Intracellular septation protein A + Term 38861 - 38902 5.5 + Prom 38876 - 38935 2.1 39 23 Op 3 . + CDS 38964 - 39362 326 ## COG1607 Acyl-CoA hydrolase + Term 39365 - 39406 11.3 - Term 39353 - 39394 11.3 40 24 Tu 1 . - CDS 39402 - 40121 600 ## COG0810 Periplasmic protein TonB, links inner and outer membranes - Prom 40153 - 40212 5.6 + Prom 40253 - 40312 3.3 41 25 Tu 1 . + CDS 40345 - 40641 272 ## COG2350 Uncharacterized protein conserved in bacteria + Term 40650 - 40706 5.1 + Prom 40856 - 40915 4.8 42 26 Tu 1 . + CDS 41061 - 42194 742 ## COG1226 Kef-type K+ transport systems, predicted NAD-binding component + Term 42219 - 42259 8.1 - Term 42207 - 42245 3.1 43 27 Tu 1 . 
- CDS 42249 - 42422 78 ## EC55989_1347 hypothetical protein - Prom 42447 - 42506 6.2 + Prom 42478 - 42537 7.8 44 28 Op 1 1/0.900 + CDS 42565 - 44025 1131 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes 45 28 Op 2 . + CDS 44060 - 44389 523 ## COG3099 Uncharacterized protein conserved in bacteria + Term 44413 - 44444 4.1 - Term 44401 - 44432 4.1 46 29 Op 1 44/0.000 - CDS 44442 - 45446 853 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 47 29 Op 2 44/0.000 - CDS 45443 - 46456 582 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 48 29 Op 3 49/0.000 - CDS 46468 - 47376 954 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 49 29 Op 4 21/0.000 - CDS 47391 - 48311 810 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components - Term 48341 - 48393 11.1 50 29 Op 5 . - CDS 48397 - 50028 1710 ## COG4166 ABC-type oligopeptide transport system, periplasmic component - Prom 50159 - 50218 4.1 + Prom 50315 - 50374 2.4 51 30 Tu 1 . + CDS 50447 - 50608 163 ## ECS88_1310 hypothetical protein + Term 50732 - 50761 0.4 - Term 50715 - 50753 4.6 52 31 Tu 1 . - CDS 50766 - 51413 486 ## COG2095 Multiple antibiotic transporter - Prom 51601 - 51660 6.9 53 32 Tu 1 . 
+ CDS 51890 - 54565 2863 ## COG1454 Alcohol dehydrogenase, class IV Predicted protein(s) >gi|296494493|gb|ADTN01000245.1| GENE 1 171 - 959 1055 262 aa, chain + ## HITS:1 COG:ECs1861 KEGG:ns NR:ns ## COG: ECs1861 COG0623 # Protein_GI_number: 15831115 # Func_class: I Lipid transport and metabolism # Function: Enoyl-[acyl-carrier-protein] reductase (NADH) # Organism: Escherichia coli O157:H7 # 1 262 1 262 262 506 100.0 1e-143 MGFLSGKRILVTGVASKLSIAYGIAQAMHREGAELAFTYQNDKLKGRVEEFAAQLGSDIV LQCDVAEDASIDTMFAELGKVWPKFDGFVHSIGFAPGDQLDGDYVNAVTREGFKIAHDIS SYSFVAMAKACRSMLNPGSALLTLSYLGAERAIPNYNVMGLAKASLEANVRYMANAMGPE GVRVNAISAGPIRTLAASGIKDFRKMLAHCEAVTPIRRTVTIEDVGNSAAFLCSDLSAGI SGEVVHVDGGFSIAAMNELELK >gi|296494493|gb|ADTN01000245.1| GENE 2 1103 - 2230 401 375 aa, chain + ## HITS:1 COG:yciW_1 KEGG:ns NR:ns ## COG: yciW_1 COG4950 # Protein_GI_number: 16129248 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 210 27 236 236 423 100.0 1e-118 MEQRHITGKSHWYHETQSSTTEYDVLPLVPEAAKVSDPFLLDVILEKETLAPFLSWLDPA RVLAVDLFPDQLTVTRSQTFTAYERLSTALTVAQVCGVQRLCNYYSARLTPLPGPDSTRE SNHRLAQITQYARQLASSPSIIDNRSRQHLNDVGLTAWDCVIISQIIGFIGFQARTIATF QAYLGHPVRWLPGLEIQNYADASLFADESLRWRSSYEVEKLPEEHTKSSTAELCQLAEIL SLHPISLSLLEKLLNSTRGNTQPDNQLAALLCARINGSPACFATCMDSSNEYKKISTLMR KGENEINQWADRHSVERATVQAIQWLTRAPDRFSAAQFSPLLEHEKSSTQIINLLVWSGL CGWINRLKIALGETY >gi|296494493|gb|ADTN01000245.1| GENE 3 2298 - 4232 2073 644 aa, chain + ## HITS:1 COG:ECs1859 KEGG:ns NR:ns ## COG: ECs1859 COG4776 # Protein_GI_number: 15831113 # Func_class: K Transcription # Function: Exoribonuclease II # Organism: Escherichia coli O157:H7 # 1 644 1 644 644 1281 99.0 0 MFQDNPLLAQLKQQLHSQTPRAEGVVKATEKGFGFLEVDAQKSYFIPPPQMKKVMHGDRI IAVIHSEKERESAEPEELVEPFLTRFVGKVQGKNDRLAIVPDHPLLKDAIPCRAARGLNH EFKEGDWAVAEMRRHPLKGDRSFYAELTQYITFGDDHFVPWWVTLARHNLEKEAPDGVAT EMLDEGLVREDLTALDFVTIDSASTEDMDDALFAKALPDDKLQLIVAIADPTAWIAEGSK LDKAAKIRAFTNYLPGFNIPMLPRELSDDLCSLRANEVRPVLACRMTLSADGTIEDNIEF 
FAATIESKAKLVYDQVSDWLENTGDWQPESEAIAEQVRLLAQICQRRGEWRHNHALVFKD RPDYRFILGEKGEVLDIVAEPRRIANRIVEEAMIAANICAARVLRDKLGFGIYNVHMGFD PANADALAALLKTHGLHVDAEEVLTLDGFCKLRRELDAQPTGFLDSRIRRFQSFAEISTE PGPHFGLGLEAYATWTSPIRKYGDMINHRLLKAVIKGETATRPQDEITVQMAERRRLNRM AERDVGDWLYARFLKDKAGTDTRFAAEIVDISRGGMRVRLVDNGAIAFIPAPFLHAVRDE LVCSQENGTVQIKGETVYKVTDVIDVTIAEVRMETRSIIARPVA >gi|296494493|gb|ADTN01000245.1| GENE 4 4468 - 6453 1160 661 aa, chain + ## HITS:1 COG:yciR_3 KEGG:ns NR:ns ## COG: yciR_3 COG2200 # Protein_GI_number: 16129246 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 401 661 1 261 261 530 100.0 1e-150 MKTVRESTTLYNFLGSHNPYWRLTESSDVLRFSTTETTEPDRTLQLSAEQAARIREMTVI TSSLMMSLTVDESDLSVHLVGRKINKREWAGNASAWHDTPAVARDLSHGLSFAEQVVSEA HSAIVILDSRGNIQRFNRLCEDYTGLKEHDVIGQSVFKLFMSRREAAASRRNNRVFFRSG NAYEVELWIPTCKGQRLFLFRNKFVHSGSGKNEIFLICSGTDITEERRAQERLRILANTD SITGLPNRNAMQDLIDHAINHADNNKVGVVYLDLDNFKKVNDAYGHLFGDQLLRDVSLAI LSCLEHDQVLARPGGDEFLVLASNTSQSALEAMASRILTRLRLPFRIGLIEVYTSCSVGI ALSPEHGSDSTAIIRHADTAMYTAKEGGRGQFCVFTPEMNQRVFEYLWLDTNLRKALEND QLVIHYQPKITWRGEVRSLEALVRWQSPERGLIPPLDFISYAEESGLIVPLGRWVILDVV RQVAKWRDKGINLRVAVNISARQLADQTIFTALKQVLQELNFEYCPIDVELTESCLIEND ELALSVIQQFSQLGAQVHLDDFGTGYSSLSQLARFPIDAIKLDQVFVRDIHKQPVSQSLV RAIVAVAQALNLQVIAEGVESAKEDAFLTKNGINERQGFLFAKPMPAVAFERWYKRYLKR A >gi|296494493|gb|ADTN01000245.1| GENE 5 6601 - 6774 236 57 aa, chain + ## HITS:1 COG:no KEGG:EC55989_1446 NR:ns ## KEGG: EC55989_1446 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 57 15 71 71 105 100.0 4e-22 MSEFDAQRVAERIDIVLDILVAGDYHSAIHNLEILKAELLRQVAESTPDIPKAPWEI >gi|296494493|gb|ADTN01000245.1| GENE 6 6864 - 7613 662 249 aa, chain + ## HITS:1 COG:ECs1857 KEGG:ns NR:ns ## COG: ECs1857 COG1349 # Protein_GI_number: 15831111 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli O157:H7 # 1 249 1 249 249 480 
100.0 1e-135 MNSRQQTILQMVIDQGQVSVTDLAKATGVSEVTIRQDLNTLEKLSYLRRAHGFAVSLDSD DVETRMMSNYTLKRELAEFAASLVQPGETIFIENGSSNALLARTLGEQKKNVTIITVSSY IAHLLKDAPCEVILLGGVYQKKSESMVGPLTRQCIQQVHFSKAFIGIDGWQPETGFTGRD MMRTDVVNAVLEKECEAIVLTDSSKFGAVHSYSIGPVERFNRVITDSKIRASDLMHLEHS KLTVHVVDI >gi|296494493|gb|ADTN01000245.1| GENE 7 7882 - 8100 335 72 aa, chain + ## HITS:1 COG:no KEGG:G2583_1624 NR:ns ## KEGG: G2583_1624 # Name: osmB # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 72 1 72 72 74 100.0 1e-12 MFVTSKKMTAAVLAITLAMSLSACSNWSKRDRNTAIGAGAGALGGAVLTDGSTLGTLGGA AVGGVIGHQVGK >gi|296494493|gb|ADTN01000245.1| GENE 8 8226 - 8552 467 108 aa, chain - ## HITS:1 COG:yciH KEGG:ns NR:ns ## COG: yciH COG0023 # Protein_GI_number: 16129243 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 1 (eIF-1/SUI1) and related proteins # Organism: Escherichia coli K12 # 1 108 2 109 109 177 100.0 5e-45 MSDSNSRLVYSTETGRIDEPKAAPVRPKGDGVVRIQRQTSGRKGKGVCLITGVDLDDAEL TKLAAELKKKCGCGGAVKDGVIEIQGDKRDLLKSLLEAKGMKVKLAGG >gi|296494493|gb|ADTN01000245.1| GENE 9 8552 - 9289 689 245 aa, chain - ## HITS:1 COG:pyrF KEGG:ns NR:ns ## COG: pyrF COG0284 # Protein_GI_number: 16129242 # Func_class: F Nucleotide transport and metabolism # Function: Orotidine-5'-phosphate decarboxylase # Organism: Escherichia coli K12 # 1 245 1 245 245 456 100.0 1e-128 MTLTASSSSRAVTNSPVVVALDYHNRDDALAFVDKIDPRDCRLKVGKEMFTLFGPQFVRE LQQRGFDIFLDLKFHDIPNTAAHAVAAAADLGVWMVNVHASGGARMMTAAREALVPFGKD APLLIAVTVLTSMEASDLVDLGMTLSPADYAERLAALTQKCGLDGVVCSAQEAVRFKQVF GQEFKLVTPGIRPQGSEAGDQRRIMTPEQALSAGVDYMVIGRPVTQSVDPAQTLKAINAS LQRSA >gi|296494493|gb|ADTN01000245.1| GENE 10 9483 - 10652 1124 389 aa, chain - ## HITS:1 COG:ECs1853 KEGG:ns NR:ns ## COG: ECs1853 COG2956 # Protein_GI_number: 15831107 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted N-acetylglucosaminyl transferase # Organism: Escherichia coli O157:H7 # 10 389 10 389 389 721 100.0 0 
MLELLFLLLPVAAAYGWYMGRRSAQQNKQDEANRLSRDYVAGVNFLLSNQQDKAVDLFLD MLKEDTGTVEAHLTLGNLFRSRGEVDRAIRIHQTLMESASLTYEQRLLAIQQLGRDYMAA GLYDRAEDMFNQLTDETDFRIGALQQLLQIYQATSEWQKAIDVAERLVKLGKDKQRVEIA HFYCELALQHMASDDLDRAMTLLKKGAAADKNSARVSIMMGRVFMAKGEYAKAVESLQRV ISQDRELVSETLEMLQTCYQQLGKTAEWAEFLQRAVEENTGADAELMLADIIEARDGSEA AQVYITRQLQRHPTMRVFHKLMDYHLNEAEEGRAKESLMVLRDMVGEKVRSKPRYRCQKC GFTAYTLYWHCPSCRAWSTIKPIRGLDGL >gi|296494493|gb|ADTN01000245.1| GENE 11 10659 - 10967 201 102 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46133578|ref|ZP_00203203.1| COG3771: Predicted membrane protein [Haemophilus influenzae R2866] # 1 89 2 90 97 82 35 7e-15 MKYLLIFLLVLAIFVISVTLGAQNDQQVTFNYLLAQGEYRISTLLAVLFAAGFAIGWLIC GLFWLRVRVSLARAERKIKRLENQLSPATDVAVVPHSSAAKE >gi|296494493|gb|ADTN01000245.1| GENE 12 11116 - 11880 448 254 aa, chain - ## HITS:1 COG:ECs1851 KEGG:ns NR:ns ## COG: ECs1851 COG0671 # Protein_GI_number: 15831105 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Escherichia coli O157:H7 # 1 254 1 254 254 446 100.0 1e-125 MRSIARRTAVGAALLLVMPVAVWISGWRWQPGEQSWLLKAAFWVTETVTQPWGVITHLIL FGWFLWCLRFRIKAAFVLFAILAAAILVGQGVKSWIKDKVQEPRPFVIWLEKTHHIPVDE FYTLKRAERGNLVKEQLAEEKNIPQYLRSHWQKETGFAFPSGHTMFAASWALLAVGLLWP RRRTLTIAILLVWATGVMGSRLLLGMHWPRDLVVATLISWALVAVATWLAQRICGPLTPP AEENREIAQREQES >gi|296494493|gb|ADTN01000245.1| GENE 13 12050 - 12640 700 196 aa, chain + ## HITS:1 COG:ECs1850 KEGG:ns NR:ns ## COG: ECs1850 COG0807 # Protein_GI_number: 15831104 # Func_class: H Coenzyme transport and metabolism # Function: GTP cyclohydrolase II # Organism: Escherichia coli O157:H7 # 1 196 1 196 196 403 100.0 1e-112 MQLKRVAEAKLPTPWGDFLMVGFEELATGHDHVALVYGDISGHTPVLARVHSECLTGDAL FSLRCDCGFQLEAALTQIAEEGRGILLYHRQEGRNIGLLNKIRAYALQDQGYDTVEANHQ LGFAADERDFTLCADMFKLLGVNEVRLLTNNPKKVEILTEAGINIVERVPLIVGRNPNNE HYLDTKAEKMGHLLNK >gi|296494493|gb|ADTN01000245.1| GENE 14 12704 - 15379 2460 891 aa, chain - ## HITS:1 COG:acnA KEGG:ns NR:ns ## COG: acnA COG1048 # Protein_GI_number: 
16129237 # Func_class: C Energy production and conversion # Function: Aconitase A # Organism: Escherichia coli K12 # 1 891 1 891 891 1798 99.0 0 MSSTLREASKDTLQAKDKTYHYYSLPLAAKSLGDITRLPKSLKVLLENLLRWQDGNSVTE EDIHALAGWLKNAHADREIAYRPARVLMQDFTGVPAVVDLAAMREAVKRLGGDTAKVNPL SPVDLVIDHSVTVDRFGDDEAFEENVRLEMERNHERYVFLKWGKQAFSRFSVVPPGTGIC HQVNLEYLGKAVWSELQDGEWIAYPDTLVGTDSHTTMINGLGVLGWGVGGIEAEAAMLGQ PVSMLIPDVVGFKLTGKLREGITATDLVLTVTQMLRKHGVVGKFVEFYGDGLDSLPLADR ATIANMSPEYGATCGFFPIDAVTLDYMRLSGRSEDQVELVEKYAKAQGMWRNPGDEPIFT STLELDMNDVEASLAGPKRPQDRVALPDVPKAFAASNELEVNATHKDRQPVDYVMNGHQY QLPDGAVVIAAITSCTNTSNPSVLMAAGLLAKKAVTLGLKRQPWVKASLAPGSKVVSDYL AKAKLTPYLDELGFNLVGYGCTTCIGNSGPLPDPIETAIKKGDLTVGAVLSGNRNFEGRI HPLVKTNWLASPPLVVAYALAGNMNINLASEPIGHDRKGDPVYLKDIWPSAQEIARAVEQ VSTEMFRKEYAEVFEGTAEWKGINVTRSDTYGWQEDSTYIRLSPFFDEMQATPAPVEDIH GARILAMLGDSVTTDHISPAGSIKPDSPAGRYLQGRGVERKDFNSYGSRRGNHEVMMRGT FANIRIRNEMVPGVEGGMTRHLPDSDVVSIYDAAMRYKQEQTPLAVIAGKEYGSGSSRDW AAKGPRLLGIRVVIAESFERIHRSNLIGMGILPLEFPQGVTRKTLGLTGEEKIDIGDLQN LQPGATVPVTLTRADGSQEVVPCRCRIDTATELTYYQNDGILHYVIRNMLK >gi|296494493|gb|ADTN01000245.1| GENE 15 15543 - 15638 97 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MINTNMKYWSWMGAFSLSMLFWAELLWIITH >gi|296494493|gb|ADTN01000245.1| GENE 16 15752 - 15919 144 55 aa, chain - ## HITS:1 COG:no KEGG:SSON_1865 NR:ns ## KEGG: SSON_1865 # Name: not_defined # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 55 5 59 59 80 100.0 1e-14 MVGQEQLESSPLCQHSDNETETKRECSVVIPDDWQLTSQQQAFIELFAEDDQPKQ >gi|296494493|gb|ADTN01000245.1| GENE 17 15922 - 16050 100 42 aa, chain - ## HITS:1 COG:no KEGG:EC55989_1434 NR:ns ## KEGG: EC55989_1434 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 42 13 54 54 73 100.0 2e-12 MPSGNQEPRRDPELKRKAWLAVFLGSALFWVVVALLIWKVWG >gi|296494493|gb|ADTN01000245.1| GENE 18 16381 - 17355 879 324 aa, chain - ## HITS:1 COG:ECs1847 KEGG:ns NR:ns ## COG: ECs1847 COG0583 # Protein_GI_number: 15831101 # Func_class: K Transcription # 
Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 324 1 324 324 645 99.0 0 MKLQQLRYIVEVVNHNLNVSSTAEGLYTSQPGISKQVRMLEDELGIQIFSRSGKHLTQVT PAGQEIIRIAREVLSKVDAIKSVAGEHTWPDKGSLYIATTHTQARYALPNVIKGFIERYP RVSLHMHQGSPTQIADAVSKGNADFAIATEALHLYEDLVMLPCYHWNRAIVVTPDHPLAG KKAITIEELAQYPLVTYTFGFTGRSELDTAFNRAGLTPRIVFTATDADVIKTYVRLGLGV GVIASMAVDPVADPDLVRVDAHDIFSHSTTKIGFRRSTFLRSYMYDFIQRFAPHLTRDVV DAAVVLRSNEEIEVMFKDIKLPEK >gi|296494493|gb|ADTN01000245.1| GENE 19 17565 - 20162 2829 865 aa, chain - ## HITS:1 COG:topA_1 KEGG:ns NR:ns ## COG: topA_1 COG0550 # Protein_GI_number: 16129235 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Escherichia coli K12 # 1 592 1 592 592 1167 100.0 0 MGKALVIVESPAKAKTINKYLGSDYVVKSSVGHIRDLPTSGSAAKKSADSTSTKTAKKPK KDERGALVNRMGVDPWHNWEAHYEVLPGKEKVVSELKQLAEKADHIYLATDLDREGEAIA WHLREVIGGDDARYSRVVFNEITKNAIRQAFNKPGELNIDRVNAQQARRFMDRVVGYMVS PLLWKKIARGLSAGRVQSVAVRLVVEREREIKAFVPEEFWEVDASTTTPSGEALALQVTH QNDKPFRPVNKEQTQAAVSLLEKARYSVLEREDKPTTSKPGAPFITSTLQQAASTRLGFG VKKTMMMAQRLYEAGYITYMRTDSTNLSQDAVNMVRGYISDNFGKKYLPESPNQYASKEN SQEAHEAIRPSDVNVMAESLKDMEADAQKLYQLIWRQFVACQMTPAKYDSTTLTVGAGDF RLKARGRILRFDGWTKVMPALRKGDEDRILPAVNKGDALTLVELTPAQHFTKPPARFSEA SLVKELEKRGIGRPSTYASIISTIQDRGYVRVENRRFYAEKMGEIVTDRLEENFRELMNY DFTAQMENSLDQVANHEAEWKAVLDHFFSDFTQQLDKAEKDPEEGGMRPNQMVLTSIDCP TCGRKMGIRTASTGVFLGCSGYALPPKERCKTTINLVPENEVLNVLEGEDAETNALRAKR RCPKCGTAMDSYLIDPKRKLHVCGNNPTCDGYEIEEGEFRIKGYDGPIVECEKCGSEMHL KMGRFGKYMACTNEECKNTRKILRNGEVAPPKEDPVPLPELPCEKSDAYFVLRDGAAGVF LAANTFPKSRETRAPLVEELYRFRDRLPEKLRYLADAPQQDPEGNKTMVRFSRKTKQQYV SSEKDGKATGWSAFYVDGKWVEGKK >gi|296494493|gb|ADTN01000245.1| GENE 20 20542 - 20793 422 83 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_1292 NR:ns ## KEGG: ECIAI1_1292 # Name: yciN # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 83 1 83 83 158 100.0 7e-38 MNKETQPIDRETLLKEANKIIREHEDTLAGIEATGVTQRNGVLVFTGDYFLDEQGLPTAK STAVFNMFKHLAHVLSEKYHLVD >gi|296494493|gb|ADTN01000245.1| GENE 
21 20829 - 21878 1057 349 aa, chain - ## HITS:1 COG:sohB KEGG:ns NR:ns ## COG: sohB COG0616 # Protein_GI_number: 16129233 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Escherichia coli K12 # 1 349 1 349 349 594 100.0 1e-169 MELLSEYGLFLAKIVTVVLAIAAIAAIIVNVAQRNKRQRGELRVNNLSEQYKEMKEELAA ALMDSHQQKQWHKAQKKKHKQEAKAAKAKAKLGEVATDSKPRVWVLDFKGSMDAHEVNSL REEITAVLAAFKPQDQVVLRLESPGGMVHGYGLAASQLQRLRDKNIPLTVTVDKVAASGG YMMACVADKIVSAPFAIVGSIGVVAQMPNFNRFLKSKDIDIELHTAGQYKRTLTLLGENT EEGREKFREELNETHQLFKDFVKRMRPSLDIEQVATGEHWYGQQAVEKGLVDEINTSDEV ILSLMEGREVVNVRYMQRKRLIDRFTGSAAESADRLLLRWWQRGQKPLM >gi|296494493|gb|ADTN01000245.1| GENE 22 22098 - 22856 756 252 aa, chain + ## HITS:1 COG:yciK KEGG:ns NR:ns ## COG: yciK COG1028 # Protein_GI_number: 16129232 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Escherichia coli K12 # 1 252 1 252 252 496 100.0 1e-140 MHYQPKQDLLNDRIILVTGASDGIGREAAMTYARYGATVILLGRNEEKLRQVASHINEET GRQPQWFILDLLTCTSENCQQLAQRIAVNYPRLDGVLHNAGLLGDVCPMSEQNPQVWQDV MQVNVNATFMLTQALLPLLLKSDAGSLVFTSSSVGRQGRANWGAYAASKFATEGMMQVLA DEYQQRLRVNCINPGGTRTAMRASAFPTEDPQKLKTPADIMPLYLWLMGDDSRRKTGMTF DAQPGRKPGISQ >gi|296494493|gb|ADTN01000245.1| GENE 23 22853 - 23443 697 196 aa, chain + ## HITS:1 COG:btuR KEGG:ns NR:ns ## COG: btuR COG2109 # Protein_GI_number: 16129231 # Func_class: H Coenzyme transport and metabolism # Function: ATP:corrinoid adenosyltransferase # Organism: Escherichia coli K12 # 1 196 1 196 196 389 100.0 1e-108 MSDERYQQRQQRVKEKVDARVAQAQDERGIIIVFTGNGKGKTTAAFGTATRAVGHGKKVG VVQFIKGTWPNGERNLLEPHGVEFQVMATGFTWDTQNRESDTAACREVWQHAKRMLADSS LDMVLLDELTYMVAYDYLPLEEVVQALNERPHQQTVIITGRGCHRDILELADTVSELRPV KHAFDAGVKAQIGIDY 
>gi|296494493|gb|ADTN01000245.1| GENE 24 23483 - 24358 1181 291 aa, chain - ## HITS:1 COG:yciL KEGG:ns NR:ns ## COG: yciL COG1187 # Protein_GI_number: 16129230 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Escherichia coli K12 # 1 291 1 291 291 541 100.0 1e-154 MSEKLQKVLARAGHGSRREIESIIEAGRVSVDGKIAKLGDRVEVTPGLKIRIDGHLISVR ESAEQICRVLAYYKPEGELCTRNDPEGRPTVFDRLPKLRGARWIAVGRLDVNTCGLLLFT TDGELANRLMHPSREVEREYAVRVFGQVDDAKLRDLSRGVQLEDGPAAFKTIKFSGGEGI NQWYNVTLTEGRNREVRRLWEAVGVQVSRLIRVRYGDIPLPKGLPRGGWTELDLAQTNYL RELVELPPETSSKVAVEKDRRRMKANQIRRAVKRHSQVSGGRRSGGRNNNG >gi|296494493|gb|ADTN01000245.1| GENE 25 24569 - 26464 708 631 aa, chain - ## HITS:1 COG:no KEGG:JW5197 NR:ns ## KEGG: JW5197 # Name: yciQ # Def: predicted inner membrane protein # Organism: E.coli_J # Pathway: not_defined # 1 615 1 615 631 1275 100.0 0 MAGKFRCILLLIAGLFVSSLSYAENTEIPSYEEGISLFDVEATLQPDGVLDIKENIHFQA RNQQIKHGFYRDLPRLWMQPDGDAALLNYHIVGVTRDGIPEPWHLDWHIGLMSIVVGDKQ RFLPQGDYHYQIHYQVKNAFLREGDSDLLIWNVTGNHWPFEIYKTRFSLQFSNIAGNPFS EIDLFTGEEGDTYRNGRILEDGRIESRDPFYREDFTVLYRWPHALLSNASAPQTTNIFSH LLLPSTSSLLIWFPCLFLVCGWLYLWKRRPQFTPVDVIETDVIPPDYTPGMLRLDAKLVY DDKGFCADIVNLIVKGKIHLEDQSDKNQQILIRVNEGATRNNAVLLPAEQLLLEALFRKG DKVVLTGRRNRVLRRAFLRMQKFYLPRKKSSFYRSDTFLQWGGLAILAVILYGNLSPVGW AGMSLVGDMFIMICWIIPFLFCSLELLFARDDDKPCVNRVIITLFLPLICSGVAFYSLYI NVGDVFFYWYMPAGYFTAVCLTGYLTGMGYIFLPKFTQTGQQRYAHGEAIVNYLARKEAA THSGRRRKGETRKLDYALLGWAISANLGREWALRIAPSLSSAIRAPEIARNGVLFSLQTH LSCGANTSLLGRSYSGGGAGGGAGGGGGGGW >gi|296494493|gb|ADTN01000245.1| GENE 26 26492 - 27112 773 206 aa, chain - ## HITS:1 COG:ECs1839 KEGG:ns NR:ns ## COG: ECs1839 COG0009 # Protein_GI_number: 15831093 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Escherichia coli O157:H7 # 1 206 13 218 218 419 100.0 1e-117 MSQFFYIHPDNPQQRLINQAVEIVRKGGVIVYPTDSGYALGCKIEDKNAMERICRIRQLP 
DGHNFTLMCRDLSELSTYSFVDNVAFRLMKNNTPGNYTFILKGTKEVPRRLLQEKRKTIG MRVPSNPIAQALLEALGEPMLSTSLMLPGSEFTESDPEEIKDRLEKQVDLIIHGGYLGQK PTTVIDLTDDTPVVVREGVGDVKPFL >gi|296494493|gb|ADTN01000245.1| GENE 27 27109 - 27990 377 293 aa, chain - ## HITS:1 COG:yciV KEGG:ns NR:ns ## COG: yciV COG0613 # Protein_GI_number: 16129227 # Func_class: R General function prediction only # Function: Predicted metal-dependent phosphoesterases (PHP family) # Organism: Escherichia coli K12 # 1 293 1 293 293 582 100.0 1e-166 MSDTNYAVIYDLHSHTTASDGCLTPEALVHRAVEMRVGTLAITDHDTTAAIAPAREEISR SGLALNLIPGVEISTVWENHEIHIVGLNIDITHPLMCEFLAQQTERRNQRAQLIAERLEK AQIPGALEGAQRLAQGGAVTRGHFARFLVECGKASSMADVFKKYLARGKTGYVPPQWCTI EQAIDVIHHSGGKAVLAHPGRYNLSAKWLKRLVAHFAEHHGDAMEVAQCQQSPNERTQLA ALARQHHLWASQGSDFHQPCPWIELGRKLWLPAGVEGVWQLWEQPQNTTEREL >gi|296494493|gb|ADTN01000245.1| GENE 28 28264 - 29826 1582 520 aa, chain + ## HITS:1 COG:ECs1836 KEGG:ns NR:ns ## COG: ECs1836 COG0147 # Protein_GI_number: 15831090 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Escherichia coli O157:H7 # 1 520 1 520 520 1020 98.0 0 MQTQKPTLELLTCEGAYRDNPTALFHQLCGDRPATLLLESADIDSKDDLKSLLLVDSALR ITALGDTVTIQALSGNGEALLALLDNALPAGVESEQSPNCRVLRFPPVSPLLDEDARLCS LSVFDAFRLLQNLLNVPKEEREAMFFGGLFSYDLVAGFEDLPQLSAENNCPDFCFYLAET LMVIDHQKKSTRIQASLFAPNEEEKQRLTARLNELRQQLTEAAPPLPVVSVPHMRCECNQ SDEEFGGVVRLLQKAIRAGEIFQVVPSRRFSLPCPSPLAAYYVLKKSNPSPYMFFMQDND FTLFGASPESSLKYDATSRQIEIYPIAGTRPRGRRADGSLDRDLDSRIELEMRTDHKELS EHLMLVDLARNDLARICTPGSRYVADLTKVDRYSYVMHLVSRVVGELRHDLDALHAYRAC MNMGTLSGAPKVRAMQLIAEAEGRRRGSYGGAVGYFTAHGDLDTCIVIRSALVENGIATV QAGAGVVLDSVPQSEADETRNKARAVLRAIATAHHAQETF >gi|296494493|gb|ADTN01000245.1| GENE 29 29826 - 31421 1690 531 aa, chain + ## HITS:1 COG:trpD_2 KEGG:ns NR:ns ## COG: trpD_2 COG0547 # Protein_GI_number: 16129224 # Func_class: E Amino acid transport and metabolism # Function: Anthranilate phosphoribosyltransferase # Organism: Escherichia 
coli K12 # 197 531 1 335 335 603 100.0 1e-172 MADILLLDNIDSFTYNLADQLRSNGHNVVIYRNHIPAQTLIERLATMSNPVLMLSPGPGV PSEAGCMPELLTRLRGKLPIIGICLGHQAIVEAYGGYVGQAGEILHGKASSIEHDGQAMF AGLTNPLPVARYHSLVGSNIPAGLTINAHFNGMVMAVRHDADRVCGFQFHPESILTTQGA RLLEQTLAWAQQKLEPANTLQPILEKLYQAQTLSQQESHQLFSAVVRGELKPEQLAAALV SMKIRGEHPNEIAGAATALLENAAPFPRPDYLFADIVGTGGDGSNSINISTASAFVAAAC GLKVAKHGNRSVSSKSGSSDLLAAFGINLDMNADKSRQALDELGVCFLFAPKYHTGFRHA MPVRQQLKTRTLFNVLGPLINPAHPPLALIGVYSPELVLPIAETLRVLGYQRAAVVHSGG MDEVSLHAPTIVAELHDGEIKSYQLTAEDFGLTPYHQEQLAGGTPEENRDILTRLLQGKG DAAHEAAVAANVAMLMRLHGHEDLQANAQTVLEVLRSGSAYDRVTALAARG >gi|296494493|gb|ADTN01000245.1| GENE 30 31425 - 32783 1195 452 aa, chain + ## HITS:1 COG:trpC_1 KEGG:ns NR:ns ## COG: trpC_1 COG0134 # Protein_GI_number: 16129223 # Func_class: E Amino acid transport and metabolism # Function: Indole-3-glycerol phosphate synthase # Organism: Escherichia coli K12 # 1 253 2 254 254 505 100.0 1e-143 MQTVLAKIVADKAIWVEARKQQQPLASFQNEVQPSTRHFYDALQGARTAFILECKKASPS KGVIRDDFDPARIAAIYKHYASAISVLTDEKYFQGSFNFLPIVSQIAPQPILCKDFIIDP YQIYLARYYQADACLLMLSVLDDDQYRQLAAVAHSLEMGVLTEVSNEEEQERAIALGAKV VGINNRDLRDLSIDLNRTRELAPKLGHNVTVISESGINTYAQVRELSHFANGFLIGSALM AHDDLHAAVRRVLLGENKVCGLTRGQDAKAAYDAGAIYGGLIFVATSPRCVNVEQAQEVM AAAPLQYVGVFRNHDIADVVDKAKVLSLAAVQLHGNEEQLYIDTLREALPAHVAIWKALS VGETLPAREFQHVDKYVLDNGQGGSGQRFDWSLLNGQSLGNVLLAGGLGADNCVEAAQTG CAGLDFNSAVESQPGIKDARLLASVFQTLRAY >gi|296494493|gb|ADTN01000245.1| GENE 31 32795 - 33988 1356 397 aa, chain + ## HITS:1 COG:trpB KEGG:ns NR:ns ## COG: trpB COG0133 # Protein_GI_number: 16129222 # Func_class: E Amino acid transport and metabolism # Function: Tryptophan synthase beta chain # Organism: Escherichia coli K12 # 1 397 1 397 397 796 100.0 0 MTTLLNPYFGEFGGMYVPQILMPALRQLEEAFVSAQKDPEFQAQFNDLLKNYAGRPTALT KCQNITAGTNTTLYLKREDLLHGGAHKTNQVLGQALLAKRMGKTEIIAETGAGQHGVASA LASALLGLKCRIYMGAKDVERQSPNVFRMRLMGAEVIPVHSGSATLKDACNEALRDWSGS YETAHYMLGTAAGPHPYPTIVREFQRMIGEETKAQILEREGRLPDAVIACVGGGSNAIGM 
FADFINETNVGLIGVEPGGHGIETGEHGAPLKHGRVGIYFGMKAPMMQTEDGQIEESYSI SAGLDFPSVGPQHAYLNSTGRADYVSITDDEALEAFKTLCLHEGIIPALESSHALAHALK MMRENPDKEQLLVVNLSGRGDKDIFTVHDILKARGEI >gi|296494493|gb|ADTN01000245.1| GENE 32 33988 - 34794 392 268 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc [Roseobacter sp. AzwK-3b] # 1 249 1 248 263 155 35 5e-37 MERYESLFAQLKERKEGAFVPFVTLGDPGIEQSLKIIDTLIEAGADALELGIPFSDPLAD GPTIQNATLRAFAAGVTPAQCFEMLALIRQKHPTIPIGLLMYANLVFNKGIDEFYAQCEK VGVDSVLVADVPVEESAPFRQAALRHNVAPIFICPPNADDDLLRQIASYGRGYTYLLSRA GVTGAENRAALPLNHLVAKLKEYNAAPPLQGFGISAPDQVKAAIDAGAAGAISGSAIVKI IEQHINEPEKMLAALKVFVQPMKAATRS >gi|296494493|gb|ADTN01000245.1| GENE 33 35175 - 35354 214 59 aa, chain + ## HITS:1 COG:STM1728 KEGG:ns NR:ns ## COG: STM1728 COG3729 # Protein_GI_number: 16765072 # Func_class: R General function prediction only # Function: General stress protein # Organism: Salmonella typhimurium LT2 # 1 55 1 55 60 58 89.0 4e-09 MAEHRGGSGNFAEDREKASDAGRKGGQHSGGNFKNDPQRASEAGKKGGQQSGGNKSGKS >gi|296494493|gb|ADTN01000245.1| GENE 34 35440 - 35940 595 166 aa, chain + ## HITS:1 COG:yciF KEGG:ns NR:ns ## COG: yciF COG3685 # Protein_GI_number: 16129219 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 166 1 166 166 234 100.0 4e-62 MNMKTIEDVFIHLLSDTYSAEKQLTRALAKLARATSNEKLSQAFHAHLEETHGQIERIDQ VVESESNLKIKRMKCVAMEGLIEEANEVIESTEKNEVRDAALIAAAQKVEHYEIASYGTL ATLAEQLGYRKAAKLLKETLEEEKATDIKLTDLAINNVNKKAENKA >gi|296494493|gb|ADTN01000245.1| GENE 35 35986 - 36492 628 168 aa, chain + ## HITS:1 COG:yciE KEGG:ns NR:ns ## COG: yciE COG3685 # Protein_GI_number: 16129218 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 168 1 168 168 309 100.0 1e-84 MNRIEHYHDWLRDAHAMEKQAESMLESMASRIDNYPELRARIEQHLSETKNQIVQLETIL DRNDISRSVIKDSMSKMAALGQSIGGIFPSDEIVKGSISGYVFEQFEIACYTSLLAAAKN 
AGDTASIPTIEAILNEEKQMADWLIQNIPQTTEKFLIRSETDGVEAKK >gi|296494493|gb|ADTN01000245.1| GENE 36 36552 - 37106 665 184 aa, chain - ## HITS:1 COG:ompW KEGG:ns NR:ns ## COG: ompW COG3047 # Protein_GI_number: 16129217 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein W # Organism: Escherichia coli K12 # 1 184 29 212 212 358 100.0 3e-99 MRAGSATVRPTEGAGGTLGSLGGFSVTNNTQLGLTFTYMATDNIGVELLAATPFRHKIGT RATGDIATVHHLPPTLMAQWYFGDASSKFRPYVGAGINYTTFFDNGFNDHGKEAGLSDLS LKDSWGAAGQVGVDYLINRDWLVNMSVWYMDIDTTANYKLGGAQQHDSVRLDPWVFMFSA GYRF >gi|296494493|gb|ADTN01000245.1| GENE 37 37547 - 38290 498 247 aa, chain + ## HITS:1 COG:no KEGG:JW1247 NR:ns ## KEGG: JW1247 # Name: yciC # Def: predicted inner membrane protein # Organism: E.coli_J # Pathway: not_defined # 1 247 1 247 247 352 100.0 5e-96 MSITAQSVYRDTGNFFRNQFMTILLVSLLCAFITVVLGHVFSPSDAQLAQLNDGVPVSGS SGLFDLVQNMSPEQQQILLQASAASTFSGLIGNAILAGGVILIIQLVSAGQRVSALRAIG ASAPILPKLFILIFLTTLLVQIGIMLVVVPGIIMAILLALAPVMLVQDKMGVFASMRSSM RLTWANMRLVAPAVLSWLLAKTLLLLFASSFAALTPEIGAVLANTLSNLISAILLIYLFR LYMLIRQ >gi|296494493|gb|ADTN01000245.1| GENE 38 38320 - 38859 559 179 aa, chain + ## HITS:1 COG:ECs1754 KEGG:ns NR:ns ## COG: ECs1754 COG2917 # Protein_GI_number: 15831008 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Intracellular septation protein A # Organism: Escherichia coli O157:H7 # 1 179 1 179 179 291 100.0 6e-79 MKQFLDFLPLVVFFAFYKIYDIYAATAALIVATAIVLIYSWVRFRKVEKMALITFVLVVV FGGLTLFFHNDEFIKWKVTVIYALFAGALLVSQWVMKKPLIQRMLGKELTLPQPVWSKLN LAWAVFFILCGLANIYIAFWLPQNIWVNFKVFGLTALTLIFTLLSGIYIYRHMPQEDKS >gi|296494493|gb|ADTN01000245.1| GENE 39 38964 - 39362 326 132 aa, chain + ## HITS:1 COG:yciA KEGG:ns NR:ns ## COG: yciA COG1607 # Protein_GI_number: 16129214 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA hydrolase # Organism: Escherichia coli K12 # 1 132 1 132 132 264 100.0 3e-71 MSTTHNVPQGDLVLRTLAMPADTNANGDIFGGWLMSQMDIGGAILAKEIAHGRVVTVRVE 
GMTFLRPVAVGDVVCCYARCVQKGTTSVSINIEVWVKKVASEPIGQRYKATEALFKYVAV DPEGKPRALPVE >gi|296494493|gb|ADTN01000245.1| GENE 40 39402 - 40121 600 239 aa, chain - ## HITS:1 COG:tonB KEGG:ns NR:ns ## COG: tonB COG0810 # Protein_GI_number: 16129213 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein TonB, links inner and outer membranes # Organism: Escherichia coli K12 # 1 239 1 239 239 283 99.0 2e-76 MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADLEPPQA VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR PASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ >gi|296494493|gb|ADTN01000245.1| GENE 41 40345 - 40641 272 98 aa, chain + ## HITS:1 COG:ECs1751 KEGG:ns NR:ns ## COG: ECs1751 COG2350 # Protein_GI_number: 15831005 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 98 33 130 130 188 98.0 2e-48 MLYVIYAQDKADSLEKRLSVRPAHLARLQLLHDEGRLLTAGPMPAVDSNDPGAAGFTGST VIAEFESLEAAQAWADADPYVAAGVYEHVSVKPFKKVF >gi|296494493|gb|ADTN01000245.1| GENE 42 41061 - 42194 742 377 aa, chain + ## HITS:1 COG:ECs1750 KEGG:ns NR:ns ## COG: ECs1750 COG1226 # Protein_GI_number: 15831004 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, predicted NAD-binding component # Organism: Escherichia coli O157:H7 # 1 377 41 417 417 684 100.0 0 MSVNLLDIFHIKAFSELDLSLLANAPLFMLGVFLVLNSIGLLFRAKLAWAISIILLLIAL IYTLHFYPWLKFSIGFCIFTLVFLLILRKDFSHSSAAAGTIFAFISFTTLLFYSTYGALY LSEGFNPRIESLMTAFYFSIETMSTVGYGDIVPVSESARLFTISVIISGITVFATSMTSI FGPLIRGGFNKLVKGNNHTMHRKDHFIVCGHSILAINTILQLNQRGQNVTVISNLPEDDI KQLEQRLGDNADVIPGDSNDSSVLKKAGIDRCRAILALSDNDADNAFVVLSAKDMSSDVK TVLAVSDSKNLNKIKMVHPDIILSPQLFGSEILARVLNGEEINNDMLVSMLLNSGHGIFS DNDEQETKADSKESAQK >gi|296494493|gb|ADTN01000245.1| GENE 43 42249 - 42422 78 57 aa, chain - ## HITS:1 COG:no KEGG:EC55989_1347 NR:ns ## KEGG: EC55989_1347 # Name: not_defined # 
Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 57 38 94 94 83 100.0 3e-15 MKRSRTEVGRWRMQRQASRRKSRWLEGQSRRNMRIHSIRKCILNKQRNSLLFAIYNI >gi|296494493|gb|ADTN01000245.1| GENE 44 42565 - 44025 1131 486 aa, chain + ## HITS:1 COG:cls KEGG:ns NR:ns ## COG: cls COG1502 # Protein_GI_number: 16129210 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes # Organism: Escherichia coli K12 # 1 486 1 486 486 978 100.0 0 MTTVYTLVSWLAILGYWLLIAGVTLRILMKRRAVPSAMAWLLIIYILPLVGIIAYLAVGE LHLGKRRAERARAMWPSTAKWLNDLKACKHIFAEENSSVAAPLFKLCERRQGIAGVKGNQ LQLMTESDDVMQALIRDIQLARHNIEMVFYIWQPGGMADQVAESLMAAARRGIHCRLMLD SAGSVAFFRSPWPELMRNAGIEVVEALKVNLMRVFLRRMDLRQHRKMIMIDNYIAYTGSM NMVDPRYFKQDAGVGQWIDLMARMEGPIATAMGIIYSCDWEIETGKRILPPPPDVNIMPF EQASGHTIHTIASGPGFPEDLIHQALLTAAYSAREYLIMTTPYFVPSDDLLHAICTAAQR GVDVSIILPRKNDSMLVGWASRAFFTELLAAGVKIYQFEGGLLHTKSVLVDGELSLVGTV NLDMRSLWLNFEITLAIDDKGFGADLAAVQDDYISRSRLLDARLWLKRPLWQRVAERLFY FFSPLL >gi|296494493|gb|ADTN01000245.1| GENE 45 44060 - 44389 523 109 aa, chain + ## HITS:1 COG:yciU KEGG:ns NR:ns ## COG: yciU COG3099 # Protein_GI_number: 16129209 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 109 27 135 135 201 100.0 2e-52 MDMDLNNRLTEDETLEQAYDIFLELAADNLDPADVLLFNLQFEERGGAELFDPAEDWQEH VDFDLNPDFFAEVVIGLADSEDGEINDVFARILLCREKDHKLCHIIWRE >gi|296494493|gb|ADTN01000245.1| GENE 46 44442 - 45446 853 334 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 329 1 325 329 333 51 2e-90 MNAVTEGRKVLLEIADLKVHFEIKDGKQWFWQPPKTLKAVDGVTLRLYEGETLGVVGESG CGKSTFARAIIGLVKATDGHVAWLGKELLGMKPDEWRAVRSDIQMIFQDPLASLNPRMTI GEIIAEPLRTYHPKMSRQEVRERVKAMMLKVGLLPNLINRYPHEFSGGQCQRIGIARALI LEPKLIICDEPVSALDVSIQAQVVNLLQQLQREMGLSLIFIAHDLAVVKHISDRVLVMYL GHAVELGTYDEVYHNPLHPYTRALMSAVPIPDPDLEKNKTIQLLEGELPSPINPPSGCVF 
RTRCPIAGPECAKTRPVLEGSFRHAVSCLKVDPL >gi|296494493|gb|ADTN01000245.1| GENE 47 45443 - 46456 582 337 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 12 317 4 311 329 228 40 4e-59 MSVIETATVPLAQQQADALLNVKDLRVTFSTPDGDVTAVNDLNFSLRAGETLGIVGESGS GKSQTAFALMGLLAANGRIGGSATFNGREILNLPERELNKLRAEQISMIFQDPMTSLNPY MRVGEQLMEVLMLHKNMSKAEAFEESVRMLDAVKMPEARKRMKMYPHEFSGGMRQRVMIA MALLCRPKLLIADEPTTALDVTVQAQIMTLLNELKREFNTAIIMITHDLGVVAGICDKVL VMYAGRTMEYGNARDVFYQPVHPYSIGLLNAVPRLDAEGETMLTIPGNPPNLLRLPKGCP FQPRCPHAMEICSSAPPLEEFTPGRLRACFKPVEELL >gi|296494493|gb|ADTN01000245.1| GENE 48 46468 - 47376 954 302 aa, chain - ## HITS:1 COG:STM1744 KEGG:ns NR:ns ## COG: STM1744 COG1173 # Protein_GI_number: 16765088 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Salmonella typhimurium LT2 # 1 302 1 302 302 510 96.0 1e-144 MMLSKKNSETLENFSEKLEVEGRSLWQDARRRFMHNRAAVASLIVLVLIALFVILAPMLS QFAYDDTDWAMMSSAPDMESGHYFGTDSSGRDLLVRVAIGGRISLMVGVAAALVAVVVGT LYGSLSGYLGGKVDSVMMRLLEILNSFPFMFFVILLVTFFGQNILLIFVAIGMVSWLDMA RIVRGQTLSLKRKEFIEAAQVGGVSTPGIVIRHIVPNVLGVVVVYASLLVPSMILFESFL SFLGLGTQEPLSSWGALLSDGANSMEVSPWLLLFPAGFLVVTLFCFNFIGDGLRDALDPK DR >gi|296494493|gb|ADTN01000245.1| GENE 49 47391 - 48311 810 306 aa, chain - ## HITS:1 COG:ECs1744 KEGG:ns NR:ns ## COG: ECs1744 COG0601 # Protein_GI_number: 15830998 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Escherichia coli O157:H7 # 1 306 1 306 306 549 100.0 1e-156 MLKFILRRCLEAIPTLFILITISFFMMRLAPGSPFTGERTLPPEVMANIEAKYHLNDPIM TQYFSYLKQLAHGDFGPSFKYKDYSVNDLVASSFPVSAKLGAAAFFLAVILGVSAGVIAA LKQNTKWDYTVMGLAMTGVVIPSFVVAPLLVMIFAIILHWLPGGGWNGGALKFMILPMVA LSLAYIASIARITRGSMIEVLHSNFIRTARAKGLPMRRIILRHALKPALLPVLSYMGPAF 
VGIITGSMVIETIYGLPGIGQLFVNGALNRDYSLVLSLTILVGALTILFNAIVDVLYAVI DPKIRY >gi|296494493|gb|ADTN01000245.1| GENE 50 48397 - 50028 1710 543 aa, chain - ## HITS:1 COG:ECs1743 KEGG:ns NR:ns ## COG: ECs1743 COG4166 # Protein_GI_number: 15830997 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 1 543 1 543 543 1082 100.0 0 MTNITKRSLVAAGVLAALMAGNVALAADVPAGVTLAEKQTLVRNNGSEVQSLDPHKIEGV PESNISRDLFEGLLVSDLDGHPAPGVAESWDNKDAKVWTFHLRKDAKWSDGTPVTAQDFV YSWQRSVDPNTASPYASYLQYGHIAGIDEILEGKKPITDLGVKAIDDHTLEVTLSEPVPY FYKLLVHPSTSPVPKAAIEKFGEKWTQPGNIVTNGAYTLKDWVVNERIVLERSPTYWNNA KTVINQVTYLPIASEVTDVNRYRSGEIDMTYNNMPIELFQKLKKEIPDEVHVDPYLCTYY YEINNQKPPFNDVRVRTALKLGMDRDIIVNKVKAQGDMPAYGYTPPYTDGAKLTQPEWFG WSQEKRNEEAKKLLAEAGYTADKPLTINLLYNTSDLHKKLAIAASSLWKKNIGVNVKLVN QEWKTFLDTRHQGTFDVARAGWCADYNEPTSFLNTMLSNSSMNTAHYKSPAFDSIMAETL KVTDEAQRTALYTKAEQQLDKDSAIVPVYYYVNARLVKPWVGGYTGKDPLDNTYTRNMYI VKH >gi|296494493|gb|ADTN01000245.1| GENE 51 50447 - 50608 163 53 aa, chain + ## HITS:1 COG:no KEGG:ECS88_1310 NR:ns ## KEGG: ECS88_1310 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 12 53 63 104 104 76 100.0 2e-13 MSIFSDYSSSSEMHNNLTIDYYLALSSTKGSGITNIISIILQQAQDYDVAKIT >gi|296494493|gb|ADTN01000245.1| GENE 52 50766 - 51413 486 215 aa, chain - ## HITS:1 COG:ychE KEGG:ns NR:ns ## COG: ychE COG2095 # Protein_GI_number: 16129203 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Multiple antibiotic transporter # Organism: Escherichia coli K12 # 1 215 1 215 215 376 99.0 1e-104 MIQTFFDFPVYFKFFIGLFALVNPVGIIPVFISMTSYQTAATRNKTNLTANLSVAIILWI SLFLGDTILQLFGISIDSFRIAGGILVVTIAMSMISGKLGEDKQNKQEKSETAVRESIGV VPLALPLMAGPGAISSTIVWGTRYHSISYLFGFFVAIALFALCCWGLFRMAPWLVRVLRQ TGINVITRIMGLLLMALGIEFIVTGIKGIFPGLLN >gi|296494493|gb|ADTN01000245.1| GENE 53 51890 - 54565 2863 891 aa, chain + ## HITS:1 COG:ECs1741_2 KEGG:ns NR:ns ## COG: ECs1741_2 
COG1454 # Protein_GI_number: 15830995 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Escherichia coli O157:H7 # 448 872 1 425 444 860 100.0 0 MAVTNVAELNALVERVKKAQREYASFTQEQVDKIFRAAALAAADARIPLAKMAVAESGMG IVEDKVIKNHFASEYIYNAYKDEKTCGVLSEDDTFGTITIAEPIGIICGIVPTTNPTSTA IFKSLISLKTRNAIIFSPHPRAKDATNKAADIVLQAAIAAGAPKDLIGWIDQPSVELSNA LMHHPDINLILATGGPGMVKAAYSSGKPAIGVGAGNTPVVIDETADIKRAVASVLMSKTF DNGVICASEQSVVVVDSVYDAVRERFATHGGYLLQGKELKAVQDVILKNGALNAAIVGQP AYKIAELAGFSVPENTKILIGEVTVVDESEPFAHEKLSPTLAMYRAKDFEDAVEKAEKLV AMGGIGHTSCLYTDQDNQPARVSYFGQKMKTARILINTPASQGGIGDLYNFKLAPSLTLG CGSWGGNSISENVGPKHLINKKTVAKRAENMLWHKLPKSIYFRRGSLPIALDEVITDGHK RALIVTDRFLFNNGYADQITSVLKAAGVETEVFFEVEADPTLSIVRKGAELANSFKPDVI IALGGGSPMDAAKIMWVMYEHPETHFEELALRFMDIRKRIYKFPKMGVKAKMIAVTTTSG TGSEVTPFAVVTDDATGQKYPLADYALTPDMAIVDANLVMDMPKSLCAFGGLDAVTHAME AYVSVLASEFSDGQALQALKLLKEYLPASYHEGSKNPVARERVHSAATIAGIAFANAFLG VCHSMAHKLGSQFHIPHGLANALLICNVIRYNANDNPTKQTAFSQYDRPQARRRYAEIAD HLGLSAPGDRTAAKIEKLLAWLETLKAELGIPKSIREAGVQEADFLANVDKLSEDAFDDQ CTGANPRYPLISELKQILLDTYYGRDYVEGETATKKEAAPAKAEKKAKKSA Prediction of potential genes in microbial genomes Time: Sun May 15 23:58:54 2011 Seq name: gi|296494492|gb|ADTN01000246.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont650.1, whole genome shotgun sequence Length of sequence - 4752 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 53 - 112 4.0 1 1 Tu 1 . + CDS 164 - 1375 1116 ## COG0477 Permeases of the major facilitator superfamily + Prom 1390 - 1449 3.1 2 2 Tu 1 . + CDS 1477 - 2016 615 ## COG3122 Uncharacterized protein conserved in bacteria 3 3 Op 1 12/0.000 - CDS 2240 - 3073 698 ## COG0627 Predicted esterase - Term 3119 - 3151 1.3 4 3 Op 2 2/0.000 - CDS 3167 - 4276 1185 ## COG1062 Zn-dependent alcohol dehydrogenases, class III 5 3 Op 3 . 
- CDS 4311 - 4586 281 ## COG1937 Uncharacterized protein conserved in bacteria - Prom 4608 - 4667 5.8 Predicted protein(s) >gi|296494492|gb|ADTN01000246.1| GENE 1 164 - 1375 1116 403 aa, chain + ## HITS:1 COG:mhpT KEGG:ns NR:ns ## COG: mhpT COG0477 # Protein_GI_number: 16128338 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 13 403 28 418 418 576 100.0 1e-164 MSTRTPSSSSSRLMLTIGLCFLVALMEGLDLQAAGIAAGGIAQAFALDKMQMGWIFSAGI LGLLPGALVGGMLADRYGRKRILIGSVALFGLFSLATAIAWDFPSLVFARLMTGVGLGAA LPNLIALTSEAAGPRFRGTAVSLMYCGVPIGAALAATLGFAGANLAWQTVFWVGGVVPLI LVPLLMRWLPESAVFAGEKQSAPPLRALFAPETATATLLLWLCYFFTLLVVYMLINWLPL LLVEQGFQPSQAAGVMFALQMGAASGTLMLGALMDKLRPVTMSLLIYSGMLASLLALGTV SSFNGMLLAGFVAGLFATGGQSVLYALAPLFYSSQIRATGVGTAVAVGRLGAMSGPLLAG KMLALGTGTVGVMAASAPGILVAGLAVFILMSRRSRIQPCADA >gi|296494492|gb|ADTN01000246.1| GENE 2 1477 - 2016 615 179 aa, chain + ## HITS:1 COG:yaiL KEGG:ns NR:ns ## COG: yaiL COG3122 # Protein_GI_number: 16128339 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 179 40 218 218 295 99.0 4e-80 MAKLTLQEQFLKAGLVTSKKAAKVERTAKKSRVQAREARAAVEENKKAQLERDKQLSEQQ KQAALAKEYKAQVKQLIEMNRITIANGDIGFNFTDGNLIKKIFVDKLTQAQLINGRLAIA RLLVDNNSEGEYAIIPASVADKIAQRDASSIVLHSALSAEEQDEDDPYADFKVPDDLMW >gi|296494492|gb|ADTN01000246.1| GENE 3 2240 - 3073 698 277 aa, chain - ## HITS:1 COG:yaiM KEGG:ns NR:ns ## COG: yaiM COG0627 # Protein_GI_number: 16128340 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Escherichia coli K12 # 1 277 1 277 277 569 99.0 1e-162 MELIEKHVSFGGWQNVYRHYSQSLKCEMNVGVYLPPKAANEKLPVLYWLSGLTCNEQNFI TKSGMQRYAAEHNIIVVAPDTSPRGSHVADADRYDLGQGAGFYLNATQAPWNEHYKMYDY IRNELPDLVMHHFPATAKKSISGHSMGGLGALVLALRNPDEYVSVSAFSPIVSPSQVPWG 
QQAFAAYLAENKDAWLDYDPVSLISQGQRVAEIMVDQGLSDDFYAEQLRTPNLEKICQEM NIKTLIRYHEGYDHSYYFVSSFIGEHIAYHANKLNMR >gi|296494492|gb|ADTN01000246.1| GENE 4 3167 - 4276 1185 369 aa, chain - ## HITS:1 COG:ECs0411 KEGG:ns NR:ns ## COG: ECs0411 COG1062 # Protein_GI_number: 15829665 # Func_class: C Energy production and conversion # Function: Zn-dependent alcohol dehydrogenases, class III # Organism: Escherichia coli O157:H7 # 1 369 1 369 369 726 99.0 0 MKSRAAVAFAPGKPLEIVEIDVAPPKKGEVLIKVTHTGVCHTDAFTLSGDDPEGVFPVVL GHEGAGVVVEVGEGVTSVKPGDHVIPLYTAECGECEFCRSGKTNLCVAVRETQGKGLMPD GTTRFSYNGQPLYHYMGCSTFSEYTVVAEVSLAKINPEANHEHVCLLGCGVTTGIGAVHN TAKVQPGDSVAVFGLGAIGLAVVQGARQAKAGRIIAIDTNPKKFDLARRFGATDCINPND YDKPIKDVLLDINKWGIDHTFECIGNVNVMRAALESAHRGWGQSVIIGVAGAGQEISTRP FQLVTGRVWKGSAFGGVKGRSQLPGMVEDAMKGDIDLEPFVTHTMSLDEINDAFDLMHEG KSIRTVIRY >gi|296494492|gb|ADTN01000246.1| GENE 5 4311 - 4586 281 91 aa, chain - ## HITS:1 COG:ECs0412 KEGG:ns NR:ns ## COG: ECs0412 COG1937 # Protein_GI_number: 15829666 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 91 8 98 98 154 100.0 4e-38 MPSTPEEKKKVLTRVRRIRGQIDALERSLEGDAECRAILQQIAAVRGAANGLMAEVLESH IRETFDRNDCYSREVSQSVDDTIELVRAYLK Prediction of potential genes in microbial genomes Time: Sun May 15 23:58:58 2011 Seq name: gi|296494491|gb|ADTN01000247.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont650.2, whole genome shotgun sequence Length of sequence - 7328 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 3, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 53 - 826 812 ## JW0349 hypothetical protein 2 1 Op 2 . - CDS 828 - 1538 442 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 3 1 Op 3 1/0.000 - CDS 1387 - 2583 576 ## COG1215 Glycosyltransferases, probably involved in cell wall biogenesis 4 1 Op 4 . 
- CDS 2593 - 3264 348 ## COG2120 Uncharacterized proteins, LmbE homologs - Prom 3477 - 3536 7.9 5 2 Tu 1 . - CDS 3702 - 3926 102 ## EcE24377A_0388 hypothetical protein + Prom 3741 - 3800 4.1 6 3 Op 1 6/0.000 + CDS 3880 - 4842 1151 ## COG4521 ABC-type taurine transport system, periplasmic component 7 3 Op 2 7/0.000 + CDS 4855 - 5622 267 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 8 3 Op 3 5/0.000 + CDS 5619 - 6446 1005 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 9 3 Op 4 . + CDS 6443 - 7294 800 ## COG2175 Probable taurine catabolism dioxygenase Predicted protein(s) >gi|296494491|gb|ADTN01000247.1| GENE 1 53 - 826 812 257 aa, chain - ## HITS:1 COG:no KEGG:JW0349 NR:ns ## KEGG: JW0349 # Name: yaiO # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 257 1 257 257 484 100.0 1e-135 MIKRTLLAAAIFSALPAYAGLTSITAGYDFTDYSGDHGNRNLAYAELVAKVENATLLFNL SQGRRDYETEHFNATRGQGAVWYKWNNWLTTRTGIAFADNTPVFARQDFRQDINLALLPK TLFTTGYRYTKYYDDVEVDAWQGGVSLYTGPVITSYRYTHYDSSDAGGSYSNMISVRLND PRGTGYTQLWLSRGTGAYTYDWTPETRYGSMKSVSLQRIQPLTEQLNLGLTAGKVWYDTP TDDFNGLQLAARLTWKF >gi|296494491|gb|ADTN01000247.1| GENE 2 828 - 1538 442 236 aa, chain - ## HITS:1 COG:b0359 KEGG:ns NR:ns ## COG: b0359 COG0110 # Protein_GI_number: 16128344 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Escherichia coli K12 # 100 236 11 147 147 269 100.0 3e-72 MPSGLFMDLLPFLLDANLSATNPPAIPHWWKRQPLIPNLLSQELKNYLKLNVKEKNIQIA DQVIIDETAGEVVIGANTRICHGAVIQGPVVIGANCLIGNYAFIRPGTIISNGVKIGFAT EIKNAVIEAEATIGPQCFIADSVVANQAYLGAQVRTSNHRLDEQPVSVRTPEGIIATGCD KLGCYIGQRSRLGVQVIILPGRIISPNTQLGPRVIVERNLPTGTYSLRQELIRTGD >gi|296494491|gb|ADTN01000247.1| GENE 3 1387 - 2583 576 398 aa, chain - ## HITS:1 COG:yaiP KEGG:ns NR:ns ## COG: yaiP COG1215 # Protein_GI_number: 16128348 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases, probably involved in cell 
wall biogenesis # Organism: Escherichia coli K12 # 1 398 1 398 398 802 100.0 0 MKTWIFICMSIAMLLWFLSTLRRKPSQKKGCIDAIIPAYNEGPCLAQSLDNLLRNPYFCR VICVNDGSTDNTEAVMAEVKRKWGDRFVAVTQKNTGKGGALMNGLNYATCDQVFLSDADT YVPPDQDGMGYMLAEIERGADAVGGIPSTALKGAGLLPHIRATVKLPMIVMKRTLQQLLG GAPFIISGACGMFRTDVLRKFGFSDRTKVEDLDLTWTLVANGYRIRQANRCIVYPQECNS PREEWRRWRRWIVGYAVCMRLHKRLLFSRFGIFSIFPMLLVVLYGVGIYLTTWFNEFITT GPHGVVLAMFPLIWVGVVCVIGAFSAWFHRCWLLVPLAPLSVVYVLLAYAIWIIYGLIAF FTGREPQRDKPTRYSALVEASTAYSQPSVTGTEKLSEA >gi|296494491|gb|ADTN01000247.1| GENE 4 2593 - 3264 348 223 aa, chain - ## HITS:1 COG:yaiS KEGG:ns NR:ns ## COG: yaiS COG2120 # Protein_GI_number: 16128349 # Func_class: S Function unknown # Function: Uncharacterized proteins, LmbE homologs # Organism: Escherichia coli K12 # 50 185 1 136 136 273 98.0 1e-73 MDKVLDSALLSSANKRKGILAIGAHPDDIELGCGASLARLAQKGIYIAAVVMTTGNSGTD GIIDRHEEARNALKILGCHQTIHLNFADTRAHLQLNDMISALEDIIKNQIPSDVEIMRVY TMHDADRHQDHLAVYQASMVACRTIPQILGYETPSTWLSFMPQVFESVKEEYFTVKLAAL KKHKSQERRDYMRHDRLRAVAQFRGQQVNSDLGEGFVIHKMIL >gi|296494491|gb|ADTN01000247.1| GENE 5 3702 - 3926 102 74 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_0388 NR:ns ## KEGG: EcE24377A_0388 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 1 74 1 74 74 132 98.0 4e-30 MNASAARSVLRDEIAMIVCSPVLLWEQYSGIKTFIKRISRYRTDDFILSKKTAKERLNMQ LLRSNNWENYYTSF >gi|296494491|gb|ADTN01000247.1| GENE 6 3880 - 4842 1151 320 aa, chain + ## HITS:1 COG:tauA KEGG:ns NR:ns ## COG: tauA COG4521 # Protein_GI_number: 16128350 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type taurine transport system, periplasmic component # Organism: Escherichia coli K12 # 1 320 20 339 339 587 100.0 1e-167 MAISSRNTLLAALAFIAFQAQAVNVTVAYQTSAEPAKVAQADNTFAKESGATVDWRKFDS GASIVRALASGDVQIGNLGSSPLAVAASQQVPIEVFLLASKLGNSEALVVKKTISKPEDL IGKRIAVPFISTTHYSLLAALKHWGIKPGQVEIVNLQPPAIIAAWQRGDIDGAYVWAPAV NALEKDGKVLTDSEQVGQWGAPTLDVWVVRKDFAEKHPEVVKAFAKSAIDAQQPYIANPD 
VWLKQPENISKLARLSGVPEGDVPGLVKGNTYLTPQQQTAELTGPVNKAIIDTAQFLKEQ GKVPAVANDYSQYVTSRFVQ >gi|296494491|gb|ADTN01000247.1| GENE 7 4855 - 5622 267 255 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 205 1 215 245 107 31 3e-23 MLQISHLYADYGGKPALEDINLTLESGELLVVLGPSGCGKTTLLNLIAGFVPYQHGSIQL AGKRIEGPGAERGVVFQNEGLLPWRNVQDNVAFGLQLAGIEKMQRLEIAHQMLKKVGLEG AEKRYIWQLSGGQRQRVGIARALAANPQLLLLDEPFGALDAFTRDQMQTLLLKLWQETGK QVLLITHDIEEAVFMATELVLLSSGPGRVLERLPLNFARRFVAGESSRSIKSDPQFIAMR EYVLSRVFEQREAFS >gi|296494491|gb|ADTN01000247.1| GENE 8 5619 - 6446 1005 275 aa, chain + ## HITS:1 COG:tauC KEGG:ns NR:ns ## COG: tauC COG0600 # Protein_GI_number: 16128352 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Escherichia coli K12 # 1 275 1 275 275 431 100.0 1e-120 MSVLINEKLHSRRLKWRWPLSRQVTLSIGTLAVLLTVWWTVATLQLISPLFLPPPQQVLE KLLTIAGPQGFMDATLWQHLAASLTRIMLALFAAVLFGIPVGIAMGLSPTVRGILDPIIE LYRPVPPLAYLPLMVIWFGIGETSKILLIYLAIFAPVAMSALAGVKSVQQVRIRAAQSLG ASRAQVLWFVILPGALPEILTGLRIGLGVGWSTLVAAELIAATRGLGFMVQSAGEFLATD VVLAGIAVIAIIAFLLELGLRALQRRLTPWHGEVQ >gi|296494491|gb|ADTN01000247.1| GENE 9 6443 - 7294 800 283 aa, chain + ## HITS:1 COG:tauD KEGG:ns NR:ns ## COG: tauD COG2175 # Protein_GI_number: 16128353 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Probable taurine catabolism dioxygenase # Organism: Escherichia coli K12 # 1 283 1 283 283 564 100.0 1e-161 MSERLSITPLGPYIGAQISGADLTRPLSDNQFEQLYHAVLRHQVVFLRDQAITPQQQRAL AQRFGELHIHPVYPHAEGVDEIIVLDTHNDNPPDNDNWHTDVTFIETPPAGAILAAKELP STGGDTLWTSGIAAYEALSVPFRQLLSGLRAEHDFRKSFPEYKYRKTEEEHQRWREAVAK NPPLLHPVVRTHPVSGKQALFVNEGFTTRIVDVSEKESEALLSFLFAHITKPEFQVRWRW QPNDIAIWDNRVTQHYANADYLPQRRIMHRATILGDKPFYRAG Prediction of potential genes in microbial genomes Time: Sun May 15 23:59:07 2011 Seq name: gi|296494490|gb|ADTN01000248.1| Escherichia coli MS 146-1 
E_coli146-1-1.0_Cont650.3, whole genome shotgun sequence Length of sequence - 13244 bp Number of predicted genes - 14, with homology - 12 Number of transcription units - 11, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 81 - 1055 1067 ## COG0113 Delta-aminolevulinic acid dehydratase - Prom 1082 - 1141 3.2 2 2 Tu 1 . - CDS 1225 - 1308 64 ## - Prom 1329 - 1388 5.5 3 3 Tu 1 . - CDS 1430 - 1621 134 ## - Prom 1659 - 1718 2.8 4 4 Op 1 . + CDS 1579 - 4485 2593 ## COG3468 Type V secretory pathway, adhesin AidA + Prom 4490 - 4549 1.7 5 4 Op 2 . + CDS 4573 - 5196 370 ## SSON_0350 putative DNA-binding transcriptional regulator - Term 5139 - 5187 10.6 6 5 Tu 1 . - CDS 5197 - 6354 1009 ## COG1680 Beta-lactamase class C and other penicillin binding proteins - Prom 6430 - 6489 3.9 + Prom 6389 - 6448 3.1 7 6 Op 1 . + CDS 6492 - 6686 65 ## ECIAI1_0372 hypothetical protein 8 6 Op 2 . + CDS 6706 - 7926 1230 ## COG1133 ABC-type long-chain fatty acid transport system, fused permease and ATPase components 9 6 Op 3 . + CDS 7939 - 9033 983 ## JW0369 predicted DNA-binding transcriptional regulator 10 7 Tu 1 . - CDS 9092 - 9400 316 ## APECO1_1629 hypothetical protein - Prom 9451 - 9510 3.8 + Prom 9521 - 9580 5.7 11 8 Tu 1 . + CDS 9660 - 9872 334 ## UTI89_C0397 hypothetical protein - Term 9850 - 9887 5.1 12 9 Tu 1 . - CDS 9896 - 10990 1168 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes + Prom 11210 - 11269 8.0 13 10 Tu 1 . + CDS 11453 - 11713 280 ## SSON_0357 hypothetical protein + Prom 11715 - 11774 5.8 14 11 Tu 1 . 
+ CDS 11814 - 13229 1397 ## COG1785 Alkaline phosphatase Predicted protein(s) >gi|296494490|gb|ADTN01000248.1| GENE 1 81 - 1055 1067 324 aa, chain - ## HITS:1 COG:ECs0423 KEGG:ns NR:ns ## COG: ECs0423 COG0113 # Protein_GI_number: 15829677 # Func_class: H Coenzyme transport and metabolism # Function: Delta-aminolevulinic acid dehydratase # Organism: Escherichia coli O157:H7 # 1 324 12 335 335 620 100.0 1e-178 MTDLIQRPRRLRKSPALRAMFEETTLSLNDLVLPIFVEEEIDDYKAVEAMPGVMRIPEKH LAREIERIANAGIRSVMTFGISHHTDETGSDAWREDGLVARMSRICKQTVPEMIVMSDTC FCEYTSHGHCGVLCEHGVDNDATLENLGKQAVVAAAAGADFIAPSAAMDGQVQAIRQALD AAGFKDTAIMSYSTKFASSFYGPFREAAGSALKGDRKSYQMNPMNRREAIRESLLDEAQG ADCLMVKPAGAYLDIVRELRERTELPIGAYQVSGEYAMIKFAALAGAIDEEKVVLESLGS IKRAGADLIFSYFALDLAEKKILR >gi|296494490|gb|ADTN01000248.1| GENE 2 1225 - 1308 64 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNMQQSFNYSIYSSATKLQKIITNSYK >gi|296494490|gb|ADTN01000248.1| GENE 3 1430 - 1621 134 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLIVILQVSFSRSAFVIPPQGRNKLIHQLSMHRKHIMFRANDYAALQTNGSIILSPKSEC GHM >gi|296494490|gb|ADTN01000248.1| GENE 4 1579 - 4485 2593 968 aa, chain + ## HITS:1 COG:yaiU KEGG:ns NR:ns ## COG: yaiU COG3468 # Protein_GI_number: 16128359 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 502 968 1 467 467 729 99.0 0 MHSWKKKLVVSQLALACTLAITSQANAANYDTWTYIDNPVTALDWDHMDKAGTVDGNYVN YSGFVYYNNTNGDFDQSFNGDTVNGTISTYYLNHDYADSTANQLDISNSVIHGSITSMLP GGYYDRFDADGNNLGGYDFYTDAVVDTHWRDGDVFTLNIANTTIDDDYEALYFTDSYKDG DVTKHTNETFDTSEGVAVNLDVESNINISNNSRVAGIALSQGNTYNETYTTESHTWDNNI SVKDSTVTSGSNYILDSNTYGKTGHFGNSDEPSDYAGPGDVAMSFTASGSDYAMKNNVFL SNSTLMGDVAFTSTWNSNFDPNGHDSNGDGVKDTNGGWTDDSLNVDELNLTLDNGSKWVG QAIYNVAETSAMYDVATNSLTPDATYENNDWKRVVDDKVFQSGVFNVALNNGSEWDTTGR SIVDTLTVNNGSQVNVSESKLTSDTIDLTNGSSLNIGEDGYVDTDHLTINSYSTVALTES TGWGADYNLYANTITVTNGGVLDVNVDQFDTEAFRTDKLELTSGNIADHNGNVVSGVFDI 
HSSDYVLNADLVNDRTWDTSKSNYGYGIVAMNSDGHLTINGNGDVDNGTELDNSSVDNVV AATGNYKVRIDNATGAGAIADYKDKEIIYVNDVNSNATFSAANKADLGAYTYQAEQRGNT VVLQQMELTDYANMALSIPSANTNIWNLEQDTVGTRLTNSRHGLADNGGAWVSYFGGNFN GDNGTINYDQDVNGIMVGVDTKIDGNNAKWIVGTAAGFAKGDMNDRSGQVDQDSQTAYIY SSAHFANNVFVDGSLSYSHFNNDLSATMSNGTYVDGSTNSDAWGFGLKAGYDFKLGDAGY VTPYGSVSGLFQSGDDYQLSNDMKVDGQSYDSMRYELGVDAGYTFTYSEDQALTPYFKLA YVYDDSNNDNDVNGDSIDNGTEGSAVRVGLGTQFSFTKNFSAYTDANYLGGGDVDQDWSA NVGVKYTW >gi|296494490|gb|ADTN01000248.1| GENE 5 4573 - 5196 370 207 aa, chain + ## HITS:1 COG:no KEGG:SSON_0350 NR:ns ## KEGG: SSON_0350 # Name: yaiV # Def: putative DNA-binding transcriptional regulator # Organism: S.sonnei # Pathway: not_defined # 1 207 16 222 222 418 100.0 1e-116 MLSVVKPLQEFGKLDKCLSRYGTRFEFNNEKQVIFSSDVNNEDTFVILEGVISLRREENV LIGITQAPYIMGLADGLMKNDIPYKLISEGNCTGYHLPAKQTITLIEQNQLWRDAFYWLA WQNRILELRDVQLIGHNSYEQIRATLLSMIDWNEELRSRIGVMNYIHQRTRISRSVVAEV LAALRKGGYIEMNKGKLVAINRLPSEY >gi|296494490|gb|ADTN01000248.1| GENE 6 5197 - 6354 1009 385 aa, chain - ## HITS:1 COG:ECs0426 KEGG:ns NR:ns ## COG: ECs0426 COG1680 # Protein_GI_number: 15829680 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Escherichia coli O157:H7 # 1 385 1 385 385 760 100.0 0 MKRSLLFSAVLCAASLTSVHAAQPITEPEFASDIVDRYADHIFYGSGATGMALVVIDGNQ RVFRSYGETRPGNNVRPQLDSVVRIASLTKLMTSEMLVKLLDQGTVKLNDPLSKYAPPGA RVPTYNGTPITLVNLATHTSALPREQPGGAAHRPVFVWPTREQRWKYLSTAKLKAAPGSQ AAYSNLAFDLLADALANASGKPYTQLFEEQITRPLGMKDTTYTPSPDQCRRLMVAERGAS PCNNTLAAIGSGGVYSTPGDMMRWMQQYLSSDFYQRSNQADRMQTLIYQRAQFTKVIGMD VPGKADALGLGWVYMAPKEGRPGIIQKTGGGGGFITYMAMIPQKNIGAFVVVTRSPLTRF KNMSDGINDLVTELSGNKPLVIPAS >gi|296494490|gb|ADTN01000248.1| GENE 7 6492 - 6686 65 64 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_0372 NR:ns ## KEGG: ECIAI1_0372 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 64 1 64 64 119 100.0 3e-26 MPAFVILRVLIRLCKQSLVCEIRRNNSYGQAGELSALLWVTMGGFIWLTLCSGHAVNTQQ LLKR 
>gi|296494490|gb|ADTN01000248.1| GENE 8 6706 - 7926 1230 406 aa, chain + ## HITS:1 COG:ECs0427 KEGG:ns NR:ns ## COG: ECs0427 COG1133 # Protein_GI_number: 15829681 # Func_class: I Lipid transport and metabolism # Function: ABC-type long-chain fatty acid transport system, fused permease and ATPase components # Organism: Escherichia coli O157:H7 # 1 406 1 406 406 757 100.0 0 MFKSFFPKPGTFFLSAFVWALIAVIFWQAGGGDWVARITGASGQIPISAARFWSLDFLIF YAYYIVCVGLFALFWFIYSPHRWQYWSILGTALIIFVTWFLVEVGVAVNAWYAPFYDLIQ TALSSPHKVTIEQFYREVGVFLGIALIAVVISVLNNFFVSHYVFRWRTAMNEYYMANWQQ LRHIEGAAQRVQEDTMRFASTLENMGVSFINAIMTLIAFLPVLVTLSAHVPELPIIGHIP YGLVIAAIVWSLMGTGLLAVVGIKLPGLEFKNQRVEAAYRKELVYGEDDATRATPPTVRE LFSAVRKNYFRLYFHYMYFNIARILYLQVDNVFGLFLLFPSIVAGTITLGLMTQITNVFG QVRGAFQYLINSWTTLVELMSIYKRLRSFEHELDGDKIQEVTHTLS >gi|296494490|gb|ADTN01000248.1| GENE 9 7939 - 9033 983 364 aa, chain + ## HITS:1 COG:no KEGG:JW0369 NR:ns ## KEGG: JW0369 # Name: yaiW # Def: predicted DNA-binding transcriptional regulator # Organism: E.coli_J # Pathway: not_defined # 1 364 1 364 364 687 100.0 0 MSRVNPLSSLSLLAVLVLAGCSSQAPQPLKKGEKAIDVASVVRQKMPASVKDRDAWAKDL ATTFESQGLAPTLENVCSVLAVAQQESNYQADPAVPGLSKIAWQEIDRRAERMHIPAFLV HTALKIKSPNGKSYSERLDSVRTEKQLSAIFDDLINMVPMGQTLFGSLNPVRTGGPMQVS IAFAEQHTKGYPWKMDGTVRQEVFSRRGGLWFGTYHLLNYPASYSAPIYRFADFNAGWYA SRNAAFQNAVSKASGVKLALDGDLIRYDSKEPGKTELATRKLAAKLGMSDSEIRRQLEKG DSFSFEETALYKKVYQLAETKTGKSLPREMLPGIQLESPKITRNLTTAWFAKRVDERRAR CMKQ >gi|296494490|gb|ADTN01000248.1| GENE 10 9092 - 9400 316 102 aa, chain - ## HITS:1 COG:no KEGG:APECO1_1629 NR:ns ## KEGG: APECO1_1629 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 102 1 102 102 200 100.0 1e-50 MADFTLSKSLFSGKYRNASSTPGNIAYALFVLFCFWAGAQLLNLLVHAPGVYERLMQVQE TGRPRVEIGLGVGTIFGLIPFLVGCLIFAVVALWLHWRHRRQ >gi|296494490|gb|ADTN01000248.1| GENE 11 9660 - 9872 334 70 aa, chain + ## HITS:1 COG:no KEGG:UTI89_C0397 NR:ns ## KEGG: UTI89_C0397 # Name: yaiZ # Def: hypothetical protein # Organism: E.coli_UTI89 # 
Pathway: not_defined # 1 70 45 114 114 138 100.0 5e-32 MNLPVKIRRDWHYYAFAIGLIFILNGVVGLLGFEAKGWQTYAVGLVTWVISFWLAGLIIR RRDEETENAQ >gi|296494490|gb|ADTN01000248.1| GENE 12 9896 - 10990 1168 364 aa, chain - ## HITS:1 COG:ECs0431 KEGG:ns NR:ns ## COG: ECs0431 COG1181 # Protein_GI_number: 15829685 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Escherichia coli O157:H7 # 1 364 1 364 364 728 100.0 0 MEKLRVGIVFGGKSAEHEVSLQSAKNIVDAIDKSRFDVVLLGIDKQGQWHVSDASNYLLN ADDPAHIALRPSATSLAQVPGKHEHQLIDAQNGQPLPTVDVIFPIVHGTLGEDGSLQGML RVANLPFVGSDVLASAACMDKDVTKRLLRDAGLNIAPFITLTRANRHNISFAEVESKLGL PLFVKPANQGSSVGVSKVTSEEQYAIAVDLAFEFDHKVIVEQGIKGREIECAVLGNDNPQ ASTCGEIVLTSDFYAYDTKYIDEDGAKVVVPAAIAPEINDKIRAIAVQAYQTLGCAGMAR VDVFLTPENEVVINEINTLPGFTNISMYPKLWQASGLGYTDLITRLIELALERHAADNAL KTTM >gi|296494490|gb|ADTN01000248.1| GENE 13 11453 - 11713 280 86 aa, chain + ## HITS:1 COG:no KEGG:SSON_0357 NR:ns ## KEGG: SSON_0357 # Name: yaiB # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 86 1 86 86 132 100.0 5e-30 MKNLIAELLFKLAQKEEESKELCAQVEALEIIVTAMLRNMAQNDQQRLIDQVEGALYEVK PDASIPDDDTELLRDYVKKLLKHPRQ >gi|296494490|gb|ADTN01000248.1| GENE 14 11814 - 13229 1397 471 aa, chain + ## HITS:1 COG:phoA KEGG:ns NR:ns ## COG: phoA COG1785 # Protein_GI_number: 16128368 # Func_class: P Inorganic ion transport and metabolism # Function: Alkaline phosphatase # Organism: Escherichia coli K12 # 1 471 24 494 494 870 99.0 0 MKQSTIALALLPLLFTPVTKARTPEMPVLENRAAQGDITAPGGARRLTGDQTAALRDSLS DKPAKNIILLIGDGMGDSEITAARNYAEGAGGFFKGIDALPLTGQYTHYALNKKTGKPDY VTDSAASATAWSTGVKTYNGALGVDIHEKDHPTILEMAKAAGLATGNVSTAELQDATPAA LVAHVTSRKCYGPSATSEKCPGNALEKGGKGSITEQLLNARADVTLGGGAKTFAETATAG EWQGKTLREQAQARGYQLVSDAASLNSVTEANQQKPLLGLFADGNMPVRWLGPKATYHGN IDKPAVTCTPNPQRNDSVPTLAQMTDKAIELLSKNEKGFFLQVEGASIDKQDHAANPCGQ IGETVDLDEAVQRALEFAKKEGNTLVIVTADHAHASQIVAPDTKAPGLTQALNTKDGAVM VMSYGNSEEDSQEHTGSQLRIAAYGPHAANVVGLTDQTDLFYTMKAALGLK Prediction of 
potential genes in microbial genomes Time: Sun May 15 23:59:30 2011 Seq name: gi|296494489|gb|ADTN01000249.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont650.4, whole genome shotgun sequence Length of sequence - 1635 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 103 - 423 220 ## APECO1_1624 hypothetical protein + Prom 453 - 512 3.2 2 2 Tu 1 . + CDS 540 - 1635 669 ## COG2199 FOG: GGDEF domain Predicted protein(s) >gi|296494489|gb|ADTN01000249.1| GENE 1 103 - 423 220 106 aa, chain + ## HITS:1 COG:no KEGG:APECO1_1624 NR:ns ## KEGG: APECO1_1624 # Name: psiF # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 106 7 112 112 161 100.0 5e-39 MKITLLVTLLFGLVFLTTVGAAERTLTPQQQRMTSCNQQATAQALKGDARKTYMSDCLKN SKSAPGEKSLTPQQQKMRECNNQATQQSLKGDDRNKFMSACLKKAA >gi|296494489|gb|ADTN01000249.1| GENE 2 540 - 1635 669 365 aa, chain + ## HITS:1 COG:yaiC_2 KEGG:ns NR:ns ## COG: yaiC_2 COG2199 # Protein_GI_number: 16128370 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 196 365 1 170 171 355 100.0 1e-97 MNDENFFKKAAAHGGGTPLTPQNEHQRSGLRFARRVRLPRAVGLAGMFLPIASTLVSHPP PGWWWLVLVGWAFVWPHLAWQIASRAVDPLSREIYNLKTDAVLAGMWVGVMGVNVLPSTA MLMIMCLNLMGAGGPRLFVAGLVLMVVSCLVTLELTGITVSFNSAPLEWWLSLPIIVIYP LLFGWVSYQTATKLAEHKRRLQVMSTRDGMTGVYNRRHWETMLRNEFDNCRRHNRDATLL IIDIDHFKSINDTWGHDVGDEAIVALTRQLQITLRGSDVIGRFGGDEFAVIMSGTPAESA ITAMLRVHEGLNTLRLPNTPQVTLRISVGVAPLNPQMSHYREWLKSADLALYKAKKAGRN RTEVA Prediction of potential genes in microbial genomes Time: Sun May 15 23:59:37 2011 Seq name: gi|296494488|gb|ADTN01000250.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont650.5, whole genome shotgun sequence Length of sequence - 16131 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 10, operones - 5 average op.length - 2.2 N Tu/Op Conserved S Start End Score 
pairs(N/Pv) 1 1 Tu 1 . - CDS 26 - 835 931 ## COG0345 Pyrroline-5-carboxylate reductase - Prom 877 - 936 4.3 + Prom 805 - 864 3.5 2 2 Tu 1 . + CDS 955 - 1413 408 ## COG1671 Uncharacterized protein conserved in bacteria + Term 1474 - 1511 -0.4 + Prom 1486 - 1545 5.2 3 3 Op 1 . + CDS 1596 - 2120 355 ## COG0703 Shikimate kinase 4 3 Op 2 . + CDS 2170 - 2361 233 ## LF82_2523 uncharacterized protein YaiA + Prom 2401 - 2460 11.1 5 4 Op 1 . + CDS 2619 - 3296 647 ## S0334 hypothetical protein 6 4 Op 2 . + CDS 3368 - 3652 295 ## COG3123 Uncharacterized protein conserved in bacteria + Term 3659 - 3695 3.0 + Prom 3705 - 3764 7.2 7 5 Op 1 . + CDS 3938 - 4141 132 ## ECH74115_0466 hypothetical protein 8 5 Op 2 . + CDS 4138 - 4221 104 ## 9 6 Tu 1 . - CDS 4299 - 5210 1152 ## COG2974 DNA recombination-dependent growth factor C - Prom 5241 - 5300 4.1 + Prom 5187 - 5246 4.3 10 7 Tu 1 . + CDS 5335 - 6243 292 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase + Term 6252 - 6301 6.0 - Term 6238 - 6287 2.2 11 8 Op 1 2/0.500 - CDS 6488 - 7756 909 ## COG2814 Arabinose efflux permease 12 8 Op 2 28/0.000 - CDS 7798 - 10944 2944 ## COG0419 ATPase involved in DNA repair 13 8 Op 3 . - CDS 10941 - 12143 1088 ## COG0420 DNA repair exonuclease - Prom 12274 - 12333 3.8 + Prom 12250 - 12309 5.2 14 9 Op 1 40/0.000 + CDS 12333 - 13022 681 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Term 13024 - 13063 3.3 15 9 Op 2 4/0.250 + CDS 13080 - 14375 1130 ## COG0642 Signal transduction histidine kinase + Term 14402 - 14435 3.1 + Prom 14574 - 14633 5.3 16 10 Tu 1 . 
+ CDS 14782 - 16101 1567 ## COG1114 Branched-chain amino acid permeases Predicted protein(s) >gi|296494488|gb|ADTN01000250.1| GENE 1 26 - 835 931 269 aa, chain - ## HITS:1 COG:proC KEGG:ns NR:ns ## COG: proC COG0345 # Protein_GI_number: 16128371 # Func_class: E Amino acid transport and metabolism # Function: Pyrroline-5-carboxylate reductase # Organism: Escherichia coli K12 # 1 269 1 269 269 489 100.0 1e-138 MEKKIGFIGCGNMGKAILGGLIASGQVLPGQIWVYTPSPDKVAALHDQFGINAAESAQEV AQIADIIFAAVKPGIMIKVLSEITSSLNKDSLVVSIAAGVTLDQLARALGHDRKIIRAMP NTPALVNAGMTSVTPNALVTPEDTADVLNIFRCFGEAEVIAEPMIHPVVGVSGSSPAYVF MFIEAMADAAVLGGMPRAQAYKFAAQAVMGSAKMVLETGEHPGALKDMVCSPGGTTIEAV RVLEEKGFRAAVIEAMTKCMEKSEKLSKS >gi|296494488|gb|ADTN01000250.1| GENE 2 955 - 1413 408 152 aa, chain + ## HITS:1 COG:ECs0436 KEGG:ns NR:ns ## COG: ECs0436 COG1671 # Protein_GI_number: 15829691 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 152 41 192 192 291 100.0 2e-79 MTIWVDADACPNVIKEILYRAAERMQMPLVLVANQSLRVPPSRFIRTLRVAAGFDVADNE IVRQCEAGDLVITADIPLAAEAIEKGAAALNPRGERYTPATIRERLTMRDFMDTLRASGI QTGGPDSLSQRDRQAFAAELEKWWLEVQRSRG >gi|296494488|gb|ADTN01000250.1| GENE 3 1596 - 2120 355 174 aa, chain + ## HITS:1 COG:ECs0438 KEGG:ns NR:ns ## COG: ECs0438 COG0703 # Protein_GI_number: 15829692 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Escherichia coli O157:H7 # 1 174 1 174 174 333 100.0 7e-92 MTQPLFLIGPRGCGKTTVGMALADSLNRRFVDTDQWLQSQLNMTVAEIVEREEWAGFRAR ETAALEAVTAPSTVIATGGGIILTEFNRHFMQNNGIVVYLCAPVSVLVNRLQAAPEEDLR PTLTGKPLSEEVQEVLEERDALYREVAHIIIDATNEPSQVISEIRSALAQTINC >gi|296494488|gb|ADTN01000250.1| GENE 4 2170 - 2361 233 63 aa, chain + ## HITS:1 COG:no KEGG:LF82_2523 NR:ns ## KEGG: LF82_2523 # Name: yaiA # Def: uncharacterized protein YaiA # Organism: E.coli_LF82 # Pathway: not_defined # 1 63 1 63 63 90 100.0 2e-17 MPTKPPYPREAYIVTIEKGKPGQTVTWYQLRADHPKPDSLISEHPTAQEAMDAKKRYEDP DKE 
>gi|296494488|gb|ADTN01000250.1| GENE 5 2619 - 3296 647 225 aa, chain + ## HITS:1 COG:no KEGG:S0334 NR:ns ## KEGG: S0334 # Name: aroM # Def: hypothetical protein # Organism: S.flexneri_2457T # Pathway: not_defined # 1 225 1 225 225 401 100.0 1e-111 MSASLAILTIGIVPMQEVLPLLTEYIDEDNISHHSLLGKLSREEVMAEYAPEAGEDTILT LLNDNQLAHVSRRKVERDLQGVVEVLDNQGYDVILLMSTANISSMTARNTIFLEPSRILP PLVSSIVEDHQVGVIVPVEEMLPVQAQKWQILQKSPVFSLGNPIHDSEQKIIDAGKELLA KGADVIMLDCLGFHQRHRDLLQKQLDVPVLLSNVLIARLAAELLV >gi|296494488|gb|ADTN01000250.1| GENE 6 3368 - 3652 295 94 aa, chain + ## HITS:1 COG:ECs0441 KEGG:ns NR:ns ## COG: ECs0441 COG3123 # Protein_GI_number: 15829695 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 94 1 94 94 175 100.0 2e-44 MLQSNEYFSGKVKSIGFSSSSTGRASVGVMVEGEYTFSTAEPEEMTVISGALNVLLPDAT DWQVYEAGSVFNVPGHSEFHLQVAEPTSYLCRYL >gi|296494488|gb|ADTN01000250.1| GENE 7 3938 - 4141 132 67 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_0466 NR:ns ## KEGG: ECH74115_0466 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 67 655 721 721 114 89.0 1e-24 MTEPQDRSLAINNPQLAADVKTAWLKEDPSLLLFVEQPDLSLLRDLVKTGATRKIRSEAR HRLEEKQ >gi|296494488|gb|ADTN01000250.1| GENE 8 4138 - 4221 104 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTQRPWSKLQRKTHNIAALKIIARRSE >gi|296494488|gb|ADTN01000250.1| GENE 9 4299 - 5210 1152 303 aa, chain - ## HITS:1 COG:rdgC KEGG:ns NR:ns ## COG: rdgC COG2974 # Protein_GI_number: 16128378 # Func_class: L Replication, recombination and repair # Function: DNA recombination-dependent growth factor C # Organism: Escherichia coli K12 # 1 303 1 303 303 572 100.0 1e-163 MLWFKNLMVYRLSREISLRAEEMEKQLASMAFTPCGSQDMAKMGWVPPMGSHSDALTHVA NGQIVICARKEEKILPSPVIKQALEAKIAKLEAEQARKLKKTEKDSLKDEVLHSLLPRAF SRFSQTMMWIDTVNGLIMVDCASAKKAEDTLALLRKSLGSLPVVPLSMENPIELTLTEWV RSGSAAQGFQLLDEAELKSLLEDGGVIRAKKQDLTSEEITNHIEAGKVVTKLALDWQQRI 
QFVMCDDGSLKRLKFCDELRDQNEDIDREDFAQRFDADFILMTGELAALIQNLIEGLGGE AQR >gi|296494488|gb|ADTN01000250.1| GENE 10 5335 - 6243 292 302 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 3 298 6 316 319 117 31 7e-26 MRIGIDLGGTKTEVIALGDAGEQLYRHRLPTPRDDYRQTIETIATLVDMAEQATGQRGTV GMGIPGSISPYTGVVKNANSTWLNGQPFDKDLSARLQREVRLANDANCLAVSEAVDGAAA GAQTVFAVIIGTGCGAGVAFNGRAHIGGNGTAGEWGHNPLPWMDEDELRYREEVPCYCGK QGCIETFISGTGFAMDYRRLSGHALKGSEIIRLVEESDPVAELALRRYELRLAKSLAHVV NILDPDVIVLGGGMSNVDRLYQTVGQLIKQFVFGGECETPVRKAKHGDSSGVRGAAWLWP QE >gi|296494488|gb|ADTN01000250.1| GENE 11 6488 - 7756 909 422 aa, chain - ## HITS:1 COG:araJ KEGG:ns NR:ns ## COG: araJ COG2814 # Protein_GI_number: 16128381 # Func_class: G Carbohydrate transport and metabolism # Function: Arabinose efflux permease # Organism: Escherichia coli K12 # 29 422 1 394 394 632 100.0 0 MALLVVILQAITLLATVIGSRSGGCDGGMKKVILSLALGTFGLGMAEFGIMGVLTELAHN VGISIPAAGHMISYYALGVVVGAPIIALFSSRYSLKHILLFLVALCVIGNAMFTLSSSYL MLAIGRLVSGFPHGAFFGVGAIVLSKIIKPGKVTAAVAGMVSGMTVANLLGIPLGTYLSQ EFSWRYTFLLIAVFNIAVMASVYFWVPDIRDEAKGNLREQFHFLRSPAPWLIFAATMFGN AGVFAWFSYVKPYMMFISGFSETAMTFIMMLVGLGMVLGNMLSGRISGRYSPLRIAAVTD FIIVLALLMLFFCGGMKTTSLIFAFICCAGLFALSAPLQILLLQNAKGGELLGAAGGQIA FNLGSAVGAYCGGMMLTLGLAYNYVALPAALLSFAAMSSLLLYGRYKRQQAADTPVLAKP LG >gi|296494488|gb|ADTN01000250.1| GENE 12 7798 - 10944 2944 1048 aa, chain - ## HITS:1 COG:sbcC KEGG:ns NR:ns ## COG: sbcC COG0419 # Protein_GI_number: 16128382 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Escherichia coli K12 # 1 1048 1 1048 1048 1499 100.0 0 MKILSLRLKNLNSLKGEWKIDFTREPFASNGLFAITGPTGAGKTTLLDAICLALYHETPR LSNVSQSQNDLMTRDTAECLAEVEFEVKGEAYRAFWSQNRARNQPDGNLQVPRVELARCA DGKILADKVKDKLELTATLTGLDYGRFTRSMLLSQGQFAAFLNAKPKERAELLEELTGTE IYGQISAMVFEQHKSARTELEKLQAQASGVTLLTPEQVQSLTASLQVLTDEEKQLITAQQ QEQQSLNWLTRQDELQQEASRRQQALQQALAEEEKAQPQLAALSLAQPARNLRPHWERIA 
EHSAALAHIRQQIEEVNTRLQSTMALRASIRHHAAKQSAELQQQQQSLNTWLQEHDRFRQ WNNEPAGWRAQFSQQTSDREHLRQWQQQLTHAEQKLNALAAITLTLTADEVATALAQHAE QRPLRQHLVALHGQIVPQQKRLAQLQVAIQNVTQEQTQRNAALNEMRQRYKEKTQQLADV KTICEQEARIKTLEAQRAQLQAGQPCPLCGSTSHPAVEAYQALEPGVNQSRLLALENEVK KLGEEGATLRGQLDAITKQLQRDENEAQSLRQDEQALTQQWQAVTASLNITLQPLDDIQP WLDAQDEHERQLRLLSQRHELQGQIAAHNQQIIQYQQQIEQRQQLLLTTLTGYALTLPQE DEEESWLATRQQEAQSWQQRQNELTALQNRIQQLTPILETLPQSDELPHCEETVVLENWR QVHEQCLALHSQQQTLQQQDVLAAQSLQKAQAQFDTALQASVFDDQQAFLAALMDEQTLT QLEQLKQNLENQRRQAQTLVTQTAETLAQHQQHRPDDGLALTVTVEQIQQELAQTHQKLR ENTTSQGEIRQQLKQDADNRQQQQTLMQQIAQMTQQVEDWGYLNSLIGSKEGDKFRKFAQ GLTLDNLVHLANQQLTRLHGRYLLQRKASEALEVEVVDTWQADAVRDTRTLSGGESFLVS LALALALSDLVSHKTRIDSLFLDEGFGTLDSETLDTALDALDALNASGKTIGVISHVEAM KERIPVQIKVKKINGLGYSKLESTFAVK >gi|296494488|gb|ADTN01000250.1| GENE 13 10941 - 12143 1088 400 aa, chain - ## HITS:1 COG:ECs0448 KEGG:ns NR:ns ## COG: ECs0448 COG0420 # Protein_GI_number: 15829702 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: Escherichia coli O157:H7 # 1 400 1 400 400 801 99.0 0 MRILHTSDWHLGQNFYSKSREAEHQAFLDWLLETAQTHQVDAIIVAGDVFDTGSPPRYAR TLYNRFVVNLQQTGCHLVVLAGNHDSVATLNESRDIMAFLNTTVVASAGHAPQILPRRDG TPGAVLCPIPFLRPRDIITSQAGLNGIEKQQHLLAAITDYYQQHYADACKLRGDQPLPII ATGHLTTVGASKSDAVRDIYIGTLDAFPAQNFPPADYIALGHIHRAQIIGGMEHVRYCGS PIPLSFDECGKSKYVHLVTFSNGKLESVENLNVPVTQPMAVLKGDLASITAQLEQWRDVS QEPPVWLDIEITTDEYLHDIQRKIQALTESLPVEVLLVRRSREQRERVLASQQRETLSEL SVEEVFNRRLALEELDESQQQRLQHLFTTTLHTLAGEHEA >gi|296494488|gb|ADTN01000250.1| GENE 14 12333 - 13022 681 229 aa, chain + ## HITS:1 COG:ECs0449 KEGG:ns NR:ns ## COG: ECs0449 COG0745 # Protein_GI_number: 15829703 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 229 1 229 229 449 100.0 1e-126 MARRILVVEDEAPIREMVCFVLEQNGFQPVEAEDYDSAVNQLNEPWPDLILLDWMLPGGS 
GIQFIKHLKRESMTRDIPVVMLTARGEEEDRVRGLETGADDYITKPFSPKELVARIKAVM RRISPMAVEEVIEMQGLSLDPTSHRVMAGEEPLEMGPTEFKLLHFFMTHPERVYSREQLL NHVWGTNVYVEDRTVDVHIRRLRKALEPGGHDRMVQTVRGTGYRFSTRF >gi|296494488|gb|ADTN01000250.1| GENE 15 13080 - 14375 1130 431 aa, chain + ## HITS:1 COG:phoR KEGG:ns NR:ns ## COG: phoR COG0642 # Protein_GI_number: 16128385 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 431 1 431 431 840 100.0 0 MLERLSWKRLVLELLLCCLPAFILGAFFGYLPWFLLASVTGLLIWHFWNLLRLSWWLWVD RSMTPPPGRGSWEPLLYGLHQMQLRNKKRRRELGNLIKRFRSGAESLPDAVVLTTEEGGI FWCNGLAQQILGLRWPEDNGQNILNLLRYPEFTQYLKTRDFSRPLNLVLNTGRHLEIRVM PYTHKQLLMVARDVTQMHQLEGARRNFFANVSHELRTPLTVLQGYLEMMNEQPLEGAVRE KALHTMREQTQRMEGLVKQLLTLSKIEAAPTHLLNEKVDVPMMLRVVEREAQTLSQKKQT FTFEIDNGLKVSGNEDQLRSAISNLVYNAVNHTPEGTHITVRWQRVPHGAEFSVEDNGPG IAPEHIPRLTERFYRVDKARSRQTGGSGLGLAIVKHAVNHHESRLNIESTVGKGTRFSFV IPERLIAKNSD >gi|296494488|gb|ADTN01000250.1| GENE 16 14782 - 16101 1567 439 aa, chain + ## HITS:1 COG:ECs0451 KEGG:ns NR:ns ## COG: ECs0451 COG1114 # Protein_GI_number: 15829705 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid permeases # Organism: Escherichia coli O157:H7 # 1 439 1 439 439 741 100.0 0 MTHQLRSRDIIALGFMTFALFVGAGNIIFPPMVGLQAGEHVWTAAFGFLITAVGLPVLTV VALAKVGGGVDSLSTPIGKVAGVLLATVCYLAVGPLFATPRTATVSFEVGIAPLTGDSAL PLFIYSLVYFAIVILVSLYPGKLLDTVGNFLAPLKIIALVILSVAAIVWPAGSISTATEA YQNAAFSNGFVNGYLTMDTLGAMVFGIVIVNAARSRGVTEARLLTRYTVWAGLMAGVGLT LLYLALFRLGSDSASLVDQSANGAAILHAYVQHTFGGGGSFLLAALIFIACLVTAVGLTC ACAEFFAQYVPLSYRTLVFILGGFSMVVSNLGLSQLIQISVPVLTAIYPPCIALVVLSFT RSWWHNSSRVIAPPMFISLLFGILDGIKASAFSDILPSWAQRLPLAEQGLAWLMPTVVMV VLAIIWDRAAGRQVTSSAH Prediction of potential genes in microbial genomes Time: Sun May 15 23:59:51 2011 Seq name: gi|296494487|gb|ADTN01000251.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont650.6, whole genome shotgun sequence Length of sequence - 10146 bp Number of predicted genes - 9, with homology - 9 
Number of transcription units - 5, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 5/0.500 + CDS 84 - 1403 1539 ## COG1113 Gamma-aminobutyrate permease and related permeases 2 2 Tu 1 . + CDS 1562 - 3376 1827 ## COG0366 Glycosidases + Term 3500 - 3542 -1.0 3 3 Tu 1 . - CDS 3381 - 3962 467 ## COG3124 Uncharacterized protein conserved in bacteria - Prom 4027 - 4086 1.9 + Prom 3970 - 4029 4.0 4 4 Op 1 17/0.000 + CDS 4055 - 5125 1119 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) 5 4 Op 2 15/0.000 + CDS 5181 - 6308 1100 ## COG0343 Queuine/archaeosine tRNA-ribosyltransferase 6 4 Op 3 25/0.000 + CDS 6331 - 6663 552 ## COG1862 Preprotein translocase subunit YajC 7 4 Op 4 31/0.000 + CDS 6724 - 8538 2078 ## COG0342 Preprotein translocase subunit SecD 8 4 Op 5 . + CDS 8549 - 9520 1003 ## COG0341 Preprotein translocase subunit SecF + Term 9540 - 9581 8.1 + Prom 9561 - 9620 4.5 9 5 Tu 1 . + CDS 9649 - 9996 434 ## c0520 hypothetical protein Predicted protein(s) >gi|296494487|gb|ADTN01000251.1| GENE 1 84 - 1403 1539 439 aa, chain + ## HITS:1 COG:ECs0452 KEGG:ns NR:ns ## COG: ECs0452 COG1113 # Protein_GI_number: 15829706 # Func_class: E Amino acid transport and metabolism # Function: Gamma-aminobutyrate permease and related permeases # Organism: Escherichia coli O157:H7 # 1 439 19 457 457 745 100.0 0 MALGSAIGTGLFYGSADAIKMAGPSVLLAYIIGGIAAYIIMRALGEMSVHNPAASSFSRY AQENLGPLAGYITGWTYCFEILIVAIADVTAFGIYMGVWFPTVPHWIWVLSVVLIICAVN LMSVKVFGELEFWFSFFKVATIIIMIVAGFGIIIWGIGNGGQPTGIHNLWSNGGFFSNGW LGMVMSLQMVMFAYGGIEIIGITAGEAKDPEKSIPRAINSVPMRILVFYVGTLFVIMSIY PWNQVGTAGSPFVLTFQHMGITFAASILNFVVLTASLSAINSDVFGVGRMLHGMAEQGSA PKIFSKTSRRGIPWVTVLVMTTALLFAVYLNYIMPENVFLVIASLATFATVWVWIMILLS QIAFRRRLPPEEVKALKFKVPGGVATTIGGLIFLLFIIGLIGYHPDTRISLYVGFAWIVV LLIGWMFKRRHDRQLAENQ >gi|296494487|gb|ADTN01000251.1| GENE 2 1562 - 3376 1827 604 aa, chain + ## HITS:1 COG:malZ KEGG:ns NR:ns ## COG: malZ COG0366 # Protein_GI_number: 16128388 # 
Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Escherichia coli K12 # 1 604 2 605 605 1253 100.0 0 MLNAWHLPVPPFVKQSKDQLLITLWLTGEDPPQRIMLRTEHDNEEMSVPMHKQRSQPQPG VTAWRAAIDLSSGQPRRRYSFKLLWHDRQRWFTPQGFSRMPPARLEQFAVDVPDIGPQWA ADQIFYQIFPDRFARSLPREAEQDHVYYHHAAGQEIILRDWDEPVTAQAGGSTFYGGDLD GISEKLPYLKKLGVTALYLNPVFKAPSVHKYDTEDYRHVDPQFGGDGALLRLRHNTQQLG MRLVLDGVFNHSGDSHAWFDRHNRGTGGACHNPESPWRDWYSFSDDGTALDWLGYASLPK LDYQSESLVNEIYRGEDSIVRHWLKAPWNMDGWRLDVVHMLGEAGGARNNMQHVAGITEA AKETQPEAYIVGEHFGDARQWLQADVEDAAMNYRGFTFPLWGFLANTDISYDPQQIDAQT CMAWMDNYRAGLSHQQQLRMFNQLDSHDTARFKTLLGRDIARLPLAVVWLFTWPGVPCIY YGDEVGLDGKNDPFCRKPFPWQVEKQDTALFALYQRMIALRKKSQALRHGGCQVLYAEDN VVVFVRVLNQQRVLVAINRGEACEVVLPASPFLNAVQWQCKEGHGQLTDGILALPAISAT VWMN >gi|296494487|gb|ADTN01000251.1| GENE 3 3381 - 3962 467 193 aa, chain - ## HITS:1 COG:yajB KEGG:ns NR:ns ## COG: yajB COG3124 # Protein_GI_number: 16128389 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 193 1 193 193 352 100.0 2e-97 MNFLAHLHLAHLAESSLSGNLLADFVRGNPEESFPPDVVAGIHMHRRIDVLTDNLPEVRE AREWFRSETRRVAPITLDVMWDHFLSRHWSQLSPDFPLQEFVCYAREQVMTILPDSPPRF INLNNYLWSEQWLVRYRDMDFIQNVLNGMASRRPRLDALRDSWYDLDAHYDALETRFWQF YPRMMAQASRKAL >gi|296494487|gb|ADTN01000251.1| GENE 4 4055 - 5125 1119 356 aa, chain + ## HITS:1 COG:ECs0456 KEGG:ns NR:ns ## COG: ECs0456 COG0809 # Protein_GI_number: 15829710 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Escherichia coli O157:H7 # 1 356 1 356 356 706 100.0 0 MRVTDFSFELPESLIAHYPMPERSSCRLLSLDGPTGALTHGTFTDLLDKLNPGDLLVFNN TRVIPARLFGRKASGGKIEVLVERMLDDKRILAHIRASKAPKPGAELLLGDDESINATMT ARHGALFEVEFNDERSVLDILNSIGHMPLPPYIDRPDEDADRELYQTVYSEKPGAVAAPT AGLHFDEPLLEKLRAKGVEMAFVTLHVGAGTFQPVRVDTIEDHIMHSEYAEVPQDVVDAV LAAKARGNRVIAVGTTSVRSLESAAQAAKNDLIEPFFDDTQIFIYPGFQYKVVDALVTNF 
HLPESTLIMLVSAFAGYQHTMNAYKAAVEEKYRFFSYGDAMFITYNPQAINERVGE >gi|296494487|gb|ADTN01000251.1| GENE 5 5181 - 6308 1100 375 aa, chain + ## HITS:1 COG:ECs0457 KEGG:ns NR:ns ## COG: ECs0457 COG0343 # Protein_GI_number: 15829711 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Queuine/archaeosine tRNA-ribosyltransferase # Organism: Escherichia coli O157:H7 # 1 375 1 375 375 798 100.0 0 MKFELDTTDGRARRGRLVFDRGVVETPCFMPVGTYGTVKGMTPEEVEATGAQIILGNTFH LWLRPGQEIMKLHGDLHDFMQWKGPILTDSGGFQVFSLGDIRKITEQGVHFRNPINGDPI FLDPEKSMEIQYDLGSDIVMIFDECTPYPADWDYAKRSMEMSLRWAKRSRERFDSLGNKN ALFGIIQGSVYEDLRDISVKGLVDIGFDGYAVGGLAVGEPKADMHRILEHVCPQIPADKP RYLMGVGKPEDLVEGVRRGIDMFDCVMPTRNARNGHLFVTDGVVKIRNAKYKSDTGPLDP ECDCYTCRNYSRAYLHHLDRCNEILGARLNTIHNLRYYQRLMAGLRKAIEEGKLESFVTD FYQRQGREVPPLNVD >gi|296494487|gb|ADTN01000251.1| GENE 6 6331 - 6663 552 110 aa, chain + ## HITS:1 COG:ECs0458 KEGG:ns NR:ns ## COG: ECs0458 COG1862 # Protein_GI_number: 15829712 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YajC # Organism: Escherichia coli O157:H7 # 1 110 1 110 110 201 100.0 4e-52 MSFFISDAVAATGAPAQGSPMSLILMLVVFGLIFYFMILRPQQKRTKEHKKLMDSIAKGD EVLTNGGLVGRVTKVAENGYIAIALNDTTEVVIKRDFVAAVLPKGTMKAL >gi|296494487|gb|ADTN01000251.1| GENE 7 6724 - 8538 2078 604 aa, chain + ## HITS:1 COG:ECs0459 KEGG:ns NR:ns ## COG: ECs0459 COG0342 # Protein_GI_number: 15829713 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecD # Organism: Escherichia coli O157:H7 # 1 604 12 615 615 1113 100.0 0 MLIVVIVIGLLYALPNLFGEDPAVQITGARGVAASEQTLIQVQKTLQEEKITAKSVALEE GAILARFDSTDTQLRAREALMGVMGDKYVVALNLAPATPRWLAAIHAEPMKLGLDLRGGV HFLMEVDMDTALGKLQEQNIDSLRSDLREKGIPYTTVRKENNYGLSITFRDAKARDEAIA YLSKRHPDLVISSQGSNQLRAVMSDARLSEAREYAVQQNINILRNRVNQLGVAEPVVQRQ GADRIVVELPGIQDTARAKEILGATATLEFRLVNTNVDQAAAASGRVPGDSEVKQTREGQ PVVLYKRVILTGDHITDSTSSQDEYNQPQVNISLDSAGGNIMSNFTKDNIGKPMATLFVE 
YKDSGKKDANGRAVLVKQEEVINIANIQSRLGNSFRITGINNPNEARQLSLLLRAGALIA PIQIVEERTIGPTLGMQNIEQGLEACLAGLLVSILFMIIFYKKFGLIATSALIANLILIV GIMSLLPGATLSMPGIAGIVLTLAVAVDANVLINERIKEELSNGRTVQQAIDEGYRGAFS SIFDANITTLIKVIILYAVGTGAIKGFAITTGIGVATSMFTAIVGTRAIVNLLYGGKRVK KLSI >gi|296494487|gb|ADTN01000251.1| GENE 8 8549 - 9520 1003 323 aa, chain + ## HITS:1 COG:ECs0460 KEGG:ns NR:ns ## COG: ECs0460 COG0341 # Protein_GI_number: 15829714 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecF # Organism: Escherichia coli O157:H7 # 1 323 1 323 323 613 100.0 1e-175 MAQEYTVEQLNHGRKVYDFMRWDYWAFGISGLLLIAAIVIMGVRGFNWGLDFTGGTVIEI TLEKPAEIDVMRDALQKAGFEEPMLQNFGSSHDIMVRMPPAEGETGGQVLGSQVLKVINE STNQNAAVKRIEFVGPSVGADLAQTGAMALMAALLSILVYVGFRFEWRLAAGVVIALAHD VIITLGILSLFHIEIDLTIVASLMSVIGYSLNDSIVVSDRIRENFRKIRRGTPYEIFNVS LTQTLHRTLITSGTTLMVILMLYLFGGPVLEGFSLTMLIGVSIGTASSIYVASALALKLG MKREHMLQQKVEKEGADQPSILP >gi|296494487|gb|ADTN01000251.1| GENE 9 9649 - 9996 434 115 aa, chain + ## HITS:1 COG:no KEGG:c0520 NR:ns ## KEGG: c0520 # Name: yajD # Def: hypothetical protein # Organism: E.coli_CFT073 # Pathway: not_defined # 1 115 24 138 138 235 99.0 4e-61 MAIIPKNYARLESGYREKALKIYPWVCGRCSREFVYSDLRELTVHHIDHDHTNNPEDGSN WELLCLYCHDHEHSKYTEADQYGTTVIAGEDAQKDVGEAKYNPFADLKAMMNKKK Prediction of potential genes in microbial genomes Time: Mon May 16 00:00:06 2011 Seq name: gi|296494486|gb|ADTN01000252.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont650.7, whole genome shotgun sequence Length of sequence - 37193 bp Number of predicted genes - 37, with homology - 36 Number of transcription units - 17, operones - 8 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 34 - 918 1179 ## COG3248 Nucleoside-binding outer membrane protein - Prom 996 - 1055 7.7 - Term 1171 - 1211 1.4 2 2 Tu 1 . 
- CDS 1217 - 1756 426 ## ECs0465 hypothetical protein - Prom 1781 - 1840 3.1 + Prom 1821 - 1880 4.6 3 3 Op 1 14/0.000 + CDS 1907 - 2356 432 ## COG1327 Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains 4 3 Op 2 6/0.000 + CDS 2360 - 3463 959 ## COG1985 Pyrimidine reductase, riboflavin biosynthesis 5 3 Op 3 17/0.000 + CDS 3552 - 4022 596 ## COG0054 Riboflavin synthase beta-chain 6 3 Op 4 11/0.000 + CDS 4042 - 4461 519 ## COG0781 Transcription termination factor 7 3 Op 5 12/0.000 + CDS 4539 - 5516 1056 ## COG0611 Thiamine monophosphate kinase 8 3 Op 6 . + CDS 5494 - 6012 612 ## COG1267 Phosphatidylglycerophosphatase A and related proteins + Term 6021 - 6074 3.4 9 4 Tu 1 . - CDS 6066 - 7040 1042 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) - Prom 7126 - 7185 3.8 + Prom 6859 - 6918 1.7 10 5 Tu 1 . + CDS 7039 - 7188 125 ## + Term 7233 - 7295 1.2 11 6 Op 1 13/0.000 - CDS 7220 - 9082 2138 ## COG1154 Deoxyxylulose-5-phosphate synthase 12 6 Op 2 22/0.000 - CDS 9107 - 10006 925 ## COG0142 Geranylgeranyl pyrophosphate synthase 13 6 Op 3 . - CDS 10006 - 10248 369 ## COG1722 Exonuclease VII small subunit - Prom 10451 - 10510 3.1 + Prom 10200 - 10259 3.5 14 7 Tu 1 . + CDS 10454 - 11902 1792 ## COG0301 Thiamine biosynthesis ATP pyrophosphatase + Term 12065 - 12103 3.1 15 8 Op 1 4/0.375 - CDS 11956 - 12546 718 ## COG0693 Putative intracellular protease/amidase 16 8 Op 2 . - CDS 12509 - 13420 802 ## COG1893 Ketopantoate reductase - Prom 13543 - 13602 2.5 + Prom 13503 - 13562 2.8 17 9 Tu 1 . + CDS 13588 - 14079 725 ## COG1666 Uncharacterized protein conserved in bacteria + Term 14187 - 14239 5.4 - Term 14057 - 14084 -0.8 18 10 Tu 1 . 
- CDS 14207 - 15571 1625 ## COG0477 Permeases of the major facilitator superfamily - Prom 15592 - 15651 3.2 - Term 15673 - 15699 -1.0 19 11 Op 1 7/0.000 - CDS 15720 - 16607 882 ## COG0109 Polyprenyltransferase (cytochrome oxidase assembly factor) 20 11 Op 2 16/0.000 - CDS 16622 - 16951 341 ## COG3125 Heme/copper-type cytochrome/quinol oxidase, subunit 4 21 11 Op 3 20/0.000 - CDS 16951 - 17466 627 ## COG1845 Heme/copper-type cytochrome/quinol oxidase, subunit 3 22 11 Op 4 25/0.000 - CDS 17555 - 19546 2095 ## COG0843 Heme/copper-type cytochrome/quinol oxidases, subunit 1 23 11 Op 5 5/0.250 - CDS 19568 - 20515 906 ## COG1622 Heme/copper-type cytochrome/quinol oxidases, subunit 2 - Prom 20752 - 20811 9.7 24 12 Op 1 1/0.750 - CDS 20975 - 22450 1347 ## COG0477 Permeases of the major facilitator superfamily 25 12 Op 2 . - CDS 22494 - 23072 708 ## COG3056 Uncharacterized lipoprotein - Prom 23114 - 23173 3.4 + Prom 23151 - 23210 5.4 26 13 Tu 1 . + CDS 23377 - 23694 305 ## COG0271 Stress-induced morphogen (activity unknown) + Term 23747 - 23785 6.2 + Prom 23840 - 23899 6.0 27 14 Op 1 29/0.000 + CDS 24038 - 25336 1840 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) + Term 25370 - 25407 6.0 + Prom 25489 - 25548 5.2 28 14 Op 2 24/0.000 + CDS 25582 - 26205 637 ## COG0740 Protease subunit of ATP-dependent Clp proteases 29 14 Op 3 18/0.000 + CDS 26331 - 27605 238 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 + Term 27639 - 27669 2.7 + Prom 27624 - 27683 4.5 30 14 Op 4 16/0.000 + CDS 27793 - 30147 3027 ## COG0466 ATP-dependent Lon protease, bacterial type + Term 30180 - 30221 8.5 + Prom 30241 - 30300 3.4 31 15 Op 1 9/0.000 + CDS 30356 - 30628 420 ## COG0776 Bacterial nucleoid DNA-binding protein + Term 30664 - 30695 2.5 + Prom 30639 - 30698 3.5 32 15 Op 2 6/0.000 + CDS 30820 - 32691 2374 ## COG0760 Parvulin-like peptidyl-prolyl isomerase + Term 32735 - 32772 7.1 + Prom 32736 - 32795 2.3 33 15 Op 3 5/0.250 + CDS 32842 - 33213 
593 ## COG1555 DNA uptake protein and related DNA-binding proteins + Term 33252 - 33282 -0.4 34 15 Op 4 . + CDS 33307 - 33705 380 ## COG0824 Predicted thioesterase 35 16 Op 1 5/0.250 - CDS 33757 - 34452 719 ## COG0603 Predicted PP-loop superfamily ATPase - Term 34465 - 34493 -0.0 36 16 Op 2 . - CDS 34517 - 36217 978 ## COG4533 ABC-type uncharacterized transport system, periplasmic component - Prom 36242 - 36301 3.0 + Prom 36227 - 36286 2.6 37 17 Tu 1 . + CDS 36317 - 37135 789 ## COG0561 Predicted hydrolases of the HAD superfamily + Term 37153 - 37181 1.3 Predicted protein(s) >gi|296494486|gb|ADTN01000252.1| GENE 1 34 - 918 1179 294 aa, chain - ## HITS:1 COG:Ztsx KEGG:ns NR:ns ## COG: Ztsx COG3248 # Protein_GI_number: 15800140 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Nucleoside-binding outer membrane protein # Organism: Escherichia coli O157:H7 EDL933 # 1 294 1 294 294 523 99.0 1e-148 MKKTLLAAGAVLALSSSFTVNAAENDKPQYLSDWWHQSVNVVGSYHTRFGPQIRNDTYLE YEAFAKKDWFDFYGYADAPVFFGGNSDAKGIWNHGSPLFMEIEPRFSIDKLTNTDLSFGP FKEWYFANNYIYDMGRNKDGRQSTWYMGLGTDIDTGLPMSLSMNVYAKYQWQNYGAANEN EWDGYRFKIKYFVPITDLWGGQLSYIGFTNFDWGSDLGDDSGNAINGIKTRTNNSIASSH ILALNYDHWHYSVVARYWHDGGQWNDDAELNFGNGNFNVRSTGWGGYLVVGYNF >gi|296494486|gb|ADTN01000252.1| GENE 2 1217 - 1756 426 179 aa, chain - ## HITS:1 COG:no KEGG:ECs0465 NR:ns ## KEGG: ECs0465 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157J # Pathway: not_defined # 1 179 1 179 179 309 98.0 4e-83 MNTNVFRLLLLGSLFSLSACVQQSEVRQMKHSVSTLNQEMTQLNQETVKITQQNRLNAKS SSGVYLLPGAKTPARLESQIGTLRMSLVNITPDADGTTLTLRIQGESNDPLPAFSGTVEY GQIQGTIDNFQEINVQNQLINAPASVLAPSDVDIPLQLKGISVDQLGFVRIHDIQPVMQ >gi|296494486|gb|ADTN01000252.1| GENE 3 1907 - 2356 432 149 aa, chain + ## HITS:1 COG:ECs0466 KEGG:ns NR:ns ## COG: ECs0466 COG1327 # Protein_GI_number: 15829720 # Func_class: K Transcription # Function: Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains # Organism: Escherichia coli O157:H7 # 1 149 1 149 149 
272 100.0 1e-73 MHCPFCFAVDTKVIDSRLVGEGSSVRRRRQCLVCNERFTTFEVAELVMPRVVKSNDVREP FNEEKLRSGMLRALEKRPVSSDDVEMAINHIKSQLRATGEREVPSKMIGNLVMEQLKKLD KVAYIRFASVYRSFEDIKEFGEEIARLED >gi|296494486|gb|ADTN01000252.1| GENE 4 2360 - 3463 959 367 aa, chain + ## HITS:1 COG:ribD_2 KEGG:ns NR:ns ## COG: ribD_2 COG1985 # Protein_GI_number: 16128399 # Func_class: H Coenzyme transport and metabolism # Function: Pyrimidine reductase, riboflavin biosynthesis # Organism: Escherichia coli K12 # 144 367 1 224 224 443 100.0 1e-124 MQDEYYMARALKLAQRGRFTTHPNPNVGCVIVKDGEIVGEGYHQRAGEPHAEVHALRMAG EKAKGATAYVTLEPCSHHGRTPPCCDALIAAGVARVVASMQDPNPQVAGRGLYRLQQAGI DVSHGLMMSEAEQLNKGFLKRMRTGFPYIQLKLGASLDGRTAMASGESQWITSPQARRDV QLLRAQSHAILTSSATVLADDPALTVRWSELDEQTQALYPQQNLRQPIRIVIDSQNRVTP VHRIVQQPGETWFARTQEDSREWPETVRTLLIPEHKGHLDLVVLMMQLGKQQINSIWVEA GPTLAGALLQAGLVDELIVYIAPKLLGSDARGLCTLPGLEKLADAPQFKFKEIRHVGPDV CLHLVGA >gi|296494486|gb|ADTN01000252.1| GENE 5 3552 - 4022 596 156 aa, chain + ## HITS:1 COG:ECs0468 KEGG:ns NR:ns ## COG: ECs0468 COG0054 # Protein_GI_number: 15829722 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase beta-chain # Organism: Escherichia coli O157:H7 # 1 156 1 156 156 258 100.0 4e-69 MNIIEANVATPDARVAITIARFNNFINDSLLEGAIDALKRIGQVKDENITVVWVPGAYEL PLAAGALAKTGKYDAVIALGTVIRGGTAHFEYVAGGASNGLAHVAQDSEIPVAFGVLTTE SIEQAIERAGTKAGNKGAEAALTALEMINVLKAIKA >gi|296494486|gb|ADTN01000252.1| GENE 6 4042 - 4461 519 139 aa, chain + ## HITS:1 COG:ECs0469 KEGG:ns NR:ns ## COG: ECs0469 COG0781 # Protein_GI_number: 15829723 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Escherichia coli O157:H7 # 1 139 1 139 139 245 100.0 2e-65 MKPAARRRARECAVQALYSWQLSQNDIADVEYQFLAEQDVKDVDVLYFRELLAGVATNTA YLDGLMKPYLSRLLEELGQVEKAVLRIALYELSKRSDVPYKVAINEAIELAKSFGAEDSH KFVNGVLDKAAPVIRPNKK >gi|296494486|gb|ADTN01000252.1| GENE 7 4539 - 5516 1056 325 aa, chain + ## HITS:1 COG:thiL KEGG:ns NR:ns ## COG: thiL COG0611 # Protein_GI_number: 16128402 # Func_class: H Coenzyme 
transport and metabolism # Function: Thiamine monophosphate kinase # Organism: Escherichia coli K12 # 1 325 1 325 325 641 100.0 0 MACGEFSLIARYFDRVRSSRLDVELGIGDDCALLNIPEKQTLAISTDTLVAGNHFLPDID PADLAYKALAVNLSDLAAMGADPAWLTLALTLPDVDEAWLESFSDSLFDLLNYYDMQLIG GDTTRGPLSMTLGIHGFVPMGRALTRSGAKPGDWIYVTGTPGDSAAGLAILQNRLQVADA KDADYLIKRHLRPSPRILQGQALRDLANSAIDLSDGLISDLGHIVKASDCGARIDLALLP FSDALSRHVEPEQALRWALSGGEDYELCFTVPELNRGALDVALGHLGVPFTCIGQMTADI EGLCFIRDGEPVTLDWKGYDHFATP >gi|296494486|gb|ADTN01000252.1| GENE 8 5494 - 6012 612 172 aa, chain + ## HITS:1 COG:pgpA KEGG:ns NR:ns ## COG: pgpA COG1267 # Protein_GI_number: 16128403 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphatase A and related proteins # Organism: Escherichia coli K12 # 1 172 1 172 172 313 100.0 8e-86 MTILPRHKDVAKSRLKMSNPWHLLAVGFGSGLSPIVPGTMGSLAAIPFWYLMTFLPWQLY SLVVMLGICIGVYLCHQTAKDMGVHDHGSIVWDEFIGMWITLMALPTNDWQWVAAGFVIF RILDMWKPWPIRWFDRNVHGGMGIMIDDIVAGVISAGILYFIGHHWPLGILS >gi|296494486|gb|ADTN01000252.1| GENE 9 6066 - 7040 1042 324 aa, chain - ## HITS:1 COG:yajO KEGG:ns NR:ns ## COG: yajO COG0667 # Protein_GI_number: 16128404 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Escherichia coli K12 # 1 324 25 348 348 659 100.0 0 MQYNPLGKTDLRVSRLCLGCMTFGEPDRGNHAWTLPEESSRPIIKRALEGGINFFDTANS YSDGSSEEIVGRALRDFARREDVVVATKVFHRVGDLPEGLSRAQILRSIDDSLRRLGMDY VDILQIHRWDYNTPIEETLEALNDVVKAGKARYIGASSMHASQFAQALELQKQHGWAQFV SMQDHYNLIYREEEREMLPLCYQEGVAVIPWSPLARGRLTRPWGETTARLVSDEVGKNLY KESDENDAQIAERLTGVSEELGATRAQVALAWLLSKPGIAAPIIGTSREEQLDELLNAVD ITLKPEQIAELETPYKPHPVVGFK >gi|296494486|gb|ADTN01000252.1| GENE 10 7039 - 7188 125 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLPLLLYSSYFKLHVLRLRSLTPVTYLCKLLGLHSFAAFLQLELFRVYE >gi|296494486|gb|ADTN01000252.1| GENE 11 7220 - 9082 2138 620 aa, chain - ## HITS:1 COG:dxs KEGG:ns NR:ns ## COG: dxs COG1154 # Protein_GI_number: 16128405 # Func_class: H Coenzyme transport and 
metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Escherichia coli K12 # 1 620 1 620 620 1269 100.0 0 MSFDIAKYPTLALVDSTQELRLLPKESLPKLCDELRRYLLDSVSRSSGHFASGLGTVELT VALHYVYNTPFDQLIWDVGHQAYPHKILTGRRDKIGTIRQKGGLHPFPWRGESEYDVLSV GHSSTSISAGIGIAVAAEKEGKNRRTVCVIGDGAITAGMAFEAMNHAGDIRPDMLVILND NEMSISENVGALNNHLAQLLSGKLYSSLREGGKKVFSGVPPIKELLKRTEEHIKGMVVPG TLFEELGFNYIGPVDGHDVLGLITTLKNMRDLKGPQFLHIMTKKGRGYEPAEKDPITFHA VPKFDPSSGCLPKSSGGLPSYSKIFGDWLCETAAKDNKLMAITPAMREGSGMVEFSRKFP DRYFDVAIAEQHAVTFAAGLAIGGYKPIVAIYSTFLQRAYDQVLHDVAIQKLPVLFAIDR AGIVGADGQTHQGAFDLSYLRCIPEMVIMTPSDENECRQMLYTGYHYNDGPSAVRYPRGN AVGVELTPLEKLPIGKGIVKRRGEKLAILNFGTLMPEAAKVAESLNATLVDMRFVKPLDE ALILEMAASHEALVTVEENAIMGGAGSGVNEVLMAHRKPVPVLNIGLPDFFIPQGTQEEM RAELGLDAAGMEAKIKAWLA >gi|296494486|gb|ADTN01000252.1| GENE 12 9107 - 10006 925 299 aa, chain - ## HITS:1 COG:ispA KEGG:ns NR:ns ## COG: ispA COG0142 # Protein_GI_number: 16128406 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Escherichia coli K12 # 1 299 1 299 299 535 100.0 1e-152 MDFPQQLEACVKQANQALSRFIAPLPFQNTPVVETMQYGALLGGKRLRPFLVYATGHMFG VSTNTLDAPAAAVECIHAYSLIHDDLPAMDDDDLRRGLPTCHVKFGEANAILAGDALQTL AFSILSDADMPEVSDRDRISMISELASASGIAGMCGGQALDLDAEGKHVPLDALERIHRH KTGALIRAAVRLGALSAGDKGRRALPVLDKYAESIGLAFQVQDDILDVVGDTATLGKRQG ADQQLGKSTYPALLGLEQARKKARDLIDDARQSLKQLAEQSLDTSALEALADYIIQRNK >gi|296494486|gb|ADTN01000252.1| GENE 13 10006 - 10248 369 80 aa, chain - ## HITS:1 COG:ECs0476 KEGG:ns NR:ns ## COG: ECs0476 COG1722 # Protein_GI_number: 15829730 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII small subunit # Organism: Escherichia coli O157:H7 # 1 80 1 80 80 112 100.0 1e-25 MPKKNEAPASFEKALSELEQIVTRLESGDLPLEEALNEFERGVQLARQGQAKLQQAEQRV QILLSDNEDASLTPFTPDNE >gi|296494486|gb|ADTN01000252.1| GENE 14 10454 - 11902 1792 482 aa, chain + ## HITS:1 COG:ECs0477_1 KEGG:ns NR:ns ## COG: ECs0477_1 COG0301 # Protein_GI_number: 15829731 # 
Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis ATP pyrophosphatase # Organism: Escherichia coli O157:H7 # 1 402 1 402 402 793 99.0 0 MKFIIKLFPEITIKSQSVRLRFIKILTGNIRNVLKHYDETLAVVRHWDNIEVRAKDENQR LAIRDALTRIPGIHHILEVEDVPFTDMHDIFEKALVQYRDQLEGKTFCVRVKRRGKHDFS SIDVERYVGGGLNQHIESARVKLTNPDVTVHLEVEDDRLLLIKGRYEGIGGFPIGTQEDV LSLISGGFDSGVSSYMLMRRGCRVHYCFFNLGGAAHEIGVRQVAHYLWNRFGSSHRVRFV AINFEPVVGEILEKIDDGQMGVILKRMMVRAASKVAERYGVQALVTGEALGQVSSQTLTN LRLIDNVSDTLILRPLISYDKEHIINLARQIGTEDFARTMPEYCGVISKSPTVKAVKSKI EAEEEKFDFSILDKVVEEANNVDIREIAQQTEQEVVEVETVNGFGPNDVILDIRSIDEQE DKPLKVEGIDVVSLPFYKLSTKFGDLDQNKTWLLWCERGVMSRLQALYLREQGFNNVKVY RP >gi|296494486|gb|ADTN01000252.1| GENE 15 11956 - 12546 718 196 aa, chain - ## HITS:1 COG:thiJ KEGG:ns NR:ns ## COG: thiJ COG0693 # Protein_GI_number: 16128409 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Escherichia coli K12 # 1 196 3 198 198 381 100.0 1e-106 MSASALVCLAPGSEETEAVTTIDLLVRGGIKVTTASVASDGNLAITCSRGVKLLADAPLV EVADGEYDVIVLPGGIKGAECFRDSTLLVETVKQFHRSGRIVAAICAAPATVLVPHDIFP IGNMTGFPTLKDKIPAEQWLDKRVVWDARVKLLTSQGPGTAIDFGLKIIDLLVGREKAHE VASQLVMAAGIYNYYE >gi|296494486|gb|ADTN01000252.1| GENE 16 12509 - 13420 802 303 aa, chain - ## HITS:1 COG:apbA KEGG:ns NR:ns ## COG: apbA COG1893 # Protein_GI_number: 16128410 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate reductase # Organism: Escherichia coli K12 # 1 303 1 303 303 623 100.0 1e-178 MKITVLGCGALGQLWLTALCKQGHEVQGWLRVPQPYCSVNLVETDGSIFNESLTANDPDF LATSDLLLVTLKAWQVSDAVKSLASTLPVTTPILLIHNGMGTIEELQNIQQPLLMGTTTH AARRDGNVIIHVANGITHIGPARQQDGDYSYLADILQTVLPDVAWHNNIRAELWRKLAVN CVINPLTAIWNCPNGELRHHPQEIMQICEEVAAVIEREGHHTSAEDLRDYVMQVIDATAE NISSMLQDIRALRHTEIDYINGFLLRRARAHGIAVPENTRLFEMVKRKESEYERIGTGLP RPW >gi|296494486|gb|ADTN01000252.1| GENE 17 13588 - 14079 725 163 aa, chain + ## HITS:1 COG:ECs0480 KEGG:ns NR:ns ## COG: ECs0480 COG1666 # Protein_GI_number: 15829734 # Func_class: S Function 
unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 163 7 169 169 280 100.0 7e-76 MPSFDIVSEVDLQEARNAVDNASREVESRFDFRNVEASFELNDASKTIKVLSESDFQVNQ LLDILRAKLLKRGIEGSSLDVPENIVHSGKTWFVEAKLKQGIESATQKKIVKMIKDSKLK VQAQIQGDEIRVTGKSRDDLQAVMAMVRGGDLGQPFQFKNFRD >gi|296494486|gb|ADTN01000252.1| GENE 18 14207 - 15571 1625 454 aa, chain - ## HITS:1 COG:yajR KEGG:ns NR:ns ## COG: yajR COG0477 # Protein_GI_number: 16128412 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 454 3 456 456 817 100.0 0 MNDYKMTPGERRATWGLGTVFSLRMLGMFMVLPVLTTYGMALQGASEALIGIAIGIYGLT QAVFQIPFGLLSDRIGRKPLIVGGLAVFAAGSVIAALSDSIWGIILGRALQGSGAIAAAV MALLSDLTREQNRTKAMAFIGVSFGITFAIAMVLGPIITHKLGLHALFWMIAILATTGIA LTIWVVPNSSTHVLNRESGMVKGSFSKVLAEPRLLKLNFGIMCLHILLMSTFVALPGQLA DAGFPAAEHWKVYLATMLIAFGSVVPFIIYAEVKRKMKQVFVFCVGLIVVAEIVLWNAQT QFWQLVVGVQLFFVAFNLMEALLPSLISKESPAGYKGTAMGVYSTSQFLGVAIGGSLGGW INGMFDGQGVFLAGAMLAAVWLTVASTMKEPPYVSSLRIEIPANIAANEALKVRLLETEG IKEVLIAEEEHSAYVKIDSKVTNRFEIEQAIRQA >gi|296494486|gb|ADTN01000252.1| GENE 19 15720 - 16607 882 295 aa, chain - ## HITS:1 COG:ECs0482 KEGG:ns NR:ns ## COG: ECs0482 COG0109 # Protein_GI_number: 15829736 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Polyprenyltransferase (cytochrome oxidase assembly factor) # Organism: Escherichia coli O157:H7 # 1 295 2 296 296 524 100.0 1e-148 MFKQYLQVTKPGIIFGNLISVIGGFLLASKGSIDYPLFIYTLVGVSLVVASGCVFNNYID RDIDRKMERTKNRVLVKGLISPAVSLVYATLLGIAGFMLLWFGANPLACWLGVMGFVVYV GVYSLYMKRHSVYGTLIGSLSGAAPPVIGYCAVTGEFDSGAAILLAIFSLWQMPHSYAIA IFRFKDYQAANIPVLPVVKGISVAKNHITLYIIAFAVATLMLSLGGYAGYKYLVVAAAVS VWWLGMALRGYKVADDRIWARKLFGFSIIAITALSVMMSVDFMVPDSHTLLAAVW >gi|296494486|gb|ADTN01000252.1| GENE 20 16622 - 16951 341 109 aa, chain - ## HITS:1 COG:ECs0483 
KEGG:ns NR:ns ## COG: ECs0483 COG3125 # Protein_GI_number: 15829737 # Func_class: C Energy production and conversion # Function: Heme/copper-type cytochrome/quinol oxidase, subunit 4 # Organism: Escherichia coli O157:H7 # 1 109 1 109 109 159 100.0 8e-40 MSHSTDHSGASHGSVKTYMTGFILSIILTVIPFWMVMTGAASPAVILGTILAMAVVQVLV HLVCFLHMNTKSDEGWNMTAFVFTVLIIAILVVGSIWIMWNLNYNMMMH >gi|296494486|gb|ADTN01000252.1| GENE 21 16951 - 17466 627 171 aa, chain - ## HITS:1 COG:cyoC KEGG:ns NR:ns ## COG: cyoC COG1845 # Protein_GI_number: 16128415 # Func_class: C Energy production and conversion # Function: Heme/copper-type cytochrome/quinol oxidase, subunit 3 # Organism: Escherichia coli K12 # 1 171 34 204 204 306 100.0 8e-84 MSDCILFSILFATYAVLVNGTAGGPTGKDIFELPFVLVETFLLLFSSITYGMAAIAMYKN NKSQVISWLALTWLFGAGFIGMEIYEFHHLIVNGMGPDRSGFLSAFFALVGTHGLHVTSG LIWMAVLMVQIARRGLTSTNRTRIMCLSLFWHFLDVVWICVFTVVYLMGAM >gi|296494486|gb|ADTN01000252.1| GENE 22 17555 - 19546 2095 663 aa, chain - ## HITS:1 COG:ECs0485 KEGG:ns NR:ns ## COG: ECs0485 COG0843 # Protein_GI_number: 15829739 # Func_class: C Energy production and conversion # Function: Heme/copper-type cytochrome/quinol oxidases, subunit 1 # Organism: Escherichia coli O157:H7 # 1 663 1 663 663 1241 100.0 0 MFGKLSLDAVPFHEPIVMVTIAGIILGGLALVGLITYFGKWTYLWKEWLTSVDHKRLGIM YIIVAIVMLLRGFADAIMMRSQQALASAGEAGFLPPHHYDQIFTAHGVIMIFFVAMPFVI GLMNLVVPLQIGARDVAFPFLNNLSFWFTVVGVILVNVSLGVGEFAQTGWLAYPPLSGIE YSPGVGVDYWIWSLQLSGIGTTLTGINFFVTILKMRAPGMTMFKMPVFTWASLCANVLII ASFPILTVTVALLTLDRYLGTHFFTNDMGGNMMMYINLIWAWGHPEVYILILPVFGVFSE IAATFSRKRLFGYTSLVWATVCITVLSFIVWLHHFFTMGAGANVNAFFGITTMIIAIPTG VKIFNWLFTMYQGRIVFHSAMLWTIGFIVTFSVGGMTGVLLAVPGADFVLHNSLFLIAHF HNVIIGGVVFGCFAGMTYWWPKAFGFKLNETWGKRAFWFWIIGFFVAFMPLYALGFMGMT RRLSQQIDPQFHTMLMIAASGAVLIALGILCLVIQMYVSIRDRDQNRDLTGDPWGGRTLE WATSSPPPFYNFAVVPHVHERDAFWEMKEKGEAYKKPDHYEEIHMPKNSGAGIVIAAFST IFGFAMIWHIWWLAIVGFAGMIITWIVKSFDEDVDYYVPVAEIEKLENQHFDEITKAGLK NGN >gi|296494486|gb|ADTN01000252.1| GENE 23 19568 - 20515 906 315 aa, chain - ## HITS:1 COG:cyoA 
KEGG:ns NR:ns ## COG: cyoA COG1622 # Protein_GI_number: 16128417 # Func_class: C Energy production and conversion # Function: Heme/copper-type cytochrome/quinol oxidases, subunit 2 # Organism: Escherichia coli K12 # 1 315 1 315 315 629 100.0 1e-180 MRLRKYNKSLGWLSLFAGTVLLSGCNSALLDPKGQIGLEQRSLILTAFGLMLIVVIPAIL MAVGFAWKYRASNKDAKYSPNWSHSNKVEAVVWTVPILIIIFLAVLTWKTTHALEPSKPL AHDEKPITIEVVSMDWKWFFIYPEQGIATVNEIAFPANTPVYFKVTSNSVMNSFFIPRLG SQIYAMAGMQTRLHLIANEPGTYDGISASYSGPGFSGMKFKAIATPDRAAFDQWVAKAKQ SPNTMSDMAAFEKLAAPSEYNQVEYFSNVKPDLFADVINKFMAHGKSMDMTQPEGEHSAH EGMEGMDMSHAESAH >gi|296494486|gb|ADTN01000252.1| GENE 24 20975 - 22450 1347 491 aa, chain - ## HITS:1 COG:ECs0487 KEGG:ns NR:ns ## COG: ECs0487 COG0477 # Protein_GI_number: 15829741 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 491 1 491 491 843 100.0 0 MSSQYLRIFQQPRSAILLILGFASGLPLALTSGTLQAWMTVENIDLKTIGFFSLVGQAYV FKFLWSPLMDRYTPPFFGRRRGWLLATQILLLVAIAAMGFLEPGTQLRWMAALAVVIAFC SASQDIVFDAWKTDVLPAEERGAGAAISVLGYRLGMLVSGGLALWLADKWLGWQGMYWLM AALLIPCIIATLLAPEPTDTIPVPKTLEQAVVAPLRDFFGRNNAWLILLLIVLYKLGDAF AMSLTTTFLIRGVGFDAGEVGVVNKTLGLLATIVGALYGGILMQRLSLFRALLIFGILQG ASNAGYWLLSITDKHLYSMGAAVFFENLCGGMGTSAFVALLMTLCNKSFSATQFALLSAL SAVGRVYVGPVAGWFVEAHGWSTFYLFSVAAAVPGLILLLVCRQTLEYTRVNDNFISRTA YPAGYAFAMWTLAAGVSLLAVWLLLLTMDALDLTHFSFLPALLEVGVLVALSGVVLGGLL DYLALRKTHLT >gi|296494486|gb|ADTN01000252.1| GENE 25 22494 - 23072 708 192 aa, chain - ## HITS:1 COG:yajG KEGG:ns NR:ns ## COG: yajG COG3056 # Protein_GI_number: 16128419 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized lipoprotein # Organism: Escherichia coli K12 # 1 192 35 226 226 355 100.0 2e-98 MFKKILFPLVALFMLAGCAKPPTTIEVSPTITLPQQDPSLMGVTVSINGADQRTDQALAK VTRDNQIVTLTASRDLRFLLQEVLEKQMTARGYMVGPNGPVNLQIIVSQLYADVSQGNVR 
YNIATKADIAIIATAQNGNKMTKNYRASYNVEGAFQASNKNIADAVNSVLTDTIADMSQD TSIHEFIKQNAR >gi|296494486|gb|ADTN01000252.1| GENE 26 23377 - 23694 305 105 aa, chain + ## HITS:1 COG:ECs0489 KEGG:ns NR:ns ## COG: ECs0489 COG0271 # Protein_GI_number: 15829743 # Func_class: T Signal transduction mechanisms # Function: Stress-induced morphogen (activity unknown) # Organism: Escherichia coli O157:H7 # 1 105 12 116 116 204 100.0 3e-53 MMIRERIEEKLRAAFQPVFLEVVDESYRHNVPAGSESHFKVVLVSDRFTGERFLNRHRMI YSTLAEELSTTVHALALHTYTIKEWEGLQDTVFASPPCRGAGSIA >gi|296494486|gb|ADTN01000252.1| GENE 27 24038 - 25336 1840 432 aa, chain + ## HITS:1 COG:ECs0490 KEGG:ns NR:ns ## COG: ECs0490 COG0544 # Protein_GI_number: 15829744 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Escherichia coli O157:H7 # 1 432 1 432 432 738 100.0 0 MQVSVETTQGLGRRVTITIAADSIETAVKSELVNVAKKVRIDGFRKGKVPMNIVAQRYGA SVRQDVLGDLMSRNFIDAIIKEKINPAGAPTYVPGEYKLGEDFTYSVEFEVYPEVELQGL EAIEVEKPIVEVTDADVDGMLDTLRKQQATWKEKDGAVEAEDRVTIDFTGSVDGEEFEGG KASDFVLAMGQGRMIPGFEDGIKGHKAGEEFTIDVTFPEEYHAENLKGKAAKFAINLKKV EERELPELTAEFIKRFGVEDGSVEGLRAEVRKNMERELKSAIRNRVKSQAIEGLVKANDI DVPAALIDSEIDVLRRQAAQRFGGNEKQALELPRELFEEQAKRRVVVGLLLGEVIRTNEL KADEERVKGLIEEMASAYEDPKEVIEFYSKNKELMDNMRNVALEEQAVEAVLAKAKVTEK ETTFNELMNQQA >gi|296494486|gb|ADTN01000252.1| GENE 28 25582 - 26205 637 207 aa, chain + ## HITS:1 COG:ECs0491 KEGG:ns NR:ns ## COG: ECs0491 COG0740 # Protein_GI_number: 15829745 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Escherichia coli O157:H7 # 1 207 1 207 207 417 100.0 1e-117 MSYSGERDNFAPHMALVPMVIEQTSRGERSFDIYSRLLKERVIFLTGQVEDHMANLIVAQ MLFLEAENPEKDIYLYINSPGGVITAGMSIYDTMQFIKPDVSTICMGQAASMGAFLLTAG AKGKRFCLPNSRVMIHQPLGGYQGQATDIEIHAREILKVKGRMNELMALHTGQSLEQIER DTERDRFLSAPEAVEYGLVDSILTHRN 
>gi|296494486|gb|ADTN01000252.1| GENE 29 26331 - 27605 238 424 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 159 386 250 432 466 96 31 2e-19 MTDKRKDGSGKLLYCSFCGKSQHEVRKLIAGPSVYICDECVDLCNDIIREEIKEVAPHRE RSALPTPHEIRNHLDDYVIGQEQAKKVLAVAVYNHYKRLRNGDTSNGVELGKSNILLIGP TGSGKTLLAETLARLLDVPFTMADATTLTEAGYVGEDVENIIQKLLQKCDYDVQKAQRGI VYIDEIDKISRKSDNPSITRDVSGEGVQQALLKLIEGTVAAVPPQGGRKHPQQEFLQVDT SKILFICGGAFAGLDKVISHRVETGSGIGFGATVKAKSDKASEGELLAQVEPEDLIKFGL IPEFIGRLPVVATLNELSEEALIQILKEPKNALTKQYQALFNLEGVDLEFRDEALDAIAK KAMARKTGARGLRSIVEAALLDTMYDLPSMEDVEKVVIDESVIDGQSKPLLIYGKPEAQQ ASGE >gi|296494486|gb|ADTN01000252.1| GENE 30 27793 - 30147 3027 784 aa, chain + ## HITS:1 COG:lon KEGG:ns NR:ns ## COG: lon COG0466 # Protein_GI_number: 16128424 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Escherichia coli K12 # 1 784 1 784 784 1482 100.0 0 MNPERSERIEIPVLPLRDVVVYPHMVIPLFVGREKSIRCLEAAMDHDKKIMLVAQKEAST DEPGVNDLFTVGTVASILQMLKLPDGTVKVLVEGLQRARISALSDNGEHFSAKAEYLESP TIDEREQEVLVRTAISQFEGYIKLNKKIPPEVLTSLNSIDDPARLADTIAAHMPLKLADK QSVLEMSDVNERLEYLMAMMESEIDLLQVEKRIRNRVKKQMEKSQREYYLNEQMKAIQKE LGEMDDAPDENEALKRKIDAAKMPKEAKEKAEAELQKLKMMSPMSAEATVVRGYIDWMVQ VPWNARSKVKKDLRQAQEILDTDHYGLERVKDRILEYLAVQSRVNKIKGPILCLVGPPGV GKTSLGQSIAKATGRKYVRMALGGVRDEAEIRGHRRTYIGSMPGKLIQKMAKVGVKNPLF LLDEIDKMSSDMRGDPASALLEVLDPEQNVAFSDHYLEVDYDLSDVMFVATSNSMNIPAP LLDRMEVIRLSGYTEDEKLNIAKRHLLPKQIERNALKKGELTVDDSAIIGIIRYYTREAG VRGLEREISKLCRKAVKQLLLDKSLKHIEINGDNLHDYLGVQRFDYGRADNENRVGQVTG LAWTEVGGDLLTIETACVPGKGKLTYTGSLGEVMQESIQAALTVVRARAEKLGINPDFYE KRDIHVHVPEGATPKDGPSAGIAMCTALVSCLTGNPVRADVAMTGEITLRGQVLPIGGLK EKLLAAHRGGIKTVLIPFENKRDLEEIPDNVIADLDIHPVKRIEEVLTLALQNEPSGMQV VTAK >gi|296494486|gb|ADTN01000252.1| GENE 31 30356 - 30628 420 90 aa, chain + ## HITS:1 COG:YPO3154 KEGG:ns NR:ns ## COG: YPO3154 COG0776 # Protein_GI_number: 16123316 # Func_class: L 
Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Yersinia pestis # 1 90 1 90 90 126 92.0 1e-29 MNKSQLIDKIAAGADISKAAAGRALDAIIASVTESLKEGDDVALVGFGTFAVKERAARTG RNPQTGKEITIAAAKVPSFRAGKALKDAVN >gi|296494486|gb|ADTN01000252.1| GENE 32 30820 - 32691 2374 623 aa, chain + ## HITS:1 COG:ybaU KEGG:ns NR:ns ## COG: ybaU COG0760 # Protein_GI_number: 16128426 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like peptidyl-prolyl isomerase # Organism: Escherichia coli K12 # 1 623 1 623 623 1106 100.0 0 MMDSLRTAANSLVLKIIFGIIIVSFILTGVSGYLIGGGNNYAAKVNDQEISRGQFENAFN SERNRMQQQLGDQYSELAANEGYMKTLRQQVLNRLIDEALLDQYARELKLGISDEQVKQA IFATPAFQVDGKFDNSRYNGILNQMGMTADQYAQALRNQLTTQQLINGVAGTDFMLKGET DELAALVAQQRVVREATIDVNALAAKQPVTEQEIASYYEQNKNNFMTPEQFRVSYIKLDA ATMQQPVSDADIQSYYDQHQDQFTQPQRTRYSIIQTKTEDEAKAVLDELNKGGDFAALAK EKSADIISARNGGDMGWLEDATIPDELKNAGLKEKGQLSGVIKSSVGFLIVRLDDIQPAK VKSLDEVRDDIAAKVKHEKALDAYYALQQKVSDAASNDTESLAGAEQAAGVKATQTGWFS KDNLPEELNFKPVADAIFNGGLVGENGAPGINSDIITVDGDRAFVLRISEHKPEAVKPLA DVQEQVKALVQHNKAEQQAKVDAEKLLVDLKAGKGAEAMQAAGLKFGEPKTLSRSGRDPI SQAAFALPLPAKDKPSYGMATDMQGNVVLLALDEVKQGSMPEDQKKAMVQGITQNNAQIV FEALMSNLRKEAKIKIGDALEQQ >gi|296494486|gb|ADTN01000252.1| GENE 33 32842 - 33213 593 123 aa, chain + ## HITS:1 COG:ybaV KEGG:ns NR:ns ## COG: ybaV COG1555 # Protein_GI_number: 16128427 # Func_class: L Replication, recombination and repair # Function: DNA uptake protein and related DNA-binding proteins # Organism: Escherichia coli K12 # 1 123 1 123 123 179 100.0 7e-46 MKHGIKALLITLSLACAGMSHSALAAASVAKPTAVETKAEAPAAQSKAAVPAKASDEEGT RVSINNASAEELARAMNGVGLKKAQAIVSYREEYGPFKTVEDLKQVPGMGNSLVERNLAV LTL >gi|296494486|gb|ADTN01000252.1| GENE 34 33307 - 33705 380 132 aa, chain + ## HITS:1 COG:ybaW KEGG:ns NR:ns ## COG: ybaW COG0824 # Protein_GI_number: 16128428 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Escherichia coli K12 # 1 132 1 132 132 251 100.0 
3e-67 MQTQIKVRGYHLDVYQHVNNARYLEFLEEARWDGLENSDSFQWMTAHNIAFVVVNININY RRPAVLSDLLTITSQLQQLNGKSGILSQVITLEPEGQVVADALITFVCIDLKTQKALALE GELREKLEQMVK >gi|296494486|gb|ADTN01000252.1| GENE 35 33757 - 34452 719 231 aa, chain - ## HITS:1 COG:ybaX KEGG:ns NR:ns ## COG: ybaX COG0603 # Protein_GI_number: 16128429 # Func_class: R General function prediction only # Function: Predicted PP-loop superfamily ATPase # Organism: Escherichia coli K12 # 1 231 1 231 231 485 100.0 1e-137 MKRAVVVFSGGQDSTTCLVQALQQYDEVHCVTFDYGQRHRAEIDVARELALKLGARAHKV LDVTLLNELAVSSLTRDSIPVPDYEPEADGIPNTFVPGRNILFLTLAAIYAYQVKAEAVI TGVCETDFSGYPDCRDEFVKALNHAVSLGMAKDIRFETPLMWIDKAETWALADYYGKLDL VRNETLTCYNGFKGDGCGHCAACNLRANGLNHYLADKPTVMAAMKQKTGLR >gi|296494486|gb|ADTN01000252.1| GENE 36 34517 - 36217 978 566 aa, chain - ## HITS:1 COG:ybaE KEGG:ns NR:ns ## COG: ybaE COG4533 # Protein_GI_number: 16128430 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Escherichia coli K12 # 1 566 1 566 566 1107 100.0 0 MRLLNRLNQYQRLWQPSAGKPQTVTVSELAERCFCSERHVRTLLRQAQEAGWLEWQAQSG RGKRGQLRFLVTPESLRNAMMEQALETGKQQDVLELAQLAPGELRTLLQPFMGGQWQNDT PTLRIPYYRPLEPLQPGFLPGRAEQHLAGQIFSGLTRFDNNTQRPIGDLAHHWETSTDGL RWDFYLRSTLHWHNGDAVKASHLHQRLLMLLQLPALDQLFISVKRIEVTHPQCLTFFLHR PDYWLAHRLASYCSHLAHPQFPLIGTGPFRLTQFTAELVRLESHDYYHLRHPLLKAVEYW ITPPLFEKDLGTSCRHPVQITIGKPEELQRVSQVSSGISLGFCYLTLRKSPRLSLWQARK VISIIHQSGLLQTLEVGENLITASHALLPGWTIPHWQVPDEVKLPKTLTLVYHLPIELHT MAERLQATLAAEGCELTIIFHNAKNWDDTTLQAHADLMMGDRLIGEAPEYTLEQWLRCDP LWPHVFDAPAYAHLQSTLDAVQIMPDEENRFNALKAVFSQLMTDATLTPLFNYHYRISAP PGVNGVRLTPRGWFEFTEAWLPAPSQ >gi|296494486|gb|ADTN01000252.1| GENE 37 36317 - 37135 789 272 aa, chain + ## HITS:1 COG:cof KEGG:ns NR:ns ## COG: cof COG0561 # Protein_GI_number: 16128431 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Escherichia coli K12 # 1 272 5 276 276 560 100.0 1e-159 
MARLAAFDMDGTLLMPDHHLGEKTLSTLARLRERDITLTFATGRHALEMQHILGALSLDA YLITGNGTRVHSLEGELLHRDDLPADVAELVLYQQWDTRASMHIFNDDGWFTGKEIPALL QAFVYSGFRYQIIDVKKMPLGSVTKICFCGDHDDLTRLQIQLYEALGERAHLCFSATDCL EVLPVGCNKGAALTVLTQHLGLSLRDCMAFGDAMNDREMLVSVGSGFIMGNAMPQLRAEL PHLPVIGHCRNQAVSHYLTHWLDYPHLPYSPE Prediction of potential genes in microbial genomes Time: Mon May 16 00:00:21 2011 Seq name: gi|296494485|gb|ADTN01000253.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont650.8, whole genome shotgun sequence Length of sequence - 21675 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 14, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 30 - 89 7.5 1 1 Op 1 3/0.833 + CDS 120 - 578 406 ## COG1522 Transcriptional regulators 2 1 Op 2 35/0.000 + CDS 608 - 2380 1704 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 3 1 Op 3 4/0.667 + CDS 2373 - 4154 222 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P + Term 4191 - 4226 4.5 + Prom 4178 - 4237 5.6 4 2 Op 1 24/0.000 + CDS 4335 - 4673 466 ## COG0347 Nitrogen regulatory protein PII 5 2 Op 2 . + CDS 4703 - 5989 1536 ## COG0004 Ammonia permease + Term 6008 - 6042 3.3 - Term 5993 - 6032 5.2 6 3 Tu 1 . - CDS 6038 - 6898 743 ## COG1946 Acyl-CoA thioesterase - Prom 6938 - 6997 2.9 + Prom 6998 - 7057 2.6 7 4 Tu 1 . + CDS 7116 - 7688 476 ## COG3126 Uncharacterized protein conserved in bacteria + Term 7699 - 7730 4.1 - Term 7570 - 7606 2.7 8 5 Tu 1 . - CDS 7719 - 8030 313 ## COG3695 Predicted methylated DNA-protein cysteine methyltransferase + 5S_RRNA 8185 - 8298 100.0 # ECU82664 [D:55896..56009] # 4.5S ribosomal RNA # Escherichia coli # Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Escherichia. 9 6 Tu 1 . + CDS 8409 - 8762 502 ## COG5507 Uncharacterized conserved protein - Term 8750 - 8797 -0.9 10 7 Tu 1 . 
- CDS 8804 - 10360 1359 ## COG4943 Predicted signal transduction protein containing sensor and EAL domains - Prom 10389 - 10448 3.2 11 8 Tu 1 . - CDS 10518 - 10988 423 ## G2583_0570 hypothetical protein - Prom 11024 - 11083 3.3 12 9 Tu 1 . - CDS 11104 - 11655 368 ## COG0110 Acetyltransferase (isoleucine patch superfamily) - Prom 11766 - 11825 5.8 - Term 11693 - 11733 -0.5 13 10 Op 1 . - CDS 11827 - 12045 187 ## UTI89_C0487 hemolysin expression-modulating protein 14 10 Op 2 . - CDS 12071 - 12445 190 ## ECDH10B_0417 hypothetical protein - Term 12944 - 12971 1.5 15 11 Op 1 27/0.000 - CDS 12991 - 16140 3417 ## COG0841 Cation/multidrug efflux pump 16 11 Op 2 . - CDS 16163 - 17356 1289 ## COG0845 Membrane-fusion protein - Prom 17435 - 17494 3.6 + Prom 17378 - 17437 5.0 17 12 Tu 1 . + CDS 17498 - 18145 514 ## COG1309 Transcriptional regulator + Term 18159 - 18188 1.4 - Term 17855 - 17889 0.6 18 13 Tu 1 . - CDS 18076 - 18267 73 ## gi|300919868|ref|ZP_07136335.1| conserved domain protein + Prom 18168 - 18227 3.6 19 14 Tu 1 . 
+ CDS 18273 - 21635 3502 ## COG3264 Small-conductance mechanosensitive channel Predicted protein(s) >gi|296494485|gb|ADTN01000253.1| GENE 1 120 - 578 406 152 aa, chain + ## HITS:1 COG:ybaO KEGG:ns NR:ns ## COG: ybaO COG1522 # Protein_GI_number: 16128432 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 152 30 181 181 308 100.0 2e-84 MLDKIDRKLLALLQQDCTLSLQALAEAVNLTTTPCWKRLKRLEDDGILIGKVALLDPEKI GLGLTAFVLIKTQHHSSEWYCRFVTVVTEMPEVLGFWRMAGEYDYLMRVQVADMKRYDEF YKRLVNSVPGLSDVTSSFAMEQIKYTTSLPIE >gi|296494485|gb|ADTN01000253.1| GENE 2 608 - 2380 1704 590 aa, chain + ## HITS:1 COG:mdlA KEGG:ns NR:ns ## COG: mdlA COG1132 # Protein_GI_number: 16128433 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Escherichia coli K12 # 1 590 1 590 590 1174 100.0 0 MRLFAQLSWYFRREWRRYLGAVALLVIIAMLQLVPPKVVGIVVDGVTEQHFTTGQILMWI ATMVLIAVVVYLLRYVWRVLLFGASYQLAVELREDYYRQLSRQHPEFYLRHRTGDLMARA TNDVDRVVFAAGEGVLTLVDSLVMGCAVLIMMSTQISWQLTLFSLLPMPVMAIMIKRNGD ALHERFKLAQAAFSSLNDRTQESLTSIRMIKAFGLEDRQSALFAADAEDTGKKNMRVARI DARFDPTIYIAIGMANLLAIGGGSWMVVQGSLTLGQLTSFMMYLGLMIWPMLALAWMFNI VERGSAAYSRIRAMLAEAPVVNDGSEPVPEGRGELDVNIHQFTYPQTDHPALENVNFALK PGQMLGICGPTGSGKSTLLSLIQRHFDVSEGDIRFHDIPLTKLQLDSWRSRLAVVSQTPF LFSDTVANNIALGCPNATQQEIEHVARLASVHDDILRLPQGYDTEVGERGVMLSGGQKQR ISIARALLVNAEILILDDALSAVDGRTEHQILHNLRQWGQGRTVIISAHRLSALTEASEI IVMQHGHIAQRGNHDVLAQQSGWYRDMYRYQQLEAALDDAPENREEAVDA >gi|296494485|gb|ADTN01000253.1| GENE 3 2373 - 4154 222 593 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 346 565 135 355 398 90 35 1e-17 MRSFSQLWPTLKRLLAYGSPWRKPLGIAVLMMWVAAAAEVSGPLLISYFIDNMVAKNNLP LKVVAGLAAAYVGLQLFAAGLHYAQSLLFNRAAVGVVQQLRTDVMDAALRQPLSEFDTQP VGQVISRVTNDTEVIRDLYVTVVATVLRSAALVGAMLVAMFSLDWRMALVAIMIFPVVLV VMVIYQRYSTPIVRRVRAYLADINDGFNEIINGMSVIQQFRQQARFGERMGEASRSHYMA 
RMQTLRLDGFLLRPLLSLFSSLILCGLLMLFGFSASGTIEVGVLYAFISYLGRLNEPLIE LTTQQAMLQQAVVAGERVFELMDGPRQQYGNDDRPLQSGTIEVDNVSFAYRDDNLVLKNI NLSVPSRNFVALVGHTGSGKSTLASLLMGYYPLTEGEIRLDGRPLSSLSHSALRQGVAMV QQDPVVLADTFLANVTLGRDISEERVWQALETVQLAELARSMSDGIYTPLGEQGNNLSVG QKQLLALARVLVETPQILILDEATASIDSGTEQAIQHALAAVREHTTLVVIAHRLSTIVD ADTIQVLHRGQAVEQGTHQQLLAAQGRYWQMYQLQLAGEELAASVREEESLSA >gi|296494485|gb|ADTN01000253.1| GENE 4 4335 - 4673 466 112 aa, chain + ## HITS:1 COG:ECs0504 KEGG:ns NR:ns ## COG: ECs0504 COG0347 # Protein_GI_number: 15829758 # Func_class: E Amino acid transport and metabolism # Function: Nitrogen regulatory protein PII # Organism: Escherichia coli O157:H7 # 1 112 1 112 112 196 100.0 1e-50 MKLVTVIIKPFKLEDVREALSSIGIQGLTVTEVKGFGRQKGHAELYRGAEYSVNFLPKVK IDVAIADDQLDEVIDIVSKAAYTGKIGDGKIFVAELQRVIRIRTGEADEAAL >gi|296494485|gb|ADTN01000253.1| GENE 5 4703 - 5989 1536 428 aa, chain + ## HITS:1 COG:ECs0505 KEGG:ns NR:ns ## COG: ECs0505 COG0004 # Protein_GI_number: 15829759 # Func_class: P Inorganic ion transport and metabolism # Function: Ammonia permease # Organism: Escherichia coli O157:H7 # 1 428 1 428 428 727 100.0 0 MKIATIKTGLASLAMLPGLVMAAPAVADKADNAFMMICTALVLFMTIPGIALFYGGLIRG KNVLSMLTQVTVTFALVCILWVVYGYSLAFGEGNNFFGNINWLMLKNIELTAVMGSIYQY IHVAFQGSFACITVGLIVGALAERIRFSAVLIFVVVWLTLSYIPIAHMVWGGGLLASHGA LDFAGGTVVHINAAIAGLVGAYLIGKRVGFGKEAFKPHNLPMVFTGTAILYIGWFGFNAG SAGTANEIAALAFVNTVVATAAAILGWIFGEWALRGKPSLLGACSGAIAGLVGVTPACGY IGVGGALIIGVVAGLAGLWGVTMLKRLLRVDDPCDVFGVHGVCGIVGCIMTGIFAASSLG GVGFAEGVTMGHQLLVQLESIAITIVWSGVVAFIGYKLADLTVGLRVPEEQEREGLDVNS HGENAYNA >gi|296494485|gb|ADTN01000253.1| GENE 6 6038 - 6898 743 286 aa, chain - ## HITS:1 COG:ECs0506 KEGG:ns NR:ns ## COG: ECs0506 COG1946 # Protein_GI_number: 15829760 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA thioesterase # Organism: Escherichia coli O157:H7 # 1 286 1 286 286 557 100.0 1e-159 MSQALKNLLTLLNLEKIEEGLFRGQSEDLGLRQVFGGQVVGQALYAAKETVPEERLVHSF HSYFLRPGDSKKPIIYDVETLRDGNSFSARRVAAIQNGKPIFYMTASFQAPEAGFEHQKT 
MPSAPAPDGLPSETQIAQSLAHLLPPVLKDKFICDRPLEVRPVEFHNPLKGHVAEPHRQV WIRANGSVPDDLRVHQYLLGYASDLNFLPVALQPHGIGFLEPGIQIATIDHSMWFHRPFN LNEWLLYSVESTSASSARGFVRGEFYTQDGVLVASTVQEGVMRNHN >gi|296494485|gb|ADTN01000253.1| GENE 7 7116 - 7688 476 190 aa, chain + ## HITS:1 COG:ECs0507 KEGG:ns NR:ns ## COG: ECs0507 COG3126 # Protein_GI_number: 15829761 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 190 1 190 190 235 100.0 4e-62 MKLVHMASGLAVAIALAACADKSADIQTPAPAANTSISATQQPAIQQPNVSGTVWIRQKV ALPPDAVLTVTLSDASLADAPSKVLAQKAVRTEGKQSPFSFVLPFNPADVQPNARILLSA AITVNDKLVFITDTVQPVINQGGTKADLTLVPVQQTAVPVQASGGATTTVPSTSPTQVNP SSAVPAPTQY >gi|296494485|gb|ADTN01000253.1| GENE 8 7719 - 8030 313 103 aa, chain - ## HITS:1 COG:ECs0508 KEGG:ns NR:ns ## COG: ECs0508 COG3695 # Protein_GI_number: 15829762 # Func_class: L Replication, recombination and repair # Function: Predicted methylated DNA-protein cysteine methyltransferase # Organism: Escherichia coli O157:H7 # 1 103 27 129 129 202 100.0 8e-53 MEKEDSFPQRVWQIVAAIPEGYVTTYGDVAKLAGSPRAARQVGGVLKRLPEGSTLPWHRV VNRHGTISLTGPDLQRQRQALLAEGVMVSGSGQIDLQRYRWNY >gi|296494485|gb|ADTN01000253.1| GENE 9 8409 - 8762 502 117 aa, chain + ## HITS:1 COG:ECs0509 KEGG:ns NR:ns ## COG: ECs0509 COG5507 # Protein_GI_number: 15829763 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 117 1 117 117 213 100.0 9e-56 MKYVDGFVVAVPADKKDAYREMAAKAAPLFKEFGALRIVECWASDVPDGKVTDFRMAVKA EENEEVVFSWIEYPSKEVRDAANQKMMSDPRMKEFGESMPFDGKRMIYGGFESIIDE >gi|296494485|gb|ADTN01000253.1| GENE 10 8804 - 10360 1359 518 aa, chain - ## HITS:1 COG:ylaB KEGG:ns NR:ns ## COG: ylaB COG4943 # Protein_GI_number: 16128441 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing sensor and EAL domains # Organism: Escherichia coli K12 # 1 518 1 518 518 1044 100.0 0 MLVRTRHLVGLISGVLILSVLLPVGLSIWLAHQQVETSFIEELDTYSSRVAIRANKVATQ 
GKDALQELERWQGAACSEAHLMEMRRVSYSYRYIQEVAYIDNNVPQCSSLEHESPPDTFP EPGKISKDGYRVWLTSHNDLGIIRYMVAMGTAHYVVMIDPASFIDVIPYSSWQIDAAIIG NAHNVVITSSDEIAQGIITRLQKTPGEHIENNGIIYDILPLPEMNISIITWASTKMLQKG WHRQVFIWLPLGLVIGLLAAMFVLRILRRIQSPHHRLQDAIENRDICVHYQPIVSLANGK IVGAEALARWPQTDGSWLSPDSFIPLAQQTGLSEPLTLLIIRSVFEDMGDWLRQHPQQHI SINLESPVLTSEKIPQLLRDMINHYQVNPRQIALELTEREFADPKTSAPIISRYREAGHE IYLDDFGTGYSSLSYLQDLDVDILKIDKSFVDALEYKNVTPHIIEMAKTLKLKMVAEGIE TSKQEEWLRQHGVHYGQGWLYSKALPKEDFLRWAEQHL >gi|296494485|gb|ADTN01000253.1| GENE 11 10518 - 10988 423 156 aa, chain - ## HITS:1 COG:no KEGG:G2583_0570 NR:ns ## KEGG: G2583_0570 # Name: ylaC # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 156 14 169 169 304 100.0 8e-82 MTEIQRLLTETIESLNTREKRDNKPRFSISFIRKHPGLFIGMYVAFFATLAVMLQSETLS GSVWLLVVLFILLNGFFFFDVYPRYRYEDIDVLDFRVCYNGEWYNTRFVPAALVEAILNS PRVADVHKEQLQKMIVRKGELSFYDIFTLARAESTS >gi|296494485|gb|ADTN01000253.1| GENE 12 11104 - 11655 368 183 aa, chain - ## HITS:1 COG:ylaD KEGG:ns NR:ns ## COG: ylaD COG0110 # Protein_GI_number: 16128443 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Escherichia coli K12 # 1 183 1 183 183 381 100.0 1e-106 MSTEKEKMIAGELYRSADETLSRDRLRARQLIHRYNHSLAEEHTLRQQILADLFGQVTEA YIEPTFRCDYGYNIFLGNNFFANFDCVMLDVCPIRIGDNCMLAPGVHIYTATHPIDPVAR NSGAELGKPVTIGNNVWIGGRAVINPGVTIGDNVVVASGAVVTKDVPDNVVVGGNPARII KKL >gi|296494485|gb|ADTN01000253.1| GENE 13 11827 - 12045 187 72 aa, chain - ## HITS:1 COG:no KEGG:UTI89_C0487 NR:ns ## KEGG: UTI89_C0487 # Name: hha # Def: hemolysin expression-modulating protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 72 5 76 76 130 100.0 1e-29 MSEKPLTKTDYLMRLRRCQTIDTLERVIEKNKYELSDNELAVFYSAADHRLAELTMNKLY DKIPSSVWKFIR >gi|296494485|gb|ADTN01000253.1| GENE 14 12071 - 12445 190 124 aa, chain - ## HITS:1 COG:no KEGG:ECDH10B_0417 NR:ns ## KEGG: ECDH10B_0417 # Name: ybaJ # Def: hypothetical protein # Organism: E.coli_DH10B # Pathway: not_defined # 1 124 1 124 124 
243 100.0 2e-63 MDEYSPKRHDIAQLKFLCETLYHDCLANLEESNHGWVNDPTSAINLQLNELIEHIATFAL NYKIKYNEDNKLIEQIDEYLDDTFMLFSSYGINMQDLQKWRKSGNRLFRCFVNATKENPA SLSC >gi|296494485|gb|ADTN01000253.1| GENE 15 12991 - 16140 3417 1049 aa, chain - ## HITS:1 COG:acrB KEGG:ns NR:ns ## COG: acrB COG0841 # Protein_GI_number: 16128446 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Escherichia coli K12 # 1 1049 1 1049 1049 2000 100.0 0 MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPVAQYPTIAPPAVTISASYPGADAKTVQDT VTQVIEQNMNGIDNLMYMSSNSDSTGTVQITLTFESGTDADIAQVQVQNKLQLAMPLLPQ EVQQQGVSVEKSSSSFLMVVGVINTDGTMTQEDISDYVAANMKDAISRTSGVGDVQLFGS QYAMRIWMNPNELNKFQLTPVDVITAIKAQNAQVAAGQLGGTPPVKGQQLNASIIAQTRL TSTEEFGKILLKVNQDGSRVLLRDVAKIELGGENYDIIAEFNGQPASGLGIKLATGANAL DTAAAIRAELAKMEPFFPSGLKIVYPYDTTPFVKISIHEVVKTLVEAIILVFLVMYLFLQ NFRATLIPTIAVPVVLLGTFAVLAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVERVM AEEGLPPKEATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGSTGAIYRQFSITIVSAMAL SVLVALILTPALCATMLKPIAKGDHGEGKKGFFGWFNRMFEKSTHHYTDSVGGILRSTGR YLVLYLIIVVGMAYLFVRLPSSFLPDEDQGVFMTMVQLPAGATQERTQKVLNEVTHYYLT KEKNNVESVFAVNGFGFAGRGQNTGIAFVSLKDWADRPGEENKVEAITMRATRAFSQIKD AMVFAFNLPAIVELGTATGFDFELIDQAGLGHEKLTQARNQLLAEAAKHPDMLTSVRPNG LEDTPQFKIDIDQEKAQALGVSINDINTTLGAAWGGSYVNDFIDRGRVKKVYVMSEAKYR MLPDDIGDWYVRAADGQMVPFSAFSSSRWEYGSPRLERYNGLPSMEILGQAAPGKSTGEA MELMEQLASKLPTGVGYDWTGMSYQERLSGNQAPSLYAISLIVVFLCLAALYESWSIPFS VMLVVPLGVIGALLAATFRGLTNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGL IEATLDAVRMRLRPILMTSLAFILGVMPLVISTGAGSGAQNAVGTGVMGGMVTATVLAIF FVPVFFVVVRRRFSRKNEDIEHSHTVDHH >gi|296494485|gb|ADTN01000253.1| GENE 16 16163 - 17356 1289 397 aa, chain - ## HITS:1 COG:ECs0516 KEGG:ns NR:ns ## COG: ECs0516 COG0845 # Protein_GI_number: 15829770 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli O157:H7 # 1 397 1 397 397 660 100.0 0 MNKNRGFTPLAVVLMLSGSLALTGCDDKQAQQGGQQMPAVGVVTVKTEPLQITTELPGRT SAYRIAEVRPQVSGIILKRNFKEGSDIEAGVSLYQIDPATYQATYDSAKGDLAKAQAAAN 
IAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTAAKAAVETARINLAYTKVTSPI SGRIGKSNVTEGALVQNGQATALATVQQLDPIYVDVTQSSNDFLRLKQELANGTLKQENG KAKVSLITSDGIKFPQDGTLEFSDVTVDQTTGSITLRAIFPNPDHTLLPGMFVRARLEEG LNPNAILVPQQGVTRTPRGDATVLVVGADDKVETRPIVASQAIGDKWLVTEGLKAGDRVV ISGLQKVRPGVQVKAQEVTADNNQQAASGAQPEQSKS >gi|296494485|gb|ADTN01000253.1| GENE 17 17498 - 18145 514 215 aa, chain + ## HITS:1 COG:acrR KEGG:ns NR:ns ## COG: acrR COG1309 # Protein_GI_number: 16128448 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 215 1 215 215 418 100.0 1e-117 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE >gi|296494485|gb|ADTN01000253.1| GENE 18 18076 - 18267 73 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|300919868|ref|ZP_07136335.1| ## NR: gi|300919868|ref|ZP_07136335.1| conserved domain protein [Escherichia coli MS 115-1] # 1 61 1 61 71 103 98.0 4e-21 MKVSEPEERPEKVKPQEYHDAVNQNSNDENVQEKSWSQIQGYSLVAGLRSVGHRRYISSK MAT >gi|296494485|gb|ADTN01000253.1| GENE 19 18273 - 21635 3502 1120 aa, chain + ## HITS:1 COG:aefA KEGG:ns NR:ns ## COG: aefA COG3264 # Protein_GI_number: 16128449 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Escherichia coli K12 # 1 1120 1 1120 1120 2077 100.0 0 MTMFQYYKRSRHFVFSAFIAFVFVLLCQNTAFARASSNGDLPTKADLQAQLDSLNKQKDL SAQDKLVQQDLTDTLATLDKIDRIKEETVQLRQKVAEAPEKMRQATAALTALSDVDNDEE TRKILSTLSLRQLETRVAQALDDLQNAQNDLASYNSQLVSLQTQPERVQNAMYNASQQLQ QIRSRLDGTDVGETALRPSQKVLMQAQQALLNAEIDQQRKSLEGNTVLQDTLQKQRDYVT ANSARLEHQLQLLQEAVNSKRLTLTEKTAQEAVSPDEAARIQANPLVKQELEINQQLSQR LITATENGNQLMQQNIKVKNWLERALQSERNIKEQIAVLKGSLLLSRILYQQQQTLPSAD ELENMTNRIADLRLEQFEVNQQRDALFQSDAFVNKLEEGHTNEVNSEVHDALLQVVDMRR ELLDQLNKQLGNQLMMAINLQINQQQLMSVSKNLKSILTQQIFWVNSNRPMDWDWIKAFP 
QSLKDEFKSMKITVNWQKAWPAVFIAFLAGLPLLLIAGLIHWRLGWLKAYQQKLASAVGS LRNDSQLNTPKAILIDLIRALPVCLIILAVGLILLTMQLNISELLWSFSKKLAIFWLVFG LCWKVLEKNGVAVRHFGMPEQQTSHWRRQIVRISLALLPIHFWSVVAELSPLHLMDDVLG QAMIFFNLLLIAFLVWPMCRESWRDKESHTMRLVTITVLSIIPIALMVLTATGYFYTTLR LAGRWIETVYLVIIWNLLYQTVLRGLSVAARRIAWRRALARRQNLVKEGAEGAEPPEEPT IALEQVNQQTLRITMLLMFALFGVMFWAIWSDLITVFSYLDSITLWHYNGTEAGAAVVKN VTMGSLLFAIIASMVAWALIRNLPGLLEVLVLSRLNMRQGASYAITTILNYIIIAVGAMT VFGSLGVSWDKLQWLAAALSVGLGFGLQEIFGNFVSGLIILFERPVRIGDTVTIGSFSGT VSKIRIRATTITDFDRKEVIIPNKAFVTERLINWSLTDTTTRLVIRLGVAYGSDLEKVRK VLLKAATEHPRVMHEPMPEVFFTAFGASTLDHELRLYVRELRDRSRTVDELNRTIDQLCR ENDINIAFNQLEVHLHNEKGDEVTEVKRDYKGDDPTPAVG Prediction of potential genes in microbial genomes Time: Mon May 16 00:00:36 2011 Seq name: gi|296494484|gb|ADTN01000254.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont650.9, whole genome shotgun sequence Length of sequence - 7085 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 3, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 146 - 307 205 ## B21_00422 hypothetical protein 2 1 Op 2 . - CDS 321 - 848 510 ## COG3923 Primosomal replication protein N'' - Prom 872 - 931 6.3 + Prom 831 - 890 4.0 3 2 Op 1 5/0.000 + CDS 918 - 1295 332 ## COG2832 Uncharacterized protein conserved in bacteria 4 2 Op 2 8/0.000 + CDS 1448 - 1999 821 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins + Term 2051 - 2087 2.2 + Prom 2032 - 2091 2.5 5 2 Op 3 30/0.000 + CDS 2128 - 4059 1986 ## COG2812 DNA polymerase III, gamma/tau subunits + Term 4061 - 4089 1.6 6 2 Op 4 23/0.000 + CDS 4112 - 4441 227 ## PROTEIN SUPPORTED gi|149916415|ref|ZP_01904934.1| 30S ribosomal protein S21 7 2 Op 5 4/1.000 + CDS 4441 - 5046 543 ## COG0353 Recombinational DNA repair protein (RecF pathway) + Term 5048 - 5093 9.0 + Prom 5075 - 5134 3.8 8 3 Tu 1 . 
+ CDS 5156 - 7030 2838 ## COG0326 Molecular chaperone, HSP90 family Predicted protein(s) >gi|296494484|gb|ADTN01000254.1| GENE 1 146 - 307 205 53 aa, chain - ## HITS:1 COG:no KEGG:B21_00422 NR:ns ## KEGG: B21_00422 # Name: ybaM # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 53 1 53 53 72 100.0 3e-12 MSLENAPDDVKLAVDLIVLLEENQIPASTVLRALDIVKRDYEKKLTRDDEAEK >gi|296494484|gb|ADTN01000254.1| GENE 2 321 - 848 510 175 aa, chain - ## HITS:1 COG:priC KEGG:ns NR:ns ## COG: priC COG3923 # Protein_GI_number: 16128451 # Func_class: L Replication, recombination and repair # Function: Primosomal replication protein N'' # Organism: Escherichia coli K12 # 1 175 1 175 175 245 100.0 3e-65 MKTALLLEKLEGQLATLRQRCAPVSQFATLSARFDRHLFQTRATTLQACLDEAGDNLAAL RHAVEQQQLPQVAWLAEHLAAQLEAIAREASAWSLREWDSAPPKIARWQRKRIQHQDFER RLREMVAERRARLARVTDLVEQQTLHREVEAYEARLARCRHALEKIENRLARLTR >gi|296494484|gb|ADTN01000254.1| GENE 3 918 - 1295 332 125 aa, chain + ## HITS:1 COG:ECs0521 KEGG:ns NR:ns ## COG: ECs0521 COG2832 # Protein_GI_number: 15829775 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 125 1 125 125 199 100.0 1e-51 MQRIILIIIGWLAVVLGTLGVVLPVLPTTPFILLAAWCFARSSPRFHAWLLYRSWFGSYL RFWQKHHAMPRGVKPRAILLILLTFAISLWFVQMPWVRIMLLVILACLLFYMWRIPVIDE KQEKH >gi|296494484|gb|ADTN01000254.1| GENE 4 1448 - 1999 821 183 aa, chain + ## HITS:1 COG:apt KEGG:ns NR:ns ## COG: apt COG0503 # Protein_GI_number: 16128453 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Escherichia coli K12 # 1 183 1 183 183 349 100.0 1e-96 MTATAQQLEYLKNSIKSIQDYPKPGILFRDVTSLLEDPKAYALSIDLLVERYKNAGITKV VGTEARGFLFGAPVALGLGVGFVPVRKPGKLPRETISETYDLEYGTDQLEIHVDAIKPGD KVLVVDDLLATGGTIEATVKLIRRLGGEVADAAFIINLFDLGGEQRLEKQGITSYSLVPF PGH >gi|296494484|gb|ADTN01000254.1| GENE 5 2128 - 4059 1986 643 aa, chain + ## HITS:1 COG:dnaX KEGG:ns NR:ns ## 
COG: dnaX COG2812 # Protein_GI_number: 16128454 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Escherichia coli K12 # 1 643 1 643 643 1118 100.0 0 MSYQVLARKWRPQTFADVVGQEHVLTALANGLSLGRIHHAYLFSGTRGVGKTSIARLLAK GLNCETGITATPCGVCDNCREIEQGRFVDLIEIDAASRTKVEDTRDLLDNVQYAPARGRF KVYLIDEVHMLSRHSFNALLKTLEEPPEHVKFLLATTDPQKLPVTILSRCLQFHLKALDV EQIRHQLEHILNEEHIAHEPRALQLLARAAEGSLRDALSLTDQAIASGDGQVSTQAVSAM LGTLDDDQALSLVEAMVEANGERVMALINEAAARGIEWEALLVEMLGLLHRIAMVQLSPA ALGNDMAAIELRMRELARTIPPTDIQLYYQTLLIGRKELPYAPDRRMGVEMTLLRALAFH PRMPLPEPEVPRQSFAPVAPTAVMTPTQVPPQPQSAPQQAPTVPLPETTSQVLAARQQLQ RVQGATKAKKSEPAAATRARPVNNAALERLASVTDRVQARPVPSALEKAPAKKEAYRWKA TTPVMQQKEVVATPKALKKALEHEKTPELAAKLAAEAIERDPWAAQVSQLSLPKLVEQVA LNAWKEESDNAVCLHLRSSQRHLNNRGAQQKLAEALSMLKGSTVELTIVEDDNPAVRTPL EWRQAIYEEKLAQARESIIADNNIQTLRRFFDAELDEESIRPI >gi|296494484|gb|ADTN01000254.1| GENE 6 4112 - 4441 227 109 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149916415|ref|ZP_01904934.1| 30S ribosomal protein S21 [Roseobacter sp. 
AzwK-3b] # 5 109 9 114 114 92 41 1e-18 MFGKGGLGNLMKQAQQMQEKMQKMQEEIAQLEVTGESGAGLVKVTINGAHNCRRVEIDPS LLEDDKEMLEDLVAAAFNDAARRIEETQKEKMASVSSGMQLPPGFKMPF >gi|296494484|gb|ADTN01000254.1| GENE 7 4441 - 5046 543 201 aa, chain + ## HITS:1 COG:ECs0525 KEGG:ns NR:ns ## COG: ECs0525 COG0353 # Protein_GI_number: 15829779 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Escherichia coli O157:H7 # 1 201 1 201 201 399 100.0 1e-111 MQTSPLLTQLMEALRCLPGVGPKSAQRMAFTLLQRDRSGGMRLAQALTRAMSEIGHCADC RTFTEQEVCNICSNPRRQENGQICVVESPADIYAIEQTGQFSGRYFVLMGHLSPLDGIGP DDIGLDRLEQRLAEEKITEVILATNPTVEGEATANYIAELCAQYDVEASRIAHGVPVGGE LEMVDGTTLSHSLAGRHKIRF >gi|296494484|gb|ADTN01000254.1| GENE 8 5156 - 7030 2838 624 aa, chain + ## HITS:1 COG:ECs0526 KEGG:ns NR:ns ## COG: ECs0526 COG0326 # Protein_GI_number: 15829780 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone, HSP90 family # Organism: Escherichia coli O157:H7 # 1 624 1 624 624 1183 100.0 0 MKGQETRGFQSEVKQLLHLMIHSLYSNKEIFLRELISNASDAADKLRFRALSNPDLYEGD GELRVRVSFDKDKRTLTISDNGVGMTRDEVIDHLGTIAKSGTKSFLESLGSDQAKDSQLI GQFGVGFYSAFIVADKVTVRTRAAGEKPENGVFWESAGEGEYTVADITKEDRGTEITLHL REGEDEFLDDWRVRSIISKYSDHIALPVEIEKREEKDGETVISWEKINKAQALWTRNKSE ITDEEYKEFYKHIAHDFNDPLTWSHNRVEGKQEYTSLLYIPSQAPWDMWNRDHKHGLKLY VQRVFIMDDAEQFMPNYLRFVRGLIDSSDLPLNVSREILQDSTVTRNLRNALTKRVLQML EKLAKDDAEKYQTFWQQFGLVLKEGPAEDFANQEAIAKLLRFASTHTDSSAQTVSLEDYV SRMKEGQEKIYYITADSYAAAKSSPHLELLRKKGIEVLLLSDRIDEWMMNYLTEFDGKPF QSVSKVDESLEKLADEVDESAKEAEKALTPFIDRVKALLGERVKDVRLTHRLTDTPAIVS TDADEMSTQMAKLFAAAGQKVPEVKYIFELNPDHVLVKRAADTEDEAKFSEWVELLLDQA LLAERGTLEDPNLFIRRMNQLLVS Prediction of potential genes in microbial genomes Time: Mon May 16 00:00:38 2011 Seq name: gi|296494483|gb|ADTN01000255.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont650.10, whole genome shotgun sequence Length of sequence - 846 bp Number of predicted genes - 1, with homology - 1 Number of 
transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 68 - 127 3.9 1 1 Tu 1 . + CDS 168 - 812 986 ## COG0563 Adenylate kinase and related kinases Predicted protein(s) >gi|296494483|gb|ADTN01000255.1| GENE 1 168 - 812 986 214 aa, chain + ## HITS:1 COG:ECs0527 KEGG:ns NR:ns ## COG: ECs0527 COG0563 # Protein_GI_number: 15829781 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate kinase and related kinases # Organism: Escherichia coli O157:H7 # 1 214 1 214 214 404 100.0 1e-113 MRIILLGAPGAGKGTQAQFIMEKYGIPQISTGDMLRAAVKSGSELGKQAKDIMDAGKLVT DELVIALVKERIAQEDCRNGFLLDGFPRTIPQADAMKEAGINVDYVLEFDVPDELIVDRI VGRRVHAPSGRVYHVKFNPPKVEGKDDVTGEELTTRKDDQEETVRKRLVEYHQMTAPLIG YYSKEAEAGNTKYAKVDGTKPVAEVRADLEKILG Prediction of potential genes in microbial genomes Time: Mon May 16 00:00:47 2011 Seq name: gi|296494482|gb|ADTN01000256.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont650.11, whole genome shotgun sequence Length of sequence - 26280 bp Number of predicted genes - 23, with homology - 23 Number of transcription units - 17, operones - 5 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 93 - 152 3.9 1 1 Tu 1 . + CDS 177 - 1139 991 ## COG0276 Protoheme ferro-lyase (ferrochelatase) + Term 1164 - 1200 2.0 - Term 1056 - 1101 8.0 2 2 Tu 1 . - CDS 1136 - 2095 702 ## COG0657 Esterase/lipase - Prom 2216 - 2275 2.7 + Prom 2095 - 2154 1.7 3 3 Tu 1 . + CDS 2247 - 3551 1269 ## COG0524 Sugar kinases, ribokinase family + Term 3581 - 3646 3.1 - Term 3567 - 3632 3.1 4 4 Tu 1 . - CDS 3684 - 5360 499 ## PROTEIN SUPPORTED gi|229845962|ref|ZP_04466074.1| 30S ribosomal protein S2 - Prom 5462 - 5521 5.2 - Term 5552 - 5591 8.0 5 5 Tu 1 . - CDS 5598 - 6818 1061 ## COG0477 Permeases of the major facilitator superfamily + Prom 6932 - 6991 2.6 6 6 Tu 1 . 
+ CDS 7036 - 8688 1936 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases - Term 8684 - 8719 7.4 7 7 Tu 1 5/0.167 - CDS 8725 - 9204 678 ## COG2606 Uncharacterized conserved protein - Prom 9345 - 9404 7.6 - Term 9307 - 9355 -0.1 8 8 Tu 1 . - CDS 9408 - 10202 687 ## COG3735 Uncharacterized protein conserved in bacteria - Prom 10286 - 10345 2.7 + Prom 10238 - 10297 4.9 9 9 Tu 1 . + CDS 10340 - 10681 378 ## COG3093 Plasmid maintenance system antidote protein - Term 10949 - 10980 0.7 10 10 Tu 1 . - CDS 10997 - 13501 3022 ## COG2217 Cation transport ATPase - Prom 13744 - 13803 7.1 + Prom 13622 - 13681 3.4 11 11 Op 1 4/0.583 + CDS 13763 - 14695 924 ## COG2066 Glutaminase 12 11 Op 2 2/1.000 + CDS 14701 - 15990 1356 ## COG0531 Amino acid transporters + Prom 16016 - 16075 3.8 13 12 Tu 1 . + CDS 16115 - 16522 390 ## COG0789 Predicted transcriptional regulators 14 13 Op 1 26/0.000 - CDS 16523 - 16981 411 ## COG1585 Membrane protein implicated in regulation of membrane protease activity 15 13 Op 2 . - CDS 16978 - 17895 1522 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs - Prom 17991 - 18050 4.0 + Prom 17955 - 18014 3.8 16 14 Op 1 4/0.583 + CDS 18041 - 18718 173 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 17 14 Op 2 . + CDS 18705 - 19484 785 ## COG0390 ABC-type uncharacterized transport system, permease component + Term 19485 - 19535 12.8 - Term 19469 - 19512 9.1 18 15 Op 1 3/0.750 - CDS 19547 - 20401 1382 ## COG3118 Thioredoxin domain-containing protein 19 15 Op 2 5/0.167 - CDS 20462 - 21271 229 ## PROTEIN SUPPORTED gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 20 15 Op 3 . 
- CDS 21261 - 21917 565 ## COG2755 Lysophospholipase L1 and related esterases - Prom 22139 - 22198 1.9 21 16 Op 1 11/0.083 + CDS 21855 - 22541 313 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 22 16 Op 2 3/0.750 + CDS 22538 - 24952 2569 ## COG3127 Predicted ABC-type transport system involved in lysophospholipase L1 biosynthesis, permease component + Term 24959 - 25012 5.5 + Prom 25280 - 25339 10.0 23 17 Tu 1 . + CDS 25382 - 26279 449 ## COG3209 Rhs family protein Predicted protein(s) >gi|296494482|gb|ADTN01000256.1| GENE 1 177 - 1139 991 320 aa, chain + ## HITS:1 COG:hemH KEGG:ns NR:ns ## COG: hemH COG0276 # Protein_GI_number: 16128459 # Func_class: H Coenzyme transport and metabolism # Function: Protoheme ferro-lyase (ferrochelatase) # Organism: Escherichia coli K12 # 1 320 1 320 320 625 100.0 1e-179 MRQTKTGILLANLGTPDAPTPEAVKRYLKQFLSDRRVVDTSRLLWWPLLRGVILPLRSPR VAKLYASVWMEGGSPLMVYSRQQQQALAQRLPEMPVALGMSYGSPSLESAVDELLAEHVD HIVVLPLYPQFSCSTVGAVWDELARILARKRSIPGISFIRDYADNHDYINALANSVRASF AKHGEPDLLLLSYHGIPQRYADEGDDYPQRCRTTTRELASALGMAPEKVMMTFQSRFGRE PWLMPYTDETLKMLGEKGVGHIQVMCPGFAADCLETLEEIAEQNREVFLGAGGKKYEYIP ALNATPEHIEMMANLVAAYR >gi|296494482|gb|ADTN01000256.1| GENE 2 1136 - 2095 702 319 aa, chain - ## HITS:1 COG:aes KEGG:ns NR:ns ## COG: aes COG0657 # Protein_GI_number: 16128460 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Escherichia coli K12 # 1 319 1 319 319 664 100.0 0 MKPENKLPVLDLISAEMKTVVNTLQPDLPPWPATGTIAEQRQYYTLERRFWNAGAPEMAT RAYMVPTKYGQVETRLFCPQPDSPATLFYLHGGGFILGNLDTHDRIMRLLASYSQCTVIG IDYTLSPEARFPQAIEEIVAACCYFHQQAEDYQINMSRIGFAGDSAGAMLALASALWLRD KQIDCGKVAGVLLWYGLYGLRDSVTRRLLGGVWDGLTQQDLQMYEEAYLSNDADRESPYY CLFNNDLTREVPPCFIAGAEFDPLLDDSRLLYQTLAAHQQPCEFKLYPGTLHAFLHYSRM MKTADEALRDGAQFFTAQL >gi|296494482|gb|ADTN01000256.1| GENE 3 2247 - 3551 1269 434 aa, chain + ## HITS:1 COG:ECs0530 KEGG:ns NR:ns ## COG: ECs0530 COG0524 # Protein_GI_number: 15829784 # Func_class: G 
Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli O157:H7 # 1 434 1 434 434 914 100.0 0 MKFPGKRKSKHYFPVNARDPLLQQFQPENETSAAWVVGIDQTLVDIEAKVDDEFIERYGL SAGHSLVIEDDVAEALYQELKQKNLITHQFAGGTIGNTMHNYSVLADDRSVLLGVMCSNI EIGSYAYRYLCNTSSRTDLNYLQGVDGPIGRCFTLIGESGERTFAISPGHMNQLRAESIP EDVIAGASALVLTSYLVRCKPGEPMPEATMKAIEYAKKYNVPVVLTLGTKFVIAENPQWW QQFLKDHVSILAMNEDEAEALTGESDPLLASDKALDWVDLVLCTAGPIGLYMAGFTEDEA KRKTQHPLLPGAIAEFNQYEFSRAMRHKDCQNPLRVYSHIAPYMGGPEKIMNTNGAGDGA LAALLHDITANSYHRSNVPNSSKHKFTWLTYSSLAQVCKYANRVSYQVLNQHSPRLTRGL PEREDSLEESYWDR >gi|296494482|gb|ADTN01000256.1| GENE 4 3684 - 5360 499 558 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229845962|ref|ZP_04466074.1| 30S ribosomal protein S2 [Haemophilus influenzae 7P49H1] # 4 543 5 535 618 196 25 1e-49 MHHATPLITTIVGGLVLAFILGMLANKLRISPLVGYLLAGVLAGPFTPGFVADTKLAPEL AELGVILLMFGVGLHFSLKDLMAVKAIAIPGAIAQIAVATLLGMALSAVLGWSLMTGIVF GLCLPTASTVVLLRALEERQLIDSQRGQIAIGWLIVEDLVMVLTLVLLPAVAGMMEQGDV GFATLAVDMGITIGKVIAFIAIMMLVGRRLVPWIMARSAATGSRELFTLSVLALALGVAF GAVELFDVSFALGAFFAGMVLNESELSHRAAHDTLPLRDAFAVLFFVSVGMLFDPLILIQ QPLAVLATLAIILFGKSLAAFFLVRLFGHSQRTALTIAASLAQIGEFAFILAGLGMALNL LPQAGQNLVLAGAILSIMLNPVLFALLEKYLAKTETLEEQTLEEAIEEEKQIPVDICNHA LLVGYGRVGSLLGEKLLASDIPLVVIETSRTRVDELRERGVRAVLGNAANEEIMQLAHLE CAKWLILTIPNGYEAGEIVASARAKNPDIEIIARAHYDDEVAYITERGANQVVMGEREIA RTMLELLETPPAGEVVTG >gi|296494482|gb|ADTN01000256.1| GENE 5 5598 - 6818 1061 406 aa, chain - ## HITS:1 COG:fsr KEGG:ns NR:ns ## COG: fsr COG0477 # Protein_GI_number: 16128463 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 406 1 406 406 676 100.0 0 MAMSEQPQPVAGAAASTTKARTSFGILGAISLSHLLNDMIQSLILAIYPLLQSEFSLTFM QIGMITLTFQLASSLLQPVVGYWTDKYPMPWSLPIGMCFTLSGLVLLALAGSFGAVLLAA 
ALVGTGSSVFHPESSRVARMASGGRHGLAQSIFQVGGNFGSSLGPLLAAVIIAPYGKGNV AWFVLAALLAIVVLAQISRWYSAQHRMNKGKPKATIINPLPRNKVVLAVSILLILIFSKY FYMASISSYYTFYLMQKFGLSIQNAQLHLFAFLFAVAAGTVIGGPVGDKIGRKYVIWGSI LGVAPFTLILPYASLHWTGVLTVIIGFILASAFSAILVYAQELLPGRIGMVSGLFFGFAF GMGGLGAAVLGLIADHTSIELVYKICAFLPLLGMLTIFLPDNRHKD >gi|296494482|gb|ADTN01000256.1| GENE 6 7036 - 8688 1936 550 aa, chain + ## HITS:1 COG:ushA KEGG:ns NR:ns ## COG: ushA COG0737 # Protein_GI_number: 16128464 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases # Organism: Escherichia coli K12 # 1 550 1 550 550 1117 100.0 0 MKLLQRGVALALLTTFTLASETALAYEQDKTYKITVLHTNDHHGHFWRNEYGEYGLAAQK TLVDGIRKEVAAEGGSVLLLSGGDINTGVPESDLQDAEPDFRGMNLVGYDAMAIGNHEFD NPLTVLRQQEKWAKFPLLSANIYQKSTGERLFKPWALFKRQDLKIAVIGLTTDDTAKIGN PEYFTDIEFRKPADEAKLVIQELQQTEKPDIIIAATHMGHYDNGEHGSNAPGDVEMARAL PAGSLAMIVGGHSQDPVCMAAENKKQVDYVPGTPCKPDQQNGIWIVQAHEWGKYVGRADF EFRNGEMKMVNYQLIPVNLKKKVTWEDGKSERVLYTPEIAENQQMISLLSPFQNKGKAQL EVKIGETNGRLEGDRDKVRFVQTNMGRLILAAQMDRTGADFAVMSGGGIRDSIEAGDISY KNVLKVQPFGNVVVYADMTGKEVIDYLTAVAQMKPDSGAYPQFANVSFVAKDGKLNDLKI KGEPVDPAKTYRMATLNFNATGGDGYPRLDNKPGYVNTGFIDAEVLKAYIQKSSPLDVSV YEPKGEVSWQ >gi|296494482|gb|ADTN01000256.1| GENE 7 8725 - 9204 678 159 aa, chain - ## HITS:1 COG:ybaK KEGG:ns NR:ns ## COG: ybaK COG2606 # Protein_GI_number: 16128465 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 159 1 159 159 292 100.0 2e-79 MTPAVKLLEKNKISFQIHTYEHDPAETNFGDEVVKKLGLNPDQVYKTLLVAVNGDMKHLA VAVTPVAGQLDLKKVAKALGAKKVEMADPMVAQRSTGYLVGGISPLGQKKRLPTIIDAPA QEFATIYVSGGKRGLDIELAAGDLAKILDAKFADIARRD >gi|296494482|gb|ADTN01000256.1| GENE 8 9408 - 10202 687 264 aa, chain - ## HITS:1 COG:ybaP KEGG:ns NR:ns ## COG: ybaP COG3735 # Protein_GI_number: 16128466 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 264 1 264 264 506 100.0 1e-143 
MDLLYRVKTLWAALRGNHYTWPAIDITLPGNRHFHLIGSIHMGSHDMAPLPTRLLKKLKN ADALIVEADVSTSDTPFANLPACEALEERISEEQLQNLQHISQEMGISPSLFSTQPLWQI AMVLQATQAQKLGLRAEYGIDYQLLQAAKQQHKPVIELEGAENQIAMLLQLPDKGLALLD DTLTHWHTNARLLQQMMSWWLNAPPQNNDITLPNTFSQSLYDVLMHQRNLAWRDKLRAMP PGRYVVAVGALHLYGEGNLPQMLR >gi|296494482|gb|ADTN01000256.1| GENE 9 10340 - 10681 378 113 aa, chain + ## HITS:1 COG:ECs0536 KEGG:ns NR:ns ## COG: ECs0536 COG3093 # Protein_GI_number: 15829790 # Func_class: R General function prediction only # Function: Plasmid maintenance system antidote protein # Organism: Escherichia coli O157:H7 # 1 113 19 131 131 202 100.0 2e-52 MKQATRKPTTPGDILLYEYLEPLDLKINELAELLHVHRNSVSALINNNRKLTTEMAFRLA KVFDTTVDFWLNLQAAVDLWEVENNMRTQEELGRIETVAEYLARREERAKKVA >gi|296494482|gb|ADTN01000256.1| GENE 10 10997 - 13501 3022 834 aa, chain - ## HITS:1 COG:ybaR KEGG:ns NR:ns ## COG: ybaR COG2217 # Protein_GI_number: 16128468 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Escherichia coli K12 # 1 834 1 834 834 1477 100.0 0 MSQTIDLTLDGLSCGHCVKRVKESLEQRPDVEQADVSITEAHVTGTASAEQLIETIKQAG YDASVSHPKAKPLAESSIPSEALTAVSEALPAATADDDDSQQLLLSGMSCASCVTRVQNA LQSVPGVTQARVNLAERTALVMGSASPQDLVQAVEKAGYGAEAIEDDAKRRERQQETAVA TMKRFRWQAIVALAVGIPVMVWGMIGDNMMVTADNRSLWLVIGLITLAVMVFAGGHFYRS AWKSLLNGAATMDTLVALGTGVAWLYSMSVNLWPQWFPMEARHLYYEASAMIIGLINLGH MLEARARQRSSKALEKLLDLTPPTARLVTDEGEKSVPLAEVQPGMLLRLTTGDRVPVDGE ITQGEAWLDEAMLTGEPIPQQKGEGDSVHAGTVVQDGSVLFRASAVGSHTTLSRIIRMVR QAQSSKPEIGQLADKISAVFVPVVVVIALVSAAIWYFFGPAPQIVYTLVIATTVLIIACP CALGLATPMSIISGVGRAAEFGVLVRDADALQRASTLDTVVFDKTGTLTEGKPQVVAVKT FADVDEAQALRLAAALEQGSSHPLARAILDKAGDMQLPQVNGFRTLRGLGVSGEAEGHAL LLGNQALLNEQQVGTKAIEAEITAQASQGATPVLLAVDGKAVALLAVRDPLRSDSVAALQ RLHKAGYRLVMLTGDNPTTANAIAKEAGIDEVIAGVLPDGKAEAIKHLQSEGRQVAMVGD GINDAPALAQADVGIAMGGGSDVAIETAAITLMRHSLMGVADALAISRATLHNMKQNLLG AFIYNSIGIPVAAGILWPFTGTLLNPVVAGAAMALSSITVVSNANRLLRFKPKE >gi|296494482|gb|ADTN01000256.1| GENE 11 13763 - 14695 924 310 aa, chain + ## HITS:1 COG:ybaS KEGG:ns NR:ns ## 
COG: ybaS COG2066 # Protein_GI_number: 16128469 # Func_class: E Amino acid transport and metabolism # Function: Glutaminase # Organism: Escherichia coli K12 # 1 310 1 310 310 609 100.0 1e-174 MLDANKLQQAVDQAYTQFHSLNGGQNADYIPFLANVPGQLAAVAIVTCDGNVYSAGDSDY RFALESISKVCTLALALEDVGPQAVQDKIGADPTGLPFNSVIALELHGGKPLSPLVNAGA IATTSLINAENVEQRWQRILHIQQQLAGEQVALSDEVNQSEQTTNFHNRAIAWLLYSAGY LYCDAMEACDVYTRQCSTLLNTIELATLGATLAAGGVNPLTHKRVLQADNVPYILAEMMM EGLYGRSGDWAYRVGLPGKSGVGGGILAVVPGVMGIAAFSPPLDEDGNSVRGQKMVASVA KQLGYNVFKG >gi|296494482|gb|ADTN01000256.1| GENE 12 14701 - 15990 1356 429 aa, chain + ## HITS:1 COG:ybaT KEGG:ns NR:ns ## COG: ybaT COG0531 # Protein_GI_number: 16128470 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 429 2 430 430 668 100.0 0 MNTEGNNGNKPLGLWNVVSIGIGAMVGAGIFALLGQAALLMEASTWVAFAFGGIVAMFSG YAYARLGASYPSNGGIIDFFRRGLGNGVFSLALSLLYLLTLAVSIAMVARAFGAYAVQFL HEGSQEEHLILLYALGIIAVMTLFNSLSNHAVGRLEVILVGIKMMILLLLIIAGVWSLQP AHISVSAPPSSGAFFSCIGITFLAYAGFGMMANAADKVKDPQVIMPRAFLVAIGVTTLLY ISLALVLLSDVSALELEKYADTAVAQAASPLLGHVGYVIVVIGALLATASAINANLFAVF NIMDNMGSERELPKLMNKSLWRQSTWGNIIVVVLIMLMTAALNLGSLASVASATFLICYL AVFVVAIRLRHDIHASLPILIVGTLVMLLVIVGFIYSLWSQGSRALIWIIGSLLLSLIVA MVMKRNKTV >gi|296494482|gb|ADTN01000256.1| GENE 13 16115 - 16522 390 135 aa, chain + ## HITS:1 COG:ybbI KEGG:ns NR:ns ## COG: ybbI COG0789 # Protein_GI_number: 16128471 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 1 135 1 135 135 266 100.0 8e-72 MNISDVAKITGLTSKAIRFYEEKGLVTPPMRSENGYRTYTQQHLNELTLLRQARQVGFNL EESGELVNLFNDPQRHSADVKRRTLEKVAEIERHIEELQSMRDQLLALANACPGDDSADC PIIENLSGCCHHRAG >gi|296494482|gb|ADTN01000256.1| GENE 14 16523 - 16981 411 152 aa, chain - ## HITS:1 COG:ybbJ KEGG:ns NR:ns ## COG: ybbJ COG1585 # Protein_GI_number: 16128472 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport 
# Function: Membrane protein implicated in regulation of membrane protease activity # Organism: Escherichia coli K12 # 2 152 1 151 151 241 100.0 3e-64 MMELMVVHPHIFWLSLGGLLLAAEMLGGNGYLLWSGVAAVITGLVVWLVPLGWEWQGVMF AILTLLAAWLWWKWLSRRVREQKHSDSHLNQRGQQLIGRRFVLESPLVNGRGHMRVGDSS WPVSASEDLGAGTHVEVIAIEGITLHIRAVSS >gi|296494482|gb|ADTN01000256.1| GENE 15 16978 - 17895 1522 305 aa, chain - ## HITS:1 COG:ECs0552 KEGG:ns NR:ns ## COG: ECs0552 COG0330 # Protein_GI_number: 15829806 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Escherichia coli O157:H7 # 1 305 1 305 305 536 100.0 1e-152 MLIFIPILIFVALVIVGAGVKIVPQGYQWTVERFGRYTKTLQPGLSLVVPFMDRIGRKIN MMEQVLDIPSQEVISKDNANVTIDAVCFIQVIDAPRAAYEVSNLELAIINLTMTNIRTVL GSMELDEMLSQRDSINSRLLRIVDEATNPWGIKVTRIEIRDVRPPAELISSMNAQMKAER TKRAYILEAEGIRQAEILKAEGEKQSQILKAEGERQSAFLQAEARERSAEAEARATKMVS EAIASGDIQAVNYFVAQKYTEALQQIGSSSNSKVVMMPLEASSLMGSIAGIAELVKDSAN KRTQP >gi|296494482|gb|ADTN01000256.1| GENE 16 18041 - 18718 173 225 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 7 208 1 210 245 71 30 6e-12 MQENSPLLQLQNVGYLAGDAKILNNINFSLRAGEFKLITGPSGCGKSTLLKIVASLISPT SGTLLFEGEDVSTLKPEIYRQQVSYCAQTPTLFGDTVYDNLIFPWQIRNRQPDPAIFLDF LERFALPDSILTKNIAELSGGEKQRISLIRNLQFMPKVLLLDEITSALDESNKHNVNEMI HRYVREQNIAVLWVTHDKDEINHADKVITLQPHAGEMQEARYELA >gi|296494482|gb|ADTN01000256.1| GENE 17 18705 - 19484 785 259 aa, chain + ## HITS:1 COG:ECs0554 KEGG:ns NR:ns ## COG: ECs0554 COG0390 # Protein_GI_number: 15829808 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Escherichia coli O157:H7 # 1 259 10 268 268 443 100.0 1e-124 MNSHNITNESLALALMLVVVAILISHKEKLALEKDILWSVGRAIIQLIIVGYVLKYIFSV DDASLTLLMVLFICFNAAWNAQKRSKYIAKAFISSFIAITVGAGITLAVLILSGSIEFIP MQVIPIAGMIAGNAMVAVGLCYNNLGQRVISEQQQIQEKLSLGATPKQASAILIRDSIRA 
ALIPTVDSAKTVGLVSLPGMMSGLIFAGIDPVKAIKYQIMVTFMLLSTASLSTIIACYLT YRKFYNSRHQLVVTQLKKK >gi|296494482|gb|ADTN01000256.1| GENE 18 19547 - 20401 1382 284 aa, chain - ## HITS:1 COG:ybbN KEGG:ns NR:ns ## COG: ybbN COG3118 # Protein_GI_number: 16128476 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Thioredoxin domain-containing protein # Organism: Escherichia coli K12 # 1 284 13 296 296 481 100.0 1e-136 MSVENIVNINESNLQQVLEQSMTTPVLFYFWSERSQHCLQLTPILESLAAQYNGQFILAK LDCDAEQMIAAQFGLRAIPTVYLFQNGQPVDGFQGPQPEEAIRALLDKVLPREEELKAQQ AMQLMQESNYTDALPLLKDAWQLSNQNGEIGLLLAETLIALNRSEDAEAVLKTIPLQDQD TRYQGLVAQIELLKQAADTPEIQQLQQQVAENPEDAALATQLALQLHQVGRNEEALELLF GHLRKDLTAADGQTRKTFQEILAALGTGDALASKYRRQLYALLY >gi|296494482|gb|ADTN01000256.1| GENE 19 20462 - 21271 229 269 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 [alpha proteobacterium BAL199] # 14 268 3 249 259 92 32 2e-18 MTHKATEILTGKVMQKSVLITGCSSGIGLESALELKRQGFHVLAGCRKPDDVERMNSMGF TGVLIDLDSPESVDRAADEVIALTDNCLYGIFNNAGFGMYGPLSTISRAQMEQQFSANFF GAHQLTMRLLPAMLPHGEGRIVMTSSVMGLISTPGRGAYAASKYALEAWSDALRMELRHS GIKVSLIEPGPIRTRFTDNVNQTQSDKPVENPGIAARFTLGPEAVVDKVRHAFISEKPKM RYPVTLVTWAVMVLKRLLPGRVMDKILQG >gi|296494482|gb|ADTN01000256.1| GENE 20 21261 - 21917 565 218 aa, chain - ## HITS:1 COG:tesA KEGG:ns NR:ns ## COG: tesA COG2755 # Protein_GI_number: 16128478 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Escherichia coli K12 # 11 218 1 208 208 404 100.0 1e-113 MLPLTDGLLKMMNFNNVFRWHLPFLFLVLLTFRAAAADTLLILGDSLSAGYRMSASAAWP ALLNDKWQSKTSVVNASISGDTSQQGLARLPALLKQHQPRWVLVELGGNDGLRGFQPQQT EQTLRQILQDVKAANAEPLLMQIRLPANYGRRYNEAFSAIYPKLAKEFDVPLLPFFMEEV YLKPQWMQDDGIHPNRDAQPFIADWMAKQLQPLVNHDS >gi|296494482|gb|ADTN01000256.1| GENE 21 21855 - 22541 313 228 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 
13826] # 25 228 20 223 223 125 36 4e-28 MPAENIVEVHHLKKSVGQGEHELSILTGVELVVKRGETIALVGESGSGKSTLLAILAGLD DGSSGEVSLVGQPLHNMDEEARAKLRAKHVGFVFQSFMLIPTLNALENVELPALLRGESS AESRNGAKALLEQLGLGKRLDHLPAQLSGGEQQRVALARAFNGRPDVLFADEPTGNLDRQ TGDKIADLLFSLNREHGTTLIMVTHDLQLAARCDRCLRLVNGQLQEEA >gi|296494482|gb|ADTN01000256.1| GENE 22 22538 - 24952 2569 804 aa, chain + ## HITS:1 COG:ybbP KEGG:ns NR:ns ## COG: ybbP COG3127 # Protein_GI_number: 16128480 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Predicted ABC-type transport system involved in lysophospholipase L1 biosynthesis, permease component # Organism: Escherichia coli K12 # 1 804 1 804 804 1432 100.0 0 MIARWFWREWRSPSLLIVWLALSLAVACVLALGNISDRMEKGLSQQSREFMAGDRALRSS REVPQAWLEEAQKRGLKVGKQLTFATMTFAGDTPQLANVKAVDDIYPMYGDLQTNPPGLK PQAGSVLLAPRLMALLNLKTGDTIDVGDATLRIAGEVIQEPDSGFNPFQMAPRLMMNLAD VDKTGAVQPGSRVTWRYKFGGNENQLDGYEKWLLPQLKPEQRWYGLEQDEGALGRSMERS QQFLLLSALLTLLLAVAAVAVAMNHYCRSRYDLVAILKTLGAGRAQLRKLIVGQWLMVLT LSAVTGGAIGLLFENVLMVLLKPVLPAALPPASLWPWLWALGTMTVISLLVGLRPYRLLL ATQPLRVLRNDVVANVWPLKFYLPIVSVVVVLLLAGLMGGSMLLWAVLAGAVVLALLCGV LGWMLLNVLRRMTLKSLPLRLAVSRLLRQPWSTLSQLSAFSLSFMLLALLLVLRGDLLDR WQQQLPPESPNYFLINIATEQVAPLKAFLAEHQIVPESFYPVVRARLTAINDKPTEGNED EALNRELNLTWQNTRPDHNPIVAGNWPPKADEVSMEEGLAKRLNVALGDTVTFMGDTQEF RAKVTSLRKVDWESLRPNFYFIFPEGALDGQPQSWLTSFRWENGNGMLTQLNRQFPTISL LDIGAILKQVGQVLEQVSRALEVMVVLVTACGMLLLLAQVQVGMRQRHQELVVWRTLGAG KKLLRTTLWCEFAMLGFVSGLVAAIGAETALAVLQAKVFDFPWEPDWRLWIVLPCSGALL LSLFGGWLGARLVKGKALFRQFAG >gi|296494482|gb|ADTN01000256.1| GENE 23 25382 - 26279 449 299 aa, chain + ## HITS:1 COG:ECs0560 KEGG:ns NR:ns ## COG: ECs0560 COG3209 # Protein_GI_number: 15829814 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli O157:H7 # 1 299 1 302 1398 580 96.0 1e-165 MSGKPAARQGDMTQYGGPIVQGSAGVRIGAPTGVACSVCPGGMTSGNPVNPLLGAKVLPG ETDLALPGPLPFILSRTYSSYRTKTPAPVGVFGPGWKAPSDIRLQLRDDGLILNDNGGRS 
IHFEPLLPGEAVYSRSESMWLVRGGKAAQPDGHTLARLWGALPPDIRLSPHLYLATNSAQ
GPWWILGWSERVPGAEDVLPAPLPPYRVLTGMADRFGRTLTYRREAAGDLAGEITGVTDG
AGREFRLVLTTQAQRAEEARTSSLSSSDSSRPLSASAFPDTLPGTEYGPDRGIRLSAVW

Prediction of potential genes in microbial genomes
Time: Mon May 16 00:00:53 2011
Seq name: gi|296494481|gb|ADTN01000257.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont655.1, whole genome shotgun sequence
Length of sequence - 12709 bp
Number of predicted genes - 13, with homology - 13
Number of transcription units - 5, operones - 4 average op.length - 3.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Op 1 10/0.000 + CDS 3 - 1787 1322 ## COG3188 P pilus assembly protein, porin PapC
2 1 Op 2 . + CDS 1802 - 2530 601 ## COG3121 P pilus assembly protein, chaperone PapD
3 1 Op 3 . + CDS 2590 - 3588 713 ## JW5098 predicted fimbrial-like adhesin protein
+ Term 3597 - 3636 9.4
+ Prom 3622 - 3681 5.2
4 2 Op 1 . + CDS 3740 - 4579 616 ## COG3180 Putative ammonia monooxygenase
5 2 Op 2 . + CDS 4426 - 4785 197 ## COG3180 Putative ammonia monooxygenase
+ Term 4973 - 5018 1.7
6 3 Op 1 3/1.000 - CDS 4782 - 5573 697 ## COG0266 Formamidopyrimidine-DNA glycosylase
7 3 Op 2 8/0.000 - CDS 5609 - 6343 683 ## COG1540 Uncharacterized proteins, homologs of lactam utilization protein B
8 3 Op 3 21/0.000 - CDS 6333 - 7265 1003 ## COG1984 Allophanate hydrolase subunit 2
9 3 Op 4 4/0.667 - CDS 7259 - 7915 632 ## COG2049 Allophanate hydrolase subunit 1
10 3 Op 5 . - CDS 7938 - 8681 790 ## COG0327 Uncharacterized conserved protein
- Prom 8839 - 8898 3.6
+ Prom 8826 - 8885 6.3
11 4 Tu 1 . + CDS 8952 - 10433 1817 ## COG3104 Dipeptide/tripeptide permease
- Term 10398 - 10446 -0.9
12 5 Op 1 5/0.000 - CDS 10583 - 12001 892 ## COG0415 Deoxyribodipyrimidine photolyase
13 5 Op 2 .
- CDS 11998 - 12507 398 ## COG3272 Uncharacterized conserved protein
- Prom 12529 - 12588 4.0
Predicted protein(s)
>gi|296494481|gb|ADTN01000257.1| GENE 1 3 - 1787 1322 594 aa, chain + ## HITS:1 COG:ybgQ KEGG:ns NR:ns ## COG: ybgQ COG3188 # Protein_GI_number: 16128693 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli K12 # 2 594 220 818 818 1108 98.0 0
MTCLFAPSLTLGETDFSSNIFDGFSYTGAALASDDRMLPWELRGYAPQISGIAQTNATVT
ISQSGRVIYQKKVPPGPFIIDDLNQSVQGTLDVKVTEEDGRVNNFQVSAASTPFLTRQGQ
VRYKLAAGQPRPSMSHQTENETFFSNEVSWGMLSNTSLYGGLLISDDDYHSAAMGIGQNM
LWLGALSFDVTWASSHFDTQQDERGLSYRFNYSKQVDATNSTISLAAYRFSDRHFHSYAN
YLDHKYNDSDAQDEKQTISLSVGQPITPLNLNLYANLLHQTWWNADASTTANITAGFNVD
IGDWRDISISTSFNTTHYEDKDRDNQIYLSISLPFGNGGRVGYDMQNSSHSTIHRMSWND
TLDERNSWGMSAGLQSDRPDNGAQVSGNYQHLSSAGEWDISGTYAASDYSSVSSSWSGSF
TATQYGAAFHRRSSTNEPRLMVSTDGVADIPVQGNLDYTNHFGIAVVPLISSYQPSTVAV
NMNDLPDGVTVAENVIKETWIEGAIGYKSLASRSGKDVNVIIRNASGQFPPLGADIRQDD
SGISVGMVGEEGHAWLSGVAENQLFTVVWGEQSCIIHLPERLEDTTKRLILPCH
>gi|296494481|gb|ADTN01000257.1| GENE 2 1802 - 2530 601 242 aa, chain + ## HITS:1 COG:ybgP KEGG:ns NR:ns ## COG: ybgP COG3121 # Protein_GI_number: 16128692 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli K12 # 1 242 1 242 242 478 100.0 1e-135
MTFIKGLPLMLLTISLGCNAAVQPDRTRIVFNANDKATSLRIENQSDKLPYLAYSWIENE
KGEKSDALLVALPPIQRLEPKATSQVRVVKQASTTQLPGDRETLFFYNMREIPPAPDKSS
DHAILQVAIQSRIKLFWRPAALRKKAGEKVELQLQVSQQGNQLTLKNPTAYYLTIAYLGR
NEKGVLPGFKTVMVAPFSTVNTNTGSYSGSQFYLGYMDDYGALRMTTLNCSGQCRLQAVE
AK
>gi|296494481|gb|ADTN01000257.1| GENE 3 2590 - 3588 713 332 aa, chain + ## HITS:1 COG:no KEGG:JW5098 NR:ns ## KEGG: JW5098 # Name: ybgO # Def: predicted fimbrial-like adhesin protein # Organism: E.coli_J # Pathway: not_defined # 1 332 22 353 353 679 100.0 0
MALNCYFGTSGGAVEKSEAIQPFAVPGNAKPGDKIWESDDIKIPVYCDNNTNGNFESEHV YAWVNPYPGVQDRYYQLGVTYNGVDYDASLGKSRIDTNQCIDSKNIDIYTPEQIIAMGWQ NKICSGDPANIHMSRTFLARMRLYVKIREMPPHDYQSTLSDYIVVQFDGAGSVNEDPTAQ NLKYHITGLENIRVLDCSVNFSISPETQVIDFGKFNLLDIRRHTMSKTFSIKTTKSQNDQ CTDGFKVSSSFYTEETLVEEDKALLIGNGLKLRLLDENASPYTFNKYAEYADFTSDMLVY EKTYTAELSSIPGTPIEAGPFDTVVLFKINYN >gi|296494481|gb|ADTN01000257.1| GENE 4 3740 - 4579 616 279 aa, chain + ## HITS:1 COG:abrB KEGG:ns NR:ns ## COG: abrB COG3180 # Protein_GI_number: 16128690 # Func_class: R General function prediction only # Function: Putative ammonia monooxygenase # Organism: Escherichia coli K12 # 1 245 16 260 363 378 99.0 1e-105 MPVLQWGMLCVLSLLLSIGFLAVHLPAALLLGPMIAGIIFSMRGITLQLPRSAFLAAQAI LGCMIAQNLTGSILTTLAVNWPIVLAILLVTLLSSAIVGWLLVRYSSLPGNTGAWGSSPG GAAAMVAMAQDYGADIRLVAFMQYLRVLFVAGAAVLVTRMMLGDNAEAVNQHIVWFPPVS INLLLTILLAVVAGTVGCLLRLPSGTMLIPMLAGAVLQSGQLITIELPEWLLAMAYMAIG WRLVLVSINKSYCGHYARYRKSCCRFLLCWLFVRVWRGG >gi|296494481|gb|ADTN01000257.1| GENE 5 4426 - 4785 197 119 aa, chain + ## HITS:1 COG:abrB KEGG:ns NR:ns ## COG: abrB COG3180 # Protein_GI_number: 16128690 # Func_class: R General function prediction only # Function: Putative ammonia monooxygenase # Organism: Escherichia coli K12 # 14 119 258 363 363 177 99.0 3e-45 MAAGDGVYGNWLAVGLGFDKQILLRALRPLPQILLSIFALLAICAGMAWGLTRFMHIDFM TAYLATSPGGLDTVAVIAAGSNADMALIMAMQTLRLFSILLTGPAIARFISTYAPKRSA >gi|296494481|gb|ADTN01000257.1| GENE 6 4782 - 5573 697 263 aa, chain - ## HITS:1 COG:nei KEGG:ns NR:ns ## COG: nei COG0266 # Protein_GI_number: 16128689 # Func_class: L Replication, recombination and repair # Function: Formamidopyrimidine-DNA glycosylase # Organism: Escherichia coli K12 # 1 263 1 263 263 547 100.0 1e-156 MPEGPEIRRAADNLEAAIKGKPLTDVWFAFPQLKPYQSQLIGQHVTHVETRGKALLTHFS NDLTLYSHNQLYGVWRVVDTGEEPQTTRVLRVKLQTADKTILLYSASDIEMLTPEQLTTH PFLQRVGPDVLDPNLTPEVVKERLLSPRFRNRQFAGLLLDQAFLAGLGNYLRVEILWQVG LTGNHKAKDLNAAQLDALAHALLEIPRFSYATRGQVDENKHHGALFRFKVFHRDGEPCER CGSIIEKTTLSSRPFYWCPGCQH 
>gi|296494481|gb|ADTN01000257.1| GENE 7 5609 - 6343 683 244 aa, chain - ## HITS:1 COG:ybgL KEGG:ns NR:ns ## COG: ybgL COG1540 # Protein_GI_number: 16128688 # Func_class: R General function prediction only # Function: Uncharacterized proteins, homologs of lactam utilization protein B # Organism: Escherichia coli K12 # 1 244 1 244 244 442 100.0 1e-124 MKIDLNADLGEGCASDAELLTLVSSANIACGFHAGDAQIMQACVREAIKNGVAIGAHPSF PDRENFGRSAMQLPPETVYAQTLYQIGALATIARAQGGVMRHVKPHGMLYNQAAKEAQLA DAIARAVYACDPALILVGLAGSELIRAGKQYGLTTREEVFADRGYQADGSLVPRSQSGAL IENEEQALAQTLEMVQHGRVKSITGEWATVAAQTVCLHGDGEHALAFARRLRSAFAEKGI VVAA >gi|296494481|gb|ADTN01000257.1| GENE 8 6333 - 7265 1003 310 aa, chain - ## HITS:1 COG:ybgK KEGG:ns NR:ns ## COG: ybgK COG1984 # Protein_GI_number: 16128687 # Func_class: E Amino acid transport and metabolism # Function: Allophanate hydrolase subunit 2 # Organism: Escherichia coli K12 # 1 310 1 310 310 634 100.0 0 MLKIIRAGMYTTVQDGGRHGFRQSGISHCGALDMPALRIANLLVGNDANAPALEITLGQL TVEFETDGWFALTGAGCEARLDDNAVWTGWRLPMKAGQRLTLKRPQHGMRSYLAVAGGID VPPVMGSCSTDLKVGIGGLEGRLLKDGDRLPIGKSKRDSMEAQGVKQLLWGNRIRALPGP EYHEFDRASQDAFWRSPWQLSSQSNRMGYRLQGQILKRTTDRELLSHGLLPGVVQVPHNG QPIVLMNDAQTTGGYPRIACIIEADMYHLAQIPLGQPIHFVQCSLEEALKARQDQQRYFE QLAWRLHNEN >gi|296494481|gb|ADTN01000257.1| GENE 9 7259 - 7915 632 218 aa, chain - ## HITS:1 COG:ECs0736 KEGG:ns NR:ns ## COG: ECs0736 COG2049 # Protein_GI_number: 15829990 # Func_class: E Amino acid transport and metabolism # Function: Allophanate hydrolase subunit 1 # Organism: Escherichia coli O157:H7 # 1 218 1 218 218 417 100.0 1e-117 MQRARCYLIGETAVVLELEPPVTLASQKRIWRLAQRLVDMPNVVEAIPGMNNITVILRNP ESLALDAIERLQRWWEESEALEPESRFIEIPVVYGGAGGPDLAVVAAHCGLSEKQVVELH SSVEYVVWFLGFQPGFPYLGSLPEQLHTPRRAEPRLLVPAGSVGIGGPQTGVYPLATPGG WQLIGHTSLSLFDPARDEPILLRPGDSVRFVPQKEGVC >gi|296494481|gb|ADTN01000257.1| GENE 10 7938 - 8681 790 247 aa, chain - ## HITS:1 COG:ECs0735 KEGG:ns NR:ns ## COG: ECs0735 COG0327 # Protein_GI_number: 15829989 # Func_class: S Function unknown # Function: 
Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 247 1 247 247 499 100.0 1e-141 MKNTELEQLINEKLNSAAISDYAPNGLQVEGKETVQKIVTGVTASQALLDEAVRLGADAV IVHHGYFWKGESPVIRGMKRNRLKTLLANDINLYGWHLPLDAHPELGNNAQLAALLGITV MGEIEPLVPWGELTMPVPGLELASWIEARLGRKPLWCGDTGPEVVQRVAWCTGGGQSFID SAARFGVDAFITGEVSEQTIHSAREQGLHFYAAGHHATERGGIRALSEWLNENTDLDVTF IDIPNPA >gi|296494481|gb|ADTN01000257.1| GENE 11 8952 - 10433 1817 493 aa, chain + ## HITS:1 COG:ybgH KEGG:ns NR:ns ## COG: ybgH COG3104 # Protein_GI_number: 16128684 # Func_class: E Amino acid transport and metabolism # Function: Dipeptide/tripeptide permease # Organism: Escherichia coli K12 # 1 493 1 493 493 920 99.0 0 MNKHASQPRAIYYVVALQIWEYFSFYGMRALLILYLTNQLKYNDTHACELFSAYCSLVYV TPILGGFLADKVLGNRMAVMLGALLMAIGHVVLGASEIHPSFLYLSLAIIVCGYGLFKSN VSCLLGELYEPTDPRRDGGFSLMYAAGNVGSIIAPIACGYAQEEYSWAMGFGLAAVGMIA GLVIFLCGNRHFTHTRGVNKKVLRATNFLLPNWGWLLVLLVATPALITILFWKEWSVYAL IVATIIGLGVLAKIYRKAENQKQRKELGLIVTLTFFSMLFWAFAQQGGSSISLYIDRFVN RDMFGYTVPTAMFQSINAFAVMLCGVFLAWVVKESVAGNRTVRIWGKFALGLGLMSAGFC ILTLSARWSAMYGHSSLPLMVLGLAVMGFAELFIDPVAMSQITRIEIPGVTGVLTGIYML LSGAIANYLAGVIADQTSQASFDASGAINYSINAYIEVFDQITWGALACVGLVLMIWLYQ ALKFRNRALALES >gi|296494481|gb|ADTN01000257.1| GENE 12 10583 - 12001 892 472 aa, chain - ## HITS:1 COG:phrB KEGG:ns NR:ns ## COG: phrB COG0415 # Protein_GI_number: 16128683 # Func_class: L Replication, recombination and repair # Function: Deoxyribodipyrimidine photolyase # Organism: Escherichia coli K12 # 1 472 1 472 472 967 99.0 0 MTTHLVWFRQDLRLHDNLALAAACRNSSARVLALYIATPRQWATHNMSPRQAELINAQLN GLQIALAEKGIPLLFREVDDFVASVEIVKQVCAENSVTHLFYNYQYEVNERARDVEVERA LRNVVCEGFDDSVILPPGAVMTGNHEMYKVFTPFKNAWLKRLREGMPECVAAPKVRSSGS IEPSPSITLNYPRQSFDTAHFPVEEKAAIAQLRQFCQNGAGEYEQQRDFPAVEGTSRLSA SLATGGLSPRQCLHRLLAEQPQALDGGAGSVWLNELIWREFYRHLITYHPSLCKHRPFIA WTDRVQWQSNPAHLQAWQEGKTGYPIVDAAMRQLNSTGWMHNRLRMITASFLVKDLLIDW REGERYFMSQLIDGDLAANNGGWQWAASTGTDAAPYFRIFNPTTQGEKFDQGGEFIRQWL PELRDVPGKVVHEPWKWAQKAGVTLDYPQPIVEHKEARVQTLAAYEAARKGK 
>gi|296494481|gb|ADTN01000257.1| GENE 13 11998 - 12507 398 169 aa, chain - ## HITS:1 COG:ybgA KEGG:ns NR:ns ## COG: ybgA COG3272 # Protein_GI_number: 16128682 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 169 1 169 169 315 100.0 2e-86
MNLQRFDDSTLIRIFALHELHRLKEHGLTRGALLDYHSRYKLVFLAHSQPEYRKLGPFVA
DIHQWQNLDDYYNQYRQRVVVLLSHPANPRDHTNVLMHVQGYFRPHIDSTERQQLAALID
SYRRGEQPLLAPLMRIKHYMALYPDAWLSGQRYFELWPRVINLRHSGVL

Prediction of potential genes in microbial genomes
Time: Mon May 16 00:01:04 2011
Seq name: gi|296494480|gb|ADTN01000258.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont659.1, whole genome shotgun sequence
Length of sequence - 16691 bp
Number of predicted genes - 15, with homology - 15
Number of transcription units - 6, operones - 3 average op.length - 4.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
+ Prom 766 - 825 2.1
2 2 Tu 1 . + CDS 881 - 1009 123 ## ECS88_3283 hypothetical protein
- Term 1003 - 1045 4.3
3 3 Tu 1 . - CDS 1062 - 2843 316 ## COG1201 Lhr-like helicases
4 4 Op 1 . - CDS 3254 - 4570 555 ## KPN_pKPN3p05867 biotin carboxylase
5 4 Op 2 . - CDS 4574 - 6883 655 ## KPN_pKPN3p05868 hypothetical protein
- Prom 7017 - 7076 5.2
- Term 7121 - 7168 2.5
6 5 Op 1 11/0.000 - CDS 7265 - 8095 306 ## COG2801 Transposase and inactivated derivatives
7 5 Op 2 . - CDS 8122 - 8385 331 ## COG2801 Transposase and inactivated derivatives
8 5 Op 3 .
- CDS 8435 - 9646 669 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair
- Prom 9863 - 9922 4.4
- Term 9940 - 9974 0.5
9 6 Op 1 3/0.000 - CDS 10004 - 10936 1029 ## COG1052 Lactate dehydrogenase and related dehydrogenases
10 6 Op 2 7/0.000 - CDS 10950 - 12260 1111 ## COG0477 Permeases of the major facilitator superfamily
- Prom 12280 - 12339 2.0
11 6 Op 3 1/0.000 - CDS 12359 - 13507 1315 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily
12 6 Op 4 2/0.000 - CDS 13504 - 14121 699 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase
13 6 Op 5 1/0.000 - CDS 14105 - 14983 519 ## COG3734 2-keto-3-deoxy-galactonokinase
14 6 Op 6 1/0.000 - CDS 14980 - 15669 666 ## COG2186 Transcriptional regulators
- Prom 15846 - 15905 6.8
15 6 Op 7 . - CDS 16237 - 16473 279 ## COG0561 Predicted hydrolases of the HAD superfamily
Predicted protein(s)
>gi|296494480|gb|ADTN01000258.1| GENE 1 2 - 685 382 227 aa, chain + ## HITS:1 COG:ECs4547 KEGG:ns NR:ns ## COG: ECs4547 COG3436 # Protein_GI_number: 15833801 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 227 285 511 512 390 83.0 1e-108
WFAYSPDRKGIHPQTHLAGFSGVLQADAYAGFNELYRDGHIKEAACWAHARRKIHDVHVR
TPSALTEEALKRIGELYAIESELRGKRAEERQAVRHQKVLPLLASLEGWLREKQKTLSRH
SELAKAFGYALNQWPALTRYAEDGWVEVDNNIAENALRLVSLGRKNWLFFGSDHGGERGA
SLYSLIGTCKLNGVDPERYLHHVLDVIADWPVNRVGALLPWRVTLPA
>gi|296494480|gb|ADTN01000258.1| GENE 2 881 - 1009 123 42 aa, chain + ## HITS:1 COG:no KEGG:ECS88_3283 NR:ns ## KEGG: ECS88_3283 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 42 66 107 107 67 97.0 2e-10
MGLKPGPKPIAKSTGKPDQRRRDNKDTPGNTPGLKPSKSTGK
>gi|296494480|gb|ADTN01000258.1| GENE 3 1062 - 2843 316 593 aa, chain - ## HITS:1 COG:alr3331 KEGG:ns NR:ns ## COG: alr3331 COG1201 # Protein_GI_number: 17230823 # Func_class: R General function prediction only # Function: Lhr-like
helicases # Organism: Nostoc sp. PCC 7120 # 8 576 147 706 722 268 30.0 2e-71 MGWLKQAFSSVAYIVIDEFHAFIGSERGVQLLSLLNRIDHVLGCQVNPIPRVALSATLGE LEKVPEMLRPDKRLPCVTVTDSNSMATLQVQVKGYLERVIQNEEELQSSAEHDVCADIFR LCRGDSHLVFANSRKRTESIAATLSDMCEEQIVPNEFFPHHGSLAKELREVLESRLQKGN LPTTAVCTMTLELGIDIGKVKSVIQVTPPHSVSSLRQRMGRSGRRDSPSVLRMLITENEL TVSSSIVDHLRLQLVQSMAMIRLMISKQWFEPADSRQMHYSTLLHQILAITAQWGGVRAD QLWSQLCQTGPFRNVDLNDFKSLLKHMGACGLLTQLASGEMVVGAEGEKLTNHYTFYAVF NTPEEFRIITGNRTLGTVPVDSPLLPDQHIIFGGRRWKVTEIETEKKVIYVEATKGGQPP QFSGGGMSVHDAVRQEMLAIYREGDYRIAIGSKKVDYADTAARNLFAEGCSNFQRFKLQN ECFITSGQHCYVIPWMGDKVVNTITALLIRCGFKANSFAGVIEIDNSSVASVQHALKEML LSGLPSAFDLATDVPEKYLDKYDEYLPESLLAKGYGAKAYETEGTRIWLQKHL >gi|296494480|gb|ADTN01000258.1| GENE 4 3254 - 4570 555 438 aa, chain - ## HITS:1 COG:no KEGG:KPN_pKPN3p05867 NR:ns ## KEGG: KPN_pKPN3p05867 # Name: not_defined # Def: biotin carboxylase # Organism: K.pneumoniae # Pathway: not_defined # 1 438 1 438 438 868 99.0 0 MSVTRIRVKERDAIIQSLKSGVTPRIGIQHIQVGRVNEITALHQDIERIADGGASFRLII GEYGSGKTFFLSVVRSIALEKKLVSVSADLSPDRRIHSSGGQARNLYSELMKNMSTRNKP DGNALLSVVERFVTEARKEADTDGSEVSSVIHKRLAALSDMVGGYDFAKIIDAFWRGHEQ DNETLKSNAIRWLRGEYTTKTDARHDLDVRTIISDASFYDSLKLMSLFVRQAGYAGLLVS LDEMVNLLKLNNTQARTANYEQILRILNDCLQGSAENIGFILGGTPEFLFDPRKGLYSYE ALQSRLAENRFAQKAGVIDYSSPTLHLASLTPEELYILLRNLRHVYAGGDPENYLVPDEA LTGFLHHCSKTIGDAYFRTPRNTIKGFLDMLAVLEQNRTMNWQTLIEGVAIEEDRPSDMD DTTTEESDDDDGLANFKL >gi|296494480|gb|ADTN01000258.1| GENE 5 4574 - 6883 655 769 aa, chain - ## HITS:1 COG:no KEGG:KPN_pKPN3p05868 NR:ns ## KEGG: KPN_pKPN3p05868 # Name: not_defined # Def: hypothetical protein # Organism: K.pneumoniae # Pathway: not_defined # 1 769 1 769 769 1471 99.0 0 MELLKFLAAIFVIYLLFFRKKKRKSPKSYASDPVKAPPKEWLADIKKKEAVVRNDDSEDD NLATFTLSGGRGVELRVTTNHYRNTSKSAGALARWVLPGEMITVAGVAISGGHIYVGQRM KPSGQESGGYYDDGSEASLIDDTCKIKPTSYLYEDSSLGYWPSFSSLSPEARGAYVSWLA SDRSDPSCPIGYVFIYLYGLERKALVDSTDPKFPDAEFRNLFNEVARLRSIFIENRSFRG YSTQLLEAMSILRPNMGLATELDSNSGFSNSMQFKLALAKTVHEGVPVSAELALNWVINH 
TEYSLRTPARRCAKEFAALFKRRYTLKYGEGMVVKANKTRLRLDYTPASPSLRGVRLPVP DLPDPSALKSPVQKIMALADICTDELDAYSRYLGRKGTSVNDTAAIMLLPSEIVNESAEK ILSSFKRWADEAILVKEGLVSVADFWAHMNASCPNKINKKEADLMQAFALKMGYGLAPDP YYHHVKADVDGTLVLFPAAEGGRFSPSPEFISAVMTLRLGAMVALIDDSLDQAEQKVLEN AINNNPGFTDDEKRSLHAYLTWQLHTPANMTGMKSRIELMGAAEKAAVGKVIVSVACSDG TIAPAEIKQLEKIYSSLGLDPSSVSSNIHQHSATEHDLVSSVPTDQPAAGFTLDANVLAR HESATDDVRKLLNTIFTEEEPEEPESAPASSTEAGGLDSAHSQLYRSLLEKEQWSRKEAT ELCGNLNLMLGGALEVINDWSYAVVDAPVLDDADDDIWVDLEIAKELEG >gi|296494480|gb|ADTN01000258.1| GENE 6 7265 - 8095 306 276 aa, chain - ## HITS:1 COG:YPCD1.04 KEGG:ns NR:ns ## COG: YPCD1.04 COG2801 # Protein_GI_number: 16082695 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Yersinia pestis # 63 275 2 214 215 413 93.0 1e-115 MLMCDATGLSQRRACRLTGLSLSTCRYEAHRPAADAHLSGRITELALERRRFGYRRIWQL LRREGLHVNHKRVYRLYHLSGLGVKRRRRRKGLATERLPLLRPAAPNLTWSMDFVMDALS TGRRIKCLTCVDDFTKECLTVTVAFGISGVQVTRILDSIALFRGYPATIRTDQGPEFTCR ALDQWAFEHGVELRLIQPGKPTQNGFIESFNGRFRDECLNEHWFSDIVHARKIINDWRQD YNECRPHSTLNYQTPSEFAAGWRKGHSENEDSDVTN >gi|296494480|gb|ADTN01000258.1| GENE 7 8122 - 8385 331 87 aa, chain - ## HITS:1 COG:PA0986 KEGG:ns NR:ns ## COG: PA0986 COG2801 # Protein_GI_number: 15596183 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Pseudomonas aeruginosa # 1 81 1 81 87 103 67.0 6e-23 MKKRFSDEQIISILREAEAGVPARELCRKHAISDATFYTWRKKYGGMEVPEVKRLKSLEE ENARLKKLLAEAMLDKEALQVALGRKY >gi|296494480|gb|ADTN01000258.1| GENE 8 8435 - 9646 669 403 aa, chain - ## HITS:1 COG:PSLT054 KEGG:ns NR:ns ## COG: PSLT054 COG0389 # Protein_GI_number: 17233501 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Salmonella typhimurium LT2 # 1 397 1 397 424 722 86.0 0 MFALADVNSFYDSCEKVFRPDLRGKPVVVLSNNDGCVIARSKEAKLLGIKMGVPWFQLKD AQFPEKLHVFSSNYELYASMSARVMTHLEELAPRVEQYSIDEMFLDIRGINRCIDFEDFG 
RQLRVHVGNGTGLTIGVGMGPTKTLAKSAQWASKEWPQFGGVLALTPGNPHRTEKLLSLQ PVEEIWGVGRRISKTLGTMGITTALQLARANPTFIRKNFNVVLERTVRELNGEPCILLEE APPPKQQIVCSRSFGERVTTYEAMRQAICQHAERAAEKLRGEHQYCRHISAFIKTSPFAI NEPYYGNLATEKLMTPTRDTRDIITAAVRALDRIWVSGHRYAKAGIMLNDFSPTGVSQLN LFDDVQPRAHSDKLMKVLDGINHSGLGKVWFAGRGIADLLPVD >gi|296494480|gb|ADTN01000258.1| GENE 9 10004 - 10936 1029 310 aa, chain - ## HITS:1 COG:YPO2536 KEGG:ns NR:ns ## COG: YPO2536 COG1052 # Protein_GI_number: 16122754 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Yersinia pestis # 2 302 5 307 316 268 46.0 1e-71 MKQKVLKQAYLPDALTWELSQRYDLYDLALLSDTELQAVASEIAVVITNGEAVVTREFIN TLPALKLIAVFGVGYDGVDVAAARDAGVDVTHTPGVLTDDVADLAMGLMLAVSRKIVAAQ KFIEQAGWQNSGFQWTRKVSGKRLGILGMGRIGQAIARRAAAFDMEISYSDRQKNNALIW NYIPDLQALAQNSDFLMVCAPGGEGTKALINQSVLEALGAEGILINISRGSVVDEDALIA ALENNTIAGAALDVFAHEPHVPVSLQKRDNVVITPHMASATWETRREMSRLVLENVEAWF AGLPLVTPVP >gi|296494480|gb|ADTN01000258.1| GENE 10 10950 - 12260 1111 436 aa, chain - ## HITS:1 COG:BMEII1096 KEGG:ns NR:ns ## COG: BMEII1096 COG0477 # Protein_GI_number: 17989441 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Brucella melitensis # 9 415 36 440 451 303 43.0 4e-82 MNTDLYSSTLRQLNAKIIPFIIICYFVANLDKTNISIAALQMNADLGLSASMYGLGVGMF YISYIIFEIPSNVIMTKVGARLWIARIMITWGIVSAGMSLVHTPTQLYVMRFLLGMAEAG FTPGIIYYISCWFPKSNRARAMSFFYMGSVMASIIGLPISGSILNMHGIADIAGWRWLFA IEGIPAIVLGVLVLWRLPQSPDHAAWLSTEQKGWLKAQLARDNASVEVGANHSWIGALKN KTVLLLSLVWFLQAFGSIGITLFLPLIIKSMASDQSNIVIGLLSAVPFIAACLFMYINGR HSDTTNERSWHLGLPLILSGLSLAIAIYSGNLLVAYVLLVLTVGFNFALTPIFWAVTTEK LAGVAAAASIAFINTVANFVGLGLPPLLGKIKDATNSYHYGLLIVAVALIIGGIIGILVS RPAPLRVSEDPASRLS >gi|296494480|gb|ADTN01000258.1| GENE 11 12359 - 13507 1315 382 aa, chain - 
## HITS:1 COG:dgoA_2 KEGG:ns NR:ns ## COG: dgoA_2 COG4948 # Protein_GI_number: 16131560 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Escherichia coli K12 # 1 382 96 477 477 731 90.0 0 MKITKLTTYRLPPRWMFLKIETDEGVVGWGEPVIEGRARTVEAAVHELGEYLIGQDPARI NDLWQVMYRAGFYRGGPILMSAIAGIDQALWDIKGKVHNAPVWQLMGGLVRDKIKAYSWV GGDRPAEAIAGISTLRKIGFDTFKLNGCEELGIIDDSRKVDAAVNTAAQIREAFGNEIEF GLDFHGRVSAPMAKVLIKELEQYRPLFIEEPVLAEQAEYYPRLAAQTHIPIAAGERMFTR FEFKRVLEAGGVAILQPDLSHAGGITECYKIAGMAEAYDVALAPHCPLGPIALAACLHVD FVSHNAVFQEQSMGIHYNKGAELLDFVKNKEDFSMEGGFFKPLMKPGLGVEIDEEKVHEL SKHFQDWRNPLWRHADGTVAEW >gi|296494480|gb|ADTN01000258.1| GENE 12 13504 - 14121 699 205 aa, chain - ## HITS:1 COG:RSc2752 KEGG:ns NR:ns ## COG: RSc2752 COG0800 # Protein_GI_number: 17547471 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Ralstonia solanacearum # 1 201 3 203 213 240 61.0 1e-63 MQWQNKLPLIAILRGIKPDEAHAHVAALIEAGFEAVEIPLNSPEWEKTIPAMVKAFGDKA LIGAGTVLKLEQVDQLAAMGCTLMVTPNIQPELIRRAVEHGMTVCPGCATATEAFDAIDA DAQSLKIFPSSAFGPDYIKALKAVLPPEIPVFAVGGVTPENLKQWIDAGCLGAGLGSDLY RAGQPVERTAQQAAAFVKAYREAVK >gi|296494480|gb|ADTN01000258.1| GENE 13 14105 - 14983 519 292 aa, chain - ## HITS:1 COG:dgoK KEGG:ns NR:ns ## COG: dgoK COG3734 # Protein_GI_number: 16131561 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-galactonokinase # Organism: Escherichia coli K12 # 1 291 1 291 292 393 68.0 1e-109 MTSRYIAIDWGSTNLRAWLYQSEKCLESRKSEAGITRLNGRTFEAVLSEITEGWMLENTP VVMAGMVGSNAGWVNVPYLPCPARLNDIGRHLTRVKDNVWIVPGLCINNADNHNVMRGEE TQLVGAHTLRPSALYVMPGTHCKWVRVDGDRVEDFRTVMTGELHHVLLTHTLVGTGLPQQ TDSPEAFNAGLECGLQSPAILPRLFEVRASYVLGVRPREQVSEFLSGLLIGSEVASMGEY IPVWQTVTLVASPSLTARYQRAFALFGRQSQALSGDDAFQVGIRSIAYAVAK >gi|296494480|gb|ADTN01000258.1| GENE 14 14980 - 15669 666 229 aa, chain - ## HITS:1 COG:STM3830 KEGG:ns 
NR:ns ## COG: STM3830 COG2186 # Protein_GI_number: 16767115 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Salmonella typhimurium LT2 # 1 229 1 229 229 384 88.0 1e-107
MTITKTDRITFTLGRQIVSGKYVPGAALPSEADLCDEFETSRNIIREVFRSLTAKRLIEM
KRYRGAFVAARNQWNYLDTEVLQWVLENDYDPRLISAMSEVRNLVEPAIARWAAERATSS
ELAQIEAALNDMIANNQDRYAFNEADIRYHEAVLNAVHNPVLQQLSGAISSLQRAVFERT
WMGDEANMPKTLQEHKALFDAIRHQDSNAAEQAALTMIASSTTRLKDIT
>gi|296494480|gb|ADTN01000258.1| GENE 15 16237 - 16473 279 78 aa, chain - ## HITS:1 COG:STM3831 KEGG:ns NR:ns ## COG: STM3831 COG0561 # Protein_GI_number: 16767116 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Salmonella typhimurium LT2 # 12 78 204 270 281 106 80.0 9e-24
MGQPLPTDKPKLAGKLGIKPEEVMTLGDQENDIAMIEYAGMSVAMENAIPSAKDVADFVT
KSNLEDGVAYAIEKFALS

Prediction of potential genes in microbial genomes
Time: Mon May 16 00:01:26 2011
Seq name: gi|296494479|gb|ADTN01000259.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont663.1, whole genome shotgun sequence
Length of sequence - 20349 bp
Number of predicted genes - 19, with homology - 18
Number of transcription units - 11, operones - 4 average op.length - 3.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 3 - 306 398 ## UTI89_C2275 hypothetical protein
2 2 Tu 1 . - CDS 569 - 1141 515 ## SbBS512_E1123 hypothetical protein
3 3 Tu 1 . + CDS 1070 - 1258 158 ## c1279 hypothetical protein
- Term 1428 - 1466 7.1
4 4 Op 1 . - CDS 1610 - 2020 460 ## ECP_2042 hypothetical protein
5 4 Op 2 . - CDS 2036 - 2518 466 ## UTI89_C2271 hypothetical protein
- Prom 2606 - 2665 5.5
6 5 Op 1 . - CDS 2848 - 3918 554 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily
7 5 Op 2 . - CDS 3915 - 4820 588 ## ECP_2039 hypothetical protein
8 5 Op 3 . - CDS 4817 - 7201 1637 ## COG0699 Predicted GTPases (dynamin-related)
- Prom 7296 - 7355 4.7
- Term 7250 - 7280 2.9
9 6 Tu 1 .
- CDS 7419 - 7853 442 ## COG3596 Predicted GTPase
- Prom 8073 - 8132 5.3
+ Prom 8206 - 8265 4.4
10 7 Op 1 4/0.000 + CDS 8375 - 10447 1890 ## COG1629 Outer membrane receptor proteins, mostly Fe transport
11 7 Op 2 33/0.000 + CDS 10458 - 11447 782 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component
12 7 Op 3 35/0.000 + CDS 11466 - 12524 943 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component
13 7 Op 4 . + CDS 12521 - 13288 232 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34
14 7 Op 5 . + CDS 13342 - 13599 264 ## UTI89_C2262 hypothetical protein
15 8 Tu 1 . - CDS 14130 - 15275 320 ## UTI89_C2261 hypothetical protein
+ Prom 15927 - 15986 2.6
16 9 Tu 1 . + CDS 16011 - 16103 130 ##
+ Prom 16561 - 16620 2.0
17 10 Op 1 9/0.000 + CDS 16800 - 17822 547 ## COG4584 Transposase and inactivated derivatives
18 10 Op 2 . + CDS 17822 - 18601 611 ## COG1484 DNA replication protein
19 11 Tu 1 . - CDS 19627 - 20229 591 ## ECS88_2083 hypothetical protein
Predicted protein(s)
>gi|296494479|gb|ADTN01000259.1| GENE 1 3 - 306 398 101 aa, chain - ## HITS:1 COG:no KEGG:UTI89_C2275 NR:ns ## KEGG: UTI89_C2275 # Name: yafX2 # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 101 25 125 181 209 93.0 2e-53
MKTLSQNTTSSACAPETGLQQLVATIVPDEQRISFWPQHFGLIPQWVTLEPRVFGWMDRL
CEDYYGGIWNLYSLNNGGAFMAPEPDDDDGENWVLFNAMNG
>gi|296494479|gb|ADTN01000259.1| GENE 2 569 - 1141 515 190 aa, chain - ## HITS:1 COG:no KEGG:SbBS512_E1123 NR:ns ## KEGG: SbBS512_E1123 # Name: not_defined # Def: hypothetical protein # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 1 190 126 315 315 368 99.0 1e-101
MLRLRRAGEINGEHVPEIILLNSHDGTSSYQMLPGYFRFVCQNGCVCGQSLGEVRVPHRG
DVVEKVIEGAYEVVGVFDRIEEKRDAMQSLVLPPPARQALAQAALTYRYGDEHQPVTTAD
ILTPRRREDYGKDLWSAYQTIQENMLKGGISGRSAKGKRIHTRAIHSIDTDIKLNRALWV
MAETMLESLR
>gi|296494479|gb|ADTN01000259.1| GENE 3 1070 - 1258 158 62 aa, chain + ## HITS:1 COG:no KEGG:c1279 NR:ns ## KEGG: c1279 # Name:
not_defined # Def: hypothetical protein # Organism: E.coli_CFT073 # Pathway: not_defined # 1 59 199 257 302 126 98.0 2e-28 MRVEQNNFRDMFSVYLSGPPQTQHVFGVSPAARVAHTGLAGEEWLKAFPLQAFQYGDGGM YV >gi|296494479|gb|ADTN01000259.1| GENE 4 1610 - 2020 460 136 aa, chain - ## HITS:1 COG:no KEGG:ECP_2042 NR:ns ## KEGG: ECP_2042 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_536 # Pathway: not_defined # 1 136 1 136 136 259 100.0 2e-68 MKTFIKTLLVAVTILFSVFATAKQVKLPNNIKYVNTTEAFSCTEIDGMNCQTKNPFNYKD NSYVFVLERGGAWCYDYTVSVLNLKTGKAQMLEYKDNQLCSGSNKPFFEIKNGVPTVGVI DTSGKPVVVALDKLKT >gi|296494479|gb|ADTN01000259.1| GENE 5 2036 - 2518 466 160 aa, chain - ## HITS:1 COG:no KEGG:UTI89_C2271 NR:ns ## KEGG: UTI89_C2271 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 160 102 261 261 335 100.0 3e-91 MKRLHTLMLFLAVLFTGFNVEAASVKQALSCDPNARAEQPGACPTTYELYEGDAAYKAAL DKALKPVGLSGMFGKGGYMDGPGGNVTPVTINGTVWLQGDGCKANTCGWDFIVTLYNPKT HEVVGYRYFGLDDPAYLVWFGEIGVHEFAYLVKNYVAAVN >gi|296494479|gb|ADTN01000259.1| GENE 6 2848 - 3918 554 356 aa, chain - ## HITS:1 COG:ECs1399 KEGG:ns NR:ns ## COG: ECs1399 COG1752 # Protein_GI_number: 15830653 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Escherichia coli O157:H7 # 1 356 1 356 356 659 96.0 0 MSTEMKTGLVLSGGGAVGAYQAGVVKALAECGTQISMVSGASIGAFNGAIIAASPDLSEA AVRLEALWDHLGNNQVLSVNRLVYFSLLKKLFQAMNLCQIPGRAGALLTTLLRHISILNG FDNLMAQPLLSDEPLTALLDHYLDTDALADGLPLYVSLYPTEGGMQDIIDCIRAELGAGT TKNAVFQHIQSLPRGQQKEALLASAALPLLFRPREVQGTMFGDGGMGGWRNMQGNTPVTP LVDAGCNMVIVTHLSDGSLWDRQAFPDTTILEIRPRKRLKYAGDGDNSGGLLSFTLAHTD AWRQQSYEDTMLTMEHIRKPLAARQALSRSETVLQKSLEITEEADLALRNAMARIK >gi|296494479|gb|ADTN01000259.1| GENE 7 3915 - 4820 588 301 aa, chain - ## HITS:1 COG:no KEGG:ECP_2039 NR:ns ## KEGG: ECP_2039 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_536 # Pathway: not_defined # 1 301 1 301 301 585 100.0 1e-166 
MTSPFIQQIADNRVCQVLTCLPEKFVVDFANGIDVAQEHIRTAGERTFFRRLKEGLTGEG AARQNAINASVAQGLEASLRWLTELTTSLATTNYAITRVNDRVSSLVSDTARLAHYSADT REQLLILADQVHHKLNHLEEKLHRVDQVQRAQLHLEQIFSWWSAGRYASFSPAGRCYVAL EELRWGAFGDVIRQGETGQVNQLLDILRHKALTQMAQESGGSATVRLNTLDWLGGQGREQ ADNEWHDAINWLGDWCSEEQHPVIWSTTQAAEHLPVRMPRLCSAERLSESMVDEIFQKGA A >gi|296494479|gb|ADTN01000259.1| GENE 8 4817 - 7201 1637 794 aa, chain - ## HITS:1 COG:Z1212_1 KEGG:ns NR:ns ## COG: Z1212_1 COG0699 # Protein_GI_number: 15800733 # Func_class: R General function prediction only # Function: Predicted GTPases (dynamin-related) # Organism: Escherichia coli O157:H7 EDL933 # 1 274 2 275 275 508 96.0 1e-143 MHEKNIALLCDEADRLLQLNINLLRQMVDEPDVLSDSKNENRLLFDKQKALKRIEELEGE QIKTARREMVLAVVGTMKAGKSTTINAIVGQEILPNRNRPMTSVPTLIRHVPGKTEPVLH LEHIQPVRNLLITLQEKLATPAGQQVAQTLQQTGDTRELLDILTDDGWLKNEYHGEEEIF TGLASLNDLVRLAAAMGSEFPFDEYAEVQKLPVIDVEFSHLVGMDACQGTLTLLDTPGPN EAGQPQMEVMMRDQLQKASAVLAVMDYTQMNSKADEEVRKELNAIADVSVGRLFVLVNKF DEKDRNGDGADAVCQKVPAMLNSDVLPASRVYPGSSRQAYLANRALHELRKNGTLPVDEA WVDDFVREAFGRMKKDYVCKDSELATEGATDLWEGSLIDQLITEVILSSHSRAAALAVDS AAAKLMQNAENISEYLSLRHQGLMQSIQSLQAHITSLLEDIREIADCQEQVTADVRMAME EIDARTRELLTGVCTSLEEELNDYFRSGKRKEQQMLEEENSAQPRERNAFAFFHDIFGTG NQHDRMRDFDPDSPEIKFSDRREALELMTQIESTVTSLHREAEAQFRPELEKIVSGIETG FRGTALYATENIAGRINTRLEDEGFTVKISFPAVSQLQTRLAVKTNLSALMEERTETVTR RRRQSGLWGKICGAFGTSDWGWETYKEDVSRSVININTVRKEVMSLTRAYFGELQASIEQ DINQPVRQEIDAFFCAFREKVEQLRNTLIQSSEDHKRDQQAQERLTRRLQALNERVPELI TDSKALREELETML >gi|296494479|gb|ADTN01000259.1| GENE 9 7419 - 7853 442 144 aa, chain - ## HITS:1 COG:yeeP KEGG:ns NR:ns ## COG: yeeP COG3596 # Protein_GI_number: 16129940 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Escherichia coli K12 # 62 144 154 236 236 149 91.0 2e-36 MNPSDAIEAIEKPLSSLPYPISRHILEHLRKLTRHEPVPGIMGKSGAGKSSLCNALFQGE VTPVSALMTVLPGHAAIPLMTRLQDELRTESVRTQTREQFTGAVDRIFDTAESVCIASVA RTVLRAVRDSVVSVARAVWNWIFF >gi|296494479|gb|ADTN01000259.1| GENE 10 8375 - 10447 1890 690 aa, chain + ## HITS:1 COG:PA1910 
KEGG:ns NR:ns ## COG: PA1910 COG1629 # Protein_GI_number: 15597106 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Pseudomonas aeruginosa # 1 690 128 804 804 220 27.0 6e-57 MIVTGNTAADTTDSAAGAGFKTNDIDVGPLGTKSWIETPYSSTTVTKEMIENQQAQSVSE MLKYSPSTQMQARGGMDVGRPQSRGMQGSVVANSRLDGLNIVSTTAFTVEMLERMDVLNS LTGALYGPASPAGQFNFVAKRPTEETLRKVTLGYQSRSAFTGHADLGGHFDENKRFGYRV NLLDQEGEGNVDDSTLRRKLVSVALDWNIQPGTQLQLDASHYEFIQKGYVGSFNYGPNVK LPSAPNPKDKNLALSTAGNDLTTDTISTRLIHYFNDDWSMNAGVGWQQADRAMRSVSSKI LNNQGDISRSMKDSTAAGRFRVLSNTAGLNGHIDTGSIGHDLSLSTTGYVWSLYSAKGTG SSYSWGTTNMYHPDAIDEQGDGKIRTGGPRYRSSVNTQQSVTLGDTVTFTPQWSAMFYLS QSWLQTKNYDKHGNQTNQVDENGLSPNAALMYKITPNTMAYVSYADSLEQGGTAPTDESV KNAGQTLNPYRSKQYEVGLKSDIGEMNLGAALFRLERPFAYLDTDNVYKEQGNQVNNGLE LTAAGNVWQGLNIYSGVTFLDPKLKDTANASTSNKQVVGVPKVQANLLAEYSLPSIPEWV YSANVHYTGKRAANDTNTSYASSYTTWDLVTRYTTKVSNVPTTFRVVVNNVFDKHYWASI FPSGTDGDNGSPSAFIGGGREVRASVTFDF >gi|296494479|gb|ADTN01000259.1| GENE 11 10458 - 11447 782 329 aa, chain + ## HITS:1 COG:CAC1988 KEGG:ns NR:ns ## COG: CAC1988 COG0614 # Protein_GI_number: 15895258 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Clostridium acetobutylicum # 28 325 58 348 355 176 31.0 6e-44 MKLSRLFPLLALLLAASLHAENTHKEVRIASPWPAQNTIIAMLGYGDNIVGTSMIAKRIP LFRQSLPRIEKVAAVSVNSGHEINPEQIIALGGDMLFVPQNMVVPQQALLKQAGVQVLAF EANSLRALTQRVQQTEAVLGPDAQQKALAYQRYFDRNVALVTGRLKDLPASQRVSLYHCM GNPLTTTGRPSLNQDWIDLAGGKNIAENWFGEHQQNRSGEVALEKIVTANPAVIIAMNKR DADAILSSPQWASVDAVIHHRVYVNPKGMFWWCRETSEEALQFLWLAKTLYPARFADVDI RKETREFYRQFFGLTLSDAQMSDVLNPPR >gi|296494479|gb|ADTN01000259.1| GENE 12 11466 - 12524 943 352 aa, chain + ## HITS:1 COG:STM0770 KEGG:ns NR:ns ## COG: STM0770 COG0609 # Protein_GI_number: 16764134 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Salmonella typhimurium 
LT2 # 13 351 21 359 359 408 79.0 1e-114 MTVLESTLSVENRYRQLFRQRLVILAVIFIAIIASLLLDFTLGPSGLPLHSLIKTLMHPS GATNGMRVIVWDIRLPYALMALLVGMALGLAGAEMQTILNNSLASPFTLGVSSAAAFGAA LAIVLGIGIPGVPESGFIPANAFLFALLSALLLDSLTRWTRVPTSGVVLFGIALVFTFNA LVSIMQFVADEDTLQGLVFWTMGSLARASWEKLAVLAAAMAIVLPWSLRRAWQLTALRLG EERAMSFGIDVRRLRLGSLLRISLLAALSVAFVGPIGFIGLVAPHISRLLLGEDHRFYLP GSILIGGLVLSLASVASKNIIPGVILPVGIVTSLVGVPFFLSIVMRHRGSMS >gi|296494479|gb|ADTN01000259.1| GENE 13 12521 - 13288 232 255 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 13 221 11 226 245 94 27 7e-19 MSGLTINALCAGYGKRRIIEHLSISTLPRGEVTVLLGPNGCGKSTLLRALAGLNRASGEA WLNEENLLSLPFARRAEKVVFLPQSLPQGVHLQVLESVVVAQRASGAGQNQAQAIALLEE LGIAHLAMNYLDSLSGGQKQLVGLAQSLIRRPALLLLDEPLSALDLNYQFHVMDVVSRET RRRNMVTLVVLHDINIALRHAAQVIMLKEGKLIDSGDPQTVIHAESLAQVYGVRGRVERC AQGRSMVIVDGAIEK >gi|296494479|gb|ADTN01000259.1| GENE 14 13342 - 13599 264 85 aa, chain + ## HITS:1 COG:no KEGG:UTI89_C2262 NR:ns ## KEGG: UTI89_C2262 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 85 1 85 85 156 100.0 2e-37 MQHIDRLNVIKALVLLEDEQIVRFNIAANDNASQIHMLVDGLGVGLTHEPVAAGMISHRT ACHRSRQHDRPAHSYFYYLSAYQVA >gi|296494479|gb|ADTN01000259.1| GENE 15 14130 - 15275 320 381 aa, chain - ## HITS:1 COG:no KEGG:UTI89_C2261 NR:ns ## KEGG: UTI89_C2261 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 381 10 390 390 750 99.0 0 MGKELDDICTSCPYIDAIKRHKQQLGAIEEYTQWLKKEPRASYFFLFRLYTRIHNTFFPQ KQQLPFTPGGTHCPEPDVTLRDLTLSPGYHSDYAPQPIPEMDSSAVVPPTNENTSPPEDT PDNTPAGGNTGQAEKTRNSGLTPIPEKRSGMPPEHLRFATGFPPQPKIAGPKGKPMRTVH PDKIYREIIWFCSGYLRKSGPEATRTIINSIFCEWASIFNDYSSPFSWVDSRDSEQCDWL WNAMQVRCVGTPLNPLTPEQKYWFACATFDNWEGWNEQQVQFLLESNPRRNRAKFTQASF QAPRIQHKAILLDELKSAREQQKRRDERADGSVPLKLSGKIHKQLESIARSRGVLPKKLL NEMIEQAYQNFVANEQHKTLS >gi|296494479|gb|ADTN01000259.1| GENE 16 16011 - 16103 130 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MSEPDLLSEVHPVADLCSHFRDPEPTTPYG >gi|296494479|gb|ADTN01000259.1| GENE 17 16800 - 17822 547 340 aa, chain + ## HITS:1 COG:YPO2026 KEGG:ns NR:ns ## COG: YPO2026 COG4584 # Protein_GI_number: 16122267 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Yersinia pestis # 1 340 1 340 340 675 98.0 0 MVTFETVMEIKILHKQGMSSRAIARELGISRNTVKRYLQAKSEPPKYTPRPAVASLLDEY RDYIRQRIADAHPYKIPATVIAREIRDQGYRGGMTILRGFIRSLSVPQEQEPAVRFETEP GRQMQVDWGTMRNGRSPLHVFVAVLGYSRMLYIEFTDNMRYDTLETCHRNAFRFFGGVPR EVLYDNMKTVVLQRDAYQTGQHRFHPSLWQFGKEMGFSPRLCRPFRAQTKGKVERMVQYT RNSFYIPLMTRLRPMGITVDVETANRHGLRWLHDVANQRKHETIQARPCDRWLEEQQSML ALPPEKKEYDEHPGENLVNFDKHPLHHPLSIYDSFCRGVA >gi|296494479|gb|ADTN01000259.1| GENE 18 17822 - 18601 611 259 aa, chain + ## HITS:1 COG:YPO1901 KEGG:ns NR:ns ## COG: YPO1901 COG1484 # Protein_GI_number: 16122148 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Yersinia pestis # 1 259 2 260 260 469 99.0 1e-132 MMELQHQRLMALAGQLQLESLISAAPALSQQAVDQEWSYMDFLEHLLHEEKLARHQRKQA MYTRMAAFPAVKTFEEYDFTFATGAPQKQLQSLRSLSFIERNENIVLLGPSGVGKTHLAI AMGYEAVRAGIKVRFTTAADLLLQLSTEQRQGRYKTTLQRGVMAPRLLIIDEIGYLPFSQ EEAKLFFQVIAKRYEKSAMILTSNLPFGQWDQTFAGDAALTSAMLDRILHHSHVVQIKGE SYRLRQKRKAGVIAEANPE >gi|296494479|gb|ADTN01000259.1| GENE 19 19627 - 20229 591 200 aa, chain - ## HITS:1 COG:no KEGG:ECS88_2083 NR:ns ## KEGG: ECS88_2083 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 200 6 205 205 404 99.0 1e-111 MKQQYQTRYELLHENYQKWLTGFTRHAVSWGVCHPNIYYFHNLTPGWVSFNGEKPEIAIV PQSLHRLIYGPDKRSSPSLDDDLVVNLCTSEHLLVHHPMLEGILLSECERLKQHSLANKL ISLFRQFGGTELRLKLVWLCWLDLMTGNSLDDWTKNLKHKSEKDLEQWIIARQGQSEPLT NLMDQYVLMAYRTSVDVAHS Prediction of potential genes in microbial genomes Time: Mon May 16 00:01:57 2011 Seq name: gi|296494478|gb|ADTN01000260.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont663.2, whole genome shotgun sequence Length of sequence - 649 bp Number of 
predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 74 70 ## - Term 90 - 118 1.4 2 1 Op 2 . - CDS 169 - 375 344 ## COG3311 Predicted transcriptional regulator - Prom 462 - 521 2.0 Predicted protein(s) >gi|296494478|gb|ADTN01000260.1| GENE 1 2 - 74 70 24 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKQQYQTRYEWLHESYQKWLTGFT >gi|296494478|gb|ADTN01000260.1| GENE 2 169 - 375 344 68 aa, chain - ## HITS:1 COG:Z1188 KEGG:ns NR:ns ## COG: Z1188 COG3311 # Protein_GI_number: 15800709 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Escherichia coli O157:H7 EDL933 # 9 58 15 64 65 77 68.0 6e-15 MATPVSLMDDQMVDMAFITQLTGLTDKWFYKLIKVGGFPAPIKMGRSSRWLKSEVEAWLQ ARIAQSRP Prediction of potential genes in microbial genomes Time: Mon May 16 00:02:01 2011 Seq name: gi|296494477|gb|ADTN01000261.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont665.1, whole genome shotgun sequence Length of sequence - 4798 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 494 - 1504 677 ## SeSA_B0008 DNA replication protein - Prom 1619 - 1678 4.7 + Prom 2161 - 2220 6.7 2 2 Op 1 25/0.000 + CDS 2245 - 3411 698 ## COG1192 ATPases involved in chromosome partitioning 3 2 Op 2 . 
+ CDS 3411 - 4382 471 ## COG1475 Predicted transcriptional regulators + Term 4436 - 4473 3.3 Predicted protein(s) >gi|296494477|gb|ADTN01000261.1| GENE 1 494 - 1504 677 336 aa, chain - ## HITS:1 COG:no KEGG:SeSA_B0008 NR:ns ## KEGG: SeSA_B0008 # Name: not_defined # Def: DNA replication protein # Organism: S.enterica_Schwarzengrund # Pathway: not_defined # 1 336 1 336 336 623 99.0 1e-177 MTSENNSLLLNLQEVDKTTGEVVKLDVNSTSTVQPVALMRLGLFVPTLKSTGKSKANRKN VTDATEELVQLSIAKSEGYTDVKITGSRLDMDTDFKVWLGIIRSMSEYGVKSDTLELSFV EFVKMCGFDSRRSNKKMRDRISNSLFKLASVTLKFQSETKGWTTHLVQSAYYDINEDIVE IKAEPKLFELYHMDRRVLLRLKAIDALQRKESAQALYTYIESLPQNPAPISMKRMRDRLN LTSNVYTQNHTVRKAMEQLRDIGYLDYTEFKRGRATYFSVHYRNPKLISGPVKVPRKEEE EKAPEQNYDEVIKALKAAGIDPLKLAEALSAMKPEN >gi|296494477|gb|ADTN01000261.1| GENE 2 2245 - 3411 698 388 aa, chain + ## HITS:1 COG:YPCD1.13c KEGG:ns NR:ns ## COG: YPCD1.13c COG1192 # Protein_GI_number: 16082703 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Yersinia pestis # 1 386 1 386 388 581 69.0 1e-166 MGLMDTLNQCISAGHEMTKAIAIAQFNDDSPEARKITRRWRIGEAADLVGVSSQAIRDAE KAGRLPHPDMETRGRVEQRVGYTIEQINHMRDVFGTRLRRAEDAFPPVIGVAAHKGGVYK TSVSVHLAQDLALKGLRVLLVEGNDPQGTASMYHGWVPDLHIHAEDTLLPFYLGEKDDAS YAIKPTCWPGLDIIPSCLALHRIETELMGKFDEGKLPADPHLMLRLAIETVAHDYDVIVI DSAPNLGIGTINVVCAADVLIVPTPAELFDYTSALQFFDMLRDLLKNVDLKGFEPDVRIL LTKYSNNNGSQSPWMEEQIRDAWGSMVLKNVVRETDEVGKGQIRMRTVFEQAIDQRSSTG AWRNALSIWEPVCNEIFDRLIKPRWEIR >gi|296494477|gb|ADTN01000261.1| GENE 3 3411 - 4382 471 323 aa, chain + ## HITS:1 COG:YPCD1.12c KEGG:ns NR:ns ## COG: YPCD1.12c COG1475 # Protein_GI_number: 16082702 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Yersinia pestis # 1 320 1 316 320 273 50.0 3e-73 MKRAPVIPRHTTHTQSTEDTSSPAPAAPMVDSLIARVGAMARGNAISLPVCGREVKFTLE VLRGDSVESASRVWSGNERDQELLTEDALDDLIPSFLLTGQQTPAFGRRVSDVIEIADGS RRRKAAILTESDYRVLVGELDDEQMAALSRLGNDYRPTSAYERGLRYTSRLQNEFAGNIS 
ALADAENISRKIITRCINTAKLPKSVVALFAHPGELSARSGEALQKAFADKEELLKQQAE TLHDQKKAGLIFEAEEVISLLTSVLKQSPASRVNLSSRHQFAPGATALYKGDKMVLNLDR SRIPAECIEKIEAILKELEKPGV Prediction of potential genes in microbial genomes Time: Mon May 16 00:02:07 2011 Seq name: gi|296494476|gb|ADTN01000262.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont672.1, whole genome shotgun sequence Length of sequence - 4125 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 181 - 218 5.3 1 1 Op 1 . - CDS 223 - 639 446 ## COG1487 Predicted nucleic acid-binding protein, contains PIN domain 2 1 Op 2 . - CDS 636 - 866 284 ## KPK_A0157 VagC protein - Prom 907 - 966 1.9 3 2 Tu 1 . - CDS 1032 - 1148 69 ## - Prom 1280 - 1339 4.6 + Prom 1268 - 1327 7.1 4 3 Op 1 . + CDS 1439 - 1735 127 ## gi|301647506|ref|ZP_07247308.1| hypothetical protein HMPREF9543_04029 5 3 Op 2 . + CDS 1800 - 2471 22 ## APECO1_O1CoBM115 hypothetical protein 6 3 Op 3 . + CDS 2542 - 3279 446 ## COG0582 Integrase + Prom 3609 - 3668 4.3 7 4 Tu 1 . 
+ CDS 3731 - 4012 265 ## gi|301647510|ref|ZP_07247312.1| conserved domain protein + Term 4035 - 4075 8.2 Predicted protein(s) >gi|296494476|gb|ADTN01000262.1| GENE 1 223 - 639 446 138 aa, chain - ## HITS:1 COG:PSLT106 KEGG:ns NR:ns ## COG: PSLT106 COG1487 # Protein_GI_number: 17233504 # Func_class: R General function prediction only # Function: Predicted nucleic acid-binding protein, contains PIN domain # Organism: Salmonella typhimurium LT2 # 5 136 4 131 132 113 45.0 9e-26 MKKTWMLDTNICSFIMREQPAAVLKRLEQVVLRGDRIVVSAVTYAEMRFGATGPKASPRH IQLVDAFCARLDAILPWDRAAVDATTDIRVALRLAGTPIGPNDTAIAGHAIAAGAILVTN NTREFERVPGLVLEDWVK >gi|296494476|gb|ADTN01000262.1| GENE 2 636 - 866 284 76 aa, chain - ## HITS:1 COG:no KEGG:KPK_A0157 NR:ns ## KEGG: KPK_A0157 # Name: vagC # Def: VagC protein # Organism: K.pneumoniae_342 # Pathway: not_defined # 1 76 1 76 76 133 98.0 2e-30 MRTVSIFKNGNNRAIRLPRDLDFDGVSELEIVREGDSIILRPVRPTWGSFAQLDRADPDF MAEREDVVSDEGRFDP >gi|296494476|gb|ADTN01000262.1| GENE 3 1032 - 1148 69 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MASLSAGNTATSCVVGPLRAGGVRRDPGLGVFGSAVEK >gi|296494476|gb|ADTN01000262.1| GENE 4 1439 - 1735 127 98 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301647506|ref|ZP_07247308.1| ## NR: gi|301647506|ref|ZP_07247308.1| hypothetical protein HMPREF9543_04029 [Escherichia coli MS 146-1] # 5 98 1 94 94 199 100.0 4e-50 MHASMVHVWEYLTRCGHKFLWLKINPLPPGYTGSVYRAQFEIYSASGRILVTLDVHRQQW VCSANNPRREPADKNGVMLDSGIPLDRNLASHYAKQGY >gi|296494476|gb|ADTN01000262.1| GENE 5 1800 - 2471 22 223 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM115 NR:ns ## KEGG: APECO1_O1CoBM115 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 217 1 216 229 220 53.0 3e-56 MADWFIATEGVKVVKDSASLWPQIVTAISSAGAALVGVALTHRFTRRREEATAARKRYDE LYFISTELVFLLERFAQRCVYSANESGEYDQDGRIRIEYSLPEINFESITGDWRSLPPGL MYRLAELPVICREAASSISAAFNEDTPFDGSYGIFELNRQSARLGLKAIRLSRWLRQLCD MPGDRLSDDRWSAWSVLYRKRWEHIRNNNNVHRRYKRLPDVTE >gi|296494476|gb|ADTN01000262.1| GENE 6 2542 - 3279 446 
245 aa, chain + ## HITS:1 COG:PSLT031 KEGG:ns NR:ns ## COG: PSLT031 COG0582 # Protein_GI_number: 17233417 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Salmonella typhimurium LT2 # 3 242 18 257 260 387 90.0 1e-107 MNLPVAIDYPAALALRQMALVQDELPKYLLAPEVSALLHYVPDLHRKMLLATLWNTGARI NEALALTRGDFSLAPPYPFVQLATLKQRTEKAARTAGRAPAGSQVHRLVPLSDHHYVSQL QMMVATLKIPLERRNKRTGRMEKARIWEITDRTVRTWLSEAVEAAAADGVTFSVPVTPHT FRHSYAMHMLYAGIPLKVLQSLMGHKSISSTEVYTKVFALDVAARHRVQFQMPEADAVAM LKRNI >gi|296494476|gb|ADTN01000262.1| GENE 7 3731 - 4012 265 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301647510|ref|ZP_07247312.1| ## NR: gi|301647510|ref|ZP_07247312.1| conserved domain protein [Escherichia coli MS 146-1] # 1 93 1 93 93 148 100.0 1e-34 MIYKDITILYIDSGKNNRLIRYDLLRKENNDFVVQVFDDQNENIADPKPTIKIDQFEITY DNYLDNCKHSNKLPASFEEYVDIKLQDHRDKLD Prediction of potential genes in microbial genomes Time: Mon May 16 00:02:31 2011 Seq name: gi|296494475|gb|ADTN01000263.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont695.1, whole genome shotgun sequence Length of sequence - 12354 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 6, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 59 - 1261 1678 ## COG1972 Nucleoside permease - Prom 1335 - 1394 5.0 2 2 Tu 1 . + CDS 1597 - 2835 1547 ## COG1914 Mn2+ and Fe2+ transporters of the NRAMP family + Term 2920 - 2963 3.9 3 3 Tu 1 . - CDS 2976 - 3302 286 ## ECDH10B_2555 hypothetical protein - Prom 3342 - 3401 2.6 - Term 3341 - 3374 1.3 4 4 Tu 1 . 
- CDS 3417 - 4673 1056 ## COG0038 Chloride channel protein EriC - Prom 4848 - 4907 5.0 + Prom 4783 - 4842 2.5 5 5 Tu 1 3/1.000 + CDS 4877 - 5842 1133 ## COG0837 Glucokinase + Term 5856 - 5891 5.1 + Prom 5972 - 6031 3.0 6 6 Op 1 7/0.000 + CDS 6061 - 6387 576 ## COG1445 Phosphotransferase system fructose-specific component IIB 7 6 Op 2 3/1.000 + CDS 6409 - 7656 1515 ## COG1299 Phosphotransferase system, fructose-specific IIC component 8 6 Op 3 3/1.000 + CDS 7671 - 8756 1134 ## COG0006 Xaa-Pro aminopeptidase 9 6 Op 4 3/1.000 + CDS 8756 - 9793 988 ## COG1363 Cellulase M and related proteins 10 6 Op 5 . + CDS 9818 - 12313 2739 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) Predicted protein(s) >gi|296494475|gb|ADTN01000263.1| GENE 1 59 - 1261 1678 400 aa, chain - ## HITS:1 COG:nupC KEGG:ns NR:ns ## COG: nupC COG1972 # Protein_GI_number: 16130325 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside permease # Organism: Escherichia coli K12 # 1 400 1 400 400 653 100.0 0 MDRVLHFVLALAVVAILALLVSSDRKKIRIRYVIQLLVIEVLLAWFFLNSDVGLGFVKGF SEMFEKLLGFANEGTNFVFGSMNDQGLAFFFLKVLCPIVFISALIGILQHIRVLPVIIRA IGFLLSKVNGMGKLESFNAVSSLILGQSENFIAYKDILGKISRNRMYTMAATAMSTVSMS IVGAYMTMLEPKYVVAALVLNMFSTFIVLSLINPYRVDASEENIQMSNLHEGQSFFEMLG EYILAGFKVAIIVAAMLIGFIALIAALNALFATVTGWFGYSISFQGILGYIFYPIAWVMG VPSSEALQVGSIMATKLVSNEFVAMMDLQKIASTLSPRAEGIISVFLVSFANFSSIGIIA GAVKGLNEEQGNVVSRFGLKLVYGSTLVSVLSASIAALVL >gi|296494475|gb|ADTN01000263.1| GENE 2 1597 - 2835 1547 412 aa, chain + ## HITS:1 COG:ECs3271 KEGG:ns NR:ns ## COG: ECs3271 COG1914 # Protein_GI_number: 15832525 # Func_class: P Inorganic ion transport and metabolism # Function: Mn2+ and Fe2+ transporters of the NRAMP family # Organism: Escherichia coli O157:H7 # 1 412 1 412 412 697 100.0 0 MTNYRVESSSGRAARKMRLALMGPAFIAAIGYIDPGNFATNIQAGASFGYQLLWVVVWAN LMAMLIQILSAKLGIATGKNLAEQIRDHYPRPVVWFYWVQAEIIAMATDLAEFIGAAIGF KLILGVSLLQGAVLTGIATFLILMLQRRGQKPLEKVIGGLLLFVAAAYIVELIFSQPNLA 
QLGKGMVIPSLPTSEAVFLAAGVLGATIMPHVIYLHSSLTQHLHGGSRQQRYSATKWDVA IAMTIAGFVNLAMMATAAAAFHFSGHTGVADLDEAYLTLQPLLSHAAATVFGLSLVAAGL SSTVVGTLAGQVVMQGFIRFHIPLWVRRTVTMLPSFIVILMGLDPTRILVMSQVLLSFGI ALALVPLLIFTSDSKLMGDLVNSKRVKQTGWVIVVLVVALNIWLLVGTALGL >gi|296494475|gb|ADTN01000263.1| GENE 3 2976 - 3302 286 108 aa, chain - ## HITS:1 COG:no KEGG:ECDH10B_2555 NR:ns ## KEGG: ECDH10B_2555 # Name: ypeC # Def: hypothetical protein # Organism: E.coli_DH10B # Pathway: not_defined # 1 108 1 108 108 165 100.0 6e-40 MFRSLFLAAALMAFTPLAANAGEITLLPSIKLQIGDRDHYGNYWDGGHWRDRDYWHRNYE WRKNRWWRHDNGYHRGWDKRKAYERGYREGWRDRDDHRGKGRGHGHRH >gi|296494475|gb|ADTN01000263.1| GENE 4 3417 - 4673 1056 418 aa, chain - ## HITS:1 COG:ECs3269 KEGG:ns NR:ns ## COG: ECs3269 COG0038 # Protein_GI_number: 15832523 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Escherichia coli O157:H7 # 1 418 1 418 418 634 99.0 0 MLHPRARTMLLLSLPAVAIGIASSLILIVVMKIASVLQNLLWQRLPGTLGIAQDSPLWII GVLTLTGIAVGLVIRFSQGHAGPDPACEPLIGAPVPPSALPGLIVALILGLAGGVSLGPE HPIMTVNIALAVAIGARLLPRVNRMEWTILASAGTIGALFGTPVAAALIFLQTLNGSSEV PLWDRLFAPLMAAAAGALTTGLFFHPHFSLPIAHYGQMEMTDILSGAIVAAIAIAAGMVA VWCLPRLHAMMHQMKNPVLVLGIGGFILGILGVIGGPVSLFKGLDEMQQMVANQAFSTSD YFLLAVIKLAALVVAAASGFRGGRIFPAVFVGVALGLMLHEHVPAVPAAITVSCAILGIV LVVTRDGWLSLFMAAVVVPNTTLLPLLCIVMLPAWLLLAGKPMMMVNRPKQQPPHDNV >gi|296494475|gb|ADTN01000263.1| GENE 5 4877 - 5842 1133 321 aa, chain + ## HITS:1 COG:ECs3268 KEGG:ns NR:ns ## COG: ECs3268 COG0837 # Protein_GI_number: 15832522 # Func_class: G Carbohydrate transport and metabolism # Function: Glucokinase # Organism: Escherichia coli O157:H7 # 1 321 1 321 321 659 100.0 0 MTKYALVGDVGGTNARLALCDIASGEISQAKTYSGLDYPSLEAVIRVYLEEHKVEVKDGC IAIACPITGDWVAMTNHTWAFSIAEMKKNLGFSHLEIINDFTAVSMAIPMLKKEHLIQFG GAEPVEGKPIAVYGAGTGLGVAHLVHVDKRWVSLPGEGGHVDFAPNSEEEAIILEILRAE IGHVSAERVLSGPGLVNLYRAIVKADNRLPENLKPKDITERALADSCTDCRRALSLFCVI MGRFGGNLALNLGTFGGVFIAGGIVPRFLEFFKASGFRAAFEDKGRFKEYVHDIPVYLIV HDNPGLLGSGAHLRQTLGHIL 
>gi|296494475|gb|ADTN01000263.1| GENE 6 6061 - 6387 576 108 aa, chain + ## HITS:1 COG:ECs3267 KEGG:ns NR:ns ## COG: ECs3267 COG1445 # Protein_GI_number: 15832521 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system fructose-specific component IIB # Organism: Escherichia coli O157:H7 # 1 108 1 108 108 169 100.0 1e-42 MSKKLIALCACPMGLAHTFMAAQALEEAAVEAGYEVKIETQGADGIQNRLTAQDIAEATI IIHSVAVTPEDNERFESRDVYEITLQDAIKNAAGIIKEIEEMIASEQQ >gi|296494475|gb|ADTN01000263.1| GENE 7 6409 - 7656 1515 415 aa, chain + ## HITS:1 COG:ypdG KEGG:ns NR:ns ## COG: ypdG COG1299 # Protein_GI_number: 16130318 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Escherichia coli K12 # 1 415 1 415 415 702 99.0 0 MAIKKRSATVVPGASGAAAAVKNPQASKTSFWGELPQHVMSGISRMVPTLIMGGVILAFS QLIAYSWLKIPAEIGIMDALNSGKFSGFDLSLLKFAWLSQSFGGVLFGFAIPMFAAFVAN SIGGKLAFPAGFIGGLMSPQPTQLLNFDPSTMQWATSSPVPSTFIGALIISIVAGYLVKW MNQKIQLPDFLLAFKTTFLLPILSAIFVMLAMYYVITPFGGWINGGIRTVLTAAGEKGAL MYAMGIAAATAIDLGGPINKAAGFVAFSFTTDHVLPVTARSIAIVIPPIGLGLATIIDRR LTGKRLFNAQLYPQGKTAMFLAFMGISEGAIPFALESPITAIPSYMVGAIVGSTAAVWLG AVQWFPESAIWAWPLVTNLGVYMAGIALGAVITALMVVFLRLMMFRKGKLLIDSL >gi|296494475|gb|ADTN01000263.1| GENE 8 7671 - 8756 1134 361 aa, chain + ## HITS:1 COG:ypdF KEGG:ns NR:ns ## COG: ypdF COG0006 # Protein_GI_number: 16130317 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Escherichia coli K12 # 1 361 1 361 361 709 99.0 0 MTLLASLRDWLKAQQLDAVLLSSRQNKQPHLGISTGSGYVVISRESAHILVDSRYYVEVE ARAQGYQLHLLDATNTLTTIVNQIIADEQLQTLGFEGQQVSWETAHRWQSELNAKLVSAT PDVLRQIKTPEEVEKIRLACGIADRGAEHIRRFIQAGMSEREIAAELEWFMRQQGAEKAS FDTIVASGWRGALPHGKASDKIVAAGEFVTLDFGALYQGYCSDMTRTLLVNGEGVSAESH PLFNVYQIVLQAQLAAISAIRPGVRCQQVDDAARRVITEAGYGDYFGHNTGHAIGIEVHE DPRFSPRDTTTLQPGMLLTVEPGIYLPGQGGVRIEDVVLVTPQGAEVLYAMPKTVLLTGE A >gi|296494475|gb|ADTN01000263.1| GENE 9 8756 - 9793 988 345 aa, chain + ## HITS:1 COG:ypdE 
KEGG:ns NR:ns ## COG: ypdE COG1363 # Protein_GI_number: 16130316 # Func_class: G Carbohydrate transport and metabolism # Function: Cellulase M and related proteins # Organism: Escherichia coli K12 # 1 345 1 345 345 675 100.0 0 MDLSLLKALSEADAIASSEQEVRQILLEEADRLQKEVRFDGLGSVLIRLNESTGPKVMIC AHMDEVGFMVRSISREGAIDVLPVGNVRMAARQLQPVRITTREECKIPGLLDGDRQGNDV SAMRVDIGARSYDEVMQAGIRPGDRVTFDTTFQVLPHQRVMGKAFDDRLGCYLLVTLLRE LHDAELPAEVWLVASSSEEVGLRGGQTATRAVSPDVAIVLDTACWAKNFDYGAANHRQIG NGPMLVLSDKSLIAPPKLTAWVETVAAEIGVPLQADMFSNGGTDGGAVHLTGTGVPTVVM GPATRHGHCAASIADCRDILQMQQLLSALIQRLTRETVVQLTDFR >gi|296494475|gb|ADTN01000263.1| GENE 10 9818 - 12313 2739 831 aa, chain + ## HITS:1 COG:ECs3263_2 KEGG:ns NR:ns ## COG: ECs3263_2 COG1080 # Protein_GI_number: 15832517 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Escherichia coli O157:H7 # 98 683 1 586 586 1113 99.0 0 MLTIQFLCPLPNGLHARPAWELKEQCSQWQSEITFINHRQNAKADAKSSLALIGTGTLFN DSCSLNISGSDEEQARRVLEEYIQVRFIDSDSVQPTQAELTAHPLPRSLSRLNPDLLYGN VLASGVGVGTLTLLQSDSLDSYRAIPASAQDSTRLEHSLATLAEQLNQQLRERDGESKTI LSAHLSLIQDDEFAGNIRRLMTEQHQGLGAAIISNMEQVCAKLSASASDYLRERVSDIRD ISEQLLHITWPELKPRNKLVLEKPTILVAEDLTPSQFLSLDLKNLAGMILEKTGRTSHTL ILARASAIPVLSGLPLDAIARYAGQPAVLDAQCGVLAINPNDAVSGYYQVAQTLADKRQK QQAQAAAQLAYSRDNKRIDIAANIGTALEAPGAFANGAEGVGLFRTEMLYMDRDSAPDEQ EQFEAYQQVLLAAGDKPIIFRTMDIGGDKSIPYLNIPQEENPFLGYRAVRIYPEFAGLFR TQLRAILRAASFGNAQLMIPMVHSLDQILWVKGEIQKAIVELKRDGLRHAETITLGIMVE VPSVCYIIDHFCDEVDFFSIGSNDMTQYLYAVDRNNPRVSPLYNPITPSFLRMLQQIVTT AHQRGKWVGICGELGGESRYLPLLLGLGLDELSMSSPRIPAVKSQLRQLDSEACRELARQ ACECRSAQEIEALLTAFTPEEDVRPLLALENIFVDQDFSNKEQAIQFLCGNLGVNGRTEH PFELEEDVWQREEIVTTGVGFGVAIPHTKSQWIRHSSISIARLAKPIGWQSEMGEVELVI MLTLGANEGMNHVKVFSQLARKLVNKNFRQSLFAAQDAQSILTLLETELTF Prediction of potential genes in microbial genomes Time: Mon May 16 00:02:41 2011 Seq name: gi|296494474|gb|ADTN01000264.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont695.2, whole genome shotgun 
sequence Length of sequence - 26222 bp Number of predicted genes - 21, with homology - 19 Number of transcription units - 13, operones - 5 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.400 - CDS 1 - 808 516 ## COG2207 AraC-type DNA-binding domain-containing proteins 2 1 Op 2 9/0.000 - CDS 821 - 1555 769 ## COG3279 Response regulator of the LytR/AlgR family 3 1 Op 3 . - CDS 1570 - 3249 1492 ## COG3275 Putative regulator of cell autolysis - Prom 3346 - 3405 4.7 4 2 Tu 1 . + CDS 3297 - 3359 71 ## + Term 3380 - 3410 -0.3 + Prom 3402 - 3461 3.9 5 3 Tu 1 . + CDS 3643 - 4881 1281 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase + Term 4904 - 4935 4.1 - Term 4892 - 4923 4.1 6 4 Tu 1 . - CDS 4946 - 5017 80 ## - Prom 5108 - 5167 3.9 - Term 5299 - 5339 4.1 7 5 Tu 1 . - CDS 5373 - 6293 850 ## COG1560 Lauroyl/myristoyl acyltransferase - Prom 6530 - 6589 2.3 + Prom 6545 - 6604 4.3 8 6 Tu 1 . + CDS 6646 - 6888 266 ## JW2374 predicted inner membrane protein + Prom 7386 - 7445 7.2 9 7 Tu 1 . + CDS 7536 - 8171 579 ## JW2372 hypothetical protein + Term 8226 - 8274 10.3 + Prom 8599 - 8658 8.4 10 8 Op 1 4/0.000 + CDS 8684 - 9934 1041 ## COG1804 Predicted acyl-CoA transferases/carnitine dehydratase + Term 9942 - 9970 0.5 11 8 Op 2 3/0.400 + CDS 9988 - 11682 1131 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] 12 8 Op 3 2/0.800 + CDS 11752 - 12696 586 ## COG0679 Predicted permeases + Term 12725 - 12761 2.2 13 8 Op 4 . + CDS 12770 - 13915 421 ## COG1804 Predicted acyl-CoA transferases/carnitine dehydratase + Term 13938 - 13978 4.8 - Term 13926 - 13964 4.4 14 9 Op 1 12/0.000 - CDS 13971 - 16910 997 ## COG0642 Signal transduction histidine kinase - Prom 16980 - 17039 3.4 - Term 17318 - 17359 3.0 15 9 Op 2 . 
- CDS 17569 - 18183 329 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain - Prom 18297 - 18356 6.9 16 10 Op 1 19/0.000 + CDS 18599 - 19762 656 ## COG1566 Multidrug resistance efflux pump 17 10 Op 2 . + CDS 19762 - 21300 503 ## COG0477 Permeases of the major facilitator superfamily 18 11 Op 1 4/0.000 - CDS 21408 - 22736 1216 ## COG3048 D-serine dehydratase 19 11 Op 2 . - CDS 22754 - 24091 1357 ## COG2610 H+/gluconate symporter and related permeases - Prom 24257 - 24316 5.0 + Prom 24213 - 24272 6.6 20 12 Tu 1 . + CDS 24309 - 25244 333 ## COG0583 Transcriptional regulator + Term 25373 - 25401 -0.2 - TRNA 25340 - 25414 66.4 # Arg CCT 0 0 - Term 25233 - 25279 2.5 21 13 Tu 1 . - CDS 25490 - 26215 671 ## COG2116 Formate/nitrite family of transporters Predicted protein(s) >gi|296494474|gb|ADTN01000264.1| GENE 1 1 - 808 516 269 aa, chain - ## HITS:1 COG:ypdC KEGG:ns NR:ns ## COG: ypdC COG2207 # Protein_GI_number: 16130314 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 268 1 268 285 539 100.0 1e-153 MKAPGLPADQQFFADLFSGLVLNPQLLGRVWFASQPASLPVGSLCIDFPRLDIVLRGEYG NLLEAKQQRLVEGEMLFIPARAANLPVNNKPVMLLSLVFAPTWLGLSFYDSRTTSLLHPA RQIQLPSLQRGEGEAMLTALTHLSRSPLEQNIIQPLVLSLLHLCRSVVNMPPGNSQPRGD FLYHSICNWVQDNYAQPLTRESVAQFFNITPNHLSKLFAQHGTMRFIEYVRWVRMAKARM ILQKYHLSIHEVAQRCGFPDSDYFCRVFP >gi|296494474|gb|ADTN01000264.1| GENE 2 821 - 1555 769 244 aa, chain - ## HITS:1 COG:ECs3261 KEGG:ns NR:ns ## COG: ECs3261 COG3279 # Protein_GI_number: 15832515 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Escherichia coli O157:H7 # 1 244 1 244 244 483 100.0 1e-136 MKVIIVEDEFLAQQELSWLIKEHSQMEIVGTFDDGLDVLKFLQHNRVDAIFLDINIPSLD GVLLAQNISQFAHKPFIVFITAWKEHAVEAFELEAFDYILKPYQESRITGMLQKLEAAWQ QQQTSSTPAATVTRENDTINLVKDERIIVTPINDIYYAEAHEKMTFVYTRRESYVMPMNI 
TEFCSKLPPSHFFRCHRSFCVNLNKIREIEPWFNNTYILRLKDLDFEVPVSRSKVKEFRQ LMHL >gi|296494474|gb|ADTN01000264.1| GENE 3 1570 - 3249 1492 559 aa, chain - ## HITS:1 COG:ECs3260 KEGG:ns NR:ns ## COG: ECs3260 COG3275 # Protein_GI_number: 15832514 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Escherichia coli O157:H7 # 1 559 7 565 565 1123 100.0 0 MLLAVFDRAALMLICLFFLIRIRLFRELLHKSAHSPKELLAVTAIFSLFALFSTWSGVPV EGSLVNVRIIAVMSGGILFGPWVGIITGVIAGIHRYLIDIGGVTAIPCFITSILAGCISG WINLKIPKAQRWRVGILGGMLCETLTMILVIVWAPTTALGIDIVSKIGIPMILGSVCIGF IVLLVQSVEGEKEASAARQAKLALDIANKTLPLFRHVNSESLRKVCEIIRDDIHADAVAI TNTDHVLAYVGVGEHNYQNGDDFISPTTRQAMNYGKIIIKNNDEAHRTPEIHSMLVIPLW EKGVVTGTLKIYYCHAHQITSSLQEMAVGLSQIISTQLEVSRAEQLREMANKAELRALQS KINPHFLFNALNAISSSIRLNPDTARQLIFNLSRYLRYNIELKDDEQIDIKKELYQIKDY IAIEQARFGDKLTVIYDIDEEVNCCIPSLLIQPLVENAIVHGIQPCKGKGVVTISVAECG NRVRIAVRDTGHGIDPKVIERVEANEMPGNKIGLLNVHHRVKLLYGEGLHIRRLEPGTEI AFYIPNQRTPVASQATLLL >gi|296494474|gb|ADTN01000264.1| GENE 4 3297 - 3359 71 20 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIGNLKADDQQAVFEEKPQF >gi|296494474|gb|ADTN01000264.1| GENE 5 3643 - 4881 1281 412 aa, chain + ## HITS:1 COG:yfdZ KEGG:ns NR:ns ## COG: yfdZ COG0436 # Protein_GI_number: 16130311 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Escherichia coli K12 # 1 412 1 412 412 847 100.0 0 MADTRPERRFTRIDRLPPYVFNITAELKMAARRRGEDIIDFSMGNPDGATPPHIVEKLCT VAQRPDTHGYSTSRGIPRLRRAISRWYQDRYDVEIDPESEAIVTIGSKEGLAHLMLATLD HGDTVLVPNPSYPIHIYGAVIAGAQVRSVPLVEGVDFFNELERAIRESYPKPKMMILGFP SNPTAQCVELEFFEKVVALAKRYDVLVVHDLAYADIVYDGWKAPSIMQVPGARDVAVEFF TLSKSYNMAGWRIGFMVGNKTLVSALARIKSYHDYGTFTPLQVAAIAALEGDQQCVRDIA EQYKRRRDVLVKGLHEAGWMVEMPKASMYVWAKIPEPYAAMGSLEFAKKLLNEAKVCVSP GIGFGDYGDTHVRFALIENRDRIRQAIRGIKAMFRADGLLPASSKHIHENAE >gi|296494474|gb|ADTN01000264.1| GENE 6 4946 - 5017 80 23 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKYFFMGISFMVIVWAGTFALMI >gi|296494474|gb|ADTN01000264.1| GENE 7 
5373 - 6293 850 306 aa, chain - ## HITS:1 COG:ECs3258 KEGG:ns NR:ns ## COG: ECs3258 COG1560 # Protein_GI_number: 15832512 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lauroyl/myristoyl acyltransferase # Organism: Escherichia coli O157:H7 # 1 306 23 328 328 625 100.0 1e-179 MFPQCKFSREFLHPRYWLTWFGLGVLWLWVQLPYPVLCFLGTRIGAMARPFLKRRESIAR KNLELCFPQHSAEEREKMIAENFRSLGMALVETGMAWFWPDSRVRKWFDVEGLDNLKRAQ MQNRGVMVVGVHFMSLELGGRVMGLCQPMMATYRPHNNQLMEWVQTRGRMRSNKAMIGRN NLRGIVGALKKGEAVWFAPDQDYGRKGSSFAPFFAVENVATTNGTYVLSRLSGAAMLTVT MVRKADYSGYRLFITPEMEGYPTDENQAAAYMNKIIEKEIMRAPEQYLWIHRRFKTRPVG ESSLYI >gi|296494474|gb|ADTN01000264.1| GENE 8 6646 - 6888 266 80 aa, chain + ## HITS:1 COG:no KEGG:JW2374 NR:ns ## KEGG: JW2374 # Name: yfdY # Def: predicted inner membrane protein # Organism: E.coli_J # Pathway: not_defined # 1 80 1 80 80 115 98.0 6e-25 MINLWMFLALCIVCVSGYIGQVLNVVSAVSSFFGMVILAALIYYFTMWLTGGNELVTGVF MFLAPACGLMIRFMVGYGRR >gi|296494474|gb|ADTN01000264.1| GENE 9 7536 - 8171 579 211 aa, chain + ## HITS:1 COG:no KEGG:JW2372 NR:ns ## KEGG: JW2372 # Name: yfdX # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 211 1 211 211 325 100.0 5e-88 MKRLIMATMVTAILASSTVWAADNAPVAAQQQTQQVQQTQKTAAAAERISEQGLYAMRDV QVARLALFHGDPEKAKELTNEASALLSDDSTEWAKFAKPGKKTNLNDDQYIVINASVGIS ESYVATPEKEAAIKIANEKMAKGDKKGAMEELRLAGVGVMENQYLMPLKQTRNALADAQK LLDKKQYYEANLALKGAEDGIIVDSEALFVN >gi|296494474|gb|ADTN01000264.1| GENE 10 8684 - 9934 1041 416 aa, chain + ## HITS:1 COG:yfdW KEGG:ns NR:ns ## COG: yfdW COG1804 # Protein_GI_number: 16130306 # Func_class: C Energy production and conversion # Function: Predicted acyl-CoA transferases/carnitine dehydratase # Organism: Escherichia coli K12 # 1 416 1 416 416 830 100.0 0 MSTPLQGIKVLDFTGVQSGPSCTQMLAWFGADVIKIERPGVGDVTRHQLRDIPDIDALYF TMLNSNKRSIELNTKTAEGKEVMEKLIREADILVENFHPGAIDHMGFTWEHIQEINPRLI FGSIKGFDECSPYVNVKAYENVAQAAGGAASTTGFWDGPPLVSAAALGDSNTGMHLLIGL LAALLHREKTGRGQRVTMSMQDAVLNLCRVKLRDQQRLDKLGYLEEYPQYPNGTFGDAVP 
RGGNAGGGGQPGWILKCKGWETDPNAYIYFTIQEQNWENTCKAIGKPEWITDPAYSTAHA RQPHIFDIFAEIEKYTVTIDKHEAVAYLTQFDIPCAPVLSMKEISLDPSLRQSGSVVEVE QPLRGKYLTVGCPMKFSAFTPDIKAAPLLGEHTAAVLQELGYSDDEIAAMKQNHAI >gi|296494474|gb|ADTN01000264.1| GENE 11 9988 - 11682 1131 564 aa, chain + ## HITS:1 COG:ECs3253 KEGG:ns NR:ns ## COG: ECs3253 COG0028 # Protein_GI_number: 15832507 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Escherichia coli O157:H7 # 1 564 1 564 564 1080 100.0 0 MSDQLQMTDGMHIIVEALKQNNIDTIYGVVGIPVTDMARHAQAEGIRYIGFRHEQSAGYA AAASGFLTQKPGICLTVSAPGFLNGLTALANATVNGFPMIMISGSSDRAIVDLQQGDYEE LDQMNAAKPYAKAAFRVNQPQDLGIALARAIRVSVSGRPGGVYLDLPANVLAATMEKDEA LTTIVKVENPSPALLPCPKSVTSAISLLAKAERPLIILGKGAAYSQADEQLREFIESAQI PFLPMSMAKGILEDTHPLSAAAARSFALANADVVMLVGARLNWLLAHGKKGWAADTQFIQ LDIEPQEIDSNRPIAVPVVGDIASSMQGMLAELKQNTFTTPLVWRDILNIHKQQNAQKMH EKLSTDTQPLNYFNALSAVRDVLRENQDIYLVNEGANTLDNARNIIDMYKPRRRLDCGTW GVMGIGMGYAIGASVTSGSPVVAIEGDSAFGFSGMEIETICRYNLPVTIVIFNNGGIYRG DGVDLSGAGAPSPTDLLHHARYDKLMDAFRGVGYNVTTTDELRHALTTGIQSRKPTIINV VIDPAAGTESGHITKLNPKQVAGN >gi|296494474|gb|ADTN01000264.1| GENE 12 11752 - 12696 586 314 aa, chain + ## HITS:1 COG:ECs3252 KEGG:ns NR:ns ## COG: ECs3252 COG0679 # Protein_GI_number: 15832506 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Escherichia coli O157:H7 # 1 314 1 314 314 531 100.0 1e-151 MLTFFIGDLLPIIVIMLLGYFSGRRETFSEDQARAFNKLVLNYALPAALFVSITRANREM IFADTRLTLVSLVVIVGCFFFSWFGCYKFFKRTHAEAAVCALIAGSPTIGFLGFAVLDPI YGDSVSTGLVVAIISIIVNAITIPIGLYLLNPSSGADGKKNSNLSALISAAKEPVVWAPV LATILVLVGVKIPAAWDPTFNLIAKANSGVAVFAAGLTLAAHKFEFSAEIAYNTFLKLIL MPLALLLVGMACHLNSEHLQMMVLAGALPPAFSGIIIASRFNVYTRTGTASLAVSVLGFV VTAPLWIYVSRLVS >gi|296494474|gb|ADTN01000264.1| GENE 13 12770 - 13915 421 381 aa, chain + ## HITS:1 COG:yfdE KEGG:ns NR:ns ## COG: 
yfdE COG1804 # Protein_GI_number: 16130303 # Func_class: C Energy production and conversion # Function: Predicted acyl-CoA transferases/carnitine dehydratase # Organism: Escherichia coli K12 # 1 381 14 394 394 796 100.0 0 MTNNESKGPFEGLLVIDMTHVLNGPFGTQLLCNMGARVIKVEPPGHGDDTRTFGPYVDGQ SLYYSFINHGKESVVLDLKNDHDKSIFINMLKQADVLAENFRPGTMEKLGFSWETLQEIN PRLIYASSSGFGHTGPLKDAPAYDTIIQAMSGIMMETGYPDAPPVRVGTSLADLCGGVYL FSGIVSALYGREKSQRGAHVDIAMFDATLSFLEHGLMAYIATGKSPQRLGNRHPYMAPFD VFNTQDKPITICCGNDKLFSALCQALELTELVNDPRFSSNILRVQNQAILKQYIERTLKT QAAEVWLARIHEVGVPVAPLLSVAEAIKLPQTQARNMLIEAGGIMMPGNPIKISGCADPH VMPGAATLDQHGEQIRQEFSS >gi|296494474|gb|ADTN01000264.1| GENE 14 13971 - 16910 997 979 aa, chain - ## HITS:1 COG:evgS_2 KEGG:ns NR:ns ## COG: evgS_2 COG0642 # Protein_GI_number: 16130302 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 313 732 1 420 420 845 100.0 0 MISRYFTHSLNVVKYYNSPRQYNFFLTRKESVILNEVLNRFVDALTNEVRYEVSQNWLDT GNLAFLNKPLELTEHEKQWIKQHPNLKVLENPYSPPYSMTDENGSVRGVMGDILNIITLQ TGLNFSPITVSHNIHAGTQLSPGGWDIIPGAIYSEDRENNVLFAEAFITTPYVFVMQKAP DSEQTLKKGMKVAIPYYYELHSQLKEMYPEVEWIQVDNASAAFHKVKEGELDALVATQLN SRYMIDHYYPNELYHFLIPGVPNASLSFAFPRGEPELKDIINKALNAIPPSEVLRLTEKW IKMPNVTIDTWDLYSEQFYIVTTLSVLLVGSSLLWGFYLLRSVRRRKVIQGDLENQISFR KALSDSLPNPTYVVNWQGNVISHNSAFEHYFTADYYKNAMLPLENSDSPFKDVFSNAHEV TAETKENRTIYTQVFEIDNGIEKRCINHWHTLCNLPASDNAVYICGWQDITETRDLINAL EVEKNKAIKATVAKSQFLATMSHEIRTPISSIMGFLELLSGSGLSKEQRVEAISLAYATG QSLLGLIGEILDVDKIESGNYQLQPQWVDIPTLVQNTCHSFGAIAASKSIALSCSSTFPE HYLVKIDPQAFKQVLSNLLSNALKFTTEGAVKITTSLGHIDDNHAVIKMTIMDSGSGLSQ EEQQQLFKRYSQTSAGRQQTGSGLGLMICKELIKNMQGDLSLESHPGIGTTFTITIPVEI SQQVATVEAKAEQPITLPEKLSILIADDHPTNRLLLKRQLNLLGYDVDEATDGVQALHKV SMQHYDLLITDVNMPNMDGFELTRKLREQNSSLPIWGLTANAQANEREKGLSCGMNLCLF KPLTLDVLKTHLSQLHQVAHIAPQYRHLDIEALKNNTANDLQLMQEILMTFQHETHKDLP AAFQALEAGDNRTFHQCIHRIHGAANILNLQKLINISHQLEITPVSDDSKPEILQLLNSV KEHIAELDQEIAVFCQKND >gi|296494474|gb|ADTN01000264.1| GENE 15 17569 - 18183 329 204 aa, 
chain - ## HITS:1 COG:ECs3248 KEGG:ns NR:ns ## COG: ECs3248 COG2197 # Protein_GI_number: 15832502 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 204 1 204 204 392 100.0 1e-109 MNAIIIDDHPLAIAAIRNLLIKNDIEILAELTEGGSAVQRVETLKPDIVIIDVDIPGVNG IQVLETLRKRQYSGIIIIVSAKNDHFYGKHCADAGANGFVSKKEGMNNIIAAIEAAKNGY CYFPFSLNRFVGSLTSDQQKLDSLSKQEISVMRYILDGKDNNDIAEKMFISNKTVSTYKS RLMEKLECKSLMDLYTFAQRNKIG >gi|296494474|gb|ADTN01000264.1| GENE 16 18599 - 19762 656 387 aa, chain + ## HITS:1 COG:emrK KEGG:ns NR:ns ## COG: emrK COG1566 # Protein_GI_number: 16130300 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Escherichia coli K12 # 1 387 1 387 387 711 100.0 0 MEQINSNKKHSNRRKYFSLLAVVLFIAFSGAYAYWSMELEDMISTDDAYVTGNADPISAQ VSGSVTVVNHKDTNYVRQGDILVSLDKTDATIALNKAKNNLANIVRQTNKLYLQDKQYSA EVASARIQYQQSLEDYNRRVPLAKQGVISKETLEHTKDTLISSKAALNAAIQAYKANKAL VMNTPLNRQPQVVEAADATKEAWLALKRTDIKSPVTGYIAQRSVQVGETVSPGQSLMAVV PARQMWVNANFKETQLTDVRIGQSVNIISDLYGENVVFHGRVTGINMGTGNAFSLLPAQN ATGNWIKIVQRVPVEVSLDPKELMEHPLRIGLSMTATIDTKNEDIAEMPELASTVTSMPA YTSKALVIDTSPIEKEISNIISHNGQL >gi|296494474|gb|ADTN01000264.1| GENE 17 19762 - 21300 503 512 aa, chain + ## HITS:1 COG:emrY KEGG:ns NR:ns ## COG: emrY COG0477 # Protein_GI_number: 16130299 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 512 1 512 512 911 100.0 0 MAITKSTPAPLTGGTLWCVTIALSLATFMQMLDSTISNVAIPTISGFLGASTDEGTWVIT SFGVANAIAIPVTGRLAQRIGELRLFLLSVTFFSLSSLMCSLSTNLDVLIFFRVVQGLMA GPLIPLSQSLLLRNYPPEKRTFALALWSMTVIIAPICGPILGGYICDNFSWGWIFLINVP MGIIVLTLCLTLLKGRETETSPVKMNLPGLTLLVLGVGGLQIMLDKGRDLDWFNSSTIII LTVVSVISLISLVIWESTSENPILDLSLFKSRNFTIGIVSITCAYLFYSGAIVLMPQLLQ 
ETMGYNAIWAGLAYAPIGIMPLLISPLIGRYGNKIDMRLLVTFSFLMYAVCYYWRSVTFM PTIDFTGIILPQFFQGFAVACFFLPLTTISFSGLPDNKFANASSMSNFFRTLSGSVGTSL TMTLWGRRESLHHSQLTATIDQFNPVFNSSSQIMDKYYGSLSGVLNEINNEITQQSLSIS ANEIFRMAAIAFILLTVLVWFAKPPFTAKGVG >gi|296494474|gb|ADTN01000264.1| GENE 18 21408 - 22736 1216 442 aa, chain - ## HITS:1 COG:dsdA KEGG:ns NR:ns ## COG: dsdA COG3048 # Protein_GI_number: 16130298 # Func_class: E Amino acid transport and metabolism # Function: D-serine dehydratase # Organism: Escherichia coli K12 # 1 442 1 442 442 871 100.0 0 MENAKMNSLIAQYPLVKDLVALKETTWFNPGTTSLAEGLPYVGLTEQDVQDAHARLSRFA PYLAKAFPETAATGGIIESELVAIPAMQKRLEKEYQQPISGQLLLKKDSHLPISGSIKAR GGIYEVLAHAEKLALEAGLLTLDDDYSKLLSPEFKQFFSQYSIAVGSTGNLGLSIGIMSA RIGFKVTVHMSADARAWKKAKLRSHGVTVVEYEQDYGVAVEEGRKAAQSDPNCFFIDDEN SRTLFLGYSVAGQRLKAQFAQQGRIVDADNPLFVYLPCGVGGGPGGVAFGLKLAFGDHVH CFFAEPTHSPCMLLGVHTGLHDQISVQDIGIDNLTAADGLAVGRASGFVGRAMERLLDGF YTLSDQTMYDMLGWLAQEEGIRLEPSALAGMAGPQRVCASVSYQQMHGFSAEQLRNTTHL VWATGGGMVPEEEMNQYLAKGR >gi|296494474|gb|ADTN01000264.1| GENE 19 22754 - 24091 1357 445 aa, chain - ## HITS:1 COG:dsdX KEGG:ns NR:ns ## COG: dsdX COG2610 # Protein_GI_number: 16130297 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Escherichia coli K12 # 1 445 1 445 445 715 100.0 0 MHSQIWVVSTLLISIVLIVLTIVKFKFHPFLALLLASFFVGTMMGMGPLDMVNAIESGIG GTLGFLAAVIGLGTILGKMMEVSGAAERIGLTLQRCRWLSVDVIMVLVGLICGITLFVEV GVVLLIPLAFSIAKKTNTSLLKLAIPLCTALMAVHCVVPPHPAALYVANKLGADIGSVIV YGLLVGLMASLIGGPLFLKFLGQRLPFKPVPTEFADLKVRDEKTLPSLGATLFTILLPIA LMLVKTIAELNMARESGLYILVEFIGNPITAMFIAVFVAYYVLGIRQHMSMGTMLTHTEN GFGSIANILLIIGAGGAFNAILKSSSLADTLAVILSNMHMHPILLAWLVALILHAAVGSA TVAMMGATAIVAPMLPLYPDISPEIIAIAIGSGAIGCTIVTDSLFWLVKQYCGATLNETF KYYTTATFIASVVALAGTFLLSFII >gi|296494474|gb|ADTN01000264.1| GENE 20 24309 - 25244 333 311 aa, chain + ## HITS:1 COG:dsdC KEGG:ns NR:ns ## COG: dsdC COG0583 # Protein_GI_number: 16130296 # Func_class: K Transcription # 
Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 311 1 311 311 636 100.0 0 MEPLREIRNRLLNGWQLSKMHTFEVAARHQSFALAAEELSLSPSAVSHRINQLEEELGIQ LFVRSHRKVELTHEGKRVYWALKSSLDTLNQEILDIKNQELSGTLTLYSRPSIAQCWLVP ALGDFTRRYPSISLTVLTGNDNVNLQRAGIDLAIYFDDAPSAQLTHHFLMDEEILPVCSP EYAQRHALTNTVINLCHCTLLHDRQAWSNDSGTDEWHSWAQHYAVNLPTSSGIGFDRSDL AVIAAMNHIGVAMGRKRLVQKRLASGELVAPFGDMTVKCHQHYYITTLPGRQWPKIEAFI IWLREQVKTTS >gi|296494474|gb|ADTN01000264.1| GENE 21 25490 - 26215 671 241 aa, chain - ## HITS:1 COG:yfdC KEGG:ns NR:ns ## COG: yfdC COG2116 # Protein_GI_number: 16130280 # Func_class: P Inorganic ion transport and metabolism # Function: Formate/nitrite family of transporters # Organism: Escherichia coli K12 # 1 241 70 310 310 474 100.0 1e-134 MGASLLAKGIFQVELEGVPGSFLLENLGYTFGFIIVIMARQQLFTENTVTAVLPVMQKPT MSNVGLLIRLWGVVLLGNILGTGIAAWAFEYMPIFNEETRDAFVKIGMDVMKNTPSEMFA NAIISGWLIATMVWMFPAAGAAKIVVIILMTWLIALGDTTHIVVGSVEILYLVFNGTLHW SDFIWPFALPTLAGNICGGTFIFALMSHAQIRNDMSNKRKAEARQKAERAENIKKNYKNP A Prediction of potential genes in microbial genomes Time: Mon May 16 00:02:56 2011 Seq name: gi|296494473|gb|ADTN01000265.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont695.3, whole genome shotgun sequence Length of sequence - 4873 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 5, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 200 240 ## COG2116 Formate/nitrite family of transporters + Prom 357 - 416 4.3 2 2 Tu 1 . + CDS 494 - 1249 904 ## COG2853 Surface lipoprotein + Term 1254 - 1306 11.0 - Term 1252 - 1284 2.0 3 3 Tu 1 . - CDS 1431 - 2489 304 ## JW2342 hypothetical protein - Prom 2519 - 2578 9.9 - Term 2800 - 2832 5.6 4 4 Tu 1 . - CDS 2854 - 4194 1385 ## COG2067 Long-chain fatty acid transport protein - Prom 4295 - 4354 6.2 5 5 Tu 1 . 
+ CDS 4566 - 4850 424 ## COG3691 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|296494473|gb|ADTN01000265.1| GENE 1 2 - 200 240 66 aa, chain - ## HITS:1 COG:yfdC KEGG:ns NR:ns ## COG: yfdC COG2116 # Protein_GI_number: 16130280 # Func_class: P Inorganic ion transport and metabolism # Function: Formate/nitrite family of transporters # Organism: Escherichia coli K12 # 1 66 1 66 310 117 100.0 6e-27 MDNDKIDQHSDEIEVESEEKERGKKIEIDEDRLPSRAMAIHEHIRQDGEKELERDAMALL WSAIAA >gi|296494473|gb|ADTN01000265.1| GENE 2 494 - 1249 904 251 aa, chain + ## HITS:1 COG:ECs3229 KEGG:ns NR:ns ## COG: ECs3229 COG2853 # Protein_GI_number: 15832483 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Surface lipoprotein # Organism: Escherichia coli O157:H7 # 1 251 1 251 251 514 100.0 1e-146 MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADGLYPVLSWLTWPM SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA IQDDLKDIDSE >gi|296494473|gb|ADTN01000265.1| GENE 3 1431 - 2489 304 352 aa, chain - ## HITS:1 COG:no KEGG:JW2342 NR:ns ## KEGG: JW2342 # Name: yfdF # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 352 1 352 352 647 99.0 0 MLPSISINNTSAAYPESINENNNDEVNGLVQEFKNLFNGKEGISTCIKHLLELIKNAIRV NDDPYRFNINNSSVTYIDIDSNDTDHITIGIDNQEPIELPANYKDKELVRTIINDNIVEK THDINNKEMIFSALKEIYDGDPGFIFDKISHKLRHTVTEFDESGKSEPTDLFTWYGKDKK GDSLAIVIKNKNGNDYLSLGYYDQDDYHIQRGIRINGDSLTQYCSENARSASAWFESSKA IMAESFATGSDHQVVNELNGERLREPNDVFKRYGRAIRYDFQVDDAKYKCDHLKEIVSTL VGNKINVGHSQKIYKHFKDLEDKIEERLQNRQAEYQNEINQPSAPGVNFDDI >gi|296494473|gb|ADTN01000265.1| GENE 4 2854 - 4194 1385 446 aa, chain - ## HITS:1 COG:fadL KEGG:ns NR:ns ## COG: fadL COG2067 # Protein_GI_number: 16130277 # Func_class: I Lipid transport and metabolism # Function: Long-chain fatty acid transport protein # Organism: Escherichia coli K12 # 1 446 3 448 448 836 100.0 0 
MSQKTLFTKSALAVAVALISTQAWSAGFQLNEFSSSGLGRAYSGEGAIADDAGNVSRNPA LITMFDRPTFSAGAVYIDPDVNISGTSPSGRSLKADNIAPTAWVPNMHFVAPINDQFGWG ASITSNYGLATEFNDTYAGGSVGGTTDLETMNLNLSGAYRLNNAWSFGLGFNAVYARAKI ERFAGDLGQLVAGQIMQSPAGQTQQGQALAATANGIDSNTKIAHLNGNQWGFGWNAGILY ELDKNNRYALTYRSEVKIDFKGNYSSDLNRAFNNYGLPIPTATGGATQSGYLTLNLPEMW EVSGYNRVDPQWAIHYSLAYTSWSQFQQLKATSTSGDTLFQKHEGFKDAYRIALGTTYYY DDNWTFRTGIAFDDSPVPAQNRSISIPDQDRFWLSAGTTYAFNKDASVDVGVSYMHGQSV KINEGPYQFESEGKAWLFGTNFNYAF >gi|296494473|gb|ADTN01000265.1| GENE 5 4566 - 4850 424 94 aa, chain + ## HITS:1 COG:yfcZ KEGG:ns NR:ns ## COG: yfcZ COG3691 # Protein_GI_number: 16130276 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 94 9 102 102 161 100.0 2e-40 MSKCSADETPVCCCMDVGTIMDNSDCTASYSRVFANRAEAEQTLAALTEKARSVESEPCK ITPTFTEESDGVRLDIDFTFACEAEMLIFQLGLR Prediction of potential genes in microbial genomes Time: Mon May 16 00:03:05 2011 Seq name: gi|296494472|gb|ADTN01000266.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont695.4, whole genome shotgun sequence Length of sequence - 11408 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 2, operones - 2 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 60 - 119 4.9 1 1 Op 1 20/0.000 + CDS 150 - 1460 1492 ## COG0183 Acetyl-CoA acetyltransferase 2 1 Op 2 5/0.000 + CDS 1460 - 3604 1683 ## COG1250 3-hydroxyacyl-CoA dehydrogenase + Prom 3645 - 3704 5.5 3 1 Op 3 3/0.000 + CDS 3822 - 4292 476 ## COG2062 Phosphohistidine phosphatase SixA + Term 4304 - 4348 10.1 + Prom 4739 - 4798 8.8 4 2 Op 1 6/0.000 + CDS 4973 - 5536 679 ## COG3539 P pilus assembly protein, pilin FimA + Term 5544 - 5581 6.2 5 2 Op 2 10/0.000 + CDS 5618 - 8263 2546 ## COG3188 P pilus assembly protein, porin PapC 6 2 Op 3 7/0.000 + CDS 8283 - 9035 601 ## COG3121 P pilus assembly protein, chaperone PapD + Term 9075 - 9112 -1.0 7 2 Op 4 4/0.000 + CDS 9209 - 9547 227 ## COG3539 P pilus assembly protein, 
pilin FimA 8 2 Op 5 4/0.000 + CDS 9544 - 10032 200 ## COG3539 P pilus assembly protein, pilin FimA 9 2 Op 6 . + CDS 10029 - 10568 455 ## COG3539 P pilus assembly protein, pilin FimA 10 2 Op 7 . + CDS 10570 - 11391 136 ## JW2329 hypothetical protein Predicted protein(s) >gi|296494472|gb|ADTN01000266.1| GENE 1 150 - 1460 1492 436 aa, chain + ## HITS:1 COG:yfcY KEGG:ns NR:ns ## COG: yfcY COG0183 # Protein_GI_number: 16130275 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA acetyltransferase # Organism: Escherichia coli K12 # 1 436 1 436 436 793 100.0 0 MGQVLPLVTRQGDRIAIVSGLRTPFARQATAFHGIPAVDLGKMVVGELLARSEIPAEVIE QLVFGQVVQMPEAPNIAREIVLGTGMNVHTDAYSVSRACATSFQAVANVAESLMAGTIRA GIAGGADSSSVLPIGVSKKLARVLVDVNKARTMSQRLKLFSRLRLRDLMPVPPAVAEYST GLRMGDTAEQMAKTYGITREQQDALAHRSHQRAAQAWSDGKLKEEVMTAFIPPYKQPLVE DNNIRGNSSLADYAKLRPAFDRKHGTVTAANSTPLTDGAAAVILMTESRAKELGLVPLGY LRSYAFTAIDVWQDMLLGPAWSTPLALERAGLTMSDLTLIDMHEAFAAQTLANIQLLGSE RFAREALGRAHATGEVDDSKFNVLGGSIAYGHPFAATGARMITQTLHELRRRGGGFGLVT ACAAGGLGAAMVLEAE >gi|296494472|gb|ADTN01000266.1| GENE 2 1460 - 3604 1683 714 aa, chain + ## HITS:1 COG:yfcX_2 KEGG:ns NR:ns ## COG: yfcX_2 COG1250 # Protein_GI_number: 16130274 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyacyl-CoA dehydrogenase # Organism: Escherichia coli K12 # 308 714 1 407 407 791 99.0 0 MEMTSAFTLNVRLDNIAVITIDVPGEKMNTLKAEFASQVRAIIKQLRENKELRGVVFVSA KPDNFIAGADINMIGNCKTAQEAEALARQGQQLMAEIHALPIQVIAAIHGACLGGGLELA LACHGRVCTDDPKTVLGLPEVQLGLLPGSGGTQRLPRLIGVSTALEMILTGKQLRAKQAL KLGLVDDVVPHSILLEAAVELAKKERPSSRPLPVRERILAGPLGRALLFKMVGKKTEHKT QGNYPATERILEVVETGLAQGTSSGYDAEARAFGELAMTPQSQALRSIFFASTDVKKDPG SDAPPAPLNSVGILGGGLMGGGIAYVTACKAGIPVRIKDINPQGINHALKYSWDQLEGKV RRRHLKASERDKQLALISGTTDYRGFAHRDLIIEAVFENLELKQQMVAEVEQNCAAHTIF ASNTSSLPIGDITAHATRPEQVIGLHFFSPVEKMPLVEIIPHAGTSAQTIATTVKLAKKQ GKTPIVVRDKAGFYVNRILAPYINEAIRMLTQGERVEHIDAALVKFGFPVGPIQLLDEVG IDTGTKIIPVLEAAYGERFSAPANVVSSILNDDRKGRKNGRGFYLYGQKGRKSKKQVDPA 
IYPLIGTQGQGRISAPQVAERCVMLMLNEAVRCVDEQVIRSVRDGDIGAVFGIGFPPFLG GPFRYIDSLGAGEVVAIMQRLATQYGSRFTPCERLVEMGARGESFWKTTATDLQ >gi|296494472|gb|ADTN01000266.1| GENE 3 3822 - 4292 476 156 aa, chain + ## HITS:1 COG:sixA KEGG:ns NR:ns ## COG: sixA COG2062 # Protein_GI_number: 16130273 # Func_class: T Signal transduction mechanisms # Function: Phosphohistidine phosphatase SixA # Organism: Escherichia coli K12 # 1 156 6 161 161 303 100.0 9e-83 MRHGDAALDAASDSVRPLTTNGCDESRLMANWLKGQKVEIERVLVSPFLRAEQTLEEVGD CLNLPSSAEVLPELTPCGDVGLVSAYLQALTNEGVASVLVISHLPLVGYLVAELCPGETP PMFTTSAIASVTLDESGNGTFNWQMSPCNLKMAKAI >gi|296494472|gb|ADTN01000266.1| GENE 4 4973 - 5536 679 187 aa, chain + ## HITS:1 COG:yfcV KEGG:ns NR:ns ## COG: yfcV COG3539 # Protein_GI_number: 16130272 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 187 1 187 187 299 100.0 2e-81 MSKFVKTAIAAAMVMGAFTSTATIAAGNNGTARFYGTIEDSVCSIVPDDHKLEVDMGDIG AEKLKNNGTTTPKNFQIRLQDCVFDTQETMTTTFTGTVSSANSGNYYTIFNTDTGAAFNN VSLAIGDSLGTSYKSGMGIDQKIVKDTATNKGKAKQTLNFKAWLVGAADAPDLGNFEANT TFQITYL >gi|296494472|gb|ADTN01000266.1| GENE 5 5618 - 8263 2546 881 aa, chain + ## HITS:1 COG:yfcUm KEGG:ns NR:ns ## COG: yfcUm COG3188 # Protein_GI_number: 16132272 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli K12 # 1 881 1 881 881 1737 99.0 0 MPDHSLFRLRILPWCIALAMSGSYSSVWAEDDIQFDSRFLELKGDTKIDLKRFSSQGYVE PGKYNLQVQLNKQPLAEEYDIYWYAGEDDVSKSYACLTPELVAQFGLKEDVAKNLQWSHD GKCLKPGQLEGVEIKADLSQSALVISLPQAYLEYTWPDWDPPSRWDDGISGIIADYSITA QTRHEENGGDDSNEISGNGTVGVNLGPWRMRADWQTNYQHTRSNDDDDEFGGDDTQKKWE WSRYYAWRALPSLKAKLALGEDYLNSDIFDGFNYVGGSVSTDDQMLPPNLRGYAPDISGV AHTTAKVTVSQMGRVIYETQVPAGPFRIQDLGDSVSGTLHIRIEEQNGQVQEYDISTASM PYLTRPGQVRYKIMMGRPQEWGHHVEGGFFSGAEASWGIANGWSLYGGALGDENYQSAAL 
GVGRDLSTFGAVAFDVTHSHTKLDKDTAYGKGSLDGNSFRVSYSKDFDQLNSRVTFAGYR FSEENFMTMSEYLDASDSEMVRTGNDKEMYTATYNQNFRDAGVSVYLNYTRHTYWDREEQ TNYNIMLSHYFNMGSIRNMSVSLTGYRYEYDNRADKGMYISLSMPWGDNSTVSYNGNYGS GTDSSQVGYFSRVDDATHYQLNIGTSDKHTSVDGYYSHDGSLAQVDLSANYHEGQYTSAG LSLQGGATLTTHGGALHRTQNMGGTRLLIDADGVADVPVEGNGAAVYTNMFGKAVVSDVN NYYRNQAYIDLNKLPENAEATQSVVQATLTEGAIGYRKFAVISGQKAMAVLRLQDGSHPP FGAEVKNDNEQTVGLVDDDGSVYLAGVKPGEHMSVFWSGVAHCDINLPDPLPADLFNGLL LPCQHKGNVAPVVPDDIKPVIQEQTQQVTPTDPPVSVSANQ >gi|296494472|gb|ADTN01000266.1| GENE 6 8283 - 9035 601 250 aa, chain + ## HITS:1 COG:yfcS KEGG:ns NR:ns ## COG: yfcS COG3121 # Protein_GI_number: 16130271 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli K12 # 1 250 1 250 250 442 100.0 1e-124 MSDLLCSAKLGAMTLALLLSATSLSALASVTPDRTRLIFNESDKSISVTLRNNDPKLPYL AQSWIEDEKGNKITSPLTVLPPVQRIDSMMNGQVKVQGMPDINKLPADRESMFYFNVREI PPKSNKPNTLQIALQTRIKLFWRPKALEKVSMKSPWQHKVTLTRSGQAFTVNNPTPYYVI ISNASAQKNGNPAAGFSPLVIEPKTTVPLNVKMDSVPVLTYVNDFGARMPLFFQCNGNSC QVDEEQSRKG >gi|296494472|gb|ADTN01000266.1| GENE 7 9209 - 9547 227 112 aa, chain + ## HITS:1 COG:yfcR KEGG:ns NR:ns ## COG: yfcR COG3539 # Protein_GI_number: 16130270 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 112 59 170 170 206 100.0 1e-53 MIADNVDGTNYRQDAKYTLNCTNSLANDLRMQLKGNTSTINGETVLSTNITGLGIRIENS ADNSLFAVGENSWTPFNINNQPQLKAVPVKASGAQLAAGEFNASLTMVVDYQ >gi|296494472|gb|ADTN01000266.1| GENE 8 9544 - 10032 200 162 aa, chain + ## HITS:1 COG:yfcQ KEGG:ns NR:ns ## COG: yfcQ COG3539 # Protein_GI_number: 16130269 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 162 1 162 162 322 100.0 2e-88 
MRKTFLTLLCVSSAIAHAADEDITFHGTLLSPPTCSISGGKTIEVEFRDLIIDDINGNYG RKEVPYELTCDSTTRHPDWEMTLTWTGTQTSFNDAAIETDVPGFGIELQHDGQRFKLNTP LAINATDFTQKPKLEAVPVKASDAVLSDTNFSAYATLRVDYQ >gi|296494472|gb|ADTN01000266.1| GENE 9 10029 - 10568 455 179 aa, chain + ## HITS:1 COG:yfcP KEGG:ns NR:ns ## COG: yfcP COG3539 # Protein_GI_number: 16130268 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 179 1 179 179 335 100.0 2e-92 MNKSMIQSGGYVLLAGLILAMSSTLFAADNNLHFSGNLLSKSCALVVDGQYLAEVRFPTV SRQDLNVAGQSARVPVVFKLKDCKGPAGYNVKVTLTGVEDSEQPGFLALDTSSTAQGVGI GMEKTDGMQVAINNTNGATFALTNGNNDINFRAWLQAKSGRDVTIGEFTASLTATFEYI >gi|296494472|gb|ADTN01000266.1| GENE 10 10570 - 11391 136 273 aa, chain + ## HITS:1 COG:no KEGG:JW2329 NR:ns ## KEGG: JW2329 # Name: yfcO # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 273 1 273 273 556 100.0 1e-157 MKILRWLFALVMLIATTEAMAAGHSVDVYYGYNGDSRNIATFNLKIMMPSAVYVGEYKSS QWLMTGEILQNVSWSGPPPAPSVKLIGYHQNINKASCPGLPSGWNCGYYTFEVIVSAEIE SYFSCPWLVIMNDSEASPGGVTYQGPDSHDTICPSVSVQPYDVSWNENYVSKSKLLTLQS TGGVVEKTLSTYLMKDGKLCDSTQMNETGGYCRWVAQMITFTASGCDKAEVSVTPNRHPI TDKQLHDMVVRVDTSSMQPIDSTCRFQYILNEL Prediction of potential genes in microbial genomes Time: Mon May 16 00:03:23 2011 Seq name: gi|296494471|gb|ADTN01000267.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont695.5, whole genome shotgun sequence Length of sequence - 43492 bp Number of predicted genes - 44, with homology - 43 Number of transcription units - 20, operones - 10 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 541 565 ## COG2840 Uncharacterized protein conserved in bacteria - Prom 680 - 739 3.4 + Prom 510 - 569 2.7 2 2 Op 1 3/0.714 + CDS 707 - 1639 1627 ## PROTEIN SUPPORTED gi|89109150|ref|AP_002930.1| N5-glutamine methyltransferase 3 2 Op 2 7/0.143 + CDS 1674 - 2759 1191 ## COG0082 Chorismate synthase 4 2 Op 3 7/0.143 + CDS 2763 - 3587 493 ## COG3770 Murein endopeptidase 5 2 Op 4 5/0.286 + CDS 3587 - 4396 868 ## COG0730 Predicted permeases 6 2 Op 5 . + CDS 4396 - 4944 545 ## COG3101 Uncharacterized protein conserved in bacteria 7 2 Op 6 . + CDS 4978 - 5256 401 ## ECB_02250 hypothetical protein - Term 5089 - 5130 1.9 8 3 Tu 1 . - CDS 5377 - 7383 1430 ## COG0665 Glycine/D-amino acid oxidases (deaminating) - Prom 7408 - 7467 3.7 + Prom 7447 - 7506 3.5 9 4 Tu 1 . + CDS 7542 - 8762 1362 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase + Term 8842 - 8871 2.1 + Prom 8849 - 8908 4.5 10 5 Tu 1 . + CDS 9027 - 10205 1122 ## COG0477 Permeases of the major facilitator superfamily + Term 10253 - 10283 1.8 11 6 Tu 1 . 
- CDS 10202 - 11197 886 ## B21_02206 hypothetical protein - Prom 11220 - 11279 3.4 12 7 Op 1 5/0.286 + CDS 11296 - 12432 1176 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases + Term 12445 - 12485 5.1 13 7 Op 2 5/0.286 + CDS 12498 - 13511 1201 ## COG0136 Aspartate-semialdehyde dehydrogenase 14 7 Op 3 5/0.286 + CDS 13511 - 14323 581 ## COG0101 Pseudouridylate synthase 15 7 Op 4 5/0.286 + CDS 14406 - 15065 568 ## COG0586 Uncharacterized membrane-associated protein + Prom 15068 - 15127 4.7 16 8 Op 1 15/0.000 + CDS 15221 - 16135 1024 ## COG0777 Acetyl-CoA carboxylase beta subunit + Term 16140 - 16185 10.6 17 8 Op 2 7/0.143 + CDS 16205 - 17473 1328 ## COG0285 Folylpolyglutamate synthase 18 8 Op 3 7/0.143 + CDS 17463 - 18125 680 ## COG3147 Uncharacterized protein conserved in bacteria + Term 18254 - 18295 6.4 + Prom 18289 - 18348 4.8 19 9 Op 1 18/0.000 + CDS 18384 - 18872 397 ## COG1286 Uncharacterized membrane protein, required for colicin V production 20 9 Op 2 5/0.286 + CDS 18909 - 20426 1603 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase 21 9 Op 3 5/0.286 + CDS 20521 - 21090 575 ## COG0163 3-polyprenyl-4-hydroxybenzoate decarboxylase + Term 21099 - 21129 3.6 + Prom 21097 - 21156 2.4 22 10 Op 1 9/0.000 + CDS 21356 - 22138 975 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain + Term 22256 - 22299 0.2 + Prom 22189 - 22248 4.3 23 10 Op 2 12/0.000 + CDS 22359 - 23141 924 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain 24 10 Op 3 12/0.000 + CDS 23231 - 23917 722 ## COG4215 ABC-type arginine transport system, permease component 25 10 Op 4 6/0.143 + CDS 23914 - 24630 580 ## COG4160 ABC-type arginine/histidine transport system, permease component 26 10 Op 5 3/0.714 + CDS 24638 - 25411 256 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) + Term 25418 - 25455 6.0 + Prom 25520 - 25579 
3.4 27 11 Tu 1 . + CDS 25608 - 26498 465 ## COG5464 Uncharacterized conserved protein + Term 26520 - 26581 1.8 28 12 Op 1 3/0.714 - CDS 26546 - 27439 891 ## COG1090 Predicted nucleoside-diphosphate sugar epimerase 29 12 Op 2 3/0.714 - CDS 27460 - 27822 427 ## COG1539 Dihydroneopterin aldolase 30 12 Op 3 . - CDS 27879 - 28526 597 ## COG0625 Glutathione S-transferase - Prom 28555 - 28614 3.9 + Prom 28575 - 28634 2.0 31 13 Op 1 2/0.857 + CDS 28662 - 29306 533 ## COG0625 Glutathione S-transferase 32 13 Op 2 3/0.714 + CDS 29362 - 29913 555 ## COG0622 Predicted phosphoesterase 33 14 Tu 1 . + CDS 29971 - 30513 641 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes + Term 30521 - 30546 -0.5 - Term 30506 - 30536 3.0 34 15 Tu 1 . - CDS 30546 - 32066 1568 ## COG1288 Predicted membrane protein - Prom 32143 - 32202 4.3 - Term 32178 - 32218 6.8 35 16 Op 1 14/0.000 - CDS 32256 - 34400 2425 ## COG0857 BioD-like N-terminal domain of phosphotransacetylase - Term 34432 - 34471 10.1 36 16 Op 2 . - CDS 34475 - 35677 1337 ## COG0282 Acetate kinase - Prom 35899 - 35958 3.0 + Prom 35798 - 35857 4.8 37 17 Op 1 2/0.857 + CDS 36015 - 36470 302 ## COG3092 Uncharacterized protein conserved in bacteria + Term 36496 - 36526 -0.5 38 17 Op 2 . + CDS 36553 - 37047 636 ## COG3013 Uncharacterized conserved protein 39 17 Op 3 2/0.857 + CDS 37040 - 37708 549 ## COG0637 Predicted phosphatase/phosphohexomutase + Term 37728 - 37757 1.1 + Prom 37710 - 37769 2.5 40 17 Op 4 . + CDS 37795 - 39627 1559 ## COG0471 Di- and tricarboxylate transporters + Term 39764 - 39816 2.1 41 18 Op 1 6/0.143 - CDS 39686 - 40285 687 ## COG1896 Predicted hydrolases of HD superfamily 42 18 Op 2 . - CDS 40369 - 41586 1133 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 41638 - 41697 1.9 + Prom 41779 - 41838 8.6 43 19 Tu 1 . + CDS 41913 - 41981 86 ## 44 20 Tu 1 . 
+ CDS 42506 - 43444 824 ## COG0583 Transcriptional regulator Predicted protein(s) >gi|296494471|gb|ADTN01000267.1| GENE 1 1 - 541 565 180 aa, chain - ## HITS:1 COG:yfcN KEGG:ns NR:ns ## COG: yfcN COG2840 # Protein_GI_number: 16130266 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 180 1 180 183 344 100.0 4e-95 MKKKTTLSEEDQALFRQLMAGTRKIKQDTIVHRPQRKKISEVPVKRLIQEQADASHYFSD EFQPLLNTEGPVKYVRPDVSHFEAKKLRRGDYSPELFLDLHGLTQLQAKQELGALIAACR REHVFCACVMHGHGKHILKQQTPLWLAQHPHVMAFHQAPKEYGGDAALLVLIEVEEWLPP >gi|296494471|gb|ADTN01000267.1| GENE 2 707 - 1639 1627 310 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|89109150|ref|AP_002930.1| N5-glutamine methyltransferase [Escherichia coli str. K-12 substr. W3110] # 1 310 1 310 310 631 99 1e-180 MDKIFVDEAVNELQTIQDMLRWSVSRFSAANIWYGHGTDNPWDEAVQLVLPSLYLPLDIP EDMRTARLTSSEKHRIVERVIRRVNERIPVAYLTNKAWFCGHEFYVDERVLVPRSPIGEL INNKFAGLISKQPQHILDMCTGSGCIAIACAYAFPDAEVDAVDISPDALAVAEQNIEEHG LIHNVIPIRSDLFRDLPKVQYDLIVTNPPYVDAEDMSDLPNEYRHEPELGLASGTDGLKL TRRILGNAADYLADDGVLICEVGNSMVHLMEQYPDVPFTWLEFDNGGDGGFMLTKEQLIA AREHFAIYKD >gi|296494471|gb|ADTN01000267.1| GENE 3 1674 - 2759 1191 361 aa, chain + ## HITS:1 COG:aroC KEGG:ns NR:ns ## COG: aroC COG0082 # Protein_GI_number: 16130264 # Func_class: E Amino acid transport and metabolism # Function: Chorismate synthase # Organism: Escherichia coli K12 # 1 361 1 361 361 696 99.0 0 MAGNTIGQLFRVTTFGESHGLALGCIVDGIPPGIPLTEADLQHDLDRRRPGTSRYTTQRR EPDQVKILSGVFEGVTTGTSIGLLIENTDQRSQDYSAIKDVFRPGHADYTYEQKYGLRDY RGGGRSSARETAMRVAAGAIAKKYLAEKFGIEIRGCLTQMGDIPLDIKDWSQVEQNPFFC PDPDKIDALDELMRALKKEGDSIGAKVTVVASGVPAGLGEPVFDRLDADIAHALMSINAV KGVEIGDGFDVVALRGSQNRDEITKDGFQSNHAGGILGGISSGQQIIAHMALKPTSSITV PGRTINRFGEEVEMITKGRHDPCVGIRAVPIAEAMLAIVLMDHLLRQRAQNADVKTDIPR W >gi|296494471|gb|ADTN01000267.1| GENE 4 2763 - 3587 493 274 aa, chain + ## HITS:1 COG:mepA KEGG:ns NR:ns ## COG: mepA COG3770 # Protein_GI_number: 16130263 # Func_class: M Cell wall/membrane/envelope 
biogenesis # Function: Murein endopeptidase # Organism: Escherichia coli K12 # 1 274 1 274 274 521 100.0 1e-148 MNKTAIALLALLASSASLAATPWQKITQPVPGSAQSIGSFSNGCIVGADTLPIQSEHYQV MRTDQRRYFGHPDLVMFIQRLSSQVSNLGMGTVLIGDMGMPAGGRFNGGHASHQTGLDVD IFLQLPKTRWTSAQLLRPQALDLVSRDGKHVVSTLWKPEIFSLIKLAAQDKDVTRIFVNP AIKQQLCLDAGTDRDWLRKVRPWFQHRAHMHVRLRCPADSLECEDQPLPPSGDGCGAELQ SWFEPPKPGTTKPEKKTPPPLPPSCQALLDEHVI >gi|296494471|gb|ADTN01000267.1| GENE 5 3587 - 4396 868 269 aa, chain + ## HITS:1 COG:ECs3211 KEGG:ns NR:ns ## COG: ECs3211 COG0730 # Protein_GI_number: 15832465 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Escherichia coli O157:H7 # 1 269 1 269 269 450 100.0 1e-126 METFNSLFMVSPLLLGVLFFVAMLAGFIDSIAGGGGLLTIPALMAAGMSPANALATNKLQ ACGGSISATIYFIRRKVVSLSDQKLNIAMTFVGSMSGALLVQYVQADVLRQILPILVICI GLYFLLMPKLGEEDRQRRMYGLPFALIAGGCVGFYDGFFGPAAGSFYALAFVTLCGFNLA KATAHAKLLNATSNIGGLLLFILGGKVIWATGFVMLVGQFLGARMGSRLVLSKGQKLIRP MIVIVSAVMSAKLLYDSHGQEILHWLGMN >gi|296494471|gb|ADTN01000267.1| GENE 6 4396 - 4944 545 182 aa, chain + ## HITS:1 COG:yfcM KEGG:ns NR:ns ## COG: yfcM COG3101 # Protein_GI_number: 16130261 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 182 1 182 182 379 100.0 1e-105 MNSTHHYEQLIEIFNSCFADDFNTRLIKGDDEPIYLPADAEVPYNRIVFAHGFYASAIHE ISHWCIAGKARRELVDFGYWYCPDGRDAQTQSQFEDVEVKPQALDWLFCVAAGYPFNVSC DNLEGDFEPDRVVFQRRVHAQVMDYLTNGIPERPARFIKALQNYYHTPELTAEQFPWPEA LN >gi|296494471|gb|ADTN01000267.1| GENE 7 4978 - 5256 401 92 aa, chain + ## HITS:1 COG:no KEGG:ECB_02250 NR:ns ## KEGG: ECB_02250 # Name: yfcL # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 92 1 92 92 154 100.0 1e-36 MIAEFESRILALIDGMVDHASDDELFASGYLRGHLTLAIAELESGDDHSAQAVHTTVSQS LEKAIGAGELSPRDQALVTDMWENLFQQASQQ >gi|296494471|gb|ADTN01000267.1| GENE 8 5377 - 7383 1430 668 aa, chain - ## HITS:1 COG:yfcK_2 KEGG:ns NR:ns ## COG: yfcK_2 COG0665 # Protein_GI_number: 16130259 # 
Func_class: E Amino acid transport and metabolism # Function: Glycine/D-amino acid oxidases (deaminating) # Organism: Escherichia coli K12 # 256 668 1 413 413 833 100.0 0 MKHYSIQPANLEFNAEGTPVSRDFDDVYFSNDNGLEETRYVFLGGNQLEVRFPEHPHPLF VVAESGFGTGLNFLTLWQAFDQFREAHPQAQLQRLHFISFEKFPLTRADLALAHQHWPEL APWAEQLQAQWPMPLPGCHRLLLDEGRVTLDLWFGDINELTSQLDDSLNQKVDAWFLDGF APAKNPDMWTQNLFNAMARLARPGGTLATFTSAGFVRRGLQDAGFTMQKRKGFGRKREML CGVMEQTLPLPCSAPWFNRTGSSKREAAIIGGGIASALLSLALLRRGWQVTLYCADEAPA LGASGNRQGALYPLLSKHDEALNRFFSNAFTFARRFYDQLPVKFDHDWCGVTQLGWDEKS QHKIAQMLSMDLPAELAVAVEANAVEQITGVATNCSGITYPQGGWLCPAELTRNVLELAQ QQGLQIYYQYQLQNLSRKDDCWLLNFAGDQQATHSVVVLANGHQISRFSQTSTLPVYSVA GQVSHIPTTPELAELKQVLCYDGYLTPQNPANQHHCIGASYHRGSEDTAYSEDDQQQNRQ RLIDCFPQAQWAKEVDVSDKEARCGVRCATRDHLPMVGNVPDYEATLVEYASLAEQKDEA VSAPVFDDLFMFAALGSRGLCSAPLCAEILAAQMSDEPIPMDASTLAALNPNRLWVRKLL KGKAVKAG >gi|296494471|gb|ADTN01000267.1| GENE 9 7542 - 8762 1362 406 aa, chain + ## HITS:1 COG:fabB KEGG:ns NR:ns ## COG: fabB COG0304 # Protein_GI_number: 16130258 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Escherichia coli K12 # 1 406 1 406 406 770 100.0 0 MKRAVITGLGIVSSIGNNQQEVLASLREGRSGITFSQELKDSGMRSHVWGNVKLDTTGLI DRKVVRFMSDASIYAFLSMEQAIADAGLSPEAYQNNPRVGLIAGSGGGSPRFQVFGADAM RGPRGLKAVGPYVVTKAMASGVSACLATPFKIHGVNYSISSACATSAHCIGNAVEQIQLG KQDIVFAGGGEELCWEMACEFDAMGALSTKYNDTPEKASRTYDAHRDGFVIAGGGGMVVV EELEHALARGAHIYAEIVGYGATSDGADMVAPSGEGAVRCMKMAMHGVDTPIDYLNSHGT STPVGDVKELAAIREVFGDKSPAISATKAMTGHSLGAAGVQEAIYSLLMLEHGFIAPSIN IEELDEQAAGLNIVTETTDRELTTVMSNSFGFGGTNATLVMRKLKD >gi|296494471|gb|ADTN01000267.1| GENE 10 9027 - 10205 1122 392 aa, chain + ## HITS:1 COG:yfcJ KEGG:ns NR:ns ## COG: yfcJ COG0477 # Protein_GI_number: 16130257 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: 
Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 392 1 392 392 590 100.0 1e-168 MTAVSQTETRSSANFSLFRIAFAVFLTYMTVGLPLPVIPLFVHHELGYGNTMVGIAVGIQ FLATVLTRGYAGRLADQYGAKRSALQGMLACGLAGGALLLAAILPVSAPFKFALLVVGRL ILGFGESQLLTGALTWGLGIVGPKHSGKVMSWNGMAIYGALAVGAPLGLLIHSHYGFAAL AITTMVLPVLAWACNGTVRKVPALAGERPSLWSVVGLIWKPGLGLALQGVGFAVIGTFVS LYFASKGWAMAGFTLTAFGGAFVVMRVMFGWMPDRFGGVKVAIVSLLVETVGLLLLWQAP GAWVALAGAALTGAGCSLIFPALGVEVVKRVPSQVRGTALGGYAAFQDIALGVSGPLAGM LATTFGYSSVFLAGAISAVLGIIVTILSFRRG >gi|296494471|gb|ADTN01000267.1| GENE 11 10202 - 11197 886 331 aa, chain - ## HITS:1 COG:no KEGG:B21_02206 NR:ns ## KEGG: B21_02206 # Name: flk # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 331 1 331 331 518 100.0 1e-145 MIQPISGPPPGQPPGQGDNLPSGTGNQPLSSQQRTSLESLMTKVTSLTQQQRAELWAGIR HDIGLSGDSPLLSRHFPAAEHNLAQRLLAAQKSHSARQLLAQLGEYLRLGNNRQAVTDYI RHNFGQTPLNQLSPEQLKTILTLLQEGKMVIPQPQQREATDRPLLPAEHNALKQLVTKLA AATGEPSKQIWQSMLELSGVKDGELIPAKLFNHLVTWLQARQTLSQQNTPTLESLQMTLK QPLDASELAALSAYIQQKYGLSAQSSLSSAQAEDILNQLYQRRVKGIDPRVMQPLLNPFP PMMDTLQNMATRPALWILLVAIILMLVWLVR >gi|296494471|gb|ADTN01000267.1| GENE 12 11296 - 12432 1176 378 aa, chain + ## HITS:1 COG:pdxB KEGG:ns NR:ns ## COG: pdxB COG0111 # Protein_GI_number: 16130255 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Escherichia coli K12 # 1 378 1 378 378 743 99.0 0 MKILVDENMPYARDLFSRLGEVTAVPGRPIPVAQLADADALMVRSVTKVNESLLAGKPIK FVGTATAGTDHVDEAWLKQAGIGFSAAPGCNAIAVVEYVFSSLLMLAERDGFSLYDRTVG IVGVGNVGRRLQARLEALGIKTLLCDPPRADRGDEGDFRSLDELVQRADILTFHTPLFKD GPYKTLHLADEKLIRSLKPGAILINACRGAVVDNTALLTCLNEGQKLSVVLDVWEGEPEL NVELLKKVDIGTPHIAGYTLEGKARGTTQVFEAYSKFIGHEQHVALDTLLPAPEFGRITL HGPLDQPTLKRLVHLVYDVRRDDAPLRKVAGIPGEFDKLRKNYLERREWSSLYVICDDAS AASLLCKLGFNAVHHPAR >gi|296494471|gb|ADTN01000267.1| GENE 13 12498 - 13511 1201 337 aa, chain + ## HITS:1 COG:usg KEGG:ns NR:ns ## COG: usg COG0136 
# Protein_GI_number: 16130254 # Func_class: E Amino acid transport and metabolism # Function: Aspartate-semialdehyde dehydrogenase # Organism: Escherichia coli K12 # 1 337 1 337 337 643 100.0 0 MSEGWNIAVLGATGAVGEALLETLAERQFPVGEIYALARNESAGEQLRFGGKTITVQDAA EFDWTQAQLAFFVAGKEATAAWVEEATNSGCLVIDSSGLFALEPDVPLVVPEVNPFVLTD YRNRNVIAVPDSLTSQLLAALKPLIDQGGLSRISVTSLISASAQGKKAVDALAGQSAKLL NGIPIDEEDFFGRQLAFNMLPLLPDSEGSVREERRIVDEVRKILQDEGLMISASVVQAPV FYGHAQMVNFEALRPLAAEEARDAFVQGEDIVLSEENEFPTQVGDASGTPHLSVGCVRND YGMPEQVQFWSVADNVRFGGALMAVKIAEKLVQEYLY >gi|296494471|gb|ADTN01000267.1| GENE 14 13511 - 14323 581 270 aa, chain + ## HITS:1 COG:truA KEGG:ns NR:ns ## COG: truA COG0101 # Protein_GI_number: 16130253 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Escherichia coli K12 # 1 270 1 270 270 556 99.0 1e-158 MSDQQQPPVYKIALGIEYDGSKYYGWQRQNEVRSVQEKLEKALSQVANEPITVFCAGRTD AGVHGTGQVVHFETTALRKDAAWTLGVNANLPGDIAVRWVKAVPDDFHARFSATARRYRY IIYNHRLRPAVLSKGVTHFYEPLDAERMHRAAQCLLGENDFTSFRAVQCQSRTPWRNVMH INVTRHGPYVVVDIKANAFVHHMVRNIVGSLMEVGAHNQPESWIAELLAAKDRTLAAATA KAEGLYLVAVDYPDRYDLPKPPMGPLFLAD >gi|296494471|gb|ADTN01000267.1| GENE 15 14406 - 15065 568 219 aa, chain + ## HITS:1 COG:STM2367 KEGG:ns NR:ns ## COG: STM2367 COG0586 # Protein_GI_number: 16765694 # Func_class: S Function unknown # Function: Uncharacterized membrane-associated protein # Organism: Salmonella typhimurium LT2 # 1 219 1 219 219 388 96.0 1e-108 MDLIYFLIDFILHIDVHLAELVAEYGVWVYAILFLILFCETGLVVTPFLPGDSLLFVAGA LASLETNDLNVHMMVVLMLIAAIVGDAVNYTIGRLFGEKLFSNPNSKIFRRSYLDKTHQF YEKHGGKTIILARFVPIVRTFAPFVAGMGHMSYRHFAAYNVIGALLWVLLFTYAGYFFGT IPMVQDNLKLLIVGIIVVSILPGVIEIIRHKRAAARAAK >gi|296494471|gb|ADTN01000267.1| GENE 16 15221 - 16135 1024 304 aa, chain + ## HITS:1 COG:ECs3200 KEGG:ns NR:ns ## COG: ECs3200 COG0777 # Protein_GI_number: 15832454 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase beta subunit # Organism: Escherichia coli O157:H7 # 1 304 1 
304 304 584 100.0 1e-167 MSWIERIKSNITPTRKASIPEGVWTKCDSCGQVLYRAELERNLEVCPKCDHHMRMTARNR LHSLLDEGSLVELGSELEPKDVLKFRDSKKYKDRLASAQKETGEKDALVVMKGTLYGMPV VAAAFEFAFMGGSMGSVVGARFVRAVEQALEDNCPLICFSASGGARMQEALMSLMQMAKT SAALAKMQERGLPYISVLTDPTMGGVSASFAMLGDLNIAEPKALIGFAGPRVIEQTVREK LPPGFQRSEFLIEKGAIDMIVRRPEMRLKLASILAKLMNLPAPNPEAPREGVVVPPVPDQ EPEA >gi|296494471|gb|ADTN01000267.1| GENE 17 16205 - 17473 1328 422 aa, chain + ## HITS:1 COG:folC KEGG:ns NR:ns ## COG: folC COG0285 # Protein_GI_number: 16130250 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Escherichia coli K12 # 1 422 1 422 422 815 100.0 0 MIIKRTPQAASPLASWLSYLENLHSKTIDLGLERVSLVAARLGVLKPAPFVFTVAGTNGK GTTCRTLESILMAAGYKVGVYSSPHLVRYTERVRVQGQELPESAHTASFAEIESARGDIS LTYFEYGTLSALWLFKQAQLDVVILEVGLGGRLDATNIVDADVAVVTSIALDHTDWLGPD RESIGREKAGIFRSEKPAIVGEPEMPSTIADVAQEKGALLQRRGVEWNYSVTDHDWAFSD AHGTLENLPLPLVPQPNAATALAALRASGLEVSENAIRDGIASAILPGRFQIVSESPRVI FDVAHNPHAAEYLTGRMKALPKNGRVLAVIGMLHDKDIAGTLAWLKSVVDDWYCAPLEGP RGATAEQLLEHLGNGKSFDSVAQAWDAAMADAKAEDTVLVCGSFHTVAHVMEVIDARRSG GK >gi|296494471|gb|ADTN01000267.1| GENE 18 17463 - 18125 680 220 aa, chain + ## HITS:1 COG:ZdedD KEGG:ns NR:ns ## COG: ZdedD COG3147 # Protein_GI_number: 15802861 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 EDL933 # 1 220 1 220 220 297 99.0 1e-80 MASKFQNRLVGTIVLVALGVIVLPGLLDGQKKHYQDEFAAIPLVPKAGDRDEPDMMPAAT QALPTQPPEGAAEEVRAGDAAAPSLDPATIAANNTEFEPEPAPVAPPKPKPVEPPKPKVE APPAPKPEPKPVVEEKAAPTGKAYVVQLGALKNADKVNEIVGKLRGAGYRVYTSPSTPVQ GKITRILVGPDASKDKLKGSLGELKQLSGLSGVVMGYTPN >gi|296494471|gb|ADTN01000267.1| GENE 19 18384 - 18872 397 162 aa, chain + ## HITS:1 COG:cvpA KEGG:ns NR:ns ## COG: cvpA COG1286 # Protein_GI_number: 16130248 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, required for colicin V production # Organism: Escherichia coli K12 # 1 162 1 162 162 267 98.0 6e-72 
MVWIDYAIIAVIAFSSLVSLIRGFVREALSLVTWGCAFFVASHYYTYLSVWFTGFEDELV RNGIAIAVLFIATLIVGAIVNFVIGQLVEKTGLSGTDRVLGICFGALRGVLIVAAILFFL DSFTGVSKSEDWSKSQLIPQFSFIIRWFFDYLQSSSSFLPRA >gi|296494471|gb|ADTN01000267.1| GENE 20 18909 - 20426 1603 505 aa, chain + ## HITS:1 COG:purF KEGG:ns NR:ns ## COG: purF COG0034 # Protein_GI_number: 16130247 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Escherichia coli K12 # 1 505 1 505 505 1027 100.0 0 MCGIVGIAGVMPVNQSIYDALTVLQHRGQDAAGIITIDANNCFRLRKANGLVSDVFEARH MQRLQGNMGIGHVRYPTAGSSSASEAQPFYVNSPYGITLAHNGNLTNAHELRKKLFEEKR RHINTTSDSEILLNIFASELDNFRHYPLEADNIFAAIAATNRLIRGAYACVAMIIGHGMV AFRDPNGIRPLVLGKRDIDENRTEYMVASESVALDTLGFDFLRDVAPGEAIYITEEGQLF TRQCADNPVSNPCLFEYVYFARPDSFIDKISVYSARVNMGTKLGEKIAREWEDLDIDVVI PIPETSCDIALEIARILGKPYRQGFVKNRYVGRTFIMPGQQLRRKSVRRKLNANRAEFRD KNVLLVDDSIVRGTTSEQIIEMAREAGAKKVYLASAAPEIRFPNVYGIDMPSATELIAHG REVDEIRQIIGADGLIFQDLNDLIDAVRAENPDIQQFECSVFNGVYVTKDVDQGYLDFLD TLRNDDAKAVQRQNEVENLEMHNEG >gi|296494471|gb|ADTN01000267.1| GENE 21 20521 - 21090 575 189 aa, chain + ## HITS:1 COG:ECs3195 KEGG:ns NR:ns ## COG: ECs3195 COG0163 # Protein_GI_number: 15832449 # Func_class: H Coenzyme transport and metabolism # Function: 3-polyprenyl-4-hydroxybenzoate decarboxylase # Organism: Escherichia coli O157:H7 # 1 189 1 189 189 356 100.0 2e-98 MKRLIVGISGASGAIYGVRLLQVLRDVTDIETHLVMSQAARQTLSLETDFSLREVQALAD VTHDARDIAASISSGSFQTLGMVILPCSIKTLSGIVHSYTDGLLTRAADVVLKERRPLVL CVRETPLHLGHLRLMTQAAEIGAVIMPPVPAFYHRPQSLDDVINQTVNRVLDQFAITLPE DLFARWQGA >gi|296494471|gb|ADTN01000267.1| GENE 22 21356 - 22138 975 260 aa, chain + ## HITS:1 COG:argT KEGG:ns NR:ns ## COG: argT COG0834 # Protein_GI_number: 16130245 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Escherichia coli K12 # 1 260 1 260 260 467 100.0 1e-131 
MKKSILALSLLVGLSTAASSYAALPETVRIGTDTTYAPFSSKDAKGDFVGFDIDLGNEMC KRMQVKCTWVASDFDALIPSLKAKKIDAIISSLSITDKRQQEIAFSDKLYAADSRLIAAK GSPIQPTLDSLKGKHVGVLQGSTQEAYANETWRSKGVDVVAYANQDLVYSDLAAGRLDAA LQDEVAASEGFLKQPAGKDFAFAGSSVKDKKYFGDGTGVGLRKDDAELTAAFNKALGELR QDGTYDKMAKKYFDFNVYGD >gi|296494471|gb|ADTN01000267.1| GENE 23 22359 - 23141 924 260 aa, chain + ## HITS:1 COG:ECs3193 KEGG:ns NR:ns ## COG: ECs3193 COG0834 # Protein_GI_number: 15832447 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Escherichia coli O157:H7 # 24 260 24 260 260 464 100.0 1e-131 MKKLVLSLSLVLAFSSATAAFAAIPQNIRIGTDPTYAPFESKNSQGELVGFDIDLAKELC KRINTQCTFVENPLDALIPSLKAKKIDAIMSSLSITEKRQQEIAFTDKLYAADSRLVVAK NSDIQPTVESLKGKRVGVLQGTTQETFGNEHWAPKGIEIVSYQGQDNIYSDLTAGRIDAA FQDEVAASEGFLKQPVGKDYKFGGPSVKDEKLFGVGTGMGLRKEDNELREALNKAFAEMR ADGTYEKLAKKYFDFDVYGG >gi|296494471|gb|ADTN01000267.1| GENE 24 23231 - 23917 722 228 aa, chain + ## HITS:1 COG:hisQ KEGG:ns NR:ns ## COG: hisQ COG4215 # Protein_GI_number: 16130243 # Func_class: E Amino acid transport and metabolism # Function: ABC-type arginine transport system, permease component # Organism: Escherichia coli K12 # 1 228 1 228 228 395 100.0 1e-110 MLYGFSGVILQGALVTLELAISSVVLAVIIGLIGAGGKLSQNRLSGLIFEGYTTLIRGVP DLVLMLLIFYGLQIALNTVTEAMGVGQIDIDPMVAGIITLGFIYGAYFTETFRGAFMAVP KGHIEAATAFGFTRGQVFRRIMFPSMMRYALPGIGNNWQVILKSTALVSLLGLEDVVKAT QLAGKSTWEPFYFAIVCGVIYLVFTTVSNGVLLFLERRYSVGVKRADL >gi|296494471|gb|ADTN01000267.1| GENE 25 23914 - 24630 580 238 aa, chain + ## HITS:1 COG:ECs3191 KEGG:ns NR:ns ## COG: ECs3191 COG4160 # Protein_GI_number: 15832445 # Func_class: E Amino acid transport and metabolism # Function: ABC-type arginine/histidine transport system, permease component # Organism: Escherichia coli O157:H7 # 1 238 1 238 238 434 100.0 1e-122 MIEILHEYWKPLLWTDGYRFTGVAITLWLLILSVVIGGVLALFLAIGRVSSNKYIQFPIW 
LFTYIFRGTPLYVQLLVFYSGMYTLEIVKGTEFLNAFFRSGLNCTVLALTLNTCAYTTEI FAGAIRSVPHGEIEAARAYGFSTFKMYRCIILPSALRIALPAYSNEVILMLHSTALAFTA TVPDLLKIARDINAATYQPFTAFGIAAVLYLIISYVLISLFRRAEKRWLQHVKPSSTH >gi|296494471|gb|ADTN01000267.1| GENE 26 24638 - 25411 256 257 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 20 232 20 221 223 103 34 2e-21 MSENKLNVIDLHKRYGEHEVLKGVSLQANAGDVISIIGSSGSGKSTFLRCINFLEKPSEG SIVVNGQTINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWSHMTVLENVMEAPIQV LGLSKQEARERAVKYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPEVLLFDEPT SALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFARHVSTHVIFLHQGKIEEEGAPEQLF GNPQSPRLQRFLKGSLK >gi|296494471|gb|ADTN01000267.1| GENE 27 25608 - 26498 465 296 aa, chain + ## HITS:1 COG:yfcI KEGG:ns NR:ns ## COG: yfcI COG5464 # Protein_GI_number: 16130240 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 296 1 296 296 584 100.0 1e-167 MTISTTSTPHDAVFKSFLRHPDTARDFIDIHLPAPLRKLCDLTTLKLEPNSFIDEDLRQY YSDLLWSVKTQEGVGYIYVVIEHQSKPEELMAFRMMRYSIAAMQNHLDAGYKELPLVLPM LFYHGCRSPYPYSLCWLDEFAEPAIARKIYSSAFPLVDITVVPDDEIMQHRKMALLELIQ KHIRQRDLLGLVDQIVSLLVTGNTNDRQLKALFNYVLQTGDAQRFRAFIGEIAERAPQEK EKLMTIADRLREEGAMQGKHEEALRIAQEMLDRGLDRELVMMVTRLSPDDLIAQSH >gi|296494471|gb|ADTN01000267.1| GENE 28 26546 - 27439 891 297 aa, chain - ## HITS:1 COG:yfcH KEGG:ns NR:ns ## COG: yfcH COG1090 # Protein_GI_number: 16130239 # Func_class: R General function prediction only # Function: Predicted nucleoside-diphosphate sugar epimerase # Organism: Escherichia coli K12 # 1 297 1 297 297 607 100.0 1e-174 MNIVITGGTGLIGRHLIPRLLELGHQITVVTRNPQKASSVLGPRVTLWQGLADQSNLNGV DAVINLAGEPIADKRWTHEQKERLCQSRWNITQKLVDLINASDTPPSVLISGSATGYYGD LGEVVVTEEEPPHNEFTHKLCARWEEIACRAQSDKTRVCLLRTGVVLAPDGGILGKMLPP FRLGLGGPIGSGRQYLAWIHIDDMVNGILWLLDNELRGPFNMVSPYPVRNEQFAHALGHA LHRPAILRVPATAIRLLMGESSVLVLGGQRALPKRLEEAGFAFRWYDLEEALADVVR >gi|296494471|gb|ADTN01000267.1| GENE 29 27460 - 27822 
427 120 aa, chain - ## HITS:1 COG:ECs3187 KEGG:ns NR:ns ## COG: ECs3187 COG1539 # Protein_GI_number: 15832441 # Func_class: H Coenzyme transport and metabolism # Function: Dihydroneopterin aldolase # Organism: Escherichia coli O157:H7 # 1 120 1 120 120 211 100.0 2e-55 MAQPAAIIRIKNLRLRTFIGIKEEEINNRQDIVINVTIHYPADKARTSEDINDALNYRTV TKNIIQHVENNRFSLLEKLTQDVLDIAREHHWVTYAEVEIDKLHALRYADSVSMTLSWQR >gi|296494471|gb|ADTN01000267.1| GENE 30 27879 - 28526 597 215 aa, chain - ## HITS:1 COG:yfcG KEGG:ns NR:ns ## COG: yfcG COG0625 # Protein_GI_number: 16130237 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione S-transferase # Organism: Escherichia coli K12 # 1 215 1 215 215 442 100.0 1e-124 MIDLYFAPTPNGHKITLFLEEAELDYRLIKVDLGKGGQFRPEFLRISPNNKIPAIVDHSP ADGGEPLSLFESGAILLYLAEKTGLFLSHETRERAATLQWLFWQVGGLGPMLGQNHHFNH AAPQTIPYAIERYQVETQRLYHVLNKRLENSPWLGGENYSIADIACWPWVNAWTRQRIDL AMYPAVKNWHERIRSRPATGQALLKAQLGDERSDS >gi|296494471|gb|ADTN01000267.1| GENE 31 28662 - 29306 533 214 aa, chain + ## HITS:1 COG:yfcF KEGG:ns NR:ns ## COG: yfcF COG0625 # Protein_GI_number: 16130236 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione S-transferase # Organism: Escherichia coli K12 # 1 214 1 214 214 420 100.0 1e-117 MSKPAITLWSDAHFFSPYVLSAWVALQEKGLSFHIKTIDLDSGEHLQPTWQGYGQTRRVP LLQIDDFELSESSAIAEYLEDRFAPPTWERIYPLDLENRARARQIQAWLRSDLMPIREER PTDVVFAGAKKAPLTAEGKASAEKLFAMAEHLLVLGQPNLFGEWCIADTDLALMINRLVL HGDEVPERLVDYATFQWQRASVQRFIALSAKQSG >gi|296494471|gb|ADTN01000267.1| GENE 32 29362 - 29913 555 183 aa, chain + ## HITS:1 COG:ECs3184 KEGG:ns NR:ns ## COG: ECs3184 COG0622 # Protein_GI_number: 15832438 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Escherichia coli O157:H7 # 1 183 2 184 184 362 100.0 1e-100 MKLMFASDIHGSLPATERVLELFAQSGAQWLVILGDVLNHGPRNALPEGYAPAKVAERLN EVAHKVIAVRGNCDSEVDQMLLHFPITAPWQQVLLEKQRLFLTHGHLFGPENLPALNQND 
VLVYGHTHLPVAEQRGEIFHFNPGSVSIPKGGNPASYGMLDNDVLSVIALNDQSIIAQVA INP >gi|296494471|gb|ADTN01000267.1| GENE 33 29971 - 30513 641 180 aa, chain + ## HITS:1 COG:ECs3183 KEGG:ns NR:ns ## COG: ECs3183 COG0494 # Protein_GI_number: 15832437 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli O157:H7 # 1 180 1 180 180 329 100.0 1e-90 MEQRRLASTEWVDIVNEENEVIAQASREQMRAQCLRHRATYIVVHDGMGKILVQRRTETK DFLPGMLDATAGGVVQADEQLLESARREAEEELGIAGVPFAEHGQFYFEDKNCRVWGALF SCVSHGPFALQEDEVSEVCWLTPEEITARCDEFTPDSLKALALWMKRNAKNEAVETETAE >gi|296494471|gb|ADTN01000267.1| GENE 34 30546 - 32066 1568 506 aa, chain - ## HITS:1 COG:yfcC KEGG:ns NR:ns ## COG: yfcC COG1288 # Protein_GI_number: 16130233 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 506 8 513 513 921 100.0 0 MSAITESKPTRRWAMPDTLVIIFFVAILTSLATWVVPVGMFDSQEVQYQVDGQTKTRKVV DPHSFRILTNEAGEPEYHRVQLFTTGDERPGLMNFPFEGLTSGSKYGTAVGIIMFMLVIG GAFGIVMRTGTIDNGILALIRHTRGNEILFIPALFILFSLGGAVFGMGEEAVAFAIIIAP LMVRLGYDSITTVLVTYIATQIGFASSWMNPFCVVVAQGIAGVPVLSGSGLRIVVWVIAT LIGLIFTMVYASRVKKNPLLSRVHESDRFFREKQADVEQRPFTFGDWLVLIVLTAVMVWV IWGVIVNAWFIPEIASQFFTMGLVIGIIGVVFRLNGMTVNTMASSFTEGARMMIAPALLV GFAKGILLLVGNGEAGDASVLNTILNSIANAISGLDNAVAAWFMLLFQAVFNFFVTSGSG QAALTMPLLAPLGDLVGVNRQVTVLAFQFGDGFSHIIYPTSASLMATLGVCRVDFRNWLK VGATLLGLLFIMSSVVVIGAQLMGYH >gi|296494471|gb|ADTN01000267.1| GENE 35 32256 - 34400 2425 714 aa, chain - ## HITS:1 COG:pta_1 KEGG:ns NR:ns ## COG: pta_1 COG0857 # Protein_GI_number: 16130232 # Func_class: R General function prediction only # Function: BioD-like N-terminal domain of phosphotransacetylase # Organism: Escherichia coli K12 # 1 391 1 391 391 752 100.0 0 MSRIIMLIPTGTSVGLTSVSLGVIRAMERKGVRLSVFKPIAQPRTGGDAPDQTTTIVRAN SSTTTAAEPLKMSYVEGLLSSNQKDVLMEEIVANYHANTKDAEVVLVEGLVPTRKHQFAQ SLNYEIAKTLNAEIVFVMSQGTDTPEQLKERIELTRNSFGGAKNTNITGVIVNKLNAPVD 
EQGRTRPDLSEIFDDSSKAKVNNVDPAKLQESSPLPVLGAVPWSFDLIATRAIDMARHLN ATIINEGDINTRRVKSVTFCARSIPHMLEHFRAGSLLVTSADRPDVLVAACLAAMNGVEI GALLLTGGYEMDARISKLCERAFATGLPVFMVNTNTWQTSLSLQSFNLEVPVDDHERIEK VQEYVANYINADWIESLTATSERSRRLSPPAFRYQLTELARKAGKRIVLPEGDEPRTVKA AAICAERGIATCVLLGNPAEINRVAASQGVELGAGIEIVDPEVVRESYVGRLVELRKNKG MTETVAREQLEDNVVLGTLMLEQDEVDGLVSGAVHTTANTIRPPLQLIKTAPGSSLVSSV FFMLLPEQVYVYGDCAINPDPTAEQLAEIAIQSADSAAAFGIEPRVAMLSYSTGTSGAGS DVEKVREATRLAQEKRPDLMIDGPLQYDAAVMADVAKSKAPNSPVAGRATVFIFPDLNTG NTTYKAVQRSADLISIGPMLQGMRKPVNDLSRGALVDDIVYTIALTAIQSAQQQ >gi|296494471|gb|ADTN01000267.1| GENE 36 34475 - 35677 1337 400 aa, chain - ## HITS:1 COG:ECs3180 KEGG:ns NR:ns ## COG: ECs3180 COG0282 # Protein_GI_number: 15832434 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Escherichia coli O157:H7 # 1 400 1 400 400 803 100.0 0 MSSKLVLVLNCGSSSLKFAIIDAVNGEEYLSGLAECFHLPEARIKWKMDGNKQEAALGAG AAHSEALNFIVNTILAQKPELSAQLTAIGHRIVHGGEKYTSSVVIDESVIQGIKDAASFA PLHNPAHLIGIEEALKSFPQLKDKNVAVFDTAFHQTMPEESYLYALPYNLYKEHGIRRYG AHGTSHFYVTQEAAKMLNKPVEELNIITCHLGNGGSVSAIRNGKCVDTSMGLTPLEGLVM GTRSGDIDPAIIFHLHDTLGMSVDAINKLLTKESGLLGLTEVTSDCRYVEDNYATKEDAK RAMDVYCHRLAKYIGAYTALMDGRLDAVVFTGGIGENAAMVRELSLGKLGVLGFEVDHER NLAARFGKSGFINKEGTRPAVVIPTNEELVIAQDASRLTA >gi|296494471|gb|ADTN01000267.1| GENE 37 36015 - 36470 302 151 aa, chain + ## HITS:1 COG:yfbV KEGG:ns NR:ns ## COG: yfbV COG3092 # Protein_GI_number: 16130230 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 151 1 151 151 296 100.0 7e-81 MSTPDNRSVNFFSLFRRGQHYSKTWPLEKRLAPVFVENRVIKMTRYAIRFMPPIAVFTLC WQIALGGQLGPAVATALFALSLPMQGLWWLGKRSVTPLPPAILNWFYEVRGKLQESGQVL APVEGKPDYQALADTLKRAFKQLDKTFLDDL >gi|296494471|gb|ADTN01000267.1| GENE 38 36553 - 37047 636 164 aa, chain + ## HITS:1 COG:ECs3178 KEGG:ns NR:ns ## COG: ECs3178 COG3013 # Protein_GI_number: 15832432 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia 
coli O157:H7 # 1 164 7 170 170 318 100.0 3e-87 MEMTNAQRLILSNQYKMMTMLDPANAERYRRLQTIIERGYGLQMRELDREFGELKEETCR TIIDIMEMYHALHVSWSNLQDQQSIDERRVTFLGFDAATEARYLGYVRFMVNVEGRYTHF DAGTHGFNAQTPMWEKYQRMLNVWHACPRQYHLSANEINQIINA >gi|296494471|gb|ADTN01000267.1| GENE 39 37040 - 37708 549 222 aa, chain + ## HITS:1 COG:yfbT KEGG:ns NR:ns ## COG: yfbT COG0637 # Protein_GI_number: 16130228 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Escherichia coli K12 # 1 222 1 222 222 402 100.0 1e-112 MPEEARVRCKGFLFDLDGTLVDSLPAVERAWSNWARRHGLAPEEVLAFIHGKQAITSLRH FMAGKSEADIAAEFTRLEHIEATETEGITALPGAIALLSHLNKAGIPWAIVTSGSMPVAR ARHKIAGLPAPEVFVTAERVKRGKPEPDAYLLGAQLLGLAPQECVVVEDAPAGVLSGLAA GCHVIAVNAPADTPRLNEVDLVLHSLEQITVTKQPNGDVIIQ >gi|296494471|gb|ADTN01000267.1| GENE 40 37795 - 39627 1559 610 aa, chain + ## HITS:1 COG:ECs3176 KEGG:ns NR:ns ## COG: ECs3176 COG0471 # Protein_GI_number: 15832430 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Escherichia coli O157:H7 # 1 610 1 610 610 1085 100.0 0 MNGELIWVLSLLAVAIVLFATGRVRMDAVALFVIVAFALSGTLTVPEVFSGFSDPNVVLI AALFIIGDGLVRTGVATVMGTWLVKVAGNSEIKMLVLLMLTVAGLGAFMSSTGVVAIFIP VVLSVAMRMQTSPSRLMMPLSFAGLISGMMTLVATPPNLVVNSELLREGYHGFSFFSVTP IGLVVLVLGILYMLVMRFMLKGDTQTPQREGWTRRTFRDLIREYRLTGRARRLAIRPGSP MIGQRLDDLKLRERYGANVIGVERWRRFRRVIVNVNGVSEFRARDVLLIDMSAADVDLRQ FCSEQLLEPMVLRGEYFSDQALDVGMAEISLIPESELIGKSVREIGFRTRYGLNVVGLKR NGVALEGSLADEPLLLGDIILVVGNWKLIGMLAKQGRDFVALNLPEEVSEASPAHSQAPH AIFCLVLMVALMLTDEIPNPVAAIIACLLMGKFRCIDAESAYKSIHWPSIILIVGMMPFA VALQKTGGVALAVKGLMDIGGGYGPHMMLGCLFVLSAVIGLFISNTATAVLMAPIALAAA KTMGVSPYPFAMVVAMAASAAFMTPVSSPVNTLVLGPGNYSFSDFVKLGVPFTIIVMAVC VVMIPMLFPF >gi|296494471|gb|ADTN01000267.1| GENE 41 39686 - 40285 687 199 aa, chain - ## HITS:1 COG:yfbR KEGG:ns NR:ns ## COG: yfbR COG1896 # Protein_GI_number: 16130226 # Func_class: R General function prediction only # Function: Predicted hydrolases of HD superfamily # Organism: 
Escherichia coli K12 # 1 199 1 199 199 371 99.0 1e-103 MKQSHFFAHLSRLKLINRWPLMRNVRTENVSEHSLQVAMVAHALAAIKNRKFGGNVNAER IALLAMYHDASEVLTGDLPTPVKYFNSQIAQEYKAIEKIAQQKLVDMVPEELRDIFAPLI DEHAYSDEEKSLVKQADALCAYLKCLEELAAGNNEFLLAKTRLEATLEARRRQEMDYFME IFVPSFHLSLDEISQDSPL >gi|296494471|gb|ADTN01000267.1| GENE 42 40369 - 41586 1133 405 aa, chain - ## HITS:1 COG:yfbQ KEGG:ns NR:ns ## COG: yfbQ COG0436 # Protein_GI_number: 16130225 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Escherichia coli K12 # 1 405 1 405 405 853 100.0 0 MSPIEKSSKLENVCYDIRGPVLKEAKRLEEEGNKVLKLNIGNPAPFGFDAPDEILVDVIR NLPTAQGYCDSKGLYSARKAIMQHYQARGMRDVTVEDIYIGNGVSELIVQAMQALLNSGD EMLVPAPDYPLWTAAVSLSSGKAVHYLCDESSDWFPDLDDIRAKITPRTRGIVIINPNNP TGAVYSKELLMEIVEIARQHNLIIFADEIYDKILYDDAEHHSIAPLAPDLLTITFNGLSK TYRVAGFRQGWMVLNGPKKHAKGYIEGLEMLASMRLCANVPAQHAIQTALGGYQSISEFI TPGGRLYEQRNRAWELINDIPGVSCVKPRGALYMFPKIDAKRFNIHDDQKMVLDFLLQEK VLLVQGTAFNWPWPDHFRIVTLPRVDDIELSLSKFARFLSGYHQL >gi|296494471|gb|ADTN01000267.1| GENE 43 41913 - 41981 86 22 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MMISEVSQWGQGVKLLLSGINE >gi|296494471|gb|ADTN01000267.1| GENE 44 42506 - 43444 824 312 aa, chain + ## HITS:1 COG:lrhA KEGG:ns NR:ns ## COG: lrhA COG0583 # Protein_GI_number: 16130224 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 312 1 312 312 600 100.0 1e-171 MISANRPIINLDLDLLRTFVAVADLNTFAAAAAAVCRTQSAVSQQMQRLEQLVGKELFAR HGRNKLLTEHGIQLLGYARKILRFNDEACSSLMFSNLQGVLTIGASDESADTILPFLLNR VSSVYPKLALDVRVKRNAYMAEMLESQEVDLMVTTHRPSAFKALNLRTSPTHWYCAAEYI LQKGEPIPLVLLDDPSPFRDMVLATLNKADIPWRLAYVASTLPAVRAAVKAGLGVTARPV EMMSPDLRVLSGVDGLPPLPDTEYLLCYDPSSNNELAQVIYQAMESYHNPWQYSPMSAPE GDDSLLIERDIE Prediction of potential genes in microbial genomes Time: Mon May 16 00:03:40 2011 Seq name: gi|296494470|gb|ADTN01000268.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont695.6, whole genome shotgun sequence Length of sequence - 15645 bp Number of predicted 
genes - 13, with homology - 13 Number of transcription units - 1, operones - 1 average op.length - 13.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 434 - 493 6.5 1 1 Op 1 30/0.000 + CDS 589 - 1026 450 ## COG0838 NADH:ubiquinone oxidoreductase subunit 3 (chain A) 2 1 Op 2 9/0.000 + CDS 1042 - 1704 727 ## COG0377 NADH:ubiquinone oxidoreductase 20 kD subunit and related Fe-S oxidoreductases 3 1 Op 3 15/0.000 + CDS 1810 - 3600 2137 ## COG0649 NADH:ubiquinone oxidoreductase 49 kD subunit 7 4 1 Op 4 23/0.000 + CDS 3603 - 4103 428 ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit 5 1 Op 5 12/0.000 + CDS 4100 - 5437 1506 ## COG1894 NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit 6 1 Op 6 18/0.000 + CDS 5490 - 8216 2913 ## COG1034 NADH dehydrogenase/NADH:ubiquinone oxidoreductase 75 kD subunit (chain G) 7 1 Op 7 31/0.000 + CDS 8213 - 9190 1153 ## COG1005 NADH:ubiquinone oxidoreductase subunit 1 (chain H) 8 1 Op 8 28/0.000 + CDS 9205 - 9747 646 ## COG1143 Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) 9 1 Op 9 30/0.000 + CDS 9759 - 10313 809 ## COG0839 NADH:ubiquinone oxidoreductase subunit 6 (chain J) 10 1 Op 10 26/0.000 + CDS 10310 - 10612 451 ## COG0713 NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K) 11 1 Op 11 30/0.000 + CDS 10609 - 12450 2397 ## COG1009 NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit + Prom 12522 - 12581 1.9 12 1 Op 12 22/0.000 + CDS 12614 - 14143 1779 ## COG1008 NADH:ubiquinone oxidoreductase subunit 4 (chain M) 13 1 Op 13 . 
+ CDS 14150 - 15607 1878 ## COG1007 NADH:ubiquinone oxidoreductase subunit 2 (chain N) Predicted protein(s) >gi|296494470|gb|ADTN01000268.1| GENE 1 589 - 1026 450 145 aa, chain + ## HITS:1 COG:ECs3172 KEGG:ns NR:ns ## COG: ECs3172 COG0838 # Protein_GI_number: 15832426 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 3 (chain A) # Organism: Escherichia coli O157:H7 # 1 145 3 147 147 266 100.0 9e-72 MSTSTEVIAHHWAFAIFLIVAIGLCCLMLVGGWFLGGRARARSKNVPFESGIDSVGSARL RLSAKFYLVAMFFVIFDVEALYLFAWSTSIRESGWVGFVEAAIFIFVLLAGLVYLVRIGA LDWTPARSRRERMNPETNSIANRQR >gi|296494470|gb|ADTN01000268.1| GENE 2 1042 - 1704 727 220 aa, chain + ## HITS:1 COG:ECs3171 KEGG:ns NR:ns ## COG: ECs3171 COG0377 # Protein_GI_number: 15832425 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 20 kD subunit and related Fe-S oxidoreductases # Organism: Escherichia coli O157:H7 # 1 220 1 220 220 457 100.0 1e-129 MDYTLTRIDPNGENDRYPLQKQEIVTDPLEQEVNKNVFMGKLNDMVNWGRKNSIWPYNFG LSCCYVEMVTSFTAVHDVARFGAEVLRASPRQADLMVVAGTCFTKMAPVIQRLYDQMLEP KWVISMGACANSGGMYDIYSVVQGVDKFIPVDVYIPGCPPRPEAYMQALMLLQESIGKER RPLSWVVGDQGVYRANMQSERERKRGERIAVTNLRTPDEI >gi|296494470|gb|ADTN01000268.1| GENE 3 1810 - 3600 2137 596 aa, chain + ## HITS:1 COG:nuoC_2 KEGG:ns NR:ns ## COG: nuoC_2 COG0649 # Protein_GI_number: 16130221 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 49 kD subunit 7 # Organism: Escherichia coli K12 # 198 596 1 399 399 845 100.0 0 MTDLTAQEPAWQTRDHLDDPVIGELRNRFGPDAFTVQATRTGVPVVWIKREQLLEVGDFL KKLPKPYVMLFDLHGMDERLRTHREGLPAADFSVFYHLISIDRNRDIMLKVALAENDLHV PTFTKLFPNANWYERETWDLFGITFDGHPNLRRIMMPQTWKGHPLRKDYPARATEFSPFE LTKAKQDLEMEALTFKPEEWGMKRGTENEDFMFLNLGPNHPSAHGAFRIVLQLDGEEIVD CVPDIGYHHRGAEKMGERQSWHSYIPYTDRIEYLGGCVNEMPYVLAVEKLAGITVPDRVN VIRVMLSELFRINSHLLYISTFIQDVGAMTPVFFAFTDRQKIYDLVEAITGFRMHPAWFR IGGVAHDLPRGWDRLLREFLDWMPKRLASYEKAALQNTILKGRSQGVAAYGAKEALEWGT 
TGAGLRATGIDFDVRKARPYSGYENFDFEIPVGGGVSDCYTRVMLKVEELRQSLRILEQC LNNMPEGPFKADHPLTTPPPKERTLQHIETLITHFLQVSWGPVMPANESFQMIEATKGIN SYYLTSDGSTMSYRTRVRTPSFAHLQQIPAAIRGSLVSDLIVYLGSIDFVMSDVDR >gi|296494470|gb|ADTN01000268.1| GENE 4 3603 - 4103 428 166 aa, chain + ## HITS:1 COG:nuoE KEGG:ns NR:ns ## COG: nuoE COG1905 # Protein_GI_number: 16130220 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Escherichia coli K12 # 1 166 1 166 166 340 100.0 8e-94 MHENQQPQTEAFELSAAEREAIEHEMHHYEDPRAASIEALKIVQKQRGWVPDGAIHAIAD VLGIPASDVEGVATFYSQIFRQPVGRHVIRYCDSVVCHINGYQGIQAALEKKLNIKPGQT TFDGRFTLLPTCCLGNCDKGPNMMIDEDTHAHLTPEAIPELLERYK >gi|296494470|gb|ADTN01000268.1| GENE 5 4100 - 5437 1506 445 aa, chain + ## HITS:1 COG:nuoF KEGG:ns NR:ns ## COG: nuoF COG1894 # Protein_GI_number: 16130219 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit # Organism: Escherichia coli K12 # 1 445 1 445 445 922 100.0 0 MKNIIRTPETHPLTWRLRDDKQPVWLDEYRSKNGYEGARKALTGLSPDEIVNQVKDAGLK GRGGAGFSTGLKWSLMPKDESMNIRYLLCNADEMEPGTYKDRLLMEQLPHLLVEGMLISA FALKAYRGYIFLRGEYIEAAVNLRRAIAEATEAGLLGKNIMGTGFDFELFVHTGAGRYIC GEETALINSLEGRRANPRSKPPFPATSGAWGKPTCVNNVETLCNVPAILANGVEWYQNIS KSKDAGTKLMGFSGRVKNPGLWELPFGTTAREILEDYAGGMRDGLKFKAWQPGGAGTDFL TEAHLDLPMEFESIGKAGSRLGTALAMAVDHEINMVSLVRNLEEFFARESCGWCTPCRDG LPWSVKILRALERGEGQPGDIETLEQLCRFLGPGKTFCAHAPGAVEPLQSAIKYFREEFE AGIKQPFSNTHLINGIQPNLLKERW >gi|296494470|gb|ADTN01000268.1| GENE 6 5490 - 8216 2913 908 aa, chain + ## HITS:1 COG:nuoG KEGG:ns NR:ns ## COG: nuoG COG1034 # Protein_GI_number: 16130218 # Func_class: C Energy production and conversion # Function: NADH dehydrogenase/NADH:ubiquinone oxidoreductase 75 kD subunit (chain G) # Organism: Escherichia coli K12 # 1 908 3 910 910 1898 100.0 0 MATIHVDGKEYEVNGADNLLEACLSLGLDIPYFCWHPALGSVGACRQCAVKQYQNAEDTR GRLVMSCMTPASDGTFISIDDEEAKQFRESVVEWLMTNHPHDCPVCEEGGNCHLQDMTVM 
TGHSFRRYRFTKRTHRNQDLGPFISHEMNRCIACYRCVRYYKDYADGTDLGVYGAHDNVY FGRPEDGTLESEFSGNLVEICPTGVFTDKTHSERYNRKWDMQFAPSICQQCSIGCNISPG ERYGELRRIENRYNGTVNHYFLCDRGRFGYGYVNLKDRPRQPVQRRGDDFITLNAEQAMQ GAADILRQSKKVIGIGSPRASVESNFALRELVGEENFYTGIAHGEQERLQLALKVLREGG IYTPALREIESYDAVLVLGEDVTQTGARVALAVRQAVKGKAREMAAAQKVADWQIAAILN IGQRAKHPLFVTNVDDTRLDDIAAWTYRAPVEDQARLGFAIAHALDNSAPAVDGIEPELQ SKIDVIVQALAGAKKPLIISGTNAGSLEVIQAAANVAKALKGRGADVGITMIARSVNSMG LGIMGGGSLEEALTELETGRADAVVVLENDLHRHASAIRVNAALAKAPLVMVVDHQRTAI MENAHLVLSAASFAESDGTVINNEGRAQRFFQVYDPAYYDSKTVMLESWRWLHSLHSTLL SREVDWTQLDHVIDAVVAKIPELAGIKDAAPDATFRIRGQKLAREPHRYSGRTAMRANIS VHEPRQPQDIDTMFTFSMEGNNQPTAHRSQVPFAWAPGWNSPQAWNKFQDEVGGKLRFGD PGVRLFETSENGLDYFTSVPARFQPQDGKWRIAPYYHLFGSDELSQRAPVFQSRMPQPYI KLNPADAAKLGVNAGTRVSFSYDGNTVTLPVEIAEGLTAGQVGLPMGMSGIAPVLAGAHL EDLKEAQQ >gi|296494470|gb|ADTN01000268.1| GENE 7 8213 - 9190 1153 325 aa, chain + ## HITS:1 COG:ECs3166 KEGG:ns NR:ns ## COG: ECs3166 COG1005 # Protein_GI_number: 15832420 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 1 (chain H) # Organism: Escherichia coli O157:H7 # 1 325 1 325 325 593 100.0 1e-169 MSWISPELIEILLTILKAVVILLVVVTCGAFMSFGERRLLGLFQNRYGPNRVGWGGSLQL VADMIKMFFKEDWIPKFSDRVIFTLAPMIAFTSLLLAFAIVPVSPGWVVADLNIGILFFL MMAGLAVYAVLFAGWSSNNKYSLLGAMRASAQTLSYEVFLGLSLMGVVAQAGSFNMTDIV NSQAHVWNVIPQFFGFITFAIAGVAVCHRHPFDQPEAEQELADGYHIEYSGMKFGLFFVG EYIGIVTISALMVTLFFGGWQGPLLPPFIWFALKTAFFMMMFILIRASLPRPRYDQVMSF GWKICLPLTLINLLVTAAVILWQAQ >gi|296494470|gb|ADTN01000268.1| GENE 8 9205 - 9747 646 180 aa, chain + ## HITS:1 COG:ECs3165 KEGG:ns NR:ns ## COG: ECs3165 COG1143 # Protein_GI_number: 15832419 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) # Organism: Escherichia coli O157:H7 # 1 180 1 180 180 371 100.0 1e-103 MTLKELLVGFGTQVRSIWMIGLHAFAKRETRMYPEEPVYLPPRYRGRIVLTRDPDGEERC VACNLCAVACPVGCISLQKAETKDGRWYPEFFRINFSRCIFCGLCEEACPTTAIQLTPDF 
EMGEYKRQDLVYEKEDLLISGPGKYPEYNFYRMAGMAIDGKDKGEAENEAKPIDVKSLLP >gi|296494470|gb|ADTN01000268.1| GENE 9 9759 - 10313 809 184 aa, chain + ## HITS:1 COG:ECs3164 KEGG:ns NR:ns ## COG: ECs3164 COG0839 # Protein_GI_number: 15832418 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 6 (chain J) # Organism: Escherichia coli O157:H7 # 1 184 1 184 184 317 100.0 7e-87 MEFAFYICGLIAILATLRVITHTNPVHALLYLIISLLAISGVFFSLGAYFAGALEIIVYA GAIMVLFVFVVMMLNLGGSEIEQERQWLKPQVWIGPAILSAIMLVVIVYAILGVNDQGID GTPISAKAVGITLFGPYVLAVELASMLLLAGLVVAFHVGREERAGEVLSNRKDDSAKRKT EEHA >gi|296494470|gb|ADTN01000268.1| GENE 10 10310 - 10612 451 100 aa, chain + ## HITS:1 COG:ECs3163 KEGG:ns NR:ns ## COG: ECs3163 COG0713 # Protein_GI_number: 15832417 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K) # Organism: Escherichia coli O157:H7 # 1 100 1 100 100 148 100.0 3e-36 MIPLQHGLILAAILFVLGLTGLVIRRNLLFMLIGLEIMINASALAFVVAGSYWGQTDGQV MYILAISLAAAEASIGLALLLQLHRRRQNLNIDSVSEMRG >gi|296494470|gb|ADTN01000268.1| GENE 11 10609 - 12450 2397 613 aa, chain + ## HITS:1 COG:nuoL KEGG:ns NR:ns ## COG: nuoL COG1009 # Protein_GI_number: 16130213 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit # Organism: Escherichia coli K12 # 1 613 1 613 613 1050 99.0 0 MNMLALTIILPLIGFVLLAFSRGRWSENVSAIVGVGSVGLAALVTAFIGVDFFANGEQTY SQPLWTWMSVGDFNIGFNLVLDGLSLTMLSVVTGVGFLIHMYASWYMRGEEGYSRFFAYT NLFIASMVVLVLADNLLLMYLGWEGVGLCSYLLIGFYYTDPKNGAAAMKAFVVTRVGDVF LAFALFILYNELGTLNFREMVELAPAHFADGNNMLMWATLMLLGGAVGKSAQLPLQTWLA DAMAGPTPVSALIHAATMVTAGVYLIARTHGLFLMTPEVLHLVGIVGAVTLLLAGFAALV QTDIKRVLAYSTMSQIGYMFLALGVQAWDAAIFHLMTHAFFKALLFLASGSVILACHHEQ NIFKMGGLRKSIPLVYLCFLVGGAALSALPLVTAGFFSKDEILAGAMANGHINLMVAGLV GAFMTSLYTFRMIFIVFHGKEQIHAHAVKGVTHSLPLIVLLILSTFVGALIVPPLQGVLP 
QTTELAHGSMLTLEITSGVVAVVGILLAAWLWLGKRILVTSIANSAPGRLLGTWWYNAWG FDWLYDKVFVKPFLGIAWLLKRDPLNSMMNIPAVLSRFAGKGLLLSENGYLRWYVASMSI GAVVVLALLMVLR >gi|296494470|gb|ADTN01000268.1| GENE 12 12614 - 14143 1779 509 aa, chain + ## HITS:1 COG:ECs3161 KEGG:ns NR:ns ## COG: ECs3161 COG1008 # Protein_GI_number: 15832415 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 4 (chain M) # Organism: Escherichia coli O157:H7 # 1 509 1 509 509 889 100.0 0 MLLPWLILIPFIGGFLCWQTERFGVKVPRWIALITMGLTLALSLQLWLQGGYSLTQSAGI PQWQSEFDMPWIPRFGISIHLAIDGLSLLMVVLTGLLGVLAVLCSWKEIEKYQGFFHLNL MWILGGVIGVFLAIDMFLFFFFWEMMLVPMYFLIALWGHKASDGKTRITAATKFFIYTQA SGLVMLIAILALVFVHYNATGVWTFNYEELLNTPMSSGVEYLLMLGFFIAFAVKMPVVPL HGWLPDAHSQAPTAGSVDLAGILLKTAAYGLLRFSLPLFPNASAEFAPIAMWLGVIGIFY GAWMAFAQTDIKRLIAYTSVSHMGFVLIAIYTGSQLAYQGAVIQMIAHGLSAAGLFILCG QLYERIHTRDMRMMGGLWSKMKWLPALSLFFAVATLGMPGTGNFVGEFMILFGSFQVVPV ITVISTFGLVFASVYSLAMLHRAYFGKAKSQIASQELPGMSLRELFMILLLVVLLVLLGF YPQPILDTSHSAIGNIQQWFVNSVTTTRP >gi|296494470|gb|ADTN01000268.1| GENE 13 14150 - 15607 1878 485 aa, chain + ## HITS:1 COG:ECs3160 KEGG:ns NR:ns ## COG: ECs3160 COG1007 # Protein_GI_number: 15832414 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 2 (chain N) # Organism: Escherichia coli O157:H7 # 61 485 1 425 425 707 100.0 0 MTITPQNLIALLPLLIVGLTVVVVMLSIAWRRNHFLNATLSVIGLNAALVSLWFVGQAGA MDVTPLMRVDGFAMLYTGLVLLASLATCTFAYPWLEGYNDNKDEFYLLVLIAALGGILLA NANHLASLFLGIELISLPLFGLVGYAFRQKRSLEASIKYTILSAAASSFLLFGMALVYAQ SGDLSFVALGKNLGDGMLNEPLLLAGFGLMIVGLGFKLSLVPFHLWTPDVYQGAPAPVST FLATASKIAIFGVVMRLFLYAPVGDSEAIRVVLAIIAFASIIFGNLMALSQTNIKRLLGY SSISHLGYLLVALIALQTGEMSMEAVGVYLAGYLFSSLGAFGVVSLMSSPYRGPDADSLF SYRGLFWHRPILAAVMTVMMLSLAGIPMTLGFIGKFYVLAVGVQAHLWWLVGAVVVGSAI GLYYYLRVAVSLYLHAPEQPGRDAPSNWQYSAGGIVVLISALLVLVLGVWPQPLISIVRL AMPLM Prediction of potential genes in microbial genomes Time: Mon May 16 00:03:42 2011 Seq name: gi|296494469|gb|ADTN01000269.1| Escherichia coli MS 146-1 
E_coli146-1-1.0_Cont695.7, whole genome shotgun sequence Length of sequence - 6055 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 41 - 889 396 ## JW2270 hypothetical protein 2 1 Op 2 . - CDS 948 - 1370 197 ## JW2269 hypothetical protein - Prom 1546 - 1605 4.4 + Prom 1321 - 1380 5.2 3 2 Tu 1 . + CDS 1579 - 2295 236 ## JW2268 hypothetical protein 4 3 Tu 1 . - CDS 2568 - 3071 571 ## JW2267 hypothetical protein - Prom 3097 - 3156 4.1 5 4 Tu 1 . - CDS 3174 - 4094 507 ## COG2234 Predicted aminopeptidases - Prom 4236 - 4295 8.1 + Prom 4070 - 4129 8.7 6 5 Tu 1 . + CDS 4304 - 6010 1145 ## COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain Predicted protein(s) >gi|296494469|gb|ADTN01000269.1| GENE 1 41 - 889 396 282 aa, chain - ## HITS:1 COG:no KEGG:JW2270 NR:ns ## KEGG: JW2270 # Name: yfbP # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 282 2 283 283 582 100.0 1e-165 MKLIPRSSDISPGIDGICPGPFPPNGFTVLTDAAYGNGDCFGLYWPIGQEHKLPIVCETY HDEWRIVPAFSSIKKFEEWLEVNDDDPHENGISIEDQDFAANLFRVARKCLSTGRLDDAL PLLQRATEQLPEVSEYWLALAIQYRRCKKTEAAAQAALNAYLGNWAFGVPDNKVIHLLSQ AADVPNFQDDPVIQCIKEQGLDLSFGGTKENNNYPLMQMCVDTYFAQRKPLQALTLLHNY AWIMSSETTAFQERYDFNIDEWRAKFRQLCLEYFGDSRTQFT >gi|296494469|gb|ADTN01000269.1| GENE 2 948 - 1370 197 140 aa, chain - ## HITS:1 COG:no KEGG:JW2269 NR:ns ## KEGG: JW2269 # Name: yfbO # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 140 19 158 158 270 100.0 9e-72 MTPLERITQLVNINGDVNNPDTPRPLLSLEDFFIDNNIHGSICCNVIPEQSPQAIYHHFL KIRERNNVSDVLVEITMFDDPDWPFSESILVITTASPEEVQSWFVEEIAPDECWEGWSED TEHGWVEVPVGMHPVTCWWD >gi|296494469|gb|ADTN01000269.1| GENE 3 1579 - 2295 236 238 aa, chain + ## HITS:1 COG:no KEGG:JW2268 NR:ns ## KEGG: JW2268 # Name: yfbN # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 238 1 238 238 462 100.0 1e-129 
MEWLSEIRKLRKNVPVGIQVARRLLERTGGDVDEAIKLFHIDQINILTAKADVTHQEAEN VLLATNYDIAEALRRIDEQRYTLTELILRKNKDAGDALNNIALAIEYEWDLKRKFWFGFA DIQLLPPVLQTFMLVYEWHEYVGWEGMECGIFFESDHTHQQLQALGLLELAQKMVTARIR YDELKDKAENFHEITEDDIFKMLIIHCDQLAREVDSILLQFVKDNIDVFPCRHNRHEL >gi|296494469|gb|ADTN01000269.1| GENE 4 2568 - 3071 571 167 aa, chain - ## HITS:1 COG:no KEGG:JW2267 NR:ns ## KEGG: JW2267 # Name: yfbM # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 167 1 167 167 323 100.0 2e-87 MGMIGYFAEIDSEKINQLLESTEKPLMDNIHDTLSGLRRLDIDKRWDFLHFGLTGTSAFD PAKNDPLSRAVLGEHSLEDGIDGFLGLTWNQELAATIDRLESLDRNELRKQFSIKRLNEM EIYPGVTFSEELEGQLFASIMLDMEKLISAYRRMLRQGNHALTVIVG >gi|296494469|gb|ADTN01000269.1| GENE 5 3174 - 4094 507 306 aa, chain - ## HITS:1 COG:yfbL KEGG:ns NR:ns ## COG: yfbL COG2234 # Protein_GI_number: 16130206 # Func_class: R General function prediction only # Function: Predicted aminopeptidases # Organism: Escherichia coli K12 # 1 306 20 325 325 604 100.0 1e-173 MIIFYQPWVNALPSTPRHASPEQLEKTVRYLTQTVHPRSADNIDNLNRSAEYIKEVFVSS GARVTSQDVPITGGPYKNIVADYGPADGPLIIIGAHYDSASSYENDQLTYTPGADDNASG VAGLLELARLLHQQVPKTGVQLVAYASEEPPFFRSDEMGSAVHAASLERPVKLMIALEMI GYYDSAPGSQNYPYPAMSWLYPDRGDFIAVVGRIQDINAVRQVKAALLSSQDLSVYSMNT PGFIPGIDFSDHLNYWQHDIPAIMITDTAFYRNKQYHLPGDTADRLNYQKMAQVVDGVIT LLYNSK >gi|296494469|gb|ADTN01000269.1| GENE 6 4304 - 6010 1145 568 aa, chain + ## HITS:1 COG:yfbK KEGG:ns NR:ns ## COG: yfbK COG2304 # Protein_GI_number: 16130205 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Escherichia coli K12 # 1 568 8 575 575 1037 100.0 0 MLLMSSLILSGCGPQPENKESQQQQPSTPTEQQVLAAQQAAIKEAEQSAAAAKALAQQEV QQYSDKQALQGRLQEAPTFARAAKAKATHIANPGTARYQQFDDNPVKQVAQNPLATFSLD VDTGSYANVRRFLNQGLLPPPDAVRVEEIVNYFPSDWDIKDKQSIPASKPIPFAMRYELA PAPWNEQRTLLKVDILAKDRKSEELPASNLVFLIDTSGSMISDERLPLIQSSLKLLVKEL REQDNIAIVTYAGDSRIALPSISGSHKAEINAAIDSLDAEGSTNGGAGLELAYQQATKGF 
IKGGINRILLATDGDFNVGIDDPKSIESMVKKQRESGVTLSTFGVGNSNYNEAMMVRIAD VGNGNYSYIDTLSEAQKVLNSEMRQMLITVAKDVKAQIEFNPAWVTEYRQIGYEKRQLRV EHFNNDNVDAGDIGAGKHITLLFELTLNGQKASIDKLRYAPDNKLAKSDKTKELAWLKIR WKYPQGKESQLVEFPLGPTINAPSEDMRFRAAVAAYGQKLRGSEYLNNTSWQQIKQWAQQ AKGEDPQGYRAEFIRLIELADGVTDISQ Prediction of potential genes in microbial genomes Time: Mon May 16 00:04:18 2011 Seq name: gi|296494468|gb|ADTN01000270.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont695.8, whole genome shotgun sequence Length of sequence - 77843 bp Number of predicted genes - 66, with homology - 64 Number of transcription units - 30, operones - 17 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 36 - 1244 342 ## B21_02155 hypothetical protein - Prom 1304 - 1363 4.0 - Term 1387 - 1413 -0.6 2 2 Tu 1 . - CDS 1435 - 2352 621 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III - Prom 2378 - 2437 5.0 + Prom 2337 - 2396 4.9 3 3 Op 1 4/0.300 + CDS 2417 - 2878 478 ## COG2153 Predicted acyltransferase 4 3 Op 2 1/1.000 + CDS 2933 - 3238 527 ## COG4575 Uncharacterized conserved protein 5 3 Op 3 10/0.100 + CDS 3317 - 4459 982 ## COG1169 Isochorismate synthase + SSU_RRNA 4455 - 4808 100.0 # AY958844 [D:1..1893] # 16S ribosomal RNA # uncultured bacterium # Bacteria; environmental samples. + Prom 4734 - 4793 80.4 6 4 Op 1 15/0.000 + CDS 4872 - 6371 1606 ## COG1165 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase 7 4 Op 2 9/0.100 + CDS 6368 - 7126 601 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 8 4 Op 3 5/0.200 + CDS 7141 - 7998 958 ## COG0447 Dihydroxynaphthoic acid synthase 9 4 Op 4 6/0.200 + CDS 7998 - 8960 1091 ## COG1441 O-succinylbenzoate synthase 10 4 Op 5 . + CDS 8957 - 10312 1151 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II + Prom 10329 - 10388 4.9 11 5 Tu 1 . 
+ CDS 10422 - 10688 198 ## B21_02145 hypothetical protein + Term 10825 - 10868 2.9 12 6 Op 1 9/0.100 - CDS 10682 - 11068 430 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 13 6 Op 2 5/0.200 - CDS 11068 - 11403 372 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 14 6 Op 3 6/0.200 - CDS 11400 - 13052 1433 ## COG1807 4-amino-4-deoxy-L-arabinose transferase and related glycosyltransferases of PMT family 15 6 Op 4 8/0.100 - CDS 13052 - 13942 820 ## COG0726 Predicted xylanase/chitin deacetylase 16 6 Op 5 12/0.000 - CDS 13939 - 15921 1658 ## COG0451 Nucleoside-diphosphate-sugar epimerases 17 6 Op 6 5/0.200 - CDS 15921 - 16889 887 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 18 6 Op 7 . - CDS 16893 - 18050 870 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis - Prom 18088 - 18147 7.4 + Prom 18133 - 18192 5.8 19 7 Tu 1 . + CDS 18340 - 18942 240 ## ECUMN_2593 hypothetical protein + Term 18953 - 18988 5.1 - Term 18934 - 18977 1.6 20 8 Tu 1 . - CDS 18981 - 19406 371 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes - Prom 19433 - 19492 7.3 21 9 Op 1 . + CDS 19685 - 20227 701 ## B21_02135 hypothetical protein + Prom 20246 - 20305 4.2 22 9 Op 2 4/0.300 + CDS 20327 - 21529 1190 ## COG1058 Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA + Prom 21531 - 21590 2.4 23 10 Op 1 5/0.200 + CDS 21749 - 22531 592 ## COG1414 Transcriptional regulator 24 10 Op 2 7/0.100 + CDS 22546 - 23751 1203 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily + Term 23759 - 23800 8.6 25 11 Op 1 6/0.200 + CDS 23808 - 25097 1380 ## COG0477 Permeases of the major facilitator superfamily 26 11 Op 2 . 
+ CDS 25115 - 25918 935 ## COG3836 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase + Term 25922 - 25966 4.2 27 12 Op 1 1/1.000 - CDS 25959 - 26144 315 ## COG5464 Uncharacterized conserved protein 28 12 Op 2 . - CDS 26157 - 27056 797 ## COG5464 Uncharacterized conserved protein 29 12 Op 3 . - CDS 27099 - 27224 66 ## ECH74115_3381 hypothetical protein 30 12 Op 4 8/0.100 - CDS 27249 - 28439 1171 ## COG0247 Fe-S oxidoreductase 31 12 Op 5 9/0.100 - CDS 28436 - 29695 1188 ## COG3075 Anaerobic glycerol-3-phosphate dehydrogenase 32 12 Op 6 . - CDS 29685 - 31313 1842 ## COG0578 Glycerol-3-phosphate dehydrogenase 33 12 Op 7 . - CDS 31338 - 31436 62 ## - Prom 31472 - 31531 3.2 + Prom 31407 - 31466 6.4 34 13 Op 1 6/0.200 + CDS 31586 - 32944 1466 ## COG2271 Sugar phosphate permease 35 13 Op 2 . + CDS 32949 - 34025 1163 ## COG0584 Glycerophosphoryl diester phosphodiesterase + Term 34036 - 34069 4.5 - Term 33822 - 33865 -0.9 36 14 Tu 1 . - CDS 34067 - 34273 235 ## COG0583 Transcriptional regulator - Prom 34293 - 34352 4.1 + Prom 34244 - 34303 4.5 37 15 Tu 1 . + CDS 34488 - 35138 573 ## B21_02122 hypothetical protein - Term 35064 - 35096 -0.9 38 16 Op 1 8/0.100 - CDS 35192 - 35446 165 ## COG0633 Ferredoxin 39 16 Op 2 24/0.000 - CDS 35446 - 36576 1520 ## COG0208 Ribonucleotide reductase, beta subunit - Prom 36723 - 36782 5.4 40 16 Op 3 . - CDS 36809 - 39094 2770 ## COG0209 Ribonucleotide reductase, alpha subunit - Prom 39231 - 39290 3.0 41 17 Tu 1 . + CDS 39997 - 43542 2883 ## COG3468 Type V secretory pathway, adhesin AidA 42 18 Tu 1 . - CDS 43670 - 44392 855 ## COG2227 2-polyprenyl-3-methyl-5-hydroxy-6-metoxy-1,4-benzoquinol methylase - Prom 44419 - 44478 5.2 + Prom 44442 - 44501 6.4 43 19 Tu 1 . 
+ CDS 44539 - 47166 3385 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit + Term 47188 - 47214 -1.0 + Prom 47181 - 47240 3.5 44 20 Op 1 2/0.700 + CDS 47315 - 49003 1372 ## COG4685 Uncharacterized protein conserved in bacteria 45 20 Op 2 3/0.400 + CDS 49000 - 49623 314 ## COG3234 Uncharacterized protein conserved in bacteria 46 20 Op 3 4/0.300 + CDS 49557 - 54161 3786 ## COG2373 Large extracellular alpha-helical protein 47 20 Op 4 5/0.200 + CDS 54162 - 55811 1098 ## COG5445 Predicted secreted protein 48 20 Op 5 . + CDS 55816 - 56592 597 ## COG4676 Uncharacterized protein conserved in bacteria + Term 56621 - 56677 12.6 - Term 56618 - 56655 7.1 49 21 Op 1 4/0.300 - CDS 56666 - 57850 1360 ## COG0183 Acetyl-CoA acetyltransferase 50 21 Op 2 4/0.300 - CDS 57881 - 59068 978 ## COG2031 Short chain fatty acids transporter - Prom 59103 - 59162 3.6 - Term 59160 - 59187 -0.8 51 22 Op 1 21/0.000 - CDS 59200 - 59850 538 ## COG2057 Acyl CoA:acetate/3-ketoacid CoA transferase, beta subunit 52 22 Op 2 1/1.000 - CDS 59850 - 60512 643 ## COG1788 Acyl CoA:acetate/3-ketoacid CoA transferase, alpha subunit - Prom 60590 - 60649 5.2 - Term 60639 - 60696 2.6 53 23 Op 1 13/0.000 - CDS 60708 - 62093 975 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 54 23 Op 2 . - CDS 62090 - 63916 1360 ## COG0642 Signal transduction histidine kinase + Prom 63940 - 63999 1.9 55 24 Tu 1 . + CDS 64131 - 66932 1960 ## COG0642 Signal transduction histidine kinase + Term 66944 - 67008 4.2 - Term 66931 - 66993 11.5 56 25 Op 1 12/0.000 - CDS 67132 - 67782 851 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 57 25 Op 2 . 
- CDS 67799 - 70471 2448 ## COG0642 Signal transduction histidine kinase - Prom 70692 - 70751 5.3 + Prom 71036 - 71095 3.6 58 26 Tu 1 4/0.300 + CDS 71210 - 72313 1278 ## COG3203 Outer membrane protein (porin) + Term 72328 - 72369 5.0 + Prom 72341 - 72400 2.0 59 27 Op 1 3/0.400 + CDS 72425 - 73480 1272 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis 60 27 Op 2 4/0.300 + CDS 73554 - 74618 715 ## COG2169 Adenosine deaminase 61 27 Op 3 . + CDS 74621 - 75268 509 ## COG3145 Alkylated DNA repair protein + Term 75489 - 75522 -0.9 62 28 Op 1 . + CDS 75638 - 76114 225 ## gi|301647699|ref|ZP_07247493.1| hypothetical protein HMPREF9543_04216 63 28 Op 2 . + CDS 76176 - 76514 212 ## gi|301647700|ref|ZP_07247494.1| hypothetical protein HMPREF9543_04217 + Term 76566 - 76596 0.6 + Prom 76608 - 76667 3.8 64 29 Tu 1 . + CDS 76701 - 76784 67 ## + Term 76833 - 76869 -0.6 - Term 76554 - 76583 1.2 65 30 Op 1 . - CDS 76825 - 77076 150 ## gi|301647701|ref|ZP_07247495.1| hypothetical protein HMPREF9543_04218 66 30 Op 2 . 
- CDS 77080 - 77715 623 ## gi|301647702|ref|ZP_07247496.1| hypothetical protein HMPREF9543_04219 - Prom 77745 - 77804 4.5 Predicted protein(s) >gi|296494468|gb|ADTN01000270.1| GENE 1 36 - 1244 342 402 aa, chain - ## HITS:1 COG:no KEGG:B21_02155 NR:ns ## KEGG: B21_02155 # Name: elaD # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 402 2 403 403 765 100.0 0 MVTVVSNYCQLSQTQLSQTFAEKFTVTEELLQSLKKTALSGDEESIELLHNIALGYDKFG KEAEDILYHIVRTPTNETLSIIRLIKNACLKLYNLAHIATNSPLKSHDSDDLLFKKLFSP SKLMTIIGDEIPLISEKQSLSKVLLNDENNELSDGTNFWDKNRQLTTDEIACYLQKIAAN AKNTQVNYPTGLYVPYSTRTHLEDALNENIKSDPSWPNEVQLFPINTGGHWILVSLQKIV NKKNNKLQIKCVIFNSLRALGYDKENSLKRVINSFNSELMGEMSNNNIKVHLNEPEIIFL HADLQQYLSQSCGAFVCMAAQEVIEQRESNSDSAPYTLLKNYADRFKKYSAEEQYEIDFQ HRLANRNCYLDKYGDANINHYYRNLEIKHSQPKNRASGKRVS >gi|296494468|gb|ADTN01000270.1| GENE 2 1435 - 2352 621 305 aa, chain - ## HITS:1 COG:elaC KEGG:ns NR:ns ## COG: elaC COG1234 # Protein_GI_number: 16130203 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Escherichia coli K12 # 1 305 7 311 311 627 100.0 1e-179 MELIFLGTSAGVPTRTRNVTAILLNLQHPTQSGLWLFDCGEGTQHQLLHTAFNPGKLDKI FISHLHGDHLFGLPGLLCSRSMSGIIQPLTIYGPQGIREFVETALRISGSWTDYPLEIVE IGAGEILDDGLRKVTAYPLEHPLECYGYRIEEHDKPGALNAQALKAAGVPPGPLFQELKA GKTITLEDGRQINGADYLAAPVPGKALAIFGDTGPCDAALDLAKGVDVMVHEATLDITME AKANSRGHSSTRQAATLAREAGVGKLIITHVSSRYDDKGCQHLLRECRSIFPATELANDF TVFNV >gi|296494468|gb|ADTN01000270.1| GENE 3 2417 - 2878 478 153 aa, chain + ## HITS:1 COG:elaA KEGG:ns NR:ns ## COG: elaA COG2153 # Protein_GI_number: 16130202 # Func_class: R General function prediction only # Function: Predicted acyltransferase # Organism: Escherichia coli K12 # 1 153 1 153 153 315 100.0 2e-86 MIEWQDLHHSELSVSQLYALLQLRCAVFVVEQNCPYQDIDGDDLTGDNRHILGWKNDELV AYARILKSDDDLEPVVIGRVIVSEALRGEKVGQQLMSKTLETCTHHWPDKPVYLGAQAHL QNFYQSFGFIPVTEVYEEDGIPHIGMAREVIQA >gi|296494468|gb|ADTN01000270.1| GENE 4 2933 - 3238 527 101 aa, chain + ## 
HITS:1 COG:ECs3154 KEGG:ns NR:ns ## COG: ECs3154 COG4575 # Protein_GI_number: 15832408 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 84 1 84 101 146 100.0 8e-36 MSNQFGDTRIDDDLTLLSETLEEVLRSSGDPADQKYVELKARAEKALDDVKKRVSQASDS YYYRAKQAVYRADDYVHEKPWQGIGVGAAVGLVLGLLLARR >gi|296494468|gb|ADTN01000270.1| GENE 5 3317 - 4459 982 380 aa, chain + ## HITS:1 COG:menF KEGG:ns NR:ns ## COG: menF COG1169 # Protein_GI_number: 16130200 # Func_class: H Coenzyme transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Isochorismate synthase # Organism: Escherichia coli K12 # 76 380 1 305 356 595 99.0 1e-170 MQSLTTALENLLRHLSQEIPATPGIRVIDIPFPLKDAFDALSWLASQQTYPQFYWQQRNG DEEAVVLGAITRFTSLDQAQRFLRQHPEHADLRIWGLNAFDPSQGNLLLPRLEWRRCGGK ATLRLTLFSESSLQHDAIQAKEFIATLVSIKPLPGLHLTTTREQHWPDKTGWTQLIELAT KTIAEGELDKVVLARATDLHFASPVNAAAMMAASRRLNLNCYHFYMAFDGENAFLGSSPE RLWRRRDKALRTEALAGTVANNPDDKQAQQLGEWLMADDKNQRENMLVVEDICQRLQADT QTLDVLPPQVLRLRKVQHLRRCIWTSLNKADDVICLHQLQPTAAVAGLPRDLARQFIARH EPFTREWYAGSAGYLSLQQS >gi|296494468|gb|ADTN01000270.1| GENE 6 4872 - 6371 1606 499 aa, chain + ## HITS:1 COG:menD KEGG:ns NR:ns ## COG: menD COG1165 # Protein_GI_number: 16130199 # Func_class: H Coenzyme transport and metabolism # Function: 2-succinyl-6-hydroxy-2,4-cyclohexadiene-1-carboxylate synthase # Organism: Escherichia coli K12 # 1 499 58 556 556 994 99.0 0 MGHLALGLAKVSKQPVAVIVTSGTAVANLYPALIEAGLTGEKLILLTADRPPELIDCGAN QAIRQPGMFASHPTHSISLPRPTQDIPARWLVSTIDHALGTLHAGGVHINCPFAEPLYGE MDDTGLSWQQRLGDWWQDDKPWLREAPRLESEKQRDWFFWRQKRGVVVAGRMSAEEGKKV ALWAQTLGWPLIGDVLSQTGQPLPCADLWLGNAKATSELQQAQIVVQLGSSLTGKRLLQW QASCEPEEYWIVDDIEGRLDPAHHRGRRLIANIADWLELHPAEKRQPWCVEIPRLAEQAM QAVIARRDAFGEAQLAHRICDYLPEQGQLFVGNSLVVRLIDALSQLPAGYPVYSNRGASG IDGLLSTAAGVQRASGKPTLAIVGDLSALYDLNALALLRQVSAPLVLIVVNNNGGQIFSL LPTPQSERERFYLMPQNVHFEHAAAMFELKYHRPQNWQELETAFADAWRTPTTTVIEMVV NDTDGAQTLQQLLAQVSHL >gi|296494468|gb|ADTN01000270.1| GENE 7 6368 - 
7126 601 252 aa, chain + ## HITS:1 COG:yfbB KEGG:ns NR:ns ## COG: yfbB COG0596 # Protein_GI_number: 16130198 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Escherichia coli K12 # 1 252 1 252 252 514 100.0 1e-146 MILHAQAKHGKPGLPWLVFLHGFSGDCHEWQEVGEAFADYSRLYVDLPGHGGSAAISVDG FDDVTDLLRKTLVSYNILDFWLVGYSLGGRVAMMAACQGLAGLCGVIVEGGHPGLQNAEQ RAERQRSDRQWVQRFLTEPLTAVFADWYQQPVFASLNDDQRRELVALRSNNNGATLAAML EATSLAVQPDLRANLSARTFAFYYLCGERDSKFRALAAELAADCHVIPRAGHNAHRENPA GVIASLAQILRF >gi|296494468|gb|ADTN01000270.1| GENE 8 7141 - 7998 958 285 aa, chain + ## HITS:1 COG:menB KEGG:ns NR:ns ## COG: menB COG0447 # Protein_GI_number: 16130197 # Func_class: H Coenzyme transport and metabolism # Function: Dihydroxynaphthoic acid synthase # Organism: Escherichia coli K12 # 1 285 1 285 285 602 100.0 1e-172 MIYPDEAMLYAPVEWHDCSEGFEDIRYEKSTDGIAKITINRPQVRNAFRPLTVKEMIQAL ADARYDDNIGVIILTGAGDKAFCSGGDQKVRGDYGGYKDDSGVHHLNVLDFQRQIRTCPK PVVAMVAGYSIGGGHVLHMMCDLTIAADNAIFGQTGPKVGSFDGGWGASYMARIVGQKKA REIWFLCRQYDAKQALDMGLVNTVVPLADLEKETVRWCREMLQNSPMALRCLKAALNADC DGQAGLQELAGNATMLFYMTEEGQEGRNAFNQKRQPDFSKFKRNP >gi|296494468|gb|ADTN01000270.1| GENE 9 7998 - 8960 1091 320 aa, chain + ## HITS:1 COG:menC KEGG:ns NR:ns ## COG: menC COG1441 # Protein_GI_number: 16130196 # Func_class: H Coenzyme transport and metabolism # Function: O-succinylbenzoate synthase # Organism: Escherichia coli K12 # 1 320 1 320 320 624 100.0 1e-179 MRSAQVYRWQIPMDAGVVLRDRRLKTRDGLYVCLREGEREGWGEISPLPGFSQETWEEAQ SVLLAWVNNWLAGDCELPQMPSVAFGVSCALAELTDTLPQAANYRAAPLCNGDPDDLILK LADMPGEKVAKVKVGLYEAVRDGMVVNLLLEAIPDLHLRLDANRAWTPLKGQQFAKYVNP DYRDRIAFLEEPCKTRDDSRAFARETGIAIAWDESLREPDFAFVAEEGVRAVVIKPTLTG SLEKVREQVQAAHALGLTAVISSSIESSLGLTQLARIAAWLTPDTIPGLDTLDLMQAQQV RRWPGSTLPVVEVDALERLL >gi|296494468|gb|ADTN01000270.1| GENE 10 8957 - 10312 1151 451 aa, chain + ## HITS:1 COG:menE KEGG:ns NR:ns ## COG: menE COG0318 # Protein_GI_number: 16130195 # Func_class: I 
Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Escherichia coli K12 # 1 451 1 451 451 863 100.0 0 MIFSDWPWRHWRQVRGETIALRLNDEQLNWRELCARVDELASGFAVQGVVEGSGVMLRAW NTPQTLLAWLALLQCGARVLPVNPQLPQPLLEELLPNLTLQFALVPDGENTFPALTSLHI QLVEGAHAATWQPTRLCSMTLTSGSTGLPKAAVHTYQAHLASAQGVLSLIPFGDHDDWLL SLPLFHVSGQGIMWRWLYAGARMTVRDKQPLEQMLAGCTHASLVPTQLWRLLVNRSSVSL KAVLLGGAAIPVELTEQAREQGIRCFCGYGLTEFASTVCAKEADGLADVGSPLPGREVKI VNNEVWLRAASMAEGYWRNGQLVSLVNDEGWYATRDRGEMHNGKLTIVGRLDNLFFSGGE GIQPEEVERVIAAHPAVLQVFIVPVADKEFGHRPVAVMEYDHESVDLSEWVKDKLARFQQ PVRWLTLPPELKNGGIKISRQALKEWVQRQQ >gi|296494468|gb|ADTN01000270.1| GENE 11 10422 - 10688 198 88 aa, chain + ## HITS:1 COG:no KEGG:B21_02145 NR:ns ## KEGG: B21_02145 # Name: pmrD # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 88 1 88 88 171 100.0 8e-42 MEWLVKKSCCNKQDNRHVLMLCDAGGAIKMIAEVKSDFAVKVGDLLSPLQNALYCINREK LHTVKVLSASSYSPDEWERQCKVAGKTQ >gi|296494468|gb|ADTN01000270.1| GENE 12 10682 - 11068 430 128 aa, chain - ## HITS:1 COG:yfbJ KEGG:ns NR:ns ## COG: yfbJ COG0697 # Protein_GI_number: 16130193 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli K12 # 1 128 95 222 222 206 100.0 6e-54 MGLMWGLFSVIIASVAQLSLGFAASHLPPMTHLWDFIAALLAFGLDARILLLGLLGYLLS VFCWYKTLHKLALSKAYALLSMSYVLVWIASMVLPGWEGTFSLKALLGVACIMSGLMLIF LPTTKQRY >gi|296494468|gb|ADTN01000270.1| GENE 13 11068 - 11403 372 111 aa, chain - ## HITS:1 COG:Z3516 KEGG:ns NR:ns ## COG: Z3516 COG0697 # Protein_GI_number: 15802807 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli O157:H7 EDL933 # 1 
111 1 111 111 157 99.0 5e-39 MIWLTLVFASLLSVAGQLCQKQATCFVAINKRRKHIVLWLGLALACLGLAMVLWLLVLQN VPVGIAYPMLSLNFVWVTLAAVKLWHEPVSPRHWCGVAFIIGGIVILGSTV >gi|296494468|gb|ADTN01000270.1| GENE 14 11400 - 13052 1433 550 aa, chain - ## HITS:1 COG:yfbI KEGG:ns NR:ns ## COG: yfbI COG1807 # Protein_GI_number: 16130192 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 4-amino-4-deoxy-L-arabinose transferase and related glycosyltransferases of PMT family # Organism: Escherichia coli K12 # 1 550 1 550 550 1046 100.0 0 MKSVRYLIGLFAFIACYYLLPISTRLLWQPDETRYAEISREMLASGDWIVPHLLGLRYFE KPIAGYWINSIGQWLFGANNFGVRAGVIFATLLTAALVTWFTLRLWRDKRLALLATVIYL SLFIVYAIGTYAVLDPFIAFWLVAGMCSFWLAMQAQTWKGKSAGFLLLGITCGMGVMTKG FLALAVPVLSVLPWVATQKRWKDLFIYGWLAVISCVLTVLPWGLAIAQREPNFWHYFFWV EHIQRFALDDAQHRAPFWYYVPVIIAGSLPWLGLLPGALYTGWKNRKHSATVYLLSWTIM PLLFFSVAKGKLPTYILSCFASLAMLMAHYALLAAKNNPLALRINGWINIAFGVTGIIAT FVVSPWGPMNTPVWQTFESYKVFCAWSIFSLWAFFGWYTLTNVEKTWPFAALCPLGLALL VGFSIPDRVMEGKHPQFFVEMTQESLQPSRYILTDSVGVAAGLAWSLQRDDIIMYRQTGE LKYGLNYPDAKGRFVSGDEFANWLNQHRQEGIITLVLSVDRDEDINSLAIPPADAIDRQE RLVLIQYRPK >gi|296494468|gb|ADTN01000270.1| GENE 15 13052 - 13942 820 296 aa, chain - ## HITS:1 COG:yfbH KEGG:ns NR:ns ## COG: yfbH COG0726 # Protein_GI_number: 16130191 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Escherichia coli K12 # 1 296 1 296 296 613 100.0 1e-175 MTKVGLRIDVDTFRGTREGVPRLLEILSKHNIQASIFFSVGPDNMGRHLWRLVKPQFLWK MLRSNAASLYGWDILLAGTAWPGKEIGHANADIIREAAKHHEVGLHAWDHHAWQARSGNW DRQTMIDDIARGLRTLEEIIGQPVTCSAAAGWRADQKVIEAKEAFHLRYNSDCRGAMPFR PLLESGNPGTAQIPVTLPTWDEVIGRDVKAEDFNGWLLNRILRDKGTPVYTIHAEVEGCA YQHNFVDLLKRAAQEGVTFCPLSELLSETLPLGQVVRGNIAGREGWLGCQQIAGSR >gi|296494468|gb|ADTN01000270.1| GENE 16 13939 - 15921 1658 660 aa, chain - ## HITS:1 COG:yfbG_2 KEGG:ns NR:ns ## COG: yfbG_2 COG0451 # Protein_GI_number: 16130190 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: 
Nucleoside-diphosphate-sugar epimerases # Organism: Escherichia coli K12 # 306 660 1 355 355 766 100.0 0 MKTVVFAYHDMGCLGIEALLAAGYEISAIFTHTDNPGEKAFYGSVARLAAERGIPVYAPD NVNHPLWVERIAQLSPDVIFSFYYRHLIYDEILQLAPAGAFNLHGSLLPKYRGRAPLNWV LVNGETETGVTLHRMVKRADAGAIVAQLRIAIAPDDIAITLHHKLCHAARQLLEQTLPAI KHGNILEIAQRENEATCFGRRTPDDSFLEWHKPASVLHNMVRAVADPWPGAFSYVGNQKF TVWSSRVHPHASKAQPGSVISVAPLLIACGDGALEIVTGQAGDGITMQGSQLAQTLGLVQ GSRLNSQPACTARRRTRVLILGVNGFIGNHLTERLLREDHYEVYGLDIGSDAISRFLNHP HFHFVEGDISIHSEWIEYHVKKCDVVLPLVAIATPIEYTRNPLRVFELDFEENLRIIRYC VKYRKRIIFPSTSEVYGMCSDKYFDEDHSNLIVGPVNKPRWIYSVSKQLLDRVIWAYGEK EGLQFTLFRPFNWMGPRLDNLNAARIGSSRAITQLILNLVEGSPIKLIDGGKQKRCFTDI RDGIEALYRIIENAGNRCDGEIINIGNPENEASIEELGEMLLASFEKHPLRHHFPPFAGF RVVESSSYYGKGYQDVEHRKPSIRNAHRCLDWEPKIDMQETIDETLDFFLRTVDLTDKPS >gi|296494468|gb|ADTN01000270.1| GENE 17 15921 - 16889 887 322 aa, chain - ## HITS:1 COG:yfbF KEGG:ns NR:ns ## COG: yfbF COG0463 # Protein_GI_number: 16130189 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Escherichia coli K12 # 1 322 1 322 322 658 100.0 0 MFEIHPVKKVSVVIPVYNEQESLPELIRRTTTACESLGKEYEILLIDDGSSDNSAHMLVE ASQAENSHIVSILLNRNYGQHSAIMAGFSHVTGDLIITLDADLQNPPEEIPRLVAKADEG YDVVGTVRQNRQDSWFRKTASKMINRLIQRTTGKAMGDYGCMLRAYRRHIVDAMLHCHER STFIPILANIFARRAIEIPVHHAEREFGESKYSFMRLINLMYDLVTCLTTTPLRMLSLLG SIIAIGGFSIAVLLVILRLTFGPQWAAEGVFMLFAVLFTFIGAQFIGMGLLGEYIGRIYT DVRARPRYFVQQVIRPSSKENE >gi|296494468|gb|ADTN01000270.1| GENE 18 16893 - 18050 870 385 aa, chain - ## HITS:1 COG:yfbE KEGG:ns NR:ns ## COG: yfbE COG0399 # Protein_GI_number: 16130188 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Escherichia coli K12 # 1 385 6 390 390 755 100.0 0 MAEGKAMSEFLPFSRPAMGVEELAAVKEVLESGWITTGPKNQALEQAFCQLTGNQHAIAV SSATAGMHITLMALKIGKGDEVITPSLTWVSTLNMISLLGATPVMVDVDRDTLMVTPEAI 
ESAITPRTKAIIPVHYAGAPADIDAIRAIGERYGIAVIEDAAHAVGTYYKGRHIGAKGTA IFSFHAIKNITCAEGGLIVTDNENLARQLRMLKFHGLGVDAYDRQTWGRAPQAEVLTPGY KYNLTDINAAIALTQLVKLEHLNTRRREIAQQYQQALAALPFQPLSLPAWPHVHAWHLFI IRVDEQRCGISRDALMEALKERGIGTGLHFRAAHTQKYYRERFPTLSLPNTEWNSERICS LPLFPDMTTADADHVITALQQLAGQ >gi|296494468|gb|ADTN01000270.1| GENE 19 18340 - 18942 240 200 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_2593 NR:ns ## KEGG: ECUMN_2593 # Name: ais # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 200 1 200 200 378 99.0 1e-104 MLAFCRSSLKSKKYIIILLALAAIAGLGTHAAWSSNGLPRIDNKTLARLAQQHPVVVLFR HAERCDRSTNQCLSDKTGITVKGTQDARELGNAFSADIPDFDLYSSNTVRTIQSATWFSA GKKLTVDKRLLQCGNEIYSAIKDLQSKAPDKNIVIFTHNHCLTYIAKDKRDATFKPDYLD GLVMHVEKGKVYLDGEFVNH >gi|296494468|gb|ADTN01000270.1| GENE 20 18981 - 19406 371 141 aa, chain - ## HITS:1 COG:yfaO KEGG:ns NR:ns ## COG: yfaO COG0494 # Protein_GI_number: 16130186 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli K12 # 1 141 1 141 141 286 100.0 7e-78 MRQRTIVCPLIQNDGAYLLCKMADDRGVFPGQWAISGGGVEPGERIEEALRREIREELGE QLLLTEITPWTFSDDIRTKTYADGRKEEIYMIYLIFDCVSANREVKINEEFQDYAWVKPE DLVHYDLNVATRKTLRLKGLL >gi|296494468|gb|ADTN01000270.1| GENE 21 19685 - 20227 701 180 aa, chain + ## HITS:1 COG:no KEGG:B21_02135 NR:ns ## KEGG: B21_02135 # Name: yfaZ # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 180 1 180 180 275 100.0 4e-73 MKKIALAGLAGMLLVSASVNAMSISGQAGKEYTNIGVGFGTETTGLALSGNWTHNDDDGD VAGVGLGLNLPLGPLMATVGGKGVYTNPNYGDEGYAAAVGGGLQWKIGNSFRLFGEYYYS PDSLSSGIQSYEEANAGARYTIMRPVSIEAGYRYLNLSGKDGNRDNAVADGPYVGVNASF >gi|296494468|gb|ADTN01000270.1| GENE 22 20327 - 21529 1190 400 aa, chain + ## HITS:1 COG:yfaY KEGG:ns NR:ns ## COG: yfaY COG1058 # Protein_GI_number: 16130184 # Func_class: R General function prediction only # Function: Predicted nucleotide-utilizing enzyme related to 
molybdopterin-biosynthesis enzyme MoeA # Organism: Escherichia coli K12 # 1 400 1 400 400 795 100.0 0 MLKVEMLSTGDEVLHGQIVDTNAAWLADFFFHQGLPLSRRNTVGDNLDDLVTILRERSQH ADVLIVNGGLGPTSDDLSALAAATAKGEGLVLHEAWLKEMERYFHERGRVMAPSNRKQAE LPASAEFINNPVGTACGFAVQLNRCLMFFTPGVPSEFKVMVEHEILPRLRERFSLPQPPV CLRLTTFGRSESDLAQSLDTLQLPPGVTMGYRSSMPIIELKLTGPASEQQAMEKLWLDVK RVAGQSVIFEGTEGLPAQISRELQNRQFSLTLSEQFTGGLLALQLSRAGAPLLACEVVPS QEETLAQTAHWITERRANHFAGLALAVSGFENEHLNFALATPDGTFALRVRFSTTRYSLA IRQEVCAMMALNMLRRWLNGQDIASEHGWIEVVESMTLSV >gi|296494468|gb|ADTN01000270.1| GENE 23 21749 - 22531 592 260 aa, chain + ## HITS:1 COG:yfaX KEGG:ns NR:ns ## COG: yfaX COG1414 # Protein_GI_number: 16130183 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 260 1 260 260 521 100.0 1e-148 MLESSKVPALTRAIDILNLIARIGPCSAATIIDTLGIPKSTAYLLLNELRRQRFLSLDHQ ENFCLWTRLVELSGHALSKMDLRELARPRLTQLMDTTGLLCHLGIIDNGSAYYILKVESS ATISVRSHEGKSLSLYRSGIGKCLLAWQPAAVQQSIIEGLVWEQATPTTITHPQQLHEEL ARIRRQGWSYDNGEDYADVRCVAAPVFNANNELTAAISVVGTRLQINEEYRDYLAGKAIA CARDISRLLGWKSPFDLQAS >gi|296494468|gb|ADTN01000270.1| GENE 24 22546 - 23751 1203 401 aa, chain + ## HITS:1 COG:yfaW KEGG:ns NR:ns ## COG: yfaW COG4948 # Protein_GI_number: 16130182 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Escherichia coli K12 # 1 401 5 405 405 862 100.0 0 MTLPKIKQVRAWFTGGATAEKGAGGGDYHDQGANHWIDDHIATPMSKYRDYEQSRQSFGI NVLGTLVVEVEAENGQTGFAVSTAGEMGCFIVEKHLNRFIEGKCVSDIKLIHDQMLSATL YYSGSGGLVMNTISCVDLALWDLFGKVVGLPVYKLLGGAVRDEIQFYATGARPDLAKEMG FIGGKMPTHWGPHDGDAGIRKDAAMVADMREKCGEDFWLMLDCWMSQDVNYATKLAHACA PYNLKWIEECLPPQQYESYRELKRNAPVGMMVTSGEHHGTLQSFRTLSETGIDIMQPDVG WCGGLTTLVEIAAIAKSRGQLVVPHGSSVYSHHAVITFTNTPFSEFLMTSPDCSTMRPQF DPILLNEPVPVNGRIHKSVLDKPGFGVELNRDCNLKRPYSH >gi|296494468|gb|ADTN01000270.1| GENE 25 23808 - 25097 1380 429 aa, chain + ## HITS:1 COG:yfaV KEGG:ns NR:ns ## COG: 
yfaV COG0477 # Protein_GI_number: 16130181 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 429 14 442 442 751 99.0 0 MSTALLDAVVKKNRVRLIPFMLALYVLAFLDRSNIGFAKQTYQIDTGLSNEAYALGAGIF FVVYAFLGVPANLLMRKLGARTWIGTTTLLWGFLSAAMAWADTEAKFLIVRTLLGAAEAG FFPGMIYLTSQWFPQRNRASIMGLFYMGAPLALTLGSPLSGALLEMHGFMGHPGWFWMFV IEGLLAVGAGVFTFFWLDDTPEQARFLSKQEKTLLINQLASEEQQKVTSRLSDALRNGRV WQLAIIYLTIQVAVYGLIFFLPTQVAALLGTKVGFTASVVTAIPWVAALFGTWLIPRYSD KTGERRNVAALTLLAAGIGIGLSGLLSPVMAIVALCVAAIGFIAVQPVFWTMPTQLLSGT ALAAGIGFVNLFGAVGGFIAPILRVKAETLFASDAAGLLTLAAVAVIGSLIIFTLRVNRT VAQTDVAHH >gi|296494468|gb|ADTN01000270.1| GENE 26 25115 - 25918 935 267 aa, chain + ## HITS:1 COG:yfaU KEGG:ns NR:ns ## COG: yfaU COG3836 # Protein_GI_number: 16130180 # Func_class: G Carbohydrate transport and metabolism # Function: 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase # Organism: Escherichia coli K12 # 1 267 1 267 267 527 100.0 1e-150 MNALLSNPFKERLRKGEVQIGLWLSSTTAYMAEIAATSGYDWLLIDGEHAPNTIQDLYHQ LQAVAPYASQPVIRPVEGSKPLIKQVLDIGAQTLLIPMVDTAEQARQVVSATRYPPYGER GVGASVARAARWGRIENYMAQVNDSLCLLVQVESKTALDNLDEILDVEGIDGVFIGPADL SASLGYPDNAGHPEVQRIIETSIRRIRAAGKAAGFLAVAPDMAQQCLAWGANFVAVGVDT MLYSDALDQRLAMFKSGKNGPRIKGSY >gi|296494468|gb|ADTN01000270.1| GENE 27 25959 - 26144 315 61 aa, chain - ## HITS:1 COG:ECs3129 KEGG:ns NR:ns ## COG: ECs3129 COG5464 # Protein_GI_number: 15832383 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 61 244 308 308 112 92.0 2e-25 MTIAERLRQEGHQIGWQEGKLEGLHEQAIKIALRMLEQGFDRDQVLAATQLSEADLAANN H >gi|296494468|gb|ADTN01000270.1| GENE 28 26157 - 27056 797 299 aa, chain - ## HITS:1 COG:yfaD KEGG:ns NR:ns ## COG: yfaD COG5464 # Protein_GI_number: 16130179 # Func_class: S Function unknown # Function: Uncharacterized conserved 
protein # Organism: Escherichia coli K12 # 1 299 1 299 299 588 99.0 1e-168 MTESTTSSPHDAVFKTFMFTPETARDFLEIHLPEPLRKLCNLQTLRLEPTSFIEKSLRAY YSDVLWSVETSDGDGYIYCVIEHQSSAEKNMAFRLMRYATAAMQRHLDKGYDRVPLVVPL LFYHGETSPYPYSLNWLDEFDDPQLARQLYTEAFPLVDITIVPDDEIMQHRRIALLELIQ KHIRDRDLIGMVDRITTLLVRGFTNDSQLQTLFNYLLQCGDTSRFTRFIEEIAERSPLQK ERLMTIAERLRQEGHQIGWQEGMHEQAIKIALRMLEQGFEREIVLATTQLTDADIPNCH >gi|296494468|gb|ADTN01000270.1| GENE 29 27099 - 27224 66 41 aa, chain - ## HITS:1 COG:no KEGG:ECH74115_3381 NR:ns ## KEGG: ECH74115_3381 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 41 1 41 41 85 100.0 6e-16 MWLPGIFAYTIPCPMYALQRSENSSENVSQRLKLADVKGLG >gi|296494468|gb|ADTN01000270.1| GENE 30 27249 - 28439 1171 396 aa, chain - ## HITS:1 COG:ECs3128 KEGG:ns NR:ns ## COG: ECs3128 COG0247 # Protein_GI_number: 15832382 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Escherichia coli O157:H7 # 1 396 1 396 396 832 100.0 0 MNDTSFENCIKCTVCTTACPVSRVNPGYPGPKQAGPDGERLRLKDGALYDEALKYCINCK RCEVACPSDVKIGDIIQRARAKYDTTRPSLRNFVLSHTDLMGSVSTPFAPIVNTATSLKP VRQLLDAALKIDHRRTLPKYSFGTFRRWYRSVAAQQAQYKDQVAFFHGCFVNYNHPQLGK DLIKVLNAMGTGVQLLSKEKCCGVPLIANGFTDKARKQAITNVESIREAVGVKGIPVIAT SSTCTFALRDEYPEVLNVDNKGLRDHIELATRWLWRKLDEGKTLPLKPLPLKVVYHTPCH MEKMGWTLYTLELLRNIPGLELTVLDSQCCGIAGTYGFKKENYPTSQAIGAPLFRQIEES GADLVVTDCETCKWQIEMSTSLRCEHPITLLAQALA >gi|296494468|gb|ADTN01000270.1| GENE 31 28436 - 29695 1188 419 aa, chain - ## HITS:1 COG:glpB KEGG:ns NR:ns ## COG: glpB COG3075 # Protein_GI_number: 16130177 # Func_class: E Amino acid transport and metabolism # Function: Anaerobic glycerol-3-phosphate dehydrogenase # Organism: Escherichia coli K12 # 1 419 1 419 419 809 100.0 0 MRFDTVIMGGGLAGLLCGLQLQKHGLRCAIVTRGQSALHFSSGSLDLLSHLPDGQPVTDI HSGLESLRQQAPAHPYSLLEPQRVLDLACQAQALIAESGAQLQGSVELAHQRVTPLGTLR STWLSSPEVPVWPLPAKKICVVGISGLMDFQAHLAAASLRELGLAVETAEIELPELDVLR NNATEFRAVNIARFLDNEENWPLLLDALIPVANTCEMILMPACFGLADDKLWRWLNEKLP 
CSLMLLPTLPPSVLGIRLQNQLQRQFVRQGGVWMPGDEVKKVTCKNGVVNEIWTRNHADI PLRPRFAVLASGSFFSGGLVAERNGIREPILGLDVLQTATRGEWYKGDFFAPQPWQQFGV TTDETLRPSQAGQTIENLFAIGSVLGGFDPIAQGCGGGVCAVSALHAAQQIAQRAGGQQ >gi|296494468|gb|ADTN01000270.1| GENE 32 29685 - 31313 1842 542 aa, chain - ## HITS:1 COG:ECs3126 KEGG:ns NR:ns ## COG: ECs3126 COG0578 # Protein_GI_number: 15832380 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Escherichia coli O157:H7 # 1 542 1 542 542 1050 99.0 0 MKTRDSQSSDVIIIGGGATGAGIARDCALRGLRVILVERHDIATGATGRNHGLLHSGARY AVTDAESARECISENQILKRIARHCVEPTNGLFITLPEDNLSFQATFIRACEEAGISAEA IDPQQARIIEPAVNPALIGAVKVPDGTVDPFRLTAANMLDAKEHGAVILTAHEVTGLIRE GATVCGVRVRNHLTGETQALHAPVVVNAAGIWGQHIAEYADLRIRMFPAKGSLLIMDHRI NQHVINRCRKPSDADILVPGDTISLIGTTSLRIDYNEIDDNRVTAEEVDILLREGEKLAP VMAKTRILRAYSGVRPLVASDDDPSGRNVSRGIVLLDHAERDGLDGFITITGGKLMTYRL MAEWATDAVCRKLGNTRPCTTADLALPGSQEPAEVTLRKVISLPAPLRGSAVYRHGDRTP AWLSEGRLHRSLVCECEAVTAGEVQYAVENLNVNSLLDLRRRTRVGMGTCQGELCACRAA GLLQRFNVTTSAQSIEQLSTFLNERWKGVQPIAWGDALRESEFTRWVYQGLCGLEKEQKD AL >gi|296494468|gb|ADTN01000270.1| GENE 33 31338 - 31436 62 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFKMTHEITFHFRIMSEYARNQTIHVFTMAKW >gi|296494468|gb|ADTN01000270.1| GENE 34 31586 - 32944 1466 452 aa, chain + ## HITS:1 COG:glpT KEGG:ns NR:ns ## COG: glpT COG2271 # Protein_GI_number: 16130175 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Escherichia coli K12 # 1 452 1 452 452 882 99.0 0 MLSIFKPAPHKARLPAAEIDPTYRRLRWQIFLGIFFGYAAYYLVRKNFALAMPYLVEQGF SRGDLGFALSGISIAYGFSKFIMGSVSDRSNPRVFLLAGLILAAAVMLFMGFVPWATSSI AVMFVLLFLCGWFQGMGWPPCGRTMVHWWSQKERGGIVSVWNCAHNVGGGIPPLLFLLGM AWFNDWHAALYMPAFCAILVALFAFAMMRDTPQSCGLPPIEEYKNDYPDDYNEKAEQELT AKQIFMQYVLPNKLLWYIAIANVFVYLLRYGILDWSPTYLKEVKHFALDKSSWAYFLYEY AGIPGTLLCGWMSDKVFRGNRGATGVFFMTLVTIATIVYWMNPAGNPTVDMICMIVIGFL IYGPVMLIGLHALELAPKKAAGTAAGFTGLFGYLGGSVAASAIVGYTVDFFGWDGGFMVM IGGSILAVILLIVVMIGEKRRHEQLLQERNGG 
>gi|296494468|gb|ADTN01000270.1| GENE 35 32949 - 34025 1163 358 aa, chain + ## HITS:1 COG:glpQ KEGG:ns NR:ns ## COG: glpQ COG0584 # Protein_GI_number: 16130174 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Escherichia coli K12 # 1 358 1 358 358 719 100.0 0 MKLTLKNLSMAIMMSTIVMGSSAMAADSNEKIVIAHRGASGYLPEHTLPAKAMAYAQGAD YLEQDLVMTKDDNLVVLHDHYLDRVTDVADRFPDRARKDGRYYAIDFTLDEIKSLKFTEG FDIENGKKVQTYPGRFPMGKSDFRVHTFEEEIEFVQGLNHSTGKNIGIYPEIKAPWFHHQ EGKDIAAKTLEVLKKYGYTGKDDKVYLQCFDADELKRIKNELEPKMGMELNLVQLIAYTD WNETQQKQPDGSWVNYNYDWMFKPGAMKQVAEYADGIGPDYHMLIEETSQPGNIKLTGMV QDAQQNKLVVHPYTVRSDKLPEYTPDVNQLYDALYNKAGVNGLFTDFPDKAVKFLNKE >gi|296494468|gb|ADTN01000270.1| GENE 36 34067 - 34273 235 68 aa, chain - ## HITS:1 COG:yfaH KEGG:ns NR:ns ## COG: yfaH COG0583 # Protein_GI_number: 16130173 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 68 1 68 68 116 100.0 1e-26 MNFIRQGLGIALQPELTLKSIAGELCSVPLEPTFYRQISLLAKEKPVEGSPLFLLQMCME QLVAIGKI >gi|296494468|gb|ADTN01000270.1| GENE 37 34488 - 35138 573 216 aa, chain + ## HITS:1 COG:no KEGG:B21_02122 NR:ns ## KEGG: B21_02122 # Name: inaA # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 216 1 216 216 432 100.0 1e-120 MAVSAKYDEFNHWWATEGDWVEEPNYRRNGMSGVQCVERNGKKLYVKRMTHHLFHSVRYP FGRPTIVREVAVIKELERAGVIVPKIVFGEAVKIEGEWRALLVTEDMAGFISIADWYAQH AVSPYSDEVRQAMLKAVALAFKKMHSINRQHGCCYVRHIYVKTEGNAEAGFLDLEKSRRR LRRDKAINHDFRQLEKYLEPIPKADWEQVKAYYYAM >gi|296494468|gb|ADTN01000270.1| GENE 38 35192 - 35446 165 84 aa, chain - ## HITS:1 COG:yfaE KEGG:ns NR:ns ## COG: yfaE COG0633 # Protein_GI_number: 16130171 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Escherichia coli K12 # 1 84 1 84 84 159 100.0 9e-40 MARVTLRITGTQLLCQDEHPSLLAALESHNVAVEYQCREGYCGSCRTRLVAGQVDWIAEP LAFIQPGEILPCCCRAKGDIEIEM >gi|296494468|gb|ADTN01000270.1| GENE 39 35446 - 36576 1520 376 aa, chain - ## HITS:1 
COG:ECs3118 KEGG:ns NR:ns ## COG: ECs3118 COG0208 # Protein_GI_number: 15832372 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, beta subunit # Organism: Escherichia coli O157:H7 # 1 376 1 376 376 758 100.0 0 MAYTTFSQTKNDQLKEPMFFGQPVNVARYDQQKYDIFEKLIEKQLSFFWRPEEVDVSRDR IDYQALPEHEKHIFISNLKYQTLLDSIQGRSPNVALLPLISIPELETWVETWAFSETIHS RSYTHIIRNIVNDPSVVFDDIVTNEQIQKRAEGISSYYDELIEMTSYWHLLGEGTHTVNG KTVTVSLRELKKKLYLCLMSVNALEAIRFYVSFACSFAFAERELMEGNAKIIRLIARDEA LHLTGTQHMLNLLRSGADDPEMAEIAEECKQECYDLFVQAAQQEKDWADYLFRDGSMIGL NKDILCQYVEYITNIRMQAVGLDLPFQTRSNPIPWINTWLVSDNVQVAPQEVEVSSYLVG QIDSEVDTDDLSNFQL >gi|296494468|gb|ADTN01000270.1| GENE 40 36809 - 39094 2770 761 aa, chain - ## HITS:1 COG:nrdA KEGG:ns NR:ns ## COG: nrdA COG0209 # Protein_GI_number: 16130169 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Escherichia coli K12 # 1 761 1 761 761 1584 100.0 0 MNQNLLVTKRDGSTERINLDKIHRVLDWAAEGLHNVSISQVELRSHIQFYDGIKTSDIHE TIIKAAADLISRDAPDYQYLAARLAIFHLRKKAYGQFEPPALYDHVVKMVEMGKYDNHLL EDYTEEEFKQMDTFIDHDRDMTFSYAAVKQLEGKYLVQNRVTGEIYESAQFLYILVAACL FSNYPRETRLQYVKRFYDAVSTFKISLPTPIMSGVRTPTRQFSSCVLIECGDSLDSINAT SSAIVKYVSQRAGIGINAGRIRALGSPIRGGEAFHTGCIPFYKHFQTAVKSCSQGGVRGG AATLFYPMWHLEVESLLVLKNNRGVEGNRVRHMDYGVQINKLMYTRLLKGEDITLFSPSD VPGLYDAFFADQEEFERLYTKYEKDDSIRKQRVKAVELFSLMMQERASTGRIYIQNVDHC NTHSPFDPAIAPVRQSNLCLEIALPTKPLNDVNDENGEIALCTLSAFNLGAINNLDELEE LAILAVRALDALLDYQDYPIPAAKRGAMGRRTLGIGVINFAYYLAKHGKRYSDGSANNLT HKTFEAIQYYLLKASNELAKEQGACPWFNETTYAKGILPIDTYKKDLDTIANEPLHYDWE ALRESIKTHGLRNSTLSALMPSETSSQISNATNGIEPPRGYVSIKASKDGILRQVVPDYE HLHDAYELLWEMPGNDGYLQLVGIMQKFIDQSISANTNYDPSRFPSGKVPMQQLLKDLLT AYKFGVKTLYYQNTRDGAEDAQDDLVPSIQDDGCESGACKI >gi|296494468|gb|ADTN01000270.1| GENE 41 39997 - 43542 2883 1181 aa, chain + ## HITS:1 COG:yfaL_2 KEGG:ns NR:ns ## COG: yfaL_2 COG3468 # Protein_GI_number: 16130168 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and 
vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 736 1181 1 446 446 743 100.0 0 MTNNASGGAVFLQQGAEFSLLPENETGMTLFANNTVTGEYNNGGAIFAKENSTLNLTDVI FSGNVAGGYGGAIYSSGTNDTGAVDLRVTNAMFRNNIANDGKGGAIYTINNDVYLSDVIF DNNQAYTSTSYSDGDGGAIDVTDNNSDSKHPSGYTIVNNTAFTNNTAEGYGGAIYTNSVT APYLIDISVDDSYSQNGGVLVDENNSAAGYGDGPSSAAGGFMYLGLSEVTFDNADGKTLV IGNTENDGAVDSIAGTGLITKTGSGDLVLNADNNDFTGEMQIENGEVTLGRSNSLMNVGD THCQDDPQDCYGLTIGSIDQYQNQAELNVGSTQQTFVHALTGFQNGTLNIDAGGNVTVNQ GSFAGIIEGAGQLTIAQNGSYVLAGAQSMALTGDIVVDDGAVLSLEGDAADLTALQDDPQ SIVLNGGVLDLSDFSTWQSGTSYNDGLEVSGSSGTVIGSQDVVDLAGGDNLHIGGDGKDG VYVVVDASDGQVSLANNNSYLGTTQIASGTLMVSDNSQLGDTHYNRQVIFTDKQQESVME ITSDVDTRSDAAGHGRDIEMRADGEVAVDAGVDTQWGALMADSSGQHQDEGSTLTKTGAG TLELTASGTTQSAVRVEEGTLKGDVADILPYASSLWVGDGATFVTGADQDIQSIDAISSG TIDISDGTVLRLTGQDTSVALNASLFNGDGTLVNATDGVTLTGELNTNLETDSLTYLSNV TVNGNLTNTSGAVSLQNGVAGDTLTVNGDYTGGGTLLLDSELNGDDSVSDQLVMNGNTAG NTTVVVNSITGIGEPTSTGIKVVDFAADPTQFQNNAQFSLAGSGYVNMGAYDYTLVEDNN DWYLRSQEVTPPSPPDPDPTPDPDPTPDPDPTPDPEPTPAYQPVLNAKVGGYLNNLRAAN QAFMMERRDHAGGDGQTLNLRVIGGDYHYTAAGQLAQHEDTSTVQLSGDLFSGRWGTDGE WMLGIVGGYSDNQGDSRSNMTGTRADNQNHGYAVGLTSSWFQHGNQKQGAWLDSWLQYAW FSNDVSEQEDGTDHYHSSGIIASLEAGYQWLPGRGVVIEPQAQVIYQGVQQDDFTAANRA RVSQSQGDDIQTRLGLHSEWRTAVHVIPTLDLNYYHDPHSTEIEEDGSTISDDAVKQRGE IKVGVTGNISQRVSLRGSVAWQKGSDDFAQTAGFLSMTVKW >gi|296494468|gb|ADTN01000270.1| GENE 42 43670 - 44392 855 240 aa, chain - ## HITS:1 COG:ubiG KEGG:ns NR:ns ## COG: ubiG COG2227 # Protein_GI_number: 16130167 # Func_class: H Coenzyme transport and metabolism # Function: 2-polyprenyl-3-methyl-5-hydroxy-6-metoxy-1,4-benzoquinol methylase # Organism: Escherichia coli K12 # 1 240 1 240 240 498 99.0 1e-141 MNAKKSPVNHNVDHEEIAKFEAVASRWWDLEGEFKPLHRINPLRLGYIAERAGGLFGKKV LDVGCGGGILAESMAREGATVTGLDMGFEPLQVAKLHALESGIQVDYVQETVEEHAAKHA GQYDVVTCMEMLEHVPDPQSVVRACAQLVKPGGDVFFSTLNRNGKSWLMAVVGAEYILRM VPKGTHDVKKFIKPAELLGWVDQTSLKERHITGLHYNPITNTFKLGPGVDVNYMLHTQNK >gi|296494468|gb|ADTN01000270.1| GENE 43 44539 - 47166 3385 
875 aa, chain + ## HITS:1 COG:gyrA KEGG:ns NR:ns ## COG: gyrA COG0188 # Protein_GI_number: 16130166 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Escherichia coli K12 # 1 875 1 875 875 1660 100.0 0 MSDLAREITPVNIEEELKSSYLDYAMSVIVGRALPDVRDGLKPVHRRVLYAMNVLGNDWN KAYKKSARVVGDVIGKYHPHGDSAVYDTIVRMAQPFSLRYMLVDGQGNFGSIDGDSAAAM RYTEIRLAKIAHELMADLEKETVDFVDNYDGTEKIPDVMPTKIPNLLVNGSSGIAVGMAT NIPPHNLTEVINGCLAYIDDEDISIEGLMEHIPGPDFPTAAIINGRRGIEEAYRTGRGKV YIRARAEVEVDAKTGRETIIVHEIPYQVNKARLIEKIAELVKEKRVEGISALRDESDKDG MRIVIEVKRDAVGEVVLNNLYSQTQLQVSFGINMVALHHGQPKIMNLKDIIAAFVRHRRE VVTRRTIFELRKARDRAHILEALAVALANIDPIIELIRHAPTPAEAKTALVANPWQLGNV AAMLERAGDDAARPEWLEPEFGVRDGLYYLTEQQAQAILDLRLQKLTGLEHEKLLDEYKE LLDQIAELLRILGSADRLMEVIREELELVREQFGDKRRTEITANSADINLEDLITQEDVV VTLSHQGYVKYQPLSEYEAQRRGGKGKSAARIKEEDFIDRLLVANTHDHILCFSSRGRVY SMKVYQLPEATRGARGRPIVNLLPLEQDERITAILPVTEFEEGVKVFMATANGTVKKTVL TEFNRLRTAGKVAIKLVDGDELIGVDLTSGEDEVMLFSAEGKVVRFKESSVRAMGCNTTG VRGIRLGEGDKVVSLIVPRGDGAILTATQNGYGKRTAVAEYPTKSRATKGVISIKVTERN GLVVGAVQVDDCDQIMMITDAGTLVRTRVSEISIVGRNTQGVILIRTAEDENVVGLQRVA EPVDEEDLDTIDGSAAEGDDEIAPEVDVDDEPEEE >gi|296494468|gb|ADTN01000270.1| GENE 44 47315 - 49003 1372 562 aa, chain + ## HITS:1 COG:yfaA KEGG:ns NR:ns ## COG: yfaA COG4685 # Protein_GI_number: 16130165 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 562 17 578 578 1129 100.0 0 MSGEKKAKGWRFYGLVGFGAIALLSAGVWALQYAGSGPEKTLSPLVVHNNLQIDLNEPDL FLDSDSLSQLPKDLLTIPFLHDVLSEDFVFYYQNHADRLGIEGSIRRIVYEHDLTLKDKL FSSLLDQPAQAALWHDKQGHLSHYMVLIQRSGLSKLLEPLLFAATSDSQLSKTEISSIKI NSETVPVYQLRYNGNNALMFATYQDKMLVFSSTDMLFKDDQQDTEATAIAGDLLSGKKRW QASFGLEERTAEKTPVRQRIVVSARWLGFGYQRLMPSFAGVRFEMGNDGWHSFVALNDES ASVDASFDFTPVWNSMPAGASFCVAVPYSHGIAEEMLSHISQENDKLNGALDGAAGLCWY EDSKLQTPLFVGQFDGTAEQAQLPGKLFTQNIGAHESKAPEGVLPVSQTQQGEAQIWRRE VSSRYGQYPKAQAAQPDQLMSDYFFRVSLAMQNKTLLFSLDDTLVNNALQTLNKTRPAMV 
DVIPTDGIVPLYINPQGIAKLLRNETLTSLPKNLEPVFYNAAQTLLMPKLDALSQQPRYV MKLAQMEPGAAWQWLPITWQPL >gi|296494468|gb|ADTN01000270.1| GENE 45 49000 - 49623 314 207 aa, chain + ## HITS:1 COG:ECs3112 KEGG:ns NR:ns ## COG: ECs3112 COG3234 # Protein_GI_number: 15832366 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 207 10 216 216 426 100.0 1e-119 MRHGLLALICWLCCVVAHSEMLNVEQSGLFRAWFVRIAQEQLRQGPSPRWYQQDCAGLVR FAANETLKVHDSKWLKSNGLSSQYLPPEMTLTPEQRQLAQNWNQGNGKTGPYVTAINLIQ YNSQFIGQDINQALPGDMIFFDQGDAQHLMVWMGRYVIYHTGSATKTDNGMRAVSLQQLM TWKDTRWIPNDSNPNFIGIYRLNFLAR >gi|296494468|gb|ADTN01000270.1| GENE 46 49557 - 54161 3786 1534 aa, chain + ## HITS:1 COG:ECs3111 KEGG:ns NR:ns ## COG: ECs3111 COG2373 # Protein_GI_number: 15832365 # Func_class: R General function prediction only # Function: Large extracellular alpha-helical protein # Organism: Escherichia coli O157:H7 # 1 1534 1 1534 1534 2981 98.0 0 MDTQRFQSQFHWHLSFKFSGAIAACLSLSLVGTGLANADDSLPSSNYAPPAGGTFFLLAD SSFSSSEEAKVRLEAPGRDYRRYQMEEYGGVDVRLYRIPDPMAFLRQQKNLHRIVVQPQY LGDGLNNTLTWLWDNWYGKSRRVMQRTFSSQSRQNVTQALPELQLGNAIIKPSRYVQNNQ FSPLKKYPLVKQFRYPLWQAKPFEPQQGVKLEGASSNFISPQPGNIYIPLGQQEPGLYLV EAMVGGYRATTVVFVSDTVALSKVSGKELLVWTAGKKQGEAKPGSEILWTDGLGVMTRGV TDDSGTLQLQHISPERSYILGKDAEGGVFVSENFFYESEIYNTRLYIFTDRPLYRAGDRV DVKVIGREFHDPLHSSPIVSAPAKLSVLDANGSLLQTVNVTLDARNGGQGSFRLPENAVA GGYELRLAYRNQVYSSSFRVANYIKPHFEIGLALAKKEFKTGEAVSGKLQLLYPDGEPVK NARVQLSLRAQQLSMVGNDLRYAGRFPVSLEGSETVSDASGHVALNLPAADKPSRYLLTV SASDGAAYRVTTTKEILIERGLAHYSLSTAAQYSNSGESVVFRYAALESSKQVPVTYEWL RLEDRTSHSGELPSGGKSFTVNFAKPGNYNLTLRDKDGLILAGLSHAVSGKGSTAHTGTV DIVADKTLYQPGETAKMLITFPEPIDEALLTLERDRVEQQSLLSHPANWLTLQRLNDTQY EARVPVSNSFAPNITFSVLYTRNGQYSFQNAGIKVAVPQLDIRVKTDKTHYQPGELVNVE LTSSLKGKPVSAQLTVGVVDEMIYALQPEIAPNIGKFFYPLGRNNVRTSSSLSFISYDQA LSSEPVAPGATNRSERRVKMLERPRREEVDTAAWMPSLTTDKQGKAYFTFLMPDSLTRWR ITARGMNGDGLVGQGRAYLRSGKNLYMKWSMPTMYRVGDKPAAGLFIFSQQDNEPVALVT 
KFAGAEMRQTLTLHKGANYISLTQNIQQSGLLSAELQQNGQVQDSISTKLSFVDNSWPVE QQKNVMLGGGDNALMLPEQASNIRLQSSETPQEIFRNNLDALVDEPWGGVINTGSRLIPL SLAWRSLADHQSAAANDIRQMIQDNRLRLMQLAGPGARFTWWGEDGNGDAFLTAWAWYAD WQASQAIGVTQQPEYWQHMLDSYAEQADNMPLLHRALVLAWAQEMNLPCKTLLKGLDEAI ARRGTKTEDFSEEDTRDINDSLILDTPESPLADAVANVLTMTLLKKAQLKSTVMPQVQQY AWDKAANSNQPLAHTVVLLNSGGDATQTAAILSGLTAEQSTIERALAMNWLAKYMATMPP VVLPAPAGAWAKHKLTGGGEDWRWVGQGVPDILSFGDELSPQNVQVRWREPAKMAQQSNI PVTVERQLYRLIPGEEEMSFILQPVTSNEIDSDALYLDEITLTSEQDAVLRYGQVEVPLP PGADVERTTWGISVNKPNAAKQQGQLLEKARNEMGELAYMVPVKELTGTVTFRHLLRFSQ KGQFVLPPARYVRSYAPAQQSVAAGSEWTGMQVK >gi|296494468|gb|ADTN01000270.1| GENE 47 54162 - 55811 1098 549 aa, chain + ## HITS:1 COG:yfaQ KEGG:ns NR:ns ## COG: yfaQ COG5445 # Protein_GI_number: 16130163 # Func_class: S Function unknown # Function: Predicted secreted protein # Organism: Escherichia coli K12 # 1 549 1 549 549 1085 99.0 0 MNWRRIVWLLALVTLPTLAEETPLQLVLRGAQHDQLYQLSSSGVTKVSALPDSLTTPLGS LWKLYVYAWLEDTHQPEQPYQCRGNSPEEVYCCQAGESITRDTALVRSCGLYFAPQRLHI GADVWGQYWQQRQAPAWLASLTTLKPETSVTVKSLLDSLATLPAQNKAQEVLLDVVLDEA KIGVASMLGSRVRVKTWSWFADDKQEIRQGGFAGWLTDGTPLWVTGSGTSKTVLTRYATV LNRVLPVPTQVASGQCVEVELFARYPLKKITAEKSTTAVNPGVLNGRYRVTFTNGNHITF VSHGETTLLSEKGKLKLQSHLDREEYVARVLDREAKSTPPEAAKAMTVAIRTFLQQNANR EGDCLTIPDSSATQRVSASPATTGARTMTAWTQDLIYAGDPVHYHGSRATEGTLFWRQAT AQAGQGERYDQILAFAYPDNSLSRWGAPRSTCQLLPKAKAWLAKKMPQWRRILQAETGYN EPDVFAVCRLVSGFPYTDRQQKRLFIRNFFTLQDRLDLTHEYLHLAFDGYPTGLDENYIE TLTRQLLMD >gi|296494468|gb|ADTN01000270.1| GENE 48 55816 - 56592 597 258 aa, chain + ## HITS:1 COG:yfaP KEGG:ns NR:ns ## COG: yfaP COG4676 # Protein_GI_number: 16130162 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 258 1 258 258 514 99.0 1e-146 MRKIFLPLLLVALSPVAHSEGVQEVEIDAPLSGWHPAEGEDASFSQSINYPASSVNMADD QNISAQIRGKIKNYAAAGKVQQGRLVVNGASMPQRIESDGSFARPYIFTEGSNSVQVISP DGQSRQKMQFYSTPGTGTIRARLRLVLSWDTDNTDLDLHVVTPDGEHAWYGNTVLKNSGA LDMDVTTGYGPEIIAMPAPIHGRYQVYINYYGGRSETELTTAQLTLITDEGSVNEKQETF 
IVPMRNAGELTLVKSFDW >gi|296494468|gb|ADTN01000270.1| GENE 49 56666 - 57850 1360 394 aa, chain - ## HITS:1 COG:atoB KEGG:ns NR:ns ## COG: atoB COG0183 # Protein_GI_number: 16130161 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA acetyltransferase # Organism: Escherichia coli K12 # 1 394 1 394 394 655 100.0 0 MKNCVIVSAVRTAIGSFNGSLASTSAIDLGATVIKAAIERAKIDSQHVDEVIMGNVLQAG LGQNPARQALLKSGLAETVCGFTVNKVCGSGLKSVALAAQAIQAGQAQSIVAGGMENMSL APYLLDAKARSGYRLGDGQVYDVILRDGLMCATHGYHMGITAENVAKEYGITREMQDELA LHSQRKAAAAIESGAFTAEIVPVNVVTRKKTFVFSQDEFPKANSTAEALGALRPAFDKAG TVTAGNASGINDGAAALVIMEESAALAAGLTPLARIKSYASGGVPPALMGMGPVPATQKA LQLAGLQLADIDLIEANEAFAAQFLAVGKNLGFDSEKVNVNGGAIALGHPIGASGARILV TLLHAMQARDKTLGLATLCIGGGQGIAMVIERLN >gi|296494468|gb|ADTN01000270.1| GENE 50 57881 - 59068 978 395 aa, chain - ## HITS:1 COG:atoE KEGG:ns NR:ns ## COG: atoE COG2031 # Protein_GI_number: 16130160 # Func_class: I Lipid transport and metabolism # Function: Short chain fatty acids transporter # Organism: Escherichia coli K12 # 1 395 46 440 440 716 100.0 0 MVKMWGDGFWNLLAFGMQMALIIVTGHALASSAPVKSLLRTAASAAKTPVQGVMLVTFFG SVACVINWGFGLVVGAMFAREVARRVPGSDYPLLIACAYIGFLTWGGGFSGSMPLLAATP GNPVEHIAGLIPVGDTLFSGFNIFITVALIVVMPFITRMMMPKPSDVVSIDPKLLMEEAD FQKQLPKDAPPSERLEESRILTLIIGALGIAYLAMYFSEHGFNITINTVNLMFMIAGLLL HKTPMAYMRAISAAARSTAGILVQFPFYAGIQLMMEHSGLGGLITEFFINVANKDTFPVM TFFSSALINFAVPSGGGHWVIQGPFVIPAAQALGADLGKSVMAIAYGEQWMNMAQPFWAL PALAIAGLGVRDIMGYCITALLFSGVIFVIGLTLF >gi|296494468|gb|ADTN01000270.1| GENE 51 59200 - 59850 538 216 aa, chain - ## HITS:1 COG:atoA KEGG:ns NR:ns ## COG: atoA COG2057 # Protein_GI_number: 16130159 # Func_class: I Lipid transport and metabolism # Function: Acyl CoA:acetate/3-ketoacid CoA transferase, beta subunit # Organism: Escherichia coli K12 # 1 216 1 216 216 426 100.0 1e-119 MDAKQRIARRVAQELRDGDIVNLGIGLPTMVANYLPEGIHITLQSENGFLGLGPVTTAHP DLVNAGGQPCGVLPGAAMFDSAMSFALIRGGHIDACVLGGLQVDEEANLANWVVPGKMVP GMGGAMDLVTGSRKVIIAMEHCAKDGSAKILRRCTMPLTAQHAVHMLVTELAVFRFIDGK 
MWLTEIADGCDLATVRAKTEARFEVAADLNTQRGDL >gi|296494468|gb|ADTN01000270.1| GENE 52 59850 - 60512 643 220 aa, chain - ## HITS:1 COG:atoD KEGG:ns NR:ns ## COG: atoD COG1788 # Protein_GI_number: 16130158 # Func_class: I Lipid transport and metabolism # Function: Acyl CoA:acetate/3-ketoacid CoA transferase, alpha subunit # Organism: Escherichia coli K12 # 1 220 1 220 220 411 100.0 1e-115 MKTKLMTLQDATGFFRDGMTIMVGGFMGIGTPSRLVEALLESGVRDLTLIANDTAFVDTG IGPLIVNGRVRKVIASHIGTNPETGRRMISGEMDVVLVPQGTLIEQIRCGGAGLGGFLTP TGVGTVVEEGKQTLTLDGKTWLLERPLRADLALIRAHRCDTLGNLTYQLSARNFNPLIAL AADITLVEPDELVETGELQPDHIVTPGAVIDHIIVSQESK >gi|296494468|gb|ADTN01000270.1| GENE 53 60708 - 62093 975 461 aa, chain - ## HITS:1 COG:atoC KEGG:ns NR:ns ## COG: atoC COG2204 # Protein_GI_number: 16130157 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 1 461 1 461 461 911 100.0 0 MTAINRILIVDDEDNVRRMLSTAFALQGFETHCANNGRTALHLFADIHPDVVLMDIRMPE MDGIKALKEMRSHETRTPVILMTAYAEVETAVEALRCGAFDYVIKPFDLDELNLIVQRAL QLQSMKKEIRHLHQALSTSWQWGHILTNSPAMMDICKDTAKIALSQASVLISGESGTGKE LIARAIHYNSRRAKGPFIKVNCAALPESLLESELFGHEKGAFTGAQTLRQGLFERANEGT LLLDEIGEMPLVLQAKLLRILQEREFERIGGHQTIKVDIRIIAATNRDLQAMVKEGTFRE DLFYRLNVIHLILPPLRDRREDISLLANHFLQKFSSENQRDIIDIDPMAMSLLTAWSWPG NIRELSNVIERAVVMNSGPIIFSEDLPPQIRQPVCNAGEVKTAPVGERNLKEEIKRVEKR IIMEVLEQQEGNRTRTALMLGISRRALMYKLQEYGIDPADV >gi|296494468|gb|ADTN01000270.1| GENE 54 62090 - 63916 1360 608 aa, chain - ## HITS:1 COG:atoS_3 KEGG:ns NR:ns ## COG: atoS_3 COG0642 # Protein_GI_number: 16130156 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 331 608 1 278 278 529 100.0 1e-150 MHYMKWIYPRRLRNQMILMAILMVIVPTLTIGYIVETEGRSAVLSEKEKKLSAVVNLLNQ ALGDRYDLYIDLPREERIRALNAELAPITENITHAFPGIGAGYYNKMLDAIITYAPSALY QNNVGVTIAADHPGREVMRTNTPLVYSGRQVRGDILNSMLPIERNGEILGYIWANELTED 
IRRQAWKMDVRIIIVLTAGLLISLLLIVLFSRRLSANIDIITDGLSTLAQNIPTRLPQLP GEMGQISQSVNNLAQALRETRTLNDLIIENAADGVIAIDRQGDVTTMNPAAEVITGYQRH ELVGQPYSMLFDNTQFYSPVLDTLEHGTEHVALEISFPGRDRTIELSVTTSRIHNTHGEM IGALVIFSDLTARKETQRRMAQAERLATLGELMAGVAHEVRNPLTAIRGYVQILRQQTSD PIHQEYLSVVLKEIDSINKVIQQLLEFSRPRHSQWQQVSLNALVEETLVLVQTAGVQARV DFISELDNELSPINADRELLKQVLLNILINAVQAISARGKIRIQTWQYSDSQQAISIEDN GCGIDLSLQKKIFDPFFTTKASGTGLGLALSQRIINAHQGDIRVASLPGYGATFTLILPI NPQGNQTV >gi|296494468|gb|ADTN01000270.1| GENE 55 64131 - 66932 1960 933 aa, chain + ## HITS:1 COG:rcsC_1 KEGG:ns NR:ns ## COG: rcsC_1 COG0642 # Protein_GI_number: 16130155 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 700 1 700 700 1436 99.0 0 MFRALALVLWLLIAFSSVFYIVNALHQRESEIRQEFNLSSDQAQRFIQRTSDVMKELKYI AENRLSAENGVLSPRGRETQADVPAFEPLFADSDCSAMSNTWRGSLESLAWFMRYWRDNF SAAYDLNRVFLIGSDNLCMANFGLRDMPVERDTALKALHERINKYRNAPQDDSGSNLYWI SEGPRPGVGYFYALTPVYLANRLQALLGVEQTIRMENFFLPGTLPMGVTILDENGHTLIS LTGPESKIKGDPRWMQERSWFGYTEGFRELVLKKNLPPSSLSIVYSVPVDKVLERIRMLI LNAILLNVLAGAALFTLARMYERRIFIPAESDALRLEEHEQFNRKIVASAPVGICILRTA DGVNILSNELAHTYLNMLTHEDRQRLTQIICGQQVNFVDVLTSNNTNLQISFVHSRYRNE NVAICVLVDVSSRVKMEESLQEMAQAAEQASQSKSMFLATVSHELRTPLYGIIGNLDLLQ TKELPKGVDRLVTAMNNSSSLLLKIISDILDFSKIESEQLKIEPREFSPREVMNHITANY LPLVVRKQLGLYCFIEPDVPVALNGDPMRLQQVISNLLSNAIKFTDTGCIVLHVRADGDY LSIRVRDTGVGIPAKEVVRLFDPFFQVGTGVQRNFQGTGLGLAICEKLISMMDGDISVDS EPGMGSQFTVRIPLYGAQYPQKKGVEELSGKRCWLAVRNASLCQFLETSLQRSGIVVTTY EGQEPTPEDVLITDEVVSKKWQGRAVVTFCRRHIGIPLEKAPGEWVHSVAAPHELPALLA RIYLIEMESDDPANALPSTDKAVSDNDDMMILVVDDHPINRRLLADQLGSLGYQCKTAND GVDALNVLSKNHIDIVLSDVNMPNMDGYRLTQRIRQLGLTLPVIGVTANALAEEKQRCLE SGMDSCLSKPVTLDVIKQTLTLYAERVRKSRDS >gi|296494468|gb|ADTN01000270.1| GENE 56 67132 - 67782 851 216 aa, chain - ## HITS:1 COG:ECs3106 KEGG:ns NR:ns ## COG: ECs3106 COG2197 # Protein_GI_number: 15832360 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver 
domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 # 1 216 1 216 216 400 100.0 1e-111 MNNMNVIIADDHPIVLFGIRKSLEQIEWVNVVGEFEDSTALINNLPKLDAHVLITDLSMP GDKYGDGITLIKYIKRHFPSLSIIVLTMNNNPAILSAVLDLDIEGIVLKQGAPTDLPKAL AALQKGKKFTPESVSRLLEKISAGGYGDKRLSPKESEVLRLFAEGFLVTEIAKKLNRSIK TISSQKKSAMMKLGVENDIALLNYLSSVTLSPADKD >gi|296494468|gb|ADTN01000270.1| GENE 57 67799 - 70471 2448 890 aa, chain - ## HITS:1 COG:yojN_1 KEGG:ns NR:ns ## COG: yojN_1 COG0642 # Protein_GI_number: 16130153 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 700 1 700 700 1286 100.0 0 MRQKETTATTRFSLLPGSITRFFLLLIIVLLVTMGVMVQSAVNAWLKDKSYQIVDITHAI QKRVDNWRYVTWQIYDNIAATTSPSSGEGLQETRLKQDVYYLEKPRRKTEALIFGSHDNS TLEMTQRMSTYLDTLWGAENVPWSMYYLNGQDNSLVLISTLPLKDLTSGFKESTVSDIVD SRRAEMLQQANALDERESFSNMRRLAWQNGHYFTLRTTFNQPGHLATVVAFDLPINDLIP PGMPLDSFRLEPDATATGNNDNEKEGTDSVSIHFNSTKIEISSALNSTDMRLVWQVPYGT LLLDTLQNILLPLLLNIGLLALALFGYTTFRHFSSRSTENVPSTAVNNELRILRAINEEI VSLLPLGLLVHDQESNRTVISNKIADHLLPHLNLQNITTMAEQHQGIIQATINNELYEIR MFRSQVAPRTQIFIIRDQDREVLVNKKLKQAQRLYEKNQQGRMIFMKNIGDALKEPAQSL AESAAKLNAPESKQLANQADVLVRLVDEIQLANMLADDSWKSETVLFSVQDLIDEVVPSV LPAIKRKGLQLLINNHLKAHDMRRGDRDALRRILLLLMQYAVTSTQLGKITLEVDQDESS EDRLTFRILDTGEGVSIHEMDNLHFPFINQTQNDRYGKADPLAFWLSDQLARKLGGHLNI KTRDGLGTRYSVHIKMLAADPEVEEEEERLLDDVCVMVDVTSAEIRNIVTRQLENWGATC ITPDERLISQDYDIFLTDNPSNLTASGLLLSDDESGVREIGPGQLCVNFNMSNAMQEAVL QLIEVQLAQEEVTESPLGGDENAQLHASGYYALFVDTVPDDVKRLYTEAATSDFAALAQT AHRLKGVFAMLNLVPGKQLCETLEHLIREKDVPGIEKYISDIDSYVKSLL >gi|296494468|gb|ADTN01000270.1| GENE 58 71210 - 72313 1278 367 aa, chain + ## HITS:1 COG:ompC KEGG:ns NR:ns ## COG: ompC COG3203 # Protein_GI_number: 16130152 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein (porin) # Organism: Escherichia coli K12 # 1 367 1 367 367 638 100.0 0 MKVKVLSLLVPALLVAGAANAAEVYNKDGNKLDLYGKVDGLHYFSDNKDVDGDQTYMRLG 
FKGETQVTDQLTGYGQWEYQIQGNSAENENNSWTRVAFAGLKFQDVGSFDYGRNYGVVYD VTSWTDVLPEFGGDTYGSDNFMQQRGNGFATYRNTDFFGLVDGLNFAVQYQGKNGNPSGE GFTSGVTNNGRDALRQNGDGVGGSITYDYEGFGIGGAISSSKRTDAQNTAAYIGNGDRAE TYTGGLKYDANNIYLAAQYTQTYNATRVGSLGWANKAQNFEAVAQYQFDFGLRPSLAYLQ SKGKNLGRGYDDEDILKYVDVGATYYFNKNMSTYVDYKINLLDDNQFTRDAGINTDNIVA LGLVYQF >gi|296494468|gb|ADTN01000270.1| GENE 59 72425 - 73480 1272 351 aa, chain + ## HITS:1 COG:ECs3103 KEGG:ns NR:ns ## COG: ECs3103 COG1477 # Protein_GI_number: 15832357 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Escherichia coli O157:H7 # 1 351 1 351 351 698 100.0 0 MEISFTRVALLAAALFFVGCDQKPQPAKTHATEVTVLEGKTMGTFWRASIPGIDAKRSAE LKEKIQTQLDADDQLLSTYKKDSALMRFNDSQSLSPWPVSEAMADIVTTSLRIGAKTDGA MDITVGPLVNLWGFGPEQQPVQIPSQEQIDAMKAKTGLQHLTVINQSHQQYLQKDLPDLY VDLSTVGEGYAADHLARLMEQEGISRYLVSVGGALNSRGMNGEGLPWRVAIQKPTDKENA VQAVVDINGHGISTSGSYRNYYELDGKRLSHVIDPQTGRPIEHNLVSVTVIAPTALEADA WDTGLMVLGPEKAKEVVRREGLAVYMITKEGDSFKTWMSPQFKSFLVSEKN >gi|296494468|gb|ADTN01000270.1| GENE 60 73554 - 74618 715 354 aa, chain + ## HITS:1 COG:ada_1 KEGG:ns NR:ns ## COG: ada_1 COG2169 # Protein_GI_number: 16130150 # Func_class: F Nucleotide transport and metabolism # Function: Adenosine deaminase # Organism: Escherichia coli K12 # 1 184 1 184 184 363 100.0 1e-100 MKKATCLTDDQRWQSVLARDPNADGEFVFAVRTTGIFCRPSCRARHALRENVSFYANASE ALAAGFRPCKRCQPEKANAQQHRLDKITHACRLLEQETPVTLEALADQVAMSPFHLHRLF KATTGMTPKAWQQAWRARRLRESLAKGESVTTSILNAGFPDSSSYYRKADETLGMTAKQF RHGGENLAVRYALADCELGRCLVAESERGICAILLGDDDATLISELQQMFPAADNAPADL MFQQHVREVIASLNQRDTPLTLPLDIRGTAFQQQVWQALRTIPCGETVSYQQLANAIGKP KAVRAVASACAANKLAIIIPCHRVVRGDGTLSGYRWGVSRKAQLLRREAENEER >gi|296494468|gb|ADTN01000270.1| GENE 61 74621 - 75268 509 215 aa, chain + ## HITS:1 COG:alkB KEGG:ns NR:ns ## COG: alkB COG3145 # Protein_GI_number: 16130149 # Func_class: L Replication, recombination and repair # Function: Alkylated DNA repair protein # Organism: Escherichia coli K12 # 1 215 
2 216 216 447 99.0 1e-126 MDLFADAEPWQEPLAAGAVILRRFAFNAAEQLIRDINDVASQSPFRQMVTPGGYTMSVAM TNCGHLGWTTHRQGYLYSPIDPQTNKPWPAMPQSFHNLCQRAATAAGYPDFQPDACLINR YAPGAKLSLHQDKDEPDLRAPIVSVSLGLPAIFQFGGLKRNDPLKRLLLEHGDVVVWGGE SRLFYHGIQPLKAGFHPLTIDCRYNLTFRQAGKKE >gi|296494468|gb|ADTN01000270.1| GENE 62 75638 - 76114 225 158 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301647699|ref|ZP_07247493.1| ## NR: gi|301647699|ref|ZP_07247493.1| hypothetical protein HMPREF9543_04216 [Escherichia coli MS 146-1] # 1 158 1 158 158 286 100.0 2e-76 MNTVKLIFMLSIAAAVSGCAGTYKAPVMTPVVASKTTNLTQDAILRASKSVLISDGYQIS SYDDKTGFVSTAPRDLHLSPEQADCGKTMGLDYLKDKRTATRVAYGVIYDDKRVTVKANI EGAYKPGSAIQDITLTCVSRGTLENELLDRIISTASRY >gi|296494468|gb|ADTN01000270.1| GENE 63 76176 - 76514 212 112 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301647700|ref|ZP_07247494.1| ## NR: gi|301647700|ref|ZP_07247494.1| hypothetical protein HMPREF9543_04217 [Escherichia coli MS 146-1] # 1 112 1 112 112 207 100.0 2e-52 MRMIAALIVVGCLFVGSVDAAGRCEAIASSLVRAAMITPDRAYDFRPVTNQLISMCEIGS KAAAKGENINDAIIANLKYRNKMAEKIPGGNKAALDLADISFQLGFDTYNKK >gi|296494468|gb|ADTN01000270.1| GENE 64 76701 - 76784 67 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFPFARRYELQLKLHETQYLKELWVGL >gi|296494468|gb|ADTN01000270.1| GENE 65 76825 - 77076 150 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|301647701|ref|ZP_07247495.1| ## NR: gi|301647701|ref|ZP_07247495.1| hypothetical protein HMPREF9543_04218 [Escherichia coli MS 146-1] # 1 83 1 83 83 166 100.0 4e-40 MAVQTNRLARLKGLEKSFAPNILANFRFVTHTFVEQEDPRNGSCEHITSPDSPIAELIIY TATRCAYRAKVKHFQQHYTLLEG >gi|296494468|gb|ADTN01000270.1| GENE 66 77080 - 77715 623 211 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|301647702|ref|ZP_07247496.1| ## NR: gi|301647702|ref|ZP_07247496.1| hypothetical protein HMPREF9543_04219 [Escherichia coli MS 146-1] # 1 211 1 211 211 315 100.0 1e-84 MAIKYSEQYIKGVGTSEVIGRMHEPVARQINALLSKRKVIANDRTRTDEEKAALYKRLKD EYAKAMEKSVNNLARELLVVMENNMRAKEEARSNISMLESVQLIAALRQAGVQGEDLRSA 
AISDVRIAQAMVNVPEAVVRAAKWTPEIAEEMLMRHFPDVQAAEAQYKADMAAYDRLVEA RDEVLNETEKRAPKAVLETRVDENEILKTEV Prediction of potential genes in microbial genomes Time: Mon May 16 00:05:14 2011 Seq name: gi|296494467|gb|ADTN01000271.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont710.1, whole genome shotgun sequence Length of sequence - 12994 bp Number of predicted genes - 21, with homology - 19 Number of transcription units - 13, operones - 6 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 262 - 837 356 ## COG5525 Bacteriophage tail assembly protein 2 1 Op 2 . - CDS 837 - 1145 132 ## EcolC_2111 prophage tail fibre domain-containing protein 3 1 Op 3 . - CDS 1112 - 2377 758 ## COG4733 Phage-related protein, tail component 4 1 Op 4 . - CDS 2370 - 2627 142 ## COG4220 Phage DNA packaging protein, Nu1 subunit of terminase - Prom 2713 - 2772 5.9 - Term 3805 - 3845 4.0 5 2 Tu 1 . - CDS 3869 - 4042 236 ## ECS88_5024 hypothetical protein - Prom 4149 - 4208 6.6 - Term 4479 - 4512 2.2 6 3 Tu 1 . - CDS 4716 - 4928 350 ## COG1278 Cold shock proteins 7 4 Op 1 . - CDS 5291 - 5773 273 ## ECUMN_1840 conserved hypothetical protein; Qin prophage 8 4 Op 2 . - CDS 5785 - 6201 249 ## COG3772 Phage-related lysozyme (muraminidase) 9 5 Op 1 . - CDS 6315 - 6626 330 ## JW1547 hypothetical protein 10 5 Op 2 . - CDS 6631 - 6837 262 ## ECUMN_1843 putative S lysis protein; Qin prophage - Prom 6949 - 7008 6.6 + Prom 6956 - 7015 8.6 11 6 Tu 1 . + CDS 7066 - 7236 63 ## + Term 7302 - 7327 -0.5 12 7 Tu 1 . - CDS 7600 - 7815 184 ## COG1278 Cold shock proteins + Prom 7969 - 8028 3.7 13 8 Op 1 . + CDS 8128 - 8328 142 ## COG1278 Cold shock proteins 14 8 Op 2 . + CDS 8383 - 8472 68 ## + Term 8617 - 8660 2.5 - Term 8605 - 8648 6.3 15 9 Op 1 . - CDS 8750 - 9502 255 ## JW1551 predicted antitermination protein Q 16 9 Op 2 . - CDS 9516 - 10469 417 ## ECIAI1_1603 conserved hypothetical protein; Qin prophage - Term 10769 - 10808 1.3 17 10 Tu 1 . 
- CDS 10912 - 11163 123 ## JW1553 hypothetical protein - Prom 11235 - 11294 7.5 - Term 11284 - 11309 -0.5 18 11 Tu 1 . - CDS 11380 - 11535 61 ## ECO26_1121 putative prophage maintenance protein 19 12 Op 1 2/1.000 - CDS 11607 - 11894 67 ## COG2026 Cytotoxic translational repressor of toxin-antitoxin stability system 20 12 Op 2 . - CDS 11894 - 12133 365 ## COG3077 DNA-damage-inducible protein J - Prom 12162 - 12221 3.3 21 13 Tu 1 . + CDS 12666 - 12993 330 ## JW1558 hypothetical protein Predicted protein(s) >gi|296494467|gb|ADTN01000271.1| GENE 1 262 - 837 356 191 aa, chain - ## HITS:1 COG:ybcX KEGG:ns NR:ns ## COG: ybcX COG5525 # Protein_GI_number: 16128544 # Func_class: R General function prediction only # Function: Bacteriophage tail assembly protein # Organism: Escherichia coli K12 # 50 191 103 244 247 267 94.0 7e-72 MAFRMSEQPRTIKIYNLLAGTNEFIGEGDAYIPPHTGLPANSTDIAPPDIPAGFVAVFNS DEASWHLVEDHRGKTVYDVASGDALFISELGPLPENVTWLSPGGEYQKWNGTAWVKDTEA EKLFRIREAEETKNSLMQIASEHIAPLQDAVDLEIATEEETLLLEAWKKYRVLLNRVDTS TAPDIEWPALP >gi|296494467|gb|ADTN01000271.1| GENE 2 837 - 1145 132 102 aa, chain - ## HITS:1 COG:no KEGG:EcolC_2111 NR:ns ## KEGG: EcolC_2111 # Name: not_defined # Def: prophage tail fibre domain-containing protein # Organism: E.coli_ATCC8739 # Pathway: not_defined # 2 102 907 1007 1007 120 97.0 2e-26 MGTAGAHTHSMTFVSGGSSGAPGSGSPDYSKYSVNTSSAGAHTHSVSGTAASAGAHAYTV GIGAHTHSVAIGSHGHTITVNAAGNAENTVKNIAFNYIVRLA >gi|296494467|gb|ADTN01000271.1| GENE 3 1112 - 2377 758 421 aa, chain - ## HITS:1 COG:ECs0842 KEGG:ns NR:ns ## COG: ECs0842 COG4733 # Protein_GI_number: 15830096 # Func_class: S Function unknown # Function: Phage-related protein, tail component # Organism: Escherichia coli O157:H7 # 1 414 287 700 1137 828 97.0 0 MGKRLGAADVDKWALYVIGQCCDQSVPDGFGGTEPRITCNAWLTTQRKAWDVLSDFCSAM RCMPVWNGQTLTFVQDRPSDKVWTYNRSNVVMPDDGAPFRYSFSALKDRHNAVEVNWIDP DNGWETATELVEDTQAIARYGRNVTKMDAFGCTRRGQAHRAGLWLIKTELLETQTVDFSV GAEGLRHVPGDVIEICDDDYAGISIGGRVLAVNSQTRTLTLDREITLPSSGTTLISLVDG 
QGNPVSVEVQSVTDGVKVKVSRVPDGVAEYSVWGLKLPTLRQRLFRCVSIRENDDGTYAI TAVQHVPEKEAIVDNGAHFDGDQSGTVNGVTPPAVQHLTAEVTADSGEYQVLARWDTPKV VKGVSFMLRLTVAADDGSERLVSTARTTETTYRFTQLALGNYRLTVRAVNAWGQPVHIPI R >gi|296494467|gb|ADTN01000271.1| GENE 4 2370 - 2627 142 85 aa, chain - ## HITS:1 COG:nohA KEGG:ns NR:ns ## COG: nohA COG4220 # Protein_GI_number: 16129507 # Func_class: L Replication, recombination and repair # Function: Phage DNA packaging protein, Nu1 subunit of terminase # Organism: Escherichia coli K12 # 1 84 1 84 189 147 92.0 7e-36 MEVNKKQLADIFGASIRTIQNWQEQGMPVLRGGGKGNEVLYDSAAVIKWYAERDAEIENE KLRREVEELRQASETDLQPPALRHG >gi|296494467|gb|ADTN01000271.1| GENE 5 3869 - 4042 236 57 aa, chain - ## HITS:1 COG:no KEGG:ECS88_5024 NR:ns ## KEGG: ECS88_5024 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 57 32 88 88 78 92.0 8e-14 MNIENLKTKAEADISEYITKKIIELKKKTGKEVTSIQFTAREKMTGLESYDVKINLI >gi|296494467|gb|ADTN01000271.1| GENE 6 4716 - 4928 350 70 aa, chain - ## HITS:1 COG:cspI KEGG:ns NR:ns ## COG: cspI COG1278 # Protein_GI_number: 16129511 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Escherichia coli K12 # 1 70 1 70 70 134 100.0 5e-32 MSNKMTGLVKWFNPEKGFGFITPKDGSKDVFVHFSAIQSNDFKTLTENQEVEFGIENGPK GPAAVHVVAL >gi|296494467|gb|ADTN01000271.1| GENE 7 5291 - 5773 273 160 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_1840 NR:ns ## KEGG: ECUMN_1840 # Name: ydfP # Def: conserved hypothetical protein; Qin prophage # Organism: E.coli_UMN026 # Pathway: not_defined # 1 160 6 165 165 259 100.0 2e-68 MVIFLVLSGFIVGNVWSDRGWQKKWAERDAAALSQEVNAQFAARIIEQGRTIARDEAVKD AQQKSAEISARAAYLSDSVNQLRAEAKKYAIRLDAAKHTADLAAAVRGKTTKTAEGMLTN MLGDIAAEAQLYAEIADERYIAGVTCQQIYESLRDKKHQM >gi|296494467|gb|ADTN01000271.1| GENE 8 5785 - 6201 249 138 aa, chain - ## HITS:1 COG:ydfQ KEGG:ns NR:ns ## COG: ydfQ COG3772 # Protein_GI_number: 16129513 # Func_class: R General function prediction only # Function: Phage-related lysozyme (muraminidase) # Organism: Escherichia 
coli K12 # 1 138 40 177 177 286 100.0 7e-78 MAYRDGSGIWTICRGATVVDGKTVFPNMKLSKEKCDQVNAIERDKALAWVERNIKVPLTE PQKAGIASFCPYNIGPGKCFPSTFYKRLNAGDRKGACEAIRWWIKDGGRDCRIRSNNCYG QVIRRDQESALTCWGIEQ >gi|296494467|gb|ADTN01000271.1| GENE 9 6315 - 6626 330 103 aa, chain - ## HITS:1 COG:no KEGG:JW1547 NR:ns ## KEGG: JW1547 # Name: ydfR # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 103 1 103 103 176 100.0 3e-43 MTQDYELVVKGVRNFENKVTVTVALQDKERFDGEIFDLDVAMDRVEGAALEFYEAAARRS VRQVFLEVAEKLSEKVESYLQHQYSFKIENPANKHERPHHKYL >gi|296494467|gb|ADTN01000271.1| GENE 10 6631 - 6837 262 68 aa, chain - ## HITS:1 COG:no KEGG:ECUMN_1843 NR:ns ## KEGG: ECUMN_1843 # Name: essQ # Def: putative S lysis protein; Qin prophage # Organism: E.coli_UMN026 # Pathway: not_defined # 1 68 4 71 71 125 100.0 6e-28 MDKLTTGVAYGTSAGNAGFWALQLLDKVTPSQWAAIGVLGSLVFGLLTYLTNLYFKIKED RRKAARGE >gi|296494467|gb|ADTN01000271.1| GENE 11 7066 - 7236 63 56 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSERKNSKSRRNYLVKCSCPNCTQESEHSFSRVQKGALLICPHCNKVFQTNLKAVA >gi|296494467|gb|ADTN01000271.1| GENE 12 7600 - 7815 184 71 aa, chain - ## HITS:1 COG:cspB KEGG:ns NR:ns ## COG: cspB COG1278 # Protein_GI_number: 16129516 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Escherichia coli K12 # 1 71 1 71 71 133 100.0 1e-31 MSNKMTGLVKWFNADKGFGFISPVDGSKDVFVHFSAIQNDNYRTLFEGQKVTFSIESGAK GPAAANVIITD >gi|296494467|gb|ADTN01000271.1| GENE 13 8128 - 8328 142 66 aa, chain + ## HITS:1 COG:cspF KEGG:ns NR:ns ## COG: cspF COG1278 # Protein_GI_number: 16129517 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Escherichia coli K12 # 1 66 5 70 70 128 100.0 3e-30 MTGIVKTFDGKSGKGLITPSDGRIDVQLHVSALNLRDAEEITTGLRVEFCRINGLRGPSA ANVYLS >gi|296494467|gb|ADTN01000271.1| GENE 14 8383 - 8472 68 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNNPVCLDDWLIGFKSLCCTLAVIALLII >gi|296494467|gb|ADTN01000271.1| GENE 15 8750 - 9502 255 250 aa, chain - ## HITS:1 COG:no KEGG:JW1551 NR:ns ## KEGG: JW1551 # Name: 
ydfT # Def: predicted antitermination protein Q # Organism: E.coli_J # Pathway: not_defined # 1 250 1 250 250 522 100.0 1e-147 MNLEALPKYYSPKSPKLSDDAPATGTGCLTITDVMAAQGMVQSKAPLGLALFLAKVGVQD PQFAIEGLLNYAMALDNPTLNKLSEEIRLQIIPYLVSFAFADYSRSAASKARCEHCSGTG FYNVLREVVKHYRRGESVIKEEWVKELCQHCHGKGEASTACRGCKGKGIVLDEKRTRFHG VPVYKICGRCNGNRFSRLPTTLARRHVQKLVPDLTDYQWYKGYADVIGKLVTKCWQEEAY AEAQLRKVTR >gi|296494467|gb|ADTN01000271.1| GENE 16 9516 - 10469 417 317 aa, chain - ## HITS:1 COG:no KEGG:ECIAI1_1603 NR:ns ## KEGG: ECIAI1_1603 # Name: ydfU # Def: conserved hypothetical protein; Qin prophage # Organism: E.coli_IAI1 # Pathway: not_defined # 1 317 33 349 349 654 99.0 0 MLVEPEPKSMRNLPSGVVPAVRQPLAEDKTLLPFFSNERVIRAAGGVGALSDWLLRHVTS CQWPNGDYHHTETVIHRYGTGAMVLCWHCDNQLRDQTSESLELLAQQNLTAWVIDVIRHA ISGTQERELSLAELSWWAVCNQVVDALPEAVSRRSLGLPAEKICSVYRESDIVPGEQTAT SILKQRTKNLAPLPYAHQQQKSPQEKTVVSITVDPESPESFMKLPKRRRWVKEKYTRWVK TQPCACCGMPADDPHHLIGHGQGGMGTKAHDLFVLPLCRKHHNELHTDTVAFEDKYGSQL ELIFRFIDRALAIGVLA >gi|296494467|gb|ADTN01000271.1| GENE 17 10912 - 11163 123 83 aa, chain - ## HITS:1 COG:no KEGG:JW1553 NR:ns ## KEGG: JW1553 # Name: rem # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 83 1 83 83 159 100.0 2e-38 MMNIEELRKIFCEDGLYAVCVENGNLVSHYRIMCLRKNGAALINFVDARVTDGFILREGE FVTSLQALKEIGIKAGFSAFSGE >gi|296494467|gb|ADTN01000271.1| GENE 18 11380 - 11535 61 51 aa, chain - ## HITS:1 COG:no KEGG:ECO26_1121 NR:ns ## KEGG: ECO26_1121 # Name: not_defined # Def: putative prophage maintenance protein # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 51 20 70 70 80 100.0 1e-14 MKQQKAMLIALIVICLTVIVTALVTRKDLCEVRIRTGQTEVAVFTAYEPEE >gi|296494467|gb|ADTN01000271.1| GENE 19 11607 - 11894 67 95 aa, chain - ## HITS:1 COG:relE KEGG:ns NR:ns ## COG: relE COG2026 # Protein_GI_number: 16129522 # Func_class: J Translation, ribosomal structure and biogenesis; D Cell cycle control, cell division, chromosome partitioning # Function: Cytotoxic translational repressor of toxin-antitoxin stability 
system # Organism: Escherichia coli K12 # 1 95 1 95 95 157 100.0 5e-39 MAYFLDFDERALKEWRKLGSTVREQLKKKLVEVLESPRIEANKLRGMPDCYKIKLRSSGY RLVYQVIDEKVVVFVISVGKRERSEVYSEAVKRIL >gi|296494467|gb|ADTN01000271.1| GENE 20 11894 - 12133 365 79 aa, chain - ## HITS:1 COG:relB KEGG:ns NR:ns ## COG: relB COG3077 # Protein_GI_number: 16129523 # Func_class: L Replication, recombination and repair # Function: DNA-damage-inducible protein J # Organism: Escherichia coli K12 # 1 79 1 79 79 127 100.0 6e-30 MGSINLRIDDELKARSYAALEKMGVTPSEALRLMLEYIADNERLPFKQTLLSDEDAELVE IVKERLRNPKPVRVTLDEL >gi|296494467|gb|ADTN01000271.1| GENE 21 12666 - 12993 330 109 aa, chain + ## HITS:1 COG:no KEGG:JW1558 NR:ns ## KEGG: JW1558 # Name: flxA # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 109 1 109 110 132 100.0 3e-30 MSVTIQGNTSTVISNNSAPEGTSEIAKITRQIQVLTEKLGKISSEEGMTTQQKKEMAALV QKQIESLWAQLEQLLRQQAEKKNEDATVQPDKKEEKKDDTNTAGTIDIY Prediction of potential genes in microbial genomes Time: Mon May 16 00:05:55 2011 Seq name: gi|296494466|gb|ADTN01000272.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont710.2, whole genome shotgun sequence Length of sequence - 34562 bp Number of predicted genes - 37, with homology - 31 Number of transcription units - 24, operones - 8 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 194 - 243 4.2 1 1 Tu 1 . - CDS 415 - 1410 288 ## EcolC_2069 integrase catalytic region - Term 1724 - 1757 2.1 2 2 Tu 1 . - CDS 1784 - 1861 81 ## - Prom 1972 - 2031 3.3 3 3 Tu 1 . - CDS 2063 - 2242 58 ## c1418 hypothetical protein - Prom 2358 - 2417 3.4 4 4 Tu 1 . - CDS 3395 - 3751 174 ## ECIAI39_1489 conserved hypothetical protein; putative methyltransferase (fragment) - Prom 3781 - 3840 1.8 5 5 Tu 1 . + CDS 4004 - 5971 1087 ## ECIAI1_1626 putative exonuclease from phage origin + Term 5973 - 6009 -0.2 - Term 5788 - 5822 0.3 6 6 Tu 1 . - CDS 5968 - 6060 127 ## - Prom 6109 - 6168 5.0 7 7 Op 1 . 
+ CDS 6044 - 6295 189 ## ECED1_1745 putative excisionase 8 7 Op 2 . + CDS 6330 - 7610 220 ## COG0582 Integrase - Term 7543 - 7583 -0.0 9 8 Op 1 7/0.167 - CDS 7630 - 7740 81 ## COG0477 Permeases of the major facilitator superfamily 10 8 Op 2 2/0.833 - CDS 7798 - 8817 1008 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 11 8 Op 3 2/0.833 - CDS 8829 - 10043 1272 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily - Prom 10181 - 10240 4.0 12 9 Tu 1 . - CDS 10249 - 10575 219 ## COG1742 Uncharacterized conserved protein - Prom 10699 - 10758 4.6 + Prom 10551 - 10610 4.6 13 10 Op 1 . + CDS 10710 - 11051 286 ## B21_01542 hypothetical protein 14 10 Op 2 . + CDS 11086 - 11646 484 ## PROTEIN SUPPORTED gi|116490772|ref|YP_810316.1| acetyltransferase - Term 11599 - 11638 -0.7 15 11 Tu 1 . - CDS 11649 - 12272 506 ## ECBD_2061 hypothetical protein - Prom 12404 - 12463 3.4 + Prom 12386 - 12445 2.3 16 12 Tu 1 . + CDS 12467 - 12772 216 ## ECSE_1707 hypothetical protein + Term 12789 - 12824 3.7 + Prom 12886 - 12945 5.4 17 13 Op 1 5/0.167 + CDS 12971 - 15397 2121 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing 18 13 Op 2 16/0.000 + CDS 15458 - 17881 1784 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing 19 13 Op 3 9/0.167 + CDS 17892 - 18509 458 ## COG0437 Fe-S-cluster-containing hydrogenase components 1 20 13 Op 4 4/0.667 + CDS 18511 - 19365 668 ## COG3302 DMSO reductase anchor subunit + Term 19373 - 19403 3.0 21 13 Op 5 . + CDS 19408 - 20022 593 ## COG3381 Uncharacterized component of anaerobic dehydrogenases + Prom 20032 - 20091 3.0 22 14 Tu 1 . + CDS 20216 - 21472 1113 ## COG0038 Chloride channel protein EriC - Term 21343 - 21381 7.6 23 15 Tu 1 . 
- CDS 21425 - 22120 794 ## COG0132 Dethiobiotin synthetase - Prom 22140 - 22199 4.7 - Term 22138 - 22203 5.1 24 16 Op 1 5/0.167 - CDS 22245 - 23465 274 ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 - Prom 23503 - 23562 1.6 25 16 Op 2 . - CDS 23600 - 24493 204 ## PROTEIN SUPPORTED gi|149913192|ref|ZP_01901726.1| 50S ribosomal protein L35 - Prom 24517 - 24576 4.8 + Prom 24473 - 24532 5.3 26 17 Tu 1 . + CDS 24600 - 25853 993 ## COG0477 Permeases of the major facilitator superfamily + Term 25861 - 25902 7.3 - Term 25846 - 25893 11.2 27 18 Tu 1 . - CDS 25902 - 26006 75 ## - Prom 26034 - 26093 3.6 + Prom 26081 - 26140 6.5 28 19 Op 1 . + CDS 26277 - 26585 239 ## 29 19 Op 2 . + CDS 26678 - 26761 107 ## + Prom 26764 - 26823 1.5 30 19 Op 3 . + CDS 26861 - 27682 547 ## COG3591 V8-like Glu-specific endopeptidase + Term 27691 - 27726 6.4 - Term 27679 - 27714 6.4 31 20 Op 1 12/0.167 - CDS 27721 - 28050 442 ## COG2076 Membrane transporters of cations and cationic drugs 32 20 Op 2 . - CDS 28037 - 28402 467 ## COG2076 Membrane transporters of cations and cationic drugs - Prom 28606 - 28665 4.5 + Prom 28607 - 28666 5.1 33 21 Tu 1 . + CDS 28814 - 29848 1069 ## COG0628 Predicted permease 34 22 Op 1 17/0.000 - CDS 29867 - 31261 1436 ## COG1282 NAD/NADP transhydrogenase beta subunit 35 22 Op 2 . - CDS 31272 - 32804 1618 ## COG3288 NAD/NADP transhydrogenase alpha subunit - Prom 32890 - 32949 4.5 + Prom 33150 - 33209 5.0 36 23 Tu 1 . + CDS 33328 - 34272 1031 ## EC55989_1769 hypothetical protein + Term 34293 - 34321 1.3 + Prom 34321 - 34380 4.6 37 24 Tu 1 . 
+ CDS 34458 - 34560 76 ## Predicted protein(s) >gi|296494466|gb|ADTN01000272.1| GENE 1 415 - 1410 288 331 aa, chain - ## HITS:1 COG:no KEGG:EcolC_2069 NR:ns ## KEGG: EcolC_2069 # Name: not_defined # Def: integrase catalytic region # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 315 107 421 437 629 99.0 1e-179 MIEEGLWRERRRKIARIYQRRMRRPSYGELIQIDGSPHDWFENRGPRCTLIVFIDDATSA LMALRFVPAETTRAYMETLRGYLNDHGVPLALYSDRHSIFRVNNPEREGELTQFTRAIKT LGIEPIHANSPQAKGRVERANQTLQDRLVKEMRLQNISDIETANAWLPTFIEAYNNRFAT SPRTTDNAHLDVHHSEEELGYIFSLQAKRVLSKNLTFQYKSSAFQVRSEGRGYRLRHSVV TVCENFDGEINVLYDGKALGWEKYVDGPEPIPLDDEKSVHERVDNARIDLRSKYYVKPKA DHPWLTRRTQSHQQVKPPKLPKKKPDPDKKD >gi|296494466|gb|ADTN01000272.1| GENE 2 1784 - 1861 81 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLDTCRLASYVPKGTEKQAIVNTKL >gi|296494466|gb|ADTN01000272.1| GENE 3 2063 - 2242 58 59 aa, chain - ## HITS:1 COG:no KEGG:c1418 NR:ns ## KEGG: c1418 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_CFT073 # Pathway: not_defined # 1 59 14 72 72 85 98.0 8e-16 MKNALQFLFVAFWLFASCMPIIFTARYMEKVDVLILIFGYINALFLGVFMAVMCIEYWR >gi|296494466|gb|ADTN01000272.1| GENE 4 3395 - 3751 174 118 aa, chain - ## HITS:1 COG:no KEGG:ECIAI39_1489 NR:ns ## KEGG: ECIAI39_1489 # Name: not_defined # Def: conserved hypothetical protein; putative methyltransferase (fragment) # Organism: E.coli_IAI39 # Pathway: not_defined # 1 118 1 118 118 234 100.0 6e-61 MSAPATILDMCCGSRMFWFDKSDERAIFSDIRKEGYTLRNGRRLIISPDIIADFRALSFA DASFSMVVLDPPHLESVGDNAWMGKKYGRLNKDAWRDDSRQRFKEAFRVLRPHGVLIF >gi|296494466|gb|ADTN01000272.1| GENE 5 4004 - 5971 1087 655 aa, chain + ## HITS:1 COG:no KEGG:ECIAI1_1626 NR:ns ## KEGG: ECIAI1_1626 # Name: not_defined # Def: putative exonuclease from phage origin # Organism: E.coli_IAI1 # Pathway: not_defined # 1 655 181 835 835 1272 99.0 0 MHAIGWVKHKCIPGAKWPEIQAEMRIWKKRREGERKETGKYTSVVDLARARANQQYTENS TGKISPVIAAIHREYKQTWKTLDDELAYALWPSDVDAGNIDGSIHRWAKKEVIDNDREDW KRISASMRKQPDALRYDRQTIFGLVRERPIDIHKDPVALNKYICEYLTTKGVFENEETDL 
GTVDVLQSSETQTDAVETEVSDIPKNETAPEAEPSVEREGPFYFLFADKDGEKYGRANKL SGLDKALAAGATEITKEEYFARKNGTYTGLPQNVDTAEDSEQPEPIKVTADEVNKIMQAA NISQPDADKLLAASRGEFVEGISDPNDPKWVKGLQTRDSVNQNQHESERNYQKAEQNSPN ALQNEPETKQPEPVAQQEVEKVCTACGQTGGGNCPDCGAVMGDATYQETFDEEYQVEVQE DDPEEMEGAEHPHKENTGGNQHHNSDNETGETADHPIKVNGHHEITSTSRTCDHLMIDLE TMGKNPDAPIISIGAIFFDPQTGDMGPEFSKTIDLETAGGVIDRDTIKWWLKQSREAQSA IMTDEIPLDDALLQLREFIDENSGEFFVQVWGNGANFDNTILRRSYERQGIPCPWRYYND RDVRTIVELGKAIDFDARTAIPFEGERHNALDDARYQAKYVSVIWQKLIPSQADS >gi|296494466|gb|ADTN01000272.1| GENE 6 5968 - 6060 127 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MITSLIDFFISYIRRAPVAEYQSQPATVEH >gi|296494466|gb|ADTN01000272.1| GENE 7 6044 - 6295 189 83 aa, chain + ## HITS:1 COG:no KEGG:ECED1_1745 NR:ns ## KEGG: ECED1_1745 # Name: not_defined # Def: putative excisionase # Organism: E.coli_ED1a # Pathway: not_defined # 1 82 25 106 107 161 100.0 6e-39 MSEVIMIVSPGKWVSEEQLIALKGIKKGTLKKAREKSFMEGREYKHVAHDGMPWDNSPCF YNLEEIDRWIERQASARPRRHLT >gi|296494466|gb|ADTN01000272.1| GENE 8 6330 - 7610 220 426 aa, chain + ## HITS:1 COG:intQ KEGG:ns NR:ns ## COG: intQ COG0582 # Protein_GI_number: 16129537 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Escherichia coli K12 # 31 426 2 398 398 790 97.0 0 MKYPTGVENHGGKLRIWFVYKDVRVRENLGVPDTAKNRRVAGELRSSVCYAIKTGVFDYA KQFPSSRNLEKFGEARQDLTIKELAEKFLALKETEVAKTSLNTYRAVIKNILSIIGEKNL ASSINKEKLLEVRKELLTGYQIPKSNYIVTQPGRSAVTVNNYMTNLNAVFQFGVDNGYLA DNPFKGISPLKESRTIPDPLSREEFIRLIDACRNQQAKNLWCVSVYTGVRPGELCALGWE DIDLKNGTMMIRRNLAKDRFTVPKTQAGTNRVIHLIKPAIDALRSQMTLTRLSKEHIIDV HLREYGRTEKQKCTFVFQPEVSARVKNYGDHFTVDSIRQMWDAAIKRAGLRHRKSYQSRH TYACWSLTAGANPAFIANQMGHADAQMVFQVYGKWMSENNNAQVALLNTQLSEFAPTMPH NEAMKN >gi|296494466|gb|ADTN01000272.1| GENE 9 7630 - 7740 81 36 aa, chain - ## HITS:1 COG:STM1507 KEGG:ns NR:ns ## COG: STM1507 COG0477 # Protein_GI_number: 16764852 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General 
function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium LT2 # 1 35 1 35 325 60 77.0 6e-10 MTIEKHERSTKDLVKAAVSGWLGTALEFMDFKSHAC >gi|296494466|gb|ADTN01000272.1| GENE 10 7798 - 8817 1008 339 aa, chain - ## HITS:1 COG:rspB KEGG:ns NR:ns ## COG: rspB COG1063 # Protein_GI_number: 16129538 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 339 1 339 339 682 100.0 0 MKSILIEKPNQLAIVEREIPTPSAGEVRVKVKLAGICGSDSHIYRGHNPFAKYPRVIGHE FFGVIDAVGEGVESARVGERVAVDPVVSCGHCYPCSIGKPNVCTTLAVLGVHADGGFSEY AVVPAKNAWKIPEAVADQYAVMIEPFTIAANVTGHGQPTENDTVLVYGAGPIGLTIVQVL KGVYNVKNVIVADRIDERLEKAKESGADWAINNSQTPLGEIFTEKGIKPTLIIDAACHPS ILKEAVTLASPAARIVLMGFSSEPSEVIQQGITGKELSIFSSRLNANKFPIVIDWLSKGL IKPEKLITHTFDFQHVADAISLFEQDQKHCCKVLLTFSE >gi|296494466|gb|ADTN01000272.1| GENE 11 8829 - 10043 1272 404 aa, chain - ## HITS:1 COG:rspA KEGG:ns NR:ns ## COG: rspA COG4948 # Protein_GI_number: 16129539 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Escherichia coli K12 # 1 404 1 404 404 868 100.0 0 MKIVKAEVFVTCPGRNFVTLKITTEDGITGLGDATLNGRELSVASYLQDHLCPQLIGRDA HRIEDIWQFFYKGAYWRRGPVTMSAISAVDMALWDIKAKAANMPLYQLLGGASREGVMVY CHTTGHSIDEALDDYARHQELGFKAIRVQCGIPGMKTTYGMSKGKGLAYEPATKGQWPEE QLWSTEKYLDFMPKLFDAVRNKFGFNEHLLHDMHHRLTPIEAARFGKSIEDYRMFWMEDP TPAENQECFRLIRQHTVTPIAVGEVFNSIWDCKQLIEEQLIDYIRTTLTHAGGITGMRRI ADFASLYQVRTGSHGPSDLSPVCMAAALHFDLWVPNFGVQEYMGYSEQMLEVFPHNWTFD NGYMHPGDKPGLGIEFDEKLAAKYPYEPAYLPVARLEDGTLWNW >gi|296494466|gb|ADTN01000272.1| GENE 12 10249 - 10575 219 108 aa, chain - ## HITS:1 COG:ynfA KEGG:ns NR:ns ## COG: ynfA COG1742 # Protein_GI_number: 16129540 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli 
K12 # 1 108 1 108 108 164 100.0 3e-41 MIKTTLLFFATALCEIIGCFLPWLWLKRNASIWLLLPAGISLALFVWLLTLHPAASGRVY AAYGGVYVCTALMWLRVVDGVKLTLYDWTGALIALCGMLIIVAGWGRT >gi|296494466|gb|ADTN01000272.1| GENE 13 10710 - 11051 286 113 aa, chain + ## HITS:1 COG:no KEGG:B21_01542 NR:ns ## KEGG: B21_01542 # Name: ynfB # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 113 1 113 113 204 100.0 8e-52 MKITLSKRIGLLAILLPCALALSTTVHAETNKLVIESGDSAQSRQHAAMEKEQWNDTRNL RQKVNKRTEKEWDKADAAFDNRDKCEQSANINAYWEPNTLRCLDRRTGRVITP >gi|296494466|gb|ADTN01000272.1| GENE 14 11086 - 11646 484 186 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116490772|ref|YP_810316.1| acetyltransferase [Oenococcus oeni PSU-1] # 7 168 3 164 167 191 54 7e-48 MPSAHSVKLRPLEREDLRYVHQLDNNASVMRYWFEEPYEAFVELSDLYDKHIHDQSERRF VVECDGEKAGLVELVEINHVHRRAEFQIIISPEYQGKGLATRAAKLAMDYGFTVLNLYKL YLIVDKENEKAIHIYRKLGFSVEGELMHEFFINGQYRNAIRMCIFQHQYLAEHKTPGQTL LKPTAQ >gi|296494466|gb|ADTN01000272.1| GENE 15 11649 - 12272 506 207 aa, chain - ## HITS:1 COG:no KEGG:ECBD_2061 NR:ns ## KEGG: ECBD_2061 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_BL21_DE3 # Pathway: not_defined # 1 207 42 248 248 406 100.0 1e-112 MASFSNEFDFDPLRGPVKDFTQTLMDEQGEVTKRVSGTLSEEGCFDSLELLDLENNTVVA LVLDANYYRDAETLEKRVRLQGKCQLAELPSAGVSWETDDNGFVIKASSKQMQMEYRYDD QGYPLGKTTKSNDKTLSVSATPSTDPIKKLDYTAVTLLNNQRVGNVKQSCEYDSHANPVD CQLIIVDEGVKPAVERVYTIKNTIDYY >gi|296494466|gb|ADTN01000272.1| GENE 16 12467 - 12772 216 101 aa, chain + ## HITS:1 COG:no KEGG:ECSE_1707 NR:ns ## KEGG: ECSE_1707 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 101 2 102 102 154 98.0 9e-37 MKLSTCCAALLLALASPAVLAAPGSCERIQSDISQRIINNGVPESSFTLSIVPNDQVDQP DSQVVGHCANDTHKILYTRTTSGNVSAPAQSSQDGAPAEPQ >gi|296494466|gb|ADTN01000272.1| GENE 17 12971 - 15397 2121 808 aa, chain + ## HITS:1 COG:ynfE KEGG:ns NR:ns ## COG: ynfE COG0243 # Protein_GI_number: 16129545 # Func_class: C Energy production and conversion # Function: Anaerobic 
dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 808 1 808 808 1681 100.0 0 MSKNERMVGISRRTLVKSTAIGSLALAAGGFSLPFTLRNAAAAVQQAREKVVWGACSVNC GSRCALRLHVKDNEVTWVETDNTGSDEYGNHQVRACLRGRSIRRRINHPDRLNYPMKRVG KRGEGKFERISWDEALDTIASSLKKTVEQYGNEAVYIQYSSGIVGGNMTRSSPSASAVKR LMNCYGGSLNQYGSYSTAQISCAMPYTYGSNDGNSTTDIENSKLVVMFGNNPAETRMSGG GITYLLEKAREKSNAKMIVIDPRYTDTAAGREDEWLPIRPGTDAALVAGIAWVLINENLV DQPFLDKYCVGYDEKTLPADAPKNGHYKAYILGEGDDKTAKTPQWASQITGIPEDRIIKL AREIGTAKPAYICQGWGPQRQANGELTARAIAMLPILTGNVGISGGNSGARESTYTITIE RLPVLDNPVKTSISCFSWTDAIDHGPQMTAIRDGVRGKDKLDVPIKFIWNYAGNTLVNQH SDINKTHEILQDESKCEMIVVIENFMTSSAKYADILLPDLMTVEQEDIIPNDYAGNMGYL IFLQPVTSEKFERKPIYWILSEVAKRLGPDVYQKFTEGRTQEQWLQHLYAKMLAKDPALP SYDELKKMGIYKRKDPNGHFVAYKAFRDDPEANPLKTPSGKIEIYSSRLAEIARTWELEK DEVISPLPVYASTFEGWNSPERRTFPLQLFGFHYKSRTHSTYGNIDLLKAACRQEVWINP IDAQKRGIANGDMVRVFNHRGEVRLPAKVTPRILPGVSAMGQGAWHEANMSGDKIDHGGC VNTLTTLRPSPLAKGNPQHTNLVEIEKI >gi|296494466|gb|ADTN01000272.1| GENE 18 15458 - 17881 1784 807 aa, chain + ## HITS:1 COG:ynfF KEGG:ns NR:ns ## COG: ynfF COG0243 # Protein_GI_number: 16129546 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 807 2 808 808 1702 100.0 0 MKIHTTEALMKAEISRRSLMKTSALGSLALASSAFTLPFSQMVRAAEAPVEEKAVWSSCT VNCGSRCLLRLHVKDDTVYWVESDTTGDDVYGNHQVRACLRGRSIRRRMNHPDRLKYPMK RVGKRGEGKFERISWDEALDTISDNLRRILKDYGNEAVHVLYGTGVDGGNITNSNVPYRL MNSCGGFLSRYGSYSTAQISAAMSYMFGANDGNSPDDIANTKLVVMFGNNPAETRMSGGG VTYYVEQARERSNARMIVIDPRYNDTAAGREDEWLPIRPGTDGALACAIAWVLITENMVD QPFLDKYCVGYDEKTLPANAPRNAHYKAYILGEGPDGIAKTPEWAAKITSIPAEKIIQLA REIGSAKPAYICQGWGPQRHSNGEQTSRAIAMLSVLTGNVGINGGNSGVREGSWDLGVEW FPMLENPVKTQISVFTWTDAIDHGTEMTATRDGVRGKEKLDVPIKFLWCYASNTLINQHG DINHTHEVLQDDSKCEMIVGIDHFMTASAKYCDILLPDLMPTEQEDLISHESAGNMGYVI LAQPATSAKFERKPIYWMLSEVAKRLGPDVYQTFTEGRSQHEWIKYLHAKTKERNPEMPD YEEMKTTGIFKKKCPEEHYVAFRAFREDPQANPLKTPSGKIEIYSERLAKIADTWELKKD 
EIIHPLPAYTPGFDGWDDPLRKTYPLQLTGFHYKARTHSSYGNIDVLQQACPQEVWINPI DAQARGIRHGDTVRVFNNNGEMLIAAKVTPRILPGVTAIGQGAWLKADMFGDRVDHGGSI NILTSHRPSPLAKGNPSHSNLVQIEKV >gi|296494466|gb|ADTN01000272.1| GENE 19 17892 - 18509 458 205 aa, chain + ## HITS:1 COG:ECs2295 KEGG:ns NR:ns ## COG: ECs2295 COG0437 # Protein_GI_number: 15831549 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli O157:H7 # 1 205 1 205 205 416 100.0 1e-116 MTTQYGFFIDSSRCTGCKTCELACKDFKDLGPEVSFRRIYEYAGGDWQEDNGVWHQNVFA YYLSISCNHCDDPACTKVCPSGAMHKREDGFVVVDEDVCIGCRYCHMACPYGAPQYNAEK GHMTKCDGCYSRVAEGKQPICVESCPLRALEFGPIEELRQKHGTLAAVAPLPRAHFTKPN IVIKPNANSRPTGDTTGYLANPEEV >gi|296494466|gb|ADTN01000272.1| GENE 20 18511 - 19365 668 284 aa, chain + ## HITS:1 COG:ynfH KEGG:ns NR:ns ## COG: ynfH COG3302 # Protein_GI_number: 16129548 # Func_class: R General function prediction only # Function: DMSO reductase anchor subunit # Organism: Escherichia coli K12 # 1 284 1 284 284 453 100.0 1e-127 MGNGWHEWPLVIFTVLGQCVVGALIVSGIGWFAAKNDADRQRIVRGMFFLWLLMGVGFIA SVMHLGSPLRAFNSLNRIGASGLSNEIAAGSIFFAVGGLWWLVAVIGKMPQALGKLWLLF SMALGVIFVWMMTCVYQIDTVPTWHNGYTTLAFFLTVLLSGPILAAAILRAARVTFNTTP FAIISVLALIACAGVIVLQGLSLASIHSSVQQASALVPDYASLQVWRVVLLCAGLGCWLC PLIRRREPHVAGLILGLILILGGEMIGRVLFYGLHMTVGMAIAG >gi|296494466|gb|ADTN01000272.1| GENE 21 19408 - 20022 593 204 aa, chain + ## HITS:1 COG:ECs2297 KEGG:ns NR:ns ## COG: ECs2297 COG3381 # Protein_GI_number: 15831551 # Func_class: R General function prediction only # Function: Uncharacterized component of anaerobic dehydrogenases # Organism: Escherichia coli O157:H7 # 1 204 4 207 207 371 100.0 1e-103 MTHFSQQDNFSVAARVLGALFYYAPESAEAAPLVAVLTSDGWETQWPLPEASLAPLVTAF QTQCEETHAQAWQRLFVGPWALPSPPWGSVWLDRESVLFGDSTLALRQWMREKGIQFEMK QNEPEDHFGSLLLMAAWLAENGRQTECEELLAWHLFPWSTRFLDVFIEKAEHPFYRALGE LARLTLAQWQSQLLIPVAVKPLFR >gi|296494466|gb|ADTN01000272.1| GENE 22 20216 - 21472 1113 418 aa, chain + ## HITS:1 COG:ynfJ KEGG:ns NR:ns ## COG: ynfJ 
COG0038 # Protein_GI_number: 16129550 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Escherichia coli K12 # 1 418 21 438 438 684 99.0 0 MFRRLLIATVVGILAAFAVAGFRHAMLLLEWLFLNNDSGSLVNAATNLSPWRRLLTPALG GLAAGLLLMGWQKFTQQRPHAPTDYMEALQTDGQFDYAASLVKSLASLLVVTSGSAIGRE GAMILLAALAASCFAQRFTPRQEWKLWIACGAAAGMAAAYRAPLAGSLFIAEVLFGTMML ASLGPVIISAVVALLVSNLINHSDALLYNVQLSVTVQARDYALIISTGVLAGLCGPLLLT LMNACHRGFVSLKLAPPWQLALGGLIVGLLSLFTPAVWGNGYSTVQSFLTAPPLLMIIAG IFLCKLCAVLASSGSGAPGGVFTPTLFIGLAIGMLYGRSLGLWFPDGEEITLLLGLTGMA TLLAATTHAPIMSTLMICEMTGEYQLLPGLLIACVIASVISRTLHRDSIYRQHTAQHS >gi|296494466|gb|ADTN01000272.1| GENE 23 21425 - 22120 794 231 aa, chain - ## HITS:1 COG:ECs2299 KEGG:ns NR:ns ## COG: ECs2299 COG0132 # Protein_GI_number: 15831553 # Func_class: H Coenzyme transport and metabolism # Function: Dethiobiotin synthetase # Organism: Escherichia coli O157:H7 # 1 231 5 235 235 444 100.0 1e-125 MLKRFFITGTDTSVGKTVVSRALLQALASQGKTVAGYKPVAKGSKETPEGLRNKDALVLQ SVSTIELPYEAVNPIALSEEESSVAHSCPINYTLISNGLANLTEKVDHVVVEGTGGWRSL MNDLRPLSEWVVQEQLPVLMVVGIQEGCINHALLTAQAIANDGLPLIGWVANRINPGLAH YAEIIDVLGKKLPAPLIGELPYLPRAEQRELGQYIRLAMLRSVLAVDRVTV >gi|296494466|gb|ADTN01000272.1| GENE 24 22245 - 23465 274 406 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 146 394 63 317 323 110 28 2e-23 MVAENQPGHIDQIKQTNAGAVYRLIDQLGPVSRIDLSRLAQLAPASITKIVREMLEAHLV QELEIKEAGNRGRPAVGLVVETEAWHYLSLRISRGEIFLALRDLSSKLVVEESQELALKD DLPLLDRIISHIDQFFIRHQKKLERLTSIAITLPGIIDTENGIVHRMPFYEDVKEMPLGE ALEQHTGVPVYIQHDISAWTMAEALFGASRGARDVIQVVIDHNVGAGVITDGHLLHAGSS SLVEIGHTQVDPYGKRCYCGNHGCLETIASVDSILELAQLRLNQSMSSMLHGQPLTVDSL CQAALRGDLLAKDIITGVGAHVGRILAIMVNLFNPQKILIGSPLSKAADILFPVISDSIR QQALPAYSQHISVESTQFSNQGTMAGAALVKDAMYNGSLLIRLLQG >gi|296494466|gb|ADTN01000272.1| GENE 25 23600 - 24493 204 297 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149913192|ref|ZP_01901726.1| 50S ribosomal protein L35 [Roseobacter sp. 
AzwK-3b] # 3 245 1 239 305 83 26 2e-15 MNIELRHLRYFVAVAEELHFGRAAARLNISQPPLSQQIQALEQQIGARLLARTNRSVLLT AAGKQFLADSRQILSMVDDAAARAERLHQGEAGELRIGFTSSAPFIRAVSDTLSLFRRDY PDVHLQTREMNTREQIAPLIEGTLDMGLLRNTALPESLEHAVIVHEPLMAMIPHDHPLAN NPNVTLAELAKEPFVFFDPHVGTGLYDDILGLMRRYHLTPVITQEVGEAMTIIGLVSAGL GVSILPASFKRVQLNEMRWVPIAEEDAVSEMWLVWPKHHEQSPAARNFRIHLLNALR >gi|296494466|gb|ADTN01000272.1| GENE 26 24600 - 25853 993 417 aa, chain + ## HITS:1 COG:ynfM KEGG:ns NR:ns ## COG: ynfM COG0477 # Protein_GI_number: 16129554 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 417 1 417 417 729 100.0 0 MSRTTTVDGAPASDTDKQSISQPNQFIKRGTPQFMRVTLALFSAGLATFALLYCVQPILP VLSQEFGLTPANSSISLSISTAMLAIGLLFTGPLSDAIGRKPVMVTALLLASICTLLSTM MTSWHGILIMRALIGLSLSGVAAVGMTYLSEEIHPSFVAFSMGLYISGNSIGGMSGRLIS GVFTDFFNWRIALAAIGCFALASALMFWKILPESRHFRPTSLRPKTLFINFRLHWRDRGL PLLFAEGFLLMGSFVTLFNYIGYRLMLSPWHVSQAVVGLLSLAYLTGTWSSPKAGTMTTR YGRGPVMLFSTGVMLFGLLMTLFSSLWLIFAGMLLFSAGFFAAHSVASSWIGPRAKRAKG QASSLYLFSYYLGSSIAGTLGGVFWHNYGWNGVGAFIALMLVIALLVGTRLHRRLHA >gi|296494466|gb|ADTN01000272.1| GENE 27 25902 - 26006 75 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNPFIWVLIILMTLDALRELAGASSILGWLLTLV >gi|296494466|gb|ADTN01000272.1| GENE 28 26277 - 26585 239 102 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKVLALVVAAAMGLSSAAFAAETTTTPAPTATTTKAAPAKTTHHKKQHKAAPAQKAQAA KKHHKNTKAEQKAPEQKAQAAKKHAKKHSHQQPAKPAAQPAA >gi|296494466|gb|ADTN01000272.1| GENE 29 26678 - 26761 107 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVGRYRFEFILIILILCALIAARFYLS >gi|296494466|gb|ADTN01000272.1| GENE 30 26861 - 27682 547 273 aa, chain + ## HITS:1 COG:ydgD KEGG:ns NR:ns ## COG: ydgD COG3591 # Protein_GI_number: 16129556 # Func_class: E Amino acid transport and metabolism # Function: V8-like Glu-specific endopeptidase # Organism: Escherichia coli K12 # 1 273 1 
273 273 519 100.0 1e-147 MRTTIAVVLGAISLTSAFVFADKPDVARSANDEVSTLFFGHDDRVPVNDTTQSPWDAVGQ LETASGNLCTATLIAPNLALTAGHCLLTPPKGKADKAVALRFVSNKGLWRYEIHDIEGRV DPTLGKRLKADGDGWIVPPAAAPWDFGLIVLRNPPSGITPLPLFEGDKAALTAALKAAGR KVTQAGYPEDHLDTLYSHQNCEVTGWAQTSVMSHQCDTLPGDSGSPLMLHTDDGWQLIGV QSSAPAAKDRWRADNRAISVTGFRDKLDQLSQK >gi|296494466|gb|ADTN01000272.1| GENE 31 27721 - 28050 442 109 aa, chain - ## HITS:1 COG:ECs2305 KEGG:ns NR:ns ## COG: ECs2305 COG2076 # Protein_GI_number: 15831559 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane transporters of cations and cationic drugs # Organism: Escherichia coli O157:H7 # 1 109 1 109 109 144 100.0 5e-35 MAQFEWVHAAWLALAIVLEIVANVFLKFSDGFRRKIFGLLSLAAVLAAFSALSQAVKGID LSVAYALWGGFGIAATLAAGWILFGQRLNRKGWIGLVLLLAGMIMVKLA >gi|296494466|gb|ADTN01000272.1| GENE 32 28037 - 28402 467 121 aa, chain - ## HITS:1 COG:ECs2306 KEGG:ns NR:ns ## COG: ECs2306 COG2076 # Protein_GI_number: 15831560 # Func_class: P Inorganic ion transport and metabolism # Function: Membrane transporters of cations and cationic drugs # Organism: Escherichia coli O157:H7 # 1 121 1 121 121 182 100.0 2e-46 MYIYWILLGLAIATEITGTLSMKWASVSEGNGGFILMLVMISLSYIFLSFAVKKIALGVA YALWEGIGILFITLFSVLLFDESLSLMKIAGLTTLVAGIVLIKSGTRKARKPELEVNHGA V >gi|296494466|gb|ADTN01000272.1| GENE 33 28814 - 29848 1069 344 aa, chain + ## HITS:1 COG:ydgG KEGG:ns NR:ns ## COG: ydgG COG0628 # Protein_GI_number: 16129559 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Escherichia coli K12 # 1 344 1 344 344 493 100.0 1e-139 MAKPIITLNGLKIVIMLGMLVIILCGIRFAAEIIVPFILALFIAVILNPLVQHMVRWRVP RVLAVSILMTIIVMAMVLLLAYLGSALNELTRTLPQYRNSIMTPLQALEPLLQRVGIDVS VDQLAHYIDPNAAMTLLTNLLTQLSNAMSSIFLLLLTVLFMLLEVPQLPGKFQQMMARPV EGMAAIQRAIDSVSHYLVLKTAISIITGLVAWAMLAALDVRFAFVWGLLAFALNYIPNIG SVLAAIPPIAQVLVFNGFYEALLVLAGYLLINLVFGNILEPRIMGRGLGLSTLVVFLSLI FWGWLLGPVGMLLSVPLTIIVKIALEQTAGGQSIAVLLSDLNKE >gi|296494466|gb|ADTN01000272.1| GENE 34 29867 - 31261 1436 464 aa, chain - ## HITS:1 
COG:ECs2308 KEGG:ns NR:ns ## COG: ECs2308 COG1282 # Protein_GI_number: 15831562 # Func_class: C Energy production and conversion # Function: NAD/NADP transhydrogenase beta subunit # Organism: Escherichia coli O157:H7 # 1 462 1 462 462 851 100.0 0 MSGGLVTAAYIVAAILFIFSLAGLSKHETSRQGNNFGIAGMAIALIATIFGPDTGNVGWI LLAMVIGGAIGIRLAKKVEMTEMPELVAILHSFVGLAAVLVGFNSYLHHDAGMAPILVNI HLTEVFLGIFIGAVTFTGSVVAFGKLCGKISSKPLMLPNRHKMNLAALVVSFLLLIVFVR TDSVGLQVLALLIMTAIALVFGWHLVASIGGADMPVVVSMLNSYSGWAAAAAGFMLSNDL LIVTGALVGSSGAILSYIMCKAMNRSFISVIAGGFGTDGSSTGDDQEVGEHREITAEETA ELLKNSHSVIITPGYGMAVAQAQYPVAEITEKLRARGINVRFGIHPVAGRLPGHMNVLLA EAKVPYDIVLEMDEINDDFADTDTVLVIGANDTVNPAAQDDPKSPIAGMPVLEVWKAQNV IVFKRSMNTGYAGVQNPLFFKENTHMLFGDAKASVDAILKALYP >gi|296494466|gb|ADTN01000272.1| GENE 35 31272 - 32804 1618 510 aa, chain - ## HITS:1 COG:ECs2309 KEGG:ns NR:ns ## COG: ECs2309 COG3288 # Protein_GI_number: 15831563 # Func_class: C Energy production and conversion # Function: NAD/NADP transhydrogenase alpha subunit # Organism: Escherichia coli O157:H7 # 1 510 1 510 510 972 98.0 0 MRIGIPRERLTNETRVAATPKTVEQLLKLGFTVAVESGAGQLASFDDKAFVQAGAEIVEG NSVWQSEIILKVNAPLDDEIALLNPGTTLVSFIWPAQNPELMQKLAERNVTVMAMDSVPR ISRAQSLDALSSMANIAGYRAIVEAAHEFGRFFTGQITAAGKVPPAKVMVIGAGVAGLAA IGAANSLGAIVRAFDTRPEVKEQVQSMGAEFLELDFKEEAGSGDGYAKVMSDAFIKAEME LFAAQAKEVDIIVTTALIPGKPAPKLITREMVDSMKAGSVIVDLAAQNGGNCEYTVPGEI FTTENGVKVIGYTDLPGRLPTQSSQLYGTNLVNLLKLLCKEKDGNITVDFDDVVIRGVTV IRAGEITWPAPPIQVSAQPQAAQKAAPEVKTEEKCTCSPWRKYALMALAIILFGWMASVA PKEFLGHFTVFALACVVGYYVVWNVSHALHTPLMSVTNAISGIIVVGALLQIGQGGWVSF LSFIAVLIASINIFGGFTVTQRMLKMFRKN >gi|296494466|gb|ADTN01000272.1| GENE 36 33328 - 34272 1031 314 aa, chain + ## HITS:1 COG:no KEGG:EC55989_1769 NR:ns ## KEGG: EC55989_1769 # Name: ydgH # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 314 1 314 314 517 99.0 1e-145 MKLKNPLLASALLSATAFSVNAATELTPEQAAAVKPFDRVVVTGRFNAIGEAVKAVSRRA DKEGAASFYVVDTSDFGNSGNWRVVADLYKADAEKAEETSNRVINGVVELPKDQAVLIEP 
FDTVTVQGFYRSQPEVNDAITKAAKAKGAYSFYIVRQIDANQGGNQRITAFIYKKDAKKR IVQSPDVIPADSEAGRAALAAGGEAAKKVEIPGVATTASPSSEVGRFFETQSSKGGRYTV TLPDGTKVEELNKATAAMMVPFDSIKFSGNYGNMTEVSYQVAKRAAKKGAKYYHIPRQWQ ERGNNLTVSADLYK >gi|296494466|gb|ADTN01000272.1| GENE 37 34458 - 34560 76 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEKKLGLSALTALVLSSMLGAGVFSLPQNMAAVA Prediction of potential genes in microbial genomes Time: Mon May 16 00:06:50 2011 Seq name: gi|296494465|gb|ADTN01000273.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont710.3, whole genome shotgun sequence Length of sequence - 7024 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 + CDS 25 - 1239 1166 ## COG0531 Amino acid transporters 2 1 Op 2 . + CDS 1276 - 1998 674 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 3 2 Tu 1 . - CDS 1995 - 2330 341 ## COG3136 Uncharacterized membrane protein required for alginate biosynthesis - Prom 2360 - 2419 5.1 + Prom 2202 - 2261 4.1 4 3 Op 1 40/0.000 + CDS 2459 - 3178 608 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 5 3 Op 2 . + CDS 3182 - 4483 1203 ## COG0642 Signal transduction histidine kinase 6 3 Op 3 . + CDS 4559 - 5488 654 ## JW1602 inhibitor of replication at Ter, DNA-binding protein - Term 5418 - 5473 0.2 7 4 Tu 1 . 
- CDS 5485 - 6888 1673 ## COG0114 Fumarase - Prom 6936 - 6995 2.0 Predicted protein(s) >gi|296494465|gb|ADTN01000273.1| GENE 1 25 - 1239 1166 404 aa, chain + ## HITS:1 COG:ydgI KEGG:ns NR:ns ## COG: ydgI COG0531 # Protein_GI_number: 16129563 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 404 57 460 460 702 100.0 0 MLILTRIRPELDGGIFTYAREGFGELIGFCSAWGYWLCAVIANVSYLVIVFSALSFFTDT PELRLFGDGNTWQSIVGASALLWIVHFLILRGVQTAASINLVATLAKLLPLGLFVVLAMM MFKLDTFKLDFTGLALGVPVWEQVKNTMLITLWVFIGVEGAVVVSARARNKRDVGKATLL AVLSALGVYLLVTLLSLGVVARPELAEIRNPSMAGLMVEMMGPWGEIIIAAGLIVSVCGA YLSWTIMAAEVPFLAATHKAFPRIFARQNAQAAPSASLWLTNICVQICLVLIWLTGSDYN TLLTIASEMILVPYFLVGAFLLKIATRPLHKAVGVGACIYGLWLLYASGPMHLLLSVVLY APGLLVFLYARKTHTHDNVLNRQEMVLIGMLLIASVPATWMLVG >gi|296494465|gb|ADTN01000273.1| GENE 2 1276 - 1998 674 240 aa, chain + ## HITS:1 COG:ECs2312 KEGG:ns NR:ns ## COG: ECs2312 COG1028 # Protein_GI_number: 15831566 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Escherichia coli O157:H7 # 1 240 1 240 240 492 100.0 1e-139 MGKTQPLPILITGGGRRIGLALAWHFINQKQPVIVSYRTHYPAIDGLINAGAQCIQADFS TNDGVMAFADEVLKSTHGLRAILHNASAWMAEKPGAPLADVLACMMQIHVNTPYLLNHAL ERLLRGHGHAASDIIHFTDYVVERGSDKHIAYAASKAALDNMTRSFARKLAPEVKVNSIA PSLILFNEHDDAEYRQQALNKSLMKTAPGEKEVIDLVDYLLTSCFVTGRSFPLDGGRHLR >gi|296494465|gb|ADTN01000273.1| GENE 3 1995 - 2330 341 111 aa, chain - ## HITS:1 COG:ECs2313 KEGG:ns NR:ns ## COG: ECs2313 COG3136 # Protein_GI_number: 15831567 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein required for alginate biosynthesis # Organism: Escherichia coli O157:H7 # 1 111 1 111 111 165 100.0 2e-41 MGLVIKAALGALVVLLIGVLAKTKNYYIAGLIPLFPTFALIAHYIVASERGIEALRATII 
FSMWSIIPYFVYLVSLWYFTGMMRLPAAFVGSVACWGISAWVLIICWIKLH >gi|296494465|gb|ADTN01000273.1| GENE 4 2459 - 3178 608 239 aa, chain + ## HITS:1 COG:rstA KEGG:ns NR:ns ## COG: rstA COG0745 # Protein_GI_number: 16129566 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Escherichia coli K12 # 1 239 4 242 242 445 99.0 1e-125 MNTIVFVEDDAEVGSLIAAYLAKHDMQVTVEPRGDQAEETILRENPDLVLLDIMLPGKDG MTICRDLRAKWSGPIVLLTSLDSDMNHILALEMGACDYILKTTPPAVLLARLRLHLRQNE QATLTKGLQETSLTPYKVLHFGTLTIDPINRVVTLANTEISLSTADFELLWELATHAGQI MDRDALLKNLRGVSYDGLDRSVDVAISRLRKKLLDNAAEPYRIKTVRNKGYLFAPHAWE >gi|296494465|gb|ADTN01000273.1| GENE 5 3182 - 4483 1203 433 aa, chain + ## HITS:1 COG:rstB KEGG:ns NR:ns ## COG: rstB COG0642 # Protein_GI_number: 16129567 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 433 1 433 433 874 100.0 0 MKKLFIQFYLLLFVCFLVMSLLVGLVYKFTAERAGKQSLDDLMNSSLYLMRSELREIPPH DWGKTLKEMDLNLSFDLRVEPLSKYHLDDISMHRLRGGEIVALDDQYTFLQRIPRSHYVL AVGPVPYLYYLHQMRLLDIALIAFIAISLAFPVFIWMRPHWQDMLKLEAAAQRFGDGHLN ERIHFDEGSSFERLGVAFNQMADNINALIASKKQLIDGIAHELRTPLVRLRYRLEMSDNL SAAESQALNRDISQLEALIEELLTYARLDRPQNELHLSEPDLPLWLSTHLADIQAVTPDK TVRIKTLVQGHYAALDMRLMERVLDNLLNNALRYCHSTVETSLLLSGNRATLIVEDDGPG IAPENREHIFEPFVRLDPSRDRSTGGCGLGLAIVHSIALAMGGTVNCDTSELGGARFSFS WPLWHNIPQFTSA >gi|296494465|gb|ADTN01000273.1| GENE 6 4559 - 5488 654 309 aa, chain + ## HITS:1 COG:no KEGG:JW1602 NR:ns ## KEGG: JW1602 # Name: tus # Def: inhibitor of replication at Ter, DNA-binding protein # Organism: E.coli_J # Pathway: not_defined # 1 309 1 309 309 586 100.0 1e-166 MARYDLVDRLNTTFRQMEQELAIFAAHLEQHKLLVARVFSLPEVKKEDEHNPLNRIEVKQ HLGNDAQSLALRHFRHLFIQQQSENRSSKAAVRLPGVLCYQVDNLSQAALVSHIQHINKL KTTFEHIVTVESELPTAARFEWVHRHLPGLITLNAYRTLTVLHDPATLRFGWANKHIIKN LHRDEVLAQLEKSLKSPRSVAPWTREEWQRKLEREYQDIAALPQNAKLKIKRPVKVQPIA 
RVWYKGDQKQVQHACPTPLIALINRDNGAGVPDVGELLNYDADNVQHRYKPQAQPLRLII PRLHLYVAD >gi|296494465|gb|ADTN01000273.1| GENE 7 5485 - 6888 1673 467 aa, chain - ## HITS:1 COG:fumC KEGG:ns NR:ns ## COG: fumC COG0114 # Protein_GI_number: 16129569 # Func_class: C Energy production and conversion # Function: Fumarase # Organism: Escherichia coli K12 # 1 467 1 467 467 902 100.0 0 MNTVRSEKDSMGAIDVPADKLWGAQTQRSLEHFRISTEKMPTSLIHALALTKRAAAKVNE DLGLLSEEKASAIRQAADEVLAGQHDDEFPLAIWQTGSGTQSNMNMNEVLANRASELLGG VRGMERKVHPNDDVNKSQSSNDVFPTAMHVAALLALRKQLIPQLKTLTQTLNEKSRAFAD IVKIGRTHLQDATPLTLGQEISGWVAMLEHNLKHIEYSLPHVAELALGGTAVGTGLNTHP EYARRVADELAVITCAPFVTAPNKFEALATCDALVQAHGALKGLAASLMKIANDVRWLAS GPRCGIGEISIPENEPGSSIMPGKVNPTQCEALTMLCCQVMGNDVAINMGGASGNFELNV FRPMVIHNFLQSVRLLADGMESFNKHCAVGIEPNRERINQLLNESLMLVTALNTHIGYDK AAEIAKKAHKEGLTLKAAALALGYLSEAEFDSWVRPEQMVGSMKAGR Prediction of potential genes in microbial genomes Time: Mon May 16 00:06:57 2011 Seq name: gi|296494464|gb|ADTN01000274.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont710.4, whole genome shotgun sequence Length of sequence - 6750 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 26 - 1672 486 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase - Prom 1742 - 1801 5.2 + Prom 1769 - 1828 4.2 2 2 Op 1 4/0.000 + CDS 1871 - 3046 1277 ## COG1482 Phosphomannose isomerase + Term 3104 - 3137 4.1 + Prom 3048 - 3107 1.9 3 2 Op 2 . + CDS 3147 - 4655 1637 ## COG5339 Uncharacterized protein conserved in bacteria + Term 4669 - 4704 4.0 - Term 4838 - 4873 5.7 4 3 Op 1 . - CDS 4881 - 6128 1030 ## JW1607 predicted outer membrane porin protein 5 3 Op 2 . 
- CDS 6185 - 6700 521 ## COG2211 Na+/melibiose symporter and related transporters Predicted protein(s) >gi|296494464|gb|ADTN01000274.1| GENE 1 26 - 1672 486 548 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 76 531 38 482 508 191 33 1e-48 MSNKPFHYQAPFPLKKDDTEYYLLTSEHVSVSEFEGQEILKVALEALTLLARQAFHDASF MLRPAHQQQVADILRDPEASENDKYVALQFLRNSDIAAKGVLPTCQDTGTAIIVGKKGQR VWTGGGDEAALARGVYNTYIEDNLRYSQNAPLDMYKEVNTGTNLPAQIDLYAVDGDEYKF LCIAKGGGSANKTYLYQETKALLTPGKLKNYLVEKMRTLGTAACPPYHIAFVIGGTSAET NLKTVKLASAKYYDELPTEGNEHGQAFRDVELEKELLIEAQNLGLGAQFGGKYFAHDIRV IRLPRHGASCPVGMGVSCSADRNIKAKINRQGIWIEKLEHNPGKYIPEELRKAGEGEAVR VDLNRPMKEILAQLSQYPVSTRLSLNGTIIVGRDIAHAKLKERMDNGEGLPQYIKDHPIY YAGPAKTPEGYASGSLGPTTAGRMDSYVDQLQAQGGSMIMLAKGNRSQQVTDACKKHGGF YLGSIGGPAAVLAQGSIKSLECVEYPELGMEAIWKIEVEDFPAFILVDDKGNDFFQQIQL TQCTRCVK >gi|296494464|gb|ADTN01000274.1| GENE 2 1871 - 3046 1277 391 aa, chain + ## HITS:1 COG:manA KEGG:ns NR:ns ## COG: manA COG1482 # Protein_GI_number: 16129571 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Escherichia coli K12 # 1 391 1 391 391 751 100.0 0 MQKLINSVQNYAWGSKTALTELYGMENPSSQPMAELWMGAHPKSSSRVQNAAGDIVSLRD VIESDKSTLLGEAVAKRFGELPFLFKVLCAAQPLSIQVHPNKHNSEIGFAKENAAGIPMD AAERNYKDPNHKPELVFALTPFLAMNAFREFSEIVSLLQPVAGAHPAIAHFLQQPDAERL SELFASLLNMQGEEKSRALAILKSALDSQQGEPWQTIRLISEFYPEDSGLFSPLLLNVVK LNPGEAMFLFAETPHAYLQGVALEVMANSDNVLRAGLTPKYIDIPELVANVKFEAKPANQ LLTQPVKQGAELDFPIPVDDFAFSLHDLSDKETTISQQSAAILFCVEGDATLWKGSQQLQ LKPGESAFIAANESPVTVKGHGRLARVYNKL >gi|296494464|gb|ADTN01000274.1| GENE 3 3147 - 4655 1637 502 aa, chain + ## HITS:1 COG:ydgA KEGG:ns NR:ns ## COG: ydgA COG5339 # Protein_GI_number: 16129572 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 502 1 502 502 919 99.0 0 MNKSLVAVGVIVALGVVWTGGAWYTGKKIETHLEDMVAQANAQLKLTAPESNLEVSYQNY 
HRGVFSSQLQLLVKPIAGKENPWIKSGQSVIFNESVDHGPFPLAQLKKLNLIPSMASIQT TLVNNEVSKPLFDMAKGETPFEINSRIGYSGDSSSDISLKPLNYEQKDEKVAFSGGEFQL NADRDGKAISLSGEAQSGRIDAVNEYNQKVQLTFNNLKTDGSSTLASFGERVGNQKLSLE KMTISVEGKELALLEGMEISGKSDLVNDGKTINSQLDYSLNSLKVQNQDLGSGKLTLKVG QIDGEAWHQFSQQYNAQTQALLAQPEIANNPELYQEKVTEAFFSALPMMLKGDPVITIAP LSWKNSQGESALNLSLFLKDPATTKEAPQTLAQEVDRSVKSLDAKLTIPVDMATEFMTQV AKLEGYQEDQAKKLAKQQVEGASAMGQMFRLTTLQDNTITTSLQYANGQITLNGQKMSLE DFVGMFAMPALNVPAVPAIPQQ >gi|296494464|gb|ADTN01000274.1| GENE 4 4881 - 6128 1030 415 aa, chain - ## HITS:1 COG:no KEGG:JW1607 NR:ns ## KEGG: JW1607 # Name: uidC # Def: predicted outer membrane porin protein # Organism: E.coli_J # Pathway: not_defined # 1 415 7 421 421 812 100.0 0 MAVICLTAASGLTSAYAAQLADDEAGLRIRLKNELRRADKPSAGAGRDIYAWVQGGLLDF NSGYYSNIIGVEGGAYYVYKLGARADMSTRWYLDGDKSFGFALGAVKIKPSENSLLKLGR FGTDYSYGSLPYRIPLMAGSSQRTLPTVSEGALGYWALTPNIDLWGMWRSRVFLWTDSTT GIRDEGVYNSQTGKYDKHRARSFLAASWHDDTSRYSLGASVQKDVSNQIQSILEKSIPLD PNYTLKGELLGFYAQLEGLSRNTSQPNETALVSGQLTWNAPWGSVFGSGGYLRHAMNGAV VDTDIGYPFSLSLDRNREGMQSWQLGVNYRLTPQFTLTFAPIVTRGYESSKRDVRIEGTG ILGGMNYRVSEGPLQGMNFFLAADKGREKRDGSTLGDRLNYWDVKMSIQYDFMLK >gi|296494464|gb|ADTN01000274.1| GENE 5 6185 - 6700 521 171 aa, chain - ## HITS:1 COG:ECs2323 KEGG:ns NR:ns ## COG: ECs2323 COG2211 # Protein_GI_number: 15831577 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli O157:H7 # 1 171 287 457 457 325 100.0 2e-89 MVARIGKKNTFLIGALLGTCGYLLFFWVSVWSLPVALVALAIASIGQGVTMTVMWALEAD TVEYGEYLTGVRIEGLTYSLFSFTRKCGQAIGGSIPAFILGLSGYIANQVQTPEVIMGIR TSIALVPCGFMLLAFVIIWFYPLTDKKFKEIVVEIDNRKKVQQQLISDITN Prediction of potential genes in microbial genomes Time: Mon May 16 00:07:04 2011 Seq name: gi|296494463|gb|ADTN01000275.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont710.5, whole genome shotgun sequence Length of sequence - 3812 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average 
op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 2 - 842 915 ## COG2211 Na+/melibiose symporter and related transporters 2 1 Op 2 2/0.000 - CDS 839 - 2650 1887 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 2754 - 2813 3.6 - Term 2987 - 3024 3.0 3 1 Op 3 . - CDS 3041 - 3628 619 ## COG1309 Transcriptional regulator - Prom 3653 - 3712 4.9 Predicted protein(s) >gi|296494463|gb|ADTN01000275.1| GENE 1 2 - 842 915 280 aa, chain - ## HITS:1 COG:ECs2323 KEGG:ns NR:ns ## COG: ECs2323 COG2211 # Protein_GI_number: 15831577 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli O157:H7 # 1 280 1 280 457 496 99.0 1e-140 MNQQLSWRTIVGYSLGDVANNFAFAMGALFLLSYYTDVAGVGAAAAGTMLLLVRVFDAFA DVFAGRVVDSVNTRWGKFRPFLLFGTAPLMIFSVLVFWVPTDWSHGSKVVYAYLTYMGLG LCYSLVNIPYGSLATAMTQQPQSRARLGAARGIAASLTFVCLAFLIGPSIKNSSPEEMVS VYHFWTIVLAIAGMVLYFICFKSTRENVVRIVAQPSLNISLQTLKRNRPLFMLCIGALCV LISTFAVSASSLFYVRYVLNDTGLFTVLVLVQNLVGTVAS >gi|296494463|gb|ADTN01000275.1| GENE 2 839 - 2650 1887 603 aa, chain - ## HITS:1 COG:uidA KEGG:ns NR:ns ## COG: uidA COG3250 # Protein_GI_number: 16129575 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Escherichia coli K12 # 1 603 1 603 603 1272 100.0 0 MLRPVETPTREIKKLDGLWAFSLDRENCGIDQRWWESALQESRAIAVPGSFNDQFADADI RNYAGNVWYQREVFIPKGWAGQRIVLRFDAVTHYGKVWVNNQEVMEHQGGYTPFEADVTP YVIAGKSVRITVCVNNELNWQTIPPGMVITDENGKKKQSYFHDFFNYAGIHRSVMLYTTP NTWVDDITVVTHVAQDCNHASVDWQVVANGDVSVELRDADQQVVATGQGTSGTLQVVNPH LWQPGEGYLYELCVTAKSQTECDIYPLRVGIRSVAVKGEQFLINHKPFYFTGFGRHEDAD LRGKGFDNVLMVHDHALMDWIGANSYRTSHYPYAEEMLDWADEHGIVVIDETAAVGFNLS LGIGFEAGNKPKELYSEEAVNGETQQAHLQAIKELIARDKNHPSVVMWSIANEPDTRPQG AREYFAPLAEATRKLDPTRPITCVNVMFCDAHTDTISDLFDVLCLNRYYGWYVQSGDLET AEKVLEKELLAWQEKLHQPIIITEYGVDTLAGLHSMYTDMWSEEYQCAWLDMYHRVFDRV SAVVGEQVWNFADFATSQGILRVGGNKKGIFTRDRKPKSAAFLLQKRWTGMNFGEKPQQG GKQ 
>gi|296494463|gb|ADTN01000275.1| GENE 3 3041 - 3628 619 195 aa, chain - ## HITS:1 COG:ECs2326 KEGG:ns NR:ns ## COG: ECs2326 COG1309 # Protein_GI_number: 15831580 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 195 2 196 196 357 99.0 7e-99 MDNMQTEAQPTRTRILNAAREIFSENGFHSASMKAICKSCAISPGTLYHHFISKEALIQA IILQDQERALARFREPIEGIHFVDYMVESIVSLTHEAFGQRALAVEIMAEGMRNPQVAAM LKNKHMTITEFVAQRMRDAQQKGEISPDINTAMTSRLLLDLTYGVLADIEAEDLAREASF AQGLRAMIGGILTAS Prediction of potential genes in microbial genomes Time: Mon May 16 00:07:07 2011 Seq name: gi|296494462|gb|ADTN01000276.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont710.6, whole genome shotgun sequence Length of sequence - 9801 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 8, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 24 - 791 244 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 - Prom 811 - 870 2.0 2 2 Tu 1 . - CDS 903 - 1931 750 ## COG1609 Transcriptional regulators - Prom 1981 - 2040 7.4 + Prom 1952 - 2011 5.3 3 3 Op 1 3/0.500 + CDS 2106 - 3698 1668 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 4 3 Op 2 3/0.500 + CDS 3708 - 4880 1027 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities + Term 4883 - 4933 7.1 + Prom 4895 - 4954 6.5 5 4 Tu 1 . + CDS 4984 - 5985 1109 ## COG1816 Adenosine deaminase + Term 5991 - 6028 7.1 - Term 5977 - 6014 7.1 6 5 Tu 1 . - CDS 6019 - 7059 946 ## COG0673 Predicted dehydrogenases and related proteins - Prom 7113 - 7172 4.5 7 6 Tu 1 . + CDS 7272 - 7427 90 ## EC55989_1792 beta-lactam resistance membrane protein + Term 7442 - 7489 6.9 + Prom 7505 - 7564 3.7 8 7 Tu 1 . + CDS 7700 - 7915 251 ## G2583_2020 OriC-binding nucleoid-associated protein + Prom 7918 - 7977 4.1 9 8 Op 1 . 
+ CDS 8001 - 8441 378 ## UTI89_C1816 hypothetical protein 10 8 Op 2 12/0.000 + CDS 8518 - 9099 574 ## COG4657 Predicted NADH:ubiquinone oxidoreductase, subunit RnfA 11 8 Op 3 . + CDS 9099 - 9677 496 ## COG2878 Predicted NADH:ubiquinone oxidoreductase, subunit RnfB Predicted protein(s) >gi|296494462|gb|ADTN01000276.1| GENE 1 24 - 791 244 255 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 12 248 4 238 242 98 29 2e-20 MFNSDNLRLDGKCAIITGAGAGIGKEIAITFATAGASVVVSDINADAANHVVDEIQQLGG QAFACRCDITSEQELSALADFAISKLGKVDILVNNAGGGGPKPFDMPMADFRRAYELNVF SFFHLSQLVAPEMEKNGGGVILTITSMAAENKNINMTSYASSKAAASHLVRNMAFDLGEK NIRVNGIAPGAILTDALKSVITPEIEQKMLQHTPIRRLGQPQDIANAALFLCSPAASWVS GQILTVSGGGVQELN >gi|296494462|gb|ADTN01000276.1| GENE 2 903 - 1931 750 342 aa, chain - ## HITS:1 COG:malI KEGG:ns NR:ns ## COG: malI COG1609 # Protein_GI_number: 16129578 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 342 1 342 342 645 100.0 0 MATAKKITIHDVALAAGVSVSTVSLVLSGKGRISTATGERVNAAIEELGFVRNRQASALR GGQSGVIGLIVRDLSAPFYAELTAGLTEALEAQGRMVFLLHGGKDGEQLAQRFSLLLNQG VDGVVIAGAAGSSDDLRRMAEEKAIPVIFASRASYLDDVDTVRPDNMQAAQLLTEHLIRN GHQRIAWLGGQSSSLTRAERVGGYCATLLKFGLPFHSDWVLECTSSQKQAAEAITALLRH NPTISAVVCYNETIAMGAWFGLLKAGRQSGESGVDRYFEQQVSLAAFTDATPTTLDDIPV TWASTPARELGITLADRMMQKITHEETHSRNLIIPARLIAAK >gi|296494462|gb|ADTN01000276.1| GENE 3 2106 - 3698 1668 530 aa, chain + ## HITS:1 COG:malX_1 KEGG:ns NR:ns ## COG: malX_1 COG1263 # Protein_GI_number: 16129579 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Escherichia coli K12 # 1 450 1 450 450 847 100.0 0 MTAKTAPKVTLWEFFQQLGKTFMLPVALLSFCGIMLGIGSSLSSHDVITLIPVLGNPVLQ AIFTWMSKIGSFAFSFLPVMFCIAIPLGLARENKGVAAFAGFIGYAVMNLAVNFWLTNKG ILPTTDAAVLKANNIQSILGIQSIDTGILGAVIAGIIVWMLHERFHNIRLPDALAFFGGT 
RFVPIISSLVMGLVGLVIPLVWPIFAMGISGLGHMINSAGDFGPMLFGTGERLLLPFGLH HILVALIRFTDAGGTQEVCGQTVSGALTIFQAQLSCPTTHGFSESATRFLSQGKMPAFLG GLPGAALAMYHCARPENRHKIKGLLISGLIACVVGGTTEPLEFLFLFVAPVLYVIHALLT GLGFTVMSVLGVTIGNTDGNIIDFVVFGILHGLSTKWYMVPVVAAIWFVVYYVIFRFAIT RFNLKTPGRDSEVASSIEKAVAGAPGKSGYNVPAILEALGGADNIVSLDNCITRLRLSVK DMSLVNVQALKDNRAIGVVQLNQHNLQVVIGPQVQSVKDEMAGLMHTVQA >gi|296494462|gb|ADTN01000276.1| GENE 4 3708 - 4880 1027 390 aa, chain + ## HITS:1 COG:malY KEGG:ns NR:ns ## COG: malY COG1168 # Protein_GI_number: 16129580 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Escherichia coli K12 # 1 390 1 390 390 822 100.0 0 MFDFSKVVDRHGTWCTQWDYVADRFGTADLLPFTISDMDFATAPCIIEALNQRLMHGVFG YSRWKNDEFLAAIAHWFSTQHYTAIDSQTVVYGPSVIYMVSELIRQWSETGEGVVIHTPA YDAFYKAIEGNQRTVMPVALEKQADGWFCDMGKLEAVLAKPECKIMLLCSPQNPTGKVWT CDELEIMADLCERHGVRVISDEIHMDMVWGEQPHIPWSNVARGDWALLTSGSKSFNIPAL TGAYGIIENSSSRDAYLSALKGRDGLSSPSVLALTAHIAAYQQGAPWLDALRIYLKDNLT YIADKMNAAFPELNWQIPQSTYLAWLDLRPLNIDDNALQKALIEQEKVAIMPGYTYGEEG RGFVRLNAGCPRSKLEKGVAGLINAIRAVR >gi|296494462|gb|ADTN01000276.1| GENE 5 4984 - 5985 1109 333 aa, chain + ## HITS:1 COG:add KEGG:ns NR:ns ## COG: add COG1816 # Protein_GI_number: 16129581 # Func_class: F Nucleotide transport and metabolism # Function: Adenosine deaminase # Organism: Escherichia coli K12 # 1 333 1 333 333 655 100.0 0 MIDTTLPLTDIHRHLDGNIRPQTILELGRQYNISLPAQSLETLIPHVQVIANEPDLVSFL TKLDWGVKVLASLDACRRVAFENIEDAARHGLHYVELRFSPGYMAMAHQLPVAGVVEAVI DGVREGCRTFGVQAKLIGIMSRTFGEAACQQELEAFLAHRDQITALDLAGDELGFPGSLF LSHFNRARDAGWHITVHAGEAAGPESIWQAIRELGAERIGHGVKAIEDRALMDFLAEQQI GIESCLTSNIQTSTVAELAAHPLKTFLEHGIRASINTDDPGVQGVDIIHEYTVAAPAAGL SREQIRQAQINGLEMAFLSAEEKRALREKVAAK >gi|296494462|gb|ADTN01000276.1| GENE 6 6019 - 7059 946 346 aa, chain - ## HITS:1 COG:ECs2332 KEGG:ns NR:ns ## COG: ECs2332 COG0673 # Protein_GI_number: 15831586 # Func_class: R General function prediction only # Function: 
Predicted dehydrogenases and related proteins # Organism: Escherichia coli O157:H7 # 1 346 14 359 359 681 99.0 0 MSDNIRVGLIGYGYASKTFHAPLIAGTPGLELAVISSSDETKVKADWPTVTVVSEPKHLF NDPNIDLIVIPTPNDTHLPLAKAALEAGKHVVVDKPFTVTLSQARELDALAKSLGRVLSV FHNRRWDSDFLTLKGLLAEGVLGEVAYFESHFDRFRPQVRDRWREQGGPGSGIWYDLAPH LLDQAITLFGLPVSMTVDLAQLRPGAQSTDYFHAILSYPQRRVILHGTMLAAAESARYIV HGSRGSYVKYGLDPQEERLKNGERLPQEDWGYDMRDGVLTRVEGEERVEETLLTVPGNYP AYYAAIRDALNGDGENPVPASQAIQVMELIELGIESAKHRATLCLA >gi|296494462|gb|ADTN01000276.1| GENE 7 7272 - 7427 90 51 aa, chain + ## HITS:1 COG:no KEGG:EC55989_1792 NR:ns ## KEGG: EC55989_1792 # Name: blr # Def: beta-lactam resistance membrane protein # Organism: E.coli_55989 # Pathway: not_defined # 1 51 16 66 66 92 100.0 6e-18 MDQSREMWAVMNRLIELTGWIVLVVSVILLGVASHIDNYQPPEQSASVQHK >gi|296494462|gb|ADTN01000276.1| GENE 8 7700 - 7915 251 71 aa, chain + ## HITS:1 COG:no KEGG:G2583_2020 NR:ns ## KEGG: G2583_2020 # Name: cnu # Def: OriC-binding nucleoid-associated protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 71 1 71 71 133 100.0 3e-30 MTVQDYLLKFRKISSLESLEKLYDHLNYTLTDDQELINMYRAADHRRAELVSGGRLFDLG QVPKSVWHYVQ >gi|296494462|gb|ADTN01000276.1| GENE 9 8001 - 8441 378 146 aa, chain + ## HITS:1 COG:no KEGG:UTI89_C1816 NR:ns ## KEGG: UTI89_C1816 # Name: ydgK # Def: hypothetical protein # Organism: E.coli_UTI89 # Pathway: not_defined # 1 146 9 154 154 202 99.0 2e-51 MTTTTPQRIGGWLLGPLAWLLVALLSTTLALLLYTAALSSPQTFQTLGGQALTTQILWGV SFITAIALWYYTLWLTIAFFKRRRCVPKHYIIWLLISVLLAVKAFAFSPVEDGIAVRQLL FTLLATALIVPYFKRSSRVKATFVNP >gi|296494462|gb|ADTN01000276.1| GENE 10 8518 - 9099 574 193 aa, chain + ## HITS:1 COG:ECs2336 KEGG:ns NR:ns ## COG: ECs2336 COG4657 # Protein_GI_number: 15831590 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfA # Organism: Escherichia coli O157:H7 # 1 193 1 193 193 285 100.0 3e-77 MTDYLLLFVGTVLVNNFVLVKFLGLCPFMGVSKKLETAMGMGLATTFVMTLASICAWLID 
TWILIPLNLIYLRTLAFILVIAVVVQFTEMVVRKTSPVLYRLLGIFLPLITTNCAVLGVA LLNINLGHNFLQSALYGFSAAVGFSLVMVLFAAIRERLAVADVPAPFRGNAIALITAGLM SLAFMGFSGLVKL >gi|296494462|gb|ADTN01000276.1| GENE 11 9099 - 9677 496 192 aa, chain + ## HITS:1 COG:ydgM KEGG:ns NR:ns ## COG: ydgM COG2878 # Protein_GI_number: 16129586 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfB # Organism: Escherichia coli K12 # 1 192 1 192 192 365 100.0 1e-101 MNAIWIAVAAVSLLGLAFGAILGYASRRFAVEDDPVVEKIDEILPQSQCGQCGYPGCRPY AEAISCNGEKINRCAPGGEAVMLKIAELLNVEPQPLDGEAQEITPARMVAVIDENNCIGC TKCIQACPVDAIVGATRAMHTVMSDLCTGCNLCVDPCPTHCISLQPVAETPDSWKWDLNT IPVRIIPVEHHA Prediction of potential genes in microbial genomes Time: Mon May 16 00:07:17 2011 Seq name: gi|296494461|gb|ADTN01000277.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont710.7, whole genome shotgun sequence Length of sequence - 9034 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 5, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 12/0.000 + CDS 245 - 2242 1983 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC 2 1 Op 2 12/0.000 + CDS 2243 - 3301 892 ## COG4658 Predicted NADH:ubiquinone oxidoreductase, subunit RnfD 3 1 Op 3 13/0.000 + CDS 3305 - 3925 580 ## COG4659 Predicted NADH:ubiquinone oxidoreductase, subunit RnfG 4 1 Op 4 10/0.000 + CDS 3929 - 4624 748 ## COG4660 Predicted NADH:ubiquinone oxidoreductase, subunit RnfE 5 1 Op 5 . + CDS 4624 - 5259 600 ## COG0177 Predicted EndoIII-related endonuclease + Prom 5265 - 5324 2.6 6 2 Tu 1 . + CDS 5387 - 5587 89 ## ECBD_2010 hypothetical protein + Prom 5791 - 5850 5.9 7 3 Tu 1 4/1.000 + CDS 5870 - 7372 1577 ## COG3104 Dipeptide/tripeptide permease + Term 7396 - 7430 5.2 + Prom 7398 - 7457 4.2 8 4 Tu 1 . + CDS 7478 - 8083 732 ## COG0625 Glutathione S-transferase + Term 8088 - 8133 -0.9 - Term 8075 - 8117 -0.5 9 5 Tu 1 . 
- CDS 8127 - 8987 923 ## COG2240 Pyridoxal/pyridoxine/pyridoxamine kinase Predicted protein(s) >gi|296494461|gb|ADTN01000277.1| GENE 1 245 - 2242 1983 665 aa, chain + ## HITS:1 COG:ECs2338 KEGG:ns NR:ns ## COG: ECs2338 COG4656 # Protein_GI_number: 15831592 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Escherichia coli O157:H7 # 1 665 108 772 772 945 97.0 0 MIIDADGEDCWIPRDGWADYRTRSREELIERIHQFGVAGLGGAGFPTGVKLQGGGDKIET LIINAAECEPYITADDRLMQDCAAQVVEGIRILAHILQPREILIGIEDNKPQAISMLRAV LADSNDISLRVIPTKYPSGGAKQLTYILTGKQVPHGGRSSDIGVLMQNVGTAYAVKRAVI DGEPITERVVTLTGEAIARPGNVWARLGTPVRHLLNDAGFCPSADQMVIMGGPLMGFTLP WLDVPVVKITNCLLAPSANELGEPQEEQSCIRCSACADACPADLLPQQLYWFSKGQQHDK ATTHNIADCIECGACAWVCPSNIPLVQYFRQEKAEIAAIRQEEKRAAEAKARFEARQARL EREKAARLERHKSAAVQPAAKDKDAIAAALARVKEKQAQATQPIVIKAGERPDNSAIIAA REARKAQARAKQAELQQTNDAATVADPRKTAVEAAIARAKARKLEQQQANAEPEQQVDPR KAAVEAAIARAKARKLEQQQANAEPEQQVDPRKAAVEAAIARAKARKLEQQQANAEPEQQ VDPRKAAVEAAIARAKARKLEQQQANAEPEQQVDPRKAAVEAAIARAKARKREQQPANAE PEEQVDPRKAAVEAAIARAKARKLEQQQANAVPEEQVDPRKAAVAAAIARAQAKKAAQQK VVNED >gi|296494461|gb|ADTN01000277.1| GENE 2 2243 - 3301 892 352 aa, chain + ## HITS:1 COG:ydgO KEGG:ns NR:ns ## COG: ydgO COG4658 # Protein_GI_number: 16129588 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfD # Organism: Escherichia coli K12 # 1 352 1 352 352 633 100.0 0 MVFRIASSPYTHNQRQTSRIMLLVLLAAVPGIAAQLWFFGWGTLVQILLASVSALLAEAL VLKLRKQSVAATLKDNSALLTGLLLAVSIPPLAPWWMVVLGTVFAVIIAKQLYGGLGQNP FNPAMIGYVVLLISFPVQMTSWLPPHEIAVNIPGFIDAIQVIFSGHTASGGDMNTLRLGI DGISQATPLDTFKTSVRAGHSVEQIMQYPIYSGILAGAGWQWVNLAWLAGGVWLLWQKAI RWHIPLSFLVTLALCAMLGWLFSPETLAAPQIHLLSGATMLGAFFILTDPVTASTTNRGR LIFGALAGLLVWLIRSFGGYPDGVAFAVLLANITVPLIDYYTRPRVYGHRKG >gi|296494461|gb|ADTN01000277.1| GENE 3 3305 - 3925 580 206 aa, chain + ## HITS:1 COG:ydgP KEGG:ns NR:ns ## COG: ydgP COG4659 # Protein_GI_number: 16129589 # Func_class: C Energy 
production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfG # Organism: Escherichia coli K12 # 1 206 1 206 206 393 100.0 1e-109 MLKTIRKHGITLALFAAGSTGLTAAINQMTKTTIAEQASLQQKALFDQVLPAERYNNALA QSCYLVTAPELGKGEHRVYIAKQDDKPVAAVLEATAPDGYSGAIQLLVGADFNGTVLGTR VTEHHETPGLGDKIELRLSDWITHFAGKKISGADDAHWAVKKDGGDFDQFTGATITPRAV VNAVKRAGLYAQTLPAQLSQLPACGE >gi|296494461|gb|ADTN01000277.1| GENE 4 3929 - 4624 748 231 aa, chain + ## HITS:1 COG:ECs2341 KEGG:ns NR:ns ## COG: ECs2341 COG4660 # Protein_GI_number: 15831595 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfE # Organism: Escherichia coli O157:H7 # 1 231 1 231 231 395 99.0 1e-110 MSEIKDVIVQGLWKNNSALVQLLGLCPLLAVTSTATNALGLGLATTLVLTLTNLTISTLR HWTPAEIRIPIYVMIIASVVSAVQMLINAYAFGLYQSLGIFIPLIVTNCIVVGRAEAFAA KKGPALSALDGFSIGMGATCAMFVLGSLREIIGNGTLFDGADALLGSWAKVLRVEIFHTD SPFLLAMLPPGAFIGLGLMLAGKYLIDERMKKRRAEAAAERALPNGETGNV >gi|296494461|gb|ADTN01000277.1| GENE 5 4624 - 5259 600 211 aa, chain + ## HITS:1 COG:nth KEGG:ns NR:ns ## COG: nth COG0177 # Protein_GI_number: 16129591 # Func_class: L Replication, recombination and repair # Function: Predicted EndoIII-related endonuclease # Organism: Escherichia coli K12 # 1 211 1 211 211 426 100.0 1e-119 MNKAKRLEILTRLRENNPHPTTELNFSSPFELLIAVLLSAQATDVSVNKATAKLYPVANT PAAMLELGVEGVKTYIKTIGLYNSKAENIIKTCRILLEQHNGEVPEDRAALEALPGVGRK TANVVLNTAFGWPTIAVDTHIFRVCNRTQFAPGKNVEQVEEKLLKVVPAEFKVDCHHWLI LHGRYTCIARKPRCGSCIIEDLCEYKEKVDI >gi|296494461|gb|ADTN01000277.1| GENE 6 5387 - 5587 89 66 aa, chain + ## HITS:1 COG:no KEGG:ECBD_2010 NR:ns ## KEGG: ECBD_2010 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_BL21_DE3 # Pathway: not_defined # 1 66 1 66 66 103 100.0 2e-21 MQVNCFFNSEIFYSFVIKITMLSQCKNIARIRLKSFKFTHATNIDLQNITLAIFQKLDYL TSKNAK >gi|296494461|gb|ADTN01000277.1| GENE 7 5870 - 7372 1577 500 aa, chain + ## HITS:1 COG:ydgR KEGG:ns NR:ns ## COG: ydgR COG3104 # Protein_GI_number: 16129592 # Func_class: E 
Amino acid transport and metabolism # Function: Dipeptide/tripeptide permease # Organism: Escherichia coli K12 # 1 487 1 487 500 879 100.0 0 MSTANQKPTESVSLNAFKQPKAFYLIFSIELWERFGYYGLQGIMAVYLVKQLGMSEADSI TLFSSFSALVYGLVAIGGWLGDKVLGTKRVIMLGAIVLAIGYALVAWSGHDAGIVYMGMA AIAVGNGLFKANPSSLLSTCYEKNDPRLDGAFTMYYMSVNIGSFFSMIATPWLAAKYGWS VAFALSVVGLLITIVNFAFCQRWVKQYGSKPDFEPINYRNLLLTIIGVVALIAIATWLLH NQEVARMALGVVAFGIVVIFGKEAFAMKGAARRKMIVAFILMLEAIIFFVLYSQMPTSLN FFAIRNVEHSILGLAVEPEQYQALNPFWIIIGSPILAAIYNKMGDTLPMPTKFAIGMVMC SGAFLILPLGAKFASDAGIVSVSWLVASYGLQSIGELMISGLGLAMVAQLVPQRLMGFIM GSWFLTTAGANLIGGYVAGMMAVPDNVTDPLMSLEVYGRVFLQIGVATAVIAVLMLLTAP KLHRMTQDDAADKAAKAAVA >gi|296494461|gb|ADTN01000277.1| GENE 8 7478 - 8083 732 201 aa, chain + ## HITS:1 COG:ECs2344 KEGG:ns NR:ns ## COG: ECs2344 COG0625 # Protein_GI_number: 15831598 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione S-transferase # Organism: Escherichia coli O157:H7 # 1 201 1 201 201 389 100.0 1e-108 MKLFYKPGACSLASHITLRESGKDFTLVSVDLMKKRLENGDDYFAVNPKGQVPALLLDDG TLLTEGVAIMQYLADSVPDRQLLAPVNSISRYKTIEWLNYIATELHKGFTPLFRPDTPEE YKPTVRAQLEKKLQYVNEALKDEHWICGQRFTIADAYLFTVLRWAYAVKLNLEGLEHIAA FMQRMAERPEVQDALSAEGLK >gi|296494461|gb|ADTN01000277.1| GENE 9 8127 - 8987 923 286 aa, chain - ## HITS:1 COG:pdxY KEGG:ns NR:ns ## COG: pdxY COG2240 # Protein_GI_number: 16129594 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal/pyridoxine/pyridoxamine kinase # Organism: Escherichia coli K12 # 1 286 2 287 287 568 100.0 1e-162 MKNILAIQSHVVYGHAGNSAAEFPMRRLGANVWPLNTVQFSNHTQYGKWTGCVMPPSHLT EIVQGIAAIDKLHTCDAVLSGYLGSAEQGEHILGIVRQVKAANPQAKYFCDPVMGHPEKG CIVAPGVAEFHVRHGLPASDIIAPNLVELEILCEHAVNNVEEAVLAARELIAQGPQIVLV KHLARAGYSRDRFEMLLVTADEAWHISRPLVDFGMRQPVGVGDVTSGLLLVKLLQGATLQ EALEHVTAAVYEIMVTTKAMQEYELQVVAAQDRIAKPEHYFSATKL Prediction of potential genes in microbial genomes Time: Mon May 16 00:07:20 2011 Seq name: gi|296494460|gb|ADTN01000278.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont710.8, 
whole genome shotgun sequence Length of sequence - 4452 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 - CDS 44 - 1318 833 ## PROTEIN SUPPORTED gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 - Prom 1383 - 1442 3.2 2 1 Op 2 3/1.000 - CDS 1447 - 2103 646 ## COG0259 Pyridoxamine-phosphate oxidase 3 1 Op 3 3/1.000 - CDS 2162 - 2485 286 ## COG3895 Predicted periplasmic protein 4 2 Tu 1 . - CDS 2589 - 3698 784 ## COG2377 Predicted molecular chaperone distantly related to HSP70-fold metalloproteases - Prom 3768 - 3827 3.2 + Prom 3814 - 3873 6.9 5 3 Tu 1 . + CDS 3972 - 4439 585 ## COG3133 Outer membrane lipoprotein Predicted protein(s) >gi|296494460|gb|ADTN01000278.1| GENE 1 44 - 1318 833 424 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 [Phaeobacter gallaeciensis BS107] # 4 421 7 414 418 325 42 4e-89 MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALYCGFDPTADSLHLGHLVPLLCLKR FQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENS AIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQG YDFACLNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTE GGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQ YVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLM QALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCL ICWK >gi|296494460|gb|ADTN01000278.1| GENE 2 1447 - 2103 646 218 aa, chain - ## HITS:1 COG:pdxH KEGG:ns NR:ns ## COG: pdxH COG0259 # Protein_GI_number: 16129596 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxamine-phosphate oxidase # Organism: Escherichia coli K12 # 1 218 1 218 218 432 100.0 1e-121 MSDNDELQQIAHLRREYTKGGLRRRDLPADPLTLFERWLSQACEAKLADPTAMVVATVDE HGQPYQRIVLLKHYDEKGMVFYTNLGSRKAHQIENNPRVSLLFPWHTLERQVMVIGKAER LSTLEVMKYFHSRPRDSQIGAWVSKQSSRISARGILESKFLELKQKFQQGEVPLPSFWGG FRVSLEQIEFWQGGEHRLHDRFLYQRENDAWKIDRLAP 
>gi|296494460|gb|ADTN01000278.1| GENE 3 2162 - 2485 286 107 aa, chain - ## HITS:1 COG:STM1447 KEGG:ns NR:ns ## COG: STM1447 COG3895 # Protein_GI_number: 16764795 # Func_class: R General function prediction only # Function: Predicted periplasmic protein # Organism: Salmonella typhimurium LT2 # 1 107 3 109 109 189 84.0 7e-49 MKKLLIIILPVLLSGCSAFNQLVERMQTDTLEYQCDEKPLTVKLNNPRQEVSFVYDNQLL HLKQGISASGARYTDGIYVFWSKGDEATVYKRDRIVLNNCQLQNPQR >gi|296494460|gb|ADTN01000278.1| GENE 4 2589 - 3698 784 369 aa, chain - ## HITS:1 COG:ydhH KEGG:ns NR:ns ## COG: ydhH COG2377 # Protein_GI_number: 16129598 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted molecular chaperone distantly related to HSP70-fold metalloproteases # Organism: Escherichia coli K12 # 1 369 1 369 369 712 99.0 0 MKSGRFIGVMSGTSLDGVDVVLATIDEHRVAQLASLSWPIPVSLKQAVLDICQGQQLTLS QFGQLDTQLGRLFADAVNALLKEQNLQARDIVAIGCHGQTVWHEPTGVAPHTLQIGDNNQ IVARTGITVVGDFRRRDIALGGQGAPLVPAFHHALLAHPTERRMVLNIGGIANLSLLIPG QPVGGYDTGPGNMLMDAWIWRQAGKPYDKDAEWARAGKVILPLLQNMLSDPYFSQPAPKS TGREYFNYGWLERHLRHFPGVDPRDVQATLAELTAVTISEQVLLSGGCERLMVCGGGSRN PLLMARLAALLPGTEVTTTDAVGISGDDMEALAFAWLAWRTLAGLPGNLPSVTGASQETV LGAIFPANP >gi|296494460|gb|ADTN01000278.1| GENE 5 3972 - 4439 585 155 aa, chain + ## HITS:1 COG:slyB KEGG:ns NR:ns ## COG: slyB COG3133 # Protein_GI_number: 16129599 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane lipoprotein # Organism: Escherichia coli K12 # 1 155 1 155 155 212 99.0 2e-55 MIKRVLVVSMVGLSLVGCVNNDTLSGDVYTASEAKQVQNVSYGTIVNVRPVQIQGGDDSN VIGAIGGAVLGGFLGNTVGGGTGRSLATAAGAVAGGVAGQGVQSAMNKTQGVELEIRKDD GNTIMVVQKQGNTRFSPGQRVVLASNGSQVTVSPR Prediction of potential genes in microbial genomes Time: Mon May 16 00:07:23 2011 Seq name: gi|296494459|gb|ADTN01000279.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont710.9, whole genome shotgun sequence Length of sequence - 7401 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 4, operones 
- 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 16 - 456 373 ## COG1846 Transcriptional regulators - Prom 604 - 663 6.8 + Prom 448 - 507 9.9 2 2 Op 1 . + CDS 651 - 887 133 ## ECUMN_1934 conserved hypothetical protein; putative inner membrane protein 3 2 Op 2 6/0.000 + CDS 890 - 1747 634 ## COG1566 Multidrug resistance efflux pump 4 2 Op 3 . + CDS 1747 - 3759 1183 ## COG1289 Predicted membrane protein + Term 3897 - 3934 2.2 - Term 3652 - 3687 2.4 5 3 Op 1 3/0.500 - CDS 3760 - 4296 747 ## COG2032 Cu/Zn superoxide dismutase 6 3 Op 2 . - CDS 4362 - 5258 1068 ## COG4989 Predicted oxidoreductase - Prom 5281 - 5340 4.1 + Prom 5379 - 5438 2.2 7 4 Op 1 6/0.000 + CDS 5649 - 6248 567 ## COG1309 Transcriptional regulator 8 4 Op 2 . + CDS 6285 - 7382 1143 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family Predicted protein(s) >gi|296494459|gb|ADTN01000279.1| GENE 1 16 - 456 373 146 aa, chain - ## HITS:1 COG:ECs2351 KEGG:ns NR:ns ## COG: ECs2351 COG1846 # Protein_GI_number: 15831605 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 146 1 146 146 241 100.0 4e-64 MKLESPLGSDLARLVRIWRALIDHRLKPLELTQTHWVTLHNIHQLPPDQSQIQLAKAIGI EQPSLVRTLDQLEEKGLISRQTCASDRRAKRIKLTEKAEPLISEMEAVINKTRAEILHGI SAEELEQLITLIAKLEHNIIELQAKG >gi|296494459|gb|ADTN01000279.1| GENE 2 651 - 887 133 78 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_1934 NR:ns ## KEGG: ECUMN_1934 # Name: ydhI # Def: conserved hypothetical protein; putative inner membrane protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 78 1 78 78 105 100.0 4e-22 MKFMLNATGLPLQDLVFGASVYFPPFFKAFAFGFVIWLVVHRLLRGWIYAGDIWHPLLMD LSLFAICVCLALAILIAW >gi|296494459|gb|ADTN01000279.1| GENE 3 890 - 1747 634 285 aa, chain + ## HITS:1 COG:ydhJ KEGG:ns NR:ns ## COG: ydhJ COG1566 # Protein_GI_number: 16129602 # Func_class: V Defense mechanisms # Function: Multidrug resistance efflux pump # Organism: Escherichia coli K12 # 1 285 15 299 299 551 
100.0 1e-157 MSIKTIKYFSTIIVAVVAVLAGWWLWNYYMQSPWTRDGKIRAEQVSITPQVSGRIVELNI KDNQLVNAGDLLLTIDKTPFQIAELNAQAQLAKAQSDLAKANNEANRRRHLSQNFISAEE LDTANLNVKAMQASVDAAQATLKQAQWQLAQTEIRAPVSGWVTNLTTRIGDYADTGKPLF ALVDSHSFYVIGYFEETKLRHIREGAPAQITLYSDNKTLQGHVSSIGRAIYDQSVESDSS LIPDVKPNVPWVRLAQRVPVRFALDKVPGDVTLVSGTTCSIAVGQ >gi|296494459|gb|ADTN01000279.1| GENE 4 1747 - 3759 1183 670 aa, chain + ## HITS:1 COG:ydhK KEGG:ns NR:ns ## COG: ydhK COG1289 # Protein_GI_number: 16129603 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 670 1 670 670 1222 100.0 0 MNASSWSLRNLPWFRATLAQWRYALRNTIAMCLALTVAYYLNLDEPYWAMTSAAVVSFPT VGGVISKSLGRIAGSLLGAIAALLLAGHTLNEPWFFLLSMSAWLGFCTWACAHFTNNVAY AFQLAGYTAAIIAFPMVNITEASQLWDIAQARVCEVIVGILCGGMMMMILPSSSDATALL TALKNMHARLLEHASLLWQPETTDAIRAAHEGVIGQILTMNLLRIQAFWSHYRFRQQNAR LNALLHQQLRMTSVISSLRRMLLNWPSPPGATREILEQLLTALASSQTDVYTVARIIAPL RPTNVADYRHVAFWQRLRYFCRLYLQSSQELHRLQSGVDDHTRLPRTSGLARHTDNAEAM WSGLRTFCTLMMIGAWSIASQWDAGANALTLAAISCVLYSAVAAPFKSLSLLMRTLVLLS LFSFVVKFGLMVQISDLWQFLLFLFPLLATMQLLKLQMPKFAALWGQLIVFMGSFIAVTN PPVYDFADFLNDNLAKIVGVALAWLAFAILRPGSDARKSRRHIRALRRDFVDQLSRHPTL SESEFESLTYHHVSQLSNSQDALARRWLLRWGVVLLNCSHVVWQLRDWESRSDPLSRVRD NCISLLRGVMSERGVQQKSLAATLEELQRICDSLARHHQPAARELAAIVWRLYCSLSQLE QAPPQGTLAS >gi|296494459|gb|ADTN01000279.1| GENE 5 3760 - 4296 747 178 aa, chain - ## HITS:1 COG:ECs2355 KEGG:ns NR:ns ## COG: ECs2355 COG2032 # Protein_GI_number: 15831609 # Func_class: P Inorganic ion transport and metabolism # Function: Cu/Zn superoxide dismutase # Organism: Escherichia coli O157:H7 # 6 178 1 173 173 301 100.0 3e-82 MNGGPMKRFSLAILALVVATGAQAASEKVEMNLVTSQGVGQSIGSVTITETDKGLEFSPD LKALPPGEHGFHIHAKGSCQPATKDGKASAAESAGGHLDPQNTGKHEGPEGAGHLGDLPA LVVNNDGKATDAVIAPRLKSLDEIKDKALMVHVGGDNMSDQPKPLGGGGERYACGVIK >gi|296494459|gb|ADTN01000279.1| GENE 6 4362 - 5258 1068 298 aa, chain - ## HITS:1 COG:ydhF KEGG:ns NR:ns ## COG: ydhF COG4989 # Protein_GI_number: 16129605 # Func_class: R General function prediction only # 
Function: Predicted oxidoreductase # Organism: Escherichia coli K12 # 1 298 1 298 298 610 99.0 1e-174 MVQRITIAPQGPEFSRFVMGYWRLMDWNMSARQLVSFIEEHLDLGVTTVDHADIYGGYQC EAAFGEALKLAPHLRERMEIVSKCGIATTAREENVIGHYITDRDHIIKSAEQSLINLATD HLDLLLIHRPDPLMDADEVADAFKHLHQSGKVRHFGVSNFTPAQFALLQSRLPFTLATNQ VEISPVHQPLLLDGTLDQLQQLRVRPMAWSCLGGGRLFNDDYFQPLRDELAVVAEELNAG SIEQVVYAWVLRLPSQPLPIIGSGKIERVRAAVEAETLKMTRQQWFRIRKAALGYDVP >gi|296494459|gb|ADTN01000279.1| GENE 7 5649 - 6248 567 199 aa, chain + ## HITS:1 COG:ydhM KEGG:ns NR:ns ## COG: ydhM COG1309 # Protein_GI_number: 16129607 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 199 1 199 199 391 100.0 1e-109 MNKHTEHDTREHLLATGEQLCLQRGFTGMGLSELLKTAEVPKGSFYHYFRSKEAFGVAML ERHYAAYHQRLTELLQSGEGNYRDRILAYYQQTLNQFCQHGTISGCLTVKLSAEVCDLSE DMRSAMDKGARGVIALLSQALENGRENHCLTFCGEPLQQAQVLYALWLGANLQAKISRSF EPLENALAHVKNIIATPAV >gi|296494459|gb|ADTN01000279.1| GENE 8 6285 - 7382 1143 365 aa, chain + ## HITS:1 COG:nemA KEGG:ns NR:ns ## COG: nemA COG1902 # Protein_GI_number: 16129608 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Escherichia coli K12 # 1 365 1 365 365 702 100.0 0 MSSEKLYSPLKVGAITAANRIFMAPLTRLRSIEPGDIPTPLMAEYYRQRASAGLIISEAT QISAQAKGYAGAPGIHSPEQIAAWKKITAGVHAENGHMAVQLWHTGRISHASLQPGGQAP VAPSALSAGTRTSLRDENGQAIRVETSMPRALELEEIPGIVNDFRQAIANAREAGFDLVE LHSAHGYLLHQFLSPSSNHRTDQYGGSVENRARLVLEVVDAGIEEWGADRIGIRVSPIGT FQNTDNGPNEEADALYLIEQLGKRGIAYLHMSEPDWAGGEPYTDAFREKVRARFHGPIIG AGAYTVEKAETLIGKGLIDAVAFGRDWIANPDLVARLQRKAELNPQRAESFYGGGAEGYT DYPTL Prediction of potential genes in microbial genomes Time: Mon May 16 00:07:34 2011 Seq name: gi|296494458|gb|ADTN01000280.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont710.10, whole genome shotgun sequence Length of sequence - 29444 bp Number of predicted genes - 25, with homology - 24 Number of transcription units - 18, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End 
Score pairs(N/Pv) 1 1 Tu 1 . + CDS 57 - 464 216 ## PROTEIN SUPPORTED gi|15900839|ref|NP_345443.1| lactoylglutathione lyase + Term 473 - 508 5.2 + Prom 480 - 539 3.9 2 2 Op 1 3/0.875 + CDS 567 - 1214 763 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases 3 2 Op 2 . + CDS 1307 - 5923 3407 ## COG1201 Lhr-like helicases + Term 5943 - 5984 -0.8 - Term 5919 - 5959 8.5 4 3 Tu 1 . - CDS 5974 - 6321 516 ## COG0278 Glutaredoxin-related protein - Prom 6355 - 6414 4.7 + Prom 6542 - 6601 7.1 5 4 Tu 1 . + CDS 6655 - 7470 456 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) + Term 7474 - 7515 8.1 + Prom 7486 - 7545 4.4 6 5 Tu 1 . + CDS 7598 - 8179 462 ## PROTEIN SUPPORTED gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent + Term 8293 - 8321 -1.0 7 6 Tu 1 . - CDS 8414 - 9583 1247 ## COG2814 Arabinose efflux permease - Prom 9629 - 9688 3.4 - Term 9649 - 9698 2.4 8 7 Tu 1 . - CDS 9749 - 9838 161 ## - Prom 9868 - 9927 5.6 + Prom 9921 - 9980 4.0 9 8 Tu 1 . + CDS 10137 - 11162 1081 ## COG1609 Transcriptional regulators + Term 11169 - 11215 0.5 - Term 11026 - 11078 2.1 10 9 Tu 1 . - CDS 11159 - 12091 708 ## COG0583 Transcriptional regulator - Prom 12120 - 12179 4.4 + Prom 12113 - 12172 2.9 11 10 Tu 1 . + CDS 12204 - 13415 1094 ## COG0477 Permeases of the major facilitator superfamily + Term 13615 - 13647 1.0 + Prom 13419 - 13478 7.3 12 11 Tu 1 . + CDS 13706 - 14854 978 ## COG2230 Cyclopropane fatty acid synthase and related methyltransferases + Term 14866 - 14898 4.6 13 12 Tu 1 . - CDS 14894 - 15535 727 ## COG0307 Riboflavin synthase alpha chain - Prom 15749 - 15808 3.6 + Prom 15610 - 15669 5.7 14 13 Tu 1 . + CDS 15750 - 17123 1531 ## COG0534 Na+-driven multidrug efflux pump + Term 17130 - 17171 7.2 - Term 17119 - 17156 6.1 15 14 Tu 1 . 
- CDS 17164 - 18420 980 ## COG3468 Type V secretory pathway, adhesin AidA - Prom 18457 - 18516 3.3 + TRNA 18728 - 18804 92.3 # Val GAC 0 0 + TRNA 18809 - 18885 91.1 # Val GAC 0 0 + Prom 18811 - 18870 80.2 16 15 Op 1 . + CDS 18993 - 19298 332 ## SSON_1489 hypothetical protein + Term 19313 - 19339 -0.7 + Prom 19304 - 19363 3.3 17 15 Op 2 . + CDS 19424 - 21028 788 ## COG4529 Uncharacterized protein conserved in bacteria + Term 21041 - 21075 -0.5 18 16 Op 1 . - CDS 21040 - 21852 275 ## B21_01628 hypothetical protein 19 16 Op 2 3/0.875 - CDS 21856 - 22641 624 ## COG4117 Thiosulfate reductase cytochrome B subunit (membrane anchoring protein) 20 16 Op 3 . - CDS 22638 - 23306 262 ## COG0437 Fe-S-cluster-containing hydrogenase components 1 21 16 Op 4 . - CDS 23370 - 24008 453 ## B21_01631 hypothetical protein 22 16 Op 5 5/0.500 - CDS 24021 - 26123 1950 ## COG2414 Aldehyde:ferredoxin oxidoreductase 23 16 Op 6 . - CDS 26144 - 26770 381 ## COG0437 Fe-S-cluster-containing hydrogenase components 1 - Term 27180 - 27220 8.2 24 17 Tu 1 . - CDS 27225 - 27434 249 ## LF82_2868 uncharacterized protein YdhZ - Prom 27523 - 27582 7.2 + Prom 27702 - 27761 8.3 25 18 Tu 1 . 
+ CDS 27991 - 29403 1503 ## COG0469 Pyruvate kinase Predicted protein(s) >gi|296494458|gb|ADTN01000280.1| GENE 1 57 - 464 216 135 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900839|ref|NP_345443.1| lactoylglutathione lyase [Streptococcus pneumoniae TIGR4] # 2 127 4 126 126 87 38 7e-17 MRLLHTMLRVGDLQRSIDFYTKVLGMKLLRTSENPEYKYSLAFVGYGPETEEAVIELTYN WGVDKYELGTAYGHIALSVDNAAEACEKIRQNGGNVTREAGPVKGGTTVIAFVEDPDGYK IELIEEKDAGRGLGN >gi|296494458|gb|ADTN01000280.1| GENE 2 567 - 1214 763 215 aa, chain + ## HITS:1 COG:rnt KEGG:ns NR:ns ## COG: rnt COG0847 # Protein_GI_number: 16129610 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Escherichia coli K12 # 1 215 1 215 215 439 100.0 1e-123 MSDNAQLTGLCDRFRGFYPVVIDVETAGFNAKTDALLEIAAITLKMDEQGWLMPDTTLHF HVEPFVGANLQPEALAFNGIDPNDPDRGAVSEYEALHEIFKVVRKGIKASGCNRAIMVAH NANFDHSFMMAAAERASLKRNPFHPFATFDTAALAGLALGQTVLSKACQTAGMDFDSTQA HSALYDTERTAVLFCEIVNRWKRLGGWPLSAAEEV >gi|296494458|gb|ADTN01000280.1| GENE 3 1307 - 5923 3407 1538 aa, chain + ## HITS:1 COG:lhr KEGG:ns NR:ns ## COG: lhr COG1201 # Protein_GI_number: 16129611 # Func_class: R General function prediction only # Function: Lhr-like helicases # Organism: Escherichia coli K12 # 1 1538 1 1538 1538 2924 100.0 0 MADNPDPSSLLPDVFSPATRDWFLRAFKQPTAVQPQTWHVAARSEHALVIAPTGSGKTLA AFLYALDRLFREGGEDTREAHKRKTSRILYISPIKALGTDVQRNLQIPLKGIADERRRRG ETEVNLRVGIRTGDTPAQERSKLTRNPPDILITTPESLYLMLTSRARETLRGVETVIIDE VHAVAGSKRGAHLALSLERLDALLHTSAQRIGLSATVRSASDVAAFLGGDRPVTVVNPPA MRHPQIRIVVPVANMDDVSSVASGTGEDSHAGREGSIWPYIETGILDEVLRHRSTIVFTN SRGLAEKLTARLNELYAARLQRSPSIAVDAAHFESTSGATSNRVQSSDVFIARSHHGSVS KEQRAITEQALKSGELRCVVATSSLELGIDMGAVDLVIQVATPLSVASGLQRIGRAGHQV GGVSKGLFFPRTRRDLVDSAVIVECMFAGRLENLTPPHNPLDVLAQQTVAAAAMDALQVD EWYSRVRRAAPWKDLPRRVFDATLDMLSGRYPSGDFSAFRPKLVWNRETGILTARPGAQL LAVTSGGTIPDRGMYSVLLPEGEEKAGSRRVGELDEEMVYESRVNDIITLGATSWRIQQI TRDQVIVTPAPGRSARLPFWRGEGNGRPAELGEMIGDFLHLLADGAFFSGTIPPWLAEEN 
TIANIQGLIEEQRNATGIVPGSRHLVLERCRDEIGDWRIILHSPYGRRVHEPWAVAIAGR IHALWGADASVVASDDGIVARIPDTDGKLPDAAIFLFEPEKLLQIVREAVGSSALFAARF RECAARALLMPGRTPGHRTPLWQQRLRASQLLEIAQGYPDFPVILETLRECLQDVYDLPA LERLMRRLNGGEIQISDVTTTTPSPFATSLLFGYVAEFMYQSDAPLAERRASVLSLDSEL LRNLLGQVDPGELLDPQVIRQVEEELQRLAPGRRAKGEEGLFDLLRELGPMTVEDLAQRH TGSSEEVASYLENLLAVKRIFPAMISGQERLACMDDAARLRDALGVRLPESLPEIYLHRV SYPLRDLFLRYLRAHALVTAEQLAHEFSLGIAIVEEQLQQLREQGLVMNLQQDIWVSDEV FRRLRLRSLQAAREATRPVAATTYARLLLERQGVLPATDGSPALFASTSPGVYEGVDGVM RVIEQLAGVGLPASLWESQILPARVRDYSSEMLDELLATGAVIWSGQKKLGEDDGLVALH LQEYAAESFTPAEADQANRSALQQAIVAVLADGGAWFAQQISQRIRDKIGESVDLSALQE ALWALVWQGVITSDIWAPLRALTRSSSNARTSTRRSHRARRGRPVYAQPVSPRVSYNTPN LAGRWSLLQVEPLNDTERMLALAENMLDRYGIISRQAVIAENIPGGFPSMQTLCRSMEDS GRIMRGRFVEGLGGAQFAERLTIDRLRDLATQATQTRHYTPVALSANDPANVWGNLLPWP AHPATLVPTRRAGALVVVSGGKLLLYLAQGGKKMLVWQEKEELLAPEVFHALTTALRREP RLRFTLTEVNDLPVRQTPMFTLLREAGFSSSPQGLDWG >gi|296494458|gb|ADTN01000280.1| GENE 4 5974 - 6321 516 115 aa, chain - ## HITS:1 COG:ECs2363 KEGG:ns NR:ns ## COG: ECs2363 COG0278 # Protein_GI_number: 15831617 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutaredoxin-related protein # Organism: Escherichia coli O157:H7 # 1 115 1 115 115 225 100.0 1e-59 MSTTIEKIQRQIAENPILLYMKGSPKLPSCGFSAQAVQALAACGERFAYVDILQNPDIRA ELPKYANWPTFPQLWVDGELVGGCDIVIEMYQRGELQQLIKETAAKYKSEEPDAE >gi|296494458|gb|ADTN01000280.1| GENE 5 6655 - 7470 456 271 aa, chain + ## HITS:1 COG:ydhO KEGG:ns NR:ns ## COG: ydhO COG0791 # Protein_GI_number: 16129613 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Escherichia coli K12 # 1 271 1 271 271 422 100.0 1e-118 MARINRISITLCALLFTTLPLTPMAHASKQARESSATTHITKKADKKKSTATTKKTQKTA KKAASKSTTKSKTASSVKKSSITASKNAKTRSKHAVNKTASASFTEKCTKRKGYKSHCVK VKNAASGTLADAHKAKVQKATKVAMNKLMQQIGKPYRWGGSSPRTGFDCSGLVYYAYKDL VKIRIPRTANEMYHLRDAAPIERSELKNGDLVFFRTQGRGTADHVGVYVGNGKFIQSPRT GQEIQITSLSEDYWQRHYVGARRVMTPKTLR 
>gi|296494458|gb|ADTN01000280.1| GENE 6 7598 - 8179 462 193 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent [Streptococcus pneumoniae TIGR4] # 1 193 1 199 201 182 44 2e-45 MSFELPALPYAKDALAPHISAETIEYHYGKHHQTYVTNLNNLIKGTAFEGKSLEEIIRSS EGGVFNNAAQVWNHTFYWNCLAPNAGGEPTGKVAEAIAASFGSFADFKAQFTDAAIKNFG SGWTWLVKNSDGKLAIVSTSNAGTPLTTDATPLLTVDVWEHAYYIDYRNARPGYLEHFWA LVNWEFVAKNLAA >gi|296494458|gb|ADTN01000280.1| GENE 7 8414 - 9583 1247 389 aa, chain - ## HITS:1 COG:ydhP KEGG:ns NR:ns ## COG: ydhP COG2814 # Protein_GI_number: 16129615 # Func_class: G Carbohydrate transport and metabolism # Function: Arabinose efflux permease # Organism: Escherichia coli K12 # 1 389 1 389 389 571 100.0 1e-163 MKINYPLLALAIGAFGIGTTEFSPMGLLPVIARGVDVSIPAAGMLISAYAVGVMVGAPLM TLLLSHRARRSALIFLMAIFTLGNVLSAIAPDYMTLMLSRILTSLNHGAFFGLGSVVAAS VVPKHKQASAVATMFMGLTLANIGGVPAATWLGETIGWRMSFLATAGLGVISMVSLFFSL PKGGAGARPEVKKELAVLMRPQVLSALLTTVLGAGAMFTLYTYISPVLQSITHATPVFVT AMLVLIGVGFSIGNYLGGKLADRSVNGTLKGFLLLLMVIMLAIPFLARNEFGAAISMVVW GAATFAVVPPLQMRVMRVASEAPGLSSSVNIGAFNLGNALGAAAGGAVISAGLGYSFVPV MGAIVAGLALLLVFMSARKQPETVCVANS >gi|296494458|gb|ADTN01000280.1| GENE 8 9749 - 9838 161 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSTDLKFSLVTTIIVLGLIVAVGLTAALH >gi|296494458|gb|ADTN01000280.1| GENE 9 10137 - 11162 1081 341 aa, chain + ## HITS:1 COG:ECs2367 KEGG:ns NR:ns ## COG: ECs2367 COG1609 # Protein_GI_number: 15831621 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 341 1 341 341 700 100.0 0 MATIKDVAKRANVSTTTVSHVINKTRFVAEETRNAVWAAIKELHYSPSAVARSLKVNHTK SIGLLATSSEAAYFAEIIEAVEKNCFQKGYTLILGNAWNNLEKQRAYLSMMAQKRVDGLL VMCSEYPEPLLAMLEEYRHIPMVVMDWGEAKADFTDAVIDNAFEGGYMAGRYLIERGHRE IGVIPGPLERNTGAGRLAGFMKAMEEAMIKVPESWIVQGDFEPESGYRAMQQILSQPHRP TAVFCGGDIMAMGALCAADEMGLRVPQDVSLIGYDNVRNARYFTPALTTIHQPKDSLGET AFNMLLDRIVNKREEPQSIEVHPRLIERRSVADGPFRDYRR >gi|296494458|gb|ADTN01000280.1| GENE 10 11159 - 12091 708 310 aa, chain - 
## HITS:1 COG:ECs2368 KEGG:ns NR:ns ## COG: ECs2368 COG0583 # Protein_GI_number: 15831622 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 310 1 310 310 622 100.0 1e-178 MWSEYSLEVVDAVARNGSFSAAAQELHRVPSAVSYTVRQLEEWLAVPLFERRHRDVELTA AGAWFLKEGRSVVKKMQITRQQCQQIANGWRGQLAIAVDNIVRPERTRQMIVDFYRHFDD VELLVFQEVFNGVWDALSDGRVELAIGATRAIPVGGRYAFRDMGMLSWSCVVASHHPLAL MDGPFSDDTLRNWPSLVREDTSRTLPKRITWLLDNQKRVVVPDWESSATCISAGLCIGMV PTHFAKPWLNEGKWVALELENPFPDSACCLTWQQNDMSPALTWLLEYLGDSETLNKEWLR EPEETPATGD >gi|296494458|gb|ADTN01000280.1| GENE 11 12204 - 13415 1094 403 aa, chain + ## HITS:1 COG:ECs2369 KEGG:ns NR:ns ## COG: ECs2369 COG0477 # Protein_GI_number: 15831623 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 1 403 1 403 403 706 99.0 0 MQPGKRFLVWLAGLSVLGFLATDMYLPAFAAIQADLQTPASAVSASLSLFLAGFAAAQLL WGPLSDRYGRKPVLLIGLTIFALGSLGMLWVENAATLLVLRFVQAVGVCAAAVIWQALVT DYYPSQKVNRIFATIMPLVGLSPALAPLLGSWLLVHFSWQAIFATLFAITVVLILPIFWL KPTTKARNNSQDGLTFTDLLRSKTYRGNVLIYAACSASFFAWLTGSPFILSEMGYSPAVI GLSYVPQTIAFLIGGYGCRAALQKWQGKQLLPWLLVLFAVSVIATWAAGFISHVSLVEIL IPFCVMAIANGAIYPIVVAQALRPFPHATGRAAALQNTLQLGLCFLASLVVSWLISISTP LLTTTSVMLSTVVLVALGYMMQRCEEVGCQNHGNAEVAHSESH >gi|296494458|gb|ADTN01000280.1| GENE 12 13706 - 14854 978 382 aa, chain + ## HITS:1 COG:cfa KEGG:ns NR:ns ## COG: cfa COG2230 # Protein_GI_number: 16129619 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cyclopropane fatty acid synthase and related methyltransferases # Organism: Escherichia coli K12 # 1 382 1 382 382 812 100.0 0 MSSSCIEEVSVPDDNWYRIANELLSRAGIAINGSAPADIRVKNPDFFKRVLQEGSLGLGE SYMDGWWECDRLDMFFSKVLRAGLENQLPHHFKDTLRIAGARLFNLQSKKRAWIVGKEHY DLGNDLFSRMLDPFMQYSCAYWKDADNLESAQQAKLKMICEKLQLKPGMRVLDIGCGWGG 
LAHYMASNYDVSVVGVTISAEQQKMAQERCEGLDVTILLQDYRDLNDQFDRIVSVGMFEH VGPKNYDTYFAVVDRNLKPEGIFLLHTIGSKKTDLNVDPWINKYIFPNGCLPSVRQIAQS SEPHFVMEDWHNFGADYDTTLMAWYERFLAAWPEIADNYSERFKRMFTYYLNACAGAFRA RDIQLWQVVFSRGVENGLRVAR >gi|296494458|gb|ADTN01000280.1| GENE 13 14894 - 15535 727 213 aa, chain - ## HITS:1 COG:ribC KEGG:ns NR:ns ## COG: ribC COG0307 # Protein_GI_number: 16129620 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase alpha chain # Organism: Escherichia coli K12 # 1 213 1 213 213 426 100.0 1e-119 MFTGIVQGTAKLVSIDEKPNFRTHVVELPDHMLDGLETGASVAHNGCCLTVTEINGNHVS FDLMKETLRITNLGDLKVGDWVNVERAAKFSDEIGGHLMSGHIMTTAEVAKILTSENNRQ IWFKVQDSQLMKYILYKGFIGIDGISLTVGEVTPTRFCVHLIPETLERTTLGKKKLGARV NIEIDPQTQAVVDTVERVLAARENAMNQPGTEA >gi|296494458|gb|ADTN01000280.1| GENE 14 15750 - 17123 1531 457 aa, chain + ## HITS:1 COG:ECs2372 KEGG:ns NR:ns ## COG: ECs2372 COG0534 # Protein_GI_number: 15831626 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Escherichia coli O157:H7 # 1 457 1 457 457 799 99.0 0 MQKYISEARLLLALAIPVILAQIAQTAMGFVDTVMAGGYSATDMAAVAIGTSIWLPAILF GHGLLLALTPVIAQLNGSGRRERIAHQVRQGFWLAGFVSVLIMLVLWNAGYIIRSMENID PALADKAVGYLRALLWGAPGYLFFQVARNQCEGLAKTKPGMVMGFIGLLVNIPVNYIFIY GHFGMPELGGVGCGVATAAVYWVMFLAMVSYIKRARSMRDIRNEKGTAKPDPAVMKRLIQ LGLPIALALFFEVTLFAVVALLVSPLGIVDVAGHQIALNFSSLMFVLPMSLAAAVTIRVG YRLGQGSTLDAQTAARTGLMVGVCMATLTAIFTVSLREQIALLYNDNPEVVTLAAHLMLL AAVYQISDSIQVIGSGILRGYKDTRSIFYITFTAYWVLGLPSGYILALTDLVVEPMGPAG FWIGFIIGLTSAAIMMMLRMRFLQRLPSAIILQRASR >gi|296494458|gb|ADTN01000280.1| GENE 15 17164 - 18420 980 418 aa, chain - ## HITS:1 COG:ydhQ KEGG:ns NR:ns ## COG: ydhQ COG3468 # Protein_GI_number: 16129622 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 1 418 1 418 418 701 99.0 0 MGSDAKNLMSDGNVQIVKTGEVIGATQLTEGELIVEAGGRAENTVVTGAGWLKVATGGIA 
KCTQYGNNGTLSVSDGAIATDIVQSEGGAISLSTLATVNGRHPEGEFSVDKGYACGLLLE NGGNLRVLEGHRAEKIILDQEGGLLVNGTTSAVVVDEGGELLVYPGGEASNCEINQGGVF MLAGKASDTLLAGGTMNNLGGEDSDTIVENGSIYRLGTDGLQLYSSGKTQNLSVNVGGRA EVHAGTLENAVIQGGTVILLSPTSADENFVVEEDRAPVELTGSVALLDGASMIIGYGAEL QQSTITVQQGGVLILDGSTVKGDGVTFIVGNINLNGGKLWLITDAATHVQLKVKRLRGEG AICLQTSAKEISPDFINVKGEVTGDIHVEITDASRQTLCNALKLQPDEDGIGATLQPA >gi|296494458|gb|ADTN01000280.1| GENE 16 18993 - 19298 332 101 aa, chain + ## HITS:1 COG:no KEGG:SSON_1489 NR:ns ## KEGG: SSON_1489 # Name: not_defined # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 101 1 101 101 187 99.0 1e-46 MATLLQLHFAFNGPFGDAMAEQLKLLAESINQEPGFLWKVWTESEKNHEAGGIYLFTDEK SALAYLEKHTARLKNLGVEEVVAKVFDVNEPLSQINQAKLA >gi|296494458|gb|ADTN01000280.1| GENE 17 19424 - 21028 788 534 aa, chain + ## HITS:1 COG:ydhS KEGG:ns NR:ns ## COG: ydhS COG4529 # Protein_GI_number: 16129624 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 534 1 534 534 1087 100.0 0 MKKIAIVGAGPTGIYTLFSLLQQQTPLSISIFEQADEAGVGMPYSDEENSKMMLANIASI EIPPIYCTYLEWLQKQEDSHLQRYGVKKETLHDRQFLPRILLGEYFRDQFLRLVDQARQQ KFAVAVYESCQVTDLQITNAGVMLATNQDLPSETFDLAVIATGHVWPDEEEATRTYFPSP WSGLMEAKVDACNVGIMGTSLSGLDAAMAVAIQHGSFIEDDKQHVVFHRDNASEKLNITL LSRTGILPEADFYCPIPYEPLHIVTDQALNAEIQKGEEGLLDRVFRLIVEEIKFADPDWS QRIALESLNVDSFAQAWFAERKQRDPFDWAEKNLQEVERNKREKHTVPWRYVILRLHEAV QEIVPHLNEHDHKRFSKGLARVFIDNYAAIPSESIRRLLALREAGIIHILALGEDYKMEI NESRTVLKTEDNSYSFDVFIDARGQRPLKVKDIPFPGLREQLQKTGDEIPDVGEDYTLQQ PEDIRGRVAFGALPWLMHDQPFVQGLTACAEIGEAMARAVVKPASRARRRLSFD >gi|296494458|gb|ADTN01000280.1| GENE 18 21040 - 21852 275 270 aa, chain - ## HITS:1 COG:no KEGG:B21_01628 NR:ns ## KEGG: B21_01628 # Name: ydhT # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 270 1 270 270 517 100.0 1e-145 MIITQADLREWRIGAVMYRWFLRHFPRGGSYADIHHALIEEGYTDWAESLVEYAWKKWLA DENFAHQEVSSMQKLATDPGERPFCSQFARSDDHARIGCCEDNARIATAGYAAQIASMGY 
SVRIGSVGFNSHIGSSGERARVAVTGNSSRISSAGDSSRIANTGMRVRVCTLGERCHVAS NGDLVQIASFGANARIANSGDNVHIIASGENSTVVSTGVVDSIILGPGGSAALAYHDGER VRFAVAIEGENNIRAGVRYRLNEQHQFVEC >gi|296494458|gb|ADTN01000280.1| GENE 19 21856 - 22641 624 261 aa, chain - ## HITS:1 COG:ydhU KEGG:ns NR:ns ## COG: ydhU COG4117 # Protein_GI_number: 16129626 # Func_class: C Energy production and conversion # Function: Thiosulfate reductase cytochrome B subunit (membrane anchoring protein) # Organism: Escherichia coli K12 # 1 261 1 261 261 484 100.0 1e-136 MNPSQHAEQFQSQLANYVPQFTPEFWPVWLIIAGVLLVGMWLVLGLHALLRARGVKKSAT DHGEKIYLYSKAVRLWHWSNALLFVLLLASGLINHFAMVGATAVKSLVAVHEVCGFLLLA CWLGFVLINAVGDNGHHYRIRRQGWLERAAKQTRFYLFGIMQGEEHPFPATTQSKFNPLQ QVAYVGVMYGLLPLLLLTGLLCLYPQAVGDVFPGVRYWLLQTHFALAFISLFFIFGHLYL CTTGRTPHETFKSMVDGYHRH >gi|296494458|gb|ADTN01000280.1| GENE 20 22638 - 23306 262 222 aa, chain - ## HITS:1 COG:ydhX KEGG:ns NR:ns ## COG: ydhX COG0437 # Protein_GI_number: 16129627 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli K12 # 1 222 18 239 239 457 100.0 1e-129 MSFTRRKFVLGMGTVIFFTGSASSLLANTRQEKEVRYAMIHDESRCNGCNICARACRKTN HVPAQGSRLSIAHIPVTDNDNETQYHFFRQSCQHCEDAPCIDVCPTGASWRDEQGIVRVE KSQCIGCSYCIGACPYQVRYLNPVTKVADKCDFCAESRLAKGFPPICVSACPEHALIFGR EDSPEIQAWLQQNKYYQYQLPGAGKPHLYRRFGQHLIKKENV >gi|296494458|gb|ADTN01000280.1| GENE 21 23370 - 24008 453 212 aa, chain - ## HITS:1 COG:no KEGG:B21_01631 NR:ns ## KEGG: B21_01631 # Name: ydhW # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 212 4 215 215 377 100.0 1e-103 MNHQDELPLAKVSEVDEAKRQWLQGMRHPVDTVTEPEPAEILAEFIRQHSAAGQLVARAV FLSPPYLVAEEELSVLLESIKQNGDYADIACLTGSKDDYYYSTQAMSENYAAMSLQVVEQ DICRAIAHAVRFECQTYPRPYKVAMLMQAPYYFQEAQIEAAIAAMDVAPEYADIRQVESS TAVLYLFSERFMTYGKAYGLCEWFEVEQFQNP >gi|296494458|gb|ADTN01000280.1| GENE 22 24021 - 26123 1950 700 aa, chain - ## HITS:1 COG:ydhV KEGG:ns NR:ns ## COG: ydhV COG2414 # Protein_GI_number: 16129629 # Func_class: C Energy 
production and conversion # Function: Aldehyde:ferredoxin oxidoreductase # Organism: Escherichia coli K12 # 1 700 1 700 700 1493 99.0 0 MANGWTGNILRVNLTTGNITLEDSSKFKSFVGGMGFGYKIMYDEVPPGTKPFDEANKLVF ATGPLTGSGAPCSSRVNITSLSTFTKGNLVVDAHMGGFFAAQMKFAGYDVIIIEGKAKSP VWLKIKDDKVSLEKADFLWGKGTRATTEEICRLTSPETCVAAIGQAGENLVPLSGMLNSR NHSGGAGTGAIMGSKNLKAIAVEGTKGVNIADRQEMKRLNDYMMTELIGANNNHVVPSTP QSWAEYSDPKSRWTARKGLFWGAAEGGPIETGEIPPGNQNTVGFRTYKSVFDLGPAAEKY TVKMSGCHSCPIRCMTQMNIPRVKEFGVPSTGGNTCVANFVHTTIFPNGPKDFEDKDDGR VIGNLVGLNLFDDYGLWCNYGQLHRDFTYCYSKGVFKRVLPAEEYAEIRWDQLEAGDVNF IKDFYYRLAHRVGELSHLADGSYAIAERWNLGEEYWGYAKNKLWSPFGYPVHHANEASAQ VGSIVNCMFNRDCMTHTHINFIGSGLPLKLQREVAKELFGSEDAYDETKNYTPINDAKIK YAKWSLLRVCLHNAVTLCNWVWPMTVSPLKSRNYRGDLALEAKFFKAITGEEMTQEKLDL AAERIFTLHRAYTVKLMQTKDMRNEHDLICSWVFDKDPQIPVFTEGTDKMDRDDMHASLT MFYKEMGWDPQLGCPTRETLQRLGREDIAADLAAHNLLPA >gi|296494458|gb|ADTN01000280.1| GENE 23 26144 - 26770 381 208 aa, chain - ## HITS:1 COG:ECs2381 KEGG:ns NR:ns ## COG: ECs2381 COG0437 # Protein_GI_number: 15831635 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli O157:H7 # 1 208 1 208 208 368 100.0 1e-102 MNPVDRPLLDIGLTRLEFLRISGKGLAGLTIAPALLSLLGCKQEDIDSGTVGLINTPKGV LVTQRARCTGCHRCEISCTNFNDGSVGTFFSRIKIHRNYFFGDNGVGSGGGLYGDLNYTA DTCRQCKEPQCMNVCPIGAITWQQKEGCITVDHKRCIGCSACTTACPWMMATVNTESKKS SKCVLCGECANACPTGALKIIEWKDITV >gi|296494458|gb|ADTN01000280.1| GENE 24 27225 - 27434 249 69 aa, chain - ## HITS:1 COG:no KEGG:LF82_2868 NR:ns ## KEGG: LF82_2868 # Name: ydhZ # Def: uncharacterized protein YdhZ # Organism: E.coli_LF82 # Pathway: not_defined # 1 69 1 69 69 109 100.0 3e-23 MGNRTKEDELYREMCRVVGKVVLEMRDLGQEPKHIVIAGVLRTALANKRIQRSELEKQAM ETVINALVK >gi|296494458|gb|ADTN01000280.1| GENE 25 27991 - 29403 1503 470 aa, chain + ## HITS:1 COG:ECs2383 KEGG:ns NR:ns ## COG: ECs2383 COG0469 # Protein_GI_number: 15831637 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: 
Escherichia coli O157:H7 # 1 470 1 470 470 884 100.0 0 MKKTKIVCTIGPKTESEEMLAKMLDAGMNVMRLNFSHGDYAEHGQRIQNLRNVMSKTGKT AAILLDTKGPEIRTMKLEGGNDVSLKAGQTFTFTTDKSVIGNSEMVAVTYEGFTTDLSVG NTVLVDDGLIGMEVTAIEGNKVICKVLNNGDLGENKGVNLPGVSIALPALAEKDKQDLIF GCEQGVDFVAASFIRKRSDVIEIREHLKAHGGENIHIISKIENQEGLNNFDEILEASDGI MVARGDLGVEIPVEEVIFAQKMMIEKCIRARKVVITATQMLDSMIKNPRPTRAEAGDVAN AILDGTDAVMLSGESAKGKYPLEAVSIMATICERTDRVMNSRLEFNNDNRKLRITEAVCR GAVETAEKLDAPLIVVATQGGKSARAVRKYFPDATILALTTNEKTAHQLVLSKGVVPQLV KEITSTDDFYRLGKELALQSGLAHKGDVVVMVSGALVPSGTTNTASVHVL Prediction of potential genes in microbial genomes Time: Mon May 16 00:07:50 2011 Seq name: gi|296494457|gb|ADTN01000281.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont710.11, whole genome shotgun sequence Length of sequence - 542 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 169 - 228 5.6 1 1 Tu 1 . + CDS 269 - 505 321 ## COG4238 Murein lipoprotein Predicted protein(s) >gi|296494457|gb|ADTN01000281.1| GENE 1 269 - 505 321 78 aa, chain + ## HITS:1 COG:ECs2384 KEGG:ns NR:ns ## COG: ECs2384 COG4238 # Protein_GI_number: 15831638 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Murein lipoprotein # Organism: Escherichia coli O157:H7 # 1 78 1 78 78 105 100.0 2e-23 MKATKLVLGAVILGSTLLAGCSSNAKIDQLSSDVQTLNAKVDQLSNDVNAMRSDVQAAKD DAARANQRLDNMATKYRK Prediction of potential genes in microbial genomes Time: Mon May 16 00:07:57 2011 Seq name: gi|296494456|gb|ADTN01000282.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont710.12, whole genome shotgun sequence Length of sequence - 20697 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 6, operones - 4 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 40 - 1044 746 ## COG1376 Uncharacterized protein conserved in bacteria - Prom 1123 - 1182 2.8 2 2 Op 1 7/0.000 - CDS 1193 - 1609 588 ## COG2166 SufE protein probably involved in Fe-S center assembly 3 2 Op 2 24/0.000 - CDS 1622 - 2842 1307 ## COG0520 Selenocysteine lyase 4 2 Op 3 41/0.000 - CDS 2839 - 4110 1131 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 5 2 Op 4 41/0.000 - CDS 4085 - 4831 162 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 6 2 Op 5 3/1.000 - CDS 4841 - 6367 1439 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 7 2 Op 6 . - CDS 6337 - 6705 529 ## COG0316 Uncharacterized conserved protein - Prom 6888 - 6947 4.5 - Term 6982 - 7032 12.4 8 3 Op 1 . - CDS 7253 - 7441 295 ## SbBS512_E1886 hypothetical protein - Prom 7466 - 7525 5.9 - Term 7453 - 7495 6.1 9 3 Op 2 6/0.000 - CDS 7541 - 7951 325 ## COG2050 Uncharacterized protein, possibly involved in aromatic compounds catabolism 10 3 Op 3 . - CDS 7948 - 11004 2537 ## COG0277 FAD/FMN-containing dehydrogenases 11 4 Tu 1 . + CDS 11393 - 12505 1326 ## COG0628 Predicted permease + Term 12513 - 12545 6.3 + Prom 12749 - 12808 6.3 12 5 Op 1 . + CDS 12934 - 13290 158 ## JW1679 conserved hypothetical protein 13 5 Op 2 21/0.000 + CDS 13390 - 14604 922 ## COG0477 Permeases of the major facilitator superfamily + Prom 14622 - 14681 3.2 14 5 Op 3 4/1.000 + CDS 14831 - 16096 842 ## COG0477 Permeases of the major facilitator superfamily 15 5 Op 4 8/0.000 + CDS 16108 - 16974 905 ## COG0169 Shikimate 5-dehydrogenase 16 5 Op 5 3/1.000 + CDS 17005 - 17763 823 ## COG0710 3-dehydroquinate dehydratase + Term 17768 - 17820 -0.8 + Prom 17776 - 17835 2.9 17 6 Op 1 . + CDS 17906 - 19501 1602 ## COG4670 Acyl CoA:acetate/3-ketoacid CoA transferase 18 6 Op 2 . 
+ CDS 19515 - 20666 1508 ## COG1960 Acyl-CoA dehydrogenases Predicted protein(s) >gi|296494456|gb|ADTN01000282.1| GENE 1 40 - 1044 746 334 aa, chain - ## HITS:1 COG:ynhG KEGG:ns NR:ns ## COG: ynhG COG1376 # Protein_GI_number: 16129634 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 334 1 334 334 605 99.0 1e-173 MKRASLLTLTLIGALSAIQAAWAVDYPLPPTGSRLVGQNQTYTVQEGDKNLQAIARRFDT AAMLILEANNTIAPVPKPGTTITIPSQLLLPDAPRQGIIVNLAELRLYYYPPGENIVQVY PIGIGLQGLETPVMETRVGQKIPNPTWTPTAGIRQRSLERGIKLPPVVPAGPNNPLGRYA LRLAHGNGEYLIHGTSAPDSVGLRVSSGCIRMNAPDIKALFSSVRTGTPVKVINEPVKYS VEPNGMRYVEVHRPLSAEEQQNVQTMPYTLPAGFTQFKDNKAVDQKLVDKALYRRAGYPV SVSSGATPAASNAPSVESAQNGEPEQGNMLRVTQ >gi|296494456|gb|ADTN01000282.1| GENE 2 1193 - 1609 588 138 aa, chain - ## HITS:1 COG:ynhA KEGG:ns NR:ns ## COG: ynhA COG2166 # Protein_GI_number: 16129635 # Func_class: R General function prediction only # Function: SufE protein probably involved in Fe-S center assembly # Organism: Escherichia coli K12 # 1 126 1 126 138 250 100.0 6e-67 MALLPDKEKLLRNFLRCANWEEKYLYIIELGQRLPELRDEDRSPQNSIQGCQSQVWIVMR QNAQGIIELQGDSDAAIVKGLIAVVFILYDQMTPQDIVNFDVRPWFEKMALTQHLTPSRS QGLEAMIRAIRAKAAALS >gi|296494456|gb|ADTN01000282.1| GENE 3 1622 - 2842 1307 406 aa, chain - ## HITS:1 COG:csdB KEGG:ns NR:ns ## COG: csdB COG0520 # Protein_GI_number: 16129636 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: Escherichia coli K12 # 1 406 1 406 406 828 100.0 0 MIFSVDKVRADFPVLSREVNGLPLAYLDSAASAQKPSQVIDAEAEFYRHGYAAVHRGIHT LSAQATEKMENVRKRASLFINARSAEELVFVRGTTEGINLVANSWGNSNVRAGDNIIISQ MEHHANIVPWQMLCARVGAELRVIPLNPDGTLQLETLPTLFDEKTRLLAITHVSNVLGTE NPLAEMITLAHQHGAKVLVDGAQAVMHHPVDVQALDCDFYVFSGHKLYGPTGIGILYVKE ALLQEMPPWEGGGSMIATVSLSEGTTWTKAPWRFEAGTPNTGGIIGLGAALEYVSALGLN NIAEYEQNLMHYALSQLESVPDLTLYGPQNRLGVIAFNLGKHHAYDVGSFLDNYGIAVRT GHHCAMPLMAYYNVPAMCRASLAMYNTHEEVDRLVTGLQRIHRLLG >gi|296494456|gb|ADTN01000282.1| GENE 4 2839 - 4110 1131 423 aa, 
chain - ## HITS:1 COG:ynhC KEGG:ns NR:ns ## COG: ynhC COG0719 # Protein_GI_number: 16129637 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Escherichia coli K12 # 1 423 1 423 423 806 100.0 0 MAGLPNSSNALQQWHHLFEAEGTKRSPQAQQHLQQLLRTGLPTRKHENWKYTPLEGLINS QFVSIAGEISPQQRDALALTLDSVRLVFVDGRYVPALSDATEGSGYEVSINDDRQGLPDA IQAEVFLHLTESLAQSVTHIAVKRGQRPAKPLLLMHITQGVAGEEVNTAHYRHHLDLAEG AEATVIEHFVSLNDARHFTGARFTINVAANAHLQHIKLAFENPLSHHFAHNDLLLAEDAT AFSHSFLLGGAVLRHNTSTQLNGENSTLRINSLAMPVKNEVCDTRTWLEHNKGFCNSRQL HKTIVSDKGRAVFNGLINVAQHAIKTDGQMTNNNLLMGKLAEVDTKPQLEIYADDVKCSH GATVGRIDDEQIFYLRSRGINQQDAQQMIIYAFAAELTEALRDEGLKQQVLARIGQRLPG GAR >gi|296494456|gb|ADTN01000282.1| GENE 5 4085 - 4831 162 248 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 2 204 12 202 312 67 29 1e-10 MLSIKDLHVSVEDKAILRGLSLDVHPGEVHAIMGPNGSGKSTLSATLAGREDYEVTGGTV EFKGKDLLALSPEDRAGEGIFMAFQYPVEIPGVSNQFFLQTALNAVRSYRGQETLDRFDF QDLMEEKIALLKMPEDLLTRSVNVGFSGGEKKRNDILQMAVLEPELCILDESDSGLDIDA LKVVADGVNSLRDGKRSFIIVTHYQRILDYIKPDYVHVLYQGRIVKSGDFTLVKQLEEQG YGWLTEQQ >gi|296494456|gb|ADTN01000282.1| GENE 6 4841 - 6367 1439 508 aa, chain - ## HITS:1 COG:ynhE KEGG:ns NR:ns ## COG: ynhE COG0719 # Protein_GI_number: 16129639 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Escherichia coli K12 # 1 508 1 508 508 1053 99.0 0 MWLWRKLWGIGGTMSRNTEATDDVKTRTGGPLNYKEGFFTQLATDELAKGINEEVVRAIS AKRNEPEWMLEFRLNAYRAWLEMEEPHWLKAHYDKLNYQDYSYYSAPSCGNCDDTCASEP GAVQQTGANAFLSKEVEAAFEQLGVPVREGKEVAVDAIFDSVSVATTYREKLAEQGIIFC SFGEAIHDHPELVRKYLGTVVPGNDNFFAALNAAVASDGTFIYVPKGVRCPMELSTYFRI NAEKTGQFERTILVADEDSYVSYIEGCSAPVRDSYQLHAAVVEVIIHKNAEVKYSTVQNW FPGDNNTGGILNFVTKRALCEGENSKMSWTQSETGSAITWKYPSCILRGDNSIGEFYSVA 
LTSGHQQADTGTKMIHIGKNTKSTIISKGISAGHSQNSYRGLVKIMPTATNARNFTQCDS MLIGANCGAHTFPYVECRNNSAQLEHEATTSRIGEDQLFYCLQRGISEEDAISMIVNGFC KDVFSELPLEFAVEAQKLLAISLEHSVG >gi|296494456|gb|ADTN01000282.1| GENE 7 6337 - 6705 529 122 aa, chain - ## HITS:1 COG:ydiC KEGG:ns NR:ns ## COG: ydiC COG0316 # Protein_GI_number: 16129640 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 122 1 122 122 252 100.0 9e-68 MDMHSGTFNPQDFAWQGLTLTPAAAIHIRELVAKQPGMVGVRLGVKQTGCAGFGYVLDSV SEPDKDDLLFEHDGAKLFVPLQAMPFIDGTEVDFVREGLNQIFKFHNPKAQNECGCGESF GV >gi|296494456|gb|ADTN01000282.1| GENE 8 7253 - 7441 295 62 aa, chain - ## HITS:1 COG:no KEGG:SbBS512_E1886 NR:ns ## KEGG: SbBS512_E1886 # Name: not_defined # Def: hypothetical protein # Organism: S.boydii_CDC3083-94 # Pathway: not_defined # 1 62 28 89 89 109 100.0 2e-23 MSTQLDPTQLAIEFLRRDQSNLSPAQYLKRLKQLELEFADLLTLSSAELKEEIYFAWRLG VH >gi|296494456|gb|ADTN01000282.1| GENE 9 7541 - 7951 325 136 aa, chain - ## HITS:1 COG:ydiI KEGG:ns NR:ns ## COG: ydiI COG2050 # Protein_GI_number: 16129642 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein, possibly involved in aromatic compounds catabolism # Organism: Escherichia coli K12 # 1 136 1 136 136 276 100.0 9e-75 MIWKRKITLEALNAMGEGNMVGFLDIRFEHIGDDTLEATMPVDSRTKQPFGLLHGGASVV LAESIGSVAGYLCTEGEQKVVGLEINANHVRSAREGRVRGVCKPLHLGSRHQVWQIEIFD EKGRLCCSSRLTTAIL >gi|296494456|gb|ADTN01000282.1| GENE 10 7948 - 11004 2537 1018 aa, chain - ## HITS:1 COG:ydiJ_1 KEGG:ns NR:ns ## COG: ydiJ_1 COG0277 # Protein_GI_number: 16129643 # Func_class: C Energy production and conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Escherichia coli K12 # 1 542 1 542 542 1130 100.0 0 MIPQISQAPGVVQLVLNFLQELEQQGFTGDTATSYADRLTMSTDNSIYQLLPDAVVFPRS TADVALIARLAAQERYSSLIFTPRGGGTGTNGQALNQGIIVDMSRHMNRIIEINPEEGWV RVEAGVIKDQLNQYLKPFGYFFAPELSTSNRATLGGMINTDASGQGSLVYGKTSDHVLGV 
RAVLLGGDILDTQPLPVELAETLGKSNTTIGRIYNTVYQRCRQQRQLIIDNFPKLNRFLT GYDLRHVFNDEMTEFDLTRILTGSEGTLAFITEARLDITRLPKVRRLVNVKYDSFDSALR NAPFMVEARALSVETVDSKVLNLAREDIVWHSVSELITDVPDQEMLGLNIVEFAGDDEAL IDERVNALCARLDELIASHQAGVIGWQVCRELAGVERIYAMRKKAVGLLGNAKGAAKPIP FAEDTCVPPEHLADYIAEFRALLDSHGLSYGMFGHVDAGVLHVRPALDMCDPQQEILMKQ ISDDVVALTAKYGGLLWGEHGKGFRAEYSPAFFGEELFAELRKVKAAFDPHNRLNPGKIC PPEGLDAPMMKVDAVKRGTFDRQIPIAVRQQWRGAMECNGNGLCFNFDARSPMCPSMKIT QNRIHSPKGRATLVREWLRLLADRGVDPLKLEQELPESGVSLRTLIARTRNSWHANKGEY DFSHEVKEAMSGCLACKACSTQCPIKIDVPEFRSRFLQLYHTRYLRPLRDHLVATVESYA PLMARAPKTFNFFINQPLVRKLSEKHIGMVDLPLLSVPSLQQQMVGHRSANMTLEQLESL NAEQKARTVLVVQDPFTSYYDAQVVADFVRLVEKLGFQPVLLPFSPNGKAQHIKGFLNRF AKTAKKTADFLNRMAKLGMPMVGVDPALVLCYRDEYKLALGEERGEFNVLLANEWLASAL ESQPVATVSGESWYFFGHCTEVTALPGAPAQWAAIFARFGAKLENVSVGCCGMAGTYGHE AKNHENSLGIYELSWHQAMQRLPRNRCLATGYSCRSQVKRVEGTGVRHPVQALLEIIK >gi|296494456|gb|ADTN01000282.1| GENE 11 11393 - 12505 1326 370 aa, chain + ## HITS:1 COG:ECs2395 KEGG:ns NR:ns ## COG: ECs2395 COG0628 # Protein_GI_number: 15831649 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Escherichia coli O157:H7 # 1 370 1 370 370 584 99.0 1e-167 MVNVRQPRDVAQILLSVLFLAIMIVACLWIVQPFILGFAWAGTVVIATWPVLLRLQKIMF GRRSLAVLVMTLLLVMVFIIPIALLVNSIVDGSGPLIKAISSGDMTLPDLAWLNTIPVIG AKLYAGWHNLLDMGGTAIMAKVRPYIGTTTTWFVGQAAHIGRFMVHCALMLLFSALLYWR GEQVAQGIRHFATRLAGVRGDAAVLLAAQAIRAVALGVVVTALVQAVLGGIGLAVSGVPY ATLLTVLMILSCLVQLGPLPVLIPAIIWLYWTGDTTWGTVLLVWSGVVGTLDNVIRPMLI RMGADLPLILILSGVIGGLIAFGMIGLFIGPVLLAVSWRLFAAWVEEVPPPTDQPEEILE ELGEIEKPNK >gi|296494456|gb|ADTN01000282.1| GENE 12 12934 - 13290 158 118 aa, chain + ## HITS:1 COG:no KEGG:JW1679 NR:ns ## KEGG: JW1679 # Name: ydiL # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 118 1 118 118 242 100.0 3e-63 MNAYELQALRHIFAMTIDECATWIAQTGDSESWRQWENGKCAIPDRVVEQLLAMRQQRKK HLHAIIEKINNRIGNNTMRFFPDLTAFQRVYPDGNFIDWKIYQSVAAELYAHDLERLC >gi|296494456|gb|ADTN01000282.1| GENE 13 13390 - 14604 922 404 aa, chain + ## 
HITS:1 COG:ydiM KEGG:ns NR:ns ## COG: ydiM COG0477 # Protein_GI_number: 16129646 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 404 1 404 404 720 100.0 0 MKNPYFPTALGLYFNYLVHGMGVLLMSLNMASLETLWQTNAAGVSIVISSLGIGRLSVLL FAGLLSDRFGRRPFIMLGMCCYMAFFFGILQTNNIIIAYVFGFLAGMANSFLDAGTYPSL MEAFPRSPGTANILIKAFVSSGQFLLPLIISLLVWAELWFGWSFMIAAGIMFINALFLYR CTFPPHPGRRLPVIKKTTSSTEHRCSIIDLASYTLYGYISMATFYLVSQWLAQYGQFVAG MSYTMSIKLLSIYTVGSLLCVFITAPLIRNTVRPTTLLMLYTFISFIALFTVCLHPTFYV VIIFAFVIGFTSAGGVVQIGLTLMAERFPYAKGKATGIYYSAGSIATFTIPLITAHLSQR SIADIMWFDTAIAAIGFLLALFIGLRSRKKTRHHSLKENVAPGG >gi|296494456|gb|ADTN01000282.1| GENE 14 14831 - 16096 842 421 aa, chain + ## HITS:1 COG:ydiN KEGG:ns NR:ns ## COG: ydiN COG0477 # Protein_GI_number: 16129647 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 421 3 423 423 726 100.0 0 MSQNKAFSTPFILAVLCIYFSYFLHGISVITLAQNMSSLAEKFSTDNAGIAYLISGIGLG RLISILFFGVISDKFGRRAVILMAVIMYLLFFFGIPACPNLTLAYGLAVCVGIANSALDT GGYPALMECFPKASGSAVILVKAMVSFGQMFYPMLVSYMLLNNIWYGYGLIIPGILFVLI TLMLLKSKFPSQLVDASVTNELPQMNSKPLVWLEGVSSVLFGVAAFSTFYVIVVWMPKYA MAFAGMSEAEALKTISYYSMGSLVCVFIFAALLKKMVRPIWANVFNSALATITAAIIYLY PSPLVCNAGAFVIGFSAAGGILQLGVSVMSEFFPKSKAKVTSIYMMMGGLANFVIPLITG YLSNIGLQYIIVLDFTFALLALITAIIVFIRYYRVFIIPENDVRFGERKFCTRLNTIKHR G >gi|296494456|gb|ADTN01000282.1| GENE 15 16108 - 16974 905 288 aa, chain + ## HITS:1 COG:ydiB KEGG:ns NR:ns ## COG: ydiB COG0169 # Protein_GI_number: 16129648 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Escherichia coli K12 # 1 288 1 288 288 581 100.0 1e-166 
MDVTAKYELIGLMAYPIRHSLSPEMQNKALEKAGLPFTYMAFEVDNDSFPGAIEGLKALK MRGTGVSMPNKQLACEYVDELTPAAKLVGAINTIVNDDGYLRGYNTDGTGHIRAIKESGF DIKGKTMVLLGAGGASTAIGAQGAIEGLKEIKLFNRRDEFFDKALAFAQRVNENTDCVVT VTDLADQQAFAEALASADILTNGTKVGMKPLENESLVNDISLLHPGLLVTECVYNPHMTK LLQQAQQAGCKTIDGYGMLLWQGAEQFTLWTGKDFPLEYVKQVMGFGA >gi|296494456|gb|ADTN01000282.1| GENE 16 17005 - 17763 823 252 aa, chain + ## HITS:1 COG:aroD KEGG:ns NR:ns ## COG: aroD COG0710 # Protein_GI_number: 16129649 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate dehydratase # Organism: Escherichia coli K12 # 1 252 1 252 252 449 100.0 1e-126 MKTVTVKDLVIGTGAPKIIVSLMAKDIASVKSEALAYREADFDILEWRVDHYADLSNVES VMAAAKILRETMPEKPLLFTFRSAKEGGEQAISTEAYIALNRAAIDSGLVDMIDLELFTG DDQVKETVAYAHAHDVKVVMSNHDFHKTPEAEEIIARLRKMQSFDADIPKIALMPQSTSD VLTLLAATLEMQEQYADRPIITMSMAKTGVISRLAGEVFGSAATFGAVKKASAPGQISVN DLRTVLTILHQA >gi|296494456|gb|ADTN01000282.1| GENE 17 17906 - 19501 1602 531 aa, chain + ## HITS:1 COG:ydiF KEGG:ns NR:ns ## COG: ydiF COG4670 # Protein_GI_number: 16129650 # Func_class: I Lipid transport and metabolism # Function: Acyl CoA:acetate/3-ketoacid CoA transferase # Organism: Escherichia coli K12 # 1 531 1 531 531 1069 100.0 0 MKPVKPPRINGRVPVLSAQEAVNYIPDEATLCVLGAGGGILEATTLITALADKYKQTQTP RNLSIISPTGLGDRADRGISPLAQEGLVKWALCGHWGQSPRISELAEQNKIIAYNYPQGV LTQTLRAAAAHQPGIISDIGIGTFVDPRQQGGKLNEVTKEDLIKLVEFDNKEYLYYKAIA PDIAFIRATTCDSEGYATFEDEVMYLDALVIAQAVHNNGGIVMMQVQKMVKKATLHPKSV RIPGYLVDIVVVDPDQTQLYGGAPVNRFISGDFTLDDSTKLSLPLNQRKLVARRALFEMR KGAVGNVGVGIADGIGLVAREEGCADDFILTVETGPIGGITSQGIAFGANVNTRAILDMT SQFDFYHGGGLDVCYLSFAEVDQHGNVGVHKFNGKIMGTGGFIDISATSKKIIFCGTLTA GSLKTEITDGKLNIVQEGRVKKFIRELPEITFSGKIALERGLDVRYITERAVFTLKEDGL HLIEIAPGVDLQKDILDKMDFTPVISPELKLMDERLFIDAAMGFVLPEAAH >gi|296494456|gb|ADTN01000282.1| GENE 18 19515 - 20666 1508 383 aa, chain + ## HITS:1 COG:ECs2402 KEGG:ns NR:ns ## COG: ECs2402 COG1960 # Protein_GI_number: 15831656 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: 
Escherichia coli O157:H7 # 1 383 19 401 401 800 99.0 0 MDFSLNEEQELLLASIRELITTNFPEEYFRTCDQNGTYPREFMRALADNGISMLGVPEEF GGIPADYVTQMLALMEVSKCGAPAFLITNGQCIHSMRRFGSAEQLRKTAESTLETGDPAY ALALTEPGAGSDNNSATTTYTRKNGKVYINGQKTFITGAKEYPYMLVLARDPQPKDPKKA FTLWWVDSSKPGIKINPLHKIGWHMLSTCEVYLDNVEVEESDMVGEEGMGFLNVMYNFEM ERLINAARSTGFAECAFEDAARYANQRIAFGKPIGHNQMIQEKLALMAIKIDNMRNMVLK VAWQADQHQSLRTSAALAKLYCARTAMEVIDDAIQIMGGLGYTDEARVSRFWRDVRCERI GGGTDEIMIYVAGRQILKDYQNK Prediction of potential genes in microbial genomes Time: Mon May 16 00:08:05 2011 Seq name: gi|296494455|gb|ADTN01000283.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont710.13, whole genome shotgun sequence Length of sequence - 9971 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 5, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 34 - 945 621 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 1061 - 1120 6.5 + Prom 1090 - 1149 6.1 2 2 Op 1 29/0.000 + CDS 1261 - 2025 787 ## COG2086 Electron transfer flavoprotein, beta subunit 3 2 Op 2 9/0.333 + CDS 2045 - 2983 1032 ## COG2025 Electron transfer flavoprotein, alpha subunit 4 2 Op 3 12/0.000 + CDS 3039 - 4328 1176 ## COG0644 Dehydrogenases (flavoproteins) 5 2 Op 4 2/1.000 + CDS 4325 - 4618 349 ## COG2440 Ferredoxin-like protein + Term 4631 - 4665 5.0 6 3 Tu 1 . + CDS 4675 - 6321 1166 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II + Term 6346 - 6382 4.5 - Term 6334 - 6370 7.7 7 4 Tu 1 . - CDS 6378 - 8756 2796 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase - Prom 8973 - 9032 7.9 + Prom 8937 - 8996 10.0 8 5 Tu 1 . 
+ CDS 9089 - 9922 784 ## COG1806 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|296494455|gb|ADTN01000283.1| GENE 1 34 - 945 621 303 aa, chain - ## HITS:1 COG:ydiP KEGG:ns NR:ns ## COG: ydiP COG2207 # Protein_GI_number: 16129652 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 303 1 303 303 622 100.0 1e-178 MYQRCFDNASETLFVAGKTPRLSRFAFSDDPKWESGHHVHDNETELIYVKKGVARFTIDS SLYVAHADDIVVIERGRLHAVASDVNDPATTCTCALYGFQFQGAEENQLLQPHSCPVIAA GQGKEVIKTLFNELSVILPQSKNSQTSSLWDAFAYTLAILYYENFKNAYRSEQGYIKKDV LIKDILFYLNNNYREKITLEQLSKKFRASVSYICHEFTKEYRISPINYVIQRRMTEAKWS LTNTELSQAEISWRVGYENVDHFAKLFLRHVGCSPSDYRRQFKNCFAEQEILSEFPQPVS LVG >gi|296494455|gb|ADTN01000283.1| GENE 2 1261 - 2025 787 254 aa, chain + ## HITS:1 COG:ydiQ KEGG:ns NR:ns ## COG: ydiQ COG2086 # Protein_GI_number: 16129653 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: Escherichia coli K12 # 1 254 1 254 254 467 100.0 1e-131 MKIITCFKLVPEEQDIVVTPEYTLNFDNADAKISQFDLNAIEAASQLATDDDEIAALTVG GSLLQNSKVRKDVLSRGPHSLYLVQDAQLEHALPLDTAKALAAAIEKIGFDLLIFGEGSG DLYAQQVGLLVGEILQLPVINAVSAIQRQGNTLVIERTLEDDVEVIELSVPAVLCVTSDI NVPRIPSMKAILGAGKKPVNQWQASDIDWSQSAPLAELVGIRVPPQTERKHIIIDNDSPE AIAELAEHLKKALN >gi|296494455|gb|ADTN01000283.1| GENE 3 2045 - 2983 1032 312 aa, chain + ## HITS:1 COG:ydiR KEGG:ns NR:ns ## COG: ydiR COG2025 # Protein_GI_number: 16129654 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Escherichia coli K12 # 1 312 1 312 312 620 100.0 1e-177 MSQLNSVWVFSDNPERYAELFGGAQQWGQQVYAIVQNTDQAQAVMPYGPKCLYVLAQNDA LQRTENYAESIAALLKDKHPAMLLLAATKRGKALAARLSVQLNAALVNDATAVDIVDGHI CAEHRMYGGLAFAQEKINSPLAIITLAPGVQEPCTSDTSHQCPTETVPYVAPRHEILCRE RRAKAASSVDLSKAKRVVGVGRGLAAQDDLKMVHELAAVLNAEVGCSRPIAEGENWMERE RYIGVSGVLLKSDLYLTLGISGQIQHMVGGNGAKVIVAINKDKNAPIFNYADYGLVGDIY KVVPALISQLSR >gi|296494455|gb|ADTN01000283.1| GENE 4 
3039 - 4328 1176 429 aa, chain + ## HITS:1 COG:ydiS KEGG:ns NR:ns ## COG: ydiS COG0644 # Protein_GI_number: 16129655 # Func_class: C Energy production and conversion # Function: Dehydrogenases (flavoproteins) # Organism: Escherichia coli K12 # 1 429 1 429 429 832 100.0 0 MSDDKFDAIVVGAGVAGSVAALVMARAGLDVLVIERGDSAGCKNMTGGRLYAHTLEAIIP GFAVSAPVERKVTREKISFLTEESAVTLDFHREQPDVPQHASYTVLRNRLDPWLMEQAEQ AGAQFIPGVRVDALVREGNKVTGVQAGDDILEANVVILADGVNSMLGRSLGMVPASDPHH YAVGVKEVIGLTPEQINDRFNITGEEGAAWLFAGSPSDGLMGGGFLYTNKDSISLGLVCG LGDIAHAQKSVPQMLEDFKQHPAIRPLISGGKLLEYSAHMVPEGGLAMVPQLVNEGVMIV GDAAGFCLNLGFTVRGMDLAIASAQAAATTVIAAKERADFSASSLAQYKRELEQSCVMRD MQHFRKIPALMENPRLFSQYPRMVADIMNEMFTIDGKPNQPVRKMIMGHAKKIGLINLLK DGIKGATAL >gi|296494455|gb|ADTN01000283.1| GENE 5 4325 - 4618 349 97 aa, chain + ## HITS:1 COG:ydiT KEGG:ns NR:ns ## COG: ydiT COG2440 # Protein_GI_number: 16129656 # Func_class: C Energy production and conversion # Function: Ferredoxin-like protein # Organism: Escherichia coli K12 # 1 97 1 97 97 206 100.0 9e-54 MSQNATVNVDIKLGVNKFHVDEGHPHIILAENPDINEFHKLMKACPAGLYKQDDAGNIHF DSAGCLECGTCRVLCGNTILEQWQYPAGTFGIDFRYG >gi|296494455|gb|ADTN01000283.1| GENE 6 4675 - 6321 1166 548 aa, chain + ## HITS:1 COG:ydiD KEGG:ns NR:ns ## COG: ydiD COG0318 # Protein_GI_number: 16129657 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Escherichia coli K12 # 1 548 19 566 566 1121 99.0 0 MKVTLTFNEQRRAAYRQQGLWGDASLADYWQQTARAMPDKIAVVDNHGASYTYSALDHAA SCLANWMLAKGIESGDRIAFQLPGWCEFTVIYLACLKIGAVSVPLLPSWREAELVWVLNK CQAKMFFAPTLFKQTRPVDLILPLQNQLPQLQQIVGVDKLAPATSSLSLSQIIADNTSLT TAITTHGDELAAVLFTSGTEGLPKGVMLTHNNILASERAYCARLNLTWQDVFMMPAPLGH ATGFLHGVTAPFLIGARSVLLDIFTPDACLALLEQQRCTCMLGATPFVYDLLNVLEKQPA DLSALRFFLCGGTTIPKKVARECQQRGIKLLSVYGSTESSPHAVVNLDDPLSHFMHTDGY AAAGVEIKVVDDARKTLPPGCEGEEASRGPNVFMGYFDEPELTARALDEEGWYYSGDLCR MDEAGYIKITGRKKDIIVRGGENISSREVEDILLQHPKIHDACVVAMSDERLGERSCAYV 
VLKAPHHSLSLEEVVAFFSRKRVAKYKYPEHIVVIEKLPRTTSGKIQKFLLRKDIMRRLT QDVCEEIE >gi|296494455|gb|ADTN01000283.1| GENE 7 6378 - 8756 2796 792 aa, chain - ## HITS:1 COG:ppsA KEGG:ns NR:ns ## COG: ppsA COG0574 # Protein_GI_number: 16129658 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: Escherichia coli K12 # 1 792 1 792 792 1609 100.0 0 MSNNGSSPLVLWYNQLGMNDVDRVGGKNASLGEMITNLSGMGVSVPNGFATTADAFNQFL DQSGVNQRIYELLDKTDIDDVTQLAKAGAQIRQWIIDTPFQPELENAIREAYAQLSADDE NASFAVRSSATAEDMPDASFAGQQETFLNVQGFDAVLVAVKHVFASLFNDRAISYRVHQG YDHRGVALSAGVQRMVRSDLASSGVMFSIDTESGFDQVVFITSAWGLGEMVVQGAVNPDE FYVHKPTLAANRPAIVRRTMGSKKIRMVYAPTQEHGKQVKIEDVPQEQRDIFSLTNEEVQ ELAKQAVQIEKHYGRPMDIEWAKDGHTGKLFIVQARPETVRSRGQVMERYTLHSQGKIIA EGRAIGHRIGAGPVKVIHDISEMNRIEPGDVLVTDMTDPDWEPIMKKASAIVTNRGGRTC HAAIIARELGIPAVVGCGDATERMKDGENVTVSCAEGDTGYVYAELLEFSVKSSSVETMP DLPLKVMMNVGNPDRAFDFACLPNEGVGLARLEFIINRMIGVHPRALLEFDDQEPQLQNE IREMMKGFDSPREFYVGRLTEGIATLGAAFYPKRVIVRLSDFKSNEYANLVGGERYEPDE ENPMLGFRGAGRYVSDSFRDCFALECEAVKRVRNDMGLTNVEIMIPFVRTVDQAKAVVEE LARQGLKRGENGLKIIMMCEIPSNALLAEQFLEYFDGFSIGSNDMTQLALGLDRDSGVVS ELFDERNDAVKALLSMAIRAAKKQGKYVGICGQGPSDHEDFAAWLMEEGIDSLSLNPDTV VQTWLSLAELKK >gi|296494455|gb|ADTN01000283.1| GENE 8 9089 - 9922 784 277 aa, chain + ## HITS:1 COG:ECs2410 KEGG:ns NR:ns ## COG: ECs2410 COG1806 # Protein_GI_number: 15831664 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 277 1 277 277 552 100.0 1e-157 MDNAVDRHVFYISDGTAITAEVLGHAVMSQFPVTISSITLPFVENESRARAVKDQIDAIY HQTGVRPLVFYSIVLPEIRAIILQSEGFCQDIVQALVAPLQQEMKLDPTPIAHRTHGLNP NNLNKYDARIAAIDYTLAHDDGISLRNLDQAQVILLGVSRCGKTPTSLYLAMQFGIRAAN YPFIADDMDNLVLPASLKPLQHKLFGLTIDPERLAAIREERRENSRYASLRQCRMEVAEV EALYRKNQIPWINSTNYSVEEIATKILDIMGLSRRMY Prediction of potential genes in microbial genomes Time: Mon May 16 00:08:06 2011 Seq name: gi|296494454|gb|ADTN01000284.1| Escherichia coli MS 146-1 
E_coli146-1-1.0_Cont710.14, whole genome shotgun sequence Length of sequence - 1223 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) Predicted protein(s) >gi|296494454|gb|ADTN01000284.1| GENE 1 154 - 1200 845 348 aa, chain + ## HITS:1 COG:aroH KEGG:ns NR:ns ## COG: aroH COG0722 # Protein_GI_number: 16129660 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Escherichia coli K12 # 1 348 1 348 348 721 100.0 0 MNRTDELRTARIESLVTPAELALRYPVTPGVATHVTDSRRRIEKILNGEDKRLLVIIGPC SIHDLTAAMEYATRLQSLRNQYQSRLEIVMRTYFEKPRTVVGWKGLISDPDLNGSYRVNH GLELARKLLLQVNELGVPTATEFLDMVTGQFIADLISWGAIGARTTESQIHREMASALSC PVGFKNGTDGNTRIAVDAIRAARASHMFLSPDKNGQMTIYQTSGNPYGHIIMRGGKKPNY HADDIAAACDTLHEFDLPEHLVVDFSHGNCQKQHRRQLEVCEDICQQIRNGSTAIAGIMA ESFLREGTQKIVGSQPLTYGQSITDPCLGWEDTERLVEKLASAVDTRF Prediction of potential genes in microbial genomes Time: Mon May 16 00:08:07 2011 Seq name: gi|296494453|gb|ADTN01000285.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont710.15, whole genome shotgun sequence Length of sequence - 2749 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 34 - 93 4.0 1 1 Tu 1 . + CDS 122 - 313 171 ## COG4256 Hemin uptake protein 2 2 Op 1 4/1.000 - CDS 317 - 1753 1108 ## COG0397 Uncharacterized conserved protein 3 2 Op 2 . 
- CDS 1816 - 2529 140 ## COG2200 FOG: EAL domain - Prom 2665 - 2724 4.3 Predicted protein(s) >gi|296494453|gb|ADTN01000285.1| GENE 1 122 - 313 171 63 aa, chain + ## HITS:1 COG:ECs2412 KEGG:ns NR:ns ## COG: ECs2412 COG4256 # Protein_GI_number: 15831666 # Func_class: P Inorganic ion transport and metabolism # Function: Hemin uptake protein # Organism: Escherichia coli O157:H7 # 1 63 1 63 63 107 100.0 6e-24 MRYTDSRKLTPETDANHKTASPQPIRRISSQTLLGPDGKLIIDHDGQEYLLRKTQAGKLL LTK >gi|296494453|gb|ADTN01000285.1| GENE 2 317 - 1753 1108 478 aa, chain - ## HITS:1 COG:ydiU KEGG:ns NR:ns ## COG: ydiU COG0397 # Protein_GI_number: 16129662 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 478 1 478 478 962 100.0 0 MTLSFVTRWRDELPETYTALSPTPLNNARLIWHNTELANTLSIPSSLFKNGAGVWGGEAL LPGMSPLAQVYSGHQFGVWAGQLGDGRGILLGEQLLADGTTMDWHLKGAGLTPYSRMGDG RAVLRSTIRESLASEAMHYLGIPTTRALSIVTSDSPVYRETAEPGAMLMRVAPSHLRFGH FEHFYYRRESEKVRQLADFAIRHYWSHLADDEDKYRLWFSDVVARTASLIAQWQTVGFAH GVMNTDNMSLLGLTLDYGPFGFLDDYEPGFICNHSDHQGRYSFDNQPAVALWNLQRLAQT LSPFVAVDALNEALDSYQQVLLTHYGERMRQKLGFMTEQKEDNALLNELFSLMARERSDY TRTFRMLSLTEQHSAASPLRDEFIDRAAFDDWFARYRGRLQQDEVSDSERQQLMQSVNPA LVLRNWLAQRAIEAAEKGDMTELHRLHEALRNPFSDRDDDYVSRPPDWGKRLEVSCSS >gi|296494453|gb|ADTN01000285.1| GENE 3 1816 - 2529 140 237 aa, chain - ## HITS:1 COG:ydiV KEGG:ns NR:ns ## COG: ydiV COG2200 # Protein_GI_number: 16129663 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 1 237 1 237 237 483 100.0 1e-137 MKIFLENLYHSDCYFLPIRDNQQVLVGVELITHFSSEDGTVRIPTSRVIAQLTEEQHWQL FSEQLELLKSCQHFFIQHKLFAWLNLTPQVATLLLERDNYAGELLKYPFIELLINENYPH LNEGKDNRGLLSLSQVYPLVLGNLGAGNSTMKAVFDGLFTRVMLDKSFIQQQITHRSFEP FIRAIQAQISPCCNCIIAGGIDTAEILAQITPFDFHALQGCLWPAVPINQITTLVQR Prediction of potential genes in microbial genomes Time: Mon May 16 00:08:09 2011 Seq name: gi|296494452|gb|ADTN01000286.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont710.16, whole genome shotgun 
sequence Length of sequence - 2996 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 - CDS 32 - 496 273 ## PROTEIN SUPPORTED gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 2 1 Op 2 5/0.000 - CDS 574 - 1311 684 ## COG4138 ABC-type cobalamin transport system, ATPase component 3 1 Op 3 5/0.000 - CDS 1323 - 1874 654 ## COG0386 Glutathione peroxidase 4 1 Op 4 . - CDS 1937 - 2917 955 ## COG4139 ABC-type cobalamin transport system, permease component Predicted protein(s) >gi|296494452|gb|ADTN01000286.1| GENE 1 32 - 496 273 154 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 [Haemophilus parasuis 29755] # 40 154 65 174 175 109 45 2e-24 MRFCLILITALLLAGCSHHKAPPPNARLSDSITVIAGLNDQLQSWHGTPYRYGGMTRRGV DCSGFVVVTMRDRFDLQLPRETKQQASIGTQIDKDELLPGDLVFFKTGSGQNGLHVGIYD TNNQFIHASTSKGVMRSSLDNVYWQKNFWQARRI >gi|296494452|gb|ADTN01000286.1| GENE 2 574 - 1311 684 245 aa, chain - ## HITS:1 COG:ECs2416 KEGG:ns NR:ns ## COG: ECs2416 COG4138 # Protein_GI_number: 15831670 # Func_class: H Coenzyme transport and metabolism # Function: ABC-type cobalamin transport system, ATPase component # Organism: Escherichia coli O157:H7 # 1 245 5 249 249 455 99.0 1e-128 MQLQDVAESTRLGPLSGEVRAGEILHLVGPNGAGKSTLLARMAGMTSGKGSIQFAGQPLE AWSATKLALHRAYLSQQQTPPFAMPVWHYLTLHQHDKTRTELLNDVAGALALDDKLGRST NQLSGGEWQRVRLAAVVLQITPQANPAGQLLLLDEPMNSLDVAQQSALDKILSALCQQGL AIVMSSHDLNHTLRHAHRAWLLKGGKMLASGRREEVLTPPNLAQAYGMNFRRLDIEGHRM LISTI >gi|296494452|gb|ADTN01000286.1| GENE 3 1323 - 1874 654 183 aa, chain - ## HITS:1 COG:btuE KEGG:ns NR:ns ## COG: btuE COG0386 # Protein_GI_number: 16129666 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione peroxidase # Organism: Escherichia coli K12 # 1 183 1 183 183 377 100.0 1e-105 MQDSILTTVVKDIDGEVTTLEKFAGNVLLIVNVASKCGLTPQYEQLENIQKAWVDRGFMV 
LGFPCNQFLEQEPGSDEEIKTYCTTTWGVTFPMFSKIEVNGEGRHPLYQKLIAAAPTAVA PEESGFYARMVSKGRAPLYPDDILWNFEKFLVGRDGKVIQRFSPDMTPEDPIVMESIKLA LAK >gi|296494452|gb|ADTN01000286.1| GENE 4 1937 - 2917 955 326 aa, chain - ## HITS:1 COG:btuC KEGG:ns NR:ns ## COG: btuC COG4139 # Protein_GI_number: 16129667 # Func_class: H Coenzyme transport and metabolism # Function: ABC-type cobalamin transport system, permease component # Organism: Escherichia coli K12 # 1 326 1 326 326 473 100.0 1e-133 MLTLARQQQRQNIRWLLCLSVLMLLALLLSLCAGEQWISPGDWFTPRGELFVWQIRLPRT LAVLLVGAALAISGAVMQALFENPLAEPGLLGVSNGAGVGLIAAVLLGQGQLPNWALGLC AIAGALIITLILLRFARRHLSTSRLLLAGVALGIICSALMTWAIYFSTSVDLRQLMYWMM GGFGGVDWRQSWLMLALIPVLLWICCQSRPMNMLALGEISARQLGLPLWFWRNVLVAATG WMVGVSVALAGAIGFIGLVIPHILRLCGLTDHRVLLPGCALAGASALLLADIVARLALAA AELPIGVVTATLGAPVFIWLLLKAGR Prediction of potential genes in microbial genomes Time: Mon May 16 00:08:16 2011 Seq name: gi|296494451|gb|ADTN01000287.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont710.17, whole genome shotgun sequence Length of sequence - 20902 bp Number of predicted genes - 22, with homology - 21 Number of transcription units - 14, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 13/0.000 - CDS 18 - 317 318 ## PROTEIN SUPPORTED gi|148826992|ref|YP_001291745.1| 50S ribosomal protein L35 2 1 Op 2 40/0.000 - CDS 322 - 2709 2888 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit 3 1 Op 3 13/0.000 - CDS 2724 - 3707 1283 ## COG0016 Phenylalanyl-tRNA synthetase alpha subunit - Prom 3892 - 3951 5.0 - Term 4115 - 4146 2.1 4 1 Op 4 46/0.000 - CDS 4158 - 4514 578 ## PROTEIN SUPPORTED gi|15802128|ref|NP_288150.1| 50S ribosomal protein L20 5 1 Op 5 36/0.000 - CDS 4567 - 4779 358 ## PROTEIN SUPPORTED gi|110805473|ref|YP_688993.1| 50S ribosomal protein L35 - Term 4804 - 4843 8.2 6 1 Op 6 16/0.000 - CDS 4861 - 5340 665 ## PROTEIN SUPPORTED gi|167856598|ref|ZP_02479300.1| 50S ribosomal protein L35 7 1 Op 7 . 
- CDS 5407 - 7275 1893 ## COG0441 Threonyl-tRNA synthetase - Prom 7520 - 7579 7.1 8 2 Tu 1 . - CDS 7738 - 7821 57 ## + Prom 7756 - 7815 6.1 9 3 Tu 1 . + CDS 7859 - 8332 93 ## COG0666 FOG: Ankyrin repeat + Prom 9030 - 9089 3.4 10 4 Op 1 . + CDS 9114 - 9758 278 ## COG0666 FOG: Ankyrin repeat 11 4 Op 2 . + CDS 9825 - 10037 145 ## SFV_1500 hypothetical protein + Term 10055 - 10097 -0.2 - Term 10045 - 10081 6.2 12 5 Tu 1 . - CDS 10090 - 10848 642 ## COG3137 Putative salt-induced outer membrane protein - Prom 10977 - 11036 11.7 + Prom 10992 - 11051 6.1 13 6 Tu 1 . + CDS 11135 - 12064 1007 ## COG1105 Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) 14 7 Tu 1 . + CDS 12165 - 12455 281 ## ECH74115_2442 hypothetical protein + Prom 12482 - 12541 3.4 15 8 Tu 1 . + CDS 12561 - 13421 758 ## COG3001 Fructosamine-3-kinase + Term 13439 - 13467 1.3 - Term 13427 - 13455 1.3 16 9 Tu 1 . - CDS 13462 - 13998 488 ## JW1715 predicted inner membrane protein - Prom 14168 - 14227 3.3 + Prom 14030 - 14089 3.2 17 10 Tu 1 . + CDS 14145 - 14813 770 ## COG0637 Predicted phosphatase/phosphohexomutase + Term 14815 - 14858 12.2 + Prom 14893 - 14952 4.5 18 11 Op 1 1/1.000 + CDS 14976 - 15566 292 ## COG1988 Predicted membrane-bound metal-dependent hydrolases 19 11 Op 2 . + CDS 15699 - 17090 1747 ## COG1823 Predicted Na+/dicarboxylate symporter + Term 17101 - 17145 7.0 20 12 Tu 1 . - CDS 17094 - 17336 96 ## JW1719 hypothetical protein - Prom 17574 - 17633 6.6 - Term 18006 - 18050 -0.8 21 13 Tu 1 . - CDS 18186 - 18449 205 ## SSON_1427 cell division modulator - Prom 18605 - 18664 4.1 + Prom 18400 - 18459 2.8 22 14 Tu 1 . 
+ CDS 18632 - 20893 2321 ## COG0753 Catalase Predicted protein(s) >gi|296494451|gb|ADTN01000287.1| GENE 1 18 - 317 318 99 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148826992|ref|YP_001291745.1| 50S ribosomal protein L35 [Haemophilus influenzae PittGG] # 3 92 4 93 96 127 66 8e-29 MALTKAEMSEYLFDKLGLSKRDAKELVELFFEEIRRALENGEQVKLSGFGNFDLRDKNQR PGRNPKTGEDIPITARRVVTFRPGQKLKSRVENASPKDE >gi|296494451|gb|ADTN01000287.1| GENE 2 322 - 2709 2888 795 aa, chain - ## HITS:1 COG:ECs2420_2 KEGG:ns NR:ns ## COG: ECs2420_2 COG0072 # Protein_GI_number: 15831674 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Escherichia coli O157:H7 # 140 795 1 656 656 1294 99.0 0 MKFSELWLREWVNPAIDSDALANQITMAGLEVDGVEPVAGSFHGVVVGEVVECAQHPNAD KLRVTKVNVGGDRLLDIVCGAPNCRQGLRVAVATIGAVLPGDFKIKAAKLRGEPSEGMLC SFSELGISDDHSGIIELPADAPIGTDIREYLKLDDNTIEISVTPNRADCLGIIGVARDVA VLNQLPLVQPEIVPVGATIDDTLPITVEAPEACPRYLGRVVKGINVKAPTPLWMKEKLRR CGIRSIDAVVDVTNYVLLELGQPMHAFDKDRIEGGIVVRMAKEGETLVLLDGTEAKLNAD TLVIADHNKALAMGGIFGGEHSGVNDETQNVLLECAFFSPLSITGRARRHGLHTDASHRY ERGVDPALQHKAMERATRLLIDICGGEAGPVIDITNEATLPKRATITLRRSKLDRLIGHH IADEQVTDILRRLGCEVTEGKDEWQAVAPSWRFDMEIEEDLVEEVARVYGYNNIPDEPVQ ASLIMGTHREADLSLKRVKTLLNDKGYQEVITYSFVDPKVQQMIHPGVEALLLPSPISVE MSAMRLSLWTGLLATVVYNQNRQQNRVRIFESGLRFVPDTQAPLGIRQDLMLAGVICGNR YEEHWNLAKETVDFYDLKGDLESVLDLTGKLNEVEFRAEANPALHPGQSAAIYLKGERIG FVGVVHPELERKLDLNGRTLVFELEWNKLADRVVPQAREISRFPANRRDIAVVVAENVPA ADILSECKKVGVNQVVGVNLFDVYRGKGVAEGYKSLAISLILQDTSRTLEEEEIAATVAK CVEALKERFQASLRD >gi|296494451|gb|ADTN01000287.1| GENE 3 2724 - 3707 1283 327 aa, chain - ## HITS:1 COG:pheS KEGG:ns NR:ns ## COG: pheS COG0016 # Protein_GI_number: 16129670 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase alpha subunit # Organism: Escherichia coli K12 # 1 327 1 327 327 668 100.0 0 MSHLAELVASAKAAISQASDVAALDNVRVEYLGKKGHLTLQMTTLRELPPEERPAAGAVI 
NEAKEQVQQALNARKAELESAALNARLAAETIDVSLPGRRIENGGLHPVTRTIDRIESFF GELGFTVATGPEIEDDYHNFDALNIPGHHPARADHDTFWFDTTRLLRTQTSGVQIRTMKA QQPPIRIIAPGRVYRNDYDQTHTPMFHQMEGLIVDTNISFTNLKGTLHDFLRNFFEEDLQ IRFRPSYFPFTEPSAEVDVMGKNGKWLEVLGCGMVHPNVLRNVGIDPEVYSGFAFGMGME RLTMLRYGVTDLRSFFENDLRFLKQFK >gi|296494451|gb|ADTN01000287.1| GENE 4 4158 - 4514 578 118 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15802128|ref|NP_288150.1| 50S ribosomal protein L20 [Escherichia coli O157:H7 EDL933] # 1 118 1 118 118 227 100 6e-59 MARVKRGVIARARHKKILKQAKGYYGARSRVYRVAFQAVIKAGQYAYRDRRQRKRQFRQL WIARINAAARQNGISYSKFINGLKKASVEIDRKILADIAVFDKVAFTALVEKAKAALA >gi|296494451|gb|ADTN01000287.1| GENE 5 4567 - 4779 358 70 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|110805473|ref|YP_688993.1| 50S ribosomal protein L35 [Shigella flexneri 5 str. 8401] # 1 70 1 70 70 142 98 2e-33 MEVIKMPKIKTVRGAAKRFKKTGKGGFKHKHANLRHILTKKATKRKRHLRPKAMVSKGDL GLVIACLPYA >gi|296494451|gb|ADTN01000287.1| GENE 6 4861 - 5340 665 159 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167856598|ref|ZP_02479300.1| 50S ribosomal protein L35 [Haemophilus parasuis 29755] # 1 158 2 159 159 260 81 5e-69 QEVRLTGLEGEQLGIVSLREALEKAEEAGVDLVEISPNAEPPVCRIMDYGKFLYEKSKSS KEQKKKQKVIQVKEIKFRPGTDEGDYQVKLRSLIRFLEEGDKAKITLRFRGREMAHQQIG MEVLNRVKDDLQELAVVESFPTKIEGRQMIMVLAPKKKQ >gi|296494451|gb|ADTN01000287.1| GENE 7 5407 - 7275 1893 622 aa, chain - ## HITS:1 COG:thrS KEGG:ns NR:ns ## COG: thrS COG0441 # Protein_GI_number: 16129675 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Escherichia coli K12 # 1 622 21 642 642 1311 100.0 0 MDVALDIGPGLAKACIAGRVNGELVDACDLIENDAQLSIITAKDEEGLEIIRHSCAHLLG HAIKQLWPHTKMAIGPVIDNGFYYDVDLDRTLTQEDVEALEKRMHELAEKNYDVIKKKVS WHEARETFANRGESYKVSILDENIAHDDKPGLYFHEEYVDMCRGPHVPNMRFCHHFKLMK TAGAYWRGDSNNKMLQRIYGTAWADKKALNAYLQRLEEAAKRDHRKIGKQLDLYHMQEEA PGMVFWHNDGWTIFRELEVFVRSKLKEYQYQEVKGPFMMDRVLWEKTGHWDNYKDAMFTT SSENREYCIKPMNCPGHVQIFNQGLKSYRDLPLRMAEFGSCHRNEPSGSLHGLMRVRGFT 
QDDAHIFCTEEQIRDEVNGCIRLVYDMYSTFGFEKIVVKLSTRPEKRIGSDEMWDRAEAD LAVALEENNIPFEYQLGEGAFYGPKIEFTLYDCLDRAWQCGTVQLDFSLPSRLSASYVGE DNERKVPVMIHRAILGSMERFIGILTEEFAGFFPTWLAPVQVVIMNITDSQSEYVNELTQ KLSNAGIRVKADLRNEKIGFKIREHTLRRVPYMLVCGDKEVESGKVAVRTRRGKDLGSMD VNEVIEKLQQEIRSRSLKQLEE >gi|296494451|gb|ADTN01000287.1| GENE 8 7738 - 7821 57 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MMINYYRQNEMCTKNCYKNCVTFIIVI >gi|296494451|gb|ADTN01000287.1| GENE 9 7859 - 8332 93 157 aa, chain + ## HITS:1 COG:b1721m KEGG:ns NR:ns ## COG: b1721m COG0666 # Protein_GI_number: 16132225 # Func_class: R General function prediction only # Function: FOG: Ankyrin repeat # Organism: Escherichia coli K12 # 1 151 1 151 632 290 98.0 1e-78 MSQNDIIIRTHYKSPHRLHIDSDIPTPSSEPINQFARQLITLLDTSDLSSMLSYCVTQEF TANCRKISQNCYSTALFTINFATSPIHTENILITLHYKKEIISLLLETTPIKANHLRSIL DYIEQEQLTAEDRNHCMKLSKKIHREKNYTPNSKSQW >gi|296494451|gb|ADTN01000287.1| GENE 10 9114 - 9758 278 214 aa, chain + ## HITS:1 COG:b1721m KEGG:ns NR:ns ## COG: b1721m COG0666 # Protein_GI_number: 16132225 # Func_class: R General function prediction only # Function: FOG: Ankyrin repeat # Organism: Escherichia coli K12 # 1 214 419 632 632 409 100.0 1e-114 MQEIQSLVDNHIIHEDNLVKLLQTKSANETPGLYISMLYGFDEIIDIFLNALTTPIAQEL LNKKLVMSILAMKIHDGEPGLYAAMENNHPLCVTRFLSKINGIAFKYKLSKANIMDLLKG ATAQGTPALYIAMSKGNEDVVLSYISTLGAFAKKHSFSQHQLFTLLAAKNHDNMSAVHIA IHHKHYKTVETYYAAINAISQSLSFSADEIKTYL >gi|296494451|gb|ADTN01000287.1| GENE 11 9825 - 10037 145 70 aa, chain + ## HITS:1 COG:no KEGG:SFV_1500 NR:ns ## KEGG: SFV_1500 # Name: not_defined # Def: hypothetical protein # Organism: S.flexneri_8401 # Pathway: not_defined # 1 70 3 72 72 134 100.0 9e-31 MCTTPPDGCNNKKSLDACPFCYTPLSRTRDMQDTGMPTKRFDKKHWKMVVVLLAICGAML LLRWAAMIWG >gi|296494451|gb|ADTN01000287.1| GENE 12 10090 - 10848 642 252 aa, chain - ## HITS:1 COG:ydiY KEGG:ns NR:ns ## COG: ydiY COG3137 # Protein_GI_number: 16129676 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative salt-induced outer membrane protein # 
Organism: Escherichia coli K12 # 1 252 1 252 252 463 100.0 1e-130 MKLLKTVPAIVMLAGGMFASLNAAADDSVFTVMDDPASAKKPFEGNLNAGYLAQSGNTKS SSLTADTTMTWYGHTTAWSLWGNASNTSSNDERSSEKYAAGGRSRFNLTDYDYLFGQASW LTDRYNGYRERDVLTAGYGRQFLNGPVHSFRFEFGPGVRYDKYTDNASETQPLGYASGAY AWQLTDNAKFTQGVSVFGAEDTTLNSESALNVAINEHFGLKVAYNVTWNSEPPESAPEHT DRRTTLSLGYSM >gi|296494451|gb|ADTN01000287.1| GENE 13 11135 - 12064 1007 309 aa, chain + ## HITS:1 COG:pfkB KEGG:ns NR:ns ## COG: pfkB COG1105 # Protein_GI_number: 16129677 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) # Organism: Escherichia coli K12 # 1 309 1 309 309 570 100.0 1e-162 MVRIYTLTLAPSLDSATITPQIYPEGKLRCTAPVFEPGGGGINVARAIAHLGGSATAIFP AGGATGEHLVSLLADENVPVATVEAKDWTRQNLHVHVEASGEQYRFVMPGAALNEDEFRQ LEEQVLEIESGAILVISGSLPPGVKLEKLTQLISAAQKQGIRCIVDSSGEALSAALAIGN IELVKPNQKELSALVNRELTQPDDVRKAAQEIVNSGKAKRVVVSLGPQGALGVDSENCIQ VVPPPVKSQSTVGAGDSMVGAMTLKLAENASLEEMVRFGVAAGSAATLNQGTRLCSHDDT QKIYAYLSR >gi|296494451|gb|ADTN01000287.1| GENE 14 12165 - 12455 281 96 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_2442 NR:ns ## KEGG: ECH74115_2442 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 96 1 96 96 158 100.0 4e-38 MASGDLVRYVITVMLHEDTLTEINELNNYLTRDGFLLTMTDDEGNIHELGTNTFGLISTQ SEEEIRELVSGLTQSATGKDPEITITTWEEWNSNRK >gi|296494451|gb|ADTN01000287.1| GENE 15 12561 - 13421 758 286 aa, chain + ## HITS:1 COG:yniA KEGG:ns NR:ns ## COG: yniA COG3001 # Protein_GI_number: 16129679 # Func_class: G Carbohydrate transport and metabolism # Function: Fructosamine-3-kinase # Organism: Escherichia coli K12 # 1 286 1 286 286 573 99.0 1e-163 MWQAISRLLSEQLGEGEIELRNELPGGEVHAAWHLRYAGHDFFVKCDERELLPGFTAEAD QLELLSRRKTVTVPKVWAVGADRDYSFLVMDYLPPRPLDAHSAFILGQQIARLHQWSDQP QFGLDFDNALSTTPQPNTWQRRWSTFFAEQRIGWQLELAAEKGIAFGNIDAIVEHIQQRL ASHQPQPSLLHGDLWSGNCALGPDGPYIFDPACYWGDRECDLAMLPLHTEQPPQIYDGYQ SVSPLPADFLERQPVYQLYTLLNRARLFGGQHLVIAQQSLDRLLAA 
>gi|296494451|gb|ADTN01000287.1| GENE 16 13462 - 13998 488 178 aa, chain - ## HITS:1 COG:no KEGG:JW1715 NR:ns ## KEGG: JW1715 # Name: yniB # Def: predicted inner membrane protein # Organism: E.coli_J # Pathway: not_defined # 1 178 1 178 178 338 100.0 4e-92 MTYQQAGRIAVLKRILGWVIFIPALISTLISLLKFMNTRQENKEGINAVMLDFTHVMIDM MQANTPFLNLFWYNSPTPNFNGGVNVMFWVIFILIFVGLALQDSGARMSRQARFLREGVE DQLILEKAKGEEGLTREQIESRIVVPHHTIFLQFFSLYILPVICIAAGYVFFSLLGFI >gi|296494451|gb|ADTN01000287.1| GENE 17 14145 - 14813 770 222 aa, chain + ## HITS:1 COG:yniC KEGG:ns NR:ns ## COG: yniC COG0637 # Protein_GI_number: 16129681 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Escherichia coli K12 # 1 222 1 222 222 407 99.0 1e-114 MSTPRQILAAIFDMDGLLIDSEPLWDRAELDVMASLGVDISRRNELPDTLGLRIDMVVDL WYARQPWNGPSRQEVVERVIARAISLVEETRPLLPGVREAVALCKEQGLLVGLASASPLH MLEKVLTMFDLRDSFDSLASAEKLPYSKPHPQVYLDCAAKLGVDPLTCVALEDSVNGMIA SKAARMRSIVVPAPEAQNDPRFVLADVKLSSLTELTAKDLLG >gi|296494451|gb|ADTN01000287.1| GENE 18 14976 - 15566 292 196 aa, chain + ## HITS:1 COG:ydjM KEGG:ns NR:ns ## COG: ydjM COG1988 # Protein_GI_number: 16129682 # Func_class: R General function prediction only # Function: Predicted membrane-bound metal-dependent hydrolases # Organism: Escherichia coli K12 # 1 196 5 200 200 380 100.0 1e-106 MTAEGHLLFSIACAVFAKNAELTPVLAQGDWWHIVPSAILTCLLPDIDHPKSFLGQRLKW ISKPIARAFGHRGFTHSLLAVFALLATFYLKVPEGWFIPADALQGMVLGYLSHILADMLT PAGVPLLWPCRWRFRLPILVPQKGNQLERFICMALFVWSVWMPHSLPENSAVRWSSQMIN TLQIQFHRLIKHQVEY >gi|296494451|gb|ADTN01000287.1| GENE 19 15699 - 17090 1747 463 aa, chain + ## HITS:1 COG:ydjN KEGG:ns NR:ns ## COG: ydjN COG1823 # Protein_GI_number: 16129683 # Func_class: R General function prediction only # Function: Predicted Na+/dicarboxylate symporter # Organism: Escherichia coli K12 # 1 463 1 463 463 752 99.0 0 MNFPLIANIVVFVVLLFALAQTRHKQWSLAKKVLVGLVMGVVFGLALHTIYGSDSQVLKD SVQWFNIVGNGYVQLLQMIVMPLVFASILSAVARLHNASQLGKISFLTIGTLLFTTLIAA 
LVGVLVTNLFGLTAEGLVQGGAETARLNAIESNYVGKVSDLSVPQLVLSFIPKNPFADLT GANPTSIISVVIFAAFLGVAALKLLKDDAPKGERVLAAIDTLQSWVMKLVRLVMQLTPYG VLALMTKVVAGSNLQDIIKLGSFVVASYLGLLIMFAVHGILLGINGVSPLKYFRKVWPVL TFAFTSRSSAASIPLNVEAQTRRLGVPESIASFAASFGATIGQNGCAGLYPAMLAVMVAP TGGINPLDPMWIATLVGIVTVSSAGVAGVGGGATFAALIVLPAMGLPVTLVALLISVEPL IDMGRTALNVSGSMTAGTLTSQWLKQTDKAILDSEDDAELAHH >gi|296494451|gb|ADTN01000287.1| GENE 20 17094 - 17336 96 80 aa, chain - ## HITS:1 COG:no KEGG:JW1719 NR:ns ## KEGG: JW1719 # Name: ydjO # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 80 192 271 271 164 100.0 9e-40 MGSGVLEDIMSIPDSSLRKRLGYEVLSFSLQAHSLSQECIDKLDIFFADDLFKYESVCIA AMEHLKSKATAPIQNGPLPA >gi|296494451|gb|ADTN01000287.1| GENE 21 18186 - 18449 205 87 aa, chain - ## HITS:1 COG:no KEGG:SSON_1427 NR:ns ## KEGG: SSON_1427 # Name: not_defined # Def: cell division modulator # Organism: S.sonnei # Pathway: not_defined # 1 87 1 87 87 155 100.0 6e-37 MRLVKPVMKKPLRQQNRQIISYVPRTEPAPPEHAIKMDSFRDVWMLRGKYVAFVLMGESF LRSPAFTVPESAQRWANQIRQEGEVTE >gi|296494451|gb|ADTN01000287.1| GENE 22 18632 - 20893 2321 753 aa, chain + ## HITS:1 COG:katE KEGG:ns NR:ns ## COG: katE COG0753 # Protein_GI_number: 16129686 # Func_class: P Inorganic ion transport and metabolism # Function: Catalase # Organism: Escherichia coli K12 # 1 753 1 753 753 1514 99.0 0 MSQHNEKNPHQHQSPLHDSSEAKPGMDSLAPEDGSHRPAAEPTPPGAQPTAPGSLKAPDT RNEKLNSLEDVRKGSENYALTTNQGVRIADDQNSLRAGSRGPTLLEDFILREKITHFDHE RIPERIVHARGSAAHGYFQPYKSLSDITKADFLSDPNKITPVFVRFSTVQGGAGSADTVR DTRGFATKFYTEEGIFDLVGNNTPIFFIQDAHKFPDFVHAVKPEPHWAIPQGQSAHDTFW DYVSLQPETLHNVMWAMSDRGIPRSYRTMEGFGIHTFRLINAEGKATFVRFHWKPLAGKA SLVWDEAQKLTGRDPDFHRRELWEAIEAGDFPEYELGFQLIPEEDEFKFDFDLLDPTKLI PEELVPVQRVGKMVLNRNPDNFFAENEQAAFHPGHIVPGLDFTNDPLLQGRLFSYTDTQI SRLGGPNFHEIPINRPTCPYHNFQRDGMHRMGIDTNPANYEPNSINDNWPRETPPGPKRG GFESYQERVEGNKVRERSPSFGEYYSHPRLFWLSQTPFEQRHIVDGFSFELSKVVRPYIR ERVVDQLAHIDLTLAQAVAKNLGIELTDDQLNITPPPDVNGLKKDPSLSLYAIPDGDVKG RVVAILLNDEVRSADLLAILKALKAKGVHAKLLYSRMGEVTADDGTVLPIAATFAGAPSL 
TVDAVIVPCGNIADIADNGDANYYLMEAYKHLKPIALAGDARKFKATIKVADQGEEGIVE ADSADGSFMDELLTLMAAHRVWSRIPKIDKIPA Prediction of potential genes in microbial genomes Time: Mon May 16 00:08:33 2011 Seq name: gi|296494450|gb|ADTN01000288.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont710.18, whole genome shotgun sequence Length of sequence - 8901 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 4, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 136 - 181 2.4 1 1 Op 1 5/0.000 - CDS 192 - 941 633 ## COG3394 Uncharacterized protein conserved in bacteria 2 1 Op 2 4/0.000 - CDS 954 - 2306 1594 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases 3 1 Op 3 5/0.000 - CDS 2411 - 3250 753 ## COG2207 AraC-type DNA-binding domain-containing proteins 4 1 Op 4 13/0.000 - CDS 3261 - 3611 590 ## COG1447 Phosphotransferase system cellobiose-specific component IIA 5 1 Op 5 10/0.000 - CDS 3662 - 5020 1632 ## COG1455 Phosphotransferase system cellobiose-specific component IIC - Prom 5040 - 5099 4.4 - Term 5026 - 5059 1.4 6 1 Op 6 . - CDS 5105 - 5425 390 ## COG1440 Phosphotransferase system cellobiose-specific component IIB - Prom 5532 - 5591 4.7 - Term 5658 - 5693 4.5 7 2 Tu 1 . - CDS 5724 - 6062 445 ## ECIAI39_1315 DNA-binding transcriptional activator OsmE - Prom 6163 - 6222 1.7 8 3 Op 1 4/0.000 + CDS 6264 - 7091 854 ## COG0171 NAD synthase + Term 7130 - 7169 8.2 + Prom 7239 - 7298 3.6 9 3 Op 2 . + CDS 7459 - 8208 621 ## COG0322 Nuclease subunit of the excinuclease complex 10 4 Tu 1 . 
- CDS 8168 - 8743 291 ## COG3758 Uncharacterized protein conserved in bacteria - Prom 8833 - 8892 2.5 Predicted protein(s) >gi|296494450|gb|ADTN01000288.1| GENE 1 192 - 941 633 249 aa, chain - ## HITS:1 COG:chbG KEGG:ns NR:ns ## COG: chbG COG3394 # Protein_GI_number: 16129687 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 249 1 249 249 514 100.0 1e-146 MERLLIVNADDFGLSKGQNYGIIEACRNGIVTSTTALVNGQAIDHAVQLSRDEPSLAIGM HFVLTMGKPLTAMPGLTRDGVLGKWIWQLAEEDALPLEEITQELVSQYLRFIELFGRKPT HLDSHHHVHMFPQIFPIVARFAAEQGIALRADRQMAFDLPVNLRTTQGFSSAFYGEEISE SLFLQVLDDAGHRGDRSLEVMCHPAFIDNTIRQSAYCFPRLTELDVLTSASLKGAIAQRG YRLGSYRDV >gi|296494450|gb|ADTN01000288.1| GENE 2 954 - 2306 1594 450 aa, chain - ## HITS:1 COG:celF KEGG:ns NR:ns ## COG: celF COG1486 # Protein_GI_number: 16129688 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Escherichia coli K12 # 1 450 1 450 450 941 99.0 0 MSQKLKVVTIGGGSSYTPELLEGFIKRYHELPVSELWLVDVEGGKAKLDIIFDLCQRMID NAGVPMKLYKTLDRREALKDADFVTTQLRVGQLPARELDERIPLSHGYLGQETNGAGGLF KGLRTIPVIFDIVKDVEELCPNAWVINFTNPAGMVTEAVYRHTGFKRFIGVCNIPIGMKM FIRDVLMLKDSDDLSIDLFGLNHMVFIKDVLVNGKSRFAELLDGVASGQLKASSVKNIFD LPFSEGLIRSLNLLPCSYLLYYFKQKEMLAIEMGEYYKGGARAQVVQKVEKQLFELYKNP ELKVKPKELEQRGGAYYSDAACEVINAIYNDKQAEHYVNIPHHGQIDNIPADWAVEMTCK LGRDGATPHPRITHFDDKVMGLIHTIKGFEIAASNAALSGEFNDVLLALNLSPLVHSDRD AELLAREMILAHEKWLPNFADCIAELKKAH >gi|296494450|gb|ADTN01000288.1| GENE 3 2411 - 3250 753 279 aa, chain - ## HITS:1 COG:celD KEGG:ns NR:ns ## COG: celD COG2207 # Protein_GI_number: 16129689 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 279 2 280 280 519 100.0 1e-147 MQPVINAPEIATAREQQLFNGKNFHVFIYNKTESISGLHQHDYYEFTLVLTGRYFQEING KRVLLERGDFVFIPLGSHHQSFYEFGATRILNVGISKRFFEQHYLPLLPYCFVASQVYRT 
NNAFLTYVETVISSLNFRETGLEEFVEMVTFYVINRLRHYREEQVIDDVPQWLKSTVEKM HDKEQFSESALENMVALSAKSQEYLTRATQRYYGKTPMQIINEIRINFAKKQLEMTNYSV TDIAFEAGYSSPSLFIKTFKKLTSFTPKSYRKKLTEFNQ >gi|296494450|gb|ADTN01000288.1| GENE 4 3261 - 3611 590 116 aa, chain - ## HITS:1 COG:ECs2442 KEGG:ns NR:ns ## COG: ECs2442 COG1447 # Protein_GI_number: 15831696 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIA # Organism: Escherichia coli O157:H7 # 1 116 1 116 116 173 100.0 9e-44 MMDLDNIPDTQTEAEELEEVVMGLIINSGQARSLAYAALKQAKQGDFAAAKAMMDQSRMA LNEAHLVQTKLIEGDAGEGKMKVSLVLVHAQDHLMTSMLARELITELIELHEKLKA >gi|296494450|gb|ADTN01000288.1| GENE 5 3662 - 5020 1632 452 aa, chain - ## HITS:1 COG:celB KEGG:ns NR:ns ## COG: celB COG1455 # Protein_GI_number: 16129691 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Escherichia coli K12 # 1 452 1 452 452 789 99.0 0 MSNVIASLEKVLLPFAVKIGKQPHVNAIKNGFIRLMPLTLAGAMFVLINNVFLSFGEGSF FYFFGIRLDASTIETLNGLKGIGGNVYNGTLGIMSLMAPFFIGMALAEERKVDALAAGLL SVAAFMTVTPYSVGEAYAVGANWLGGANIISGIIIGLVVAEMFTFIVRRNWVIKLPDSVP ASVSRSFSALIPGFIILSVMGIIAWALNTWGTNFHQIIMDTISTPLASLGSVVGWAYVIF VPLLWFFGIHGALALTALDNGIMTPWALENIATYQQYGSVEAALAAGKTFHIWAKPMLDS FIFLGGSGATLGLILAIFIASRRADYRQVAKLALPSGIFQINEPILFGLPIIMNPVMFIP FVLVQPILAAITLAAYYMGIIPPVTNIAPWTMPTGLGAFFNTNGSVAALLVALFNLGIAT LIYLPFVVVANKAQNAIDKEESEEDIANALKF >gi|296494450|gb|ADTN01000288.1| GENE 6 5105 - 5425 390 106 aa, chain - ## HITS:1 COG:ECs2444 KEGG:ns NR:ns ## COG: ECs2444 COG1440 # Protein_GI_number: 15831698 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Escherichia coli O157:H7 # 1 93 1 93 106 174 100.0 3e-44 MEKKHIYLFCSAGMSTSLLVSKMRAQAEKYEVPVIIEAFPETLAGEKGQNADVVLLGPQI AYMLPEIQRLLPNKPVEVIDSLLYGKVDGLGVLKAAVAAIKKAAAN >gi|296494450|gb|ADTN01000288.1| GENE 7 5724 - 6062 445 112 aa, chain - ## HITS:1 COG:no 
KEGG:ECIAI39_1315 NR:ns ## KEGG: ECIAI39_1315 # Name: osmE # Def: DNA-binding transcriptional activator OsmE # Organism: E.coli_IAI39 # Pathway: not_defined # 1 112 1 112 112 207 100.0 9e-53 MNKNMAGILSAAAVLTMLAGCTAYDRTKDQFVQPVVKDVKKGMSRAQVAQIAGKPSSEVS MIHARGTCQTYILGQRDGKAETYFVALDDTGHVINSGYQTCAEYDTDPQAAK >gi|296494450|gb|ADTN01000288.1| GENE 8 6264 - 7091 854 275 aa, chain + ## HITS:1 COG:nadE KEGG:ns NR:ns ## COG: nadE COG0171 # Protein_GI_number: 16129694 # Func_class: H Coenzyme transport and metabolism # Function: NAD synthase # Organism: Escherichia coli K12 # 1 275 1 275 275 553 100.0 1e-157 MTLQQQIIKALGAKPQINAEEEIRRSVDFLKSYLQTYPFIKSLVLGISGGQDSTLAGKLC QMAINELRLETGNESLQFIAVRLPYGVQADEQDCQDAIAFIQPDRVLTVNIKGAVLASEQ ALREAGIELSDFVRGNEKARERMKAQYSIAGMTSGVVVGTDHAAEAITGFFTKYGDGGTD INPLYRLNKRQGKQLLAALACPEHLYKKAPTADLEDDRPSLPDEVALGVTYDNIDDYLEG KNVPQQVARTIENWYLKTEHKRRPPITVFDDFWKK >gi|296494450|gb|ADTN01000288.1| GENE 9 7459 - 8208 621 249 aa, chain + ## HITS:1 COG:ydjQ KEGG:ns NR:ns ## COG: ydjQ COG0322 # Protein_GI_number: 16129695 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: Escherichia coli K12 # 1 249 47 295 295 499 100.0 1e-141 MPLYIGKSVNIRSRVLSHLRTPDEAAMLRQSRRISWICTAGEIGALLLEARLIKEQQPLF NKRLRRNRQLCALQLNEKRVDVVYAKEVDFSRAPNLFGLFANRRAALQALQTIADEQKLC YGLLGLEPLSRGRACFRSALKRCAGACCGKESHEEHALRLRQSLERLRVVCWPWQGAVAL KEQHPEMTQYHIIQNWLWLGAVNSLEEATTLIRTPAGFDHDGYKILCKPLLSGNYEITEL DPANDQRAS >gi|296494450|gb|ADTN01000288.1| GENE 10 8168 - 8743 291 191 aa, chain - ## HITS:1 COG:ydjR KEGG:ns NR:ns ## COG: ydjR COG3758 # Protein_GI_number: 16129696 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 191 22 212 212 401 100.0 1e-112 MEYFDMRKMSVNLWRNAAGETREICTFPPAKRDFYWRASIASIAANGEFSLFPGMERIVT LLEGGEMLLESADRFNHTLKPFQPFAFAADQVVKAKLTAGQMSMDFNIMTRLDVCKAKVR IAERTFTTFGSRGGVVFVINGAWQLGDKLLTTDQGACWFDGRHTLRLLQPQGKLLFSEIN WLAGHSPDQVQ Prediction of 
potential genes in microbial genomes Time: Mon May 16 00:08:49 2011 Seq name: gi|296494449|gb|ADTN01000289.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont710.19, whole genome shotgun sequence Length of sequence - 39654 bp Number of predicted genes - 41, with homology - 40 Number of transcription units - 19, operones - 9 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 26 - 511 629 ## COG3678 P pilus assembly/Cpx signaling pathway, periplasmic inhibitor/zinc-resistance associated protein - Prom 622 - 681 5.5 - Term 801 - 834 5.4 2 2 Op 1 5/0.250 - CDS 841 - 1809 816 ## COG2988 Succinylglutamate desuccinylase 3 2 Op 2 7/0.000 - CDS 1802 - 3145 1320 ## COG3724 Succinylarginine dihydrolase 4 2 Op 3 8/0.000 - CDS 3142 - 4620 1336 ## COG1012 NAD-dependent aldehyde dehydrogenases 5 2 Op 4 7/0.000 - CDS 4617 - 5651 1037 ## COG3138 Arginine/ornithine N-succinyltransferase beta subunit 6 2 Op 5 . - CDS 5648 - 6868 1432 ## COG4992 Ornithine/acetylornithine aminotransferase - Prom 7068 - 7127 9.7 - Term 7140 - 7172 3.1 7 3 Tu 1 . - CDS 7216 - 7278 58 ## - Prom 7304 - 7363 3.4 + Prom 7057 - 7116 5.3 8 4 Tu 1 . + CDS 7314 - 8120 782 ## COG0708 Exonuclease III + Prom 8208 - 8267 3.5 9 5 Op 1 . + CDS 8287 - 8997 655 ## COG0398 Uncharacterized conserved protein 10 5 Op 2 . + CDS 9002 - 9679 827 ## EcSMS35_1439 hypothetical protein 11 5 Op 3 3/0.500 + CDS 9697 - 10401 672 ## COG0398 Uncharacterized conserved protein 12 5 Op 4 2/0.750 + CDS 10401 - 10949 505 ## COG2128 Uncharacterized conserved protein 13 5 Op 5 3/0.500 + CDS 10959 - 12125 1010 ## COG4134 ABC-type uncharacterized transport system, periplasmic component 14 5 Op 6 3/0.500 + CDS 12143 - 13633 1432 ## COG4135 ABC-type uncharacterized transport system, permease component 15 5 Op 7 1/1.000 + CDS 13633 - 14286 194 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 16 5 Op 8 . 
+ CDS 14353 - 15660 1397 ## COG2897 Rhodanese-related sulfurtransferase - Term 15606 - 15632 -1.0 17 6 Tu 1 . - CDS 15669 - 16289 545 ## COG0558 Phosphatidylglycerophosphate synthase - Prom 16331 - 16390 4.6 + Prom 16290 - 16349 4.7 18 7 Tu 1 . + CDS 16382 - 16783 326 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes - Term 16494 - 16522 -1.0 19 8 Tu 1 . - CDS 16749 - 17021 96 ## B21_01717 hypothetical protein - Prom 17208 - 17267 2.7 + Prom 17133 - 17192 4.1 20 9 Tu 1 . + CDS 17257 - 18600 1438 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase - Term 18438 - 18467 -0.2 21 10 Tu 1 . - CDS 18717 - 19757 106 ## B21_01719 hypothetical protein - Prom 19782 - 19841 6.3 - Term 19798 - 19848 9.8 22 11 Op 1 5/0.250 - CDS 19885 - 21846 1842 ## COG0550 Topoisomerase IA 23 11 Op 2 6/0.250 - CDS 21851 - 22894 1209 ## COG0709 Selenophosphate synthase - Prom 22916 - 22975 2.2 24 12 Tu 1 . - CDS 23011 - 23562 676 ## COG0778 Nitroreductase - Prom 23588 - 23647 4.9 + Prom 23387 - 23446 3.0 25 13 Op 1 . + CDS 23616 - 23780 84 ## ECUMN_2054 hypothetical protein 26 13 Op 2 4/0.500 + CDS 23723 - 25579 1910 ## COG0616 Periplasmic serine proteases (ClpP class) + Term 25592 - 25628 7.2 + Prom 25695 - 25754 5.5 27 14 Op 1 2/0.750 + CDS 25794 - 26762 984 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D 28 14 Op 2 . + CDS 26773 - 27414 578 ## COG1335 Amidases related to nicotinamidase 29 15 Tu 1 . - CDS 27507 - 28865 645 ## COG0477 Permeases of the major facilitator superfamily - Prom 28921 - 28980 2.0 - Term 28886 - 28920 4.0 30 16 Op 1 . 
- CDS 28982 - 29383 293 ## COG1349 Transcriptional regulators of sugar metabolism 31 16 Op 2 4/0.500 - CDS 29386 - 29739 197 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 29771 - 29830 5.1 32 17 Op 1 2/0.750 - CDS 29876 - 30856 811 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 33 17 Op 2 4/0.500 - CDS 30866 - 31813 710 ## COG0524 Sugar kinases, ribokinase family 34 17 Op 3 3/0.500 - CDS 31818 - 32654 754 ## COG0191 Fructose/tagatose bisphosphate aldolase 35 17 Op 4 7/0.000 - CDS 32675 - 33718 892 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 36 17 Op 5 7/0.000 - CDS 33735 - 35114 889 ## COG0477 Permeases of the major facilitator superfamily 37 17 Op 6 1/1.000 - CDS 35141 - 36217 1128 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases - Prom 36301 - 36360 4.1 38 18 Op 1 3/0.500 - CDS 36587 - 36859 341 ## COG3139 Uncharacterized protein conserved in bacteria 39 18 Op 2 . - CDS 36901 - 37314 227 ## COG0229 Conserved domain frequently associated with peptide methionine sulfoxide reductase + Prom 37422 - 37481 6.2 40 19 Op 1 4/0.500 + CDS 37656 - 38651 1101 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase + Term 38678 - 38707 2.1 + Prom 38655 - 38714 3.5 41 19 Op 2 . 
+ CDS 38735 - 39619 951 ## COG0676 Uncharacterized enzymes related to aldose 1-epimerase Predicted protein(s) >gi|296494449|gb|ADTN01000289.1| GENE 1 26 - 511 629 161 aa, chain - ## HITS:1 COG:spy KEGG:ns NR:ns ## COG: spy COG3678 # Protein_GI_number: 16129697 # Func_class: U Intracellular trafficking, secretion, and vesicular transport; N Cell motility; T Signal transduction mechanisms; P Inorganic ion transport and metabolism # Function: P pilus assembly/Cpx signaling pathway, periplasmic inhibitor/zinc-resistance associated protein # Organism: Escherichia coli K12 # 1 161 1 161 161 229 100.0 1e-60 MRKLTALFVASTLALGAANLAHAADTTTAAPADAKPMMHHKGKFGPHQDMMFKDLNLTDA QKQQIREIMKGQRDQMKRPPLEERRAMHDIIASDTFDKVKAEAQIAKMEEQRKANMLAHM ETQNKIYNILTPEQKKQFNANFEKRLTERPAAKGKMPATAE >gi|296494449|gb|ADTN01000289.1| GENE 2 841 - 1809 816 322 aa, chain - ## HITS:1 COG:ydjS KEGG:ns NR:ns ## COG: ydjS COG2988 # Protein_GI_number: 16129698 # Func_class: E Amino acid transport and metabolism # Function: Succinylglutamate desuccinylase # Organism: Escherichia coli K12 # 1 322 1 322 322 663 100.0 0 MDNFLALTLTGKKPVITEREINGVRWRWLGDGVLELTPLTPPQGALVISAGIHGNETAPV EMLDALLGAISHGEIPLRWRLLVILGNPPALKQGKRYCHSDMNRMFGGRWQLFAESGETC RARELEQCLEDFYDQGKESVRWHLDLHTAIRGSLHPQFGVLPQRDIPWDEKFLTWLGAAG LEALVFHQEPGGTFTHFSARHFGALACTLELGKALPFGQNDLRQFAVTASAIAALLSGES VGIVRTPPLRYRVVSQITRHSPSFEMHMASDTLNFMPFEKGTLLAQDGEERFTVTHDVEY VLFPNPLVALGLRAGLMLEKIS >gi|296494449|gb|ADTN01000289.1| GENE 3 1802 - 3145 1320 447 aa, chain - ## HITS:1 COG:astB KEGG:ns NR:ns ## COG: astB COG3724 # Protein_GI_number: 16129699 # Func_class: E Amino acid transport and metabolism # Function: Succinylarginine dihydrolase # Organism: Escherichia coli K12 # 1 447 1 447 447 890 100.0 0 MNAWEVNFDGLVGLTHHYAGLSFGNEASTRHRFQVSNPRLAAKQGLLKMKALADAGFPQA VIPPHERPFIPVLRQLGFSGSDEQVLEKVARQAPHWLSSVSSASPMWVANAATIAPSADT LDGKVHLTVANLNNKFHRSLEAPVTESLLKAIFNDEEKFSVHSALPQVALLGDEGAANHN RLGGHYGEPGMQLFVYGREEGNDTRPSRYPARQTREASEAVARLNQVNPQQVIFAQQNPD 
VIDQGVFHNDVIAVSNRQVLFCHQQAFARQSQLLANLRARVNGFMAIEVPATQVSVSDTV STYLFNSQLLSRDDGSMMLVLPQECREHAGVWGYLNELLAADNPISELKVFDLRESMANG GGPACLRLRVVLTEEERRAVNPAVMMNDTLFNALNDWVDRYYRDRLTAADLADPQLLREG REALDVLSQLLNLGSVYPFQREGGGNG >gi|296494449|gb|ADTN01000289.1| GENE 4 3142 - 4620 1336 492 aa, chain - ## HITS:1 COG:astD KEGG:ns NR:ns ## COG: astD COG1012 # Protein_GI_number: 16129700 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Escherichia coli K12 # 1 492 1 492 492 965 100.0 0 MTLWINGDWITGQGASRVKRNPVSGEVLWQGNDADAAQVEQACRAARAAFPRWARLSFAE RHAVVERFAALLESNKAELTAIIARETGKPRWEAATEVTAMINKIAISIKAYHVRTGEQR SEMPDGAASLRHRPHGVLAVFGPYNFPGHLPNGHIVPALLAGNTIIFKPSELTPWSGEAV MRLWQQAGLPPGVLNLVQGGRETGQALSALEDLDGLLFTGSANTGYQLHRQLSGQPEKIL ALEMGGNNPLIIDEVADIDAAVHLTIQSAFVTAGQRCTCARRLLLKSGAQGDAFLARLVA VSQRLTPGNWDDEPQPFIGGLISEQAAQQVVTAWQQLEAMGGRPLLAPRLLQAGTSLLTP GIIEMTGVAGVPDEEVFGPLLRVWRYDTFDEAIRMANNTRFGLSCGLVSPEREKFDQLLL EARAGIVNWNKPLTGAASTAPFGGIGASGNHRPSAWYAADYCAWPMASLESDSLTLPATL NPGLDFSDEVVR >gi|296494449|gb|ADTN01000289.1| GENE 5 4617 - 5651 1037 344 aa, chain - ## HITS:1 COG:ECs2453 KEGG:ns NR:ns ## COG: ECs2453 COG3138 # Protein_GI_number: 15831707 # Func_class: E Amino acid transport and metabolism # Function: Arginine/ornithine N-succinyltransferase beta subunit # Organism: Escherichia coli O157:H7 # 1 344 1 344 344 728 100.0 0 MMVIRPVERSDVSALMQLASKTGGGLTSLPANEATLSARIERAIKTWQGELPKSEQGYVF VLEDSETGTVAGICAIEVAVGLNDPWYNYRVGTLVHASKELNVYNALPTLFLSNDHTGSS ELCTLFLDPDWRKEGNGYLLSKSRFMFMAAFRDKFNDKVVAEMRGVIDEHGYSPFWQSLG KRFFSMDFSRADFLCGTGQKAFIAELMPKHPIYTHFLSQEAQDVIGQVHPQTAPARAVLE KEGFRYRNYIDIFDGGPTLECDIDRVRAIRKSRLVEVAEGQPAQGDFPACLVANENYHHF RVVLVRTDPATERLILTAAQLDALKCHAGDRVRLVRLCAEEKTA >gi|296494449|gb|ADTN01000289.1| GENE 6 5648 - 6868 1432 406 aa, chain - ## HITS:1 COG:cstC KEGG:ns NR:ns ## COG: cstC COG4992 # Protein_GI_number: 16129702 # Func_class: E Amino acid transport and metabolism # Function: Ornithine/acetylornithine aminotransferase # 
Organism: Escherichia coli K12 # 1 406 1 406 406 824 100.0 0 MSQPITRENFDEWMIPVYAPAPFIPVRGEGSRLWDQQGKEYIDFAGGIAVNALGHAHPEL REALNEQASKFWHTGNGYTNEPVLRLAKKLIDATFADRVFFCNSGAEANEAALKLARKFA HDRYGSHKSGIVAFKNAFHGRTLFTVSAGGQPAYSQDFAPLPADIRHAAYNDINSASALI DDSTCAVIVEPIQGEGGVVPASNAFLQGLRELCNRHNALLIFDEVQTGVGRTGELYAYMH YGVTPDLLTTAKALGGGFPVGALLATEECARVMTVGTHGTTYGGNPLASAVAGKVLELIN TPEMLNGVKQRHDWFVERLNTINHRYGLFSEVRGLGLLIGCVLNADYAGQAKQISQEAAK AGVMVLIAGGNVVRFAPALNVSEEEVTTGLDRFAAACEHFVSRGSS >gi|296494449|gb|ADTN01000289.1| GENE 7 7216 - 7278 58 20 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVAEFRVAYRLLLSDLIDGC >gi|296494449|gb|ADTN01000289.1| GENE 8 7314 - 8120 782 268 aa, chain + ## HITS:1 COG:xthA KEGG:ns NR:ns ## COG: xthA COG0708 # Protein_GI_number: 16129703 # Func_class: L Replication, recombination and repair # Function: Exonuclease III # Organism: Escherichia coli K12 # 1 268 1 268 268 566 100.0 1e-161 MKFVSFNINGLRARPHQLEAIVEKHQPDVIGLQETKVHDDMFPLEEVAKLGYNVFYHGQK GHYGVALLTKETPIAVRRGFPGDDEEAQRRIIMAEIPSLLGNVTVINGYFPQGESRDHPI KFPAKAQFYQNLQNYLETELKRDNPVLIMGDMNISPTDLDIGIGEENRKRWLRTGKCSFL PEEREWMDRLMSWGLVDTFRHANPQTADRFSWFDYRSKGFDDNRGLRIDLLLASQPLAEC CVETGIDYEIRSMEKPSDHAPVWATFRR >gi|296494449|gb|ADTN01000289.1| GENE 9 8287 - 8997 655 236 aa, chain + ## HITS:1 COG:ydjX KEGG:ns NR:ns ## COG: ydjX COG0398 # Protein_GI_number: 16129704 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 236 17 252 252 409 100.0 1e-114 MNAERKFLFACLIFALVIYAIHAFGLFDLLTDLPHLQTLIRQSGFFGYSLYILLFIIATL LLLPGSILVIAGGIVFGPLLGTLLSLIAATLASSCSFLLARWLGRDLLLKYVGHSNTFQA IEKGIARNGIDFLILTRLIPLFPYNIQNYAYGLTTIAFWPYTLISALTTLPGIVIYTVMA SDLANEGITLRFILQLCLAGLALFILVQLAKLYARHKHVDLSASRRSPLTHPKNEG >gi|296494449|gb|ADTN01000289.1| GENE 10 9002 - 9679 827 225 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_1439 NR:ns ## KEGG: EcSMS35_1439 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 225 1 225 225 437 98.0 1e-121 
MLQHYSVSWKKGLAALCLLAVAGLSGCDQQENAAAKVEYDGLSNSQPLRVDANNHTVTML VQINGRFLTDDTRHGIVFKDGSNGHKSLFMGYATPKAFYEALKEAGGTPGENMTMDNKET THVTGSKLDISVNWQGAAKAYSFDEVIVDSNGKKLDMRFGGNLTAAEEKKTGCLVCLDSC PVGIVSNATYTYGAVEKRGEVKFKGNASVLPADNTLATVTFKIAE >gi|296494449|gb|ADTN01000289.1| GENE 11 9697 - 10401 672 234 aa, chain + ## HITS:1 COG:ydjZ KEGG:ns NR:ns ## COG: ydjZ COG0398 # Protein_GI_number: 16129706 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 234 2 235 235 402 100.0 1e-112 MMMQSRKIWYYRITLIILLFAMLLAWALLPGVHEFINRSVAAFAAVDQQGIERFIQSYGA LAAVVSFLLMILQAIAAPLPAFLITFANASLFGAFWGGLLSWTSSMAGAALCFFIARVMG REVVEKLTGKTVLDSMDGFFTRYGKHTILVCRLLPFVPFDPISYAAGLTSIRFRSFFIAT GLGQLPATIVYSWAGSMLTGGTFWFVTGLFILFALTVVIFMAKKIWLERQKRNA >gi|296494449|gb|ADTN01000289.1| GENE 12 10401 - 10949 505 182 aa, chain + ## HITS:1 COG:ynjA KEGG:ns NR:ns ## COG: ynjA COG2128 # Protein_GI_number: 16129707 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 182 1 182 182 353 100.0 1e-97 MGLPPLSKIPLILRPQAWLHRRHYGEVLSPIRWWGRIPFIFYLVSMFVGWLERKRSPLDP VVRSLVSARIAQMCLCEFCVDITSMKVAERTGSSDKLLAVADWRQSPLFSDEERLALEYA EAASVTPPTVDDALRTRLAAHFDAQALTELTALIGLQNLSARFNSAMDIPAQGLCRIPEK RS >gi|296494449|gb|ADTN01000289.1| GENE 13 10959 - 12125 1010 388 aa, chain + ## HITS:1 COG:ynjB KEGG:ns NR:ns ## COG: ynjB COG4134 # Protein_GI_number: 16129708 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Escherichia coli K12 # 1 388 2 389 389 700 99.0 0 MRHCGWLLGLLSLFSLATHASDWQEIKNEAKGQTVWFNAWGGDTAINRYLDWVSGEMKTH YAINLKIVRLADAADAVKRIQTEAAAGRKTGGSVDLLWVNGENFRTLKEANLLQTGWAET LPNWRYVDTQLPVREDFSVPTQGAESPWGGAQLTFIARRDVTPQPPQTPQALLEFAKANP GTVTYPRPPDFTGTAFLEQLLIMLTPDPAALKEAPDDATFARVTAPLWQYLYVLHPYLWR EGKDFPPSPARMDALLKAGTLRLSLTFNPAHAQQKIASGDLPASSYSFGFREGMIGNVHF VTIPANANASAAAKVVANFLLSPDAQLRKADPAVWGDPSVLDPQKLPDGQRESLQSRMPQ 
DLPPVLAEPHAGWVNALEQEWLHRYGTH >gi|296494449|gb|ADTN01000289.1| GENE 14 12143 - 13633 1432 496 aa, chain + ## HITS:1 COG:ynjC KEGG:ns NR:ns ## COG: ynjC COG4135 # Protein_GI_number: 16129709 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Escherichia coli K12 # 1 496 1 496 496 704 100.0 0 MVAVIYAPLIPAALTLISPALSLTHWQALFADPQLPQALLATLVSTTIAAVGALLIALLV IVALWPGPKWQRMCARLPWLLAIPHVAFATSALLLFADGGLLYDYFPYFTPPMDRFGIGL GLTLAVKESAFLLWILAAVLSEKWLLQQVIVLDSLGYSRWQCLNWLLLPSVAPALAMAML AIVAWSLSVVDVAIILGPGNPPTLAVISWQWLTQGDIDQQTKGALASLLLMLLLAAYVLL SYLLWRSWRRTIPRVDGVRKPATPLLPGNTLAIFLPLTGVLCVVLLAILADQSTINSEAL INSLTMGLVATFIALLLLLLWLEWGPQRRQLWLWLPILLPALPLVAGQYTLALWLKLDGS WTAVVWGHLLWVMPWMLFILQPAWQRIDSRLILIAQTLGWSRAKIFFYVKCPLMLRPVLI AFAVGFAVGIAQYMPTLWLGAGRFPTLTTEAVALSSGGSNGILAAQALWQLLLPLIIFAL TALVAKWVGYVRQGLR >gi|296494449|gb|ADTN01000289.1| GENE 15 13633 - 14286 194 217 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 194 4 201 223 79 30 3e-14 MLCVKNVSLRLPESRLLTNVNFTVDKGDIVTLMGPSGCGKSTLFSWMIGALAEQFSCTGE LWLNEQRIDILPTAQRQIGILFQDALLFDQFSVGQNLLLALPATLKGNARRNAVNDALER SGLEGAFHQDPATLSGGQRARVALLRALLAQPKALLLDEPFSRLDVALRDNFRQWVFSEV RALAIPVVQVTHDLQDVPADSSVLDMAQWSENYNKLR >gi|296494449|gb|ADTN01000289.1| GENE 16 14353 - 15660 1397 435 aa, chain + ## HITS:1 COG:ynjE KEGG:ns NR:ns ## COG: ynjE COG2897 # Protein_GI_number: 16129711 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Escherichia coli K12 # 1 435 6 440 440 845 99.0 0 MKRVSQMTALAMALGLACASSWAAELAKPLTLDQLQQQNGKAIDTRPSAFYNGWPQTLNG PSGHEPAALNLSASWLDKMSTEQLNAWIKQHNLKTDAPVALYGNDKDVDAVKTRLQKAGL THISILSDALSEPSRLQKLPHFEQLVYPQWLHDLQQGKEVTAKPAGDWKVIEAAWGAPKL YLISHIPGADYIDTNEVESEPLWNKVSDEQLKAMLAKHGIRHDTTVILYGRDVYAAARVA QIMLYAGVKDVRLLDGGWQTWSDAGLPVERGTPPKVKAEPDFGVKIPAQPQLMLDMEQAR 
GLLHRQDASLVSIRSWPEFIGTTSGYSYIKPKGEIAGARWGHAGSDSTHMEDFHNPDGTM RSADDITAMWKAWNIKPEQQVSFYCGTGWRASETFMYARAMGWKNVSVYDGGWYEWSSDP KNPVATGERGPDSSK >gi|296494449|gb|ADTN01000289.1| GENE 17 15669 - 16289 545 206 aa, chain - ## HITS:1 COG:ynjF KEGG:ns NR:ns ## COG: ynjF COG0558 # Protein_GI_number: 16129712 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphate synthase # Organism: Escherichia coli K12 # 1 206 3 208 208 358 99.0 5e-99 MLDRHLHPRIKPLLHQCVRVLDKPGITPDGLTLVGFAIGVLALPFLALGWYLAALVVILL NRLLDGLDGALARRRELTDAGGFLDISLDFLFYALVPFGFILAAPEQNALAGGWLLFAFI GTGSSFLAFAALAAKHQIDNPGYAHKSFYYLGGLTEGTETILLFVLGCLFPAWFAWFAWI FGALCWMTTFTRVWSGYLTLKSLQRQ >gi|296494449|gb|ADTN01000289.1| GENE 18 16382 - 16783 326 133 aa, chain + ## HITS:1 COG:ynjG KEGG:ns NR:ns ## COG: ynjG COG0494 # Protein_GI_number: 16129713 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Escherichia coli K12 # 1 133 3 135 135 238 100.0 2e-63 MIEVVAAIIERDGKILLAQRPAQSDQAGLWEFAGGKVEPDESQRQALVRELREELGIEAT VGEYVASHQREVSGRIIHLHAWHVPDFHGTLQAHEHQALVWCSPEEALQYPLAPADIPLL EAFMALRAARPAD >gi|296494449|gb|ADTN01000289.1| GENE 19 16749 - 17021 96 90 aa, chain - ## HITS:1 COG:no KEGG:B21_01717 NR:ns ## KEGG: B21_01717 # Name: ynjH # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 90 1 90 90 178 100.0 6e-44 MSRALFAVVLAFPLIALANPHYRPDVEVNVPPEVFSSGGQSAQPCTQCCVYQDQNYSEGA VIKAEGILLQCQRDDKTLSTNPLVWRRVKP >gi|296494449|gb|ADTN01000289.1| GENE 20 17257 - 18600 1438 447 aa, chain + ## HITS:1 COG:gdhA KEGG:ns NR:ns ## COG: gdhA COG0334 # Protein_GI_number: 16129715 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Escherichia coli K12 # 1 447 1 447 447 896 100.0 0 MDQTYSLESFLNHVQKRDPNQTEFAQAVREVMTTLWPFLEQNPKYRQMSLLERLVEPERV IQFRVVWVDDRNQIQVNRAWRVQFSSAIGPYKGGMRFHPSVNLSILKFLGFEQTFKNALT 
TLPMGGGKGGSDFDPKGKSEGEVMRFCQALMTELYRHLGADTDVPAGDIGVGGREVGFMA GMMKKLSNNTACVFTGKGLSFGGSLIRPEATGYGLVYFTEAMLKRHGMGFEGMRVSVSGS GNVAQYAIEKAMEFGARVITASDSSGTVVDESGFTKEKLARLIEIKASRDGRVADYAKEF GLVYLEGQQPWSLPVDIALPCATQNELDVDAAHQLIANGVKAVAEGANMPTTIEATELFQ QAGVLFAPGKAANAGGVATSGLEMAQNAARLGWKAEKVDARLHHIMLDIHHACVEHGGEG EQTNYVQGANIAGFVKVADAMLAQGVI >gi|296494449|gb|ADTN01000289.1| GENE 21 18717 - 19757 106 346 aa, chain - ## HITS:1 COG:no KEGG:B21_01719 NR:ns ## KEGG: B21_01719 # Name: ynjI # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 346 1 346 346 700 100.0 0 MKKVLLQNHPGSEKYSFNGWEIFNSNFERMIKENKAMLLCKWGFYLTCVVAVMFVFAAIT SNGLNERGLITAGCSFLYLLIMMGLIVRAGFKAKKEQLHYYQAKGIEPLSIEKLQALQLI APYRFYHKQWSETLEFWPRKPEPGKDTFQYHVLPFDSIDIISKRRESLEDQWGIEDSESY CALMEHFLSGDHGANTFKANMEEAPEQVIALLNKFAVFPSDYISDCANHSSGKSSAKLIW AAELSWMISISSTAFQNGTIEEELAWHYIMLASRKAHELFESEEDYQKNSQMGFLYWHIC CYRRKLTDAELEACYRYDKQFWEHYSKKCRWPIRNVPWGASSVKYS >gi|296494449|gb|ADTN01000289.1| GENE 22 19885 - 21846 1842 653 aa, chain - ## HITS:1 COG:topB KEGG:ns NR:ns ## COG: topB COG0550 # Protein_GI_number: 16129717 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Escherichia coli K12 # 1 653 1 653 653 1322 99.0 0 MRLFIAEKPSPARAIADVLPKPHRKGDGFIECGNGQVVTWCIGHLLEQAQPDAYDSRYAR WNLADLPIVPEKWQLQPRPSVTKQLNVIKRFLHEASEIVHAGDPDREGQLLVDEVLDYLQ LAPEKRQQVQRCLINDLNPQAVERAIDRLRSNSEFVPLCVSALARARADWLYGINMTRAY TILGRNAGYQGVLSVGRVQTPVLGLVVRRDEEIENFVAKDFFEVKAHIVTPADERFTAIW QPSEACEPYQDEEGRLLHRPLAEHVVNRISGQPAIVTSYNDKRESESAPLPFSLSALQIE AAKRFGLSAQNVLDICQKLYETHKLITYPRSDCRYLPEEHFAGRHAVMNAISVHAPDLLP QPVVDPDIRNRCWDDKKVDAHHAIIPTARSSAINLTENEAKVYNLIARQYLMQFCPDAVF RKCVIELDIAKGKFVAKARFLAEAGWRTLLGSKERDEENDGTPLPVVAKGDELLCEKGEV VERQTQPPRHFTDATLLSAMTGIARFVQDKDLKKILRATDGLGTEATRAGIIELLFKRGF LTKKGRYIHSTDAGKALFHSLPEMATRPDMTAHWESVLTQISEKQCRYQDFMQPLVGTLY QLIDQAKRTPVRQFRGIVAPGSGGSADKKKAAPRKRSAKKSPPADEVGSGAIA >gi|296494449|gb|ADTN01000289.1| GENE 23 21851 - 22894 1209 347 aa, chain - ## HITS:1 COG:selD 
KEGG:ns NR:ns ## COG: selD COG0709 # Protein_GI_number: 16129718 # Func_class: E Amino acid transport and metabolism # Function: Selenophosphate synthase # Organism: Escherichia coli K12 # 1 347 1 347 347 679 100.0 0 MSENSIRLTQYSHGAGCGCKISPKVLETILHSEQAKFVDPNLLVGNETRDDAAVYDLGNG TSVISTTDFFMPIVDNPFDFGRIAATNAISDIFAMGGKPIMAIAILGWPINKLSPEIARE VTEGGRYACRQAGIALAGGHSIDAPEPIFGLAVTGIVPTERVKKNSTAQAGCKLFLTKPL GIGVLTTAEKKSLLKPEHQGLATEVMCRMNIAGASFANIEGVKAMTDVTGFGLLGHLSEM CQGAGVQARVDYEAIPKLPGVEEYIKLGAVPGGTERNFASYGHLMGEMPREVRDLLCDPQ TSGGLLLAVMPEAENEVKATAAEFGIELTAIGELVPARGGRAMVEIR >gi|296494449|gb|ADTN01000289.1| GENE 24 23011 - 23562 676 183 aa, chain - ## HITS:1 COG:ECs2471 KEGG:ns NR:ns ## COG: ECs2471 COG0778 # Protein_GI_number: 15831725 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Escherichia coli O157:H7 # 1 183 1 183 183 363 100.0 1e-101 MDALELLINRRSASRLAEPAPTGEQLQNILRAGMRAPDHKSMQPWHFFVIEGEGRERFSA VLEQGAIAAGSDDKAIDKARNAPFRAPLIITVVAKCEENHKVPRWEQEMSAGCAVMAMQM AAVAQGFGGIWRSGALTESPVVREAFGCREQDKIVGFLYLGTPQLKASTSINVPDPTPFV TYF >gi|296494449|gb|ADTN01000289.1| GENE 25 23616 - 23780 84 54 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_2054 NR:ns ## KEGG: ECUMN_2054 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 54 1 54 54 100 98.0 2e-20 MLQHSPQFLLTSTGWVITITTSIAPVTGVTLSWENTCEPFGDLLPDFLNGRGVC >gi|296494449|gb|ADTN01000289.1| GENE 26 23723 - 25579 1910 618 aa, chain + ## HITS:1 COG:sppA KEGG:ns NR:ns ## COG: sppA COG0616 # Protein_GI_number: 16129720 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Escherichia coli K12 # 1 618 1 618 618 1211 100.0 0 MRTLWRFIAGFFKWTWRLLNFVREMVLNLFFIFLVLVGVGIWMQVSGGDSKETASRGALL LDISGVIVDKPDSSQRFSKLSRQLLGASSDRLQENSLFDIVNTIRQAKDDRNITGIVMDL KNFAGGDQPSMQYIGKALKEFRDSGKPVYAVGENYSQGQYYLASFANKIWLSPQGVVDLH 
GFATNGLYYKSLLDKLKVSTHVFRVGTYKSAVEPFIRDDMSPAAREADSRWIGELWQNYL NTVAANRQIPAEQVFPGAQGLLEGLTKTGGDTAKYALENKLVDALASSAEIEKALTKEFG WSKTDKNYRAISYYDYALKTPADTGDSIGVVFANGAIMDGEETQGNVGGDTTAAQIRDAR LDPKVKAIVLRVNSPGGSVTASEVIRAELAAARAAGKPVVVSMGGMAASGGYWISTPANY IVANPSTLTGSIGIFGVITTVENSLDSIGVHTDGVSTSPLADVSITRALPPEAQLMMQLS IENGYKRFITLVADARHSTPEQIDKIAQGHVWTGQDAKANGLVDSLGDFDDAVAKAAELA KVKQWHLEYYVDEPTFFDKVMDNMSGSVRAMLPDAFQAMLPAPLASVASTVKSESDKLAA FNDPQNRYAFCLTCANMR >gi|296494449|gb|ADTN01000289.1| GENE 27 25794 - 26762 984 322 aa, chain + ## HITS:1 COG:ECs2474 KEGG:ns NR:ns ## COG: ECs2474 COG0252 # Protein_GI_number: 15831728 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Escherichia coli O157:H7 # 1 322 17 338 338 647 100.0 0 MQRSEQGYIPVSGHLQRQLALMPEFHRPEMPDFTIHEYTPLMDSSDMTPEDWQHIAEDIK AHYDDYDGFVILHGTDTMAYTASALSFMLENLGKPVIVTGSQIPLAELRSDGQINLLNAL YVAANYPINEVTLFFNNRLYRGNRTTKAHADGFDAFASPNLPPLLEAGIHIRRLNTPPAP HGEGELIVHPITPQPIGVVTIYPGISADVVRNFLRQPVKALILRSYGVGNAPQNKAFLQE LQEASDRGIVVVNLTQCMSGKVNMGGYATGNALAHAGVIGGADMTVEATLTKLHYLLSQE LDTETIRKAMSQNLRGELTPDD >gi|296494449|gb|ADTN01000289.1| GENE 28 26773 - 27414 578 213 aa, chain + ## HITS:1 COG:pncA KEGG:ns NR:ns ## COG: pncA COG1335 # Protein_GI_number: 16129722 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Escherichia coli K12 # 1 213 7 219 219 443 100.0 1e-125 MPPRALLLVDLQNDFCAGGALAVPEGDSTVDVANRLIDWCQSRGEAVIASQDWHPANHGS FASQHGVEPYTPGQLDGLPQTFWPDHCVQNSEGAQLHPLLHQKAIAAVFHKGENPLVDSY SAFFDNGRRQKTSLDDWLRDHEIDELIVMGLATDYCVKFTVLDALQLGYKVNVITDGCRG VNIQPQDSAHAFMEMSAAGATLYTLADWEETQG >gi|296494449|gb|ADTN01000289.1| GENE 29 27507 - 28865 645 452 aa, chain - ## HITS:1 COG:ydjE KEGG:ns NR:ns ## COG: ydjE COG0477 # Protein_GI_number: 16129723 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and 
metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 452 1 452 452 801 100.0 0 MEQYDQIGARLDRLPLARFHYRIFGIISFSLLLTGFLSYSGNVVLAKLVSNGWSNNFLNA AFTSALMFGYFIGSLTGGFIGDYFGRRRAFRINLLIVGIAATGAAFVPDMYWLIFFRFLM GTGMGALIMVGYASFTEFIPATVRGKWSARLSFVGNWSPMLSAAIGVVVIAFFSWRIMFL LGGIGILLAWFLSGKYFIESPRWLAGKGQIAGAECQLREVEQQIEREKSIRLPPLTSYQS NSKVKVIKGTFWLLFKGEMLRRTLVAITVLIAMNISLYTITVWIPTIFVNSGIDVDKSIL MTAVIMIGAPVGIFIAALIIDHFPRRLFGSTLLIIIAVLGYIYSIQTTEWAILIYGLVMI FFLYMYVCFASAVYIPELWPTHLRLRGSGFVNAVGRIVAVFTPYGVAALLTHYGSITVFM VLGVMLLLCALVLSIFGIETRKVSLEEISEVN >gi|296494449|gb|ADTN01000289.1| GENE 30 28982 - 29383 293 133 aa, chain - ## HITS:1 COG:ydjF KEGG:ns NR:ns ## COG: ydjF COG1349 # Protein_GI_number: 16129724 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli K12 # 1 133 120 252 252 256 99.0 1e-68 MLTNSAEAIHVLAQSEIKVVSTGGELNKNTLSLQGRITKEIIRRYHVDIMVMSCKGLDIN SGALDSNEAEAEIKKTMIRQATEVALLVDHSKFDRKAFVQLADFSHINYIITDKSPGAEW IAFCKDNNIQLVW >gi|296494449|gb|ADTN01000289.1| GENE 31 29386 - 29739 197 117 aa, chain - ## HITS:1 COG:ydjF KEGG:ns NR:ns ## COG: ydjF COG1349 # Protein_GI_number: 16129724 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli K12 # 1 111 1 111 252 201 99.0 2e-52 MAAKDRIQAIKQMVANDKKVTVSNLSGIFQVTEETIRRDLEKLEDEGFLTRTYGGAVLNT AMLTENIHFYKRASSFYEEKQLIARKALPFIDNKTTMAADSSSTVMELLKFYRTVVA >gi|296494449|gb|ADTN01000289.1| GENE 32 29876 - 30856 811 326 aa, chain - ## HITS:1 COG:ydjG KEGG:ns NR:ns ## COG: ydjG COG0667 # Protein_GI_number: 16129725 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Escherichia coli K12 # 1 326 1 326 326 669 100.0 0 
MKKIPLGTTDITLSRMGLGTWAIGGGPAWNGDLDRQICIDTILEAHRCGINLIDTAPGYN FGNSEVIVGQALKKLPREQVVVETKCGIVWERKGSLFNKVGDRQLYKNLSPESIREEVAA SLQRLGIDYIDIYMTHWQSVPPFFTPIAETVAVLNELKSEGKIRAIGAANVDADHIREYL QYGELDIIQAKYSILDRAMENELLPLCRDNGIVVQVYSPLEQGLLTGTITRDYVPGGARA NKVWFQRENMLKVIDMLEQWQPLCARYQCTIPTLALAWILKQSDLISILSGATAPEQVRE NVAALNINLSDADATLMREMAEALER >gi|296494449|gb|ADTN01000289.1| GENE 33 30866 - 31813 710 315 aa, chain - ## HITS:1 COG:ydjH KEGG:ns NR:ns ## COG: ydjH COG0524 # Protein_GI_number: 16129726 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli K12 # 1 315 8 322 322 609 100.0 1e-174 MDNLDVICIGAAIVDIPLQPVSKNIFDVDSYPLERIAMTTGGDAINEATIISRLGHRTAL MSRIGKDAAGQFILDHCRKENIDIQSLKQDVSIDTSINVGLVTEDGERTFVTNRNGSLWK LNIDDVDFARFSQAKLLSLASIFNSPLLDGKALTEIFTQAKARQMIICADMIKPRLNETL DDICEALSYVDYLFPNFAEAKLLTGKETLDEIADCFLACGVKTVVIKTGKDGCFIKRGDM TMKVPAVAGITAIDTIGAGDNFASGFIAALLEGKNLRECARFANATAAISVLSVGATTGV KNRKLVEQLLEEYEG >gi|296494449|gb|ADTN01000289.1| GENE 34 31818 - 32654 754 278 aa, chain - ## HITS:1 COG:ydjI KEGG:ns NR:ns ## COG: ydjI COG0191 # Protein_GI_number: 16129727 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Escherichia coli K12 # 1 278 1 278 278 553 100.0 1e-157 MLADIRYWENDATNKHYAIAHFNVWNAEMLMGVIDAAEEAKSPVIISFGTGFVGNTSFED FSHMMVSMAQKATVPVITHWDHGRSMEIIHNAWTHGMNSLMRDASAFDFEENIRLTKEAV DFFHPLGIPVEAELGHVGNETVYEEALAGYHYTDPDQAAEFVERTGCDSLAVAIGNQHGV YTSEPQLNFEVVKRVRDAVSVPLVLHGASGISDADIKTAISLGIAKINIHTELCQAAMVA VKENQDQPFLHLEREVRKAVKERALEKIKLFGSDGKAE >gi|296494449|gb|ADTN01000289.1| GENE 35 32675 - 33718 892 347 aa, chain - ## HITS:1 COG:ydjJ KEGG:ns NR:ns ## COG: ydjJ COG1063 # Protein_GI_number: 16129728 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 347 1 347 347 702 100.0 0 
MKNSKAILQVPGTMKIISAEIPVPKEDEVLIKVEYVGICGSDVHGFESGPFIPPKDPNQE IGLGHECAGTVVAVGSRVRKFKPGDRVNIEPGVPCGHCRYCLEGKYNICPDVDFMATQPN YRGALTHYLCHPESFTYKLPDNMDTMEGALVEPAAVGMHAAMLADVKPGKKIIILGAGCI GLMTLQACKCLGATEIAVVDVLEKRLAMAEQLGATVVINGAKEDTIARCQQFTEDMGADI VFETAGSAVTVKQAPYLVMRGGKIMIVGTVPGDSAINFLKINREVTIQTVFRYANRYPVT IEAISSGRFDVKSMVTHIYDYRDVQQAFEESVNNKRDIIKGVIKISD >gi|296494449|gb|ADTN01000289.1| GENE 36 33735 - 35114 889 459 aa, chain - ## HITS:1 COG:ydjK KEGG:ns NR:ns ## COG: ydjK COG0477 # Protein_GI_number: 16129729 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 459 1 459 459 831 100.0 0 MEQITKPHCGARLDRLPDCRWHSSMFAIVAFGLLVCWSNAVGGLILAQLKALGWTDNSTT ATFSAITTAGMFLGALVGGIIGDKTGRRNAFILYEAIHIASMVVGAFSPNMDFLIACRFV MGVGLGALLVTLFAGFTEYMPGRNRGTWSSRVSFIGNWSYPLCSLIAMGLTPLISAEWNW RVQLLIPAILSLIATALAWRYFPESPRWLESRGRYQEAEKVMRSIEEGVIRQTGKPLPPV VIADDGKAPQAVPYSALLTGVLLKRVILGSCVLIAMNVVQYTLINWLPTIFMTQGINLKD SIVLNTMSMFGAPFGIFIAMLVMDKIPRKTMGVGLLILIAVLGYIYSLQTSMLLITLIGF FLITFVYMYVCYASAVYVPEIWPTEAKLRGSGLANAVGRISGIAAPYAVAVLLSSYGVTG VFILLGAVSIIVAIAIATIGIETKGVSVESLSIDAVANK >gi|296494449|gb|ADTN01000289.1| GENE 37 35141 - 36217 1128 358 aa, chain - ## HITS:1 COG:ydjL KEGG:ns NR:ns ## COG: ydjL COG1063 # Protein_GI_number: 16129730 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 358 1 358 358 739 100.0 0 MKALARFGKAFGGYKMIDVPQPMCGPEDVVIEIKAAAICGADMKHYNVDSGSDEFNSIRG HEFAGCIAQVGEKVKDWKVGQRVVSDNSGHVCGVCPACEQGDFLCCTEKVNLGLDNNTWG GGFSKYCLVPGEILKIHRHALWEIPDGVDYEDAAVLDPICNAYKSIAQQSKFLPGQDVVV IGTGPLGLFSVQMARIMGAVNIVVVGLQEDVAVRFPVAKELGATAVVNGSTEDVVARCQQ ICGKDNLGLVIECSGANIALKQAIEMLRPNGEVVRVGMGFKPLDFSINDITAWNKSIIGH 
MAYDSTSWRNAIRLLASGAIKVKPMITHRIGLSQWREGFDAMVDKTAIKVIMTYDFDE >gi|296494449|gb|ADTN01000289.1| GENE 38 36587 - 36859 341 90 aa, chain - ## HITS:1 COG:yeaC KEGG:ns NR:ns ## COG: yeaC COG3139 # Protein_GI_number: 16129731 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 90 16 105 105 170 100.0 7e-43 MNLDDIINSMMPEVYQRLSTAVELGKWPDGVALTEEQKENCLQLVMLWQARHNIEAQHMT IDTNGQMVMKSKQQLKEDFGISAKPIAMFK >gi|296494449|gb|ADTN01000289.1| GENE 39 36901 - 37314 227 137 aa, chain - ## HITS:1 COG:ECs2487 KEGG:ns NR:ns ## COG: ECs2487 COG0229 # Protein_GI_number: 15831741 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Conserved domain frequently associated with peptide methionine sulfoxide reductase # Organism: Escherichia coli O157:H7 # 1 137 1 137 137 285 100.0 2e-77 MANKPSAEELKKNLSEMQFYVTQNHGTEPPFTGRLLHNKRDGVYHCLICDAPLFHSQTKY DSGCGWPSFYEPVSEESIRYIKDLSHGMQRIEIRCGNCDAHLGHVFPDGPQPTGERYCVN SASLRFTDGENGEEING >gi|296494449|gb|ADTN01000289.1| GENE 40 37656 - 38651 1101 331 aa, chain + ## HITS:1 COG:ECs2488 KEGG:ns NR:ns ## COG: ECs2488 COG0057 # Protein_GI_number: 15831742 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Escherichia coli O157:H7 # 1 331 1 331 331 635 100.0 0 MTIKVGINGFGRIGRIVFRAAQKRSDIEIVAINDLLDADYMAYMLKYDSTHGRFDGTVEV KDGHLIVNGKKIRVTAERDPANLKWDEVGVDVVAEATGLFLTDETARKHITAGAKKVVMT GPSKDNTPMFVKGANFDKYAGQDIVSNASCTTNCLAPLAKVINDNFGIIEGLMTTVHATT ATQKTVDGPSHKDWRGGRGASQNIIPSSTGAAKAVGKVLPELNGKLTGMAFRVPTPNVSV VDLTVRLEKAATYEQIKAAVKAAAEGEMKGVLGYTEDDVVSTDFNGEVCTSVFDAKAGIA LNDNFVKLVSWYDNETGYSNKVLDLIAHISK >gi|296494449|gb|ADTN01000289.1| GENE 41 38735 - 39619 951 294 aa, chain + ## HITS:1 COG:yeaD KEGG:ns NR:ns ## COG: yeaD COG0676 # Protein_GI_number: 16129734 # Func_class: G Carbohydrate transport and metabolism # Function: Uncharacterized enzymes related to aldose 
1-epimerase # Organism: Escherichia coli K12 # 1 294 8 301 301 617 100.0 1e-177 MIKKIFALPVIEQISPVLSRRKLDELDLIVVDHPQVKASFALQGAHLLSWKPAGEEEVLW LSNNTPFKNGVAIRGGVPVCWPWFGPAAQQGLPAHGFARNLPWTLKSHHEDADGVALTFE LTQSEETKKFWPHDFTLLAHFRVGKTCEIDLESHGEFETTSALHTYFNVGDIAKVSVSGL GDRFIDKVNDAKENVLTDGIQTFPDRTDRVYLNPQDCSVINDEALNRIIAVGHQHHLNVV GWNPGPALSISMGDMPDDGYKTFVCVETAYASETQKVTKEKPAHLAQSIRVAKR Prediction of potential genes in microbial genomes Time: Mon May 16 00:09:07 2011 Seq name: gi|296494448|gb|ADTN01000290.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont710.20, whole genome shotgun sequence Length of sequence - 7271 bp Number of predicted genes - 6, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/1.000 - CDS 7 - 861 827 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase - Term 887 - 915 3.0 2 1 Op 2 . - CDS 951 - 1697 925 ## COG3713 Outer membrane protein V - Prom 1871 - 1930 4.0 3 2 Op 1 12/0.000 + CDS 2133 - 4067 2123 ## COG2766 Putative Ser protein kinase 4 2 Op 2 . + CDS 4180 - 5463 947 ## PROTEIN SUPPORTED gi|163739530|ref|ZP_02146940.1| 50S ribosomal protein L34 5 2 Op 3 . + CDS 5444 - 5512 73 ## + Prom 5594 - 5653 4.8 6 3 Tu 1 . 
+ CDS 5742 - 7085 795 ## COG2199 FOG: GGDEF domain + Term 7103 - 7133 1.0 Predicted protein(s) >gi|296494448|gb|ADTN01000290.1| GENE 1 7 - 861 827 284 aa, chain - ## HITS:1 COG:yeaE KEGG:ns NR:ns ## COG: yeaE COG0656 # Protein_GI_number: 16129735 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Escherichia coli K12 # 1 284 1 284 284 546 100.0 1e-155 MQQKMIQFSGDVSLPAVGQGTWYMGEDASQRKTEVAALRAGIELGLTLIDTAEMYADGGA EKVVGEALTGLREKVFLVSKVYPWNAGGQKAINACEASLRRLNTDYLDLYLLHWSGSFAF EETVAAMEKLIAQGKIRRWGVSNLDYADMQELWQLPGGNQCATNQVLYHLGSRGIEYDLL PWCQQQQMPVMAYSPLAQAGRLRNGLLKNAVVNEIAHAHNISAAQVLLAWVISHQGVMAI PKAATIAHVQQNAAVLEVELSSAELAMLDKAYPAPKGKTALDMV >gi|296494448|gb|ADTN01000290.1| GENE 2 951 - 1697 925 248 aa, chain - ## HITS:1 COG:yeaF KEGG:ns NR:ns ## COG: yeaF COG3713 # Protein_GI_number: 16129736 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein V # Organism: Escherichia coli K12 # 1 248 1 248 248 467 100.0 1e-132 MTKLKLLALGVLIATSAGVAHAEGKFSLGAGVGVVEHPYKDYDTDVYPVPVINYEGDNFW FRGLGGGYYLWNDATDKLSITAYWSPLYFKAKDSGDHQMRHLDDRKSTMMAGLSYAHFTQ YGYLRTTLAGDTLDNSNGIVWDMAWLYRYTNGGLTVTPGIGVQWNSENQNEYYYGVSRKE SARSGLRGYNPNDSWSPYLELSASYNFLGDWSVYGTARYTRLSDEVTDSPMVDKSWTGLI STGITYKF >gi|296494448|gb|ADTN01000290.1| GENE 3 2133 - 4067 2123 644 aa, chain + ## HITS:1 COG:ECs2492 KEGG:ns NR:ns ## COG: ECs2492 COG2766 # Protein_GI_number: 15831746 # Func_class: T Signal transduction mechanisms # Function: Putative Ser protein kinase # Organism: Escherichia coli O157:H7 # 1 644 1 644 644 1293 100.0 0 MNIFDHYRQRYEAAKDEEFTLQEFLTTCRQDRSAYANAAERLLMAIGEPVMVDTAQEPRL SRLFSNRVIARYPAFEEFYGMEDAIEQIVSYLKHAAQGLEEKKQILYLLGPVGGGKSSLA ERLKSLMQLVPIYVLSANGERSPVNDHPFCLFNPQEDAQILEKEYGIPRRYLGTIMSPWA AKRLHEFGGDITKFRVVKVWPSILQQIAIAKTEPGDENNQDISALVGKVDIRKLEHYAQN DPDAYGYSGALCRANQGIMEFVEMFKAPIKVLHPLLTATQEGNYNGTEGISALPFNGIIL AHSNESEWVTFRNNKNNEAFLDRVYIVKVPYCLRISEEIKIYEKLLNHSELTHAPCAPGT 
LETLSRFSILSRLKEPENSSIYSKMRVYDGESLKDTDPKAKSYQEYRDYAGVDEGMNGLS TRFAFKILSRVFNFDHVEVAANPVHLFYVLEQQIEREQFPQEQAERYLEFLKGYLIPKYA EFIGKEIQTAYLESYSEYGQNIFDRYVTYADFWIQDQEYRDPDTGQLFDRESLNAELEKI EKPAGISNPKDFRNEIVNFVLRARANNSGRNPNWTSYEKLRTVIEKKMFSNTEELLPVIS FNAKTSTDEQKKHDDFVDRMMEKGYTRKQVRLLCEWYLRVRKSS >gi|296494448|gb|ADTN01000290.1| GENE 4 4180 - 5463 947 427 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739530|ref|ZP_02146940.1| 50S ribosomal protein L34 [Phaeobacter gallaeciensis BS107] # 1 418 1 433 445 369 44 1e-102 MTWFIDRRLNGKNKSMVNRQRFLRRYKAQIKQSISEAINKRSVTDVDSGESVSIPTEDIS EPMFHQGRGGLRHRVHPGNDHFVQNDRIERPQGGGGGSGSGQGQASQDGEGQDEFVFQIS KDEYLDLLFEDLALPNLKQNQQRQLTEYKTHRAGYTANGVPANISVVRSLQNSLARRTAM TAGKRRELHALEENLAIISNSEPAQLLEEERLRKEIAELRAKIERVPFIDTFDLRYKNYE KRPDPSSQAVMFCLMDVSGSMDQSTKDMAKRFYILLYLFLSRTYKNVEVVYIRHHTQAKE VDEHEFFYSQETGGTIVSSALKLMDEVVKERYNPAQWNIYAAQASDGDNWADDSPLCHEI LAKKLLPVVRYYSYIEITRRAHQTLWREYEHLQSTFDNFAMQHIRDQDDIYPVFRELFHK QNATAKG >gi|296494448|gb|ADTN01000290.1| GENE 5 5444 - 5512 73 22 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQQLKAKTISQVIIAWLIFSLL >gi|296494448|gb|ADTN01000290.1| GENE 6 5742 - 7085 795 447 aa, chain + ## HITS:1 COG:yeaI_2 KEGG:ns NR:ns ## COG: yeaI_2 COG2199 # Protein_GI_number: 16129739 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 267 447 1 181 181 356 100.0 6e-98 MVSMGTQKLKAQSFFIFSLLLTLILFCITTLYNENTNVKLIPQMNYLMVVVALFFLNAVI FLFMLMKYFTNKQILPTLILSLAFLSGLIYLVETIVIIHKPINGSTLIQTKSNDVSIFYI FRQLSFICLTSLALFCYGKDNILDNNKKKTGILLLALIPFLVFPLLAHNLSSYNADYSLY VVDYCPDNHTATWGINYTKILVCLWAFLLFFIIMRTRLASELWPLIALLCLASLCCNLLL LTLDEYNYTIWYISRGIEVSSKLFVVSFLIYNIFQELQLSSKLAVHDVLTNIYNRRYFFN SVESLLSRPVVKDFCVMLVDINQFKRINAQWGHRVGDKVLVSIVDIIQQSIRPDDILARL EGEVFGLLFTELNSAQAKIIAERMRKNVELLTGFSNRYDVPEQMTISIGTVFSTGDTRNI SLVMTEADKALREAKSEGGNKVIIHHI Prediction of potential genes in microbial genomes Time: Mon May 16 00:09:16 2011 Seq name: gi|296494447|gb|ADTN01000291.1| Escherichia coli MS 146-1 
E_coli146-1-1.0_Cont728.1, whole genome shotgun sequence Length of sequence - 18626 bp Number of predicted genes - 20, with homology - 17 Number of transcription units - 11, operones - 7 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 181 - 222 3.1 1 1 Op 1 . - CDS 361 - 603 88 ## Kvar_1549 phosphoesterase PA-phosphatase related protein 2 1 Op 2 . - CDS 674 - 790 77 ## KPK_1652 PAP2 family protein - Prom 892 - 951 5.4 3 2 Op 1 . - CDS 1183 - 2073 1186 ## COG1210 UDP-glucose pyrophosphorylase 4 2 Op 2 . - CDS 1976 - 2173 112 ## - Prom 2366 - 2425 5.5 + Prom 2743 - 2802 3.2 5 3 Tu 1 . + CDS 2836 - 4419 1762 ## COG1253 Hemolysins and related proteins containing CBS domains - Term 4627 - 4677 1.7 6 4 Op 1 4/0.000 - CDS 4871 - 6724 1297 ## COG2982 Uncharacterized protein involved in outer membrane biogenesis 7 4 Op 2 4/0.000 - CDS 6746 - 7327 746 ## COG0717 Deoxycytidine deaminase 8 4 Op 3 . - CDS 7419 - 8060 554 ## COG0572 Uridine kinase + Prom 8257 - 8316 4.6 9 5 Op 1 1/0.750 + CDS 8378 - 9328 643 ## COG3447 Predicted integral membrane sensor domain 10 5 Op 2 5/0.000 + CDS 9319 - 9438 71 ## COG2202 FOG: PAS/PAC domain 11 5 Op 3 . + CDS 9502 - 11694 2049 ## COG2202 FOG: PAS/PAC domain 12 6 Op 1 . - CDS 11732 - 12589 708 ## COG0122 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase 13 6 Op 2 . - CDS 12636 - 12710 87 ## + Prom 12636 - 12695 4.1 14 7 Tu 1 . + CDS 12723 - 14075 1650 ## COG0443 Molecular chaperone - Term 13827 - 13864 -0.3 15 8 Op 1 . - CDS 14088 - 15587 699 ## COG4248 Uncharacterized protein with protein kinase and helix-hairpin-helix DNA-binding domains 16 8 Op 2 . - CDS 15602 - 15865 134 ## COG4248 Uncharacterized protein with protein kinase and helix-hairpin-helix DNA-binding domains - Prom 15898 - 15957 2.2 17 9 Op 1 . - CDS 16027 - 16707 610 ## ECP_2111 hypothetical protein 18 9 Op 2 . 
- CDS 16785 - 17444 518 ## COG4245 Uncharacterized protein encoded in toxicity protection region of plasmid R478, contains von Willebrand factor (vWF) domain - Prom 17474 - 17533 3.3 + Prom 17901 - 17960 4.4 19 10 Tu 1 . + CDS 18035 - 18163 94 ## + Term 18283 - 18334 0.1 + Prom 18229 - 18288 4.4 20 11 Tu 1 . + CDS 18363 - 18491 95 ## ECSP_2830 hypothetical protein Predicted protein(s) >gi|296494447|gb|ADTN01000291.1| GENE 1 361 - 603 88 80 aa, chain - ## HITS:1 COG:no KEGG:Kvar_1549 NR:ns ## KEGG: Kvar_1549 # Name: not_defined # Def: phosphoesterase PA-phosphatase related protein # Organism: K.variicola # Pathway: not_defined # 8 64 79 135 209 84 80.0 9e-16 MSWITPASAAFWPIFLWLLSSRFSVGLHKAAIITVYVLAAVVGYSRLVIHVHSVSEVVAG PLLATLCFCCCKNVPLIRKA >gi|296494447|gb|ADTN01000291.1| GENE 2 674 - 790 77 38 aa, chain - ## HITS:1 COG:no KEGG:KPK_1652 NR:ns ## KEGG: KPK_1652 # Name: not_defined # Def: PAP2 family protein # Organism: K.pneumoniae_342 # Pathway: not_defined # 1 38 1 38 209 64 92.0 1e-09 MNWQLISFFGDSTVLLPNSAALFIVLMLCKTSRLLAWQ >gi|296494447|gb|ADTN01000291.1| GENE 3 1183 - 2073 1186 296 aa, chain - ## HITS:1 COG:STM2098 KEGG:ns NR:ns ## COG: STM2098 COG1210 # Protein_GI_number: 16765428 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose pyrophosphorylase # Organism: Salmonella typhimurium LT2 # 1 295 1 295 297 540 91.0 1e-154 MANLKAVIPVAGLGMHMLPATKAIPKEMLPIVDKPMIQYIVDEIVAAGIKEIVLVTHSSK NAVENHFDTSYELEALLEQRVKRQLLAEVQAICPPGVTIMNVRQAQPLGLGHSILCARPV VGDNPFVVLLPDIILDGGTADPLRYNLAALIARFNETGRSQVLAKRIPGDLSEYSVIQTK EPMVAEGQVARIVEFIEKPDEPQTLDSDLMAVGRYVLSADIWAELERTEPGAWGRIQLTD AIAELAKKQSVDAMLMTGESYDCGKKMGYMQAFVTYGMRNLKEGAKFRESIKKLLA >gi|296494447|gb|ADTN01000291.1| GENE 4 1976 - 2173 112 65 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPGQYPCIISFNGFVTCTILQNRVFCNQTGEDEYGEFESGYSGRRTRHAYAAGHKGNSKG DAADR >gi|296494447|gb|ADTN01000291.1| GENE 5 2836 - 4419 1762 527 aa, chain + ## HITS:1 COG:yegH_2 KEGG:ns NR:ns ## COG: yegH_2 COG1253 # Protein_GI_number: 
16130003 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Escherichia coli K12 # 232 527 1 296 296 574 99.0 1e-163 MEWIADPSIWAGLVTLVVIELVLGIDNLVFIAILAEKLPPAHRDRARITGLMLAMVMRLL LLTSISWLVTLTQPLFSFRSFTFSARDLIMLFGGFFLLFKATIELSERLEGKDSNNPTQR KGAKFWGVVTQIVVLDAIFSLDSVVTAIGMVDHLLVMMAAVVIAISLMLMASKPLTQFVN SHPTIVILCLSFLLMIGFSLVAEGFGFVIPKGYLYAAIGFSVMIEVLNQLAIFNRRRFLS ANQTLRQRTTEAVMRLLSGQKEDAELDAETASMLLDHGNQQIFNPQERRMIERVLNLNQR TVSSIMTSRHDIEHIDLNAPEEEIRQLLERNQHTRLVVTDGDDAEDLLGVVHVIDLLQQS LRGEPLNLRVLIRQPLVFPETLPLLPALEQFRNARTHFAFVVDEFGSVEGIVTLSDVTET IAGNLPNEVEEIDARHDIQKNADGSWTANGHMPLEDLVQYVPLPLDEKREYHTIAGLLME YLQRIPKPGEEVQVGDYLLKTLQVESHRVQKVQIIPLREDGEMEYEV >gi|296494447|gb|ADTN01000291.1| GENE 6 4871 - 6724 1297 617 aa, chain - ## HITS:1 COG:asmA KEGG:ns NR:ns ## COG: asmA COG2982 # Protein_GI_number: 16130004 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein involved in outer membrane biogenesis # Organism: Escherichia coli K12 # 1 617 1 617 617 1162 98.0 0 MRRFLTTLMILLVVLVAGLSALVLLVNPNDFRDYMVKQVAARSGYQLQLDGPLRWHVWPQ LSILSGRMSLTAQGASQPLVRADNMRLDVALLPLLSHQLSVKQVMLKGAVIQLTPQTEAV RSEDAPVAPRDNTLPDLSDDRGWSFDISSLKVADSVLVFQHEDDEQVTIRNIRLQMEQDP QHRGSFEFSGRVNRDQRDLTISLNGTVDASDYPHDLMAAIEQINWQLQGADLPKQGIQGQ GSFQVQWQESHKRLSFNQISLTANDSTLSGQAQVTLTEKPEWQLRLQFPQLNLDNLIPLN ETANGENGAAQQGLSQSTLPRPVISSRIDEPAYQGLQGFTADILLQASNVRWRGMNFTDV ATQMTNKSGLLEITQLQGKLNGGQVSLPGTLDATSINPRINFQPRLENVEIGTILKAFNY PISLTGKMSLAGDFSGADIDADAFRHNWQGQAHVEMTDTRMEGMNFQQMIQQAVERNGGD VKAAENFDNVTRLDRFTTDLTLKDGVVTLNDMQGQSPMLALSGAGTLNLAEQTCDTQFDI RVVGGWNGESKLIDFLKETPVPLRVYGNWQQLNYSLQVDQLLRKHLQDEAKRRLNDWAER NKDSRNGKDVKKLLEKM >gi|296494447|gb|ADTN01000291.1| GENE 7 6746 - 7327 746 193 aa, chain - ## HITS:1 COG:ECs2872 KEGG:ns NR:ns ## COG: ECs2872 COG0717 # Protein_GI_number: 15832126 # Func_class: F Nucleotide transport and metabolism # Function: Deoxycytidine deaminase # Organism: Escherichia coli O157:H7 # 1 
193 1 193 193 381 98.0 1e-106 MRLCDRDIEAWLDEGRLSTTPRPPVERINGATVDVRLGNKFRTFRGHTAAFIDLSGPKDE VSAALDRVMSDEIVLDESEAFYLHPGELALAVTLESVTLPADLVGWLDGRSSLARLGLMV HVTAHRIDPGWSGCIVLEFYNSGKLPLALRPGMLIGALSFEPLSGPAARPYNRREDAKYR NQQGAVASRIDKD >gi|296494447|gb|ADTN01000291.1| GENE 8 7419 - 8060 554 213 aa, chain - ## HITS:1 COG:ECs2873 KEGG:ns NR:ns ## COG: ECs2873 COG0572 # Protein_GI_number: 15832127 # Func_class: F Nucleotide transport and metabolism # Function: Uridine kinase # Organism: Escherichia coli O157:H7 # 1 213 19 231 231 418 100.0 1e-117 MTDQSHQCVIIGIAGASASGKSLIASTLYRELREQVGDEHIGVIPEDCYYKDQSHLSMEE RVKTNYDHPSAMDHSLLLEHLQALKRGSAIDLPVYSYVEHTRMKETVTVEPKKVIILEGI LLLTDARLRDELNFSIFVDTPLDICLMRRIKRDVNERGRSMDSVMAQYQKTVRPMFLQFI EPSKQYADIIVPRGGKNRIAIDILKAKISQFFE >gi|296494447|gb|ADTN01000291.1| GENE 9 8378 - 9328 643 316 aa, chain + ## HITS:1 COG:yegE_1 KEGG:ns NR:ns ## COG: yegE_1 COG3447 # Protein_GI_number: 16130007 # Func_class: T Signal transduction mechanisms # Function: Predicted integral membrane sensor domain # Organism: Escherichia coli K12 # 1 300 1 300 300 440 100.0 1e-123 MSKQSQHVLIALPHPLLHLVSLGLVSFIFTLFSLELSQFGTQLAPLWFPTSIMMVAFYRH AGRMWPGIALSCSLGNIAASILLFSTSSLNMTWTTINIVEAVVGAVLLRKLLPWYNPLQN LADWLRLALGSAIVPPLLGGVLVVLLTPGDDPLRAFLIWVLSESIGALALVPLGLLFKPH YLLRHRNPRLLFESLLTLAITLTLSWLSMLYLPWPFTFIIVLLMWSAVRLPRMEAFLIFL TTVMMVSLMMAADPSLLATPRTYLMSHMPWLPFLLILLPANIMTMVMYAFRAERKHISES ETRFRNAMEYSPSAWH >gi|296494447|gb|ADTN01000291.1| GENE 10 9319 - 9438 71 39 aa, chain + ## HITS:1 COG:Z3235m_2 KEGG:ns NR:ns ## COG: Z3235m_2 COG2202 # Protein_GI_number: 15804978 # Func_class: T Signal transduction mechanisms # Function: FOG: PAS/PAC domain # Organism: Escherichia coli O157:H7 EDL933 # 1 39 15 53 370 84 100.0 4e-17 MALVGTEGQWLQSNKALCQFLGYSQEELRGLTFQQLTWP >gi|296494447|gb|ADTN01000291.1| GENE 11 9502 - 11694 2049 730 aa, chain + ## HITS:1 COG:Z3235m_2 KEGG:ns NR:ns ## COG: Z3235m_2 COG2202 # Protein_GI_number: 15804978 # Func_class: T Signal transduction mechanisms # Function: FOG: 
PAS/PAC domain # Organism: Escherichia coli O157:H7 EDL933 # 1 295 76 370 370 615 99.0 1e-176 MEKRYYNRNGDVVWALLAVSLVRHTDGTPLYFIAQIEDINELKRTEQVNQQLMERITLTN EAGGIGIWEWELKPNIFSWDKRMFELYEIPPHIKPNWQVWYECVLPEDRQHAEKVIRDSL QSRSPFKLEFRIAVKDGIRHIRALANRVLNKEGEVERLLGINMDMTEVKQLNEALFQEKE RLHITLDSIGEAVVCIDMAMKITFMNPVAEKMSGWTQEEALGVPLLTVLHITFGDNGPLM ENIYSADTSRSAIEQDVVLHCRSGGSYDVHYSITPLSTLDGSNIGSVLVIQDVTESRKML RQLSYSASHDALTHLANRASFEKQLRILLQTVNSTHQRHALVFIDLDRFKAVNDSAGHAA GDALLRELASLMLSMLRSSDVLARLGGDEFGLLLPDCNVESARFIATRIISAVNDYHFIW EGRVHRVGASAGITLIDDNNHQAAEVMSQADIACYASKNGGRGRVTVYEPQQAAAHSERA VMSLDEQWRMIKENQLMMIAHGVASPRIPQARNLWLISLKLWSCEGEIIDEQTFRRSFSD PALSHALDRRVFHDFFQQAAKAVTSKGLSIALPLSVAGLSSTTLVNELLEQLENSPLPPR LLHLIIPADAILDHAESVQKLRLAGCRIVFSQVGRDLQIFNSLKVNMADYLLLDGELCAN VQGNLMDEMLITIIQGHAQLLGMKTIAGPVVLPLVMDTLSGIGVDLIYGDVIADAQPLDL LVNSSYFAIN >gi|296494447|gb|ADTN01000291.1| GENE 12 11732 - 12589 708 285 aa, chain - ## HITS:1 COG:alkA KEGG:ns NR:ns ## COG: alkA COG0122 # Protein_GI_number: 16130008 # Func_class: L Replication, recombination and repair # Function: 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase # Organism: Escherichia coli K12 # 1 280 1 280 282 545 97.0 1e-155 MYTLNWQPPYDWSWMLGFLAARAVNGVETVADDYYARSLAVGEYRGVVTAIPDIARHTLH INLSAGLEPVAAECLAKMSRLFDLQCNPQIVNGALGKLGAARPGLRLPGSVDAFEQGVRA ILGQLVSVAMAAKLTARVAQLYGERLDDFPDYVCFPTPQRLAAADPQALKALGMPLKRAE ALIHLANAALEGTLPMTIPGDVEQAMKTLQTFPGIGRWTANYFALRGWQAKDVFLPDDYL IKQRFPGMTPAQIRRYAERWKPWRSYALLHIWYTEGWQPDGTDEL >gi|296494447|gb|ADTN01000291.1| GENE 13 12636 - 12710 87 24 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPVKKGRDFSEMLPSRQPEYESKA >gi|296494447|gb|ADTN01000291.1| GENE 14 12723 - 14075 1650 450 aa, chain + ## HITS:1 COG:yegD KEGG:ns NR:ns ## COG: yegD COG0443 # Protein_GI_number: 16130009 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Escherichia coli K12 # 1 450 22 471 471 880 97.0 0 
MFIGFDYGTANCSVAVMRDGKPQLLKMENDSTLLPSMLCAPTREAVSEWLYRHHDVPADD DETQALLRRAIRYNREEDIDVTAKSVQFGLSSLAQYIDDPEEVWFVKSPKSFLGASGLKP QQVALFEDLVCAMMLHIRQQAQAQLPDAITQAVIGRPINFQGLGGDEANTQAQGILERAA KRAGFKDVVFQYEPVAAGLDYEATLQEEKRVLVVDIGGGTTDCSLLLMGPQWRSRLDREA SLLGHSGCRIGGNDLDIALAFKNLMPLLGMGGETEKGIALPILPWWNAVAINDVPAQSDF YSSANGRLLNDLVRDAREPEKVALLQKVWRQRLSYRLVRSAEESKIALSSVAETRASLPF ISDELATLISQQGLESALNQPLARILEQVQLALDNAQEKPDVIYLTGGSARSPLIKKALT EQLPGIPIAGGDDFGSVTAGLARWAEVVFR >gi|296494447|gb|ADTN01000291.1| GENE 15 14088 - 15587 699 499 aa, chain - ## HITS:1 COG:Z3239 KEGG:ns NR:ns ## COG: Z3239 COG4248 # Protein_GI_number: 15802551 # Func_class: R General function prediction only # Function: Uncharacterized protein with protein kinase and helix-hairpin-helix DNA-binding domains # Organism: Escherichia coli O157:H7 EDL933 # 1 499 148 646 646 967 96.0 0 MVGRDSKVVLIDSDSFQINANGTLHLCEVGVSYFTPPELQTLSSFIGFERTENHDNFGLA LLIFHVLFGGRHPYSGVPLISDAGNALETDIAHFRYAYASDNQRRGLTPPPRSIPLSMLP GDVEAMFQQAFTESGVATGRPTAKAWVAALDLLRQQLKKCTVSAMHIYPGHLTDCPWCAL DNQGVIYFIDLGEEVITTSGDFVLAKVWAMVMASVAPPALQLPLPDHFQPTGRPLPLGLL RREYIILIEIALSALSLLLCGLQVEPRYIILVPVLAAIWIIGSLTSKAYKAEVQRRREAF NRAKMDYDHLVSQIQQLGGLEGFIAKRTMLEKMKDEILGLPEEEKRALAALQDNARERQK QKFLEGFFIDAASIPGVGPARKAALRSFGIETAADVTRRSVKQVKGFGDHLTQAVIDWKA SCERRFVFRPNEAVTPADRQAVMAKMAAKRHRLESALTVGATELQRFRLHAPARTMPLME PLRQAAEKLAQAQADLSRC >gi|296494447|gb|ADTN01000291.1| GENE 16 15602 - 15865 134 87 aa, chain - ## HITS:1 COG:yegI KEGG:ns NR:ns ## COG: yegI COG4248 # Protein_GI_number: 16130010 # Func_class: R General function prediction only # Function: Uncharacterized protein with protein kinase and helix-hairpin-helix DNA-binding domains # Organism: Escherichia coli K12 # 1 50 58 107 648 108 98.0 2e-24 MAATADAQLLNYVAWPQATLHGGRGGKVIGFMMPKVSGKEPIHMIYSPRHIVARVTLTVR GIFYSMLRAILLHLLLRFTSTGTSWVT >gi|296494447|gb|ADTN01000291.1| GENE 17 16027 - 16707 610 226 aa, chain - ## HITS:1 COG:no KEGG:ECP_2111 NR:ns ## KEGG: ECP_2111 # Name: not_defined # Def: hypothetical protein # 
Organism: E.coli_536 # Pathway: not_defined # 1 226 28 253 253 431 98.0 1e-120 MQVAWLEDQQPLLSVFVADGAGSVSQGGEGAMLAVNEAMAYMSQKVQGGELGLNDVLATD IVLTIRQRLFAEAEAKELAVRDFACTFLGLISSANGTLIMQIGDGGVVVDLGHGLQLPLT PMVGEYANMTHFITDDDAVSRLETYTSTERAHKVAAFTDGIQRLALNMLDNSPHVPFFTP FFNGLASATQEQLDLLPELLKQFLSSPAVNERTDDDKTLALALWLP >gi|296494447|gb|ADTN01000291.1| GENE 18 16785 - 17444 518 219 aa, chain - ## HITS:1 COG:ECs2881 KEGG:ns NR:ns ## COG: ECs2881 COG4245 # Protein_GI_number: 15832135 # Func_class: R General function prediction only # Function: Uncharacterized protein encoded in toxicity protection region of plasmid R478, contains von Willebrand factor (vWF) domain # Organism: Escherichia coli O157:H7 # 1 219 1 219 219 403 98.0 1e-112 MSEQITFATSDFASNPEPRCPCILLLDVSGSMNGRPINELNAGLVTFRDELLADSLALKR VELGIVTFGPVHVEQPFTSAANFFPPILFAHGDTPMGAAITKALNMVEERKREYRANGIS YYRPWIFLITDGAPTDEWQAAANKVFQGEEDKKFAFFSIGVQGADMKTLAQISVRQPLPL QGLQFRELFSWLSSSLRSVSRSTPGTEVVLEAPKGWTSV >gi|296494447|gb|ADTN01000291.1| GENE 19 18035 - 18163 94 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MMSSFIMTLSLFKAPSSGGAFPFQRPAEISGLPPFAMQAVYR >gi|296494447|gb|ADTN01000291.1| GENE 20 18363 - 18491 95 42 aa, chain + ## HITS:1 COG:no KEGG:ECSP_2830 NR:ns ## KEGG: ECSP_2830 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 42 7 48 48 69 95.0 4e-11 MMIHFIMTLSLFKAPSSGGAFPFQRPAEIIGLPPFALNAVYR Prediction of potential genes in microbial genomes Time: Mon May 16 00:09:37 2011 Seq name: gi|296494446|gb|ADTN01000292.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont732.1, whole genome shotgun sequence Length of sequence - 4121 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 139 - 2346 809 ## COG4644 Transposase and inactivated derivatives, TnpA family + Term 2492 - 2526 -0.7 - Term 2684 - 2713 3.5 2 2 Tu 1 . 
- CDS 2929 - 3138 137 ## ECO111_p2-004 putative lysogeny establishment protein - Term 3174 - 3206 -0.8 3 3 Tu 1 . - CDS 3291 - 4121 264 ## ECO111_p2-003 cyclization recombinase Predicted protein(s) >gi|296494446|gb|ADTN01000292.1| GENE 1 139 - 2346 809 735 aa, chain + ## HITS:1 COG:CAP0093 KEGG:ns NR:ns ## COG: CAP0093 COG4644 # Protein_GI_number: 15004797 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives, TnpA family # Organism: Clostridium acetobutylicum # 56 707 1 651 673 501 40.0 1e-141 MSSRDLARFTDVRRYASLVCIISEARSILTDEVIDLHERILSSLFSRAKRTQAERLQQTG KLIQSKLKQYVTVGQALLNARESGEDPWAAIEDVLPWQEFINSVEETRFLSRKDNFDPLH LITEKYSTLCKYAPRMLSVLQFRAAPAAMQLSDALDTVRDMYRKQLRKVPPSAPIGFIPE SWRKVVITPTGIDRKYYEFCVLNELKGALRSGDIWVKGSRRYRNFDDYLIPSDDFEKSLR DNQLPLAVPADCHEYIKSRLTLLASRLEEVNAMALAGDLPDVDISDKGVKITPLDNSVPS AASPFGDLVYGMLPHPKITEMLDEVDGWTGFTRHFTHLKNNHVRPKDRKLLLTTILADGI NLGLTKMAESCPGTTKSSLEGIQAWYIRDETYSAALAELVNAQKKSPLAAFWGDGTTSSS DGQNFRVGSHGRYAGQVNLKYGQDPGVQIYTHISDQYSPFYTKVISRVRDSTHVLDGLLY HETDLEITEHYTDTAGFTEHVFALMHLLGFAFAPRIRDLHDKRLFIHGKAELYQGLQSII STTSLNLKEIETHWHEVLRLASSIKQGTVTASLMMKKLASYPKQNGLAKALREIGRIERS LFMLDWFRDPLLRRRVQAGLNKGEARNALARAVFMHRLGEIRDRGLENQSYRASGLTLLT AAISLRNTVYIERAIDSLRRKGIPINEQLITHLSPLGWEHINLSGDYVWRTNLKLGQGKY RALRSVNSNMYKKQA >gi|296494446|gb|ADTN01000292.1| GENE 2 2929 - 3138 137 69 aa, chain - ## HITS:1 COG:no KEGG:ECO111_p2-004 NR:ns ## KEGG: ECO111_p2-004 # Name: not_defined # Def: putative lysogeny establishment protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 69 35 103 103 103 86.0 3e-21 MPDIDSISVAFPVKSMRGAEHFVMNATEEEARRGFAKVMSEFGEFLGHVDKALSISSARS KALTASMMK >gi|296494446|gb|ADTN01000292.1| GENE 3 3291 - 4121 264 276 aa, chain - ## HITS:1 COG:no KEGG:ECO111_p2-003 NR:ns ## KEGG: ECO111_p2-003 # Name: not_defined # Def: cyclization recombinase # Organism: E.coli_O111_H- # Pathway: not_defined # 2 276 69 343 343 527 98.0 1e-148 AEDVRDYLLYLQARGLAVKTIQQHLGQLNMLHRRSGLPRPSDSNAVSLVMRRIRKENVDA 
GERAKQALAFERTDFDQVRSLMENSDRCQDIRNLAFLGIAYNTLLRIAEIARIRVKDISR TGGGRMLIHIGRTKTLVSTAGVEKALSLGVTKLVERWISVSGVADDPNNYLFCRVRKNGV AAPSATSQLSTRALEGIFEATHRLIYGAKDDSGQRYLAWSGHSARVGAARDMARAGVSIP EIMQAGGWTNVNIVMNYIRNLDSETGAMVRLLEDGD Prediction of potential genes in microbial genomes Time: Mon May 16 00:09:43 2011 Seq name: gi|296494445|gb|ADTN01000293.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont732.2, whole genome shotgun sequence Length of sequence - 927 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 43 - 82 1.1 1 1 Tu 1 . - CDS 195 - 416 220 ## ECO111_p2-001 putative cre associated regulatory protein - Prom 639 - 698 3.5 Predicted protein(s) >gi|296494445|gb|ADTN01000293.1| GENE 1 195 - 416 220 73 aa, chain - ## HITS:1 COG:no KEGG:ECO111_p2-001 NR:ns ## KEGG: ECO111_p2-001 # Name: not_defined # Def: putative cre associated regulatory protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 73 1 73 73 122 93.0 3e-27 MGYSTAKVSTHLELEKNRGYWRAKGFDRDSCQLSLSRGEEKIERTRGRWRFYDENHKQVK AEPILYTLLKTII Prediction of potential genes in microbial genomes Time: Mon May 16 00:09:46 2011 Seq name: gi|296494444|gb|ADTN01000294.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont734.1, whole genome shotgun sequence Length of sequence - 1287 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 247 - 309 5.4 1 1 Op 1 . - CDS 321 - 569 142 ## ECO111_p2-009 regulatory protein 2 1 Op 2 . - CDS 566 - 1006 327 ## ECO111_p2-010 hypothetical protein 3 1 Op 3 . 
- CDS 1040 - 1204 80 ## ECO111_p2-011 defense against restriction protein Predicted protein(s) >gi|296494444|gb|ADTN01000294.1| GENE 1 321 - 569 142 82 aa, chain - ## HITS:1 COG:no KEGG:ECO111_p2-009 NR:ns ## KEGG: ECO111_p2-009 # Name: not_defined # Def: regulatory protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 82 1 82 82 158 100.0 7e-38 MKKRYYTVKHGTLRALQEFADKHNVEVRREGGSKALRMYRPDGKWRTVVDFKTNSVPQGV RDRAFEEWEQIIIDNALLLNAD >gi|296494444|gb|ADTN01000294.1| GENE 2 566 - 1006 327 146 aa, chain - ## HITS:1 COG:no KEGG:ECO111_p2-010 NR:ns ## KEGG: ECO111_p2-010 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 146 1 146 146 273 97.0 2e-72 MVTLSDTIKPNKTYLEAVLRTALLGKTEDEYVDFFLSGLRGRLLKNPRLYRSYGSYWPEI KKLLLERGYGNFGRLVDRDVRKIYRYDRPALTLIAATLYSQERFDNGQIYSAWHLLPVPE EVDDQDYEFESYDLEVEALAQAGDKT >gi|296494444|gb|ADTN01000294.1| GENE 3 1040 - 1204 80 54 aa, chain - ## HITS:1 COG:no KEGG:ECO111_p2-011 NR:ns ## KEGG: ECO111_p2-011 # Name: not_defined # Def: defense against restriction protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 54 2202 2255 2255 116 98.0 3e-25 MKGVLFRAKDAIKEKFGARWLPAKAKNSDFPGNWWIIETKHNVADVLAVIQQYA Prediction of potential genes in microbial genomes Time: Mon May 16 00:09:52 2011 Seq name: gi|296494443|gb|ADTN01000295.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont734.2, whole genome shotgun sequence Length of sequence - 1029 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 55 - 312 173 ## ECBD_3760 hypothetical protein - Prom 525 - 584 4.1 + Prom 163 - 222 2.7 2 2 Op 1 . + CDS 290 - 538 135 ## B21_04108 hypothetical protein 3 2 Op 2 . 
+ CDS 626 - 994 245 ## COG2963 Transposase and inactivated derivatives Predicted protein(s) >gi|296494443|gb|ADTN01000295.1| GENE 1 55 - 312 173 85 aa, chain - ## HITS:1 COG:no KEGG:ECBD_3760 NR:ns ## KEGG: ECBD_3760 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_BL21_DE3 # Pathway: not_defined # 1 85 32 116 116 175 100.0 4e-43 MVSSSASNVVNCETKQRTQFECIYFSQYWAKGDFIAKRAPIGQWEPYSEESLLGIIVTSV CRIKVAMLKPEPPRDPHIPLMGDFN >gi|296494443|gb|ADTN01000295.1| GENE 2 290 - 538 135 82 aa, chain + ## HITS:1 COG:no KEGG:B21_04108 NR:ns ## KEGG: B21_04108 # Name: yjhE # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 82 1 82 82 140 100.0 2e-32 MLADELTIGPIRAVPMDITPKYVGIASGLMNAGSAVADIISPIAFGIIIDKTGNWSLPFY GSVALLVIGIFLTFFMRPDKSL >gi|296494443|gb|ADTN01000295.1| GENE 3 626 - 994 245 122 aa, chain + ## HITS:1 COG:yi91a KEGG:ns NR:ns ## COG: yi91a COG2963 # Protein_GI_number: 16128240 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 1 122 13 134 134 224 97.0 2e-59 MKKRNFSAEFKRESAQLVVDQKYTVADAAKAMDVGLSTMTRWVKQLRDERQGKTPKASPI TPEQIEIRKLRKKLQRIEMENEILKKATVDSIGQRNSYVKTWGCGGFLNETNIYSRGKSL CF Prediction of potential genes in microbial genomes Time: Mon May 16 00:09:57 2011 Seq name: gi|296494442|gb|ADTN01000296.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont750.1, whole genome shotgun sequence Length of sequence - 1684 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 - CDS 3 - 725 300 ## COG3188 P pilus assembly protein, porin PapC - Term 736 - 765 1.1 2 1 Op 2 . 
- CDS 776 - 1342 464 ## COG3539 P pilus assembly protein, pilin FimA Predicted protein(s) >gi|296494442|gb|ADTN01000296.1| GENE 1 3 - 725 300 240 aa, chain - ## HITS:1 COG:ybgQ KEGG:ns NR:ns ## COG: ybgQ COG3188 # Protein_GI_number: 16128693 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli K12 # 1 233 1 233 818 480 100.0 1e-136 MDTVNIYRLSFVSCLVMAMPCAMAVEFNLNVLDKSMRDRIDISLLKEKGVIAPGEYFVSV AVNNNKISNGQKINWQKKGDKTIPCINDSLVDKFGLKPDIRQSLPQIDRCIDFSSRPEML FNFDQANQQLNISIPQAWLAWHSENWAPPSTWKEGVAGVLMDYNLFASSYRPQDGSSSTN LNAYGTAGINAGAWRLRSDYQLNKTDSEDNHDQSGGISRTYLFRPLPQLGSKLRKVRTSS >gi|296494442|gb|ADTN01000296.1| GENE 2 776 - 1342 464 188 aa, chain - ## HITS:1 COG:ybgD KEGG:ns NR:ns ## COG: ybgD COG3539 # Protein_GI_number: 16128694 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 188 1 188 188 352 100.0 3e-97 MFKGQKTLAALAVSLLFTAPVYAADEGSGEIHFKGEVIEAPCEIHPEDIDKNIDLGQVTT THINREHHSNKVAVDIRLINCDLPASDNGSGMPVSKVGVTFDSTAKTTGATPLLSNTSAG EATGVGVRLMDKNDGNIVLGSAAPDLDLDASSSEQTLNFFAWMEQIDNAVDVTAGEVTAN ATYVLDYK Prediction of potential genes in microbial genomes Time: Mon May 16 00:10:07 2011 Seq name: gi|296494441|gb|ADTN01000297.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont750.2, whole genome shotgun sequence Length of sequence - 28067 bp Number of predicted genes - 26, with homology - 25 Number of transcription units - 8, operones - 3 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 1063 - 1122 2.9 2 2 Tu 1 . + CDS 1341 - 1526 72 ## ECP_0732 hypothetical protein 3 3 Tu 1 . 
- CDS 1559 - 1666 84 ## EC55989_0704 hypothetical protein - Prom 1782 - 1841 4.7 + Prom 1781 - 1840 5.5 4 4 Op 1 24/0.000 + CDS 2047 - 2451 289 ## COG2009 Succinate dehydrogenase/fumarate reductase, cytochrome b subunit 5 4 Op 2 22/0.000 + CDS 2445 - 2792 528 ## COG2142 Succinate dehydrogenase, hydrophobic anchor subunit 6 4 Op 3 36/0.000 + CDS 2792 - 4558 2196 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 7 4 Op 4 5/0.000 + CDS 4574 - 5290 661 ## COG0479 Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit + Prom 5363 - 5422 4.5 8 4 Op 5 21/0.000 + CDS 5591 - 8392 2941 ## COG0567 2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, and related enzymes 9 4 Op 6 6/0.000 + CDS 8407 - 9624 1674 ## COG0508 Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes 10 4 Op 7 39/0.000 + CDS 9990 - 11156 1548 ## COG0045 Succinyl-CoA synthetase, beta subunit 11 4 Op 8 . + CDS 11156 - 12025 1043 ## COG0074 Succinyl-CoA synthetase, alpha subunit + Term 12101 - 12130 2.1 - Term 12018 - 12067 1.2 12 5 Tu 1 . - CDS 12129 - 12851 732 ## COG2188 Transcriptional regulators - Prom 12902 - 12961 6.8 + Prom 12917 - 12976 4.5 13 6 Op 1 2/0.000 + CDS 13020 - 14936 1929 ## COG1299 Phosphotransferase system, fructose-specific IIC component 14 6 Op 2 . + CDS 14954 - 17587 1913 ## COG0383 Alpha-mannosidase + Term 17644 - 17681 7.1 + Prom 17774 - 17833 4.6 15 7 Tu 1 . 
+ CDS 17855 - 17926 74 ## + Prom 18247 - 18306 7.7 16 8 Op 1 31/0.000 + CDS 18434 - 20002 1828 ## COG1271 Cytochrome bd-type quinol oxidase, subunit 1 17 8 Op 2 3/0.000 + CDS 20018 - 21157 1460 ## COG1294 Cytochrome bd-type quinol oxidase, subunit 2 18 8 Op 3 2/0.000 + CDS 21172 - 21285 193 ## COG4890 Predicted outer membrane lipoprotein 19 8 Op 4 7/0.000 + CDS 21285 - 21578 230 ## COG3790 Predicted membrane protein + Term 21600 - 21635 4.3 + Prom 21628 - 21687 3.2 20 8 Op 5 13/0.000 + CDS 21842 - 22132 217 ## COG0824 Predicted thioesterase 21 8 Op 6 30/0.000 + CDS 22129 - 22821 696 ## COG0811 Biopolymer transport proteins 22 8 Op 7 8/0.000 + CDS 22825 - 23253 470 ## COG0848 Biopolymer transport protein 23 8 Op 8 8/0.000 + CDS 23318 - 24583 1208 ## COG3064 Membrane protein involved in colicin uptake + Prom 24605 - 24664 4.0 24 8 Op 9 20/0.000 + CDS 24716 - 26008 1351 ## COG0823 Periplasmic component of the Tol biopolymer transport system 25 8 Op 10 13/0.000 + CDS 26043 - 26564 676 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins 26 8 Op 11 . 
+ CDS 26574 - 27365 554 ## COG1729 Uncharacterized protein conserved in bacteria + TRNA 27530 - 27605 99.5 # Lys TTT 0 0 + TRNA 27741 - 27816 94.3 # Val TAC 0 0 + TRNA 27819 - 27894 99.5 # Lys TTT 0 0 Predicted protein(s) >gi|296494441|gb|ADTN01000297.1| GENE 1 70 - 1353 1438 427 aa, chain - ## HITS:1 COG:gltA KEGG:ns NR:ns ## COG: gltA COG0372 # Protein_GI_number: 16128695 # Func_class: C Energy production and conversion # Function: Citrate synthase # Organism: Escherichia coli K12 # 1 427 1 427 427 901 100.0 0 MADTKAKLTLNGDTAVELDVLKGTLGQDVIDIRTLGSKGVFTFDPGFTSTASCESKITFI DGDEGILLHRGFPIDQLATDSNYLEVCYILLNGEKPTQEQYDEFKTTVTRHTMIHEQITR LFHAFRRDSHPMAVMCGITGALAAFYHDSLDVNNPRHREIAAFRLLSKMPTMAAMCYKYS IGQPFVYPRNDLSYAGNFLNMMFSTPCEPYEVNPILERAMDRILILHADHEQNASTSTVR TAGSSGANPFACIAAGIASLWGPAHGGANEAALKMLEEISSVKHIPEFVRRAKDKNDSFR LMGFGHRVYKNYDPRATVMRETCHEVLKELGTKDDLLEVAMELENIALNDPYFIEKKLYP NVDFYSGIILKAMGIPSSMFTVIFAMARTVGWIAHWSEMHSDGMKIARPRQLYTGYEKRD FKSDIKR >gi|296494441|gb|ADTN01000297.1| GENE 2 1341 - 1526 72 61 aa, chain + ## HITS:1 COG:no KEGG:ECP_0732 NR:ns ## KEGG: ECP_0732 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_536 # Pathway: not_defined # 1 61 3 63 63 121 96.0 7e-27 MYQPFKVSLAPYCVRLPELKFAFAHQPGFTRFLFGSPLCERGENLGTELWALAGKGSIDD E >gi|296494441|gb|ADTN01000297.1| GENE 3 1559 - 1666 84 35 aa, chain - ## HITS:1 COG:no KEGG:EC55989_0704 NR:ns ## KEGG: EC55989_0704 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 35 18 52 52 65 100.0 5e-10 MRKSYEVGISPKINLCNSVEVLTNSFGTVISGRQV >gi|296494441|gb|ADTN01000297.1| GENE 4 2047 - 2451 289 134 aa, chain + ## HITS:1 COG:ECs0746 KEGG:ns NR:ns ## COG: ECs0746 COG2009 # Protein_GI_number: 15830000 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, cytochrome b subunit # Organism: Escherichia coli O157:H7 # 6 134 1 129 129 220 100.0 4e-58 MWALFMIRNVKKQRPVNLDLQTIRFPITAIASILHRVSGVITFVAVGILLWLLGTSLSSP 
EGFEQASAIMGSFFVKFIMWGILTALAYHVVVGIRHMMMDFGYLEETFEAGKRSAKISFV ITVVLSLLAGVLVW >gi|296494441|gb|ADTN01000297.1| GENE 5 2445 - 2792 528 115 aa, chain + ## HITS:1 COG:ECs0747 KEGG:ns NR:ns ## COG: ECs0747 COG2142 # Protein_GI_number: 15830001 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase, hydrophobic anchor subunit # Organism: Escherichia coli O157:H7 # 1 115 1 115 115 148 99.0 3e-36 MVSNASALGRNGVHDFILVRATAIVLTLYIIYMVGFFATSGELTYEVWIGFFASAFTKVF TLLALFSILIHAWIGMWQVLTDYVKPLALRLMLQLVIVVALVVYVIYGFVVVWGV >gi|296494441|gb|ADTN01000297.1| GENE 6 2792 - 4558 2196 588 aa, chain + ## HITS:1 COG:ECs0748 KEGG:ns NR:ns ## COG: ECs0748 COG1053 # Protein_GI_number: 15830002 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Escherichia coli O157:H7 # 1 588 1 588 588 1153 100.0 0 MKLPVREFDAVVIGAGGAGMRAALQISQSGQTCALLSKVFPTRSHTVSAQGGITVALGNT HEDNWEWHMYDTVKGSDYIGDQDAIEYMCKTGPEAILELEHMGLPFSRLDDGRIYQRPFG GQSKNFGGEQAARTAAAADRTGHALLHTLYQQNLKNHTTIFSEWYALDLVKNQDGAVVGC TALCIETGEVVYFKARATVLATGGAGRIYQSTTNAHINTGDGVGMAIRAGVPVQDMEMWQ FHPTGIAGAGVLVTEGCRGEGGYLLNKHGERFMERYAPNAKDLAGRDVVARSIMIEIREG RGCDGPWGPHAKLKLDHLGKEVLESRLPGILELSRTFAHVDPVKEPIPVIPTCHYMMGGI PTKVTGQALTVNEKGEDVVVPGLFAVGEIACVSVHGANRLGGNSLLDLVVFGRAAGLHLQ ESIAEQGALRDASESDVEASLDRLNRWNNNRNGEDPVAIRKALQECMQHNFSVFREGDAM AKGLEQLKVIRERLKNARLDDTSSEFNTQRVECLELDNLMETAYATAVSANFRTESRGAH SRFDFPDRDDENWLCHSLYLPESESMTRRSVNMEPKLRPAFPPKIRTY >gi|296494441|gb|ADTN01000297.1| GENE 7 4574 - 5290 661 238 aa, chain + ## HITS:1 COG:sdhB KEGG:ns NR:ns ## COG: sdhB COG0479 # Protein_GI_number: 16128699 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit # Organism: Escherichia coli K12 # 1 238 1 238 238 494 100.0 1e-140 MRLEFSIYRYNPDVDDAPRMQDYTLEADEGRDMMLLDALIQLKEKDPSLSFRRSCREGVC GSDGLNMNGKNGLACITPISALNQPGKKIVIRPLPGLPVIRDLVVDMGQFYAQYEKIKPY 
LLNNGQNPPAREHLQMPEQREKLDGLYECILCACCSTSCPSFWWNPDKFIGPAGLLAAYR FLIDSRDTETDSRLDGLSDAFSVFRCHSIMNCVSVCPKGLNPTRAIGHIKSMLLQRNA >gi|296494441|gb|ADTN01000297.1| GENE 8 5591 - 8392 2941 933 aa, chain + ## HITS:1 COG:ECs0751 KEGG:ns NR:ns ## COG: ECs0751 COG0567 # Protein_GI_number: 15830005 # Func_class: C Energy production and conversion # Function: 2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, and related enzymes # Organism: Escherichia coli O157:H7 # 1 933 1 933 933 1944 100.0 0 MQNSALKAWLDSSYLSGANQSWIEQLYEDFLTDPDSVDANWRSTFQQLPGTGVKPDQFHS QTREYFRRLAKDASRYSSTISDPDTNVKQVKVLQLINAYRFRGHQHANLDPLGLWQQDKV ADLDPSFHDLTEADFQETFNVGSFASGKETMKLGELLEALKQTYCGPIGAEYMHITSTEE KRWIQQRIESGRATFNSEEKKRFLSELTAAEGLERYLGAKFPGAKRFSLEGGDALIPMLK EMIRHAGNSGTREVVLGMAHRGRLNVLVNVLGKKPQDLFDEFAGKHKEHLGTGDVKYHMG FSSDFQTDGGLVHLALAFNPSHLEIVSPVVIGSVRARLDRLDEPSSNKVLPITIHGDAAV TGQGVVQETLNMSKARGYEVGGTVRIVINNQVGFTTSNPLDARSTPYCTDIGKMVQAPIF HVNADDPEAVAFVTRLALDFRNTFKRDVFIDLVCYRRHGHNEADEPSATQPLMYQKIKKH PTPRKIYADKLEQEKVATLEDATEMVNLYRDALDAGDCVVAEWRPMNMHSFTWSPYLNHE WDEEYPNKVEMKRLQELAKRISTVPEAVEMQSRVAKIYGDRQAMAAGEKLFDWGGAENLA YATLVDEGIPVRLSGEDSGRGTFFHRHAVIHNQSNGSTYTPLQHIHNGQGAFRVWDSVLS EEAVLAFEYGYATAEPRTLTIWEAQFGDFANGAQVVIDQFISSGEQKWGRMCGLVMLLPH GYEGQGPEHSSARLERYLQLCAEQNMQVCVPSTPAQVYHMLRRQALRGMRRPLVVMSPKS LLRHPLAVSSLEELANGTFLPAIGEIDELDPKGVKRVVMCSGKVYYDLLEQRRKNNQHDV AIVRIEQLYPFPHKAMQEVLQQFAHVKDFVWCQEEPLNQGAWYCSQHHFREVIPFGASLR YAGRPASASPAVGYMSVHQKQQQDLVNDALNVE >gi|296494441|gb|ADTN01000297.1| GENE 9 8407 - 9624 1674 405 aa, chain + ## HITS:1 COG:STM0737 KEGG:ns NR:ns ## COG: STM0737 COG0508 # Protein_GI_number: 16764107 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes # Organism: Salmonella typhimurium LT2 # 1 405 1 402 402 717 96.0 0 MSSVDILVPDLPESVADATVATWHKKPGDAVVRDEVLVEIETDKVVLEVPASADGILDAV LEDEGTTVTSRQILGRLREGNSAGKETSAKSEEKASTPAQRQQASLEEQNNDALSPAIRR 
LLAEHNLDASAIKGTGVGGRLTREDVEKHLAKAPAKESAPAAAAPAAQPALAARSEKRVP MTRLRKRVAERLLEAKNSTAMLTTFNEVNMKPIMDLRKQYGEAFEKRHGIRLGFMSFYVK AVVEALKRYPEVNASIDGDDVVYHNYFDVSMAVSTPRGLVTPVLRDVDTLGMADIEKKIK ELAVKGRDGKLTVEDLTGGNFTITNGGVFGSLMSTPIINPPQSAILGMHAIKDRPMAVNG QVEILPMMYLALSYDHRLIDGRESVGFLVTIKELLEDPTRLLLDV >gi|296494441|gb|ADTN01000297.1| GENE 10 9990 - 11156 1548 388 aa, chain + ## HITS:1 COG:ECs0753 KEGG:ns NR:ns ## COG: ECs0753 COG0045 # Protein_GI_number: 15830007 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, beta subunit # Organism: Escherichia coli O157:H7 # 1 388 1 388 388 728 100.0 0 MNLHEYQAKQLFARYGLPAPVGYACTTPREAEEAASKIGAGPWVVKCQVHAGGRGKAGGV KVVNSKEDIRAFAENWLGKRLVTYQTDANGQPVNQILVEAATDIAKELYLGAVVDRSSRR VVFMASTEGGVEIEKVAEETPHLIHKVALDPLTGPMPYQGRELAFKLGLEGKLVQQFTKI FMGLATIFLERDLALIEINPLVITKQGDLICLDGKLGADGNALFRQPDLREMRDQSQEDP REAQAAQWELNYVALDGNIGCMVNGAGLAMGTMDIVKLHGGEPANFLDVGGGATKERVTE AFKIILSDDKVKAVLVNIFGGIVRCDLIADGIIGAVAEVGVNVPVVVRLEGNNAELGAKK LADSGLNIIAAKGLTDAAQQVVAAVEGK >gi|296494441|gb|ADTN01000297.1| GENE 11 11156 - 12025 1043 289 aa, chain + ## HITS:1 COG:ECs0754 KEGG:ns NR:ns ## COG: ECs0754 COG0074 # Protein_GI_number: 15830008 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, alpha subunit # Organism: Escherichia coli O157:H7 # 1 289 1 289 289 508 100.0 1e-144 MSILIDKNTKVICQGFTGSQGTFHSEQAIAYGTKMVGGVTPGKGGTTHLGLPVFNTVREA VAATGATASVIYVPAPFCKDSILEAIDAGIKLIITITEGIPTLDMLTVKVKLDEAGVRMI GPNCPGVITPGECKIGIQPGHIHKPGKVGIVSRSGTLTYEAVKQTTDYGFGQSTCVGIGG DPIPGSNFIDILEMFEKDPQTEAIVMIGEIGGSAEEEAAAYIKEHVTKPVVGYIAGVTAP KGKRMGHAGAIIAGGKGTADEKFAALEAAGVKTVRSLADIGEALKTVLK >gi|296494441|gb|ADTN01000297.1| GENE 12 12129 - 12851 732 240 aa, chain - ## HITS:1 COG:farR KEGG:ns NR:ns ## COG: farR COG2188 # Protein_GI_number: 16128705 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 240 1 240 240 442 100.0 1e-124 
MGHKPLYRQIADRIREQIARGELKPGDALPTESALQTEFGVSRVTVRQALRQLVEQQILE SIQGSGTYVKEERVNYDIFQLTSFDEKLSDRHVDTHSEVLIFEVIPADDFLQQQLQITPQ DRVWHVKRVRYRKQKPMALEETWMPLALFPDLTWQVMENSKYHFIEEVKKMVIDRSEQEI IPLMPTEEMSRLLNISQTKPILEKVSRGYLVDGRVFEYSRNAFNTDDYKFTLIAQRKSSR >gi|296494441|gb|ADTN01000297.1| GENE 13 13020 - 14936 1929 638 aa, chain + ## HITS:1 COG:hrsA_3 KEGG:ns NR:ns ## COG: hrsA_3 COG1299 # Protein_GI_number: 16128706 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Escherichia coli K12 # 279 638 1 360 360 640 100.0 0 MNLTTLTHRDALCLNARFTSREEAIHALTQRLAALGKISSTEQFLEEVYRRESLGPTALG EGLAVPHGKTAAVKEAAFAVATLSEPLQWEGVDGPEAVDLVVLLAIPPNEAGTTHMQLLT ALTTRLADDEIRARIQSATTPDELLSALDDKGGTQPSASFSNAPTIVCVTACPAGIAHTY MAAEYLEKAGRKLGVNVYVEKQGANGIEGRLTADQLNSATACIFAAEVAIKESERFNGIP ALSVPVAEPIRHAEALIQQALTLKRSDETRTVQQDTQPVKSVKTELKQALLSGISFAVPL IVAGGTVLAVAVLLSQIFGLQDLFNEENSWLWMYRKLGGGLLGILMVPVLAAYTAYSLAD KPALAPGFAAGLAANMIGSGFLGAVVGGLIAGYLMRWVKNHLRLSSKFNGFLTFYLYPVL GTLGAGSLMLFVVGEPVAWINNSLTAWLNGLSGSNALLLGAILGFMCSFDLGGPVNKAAY AFCLGAMANGVYGPYAIFASVKMVSAFTVTASTMLAPRLFKEFEIETGKSTWLLGLAGIT EGAIPMAIEDPLRVIGSFVLGSMVTGAIVGAMNIGLSTPGAGIFSLFLLHDNGAGGVMAA IGWFGAALVGAAISTAILLMWRRHAVKHGNYLTDGVMP >gi|296494441|gb|ADTN01000297.1| GENE 14 14954 - 17587 1913 877 aa, chain + ## HITS:1 COG:ybgG KEGG:ns NR:ns ## COG: ybgG COG0383 # Protein_GI_number: 16128707 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Escherichia coli K12 # 1 877 1 877 877 1824 100.0 0 MKAVSRVHITPHMHWDREWYFTTEESRILLVNNMEEILCRLEQDNEYKYYVLDGQTAILE DYFAVKPENKDRVKKQVEAGKLIIGPWYTQTDTTIVSAESIVRNLMYGMRDCLAFGEPMK IGYLPDSFGMSGQLPHIYNGFGITRTMFWRGCSERHGTDKTEFLWQSSDGSEVTAQVLPL GYAIGKYLPADENGLRKRLDSYFDVLEKASVTKEILLPNGHDQMPLQQNIFEVMDKLREI YPQRKFVMSRFEEVFEKIEAQRDNLATLKGEFIDGKYMRVHRTIGSTRMDIKIAHARIEN KIVNLLEPLATLAWTLGFEYHHGLLEKMWKEILKNHAHDSIGCCCSDKVHREIVARFELA EDMADNLIRFYMRKIADNMPQSDADKLVLFNLMPWPREEVINTTVRLRASQFNLRDDRGQ 
PVPYFIRHAREIDPGLIDRQIVHYGNYDPFMEFDIQINQIVPSMGYRTLYIEANQPGNVI AAKSDAEGILENAFWQIALNEDGSLQLVDKDSGVRYDRVLQIEESSDDGDEYDYSPAKEE WVITAANAKPQCDIIHEAWQSRAVIRYDMAVPLNLSERSARQSTGRVGVVLVVTLSHNSR RIDVDINLDNQADDHRLRVLVPTPFNTDSVLADTQFGSLTRPVNDSAMNNWQQEGWKEAP VPVWNMLNYVALQEGRNGMAVFSEGLREFEVIGEEKKTFAITLLRGVGLLGKEDLLLRPG RPSGIKMPVPDSQLRGLLSCRLSLLSYTGTPTAAGVAQQARAWLTPVQCYNKIPWDVMKL NKAGFNVPESYSLLKMPPVGCLISALKKAEDRQEVILRLFNPAESATCDATVAFSREVIS CSETMMDEHITTEENQGSNLSGPFLPGQSRTFSYRLA >gi|296494441|gb|ADTN01000297.1| GENE 15 17855 - 17926 74 23 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLILAGWIYAFISNPETLRRISQ >gi|296494441|gb|ADTN01000297.1| GENE 16 18434 - 20002 1828 522 aa, chain + ## HITS:1 COG:ECs0768 KEGG:ns NR:ns ## COG: ECs0768 COG1271 # Protein_GI_number: 15830022 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 1 # Organism: Escherichia coli O157:H7 # 1 522 2 523 523 1011 100.0 0 MLDIVELSRLQFALTAMYHFLFVPLTLGMAFLLAIMETVYVLSGKQIYKDMTKFWGKLFG INFALGVATGLTMEFQFGTNWSYYSHYVGDIFGAPLAIEGLMAFFLESTFVGLFFFGWDR LGKVQHMCVTWLVALGSNLSALWILVANGWMQNPIASDFNFETMRMEMVSFSELVLNPVA QVKFVHTVASGYVTGAMFILGISAWYMLKGRDFAFAKRSFAIAASFGMAAVLSVIVLGDE SGYEMGDVQKTKLAAIEAEWETQPAPAAFTLFGIPDQEEETNKFAIQIPYALGIIATRSV DTPVIGLKELMVQHEERIRNGMKAYSLLEQLRSGSTDQAVRDQFNSMKKDLGYGLLLKRY TPNVADATEAQIQQATKDSIPRVAPLYFAFRIMVACGFLLLAIIALSFWSVIRNRIGEKK WLLRAALYGIPLPWIAVEAGWFVAEYGRQPWAIGEVLPTAVANSSLTAGDLIFSMVLICG LYTLFLVAELFLMFKFARLGPSSLKTGRYHFEQSSTTTQPAR >gi|296494441|gb|ADTN01000297.1| GENE 17 20018 - 21157 1460 379 aa, chain + ## HITS:1 COG:ECs0769 KEGG:ns NR:ns ## COG: ECs0769 COG1294 # Protein_GI_number: 15830023 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 2 # Organism: Escherichia coli O157:H7 # 1 379 1 379 379 699 100.0 0 MIDYEVLRFIWWLLVGVLLIGFAVTDGFDMGVGMLTRFLGRNDTERRIMINSIAPHWDGN QVWLITAGGALFAAWPMVYAAAFSGFYVAMILVLASLFFRPVGFDYRSKIEETRWRNMWD WGIFIGSFVPPLVIGVAFGNLLQGVPFNVDEYLRLYYTGNFFQLLNPFGLLAGVVSVGMI 
ITQGATYLQMRTVGELHLRTRATAQVAALVTLVCFALAGVWVMYGIDGYVVKSTMDHYAA SNPLNKEVVREAGAWLVNFNNTPILWAIPALGVVLPLLTILTARMDKAAWAFVFSSLTLA CIILTAGIAMFPFVMPSSTMMNASLTMWDATSSQLTLNVMTWVAVVLVPIILLYTAWCYW KMFGRITKEDIERNTHSLY >gi|296494441|gb|ADTN01000297.1| GENE 18 21172 - 21285 193 37 aa, chain + ## HITS:1 COG:STM0742 KEGG:ns NR:ns ## COG: STM0742 COG4890 # Protein_GI_number: 16764112 # Func_class: S Function unknown # Function: Predicted outer membrane lipoprotein # Organism: Salmonella typhimurium LT2 # 1 36 1 36 37 57 86.0 5e-09 MWYFAWILGTLLACSFGVITALALEHVESGKAGQEDI >gi|296494441|gb|ADTN01000297.1| GENE 19 21285 - 21578 230 97 aa, chain + ## HITS:1 COG:ECs0770 KEGG:ns NR:ns ## COG: ECs0770 COG3790 # Protein_GI_number: 15830024 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 97 1 97 97 169 100.0 1e-42 MSKIIATLYAVMDKRPLRALSFVMALLLAGCMFWDPSRFAAKTSELEIWHGLLLMWAVCA GVIHGVGFRPQKVLWQGIFCPLLADIVLIVGLIFFFF >gi|296494441|gb|ADTN01000297.1| GENE 20 21842 - 22132 217 96 aa, chain + ## HITS:1 COG:ECs0771 KEGG:ns NR:ns ## COG: ECs0771 COG0824 # Protein_GI_number: 15830025 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Escherichia coli O157:H7 # 1 96 39 134 134 177 100.0 3e-45 MLRHHHFSQQALMAERVAFVVRKMTVEYYAPARLDDMLEIQTEITSMRGTSLVFTQRIVN AENTLLNEAEVLVVCVDPLKMKPRALPKSIVAEFKQ >gi|296494441|gb|ADTN01000297.1| GENE 21 22129 - 22821 696 230 aa, chain + ## HITS:1 COG:ECs0772 KEGG:ns NR:ns ## COG: ECs0772 COG0811 # Protein_GI_number: 15830026 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Escherichia coli O157:H7 # 1 230 1 230 230 419 100.0 1e-117 MTDMNILDLFLKASLLVKLIMLILIGFSIASWAIIIQRTRILNAAAREAEAFEDKFWSGI ELSRLYQESQGKRDNLTGSEQIFYSGFKEFVRLHRANSHAPEAVVEGASRAMRISMNREL ENLETHIPFLGTVGSISPYIGLFGTVWGIMHAFIALGAVKQATLQMVAPGIAEALIATAI GLFAAIPAVMAYNRLNQRVNKLELNYDNFMEEFTAILHRQAFTVSESNKG >gi|296494441|gb|ADTN01000297.1| 
GENE 22 22825 - 23253 470 142 aa, chain + ## HITS:1 COG:ECs0773 KEGG:ns NR:ns ## COG: ECs0773 COG0848 # Protein_GI_number: 15830027 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport protein # Organism: Escherichia coli O157:H7 # 11 142 11 142 142 207 100.0 4e-54 MARARGRGRRDLKSEINIVPLLDVLLVLLLIFMATAPIITQSVEVDLPDATESQAVSSND NPPVIVEVSGIGQYTVVVEKDRLERLPPEQVVAEVSSRFKANPKTVFLIGGAKDVPYDEI IKALNLLHSAGVKSVGLMTQPI >gi|296494441|gb|ADTN01000297.1| GENE 23 23318 - 24583 1208 421 aa, chain + ## HITS:1 COG:tolA KEGG:ns NR:ns ## COG: tolA COG3064 # Protein_GI_number: 16128714 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane protein involved in colicin uptake # Organism: Escherichia coli K12 # 1 421 1 421 421 286 100.0 4e-77 MSKATEQNDKLKRAIIISAVLHVILFAALIWSSFDENIEASAGGGGGSSIDAVMVDSGAV VEQYKRMQSQESSAKRSDEQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQA EEAAKQAELKQKQAEEAAAKAAADAKAKAEADAKAAEEAAKKAAADAKKKAEAEAAKAAA EAQKKAEAAAAALKKKAEAAEAAAAEARKKAATEAAEKAKAEAEKKAAAEKAAADKKAAA EKAAADKKAAEKAAAEKAAADKKAAAEKAAADKKAAAAKAAAEKAAAAKAAAEADDIFGE LSSGKNAPKTGGGAKGNNASPAGSGNTKNNGASGADINNYAGQIKSAIESKFYDASSYAG KTCTLRIKLAPDGMLLDIKPEGGDPALCQAALAAAKLAKIPKPPSQAVYEVFKNAPLDFK P >gi|296494441|gb|ADTN01000297.1| GENE 24 24716 - 26008 1351 430 aa, chain + ## HITS:1 COG:ECs0775 KEGG:ns NR:ns ## COG: ECs0775 COG0823 # Protein_GI_number: 15830029 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic component of the Tol biopolymer transport system # Organism: Escherichia coli O157:H7 # 1 430 1 430 430 820 100.0 0 MKQALRVAFGFLILWASVLHAEVRIVIDSGVDSGRPIGVVPFQWAGPGAAPEDIGGIVAA DLRNSGKFNPLDRARLPQQPGSAQEVQPAAWSALGIDAVVVGQVTPNPDGSYNVAYQLVD TGGAPGTVLAQNSYKVNKQWLRYAGHTASDEVFEKLTGIKGAFRTRIAYVVQTNGGQFPY ELRVSDYDGYNQFVVHRSPQPLMSPAWSPDGSKLAYVTFESGRSALVIQTLANGAVRQVA SFPRHNGAPAFSPDGSKLAFALSKTGSLNLYVMDLASGQIRQVTDGRSNNTEPTWFPDSQ NLAFTSDQAGRPQVYKVNINGGAPQRITWEGSQNQDADVSSDGKFMVMVSSNGGQQHIAK 
QDLATGGVQVLSSTFLDETPSLAPNGTMVIYSSSQGMGSVLNLVSTDGRFKARLPATDGQ VKFPAWSPYL >gi|296494441|gb|ADTN01000297.1| GENE 25 26043 - 26564 676 173 aa, chain + ## HITS:1 COG:ECs0776 KEGG:ns NR:ns ## COG: ECs0776 COG2885 # Protein_GI_number: 15830030 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Escherichia coli O157:H7 # 1 173 1 173 173 296 100.0 2e-80 MQLNKVLKGLMIALPVMAIAACSSNKNASNDGSEGMLGAGTGMDANGGNGNMSSEEQARL QMQQLQQNNIVYFDLDKYDIRSDFAQMLDAHANFLRSNPSYKVTVEGHADERGTPEYNIS LGERRANAVKMYLQGKGVSADQISIVSYGKEKPAVLGHDEAAYSKNRRAVLVY >gi|296494441|gb|ADTN01000297.1| GENE 26 26574 - 27365 554 263 aa, chain + ## HITS:1 COG:ybgF KEGG:ns NR:ns ## COG: ybgF COG1729 # Protein_GI_number: 16128717 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 263 1 263 263 395 100.0 1e-110 MSSNFRHQLLSLSLLVGIAAPWAAFAQAPISSVGSGSVEDRVTQLERISNAHSQLLTQLQ QQLSDNQSDIDSLRGQIQENQYQLNQVVERQKQILLQIDSLSSGGAAAQSTSGDQSGAAA STTPTADAGTANAGAPVKSGNANTDYNAAIALVQDKSRQDDAMVAFQNFIKNYPDSTYLP NANYWLGQLNYNKGKKDDAAYYFASVVKNYPKSPKAADAMFKVGVIMQDKGDTAKAKAVY QQVISKYPGTDGAKQAQKRLNAM Prediction of potential genes in microbial genomes Time: Mon May 16 00:10:15 2011 Seq name: gi|296494440|gb|ADTN01000298.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont750.3, whole genome shotgun sequence Length of sequence - 403 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 166 - 241 99.5 # Lys TTT 0 0 Prediction of potential genes in microbial genomes Time: Mon May 16 00:10:16 2011 Seq name: gi|296494439|gb|ADTN01000299.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont750.4, whole genome shotgun sequence Length of sequence - 5615 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op 
Conserved S Start End Score pairs(N/Pv) - TRNA 2454 - 2529 86.5 # Ala GGC 0 0 - TRNA 2569 - 2644 86.5 # Ala GGC 0 0 + Prom 2774 - 2833 5.6 2 2 Op 1 . + CDS 2865 - 3224 415 ## SDY_2596 hypothetical protein 3 2 Op 2 . + CDS 3226 - 3618 204 ## JW2394 predicted DNA-binding transcriptional regulator + Term 3627 - 3661 3.5 - Term 3610 - 3655 6.3 4 3 Tu 1 . - CDS 3670 - 5085 1663 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases - Prom 5144 - 5203 5.6 + TRNA 5344 - 5419 94.3 # Val TAC 0 0 + TRNA 5464 - 5539 92.4 # Val TAC 0 0 Predicted protein(s) >gi|296494439|gb|ADTN01000299.1| GENE 1 56 - 2245 1742 729 aa, chain - ## HITS:1 COG:yfeA_3 KEGG:ns NR:ns ## COG: yfeA_3 COG2200 # Protein_GI_number: 16130327 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 472 729 1 258 258 524 100.0 1e-148 MFVEHNLIKNIKIFTLAFTLTVVLIQLSRFISPLAIIHSSYIFLAWMPLCVMLSILFIFG WRGVVPVLCGMFCTNLWNFHLSFLQTAVMLGSQTFVVLCACAILRWQLGTRWRYGLTSRY VWQRLFWLGLVTPIGIKCSMYLVGSFFDFPLKISTFFGDADAIFTVVDLLSLFTAVLIYN MLFYYLTRMIVSPHFAQILWRRDIAPSLGKEKRAFTLSWLAALSVLLLLLCTPYENDFIA GYLVPVFFIIFTLGVGKLRYPFLNLTWAVSTLCLLNYNQNFLQGVETEYSLAFILAVLIS FSVCLLYMVRIYHRSEWLNRRWHLQALTDPLTLLPNFRALEQAPEQEAGKSFCCLRIDNL EFMSRHYGLMMRVHCIRSICRTLLPLMQENEKLYQLPGSELLLVLSGPETEGRLQHMVNI LNSRQIHWNNTGLDMGYGAAWGRFDGNQETLQPLLGQLSWLAEQSCAHHHVLALDSREEM VSGQTTKQVLLLNTIRTALDQGDLLLYAQPIRNKEGEGYDEILARLKYDGGIMTPDKFLP LIAQFNLSARFDLQVLESLLKWLATHPCDKKGPRFSVNLMPLTLLQKNIAGRIIRLFKRY HISPQAVILEITEEQAFSNAESSMYNIEQLHKFGFRIAIDDFGTGYANYERLKRLQADII KIDGVFVKDIVTNTLDAMIVRSITDLAKAKSLSVVAEFVETQQQQALLHKLGVQYLQGYL IGRPQPLAD >gi|296494439|gb|ADTN01000299.1| GENE 2 2865 - 3224 415 119 aa, chain + ## HITS:1 COG:no KEGG:SDY_2596 NR:ns ## KEGG: SDY_2596 # Name: yfeC # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 119 1 119 119 216 100.0 2e-55 MFKERMTPDELARLTGYSRQTINKWVRKEGWTTSPKPGVQGGKARLVHVNEQVREYIRNA ERPEGQGEAPALSGDAPLEVLLVTLAKEMTPVEQKQFTSLLLREGIIGLLQRLGIRDSK >gi|296494439|gb|ADTN01000299.1| 
GENE 3 3226 - 3618 204 130 aa, chain + ## HITS:1 COG:no KEGG:JW2394 NR:ns ## KEGG: JW2394 # Name: yfeD # Def: predicted DNA-binding transcriptional regulator # Organism: E.coli_J # Pathway: not_defined # 1 130 1 130 130 250 100.0 1e-65 MKRLRNKMTTEELAECLGVAKQTVNRWIREKGWKTEKFPGVKGGRARLILVDTQVCEFIQ NTPAFHNTPMLMEAEERIAEYAPGARAPAYRQIINAIDNMTDIEQEKVAQFLSREGIRNF LARLDIDESA >gi|296494439|gb|ADTN01000299.1| GENE 4 3670 - 5085 1663 471 aa, chain - ## HITS:1 COG:gltX KEGG:ns NR:ns ## COG: gltX COG0008 # Protein_GI_number: 16130330 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Escherichia coli K12 # 1 471 1 471 471 983 100.0 0 MKIKTRFAPSPTGYLHVGGARTALYSWLFARNHGGEFVLRIEDTDLERSTPEAIEAIMDG MNWLSLEWDEGPYYQTKRFDRYNAVIDQMLEEGTAYKCYCSKERLEALREEQMAKGEKPR YDGRCRHSHEHHADDEPCVVRFANPQEGSVVFDDQIRGPIEFSNQELDDLIIRRTDGSPT YNFCVVVDDWDMEITHVIRGEDHINNTPRQINILKALKAPVPVYAHVSMINGDDGKKLSK RHGAVSVMQYRDDGYLPEALLNYLVRLGWSHGDQEIFTREEMIKYFTLNAVSKSASAFNT DKLLWLNHHYINALPPEYVATHLQWHIEQENIDTRNGPQLADLVKLLGERCKTLKEMAQS CRYFYEDFAEFDADAAKKHLRPVARQPLEVVRDKLAAITDWTAENVHHAIQATADELEVG MGKVGMPLRVAVTGAGQSPALDVTVHAIGKTRSIERINKALDFIAERENQQ Prediction of potential genes in microbial genomes Time: Mon May 16 00:10:26 2011 Seq name: gi|296494438|gb|ADTN01000300.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont750.5, whole genome shotgun sequence Length of sequence - 15032 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 9, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 142 - 168 0.1 1 1 Op 1 . - CDS 252 - 1136 472 ## COG0583 Transcriptional regulator 2 1 Op 2 . - CDS 1177 - 1335 140 ## EcSMS35_2559 hypothetical protein - Term 1345 - 1379 4.3 3 2 Op 1 1/1.000 - CDS 1388 - 2644 1019 ## COG0477 Permeases of the major facilitator superfamily 4 2 Op 2 . 
- CDS 2704 - 3537 867 ## COG0005 Purine nucleoside phosphorylase - Prom 3743 - 3802 3.8 + Prom 3512 - 3571 3.9 5 3 Tu 1 . + CDS 3786 - 4550 726 ## ECS88_2597 hypothetical protein + Term 4552 - 4599 2.5 6 4 Tu 1 . - CDS 4589 - 5515 588 ## COG0583 Transcriptional regulator - Prom 5541 - 5600 3.7 + Prom 5489 - 5548 5.0 7 5 Tu 1 . + CDS 5605 - 6603 725 ## COG0385 Predicted Na+-dependent transporter + Term 6614 - 6659 3.1 - Term 6553 - 6595 -0.5 8 6 Op 1 3/0.667 - CDS 6600 - 6818 272 ## COG3530 Uncharacterized protein conserved in bacteria 9 6 Op 2 7/0.333 - CDS 6820 - 8835 2207 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II) - Term 8846 - 8874 1.3 10 6 Op 3 . - CDS 8906 - 9889 753 ## COG3115 Cell division protein - Prom 10065 - 10124 2.7 + Prom 10024 - 10083 3.3 11 7 Tu 1 8/0.333 + CDS 10248 - 10883 571 ## COG2981 Uncharacterized protein involved in cysteine biosynthesis + Term 10923 - 10955 4.1 + Prom 10976 - 11035 4.4 12 8 Tu 1 6/0.667 + CDS 11068 - 12039 702 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 + Term 12068 - 12095 0.1 + Prom 12200 - 12259 5.8 13 9 Op 1 25/0.000 + CDS 12423 - 12680 388 ## COG1925 Phosphotransferase system, HPr-related proteins 14 9 Op 2 10/0.000 + CDS 12725 - 14452 1776 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) 15 9 Op 3 . 
+ CDS 14493 - 15002 694 ## COG2190 Phosphotransferase system IIA components Predicted protein(s) >gi|296494438|gb|ADTN01000300.1| GENE 1 252 - 1136 472 294 aa, chain - ## HITS:1 COG:xapR KEGG:ns NR:ns ## COG: xapR COG0583 # Protein_GI_number: 16130331 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 294 1 294 294 564 100.0 1e-161 MERVYRTDLKLLRYFLAVAEELHFGRAAARLNMSQPPLSIHIKELENQLGTQLFIRHSRS VVLTHAGKILMEESRRLLVNANNVLARIEQIGRGEAGRIELGVVGTAMWGRMRPVMRRFL RENPNVDVLFREKMPAMQMALLERRELDAGIWRMATEPPTGFTSLRLHESAFLVAMPEEH HLSSFSTVPLEALRDEYFVTMPPVYTDWDFLQRVCQQVGFSPVVIREVNEPQTVLAMVSM GIGITLIADSYAQMNWPGVIFRPLKQRIPADLYIVYETQQVTPAMVKLLAALTQ >gi|296494438|gb|ADTN01000300.1| GENE 2 1177 - 1335 140 52 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_2559 NR:ns ## KEGG: EcSMS35_2559 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 52 1 52 52 68 98.0 7e-11 MKDILFSSGVGFGISALFTIVRLPIPVPNVLPGILSIVFMYVGYLVVKYFMP >gi|296494438|gb|ADTN01000300.1| GENE 3 1388 - 2644 1019 418 aa, chain - ## HITS:1 COG:xapB KEGG:ns NR:ns ## COG: xapB COG0477 # Protein_GI_number: 16130332 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 418 1 418 418 706 100.0 0 MSIAMRLKVMSFLQYFIWGSWLVTLGSYMINTLHFTGANVGMVYSSKGIAAIIMPGIMGI IADKWLRAERAYMLCHLVCAGVLFYAASVTDPDMMFWVMLVNAMAFMPTIALSNSVSYSC LAQAGLDPVTAFPPIRVFGTVGFIVAMWAVSLLHLELSSLQLYIASGASLLLSAYALTLP KIPVAEKKATTSLASKLGLDAFVLFKNPRMAIFFLFAMMLGAVLQITNVFGNPFLHDFAR NPEFADSFVVKYPSILLSVSQMAEVGFILTIPFFLKRFGIKTVMLMSMVAWTLRFGFFAY GDPSTTGFILLLLSMIVYGCAFDFFNISGSVFVEQEVDSSIRASAQGLFMTMVNGVGAWV GSILSGMAVDYFSVDGVKDWQTIWLVFAGYALFLAVIFFFGFKYNHDPEKIKHRAVTH >gi|296494438|gb|ADTN01000300.1| GENE 4 2704 - 3537 867 277 aa, chain - ## HITS:1 COG:xapA KEGG:ns NR:ns ## COG: xapA COG0005 
# Protein_GI_number: 16130333 # Func_class: F Nucleotide transport and metabolism # Function: Purine nucleoside phosphorylase # Organism: Escherichia coli K12 # 1 277 1 277 277 563 100.0 1e-160 MSQVQFSHNPLFCIDIIKTYKPDFTPRVAFILGSGLGALADQIENAVAISYEKLPGFPVS TVHGHAGELVLGHLQGVPVVCMKGRGHFYEGRGMTIMTDAIRTFKLLGCELLFCTNAAGS LRPEVGAGSLVALKDHINTMPGTPMVGLNDDRFGERFFSLANAYDAEYRALLQKVAKEEG FPLTEGVFVSYPGPNFETAAEIRMMQIIGGDVVGMSVVPEVISARHCDLKVVAVSAITNM AEGLSDVKLSHAQTLAAAELSKQNFINLICGFLRKIA >gi|296494438|gb|ADTN01000300.1| GENE 5 3786 - 4550 726 254 aa, chain + ## HITS:1 COG:no KEGG:ECS88_2597 NR:ns ## KEGG: ECS88_2597 # Name: yfeN # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 254 1 254 254 442 96.0 1e-123 MKKHLLTLTLSSILAIPVVSHAEFKGGFADIGVHYLDWTSRTTEKSSTKSHKDDFGYLEF EGGANFSWGEMYGFFDWENFYNGRHNKPGSEQRYTFKNTNRIYLGDTGFNLYLHAYGTYG SANRVNFHDDMFLYGIGYNFTGSGWWFKPFFAKRYTDQTYYTGDNGYVAGWVAGYNFMLG SEKFTLTNWNEYEFDRDATYAAGNGGKEGLNGAVALWWNATSHITTGIQYRYADDKLGED FYQDAIIYSIKFNF >gi|296494438|gb|ADTN01000300.1| GENE 6 4589 - 5515 588 308 aa, chain - ## HITS:1 COG:yfeR KEGG:ns NR:ns ## COG: yfeR COG0583 # Protein_GI_number: 16130335 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 308 1 308 308 575 100.0 1e-164 MNYSLKQLKVFVTVAQEKSFSRAGERIGLSQSAVSHSVKELENHTGVRLLDRTTREVVLT DAGQQLALRLERLLDELNSTLRDTGRMGQQLSGKVRVAASQTISAHLIPQCIAESHRRYP DIQFVLHDRPQQWVMESIRQGDVDFGIVIDPGPVGDLQCEAILSEPFFLLCHRDSALAVE DYVPWQALQGAKLVLQDYASGSRPLIDAALARNGIQANIVQEIGHPATLFPMVAAGIGIS ILPALALPLPEGSPLVVKRITPVVERQLMLVRRKNRSLSTAAEALWDVVRDQGNALMAGR EGDPLYQI >gi|296494438|gb|ADTN01000300.1| GENE 7 5605 - 6603 725 332 aa, chain + ## HITS:1 COG:yfeH KEGG:ns NR:ns ## COG: yfeH COG0385 # Protein_GI_number: 16130336 # Func_class: R General function prediction only # Function: Predicted Na+-dependent transporter # Organism: Escherichia coli K12 # 1 332 1 332 332 592 100.0 1e-169 MKLFRILDPFTLTLITVVLLASFFPARGDFVPFFENLTTAAIALLFFMHGAKLSREAIIA 
GGGHWRLHLWVMCSTFVLFPILGVLFAWWKPVNVDPMLYSGFLYLCILPATVQSAIAFTS MAGGNVAAAVCSASASSLLGIFLSPLLVGLVMNVHGAGGSLEQVGKIMLQLLLPFVLGHL SRPWIGDWVSRNKKWIAKTDQTSILLVVYTAFSEAVVNGIWHKVGWGSLLFIVVVSCVLL AIVIVVNVFMARRLSFNKADEITIVFCGSKKSLANGIPMANILFPTSVIGMMVLPLMIFH QIQLMVCAVLARRYKRQTEQLQAQQESSADKA >gi|296494438|gb|ADTN01000300.1| GENE 8 6600 - 6818 272 72 aa, chain - ## HITS:1 COG:Z3676 KEGG:ns NR:ns ## COG: Z3676 COG3530 # Protein_GI_number: 15802943 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 EDL933 # 1 72 1 72 72 123 100.0 9e-29 MEKEQLIEIANTIMPFGKYKGRRLIDLPEEYLLWFARKDEFPAGKLGELMQITLLIKTEG LTQLVQPLKRPL >gi|296494438|gb|ADTN01000300.1| GENE 9 6820 - 8835 2207 671 aa, chain - ## HITS:1 COG:lig KEGG:ns NR:ns ## COG: lig COG0272 # Protein_GI_number: 16130337 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Escherichia coli K12 # 1 671 1 671 671 1317 99.0 0 MESIEQQLTELRTTLRHHEYLYHVMDAPEIPDAEYDRLMRELRELETKHPELITPDSPTQ RVGAAPLAAFSQIRHEVPMLSLDNVFDEESFLAFNKRVQDRLKNNEKVTWCCELKLDGLA VSILYENGVLVSAATRGDGTTGEDITSNVRTIRAIPLKLHGENIPARLEVRGEVFLPQAG FEKINEDARRTGGKVFSNPRNAAAGSLRQLDPRITAKRPLTFFCYGVGVLEGGELPDTHL GRLLQFKKWGLPVSDRVTLCESAEEVLAFYHKVEEDRPTLGFDIDGVVIKVNSLAQQEQL GFVARAPRWAVAFKFPAQEQMTFVRDVEFQVGRTGAITPVARLEPVHVAGVLVSNATLHN ADEIERLGLRIGDKVVIRRAGDVIPQVVNVVLSERPEDTREVVFPTHCPVCGSDVERVEG EAVARCTGGLICGAQRKESLKHFVSRRAMDVDGMGDKIIDQLVEKEYVHTPADLFKLTAG KLTGLERMGPKSAQNVVNALEKAKETTFARFLYALGIREVGEATAAGLAAYFGTLEALEA ASIEELQKVPDVGIVVASHVHNFFAEESNRNVISELLAEGVHWPAPIVINAEEIDSPFAG KTVVLTGSLSQMSRDDAKARLVELGAKVAGSVSKKTDLVIAGEAAGSKLAKAQELGIEVI DEAEMLRLLGS >gi|296494438|gb|ADTN01000300.1| GENE 10 8906 - 9889 753 327 aa, chain - ## HITS:1 COG:zipA KEGG:ns NR:ns ## COG: zipA COG3115 # Protein_GI_number: 16130338 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Escherichia coli K12 # 1 
327 2 328 328 476 100.0 1e-134 MQDLRLILIIVGAIAIIALLVHGFWTSRKERSSMFRDRPLKRMKSKRDDDSYDEDVEDDE GVGEVRVHRVNHAPANAQEHEAARPSPQHQYQPPYASAQPRQPVQQPPEAQVPPQHAPHP AQPVQQPAYQPQPEQPLQQPVSPQVAPAPQPVHSAPQPAQQAFQPAEPVAAPQPEPVAEP APVMDKPKRKEAVIIMNVAAHHGSELNGELLLNSIQQAGFIFGDMNIYHRHLSPDGSGPA LFSLANMVKPGTFDPEMKDFTTPGVTIFMQVPSYGDELQNFKLMLQSAQHIADEVGGVVL DDQRRMMTPQKLREYQDIIREVKDANA >gi|296494438|gb|ADTN01000300.1| GENE 11 10248 - 10883 571 211 aa, chain + ## HITS:1 COG:ECs3285 KEGG:ns NR:ns ## COG: ECs3285 COG2981 # Protein_GI_number: 15832539 # Func_class: E Amino acid transport and metabolism # Function: Uncharacterized protein involved in cysteine biosynthesis # Organism: Escherichia coli O157:H7 # 1 211 43 253 253 383 100.0 1e-106 MGGAFWWLFTQLDVWIPTLMSYVPDWLQWLSYLLWPLAVISVLLVFGYFFSTIANWIAAP FNGLLAEQLEARLTGATPPDTGIFGIMKDVPRIMKREWQKFAWYLPRAIVLLILYFIPGI GQTVAPVLWFLFSAWMLAIQYCDYPFDNHKVPFKEMRTALRTRKITNMQFGALTSLFTMI PLLNLFIMPVAVCGATAMWVDCYRDKHAMWR >gi|296494438|gb|ADTN01000300.1| GENE 12 11068 - 12039 702 323 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 4 312 3 304 308 275 49 2e-73 MSKIFEDNSLTIGHTPLVRLNRIGNGRILAKVESRNPSFSVKCRIGANMIWDAEKRGVLK PGVELVEPTSGNTGIALAYVAAARGYKLTLTMPETMSIERRKLLKALGANLVLTEGAKGM KGAIQKAEEIVASNPEKYLLLQQFSNPANPEIHEKTTGPEIWEDTDGQVDVFIAGVGTGG TLTGVSRYIKGTKGKTDLISVAVEPTDSPVIAQALAGEEIKPGPHKIQGIGAGFIPANLD LKLVDKVIGITNEEAISTARRLMEEEGILAGISSGAAVAAALKLQEDESFTNKNIVVILP SSGERYLSTALFADLFTEKELQQ >gi|296494438|gb|ADTN01000300.1| GENE 13 12423 - 12680 388 85 aa, chain + ## HITS:1 COG:ECs3287 KEGG:ns NR:ns ## COG: ECs3287 COG1925 # Protein_GI_number: 15832541 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Escherichia coli O157:H7 # 1 85 1 85 85 129 100.0 2e-30 MFQQEVTITAPNGLHTRPAAQFVKEAKGFTSEITVTSNGKSASAKSLFKLQTLGLTQGTV VTISAEGEDEQKAVEHLVKLMAELE >gi|296494438|gb|ADTN01000300.1| GENE 14 12725 - 14452 1776 575 aa, 
chain + ## HITS:1 COG:ptsI KEGG:ns NR:ns ## COG: ptsI COG1080 # Protein_GI_number: 16130342 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Escherichia coli K12 # 1 575 1 575 575 1040 99.0 0 MISGILASPGIAFGKALLLKEDEIVIDRKKISADQVDQEVERFLSGRAKASAQLETIKTK AGETFGEEKEAIFEGHIMLLEDEELEQEIIALIKDKHMTADAAAHEVIEGQASALEELDD EYLKERAADVRDIGKRLLRNILGLKIIDLSAIQDEVILVAADLTPSETAQLNLKKVLGFI TDAGGRTSHTSIMARSLELPAIVGTGSVTSQVKNDDYLILDAVNNQVYVNPTNEVIDKMR AVQEQVASEKAELAKLKDLPAITLDGHQVEVCANIGTVRDVEGAERNGAEGVGLYRTEFL FMDRDALPTEEEQFAAYKAVAEACGSQAVIVRTMDIGGDKELPYMNFPKEENPFLGWRAI RIAMDRREILRDQLRAILRASAFGKLRIMFPMIISVEEVRALRKEIEIYKQELRDEGKAF DESIEIGVMVETPAAATIARHLAKEVDFFSIGTNDLTQYTLAVDRGNDMISPLYQPMSPS VLNLIKQVIDASHAEGKWTGMCGELAGDERATLLLLGMGLDEFSMSAISIPRIKKIIRNT NFEDAKVLAEQALAQPTTDELMTLVNKFIEEKTIC >gi|296494438|gb|ADTN01000300.1| GENE 15 14493 - 15002 694 169 aa, chain + ## HITS:1 COG:crr KEGG:ns NR:ns ## COG: crr COG2190 # Protein_GI_number: 16130343 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Escherichia coli K12 # 1 169 1 169 169 285 100.0 4e-77 MGLFDKLKSLVSDDKKDTGTIEIIAPLSGEIVNIEDVPDVVFAEKIVGDGIAIKPTGNKM VAPVDGTIGKIFETNHAFSIESDSGVELFVHFGIDTVELKGEGFKRIAEEGQRVKVGDTV IEFDLPLLEEKAKSTLTPVVISNMDEIKELIKLSGSVTVGETPVIRIKK Prediction of potential genes in microbial genomes Time: Mon May 16 00:10:41 2011 Seq name: gi|296494437|gb|ADTN01000301.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont750.6, whole genome shotgun sequence Length of sequence - 26837 bp Number of predicted genes - 28, with homology - 28 Number of transcription units - 13, operones - 7 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 19 - 870 830 ## COG2240 Pyridoxal/pyridoxine/pyridoxamine kinase - Prom 953 - 1012 3.1 + Prom 825 - 884 2.5 2 2 Op 1 . + CDS 975 - 1349 393 ## JW2412 hypothetical protein 3 2 Op 2 . 
+ CDS 1382 - 2116 545 ## COG4884 Uncharacterized protein conserved in bacteria + Term 2241 - 2277 2.4 4 3 Op 1 . - CDS 2305 - 3294 554 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 5 3 Op 2 17/0.000 - CDS 3350 - 4447 1368 ## COG1118 ABC-type sulfate/molybdate transport systems, ATPase component 6 3 Op 3 17/0.000 - CDS 4437 - 5312 1206 ## COG4208 ABC-type sulfate transport system, permease component 7 3 Op 4 7/0.000 - CDS 5312 - 6145 1033 ## COG0555 ABC-type sulfate transport system, permease component 8 3 Op 5 1/0.500 - CDS 6145 - 7161 1459 ## COG4150 ABC-type sulfate transport system, periplasmic component - Prom 7183 - 7242 4.8 - Term 7416 - 7452 5.1 9 4 Tu 1 . - CDS 7465 - 8256 248 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 - Prom 8305 - 8364 3.9 10 5 Tu 1 . - CDS 8385 - 9242 758 ## COG1737 Transcriptional regulators - Prom 9374 - 9433 4.2 + Prom 9306 - 9365 6.4 11 6 Op 1 9/0.000 + CDS 9406 - 10302 833 ## COG2103 Predicted sugar phosphate isomerase 12 6 Op 2 1/0.500 + CDS 10306 - 11730 1602 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 13 6 Op 3 . + CDS 11735 - 13039 1446 ## COG1680 Beta-lactamase class C and other penicillin binding proteins + Term 13228 - 13267 5.3 - Term 13211 - 13260 1.5 14 7 Tu 1 . - CDS 13279 - 14178 1107 ## COG2837 Predicted iron-dependent peroxidase - Term 14187 - 14218 2.1 15 8 Op 1 . - CDS 14274 - 14849 413 ## SSON_2521 hypothetical protein 16 8 Op 2 . - CDS 14910 - 15359 305 ## G2583_2964 hypothetical protein 17 8 Op 3 . - CDS 15346 - 15771 607 ## COG0456 Acetyltransferases - Prom 15799 - 15858 4.3 + Prom 15817 - 15876 4.2 18 9 Op 1 4/0.167 + CDS 15985 - 16854 782 ## COG0860 N-acetylmuramoyl-L-alanine amidase 19 9 Op 2 . 
+ CDS 16858 - 17757 712 ## COG0408 Coproporphyrinogen III oxidase + Term 17999 - 18032 -0.6 20 10 Op 1 2/0.333 - CDS 17763 - 18815 995 ## COG2207 AraC-type DNA-binding domain-containing proteins 21 10 Op 2 4/0.167 - CDS 18861 - 19361 439 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein 22 10 Op 3 6/0.167 - CDS 19374 - 20033 768 ## COG4816 Ethanolamine utilization protein 23 10 Op 4 8/0.000 - CDS 20043 - 20930 1052 ## COG4302 Ethanolamine ammonia-lyase, small subunit 24 10 Op 5 . - CDS 20951 - 22312 1665 ## COG4303 Ethanolamine ammonia-lyase, large subunit + Prom 22349 - 22408 3.7 25 11 Tu 1 . + CDS 22645 - 23706 348 ## PROTEIN SUPPORTED gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 + Term 23715 - 23743 1.0 + Prom 23781 - 23840 5.2 26 12 Tu 1 . + CDS 23933 - 24700 540 ## gi|301648099|ref|ZP_07247862.1| hypothetical protein HMPREF9543_04600 + Term 24709 - 24740 1.1 + Prom 25869 - 25928 3.8 27 13 Op 1 . + CDS 26030 - 26227 76 ## gi|301648102|ref|ZP_07247865.1| conserved domain protein 28 13 Op 2 . 
+ CDS 26224 - 26796 170 ## gi|301648103|ref|ZP_07247866.1| HNH endonuclease domain protein Predicted protein(s) >gi|296494437|gb|ADTN01000301.1| GENE 1 19 - 870 830 283 aa, chain - ## HITS:1 COG:pdxK KEGG:ns NR:ns ## COG: pdxK COG2240 # Protein_GI_number: 16130344 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal/pyridoxine/pyridoxamine kinase # Organism: Escherichia coli K12 # 1 283 1 283 283 556 100.0 1e-158 MSSLLLFNDKSRALQADIVAVQSQVVYGSVGNSIAVPAIKQNGLNVFAVPTVLLSNTPHY DTFYGGAIPDEWFSGYLRALQERDALRQLRAVTTGYMGTASQIKILAEWLTALRKDHPDL LIMVDPVIGDIDSGIYVKPDLPEAYRQYLLPLAQGITPNIFELEILTGKNCRDLDSAIAA AKSLLSDTLKWVVVTSASGNEENQEMQVVVVTADSVNVISHSRVKTDLKGTGDLFCAQLI SGLLKGKALTDAVHRAGLRVLEVMRYTQQHESDELILPPLAEA >gi|296494437|gb|ADTN01000301.1| GENE 2 975 - 1349 393 124 aa, chain + ## HITS:1 COG:no KEGG:JW2412 NR:ns ## KEGG: JW2412 # Name: yfeK # Def: hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 124 1 124 124 244 100.0 9e-64 MKKIICLVITLLMTLPVYAKLTAHEEARINAMLEGLAQKKDLIFVRNGDEHTCYEAVSHL RLKLGNTRNRIDTAEQFIDKVASSSSITGKPYIVKIPGKSDENAQPFLHALIAQTDKTVP AEGN >gi|296494437|gb|ADTN01000301.1| GENE 3 1382 - 2116 545 244 aa, chain + ## HITS:1 COG:yfeS_2 KEGG:ns NR:ns ## COG: yfeS_2 COG4884 # Protein_GI_number: 16130346 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 73 244 1 172 172 355 100.0 3e-98 MKKRFIYHDEKSNKFWWIDYEGDSLAVNYGKVGSIGKFQTKEFDNEEQCLKEASKLIAAK MKKGYQEDPKFNFMDRYYFDDEEIGLHVKTSHPNFQCHFTDPLYMCCWDEESPFGSDEGA DALNVLENSLRKEPDLDCADFPQMLIETMWGMKYIAMDSILEEDVRAQLLVDEMSTIQSN MITYATAFGQIKVMGKISHKLKKMGLNALARHQLTAKILQWGDGQDSPILQKMIDDLTAF PHEN >gi|296494437|gb|ADTN01000301.1| GENE 4 2305 - 3294 554 329 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 24 318 1 304 308 218 39 4e-56 MARFVTCRPDKTRKRRIRQHHVWIEIVSTLEQTIGNTPLVKLQRMGPDNGSEVWLKLEGN 
NPAGSVKDRAALSMIVEAEKRGEIKPGDVLIEATSGNTGIALAMIAALKGYRIKLLMPDN MSQERRAAMRAYGAELILVTKEQGMEGARDLALEMANRGEGKLLDQFNNPDNPYAHYTTT GPEIWQQTGGRITHFVSSMGTTGTITGVSRFMREQSKPVTIVGLQPEEGSSIPGIRRWPT EYLPGIFNASLVDEVLDIHQRDAENTMRELAVREGIFCGVSSGGAVAGALRVAKANPDAV VVAIICDRGDRYLSTGVFGEEHFSQGAGI >gi|296494437|gb|ADTN01000301.1| GENE 5 3350 - 4447 1368 365 aa, chain - ## HITS:1 COG:cysA KEGG:ns NR:ns ## COG: cysA COG1118 # Protein_GI_number: 16130348 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type sulfate/molybdate transport systems, ATPase component # Organism: Escherichia coli K12 # 1 365 1 365 365 722 100.0 0 MSIEIANIKKSFGRTQVLNDISLDIPSGQMVALLGPSGSGKTTLLRIIAGLEHQTSGHIR FHGTDVSRLHARDRKVGFVFQHYALFRHMTVFDNIAFGLTVLPRRERPNAAAIKAKVTKL LEMVQLAHLADRYPAQLSGGQKQRVALARALAVEPQILLLDEPFGALDAQVRKELRRWLR QLHEELKFTSVFVTHDQEEATEVADRVVVMSQGNIEQADAPDQVWREPATRFVLEFMGEV NRLQGTIRGGQFHVGAHRWPLGYTPAYQGPVDLFLRPWEVDISRRTSLDSPLPVQVLEAS PKGHYTQLVVQPLGWYNEPLTVVMHGDDAPQRGERLFVGLQHARLYNGDERIETRDEELA LAQSA >gi|296494437|gb|ADTN01000301.1| GENE 6 4437 - 5312 1206 291 aa, chain - ## HITS:1 COG:cysWm KEGG:ns NR:ns ## COG: cysWm COG4208 # Protein_GI_number: 16132224 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type sulfate transport system, permease component # Organism: Escherichia coli K12 # 1 291 1 291 291 538 100.0 1e-153 MAEVTQLKRYDARPINWGKWFLIGIGMLVSAFILLVPMIYIFVQAFSKGLMPVLQNLADP DMLHAIWLTVMIALIAVPVNLVFGILLAWLVTRFNFPGRQLLLTLLDIPFAVSPVVAGLV YLLFYGSNGPLGGWLDEHNLQIMFSWPGMVLVTIFVTCPFVVRELVPVMLSQGSQEDEAA ILLGASGWQMFRRVTLPNIRWALLYGVVLTNARAIGEFGAVSVVSGSIRGETLSLPLQIE LLEQDYNTVGSFTAAALLTLMAIITLFLKSMLQWRLENQEKRAQQEEHHEH >gi|296494437|gb|ADTN01000301.1| GENE 7 5312 - 6145 1033 277 aa, chain - ## HITS:1 COG:cysU KEGG:ns NR:ns ## COG: cysU COG0555 # Protein_GI_number: 16130349 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type sulfate transport system, permease component # Organism: Escherichia coli K12 # 1 277 1 277 277 439 
100.0 1e-123 MFAVSSRRVLPGFTLSLGTSLLFVCLILLLPLSALVMQLAQMSWAQYWEVITNPQVVAAY KVTLLSAFVASIFNGVFGLLMAWILTRYRFPGRTLLDALMDLPFALPTAVAGLTLASLFS VNGFYGEWLAKFDIKVTYTWLGIAVAMAFTSIPFVVRTVQPVLEELGPEYEEAAETLGAT RWQSFCKVVLPELSPALVAGVALSFTRSLGEFGAVIFIAGNIAWKTEVTSLMIFVRLQEF DYPAASAIASVILAASLLLLFSINTLQSRFGRRVVGH >gi|296494437|gb|ADTN01000301.1| GENE 8 6145 - 7161 1459 338 aa, chain - ## HITS:1 COG:cysP KEGG:ns NR:ns ## COG: cysP COG4150 # Protein_GI_number: 16130350 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type sulfate transport system, periplasmic component # Organism: Escherichia coli K12 # 1 338 1 338 338 652 100.0 0 MAVNLLKKNSLALVASLLLAGHVQATELLNSSYDVSRELFAALNPPFEQQWAKDNGGDKL TIKQSHAGSSKQALAILQGLKADVVTYNQVTDVQILHDKGKLIPADWQSRLPNNSSPFYS TMGFLVRKGNPKNIHDWNDLVRSDVKLIFPNPKTSGNARYTYLAAWGAADKADGGDKGKT EQFMTQFLKNVEVFDTGGRGATTTFAERGLGDVLISFESEVNNIRKQYEAQGFEVVIPKT NILAEFPVAWVDKNVQANGTEKAAKAYLNWLYSPQAQTIITDYYYRVNNPEVMDKLKDKF PQTELFRVEDKFGSWPEVMKTHFTSGGELDKLLAAGRN >gi|296494437|gb|ADTN01000301.1| GENE 9 7465 - 8256 248 263 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 7 251 4 238 242 100 30 1e-20 MGKLTGKTALITGALQGIGEGIARTFARHGANLILLDISPEIEKLADELCGRGHRCTAVV ADVRDPASVAAAIKRAKEKEGRIDILVNNAGVCRLGSFLDMSDDDRDFHIDINIKGVWNV TKAVLPEMIARKDGRIVMMSSVTGDMVADPGETAYALTKAAIVGLTKSLAVEYAQSGIRV NAICPGYVRTPMAESIARQSNPEDPESVLTEMAKAIPMRRLADPLEVGELAAFLASDESS YLTGTQNVIDGGSTLPETVSVGI >gi|296494437|gb|ADTN01000301.1| GENE 10 8385 - 9242 758 285 aa, chain - ## HITS:1 COG:yfeT KEGG:ns NR:ns ## COG: yfeT COG1737 # Protein_GI_number: 16130352 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli K12 # 1 285 1 285 285 478 100.0 1e-135 MLYLTKISNAGSEFTENEQKIADFLQANVSELQSVSSRQMAKQLGISQSSIVKFAQKLGA QGFTELRMALIGEYSASREKTNATALHLHSSITSDDSLEVIARKLNREKELALEQTCALL DYARLQKIIEVISKAPFIQITGLGGSALVGRDLSFKLMKIGYRVACEADTHVQATVSQAL 
KKGDVQIAISYSGSKKEIVLCAEAARKQGATVIAITSLTDSPLRRLAHFTLDTVSGETEW RSSSMSTRTAQNSVTDLLFVGLVQLNDVESLKMIQRSSELTQRLK >gi|296494437|gb|ADTN01000301.1| GENE 11 9406 - 10302 833 298 aa, chain + ## HITS:1 COG:yfeU KEGG:ns NR:ns ## COG: yfeU COG2103 # Protein_GI_number: 16130353 # Func_class: R General function prediction only # Function: Predicted sugar phosphate isomerase # Organism: Escherichia coli K12 # 1 298 1 298 298 547 100.0 1e-156 MQFEKMITEGSNTASAEIDRVSTLEMCRIINDEDKTVPLAVERVLPDIAAAIDVIHAQVS GGGRLIYLGAGTSGRLGILDASECPPTYGVKPGLVVGLIAGGEYAIQHAVEGAEDSREGG VNDLKNINLTAQDVVVGIAASGRTPYVIAGLEYARQLGCRTVGISCNPGSAVSTTAEFAI TPIVGAEVVTGSSRMKAGTAQKLVLNMLSTGLMIKSGKVFGNLMVDVVATNEKLHVRQVN IVKNATGCSAEQAEAALIACERNCKTAIVMVLKNLDAAEAKKRLDQHGGFIRQVLDKE >gi|296494437|gb|ADTN01000301.1| GENE 12 10306 - 11730 1602 474 aa, chain + ## HITS:1 COG:yfeV_2 KEGG:ns NR:ns ## COG: yfeV_2 COG1263 # Protein_GI_number: 16130354 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Escherichia coli K12 # 101 474 1 374 374 655 100.0 0 MAKEISSELLNTILTRVGGPGNIASCGNCMTRLRLGVHDSSLVDPNIKTLEGVKGVILTS DQVQVVFGPGKAHRAAKAMSELLGEAPVQDAAEIAAQNKRQLKAKQTSGVQQFLAKFATI FTPLIPGFIAAGLLLGIATLIATVMHVPADAQGTLPDALNFMKVFSKGLFTFLVILVGYN AAQAFGGTGVNGAIIAALFLLGYNPAATTGYYAGFHDFFGLPIDPRGNIIGVLIAAWACA RIEGMVRRFMPDDLDMLLTSLITLLITATLAYLIIMPLGGWLFEGMSWLFMHLNSNPFGC AVLAGLFLIAVVFGVHQGFIPVYLALMDSQGFNSLFPILSMAGAGQVGAALALYWRAQPH SALRSQVRGAIIPGLLGVGEPLIYGVTLPRMKPFVTACLGGAAGGLFIGLIAWWGLPMGL NSAFGPSGLVALPLMTSAQGILPAMAVYAGGILVAWVCGFIFTTLFGCRNVNLD >gi|296494437|gb|ADTN01000301.1| GENE 13 11735 - 13039 1446 434 aa, chain + ## HITS:1 COG:yfeW KEGG:ns NR:ns ## COG: yfeW COG1680 # Protein_GI_number: 16130355 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Escherichia coli K12 # 1 434 30 463 463 879 100.0 0 MKRTMLYLSLLAVSCSVSAAKYPVLTESSPEKAGFNVERLNQMDRWISQQVDVGYPSVNL 
LIIKDNQIVYRKAWGAAKKYDGSVLMEQPVKATTGTLYDLASNTKMYATNFALQKLMSEG KLHPDDRIAKYIPGFADSPNDTIKGKNTLRISDLLHHSGGFPADPQYPNKAVAGALYSQD KGQTLEMIKRTPLEYQPGSKHIYSDVDYMLLGFIVESVTGQPLDRYVEESIYRPLGLTHT VFNPLLKGFKPQQIAATELNGNTRDGVIHFPNIRTSTLWGQVHDEKAFYSMGGVSGHAGL FSNTGDIAVLMQTMLNGGGYGDVQLFNAETVKMFTTSSKEDATFGLGWRVNGNATMTPTF GTLASPQTYGHTGWTGTVTVIDPVNHMTIVMLSNKPHSPVADPQKNPNMFESGQLPIATY GWVVDQVYAALKQK >gi|296494437|gb|ADTN01000301.1| GENE 14 13279 - 14178 1107 299 aa, chain - ## HITS:1 COG:yfeX KEGG:ns NR:ns ## COG: yfeX COG2837 # Protein_GI_number: 16130356 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted iron-dependent peroxidase # Organism: Escherichia coli K12 # 1 299 10 308 308 615 99.0 1e-176 MSQVQSGILPEHCRAAIWIEANVKGEVDALRAASKTFADKLATFEAKFPDAHLGAVVAFG NNTWRALSGGVGAEELKDFPGYGKGLAPTTQFDVLIHILSLRHDVNFSVAQAAMEAFGDC IEVKEEIHGFRWVEERDLSGFVDGTENPAGEETRREVAVIKDGVDAGGSYVFVQRWEHNL KQLNRMSVHDQEMMIGRTKEANEEIDGDERPETSHLTRVDLKEDGKGLKIVRQSLPYGTA SGTHGLYFCAYCARLHNIEQQLLSMFGDTDGKRDAMLRFTKPVTGGYYFAPSLDKLMAL >gi|296494437|gb|ADTN01000301.1| GENE 15 14274 - 14849 413 191 aa, chain - ## HITS:1 COG:no KEGG:SSON_2521 NR:ns ## KEGG: SSON_2521 # Name: not_defined # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 191 1 191 191 381 100.0 1e-105 MKSLRLMLCAMPLMLTGCSTMSSVNWSAANPWNWFGSSTKVSEQGVGELTASTPLQEQAI ADALDGDYRLRSGMKTANGNVVRFFEVMKGDNVAMVINGDQGTISRIDVLDSDIPADTGV KIGTPFSDLYSKAFGNCQKADGDDNRAVECKAEGSQHISYQFSGEWRGPEGLMPSDDTLK NWKVSKIIWRR >gi|296494437|gb|ADTN01000301.1| GENE 16 14910 - 15359 305 149 aa, chain - ## HITS:1 COG:no KEGG:G2583_2964 NR:ns ## KEGG: G2583_2964 # Name: yfeZ # Def: hypothetical protein # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 149 3 151 151 239 100.0 3e-62 MKSTEFHPVHYDAHGRLRLPLLFWLVLLLQARTWVLFVIAGASREQGTALLNLFYPDHDN FWLGLIPGIPAVLAFLLSGRRATFPRTWRVLYFLLLLAQVVLLCWQPWLWLNGESVSGIG LALVVADIVALIWLLTNRRLRACFNEVKE >gi|296494437|gb|ADTN01000301.1| GENE 17 15346 - 15771 607 141 aa, chain - ## HITS:1 COG:ypeA KEGG:ns NR:ns ## COG: 
ypeA COG0456 # Protein_GI_number: 16130359 # Func_class: R General function prediction only # Function: Acetyltransferases # Organism: Escherichia coli K12 # 1 141 38 178 178 294 100.0 3e-80 MEIRVFRQEDFEEVITLWERCDLLRPWNDPEMDIERKMNHDVSLFLVAEVNGDVVGTVMG GYDGHRGSAYYLGVHPEFRGRGIANALLNRLEKKLIARGCPKIQINVPEDNDMVLGMYER LGYEHADVLSLGKRLIEDEEY >gi|296494437|gb|ADTN01000301.1| GENE 18 15985 - 16854 782 289 aa, chain + ## HITS:1 COG:amiA KEGG:ns NR:ns ## COG: amiA COG0860 # Protein_GI_number: 16130360 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Escherichia coli K12 # 1 289 1 289 289 531 100.0 1e-151 MSTFKPLKTLTSRRQVLKAGLAALTLSGMSQAIAKDELLKTSNGHSKPKAKKSGGKRVVV LDPGHGGIDTGAIGRNGSKEKHVVLAIAKNVRSILRNHGIDARLTRSGDTFIPLYDRVEI AHKHGADLFMSIHADGFTNPKAAGASVFALSNRGASSAMAKYLSERENRADEVAGKKATD KDHLLQQVLFDLVQTDTIKNSLTLGSHILKKIKPVHKLHSRNTEQAAFVVLKSPSVPSVL VETSFITNPEEERLLGTAAFRQKIATAIAEGVISYFHWFDNQKAHSKKR >gi|296494437|gb|ADTN01000301.1| GENE 19 16858 - 17757 712 299 aa, chain + ## HITS:1 COG:hemF KEGG:ns NR:ns ## COG: hemF COG0408 # Protein_GI_number: 16130361 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase # Organism: Escherichia coli K12 # 1 299 1 299 299 625 100.0 1e-179 MKPDAHQVKQFLLNLQDTICQQLTAVDGAEFVEDSWQREAGGGGRSRVLRNGGVFEQAGV NFSHVHGEAMPASATAHRPELAGRSFEAMGVSLVVHPHNPYVPTSHANVRFFIAEKPGAD PVWWFGGGFDLTPFYGFEEDAIHWHRTARDLCLPFGEDVYPRYKKWCDEYFYLKHRNEQR GIGGLFFDDLNTPDFDRCFAFMQAVGKGYTDAYLPIVERRKAMAYGERERNFQLYRRGRY VEFNLVWDRGTLFGLQTGGRTESILMSMPPLVRWEYDYQPKDGSPEAALSEFIKVRDWV >gi|296494437|gb|ADTN01000301.1| GENE 20 17763 - 18815 995 350 aa, chain - ## HITS:1 COG:eutR KEGG:ns NR:ns ## COG: eutR COG2207 # Protein_GI_number: 16130362 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 1 350 1 350 350 713 100.0 0 MKKTRTANLHHLYHEPLPENLKLTPKVEVDNVHQRQTTDVYEHALTITAWQQIYDQLHPG 
KFHGEFTEILLDDIQVFREYTGLALRQSCLVWPNSFWFGIPATRGEQGFIGSQCLGSAEI ATRPGGTEFELSTPDDYTILGVVLSEDVITRQANFLHNPDRVLHMLRNQSALEVKEQHKA ALWGFVQQALATFCENPENLHQPAVRKVLGDNLLMAMGAMLEEAQPMVTAESISHQSYRR LLSRAREYVLENMSEPVTVLDLCNQLHVSRRTLQNAFHAILGIGPNAWLKRIRLNAVRRE LISPWSQSMTVKDAAMQWGFWHLGQFATDYQQLFSEKPSLTLHQRMREWG >gi|296494437|gb|ADTN01000301.1| GENE 21 18861 - 19361 439 166 aa, chain - ## HITS:1 COG:eutK KEGG:ns NR:ns ## COG: eutK COG4577 # Protein_GI_number: 16130363 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Escherichia coli K12 # 1 166 3 168 168 289 99.0 1e-78 MINALGLLEVDGMVAAIDAADAMLKAANVRLLSHEVLDPGRLTLVVEGDLAACRAALDAG CTAAMRTGRVISRKEIGRPDDDTQWLVTGFNRQPKQPVREPDAPVIVAESADELLALLTS VRQGMTAGEVAAHFGWPLEKARNALEQLFSAGTLRKRSSRYRLKPH >gi|296494437|gb|ADTN01000301.1| GENE 22 19374 - 20033 768 219 aa, chain - ## HITS:1 COG:eutL KEGG:ns NR:ns ## COG: eutL COG4816 # Protein_GI_number: 16130364 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Escherichia coli K12 # 1 219 1 219 219 375 100.0 1e-104 MPALDLIRPSVTAMRVIASVNADFARELKLPPHIRSLGLISADSDDVTYIAADEATKQAM VEVVYGRSLYAGAAHGPSPTAGEVLIMLGGPNPAEVRAGLDAMIAHIENGAAFQWANDAQ DTAFLAHVVSRTGSYLSSTAGITLGDPMAYLVAPPLEATYGIDAALKSADVQLATYVPPP SETNYSAAFLTGSQAACKAACNAFTDAVLEIARNPIQRA >gi|296494437|gb|ADTN01000301.1| GENE 23 20043 - 20930 1052 295 aa, chain - ## HITS:1 COG:eutC KEGG:ns NR:ns ## COG: eutC COG4302 # Protein_GI_number: 16130365 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine ammonia-lyase, small subunit # Organism: Escherichia coli K12 # 1 295 1 295 295 555 100.0 1e-158 MDQKQIEEIVRSVMASMGQAAPAPSEAKCATTNCAAPVTSESCALDLGSAEAKAWIGVEN PHRADVLTELRRSTVARVCTGRAGPRPRTQALLRFLADHSRSKDTVLKEVPEEWVKAQGL LEVRSEISDKNLYLTRPDMGRRLCAEAVEALKAQCVANPDVQVVISDGLSTDAITVNYEE 
ILPPLMAGLKQAGLKVGTPFFVRYGRVKIEDQIGEILGAKVVILLVGERPGLGQSESLSC YAVYSPRMATTVEADRTCISNIHQGGTPPVEAAAVIVDLAKRMLEQKASGINMTR >gi|296494437|gb|ADTN01000301.1| GENE 24 20951 - 22312 1665 453 aa, chain - ## HITS:1 COG:eutB KEGG:ns NR:ns ## COG: eutB COG4303 # Protein_GI_number: 16130366 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine ammonia-lyase, large subunit # Organism: Escherichia coli K12 # 1 453 15 467 467 924 100.0 0 MKLKTTLFGNVYQFKDVKEVLAKANELRSGDVLAGVAAASSQERVAAKQVLSEMTVADIR NNPVIAYEDDCVTRLIQDDVNETAYNQIKNWSISELREYVLSDETSVDDIAFTRKGLTSE VVAAVAKICSNADLIYGAKKMPVIKKANTTIGIPGTFSARLQPNDTRDDVQSIAAQIYEG LSFGVGDAVIGVNPVTDDVENLSRVLDTIYGVIDKFNIPTQGCVLAHVTTQIEAIRRGAP GGLIFQSICGSEKGLKEFGVELAMLDEARAVGAEFNRIAGENCLYFETGQGSALSAGANF GADQVTMEARNYGLARHYDPFIVNTVVGFIGPEYLYNDRQIIRAGLEDHFMGKLSGISMG CDCCYTNHADADQNLNENLMILLATAGCNYIMGMPLGDDIMLNYQTTAFHDTATVRQLLN LRPSPEFERWLESMGIMANGRLTKRAGDPSLFF >gi|296494437|gb|ADTN01000301.1| GENE 25 22645 - 23706 348 353 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165511|ref|YP_001467745.1| 30S ribosomal protein S15 [Campylobacter concisus 13826] # 10 344 64 404 406 138 30 3e-32 MDKKEYLQAIGVYPLISLKEARKRATESRSLIANGINPVEEARKEKAIDALNMAAGFKTV AEDWFATRVSGWSESYTKQVRSALEKDVYPVLGKRSIVDITARDVLALLQKKERTAPEQA RKLRQRIGEIFKFAVITELVTRNPVADLDTALKARRPGHNAWLQINEIPAFYKALERAGS VQIQTAIRLLILSALRTAELRLMRWEWVDLESATITLPAEVMKARRAHVVPLSRQAVELL HDQFTRSGYSAFVFPGRFMDKPLSASAILKALERIGYKSIATGHGWRTTFSTALNESGRY SPDWIEIQLAHVPKGVRGVYNQAAYLKQRRQMLQDYADAIDSILAGEGNPLEP >gi|296494437|gb|ADTN01000301.1| GENE 26 23933 - 24700 540 255 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301648099|ref|ZP_07247862.1| ## NR: gi|301648099|ref|ZP_07247862.1| hypothetical protein HMPREF9543_04600 [Escherichia coli MS 146-1] # 1 255 1 255 255 501 100.0 1e-140 MRNSLAVRFSLQQGEELSQEYQFIKYEEAPEQVQIAFDDQHYGNTIRYDGKKWFLNPVTN AIHVCQIITPVNNSADNDSAIDAELVTEPLPPPPPVQESDVKPDAIDLTAEARERKQPKP EQPQDKDNKQWHGNTAYLNTFACHLHEVLPQAIYSTWCDKQTAAHFHLLYALCYLASYEN 
KDEGIELGEVLYGAEKITKGGVVPGFHMARSTFDIKINELEQWGLIKVTRGKKRKDGQGN LPNSIKMNFKPYSSK >gi|296494437|gb|ADTN01000301.1| GENE 27 26030 - 26227 76 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301648102|ref|ZP_07247865.1| ## NR: gi|301648102|ref|ZP_07247865.1| conserved domain protein [Escherichia coli MS 146-1] # 1 65 3 67 67 127 100.0 2e-28 MTDNDIIGIGEVARIFGVHRQTIAKWLDNIPGFPQPFTPQVKGIHQKWYKQEIIEYIEHN KGRAK >gi|296494437|gb|ADTN01000301.1| GENE 28 26224 - 26796 170 190 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|301648103|ref|ZP_07247866.1| ## NR: gi|301648103|ref|ZP_07247866.1| HNH endonuclease domain protein [Escherichia coli MS 146-1] # 1 190 1 190 190 374 100.0 1e-102 MNNEIKFNHIKRLWFLKDCVVFSRATGKPIAFSSKGKDGRRFTAIRINGKFYSINIHKAV FMLHHNRPIAEGKEIHHKDGNYENNAPENLIELTSKQHKRIHAYQCDDPMRGIWLDRGAW KFKWVEDNGSQHYRSFHGIDEAIKFRAEIEEPRRQELRALGLNCKRVSSGEKSRRIVRNK IYFSHTTTIL Prediction of potential genes in microbial genomes Time: Mon May 16 00:11:15 2011 Seq name: gi|296494436|gb|ADTN01000302.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont761.1, whole genome shotgun sequence Length of sequence - 4452 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 371 202 ## ECSP_1339 curli assembly protein CsgF 2 1 Op 2 . + CDS 398 - 1231 780 ## COG1462 Uncharacterized protein involved in formation of curli polymers + Term 1243 - 1273 0.2 - Term 1231 - 1261 0.2 3 2 Tu 1 . - CDS 1295 - 1816 529 ## SSON_1045 hypothetical protein 4 3 Tu 1 . + CDS 1785 - 1871 101 ## 5 4 Op 1 5/0.000 - CDS 1888 - 2442 730 ## COG3381 Uncharacterized component of anaerobic dehydrogenases 6 4 Op 2 3/0.000 - CDS 2466 - 3203 754 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family 7 4 Op 3 . 
- CDS 3258 - 4196 582 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases - Prom 4256 - 4315 5.6 Predicted protein(s) >gi|296494436|gb|ADTN01000302.1| GENE 1 3 - 371 202 122 aa, chain + ## HITS:1 COG:no KEGG:ECSP_1339 NR:ns ## KEGG: ECSP_1339 # Name: csgF # Def: curli assembly protein CsgF # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 23 122 39 138 138 177 100.0 8e-44 RNLIYSTKPFRNPNFGGNPNNGAFLLNSAQAQNSYKDPSYNDDFGIETPSALDNFTQAIQ SQILGGLLSNINTGKPGRMVTNDYIVDIANRDGQLQLNVTDRKTGQTSTIQVSGLQNNST DF >gi|296494436|gb|ADTN01000302.1| GENE 2 398 - 1231 780 277 aa, chain + ## HITS:1 COG:ECs1414 KEGG:ns NR:ns ## COG: ECs1414 COG1462 # Protein_GI_number: 15830668 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein involved in formation of curli polymers # Organism: Escherichia coli O157:H7 # 1 277 1 277 277 550 100.0 1e-156 MQRLFLLVAVMLLSGCLTAPPKEAARPTLMPRAQSYKDLTHLPAPTGKIFVSVYNIQDET GQFKPYPASNFSTAVPQSATAMLVTALKDSRWFIPLERQGLQNLLNERKIIRAAQENGTV AINNRIPLQSLTAANIMVEGSIIGYESNVKSGGVGARYFGIGADTQYQLDQIAVNLRVVN VSTGEILSSVNTSKTILSYEVQAGVFRFIDYQRLLEGEVGYTSNEPVMLCLMSAIETGVI FLINDGIDRGLWDLQNKAERQNDILVKYRHMSVPPES >gi|296494436|gb|ADTN01000302.1| GENE 3 1295 - 1816 529 173 aa, chain - ## HITS:1 COG:no KEGG:SSON_1045 NR:ns ## KEGG: SSON_1045 # Name: ycdZ # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 173 7 179 179 280 99.0 1e-74 MAAFSAIMRGMNILLSIAITTGILSGIWGWVAVSLGLLSWAGFLGCTAYFACPQGGLKGL AISAATLLSGVVWAMVIIYGSALAPHLEILGYVITGIVAFLMCIQAKQLLLSFVPGTFIG ACATFAGQGDWKLVLPSLALGLIFGYAMKNSGLWLAARSAKTAHREQEIKNKA >gi|296494436|gb|ADTN01000302.1| GENE 4 1785 - 1871 101 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPRMIAENAAMNEIEHTFYATTNYPLQL >gi|296494436|gb|ADTN01000302.1| GENE 5 1888 - 2442 730 184 aa, chain - ## HITS:1 COG:ycdY KEGG:ns NR:ns ## COG: ycdY COG3381 # Protein_GI_number: 16128998 # Func_class: R General function prediction only # Function: Uncharacterized component of anaerobic dehydrogenases # 
Organism: Escherichia coli K12 # 1 184 1 184 184 349 100.0 2e-96 MNEFSILCRVLGSLYYRQPQDPLLVPLFTLIREGKLAANWPLEQDELLTRLQKSCDMTQV SADYNALFIGDECAVPPYRSAWVEGATEAEVRAFLSERGMPLADTPADHIGTLLLAASWL EDQSTEDESEALETLFSEYLLPWCGAFLGKVEAHATTPFWRTMAPLTRDAISAMWDELEE DSEE >gi|296494436|gb|ADTN01000302.1| GENE 6 2466 - 3203 754 245 aa, chain - ## HITS:1 COG:ycdX KEGG:ns NR:ns ## COG: ycdX COG1387 # Protein_GI_number: 16128997 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Escherichia coli K12 # 1 245 1 245 245 479 100.0 1e-135 MYPVDLHMHTVASTHAYSTLSDYIAQAKQKGIKLFAITDHGPDMEDAPHHWHFINMRIWP RVVDGVGILRGIEANIKNVDGEIDCSGKMFDSLDLIIAGFHEPVFAPHDKATNTQAMIAT IASGNVHIISHPGNPKYEIDVKAVAEAAAKHQVALEINNSSFLHSRKGSEDNCREVAAAV RDAGGWVALGSDSHTAFTMGEFEECLKILDAVDFPPERILNVSPRRLLNFLESRGMAPIA EFADL >gi|296494436|gb|ADTN01000302.1| GENE 7 3258 - 4196 582 312 aa, chain - ## HITS:1 COG:ycdW KEGG:ns NR:ns ## COG: ycdW COG0111 # Protein_GI_number: 16128996 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Escherichia coli K12 # 1 312 14 325 325 640 100.0 0 MDIIFYHPTFDTQWWIEALRKAIPQARVRAWKSGDNDSADYALVWHPPVEMLAGRDLKAV FALGAGVDSILSKLQAHPEMLNPSVPLFRLEDTGMGEQMQEYAVSQVLHWFRRFDDYRIQ QNSSHWQPLPEYHREDFTIGILGAGVLGSKVAQSLQTWRFPLRCWSRTRKSWPGVQSFAG REELSAFLSQCRVLINLLPNTPETVGIINQQLLEKLPDGAYLLNLARGVHVVEDDLLAAL DSGKVKGAMLDVFNREPLPPESPLWQHPRVTITPHVAAITRPAEAVEYISRTIAQLEKGE RVCGQVDRARGY Prediction of potential genes in microbial genomes Time: Mon May 16 00:11:24 2011 Seq name: gi|296494435|gb|ADTN01000303.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont761.2, whole genome shotgun sequence Length of sequence - 479 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 00:11:25 2011 Seq name: gi|296494434|gb|ADTN01000304.1| Escherichia coli MS 
146-1 E_coli146-1-1.0_Cont775.1, whole genome shotgun sequence Length of sequence - 1357 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 - CDS 183 - 509 169 ## COG1145 Ferredoxin - Prom 603 - 662 1.9 - Term 548 - 577 -0.4 2 1 Op 2 . - CDS 710 - 1306 485 ## COG3381 Uncharacterized component of anaerobic dehydrogenases Predicted protein(s) >gi|296494434|gb|ADTN01000304.1| GENE 1 183 - 509 169 108 aa, chain - ## HITS:1 COG:PM1758 KEGG:ns NR:ns ## COG: PM1758 COG1145 # Protein_GI_number: 15603623 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Pasteurella multocida # 4 105 78 178 181 65 40.0 2e-11 MGILNRHEDGYPQLVIEFASCDGCGLCIAACSTEALRPQARFDTGLRPVFKANCVNPVRS CKQCVDLCPLQACSINESGMPVIDAAICNGCGECLVQCGYDAVKLEMV >gi|296494434|gb|ADTN01000304.1| GENE 2 710 - 1306 485 198 aa, chain - ## HITS:1 COG:STM4308 KEGG:ns NR:ns ## COG: STM4308 COG3381 # Protein_GI_number: 16767558 # Func_class: R General function prediction only # Function: Uncharacterized component of anaerobic dehydrogenases # Organism: Salmonella typhimurium LT2 # 1 197 21 217 217 270 72.0 1e-72 MPSPAVLVRILGALFYYSPTRPEVRALFDCLPTLDEIYPWRDPRQVAELCSDWSIPDEEQ HLWQFSVLFEGQGEMPAPPWGSVYLEKDNLLMGDSTAEYRKFLRDQGMTFDSCINEPEDQ FGLMLLACSALLAEGKERAANQLLETHLLPWGYRYLELLQRNPVSAFYAKLAQIATLFLQ DYQQQQALNPPTKRLFLG Prediction of potential genes in microbial genomes Time: Mon May 16 00:11:26 2011 Seq name: gi|296494433|gb|ADTN01000305.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont775.2, whole genome shotgun sequence Length of sequence - 2063 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 9/0.000 - CDS 22 - 798 969 ## COG3302 DMSO reductase anchor subunit 2 1 Op 2 16/0.000 - CDS 791 - 1417 488 ## COG0437 Fe-S-cluster-containing 
hydrogenase components 1 3 1 Op 3 . - CDS 1431 - 2036 537 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing Predicted protein(s) >gi|296494433|gb|ADTN01000305.1| GENE 1 22 - 798 969 258 aa, chain - ## HITS:1 COG:STM4307 KEGG:ns NR:ns ## COG: STM4307 COG3302 # Protein_GI_number: 16767557 # Func_class: R General function prediction only # Function: DMSO reductase anchor subunit # Organism: Salmonella typhimurium LT2 # 1 258 1 257 257 290 79.0 2e-78 MHELPLVFFTVFTQSAVGAFILLLIGGAMGLIEPRRMAIGLFSVMCLFGVGVVLGTIHVG QPLRALNMLFRVGSSPMSNEIVLSACFAAAGGLGSLGLLLNRGGAMLCKVLVWLAAAIGV VFIFAIPRIYQLATVATWSTSYTTMMMVLTALIGGGMLAALFGVRRLGLLVSVLAILASF CLRPGYISTLMSADGVITAAQSTWFTAQAILLALGVIGALLYARMKSGQAVLAMTALVVI AAELVGRIGFYNLWTIPM >gi|296494433|gb|ADTN01000305.1| GENE 2 791 - 1417 488 208 aa, chain - ## HITS:1 COG:STM4306 KEGG:ns NR:ns ## COG: STM4306 COG0437 # Protein_GI_number: 16767556 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Salmonella typhimurium LT2 # 1 208 1 208 208 397 92.0 1e-110 MKQYGFYVDSSRCSGCKTCQVSCKDNKDLDVGPKLRRVYEYGGGSWVKEGESWHNNTFTY YLSIACNHCDEPTCVAGCPTGAMHKREEDGLVLVDDSVCVGCRYCEMRCPYGAPQFDTQA KVMRKCDGCLDRLEKNLPPICVESCPQRALDFGPIDELRAKYGSENEIAPLPAASFTHPN LIIKPHPLARPTGDKEGAIMNIREVRHA >gi|296494433|gb|ADTN01000305.1| GENE 3 1431 - 2036 537 201 aa, chain - ## HITS:1 COG:STM4305 KEGG:ns NR:ns ## COG: STM4305 COG0243 # Protein_GI_number: 16767555 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Salmonella typhimurium LT2 # 1 201 583 783 783 378 88.0 1e-105 MGVIDQRIATEKESIAFADFRADPQANPLKTPSGKIEVYSQALADLAQKWILPEGDRIPA VPEFCVVKESHLNKELTAKYPLQLSGFHTKGHTHSTYTNVLMLHEAVPDEVWINPIDASA RQLSSGDLVHVFNDRGVVEIPCKVTQRILPGVVAMPQGAWTRLDSNGVDVGGCINTLTTH LTSPLAKGNPQHTNLVEIKRA Prediction of potential genes in microbial genomes Time: Mon May 16 00:11:27 2011 Seq name: gi|296494432|gb|ADTN01000306.1| Escherichia coli 
MS 146-1 E_coli146-1-1.0_Cont775.3, whole genome shotgun sequence Length of sequence - 2022 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1793 1321 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing - Prom 1903 - 1962 5.8 Predicted protein(s) >gi|296494432|gb|ADTN01000306.1| GENE 1 2 - 1793 1321 597 aa, chain - ## HITS:1 COG:STM4305 KEGG:ns NR:ns ## COG: STM4305 COG0243 # Protein_GI_number: 16767555 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Salmonella typhimurium LT2 # 25 597 1 572 783 1035 85.0 0 MKELLNTTISRRDAVTTTAKLGAAVALSNAITLPFSTTVRAETSDTGKTNENAETVRHSA CLVNCGSRCPLKVIVKDGRIVRIEAEDAKDDSVFGEHQIRPCLRGRSSRWRVYSPDRIKY PMKRVGKRGEGKFQKISWDEATALIAAELKRITEKYGREAIYYNYQSGAYYQTQGSPAWK RLLNITGGYLRYHNTYSTAQIGAATPYTHGVYVGSHFTEVANSDLVVLFGLNLSETRMSG GGQVEELRRALDKSKARVIIIDPRYTDSVITEHAEWLPIRPTTDAALVAGIAHTLITENL INEEQVNKYCVGYDRSTLPESAAPNASYKDYVLGTGEDGVEKTPEWAASITGLPATRIRQ LAREMVAAKACYICQGWGPQRHANGEQTVRAIQTLPVLTGHFGLPGTNNGNWPYGTPYGV PSLPVGENPITTSIPCYLWTDAIQNPEKMTANTMGVKGAEKLKTGIKLIVNQAGNALLNQ HGATNRTRKILADDTLCETIIVIENHMTPSAKYADILLPETSYLEAEDLVDSSYSAGSHN YMIALQRTIEPMWEVRSTYDICADIAGHLGLREQFTEGKTQAQWAELHYQQIREKRP Prediction of potential genes in microbial genomes Time: Mon May 16 00:11:28 2011 Seq name: gi|296494431|gb|ADTN01000307.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont776.1, whole genome shotgun sequence Length of sequence - 1558 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 50 - 466 204 ## gi|301648123|ref|ZP_07247880.1| conserved hypothetical protein - Prom 515 - 574 7.3 + Prom 919 - 978 3.1 2 2 Op 1 . 
+ CDS 999 - 1334 174 ## gi|300919931|ref|ZP_07136392.1| conserved domain protein 3 2 Op 2 . + CDS 1343 - 1534 223 ## HDEF_1354 plasmid stability protein StbB Predicted protein(s) >gi|296494431|gb|ADTN01000307.1| GENE 1 50 - 466 204 138 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|301648123|ref|ZP_07247880.1| ## NR: gi|301648123|ref|ZP_07247880.1| conserved hypothetical protein [Escherichia coli MS 146-1] # 1 138 1 138 138 264 100.0 1e-69 MPTITAKVSDELLAYIDLVSGGNRSEYLRRCLEAGPGDLESGLKIVTNQLSDVNRKLDYL FDRASDADFGSLRDELKAITETLSGVTFPPAGQMMLHESLAIETLILLRSIAEPGKTKAA KAEVERNGYKVWEPKKER >gi|296494431|gb|ADTN01000307.1| GENE 2 999 - 1334 174 111 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|300919931|ref|ZP_07136392.1| ## NR: gi|300919931|ref|ZP_07136392.1| conserved domain protein [Escherichia coli MS 115-1] # 1 111 1 111 111 193 100.0 3e-48 MNKPADLKPRNLSAAVRLRLNEIENWLDRGLTRHEIAEILDSEYSFSVTAKGLEMALYRT RKNRKNVLYNTQPSEPEAQESEKAESPGIIDKEFFNKIGEDFNPKKFNKKF >gi|296494431|gb|ADTN01000307.1| GENE 3 1343 - 1534 223 63 aa, chain + ## HITS:1 COG:no KEGG:HDEF_1354 NR:ns ## KEGG: HDEF_1354 # Name: stbB_2 # Def: plasmid stability protein StbB # Organism: H.defensa # Pathway: not_defined # 1 62 1 62 238 97 77.0 2e-19 MKVAVINYSGSVRKTLISSYLLAPRLTGAKFYAVETINQSASDLGIENVTSFKGDDFSRL IEG Prediction of potential genes in microbial genomes Time: Mon May 16 00:11:43 2011 Seq name: gi|296494430|gb|ADTN01000308.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont780.1, whole genome shotgun sequence Length of sequence - 565 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 123 - 182 2.0 1 1 Tu 1 . 
+ CDS 356 - 490 148 ## SBO_1851 hypothetical protein Predicted protein(s) >gi|296494430|gb|ADTN01000308.1| GENE 1 356 - 490 148 44 aa, chain + ## HITS:1 COG:no KEGG:SBO_1851 NR:ns ## KEGG: SBO_1851 # Name: not_defined # Def: hypothetical protein # Organism: S.boydii # Pathway: not_defined # 1 44 48 91 91 83 95.0 3e-15 MLSLQQGGYMTLAQFAMTFWHDLAAPILAGIITAAIVSWWRNRK Prediction of potential genes in microbial genomes Time: Mon May 16 00:11:44 2011 Seq name: gi|296494429|gb|ADTN01000309.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont780.2, whole genome shotgun sequence Length of sequence - 87 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 00:11:45 2011 Seq name: gi|296494428|gb|ADTN01000310.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont780.3, whole genome shotgun sequence Length of sequence - 121 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 00:12:01 2011 Seq name: gi|296494427|gb|ADTN01000311.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont780.4, whole genome shotgun sequence Length of sequence - 54219 bp Number of predicted genes - 58, with homology - 55 Number of transcription units - 38, operones - 12 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 2 1 Op 2 6/0.222 - CDS 994 - 1803 796 ## COG2912 Uncharacterized conserved protein 3 1 Op 3 6/0.222 - CDS 1807 - 2199 395 ## COG3094 Uncharacterized protein conserved in bacteria 4 1 Op 4 32/0.000 - CDS 2196 - 3029 486 ## PROTEIN SUPPORTED gi|225874212|ref|YP_002755671.1| ribosomal protein L11 methyltransferase 5 1 Op 5 9/0.000 - CDS 3029 - 4111 1274 ## COG0216 Protein chain release factor A 6 1 Op 6 . - CDS 4153 - 5409 1276 ## COG0373 Glutamyl-tRNA reductase - Prom 5448 - 5507 4.0 + Prom 5423 - 5482 6.0 7 2 Op 1 13/0.000 + CDS 5623 - 6246 533 ## COG3017 Outer membrane lipoprotein involved in outer membrane biogenesis 8 2 Op 2 . 
+ CDS 6246 - 7097 566 ## COG1947 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase + Term 7117 - 7143 0.3 - Term 6986 - 7032 -0.5 9 3 Tu 1 . - CDS 7063 - 7149 56 ## - Prom 7172 - 7231 1.6 10 4 Tu 1 . + CDS 7182 - 8195 973 ## COG0462 Phosphoribosylpyrophosphate synthetase 11 5 Tu 1 . + CDS 8347 - 9999 1729 ## COG0659 Sulfate permease and related transporters (MFS superfamily) 12 6 Tu 1 . - CDS 10054 - 10332 267 ## LF82_2742 uncharacterized protein YchH - Prom 10385 - 10444 3.7 + Prom 10521 - 10580 4.2 13 7 Op 1 14/0.000 + CDS 10610 - 11194 501 ## COG0193 Peptidyl-tRNA hydrolase + Term 11206 - 11238 0.7 + Prom 11219 - 11278 2.4 14 7 Op 2 1/0.667 + CDS 11311 - 12402 1273 ## COG0012 Predicted GTPase, probable translation factor + Term 12463 - 12492 2.1 15 8 Tu 1 . + CDS 13171 - 16038 1596 ## COG3468 Type V secretory pathway, adhesin AidA 16 9 Tu 1 . - CDS 16138 - 18057 1555 ## COG3284 Transcriptional activator of acetoin/glycerol metabolism - Prom 18090 - 18149 6.2 + Prom 18132 - 18191 4.9 17 10 Op 1 10/0.000 + CDS 18285 - 19355 1082 ## COG2376 Dihydroxyacetone kinase 18 10 Op 2 2/0.556 + CDS 19366 - 19998 664 ## COG2376 Dihydroxyacetone kinase 19 10 Op 3 1/0.667 + CDS 20009 - 21427 1200 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) + Term 21437 - 21477 8.1 + Prom 21601 - 21660 5.5 20 11 Tu 1 . + CDS 21747 - 23444 1682 ## COG1626 Neutral trehalase 21 12 Tu 1 . - CDS 23523 - 23819 114 ## B21_01181 hypothetical protein - Prom 23952 - 24011 4.7 - Term 24087 - 24128 6.2 22 13 Tu 1 . - CDS 24141 - 24395 289 ## COG2261 Predicted membrane protein - Prom 24445 - 24504 2.9 23 14 Tu 1 . - CDS 24521 - 24610 67 ## - Prom 24836 - 24895 1.9 + Prom 24301 - 24360 2.3 24 15 Tu 1 . + CDS 24596 - 25330 694 ## COG5581 Predicted glycosyltransferase - Term 25169 - 25212 1.7 25 16 Op 1 . - CDS 25332 - 25943 627 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) 26 16 Op 2 . 
- CDS 25882 - 26070 94 ## gi|293409572|ref|ZP_06653148.1| predicted protein + Prom 25956 - 26015 2.6 27 17 Op 1 1/0.667 + CDS 26043 - 26957 824 ## COG1619 Uncharacterized proteins, homologs of microcin C7 resistance protein MccF + Term 26992 - 27025 -0.4 + Prom 26971 - 27030 1.7 28 17 Op 2 . + CDS 27052 - 28788 1802 ## COG3263 NhaP-type Na+/H+ and K+/H+ antiporters with a unique C-terminal domain 29 18 Tu 1 . - CDS 28922 - 29071 61 ## gi|188493462|ref|ZP_03000732.1| hypothetical protein Ec53638_4549 30 19 Op 1 7/0.111 - CDS 29174 - 30244 923 ## COG0787 Alanine racemase 31 19 Op 2 . - CDS 30254 - 31552 1425 ## COG0665 Glycine/D-amino acid oxidases (deaminating) - Prom 31598 - 31657 4.7 + Prom 31578 - 31637 4.4 32 20 Tu 1 . + CDS 31882 - 33414 1505 ## COG2719 Uncharacterized conserved protein + Term 33428 - 33456 1.0 - Term 33412 - 33448 2.3 33 21 Tu 1 . - CDS 33466 - 34185 762 ## COG2186 Transcriptional regulators - Prom 34218 - 34277 5.0 + Prom 34299 - 34358 5.5 34 22 Tu 1 . + CDS 34406 - 35947 1695 ## COG3067 Na+/H+ antiporter + Term 35958 - 35989 4.1 + Prom 35954 - 36013 4.2 35 23 Tu 1 . + CDS 36093 - 36623 599 ## COG1495 Disulfide bond formation protein DsbB + Term 36632 - 36662 3.0 36 24 Op 1 4/0.444 - CDS 36669 - 37937 667 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair 37 24 Op 2 . - CDS 37937 - 38356 316 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) + Prom 38650 - 38709 7.1 38 25 Tu 1 . + CDS 38729 - 39640 683 ## ECH74115_1668 hemolysin E 39 26 Op 1 1/0.667 - CDS 39846 - 40292 473 ## COG2983 Uncharacterized conserved protein - Prom 40316 - 40375 3.9 40 26 Op 2 2/0.556 - CDS 40384 - 41043 657 ## COG0179 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) 41 26 Op 3 . - CDS 41115 - 41408 301 ## COG3100 Uncharacterized protein conserved in bacteria - Prom 41600 - 41659 6.6 + Prom 41557 - 41616 4.5 42 27 Tu 1 . 
+ CDS 41650 - 42051 251 ## ECO111_1505 hypothetical protein + Prom 42835 - 42894 4.5 43 28 Op 1 22/0.000 + CDS 43042 - 43737 428 ## COG0850 Septum formation inhibitor 44 28 Op 2 22/0.000 + CDS 43761 - 44573 907 ## COG2894 Septum formation inhibitor-activating ATPase 45 28 Op 3 . + CDS 44577 - 44843 459 ## COG0851 Septum formation topological specificity factor + Term 44869 - 44906 8.0 - Term 45171 - 45199 1.0 46 29 Tu 1 . - CDS 45215 - 45427 77 ## COG3468 Type V secretory pathway, adhesin AidA + Prom 45808 - 45867 6.9 47 30 Op 1 . + CDS 45959 - 46132 93 ## ECDH10B_1224 hypothetical protein 48 30 Op 2 . + CDS 46134 - 46478 347 ## EcSMS35_1978 hypothetical protein 49 30 Op 3 . + CDS 46488 - 46817 458 ## ECO111_1498 hypothetical protein + Term 46831 - 46867 5.2 50 31 Tu 1 . - CDS 46900 - 47793 841 ## COG3468 Type V secretory pathway, adhesin AidA 51 32 Tu 1 . - CDS 48001 - 49521 351 ## COG3468 Type V secretory pathway, adhesin AidA - Prom 49574 - 49633 6.0 - Term 49790 - 49824 -0.1 52 33 Tu 1 . - CDS 49921 - 50139 88 ## c1611 hypothetical protein - Prom 50168 - 50227 6.7 53 34 Op 1 . - CDS 50271 - 51794 586 ## COG2200 FOG: EAL domain 54 34 Op 2 . - CDS 51837 - 51917 56 ## - Prom 52032 - 52091 4.4 - Term 51983 - 52014 1.8 55 35 Tu 1 . - CDS 52126 - 52302 245 ## B21_01150 hypothetical protein - Prom 52399 - 52458 4.9 56 36 Tu 1 . - CDS 52487 - 52753 197 ## B21_01149 hypothetical protein - Prom 52773 - 52832 3.7 - Term 52949 - 52991 3.3 57 37 Tu 1 . - CDS 53097 - 53333 214 ## B21_01147 hypothetical protein - Prom 53456 - 53515 6.5 + Prom 53415 - 53474 6.8 58 38 Tu 1 . 
+ CDS 53647 - 54201 233 ## ECSE_1206 hypothetical protein Predicted protein(s) >gi|296494427|gb|ADTN01000311.1| GENE 1 104 - 958 1101 284 aa, chain - ## HITS:1 COG:ECs1720 KEGG:ns NR:ns ## COG: ECs1720 COG2877 # Protein_GI_number: 15830974 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase # Organism: Escherichia coli O157:H7 # 1 284 1 284 284 567 99.0 1e-162 MKQKVVSIGDINVANDLPFVLFGGMNVLESRDLAMRICEHYVTVTQKLGIPYVFKASFDK ANRSSIHSYRGPGLEEGMKIFQELKQTFGVKIITDVHEPSQAQPVADVVDVIQLPAFLAR QTDLVEAMAKTGAVINVKKPQFVSPGQMGNIVDKFKEGGNEKVILCDRGANFGYDNLVVD MLGFSIMKKVSGNSPVIFDVTHALQCRDPFGAASGGRRAQVAELARAGMAVGLAGLFIEA HPDPEHAKCDGPSALPLAKLEPFLKQMKAIDDLVKGFEELDTSK >gi|296494427|gb|ADTN01000311.1| GENE 2 994 - 1803 796 269 aa, chain - ## HITS:1 COG:ECs1719 KEGG:ns NR:ns ## COG: ECs1719 COG2912 # Protein_GI_number: 15830973 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 269 1 269 269 521 100.0 1e-148 MRSLADFEFNKAPLCEGMILACEAIRRDFPSQDVYDELERLVSLAKEEISQLLPLEEQLE KLIALFYGDWGFKASRGVYRLSDALWLDQVLKNRQGSAVSLGAVLLWVANRLDLPLLPVI FPTQLILRIECPDGEIWLINPFNGESLSEHMLDVWLKGNISPSAELFYEDLDEADNIEVI RKLLDTLKASLMEENQMELALRTSEALLQFNPEDPYEIRDRGLIYAQLDCEHVALNDLSY FVEQCPEDPISEMIRAQINNIAHKHIVLH >gi|296494427|gb|ADTN01000311.1| GENE 3 1807 - 2199 395 130 aa, chain - ## HITS:1 COG:ECs1718 KEGG:ns NR:ns ## COG: ECs1718 COG3094 # Protein_GI_number: 15830972 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 130 1 130 130 207 99.0 5e-54 MTSFSTLLSVHLISIALSVGLLTLRFWLRYQKHPQAFARWTRIVPPVVDTVLLLSGIALM AKAHILPFSGQAQWLTEKLFGVIIYIVLGFIALDYRRMHSQQARIIAFPLALVVLYIIIK LATTKVPLLG >gi|296494427|gb|ADTN01000311.1| GENE 4 2196 - 3029 486 277 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225874212|ref|YP_002755671.1| ribosomal protein L11 methyltransferase [Acidobacterium capsulatum ATCC 51196] # 7 274 23 289 294 
191 40 6e-48 MEYQHWLREAISQLQASESPRRDAEILLEHVTGKGRTFILAFGETQLTDEQCQQLDALLT RRRDGEPIAHLTGVREFWSLPLFVSPATLIPRPDTECLVEQALARLPEQPCRILDLGTGT GAIALALASERPDCEITAVDRMPDAVALAQRNAQHLAIKNIRILQSDWFSALAGQQFTMI VSNPPYIDEQDPHLQQGDVRFEPLTALVAADSGMADIVHIIEQSRNALVSGGFLLLEHGW QQGEAVRQAFILAGYHDVETCRDYGDNERVTLGRYYQ >gi|296494427|gb|ADTN01000311.1| GENE 5 3029 - 4111 1274 360 aa, chain - ## HITS:1 COG:ECs1716 KEGG:ns NR:ns ## COG: ECs1716 COG0216 # Protein_GI_number: 15830970 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor A # Organism: Escherichia coli O157:H7 # 1 360 1 360 360 632 100.0 0 MKPSIVAKLEALHERHEEVQALLGDAQTIADQERFRALSREYAQLSDVSRCFTDWQQVQE DIETAQMMLDDPEMREMAQDELREAKEKSEQLEQQLQVLLLPKDPDDERNAFLEVRAGTG GDEAALFAGDLFRMYSRYAEARRWRVEIMSASEGEHGGYKEIIAKISGDGVYGRLKFESG GHRVQRVPATESQGRIHTSACTVAVMPELPDAELPDINPADLRIDTFRSSGAGGQHVNTT DSAIRITHLPTGIVVECQDERSQHKNKAKALSVLGARIHAAEMAKRQQAEASTRRNLLGS GDRSDRNRTYNFPQGRVTDHRINLTLYRLDEVMEGKLDMLIEPIIQEHQADQLAALSEQE >gi|296494427|gb|ADTN01000311.1| GENE 6 4153 - 5409 1276 418 aa, chain - ## HITS:1 COG:ECs1715 KEGG:ns NR:ns ## COG: ECs1715 COG0373 # Protein_GI_number: 15830969 # Func_class: H Coenzyme transport and metabolism # Function: Glutamyl-tRNA reductase # Organism: Escherichia coli O157:H7 # 1 418 1 418 418 730 99.0 0 MTLLALGINHKTAPVSLRERVSFSPDKLDQALDSLLAQPMVQGGVVLSTCNRTELYLSVE EQDNLQEALIRWLCDYHNLNEEDLRKSLYWHQDNDAVSHLMRVASGLDSLVLGEPQILGQ VKKAFADSQKGHMKASELERMFQKSFSVAKRVRTETDIGASAVSVAFAACTLARQIFESL STVTVLLVGAGETIELVARHLREHKVQKMIIANRTRERAQILADEVGAEVIALSEIDERL READIIISSTASPLPIIGKGMVERALKSRRNQPMLLVDIAVPRDVEPEVGKLANAYLYSV DDLQSIISHNLAQRKAAAVEAETIVAQETSEFMAWLRAQSASETIREYRSQAEQVRDELT AKALAALEQGGDAQAIMQDLAWKLTNRLIHAPTKSLQQAARDGDNERLNILRDSLGLE >gi|296494427|gb|ADTN01000311.1| GENE 7 5623 - 6246 533 207 aa, chain + ## HITS:1 COG:ECs1714 KEGG:ns NR:ns ## COG: ECs1714 COG3017 # Protein_GI_number: 15830968 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane lipoprotein 
involved in outer membrane biogenesis # Organism: Escherichia coli O157:H7 # 1 207 1 207 207 405 100.0 1e-113 MPLPDFRLIRLLPLAALVLTACSVTTPKGPGKSPDSPQWRQHQQDVRNLNQYQTRGAFAY ISDQQKVYARFFWQQTGQDRYRLLLTNPLGSTELELNAQPGNVQLVDNKGQRYTADDAEE MIGKLTGMPIPLNSLRQWILGLPGDATDYKLDDQYRLSEITYSQNGKNWKVVYGGYDTKT QPAMPANMELTDGGQRIKLKMDNWIVK >gi|296494427|gb|ADTN01000311.1| GENE 8 6246 - 7097 566 283 aa, chain + ## HITS:1 COG:ECs1713 KEGG:ns NR:ns ## COG: ECs1713 COG1947 # Protein_GI_number: 15830967 # Func_class: I Lipid transport and metabolism # Function: 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase # Organism: Escherichia coli O157:H7 # 1 283 1 283 283 559 98.0 1e-159 MRTQWPSPAKLNLFLYITGQRADGYHTLQTLFQFLDYGDTISIELRDDGDIRLLTPVEGV EHEDNLIVRAARLLMKTAADSGRLPTGSGADLSIDKRLPMGGGLGGGSSNAATVLVALNH LWQCGLSMDELAEMGLTLGADVPVFVRGHAAFAEGVGEILTPVDPPEKWYLVAHPGVSIP TPVIFKDPELPRNTPKRSIETLLKCEFSNDCEVIARKRFREVDAVLSWLLEYAPSRLTGT GACVFAEFDTESEARQVLEQAPEWLNGFVAKGVNLSPLHRAML >gi|296494427|gb|ADTN01000311.1| GENE 9 7063 - 7149 56 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQRLEQGDVVTETQLARLKAWLCAMGKD >gi|296494427|gb|ADTN01000311.1| GENE 10 7182 - 8195 973 337 aa, chain + ## HITS:1 COG:ECs1712 KEGG:ns NR:ns ## COG: ECs1712 COG0462 # Protein_GI_number: 15830966 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Escherichia coli O157:H7 # 23 337 1 315 315 595 99.0 1e-170 MPGPHSFRQILSTNGRMPEVLLVPDMKLFAGNATPELAQRIANRLYTSLGDAAVGRFSDG EVSVQINENVRGGDIFIIQSTCAPTNDNLMELVVMVDALRRASAGRITAVIPYFGYARQD RRVRSARVPITAKVVADFLSSVGVDRVLTVDLHAEQIQGFFDVPVDNVFGSPILLEDMLQ LNLDNPIVVSPDIGGVVRARAIAKLLNDTDMAIIDKRRPRANVSQVMHIIGDVAGRDCVL VDDMIDTGGTLCKAAEALKERGAKRVFAYATHPIFSGNAANNLRNSVIDEVVVCDTIPLS DEIKSLPNVRTLTLSGMLAEAIRRISNEESISAMFEH >gi|296494427|gb|ADTN01000311.1| GENE 11 8347 - 9999 1729 550 aa, chain + ## HITS:1 COG:ECs1711 KEGG:ns NR:ns ## COG: ECs1711 COG0659 # Protein_GI_number: 15830965 # Func_class: P 
Inorganic ion transport and metabolism # Function: Sulfate permease and related transporters (MFS superfamily) # Organism: Escherichia coli O157:H7 # 1 550 1 550 550 919 100.0 0 MPFRALIDACWKEKYTAARFTRDLIAGITVGIIAIPLAMALAIGSGVAPQYGLYTAAVAG IVIALTGGSRFSVSGPTAAFVVILYPVSQQFGLAGLLVATLLSGIFLILMGLARFGRLIE YIPVSVTLGFTSGIGITIGTMQIKDFLGLQMAHVPEHYLQKVGALFMALPTINVGDAAIG IVTLGILVFWPRLGIRLPGHLPALLAGCAVMGIVNLLGGHVATIGSQFHYVLADGSQGNG IPQLLPQLVLPWDLPNSEFTLTWDSIRTLLPAAFSMAMLGAIESLLCAVVLDGMTGTKHK ANSELVGQGLGNIIAPFFGGITATAAIARSAANVRAGATSPISAVIHSILVILALLVLAP LLSWLPLSAMAALLLMVAWNMSEAHKVVDLLRHAPKDDIIVMLLCMSLTVLFDMVIAISV GIVLASLLFMRRIARMTRLAPVVVDVPDDVLVLRVIGPLFFAAAEGLFTDLESRLEGKRI VILKWDAVPVLDAGGLDAFQRFVKRLPEGCELRVCNVEFQPLRTMARAGIQPIPGRLAFF PNRRAAMADL >gi|296494427|gb|ADTN01000311.1| GENE 12 10054 - 10332 267 92 aa, chain - ## HITS:1 COG:no KEGG:LF82_2742 NR:ns ## KEGG: LF82_2742 # Name: ychH # Def: uncharacterized protein YchH # Organism: E.coli_LF82 # Pathway: not_defined # 1 92 1 92 92 152 100.0 3e-36 MKRKNASLLGNVLMGLGLVVMVVGVGYSILNQLPQFNMPQYFAHGAVLSIFVGAILWLAG ARVGGHEQVCDRYWWVRHYDKRCRRSDNRRHS >gi|296494427|gb|ADTN01000311.1| GENE 13 10610 - 11194 501 194 aa, chain + ## HITS:1 COG:ECs1709 KEGG:ns NR:ns ## COG: ECs1709 COG0193 # Protein_GI_number: 15830963 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptidyl-tRNA hydrolase # Organism: Escherichia coli O157:H7 # 1 194 1 194 194 370 100.0 1e-103 MTIKLIVGLANPGAEYAATRHNAGAWFVDLLAERLRAPLREEAKFFGYTSRVTLGGEDVR LLVPTTFMNLSGKAVAAMASFFRINPDEILVAHDELDLPPGVAKFKLGGGHGGHNGLKDI ISKLGNNPNFHRLRIGIGHPGDKNKVVGFVLGKPPVSEQKLIDEAIDEAARCTEMWFTDG LTKATNRLHAFKAQ >gi|296494427|gb|ADTN01000311.1| GENE 14 11311 - 12402 1273 363 aa, chain + ## HITS:1 COG:ECs1708 KEGG:ns NR:ns ## COG: ECs1708 COG0012 # Protein_GI_number: 15830962 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted GTPase, probable translation factor # Organism: Escherichia coli O157:H7 # 1 363 1 363 363 721 100.0 0 
MGFKCGIVGLPNVGKSTLFNALTKAGIEAANFPFCTIEPNTGVVPMPDPRLDQLAEIVKP QRTLPTTMEFVDIAGLVKGASKGEGLGNQFLTNIRETEAIGHVVRCFENDNIIHVSGKVN PADDIEVINTELALADLDTCERAIHRVQKKAKGGDKDAKAELAVLEKCLPQLENAGMLRA LDLSAEEKAAIRYLSFLTLKPTMYIANVNEDGFENNPYLDQVREIAAKEGSVVVPVCAAV EADIAELDDEERDEFMQELGLEEPGLNRVIRAGYKLLNLQTYFTAGVKEVRAWTIPVGAT APQAAGKIHTDFEKGFIRAQTISFEDFITYKGEQGAKEAGKMRAEGKDYIVKDGDVMNFL FNV >gi|296494427|gb|ADTN01000311.1| GENE 15 13171 - 16038 1596 955 aa, chain + ## HITS:1 COG:ycgV_2 KEGG:ns NR:ns ## COG: ycgV_2 COG3468 # Protein_GI_number: 16129165 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 543 955 1 413 413 676 100.0 0 MGIKQHNGNTKADRLAELKIRSPSIQLIKFGAIGLNAIIFSPLLIAADKGSQYGTNITIN DGDRITGDTADPSGNLYGVMTPAGNTPGNINLGNDVTVNVNDASGYAKGIIIQGKNSSLT ANRLTVDVVGQTSAIGINLIGDYTHADLGTGSTIKSNDDGIIIGHSSTLTATQFTIENSN GIGLTINDYGTSVDLGSGSKITTDGSTGVYIGGLNGNNANGAARFTATDLTIDVQGYSAM GINVQKNSVVDLGTNSTIKTNGDNAHGLWSFGQVSANALTVDVTGAAANGVEVRGGTTTI GADSHISSAQGGGLVTSGSDAIINFTGTAAQRNSIFSGGSYGASAQTATAVVNMQNTDIT VDRNGSLALGLWALSGGRITGDSLAITGAAGARGIYAMTNSQIDLTSDLVIDMSTPDQMA IATQHDDGYAASRINASGRMLINGSVLSKGGLINLDMHPGSVWTGSSLSDNVNGGKLDVA MNNSVWNVTSNSNLDTLALSHSTVDFASHGSTAGTFATLNVENLSGNSTFIMRADVVGEG NGVNNKGDLLNISGSSAGNHVLAIRNQGSEATTGNEVLTVVKTTDGAASFSASSQVELGG YLYDVRKNGTNWELYASGTVPEPTPNPEPTPAPAQPPIVNPDPTPEPAPTPKPTTTADAG GNYLNVGYLLNYVENRTLMQRMGDLRNQSKDGNIWLRSYGGSLDSFASGKLSGFDMGYSG IQFGGDKRLSDVMPLYVGLYIGSTHASPDYSGGDGTARSDYMGMYASYMAQNGFYSDLVI KASRQKNSFHVLDSQNNGVNANGTANGMSISLEAGQRFNLSPTGYGFYIEPQTQLTYSHQ NEMTMKASNGLNIHLNHYESLLGRASMILGYDITAGNSQLNVYVKTGAIREFSGDTEYLL NNSREKYSFKGNGWNNGVGVSAQYNKQHTFYLEADYTQGNLFDQKQVNGGYRFSF >gi|296494427|gb|ADTN01000311.1| GENE 16 16138 - 18057 1555 639 aa, chain - ## HITS:1 COG:ycgU KEGG:ns NR:ns ## COG: ycgU COG3284 # Protein_GI_number: 16129164 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; K Transcription # 
Function: Transcriptional activator of acetoin/glycerol metabolism # Organism: Escherichia coli K12 # 1 639 4 642 642 1269 100.0 0 MSGAFNNDGRGISPLIATSWERCNKLMKRETWNVPHQAQGVTFASIYRRKKAMLTLGQAA LEDAWEYMAPRECALFILDETACILSRNGDPQTLQQLSALGFNDGTYCAEGIIGTCALSL AAISGQAVKTMADQHFKQVLWNWAFCATPLFDSKGRLTGTIALACPVEQTTAADLPLTLA IAREVGNLLLTDSLLAETNRHLNQLNALLESMDDGVISWDEQGNLQFINAQAARVLRLDA TASQGRAITELLTLPAVLQQAIKQAHPLKHVEATFESQHQFIDAVITLKPIIETQGTSFI LLLHPVEQMRQLMTSQLGKVSHTFAHMPQDDPQTRRLIHFGRQAARSSFPVLLCGEEGVG KALLSQAIHNESERAAGPYIAVNCELYGDAALAEEFIGGDRTDNENGRLSRLELAHGGTL FLEKIEYLAVELQSALLQVIKQGVITRLDARRLIPIDVKVIATTTADLAMLVEQNRFSRQ LYYALHAFEITIPPLRMRRGSIPALVNNKLRSLEKRFSTRLKIDDDALARLVSCAWPGND FELYSVIENLALSSDNGRIRVSDLPEHLFTEQATDDVSATRLSTSLSFAEVEKEAIINAA QVTGGRIQEMSALLGIGRTTLWRKMKQHGIDAGQFKRRV >gi|296494427|gb|ADTN01000311.1| GENE 17 18285 - 19355 1082 356 aa, chain + ## HITS:1 COG:ycgT KEGG:ns NR:ns ## COG: ycgT COG2376 # Protein_GI_number: 16129163 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Escherichia coli K12 # 1 356 11 366 366 731 100.0 0 MKKLINDVQDVLDEQLAGLAKAHPSLTLHQDPVYVTRADAPVAGKVALLSGGGSGHEPMH CGYIGQGMLSGACPGEIFTSPTPDKIFECAMQVDGGEGVLLIIKNYTGDILNFETATELL HDSGVKVTTVVIDDDVAVKDSLYTAGRRGVANTVLIEKLVGAAAERGDSLDACAELGRKL NNQGHSIGIALGACTVPAAGKPSFTLADNEMEFGVGIHGEPGIDRRPFSSLDQTVDEMFD TLLVNGSYHRTLRFWDYQQGSWQEEQQTKQPLQSGDRVIALVNNLGATPLSELYGVYNRL TTRCQQAGLTIERNLIGAYCTSLDMTGFSITLLKVDDETLALWDAPVHTPALNWGK >gi|296494427|gb|ADTN01000311.1| GENE 18 19366 - 19998 664 210 aa, chain + ## HITS:1 COG:ECs1704 KEGG:ns NR:ns ## COG: ECs1704 COG2376 # Protein_GI_number: 15830958 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Escherichia coli O157:H7 # 1 210 1 210 210 394 98.0 1e-110 MSLSRTQIVNWLTRCGDIFSTESEYLTGLDREIGDADHGLNMNRGFSKVVEKLPAIADKD IGFILKNTGMTLLSSVGGASGPLFGTFFIRAAQATQARQSLTLEELYQMFRDGADGVISR GKAEPGDKTMCDVWVPVVESLRQSSEQNLSVPVALEAASSIAESAAQSTITMQARKGRAS YLGERSIGHQDPGATSVMFMMQMLALAAKE 
>gi|296494427|gb|ADTN01000311.1| GENE 19 20009 - 21427 1200 472 aa, chain + ## HITS:1 COG:ycgC_3 KEGG:ns NR:ns ## COG: ycgC_3 COG1080 # Protein_GI_number: 16129161 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Escherichia coli K12 # 258 472 1 215 215 416 100.0 1e-116 MVNLVIVSHSSRLGEGVGELARQMLMSDSCKIAIAAGIDDPQNPIGTDAVKVMEAIESVA DADHVLVMMDMGSALLSAETALELLAPEIAAKVRLCAAPLVEGTLAATVSAASGADIDKV IFDAMHALEAKREQLGLPSSDTEISDTCPAYDEEARSLAVVIKNRNGLHVRPASRLVYTL STFNADMLLEKNGKCVTPESINQIALLQVRYNDTLRLIAKGPEAEEALIAFRQLAEDNFG ETEEVAPPTLRPVPPVSGKAFYYQPVLCTVQAKSTLTVEEEQDRLRQAIDFTLLDLMTLT AKAEASGLDDIAAIFSGHHTLLDDPELLAAASELLQHEHCTAEYAWQQVLKELSQQYQQL DDEYLQARYIDVDDLLHRTLVHLTQTKEELPQFNSPTILLAENIYPSTVLQLDPAVVKGI CLSAGSPVSHSALIARELGIGWICQQGEKLYAIQPEETLTLDVKTQRFNRQG >gi|296494427|gb|ADTN01000311.1| GENE 20 21747 - 23444 1682 565 aa, chain + ## HITS:1 COG:treA KEGG:ns NR:ns ## COG: treA COG1626 # Protein_GI_number: 16129160 # Func_class: G Carbohydrate transport and metabolism # Function: Neutral trehalase # Organism: Escherichia coli K12 # 1 565 1 565 565 1102 99.0 0 MKSPAPSRPQKMALIPACIFLCFAALSVQAEETPVTPQPPDILLGPLFNDVQNAKLFPDQ KTFADAVPNSDPLMILADYRMQQNQSGFDLRHFVNVNFTLPKEGEKYVPPEGQSLREHID GLWPVLTRSTENTEKWDSLLPLPEPYVVPGGRFREVYYWDSYFTMLGLAESGHWDKVANM VANFAHEIDTYGHIPNGNRSYYLSRSQPPFFALMVELLAQHEGDAALKQYLPQMQKEYAY WMDGVENLQAGQQEKRVVKLQDGTLLNRYWDDRDTPRPESWVEDIATAKSNPNRPATEIY RDLRSAAASGWDFSSRWMDNPQQLNTLRTTSIVPVDLNSLMFKMEKILARASKAAGDNAM ANQYETLANARQKGIEKYLWNDQQGWYADYDLKSHKVRNQLTAAALFPLYVNAAAKDRAN KMATATKTHLLQPGGLNTTSVKSGQQWDAPNGWAPLQWVATEGLQNYGQKEVAMDISWHF LTNVQHTYDREKKLVEKYDVSTTGTGGGGGEYPLQDGFGWTNGVTLKMLDLICPKEQPCD NVPATRPTVKSATTQPSTKEAQPTP >gi|296494427|gb|ADTN01000311.1| GENE 21 23523 - 23819 114 98 aa, chain - ## HITS:1 COG:no KEGG:B21_01181 NR:ns ## KEGG: B21_01181 # Name: ycgY # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 98 49 146 146 200 100.0 1e-50 
MLQLREEEWSEFFFWLLNSLECLDYVIINLTPESKKTLMSEHRNNIQVAIDALYSQRRRK SPGDESETLTRRNDAIFGNHVWQTFAQYFPPGLEKPSV >gi|296494427|gb|ADTN01000311.1| GENE 22 24141 - 24395 289 84 aa, chain - ## HITS:1 COG:ymgE KEGG:ns NR:ns ## COG: ymgE COG2261 # Protein_GI_number: 16129158 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 84 1 84 84 109 100.0 1e-24 MGIIAWIIFDLIAGIIAKLIMPGRDGGGFFLTCILGIVGAVVGGWLATMFGIGGSISGFN LHSFLVAVVGAILVLGIFRLLRRE >gi|296494427|gb|ADTN01000311.1| GENE 23 24521 - 24610 67 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVMTHAEKLLTDKLSLVYRSQLTKLNTNA >gi|296494427|gb|ADTN01000311.1| GENE 24 24596 - 25330 694 244 aa, chain + ## HITS:1 COG:ycgR KEGG:ns NR:ns ## COG: ycgR COG5581 # Protein_GI_number: 16129157 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted glycosyltransferase # Organism: Escherichia coli K12 # 1 244 1 244 244 482 100.0 1e-136 MSHYHEQFLKQNPLAVLGVLRDLHKAAIPLRLSWNGGQLISKLLAITPDKLVLDFGSQAE DNIAVLKAQHITITAETQGAKVEFTVEQLQQSEYLQLPAFITVPPPTLWFVQRRRYFRIS APLHPPYFCQTKLADNSTLRFRLYDLSLGGMGALLETAKPAELQEGMRFAQIEVNMGQWG VFHFDAQLISISERKVIDGKNETITTPRLSFRFLNVSPTVERQLQRIIFSLEREAREKAD KVRD >gi|296494427|gb|ADTN01000311.1| GENE 25 25332 - 25943 627 203 aa, chain - ## HITS:1 COG:mltE KEGG:ns NR:ns ## COG: mltE COG0741 # Protein_GI_number: 16129156 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Escherichia coli K12 # 1 203 39 241 241 405 99.0 1e-113 MKLRWFAFLIVLLAGCSSKHDYTNPPWNAKVPVQRAMQWMPISQKAGAAWGVDPQLITAI IAIESGGNPNAVSKSNAIGLMQLKASTSGRDVYRRMGWSGEPTTSELKNPERNISMGAAY LNILETGPLAGIEDPKVLQYALVVSYANGAGALLRTFSSDRKKAISKINDLDADEFLEHV ARNHPAPQAPRYIYKLEQALDAM >gi|296494427|gb|ADTN01000311.1| GENE 26 25882 - 26070 94 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|293409572|ref|ZP_06653148.1| ## NR: gi|293409572|ref|ZP_06653148.1| predicted protein [Escherichia 
coli B354] # 1 62 1 62 62 127 100.0 2e-28 MGQLSETETWQFLADNRNEMYIMPLRCAVVTSTAIPGYKDRRSEIEMVCLFDCVISGLFI KA >gi|296494427|gb|ADTN01000311.1| GENE 27 26043 - 26957 824 304 aa, chain + ## HITS:1 COG:ldcA KEGG:ns NR:ns ## COG: ldcA COG1619 # Protein_GI_number: 16129155 # Func_class: V Defense mechanisms # Function: Uncharacterized proteins, homologs of microcin C7 resistance protein MccF # Organism: Escherichia coli K12 # 1 304 1 304 304 628 100.0 1e-180 MSLFHLIAPSGYCIKQHAALRGIQRLTDAGHQVNNVEVIARRCERFAGTETERLEDLNSL ARLTTPNTIVLAVRGGYGASRLLADIDWQALVARQQHDPLLICGHSDFTAIQCGLLAHGN VITFSGPMLVANFGADELNAFTEHHFWLALRNETFTIEWQGEGPTCRAEGTLWGGNLAML ISLIGTPWMPKIENGILVLEDINEHPFRVERMLLQLYHAGILPRQKAIILGSFSGSTPND YDAGYNLESVYAFLRSRLSIPLITGLDFGHEQRTVTLPLGAHAILNNTREGTQLTISGHP VLKM >gi|296494427|gb|ADTN01000311.1| GENE 28 27052 - 28788 1802 578 aa, chain + ## HITS:1 COG:STM1801 KEGG:ns NR:ns ## COG: STM1801 COG3263 # Protein_GI_number: 16765142 # Func_class: P Inorganic ion transport and metabolism # Function: NhaP-type Na+/H+ and K+/H+ antiporters with a unique C-terminal domain # Organism: Salmonella typhimurium LT2 # 1 577 1 577 577 960 92.0 0 MDATTIISLFILGSILVTSSILLSSFSSRLGIPILVIFLAIGMLAGVDGVGGIPFDNYPF AYMVSNLALAIILLDGGMRTQASSFRVALGPALSLATLGVLITSGLTGMMAAWLFNLDLI EGLLIGAIVGSTDAAAVFSLLGGKGLNERVGSTLEIESGSNDPMAVFLTITLIAMIQHHE SNISWMFIVDILQQFGLGIVIGLGGGYLLLQMINRIALPAGLYPLLALSGGILIFSLTTA LEGSGILAVYLCGFLLGNRPIRNRYGILQNFDGLAWLAQIAMFLVLGLLVNPSDLLPIAI PALILSAWMIFFARPLSVFAGLLPFRGFNLRERVFISWVGLRGAVPIILAVFPMMAGLEN ARLFFNVAFFVVLVSLLLQGTSLSWAAKKAKVVVPPVGRPVSRVGLDIHPENPWEQFVYQ LSADKWCVGAALRDLHMPKETRIAALFRDNQLLHPTGSTRLREGDVLCVIGRERDLPALG KLFSQSPPVALDQRFFGDFILEASAKYADVALIYGLEDGREYRDKQQTLGEIVQQLLGAA PVVGDQVEFAGMIWTVAEKEDNEVLKIGVRVAEEEAES >gi|296494427|gb|ADTN01000311.1| GENE 29 28922 - 29071 61 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|188493462|ref|ZP_03000732.1| ## NR: gi|188493462|ref|ZP_03000732.1| hypothetical protein Ec53638_4549 [Escherichia coli 53638] # 8 49 1 42 42 76 100.0 4e-13 
MIVSIKRMDTISPEPELTAQAFYIKRLRFIVDSNSTSDKSRFTLDGACT >gi|296494427|gb|ADTN01000311.1| GENE 30 29174 - 30244 923 356 aa, chain - ## HITS:1 COG:dadX KEGG:ns NR:ns ## COG: dadX COG0787 # Protein_GI_number: 16129153 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Escherichia coli K12 # 1 356 1 356 356 714 99.0 0 MTRPIQASLDLQALKQNLSIVRQAATHARVWSVVKANAYGHGIERIWSALGATDGFALLN LEEAITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAPLDI YLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTLMSHFAEAEHPDGISGAMARIEQAA EGLECRRSLSNSAATLWHPEAHFDWVRPGIILYGASPSGQWRDIANTGLRPVMTLSSEII GVQTLKAGERVGYGGRYTARDEQRIGIVAAGYADGYPRHAPTGTPVLVDGVRTMTVGTVS MDMLAVDLTPCPQAGIGTPVELWGKEIKIDDVAAAAGTVGYELMCALALRVPVVTV >gi|296494427|gb|ADTN01000311.1| GENE 31 30254 - 31552 1425 432 aa, chain - ## HITS:1 COG:ECs1684 KEGG:ns NR:ns ## COG: ECs1684 COG0665 # Protein_GI_number: 15830938 # Func_class: E Amino acid transport and metabolism # Function: Glycine/D-amino acid oxidases (deaminating) # Organism: Escherichia coli O157:H7 # 1 432 1 432 432 893 100.0 0 MRVVILGSGVVGVASAWYLNQAGHEVTVIDREPGAALETSAANAGQISPGYAAPWAAPGV PLKAIKWMFQRHAPLAVRLDGTQFQLKWMWQMLRNCDTSHYMENKGRMVRLAEYSRDCLK ALRAETNIQYEGRQGGTLQLFRTEQQYENATRDIAVLEDAGVPYQLLESSRLAEVEPALA EVAHKLTGGLQLPNDETGDCQLFTQNLARMAEQAGVKFRFNTPVDQLLCDGEQIYGVKCG DEVIKADAYVMAFGSYSTAMLKGIVDIPVYPLKGYSLTIPIAQEDGAPVSTILDETYKIA ITRFDNRIRVGGMAEIVGFNTELLQPRRETLEMVVRDLYPRGGHVEQATFWTGLRPMTPD GTPVVGRTRFKNLWLNTGHGTLGWTMACGSGQLLSDLLSGRTPAIPYEDLSVARYSRGFT PSRPGHLHGAHS >gi|296494427|gb|ADTN01000311.1| GENE 32 31882 - 33414 1505 510 aa, chain + ## HITS:1 COG:ycgB KEGG:ns NR:ns ## COG: ycgB COG2719 # Protein_GI_number: 16129151 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 510 1 510 510 1038 100.0 0 MATIDSMNKDTTRLSDGPDWTFDLLDVYLAEIDRVAKLYRLDTYPHQIEVITSEQMMDAY SSVGMPINYPHWSFGKKFIETERLYKHGQQGLAYEIVINSNPCIAYLMEENTITMQALVM AHACYGHNSFFKNNYLFRSWTDASSIVDYLIFARKYITECEERYGVDEVERLLDSCHALM 
NYGVDRYKRPQKISLQEEKARQKSREEYLQSQVNMLWRTLPKREEEKTVAEARRYPSEPQ ENLLYFMEKNAPLLESWQREILRIVRKVSQYFYPQKQTQVMNEGWATFWHYTILNHLYDE GKVTERFMLEFLHSHTNVVFQPPYNSPWYSGINPYALGFAMFQDIKRICQSPTEEDKYWF PDIAGSDWLETLHFAMRDFKDESFISQFLSPKVMRDFRFFTVLDDDRHNYLEISAIHNEE GYREIRNRLSSQYNLSNLEPNIQIWNVDLRGDRSLTLRYIPHNRAPLDRGRKEVLKHVHR LWGFDVMLEQQNEDGSIELLERCPPRMGNL >gi|296494427|gb|ADTN01000311.1| GENE 33 33466 - 34185 762 239 aa, chain - ## HITS:1 COG:ECs1682 KEGG:ns NR:ns ## COG: ECs1682 COG2186 # Protein_GI_number: 15830936 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 239 1 239 239 492 100.0 1e-139 MVIKAQSPAGFAEEYIIESIWNNRFPPGTILPAERELSELIGVTRTTLREVLQRLARDGW LTIQHGKPTKVNNFWETSGLNILETLARLDHESVPQLIDNLLSVRTNISTIFIRTAFRQH PDKAQEVLATANEVADHADAFAELDYNIFRGLAFASGNPIYGLILNGMKGLYTRIGRHYF ANPEARSLALGFYHKLSALCSEGAHDQVYETVRRYGHESGEIWHRMQKNLPGDLAIQGR >gi|296494427|gb|ADTN01000311.1| GENE 34 34406 - 35947 1695 513 aa, chain + ## HITS:1 COG:ECs1681 KEGG:ns NR:ns ## COG: ECs1681 COG3067 # Protein_GI_number: 15830935 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter # Organism: Escherichia coli O157:H7 # 1 513 1 513 513 897 100.0 0 MEISWGRALWRNFLGQSPDWYKLALIIFLIVNPLIFLISPFVAGWLLVAEFIFTLAMALK CYPLLPGGLLAIEAVFIGMTSAEHVREEVAANLEVLLLLMFMVAGIYFMKQLLLFIFTRL LLSIRSKMLLSLSFCVAAAFLSAFLDALTVVAVVISVAVGFYGIYHRVASSRTEDTDLQD DSHIDKHYKVVLEQFRGFLRSLMMHAGVGTALGGVMTMVGEPQNLIIAKAAGWHFGDFFL RMSPVTVPVLICGLLTCLLVEKLRWFGYGETLPEKVREVLQQFDDQSRHQRTRQDKIRLI VQAIIGVWLVTALALHLAEVGLIGLSVIILATSLTGVTDEHAIGKAFTESLPFTALLTVF FSVVAVIIDQQLFSPIIQFVLQASEHAQLSLFYIFNGLLSSISDNVFVGTIYINEAKAAM ESGAITLKQYELLAVAINTGTNLPSVATPNGQAAFLFLLTSALAPLIRLSYGRMVWMALP YTLVLTLVGLLCVEFTLAPVTEWFMQMGWIATL >gi|296494427|gb|ADTN01000311.1| GENE 35 36093 - 36623 599 176 aa, chain + ## HITS:1 COG:ECs1680 KEGG:ns NR:ns ## COG: ECs1680 COG1495 # Protein_GI_number: 15830934 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Disulfide bond 
formation protein DsbB # Organism: Escherichia coli O157:H7 # 1 176 1 176 176 330 100.0 1e-90 MLRFLNQCSQGRGAWLLMAFTALALELTALWFQHVMLLKPCVLCIYERCALFGVLGAALI GAIAPKTPLRYVAMVIWLYSAFRGVQLTYEHTMLQLYPSPFATCDFMVRFPEWLPLDKWV PQVFVASGDCAERQWDFLGLEMPQWLLGIFIAYLIVAVLVVISQPFKAKKRDLFGR >gi|296494427|gb|ADTN01000311.1| GENE 36 36669 - 37937 667 422 aa, chain - ## HITS:1 COG:umuC KEGG:ns NR:ns ## COG: umuC COG0389 # Protein_GI_number: 16129147 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Escherichia coli K12 # 1 422 1 422 422 876 100.0 0 MFALCDVNAFYASCETVFRPDLWGKPVVVLSNNDGCVIARNAEAKALGVKMGDPWFKQKD LFRRCGVVCFSSNYELYADMSNRVMSTLEELSPRVEIYSIDEAFCDLTGVRNCRDLTDFG REIRATVLQRTHLTVGVGIAQTKTLAKLANHAAKKWQRQTGGVVDLSNLERQRKLMSALP VDDVWGIGRRISKKLDAMGIKTVLDLADTDIRFIRKHFNVVLERTVRELRGEPCLQLEEF APTKQEIICSRSFGERITDYPSMRQAICSYAARAAEKLRSEHQYCRFISTFIKTSPFALN EPYYGNSASVKLLTPTQDSRDIINAATRSLDAIWQAGHRYQKAGVMLGDFFSQGVAQLNL FDDNAPRPGSEQLMTVMDTLNAKEGRGTLYFAGQGIQQQWQMKRAMLSPRYTTRSSDLLR VK >gi|296494427|gb|ADTN01000311.1| GENE 37 37937 - 38356 316 139 aa, chain - ## HITS:1 COG:ECs1678 KEGG:ns NR:ns ## COG: ECs1678 COG1974 # Protein_GI_number: 15830932 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Escherichia coli O157:H7 # 1 139 1 139 139 256 99.0 8e-69 MLFIKPADLREILTFPLFSDLVQCGFPSPAADYVEQRIDLNQLLIQHPSATYFVKASGDS MIDGGISDGDLLIVDSAITASHGDIVIAAVDGEFTVKKLQLRPTVQLIPMNSAYSPITIS SEDTLDVFGVVIHVVKAMR >gi|296494427|gb|ADTN01000311.1| GENE 38 38729 - 39640 683 303 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_1668 NR:ns ## KEGG: ECH74115_1668 # Name: hlyE # Def: hemolysin E # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 303 49 351 351 510 98.0 1e-143 MTEIVADKTVEVVKNAIETADGALDLYNKYLDQVIPWQTFDETIKELSRFKQEYSQAASV LVGDIKTLLMDSQDKYFEATQTVYEWCGVATQLLAAYIFLFDEYNEKKASAQKDILIKVL 
DDGITKLNEAQKSLLVSSQSFNNASGKLLALDSQLTNDFSEKSSYFQSQVDKIRKEAYAG AAAGIVAGPFGLIISYSIAAGVVEGKLIPELKNKLKSVQNFFTTLSNTVKQANKDIDAAK LKLTTEIAAIGEIKTETETTRFYVDYDDLMLSLLKEAAKKMINTCNEYQKRHGKKTLFEV PEV >gi|296494427|gb|ADTN01000311.1| GENE 39 39846 - 40292 473 148 aa, chain - ## HITS:1 COG:ECs1676 KEGG:ns NR:ns ## COG: ECs1676 COG2983 # Protein_GI_number: 15830930 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 148 11 158 158 301 100.0 2e-82 MSDVPFWQSKTLDEMSDAEWESLCDGCGQCCLHKLMDEDTDEIYFTNVACRQLNIKTCQC RNYERRFEFEPDCIKLTRENLPTFEWLPMTCAYRLLAEGKDLPAWHPLLTGSKAAMHGER ISVRHIAVKESEVIDWQDHILNKPDWAQ >gi|296494427|gb|ADTN01000311.1| GENE 40 40384 - 41043 657 219 aa, chain - ## HITS:1 COG:ECs1675 KEGG:ns NR:ns ## COG: ECs1675 COG0179 # Protein_GI_number: 15830929 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) # Organism: Escherichia coli O157:H7 # 1 219 1 219 219 446 99.0 1e-125 MYQHHNWQGALLDYPVSKVVCVGSNYAKHIKEMGSAVPEEPVLFIKPETALCDLRQPLAI PSDFGSVHHEVELAVLIGATLRQATEEHVRKAIAGYGVALDLTLRDVQGKMKKAGQPWEK AKAFDNSCPLSGFIPAAEFTGDPQNTTLGLSVNGEQRQQGSTADMIHKIVPLIAYMSKFF TLKAGDVVLTGTPDGVGPLQSGDELTVTFDGHSLTTRVL >gi|296494427|gb|ADTN01000311.1| GENE 41 41115 - 41408 301 97 aa, chain - ## HITS:1 COG:ECs1674 KEGG:ns NR:ns ## COG: ECs1674 COG3100 # Protein_GI_number: 15830928 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 97 12 108 108 176 100.0 1e-44 MFCVIYRSSKRDQTYLYVEKKDDFSRVPEELMKGFGQPQLAMILPLDGRKKLVNADIEKV KQALTEQGYYLQLPPPPEDLLKQHLSVMGQKTDDTNK >gi|296494427|gb|ADTN01000311.1| GENE 42 41650 - 42051 251 133 aa, chain + ## HITS:1 COG:no KEGG:ECO111_1505 NR:ns ## KEGG: ECO111_1505 # Name: ycgK # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 133 1 133 133 236 98.0 3e-61 
MKIKSIRKAVLLLALLTSTSFAAGKNVNVEFRKGHSSAQYSGEIKGYDYDTYTFYAKKGQ KVHVSISNEGADTYLFGPGIDDSVDLSRYSPELDSHGQYSLPASGKYELRVLQTRNDARK NKTKKYNVDIQIK >gi|296494427|gb|ADTN01000311.1| GENE 43 43042 - 43737 428 231 aa, chain + ## HITS:1 COG:minC KEGG:ns NR:ns ## COG: minC COG0850 # Protein_GI_number: 16129139 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation inhibitor # Organism: Escherichia coli K12 # 1 231 1 231 231 442 100.0 1e-124 MSNTPIELKGSSFTLSVVHLHEAEPKVIHQALEDKIAQAPAFLKHAPVVLNVSALEDPVN WSAMHKAVSATGLRVIGVSGCKDAQLKAEIEKMGLPILTEGKEKAPRPAPTPQAPAQNTT PVTKTRLIDTPVRSGQRIYAPQCDLIVTSHVSAGAELIADGNIHVYGMMRGRALAGASGD RETQIFCTNLMAELVSIAGEYWLSDQIPAEFYGKAARLQLVENALTVQPLN >gi|296494427|gb|ADTN01000311.1| GENE 44 43761 - 44573 907 270 aa, chain + ## HITS:1 COG:ECs1669 KEGG:ns NR:ns ## COG: ECs1669 COG2894 # Protein_GI_number: 15830923 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation inhibitor-activating ATPase # Organism: Escherichia coli O157:H7 # 1 270 1 270 270 506 100.0 1e-143 MARIIVVTSGKGGVGKTTSSAAIATGLAQKGKKTVVIDFDIGLRNLDLIMGCERRVVYDF VNVIQGDATLNQALIKDKRTENLYILPASQTRDKDALTREGVAKVLDDLKAMDFEFIVCD SPAGIETGALMALYFADEAIITTNPEVSSVRDSDRILGILASKSRRAENGEEPIKEHLLL TRYNPGRVSRGDMLSMEDVLEILRIKLVGVIPEDQSVLRASNQGEPVILDINADAGKAYA DTVERLLGEERPFRFIEEEKKGFLKRLFGG >gi|296494427|gb|ADTN01000311.1| GENE 45 44577 - 44843 459 88 aa, chain + ## HITS:1 COG:ECs1668 KEGG:ns NR:ns ## COG: ECs1668 COG0851 # Protein_GI_number: 15830922 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation topological specificity factor # Organism: Escherichia coli O157:H7 # 1 88 1 88 88 148 100.0 2e-36 MALLDFFLSRKKNTANIAKERLQIIVAERRRSDAEPHYLPQLRKDILEVICKYVQIDPEM VTVQLEQKDGDISILELNVTLPEAEELK >gi|296494427|gb|ADTN01000311.1| GENE 46 45215 - 45427 77 70 aa, chain - ## HITS:1 COG:ECs3515 KEGG:ns NR:ns ## COG: ECs3515 COG3468 # Protein_GI_number: 15832769 # Func_class: M Cell 
wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli O157:H7 # 1 70 1502 1571 1571 129 95.0 2e-30 MRKEFVDDNRVKVNNDGNFVNDLSGRRGIYQAAIKASFSSTFSGHLGVGYSHGAGVESPW NAVAGVNWSF >gi|296494427|gb|ADTN01000311.1| GENE 47 45959 - 46132 93 57 aa, chain + ## HITS:1 COG:no KEGG:ECDH10B_1224 NR:ns ## KEGG: ECDH10B_1224 # Name: ymgI, ECK4403 # Def: hypothetical protein # Organism: E.coli_DH10B # Pathway: not_defined # 1 57 1 57 57 74 100.0 1e-12 MSYSSFKIILIHVKNIIPIITATLILNYLNNSERSLVKQILIEDEIIVCATYLIPDI >gi|296494427|gb|ADTN01000311.1| GENE 48 46134 - 46478 347 114 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_1978 NR:ns ## KEGG: EcSMS35_1978 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 114 1 114 114 110 100.0 1e-23 MKKKILAFGLISALFCSTPAMADMNRTTKGALLGAGVGLLTGNGVNGVLKGAAVGAGVGA VTEKGRDGKNARKGAKVGAAVGAVTGVLTGNGLEGAIKGAVIGGTGGAILGKMK >gi|296494427|gb|ADTN01000311.1| GENE 49 46488 - 46817 458 109 aa, chain + ## HITS:1 COG:no KEGG:ECO111_1498 NR:ns ## KEGG: ECO111_1498 # Name: ymgD # Def: hypothetical protein # Organism: E.coli_O111_H- # Pathway: not_defined # 1 109 1 109 109 195 100.0 3e-49 MKKFALLAGLFVFAPMTWAQDYNIKNGLPSETYITCAEANEMAKTDSAQVAEIVAVMGNA SVASRDLKIEQSPELSAKVVEKLNQVCAKDPQMLLITAIDDTMRAIGKK >gi|296494427|gb|ADTN01000311.1| GENE 50 46900 - 47793 841 297 aa, chain - ## HITS:1 COG:b1170 KEGG:ns NR:ns ## COG: b1170 COG3468 # Protein_GI_number: 16129133 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 1 297 42 338 338 570 100.0 1e-163 MRLNDRAGETRYIDPVTEQERSSRLWLRQIGGHNAWRDSNGQLRTTSHRYVSQLGGDLLT GGFTDSDSWRLGVMAGYARDYNLTHSSVSDYRSKGSVRGYSAGLYATWFADDISKKGAYI DSWAQYSWFKNSVKGDELAYESYSAKGATVSLEAGYGFALNKSFGLEAAKYTWIFQPQAQ 
AIWMGVDHNAHTEANGSRIENDANNNIQTRLGFRTFIRTQEKNSGPHGDDFEPFVEMNWI HNSKDFAVSMNGVKVEQDGASNLGEIKLGVNGNLNPAASVWGMWACSWVIMATMTPQ >gi|296494427|gb|ADTN01000311.1| GENE 51 48001 - 49521 351 506 aa, chain - ## HITS:1 COG:b1169 KEGG:ns NR:ns ## COG: b1169 COG3468 # Protein_GI_number: 16129132 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Escherichia coli K12 # 1 506 1 506 506 780 100.0 0 MKLKKLPGFSLGLIALAVGNAYATQLLDDYSIISYMTDEESPIEIKDNNPISNGEYLTTE DESHAVKVDDGVTGYINNASVMTSGDGSYGISVDSQNKVLYISDSDIKTSGSVSDKENGG ITASAVVSEFGGTIFMNGDNSVESGGAYSAGLLSQVNDSEKMVNNTRLETTDKTNIVTSG ENAVGVLACSSPGESRTCVDAVDDEVSDSNSYEVISRADLKMNGGSITTNGINSYGAYAN GKKAYINLDYVALETVADGSYAVAIRQGNIDIKNSSITTTGTKAPIAKIYNGGELFFSNV TAVSKQDKGISIDASNIDSQAKIALLSVELSSALDSIDVNKTTTDVSILNRSIITPGNNV LVNNTGGDLNIISSDSILNGATKLVSGTTTLKLSENTIWNMKDDSVVTHLTNSDSIINLS YDDGQTFTQGKTLTVKGNYVGNNGQLNIRTVLGDDKSATDRLIVEGNTSGSTTVYVKNAG GSGAATLNGIELITVNGDESPADAFR >gi|296494427|gb|ADTN01000311.1| GENE 52 49921 - 50139 88 72 aa, chain - ## HITS:1 COG:no KEGG:c1611 NR:ns ## KEGG: c1611 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_CFT073 # Pathway: not_defined # 1 72 13 84 84 108 97.0 5e-23 MNNSNNLDYFTLYIIFSIAFMLITLLVILIAKPSTGLGEVLVTINLLNALVWLAINLVNR LRERLVNHRDQQ >gi|296494427|gb|ADTN01000311.1| GENE 53 50271 - 51794 586 507 aa, chain - ## HITS:1 COG:ycgG_2 KEGG:ns NR:ns ## COG: ycgG_2 COG2200 # Protein_GI_number: 16129131 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 237 507 1 271 271 551 99.0 1e-156 MRNTLIPILVAICLFITGVAILNIQLWYSAKAEYLAGARYAANNINHILEEASQATQTAV NIAGKECNLEEQYQLGTEAALKPHLRTIIILKQGIVWCTSLPGNRVLLSRIPVFPDSNLL LAPAIDTVNRLPILLYQNQFADTRILVTISDQHIRGALNVPLKGVRYVLRVADDIIGPTG DVMTLNGHYPYTEKVHSTKYHFTIIFNPPPLFSFYRLIDKGFGILIFILLIACAAAFLLD RYFNKSATPEEILRRAINNGEIVPFYQPVVNGREGTLRGVEVLARWKQPHGGYISPAAFI 
PLAEKSGLIVPLTQSLMNQVARQMNAIASKLPEGFHIGINFSASHIISPTFVDECLNFRD SFTRRDLNLVLEVTEREPLNVDESLVQRLNILHENGFVIALDDFGTGYSGLSYLHDLHID YIKIDHSFVGRVNADPESTRILDCVLDLARKLSISIVAEGVETKEQLDYLNQNNITFQQG YYFYKPVTYIDLVKIILSKPKVKVVVE >gi|296494427|gb|ADTN01000311.1| GENE 54 51837 - 51917 56 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEKILSSAWIFLMLSTMLSVDASASE >gi|296494427|gb|ADTN01000311.1| GENE 55 52126 - 52302 245 58 aa, chain - ## HITS:1 COG:no KEGG:B21_01150 NR:ns ## KEGG: B21_01150 # Name: ymgC # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 58 25 82 82 96 100.0 3e-19 MTHGYVDSHIIDQALRLRLKDETSVILSDLYLQILQYIEMHKTTLTDIIINDRESVLS >gi|296494427|gb|ADTN01000311.1| GENE 56 52487 - 52753 197 88 aa, chain - ## HITS:1 COG:no KEGG:B21_01149 NR:ns ## KEGG: B21_01149 # Name: ymgB # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 88 1 88 88 135 100.0 4e-31 MLEDTTIHNAITDKALASYFRSSGNLLEEESAVLGQAVTNLMLSGDNVNNKNIILSLIHS LETTSDILKADVIRKTLEIVLRYTADDM >gi|296494427|gb|ADTN01000311.1| GENE 57 53097 - 53333 214 78 aa, chain - ## HITS:1 COG:no KEGG:B21_01147 NR:ns ## KEGG: B21_01147 # Name: ycgZ # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 78 1 78 78 140 100.0 2e-32 MHQNSVTLDSAGAITRYFAKANLHTQQETLGEIVTEILKDGRNLSRKSLCAKLLCRLEHA TGEEEQKHYNALIGLLFE >gi|296494427|gb|ADTN01000311.1| GENE 58 53647 - 54201 233 184 aa, chain + ## HITS:1 COG:no KEGG:ECSE_1206 NR:ns ## KEGG: ECSE_1206 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SE11 # Pathway: not_defined # 1 183 27 209 429 383 99.0 1e-105 MLTTLIYRSHIRDDEPVKKIEEMVSIANRRNMQSDVTGILLFNGSHFFQLLEGPEEQVKM IYRAICQDPRHYNIVELLCDYAPARRFGKAGMELFDLRLHERDDVLQAVFDKGTSKFQLT YDDRALQFFRTFVLATEQSTYFEIPAEDSWLFIADGSDKELDSCALSPTINDHFAFHPIV DRLC Prediction of potential genes in microbial genomes Time: Mon May 16 00:12:48 2011 Seq name: gi|296494426|gb|ADTN01000312.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont782.1, whole genome shotgun sequence Length of sequence - 
5358 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 1, operones - 1 average op.length - 10.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 45 - 560 257 ## pECS88_0074 conjugal transfer protein TraV 2 1 Op 2 . - CDS 557 - 808 205 ## pECS88_0073 conjugal transfer protein TrbG 3 1 Op 3 . - CDS 820 - 1017 144 ## E2348_P1_067 conjugal transfer protein TrbD 4 1 Op 4 . - CDS 1004 - 1594 214 ## pECS88_0071 conjugal transfer protein TraP 5 1 Op 5 . - CDS 1584 - 3011 1243 ## EcSMS35_A0035 conjugal transfer pilus assembly protein TraB 6 1 Op 6 . - CDS 3011 - 3739 483 ## p1ECUMN_0108 conjugal transfer protein TraK 7 1 Op 7 . - CDS 3726 - 4292 548 ## EcSMS35_A0037 conjugal transfer pilus assembly protein TraE 8 1 Op 8 . - CDS 4314 - 4625 355 ## EcSMS35_A0038 conjugal transfer pilus assembly protein TraL 9 1 Op 9 . - CDS 4640 - 5005 414 ## E2348_P1_073 conjugal transfer pilin subunit TraA 10 1 Op 10 . - CDS 5039 - 5221 65 ## PSLT076 conjugal transfer protein TraY - Prom 5290 - 5349 3.5 Predicted protein(s) >gi|296494426|gb|ADTN01000312.1| GENE 1 45 - 560 257 171 aa, chain - ## HITS:1 COG:no KEGG:pECS88_0074 NR:ns ## KEGG: pECS88_0074 # Name: traV # Def: conjugal transfer protein TraV # Organism: E.coli_S88 # Pathway: not_defined # 1 171 1 171 171 273 99.0 2e-72 MKQISFFIPLLGTLLLSGCAGTSTEFECNATTSDTCMTMEQANEKAKKLERSSEAKPVAA SLPRLAEGNFRTMPVQTVTATTPSGSRPAVTAHPEQKLLAPRPLFTAAREVKTVVPVSSV APVTPPRPLRTGEQTAALWIAPYIDNQDVYHQPSSVFFVIKPSAWGKPRIN >gi|296494426|gb|ADTN01000312.1| GENE 2 557 - 808 205 83 aa, chain - ## HITS:1 COG:no KEGG:pECS88_0073 NR:ns ## KEGG: pECS88_0073 # Name: trbG # Def: conjugal transfer protein TrbG # Organism: E.coli_S88 # Pathway: not_defined # 1 83 1 83 83 157 100.0 1e-37 MNKLVSDGSVKKINYPVLYESGITPPLCEVSAPEPDAGGKRIVAYVYKSSRSTVFENPDI VKTCTVRDLKKDFVNCDEKGEGQ >gi|296494426|gb|ADTN01000312.1| GENE 3 820 - 1017 144 65 aa, chain - ## HITS:1 COG:no KEGG:E2348_P1_067 NR:ns ## KEGG: E2348_P1_067 # Name: trbD # Def: 
conjugal transfer protein TrbD # Organism: E.coli_0127 # Pathway: not_defined # 1 61 1 61 106 116 98.0 3e-25 MNMRNINVITALSVPGKTVSDDFMHAVLSNCATRIVLPAPEKFGSESLPDNFNMTAVGVM KNSEI >gi|296494426|gb|ADTN01000312.1| GENE 4 1004 - 1594 214 196 aa, chain - ## HITS:1 COG:no KEGG:pECS88_0071 NR:ns ## KEGG: pECS88_0071 # Name: traP # Def: conjugal transfer protein TraP # Organism: E.coli_S88 # Pathway: not_defined # 1 196 1 196 196 387 100.0 1e-106 MANNMSSRQACHAARYVVARVLRGLFWCLKYTVILPLATMALMALFVLWKDNTTPGKLLV KEINFVRQTAPAGQFPVSECWFSSSDSSGRSEIQGICHYRAADAADYVRETDRSLMQLVT ALWATLALMYVSLAAITGKYPVRPGKMKCIRVVTADEHLKEVYTEDASLPGKIRKCPVYL PDDRTNRNNGDKNEHA >gi|296494426|gb|ADTN01000312.1| GENE 5 1584 - 3011 1243 475 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_A0035 NR:ns ## KEGG: EcSMS35_A0035 # Name: traB # Def: conjugal transfer pilus assembly protein TraB # Organism: E.coli_SECEC # Pathway: not_defined # 1 475 1 475 475 841 100.0 0 MASINTIVKRKQYLWLGIVVVGTASAIGGALYLSDVDMSGNGETVAEQEPVPDMTGVVDT TFDDKVRQHATTEMQVTAAQMQKQYEEIRRELDVLNKQRGDDQRRIEKLGQDNAALAEQV KALGANPVTATGEPVPQMPASPPGPEGEPQPGNTPVSFPPQGSVAVPPPTAFYPGNGVTP PPQVTYQSVPVPNRIQRKVFTRNEGKQGPSLPYIPSGSFAKAMLIEGADANASVTGNEST VPMQLRITGLVEMPNSKTYDATGCFVGLEAWGDVSSERAIVRTRNISCLKDGKTIDMPIK GHVSFRGKNGIKGEVVMRNGKILGWAWGAGFVDGIGQGMERASQPAVGLGATAAYGAGDV LKMGIGGGASKAAQTLSDYYIKRAEQYHPVIPIGAGNEVTVVFQDGFQLKTVEEMALERT QSRAEEDNPESPVPVPPSAESHLNGFNTDQMLKQLGNLNPQQFMSGSQGGGNDGK >gi|296494426|gb|ADTN01000312.1| GENE 6 3011 - 3739 483 242 aa, chain - ## HITS:1 COG:no KEGG:p1ECUMN_0108 NR:ns ## KEGG: p1ECUMN_0108 # Name: traK # Def: conjugal transfer protein TraK # Organism: E.coli_UMN026 # Pathway: not_defined # 1 242 1 242 242 451 100.0 1e-125 MRKNNTAIIFGSLFFSCSVMAANGTLAPTVVPMVNGGQASIAISNTSPNLFTVPGDRIIA VNSLDGALTNNEQTASGGVVVATVNKKPFTFILETERGLNLSIQAVPREGAGRTIQLVSD LRGTGEEAGAWETSTPYESLLVTISQAVRGGKLPAGWYQVPVTKETLQAPAGLSSVADAV WTGNHLKMVRFAVENKTLSALNIRESDFWQPGTRAVMFSQPASQLLAGARMDVYVIRDGE GN >gi|296494426|gb|ADTN01000312.1| GENE 7 3726 - 4292 548 188 aa, chain - ## 
HITS:1 COG:no KEGG:EcSMS35_A0037 NR:ns ## KEGG: EcSMS35_A0037 # Name: traE # Def: conjugal transfer pilus assembly protein TraE # Organism: E.coli_SECEC # Pathway: not_defined # 1 188 1 188 188 363 100.0 2e-99 MEHGARLSTSRVMAIAFIFMSVLIVLSLSVNVIQGVNNYRLQNEQRTAVTPMAFNAPFAV SQNSADASYLQQMALSFIALRLNVSSETVDASHQALLQYIRPGAQNQMKVILAEEAKRIK NDNVNSAFFQTSVRVWPQYGRVEIRGVLKTWIGDSKPFTDIKHYILILKRENGVTWLDNF GETDDEKK >gi|296494426|gb|ADTN01000312.1| GENE 8 4314 - 4625 355 103 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_A0038 NR:ns ## KEGG: EcSMS35_A0038 # Name: traL # Def: conjugal transfer pilus assembly protein TraL # Organism: E.coli_SECEC # Pathway: not_defined # 1 103 1 103 103 207 100.0 1e-52 MSGDENKLKKYRFPETLTNQSRWFGLPLDELIPAAICIGWGITTSKYLFGIGAAVLVYFG IKKLKKGRGSSWLRDLIYWYMPTALLRGIFHNVPDSCFRQWIK >gi|296494426|gb|ADTN01000312.1| GENE 9 4640 - 5005 414 121 aa, chain - ## HITS:1 COG:no KEGG:E2348_P1_073 NR:ns ## KEGG: E2348_P1_073 # Name: traA # Def: conjugal transfer pilin subunit TraA # Organism: E.coli_0127 # Pathway: not_defined # 1 121 1 121 121 172 100.0 2e-42 MDAVLSVQGASAPVKKKSFFSKFTRLNMLRLARAVIPAAVLMMFFPQLAMAAGSSGQDLM ASGNTTVKATFGKDSSVVKWVVLAEVLVGAVMYMMTKNVKFLAGFAIISVFIAVGMAVVG L >gi|296494426|gb|ADTN01000312.1| GENE 10 5039 - 5221 65 60 aa, chain - ## HITS:1 COG:no KEGG:PSLT076 NR:ns ## KEGG: PSLT076 # Name: traY # Def: conjugal transfer protein TraY # Organism: S.typhimurium # Pathway: not_defined # 1 60 47 106 106 93 90.0 2e-18 MYLDEDTNNRLIKAKDRSGRSKTIEVQIRLRDHLKRFPDFYNEEIFREVTEESESTFKEL Prediction of potential genes in microbial genomes Time: Mon May 16 00:13:17 2011 Seq name: gi|296494425|gb|ADTN01000313.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont786.1, whole genome shotgun sequence Length of sequence - 1963 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 10 - 69 3.3 1 1 Tu 1 . 
+ CDS 131 - 727 403 ## COG3109 Activator of osmoprotectant transporter ProP 2 2 Op 1 . - CDS 745 - 1029 131 ## gi|10955447|ref|NP_065299.1| hypothetical protein R721_09 3 2 Op 2 . - CDS 1103 - 1768 237 ## COG1192 ATPases involved in chromosome partitioning - Prom 1813 - 1872 3.8 Predicted protein(s) >gi|296494425|gb|ADTN01000313.1| GENE 1 131 - 727 403 198 aa, chain + ## HITS:1 COG:VC1497 KEGG:ns NR:ns ## COG: VC1497 COG3109 # Protein_GI_number: 15641506 # Func_class: T Signal transduction mechanisms # Function: Activator of osmoprotectant transporter ProP # Organism: Vibrio cholerae # 97 196 21 120 208 58 39.0 8e-09 MTAIIEASMQPKKTPVIVVKKRRVLVMPENPVVNEKPQEVQKSAVNENKKVQKKDAVAEK TRKKQPQPWYLKKQITFPPKYPKEYFEKCFNKVRAVFPELWTDEKKNLPLKSGILQDVEK YLADNPDVDLTIEEWNCAVQVMTFRWQYLQNCTVPGATRYDLYGKPAGTVKKAHATYAQL VLDARKKASEKKQLKRKG >gi|296494425|gb|ADTN01000313.1| GENE 2 745 - 1029 131 94 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|10955447|ref|NP_065299.1| ## NR: gi|10955447|ref|NP_065299.1| hypothetical protein R721_09 [Escherichia coli] # 1 94 1 94 94 164 100.0 1e-39 MALDLKTPVIPGQKTEQTNADTARFINEATRKKSGATKVTNVRLGDVYEDILEREALRTG QTKTAIMKAALAAYEHGLDENGKNHWILQSARLK >gi|296494425|gb|ADTN01000313.1| GENE 3 1103 - 1768 237 221 aa, chain - ## HITS:1 COG:HP1000 KEGG:ns NR:ns ## COG: HP1000 COG1192 # Protein_GI_number: 15645615 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Helicobacter pylori 26695 # 5 160 3 158 218 72 28.0 5e-13 MGRALLVASDKGGVGKSTTATNLASYLVNNSRTVIILKTDKNNDILSWNEKRKQNGLLPV PVYEEYGDVSEVIKRLKKICEVLIIDCPGHDSKEFRSALTVADILLTLVKPSSDFEAETL THVTEKVRTAQQINPDLQPWVLFTRINTTKPRHRKAAIDLDKLLRENPVWIQPLRTRISD LDIFEAACNEGAGVHDVRRASSLSTAKAQIELLAQELGINP Prediction of potential genes in microbial genomes Time: Mon May 16 00:13:23 2011 Seq name: gi|296494424|gb|ADTN01000314.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont786.2, whole genome shotgun sequence Length of sequence - 2825 bp Number of 
predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 209 - 268 4.7 1 1 Op 1 . + CDS 323 - 646 148 ## ECH74115_A0052 hypothetical protein 2 1 Op 2 . + CDS 687 - 896 132 ## ECH74115_A0051 hypothetical protein 3 1 Op 3 . + CDS 994 - 1218 153 ## ECH74115_A0050 hypothetical protein + Term 1225 - 1251 -0.7 4 1 Op 4 . + CDS 1270 - 1530 236 ## ECH74115_A0049 hypothetical protein 5 1 Op 5 . + CDS 1520 - 1771 93 ## ECH74115_A0048 YajA protein + Prom 2207 - 2266 4.7 6 2 Tu 1 . + CDS 2320 - 2649 214 ## ECH74115_A0046 hypothetical protein + Term 2657 - 2693 7.6 Predicted protein(s) >gi|296494424|gb|ADTN01000314.1| GENE 1 323 - 646 148 107 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_A0052 NR:ns ## KEGG: ECH74115_A0052 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 107 1 107 107 187 89.0 7e-47 MENHAKFVATEILNQLGGNRFIAMTGAKNFACFDENGECGLCFRLPSNFAMKGINLVKIK LTFSDTYLVTFSRVRGATVKEISKFDNIYCDQLECLFNEQTGLATRL >gi|296494424|gb|ADTN01000314.1| GENE 2 687 - 896 132 69 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_A0051 NR:ns ## KEGG: ECH74115_A0051 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 69 1 74 74 77 63.0 1e-13 MKVLIGNINIDNYHMLSALAGIAGFDRSIQFTCEISASIEIMEDDFVNKAGILKMLDEFI ENDFSIKLV >gi|296494424|gb|ADTN01000314.1| GENE 3 994 - 1218 153 74 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_A0050 NR:ns ## KEGG: ECH74115_A0050 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 74 25 98 98 126 85.0 3e-28 MKQSTFPVIVSTTGHVFSVVRVTLCTICLKHEKTGEAYVVIFTDCHNIRDYKKGVVPALG ELYQEDVDLITGKS >gi|296494424|gb|ADTN01000314.1| GENE 4 1270 - 1530 236 86 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_A0049 NR:ns ## KEGG: ECH74115_A0049 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 
84 1 86 88 124 87.0 8e-28 MKIRKGDRQYYLNKEGDTFHLVKRVKTFSKSATLGKTKATVKTVADLVFHEEAFDTIDFA SDGLRENDKEIVSMMIQEMSEGKNAK >gi|296494424|gb|ADTN01000314.1| GENE 5 1520 - 1771 93 83 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_A0048 NR:ns ## KEGG: ECH74115_A0048 # Name: not_defined # Def: YajA protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 75 6 80 85 129 89.0 5e-29 MPNENTPENIKRLRQKIGLTQKECAEIFSMSPRTWRRKEEPVGTASGTALTPVEFKFLLL LAGEHPDYVLCKRNKKNSDSGSN >gi|296494424|gb|ADTN01000314.1| GENE 6 2320 - 2649 214 109 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_A0046 NR:ns ## KEGG: ECH74115_A0046 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 109 1 109 109 188 90.0 5e-47 MKTLTFNNGTVSVGDVFVSSWGYEQTNVNFYQVISVHGKKTVTVQEIRASVHRTHSMSGY KTPLLNDFCGEPLKRRVRDYYSTPAIEIESFETAYKGSPEEKHEFTSYY Prediction of potential genes in microbial genomes Time: Mon May 16 00:13:35 2011 Seq name: gi|296494423|gb|ADTN01000315.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont786.3, whole genome shotgun sequence Length of sequence - 1622 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 38 - 316 169 ## ECH74115_A0045 CcgD protein 2 1 Op 2 . + CDS 385 - 519 89 ## ECH74115_A0044 hypothetical protein + Term 530 - 562 3.1 - Term 516 - 550 1.9 3 2 Tu 1 . 
- CDS 703 - 969 165 ## gi|261226324|ref|ZP_05940605.1| hypothetical protein EscherichiacoliO157_17278 - Prom 1024 - 1083 1.8 Predicted protein(s) >gi|296494423|gb|ADTN01000315.1| GENE 1 38 - 316 169 92 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_A0045 NR:ns ## KEGG: ECH74115_A0045 # Name: not_defined # Def: CcgD protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 92 1 92 92 157 86.0 9e-38 MHFTTFLKKHFDIEKVVGTSDSGNDTESIYVYEKGNDCEPLFILHESWLNAEIKKCGVWT IGDIYSTLEHGKEYSEQELIKMIKEGKVISKY >gi|296494423|gb|ADTN01000315.1| GENE 2 385 - 519 89 44 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_A0044 NR:ns ## KEGG: ECH74115_A0044 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 44 1 44 44 70 97.0 2e-11 MSVCEMIELANRIVNGEPVKTPAIEAWIISEIIWWAHNRDNALV >gi|296494423|gb|ADTN01000315.1| GENE 3 703 - 969 165 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|261226324|ref|ZP_05940605.1| ## NR: gi|261226324|ref|ZP_05940605.1| hypothetical protein EscherichiacoliO157_17278 [Escherichia coli O157:H7 str. FRIK2000] # 1 88 60 147 147 159 100.0 7e-38 MLFGVFMIVIFKSYLDSWSGIILSSDHILSILDKTKDYPEIKRIVINRLLSGSILTGKDE EYIYSQLEKAQQSNEREKRLQAIKEYTS Prediction of potential genes in microbial genomes Time: Mon May 16 00:13:44 2011 Seq name: gi|296494422|gb|ADTN01000316.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont786.4, whole genome shotgun sequence Length of sequence - 390 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 19 - 384 227 ## ECH74115_A0042 hypothetical protein Predicted protein(s) >gi|296494422|gb|ADTN01000316.1| GENE 1 19 - 384 227 121 aa, chain + ## HITS:1 COG:no KEGG:ECH74115_A0042 NR:ns ## KEGG: ECH74115_A0042 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157_EC4115 # Pathway: not_defined # 1 117 1 117 145 213 88.0 2e-54 MFFSVGVETPKDDHTAYGITVPVFDCFDFGCVSAADSQAEIPAMAREAILAIVEEMVISG AHSVDDIHDEGCLTYSANPNYNHCDSWFVIDVDLSEIEGKQQRINISLPDVLIRRIDVGN E Prediction of potential genes in microbial genomes Time: Mon May 16 00:13:48 2011 Seq name: gi|296494421|gb|ADTN01000317.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont799.1, whole genome shotgun sequence Length of sequence - 7213 bp Number of predicted genes - 9, with homology - 8 Number of transcription units - 5, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 254 178 ## ECS88_4881 hypothetical protein 2 2 Op 1 . + CDS 514 - 915 264 ## ECB_01907 hypothetical protein 3 2 Op 2 . + CDS 903 - 1337 183 ## ECUMN_4871 hypothetical protein + Term 1385 - 1439 -0.8 + Prom 1543 - 1602 2.2 4 3 Op 1 6/0.000 + CDS 1683 - 2072 139 ## COG2963 Transposase and inactivated derivatives 5 3 Op 2 5/0.000 + CDS 2069 - 2416 159 ## PROTEIN SUPPORTED gi|148984516|ref|ZP_01817804.1| 50S ribosomal protein L9 6 3 Op 3 . + CDS 2466 - 3851 670 ## COG3436 Transposase and inactivated derivatives + Term 4005 - 4041 0.2 7 4 Op 1 . - CDS 4090 - 5448 1552 ## COG4222 Uncharacterized protein conserved in bacteria 8 4 Op 2 . - CDS 5479 - 5589 61 ## - Prom 5673 - 5732 1.6 9 5 Tu 1 . 
- CDS 6181 - 6417 123 ## ECB_04162 hypothetical protein - Prom 6519 - 6578 6.3 Predicted protein(s) >gi|296494421|gb|ADTN01000317.1| GENE 1 3 - 254 178 83 aa, chain + ## HITS:1 COG:no KEGG:ECS88_4881 NR:ns ## KEGG: ECS88_4881 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 83 119 201 201 177 100.0 8e-44 CRAGDYRAPESLAGMIKQAWCSALGVDVGCHATLVHFPAWPAVWLARNDDTGFQQVLERA DYLAKEHTKAHCTGERNFGCSRS >gi|296494421|gb|ADTN01000317.1| GENE 2 514 - 915 264 133 aa, chain + ## HITS:1 COG:no KEGG:ECB_01907 NR:ns ## KEGG: ECB_01907 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 133 12 144 144 260 97.0 1e-68 MYAKSFIALDGNGRLTGARTAQAAPYAHYTCHLCGSALRYHPQYDTELPWFEHTDDRLTE HGQQCPYVRPERREIQLIKRLQQFVPDTLPVVRKASWHCRQCHHDYYGEQYCTHCQTGGF SIPRTTQEEICEF >gi|296494421|gb|ADTN01000317.1| GENE 3 903 - 1337 183 144 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_4871 NR:ns ## KEGG: ECUMN_4871 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 144 30 173 173 297 100.0 7e-80 MRILNCYMANDSKGHFVTAKEAAKHNRQDVLCCVSCGCPLTLQRGNDGQPPWFEHDQMTV AEKILLRCTWLDPAEKEARRLHLQGMTVPDYTVKVRKWFCVMCDEDYEGEKCCPRCGTGV YSRAWGRQEVPSEDARADNPLQRL >gi|296494421|gb|ADTN01000317.1| GENE 4 1683 - 2072 139 129 aa, chain + ## HITS:1 COG:ECs0328 KEGG:ns NR:ns ## COG: ECs0328 COG2963 # Protein_GI_number: 15829582 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 129 5 133 133 239 98.0 7e-64 MSDMQKNVTPGRRKGCPNYPPEFKQLLVAASCEPGISISKLALENGINANLLFKWRQQWR EGKLLLPSSESPQLLPVTLDAAAEQPESLAEDPETLSISCEVTFRHGTLRFNGNVSEKLL TLLIQELKR >gi|296494421|gb|ADTN01000317.1| GENE 5 2069 - 2416 159 115 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148984516|ref|ZP_01817804.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP3-BS71] # 1 99 2 100 107 65 38 9e-11 MIPLPSGTKIWLVAGITDMRNGFNGLAAKVQTALKDDPMSGHVFIFRGRSGSQVKLLWST 
GDGLCLLTKRLERGRFAWPSARDGKVFLTQAQLAMLLEGIDWRQPKRLLTSLTML >gi|296494421|gb|ADTN01000317.1| GENE 6 2466 - 3851 670 461 aa, chain + ## HITS:1 COG:ECs0330 KEGG:ns NR:ns ## COG: ECs0330 COG3436 # Protein_GI_number: 15829584 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 461 1 512 512 896 89.0 0 MNDISSDDIFLLKQRLAEQEALIHALQEKLSNREREIDHLQAQLDKLRRMNFGSRSEKVS RRIAQMEADLNRLQKESDTLTGRVYDPAVQRPLRQTRTRKPFPESLPRDEKRLLPAAPCC PNCGGSLSYLGEDTAEQLELMRSAFRVIRTVREKHACTQCDAIVQAPAPSRPIERGIAGP GLLARVLTSKYAEHTPLYCQSEIYGRQGVELSRSLLSGWVDACCRLLSPLEEALHGYVMT DGKLHADDTPVQVLLPGNKKTKTGRLWAYVRDDRNAGSALAPAVWFAYSPDRKGIHPQTH LACFSGVLQADAYAGFNELYRNGGITEAACWAHARRKIHDVHVRIPSALTEEALEQIGQL YAIEADIRGMPAEQRLAERQRKTKPLLKSLESWLREKMKTLFFGSGHGGERGALLYSLIG TCKLNDVDPESYLRHVLGVIADWPVNRVSELLPWRIALPAE >gi|296494421|gb|ADTN01000317.1| GENE 7 4090 - 5448 1552 452 aa, chain - ## HITS:1 COG:alr4238_3 KEGG:ns NR:ns ## COG: alr4238_3 COG4222 # Protein_GI_number: 17231730 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. 
PCC 7120 # 157 329 3 183 205 61 29.0 3e-09 MKRKIIPVLIGCTLSFSALAAQPTAERYVVSFPEGTHVNYAGAFASAFPNGLPVGIGSGL LFTGKQGDALTFATITDRGPNADSPKEGKNETKIFVTPDFAPLLMTIRVQNGKAEAIDPR PLHDDKGAINGLPLASDVIGSTNEVAFSDTLHRLKGDNRGLDTEGITPDGKGGYWLCDEY GPFLINIDSKGKILAIHGPQAAEGEKAIAGGLPNILKWRQANRGFEGLTRMPDGRIIVAV QSTLDIDAKSKKKALFTRLVSFDPASGKTAMYGYPIDSAAYSKNSDAKIGDIVALDNQHI LLIEQGRDKNNRMRNLIYEVDLNKASDLSGFDKPGEYPEFDDEKTLSQRGITLAQKTQVV DLRSLGWQQEKAEGLALIDSKTLAVANDNDFGVKVAMQHPVEGKTFKDYRVNAEGKLTLD DKQVETTLRVKPLEKPESDSELWIVTLPEALK >gi|296494421|gb|ADTN01000317.1| GENE 8 5479 - 5589 61 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFPLLQITTVFYISLASGPRMLPFDVIFLTRFFFMV >gi|296494421|gb|ADTN01000317.1| GENE 9 6181 - 6417 123 78 aa, chain - ## HITS:1 COG:no KEGG:ECB_04162 NR:ns ## KEGG: ECB_04162 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_B_REL606 # Pathway: not_defined # 1 78 20 97 97 120 98.0 1e-26 MIFGEKLANILYEANSQFFYERNVIEEAVNALFCEREIINNKNIIKKLMFFLSDVNHTKK DVVQSALNIIIDITSGDI Prediction of potential genes in microbial genomes Time: Mon May 16 00:14:03 2011 Seq name: gi|296494420|gb|ADTN01000318.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont802.1, whole genome shotgun sequence Length of sequence - 6678 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 3, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1037 832 ## COG1304 L-lactate dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases - Term 1044 - 1082 2.1 2 2 Op 1 . - CDS 1110 - 2360 1100 ## COG2072 Predicted flavoprotein involved in K+ transport 3 2 Op 2 . - CDS 2402 - 2476 85 ## 4 2 Op 3 36/0.000 - CDS 2491 - 3309 715 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II 5 2 Op 4 8/0.000 - CDS 3311 - 4165 794 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I - Term 4191 - 4225 3.7 6 2 Op 5 . 
- CDS 4231 - 5262 1112 ## COG0687 Spermidine/putrescine-binding periplasmic protein 7 3 Tu 1 . + CDS 5601 - 6563 898 ## COG0583 Transcriptional regulator + Term 6610 - 6648 4.7 Predicted protein(s) >gi|296494420|gb|ADTN01000318.1| GENE 1 2 - 1037 832 345 aa, chain - ## HITS:1 COG:mll6909 KEGG:ns NR:ns ## COG: mll6909 COG1304 # Protein_GI_number: 13475754 # Func_class: C Energy production and conversion # Function: L-lactate dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases # Organism: Mesorhizobium loti # 3 345 4 345 378 496 66.0 1e-140 MTITCIEELRQLARKRVPKMFYDYVDAGSWTEYSYRANEADLRRLEFRQRVAVDIAGRST ATVILGQAVTMPMAIAPTGLTGMIHPDGEILAARAAKRFGIPFTLSTMSICSMETVAQAT DYHPFWFQLYVMRDRHFVENLIDRAKAVNCGALVVTMDLQVFGQRHKDIKNGLSTPPKMT LRNLLDIAVKPRWCRNMLATRNRNFGNIIGHASGVDNIDAMVEWTAQQFDPRLSWQDIEW IKQRWGGKLIVKGIMDVEDARLAVAAGADALIVSNHGGRQLDGVSSSITLLPEIVSAVGD RIEVHFDGGIRSGQDVLKAIALGAKGTYIGRSMLYGLGALGEEGV >gi|296494420|gb|ADTN01000318.1| GENE 2 1110 - 2360 1100 416 aa, chain - ## HITS:1 COG:PA5393 KEGG:ns NR:ns ## COG: PA5393 COG2072 # Protein_GI_number: 15600586 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted flavoprotein involved in K+ transport # Organism: Pseudomonas aeruginosa # 2 413 16 426 450 557 64.0 1e-158 MSVEKINTVIVGAGQAGIAMSEHLAQMGVPHVVLERSRIAERWRSERWDSLVANGPAWHD RFPSLKFDNISQEAFPPKERMAQYFEDYAKMLDAPVRTGVEVHQVVRLLGRSGFKVVTSA GEFEADNVVAATGPFQKPSFPQIVPETTGLQQIHSSAYKNPQQLPAGGVLVVGAGASGTQ IAEELRKSGREVYLSVGEHYRPPRAYRNRDYCWWLGALGLWDEVKIKPKKEHVAFAVSGY EGGKTVDFRRLAHMGITLVGITKSWDNGVLSFAEGLAENIAEGDKAYFDVLRDADAYIER NGLDLPPEPQAWELLPDPECLLHPLTQLDIAEAGITSIIWATGFKFDFSWLQVDAFDEKG LPFHKRGISAERGIYFLGLPNLVNRASSFIYGVWHDAKYIADHIALQNAYTNYVKS >gi|296494420|gb|ADTN01000318.1| GENE 3 2402 - 2476 85 24 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRFGFSEALHFNSLIIINTTPDYA >gi|296494420|gb|ADTN01000318.1| GENE 4 2491 - 3309 715 272 aa, chain - ## HITS:1 COG:YPO2841 KEGG:ns NR:ns ## COG: YPO2841 COG1177 # Protein_GI_number: 16123036 # Func_class: E Amino 
acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # Organism: Yersinia pestis # 19 246 28 255 282 137 36.0 2e-32 MTRSMNDKLVTLAGRLLVAAILIFVMLPTIVVFISSFSSTSVLFFPPKGWSLRWFERAVS YDDFRHGFSSGLIVTAWASSLAVIIGATLAIAIERYSFPCKQILEGILLSPLFIPHFTIG LGLLMLVSQLNLGRGYPLVIFCHIVLVLPFVLRSVYVSLKNLEQRIELAAASLGASPLRV VWTITVPLILPGLFGGWLFAAILSFNEFTASLFITTQATQTLPVAMYNYVREFADPTLAA LSVIYIVVTATMLIIANKFLGLGKVLNVEARH >gi|296494420|gb|ADTN01000318.1| GENE 5 3311 - 4165 794 284 aa, chain - ## HITS:1 COG:SMa0952 KEGG:ns NR:ns ## COG: SMa0952 COG1176 # Protein_GI_number: 16262968 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Sinorhizobium meliloti # 22 273 35 284 296 129 34.0 8e-30 MTRKPTFLPWFIIPATLTAFGLVVAMCAVMQFSVRAYIPGSLDVGGFTLANFNGLFKSIY ADAFINTVWLSAKTAVFGLLMSYPLAYALVRTRTVWIKSAILIIAITPLFLGEVVRTYSW IIVLGNSGFLNSLLLALGVISKPIQFMFTQTGVVVALVHVTMPIMVLMLATALSHISPDY EKAATSLGAGPIRALLTVTIPLSIPGIVSSLTTAFAWTFSAFATPQLIGGGRVSTVATMV YQLGFSSMNFPFAAALSIAGLILTILLLMALKRMTRFLHAMGGH >gi|296494420|gb|ADTN01000318.1| GENE 6 4231 - 5262 1112 343 aa, chain - ## HITS:1 COG:VC1425 KEGG:ns NR:ns ## COG: VC1425 COG0687 # Protein_GI_number: 15641436 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Vibrio cholerae # 6 284 36 311 366 119 32.0 7e-27 MLRKICALTLLTTSVLCAGNAMAADKLIVSTWGGSFRDLIDETIAKEFTKETGVEVQFVT GGTIDRLNKAKLAGGSPETDVTFTTAHVGYLYANSGLFEKLDMSKIPNATNLVQQAKVSP FHLGVWGYVYTIGYRPDMVPKGETFESWNDLWKPELKGTLALPDWDPSHIIAVSAKLSGT DAAHWEQGQAKLKALIPNIKSFYTDDANSQQLISTGETPVQVMLSMNAYNMISEGVNIKL AMPKEGAILGVDTIGINKGTKHSELAYKFMNIALKPEIQEQVAKIYRGSPTVTNAHIDPE LAKLPGMLTTPEQWNATINTDPKLRAEKTAEWRQWFSENIMSH >gi|296494420|gb|ADTN01000318.1| GENE 7 5601 - 6563 898 320 aa, chain + ## HITS:1 COG:PA0218 KEGG:ns NR:ns ## COG: PA0218 COG0583 # Protein_GI_number: 15595415 # Func_class: K Transcription # Function: 
Transcriptional regulator # Organism: Pseudomonas aeruginosa # 2 295 1 295 306 197 36.0 2e-50 MLSRITQRQLEYFVASGEAGSIIGASERIHVSSPSISAAITHIESELGVQLFVRHHAQGV SLTSIGHQVLKEAKLILEQMSNLYTIASESLNNVRGPLRVGCLDTLAPMLTPELVFGFGR AFPGVRITLVEGNHEQLLKQLRSADIDIALTYDLVPANDIAFTSLAQLPPYVMVGEHHPL ATQSAVTIEDLVAMPMVLLDMPWSREYFLGLFSEAGVAPNVVMRSSNMEVVRAMVANGVG FGIANVRPKAHFSQDGKRLIRVRLAGVNKPMQLGYATVANTQHSMVVNAFAERCRMFISD QYIPGMAAPSYFDPQVVRVA Prediction of potential genes in microbial genomes Time: Mon May 16 00:14:07 2011 Seq name: gi|296494419|gb|ADTN01000319.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont802.2, whole genome shotgun sequence Length of sequence - 1299 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 852 742 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases 2 1 Op 2 . - CDS 849 - 1298 534 ## COG0757 3-dehydroquinate dehydratase II Predicted protein(s) >gi|296494419|gb|ADTN01000319.1| GENE 1 3 - 852 742 283 aa, chain - ## HITS:1 COG:AGl2846 KEGG:ns NR:ns ## COG: AGl2846 COG0111 # Protein_GI_number: 15891534 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 3 283 44 323 349 201 41.0 2e-51 MSTLVFYSEFDSFEEWSALLAPYLPGVTICRAETVQKPDEVHYALAWKPPRGFFAPYRNL RLLVNLGAGVDSLVGRNDLPDIPIIRLSDPDMARMMASYVLFAVLRYARDIPAFERAQRE RRWRYLHPRAPAGIRVGVLGLGELGAYAARELARQGFDVRGWSRSKKEIAGIRCSSGLAS LDDFLSQSDILVVMLPLTPHTTSLLSAERLARLPQGAAFINVSRGAIVDQAALTDALRSG QIAEATLDVFDREPLPPHDPLWQMDNVLITPHLASVAIPTSAA >gi|296494419|gb|ADTN01000319.1| GENE 2 849 - 1298 534 149 aa, chain - ## HITS:1 COG:CC1882 KEGG:ns NR:ns ## COG: CC1882 COG0757 # Protein_GI_number: 16126125 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate dehydratase II # Organism: Caulobacter 
vibrioides # 1 144 1 144 146 145 49.0 3e-35 MSQQIYFLNGPNANLYGLDKNGAYGSESFVSIGERCQRRAESHGVVLHFRQSNHEGVLVD WIQEARQNAAGLIINAAGLTYRSVPILDALLMFDGPIIEAHMSNIWKREPFRHHSMVSKA ATGVIAGLGALGYELAIDAVVNLMETPAA Prediction of potential genes in microbial genomes Time: Mon May 16 00:14:10 2011 Seq name: gi|296494418|gb|ADTN01000320.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont802.3, whole genome shotgun sequence Length of sequence - 5537 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 11 - 622 482 ## COG0625 Glutathione S-transferase 2 1 Op 2 . - CDS 622 - 1617 859 ## COG0604 NADPH:quinone reductase and related Zn-dependent oxidoreductases 3 1 Op 3 . - CDS 1624 - 2772 1188 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 4 2 Op 1 1/0.000 - CDS 2943 - 4100 958 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 5 2 Op 2 1/0.000 - CDS 4116 - 4790 649 ## COG3342 Uncharacterized conserved protein 6 2 Op 3 . 
- CDS 4795 - 5229 523 ## COG0251 Putative translation initiation inhibitor, yjgF family - Prom 5304 - 5363 3.2 Predicted protein(s) >gi|296494418|gb|ADTN01000320.1| GENE 1 11 - 622 482 203 aa, chain - ## HITS:1 COG:PA0467 KEGG:ns NR:ns ## COG: PA0467 COG0625 # Protein_GI_number: 15595664 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutathione S-transferase # Organism: Pseudomonas aeruginosa # 3 200 7 205 206 141 39.0 7e-34 MQLYFAATSAYVRKVMVCATVLGLADEIEPLDSAAHPIERDERIATFNPLAKVPALRTES GLCLYDSRVICEYLNARARGGLFPEGGDVRWGSLARQALGDGIIDAALLARYEFSTRPPE KQWQNWADAQLKKVAAALLEIERQVSDFSDRPNDIGLIAIGCALGYLDFRFAELNWRASH PLTAAWFAAFDAHPAMVATRPHV >gi|296494418|gb|ADTN01000320.1| GENE 2 622 - 1617 859 331 aa, chain - ## HITS:1 COG:BMEI0302 KEGG:ns NR:ns ## COG: BMEI0302 COG0604 # Protein_GI_number: 17986585 # Func_class: C Energy production and conversion; R General function prediction only # Function: NADPH:quinone reductase and related Zn-dependent oxidoreductases # Organism: Brucella melitensis # 2 330 7 336 336 277 45.0 2e-74 MIPEMMNAVIARQPGGPEVLEQVQRAVPQPSAGEVLIRVANAGVNRPDIMQRNGMPLPPG ITDVLGLEVSGTVVAQGAGVESPAIGAPVMALLNGGGYADYCVARAELCLSVPKNLPLAQ AAGVPEAAFTVWHNLFELGRMQPGESVLIHGAASGVGTFAIQCAQACGARVIATAGGTDK IRALRQLGVGRAIDRHSEDFVAVILEETQGRGVDIVLDNVGGDYVASNLSILAPGGRHVS LSFMQGAKIELDLQLVMRKGLSLTSSTLRPKSPVEKMRLAQRISRHLLPLIAAGKITPIL HQTLPLAQAADAHRILEANTNIGKVLLQVAP >gi|296494418|gb|ADTN01000320.1| GENE 3 1624 - 2772 1188 382 aa, chain - ## HITS:1 COG:PM0264 KEGG:ns NR:ns ## COG: PM0264 COG3842 # Protein_GI_number: 15602129 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Pasteurella multocida # 26 325 21 320 380 292 48.0 7e-79 MVSAEVTTFPQPARLHPAENKANVAVRLDGVLKRFGDAIALHKISLTIEEGEFITLLGPS GCGKTTLLNLMAGFAEADGGEIFIDGELVTEIPPYQREIGIVFQSYALFPHMTVEKNVGY GLRMRGVPKAEIVERVEQALALVKLAGYGHRKPRELSGGQQQRVALARALVIRPKVLLLD 
EPFSALDKNLRLSMQVELKAIQRKLGVTTVFVTHDQGEALSMSDRVVVMSAGHVRQIDTP DEIYRRPQDAFVASFVGDVNIVPGRYAGRDGAAVLELGGNVLRLPADRVHADIGERLDVY MRPENIQLAPLGPRSLFSATVITHVFQGDHVDVHLDVPALGNASLFARQSGLDSLSRWPV GSLVGVSVDDEGVCAFSAAKQG >gi|296494418|gb|ADTN01000320.1| GENE 4 2943 - 4100 958 385 aa, chain - ## HITS:1 COG:PA5390 KEGG:ns NR:ns ## COG: PA5390 COG0624 # Protein_GI_number: 15600583 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Pseudomonas aeruginosa # 1 371 1 371 384 548 73.0 1e-156 MITSRTLLERLVAFDTTSRESNLALIDFVWHYLTDLGVNCELIHNAGCSKANLYARLGPA GSGGILLSGHSDVVPVDGQNWSVPPFALSERDGKLYGRGTADMKGFIACMLAAVPHFLAQ PLAQPLHLAISYDEEVGCLGVRTLLDVLASRPEKPDLCLIGEPTELQPVLGHKGKLAVRC EVQGAACHSAYAPQGVNAIQYAAKLIHRLTAIGEVFAAPERQDTRFDPPFTTVQTGLIQG GRALNIVPAECTFDFEVRTLPQDDAQQVAEELERYAQRELLPQMRAVNSDTEIRFYPLSS YPGLYTAAQSAAAQLLAHLTGSEAFSTVAFGTEGGLFHQAGIPSVICGPGSMAQGHKPDE FITIEQLDACDAMLRRLAGWMSLKA >gi|296494418|gb|ADTN01000320.1| GENE 5 4116 - 4790 649 224 aa, chain - ## HITS:1 COG:PA5391 KEGG:ns NR:ns ## COG: PA5391 COG3342 # Protein_GI_number: 15600584 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pseudomonas aeruginosa # 1 224 1 224 224 271 66.0 7e-73 MTLSIIGRCSKTGQMGIAISSSSIAVGARCPWLRSKVGAVSTQNITLPALGPRILDRLQQ GATAETALKAGLDTDDYAAWRQVTVMTTDGQSAFYSGEKTLGVNNALSGENCVAAGNLLA NPDVIAAMVQAFEQRSGHLADRLIAAMQAGMSAGGEAGPVHSAALSIVDSPVWPIVDLRV DWTDADPIDELNKLWQAYQPQMQDYITRALDPTQAPSYGVPGDE >gi|296494418|gb|ADTN01000320.1| GENE 6 4795 - 5229 523 144 aa, chain - ## HITS:1 COG:PA5392 KEGG:ns NR:ns ## COG: PA5392 COG0251 # Protein_GI_number: 15600585 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Pseudomonas aeruginosa # 1 139 1 139 140 268 92.0 3e-72 MSTHTRIRKFNTKDTYPNQSLDNDLCQAVRAGNTVYVRGQIGTDFEGNLVGLGDPAAQAE QAMKNVKQLLEEAGSDLSHIVKTTTYIIDPRYREPVYQVVGKWLKGVFPISTGLVISGLG 
QPQWLMEIDVIAVIPDDWQPKVGE
Prediction of potential genes in microbial genomes
Time: Mon May 16 00:14:18 2011
Seq name: gi|296494417|gb|ADTN01000321.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont812.1, whole genome shotgun sequence
Length of sequence - 26680 bp
Number of predicted genes - 21, with homology - 21
Number of transcription units - 11, operones - 7 average op.length - 2.4
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Op 1 6/0.500 - CDS 8 - 1057 814 ## COG2200 FOG: EAL domain
2 1 Op 2 . - CDS 997 - 2223 419 ## COG2199 FOG: GGDEF domain
- Prom 2334 - 2393 6.5
- Term 2349 - 2399 1.5
3 2 Tu 1 . - CDS 2440 - 2730 94 ## EC55989_2789 hypothetical protein
- Prom 2909 - 2968 8.6
+ Prom 2414 - 2473 7.7
4 3 Tu 1 . + CDS 2575 - 2766 151 ## EcSMS35_2653 hypothetical protein
+ Term 2820 - 2861 1.3
+ Prom 2899 - 2958 9.0
5 4 Op 1 . + CDS 3080 - 3595 367 ## SSON_2587 putative outer membrane lipoprotein
6 4 Op 2 . + CDS 3611 - 4150 579 ## ECP_2508 hypothetical protein
+ Term 4188 - 4235 6.9
- Term 4183 - 4214 3.4
7 5 Op 1 13/0.000 - CDS 4243 - 5820 2112 ## COG0519 GMP synthase, PP-ATPase domain/subunit
- Term 5847 - 5882 6.1
8 5 Op 2 . - CDS 5889 - 7355 1865 ## COG0516 IMP dehydrogenase/GMP reductase
- Prom 7392 - 7451 6.7
+ Prom 7423 - 7482 5.0
9 6 Tu 1 . + CDS 7517 - 8887 1042 ## COG1570 Exonuclease VII, large subunit
10 7 Op 1 .
- CDS 8884 - 9099 124 ## S2728 hypothetical protein
11 7 Op 2 7/0.000 - CDS 9169 - 10641 2115 ## COG1160 Predicted GTPases
- Prom 10677 - 10736 1.8
- Term 10697 - 10738 10.4
12 8 Op 1 9/0.000 - CDS 10759 - 11937 1184 ## COG1520 FOG: WD40-like repeat
13 8 Op 2 12/0.000 - CDS 11948 - 12568 709 ## COG2976 Uncharacterized protein conserved in bacteria
14 8 Op 3 11/0.000 - CDS 12586 - 13860 1467 ## COG0124 Histidyl-tRNA synthetase
- Term 13930 - 13961 4.5
15 8 Op 4 10/0.000 - CDS 13971 - 15089 1243 ## COG0821 Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis
16 8 Op 5 5/0.500 - CDS 15116 - 16129 557 ## COG1426 Uncharacterized protein conserved in bacteria
- Prom 16218 - 16277 6.0
- Term 16360 - 16409 13.6
17 9 Op 1 11/0.000 - CDS 16414 - 17568 1378 ## COG0820 Predicted Fe-S-cluster redox enzyme
- Prom 17655 - 17714 2.5
18 9 Op 2 1/1.000 - CDS 17718 - 18149 517 ## COG0105 Nucleoside diphosphate kinase
- Prom 18199 - 18258 3.7
19 10 Op 1 7/0.000 - CDS 18298 - 20475 1387 ## COG4953 Membrane carboxypeptidase/penicillin-binding protein PbpC
- Prom 20512 - 20571 3.1
- Term 20546 - 20589 8.0
20 10 Op 2 . - CDS 20611 - 25572 4666 ## COG2373 Large extracellular alpha-helical protein
- Prom 25610 - 25669 4.2
+ Prom 25563 - 25622 2.7
21 11 Tu 1 .
+ CDS 25779 - 26624 836 ## COG2897 Rhodanese-related sulfurtransferase + Term 26647 - 26674 -0.1 Predicted protein(s) >gi|296494417|gb|ADTN01000321.1| GENE 1 8 - 1057 814 349 aa, chain - ## HITS:1 COG:yfgF_3 KEGG:ns NR:ns ## COG: yfgF_3 COG2200 # Protein_GI_number: 16130428 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Escherichia coli K12 # 87 345 1 259 272 515 99.0 1e-146 MLEPGEDVYQLSGNDLALRLNTESHQERITALDSHLKQFRFFWDGMPMQPQIGVSYCYVR SPVNHIYLLLGELNTVAELSIVTNAPENMQRRGAMYLQRELKDKVAMMNRLQQALEHNHF FLMAQPITGMRGDVYHEILLRMKGENDELISPDSFLPVAHEFGLSSSIDMWVIEHTLQFM AENRAKMPAHRFAINLSPTSVCQARFPVEVSQLLAKYQIEAWQLIFEVTESNALTNVKQA QITLQHLQELGCQIAIDDFGTGYASYARLKNVNADLLKIDGSFIRNIVSNSLDYQIVASI CHLARMKKMLVVAEYVENEEIREAVLSLGIDYMQGYLIGKPQPVMTPTY >gi|296494417|gb|ADTN01000321.1| GENE 2 997 - 2223 419 408 aa, chain - ## HITS:1 COG:yfgF_2 KEGG:ns NR:ns ## COG: yfgF_2 COG2199 # Protein_GI_number: 16130428 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 311 383 1 73 165 146 97.0 8e-35 MKLNATYIKIRDKWWGLPLFLPSLILPIFAHINTFAHISSGEVFLFYLPLALMISMMMFF SWAALPGIALGIFVRKYAELGFYETLSLTANFIIIIILCWGGYRVFTPRRNNVSHGDTRL ISQRIFWQIVFPATLFLILFQFAAFVGLLASRENLVGVMPFNLGTLINYQALLVGNLIGV PLCYFIIRVVRNPFYLRSYYSQLKQQVDAKVTKKEFALWLLALGALLLLLCMPLNEKSTI FSTNYTLSLLLPLMMWGAMRYGYKLISLLWAVVLMISIHSYQNYIPIYPGYTTQLTITSS SYLVFSFIVNYMAVLATRQRAVVRRIQRLAYVDPVVHLPNVRALNRALRDAPWSALCYLR IPGMEMLVKNYGIMLRIQYKQNFLTGCHPCWNRVKMFISFRVTISRCD >gi|296494417|gb|ADTN01000321.1| GENE 3 2440 - 2730 94 96 aa, chain - ## HITS:1 COG:no KEGG:EC55989_2789 NR:ns ## KEGG: EC55989_2789 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 96 1 96 96 186 100.0 3e-46 MLSFFFALMVLPGTDGRVDKTAKEEDKADEQYDTGHATVKSVSFSHTGCLTHKRFLEVCP TLWTVLTLRHNRTKWSILRYVKKRQWIHILMLWKLI >gi|296494417|gb|ADTN01000321.1| GENE 4 2575 - 2766 151 63 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_2653 NR:ns ## KEGG: EcSMS35_2653 # Name: 
not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 1 63 1 63 63 94 100.0 8e-19 MSQATSMRKRHRFNSRMTRIVLLISFIFFFGRFIYSSVGAWQHHQSKKEAQQSTLSVESP VQR >gi|296494417|gb|ADTN01000321.1| GENE 5 3080 - 3595 367 171 aa, chain + ## HITS:1 COG:no KEGG:SSON_2587 NR:ns ## KEGG: SSON_2587 # Name: not_defined # Def: putative outer membrane lipoprotein # Organism: S.sonnei # Pathway: not_defined # 1 171 2 172 172 266 100.0 3e-70 MKFKKCLLPVAMLASFTLAGCQSNADDHAADVYQTDQLNTKQETKTVNIISILPAKVAVD NSQNKRNAQAFGALIGAVAGGVIGHNVGSGSNSGTTAGAVGGGAVGAAAGSMVNDKTLVE GVSLTYKEGTKVYTSTQVGKECQFTTGLAVVITTTYNETRIQPNTKCPEKS >gi|296494417|gb|ADTN01000321.1| GENE 6 3611 - 4150 579 179 aa, chain + ## HITS:1 COG:no KEGG:ECP_2508 NR:ns ## KEGG: ECP_2508 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_536 # Pathway: not_defined # 1 179 1 179 179 176 99.0 3e-43 MKKVFLCAILASLSYPAIASSLQDQLSAVAEAEQQGKNEEQRQHDEWVAERNREIQQEKQ RRANAQAAANKRAATAAANKKARQDKLDAEASADKKRDQSYEDELRSLEIQKQKLALAKE EARVKRENEFIDQELKHKAAQTDVVQSEADANRNMTEGGRDLMKSVGKAEENKSDSWFN >gi|296494417|gb|ADTN01000321.1| GENE 7 4243 - 5820 2112 525 aa, chain - ## HITS:1 COG:ZguaA_2 KEGG:ns NR:ns ## COG: ZguaA_2 COG0519 # Protein_GI_number: 15803031 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Escherichia coli O157:H7 EDL933 # 207 525 1 319 319 658 99.0 0 MTENIHKHRILILDFGSQYTQLVARRVRELGVYCELWAWDVTEAQIRDFNPSGIILSGGP ESTTEENSPRAPQYVFEAGVPVFGVCYGMQTMAMQLGGHVEASNEREFGYAQVEVVNDSA LVRGIEDALTADGKPLLDVWMSHGDKVTAIPSDFITVASTESCPFAIMANEEKRFYGVQF HPEVTHTRQGMRMLERFVRDICQCEALWTPAKIIDDAVARIREQVGDDKVILGLSGGVDS SVTAMLLHRAIGKNLTCVFVDNGLLRLNEAEQVLDMFGDHFGLNIVHVPAEDRFLSALAG ENDPEAKRKIIGRVFVEVFDEEALKLEDVKWLAQGTIYPDVIESAASATGKAHVIKSHHN VGGLPKEMKMGLVEPLKELFKDEVRKIGLELGLPYDMLYRHPFPGPGLGVRVLGEVKKEY CDLLRRADAIFIEELRKADLYDKVSQAFTVFLPVRSVGVMGDGRKYDWVVSLRAVETIDF MTAHWAHLPYDFLGRVSNRIINEVNGISRVVYDISGKPPATIEWE >gi|296494417|gb|ADTN01000321.1| GENE 8 5889 - 7355 1865 488 
aa, chain - ## HITS:1 COG:YPO2871_3 KEGG:ns NR:ns ## COG: YPO2871_3 COG0516 # Protein_GI_number: 16123063 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Yersinia pestis # 207 486 1 280 281 478 94.0 1e-134 MLRIAKEALTFDDVLLVPAHSTVLPNTADLSTQLTKTIRLNIPMLSAAMDTVTEARLAIA LAQEGGIGFIHKNMSIERQAEEVRRVKKHESGVVTDPQTVLPTTTLREVKELTERNGFAG YPVVTEENELVGIITGRDVRFVTDLNQPVSVYMTPKERLVTVREGEAREVVLAKMHEKRV EKALVVDDEFHLIGMITVKDFQKAERKPNACKDEQGRLRVGAAVGAGAGNEERVDALVAA GVDVLLIDSSHGHSEGVLQRIRETRAKYPDLQIIGGNVATAAGARALAEAGCSAVKVGIG PGSICTTRIVTGVGVPQITAVADAVEALEGTGIPVIADGGIRFSGDIAKAIAAGASAVMV GSMLAGTEESPGEIELYQGRSYKSYRGMGSLGAMSKGSSDRYFQSDNAADKLVPEGIEGR VAYKGRLKEIIHQQMGGLRSCMGLTGCGTIDELRTKAEFVRISGAGIQESHVHDVTITKE SPNYRLGS >gi|296494417|gb|ADTN01000321.1| GENE 9 7517 - 8887 1042 456 aa, chain + ## HITS:1 COG:xseA KEGG:ns NR:ns ## COG: xseA COG1570 # Protein_GI_number: 16130434 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII, large subunit # Organism: Escherichia coli K12 # 1 456 1 456 456 818 100.0 0 MLPSQSPAIFTVSRLNQTVRLLLEHEMGQVWISGEISNFTQPASGHWYFTLKDDTAQVRC AMFRNSNRRVTFRPQHGQQVLVRANITLYEPRGDYQIIVESMQPAGEGLLQQKYEQLKAK LQAEGLFDQQYKKPLPSPAHCVGVITSKTGAALHDILHVLKRRDPSLPVIIYPAAVQGDD APGQIVRAIELANQRNECDVLIVGRGGGSLEDLWSFNDERVARAIFTSRIPVVSAVGHET DVTIADFVADLRAPTPSAAAEVVSRNQQELLRQVQSTRQRLEMAMDYYLANRTRRFTQIH HRLQQQHPQLRLARQQTMLERLQKRMSFALENQLKRTGQQQQRLTQRLNQQNPQPKIHRA QTRIQQLEYRLAETLRAQLSATRERFGNAVTHLEAVSPLSTLARGYSVTTATDGNVLKKV KQVKAGEMLTTRLEDGWIESEVKNIQPVKKSRKKVH >gi|296494417|gb|ADTN01000321.1| GENE 10 8884 - 9099 124 71 aa, chain - ## HITS:1 COG:no KEGG:S2728 NR:ns ## KEGG: S2728 # Name: not_defined # Def: hypothetical protein # Organism: S.flexneri_2457T # Pathway: not_defined # 1 71 13 83 83 126 97.0 2e-28 MELHCPQCQHVLDQDNGHARCRSCGEFIEMKALCPDCHQPLQVLKACGAVDYFCQHGHGL ISKKRVEFVLA >gi|296494417|gb|ADTN01000321.1| GENE 11 9169 - 10641 2115 490 aa, chain - ## HITS:1 COG:ECs3373 KEGG:ns NR:ns ## COG: ECs3373 
COG1160 # Protein_GI_number: 15832627 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Escherichia coli O157:H7 # 1 490 14 503 503 953 100.0 0 MVPVVALVGRPNVGKSTLFNRLTRTRDALVADFPGLTRDRKYGRAEIEGREFICIDTGGI DGTEDGVETRMAEQSLLAIEEADVVLFMVDARAGLMPADEAIAKHLRSREKPTFLVANKT DGLDPDQAVVDFYSLGLGEIYPIAASHGRGVLSLLEHVLLPWMEDLAPQEEVDEDAEYWA QFEAEENGEEEEEDDFDPQSLPIKLAIVGRPNVGKSTLTNRILGEERVVVYDMPGTTRDS IYIPMERDGREYVLIDTAGVRKRGKITDAVEKFSVIKTLQAIEDANVVMLVIDAREGISD QDLSLLGFILNSGRSLVIVVNKWDGLSQEVKEQVKETLDFRLGFIDFARVHFISALHGSG VGNLFESVREAYDSSTRRVGTSMLTRIMTMAVEDHQPPLVRGRRVKLKYAHAGGYNPPIV VIHGNQVKDLPDSYKRYLMNYFRKSLDVMGSPIRIQFKEGENPYANKRNTLTPTQMRKRK RLMKHIKKNK >gi|296494417|gb|ADTN01000321.1| GENE 12 10759 - 11937 1184 392 aa, chain - ## HITS:1 COG:ECs3374 KEGG:ns NR:ns ## COG: ECs3374 COG1520 # Protein_GI_number: 15832628 # Func_class: S Function unknown # Function: FOG: WD40-like repeat # Organism: Escherichia coli O157:H7 # 13 392 13 392 392 719 99.0 0 MQLRKLLLPGLLSVTLLSGCSLFNSEEDVVKMSPLPTVENQFTPTTAWSTSVGSGIGNFY SNLHPALADNVVYAADRAGLVKALNADDGKEIWSVSLAEKDGWFSKEPALLSGGVTVSGG HVYIGSEKAQVYALNTSDGTVAWQTKVAGEALSRPVVSDGLVLIHTSNGQLQALNEADGA VKWTVNLDMPSLSLRGESAPATAFGAAVVGGDNGRVSAVLMEQGQMIWQQRISQATGSTE IDRLSDVDTTPVVVNGVVFALAYNGNLTALDLRSGQIMWKRELGSVNDFIVDGNRIYLVD QNDRVMALTIDGGVTLWTQSDLLHRLLTSPVLYNGNLVVGDSEGYLHWINVEDGRFVAQQ KVDSSGFQTEPVAADGKLLIQAKDGTVYSITR >gi|296494417|gb|ADTN01000321.1| GENE 13 11948 - 12568 709 206 aa, chain - ## HITS:1 COG:ECs3375 KEGG:ns NR:ns ## COG: ECs3375 COG2976 # Protein_GI_number: 15832629 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 206 1 206 206 344 99.0 5e-95 MEIYENENDQVEAVKRFFAENGKALAVGVILGVGALIGWRYWNSHQVDSARSASLAYQNA VTAVSEGKPDSIPAAEKFAAENKNTYGALASLELAQQFVDKNELEKAAAQLQQGLADTSD ENLKAVINLRLARVQVQLKQADAALKTLDTIKGEGWAAIVADLRGEALLSKGDKQGARSA WEAGVKSDVTPALSEMMQMKINNLSI >gi|296494417|gb|ADTN01000321.1| GENE 14 12586 - 13860 1467 424 aa, chain - ## 
HITS:1 COG:ECs3376 KEGG:ns NR:ns ## COG: ECs3376 COG0124 # Protein_GI_number: 15832630 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Histidyl-tRNA synthetase # Organism: Escherichia coli O157:H7 # 1 424 1 424 424 858 99.0 0 MAKNIQAIRGMNDYLPGETAIWQRIEGTLKNVLGSYGYSEIRLPIVEQTPLFKRAIGEVT DVVEKEMYTFEDRNGDSLTLRPEGTAGCVRAGIEHGLLYNQEQRLWYIGPMFRHERPQKG RYRQFHQLGCEVFGLQGPDIDAELIMLTARWWRALGISEHVTLELNSIGSLEARANYRDA LVAFLEQHKEKLDEDCKRRMYTNPLRVLDSKNPEVQALLNDAPALGDYLDEESREHFAGL CKLLESAGIAYTVKQRLVRGLDYYNRTVFEWVTNSLGSQGTVCAGGRYDGLVEQLGGRAT PAVGFAMGLERLVLLVQAVNPEFKADPVVDIYLVASGADTQSAAMALAERLRDELPGVKL MTNHGGGNFKKQFARADKWGARVAVVLGESEVANGTAVVKDLRSGEQTAVAQDSVAAHLR TLLG >gi|296494417|gb|ADTN01000321.1| GENE 15 13971 - 15089 1243 372 aa, chain - ## HITS:1 COG:ECs3377 KEGG:ns NR:ns ## COG: ECs3377 COG0821 # Protein_GI_number: 15832631 # Func_class: I Lipid transport and metabolism # Function: Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis # Organism: Escherichia coli O157:H7 # 1 372 1 372 372 711 100.0 0 MHNQAPIQRRKSTRIYVGNVPIGDGAPIAVQSMTNTRTTDVEATVNQIKALERVGADIVR VSVPTMDAAEAFKLIKQQVNVPLVADIHFDYRIALKVAEYGVDCLRINPGNIGNEERIRM VVDCARDKNIPIRIGVNAGSLEKDLQEKYGEPTPQALLESAMRHVDHLDRLNFDQFKVSV KASDVFLAVESYRLLAKQIDQPLHLGITEAGGARSGAVKSAIGLGLLLSEGIGDTLRVSL AADPVEEIKVGFDILKSLRIRSRGINFIACPTCSRQEFDVIGTVNALEQRLEDIITPMDV SIIGCVVNGPGEALVSTLGVTGGNKKSGLYEDGVRKDRLDNNDMIDQLEARIRAKASQLD EARRIDVQQVEK >gi|296494417|gb|ADTN01000321.1| GENE 16 15116 - 16129 557 337 aa, chain - ## HITS:1 COG:ECs3378 KEGG:ns NR:ns ## COG: ECs3378 COG1426 # Protein_GI_number: 15832632 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 337 1 337 337 545 99.0 1e-155 MNTEATHDQNEALTTGARLRNAREQLGLSQQAVAERLCLKVSTVRDIEEDKAPADLASTF LRGYIRSYARLVHIPEEELLPGLEKQAPLRAAKVAPMQSFSLGKRRKKRDGWLMTFTWLV LFVVIGLSGAWWWQDHKAQQEEITTMADQSSAELSSNSEQGQSVPLNTSTTTDPATTSTP PASVDTTATNTQTPAVTAPAPAVDPQQNAVVSPSQANVDTAATPAPTAATTPDGAAPLPT 
DQAGVTTPVADPNALVMNFTADCWLEVTDATGKKLFSGMQRKDGNLNLTGQAPYKLKIGA PAAVQIQYQGKPVDLSRFIRTNQVARLTLNAEQSPAQ >gi|296494417|gb|ADTN01000321.1| GENE 17 16414 - 17568 1378 384 aa, chain - ## HITS:1 COG:yfgB KEGG:ns NR:ns ## COG: yfgB COG0820 # Protein_GI_number: 16130442 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Escherichia coli K12 # 1 384 1 384 384 790 100.0 0 MSEQLVTPENVTTKDGKINLLDLNRQQMREFFKDLGEKPFRADQVMKWMYHYCCDNFDEM TDINKVLRGKLKEVAEIRAPEVVEEQRSSDGTIKWAIAVGDQRVETVYIPEDDRATLCVS SQVGCALECKFCSTAQQGFNRNLRVSEIIGQVWRAAKIVGAAKVTGQRPITNVVMMGMGE PLLNLNNVVPAMEIMLDDFGFGLSKRRVTLSTSGVVPALDKLGDMIDVALAISLHAPNDE IRDEIVPINKKYNIETFLAAVRRYLEKSNANQGRVTIEYVMLDHVNDGTEHAHQLAELLK DTPCKINLIPWNPFPGAPYGRSSNSRIDRFSKVLMSYGFTTIVRKTRGDDIDAACGQLAG DVIDRTKRTLRKRMQGEAIDIKAV >gi|296494417|gb|ADTN01000321.1| GENE 18 17718 - 18149 517 143 aa, chain - ## HITS:1 COG:ECs3380 KEGG:ns NR:ns ## COG: ECs3380 COG0105 # Protein_GI_number: 15832634 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside diphosphate kinase # Organism: Escherichia coli O157:H7 # 1 143 1 143 143 277 100.0 4e-75 MAIERTFSIIKPNAVAKNVIGNIFARFEAAGFKIVGTKMLHLTVEQARGFYAEHDGKPFF DGLVEFMTSGPIVVSVLEGENAVQRHRDLLGATNPANALAGTLRADYADSLTENGTHGSD SVESAAREIAYFFGEGEVCPRTR >gi|296494417|gb|ADTN01000321.1| GENE 19 18298 - 20475 1387 725 aa, chain - ## HITS:1 COG:pbpC KEGG:ns NR:ns ## COG: pbpC COG4953 # Protein_GI_number: 16130444 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase/penicillin-binding protein PbpC # Organism: Escherichia coli K12 # 1 725 46 770 770 1400 99.0 0 MAQDGTPLWRFADADGIWRYPVTIEDVSPRYLEALINYEDRWFWKHPGVNPFSVARAAWQ DLTSGRVISGGSTLTMQVARLLDPHPKTFGGKIRQLWRALQLEWHLSKREILTLYLNRAP FGGTLQGIGAASWAYLGKSPANLSYSEAAMLAVLPQAPSRLRPDRWPERAEAARNKVLER MAVQGVWSREQVKESREEPIWLAPRQMPQLAPLFSRMMLGKSKSDKITTTLDAGLQRRLE ELAQNWKGRLPPRSSLAMIVVDHTDMRVRGWVGSVDLNDDSRFGHVDMVNSIRSPGSVLK 
PFVYGLALDEGLIHPASLLQDVPRRTGDYRPGNFDSGFHGPISMSEALVRSLNLPAVQVL EAYGPKRFAAKLRNVGLPLYLPNGAAPNLSLILGGAGAKLEDMAAAYTAFARHGKAGKLR LQPDDPLLERPLMSSGAAWIIRRIMADEAQPLPDSALPRVAPLAWKTGTSYGYRDAWAIG VNARYVIGIWTGRPDGTPVVGQFGFASAVPLLNQVNNILLSRSANLPEDPRPNSVTRGVI CWPGGQSLPEGDGNCRRRLATWLLDGSQPPTLLLPEQEGINGIRFPIWLDENGKRVAADC PQARQEMINVWPLPLEPWLPASERRAVRLPPASTSCPPYGHDAQLPLQLTGVRDGAIIKR LPGAAEATLPLQSSGGAGERWWFLNGEPLTERGRNVTLHLTDKGDYQLLVMDDVGQIATV KFVMQ >gi|296494417|gb|ADTN01000321.1| GENE 20 20611 - 25572 4666 1653 aa, chain - ## HITS:1 COG:yfhM KEGG:ns NR:ns ## COG: yfhM COG2373 # Protein_GI_number: 16130445 # Func_class: R General function prediction only # Function: Large extracellular alpha-helical protein # Organism: Escherichia coli K12 # 1 1653 1 1653 1653 3239 100.0 0 MKKLRVAACMLMLALAGCDNNDNAPTAVKKDAPSEVTKAASSENASSAKLSVPERQKLAQ QSAGKVLTLLDLSEVQLDGAATLVLTFSIPLDPDQDFSRVIHVVDKKSGKVDGAWELSDN LKELRLRHLEPKRDLIVTIGKEVKALNNATFSKDYEKTITTRDIQPSVGFASRGSLLPGK VVEGLPVMALNVNNVDVNFFRVKPESLPAFISQWEYRNSLANWQSDKLLQMADLVYTGRF DLNPARNTREKLLLPLGDIKPLQQAGVYLAVMNQAGRYDYSNPATLFTLSDIGVSAHRYH NRLDIFTQSLENGAAQQGIEVSLLNEKGQTLTQATSDAQGHVQLENDKNAALLLARKDGQ TTLLDLKLPALDLAEFNIAGAPGYSKQFFMFGPRDLYRPGETVILNGLLRDADGKALPNQ PIKLDVIKPDGQVLRSVVSQPENGLYHFTWPLDSNAATGMWHIRANTGDNQYRMWDFHVE DFMPERMALNLTGEKTPLTPKDEVKFSVVGYYLYGAPANGNTLQGQLFLRPLREAVSALP GFEFGDIAAENLSRTLDEVQLTLDDKGRGEVSTESQWKETHSPLQVIFQGSLLESGGRPV TRRAEQAIWPADALPGIRPQFASKSVYDYRTDSTVKQPIVDEGSNAAFDIVYSDAQGVKK AVSGLQVRLIRERRDYYWNWSEDEGWQSQFDQKDLIENEQTLDLKADETGKVSFPVEWGA YRLEVKAPNEAVSSVRFWAGYSWQDNSDGSGAVRPDRVTLKLDKASYRPGDTIKLHIAAP TAGKGYAMVESSEGPLWWQEIDVRAQGLDLTIPVDKTWNRHDLYLSTLVVRPGDKSRSAT PKRAVGVLHLPLGDENRRLDLALETPAKMRPNQPLTVKIKASTKNGEKPKQVNVLVSAVD SGVLNITDYVTPDPWQAFFGQKRYGADIYDIYGQVIEGQGRLAALRFGGDGDELKRGGKP PVNHVNIVVQQALPVTLNEQGEGSVTLPIGDFNGELRVMAQAWTADDFGSNESKVIVAAP VIAELNMPRFMASGDTSRLTLDITNLTDKPQKLNVALTASGLLELVSDSPAAVELAPGVR TTLFIPVRALPGYGDGEIQATISGLALPGETVADQHKQWKIGVRPAFPAQTVNYGTALQP GETWAIPADGLQNFSPVTLEGQLLLSGKPPLNIARYIKELKAYPYGCLEQTASGLFPSLY 
TNAAQLQALGIKGDSDEKRRASVDIGISRLLQMQRDNGGFALWDKNGDEEYWLTAYVMDF LVRAGEQGYSVPTDAINRGNERLLRYLQDPGMMSIPYADNLKASKFAVQSYAALVLARQQ KAPLGALREIWEHRADAASGLPLLQLGVALKTMGDATRGEEAIALALKTPRNSDERIWLG DYGSSLRDNALMLSLLEENKLLPDEQYTLLNTLSQQAFGERWLSTQESNALFLAARTIQD LPGKWQAQTSFSAEQLTGEKAQNSNLNSDQLVTLQVSNSGDQPLWLRMDASGYPQSAPLP ANNVLQIERHILGTDGKSKSLDSLRSGDLVLVWLQVKASNSVPDALVVDLLPAGLELENQ NLANGSASLEQSGGEVQNLLNQMQQASIKHIEFRDDRFVAAVAVDEYQPVTLVYLARAVT PGTYQVPQPMVESMYVPQWRATGAAEDLLIVRP >gi|296494417|gb|ADTN01000321.1| GENE 21 25779 - 26624 836 281 aa, chain + ## HITS:1 COG:sseA KEGG:ns NR:ns ## COG: sseA COG2897 # Protein_GI_number: 16130446 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Escherichia coli K12 # 1 281 54 334 334 569 100.0 1e-162 MSTTWFVGADWLAEHIDDPEIQIIDARMASPGQEDRNVAQEYLNGHIPGAVFFDIEALSD HTSPLPHMLPRPETFAVAMRELGVNQDKHLIVYDEGNLFSAPRAWWMLRTFGVEKVSILG GGLAGWQRDDLLLEEGAVELPEGEFNAAFNPEAVVKVTDVLLASHENTAQIIDARPAARF NAEVDEPRPGLRRGHIPGALNVPWTELVREGELKTTDELDAIFFGRGVSYDKPIIVSCGS GVTAAVVLLALATLDVPNVKLYDGAWSEWGARADLPVEPVK Prediction of potential genes in microbial genomes Time: Mon May 16 00:14:31 2011 Seq name: gi|296494416|gb|ADTN01000322.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont812.2, whole genome shotgun sequence Length of sequence - 807 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Mon May 16 00:14:34 2011 Seq name: gi|296494415|gb|ADTN01000323.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont812.3, whole genome shotgun sequence Length of sequence - 9968 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 5, operones - 1 average op.length - 9.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 7 - 783 693 ## SFV_2570 enhanced serine sensitivity protein SseB
- Prom 835 - 894 7.3
2 2 Tu 1 2/1.000 - CDS 925 - 2208 1499 ## COG0260 Leucyl aminopeptidase
- Prom 2306 - 2365 1.8
- Term 2220 - 2251 3.9
3 3 Op 1 9/1.000 - CDS 2386 - 2586 379 ## COG2975 Uncharacterized protein conserved in bacteria
4 3 Op 2 13/0.000 - CDS 2598 - 2933 408 ## COG0633 Ferredoxin
5 3 Op 3 11/1.000 - CDS 2935 - 4785 2335 ## COG0443 Molecular chaperone
6 3 Op 4 . - CDS 4802 - 5317 609 ## COG1076 DnaJ-domain-containing proteins 1
7 3 Op 5 . - CDS 5325 - 5447 59 ## EcSMS35_2680 hypothetical protein
8 3 Op 6 14/0.000 - CDS 5413 - 5736 444 ## COG0316 Uncharacterized conserved protein
9 3 Op 7 20/0.000 - CDS 5753 - 6139 598 ## COG0822 NifU homolog involved in Fe-S cluster formation
10 3 Op 8 13/0.000 - CDS 6167 - 7381 1766 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes
11 3 Op 9 8/1.000 - CDS 7493 - 7981 582 ## COG1959 Predicted transcriptional regulator
- Prom 8016 - 8075 4.7
12 4 Tu 1 . - CDS 8251 - 8991 997 ## COG0565 rRNA methylase
+ Prom 9015 - 9074 3.9
13 5 Tu 1 .
+ CDS 9110 - 9913 1051 ## COG0483 Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family Predicted protein(s) >gi|296494415|gb|ADTN01000323.1| GENE 1 7 - 783 693 258 aa, chain - ## HITS:1 COG:no KEGG:SFV_2570 NR:ns ## KEGG: SFV_2570 # Name: sseB # Def: enhanced serine sensitivity protein SseB # Organism: S.flexneri_8401 # Pathway: not_defined # 1 258 4 261 261 484 100.0 1e-135 MSETKNELEDLLEKAATEPAHRPAFFRTLLESTVWVPGTAAQGEAVVEDSALDLQHWEKE DGTSVIPFFTSLEALQQAVEDEQAFVVMPVRTLFEMTLGETLFLNAKLPTGKEFMPREIS LLIGEEGNPLSSQEILEGGESLILSEVAEPPAQMIDSLTTLFKTIKPVKRAFICSIKENE EAQPNLLIGIEADGDIEEIIQATGSVATDTLPGDEPIDICQVKKGEKGISHFITEHIAPF YERRWGGFLRDFKQNRII >gi|296494415|gb|ADTN01000323.1| GENE 2 925 - 2208 1499 427 aa, chain - ## HITS:1 COG:pepB KEGG:ns NR:ns ## COG: pepB COG0260 # Protein_GI_number: 16130448 # Func_class: E Amino acid transport and metabolism # Function: Leucyl aminopeptidase # Organism: Escherichia coli K12 # 1 427 30 456 456 855 100.0 0 MTEAMKITLSTQPADARWGEKATYSINNDGITLHLNGADDLGLIQRAARKIDGLGIKHVQ LSGEGWDADRCWAFWQGYKAPKGTRKVVWPDLDDAQRQELDNRLMIIDWVRDTINAPAEE LGPSQLAQRAVDLISNVAGDRVTYRITKGEDLREQGYMGLHTVGRGSERSPVLLALDYNP TGDKEAPVYACLVGKGITFDSGGYSIKQTAFMDSMKSDMGGAATVTGALAFAITRGLNKR VKLFLCCADNLISGNAFKLGDIITYRNGKKVEVMNTDAEGRLVLADGLIDASAQKPEMII DAATLTGAAKTALGNDYHALFSFDDALAGRLLASAAQENEPFWRLPLAEFHRSQLPSNFA ELNNTGSAAYPAGASTAAGFLSHFVENYQQGWLHIDCSATYRKAPVEQWSAGATGLGVRT IANLLTA >gi|296494415|gb|ADTN01000323.1| GENE 3 2386 - 2586 379 66 aa, chain - ## HITS:1 COG:ECs3390 KEGG:ns NR:ns ## COG: ECs3390 COG2975 # Protein_GI_number: 15832644 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 66 1 66 66 114 100.0 5e-26 MGLKWTDSREIGEALYDAYPDLDPKTVRFTDMHQWICDLEDFDDDPQASNEKILEAILLV WLDEAE >gi|296494415|gb|ADTN01000323.1| GENE 4 2598 - 2933 408 111 aa, chain - ## HITS:1 COG:ECs3391 KEGG:ns NR:ns ## COG: ECs3391 COG0633 # Protein_GI_number: 15832645 # Func_class: C Energy 
production and conversion # Function: Ferredoxin # Organism: Escherichia coli O157:H7 # 1 111 1 111 111 207 100.0 3e-54 MPKIVILPHQDLCPDGAVLEANSGETILDAALRNGIEIEHACEKSCACTTCHCIVREGFD SLPESSEQEDDMLDKAWGLEPESRLSCQARVTDEDLVVEIPRYTINHAREH >gi|296494415|gb|ADTN01000323.1| GENE 5 2935 - 4785 2335 616 aa, chain - ## HITS:1 COG:ECs3392 KEGG:ns NR:ns ## COG: ECs3392 COG0443 # Protein_GI_number: 15832646 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Escherichia coli O157:H7 # 1 616 1 616 616 1090 100.0 0 MALLQISEPGLSAAPHQRRLAAGIDLGTTNSLVATVRSGQAETLADHEGRHLLPSVVHYQ QQGHSVGYDARTNAALDTANTISSVKRLMGRSLADIQQRYPHLPYQFQASENGLPMIETA AGLLNPVRVSADILKALAARATEALAGELDGVVITVPAYFDDAQRQGTKDAARLAGLHVL RLLNEPTAAAIAYGLDSGQEGVIAVYDLGGGTFDISILRLSRGVFEVLATGGDSALGGDD FDHLLADYIREQAGIPDRSDNRVQRELLDAAIAAKIALSDADSVTVNVAGWQGEISREQF NELIAPLVKRTLLACRRALKDAGVEADEVLEVVMVGGSTRVPLVRERVGEFFGRPPLTSI DPDKVVAIGAAIQADILVGNKPDSEMLLLDVIPLSLGLETMGGLVEKVIPRNTTIPVARA QDFTTFKDGQTAMSIHVMQGERELVQDCRSLARFALRGIPALPAGGAHIRVTFQVDADGL LSVTAMEKSTGVEASIQVKPSYGLTDSEIASMIKDSMSYAEQDVKARMLAEQKVEAARVL ESLHGALAADAALLSAAERQVIDDAAAHLSEVAQGDDVDAIEQAIKNVDKQTQDFAARRM DQSVRRALKGHSVDEV >gi|296494415|gb|ADTN01000323.1| GENE 6 4802 - 5317 609 171 aa, chain - ## HITS:1 COG:ECs3393 KEGG:ns NR:ns ## COG: ECs3393 COG1076 # Protein_GI_number: 15832647 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-domain-containing proteins 1 # Organism: Escherichia coli O157:H7 # 1 171 1 171 171 271 100.0 5e-73 MDYFTLFGLPARYQLDTQALSLRFQDLQRQYHPDKFASGSQAEQLAAVQQSATINQAWQT LRHPLMRAEYLLSLHGFDLASEQHTVRDTAFLMEQLELREELDEIEQAKDEARLESFIKR VKKMFDTRHQLMVEQLDNETWDAAADTVRKLRFLDKLRSSAEQLEEKLLDF >gi|296494415|gb|ADTN01000323.1| GENE 7 5325 - 5447 59 40 aa, chain - ## HITS:1 COG:no KEGG:EcSMS35_2680 NR:ns ## KEGG: EcSMS35_2680 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_SECEC # Pathway: not_defined # 3 40 1 38 38 75 97.0 6e-13 
MSVVAAKASTFDAHTDNPTVVACAWGLFYLIRRYRGGSQP >gi|296494415|gb|ADTN01000323.1| GENE 8 5413 - 5736 444 107 aa, chain - ## HITS:1 COG:ECs3394 KEGG:ns NR:ns ## COG: ECs3394 COG0316 # Protein_GI_number: 15832648 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 107 1 107 107 207 100.0 3e-54 MSITLSDSAAARVNTFLANRGKGFGLRLGVRTSGCSGMAYVLEFVDEPTPEDIVFEDKGV KVVVDGKSLQFLDGTQLDFVKEGLNEGFKFTNPNVKDECGCGESFHV >gi|296494415|gb|ADTN01000323.1| GENE 9 5753 - 6139 598 128 aa, chain - ## HITS:1 COG:ECs3395 KEGG:ns NR:ns ## COG: ECs3395 COG0822 # Protein_GI_number: 15832649 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Escherichia coli O157:H7 # 1 128 1 128 128 237 100.0 4e-63 MAYSEKVIDHYENPRNVGSFDNNDENVGSGMVGAPACGDVMKLQIKVNDEGIIEDARFKT YGCGSAIASSSLVTEWVKGKSLDEAQAIKNTDIAEELELPPVKIHCSILAEDAIKAAIAD YKSKREAK >gi|296494415|gb|ADTN01000323.1| GENE 10 6167 - 7381 1766 404 aa, chain - ## HITS:1 COG:ECs3396 KEGG:ns NR:ns ## COG: ECs3396 COG1104 # Protein_GI_number: 15832650 # Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes # Organism: Escherichia coli O157:H7 # 1 404 9 412 412 820 100.0 0 MKLPIYLDYSATTPVDPRVAEKMMQFMTMDGTFGNPASRSHRFGWQAEEAVDIARNQIAD LVGADPREIVFTSGATESDNLAIKGAANFYQKKGKHIITSKTEHKAVLDTCRQLEREGFE VTYLAPQRNGIIDLKELEAAMRDDTILVSIMHVNNEIGVVQDIAAIGEMCRARGIIYHVD ATQSVGKLPIDLSQLKVDLMSFSGHKIYGPKGIGALYVRRKPRVRIEAQMHGGGHERGMR SGTLPVHQIVGMGEAYRIAKEEMATEMERLRGLRNRLWNGIKDIEEVYLNGDLEHGAPNI LNVSFNYVEGESLIMALKDLAVSSGSACTSASLEPSYVLRALGLNDELAHSSIRFSLGRF TTEEEIDYTIELVRKSIGRLRDLSPLWEMYKQGVDLNSIEWAHH >gi|296494415|gb|ADTN01000323.1| GENE 11 7493 - 7981 582 162 aa, chain - ## HITS:1 COG:ECs3397 KEGG:ns NR:ns ## COG: ECs3397 COG1959 # Protein_GI_number: 15832651 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 162 
1 162 162 278 100.0 4e-75 MRLTSKGRYAVTAMLDVALNSEAGPVPLADISERQGISLSYLEQLFSRLRKNGLVSSVRG PGGGYLLGKDASSIAVGEVISAVDESVDATRCQGKGGCQGGDKCLTHALWRDLSDRLTGF LNNITLGELVNNQEVLDVSGRQHTHDAPRTRTQDAIDVKLRA >gi|296494415|gb|ADTN01000323.1| GENE 12 8251 - 8991 997 246 aa, chain - ## HITS:1 COG:yfhQ KEGG:ns NR:ns ## COG: yfhQ COG0565 # Protein_GI_number: 16130457 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylase # Organism: Escherichia coli K12 # 1 246 1 246 246 483 100.0 1e-136 MLQNIRIVLVETSHTGNMGSVARAMKTMGLTNLWLVNPLVKPDSQAIALAAGASDVIGNA HIVDTLDEALAGCSLVVGTSARSRTLPWPMLDPRECGLKSVAEAANTPVALVFGRERVGL TNEELQKCHYHVAIAANPEYSSLNLAMAVQVIAYEVRMAWLATQENGEQVEHEETPYPLV DDLERFYGHLEQTLLATGFIRENHPGQVMNKLRRLFTRARPESQELNILRGILASIEQQN KGNKAE >gi|296494415|gb|ADTN01000323.1| GENE 13 9110 - 9913 1051 267 aa, chain + ## HITS:1 COG:ECs3399 KEGG:ns NR:ns ## COG: ECs3399 COG0483 # Protein_GI_number: 15832653 # Func_class: G Carbohydrate transport and metabolism # Function: Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family # Organism: Escherichia coli O157:H7 # 1 267 1 267 267 530 100.0 1e-150 MHPMLNIAVRAARKAGNLIAKNYETPDAVEASQKGSNDFVTNVDKAAEAVIIDTIRKSYP QHTIITEESGELEGTDQDVQWVIDPLDGTTNFIKRLPHFAVSIAVRIKGRTEVAVVYDPM RNELFTATRGQGAQLNGYRLRGSTARDLDGTILATGFPFKAKQYATTYINIVGKLFNECA DFRRTGSAALDLAYVAAGRVDGFFEIGLRPWDFAAGELLVREAGGIVSDFTGGHNYMLTG NIVAGNPRVVKAMLANMRDELSDALKR Prediction of potential genes in microbial genomes Time: Mon May 16 00:15:00 2011 Seq name: gi|296494414|gb|ADTN01000324.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont812.4, whole genome shotgun sequence Length of sequence - 61797 bp Number of predicted genes - 56, with homology - 55 Number of transcription units - 31, operones - 11 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 32 - 91 2.0 1 1 Tu 1 . + CDS 129 - 983 737 ## COG1073 Hydrolases of the alpha/beta superfamily + Prom 1009 - 1068 3.9 2 2 Tu 1 . 
+ CDS 1270 - 2454 865 ## COG3711 Transcriptional antiterminator - Term 2218 - 2282 -0.8 3 3 Op 1 9/0.000 - CDS 2446 - 3585 957 ## COG0477 Permeases of the major facilitator superfamily - Prom 3643 - 3702 2.1 4 3 Op 2 . - CDS 3745 - 4635 866 ## COG0583 Transcriptional regulator - Prom 4754 - 4813 5.4 + Prom 4663 - 4722 2.2 5 4 Op 1 6/0.105 + CDS 4771 - 6132 1342 ## COG4638 Phenylpropionate dioxygenase and related ring-hydroxylating dioxygenases, large terminal subunit 6 4 Op 2 4/0.632 + CDS 6129 - 6647 361 ## COG5517 Small subunit of phenylpropionate dioxygenase 7 4 Op 3 3/0.789 + CDS 6647 - 6967 134 ## COG2146 Ferredoxin subunits of nitrite reductase and ring-hydroxylating dioxygenases 8 4 Op 4 6/0.105 + CDS 6964 - 7776 845 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 9 4 Op 5 1/0.947 + CDS 7786 - 8988 1022 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 10 4 Op 6 . + CDS 9085 - 9507 442 ## COG2259 Predicted membrane protein + Term 9514 - 9557 8.4 - Term 9502 - 9545 4.6 11 5 Op 1 2/0.842 - CDS 9555 - 10427 869 ## COG2017 Galactose mutarotase and related enzymes 12 5 Op 2 3/0.789 - CDS 10439 - 11491 1047 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 13 5 Op 3 21/0.000 - CDS 11566 - 12564 1163 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 14 5 Op 4 16/0.000 - CDS 12589 - 14100 176 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 15 5 Op 5 . - CDS 14123 - 15106 1051 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 15135 - 15194 9.9 16 6 Tu 1 . - CDS 15203 - 18484 3116 ## JW5405 conserved hypothetical protein - Prom 18629 - 18688 2.9 + Prom 18518 - 18577 3.6 17 7 Tu 1 . + CDS 18602 - 19795 1126 ## COG1940 Transcriptional regulator/sugar kinase + Term 19815 - 19844 2.1 - Term 19844 - 19880 2.4 18 8 Tu 1 . 
- CDS 19993 - 21246 1625 ## COG0112 Glycine/serine hydroxymethyltransferase - Prom 21311 - 21370 5.6 + Prom 21477 - 21536 7.3 19 9 Tu 1 . + CDS 21574 - 22764 1293 ## COG1018 Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 20 10 Op 1 4/0.632 - CDS 22809 - 23147 505 ## COG0347 Nitrogen regulatory protein PII 21 10 Op 2 . - CDS 23208 - 24542 1315 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 22 10 Op 3 . - CDS 24532 - 25245 511 ## EcHS_A2708 hypothetical protein - Prom 25317 - 25376 1.7 23 11 Tu 1 . - CDS 25410 - 26837 1044 ## COG0642 Signal transduction histidine kinase - Term 27305 - 27334 0.4 24 12 Tu 1 . - CDS 27395 - 31279 4707 ## COG0046 Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain + Prom 31333 - 31392 4.0 25 13 Tu 1 . + CDS 31540 - 33096 1625 ## COG4623 Predicted soluble lytic transglycosylase fused to an ABC-type amino acid-binding protein 26 14 Op 1 2/0.842 - CDS 33093 - 33596 431 ## COG0590 Cytosine/adenosine deaminases 27 14 Op 2 . - CDS 33654 - 34289 458 ## COG0560 Phosphoserine phosphatase - Prom 34423 - 34482 3.7 28 15 Op 1 1/0.947 + CDS 34498 - 35346 778 ## COG1737 Transcriptional regulators 29 15 Op 2 . 
+ CDS 35402 - 35662 281 ## COG1145 Ferredoxin + Term 35663 - 35697 1.1 - Term 36278 - 36338 12.4 30 16 Op 1 8/0.000 - CDS 36357 - 36737 321 ## COG0736 Phosphopantetheinyl transferase (holo-ACP synthase) 31 16 Op 2 9/0.000 - CDS 36737 - 37468 1018 ## COG0854 Pyridoxal phosphate biosynthesis protein 32 16 Op 3 16/0.000 - CDS 37480 - 38208 682 ## COG1381 Recombinational DNA repair protein (RecF pathway) 33 16 Op 4 18/0.000 - CDS 38220 - 39125 1022 ## COG1159 GTPase 34 16 Op 5 13/0.000 - CDS 39122 - 39802 629 ## COG0571 dsRNA-specific ribonuclease - Prom 39964 - 40023 1.8 - Term 40016 - 40061 14.1 35 17 Op 1 14/0.000 - CDS 40074 - 41048 1056 ## COG0681 Signal peptidase I 36 17 Op 2 4/0.632 - CDS 41064 - 42863 2082 ## COG0481 Membrane GTPase LepA - Prom 42948 - 43007 5.6 - Term 43006 - 43050 -0.9 37 18 Op 1 8/0.000 - CDS 43061 - 43540 371 ## COG3086 Positive regulator of sigma E activity 38 18 Op 2 10/0.000 - CDS 43537 - 44493 747 ## COG3026 Negative regulator of sigma E activity 39 18 Op 3 11/0.000 - CDS 44493 - 45143 571 ## COG3073 Negative regulator of sigma E activity 40 18 Op 4 . - CDS 45176 - 45751 378 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 41 18 Op 5 . - CDS 45748 - 45903 76 ## ECBD_1107 hypothetical protein - Prom 45969 - 46028 5.0 + Prom 45861 - 45920 5.1 42 19 Tu 1 . + CDS 46159 - 47781 1325 ## COG0029 Aspartate oxidase - Term 47564 - 47601 -0.9 43 20 Tu 1 . - CDS 47766 - 48503 690 ## COG4123 Predicted O-methyltransferase - Prom 48542 - 48601 2.4 + Prom 48444 - 48503 4.2 44 21 Tu 1 . + CDS 48635 - 49969 1622 ## COG0513 Superfamily II DNA and RNA helicases - Term 50141 - 50170 -0.5 45 22 Tu 1 . - CDS 50178 - 51059 324 ## COG0583 Transcriptional regulator - Prom 51085 - 51144 3.8 + Prom 51066 - 51125 4.3 46 23 Tu 1 . + CDS 51162 - 51749 514 ## COG1280 Putative threonine efflux protein + Term 51759 - 51794 7.4 - Term 51747 - 51782 7.4 47 24 Tu 1 . 
- CDS 51805 - 52188 517 ## COG3445 Acid-induced glycyl radical enzyme - Prom 52372 - 52431 7.3 + Prom 52263 - 52322 5.9 48 25 Tu 1 . + CDS 52493 - 53182 650 ## COG0692 Uracil DNA glycosylase + Term 53197 - 53236 5.1 - Term 53011 - 53051 0.1 49 26 Tu 1 . - CDS 53230 - 54267 1036 ## COG0566 rRNA methylases - Prom 54407 - 54466 3.2 + Prom 54356 - 54415 6.1 50 27 Tu 1 . + CDS 54474 - 54893 178 ## PROTEIN SUPPORTED gi|124485582|ref|YP_001030198.1| ribosomal protein L12E/L44/L45/RPP1/RPP2-like protein + Term 54900 - 54932 5.4 51 28 Op 1 3/0.789 + CDS 55136 - 55660 448 ## COG3148 Uncharacterized conserved protein 52 28 Op 2 3/0.789 + CDS 55692 - 58352 2402 ## COG1042 Acyl-CoA synthetase (NDP forming) 53 29 Op 1 3/0.789 + CDS 58466 - 59821 1296 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes + Term 59823 - 59860 3.7 54 29 Op 2 . + CDS 59867 - 60190 305 ## COG5544 Predicted periplasmic lipoprotein 55 30 Tu 1 . - CDS 60187 - 61485 1068 ## COG0477 Permeases of the major facilitator superfamily - Prom 61549 - 61608 4.8 + Prom 61489 - 61548 6.1 56 31 Tu 1 . 
+ CDS 61665 - 61727 116 ## Predicted protein(s) >gi|296494414|gb|ADTN01000324.1| GENE 1 129 - 983 737 284 aa, chain + ## HITS:1 COG:yfhR KEGG:ns NR:ns ## COG: yfhR COG1073 # Protein_GI_number: 16130459 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Escherichia coli K12 # 1 284 10 293 293 587 100.0 1e-168 MALPVNKRVPKILFILFVVAFCVYLVPRVAINFFYYPDDKIYGPDPWSAESVEFTAKDGT RLQGWFIPSSTGPADNAIATIIHAHGNAGNMSAHWPLVSWLPERNFNVFMFDYRGFGKSK GTPSQAGLLDDTQSAINVVRHRSDVNPQRLVLFGQSIGGANILDVIGRGDREGIRAVILD STFASYATIANQMIPGSGYLLDESYSGENYIASVSPIPLLLIHGKADHVIPWQHSEKLYS LAKEPKRLILIPDGEHIDAFSDRHGDVYREQMVDFILSALNPQN >gi|296494414|gb|ADTN01000324.1| GENE 2 1270 - 2454 865 394 aa, chain + ## HITS:1 COG:ECs3401 KEGG:ns NR:ns ## COG: ECs3401 COG3711 # Protein_GI_number: 15832655 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Escherichia coli O157:H7 # 1 394 40 433 433 756 100.0 0 MATFSELNGVDDDIASLDISETGREILRYHQLTLTTGYDGSYRVEGTVLNQRLCLFHWLR RGFRLCPSFITSQFTPALKSELKRRGIARNFYDDTNLQALVNLCSRRLQKRFESRDIHFL CLYLQYCLLQHHAGITPQFNPLQRRWAESCLEFQVAQEIGRHWQRRALQPVPPDEPLFMA LLFSMLRVPDPLRDAHQRDRQLRQSIKRLVNHFRELGNVRFYDEQGLCDQLYTHLAQALN RSLFAIGIDNTLPEEFARLYPRLVRTTRAALAGFESEYGVHLSDEESGLVAVIFGAWLMQ ENDLHEKQIILLTGNDSEREAQIEQQLRELTLLPLNIKHMSVKAFLQTGAPRGAALIIAP YTMPLPLFSPPLIYTDLTLTTHQQEQIRKMLESA >gi|296494414|gb|ADTN01000324.1| GENE 3 2446 - 3585 957 379 aa, chain - ## HITS:1 COG:hcaT KEGG:ns NR:ns ## COG: hcaT COG0477 # Protein_GI_number: 16130461 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 379 1 379 379 636 100.0 0 MVLQSTRWLALGYFTYFFSYGIFLPFWSVWLKGIGLTPETIGLLLGAGLVARFLGSLLIA PRVSDPSRLISALRVLALLTLLFAVAFWAGAHVAWLMLVMIGFNLFFSPLVPLTDALANT WQKQFPLDYGKVRLWGSVAFVIGSALTGKLVTMFDYRVILALLTLGVASMLLGFLIRPTI 
QPQGASRQQESTGWSAWLALVRQNWRFLACVCLLQGAHAAYYGFSAIYWQAAGYSASAVG YLWSLGVVAEVIIFALSNKLFRRCSARDMLLISAICGVVRWGIMGATTALPWLIVVQILH CGTFTVCHLAAMRYIAARQGSEVIRLQAVYSAVAMGGSIAIMTVFAGFLYQYLGHGVFWV MALVALPAMFLRPKVVPSC >gi|296494414|gb|ADTN01000324.1| GENE 4 3745 - 4635 866 296 aa, chain - ## HITS:1 COG:hcaR KEGG:ns NR:ns ## COG: hcaR COG0583 # Protein_GI_number: 16130462 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 296 1 296 296 568 100.0 1e-162 MELRHLRYFVAVAQALNFTRAAEKLHTSQPSLSSQIRDLENCVGVPLLVRDKRKVALTAA GECFLQDALAILEQAENAKLRARKIVQEDRQLTIGFVPSAEVNLLPKVLPMFRLRQPDTL IELVSLITTQQEEKIRRGELDVGLMRHPVYSPEIDYLELFDEPLVVVLPVDHPLAHEKEI TAAQLDGVNFVSTDPVYSGSLAPIVKAWFAQENSQPNIVQVATNILVTMNLVGMGLGVTL IPGYMNNFNTGQVVFRPIAGNVPSIALLMAWKKGEMKPALRDFIAIVQERLASVTA >gi|296494414|gb|ADTN01000324.1| GENE 5 4771 - 6132 1342 453 aa, chain + ## HITS:1 COG:ECs3404 KEGG:ns NR:ns ## COG: ECs3404 COG4638 # Protein_GI_number: 15832658 # Func_class: P Inorganic ion transport and metabolism; R General function prediction only # Function: Phenylpropionate dioxygenase and related ring-hydroxylating dioxygenases, large terminal subunit # Organism: Escherichia coli O157:H7 # 1 453 1 453 453 952 100.0 0 MTTPSDLNIYQLIDTQNGRVTPRIYTDPDIYQLELERIFGRCWLFLAHESQIPKPGDFFN TYMGEDAVVVVRQKDGSIKAFLNQCRHRAMRVSYADCGNTRAFTCPYHGWSYGINGELID VPLEPRAYPQGLCKSHWGLNEVPCVESYKGLIFGNWDTSAPGLRDYLGDIAWYLDGMLDR REGGTEIVGGVQKWVINCNWKFPAEQFASDQYHALFSHASAVQVLGAKDDGSDKRLGDGQ TARPVWETAKDALQFGQDGHGSGFFFTEKPDANVWVDGAVSSYYRETYAEAEQRLGEVRA LRLAGHNNIFPTLSWLNGTATLRVWHPRGPDQVEVWAFCITDKAASDEVKAAFENSATRA FGPAGFLEQDDSENWCEIQKLLKGHRARNSKLCLEMGLGQEKRRDDGIPGITNYIFSETA ARGMYQRWADLLSSESWQEVLDKTAAYQQEVMK >gi|296494414|gb|ADTN01000324.1| GENE 6 6129 - 6647 361 172 aa, chain + ## HITS:1 COG:hcaA2 KEGG:ns NR:ns ## COG: hcaA2 COG5517 # Protein_GI_number: 16130464 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Small subunit of phenylpropionate dioxygenase # Organism: 
Escherichia coli K12 # 1 172 1 172 172 320 100.0 1e-87 MSAQVSLELHHRISQFLFHEASLLDDWKFRDWLAQLDEEIRYTMRTTVNAQTRDRRKGVQ PPTTWIFNDTKDQLERRIARLETGMAWAEEPPSRTRHLISNCQISETDIPNVFAVRVNYL LYRAQKERDETFYVGTRFDKVRRLEDDNWRLLERDIVLDQAVITSHNLSVLF >gi|296494414|gb|ADTN01000324.1| GENE 7 6647 - 6967 134 106 aa, chain + ## HITS:1 COG:ECs3406 KEGG:ns NR:ns ## COG: ECs3406 COG2146 # Protein_GI_number: 15832660 # Func_class: P Inorganic ion transport and metabolism; R General function prediction only # Function: Ferredoxin subunits of nitrite reductase and ring-hydroxylating dioxygenases # Organism: Escherichia coli O157:H7 # 1 106 1 106 106 209 100.0 1e-54 MNRIYACPVADVPEGEALRIDTSPVIALFNVGGEFYAINDRCSHGNASMSEGYLEDDATV ECPLHAASFCLKTGKALCLPATDPLTTYPVHVEGGDIFIDLPEAQP >gi|296494414|gb|ADTN01000324.1| GENE 8 6964 - 7776 845 270 aa, chain + ## HITS:1 COG:hcaB KEGG:ns NR:ns ## COG: hcaB COG1028 # Protein_GI_number: 16130466 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Escherichia coli K12 # 1 270 1 270 270 510 100.0 1e-144 MSDLHNESIFITGGGSGLGLALVERFIEEGAQVATLELSAAKVASLRQRFGEHILAVEGN VTCYADYQRAVDQILTRSGKLDCFIGNAGIWDHNASLVNTPAETLETGFHELFNVNVLGY LLGAKACAPALIASEGSMIFTLSNAAWYPGGGGPLYTASKHAATGLIRQLAYELAPKVRV NGVGPCGMASDLRGPQALGQSETSIMQSLTPEKIAAILPLQFFPQPADFTGPYVMLTSRR NNRALSGVMINADAGLAIRGIRHVAAGLDL >gi|296494414|gb|ADTN01000324.1| GENE 9 7786 - 8988 1022 400 aa, chain + ## HITS:1 COG:ECs3408 KEGG:ns NR:ns ## COG: ECs3408 COG0446 # Protein_GI_number: 15832662 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Escherichia coli O157:H7 # 1 400 1 400 400 746 97.0 0 MKEKTIIIVGGGQAAAMAAASLRQQGFTGELHLFSDERHLPYERPPLSKSMLLEDSPQLQ QVLPANWWQENNVHLHSGVTIKTLGRDTRELVLTNGESWHWDQLFIATGAAARPLPLLDA 
LGERCFTLRHAGDAARLREVLQPERSVVIIGAGTIGLELAASATQRRCKVTVIELAATVM GRNAPPPVQRYLLQRHQQAGVRILLNNAIEHVVDGEKVELTLQSGETLQADVVIYGIGIS ANEQLAREANLDTANGIVIDEACRTCDPAIFAGGDVAITRLDNGALHRCESWENANNQAQ IAAAAMLGLPLPLLPPPWFWSDQYSDNLQFIGDMRGDDWLCRGNPETQKAIWFNLQNGVL IGAVTLNQGREIRPIRKWIQSGKTFDAKLLIDENIALKSL >gi|296494414|gb|ADTN01000324.1| GENE 10 9085 - 9507 442 140 aa, chain + ## HITS:1 COG:yphA KEGG:ns NR:ns ## COG: yphA COG2259 # Protein_GI_number: 16130468 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 140 25 164 164 236 100.0 7e-63 MNTLRYFDFGAARPVLLLIARIAVVLIFIIFGFPKMMGFDGTVQYMASLGAPMPMLAAII AVVMEVPAAILIVLGFFTRPLAVLFIFYTLGTAVIGHHYWDMTGDAVGPNMINFWKNVSI AGAFLLLAITGPGAISLDRR >gi|296494414|gb|ADTN01000324.1| GENE 11 9555 - 10427 869 290 aa, chain - ## HITS:1 COG:yphB KEGG:ns NR:ns ## COG: yphB COG2017 # Protein_GI_number: 16130469 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Escherichia coli K12 # 1 290 1 290 290 596 100.0 1e-170 MTIYTLSHGSLKLDVSDQGGVIEGFWRDTTPLLRPGKKSGVATDASCFPLVPFANRVSGN RFVWQGREYQLQPNVEWDAHYLHGDGWLGEWQCVSHSDDSLCLVYEHRSGVYHYRVSQAF HLTADTLTVTLSVTNQGAETLPFGTGWHPYFPLSPQTRIQAQASGYWLEREQWLAGEFCE QLPQELDFNQPAPLPRQWVNNGFAGWNGQARIEQPQEGYAIIMETTPPAPCYFIFVSDPA FDKGYAFDFFCLEPMSHAPDDHHRPEGGDLIALAPGESTTSEMSLRVEWL >gi|296494414|gb|ADTN01000324.1| GENE 12 10439 - 11491 1047 350 aa, chain - ## HITS:1 COG:yphC KEGG:ns NR:ns ## COG: yphC COG1063 # Protein_GI_number: 16130470 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 350 15 364 364 723 99.0 0 MLAAYLPGNSTVDLREVAVPTPGINQVLIKMKSSGICGSDVHYIYHQHRATAAAPDKPLY QGFINGHEPCGQIVAMGQGCRHFKEGDRVLVYHISGCGFCPNCRRGFPISCTGEGKAAYG WQRDGGHAEYLLAEEKDLILLPDALSYEDGAFISCGVGTAYEGILRGEVSGSNNVLVVGL GPVGMMAMMLAKGRGAKRIIGVDMLPERLAMAKQLGVMDHGYLATTEGLPQIIAELTHGG 
ADVALDCSGNAAGRLLALQSTADWGRVVYIGETGKVEFEVSADLMHHQRRIIGSWVTSLF HMEKCAHDLTDWKLWPRNAITHRFSLEQAGDAYALMASGKCGKVVINFPD >gi|296494414|gb|ADTN01000324.1| GENE 13 11566 - 12564 1163 332 aa, chain - ## HITS:1 COG:ECs3412 KEGG:ns NR:ns ## COG: ECs3412 COG1172 # Protein_GI_number: 15832666 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Escherichia coli O157:H7 # 1 332 1 332 332 492 99.0 1e-139 MSASSLPLPQGKSVSLKQFVSRHINEIGLLVVIAILYLVFSLNAPGFISLNNQMNVLRDA ATIGIAAWAMTLIIISGEIDVSVGPMVAFVSVCLAFLLQFEVPLAVACLLVLLLGALMGT LAGVLRGVFNVPSFVATLGLWSALRGMGLFMTNALPVPIDENEVLDWLGGQFLGVPVSAL IMIVLFALFVFISRKTAFGRSVFAVGGNATAAQLCGINVRRVRILIFTLSGLLAAVTGIL LAARLGSGNAGAANGLEFDVIAAVVVGGTALSGGRGSLFGTLLGVLVITLIGNGLVLLGI NSFFQQVVRGVIIVVAVLANILLTQRSSKAKR >gi|296494414|gb|ADTN01000324.1| GENE 14 12589 - 14100 176 503 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 262 480 7 219 318 72 25 6e-12 MFTATEAVPVAKVVAGNKRYPGVVALDNVNFTLNKGEVRALLGKNGAGKSTLIRMLTGSE RPDSGDIWIGETRLEGDEATLTRRAAELGVRAVYQELSLVEGLTVAENLCLGQWPRRNGM IDYLQMAQDAQRCLQALGVDVSPEQLVSTLSPAQKQLVEIARVMKGEPRVVILDEPTSSL ASAEVELVISAVKKMSALGVAVIYVSHRMEEIRRIASCATVMRDGQVAGDVMLENTSTHH IVSLMLGRDHVDIAPVAPQEIVDQAVLEVRALRHKPKLEDISFTLRRGEVLGIAGLLGAG RSELLKAIVGLEEYEQGEIVINGEKITRPDYGDMLKRGIGYTPENRKEAGIIPWLGVDEN TVLTNRQKISANGVLQWSTIRRLTEEVMQRMTVKAASSETPIGTLSGGNQQKVVIGRWVY AASQILLLDEPTRGVDIEAKQQIYRIVRELAAEGKSVVFISSEVEELPLVCDRILLLQHG TFSQEFHAPVNVDELMSAILSVH >gi|296494414|gb|ADTN01000324.1| GENE 15 14123 - 15106 1051 327 aa, chain - ## HITS:1 COG:yphF KEGG:ns NR:ns ## COG: yphF COG1879 # Protein_GI_number: 16130473 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli K12 # 1 327 1 327 327 605 100.0 1e-173 MPTKMRTTRNLLLMATLLGSALFARAAEKEMTIGAIYLDTQGYYAGVRQGVQDAAKDSSV 
QVQLIETNAQGDISKESTFVDTLVARNVDAIILSAVSENGSSRTVRRASEAGIPVICYNT CINQKGVDKYVSAYLVGDPLEFGKKLGNAAADYFIANKIDQPKIAVINCEAFEVCVQRRK GFEEVLKSRVPGAQIVANQEGTVLDKAISVGEKLIISTPDLNAIMGESGGATLGAVKAVR NQNQAGKIAVFGSDMTTEIAQELENNQVLKAVVDISGKKMGNAVFAQTLKVINKQADGEK VIQVPIDLYTKTEDGKQWLATHVDGLP >gi|296494414|gb|ADTN01000324.1| GENE 16 15203 - 18484 3116 1093 aa, chain - ## HITS:1 COG:no KEGG:JW5405 NR:ns ## KEGG: JW5405 # Name: yphG # Def: conserved hypothetical protein # Organism: E.coli_J # Pathway: not_defined # 1 1093 1 1093 1093 2224 99.0 0 MTPVKVWQERVEIPTYETGPQDIHPMFLENRVYQGSSGAVYPYGVTDTLSEQKTLKSWQA VWLENDYIKVMILPELGGRVHRAWDKVKQRDFVYHNEVIKPALVGLLGPWISGGIEFNWP QHHRPTTFMPVDFTLEAHEDGAQTVWVGETEPMHGLQVMTGFTLRPDRAALEIASRVYNG NATPRHFLWWANPAVKGGEGHQSVFPPDVTAVFDHGKRAVSAFPIATGTYYKVDYSAGVD ISRYKNVPVPTSYMAEKSQYDFVGAWCHDEDGGLLHVANHHIAPGKKQWSWGHSEFGQAW DKSLTDNNGPYIELMTGIFADNQPDFTWLDAYEEKRFEQYFLPYHSLGMVQNASRDAVIK LQRSKRGIEWGLYAISPLNGYRLAIREIGKCNALLDDAVALMPATAIQGVLHGINPERLT IELSDADGNIVLSYQEHQPQALPLPDVAKAPLAAQDITSTDEAWFIGQHLEQYHHASRSP FDYYLRGVALDPLDYRCNLALAMLEYNRADFPQAVAYATQALKRAHSLNKNPQCGQASLI RASAYERQGQYQQAEEDFWRAVWSGNSKAGGYYGLARLAARNGNFDAGLDFCQQSLRACP TNQEVLCLHNLLLVLSGRQDNARVQREKLLRDYPLNATLWWLNWFDGRSESALAQWRGLC QGRDVNALMTAGQLINWGMPTLAAEMLNALDCQRTLPLYLQASLLPKAERGELVAKAIDV FPQFVRFPNTLEEVAALESIEECWFARHLLACFYYNKRSYNKAIAFWQRCVEMSPEFADG WRGLAIHAWNKQHDYELAARYLDNAYQLAPQDARLLFERDLLDKLSGATPEKRLARLENN LEIALKRDDMTAELLNLWHLTGQADKAADILATRKFHPWEGGEGKVTSQFILNQLLRAWQ HLDARQPQQACELLHAALHYPENLSEGRLPGQTDNDIWFWQAICANAQGDETEATRCLRL AATGDRTINIHSYYNDQPVDYLFWQGMALRLLGEQQTAQQLFSEMKQWAQEMAKTSIEAD FFAVSQPDLLSLYGDLQQQHKEKCLMVAMLASAGLGEVAQYESARAELTAINPAWPKAAL FTTVMPFIFNRVH >gi|296494414|gb|ADTN01000324.1| GENE 17 18602 - 19795 1126 397 aa, chain + ## HITS:1 COG:yphH KEGG:ns NR:ns ## COG: yphH COG1940 # Protein_GI_number: 16130475 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Escherichia coli K12 # 1 397 3 399 399 818 100.0 0 
MRACINNQQIRHHNKCVILELLYRQKRANKSTLARLAQISIPAVSNILQELESEKRVVNI DDESQTRGHSSGTWLIAPEGDWTLCLNVTPTSIECQVANACLSPKGEFEYLQIDAPTPQA LLSEIEKCWHRHRKLWPDHTINLALAIHGQVDPVTGVSQTMPQAPWTTPVEVKYLLEEKL GIRVMVDNDCVMLALAEKWQNNSQERDFCVINVDYGIGSSFVINEQIYRGSLYGSGQIGH TIVNPDGVVCDCGRYGCLETVASLSALKKQARVWLKSQPVSTQLDPEKLTTAQLIAAWQS GEPWITSWVDRSANAIGLSLYNFLNILNINQIWLYGRSCAFGENWLNTIIRQTGFNPFDR DEGPSVKATQIGFGQLSRAQQVLGIGYLYVEAQLRQI >gi|296494414|gb|ADTN01000324.1| GENE 18 19993 - 21246 1625 417 aa, chain - ## HITS:1 COG:glyA KEGG:ns NR:ns ## COG: glyA COG0112 # Protein_GI_number: 16130476 # Func_class: E Amino acid transport and metabolism # Function: Glycine/serine hydroxymethyltransferase # Organism: Escherichia coli K12 # 1 417 1 417 417 811 100.0 0 MLKREMNIADYDAELWQAMEQEKVRQEEHIELIASENYTSPRVMQAQGSQLTNKYAEGYP GKRYYGGCEYVDIVEQLAIDRAKELFGADYANVQPHSGSQANFAVYTALLEPGDTVLGMN LAHGGHLTHGSPVNFSGKLYNIVPYGIDATGHIDYADLEKQAKEHKPKMIIGGFSAYSGV VDWAKMREIADSIGAYLFVDMAHVAGLVAAGVYPNPVPHAHVVTTTTHKTLAGPRGGLIL AKGGSEELYKKLNSAVFPGGQGGPLMHVIAGKAVALKEAMEPEFKTYQQQVAKNAKAMVE VFLERGYKVVSGGTDNHLFLVDLVDKNLTGKEADAALGRANITVNKNSVPNDPKSPFVTS GIRVGTPAITRRGFKEAEAKELAGWMCDVLDSINDEAVIERIKGKVLDICARYPVYA >gi|296494414|gb|ADTN01000324.1| GENE 19 21574 - 22764 1293 396 aa, chain + ## HITS:1 COG:hmp_2 KEGG:ns NR:ns ## COG: hmp_2 COG1018 # Protein_GI_number: 16130477 # Func_class: C Energy production and conversion # Function: Flavodoxin reductases (ferredoxin-NADPH reductases) family 1 # Organism: Escherichia coli K12 # 150 396 1 247 247 517 100.0 1e-146 MLDAQTIATVKATIPLLVETGPKLTAHFYDRMFTHNPELKEIFNMSNQRNGDQREALFNA IAAYASNIENLPALLPAVEKIAQKHTSFQIKPEQYNIVGEHLLATLDEMFSPGQEVLDAW GKAYGVLANVFINREAEIYNENASKAGGWEGTRDFRIVAKTPRSALITSFELEPVDGGAV AEYRPGQYLGVWLKPEGFPHQEIRQYSLTRKPDGKGYRIAVKREEGGQVSNWLHNHANVG DVVKLVAPAGDFFMAVADDTPVTLISAGVGQTPMLAMLDTLAKAGHTAQVNWFHAAENGD VHAFADEVKELGQSLPRFTAHTWYRQPSEADRAKGQFDSEGLMDLSKLEGAFSDPTMQFY LCGPVGFMQFTAKQLVDLGVKQENIHYECFGPHKVL >gi|296494414|gb|ADTN01000324.1| GENE 20 22809 - 23147 505 112 aa, chain - ## HITS:1 
COG:ECs3419 KEGG:ns NR:ns ## COG: ECs3419 COG0347 # Protein_GI_number: 15832673 # Func_class: E Amino acid transport and metabolism # Function: Nitrogen regulatory protein PII # Organism: Escherichia coli O157:H7 # 1 112 1 112 112 191 100.0 4e-49 MKKIDAIIKPFKLDDVREALAEVGITGMTVTEVKGFGRQKGHTELYRGAEYMVDFLPKVK IEIVVPDDIVDTCVDTIIRTAQTGKIGDGKIFVFDVARVIRIRTGEEDDAAI >gi|296494414|gb|ADTN01000324.1| GENE 21 23208 - 24542 1315 444 aa, chain - ## HITS:1 COG:ECs3420 KEGG:ns NR:ns ## COG: ECs3420 COG2204 # Protein_GI_number: 15832674 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli O157:H7 # 1 444 1 444 444 837 100.0 0 MSHKPAHLLLVDDDPGLLKLLGLRLTSEGYSVVTAESGAEGLRVLNREKVDLVISDLRMD EMDGMQLFAEIQKVQPGMPVIILTAHGSIPDAVAATQQGVFSFLTKPVDKDALYQAIDDA LEQSAPATDERWREAIVTRSPLMLRLLEQARLVAQSDVSVLINGQSGTGKEIFAQAIHNA SPRNSKPFIAINCGALPEQLLESELFGHARGAFTGAVSNREGLFQAAEGGTLFLDEIGDM PAPLQVKLLRVLQERKVRPLGSNRDIDINVRIISATHRDLPKAMARGEFREDLYYRLNVV SLKIPALAERTEDIPLLANHLLRQAAERHKPFVRAFSTDAMKRLMTASWPGNVRQLVNVI EQCVALTSSPVISDALVEQALEGENTALPTFVEARNQFELNYLRKLLQITKGNVTHAARM AGRNRTEFYKLLSRHELDANDFKE >gi|296494414|gb|ADTN01000324.1| GENE 22 24532 - 25245 511 237 aa, chain - ## HITS:1 COG:no KEGG:EcHS_A2708 NR:ns ## KEGG: EcHS_A2708 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_HS # Pathway: not_defined # 1 237 3 239 239 410 99.0 1e-113 MRHIFQRLLPRRLWLAGLPCLALLGCVQNHNKPAIDTPAEEKIPVYQLADYLSTECSDIW ALQGKSTETNPLYWLRAMDCADRLMPAQSRQQARQYDDGSWQNTFKQGILLADAKITPYE RRQLVARIEALSTEIPAQVRPLYQLWRDGQALQLQLAEERQRYSKLQQSSDSELDTLRQQ HHVLQQQLELTTRKLENLTDIERQLSTRKPAGNFSPDTPHESEKPAPSTHEVTPDEP >gi|296494414|gb|ADTN01000324.1| GENE 23 25410 - 26837 1044 475 aa, chain - ## HITS:1 COG:yfhK KEGG:ns NR:ns ## COG: yfhK COG0642 # Protein_GI_number: 16130481 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 1 475 22 
496 496 872 99.0 0 MKRWPVFPRSLRQLVMLAFLLILLPLLVLAWQAWQSLNALSDQAALVNRTTLIDARRSEA MTNAALEMERSYRQYCVLDDPTLAKVYQSQRKRYSEMLDAHAGVLPDDKLYQALRQDLHN LAQLQCNNSGPDAAAAARLEAFASANTEMVQATRTVVFSRGQQLQREIAERGQYFGWQSL VLFLVSLVMVLLFTRMIIGPVKNIERMINRLGEGRSLGNSVSFSGPSELRSVGQRILWLS ERLSWLESQRHQFLRHLSHELKTPLASMREGTELLADQVVGPLTPEQKEVVSILDSSSRN LQKLIEQLLDYNRKQADSAVELENVELAPLVETVVSAHSLPARAKMMHTDVDLKATACLA EPMLLMSVLDNLYSNAVHYGAESGNICLRSSLHGARVYIDVINTGTPIPQEERAMIFEPF FQGSHQRKGAVKGSGLGLSIARDCIRRMQGELYLVDESGQDVCFRIELPSSKNTK >gi|296494414|gb|ADTN01000324.1| GENE 24 27395 - 31279 4707 1294 aa, chain - ## HITS:1 COG:ECs3423_1 KEGG:ns NR:ns ## COG: ECs3423_1 COG0046 # Protein_GI_number: 15832677 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain # Organism: Escherichia coli O157:H7 # 1 984 2 985 985 1997 99.0 0 MEILRGSPALSAFRINKLLARFQAARLPVHNIYAEYVHFADLNAPLNDDEHAQLERLLKY GPALASHAPQGKLLLVTPRPGTISPWSSKATDIAHNCGLQQVNRLERGVAYYIEAGTLTN EQWQQVTAELHDRMMETVFFALDDAEQLFAHHQPTPVTSVDLLGQGRQALIDANLRLGLA LAEDEIDYLQDAFTKLGRNPNDIELYMFAQANSEHCRHKIFNADWVIDGEQQPKSLFKMI KNTFETTPDHVLSAYKDNAAVMEGSEVGRYFADHETGRYDFHQEPAHILMKVETHNHPTA ISPWPGAATGSGGEIRDEGATGRGAKPKAGLVGFSVSNLRIPGFEQPWEEDFGKPERIVT ALDIMTEGPLGGAAFNNEFGRPALNGYFRTYEEKVNSHNGEELRGYHKPIMLAGGIGNIR ADHVQKGEINVGAKLVVLGGPAMNIGLGGGAASSMASGQSDADLDFASVQRDNPEMERRC QEVIDRCWQLGDANPILFIHDVGAGGLSNAMPELVSDGGRGGKFELREILSDEPGMSPLE IWCNESQERYVLAVAADQLPLFDELCKRERAPYAVIGEATEELHLSLHDRHFDNQPIDLP LDVLLGKTPKMTRDVQTLKAKGDALAREGITIADAVKRVLHLPTVAEKTFLVTIGDRSVT GMVARDQMVGPWQVPVANCAVTTASLDSYYGEAMAIGERAPVALLDFAASARLAVGEALT NIAATQIGDIKRIKLSANWMAAAGHPGEDAGLYEAVKAVGEELCPALGLTIPVGKDSMSM KTRWQEGNEEREMTSPLSLVISAFARVEDVRHTITPQLSTEDNALLLIDLGKGNNALGAT ALAQVYRQLGDKPADVRDVAQLKGFYDAIQALVAQRKLLAYHDRSDGGLLVTLAEMAFAG HCGIDADIATLGDDRLAALFNEELGAVIQVRAADREAVESVLAQHGLADCVHYVGQAVSG DRFVITANGQTVFSESRTTLRVWWAETTWQMQRLRDNPECADQEHQAKSNDADPGLNVKL SFDINEDVAAPYIATGARPKVAVLREQGVNSHVEMAAAFHRAGFDAIDVHMSDLLTGRTG 
LEDFHALVACGGFSYGDVLGAGEGWAKSILFNDRVRDEFATFFHRPQTLALGVCNGCQMM SNLRELIPGSELWPRFVRNTSDRFEARFSLVEVTQSPSLLLQGMVGSQMPIAVSHGEGRV EVRDAAHLAALESKGLVALRYVDNFGKVTETYPANPNGSPNGITAVTTESGRVTIMMPHP ERVFRTVSNSWHPENWGEDGPWMRIFRNARKQLG >gi|296494414|gb|ADTN01000324.1| GENE 25 31540 - 33096 1625 518 aa, chain + ## HITS:1 COG:yfhD KEGG:ns NR:ns ## COG: yfhD COG4623 # Protein_GI_number: 16130483 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted soluble lytic transglycosylase fused to an ABC-type amino acid-binding protein # Organism: Escherichia coli K12 # 47 518 1 472 472 913 99.0 0 MKKLKINYLFIGILALLLAVALWPSIPWFGKADNRIAAIQARGELRVSTIHTPLTYNEIN GKPFGLDYELAKQFADYLGVKLKVTVRQNISQLFDDLDNGNADLLAAGLVYNSERVKNYQ PGPTYYSVSQQLVYKVGQYRPRTLGNLTAEQLTVAPGHVVVNDLQTLKETKFPELSWKVD DKKGSAELMEDVIEGKLDYTIADSVAISLFQRVHPELAVALDITDEQPVTWFSPLDGDNT LSAALLDFFNEMNEDGTLARIEEKYLGHGDDFDYVDTRTFLRAVDAVLPQLKPLFEKYAE EIDWRLLAAIAYQESHWDAQATSPTGVRGMMMLTKNTAQSLGITDRTDAEQSISGGVRYL QDMMSKVPESVPENERIWFALAAYNMGYAHMLDARALTAKTKGNPDSWADVKQRLPLLSQ KPYYSKLTYGYARGHEAYAYVENIRKYQISLVGYLQEKEKQATEAAMQLAQDYPAVSPTE LGKEKFPFLSFLSQSSSNYLTHSPSLLFSRKGSEEKQN >gi|296494414|gb|ADTN01000324.1| GENE 26 33093 - 33596 431 167 aa, chain - ## HITS:1 COG:yfhC KEGG:ns NR:ns ## COG: yfhC COG0590 # Protein_GI_number: 16130484 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Escherichia coli K12 # 1 167 12 178 178 337 99.0 5e-93 MSEVEFSHEYWMRHALTLAKRAWDEREVPVGAVLVHNNRVIGEGWNRPIGRHDPTAHAEI MALRQGGLVMQNYRLIDATLYVTLEPCVMCAGAMIHSRIGRVVFGARDAKTGAAGSLMDV LHHPGMNHRVEITEGILADECAALLSDFFRMRRQEIKAQKKAQSSTD >gi|296494414|gb|ADTN01000324.1| GENE 27 33654 - 34289 458 211 aa, chain - ## HITS:1 COG:STM2569 KEGG:ns NR:ns ## COG: STM2569 COG0560 # Protein_GI_number: 16765889 # Func_class: E Amino acid transport and metabolism # Function: Phosphoserine phosphatase # Organism: Salmonella typhimurium LT2 # 1 211 1 211 211 365 91.0 1e-101 
MATHERRVVFFDLDGTLHQQDMFGSFLRYLLRRQPLNALLVLPLLPIIAIALLIKGRAAR WPMSLLLWGCTFGHSEARLQTLQADFVRWFRDNVTAFPLVQERLTTYLLSSDADIWLITG SPQPLVEAVYFDTPWLPRVNLIASQIQRGYGGWVLTMRCLGHEKVAQLERKIGTPLRLYS GYSDSNQDNPLLYFCQHRWRVTPRGELQQLE >gi|296494414|gb|ADTN01000324.1| GENE 28 34498 - 35346 778 282 aa, chain + ## HITS:1 COG:ECs3427 KEGG:ns NR:ns ## COG: ECs3427 COG1737 # Protein_GI_number: 15832681 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 282 25 306 306 525 100.0 1e-149 MNGLLRIRQRYQGLAQSDKKLADYLLLQPDTARHLSSQQLANEAGVSQSSVVKFAQKLGY KGFPALKLALSEALASQPESPSVPIHNQIRGDDPLRLVGEKLIKENTAAMYATLNVNSEE KLHECVTMLRSARRIILTGIGASGLVAQNFAWKLMKIGFNAAAVRDMHALLATVQASSPD DLLLAISYTGVRRELNLAADEMLRVGGKVLAITGFTPNALQQRASHCLYTIAEEQATNSA SISACHAQGMLTDLLFIALIQQDLELAPERIRHSEALVKKLV >gi|296494414|gb|ADTN01000324.1| GENE 29 35402 - 35662 281 86 aa, chain + ## HITS:1 COG:yfhL KEGG:ns NR:ns ## COG: yfhL COG1145 # Protein_GI_number: 16130487 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Escherichia coli K12 # 1 86 1 86 86 153 100.0 6e-38 MALLITKKCINCDMCEPECPNEAISMGDHIYEINSDKCTECVGHYETPTCQKVCPIPNTI VKDPAHVETEEQLWDKFVLMHHADKI >gi|296494414|gb|ADTN01000324.1| GENE 30 36357 - 36737 321 126 aa, chain - ## HITS:1 COG:acpS KEGG:ns NR:ns ## COG: acpS COG0736 # Protein_GI_number: 16130488 # Func_class: I Lipid transport and metabolism # Function: Phosphopantetheinyl transferase (holo-ACP synthase) # Organism: Escherichia coli K12 # 1 126 1 126 126 240 100.0 5e-64 MAILGLGTDIVEIARIEAVIARSGDRLARRVLSDNEWAIWKTHHQPVRFLAKRFAVKEAA AKAFGTGIRNGLAFNQFEVFNDELGKPRLRLWGEALKLAEKLGVANMHVTLADERHYACA TVIIES >gi|296494414|gb|ADTN01000324.1| GENE 31 36737 - 37468 1018 243 aa, chain - ## HITS:1 COG:ECs3430 KEGG:ns NR:ns ## COG: ECs3430 COG0854 # Protein_GI_number: 15832684 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal phosphate biosynthesis protein # Organism: Escherichia coli O157:H7 # 1 243 1 243 243 438 100.0 1e-123 
MAELLLGVNIDHIATLRNARGTAYPDPVQAAFIAEQAGADGITVHLREDRRHITDRDVRI LRQTLDTRMNLEMAVTEEMLAIAVETKPHFCCLVPEKRQEVTTEGGLDVAGQRDKMRDAC KRLADAGIQVSLFIDADEEQIKAAAEVGAPFIEIHTGCYADAKTDAEQAQELARIAKAAT FAASLGLKVNAGHGLTYHNVKAIAAIPEMHELNIGHAIIGRAVMTGLKDAVAEMKRLMLE ARG >gi|296494414|gb|ADTN01000324.1| GENE 32 37480 - 38208 682 242 aa, chain - ## HITS:1 COG:ECs3431 KEGG:ns NR:ns ## COG: ECs3431 COG1381 # Protein_GI_number: 15832685 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Escherichia coli O157:H7 # 1 242 1 242 242 470 100.0 1e-132 MEGWQRAFVLHSRPWSETSLMLDVFTEESGRVRLVAKGARSKRSTLKGALQPFTPLLLRF GGRGEVKTLRSAEAVSLALPLSGITLYSGLYINELLSRVLEYETRFSELFFDYLHCIQSL AGVTGTPEPALRRFELALLGHLGYGVNFTHCAGSGEPVDDTMTYRYREEKGFIASVVIDN KTFTGRQLKALNAREFPDADTLRAAKRFTRMALKPYLGGKPLKSRELFRQFMPKRTVKTH YE >gi|296494414|gb|ADTN01000324.1| GENE 33 38220 - 39125 1022 301 aa, chain - ## HITS:1 COG:era KEGG:ns NR:ns ## COG: era COG1159 # Protein_GI_number: 16130491 # Func_class: R General function prediction only # Function: GTPase # Organism: Escherichia coli K12 # 1 301 1 301 301 600 100.0 1e-172 MSIDKSYCGFIAIVGRPNVGKSTLLNKLLGQKISITSRKAQTTRHRIVGIHTEGAYQAIY VDTPGLHMEEKRAINRLMNKAASSSIGDVELVIFVVEGTRWTPDDEMVLNKLREGKAPVI LAVNKVDNVQEKADLLPHLQFLASQMNFLDIVPISAETGLNVDTIAAIVRKHLPEATHHF PEDYITDRSQRFMASEIIREKLMRFLGAELPYSVTVEIERFVSNERGGYDINGLILVERE GQKKMVIGNKGAKIKTIGIEARKDMQEMFEAPVHLELWVKVKSGWADDERALRSLGYVDD L >gi|296494414|gb|ADTN01000324.1| GENE 34 39122 - 39802 629 226 aa, chain - ## HITS:1 COG:ECs3433 KEGG:ns NR:ns ## COG: ECs3433 COG0571 # Protein_GI_number: 15832687 # Func_class: K Transcription # Function: dsRNA-specific ribonuclease # Organism: Escherichia coli O157:H7 # 1 226 1 226 226 422 100.0 1e-118 MNPIVINRLQRKLGYTFNHQELLQQALTHRSASSKHNERLEFLGDSILSYVIANALYHRF PRVDEGDMSRMRATLVRGNTLAELAREFELGECLRLGPGELKSGGFRRESILADTVEALI GGVFLDSDIQTVEKLILNWYQTRLDEISPGDKQKDPKTRLQEYLQGRHLPLPTYLVVQVR GEAHDQEFTIHCQVSGLSEPVVGTGSSRRKAEQAAAEQALKKLELE 
>gi|296494414|gb|ADTN01000324.1| GENE 35 40074 - 41048 1056 324 aa, chain - ## HITS:1 COG:lepB KEGG:ns NR:ns ## COG: lepB COG0681 # Protein_GI_number: 16130493 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Escherichia coli K12 # 1 324 1 324 324 652 100.0 0 MANMFALILVIATLVTGILWCVDKFFFAPKRRERQAAAQAAAGDSLDKATLKKVAPKPGW LETGASVFPVLAIVLIVRSFIYEPFQIPSGSMMPTLLIGDFILVEKFAYGIKDPIYQKTL IETGHPKRGDIVVFKYPEDPKLDYIKRAVGLPGDKVTYDPVSKELTIQPGCSSGQACENA LPVTYSNVEPSDFVQTFSRRNGGEATSGFFEVPKNETKENGIRLSERKETLGDVTHRILT VPIAQDQVGMYYQQPGQQLATWIVPPGQYFMMGDNRDNSADSRYWGFVPEANLVGRATAI WMSFDKQEGEWPTGLRLSRIGGIH >gi|296494414|gb|ADTN01000324.1| GENE 36 41064 - 42863 2082 599 aa, chain - ## HITS:1 COG:ECs3435 KEGG:ns NR:ns ## COG: ECs3435 COG0481 # Protein_GI_number: 15832689 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane GTPase LepA # Organism: Escherichia coli O157:H7 # 1 599 1 599 599 1167 100.0 0 MKNIRNFSIIAHIDHGKSTLSDRIIQICGGLSDREMEAQVLDSMDLERERGITIKAQSVT LDYKASDGETYQLNFIDTPGHVDFSYEVSRSLAACEGALLVVDAGQGVEAQTLANCYTAM EMDLEVVPVLNKIDLPAADPERVAEEIEDIVGIDATDAVRCSAKTGVGVQDVLERLVRDI PPPEGDPEGPLQALIIDSWFDNYLGVVSLIRIKNGTLRKGDKVKVMSTGQTYNADRLGIF TPKQVDRTELKCGEVGWLVCAIKDIHGAPVGDTLTLARNPAEKALPGFKKVKPQVYAGLF PVSSDDYEAFRDALGKLSLNDASLFYEPESSSALGFGFRCGFLGLLHMEIIQERLEREYD LDLITTAPTVVYEVETTSREVIYVDSPSKLPAVNNIYELREPIAECHMLLPQAYLGNVIT LCVEKRGVQTNMVYHGNQVALTYEIPMAEVVLDFFDRLKSTSRGYASLDYNFKRFQASDM VRVDVLINGERVDALALITHRDNSQNRGRELVEKMKDLIPRQQFDIAIQAAIGTHIIARS TVKQLRKNVLAKCYGGDISRKKKLLQKQKEGKKRMKQIGNVELPQEAFLAILHVGKDNK >gi|296494414|gb|ADTN01000324.1| GENE 37 43061 - 43540 371 159 aa, chain - ## HITS:1 COG:rseC KEGG:ns NR:ns ## COG: rseC COG3086 # Protein_GI_number: 16130495 # Func_class: T Signal transduction mechanisms # Function: Positive regulator of sigma E activity # Organism: Escherichia coli K12 # 1 159 1 159 159 294 100.0 5e-80 MIKEWATVVSWQNGQALVSCDVKASCSSCASRAGCGSRVLNKLGPQTTHTIVVPCDEPLV 
PGQKVELGIAEGSLLSSALLVYMSPLVGLFLIASLFQLLFASDVAALCGAILGGIGGFLI ARGYSRKFAARAEWQPIILSVALPPGLVRFETSSEDASQ >gi|296494414|gb|ADTN01000324.1| GENE 38 43537 - 44493 747 318 aa, chain - ## HITS:1 COG:ECs3437 KEGG:ns NR:ns ## COG: ECs3437 COG3026 # Protein_GI_number: 15832691 # Func_class: T Signal transduction mechanisms # Function: Negative regulator of sigma E activity # Organism: Escherichia coli O157:H7 # 1 318 1 318 318 614 100.0 1e-176 MKQLWFAMSLVTGSLLFSANASATPASGALLQQMNLASQSLNYELSFISINKQGVESLRY RHARLDNRPLAQLLQMDGPRREVVQRGNEISYFEPGLEPFTLNGDYIVDSLPSLIYTDFK RLSPYYDFISVGRTRIADRLCEVIRVVARDGTRYSYIVWMDTESKLPMRVDLLDRDGETL EQFRVIAFNVNQDISSSMQTLAKANLPPLLSVPVGEKAKFSWTPTWLPQGFSEVSSSRRP LPTMDNMPIESRLYSDGLFSFSVNVNRATPSSTDQMLRTGRRTVSTSVRDNAEITIVGEL PPQTAKRIAENIKFGAAQ >gi|296494414|gb|ADTN01000324.1| GENE 39 44493 - 45143 571 216 aa, chain - ## HITS:1 COG:rseA KEGG:ns NR:ns ## COG: rseA COG3073 # Protein_GI_number: 16130497 # Func_class: T Signal transduction mechanisms # Function: Negative regulator of sigma E activity # Organism: Escherichia coli K12 # 1 216 1 216 216 329 100.0 2e-90 MQKEQLSALMDGETLDSELLNELAHNPEMQKTWESYHLIRDSMRGDTPEVLHFDISSRVM AAIEEEPVRQPATLIPEAQPAPHQWQKMPFWQKVRPWAAQLTQMGVAACVSLAVIVGVQH YNGQSETSQQPETPVFNTLPMMGKASPVSLGVPSEATANNGQQQQVQEQRRRINAMLQDY ELQRRLHSEQLQFEQAQTQQAAVQVPGIQTLGTQSQ >gi|296494414|gb|ADTN01000324.1| GENE 40 45176 - 45751 378 191 aa, chain - ## HITS:1 COG:ECs3439 KEGG:ns NR:ns ## COG: ECs3439 COG1595 # Protein_GI_number: 15832693 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Escherichia coli O157:H7 # 1 191 1 191 191 360 100.0 1e-100 MSEQLTDQVLVERVQKGDQKAFNLLVVRYQHKVASLVSRYVPSGDVPDVVQEAFIKAYRA LDSFRGDSAFYTWLYRIAVNTAKNYLVAQGRRPPSSDVDAIEAENFESGGALKEISNPEN LMLSEELRQIVFRTIESLPEDLRMAITLRELDGLSYEEIAAIMDCPVGTVRSRIFRAREA IDNKVQPLIRR >gi|296494414|gb|ADTN01000324.1| GENE 41 45748 - 45903 76 51 aa, chain - ## HITS:1 COG:no KEGG:ECBD_1107 NR:ns ## KEGG: ECBD_1107 # Name: not_defined 
# Def: hypothetical protein # Organism: E.coli_BL21_DE3 # Pathway: not_defined # 1 51 1 51 51 66 100.0 3e-10 MIRLQHDKQKQMRYGTLQKRDTLTLCLLKLQLMEWRFDSAWKFGLGRLYLG >gi|296494414|gb|ADTN01000324.1| GENE 42 46159 - 47781 1325 540 aa, chain + ## HITS:1 COG:nadB KEGG:ns NR:ns ## COG: nadB COG0029 # Protein_GI_number: 16130499 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate oxidase # Organism: Escherichia coli K12 # 1 540 1 540 540 1130 100.0 0 MNTLPEHSCDVLIIGSGAAGLSLALRLADQHQVIVLSKGPVTEGSTFYAQGGIAAVFDET DSIDSHVEDTLIAGAGICDRHAVEFVASNARSCVQWLIDQGVLFDTHIQPNGEESYHLTR EGGHSHRRILHAADATGREVETTLVSKALNHPNIRVLERSNAVDLIVSDKIGLPGTRRVV GAWVWNRNKETVETCHAKAVVLATGGASKVYQYTTNPDISSGDGIAMAWRAGCRVANLEF NQFHPTALYHPQARNFLLTEALRGEGAYLKRPDGTRFMPDFDERGELAPRDIVARAIDHE MKRLGADCMFLDISHKPADFIRQHFPMIYEKLLGLGIDLTQEPVPIVPAAHYTCGGVMVD DHGRTDVEGLYAIGEVSYTGLHGANRMASNSLLECLVYGWSAAEDITRRMPYAHDISTLP PWDESRVENPDERVVIQHNWHELRLFMWDYVGIVRTTKRLERALRRITMLQQEIDEYYAH FRVSNNLLELRNLVQVAELIVRCAMMRKESRGLHFTLDYPELLTHSGPSILSPGNHYINR >gi|296494414|gb|ADTN01000324.1| GENE 43 47766 - 48503 690 245 aa, chain - ## HITS:1 COG:yfiC KEGG:ns NR:ns ## COG: yfiC COG4123 # Protein_GI_number: 16130500 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Escherichia coli K12 # 1 245 41 285 285 509 100.0 1e-144 MSQSTSVLRRNGFTFKQFFVAHDRCAMKVGTDGILLGAWAPVAGVKRCLDIGAGSGLLAL MLAQRTDDSVMIDAVELESEAAAQAQENINQSPWAERINVHTADIQQWITQQTVRFDLII SNPPYYQQGVECSTPQREQARYTTTLDHPSLLTCAAECITEEGFFCVVLPEQIGNGFTEL ALSMGWHLRLRTDVAENEARLPHRVLLAFSPQAGECFSDRLVIRGPDQNYSEAYTALTQA FYLFM >gi|296494414|gb|ADTN01000324.1| GENE 44 48635 - 49969 1622 444 aa, chain + ## HITS:1 COG:ECs3442 KEGG:ns NR:ns ## COG: ECs3442 COG0513 # Protein_GI_number: 15832696 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Escherichia coli O157:H7 # 1 444 1 444 444 778 100.0 0 
MTVTTFSELELDESLLEALQDKGFTRPTAIQAAAIPPALDGRDVLGSAPTGTGKTAAYLL PALQHLLDFPRKKSGPPRILILTPTRELAMQVADHARELAKHTHLDIATITGGVAYMNHA EVFSENQDIVVATTGRLLQYIKEENFDCRAVETLILDEADRMLDMGFAQDIEHIAGETRW RKQTLLFSATLEGDAIQDFAERLLEDPVEVSANPSTRERKKIHQWYYRADDLEHKTALLV HLLKQPEATRSIVFVRKRERVHELANWLREAGINNCYLEGEMVQGKRNEAIKRLTEGRVN VLVATDVAARGIDIPDVSHVFNFDMPRSGDTYLHRIGRTARAGRKGTAISLVEAHDHLLL GKVGRYIEEPIKARVIDELRPKTRAPSEKQTGKPSKKVLAKRAEKKKAKEKEKPRVKKRH RDTKNIGKRRKPSGTGVPPQTTEE >gi|296494414|gb|ADTN01000324.1| GENE 45 50178 - 51059 324 293 aa, chain - ## HITS:1 COG:yfiE KEGG:ns NR:ns ## COG: yfiE COG0583 # Protein_GI_number: 16130502 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 293 16 308 308 571 100.0 1e-163 MDLRRFITLKTVVEEGSFLRASQKLCCTQSTVTFHIQQLEQEFSVQLFEKIGRRMCLTRE GKKLLPHIYELTRVMDTLREAAKKESDPDGELRVVSGETLLSYRMPQVLQRFRQRAPKVR LSLQSLNCYVIRDALLNDEADVGVFYRVGNDDALNRRELGEQSLVLVASPQIADVDFTEP GRHNACSFIINEPQCVFRQIFESTLRQRRITVENTIELISIESIKRCVAANIGVSYLPRF AVAKELECGELIELPFGEQSQTITAMCAHHAGKAVSPAMHTFIQCVEESFVAG >gi|296494414|gb|ADTN01000324.1| GENE 46 51162 - 51749 514 195 aa, chain + ## HITS:1 COG:yfiK KEGG:ns NR:ns ## COG: yfiK COG1280 # Protein_GI_number: 16130503 # Func_class: E Amino acid transport and metabolism # Function: Putative threonine efflux protein # Organism: Escherichia coli K12 # 1 195 1 195 195 329 100.0 2e-90 MTPTLLSAFWTYTLITAMTPGPNNILALSSATSHGFRQSTRVLAGMSLGFLIVMLLCAGI SFSLAVIDPAAVHLLSWAGAAYIVWLAWKIATSPTKEDGLQAKPISFWASFALQFVNVKI ILYGVTALSTFVLPQTQALSWVVGVSVLLAMIGTFGNVCWALAGHLFQRLFRQYGRQLNI VLALLLVYCAVRIFY >gi|296494414|gb|ADTN01000324.1| GENE 47 51805 - 52188 517 127 aa, chain - ## HITS:1 COG:ECs3445 KEGG:ns NR:ns ## COG: ECs3445 COG3445 # Protein_GI_number: 15832699 # Func_class: R General function prediction only # Function: Acid-induced glycyl radical enzyme # Organism: Escherichia coli O157:H7 # 1 127 1 127 127 243 100.0 9e-65 MITGIQITKAANDDLLNSFWLLDSEKGEARCIVAKAGYAEDEVVAVSKLGDIEYREVPVE 
VKPEVRVEGGQHLNVNVLRRETLEDAVKHPEKYPQLTIRVSGYAVRFNSLTPEQQRDVIA RTFTESL >gi|296494414|gb|ADTN01000324.1| GENE 48 52493 - 53182 650 229 aa, chain + ## HITS:1 COG:ung KEGG:ns NR:ns ## COG: ung COG0692 # Protein_GI_number: 16130505 # Func_class: L Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Escherichia coli K12 # 1 229 1 229 229 466 100.0 1e-131 MANELTWHDVLAEEKQQPYFLNTLQTVASERQSGVTIYPPQKDVFNAFRFTELGDVKVVI LGQDPYHGPGQAHGLAFSVRPGIAIPPSLLNMYKELENTIPGFTRPNHGYLESWARQGVL LLNTVLTVRAGQAHSHASLGWETFTDKVISLINQHREGVVFLLWGSHAQKKGAIIDKQRH HVLKAPHPSPLSAHRGFFGCNHFVLANQWLEQRGETPIDWMPVLPAESE >gi|296494414|gb|ADTN01000324.1| GENE 49 53230 - 54267 1036 345 aa, chain - ## HITS:1 COG:ECs3447 KEGG:ns NR:ns ## COG: ECs3447 COG0566 # Protein_GI_number: 15832701 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Escherichia coli O157:H7 # 1 345 1 345 345 672 100.0 0 MNDEMKGKSGKVKVMYVRSDDDSDKRTHNPRTGKGGGRPGKSRADGGRRPARDDKQSQPR DRKWEDSPWRTVSRAPGDETPEKADHGGISGKSFIDPEVLRRQRAEETRVYGENACQALF QSRPEAIVRAWFIQSVTPRFKEALRWMAANRKAYHVVDEAELTKASGTEHHGGVCFLIKK RNGTTVQQWVSQAGAQDCVLALENESNPHNLGGMMRSCAHFGVKGVVVQDAALLESGAAI RTAEGGAEHVQPITGDNIVNVLDDFRQAGYTVVTTSSEQGKPLFKTSLPAKMVLVLGQEY EGLPDAARDPNDLRVKIDGTGNVAGLNISVATGVLLGEWWRQNKA >gi|296494414|gb|ADTN01000324.1| GENE 50 54474 - 54893 178 139 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|124485582|ref|YP_001030198.1| ribosomal protein L12E/L44/L45/RPP1/RPP2-like protein [Methanocorpusculum labreanum Z] # 55 136 35 115 120 73 46 3e-12 MNTVCTHCQAINRIPDDRIEDAAKCGRCGHDLFDGEVINATGETLDKLLKDDLPVVIDFW APWCGPCRNFAPIFEDVAQERSGKVRFVKVNTEAERELSSRFGIRSIPTIMIFKNGQVVD MLNGAVPKAPFDSWLNESL >gi|296494414|gb|ADTN01000324.1| GENE 51 55136 - 55660 448 174 aa, chain + ## HITS:1 COG:yfiP KEGG:ns NR:ns ## COG: yfiP COG3148 # Protein_GI_number: 16130508 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 174 67 240 240 345 99.0 2e-95 
MFDTEPMKPSNTGRLIADILPDTVAFQWSRTEPSQDLLELVQNPDYQPMVVFPASYADEQ REVIFTPPAGKPPLFIMLDGTWPEARKMFRKSPYLDNLPVICVDLSRLSAYRLREAQAEG QYCTAEVAIALLDMAGDTGAAAGLGEHFTRFKTRYLAGKTQHLGSITAEQLESV >gi|296494414|gb|ADTN01000324.1| GENE 52 55692 - 58352 2402 886 aa, chain + ## HITS:1 COG:yfiQ_1 KEGG:ns NR:ns ## COG: yfiQ_1 COG1042 # Protein_GI_number: 16130509 # Func_class: C Energy production and conversion # Function: Acyl-CoA synthetase (NDP forming) # Organism: Escherichia coli K12 # 1 709 1 709 709 1366 100.0 0 MSQRGLEALLRPKSIAVIGASMKPNRAGYLMMRNLLAGGFNGPVLPVTPAWKAVLGVLAW PDIASLPFTPDLAVLCTNASRNLALLEELGEKGCKTCIILSAPASQHEDLRACALRHNMR LLGPNSLGLLAPWQGLNASFSPVPIKRGKLAFISQSAAVSNTILDWAQQRKMGFSYFIAL GDSLDIDVDELLDYLARDSKTSAILLYLEQLSDARRFVSAARSASRNKPILVIKSGRSPA AQRLLNTTAGMDPAWDAAIQRAGLLRVQDTHELFSAVETLSHMRPLRGDRLMIISNGAAP AALALDALWSRNGKLATLSEETCQKLRDALPEHVAISNPLDLRDDASSEHYIKTLDILLH SQDFDALMVIHSPSAAAPATESAQVLIEAVKHHPRSKYVSLLTNWCGEHSSQEARRLFSE AGLPTYRTPEGTITAFMHMVEYRRNQKQLRETPALPSNLTSNTAEAHLLLQQAIAEGATS LDTHEVQPILQAYGMNTLPTWIASDSTEAVHIAEQIGYPVALKLRSPDIPHKSEVQGVML YLRTANEVQQAANAIFDRVKMAWPQARVHGLLVQSMANRAGAQELRVVVEHDPVFGPLIM LGEGGVEWRPEDQAVVALPPLNMNLARYLVIQGIKSKKIRARSALRPLDVAGLSQLLVQV SNLIVDCPEIQRLDIHPLLASGSEFTALDVTLDISPFEGDNESRLAVRPYPHQLEEWVEL KNGERCLFRPILPEDEPQLQQFISRVTKEDLYYRYFSEINEFTHEDLANMTQIDYDREMA FVAVRRIDQTEEILGVTRAISDPDNIDAEFAVLVRSDLKGLGLGRRLMEKLITYTRDHGL QRLNGITMPNNRGMVALARKLGFNVDIQLEEGIVGLTLNLAQREES >gi|296494414|gb|ADTN01000324.1| GENE 53 58466 - 59821 1296 451 aa, chain + ## HITS:1 COG:ECs3452 KEGG:ns NR:ns ## COG: ECs3452 COG1502 # Protein_GI_number: 15832706 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Escherichia coli O157:H7 # 1 431 2 432 452 877 100.0 0 MLSKFKRNKHQQHLAQLPKISQSVDDVDFFYAPADFRETLLEKIASAKQRICIVALYLEQ DDGGKGILNALYEAKRQRPELDVRVLVDWHRAQRGRIGAAASNTNADWYCRMAQENPGVD VPVYGVPINTREALGVLHFKGFIIDDSVLYSGASLNDVYLHQHDKYRYDRYHLIRNRKMS 
DIMFEWVTQNIMNGRGVNRLDDVNRPKSPEIKNDIRLFRQELRDAAYHFQGDADNDQLSV TPLVGLGKSSLLNKTIFHLMPCAEQKLTICTPYFNLPAILVRNIIQLLREGKKVEIIVGD KTANDFYIPEDEPFKIIGALPYLYEINLRRFLSRLQYYVNTDQLVVRLWKDDDNTYHLKG MWVDDKWMLITGNNLNPRAWRLDLENAILIHDPQLELAPQREKELELIREHTTIVKHYRD LQSIADYPVKVRKLIRRLRRIRIDRLISRIL >gi|296494414|gb|ADTN01000324.1| GENE 54 59867 - 60190 305 107 aa, chain + ## HITS:1 COG:STM2653 KEGG:ns NR:ns ## COG: STM2653 COG5544 # Protein_GI_number: 16765973 # Func_class: R General function prediction only # Function: Predicted periplasmic lipoprotein # Organism: Salmonella typhimurium LT2 # 1 107 1 107 107 155 76.0 2e-38 MRILFVCSLLLLSGCSHMANDSWSGQDKAQHFIASAMLSAAGNEYSQHQGMSRDRSAMFG LMFSVSLGASKELWDSRPEGSGWSWKDLAWDVAGASTGYTVWQLTRH >gi|296494414|gb|ADTN01000324.1| GENE 55 60187 - 61485 1068 432 aa, chain - ## HITS:1 COG:kgtP KEGG:ns NR:ns ## COG: kgtP COG0477 # Protein_GI_number: 16130512 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 432 1 432 432 752 100.0 0 MAESTVTADSKLTSSDTRRRIWAIVGASSGNLVEWFDFYVYSFCSLYFAHIFFPSGNTTT QLLQTAGVFAAGFLMRPIGGWLFGRIADKHGRKKSMLLSVCMMCFGSLVIACLPGYETIG TWAPALLLLARLFQGLSVGGEYGTSATYMSEVAVEGRKGFYASFQYVTLIGGQLLALLVV VVLQHTMEDAALREWGWRIPFALGAVLAVVALWLRRQLDETSQQETRALKEAGSLKGLWR NRRAFIMVLGFTAAGSLCFYTFTTYMQKYLVNTAGMHANVASGIMTAALFVFMLIQPLIG ALSDKIGRRTSMLCFGSLAAIFTVPILSALQNVSSPYAAFGLVMCALLIVSFYTSISGIL KAEMFPAQVRALGVGLSYAVANAIFGGSAEYVALSLKSIGMETAFFWYVTLMAVVAFLVS LMLHRKGKGMRL >gi|296494414|gb|ADTN01000324.1| GENE 56 61665 - 61727 116 20 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEVGKLGKPYPLLNLAYVGV Prediction of potential genes in microbial genomes Time: Mon May 16 00:15:23 2011 Seq name: gi|296494413|gb|ADTN01000325.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont839.1, whole genome shotgun sequence Length of sequence - 717 bp Number of predicted genes - 
1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 7 - 129 84 ## - Prom 235 - 294 6.8 Predicted protein(s) >gi|296494413|gb|ADTN01000325.1| GENE 1 7 - 129 84 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDYYAVVNDSFHGNGFCFYQNPYLYVFRILLILLVMLPTY Prediction of potential genes in microbial genomes Time: Mon May 16 00:15:28 2011 Seq name: gi|296494412|gb|ADTN01000326.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont839.2, whole genome shotgun sequence Length of sequence - 4769 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 11/0.000 + CDS 72 - 1643 1785 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases 2 1 Op 2 10/0.000 + CDS 1643 - 2734 1260 ## COG0473 Isocitrate/isopropylmalate dehydrogenase 3 1 Op 3 30/0.000 + CDS 2737 - 4137 1777 ## COG0065 3-isopropylmalate dehydratase large subunit 4 1 Op 4 . 
+ CDS 4148 - 4753 847 ## COG0066 3-isopropylmalate dehydratase small subunit Predicted protein(s) >gi|296494412|gb|ADTN01000326.1| GENE 1 72 - 1643 1785 523 aa, chain + ## HITS:1 COG:leuA KEGG:ns NR:ns ## COG: leuA COG0119 # Protein_GI_number: 16128068 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Escherichia coli K12 # 1 523 1 523 523 1013 100.0 0 MSQQVIIFDTTLRDGEQALQASLSVKEKLQIALALERMGVDVMEVGFPVSSPGDFESVQT IARQVKNSRVCALARCVEKDIDVAAESLKVAEAFRIHTFIATSPMHIATKLRSTLDEVIE RAIYMVKRARNYTDDVEFSCEDAGRTPIADLARVVEAAINAGATTINIPDTVGYTMPFEF AGIISGLYERVPNIDKAIISVHTHDDLGLAVGNSLAAVHAGARQVEGAMNGIGERAGNCS LEEVIMAIKVRKDILNVHTAINHQEIWRTSQLVSQICNMPIPANKAIVGSGAFAHSSGIH QDGVLKNRENYEIMTPESIGLNQIQLNLTSRSGRAAVKHRMDEMGYKESEYNLDNLYDAF LKLADKKGQVFDYDLEALAFIGKQQEEPEHFRLDYFSVQSGSNDIATAAVKLACGEEVKA EAANGNGPVDAVYQAINRITEYNVELVKYSLTAKGHGKDALGQVDIVANYNGRRFHGVGL ATDIVESSAKAMVHVLNNIWRAAEVEKELQRKAQHNENNKETV >gi|296494412|gb|ADTN01000326.1| GENE 2 1643 - 2734 1260 363 aa, chain + ## HITS:1 COG:leuB KEGG:ns NR:ns ## COG: leuB COG0473 # Protein_GI_number: 16128067 # Func_class: C Energy production and conversion; E Amino acid transport and metabolism # Function: Isocitrate/isopropylmalate dehydrogenase # Organism: Escherichia coli K12 # 1 363 2 364 364 742 100.0 0 MSKNYHIAVLPGDGIGPEVMTQALKVLDAVRNRFAMRITTSHYDVGGAAIDNHGQPLPPA TVEGCEQADAVLFGSVGGPKWEHLPPDQQPERGALLPLRKHFKLFSNLRPAKLYQGLEAF CPLRADIAANGFDILCVRELTGGIYFGQPKGREGSGQYEKAFDTEVYHRFEIERIARIAF ESARKRRHKVTSIDKANVLQSSILWREIVNEIATEYPDVELAHMYIDNATMQLIKDPSQF DVLLCSNLFGDILSDECAMITGSMGMLPSASLNEQGFGLYEPAGGSAPDIAGKNIANPIA QILSLALLLRYSLDADDAACAIERAINRALEEGIRTGDLARGAAAVSTDEMGDIIARYVA EGV >gi|296494412|gb|ADTN01000326.1| GENE 3 2737 - 4137 1777 466 aa, chain + ## HITS:1 COG:leuC KEGG:ns NR:ns ## COG: leuC COG0065 # Protein_GI_number: 16128066 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase large subunit # Organism: Escherichia coli K12 # 1 466 
1 466 466 927 100.0 0 MAKTLYEKLFDAHVVYEAENETPLLYIDRHLVHEVTSPQAFDGLRAHGRPVRQPGKTFAT MDHNVSTQTKDINACGEMARIQMQELIKNCKEFGVELYDLNHPYQGIVHVMGPEQGVTLP GMTIVCGDSHTATHGAFGALAFGIGTSEVEHVLATQTLKQGRAKTMKIEVQGKAAPGITA KDIVLAIIGKTGSAGGTGHVVEFCGEAIRDLSMEGRMTLCNMAIEMGAKAGLVAPDETTF NYVKGRLHAPKGKDFDDAVAYWKTLQTDEGATFDTVVTLQAEEISPQVTWGTNPGQVISV NDNIPDPASFADPVERASAEKALAYMGLKPGIPLTEVAIDKVFIGSCTNSRIEDLRAAAE IAKGRKVAPGVQALVVPGSGPVKAQAEAEGLDKIFIEAGFEWRLPGCSMCLAMNNDRLNP GERCASTSNRNFEGRQGRGGRTHLVSPAMAAAAAVTGHFADIRNIK >gi|296494412|gb|ADTN01000326.1| GENE 4 4148 - 4753 847 201 aa, chain + ## HITS:1 COG:leuD KEGG:ns NR:ns ## COG: leuD COG0066 # Protein_GI_number: 16128065 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase small subunit # Organism: Escherichia coli K12 # 1 201 1 201 201 413 100.0 1e-115 MAEKFIKHTGLVVPLDAANVDTDAIIPKQFLQKVTRTGFGAHLFNDWRFLDEKGQQPNPD FVLNFPQYQGASILLARENFGCGSSREHAPWALTDYGFKVVIAPSFADIFYGNSFNNQLL PVKLSDAEVDELFALVKANPGIHFDVDLEAQEVKAGEKTYRFTIDAFRRHCMMNGLDSIG LTLQHDDAIAAYEAKQPAFMN Prediction of potential genes in microbial genomes Time: Mon May 16 00:15:35 2011 Seq name: gi|296494411|gb|ADTN01000327.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont839.3, whole genome shotgun sequence Length of sequence - 19452 bp Number of predicted genes - 15, with homology - 14 Number of transcription units - 7, operones - 6 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 25 - 1185 876 ## COG0477 Permeases of the major facilitator superfamily 2 1 Op 2 . - CDS 1076 - 1327 72 ## 3 1 Op 3 . - CDS 1305 - 1436 125 ## ECIAI1_0070 hypothetical protein - Prom 1469 - 1528 3.3 4 2 Tu 1 . 
+ CDS 1525 - 3180 1461 ## COG4533 ABC-type uncharacterized transport system, periplasmic component + Term 3286 - 3337 9.0 5 3 Op 1 14/0.000 + CDS 3344 - 4327 1117 ## COG4143 ABC-type thiamine transport system, periplasmic component 6 3 Op 2 11/0.000 + CDS 4303 - 5913 1881 ## COG1178 ABC-type Fe3+ transport system, permease component 7 3 Op 3 . + CDS 5897 - 6595 194 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P + Term 6621 - 6659 1.3 8 4 Op 1 4/1.000 - CDS 6709 - 7473 807 ## COG0586 Uncharacterized membrane-associated protein - Prom 7496 - 7555 3.3 9 4 Op 2 . - CDS 7559 - 8488 668 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 8610 - 8669 4.8 + Prom 8569 - 8628 4.2 10 5 Op 1 7/0.000 + CDS 8776 - 10476 1723 ## COG1069 Ribulose kinase 11 5 Op 2 5/0.500 + CDS 10487 - 11989 1536 ## COG2160 L-arabinose isomerase + Prom 11998 - 12057 2.6 12 6 Op 1 3/1.000 + CDS 12189 - 12884 786 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases + Term 12914 - 12954 5.8 13 6 Op 2 5/0.500 + CDS 12959 - 15310 2030 ## COG0417 DNA polymerase elongation subunit (family B) 14 7 Op 1 7/0.000 + CDS 15475 - 18381 3743 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family 15 7 Op 2 . 
+ CDS 18393 - 19052 218 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit + Term 19105 - 19133 1.0 Predicted protein(s) >gi|296494411|gb|ADTN01000327.1| GENE 1 25 - 1185 876 386 aa, chain - ## HITS:1 COG:setA KEGG:ns NR:ns ## COG: setA COG0477 # Protein_GI_number: 16128064 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 386 7 392 392 657 99.0 0 MARRMNGVYAAFMLVAFMMGVAGALQAPTLSLFLSREVGAQPFWIGLFYTVNAIAGIGVS LWLAKRSDSQGDRRKLIIFCCLMAIGNALLFAFNRHYLTLITCGVLLASLANTAMPQLFA LAREYADNSAREVVMFSSVMRAQLSLAWVIGPPLAFMLALNYGFTVMFSIAAGIFTLSLV LIAFMLPSVARVELPSENALSMQGGWQDSNVRMLFVASTLMWTCNTMYIIDMPLWISSEL GLPDKLAGFLMGTAAGLEIPAMILAGYYVKRYGKRRMMVIAVAAGVLFYTGLIFFNSRMA LMTLQLFNAVFIGIVAGIGMLWFQDLMPGRAGAATTLFTNSISTGVILAGVIQGAIAQSW GHFAVYWVIAVISVVALFLTAKVKDV >gi|296494411|gb|ADTN01000327.1| GENE 2 1076 - 1327 72 83 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTAILKVDLPASCVTEYWCKITRQQIIPAGFFYSRRAKKGTYDLDNDDGSPYERCLRGIY AGRFYDGGGRGATGSYIELISES >gi|296494411|gb|ADTN01000327.1| GENE 3 1305 - 1436 125 43 aa, chain - ## HITS:1 COG:no KEGG:ECIAI1_0070 NR:ns ## KEGG: ECIAI1_0070 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_IAI1 # Pathway: not_defined # 1 43 57 99 99 84 97.0 1e-15 MRQFYQHYFTATAKLCWLRWLSVPQRLTMLEGLMQWDDRNSES >gi|296494411|gb|ADTN01000327.1| GENE 4 1525 - 3180 1461 551 aa, chain + ## HITS:1 COG:yabN KEGG:ns NR:ns ## COG: yabN COG4533 # Protein_GI_number: 16128063 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Escherichia coli K12 # 1 551 1 551 551 1117 100.0 0 MPSARLQQQFIRLWQCCEGKSQDTTLNELAALLSCSRRHMRTLLNTMQDRGWLTWEAEVG RGKRSRLTFLYTGLALQQQRAEDLLEQDRIDQLVQLVGDKATVRQMLVSHLGRSFRQGRH 
ILRVLYYRPLRNLLPGSALRRSETHIARQIFSSLTRINEENGELEADIAHHWQQISPLHW RFFLRPGVHFHHGRELEMDDVIASLKRINTLPLYSHIADIVSPTPWTLDIHLTQPDRWLP LLLGQVPAMILPREWETLSNFASHPIGTGPYAVIRNSTNQLKIQAFDDFFGYRALIDEVN VWVLPEIADEPAGGLMLKGPQGEEKEIESRLEEGCYYLLFDSRTHRGANQQVRDWVSYVL SPTNLVYFAEEQYQQLWFPAYGLLPRWHHARTIKSEKPAGLESLTLTFYQDHSEHRVIAG IMQQILASHQVTLKIKEIDYDQWHTGEIESDIWLNSANFTLPLDFSVFAHLCEVPLLQHC IPIDWQADAARWRNGEMNLANWCQQLVASKAMVPLLHHWLIIQGQRSMRGLRMNTLGWFD FKSAWFAPPDP >gi|296494411|gb|ADTN01000327.1| GENE 5 3344 - 4327 1117 327 aa, chain + ## HITS:1 COG:tbpA KEGG:ns NR:ns ## COG: tbpA COG4143 # Protein_GI_number: 16128062 # Func_class: H Coenzyme transport and metabolism # Function: ABC-type thiamine transport system, periplasmic component # Organism: Escherichia coli K12 # 13 327 13 327 327 621 100.0 1e-178 MLKKCLPLLLLCTAPVFAKPVLTVYTYDSFAADWGPGPVVKKAFEADCNCELKLVALEDG VSLLNRLRMEGKNSKADVVLGLDNNLLDAASKTGLFAKSGVAADAVNVPGGWNNDTFVPF DYGYFAFVYDKNKLKNPPQSLKELVESDQNWRVIYQDPRTSTPGLGLLLWMQKVYGDDAP QAWQKLAKKTVTVTKGWSEAYGLFLKGESDLVLSYTTSPAYHILEEKKDNYAAANFSEGH YLQVEVAARTAASKQPELAQKFLQFMVSPAFQNAIPTGNWMYPVANVTLPAGFEKLTKPA TTLEFTPAEVAAQRQAWISEWQRAVSR >gi|296494411|gb|ADTN01000327.1| GENE 6 4303 - 5913 1881 536 aa, chain + ## HITS:1 COG:thiP KEGG:ns NR:ns ## COG: thiP COG1178 # Protein_GI_number: 16128061 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Escherichia coli K12 # 1 536 1 536 536 898 100.0 0 MATRRQPLIPGWLIPGVSATTLVVAVALAAFLALWWNAPQDDWVAVWQDSYLWHVVRFSF WQAFLSALLSVIPAIFLARALYRRRFPGRLALLRLCAMTLILPVLVAVFGILSVYGRQGW LATLCQSLGLEWTFSPYGLQGILLAHVFFNLPMASRLLLQALENIPGEQRQLAAQLGMRS WHFFRFVEWPWLRRQIPPVAALIFMLCFASFATVLSLGGGPQATTIELAIYQALSYDYDP ARAAMLALLQMVCCLGLVLLSQRLSKAIAPGTTLLQGWRDPDDRLHSRICDTVLIVLALL LLLPPLLAVIVDGVNRQLPEVLAQPVLWQALWTSLRIALAAGVLCVVLTMMLLWSSRELR ARQKMLAGQVLEMSGMLILAMPGIVLATGFFLLLNNTIGLPQSADGIVIFTNALMAIPYA LKVLENPMRDITARYSMLCQSLGIEGWSRLKVVELRALKRPLAQALAFACVLSIGDFGVV 
ALFGNDDFRTLPFYLYQQIGSYRSQDGAVTALILLLLCFLLFTVIEKLPGRNVKTD >gi|296494411|gb|ADTN01000327.1| GENE 7 5897 - 6595 194 232 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 22 227 154 361 398 79 30 2e-14 MLKLTDITWLYHHLPMRFSLTVERGEQVAILGPSGAGKSTLLNLIAGFLTPASGSLTIDG VDHTTMPPSRRPVSMLFQENNLFSHLTVAQNIGLGLNPGLKLNAVQQGKMHAIARQMGID NLMARLPGELSGGQRQRVALARCLVREQPILLLDEPFSALDPALRQEMLTLVSTSCQQQK MTLLMVSHSVEDAARIATRSVVVADGRIAWQGMTNELLSGKASASALLGITG >gi|296494411|gb|ADTN01000327.1| GENE 8 6709 - 7473 807 254 aa, chain - ## HITS:1 COG:yabI KEGG:ns NR:ns ## COG: yabI COG0586 # Protein_GI_number: 16128059 # Func_class: S Function unknown # Function: Uncharacterized membrane-associated protein # Organism: Escherichia coli K12 # 1 254 1 254 254 447 100.0 1e-126 MQALLEHFITQSTVYSLMAVVLVAFLESLALVGLILPGTVLMAGLGALIGSGELSFWHAW LAGIIGCLMGDWISFWLGWRFKKPLHRWSFLKKNKALLDKTEHALHQHSMFTILVGRFVG PTRPLVPMVAGMLDLPVAKFITPNIIGCLLWPPFYFLPGILAGAAIDIPAGMQSGEFKWL LLATAVFLWVGGWLCWRLWRSGKATDRLSHYLSRGRLLWLTPLISAIGVVALVVLIRHPL MPVYIDILRKVVGV >gi|296494411|gb|ADTN01000327.1| GENE 9 7559 - 8488 668 309 aa, chain - ## HITS:1 COG:ECs0068 KEGG:ns NR:ns ## COG: ECs0068 COG2207 # Protein_GI_number: 15829322 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 18 309 1 292 292 622 100.0 1e-178 MQYGQLVSSLNGGSMKSMAEAQNDPLLPGYSFNAHLVAGLTPIEANGYLDFFIDRPLGMK GYILNLTIRGQGVVKNQGREFVCRPGDILLFPPGEIHHYGRHPEAREWYHQWVYFRPRAY WHEWLNWPSIFANTGFFRPDEAHQPHFSDLFGQIINAGQGEGRYSELLAINLLEQLLLRR MEAINESLHPPMDNRVREACQYISDHLADSNFDIASVAQHVCLSPSRLSHLFRQQLGISV LSWREDQRISQAKLLLSTTRMPIATVGRNVGFDDQLYFSRVFKKCTGASPSEFRAGCEEK VNDVAVKLS >gi|296494411|gb|ADTN01000327.1| GENE 10 8776 - 10476 1723 566 aa, chain + ## HITS:1 COG:araB KEGG:ns NR:ns ## COG: araB COG1069 # Protein_GI_number: 16128057 # Func_class: C Energy production and conversion # Function: Ribulose kinase # Organism: 
Escherichia coli K12 # 1 566 1 566 566 1107 100.0 0 MAIAIGLDFGSDSVRALAVDCATGEEIATSVEWYPRWQKGQFCDAPNNQFRHHPRDYIES MEAALKTVLAELSVEQRAAVVGIGVDSTGSTPAPIDADGNVLALRPEFAENPNAMFVLWK DHTAVEEAEEITRLCHAPGNVDYSRYIGGIYSSEWFWAKILHVTRQDSAVAQSAASWIEL CDWVPALLSGTTRPQDIRRGRCSAGHKSLWHESWGGLPPASFFDELDPILNRHLPSPLFT DTWTADIPVGTLCPEWAQRLGLPESVVISGGAFDCHMGAVGAGAQPNALVKVIGTSTCDI LIADKQSVGERAVKGICGQVDGSVVPGFIGLEAGQSAFGDIYAWFGRVLGWPLEQLAAQH PELKTQINASQKQLLPALTEAWAKNPSLDHLPVVLDWFNGRRTPNANQRLKGVITDLNLA TDAPLLFGGLIAATAFGARAIMECFTDQGIAVNNVMALGGIARKNQVIMQACCDVLNRPL QIVASDQCCALGAAIFAAVAAKVHADIPSAQQKMASAVEKTLQPCSEQAQRFEQLYRRYQ QWAMSAEQHYLPTSAPAQAAQAVATL >gi|296494411|gb|ADTN01000327.1| GENE 11 10487 - 11989 1536 500 aa, chain + ## HITS:1 COG:araA KEGG:ns NR:ns ## COG: araA COG2160 # Protein_GI_number: 16128056 # Func_class: G Carbohydrate transport and metabolism # Function: L-arabinose isomerase # Organism: Escherichia coli K12 # 1 500 1 500 500 1061 100.0 0 MTIFDNYEVWFVIGSQHLYGPETLRQVTQHAEHVVNALNTEAKLPCKLVLKPLGTTPDEI TAICRDANYDDRCAGLVVWLHTFSPAKMWINGLTMLNKPLLQFHTQFNAALPWDSIDMDF MNLNQTAHGGREFGFIGARMRQQHAVVTGHWQDKQAHERIGSWMRQAVSKQDTRHLKVCR FGDNMREVAVTDGDKVAAQIKFGFSVNTWAVGDLVQVVNSISDGDVNALVDEYESCYTMT PATQIHGKKRQNVLEAARIELGMKRFLEQGGFHAFTTTFEDLHGLKQLPGLAVQRLMQQG YGFAGEGDWKTAALLRIMKVMSTGLQGGTSFMEDYTYHFEKGNDLVLGSHMLEVCPSIAA EEKPILDVQHLGIGGKDDPARLIFNTQTGPAIVASLIDLGDRYRLLVNCIDTVKTPHSLP KLPVANALWKAQPDLPTASEAWILAGGAHHTVFSHALNLNDMRQFAEMHDIEITVIDNDT RLPAFKDALRWNEVYYGFRR >gi|296494411|gb|ADTN01000327.1| GENE 12 12189 - 12884 786 231 aa, chain + ## HITS:1 COG:araD KEGG:ns NR:ns ## COG: araD COG0235 # Protein_GI_number: 16128055 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli K12 # 1 231 1 231 231 474 100.0 1e-134 MLEDLKRQVLEANLALPKHNLVTLTWGNVSAVDRERGVFVIKPSGVDYSVMTADDMVVVS IETGEVVEGTKKPSSDTPTHRLLYQAFPSIGGIVHTHSRHATIWAQAGQSIPATGTTHAD YFYGTIPCTRKMTDAEINGEYEWETGNVIVETFEKQGIDAAQMPGVLVHSHGPFAWGKNA 
EDAVHNAIVLEEVAYMGIFCRQLAPQLPDMQQTLLDKHYLRKHGAKAYYGQ >gi|296494411|gb|ADTN01000327.1| GENE 13 12959 - 15310 2030 783 aa, chain + ## HITS:1 COG:polB KEGG:ns NR:ns ## COG: polB COG0417 # Protein_GI_number: 16128054 # Func_class: L Replication, recombination and repair # Function: DNA polymerase elongation subunit (family B) # Organism: Escherichia coli K12 # 1 783 1 783 783 1632 99.0 0 MAQAGFILTRHWRDTPQGTEVSFWLATDNGPLQVTLAPQESVAFIPADQVPRAQHILQGE QGFRLTPLALKDFHRQPVYGLYCRAHRQLMNYEKRLREGGVTVYEADVRPPERYLMERFI TSPVWVEGDMHNGTIVNARLKPHPDYRPPLKWVSIDIETTRHGELYCIGLEGCGQRIVYM LGPENGDASSLDFELEYVASRPQLLEKLNAWFANYDPDVIIGWNVVQFDLRMLQKHAEHY RLPLRLGRDNSELEWREHGFKNGVFFAQAKGRLIIDGIEALKSAFWNFSSFSLETVAQEL LGEGKSIDNPWDRMDEIDRRFAEDKPALATYNLKDCELVTQIFHKTEIMPFLLERATVNG LPVDRHGGSVAAFGHLYFPRMHRAGYVAPNLGEVPPHASPGGYVMDSRPGLYDSVLVLDY KSLYPSIIRTFLIDPVGLVEGMAQPDPEHSTEGFLDAWFSREKHCLPEIVTNIWHGRDEA KRQGNKPLSQALKIIMNAFYGVLGTTACRFFDPRLASSITMRGHQIMRQTKALIEAQGYD VIYGDTDSTFVWLKGAHSEEEAAKIGRALVQHVNAWWAETLQKQRLTSALELEYETHFCR FLMPTIRGADTGSKKRYAGLIQEGDKQRMVFKGLETVRTDWTPLAQQFQQELYLRIFRNE PYQEYVRETIDKLMAGELDARLVYRKRLRRPLSEYQRNVPPHVRAARLADEENQKRGRPL QYQNRGTIKYVWTTNGPEPLDYQRSPLDYEHYLTRQLQPVAEGILPFIEDNFATLMTGQL GLF >gi|296494411|gb|ADTN01000327.1| GENE 14 15475 - 18381 3743 968 aa, chain + ## HITS:1 COG:ECs0063 KEGG:ns NR:ns ## COG: ECs0063 COG0553 # Protein_GI_number: 15829317 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Escherichia coli O157:H7 # 1 968 1 968 968 1915 100.0 0 MPFTLGQRWISDTESELGLGTVVAVDARTVTLLFPSTGENRLYARSDSPVTRVMFNPGDT ITSHDGWQMQVEEVKEENGLLTYIGTRLDTEESGVALREVFLDSKLVFSKPQDRLFAGQI DRMDRFALRYRARKYSSEQFRMPYSGLRGQRTSLIPHQLNIAHDVGRRHAPRVLLADEVG LGKTIEAGMILHQQLLSGAAERVLIIVPETLQHQWLVEMLRRFNLRFALFDDERYAEAQH DAYNPFDTEQLVICSLDFARRSKQRLEHLCEAEWDLLVVDEAHHLVWSEDAPSREYQAIE QLAEHVPGVLLLTATPEQLGMESHFARLRLLDPNRFHDFAQFVEEQKNYRPVADAVAMLL AGNKLSNDELNMLGEMIGEQDIEPLLQAANSDSEDAQSARQELVSMLMDRHGTSRVLFRN 
TRNGVKGFPKRELHTIKLPLPTQYQTAIKVSGIMGARKSAEDRARDMLYPERIYQEFEGD NATWWNFDPRVEWLMGYLTSHRSQKVLVICAKAATALQLEQVLREREGIRAAVFHEGMSI IERDRAAAWFAEEDTGAQVLLCSEIGSEGRNFQFASHMVMFDLPFNPDLLEQRIGRLDRI GQAHDIQIHVPYLEKTAQSVLVRWYHEGLDAFEHTCPTGRTIYDSVYNDLINYLASPDQT EGFDDLIKNCREQHEALKAQLEQGRDRLLEIHSNGGEKAQALAESIEEQDDDTNLIAFAM NLFDIIGINQDDRGDNMIVLTPSDHMLVPDFPGLSEDGITITFDREVALAREDAQFITWE HPLIRNGLDLILSGDTGSSTISLLKNKALPVGTLLVELIYVVEAQAPKQLQLNRFLPPTP VRMLLDKNGNNLAAQVEFETFNRQLNAVNRHTGSKLVNAVQQDVHAILQLGEAQIEKSAR ALIDAARNEADEKLSAELSRLEALRAVNPNIRDDELTAIESNRQQVMESLDQAGWRLDAL RLIVVTHQ >gi|296494411|gb|ADTN01000327.1| GENE 15 18393 - 19052 218 219 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 16 218 83 281 285 88 31 3e-17 MGMENYNPPQEPWLVILYQDDHIMVVNKPSGLLSVPGRLEEHKDSVMTRIQRDYPQAESV HRLDMATSGVIVVALTKAAERELKRQFREREPKKQYVARVWGHPSPAEGLVDLPLICDWP NRPKQKVCYETGKPAQTEYEVVEYAADNTARVVLKPITGRSHQLRVHMLALGHPILGDRF YASPEARAMAPRLLLHAEMLTITHPAYGNSMTFKAPADF Prediction of potential genes in microbial genomes Time: Mon May 16 00:15:54 2011 Seq name: gi|296494410|gb|ADTN01000328.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont839.4, whole genome shotgun sequence Length of sequence - 38893 bp Number of predicted genes - 34, with homology - 34 Number of transcription units - 12, operones - 6 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 10 - 309 244 ## Z0065 hypothetical protein - Prom 373 - 432 2.4 - Term 1056 - 1087 3.2 2 2 Tu 1 . 
- CDS 1126 - 1941 816 ## COG1076 DnaJ-domain-containing proteins 1 - Prom 2156 - 2215 3.3 + Prom 1992 - 2051 2.4 3 3 Op 1 16/0.000 + CDS 2196 - 4550 2124 ## COG1452 Organic solvent tolerance protein OstA 4 3 Op 2 13/0.000 + CDS 4603 - 5889 1396 ## COG0760 Parvulin-like peptidyl-prolyl isomerase 5 3 Op 3 12/0.000 + CDS 5889 - 6878 462 ## PROTEIN SUPPORTED gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 6 3 Op 4 8/0.000 + CDS 6875 - 7696 781 ## COG0030 Dimethyladenosine transferase (rRNA methylation) 7 3 Op 5 8/0.000 + CDS 7699 - 8076 284 ## COG2967 Uncharacterized protein affecting Mg2+/Co2+ transport 8 3 Op 6 . + CDS 8083 - 8925 811 ## COG0639 Diadenosine tetraphosphatase and related serine/threonine protein phosphatases + Term 9155 - 9189 0.5 - Term 8933 - 8978 8.4 9 4 Tu 1 . - CDS 9003 - 9482 602 ## COG0262 Dihydrofolate reductase - Prom 9506 - 9565 7.0 10 5 Op 1 7/0.000 - CDS 9674 - 11536 1034 ## PROTEIN SUPPORTED gi|229845962|ref|ZP_04466074.1| 30S ribosomal protein S2 11 5 Op 2 4/0.500 - CDS 11529 - 12002 492 ## COG2249 Putative NADPH-quinone reductase (modulator of drug activity B) - Prom 12039 - 12098 4.4 12 6 Op 1 4/0.500 - CDS 12167 - 13498 1515 ## COG0477 Permeases of the major facilitator superfamily 13 6 Op 2 12/0.000 - CDS 13555 - 13842 371 ## COG2440 Ferredoxin-like protein 14 6 Op 3 9/0.000 - CDS 13839 - 15125 1397 ## COG0644 Dehydrogenases (flavoproteins) 15 6 Op 4 29/0.000 - CDS 15176 - 16117 949 ## COG2025 Electron transfer flavoprotein, alpha subunit 16 6 Op 5 . 
- CDS 16132 - 16902 848 ## COG2086 Electron transfer flavoprotein, beta subunit - Prom 17147 - 17206 8.2 + Prom 17292 - 17351 10.8 17 7 Op 1 4/0.500 + CDS 17375 - 18889 1746 ## COG1292 Choline-glycine betaine transporter 18 7 Op 2 8/0.000 + CDS 18920 - 20062 1419 ## COG1960 Acyl-CoA dehydrogenases + Prom 20112 - 20171 2.2 19 7 Op 3 4/0.500 + CDS 20191 - 21408 1505 ## COG1804 Predicted acyl-CoA transferases/carnitine dehydratase 20 7 Op 4 5/0.250 + CDS 21482 - 23035 1165 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II 21 7 Op 5 3/0.500 + CDS 23036 - 23929 1052 ## COG1024 Enoyl-CoA hydratase/carnithine racemase 22 7 Op 6 . + CDS 23968 - 24525 442 ## COG0663 Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily - Term 24653 - 24708 1.9 23 8 Tu 1 . - CDS 24734 - 25129 329 ## SFV_0028 DNA-binding transcriptional activator CaiF - Prom 25319 - 25378 6.4 + Prom 24936 - 24995 2.8 24 9 Tu 1 . + CDS 25164 - 25343 116 ## ECUMN_0035 hypothetical protein - Term 25306 - 25337 2.5 25 10 Op 1 24/0.000 - CDS 25391 - 28612 4410 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) 26 10 Op 2 8/0.000 - CDS 28630 - 29673 1090 ## COG0505 Carbamoylphosphate synthase small subunit - Prom 29808 - 29867 5.5 - Term 30143 - 30184 1.5 27 10 Op 3 . 
- CDS 30234 - 31055 844 ## COG0289 Dihydrodipicolinate reductase - Prom 31142 - 31201 3.9 - Term 31166 - 31205 3.3 28 11 Op 1 3/0.500 - CDS 31222 - 32136 717 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase - Term 32140 - 32173 2.1 29 11 Op 2 7/0.000 - CDS 32202 - 33152 441 ## PROTEIN SUPPORTED gi|227371337|ref|ZP_03854821.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase; SSU ribosomal protein S1P 30 11 Op 3 7/0.000 - CDS 33154 - 33603 257 ## PROTEIN SUPPORTED gi|225086978|ref|YP_002658248.1| ribosomal protein S2 - Prom 33647 - 33706 1.8 31 11 Op 4 16/0.000 - CDS 33728 - 34222 602 ## COG0597 Lipoprotein signal peptidase 32 11 Op 5 16/0.000 - CDS 34222 - 37038 3179 ## COG0060 Isoleucyl-tRNA synthetase 33 11 Op 6 . - CDS 37081 - 38022 391 ## PROTEIN SUPPORTED gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 - Prom 38045 - 38104 5.2 + Prom 38251 - 38310 3.6 34 12 Tu 1 . + CDS 38351 - 38614 422 ## PROTEIN SUPPORTED gi|15799705|ref|NP_285717.1| 30S ribosomal protein S20 + Term 38634 - 38686 1.3 Predicted protein(s) >gi|296494410|gb|ADTN01000328.1| GENE 1 10 - 309 244 99 aa, chain - ## HITS:1 COG:no KEGG:Z0065 NR:ns ## KEGG: Z0065 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_O157 # Pathway: not_defined # 1 90 175 249 250 91 55.0 9e-18 MKGSNDILYERPGWNANLGGATPDGATPDGANPDGANLDGAILDGATVNGATSLYDEVII INKIPPKKIDTKGVATEEVATKKVLLNKLLTTQLLNEPE >gi|296494410|gb|ADTN01000328.1| GENE 2 1126 - 1941 816 271 aa, chain - ## HITS:1 COG:STM0094 KEGG:ns NR:ns ## COG: STM0094 COG1076 # Protein_GI_number: 16763484 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-domain-containing proteins 1 # Organism: Salmonella typhimurium LT2 # 1 271 1 270 270 504 95.0 1e-142 MQYWGKIIGVAVALLMGGGFWGVVLGLLIGHMFDKARSRKMAWFANQRERQALFFATTFE VMGHLTKSKGRVTEADIHIASQLMDRMNLHGASRTAAQNAFRVGKSDNYPLREKMRQFRS VCFGRFDLIRMFLEIQIQAAFADGSLHPNERAVLYVIAEELGISRAQFDQFLRMMQGGAQ FGGGYQQQTGGGNWQQAQRGPTLEDACNVLGVKPTDDATTIKRAYRKLMSEHHPDKLVAK 
GLPPEMMEMAKQKAQEIQQAYELIKQQKGFK >gi|296494410|gb|ADTN01000328.1| GENE 3 2196 - 4550 2124 784 aa, chain + ## HITS:1 COG:imp KEGG:ns NR:ns ## COG: imp COG1452 # Protein_GI_number: 16128048 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Organic solvent tolerance protein OstA # Organism: Escherichia coli K12 # 1 784 1 784 784 1554 100.0 0 MKKRIPTLLATMIATALYSQQGLAADLASQCMLGVPSYDRPLVQGDTNDLPVTINADHAK GDYPDDAVFTGSVDIMQGNSRLQADEVQLHQKEAPGQPEPVRTVDALGNVHYDDNQVILK GPKGWANLNTKDTNVWEGDYQMVGRQGRGKADLMKQRGENRYTILDNGSFTSCLPGSDTW SVVGSEIIHDREEQVAEIWNARFKVGPVPIFYSPYLQLPVGDKRRSGFLIPNAKYTTTNY FEFYLPYYWNIAPNMDATITPHYMHRRGNIMWENEFRYLSQAGAGLMELDYLPSDKVYED EHPNDDSSRRWLFYWNHSGVMDQVWRFNVDYTKVSDPSYFNDFDNKYGSSTDGYATQKFS VGYAVQNFNATVSTKQFQVFSEQNTSSYSAEPQLDVNYYQNDVGPFDTRIYGQAVHFVNT RDDMPEATRVHLEPTINLPLSNNWGSINTEAKLLATHYQQTNLDWYNSRNTTKLDESVNR VMPQFKVDGKMVFERDMEMLAPGYTQTLEPRAQYLYVPYRDQSDIYNYDSSLLQSDYSGL FRDRTYGGLDRIASANQVTTGVTSRIYDDAAVERFNISVGQIYYFTESRTGDDNITWEND DKTGSLVWAGDTYWRISERWGLRGGIQYDTRLDNVATSNSSIEYRRDEDRLVQLNYRYAS PEYIQATLPKYYSTAEQYKNGISQVGAVASWPIADRWSIVGAYYYDTNANKQADSMLGVQ YSSCCYAIRVGYERKLNGWDNDKQHAVYDNAIGFNIELRGLSSNYGLGTQEMLRSNILPY QNTL >gi|296494410|gb|ADTN01000328.1| GENE 4 4603 - 5889 1396 428 aa, chain + ## HITS:1 COG:ECs0058 KEGG:ns NR:ns ## COG: ECs0058 COG0760 # Protein_GI_number: 15829312 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like peptidyl-prolyl isomerase # Organism: Escherichia coli O157:H7 # 1 428 1 428 428 773 100.0 0 MKNWKTLLLGIAMIANTSFAAPQVVDKVAAVVNNGVVLESDVDGLMQSVKLNAAQARQQL PDDATLRHQIMERLIMDQIILQMGQKMGVKISDEQLDQAIANIAKQNNMTLDQMRSRLAY DGLNYNTYRNQIRKEMIISEVRNNEVRRRITILPQEVESLAQQVGNQNDASTELNLSHIL IPLPENPTSDQVNEAESQARAIVDQARNGADFGKLAIAHSADQQALNGGQMGWGRIQELP GIFAQALSTAKKGDIVGPIRSGVGFHILKVNDLRGESKNISVTEVHARHILLKPSPIMTD EQARVKLEQIAADIKSGKTTFAAAAKEFSQDPGSANQGGDLGWATPDIFDPAFRDALTRL NKGQMSAPVHSSFGWHLIELLDTRNVDKTDAAQKDRAYRMLMNRKFSEEAASWMQEQRAS AYVKILSN >gi|296494410|gb|ADTN01000328.1| GENE 5 5889 - 
6878 462 329 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 [Flavobacteriales bacterium ALC-1] # 7 324 9 326 346 182 34 3e-45 MVKTQRVVITPGEPAGIGPDLVVQLAQREWPVELVVCADATLLTNRAAMLGLPLTLRPYS PNSPAQPQTAGTLTLLPVALRAPVTAGQLAVENGHYVVETLARACDGCLNGEFAALITGP VHKGVINDAGIPFTGHTEFFEERSQAKKVVMMLATEELRVALATTHLPLRDIADAITPAL LHEVIAILHHDLRTKFGIAEPRILVCGLNPHAGEGGHMGTEEIDTIIPVLNELRAQGMKL NGPLPADTLFQPKYLDNADAVLAMYHDQGLPVLKYQGFGRGVNITLGLPFIRTSVDHGTA LELAGRGKADVGSFITALNLAIKMIVNTQ >gi|296494410|gb|ADTN01000328.1| GENE 6 6875 - 7696 781 273 aa, chain + ## HITS:1 COG:ksgA KEGG:ns NR:ns ## COG: ksgA COG0030 # Protein_GI_number: 16128045 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Escherichia coli K12 # 1 273 1 273 273 538 100.0 1e-153 MNNRVHQGHLARKRFGQNFLNDQFVIDSIVSAINPQKGQAMVEIGPGLAALTEPVGERLD QLTVIELDRDLAARLQTHPFLGPKLTIYQQDAMTFNFGELAEKMGQPLRVFGNLPYNIST PLMFHLFSYTDAIADMHFMLQKEVVNRLVAGPNSKAYGRLSVMAQYYCNVIPVLEVPPSA FTPPPKVDSAVVRLVPHATMPHPVKDVRVLSRITTEAFNQRRKTIRNSLGNLFSVEVLTG MGIDPAMRAENISVAQYCQMANYLAENAPLQES >gi|296494410|gb|ADTN01000328.1| GENE 7 7699 - 8076 284 125 aa, chain + ## HITS:1 COG:ECs0055 KEGG:ns NR:ns ## COG: ECs0055 COG2967 # Protein_GI_number: 15829309 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein affecting Mg2+/Co2+ transport # Organism: Escherichia coli O157:H7 # 1 125 1 125 125 241 100.0 2e-64 MINSPRVCIQVQSVYIEAQSSPDNERYVFAYTVTIRNLGRAPVQLLGRYWLITNGNGRET EVQGEGVVGVQPLIAPGEEYQYTSGAIIETPLGTMQGHYEMIDENGVPFSIDIPVFRLAV PTLIH >gi|296494410|gb|ADTN01000328.1| GENE 8 8083 - 8925 811 280 aa, chain + ## HITS:1 COG:apaH KEGG:ns NR:ns ## COG: apaH COG0639 # Protein_GI_number: 16128043 # Func_class: T Signal transduction mechanisms # Function: Diadenosine tetraphosphatase and related serine/threonine protein phosphatases # Organism: Escherichia coli K12 # 1 280 1 280 280 587 100.0 1e-167 
MATYLIGDVHGCYDELIALLHKVEFTPGKDTLWLTGDLVARGPGSLDVLRYVKSLGDSVR LVLGNHDLHLLAVFAGISRNKPKDRLTPLLEAPDADELLNWLRRQPLLQIDEEKKLVMAH AGITPQWDLQTAKECARDVEAVLSSDSYPFFLDAMYGDMPNNWSPELRGLGRLRFITNAF TRMRFCFPNGQLDMYSKESPEEAPAPLKPWFAIPGPVAEEYSIAFGHWASLEGKGTPEGI YALDTGCCWGGTLTCLRWEDKQYFVQPSNRHKDLGEAAAS >gi|296494410|gb|ADTN01000328.1| GENE 9 9003 - 9482 602 159 aa, chain - ## HITS:1 COG:folA KEGG:ns NR:ns ## COG: folA COG0262 # Protein_GI_number: 16128042 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Escherichia coli K12 # 1 159 1 159 159 332 100.0 1e-91 MISLIAALAVDRVIGMENAMPWNLPADLAWFKRNTLNKPVIMGRHTWESIGRPLPGRKNI ILSSQPGTDDRVTWVKSVDEAIAACGDVPEIMVIGGGRVYEQFLPKAQKLYLTHIDAEVE GDTHFPDYEPDDWESVFSEFHDADAQNSHSYCFEILERR >gi|296494410|gb|ADTN01000328.1| GENE 10 9674 - 11536 1034 620 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229845962|ref|ZP_04466074.1| 30S ribosomal protein S2 [Haemophilus influenzae 7P49H1] # 6 586 9 612 618 402 38 1e-111 MDSHTLIQALIYLGSAALIVPIAVRLGLGSVLGYLIAGCIIGPWGLRLVTDAESILHFAE IGVVLMLFIIGLELDPQRLWKLRAAVFGCGALQMVICGGLLGLFCMLLGLRWQVAELIGM TLALSSTAIAMQAMNERNLMVTQMGRSAFAVLLFQDIAAIPLVAMIPLLATSSASTTMGA FALSALKVAGALVLVVLLGRYVTRPALRFVARSGLREVFSAVALFLVFGFGLLLEEVGLS MAMGAFLAGVLLASSEYRHALESDIEPFKGLLLGLFFIGVGMSIDFGTLLENPLRIVILL LGFLIIKIAMLWLIARPLQVPNKQRRWFAVLLGQGSEFAFVVFGAAQMANVLEPEWAKSL TLAVALSMAATPILLVILNRLEQSSTEEAREADEIDEEQPRVIIAGFGRFGQITGRLLLS SGVKMVVLDHDPDHIETLRKFGMKVFYGDATRMDLLESAGAAKAEVLINAIDDPQTNLQL TEMVKEHFPHLQIIARARDVDHYIRLRQAGVEKPERETFEGALKTGRLALESLGLGPYEA RERADVFRRFNIQMVEEMAMVENDTKARAAVYKRTSAMLSEIITEDREHLSLIQRHGWQG TEEGKHTGNMADEPETKPSS >gi|296494410|gb|ADTN01000328.1| GENE 11 11529 - 12002 492 157 aa, chain - ## HITS:1 COG:yabF KEGG:ns NR:ns ## COG: yabF COG2249 # Protein_GI_number: 16128040 # Func_class: R General function prediction only # Function: Putative NADPH-quinone reductase (modulator of drug activity B) # Organism: Escherichia coli K12 # 1 157 20 176 176 313 100.0 1e-85 
MLEQARTLEGVEIRSLYQLYPDFNIDIAAEQEALSRADLIVWQHPMQWYSIPPLLKLWID KVFSHGWAYGHGGTALHGKHLLWAVTTGGGESHFEIGAHPGFDVLSQPLQATAIYCGLNW LPPFAMHCTFICDDETLEGQARHYKQRLLEWQEAHHG >gi|296494410|gb|ADTN01000328.1| GENE 12 12167 - 13498 1515 443 aa, chain - ## HITS:1 COG:yaaU KEGG:ns NR:ns ## COG: yaaU COG0477 # Protein_GI_number: 16128039 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 443 1 443 443 834 100.0 0 MQPSRNFDDLKFSSIHRRILLWGSGGPFLDGYVLVMIGVALEQLTPALKLDADWIGLLGA GTLAGLFVGTSLFGYISDKVGRRKMFLIDIIAIGVISVATMFVSSPVELLVMRVLIGIVI GADYPIATSMITEFSSTRQRAFSISFIAAMWYVGATCADLVGYWLYDVEGGWRWMLGSAA IPCLLILIGRFELPESPRWLLRKGRVKECEEMMIKLFGEPVAFDEEQPQQTRFRDLFNRR HFPFVLFVAAIWTCQVIPMFAIYTFGPQIVGLLGLGVGKNAALGNVVISLFFMLGCIPPM LWLNTAGRRPLLIGSFAMMTLALAVLGLIPDMGIWLVVMAFAVYAFFSGGPGNLQWLYPN ELFPTDIRASAVGVIMSLSRIGTIVSTWALPIFINNYGISNTMLMGAGISLFGLLISVAF APETRGMSLAQTSNMTIRGQRMG >gi|296494410|gb|ADTN01000328.1| GENE 13 13555 - 13842 371 95 aa, chain - ## HITS:1 COG:ECs0047 KEGG:ns NR:ns ## COG: ECs0047 COG2440 # Protein_GI_number: 15829301 # Func_class: C Energy production and conversion # Function: Ferredoxin-like protein # Organism: Escherichia coli O157:H7 # 1 95 1 95 95 200 100.0 4e-52 MTSPVNVDVKLGVNKFNVDEEHPHIVVKADADKQALELLVKACPAGLYKKQDDGSVRFDY AGCLECGTCRILGLGSALEQWEYPRGTFGVEFRYG >gi|296494410|gb|ADTN01000328.1| GENE 14 13839 - 15125 1397 428 aa, chain - ## HITS:1 COG:fixC KEGG:ns NR:ns ## COG: fixC COG0644 # Protein_GI_number: 16128037 # Func_class: C Energy production and conversion # Function: Dehydrogenases (flavoproteins) # Organism: Escherichia coli K12 # 1 428 1 428 428 813 99.0 0 MSEDIFDAIIVGAGLAGSVAALVLAREGAQVLVIERGNSAGAKNVTGGRLYAHSLEHIIP GFADSAPVERLITHEKLAFMTEKSAMTMDYCNGDETSPSQRSYSVLRSKFDAWLMEQAEE AGAQLITGIRVDNLVQHDGKVVGVEADGDVIEAKTVILADGVNSILAEKLGMAKRVKPTD 
VAVGVKELIELPKSVIEDRFQLQGNQGAACLFAGSPTDGLMGGGFLYTNENTLSLGLVCG LHHLHDAKKSVPQMLEDFKQHPAVAPLIAGGKLVEYSAHVVPEAGINMLPELVGDGVLIA GDAAGMCMNLGFTIRGMDLAIAAGEAAAKTVLSAMKSDDFSKQKLAEYRQHLESGPLRDM RMYQKLPAFLDNPRMFSGYPELAVGVARDLFTIDGSAPELMRKKILRHGKKVGFINLIKD GMKGVTVL >gi|296494410|gb|ADTN01000328.1| GENE 15 15176 - 16117 949 313 aa, chain - ## HITS:1 COG:fixB KEGG:ns NR:ns ## COG: fixB COG2025 # Protein_GI_number: 16128036 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Escherichia coli K12 # 1 313 1 313 313 602 100.0 1e-172 MNTFSQVWVFSDTPSRLPELMNGAQALANQINTFVLNDADGAQAIQLGANHVWKLNGKPD DRMIEDYAGVMADTIRQHGADGLVLLPNTRRGKLLAAKLGYRLKAAVSNDASTVSVQDGK ATVKHMVYGGLAIGEERIATPYAVLTISSGTFDAAQPDASRTGETHTVEWQAPAVAITRT ATQARQSNSVDLDKARLVVSVGRGIGSKENIALAEQLCKAIGAELACSRPVAENEKWMEH ERYVGISNLMLKPELYLAVGISGQIQHMVGANASQTIFAINKDKNAPIFQYADYGIVGDA VKILPALTAALAR >gi|296494410|gb|ADTN01000328.1| GENE 16 16132 - 16902 848 256 aa, chain - ## HITS:1 COG:ECs0044 KEGG:ns NR:ns ## COG: ECs0044 COG2086 # Protein_GI_number: 15829298 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: Escherichia coli O157:H7 # 1 256 13 268 268 426 100.0 1e-119 MKIITCYKCVPDEQDIAVNNADGSLDFSKADAKISQYDLNAIEAACQLKQQAAEAQVTAL SVGGKALTNAKGRKDVLSRGPDELIVVIDDQFEQALPQQTASALAAAAQKAGFDLILCGD GSSDLYAQQVGLLVGEILNIPAVNGVSKIISLTADTLTVERELEDETETLSIPLPAVVAV STDINSPQIPSMKAILGAAKKPVQVWSAADIGFNAEAAWSEQQVAAPKQRERQRIVIEGD GEEQIAAFAENLRKVI >gi|296494410|gb|ADTN01000328.1| GENE 17 17375 - 18889 1746 504 aa, chain + ## HITS:1 COG:caiT KEGG:ns NR:ns ## COG: caiT COG1292 # Protein_GI_number: 16128034 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Choline-glycine betaine transporter # Organism: Escherichia coli K12 # 1 504 1 504 504 933 99.0 0 MKNEKRKTGIEPKVFFPPLIIVGILCWLTVRDLDAANVVINAVFSYVTNVWGWAFEWYMV VMLFGWFWLVFGPYAKKRLGNEPPEFSTASWIFMMFASCTSAAVLFWGSIEIYYYISTPP 
FGLEPNSTGAKELGLAYSLFHWGPLPWATYSFLSVAFAYFFFVRKMEVIRPSSTLVPLVG EKHVKGLFGTIVDNFYLVALIFAMGTSLGLATPLVTECMQWLFGIPHTLQLDAIIITCWI ILNAICVACGLQKGVRIASDVRSYLSFLMLGWVFIVSGASFIMNYFTDSVGMLLMYLPRM LFYTDPIAKGGFPQGWTVFYWAWWVIYAIQMSIFLARISRGRTVRELCFGMVLGLTASTW ILWTVLGSNTLLLMDKNIINIPNLIEQYGVARAIIETWAALPLSTATMWGFFILCFIATV TLVNACSYTLAMSTCREVRDGEEPPLLVRIGWSILVGIIGIVLLALGGLKPIQTAIIAGG CPLFFVNIMVTLSFIKDAKQNWKD >gi|296494410|gb|ADTN01000328.1| GENE 18 18920 - 20062 1419 380 aa, chain + ## HITS:1 COG:ECs0042 KEGG:ns NR:ns ## COG: ECs0042 COG1960 # Protein_GI_number: 15829296 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Escherichia coli O157:H7 # 1 380 1 380 380 787 99.0 0 MDFNLNDEQELFVAGIRELMASENWEAYFAECDRDSVYPERFVKALADMGIDSLLIPEEH GGLDAGFVTLAAVWMELGRLGAPTYVLYQLPGGFNTFLREGTQEQIDKIMAFRGTGKQMW NSAITEPGAGSDVGSLKTTYTRRNGKIYLNGSKCFITSSAYTPYIVVMARDGASPDKPVY TEWFVDMSKPGIKVTKLEKLGLHMDSCCEITFDDVELDEKDMFGREGNGFNRVKEEFDHE RFLVALTNYGTAMCAFEDAARYANQRVQFGEAIGRFQLIQEKFAHMAIKLNSMKNMLYEA AWKADNGTITSGDAAMCKYFCANAAFEVVDSAMQVLGGVGIAGNHRISRFWRDLRVDRVS GGSDEMQILTLGRAVLKQYR >gi|296494410|gb|ADTN01000328.1| GENE 19 20191 - 21408 1505 405 aa, chain + ## HITS:1 COG:caiB KEGG:ns NR:ns ## COG: caiB COG1804 # Protein_GI_number: 16128032 # Func_class: C Energy production and conversion # Function: Predicted acyl-CoA transferases/carnitine dehydratase # Organism: Escherichia coli K12 # 1 405 1 405 405 835 100.0 0 MDHLPMPKFGPLAGLRVVFSGIEIAGPFAGQMFAEWGAEVIWIENVAWADTIRVQPNYPQ LSRRNLHALSLNIFKDEGREAFLKLMETTDIFIEASKGPAFARRGITDEVLWQHNPKLVI AHLSGFGQYGTEEYTNLPAYNTIAQAFSGYLIQNGDVDQPMPAFPYTADYFSGLTATTAA LAALHKVRETGKGESIDIAMYEVMLRMGQYFMMDYFNGGEMCPRMSKGKDPYYAGCGLYK CADGYIVMELVGITQIEECFKDIGLAHLLGTPEIPEGTQLIHRIECPYGPLVEEKLDAWL ATHTIAEVKERFAELNIACAKVLTVPELESNPQYVARESITQWQTMDGRTCKGPNIMPKF KNNPGQIWRGMPSHGMDTAAILKNIGYSENDIQELVSKGLAKVED >gi|296494410|gb|ADTN01000328.1| GENE 20 21482 - 23035 1165 517 aa, chain + ## HITS:1 COG:caiC KEGG:ns NR:ns ## COG: caiC COG0318 # Protein_GI_number: 
16128031 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Escherichia coli K12 # 1 517 6 522 522 1086 99.0 0 MDIIGGQHLRQMWDDLADVYGHKTALICESSGGVVNRYSYLELNQEINRTANLFYTLGIR KGDKVALHLDNCPEFIFCWFGLAKIGAIMVPINARLLCEESAWILQNSQACLLVTSAQFY PMYQQIQQEDATQLRHICLTDVALPADDGVSSFTQLKNQQPATLCYAPPLSTDDTAEILF TSGTTSRPKGVVITHYNLRFAGYYSAWQCALRDDDVYLTVMPAFHIDCQCTAAMAAFSAG ATFVLVEKYSARAFWGQVQKYRATVTECIPMMIRTLMVQPPSANDQQHRLREVMFYLNLS EQEKDAFCERFGVRLLTSYGMTETIVGIIGDRPGDKRRWPSIGRVGFCYEAEIRDDHNRP LPAGEIGEICIKGIPGKTIFKEYFLNPQATAKVLEADGWLHTGDTGYRDEEGFFYFVDRR CNMIKRGGENVSCVELENIIATHPKIQDIVVVGIKDSIRDEAIKAFVVLNEGETLSEEEF FRFCEQNMAKFKVPSYLEIRKDLPRNCSGKIIRKHLK >gi|296494410|gb|ADTN01000328.1| GENE 21 23036 - 23929 1052 297 aa, chain + ## HITS:1 COG:caiD KEGG:ns NR:ns ## COG: caiD COG1024 # Protein_GI_number: 16128030 # Func_class: I Lipid transport and metabolism # Function: Enoyl-CoA hydratase/carnithine racemase # Organism: Escherichia coli K12 # 1 297 1 297 297 583 99.0 1e-166 MKQQGTTLPANNHALKQYAFFAGMLSSLKKQKWRKGMSESLHLTRNGSILEITLDRPKAN AIDAKTSFEMGEVFLNFRDDPQLRVAIITGAGEKFFSAGWDLKAAAEGEAPDADFGPGGF AGLTEIFNLDKPVIAAVNGYAFGGGFELALAADFIVCADNASFALPEAKLGIFPDSGGVL RLPKILPPAIVNEMVMTGRRMGAEEALRWGIVNRVVSQAELMDNARELAQQLVNSAPLAI AALKEIYRTTSEMPVEEAYRYIRSGVLKHYPSVLHSEDAIEGPLAFAEKRDPVWKGR >gi|296494410|gb|ADTN01000328.1| GENE 22 23968 - 24525 442 185 aa, chain + ## HITS:1 COG:ECs0038 KEGG:ns NR:ns ## COG: ECs0038 COG0663 # Protein_GI_number: 15829292 # Func_class: R General function prediction only # Function: Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily # Organism: Escherichia coli O157:H7 # 1 185 19 203 203 365 98.0 1e-101 MVHPTAFVHPSAVLIGDVIVGAGVYIGPLASLRGDYGRLIVQAGANIQDGCIMHGYCDTD TIVGENGHIGHGAILHGCVIGRDALVGMNSVIMDGAVIGEESIVAAMSFVKAGFHGEKRQ LLMGTPARAVRSVSDDELHWKRLNTKEYQDLVGRCHASLHETQPLRQMEENRPRLQGTTD VTPKR >gi|296494410|gb|ADTN01000328.1| GENE 
23 24734 - 25129 329 131 aa, chain - ## HITS:1 COG:no KEGG:SFV_0028 NR:ns ## KEGG: SFV_0028 # Name: caiF # Def: DNA-binding transcriptional activator CaiF # Organism: S.flexneri_8401 # Pathway: not_defined # 1 131 36 166 166 249 100.0 3e-65 MCEGYVEKPLYLLIAEWMMAENRWVIAREISIHFDIEHSKAVNTLTYILSEVTEISCEVK MIPNKLEGRGCQCQRLVKVVDIDEQIYARLRNNSREKLVGVRKTPRIPAVPLTELNREQK WQMMLSKSMRR >gi|296494410|gb|ADTN01000328.1| GENE 24 25164 - 25343 116 59 aa, chain + ## HITS:1 COG:no KEGG:ECUMN_0035 NR:ns ## KEGG: ECUMN_0035 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_UMN026 # Pathway: not_defined # 1 51 1 51 72 99 100.0 2e-20 MTRFEAIKQGHIKIVDISIVCNFTVDKCELNPAYVIKNIDSPKDLLNGQKKTVLIREPY >gi|296494410|gb|ADTN01000328.1| GENE 25 25391 - 28612 4410 1073 aa, chain - ## HITS:1 COG:ECs0036 KEGG:ns NR:ns ## COG: ECs0036 COG0458 # Protein_GI_number: 15829290 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Escherichia coli O157:H7 # 1 1073 1 1073 1073 2124 99.0 0 MPKRTDIKSILILGAGPIVIGQACEFDYSGAQACKALREEGYRVILVNSNPATIMTDPEM ADATYIEPIHWEVVRKIIEKERPDAVLPTMGGQTALNCALELERQGVLEEFGVTMIGATA DAIDKAEDRRRFDVAMKKIGLETARSGIAHTMEEALAVAADVGFPCIIRPSFTMGGSGGG IAYNREEFEEICARGLDLSPTKELLIDESLIGWKEYEMEVVRDKNDNCIIVCSIENFDAM GIHTGDSITVAPAQTLTDKEYQIMRNASMAVLREIGVETGGSNVQFAVNPKNGRLIVIEM NPRVSRSSALASKATGFPIAKVAAKLAVGYTLDELMNDITGGRTPASFEPSIDYVVTKIP RFNFEKFAGANDRLTTQMKSVGEVMAIGRTQQESLQKALRGLEVGATGFDPKVSLDDPEA LTKIRRELKDAGADRIWYIADAFRAGLSVDGVFNLTNIDRWFLVQIEELVRLEEKVAEVG ITGLNAEFLRQLKRKGFADARLAKLAGVREAEIRKLRDQYDLHPVYKRVDTCAAEFATDT AYMYSTYEEECEANPSTDREKIMVLGGGPNRIGQGIEFDYCCVHASLALREDGYETIMVN CNPETVSTDYDTSDRLYFEPVTLEDVLEIVRIEKPKGVIVQYGGQTPLKLARALEAAGVP VIGTSPDAIDRAEDRERFQHAVDRLKLKQPANATVTAIEMAVEKAKEIGYPLVVRPSYVL GGRAMEIVYDEADLRRYFQTAVSVSNDAPVLLDHFLDDAVEVDVDAICDGEMVLIGGIME HIEQAGVHSGDSACSLPAYTLSQEIQDVMRQQVQKLAFELQVRGLMNVQFAVKNNEVYLI 
EVNPRAARTVPFVSKATGVPLAKVAARVMAGKSLAEQGVTKEVIPPYYSVKEVVLPFNKF PGVDPLLGPEMRSTGEVMGVGRTFAEAFAKAQLGSNSTMKKHGRALLSVREGDKERVVDL AAKLLKQGFELDATHGTAIVLGEAGINPRLVNKVHEGRPHIQDRIKNGEYTYIINTTSGR RAIEDSRVIRRSALQYKVHYDTTLNGGFATAMALNADATEKVISVQEMHAQIK >gi|296494410|gb|ADTN01000328.1| GENE 26 28630 - 29673 1090 347 aa, chain - ## HITS:1 COG:ECs0035 KEGG:ns NR:ns ## COG: ECs0035 COG0505 # Protein_GI_number: 15829289 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase small subunit # Organism: Escherichia coli O157:H7 # 1 347 36 382 382 714 99.0 0 MTGYQEILTDPSYSRQIVTLTYPHIGNVGTNDADEESSQVHAQGLVIRDLPLIASNFRNT EDLSSYLKRHNIVAIADIDTRKLTRLLREKGAQNGCIIAGDNPDAALALEKARAFPGLNG MDLAKEVTTAEAYSWTQGSWTLTGGLPEAKKEDELPFHVVAYDFGAKRNILRMLVDRGCR LTIVPAQTSAEDVLKMNPDGIFLSNGPGDPAPCDYAITAIQKFLETDIPVFGICLGHQLL ALASGAKTVKMKFGHHGGNHPVKDVEKNVVMITAQNHGFAVDEAILPANLRVTHKSLFDG TLQGIHRTDKPAFSFQGHPEASPGPHDAAPLFDHFIELIEQYRKTAK >gi|296494410|gb|ADTN01000328.1| GENE 27 30234 - 31055 844 273 aa, chain - ## HITS:1 COG:dapB KEGG:ns NR:ns ## COG: dapB COG0289 # Protein_GI_number: 16128025 # Func_class: E Amino acid transport and metabolism # Function: Dihydrodipicolinate reductase # Organism: Escherichia coli K12 # 1 273 1 273 273 472 100.0 1e-133 MHDANIRVAIAGAGGRMGRQLIQAALALEGVQLGAALEREGSSLLGSDAGELAGAGKTGV TVQSSLDAVKDDFDVFIDFTRPEGTLNHLAFCRQHGKGMVIGTTGFDEAGKQAIRDAAAD IAIVFAANFSVGVNVMLKLLEKAAKVMGDYTDIEIIEAHHRHKVDAPSGTALAMGEAIAH ALDKDLKDCAVYSREGHTGERVPGTIGFATVRAGDIVGEHTAMFADIGERLEITHKASSR MTFANGAVRSALWLSGKESGLFDMRDVLDLNNL >gi|296494410|gb|ADTN01000328.1| GENE 28 31222 - 32136 717 304 aa, chain - ## HITS:1 COG:ECs0033 KEGG:ns NR:ns ## COG: ECs0033 COG1957 # Protein_GI_number: 15829287 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Escherichia coli O157:H7 # 1 304 1 304 304 578 99.0 1e-165 MRLPIFLDTDPGIDDAVAIAAAIFAPELDLQLMTTVAGNVSVEKTTRNALQLLHFWNAEI 
PLAQGAAVPLVRAPRDAASVHGESGMAGYDFVEHNRKPLGIPAFLAIRDALMRAPEPVTL VAIGPLTNIALLLSQCPECKPHIRRLVIMGGSAGRGNCTPNAEFNIAADPEAAACVFRSG IEIVMCGLDVTNQAILTPDYLATLPELNRTGKMLHALFSHYRSGSMQSGLRMHDLCAIAW LVRPDLFTLKPCFVAVETQGEFTSGTTVVDIDGCLGKPANVQVALDLNVKGFQQWVAEVL ALVP >gi|296494410|gb|ADTN01000328.1| GENE 29 32202 - 33152 441 316 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227371337|ref|ZP_03854821.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase; SSU ribosomal protein S1P [Veillonella parvula DSM 2008] # 1 298 1 287 632 174 34 7e-43 MQILLANPRGFCAGVDRAISIVENALAIYGAPIYVRHEVVHNRYVVDSLRERGAIFIEQI SEVPDGAILIFSAHGVSQAVRNEAKSRDLTVFDATCPLVTKVHMEVARASRRGEESILIG HAGHPEVEGTMGQYSNPEGGMYLVESPDDVWKLTVKNEEKLSFMTQTTLSVDDTSDVIDA LRKRFPKIVGPRKDDICYATTNRQEAVRALAEQAEVVLVVGSKNSSNSNRLAELAQRMGK SAFLIDDAKDIQEEWVKEVKCVGVTAGASAPDILVQNVVARLQQLGGGEAIPLEGREENI VFEVPKELRVDIREVD >gi|296494410|gb|ADTN01000328.1| GENE 30 33154 - 33603 257 149 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225086978|ref|YP_002658248.1| ribosomal protein S2 [gamma proteobacterium NOR5-3] # 1 144 1 143 148 103 36 2e-21 MSESVQSNSAVLVHFTLKLDDGTTAESTRNNGKPALFRLGDASLSEGLEQHLLGLKVGDK TTFSLEPDAAFGVPSPDLIQYFSRREFMDAGEPEIGAIMLFTAMDGSEMPGVIREINGDS ITVDFNHPLAGQTVHFDIEVLEIDPALEA >gi|296494410|gb|ADTN01000328.1| GENE 31 33728 - 34222 602 164 aa, chain - ## HITS:1 COG:lspA KEGG:ns NR:ns ## COG: lspA COG0597 # Protein_GI_number: 16128021 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Lipoprotein signal peptidase # Organism: Escherichia coli K12 # 1 164 1 164 164 305 99.0 2e-83 MSQSICSTGLRWLWLVVVVLIIDLGSKYLILQNFALGDTVPLFPSLNLHYARNYGAAFSF LADSGGWQRWFFAGIAIGISVILAVMMYRSKATQKLNNIAYALIIGGALGNLFDRLWHGF VVDMIDFYVGDWHFATFNLADTAICVGAALIVLEGFLPSKAKKQ >gi|296494410|gb|ADTN01000328.1| GENE 32 34222 - 37038 3179 938 aa, chain - ## HITS:1 COG:ileS KEGG:ns NR:ns ## COG: ileS COG0060 # Protein_GI_number: 16128020 # Func_class: J Translation, ribosomal structure and biogenesis # 
Function: Isoleucyl-tRNA synthetase # Organism: Escherichia coli K12 # 1 938 1 938 938 1947 99.0 0 MSDYKSTLNLPETGFPMRGDLAKREPGMLARWTDDDLYGIIRAAKKGKKTFILHDGPPYA NGSIHIGHSVNKILKDIIVKSKGLSGYDSPYVPGWDCHGLPIELKVEQEYGKPGEKFTAA EFRAKCREYAATQVDGQRKDFIRLGVLGDWSHPYLTMDFKTEANIIRALGKIIGNGHLHK GAKPVHWCVDCRSALAEAEVEYYDKTSPSIDVAFQAVDQDALKAKFGVSNVNGPISLVIW TTTPWTLPANRAISIAPDFDYALVQIDGQAVILAKDLVESVMQRIGVTDYTILGTVKGAE LELLRFTHPFMGFDVPAILGDHVTLDAGTGAVHTAPGHGPDDYVIGQKYGLETANPVGPD GTYLPGTYPTLDGVNVFKANDIVVALLQEKGALLHVEKMQHSYPCCWRHKTPIIFRATPQ WFVSMDQKGLRAQSLKEIKGVQWIPDWGQARIESMVANRPDWCISRQRTWGVPMSLFVHK DTEELHPRTLELMEEVAKRVEVDGIQAWWDLDAKEILGDEADQYVKVPDTLDVWFDSGST HSSVVDVRPEFAGHAADMYLEGSDQHRGWFMSSLMISTAMKGKAPYRQVLTHGFTVDGQG RKMSKSIGNTVSPQDVMNKLGADILRLWVASTDYTGEMAVSDEILKRAADSYRRIRNTAR FLLANLNGFDPAKDMVKPEEMVVLDRWAVGCAKAAQEDILKAYEAYDFHEVVQRQMRFCS VEMGSFYLDIIKDRQYTAKADSVARRSCQTALYHIAEALVRWMAPILSFTADEVWGYLPG EREKYVFTGEWYEGLFGLADSEAMNDAFWDELLKVRGEVNKVIEQARADKKVGGSLEAAV TLYAEPELAAKLTALGDELRFVLLTSGATVADYNDAPADAQQSEVLKGLKVALSKAEGEK CPRCWHYTRDVGKVAEHAEICGRCVSNVAGDGEKRKFA >gi|296494410|gb|ADTN01000328.1| GENE 33 37081 - 38022 391 313 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 [Bacillus selenitireducens MLS10] # 1 307 1 312 317 155 33 5e-37 MKLIRGIHNLSQAPQEGCVLTIGNFDGVHRGHRALLQGLQEEGRKRNLPVMVMLFEPQPL ELFATDKAPARLTRLREKLRYLAECGVDYVLCVRFDRRFAALTAQNFISDLLVKHLRVKF LAVGDDFRFGAGREGDFLLLQKAGMEYGFDITSTQTFCEGGVRISSTAVRQALADDNLAL AESLLGHPFAISGRVVHGDELGRTIGFPTANVPLRRQVSPVKGVYAVEVLGLGEKPLPGV ANIGTRPTVAGIRQQLEVHLLDVAMDLYGRHIQVVLRKKIRNEQRFASLDELKAQIARDE LTAREFFGLTKPA >gi|296494410|gb|ADTN01000328.1| GENE 34 38351 - 38614 422 87 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15799705|ref|NP_285717.1| 30S ribosomal protein S20 [Escherichia coli O157:H7 EDL933] # 1 87 1 87 87 167 98 1e-40 MANIKSAKKRAIQSEKARKHNASRRSMMRTFIKKVYAAIEAGDKAAAQKAFNEMQPIVDR QAAKGLIHKNKAARHKANLTAQINKLA Prediction of potential genes in microbial genomes Time: Mon May 16 00:16:02 2011 Seq name: 
gi|296494409|gb|ADTN01000329.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont849.1, whole genome shotgun sequence Length of sequence - 339 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 37 - 112 88.7 # Thr GGT 0 0 + 5S_RRNA 146 - 271 99.0 # ECORRB [D:1..126] # 5S ribosomal RNA # Escherichia coli # Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Escherichia. Predicted protein(s) Prediction of potential genes in microbial genomes Time: Mon May 16 00:16:04 2011 Seq name: gi|296494408|gb|ADTN01000330.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont849.2, whole genome shotgun sequence Length of sequence - 4350 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 2 1 Op 2 . + CDS 1360 - 2325 770 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase - Term 2318 - 2347 0.4 3 2 Tu 1 . - CDS 2354 - 3280 866 ## COG1072 Panthothenate kinase - Prom 3409 - 3468 3.3 + TRNA 3666 - 3741 91.8 # Thr TGT 0 0 + TRNA 3750 - 3834 66.9 # Tyr GTA 0 0 + TRNA 3951 - 4025 64.8 # Gly TCC 0 0 + TRNA 4032 - 4107 94.8 # Thr GGT 0 0 + Prom 4052 - 4111 60.4 4 3 Tu 1 . 
+ CDS 4222 - 4350 177 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 Predicted protein(s) >gi|296494408|gb|ADTN01000330.1| GENE 1 335 - 1363 775 342 aa, chain + ## HITS:1 COG:murB KEGG:ns NR:ns ## COG: murB COG0812 # Protein_GI_number: 16131806 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate dehydrogenase # Organism: Escherichia coli K12 # 1 342 1 342 342 716 100.0 0 MNHSLKPWNTFGIDHNAQHIVCAEDEQQLLNAWQYATAEGQPVLILGEGSNVLFLEDYRG TVIINRIKGIEIHDEPDAWYLHVGAGENWHRLVKYTLQEGMPGLENLALIPGCVGSSPIQ NIGAYGVELQRVCAYVDSVELATGKQVRLTAKECRFGYRDSIFKHEYQDRFAIVAVGLRL PKEWQPVLTYGDLTRLDPTTVTPQQVFNAVCHMRTTKLPDPKVNGNAGSFFKNPVVSAET AKALLSQFPTAPNYPQADGSVKLAAGWLIDQCQLKGMQIGGAAVHRQQALVLINEDNAKS EDVVQLAHHVRQKVGEKFNVWLEPEVRFIGASGEVSAVETIS >gi|296494408|gb|ADTN01000330.1| GENE 2 1360 - 2325 770 321 aa, chain + ## HITS:1 COG:ECs4900_2 KEGG:ns NR:ns ## COG: ECs4900_2 COG0340 # Protein_GI_number: 15834154 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Escherichia coli O157:H7 # 77 321 1 245 245 468 99.0 1e-132 MKDNTVPLKLIALLANGEFHSGEQLGETLGMSRAAINKHIQTLRDWGVDVFTVPGKGYSL PEPIQLLNAKQILGQLDGGSVAVLPVIDSTNQYLLDRIGELKSGDACIAEYQQAGRGRRG RKWFSPFGANLYLSMFWRLEQGPAAAIGLSLVIGIVMAEVLRKLGADKVRVKWPNDLYLQ DRKLAGILVELTGKTGDAAQIVIGAGINMAMRRVEESVVNQGWITLQEAGINLDRNTLAA MLIRELRAALELFEQEGLAPYLSRWEKLDNFINRPVKLIIGDKEIFGISRGIDKQGALLL EQDGIIKPWMGGEISLRSAEK >gi|296494408|gb|ADTN01000330.1| GENE 3 2354 - 3280 866 308 aa, chain - ## HITS:1 COG:ZcoaA KEGG:ns NR:ns ## COG: ZcoaA COG1072 # Protein_GI_number: 15804568 # Func_class: H Coenzyme transport and metabolism # Function: Panthothenate kinase # Organism: Escherichia coli O157:H7 EDL933 # 1 308 1 308 308 623 100.0 1e-178 MTPYLQFDRNQWAALRDSVPMTLSEDEIARLKGINEDLSLEEVAEIYLPLSRLLNFYISS NLRRQAVLEQFLGTNGQRIPYIISIAGSVAVGKSTTARVLQALLSRWPEHRRVELITTDG FLHPNQVLKERGLMKKKGFPESYDMHRLVKFVSDLKSGVPNVTAPVYSHLIYDVIPDGDK 
TVVQPDILILEGLNVLQSGMDYPHDPHHVFVSDFVDFSIYVDAPEDLLQTWYINRFLKFR EGAFTDPDSYFHNYAKLTKEEAIKTAMTLWKEINWLNLKQNILPTRERASLILTKSANHA VEEVRLRK >gi|296494408|gb|ADTN01000330.1| GENE 4 4222 - 4350 177 43 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 42 1 42 407 72 76 5e-13 MSKEKFERTKPHVNVGTIGHVDHGKTTLTAAITTVLAKTYGGA Prediction of potential genes in microbial genomes Time: Mon May 16 00:16:07 2011 Seq name: gi|296494407|gb|ADTN01000331.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont849.3, whole genome shotgun sequence Length of sequence - 9182 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 6, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 15 - 59 3.2 1 1 Op 1 11/0.000 - CDS 70 - 597 450 ## COG1763 Molybdopterin-guanine dinucleotide biosynthesis protein 2 1 Op 2 . - CDS 579 - 1163 446 ## COG0746 Molybdopterin-guanine dinucleotide biosynthesis protein A - Prom 1188 - 1247 2.9 + Prom 935 - 994 1.9 3 2 Tu 1 5/0.333 + CDS 1233 - 1502 417 ## COG3084 Uncharacterized protein conserved in bacteria + Prom 1511 - 1570 1.6 4 3 Op 1 5/0.333 + CDS 1627 - 2565 623 ## COG2334 Putative homoserine kinase type II (protein kinase fold) 5 3 Op 2 1/1.000 + CDS 2582 - 3208 787 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 3221 - 3268 5.1 + Prom 3210 - 3269 10.4 6 4 Tu 1 . + CDS 3363 - 4793 1049 ## COG5339 Uncharacterized protein conserved in bacteria - Term 4787 - 4830 10.0 7 5 Tu 1 . - CDS 4834 - 5577 529 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase + Prom 5841 - 5900 4.5 8 6 Tu 1 . 
+ CDS 6130 - 8916 3470 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains Predicted protein(s) >gi|296494407|gb|ADTN01000331.1| GENE 1 70 - 597 450 175 aa, chain - ## HITS:1 COG:mobB KEGG:ns NR:ns ## COG: mobB COG1763 # Protein_GI_number: 16131697 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin-guanine dinucleotide biosynthesis protein # Organism: Escherichia coli K12 # 6 175 1 170 170 333 100.0 7e-92 MAGKTMIPLLAFAAWSGTGKTTLLKKLIPALCARGIRPGLIKHTHHDMDVDKPGKDSYEL RKAGAAQTIVASQQRWALMTETPDEEELDLQFLASRMDTSKLDLILVEGFKHEEIAKIVL FRDGAGHRPEELVIDRHVIAVASDVPLNLDVALLDINDVEGLADFVVEWMQKQNG >gi|296494407|gb|ADTN01000331.1| GENE 2 579 - 1163 446 194 aa, chain - ## HITS:1 COG:mobA KEGG:ns NR:ns ## COG: mobA COG0746 # Protein_GI_number: 16131698 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin-guanine dinucleotide biosynthesis protein A # Organism: Escherichia coli K12 # 1 194 1 194 194 401 100.0 1e-112 MNLMTTITGVVLAGGKARRMGGVDKGLLELNGKPLWQHVADALMTQLSHVVVNANRHQEI YQASGLKVIEDSLADYPGPLAGMLSVMQQEAGEWFLFCPCDTPYIPPDLAARLNHQRKDA PVVWVHDGERDHPTIALVNRAIEPLLLEYLQAGERRVMVFMRLAGGHAVDFSDHKDAFVN VNTPEELARWQEKR >gi|296494407|gb|ADTN01000331.1| GENE 3 1233 - 1502 417 89 aa, chain + ## HITS:1 COG:ECs4781 KEGG:ns NR:ns ## COG: ECs4781 COG3084 # Protein_GI_number: 15834035 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 89 1 89 89 158 100.0 3e-39 MKCKRLNEVIELLQPAWQKEPDLNLLQFLQKLAKESGFDGELADLTDDILIYHLKMRDSA KDAVIPGLQKDYEEDFKTALLRARGVIKE >gi|296494407|gb|ADTN01000331.1| GENE 4 1627 - 2565 623 312 aa, chain + ## HITS:1 COG:yihE KEGG:ns NR:ns ## COG: yihE COG2334 # Protein_GI_number: 16131700 # Func_class: R General function prediction only # Function: Putative homoserine kinase type II (protein kinase fold) # Organism: Escherichia coli K12 # 1 312 17 328 328 637 100.0 0 MDALFEHGIRVDSGLTPLNSYENRVYQFQDEDRRRFVVKFYRPERWTADQILEEHQFALQ 
LVNDEVPVAAPVAFNGQTLLNHQGFYFAVFPSVGGRQFEADNIDQMEAVGRYLGRMHQTG RKQLFIHRPTIGLNEYLIEPRKLFEDATLIPSGLKAAFLKATDELIAAVTAHWREDFTVL RLHGDCHAGNILWRDGPMFVDLDDARNGPAVQDLWMLLNGDKAEQRMQLETIIEAYEEFS EFDTAEIGLIEPLRAMRLVYYLAWLMRRWADPAFPKNFPWLTGEDYWLRQTATFIEQAKV LQEPPLQLTPMY >gi|296494407|gb|ADTN01000331.1| GENE 5 2582 - 3208 787 208 aa, chain + ## HITS:1 COG:ECs4783 KEGG:ns NR:ns ## COG: ECs4783 COG0526 # Protein_GI_number: 15834037 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Escherichia coli O157:H7 # 1 208 1 208 208 410 100.0 1e-115 MKKIWLALAGLVLAFSASAAQYEDGKQYTTLEKPVAGAPQVLEFFSFFCPHCYQFEEVLH ISDNVKKKLPEGVKMTKYHVNFMGGDLGKDLTQAWAVAMALGVEDKVTVPLFEGVQKTQT IRSASDIRDVFINAGIKGEEYDAAWNSFVVKSLVAQQEKAAADVQLRGVPAMFVNGKYQL NPQGMDTSNMDVFVQQYADTVKYLSEKK >gi|296494407|gb|ADTN01000331.1| GENE 6 3363 - 4793 1049 476 aa, chain + ## HITS:1 COG:yihF KEGG:ns NR:ns ## COG: yihF COG5339 # Protein_GI_number: 16131702 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 476 15 490 490 919 99.0 0 MIRKSATGVIVALAVIWGGGTWYTGTQIQPGIEKFIKDFNDAKKKGEHAYDMTLSYQNFD KGFFNSRFQMQMTFDNGAPDLNIKLGQKVVFDVDVEHGPLPITMLMHGNVIPALAAAKVN LVNNELTQPLFIAAKNKSPVEATLRFAFGGSFSTTLDVAPAEYGKFSFGEGQFTFNGDGS SLSNLDIEGKVEDIVLQLSPMNKVTAKSFTIDSLARLEEKKFPVGESESKFNQINIINHG EDVAQIDAFVAKTRLDRVKDKDYINVNLTYELDKLTKGNQQLGSGEWSLIAESIDPSAVR QFIIQYNIAMQKQLAAHPELANDEVALQEVNAALFKEYLPLLQKSEPTIKQPVRWKNALG ELNANLDISIADPAKSSSSTNKDIKSLNFDVKLPLNVVTETAKQLNLSEGMDAEKAQKQA DKQISGMMTLGQMFQLITIDNNTASLQLRYTPGKVVFNGQEMSEEEFMSRAGRFVH >gi|296494407|gb|ADTN01000331.1| GENE 7 4834 - 5577 529 247 aa, chain - ## HITS:1 COG:yihG KEGG:ns NR:ns ## COG: yihG COG0204 # Protein_GI_number: 16131703 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Escherichia coli K12 # 1 247 64 310 310 518 100.0 1e-147 
MYCWCEGLAVLLHLNPHLQWEVHGLEGLSKKNWYLLICNHRSWADIVVLCVLFRKHIPMN KYFLKQQLAWVPFLGLACWSLDMPFMKRYSRAYLLRHPERRGKDVETTRRSCEKFRLHPT TIVNFVEGSRFTQEKHQQTHSTFQNLLPPKAAGIAMALNVLGKQFDKLLNVTLCYPDNNR QPFFDMLSGKLTRIVVHVDLQPIADELHGDYINDKSFKRHFQQWLNSLWQEKDRLLTSLM SSQRQNK >gi|296494407|gb|ADTN01000331.1| GENE 8 6130 - 8916 3470 928 aa, chain + ## HITS:1 COG:polA_2 KEGG:ns NR:ns ## COG: polA_2 COG0749 # Protein_GI_number: 16131704 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Escherichia coli K12 # 289 928 1 640 640 1231 100.0 0 MVQIPQNPLILVDGSSYLYRAYHAFPPLTNSAGEPTGAMYGVLNMLRSLIMQYKPTHAAV VFDAKGKTFRDELFEHYKSHRPPMPDDLRAQIEPLHAMVKAMGLPLLAVSGVEADDVIGT LAREAEKAGRPVLISTGDKDMAQLVTPNITLINTMTNTILGPEEVVNKYGVPPELIIDFL ALMGDSSDNIPGVPGVGEKTAQALLQGLGGLDTLYAEPEKIAGLSFRGAKTMAAKLEQNK EVAYLSYQLATIKTDVELELTCEQLEVQQPAAEELLGLFKKYEFKRWTADVEAGKWLQAK GAKPAAKPQETSVADEAPEVTATVISYDNYVTILDEETLKAWIAKLEKAPVFAFDTETDS LDNISANLVGLSFAIEPGVAAYIPVAHDYLDAPDQISRERALELLKPLLEDEKALKVGQN LKYDRGILANYGIELRGIAFDTMLESYILNSVAGRHDMDSLAERWLKHKTITFEEIAGKG KNQLTFNQIALEEAGRYAAEDADVTLQLHLKMWPDLQKHKGPLNVFENIEMPLVPVLSRI ERNGVKIDPKVLHNHSEELTLRLAELEKKAHEIAGEEFNLSSTKQLQTILFEKQGIKPLK KTPGGAPSTSEEVLEELALDYPLPKVILEYRGLAKLKSTYTDKLPLMINPKTGRVHTSYH QAVTATGRLSSTDPNLQNIPVRNEEGRRIRQAFIAPEDYVIVSADYSQIELRIMAHLSRD KGLLTAFAEGKDIHRATAAEVFGLPLETVTSEQRRSAKAINFGLIYGMSAFGLARQLNIP RKEAQKYMDLYFERYPGVLEYMERTRAQAKEQGYVETLDGRRLYLPDIKSSNGARRAAAE RAAINAPMQGTAADIIKRAMIAVDAWLQAEQPRVRMIMQVHDELVFEVHKDDVDAVAKQI HQLMENCTRLDVPLLVEVGSGENWDQAH Prediction of potential genes in microbial genomes Time: Mon May 16 00:16:09 2011 Seq name: gi|296494406|gb|ADTN01000332.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont849.4, whole genome shotgun sequence Length of sequence - 1304 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 930 - 989 6.5 2 2 Tu 1 . 
+ CDS 1067 - 1144 65 ## + Term 1248 - 1304 0.3 Predicted protein(s) >gi|296494406|gb|ADTN01000332.1| GENE 1 138 - 734 691 198 aa, chain - ## HITS:1 COG:ECs4787 KEGG:ns NR:ns ## COG: ECs4787 COG0218 # Protein_GI_number: 15834041 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Escherichia coli O157:H7 # 1 198 13 210 210 374 99.0 1e-104 MSAPDIRHLPSDTGIEVAFAGRSNAGKSSALNTLTNQKSLARTSKTPGRTQLINLFEVAD GKRLVDLPGYGYAEVPEEMKRKWQRALGEYLEKRQSLQGLVVLMDIRHPLKDLDQQMIEW AVDSNIAVLVLLTKADKLASGARKAQLNMVREAVLAFNGDVQVETFSSLKKQGVDKLRQK LDTWFSEMQPVEETQDGE >gi|296494406|gb|ADTN01000332.1| GENE 2 1067 - 1144 65 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTQDEGQERQEAKTEDCQEDKRPET Prediction of potential genes in microbial genomes Time: Mon May 16 00:16:15 2011 Seq name: gi|296494405|gb|ADTN01000333.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont849.5, whole genome shotgun sequence Length of sequence - 9023 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/1.000 + CDS 80 - 589 554 ## COG3078 Uncharacterized protein conserved in bacteria + Prom 618 - 677 2.1 2 1 Op 2 . + CDS 778 - 2151 1675 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases + Term 2278 - 2343 16.8 3 2 Tu 1 . - CDS 2380 - 2490 126 ## - Prom 2511 - 2570 2.9 4 3 Op 1 14/0.000 - CDS 2602 - 4011 1522 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 5 3 Op 2 5/0.000 - CDS 4023 - 5072 1113 ## COG3852 Signal transduction histidine kinase, nitrogen specific - Prom 5103 - 5162 5.2 - Term 5155 - 5184 -0.3 6 3 Op 3 . - CDS 5358 - 6767 1655 ## COG0174 Glutamine synthetase + Prom 6846 - 6905 5.8 7 4 Tu 1 . 
+ CDS 7140 - 8963 2255 ## COG1217 Predicted membrane GTPase involved in stress response + Term 8987 - 9016 2.1 Predicted protein(s) >gi|296494405|gb|ADTN01000333.1| GENE 1 80 - 589 554 169 aa, chain + ## HITS:1 COG:yihI KEGG:ns NR:ns ## COG: yihI COG3078 # Protein_GI_number: 16131706 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 1 169 1 169 169 211 100.0 8e-55 MKPSSSNSRSKGHAKARRKTREELDQEARDRKRQKKRRGHAPGSRAAGGNTTSGSKGQNA PKDPRIGSKTPIPLGVTEKVTKQHKPKSEKPMLSPQAELELLETDERLDALLERLEAGET LSAEEQSWVDAKLDRIDELMQKLGLSYDDDEEEEEDEKQEDMMRLLRGN >gi|296494405|gb|ADTN01000333.1| GENE 2 778 - 2151 1675 457 aa, chain + ## HITS:1 COG:ECs4789 KEGG:ns NR:ns ## COG: ECs4789 COG0635 # Protein_GI_number: 15834043 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Escherichia coli O157:H7 # 1 457 3 459 459 940 100.0 0 MSVQQIDWDLALIQKYNYSGPRYTSYPTALEFSEDFGEQAFLQAVARYPERPLSLYVHIP FCHKLCYFCGCNKIVTRQQHKADQYLDALEQEIVHRAPLFAGRHVSQLHWGGGTPTYLNK AQISRLMKLLRENFQFNADAEISIEVDPREIELDVLDHLRAEGFNRLSMGVQDFNKEVQR LVNREQDEEFIFALLNHAREIGFTSTNIDLIYGLPKQTPESFAFTLKRVAELNPDRLSVF NYAHLPTIFAAQRKIKDADLPSPQQKLDILQETIAFLTQSGYQFIGMDHFARPDDELAVA QREGVLHRNFQGYTTQGDTDLLGMGVSAISMIGDCYAQNQKELKQYYQQVDEQGNALWRG IALTRDDCIRRDVIKSLICNFRLDYAPIEKQWDLHFADYFAEDLKLLAPLAKDGLVDVDE KGIQVTAKGRLLIRNICMCFDTYLRQKARMQQFSRVI >gi|296494405|gb|ADTN01000333.1| GENE 3 2380 - 2490 126 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLESIINLVSSGAVDSHTPQTAVAAVLCAAMIGLFS >gi|296494405|gb|ADTN01000333.1| GENE 4 2602 - 4011 1522 469 aa, chain - ## HITS:1 COG:ECs4790 KEGG:ns NR:ns ## COG: ECs4790 COG2204 # Protein_GI_number: 15834044 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli O157:H7 # 1 469 1 469 469 915 100.0 0 
MQRGIVWVVDDDSSIRWVLERALAGAGLTCTTFENGAEVLEALASKTPDVLLSDIRMPGM DGLALLKQIKQRHPMLPVIIMTAHSDLDAAVSAYQQGAFDYLPKPFDIDEAVALVERAIS HYQEQQQPRNVQLNGPTTDIIGEAPAMQDVFRIIGRLSRSSISVLINGESGTGKELVAHA LHRHSPRAKAPFIALNMAAIPKDLIESELFGHEKGAFTGANTIRQGRFEQADGGTLFLDE IGDMPLDVQTRLLRVLADGQFYRVGGYAPVKVDVRIIAATHQNLEQRVQEGKFREDLFHR LNVIRVHLPPLRERREDIPRLARHFLQVAARELGVEAKLLHPETEAALTRLAWPGNVRQL ENTCRWLTVMAAGQEVLIQDLPGELFESTVAESTSQMQPDSWATLLAQWADRALRSGHQN LLSEAQPELERTLLTTALRHTQGHKQEAARLLGWGRNTLTRKLKELGME >gi|296494405|gb|ADTN01000333.1| GENE 5 4023 - 5072 1113 349 aa, chain - ## HITS:1 COG:ECs4791 KEGG:ns NR:ns ## COG: ECs4791 COG3852 # Protein_GI_number: 15834045 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase, nitrogen specific # Organism: Escherichia coli O157:H7 # 1 349 1 349 349 659 100.0 0 MATGTQPDAGQILNSLINSILLIDDNLAIHYANPAAQQLLAQSSRKLFGTPLPELLSYFS LNIELMQESLEAGQGFTDNEVTLVIDGRSHILSVTAQRMPDGMILLEMAPMDNQRRLSQE QLQHAQQVAARDLVRGLAHEIKNPLGGLRGAAQLLSKALPDPSLLEYTKVIIEQADRLRN LVDRLLGPQLPGTRVTESIHKVAERVVTLVSMELPDNVRLIRDYDPSLPELAHDPDQIEQ VLLNIVRNALQALGPEGGEIILRTRTAFQLTLHGERYRLAARIDVEDNGPGIPPHLQDTL FYPMVSGREGGTGLGLSIARNLIDQHSGKIEFTSWPGHTEFSVYLPIRK >gi|296494405|gb|ADTN01000333.1| GENE 6 5358 - 6767 1655 469 aa, chain - ## HITS:1 COG:ECs4792 KEGG:ns NR:ns ## COG: ECs4792 COG0174 # Protein_GI_number: 15834046 # Func_class: E Amino acid transport and metabolism # Function: Glutamine synthetase # Organism: Escherichia coli O157:H7 # 1 469 1 469 469 961 100.0 0 MSAEHVLTMLNEHEVKFVDLRFTDTKGKEQHVTIPAHQVNAEFFEEGKMFDGSSIGGWKG INESDMVLMPDASTAVIDPFFADSTLIIRCDILEPGTLQGYDRDPRSIAKRAEDYLRSTG IADTVLFGPEPEFFLFDDIRFGSSISGSHVAIDDIEGAWNSSTQYEGGNKGHRPAVKGGY FPVPPVDSAQDIRSEMCLVMEQMGLVVEAHHHEVATAGQNEVATRFNTMTKKADEIQIYK YVVHNVAHRFGKTATFMPKPMFGDNGSGMHCHMSLSKNGVNLFAGDKYAGLSEQALYYIG GVIKHAKAINALANPTTNSYKRLVPGYEAPVMLAYSARNRSASIRIPVVSSPKARRIEVR FPDPAANPYLCFAALLMAGLDGIKNKIHPGEAMDKNLYDLPPEEAKEIPQVAGSLEEALN ELDLDREFLKAGGVFTDEAIDAYIALRREEDDRVRMTPHPVEFELYYSV 
>gi|296494405|gb|ADTN01000333.1| GENE 7 7140 - 8963 2255 607 aa, chain + ## HITS:1 COG:ECs4793 KEGG:ns NR:ns ## COG: ECs4793 COG1217 # Protein_GI_number: 15834047 # Func_class: T Signal transduction mechanisms # Function: Predicted membrane GTPase involved in stress response # Organism: Escherichia coli O157:H7 # 1 607 1 607 607 1205 100.0 0 MIEKLRNIAIIAHVDHGKTTLVDKLLQQSGTFDSRAETQERVMDSNDLEKERGITILAKN TAIKWNDYRINIVDTPGHADFGGEVERVMSMVDSVLLVVDAFDGPMPQTRFVTKKAFAYG LKPIVVINKVDRPGARPDWVVDQVFDLFVNLDATDEQLDFPIVYASALNGIAGLDHEDMA EDMTPLYQAIVDHVPAPDVDLDGPFQMQISQLDYNSYVGVIGIGRIKRGKVKPNQQVTII DSEGKTRNAKVGKVLGHLGLERIETDLAEAGDIVAITGLGELNISDTVCDTQNVEALPAL SVDEPTVSMFFCVNTSPFCGKEGKFVTSRQILDRLNKELVHNVALRVEETEDADAFRVSG RGELHLSVLIENMRREGFELAVSRPKVIFREIDGRKQEPYENVTLDVEEQHQGSVMQALG ERKGDLKNMNPDGKGRVRLDYVIPSRGLIGFRSEFMTMTSGTGLLYSTFSHYDDVRPGEV GQRQNGVLISNGQGKAVAFALFGLQDRGKLFLGHGAEVYEGQIIGIHSRSNDLTVNCLTG KKLTNMRASGTDEAVVLVPPIRMTLEQALEFIDDDELVEVTPTSIRIRKRHLTENDRRRA NRAPKDD Prediction of potential genes in microbial genomes Time: Mon May 16 00:16:22 2011 Seq name: gi|296494404|gb|ADTN01000334.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont849.6, whole genome shotgun sequence Length of sequence - 10244 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 4, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 10.0 1 1 Op 1 . + CDS 223 - 933 529 ## COG2188 Transcriptional regulators 2 1 Op 2 . + CDS 941 - 1921 721 ## JW3844 predicted sugar phosphate isomerase + Prom 1926 - 1985 2.3 3 2 Tu 1 . + CDS 2023 - 3288 1206 ## COG0477 Permeases of the major facilitator superfamily + Term 3340 - 3382 3.5 - Term 3281 - 3317 4.8 4 3 Op 1 . - CDS 3379 - 4071 486 ## JW3846 predicted outer membrane porin L 5 3 Op 2 3/0.000 - CDS 4139 - 5542 1146 ## COG2211 Na+/melibiose symporter and related transporters 6 3 Op 3 3/0.000 - CDS 5585 - 6970 1617 ## COG2211 Na+/melibiose symporter and related transporters 7 3 Op 4 . 
- CDS 7016 - 9052 2132 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases - Prom 9137 - 9196 6.5 - Term 9107 - 9154 -0.7 8 4 Tu 1 . - CDS 9251 - 10153 496 ## COG2017 Galactose mutarotase and related enzymes Predicted protein(s) >gi|296494404|gb|ADTN01000334.1| GENE 1 223 - 933 529 236 aa, chain + ## HITS:1 COG:ECs4794 KEGG:ns NR:ns ## COG: ECs4794 COG2188 # Protein_GI_number: 15834048 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 236 1 236 236 475 100.0 1e-134 MAENQSTVENAKEKLDRWLKDGITTPGGKLPSERELGELLGIKRMTLRQALLNLEAESKI FRKDRKGWFVTQPRFNYSPELSASFQRAAIEQGREPSWGFTEKNRTSDIPETLAPLIAVT PSTELYRITGWGALEGHKVFYHETYINPEVAPGFIEQLENHSFSAVWEKCYQKETVVKKL IFKPVRMPGDISKYLGGSAGMPAILIEKHRADQQGNIVQIDIEYWRFEAVDLIINL >gi|296494404|gb|ADTN01000334.1| GENE 2 941 - 1921 721 326 aa, chain + ## HITS:1 COG:no KEGG:JW3844 NR:ns ## KEGG: JW3844 # Name: yihM # Def: predicted sugar phosphate isomerase # Organism: E.coli_J # Pathway: not_defined # 1 326 1 326 326 635 100.0 0 MVTINNARKILQRVDTLPLYLHAYAFHLNMRLERVLPADLLDIASENNLRGVKIHVLDGE RFSLGNMDDKELSAFGDKARRLNLDIHIETSASDKASIDEAVAIALKTGASSVRFYPRYE GNLRDVLSIIANDIAYVRETYQDSGLTFTIEQHEDLKSHELVSLVKESEMESLSLLFDFA NMINANEHPIDALKTMAPHITQVHIKDALIVKEPGGLGHKACISGQGDMPFKALLTHLIC LGDDEPQVTAYGLEEEVDYYAPAFRFEDEDDNPWIPYRQMSETPLPENHLLDARLRKEKE DAINQINHVRNVLQQIKQEANHLLNH >gi|296494404|gb|ADTN01000334.1| GENE 3 2023 - 3288 1206 421 aa, chain + ## HITS:1 COG:yihN KEGG:ns NR:ns ## COG: yihN COG0477 # Protein_GI_number: 16131714 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 1 421 1 421 421 774 100.0 0 MLTKKKWALFSLLTLCGGTIYKLPSLKDAFYIPMQEYFHLTNGQIGNAMSVNSFVTTVGF FLSIYFADKLPRRYTMSFSLIATGLLGVYLTTMPGYWGILFVWALFGVTCDMMNWPVLLK SVSRLGNSEQQGRLFGFFETGRGIVDTVVAFSALAVFTWFGSGLLGFKAGIWFYSLIVIA 
VGIIIFFVLNDKEEAPSVEVKKEDGASKNTSMTSVLKDKTIWLIAFNVFFVYAVYCGLTF FIPFLKNIYLLPVALVGAYGIINQYCLKMIGGPIGGMISDKILKSPSKYLCYTFIISTAA LVLLIMLPHESMPVYLGMACTLGFGAIVFTQRAVFFAPIGEAKIAENKTGAAMALGSFIG YAPAMFCFSLYGYILDLNPGIIGYKIVFGIMACFAFSGAVVSVMLVKRISQRKKEMLAAE A >gi|296494404|gb|ADTN01000334.1| GENE 4 3379 - 4071 486 230 aa, chain - ## HITS:1 COG:no KEGG:JW3846 NR:ns ## KEGG: JW3846 # Name: ompL # Def: predicted outer membrane porin L # Organism: E.coli_J # Pathway: not_defined # 1 230 1 230 230 442 100.0 1e-123 MKKINAIILLSSLTSASVFAGAYVENREAYNLASDQGEVMLRVGYNFDMGAGIMLTNTYN FQREDELKHGYNEIEGWYPLFKPTDKLTIQPGGLINDKSIGSGGAVYLDVNYKFVPWFNL TVRNRYNHNNYSSTDLSGELDNNDTYEIGTYWNFKITDKFSYTFEPHYFMRVNDFNSSNG KDHHWEITNTFRYRINEHWLPYFELRWLDRNVEPYHREQNQIRIGTKYFF >gi|296494404|gb|ADTN01000334.1| GENE 5 4139 - 5542 1146 467 aa, chain - ## HITS:1 COG:yihO KEGG:ns NR:ns ## COG: yihO COG2211 # Protein_GI_number: 16131716 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli K12 # 1 463 3 465 487 890 99.0 0 MSDHNPLTLKLNLREKIAYGMGDVGSNLMLCIGTLYLLKFYTDELGMPAYYGGIIFLVAK FFTAFTDMLTGFLLDSRKNIGPKGKFRPFILYAAVPAALIATLQFIATTFCLPVKTTIAT ALFMMFGLSYSLMNCSYGAMIPAITKNPNERAQLAAYRQGGATIGLLICTVAFIPLQSLF SDSTVGYACAALMFSIGGFIFMMLCYRGVKEHYVDTTPTGHKASILKSFCAIFRNPPLLV LCIANLCTLAAFNIKLAIQVYYTQYVLNDINLLSWMGFFSMGCILIGVLLAPLTVKCFGK KQVYLAGMVLWAVGDILNYFWGSNSFTFVMFSCVAFFGTAFVNSLNWALVPDTVDYGEWK TGIRAEGSVYTGYTFFRKISAALAGFLPGIMLTQIGYVPNIAQSDATLQGLRQLIFIWPC ALAIIAALTMGFFYTLNEKRFALIIEEINQRKNKEMATEEKTASVTL >gi|296494404|gb|ADTN01000334.1| GENE 6 5585 - 6970 1617 461 aa, chain - ## HITS:1 COG:yihP KEGG:ns NR:ns ## COG: yihP COG2211 # Protein_GI_number: 16131717 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli K12 # 1 461 8 468 468 877 99.0 0 MSHITTEDPATLRLPFKEKLSYGIGDLASNILLDIGTLYLLKFYTDVLGLPGTYGGIIFL ISKFFTAFTDMGTGIMLDSRRKIGPKGKFRPFILYASFPVTLLAIANFVGTPFDVTGKTV 
MATILFMLYGLFFSMMNCSYGAMVPAITKNPNERASLAAWRQGGATLGLLLCTVGFVPVM NLIEGNQQLGYIFAATLFSLFGLLFMWICYSGVKERYVETQPANPAQKPGLLQSFRAIAG NRPLFILCIANLCTLGAFNVKLAIQVYYTQYVLNDPILLSYMGFFSMGCIFIGVFLMPAS VRRFGKKKVYIGGLLIWVLGDLLNYFFGGGSVSFVAFSCLALFGSAFVNSLNWALVSDTV EYGEWRTGVRSEGTVYTGFTFFRKVSQALAGFFPGWMLTQIGYVPNVAQADHTIEGLRQL IFIYPSALAVVTIVAMGWFYSLNEKMYVRIVEEIEARKRTA >gi|296494404|gb|ADTN01000334.1| GENE 7 7016 - 9052 2132 678 aa, chain - ## HITS:1 COG:yihQ KEGG:ns NR:ns ## COG: yihQ COG1501 # Protein_GI_number: 16131718 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Escherichia coli K12 # 1 678 1 678 678 1418 100.0 0 MDTPRPQLLDFQFHQNNDSFTLHFQQRLILTHSKDNPCLWIGSGIADIDMFRGNFSIKDK LQEKIALTDAIVSQSPDGWLIHFSRGSDISATLNISADDQGRLLLELQNDNLNHNRIWLR LAAQPEDHIYGCGEQFSYFDLRGKPFPLWTSEQGVGRNKQTYVTWQADCKENAGGDYYWT FFPQPTFVSTQKYYCHVDNSCYMNFDFSAPEYHELALWEDKATLRFECADTYISLLEKLT ALLGRQPELPDWIYDGVTLGIQGGTEVCQKKLDTMRNAGVKVNGIWAQDWSGIRMTSFGK RVMWNWKWNSENYPQLDSRIKQWNQEGVQFLAYINPYVASDKDLCEEAAQHGYLAKDASG GDYLVEFGEFYGGVVDLTNPEAYAWFKEVIKKNMIELGCGGWMADFGEYLPTDTYLHNGV SAEIMHNAWPALWAKCNYEALEETGKLGEILFFMRAGSTGSQKYSTMMWAGDQNVDWSLD DGLASVVPAALSLAMTGHGLHHSDIGGYTTLFEMKRSKELLLRWCDFSAFTPMMRTHEGN RPGDNWQFDGDAETIAHFARMTTVFTTLKPYLKEAVALNAKSGLPVMRPLFLHYEDDAHT YTLKYQYLLGRDILVAPVHEEGRSDWTLYLPEDNWVHAWTGEAFRGGEVTVNAPIGKPPV FYRADSEWAALFASLKSI >gi|296494404|gb|ADTN01000334.1| GENE 8 9251 - 10153 496 300 aa, chain - ## HITS:1 COG:yihR KEGG:ns NR:ns ## COG: yihR COG2017 # Protein_GI_number: 16131719 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Escherichia coli K12 # 1 300 9 308 308 612 100.0 1e-175 MQITNMHCSGQTVSLAAGDYHATIVTVGAGLAELTFQGCHLVIPHKPEEMPLAHLGKVLI PWPNRIANGCYRYQGQEYQLPINEHSSKAAIHGLLAWRDWQISELTATSVTLTAFLPPSY GYPFMLASQVVYSLNAHTGLSVEIASQNIGTVAAPYGVGIHPYLTCNLTSVDEYLFQLPA NQVYAVDEHANPTTLHHVDELDLNFTQAKKIAATKIDHTFKTANDLWEMTITHPQQALSV 
SLCSDQLWVQVYSGEKLQRQGLAVEPMSCPPNAFNSGIDLLLLESGKPHRLFFNIYGQRK Prediction of potential genes in microbial genomes Time: Mon May 16 00:16:36 2011 Seq name: gi|296494403|gb|ADTN01000335.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont849.7, whole genome shotgun sequence Length of sequence - 16910 bp Number of predicted genes - 19, with homology - 18 Number of transcription units - 8, operones - 5 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 3/0.500 - CDS 65 - 1306 1406 ## COG2942 N-acyl-D-glucosamine 2-epimerase 2 1 Op 2 4/0.000 - CDS 1323 - 2201 966 ## COG3684 Tagatose-1,6-bisphosphate aldolase 3 1 Op 3 . - CDS 2225 - 3121 947 ## COG2084 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases + Prom 3205 - 3264 4.1 4 2 Op 1 3/0.500 + CDS 3289 - 4185 641 ## COG0524 Sugar kinases, ribokinase family 5 2 Op 2 1/1.000 + CDS 4219 - 5004 517 ## COG1349 Transcriptional regulators of sugar metabolism + Prom 5020 - 5079 2.1 6 3 Op 1 2/1.000 + CDS 5103 - 5702 753 ## COG1011 Predicted hydrolase (HAD superfamily) 7 3 Op 2 7/0.000 + CDS 5696 - 6568 668 ## COG1295 Predicted membrane protein 8 3 Op 3 5/0.000 + CDS 6565 - 7002 458 ## COG1490 D-Tyr-tRNAtyr deacylase 9 3 Op 4 . + CDS 7047 - 7988 1061 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Term 7989 - 8038 5.1 + Prom 8311 - 8370 3.0 10 4 Tu 1 . + CDS 8399 - 8506 77 ## + Term 8510 - 8545 3.3 + Prom 8607 - 8666 8.5 11 5 Op 1 . + CDS 8841 - 9059 122 ## SFV_3606 hypothetical protein + Prom 9090 - 9149 4.8 12 5 Op 2 . 
+ CDS 9277 - 9519 130 ## SSON_4059 hypothetical protein + Term 9643 - 9681 4.4 13 6 Op 1 9/0.000 - CDS 9849 - 10778 1032 ## COG3058 Uncharacterized protein involved in formate dehydrogenase formation 14 6 Op 2 12/0.000 - CDS 10775 - 11410 596 ## COG2864 Cytochrome b subunit of formate dehydrogenase 15 6 Op 3 16/0.000 - CDS 11407 - 12309 949 ## COG0437 Fe-S-cluster-containing hydrogenase components 1 16 6 Op 4 5/0.000 - CDS 12322 - 14733 2770 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing 17 6 Op 5 . - CDS 14782 - 15369 580 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing - Prom 15555 - 15614 6.2 + Prom 15370 - 15429 4.2 18 7 Tu 1 3/0.500 + CDS 15601 - 16482 735 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Prom 16644 - 16703 5.0 19 8 Tu 1 . + CDS 16742 - 16888 126 ## COG3203 Outer membrane protein (porin) Predicted protein(s) >gi|296494403|gb|ADTN01000335.1| GENE 1 65 - 1306 1406 413 aa, chain - ## HITS:1 COG:yihS KEGG:ns NR:ns ## COG: yihS COG2942 # Protein_GI_number: 16131720 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Escherichia coli K12 # 1 413 6 418 418 836 100.0 0 MKWFNTLSHNRWLEQETDRIFDFGKNSVVPTGFGWLGNKGQIKEEMGTHLWITARMLHVY SVAAAMGRPGAYSLVDHGIKAMNGALRDKKYGGWYACVNDEGVVDASKQGYQHFFALLGA ASAVTTGHPEARKLLDYTIEIIEKYFWSEEEQMCLESWDEAFSKTEEYRGGNANMHAVEA FLIVYDVTHDKKWLDRAIRVASVIIHDVARNNHYRVNEHFDTQWNPLPDYNKDNPAHRFR AFGGTPGHWIEWGRLMLHIHAALEARCEQPPAWLLEDAKGLFNATVRDAWAPDGADGIVY TVDWEGKPVVRERVRWPIVEAMGTAYALYTVTGDRQYETWYQTWWEYCIKYLMDYENGSW WQELDADNKVTTKVWDGKQDIYHLLHCLVIPRIPLAPGMAPAVAAGLLDINAK >gi|296494403|gb|ADTN01000335.1| GENE 2 1323 - 2201 966 292 aa, chain - ## HITS:1 COG:yihT KEGG:ns NR:ns ## COG: yihT COG3684 # Protein_GI_number: 16131721 # Func_class: G Carbohydrate transport and metabolism # Function: Tagatose-1,6-bisphosphate aldolase # Organism: Escherichia coli K12 # 1 292 1 292 292 556 100.0 1e-158 
MNKYTINDITRASGGFAMLAVDQREAMRMMFAAAGAPAPVADSVLTDFKVNAAKALSPYA SAILVDQQFCYRQVVEQNAIAKSCAMIVAADEFIPGNGIPVDSVVIDRKINPLQIKQDGG KALKLLVLWRSDEDAQQRLDMVKEFNELCHSHGLVSIIEPVVRPPRRGDKFDREQAIIDA AKELGDSGADLYKVEMPLYGKGPQQELLCASQRLNDHINMPWVILSSGVDEKLFPRAVRV AMTAGASGFLAGRAVWASVVGLPDNELMLRDVCAPKLQQLGDIVDEMMAKRR >gi|296494403|gb|ADTN01000335.1| GENE 3 2225 - 3121 947 298 aa, chain - ## HITS:1 COG:yihU KEGG:ns NR:ns ## COG: yihU COG2084 # Protein_GI_number: 16131722 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases # Organism: Escherichia coli K12 # 1 298 1 298 298 542 100.0 1e-154 MAAIAFIGLGQMGSPMASNLLQQGHQLRVFDVNAEAVRHLVDKGATPAANPAQAAKDAEF IITMLPNGDLVRNVLFGENGVCEGLSTDALVIDMSTIHPLQTDKLIADMQAKGFSMMDVP VGRTSANAITGTLLLLAGGTAEQVERATPILMAMGSELINAGGPGMGIRVKLINNYMSIA LNALSAEAAVLCEALNLPFDVAVKVMSGTAAGKGHFTTSWPNKVLSGDLSPAFMIDLAHK DLGIALDVANQLHVPMPLGAASREVYSQARAAGRGRQDWSAILEQVRVSAGMTAKVKM >gi|296494403|gb|ADTN01000335.1| GENE 4 3289 - 4185 641 298 aa, chain + ## HITS:1 COG:yihV KEGG:ns NR:ns ## COG: yihV COG0524 # Protein_GI_number: 16131723 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli K12 # 1 298 3 300 300 572 100.0 1e-163 MIRVACVGITVMDRIYYVEGLPTESGKYVARNYTEVGGGPAATAAVAAARLGAQVDFIGR VGDDDTGNSLLAELESWGVNTRYTKRYNQAKSSQSAIMVDTKGERIIINYPSPDLLPDAE WLEEIDFSQWDVVLADVRWHDGAKKAFTLARQAGVMTVLDGDITPQDISELVALSDHAAF SEPGLARLTGVKEMASALKQAQTLTNGHVYVTQGSAGCDWLENGGRQHQPAFKVDVVDTT GAGDVFHGALAVALATSGDLAESVRFASGVAALKCTRPGGRAGIPDCDQTRSFLSLFV >gi|296494403|gb|ADTN01000335.1| GENE 5 4219 - 5004 517 261 aa, chain + ## HITS:1 COG:ECs4807 KEGG:ns NR:ns ## COG: ECs4807 COG1349 # Protein_GI_number: 15834061 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli O157:H7 # 1 261 9 269 269 486 100.0 1e-137 
MSLTELTGNPRHDQLLMLIAERGYMNIDELANLLDVSTQTVRRDIRKLSEQGLITRHHGG AGRASSVVNTAFEQREVSQTEEKKAIAEAVADYIPDGSTIFITIGTTVEHVARALLNHNH LRIITNSLRVAHILYHNPRFEVMVPGGTLRSHNSGIIGPSAASFVADFRADYLVTSVGAI ESDGALMEFDVNEANVVKTMMAHARNILLVADHTKYHASAAVEIGNVAQVTALFTDELPP AALKSRLQDSQIEIILPQEDA >gi|296494403|gb|ADTN01000335.1| GENE 6 5103 - 5702 753 199 aa, chain + ## HITS:1 COG:yihX KEGG:ns NR:ns ## COG: yihX COG1011 # Protein_GI_number: 16131725 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Escherichia coli K12 # 1 199 8 206 206 414 100.0 1e-116 MLYIFDLGNVIVDIDFNRVLGAWSDLTRIPLASLKKSFHMGEAFHQHERGEISDEAFAEA LCHEMALPLSYEQFSHGWQAVFVALRPEVIAIMHKLREQGHRVVVLSNTNRLHTTFWPEE YPEIRDAADHIYLSQDLGMRKPEARIYQHVLQAEGFSPSDTVFFDDNADNIEGANQLGIT SILVKDKTTIPDYFAKVLC >gi|296494403|gb|ADTN01000335.1| GENE 7 5696 - 6568 668 290 aa, chain + ## HITS:1 COG:ECs4809 KEGG:ns NR:ns ## COG: ECs4809 COG1295 # Protein_GI_number: 15834063 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli O157:H7 # 1 290 1 290 290 528 100.0 1e-150 MLKTIQDKARHRTRPLWAWLKLLWQRIDEDNMTTLAGNLAYVSLLSLVPLVAVVFALFAA FPMFSDVSIQLRHFIFANFLPATGDVIQRYIEQFVANSNKMTAVGACGLIVTALLLMYSI DSALNTIWRSKRARPKIYSFAVYWMILTLGPLLAGASLAISSYLLSLRWASDLNTVIDNV LRIFPLLLSWISFWLLYSIVPTIRVPNRDAIVGAFVAALLFEAGKKGFALYITMFPSYQL IYGVLAVIPILFVWVYWTWCIVLLGAEITVTLGEYRKLKQAAEQEEDDEP >gi|296494403|gb|ADTN01000335.1| GENE 8 6565 - 7002 458 145 aa, chain + ## HITS:1 COG:ECs4810 KEGG:ns NR:ns ## COG: ECs4810 COG1490 # Protein_GI_number: 15834064 # Func_class: J Translation, ribosomal structure and biogenesis # Function: D-Tyr-tRNAtyr deacylase # Organism: Escherichia coli O157:H7 # 1 145 1 145 145 276 100.0 8e-75 MIALIQRVTRASVTVEGEVTGEIGAGLLVLLGVEKDDDEQKANRLCERVLGYRIFSDAEG KMNLNVQQAGGSVLVVSQFTLAADTERGMRPSFSKGASPDRAEALYDYFVERCRQQEMNT QTGRFAADMQVSLVNDGPVTFWLQV >gi|296494403|gb|ADTN01000335.1| GENE 9 7047 - 7988 1061 313 aa, chain + ## HITS:1 COG:ECs4811 KEGG:ns NR:ns ## COG: ECs4811 
COG0454 # Protein_GI_number: 15834065 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Escherichia coli O157:H7 # 1 313 17 329 329 625 100.0 1e-179 MYHLRVPQTEEELERYYQFRWEMLRKPLHQPKGSERDAWDAMAHHQMVVDEQGNLVAVGR LYINADNEASIRFMAVHPDVQDKGLGTLMAMTLESVARQEGVKRVTCSAREDAVEFFAKL GFVNQGEITTPTTTPIRHFLMIKPVATLDDILHRGDWCAQLQQAWYEHIPLSEKMGVRIQ QYTGQKFITTMPETGNQNPHHTLFAGSLFSLATLTGWGLIWLMLRERHLGGTIILADAHI RYSKPISGKPHAVADLGALSGDLDRLARGRKARVQMQVEIFGDETPGAVFEGTYIVLPAK PFGPYEEGGNEEE >gi|296494403|gb|ADTN01000335.1| GENE 10 8399 - 8506 77 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVANINLIKQESYSVVNLEKQLSRTVTNKIITMSK >gi|296494403|gb|ADTN01000335.1| GENE 11 8841 - 9059 122 72 aa, chain + ## HITS:1 COG:no KEGG:SFV_3606 NR:ns ## KEGG: SFV_3606 # Name: yiiE # Def: hypothetical protein # Organism: S.flexneri_8401 # Pathway: not_defined # 1 72 49 120 120 133 100.0 2e-30 MAMNTVFLHLSEEAIKRLNKLRGWRKVSRSAILREAVEQYLERQQFPVRKAKGGRQKGEV VGVDDQCKEHKE >gi|296494403|gb|ADTN01000335.1| GENE 12 9277 - 9519 130 80 aa, chain + ## HITS:1 COG:no KEGG:SSON_4059 NR:ns ## KEGG: SSON_4059 # Name: yiiF # Def: hypothetical protein # Organism: S.sonnei # Pathway: not_defined # 1 80 1 80 80 143 100.0 2e-33 MNSLAGIDMGRILLDLSNEVIKQLDDLEVQRNLPRADLLREAVDQYLINQSQTARTSVPG IWQGCEEDGVEYQRKLREEW >gi|296494403|gb|ADTN01000335.1| GENE 13 9849 - 10778 1032 309 aa, chain - ## HITS:1 COG:fdhE KEGG:ns NR:ns ## COG: fdhE COG3058 # Protein_GI_number: 16131731 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Uncharacterized protein involved in formate dehydrogenase formation # Organism: Escherichia coli K12 # 1 309 1 309 309 618 100.0 1e-177 MSIRIIPQDELGSSEKRTADMIPPLLFPRLKNLYNRRAERLRELAENNPLGDYLRFAALI AHAQEVVLYDHPLEMDLTARIKEASAQGKPPLDIHVLPRDKHWQKLLMALIAELKPEMSG PALAVIENLEKASTQELEDMASALFASDFSSVSSDKAPFIWAALSLYWAQMANLIPGKAR AEYGEQRQYCPVCGSMPVSSMVQIGTTQGLRYLHCNLCETEWHVVRVKCSNCEQSGKLHY 
WSLDDEQAAIKAESCDDCDTYLKILYQEKDPKIEAVADDLASLVLDARMEQEGYARSSIN PFLFPGEGE >gi|296494403|gb|ADTN01000335.1| GENE 14 10775 - 11410 596 211 aa, chain - ## HITS:1 COG:ECs4818 KEGG:ns NR:ns ## COG: ECs4818 COG2864 # Protein_GI_number: 15834072 # Func_class: C Energy production and conversion # Function: Cytochrome b subunit of formate dehydrogenase # Organism: Escherichia coli O157:H7 # 1 211 1 211 211 390 100.0 1e-108 MKRRDTIVRYTAPERINHWITAFCFILAAVSGLGFLFPSFNWLMQIMGTPQLARILHPFV GVVMFASFIIMFFRYWHHNLINRDDIFWAKNIRKIVVNEEVGDTGRYNFGQKCVFWAAII FLVLLLVSGVIIWRPYFAPAFSIPVIRFALMLHSFAAVALIVVIMVHIYAALWVKGTITA MVEGWVTSAWAKKHHPRWYREVRKTTEKKAE >gi|296494403|gb|ADTN01000335.1| GENE 15 11407 - 12309 949 300 aa, chain - ## HITS:1 COG:fdoH KEGG:ns NR:ns ## COG: fdoH COG0437 # Protein_GI_number: 16131733 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 1 # Organism: Escherichia coli K12 # 1 300 1 300 300 607 95.0 1e-174 MAMETQDIIKRSATNSITPPSQVRDYKAEVAKLIDVSTCIGCKACQVACSEWNDIRDTVG NNIGVYDNPNDLSAKSWTVMRFSEVEQNDKLEWLIRKDGCMHCSDPGCLKACPAEGAIIQ YANGIVDFQSEQCIGCGYCIAGCPFDIPRLNPEDNRVYKCTLCVDRVVVGQEPACVKTCP TGAIHFGTKESMKTLASERVAELKTRGYDNAGLYDPAGVGGTHVMYVLHHADKPNLYHGL PENPEISETVKFWKGIWKPLAAVGFAATFAASIFHYVGVGPNRADEEENNLHEEKDEERK >gi|296494403|gb|ADTN01000335.1| GENE 16 12322 - 14733 2770 803 aa, chain - ## HITS:1 COG:fdnG KEGG:ns NR:ns ## COG: fdnG COG0243 # Protein_GI_number: 16129433 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 803 213 1015 1015 1677 99.0 0 MTNHWVDIKNANVVMVMGGNAAEAHPVGFRWAMEAKNNNDATLIVVDPRFTRTASVADIY APIRSGTDITFLSGVLRYLIENNKINAEYVKHYTNASLLVRDDFAFEDGLFSGYDAEKRQ YDKSSWNYQFDENGYAKRDETLTHPRCVWNLLKEHVSRYTPDVVENICGTPKADFLKVCE VLASTSAPDRTTTFLYALGWTQHTVGAQNIRTMAMIQLLLGNMGMAGGGVNALRGHSNIQ GLTDLGLLSTSLPGYLTLPSEKQVDLQSYLEANTPKATLADQVNYWSNYPKFFVSLMKSF YGDAAQKENNWGYDWLPKWDQTYDVIKYFNMMDEGKVTGYFCQGFNPVASFPDKNKVVSC 
LSKLKYMVVIDPLVTETSTFWQNHGESNDVDPASIQTEVFRLPSTCFAEEDGSIANSGRW LQWHWKGQDAPGEARNDGEILAGIYHHLRELYQSEGGKGVEPLMKMSWNYKQPHEPQSDE VAKENNGYALEDLYDANGVLIAKKGQLLSSFAHLRDDGTTASSCWIYTGSWTEQGNQMAN RDNSDPSGLGNTLGWAWAWPLNRRVLYNRASADINGKPWDPKRMLIQWNGSKWTGNDIPD FGNAAPGTPTGPFIMQPEGMGRLFAINKMAEGPFPEHYEPIETPLGTNPLHPNVVSNPVV RLYEQDALRMGKKEQFPYVGTTYRLTEHFHTWTKHALLNAIAQPEQFVEISETLAAAKGI NNGDRVTVSSKRGFIRAVAVVTRRLKPLNVNGQQVETVGIPIHWGFEGVARKGYIANTLT PNVGDANSQTPEYKAFLVNIEKA >gi|296494403|gb|ADTN01000335.1| GENE 17 14782 - 15369 580 195 aa, chain - ## HITS:1 COG:fdnG KEGG:ns NR:ns ## COG: fdnG COG0243 # Protein_GI_number: 16129433 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Escherichia coli K12 # 1 195 1 195 1015 412 100.0 1e-115 MDVSRRQFFKICAGGMAGTTVAALGFAPKQALAQARNYKLLRAKEIRNTCTYCSVGCGLL MYSLGDGAKNAREAIYHIEGDPDHPVSRGALCPKGAGLLDYVNSENRLRYPEYRAPGSDK WQRISWEEAFSRIAKLMKADRDANFIEKNEQGVTVNRWLSTGMLCASGASNETGMLTQKF ARSLGMLAVDNQARV >gi|296494403|gb|ADTN01000335.1| GENE 18 15601 - 16482 735 293 aa, chain + ## HITS:1 COG:yddG KEGG:ns NR:ns ## COG: yddG COG0697 # Protein_GI_number: 16129432 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli K12 # 1 293 1 293 293 506 100.0 1e-143 MTRQKATLIGLIAIVLWSTMVGLIRGVSEGLGPVGGAAAIYSLSGLLLIFTVGFPRIRQI PKGYLLAGSLLFVSYEICLALSLGYAATHHQAIEVGMVNYLWPSLTILFAILFNGQKTNW LIVPGLLLALVGVCWVLGGDNGLHYDEIINNITTSPLSYFLAFIGAFIWAAYCTVTNKYA RGFNGITVFVLLTGASLWVYYFLTPQPEMIFSTPVMIKLISAAFTLGFAYAAWNVGILHG NVTIMAVGSYFTPVLSSALAAVLLSAPLSFSFWQGALMVCGGSLLCWLATRRG >gi|296494403|gb|ADTN01000335.1| GENE 19 16742 - 16888 126 48 aa, chain + ## HITS:1 COG:ECs2076 KEGG:ns NR:ns ## COG: ECs2076 COG3203 # Protein_GI_number: 15831330 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein (porin) # 
Organism: Escherichia coli O157:H7 # 1 48 1 48 366 90 97.0 8e-19 MKLKIVAVVVTGLLAANVAHAAEVYNKDGNKLDLYGKVTALRYFTDDR Prediction of potential genes in microbial genomes Time: Mon May 16 00:16:44 2011 Seq name: gi|296494402|gb|ADTN01000336.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont870.1, whole genome shotgun sequence Length of sequence - 929 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 23 - 844 551 ## APECO1_O1CoBM24 hypothetical protein - Prom 864 - 923 6.2 Predicted protein(s) >gi|296494402|gb|ADTN01000336.1| GENE 1 23 - 844 551 273 aa, chain - ## HITS:1 COG:no KEGG:APECO1_O1CoBM24 NR:ns ## KEGG: APECO1_O1CoBM24 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 273 39 311 311 553 98.0 1e-156 MRLASRFGRYNSIRRERPLTDDELMQFVPSVFSGDKHESRSERYTYIPTINIINKLRDEG FQPFFACQSRVRDLGRREYSKHMLRLRREGHINGQEVPEIILLNSHDGSSSYQMIPGIFR FVCTNGLVCGNNFGEIRVPHKGDIVGQVIEGAYEVLGIFDKVTDNMEAMKEIHLNSDEQH LFGRAALMARYEDENKTPVTPEQIITPRRREDKQNDLWTTWQRVQENMIKGGLSGRSASG KNTRTRAITGIDGDIRINKALWGIAEQFRKWKS Prediction of potential genes in microbial genomes Time: Mon May 16 00:16:49 2011 Seq name: gi|296494401|gb|ADTN01000337.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont875.1, whole genome shotgun sequence Length of sequence - 1224 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 520 - 1224 591 ## E2348_P1_035 replication protein Predicted protein(s) >gi|296494401|gb|ADTN01000337.1| GENE 1 520 - 1224 591 234 aa, chain - ## HITS:1 COG:no KEGG:E2348_P1_035 NR:ns ## KEGG: E2348_P1_035 # Name: repA # Def: replication protein # Organism: E.coli_0127 # Pathway: not_defined # 1 234 52 285 285 391 99.0 1e-108 AHARSRGLRRRMPPVLRRRAIDALLQGLCFHYDPLANRVQCSITTLAIECGLATESGAGK LSITRATRALTFLSELGLITYQTEYDPLIGCYIPTDITFTPALFAALDVSEVAVASARRS RVEWENRQRKKQGLDALGMDELIAKAWRFVRERFRSYQTELKSRGIKRARARRDANRERQ DIVTLVKRQLTREISEGRFTANREAVKREVERRVKERMILSRNRNYSRLATASP Prediction of potential genes in microbial genomes Time: Mon May 16 00:16:55 2011 Seq name: gi|296494400|gb|ADTN01000338.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont896.1, whole genome shotgun sequence Length of sequence - 9799 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 4/1.000 - CDS 136 - 765 394 ## COG2977 Phosphopantetheinyl transferase component of siderophore synthetase - Prom 816 - 875 9.7 - Term 782 - 818 2.4 2 2 Tu 1 . - CDS 931 - 3171 2567 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins - Prom 3347 - 3406 4.5 + Prom 3308 - 3367 4.8 3 3 Op 1 2/1.000 + CDS 3615 - 4616 769 ## COG2382 Enterochelin esterase and related enzymes 4 3 Op 2 2/1.000 + CDS 4619 - 4837 252 ## COG3251 Uncharacterized protein conserved in bacteria 5 3 Op 3 4/1.000 + CDS 4834 - 8715 3944 ## COG1020 Non-ribosomal peptide synthetase modules and related proteins + Prom 8745 - 8804 7.0 6 4 Tu 1 . 
+ CDS 8931 - 9776 754 ## COG3765 Chain length determinant protein Predicted protein(s) >gi|296494400|gb|ADTN01000338.1| GENE 1 136 - 765 394 209 aa, chain - ## HITS:1 COG:entD KEGG:ns NR:ns ## COG: entD COG2977 # Protein_GI_number: 16128566 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Phosphopantetheinyl transferase component of siderophore synthetase # Organism: Escherichia coli K12 # 1 209 1 209 209 428 100.0 1e-120 MVDMKTTHTSLPFAGHTLHFVEFDPANFCEQDLLWLPHYAQLQHAGRKRKTEHLAGRIAA VYALREYGYKCVPAIGELRQPVWPAEVYGSISHCGTTALAVVSRQPIGIDIEEIFSVQTA RELTDNIITPAEHERLADCGLAFSLALTLAFSAKESAFKASEIQTDAGFLDYQIISWNKQ QVIIHRENEMFAVHWQIKEKIVITLCQHD >gi|296494400|gb|ADTN01000338.1| GENE 2 931 - 3171 2567 746 aa, chain - ## HITS:1 COG:fepA KEGG:ns NR:ns ## COG: fepA COG4771 # Protein_GI_number: 16128567 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Escherichia coli K12 # 1 746 1 746 746 1444 99.0 0 MNKKIHSLALLVNLGIYGVAQAQEPTDTPVSHDDTIVVTAAEQNLQAPGVSTITADEIRK NPVARDVSEIIRTMPGVNLTGNSTSGQRGNNRQIDIRGMGPENTLILIDGKPVSSRNSVR QGWRGERDTRGDTSWVPPEMIERIEVLRGPAAARYGNGAAGGVVNIITKKGSGEWHGSWD AYFNAPEHKEEGATKRTNFSLTGPLGDEFSFRLYGNLDKTQADAWDINQGHQSARAGTYA TTLPAGREGVINKDINGVVRWDFAPLQSLELEAGYSRQGNLYAGDTQNTNSDSYTRSKYG DETNRLYRQNYALTWNGGWDNGVTTSNWVQYEHTRNSRIPEGLAGGTEGKFNEKATQDFV DIDLDDVMLHSEVNLPIDFLVNQTLTLGTEWNQQRMKDLSSNTQALTGTNTGGAIDGVST TDRSPYSKAEIFSLFAENNMELTDSTIVTPGLRFDHHSIVGNNWSPALNISQGLGDDFTL KMGIARAYKAPSLYQTNPNYILYSKGQGCYASAGGCYLQGNDDLKAETSINKEIGLEFKR DGWLAGVTWFRNDYRNKIEAGYVAVGQNAVGTDLYQWDNVPKAVVEGLEGSLNVPVSETV MWTNNITYMLKSENKTTGDRLSIIPEYTLNSTLSWQAREDLSMQTTFTWYGKQQPKKYNY KGQPAVGPETKEISPYSIVGLSATWDVTKNVSLTGGVDNLFDKRLWRAGNAQTTGDLAGA NYIAGAGAYTYNEPGRTWYMSVNTHF >gi|296494400|gb|ADTN01000338.1| GENE 3 3615 - 4616 769 333 aa, chain + ## HITS:1 COG:fes KEGG:ns NR:ns ## COG: fes COG2382 # Protein_GI_number: 16128568 # Func_class: P Inorganic ion transport and metabolism # 
Function: Enterochelin esterase and related enzymes # Organism: Escherichia coli K12 # 1 333 42 374 374 652 100.0 0 MQRIAGTNVWQWTTQLNANWRGSYCFIPTERDDIFSVPSPDRLELREGWRKLLPQAIADP LNLQSWKGGRGHAVSALEMPQAPLQPGWDCPQAPEIPAKEIIWKSERLKKSRRVWIFTTG DATAEERPLAVLLDGEFWAQSMPVWPVLTSLTHRQQLPPAVYVLIDAIDTTHRAHELPCN ADFWLAVQQELLPLVKAIAPFSDRADRTVVAGQSFGGLSALYAGLHWPERFGCVLSQSGS YWWPHRGGQQEGVLLEKLKAGEVSAEGLRIVLEAGIREPMIMRANQALYAQLHPIKESIF WRQVDGGHDALCWRGGLMQGLIDLWQPLFHDRS >gi|296494400|gb|ADTN01000338.1| GENE 4 4619 - 4837 252 72 aa, chain + ## HITS:1 COG:Z0726 KEGG:ns NR:ns ## COG: Z0726 COG3251 # Protein_GI_number: 15800300 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 EDL933 # 1 72 1 72 72 115 93.0 2e-26 MAFSNPFDDPQGAFYILRNAQGQFSLWPQQCVLPAGWDIVCQPQSQASCQQWLEAHWRTL TPTNFTQLQEAQ >gi|296494400|gb|ADTN01000338.1| GENE 5 4834 - 8715 3944 1293 aa, chain + ## HITS:1 COG:entF_1 KEGG:ns NR:ns ## COG: entF_1 COG1020 # Protein_GI_number: 16128569 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Non-ribosomal peptide synthetase modules and related proteins # Organism: Escherichia coli K12 # 1 1060 1 1060 1060 2003 99.0 0 MSQHLPLVAAQPGIWMAEKLSELPSAWSVAHYVELTGEVDSPLLARAVVAGLAQADTLRM RFTEDNGEVWQWVDDALTFELPEIIDLRTNIDPHGTAQAFMQADLQQDLRVDSGKPLVFH QLIQVADNRWYWYQRYHHLLVDGFSFPAITRQIANIYCTWLRGEPTPASPFTPFADVVEE YQQYRESEAWQRDAAFWAEQRRQLPPPASLSPAPLPGRSASADILRLKLEFTDGEFRQLA TQLSGVQRTDLALALAALWLGRLCNRMDYAAGFIFMRRLGSAALTATGPVLNVLPLGIHI AAQETLPELATRLAAQLKKMRRHQRYDAEQIVRDSGRAAGDEPLFGPVLNIKVFDYQLDI PDVQAQTHTLATGPVNDLELALFPDVHGDLSIEILANKQRYDEPTLIQHAERLKMLIAQF AADPALLCGDVDIMLPGEYAQLAQLNATQVEIPETTLSALVAEQAAKTPDAPALADARYL FSYREMREQVVALANLLRERGVKPGDSVAVALPRSVFLTLALHAIVEAGAAWLPLDTGYP DDRLKMMLEDARPSLLITTDDQLPRFSDVPNLTSLCYNAPLTPQGSAPLQLSQPHHTAYI IFTSGSTGRPKGVMVGQTAIVNRLLWMQNHYPLTGEDVVAQKTPCSFDVSVWEFFWPFIA GAKLVMAEPEAHRDPLAMQQFFAEYGVTTTHFVPSMLAAFVASLTPQTARQSCATLKQVF 
CSGEALPADLCREWQQLTGAPLHNLYGPTEAAVDVSWYPAFGEELAQVRGSSVPIGYPVW NTGLRILDAMMHPVPPGVAGDLYLTGIQLAQGYLGRPDLTASRFIADPFAPGERMYRTGD VARWLDNGAVEYLGRSDDQLKIRGQRIELGEIDRVMQALPDVEQAVTHACVINQAAATGG DARQLVGYLVSQSGLPLDTSALQAQLRETLPPHMVPVVLLQLPQLPLSANGKLDRKALPL PELKAQAPGRAPKAGSETIIAAAFSSLLGCDVQDADADFFALGGHSLLAMKLAAQLSRQV ARQVTPGQVMVASTVAKLATIIDAEEDSTRRMGFETILPLREGNGPTLFCFHPASGFAWQ FSVLSRYLDPQWSIIGIQSPRPNGPMQTAANLDEVCEAHLATLLEQQPHGPYYLLGYSLG GTLAQGIAARLRARGEQVAFLGLLDTWPPETQNWQEKEANGLDPEVLAEINREREAFLAA QQGSTSTELFTTIEGNYADAVRLLTTAHSVPFDGKATLFVAERTLQEGMSPERAWSPWIA ELDIYRQDCAHVDIISPGTFEKIGPIIRATLNR >gi|296494400|gb|ADTN01000338.1| GENE 6 8931 - 9776 754 281 aa, chain + ## HITS:1 COG:fepE KEGG:ns NR:ns ## COG: fepE COG3765 # Protein_GI_number: 16128570 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Chain length determinant protein # Organism: Escherichia coli K12 # 1 281 1 281 377 531 99.0 1e-151 MSSLNIKQGSDAHFPDYPLASPSNNEIDLLNLISVLWRAKKTVMAVVFAFACAGLLISFI LPQKWTSAAVVTPPEPVQWQELEKSFTKLRVLDLDIKIDRTEAFNLFIKKFQSVSLLEEY LRSSPYVMDQLKEAKIDELDLHRAIVALSEKMKAVDDNASKKKDEPSLYTSWTLSFTAPT SEEAQTVLSGYIDYISTLVVKESLENVRNKLEIKTQFEKEKLAQDRIKTKNQLDANIQRL NYSLDIANAAGIKKPVYSNGQAVKDDPDFSISLGADGIERR Prediction of potential genes in microbial genomes Time: Mon May 16 00:17:00 2011 Seq name: gi|296494399|gb|ADTN01000339.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont898.1, whole genome shotgun sequence Length of sequence - 12196 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 2, operones - 2 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 211 188 ## EcHS_A2169 hypothetical protein 2 1 Op 2 25/0.000 + CDS 222 - 2744 2408 ## COG0438 Glycosyltransferase + Term 2758 - 2785 -0.8 3 1 Op 3 25/0.000 + CDS 2791 - 3936 1040 ## COG0438 Glycosyltransferase 4 1 Op 4 . 
+ CDS 3946 - 5061 781 ## COG0438 Glycosyltransferase 5 2 Op 1 24/0.000 - CDS 5100 - 5708 689 ## COG0139 Phosphoribosyl-AMP cyclohydrolase 6 2 Op 2 23/0.000 - CDS 5702 - 6478 922 ## COG0107 Imidazoleglycerol-phosphate synthase 7 2 Op 3 25/0.000 - CDS 6460 - 7197 899 ## COG0106 Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase 8 2 Op 4 18/0.000 - CDS 7197 - 7787 692 ## COG0118 Glutamine amidotransferase 9 2 Op 5 13/0.000 - CDS 7787 - 8854 1233 ## COG0131 Imidazoleglycerol-phosphate dehydratase 10 2 Op 6 19/0.000 - CDS 8854 - 9924 1112 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 11 2 Op 7 18/0.000 - CDS 9921 - 11225 1145 ## COG0141 Histidinol dehydrogenase 12 2 Op 8 . - CDS 11231 - 12130 1142 ## COG0040 ATP phosphoribosyltransferase Predicted protein(s) >gi|296494399|gb|ADTN01000339.1| GENE 1 2 - 211 188 69 aa, chain + ## HITS:1 COG:no KEGG:EcHS_A2169 NR:ns ## KEGG: EcHS_A2169 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_HS # Pathway: not_defined # 10 69 649 708 708 121 100.0 8e-27 TLSVSWSHYLHKLGLYQPAYRLYRRMNPLPHSQYQADAQILSQTELQVMHPELLPPEVYE IYLKLTKNK >gi|296494399|gb|ADTN01000339.1| GENE 2 222 - 2744 2408 840 aa, chain + ## HITS:1 COG:XF0608 KEGG:ns NR:ns ## COG: XF0608 COG0438 # Protein_GI_number: 15837210 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Xylella fastidiosa 9a5c # 1 824 2 821 849 553 38.0 1e-157 MHILIDVQGYQSESKVRGIGRSTLAMSRAIIENAGEHRVSILINGMYPIDNINDVKMAYR DLLTDEDMFIFSAVAPTAYRHIENHGRSKAAQAARDIAIANIAPDIVYVISFFEGHSDSY TVSIPADNVPWKTVCVCHDLIPLLNKERYLGDPNFREFYMNKLAEFERADAIFAISQSAA QEVIEYTDIASDRVLNISSAVGEEFAVIDYSAERIQSLKDKYSLPDEFILSLAMIEPRKN IEALIHAYSLLPAELQQRYPMVLAYKVQPEQLERILRLAESYGLSRSQLIFTGFLTDDDL IALYNLCKLFVFPSLHEGFGLPPLEAMRCGAATLGSNITSLPEVIGWEDAMFNPHDVQDI RRVMEKALTDEAFYRELKAHALAQSAKFSWANTAHLAIEGFTRLLQSSQETDAGQAESVT ASRLQMMQKIDALSEVDRLGLAWAVARNSFKRHTRKLLVDISVLAQHDAKTGIQRVSRSI 
LSELLKSGVPGYEVSAVYYTPGECYRYANQYLSSHFPGEFGADEPVLFSKDDVLIATDLT AHLFPELVTQIDSMRAAGAFACFVVHDILPLRRPEWSIEGIQRDFPIWLSCLAEHADRLI CVSASVAEDVKAWIAENRHWVKPNPLQTVSNFHLGADLDASVPSTGMPDNAQALLAAMAA APSFIMVGTMEPRKGHAQTLAAFEELWREGKDYNLFIVGKQGWNVDSLCEKLRHHPQLNK KLFWLQNISDEFLAELYARSRALIFASQGEGFGLPLIEAAQKKLPVIIRDIPVFKEIAQE HAWYFSGEAPSDIAKAVEEWLALYEQNAHPRSENINWLTWKQSAEFLLKNLPIIAPAAKQ >gi|296494399|gb|ADTN01000339.1| GENE 3 2791 - 3936 1040 381 aa, chain + ## HITS:1 COG:PA5448 KEGG:ns NR:ns ## COG: PA5448 COG0438 # Protein_GI_number: 15600641 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Pseudomonas aeruginosa # 1 379 1 370 375 216 34.0 8e-56 MKIIFATEPIKYPLTGIGRYSLELVKRLAVAREIEELKLFHGASFIDQIPQVENKSDSKA SNHGRLSAFLRRQPLLIEAYRLLHPRRQAWALRDYKDYIYHGPNFYLPHRLERAVTTFHD ISIFTCPEYHPKDRVRYMEKSLHESLDSAKLILTVSDFSRSEIIRLFNYPADRIVTTKLA CSSDYIPRSPAECLPVLQKYQLAWQGYALYIGTMEPRKNIRGLLQAYQLLPMETRMRYPL ILSGYRGWEDDVLWQLVERGTREGWIRYLGYVPDEDLPYLYAAARTFVYPSFYEGFGLPI LEAMSCGVPVVCSNVTSLPEVVGDAGLVADPNDVDAISAHILQSLQDDSWREISTARGLA QAKQFSWENCTTQTINAYKLL >gi|296494399|gb|ADTN01000339.1| GENE 4 3946 - 5061 781 371 aa, chain + ## HITS:1 COG:PA5447 KEGG:ns NR:ns ## COG: PA5447 COG0438 # Protein_GI_number: 15600640 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Pseudomonas aeruginosa # 1 371 1 372 381 370 49.0 1e-102 MRVLHVYKTYYPDTYGGIEQVIYQLSQGCARRGIAADVFTFSPDKETGPVAYEDHRVIYN KQLFEIASTPFSLKALKRFKQIKDDYDIINYHFPFPFMDMLHLSARPDARTVVTYHSDIV KQKRLMKLYQPLQERFLSGVDCIVASSPNYVASSQTLKKYLDKTVVIPFGLEQQDVQHDP QRVAHWRETVGDKFFLFVGTFRYYKGLHILMDAAECSRLPVVVVGGGPLESEVRREAQQR GLSNVMFTGMLNDEDKYILFQLCRGVVFPSHLRSEAFGITLLEGARFARPLISCEIGTGT SFINQDKVSGCVIPPNDSQALVEAMNELWNNEETSNRYGENSRRRFEEMFTADHMIDAYV NLYTTLLESKS >gi|296494399|gb|ADTN01000339.1| GENE 5 5100 - 5708 689 202 aa, chain - ## HITS:1 COG:hisI_1 KEGG:ns NR:ns ## COG: hisI_1 COG0139 # Protein_GI_number: 16129967 # Func_class: E Amino acid transport and metabolism # Function: 
Phosphoribosyl-AMP cyclohydrolase # Organism: Escherichia coli K12 # 1 112 1 112 112 239 100.0 3e-63 MLTEQQRRELDWEKTDGLMPVIVQHAVSGEVLMLGYMNPEALDKTLESGKVTFFSRTKQR LWTKGETSGNFLNVVSIAPDCDNDTLLVLANPIGPTCHKGTSSCFGDTAHQWLFLYQLEQ LLAERKSADPETSYTAKLYASGTKRIAQKVGEEGVETALAATVHDRFELTNEASDLMYHL LVLLQDQGLDLGEVIDNLKNRH >gi|296494399|gb|ADTN01000339.1| GENE 6 5702 - 6478 922 258 aa, chain - ## HITS:1 COG:ECs2826 KEGG:ns NR:ns ## COG: ECs2826 COG0107 # Protein_GI_number: 15832080 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate synthase # Organism: Escherichia coli O157:H7 # 1 258 1 258 258 518 99.0 1e-147 MLAKRIIPCLDVRDGQVVKGVQFRNHEIIGDIVPLAKRYAEEGADELVFYDITASSDGRV VDKSWVSRVAEVIDIPFCVAGGIKSLDDAAKILSFGADKISINSPALADPTLITRLADRF GVQCIVVGIDTWYDAETGKYHVNQYTGDESRTRVTQWETLDWVQEVQKRGAGEIVLNMMN QDGVRNGYDLEQLKKVREVCHVPLIASGGAGTMEHFLEAFRDADVDGALAASVFHKQIIN IGELKAYLATQGVEIRIC >gi|296494399|gb|ADTN01000339.1| GENE 7 6460 - 7197 899 245 aa, chain - ## HITS:1 COG:ECs2825 KEGG:ns NR:ns ## COG: ECs2825 COG0106 # Protein_GI_number: 15832079 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase # Organism: Escherichia coli O157:H7 # 1 245 2 246 246 459 98.0 1e-129 MIIPALDLIDGTVVRLHQGDYGKQRDYGNDPLPRLQDYAAQGAEVLHLVDLTGAKDPAKR QIPLIKTLVAGVNVPVQVGGGVRSEEDVATLLEAGVARVVVGSTAVKSPEMVKVWFERFG ADALVLALDVRIDEQGNKQVAVSGWQENSGVSLEQLVETYLPVGLKHVLCTDISRDGTLA GSNVSLYEEVCARYPQVAFQSSGGIGDIDDVAALRGTGVRGVIVGRALLEGKFTVKEAIA CWQNA >gi|296494399|gb|ADTN01000339.1| GENE 8 7197 - 7787 692 196 aa, chain - ## HITS:1 COG:hisH KEGG:ns NR:ns ## COG: hisH COG0118 # Protein_GI_number: 16129964 # Func_class: E Amino acid transport and metabolism # Function: Glutamine amidotransferase # Organism: Escherichia coli K12 # 1 196 1 196 196 417 100.0 1e-117 MNVVILDTGCANLNSVKSAIARHGYEPKVSRDPDVVLLADKLFLPGVGTAQAAMDQVRER ELFDLIKACTQPVLGICLGMQLLGRRSEESNGVDLLGIIDEDVPKMTDFGLPLPHMGWNR 
VYPQAGNRLFQGIEDGAYFYFVHSYAMPVNPWTIAQCNYGEPFTAAVQKDNFYGVQFHPE RSGAAGAKLLKNFLEM >gi|296494399|gb|ADTN01000339.1| GENE 9 7787 - 8854 1233 355 aa, chain - ## HITS:1 COG:hisB_2 KEGG:ns NR:ns ## COG: hisB_2 COG0131 # Protein_GI_number: 16129963 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate dehydratase # Organism: Escherichia coli K12 # 149 355 1 207 207 430 99.0 1e-120 MSQKYLFIDRDGTLISEPPSDFQVDRFDKLAFEPGVIPELLKLQKAGYKLVMITNQDGLG TQSFPQADFDGPHNLMMQIFTSQGVQFDELLICPHLPADECDCRKPKVKLVERYLAEQAM DRANSYVIGDRATDIQLAENMGINGLRYDREALNWPMIGEQLTKRDRYAHVVRNTKETQI DVQVWLDREGGSKINTGVGFFDHMLDQIATHGGFRMEINVKGDLYIDDHHTVEDTGLALG EALKIALGDKRGICRFGFVLPMDECLARCALDISGRPHLEYKAEFTYQRVGDLSTEMIEH FFRSLSYTMGVTLHLKTKGKNDHHRVESLFKAFGRTLRQAIRVEGDTLPSSKGVL >gi|296494399|gb|ADTN01000339.1| GENE 10 8854 - 9924 1112 356 aa, chain - ## HITS:1 COG:hisC KEGG:ns NR:ns ## COG: hisC COG0079 # Protein_GI_number: 16129962 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Escherichia coli K12 # 1 356 1 356 356 707 99.0 0 MSTVTITDLARENVRNLTPYQSARRLGGNGDVWLNANEYPTAVEFQLTQQTLNRYPECQP KVVIENYAQYAGVKPEQVLVSRGADEGIELLIRAFCEPGKDAILYCPPTYGMYSVSAETI GVECRTVPTLDNWQLDLQGISDKLDGVKVVYVCSPNNPTGQLINPQDFRTLLELTRGKAI VVADEAYIEFCPQASLAGWLAEYPHLAILRTLSKAFALAGLRCGFTLANEEVINLLMKVI APYPLSTPVADIAAQALSPQGIVAMRERVAQIIAEREYLIAALKEIPCVEQVFDSETNYI LARFKASSAVFKSLWDQGIILRDQNKQPSLSGCLRITVGTREESQCVIDALRAEQV >gi|296494399|gb|ADTN01000339.1| GENE 11 9921 - 11225 1145 434 aa, chain - ## HITS:1 COG:hisD KEGG:ns NR:ns ## COG: hisD COG0141 # Protein_GI_number: 16129961 # Func_class: E Amino acid transport and metabolism # Function: Histidinol dehydrogenase # Organism: Escherichia coli K12 # 1 434 1 434 434 776 98.0 0 MSFNTIIDWNSCTAEQQRQLLMRPAISASESITRTVNDILDNVKTRGDEALREYSAKFDK TTVTALKVSAEEIAAASERLSDELKQAMAVAVKNIETFHTAQKLPPVDVETQPGVRCQQV TRPVASVGLYIPGGSAPLFSTVLMLATPARIAGCKKVVLCSPPPIADEILYAAQLCGVQD 
VFNVGGAQAIAALAFGTESVPKVDKIFGPGNAFVTEAKRQVSQRLDGAAIDMPAGPSEVL VIADSGATPDFVASDLLSQAEHGPDSQVILLTPDADMARRVAEAVERQLAELPRAETARQ ALSASRLIVTNDLAQCVEISNQYGPEHLIIQTRNARELVDGITSAGSVFLGDWSPESAGD YASGTNHVLPTYGYTATCSSLGLADFQKRMTVQELSKEGFSALASTIETLAAAERLTAHK NAATLRVNALKEQA >gi|296494399|gb|ADTN01000339.1| GENE 12 11231 - 12130 1142 299 aa, chain - ## HITS:1 COG:hisG KEGG:ns NR:ns ## COG: hisG COG0040 # Protein_GI_number: 16129960 # Func_class: E Amino acid transport and metabolism # Function: ATP phosphoribosyltransferase # Organism: Escherichia coli K12 # 1 299 1 299 299 574 100.0 1e-164 MTDNTRLRIAMQKSGRLSDDSRELLARCGIKINLHTQRLIAMAENMPIDILRVRDDDIPG LVMDGVVDLGIIGENVLEEELLNRRAQGEDPRYFTLRRLDFGGCRLSLATPVDEAWDGPL SLNGKRIATSYPHLLKRYLDQKGISFKSCLLNGSVEVAPRAGLADAICDLVSTGATLEAN GLREVEVIYRSKACLIQRDGEMEESKQQLIDKLLTRIQGVIQARESKYIMMHAPTERLDE VIALLPGAERPTILPLAGDQQRVAMHMVSSETLFWETMEKLKALGASSILVLPIEKMME Prediction of potential genes in microbial genomes Time: Mon May 16 00:17:06 2011 Seq name: gi|296494398|gb|ADTN01000340.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont898.2, whole genome shotgun sequence Length of sequence - 12280 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 7, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 333 - 392 8.2 1 1 Op 1 6/0.000 + CDS 412 - 663 338 ## COG2161 Antitoxin of toxin-antitoxin stability system 2 1 Op 2 . + CDS 660 - 914 109 ## COG4115 Uncharacterized protein conserved in bacteria + Term 916 - 953 1.5 3 2 Op 1 2/0.500 + CDS 997 - 1821 830 ## COG0451 Nucleoside-diphosphate-sugar epimerases 4 2 Op 2 4/0.000 + CDS 1867 - 2796 837 ## COG0583 Transcriptional regulator + Term 2885 - 2917 2.3 + Prom 2805 - 2864 4.2 5 2 Op 3 . + CDS 3062 - 4420 1658 ## COG0531 Amino acid transporters + Term 4464 - 4496 1.4 - Term 4451 - 4484 0.0 6 3 Tu 1 . 
- CDS 4494 - 5921 1204 ## COG2925 Exonuclease I - Prom 6165 - 6224 2.9 + Prom 6122 - 6181 3.9 7 4 Op 1 3/0.000 + CDS 6253 - 7296 899 ## COG1686 D-alanyl-D-alanine carboxypeptidase + Prom 7320 - 7379 6.0 8 4 Op 2 4/0.000 + CDS 7415 - 7888 473 ## COG3449 DNA gyrase inhibitor + Term 7978 - 8012 1.1 + Prom 7900 - 7959 4.0 9 4 Op 3 2/0.500 + CDS 8086 - 9144 928 ## COG1289 Predicted membrane protein + Prom 9152 - 9211 3.9 10 5 Tu 1 . + CDS 9316 - 9645 511 ## COG2926 Uncharacterized protein conserved in bacteria + Term 9662 - 9696 6.9 - Term 10113 - 10158 6.1 11 6 Tu 1 . - CDS 10178 - 10258 75 ## - Prom 10284 - 10343 2.9 - Term 10350 - 10388 5.0 12 7 Op 1 . - CDS 10543 - 10734 287 ## ECP_2049 hypothetical protein 13 7 Op 2 . - CDS 10734 - 11108 287 ## JW1987 toxin of the YeeV-YeeU toxin-antitoxin system 14 7 Op 3 . - CDS 11197 - 11565 228 ## ECSP_2675 CP4-44 prophage; antitoxin of the YeeV-YeeU toxin-antitoxin system 15 7 Op 4 . - CDS 11639 - 11860 293 ## SDY_4591 hypothetical protein 16 7 Op 5 . 
- CDS 11929 - 12279 273 ## COG2003 DNA repair proteins Predicted protein(s) >gi|296494398|gb|ADTN01000340.1| GENE 1 412 - 663 338 83 aa, chain + ## HITS:1 COG:yefM KEGG:ns NR:ns ## COG: yefM COG2161 # Protein_GI_number: 16129958 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Antitoxin of toxin-antitoxin stability system # Organism: Escherichia coli K12 # 1 83 10 92 92 140 100.0 8e-34 MRTISYSEARQNLSATMMKAVEDHAPILITRQNGEACVLMSLEEYNSLEETAYLLRSPAN ARRLMDSIDSLKSGKGTEKDIIE >gi|296494398|gb|ADTN01000340.1| GENE 2 660 - 914 109 84 aa, chain + ## HITS:1 COG:AGc3658 KEGG:ns NR:ns ## COG: AGc3658 COG4115 # Protein_GI_number: 15889305 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 84 1 88 89 105 56.0 2e-23 MKLIWSEESWDDYLYWQETDKRIVKKINELIKDTRRTPFEGKGKPEPLKHYLSGFWSRRI TEEHRLVYAVTDDSLLIAACRYHY >gi|296494398|gb|ADTN01000340.1| GENE 3 997 - 1821 830 274 aa, chain + ## HITS:1 COG:ECs2818 KEGG:ns NR:ns ## COG: ECs2818 COG0451 # Protein_GI_number: 15832072 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Escherichia coli O157:H7 # 1 274 1 274 274 528 100.0 1e-150 MKKVAIVGLGWLGMPLAMSLSARGWQVTGSKTTQDGVEAARMSGIDSYLLRMEPELVCDS DDLDALMDADALVITLPARRSGPGDEFYLQAVQELVDSALAHRIPRIIFTSSTSVYGDAQ GTVKETTPRNPVTNSGRVLEELEDWLHNLPGTSVDILRLAGLVGPGRHPGRFFAGKTAPD GEHGVNLVHLEDVIGAITLLLQAPKGGHIYNICAPAHPARNVFYPQMARLLGLEPPQFRN SLDSGKGKIIDGSRICNELGFEYQYPDPLVMPLE >gi|296494398|gb|ADTN01000340.1| GENE 4 1867 - 2796 837 309 aa, chain + ## HITS:1 COG:ECs2817 KEGG:ns NR:ns ## COG: ECs2817 COG0583 # Protein_GI_number: 15832071 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 309 8 316 316 619 99.0 1e-177 MKPLLDVLMILDALEKEGSFAAASAKLYKTPSALSYTVHKLESDLNIQLLDRSGHRAKFT 
RTGKMLLEKGREVLHTVRELEKQAIKLHEGWENELVIGVDDTFPFSLLAPLIEAFYQHHS VTRLKFINGVLGGSWDALTQGRADIIVGAMHEPPSSSEFGFSRLGDLEQVFAVAPHHPLA QEEEPLNRRIIKRYRAIVVGDTAQAGASTASQLLDEQEAITVFDFKTKLELQISGLGCGY LPRYLAQRFLDSGALIEKKVVAQTLFEPVWIGWNEQTAGLASGWWRDEILANSAIAGVYA KSDDGKSAI >gi|296494398|gb|ADTN01000340.1| GENE 5 3062 - 4420 1658 452 aa, chain + ## HITS:1 COG:ECs2816 KEGG:ns NR:ns ## COG: ECs2816 COG0531 # Protein_GI_number: 15832070 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli O157:H7 # 1 452 3 454 454 816 100.0 0 MSHNVTPNTSRVELRKTLTLVPVVMMGLAYMQPMTLFDTFGIVSGLTDGHVPTAYAFALI AILFTALSYGKLVRRYPSAGSAYTYAQKSISPTVGFMVGWSSLLDYLFAPMINILLAKIY FEALVPSIPSWMFVVALVAFMTAFNLRSLKSVANFNTVIVVLQVVLIAVILGMVVYGVFE GEGAGTLASTRPFWSGDAHVIPMITGATILCFSFTGFDGISNLSEETKDAERVIPRAIFL TALIGGMIFIFATYFLQLYFPDISRFKDPDASQPEIMLYVAGKAFQVGALIFSTITVLAS GMAAHAGVARLMYVMGRDGVFPKSFFGYVHPKWRTPAMNIILVGAIALLAINFDLVMATA LINFGALVAFTFVNLSVISQFWIREKRNKTLKDHFQYLFLPMCGALTVGALWVNLEESSM VLGLIWAAIGLIYLACVTKSFRNPVPQYEDVA >gi|296494398|gb|ADTN01000340.1| GENE 6 4494 - 5921 1204 475 aa, chain - ## HITS:1 COG:sbcB KEGG:ns NR:ns ## COG: sbcB COG2925 # Protein_GI_number: 16129952 # Func_class: L Replication, recombination and repair # Function: Exonuclease I # Organism: Escherichia coli K12 # 1 475 1 475 475 971 98.0 0 MMNDGKQQSTFLFHDYETFGTHPALDRPAQFAAIRTDNEFNVIGEPEVFYCKPADDYLPQ PGAVLITGITPQEARAKGENEAAFAARIHSLFTVPKTCILGYNNVRFDDEVTRNVFYRNF YDPYAWSWQHDNSRWDLLDVMRACYALRPEGINWPENDDGLPSFRLEHLTKANGIEHSNA HDAMADVYATIALAKLVKTRQPRLFDYLFTHRNKHKLMALIDVPQMKPLVHVSGMFGAWR GNTSWVAPLAWHPENRNAVIMVDLAGDISPLLELDSDTLRERLYTVKADLGDNAAVPVKL VHINKCPVLAQANTLRPEDADRLGINRQHCLDNLKILRENPQVREKVVAIFAEAEPFTPS DNVDAQLYNGFFSDADRAAMKIVLETEPRNLPALDITFVDKRIEKLLFNYRARNFPGTLD YAEQQRWLEHRRQVFTPEFLQGYADELQMLAQQYADDKEKVALLKALWQYAEEIV >gi|296494398|gb|ADTN01000340.1| GENE 7 6253 - 7296 899 347 aa, chain + ## HITS:1 COG:ECs2812 KEGG:ns NR:ns ## COG: ECs2812 COG1686 # Protein_GI_number: 15832066 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Escherichia coli O157:H7 # 1 347 44 390 390 714 100.0 0 MDYTTGQILTAGNEHQQRNPASLTKLMTGYVVDRAIDSHRITPDDIVTVGRDAWAKDNPV FVGSSLMFLKEGDRVSVRDLSRGLIVDSGNDACVALADYIAGGQRQFVEMMNNYAEKLHL KDTHFETVHGLDAPGQHSSAYDLAVLSRAIIHGEPEFYHMYSEKSLTWNGITQQNRNGLL WDKTMNVDGLKTGHTSGAGFNLIASAVDGQRRLIAVVMGADSAKGREEEARKLLRWGQQN FTTVQILHRGKKVGTERIWYGDKENIALGTEQEFWMVLPKAEIPHIKAKYTLDGKELTAP ISAHQRVGEIELYDRDKQVAHWPLVTLESVGEGSMFSRLSDYFHHKA >gi|296494398|gb|ADTN01000340.1| GENE 8 7415 - 7888 473 157 aa, chain + ## HITS:1 COG:ECs2811 KEGG:ns NR:ns ## COG: ECs2811 COG3449 # Protein_GI_number: 15832065 # Func_class: L Replication, recombination and repair # Function: DNA gyrase inhibitor # Organism: Escherichia coli O157:H7 # 1 157 1 157 157 315 98.0 3e-86 MNYEIKQEDKRTVAGFHLVGPWEQTVKKGFEQLMMWVDNKNIVPKEWVAVYYDNPDETPA EKLRCDTVVTVPNNFTLPDNSEGVILTEISGGQYAVAVARVVGDDFAKPWYQFFNSLLQD SAYEMLPKPCFEVYLNNGTEDGYWDIEMYVAVQPKHH >gi|296494398|gb|ADTN01000340.1| GENE 9 8086 - 9144 928 352 aa, chain + ## HITS:1 COG:yeeA KEGG:ns NR:ns ## COG: yeeA COG1289 # Protein_GI_number: 16129949 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 1 352 1 352 352 687 99.0 0 MRADKSLSPFEIRVYRHYRIVHGTRVALAFLLTFLIIRLFTIPESTWPLVTMVVIMGPIS FWGNVVPRAFERIGGTVLGSILGLIALQLELISLPLMLVWCAATMFLCGWLALGKKPYQG LLIGVTLAIVVGSPTGEIDTALWRSGDVILGSLLAMLFTGIWPQRAFIHWRIQLAKSLTE YNRVYQSAFSPNLLERPRLESHLQKLLTDAVKMRGLIAPASKETRIPKSIYEGIQTINRN LVCMLELQINAYWATRPSHFVLLNAQKLRDTQHMMQQILLSLVHALYEGNPQPVFANTEK LNDAVEELRQLLDNHHDLKVVETPIYGYVWLNMETAHQLELLSNLICRALRK >gi|296494398|gb|ADTN01000340.1| GENE 10 9316 - 9645 511 109 aa, chain + ## HITS:1 COG:ECs2809 KEGG:ns NR:ns ## COG: ECs2809 COG2926 # Protein_GI_number: 15832063 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 109 23 131 131 177 100.0 4e-45 
METTKPSFQDVLEFVRLFRRKNKLQREIQDVEKKIRDNQKRVLLLDNLSDYIKPGMSVEA IQGIIASMKGDYEDRVDDYIIKNAELSKERRDISKKLKAMGEMKNGEAK >gi|296494398|gb|ADTN01000340.1| GENE 11 10178 - 10258 75 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAVLTISIDLTKNEFQIHGLGRNRKI >gi|296494398|gb|ADTN01000340.1| GENE 12 10543 - 10734 287 63 aa, chain - ## HITS:1 COG:no KEGG:ECP_2049 NR:ns ## KEGG: ECP_2049 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_536 # Pathway: not_defined # 1 63 2 64 64 111 100.0 7e-24 MTLEADSVNVQALDMGHIVVDIDGVNITELINKAAENGYSLRVVDGRDSTETPATYASPH QLL >gi|296494398|gb|ADTN01000340.1| GENE 13 10734 - 11108 287 124 aa, chain - ## HITS:1 COG:no KEGG:JW1987 NR:ns ## KEGG: JW1987 # Name: yeeV # Def: toxin of the YeeV-YeeU toxin-antitoxin system # Organism: E.coli_J # Pathway: not_defined # 1 124 1 124 124 251 100.0 7e-66 MKTLPVLPGQAASSRPSPVEIWQILLSRLLDQHYGLTLNDTPFADERVIEQHIEAGISLC DAVNFLVEKYALVRTDQPGFSACTRSQLINSIDILRARRATGLMTRDNYRTVNNITLGKY PEAK >gi|296494398|gb|ADTN01000340.1| GENE 14 11197 - 11565 228 122 aa, chain - ## HITS:1 COG:no KEGG:ECSP_2675 NR:ns ## KEGG: ECSP_2675 # Name: yeeU # Def: CP4-44 prophage; antitoxin of the YeeV-YeeU toxin-antitoxin system # Organism: E.coli_O157_TW14359 # Pathway: not_defined # 1 122 1 122 122 242 96.0 3e-63 MSDTLPGTTLPDDNHDRPWWGLPCTVTPCFGARLVQEGNRLHYLADRAGIRGLSSDADAY HPDQAFPLLMKQLELMLTSGELNPRNQHTVTLYAKGLACEADTLSSCGYVYLAVYPTPEM KN >gi|296494398|gb|ADTN01000340.1| GENE 15 11639 - 11860 293 73 aa, chain - ## HITS:1 COG:no KEGG:SDY_4591 NR:ns ## KEGG: SDY_4591 # Name: not_defined # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 73 1 73 73 146 98.0 2e-34 MKIITRGEAMRIHRQHPASRLFPFCTGKYRWHGSAEAYTGREVQDIPGVLAVFAERRKDS FGPYVRLMSVILN >gi|296494398|gb|ADTN01000340.1| GENE 16 11929 - 12279 273 116 aa, chain - ## HITS:1 COG:ECs2803 KEGG:ns NR:ns ## COG: ECs2803 COG2003 # Protein_GI_number: 15832057 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Escherichia 
coli O157:H7 # 1 116 43 158 158 222 99.0 1e-58 AREWLILNMAGLEREEFRVLYLNNQNQLIAGETLFTGTINRTEVHPREVIKRALYHNAAA VVLAHNHPSGEVTPSKADRLITERLVQALGLVDIRVPDHLIVGGSQVFSFAEHGLL Prediction of potential genes in microbial genomes Time: Mon May 16 00:17:19 2011 Seq name: gi|296494397|gb|ADTN01000341.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont905.1, whole genome shotgun sequence Length of sequence - 912 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 56 - 595 642 ## COG0629 Single-stranded DNA-binding protein 2 1 Op 2 . + CDS 658 - 891 179 ## pECS88_0051 hypothetical protein Predicted protein(s) >gi|296494397|gb|ADTN01000341.1| GENE 1 56 - 595 642 179 aa, chain + ## HITS:1 COG:BU545 KEGG:ns NR:ns ## COG: BU545 COG0629 # Protein_GI_number: 15617138 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Buchnera sp. APS # 1 113 1 114 171 173 71.0 1e-43 MAVRGINKVILVGRLGKDPEVRYIPNGGAVANLQVATSETWRDKQTGEMREQTEWHRVVL FGKLAEVAGEYLRKGAQVYIEGQLRTRSWEDNGITRYVTEILVKTTGTMQMLGRAAGAQT QPEEGQQFSAQPQPEPQSEAGTKKGGAKTKGRGRKAAQPEPQPQPPEGEDYGFSDDIPF >gi|296494397|gb|ADTN01000341.1| GENE 2 658 - 891 179 77 aa, chain + ## HITS:1 COG:no KEGG:pECS88_0051 NR:ns ## KEGG: pECS88_0051 # Name: yubL # Def: hypothetical protein # Organism: E.coli_S88 # Pathway: not_defined # 1 77 10 86 86 155 98.0 5e-37 MSEYFRILQGLPDGPFTRKHAEAVAAQYRNVFIEDDHGEQFRLVVRNNGAMVWRTWNFED GAGYWMNHVIRDFGILK Prediction of potential genes in microbial genomes Time: Mon May 16 00:17:21 2011 Seq name: gi|296494396|gb|ADTN01000342.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont912.1, whole genome shotgun sequence Length of sequence - 1085 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 1079 410 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing Predicted protein(s) >gi|296494396|gb|ADTN01000342.1| GENE 1 3 - 1079 410 358 aa, chain + ## HITS:1 COG:STM4305 KEGG:ns NR:ns ## COG: STM4305 COG0243 # Protein_GI_number: 16767555 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Salmonella typhimurium LT2 # 1 352 39 387 783 628 84.0 1e-180 NCGSRCALKVIVKDDRIVRIEPEDAKDDSVFGEHQIRPCLRGRSSRWRVYSPDRIKYPMK RVGKRGEGKFKRISWDEATAFIAAELKRVSEKYGNEAIYFNYQSGAYYHNQGTPAWKRLL NLNGGFLQYYNNYSNAQNFTATSYTYGATGVQFASHFSQIAHSDLVVFFGLNLSETRMSG GGQVEELRRALEISGARVIIIDPRYTDSVVTEHAEWLPIRPTTDAALVAAIAHTLIKENL IDEERVNTYCVGYDRSTLPEAAKPNASYKDYVLGNGDDGVEKTPEWAADICGIPAIRIRQ LAREMAGARACFIAQGWGPQRHANGEQTSRAVQILPALTGHFGLPGTNNGNWVMLPTY Prediction of potential genes in microbial genomes Time: Mon May 16 00:17:22 2011 Seq name: gi|296494395|gb|ADTN01000343.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont915.1, whole genome shotgun sequence Length of sequence - 728 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 125 81 ## APECO1_O1R84 hypothetical protein + Term 208 - 255 1.3 Predicted protein(s) >gi|296494395|gb|ADTN01000343.1| GENE 1 3 - 125 81 40 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1R84 NR:ns ## KEGG: APECO1_O1R84 # Name: tetR # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 40 180 219 219 82 100.0 4e-15 IVYEGGPDAAFERGLALIIGGLEKMRLTTNDIEVLKNVDE Prediction of potential genes in microbial genomes Time: Mon May 16 00:17:24 2011 Seq name: gi|296494394|gb|ADTN01000344.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont927.1, whole genome shotgun sequence Length of sequence - 2119 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 64 - 522 -82 ## Lferr_0224 hypothetical protein 2 1 Op 2 . + CDS 539 - 1402 177 ## TDE1618 hypothetical protein + Term 1424 - 1456 2.0 3 2 Tu 1 . - CDS 1645 - 2118 307 ## ECO111_p1-137 putative transposase Predicted protein(s) >gi|296494394|gb|ADTN01000344.1| GENE 1 64 - 522 -82 152 aa, chain + ## HITS:1 COG:no KEGG:Lferr_0224 NR:ns ## KEGG: Lferr_0224 # Name: not_defined # Def: hypothetical protein # Organism: A.ferrooxidans # Pathway: not_defined # 16 151 3 127 134 66 36.0 3e-10 MSLYVHAAGQKVNIIESKIAGLGFLQAIIARMANNSFYIKGWDLTLFVAVVALKKDASQY SGYLLLALIVSTIAFCLLDAYYLQQERIFRKIYNYLAESNSESINYFSINPVLYKDILKG EMLSYRSSLISTSILLFHIPMFVAVFIFFVLF >gi|296494394|gb|ADTN01000344.1| GENE 2 539 - 1402 177 287 aa, chain + ## HITS:1 COG:no KEGG:TDE1618 NR:ns ## KEGG: TDE1618 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 52 243 50 236 278 112 35.0 2e-23 MLHALNFSIPDDSFLLDGTYSTTLTEELKLQNPISKIRESITKSDGCNDSIDNSINYEEF KRIWFPDVNAHVFISHSHVDKAKAIALANFLFSKFGIRSFVDSEVWGYMDDALRELNNKH NKFPGQENTYIYEHCNKLASNLNMILSSALIKMIHESECLIFINTIDSCVRDGNEFRTQS PWIFTELLMSSMLEKSEHKDRPTELRKSLDEFRVVAANESFDQAVFSYPLYLRHMHNVTN DKINKINAIKPSGLSFTTTCYRDTAHHFEVDDVFKNLDSIYNLVISR 
>gi|296494394|gb|ADTN01000344.1| GENE 3 1645 - 2118 307 157 aa, chain - ## HITS:1 COG:no KEGG:ECO111_p1-137 NR:ns ## KEGG: ECO111_p1-137 # Name: not_defined # Def: putative transposase # Organism: E.coli_O111_H- # Pathway: not_defined # 11 157 319 465 465 263 90.0 1e-69 ITTDYLQQCPEFTWREPRRVLKSLTVQYDKVLYLIEDSKDSRRAIGKYIDVWHYPDGHKE LRLNDVQLPYSTYDRLSEVDPVAIVDNKRLGHVLEVSRQVQRKRDNNRSQSIPSGDEPSR RRHAPSINKPQRSLNEEDMLEAIIKLQGSSEAIFGKR Prediction of potential genes in microbial genomes Time: Mon May 16 00:17:34 2011 Seq name: gi|296494393|gb|ADTN01000345.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont929.1, whole genome shotgun sequence Length of sequence - 861 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 93 - 150 10.0 1 1 Op 1 . - CDS 218 - 409 136 ## SeHA_A0049 hypothetical protein 2 1 Op 2 . - CDS 406 - 828 487 ## EcE24377A_E0028 putative DNA methylase Predicted protein(s) >gi|296494393|gb|ADTN01000345.1| GENE 1 218 - 409 136 63 aa, chain - ## HITS:1 COG:no KEGG:SeHA_A0049 NR:ns ## KEGG: SeHA_A0049 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Heidelberg # Pathway: not_defined # 1 63 1 63 63 114 98.0 7e-25 MNISTETREILRNYRAVINARRREMGQKPLTTAQIVDEICDFVVNQQAVFLGGHYILQGS RNR >gi|296494393|gb|ADTN01000345.1| GENE 2 406 - 828 487 140 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_E0028 NR:ns ## KEGG: EcE24377A_E0028 # Name: not_defined # Def: putative DNA methylase # Organism: E.coli_E24377A # Pathway: not_defined # 1 140 122 261 261 252 95.0 3e-66 MYCTVKEIIREVLDTDVPDSECVFAVVLTRGDVRHIAQDWSLTDDELETVMQRLDDAFEY GADVSVVHGVVRELMEEKRASRQVTVPAVMLEKVMALAGSEMKRLYAVGSENGGDGDAFV REEREAMDVVLQALDGETMS Prediction of potential genes in microbial genomes Time: Mon May 16 00:17:40 2011 Seq name: gi|296494392|gb|ADTN01000346.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont939.1, whole genome shotgun sequence Length of sequence - 7329 bp 
Number of predicted genes - 7, with homology - 7 Number of transcription units - 2, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 4/0.000 - CDS 102 - 617 151 ## COG3539 P pilus assembly protein, pilin FimA 2 1 Op 2 6/0.000 - CDS 628 - 1635 654 ## COG3539 P pilus assembly protein, pilin FimA 3 1 Op 3 10/0.000 - CDS 1648 - 4257 1562 ## COG3188 P pilus assembly protein, porin PapC 4 1 Op 4 7/0.000 - CDS 4288 - 4956 180 ## COG3121 P pilus assembly protein, chaperone PapD - Prom 5014 - 5073 6.0 - Term 5017 - 5065 0.1 5 1 Op 5 . - CDS 5200 - 5742 229 ## COG3539 P pilus assembly protein, pilin FimA - Prom 5862 - 5921 7.1 + Prom 5926 - 5985 5.5 6 2 Op 1 2/0.000 + CDS 6222 - 7088 883 ## COG0190 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase 7 2 Op 2 . + CDS 7090 - 7302 316 ## COG2501 Uncharacterized conserved protein Predicted protein(s) >gi|296494392|gb|ADTN01000346.1| GENE 1 102 - 617 151 171 aa, chain - ## HITS:1 COG:sfmF KEGG:ns NR:ns ## COG: sfmF COG3539 # Protein_GI_number: 16128518 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 1 171 1 171 171 331 100.0 4e-91 MRRVLFSCFCGLLWSSSGWAVDPLGTININLHGNVVDFSCTVNTADIDKTVDLGRWPTTQ LLNAGDTTALVPFSLRLEGCPPGSVAILFTGTPASDTNLLALDDPAMAQTVAIELRNSDR SRLALGEASPTEEVDANGNVTLNFFANYRALASGVRPGVAKADAIFMINYN >gi|296494392|gb|ADTN01000346.1| GENE 2 628 - 1635 654 335 aa, chain - ## HITS:1 COG:sfmH KEGG:ns NR:ns ## COG: sfmH COG3539 # Protein_GI_number: 16128517 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli K12 # 11 335 1 325 325 645 99.0 0 MKIICRLLLAMACLCLANISWATVCANSTGVAEDEHYDLSNIFNSTNNQPGQIVVLPEKS GWVGVSAICPPGTLVNYTYRSYVTNFIVQETIDNYKYMQLNDYLLGAMSLVDSVMDIKFP 
PQNYIRMGTDPNVSQNLPFGVMDSRLIFRLKVIRPFINMVEIPRQVMFTVYVTSTPYDPL VTPVYTISFGGRVEVPQNCELNAGQIVEFDFGDIGASLFSAAGPGNRPAGVMPQTKSIAV KCTNVAAQAYLTMRLEASAVSGQAMVSDNQDLGFIVADQNDTPITPNDLNSVIPFRLDAA AAANVTLRAWPISITGQKPTEGPFSALGYLRVDYQ >gi|296494392|gb|ADTN01000346.1| GENE 3 1648 - 4257 1562 869 aa, chain - ## HITS:1 COG:ECs0594 KEGG:ns NR:ns ## COG: ECs0594 COG3188 # Protein_GI_number: 15829848 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, porin PapC # Organism: Escherichia coli O157:H7 # 1 869 1 869 869 1724 99.0 0 MKIPTTTDIPQRYTWCLAGICYSSLAILPSFLSYAESYFNPAFLLENGTSVADLSRFERG NHQPAGVYRVDLWRNDEFIGSQDIVFESTTENTGDKSGGLMPCFNQVLLERIGLNSSAFP ELAQQQNNKCINLLKAVPDATINFDFAAMRLNITIPQIALLSSAHGYIPPEEWDEGIPAL LLNYNFTGNRGNGNDSYFFSELSGINIGPWRLRNNGSWNYFRGNGYHSEQWNNIGTWVQR AIIPLKSELVMGDGNTGSDIFDGVGFRGVRLYSSDNMYPDSQQGFAPTVRGIARTAAQLT IRQNGFIIYQSYVSPGAFEITDLHPTSSNGDLDVTIDERDGNQQNYTIPYSTVPILQREG RFKFDLTAGDFRSGNSQQSSPFFFQGTVLGGLPQEFTAYGGTQLSANYTAFLLGLGRNLG NWGAVSLDVTHARSQLADDSRHEGDSIRFLYAKSMNTFGTNFQLMGYRYSTQGFYTLDDV AYRRMEGYEYDYDYDGEHRDEPIIVNYHNLRFSRKDRLQLNISQSLNDFGSLYISGTHQK YWNTSDSDTWYQVGYTSSWVGISYSLSFSWNESVGIPDNERIVGLNVSVPFNVLTKRRYT RENALDRAYASFNANRNSNGQNSWLAGVGGTLLEGHNLSYHVSQGDTSNNGYTGSATANW QAAYGTLGGGYNYDRDQHDVNWQLSGGVVGHENGITLSQPLGDTNVLIKAPGAGGVRIEN QTGILTDWRGYAVMPYATVYRYNRIALDTNTMGNSIDVEKNISSVVPTQGALVRANFDTR IGVRALITVTQGGKPVPFGSLVRENSTGITSMVGDDGQVYLSGAPLSGELLVQWGDGANS RCIAHYVLPKQSLQQAVTVISAVCTHPGS >gi|296494392|gb|ADTN01000346.1| GENE 4 4288 - 4956 180 222 aa, chain - ## HITS:1 COG:sfmC KEGG:ns NR:ns ## COG: sfmC COG3121 # Protein_GI_number: 16128515 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, chaperone PapD # Organism: Escherichia coli K12 # 1 222 9 230 230 424 100.0 1e-119 MLIIFYLIISASAHAAGGIALGATRIIYPADAKQTAVWIRNSHTNERFLVNSWIENSSGV KEKSFIITPPLFVSEPKSENTLRIIYTGPPLAADRESLFWMNVKTIPSVDKNALNGRNVL 
QLAILSRMKLFLRPIQLQELPAEAPDTLKFSRSGNYINVHNPSPFYVTLVNLQVGSQKLG NAMAAPRVNSQIPLPSGVQGKLKFQTVNDYGSVTPVREVNLN >gi|296494392|gb|ADTN01000346.1| GENE 5 5200 - 5742 229 180 aa, chain - ## HITS:1 COG:ECs0592 KEGG:ns NR:ns ## COG: ECs0592 COG3539 # Protein_GI_number: 15829846 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: P pilus assembly protein, pilin FimA # Organism: Escherichia coli O157:H7 # 1 180 12 191 191 264 100.0 8e-71 MKLRFISSALAAALFAATGSYAAVVDGGTIHFEGELVNAACSVNTDSADQVVTLGQYRTD IFNAVGNTSALIPFTIQLNDCDPVVAANAAVAFSGQADAINDNLLAIASSTNTTTATGVG IEILDNTSAILKPDGNSFSTNQNLIPGTNVLHFSARYKGTGTSASAGQANADATFIMRYE >gi|296494392|gb|ADTN01000346.1| GENE 6 6222 - 7088 883 288 aa, chain + ## HITS:1 COG:ECs0591 KEGG:ns NR:ns ## COG: ECs0591 COG0190 # Protein_GI_number: 15829845 # Func_class: H Coenzyme transport and metabolism # Function: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase # Organism: Escherichia coli O157:H7 # 1 288 1 288 288 569 100.0 1e-162 MAAKIIDGKTIAQQVRSEVAQKVQARIAAGLRAPGLAVVLVGSNPASQIYVASKRKACEE VGFVSRSYDLPETTSEAELLELIDALNADNTIDGILVQLPLPAGIDNVKVLERIHPDKDV DGFHPYNVGRLCQRAPRLRPCTPRGIVTLLERYNIDTFGLNAVVIGASNIVGRPMSMELL LAGCTTTVTHRFTKNLRHHVENADLLIVAVGKPGFIPGDWIKEGAIVIDVGINRLENGKV VGDVVFEDAAKRASYITPVPGGVGPMTVATLIENTLQACVEYHDPQGE >gi|296494392|gb|ADTN01000346.1| GENE 7 7090 - 7302 316 70 aa, chain + ## HITS:1 COG:ybcJ KEGG:ns NR:ns ## COG: ybcJ COG2501 # Protein_GI_number: 16128512 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 1 70 8 77 77 127 100.0 7e-30 MATFSLGKHPHVELCDLLKLEGWSESGAQAKIAIAEGQVKVDGAVETRKRCKIVAGQTVS FAGHSVQVVA Prediction of potential genes in microbial genomes Time: Mon May 16 00:17:42 2011 Seq name: gi|296494391|gb|ADTN01000347.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont939.2, whole genome shotgun sequence Length of sequence - 2690 bp Number of predicted genes - 3, with homology - 3 
Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 31 - 552 342 ## COG1988 Predicted membrane-bound metal-dependent hydrolases - Term 541 - 576 7.4 2 2 Tu 1 . - CDS 588 - 1973 1817 ## COG0215 Cysteinyl-tRNA synthetase - Prom 2008 - 2067 3.6 + Prom 1967 - 2026 4.3 3 3 Tu 1 . + CDS 2147 - 2641 577 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family Predicted protein(s) >gi|296494391|gb|ADTN01000347.1| GENE 1 31 - 552 342 173 aa, chain + ## HITS:1 COG:ybcI KEGG:ns NR:ns ## COG: ybcI COG1988 # Protein_GI_number: 16128511 # Func_class: R General function prediction only # Function: Predicted membrane-bound metal-dependent hydrolases # Organism: Escherichia coli K12 # 1 173 1 173 173 285 97.0 3e-77 MPTVITHAAVPLCIGLGLGSKVIPPRLLFAGIILAMLPDADVLSFKFGIAYGNVFGHRGF THSLVFAFVVPLLCVLIGRRWFRAGLIRCWLFLTVSLLSHSLLDSVTTGGKGVGWLWPWS DERFFAPWQVIKVAPFALSRYTTPYGHQVIISELMWVWLPGILLMGMLWWHRR >gi|296494391|gb|ADTN01000347.1| GENE 2 588 - 1973 1817 461 aa, chain - ## HITS:1 COG:ECs0588 KEGG:ns NR:ns ## COG: ECs0588 COG0215 # Protein_GI_number: 15829842 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Cysteinyl-tRNA synthetase # Organism: Escherichia coli O157:H7 # 1 461 1 461 461 918 99.0 0 MLKIFNTLTRQKEEFKPIHAGEVGMYVCGITVYDLCHIGHGRTFVAFDVVARYLRFLGYK LKYVRNITDIDDKIIKRANENGESFVALVDRMIAEMHKDFDALNILRPDMEPRATHHIAE IIELTEQLIAKGHAYVADNGDVMFDVPTDPTYGVLSRQDLDQLQAGARVDVVDDKRNPMD FVLWKMSKEGEPSWPSPWGAGRPGWHIECSAMNCKQLGNHFDIHGGGSDLMFPHHENEIA QSTCAHDGQYVNYWMHSGMVMVDREKMSKSLGNFFTVRDVLKYYDAETVRYFLMSGHYRS QLNYSEENLKQARAALERLYTALRGTDKTVAPAGGEAFEARFIEAMDDDFNTPEAYSVLF DMAREVNRLKAEDMAAANAMASHLRKLSAVLGLLEQEPEAFLQSGAQADGSEVAEIEALI QQRLDARKAKDWAAADAARDRLNEMGIVLEDGPQGTTWRRK >gi|296494391|gb|ADTN01000347.1| GENE 3 2147 - 2641 577 164 aa, chain + ## HITS:1 COG:ECs0587 KEGG:ns NR:ns ## COG: ECs0587 COG0652 # Protein_GI_number: 15829841 # Func_class: O Posttranslational modification, 
protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Escherichia coli O157:H7 # 1 164 1 164 164 328 100.0 2e-90 MVTFHTNHGDIVIKTFDDKAPETVKNFLDYCREGFYNNTIFHRVINGFMIQGGGFEPGMK QKATKEPIKNEANNGLKNTRGTLAMARTQAPHSATAQFFINVVDNDFLNFSGESLQGWGY CVFAEVVEGMDVVDKIKGVATGRSGMHQDVPKEDVIIESVTVSE Prediction of potential genes in microbial genomes Time: Mon May 16 00:17:46 2011 Seq name: gi|296494390|gb|ADTN01000348.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont939.3, whole genome shotgun sequence Length of sequence - 10673 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 3, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 + CDS 3 - 671 541 ## COG2908 Uncharacterized protein conserved in bacteria + Prom 687 - 746 3.4 2 1 Op 2 29/0.000 + CDS 789 - 1298 566 ## COG0041 Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase 3 1 Op 3 . + CDS 1295 - 2362 1310 ## COG0026 Phosphoribosylaminoimidazole carboxylase (NCAIR synthetase) 4 2 Op 1 . - CDS 2557 - 3450 1024 ## COG0549 Carbamate kinase 5 2 Op 2 . - CDS 3447 - 4262 493 ## EcolC_3102 putative carboxylase 6 2 Op 3 . - CDS 4273 - 5532 1270 ## EC55989_0533 hypothetical protein 7 2 Op 4 . - CDS 5542 - 7209 1720 ## COG0074 Succinyl-CoA synthetase, alpha subunit - Prom 7387 - 7446 4.2 + Prom 7314 - 7373 4.4 8 3 Op 1 4/1.000 + CDS 7526 - 8575 948 ## COG2055 Malate/L-lactate dehydrogenases 9 3 Op 2 4/1.000 + CDS 8597 - 9832 1080 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 10 3 Op 3 . 
+ CDS 9843 - 10628 947 ## COG3257 Uncharacterized protein, possibly involved in glyoxylate utilization Predicted protein(s) >gi|296494390|gb|ADTN01000348.1| GENE 1 3 - 671 541 222 aa, chain + ## HITS:1 COG:ybbF KEGG:ns NR:ns ## COG: ybbF COG2908 # Protein_GI_number: 16128508 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 2 222 20 240 240 454 99.0 1e-128 PAGFLRFLAGEARKADALYILGDLFEAWIGDDDPNPLHRQMAAAIKAVSDSGVPCYFIHG NRDFLLGKRFARESGMTLLPEEKVLELYGRRVLIMHGDTLCTDDAGYQAFRAKVHKPWLQ TLFLALPLFVRKRIAARMRANSKEANSSKSLAIMDVNQNAVVSAMEKHQVQWLIHGHTHR PAVHELIANQQPAFRVVLGAWHTEGSMVKVTADDVELIHFPF >gi|296494390|gb|ADTN01000348.1| GENE 2 789 - 1298 566 169 aa, chain + ## HITS:1 COG:ECs0585 KEGG:ns NR:ns ## COG: ECs0585 COG0041 # Protein_GI_number: 15829839 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase # Organism: Escherichia coli O157:H7 # 1 169 1 169 169 280 100.0 1e-75 MSSRNNPARVAIVMGSKSDWATMQFAAEIFEILNVPHHVEVVSAHRTPDKLFSFAESAEE NGYQVIIAGAGGAAHLPGMIAAKTLVPVLGVPVQSAALSGVDSLYSIVQMPRGIPVGTLA IGKAGAANAALLAAQILATHDKELHQRLNDWRKAQTDEVLENPDPRGAA >gi|296494390|gb|ADTN01000348.1| GENE 3 1295 - 2362 1310 355 aa, chain + ## HITS:1 COG:purK KEGG:ns NR:ns ## COG: purK COG0026 # Protein_GI_number: 16128506 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazole carboxylase (NCAIR synthetase) # Organism: Escherichia coli K12 # 1 355 1 355 355 696 99.0 0 MKQVCVLGNGQLGRMLRQAGEPLGIAVWPVGLDAEPAAVPFQQSVITAEIERWPETALTR ELARHPAFVNRDVFPIIADRLTQKQLFDKLHLPTAPWQLLADRSEWPAVFDRLGELAIVK RRTGGYDGRGQWRLRADETEQLPAECYGECIVEQGINFSGEVSLVGARGFDGSTVFYPLT HNLHQDGILRTSVAFPQANAQQQAQAEEMLSAIMQELGYVGVMAMECFVTPQGLLINELA PRVHNSGHWTQNGASISQFELHLRAITDLPLPQPVVNNPSVMINLIGSDVNYDWLKLPLV HLHWYDKEVRPGRKVGHLNLTDSDTSRLTATLEALIPLLPPEYASGVIWAQSKFG >gi|296494390|gb|ADTN01000348.1| GENE 4 2557 - 3450 1024 297 aa, chain - ## HITS:1 COG:arcC KEGG:ns NR:ns ## COG: 
arcC COG0549 # Protein_GI_number: 16128505 # Func_class: E Amino acid transport and metabolism # Function: Carbamate kinase # Organism: Escherichia coli K12 # 1 297 1 297 297 551 99.0 1e-157 MKTLVVALGGNALLQRGEALTAENQYRNIASAVPALARLARSYRLAIVHGNGPQVGLLAL QNLAWKEVEPYPLDVLVAESQGMIGYMLAQSLSAQPQMPPVTTVLTRIEVSPDDPAFLQP EKFIGPVYQPEEQEALEAAYGWQMKRDGKYLRRVVASPQPRKILDSEAIELLLKEGHVVI CSGGGGVPVTEDGAGSEAVIDKDLAAALLAEQINADGLVILTDADAVYENWGTPQQRAIR HATPDELAPFAKADGSMGPKVTAVSGYVRSRGKPAWIGALSRIEETLAGEAGTCISL >gi|296494390|gb|ADTN01000348.1| GENE 5 3447 - 4262 493 271 aa, chain - ## HITS:1 COG:no KEGG:EcolC_3102 NR:ns ## KEGG: EcolC_3102 # Name: not_defined # Def: putative carboxylase # Organism: E.coli_ATCC8739 # Pathway: not_defined # 1 271 1 271 271 526 98.0 1e-148 MTIIHPLLASSSAPNYRQSWRLAGVWRRAINLMTESGELLTLHRKGSGFGPGGWVLRRAQ FDALCGGLCGNERPQVVAQGIRLGRFTVKQPQRYCLLRITPPAHPQPLAAAWMQRAEETG LFGPLAMAASDPLPAELRQFRHCFQAALNGVKTDWRHWLGKGPGLTPSHDDTLSGMLLAA WYYGALDARSGRQFFACSDNLQLVTTAVSVSYLRYAAQGYFASPLLHFVHALSCPKRTAV AIDSLLALGHTSGADTLLGFWLGQQLLQGTP >gi|296494390|gb|ADTN01000348.1| GENE 6 4273 - 5532 1270 419 aa, chain - ## HITS:1 COG:no KEGG:EC55989_0533 NR:ns ## KEGG: EC55989_0533 # Name: ylbE # Def: hypothetical protein # Organism: E.coli_55989 # Pathway: not_defined # 1 419 1 419 419 814 99.0 0 MFTSVAQANAAVIEQIRRARPHWLDVQPASSLISELNQGKTLLHAGPPMRWQEMTGPMKG ACVGACLFEGWAKDEAQALAILEQGEVNFIPCHHVNAVGPMGGITSASMPMLVVENVTDG NRAYCNLNEGIGKVMRFGAYGEDVLTRHRWMRDVLMPVLSAALGRMEHGIDLTAMMAQGI TMGDEFHQRNIASSALLMRTLAPQIARLDHDKQHIAEVMDFLSVTDQFFLNLAMAYCKAA MDAGAMIRAGSIVTAMTRNGNMFGIRVSGLGERWFTAPVNTPQGLFFTGFSQEQANPDMG DSAITETFGIGGAAMIAAPGVTRFVGAGGMEAARAVSEEMAEIYLERNMQLQIPSWDFQG ACLGLDIRRVVETGITPLINTGIAHKEAGIGQIGAGTVRAPLACFEQALEALAESMGIG >gi|296494390|gb|ADTN01000348.1| GENE 7 5542 - 7209 1720 555 aa, chain - ## HITS:1 COG:ECs0580 KEGG:ns NR:ns ## COG: ECs0580 COG0074 # Protein_GI_number: 15829834 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, alpha subunit # Organism: Escherichia 
coli O157:H7 # 1 555 1 555 555 1023 98.0 0 MIHAFIKKGCFQDSVSLMIISRKLSESENVDDVSVMMGTPANKALLDTTGFWHDDFNNAT PNDICVAIRSEAADAGIAQAIMQQLEEALKQLAQGSGSSQALTQVRRWDSACQKLPDASL ALISVAGEYAAELANQALDRNLNVMMFSDNVTLEDEIQLKSRAREKGLLVMGPDCGTSMI AGTPLAFANVMPEGNIGVIGASGTGIQELCSQIALAGEGITHAIGLGGRDLSREVGGISA LTALEMLSADEKSEVLAFVSKPPAEAVRLKIVNAMKATGKPTVALFLGYTPAVARDENVW FASSLDEAARLACLLSRVTARRNAITPASSGFICGLYTGGTLAAEAAGLLAGHLGVEADD THHHGMMLDADGHQIIDLGDDFYTVGRPHPMIDPALRNQLIADLGAKPQVRVLLLDVVIG FGATADPAASLVSAWQKACAARSDNQPLYAIATVTGTERDPQCRSQQIATLEDAGIAVVS SLPEATLLAAALIRPLSPATQQHTPSLLENVAVINIGLRSFALELQSASKPVVHYQWSPV AGGNKKLARLLERLQ >gi|296494390|gb|ADTN01000348.1| GENE 8 7526 - 8575 948 349 aa, chain + ## HITS:1 COG:ylbC KEGG:ns NR:ns ## COG: ylbC COG2055 # Protein_GI_number: 16128501 # Func_class: C Energy production and conversion # Function: Malate/L-lactate dehydrogenases # Organism: Escherichia coli K12 # 1 349 1 349 349 708 100.0 0 MKISRETLHQLIENKLCQAGLKREHAATVAEVLVYADARGIHSHGAVRVEYYAERISKGG TNREPEFRLEETGPCSAILHADNAAGQVAAKMGMEHAIKTAQQNGVAVVGISRMGHSGAI SYFVQQAARAGFIGISMCQSDPMVVPFGGAEIYYGTNPLAFAAPGEGDEILTFDMATTVQ AWGKVLDARSRNMSIPDTWAVDKNGVPTTDPFAVHALLPAAGPKGYGLMMMIDVLSGVLL GLPFGRQVSSMYDDLHAGRNLGQLHIVINPNFFSSSELFRQHLSQTMRELNAITPAPGFN QVYYPGQDQDIKQRKAAVEGIEIVDDIYQYLISDALYNTSYETKNPFAQ >gi|296494390|gb|ADTN01000348.1| GENE 9 8597 - 9832 1080 411 aa, chain + ## HITS:1 COG:ylbB KEGG:ns NR:ns ## COG: ylbB COG0624 # Protein_GI_number: 16128500 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Escherichia coli K12 # 1 411 1 411 411 876 100.0 0 MITHFRQAIEETLPWLSSFGADPAGGMTRLLYSPEWLETQQQFKKRMAASGLETRFDEVG NLYGRLNGTEYPQEVVLSGSHIDTVVNGGNLDGQFGALAAWLAIDWLKTQYGAPLRTVEV VAMAEEEGSRFPYVFWGSKNIFGLANPDDVRNICDAKGNSFVDAMKACGFTLPNAPLTPR QDIKAFVELHIEQGCVLESNGQSIGVVNAIVGQRRYTVTLNGESNHAGTTPMGYRRDTVY AFSRICHQSVEKAKRMGDPLVLTFGKVEPRPNTVNVVPGKTTFTIDCRHTDAAVLRDFTQ 
QLENDMRAICDEMDIGIDIDLWMDEEPVPMNKELVATLTELCEREKLNYRVMHSGAGHDA QIFAPRVPTCMIFIPSINGISHNPAERTNITDLAEGVKTLALMLYQLAWQK >gi|296494390|gb|ADTN01000348.1| GENE 10 9843 - 10628 947 261 aa, chain + ## HITS:1 COG:ylbA KEGG:ns NR:ns ## COG: ylbA COG3257 # Protein_GI_number: 16128499 # Func_class: R General function prediction only # Function: Uncharacterized protein, possibly involved in glyoxylate utilization # Organism: Escherichia coli K12 # 1 261 1 261 261 498 100.0 1e-141 MGYLNNVTGYREDLLANRAIVKHGNFALLTPDGLVKNIIPGFENCDATILSTPKLGASFV DYLVTLHQNGGNQQGFGGEGIETFLYVISGNITAKAEGKTFALSEGGYLYCPPGSLMTFV NAQAEDSQIFLYKRRYVPVEGYAPWLVSGNASELERIHYEGMDDVILLDFLPKELGFDMN MHILSFAPGASHGYIETHVQEHGAYILSGQGVYNLDNNWIPVKKGDYIFMGAYSLQAGYG VGRGEAFSYIYSKDCNRDVEI Prediction of potential genes in microbial genomes Time: Mon May 16 00:18:01 2011 Seq name: gi|296494389|gb|ADTN01000349.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont939.4, whole genome shotgun sequence Length of sequence - 17251 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 9, operones - 5 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 742 761 ## COG1843 Flagellar hook capping protein + Term 834 - 879 11.2 2 2 Op 1 2/0.500 - CDS 909 - 2054 1183 ## COG1929 Glycerate kinase 3 2 Op 2 2/0.500 - CDS 2076 - 3377 1011 ## COG2233 Xanthine/uracil permeases - Term 3387 - 3419 3.0 4 3 Op 1 2/0.500 - CDS 3434 - 4795 1597 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 5 3 Op 2 . 
- CDS 4855 - 6309 1421 ## COG1953 Cytosine/uracil/thiamine/allantoin permeases - Prom 6364 - 6423 8.1 6 4 Op 1 8/0.000 - CDS 6478 - 7356 1122 ## COG2084 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases 7 4 Op 2 8/0.000 - CDS 7456 - 8232 828 ## COG3622 Hydroxypyruvate isomerase 8 4 Op 3 4/0.500 - CDS 8245 - 10026 2176 ## COG3960 Glyoxylate carboligase - Prom 10047 - 10106 4.4 - Term 10073 - 10103 3.0 9 5 Tu 1 4/0.500 - CDS 10116 - 10886 694 ## COG1414 Transcriptional regulator - Prom 10929 - 10988 3.1 10 6 Tu 1 . - CDS 11009 - 11491 525 ## COG3194 Ureidoglycolate hydrolase - Prom 11553 - 11612 5.0 + Prom 11598 - 11657 5.6 11 7 Op 1 4/0.500 + CDS 11721 - 12647 627 ## COG0583 Transcriptional regulator + Term 12658 - 12707 2.7 12 7 Op 2 . + CDS 12716 - 13810 1189 ## COG2603 Predicted ATPase + Prom 13833 - 13892 3.0 13 8 Tu 1 . + CDS 13926 - 14297 286 ## SDY_0400 hypothetical protein + Term 14404 - 14445 5.1 - Term 15212 - 15280 -0.1 14 9 Op 1 . - CDS 15283 - 15993 139 ## COG3209 Rhs family protein 15 9 Op 2 . - CDS 15993 - 16337 80 ## B21_00454 hypothetical protein 16 9 Op 3 . 
- CDS 16401 - 17249 113 ## COG3209 Rhs family protein Predicted protein(s) >gi|296494389|gb|ADTN01000349.1| GENE 1 2 - 742 761 246 aa, chain + ## HITS:1 COG:YPO0724 KEGG:ns NR:ns ## COG: YPO0724 COG1843 # Protein_GI_number: 16121043 # Func_class: N Cell motility # Function: Flagellar hook capping protein # Organism: Yersinia pestis # 46 237 31 217 218 142 40.0 9e-34 KFAETGRSLINPRALKAQSPQPTATAESNNVAASDTTSSSDSSPVDTFLTLFVAEIQNQD PTDPTDATEYIDQLSSMAQVAMMEEMSVQANTNAVLMSNIQVMALGNLVGDDIMVQTTAL QVSDQTINGRATLDDACTTADLHFTDAAGDDYTVSLIPQGSSSVGPGQVDFSVDPADYGI PPGDYDVSVVTNTGEEEVPIEVSGEVEDVRIPLDGSSPVLNVAGVGEVPFTMISQFGVPD ADSDVA >gi|296494389|gb|ADTN01000349.1| GENE 2 909 - 2054 1183 381 aa, chain - ## HITS:1 COG:ybbZ KEGG:ns NR:ns ## COG: ybbZ COG1929 # Protein_GI_number: 16128498 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerate kinase # Organism: Escherichia coli K12 # 1 381 1 381 381 651 99.0 0 MKIVIAPDSFKESLSAEKCCQAIKAGFSTLFPDANYICLPIADGGEGTVDAMVAATGGNI VTLEVCGPMGEKVNAFYGLTGDGKTAVIEMAAASGLMLVAPEKRNPLLASSFGTGELIRH ALDNDIRHIILGIGGSATVDGGMGMAQALGVRFLDADGQALAANGGNLARVASIEMDECD PRLANCHIEVACDVDNPLVGARGAAAVFGPQKGATPEMVEELEQGLQNYARVLQQQTEIN VCQMAGGGAAGGMGIAAAVFLNADIKPGIEIVLNAVNLAQAVQGAALVITGEGRIDSQTA GGKAPLGVASVAKQFNVPVIGIAGVLGDGLEVVHQYGIDAVFSILPRLAPLAEVLASGET NLFNSARNIACAIKIGQGIKN >gi|296494389|gb|ADTN01000349.1| GENE 3 2076 - 3377 1011 433 aa, chain - ## HITS:1 COG:ECs0575 KEGG:ns NR:ns ## COG: ECs0575 COG2233 # Protein_GI_number: 15829829 # Func_class: F Nucleotide transport and metabolism # Function: Xanthine/uracil permeases # Organism: Escherichia coli O157:H7 # 1 433 3 435 435 748 100.0 0 MFNFAVSRESLLSGFQWFFFIFCNTVVVPPTLLSAFQLPQSSLLTLTQYAFLATALACFA QAFCGHRRAIMEGPGGLWWGTILTITLGEASRGTPINDIATSLAVGIALSGVLTMLIGFS GLGHRLARLFTPSVMVLFMLMLGAQLTTIFFKGMLGLPFGIADPNFKIQLPPFALSVAVM CLVLAMIIFLPQRFARYGLLVGTITGWLLWYFCFPSSHSLSGELHWQWFPLGSGGALSPG IILTAVITGLVNISNTYGAIRGTDVFYPQQGAGNTRYRRSFVATGFMTLITVPLAVIPFS PFVSSIGLLTQTGDYTRRSFIYGSVICLLVALVPALTRLFCSIPLPVSSAVMLVSYLPLL 
FSALVFSQQITFTARNIYRLALPLFVGIFLMALPPVYLQDLPLTLRPLLSNGLLVGILLA VLMDNLIPWERIE >gi|296494389|gb|ADTN01000349.1| GENE 4 3434 - 4795 1597 453 aa, chain - ## HITS:1 COG:ybbX KEGG:ns NR:ns ## COG: ybbX COG0044 # Protein_GI_number: 16128496 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Escherichia coli K12 # 1 453 1 453 453 946 100.0 0 MSFDLIIKNGTVILENEARVVDIAVKGGKIAAIGQDLGDAKEVMDASGLVVSPGMVDAHT HISEPGRSHWEGYETGTRAAAKGGITTMIEMPLNQLPATVDRASIELKFDAAKGKLTIDA AQLGGLVSYNIDRLHELDEVGVVGFKCFVATCGDRGIDNDFRDVNDWQFFKGAQKLGELG QPVLVHCENALICDELGEEAKREGRVTAHDYVASRPVFTEVEAIRRVLYLAKVAGCRLHV CHVSSPEGVEEVTRARQEGQDVTCESCPHYFVLDTDQFEEIGTLAKCSPPIRDLENQKGM WEKLFNGEIDCLVSDHSPCPPEMKAGNIMKAWGGIAGLQSCMDVMFDEAVQKRGMSLPMF GKLMATNAADIFGLQQKGRIAPGKDADFVFIQPNSSYVLTNDDLEYRHKVSPYVGRTIGA RITKTILRGDVIYDIEQGFPVAPKGQFILKHQQ >gi|296494389|gb|ADTN01000349.1| GENE 5 4855 - 6309 1421 484 aa, chain - ## HITS:1 COG:ECs0572 KEGG:ns NR:ns ## COG: ECs0572 COG1953 # Protein_GI_number: 15829826 # Func_class: F Nucleotide transport and metabolism; H Coenzyme transport and metabolism # Function: Cytosine/uracil/thiamine/allantoin permeases # Organism: Escherichia coli O157:H7 # 1 463 1 463 463 805 98.0 0 MEHQRKLFQQRGYSEDLLPKTQSQRTWKTFNYFTLWMGSVHNVPNYMMVGGFFILGLSTF SIMLAIILSAFFIAAVMVLNGAAGSKYGVPFAMILRASYGVRGALFPGLLRGGIAAIMWF GLQCYAGSLACLILIGKIWPGFLTLGGDFTLLGLSLPGLITFLIFWLVNVGIGFGGGKVL NKFTAILNPCIYIVFGGMAIWAISLVGIGPIFDYIPSGIQKAENGGFLFLVVINAVVAVW AAPAVSASDFTQNAHSFREQALGQTLGLVVAYILFAVAGVCIIAGASIHYGADTWNVLDI VQRWDSLFASFFAVLVILMTTISTNATGNIIPAGYQIAAIAPTKLTYKNGVLIASIISLL ICPWKLMENQDSIYLFLDIIGGMLGPVIGVMMAHYFVVMRGQINLDELYTAPGDYKYYDN GFNLTAFSVTLVAVILSLGGKFIHFMEPLSRVSWFVGVIVAFAAYALLKKRTTAEKTGEQ KTIG >gi|296494389|gb|ADTN01000349.1| GENE 6 6478 - 7356 1122 292 aa, chain - ## HITS:1 COG:ybbQ KEGG:ns NR:ns ## COG: ybbQ COG2084 # Protein_GI_number: 16128493 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyisobutyrate dehydrogenase and 
related beta-hydroxyacid dehydrogenases # Organism: Escherichia coli K12 # 1 292 1 292 292 551 100.0 1e-157 MKLGFIGLGIMGTPMAINLARAGHQLHVTTIGPVADELLSLGAVSVETARQVTEASDIIF IMVPDTPQVEEVLFGENGCTKASLKGKTIVDMSSISPIETKRFARQVNELGGDYLDAPVS GGEIGAREGTLSIMVGGDEAVFERVKPLFELLGKNITLVGGNGDGQTCKVANQIIVALNI EAVSEALLFASKAGADPVRVRQALMGGFASSRILEVHGERMIKRTFNPGFKIALHQKDLN LALQSAKALALNLPNTATCQELFNTCAANGGSQLDHSALVQALELMANHKLA >gi|296494389|gb|ADTN01000349.1| GENE 7 7456 - 8232 828 258 aa, chain - ## HITS:1 COG:gip KEGG:ns NR:ns ## COG: gip COG3622 # Protein_GI_number: 16128492 # Func_class: G Carbohydrate transport and metabolism # Function: Hydroxypyruvate isomerase # Organism: Escherichia coli K12 # 1 258 1 258 258 556 100.0 1e-158 MLRFSANLSMLFGEYDFLARFEKAAQCGFRGVEFMFPYDYDIEELKHVLASNKLEHTLHN LPAGDWAAGERGIACIPGREEEFRDGVAAAIRYARALGNKKINCLVGKTPAGFSSEQIHA TLVENLRYAANMLMKEDILLLIEPINHFDIPGFHLTGTRQALKLIDDVGCCNLKIQYDIY HMQRMEGELTNTMTQWADKIGHLQIADNPHRGEPGTGEINYDYLFKVIENSDYNGWVGCE YKPQTTTEAGLRWMDPYR >gi|296494389|gb|ADTN01000349.1| GENE 8 8245 - 10026 2176 593 aa, chain - ## HITS:1 COG:ECs0568 KEGG:ns NR:ns ## COG: ECs0568 COG3960 # Protein_GI_number: 15829822 # Func_class: R General function prediction only # Function: Glyoxylate carboligase # Organism: Escherichia coli O157:H7 # 1 593 1 593 593 1206 100.0 0 MAKMRAVDAAMYVLEKEGITTAFGVPGAAINPFYSAMRKHGGIRHILARHVEGASHMAEG YTRATAGNIGVCLGTSGPAGTDMITALYSASADSIPILCITGQAPRARLHKEDFQAVDIE AIAKPVSKMAVTVREAALVPRVLQQAFHLMRSGRPGPVLVDLPFDVQVAEIEFDPDMYEP LPVYKPAASRMQIEKAVEMLIQAERPVIVAGGGVINADAAALLQQFAELTSVPVIPTLMG WGCIPDDHELMAGMVGLQTAHRYGNATLLASDMVFGIGNRFANRHTGSVEKYTEGRKIVH IDIEPTQIGRVLCPDLGIVSDAKAALTLLVEVAQEMQKAGRLPCRKEWVADCQQRKRTLL RKTHFDNVPVKPQRVYEEMNKAFGRDVCYVTTIGLSQIAAAQMLHVFKDRHWINCGQAGP LGWTIPAALGVCAADPKRNVVAISGDFDFQFLIEELAVGAQFNIPYIHVLVNNAYLGLIR QSQRAFDMDYCVQLAFENINSSEVNGYGVDHVKVAEGLGCKAIRVFKPEDIAPAFEQAKA LMAQYRVPVVVEVILERVTNISMGSELDNVMEFEDIADNAADAPTETCFMHYE >gi|296494389|gb|ADTN01000349.1| GENE 9 10116 - 10886 694 256 aa, chain - ## HITS:1 COG:ECs0567 
KEGG:ns NR:ns ## COG: ECs0567 COG1414 # Protein_GI_number: 15829821 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 256 16 271 271 496 99.0 1e-140 MAQKGAQALERGIAILQYLEKSGGSSSVSDISLNLDLPLSTTFRLLKVLQAADFVYQDSQ LGWWHIGLGVFNVGAAYIHNRDVLSVAGPFMRRLMLLSGETVNVAIRNGNEAVLIGQLEC KSMVRMCAPLGSRLPLHASGAGKALLYPLAEEELMSIILQTGLQQFTPTTLVDMPTLLKD LEQARELGYTVDKEEHVVGLNCIASAIYDDVGSVVAAISISGPSSRLTEDRFVSQGELVR DTARDISTALGLKAHP >gi|296494389|gb|ADTN01000349.1| GENE 10 11009 - 11491 525 160 aa, chain - ## HITS:1 COG:ECs0566 KEGG:ns NR:ns ## COG: ECs0566 COG3194 # Protein_GI_number: 15829820 # Func_class: F Nucleotide transport and metabolism # Function: Ureidoglycolate hydrolase # Organism: Escherichia coli O157:H7 # 1 160 1 160 160 331 100.0 3e-91 MKLQVLPLSQEAFSAYGDVIETQQRDFFHINNGLVERYHDLALVEILEQDRTLISINRAQ PANLPLTIHELERHPLGTQAFIPMKGEVFVVVVALGDDKPDLSTLRAFITNGEQGVNYHR NVWHHPLFAWQRVTDFLTIDRGGSDNCDVESIPEQELCFA >gi|296494389|gb|ADTN01000349.1| GENE 11 11721 - 12647 627 308 aa, chain + ## HITS:1 COG:ECs0565 KEGG:ns NR:ns ## COG: ECs0565 COG0583 # Protein_GI_number: 15829819 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 1 308 1 308 308 612 100.0 1e-175 MFDPETLRTFIAVAETGSFSKAAERLCKTTATISYRIKLLEENTGVALFFRTTRSVTLTA AGEHLLSQARDWLSWLESMPSELQQVNDGVERQVNIVINNLLYNPQAVAQLLAWLNERYP FTQFHISRQIYMGVWDSLLYEGFSLAIGVTGTEALANTFSLDPLGSVQWRFVMAADHPLA NVEEPLTEAQLRRFPAVNIEDSARTLTKRVAWRLPGQKEIIVPDMETKIAAHLAGVGIGF LPKSLCQSMIDNQQLVSRVIPTMRPPSPLSLAWRKFGSGKAVEDIVTLFTQRRPEISGFL EIFGNPRS >gi|296494389|gb|ADTN01000349.1| GENE 12 12716 - 13810 1189 364 aa, chain + ## HITS:1 COG:ybbB KEGG:ns NR:ns ## COG: ybbB COG2603 # Protein_GI_number: 16128487 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Escherichia coli K12 # 1 364 1 364 364 724 100.0 0 MQERHTEQDYRALLIADTPIIDVRAPIEFEHGAMPAAINLPLMNNDERAAVGTCYKQQGS DAALALGHKLVAGEIRQQRMDAWRAACLQNPQGILCCARGGQRSHIVQSWLHAAGIDYPL 
VEGGYKALRQTAIQATIELAQKPIVLIGGCTGSGKTLLVQQQPNGVDLEGLARHRGSAFG RTLQPQLSQASFENLLAAEMLKTDARQNLRLWVLEDESRMIGSNHLPECLRERMTQAAIA VVEDPFEIRLERLNEEYFLRMHHDFTHAYGDEQGWQEYCEYLHHGLSAIKRRLGLQRYNE LAARLDAALTTQLTTGSTDGHLAWLVPLLEEYYDPMYRYQLEKKAEKVVFRGEWAEVAEW VKAR >gi|296494389|gb|ADTN01000349.1| GENE 13 13926 - 14297 286 123 aa, chain + ## HITS:1 COG:no KEGG:SDY_0400 NR:ns ## KEGG: SDY_0400 # Name: not_defined # Def: hypothetical protein # Organism: S.dysenteriae # Pathway: not_defined # 1 123 13 135 135 238 100.0 4e-62 MLHTANPVIKHKAGLLNLAEELSNVSKACKIMGVSRDTFYRYRELVAEGGVDAQINRSRR APNLKNRTDEATEQAVVDYAVAFPTHGQHRASNELRKQGVFISDSGVRSVWLLHNLENLK RRY >gi|296494389|gb|ADTN01000349.1| GENE 14 15283 - 15993 139 236 aa, chain - ## HITS:1 COG:ECs0560 KEGG:ns NR:ns ## COG: ECs0560 COG3209 # Protein_GI_number: 15829814 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli O157:H7 # 2 236 1164 1398 1398 484 97.0 1e-137 MLALMDADGNIAWSGEYDEWGNQLNEENPHHLHQPYRLPGQQYDKESGLYYNRNRHYDPL QGRYITQDPIGLEGGWSLYAYPLNPVNGIDPLGLSPADVALIRRKDQLNHQRAWDILSDT YEDMKRLNLGGTDQFFHCMAFCRVSKLNDAGVSRSAKGLGYEKEIRDYGLNLFGMYGRKV KLSHSEMIEDNKKDLAVNDHGLTCPSTTDCSDRCSDYINPEHKKTIKALQDAGYLK >gi|296494389|gb|ADTN01000349.1| GENE 15 15993 - 16337 80 114 aa, chain - ## HITS:1 COG:no KEGG:B21_00454 NR:ns ## KEGG: B21_00454 # Name: ybbC # Def: hypothetical protein # Organism: E.coli_BL21 # Pathway: not_defined # 1 114 9 122 122 219 99.0 2e-56 MLSFFILFACNETAVYGSDENIIFMRYVGKLHLDKYSVKNTVKTETMAIQLAEIYVRYRY GERIAEEEKPYLITELPDSWVVEGAKLPYEVAGGVFIIEINKKNGCVLNFLHSK >gi|296494389|gb|ADTN01000349.1| GENE 16 16401 - 17249 113 282 aa, chain - ## HITS:1 COG:rhsD KEGG:ns NR:ns ## COG: rhsD COG3209 # Protein_GI_number: 16128481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Escherichia coli K12 # 1 282 1145 1426 1426 589 99.0 1e-168 EPEYTPARKAHLYHCDHRGLPLALISEDGNTAWSAEYDEWGNQLNEENPHHVYQPYRLPG 
QQHDEESGLYYNRHRHYDPLQGRYITQDPMGLKGGWNLYQYPLNPLQQIDPMGLLQTWDD ARSGACTGGVCGVLSRIIGPSKFDSTADAALDALKETQNRSLCNDMEYSGIVCKDTNGKY FASKAETDNLRKESYPLKRKCPTGTDRVAAYHTHGADSHGDYVDEFFSSSDKNLVRSKDN NLEAFYLATPDGRFEALNNKGEYIFIRNSVPGLSSVCIPYHD Prediction of potential genes in microbial genomes Time: Mon May 16 00:18:07 2011 Seq name: gi|296494388|gb|ADTN01000350.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont946.1, whole genome shotgun sequence Length of sequence - 1797 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 1739 1366 ## COG1475 Predicted transcriptional regulators Predicted protein(s) >gi|296494388|gb|ADTN01000350.1| GENE 1 3 - 1739 1366 578 aa, chain + ## HITS:1 COG:PSLT068 KEGG:ns NR:ns ## COG: PSLT068 COG1475 # Protein_GI_number: 17233437 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Salmonella typhimurium LT2 # 1 577 78 655 665 755 66.0 0 NLVVHALPGDRYGVAAGGRRLAALNMLAERNILPADWPVRVKVIPQELATAASMTENGQR RDMHPAEQIAGFRAMAQEGKTPAQIGDLLGYSPRHVQRMLKLADLAPVILDALAEDRITT EHCQALALENDTARQVQVFEAACQSGWGGKPEVQTIRRLMTESEVAVAGNSKFRFVGADA FSPDELRTDLFSDDEGGYVDCVALDAALLEKLQAVAEFLREAEGWGWCAGRMEPVGECRE DAGTYRCLPEPEAVLTEAEEERLNELMARYDALENQCEESDLLEAEMKLMRCMAKVRAWT PEMRAGGGVVVSWRYGNVCVQRGVQLRSEDDVADDADRTEQVQEKASVEEISLPLLTKMS SERTLAVQAALMQQPDKSLALLAWTLCLNVFGSGAYSKPAQISLECEHYSLTSDAPSGKE GAAFMALMAEKARLAALLPEGWSRDMTTFLSLSQEVLLSLLSFCTACSLNGVQTRECGHT SRSPLDSLETAIGFHMRDWWQPTKVNFFGHLKKPQIIAALNEAGLSGAARDAEKMKKGNA AEHAEHHMKDNRWVPGWMCAPHPQTDTTERTDNLADAA Prediction of potential genes in microbial genomes Time: Mon May 16 00:18:08 2011 Seq name: gi|296494387|gb|ADTN01000351.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont946.2, whole genome shotgun sequence Length of sequence - 1627 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op 
Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 22 - 456 427 ## APECO1_O1CoBM17 plasmid SOS inhibition protein B 2 2 Tu 1 . + CDS 824 - 1171 318 ## ECO26_p2-29 plasmid SOS inhibition protein A + Term 1293 - 1326 0.7 - Term 894 - 938 -0.9 3 3 Tu 1 . - CDS 1183 - 1407 110 ## APECO1_O1CoBM19 hypothetical protein + Prom 1219 - 1278 3.1 4 4 Tu 1 . + CDS 1451 - 1609 104 ## EcSMS35_A0050 small toxic polypeptide Predicted protein(s) >gi|296494387|gb|ADTN01000351.1| GENE 1 22 - 456 427 144 aa, chain + ## HITS:1 COG:no KEGG:APECO1_O1CoBM17 NR:ns ## KEGG: APECO1_O1CoBM17 # Name: psiB # Def: plasmid SOS inhibition protein B # Organism: E.coli_APEC # Pathway: not_defined # 1 144 2 145 145 280 97.0 2e-74 MKTELTLNVLQTMSAQEYEDIRAAGSDERRELTHAVMRELDAPDNWTMNGEYSSEFGGFF PVQVRFTPAHESFHLALCSPGDVSQIWVLVLVNAGGEPFAVVQVQRRFAPEAVSHSLALA ASLDAQGYSVSDIIHILMAEGGQA >gi|296494387|gb|ADTN01000351.1| GENE 2 824 - 1171 318 115 aa, chain + ## HITS:1 COG:no KEGG:ECO26_p2-29 NR:ns ## KEGG: ECO26_p2-29 # Name: not_defined # Def: plasmid SOS inhibition protein A # Organism: E.coli_O26_H11 # Pathway: not_defined # 1 115 125 239 239 192 95.0 3e-48 MQKYSRQQAREAEQKARAYQALVAQAEIELAFHSPETVGSWHARWSDRVAEHDLETLFWQ WGERFPSLAGMERWQWQDMPFWQVIAEAGMAARVVGHAVREMERWMVPNKLREAA >gi|296494387|gb|ADTN01000351.1| GENE 3 1183 - 1407 110 74 aa, chain - ## HITS:1 COG:no KEGG:APECO1_O1CoBM19 NR:ns ## KEGG: APECO1_O1CoBM19 # Name: sok # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 74 1 74 74 134 100.0 1e-30 MWTRHRDASWWLMKINLLRGYLLSATQHGNKPPSRHEAESLKRRAHHSPYTCTLTTLTFP ENNPLIQTVHGKSV >gi|296494387|gb|ADTN01000351.1| GENE 4 1451 - 1609 104 52 aa, chain + ## HITS:1 COG:no KEGG:EcSMS35_A0050 NR:ns ## KEGG: EcSMS35_A0050 # Name: hok # Def: small toxic polypeptide # Organism: E.coli_SECEC # Pathway: not_defined # 1 52 1 52 52 95 100.0 5e-19 MKLPRSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGHREVAAFMAYESGK Prediction of potential genes in microbial genomes Time: Mon May 16 00:18:16 2011 Seq name: 
gi|296494386|gb|ADTN01000352.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont950.1, whole genome shotgun sequence Length of sequence - 1803 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 522 - 1277 292 ## COG5527 Protein involved in initiation of plasmid replication - Prom 1501 - 1560 6.3 Predicted protein(s) >gi|296494386|gb|ADTN01000352.1| GENE 1 522 - 1277 292 251 aa, chain - ## HITS:1 COG:NMB0495 KEGG:ns NR:ns ## COG: NMB0495 COG5527 # Protein_GI_number: 15676404 # Func_class: L Replication, recombination and repair # Function: Protein involved in initiation of plasmid replication # Organism: Neisseria meningitidis MC58 # 114 241 57 183 327 63 27.0 3e-10 MAEIAVINHKKRKNSPRIVQSNELTEAAYSLSRDQKRMLYLFVDQIRKSDGSLQEHDGIC EIHVAKYAETFGLTSAEASKDIRQALKGFAGKEVVFYRPEEDAGDEKGYESFPWFIKRAH SPSRGLYSVHINPYLIPFFIGLQNRFTQFRLSETKEITNPYAMRLYESLCQYRKPDGSGV VSLKIDWIMERYQLPQSYQRMPDFRRRFLQASVDEINSRTPMRLSYIEKKKGRQTTHIVF SFRDITSMTIE Prediction of potential genes in microbial genomes Time: Mon May 16 00:18:17 2011 Seq name: gi|296494385|gb|ADTN01000353.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont967.1, whole genome shotgun sequence Length of sequence - 516 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 68 - 316 230 ## G2583_pO550072 CopB - Prom 449 - 508 2.9 Predicted protein(s) >gi|296494385|gb|ADTN01000353.1| GENE 1 68 - 316 230 82 aa, chain - ## HITS:1 COG:no KEGG:G2583_pO550072 NR:ns ## KEGG: G2583_pO550072 # Name: copB # Def: CopB # Organism: E.coli_O55_H7 # Pathway: not_defined # 1 82 1 82 82 107 98.0 9e-23 MSQTENAVTSSSGTKRAYRKGNPLTLAERQQASLARKRATHKELRVFIPAALKAQLQEMC EAEGVTQAEMIAELIKQKSAFS Prediction of potential genes in microbial genomes Time: Mon May 16 00:18:19 2011 Seq name: gi|296494384|gb|ADTN01000354.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont967.2, whole genome shotgun sequence Length of sequence - 1877 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 16 - 165 111 ## APECO1_O1CoBM63 hypothetical protein - Prom 310 - 369 3.4 2 2 Tu 1 . + CDS 221 - 430 101 ## gi|222104846|ref|YP_002539335.1| hypothetical protein MM1_0059 - Term 648 - 689 -0.6 3 3 Tu 1 . 
- CDS 930 - 1391 222 ## COG1525 Micrococcal nuclease (thermonuclease) homologs - Prom 1463 - 1522 3.7 Predicted protein(s) >gi|296494384|gb|ADTN01000354.1| GENE 1 16 - 165 111 49 aa, chain - ## HITS:1 COG:no KEGG:APECO1_O1CoBM63 NR:ns ## KEGG: APECO1_O1CoBM63 # Name: srnB # Def: hypothetical protein # Organism: E.coli_APEC # Pathway: not_defined # 1 49 55 103 103 87 95.0 1e-16 MTKYALIGVLAVCATVLCFLLIFRERLCELNIHRGNTVVQVTLAYEARK >gi|296494384|gb|ADTN01000354.1| GENE 2 221 - 430 101 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|222104846|ref|YP_002539335.1| ## NR: gi|222104846|ref|YP_002539335.1| hypothetical protein MM1_0059 [Escherichia coli] # 1 69 1 69 69 98 75.0 1e-19 MQASRVNGLTSGVFAFLVPASCLNQKGSDTRRDNTTFPVTDHSLHFPLFTTETPGAGKTC FFPSPGLDN >gi|296494384|gb|ADTN01000354.1| GENE 3 930 - 1391 222 153 aa, chain - ## HITS:1 COG:PM0881 KEGG:ns NR:ns ## COG: PM0881 COG1525 # Protein_GI_number: 15602746 # Func_class: L Replication, recombination and repair # Function: Micrococcal nuclease (thermonuclease) homologs # Organism: Pasteurella multocida # 14 153 18 158 160 100 38.0 8e-22 MKKYIPVVLFVFSCQVFSADIHGRGVRVLDGNTIDVMLSQHPVRVRLVNIDAPEKKQEYG RWSEKIMKSLVAGKTVTVTYFQRDHYGRILGQVYAPDGMNVNQFMVRAGDAWVYEQYNTD PVLPVLQGEVRQQKRGLWADSAPVPPWVWCSRK Prediction of potential genes in microbial genomes Time: Mon May 16 00:18:27 2011 Seq name: gi|296494383|gb|ADTN01000355.1| Escherichia coli MS 146-1 E_coli146-1-1.0_Cont967.3, whole genome shotgun sequence Length of sequence - 3718 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 266 - 469 125 ## gi|301648607|ref|ZP_07248316.1| conserved hypothetical protein - Prom 508 - 567 2.7 2 2 Tu 1 . - CDS 624 - 1181 583 ## E2348_P1_039 conjugal transfer fertility inhibition protein FinO 3 3 Op 1 . - CDS 1284 - 2144 513 ## COG1073 Hydrolases of the alpha/beta superfamily 4 3 Op 2 . 
- CDS 2203 - 2949 614 ## APECO1_O1CoBM59 conjugal transfer pilus acetylation protein TraX 5 3 Op 3 . - CDS 2969 - 3718 587 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member Predicted protein(s) >gi|296494383|gb|ADTN01000355.1| GENE 1 266 - 469 125 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|301648607|ref|ZP_07248316.1| ## NR: gi|301648607|ref|ZP_07248316.1| conserved hypothetical protein [Escherichia coli MS 146-1] # 1 67 1 67 67 103 100.0 5e-21 MAVSSGLSEHSSIYRYQREGRHAVRQEIRGFFREAAEVWQNAARLAPCEAWRDFALKRAS ACRCKCR >gi|296494383|gb|ADTN01000355.1| GENE 2 624 - 1181 583 185 aa, chain - ## HITS:1 COG:no KEGG:E2348_P1_039 NR:ns ## KEGG: E2348_P1_039 # Name: finO # Def: conjugal transfer fertility inhibition protein FinO # Organism: E.coli_0127 # Pathway: not_defined # 1 185 1 185 185 298 100.0 9e-80 MTEQKRPVLTLKRKTEGETPTRSRKTIINVTTPPKWKVKKQKLAEKAAREAELAAKKAQA RQALSIYLTLPSLDEAVNTLKPWWPGLFDGNTPRLLACGIREVLLDEVSQRNIPLSHKKL RRALKAITRSESYLGAMKAGACRYDTEGYVTEHITQEEEQYAQARLEKVRRQNRIKDELR AILAE >gi|296494383|gb|ADTN01000355.1| GENE 3 1284 - 2144 513 286 aa, chain - ## HITS:1 COG:PA3829 KEGG:ns NR:ns ## COG: PA3829 COG1073 # Protein_GI_number: 15599024 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Pseudomonas aeruginosa # 10 286 11 295 307 104 25.0 2e-22 MKITDHKLPEGIALTFRVPEGNIKHPLILLCHGFCGIRNVLLPSFANAFTEAGFATITFD YRGFGESEGERGRLVPAMQTEDIISVINWAEKQVCIDNQRIGLWGTSLGGCHVFNAAAQD KRVKSIVSQLAFADGEVLVTGEMNELEKASFLSTLNKMAEKKKNTGKEMFVGVTRVLSDN ESKVFFEKVKGQYPEMDIKIPFLTVMETLQYKPAESAAKVQCPVLVVIAGQDSVNPPEQG RALYDAVASGTKELYEEADACHYDIYEGAFFERVAAVQTQWFKKHL >gi|296494383|gb|ADTN01000355.1| GENE 4 2203 - 2949 614 248 aa, chain - ## HITS:1 COG:no KEGG:APECO1_O1CoBM59 NR:ns ## KEGG: APECO1_O1CoBM59 # Name: traX # Def: conjugal transfer pilus acetylation protein TraX # Organism: E.coli_APEC # Pathway: not_defined # 13 248 13 248 248 424 100.0 1e-117 
MTTDNTNTTRNDSLAARTDTWLQSLLVWSPGQRDIIKTVALVLMVLDHANRILHLDQSWM FLVGRGAFPLFALVWGLNLSRHTHIRQEAINRLWGWAVIAQFAYYLAGFPWYEGNILFAF AVAAQVLTWCETRTWWRSAETMLLLAMWLPFSGTSYGIAGLLMLAVSHRLYRAEDRMERL ALVACLLAVIPALNLATSDAAAVAGLVMTVLTVGLVSCTGKSLPRFWYGDFFPTFYACHL TVLGVLAV >gi|296494383|gb|ADTN01000355.1| GENE 5 2969 - 3718 587 249 aa, chain - ## HITS:1 COG:PSLT108 KEGG:ns NR:ns ## COG: PSLT108 COG0507 # Protein_GI_number: 17233470 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Salmonella typhimurium LT2 # 2 249 1505 1752 1752 377 84.0 1e-104 GGDSPARFIAPGRKYPQPYVALPAFDRNGKSAGIWLNPLTTDDGNGLRGFSGEGRVKGSG DAQFVALQGSRNGESLLADNMQDGVRIARDNPDSGVVVRIAGEGRPWNPGTITGGRVWGD IPDNSVQPGAGNGEPVTAEVLAQRQAEEAIRRETERRADEIVRKMAENKPDLPDGKTEQA VRDIAGLERDRSAISEREAALPESVLREPQRVREAVREVARENLLQERLQQMERDMVRDL QKEKTLGGD